"McShane–don’t come back, McShane!"
The hockey stick lives! Recent global warming is unprecedented in magnitude and speed and cause.
Shane (updated): A Hockey Stick is a tool, Marian; no better or no worse than any other tool: an axe, a shovel or anything. A Hockey Stick is as good or as bad as the man using it. Remember that.
The rate of human-driven warming in the last century has exceeded the rate of the underlying natural trend by more than a factor of 10, possibly much more. And warming this century on our current path of unrestricted greenhouse gas emissions is projected to cause a rate of warming that is another factor of 5 or more greater than that of the last century. We are punching the climate beast, and she ain’t happy about it!
Future historians — and countless long-suffering future generations — will be quite puzzled that during the same short window humanity was given to avert multiple catastrophes predicted by basic science, a vast amount of effort went into a debate over the so-called “Hockey Stick” graph developed by leading U.S. climatologists and substantiated by multiple independent analyses. As WAG notes, within a few decades, nobody is going to be talking about hockey sticks, they will be talking about right angles (or hockey skates, see figure above) — when they are done cursing our greed and myopia and gullibility in the face of polluter-funded disinformation, that is.
Here in the present, the evidence just keeps accumulating that
- Recent global warming is unprecedented in magnitude and speed and cause (see “Two more independent studies back the Hockey Stick” and figures below)
- Future global warming on our current emissions path risks multiple simultaneous ever-worsening disasters that individually justify strong action now but, taken together, demand it (see “A stunning year in climate science reveals that human civilization is on the precipice” and “Royal Society special issue details ‘hellish vision’ of 7°F (4°C) world – which we may face in the 2060s!”)
And so into this fray walks a gunman — well, two statisticians, McShane and Wyner. Gavin Schmidt and Michael Mann, the real gunslingers, can take it from here (reposted from RealClimate) — while we just watch:
Readers may recall a flurry of excitement in the blogosphere concerning the McShane and Wyner paper in August. Well, the discussions on the McShane and Wyner paper in AOAS have now been put online. There are a stunning 13 different discussion pieces, an editorial and a rebuttal. The invited discussions and rebuttal were basically published ‘as is’, with simple editorial review, rather than proper external peer review. This is a relatively unusual way of doing things in our experience, but it does seem to have been effective at getting rapid responses with a wide variety of perspectives. Without peer review, though, a large number of unjustified, unsupportable and irrelevant statements have also got through.
A few of these discussions were already online, i.e. from Martin Tingley, Schmidt, Mann and Rutherford (SMR), and one from Smerdon. Others, including contributions from Nychka & Li, Wahl & Ammann, McIntyre & McKitrick, Smith, Berliner and Rougier are newly available on the AOAS site and we have not yet read these as carefully.
Inevitably, focus in the discussions is on problems with MW, but it is worth stating upfront here (as is also stated in a number of the papers) that MW made positive contributions to the discussion as well – they introduced a number of new methods (and provided code that allows everyone to try them out), and their use of the Markov Chain Monte Carlo (MCMC) Bayesian approach to assess uncertainties in the reconstructions is certainly interesting. This does not excuse their rather poor framing of the issues, and the multiple errors they made in describing previous work, but it does make the discussions somewhat more interesting than a simple error correcting exercise might have been. MW are also to be commended on actually following through on publishing a reconstruction and its uncertainties, rather than simply pointing to potential issues and never working through the implications.
The discussions raise some serious general issues with MW’s work – with respect to how they use the data, the methodologies they introduce (specifically the ‘Lasso’ method), the conclusions they draw, whether there are objective methods to decide whether one method of reconstruction is better than another and whether the Bayesian approach outlined in the last part of the paper is really what it is claimed to be. But there are also a couple of issues very specific to the MW analysis; for instance, the claim that MW used the same data as Mann et al, 2008 (henceforth M08).
On that specific issue, presumably just an oversight, MW apparently used the “Start Year” column in the M08 spreadsheet instead of the “Start Year (for recon)” column. The difference between the two is related to the fact that many tree ring reconstructions only have a small number of trees in their earliest periods and that greatly inflates their uncertainty (and therefore reduces their utility). To reduce the impact of this problem, M08 only used tree ring records when they had at least 8 individual trees, which left 59 series in the 1000 AD frozen network. The fact that there were only 59 series in the AD 1000 network of M08 was stated clearly in the paper, and the criterion regarding the minimal number of trees (8) was described in the Supplementary Information. The difference in results between the correct M08 network and the spurious 95-record network MW actually used is unfortunately quite significant. Using the correct data substantially reduces the estimates of peak medieval warmth shown by MW (as well as reducing the apparent spread among the reconstructions). This is even more true when the frequently challenged “Tiljander” series are removed, leaving a network of 55 series. In their rebuttal, MW claim that M08 quality control is simply an ‘ad hoc’ filtering and deny that they made a mistake at all. This is not really credible, and it would have done them much credit to simply accept this criticism.
With just this correction, applying MW’s own procedures yields strong conclusions regarding how anomalous recent warmth is in the longer-term context. MW found recent warmth to be unusual in a long-term context: they estimated an 80% likelihood that the decade 1997-2006 was warmer than any other for at least the past 1000 years. Using the more appropriate 55-proxy dataset with the same estimation procedure (which involved retaining K=10 PCs of the proxy data) yields a higher probability of 84% that recent decadal warmth is unprecedented for the past millennium.
However, K=10 principal components is almost certainly too large, and the resulting reconstruction likely suffers from statistical over-fitting. Objective selection criteria applied to the M08 AD 1000 proxy network as well as independent “pseudoproxy” analyses (discussed below) favor retaining only K=4 PCs. (Note that MW correctly point out that SMR made an error in calculating this, but correct application of the Wilks (2006) method fortunately does not change the result: 4 PCs should be retained in each case.) Nonetheless, this choice yields a very close match with the relevant M08 reconstruction. It also yields considerably higher probabilities up to 99% that recent decadal warmth is unprecedented for at least the past millennium. These posterior probabilities imply substantially higher confidence than the “likely” assessment by M08 and IPCC (2007) (a 67% level of confidence). Indeed, a probability of 99% not only exceeds the IPCC “very likely” threshold (90%), but reaches the “virtually certain” (99%) threshold. In this sense, the MW analysis, using the proper proxy data and proper methodological choices, yields inferences regarding the unusual nature of recent warmth that are even more confident than expressed in past work.
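To make the likelihood language above concrete, here is a minimal Python sketch of the mapping from a posterior probability to the confidence terms quoted in this post. It is an illustration only, not code from MW, SMR or the IPCC: the function name `likelihood_term` is hypothetical, and the cutoffs (67%, 90%, 99%) are simply the ones cited in the paragraph above.

```python
# Cutoffs as quoted in the text above (illustrative, not an official IPCC table).
THRESHOLDS = [
    (0.99, "virtually certain"),
    (0.90, "very likely"),
    (0.67, "likely"),
]


def likelihood_term(probability):
    """Return the strongest term whose cutoff the probability reaches."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for cutoff, term in THRESHOLDS:
        if probability >= cutoff:
            return term
    return "below 'likely'"


# Probabilities discussed in the post: MW's original 80%, 84% with the
# 55-proxy network at K=10, and up to 99% with K=4.
for p in (0.80, 0.84, 0.99):
    print(f"{p:.0%} -> {likelihood_term(p)}")
```

Under these cutoffs, the 80% and 84% figures both fall in the “likely” band, while 99% reaches “virtually certain”, which is the jump the paragraph describes.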
An important real issue is whether proxy data provides more information than naive models (such as the mean of the calibrating data for instance) or outperforms random noise of various types. This is something that has been addressed in many previous studies which have come to very different conclusions than MW, and so the reasons why MW came to their conclusion are worth investigating. Two factors appear to be important – their use of the “Lasso” method exclusively to assess this, and the use of short holdout periods (30 years) for both extrapolated and interpolated validation periods.
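For readers unfamiliar with this kind of test, the following is a rough Python sketch of a single holdout comparison against the naive calibration-mean model. It is not MW's procedure or code: the helper names `rmse` and `holdout_skill` are hypothetical, the series are made-up toy numbers (a linear trend plus a fixed wiggle), and a real validation would use actual instrumental and proxy data.

```python
import math
import statistics


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(statistics.fmean(squared))


def holdout_skill(observed, predicted, holdout_start, holdout_len=30):
    """Score a model and the naive calibration-mean baseline on one block.

    Returns (model_rmse, naive_rmse) over the held-out years. The naive
    baseline predicts the mean of all years outside the holdout block.
    """
    stop = holdout_start + holdout_len
    calibration = observed[:holdout_start] + observed[stop:]
    held_out = observed[holdout_start:stop]
    naive = [statistics.fmean(calibration)] * len(held_out)
    return rmse(predicted[holdout_start:stop], held_out), rmse(naive, held_out)


# Toy data: 150 "years" with a steady trend; the "model" is off by a fixed 0.05.
observed = [0.005 * i for i in range(150)]
predicted = [x + (0.05 if i % 2 else -0.05) for i, x in enumerate(observed)]

# Extrapolated block (last 30 years) versus interpolated block (interior 30 years).
model_end, naive_end = holdout_skill(observed, predicted, holdout_start=120)
model_mid, naive_mid = holdout_skill(observed, predicted, holdout_start=60)
print("end block:", model_end, naive_end)
print("mid block:", model_mid, naive_mid)
```

With these made-up numbers the naive baseline scores far worse on the end block than on the interior block, purely because of the trend, so where the holdout block sits matters a great deal for how a naive model appears to perform.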
So how do you assess how good a method is? This is addressed in almost half of the discussion papers – Tingley in particular gives strong evidence that Lasso is not in fact a very suitable method, and is outperformed by his Composite Regression method in test cases; Kaplan points out that using noise with significant long term trends will also perform well in interpolation. Both Smith and the paper by Craigmile and Rajaratnam also address this point.
In our submission, we tested all of the MW methods in “pseudoproxy” experiments based on long climate simulations (a standard benchmark used by practitioners in the field). Again, Lasso was outperformed by almost every other method, especially the EIV method used in M08, but even in comparison with the other methods MW introduced. The only support for ‘Lasso’ comes from McIntyre and McKitrick who curiously claim that the main criterion in choosing a method should be how long it has been used in other contexts, regardless of how poorly it performs in practice for a specific new application. A very odd criterion indeed, which if followed would lead to the complete cessation of any innovation in statistical approaches.
The MW rebuttal focuses a lot on SMR and we will take the time to look into the specifics more closely, but some of their criticism is simply bogus. They claim our supplemental code was not usable, but in fact we provided a turnkey R script for every single figure in our submission – something not true of their code, so that is a little cheeky of them [as is declaring one of us to be a mere blogger, rather than a climate scientist ;-) ]. They make a great deal of the fact that we only plotted the ~50 year smoothed data rather than the annual means. But this seems to be more a function of their misconstruing what these reconstructions are for (or are capable of) rather than a real issue. Not least, the smoothing allows the curves and methods to be more easily distinguished – it is not a ‘correction’ to plot noisy annual data in order to obscure the differences in results!
Additionally, MW make an egregiously wrong claim about centering in our calculations. All the PC calculations use prcomp(proxy, center=TRUE, scale=TRUE) to specifically deal with that, while the plots use a constant baseline of 1900-1980 for consistency. They confuse plotting convention with a calculation.
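The distinction between a plotting baseline and the centering done inside the PC calculation can be shown with a small Python sketch. This is not the SMR code (which is in R); it assumes that prcomp with centering and scaling switched on subtracts each column's mean and divides by its sample standard deviation, and the helper names `standardize` and `rebaseline` and the proxy values are made up for illustration.

```python
import statistics


def standardize(column):
    """Center to zero mean and scale to unit sample standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)  # n-1 denominator
    return [(x - mean) / sd for x in column]


def rebaseline(series, years, start, end):
    """Shift a series so its mean over [start, end] is zero (plotting only)."""
    reference = [x for x, y in zip(series, years) if start <= y <= end]
    offset = statistics.fmean(reference)
    return [x - offset for x in series]


years = list(range(1850, 2001, 10))
# Made-up proxy values: a gentle trend plus an alternating wiggle.
proxy = [0.1 * i + (0.3 if i % 2 else -0.3) for i in range(len(years))]

# Re-express the series relative to a 1900-1980 baseline, as a plot would.
shifted = rebaseline(proxy, years, 1900, 1980)

# Standardizing either version gives the same input to the PC step.
a = standardize(proxy)
b = standardize(shifted)
max_difference = max(abs(x - y) for x, y in zip(a, b))
print(max_difference)
```

The printed difference is at the level of floating-point rounding: a constant baseline shift changes how a curve sits on a plot, but it is removed by the centering step and so cannot change the principal components.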
There is a great deal to digest in these discussions, and so we would like to open the discussion here to all of the authors to give their thoughts on how it all stacks up, what can be taken forward, and how such interactions might be better managed in future. For instance, we are somewhat hesitant to support non-peer reviewed contributions (even our own) in the literature, but perhaps others can make a case for it.
In summary, there is much sense in these contributions, and Berliner’s last paragraph sums this up nicely:
The problem of anthropogenic climate change cannot be settled by a purely statistical argument. We can have no controlled experiment with a series of exchangeable Earths randomly assigned to various forcing levels to enable traditional statistical studies of causation. (The use of large-scale climate system models can be viewed as a surrogate, though we need to better assess this.) Rather, the issue involves the combination of statistical analyses and, rather than versus, climate science.
If you want another debunking, see “I went to a statistician fight and a hockey stick broke out,” which presents Deep Climate’s evisceration, summarized as:
McShane and Wyner’s background exposition of the scientific history of the “hockey stick” relies excessively on “grey” literature and is replete with errors, some of which appear to have been introduced through a misreading of secondary sources, without direct consultation of the cited sources. And the authors’ claims concerning the performance of “null” proxies are clearly contradicted by findings in two key studies cited at length, Mann et al 2008 and Ammann and Wahl 2007. These contradictions are not even mentioned, let alone explained, by the authors. In short, this is a deeply flawed study and if it were to be published as anything resembling the draft I have examined, that would certainly raise troubling questions about the peer review process at the Annals of Applied Statistics.
And the best scientific gunslingers always shoot with actual data, so, for completeness’ sake and new readers, let’s run through the recent literature.
There are now more studies that show recent warming is unprecedented – in magnitude and speed and cause – than you can shake a stick at!
As with a pride of lions, and a delusion of disinformers, perhaps the grouping should get its own name, like “a team of hockey sticks” (see “The Curious Case of the Hockey Stick that Didn’t Disappear”).
- GRL: “We conclude that the 20th century warming of the incoming intermediate North Atlantic water has had no equivalent during the last thousand years.”
- JGR: “The last decades of the past millennium are characterized again by warm temperatures that seem to be unprecedented in the context of the last 1600 years.” [figure below]
Reconstructed tropical South American temperature anomalies (normalized to the 1961-1990 AD average) for the last ~1600 years (red curve, smoothed with a 39-year Gaussian filter). The shaded region envelops the ±2σ uncertainty as derived from the validation period. Poor core quality precluded any chemical analysis for the time interval between 1580 and 1640 AD.
Yes, the 39-year Gaussian filter appears to wipe out over half of the warming since 1950 as this NASA chart makes clear:
For the record, even a moderate MWP (even if it were global, which remains unproven) does nothing whatsoever to undermine our understanding of human-caused global warming. The temperature trend in the past millennium prior to about 1850 is well explained in the scientific literature as primarily due to changes in the solar forcing along with the effect of volcanoes, whereas the recent rise in temperature has been driven primarily, if not almost entirely, by human activity (see Scientist: “Our conclusions were misinterpreted” by Inhofe, CO2 – but not the sun – “is significantly correlated” with temperature since 1850 and a post to be named later).
The Geophysical Research Letters paper, “Twentieth century warming in deep waters of the Gulf of St. Lawrence: A unique feature of the last millennium” concludes:
… irrespective of the precise mechanisms responsible for the temperature variations reconstructed from core MD99-2220, it is unquestionable that the last century has been marked there by a warming trend having no equivalent over the last millennium.
For those keeping score at home, here are a few more members of the team of hockey sticks:
The Hockey Stick lives. And so we have the last Mann standing:
With serious apologies to the original writers of the classic Western:
- A.B. Guthrie Jr. (screenplay)
- Jack Sher (additional dialogue)
- Jack Schaefer (novel)