Steve McIntyre, Climate Audit, 20 July 2005

MBH98 Source Code: Cross-validation R2

Mann has just (July 2005) archived a Fortran program at

Here are my first thoughts on this.

The newly-archived source code and the data archive made public in June-July 2004 (mirrored at Nature) are clearly connected with MBH98, but do not match.

At present, it is impossible to do a run-through and get results. I'll discuss this in an upcoming post. In Mann's shoes, I'd have tried to ensure that everything matched.

The new source code shows how various statistics were calculated and definitely shows the correctness of our surmise that MBH calculated cross-validation R2 statistics and that these statistics were withheld, presumably because the cross-validation R2 statistics were adverse.

A Summary Table here shows a probable reconciliation between statistics as calculated in the computer program and statistics reported in the original Supplementary Information, with the caveat that the identifications in this table are based on the structure of the calculations rather than on a run-through of Mann's Fortran program. (The Supplementary Information link shown here is to the FTP site at the University of Massachusetts, rather than to Nature. The mirror version at Nature was deleted earlier this year. I don't know how often Nature does this. The University of Massachusetts directory was deleted temporarily in November 2003 after the publication of MM03, but it was restored after the late John Daly complained. It's lucky that it's still extant.)

In the original SI, the cross-validation R2 statistic was not reported. You can see columns for calibration beta (which is equivalent to the calibration period R2) and for the verification beta, plus some r^2 and g^2 statistics pertaining to Nino, but, if you look closely, there is no verification R2 statistic. We remarked on this in MM05a and MM05b. We had previously speculated that it seemed inconceivable that the cross-validation R2 statistic would not have been calculated (and thus withheld), but without source code, we were then unable to show this conclusively.

However, the newly-archived source code demonstrates clearly that MBH did calculate the cross-validation R2 statistic (pages 28-29 in my printout). Accordingly, I can now assert that the information was withheld in the original SI.

At this point, we also know that the values of the cross-validation R2 were statistically insignificant (~0.0) in the controversial 15th century reconstruction. One can reasonably surmise that this information would have been very detrimental to widespread acceptance of the MBH98 reconstruction had it been disclosed. The IPCC assertion that the MBH98 reconstruction

"had significant skill in independent cross-validation tests"

is obviously not true for the withheld cross-validation R2 statistic. I previously discussed this inaccurate disclosure by IPCC as illustrating the potential conflict of interest between an author in his capacity as an IPCC review author and in his capacity as the author of the underlying study.

While I anticipated that the code would demonstrate the actual calculation of the cross-validation R2 statistic, there was a bit of a surprise in the form of another discrepancy between statistics calculated in the program and statistics reported in the original SI.

The program shows that a verification period RE statistic was calculated for the Nino index; however, the original SI only reported a verification period R2 statistic - reversing the reporting pattern for the NH temperature index. In this case, I presume that the verification RE statistic for the Nino calculation will be adverse. However, I have not attempted to replicate the MBH98 Nino calculations and this is merely a surmise at present.
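For readers who want to see how the two verification statistics can part company, here is a minimal sketch in Python. It assumes the usual textbook definitions: R2 as the squared Pearson correlation between observed and reconstructed values over the verification period, and RE as one minus the ratio of the reconstruction's squared error to the squared error of simply predicting the calibration-period mean. The function names and the numbers are made up for illustration; they are not MBH98 data and the code is not a transcription of Mann's Fortran.

```python
from math import fsum


def verification_r2(observed, predicted):
    """Squared Pearson correlation over the verification period."""
    n = len(observed)
    mean_obs = fsum(observed) / n
    mean_pred = fsum(predicted) / n
    cov = fsum((o - mean_obs) * (p - mean_pred)
               for o, p in zip(observed, predicted))
    var_obs = fsum((o - mean_obs) ** 2 for o in observed)
    var_pred = fsum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov * cov / (var_obs * var_pred)


def reduction_of_error(observed, predicted, calibration_mean):
    """RE: skill relative to always predicting the calibration-period mean."""
    sse = fsum((o - p) ** 2 for o, p in zip(observed, predicted))
    sse_ref = fsum((o - calibration_mean) ** 2 for o in observed)
    if sse_ref == 0:
        raise ValueError("RE is undefined when the reference error is zero")
    return 1.0 - sse / sse_ref


# Hypothetical verification-period values (illustrative only).
# The calibration-period mean is taken to be 0.0, and the verification
# period is cooler on average (mean -0.3) in both series.
calibration_mean = 0.0
observed = [-0.5, -0.1, -0.5, -0.1]
predicted = [-0.4, -0.4, -0.2, -0.2]

r2 = verification_r2(observed, predicted)
re = reduction_of_error(observed, predicted, calibration_mean)
print(f"verification R2 = {r2:.3f}")
print(f"verification RE = {re:.3f}")
```

In this toy case the reconstruction gets the shift in the mean right but tracks none of the year-to-year variation, so RE comes out around 0.62 while R2 is zero to rounding. That is why reporting one statistic without the other can leave a reader with an incomplete picture.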

I strongly believe that the authors had a responsibility to report adverse statistics, such as the cross-validation R2, and were not entitled to withhold this information. This also applies to Wahl and Ammann, who similarly do not report a cross-validation R2 statistic. In their case, the code as published does not even include the calculation of cross-validation R2 statistics, but I would be astonished if they had not calculated these values at some point and later edited the step out of their code.

Mann has begun the process of trying to justify withholding the R2 statistic in one of his answers to the House Committee letters. In my opinion, this attempted justification is very unsatisfactory. If the authors had wished to argue (as they are now attempting to do at this late stage) that the RE statistic is "preferred", this should have been done at the time, after ensuring that the reader was in possession of the statistics that the authors had calculated, thereby permitting the reader to come to his own conclusion on these matters. The selective omission of the cross-validation R2 statistic is a material distortion of the

It's late in the day to be arguing these matters after positions have been taken and locked in. I have no doubt, as I've mentioned recently, that, if the IPCC had reported that the MBH98 reconstruction had a cross-validation R2 of ~0.0 (rather than claiming that it had "significant skill in independent cross-validation tests"), the MBH98 hockey stick graph would not have been featured in IPCC. If it had been reported in the original publication, it's possible that the original article would not have been published in the first place. It will be interesting to see what the various learned societies and individuals will make of this.