
I just cannot stop thinking about this graph that appeared with **this article** in the NYTimes recently. The piece discussed how the number of hot summer days, those above 90 degrees F, is projected to increase in the future, and it allows readers to enter their town and date of birth to see how the weather has changed between then and now.

Hmmm…. Well, we **all** know that climate is always changing, and we all know that it is warmer now, in general, than it was 100 years ago, but beyond that what do this article and its interactive graphic tell us?

I imagine that a lot of readers misinterpret the data plot and believe that it represents the rise in temperature in NYC over the recorded period: my experience is that most readers of these articles in the Times are not too concerned with details of data and data presentation. In fact, it is more accurate to say that the chart shows the number of “above 90-degree F days” in NYC over the period. That is, a count of days, not temperatures. Except that it doesn’t show that… On the left there is some text that says that it shows the “average number of days above 90-degrees F.” What does *that* mean?

If we look at the data point for the year 2010, we find a value of about ten days. Ten days above 90F in 2010? You could easily check the record to see if that is accurate. But the text says that ten days is the “average number” in 2010. In that year, there were either ten days above 90F or there were not ten days. An average does not enter into the discussion. That would be as if we said that June, on average, has thirty days.

The confusion is eliminated when we read the FAQ and Methodology document to which a link is provided at the end of the article (how many people do that, do you think?). We learn that the data plot shows a twenty-year moving average of the above-90F days for each year. For example, for the year 2000, the numbers of above-90F days for 1990, 1991, 1992…2000…2008, 2009, 2010 are added up and divided by twenty-one (there are twenty-one years’ values) and an average is obtained. For 2001, the same process is used, but the summed years begin with 1991 and end with 2011. Moving averages are often used to smooth out a data curve: in this case, without one the plot would be very “spiky,” with sudden changes in the number of above-90F days from year to year. Smoothing the data gives a better idea of the trend, but it is good practice to make clear up front that you have done so, which the authors of the piece do not do.
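For readers who want to see the mechanics, the centered window described in the methodology can be sketched in a few lines of Python. The function name and the toy counts below are mine, for illustration only; this is not the Climate Lab’s code or data:

```python
def centered_moving_average(values, half_width=10):
    """Average each value with the half_width values before and after it.

    For half_width=10 this is the 21-year centered window described in
    the article's methodology (e.g. 1990-2010 for the year 2000).
    Years too close to either end of the record get None, because the
    full window is not available there.
    """
    n = len(values)
    out = []
    for i in range(n):
        if i < half_width or i + half_width >= n:
            out.append(None)  # incomplete window at the edges
        else:
            window = values[i - half_width : i + half_width + 1]
            out.append(sum(window) / len(window))
    return out

# Toy example: hypothetical hot-day counts for 25 consecutive years.
counts = [8, 12, 5, 9, 14, 7, 11, 6, 10, 13, 9, 8, 12, 10, 7,
          11, 9, 14, 6, 10, 8, 12, 9, 11, 7]
smoothed = centered_moving_average(counts)
# Only the middle years (indices 10 through 14) have a full 21-year
# window; the rest come out None.
```

Note how spiky the raw counts are compared with the smoothed values — that is exactly why the smoothing is done, and also why it deserves an up-front mention.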

On the other hand, what about the years 2008 through 2018? For example, take the year 2015: we get a twenty-year moving average by summing the data from 2005 to 2015, and adding that to the data for 2016 **to 2025**… Oops! There is NO DATA for the years after 2017!! The kindly scientists at the Climate Impact Lab of Columbia University have used *model data*, *simulated data*, or shall we say, *created data* in place of actual historical data. They do, obliquely, note this fact in their FAQ and Methodology text, but you’d never know it by looking at the graph.
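This gap falls straight out of the window arithmetic. A small sketch of the dates involved (my own illustration, not the Lab’s code):

```python
LAST_OBSERVED_YEAR = 2017  # per the article, no recorded data after this
HALF_WIDTH = 10            # the 21-year centered window

def years_needing_model_data(center_year):
    """Return the window years that fall past the observed record.

    For any center year after LAST_OBSERVED_YEAR - HALF_WIDTH (2007),
    part of the 21-year window must be filled with something other
    than recorded observations.
    """
    window = range(center_year - HALF_WIDTH, center_year + HALF_WIDTH + 1)
    return [y for y in window if y > LAST_OBSERVED_YEAR]

print(years_needing_model_data(2015))  # 2018 through 2025: eight modeled years
print(years_needing_model_data(2005))  # []: a fully observed window
```

So every plotted point from 2008 onward blends recorded data with modeled values, and the blend grows as the center year approaches the end of the record.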

Consider this: their models show temperatures rising and above-90F days increasing, so the trendline after 2017 is rising. But unlike the rest of the graph, that is NOT actual recorded data. For all we know, the data record during that period is flat, or perhaps moving downward.

And speaking of flat data records: at least in NYC, the period from 1990 to 2017 (keeping in mind that the data for 2008 to 2017 are not actually *the* historical data) looks pretty much horizontal, i.e. constant, not increasing. But sure enough, we can be completely confident that the upward trend that begins… next year will come about.

Well, we cannot be completely sure because the Climate Lab also tells us – they are honest, if not forthcoming – that the results plotted here represent the data range that two-thirds of the models project. I’m used to hearing the IPCC and other outfits talk about high or very high confidence in projections, i.e. a 90 or 95% confidence interval, but here we have a “just likely,” …mebbe… confidence interval of 66%. Of course, this is simply a statistical sample of modeled results, described with the unspoken assumption that the models are correct, or nearly correct, or more correct than not correct… 🙂 If all the models share a few assumptions and parameters that later are disproved, then the fact that 66% predict this is hardly something to inspire confidence. This, by the way, goes for all the climate projection models.
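For what a 66% band means in practice: the central two-thirds range of an ensemble can be read off with simple percentiles. The sketch below uses invented model counts purely for illustration — it is not the Lab’s data or its actual method:

```python
# Hypothetical above-90F day counts projected by 12 different models
# for one future year (invented numbers, for illustration only).
model_projections = [9, 11, 12, 13, 14, 14, 15, 16, 17, 18, 20, 23]

def central_interval(values, coverage=0.66):
    """Return the interval holding the central `coverage` fraction of
    the sorted values (roughly the 17th-83rd percentile band for 66%)."""
    s = sorted(values)
    tail = (1 - coverage) / 2
    lo = s[int(tail * len(s))]
    hi = s[int((1 - tail) * len(s)) - 1]
    return lo, hi

low, high = central_interval(model_projections)
# Roughly two-thirds of the models fall inside [low, high]; the other
# third project values outside it -- that is all a 66% band claims.
```

And note what the band does *not* claim: it says nothing about whether the ensemble as a whole is biased, which is exactly the shared-assumptions problem raised above.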

It would be nice if this graph for NYC were to be published every year in the NYTimes. Then we could see each year how accurate the projections actually were. Instead, this plot will be forgotten, and next year there will be a new batch, showing the rise in this or that frightful metric after the fateful year at hand.

Of course, it could happen exactly the way they are claiming it will. We shall see…!