Climate Delusions 1 – Karl et al 2015 propaganda

This is the first in a planned series of climate delusions. These are short pieces on where the climate alarmists are either deluding themselves, or deluding others, about the evidence to support the global warming hypothesis; the likely implications of changing the climate; the consequential implications of a changing / changed climate; or the associated policies to either mitigate or adapt to the harms. Where possible I will make suggestions of ways to avoid the delusions.

Why is the Karl et al 2015 paper, Possible artifacts of data biases in the recent global surface warming hiatus, proclaimed to be the pause-buster?

The concluding comments of the paper give the following boast:-

Newly corrected and updated global surface temperature data from NOAA’s NCEI do not support the notion of a global warming “hiatus.”  …..there is no discernable (statistical or otherwise) decrease in the rate of warming between the second half of the 20th century and the first 15 years of the 21st century. Our new analysis now shows that the trend over the period 1950–1999, a time widely agreed as having significant anthropogenic global warming (1), is 0.113°C decade−1 , which is virtually indistinguishable from the trend over the period 2000–2014 (0.116°C decade−1 ). Even starting a trend calculation with 1998, the extremely warm El Niño year that is often used as the beginning of the “hiatus,” our global temperature trend (1998–2014) is 0.106°C decade−1 —and we know that is an underestimate because of incomplete coverage over the Arctic. Indeed, according to our new analysis, the IPCC’s statement of 2 years ago—that the global surface temperature “has shown a much smaller increasing linear trend over the past 15 years than over the past 30 to 60 years”—is no longer valid.

An opinion piece in Science, Much-touted global warming pause never happened, basically repeats these claims.

In their paper, Karl’s team sums up the combined effect of additional land temperature stations, corrected commercial ship temperature data, and corrected ship-to-buoy calibrations. The group estimates that the world warmed at a rate of 0.086°C per decade between 1998 and 2012—more than twice the IPCC’s estimate of about 0.039°C per decade. The new estimate, the researchers note, is much closer to the rate of 0.113°C per decade estimated for 1950 to 1999. And for the period from 2000 to 2014, the new analysis suggests a warming rate of 0.116°C per decade—slightly higher than the 20th century rate. “What you see is that the slowdown just goes away,” Karl says.

The Skeptical Science temperature trend calculator gives very similar results. For 1950-1999 it gives a linear trend of 0.112°C decade−1 against Karl's 0.113°C decade−1, and for 2000-2014 it gives 0.097°C decade−1 against Karl's 0.116°C decade−1. On those figures there is no real sign of a slowdown.

However, looking at any temperature anomaly chart, whether Karl, NASA Gistemp, or HADCRUT4, it is clear that the period 1950-1975 showed little or no warming, whilst the last quarter of the twentieth century showed significant warming. This is confirmed by the Sks trend calculator figures in Figure 1.

What can be clearly seen is that the claim of no slowdown in the twenty-first century compared with previous years is dependent on the selection of the period. To repeat the Karl et al. concluding claim:-

Indeed, according to our new analysis, the IPCC’s statement of 2 years ago—that the global surface temperature “has shown a much smaller increasing linear trend over the past 15 years than over the past 30 to 60 years”—is no longer valid.

The period 1976-2014 is in the middle of that range, and from the Sks temperature trend calculator the figure is 0.160°C decade−1. That trend is significantly higher than 0.097, so a slowdown has taken place. Any remotely competent peer review would have checked this most startling claim. The comparative figures from HADCRUT4 are shown in Figure 2.

With the HADCRUT4 temperature trend it is not so easy to claim that there has been no significant slowdown. But the full claim in the Karl et al paper to be a pause-buster can only be made by a combination of recalculating the temperature anomaly figures and selecting the 1950-1999 period for comparison with the twenty-first century warming. It is the latter part that makes the "pause-buster" claims a delusion.
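
This sort of period-dependence is easy to check for oneself. Below is a minimal sketch (my own, not the Karl et al or Skeptical Science code) of a least-squares trend calculation over selectable periods; the anomaly series used is illustrative dummy data, so the real HADCRUT4 or Karl annual figures would need to be substituted to reproduce the numbers quoted above.

```python
import numpy as np

def decadal_trend(years, anoms, start, end):
    """Ordinary least-squares linear trend in degC per decade over [start, end]."""
    years = np.asarray(years, dtype=float)
    anoms = np.asarray(anoms, dtype=float)
    mask = (years >= start) & (years <= end)
    return 10.0 * np.polyfit(years[mask], anoms[mask], 1)[0]

# Illustrative dummy series: flat to 1975, warming to 1999, slower warming after 2000.
years = np.arange(1950, 2015)
anoms = np.where(years < 1976, 0.0,
        np.where(years < 2000, 0.017 * (years - 1976),
                 0.4 + 0.008 * (years - 2000)))

for period in [(1950, 1999), (1976, 1999), (1998, 2014), (2000, 2014)]:
    print(period, round(decadal_trend(years, anoms, *period), 3), "degC/decade")
```

On such a series the 1950-1999 and 2000-2014 trends can look similar even though the 1976-1999 trend is roughly double the later one, which is the whole point about the choice of comparison period.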

Kevin Marshall

 

Warming Bias in Temperature Data due to Consensus Belief not Conspiracy

In a Cliscep article Science: One Damned Adjustment After Another? Geoff Chambers wrote:-

So is the theory of catastrophic climate change a conspiracy? According to the strict dictionary definition, it is, in that the people concerned clearly conferred together to do something wrong – namely introduce a consistent bias in the scientific research, and then cover it up.

This was in response to the latest David Rose article in the Mail on Sunday, about claims that the infamous Karl et al 2015 paper breached America's National Oceanic and Atmospheric Administration's (NOAA) own rules on scientific integrity.

I would counter this claim about conspiracy in respect of temperature records, even in the strict dictionary definition. Still less does it conform to a conspiracy theory in the sense of some group, with a grasp of what they believe to be the real truth, acting together to provide an alternative to that truth, or to divert attention and resources away from that understanding of the truth, like an internet troll. A clue as to why this is the case comes from one of the most notorious Climategate emails: Kevin Trenberth to Michael Mann on Mon, 12 Oct 2009, copied to most of the leading academics in the "team" (including Thomas R. Karl).

The fact is that we can’t account for the lack of warming at the moment and it is a travesty that we can’t. The CERES data published in the August BAMS 09 supplement on 2008 shows there should be even more warming: but the data are surely wrong. Our observing system is inadequate.

It is the first sentence that was commonly quoted, but it is the last part that is the most relevant for temperature anomalies. There are inevitably a number of homogenisation runs to get a single set of anomalies. For example, the Reykjavik temperature data was (a) adjusted by the Iceland Met Office by standard procedures to allow for known local biases, (b) adjusted for GHCNv2 (the "raw data"), (c) adjusted again in GHCNv3, and (d) homogenized by NASA to be included in Gistemp.

There are steps that I have missed. Certainly Gistemp homogenize the data quite frequently as new sets of data arrive. As Paul Matthews notes, adjustments are unstable. Although one data set might on average be pretty much the same as previous ones, there will be quite large anomalies thrown out every time the algorithms are re-run for new data. What is more, due to the nature of the computer algorithms, there is no audit trail, therefore the adjustments are largely unexplainable with reference to the data before, let alone with reference to the original thermometer readings. So how does one know whether the adjustments are reasonable or not, except through a belief in how the results ought to look? In the case of climatologists like Kevin Trenberth and Thomas R. Karl, variations that show more warming than the previous run will be more readily accepted as correct than variations that show less. That is, they will find reasons why a particular temperature data set now shows greater warming than before, but will reject as outliers results that show less warming than before. It is the same when choosing techniques, or adjusting for biases in the data. This is exacerbated when a number of different bodies with similar belief systems try to seek a consensus of results, as Zeke Hausfather alludes to in his article at the CarbonBrief. Rather than results being verified against the real world, the temperature data comes to conform to the opinions of others with similar beliefs about the world.
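
To see how re-running an adjustment algorithm can quietly rewrite the past, consider a deliberately crude toy example (my own sketch, not any agency's actual code). A single step is estimated from the data available at the time of the run and applied to all earlier years; simply extending the series by a few years changes the estimated step, and therefore every "adjusted" value decades back, even though no old reading has changed.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1950, 2016)
# Toy candidate-minus-reference difference series with a genuine 0.6 degC step in 1980
diff = np.where(years < 1980, 0.0, 0.6) + rng.normal(0.0, 0.15, years.size)

def estimated_step(years, diff, end_year, break_year=1980):
    """Estimate the break size (post-break mean minus pre-break mean) using only
    the data available up to end_year; this is what gets applied to pre-break years."""
    use = years <= end_year
    y, d = years[use], diff[use]
    return d[y >= break_year].mean() - d[y < break_year].mean()

for end in (2000, 2005, 2010, 2015):
    print(end, round(estimated_step(years, diff, end), 3))
# Each run with more data yields a slightly different step, so the whole pre-1980
# adjusted history shifts each time, with no audit trail back to the thermometer readings.
```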

Kevin Marshall

How strong is the Consensus Evidence for human-caused global warming?

You cannot prove a vague theory wrong. If the guess that you make is poorly expressed and the method you have for computing the consequences is a little vague then ….. you see that the theory is good as it can’t be proved wrong. If the process of computing the consequences is indefinite, then with a little skill any experimental result can be made to look like an expected consequence.

Richard Feynman – 1964 Lecture on the Scientific Method

It’s self-evident that democratic societies should base their decisions on accurate information. On many issues, however, misinformation can become entrenched in parts of the community, particularly when vested interests are involved. Reducing the influence of misinformation is a difficult and complex challenge.

The Debunking Handbook 2011 – John Cook and Stephan Lewandowsky

My previous post looked at the attacks on David Rose for daring to suggest that the rapid fall in global land temperatures at the end of the El Nino event was strong evidence that the record highs in global temperatures were not due to human greenhouse gas emissions. The technique used was to look at long-term linear trends. The main problems with this argument were
(a) according to AGW theory warming rates from CO2 alone should be accelerating and at a higher rate than the estimated linear warming rates from HADCRUT4.
(b) HADCRUT4 shows warming stopped from 2002 to 2014, yet in theory the warming from CO2 should have accelerated.

Now there are at least two ways to view my arguments. First is to look at Feynman's approach. The climatologists and associated academics attacking journalist David Rose chose to do so from the perspective of a very blurred specification of AGW theory: that human emissions will cause greenhouse gas levels to rise, which will cause global average temperatures to rise. Global average temperatures have clearly risen in all long-term (>40 year) data sets, so the theory is confirmed. On a rising trend, with large variations due to natural variability, any new records will be primarily "human-caused". But making the theory and data slightly less vague reveals an opposite conclusion. Around the turn of the century the annual percentage increase in CO2 emissions went from 0.4% to 0.5% a year (figure 1), which should have led to an acceleration in the rate of warming. In reality warming stalled.
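
The reasoning step from "faster CO2 growth" to "expected faster warming" can be made concrete with the standard simplified forcing formula, ΔF = 5.35 ln(C/C0) W/m2 (Myhre et al. 1998): a constant percentage growth in concentration gives a roughly linear rise in forcing, so a step up in the growth rate should step up the forcing trend and, other things being equal, the warming rate. A minimal sketch, using 0.4% and 0.5% a year as illustrative concentration growth rates (the figures in the text above refer to emissions, so this is an assumption for illustration only):

```python
import numpy as np

def co2_forcing(conc, c0=280.0):
    """Simplified CO2 radiative forcing in W/m^2 (Myhre et al. 1998)."""
    return 5.35 * np.log(conc / c0)

def forcing_trend(start_conc, growth_rate, n_years=15):
    """Average change in forcing per decade for a constant fractional growth rate."""
    conc = start_conc * (1.0 + growth_rate) ** np.arange(n_years + 1)
    f = co2_forcing(conc)
    return 10.0 * (f[-1] - f[0]) / n_years

print("0.4% a year:", round(forcing_trend(370.0, 0.004), 3), "W/m^2 per decade")
print("0.5% a year:", round(forcing_trend(370.0, 0.005), 3), "W/m^2 per decade")
# The higher growth rate gives a proportionately higher forcing trend, which is why,
# on the theory, the warming rate around the turn of the century should have increased.
```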

The reaction was to come up with a load of ad hoc excuses. The Hockey Schtick blog had reached 66 separate excuses for the "pause" by November 2014, ranging from the peer-reviewed literature to a comment in the UK Parliament. This could be because climate is highly complex, with many variables, where the presence of each contributing factor can only be guessed at, let alone the magnitude of each factor and the interrelationships between them all. So how do you tell which statements are valid information and which are misinformation? I agree with Cook and Lewandowsky that misinformation is pernicious, and difficult to get rid of once it becomes entrenched. So how does one distinguish between the good information and the bad, misleading or even pernicious?

The Lewandowsky / Cook answer is to follow the consensus of opinion. But what is the consensus of opinion? In climate one variation is to follow a small subset of academics in the area who answer in the affirmative to

1. When compared with pre-1800s levels, do you think that mean global temperatures have generally risen, fallen, or remained relatively constant?

2. Do you think human activity is a significant contributing factor in changing mean global temperatures?

The problem is that the first question is just a matter of reading a graph and the second is a belief statement with no precision. Anthropogenic global warming has been a hot topic for over 25 years now, so these two very vague, empirically-based questions, forming the foundations of the subject, should by now be capable of being formulated more precisely. On the second, one would expect pretty clear and unambiguous estimates as to the percentage of warming, so far, that is human caused. Yet the consensus of leading experts is unable to say whether it is 50% or 200% of the warming so far. (There are meant to be time lags and factors like aerosols that might suppress the warming.) This from the 2013 UNIPCC AR5 WG1 SPM section D3:-

It is extremely likely that more than half of the observed increase in global average surface temperature from 1951 to 2010 was caused by the anthropogenic increase in greenhouse gas concentrations and other anthropogenic forcings together.

The IPCC, encapsulating the state-of-the-art knowledge, cannot provide firm evidence in the form of a percentage, or even a fairly broad range, despite having over 60 years of data to work on. It is even worse than it appears. The "extremely likely" phrase is a Bayesian probability statement. Ron Clutz's simple definition from earlier this year was:-

Here’s the most dumbed-down description: Initial belief plus new evidence = new and improved belief.

For the IPCC to claim that their statement is extremely likely, at the fifth attempt, they should be able to show some sort of progress in updating their beliefs in the light of new evidence. That would mean narrowing the estimate of the magnitude of the impact of a doubling of CO2 on global average temperatures. As Clive Best documented in a cliscep comment in October, the IPCC reports from 1990 to 2013 failed to change the estimated range of 1.5°C to 4.5°C. Looking up Climate Sensitivity in Wikipedia we get the origin of the range estimate.

A committee on anthropogenic global warming convened in 1979 by the National Academy of Sciences and chaired by Jule Charney estimated climate sensitivity to be 3 °C, plus or minus 1.5 °C. Only two sets of models were available; one, due to Syukuro Manabe, exhibited a climate sensitivity of 2 °C, the other, due to James E. Hansen, exhibited a climate sensitivity of 4 °C. “According to Manabe, Charney chose 0.5 °C as a not-unreasonable margin of error, subtracted it from Manabe’s number, and added it to Hansen’s. Thus was born the 1.5 °C-to-4.5 °C range of likely climate sensitivity that has appeared in every greenhouse assessment since…

It is revealing that the quote is under the subheading Consensus Estimates. The climate community have collectively failed to update the original beliefs, based on a very rough estimate. The emphasis on referring to consensus beliefs about the world, rather than looking outward for evidence in the real world, is, I would suggest, the primary reason for this failure. Yet such community-based beliefs completely undermine the integrity of the Bayesian estimates, making their use in statements about climate clear misinformation in Cook and Lewandowsky's use of the term. What is more, those in the climate community who look primarily to these consensus beliefs rather than to the data of the real world will endeavour to dismiss the evidence, or make up ad hoc excuses, or smear those who try to disagree. A caricature of these perspectives with respect to global average temperature anomalies is available in the form of a flickering widget at John Cook's skepticalscience website. This purports to show the difference between "realist" consensus and "contrarian" non-consensus views. Figure 2 is a screenshot of the consensus views, interpreting warming as a linear trend. Figure 3 is a screenshot of the non-consensus or contrarian views, which are supposed to interpret warming as a series of short, disconnected periods of no warming, where each period just happens to be at a higher level than the previous one. There are a number of things that this indicates.

(a) The "realist" view is of a linear trend throughout any data series. Yet the period from around 1940 to 1975 shows no warming or slight cooling, depending on the data set. Therefore any linear trend line derived for a period starting before 1970-1975 and ending in 2015 will show a lower rate of warming. This would be consistent with the rate of CO2 increasing over time, as shown in figure 1. But shorten the period, again ending in 2015, and once the period becomes less than 30 years the warming trend will also decrease. This contradicts the theory, unless ad hoc excuses are used, as shown in my previous post using the HADCRUT4 data set.

(b) Those who agree with the consensus are called "Realists", despite looking inwards towards common beliefs. Those who disagree are labelled "Contrarians". This is not inaccurate when there is a dogmatic consensus. But it is utterly false to lump together all those who disagree as holding the same views, especially when no examples are provided of anyone who actually holds such views.

(c) The linear trend appears a more plausible fit than the series of "contrarian" lines. By implication, those who disagree with the consensus are viewed as having a distinctly more blinkered and distorted perspective than those who follow it. Yet even using the Gistemp data set (which gives the greatest support to the consensus views) there is a clear break in the linear trend. The less partisan HADCRUT4 data shows an even greater break.

Those who spot the obvious – that around the turn of the century warming stopped or slowed down, when in theory it should have accelerated – are given a clear choice. They can conform to the scientific consensus, denying the discrepancy between theory and data. Or they can act as scientists, denying the false and empirically empty scientific consensus, receiving the full weight of all the false and career-damaging opprobrium that accompanies it.

Figure 2. Screenshot of the skepticalscience widget showing the "realist" (consensus) view.

Figure 3. Screenshot of the skepticalscience widget showing the "contrarian" (non-consensus) view.

Kevin Marshall

 

Beliefs and Uncertainty: A Bayesian Primer

Ron Clutz’s introduction, based on a Scientific American article by John Horgan on January 4, 2016, starts to grapple with the issues involved.

The take home quote from Horgan is on the subject of false positives.

Here is my more general statement of that principle: The plausibility of your belief depends on the degree to which your belief–and only your belief–explains the evidence for it. The more alternative explanations there are for the evidence, the less plausible your belief is. That, to me, is the essence of Bayes’ theorem.

“Alternative explanations” can encompass many things. Your evidence might be erroneous, skewed by a malfunctioning instrument, faulty analysis, confirmation bias, even fraud. Your evidence might be sound but explicable by many beliefs, or hypotheses, other than yours.

In other words, there’s nothing magical about Bayes’ theorem. It boils down to the truism that your belief is only as valid as its evidence. If you have good evidence, Bayes’ theorem can yield good results. If your evidence is flimsy, Bayes’ theorem won’t be of much use. Garbage in, garbage out.
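
A minimal worked example of the false-positive point, with made-up numbers: even a test or detection method that is right 95% of the time mostly produces false alarms when what it is looking for is rare, because the alternative explanation dominates.

```python
# Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E), with illustrative made-up numbers
p_h = 0.01              # prior: the hypothesis is true in 1% of cases
p_e_given_h = 0.95      # probability of seeing the evidence when the hypothesis is true
p_e_given_not_h = 0.05  # false-positive rate: the same evidence from alternative causes

p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
p_h_given_e = p_e_given_h * p_h / p_e
print(round(p_h_given_e, 3))  # about 0.161: most "positive" results have other explanations
```
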
With respect to the question of whether global warming is human caused, there is basically a combination of three elements – (i) human causes, (ii) natural causes, and (iii) random chaotic variation. There may be a number of sub-elements and an infinite number of combinations, including some elements counteracting others, such as El Nino events counteracting underlying warming. Evaluation of new evidence takes place in the context of explanations being arrived at within a community of climatologists with strong shared beliefs that at least 100% of recent warming is due to human GHG emissions. It is that same community who also decide the measurement techniques for assessing the temperature data, the relevant time frames, and the categorization of the new data. With complex decisions the only clear decision criterion is conformity to the existing consensus conclusions. As a result, the original Bayesian estimates become virtually impervious to new perspectives or evidence that contradicts them.

Science Matters

Those who follow discussions regarding Global Warming and Climate Change have heard from time to time about the Bayes Theorem. And Bayes is quite topical in many aspects of modern society:

Bayesian statistics “are rippling through everything from physics to cancer research, ecology to psychology,” The New York Times reports. Physicists have proposed Bayesian interpretations of quantum mechanics and Bayesian defenses of string and multiverse theories. Philosophers assert that science as a whole can be viewed as a Bayesian process, and that Bayes can distinguish science from pseudoscience more precisely than falsification, the method popularized by Karl Popper.

Named after its inventor, the 18th-century Presbyterian minister Thomas Bayes, Bayes’ theorem is a method for calculating the validity of beliefs (hypotheses, claims, propositions) based on the best available evidence (observations, data, information). Here’s the most dumbed-down description: Initial belief plus new evidence = new and improved belief.   (A fuller and…


A note on Bias in Australian Temperature Homogenisations

Jo Nova has an interesting and detailed guest post by Bob Fernley-Jones on rural sites in Australia that have been heavily homogenised by the Australian BOM.

I did a quick comment that was somewhat lacking in clarity. This post is to clarify my points.

In the post Bob Fernley-Jones stated

The focus of this study has been on rural stations having long records, mainly because the BoM homogenisation process has greatest relevance the older the data is.

Venema et al. 2012 stated (Italics mine)

The most commonly used method to detect and remove the effects of artificial changes is the relative homogenization approach, which assumes that nearby stations are exposed to almost the same climate signal and that thus the differences between nearby stations can be utilized to detect inhomogeneities (Conrad and Pollak, 1950). In relative homogeneity testing, a candidate time series is compared to multiple surrounding stations either in a pairwise fashion or to a single composite reference time series computed for multiple nearby stations.

This assumption that nearby temperature stations are exposed to the same climate signal is standard practice. Victor Venema (who has his own blog) is a leading academic expert on temperature homogenisation. However, there are extreme examples where this assumption does not hold. One example is at the end of the 1960s in much of Paraguay, where average temperatures fell by one degree. As this was not replicated in the surrounding area, both the GISTEMP and Berkeley Earth homogenisations eliminated this anomaly, despite using very different homogenisation techniques. My analysis is here.
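
A minimal sketch of the relative homogenisation approach just quoted, with made-up data standing in for the Paraguayan records: the candidate station is differenced against a composite of its neighbours, and the largest step in that difference series is flagged as an inhomogeneity. If the step is a genuine local climate shift that the neighbours do not share, the algorithm has no way of telling it apart from a station move or instrument change.

```python
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(1950, 2001)

# Neighbours share one climate signal; the candidate has a genuine 1 degC drop in 1968
signal = 0.01 * (years - 1950) + rng.normal(0, 0.1, years.size)
neighbours = np.array([signal + rng.normal(0, 0.1, years.size) for _ in range(5)])
candidate = signal + np.where(years >= 1968, -1.0, 0.0) + rng.normal(0, 0.1, years.size)

reference = neighbours.mean(axis=0)   # composite reference series
diff = candidate - reference          # difference series used for relative testing

def biggest_step(years, diff, margin=5):
    """Return the year splitting the difference series with the largest mean shift."""
    best_year, best_step = None, 0.0
    for i in range(margin, years.size - margin):
        step = diff[i:].mean() - diff[:i].mean()
        if abs(step) > abs(best_step):
            best_year, best_step = years[i], step
    return best_year, round(best_step, 2)

print(biggest_step(years, diff))   # flags roughly 1968 with a step of about -1 degC
```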

On a wider scale take a look at the GISTEMP land surface temperature anomaly map for 2014 against 1976-2010. (obtained from here)


Despite being homogenised and smoothed, it is clear that the trends are different. Over much of North America there was cooling, bucking the global trend. What this suggests to me is that the greater the distance between weather stations, the greater the likelihood that the climate signals will be different. Most importantly for temperature anomaly calculations, over the twentieth century the number of weather stations increased dramatically. So homogenisation is more likely to end up smoothing out local and sub-regional variations in temperature trends in the early twentieth century than in the later period. This is testable.

Why should this problem occur with expert scientists? Are they super beings who know the real temperature data, but have manufactured some falsehood? I think it is something much more prosaic. Those who work at the Australian BOM believe that the recent warming is human caused. In fact they believe that more than 100% of warming is human caused. When looking at outlier data records, or records that show inconsistencies, there is a very human bias. Each time the data is reprocessed they find new inconsistencies, having previously corrected the data.

Kevin Marshall

Climatic Temperature Variations

In the previous post I identified that the standard definition of temperature homogenisation assumes that there is little or no variation in climatic trends within the homogenisation area. I also highlighted specific instances where this assumption has failed. However, the examples may be just isolated and extreme instances, or there might be other, offsetting instances, so the failures could cancel each other out without a systematic bias globally. Here I explore why this assumption should not be expected to hold anywhere, and how it may have biased the picture of recent warming. After a couple of proposals to test for this bias, I look at alternative scenarios that could bias the global average temperature anomalies. I concentrate on the land surface temperatures, though my comments may also have application to the sea surface temperature data sets.

 

Comparing Two Recent Warming Phases

An area that I am particularly interested in is the relative size of the early twentieth century warming compared to the more recent warming phase. This relative size, along with the explanations for those warming periods, gives a route into determining how much of the recent warming was human caused. Dana Nuccitelli tried such an explanation at the skepticalscience blog in 2011 (note 1). Figure 1 shows the NASA Gistemp global anomaly in black, along with a split into eight bands of latitude. Of note are the polar extremes, each covering 5% of the surface area. For the Arctic, the trough to peak of 1885-1940 is pretty much the same as the trough to peak from 1965 to present. But in the earlier period it is effectively cancelled out by the cooling in the Antarctic. This cooling, I found, was likely caused by the use of inappropriate proxy data from a single weather station (note 3).

Figure 1. Gistemp global temperature anomalies by band of latitude (note 2).

For the current issue, of particular note is the huge variation in trends by latitude from the global average derived from the homogenised land and sea surface data. Delving further, GISS provide some very useful maps of their homogenised and extrapolated data (note 4). I compare two identical time lengths – 1944 against 1906-1940 and 2014 against 1976-2010. The selection criteria for the maps are in figure 2.

Figure 2. Selection criteria for the Gistemp maps.

Figure 3. Gistemp map representing the early twentieth surface warming phase for land data only.


Figure 4. Gistemp map representing the recent surface warming phase for land data only.

The later warming phase is almost twice the magnitude of, and has much better coverage than, the earlier warming: 0.43°C against 0.24°C. In both cases the range of warming in the 250km grid cells is between -2°C and +4°C, but the variations are not the same. For instance, the most extreme warming in both periods is at the higher latitudes. But, with respect to North America, in the earlier period the most extreme warming is over the Northwest Territories of Canada, whilst in the later period it is over Western Alaska, with the Northwest Territories showing near average warming. In the United States, in the earlier period there is cooling over the Western USA, whilst in the later period there is cooling over much of the Central USA, and strong warming in California. In the USA the coverage of temperature stations is quite good, at least compared with much of the Southern Hemisphere. Euan Mearns has looked at a number of areas in the Southern Hemisphere (note 4), which he summarised on the map in Figure 5.

Figure 5. Euan Mearns says of the above "S Hemisphere map showing the distribution of areas sampled. These have in general been chosen to avoid large centres of human population and prosperity."

For the current analysis Figure 6 is most relevant.

Figure 6. Euan Mearns says of the above "The distribution of operational stations from the group of 174 selected stations."

The temperature data for the earlier period is much sparser than for the later period. Even where data is available in the earlier period, it could be based on a fifth of the number of temperature stations used in the later period. This may slightly exaggerate the issue, as the coasts of South America and Eastern Australia are avoided.

An Hypothesis on the Homogenisation Impact

Now consider again the description of homogenisation in Venema et al 2012 (note 5), quoted in the previous post.

 

The most commonly used method to detect and remove the effects of artificial changes is the relative homogenization approach, which assumes that nearby stations are exposed to almost the same climate signal and that thus the differences between nearby stations can be utilized to detect inhomogeneities. In relative homogeneity testing, a candidate time series is compared to multiple surrounding stations either in a pairwise fashion or to a single composite reference time series computed for multiple nearby stations. (Italics mine)

 

The assumption of the same climate signal over the homogenisation area will not apply where the temperature stations are thin on the ground. The degree to which homogenisation eliminates real world variations in trend could be, to some extent, inversely related to the density of stations. Given that the density of temperature data points diminishes rapidly in most areas of the world when one goes back in time beyond 1960, homogenisation in the early warming period is far more likely to be between climatically different temperature stations than in the later period. My hypothesis is that, relatively, homogenisation will reduce the early twentieth century warming phase compared with the recent warming phase, as in the earlier period homogenisation will be over much larger areas with larger real climate variations within the homogenisation area.
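
A toy demonstration of the mechanism behind this hypothesis, using entirely synthetic data and my own crude step-removal in place of any real algorithm: a station with a genuine, locally stronger early twentieth century warming keeps that warming when homogenised against a dense network sharing its climate, but loses much of it when homogenised against a sparse network drawn from a climatically different wider area.

```python
import numpy as np

rng = np.random.default_rng(3)
years = np.arange(1900, 1951)

# A genuine local warming of 1 degC between 1915 and 1940; the wider region warmed far less
local_signal = np.interp(years, [1900, 1915, 1940, 1950], [0.0, 0.0, 1.0, 1.0])
wider_signal = 0.3 * local_signal
candidate = local_signal + rng.normal(0, 0.1, years.size)

def homogenised_trend(candidate, reference_signal, n_refs):
    """Remove the largest step in candidate-minus-reference, then return the
    1900-1950 trend of the adjusted candidate in degC per decade."""
    refs = np.array([reference_signal + rng.normal(0, 0.1, years.size)
                     for _ in range(n_refs)])
    diff = candidate - refs.mean(axis=0)
    steps = [(abs(diff[i:].mean() - diff[:i].mean()), i)
             for i in range(5, years.size - 5)]
    _, i = max(steps)
    adjusted = candidate.copy()
    adjusted[i:] -= diff[i:].mean() - diff[:i].mean()
    return round(10 * np.polyfit(years, adjusted, 1)[0], 3)

print("dense, local references  :", homogenised_trend(candidate, local_signal, 10), "degC/decade")
print("sparse, distant references:", homogenised_trend(candidate, wider_signal, 3), "degC/decade")
# The genuine early warming survives in the first case and is substantially reduced in the second.
```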

Testing the Hypothesis

There are at least two ways that my hypothesis can be evaluated. Direct testing of information deficits is not possible.

First is to conduct temperature homogenisations on similar levels of actual data for the entire twentieth century. If done for a region, the actual data used in the global temperature anomalies should be run for that region as well. This should show that, post homogenisation, the recent warming phase is reduced when less data is used.

Second is to examine the relative size of adjustments against the availability of comparative data. This can be done in various ways. For instance, I quite like the examination of the Manaus grid block record that Roger Andrews did in his post The Worst of BEST (note 6).

Counter Hypotheses

There are two counter hypotheses on temperature bias. These may undermine my own hypothesis.

First is the urbanisation bias. Euan Mearns, in looking at temperature data for the Southern Hemisphere, tried to avoid centres of population due to the data being biased. It is easy to surmise that the lack of warming Mearns found in central Australia (note 7) reflects the absence of the urbanisation bias found in the large cities on the coast. However, the GISS maps do not support this. Ronan and Michael Connolly (note 8) of Global Warming Solved claim that the urbanisation bias in the global temperature data is roughly equivalent to the entire warming of the recent epoch. I am not sure that the urbanisation bias is so large, but even if it were, it could be complementary to my hypothesis based on trends.

Second is that homogenisation adjustments could be greater the more distant in the past they occur. It has been noted (by Steve Goddard in particular) that each new set of GISS adjustments adjusts past data. The same data set used to test my hypothesis above could also be utilized to test this hypothesis, by conducting homogenisation runs on the data to date, then only to 2000, then to 1990, etc. It could be that the earlier warming trend is somehow suppressed by homogenizing the most recent data, then working backwards through a number of iterations, each one using the results of the previous pass. The impact on trends that operate over different time periods, but converge over longer periods, could magnify the divergence and thus cause differences in trends decades in the past to be magnified, so that such differences in trend appear to the algorithm to be more anomalous than they actually are.

Kevin Marshall

Notes

  1. Dana Nuccitelli – What caused early 20th Century warming? 24.03.2011
  2. Source http://data.giss.nasa.gov/gistemp/graphs_v3/
  3. See my post Base Orcadas as a Proxy for early Twentieth Century Antarctic Temperature Trends 24.05.2015
  4. Euan Mearns – The Hunt For Global Warming: Southern Hemisphere Summary 14.03.2015. Area studies are referenced on this post.
  5. Venema et al 2012 – Venema, V. K. C., Mestre, O., Aguilar, E., Auer, I., Guijarro, J. A., Domonkos, P., Vertacnik, G., Szentimrey, T., Stepanek, P., Zahradnicek, P., Viarre, J., Müller-Westermeier, G., Lakatos, M., Williams, C. N., Menne, M. J., Lindau, R., Rasol, D., Rustemeier, E., Kolokythas, K., Marinova, T., Andresen, L., Acquaotta, F., Fratianni, S., Cheval, S., Klancar, M., Brunetti, M., Gruber, C., Prohom Duran, M., Likso, T., Esteban, P., and Brandsma, T.: Benchmarking homogenization algorithms for monthly data, Clim. Past, 8, 89-115, doi:10.5194/cp-8-89-2012, 2012.
  6. Roger Andrews – The Worst of BEST 23.03.2015
  7. Euan Mearns – Temperature Adjustments in Australia 22.02.2015
  8. Ronan and Michael Connolly – Summary: “Urbanization bias” – Papers 1-3 05.12.2013


Defining “Temperature Homogenisation”

Summary

The standard definition of temperature homogenisation is of a process that cleanses the temperature data of measurement biases, to leave only variations caused by real climatic or weather variations. This is at odds with the GHCN & GISS adjustments, which delete some data and add in other data as part of the homogenisation process. A more general definition is to make the data more homogenous, for the purposes of creating regional and global average temperatures. This is only compatible with the standard definition if one assumes that there are no real differences in climatic trends within the homogenisation area. From various studies it is clear that there are cases where this assumption does not hold good. The likely impacts include:-

  • Homogenised data for a particular temperature station will not be the cleansed data for that location. Instead it becomes a grid reference point, encompassing data from the surrounding area.
  • Different densities of temperature data may lead to different degrees to which homogenisation results in smoothing of real climatic fluctuations.

Whether or not this failure of understanding is limited to a number of isolated instances with a near zero impact on global temperature anomalies is an empirical matter that will be the subject of my next post.

Introduction

A common feature of many concepts involved with climatology, the associated policies and the sociological analyses of non-believers, is a failure to clearly understand the terms used. In the past few months it has become evident to me that this failure of understanding extends to the term temperature homogenisation. In this post I look at the ambiguity of the standard definition against the actual practice of homogenising temperature data.

The Ambiguity of the Homogenisation Definition

The World Meteorological Organisation in its 2004 Guidelines on Climate Metadata and Homogenization (note 1) wrote this explanation.

Climate data can provide a great deal of information about the atmospheric environment that impacts almost all aspects of human endeavour. For example, these data have been used to determine where to build homes by calculating the return periods of large floods, whether the length of the frost-free growing season in a region is increasing or decreasing, and the potential variability in demand for heating fuels. However, for these and other long-term climate analyses –particularly climate change analyses– to be accurate, the climate data used must be as homogeneous as possible. A homogeneous climate time series is defined as one where variations are caused only by variations in climate.

Unfortunately, most long-term climatological time series have been affected by a number of nonclimatic factors that make these data unrepresentative of the actual climate variation occurring over time. These factors include changes in: instruments, observing practices, station locations, formulae used to calculate means, and station environment. Some changes cause sharp discontinuities while other changes, particularly change in the environment around the station, can cause gradual biases in the data. All of these inhomogeneities can bias a time series and lead to misinterpretations of the studied climate. It is important, therefore, to remove the inhomogeneities or at least determine the possible error they may cause.

That is, temperature homogenisation is necessary to isolate and remove what Steven Mosher has termed measurement biases (note 2) from the real climate signal. But how does this isolation occur?

Venema et al 2012 (note 3) state the issue more succinctly.

The most commonly used method to detect and remove the effects of artificial changes is the relative homogenization approach, which assumes that nearby stations are exposed to almost the same climate signal and that thus the differences between nearby stations can be utilized to detect inhomogeneities (Conrad and Pollak, 1950). In relative homogeneity testing, a candidate time series is compared to multiple surrounding stations either in a pairwise fashion or to a single composite reference time series computed for multiple nearby stations. (Italics mine)

Blogger …and Then There's Physics (ATTP) partly recognizes that these issues may exist in his stab at explaining temperature homogenisation (note 4).

So, it all sounds easy. The problem is, we didn’t do this and – since we don’t have a time machine – we can’t go back and do it again properly. What we have is data from different countries and regions, of different qualities, covering different time periods, and with different amounts of accompanying information. It’s all we have, and we can’t do anything about this. What one has to do is look at the data for each site and see if there’s anything that doesn’t look right. We don’t expect the typical/average temperature at a given location at a given time of day to suddenly change. There’s no climatic reason why this should happen. Therefore, we’d expect the temperature data for a particular site to be continuous. If there is some discontinuity, you need to consider what to do. Ideally you look through the records to see if something happened. Maybe the sensor was moved. Maybe it was changed. Maybe the time of observation changed. If so, you can be confident that this explains the discontinuity, and so you adjust the data to make it continuous.

What if there isn’t a full record, or you can’t find any reason why the data may have been influenced by something non-climatic? Do you just leave it as is? Well, no, that would be silly. We don’t know of any climatic influence that can suddenly cause typical temperatures at a given location to suddenly increase or decrease. It’s much more likely that something non-climatic has influenced the data and, hence, the sensible thing to do is to adjust it to make the data continuous. (Italics mine)

The assumption that nearby temperature stations have the same (or a very similar) climatic signal, if true, would mean that homogenisation would cleanse the data of the impurities of measurement biases. But only a cursory glance is given to the data. For instance, when Kevin Cowtan gave an explanation of the fall in average temperatures at Puerto Casado, neither he, nor anyone else, checked to see if the explanation stacked up beyond confirming that there had been a documented station move at roughly that time. Yet the station move is at the end of the drop in temperatures, and a few minutes checking would have confirmed that other nearby stations exhibit very similar temperature falls (note 5). If you have a preconceived view of how the data should be, then a superficial explanation that conforms to that preconception will be sufficient. If you accept the authority of experts over personally checking for yourself, then the claim by experts that there is not a problem is sufficient. Those with no experience of checking the outputs following processing of complex data will not appreciate the issues involved.

However, this definition of homogenisation appears to be different from that used by GHCN and NASA GISS. When Euan Mearns looked at temperature adjustments in the Southern Hemisphere and in the Arctic (note 6), he found numerous examples in the GHCN and GISS homogenisations of infilling of some missing data and, to a greater extent, deletion of huge chunks of temperature data. For example, this graphic is Mearns' spreadsheet of adjustments between GHCNv2 (raw data + adjustments) and GHCNv3 (homogenised data) for 25 stations in Southern South America. The yellow cells are where V2 data exist but V3 data do not; the green cells are where V3 data exist but V2 data do not.

Definition of temperature homogenisation

A more general definition that encompasses the GHCN / GISS adjustments is of broadly making the data homogenous. It is not done by simply blending the data together and smoothing it out. Homogenisation also adjusts anomalous data as a result of pairwise comparisons between local temperature stations or, in the case of extreme differences, the GHCN / GISS process deletes the most anomalous data. This is a much looser and broader process than the homogenisation of milk, or putting some food through a blender.

I cover the definition in more depth in the appendix.

The Consequences of Making Data Homogeneous

Cleansing the data in order to make it more homogenous gives rise to a distinction that is missed by many. It arises from making the strong assumption that there are no climatic differences between the temperature stations in the homogenisation area.

Homogenisation is aimed at adjusting for the measurement biases to give a climatic reading for the location of the temperature station that is a closer approximation to what that reading would have been without those biases. With the strong assumption, making the data homogenous is identical to removing the non-climatic inhomogeneities. Cleansed of these measurement biases, the temperature data is then both the set of average temperature readings that would have been generated if the temperature station had been free of biases and a representative record for the area. This latter aspect is necessary to build up a global temperature anomaly, which is constructed by dividing the surface into a grid. Homogenisation, in the sense of making the data more homogenous by blending, is an inappropriate term. All that is happening is adjusting for anomalies within the data through comparisons with local temperature stations (the GHCN / GISS method) or with an expected regional average (the Berkeley Earth method).

But if the strong assumption does not hold, homogenisation will adjust these climatic differences as well, and will to some extent fail to eliminate the measurement biases. Homogenisation is in fact made more necessary if movements in average temperatures are not the same everywhere and the spread of temperature data is spatially uneven. Then homogenisation needs not only to remove the anomalous data, but also to make specific locations more representative of the surrounding area. This enables any imposed grid structure to create an estimated average for that area through averaging the homogenized temperature data sets within the grid area. As a consequence, the homogenised data for a temperature station will cease to be a closer approximation to what the thermometers would have read free of any measurement biases. As homogenisation is calculated by comparisons with temperature stations beyond those immediately adjacent, there will be, to some extent, influences of climatic changes beyond the local temperature stations. The consequences of climatic differences within the homogenisation area include the following.

  • The homogenised temperature data for a location could appear largely unrelated to the original data or to the data adjusted for known biases. This could explain the homogenised Reykjavik temperatures, where Trausti Jonsson of the Icelandic Met Office, who had been working with the data for decades, could not understand the GHCN/GISS adjustments (note 7).
  • The greater the density of temperature stations in relation to the climatic variations, the less those climatic variations will impact on the homogenisations, and the greater will be the removal of actual measurement biases. Climate variations are unlikely to be much of an issue with the Western European and United States data. But on the vast majority of the earth's surface, whether land or sea, coverage is much sparser.
  • If the climatic variation at a location is of different magnitude to that of other locations in the homogenisation area, but over the same time periods and in the same direction, then the data trends will be largely retained. For instance, in Svalbard the warming temperature trends of the early twentieth century and from the late 1970s were much greater than elsewhere, so were adjusted downwards (note 8).
  • If there are differences in the rate of temperature change, or in the time periods of similar changes, then any "anomalous" data due to climatic differences at the location will be eliminated or severely adjusted, on the same basis as "anomalous" data due to measurement biases. For instance, in a large part of Paraguay at the end of the 1960s average temperatures fell by around 1°C. Because this phenomenon did not occur in the surrounding areas, both the GHCN and Berkeley Earth homogenisation processes adjusted out this trend. As a consequence of this adjustment, a mid-twentieth century cooling in the area was effectively adjusted out of the data (note 9).
  • If a large proportion of temperature stations in a particular area have consistent measurement biases, then homogenisation will retain those biases, as they will not appear anomalous within the data. For instance, much of the extreme warming post 1950 in South Korea is likely to have been a result of urbanization (note 10).

Other Comments

Homogenisation is just part of the process of adjusting data for the twin purposes of attempting to correct for biases and of building regional and global temperature anomalies. It cannot, for instance, correct for time of observation biases (TOBS). This needs to be done prior to homogenisation. Neither will homogenisation build a global temperature anomaly. Extrapolating from the limited data coverage is a further process, whether for fixed temperature stations on land or for the ship measurements used to calculate the ocean surface temperature anomalies. This extrapolation has further difficulties. For instance, in a previous post (note 11) I covered a potential issue with the Gistemp proxy data for Antarctica prior to permanent bases being established on the continent in the 1950s. Making the data homogenous is but the middle part of a wider process.
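
As a minimal illustration of the gridding step that turns homogenised station anomalies into a regional or global figure (a sketch of the general approach, not any particular agency's code): stations are binned into latitude-longitude cells so that dense clusters do not dominate, and the cell means are then combined with weights proportional to cell area, which shrinks with the cosine of latitude.

```python
import math
from collections import defaultdict

def grid_average(stations, cell_size=5.0):
    """stations: list of (lat, lon, anomaly). Returns an area-weighted average of
    grid-cell means; cells with no stations are simply left out (no extrapolation)."""
    cells = defaultdict(list)
    for lat, lon, anom in stations:
        key = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        cells[key].append((lat, anom))
    total = weight = 0.0
    for values in cells.values():
        cell_mean = sum(a for _, a in values) / len(values)
        mean_lat = sum(lat for lat, _ in values) / len(values)
        w = math.cos(math.radians(mean_lat))   # cell area shrinks towards the poles
        total += w * cell_mean
        weight += w
    return total / weight

# Hypothetical anomalies: three clustered warm stations and one cool remote station
stations = [(51.5, 0.1, 1.2), (51.6, 0.4, 1.1), (51.4, 0.2, 1.3), (-22.3, -57.9, -0.5)]
print(round(grid_average(stations), 3))   # the cluster counts as one cell, not three stations
```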

Homogenisation is a complex process. The Venema et al 2012 (note 3) paper on the benchmarking of homogenisation algorithms demonstrates that different algorithms produce significantly different results. What is clear from the original posts on the subject by Paul Homewood, and the more detailed studies by Euan Mearns and Roger Andrews at Energy Matters, is that the whole process of going from the raw monthly temperature readings to the final global land surface average trends has thrown up some peculiarities. In order to determine whether they are isolated instances that have near zero impact on the overall picture, or point to more systematic biases that result from the points made above, it is necessary to understand the data available in relation to the overall global picture. That will be the subject of my next post.

Kevin Marshall

Notes

  1. GUIDELINES ON CLIMATE METADATA AND HOMOGENIZATION by Enric Aguilar, Inge Auer, Manola Brunet, Thomas C. Peterson and Jon Wieringa
  2. Steven Mosher – Guest post : Skeptics demand adjustments 09.02.2015
  3. Venema et al 2012 – Venema, V. K. C., Mestre, O., Aguilar, E., Auer, I., Guijarro, J. A., Domonkos, P., Vertacnik, G., Szentimrey, T., Stepanek, P., Zahradnicek, P., Viarre, J., Müller-Westermeier, G., Lakatos, M., Williams, C. N., Menne, M. J., Lindau, R., Rasol, D., Rustemeier, E., Kolokythas, K., Marinova, T., Andresen, L., Acquaotta, F., Fratianni, S., Cheval, S., Klancar, M., Brunetti, M., Gruber, C., Prohom Duran, M., Likso, T., Esteban, P., and Brandsma, T.: Benchmarking homogenization algorithms for monthly data, Clim. Past, 8, 89-115, doi:10.5194/cp-8-89-2012, 2012.
  4. …and Then There’s Physics – Temperature homogenisation 01.02.2015
  5. See my post Temperature Homogenization at Puerto Casado 03.05.2015
  6. For example

    The Hunt For Global Warming: Southern Hemisphere Summary

    Record Arctic Warmth – in 1937

  7. See my post Reykjavik Temperature Adjustments – a comparison 23.02.2015
  8. See my post RealClimate’s Mis-directions on Arctic Temperatures 03.03.2015
  9. See my post Is there a Homogenisation Bias in Paraguay’s Temperature Data? 02.08.2015
  10. NOT A LOT OF PEOPLE KNOW THAT (Paul Homewood) – UHI In South Korea Ignored By GISS 14.02.2015

Appendix – Definition of Temperature Homogenisation

When discussing temperature homogenisations, nobody asks what the term actually means. In my house we consume homogenised milk. This is the same as the pasteurized milk I drank as a child except for one aspect. As a child I used to compete with my siblings to be the first to open a new pint bottle, as it had the cream on top. The milk now does not have this cream, as it is blended in, or homogenized, with the rest of the milk. Temperature homogenizations are different, involving changes to figures, along with (at least with the GHCN/GISS data) filling the gaps in some places and removing data in others (note 1).

But rather than note the differences, it is better to consult an authoritative source. From Dictionary.com, the definitions of homogenize are:-

verb (used with object), homogenized, homogenizing.

  1. to form by blending unlike elements; make homogeneous.
  2. to prepare an emulsion, as by reducing the size of the fat globules in (milk or cream) in order to distribute them equally throughout.
  3. to make uniform or similar, as in composition or function:

    to homogenize school systems.

  4. Metallurgy. to subject (metal) to high temperature to ensure uniform diffusion of components.

Applying the dictionary definitions, data homogenization in science is about blending various elements together to make them uniform; it is not about additions to or subtractions from the data set, or adjusting the data. This is particularly true in chemistry.

For UHCN and NASA GISS temperature data, homogenization involves removing or adjusting elements in the data that are markedly dissimilar from the rest. It can also mean infilling data that was never measured. The verb homogenize does not fit the processes at work here. This has led some, like Paul Homewood, to refer to the process as data tampering or worse. A better idea is to look further at the dictionary.

Again from Dictionary.com, the first two definitions of the adjective homogeneous are:-

  1. composed of parts or elements that are all of the same kind; not heterogeneous:

    a homogeneous population.

  2. of the same kind or nature; essentially alike.

I would suggest that temperature homogenization is a loose term for describing the process of making the data more homogeneous, that is, for smoothing out the data in some way. A false analogy is when I make a vegetable soup. After cooking I end up with a stock containing lumps of potato, carrot, leeks etc. I put it through the blender to get an even consistency, and I end up with the same weight of soup before and after. A similar process of getting the same out after homogenization as was put in before is clearly not what is happening to temperatures. The aim of making the data homogenous is both to remove anomalous data and to blend the data together.

Base Orcadas as a Proxy for early Twentieth Century Antarctic Temperature Trends

Temperature trends vary greatly across different parts of the globe, an aspect that is not recognized when homogenizing temperatures. At a top level NASA GISS usefully split their global temperature anomaly into eight bands of latitude. I have graphed the five year moving averages for each band, along with the Gistemp global anomaly in Figure 1.

Figure 1. Gistemp global temperature anomalies by band of latitude.

The biggest oddity is the 64S-90S band. This bottom slice of the globe roughly equates to Antarctica, which is south of 66°34′S. Not only was there massive cooling until 1930 – in contradiction to the global trend – but prior to 1970 there was very large volatility in temperatures, despite my using five year moving averages. Looking at the GHCN database of weather stations, there are none listed in Antarctica until Rothera Point started collecting data in 1946, as shown in Figure 2 (note 1).

Figure 2. A selection of temperature anomalies in Antarctica. The most numerous are either on the Antarctic Peninsula, or on the islands just to the North.

The only long record is at Base Orcadas located at (60.8 S 44.7 W). I have graphed the GISS homogenised temperature anomaly data for station 701889680000 with the Gistemp 64S-90S band in Figure 3.

Figure 3. Gistemp 64S-90S annual temperature anomaly compared to Base Orcadas GISS homogenised data.

There is a remarkable similarity in the data sets until 1950, after which they appear unrelated. This suggests that, in the absence of other data, Base Orcadas was the principal element in creating a proxy for the missing Antarctic data, despite it being located outside the area, and not being related to the actual data for well over half a century. The outcome is to bias the overall global temperature anomaly by suppressing the early twentieth century warming, making the late twentieth century warming appear relatively greater than the underlying reality (note 2). The error is due to assuming that temperature trends at different latitudes are the same, an assumption that the homogenised data shows to be false.
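
The visual impression in Figure 3 can be quantified with a simple two-number check: the correlation between the two series before and after 1950. The sketch below uses synthetic stand-ins (a "band" series that follows the "station" before 1950 and is unrelated afterwards) purely to show the calculation; running it on the actual Gistemp 64S-90S series and the GISS Base Orcadas record is the real test.

```python
import numpy as np

rng = np.random.default_rng(4)
years = np.arange(1904, 2015)

# Synthetic stand-ins, not the real series: before 1950 the "band" closely follows
# the "station"; after 1950 it is unrelated.
station = rng.normal(0.0, 0.5, years.size)
band = np.where(years < 1950,
                station + rng.normal(0.0, 0.1, years.size),
                rng.normal(0.0, 0.5, years.size))

def split_correlations(years, a, b, split_year=1950):
    """Pearson correlation between two annual series before and after split_year."""
    early, late = years < split_year, years >= split_year
    return (round(np.corrcoef(a[early], b[early])[0, 1], 2),
            round(np.corrcoef(a[late], b[late])[0, 1], 2))

print(split_correlations(years, band, station))   # high before 1950, much lower after
```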

Kevin Marshall

 

Notes

  1. Also in Antarctica (but not listed) there has been data collected at Amundsen-Scot base at the South Pole (90.0 S 0.0 E) since 1957, and at Vostok base (78.5 S 106.9 E) since 1958.
  2. Removing the Antarctic data would increase both the early twentieth century and post 1975 warming periods. But, given that 64S-90S is 5% of the global surface area, I estimate it would increase the earlier warming trends by 5-10% as against 1-3% for the later trend.


Temperature Homogenization at Puerto Casado

Summary

The temperature homogenizations for the Paraguay data within both the BEST and UHCN/Gistemp surface temperature data sets point to a potential flaw within the temperature homogenization process. It removes real, but localized, temperature variations, creating incorrect temperature trends. In the case of Paraguay from 1955 to 1980, a cooling trend is turned into a warming trend. Whether this biases the overall temperature anomalies, or our understanding of climate variation, remains to be explored.

 

A small place in mid-Paraguay, on the Brazil/Paraguay border, has become the centre of focus of the argument on temperature homogenizations.

For instance here is Dr Kevin Cowtan, of the Department of Chemistry at the University of York, explaining the BEST adjustments at Puerto Casado.

Cowtan explains at 6.40

In a previous video we looked at a station in Paraguay, Puerto Casado. Here is the Berkeley Earth data for that station. Again the difference between the station record and the regional average shows very clear jumps. In this case there are documented station moves corresponding to the two jumps. There may be another small change here that wasn’t picked up. The picture for this station is actually fairly clear.

The first of these "jumps" was a fall in the late 1960s of about 1°C. Figure 1 expands the section of the Berkeley Earth graph from the video, to emphasise this change.

Figure 1 – Berkeley Earth temperature anomaly graph for Puerto Casado, with an expanded section showing the fall in temperature set against the estimated mean station bias.

The station move is after the fall in temperature.

Shub Niggurath looked at the metadata on the actual station move, concluding

IT MOVED BECAUSE THERE IS CHANGE AND THERE IS A CHANGE BECAUSE IT MOVED

That is, the evidence for the station move was vague. The major evidence was the fall in temperatures. Alternative evidence is that there were a number of other stations in the area exhibiting similar patterns.

But maybe there was some unknown measurement bias (to use Steven Mosher's term) that would make this data stand out from the rest? I have previously looked at eight temperature stations in Paraguay with respect to the NASA Gistemp and UHCN adjustments. The BEST adjustments for those stations, along with another from Paul Homewood's original post, are summarized in Figure 2 for the late 1960s and early 1970s. All eight have similar downward adjustments that I estimate as being between 0.8 and 1.2°C. The first six have a single adjustment. Asuncion Airport and San Juan Bautista have multiple adjustments in the period. Pedro Juan CA was of very poor data quality due to many gaps (see the GHCNv2 graph of the raw data), hence the reason for its exclusion.

| GHCN Name | GHCN Location | BEST Ref | Break Type | Break Year |
|---|---|---|---|---|
| Concepcion | 23.4 S, 57.3 W | 157453 | Empirical | 1969 |
| Encarcion | 27.3 S, 55.8 W | 157439 | Empirical | 1968 |
| Mariscal | 22.0 S, 60.6 W | 157456 | Empirical | 1970 |
| Pilar | 26.9 S, 58.3 W | 157441 | Empirical | 1967 |
| Puerto Casado | 22.3 S, 57.9 W | 157455 | Station Move | 1971 |
| San Juan Baut | 26.7 S, 57.1 W | 157442 | Empirical | 1970 |
| Asuncion Aero | 25.3 S, 57.6 W | 157448 | Empirical | 1969 |
|  |  |  | Station Move | 1972 |
|  |  |  | Station Move | 1973 |
| San Juan Bautista | 25.8 S, 56.3 W | 157444 | Empirical | 1965 |
|  |  |  | Empirical | 1967 |
|  |  |  | Station Move | 1971 |
| Pedro Juan CA | 22.6 S, 55.6 W | 19469 | Empirical | 1968 |
|  |  |  | Empirical | 3 in 1970s |

Figure 2 – Temperature stations used in previous post on Paraguayan Temperature Homogenisations

 

Why would both BEST and GHCN remove a consistent pattern covering an area of around 200,000 km2? The first reason, as Roger Andrews has found, is that the temperature fall was confined to Paraguay. The second reason is suggested by the GHCNv2 raw data1 shown in Figure 3.

Figure 3 – GHCNv2 “raw data” mean annual temperature anomalies for eight Paraguayan temperature stations, with the mean of 1970-1979 set to zero.

There was an average temperature fall across these eight temperature stations of about half a degree from 1967 to 1970, and over one degree by the mid-1970s. But the fall did not happen at the same time at each station. The consistency only shows in the periods before and after, when the data sets do not diverge. A homogenisation program would therefore see that, for any given year or month, each data set’s readings were out of line with all the other data sets. Maybe it was simply data noise, or maybe there was some unknown change, but the fall is clearly present in the data. Temperature homogenisation should smooth this out; instead it cools the past. Figure 4 shows the average change resulting from the GHCN and NASA GISS homogenisations.
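To illustrate the point, here is a toy sketch of my own (it is not the actual GHCN pairwise algorithm or the NASA GISS procedure): eight synthetic station records share a genuine ~1°C regional fall, but it arrives in a different year at each station, so a simple neighbour-comparison test flags most of the transition years as suspect even though the change is real.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1955, 1981)
n_stations = 8

# Synthetic anomalies: a real ~1C regional cooling that arrives in a
# different year (1967-1974) at each of eight stations, plus noise.
onsets = rng.integers(1967, 1975, n_stations)
anoms = np.array([
    -1.0 * (years >= onsets[i]) + rng.normal(0, 0.15, years.size)
    for i in range(n_stations)
])

# Flag any station-year more than 0.5C away from the mean of the other
# stations -- a caricature of neighbour-based break detection.
flags = np.zeros_like(anoms, dtype=bool)
for i in range(n_stations):
    neighbours = np.delete(anoms, i, axis=0).mean(axis=0)
    flags[i] = np.abs(anoms[i] - neighbours) > 0.5

transition = (years >= 1967) & (years <= 1974)
print("Flagged station-years during the staggered fall:", flags[:, transition].sum())
print("Flagged station-years outside it:               ", flags[:, ~transition].sum())
```

With these settings the flagged station-years should fall almost entirely within the staggered-fall period, which is exactly where an automated adjustment would intervene on a change that is genuinely in the data.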

Figure 4 – GHCNv2 “raw data” and NASA GISS Homogenised average temperature anomalies, with the net adjustment.

A cooling trend for the period 1955-1980 has been turned into a warming trend due to a flaw in the homogenization procedures.

The Paraguayan data on its own has little impact on the global land surface temperature, as it covers a tiny area. Further, it might be an isolated incident, or be offset by instances of understating the warming trend. But what if there are smaller microclimates that are only picked up by one or two temperature stations? Consider Figure 5, which looks at the BEST adjustments for Encarnacion, one of the eight Paraguayan stations.

Figure 5 – BEST adjustment for Encarnacion.

There is the empirical break in 1968 from the table above, but also empirical breaks in 1981 and 1991 that look to be almost exactly opposite. What Berkeley Earth calls the “estimated station mean bias” is the result of actual deviations in the real data. Homogenisation eliminates much of the richness and diversity in real-world data. The question is whether this happens consistently. First we need to understand the term “temperature homogenization“.

Kevin Marshall

Notes

  1. The GHCNv2 “raw” data is more accurately described as pre-homogenized data; that is, the raw data with some adjustments already applied.

Understanding GISS Temperature Adjustments

A couple of weeks ago something struck me as odd. Paul Homewood had been going on about all sorts of systematic temperature adjustments, showing clearly that the past has been cooled between the GHCN “raw data” and the GISS Homogenised data used in the data sets. When I looked at eight stations in Paraguay, at Reykjavik and at two stations on Spitzbergen, I was able to corroborate this result. Yet Euan Mearns has looked at groups of stations in central Australia and in Iceland, in both cases finding no warming trend introduced between the raw and adjusted temperature data. I thought that Mearns must be wrong, so when he published on 26 stations in Southern Africa1, I set out to evaluate those results and find the flaw. I have been unable to fully reconcile the differences, but the notes I have made on the Southern African stations may enable a greater understanding of temperature adjustments. What I do find is that clear trends in the data across a wide area have been largely removed, bringing the data into line with Southern Hemisphere trends. The most important point to remember is that looking at data in different ways can lead to different conclusions.

Net difference and temperature adjustments

I downloaded three lots of data – raw, GHCNv3 and GISS Homogenised (GISS H) – then replicated Mearns’ method of calculating temperature anomalies. Using 5-year moving averages, in Chart 1 I have mapped the trends in the three data sets.
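As a guide to the kind of calculation involved, here is a minimal sketch on my own assumptions: per-station anomalies against a common base period, a simple average across stations, and a 5-year centred moving average. It is not necessarily Mearns’ exact method, and the base period and file layout are illustrative only.

```python
import pandas as pd

def station_anomalies(annual_means: pd.DataFrame,
                      base: tuple[int, int] = (1961, 1990)) -> pd.Series:
    """annual_means: DataFrame indexed by year, one column per station,
    in degrees C. Returns the multi-station mean anomaly, smoothed with
    a 5-year centred moving average."""
    base_means = annual_means.loc[base[0]:base[1]].mean()
    anomalies = annual_means - base_means           # per-station anomalies
    regional = anomalies.mean(axis=1, skipna=True)  # simple station average
    return regional.rolling(window=5, center=True).mean()

# Hypothetical usage, assuming a year-indexed CSV of annual means:
# df = pd.read_csv("southern_africa_raw.csv", index_col="year")
# smoothed = station_anomalies(df)
```

The same function can be run on the raw, GHCNv3 and GISS H versions of the data to produce the three curves compared in Chart 1.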

There is a large divergence prior to 1900, but for the twentieth century the warming trend is not excessively increased. Further, the warming trend from around 1900 is about half of that in the GISTEMP Southern Hemisphere or global anomalies. Looked at in this way, Mearns would appear to have a point. But there has been considerable downward adjustment of the early twentieth century warming, so Homewood’s claim of cooling the past is also substantiated. This might be the more important aspect, as the adjusted data makes the warming since the mid-1970s appear unusual.

Another feature is that the GHCNv3 data is very close to the GISS Homogenised data. So looking at the GISS H data used in the creation of the temperature data sets is very much the same as looking at the GHCNv3 data that forms the source data for GISS.

But why not mention the pre-1900 data where the divergence is huge?

The number of stations gives a clue in Chart 2.

It was only in the late 1890s that there were more than five stations of raw data. The first year in which more data points were left in than removed is 1909 (5 against 4).

Removed data would appear to have a role in the homogenisation process. But is it material? Chart 3 graphs five year moving averages of raw data anomalies, split between the raw data removed and retained in GISS H, along with the average for the 26 stations.

Where there are a large number of data points, the deletions do not materially affect the larger picture, but do remove some of the extreme “anomalies” from the data set. But where there is very little data available, the impact is much larger. That is particularly the case prior to 1910. After 1910, any data deletions pale into insignificance next to the adjustments.

The Adjustments

In Chart 4 I have plotted the average adjustment (the difference between the raw data and the GISS Homogenised data), along with the maximum and minimum values.

The max and min of the net adjustments are consistent with Euan Mearns’ graph “safrica_deltaT” when flipped upside down and reversed. It shows a difficulty of comparing adjustments where an entire record has been shifted. For instance, the maximum figures are dominated by Windhoek, which I looked at a couple of weeks ago: between the raw data and the GISS Homogenised data there was a 3.6°C uniform increase. There were a number of other, lesser, differences that I have listed in note 3. Chart 5 shows the impact that adjusting the adjustments (netting off these uniform shifts, as sketched below) has on both the range of the adjustments and the pattern of the average adjustments.
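For clarity, here is a minimal sketch of what I mean by adjusting the adjustments: netting off each station’s mean adjustment over a recent reference period so that whole-record shifts, such as Windhoek’s 3.6°C, do not dominate the comparison. The code and the choice of reference period are my own illustration, not a description of how note 3 was actually derived.

```python
import pandas as pd

def net_adjustments(raw: pd.DataFrame, giss_h: pd.DataFrame,
                    recent: tuple[int, int] = (2000, 2011)) -> pd.DataFrame:
    """Per-station adjustment (GISS Homogenised minus raw), with each
    station's mean adjustment over a recent reference period netted off,
    so only the time-varying part of the adjustment remains."""
    adj = giss_h - raw                              # adjustment by station and year
    offsets = adj.loc[recent[0]:recent[1]].mean()   # per-station uniform shift
    return adj - offsets

# Hypothetical usage, with year-indexed DataFrames of annual means:
# netted = net_adjustments(raw_df, giss_h_df)
# netted.mean(axis=1)  # average adjustment pattern, as in Chart 5
```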

Comparing this with the average variance between the raw data and the GISS Homogenised data shows the closer fit of the adjustments to the variance. Please note the difference in scale of Chart 6 from the charts above!

The earlier period has by far the most deletions of data, hence the lack of closeness of fit between the average adjustment and the average variance. After 1945, the consistent pattern of the average adjustment being slightly higher than the average variance is probably due more to a light-touch approach in correcting the adjustments than to other data deletions. There might be other reasons as well for the lack of fit, such as the impact of differing lengths of the data sets on the anomaly calculations.

Update 15/03/15

Of note is that the adjustments in the early 1890s and around 1930 are about three times the size of the change in trend. This might be partly due to zero net adjustments in 1903 and partly due to the small downward adjustments after 2000.

The consequences of the adjustments

It should be remembered that GISS use this data to create the GISTEMP surface temperature anomalies. In Chart 7 I have amended Chart 1 to include Southern Hemisphere annual mean data on the same basis as the raw data and GISS H.

It seems fairly clear that the homogenisation process has succeeded in bringing the Southern Africa data sets into line with the wider data sets. Whether the early twentieth century warming and mid-century cooling are outliers that have been correctly cleansed is a subject for further study.

What has struck me in doing this analysis is that looking at individual surface temperature stations becomes almost meaningless, as after homogenisation they are effectively grid reference points rather than local records. Thus comparing the station moves for Reykjavik with the adjustments will not achieve anything. The implications of this insight will have to wait for another day.

Kevin Marshall

Notes

1. 26 Data sets

The temperature stations, with the periods for the raw data are below.

| Location | Lat | Lon | ID | Pop. | Years |
|---|---|---|---|---|---|
| Harare | 17.9 S | 31.1 E | 156677750005 | 601,000 | 1897 – 2011 |
| Kimberley | 28.8 S | 24.8 E | 141684380004 | 105,000 | 1897 – 2011 |
| Gwelo | 19.4 S | 29.8 E | 156678670010 | 68,000 | 1898 – 1970 |
| Bulawayo | 20.1 S | 28.6 E | 156679640005 | 359,000 | 1897 – 2011 |
| Beira | 19.8 S | 34.9 E | 131672970000 | 46,000 | 1913 – 1991 |
| Kabwe | 14.4 S | 28.5 E | 155676630004 | 144,000 | 1925 – 2011 |
| Livingstone | 17.8 S | 25.8 E | 155677430003 | 72,000 | 1918 – 2010 |
| Mongu | 15.2 S | 23.1 E | 155676330003 | < 10,000 | 1923 – 2010 |
| Mwinilunga | 11.8 S | 24.4 E | 155674410000 | < 10,000 | 1923 – 1970 |
| Ndola | 13.0 S | 28.6 E | 155675610000 | 282,000 | 1923 – 1981 |
| Capetown Safr | 33.9 S | 18.5 E | 141688160000 | 834,000 | 1880 – 2011 |
| Calvinia | 31.5 S | 19.8 E | 141686180000 | < 10,000 | 1941 – 2011 |
| East London | 33.0 S | 27.8 E | 141688580005 | 127,000 | 1940 – 2011 |
| Windhoek | 22.6 S | 17.1 E | 132681100000 | 61,000 | 1921 – 1991 |
| Keetmanshoop | 26.5 S | 18.1 E | 132683120000 | 10,000 | 1931 – 2010 |
| Bloemfontein | 29.1 S | 26.3 E | 141684420002 | 182,000 | 1943 – 2011 |
| De Aar | 30.6 S | 24.0 E | 141685380000 | 18,000 | 1940 – 2011 |
| Queenstown | 31.9 S | 26.9 E | 141686480000 | 39,000 | 1940 – 1991 |
| Bethal | 26.4 S | 29.5 E | 141683700000 | 30,000 | 1940 – 1991 |
| Antananarivo | 18.8 S | 47.5 E | 125670830002 | 452,000 | 1889 – 2011 |
| Tamatave | 18.1 S | 49.4 E | 125670950003 | 77,000 | 1951 – 2011 |
| Porto Amelia | 13.0 S | 40.5 E | 131672150000 | < 10,000 | 1947 – 1991 |
| Potchefstroom | 26.7 S | 27.1 E | 141683500000 | 57,000 | 1940 – 1991 |
| Zanzibar | 6.2 S | 39.2 E | 149638700000 | 111,000 | 1880 – 1960 |
| Tabora | 5.1 S | 32.8 E | 149638320000 | 67,000 | 1893 – 2011 |
| Dar Es Salaam | 6.9 S | 39.2 E | 149638940003 | 757,000 | 1895 – 2011 |

2. Temperature trends

To calculate the trends I used the OLS method, both from the formula and using the Excel LINEST function, getting the same answer each time. If you are able, please check my calculations. The GISTEMP Southern Hemisphere and global data can be accessed directly from the NASA GISS website. The GISTEMP trends are from the Skeptical Science trend calculator. My figures are:-
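As a check on the method rather than on the figures themselves, here is a minimal sketch of the OLS trend calculation in °C per decade. It is my own code, and should give the same slope as Excel’s LINEST, multiplied by ten.

```python
import numpy as np

def ols_trend_per_decade(years, anomalies) -> float:
    """OLS slope of annual anomalies (degrees C) against year,
    scaled to degrees C per decade."""
    slope, _intercept = np.polyfit(np.asarray(years, float),
                                   np.asarray(anomalies, float), 1)
    return slope * 10

# Hypothetical check: a series warming at exactly 0.01C per year
# should return a trend of 0.1C per decade.
yrs = np.arange(1950, 2000)
print(ols_trend_per_decade(yrs, 0.01 * (yrs - 1950)))
```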

3. Adjustments to the Adjustments

| Location | Recent adjustment (°C) | Other adjustment (°C) | Other period |
|---|---|---|---|
| Antananarivo | 0.50 |  |  |
| Beira |  | 0.10 | Mid-70s + inter-war |
| Bloemfontein | 0.70 |  |  |
| Dar Es Salaam | 0.10 |  |  |
| Harare |  | 1.10 | About 1999-2002 |
| Keetmanshoop | 1.57 |  |  |
| Potchefstroom | -0.10 |  |  |
| Tamatave | 0.39 |  |  |
| Windhoek | 3.60 |  |  |
| Zanzibar | -0.80 |  |  |