January–March 2013

Quarterly journal of the Hungarian Meteorological Service

Special Issue of the COST-ES0601 (HOME) Action:
Advances in homogenization methods of climate series: an integrated approach

Guest editors: Mónika Lakatos and Tamás Szentimrey

On the multiple breakpoint problem and the number of significant breaks in homogenization of climate records
Ralf Lindau and Victor Venema

Changes in instrumentation and relocations of climate stations may insert inhomogeneities into meteorological time series, dividing them into homogeneous subperiods separated by sudden breaks. Such inhomogeneities can be distinguished from true variability by considering the differences relative to neighboring stations. The most probable positions for a given number of breakpoints are optimally determined by a multiple-breakpoint approach. In this study, the maximum external variance between the segment averages is used as the decision criterion and dynamic programming as the optimization method. Even in time series without breaks, the external variance grows with each additionally assumed break, so that a stop criterion is needed. This is studied using the characteristics of a random time series. The external variance is shown to be beta-distributed, so that the maximum is found by solving the incomplete beta function. In this way, an analytical function for the maximum external variance is derived. In its differential form, our solution shows many formal similarities to the penalty function used in Caussinus and Mestre (2004), but differs numerically and exhibits more detail.
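The optimal-segmentation step described in the abstract can be illustrated with a short sketch: for a fixed number of breaks, dynamic programming minimizes the internal (within-segment) variance, which is equivalent to maximizing the external variance between segment averages. The code below is our illustrative reconstruction of this standard technique, not the authors' implementation; all function names are ours.

```python
import numpy as np

def external_variance(x, breaks):
    """External variance: length-weighted variance of the segment means
    about the overall mean (total variance = external + internal)."""
    n = len(x)
    bounds = [0] + list(breaks) + [n]
    mean = x.mean()
    ext = sum((x[a:b].mean() - mean) ** 2 * (b - a)
              for a, b in zip(bounds, bounds[1:]))
    return ext / n

def best_breaks(x, k):
    """Dynamic programming: positions of k breaks minimizing the internal
    (within-segment) variance, i.e., maximizing the external variance."""
    n = len(x)
    csum = np.concatenate(([0.0], np.cumsum(x)))
    csum2 = np.concatenate(([0.0], np.cumsum(x ** 2)))

    def sse(i, j):
        # sum of squared deviations of x[i:j] from its own mean
        s = csum[j] - csum[i]
        return (csum2[j] - csum2[i]) - s * s / (j - i)

    # D[m, j]: minimal cost of splitting x[:j] into m segments
    D = np.full((k + 2, n + 1), np.inf)
    back = np.zeros((k + 2, n + 1), dtype=int)
    for j in range(1, n + 1):
        D[1, j] = sse(0, j)
    for m in range(2, k + 2):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = D[m - 1, i] + sse(i, j)
                if c < D[m, j]:
                    D[m, j], back[m, j] = c, i
    # backtrack the break positions
    breaks, j = [], n
    for m in range(k + 1, 1, -1):
        j = back[m, j]
        breaks.append(int(j))
    return sorted(breaks)
```

With a noise-free series containing two level shifts, the procedure recovers the break positions exactly, and a perfect segmentation makes the external variance equal to the total variance.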

Climatological series shift test comparison on running windows
José A. Guijarro

The detection and correction of inhomogeneities in climate series is of paramount importance for avoiding misleading conclusions in the study of climate variations. One simple way to address the problem of multiple shifts in the same series is to apply the tests on windows running along the series of anomalies. But it is not clear which of the available tests works best. 500 Monte Carlo simulations have been performed for the ideal case of 600 normally distributed terms (a 50-year series of monthly differences), with a single shift in the middle and magnitudes of 0 to 2 standard deviations (s) in steps of 0.2 s. The compared tests were: 1) the classical t-test; 2) the standard normal homogeneity test (SNHT); 3) two-phase regression; 4) the Wilcoxon-Mann-Whitney test; 5) the Durbin-Watson test (lag-1 serial correlation); and 6) the squared relative mean difference (SRMD; simpler than the t-test and hence faster to compute). The criterion for qualifying the performance of each test was the ability to detect shifts without false alarms and to locate them at the correct point. Results indicate that, under these precise simulated conditions, the best tests are the classical t-test, Alexandersson's SNHT, and SRMD, with almost identical results, followed by the Wilcoxon-Mann-Whitney test, while the two-phase regression and Durbin-Watson performances are very poor.
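The logic of the simulation experiment (a synthetic normally distributed series, one shift in the middle, detection by the test statistic's maximum) can be sketched as follows. This is only an illustration of the setup using a Welch-type t statistic, not the author's code; the function names and the ±5-term tolerance are ours.

```python
import numpy as np

rng = np.random.default_rng(42)

def t_stat(x, i):
    """Welch-type t statistic for a mean shift between x[:i] and x[i:]."""
    a, b = x[:i], x[i:]
    return (a.mean() - b.mean()) / np.sqrt(
        a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))

def detect_shift(x, edge=5):
    """Candidate break with the largest |t| (series edges excluded)."""
    n = len(x)
    stats = [abs(t_stat(x, i)) for i in range(edge, n - edge)]
    k = int(np.argmax(stats))
    return edge + k, stats[k]

# Monte Carlo: 600-term N(0,1) series with a 1-sigma shift at the midpoint
trials, hits = 200, 0
for _ in range(trials):
    x = rng.standard_normal(600)
    x[300:] += 1.0
    pos, _ = detect_shift(x)
    hits += abs(pos - 300) <= 5
print(f"shift located within +/-5 terms in {hits}/{trials} trials")
```

In the full experiment this detection step would be repeated for each test statistic and each shift magnitude, counting correct locations and false alarms.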

HOMER: a homogenization software – methods and applications
Olivier Mestre, Peter Domonkos, Franck Picard, Ingeborg Auer, Stéphane Robin, Emilie Lebarbier, Reinhard Böhm, Enric Aguilar, José Guijarro, Gregor Vertačnik, Matija Klančar, Brigitte Dubuisson, and Petr Štěpánek

Between 2007 and 2011, the European COST Action ES0601 (the HOME project) was devoted to evaluating the performance of the homogenization methods used in climatology and to producing a software package synthesizing the best aspects of some of the most efficient methods. HOMER (HOMogenization softwarE in R) is a software package for homogenizing essential climate variables at monthly and annual time scales. HOMER has been constructed by exploiting the best characteristics of other state-of-the-art homogenization methods, i.e., PRODIGE, ACMANT, CLIMATOL, and the recently developed joint-segmentation method (cghseg). HOMER is based on the methodology of optimal segmentation with dynamic programming, the application of a network-wide two-factor model for both detection and correction, and some new techniques for coordinating detection processes from multiannual to monthly scales. HOMER also includes a tool to assess trend biases in urban temperature series (UBRIS). HOMER's approach to the final homogenization results is iterative, and the method is interactive, taking advantage of metadata. A practical application of HOMER is presented on temperature series from Vienna, Austria and its surroundings.
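The network-wide two-factor model mentioned in the abstract can be sketched, for a known segmentation, as a least-squares fit of a shared time (climate) effect plus per-station segment levels; within-station differences of segment levels then give the correction amplitudes. The toy implementation below is our own illustration of this ANOVA-type idea, not HOMER's code; all names are ours.

```python
import numpy as np

def two_factor_fit(X, seg_ids):
    """Least-squares fit of x[s, t] = c[t] + mu[s, seg_ids[s, t]]:
    a shared time factor c plus per-station segment levels mu.
    X: (stations, time) array; seg_ids: integer segment label per (s, t).
    Returns (c, mu) where mu maps (station, segment) -> fitted level."""
    S, T = X.shape
    # enumerate all (station, segment) level parameters
    levels = sorted({(s, seg_ids[s, t]) for s in range(S) for t in range(T)})
    idx = {key: j for j, key in enumerate(levels)}
    A = np.zeros((S * T, T + len(levels)))
    y = X.ravel()
    for s in range(S):
        for t in range(T):
            r = s * T + t
            A[r, t] = 1.0                              # climate effect c[t]
            A[r, T + idx[(s, seg_ids[s, t])]] = 1.0    # segment level
    # rank-deficient by one global constant; lstsq returns the
    # minimum-norm solution, and level *differences* stay identifiable
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    c, mu = beta[:T], beta[T:]
    return c, {key: mu[j] for key, j in idx.items()}
```

For a station with a break, the fitted difference between its two segment levels recovers the size of the inhomogeneity, since the shared climate signal is absorbed by the time factor.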

Measuring performances of homogenization methods
Peter Domonkos

Climatologists apply various homogenization methods to eliminate non-climatic biases (so-called inhomogeneities) from observed climatic time series. The performance of these methods varies; therefore, it must be examined. This study reviews the methodology of measuring the efficiency of homogenization methods. The principles of reliable efficiency evaluations are: (i) efficiency tests need simulated test datasets with properties similar to those of real observational datasets; (ii) the root mean squared error (RMSE) and the accuracy of trend estimation must be preferred over the skill in detecting change-points; (iii) the evaluation of the detection of inhomogeneities must be clearly distinguished from the evaluation of whole homogenization procedures; (iv) the evaluation of homogenization methods that include subjective steps needs blind tests. The study discusses many other details of efficiency evaluation, recalls the results of the blind test experiment of the COST Action ES0601 (HOME), summarizes our present knowledge about the efficiency of homogenization methods, and describes the main tasks ahead for the climatological community in examining the efficiency of homogenization methods.
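Principle (ii) can be illustrated with two simple metrics computed against the known clean (simulated) series: the RMSE of the homogenized values and the error of the estimated linear trend. The helper names below are ours, for illustration only.

```python
import numpy as np

def rmse(homogenized, truth):
    """Root mean squared error between the homogenized and the true series."""
    d = np.asarray(homogenized) - np.asarray(truth)
    return np.sqrt(np.mean(d ** 2))

def trend_error(homogenized, truth):
    """Difference of least-squares linear trend slopes (units per time step)."""
    t = np.arange(len(truth))
    slope = lambda y: np.polyfit(t, np.asarray(y), 1)[0]
    return slope(homogenized) - slope(truth)
```

A series with an uncorrected break shows both a nonzero RMSE and a biased trend; a perfectly homogenized series scores zero on both, which is the sense in which these measures summarize whole-procedure performance better than change-point hit rates.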

Theoretical questions of daily data homogenization
Tamás Szentimrey

The so-called variable correction methods form a special type of method developed for daily data homogenization. Their common assumption is that, in the case of daily data series, the corrections for inhomogeneity have to vary according to the meteorological situation of each day in order to represent the extremes. In this paper we express our objections to these variable correction methods, especially to their underlying principles. Since the exact theoretical mathematical formulation of the problem of daily data homogenization is generally neglected, we also try to formulate and analyze it in accordance with mathematical conventions.

Experiences with data quality control and homogenization of daily records of various meteorological elements in the Czech Republic in the period 1961–2010
Petr Štěpánek, Pavel Zahradníček and Aleš Farda

Quality control and homogenization have to be undertaken prior to any data analysis in order to eliminate erroneous values and non-climatic biases in time series. In recent years, considerable attention has been paid to daily data, since they can serve, among other conventional climatological analyses, as unbiased input for extreme value analysis, correction of RCM outputs, etc. In this work, we describe and then apply our own approach to data quality control of station measurements, combining several methods: (i) analyzing difference series between candidate and neighboring stations; (ii) applying limits derived from interquartile ranges; and (iii) comparing the tested series values with "expected" values – technical series created by means of statistical methods for spatial data (e.g., IDW, kriging). Because of the presence of noise in the series, statistical homogeneity tests render results with some degree of uncertainty. In this work, the use of various statistical tests and reference series made it possible to increase considerably the number of homogeneity test results for each series and, thus, to assess homogeneity more reliably. Inhomogeneities were corrected on a daily scale. Finally, missing values were filled by applying geostatistical methods; thus, so-called technical series were constructed for the stations, which can be used as quality input for further time series analysis. These methodological approaches are applied to daily data for various meteorological elements within the area of the Czech Republic in the period 1961–2010, which demonstrates their usefulness. The series were processed by means of the ProClimDB and AnClim software packages developed by the authors (http://www.climahom.eu).
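Two of the quality-control ingredients listed above, interquartile-range limits (method (ii)) and an IDW-based "expected" value from neighboring stations (method (iii)), can be sketched in a few lines. These are generic illustrations under our own naming, not the ProClimDB implementation.

```python
import numpy as np

def iqr_flags(x, k=3.0):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] as suspect (cf. method (ii))."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)

def idw_expected(neighbor_values, distances, p=2.0):
    """Inverse-distance-weighted 'expected' value from neighboring
    stations (cf. method (iii)); p is the distance exponent."""
    w = 1.0 / np.asarray(distances, dtype=float) ** p
    return float(np.sum(w * np.asarray(neighbor_values)) / np.sum(w))
```

A candidate observation would then be compared against both the IQR limits of its own distribution and the IDW expectation from its neighbors, with large discrepancies marked for inspection.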

Creation of a homogenized climate database for the Carpathian region by applying the MASH procedure and the preliminary analysis of the data
Mónika Lakatos, Tamás Szentimrey, Zita Bihari, and Sándor Szalai

Homogenization of long-term observation series is essential in climate change studies. Among the most important achievements of the COST Action ES0601 (HOME) are the survey and comparison of the available homogenization methods; a benchmark test was performed within the Action to identify the best recent methods. The MASH (Multiple Analysis of Series for Homogenization; Szentimrey) procedure, developed at the Hungarian Meteorological Service (OMSZ), produced good results. The Short Term Scientific Missions (STSMs) supported by the COST Action established the wide use of MASH in the neighboring countries, which is the main reason why MASH became the common homogenization method used to carry out the Climate of the Carpathian Region tender. The aim of the project is to improve the climate data source and data access in the Carpathian Region by creating a harmonized daily gridded dataset for the period 1961–2010. The homogenization process, its verification, and the quality control results are presented in this paper, along with preliminary results of a trend analysis carried out on the harmonized database.

Homogeneity of monthly air temperature in Portugal with HOMER and MASH
Luís Freitas, Mário Gonzalez Pereira, Liliana Caramelo, Manuel Mendes, and Luís Filipe Nunes

In this paper we focus on the homogeneity of Portuguese monthly mean air temperature with two purposes: i) to detect and correct eventual inhomogeneities in the dataset; and ii) to compare the time series homogenized with different methods. The dataset used in this study comprises time series of minimum (TN) and maximum (TX) monthly mean air temperature recorded at weather stations located in the northern region of continental Portugal, from 1941 to 2010. MASH and HOMER were the methods used in this study to homogenize the Portuguese air temperature database. The former was selected for being one of the most widely used by the homogenization community, while the latter was selected because it is one of the most recent homogenization methods and, along with MASH, exhibited the best results in the comparative analysis performed within the COST Action ES0601 (HOME). A high number of break points were identified in both minimum and maximum air temperature time series, but differences in the number, size, and temporal location of the breaks detected by the two methods must be underlined. The homogenization process was assessed by comparing results obtained from correlation, trend, and principal component analysis of the non-homogenized (NH) dataset and of the datasets homogenized with each method. Correlation analysis reveals a larger increase in similarity for homogenized TX than for TN relative to the NH time series. The decrease in the magnitude of the trends and in the number of statistically significant trends is larger for homogenized TX than for TN, independently of the homogenization method. On the other hand, the number of statistically significant principal components tends to decrease with the application of homogenization procedures, while the variance explained by the first principal components of the homogenized datasets tends to be higher than for the non-homogenized datasets.
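The principal-component effect in the last sentence, that a more coherent (better homogenized) network concentrates variance in the leading components, can be illustrated with a small sketch. The function below is our own generic illustration, not the authors' analysis code.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component of a
    (time x stations) data matrix, via SVD of the column-centered data."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    return s ** 2 / np.sum(s ** 2)
```

A network of stations sharing one common signal puts essentially all variance in the first component, while station-specific signals (such as uncorrected inhomogeneities) spread the variance across more components, lowering the leading ratio.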