
Current Monthly Homogenization Approaches: Benchmarking their Strengths and Weaknesses

Victor Venema

Content

Global Historical Climatology Network (NOAA-GHCNv3)

– Trend: 0.8°C per century since 1880

– Raw data: 0.6°C per century

Need independent lines of research

1. Statistical homogenization

2. Physical understanding (parallel measurements)

3. Modelling (UHI, radiation screens)

Homogenisation: WHY?

Example of PAU-UZEIN temperature

1912 PAU-LESCAR (EN) 2005 PAU-UZEIN (AERO)

Slide: Olivier Mestre

Pairwise homogenization

http://variable-variability.blogspot.de/2012/08/statistical-homogenisation-for-dummies.html
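The pairwise idea can be illustrated with a minimal sketch (not the algorithm of any particular package). The difference between two nearby stations removes the shared regional climate signal, so a break in one station shows up as a step in the difference series. The function names, the synthetic data and the simple maximum t-statistic detector are assumptions chosen for illustration.

```python
import numpy as np

def difference_series(candidate, neighbour):
    """Difference of two station series; the shared regional signal cancels."""
    return np.asarray(candidate, dtype=float) - np.asarray(neighbour, dtype=float)

def most_likely_break(diff, min_seg=5):
    """Return (index, t-like statistic) of the single most likely break.

    Illustrative detector: for every split position, compare the means of
    the two segments, scaled by the pooled standard deviation.
    """
    n = len(diff)
    best_pos, best_stat = None, 0.0
    for pos in range(min_seg, n - min_seg + 1):
        left, right = diff[:pos], diff[pos:]
        pooled = np.sqrt((left.var(ddof=1) * (len(left) - 1)
                          + right.var(ddof=1) * (len(right) - 1)) / (n - 2))
        scale = pooled * np.sqrt(1 / len(left) + 1 / len(right))
        stat = abs(left.mean() - right.mean()) / scale
        if stat > best_stat:
            best_pos, best_stat = pos, stat
    return best_pos, best_stat

rng = np.random.default_rng(0)
years = 100
regional = rng.normal(0.0, 1.0, years)           # shared climate signal
station_a = regional + rng.normal(0.0, 0.2, years)
station_b = regional + rng.normal(0.0, 0.2, years)
station_a[60:] += 1.0                            # artificial 1 °C break at index 60

pos, stat = most_likely_break(difference_series(station_a, station_b))
print(pos, round(stat, 1))
```

The detector only says that the pair contains a break near `pos`; deciding which of the two stations is responsible needs further pairs.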

HOME validation study

Compare full homogenisation algorithms

Benchmark dataset

– Monthly temperature and precipitation networks

– Most realistic to date

Configuration

– Typical for Europe

– Number of stations: 5, 9, 15

Scatterplots monthly CRMSE

[Figure: six scatterplots of the CRMSE of the homogenised data [°C] (y-axis) against the CRMSE of the inhomogeneous data [°C] (x-axis), one panel each for ACMANT, PRODIGE monthly, USHCN main, MASH main, C3SNHT and PMFred abs.]
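The CRMSE (centred root mean square error) used in these plots can be sketched as follows. This is a minimal illustration assuming the usual definition, in which the mean error is removed before the RMSE is computed, so that a constant offset between the homogenised and the true series is not penalised. The function name and the toy numbers are assumptions.

```python
import numpy as np

def crmse(estimate, truth):
    """Centred RMSE: RMSE of the error after removing its mean."""
    error = np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((error - error.mean()) ** 2)))

truth = np.array([10.0, 11.0, 12.0, 11.0, 10.0, 9.0])
inhomogeneous = truth + np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])  # 1 °C break halfway
homogenised = truth + 0.3                                          # break removed, offset left

print(crmse(inhomogeneous, truth))   # 0.5
print(crmse(homogenised, truth))     # numerically close to 0
```

A point below the diagonal in such a scatterplot then means that homogenisation reduced the CRMSE of that station.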

Errors in trends

[Figure: trend difference per contribution. Temperature, trend difference [°C/100a]: ACMANT, SNHT DWD, C3SNHT, PMFred abs, PMTred rel, AnClim main, USHCN cx8, USHCN 52x, USHCN main, PRODIGE trendy, PRODIGE monthly, PRODIGE main, MASH main, Inhom. data. Precipitation, trend difference [mm/100a]: Climatol, C3SNHT, PMFred abs, PMTred rel, AnClim main, PRODIGE trendy, PRODIGE monthly, PRODIGE main, MASH main, Inhom. data.]
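A trend difference of this kind can be computed as the linear trend of a series minus the linear trend of the true series. The sketch below assumes an ordinary least-squares trend and uses made-up numbers; the function names and the break size are illustrative only.

```python
import numpy as np

def trend_per_century(series, years):
    """Least-squares linear trend, in units per 100 years."""
    slope = np.polyfit(years, series, 1)[0]
    return 100.0 * slope

def trend_error(estimate, truth, years):
    """Trend of the estimate minus trend of the truth [units/100a]."""
    return trend_per_century(estimate, years) - trend_per_century(truth, years)

years = np.arange(1900, 2000)
truth = 0.008 * (years - 1900)       # 0.8 °C per century
inhomogeneous = truth.copy()
inhomogeneous[50:] -= 0.3            # hypothetical break of -0.3 °C in 1950

print(round(trend_error(inhomogeneous, truth, years), 2))
```

A single break in the middle of a series affects the trend most strongly, which is why even modest breaks matter for trend estimates.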

Lessons

Modern methods a factor 2 more accurate

– Multiple breakpoint methods

– Methods that are designed to work with inhomogeneous reference series

Training is important

Automatic methods as good as manual methods

– No metadata in validation dataset

SNHT is not recommended

Absolute homogenization is method of last resort

Decomposition method on Benchmark

Domonkos, P., V. Venema, O. Mestre. Efficiencies of homogenisation methods: our present knowledge and its limitation. Proceedings of the Seventh seminar for homogenization and quality control in climatological databases, Budapest, Hungary, 24–28 October 2011, WMO report, Climate data and monitoring, WCDMP-No. 78, pp. 11-24, 2013.

[Figure labels: RMSE station trends; CRMSE; annual data]

Caveats HOME: ISTI

1. Missing homogenization methods

– Two- or multi-phase regression method

2. Size breaks (random walk or noise)

– Ralf Lindau and Victor Venema. The joint influence of break and noise variance on the break detection capability in time series homogenization.

3. Signal to noise ratio varies regionally

4. Regional trends (absolute homogenization)

5. Length of the series

– Ralf Lindau and Victor Venema. On the multiple breakpoint problem and the number of significant breaks in homogenisation of climate records. Idojaras, 117, no. 1, pp. 1-34, 2013.

6. Non-climatic trend bias

International Surface Temperature Initiative

– Kate Willett et al. A framework for benchmarking of homogenisation algorithm performance on the global scale. Geosci. Instrum. Method. Data Syst., 3, pp. 187-200, 2014.

Radiation error

Climates with the largest radiation errors:

* Strong insolation

* Low wind

* Dry ground

* High specific humidity

Parallel measurements

Transition to Stevenson screens

North-West Europe: < 0.2°C (Various, Parker)

Basel, Switzerland: 0°C (Wild screen)

Kremsmünster, Austria: 0.2°C (North-wall)

Adelaide, South Australia: 0.2°C (Glaisher stand)

Spain: 0.35°C (French screen)

Sri Lanka: 0.37°C (Tropical screen)

India: 0.42°C (Tropical screen)

Sources of global temperature trend bias

Transition to Stevenson screens

Transition to Automatic Weather Stations

Urbanization

Siting

Irrigation

Relocations to airports

Research on parallel data

Large database with parallel measurements needed to study daily inhomogeneities

o Study statistical and physical properties of (daily) inhomogeneities

o Dependence on local weather and regional climate

o Most studies are currently about mid-latitudes

o Validate detected inhomogeneities

o Independent evidence for trend bias

Parallel Data Initiative

Produce an open database

Initially data is restricted to contributors

– Incentive to contribute

– Until first joint paper(s) by contributors are written

First action: Inventory of parallel datasets

– https://ourproject.org/moin/projects/parallel

– Dozens of datasets available

More information

– http://tinyurl.com/paralleldata


Conclusions & outlook

Statistical homogenization improves temperature trend estimates

– Only the best methods improve precipitation trends

Modern homogenization methods are more accurate

1. Statistical homogenization

– Global validation study missing

– Better mathematical understanding of the methods

2. Better physical understanding of causes

– http://tinyurl.com/paralleldata

3. More modelling to improve understanding

Q&A slides

Shorter length, less certainty

[Figure: exceeding probability (levels 1/128, 1/64, 1/32, 1/16, 1/8, 1/4) for series lengths n = 21 years and n = 101 years]

Ralf Lindau and Victor Venema. On the multiple breakpoint problem and the number of significant breaks in homogenisation of climate records. Idojaras, Quart. journal Hungarian Meteorol. Service, 117, no. 1, pp. 1-34, 2013.

Which SNR is sufficient?

RMS skill for random segmentation (0) and standard search (+) for different SNRs.

So far we considered SNR = ½: random segmentation and standard search have comparable skills.

Only for SNR > 1 is the standard search significantly better.

[Figure legend: Random, Standard]

Surrogate temperature section

Generated homogeneous temperature networks

– Stochastic modelling

– Based on statistical properties of homogenized data

Configuration

– Typical for Europe

– 15 networks

– Length: 100 years

– Number of stations: 5, 9, 15

Added non-climatic changes

– Most realistic to date

[Figure labels: missing data at the beginning, missing data during WWII, outliers, breaks, simultaneous breaks, local trends]
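The two steps above, generating a homogeneous network and then adding non-climatic changes, can be sketched in a few lines. This is a strongly simplified illustration, not the HOME benchmark generator: the white-noise regional signal, the break frequency and the break size are all assumed values, and outliers, missing data and local trends are left out.

```python
import numpy as np

def make_network(n_stations=5, n_years=100, breaks_per_station=3,
                 break_sd=0.8, noise_sd=0.3, seed=0):
    """Return (homogeneous, inhomogeneous) arrays of shape (stations, years)."""
    rng = np.random.default_rng(seed)
    regional = rng.normal(0.0, 1.0, n_years)     # signal shared by all stations
    clean = regional + rng.normal(0.0, noise_sd, (n_stations, n_years))
    perturbed = clean.copy()
    for s in range(n_stations):
        positions = rng.choice(np.arange(1, n_years), breaks_per_station, replace=False)
        for pos in positions:
            # each break shifts everything before it, so the most recent
            # segment stays equal to the homogeneous series
            perturbed[s, :pos] += rng.normal(0.0, break_sd)
    return clean, perturbed

clean, perturbed = make_network()
print(np.abs(perturbed - clean).max())
```

Because the homogeneous "truth" is kept, any homogenised result can later be scored against it.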

Physical causes of inhomogeneities

Shelter type, exposure

– Radiation & wetting protection

– Natural or forced ventilation

– Snow cover

– Plastic screen: insolation on hot days

Relocation of station

– City-> airport, suburbs, lower heights

– Deurbanisation of network

Instrument

– Response, integration time

– Zero drift, shrinking of the glass in the initial years

– Calibration errors

– Temperature out of range

– Mercury thermometers: T < -39°C

Change surrounding

– Urbanization, growing vegetation, irrigation

Definitions

– Computation daily means

Measurement procedures

– Reading times

Maintenance procedures

– AWS: Icing, damage detection

– Painting & cleaning schedule

Digitisation & database

– Minus sign forgotten

– Station names mixed up

– Pre-homogenised data

Correction methodology - inflation

Corrections have a deterministic (explained variance) and a stochastic (unexplained variance) component

Downscaling: problems with deterministic corrections

– Variance inflation (Von Storch, 1999)

– Quantile Matching (Maraun, 2013)

– Unintended change of the trend in the mean

Should correct unexplained variance with noise

Homogenization

– Trend in difference time series is small

– Gradual inhomogeneities (urbanization)

Maraun, D. Bias correction, quantile mapping, and downscaling: revisiting the inflation issue. J. Clim., 26, pp. 2137-2143, doi: 10.1175/JCLI-D-12-00821.1, 2013.

Von Storch, H. On the Use of ‘‘Inflation’’ in Statistical Downscaling. J. Clim., 12, pp. 3505-3506, 1999.
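The inflation issue can be made concrete with a small numerical sketch. It assumes a simple linear regression between a predictor and a target; the coefficients and sample size are made up. A deterministic correction reproduces only the explained variance. Inflating it to the full variance also inflates its dependence on the predictor (and hence any trend carried by the predictor), whereas adding noise restores the variance without that side effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
predictor = rng.normal(0.0, 1.0, n)
target = 0.6 * predictor + rng.normal(0.0, 0.8, n)    # explained plus unexplained part

slope, intercept = np.polyfit(predictor, target, 1)
deterministic = slope * predictor + intercept          # explained variance only
r = np.corrcoef(predictor, target)[0, 1]

# Variance inflation: stretch the deterministic part to the full variance
inflated = target.mean() + (deterministic - target.mean()) / abs(r)

# Alternative: add noise with the variance of the unexplained residual
residual_sd = np.std(target - deterministic)
randomised = deterministic + rng.normal(0.0, residual_sd, n)

for name, series in [("deterministic", deterministic),
                     ("inflated", inflated),
                     ("randomised", randomised)]:
    fitted_slope = np.polyfit(predictor, series, 1)[0]
    print(f"{name:13s} variance={series.var():.2f} slope={fitted_slope:.2f}")
```

In this setup the inflated series has the right variance but a slope of about `slope / |r|`, while the randomised series keeps the regression slope.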

Correction – change in noise source

Change in cross-correlation

– Relocation, change in noise source

Simple example

– |N1| = |N2|

– No inhomogeneity in distribution

– Jump in difference time series

Station 1: R + N1 + W1, then R + N2 + W1

Station 2: R + N1 + W2

R: regional climate signal; N: instrument-specific error; W: station-specific weather

Research on parallel data

Large database with parallel measurements needed to study daily inhomogeneities

o Generate benchmark data with realistic inhomogeneities

o For example, second cycle of ISTI

o Validate detected inhomogeneities

Exposure

Insolation

– Sun, hot ground, scattered radiation

Humidity and clouds

– Infrared radiative cooling

Wind

– Heat exchange

Design

– Size sensor

– Shielding

– Mechanical ventilation

Australia: Albany airport and town

Trewin (2012)

Parallel measurements – Kremsmünster

Böhm et al. (2010)

Kremsmünster – percentiles difference

Böhm et al. (2010)

Spain: Montsouris screen, Stevenson observations, Stevenson automatic

Montsouris vs. Stevenson: difference as function of Diurnal Temperature Range and Tmax

Murcia: South East Spain, Mediterranean.

La Coruña (Corunna): North West Spain, Atlantic.

July

April

Motivation: daily data

“[Inhomogeneous data] affects, in particular, the understanding of extremes, because changes in extremes are often more sensitive to inhomogeneous climate monitoring practices than changes in the mean.”

Trenberth, K.E., et al., 2007: Observations: Surface and Atmospheric Climate Change. In: Climate Change 2007: The Physical Science Basis. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA.

Extremes, mean and variability

Importance changes in variability and mean

The relative sensitivity of an extreme to changes in the mean (dashed line) and in the standard deviation (solid line) for a certain temperature threshold (x-axis). The relative sensitivity to the mean (or standard deviation) is the change in the probability of an extreme event due to a change in the mean (or standard deviation), divided by that probability. From Katz and Brown (1992).
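The definition of relative sensitivity can be worked through numerically. The sketch below assumes a normal distribution, for which the exceedance probability is P = 1 − Φ(z) with z = (threshold − mean) / sd, so that dP/dmean = φ(z)/sd and dP/dsd = z·φ(z)/sd. The function names and the thresholds in the example are illustrative choices.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def exceedance_probability(threshold, mean, sd):
    """P(X > threshold) for a normal distribution."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def relative_sensitivities(threshold, mean=0.0, sd=1.0):
    """Return ((dP/dmean)/P, (dP/dsd)/P) for a normal distribution."""
    z = (threshold - mean) / sd
    p = exceedance_probability(threshold, mean, sd)
    d_mean = normal_pdf(z) / sd        # derivative of P with respect to the mean
    d_sd = z * normal_pdf(z) / sd      # derivative of P with respect to the sd
    return d_mean / p, d_sd / p

for threshold in (1.0, 2.0, 3.0):
    to_mean, to_sd = relative_sensitivities(threshold)
    print(threshold, round(to_mean, 2), round(to_sd, 2))
```

Under this assumption the ratio of the two sensitivities equals z, so the more extreme the threshold, the more a change in variability matters relative to a change in the mean.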

A priori formula

The different reaction of breaks and noise to randomly inserted breaks makes it possible to estimate break variance and break number a priori.

If we insert many breaks, almost the entire break variance is explained plus a known fraction of noise.

At k = n_k half of the break variance is reached (22.8% in total).

No bias component.

[Figure annotations: 0.228, 3.1]
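The behaviour of noise under randomly inserted breaks can be checked with a short simulation. For pure white noise of length n and k break positions chosen independently of the data, the segment means explain on average a fraction k / (n − 1) of the variance, a standard analysis-of-variance result. Whether this is exactly the "known fraction" meant on the slide is an assumption, as are the function name and the values of n and k.

```python
import numpy as np

def explained_fraction(series, break_positions):
    """Fraction of the total variance explained by the segment means."""
    segments = np.split(series, sorted(break_positions))
    between = sum(len(seg) * (seg.mean() - series.mean()) ** 2 for seg in segments)
    total = np.sum((series - series.mean()) ** 2)
    return between / total

rng = np.random.default_rng(42)
n, k, trials = 100, 10, 2000
fractions = []
for _ in range(trials):
    noise = rng.normal(0.0, 1.0, n)                           # no real breaks at all
    breaks = rng.choice(np.arange(1, n), size=k, replace=False)
    fractions.append(explained_fraction(noise, breaks))

print(np.mean(fractions), k / (n - 1))
```

True breaks react differently: a few randomly placed test breaks already explain a large share of their variance, and this contrast is what allows the two contributions to be separated.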


A blind test of monthly homogenisation algorithms

Victor Venema, O. Mestre, E. Aguilar, I. Auer, J. A. Guijarro, P. Domonkos, G. Vertacnik, T. Szentimrey, P. Stepanek, P. Zahradnicek, J. Viarre, G. Müller-Westermeier, M. Lakatos, C. N. Williams, M. Menne, R. Lindau, D. Rasol, E. Rustemeier, K. Kolokythas, T. Marinova, L. Andresen, F. Acquaotta, S. Fratianni, S. Cheval, M. Klancar, M. Brunetti, C. Gruber, M. Prohom Duran, T. Likso, P. Esteban, T. Brandsma

Meteorological Institute, Bonn

Participants returned the data

25 blind contributions

Some algorithms have multiple contributions

– Test versions

– Test influence operator (manual methods)

Algorithms/software

– USHCN

– PRODIGE

– MASH

– Craddock

– AnClim

– RhTestV2

– SNHT

– Climatol

– ACMANT

Monthly CRMSE complete contributions

[Figure: monthly CRMSE per contribution. Temperature, CRMSE [°C]: ACMANT, SNHT DWD, C3SNHT, PMFred abs, PMTred rel, AnClim main, USHCN cx8, USHCN 52x, USHCN main, PRODIGE trendy, PRODIGE monthly, PRODIGE main, MASH main, Inhom. data. Precipitation, CRMSE [mm]: Climatol, C3SNHT, PMFred abs, PMTred rel, AnClim main, PRODIGE trendy, PRODIGE monthly, PRODIGE main, MASH main, Inhom. data.]

Decadal CRMSE complete contributions

[Figure: decadal CRMSE per contribution. Temperature, CRMSE [°C]: ACMANT, SNHT DWD, C3SNHT, PMFred abs, PMTred rel, AnClim main, USHCN cx8, USHCN 52x, USHCN main, PRODIGE trendy, PRODIGE monthly, PRODIGE main, MASH main, Inhom. data. Precipitation, CRMSE [mm]: Climatol, C3SNHT, PMFred abs, PMTred rel, AnClim main, PRODIGE trendy, PRODIGE monthly, PRODIGE main, MASH main, Inhom. data.]

Contingency scores

Contribution         No. stations  POD   POFD  Peirce Skill Score  Heidke Skill Score  Heidke Special
MASH main            111           0.63  0.09  0.53                0.31                -0.20
PRODIGE main         111           0.35  0.02  0.33                0.35                0.41
PRODIGE monthly      111           0.39  0.02  0.37                0.40                0.44
PRODIGE trendy       111           0.35  0.02  0.32                0.35                0.41
USHCN main           111           0.34  0.00  0.33                0.46                0.61
USHCN 52x            111           0.40  0.01  0.39                0.51                0.62
USHCN cx8            111           0.35  0.01  0.35                0.47                0.61
AnClim main          111           0.18  0.03  0.15                0.16                0.20
iCraddock Vertacnik  55            0.60  0.03  0.57                0.54                0.49
PMTred rel           111           0.41  0.04  0.37                0.34                0.27
PMFred abs           111           0.21  0.01  0.20                0.27                0.46
C3SNHT               111           0.23  0.05  0.18                0.16                0.04
SNHT DWD             111           0.12  0.01  0.11                0.15                0.40
Climatol             111           0.38  0.01  0.37                0.45                0.55
ACMANT               111           0.50  0.03  0.47                0.44                0.41
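For reference, the standard textbook definitions of POD, POFD and the Peirce and Heidke skill scores can be computed from a 2x2 contingency table as sketched below. The counts in the example are made up, and how detected breaks were matched to true breaks in the study is not specified here. The "Heidke Special" column is not defined in these slides and is therefore not reproduced.

```python
def contingency_scores(hits, false_alarms, misses, correct_negatives):
    """POD, POFD, Peirce and Heidke skill scores from a 2x2 contingency table."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    n = a + b + c + d
    pod = a / (a + c)                    # probability of detection
    pofd = b / (b + d)                   # probability of false detection
    peirce = pod - pofd                  # Peirce skill score
    # number of correct decisions expected by chance, given the marginals
    expected_correct = ((a + b) * (a + c) + (c + d) * (b + d)) / n
    heidke = (a + d - expected_correct) / (n - expected_correct)
    return pod, pofd, peirce, heidke

# Hypothetical counts: 100 true breaks, 1000 break-free periods
pod, pofd, peirce, heidke = contingency_scores(
    hits=35, false_alarms=20, misses=65, correct_negatives=980)
print(round(pod, 2), round(pofd, 2), round(peirce, 2), round(heidke, 2))
```

The Peirce score rewards detection and penalises false alarms equally, which matches the POD minus POFD pattern visible in the table.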

Pairwise vs composite reference

Composite reference

– Compute a weighted average of neighbours

– Reduces the influence of inhomogeneities in single stations

– Careful selection of stations needed

No large breaks for detection

No breaks for corrections

Pairwise

– Need to attribute breaks found in the pairs to a station

– Solution to this problem is still ad-hoc or manual

– Potential for optimal mathematical solution

– Joint detection: all stations simultaneously

– Solving combinatorial problem for large breaks
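The difference between the two strategies can be illustrated with synthetic data. In this minimal sketch a break is placed in one neighbour only: in the composite reference it is damped by that neighbour's weight, while in the pairwise comparison it appears at full size in one pair and then has to be attributed to the right station. The weights, the noise levels and the function names are assumptions for illustration.

```python
import numpy as np

def composite_reference(neighbours, weights):
    """Weighted average of neighbour series (rows = stations)."""
    weights = np.asarray(weights, dtype=float)
    return weights @ np.asarray(neighbours, dtype=float) / weights.sum()

def pairwise_differences(candidate, neighbours):
    """One difference series per neighbour (rows = pairs)."""
    return np.asarray(candidate, dtype=float) - np.asarray(neighbours, dtype=float)

def step(diff, pos):
    """Mean after the position minus mean before it."""
    return diff[pos:].mean() - diff[:pos].mean()

rng = np.random.default_rng(3)
n_years = 80
regional = rng.normal(0.0, 1.0, n_years)
candidate = regional + rng.normal(0.0, 0.2, n_years)
neighbours = regional + rng.normal(0.0, 0.2, (4, n_years))
neighbours[0, 40:] += 1.0                  # break in ONE neighbour, candidate is clean

weights = [0.4, 0.3, 0.2, 0.1]             # assumed, e.g. derived from correlations
diff_composite = candidate - composite_reference(neighbours, weights)
diff_pairs = pairwise_differences(candidate, neighbours)

print(round(step(diff_composite, 40), 2))  # roughly -0.4: damped by the weight
print(round(step(diff_pairs[0], 40), 2))   # roughly -1.0: full break in this pair
print(round(step(diff_pairs[1], 40), 2))   # roughly 0: pair without a break
```

The damped step in the composite difference is still attributed to the candidate, which is why the reference stations need careful selection.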
