Radan HUTH, Monika CAHYNOVÁ Institute of Atmospheric Physics, Prague, Czech Republic

Classifications of circulation patterns from the COST733 database: An assessment of synoptic-climatological applicability by two-sample Kolmogorov-Smirnov test


http://www.cost.esf.org/

COST733 database (collection)

• COST733 Action – “Harmonization and Applications of Weather Types Classifications for European Regions”

• (very) large number of classifications produced

• on unified data
  – SLP at 12 UTC
  – ERA40 (Sep 1957 – Aug 2002)
  – ~9, ~18, ~27 types wherever possible
  – 12 European domains



• version 2.0 of the database
  – released this spring
• 18 methods for each domain
  – threshold-based: GWT (Beck), Litynski, Lamb (Jenkinson-Collison), P27 (Kruizinga), WLK
  – leader algorithm: Lund, Kirchhofer, Erpicum
  – PCA-based: T-mode PCA
  – optimization algorithms: CKMEANS, PCACA (k-means), Petisco, PCAXTRKMS, SANDRA, SANDRA-S, NNW (SOMs), PCAXTR
  – pseudo-random: random centroids
• plus 7 subjective and objectivized classifications not attributable to any domain
  – ignored today



• different attributes of classifications
  – number of types (9 vs. 18 vs. 27)
  – sequencing (no vs. 4-day sequences)
  – seasonal vs. year-round definition
  – variables: all based on SLP; several use additional variables


GOAL

• assess the synoptic-climatological applicability of classifications
• i.e., how well they stratify surface weather (climate) conditions
• demonstrate effect of
  – sequencing
  – seasonal vs. annual definition
  – adding more variables
    • 500 hPa height
    • 500 hPa vorticity
    • 850/500 hPa thickness
  – number of types


Classifications examined

• 11 methods
  – 30 classifications available for each of them
  – differing in
    • sequencing (no vs. 4 days)
    • additional variables (Z500, THICK850/500, VOR500, all together)
    • number of types (9, 18, 27)
• 5 methods
  – additional 6 classifications available
  – differing in
    • seasonality of definition (year-round vs. seasonal)
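
The count of 30 classifications per method is consistent with crossing the three attributes listed above, if one assumes an SLP-only variant alongside the four listed additional-variable options. That fifth option is an inference, not something the slides state, and the labels below are illustrative only.

```python
from itertools import product

# Attribute values taken from the slide; "SLP only" is an assumed baseline.
sequencing = ["none", "4-day"]
variables = ["SLP only", "SLP+Z500", "SLP+THICK850/500", "SLP+VOR500", "SLP+all"]
n_types = [9, 18, 27]

# Every combination of the three attributes for one method.
configurations = list(product(sequencing, variables, n_types))
print(len(configurations))  # 2 * 5 * 3 = 30
```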


TOOL

• 2-sample Kolmogorov-Smirnov test
• tests the equality of the distribution of the climate element under one type against its distribution under all the other types

[Figure: series of map panels; horizontal axis -10 to 25, vertical axis 40 to 60]


TOOL

• at each station
• types for which the K-S test rejects the equality of distributions are counted
• the larger the count, the better the stratification, the better the synoptic-climatological applicability
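
The counting procedure above can be sketched for a single station as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the 0.05 significance level is an assumption (the slides do not state one), and the p-value uses the asymptotic Kolmogorov series rather than an exact small-sample distribution.

```python
import math


def ks_statistic(a, b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past every observation equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n_a, n_b):
    """Asymptotic p-value from the Kolmogorov distribution (approximation)."""
    lam = d * math.sqrt(n_a * n_b / (n_a + n_b))
    if lam < 0.2:  # the series converges slowly here; p is ~1 anyway
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def count_rejected_types(values, types, alpha=0.05):
    """Count types whose values differ in distribution from all other days.

    values: daily values of the climate element at one station
    types:  circulation type assigned to each of those days
    alpha:  significance level (assumed, not given in the slides)
    """
    rejected = 0
    for t in sorted(set(types)):
        inside = [v for v, ty in zip(values, types) if ty == t]
        outside = [v for v, ty in zip(values, types) if ty != t]
        if not inside or not outside:
            continue  # a test needs two non-empty samples
        d = ks_statistic(inside, outside)
        if ks_pvalue(d, len(inside), len(outside)) < alpha:
            rejected += 1
    return rejected
```

Dividing the returned count by the number of types gives the percentage of rejected tests for that classification at that station.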


ANALYSIS

• preliminary results
• maximum temperature
  (minimum temperature – very similar results)
  (precipitation – different)
• domain 07 (central Europe)
• 39 stations from ECA&D database
• winter (DJF)
• Jan 1961 – Dec 2000


RANKING OF CLASSIFICATIONS

• at all stations individually:
  – for each classification: number of rejected K-S tests counted
  – classifications ranked by the percentage of rejected K-S tests (= well separated classes)
  – higher percentage means better, i.e. a lower rank
• for each classification: ranks averaged over stations
• area mean rank gives the ranking of the classification
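
The ranking steps above can be sketched as follows. The function names and the input layout (one row of percentages per station) are hypothetical, and giving tied classifications the mean of their positions is an assumption; the slides do not say how ties are handled.

```python
def ranks_descending(scores):
    """Rank 1 = highest score; tied scores share the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def area_mean_ranks(pct_rejected):
    """pct_rejected[station][classification] -> mean rank per classification."""
    per_station = [ranks_descending(row) for row in pct_rejected]
    n_cls = len(pct_rejected[0])
    return [sum(r[c] for r in per_station) / len(per_station)
            for c in range(n_cls)]


# Two stations, three classifications (made-up percentages).
print(area_mean_ranks([[80, 60, 40], [70, 90, 50]]))  # [1.5, 1.5, 3.0]
```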


Result 1: comparison of methods

• area mean ranks averaged over 30 realizations of each method

• result: order of the method, independent of any attribute (no. of types, sequencing, variable)



so the winner is…



1 cluster analysis of PCs
2 SANDRA (optimized k-means)
3 C-k-means
4 random centroids
5 Lund correlation-based
6 Kruizinga
7 PCA-extreme scores reassigned by k-means
8 obliquely rotated T-mode PCA
9 PCA-extreme scores reassigned by Eucl. distance
10 Erpicum
11 orthogonally rotated T-mode PCA

NOTE: not all methods participated in the race!


Result 2: sensitivity to the number of types

• all pairs of classifications
  – differing in no. of types
    • 9 vs. 18
    • 18 vs. 27
  – with all other attributes equal
• difference in rank is calculated
• histogram of differences
• t-test: equality of the difference to zero

[Figure: histogram of rank differences, "Effect of no. of types: 9 vs. 18"; annotated value: -106 ± 17]


[Figure: histogram of rank differences, "Effect of no. of types: 18 vs. 27"; annotated value: -55 ± 12]
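
The pairing and the t-test against zero described above can be sketched as follows. The attribute-tuple layout, the function names, and the sign convention (rank of the first value minus rank of the second) are assumptions for illustration; the slides do not define them. Only the t statistic is computed, so it still has to be compared with a Student t critical value for n - 1 degrees of freedom.

```python
import math
import statistics


def paired_differences(ranks, attr_index, value_a, value_b):
    """Rank differences for pairs that differ only in one attribute.

    ranks: dict mapping attribute tuples, e.g. (method, sequencing, n_types),
           to the rank of that classification
    """
    diffs = []
    for key, rank_a in ranks.items():
        if key[attr_index] != value_a:
            continue
        # Same tuple with only the chosen attribute swapped to value_b.
        partner = key[:attr_index] + (value_b,) + key[attr_index + 1:]
        if partner in ranks:
            diffs.append(rank_a - ranks[partner])
    return diffs


def t_statistic(diffs):
    """One-sample t statistic for the null hypothesis 'mean difference = 0'."""
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


# Made-up ranks: the last entry has no 18-type partner and is skipped.
example = {
    ("A", "no", 9): 10, ("A", "no", 18): 40,
    ("B", "no", 9): 20, ("B", "no", 18): 30,
    ("B", "4-day", 9): 5,
}
diffs = paired_differences(example, 2, 9, 18)
print(sorted(diffs))  # [-30, -10]
```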


Result 3: effect of sequencing

• all pairs of classifications
  – differing in sequencing (no vs. 4-days)
  – with all other attributes equal
• t-test: equality of the difference to zero

[Figure: histogram of rank differences, "Effect of sequencing"; annotated value: -30 ± 11]


Result 4: effect of seasonality

• all pairs of classifications
  – differing in the seasonality in their definition
  – with all other attributes equal
• t-test: equality of the difference to zero

[Figure: histogram of rank differences, "Effect of seasonal definition"; annotated value: -44 ± 24]


Result 5: effect of additional variables

[Figure: two histograms of rank differences, "Effect of adding 500 hPa height & 850/500 hPa thickness & 500 hPa vorticity" and "Effect of Z5 addition"; annotated values: +42 ± 24, +68 ± 18]

[Figure: two histograms of rank differences, "Effect of adding 500 hPa vorticity" and "Effect of adding 850/500 hPa thickness"; annotated values: +61 ± 19, +41 ± 18]


CONCLUSIONS

• various kinds of cluster analysis perform well
• fewer types give better performance
• sequencing adds value: surface temperature is better described by types of 4-day sequences than types of instantaneous fields
• seasonal definition better than annual, but:
  – systematic difference in the number of types (7 vs. 9)
• additional variables bring no benefit; in fact they worsen the synoptic-climatological applicability


OUTLOOK

• analysis to extend to
  – all domains
  – more variables (Tmin, Precip)
• more comparisons will be possible, so results may be more general
• several other criteria as well
• other datasets (gridded: ENSEMBLES, reanalyses)

