Radan HUTH, Monika CAHYNOVÁ Institute of Atmospheric Physics, Prague, Czech Republic
Classifications of circulation patterns from the COST733 database: An assessment of synoptic-climatological applicability by two-sample Kolmogorov-Smirnov test
COST733 database (collection)
• COST733 Action – “Harmonization and Applications of Weather Types Classifications for European Regions”
• (very) large number of classifications produced
• on unified data
– SLP at 12 UTC
– ERA40 (Sep 1957 – Aug 2002)
– ~9, ~18, ~27 types wherever possible
– 12 European domains
COST733 database (collection)
• version 2.0 of the database
– released this spring
• 18 methods for each domain
– threshold-based: GWT (Beck), Litynski, Lamb (Jenkinson-Collison), P27 (Kruizinga), WLK
– leader algorithm: Lund, Kirchhofer, Erpicum
– PCA-based: T-mode PCA
– optimization algorithms: CKMEANS, PCACA (k-means), Petisco, PCAXTRKMS, SANDRA, SANDRA-S, NNW (SOMs), PCAXTR
– pseudo-random: random centroids
• plus 7 subjective and objectivized classifications not attributable to any domain
– ignored today
COST733 database (collection)
• different attributes of classifications
– number of types (9 vs. 18 vs. 27)
– sequencing (no vs. 4-day sequences)
– seasonal vs. year-round definition
– variable: all based on SLP, some also use additional variables
GOAL
• assess the synoptic-climatological applicability of classifications
• i.e., how well they stratify surface weather (climate) conditions
• demonstrate effect of
– sequencing
– seasonal vs. annual definition
– adding more variables
  • 500 hPa height
  • 500 hPa vorticity
  • 850/500 hPa thickness
– number of types
Classifications examined
• 11 methods
– 30 classifications available for each of them
– differing in
  • sequencing (no vs. 4 days)
  • additional variables (Z500, THICK850/500, VOR500, all together)
  • number of types (9, 18, 27)
• 5 methods
– additional 6 classifications available
– differing in
  • seasonality of definition (year-round vs. seasonal)
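The count of 30 classifications per method is consistent with a full cross of the three attributes listed above, provided that an SLP-only baseline counts as one of the variable options. That baseline is an assumption here, not something the slide states; the labels below are illustrative only.

```python
from itertools import product

# Attribute levels as listed on the slide; "SLP only" is an assumed baseline.
sequencing = ["none", "4-day"]
variables = ["SLP only", "+Z500", "+THICK850/500", "+VOR500", "+all three"]
n_types = [9, 18, 27]

# Every combination of sequencing, variable set and number of types.
variants = list(product(sequencing, variables, n_types))
print(len(variants))  # 2 * 5 * 3 = 30
```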
TOOL
• 2-sample Kolmogorov-Smirnov test
• tests whether the distribution of the climate element under one type equals its distribution under all the other types combined
[Figure: map, horizontal axis -10 to 25, vertical axis 40 to 60]
TOOL
• at each station
• types for which the K-S test rejects the equality of distributions are counted
• the larger the count, the better the stratification, the better the synoptic-climatological applicability
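The per-station counting described above can be sketched as follows. This is a minimal standard-library illustration, not the authors' code: the function names, the asymptotic p-value approximation and the significance level `alpha=0.05` are all assumptions (the slides do not state the level used). In practice `scipy.stats.ks_2samp` provides the same test.

```python
import math
from bisect import bisect_right

def ks_2samp_asymptotic(x, y):
    """Two-sample K-S statistic D and an asymptotic two-sided p-value."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    # D is the largest gap between the two empirical CDFs,
    # which can only occur at an observed value.
    d = max(abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m)
            for v in xs + ys)
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.3:
        # The series below converges slowly here; the p-value is ~1.
        return d, 1.0
    # Kolmogorov tail: Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

def count_separated_types(values, types, alpha=0.05):
    """Count types whose distribution differs from that of all other days.

    values: climate element (e.g. daily Tmax) at one station
    types:  circulation type assigned to each day (same length as values)
    alpha:  assumed significance level
    """
    labels = sorted(set(types))
    rejected = 0
    for t in labels:
        inside = [v for v, k in zip(values, types) if k == t]
        outside = [v for v, k in zip(values, types) if k != t]
        if not inside or not outside:
            continue  # nothing to compare against
        _, p = ks_2samp_asymptotic(inside, outside)
        if p < alpha:
            rejected += 1
    return rejected, len(labels)
```

The percentage `100 * rejected / n_types` at each station is then the quantity by which classifications are compared.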
ANALYSIS
• preliminary results
• maximum temperature
(minimum temperature – very similar results)
(precipitation – different)
• domain 07 (central Europe)
• 39 stations from ECA&D database
• winter (DJF)
• Jan 1961 – Dec 2000
RANKING OF CLASSIFICATIONS
• at all stations individually:
– for each classification: number of rejected K-S tests counted
– classifications ranked by the percentage of rejected K-S tests (= well separated classes)
– higher percentage → better → lower rank
• for each classification: ranks averaged over stations
• area mean rank → ranking of the classification
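The ranking procedure above can be sketched in a few lines. The function names and the toy percentages are hypothetical, and giving tied classifications the average of their ranks is an assumption; the slides do not say how ties were handled.

```python
def rank_descending(scores):
    """Rank 1 = highest score; tied entries share the average rank (assumed)."""
    order = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with order[i].
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for name in order[i:j + 1]:
            ranks[name] = avg
        i = j + 1
    return ranks

def area_mean_ranks(pct_rejected):
    """pct_rejected: {station: {classification: % of rejected K-S tests}}."""
    collected = {}
    for station_scores in pct_rejected.values():
        for name, r in rank_descending(station_scores).items():
            collected.setdefault(name, []).append(r)
    # Average each classification's rank over all stations.
    return {name: sum(r) / len(r) for name, r in collected.items()}

# Hypothetical percentages for three classifications at two stations.
pct = {
    "station_1": {"A": 80.0, "B": 60.0, "C": 60.0},
    "station_2": {"A": 50.0, "B": 70.0, "C": 30.0},
}
mean_ranks = area_mean_ranks(pct)
final_order = sorted(mean_ranks, key=mean_ranks.get)  # best (lowest) first
```

Sorting by the area mean rank gives the final ordering of classifications, with the lowest mean rank first.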
Result 1: comparison of methods
• area mean ranks averaged over 30 realizations of each method
• result: order of the method, independent of any attribute (no. of types, sequencing, variable)
Result 1: comparison of methods
1 cluster analysis of PCs
2 SANDRA (optimized k-means)
3 C-k-means
4 random centroids
5 Lund correlation-based
6 Kruizinga
7 PCA-extreme scores reassigned by k-means
8 obliquely rotated T-mode PCA
9 PCA-extreme scores reassigned by Eucl. distance
10 Erpicum
11 orthogonally rotated T-mode PCA
NOTE: not all methods participated in the race!
Result 2: sensitivity to the number of types
• all pairs of classifications
– differing in no. of types
  • 9 vs. 18
  • 18 vs. 27
– with all other attributes equal
• difference in rank is calculated
• histogram of differences
• t-test: equality of the difference to zero
[Figure: histogram "Effect of no. of types: 9 vs. 18"; -106 ± 17]
[Figure: histogram "Effect of no. of types: 18 vs. 27"; -55 ± 12]
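The paired comparison used in this and the following results can be sketched as below. It is a minimal illustration under stated assumptions: the function name and the toy ranks are hypothetical, the direction of the subtraction is assumed, and the slides do not say what the "±" values represent (the standard error is computed here only as one plausible candidate).

```python
import math
from statistics import mean, stdev

def rank_difference_summary(pairs):
    """One-sample t statistic for paired rank differences against zero.

    pairs: list of (rank_a, rank_b) for two classifications that differ in
           a single attribute, with all other attributes equal.
    Returns (mean difference, standard error, t statistic, degrees of freedom).
    """
    diffs = [a - b for a, b in pairs]  # sign convention is an assumption
    n = len(diffs)
    m = mean(diffs)
    se = stdev(diffs) / math.sqrt(n)   # sample standard deviation (n - 1)
    return m, se, m / se, n - 1

# Hypothetical area mean ranks for four matched pairs.
m, se, t, df = rank_difference_summary([(10, 20), (15, 30), (5, 25), (12, 18)])
```

The p-value follows from Student's t distribution with `df` degrees of freedom; `scipy.stats.ttest_1samp(diffs, 0.0)` performs the same test directly.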
Result 3: effect of sequencing
• all pairs of classifications
– differing in sequencing (no vs. 4-days)
– with all other attributes equal
• difference in rank is calculated
• histogram of differences
• t-test: equality of the difference to zero
[Figure: histogram "Effect of sequencing"; -30 ± 11]
Result 4: effect of seasonality
• all pairs of classifications
– differing in the seasonality in their definition
– with all other attributes equal
• difference in rank is calculated
• histogram of differences
• t-test: equality of the difference to zero
[Figure: histogram "Effect of seasonal definition"; -44 ± 24]
Result 5: effect of additional variables
[Figure: four histograms: "Effect of adding 500 hPa height & 850/500 hPa thickness & 500 hPa vorticity", "Effect of Z5 addition", "Effect of adding 500 hPa vorticity", "Effect of adding 850/500 hPa thickness"]
Values shown with the histograms: +42 ± 24, +68 ± 18, +61 ± 19, +41 ± 18
CONCLUSIONS
• various kinds of cluster analysis perform well
• fewer types → better performance
• sequencing adds value: surface temperature is better described by types of 4-day sequences than types of instantaneous fields
• seasonal definition better than annual, but:
– systematic difference in the number of types (7 vs. 9)
• additional variables bring no benefit; in fact they worsen the synoptic-climatological applicability
OUTLOOK
• analysis to extend to
– all domains
– more variables (Tmin, Precip)
• more comparisons will be possible → results may be more general
• several other criteria as well
• other datasets (gridded: ENSEMBLES, reanalyses)