Testing the validity of indicators in the field of education The experience of CRELL
Rome, October 3-5, 2012
Improving Education through Accountability and Evaluation: Lessons from Around the World

Andrea Saltelli, European Commission, Joint Research Centre
CRELL: Centre for Research on Lifelong Learning based on indicators and benchmarks
DG Education and Culture + Joint Research Centre, since 2005
http://crell.jrc.ec.europa.eu/
CRELL
- Trajectories to achieving the EU 2020 objectives
- Employability and other benchmarks (mobility, multilingualism)
- Labour market outcomes
(Foci of econometric research at the JRC, Ispra)
- Counterfactual analysis and other impact assessment methodologies
- Regional studies (competitiveness, innovation, well-being)
- Composite indicators and social choice
(Foci of econometric research at the JRC, Ispra)
Indicators
Context: knowledge in support of policy; evaluation and impact assessment, but also advocacy.
Caveat: validity = plausibility, defensibility … and not 'proof of truth'.
When testing the evidence, some reasonable people (and guidelines) suggest that 'sensitivity analysis would help'.
The JRC has fostered the development and uptake of sensitivity analysis (20 years of papers, schools and books).
Today we call it sensitivity auditing and teach it within the impact assessment syllabus run by the Secretariat-General.
…
Sensitivity analysis
[Figure: how to shake coupled stairs]
[Figure: how coupled stairs are shaken in most of the available literature]
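The "coupled stairs" cartoons read as contrasting joint variation of all factors with the one-factor-at-a-time (OAT) practice common in the literature. A minimal sketch, using an invented toy model with an interaction term, of why OAT exploration around a baseline can miss what joint sampling finds:

```python
import itertools

# Hedged illustration: a toy model whose interaction term OAT sampling
# around the baseline (0, 0) cannot detect, while joint sampling does.
# The model and ranges are invented, not taken from the talk.
def model(x1, x2):
    return x1 + x2 + 5.0 * x1 * x2  # interaction term

# OAT: vary each factor alone over [-1, 1]; the interaction never fires
# because the other factor stays pinned at its baseline of zero.
oat_outputs = [model(x, 0.0) for x in (-1.0, 1.0)] + \
              [model(0.0, x) for x in (-1.0, 1.0)]

# Joint: vary both factors together (full factorial on the corners).
joint_outputs = [model(x1, x2)
                 for x1, x2 in itertools.product((-1.0, 1.0), repeat=2)]

print(max(map(abs, oat_outputs)))    # 1.0 : OAT sees only the linear terms
print(max(map(abs, joint_outputs)))  # 7.0 : joint sampling reaches 1+1+5
```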
Sensitivity analysis
Testing (composite) indicators: two approaches
Michaela Saisana, Andrea Saltelli and Stefano Tarantola (2005). Uncertainty and sensitivity analysis techniques as tools for the quality assessment of composite indicators. J. R. Statist. Soc. A 168(2), 307-323.
Paolo Paruolo, Michaela Saisana and Andrea Saltelli. Ratings and rankings: voodoo or science? J. R. Statist. Soc. A 176(2), 1-26.
Sensitivity analysis
First: The invasive approach
Michaela Saisana, Béatrice d'Hombres and Andrea Saltelli (2011). Rickety numbers: volatility of university rankings and policy implications. Research Policy 40, 165-177.
Sensitivity analysis
ROBUSTNESS ANALYSIS OF SJTU AND THES

Assumptions and their alternatives:
- Number of indicators: all six indicators included, or one at a time excluded (6 options)
- Weighting method: original set of weights; factor analysis; equal weighting; data envelopment analysis
- Aggregation rule: additive; multiplicative; Borda multi-criterion
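The exercise above can be sketched as a loop over the assumption space: re-score the units under every combination of weighting scheme and aggregation rule, then report each unit's simulated rank range. A minimal sketch with three invented units, two invented weight sets, and two of the aggregation rules listed above:

```python
import itertools

# Hedged sketch of the robustness exercise: all scores and weights below
# are invented for illustration, not taken from SJTU or THES data.
scores = {                       # unit -> normalised indicator values
    "A": [0.9, 0.3, 0.3],
    "B": [0.5, 0.6, 0.6],
    "C": [0.4, 0.5, 0.7],
}

weight_sets = {
    "original": [0.6, 0.2, 0.2],
    "equal":    [1/3, 1/3, 1/3],
}

def additive(vals, w):
    return sum(wi * v for wi, v in zip(w, vals))

def multiplicative(vals, w):     # geometric aggregation
    out = 1.0
    for wi, v in zip(w, vals):
        out *= v ** wi
    return out

ranks = {u: [] for u in scores}
for w, agg in itertools.product(weight_sets.values(),
                                (additive, multiplicative)):
    composite = {u: agg(v, w) for u, v in scores.items()}
    ordered = sorted(composite, key=composite.get, reverse=True)
    for pos, u in enumerate(ordered, start=1):
        ranks[u].append(pos)

for u in sorted(scores):
    print(u, min(ranks[u]), "-", max(ranks[u]))
# A 1 - 3   (first under the original weights, last under equal weights)
# B 1 - 2
# C 2 - 3
```

Unit A tops the table under the original weights but falls to last place under equal weighting, which is exactly the kind of assumption-driven volatility the SJTU/THES analysis measures.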
SJTU: SIMULATED RANKS - TOP 20
Harvard, Stanford, Berkeley, Cambridge, MIT: top 5 in more than 75% of our simulations.
Univ. California San Francisco: original rank 18th, but could be ranked anywhere between the 6th and the 100th position.
Impact of assumptions: much stronger for the middle-ranked universities.
[Figure: Simulated rank range, SJTU 2008. Heat map of the frequency with which each top-20 university falls in each five-position rank band (1-5, 6-10, …, 96-100); frequencies below 4% not shown, shading from below 15% up to above 50%.]
THES: SIMULATED RANKS - TOP 20
The impact of the uncertainties on the university ranks is even more apparent.
MIT: ranked 9th, but confirmed in only 13% of simulations (plausible range [4, 35]).
Very high volatility also for universities ranked in the 10th-20th positions, e.g. Duke Univ., Johns Hopkins Univ., Cornell Univ.
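Statements of the form "ranked 9th, but confirmed in only 13% of simulations, plausible range [4, 35]" come straight from the distribution of simulated ranks. A minimal sketch, on invented toy data, of how such summaries are read off:

```python
# Hedged sketch: the simulated ranks below are invented toy data, not the
# actual THES simulation output; only the style of summary mirrors the talk.
simulated_ranks = [9, 9, 12, 4, 35, 18, 9, 22, 7, 9]
nominal_rank = 9

plausible_range = (min(simulated_ranks), max(simulated_ranks))
confirmation = simulated_ranks.count(nominal_rank) / len(simulated_ranks)

print(plausible_range)        # (4, 35)
print(f"{confirmation:.0%}")  # 40%  (share of runs confirming rank 9)
```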
[Figure: Simulated rank range, THES 2008. Heat map of the frequency with which each top-20 university falls in each five-position rank band (1-5, 6-10, …, 96-100); frequencies below 4% not shown, shading from below 15% up to above 50%.]
Second: The non-invasive approach
Comparing the weights assigned by developers with 'effective weights' derived from sensitivity analysis.
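A minimal sketch of this non-invasive check, on invented data: as a simple proxy for a first-order sensitivity index, take the squared Pearson correlation of each indicator with the composite, normalise, and compare against the declared weights. The indicators, weights, and variances below are all assumptions for illustration:

```python
import random

# Hedged sketch: "effective" importance via squared correlation with the
# composite. Everything numeric here is invented for illustration.
def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

random.seed(1)
n = 2000
x1 = [random.gauss(0, 1) for _ in range(n)]   # low-variance indicator
x2 = [random.gauss(0, 3) for _ in range(n)]   # high-variance indicator

declared = [0.5, 0.5]                          # equal declared weights
composite = [declared[0] * a + declared[1] * b for a, b in zip(x1, x2)]

raw = [pearson(x1, composite) ** 2, pearson(x2, composite) ** 2]
effective = [r / sum(raw) for r in raw]

# Despite equal declared weights, x2 drives roughly 90% of the composite:
# effective importance follows the variance each indicator contributes.
print([round(w, 2) for w in effective])
```

The gap between declared and effective weights is the incoherence being tested: a developer who declares equal weights but feeds in indicators of very unequal variance has, in effect, built a different index than the one announced.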
Sensitivity analysis
University Rankings
Comparing the internal coherence of ARWU versus THES by testing the weights declared by the developers against 'effective' importance measures.
Partnerships with OECD, WEF, INSEAD, WIPO, UN-IFAD, FAO, Transparency International, World Justice Project, Harvard, Yale, Columbia …
Sixty analyses (Michaela Saisana, JRC)
JRC fosters the development of good practices for the construction of aggregated statistical measures (indices, composite indicators).
Something worth advocating for (1):
More use of social choice theory methods, both for building meaningful aggregated indicators …
(A pity that methods already available between the end of the 13th and the 15th century are neglected by most developers.)
… and for comparing options in the context of impact assessment studies.
Course at JRC Ispra, October 11-12.
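One such social-choice method, the Borda count, also appears among the aggregation alternatives in the robustness table above. A minimal sketch with invented options and rankings: each "voter" (here, each indicator) ranks the options best to worst, an option in position p of an n-option ranking earns n - 1 - p points, and the totals decide:

```python
from collections import defaultdict

# Hedged sketch of a Borda count; the options and rankings are invented.
rankings = [                     # best to worst, one list per indicator
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["B", "C", "A"],
]

scores = defaultdict(int)
n = len(rankings[0])
for ranking in rankings:
    for position, option in enumerate(ranking):
        scores[option] += n - 1 - position   # 2, 1, 0 points

winner = max(scores, key=scores.get)
print(dict(scores))   # {'A': 3, 'B': 5, 'C': 1}
print(winner)         # B
```

Note that A wins the first ranking outright, yet B wins overall by being ranked first or second everywhere, which is the consensus-seeking behaviour that makes rank-based rules attractive for comparing options.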
Useful links:
- Econometrics and Applied Statistics Unit: http://ipsc.jrc.ec.europa.eu/?id=155
- Sensitivity analysis: http://sensitivity-analysis.jrc.ec.europa.eu/
- Sensitivity auditing: http://sensitivity-analysis.jrc.ec.europa.eu/Presentations/Saltelli-final-February-1-1.pdf
- Quality of composite indicators: http://ipsc.jrc.ec.europa.eu/index.php?id=739