What is … small area estimation

Dimitris Ballas
Department of Geography
University of Sheffield
e-mail: d.ballas@sheffield.ac.uk
http://www.sheffield.ac.uk/geography/staff/ballas_dimitris


Outline

• Small area data sources
• Why small area estimation?
• Methodological approaches to small area estimation
• Spatial microsimulation
• Policy relevance examples
• Further reading and resources (including web-links to free software)


Small area data sources: the census of population

• Census data describe the state of the whole nation, area by area – no other social survey has such comprehensive spatial coverage

• Extremely relevant for policy analysis – used by government in the allocation of billions of pounds of public expenditure

• Very valuable commercially – essential ingredients in marketing analysis and retail modelling

After Rees, P, Martin, D, Williamson, P (eds) (2002), The Census Data System, Chichester, Wiley


Examples of more small area data sources

Neighbourhood statistics topics (http://www.neighbourhood.statistics.gov.uk):
• Census of population
• Crime and Safety
• Economic Deprivation
• Education, Skills and Training
• Health and Care
• Housing
• Physical Environment
• Deprivation and Classification
• Income and Lifestyles
• Population and Migration


Why small area estimation?

• Need for small area estimates of variables such as income, poverty, wealth, health, fear of crime, healthy lifestyles…

• We know little about the interdependencies between household structure or type and their lifestyles at the small area level

• There is no ‘live’ geographical database of household types linked to earning capabilities (both earned and/or transfer payments) which can be used both to explore spatial variations in lifestyles and behaviour and to monitor the effects of changes in taxation, family credit, pensions, social security payments etc.


Why small area estimation?

• Policy makers need small area estimates
• Academics need small area estimates
• Public like small area estimates
  – “What’s happening in my backyard”

Policy relevance
– socio-economic impact assessment
– geographical impacts of social policy
– what-if socio-spatial analysis


Small area estimation methods

• Conduct a survey - very costly - confidentiality issues
• Small area estimation methods can be applied to get survey data down to small area level and to evaluate the spatial impacts of policies
• Various methodologies of small area estimation
  – Statistical approaches
  – Spatial microsimulation approaches
    • Deterministic reweighting (IPF)
    • Probabilistic reweighting (CO)
    • Generalised linear regression (GREGWT)


Methodological approaches to small area estimation

• Statistical approaches (more linked to statisticians)
  – Synthetic estimation
  – Multi-level modelling
  – Bayesian approaches

• Spatial microsimulation approaches (more linked to geographers)
  – Deterministic reweighting approaches (IPF)
  – Probabilistic reweighting approaches (combinatorial optimisation)
  – Generalised linear regression (GREGWT)

• But many links between the methods

For a review of a recent effort to explore linkages between these two often separate sets of approaches see: http://www.ncrm.ac.uk/research/NMI/2012/smallarea.php


A very simple approach to generating indirect non-survey designed estimates

- Obtain small area totals from the census for variables that may be correlated with a ‘target’ variable (e.g. income would be correlated with “occupational classification”)

- Obtain information at the national, or sometimes regional, level on the target variable cross-tabulated by the census variable (e.g. earnings by occupational classification)

- Multiply the known census totals for each area by the corresponding average values
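The three steps above can be sketched in a few lines of Python. This is only an illustration: the occupational classes, the national average earnings and the census counts are all hypothetical values, and the function name `synthetic_estimate` is introduced here for convenience.

```python
# Hypothetical national average weekly earnings by occupational class
# (illustrative values only, not taken from any survey).
national_mean_earnings = {"managerial": 800.0, "skilled": 500.0, "elementary": 300.0}

# Hypothetical census counts of workers by occupational class for two small areas.
census_counts = {
    "Area 1": {"managerial": 10, "skilled": 50, "elementary": 40},
    "Area 2": {"managerial": 40, "skilled": 40, "elementary": 20},
}


def synthetic_estimate(counts, means):
    """Multiply census counts by national averages; return (area total, area mean)."""
    total = sum(counts[c] * means[c] for c in counts)
    people = sum(counts.values())
    return total, total / people


estimates = {
    area: synthetic_estimate(counts, national_mean_earnings)
    for area, counts in census_counts.items()
}

for area, (total, mean) in estimates.items():
    print(f"{area}: estimated total = {total:.0f}, estimated mean = {mean:.0f}")
```

The estimate for an area differs from the national picture only through the area's mix of census categories, which is why the choice of a well-correlated census variable matters.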


A model-based approach (Office for National Statistics, Heady et al., 2003)

• Estimating ‘average weekly household income’ at the electoral ward level in England and Wales on the basis of the following predictors:
  – the social class of the ward population
  – household type/composition
  – regional/country indicators
  – the employment status of the ward population
  – the proportion of the ward population claiming DWP benefits
  – the proportion of dwellings in each of the Council Tax bands in a ward

• “The model-based approach is based on finding a relationship between weekly household income (as measured in the Family Resources Survey (FRS)) and covariate information (usually from Census or administrative sources) for the wards that are represented in the Survey”

see http://www.neighbourhood.statistics.gov.uk/HTMLDocs/images/Model-Based_Income_Estimates%28V2%29_tcm97-51115.pdf
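The general idea of a model-based estimate (fit a relationship on the wards covered by the survey, then apply it to every ward using its covariates) can be sketched as follows. This is a deliberately simplified illustration and not the ONS model: it uses a single hypothetical covariate, ordinary least squares, and made-up ward values that happen to lie exactly on a line.

```python
# Hypothetical wards represented in the survey:
# (proportion of population claiming benefits, mean weekly household income).
sampled_wards = [(0.05, 700.0), (0.10, 600.0), (0.20, 400.0), (0.25, 300.0)]


def fit_line(points):
    """Ordinary least squares for one covariate; returns (intercept, slope)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(sampled_wards)

# Covariate values are known for every ward (e.g. from census or administrative
# sources), including wards the survey did not reach. Values are hypothetical.
all_wards = {"Ward A": 0.08, "Ward B": 0.15, "Ward C": 0.30}
predicted_income = {ward: intercept + slope * x for ward, x in all_wards.items()}

for ward, income in predicted_income.items():
    print(f"{ward}: predicted weekly household income = {income:.0f}")
```

A real application would use several covariates and would report the uncertainty of each ward estimate alongside the point prediction.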


Spatial Microsimulation

• A technique aiming at building large scale data sets
• Modelling at the microscale
• A means of modelling real life events by simulating the characteristics and actions of the individual units that make up the system where the events occur


What is microsimulation?

PERSON  AHID     PID       AAGE12  SEX  AJBSTAT  …  AHLLT  AQFVOC  ATENURE  AJLSEG  …
1       1000209  10002251  91      2    4        …  1      1       6        9       …
2       1000381  10004491  28      1    3        …  2      0       7        -8      …
3       1000381  10004521  26      1    3        …  2      0       7        -8      …
4       1000667  10007857  58      2    2        …  2      1       7        -8      …
5       1001221  10014578  54      2    1        …  2      0       2        -8      …
6       1001221  10014608  57      1    2        …  2      1       2        -8      …
7       1001418  10016813  36      1    1        …  2      1       3        -8      …
8       1001418  10016848  32      2    -7       …  2      -7      3        -7      …
9       1001418  10016872  10      1    -8       …  -8     -8      3        -8      …
10      1001507  10017933  49      2    1        …  2      0       2        -8      …
11      1001507  10017968  46      1    2        …  2      0       2        -8      …
12      1001507  10017992  12      2    -8       …  -8     -8      2        -8      …


Static spatial microsimulation

• Reweighting probabilistic approaches, which typically reweight an existing national microdata set to fit a geographical area description on the basis of random sampling and optimisation techniques

• Reweighting deterministic approaches, which reweight a non-geographical population microdata set to fit small area descriptions, but without the use of random sampling procedures

• Synthetic probabilistic reconstruction models, which involve the use of random sampling


Static spatial microsimulation

PERSON  AHID     PID       AAGE12  SEX  AJBSTAT  …  AHLLT  AQFVOC  ATENURE  AJLSEG  …
1       1000209  10002251  91      2    4        …  1      1       6        9       …
2       1000381  10004491  28      1    3        …  2      0       7        -8      …
3       1000381  10004521  26      1    3        …  2      0       7        -8      …
4       1000667  10007857  58      2    2        …  2      1       7        -8      …
5       1001221  10014578  54      2    1        …  2      0       2        -8      …
6       1001221  10014608  57      1    2        …  2      1       2        -8      …
7       1001418  10016813  36      1    1        …  2      1       3        -8      …
8       1001418  10016848  32      2    -7       …  2      -7      3        -7      …
9       1001418  10016872  10      1    -8       …  -8     -8      3        -8      …
10      1001507  10017933  49      2    1        …  2      0       2        -8      …
11      1001507  10017968  46      1    2        …  2      0       2        -8      …
12      1001507  10017992  12      2    -8       …  -8     -8      2        -8      …


Static spatial microsimulation

Area | Small area table 1 (household type) | Small area table 2 (number of cars) | Small area table 3 (tenure status)
Area 1 | 60 "married couple households" | 10 no car | 60 owner occupier
Area 1 | 20 "Single-person households" | 80 1 car | 20 Local Authority or Housing association
Area 1 | 20 "Other" | 10 2+ cars | 20 Rented privately
Area 2 | 40 "married couple households" | 40 no car | 60 owner occupier
Area 2 | 20 "Single-person households" | 40 1 car | 20 Local Authority or Housing association
Area 2 | 40 "Other" | 20 2+ cars | 20 Rented privately


Tenure and car ownership example

Household car ownership characteristics: 1 car, 2+ cars, No car
Household tenure characteristics: Owner-occupier, LA/HA rented, Other

 | 1 car | 2+ cars | No car | Owner-occupier | LA/HA rented | Other
Simulation | 27 | 24 | 49 | 39 | 17 | 44
Census | 50 | 20 | 30 | 60 | 10 | 30
Absolute error | 23 | 4 | 19 | 21 | 7 | 14
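The absolute error row is simply the cell-by-cell difference between simulated and census counts. A small Python check using the figures from the table is given below; summing the cell errors into a single "total absolute error" is a common way of summarising fit and is added here as an illustration rather than something stated on the slide.

```python
categories = ["1 car", "2+ cars", "No car", "Owner-occupier", "LA/HA rented", "Other"]
simulation = [27, 24, 49, 39, 17, 44]
census = [50, 20, 30, 60, 10, 30]

# Absolute error per category: |simulated count - census count|.
absolute_error = [abs(s - c) for s, c in zip(simulation, census)]

# One-number summary of fit across all constraint cells (illustrative addition).
total_absolute_error = sum(absolute_error)

for name, err in zip(categories, absolute_error):
    print(f"{name}: {err}")
print("Total absolute error:", total_absolute_error)
```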


Combinatorial optimisation: simulated annealing

• Origins in thermodynamics

• Metropolis et al. (1953) suggested an algorithm for the efficient simulation of the evolution of a solid material to thermal equilibrium

• Annealing is a physical process in which a solid material is first melted in a heat bath and then it is cooled down slowly until it crystallises

• First used in a spatial microsimulation context by Williamson, P., Birkin, M., Rees, P. (1998), The estimation of population microdata by using data from small area statistics and samples of anonymised records, Environment and Planning A, 30, 785-816
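A minimal sketch of how simulated annealing can be used to pick a combination of survey households that fits small area constraints is shown below. It is not the implementation of Williamson et al.: the survey sample, the constraint totals, the cooling schedule and all function names are hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical survey households: (car ownership, tenure).
sample = [
    ("no car", "owner"), ("1 car", "owner"), ("2+ cars", "owner"),
    ("no car", "rented"), ("1 car", "rented"), ("2+ cars", "rented"),
]
# Hypothetical census constraints for one small area of 10 households.
targets = {"no car": 3, "1 car": 5, "2+ cars": 2, "owner": 6, "rented": 4}


def total_absolute_error(selection):
    """Sum of |selected count - census count| over all constraint categories."""
    counts = dict.fromkeys(targets, 0)
    for idx in selection:
        for category in sample[idx]:
            counts[category] += 1
    return sum(abs(counts[k] - targets[k]) for k in targets)


def anneal(n_households=10, steps=2000, start_temp=5.0, cooling=0.995, seed=1):
    rng = random.Random(seed)
    current = [rng.randrange(len(sample)) for _ in range(n_households)]
    current_err = total_absolute_error(current)
    initial_err = current_err
    best, best_err = list(current), current_err
    temp = start_temp
    for _ in range(steps):
        if best_err == 0:
            break
        # Propose swapping one selected household for a random survey household.
        candidate = list(current)
        candidate[rng.randrange(n_households)] = rng.randrange(len(sample))
        cand_err = total_absolute_error(candidate)
        delta = cand_err - current_err
        # Metropolis rule: always accept improvements; accept a worse fit with
        # a probability that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_err = candidate, cand_err
            if current_err < best_err:
                best, best_err = list(current), current_err
        temp *= cooling  # slow cooling
    return initial_err, best_err, best


initial_error, final_error, selection = anneal()
print("error of random start:", initial_error, "-> best error found:", final_error)
```

Occasionally accepting a worse combination early on is what distinguishes annealing from plain hill-climbing, which can get stuck in a poor local fit.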


Other methodologies

• Hill-climbing, genetic algorithms
• Deterministic reweighting approaches
• Probabilistic synthetic reconstruction techniques (IPF-based approaches)


Deterministic Reweighting the British Household Panel Survey (BHPS) - a simple example (1)

A hypothetical sample of individuals (list format):

Individual | sex | age-group | weight
1st | male | over-50 | 1
2nd | male | over-50 | 1
3rd | male | under-50 | 1
4th | female | over-50 | 1
5th | female | under-50 | 1

In tabular format:

age/sex | male | female
under-50 | 1 | 1
over-50 | 2 | 1

Hypothetical Census data for a small area:

age/sex | male | female
under-50 | 3 | 5
over-50 | 3 | 1


Reweighting the BHPS - a simple example (2)

Calculating a new weight, so that the sample will fit into the Census table

Sample in tabular format:

age/sex | male | female
under-50 | 1 | 1
over-50 | 2 | 1

Hypothetical Census data for a small area:

age/sex | male | female
under-50 | 3 | 5
over-50 | 3 | 1

Individual | sex | age-group | weight | New weight
1st | male | over-50 | 1 | 1 x 3/2 = 1.5
2nd | male | over-50 | 1 | 1 x 3/2 = 1.5
3rd | male | under-50 | 1 | 1 x 3/1 = 3
4th | female | over-50 | 1 | 1 x 1/1 = 1
5th | female | under-50 | 1 | 1 x 5/1 = 5
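The new weights in this example follow one rule: new weight = old weight x (census count for the individual's cell) / (weighted sample count for that cell). A short Python sketch that reproduces the figures is given below; the variable names are chosen here for illustration.

```python
from collections import Counter

# The hypothetical sample from the example: (sex, age-group, weight).
sample = [
    ("male", "over-50", 1.0),
    ("male", "over-50", 1.0),
    ("male", "under-50", 1.0),
    ("female", "over-50", 1.0),
    ("female", "under-50", 1.0),
]

# The hypothetical census table for the small area.
census = {
    ("male", "under-50"): 3, ("female", "under-50"): 5,
    ("male", "over-50"): 3, ("female", "over-50"): 1,
}

# Weighted sample count in each age/sex cell (the sample "in tabular format").
sample_totals = Counter()
for sex, age, weight in sample:
    sample_totals[(sex, age)] += weight

# New weight = old weight x census cell count / sample cell count.
new_weights = [
    weight * census[(sex, age)] / sample_totals[(sex, age)]
    for sex, age, weight in sample
]
print(new_weights)
```

After reweighting, the weighted sample reproduces every cell of the census table, since the weights within each cell sum to the census count.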


Probabilistic synthetic reconstruction

After Birkin, M., Clarke, M. (1988), SYNTHESIS – a synthetic spatial information system for urban and regional analysis: methods and examples, Environment and Planning A, 20, 1645-1671


Probabilistic synthetic reconstruction techniques

Steps | 1st | 2nd | … | Last
Age, sex and marital status and location (DED level) (given) | Age: 25, Sex: Male, Marital Status: Single, GeoCode: Leitrim Co., DED 001 Ballinamore | Age: 76, Sex: Female, Marital Status: married, GeoCode: Leitrim Co., DED 002 Cloverhill | … | Age: 30, Sex: Male, Marital Status: married, GeoCode: Leitrim Co., DED 078 Rowan
Probability of hh of given age, sex and location (ED level) being at work | 0.7 | 0.2 | … | 0.7
Random number | 0.55 | 0.5 | … | 0.45
Economic activity assigned to hh on the basis of random sampling | At work | Other (e.g. Retired) | … | At work
Probability (conditional upon the above attributes) of hh being an employee in the Agriculture industry | 0.6 | | … | 0.2
Random number | 0.4 | | … | 0.6
Economic activity category assigned on the basis of random sampling | Employee in Agriculture | | … | Not an employee in Agriculture
… | … | … | … | …

SMILE model, after Ballas, D., Clarke, G. P., Wiemers, E. (2006) Spatial microsimulation for rural policy analysis in Ireland: The implications of CAP reforms for the national spatial strategy, Journal of Rural Studies, vol. 22, pp. 367-378 (doi:10.1016/j.jrurstud.2006.01.002)
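The sampling step in the table can be sketched in Python as below. The rule that an attribute is assigned when the random number is lower than the conditional probability is an assumption; it is consistent with the rows shown in the table, and the probabilities and random numbers in the sketch are the ones listed there. The function name `assign` is hypothetical.

```python
def assign(probability, random_number):
    """Assign the attribute when the random draw falls below the probability.

    Assumed decision rule, consistent with the table rows above.
    """
    return random_number < probability


# Step 1: economic activity for the 1st, 2nd and last individuals in the table.
at_work_probability = [0.7, 0.2, 0.7]
at_work_draw = [0.55, 0.5, 0.45]
activity = [
    "At work" if assign(p, r) else "Other"
    for p, r in zip(at_work_probability, at_work_draw)
]

# Step 2: for those at work (1st and last), whether they are employees in
# agriculture, conditional upon the attributes already assigned.
agriculture_probability = [0.6, 0.2]
agriculture_draw = [0.4, 0.6]
in_agriculture = [
    assign(p, r) for p, r in zip(agriculture_probability, agriculture_draw)
]

print(activity)
print(in_agriculture)
```

In a full model the draws would come from a random number generator (for example `random.Random(seed).random()`), and each new attribute would be sampled conditional upon those assigned in earlier steps.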


Dynamic spatial microsimulation

• Probabilistic dynamic models, which use event probabilities to project each individual in the simulated database into the future (e.g. using event conditional probabilities).

• Implicitly dynamic models, which use independent small area projections and then apply the static simulation methodologies to create small area microdata statically


Probabilistic dynamic models

Steps | 1st | 2nd | … | Last
Age, sex and marital status and location (DED level) (given) | Age: 25, Sex: Male, Marital Status: Single, GeoCode: Leitrim Co., DED 001 Ballinamore | Age: 76, Sex: Female, Marital Status: married, GeoCode: Leitrim Co., DED 002 Cloverhill | … | Age: 30, Sex: Male, Marital Status: married, GeoCode: Leitrim Co., DED 078 Rowan
Probability (conditional upon age, sex, location) of hh to migrate | 0.30 | 0.05 | … | 0.26
Random number | 0.2 | 0.4 | … | 0.4
Migration status assigned on the basis of random sampling | Migrant | Non-migrant | … | Non-migrant
Probability (conditional upon age, sex, location) of hh to survive | 0.9 | 0.5 | … | 0.8
Random number | 0.5 | 0.4 | … | 0.4
Survival status | Survived | Deceased | … | Survived

After Ballas, D., Clarke, G. P., Wiemers, E. (2005) Building a dynamic spatial microsimulation model for Ireland, Population, Space and Place, 11, 157–172 (http://dx.doi.org/10.1002/psp.359)
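A probabilistic dynamic model applies event probabilities to every simulated individual, year after year. The sketch below is a hedged illustration of that loop and not the model of Ballas et al.: the population, the age-dependent probability functions and the rule "the event occurs when the random draw is below its probability" are all assumptions made here.

```python
import random

# Hypothetical starting population (identifiers and ages are illustrative only).
population = [
    {"pid": 1, "age": 25, "migrant": False},
    {"pid": 2, "age": 76, "migrant": False},
    {"pid": 3, "age": 30, "migrant": False},
]


def survival_prob(person):
    # Hypothetical age-dependent annual survival probability.
    return 0.99 if person["age"] < 65 else 0.90


def migration_prob(person):
    # Hypothetical age-dependent annual migration probability.
    return 0.30 if person["age"] < 35 else 0.05


def project_one_year(people, rng):
    """Apply survival and migration events to each individual for one year."""
    survivors = []
    for person in people:
        if rng.random() >= survival_prob(person):
            continue  # the individual does not survive this year
        survivors.append({
            "pid": person["pid"],
            "age": person["age"] + 1,  # survivors age by one year
            "migrant": rng.random() < migration_prob(person),
        })
    return survivors


rng = random.Random(42)  # fixed seed so the run is repeatable
projected = population
for _ in range(5):
    projected = project_one_year(projected, rng)

print(projected)
```

Because each run depends on the random draws, such models are usually run many times and the results summarised across runs.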


SimBritain: combining Census data with the BHPS

Census of UK population:
• 100% coverage
• fine geographical detail
• Small area data available only in tabular format with limited variables to preserve confidentiality
• cross-sectional

British Household Panel Survey:
• sample size: more than 5,000 households
• Annual surveys (waves) since 1991
• Coarse geography
• Household attrition

Ballas, D. , Clarke, G.P., Dorling, D., Eyre, H. and Rossiter, D., Thomas, B (2005) SimBritain: a spatial microsimulation approach to population dynamics, Population, Space and Place 11, 13–34 (http://dx.doi.org/10.1002/psp.351)


SimBritain modelling approach

1. Establish a set of constraints
2. Choose a spatially defined source population
3. Repeatedly sample from source
4. Adjust weightings to match first constraint
5. Adjust weightings to match second constraint
6. …
7. Adjust weightings to match final constraint
8. Go back to step 4 and repeat loop until results converge
9. Save weightings which define membership of SimBritain
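Steps 4 to 8 describe a loop in which weights are adjusted to each constraint in turn until they stop changing. A minimal sketch of such a loop, in the style of iterative proportional fitting, is given below. It is an illustration under stated assumptions and not the SimBritain code: the source sample, the two constraint tables, the tolerance and the function names are hypothetical, and the sampling in step 3 is not shown.

```python
# Hypothetical source sample: each household has one category per constraint.
sample = [
    {"cars": "no car", "tenure": "owner"},
    {"cars": "1 car", "tenure": "owner"},
    {"cars": "1 car", "tenure": "owner"},
    {"cars": "2+ cars", "tenure": "owner"},
    {"cars": "no car", "tenure": "rented"},
    {"cars": "1 car", "tenure": "rented"},
]

# Step 1: hypothetical constraint tables for one small area (100 households).
constraints = {
    "cars": {"no car": 30, "1 car": 50, "2+ cars": 20},
    "tenure": {"owner": 60, "rented": 40},
}


def weighted_totals(weights, variable):
    """Weighted number of households in each category of one variable."""
    totals = {}
    for w, household in zip(weights, sample):
        key = household[variable]
        totals[key] = totals.get(key, 0.0) + w
    return totals


def reweight(max_iterations=100, tolerance=1e-6):
    weights = [1.0] * len(sample)
    for _ in range(max_iterations):
        # Steps 4-7: scale the weights to match each constraint in turn.
        for variable, targets in constraints.items():
            totals = weighted_totals(weights, variable)
            weights = [
                w * targets[hh[variable]] / totals[hh[variable]]
                for w, hh in zip(weights, sample)
            ]
        # Step 8: stop once every constraint is matched closely enough.
        worst = max(
            abs(weighted_totals(weights, variable)[category] - target)
            for variable, targets in constraints.items()
            for category, target in targets.items()
        )
        if worst < tolerance:
            break
    return weights


# Step 9: the final weights define how much each household counts in the area.
final_weights = reweight()
print([round(w, 2) for w in final_weights])
```

Matching one constraint generally disturbs the fit to the others, which is why the loop has to be repeated rather than run once.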


CONSTRAINT TABLES

TABLE | CATEGORY | CATEGORY | CATEGORY
Car Ownership | no cars | 1 car | 2+ cars
Social Class | affluent | middle income | less affluent
Demography | 1 child | 2+ children | no children
Employment | active | retired | inactive
Households | married couple | lone parent | other
Tenure | owner occupied | council tenant | other


How do we know it makes sense?

Average age: se = 1.0, r squared = .760, beta = 1.22

[Plot: cage (vertical axis, 30 to 50) against sage (horizontal axis, 30 to 50)]


How do we know it makes sense?

Long-term illness: se = 1.7, r squared = .767, beta = 1.19

[Plot: cill (vertical axis, 0.00 to 0.30) against sill (horizontal axis, 0.00 to 0.30)]


The potential of microsimulation for policy analysis

Classifying households

• Very poor: all households with income below 50% of the median York income

• Poor: all households with income more than 50% of the median but lower than 75% of the median

• Below-average: all households living on incomes higher than 75% of the median but less than or equal to the median

• Above-average: all households living on incomes higher than the median and lower than 125% of the median

• Affluent: all households living on incomes above 125% of the median

Ballas, D., Clarke, G. P., Dorling, D., Rossiter, D. (2007), Using SimBritain to Model the Geographical Impact of National Government Policies, Geographical Analysis 39, pp. 44-77 (doi:10.1111/j.1538-4632.2006.00695.x)
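The five income groups can be expressed as a simple classification function. The sketch below uses hypothetical household incomes. The group definitions do not say where households exactly at 50%, 75% or 125% of the median belong, so the choice made here (each boundary value falls into the higher group) is an assumption.

```python
from statistics import median


def classify(income, median_income):
    """Assign an income group relative to the area median.

    Assumption: incomes exactly at 50%, 75% or 125% of the median are placed
    in the higher of the two adjacent groups.
    """
    ratio = income / median_income
    if ratio < 0.50:
        return "Very poor"
    if ratio < 0.75:
        return "Poor"
    if ratio <= 1.00:
        return "Below-average"
    if ratio < 1.25:
        return "Above-average"
    return "Affluent"


# Hypothetical simulated household incomes for one area.
incomes = [100, 180, 250, 300, 340, 400, 520]
area_median = median(incomes)
groups = [classify(income, area_median) for income in incomes]

for income, group in zip(incomes, groups):
    print(income, group)
```

With simulated microdata for every small area, the same function can be applied area by area to map how the share of each group varies geographically.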


Living standards of very poor households

Very poor households | 1991 | 2001 | 2011 | 2021
Households (% of all households in York) | 17.2% | 17.3% | 17.8% | 21.3%
Individuals (% of all individuals in York) | 14.7% | 13.3% | 13.7% | 20.5%
Children (% of all children in York) | 21.8% | 17.7% | 18.6% | 38.5%
LLTI (as a % of all individuals in group) | 9.0% | 7.3% | 5.4% | 7.9%
Elderly (over 64 years as a % of all individuals in group) | 30.1% | 32.0% | 33.3% | 44.2%
Individuals in group with father's occupation: unskilled (%) | 10.5% | 6.8% | 3.3% | 15.1%
Reporting anxiety and depression (% of all individuals in group) | 10.6% | 10.3% | 7.4% | 3.1%
Reporting health problems with alcohol or drugs (% of all individuals in group) | 0.9% | 1.1% | 0.3% | 0.0%
Individuals who reported that they have no one to talk to | 19.9% | 23.8% | 31.1% | 31.5%


Using SimBritain to Model the Geographical Impact of National Government Policies

Working Families Tax Credits | Amount in 2002-3 | Adjusted for 1991
Couple or lone parent | £60.00 | £42.39
Child aged under 16 | £26.35 | £18.62
Child aged 16-18 | £27.20 | £19.22
30 hours credit | £11.65 | £8.23
Disabled child credit | £35.50 | £25.08
Enhanced disability credit: couple or lone parent | £16.25 | £11.48
Enhanced disability credit: child | £46.75 | £33.03
Childcare credit: one child | 70% of up to £135 | 70% of up to £95.39
Childcare credit: two or more children | 70% of up to £200 | 70% of up to £141.31
Additional partners in a polygamous marriage | £22.70 | £16.04


The estimated spatial impact in York


The estimated spatial impact in Wales


Further reading and resources (including software)

• Combinatorial Optimisation software (including dummy dataset and associated documentation) by Paul Williamson (University of Liverpool): http://pcwww.liv.ac.uk/~william/microdata/CO%20070615/CO_software.html

• Iterative Proportional Fitting and integerisation R code and data: Lovelace, R, Ballas D (2013), ‘Truncate, replicate, sample’: A method for creating integer weights for spatial microsimulation, Computers, Environment and Urban Systems, http://www.sciencedirect.com/science/article/pii/S0198971513000240 (open access article including publicly available R code and data)

• A recent review of the state of the art and research challenges by Adam Whitworth (University of Sheffield): Whitworth, A et al. (2013) Evaluations and improvements in small area estimation methodologies. Discussion Paper. NCRM (http://www.ncrm.ac.uk/research/NMI/2012/smallarea.php and http://eprints.ncrm.ac.uk/3210/). This includes a Spatial Microsimulation R-Library by Dimitris Kavroudakis (University of the Aegean) including R code available from: http://www.shef.ac.uk/polopoly_fs/1.268326!/file/sms_Manual_v9.zip

• An introductory text to spatial microsimulation: Ballas, D., Rossiter, D., Thomas, B., Clarke, G., Dorling, D. (2005), Geography matters: simulating the local impacts of national social policies, Joseph Rowntree Foundation http://www.jrf.org.uk/sites/files/jrf/1859352669.pdf

• 2-day NCRM/TALISMAN course: An Introduction to Spatial Microsimulation Using R, 19-20 September 2014, University of Cambridge http://store.leeds.ac.uk/browse/extra_info.asp?compid=1&modid=2&deptid=9&catid=47&prodid=449