Time Series in Official Stats: Statistical Thinking and Communication about Variation over Time

STOR 481: 14 Oct 2015

Emma Mawby & Sonya McGlone: Statistics New Zealand



Contents:
(Green: activities, on paper, to discuss)

1. Introduction
2. Two fascinating series, and reflections on them
3. What are TS, and what do you do with them?
   What do OS people do: filtering and seasonal adjustment
   Electronic card transactions
4. iNZight: smart new software
(Break)
5. Births per quarter, and the Poisson distribution
6. Assignment 5 Time Series questions
7. The challenges and opportunities in Official Stats
8. Summary: signal and noise

and, if we have time:
9. Big ideas in Time Series and Official Stats
10. Earnings and OSS issues
11. Term Test (Richard Arnold)


A 1 minute challenge

• Write down all the ways that you have ever accessed outputs produced by Statistics New Zealand.

• e.g. looked at a media release on http://www.stats.govt.nz/


Some tweets about Official Statistics


1: Introduction

Aims:
1. The world of ‘Official Stats’ time series (essential, exhilarating, accessible)
2. iNZight
3. Apply statistical thinking and communication skills to variation in time series
4. Access and enjoy Assignment 5 questions

[Figure: CO2 at Baring Head (Wellington); x-axis 1973 to 2003; y-axis 320 to 380. Model fitted by linear regression: y = 1.4749x - 2584.7, R² = 0.9956]
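As a quick check on the fitted line quoted on this slide, the sketch below evaluates y = 1.4749x - 2584.7 at the first and last labelled years. Only the slope and intercept come from the slide; the function name `fitted_co2` is made up for illustration.

```python
SLOPE = 1.4749        # change in the fitted value per year, from the slide
INTERCEPT = -2584.7   # intercept of the fitted line, from the slide


def fitted_co2(year):
    """Value of the fitted straight line at a given calendar year."""
    return SLOPE * year + INTERCEPT


for year in (1973, 2003):
    print(year, round(fitted_co2(year), 1))
```

Both fitted values fall inside the 320 to 380 range of the plot's axis, and the slope says the fitted level rises by roughly 1.47 units per year. A straight line describes the long-run rise only; it says nothing about any within-year pattern.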


Learning objectives for STOR 481:

1. key aspects of Official Statistics

2. legal and ethical constraints on organisations producing Official Statistics

3. principal methods for data collection, analysis and interpretation of health, social and economic data, including spatial data

4. methods for presenting and preparing commentaries on Official Statistics


Resources (1)

• Statistics New Zealand homepage: http://www.stats.govt.nz/

• Stories about data: eg: Labour Market Statistics http://www.stats.govt.nz/browse_for_stats/income-and-work/employment_and_unemployment/LabourMarketStatistics_HOTPJun15qtr.aspx

• Data:
  NZ.Stat: http://nzdotstat.stats.govt.nz/wbos/Index.aspx
  Infoshare (to eventually be replaced by NZ.Stat): http://www.stats.govt.nz/infoshare/

• Demonstration: http://www2.stats.govt.nz/domino/external/web/aboutsnz.nsf/htmldocs/Seasonal+decomposition+demonstration

• Background on TS and Seasonal Adjustment: http://www.stats.govt.nz/surveys_and_methods/methods/data-analysis/seasonal-adjustment.aspx

• Software: iNZight and its time series module: http://www.stat.auckland.ac.nz/~wild/iNZight/



2. What is Statistics all about?

One answer is:

_ _ _ _ _ t _ _ n


What is Statistics all about?

One answer is:

Variation

Which occurs:
• in estimates from samples
• across time (that’s us today)
• across a population or sample


Where does ‘variation’ arise in OS? What does it look like?

Nov 2011

Cross-sectional data: Income (NZIS)

Series data: Guest nights: backpackers

Inference: Income: 100 means: SuperSURF

[Figure: time series plot; x-axis 1996 M07 to 2011 M07; y-axis 0 to 600,000; lines: Actual, Seas Adj, Trend]


Official Stats and Time Series:

[Diagram labels: Official Stats; Stats; Admin data; Time Series stats]


Graph of my happiness score for Tuesday 13th October 2015

[Figure: score versus time; x-axis: time, 0 to 25; y-axis: score, 2 to 10]


Activity 1: Two fascinating series

Unemployment rates, quarterly, Male and Female, 1986 Q1 to 2013 Q2

[Figure: x-axis 1986Q1 to 2013Q1; y-axis 0.0 to 14.0; series: UnempRateMale, UnempRateFemale]

1.1 What has happened over these 27 years?
1.2 How does this data get collected? Does it have sampling error?
1.3 Why are there high values in some of the Q1s (first quarters)?
1.4 What are these series going to do next?
1.5 What are these ‘Quarter’ things that official stats folk are so keen on?


Yes, the Unemployment Rate does have SE (published from 1990 Q2; found via resampling: the jackknife).

[Figure: Male and Female Unemployment Rates ± Sampling Errors; x-axis 1986Q1 to 2013Q1; y-axis 0 to 14; series: M-SE, M+SE, F-SE, F+SE]

http://asq.org/quality-progress/2008/07/statistics-roundtable/statistics-roundtable-the-trusty-jackknife.html
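To make the jackknife idea concrete, here is a minimal sketch of the textbook delete-one jackknife applied to a rate estimated from a simple random sample. It is an illustration only: the slides do not describe the resampling scheme actually used for the published sampling errors, and the sample below (6 unemployed out of 100 people) is made up.

```python
import math


def jackknife_se(values, estimator):
    """Delete-one jackknife standard error of estimator(values)."""
    n = len(values)
    # Recompute the estimate n times, leaving out one observation each time.
    leave_one_out = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((est - centre) ** 2 for est in leave_one_out)
    return math.sqrt(variance)


def rate(indicators):
    """Proportion of 1s in a list of 0/1 indicators."""
    return sum(indicators) / len(indicators)


# Hypothetical sample: 1 = unemployed, 0 = not unemployed.
sample = [1] * 6 + [0] * 94
se = jackknife_se(sample, rate)
print("rate:", rate(sample), "jackknife SE:", round(se, 4))
```

For a simple mean or proportion the delete-one jackknife reproduces the usual standard error, which makes this case a handy check. The method earns its keep when the estimator is more complicated than a mean.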


3: What are time series?...

A time series is a statistical record of a particular social or economic activity, with the data usually measured at regular intervals over a period of time.


… and what do we do with them?

Time series are analysed to:
• understand the past
• predict the future

• A time series analysis quantifies the main features in the data (the “signal”) and the random variation (the “noise”)


So what are TS? EG: http://www.stats.govt.nz/infoshare/SelectVariables.aspx?pxID=d294d46d-0a80-4628-8095-b1273a8186c5

Police recorded crime

1878 - 1957 - 2010: Use with caution

Year   Recorded offences   Resolved offences
1957         81,998              49,473
1958         85,153              54,592
1959         88,071              52,994
1960        102,792              66,857
1961         96,384              56,170
1962        115,921              62,014
1963        113,942              66,992
1964        118,422              71,914
1965        132,311              73,294
1966        135,374              77,465
1967        139,737              79,409
1968        149,103              85,025
1969        153,914              88,773
1970        165,859              94,785
1971        177,924              91,301
1972        189,283              96,625
1973        192,079              98,778
1974        206,115             101,593
1975        223,362             105,389
1976        232,376             109,937
1977        243,619             104,982
1978        245,640              88,110
1979        257,922              99,121
1980        286,789             107,235
1981        294,015             108,557
1982        309,843             114,857
1983        336,155             122,976
1984        347,453             125,426
1985        370,844             120,714
1986        376,558             124,461
1987        368,712             125,527
1988        378,122             129,822
1989        384,928             128,330
1990        409,747             124,984
1991        446,417             133,441
1992        464,596             141,301
1993        462,536             162,854
1994        447,525             171,453
1995        465,052             170,649
1996        477,596             175,751
1997        473,547             176,299
1998        461,677             175,176
1999        438,074             170,299
2000        427,230             177,034
2001        426,526             179,007
2002        440,129             184,465
2003        442,489             192,540
2004        406,363             181,344
2005        407,496             176,362
2006        424,137             185,227
2007        426,384             194,768
2008        431,383             201,419
2009        451,405             215,618
2010        426,345             202,545

So what do we do with it now?


Crime data: Obvious things to do:

Graph the series:

[Figure: Recorded offences and Resolved offences; x-axis 1957 to 2007; y-axis 0 to 500,000]

Divide, to get Percent Resolved:

[Figure: Percent Resolved; x-axis 1957 to 2007; y-axis 0% to 100%]
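The "divide" step can be sketched in a few lines. The three rows below are taken from the police recorded crime figures earlier in the deck (1957, 1985 and 2010); the function name `percent_resolved` is made up for illustration.

```python
# (year, recorded offences, resolved offences), taken from the crime data.
rows = [
    (1957, 81_998, 49_473),
    (1985, 370_844, 120_714),
    (2010, 426_345, 202_545),
]


def percent_resolved(recorded, resolved):
    """Resolved offences as a percentage of recorded offences."""
    return 100.0 * resolved / recorded


for year, recorded, resolved in rows:
    print(year, f"{percent_resolved(recorded, resolved):.1f}%")
```

Turning two count series into one ratio series removes the shared growth in volume, so the derived series can tell a different story from either original.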


And divide by Population to get Rates (offences per person) (from 1991)


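Components of a time series

The actual values of a time series are made up of the following components:
• Trend
• Long term cycle
• Seasonal component
• Irregular component

We assume that some relationship exists between them. It is either multiplicative: A = C x S x I, or additive: A = C + S + I, where A is the actual value, C the trend cycle (trend and long term cycle together), S the seasonal component and I the irregular component.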
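The additive relationship A = C + S + I can be illustrated with a classical moving-average decomposition. This is a minimal sketch on a made-up quarterly series, not the procedure an official statistics agency runs in production; the function names and the fake data are assumptions for illustration, and the series needs at least 8 quarters.

```python
def centred_moving_average(series):
    """2x4 centred moving average for quarterly data (None at the ends)."""
    n = len(series)
    trend = [None] * n
    for t in range(2, n - 2):
        total = (0.5 * series[t - 2] + series[t - 1] + series[t]
                 + series[t + 1] + 0.5 * series[t + 2])
        trend[t] = total / 4.0
    return trend


def additive_decomposition(series):
    """Split A into C (trend cycle), S (seasonal), I (irregular): A = C + S + I."""
    trend = centred_moving_average(series)
    sums, counts = [0.0] * 4, [0] * 4
    for t, c in enumerate(trend):
        if c is not None:
            sums[t % 4] += series[t] - c      # detrended value for this quarter
            counts[t % 4] += 1
    raw = [s / k for s, k in zip(sums, counts)]
    centre = sum(raw) / 4.0
    seasonal = [r - centre for r in raw]      # one value per quarter position
    irregular = [None if c is None else series[t] - c - seasonal[t % 4]
                 for t, c in enumerate(trend)]
    adjusted = [a - seasonal[t % 4] for t, a in enumerate(series)]
    return trend, seasonal, irregular, adjusted


# Made-up quarterly series: straight-line trend plus a repeating seasonal pattern.
pattern = [3.0, -1.0, -4.0, 2.0]
fake = [100.0 + 2.0 * t + pattern[t % 4] for t in range(12)]
trend, seasonal, irregular, adjusted = additive_decomposition(fake)
print([round(s, 6) for s in seasonal])
```

Because the fake series has no noise, the sketch recovers the seasonal pattern and leaves an irregular component of essentially zero. The `adjusted` list is the actual series with the seasonal component subtracted, which is the additive version of a seasonally adjusted series. For the multiplicative form, the subtractions become divisions.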


Filtering, seasonal adjustment and decomposition

Statistics New Zealand time series tend to be either:
• The “actual” series
• Seasonally adjusted series – with regular seasonal component removed
• Trend series – just the trend cycle component


Activity 2: A monthly series, filtered and seasonally adjusted (by Stats NZ):

[Figure: Time series plot for Debit; x-axis: Time, 2004 to 2016; y-axis: $million, 1500 to 4000]

2.1 Describe the features of the variation in debit card transactions.
2.2 Why does Stats NZ publish the Seasonally Adjusted series?
2.3 Imagine that you own a business that receives mainly debit card transactions, and get Stats NZ’s latest info release. Of the three series (Actual, Seasonally adjusted, Trend), which might you use and why?
2.4 What do you expect to happen next in the series?


4: iNZight: an intro: 8 slides

http://www.stat.auckland.ac.nz/~wild/iNZight/

Get some data


TS, and other goodies, here

TS here


Our unemployment TS

Use

Ignore


Results:

[Figure: Time series plot for Total.Both.Sexes; x-axis: Time, 1985 to 2015; y-axis 4 to 10]


Decomposition:


Seasonal features:


For 2 or more series:

Use Multi-Plot


Results: multiplicative


5: Activity 3: Births per quarter

3.1 What do you think the two series (male births, female births) look like? What features might they have? Sketch your guesses in.
3.2 Can you think of a sensible way to model this? Which distribution would be appropriate? Assumptions?


Births per quarter actual data


Births per quarter and the Poisson distribution

If we assume the number of male or female births per quarter is Poisson with lambda = 7,077, then the two births series would look like this:

[Figure: Births per Quarter, Poisson, 1976 Q1 to 2013 Q2; x-axis 1976Q1 to 2011Q1; y-axis 5000 to 9000; series: BirthsPoisson]
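A simulation along the lines of this slide can be sketched with the standard library alone. The sketch assumes a constant rate of 7,077 per quarter, as on the slide, and draws each quarter by counting arrivals in a Poisson process; the seed and the names `poisson_draw`, `LAMBDA` and `N_QUARTERS` are arbitrary choices for illustration.

```python
import math
import random


def poisson_draw(lam, rng):
    """Count arrivals in one unit of time when gaps are exponential with rate lam."""
    count, elapsed = 0, rng.expovariate(lam)
    while elapsed <= 1.0:
        count += 1
        elapsed += rng.expovariate(lam)
    return count


LAMBDA = 7077       # mean births per quarter, from the slide
N_QUARTERS = 150    # 1976 Q1 to 2013 Q2 inclusive

rng = random.Random(481)   # arbitrary seed, so the run is repeatable
simulated = [poisson_draw(LAMBDA, rng) for _ in range(N_QUARTERS)]

print("standard deviation implied by the model:", round(math.sqrt(LAMBDA), 1))
print("smallest and largest simulated quarter:", min(simulated), max(simulated))
```

Under this model the standard deviation is the square root of lambda, about 84 births, so nearly every simulated quarter sits within a few hundred of 7,077. Trend or seasonal movement in a real births series that is larger than this cannot be explained by a constant-rate Poisson model.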


6. Four slides: STOR 481: 2015: Assignment 5: Time Series questions, shortened version:

Note: Assignment 5 will include questions from the Data Visualisation, Time Series and Macroeconomic Statistics lectures.

Please install iNZight: http://www.stat.auckland.ac.nz/~wild/iNZight/, and try its Time Series option. You’ll find this under the Advanced tab. In iNZight’s Data folder, you’ll find time series datasets for practice.

To use a time series dataset from Infoshare (from the Statistics NZ website) in iNZight, you need to simplify it so that it contains only simple headings and the columns of data, and then save it as a csv file.
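The simplification step described above can also be scripted. The sketch below is a hedged illustration, not a description of the real Infoshare export: it assumes the download is a csv whose data rows start with a period label such as 1996M07 or 1986Q1, and that everything else (title lines, footnotes) can be dropped. The example text, its numbers and the function name `simplify_export` are all made up.

```python
import csv
import io
import re

# Assumed period labels: a year, optionally followed by a month or quarter.
PERIOD = re.compile(r"^\d{4}(M\d{2}|Q[1-4])?$")


def simplify_export(raw_text, headings):
    """Keep only rows that start with a period label; write simple headings."""
    rows = [row for row in csv.reader(io.StringIO(raw_text))
            if row and PERIOD.match(row[0].strip())]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(headings)
    for row in rows:
        writer.writerow([cell.strip() for cell in row[:len(headings)]])
    return out.getvalue()


# Hypothetical export text: layout and numbers are invented for the example.
raw = '''"Example title line (made up)"
" ","Guest nights"
"1996M07","100"
"1996M08","120"
"Table information: footnotes would go here"
'''

print(simplify_export(raw, ["Date", "Guest_nights"]))
```

The result has one simple heading row followed by the columns of data, which is the shape the assignment instructions ask for. Check a real download by eye before relying on a filter like this one.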


STOR 481: 2015: Assignment 5 Time Series questions, shortened version:

3: Number of Guest nights from the Accommodation Survey

The Accommodation Survey consists of several series describing the number of guest nights spent in different types of accommodation in New Zealand. These series are found in the Industry sectors section of the Statistics New Zealand website: www.stats.govt.nz.

Statistics NZ Home > Browse for statistics > Industry sectors > Accommodation

Please read all sections of the “Accommodation Survey: August 2015” release. Also, please examine the second download, which contains tables and components of Accommodation Survey data for the last twelve months. Also, note the short Media Release.


STOR 481: 2015: Assignment 5: Time Series: shortened version:

3.1 (12 marks) Choose one accommodation type from Hotels, Motels or Backpackers. Describe the behaviour of the “Number of guest nights” for the period July 1996 to August 2015 for this accommodation type. You’ll need to discuss the usual components of time series and any other feature or features that the number of guest nights shows.

Now describe the behaviour of the “Number of guest nights” for the period July 1996 to August 2015 for “Holiday parks”.

Now describe the differences between the two series.

3.2 (2 marks) Why do you think the series “total excluding holiday parks” is published as well as the series “total”?

3.3 (4 marks) As an Official Statistics agency, Statistics NZ aims to convey information about very complex situations to very wide audiences. Discuss and give examples of the communication methods that Statistics NZ uses to tell the stories that come from Accommodation Survey Statistics.


End of Assignment 5 slides.


The end: enjoy the assignment!

http://xkcd.com/418/


AND IF WE HAVE TIME …

Supplementary slides


7: Challenges in TS for Official Stats

The monitoring of TS production:
  dealing with the unexpected
Outliers:
  detecting them
  finding causes for them
  dealing with them
  assessing their effect on seasonals
Level shifts:
  recognising them from noise or seasonal
Trading day and holiday effects
Timeliness:
  incoming data with a tail (eg tax)
  publish fast and revise a lot or publish slower and revise a little and …

[Figure: A fake quarterly series with rogue outliers; x-axis 1 to 20; y-axis 0 to 35]

Are they:
  the new trend
  the new seasonal
  just one-offs??
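As a rough illustration of the "detecting them" step, the sketch below flags quarters that sit far from the typical value for that quarter, using medians so that the outliers themselves do not distort the yardstick. This is a simple teaching device on made-up data with no trend, not the method used in official seasonal adjustment; the function name, the threshold of 5 and the fake series are all assumptions.

```python
from statistics import median


def flag_outliers(series, period=4, threshold=5.0):
    """Indices whose gap from the same-quarter median exceeds threshold x MAD."""
    seasonal_median = [median(series[q::period]) for q in range(period)]
    residuals = [x - seasonal_median[i % period] for i, x in enumerate(series)]
    mad = median(abs(r) for r in residuals)   # median absolute deviation
    if mad == 0:
        return []
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * mad]


# Fake quarterly series: repeating pattern, small wiggles, two planted outliers.
base = [10, 12, 15, 11]
wiggle = [0, 1, -1, 0, 1, 0, 0, -1, 1, 0, -1, 1, 0, 0, 1, -1, 0, 1, 0, -1]
fake = [base[i % 4] + wiggle[i] for i in range(20)]
fake[7] = 30
fake[14] = 32

print(flag_outliers(fake))
```

Flagging is only the first of the four tasks listed on the slide. Whether a flagged point is a one-off, the start of a level shift, or a change in the seasonal pattern usually cannot be decided until later quarters arrive.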


7: Challenges cont’d: hot issues:

Research:
  Use of ARIMA models to reduce revisions of seasonally adjusted estimates
  Error bounds on seasonally adjusted series

Implementation of new tools:
  New seasonal adjustment user interface
  X-13ARIMA-SEATS
  Sensitivity analysis tool

Collaboration with our Australian counterparts:
  Cross-centre training
  Regular meetings


7 … the Official Statisticians’ TS Dilemma:

Do we:
A: keep the variable definition the same forever, and watch it go out of date
B: update the definition and break the series
C: do something smart: what???

EG: ANZSIC 1996 -> ANZSIC 2006 (Aust and NZ Standard Industrial Classification)

EG: Employment-related series


8: Summary: signal and noise

Time Series in Official Stats:
• great opportunities to apply Statistical Thinking to Variation
• using: intuition-based concepts and powerful software tools
• great new data sources: admin and others
• for vitally important issues that are: social, environmental, economic, scientific.


9: Big ideas in TS and/or OSS

1. Data visualisation and time
2. Longitudinal collections
3. Admin data
4. Integrated data: IDI
5. Census linkage


9.1 Data visualisation and time

Nature gave us 3 spatial dimensions (and R agrees!)

Isn’t there a 4th dimension? (Hans Rosling agrees) http://www.gapminder.org/

That’s all about TS, and mostly international OS TS

A different DV using TS: www.christchurchquakemap.co.nz

A dynamic view of the Australian population etc etc etc etc:

http://www.abs.gov.au/websitedbs/d3310114.nsf/home/Population%20Pyramid%20-%20Australia
http://www.abs.gov.au/websitedbs/D3310114.nsf/home/Interact+with+our+data
http://betaworks.abs.gov.au/betaworks/betaworks.nsf/projects/dual_pyramid/frame.htm


9.3 Longitudinal collections 1:

There are 2 sorts of dataset:
1: cross-sectional
2: time-series (that’s us today)

Is there a third sort??

Yes! And most Official Stats collections are like that!


Longitudinal collections 2: EGs

SoFIE, with 8 waves (2001 … 2008): 10k people
  Survey of Family, Income and Employment
LISNZ, with 3 waves
  Longitudinal Immigration Survey NZ
Integrated Data Infrastructure (IDI)


Longitudinal collections 3: millions of TS!

[Figure: Fake quarterly earnings series from tax data, for 100 people]


9.4 Administrative Data:

Stats NZ intends to become an Admin Data First agency.

Egs of Admin Data sources:
• Tax
• Benefits
• Student Loans and Allowances
• Educational Outcomes
• Migration
• Electronic Card Transactions
• Retail barcode scanning
• Births, deaths
• and plenty more


9.5 The ultimate integrated system?: Integrated Data Infrastructure: IDI

Most sources are: longitudinal, administrative, full-coverage, other people’s.

[Diagram: data sources arranged around the Central Linking Concordance (CLC)]
• Longitudinal Business database (Person to business link)
• Education, secondary & tertiary: Ministry of Education
• Tax data: Inland Revenue
• Student loans & allowances: Inland Revenue & Ministry of Social Development
• Labour Force, Income Surveys
• Benefits: Ministry of Social Development
• Migration data: Department of Labour

Outputs:
• Relevant releases
• Dynamic datasets
• Cutting edge cubes
• Rich research


9.6 Census linkage

UK’s Office for National Statistics (ONS): in 10-yearly censuses from 1971
Australian Bureau of Statistics (ABS): one linked pair, 2006-2011
Stats NZ: five linked pairs, spanning 1981 to 2006


10: Activity 4: Earnings: issues in official stats: sources, quality, measures

[Figure: Earnings, per Quarter, 1999Q2 to 2012Q1, LEED, ANZSIC06 Level 3; x-axis 1999Q2 to 2011Q2; y-axis 0 to 10,000; series: Median: mushroom & veg growers, Mean: mushroom & veg growers, Mean: All industries, Median: All industries]

4.1 Where do we get data detailed enough for us to use Industry Classification Level 3?
4.2 Is it admin data, or data collected for stat purposes?
4.3 Is the collection full-coverage or sampled?
4.4 What errors might it have?
4.5 What’s happening to the earnings? Why??
4.6 What’s the start date for the LEED data? (Linked Employer Employee Database)
4.7 What analyses and transformations would you do to this?
4.8 Were you thinking of going into the fungus trade?


iNZight and forecasting: 2 slides:


Comparisons: More to think about:

[Figure: Mean and Median Earnings: Auckland and NZ: Quarterly: 1999 Q2 to 2007 Q2; x-axis 00 to 07; y-axis 5,000 to 15,000; series: Mean Earnings - Ak, Median Earnings - Ak, Mean Earnings - NZ, Median Earnings - NZ]