Data harmonisation and the value of cross-cohort analysis: an overview of the new CLOSER programme
Jane Elliott
Director of the Centre for Longitudinal Studies and Director of CLOSER
Summary
• A brief overview of CLOSER
• Early progress on harmonisation work packages
  – Biological structure
  – Socioeconomic status and qualifications
• Uniform Search Platform
• Contextual database
• Benefits of cross-cohort analysis
Cohorts and Longitudinal Studies Enhancement Resources = CLOSER
Nine longitudinal studies:
• Hertfordshire Cohort Study
• 1946 British Birth Cohort
• 1958 British Birth Cohort
• 1970 British Birth Cohort
• ALSPAC – Avon Longitudinal Study of Parents and Children
• Millennium Cohort Study
• Southampton Women’s Survey
• Life Study
• Understanding Society
Funded by ESRC and MRC
Objectives & timetable
Maximise the use, value and impact of data collected through a portfolio of key UK longitudinal studies
• Stimulate interdisciplinary research across major longitudinal studies
• Provide common resources for research
• Assist with training and development
• Share information and expertise between study teams
1st October 2012 – 30th September 2017
Work streams
4 work packages on data harmonisation
3 work packages on data linkage
Core work on:
• Impact – led by the British Library
• Training and Capacity Building
• Uniform Search Platform
Leadership team contributing to strategic planning, sharing of best practice, funders’ strategies
See our website: www.CLOSER.ac.uk for further information
Twitter: @CLOSER_UK
[Diagram of the CLOSER structure, showing: Leadership team; the nine studies (1946 cohort, 1958 cohort, 1970 cohort, ALSPAC, MCS, Understanding Society, SWS, HCS, Life Study); Metadata; Uniform Search Platform; Training and capacity building; Impact; WP5: Data linkage – administrative data; WP6: Data linkage – geography; WP7: Data linkage – health data]
Vision for the USP
• Portal to discovery of hundreds of thousands of variables, questions and data collection instruments across the nine longitudinal studies:
  • covering survey and biomedical data collection
  • promoting CLOSER harmonisation work
  • state-of-the-art searching tool
  • focus on improving visibility of associations between (currently) disparate metadata items
  • shared subject/topic classification
• We should remember that this is massively ambitious; something that matches or surpasses the best multi-study metadata repository out there:
• RAND Survey Meta Data Repository covering the HRS family of studies: https://mmicdata.rand.org/megametadata/
Why do it?
• Benefits to users:
  • single resource discovery portal – replacing a fractured resource discovery landscape
  • lowers barriers to conducting cross-cohort analysis
  • increased visibility of cohort data and resources
• Benefits to data managers:
  • standardised metadata management workflows – currently curated in isolation
  • workflows in place for future ‘joiners’
• Benefits to Principal Investigators/survey commissioners:
  • make prospective harmonisation easier
  • promotion and re-use of tested questions and instruments
Assumptions, constraints
• Not a data repository
• Not a major software development project:
  • major £££ is for metadata creation/enhancement
• DDI-L agreed as standard for metadata exchange:
  • covers subject areas (bio and soc science) and data collection methods (‘hard’ instrument and survey)
  • designed for marking-up longitudinal/repeated metadata items
• Colectica Designer selected as preferred metadata ingest/editing software
Challenges
• Legacy metadata:
  • elderly and decrepit!
  • not always designed for equivalence within a study, much less across studies
  • differing or non-existent naming conventions
  • substantial (manual) effort required to establish equivalences and level of equivalence
• Metadata managed by five or six different units: different formats, workflows, vocabularies
• Relative lack of familiarity with DDI-L:
  • uneven knowledge across study units
Metadata: State of play
• >200k variables
• c.150 data collections:
  • CAI, PAPI, nurse visit, clinic-based protocol, biosamples, etc.
• c.85 validated survey instruments:
  • GHQ, AUDIT, Malaise Inventory, etc.
  • c.10 instruments used in >1 study
• c.20 validated clinical measures:
  • blood pressure, bone density, lung function, etc.
  • range of instruments used
• c.15 cognitive or physical tests
![Page 12: Data harmonisation and the value of cross-cohort analysis: an overview of the new CLOSER programme Jane Elliott Director of the Centre for Longitudinal.](https://reader035.fdocuments.us/reader035/viewer/2022062716/56649e035503460f94aeebd1/html5/thumbnails/12.jpg)
How to do it?
• USP will be a web interface that sits on top of a central repository fed by metadata created and delivered both by the individual study units and the CLOSER core
• Study units continue to curate metadata as they see fit; but not in conflict with proposed USP metadata profile
• Substantial metadata creation and enhancement to be undertaken by the study units: inputting historical questionnaires; mapping between data items and data collection
• CLOSER core responsible for identifying common (cross-study) variable and question schemes, allowing studies to reference these and also any agreed controlled vocabularies (concept, life stage etc.)
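One way to picture the split of responsibilities described above is a small lookup in which the core defines each common variable once and the study units only point at it. The following is a minimal sketch under stated assumptions: it is not DDI-L and not the USP design, and every study name, variable name and vocabulary term in it is a hypothetical placeholder.

```python
from dataclasses import dataclass

# All identifiers below are hypothetical placeholders, not real CLOSER
# or study variable names.


@dataclass(frozen=True)
class CommonVariable:
    """A cross-study variable defined once by the core team."""
    key: str
    label: str
    concept: str  # term from an agreed controlled vocabulary


@dataclass(frozen=True)
class StudyVariable:
    """A variable curated by a study unit that references a common one."""
    study: str
    name: str
    life_stage: str  # term from an agreed controlled vocabulary
    common_key: str


COMMON = {
    "bmi": CommonVariable("bmi", "Body mass index", "Body size"),
}

STUDY_VARIABLES = [
    StudyVariable("Study A", "a_bmi_wave3", "Adulthood", "bmi"),
    StudyVariable("Study B", "b_bodymass_7", "Childhood", "bmi"),
    # References a common variable the core has not defined yet.
    StudyVariable("Study B", "b_sysbp_7", "Childhood", "sbp"),
]


def find_equivalents(common_key, variables=STUDY_VARIABLES):
    """Return every study variable that references the given common variable."""
    return [v for v in variables if v.common_key == common_key]


def unresolved(variables=STUDY_VARIABLES, common=COMMON):
    """Return study variables whose reference matches no common variable."""
    return [v for v in variables if v.common_key not in common]


print([v.name for v in find_equivalents("bmi")])
print([v.name for v in unresolved()])
```

Keeping the common definitions in one place means a search for a concept can return variables from every study that references it, while the study units stay free to name and store their own variables as they see fit.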
Contextual database - rationale
Life course approach stresses the importance of the connection between individuals and the historical and socioeconomic context in which these individuals lived
But some research based on cohort studies pays little attention to the social, economic or historical context that helps shape the lives of individuals
Some data on social change and social context will come from the studies themselves (e.g. breastfeeding)
Aim of the contextual database is to provide a central source of key indicators over time likely to be of direct relevance to cohort research
Source: Changing Britain, Changing Lives: Three Generations at the Turn of the Century, Table 8.3 (Wadsworth et al.)
Proportion of women in paid employment, by age and cohort
Source: Jenny Neuburger - Paper presented at CLS June 2008
Contextual database - elements
• Economic indicators
• Qualifications & Education
• Demography
• Health & health behaviour
• Inequality & poverty
• Labour market and unemployment
• Housing
• Digital economy
Also want to include policy narratives and a bibliography
Work package 1
Biological structure and function
Two years: March 2013 – February 2015
William Johnson & Rebecca Hardy
MRC Unit for Lifelong Health and Ageing
Blood pressure
Cognitive performance
Physical capability
Body size and composition
Research priority
Body size - because of the obesity epidemic and the long term consequences of adiposity on health & well-being
Need for harmonisation:
| Body size data from a single study | Harmonised body size data across multiple studies |
| --- | --- |
| Restricted N and power | Larger N and greater power |
| Results may not be generalizable | Replication of results and quantification of heterogeneity |
| Modelling capability dependent on study data | Modelling capability increased by pooling data |
| Age and period effects confounded | Decompose age and period effects |
| No cohort effects (secular trend) | Investigate cohort effects |
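The point about age and period in the comparison above follows from simple arithmetic: within one birth cohort, the calendar year of a measurement is just the birth year plus the age, so each age is only ever seen in one year. The sketch below illustrates this using the three British birth cohort years named earlier; the ages are illustrative assumptions, not any study's actual sweep schedule.

```python
def observations(birth_years, ages):
    """Return (cohort, age, period) triples, where period = birth year + age."""
    return [(c, a, c + a) for c in birth_years for a in ages]


def periods_at_age(obs, age):
    """Distinct calendar years in which the given age was observed."""
    return sorted({p for (_, a, p) in obs if a == age})


def age_period_confounded(obs):
    """True when every age was observed in only one calendar year."""
    ages = {a for (_, a, _) in obs}
    return all(len(periods_at_age(obs, a)) == 1 for a in ages)


AGES = [7, 11, 16]  # illustrative ages, not a real sweep schedule

single = observations([1958], AGES)
pooled = observations([1946, 1958, 1970], AGES)

print(age_period_confounded(single))  # True
print(age_period_confounded(pooled))  # False
print(periods_at_age(pooled, 11))     # [1957, 1969, 1981]
```

With pooled cohorts the same age is observed in several calendar years, which is what makes it possible to start separating age from period. Cohort, age and period still satisfy period = cohort + age exactly, so estimating all three effects at once continues to need modelling assumptions.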
First papers
Compare body size distributions and mean trajectories, across different phases of the life course, between cohorts
Investigate how SEP inequalities in body size trajectories, across different phases of the life course, differ between cohorts
Li L et al. Am J Epidemiol. 2008
Howe LD et al. JECH. 2012
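Studies
[Figure: ages (in years) at data collection, one row per study]
0, 2, 4, 6, 7, 11, 15, 20, 26, 36, 43, 53, 60-64
0, 7, 11, 16, 23, 33, 42, 44, 50
0, 5, 10, 16, 26, 30, 34
0, 7, 8, 9, 10, 11, 12, 13, 15, 18
0, 1, 3, 5, 7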
Data
Challenges
Between studies:
• Data covering different age ranges
• Data increasingly positively skewed in more recent studies
Within individuals:
• Different number of observations at different exact ages
• Different precision of data
Within and between individuals:
• Both measured and self-report data
What we are aiming to achieve:
1) Demonstration research project focussing on socioeconomic differences in growth and obesity across cohorts
2) A harmonised dataset, with accompanying documentation for other users
Socio-economic data harmonisation work package
Claire Crawford, Brian Dodgeon, Tim Morris, Sam Parsons, Anna Vignoles (lead)
Two years: April 2013 – March 2015
What measures?
Measures to be harmonised are:
• parental education level
• cohort member level of education
• socio-economic (occupation) status
• household equivalised income
• home ownership
Cohorts: NSHD; NCDS; BCS; ALSPAC; MCS
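As an illustration of what "household equivalised income" involves, the sketch below adjusts household income for household size and composition. The slides do not say which equivalence scale the work package uses; the modified OECD scale is assumed here purely as a common example, and the function and parameter names are hypothetical.

```python
def modified_oecd_factor(n_aged_14_plus, n_children_under_14):
    """Equivalence factor under the modified OECD scale (assumed here):
    1.0 for the first person aged 14 or over, 0.5 for each further
    person aged 14 or over, and 0.3 for each child under 14."""
    if n_aged_14_plus < 1:
        raise ValueError("a household needs at least one person aged 14 or over")
    return 1.0 + 0.5 * (n_aged_14_plus - 1) + 0.3 * n_children_under_14


def equivalised_income(household_income, n_aged_14_plus, n_children_under_14):
    """Household income divided by the household's equivalence factor."""
    factor = modified_oecd_factor(n_aged_14_plus, n_children_under_14)
    return household_income / factor


# Two adults and two young children on 21,000 have a factor of 2.1,
# giving an equivalised income of roughly 10,000.
print(round(equivalised_income(21000, 2, 2), 2))
```

Whatever scale is chosen, applying the same one to every cohort is what makes the resulting income measure comparable across studies that recorded household composition in different ways.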
Priority measures agreed
• Highest qualification (vocational/academic separately) held at every age
• Age left full-time education
• Whether the person went past compulsory schooling
• Average GCSE score or equivalent
• GCSE grades in mathematics and English (not for all cohorts)
• For cohort member parents: age left full-time education and highest qualification at birth of CM
• Grandparents’ age left school
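A measure such as "went past compulsory schooling" has to be derived with a cohort-specific rule, because the minimum leaving age changed between cohorts. The following is a minimal sketch, assuming leaving ages of 15 for the 1946 cohort and 16 for the 1958 and 1970 cohorts; these values and the function name are illustrative assumptions to be checked against each study's documentation, not the work package's actual derivation.

```python
# Assumed minimum school-leaving ages (illustrative; verify against each
# study's documentation before any real use).
MIN_LEAVING_AGE = {"NSHD": 15, "NCDS": 16, "BCS70": 16}


def stayed_past_compulsory(cohort, age_left_education):
    """True if the person left full-time education after the assumed
    minimum leaving age for their cohort; None when the age is missing."""
    if age_left_education is None:
        return None
    return age_left_education > MIN_LEAVING_AGE[cohort]


# Leaving at 16 means different things in different cohorts.
print(stayed_past_compulsory("NSHD", 16))   # True
print(stayed_past_compulsory("NCDS", 16))   # False
print(stayed_past_compulsory("BCS70", None))  # None
```

The same raw answer ("left at 16") maps to different harmonised values depending on the cohort, which is why the derived flag, rather than the raw age, is the comparable measure.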
Measures available by cohort
Cohorts: NSHD, NCDS, BCS70, ALSPAC, MCS
Cohort member
• Highest qualification (each age): ✔ ✔ ✔ ✔
• Age left full-time education: ✔ ✔ ✔ ✔
• Post compulsory education: ✔ ✔ ✔ ✔
• Maths grade [O’level, CSE, GCSE]: ✔ ✔ ✔ ✔
• English grade [O’level, CSE, GCSE]: ✔ ✔ ✔ ✔
• Exam total score [O’level, CSE, GCSE]: ✔ ✔ ✔ ✔
Parent
• Age left full-time education: ✔ ✔ ✔ ✔ ✔
• Highest qualification [birth or nearest data collection point]: ✔ ✔ ✔ ✔
Grandparent
• Age left full-time education: ✔
The value of cross-cohort analysis
1) A meta-narrative of societal change over time
2) Creating a synthetic life course – understanding life time trajectories
3) Investigate cohort effects - examining the impact of different social and policy contexts
4) Replication of results – checking the robustness of models
5) Larger N and greater power
6) Decompose age and period effects
Lifetime systolic blood pressure trajectories and velocities (predicted means)
[Figure: separate panels for men and women]
Wills et al. PLOS Med, 2011