Business data linking
recent UK experience
business data in the UK
• common register (IDBR) since 1994
• key law: Statistics of Trade Act 1947
• data collection supervised by a Survey Control Unit
  – concerns over burden on business
  – exemptions from repeat surveys for smallest firms
• devolved political and statistical framework
  – government departments separate bodies
  – data sharing has purposes and limitations specified
the Business Data Linking project (BDL)
• begun in the late 1990s
  – core dataset: Annual Respondents Database
  – other datasets: R&D, skills, Community Innovation Surveys, e-commerce, New Earnings Survey…
• joint venture between ONS, OGDs*, academics
• academics on secondment work in a “safe setting”
• no access outside ONS
• outputs checked manually for disclosure
*OGD: other government department
sample outputs
• solving the productivity problem?
  – UK multinationals as productive as foreign-owned firms
  – domestically-oriented firms even less productive?
• e-commerce lowers prices!
  – ...perhaps...
  – actually seems to emphasise existing market conditions
  – competition increases, but monopolies get stronger too
• on-the-job versus general skills
  – linking skills and schooling data to firm data indicates a genuine productivity gain from general human capital
problems (1): “the ministry for adding things up”
• microdata quality suffers
  – statistical editing and block adjustment
• redefinition and interpretation of data or metadata
  – more problematic for micro users
  – eg SIC80-SIC92
• longitudinal integrity
  – crucial to micro analysis, irrelevant to macro numbers
  – not designed into repeat surveys
• documentation
  – different focus
problems (2): sampling frames
• small firms
  – low probability of reselection
  – smallest excluded by design
• changes in census band
• voluntary surveys
• non-IDBR sample selection
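
A rough illustration of why small firms are hard to follow over time. It assumes each year's sample is drawn independently with the same sampling fraction, which is a simplification (real surveys may rotate samples deliberately). The sampling fractions and band names are hypothetical, not ONS figures.

```python
def prob_in_all_years(sampling_fraction: float, years: int) -> float:
    """Chance that a firm is selected in every one of `years` independent draws."""
    if not 0.0 <= sampling_fraction <= 1.0:
        raise ValueError("sampling_fraction must be between 0 and 1")
    if years < 1:
        raise ValueError("years must be at least 1")
    return sampling_fraction ** years


# Hypothetical sampling fractions by size band (illustrative only).
bands = {"small": 0.25, "medium": 0.5, "large (census)": 1.0}

for name, fraction in bands.items():
    chances = [round(prob_in_all_years(fraction, k), 4) for k in (1, 2, 3)]
    print(name, chances)
```

Under these assumptions a firm in the census band is always present, while the chance of observing a sampled firm in consecutive years shrinks quickly, which is what limits longitudinal analysis of small firms.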
problems (3): inconsistencies
• inconsistent across time
  – eg ICT and innovation surveys
• inconsistent across surveys
  – eg foreign ownership
problems (4): confidentiality
• linking complicates disclosure control
  – increases number of quality assurers
• linking across small samples
  – reduces frequencies
  – increases likelihood of disclosiveness
• no general government right to share data
  – explicit agreement needed to share data across OGDs
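
The point about small samples can be sketched with a simple frequency check. This is a minimal illustration, not the procedure used in the project: the threshold of 10, the survey contents and the function names are all assumptions made for the example.

```python
from collections import Counter

THRESHOLD = 10  # hypothetical minimum cell count, not an official rule


def small_cells(cells, threshold=THRESHOLD):
    """Return cells whose frequency falls below the threshold."""
    counts = Counter(cells)
    return {cell: n for cell, n in counts.items() if n < threshold}


def link(survey_a, survey_b):
    """Keep only firms present in both surveys (an inner join on firm reference)."""
    common = survey_a.keys() & survey_b.keys()
    return {ref: (survey_a[ref], survey_b[ref]) for ref in common}


# Synthetic data: firm reference -> industry, and firm reference -> response flag.
survey_a = {ref: ("A" if ref <= 20 else "B") for ref in range(1, 31)}
survey_b = {ref: "responded" for ref in range(15, 41)}

linked = link(survey_a, survey_b)

before = small_cells(survey_a.values())
after = small_cells(industry for industry, _ in linked.values())

print(before)  # {}
print(after)   # {'A': 6}
```

Neither industry cell is below the threshold in the first survey alone, but after linking only six firms remain in industry A, so a table by industry that was safe before linking now needs attention.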
new developments
• timely electronic documentation
• automatic matching
• feedback into survey design
• integrated data and metadata system
• increasing awareness of benefits of microdata
  – increases value of data
  – lowers business burden
  – answers new questions
  – improves knowledge of datasets
what have we learnt?
• enthusiastic data providers are the key
  – plan early for disclosure checking too
  – feed back
• check data version
  – may not be a ‘definitive’ file
  – and even ‘clean’ datasets need preparation time
• check micro validity: macro validity isn’t enough
  – duplicates and bad values
  – inconsistencies within and across datasets and time
• “useless” data can be useful when linked
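
The micro validity checks listed above might look something like the following. It is only a sketch: the field names (`ref`, `employment`, `foreign_owned`) and the sample records are invented for illustration and do not describe any actual dataset.

```python
from collections import Counter


def duplicate_refs(records):
    """Firm references that appear more than once in a dataset."""
    counts = Counter(r["ref"] for r in records)
    return sorted(ref for ref, n in counts.items() if n > 1)


def bad_values(records, field):
    """References of records where a numeric field is missing or negative."""
    return [r["ref"] for r in records
            if r.get(field) is None or r[field] < 0]


def conflicting(records_a, records_b, field):
    """References where two datasets disagree on the same field."""
    b_values = {r["ref"]: r[field] for r in records_b}
    return sorted(r["ref"] for r in records_a
                  if r["ref"] in b_values and b_values[r["ref"]] != r[field])


# Invented records for illustration.
survey_a = [
    {"ref": 1, "employment": 12, "foreign_owned": False},
    {"ref": 2, "employment": -5, "foreign_owned": True},
    {"ref": 2, "employment": 40, "foreign_owned": True},
    {"ref": 3, "employment": None, "foreign_owned": False},
]
survey_b = [
    {"ref": 1, "foreign_owned": True},
    {"ref": 3, "foreign_owned": False},
]

print(duplicate_refs(survey_a))                          # [2]
print(bad_values(survey_a, "employment"))                # [2, 3]
print(conflicting(survey_a, survey_b, "foreign_owned"))  # [1]
```

None of these problems would necessarily show up in aggregate totals, which is why checks at the level of the individual record are worth running before any linked analysis.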
finally...
• be prepared to take the lead
• don’t get stressed
  – recognise the data wasn’t collected for this purpose
  – enjoy the fact that it is available
• talk about it
contact
Felix Ritchie
Business Data Linking
Office for National Statistics
1 Drummond Gate
London SW1V 1QQ