Finding, searching and sharing qualitative data: the uses of XML
Data Management in Practice, LSHTM, London, 14 November 2013
Libby Bishop, Producer Relations and Research Ethics
UK Data Service seeking to improve
• We have one of the largest qualitative data collections – over 300 data collections in the social sciences
• Currently users find and download these from our website – generally good, but we would like to improve:
  • No searching within collections
  • Hard to display complex relationships among related files within a collection (transcript, audio, image, memo)
  • Cannot reliably cite parts of data
What researchers want from data centres
• Search – find data regardless of location
• Use – ways to use data flexibly
  • Examine interview extract in context, online
  • Decide before download
  • Support analysis led by research questions (not technology)
• Cite – get and give credit appropriately
• Preserve – for own or others’ use later
XML is not a miracle cure,
just a (key) part of the solution
XML – eXtensible Mark-up Language
• Language – system for communication
• Mark-up – encoding descriptive features of text
  • Tags, e.g. <u>words spoken in an interview</u>
• Extensible – set of tags is not fixed
  • Text Encoding Initiative (TEI) has hundreds of tags
• Independent of specific hardware/software
• Open
XML allows qualitative data (rich and deep, but messy and unstructured) to benefit from computing power typically applied to structured, numeric data.
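As a rough illustration of how mark-up makes unstructured text machine-searchable, the sketch below parses a small interview fragment and pulls out one speaker's turns. The fragment is invented for illustration; the <u> element and its "who" attribute follow the TEI convention for utterances, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented interview fragment. <u> marks an utterance and "who" names the
# speaker, following TEI conventions; the wording is made up for this sketch.
SAMPLE = """
<interview>
  <u who="interviewer">When did you leave school?</u>
  <u who="respondent">In 1978, when I was sixteen.</u>
  <u who="interviewer">What did you do next?</u>
  <u who="respondent">I started an apprenticeship in a garage.</u>
</interview>
"""


def utterances_by(speaker, xml_text=SAMPLE):
    """Return the text of every <u> element spoken by the given speaker."""
    root = ET.fromstring(xml_text.strip())
    return [u.text for u in root.iter("u") if u.get("who") == speaker]


respondent_turns = utterances_by("respondent")
for turn in respondent_turns:
    print(turn)
```

Once the speaker is recorded as an attribute, the same query works across every transcript in a collection, which is the kind of processing normally reserved for structured data.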
Search: all types of resource available
Data collections
• studies
• variables
Case studies
• research
• teaching
ESRC outputs
• conference paper
• article
• report
• research summary
Support/ ‘how to’ guides
• dataset
• theme
• methods/statistics
Search
What makes all this possible? XML…..
Data Documentation Initiative (DDI)
DDI: A metadata specification for the social sciences
Use and Cite: Digital Futures project
• Build a user-friendly system for publishing and exploring qualitative data online
• Project includes large-scale digitisation of precious and undigitized materials
• Browse search results in context
• Improve display of complex data
• Offer a mechanism for reliably citing data located in the system
Search results – displayed in context
Many formats for different research questions
School Leaver Essay 53 – My Pastaaa In 1978 I left school, I was sixteen years old. I came straight out of school into an apprenticeship heavy meter machanics. I served my four year apprenticeship in a garage for another year and the left and started my own garage. At the age of twenty three I got married. The garage was doing well so I didn’t have Much prodlems setting up a home. One year After I had/been married my wife had her first child. When I had some spare time I made up a car for rally cross racing but In the time I was racing I only won a few. When I was twenty five our second child was born. Once when rally driving I had a smash and was in hospital for five months when I was twenty nine we had our third child. I would get up at six o clock and drive to the garage and open it at Saturdays. On some Sundays when I wasn’t rally driving the family would go horse riding or for a picnic whilst I went fishing. In the garage I took an apprenticship from people who had just left school. When I was thirty six we had our fourth child. My first child would come and help in the garage at least when he left school he would get a job. When I was forty I had an extension built on to the garage. I also bought 4 acres of land and built a racetrack and made go-karts for my second and third eldest sons when my last child was eight I brought her a pony and taught her to ride. From when I was forty four My mother died and my father had died when I was twenty nine.
Corrected spelling – for accurate searches
<sic>apprenticship</sic><corr>apprenticeship</corr>
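A minimal sketch of how this encoding supports accurate searching: the same marked-up text can yield either the writer's original spelling or a corrected reading to index for search. It assumes the pair is wrapped in a TEI <choice> element, which is the usual TEI way of offering alternative readings; the sample sentence is adapted from the essay above and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# One sentence from the essay, with the misspelling and its correction
# paired inside a TEI-style <choice> element (an assumption of this sketch).
SAMPLE = (
    "<p>I served my four year "
    "<choice><sic>apprenticship</sic><corr>apprenticeship</corr></choice>"
    " in a garage.</p>"
)


def reading(elem, prefer="corr"):
    """Flatten an element to plain text, keeping only one of sic/corr."""
    skip = "sic" if prefer == "corr" else "corr"
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != skip:
            parts.append(reading(child, prefer))
        # Text that follows the child belongs to the parent, so always keep it.
        parts.append(child.tail or "")
    return "".join(parts)


root = ET.fromstring(SAMPLE)
corrected = reading(root, prefer="corr")  # version to index for search
original = reading(root, prefer="sic")    # version to display as written
print(corrected)
print(original)
```

Indexing the corrected reading lets a search for "apprenticeship" find the essay, while the display can still show the text exactly as the school leaver wrote it.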
Status quo – RTF transcript for download
Digital Futures (DF) – target page for an interview
Objects in collection metadata
Richer metadata = richer discovery
• Use of DDI 2.5, QuDEx and TEI schemas
• QuDEx allows identification of data objects:
  • Interview transcript or audio recording etc.
• Relationship to another data object or part of data
• Descriptive categories at the object level, e.g. mime type, interview characteristics, interview setting
• Capacity to capture rich annotation of parts of data
• QuDEx model in use (Schema at: www.data-archive.ac.uk/create-manage/projects/qudex/)
• Object-level description = a lot of manual work!
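The idea of object-level description can be sketched as follows. The element and attribute names here (collection, object, relation, kind, type) are invented for illustration and are not taken from the QuDEx schema; the point is only that once each file is described and linked, related objects can be found automatically.

```python
import xml.etree.ElementTree as ET

# Hypothetical object-level metadata: NOT the real QuDEx vocabulary.
COLLECTION = """
<collection id="example-collection">
  <object id="int01-transcript" mime="application/rtf" kind="transcript"/>
  <object id="int01-audio" mime="audio/mpeg" kind="audio"/>
  <object id="int01-memo" mime="text/plain" kind="memo"/>
  <relation source="int01-audio" target="int01-transcript" type="recordingOf"/>
  <relation source="int01-memo" target="int01-transcript" type="commentsOn"/>
</collection>
"""


def related_objects(xml_text, object_id):
    """List (id, kind, relation type) for every object pointing at object_id."""
    root = ET.fromstring(xml_text.strip())
    kinds = {obj.get("id"): obj.get("kind") for obj in root.iter("object")}
    found = []
    for rel in root.iter("relation"):
        if rel.get("target") == object_id:
            source = rel.get("source")
            found.append((source, kinds[source], rel.get("type")))
    return found


links = related_objects(COLLECTION, "int01-transcript")
for source, kind, rel_type in links:
    print(f"{source} ({kind}) is {rel_type} the transcript")
```

A display page for an interview could use such links to show the audio and memo alongside the transcript, at the cost of the manual description work noted above.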
Citation – of collection, and utterance
World Health Organization and International Collaborative Study of Medical Care Utilization, WHO/ICS Medical Care Utilization Study Data, 1968-1969 [computer file]. Colchester, Essex: UK Data Archive [distributor], January 1981. SN: 1427, http://dx.doi.org/10.5255/UKDA-SN-1427-1
Preservation – benefits of XML
• Open standard
• Widely adopted as the basis for interchange of documents and data over the Web
• Human readable
• Best for metadata; some challenges for preserving data itself
How can researchers help?
• Produce and share high-quality metadata and documentation… and,
• Using XML is not that different from text processing and spreadsheets