Introduction to data and text mining - Jisc Digifest 2016
Transcript of Introduction to data and text mining - Jisc Digifest 2016
Introduction to Data and Text mining
Catherine Grout
https://en.wikipedia.org/wiki/Text_mining
2/03/2016
What is data and text mining?
» Text mining, also referred to as text data mining, roughly equivalent to text analytics, refers to the process of deriving high-quality information from text
» High-quality information is typically derived through the devising of patterns and trends through means such as statistical pattern learning
» Text mining usually involves the process of structuring the input text … deriving patterns within the structured data, and finally evaluation and interpretation of the output
» 'High quality' in text mining usually refers to some combination of relevance, novelty, and interestingness
Ref : http://bit.ly/_jisc_textmining
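As a rough illustration of the three stages described above (structuring the input text, deriving patterns, then evaluating the output), the following Python sketch uses only the standard library. The sample documents, the stop-word list and the function names are assumptions made for illustration and do not come from the cited sources; simple term counting stands in here for the statistical pattern learning the slide mentions.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "is", "for", "on"}

def structure(text):
    """Stage 1: turn raw text into a list of lower-case tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def derive_patterns(documents):
    """Stage 2: count in how many documents each term appears."""
    doc_freq = Counter()
    for doc in documents:
        # set() ensures a term is counted once per document.
        doc_freq.update(set(structure(doc)))
    return doc_freq

def evaluate(doc_freq, min_docs=2):
    """Stage 3: keep only terms shared by at least min_docs documents."""
    return sorted(t for t, n in doc_freq.items() if n >= min_docs)

if __name__ == "__main__":
    # Made-up sample documents.
    docs = [
        "Text mining of biomedical literature",
        "Mining census data for trends",
        "Biomedical data and the literature",
    ]
    print(evaluate(derive_patterns(docs)))
```

Real text mining systems replace each stage with something richer (linguistic parsing, statistical models, expert review of results), but the overall shape of the pipeline is the same.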
Text mining
What is data and text mining?
» Data mining is an imprecise term but can mean anything from:
› Large-scale data analysis within science, e.g. outputs of the Hubble telescope or the CERN Large Hadron Collider
› Analysing census data for socio-economic trends (medium scale – finite amount of data)
› The opportunities of mining connected small objects/collections of research data to find new insight, e.g. bringing together various versions of the Mona Lisa and using data mining to analyse their underlying structure
Ref : http://bit.ly/_jisc_textmining
Data mining
What is its value for research and education?
» 2012 – Jisc published a key report “Value and benefits of text mining” http://bit.ly/Jisc_textmining
» Took a case study approach and also undertook an economic analysis of the benefits (…biomedicine)
» Wider, at-scale benefits were harder to come by owing to legal and technical limitations inhibiting systematic use
» Since then new benefits have emerged
What types of benefits?
» Finding research insights that were not possible through other techniques
» Bringing together texts/data across different disciplines and finding new insights
» “Text mining offers a way of helping researchers to make sense of and leverage value from the vast sea of electronic resources, which is continually expanding.”
» “…potential to increase the research base available to business and society and to enable business and others to use the research base more effectively”
Health benefits of outdoor education
https://en.wikipedia.org/wiki/Outdoor_education
Innovative research in Humanities and Social sciences
» Digging into Data challenge - http://diggingintodata.org
» International initiative, now in its 4th funding round, e.g.:
› Trees and Tweets - http://bit.ly/treesandtweets
› DiLiPaD - http://dilipad.history.ac.uk/
Mining repositories: CORE
» CORE is an aggregation of Open Access repositories and offers itself as a platform for TDM (25 million articles)
› Can use an API (of interest if you want to build value-added services on top)
› Or download the whole aggregation as an open dataset here: https://core.ac.uk/intro/data_dump
› Jisc and the Open University are running CORE in partnership, with the back-end aggregation hosted by the OU and the front-end services hosted by Jisc. (Further services by Jisc could be developed on top of this.)
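To give a feel for what mining a downloaded aggregation might involve, here is a minimal Python sketch that scans article records for a keyword. It assumes, purely for illustration, that the records are stored one JSON object per line with `title` and `abstract` fields; the actual layout of the CORE dataset may differ, and the sample records below are made up.

```python
import io
import json

def matching_titles(lines, keyword):
    """Yield titles of records whose title or abstract mentions keyword.

    `lines` is any iterable of strings, one JSON object per line
    (an assumed format, not necessarily that of the CORE dataset).
    """
    needle = keyword.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        text = f"{record.get('title', '')} {record.get('abstract', '')}"
        if needle in text.lower():
            yield record.get("title", "")

if __name__ == "__main__":
    # Made-up records standing in for a downloaded file.
    sample = io.StringIO(
        '{"title": "Mining the census", "abstract": "Socio-economic trends."}\n'
        '{"title": "Telescope outputs", "abstract": "Large scale data mining."}\n'
        '{"title": "Outdoor education", "abstract": "Health benefits."}\n'
    )
    print(list(matching_titles(sample, "mining")))
```

Because the function reads one line at a time, the same approach could in principle be pointed at an open file handle for a large dump without loading it all into memory.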
Universities and industry
» NCUB (National Centre for Universities and Business) is developing a tool called an “Intelligent Broker”
› To assist with making better links between universities and industry
› Could potentially harvest and mine data from key sources like the Research Councils’ Gateway to Research, equipment.data (national equipment portal) and potentially other services, like CORE
› This would give SMEs more intelligence about research-intensive activity in particular areas, for example
And finally…
» Open Citation Experiment (using Text mining techniques – see Digifest session and demo on this!)
» Jisc are commissioning a study to examine the Text mining landscape and future contributions to this space to review:
› The current landscape – primarily in UK HE but also looking internationally, and within other relevant sectors to provide a broad view
› The market – what are the value chains and where might Jisc contribute?
› The legal position and other inhibitors
› Researcher practice, the issues they encounter, their current and future needs, considering subjects that use and those that don’t
› Existing platforms, services and tools, and potential for use by Jisc or its customers
› Recommendations on possible future areas of work or services for Jisc to explore
jisc.ac.uk
For more information
Contact
Catherine Grout
Head of change - [email protected]