Supporting Big Data, Open Data, Data Analytics and Data Science

Post on 11-Apr-2017



Dr Simon Price, Research IT Manager


• Bristol is a research-intensive university

• 6 Faculties: Social Science & Law, Science, Engineering, Arts and two Medical Faculties

• Employs 2000+ researchers (excluding PhDs)

• Each year (approximately):
  • 1500 research funding applications
  • £100M research income
  • 4500 research outputs


Outline

1. Big Data
2. Open Data
3. Data Analytics
4. Data Science

5. Implications for IT support


Big Data


• Lots and lots of technology buzzwords!
• Some important ones:
  • MapReduce
  • The Hadoop stack
  • Distributed file systems
  • Query languages & programming languages
  • NoSQL databases (column, document, graph, ...)
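As a rough illustration of the MapReduce model named in the list above, the sketch below counts words in plain Python. It is a single-process toy, not Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for this example, and a real framework would run the map and reduce steps in parallel across many machines.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(document: str) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(documents: list[str]) -> dict[str, int]:
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "open data"]))
```

The three steps are kept as separate functions because that separation is what lets a framework distribute them: map and reduce touch one record or one key at a time, so only the shuffle needs to move data between workers.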


Big Data

• Trends in Hadoop stack
  • Near realtime analytics
  • Streaming analytics
  • In-memory

• Trends in NoSQL
  • Relational and NoSQL moving closer together


Open Data


Open Data - data.bris

• Each PI allocated 5TB "forever"
• Research Data Management
• Open Data Publication


Open Data - public data


Bristol is Open - datasets

• 140+ datasets live on opendata.bristol.gov.uk
• Some real time data
• Transport API repository now available
• Examples:
  • Government: Elections since 2007
  • Community: Quality of Life survey
  • Education: School Results
  • Energy: Installed PV, Energy Use in Council Buildings
  • Environment: Real time & Historic Air Quality, Flood Alerts (EA)
  • Land use: 2013 Planning applications
  • Health: Life expectancy/Mortality, Obesity, NHS Spend


Data Analytics

• Operational focus
  • variables are "known knowns and known unknowns"

• Descriptive
  • summarisation of known variables and alerting

• Predictive
  • correlations between known variables
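The descriptive and predictive bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (summarise, alerts, pearson), the threshold, and the sample readings are all hypothetical and do not come from the talk.

```python
import math
import statistics


def summarise(values: list[float]) -> dict[str, float]:
    """Descriptive: summarise one known variable."""
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "min": min(values),
        "max": max(values),
    }


def alerts(values: list[float], threshold: float) -> list[int]:
    """Descriptive: return the positions of readings above a threshold."""
    return [i for i, v in enumerate(values) if v > threshold]


def pearson(xs: list[float], ys: list[float]) -> float:
    """Predictive: Pearson correlation between two known variables."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical sample readings, invented for this example.
    traffic = [10.0, 55.0, 20.0, 70.0]
    pollution = [12.0, 48.0, 25.0, 66.0]
    print(summarise(traffic))
    print(alerts(traffic, threshold=50.0))
    print(pearson(traffic, pollution))
```

Both steps work only on variables that are already being measured, which matches the "known knowns and known unknowns" framing of the slide.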


Data Science

• Multidisciplinary data-intensive research
• Focus on research insights, causation and prediction
• Usually involves Machine Learning and Statistics

• Different perspectives:
  • Computer Scientists view DS as a research domain
  • Statisticians view DS as a research domain
  • Other academics view DS as a service


3 May 2023


Implications for IT support

• Governance
  • Shift from IT-owned to academic-owned (Shadow IT)

• Skills
  • IT experts need to train and trust academics
  • Nurture internal skills pipeline (interns, postgrads)

• Systems
  • Mixed economy of internal and external