My journey to Data Science · Data science is the extraction of actionable knowledge directly from...

Page 1: My journey to Data Science · Data science is the extraction of actionable knowledge directly from data through a process of discovery, or hypothesis formulation and hypothesis testing.

A journey into Data Science in Libraries

Luis Martinez-Uribe

Data Scientist

Library

Fundación Juan March

Photo with CC BY-NC-SA 2.0 licence taken from https://www.flickr.com/photos/bg2axk/

Second EDISON Conference - 16 March 2017

Outline

Data Science

My personal journey

Data Science at Fundación Juan March

Training

Looking ahead

What is Data Science?

Data science is the extraction of actionable knowledge directly from data through a process of discovery, or hypothesis formulation and hypothesis testing.

National Institute of Standards and Technology (NIST) Big Data Working Group (2015)

Data Science

Statistics

Computer science

Mathematics

Social science

Software engineering

Ethics

Artificial Intelligence

What is Data Science for?

A pathway to Data Science in Libraries

BSc Mathematics

Social Science Data Librarian

Research Data Management

Data Scientist

MSc Information Systems

PhD

Sociology

Big Data

Klavans, Richard and Kevin W. Boyack. (2006). “Quantitative Evaluation of Large Maps of Science.” Scientometrics 68 (3): 475-499.

Research Data Management

Tools

CURATION

CAPTURE AND STORAGE

ANALYSIS

VIZ AND BI

Examples of Data Science Activities

01 Knowledge portals

02 Our users and visitors

03 Classifiers

04 Search & recommend

05 Vizs

06 Analysis of social networks

Knowledge portals

Our users and visitors

Classification of our content

Keyword generation for events

Abstract, Format, Title, ...

EVENTS WITH KEYWORDS (training set)

WORD PROCESSING (stopwords, expressions)

STATISTICAL INDICATORS (frequency, word length, position)

PREDICTIVE MODELS (machine learning)

CLASSIFIER (80% precision)
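
The keyword-generation pipeline on this slide can be illustrated with a minimal sketch. The slide does not say which predictive model was used, so a plain perceptron stands in here only because it is short. The stopword list, the two toy events, the feature scaling and every function name are assumptions made for illustration, not the Fundación Juan March implementation.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real one would be larger and language-specific.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "with", "about"}

# Hypothetical training set: (event text, keywords assigned by a person).
EVENTS = [
    ("Concert of baroque music: baroque sonatas and baroque dances with lute", {"baroque"}),
    ("Lecture about cubism: cubism painters and the cubism legacy in Madrid", {"cubism"}),
]

def tokenize(text):
    """Word processing: lowercase, split into words, drop stopwords."""
    words = re.findall(r"[a-záéíóúñü]+", text.lower())
    return [w for w in words if w not in STOPWORDS]

def indicators(text):
    """Statistical indicators per candidate word:
    [bias term, relative frequency, scaled word length, earliness of first use]."""
    words = tokenize(text)
    counts = Counter(words)
    total = len(words)
    feats = {}
    for pos, word in enumerate(words):
        if word not in feats:  # the first occurrence fixes the position indicator
            feats[word] = [1.0, counts[word] / total, len(word) / 10.0, 1.0 - pos / total]
    return feats

def score(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))

def train(events, max_epochs=1000):
    """Predictive model: a perceptron that separates keywords from other words."""
    rows = []
    for text, keywords in events:
        for word, x in indicators(text).items():
            rows.append((x, 1 if word in keywords else -1))
    weights = [0.0] * 4
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in rows:
            if y * score(weights, x) <= 0:  # misclassified: nudge the weights
                weights = [w + y * xi for w, xi in zip(weights, x)]
                mistakes += 1
        if mistakes == 0:  # every training word is classified correctly
            break
    return weights

def suggest_keywords(text, weights, top=2):
    """Classifier: rank the candidate words of a text by model score."""
    feats = indicators(text)
    ranked = sorted(feats, key=lambda w: score(weights, feats[w]), reverse=True)
    return ranked[:top]

weights = train(EVENTS)
print(suggest_keywords(EVENTS[0][0], weights, top=1))
```

With only two toy events the model mostly learns that repeated words make good keywords. A figure such as the 80% precision quoted on the slide would have to be measured on held-out events, which this sketch does not attempt.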

Search and recommendation systems

Search

Integrating data from 6,000 events, 500 exhibitions and art catalogues and 5,000 Library items

Recommendations

Using meaningful words in the title and keywords.
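
One way to read "meaningful words in the title and keywords" is as a word-overlap similarity between items. The sketch below uses Jaccard similarity over those words; the slide does not name the measure actually used, and the record fields (`id`, `title`, `keywords`), the stopword list and the sample catalogue are hypothetical.

```python
import re

# Hypothetical stopword list used to keep only "meaningful" title words.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "with"}

# Hypothetical records mixing events, exhibitions and Library items.
CATALOGUE = [
    {"id": "event-1", "title": "Goya and the Enlightenment", "keywords": ["painting", "Goya"]},
    {"id": "exhibition-1", "title": "Goya: Prints and Drawings", "keywords": ["prints", "Goya"]},
    {"id": "library-1", "title": "A History of Spanish Music", "keywords": ["music"]},
    {"id": "event-2", "title": "Painting in the Age of Goya", "keywords": ["painting"]},
]

def meaningful_words(item):
    """Non-stopword title words plus the item's keywords, all lowercased."""
    title_words = re.findall(r"\w+", item["title"].lower())
    words = {w for w in title_words if w not in STOPWORDS}
    return words | {k.lower() for k in item["keywords"]}

def jaccard(a, b):
    """Shared words divided by all words; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(target, catalogue, top=3):
    """Ids of the items sharing the most meaningful words with the target."""
    target_words = meaningful_words(target)
    scored = []
    for item in catalogue:
        if item["id"] == target["id"]:
            continue  # never recommend the item itself
        similarity = jaccard(target_words, meaningful_words(item))
        if similarity > 0:
            scored.append((similarity, item["id"]))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best first, ties by id
    return [item_id for _, item_id in scored[:top]]

print(recommend(CATALOGUE[0], CATALOGUE))  # ['event-2', 'exhibition-1']
```

Because every record is reduced to the same bag of words, events, exhibitions and Library items can be compared with each other once their data has been integrated.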

Interactive web graphs

Twitter networks and real-time sentiment analysis

Training and education

Data Science methods

Big Data technologies

PhD in Social Sciences, Department of Sociology

Develop an analytical and visual framework for the social analysis of Big Cultural Data from libraries, archives and museums

Photo with CC BY-SA 2.0 licence taken from https://www.flickr.com/photos/seiho/

datamonster.co

• “the is to ... learn all you can about your data … from where it was first created.

• Embrace the broader reality … all the information that is yet to be stored in technology.”

Looking ahead

Artificial intelligence

Looking ahead

"The best prophet of the future is the past." Lord Byron

Thanks

@luismart

es.linkedin.com/in/luismartinezuribe