Data Scientist: The Sexiest Job of the 21st Century

Martial Luyts

Interuniversity Institute for Biostatistics and statistical Bioinformatics (I-BioStat)

Katholieke Universiteit Leuven, Belgium

martial.luyts@kuleuven.be

www.ibiostat.be

Interuniversity Institute for Biostatistics and statistical Bioinformatics Belgium, 2 May 2016

Contents

0. About me

1. Introductory material

1.1. Motivation

1.2. What is a data scientist?

1.3. Popularity of data science

2. Sharing experience as former data scientist

2.1. Own experience

2.2. Facebook, Netflix, Google, ...

2.3. Live demo: Emotion recognition

Chapter 0: About me

• Current job: PhD student & teaching assistant in statistics @ KUL

• Previous job: Data science consultant @ Keyrus

• Academic background:

• Master in Statistics (@ KUL)

• Bachelor in Mathematics (@ UHasselt)

• Member of Filii Lamberti

Part 1:

Introduction and motivation

Chapter 1: Introductory material

• Motivation

• What is a data scientist?

• Popularity of data science

1.1 Motivation

1.2 What is a data scientist?

• Profile of a data scientist:

• What exactly do data scientists do?

1. Take over the world

2. Try to become president

3. Learn to speak to girls (if it is a male data scientist)

4. Find patterns in big data structures

5. Develop machine learning algorithms to perform predictive analytics

6. Visualize their results (in a clear way) for a non-technical audience

1.3 Popularity of data science

• Data scientists are hot in the industry:

• Companies pay lots of money for them:

Part 2:

Sharing experience as former data scientist

Chapter 2: Data science in practice

• Different real-life data science projects

• Own experience

• Facebook, Netflix, Google, ...

• Live demo: Emotion recognition

2.1 Own experience

• Involved in different data science projects

1. Sentiment analysis (large beer company)

2. Job-CV matching & salary prediction (HR company)

3. Churn prediction (shoe company)

4. ...

• Sentiment analysis (large beer company):

• Main goal: Find out what people think about their beer, based on social media, blogs, forums, etc.

• Strategy: Over 300 sites were scraped and parsed, then analyzed using custom and existing dictionaries

• Outcome:

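The dictionary-based strategy above can be illustrated with a minimal sketch. The word lists, the negation rule and the example posts are all hypothetical; the slides do not say which dictionaries or scoring rules the project used, and the scraping and parsing steps are left out.

```python
import re

# Hypothetical mini-dictionaries; the lexicons used in the project are not given.
POSITIVE = {"great", "tasty", "refreshing", "love", "smooth"}
NEGATIVE = {"bitter", "flat", "awful", "hate", "watery"}
NEGATIONS = {"not", "never", "no"}


def sentiment_score(text):
    """Count positive minus negative dictionary hits in a text.

    A hit directly preceded by a negation word has its polarity flipped
    (an assumed, deliberately simple rule).
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    return score


def label(text):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up posts standing in for scraped and parsed site content.
posts = [
    "Love this beer, so smooth and refreshing!",
    "Tasted flat and watery. Awful.",
    "Not great, not awful.",
]
for post in posts:
    print(label(post), "|", post)
```

In practice the dictionaries would be far larger and domain-specific, which is presumably why custom dictionaries were combined with existing ones.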

• Job-CV matching (HR company):

• Main goal: Develop an algorithm that automatically recommends the five best curricula vitae (CVs) for a given job description

• Strategy: A neural network was developed in combination with text mining principles and recommender systems

• Outcome:

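To make the matching idea concrete, here is a rough sketch of the text mining part only. It does not reproduce the neural network mentioned above: it swaps in a plain TF-IDF weighting with cosine similarity to rank CVs against a job description. The function names, the example job and the example CVs are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Map each document to a {term: tf-idf weight} dictionary."""
    token_lists = [tokenize(doc) for doc in docs]
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dictionaries."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_cvs(job_description, cvs, k=5):
    """Return the k (cv_id, score) pairs most similar to the job description."""
    ids = list(cvs)
    vectors = tfidf_vectors([job_description] + [cvs[cv_id] for cv_id in ids])
    job_vec, cv_vecs = vectors[0], vectors[1:]
    scored = [(cv_id, cosine(job_vec, vec)) for cv_id, vec in zip(ids, cv_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up job description and CVs for illustration.
job = "Data scientist with Python, statistics and machine learning experience"
cvs = {
    "cv_a": "Statistician with Python and machine learning experience in churn models",
    "cv_b": "Truck driver with ten years of logistics experience",
    "cv_c": "Java developer building web applications",
}
for cv_id, score in top_cvs(job, cvs, k=2):
    print(cv_id, round(score, 3))
```

A learned model could replace the cosine score with a match probability, while the top-k ranking step would stay the same.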

• Churn prediction (shoe company):

• Main goal: Predict when online customers will churn, so that promotions can be sent to them

• Strategy: Customer segmentation (k-means clustering) combined with survival analysis to obtain the probability of churning per group (over time)

• Outcome:

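The survival analysis step can be sketched as follows, assuming each customer already carries a segment label (for instance from k-means, which is not shown here). The sketch uses the Kaplan-Meier estimator as one common choice; the slides do not state which survival method was used, and the segment names and customer records are made up.

```python
from collections import defaultdict


def kaplan_meier(durations, churned):
    """Kaplan-Meier estimate of the probability of not having churned yet.

    durations: observed time per customer (e.g. months since first order)
    churned:   True if the customer churned at that time, False if the
               customer was still active when observation stopped (censored)
    Returns a list of (time, survival probability) at each churn time.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, c in zip(durations, churned) if d == t and c)
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
    return curve


def churn_curves(customers):
    """Estimate one survival curve per segment.

    customers: iterable of (segment, duration, churned) tuples.
    """
    by_segment = defaultdict(lambda: ([], []))
    for segment, duration, churned in customers:
        by_segment[segment][0].append(duration)
        by_segment[segment][1].append(churned)
    return {seg: kaplan_meier(d, c) for seg, (d, c) in by_segment.items()}


# Hypothetical records: (segment, months observed, churned?).
customers = [
    ("bargain", 2, True), ("bargain", 3, True),
    ("bargain", 3, False), ("bargain", 5, True),
    ("loyal", 6, False), ("loyal", 8, True),
    ("loyal", 12, False), ("loyal", 12, False),
]
for segment, curve in churn_curves(customers).items():
    print(segment, [(t, round(s, 2)) for t, s in curve])
```

The chance of having churned by time t in a group is one minus the survival probability, which is the per-group quantity a promotion could be timed against.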

2.2 Facebook, Netflix, Google, ...

• Companies like Facebook, Google, Netflix, etc. invest a lot in data science

• Research nowadays at Facebook: Let blind people interact with visual content with the use of data science

• Research nowadays at Netflix & IMDb: Netflix, IMDb and many others use data science to develop recommender systems, based on a user's previous search results, to improve the user experience

• Research nowadays at Google: Construct self-driving cars with the use of data science

• Research nowadays at EA Sports & Sony: Games at EA Sports & Sony are now designed using machine learning algorithms which improve/upgrade themselves as the player moves up to a higher level. In motion gaming, too, your (computer) opponent analyzes your previous moves and shapes its game accordingly

• Research nowadays at FedEx & UPS: Using data science, FedEx and UPS have discovered the best routes to ship, the best-suited times to deliver, and the best modes of transport to choose, leading to cost efficiency, among many other benefits

2.3 Live demo: Emotion recognition

Live demo (optional)

The End
