Page 1: Introduction to Big Data
Page 2: Introduction to Big Data

eMOT | MG 8783: Cloud Computing 2

Agenda

• What is Big Data
• Example of Big Data
• Drivers of Big Data: HIPO vs “Geeks”
• Potential of Big Data

Gan, Jeremy

Page 3: Introduction to Big Data

What is Big Data?

• Three V’s of Big Data
  – Volume
  – Velocity
  – Variety

Page 4: Introduction to Big Data

VOLUME: HOW MUCH DATA?

Page 5: Introduction to Big Data

Volume: How Much Data?

• Kilo- : 10^3 bytes
• Mega- : 10^6 bytes
• Giga- : 10^9 bytes
• Tera- : 10^12 bytes
• Peta- : 10^15 bytes
• Exa- : 10^18 bytes
• Zetta- : 10^21 bytes
• Yotta- : 10^24 bytes
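
As a rough illustration of how these decimal prefixes scale, the sketch below turns a raw byte count into a readable size. The helper name `describe_bytes` and the `SI_PREFIXES` list are hypothetical, chosen for this example only; it assumes powers of 1000 (giga = 10^9, tera = 10^12), not the binary powers of 1024 that some tools report.

```python
# Decimal (SI) prefixes in ascending order; each step is a factor of 1000.
SI_PREFIXES = ["", "kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def describe_bytes(n_bytes: int) -> str:
    """Describe a byte count with the largest fitting decimal prefix.

    Hypothetical helper for illustration; not part of any library.
    """
    if n_bytes < 0:
        raise ValueError("byte count must be non-negative")
    # Integer comparisons avoid floating-point drift for very large counts.
    index = 0
    while index < len(SI_PREFIXES) - 1 and n_bytes >= 1000 ** (index + 1):
        index += 1
    value = n_bytes / 1000 ** index
    unit = f"{SI_PREFIXES[index]}byte"
    return f"{value:g} {unit}" + ("" if value == 1 else "s")


if __name__ == "__main__":
    for count in (500, 2_500_000, 10**9, 10**12, 10**24):
        print(count, "->", describe_bytes(count))
```

Counts beyond the yotta range are simply reported in yottabytes, since the list stops at 10^24.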

Page 6: Introduction to Big Data

Volume: How Much Data?

• Kilo- : 10^3 bytes
• Mega- : 10^6 bytes
• Giga- : 10^9 bytes
• Tera- : 10^12 bytes
• Peta- : 10^15 bytes
• Exa- : 10^18 bytes
• Zetta- : 10^21 bytes
• Yotta- : 10^24 bytes

As of 2013

Page 7: Introduction to Big Data

Volume: How Much Data? (cont.)

HELLA- (~10^27 bytes)

aka

“HELLUVA-”

Page 8: Introduction to Big Data

Volume: How Much Data? (cont.)

If we were to take all that information and store it in books, we could cover the entire area of the US or China in 3 layers of books.

Martin Hilbert, Researcher, USC

Page 9: Introduction to Big Data

VELOCITY: IMMEDIATE & REACTIVE (REAL-TIME DATA ANALYSIS)

Page 10: Introduction to Big Data

NYSE collects over 1 TB of trade info EACH session
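
To put that figure in velocity terms, here is a back-of-the-envelope sketch of the average data rate it implies. Two assumptions are not from the slide: 1 TB is taken as 10^12 bytes, and a session is taken as the 6.5-hour core trading day (9:30 a.m. to 4:00 p.m. ET). The function name is hypothetical, and real traffic is bursty, so peak rates would be well above this average.

```python
# Assumptions (not from the slide): 1 TB = 10**12 bytes, and one core
# trading session lasts 6.5 hours (9:30 a.m. to 4:00 p.m. ET).
BYTES_PER_SESSION = 10**12
SESSION_SECONDS = 6.5 * 3600  # 23,400 seconds


def average_rate_mb_per_s(total_bytes: float, seconds: float) -> float:
    """Average throughput in megabytes (10**6 bytes) per second."""
    return total_bytes / seconds / 10**6


rate = average_rate_mb_per_s(BYTES_PER_SESSION, SESSION_SECONDS)
print(f"Average rate: about {rate:.1f} MB/s")  # roughly 42.7 MB/s
```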

Page 11: Introduction to Big Data

Modern cars have over a HUNDRED sensors

Page 12: Introduction to Big Data

Google Wallet Debit Card

Page 13: Introduction to Big Data

iOS7 Location Tracking Map

Page 14: Introduction to Big Data

NBC The Voice #InstantSave

Page 15: Introduction to Big Data

Wasabi Waiter

Page 16: Introduction to Big Data

VARIETY: DATA IN WHAT FORM?

Page 17: Introduction to Big Data

Tweets

Page 18: Introduction to Big Data

Facebook: Likes

Page 19: Introduction to Big Data

Facebook: Mouse Cursor Tracking

Page 20: Introduction to Big Data

Apple iBeacon

Page 21: Introduction to Big Data

Variety: Data In What Form?

• Goal
  – Identify patterns
  – Gain insights
• Why?
  – Combine big data with traditional data to better understand pain points
  – Mitigate/limit negative impact
  – Increase/create revenue stream

Page 22: Introduction to Big Data

THREE V’S + 1: VERACITY

Page 23: Introduction to Big Data

Role of Data Scientist

• Keep data organized and accurate
• Poor data management costs the U.S. economy roughly $3.1 trillion/year

Page 24: Introduction to Big Data

Role of Data Scientist (cont.)

• Data used correctly could unlock limitless potential
  – Prevent disease
  – Combat crime
  – Revolutionize global R&D
  – Disrupt conventional business models
  – Challenge HIPO’s guts

Page 25: Introduction to Big Data

Role of Data Scientist (cont.)

• Data used correctly could unlock limitless potential
  – Prevent disease
  – Combat crime
  – Revolutionize global R&D
  – Disrupt conventional business models
  – Challenge HIPO’s guts

Page 26: Introduction to Big Data

DRIVERS OF BIG DATA: HIPO VS “GEEKS” (EXAMPLE)

Page 27: Introduction to Big Data

2012 Presidential Election

President Barack Obama vs. Gov. Mitt Romney

Page 28: Introduction to Big Data

HIPO vs Geek

Michael Slaby, CTO, OFA 2008
Harper Reed, CTO, OFA 2012

Page 29: Introduction to Big Data

Breakdown

• Innovative solution by leveraging big data
  – Facebook information
    • Personal interest: Preferences
    • Location: Hyper-local, better content distribution
    • Relevant: Contact efficiency
  – Push innovation into sales by using data to have a conversation
  – Twitter
    • DM via President and First Lady’s Twitter accounts

Page 30: Introduction to Big Data

Result

Page 31: Introduction to Big Data

POTENTIAL OF BIG DATA

Page 32: Introduction to Big Data

Limitless

• Research by McKinsey in Jan 2013
  – Companies using large-scale big data to shape corporate strategy
    • Example:
      – IBM acquiring Kenexa Corp.
        » Cloud (SaaS foundation) + big data (market insights)
        » Remove “guesswork,” replacing it with precision
    • Hiring: utilize behavioral traits
• Research by Harvard School of Public Health
  – Big data could effectively prevent TB and reduce health care costs

Page 33: Introduction to Big Data

Harper’s Thought On Healthcare.gov

Source: NYT.com

Page 34: Introduction to Big Data
