Introduction to Big Data


Big data burst upon the scene in the first decade of the 21st century, and the first organizations to embrace it were online and startup firms. Arguably, firms like Google, eBay, LinkedIn, and Facebook were built around big data from the beginning. Here is a brief introduction to what Big Data entails and how it could affect businesses today.

Transcript of Introduction to Big Data

Page 1: Introduction to Big Data
Page 2: Introduction to Big Data

eMOT | MG 8783: Cloud Computing 2

Agenda

• What is Big Data
• Example of Big Data
• Drivers of Big Data: HIPO vs “Geeks”
• Potential of Big Data

Gan, Jeremy

Page 3: Introduction to Big Data


What is Big Data?

• Three V’s of Big Data
  – Volume
  – Velocity
  – Variety

Page 4: Introduction to Big Data

VOLUME: HOW MUCH DATA?

Page 5: Introduction to Big Data

Volume: How Much Data?

• Kilo- : 10^3 bytes
• Mega- : 10^6 bytes
• Giga- : 10^9 bytes
• Tera- : 10^12 bytes
• Peta- : 10^15 bytes
• Exa- : 10^18 bytes
• Zetta- : 10^21 bytes
• Yotta- : 10^24 bytes
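As a rough illustration of the decimal prefixes listed on this slide, the Python sketch below turns a raw byte count into a value with the largest prefix that fits. The names `SI_SYMBOLS` and `describe_bytes` are hypothetical and not part of the original deck; the sketch assumes decimal (power-of-1000) prefixes rather than binary (power-of-1024) ones.

```python
# Decimal (SI) byte unit symbols, in ascending order: 10^0, 10^3, ..., 10^24.
# These names are illustrative only and do not come from the slides.
SI_SYMBOLS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def describe_bytes(n_bytes: int) -> str:
    """Format a non-negative byte count using the largest SI prefix that fits."""
    if n_bytes < 0:
        raise ValueError("byte count must be non-negative")
    index = 0
    # Integer comparisons avoid floating-point rounding when picking the prefix.
    while index < len(SI_SYMBOLS) - 1 and n_bytes >= 1000 ** (index + 1):
        index += 1
    value = n_bytes / 1000 ** index
    return f"{value:g} {SI_SYMBOLS[index]}"


if __name__ == "__main__":
    for exponent in (3, 6, 9, 12, 15, 18, 21, 24):
        print(f"10^{exponent} bytes = {describe_bytes(10 ** exponent)}")
```

Anything larger than a yottabyte is simply reported in YB, since yotta- is the largest prefix on the list above.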

Page 6: Introduction to Big Data

Volume: How Much Data?

• Kilo- : 10^3 bytes
• Mega- : 10^6 bytes
• Giga- : 10^9 bytes
• Tera- : 10^12 bytes
• Peta- : 10^15 bytes
• Exa- : 10^18 bytes
• Zetta- : 10^21 bytes
• Yotta- : 10^24 bytes

As of 2013

Page 7: Introduction to Big Data

Volume: How Much Data? (cont.)

HELLA- (~10^27 bytes; an informally proposed prefix, not an official SI one), aka “HELLUVA-”

Page 8: Introduction to Big Data


Volume: How Much Data? (cont.)

If we were to take all that information and store it in books, we could cover the entire area of the US or China in 3 layers of books.

Martin Hilbert, Researcher, USC

Page 9: Introduction to Big Data

VELOCITY: IMMEDIATE & REACTIVE (REAL-TIME DATA ANALYSIS)

Page 10: Introduction to Big Data

NYSE collects over 1 TB of trade info EACH session
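To put that figure in velocity terms, the short Python sketch below converts 1 TB per session into an average data rate. It assumes a regular session of 6.5 hours (9:30 a.m. to 4:00 p.m.), which is not stated on the slide, and it treats 1 TB as 10^12 bytes; since the slide says "over 1 TB", the result is only a lower bound on the average, and real traffic is likely far burstier. The names used are hypothetical.

```python
# Assumption (not from the slide): a regular trading session lasts 6.5 hours.
SESSION_SECONDS = 6.5 * 3600

# The slide's "over 1 TB" per session, taken as a lower bound of 10^12 bytes.
BYTES_PER_SESSION = 10 ** 12


def average_rate_mb_per_s(total_bytes: float, seconds: float) -> float:
    """Average throughput in megabytes (10^6 bytes) per second."""
    return total_bytes / seconds / 10 ** 6


if __name__ == "__main__":
    rate = average_rate_mb_per_s(BYTES_PER_SESSION, SESSION_SECONDS)
    print(f"At least {rate:.1f} MB/s on average")
```

Under these assumptions the average works out to a little under 43 MB of trade data every second of the session.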

Page 11: Introduction to Big Data

Modern cars have over a HUNDRED sensors

Page 12: Introduction to Big Data

Google Wallet Debit Card

Page 13: Introduction to Big Data

iOS7 Location Tracking Map

Page 14: Introduction to Big Data

NBC The Voice #InstantSave

Page 15: Introduction to Big Data

Wasabi Waiter

Page 16: Introduction to Big Data

VARIETY: DATA IN WHAT FORM?

Page 17: Introduction to Big Data

Tweets

Page 18: Introduction to Big Data

Facebook: Likes

Page 19: Introduction to Big Data

Facebook: Mouse Cursor Tracking

Page 20: Introduction to Big Data

Apple iBeacon

Page 21: Introduction to Big Data

Variety: Data In What Form?

• Goal
  – Identify patterns
  – Gain insights
• Why?
  – Combine big data with traditional data to better understand pain points
  – Mitigate/limit negative impact
  – Increase/create revenue stream

Page 22: Introduction to Big Data

THREE V’S + 1 = VERACITY

Page 23: Introduction to Big Data

Role of Data Scientist

• Keep data organized and accurate
• Poor data management costs the U.S. economy roughly $3.1 trillion per year

Page 24: Introduction to Big Data

Role of Data Scientist (cont.)

• Data used correctly could unlock limitless potential
  – Prevent disease
  – Combat crime
  – Revolutionize global R&D
  – Disrupt conventional business models
  – Challenge HIPO’s guts

Page 25: Introduction to Big Data

Role of Data Scientist (cont.)

• Data used correctly could unlock limitless potential
  – Prevent disease
  – Combat crime
  – Revolutionize global R&D
  – Disrupt conventional business models
  – Challenge HIPO’s guts

Page 26: Introduction to Big Data

DRIVERS OF BIG DATA: HIPO VS “GEEKS” (EXAMPLE)

Page 27: Introduction to Big Data

2012 Presidential Election

President Barack Obama
Gov. Mitt Romney

Page 28: Introduction to Big Data

HIPO vs Geek

Michael Slaby, CTO, OFA 2008
Harper Reed, CTO, OFA 2012

Page 29: Introduction to Big Data

Breakdown

• Innovative solution by leveraging big data
  – Facebook information
    • Personal interest: Preferences
    • Location: Hyper-local, better content distribution
    • Relevant: Contact efficiency
  – Push innovation into sales by using data to have a conversation
  – Twitter
    • DM via President and First Lady’s Twitter accounts

Page 30: Introduction to Big Data

Result

Page 31: Introduction to Big Data

POTENTIAL OF BIG DATA

Page 32: Introduction to Big Data

Limitless

• Research by McKinsey in Jan 2013
  – Companies are using large-scale big data to shape corporate strategy
• Example:
  – IBM acquiring Kenexa Corp.
    » Cloud (SaaS foundation) + big data (market insights)
    » Remove “guesswork”, replacing it with precision
• Hiring
  – Utilize behavioral traits
• Research by Harvard School of Public Health
  – Big data could effectively prevent TB and reduce health care costs

Page 33: Introduction to Big Data


Harper’s Thoughts on Healthcare.gov

Source: NYT.com

Page 34: Introduction to Big Data
