Introduction to Big Data

Post on 22-Nov-2014

Big data burst upon the scene in the first decade of the 21st century, and the first organizations to embrace it were online and startup firms. Arguably, firms like Google, eBay, LinkedIn, and Facebook were built around big data from the beginning. Here is a brief introduction to what Big Data entails and how it could affect businesses today.

Transcript of Introduction to Big Data

eMOT | MG 8783: Cloud Computing 2

Agenda

• What is Big Data
• Example of Big Data
• Drivers of Big Data: HIPO vs “Geeks”
• Potential of Big Data

Gan, Jeremy

What is Big Data?

• Three V’s of Big Data
  – Volume
  – Velocity
  – Variety

VOLUME: HOW MUCH DATA?

Volume: How Much Data?

• Kilo- : 10^3 bytes
• Mega- : 10^6 bytes
• Giga- : 10^9 bytes
• Tera- : 10^12 bytes
• Peta- : 10^15 bytes
• Exa- : 10^18 bytes
• Zetta- : 10^21 bytes
• Yotta- : 10^24 bytes
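
To make the scale concrete, here is a minimal Python sketch that turns a raw byte count into the largest fitting standard SI decimal prefix. The helper name `humanize_bytes` and the sample value are illustrative assumptions, not part of the original slides.

```python
# Standard SI decimal prefixes, largest first, keyed by power of ten.
SI_PREFIXES = [
    (24, "yotta"), (21, "zetta"), (18, "exa"), (15, "peta"),
    (12, "tera"), (9, "giga"), (6, "mega"), (3, "kilo"),
]


def humanize_bytes(n_bytes):
    """Format a byte count with the largest SI prefix that fits.

    Hypothetical helper for illustration only.
    """
    for exponent, name in SI_PREFIXES:
        if n_bytes >= 10 ** exponent:
            return f"{n_bytes / 10 ** exponent:.2f} {name}bytes"
    return f"{n_bytes} bytes"


# Sample value chosen for illustration: 1.5 * 10^12 bytes.
print(humanize_bytes(1_500_000_000_000))  # prints "1.50 terabytes"
```

Each step up the list is a factor of 1,000, so a zettabyte is a billion terabytes.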

As of 2013

Volume: How Much Data? (cont.)

HELLA- (~10^27 bytes), aka “HELLUVA-”

Volume: How Much Data? (cont.)

If we were to take all that information and store it in books, we could cover the entire area of the US or China in 3 layers of books.

Martin Hilbert, Researcher, USC

VELOCITY: IMMEDIATE & REACTIVE (REAL-TIME DATA ANALYSIS)

NYSE collects over 1 TB of trade info EACH session
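
As a rough illustration of what that velocity means, the sketch below converts 1 TB per session into an average sustained data rate. The 6.5-hour session length (a regular trading day) and the function name `average_rate_mb_per_s` are assumptions introduced here; the slide only gives the 1 TB figure.

```python
TB = 10 ** 12  # one terabyte in bytes (decimal SI definition)
MB = 10 ** 6   # one megabyte in bytes

# Assumption: a regular trading session lasts 6.5 hours.
SESSION_HOURS = 6.5


def average_rate_mb_per_s(total_bytes, hours):
    """Average throughput in MB/s if the data arrived evenly over the session."""
    return total_bytes / (hours * 3600) / MB


rate = average_rate_mb_per_s(1 * TB, SESSION_HOURS)
print(f"{rate:.1f} MB/s")  # prints "42.7 MB/s"
```

Real trade traffic is bursty, so peak rates would sit well above this average.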

Modern cars have over a HUNDRED sensors

Google Wallet Debit Card

iOS7 Location Tracking Map

NBC The Voice #InstantSave

Wasabi Waiter

VARIETY: DATA IN WHAT FORM?

Tweets

Facebook: Likes

Facebook: Mouse Cursor Tracking

Apple iBeacon

Variety: Data In What Form?

• Goal
  – Identify patterns
  – Gain insights
• Why?
  – Combine big data with traditional data to better understand pain points
  – Mitigate/limit negative impact
  – Increase/create revenue stream

THREE V’S + 1 = VERACITY

Role of Data Scientist

• Keep data organized and accurate
• Poor data management costs the U.S. economy roughly $3.1 trillion/year

Role of Data Scientist (cont.)

• Data used correctly could unlock limitless potential
  – Prevent disease
  – Combat crime
  – Revolutionize global R&D
  – Disrupt conventional business models
  – Challenge the HIPO’s gut instincts

DRIVERS OF BIG DATA: HIPO VS “GEEKS” (EXAMPLE)

2012 Presidential Election

President Barack Obama
Gov. Mitt Romney

HIPO vs Geek

Michael Slaby, CTO, OFA 2008
Harper Reed, CTO, OFA 2012

Breakdown

• Innovative solution by leveraging big data
  – Facebook information
    • Personal interest: Preferences
    • Location: Hyper-local, better content distribution
    • Relevant: Contact efficiency
  – Push innovation into sales by using data to have a conversation
  – Twitter
    • DM via the President’s and First Lady’s Twitter accounts

Result

POTENTIAL OF BIG DATA

Limitless

• Research by McKinsey in Jan 2013
  – Companies using large-scale big data to shape corporate strategy
    • Example:
      – IBM acquiring Kenexa Corp.
        » Cloud (SaaS foundation) + big data (market insights)
        » Remove “guesswork,” replacing it with precision
          • Hiring: utilize behavioral traits
• Research by Harvard School of Public Health
  – Big data could effectively prevent TB and reduce health care costs

Harper’s Thought On Healthcare.gov

Source: NYT.com
