Business DataWarehouse_Big Data

Big Data MIS 6309 Business Data Warehousing Fall 2014 -GROUP 6

Transcript of Business DataWarehouse_Big Data

Page 1: Business DataWarehouse_Big Data

Big Data

MIS 6309 Business Data Warehousing Fall 2014 -GROUP 6

Page 2: Business DataWarehouse_Big Data

What is Big Data?

Page 3: Business DataWarehouse_Big Data

‘Data’: The New Oil of the Information Revolution

Page 4: Business DataWarehouse_Big Data

‘Data’: The New Oil of the Information Revolution

Page 5: Business DataWarehouse_Big Data

What makes it ‘Big’ Data?

Page 6: Business DataWarehouse_Big Data

Volume

Page 7: Business DataWarehouse_Big Data

Velocity

Page 8: Business DataWarehouse_Big Data

Variety

Page 9: Business DataWarehouse_Big Data

Hadoop

Page 10: Business DataWarehouse_Big Data

HDFS

Page 11: Business DataWarehouse_Big Data

HDFS

Page 12: Business DataWarehouse_Big Data

MapReduce

Page 13: Business DataWarehouse_Big Data

MapReduce

Page 14: Business DataWarehouse_Big Data

Processing Logs

Key = index.php, Value = 1
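The key/value pair on this slide (a page name as the key, a count of 1 as the value) can be illustrated with a minimal in-memory sketch of the map, shuffle, and reduce steps. This is not Hadoop code: the log-line format, the sample lines, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical access-log lines; the "IP METHOD PATH STATUS" layout is assumed.
LOG_LINES = [
    "10.0.0.1 GET /index.php 200",
    "10.0.0.2 GET /about.php 200",
    "10.0.0.3 GET /index.php 200",
]


def map_phase(line):
    """Emit one (page, 1) pair per log line, e.g. ("index.php", 1)."""
    page = line.split()[2].lstrip("/")
    yield page, 1


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the 1s emitted for a single page."""
    return key, sum(values)


def count_hits(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


hits = count_hits(LOG_LINES)
print(hits)
```

In a real cluster the map and reduce functions would run in parallel on different nodes over HDFS blocks; here everything runs in one process to keep the data flow visible.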

Page 15: Business DataWarehouse_Big Data

Hadoop Ecosystem

Page 16: Business DataWarehouse_Big Data

Big Data Landscape

Page 17: Business DataWarehouse_Big Data

What’s in store for us?

• More jobs

• More opportunities

• More Money!

Page 18: Business DataWarehouse_Big Data

Big Data Landscape

Page 19: Business DataWarehouse_Big Data

Big Data Landscape

Page 20: Business DataWarehouse_Big Data

Big Data Landscape

Page 21: Business DataWarehouse_Big Data

Sectors Using Big Data

Enhancing the Multichannel Consumer experience:

• Use big data to integrate promotions and pricing for shoppers seamlessly, whether those consumers are online, in-store, or perusing a catalog.

• Integrate customer databases with household information such as income, housing values, and number of children, and use it to create different versions of catalogs and other materials attuned to the behavior and preferences of different customer groups.

Page 22: Business DataWarehouse_Big Data

Big Data Revenue

Page 23: Business DataWarehouse_Big Data

Increased Efficiency

Page 24: Business DataWarehouse_Big Data

Current Limitations for Big Data Analytics

• Meeting the need for speed

• Understanding the data

• Addressing data quality

• Displaying meaningful results

• Big data skills are in short supply

Page 25: Business DataWarehouse_Big Data

Problems & Threats – Big Data

• Privacy breaches and embarrassments

• Anonymization could become impossible

• Data masking could be defeated to reveal personal information

• Unethical actions based on interpretations

• Big data analytics are not 100% accurate

• Discrimination

• Few (if any) legal protections exist for the involved individuals

• Big data will probably exist forever

• Concerns for e-discovery

• Making patents and copyrights irrelevant

Page 26: Business DataWarehouse_Big Data

Case Studies – Recent Data Breaches

• Target breach, in which 40 million credit and debit accounts were compromised over a three-week period; the company lost $148 million.

• JP Morgan reported that 76 million households and 7 million small businesses were exposed in a data breach.

• Customer names, addresses, phone numbers and e-mail addresses were taken

• Hackers also obtained internal data identifying customers by category, such as whether they are clients of the private-bank, mortgage, auto or credit-card divisions, said a person briefed on the matter.

• Third party – External Data - News: Banks turn to Facebook and Twitter to keep track of education loan takers

Page 27: Business DataWarehouse_Big Data

Data-Big or small

Thinking Dimensionally

Sentiment_Analysis Table
• Sentiment_ID (e.g., 1, 2, 3)
• Sentiment_description (e.g., Wow, Awesome, Crap)
• Customer_ID
• Product_ID

Dim_Customer
• Customer_ID
• Customer_Name
• Gender
• Age

Dim_Product
• Product_ID
• Product_Name
• Category
• Product_Description
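The three tables on this slide form a small star schema: Sentiment_Analysis acts as the fact table and references Dim_Customer and Dim_Product. A minimal sketch using Python's built-in sqlite3 module follows. The table and column names come from the slide; the column types, the sample rows, and the query are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Dim_Customer (
        Customer_ID   INTEGER PRIMARY KEY,
        Customer_Name TEXT,
        Gender        TEXT,
        Age           INTEGER
    );
    CREATE TABLE Dim_Product (
        Product_ID          INTEGER PRIMARY KEY,
        Product_Name        TEXT,
        Category            TEXT,
        Product_Description TEXT
    );
    CREATE TABLE Sentiment_Analysis (
        Sentiment_ID          INTEGER PRIMARY KEY,
        Sentiment_description TEXT,
        Customer_ID INTEGER REFERENCES Dim_Customer(Customer_ID),
        Product_ID  INTEGER REFERENCES Dim_Product(Product_ID)
    );
    """
)

# Hypothetical sample rows, not taken from the slides.
conn.executemany(
    "INSERT INTO Dim_Customer VALUES (?, ?, ?, ?)",
    [(1, "Customer A", "F", 30), (2, "Customer B", "M", 41)],
)
conn.execute(
    "INSERT INTO Dim_Product VALUES (10, 'Phone X', 'Electronics', 'Smartphone')"
)
conn.executemany(
    "INSERT INTO Sentiment_Analysis VALUES (?, ?, ?, ?)",
    [(1, "Wow", 1, 10), (2, "Awesome", 2, 10), (3, "Crap", 1, 10)],
)

# Slice the sentiment facts by product and customer gender.
rows = conn.execute(
    """
    SELECT p.Product_Name, c.Gender, COUNT(*) AS mentions
    FROM Sentiment_Analysis AS s
    JOIN Dim_Customer AS c ON c.Customer_ID = s.Customer_ID
    JOIN Dim_Product  AS p ON p.Product_ID  = s.Product_ID
    GROUP BY p.Product_Name, c.Gender
    ORDER BY c.Gender
    """
).fetchall()
print(rows)
```

Keeping descriptive attributes in the dimension tables means the same sentiment facts can be grouped by age, category, or any other attribute without changing the fact table.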

Page 28: Business DataWarehouse_Big Data

Conformed Dimensions

Online_Customer Table

Customer Name     Location
Avadhoot Patil    Dallas

Store_Customer Table

Customer Name     Location
Ankur Kaushik     Dallas

Sort and Merge

Customer Name     Location
Avadhoot Patil    Dallas
Ankur Kaushik     Dallas
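The sort-and-merge step on this slide can be sketched as follows: rows from both customer tables are normalized to one shared definition, de-duplicated, and sorted into a single conformed customer dimension. The `conform` and `sort_and_merge` helpers, the normalization rules (whitespace and capitalization only), and the extra duplicate row are assumptions for illustration.

```python
def conform(name, location):
    """Normalize a row so that both sources use the same representation."""
    clean_name = " ".join(name.split()).title()
    clean_location = location.strip().title()
    return clean_name, clean_location


def sort_and_merge(*sources):
    """Union the conformed rows from every source, then sort them."""
    merged = {conform(name, loc) for source in sources for name, loc in source}
    return sorted(merged)


# Rows modeled on the slide; the messy casing and the repeated
# customer in the store table are hypothetical.
online_customers = [("Avadhoot Patil", "Dallas")]
store_customers = [("ankur kaushik ", "dallas"), ("Avadhoot Patil", "Dallas")]

dim_customer = sort_and_merge(online_customers, store_customers)
for row in dim_customer:
    print(row)
```

This sketch only handles formatting differences. Misspellings in a source would still produce separate rows and would need a matching step beyond what is shown here.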

Page 29: Business DataWarehouse_Big Data

Selecting Keys

• Anchor Dimensions with Durable Surrogate Keys

Key terms: natural keys, durable surrogate keys, slowly changing dimension

Airport Data_source

Airport Name    City      Country
ABC             Dallas    USA

Datawarehouse System

Airport_ID    Airport Name    City      Country
1001          ABC             Dallas    USA
1002          XYZ             Dallas    USA
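One way to picture the move from the source table to the warehouse table is a small registry that hands out a durable surrogate key the first time a natural key is seen and returns the same key on every later load. The `SurrogateKeyRegistry` class is hypothetical, and treating the airport name as the natural key is an assumption; the starting value 1001 mirrors the Airport_ID values on the slide.

```python
from itertools import count


class SurrogateKeyRegistry:
    """Assigns a stable integer key to each natural key it encounters."""

    def __init__(self, start=1001):
        self._next_id = count(start)
        self._by_natural_key = {}

    def key_for(self, natural_key):
        # Reuse the existing surrogate key so it stays durable across loads.
        if natural_key not in self._by_natural_key:
            self._by_natural_key[natural_key] = next(self._next_id)
        return self._by_natural_key[natural_key]


registry = SurrogateKeyRegistry()

# Source rows as (Airport Name, City, Country); the repeat of ABC is hypothetical.
source_rows = [
    ("ABC", "Dallas", "USA"),
    ("XYZ", "Dallas", "USA"),
    ("ABC", "Dallas", "USA"),
]

dim_airport = {}
for name, city, country in source_rows:
    airport_id = registry.key_for(name)  # assumes the name is the natural key
    dim_airport[airport_id] = {"Airport_Name": name, "City": city, "Country": country}

for airport_id, attrs in sorted(dim_airport.items()):
    print(airport_id, attrs)
```

Because facts reference the surrogate key rather than the source's own identifier, the warehouse row keeps its identity even if the source system later renames or re-keys the airport.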

Page 30: Business DataWarehouse_Big Data

Governance

Dimensionalize data before applying governance

Dimensionalize data as early as possible in the data pipeline

Parse, Match, Identify

Resolution on the Fly

Page 31: Business DataWarehouse_Big Data

Privacy

• Privacy is the Most Important Governance Perspective

For most forms of analysis, personal details should be masked.

Data should be aggregated enough not to allow identification of individuals.

Data should be masked or encrypted on write, or masked on read.
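The two privacy measures named on this slide, masking on write and aggregating enough to prevent identification, can be sketched with Python's standard library. The record layout, the `SECRET_KEY` placeholder, the 16-character truncation, and the minimum group size of 2 are all assumptions for illustration. A keyed hash is a form of pseudonymization and does not by itself guarantee anonymity.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical placeholder; a real key would come from a secrets manager.
SECRET_KEY = b"replace-with-a-managed-secret"


def mask(value):
    """Replace a personal value with a keyed hash (masking on write)."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def write_record(record):
    """Return a copy of the record with the customer name masked."""
    masked = dict(record)
    masked["Customer_Name"] = mask(record["Customer_Name"])
    return masked


def aggregate_by_city(records, min_group_size=2):
    """Count records per city, suppressing groups that are too small."""
    counts = Counter(r["City"] for r in records)
    return {city: n for city, n in counts.items() if n >= min_group_size}


# Hypothetical input records.
raw = [
    {"Customer_Name": "Customer A", "City": "Dallas"},
    {"Customer_Name": "Customer B", "City": "Dallas"},
    {"Customer_Name": "Customer C", "City": "Austin"},
]

stored = [write_record(r) for r in raw]
summary = aggregate_by_city(stored)
print(summary)
```

The same masked value is produced for the same input, so masked records can still be joined and counted, while the single-customer Austin group is left out of the summary.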

Page 32: Business DataWarehouse_Big Data

THANK YOU!
