MT30 Best practices for data lake adoption

Post on 17-Feb-2017

MT30

Best practices: Data lake adoption

Matt Maccaux, Global Big Data Practice Lead

Dell - Internal Use - Confidential

Agenda

• Two models for big data

• Big data anti-patterns

• Big data best practice

• How to get started?

• Your questions

Two models for big data

Exploratory analytics

• Full data set – batch

• Explore, test, refine, iterate

• The output is an algorithm that will be integrated into new or existing applications.

Operationalization

• Limited data set – streaming

• The algorithm is integrated into applications that drive business decisions.
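A minimal sketch of how the two models relate, assuming a simple outlier rule stands in for "the algorithm". The function names, the threshold, and the sample values are hypothetical; the deck does not prescribe any particular technique.

```python
from statistics import mean, stdev
from typing import Callable, Iterable, Iterator


def explore(full_data_set: list[float], k: float = 3.0) -> Callable[[float], bool]:
    """Exploratory analytics: a batch pass over the full data set.

    The output is an algorithm (here, an assumed k-sigma outlier rule),
    not a report.
    """
    mu = mean(full_data_set)
    sigma = stdev(full_data_set)

    def is_outlier(value: float) -> bool:
        return abs(value - mu) > k * sigma

    return is_outlier


def operationalize(
    events: Iterable[float], algorithm: Callable[[float], bool]
) -> Iterator[float]:
    """Operationalization: apply the algorithm to events one at a time."""
    for value in events:
        if algorithm(value):
            yield value  # the consuming application would act on this


# Batch step: learn the rule from historical data (made-up values).
history = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 10.2, 9.8]
rule = explore(history)

# Streaming step: only the incoming events are inspected.
flagged = list(operationalize(iter([10.1, 25.0, 9.9]), rule))
```

The exploratory step sees all of the history, while the operational step only ever sees the next event and the rule it was handed.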

Big data anti-patterns

© Copyright 2016 EMC Corporation. All rights reserved.

Big data best practices

CONTINUUM

Analytics Request Portal

• Tool catalog: Hadoop, Spark, Tableau, Python

• Data catalog: Customer, Alert, Bills, Social

• Other labels on the slide: Duration, Performance, Normal, Sample data, Non-sample data
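One way to picture a request made through such a portal is sketched below. The catalog entries are the labels visible on the slide; the request fields, the day-based duration, and the validation rules are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Catalog entries taken from the slide labels.
TOOL_CATALOG = {"Hadoop", "Spark", "Tableau", "Python"}
DATA_CATALOG = {"Customer", "Alert", "Bills", "Social"}


@dataclass
class AnalyticsRequest:
    """Hypothetical shape of a self-service analytics request."""

    tools: set[str]
    data_sets: set[str]
    duration_days: int        # assumed unit; the slide only says "Duration"
    sample_data: bool = True  # sample vs. non-sample data

    def problems(self) -> list[str]:
        """Return reasons the request cannot be provisioned (empty if valid)."""
        issues = []
        for tool in sorted(self.tools - TOOL_CATALOG):
            issues.append(f"unknown tool: {tool}")
        for name in sorted(self.data_sets - DATA_CATALOG):
            issues.append(f"unknown data set: {name}")
        if self.duration_days <= 0:
            issues.append("duration must be positive")
        return issues


ok = AnalyticsRequest({"Spark", "Python"}, {"Customer", "Bills"}, duration_days=14)
bad = AnalyticsRequest({"Spark", "SAS"}, {"Weather"}, duration_days=0)
```

Validating against the catalogs up front means a request can only name tools and data sets that have already been registered.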

Data Lake

• Discover/Map

• Transform

• Organize/Tag

Catalog and provision

Enterprise log analysis

Virtualisation

Virtualised Compute Pool

Data Pool

• Meta-data tagging

• Governance

• Anonymise

• Encryption

Pool n, Pool n, Pool n

Copy
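The tagging, anonymising, and copying steps named on this slide could look roughly like the sketch below. The field names, the salt, and the salted-hash masking are all assumptions, and encryption is deliberately left out because it would need a dedicated library and key management.

```python
import copy
import hashlib

# Hypothetical sensitive fields and salt; none of these come from the deck.
SENSITIVE_FIELDS = {"customer_name", "email"}
SALT = b"example-salt"  # in practice this would be a managed secret


def anonymise(record: dict) -> dict:
    """Return a copy of the record with sensitive fields replaced by salted hashes."""
    masked = copy.deepcopy(record)
    for field_name in SENSITIVE_FIELDS:
        if field_name in masked:
            raw = str(masked[field_name]).encode("utf-8")
            masked[field_name] = hashlib.sha256(SALT + raw).hexdigest()[:16]
    return masked


def provision_copy(data_pool: list[dict], tags: dict) -> dict:
    """Build a tagged, anonymised copy of the data pool for one analytics pool."""
    return {
        "metadata": dict(tags),
        "records": [anonymise(record) for record in data_pool],
    }


data_pool = [
    {"customer_name": "Ada Example", "email": "ada@example.com", "bill": 42.5},
    {"customer_name": "Bob Example", "email": "bob@example.com", "bill": 17.0},
]
sandbox = provision_copy(
    data_pool, {"source": "billing", "classification": "anonymised"}
)
```

Because each pool receives a copy, the governed source in the data pool is never modified by the analytics work.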

Virtualised Compute Pool

CONTINUUM

Data Pool

• Governance

• Anonymise

• Encryption

Pool n, Pool n, Pool n

Copy

Virtualised Compute Pool

How to get started?

Big Data Technology Advisory

• Interview stakeholders, including business users and technical/functional experts

• Document requirements and gaps

• Define a future-state reference architecture

• Provide a plan/roadmap for implementation

Q&A