The Analytics Data Store: Information Supply Framework

Post on 18-Jul-2015

The Analytics Data Store

Martyn Jones

Extending the Data Warehouse Architecture

Cambriano Energy 2015 -

http://www.cambriano.es

Data Warehousing

Big Data

Business Intelligence

Statistics

Analytics

Confused by Big Data?

Martyn Richard Jones 2015 – martynjones.eu

Cambriano Energy 2015 - http://www.cambriano.es

Published by goodstrat.com

You should be!

Build me an ADS... NOW!

Let’s start simple!

Enterprise Operational Data – This is data that is used in applications that support the day-to-day running of an organisation’s operations. Typical data items in this space are sales transactions, purchase transactions, product information, and client and contact information. Enterprise Operational Data may also include data with a complex structure, such as contracts and other business documents. Applications in this space may include production control, logistics and stock control, as well as purchase order, supply chain management, management accounting and human resource modules.

Enterprise Process Data – This is measurement and management data collected to show how the operational systems are performing. In the past, the recording of events went down only to the level of a completed transaction, with a start and an end and nothing in between. Because transactions were kept as simple as possible, to maximise performance and throughput and minimise the risk of failure, very little process data was captured. Now, especially with the advent of Business Process Management and web logs, we collect a whole array of transaction and process performance data that was never previously captured.

[Figure: DW 3.0 Information Supply Framework (overview). Components shown: internal digital data, external digital data, operational applications, data logistics, Operational Data Store, Data Warehouse, Analytics Data Store, Data Marts, Statistical Analysis, Business Intelligence, Scenarios. Legend: primary data flow, secondary data flow.]

[Figure: DW 3.0 Information Supply Framework (detailed). Zones: Data Sources, Core Data Warehousing, Core Statistics. Components shown: OLTP, Staging, ODS, ETL, T/ETL, TL, EDW, DM (three instances), Complex Data, Event Data, Event Appliance, Infrastructure Data, Signal Appliance, Message Adapter, Message Queue, Staging & Reduction, ET(A)L, ADS, Statistical analysis, Scenario 1, Scenario 2, Scenario 3, Write back.]

Data Sources – This element covers all the current sources, varieties and volumes of data available that may be used to support the processes of 'challenge identification', 'option definition' and decision making, including statistical analysis and scenario generation.

Core Data Warehousing – This is a suggested evolution path of the DW 2.0 model. It faithfully extends the Inmon paradigm to include not only unstructured and complex data but also the information and outcomes derived from statistical analysis performed outside the Core Data Warehousing landscape.

Core Statistics – This element covers the core body of statistical competence, especially but not only with regard to evolving data volumes, data velocity, data quality and data variety.

INTO THE ZONE!

Complex Data – This is unstructured data, or data with a highly complex structure, contained in documents and other complex data artefacts, such as multimedia documents.

Event Data – This is an aspect of Enterprise Process Data, typically at a fine-grained level of abstraction. It comprises the business process logs, the internet web activity logs and other similar sources of event data. The volumes generated by these sources tend to be higher than other volumes of data, and they are the ones currently associated with the term Big Data, covering as it does the masses of information generated by tracking even the most minor piece of 'behavioural data' from, for example, someone casually surfing a web site.

Infrastructure Data – This aspect includes data that could well be described as signal data: continuous, high-velocity streams of potentially highly volatile data that might be processed through complex event correlation and analysis components.

Event Appliance – This puts the dynamic data collation, selection and reduction functionality as close to the point of event data generation as physically possible.
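As a rough illustration of collation, selection and reduction performed close to where events are generated, the following minimal Python sketch filters a batch of web log events and collapses it into page-view counts before anything is sent downstream. The event fields, the `IGNORED_PATHS` set and the `reduce_events` function are assumptions made for this example; they are not part of the framework described in the slides.

```python
from collections import Counter
from typing import Iterable

# Hypothetical raw events, roughly as a web server might emit them.
RAW_EVENTS = [
    {"user": "u1", "path": "/home", "status": 200},
    {"user": "u1", "path": "/favicon.ico", "status": 200},
    {"user": "u2", "path": "/pricing", "status": 200},
    {"user": "u2", "path": "/pricing", "status": 200},
    {"user": "u3", "path": "/home", "status": 404},
]

# Assumption: requests for these paths carry no analytical value.
IGNORED_PATHS = {"/favicon.ico"}


def reduce_events(events: Iterable[dict]) -> dict:
    """Select successful, relevant events and collate them into page-view counts."""
    selected = (
        e for e in events
        if e["status"] == 200 and e["path"] not in IGNORED_PATHS
    )
    return dict(Counter(e["path"] for e in selected))


summary = reduce_events(RAW_EVENTS)
print(summary)
```

Five raw events become two summary entries here, which is the point of reducing at the source: only the collated result needs to travel onward.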

Signal Appliance – This puts the dynamic data collation, selection and reduction functionality as close to the point of continuous streaming data generation as physically possible.
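One possible way to reduce a continuous stream near its source is to summarise it in fixed-size windows. The sketch below is only an illustration under that assumption: the `reduce_signal` function, the window size and the sample readings are hypothetical, and a real appliance might well use time-based windows or event correlation instead.

```python
from statistics import mean
from typing import Iterable, Iterator


def reduce_signal(readings: Iterable[float], window: int) -> Iterator[dict]:
    """Yield one summary record per full window of consecutive readings."""
    buffer = []
    for value in readings:
        buffer.append(value)
        if len(buffer) == window:
            yield {"min": min(buffer), "max": max(buffer), "mean": mean(buffer)}
            buffer = []
    # A trailing partial window is dropped in this sketch.


# Hypothetical sensor readings arriving as a stream.
stream = [20.0, 21.0, 22.0, 30.0, 31.0, 32.0, 40.0]
summaries = list(reduce_signal(stream, window=3))
print(summaries)
```

Because `reduce_signal` is a generator, it never holds more than one window in memory, which suits a stream that has no defined end.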

Distributed Inter-Process Communication – Different forms of messaging allow high volumes of data to be transmitted in near real time.
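The message adapter and message queue pairing from the diagram can be sketched in miniature with Python's standard library. This is an in-process stand-in only: `queue.Queue` and a thread take the place of a real distributed message broker, and the `message_adapter`, `producer` and `consumer` names are assumptions made for the example.

```python
import json
import queue
import threading

message_queue = queue.Queue()  # stand-in for a real message broker
SENTINEL = None                # marks the end of the stream

received = []


def message_adapter(record: dict) -> str:
    """Convert a source record into a wire format (JSON in this sketch)."""
    return json.dumps(record, sort_keys=True)


def producer(records):
    """Adapt each record and place it on the queue, then signal completion."""
    for record in records:
        message_queue.put(message_adapter(record))
    message_queue.put(SENTINEL)


def consumer():
    """Take messages off the queue as they arrive until the sentinel is seen."""
    while True:
        message = message_queue.get()
        if message is SENTINEL:
            break
        received.append(json.loads(message))


records = [{"id": 1, "kind": "click"}, {"id": 2, "kind": "view"}]
worker = threading.Thread(target=consumer)
worker.start()
producer(records)
worker.join()
print(received)
```

The consumer handles each message as soon as it is available rather than waiting for a batch, which is what makes queue-based transfer a fit for near-real-time delivery.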

Staging and Reduction – Traditional data staging combined with in-line data reduction.

ET(A)L – Extends ETL to include data analytics components tightly integrated into parallel ETL job streams.
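To make the idea of an analytics step embedded in parallel ETL job streams more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the partition names, the `etal_job` and `run` functions, and the choice of a simple outlier flag (values more than 1.5 population standard deviations from the partition mean) as the analytics component.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, pstdev

# Hypothetical extracted data, one partition per parallel job stream.
PARTITIONS = {
    "north": ["10", "12", "11", "50"],
    "south": ["7", "8", "9", "8"],
}


def etal_job(name, raw_values):
    """Transform one partition and analyse it in-line before loading."""
    values = [float(v) for v in raw_values]       # transform
    mu, sigma = mean(values), pstdev(values)      # analyse
    return [
        {
            "region": name,
            "value": v,
            # Assumed rule: flag values far from the partition mean.
            "outlier": sigma > 0 and abs(v - mu) > 1.5 * sigma,
        }
        for v in values
    ]


def run(partitions):
    """Run one job per partition in parallel and load the results."""
    target = []  # stand-in for the load target
    with ThreadPoolExecutor() as pool:
        jobs = pool.map(lambda item: etal_job(*item), partitions.items())
        for rows in jobs:
            target.extend(rows)                   # load
    return target


loaded = run(PARTITIONS)
print([r for r in loaded if r["outlier"]])
```

The analytical result travels with each row into the target, so downstream users receive data that is already annotated rather than having to rerun the analysis.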

ADS – The Analytics Data Store.

1. Statistics oriented

2. Integrated by focus area

3. Variable volatility

4. Time variant

Statistical Analysis – Qualitative analysis. Diagnostic analysis, predictive analysis, speculative analysis, data mining, data exploration, modelling.

Scenarios and outcomes –

1. Snapshots of the outcomes of scenario analysis, that is, the process of analysing possible future events by generating alternative possible outcomes.

2. Captured outcomes of statistical analysis.
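A snapshot of scenario outcomes might be captured along the following lines. This is a deliberately simple sketch: the growth assumptions, the starting value, the snapshot date and the `ScenarioOutcome` and `project` names are all hypothetical, and a compound-growth projection stands in for whatever scenario model would actually be used.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ScenarioOutcome:
    """An immutable snapshot of one scenario's projected values."""
    scenario: str
    snapshot_date: date
    projected: tuple


def project(start: float, growth: float, periods: int) -> tuple:
    """Compound `start` by `growth` once per period, rounding to 2 decimals."""
    values, current = [], start
    for _ in range(periods):
        current = round(current * (1 + growth), 2)
        values.append(current)
    return tuple(values)


# Assumed growth rate per period for each alternative future.
ASSUMPTIONS = {"Scenario 1": -0.10, "Scenario 2": 0.00, "Scenario 3": 0.10}

outcomes = [
    ScenarioOutcome(name, date(2015, 7, 18), project(100.0, growth, 2))
    for name, growth in ASSUMPTIONS.items()
]

for outcome in outcomes:
    print(outcome.scenario, outcome.projected)
```

Freezing the dataclass reflects the snapshot idea: once an outcome is captured it is kept as recorded, and a rerun with new assumptions produces a new snapshot rather than changing the old one.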

Write back – The ability to append data, update data and enrich data within the Analytics Data Store, and to provide scenario data to Core Data Warehousing.
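The three write-back operations and the scenario feed can be sketched with a toy in-memory store. The `AnalyticsDataStore` class and its method names are assumptions made for this example, as is the distinction drawn here between updating (overwriting existing attributes) and enriching (adding new attributes only); the slides do not define these operations in detail.

```python
class AnalyticsDataStore:
    """Toy in-memory store keyed by record id (illustrative only)."""

    def __init__(self):
        self.records = {}

    def append(self, record_id, record):
        """Add a new record; refuse to overwrite an existing one."""
        if record_id in self.records:
            raise KeyError(f"{record_id} already exists")
        self.records[record_id] = dict(record)

    def update(self, record_id, **changes):
        """Overwrite attributes of an existing record."""
        self.records[record_id].update(changes)

    def enrich(self, record_id, **derived):
        """Add derived attributes without overwriting existing ones."""
        for key, value in derived.items():
            self.records[record_id].setdefault(key, value)

    def scenario_data(self):
        """Return records tagged with a scenario, as a feed for the warehouse."""
        return [
            dict(record, id=record_id)
            for record_id, record in self.records.items()
            if record.get("scenario")
        ]


ads = AnalyticsDataStore()
ads.append("c1", {"segment": "retail", "spend": 120})
ads.update("c1", spend=150)
ads.enrich("c1", churn_risk=0.2, scenario="Scenario 1")
warehouse_feed = ads.scenario_data()
print(warehouse_feed)
```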

[Figure: The Iniciativa Information Management Pyramid. Labels shown: Data, Information, Knowledge; Uncover, Understand, Use. Copyright © 2000-2015 Iniciativa Org, S.L.]

STEP BY STEP

DW 1.0 Building the Data Warehouse

DW 2.0 Adding Unstructured Data

DW 3.0 Aligning with Statistics

Have any questions about the Analytics Data Store and DW 3.0? Feel free to connect via Twitter, Facebook and the Cambriano Energy website.

Alternatively, you can contact me via my personal website at martynjones.eu.

Also, please check out my blog, Good Strat, which deals with organisational strategy and information management.