MASTERING ENTERPRISE DATA FOR CONSISTENCY & ACCURACY

Christy Haragan, Principal Sales Engineer, MarkLogic

Justin Makeig, Director, Product Management, MarkLogic

© COPYRIGHT 2016 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.

SLIDE: 2

Hello, my name is Justin Makeig

Product Manager for 8+ years at MarkLogic

Background in consulting and web development

Focus on software architecture, APIs, and data integration

https://github.com/jmakeig

SLIDE: 3

Hello, my name is Christy Haragan

8 years' experience in the data management space

Previously an MDM Software Engineer and Software Architect

https://github.com/christyharagan

SLIDE: 4

Agenda

What is Master Data and why should I care?

Traditional approaches to Master Data Management

How MarkLogic changes MDM

Use cases: Techniques and tips

Q&A

SLIDE: 5

“Christy Haragan”

What is “Christy Haragan”? Who decided that?

What is our relationship with her?

What has she purchased? Returned? Received support on?

Where does she live? With whom? How do I contact her? When have we contacted her?

Is she a “high-value” customer? What risk does she represent?

Who is authorized to see Christy’s data? Who can change it? Under what conditions?

[Diagram: facts held about “Christy Haragan”: Address: London; Address: Winchester; Returns; Credit score: 775; Orders]

SLIDE: 6

Relational strips important context from data

System A: Customer

CREATE TABLE Customers (
  CustID NUMBER(6) PRIMARY KEY,
  StartDate DATE NOT NULL,
  PartyID NUMBER(6),
  …
)

System B: Customer

CREATE TABLE CUST_MASTER (
  CID VARCHAR2(40) PRIMARY KEY,
  Evt_Dt TIMESTAMP,
  Cst_Type VARCHAR2(120),
  …
)

TRADITIONAL APPROACHES TO MDM

SLIDE: 8

Master Data Registry

Repository of references to records in source systems

✔ Easy to start

✔ Clear separation from live systems

✘ Difficult to query, update

✘ Separation of context and data

✘ Dependence on source systems for quality, governance, security

Customer: Christy Haragan

ERP • PARTY_MASTER • HARA_C

CRM • customers_1 • 100067

Credit • http://… • 9930-221

CRM2 • refcustview • H_3325
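
To make the registry trade-off concrete, here is a minimal Python sketch. It assumes the registry is nothing more than a map from a master entity to pointers into source systems. The class and field names (`SourceRef`, `MasterDataRegistry`, `container`, `key`) are hypothetical and chosen only for illustration; the four references are the ones shown on the slide.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SourceRef:
    """A pointer to a record in a source system; no business data is copied."""
    system: str     # source system name, e.g. "ERP"
    container: str  # table, view, or endpoint inside that system
    key: str        # record identifier within the container


class MasterDataRegistry:
    """Maps a master entity to the source records that describe it (hypothetical design)."""

    def __init__(self) -> None:
        self._refs: Dict[str, List[SourceRef]] = {}

    def register(self, entity: str, ref: SourceRef) -> None:
        self._refs.setdefault(entity, []).append(ref)

    def lookup(self, entity: str) -> List[SourceRef]:
        # Return a copy so callers cannot mutate the registry's own list.
        return list(self._refs.get(entity, []))


registry = MasterDataRegistry()
customer = "Customer: Christy Haragan"
registry.register(customer, SourceRef("ERP", "PARTY_MASTER", "HARA_C"))
registry.register(customer, SourceRef("CRM", "customers_1", "100067"))
registry.register(customer, SourceRef("Credit", "http://…", "9930-221"))  # URL is elided on the slide
registry.register(customer, SourceRef("CRM2", "refcustview", "H_3325"))

# The registry only says where to look; every attribute still lives in a source system.
for ref in registry.lookup(customer):
    print(f"fetch {ref.key} from {ref.system}/{ref.container}")
```

Starting is easy because nothing is copied, but any real question about the customer fans out into one query per source system, which is where the "difficult to query, update" drawback comes from.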

SLIDE: 9

Master Data Hub

Aggregation of data from source systems

✔ Authoritative, bi-directional

✔ Centralized quality, governance, security

✔ Rich queries, aggregates, enrichment, and transactions

✘ Technical canonicalization

✘ Political collaboration

SLIDE: 10

[Diagram: MDM hub components: Master Data; Reference Data; Metadata; Exceptions; Staging (shown twice); Cleansing; Exception processing; Registry; Matching; Workflow rules; Domain services; Security; Logging, audit; Services; Stewardship, orchestration, monitoring; Source Systems of Record]

SLIDE: 11

[Diagram: MDM hub components: Master Data; Reference Data; Metadata; Exceptions; Staging (shown twice); Cleansing; Exception processing; Registry; Matching; Workflow rules; Domain services; Security; Logging, audit; Services; Stewardship, orchestration, monitoring; Source Systems of Record. Technology labels: Application Servers; Relational Databases; ETL; Federated Search]

SLIDE: 12

Shortcomings of Traditional MDM Approaches

Canonical models require political and technical alignment

Single version of the truth, but which version?

Provenance: show your work

Changing requirements, changing data

“Unstructured”, unknown data

Customer: Christy Haragan

• First Name • Last Name • Home Address • Ship Address

• Order History • Interactions • Related Customers • Campaigns

SLIDE: 13

Implications of Traditional MDM Approaches

Time to Value: Average time to ROI for an MDM implementation is 3 years

Inflexibility: Snapshot is unable to support multiple Business Units, domains, and changing requirements

Cost: ETL is brittle, expensive and time-consuming to build and maintain

Accountability: Difficult or impossible to explain past decisions on current data

Complexity: Orchestration, governance, and security across many moving parts

A NEW APPROACH TO MASTER DATA MANAGEMENT

SLIDE: 15

FLEXIBLE DATA MODEL

Capture data and context without having to rigorously model upfront

UNIVERSAL INDEXING

Discover data and project views of business entities in real-time

TRUSTED MANAGEMENT

Enforce governance, security, quality across the entire lifecycle

SLIDE: 16

Thinking in Entities

Customer v5.0.9

Party, CRM v201509-prd

Orders, ERP v106

Metadata: Provenance, Relationships, …

Transactional updates, Granular security, Indexes, Bitemporal history

SLIDE: 17

Thinking in Entities: Versioning

Customer v5.0.9

Party, CRM v201509-prd

Orders, ERP v106

Metadata: Provenance, Relationships, …

:nextVersion

customer6.sjs

Customer v6.0.0

Party, CRM v201509-prd

Orders, ERP v106

Metadata: Provenance, Relationships, …

SLIDE: 18

[Diagram: MDM hub components: Master Data; Reference Data; Metadata; Exceptions; Staging (shown twice); Cleansing; Exception processing; Registry; Matching; Workflow rules; Domain services; Security; Logging, audit; Services; Stewardship, orchestration, monitoring; Source Systems of Record. Technology labels: Application Servers; Relational Databases; ETL; Federated Search]

MASTER DATA MANAGEMENT IN PRACTICE

SLIDE: 23

Customers Leveraging MarkLogic for MDM

Capability:

Time to Value

Evolving Business Requirements

5-year Replacement Cycle

Legacy Systems Retirement

Decision Accountability

Customers:

Mitchell1

Healthcare.gov

SLIDE: 24

Our customers are already doing this today…

Top Entertainment Company: Apply corporate standards for creation and use of digital assets from production through distribution across the global enterprise.

Healthcare.gov: Insurance marketplace and exchange hub for millions of consumers, thousands of providers, dozens of stakeholder agencies.

Aetna: Hub for sharing master data among hundreds of source systems and hundreds of subscribers.

US Combatant Command: Secure sharing, exploitation, and analysis from dozens of sources at HQ, theater operations centers, and detached users.

SLIDE: 25

Techniques and Tips

Envelope Pattern: Model what you need when you need it

Common Services Layer: The distributed Hub

Semantics: Relating to your Data

Decision Accountability: Bi-temporal and Tiered Storage

SLIDE: 26

Techniques and Tips – Envelope Pattern

Data Modeling can easily take up 30% of a project’s resources and is commonly cited as a key reason for project failure

Need to be able to maintain the original data in context so systems can continue to use it

Have to be able to expand the data model to meet existing and new requirements

Solution:

The envelope pattern: Leave your data as is and wrap it with the information that you need

SLIDE: 27

Managing Entities in Envelopes

Customer v5.0.9

Party, CRM v201509-prd

Orders, ERP v106

Metadata: Provenance, Relationships, …

Transactional updates, Granular security, Indexes, Bitemporal history

SLIDE: 28

Techniques and Tips – Envelope Pattern

<env:customer-envelope>
  <customer xmlns="…">
    <uuid>123abc…</uuid>
    <first-name>Christy</first-name>
    <last-name>Haragan</last-name>
  </customer>
  <env:metadata>…</env:metadata>
  <crm:party version="v201509-prd">…</crm:party>
  <erp:orders …>…</erp:orders>
</env:customer-envelope>

Annotations: Zero to many, Indexes, Bitemporal history
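
One way to build such an envelope is sketched below in Python with the standard library's `xml.etree.ElementTree`. This is an illustration under stated assumptions, not MarkLogic's API: the namespace URIs, the `wrap_in_envelope` helper, and the sample CRM payload are all hypothetical, since the slide elides the real namespaces and source content.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable

# Hypothetical namespace URIs; the slide elides the real ones.
ENV_NS = "http://example.org/envelope"
CUST_NS = "http://example.org/customer"
ET.register_namespace("env", ENV_NS)


def wrap_in_envelope(
    canonical: Dict[str, str],
    sources: Iterable[ET.Element],
    metadata: Dict[str, str],
) -> ET.Element:
    """Wrap untouched source records together with a canonical entity and metadata."""
    envelope = ET.Element(f"{{{ENV_NS}}}customer-envelope")

    # Canonical section: only the fields modelled so far.
    customer = ET.SubElement(envelope, f"{{{CUST_NS}}}customer")
    for name, value in canonical.items():
        ET.SubElement(customer, f"{{{CUST_NS}}}{name}").text = value

    # Metadata section: provenance and similar annotations.
    meta = ET.SubElement(envelope, f"{{{ENV_NS}}}metadata")
    for name, value in metadata.items():
        ET.SubElement(meta, f"{{{ENV_NS}}}{name}").text = value

    # Source sections: appended as-is, zero to many.
    for source in sources:
        envelope.append(source)
    return envelope


# Hypothetical CRM record; its internal structure is left exactly as the source emitted it.
crm_party = ET.fromstring(
    '<party xmlns="http://example.org/crm" version="v201509-prd"><nm>HARAGAN, C</nm></party>'
)
envelope = wrap_in_envelope(
    canonical={"first-name": "Christy", "last-name": "Haragan"},
    sources=[crm_party],
    metadata={"source": "CRM"},
)
print(ET.tostring(envelope, encoding="unicode"))
```

The canonical section can grow one field at a time as requirements appear, while the source record stays available in its original shape for systems that still depend on it.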

SLIDE: 30

Techniques and Tips – Common Services Layer

Business Units need to own and manage data that’s important to them in the way that works best for their requirements

The enterprise needs to be able to enforce consistent access, visibility and quality requirements on key pieces of data

SLIDE: 32

Techniques and Tips – Common Services Layer

[Diagram: HTTP Data Services. Views: Customer; Billing Party. Entity: Customer v5.0.9; Party, CRM v201509-prd; Orders, ERP v106; Metadata: Provenance, Relationships, …; Billing Address]
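
As a rough illustration of the idea, the Python sketch below projects two business-unit views from a single stored entity. Everything here is an assumption made for the example: the dictionary layout, the field names, the choice of London as the billing city, and the two view functions stand in for whatever data services a real deployment would expose over HTTP.

```python
from typing import Any, Dict

# Hypothetical envelope contents, loosely following the diagram's sections.
envelope: Dict[str, Any] = {
    "customer": {"version": "5.0.9", "first_name": "Christy", "last_name": "Haragan"},
    "billing_address": {"city": "London"},
    "sources": {
        "crm_party": {"version": "v201509-prd"},
        "erp_orders": {"version": "v106"},
    },
    "metadata": {"provenance": ["CRM", "ERP"]},
}


def customer_view(env: Dict[str, Any]) -> Dict[str, str]:
    """View for a unit that thinks in terms of customers."""
    c = env["customer"]
    return {
        "name": f'{c["first_name"]} {c["last_name"]}',
        "model_version": c["version"],
    }


def billing_party_view(env: Dict[str, Any]) -> Dict[str, str]:
    """View for a unit that thinks in terms of billing parties."""
    c = env["customer"]
    return {
        "party_name": f'{c["last_name"]}, {c["first_name"]}',
        "billing_city": env["billing_address"]["city"],
    }


print(customer_view(envelope))
print(billing_party_view(envelope))
```

Each unit gets the shape and naming it prefers, while access rules and quality checks can be applied once, in the layer that serves both views.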

SLIDE: 33

Techniques and Tips – Semantics

Not every organizational unit (OU) within the enterprise will have the same naming conventions, cardinality rules, or perspective on master data

The enterprise and each business unit may have one or more taxonomies or ontologies that apply to their data

Solution:

Semantics can be used to model relationships between data across the enterprise, allowing OUs to relate their data to others' and vice versa

Key relationships: sameAs, belongsTo, parentOf, childOf
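
A small Python sketch of how `sameAs` links might be used to reconcile identifiers across units is shown below. It assumes relationships are stored as subject-predicate-object triples and that `sameAs` is symmetric and transitive. The prefixed identifiers (`erp:`, `crm:`, `crm2:`, `bu:`) and the `same_as` helper are hypothetical; a production system would more likely query a triple store than walk a Python set.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical triples linking one customer's records across source systems.
triples: Set[Triple] = {
    ("erp:HARA_C", "sameAs", "crm:100067"),
    ("crm:100067", "sameAs", "crm2:H_3325"),
    ("crm:100067", "belongsTo", "bu:retail"),
}


def same_as(all_triples: Iterable[Triple], start: str) -> Set[str]:
    """Return every identifier reachable from start via sameAs, in either direction."""
    neighbours: Dict[str, Set[str]] = {}
    for subject, predicate, obj in all_triples:
        if predicate == "sameAs":
            neighbours.setdefault(subject, set()).add(obj)
            neighbours.setdefault(obj, set()).add(subject)

    # Breadth-first walk over the undirected sameAs graph.
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(sorted(same_as(triples, "crm2:H_3325")))
```

No unit has to rename or restructure its records; each simply asserts how its identifier relates to the others, and the equivalence is computed when needed.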

SLIDE: 34

Semantic Relationships

[Diagram: entities Customer, Sales Order, Product, Support Case; relationship labels: agrees to, has, files, about, purchased, is type, uses]

SLIDE: 35

Techniques and Tips – Decision Accountability

Businesses need to:

Understand why a decision was made, and based on what data

For regulatory and compliance reasons

For back testing

Manage the costs of doing so

SLIDE: 36

Bitemporal

[Chart: VALID time and SYSTEM time axes, each running Mon to Fri. Events: Invoice Paid (Mon); Check bounces (oops!) (Tue)]
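
The invoice example can be sketched in Python as an append-only history queried along both time axes. This is a simplified model with no end timestamps, and it rests on an assumption about the slide: that the bounced check is recorded on Tuesday as a correction effective from Monday. The `Fact` record and `status_as_of` function are hypothetical names, not a MarkLogic API.

```python
from dataclasses import dataclass
from typing import List, Optional

MON, TUE, WED, THU, FRI = range(5)


@dataclass(frozen=True)
class Fact:
    status: str
    valid_from: int   # day the status became true in the real world
    system_from: int  # day the database recorded it


# Nothing is overwritten: the correction is appended next to the original fact.
HISTORY: List[Fact] = [
    Fact("paid", valid_from=MON, system_from=MON),    # Monday: invoice recorded as paid
    Fact("unpaid", valid_from=MON, system_from=TUE),  # Tuesday: check bounced (assumed effective Monday)
]


def status_as_of(history: List[Fact], valid_day: int, system_day: int) -> Optional[str]:
    """What did the database believe on system_day about the status on valid_day?"""
    visible = [
        f for f in history
        if f.system_from <= system_day and f.valid_from <= valid_day
    ]
    if not visible:
        return None
    # Prefer the most recent real-world state, then the most recently recorded version of it.
    latest = max(visible, key=lambda f: (f.valid_from, f.system_from))
    return latest.status


print(status_as_of(HISTORY, valid_day=MON, system_day=MON))  # what we believed on Monday
print(status_as_of(HISTORY, valid_day=MON, system_day=TUE))  # what we believed after the correction
```

Keeping both axes lets an auditor reproduce a decision made on Monday from exactly the data that was known on Monday, even though that data was later corrected.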

SLIDE: 37

Tiered Storage

[Chart: tiers Active, Compliance, Analytic. Effective Unit Cost ($/GB): $24.67, $4.10, $1.44. Total Size (TB): 96, 504, 1,044. Total Cost ($000): 592, 2,066, 2,080]

SLIDE: 39

Conclusions

SLIDE: 40

Additional Resources

MarkLogic University Training - http://www.marklogic.com/training/

MarkLogic Developer Site - http://developer.marklogic.com

MarkLogic Data Modeling - https://developer.marklogic.com/learn/data-modeling

MarkLogic Data Hub Framework - http://marklogic.github.io/marklogic-data-hub/

QUESTIONS?