Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

Page 1: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS

WWW.LEDS-PROJEKT.DE

LINKED ENTERPRISE DATA SERVICES

DR. MICHAEL MARTIN

UNIVERSITÄT LEIPZIG / AKSW

November 8, 2016

Page 2: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS PROJECT OVERVIEW

• Research institutions:
  • Universität Leipzig / AKSW research group
  • Technische Universität Chemnitz

• Companies:
  • brox IT-Solutions GmbH
  • Ontos GmbH
  • Netresearch GmbH & Co. KG
  • Lecos GmbH
  • eccenca GmbH

Page 3: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS RESEARCH AREAS

• Open eGovernment Data

• Semantic eCommerce

• Natural Language Processing

• Linking and Knowledge Extraction

• Versioning and Co-Evolution

• Big Data / Linked Data Integration

Page 8: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS RESEARCH AREAS

[Figure: architecture diagram. Labels:]

• Business Unit 1, Business Unit 2, Business Unit 3, Business Unit 4, Business Unit 5

• Corporate Memory

• Inbound

• Data Sources

• Outbound and Consumption

• Inbound Raw Data Store

• Big Data, DWH-Infrastructure

• Knowledge Graph for Meta Data, KPI Definition and Data Models

• Frontend to Access Relationship and KPI Definition / Documentation

• Frontend to Access (ad hoc) Reports

• Outbound Data Delivery to Target Systems

Page 9: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS

WWW.LEDS-PROJEKT.DE

ECCENCA CORPORATE MEMORY

SEMANTICALLY INTEGRATED ENTERPRISE DATA LAKES

RENE PIETZSCH

November 8, 2016

Page 10: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS MOTIVATION

Enterprise Data Management Objective:

“Ensure all data is aligned to a common meaning in order to achieve automation in performing complex analytics and generating trusted reports.”

Source: 2015 Data Management Industry Benchmark - EDM Council

In 2015, only 7% of respondents claimed to already be using shared and unambiguous definitions of data across the firm and to have them accessible as operational metadata.

© eccenca GmbH 2016

Page 11: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS MOTIVATION

Accounting, Regulatory Reporting, Risk Mgmt., Treasury, ...

Perspectives on data turn into silos of data being duplicated, annotated, simply changed over time, making reconciliation and interpretation a challenge.

Page 12: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS ARCHITECTURE

[Figure: architecture diagram. Labels:]

• Management Accounting, Risk Management, Regulatory Reporting, Treasury, Marketing Accounting

• Corporate Memory

• Inbound

• Data Sources

• Outbound and Consumption

• Inbound Raw Data Store

• Big Data, DWH-Infrastructure

• Knowledge Graph for Meta Data, KPI Definition and Data Models

• Frontend to Access Relationship and KPI Definition / Documentation

• Frontend to Access (ad hoc) Reports

• Outbound Data Delivery to Target Systems

Page 13: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS ARCHITECTURE

[Figure: architecture diagram, same labels as on Page 12]

Data Ingestion

• Files in the data lake (CSV, XML, Excel)

• (Relational) databases

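The two ingestion paths named on this slide (files in the data lake, relational databases) could be sketched roughly as follows. This is an illustration only, not the Corporate Memory implementation: the function names, the record layout with a "source" tag, and the in-memory SQLite database standing in for a relational source are all assumptions.

```python
import csv
import io
import sqlite3


def ingest_csv(text, source):
    """Read CSV text into raw records, tagging each with its (hypothetical) source name."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"source": source, "data": dict(row)} for row in reader]


def ingest_table(conn, table, source):
    """Read every row of a relational table into raw records."""
    conn.row_factory = sqlite3.Row
    # The table name is interpolated directly, so only pass trusted, hard-coded names.
    rows = conn.execute(f'SELECT * FROM "{table}"').fetchall()
    return [{"source": source, "data": dict(row)} for row in rows]


def demo_ingest():
    """Build a tiny raw data store from one CSV file and one database table."""
    csv_text = "id,name\n1,Alice\n2,Bob\n"
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, city TEXT)")
    conn.execute("INSERT INTO customers VALUES (1, 'Leipzig')")
    store = ingest_csv(csv_text, "crm.csv") + ingest_table(conn, "customers", "erp.customers")
    conn.close()
    return store
```

Note that the records keep their native values (CSV fields stay strings, database integers stay integers); nothing is integrated at this stage.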

Page 14: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS ARCHITECTURE

[Figure: architecture diagram, same labels as on Page 12]

Data Lake

• Emerging approach to handle large amounts of data

• Cost-effective storage

• Data is held in its native format

Good: Does not force an up-front integration of the ingested datasets

Bad: Retaining an overview of disparate data silos in the lake without having a coherent shared view is a challenging issue

Data Warehouses

• Existing infrastructure

• Typically relational databases

Page 15: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS ARCHITECTURE

[Figure: architecture diagram, same labels as on Page 12]

Metadata Layer

• Dataset Metadata

• Ontologies

• Integration Rules

Graphical User Interface

Customer Applications

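To make the three parts of the metadata layer (dataset metadata, ontologies, integration rules) more concrete, here is a minimal sketch of how they might fit together. All class, field, and property names are hypothetical, and the ontology is reduced to a plain set of property identifiers; the slides do not describe the actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    name: str
    location: str  # where the raw data sits, e.g. a path in the lake
    fmt: str       # native format, e.g. "csv"


@dataclass
class IntegrationRule:
    dataset: str          # name of the dataset the rule applies to
    source_field: str     # column or element in the raw data
    target_property: str  # ontology property the field is mapped to


@dataclass
class MetadataLayer:
    datasets: dict = field(default_factory=dict)
    ontology: set = field(default_factory=set)
    rules: list = field(default_factory=list)

    def register(self, meta):
        """Catalog a dataset under its name."""
        self.datasets[meta.name] = meta

    def add_rule(self, rule):
        """Accept a rule only if its dataset and target property are known."""
        if rule.dataset not in self.datasets:
            raise KeyError(f"unknown dataset: {rule.dataset}")
        if rule.target_property not in self.ontology:
            raise KeyError(f"unknown property: {rule.target_property}")
        self.rules.append(rule)

    def mapping_for(self, dataset):
        """Return the field-to-property mapping defined for one dataset."""
        return {r.source_field: r.target_property for r in self.rules if r.dataset == dataset}
```

Keeping the rules separate from the datasets mirrors the point made on the Data Lake slide: raw data stays in its native format, and the shared view is defined in metadata.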

Page 16: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS DATASET MANAGEMENT

Dataset Management

• Catalog Datasets

• Catalog Ontologies

• Manage Metadata

Dataset Discovery

• Data Profiling

• Dataset Exploration

Dataset Integration

• Dataset Lifting

• Dataset Linking

• Data Quality Validation

Data Access

• Domain Specific Consolidated Views

• Execution on Hadoop

Page 17: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS DATASET DISCOVERY

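Data Profiling, one of the Dataset Discovery capabilities listed on Page 16, typically means computing simple per-column statistics before any integration takes place. The sketch below is one assumed way to do that for a list of records; the function name and the chosen statistics are illustrative, not taken from the slides.

```python
from collections import Counter


def profile(records):
    """Return per-column statistics for a list of dict records (values must be hashable)."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    report = {}
    for key, values in columns.items():
        # Treat None and empty strings as missing values.
        present = [v for v in values if v not in (None, "")]
        report[key] = {
            "count": len(records),
            # Records that lack the column entirely also count as missing.
            "missing": len(records) - len(present),
            "distinct": len(set(present)),
            "most_common": Counter(present).most_common(1)[0][0] if present else None,
        }
    return report
```

A report like this helps decide which columns are usable as keys or need cleaning before lifting and linking.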

Page 18: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS INTEGRATION PROCESS 1/2

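Dataset Lifting, listed under Dataset Integration on Page 16, is commonly understood as turning tabular records into RDF triples according to a mapping. The following is a minimal sketch under that assumption: the function names, the example.org URIs, and the choice to emit every value as a plain string literal are illustrative only.

```python
def lift(records, mapping, base_uri, key_field):
    """Turn dict records into (subject, predicate, object) triples using a field-to-property mapping."""
    triples = []
    for record in records:
        subject = f"{base_uri}{record[key_field]}"
        for source_field, prop in mapping.items():
            value = record.get(source_field)
            # Skip missing or empty values instead of emitting empty literals.
            if value not in (None, ""):
                triples.append((subject, prop, str(value)))
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples lines with plain string literals."""
    lines = []
    for s, p, o in triples:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)
```

For example, a record with id "1" and name "Alice" becomes one triple whose subject is the base URI followed by "1".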

Page 19: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS INTEGRATION PROCESS 2/2

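Dataset Linking, also listed under Dataset Integration on Page 16, means finding records in different datasets that describe the same entity. The sketch below uses normalized string similarity from Python's standard difflib module as a stand-in; the slides do not say which similarity measures or thresholds are used, so the function names, the "id" field, and the 0.85 threshold are assumptions.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def link(left, right, field, threshold=0.85):
    """Compare every left/right pair on one field and return links at or above the threshold."""
    links = []
    for a in left:
        for b in right:
            score = SequenceMatcher(None, normalize(a[field]), normalize(b[field])).ratio()
            if score >= threshold:
                links.append((a["id"], b["id"], round(score, 2)))
    return links
```

The pairwise comparison is quadratic in the number of records, so a real setup on large datasets would need blocking or indexing, which this sketch leaves out.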

Page 20: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS DATA ACCESS


Page 21: Semantically integrated Enterprise Data Lakes and Co-Evolution of Public / Private Data

LEDS

Contact

Rene Pietzsch

Tel: +491726940915

email: [email protected]

eccenca

Command your Data!