AI-Powered Data Cataloging Virtual Summit
Data Cataloging for Data Governance
With Maersk & Informatica
Requirements:
Detailed Technical Lineage
Intelligent Glossary Associations
Agile Data Curation
Data Certifications
Key Data Element Discovery
Data Governance ensures that enterprise data is trusted, governed and
protected. Intelligent data cataloging plays a critical role in enabling agile
data governance at scale.
Data Cataloging for Data Governance
Enterprise Data Catalog
AI, Human Knowledge and Collaboration
AI-powered automatic discovery, enrichment and curation
Business context via intelligent business term association
Collaboration & social curation to tap into shared data knowledge
Enterprise Data Catalog
Informatica: PowerCenter | DQ | MDM | BDM | DIH | BG | ILM | Axon | Informatica Cloud
Databases: Oracle | DB2 | DB2 for z/OS | SQL Server | Sybase | Teradata | Netezza | JDBC | SQL Scripts | SAP HANA | Stored Procedures
Applications: SAP R/3 | Salesforce | Oracle | Workday
Big Data: HIVE (Cloudera, Hortonworks, MapR, IBM BigInsights, EMR, HDI) | HDFS | MapRFS | Cloudera Navigator | Atlas
Cloud Platforms: AWS S3 | AWS Redshift | Azure SQL DB | Azure SQL DW | Azure ADLS | Azure Blob | Google BigQuery | ADLS Gen 2
File Formats: CSV | Delimited | XML | JSON | Avro | Parquet | MS Excel | Adobe PDF | Flat File | MS PowerPoint | MS Word
Business Intelligence: Tableau | IBM Cognos | SAP BusinessObjects | MicroStrategy | OBIEE
Other: Microsoft SSIS | Erwin Models | PowerDesigner | Oracle Data Integrator | IBM DataStage | Custom Scanner Framework
Speakers
David Falder, Senior Technical Specialist
Gaurav Pathak, Senior Director, Product Management
Dharma Kuthanur, Senior Director, Product Marketing
Classification: Public
MAERSK
A TRANSFORMATIONAL JOURNEY
David Falder
Virtual Summit – Aug 2019
Our Transformational Themes
Develop new, in-house capabilities
Change the way we work
Create new, innovative products
• Scrum teams, experts in data engineering
• Develop existing & hire new capabilities
• Decrease number of external consultants
• Train employees
• Test & learn, fail-fast
• Reduced governance, bureaucracy & handovers
• Embed BI in the business
• Reduce "time to market"
• E2E responsibility for Scrum Teams
• Automate standard reports & provide self-service capabilities
• Reduce reporting legacy
• Democratise data
• Partner with IT & Digital to enable development of analytical products
Future BI Requirements
Going Agile
• Classic agile (Prod. Owner + Scrum Team)
• Operating model must be able to handle net new ideas and software
• Use OKRs for improved alignment on planning in development teams
• Scalable BI platform, that supports the ambitions for Maersk BI
• Don’t compromise reliability of reports or data
From Service Delivery to Partner
• Investing in POs to enable them to establish a proximity with business stakeholders
• Enable more flexibility in the organization (Network-structure and PoC teams)
• Implement Product- and Market Mindset
Reduce Time to Market
• Paradox in expectation to be proactive and knowledgeable to earn a seat at the table
• Change perception of BI with the business stakeholders
• Involve the business more by having a rapid release cycle and an MVP approach - test and learn, fail-fast
• We want business and IT to partner up and work closer together, to ensure value and relevance
Front End Consumption / Visualization
(MicroStrategy / MSBI / Power BI / etc.)
Search / Find / Understand
Axon
Business Data Assets / Business Lineage / Ownership Classification / Quality
Tech Assets / Tech Lineage
EDP
Prepare
Ontology
Cleansed
Raw
Projects
Mashups
RDBMS
Sources
Data Lake
Local
.txt, xls
Data Platform Strategy – Non SQL / non-Technical
Business assets governance
Technical assets
management
**Search/Find/Add
Search/Find/
Select
Data Quality management
Rate data
DQ Profiles
1 – Load Catalogue (Others)
2 – Search Find Understand
3 – Rate (crowd source)
4 – Search & Create Prj for Prep
5 – Import – Import to project
6 – Prepare – Prepare data
7 – Publish data – Publish dataset
8 – Visualize – in tool of choice
9 – Publish Viz & share visual
10 – Consumers access
3* – Optional
IDQ
** Integration into MSTR on roadmap
EDC
Axon, EDC, EDP
EDC – Enterprise Data Catalog
EDP – Enterprise Data Preparation
IDQ – Informatica Data Quality
Axon – Axon Data Governance
Data Governance Strategy
Future scope
Self Service BI
Secure@Source - Future scope
EDC Current Load Status
EDC Development Tasks
Databricks | Azure Data Factory | Power BI
MSBI
PowerCenter
• Data Migration
  • Data engineers are saving considerable effort in understanding the transformations as they migrate data from Teradata to ADLS.
  • Engineers can generate lineage diagrams quickly, which helps them understand data flows and transformations.
• Continuous Development and Deployment
  • Engineers are utilising EDC to perform impact analysis when making changes. They can quickly understand the impact of a change on downstream consumers.
• Helping with New Ideas
  • Engineers are starting to utilise EDP to help them fast-track new ideas, with self-service MicroStrategy in use as the visualisation tool.
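The impact analysis mentioned above can be pictured as a walk over a lineage graph: starting from the asset being changed, follow every upstream-to-downstream edge and collect what is reached. The sketch below is only an illustration of that idea in plain Python. It does not use EDC or any Informatica API, and the asset names and the `downstream_impact` helper are hypothetical, not taken from the Maersk catalog.

```python
from collections import deque

# Hypothetical lineage edges (upstream -> downstream). The asset names are
# illustrative assumptions, not real catalog entries.
LINEAGE_EDGES = [
    ("teradata.bookings", "adls.raw.bookings"),
    ("adls.raw.bookings", "adls.cleansed.bookings"),
    ("adls.cleansed.bookings", "report.weekly_volumes"),
    ("adls.cleansed.bookings", "report.customer_kpis"),
    ("teradata.customers", "report.customer_kpis"),
]


def downstream_impact(edges, changed_asset):
    """Return a sorted list of every asset reachable downstream of changed_asset."""
    # Build an adjacency map: asset -> list of direct downstream assets.
    children = {}
    for upstream, downstream in edges:
        children.setdefault(upstream, []).append(downstream)

    # Breadth-first traversal; `seen` also guards against cycles.
    seen = set()
    queue = deque([changed_asset])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


impacted = downstream_impact(LINEAGE_EDGES, "adls.raw.bookings")
print(impacted)
# → ['adls.cleansed.bookings', 'report.customer_kpis', 'report.weekly_volumes']
```

In this toy graph, a change to the raw bookings table reaches the cleansed table and both reports, which is the kind of answer an engineer would want before deploying the change.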
Additional EDC Use Cases
Sample Lineage Diagram
Sample Lineage Diagram (zoomed)
Sample Lineage Diagram – Transformation Logic
Embedding Informatica Toolset
• Enterprise Adoption
• Additional Source Applications to facilitate lineage
• Increased focus on Operational Reporting – include operational DBs
• MicroStrategy Integration
The Future
Demo
© Informatica. Proprietary and Confidential.
Learn More
1. Don’t miss Keynotes and Deep-Dives at the AI-Powered Data Cataloging Virtual Summit:
• Market and Analyst Perspectives featuring New York Life, Tableau, and Amalgam Insights
• Data Cataloging Solution Theaters featuring Maersk, Nissan, Rabobank and Biogen
2. Stop by an Informatica World Tour near you:
• Chicago Sept-11 | Washington, DC Oct-15
• Frankfurt Oct-8 | London Oct-9 | Paris Oct-10
3. Watch a Product Webinar:
• Advancing Analytics Maturity with an Intelligent Data Catalog: with Mattel and Aberdeen
• Meet the Expert PM Webinar: EDC 10.2.2 Release Deep-Dive & Demo
Thank You