  • Logical Data Models for Agile BI
    David D. Schoeff
    Teradata - EDW Data Architect & Principal Consultant


    Not Designing a Data Architecture is a

    Why do we need an LDM?
    Data Warehouse with LDM
    Data Warehouse Without LDM

    What is the Purpose of a Data Model?
    • A visual business representation of how data is organized in the enterprise
    • It provides discipline and structure to the complexities inherent in data management
        • Can you imagine building a house without a blueprint?
        • Or driving across the country without a map?
    • It facilitates communication within the business (e.g., within IT and between IT and the business)
    • It facilitates arriving at a common understanding of important business concepts (e.g., what is a customer?)

    Logical Data Model Components
    The LDM graphically represents the data requirements and data organization of the business.

    It identifies:
    • Those things about which it is important to track information (entities)
    • Facts about those things (attributes)
    • Associations between those things (relationships)

    Subject-oriented, designed in Third Normal Form: one fact in one place, in the right place.
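
The "one fact in one place" idea can be illustrated with a minimal sketch. The `Customer` and `Order` entities, their attributes, and the helper functions below are hypothetical and do not come from the slides; they only show a fact (the customer name) stored once and reached through a relationship rather than copied.

```python
from dataclasses import dataclass

# Hypothetical entities for illustration; none of these names come from the slides.
@dataclass(frozen=True)
class Customer:          # entity
    customer_id: int     # key attribute
    name: str            # non-key attribute, stored only here

@dataclass(frozen=True)
class Order:             # entity
    order_id: int        # key attribute
    customer_id: int     # relationship: each Order belongs to one Customer
    amount: float        # non-key attribute

customers = {1: Customer(1, "Acme Corp")}
orders = [Order(100, 1, 250.0), Order(101, 1, 75.5)]

def order_report(orders, customers):
    """Resolve the customer name through the relationship instead of copying it."""
    return [(o.order_id, customers[o.customer_id].name, o.amount) for o in orders]

def rename_customer(customers, customer_id, new_name):
    """Return a new lookup in which the name is changed in exactly one place."""
    old = customers[customer_id]
    updated = dict(customers)
    updated[customer_id] = Customer(old.customer_id, new_name)
    return updated
```

Because the name lives only on `Customer`, a single change through `rename_customer` is reflected on every order line that `order_report` produces.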

    Reference Models
    Lots of Detail / Expertise Behind Models

    Reference Model Sources
    • Data Warehousing Vendors: IBM, Oracle, Teradata
    • Tool Vendors: Embarcadero
    • Service Vendors: EWSolutions
    • Industry/Standards Associations: ARTS (Association for Retail Technology Standards)

    Teradata Industry Logical Data Models (iLDMs)
    • Financial Services: Banking, Investments
    • Travel: Travel, Hospitality, Gaming
    • Retail: Retail Store, Food Service
    • Communications: Wireline, Wireless, Cable, Satellite
    • Healthcare: Payor, HIPAA
    • Transportation: 3PL, 4PL, Air, Truck, Rail, Sea
    • Manufacturing: CPG, High Tech Automotive
    • Financial Services: Insurance

    Data Management Context: Three Layer Structure
    [Diagram labels: EDW-LDM; EDW-PDM; Implement (Physical); Analyze & Design (Logical); Core (Enterprise); Semantic (Usage/Presentation); Semantic Layer Models; Load Once, Use Many; Marts; Views; BIOs & User Types drive requirements; iLDM used for customization; Source; Operational Images; Data Integration]

    Enterprise Information Management Requires a Shared VOCABULARY
    Experts estimate that the 500 most commonly used words in the English language have an average of 28 definitions each.

    Enterprise Data Management Objectives that are enabled by Enterprise Logical Data Modeling:
    • Build a Common Business Vocabulary for the enterprise.
    • Develop an EDW Data Structure that is Neutral from All the Sources that populate it.
    • Develop an EDW Data Structure that will Support All Business Requirements While Not Being Constrained by any specific requirement, i.e.:
        • Neutral from use by multiple functional areas
        • Supports operational and analytical uses

    Data Modeling Structure
    • SUBJECT Model: A model of the high-level data concepts that define the scope of the Data Architecture.
    • CONCEPTUAL Model: An entity-relationship model that identifies the elements of the Business Vocabulary and Business Rules.
    • KEY-BASED Model: A refinement of the Conceptual Model that identifies the natural and surrogate keys for all entities and relationships. This is the foundation of the Enterprise Business Vocabulary.
    • ATTRIBUTED Model: A detailed model that identifies the non-key attributes for the entities. Attribution also leads to refining the Key-Based Model.
    • PHYSICAL Model: A model that is the design for a database. The Attributed Model is transformed for Sourcing and Accessing performance.

    Data Modeling Structure: Purposes
    [Diagram labels: SUBJECT Model; CONCEPTUAL Model; KEY-BASED Model; ATTRIBUTED Model; PHYSICAL Model; Data Modeling; Architecture; Implementation; Reference Model]
    Information Requirements:
    • Business Improvement Opportunities
    • Business Questions
    • Key Performance Indicators
    • Legacy Reporting/Analysis


    Data/Information Management

    Data Management Context: Agile Development Environment
    [Diagram labels: Sandbox; User External Data]

    Data Management Context: Perceived Value from Medium to Large Scale Projects
    [Diagram labels: Sandbox; User External Data. Values shown: 80-95%, 5-15%, 0-5%, 0-1%]

    Data Management Context: Development Time for Medium to Large Scale Projects
    [Diagram labels: Sandbox; User External Data. Values shown: 4-8 weeks, 3-6 months, 2-4 months, 1-5 days]

    Data Integration
    [Diagram labels: Common; Shared (three times); Local (three times); 1st Sandbox Application; 2nd Sandbox Application; 3rd Sandbox Application]

    Data Management Context: Integration in an Agile Development Environment
    [Diagram labels: Sandbox; User External Data; Conceptual Data Architecture; Governance-driven Integration]

    Pros and Cons of Using a Vendor Provided Analytical Data Model in Your BI Implementation
    Boris Evelson, Information Management Blogs, January 29, 2010

    Pros:
    • Leverage vendor knowledge from prior experience and other customers
    • May fill in the gaps in enterprise domain knowledge
    • Best if your IT dept does not have experienced data modelers
    • May sometimes serve as a project, initiative, solution accelerator
    • May sometimes break through a stalemate between stakeholders failing to agree on metrics, definitions

    Cons:
    • May sometimes require more customization effort than building a model from scratch
    • May create difference of opinion arguments and potential road blocks from your own experienced data modelers
    • May reduce competitive advantage of business intelligence and analytics (since competitors may be using the same model)
    • Goes against agile BI principles that call for small, quick, tangible deliverables
    • Goes against top down performance management design and modeling best practices, where one does not start with a logical data model but rather:
        • Defines departmental, line of business strategies
        • Links goals and objectives needed to fulfill these strategies
        • Defines metrics needed to measure the progress against goals and objectives
        • Defines strategic, tactical and operational decisions that need to be made based on metrics
        • Then, and only then, defines logical model needed to support the metrics and decisions

    Let's discuss.

    Cooking Something New ...
    Change without a recipe is a recipe for chaos. The transformation model must describe not only the steps in the process, but also the enabling context that is critical to its success.

    The Data Modeling Context illustrates the flow of requirements, constraints and results in an Integrated Enterprise Data Warehousing Environment. The diagram makes two important distinctions. The first distinction is shown by the columns representing the Core and Semantic Layers. The Core Layer contains data and information that is neutral from the data sources and from the data and information uses. Data and information in the Semantic Layer is structured based on specific uses.

    The second distinction in the diagram is between a Logical dimension and a Physical dimension. In the Logical dimension the elements of data and information are identified and related. In the Physical dimension these structures are transformed into SQL Data Definition Language statements that define the tables and columns in the database. The Physical structures are constrained by the environment's sourcing and accessing capabilities and expectations.
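
As a rough illustration of the Logical-to-Physical step, the sketch below turns a small logical description into SQL DDL. The `LOGICAL_MODEL` structure, the entity and attribute names, the `TYPE_MAP` type mapping, and the use of SQLite are all assumptions made for this example; they are not the method or the platform described in the presentation.

```python
import sqlite3

# Hypothetical logical model: entity -> list of (attribute, logical type, is_key).
LOGICAL_MODEL = {
    "customer": [
        ("customer_id", "identifier", True),
        ("customer_name", "text", False),
    ],
    "account": [
        ("account_id", "identifier", True),
        ("customer_id", "identifier", False),
        ("balance", "amount", False),
    ],
}

# Assumed mapping from logical types to SQLite column types.
TYPE_MAP = {"identifier": "INTEGER", "text": "TEXT", "amount": "REAL"}

def to_ddl(entity, attributes):
    """Build one CREATE TABLE statement from a logical entity definition."""
    columns = [f"{name} {TYPE_MAP[ltype]}" for name, ltype, _ in attributes]
    keys = [name for name, _, is_key in attributes if is_key]
    if keys:
        columns.append(f"PRIMARY KEY ({', '.join(keys)})")
    return f"CREATE TABLE {entity} ({', '.join(columns)})"

ddl_statements = [to_ddl(entity, attrs) for entity, attrs in LOGICAL_MODEL.items()]

# Executing the statements in an in-memory database checks that the DDL is valid.
conn = sqlite3.connect(":memory:")
for statement in ddl_statements:
    conn.execute(statement)
```

A real physical design would also account for the sourcing and accessing constraints mentioned above (for example indexing or partitioning choices), which this sketch leaves out.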

    Information requirements enter this framework in the upper right and dictate the structures in the Semantic Layer Models. The information requirements may take the form of Business Improvement Opportunities (BIOs). Data possibilities enter from the left. An Industry Logical Data Model (iLDM) may provide data structures for a particular business context along with the available source data.

    The business results are generated at the lower right in the form of data and information provided to address timely business questions. The data and information flow is from the sources to the Core Layer and through the Semantic Layer to the users. The Data Modeling Workstreams address the design of the Logical and Physical Dimensions of the Core and Semantic Layers.
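
The flow from the Core Layer through the Semantic Layer to the users can be sketched with a view defined over neutral core tables. All table, view, and column names here are hypothetical, and SQLite stands in for the warehouse platform purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Core Layer: neutral, normalized tables that are loaded once.
CREATE TABLE core_customer (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE core_sale (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);

-- Semantic Layer: a view shaped for one specific business question.
CREATE VIEW sem_sales_by_region AS
SELECT c.region AS region, SUM(s.amount) AS total_amount
FROM core_sale AS s
JOIN core_customer AS c ON c.customer_id = s.customer_id
GROUP BY c.region;
""")

# Sample rows invented for the example.
conn.executemany("INSERT INTO core_customer VALUES (?, ?)", [(1, "East"), (2, "West")])
conn.executemany(
    "INSERT INTO core_sale VALUES (?, ?, ?)",
    [(10, 1, 100.0), (11, 1, 50.0), (12, 2, 30.0)],
)

def sales_by_region(connection):
    """Users query the semantic view, not the core tables directly."""
    query = "SELECT region, total_amount FROM sem_sales_by_region ORDER BY region"
    return connection.execute(query).fetchall()
```

Additional views for other uses could be defined over the same core tables without reloading them, which is one way to read the "Load Once, Use Many" label on the three-layer slide.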