Data Warehouse Architecture Components and Implementation Options _ Infonitive

2/20/2011 http://infonitive.com/?p=205

Data Warehouse Architecture Components and Implementation Options

August 12, 2010
Categories: Data Warehousing | Data Warehousing Basics | Information Architecture | Operational Data Store | Infonitive Research

At a high level, a generic data warehousing architecture comprises the following components:

Data Staging Layer (DSL)
Operational Data Store (ODS)
Data Warehouse (DW, or EDW for an enterprise)
Data Marts (DM)

Each component's functions are as follows:

Data Staging Layer

Decouples extraction processes from data cleansing processes
Houses the transformation of data formats from the incoming disparate source systems to the single target data warehousing technology platform
Reduces the impact of loads on the transaction systems
Maintains a ‘snapshot’ that allows repeatable and restartable ETL processes
Is an area to store data that uses load schedules different from those of the DW
Is not typically available to end users for data consumption
Acts as an area to store third-party data
Employs data manipulation not appropriate for the DW

The data in the data staging area is: a copy of operational data; deleted after a short period, so non-historical; held in flat tables; and atomic in granularity.
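The snapshot point above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (extract_to_staging, load_from_staging, normalize), the dictionary standing in for the staging area, and the sample rows are all hypothetical.

```python
import copy

# Hypothetical in-memory stand-in for a staging area: batch id -> snapshot of rows.
staging = {}


def extract_to_staging(batch_id, source_rows):
    """Copy source rows into staging once; later calls reuse the stored snapshot."""
    if batch_id not in staging:
        staging[batch_id] = copy.deepcopy(source_rows)
    return staging[batch_id]


def load_from_staging(batch_id, transform):
    """Run a transformation against the snapshot, never against the source system."""
    return [transform(dict(row)) for row in staging[batch_id]]


def normalize(row):
    """Example cleansing step: trim and upper-case the (hypothetical) country code."""
    row["country"] = row["country"].strip().upper()
    return row


source = [{"id": 1, "country": " us "}, {"id": 2, "country": "de"}]
extract_to_staging("2010-08-12", source)

# The source system changes after extraction; the snapshot is unaffected.
source[0]["country"] = "fr"

first_run = load_from_staging("2010-08-12", normalize)
second_run = load_from_staging("2010-08-12", normalize)  # e.g. a restart after a failure
```

Because the transformation reads only the snapshot and works on copies of its rows, a restarted run sees exactly the same input as the first run, and the transaction system is queried only once.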

Operational Data Store (ODS)

An ODS is a hybrid of a data warehouse and operational systems. The functions of an ODS are:

Allows OLTP response times, update capabilities and DSS capabilities
Can be used for operational reporting because data from multiple source systems are integrated at this point
Operational reporting is typically more structured, so dimensional models are not required

Data in the ODS is: kept for 6 months, so non-historical; in flat tables or 3NF; often a copy of operational data; and atomic in granularity.

For architectural purposes, ODSs are classified as follows:

Class I – Real-time
Class II – Two-hour to four-hour
Class III – Daily
Class IV – Aggregated data from the data warehouse
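As a rough illustration, the four classes can be written down as a small Python enumeration with a helper that picks a class from a refresh interval. The names OdsClass and classify_ods are hypothetical, and the exact cut-offs (zero hours for real-time, anything up to four hours for Class II, anything up to 24 hours for Class III) are assumptions made for the sketch; the list above names the classes but does not define boundaries.

```python
from enum import Enum


class OdsClass(Enum):
    """The four ODS classes listed above."""
    CLASS_I = "Real-time"
    CLASS_II = "Two-hour to four-hour"
    CLASS_III = "Daily"
    CLASS_IV = "Aggregated data from the data warehouse"


def classify_ods(refresh_hours, fed_from_warehouse=False):
    """Pick an ODS class; the numeric thresholds are illustrative assumptions."""
    if fed_from_warehouse:
        # Class IV is defined by its source (the warehouse), not by its latency.
        return OdsClass.CLASS_IV
    if refresh_hours == 0:
        return OdsClass.CLASS_I
    if refresh_hours <= 4:
        return OdsClass.CLASS_II
    if refresh_hours <= 24:
        return OdsClass.CLASS_III
    raise ValueError("refresh interval is longer than daily; no class applies")


nightly = classify_ods(24)
intraday = classify_ods(3)
```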

Enterprise Data Warehouse (EDW)

An EDW is the central hub of the BI infrastructure in some architectural approaches. The function of the EDW is to be the single point of truth for the enterprise. Data in the EDW is:

Cleansed and harmonized during the ETL process
Historical
Distributed to various other systems
Often 3NF but can be dimensional
Usually atomic but sometimes aggregate

The degree of normalization in the EDW data model is a tradeoff between multiple properties or characteristics of the data warehouse. The following matrix captures the trade-offs:

Data Marts

A data mart is a departmental, or subject-oriented, subset of the EDW. Data in the data mart comes from the EDW if one exists. The relationship between the EDW and the data marts is often called a hub-and-spoke architecture. Data in a data mart is often aggregated but may be atomic, dimensional, may or may not be historical, and may be volatile.
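The following sketch shows one way a data mart can be derived from an EDW as a subject-oriented, aggregated subset, using Python's built-in sqlite3 module. The table names (edw_sales, mart_retail_daily), columns, and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical atomic EDW table: one row per individual sale.
conn.execute(
    "CREATE TABLE edw_sales (sale_id INTEGER, department TEXT, sale_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO edw_sales VALUES (?, ?, ?, ?)",
    [
        (1, "retail", "2010-08-01", 100.0),
        (2, "retail", "2010-08-01", 50.0),
        (3, "retail", "2010-08-02", 25.0),
        (4, "wholesale", "2010-08-01", 900.0),
    ],
)

# The mart keeps a single department (the subject-oriented subset) and stores
# daily totals instead of individual sales (an aggregated grain).
conn.execute(
    """
    CREATE TABLE mart_retail_daily AS
    SELECT sale_date, SUM(amount) AS total_amount, COUNT(*) AS sale_count
    FROM edw_sales
    WHERE department = 'retail'
    GROUP BY sale_date
    """
)

mart_rows = conn.execute(
    "SELECT sale_date, total_amount, sale_count FROM mart_retail_daily ORDER BY sale_date"
).fetchall()
```

The mart can never be finer-grained than the EDW table it is built from; here the EDW holds individual sales and the mart holds one row per day.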

Data Warehousing Architectural Options

There are three architectural approaches for setting up a flexible, future-oriented data warehouse. This architectural decision is critical, since the selection plays a large role in many aspects of the data warehouse, such as the data modeling methodology used and the characteristics of the physical infrastructure. The data warehouse architecture determines the locations of the data warehouses and data marts themselves, and where the control resides. For example, the data can reside in a central location that is managed centrally. Alternatively, the data can reside in distributed local and/or remote locations that are either managed centrally or independently.

At a high level, the three architectural approaches are:

Enterprise data warehouse (EDW)
Dependent data marts
Independent data marts

These architectural choices do not have to be used exclusively; they can be combined to suit the organization's needs. For example, an often-used option is to combine the EDW and dependent data marts in the same instance. The data marts are then chained to the EDW and loaded in lockstep with it, rather than directly from the staging area. The grain of the marts is restricted by the lowest grain of the EDW, as are their data latency and recency.

Enterprise Data Warehouse:

An enterprise data warehouse supports a large part of business requirements for a more fully integrated data warehousing environment that has a high degree of data access and usage across departments or lines of business. The data warehouse is designed and constructed based on the needs of the business as a whole. It is a common repository for decision-support data that is available across the entire organization. The term “Enterprise” reflects the scope of data access and usage, not the physical structure.

This type of data warehouse is characterized as having all the data under central management. However, centralization does not necessarily imply that all the data is in one location or in one common systems environment. While the data is centralized, it is logically centralized rather than physically co-located. When this is the case, by design, it then may also be referred to as a hub and spoke implementation. The key point is that the environment is managed as a single integrated entity.

Independent data mart architecture:

An independent data mart architecture, as the name implies, comprises standalone data marts that are controlled by particular workgroups, departments, or lines of business. There typically is no connectivity with data marts in other workgroups, departments, or lines of business. Therefore, these data marts do not share any conformed dimensions or conformed facts. This is one of the concerns when using an independent data mart. The data in each may be at a different level of currency, and the data definitions may not be consistent, even for data elements with the same name. Independent data marts are primary candidates for data mart consolidation for companies around the world today. The proliferation of such independent data marts has resulted in issues such as:

Increased hardware and software costs for the numerous data marts
Increased resource requirements for support and maintenance
Many redundant and inconsistent implementations of the same data
Development of many extract, transform, and load (ETL) processes
Lack of a common data model and common data definitions, leading to inconsistent and inaccurate analyses and reports
No data integration or consistency across the data marts
Time spent, and delays encountered, while deciding what data can be used, and for what purpose
Concern and risk of making decisions based on data that may not be accurate, consistent, or current
Many heterogeneous hardware platforms and software environments that were implemented because of cost, available applications, or personal preference, resulting in even more inconsistency and lack of integration
Inconsistent reports due to the different levels of data currency stemming from differing update cycles and, worse yet, data from differing data sources
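A small Python sketch can make the inconsistent-definition problem concrete. It assumes two hypothetical marts, owned by marketing and finance, that each define an "active customer" differently; the function names, the 30-day and 365-day windows, and the sample data are invented for illustration only.

```python
from datetime import date

# Hypothetical customer rows that both departments copied from the same source.
customers = [
    {"id": 1, "last_order": date(2010, 8, 1)},
    {"id": 2, "last_order": date(2010, 5, 15)},
    {"id": 3, "last_order": date(2009, 12, 1)},
]
as_of = date(2010, 8, 12)


def active_customers_marketing(rows):
    """Marketing mart (assumed rule): 'active' means an order in the last 30 days."""
    return sum(1 for r in rows if (as_of - r["last_order"]).days <= 30)


def active_customers_finance(rows):
    """Finance mart (assumed rule): 'active' means an order in the last 365 days."""
    return sum(1 for r in rows if (as_of - r["last_order"]).days <= 365)


marketing_count = active_customers_marketing(customers)
finance_count = active_customers_finance(customers)
```

Both marts report a figure called "active customers", yet the two numbers differ because nothing forces the marts to share one definition.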

Dependent data mart architecture:

A dependent, or interconnected, data mart architecture is basically a distributed implementation. Although separate data marts are implemented in a particular workgroup, department, or line of business, they are integrated, or interconnected, to provide a more global view of the data. These data marts are connected to each other using conformed dimensions and conformed facts. The dependent data marts typically share a common staging area. At the highest level of integration, the combination of all dependent data marts could be thought of as a distributed enterprise data warehouse. In an implementation where the EDW and dependent data mart architectures are combined, the staging area is basically replaced by the EDW.
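To illustrate how conformed dimensions connect separate marts, the sketch below joins two hypothetical departmental marts through one shared product dimension. All names and figures (product_dim, sales_mart, support_mart, the categories) are assumptions made for the example.

```python
# Conformed dimension shared by both marts (hypothetical keys and attributes).
product_dim = {
    101: {"name": "Widget", "category": "Hardware"},
    102: {"name": "Gadget", "category": "Hardware"},
    103: {"name": "Manual", "category": "Books"},
}

# Fact rows owned by two different departments, both keyed by product_dim.
sales_mart = [(101, 500.0), (102, 300.0), (103, 40.0)]  # (product_key, revenue)
support_mart = [(101, 12), (103, 3), (103, 1)]          # (product_key, tickets)


def total_by_category(fact_rows):
    """Roll a mart's facts up to the category attribute of the shared dimension."""
    totals = {}
    for product_key, value in fact_rows:
        category = product_dim[product_key]["category"]
        totals[category] = totals.get(category, 0) + value
    return totals


revenue = total_by_category(sales_mart)
tickets = total_by_category(support_mart)

# Line up both marts on the shared category attribute to get a combined view.
combined = {
    category: {"revenue": revenue.get(category, 0), "tickets": tickets.get(category, 0)}
    for category in sorted(set(revenue) | set(tickets))
}
```

Because both marts use the same product keys and the same category values, their results can be placed side by side without any mapping between department-specific codes.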

© 2010 Infonitive