Laying Foundations for High Performance Data · PDF fileLaying Foundations for High...

26
1 Oracle Data Warehousing Laying Foundations for Laying Foundations for High Performance High Performance Data Warehouses on Oracle Data Warehouses on Oracle [email protected]

Transcript of Laying Foundations for High Performance Data · PDF fileLaying Foundations for High...

Page 1: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

1

Oracle Data Warehousing

Laying Foundations forLaying Foundations forHigh PerformanceHigh PerformanceData Warehouses on OracleData Warehouses on Oracle

[email protected]

Page 2: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

2

State of Literature on DWPerformance

Still searching…………..

High degree of customization

Divergent technology

Unique NeedsWe Must

BuildAnecdotalLibraries

Page 3: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

3

Classic DW Struggles

Dimensional Modeling vs. ER

ETL vs. End-User priorities

Summaries - Nesting vs. de-normalization

Indexes- Bitmaps vs. B-Trees

Page 4: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

4

DM Needs StarTransformation

JOINS, JOINS and MORE JOINSJOINS, JOINS and MORE JOINS

Page 5: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

5

Selecting Partition Keys

Which single Key?

ETL window (Partition Exchange on Load Date)OrBiggest Queries (Store once use many times principle)

Smartkey temptations(Huge value but apply filters on FACT instead of Dims)

Page 6: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

6

Considerations for Pre-Aggregating

The Two dimensional SAS mindset vs. Over hyped OLAP

Should Oracle be a dumb data repository?

Needs high speed JOINs (more efficient than star transforms and bitmap join indexes)

Needs effortless data Transposing (Scalability of MODEL clause)

If not, can it scale for complex analysis??

Page 7: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

7

Indexing Choices

Bitmaps conducive to Star transformations(They do not scale on DML)

Bitmaps – the death spiral (Tim Gorman)

GLOBAL vs. LOCAL Index choices(GLOBAL indexes for the addl partitioning keys)

Page 8: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

8

Design Factors

Logical Considerations

Physical Considerations

Page 9: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

9

Influential Factors

TimelinessTimeliness

Levels of aggregationLevels of aggregation

Types of usage – Ad hoc, drilldown vs. cannedTypes of usage – Ad hoc, drilldown vs. canned

Execute on demand for volatile objects vs. scheduledExecute on demand for volatile objects vs. scheduled

Ratio of Power users to the normal onesRatio of Power users to the normal ones

Nature of Hierarchies of Dimensions (Agility)Nature of Hierarchies of Dimensions (Agility)

Data load frequency and the volume to be processedData load frequency and the volume to be processed

Page 10: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

10

Keeping a low ETL Window

CHUNK Sized ETL

Partitioning PLUS

Data Volumes

Re-invent ETL/Staging

Page 11: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

11

CHUNK Sized ETL

Daily Load Volume

BULK ETL CHUNK ETL

Page 12: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

1210AM 2PM12 Noon

TX_MainTX_holding

TX_FACT

Partitioning PLUS

Page 13: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

13

Chunking Loads

Chunk Size

Time Tofinish

A 40% gain realized on a 5Mill/Day operation on 50Kchunks!!!

Optimize datavolume per

chunk

Minimizeimpact to

online users

1

2

Page 14: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

14

Art of Staging

Page 15: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

15

SCDs vs. RCDs

ETL on Oracle is not conducive to UPDATEs

RCDs are a huge problem to keep ETL Window down

Wage battlesWage battlesUpfront at DimUpfront at DimModeling timeModeling time

Creative Process DesignsCreative Process Designs

Page 16: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

16

‘LAZY-UPDATE’ TRICK

UPDATE a Weekly_DEL Flag for updated records

ADD as NEW records

Use a VIEW to filter out ‘deleted’ records

Weekly house keeping

Should this be a product development suggestion to Oracle!?Should this be a product development suggestion to Oracle!?

Page 17: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

17

An Quick Intro to HybridSystems

A split FACT approach

Page 18: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

18

Hybridization of DWs –Create Sub-FACTs

Legacy data

TX FACT

TB FACT-GF TB FACT-ADJ

TB FACT Union

JV Master

Batch LoadsUser Interface

Page 19: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

19

Hybridization of DWs –Unified FACT-VIEWs

Page 20: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

20

‘Fashioning’ Data Usage

# of Users Accessing Summaries at each level

Data Volume AccessedAt each Level ofAggregation

Page 21: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

21

10g Features of InterestOra_rowscnsRow level System Commit Number (SCN) and TIMESTAMP

Robust Oracle Streams for change propagationThis topic requires a major discussion in itself

Rename Tablespaces for TTSThis is a very useful feature for bulk moving data segments acrossdatabases

Sorted HASH Cluster TablesTables can be stored in hash clusters after pre-sorting on selected columns

Page 22: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

22

10g FeaturesExternal Tables for Read/writeExternal tables now can be read from and written to in parallel

User defined metrics and trackingUser defined metrics on production data to trigger alerts, messages andevents

HTML DBLight weight, operational reports from log tables to SysMan portals

RCG EnhancementsResource Consumer Groups now monitor idle time and trigger sessionterminations

Page 23: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

23

10g Features

Oracle OLAPWaiting to hear the pros and cons of this database embeddedExpress Engine

SQL – MODEL clauseMeets common spreadsheet-like transposing needs

Job SchedulerDBMS_JOB interface has been used in evolving this scheduler

Data PumpThis enhancement will probably revolutionize the ETLarchitectures like never before.

Page 24: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

24

Lessons Learned

Move from ER/DM Puritanism to a practical MIXMove from ER/DM Puritanism to a practical MIX

Never hesitate to customizeNever hesitate to customize

Platform independence is an impractical dreamPlatform independence is an impractical dream

Page 25: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

25

Be Creative

The BEST is yet toThe BEST is yet tocomecome

No packagedNo packagedsolution heavensolution heaven

Usage of DWs isUsage of DWs ischanging rapidlychanging rapidly

Manage performanceManage performanceexpectationsexpectationsAdaptivelyAdaptively

Page 26: Laying Foundations for High Performance Data · PDF fileLaying Foundations for High Performance Data ... High degree of customization Divergent technology ... This enhancement will

26

Q & A

[email protected]