Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance...

65
Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata – a Division of NCR [email protected] 28. 6. 2005

Transcript of Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance...

Page 1: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Teradata

Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing

Dr. Barbara SchulmeisterTeradata – a Division of NCR

[email protected]

28. 6. 2005

Page 2: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Agenda

• History• Definitions• Hardware• Architecture• Fault Tolerance and High Availability• Coexistence• Operational System • Tools and Utilities• Data Distribution• SQL Parser• Active Data Warehouse• Scalability

Page 3: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Born to be parallel!

1984 1985 1986 1987 1988 1989 1990 1991 1992 1994...19931979...

Teradata Corp.

Founded

DBCModel 1:

First MPP System!

First 100GB

System!

“Product of the Year”– Forbes

First 500GB

System!

First 700GB

System!

DBC Model 3

“Fastest Growing Small

Company”– INC

Magazine

“Fastest Growing Electronic Company”

– Electronic Business

DBC Model 4

“Leader in Commercial

Parallel Processing”

– Gartner Group

First Terabyte System!

more

3+ TBSystem!

Teradata Timeline Overview

First Beta system shipped Christmas to Wells Fargo Bank

Joint Venture with NCR for next generation systems

Initial public offering on Wall Street

Page 4: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Teradata Timeline (II)

Over 500 Production

Data Warehouses Worldwide!

First Vendorto Publish1TB TPC-D

Benchmark!

Teradata V2 on

WorldMark 4300

DWI VLDB Best Practice Award w/ ATT BMD:

“Data Warehouse and

the Web”

DB Expo RealwareAward w/

Union Pacific: “Data Warehouse

Innovations”

Only Vendorto PublishMulti-user TPC-Ds!

“#1 in MPP”– IDC

Survey in Computer-

world

Teradata Version 2on NCR

3555 SMP

24TBData Warehousein Production!

Demonstrated World’s Largest Data Warehouse

Database at 11TB!

Teradata V2on WorldMark

5100SMP & MPP

100GBTPC-D

Benchmark Leader!

“...only NCR’s Teradata V2 RDBMS hasproven it can

scale…”– Gartner

Group

1996 1997 ......1995

more

Page 5: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Teradata Timeline (III)

Database Programming

and Design Award

IT Award of Excellence

V2R5 Teradata

• TDWI Solution Provider Best Practices in Data Warehousing

• TDWI Leadership in Data Warehousing Award

• DM Review World-Class Solution Award for business Intelligence

• IT Times Award• DM Review 100 Award• DM Review Readership Award• Intelligent Enterprise Real Ware Award

the commitment continues…

64 bit Teradata

Industry leading TPCH at 1TB and 3TB

Largest Data Ware-house system (176 node, 130 TB disk)

Industry leading TPC-D benchmark for all volumesTeradata V2

ported to Microsoft

Windows NTTeradata attains

99.98% availability

1999...1998 2000 2001 2002 2003

V2R6 Teradata

2004 2005

Linux

Page 6: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Alternative Approaches to Enterprise Analytics

Sources

Users

DW

Sources

Users

DW

Marts

Sources

Users

Marts

Sources

Users

Middleware

Data Mart Centric

Virtual,Distributed,Federated

Hub-and-Spoke DataWarehouse

EnterpriseData

Warehouse

• Requires corporate leadership and vision

• Business Enterprise view challenging

• Redundant data costs• High DBA and

operational costs• Data latency

• Only viable for low volume access

• Meta data issues• Network bandwidth and

join complexity issues• Workload typically placed

on workstation

• Business Enterprise view unavailable

• Redundant data costs• High ETL costs• High App costs• High DBA and

operational costs

Cons

• Single Enterprise “Business” View

• Data reusability• Consistency• Low Cost of Ownership

• Allows easier customization of user interfaces & reports

• No need for ETL• No need for separate

platform

• Easy to Build Organizationally

• Limit Scope• Easy to Build

Technically

Pros

Centralized Integrated Data With

Direct Access

Hub-and-Spoke Data Warehouse

Leave Data Where it Lies

Independent Data Marts

Page 7: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

A Spectrum of Data Warehouse Architectures

Sources

Users

DW

Sources

Users

DW

Marts

Sources

Users

Marts

Sources

Users

Middleware

Data Mart Centric

Virtual,Distributed,Federated

Hub-and-SpokeData

Warehouse

EnterpriseData

Warehouse

Teradata’s Advocated Data Warehouse

Approach for 20 years, Since 1984!

The goal: Any question, on any data, at any time.

Page 8: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Most time consuming steps:

ll Full scan of big tablesFull scan of big tablesll Complexe joinsComplexe joinsll AggregationAggregationll Sorting Sorting

Frequency of steps OLTP or DSS

OLTP DSS

Diffentiating OLTP - DSS

Page 9: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

NCR Server

• Provide customers with growth opportunities and investment protection> Coexistence is enabled

across five generations– NCR 5400E & 5400H

Servers– NCR 4980 & 5380 Servers– NCR 4950 & 5350 Servers– NCR 4900 & 5300 Servers– NCR 485X & 525X Servers

NCR Server Generations

485X&

525X

4900&

5300

4950&

5350

4980&

5380

5400E&

5400H

BYNET V2 / V3

Page 10: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

NCR 5400 Server SMP

• 5400E > 1 - 4 nodes> BYNET V2> ESCON & FICON for 3 and

4 node configurations> Field Upgradeable to

5400H1

3

1

3

1

3

1

3

1

3

3rd Node

4th Node

Internal BYNET

switches

3GSM

1st Node1st Node2nd Node2nd Node

Ethernet Switches

Three UPS Modules

Up to 4 nodes within each cabinet

Server Management

Page 11: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

NCR 5400 Server MPP

1

3

1

3

1

3

1

3

1

3

1

3

1

3

1

3

1

3

1

3

1

3

Ethernet Switches

BYNET V3 Switches

Five UPS Modules

Up to 10 nodes within each cabinet

Server Management

• Continued rapid adoption of latest Intel® Technology> Dual Intel Pentium Xeon EM 64T 3.6

GHz processors with Hyper-Threading(32-bit and 64-bit capability)

> 800 MHz front side bus• Industry Standard Form Factor

> Up to 10 nodes per cabinet> Integrated BYNET V3 (provides the

capability to physical separate systems between 300-600 meters)

> Integrated Server Management> N+1 UPS> Dual AC

• Multi-Generation Coexistence > Investment protection

Page 12: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Industry CPU Performance per Core

0

500

1000

1500

2000

2500

3000

2004 2005 2006 2007

Year

Itanium 2 1.6 Ghz130nm

Power 4+1.45Ghz130 nm

Ultrasparc 3130 nm1.6Ghz

Power 5~1.9Ghz130 nm

Xeon3.0Ghz 1M130nm

Xeon 3.6 Ghz 90nm

54000Xeon 2M L2 3.6 Ghz 90nm

Xeon2M L2>3.6 Ghz 90nm

Dual Core65 nm

Next Gen Arch. Dual Core65 nm

Multi Core45 nm

Itanium 29M130nm

Power 5+~2.5Ghz90 nm

Montecito90nm

TukwillaCommon Platform65nm

Rock90nm

Power 6~3Ghz65nm

Relative CPU Performance per Core

Symmetric Multi Threading (Hyper Threading)

Dual Core

Xeon

Itanium

Power

Sparc

Multicore, Multithreaded

Relative CPU Performance based on multi-threading and multi-core roadmap capabilities

www.spec.org: benchmarks SPECint2000 and SPECint_rate2000

Page 13: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Gartner Product Ranking 2004 ASEM

FUJITSU HP HP HP IBM NCR SUN

P

rim

e-p

ow

er

HP

9000

Inte

gri

ty

Pro

lian

t

pS

erie

s

Ter

adat

a

Su

nfi

re

PRODUCT 43 45 46 29 45 54 40

The Product category (which was called Technology in previous ASEM updates) focuses on the performance and reliability/availability aspects of each platform. In this category Teradata received a very strong 93.5% of total possible points and leads the IBM pSeries with 74.35% by 44 points or 19%.

Source Gartner 2004 ASEM Report

Page 14: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

NCR Enterprise Storage 6842

• NCR Enterprise Storage 6842 Features> Two array modules per cabinet> 56, 73GB, 15K drives

– greater than 8 Terabytes of spinning disk per cabinet

> Dual Quad Fibre Channel Controllers per array for performance and availability

> Typical configuration is 4 NCR 5400 Server nodes per 3 –6842 arrays– 1.2 Terabytes of database space per node (RAID 1)

> Supports RAID 1 and RAID 5> Support for MP-RAS and Microsoft Windows Server 2003

environments

Page 15: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

192

MP-RAS and Windows

RAID-1 Only

MPP: supports 2, 3, or 4 nodes per cabinet

73GB – 15K RPM

DMX 2000 M2

RAID -1 OnlyRAID Options

MP-RAS and WindowsOperating Environment

MPP: supports 1 or 2 nodes per cabinetTeradata Use

73GB – 15K RPMDisks

96Maximum Teradata disks

DMX 1000 M2EMC Model

EMC Symmetrix DMX

• Enterprise Fit• Storage Standardization• Extended storage life

through Redeployment

Page 16: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Assumption: Compute and Storage Balance

• A balanced configuration is one where the storage I/O subsystem for each compute node is configured with enough disk spindles, disk controllers, and connectivity so that the disk subsystem can satisfy the CPU demand from that node.

• A supersaturated configuration also can satisfy the CPU demand from that node although the extra I/O may be underutilized.> This is useful for investment protection on certain

upgrade paths.

• All system configurations discussed in this presentation are based on balanced or supersaturated compute nodes.

Page 17: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Node CPU and Storage I/O Balance

Qu

ery

Resp

on

se T

ime

Best Query Response Time

Effective NodeUtilization

# of Disk Drives/Storage Capacity

I/O

Ban

dw

idth

–M

B/

sec

OptimumNode/Storage

Balance and

Response Time

Query R

esponse Time

Band

width

95% Node Utilization

Industry wide, disk drive capacity is increasing at a faster rate than disk drive performance

Page 18: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Common Upgrades Applied

Query Response Time Increases

because you didn’t add more compute power to support

the additional raw data volume.

SYSTEM with CURRENT Nodes

Raw Data Volume

Qu

ery

Resp

on

se T

ime

GROW RAW DATA VOLUME Performance more than adequate: Add more data to all nodes

Page 19: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

LINEAR GROWTHMaintain Query Performance with more nodes

Typical System Expansion

SYSTEM with Current Nodes

SYSTEM with More or Faster Nodes

Query Response Time Remains Constant

because you add proportionally more raw data volume as

compute power.

Scale out with Teradata by adding compute nodes,

interconnect, storage arrays, and disks.

aka “horizontal scalability”

Raw Data Volume

Qu

ery

Resp

on

se T

ime

Page 20: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Common Upgrades Applied

Query Response Time Decreases

because you didn’t add more raw data volume to

offset the increase in compute power.

“Scale vertically” with Teradata by increasing

compute power.

SYSTEM with Current Nodes

SYSTEM with More or Faster Nodes

Raw Data Volume

Qu

ery

Resp

on

se T

ime

GROW QUERY PERFORMANCE Raw Data Volume adequate: Upgrade to faster CPUs

Page 21: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

SYSTEM with Current Nodes

SYSTEM with More or Faster Nodes

Scale to Target query performance and data volume by increasing compute power

and adding storage.

Adjust Query Performance and Data Volume to match service level agreement

Raw Data Volume

Qu

ery

Resp

on

se T

ime

Combo: Upgrade Nodes and Increase Storage Per Node

Page 22: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Improve query performance and adjust data volume by

reducing storage per node and adding more nodes.

SYSTEM with Current Nodes

SYSTEM with More Nodes

Improve Query Performance and Adjust Data Volume to match service level agreement

Raw Data Volume

Qu

ery

Resp

on

se T

ime

Scaling by Reconfiguration and Expansion

Page 23: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

CPU(s)

CPU(s)

CPU(s)

CPU(s)

CPU(s)

CPU(s)

CPU(s)

CPU(s)

Cache

Cache

Cache

Cache

Cache

Cache

Cache

Cache

Memory

Memory

Memory

Memory

Memory

Memory

Memory

Memory

Disk Storage

Disk Storage

Disk Storage

Disk Storage

Disk Storage

Disk Storage

Disk Storage

Disk Storage

BY

NE

T F

ab

rics

Teradata Shared-Nothing MPP• Designed for Slope of 1 Linear Scaling • Optimized for very high data rates to/from disk• Excellent performance and efficiency for data

warehousing

Architecture Determines Scalability

Interconnect used only for database messages, no I/O or memory traffic

All memory accesses are local

CPU uses independent direct I/O path to Disk

Page 24: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Teradata MPP Architecture

• Nodes> Incrementally scalable to

1024 nodes> Windows or Unix

• Storage> Independent I/O> Scales per node

• BYNET Interconnect> Fully scalable bandwidth

• Connectivity> Fully scalable> Channel – ESCON/FICON> LAN, WAN

• Server Management> One console to view

the entire system

SMP Node1 SMP Node2 SMP Node3 SMP Node4

Server Management

CPU1 CPU2 CPU1 CPU2 CPU1 CPU2 CPU1 CPU2

Dual BYNET Interconnects

Memory Memory Memory Memory

Page 25: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Teradata Node SW Architecture (SMP) 4-Node MPP Clique

Perfectly Tuned Nodes Working in Parallel for Scalability and Availability

Node Software ArchitectureD

isk A

rray

VP

RO

CS

BYNET

VPROCsAMP & PE

VPROCsAMP & PE

VPROCsAMP & PE

VPROCsAMP & PE

CommunicationInterfaces

Access Module

ProcessorVPROCS

Parsing EngineVirtual

Processors(VPROCS)

UNIX, Windows 2000

Parallel Database Extensions (PDE)

PE2

PE1 AMP1 AMP5

AMP2 AMP6

AMP3 AMP7

AMP4 AMP8

Channel Gateway

LAN Gateway

PEs recieve the queries and figure out the query planAMPs interact with the disk arrays and process the data

Parser

PE VProc

Optimizer

DispatcherSessionControl

Relational Database Management

File System / Data Management

VAMP

Page 26: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Specifically designed for data warehousing workloads

The Scalable BYNET Interconnect

Multiple SimultaneousPoint-to-Point Messaging

BroadcastMessaging

Node

Node

Node

Node

Node

Node

Node

Node

Node

The Teradata Optimizer

chooses between Point-to-Pointand Broadcast

Messaging to select the most effective communication.

• Bandwidth scales linearly to 1,024 nodes• Redundant, fault tolerant network• Guaranteed message delivery

Node

Page 27: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Traditional Configuration

X

Built-In Integrated Fail Over

• Teradata provides built-in node failover.> Cost effective> Easy to deploy

• Work migrates to the remaining nodes in the cliques. • System performance degradation up to 33%.

Page 28: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Node Node Node Node Node Node Node Node

Disk Array Disk Array Disk Array Disk Array Disk Array Disk Array

86% SystemPerformance

Continuity

Large Cliques

• Double the number of nodes in a clique up to 8.• Work distributed across a greater number of nodes.• Minimize system performance impacts – may not be

noticeable to end-users.

XFibre Channel

Switches

Page 29: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Node Node Node Hot Standby

Disk Array Disk Array Disk Array

100% System Performance

Continuity

Hot Standby Nodes

• Work re-directed to a Hot Standby Node.• No system performance impacts.• Teradata restart can be postponed to a maintenance

window.

X

Page 30: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Node Node Node Node Node NodeNode Hot Standby

Disk Array Disk Array Disk Array Disk Array Disk Array Disk Array

100% System Performance

Continuity

Large Clique + Hot Standby Node

• Same performance benefits of Hot Standby node.• Reduced costs for larger system implementations.

Fibre Channel Switches

X

Page 31: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

BYNET

Server Nodes

DiskArray Subsystem

Clique

High Availability

Case Hardware Teradata

Power FailureUPS (redundant),

Dual ACNode Failure VPROC Migration (VAMP, PE)Bynet failure Redundant BYNET

Disc failureRAID-1/-5/-S in Disc

SubsystemMore than one Disc Failure Fallback-Option

Clique Failure Fallback-Option

Page 32: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Coexistence Considerations

• VAMPs manage the same amount of data• Coexistence enables the faster nodes to be realized by

running more VAMPs per node

1x 1.5x 2.0xPerformance Factor

Generation xVAMPS

AMP AMP AMPAMP

AMP AMP AMPAMP

Generation xVAMPS

AMP AMP AMPAMP

AMP AMP AMPAMP

AMP AMP

AMP AMP

Generation xVAMPS

AMP AMP AMPAMP

AMP AMP AMPAMP

AMP AMP AMPAMP

AMP AMP AMPAMP

Page 33: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

SMP Node1 SMP Node2 SMP Node3 SMP Node4

AMP AMP AMP AMP AMP AMP AMP AMP AMP AMP AMP AMP

System Expansion with Teradata Coexistence

• The utilization of multiple generations of hardware within a single Teradata MPP system

SMP Node1 SMP Node2 SMP Node3 SMP Node4

Server Management

AMP AMP AMP AMP AMP AMP AMP AMP AMP AMP AMP AMP

Dual BYNET Interconnects

5380 - 9 AMPs per node 5400 - 12 AMPs per node

Page 34: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

11/2000Original Footprint

8 Node 5250

Customer Example (72 Nodes, 4 Generations)

Generation “A”

1/2001 Expansion 1

12 Node 5250

11/2001 Expansion 28 Node 5255

6/2002 Expansion 312 Node 5300

Generation “B”

6/2003 Expansion 416 Node 5350

Generation “C”

6/2004 Expansion 5

16 Node 5380

Generation “D”

2005Future 5400Expansion

Generation “E”

Page 35: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Database and Operating System

• Database> Teradata V2R6> Support one Release Back V2R5.1.X (Current Exception in Place V2R5.0.3)

• Unix> MP-RAS 3.03 required for Teradata Database on 5400> MP-RAS 3.02 still supported on previous server generations

• Microsoft Windows> Microsoft Windows Server 2003 recommended for new and expanding 5400 motions> Microsoft Windows 2000 supported in 2Q 2005

485x/525x 4900/5300 4950/5350 4980/5380 5400

V2R5.0.3, V2R5.1.X and V2R6 on MP-RAS 3.02

V2R6 on WS03 (2Q 2005)

V2R5.0.3, V2R5.1.X, V2R6 on MP-RAS 3.03

V2R5.0.3, V2R5.1.X, V2R6 on W2K (2Q 2005)

Page 36: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

• Analytic extensions such as Extended Windows Functions & Multiple Aggregate Distincts

• Random Stratified Sampling

• Join Elimination• Extended Transitive

Closure• Partial Group By • Early Group By • Derived Table Rewrite• Very Large SQL

Tactical & Event-DrivenDecision Making• Partial Covering Join Index• Global Index• Sparse Index• Join Index Extensions• ODS Workload Optimization• Stored Procedures Enhancements

Single Version ofthe Truth

• Cylinder Read • Partitioned Tables (PPI)• Value List Compression• 2000 Columns, 64 Columns per Index • Identity Column

•Index Wizard•Statistics Wizard & Collect Statistics Improvements

•Query Log•Extreme Workload Management & Administration

•Roles and Profiles•SQL Assistant/Web Edition

•Availability•Performance Dashboard & Reporting

Trusted,Integrated Environment

Strategic Decision Making• Extended Grouping• Inner Join Optimization• Eliminate Unnecessary

Outer Joins• Hash Joins• UDFs for Complex Analysis

and Unstructured Data

V2R5.0 Features

• Security enhancements (Encryption)• DBQL enhancements • Database Object Level Use Count• ROLES enhancements• Priority Scheduler enhancements• TDQM enhancements• No Auto Restart After Disk-Array Power

Failure• Cancel Rollback• Incompatible Package Warning• Disk I/O Integrity Check

• Enhancements to Triggers• Extra FK-PK Joins in Join Index• UDFs for XML Processing etc.

• Enhancement to Identity Column• UTF16 Support• PPI Dynamic Partition Elimination• Large Objects (LOBs)

+ V2R5.1 Features

• Continuous Update Performance & Manageability

• Faster Join Index Update• Join Update Performance• Bulk Update Performance• Teradata Warehouse Builder Full Functionality

& Platform Support

• Continuous Update Performance & Manageability

• Faster Join Index Update• Join Update Performance• Bulk Update Performance• Teradata Warehouse Builder Full Functionality

& Platform Support

Data FreshnessData Freshness

•UDFs for Data Transformation and Scoring

Page 37: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

V2R6.0 Feature List

• Remove 1MB limit on plan cache size• Increase response buffer to 1MB• Table header expansion• Improve Random AMP Sampling• Top N Row Operation• Recursive Queries

Tactical & Event-Driven DecisionMaking• Improve Primary Index Operations\• Improved IN-list processing• External Stored Procedures• Trigger calling a Stored Procedure• Stored Procedure Internals Enhancements• Queue tables

Single View of Your Business

• Stored Procedure LOB support• External Table Function• Partition level BAR• Eliminate indexed row IDs (PPI)• PPI Join performance improvement• DBS Information consolidation

Trusted,Integrated Environment

Strategic Decision Making

• Teradata Dynamic Workload Management

• Extensible User Authentication• Directory Integration• Global deadlock logging• Faster Rollbacks

Data FreshnessData Freshness

• Replication Services • Array support • Priority Scheduler enhancements• Reduce restart time

Page 38: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Teradata Utility PakTeradata AdministratorTeradata SQL AssistantTeradata SQL Assistant/Web Edition BTEQODBCJDBCCLIOLE DB Provider

Teradata Tools & Utilities (1)

Teradata Database

Any Query,Any Time

.com

Database ManagementTeradata ManagerTeradata Dynamic Query Manager Teradata System Emulation ToolTeradata Visual ExplainTeradata Index WizardTeradata Statistics Wizard

MetadataTeradata Meta Data Services

Load/UnloadTeradata Warehouse Builder FastLoad, MultiLoad & FastExportTeradata TPumpAccess Modules

Mainframe ConnectivityMainframe Channel ConnectTS/API, CICS, HUTCNS & IMS/DC

Page 39: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Teradata Warehouse

ParallelIn

ParallelOut

Technical Differentiator: Database Utilities

Teradata Utilities Are Robust and Mature

• Teradata utilities are fully parallel.• Teradata utilities have checkpoint restart capability.• Data loads directly from the source into the database.

> No manual data partitioning.> No file splitting.> No intermediary file transfers.> No separate data conversion step.

Page 40: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Teradata Tools & Utilities (2)

• Teradata Data Profiler• Teradata CRM• Teradata Warehouse Miner• Teradata Demand Chain Management• Teradata Supply Chain Management• .....

(LDM = Logical Data Model)

• Financial Solution LDM• Retail LDM• Communication LDM• Insurance/Healthcare LDM• Manufacturing LDM• Government LDM• Media and Entertainment LDM• Travel/Transportation LDM

Page 41: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Two Basic Software Architecture ModelsTask Centric and Data Centric

Uniform and shared accessto all platform resources(disk, etc) is REQUIRED

Exclusive access to a subset of resources

Data

Data Data

Data

Parallel Unit Parallel Unit

ParallelOptimizerTask

Request

DATA DATA

Task

Request

SharedMemory

Request Request

Page 42: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Data Centric Software: Teradata Virtual AMP

• Each virtual AMP has rows from every table• Each virtual AMP works independently on its rows• Goal: Database rows are equally distributed across

multiple tables

16151413121110987654321

282726252423222120191817

Table A Table B Table C

2521171395

26221814106

27231915117

28242016128

1 2 3 4

P

DM

P

DM

P

DM

P

DM

AMP 1 AMP 2 AMP 3 AMP 4

Tables AMP - Balanced collection of three abstracted platform resources

P

DMMemory Disk

Processor

Page 43: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Data distribution by Primary Index

Node1 Node2 Node3 Node4 Node1 Node2 Node3 Node4 Node1 Node2 Node3 Node4 Node1 Node2 Node3 Node4

Dual BYNET Interconnects

Row Hash (32 bits)

Primary Index value for a row

Hashing Algorithm

Hash Map

Current configuration PrimaryCurrent configuration FallbackReconfiguration PrimaryReconfiguration Fallback

Destination Selection Word(DSW) – first 16 bits

Page 44: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Teradata Hashing

7225

32 bit Row Hash Remaining16 bitsDSW #

Hashing Algorithm

0000 0000 0001 1010 1100 0111 0101 1011

AMP 1 AMP 2 AMP x

7225 2 4/13 O

AMP 3

V V V V0 0 1 A

(Hexadecimal)

000 01 02 03 04 05 06 07 08 01 02 03 04 05 001 02 03 04 05 06 07 08 01 02 03 04 05 001 02 03 04 05 06 07 08 01 02 03 04 05 001 02 03 04 05 06 07 08 01 02 03 04 05 001 02 03 04 05 06 07 08 01 02 03 04 05 001 02 03 04 05 06 07 08 01 02 03 04 05 0

HASH MAP

0 1 2 3 4 5 6 7 8 9 A B C D E F

001002003004005

Bucket Number

SELECT * FROM ORDERWHERE order_number = 7225;

Table ORDERO r d e r

N u m b e rC u s to m e r

N u m b e rO r d e rD a te

O r d e rS t a t u s

P K

U P I

7 3 2 57 3 2 47 4 1 57 1 0 37 2 2 57 3 8 47 4 0 27 1 8 87 2 0 2

231121312

4 / 1 34 / 1 34 / 1 34 / 1 04 / 1 54 / 1 24 / 1 64 / 1 34 / 0 9

OOCOCCCCC

Page 45: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Primary Index Choice Criteria

ACCESSMaximize one_AMP operations: choose the column most frequently

used for access

DISTRIBUTIONOptimize parallel processing: choose a column that provides good

distribution

VolatilityReduce maintenance resource overhead (I/O): choose a column

with stable data values

Page 46: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Data distribution by Primary Index - 2

48 bit table ID 32 bit row hash value Index value

Node1 Node2 Node3 Node4 Node1 Node2 Node3 Node4 Node1 Node2 Node3 Node4 Node1 Node2 Node3 Node4

Dual BYNET Interconnects

SQL requestParser algorithm

Logical block identifier

Page 47: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

SQL Parser Overview

Request Parcel

AMP Steps

Statistics

Data Parcel

Generator

GNCApply

Resolver

Optimizer

Syntaxer

Security

Cached?

DDDBase, AccRightsTVM, TVFieldsIndexes

Costs

Serial stepsParallel stepsIndividual and common steps (MSR)Additional:Triggers, check constraints, references, foreign keys, join indexescollected statistics or dynamic sampling

Page 48: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Statistics Summary

Collect statistics on• all non-unique indexes• UPI of any table with less than x rows per AMP

(dependent on available number of AMPs)• All indexes of a join index• any non-indexed column used for join constraints• indexes of global temporary tables

Collected statistics are not automatically updated by the system

Refresh statistics when 5-10% of the table rows have changed

Page 49: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Database Workload Continuum

EnterpriseData

Warehouse

Strategic Decision RepositoriesTransactional Repositories

OLTPi OLTPnOLTP1 • • •ODS1

Tactical Decision Repositories

• • •ODS2

• • •

Workload Continuum

Transactional (OLTP)

• User Profiles• Customers• Clerks

• Services:• Transactions• Bookkeeping

• Access Profile:• Frequent updates• Occasional lookup

• Data:• Current “state” data• Limited history• Narrow Scope

Strategic (EDW)

• User Profiles• Back Office Services• Management• Trading Partners

• Services:• Strategic decisions• Analytics (e.g. scoring)

• Access Profile:• Bulk Inserts – Some Updates• Frequent complex analytics

• Data Model:• Periodic “state” data• Deep history• Enterprise integrated view

Tactical (ODS)

• User Profiles• Front Line Services• Customers - Indirectly

• Services:• Lookups• Tactical decisions• Analytics (e.g. scoring)

• Access Profile:• Continuous updates• Frequent lookups

• Data Model:• Current “state” data• Recent history• Integrated business areas

Page 50: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Database Workload Continuum

ActiveData

Warehouse

Tactical and Strategic Decision RepositoriesTransactional Repositories

OLTPi OLTPnOLTP1

Workload Continuum

Transactional (OLTP)

• User Profiles• Customers• Clerks

• Services:• Transactions• Bookkeeping

• Access Profile:• Frequent updates• Occasional lookup

• Data:• Current “state” data• Limited history• Narrow Scope

Strategic (EDW)

• User Profiles• Back Office Services• Management• Trading Partners

• Services:• Strategic decisions• Analytics (e.g. scoring)

• Access Profile:• Bulk Inserts – Some Updates• Frequent complex analytics

• Data Model:• Periodic “state” data• Deep history• Enterprise integrated view

Tactical (ODS)

• User Profiles• Front Line Services• Customers - Indirectly

• Services:• Lookups• Tactical decisions• Analytics (e.g. scoring)

• Access Profile:• Continuous updates• Frequent lookups

• Data Model:• Current “state” data• Recent history• Integrated business areas

Page 51: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Data Sophistication

Continuous update and time-sensitive queries

become important

OPERATIONALIZINGWHAT IS happening?

Event-based triggering takes hold

ACTIVATINGMAKE it happen!

Primarily batch andsome ad hoc reports

Increase in ad hoc analysis

ANALYZINGWHY

did it happen?

REPORTINGWHAT

happened?Analytical modeling

grows

PREDICTINGWHAT WILL

happen?

Batch

Ad Hoc

Analytics

Continuous Update/Short Queries

Event-Based Triggering

Data Warehouse Needs Will Evolve

• Query complexity grows• Workload mixture grows• Data volume grows• Schema complexity grows• Depth of history grows• Number of users grows • Expectations grow

Single View of the Business – Better, Faster Decisions – Drive Business Growth

Wo

rklo

ad

Co

mp

lexit

y

SimultaneousWorkloads:

Strategic, tactical, loading

Increasing depth and breadth of

users and queries

Increasing depth and breadth of data

Page 52: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Data Sophistication

Continuous update and time-sensitive queries

become important

OPERATIONALIZINGWHAT IS happening?

Event-based triggering takes hold

ACTIVATINGMAKE it happen!

Primarily batch andsome ad hoc reports

Increase in ad hoc analysis

ANALYZINGWHY

did it happen?

REPORTINGWHAT

happened?Analytical modeling

grows

PREDICTINGWHAT WILL

happen?

Batch

Ad Hoc

Analytics

Continuous Update/Short Queries

Event-Based Triggering

Data Warehouse Needs Will Evolve

• Query complexity grows• Workload mixture grows• Data volume grows• Schema complexity grows• Depth of history grows• Number of users grows • Expectations grow

Single View of the Business – Better, Faster Decisions – Drive Business Growth

Wo

rklo

ad

Co

mp

lexit

y

Measure

Chasm from static to dynamic decision-

making

Optimize

Execute

Automate

Understand

Page 53: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Database Requirement:Data Warehouse

Foundation must handle multi-dimensional growth!

Data Sophistication

Continuous update and time-sensitive queries

become important

OPERATIONALIZINGWHAT IS happening?

Event-based triggering takes hold

ACTIVATINGMAKE it happen!

Primarily batch andsome ad hoc reports

Increase in ad hoc analysis

ANALYZINGWHY

did it happen?

REPORTINGWHAT

happened?Analytical modeling

grows

PREDICTINGWHAT WILL

happen?

Batch

Ad Hoc

Analytics

Continuous Update/Short Queries

Event-Based Triggering

Data Warehouse Needs Will Evolve

• Query complexity grows• Workload mixture grows• Data volume grows• Schema complexity grows• Depth of history grows• Number of users grows • Expectations grow

Single View of the Business – Better, Faster Decisions – Drive Business Growth

Wo

rklo

ad

Co

mp

lexit

y

Page 54: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

The Multi-Temperature Warehouse

• Customers desire deep historical data in the warehouse.> The access frequency or average temperature of data

varies.– HOT, WARM, COOL, dormant

> Seamless management required.

• Teradata systems can address this need through a combination of technologies, such as:> Partitioned primary index (PPI).> Multi-value compression.> Priority scheduler.

Page 55: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Priority Scheduler

Three Tiers of Workload Management

ADW

Database Query Log

Teradata DynamicQuery Manager

Pre-Execution

Post-Execution

•Control what and how much is allowed to begin execution

•Manage the level of resources allocated to different prioritiesof executing work

(prioritized queues)

•Analyse query performance and behavior after completion

Page 56: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Teradata Dynamic Query Manager

Page 57: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Indexes

• PI (UPI and NUPI)• SI (USI and NUSI)• Join Index

single table indexmulti table indexaggregated indexsparse index (where clause used)partial covering global

• Materialized Views (join index)

Page 58: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

An Integrated, Centralized Data Warehouse Solution Database Must Scale in Every Dimension

Data Volume(Raw, User Data)

SchemaSophistication

QueryFreedom

QueryComplexity

DataFreshness

Query Data Volume

QueryConcurrency

MixedWorkload

Page 59: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

An Integrated, Centralized Data Warehouse Solution Database Must Scale in Every Dimension

Data Volume(Raw, User Data)

SchemaSophistication

QueryFreedom

QueryComplexity

DataFreshness

Query Data Volume

QueryConcurrency

MixedWorkload

TheTeradata

Difference

Page 60: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Data Volume(Raw, User Data)

Customers Need to Evaluate “Real Life”

Workloads

SchemaSophistication

QueryFreedom

QueryComplexity

DataFreshness

Query Data Volume

The Teradata Difference“Multi-dimensional Scalability”

QueryConcurrency

MixedWorkload

Good Example?TPC-H

Benchmark

Page 61: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Companies generating >80% of the industry revenue utilize Teradata Data Warehousing

Teradata Experiencein the Communications Industry

Page 62: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Some of Teradata’s Retail Customers Worldwide

Page 63: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

65% of Top Global Retailers

60% of Top Most Admired

Global Companies

80% of Top Global Telco Firms

70% of Top Global Airlines

50% of the Top Transportation

Logistic Firms

Teradata is Well-Positioned in the Top Global 3000 Industries

• Leading industries> Banking > Government> Insurance & Healthcare> Manufacturing > Retail> Telecommunications> Transportation Logistics> Travel

• World class customer list> More than 750 customers> Over 1200 installations

• Global presence > Over 100 countries

FORTUNE Global Rankings, April 2005

Page 64: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Industry Leaders Use Teradata

• Leading industries> Banking > Government> Insurance & Healthcare> Manufacturing > Retail> Telecommunications> Transportation Logistics> Travel

• World class customer list> More than 750 customers> Over 1200 installations

• Global presence > Over 100 countries

Teradata Global 400 Customers

32% ofFinancial Services

Industry

54% of Retailers

50% ofTransportation

Industry

50% of Telco Industry

19% of Manufacturers

Page 65: Teradata - informatik.hu-berlin.de€¦Teradata Architecture, Technology, Scalabilty, Performance and Vision for Active Enterprise Data Warehousing Dr. Barbara Schulmeister Teradata

Data Volume(Raw, User Data)

SchemaSophistication

QueryFreedom

QueryComplexity

DataFreshness

Query Data Volume

QueryConcurrency

MixedWorkload

TheTeradata

Difference

www.teradata.com