Teradata - hu


  • Teradata

    Architecture, Technology, Scalability, Performance and Vision for Active Enterprise Data Warehousing

    Dr. Barbara Schulmeister, Teradata, a Division of NCR

    Barbara.Schulmeister@ncr.com

    28. 6. 2005

  • Agenda

    History; Definitions; Hardware Architecture; Fault Tolerance and High Availability; Coexistence; Operational System; Tools and Utilities; Data Distribution; SQL Parser; Active Data Warehouse; Scalability

  • Born to be parallel!

    Teradata Timeline Overview (1979 to 1994)

    Teradata Corp. founded
    DBC Model 1: First MPP System!
    First 100GB System!
    Product of the Year (Forbes)
    First 500GB System!
    First 700GB System!
    DBC Model 3
    Fastest Growing Small Company (INC Magazine)
    Fastest Growing Electronic Company (Electronic Business)
    DBC Model 4
    Leader in Commercial Parallel Processing (Gartner Group)
    First Terabyte System!
    3+ TB System!
    First Beta system shipped Christmas to Wells Fargo Bank
    Joint Venture with NCR for next generation systems
    Initial public offering on Wall Street

  • Teradata Timeline (II)

    1995 to 1997

    Over 500 Production Data Warehouses Worldwide!
    First Vendor to Publish 1TB TPC-D Benchmark!
    Teradata V2 on WorldMark 4300
    DWI VLDB Best Practice Award w/ ATT BMD: Data Warehouse and the Web
    DB Expo Realware Award w/ Union Pacific: Data Warehouse Innovations
    Only Vendor to Publish Multi-user TPC-Ds!
    #1 in MPP (IDC Survey in Computerworld)
    Teradata Version 2 on NCR 3555 SMP
    24TB Data Warehouse in Production!
    Demonstrated World's Largest Data Warehouse Database at 11TB!
    Teradata V2 on WorldMark 5100 SMP & MPP
    100GB TPC-D Benchmark Leader!
    "...only NCR's Teradata V2 RDBMS has proven it can scale" (Gartner Group)

  • Teradata Timeline (III)

    1998 to 2005

    Database Programming and Design Award
    IT Award of Excellence
    V2R5 Teradata
    TDWI Solution Provider Best Practices in Data Warehousing
    TDWI Leadership in Data Warehousing Award
    DM Review World-Class Solution Award for Business Intelligence
    IT Times Award
    DM Review 100 Award
    DM Review Readership Award
    Intelligent Enterprise Real Ware Award
    64 bit Teradata
    Industry leading TPC-H at 1TB and 3TB
    Largest Data Warehouse system (176 nodes, 130 TB disk)
    Industry leading TPC-D benchmark for all volumes
    Teradata V2 ported to Microsoft Windows NT
    Teradata attains 99.98% availability
    V2R6 Teradata
    Linux
    ...the commitment continues

  • Alternative Approaches to Enterprise Analytics

    [Diagram: four architectures, each connecting Sources to Users: through Marts (Data Mart Centric), through Middleware (Virtual, Distributed, Federated), through a DW feeding Marts (Hub-and-Spoke Data Warehouse), and through a single DW (Enterprise Data Warehouse)]

    Data Mart Centric: Independent Data Marts
    > Pros: Easy to Build Organizationally; Limit Scope; Easy to Build Technically
    > Cons: Business Enterprise view unavailable; Redundant data costs; High ETL costs; High App costs; High DBA and operational costs

    Virtual, Distributed, Federated: Leave Data Where it Lies
    > Pros: No need for ETL; No need for separate platform
    > Cons: Only viable for low volume access; Meta data issues; Network bandwidth and join complexity issues; Workload typically placed on workstation

    Hub-and-Spoke Data Warehouse
    > Pros: Allows easier customization of user interfaces & reports
    > Cons: Business Enterprise view challenging; Redundant data costs; High DBA and operational costs; Data latency

    Enterprise Data Warehouse: Centralized Integrated Data With Direct Access
    > Pros: Single Enterprise Business View; Data reusability; Consistency; Low Cost of Ownership
    > Cons: Requires corporate leadership and vision

  • A Spectrum of Data Warehouse Architectures

    [Diagram: four architectures arranged as a spectrum: Data Mart Centric; Virtual, Distributed, Federated; Hub-and-Spoke Data Warehouse; Enterprise Data Warehouse]

    Teradata's Advocated Data Warehouse Approach for 20 years, Since 1984!

    The goal: Any question, on any data, at any time.

  • Most time consuming steps:

    > Full scan of big tables
    > Complex joins
    > Aggregation
    > Sorting

    [Chart: Frequency of steps, OLTP or DSS; horizontal axis from OLTP to DSS]

    Differentiating OLTP - DSS
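
    To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for any relational database (it is not Teradata). The tables, columns and values are hypothetical and chosen only for illustration. The first query is OLTP-style: it touches one row through its key. The second is DSS-style: it combines the four steps listed above (full scan, join, aggregation, sorting).

```python
import sqlite3

# Hypothetical toy schema; SQLite stands in for any relational database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sales (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, "North"), (2, "South"), (3, "North")])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [(1, 1, 10.0), (2, 2, 25.0), (3, 3, 5.0), (4, 1, 40.0)])

# OLTP-style: read a single row through its primary key.
oltp_row = conn.execute(
    "SELECT amount FROM sales WHERE id = ?", (2,)
).fetchone()

# DSS-style: scan the whole sales table, join it to customer,
# aggregate per region, and sort the result.
dss_rows = conn.execute("""
    SELECT c.region, SUM(s.amount) AS total
    FROM sales AS s
    JOIN customer AS c ON c.id = s.customer_id
    GROUP BY c.region
    ORDER BY total DESC
""").fetchall()

print(oltp_row)
print(dss_rows)
```

    On a four-row table both queries are instant; the difference only matters when the scanned table is large, which is the case the slide is concerned with.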

  • NCR Server

    Provide customers with growth opportunities and investment protection
    > Coexistence is enabled across five generations
      NCR 5400E & 5400H Servers
      NCR 4980 & 5380 Servers
      NCR 4950 & 5350 Servers
      NCR 4900 & 5300 Servers
      NCR 485X & 525X Servers

    [Diagram: NCR Server Generations 485X & 525X, 4900 & 5300, 4950 & 5350, 4980 & 5380, 5400E & 5400H; BYNET V2 / V3]

  • NCR 5400 Server SMP

    5400E
    > 1 - 4 nodes
    > BYNET V2
    > ESCON & FICON for 3 and 4 node configurations
    > Field Upgradeable to 5400H

    [Cabinet diagram: 1st Node, 2nd Node, 3rd Node, 4th Node, Internal BYNET switches, 3GSM, Ethernet Switches, Three UPS Modules, Server Management; up to 4 nodes within each cabinet]

  • NCR 5400 Server MPP

    [Cabinet diagram: Ethernet Switches, BYNET V3 Switches, Five UPS Modules, Server Management; up to 10 nodes within each cabinet]

    Continued rapid adoption of latest Intel Technology
    > Dual Intel Pentium Xeon EM64T 3.6 GHz processors with Hyper-Threading (32-bit and 64-bit capability)
    > 800 MHz front side bus

    Industry Standard Form Factor
    > Up to 10 nodes per cabinet
    > Integrated BYNET V3 (provides the capability to physically separate systems by 300 to 600 meters)
    > Integrated Server Management
    > N+1 UPS
    > Dual AC

    Multi-Generation Coexistence
    > Investment protection

  • Industry CPU Performance per Core

    [Chart: Relative CPU Performance per Core (vertical axis, 0 to 3000) by Year (horizontal axis, 2004 to 2007), one series each for Xeon, Itanium, Power and Sparc. Annotations: Symmetric Multi Threading (Hyper Threading); Dual Core; Multicore, Multithreaded]

    Labeled points on the chart:
    > Xeon: 3.0 GHz 1M 130 nm; 3.6 GHz 90 nm; 2M L2 3.6 GHz 90 nm (5400); 2M L2 >3.6 GHz 90 nm; Dual Core 65 nm; Next Gen Arch. Dual Core 65 nm; Multi Core 45 nm
    > Itanium: Itanium 2 1.6 GHz 130 nm; Itanium 2 9M 130 nm; Montecito 90 nm; Tukwila Common Platform 65 nm
    > Power: Power 4+ 1.45 GHz 130 nm; Power 5 ~1.9 GHz 130 nm; Power 5+ ~2.5 GHz 90 nm; Power 6 ~3 GHz 65 nm
    > Sparc: Ultrasparc 3 1.6 GHz 130 nm; Rock 90 nm

    Relative CPU Performance based on multi-threading and multi-core roadmap capabilities

    www.spec.org: benchmarks SPECint2000 and SPECint_rate2000

  • Gartner Product Ranking 2004 ASEM

    Vendor     Product      PRODUCT score
    FUJITSU    Primepower   43
    HP         HP 9000      45
    HP         Integrity    46
    HP         Proliant     29
    IBM        pSeries      45
    NCR        Teradata     54
    SUN        Sunfire      40

    The Product category (which was called Technology in previous ASEM updates) focuses on the performance and reliability/availability aspects of each platform. In this category Teradata received a very strong 93.5% of total possible points and leads the IBM pSeries with 74.35% by 44 points or 19%.

    Source: Gartner 2004 ASEM Report

  • NCR Enterprise Storage 6842

    NCR Enterprise Storage 6842 Features
    > Two array modules per cabinet
    > 56 drives, 73GB, 15K RPM
      greater than 8 Terabytes of spinning disk per cabinet
    > Dual Quad Fibre Channel Controllers per array for performance and availability
    > Typical configuration is 4 NCR 5400 Server nodes per 3 6842 arrays
      1.2 Terabytes of database space per node (RAID 1)
    > Supports RAID 1 and RAID 5
    > Support for MP-RAS and Microsoft Windows Server 2003 environments
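
    The capacity figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers on the slide and assumes that "56 drives" means 56 drives per array module, since that is the reading that reproduces "greater than 8 Terabytes" per cabinet. Variable names are illustrative.

```python
# Figures from the slide. Reading "56 drives" as per array module is an
# assumption; it is the reading that yields more than 8 TB per cabinet.
DRIVE_GB = 73
DRIVES_PER_ARRAY = 56
ARRAYS_PER_CABINET = 2

cabinet_raw_gb = DRIVE_GB * DRIVES_PER_ARRAY * ARRAYS_PER_CABINET

# Typical configuration from the slide: 4 nodes share 3 arrays, RAID 1.
NODES = 4
ARRAYS = 3
raw_gb = DRIVE_GB * DRIVES_PER_ARRAY * ARRAYS
mirrored_gb = raw_gb / 2           # RAID 1 stores every block twice
per_node_gb = mirrored_gb / NODES  # upper bound before any other overhead

print(f"raw disk per cabinet: {cabinet_raw_gb} GB")
print(f"mirrored capacity per node: {per_node_gb:.0f} GB")
```

    This gives 8176 GB of raw disk per cabinet and about 1533 GB of mirrored capacity per node. The slide quotes 1.2 Terabytes of database space per node and does not explain the difference; spare drives and formatting overhead are plausible contributors, but that is a guess rather than something the slide states.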

  • EMC Symmetrix DMX

    EMC Model                 DMX 1000 M2                              DMX 2000 M2
    Maximum Teradata disks    96                                       192
    Disks                     73GB 15K RPM                             73GB 15K RPM
    Teradata Use              MPP: supports 1 or 2 nodes per cabinet   MPP: supports 2, 3, or 4 nodes per cabinet
    Operating Environment     MP-RAS and Windows                       MP-RAS and Windows
    RAID Options              RAID-1 Only                              RAID-1 Only

    Enterprise Fit
    Storage Standardization
    Extended storage life through Redeployment

  • Assumption: Compute and Storage Balance

    A balanced configuration is one where the storage I/O subsystem for each compute node is configured with enough disk spindles, disk controllers, and connectivity so that the disk subsystem can satisfy the CPU demand from that node.
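
    The balance condition can be sketched as a simple check: a node's usable I/O supply is capped by its slowest component (spindles, controllers or connectivity), and the node is balanced when that supply covers what its CPUs can consume. Everything in the sketch is hypothetical, including the field names and the throughput numbers; it illustrates the definition and is not a Teradata sizing method.

```python
from dataclasses import dataclass


@dataclass
class NodeConfig:
    """Hypothetical per-node figures, all in MB/s except spindles."""
    cpu_demand_mb_s: float        # I/O rate the node's CPUs can consume
    spindles: int                 # disk drives serving this node
    mb_s_per_spindle: float       # sustained throughput of one drive
    controller_limit_mb_s: float  # total disk controller throughput
    link_limit_mb_s: float        # total connectivity throughput


def io_supply_mb_s(cfg: NodeConfig) -> float:
    """The slowest of spindles, controllers and links caps the supply."""
    return min(
        cfg.spindles * cfg.mb_s_per_spindle,
        cfg.controller_limit_mb_s,
        cfg.link_limit_mb_s,
    )


def satisfies_cpu_demand(cfg: NodeConfig) -> bool:
    """True when the disk subsystem can feed the node's CPUs."""
    return io_supply_mb_s(cfg) >= cfg.cpu_demand_mb_s


# Made-up numbers: 40 spindles at 12 MB/s give 480 MB/s, enough for 400 MB/s.
balanced = NodeConfig(400.0, 40, 12.0, 600.0, 800.0)
# Halving the spindles drops supply to 240 MB/s, so the CPUs would wait on disk.
starved = NodeConfig(400.0, 20, 12.0, 600.0, 800.0)

print(satisfies_cpu_demand(balanced), satisfies_cpu_demand(starved))
```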

    A supersaturated configuration can also satisfy the CPU demand from that node, although the extra I/O may be underutilized.
    > This is useful for