Teradata Platform Introduction



Transcript of Teradata Platform Introduction

  • Teradata Platform Introduction
    Hardware and Software Components in the Enterprise Data Warehouse

Derek Jones, March 2005

Teradata Confidential

Teradata in the Enterprise
- Teradata is a relational database management system
- Acts as the central enterprise-wide database
- Contains information extracted from operational systems
- Central placement minimizes data duplication and provides a single view of the business


Key Teradata Differentiators
- Parallelism throughout the platform
- Shared-nothing architecture
- Proprietary intelligent system interconnect


Teradata Scales Linearly
- Scaling achieved via shared-nothing architecture and unconditional parallelism
- Power is in linear scalability, where slope = 1
- Scales with data
- Scales with users
- Scales with work

[Diagram: more nodes support more work, more users, and more data]


The Teradata Difference: Multi-dimensional Scalability
[Dimension diagram: Data Volume (raw, user data), Schema Sophistication, Query Freedom, Query Complexity, Query Concurrency, Mixed Workload, Query Data Volume, Data Freshness]


The Teradata Difference: Multi-dimensional Scalability
[Dimension diagram repeated] Business needs change; the competition can be tuned to meet a static environment.


The Teradata Difference: Multi-dimensional Scalability
[Dimension diagram repeated] Business needs change, e.g. a desire to increase user/query concurrency. The competition can be tuned to meet a static environment, but only at the expense of another dimension: competition scales one dimension at the expense of others.


The Teradata Difference: Multi-dimensional Scalability
[Dimension diagram repeated] Competition scales one dimension at the expense of others: limited by technology! Teradata can scale simultaneously across multiple dimensions: driven by the business!


Key Teradata Differentiators
- Parallelism throughout the platform
- Shared-nothing architecture
- Proprietary intelligent system interconnect


Node Architecture (Shared Nothing)
- Each Teradata node is made up of hardware and software
- Each node has CPUs, system disk, memory, and adapters
- Each node runs a copy of the OS and the database software


Node Architecture (Shared Nothing)
- Each Teradata node is made up of hardware and software
- Each node runs a copy of the OS, the database software, and the virtual processors (above the line)
- Each node has CPUs, system disk, memory, and adapters (below the line)


NCR 5400 Server Value Prop
- Better price/performance: 20% performance improvement; 12% price/performance improvement
- Advanced cabinet design: up to 10 nodes per cabinet; up to a 40% reduction in floor space
- Investment protection: multi-generation (5) coexistence; 32-bit/64-bit transition platform


NCR 5400 Server Key Messages: #2 Advanced Cabinet Design
Revolutionary cabinet increases reliability and provides greater configuration flexibility:
- Up to 10 nodes per cabinet enable a 20%-40% smaller footprint than the 5380
- 30% increase in system storage reliability with new advanced cooling mechanisms
- Extends the supported distance between cabinets for large systems (65+ nodes) to 300-600 meters with the new BYNET V3
- Doubles the number of configurable nodes to 1,024


Key Teradata Differentiators
- Parallelism throughout the platform
- Shared-nothing architecture
- Proprietary intelligent system interconnect


Parallelism via BYNET Interconnect
- The BYNET high-speed interconnect facilitates system communication
- All nodes are connected via the BYNET
- Hardware network; software runs on each node
- Different communication paths facilitate system parallelism: 1-to-1, 1-to-many, 1-to-all


MPP System Configuration
- Nodes are grouped to increase data availability and system uptime
- Not shared storage, but shared access within the group
- Improves data availability
- Improves system uptime
- Allows for vproc migration


Teradata Clique
- A clique is a group of nodes that access the same disk arrays
- The vproc is the smallest unit of parallelism
- Each vproc has assigned storage within the clique
- Vprocs can migrate within the clique
- Improves system uptime, data availability, and ease of recovery


Teradata Clique and VPROC
[Diagram: two nodes attached to four disk arrays]
- The vproc is the smallest unit of parallelism, or work
- Data is distributed by hash to all vprocs
- Each vproc has assigned storage within the clique
- Vprocs can migrate within the clique
- Improves system uptime, data availability, and ease of recovery
- Data remains fully available, at degraded performance, until the failed node returns



    Teradata Clique with Hot Standby


Teradata Optimizer
- The Teradata Optimizer is the most robust in the industry
- The Optimizer is parallel-aware and understands the available system components
- Handles mixed workloads: multiple complex queries, joins per query, unlimited ad-hoc processing
- Output is the least expensive plan (in resources) to answer the request


Teradata Request Cycle
- Request flow: each request parcel contains at least one SQL statement
- Six main component steps: Syntaxer, Resolver, Security, Optimizer, Generator, gncApply
- AMP steps are instructions sent to the AMP vprocs to complete the request
- Following completion, each request generates a success/fail parcel with any necessary records


Data Protection (Object Locks)
- Locks protect data from simultaneous access
- Vary by type: Exclusive, Write, Read, and Access
- Vary by object locked: Database, Table, and Row Hash
- Locks are enforced by hierarchy
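The four lock types above can be thought of as a compatibility matrix. A minimal Python sketch, assuming the commonly documented Teradata rules (Access coexists with everything except Exclusive; Read with Read and Access; Write only with Access; Exclusive with nothing) rather than the actual lock manager:

```python
# Hypothetical compatibility matrix for Teradata-style lock severities,
# listed once per unordered pair (Access < Read < Write < Exclusive).
COMPATIBLE = {
    ("Access", "Access"): True,  ("Access", "Read"): True,
    ("Access", "Write"): True,   ("Access", "Exclusive"): False,
    ("Read", "Read"): True,      ("Read", "Write"): False,
    ("Read", "Exclusive"): False,
    ("Write", "Write"): False,   ("Write", "Exclusive"): False,
    ("Exclusive", "Exclusive"): False,
}

def compatible(held: str, requested: str) -> bool:
    """True if a requested lock can coexist with one already held."""
    key = (held, requested)
    if key not in COMPATIBLE:          # matrix is symmetric; flip the pair
        key = (requested, held)
    return COMPATIBLE[key]

# An Access ("dirty read") lock coexists with a concurrent Write;
# a Read lock blocks a Write on the same object.
```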


Data Protection (RAID-1)
- RAID data protection: RAID-1 (disk mirroring)
- The disk pair increases read performance and data availability
- In a failure scenario, the mirrored drive is rebuilt by the array controller


Data Protection (Fallback)
- Fallback table data: a copy of each table row maintained by the database on a second AMP vproc
- Fallback copies are grouped logically in clusters, so data remains fully available when a physical clique is off-line
- Fallback + RAID increase data availability


Data Storage and Access
- Data is stored by hash
- The Primary Index is chosen for data distribution; it is not the same as the primary key
- The Primary Index value is hashed; the hash value determines a bucket assignment
- The Hash Map assigns buckets to AMP vprocs; each AMP vproc resides on a specific node
- The AMP vproc writes the row to disk
- Data and algorithm exceptions require a Uniqueness value to guarantee a unique Row ID


Data Access by Primary Index
- Data is accessed by row hash value
- Three pieces of information locate a row: Table ID, Row Hash of the PI value (the output of the hash algorithm on the PI value), and the PI value itself
- The operation involves only one AMP vproc
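The hash-bucket-AMP flow above can be sketched as a toy model. The real Teradata hash algorithm, bucket count, and hash maps are proprietary; MD5, the 65,536-bucket space, and the modulo mapping here are stand-in assumptions for illustration only:

```python
import hashlib

NUM_BUCKETS = 65536   # assumption: a 16-bit hash-bucket space
NUM_AMPS = 4

def row_hash(pi_value):
    """Stand-in for the hash algorithm applied to a Primary Index value."""
    digest = hashlib.md5(str(pi_value).encode()).digest()
    return int.from_bytes(digest[:4], "big")

def amp_for(pi_value):
    """Hash value -> bucket -> AMP via a (here, trivial) hash map."""
    bucket = row_hash(pi_value) % NUM_BUCKETS
    return bucket % NUM_AMPS   # a real hash map balances buckets explicitly

# Each row lands on exactly one AMP, and a PI lookup touches only that AMP:
amps = {i: {} for i in range(NUM_AMPS)}
for cust_id, name in [(101, "Jones"), (102, "Peters"), (103, "Smith")]:
    amps[amp_for(cust_id)][cust_id] = name       # one-AMP write

def pi_lookup(cust_id):
    return amps[amp_for(cust_id)].get(cust_id)   # one-AMP read
```

Because the same PI value always hashes to the same bucket, retrieval needs no directory lookup beyond the hash map itself.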


Data Access by Unique Secondary Index (USI)
- The index is created on the table; SQL uses the USI by value
- The PE vproc managing the session uses the same information as primary index access (Table ID, Row Hash, Index Value)
- The process involves two AMP vproc operations
- After the USI subtable lookup, the process is similar to primary index access

[Diagram: Unique Secondary Index (USI) access. Customer table (Table ID = 100) with NUPI rows: RowID 639,177 Jones 777-6666; RowID 778,395 Peters 555-7777; RowID 778,756 Smith 555-7777; RowID 915,951 Marsh 888-2222. The PE runs the hashing algorithm on the USI value (56) to locate the index subtable row.]
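The two-AMP USI retrieval can be sketched as follows. This is illustrative only: the stand-in hash, the dict-based subtable layout, and the hypothetical unique index on customer name are all simplifying assumptions, not Teradata's actual structures:

```python
NUM_AMPS = 4

def amp_for(value):
    """Stand-in hash: maps any value to one AMP."""
    return sum(str(value).encode()) % NUM_AMPS

# Base rows distributed by NUPI (cust_id); names happen to be unique here,
# so a hypothetical USI on name is possible.
rows = [(1, "Jones", "777-6666"), (2, "Peters", "555-7777"),
        (3, "Smith", "555-7777"), (4, "Marsh", "888-2222")]
base = {i: {} for i in range(NUM_AMPS)}
usi = {i: {} for i in range(NUM_AMPS)}
for cust_id, name, phone in rows:
    base[amp_for(cust_id)][cust_id] = (name, phone)          # base row
    usi[amp_for(name)][name] = (amp_for(cust_id), cust_id)   # index row

def usi_lookup(name):
    # Step 1: hash the USI value; one-AMP access to the index subtable.
    row_id = usi[amp_for(name)].get(name)
    if row_id is None:
        return None
    # Step 2: one-AMP access to the base row identified by the subtable.
    amp, cust_id = row_id
    return base[amp][cust_id]
```

The index subtable stores the base row's identifier, so step 2 proceeds exactly like a primary index access.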


Data Access via Non-unique Secondary Index (NUSI)
[Diagram: PE hashes the NUSI value "Adams" against the Customer table (Table ID = 100)]
- The index is created on the table; SQL uses the NUSI by value
- The PE vproc managing the session uses the same information as primary index access (Table ID, Row Hash, Index Value)
- The process involves all-AMP vproc operations
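In contrast to a USI, a NUSI value cannot be hashed to a single AMP: each AMP keeps a NUSI subtable covering only its own rows, so the lookup is an all-AMP operation. A sketch under the same toy assumptions as before (stand-in hash, dict-based subtables):

```python
NUM_AMPS = 4

def amp_for(value):
    """Stand-in hash: maps any value to one AMP."""
    return sum(str(value).encode()) % NUM_AMPS

rows = [(1, "Adams", "777-6666"), (2, "Peters", "555-7777"),
        (3, "Adams", "555-7777"), (4, "Marsh", "888-2222")]
amps = [{"rows": {}, "nusi": {}} for _ in range(NUM_AMPS)]
for cust_id, last_name, phone in rows:
    amp = amps[amp_for(cust_id)]            # row placed by NUPI hash...
    amp["rows"][cust_id] = (last_name, phone)
    amp["nusi"].setdefault(last_name, []).append(cust_id)  # ...local index

def nusi_lookup(last_name):
    results = []
    for amp in amps:                        # every AMP probes its subtable
        for cust_id in amp["nusi"].get(last_name, []):
            results.append(amp["rows"][cust_id])
    return results
```

Because matching rows may live on any AMP, every AMP participates even when only one holds a match.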


Teradata Structures
Database structures: Users, Databases, Tables, Views, Macros, Triggers, Stored Procedures, User Defined Functions


Teradata is an Open System
Virtually any application or middleware framework can be integrated with Teradata.
[Diagram: Teradata connected via ODBC, OLE-DB, .NET, JDBC, CORBA (IIOP), EJB, ASP/JSP web tiers, Teradata utilities, adapters, message queues, and a publish & subscribe message bus.]


64-bit Teradata Solution
- Teradata on SuSE Linux, 2H 2005: Intel 64-bit database server tier
- The Teradata Database on Intel 32-bit and 64-bit will support both 32-bit and 64-bit applications and clients concurrently
[Diagram: 32-bit and 64-bit application server tiers and client tiers (Intel, IBM/Power PC, SUN/SPARC, HP/PA-RISC; DELL, HP, IBM) running Teradata Tools & Utilities, Teradata System Management, Teradata applications, and 3rd-party partner software against the Linux Intel 64-bit database server tier.]


Teradata's Real-Time Enterprise Reference Architecture
[Architecture diagram: transactional services and repositories (OLTP 1-4 with messaging and data-access middleware); data acquisition & integration (batch and streaming); analytic & decision-making services and repositories (EDW A, EDW B, RDBMS-based event processing, query directors); business process automation (event detection, business rules, event notification); service brokers; and enterprise users (consumers, suppliers, internal, partners) reaching the system via browsers/portal, ASP/JSP, client/server, and EDI over Internet/intranet and WAN/VAN, alongside legacy environments.]


Transactional Services
- Data access middleware: occurs via standards such as ODBC, OLE-DB, and JDBC, as well as proprietary techniques
- Transactional application services: applications that perform bookkeeping or transactional services for the enterprise
- OLTP data repositories: data that reflects the current state of various business processes; limited history; tuned for transaction workloads
- Application scope: applications have narrow scope, tuned for specific bookkeeping or transactional services


Transactional User Base
- Client/server styles: 2-tier and 3-tier RPC-style interfaces
- User-level integration: occurs via standard EAI services such as JAVA, WebSphere, .NET, Tibco, and SeeBeyond
- Transactional user base: consumers, suppliers, internal users, and trading partners
- Service brokers: J2EE, CORBA, DCOM, Web Services


Data Warehouse Services
- Data access middleware: occurs via standards such as ODBC, OLE-DB, and JDBC, as well as proprietary techniques
- Application services: applications that provide predictive analysis and assisted decision making
- Enterprise data warehouse: consolidated enterprise data; crosses multiple business domains; integrated data model
- Application scope: strategic and tactical decision-making applications, through BI tools or custom applications


Decision Support User Base
- Service broker styles: J2EE, CORBA, DCOM, Web Services
- Client/server styles: 2-tier and 3-tier RPC-style interfaces
- User-level integration: occurs via standard EAI services such as Web Services, JAVA, .NET, Tibco, and SeeBeyond
- DW user base: consumers, suppliers, internal users, and trading partners


Data Acquisition Services
- Data transformation services: data cleansing; data transformation (normalization); streaming data for frequent updates; batch data moves for bulk operations; partner ETL tools are typically used to perform these services
- Data extraction: data is extracted from OLTP systems; partner ETL tools are frequently used here
- Data load: data is loaded into the EDW system using the Teradata load tools FastLoad, MultiLoad, and TPump
- Data acquisition options: traditional load utilities (bulk or continuous loads); loads through in-flight message passing; replication (table-level replication from source to target)


Event-Driven Business Processes
- RDBMS-based event processing: real-time events are detected through a combination of Triggers, Stored Procedures, and UDFs; the event engine performs the query
- Messages are passed via P2P, Web Services, or the enterprise message bus
- Business process automation: event detection, applied business rules, event notification


Application Integration
Decision-making applications interact with bookkeeping applications via standard enterprise services, such as Web Services, JAVA, and .NET, or through the use of traditional client/server technology.


Dual Active Solution
- The secondary active system does not need to be as large as the primary system
- Replication services: changed-data capture in V2R6; update propagation via GoldenGate
- Dual data load: the input data stream is split into two independent load streams; input data is filtered so that only critical data is loaded on the secondary active system
- Teradata Query Director: query routing control based on business rules; business continuity and workload sharing

    The Teradata Database is a relational database management system (RDBMS) that drives a company's data warehouse.

A data warehouse is a central, enterprise-wide database that contains information extracted from the operational systems. A data warehouse has a centrally located logical architecture which minimizes data synchronization and provides a single view of the business. Data warehouses have become more common in corporations where enterprise-wide detail data may be used in on-line analytical processing to make strategic and tactical business decisions. Warehouses often carry many years' worth of detail data so that historical trends may be analyzed using the full power of the data.

    "Linear scalability" means that as you add components to the system, the performance increase is linear. Adding components allows the system to accommodate increased workload without decreased throughput. Linear scalability enables the system to grow to support more users/data/queries/complexity of queries without experiencing performance degredation. As the configuration grows, performance increase is linear, slope of 1.

A database is a collection of permanently stored data that is: logically related (the data was created for a specific purpose); shared (many users may access the data); protected (access to the data is controlled); and managed (the data integrity and value are maintained). The Teradata Database is a relational database. Relational databases are based on the relational model, which is founded on mathematical set theory. The relational model uses and extends many principles of set theory to provide a disciplined approach to data management. A relational database is designed to: represent a business and its business practices; be extremely flexible in the way that it can be selected and used; be easy to understand; model the business, not the applications; and allow businesses to quickly respond to changing conditions. Relational databases present data as a set of tables. A table is a two-dimensional representation of data that consists of rows and columns. According to the relational model, a valid table does not have to be populated with data rows; it just needs to be defined with at least one column. A relational database is a set of logically related tables. Tables are logically related to each other by a common field, so information such as customer telephone numbers and addresses can exist in one table, yet be accessible for multiple purposes. The example below shows customer, order, and billing statement data, related by a common field. The common field of Customer ID lets you look up information such as a customer name for a particular statement number, even though the data exists in two different tables.

The Teradata Database was the first commercial database system to scale to and support a trillion bytes of data. The origin of the name Teradata is "tera-," which is derived from Greek and means "trillion."

    The Teradata Database acts as a single data store, with multiple client applications making inquiries against it concurrently. Instead of replicating a database for different purposes, with the Teradata Database you store the data once and use it for all clients. The Teradata Database provides the same connectivity for an entry-level system as it does for a massive enterprise data warehouse. Data Volume (Raw, User Data) - Raw data stored in the warehouse. This is the user data stored in the warehouse. It does not include generated data that also takes space within the warehouse such as indexes, summarizations, aggregations, duplicated data, and system overhead.

Query Concurrency - The volume of work that can be done at the same time; most commonly, the number of queries that the database can process at the same time. It can also include load and in-database transformation work and stored procedure processing activity. Logged-on users not currently executing a query do not add to the concurrency workload.

    Query Complexity - The degree to which queries are complex in areas that make a query difficult or resource intensive for a database system. These areas include the number of tables involved in joins, complex "where" constraints in the SQL, aggregations and statistical functions, and the use of views in addition to base tables. Business intelligence query tools often generate very complex queries.

Schema Sophistication - The ability to choose the schema that meets my business requirements versus limiting the complexity of the schema due to technology performance limitations of the database. It's the ability to deploy a denormalized star schema, a sophisticated and complex normalized 3NF schema, a combination of the two, or anything in between to meet the requirements of the business.

Query Data Volume - Refers to how much data must be touched to satisfy a query. Teradata features that can be cited as reducing the amount of data touched include our unsurpassed compression capabilities, efficient row storage, strong indexing capabilities, and the lack of a storage requirement for the primary index.

Query Freedom - The ability for users to ask any question of the data at the time best for the business. This is an indication of how free the users are to ask exploratory, broad, or complex questions as well as expected and tuned queries, and to ask new types of questions associated with new applications.

    Data Freshness - The ability to load data into the warehouse and to update data in the warehouse at the speed the business operates. This is an indication of whether the data in the warehouse can be kept current and in sync with business processes and operations to the degree necessary to respond to events and business activities as well as to provide meaningful analyses.

Mixed Workload - The ability of the database to handle the broad mix of tasks for which a data warehouse is used today without impacting the effectiveness in any area. For example, data warehouses must answer complex strategic questions as well as brief tactical questions or customer inquiries. At the same time, data must be loaded and updated. Can the database handle the various workloads concurrently while meeting the very different service level agreement attributes (e.g., response time, performance consistency) of the various types of work? Does the database require separation of work (e.g., batch windows)?

A Teradata Database node requires three distinct pieces of software. The Teradata Database can run on the following operating systems: UNIX MP-RAS and Microsoft Windows 2000. The Parallel Database Extensions (PDE) software layer was added to the operating system by NCR to support the parallel software environment. A Trusted Parallel Application (TPA) uses PDE to implement virtual processors (vprocs). The Teradata Database is classified as a TPA. The four components of the Teradata Database TPA are: AMP (top right), PE (bottom right), Channel Driver (top left), and Teradata Gateway (bottom left). A Parsing Engine (PE) is a vproc that manages the dialogue between a client application and the Teradata Database, once a valid session has been established. Each PE can support a maximum of 120 sessions. The AMP is a vproc that controls its portion of the data on the system. AMPs do the physical work associated with generating an answer set (output), including sorting, aggregating, formatting, and converting. The AMPs perform all database management functions on the required rows in the system. The AMPs work in parallel, each AMP managing the data rows stored on its single vdisk. AMPs are involved in data distribution and data access in different ways. Channel Driver software is the means of communication between an application and the PEs assigned to channel-attached clients. There is one Channel Driver per node. Teradata Gateway software is the means of communication between an application and the PEs assigned to network-attached clients. There is one Teradata Gateway per node.

Let's do a quick review of some important dates. The release we are announcing internally today is the NCR 5400 Server with MP-RAS and existing storage. The external GCA date and supporting press release is March 2005. Also available in this timeframe are new StorageTek tape libraries and the Teradata AWS with Windows Server 2003.

On the heels of the February release, in April a new NCR Enterprise Storage cabinet, the 6842, will be announced along with FICON channel connectivity and product updates from our three BAR software partners.

Later in the 2nd quarter, support for the Microsoft Windows 2000 and Microsoft Windows Server 2003 operating systems on the 5400 will be released, along with a new box for the AWS and a new box for the SMP. More information will be available at release time.

Let's start by looking at the three key messages associated with the 5400 release.

In addition, the new design provides greater flexibility in data center configuration options. The new design increases the number of nodes in a cabinet up to 10. Previously, we supported up to 4 nodes per cabinet; now, with up to 10 nodes per cabinet, we reduce the footprint and floor space required for larger Teradata systems. And with the inclusion of the new BYNET release, BYNET V3, we've extended the physical distance that customers can put between the cabinets for very large systems. Customers can now split a Teradata system into 2 distinct physical locations on their data center floor with as much as 300 to 600 meters between them. Additionally, BYNET V3 doubles our system scalability, enabling up to 1,024 nodes in a single system. While we don't expect many customers to approach this limit in the near term, it does support our unlimited scalability story and provide for future growth in the long term.

The BYNET (pronounced "bye-net") is a high-speed interconnect (network) that enables multiple nodes in the system to communicate. The BYNET handles the internal communication of the Teradata Database. All communication between PEs and AMPs is done via the BYNET. When the PE dispatches the steps for the AMPs to perform, they are dispatched onto the BYNET. The messages are routed to the appropriate AMP(s), where result sets and status information are generated. This response information is also routed back to the requesting PE via the BYNET. Depending on the nature of the dispatch request, the communication between nodes may be to all nodes (broadcast message) or to one specific node (point-to-point message) in the system.

    The BYNET has several unique features: Scalable: As you add more nodes to the system, the overall network bandwidth scales linearly. This linear scalability means you can increase system size without performance penalty -- and sometimes even increase performance. High performance: An MPP system typically has two BYNET networks (BYNET 0 and BYNET 1). Because both networks in a system are active, the system benefits from having full use of the aggregate bandwidth of both the networks. Fault tolerant: Each network has multiple connection paths. If the BYNET detects an unusable path in either network, it will automatically reconfigure that network so all messages avoid the unusable path. Additionally, in the rare case that BYNET 0 cannot be reconfigured, hardware on BYNET 0 is disabled and messages are re-routed to BYNET 1. Load balanced: Traffic is automatically and dynamically distributed between both BYNETs.

    The BYNET hardware and software handle the communication between the vprocs and the nodes. Hardware: The nodes of an MPP system are connected with the BYNET hardware, consisting of BYNET boards and cables. Software: The BYNET driver (software) is installed on every node. This BYNET driver is an interface between the PDE software and the BYNET hardware. SMP systems do not contain BYNET hardware. The PDE and BYNET software emulate BYNET activity in a single-node environment.

    Point-to-Point Messages With point-to-point messaging between vprocs, a vproc can send a message to another vproc on: The same node (using PDE and BYNET software) A different node using two steps: Send a point-to-point message from the sending node to the node containing the recipient vproc. This is a communication between nodes using the BYNET hardware. Within the recipient node, the message is sent to the recipient vproc. This is a point-to-point communication between vprocs using the PDE and BYNET software.

    Multicast Messages A vproc can send a message to multiple vprocs using two steps: Send a broadcast message from the sending node to all nodes. This is a communication between nodes using the BYNET hardware. Within the recipient nodes, the PDE and BYNET software determine which, if any, of its vprocs should receive the message and delivers the message accordingly. This is a multicast communication between vprocs within the node, using the PDE and BYNET software. Broadcast Messages A vproc can send a message to all the vprocs in the system using two steps: Send a broadcast message from the sending node to all nodes. This is a communication between nodes using the BYNET hardware. Within each recipient node, the message is sent to all vprocs. This is a broadcast communication between vprocs using the PDE and BYNET software.
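The three delivery patterns above can be sketched as a toy model. This is pure illustration: real BYNET routing happens in hardware plus the PDE/BYNET driver, and the node/vproc layout here is invented:

```python
# Each node hosts some vprocs; a list stands in for each vproc's inbox.
nodes = [
    {"v0": [], "v1": []},   # node 0
    {"v2": [], "v3": []},   # node 1
]

def point_to_point(dest, msg):
    for node in nodes:                 # hardware routes to the one node...
        if dest in node:
            node[dest].append(msg)     # ...software delivers to the vproc
            return

def multicast(dests, msg):
    for node in nodes:                 # broadcast to every node...
        for vproc, inbox in node.items():
            if vproc in dests:         # ...software filters the recipients
                inbox.append(msg)

def broadcast(msg):
    for node in nodes:                 # broadcast to every node...
        for inbox in node.values():
            inbox.append(msg)          # ...software delivers to every vproc

point_to_point("v2", "p2p")
multicast({"v1", "v3"}, "mc")
broadcast("bc")
```

Note that multicast and broadcast use the same node-level delivery; they differ only in which vprocs the receiving software hands the message to.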

    The diagram below shows three cliques. The nodes in each clique are cabled to the same disk arrays. The overall system is connected by the BYNET. If one node goes down in a clique the vprocs will migrate to the other nodes in the clique, so data remains available. However, system performance decreases due to the loss of a node. System performance degradation is proportional to clique size.

    Vprocs are distributed across all nodes in the system. Multiple cliques in the system should have the same number of nodes.

    A clique (pronounced, "kleek") is a group of nodes that share access to the same disk arrays. Each multi-node system has at least one clique. The cabling determines which nodes are in which cliques -- the nodes of a clique are connected to the disk array controllers of the same disk arrays.

    Cliques Provide Resiliency

    In the rare event of a node failure, cliques provide continued data access through vproc migration. When a node fails:
    - The Teradata Database restarts across all remaining nodes in the system.
    - The vprocs (AMPs) from the failed node migrate to the operational nodes in its clique.
    - The disks managed by those AMPs remain available, and processing continues while the failed node is being repaired.
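The migration sequence above can be sketched as follows (an illustrative model with invented node and AMP names, not actual Teradata behavior): the failed node's AMPs are spread over the surviving nodes of the clique, so every AMP, and therefore all data, stays hosted.

```python
# One clique: which AMPs run on which nodes (invented layout).
clique = {"node1": ["amp0", "amp1"],
          "node2": ["amp2", "amp3"],
          "node3": ["amp4", "amp5"]}

def fail_node(clique, failed):
    """Migrate the failed node's AMPs to survivors in the same clique."""
    orphans = clique.pop(failed)            # AMPs losing their home node
    survivors = sorted(clique)              # remaining nodes in the clique
    for i, amp in enumerate(orphans):       # spread AMPs round-robin
        clique[survivors[i % len(survivors)]].append(amp)
    return clique

fail_node(clique, "node2")
# amp2/amp3 now run on node1/node3: data stays available, but the same
# work now runs on fewer nodes, so performance degrades.
```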



    Finally, for any system of 8 nodes or more, implementing these two solutions together gives you all the performance-continuity benefits of the Hot Standby Node. By implementing Large Cliques with Hot Standby Nodes, your system will have fewer cliques overall, requiring fewer Hot Standby Nodes.

    The Optimizer is parallel-aware, meaning that it has knowledge of the system components (how many nodes, vprocs, etc.). It determines the least expensive plan (time-wise) to process queries quickly and in parallel. A Request parcel contains one or more whole SQL statements. Normally, a Request parcel represents a single transaction; some transactions may require multiple Request parcels.

    SQL Parser Overview (performed by the PE)

    The flowchart provides an overview of the SQL parser. It is composed of six main sections: Syntaxer, Resolver, Security, Optimizer, Generator, and gncApply.

    When the parser sees a Request parcel, it checks whether it has already parsed and cached the execution steps for it. If the answer is NO, the Request must pass through all the sections of the parser as follows:
    1. The Syntaxer checks the Request for valid syntax.
    2. The Resolver breaks down Views and Macros into their underlying table references through use of DD information.
    3. Security determines whether the requesting UserID has the necessary permissions.
    4. The Optimizer chooses the execution plan.
    5. The Generator creates the steps for execution.
    6. gncApply binds the data values into the steps. (This phase of the parser is known as OptApply.)

    Note: If the steps for the Request parcel are in cache, the Request passes directly to gncApply (after a check by Security). This is illustrated on the flowchart by the YES path from the CACHED? decision box.
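The cached-versus-parse decision can be sketched like this (hypothetical code; the phase functions are trivial stand-ins for the real parser internals): a cached request skips Syntaxer/Resolver/Optimizer/Generator and goes straight to gncApply, but still passes the Security check.

```python
plan_cache = {}

def parse_request(sql, user):
    check_security(sql, user)            # always enforced, cached or not
    if sql in plan_cache:                # CACHED? -> YES path
        steps = plan_cache[sql]
    else:                                # NO path: full parse
        tree = syntaxer(sql)             # validate syntax
        tree = resolver(tree)            # expand views/macros via DD info
        plan = optimizer(tree)           # choose the cheapest plan
        steps = generator(plan)          # produce execution steps
        plan_cache[sql] = steps          # cache for the next request
    return gnc_apply(steps)              # bind data values into the steps

# Trivial stand-ins so the sketch runs (the real phases are internal):
def check_security(sql, user): pass
def syntaxer(sql): return sql
def resolver(tree): return tree
def optimizer(tree): return tree
def generator(plan): return ("steps", plan)
def gnc_apply(steps): return steps
```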

    A REQUEST parcel is followed by zero or one DATA parcel plus one RESPOND parcel.
    The RESPOND parcel identifies the response buffer size.
    A RESPOND parcel may be sent by itself as a continuation request for additional data.
    A SUCCESS parcel may be followed by DATA parcels.
    Every REQUEST parcel generates a SUCCESS/FAIL parcel.

    Temporary locks can be placed on data to prevent multiple users from simultaneously changing it:
    - Exclusive Lock
    - Write Lock
    - Read Lock
    - Access Lock

    Locks may be applied at three levels:
    - Database Locks: Apply to all tables and views in the database.
    - Table Locks: Apply to all rows in the table.
    - Row Hash Locks: Apply to a group of one or more rows in a table.

    Exclusive

    Exclusive locks are applied only to databases or tables, never to rows. They are the most restrictive type of lock: with an exclusive lock, no other user can access the database or table. Exclusive locks are used rarely, most often when structural changes are being made to the database. An exclusive lock on a database or table prevents other users from obtaining the following types of locks on the locked data: exclusive, write, read, and access locks.

    Write

    Write locks enable users to modify data while maintaining data consistency. While the data has a write lock on it, other users can obtain an access lock only. During this time, all other lock requests are held in a queue until the write lock is released. Write locks prevent other users from obtaining exclusive, write, or read locks on the locked data.

    Read

    Read locks are used to ensure consistency during read operations. Several users may hold concurrent read locks on the same data, during which time no data modification is permitted. Read locks prevent other users from obtaining exclusive or write locks on the locked data.

    Access

    Access locks can be specified by users unconcerned about data consistency. The use of an access lock allows reading data while modifications are in process. Access locks are designed for decision support on large tables that are updated only by small, single-row changes. Access locks are sometimes called "stale read" locks, because you may get "stale" data that has not been updated. Access locks prevent other users from obtaining only exclusive locks on the locked data.
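The lock rules above amount to a compatibility matrix, sketched here (illustrative code, not Teradata's lock manager): an entry lists the new lock types that can still be granted while a lock of the given type is held.

```python
EXCL, WRITE, READ, ACCESS = "exclusive", "write", "read", "access"

# held lock -> set of new locks that remain compatible with it
COMPATIBLE = {
    EXCL:   set(),                     # exclusive blocks everything
    WRITE:  {ACCESS},                  # only access locks coexist with write
    READ:   {READ, ACCESS},            # concurrent reads are fine
    ACCESS: {WRITE, READ, ACCESS},     # access blocks only exclusive
}

def can_grant(held, requested):
    """True if `requested` can be granted while `held` is in place."""
    return requested in COMPATIBLE[held]
```

Requests that cannot be granted immediately queue until the blocking lock is released, as described above for write locks.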

    Several types of data protection are available with the Teradata Database. Redundant Array of Inexpensive Disks (RAID) is a storage technology that provides data protection at the disk drive level. It uses groups of disk drives called "arrays" to ensure that data is available in the event of a failed disk drive or other component. The Teradata Database also has journals that can be used for specific types of data or process recovery:
    - Permanent Journals
    - Recovery Journals

    Fallback is a Teradata Database feature that protects data against AMP failure. It is accomplished by grouping AMPs into logical clusters. When a table is defined as Fallback-protected, the system stores a second copy of each row in the table on the disk space managed by an alternate "Fallback AMP" in the AMP cluster. Below is a cluster of four AMPs. Each AMP has a combination of Primary and Fallback data rows:
    - Primary Data Row: A record in a database table that is used in normal system operation.
    - Fallback Data Row: The online backup copy of a Primary data row, used in the case of an AMP failure.
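A minimal sketch of fallback placement within a cluster of four AMPs (the alternate-AMP rule here is invented for illustration; Teradata's actual fallback AMP assignment is internal): the key property is that the fallback copy always lands on a different AMP of the same cluster.

```python
cluster = [0, 1, 2, 3]               # AMP ids in one fallback cluster

def place_row(row_hash):
    """Return (primary_amp, fallback_amp) for a row (illustrative rule)."""
    primary = cluster[row_hash % len(cluster)]
    # Hypothetical alternate-AMP rule: next AMP in the cluster, wrapping.
    fallback = cluster[(row_hash + 1) % len(cluster)]
    return primary, fallback

p, f = place_row(41)
# The two copies never share an AMP, so losing one AMP leaves the row
# readable from its fallback copy elsewhere in the cluster.
```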

    Permanent Journals are an optional feature of the Teradata Database to provide an additional level of data protection. You specify the use of Permanent Journals at the table level. It provides full-table recovery to a specific point in time. It also can reduce the need for costly and time-consuming full-table backups.

    The Teradata Database uses Recovery Journals to automatically maintain data integrity in the case of:
    - An interrupted transaction (Transient Journal)
    - An AMP failure (Down-AMP Recovery Journal)
    Recovery Journals are created, maintained, and purged by the system automatically, so no DBA intervention is required. Recovery Journals are tables stored on disk arrays like user data is, so they take up additional disk space on the system.

    No two AMP VPROCs in a cluster should reside in the same physical clique (node group) to prevent a single point of hardware failure that would disrupt data availability.
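This rule can be checked mechanically; a sketch with invented layout data (hypothetical code, not a Teradata utility): a cluster is safe only if its AMPs all live in distinct cliques, so no single clique failure can take out both copies of a row.

```python
def cluster_is_safe(cluster_amps, amp_to_clique):
    """True if no two AMPs of the fallback cluster share a clique."""
    cliques = [amp_to_clique[a] for a in cluster_amps]
    return len(cliques) == len(set(cliques))   # all cliques distinct

# Invented layout: amp0 and amp3 happen to live in the same clique c1.
amp_to_clique = {"amp0": "c1", "amp1": "c2", "amp2": "c3", "amp3": "c1"}
```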


    A hashing algorithm is a standard data processing technique that takes in a data value, like a last name or an order number, and systematically mixes it up so that incoming values are converted to a number in a range from zero to a specified maximum value. A successful hashing scheme scatters the input evenly over the range of possible output values. It is predictable in that Smith will always hash to the same value and Jones will always hash to another (hopefully different) value. With a good hashing algorithm, any patterns in the input data should disappear in the output data.

    Teradata's algorithm works predictably well over any data, typically loading AMPs with variations in the range of 0.1% to 0.5% between AMPs. For extremely large systems, the variation can be as low as 0.001% between AMPs. A Row Hash is the 32-bit result of applying the hashing algorithm to an index value. The DSW (Destination Selection Word), or Hash Bucket, is represented by the high-order 16 bits of the Row Hash.
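A rough demonstration of this behavior, using Python's hashlib MD5 as a stand-in for Teradata's proprietary algorithm: even highly regular input values (sequential order numbers) spread almost evenly across a set of AMPs, and skew can be measured as the spread between the most- and least-loaded AMP.

```python
import hashlib

def row_hash(value):
    # 32-bit hash of an index value (MD5 stand-in, not Teradata's algorithm)
    return int.from_bytes(hashlib.md5(str(value).encode()).digest()[:4], "big")

n_amps = 10
load = [0] * n_amps
for order_number in range(100_000):        # very regular, sequential input
    load[row_hash(order_number) % n_amps] += 1

# Relative spread between the heaviest and lightest AMP: a good hash
# makes this a small fraction of the average load per AMP.
spread = (max(load) - min(load)) / (sum(load) / n_amps)
```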

    Teradata also uses hashing quite differently than other data storage systems. Other hashed data storage systems equate a bucket with a physical location on disk. In Teradata, a bucket is simply an entry in a hash map. Each hash map entry points to a single AMP. Therefore, changing the number of AMPs does not require any adjustment to the hashing algorithm. Teradata simply adjusts the hash maps and redistributes any affected rows.

    The hash maps must always be available to the Message Passing Layer. In Teradata Version 2, a hash map has 65,536 entries. Once the hash bucket has determined the destination AMP, the full 32-bit row hash plus the Table ID is used to assign the row to a cylinder and a data block on the AMP's disk storage. In Version 2, the algorithm can produce over 4,000,000,000 row hash values.
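A sketch of the bucket lookup described above (hypothetical code): the high-order 16 bits of the 32-bit row hash index one of the 65,536 hash-map entries, and the entry names the owning AMP. As noted earlier, adding AMPs only rewrites hash-map entries (and moves the affected rows); the hashing algorithm itself never changes.

```python
N_AMPS = 8

# Hash map: 65,536 bucket entries, here assigned round-robin to AMPs
# (an illustrative assignment; the real maps are maintained by the system).
hash_map = [bucket % N_AMPS for bucket in range(65_536)]

def amp_for(row_hash_32):
    """Map a 32-bit row hash to its owning AMP via the hash map."""
    bucket = row_hash_32 >> 16        # DSW: high-order 16 bits
    return hash_map[bucket]
```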

    Hash values will be the same for non-unique primary indexes and for hash synonyms (cases where different inputs produce the same output), so a Uniqueness Value is added to the Row Hash. This combined value becomes the Row ID and is truly unique for every row in the database.

    Locating a row on an AMP vproc requires three inputs:
    - Table ID
    - Row Hash of the PI
    - PI value

    The Table ID and Row Hash are used with the Hash Map to identify the AMP vproc that has the row (#3 in this case).
    The AMP vproc uses the Table ID and Row Hash to look up the cylinder number in the Master Index (a memory-resident structure on each AMP vproc).
    The Cylinder Index (also memory-resident) is then referenced to find the specific data block on disk that contains the row.

    The Row ID

    In order to differentiate each row in a table, every row is assigned a unique Row ID. The Row ID is a combination of the row hash value plus a uniqueness value. The AMP appends the uniqueness value to the row hash when the row is inserted. The uniqueness value is used to differentiate between PI values that generate identical row hashes.

    The first row inserted with a particular row hash value is assigned a uniqueness value of 1. Each new row with the same row hash is assigned an integer value one greater than the current largest uniqueness value for that row hash. If a row is deleted or the primary index is modified, the uniqueness value can be reused.

    Only the Row Hash portion is used in Primary Index operations. The entire Row ID is used for Secondary Index support.
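Row ID assignment can be sketched as a per-row-hash counter (illustrative code; this simple counter does not model the uniqueness-value reuse after deletes mentioned above): rows sharing a row hash, whether duplicate NUPI values or hash synonyms, get ascending uniqueness values, and the (row hash, uniqueness) pair is the unique Row ID.

```python
from collections import defaultdict

next_uniq = defaultdict(int)       # per-row-hash counter kept by the AMP

def assign_row_id(row_hash):
    """Return the Row ID (row_hash, uniqueness) for a newly inserted row."""
    next_uniq[row_hash] += 1       # first row with this hash gets 1
    return (row_hash, next_uniq[row_hash])

a = assign_row_id(0xCAFE)
b = assign_row_id(0xCAFE)          # hash synonym or duplicate NUPI value
c = assign_row_id(0xBEEF)
# a and b share a row hash but differ in uniqueness; all three Row IDs differ.
```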

    A secondary index is an alternate path to the data. Secondary indexes are used to improve performance by allowing the user to avoid scanning the entire table. A secondary index is like a primary index in that it allows the user to locate rows; it is unlike a primary index in that it has no influence on the way rows are distributed among AMPs. A database designer typically chooses a secondary index because it provides faster set selection. Primary Index requests require the services of only one AMP to access rows, while secondary indexes require at least two and possibly all AMPs, depending on the index and the type of operation. A secondary index search will typically be less expensive than a full table scan.

    Secondary indexes add overhead to the table, both in terms of disk space and maintenance; however, they may be dropped when not needed and recreated whenever they would be helpful.

    Just as with primary indexes, there are two types of secondary indexes: unique (USI) and non-unique (NUSI). Secondary indexes may be specified at table creation or at any time during the life of the table. A secondary index may consist of up to 16 columns; however, to get the benefit of the index, a query would have to specify a value for all 16 columns.

    Unique Secondary Indexes (USIs) have two possible purposes. They can speed up access to a row which otherwise might require a full table scan, without relying on the primary index. Additionally, they can be used to enforce uniqueness on a column or set of columns. This is sometimes the case with a Primary Key which is not designated as the Primary Index; making it a USI has the effect of enforcing the uniqueness of the PK.

    All secondary indexes cause an AMP local subtable to be built and maintained as column values change. Secondary index subtables consist of rows which associate the secondary index value with one or more rows in the base table. When the index is dropped, the subtable is physically removed.


    Non-Unique Secondary Indexes (NUSIs) are usually specified in order to prevent full table scans. NUSIs, however, activate all AMPs; after all, the value being sought might well live on many different AMPs (only equal primary index values are guaranteed to land on the same AMP). If the Optimizer decides that the cost of using the secondary index is greater than a table scan would be, it opts for the table scan.
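The difference in AMP footprint can be sketched as follows (illustrative code; Python's built-in hash stands in for the row-hash algorithm): a USI value hashes to one subtable row on one AMP, while a NUSI value can match rows anywhere, so every AMP must search its local subtable.

```python
N_AMPS = 4

def amps_for_usi(index_value):
    # USI: hash the value and ask only the AMP that owns that hash.
    return {hash(index_value) % N_AMPS}

def amps_for_nusi(index_value):
    # NUSI: qualifying rows may live on any AMP -> an all-AMP operation.
    return set(range(N_AMPS))
```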


    Notice that in all cases, the data used to access the index is the same.

    Database: The Teradata Definition

    In Teradata, a "database" provides a logical grouping of information. A Teradata database also plays a key role in space allocation and access control. A Teradata database is a defined, logical repository that can contain objects, including:
    - Databases: A defined object that may contain a collection of Teradata Database objects.
    - Users: Databases that each have a logon ID and password for logging on to the Teradata Database.
    - Tables: Two-dimensional structures of columns and rows of data stored on the disk drives. (Require Perm Space)
    - Views: Virtual "windows" to subsets of one or more tables or other views, pre-defined using a single SELECT statement. (Use no Perm Space)
    - Macros: Definitions of one or more Teradata SQL and report formatting commands. (Use no Perm Space)
    - Triggers: One or more Teradata SQL statements associated with a table and executed when specified conditions are met. (Use no Perm Space)
    - Stored Procedures: Combinations of procedural and non-procedural statements run using a single CALL statement. (Require Perm Space)

    User: A Special Kind of Database

    A User can be thought of as a collection of tables, views, macros, triggers, and stored procedures. A User is a specific type of database, and has attributes in addition to the ones listed above:
    - Logon ID
    - Password

    A table in a relational database management system is a two-dimensional structure made up of columns and physical rows stored in data blocks on the disk drives. Each column represents an attribute of the table. Attributes identify, describe, or qualify the table. Each column is named and all the information contained within it is of the same type, for example, Department Number.

    A view is like a "window" into tables that allows multiple users to look at portions of the same base data. A view may access one or more tables, and may show only a subset of columns from the table(s). A view does not exist as a real table and does not occupy disk space. It serves as a reference to existing tables or views. A view is a logical structure with no actual data -- it accesses data that is stored in a table and returns the requested rows from the table to the user.

    A macro is a Teradata Database extension to ANSI SQL that defines a sequence of prewritten Teradata SQL statements. Macros are pre-defined, stored sets of one or more SQL commands and/or report-formatting (BTEQ) commands. Macros can also contain comments. Macros can be a convenient shortcut for executing groups of frequently-run or complex SQL statements (queries) or sets of operations. When you execute the macro, the statements execute as a single transaction. Macros reduce the number of keystrokes needed to perform a complex task. This saves you time, reduces the chance for errors, and reduces the communication volume to the Teradata Database.

    A trigger is a set of SQL statements usually associated with a column or table that are programmed to be run (or "fired") when specified changes are made to the column or table. The pre-defined change is known as a triggering event, which causes the SQL statements to be processed.

    A stored procedure is a pre-defined set of statements invoked through a single CALL statement in SQL. While a stored procedure may seem like a macro, it is different in that it can contain: Teradata SQL data manipulation statements (non-procedural) Procedural statements (in the Teradata Database, referred to as Stored Procedure Language)

    As you would expect, any system of this level of maturity must be an open system.

    TERADATA is committed to supporting data interchange with mainframes.

    TERADATA is committed to supporting open standards as they emerge.

    Many customers are asking about the availability of Teradata on Linux and/or when Teradata will be available in 64-bit. We are fulfilling both of these requirements in 2005 with the release of the Teradata on 64-bit Linux solution. Currently, the 5400 is 64-bit capable, as it uses the new Xeon EM64T chip. However, our complete Teradata 64-bit solution, which includes the operating system and the database, will be available in late 2005. As such, we are not marketing the 5400 as a 64-bit solution. We will begin marketing the 64-bit solution in its entirety in mid-2005.

    While we are not yet offering this solution to our customers, if asked, you should articulate that the Teradata solution is already 90% there. As you can see from this picture, customers can run both their 32-bit and 64-bit applications from a client or server tier with the Teradata Database today. The Teradata Tools and Utilities are all available on 64-bit Linux, and any third-party application using standard interfaces can connect without issue today. With the 5400 we now have the 64-bit platform, and in late 2005 we will release the complete solution with Linux and the Teradata Database.

    Now let's move on to the recommended configuration options for the 5400.

    KEY MESSAGE: The ADW (TERADATA) is an integrated part of the Real Time Enterprise.
    KEY MESSAGE: The EDW is NOT simply a mirror of the transactional data models. It IS a data model specifically designed for decision support.
    KEY MESSAGE: Driven by a business need for concurrent tactical and strategic decision making, a new class of applications needs to access the ADW. Traditional (legacy) applications also need access to the ADW, either directly or indirectly via data sharing between the transactional and decision-making environments.

    There are four programming models illustrated in the diagram, all of which can interoperate within the ADW architecture:
    - Client/Server: Applications such as BTEQ, Cognos, and MicroStrategy are examples of tightly bound client/server applications that can interoperate with TERADATA.
    - Web Services: Frameworks such as SeeBeyond, WebMethods, and BEA WebLogic can be used to deploy database service applications based on the emerging Web Services model.
    - Publish & Subscribe: Frameworks such as TIBCO can be used to deploy applications that access data based on a publish & subscribe model.
    - EDI: Using EDI VAN vendors such as GE Global eXchange, Get2Connect.net, and Sterling Commerce, TERADATA can participate in electronic transactions with trading partners.

    Data is shared between the transactional and decision-making environments:
    - Data Acquisition & Integration: Data from the transactional environment is captured and copied to the ADW. Based on business needs, data can be moved to the ADW in a streaming fashion or in a more traditional batch operation.
    - Information Feedback: Raw data in the ADW is analyzed and transformed into actionable information, some of which is fed back to the transactional systems.

    TERADATA is contained in the Decision Making Environment, and is fully integrated into the Active Enterprise using a number of standard interfaces and programming models. Not all components are required to achieve the goal of an Active Enterprise; a system designer is free to choose those components that provide business value to the overall IT mission.
    KEY MESSAGE: Transactional systems tend to have a very narrow scope.
    KEY MESSAGE: Many transactional repositories exist in the enterprise.

    Examples include TPF, SAP, Siebel, ORCL-financials, etc.



    Direct Information Access: The transactional systems can also access the ADW directly.