A Teradata Database System


Transcript of A Teradata Database System

8/12/2019 A Teradata Database System

http://slidepdf.com/reader/full/a-teradata-database-system 1/138

TERADATA ARCHITECTURE

1). WELCOME TO TERADATA DATABASE

Covers an introduction to the Teradata Database, explains the concepts of data warehouses and data marts, and elaborates on relational database concepts and components.

2). TERADATA DATABASE & DATAWAREHOUSE ARCHITECTURE

Covers details of data warehouses and data marts, describes and compares the different kinds of data marts, and details the Teradata Database system and its components (BYNET, AMPs, Parsing Engine, V-Procs, etc.).

3). CLIENT ACCESS

Covers Teradata interfacing with network-attached clients and channel-attached clients, with details on the different Teradata utilities and when to use each kind.

4). TERADATA SQL

Covers basic Teradata SQL syntax.

5). DATA STRUCTURE

Explains Teradata Database objects and details spool space, perm space, etc., with examples.

6). DATA PROTECTION

Explains in detail the various data protection methods used by the Teradata system, such as RAID, Fallback, Journals, and Accesses.

7). INDICES

Describes the what, when, and how of the several kinds of indices used in Teradata Database systems.

 


1. WELCOME TO TERADATA DATABASE

Objectives

After completing this module, you should be able to:

• Describe the Teradata Database.
• Describe the advantages of the Teradata Database.
• Define the terms associated with relational databases.
• Describe the advantages of a relational database.

What Is The Teradata Database?

The Teradata Database is a relational database management system (RDBMS) that drives a company's data warehouse. The Teradata Database provides the foundation to give a company the power to grow, to compete in today's dynamic marketplace, and to evolve the business by getting answers to a new generation of questions. The Teradata Database's scalability allows the system to grow as the business grows, from gigabytes to terabytes and beyond. The Teradata Database's unique technology has been proven at customer sites across industries and around the world.

The Teradata Database is an open system, compliant with ANSI standards. It is currently available on UNIX MP-RAS and Windows 2000 operating systems. The Teradata Database is a large database server that accommodates multiple client applications making inquiries against it concurrently. Various client platforms access the database through a TCP/IP connection or across an IBM mainframe channel connection. The ability to manage large amounts of data is accomplished using the concept of parallelism, wherein many individual processors perform smaller tasks concurrently to accomplish an operation against a huge repository of data. To date, only parallel architectures can handle databases of this size.


How Is The Teradata Database Used?

Each Teradata Database implementation can model a company's business. The ability to keep up with rapid changes in today's business environment makes the Teradata Database an ideal foundation for many applications, including:

• Enterprise data warehousing
• Active data warehousing
• Customer relationship management
• Internet and E-Business
• Data marts

1) Enterprise Data Warehouse

Data warehousing is a process for properly assembling and managing data from various servers to answer business-critical questions. The Teradata Database is ideal for enterprise data warehousing, which is commonly characterized by:

• Multiple subject areas
• Many concurrent users
• Many concurrent queries, including ad-hoc queries
• Large quantity of tables
• Hundreds of gigabytes (and terabytes) of detail data
• Historical data stored (months or years)

Without an enterprise data warehouse, a financial institution may be able to identify profitable customers for separate products such as mortgages or credit cards, but not know the overall profitability of each customer. An enterprise data warehouse brings together the different subject areas into a central repository, creating "one single view of the business" for a complete picture of the customer.

An enterprise data warehouse environment built on the Teradata Database simplifies the system maintenance task, resulting in a lower total cost of ownership. In addition, the Teradata Database's ability to handle large-scale, decision-support queries against huge volumes of detail data makes it the obvious choice for companies wanting to start at any level and grow.

 


2) Active Data Warehouse

The active data warehouse extends a company's ability beyond historical data and strategic decisions to bring the decision-making capability to front-line personnel. Tactical decisions such as "Who should get the empty seat on this airplane?" or "What should I offer this customer to keep her from leaving, based on her history with our company?" can be made more effectively with the right information.

With an active data warehouse, employees who interact directly with customers and suppliers are empowered with information-based decision making at their fingertips. The Teradata Warehouse supports active data warehousing with:

• Capability to handle thousands of additional users and mixed workloads
• High availability and reliability to support mission-critical applications
• Scalability to accommodate an increase in the amount of data, the number of data sources, and the number of applications supported in the data warehouse environment

3) Customer Relationship Management

Customer Relationship Management solutions help companies capture and analyze data to maximize customer acquisition, retention, and profitability. You can use the Teradata Database's detailed data and analysis capabilities to identify and optimize business relationships with the highest potential of profitability and growth. Examples include:

• A telephone company can conduct and refine marketing programs targeted at a certain type of profitable customer.
• A supermarket can create incentives based on specific combinations of products that customers tend to buy together.
• A bank can recognize changes in a customer's life circumstances, such as a new baby or a college-bound son or daughter, and offer timely services such as a new home loan, mortgage insurance, additional checking account, extra credit card, or student loan.
• A retailer can run a department store credit card sales program and filter out those customers who already have that card.

The NCR CRM solution consists of software, professional and customer services, and the Teradata Database to create, maintain, and enhance customer relationships.

 


4) Internet and E-Business

The Teradata Database provides a single repository for customer information that helps E-Businesses build and maintain one-to-one customer relationships that are critical to their success on the Internet. The Teradata Database supports the fast-paced style of E-Business by allowing many concurrent users to ask complicated questions as they think of them, and get quick answers.

The Teradata Database allows E-Businesses to:

• Capture massive amounts of click-stream data.
• Enable multiple users to ask complex questions of the customers' click-stream data with near real-time response.
• Protect customers' privacy with consumer opt-in/opt-out preferences and the ability for consumers to check and revise their information stored on the Teradata Database through the Internet or a company call center.

5) Data Mart

A data mart is a special-purpose subset of a company's enterprise data used by a particular department, function, or application. Often, these single-subject-area data marts contain data that was aggregated or transformed in some way to better handle the requests of a specific user community. Vendors implement data marts using different architectures:

• Independent data marts - Created directly from operational systems to an individual data store.
• Dependent data marts - Created from detail data in the data warehouse. It still requires movement and transformation of data, but may provide better performance for some specific user queries.
• Logical data marts - Existing parts of the data warehouse, not separate physical structures. Because in theory the data warehouse contains the detail data of the entire enterprise, a logical data mart would then provide the specific information for a specific user community. With the proper technology, this can be an ideal way to remove the need for massive data loading and transforming.

Independent and dependent data marts are architectures endorsed by other database vendors and tend to be associated with higher maintenance costs for physically moving and maintaining the data, inconsistent data (and resulting inconsistent decisions), and indirect ways to get the complete picture of the data. The Teradata Database is ideal for the logical data mart environment, where different user communities view subsets of a single repository of enterprise data.
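One way to picture a logical data mart is as a view over the single repository, so that no data is copied or moved. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than a Teradata system, and the table, view, and column names (sales, east_mart, region, product, amount) are hypothetical.

```python
import sqlite3

# Hypothetical single repository of detail data (names are illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", "loan", 100.0), ("East", "card", 50.0), ("West", "loan", 70.0)],
)

# The "logical data mart" is a view: a subset of the repository, not a copy.
conn.execute(
    "CREATE VIEW east_mart AS "
    "SELECT product, SUM(amount) AS total FROM sales "
    "WHERE region = 'East' GROUP BY product"
)

def east_totals():
    """Read the mart; the result always reflects the current detail data."""
    return conn.execute(
        "SELECT product, total FROM east_mart ORDER BY product"
    ).fetchall()

east_rows = east_totals()
print(east_rows)
```

Because the view is evaluated against the detail table at query time, a row added to sales shows up in east_mart without any loading or transformation step.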

 


What Makes the Teradata Database Unique?

In this Web-Based Training, you will learn about many features that make the Teradata Database, an RDBMS, right for business-critical applications. To start with, this section covers these key features:

• Single data store
• Scalability
• Unconditional parallelism (parallel architecture)
• Ability to model the business
• Mature, parallel-aware Optimizer

1) Single Data Store

The Teradata Database acts as a single data store, with multiple client applications making inquiries against it concurrently.

Instead of replicating a database for different purposes, with the Teradata Database you store the data once and use it for all clients. The Teradata Database provides the same connectivity for an entry-level system as it does for a massive enterprise data warehouse.


2) Scalability

"Linear scalability" means that as you add components to the system, the performance increase is linear. Adding components allows the system to accommodate increased workload without decreased throughput. Linear scalability enables the system to grow to support more users/data/queries/complexity of queries without experiencing performance degradation. As the configuration grows, the performance increase is linear, with a slope of 1. The Teradata Database was the first commercial database system to scale to and support a trillion bytes of data. The origin of the name Teradata is "tera-," which is derived from Greek and means "trillion."

The chart below lists the meaning of the prefixes:

Prefix   Exponent   Meaning
kilo-    10^3       1,000 (thousand)
mega-    10^6       1,000,000 (million)
giga-    10^9       1,000,000,000 (billion)
tera-    10^12      1,000,000,000,000 (trillion)
peta-    10^15      1,000,000,000,000,000 (quadrillion)
exa-     10^18      1,000,000,000,000,000,000 (quintillion)

The Teradata Database can scale from 100 gigabytes to over 100 terabytes of data on a single system without losing any performance capability. The Teradata Database's scalability provides investment protection for the customer's growth and application development. The Teradata Database is the only database that is truly scalable, and this extends to data loading with the use of parallel loading utilities. The Teradata Database provides automatic data distribution and no reorganizations of data are needed. The Teradata Database is scalable in multiple ways, including hardware, complexity, and concurrent users.
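The idea of automatic data distribution can be sketched as assigning each row to a processing unit by hashing its key, so that rows spread roughly evenly without manual placement. This is a simplified illustration, not the Teradata Database's actual hashing algorithm: the functions unit_for and distribute, the MD5-plus-modulo scheme, and the key format are all assumptions made for the example.

```python
import hashlib
from collections import Counter

def unit_for(key: str, units: int) -> int:
    """Map a row key to one of `units` processing units via a stable hash.

    Hypothetical scheme for illustration only (MD5 digest modulo unit count).
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % units

def distribute(keys, units):
    """Count how many rows land on each unit."""
    return Counter(unit_for(k, units) for k in keys)

keys = [f"customer-{i}" for i in range(10_000)]
four = distribute(keys, 4)    # a small configuration
eight = distribute(keys, 8)   # the same rows after adding units

print(sorted(four.items()))
print(sorted(eight.items()))
```

With more units, each one holds a smaller share of the same rows, which is the property that lets added hardware take on a proportional part of the work.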

Hardware

Growth is a fundamental goal of business. An MPP Teradata Database system easily accommodates that growth whenever it happens. The Teradata Database runs on highly optimized NCR servers in the following configurations:

• SMP - Symmetric multiprocessing platforms manage gigabytes of data to support an entry-level data warehousing system.
• MPP - Massively parallel processing systems can manage hundreds of terabytes of data. You can start small with a couple of nodes, and later expand the system as your business grows.

With the Teradata Database, you can increase the size of your system without replacing:

• Databases - When you expand your system, the data is automatically redistributed through the reconfiguration process, without manual interventions such as sorting, unloading and reloading, or partitioning.
• Platforms - The Teradata Database's modular structure allows you to add components to your existing system.
• Data model - The physical and logical data models remain the same regardless of data volume.
• Applications - Applications you develop for Teradata Database configurations will continue to work as the system grows, protecting your investment in application development.

Complexity

The Teradata Database is adept at complex data models that satisfy the information needs throughout an enterprise. The Teradata Database efficiently processes increasingly sophisticated business questions as users realize the value of the answers they are getting. It has the ability to perform large aggregations during query run time and can perform up to 64 joins in a single query.

Concurrent Users

As is proven in every Teradata Database benchmark, the Teradata Database can handle the most concurrent users, who are often running multiple complex queries. The Teradata Database has the proven ability to handle from hundreds to thousands of users on the system simultaneously. Adding many concurrent users typically reduces system performance. However, adding more components can enable the system to accommodate the new users with equal or even better performance.


3) Unconditional Parallelism

The Teradata Database provides exceptional performance using parallelism to achieve a single answer faster than a non-parallel system. Parallelism uses multiple processors working together to accomplish a task quickly.

An example of parallelism can be seen at an amusement park, as guests stand in line for an attraction such as a roller coaster. As the line approaches the boarding platform, it typically will split into multiple, parallel lines. That way, groups of people can step into their seats simultaneously. The line moves faster than if the guests step onto the attraction one at a time. At the biggest amusement parks, the parallel loading of the rides becomes essential to their successful operation.

Parallelism is evident throughout a Teradata Database, from the architecture to data loading to complex request processing. The Teradata Database processes requests in parallel without mandatory query tuning. The Teradata Database's parallelism does not depend on limited data quantity, column range constraints, or specialized data models: the Teradata Database has "unconditional parallelism."
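The general pattern behind parallel processing (split the work, let several workers handle the pieces at once, then combine the partial results) can be sketched in a few lines. This is a generic illustration of the pattern, not a description of how the Teradata Database is implemented; the function names and the choice of a thread pool are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    """Work done by one worker: total up its own slice of the data."""
    return sum(chunk)

def parallel_sum(values, workers=4):
    """Split `values` into up to `workers` chunks, sum each chunk
    concurrently, then combine the partial sums into one answer."""
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

total = parallel_sum(list(range(1, 1001)))
print(total)  # 500500
```

Threads are used here only to keep the sketch short and portable; the point is the structure of the computation (many small tasks feeding a single answer), not a measured speed-up.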

4) Ability to Model the Business

A data warehouse built on a business model contains information from across the enterprise. Individual departments can use their own assumptions and views of the data for analysis, yet these varying perspectives have a common basis for a "single view of the business."

With the Teradata Database's centrally located, logical architecture, companies can get a cohesive view of their operations across functional areas to:

• Find out which divisions share customers.
• Track products throughout the supply chain, from initial manufacture, to inventory, to sale, to delivery, to maintenance, to customer satisfaction.
• Analyze relationships between results of different departments.
• Determine if a customer on the phone has used the company's website.
• Vary levels of service based on a customer's profitability.

You get consistent answers from the different viewpoints above using a single business model, not functional models for different departments. In a functional model, data is organized according to what is done with it. But what happens if users later want to do some analysis that has never been done before? When a system is optimized for one department's function, the other departments' needs (and future needs) may not be met.

A Teradata Database allows the data to represent a business model, with data organized according to what it represents, not how it is accessed, so it is easy to understand. The data model should be designed without regard to usage and be the same regardless of data volume. With a Teradata Database as the enterprise data warehouse, users can ask new questions of the data that were never anticipated, throughout the business cycle and even through changes in the business environment.

A key Teradata Database strength is its ability to model the customer's business. The Teradata Database's business models are truly normalized, avoiding the costly star schema and snowflake implementations that many other database vendors use. The Teradata Database can do Star Schema and other types of relational modeling, but Third Normal Form is the methodology Teradata Division recommends to customers. The Teradata Database's competitors typically implement Star Schema or Snowflake models either because they are implementing a set of known queries in a transaction processing environment, or because their architecture limits them to that type of model. Normalization is the process of reducing a complex data structure into a simple, stable one. Generally this process involves removing redundant attributes, keys, and relationships from the conceptual data model. The Teradata Database supports normalized logical models because it is able to perform 64 table joins and large aggregations during queries.


5) Mature, Parallel-Aware Optimizer

The Teradata Database's Optimizer is the most robust in the industry, able to handle:

• Multiple complex queries
• Joins per query
• Unlimited ad-hoc processing

The Optimizer is parallel-aware, meaning that it has knowledge of system components (how many nodes, vprocs, etc.). It determines the least expensive plan (time-wise) to process queries fast and in parallel. The Optimizer is further explained in the next module.

What Is a Relational Database?

A database is a collection of permanently stored data that is:

• Logically related (the data was created for a specific purpose).
• Shared (many users may access the data).
• Protected (access to the data is controlled).
• Managed (the data integrity and value are maintained).

The Teradata Database is a relational database. Relational databases are based on the relational model, which is founded on mathematical Set Theory. The relational model uses and extends many principles of Set Theory to provide a disciplined approach to data management. A relational database is designed to:

• Represent a business and its business practices.
• Be extremely flexible in the way that it can be selected and used.
• Be easy to understand.
• Model the business, not the applications.
• Allow businesses to quickly respond to changing conditions.

Relational databases present data as a set of tables. A table is a two-dimensional representation of data that consists of rows and columns. According to the relational model, a valid table does not have to be populated with data rows; it just needs to be defined with at least one column.

A relational database is a set of logically related tables. Tables are logically related to each other by a common field, so information such as customer telephone numbers and addresses can exist in one table, yet be accessible for multiple purposes. The example below shows customer, order, and billing statement data, related by a common field. The common field of Customer ID lets you look up information such as a customer name for a particular statement number, even though the data exists in two different tables.

Relational databases are more flexible than other types, so businesses are able to respond more quickly to changing conditions.
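The lookup described above (finding a customer name for a statement number through the shared Customer ID field) can be sketched with a join. The example runs on Python's built-in sqlite3 module rather than a Teradata system, and the table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE statement (statement_no INTEGER PRIMARY KEY,
                        customer_id INTEGER, amount REAL);
INSERT INTO customer VALUES (1, 'Rivera'), (2, 'Chen');
INSERT INTO statement VALUES (9001, 2, 120.50), (9002, 1, 35.00);
""")

def customer_for_statement(statement_no):
    """Follow the common customer_id field from statement to customer."""
    row = conn.execute(
        "SELECT c.name FROM statement s "
        "JOIN customer c ON c.customer_id = s.customer_id "
        "WHERE s.statement_no = ?",
        (statement_no,),
    ).fetchone()
    return row[0] if row else None

print(customer_for_statement(9001))  # Chen
```

The customer name is stored once, in the customer table, and every statement reaches it through the common field instead of carrying its own copy.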

1) Rows

Each row contains all the columns in the table. A row is one instance of all columns, and each table can contain only one row format. The order of rows is arbitrary and does not imply priority, hierarchy, or significance.

Each row represents an occurrence of an entity defined by the table. An entity is a person, place, or thing about which the table contains information. In this example, the entity is the employee and each row represents a single employee.


2) Columns

Each column contains "like data" such as only part names, or only supplier names, or only employee numbers. In the example below, the Last_Name column contains last names only, and nothing else. The data in the columns is atomic data, so a telephone number might be divided into three columns: the area code, the prefix, and the suffix, so the customer data can be analyzed according to area code, etc. Missing data values would be represented by "nulls." Within a table, the column position is arbitrary.

3) Primary Key

In the relational model, a Primary Key (PK) is used to designate a unique identifier for each row when you design a database. A Primary Key can be composed of one or more columns. In the example below, the Primary Key is the employee number.


Primary Key Rules

Rules governing how Primary Keys must be defined and how they function are:

Rule 1: A Primary Key is required.
Rule 2: A Primary Key value must be unique.
Rule 3: The Primary Key value cannot be NULL.
Rule 4: The Primary Key value should not be changed.
Rule 5: The Primary Key column should not be changed.
Rule 6: A Primary Key may be any number of columns.

Rule 1: A Primary Key is required

In the logical model, each table requires a Primary Key because that is how each row is able to be uniquely identified. Each table must have one, and only one, Primary Key. In any given row, the value of the Primary Key uniquely identifies the row. The Primary Key may span more than one column, but even then, there is only one Primary Key.


Rule 2: Unique PK

Within the column(s) designated as the Primary Key, the values in each row must be unique. No duplicate values are allowed. The Primary Key's purpose is to uniquely identify a row. In a multi-column Primary Key, the combined value of the columns must be unique, even if an individual column in the Primary Key has duplicate values.

Rule 3: PK Cannot Be NULL

Within the Primary Key column, each row must have a Primary Key value and cannot be NULL (without a value). Because NULL is indeterminate, it cannot "identify" anything.
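Rules 2 and 3 can be seen in action with a small constraint check. The sketch below uses Python's built-in sqlite3 module, not a Teradata system, and the table layout and sample values are hypothetical. NOT NULL is declared explicitly because SQLite, unlike most engines, would otherwise accept a NULL in this kind of primary key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical employee table; NOT NULL is spelled out for SQLite's sake.
conn.execute(
    "CREATE TABLE employee ("
    "employee_number TEXT NOT NULL PRIMARY KEY, last_name TEXT)"
)
conn.execute("INSERT INTO employee VALUES ('1006', 'Stein')")

def try_insert(employee_number, last_name):
    """Attempt an insert and report whether the database accepted it."""
    try:
        conn.execute(
            "INSERT INTO employee VALUES (?, ?)", (employee_number, last_name)
        )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

duplicate_result = try_insert("1006", "Kanieski")  # Rule 2: duplicate PK value
null_result = try_insert(None, "Ryan")             # Rule 3: NULL PK value
new_result = try_insert("1008", "Ryan")            # unique, non-NULL value

print(duplicate_result)
print(null_result)
print(new_result)
```

Only the third insert succeeds; the first two are refused by the database itself, so the rules hold no matter which application writes to the table.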


Rule 4: PK Value Should Not Change

Primary Key values should not be changed. If you changed a Primary Key, you would lose all historical tracking of that row.

Rule 5: PK Column Should Not Change

Additionally, the column(s) designated as the Primary Key should not be changed. If you changed a Primary Key, you would lose all the information relating that table to other tables.


Rule 6: No Column Limit

In the relational model, there is no limit to the number of columns that can be designated as the Primary Key, so it may consist of one or more columns. In the example below, the Primary Key consists of three columns: EMPLOYEE NUMBER, LAST NAME, and FIRST NAME.
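A three-column Primary Key like the one described here can be sketched as follows. The example again uses Python's built-in sqlite3 module rather than a Teradata system, and the sample names are hypothetical. It also illustrates the earlier point that only the combined value of the key columns has to be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE employee (
    employee_number TEXT NOT NULL,
    last_name       TEXT NOT NULL,
    first_name      TEXT NOT NULL,
    PRIMARY KEY (employee_number, last_name, first_name)
)""")

conn.execute("INSERT INTO employee VALUES ('1006', 'Stein', 'John')")
# Individual columns repeat, but the combined value is still unique.
conn.execute("INSERT INTO employee VALUES ('1006', 'Stein', 'Mary')")

try:
    # Exact repeat of all three key columns: the combined value is a duplicate.
    conn.execute("INSERT INTO employee VALUES ('1006', 'Stein', 'John')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(duplicate_accepted, row_count)  # False 2
```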

Foreign Key

A Foreign Key (FK) is an identifier that links related tables. A Foreign Key defines how two tables are related to each other. Each Foreign Key references a matching Primary Key in another table in the database. For example, in the table below, the Department Number column that is a Foreign Key actually exists in another table as a Primary Key.

Having tables related to each other gives users the flexibility to look at the data in different ways, without the database administrator having to manage and maintain many tables of duplicate data for different applications.

Foreign Key Rules

Rules governing how Foreign Keys must be defined and how they operate are:

Rule 1: Foreign Keys are optional.
Rule 2: A Foreign Key value may be non-unique.
Rule 3: The Foreign Key value may be NULL.
Rule 4: The Foreign Key value may be changed.
Rule 5: A Foreign Key may be any number of columns.
Rule 6: Each Foreign Key must exist as a Primary Key in the related table.

Rule 1: Optional FKs

Foreign Keys are optional; not all tables have them. Tables that do have them can have multiple Foreign Keys because a table can relate to multiple other tables. In fact, a table can have an unlimited number of foreign keys. In the example table below:

• The Department Number Foreign Key relates to the Department Number Primary Key in the Department Table elsewhere in the database.
• The Job Code FK relates to the Job Code PK in the Job Code Table, elsewhere in the database.


Having tables related to each other makes a relational database flexible so that different users can look up information they need, while simplifying the database administration so the data doesn't have to be duplicated for each purpose or application.

Rule 2: Unique or Non-Unique FKs

Duplicate Foreign Key values are allowed. More than one employee could be assigned to the same department.

Rule 3: FKs Can Be NULL

NULL (missing) Foreign Key values are allowed. For example, under special circumstances, an employee might not be assigned to a department.


 

Rule 4: FK Value Can Change

Foreign Key values may be changed. For example, if Arnando Villegas moves from Department 403 to Department 587, the Foreign Key value in his row would change.

Rule 5: FK Has No Column Limit

The Foreign Key may consist of one or more columns. A multi-column foreign key is used to relate to a multi-column Primary Key in the related table. In the relational model, there is no limit to the number of columns that can be designated as a Foreign Key.


Rule 6: FK Must Be PK in Related Table

Each Foreign Key must exist as a Primary Key in the related table. A department number that does not exist in the Department Table would be invalid as a Foreign Key value in the Employee Table.

This rule can apply even if the Foreign Key is NULL, or missing. Remember, a missing value is defined as a non-value; there is no value present. So the rule could be better stated: if a value exists in the Foreign Key column, it must match a Primary Key value in the related table.
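The Foreign Key rules can be checked together in one short sketch. It uses Python's built-in sqlite3 module rather than a Teradata system; the department names, employee numbers, and the helper try_sql are hypothetical. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE department (department_number INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employee (
    employee_number   INTEGER PRIMARY KEY,
    department_number INTEGER REFERENCES department(department_number)
);
INSERT INTO department VALUES (403, 'Sales'), (587, 'Support');
""")

def try_sql(sql):
    """Run one statement; return True if accepted, False if a constraint failed."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

first_ok = try_sql("INSERT INTO employee VALUES (1, 403)")
duplicate_ok = try_sql("INSERT INTO employee VALUES (2, 403)")  # Rule 2: non-unique FK
null_ok = try_sql("INSERT INTO employee VALUES (3, NULL)")      # Rule 3: NULL FK
change_ok = try_sql(                                            # Rule 4: FK value changes
    "UPDATE employee SET department_number = 587 WHERE employee_number = 1"
)
missing_ok = try_sql("INSERT INTO employee VALUES (4, 999)")    # Rule 6: no matching PK

print(first_ok, duplicate_ok, null_ok, change_ok, missing_ok)
```

Every statement is accepted except the last: department 999 has no matching Primary Key, while the NULL value is allowed because there is no value to match.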


2. TERADATA DATABASE & DATAWAREHOUSE ARCHITECTURE

Objectives

After completing this module, you should be able to:

• Identify the different types of enterprise data processing.
• Define a data warehouse, active data warehouse, and a data mart.
• List and define the different types of data marts.
• Explain the advantages of detail data over summary data.
• Describe the overall Teradata Database parallel architecture.
• List and describe major Teradata Database hardware and software components and their functions.
• Explain how the architecture helps to maintain high availability and reliability for Teradata Database users.

Evolution to Active Data Warehousing

Data Warehouse Usage Evolution

There is an information evolution happening in the data warehouse environment today. Changing business requirements have placed demands on data warehousing technology to do more things faster. Data warehouses have moved from back room strategic decision support systems to operational, business-critical components of the enterprise. As your company evolves in its use of the data warehouse, what you need from the data warehouse evolves, too.


Stage 1 Reporting: The initial stage typically focuses on reporting from a single view of the business to drive decision-making across functional and/or product boundaries. Questions are usually known in advance, such as a weekly sales report.

Stage 2 Analyzing: Focus on why something happened, such as why sales went down or discovering patterns in customer buying habits. Users perform ad-hoc analysis, slicing and dicing the data at a detail level, and questions are not known in advance.

Stage 3 Predicting: Sophisticated analysts heavily utilize the system to leverage information to predict what will happen next in the business to proactively manage the organization's strategy. This stage requires data mining tools and building predictive models using historical detail. As an example, users can model customer demographics for target marketing.

Stage 4 Operationalizing: Providing access to information for immediate decision-making in the field enters the realm of active data warehousing. Stages 1 to 3 focus on strategic decision-making within an organization. Stage 4 focuses on tactical decision support. Tactical decision support is not focused on developing corporate strategy, but rather on supporting the people in the field who execute it. Examples: 1) Inventory management with just-in-time replenishment. 2) Scheduling and routing for package delivery. 3) Altering a campaign based on current results.

Stage 5 Active Warehousing: The larger the role an ADW plays in the operational aspects of decision support, the more incentive the business has to automate the decision processes. You can automate decision-making when a customer interacts with a web site. Interactive customer relationship management (CRM) on a web site or at an ATM is about making decisions to optimize the customer relationship through individualized product offers, pricing, content delivery and so on. As technology evolves, more and more decisions become executed with event-driven triggers to initiate fully automated decision processes. Example: determine the best offer for a specific customer based on a real-time event, such as a significant ATM deposit.

Active Data Warehouse

Data warehouses are beginning to take on mission-critical roles supporting CRM, one-to-one marketing, and minute-to-minute decision-making. Data warehousing requirements have evolved to demand a decision capability that is not just oriented toward corporate staff and upper management, but actionable on a day-to-day basis. Decisions such as when to replenish Barbie dolls at a particular retail outlet may not be strategic at the level of customer segmentation or long-term pricing strategies, but when executed properly, they make a big difference to the bottom line. We refer to this capability as "tactical" decision support.

Tactical decisions are the drivers for day-to-day management of the business. Businesses today want more than just strategic insight from their data warehouse implementations; they want better execution in running the business through more effective use of information for the decisions that get made thousands of times per day.

The origin of the active data warehouse is the timely, integrated store of detail data available for analytic business decision-making. It is only from that source that the additional traits needed by the active data warehouse can evolve. These new "active" traits are supplemental to data warehouse functionality. For example, the work mix in the database still includes complex decision support queries, but expands to take on short, tactical queries, background data feeds, and possibly event-driven updates all at the same time. Data volumes and user concurrency levels may explode upward beyond expectation. Restraints may need to be placed on the longer, analytical queries in order to guarantee tactical work throughput. While accessing the detail data directly remains an important opportunity for analytical work, tactical work may thrive on shortcuts and summaries, such as operational data store (ODS) level information. And for both strategic and tactical decisions to be useful to the business, today's data, this hour's data, even this minute's data has to be at hand.

The Teradata Database is positioned exceptionally well for stepping up to the challenges related to high availability, large multi-user workloads, and handling complex queries that are required for an active data warehouse implementation. The Teradata Database technology supports the evolving business requirements by providing high performance and scalability for:

• Mixed workloads (both tactical and strategic queries) for mission-critical applications
• Large amounts of detail data
• Concurrent users

The Teradata Database provides 7x24 availability and reliability, as well as continuous updating of information so data is always fresh and accurate.

 


Evolution of Data Processing

Traditionally, data processing has been divided into two categories: on-line transaction processing (OLTP) and decision support systems (DSS). For either, requests are handled as transactions. A transaction is a logical unit of work, such as a request to update an account.

An RDBMS is used in the following main processing environments:

• DSS
• OLTP
• OLAP
• Data Mining

Decision Support Systems (DSS)

In a decision support environment, users submit requests to analyze historical detail data stored in the tables. The results are used to establish strategies, reveal trends, and make projections. A database used as a decision support system (DSS) usually receives fewer, very complex, ad-hoc queries that may involve numerous tables. Decision support systems include batch reports, which roll up numbers to give the business the big picture, and have evolved over time. Instead of pre-written scripts, users now require the ability to do ad-hoc queries, analysis, and predictive what-if type queries that are often complex and unpredictable in their processing. These types of questions are essential for long-range, strategic planning. DSS systems often process huge volumes of detail data.

On-line Transaction Processing (OLTP)

Unlike the DSS environment, an on-line transaction processing (OLTP) environment typically has users accessing current data to update, insert, and delete rows in the data tables. OLTP is typified by a small number of rows (or records) or a few of many possible tables being accessed in a matter of seconds or less. Very little I/O processing is required to complete the transaction. This type of transaction takes place when we take out money at an ATM. Once our card is validated, a debit transaction takes place against our current balance to reflect the amount of cash withdrawn. This type of transaction also takes place when we deposit money into a checking account and the balance gets updated. We expect these transactions to be performed quickly. They must occur in real time.
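The ATM withdrawal can be sketched as a single logical unit of work: either the balance check and the debit both take effect, or neither does. This is only an illustration using Python's built-in sqlite3 module rather than the Teradata Database; the accounts table and the withdraw function are hypothetical names.

```python
import sqlite3

def withdraw(conn, account_id, amount):
    """Debit one account inside a single transaction; roll back on failure."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            row = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (account_id,)
            ).fetchone()
            if row is None or row[0] < amount:
                raise ValueError("unknown account or insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except ValueError:
        return False

# Hypothetical single-account setup for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 500)")
conn.commit()
```

Only one row is touched per call, which is the access pattern the OLTP description refers to.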

On-line Analytical Processing (OLAP)

OLAP is a modern form of analytic processing within a DSS environment. OLAP tools (e.g. from companies like MicroStrategy and Cognos) provide an easy-to-use Graphical User Interface to allow "slice and dice" analysis along multiple dimensions (e.g. products, locations, sales teams, inventories, etc.). With OLAP, the user may be looking for historical trends, sales rankings or seasonal inventory fluctuations for the entire corporation. Usually, this involves a lot of detail data to be retrieved, processed and analyzed. Therefore, response time can be in seconds or minutes.

Data Mining

Data Mining (predictive modeling) involves analyzing moderate to large amounts of detailed historical data to detect behavioral patterns (e.g. buying, attrition, or fraud patterns) that are then used to predict future behavior. An "analytic model" is built from the historical data (Phase 1: minutes to hours) incorporating the detected patterns. The model is then applied against current detail data ("scoring") to predict likely outcomes (Phase 2: seconds or less).

Likely outcomes, for example, include scores on likelihood of purchasing a product, switching to a competitor, or being fraudulent.
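The two phases can be sketched as follows. This is a deliberately tiny stand-in for a real analytic model: Phase 1 learns an attrition rate per customer segment from historical rows, and Phase 2 scores one current row against it. The segments, the sample rows, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict

def build_model(history):
    """Phase 1: learn the attrition rate per customer segment from history."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [customers who left, total]
    for segment, left in history:
        counts[segment][0] += int(left)
        counts[segment][1] += 1
    return {seg: left / total for seg, (left, total) in counts.items()}

def score(model, segment, default=0.0):
    """Phase 2: apply the model to one current row."""
    return model.get(segment, default)

# Hypothetical historical detail rows: (segment, did the customer leave?)
history = [
    ("student", True),
    ("student", False),
    ("retired", False),
    ("retired", False),
]
model = build_model(history)
```

Building the model scans all of the history, while scoring is a single lookup, which mirrors the difference in elapsed time between the two phases.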

Advantages of Using Summary Data

Until recently, most business decisions were based on summary data. The problem is that summarized data is not as useful as detail data and cannot answer some questions with accuracy. With summarized data, peaks and valleys are leveled when the peaks fall at the end of a reporting period and are cut in half.

Here's another example. Think of your monthly bank statement that records checking account activity. If it only told you the total amount of deposits and withdrawals, would you be able to tell if a certain check had cleared? To answer that question you need a list of every check received by your bank. You need detail data.
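The bank statement example can be made concrete with a few rows of made-up data. The summary below holds only a total, so it cannot say whether a given check cleared; the detail rows can. The check numbers and amounts are invented for illustration.

```python
# Detail data: one row per check received by the bank (hypothetical values).
checks = [
    {"number": 101, "amount": 250.00},
    {"number": 102, "amount": 75.50},
    {"number": 104, "amount": 40.00},
]

# Summary data: one total per statement period.
summary = {"total_withdrawals": sum(c["amount"] for c in checks)}

def check_cleared(detail_rows, number):
    """Answerable only from detail rows: did this check number clear?"""
    return any(row["number"] == number for row in detail_rows)
```

Nothing in summary distinguishes a statement that includes check 103 from one that does not, whereas check_cleared(checks, 103) answers the question directly.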

Decision support, that is, answering business questions, is the real purpose of databases. To answer business questions, decision-makers must have four things:

• The right data
• Enough detail data
• Proper data structure
• Enough computer power to access and produce reports on the data

Consider your own business and how it uses data. Is that data detailed or summarized? If it's summarized, are there questions it cannot answer?

The Data Warehouse

A data warehouse is a central, enterprise-wide database that contains information extracted from the operational systems. A data warehouse has a centrally located logical architecture which minimizes data synchronization and provides a single view of the business. Data warehouses have become more common in corporations where enterprise-wide detail data may be used in on-line analytical processing to make strategic and tactical business decisions. Warehouses often carry many years' worth of detail data so that historical trends may be analyzed using the full power of the data.

Many data warehouses get their data directly from operational systems so that the data is timely and accurate. While data warehouses may begin somewhat small in scope and purpose, they often grow quite large as their utility becomes more fully exploited by the enterprise.

Data Warehousing is a process, not a product. It is a technique to properly assemble and manage data from various sources to answer business questions not previously possible or known.


Data Marts

A data mart is a special purpose subset of enterprise data used by a particular department, function or application. Data marts may have both summary and detail data for a particular use rather than for general use. Usually the data has been pre-aggregated or transformed in some way to better handle the particular type of requests of a specific user community.

Independent Data Marts

Independent data marts are created directly from operational systems, just as is a data warehouse. In the data mart, the data is usually transformed as part of the load process. Data might be aggregated, dimensionalized or summarized historically, as the requirements of the data mart dictate.

Logical Data Marts

Logical data marts are not separate physical structures or a data load from a data warehouse, but rather are an existing part of the data warehouse. Because in theory the data warehouse contains the detail data of the entire enterprise, a logical view of the warehouse might provide the specific information for a given user community, much as a physical data mart would. Without the proper technology, a logical data mart can be a slow and frustrating experience for end users. With the proper technology, it removes the need for massive data loading and transforming, making a single data store available for all user needs.
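One way to picture a logical data mart is as a view defined over the warehouse's detail table: the user community queries the view, and no rows are copied or reloaded. The sketch below uses Python's built-in sqlite3 module as a stand-in for the warehouse; the table, view, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse detail table.
conn.execute("CREATE TABLE sales_detail (store TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_detail VALUES (?, ?, ?)",
    [("S1", "East", 100.0), ("S2", "East", 50.0), ("S3", "West", 70.0)],
)

# The logical data mart: a view, so no data is moved or transformed at load time.
conn.execute(
    "CREATE VIEW east_sales_mart AS "
    "SELECT store, amount FROM sales_detail WHERE region = 'East'"
)

rows = conn.execute(
    "SELECT store, amount FROM east_sales_mart ORDER BY store"
).fetchall()
```

Because the view reads the detail table at query time, new warehouse rows appear in the mart immediately; the cost is that every query against the mart does its work against untransformed detail data.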

Dependent Data Marts

Dependent data marts are created from the detail data in the data warehouse. While having many of the advantages of the logical data mart, this approach still requires the movement and transformation of data but may provide a better vehicle for performance-critical user queries.

Data Mart Pros and Cons

Independent Data Marts

Independent data marts are usually the easiest and fastest to implement and their payback value can be almost immediate. Some corporations start with several data marts before deciding to build a true data warehouse. This approach has several inherent problems:

• While data marts have obvious value, they are not a true enterprise-wide solution and can become very costly over time as more and more are added.
• A major problem with proliferating data marts is that, depending on where you look for answers, there is often more than one view of the business.
• They do not provide the historical depth of a true data warehouse.
• Because data marts are designed to handle specific types of queries from a specific type of user, they are often not good at "what if" queries like a data warehouse would be.

Logical Data Marts

Logical data marts overcome most of the limitations of independent data marts. They provide a single view of the business. There is no historical limit to the data and "what if" querying is entirely feasible. The major drawback to logical data marts is the lack of physical control over the data. Because data in the warehouse is not pre-aggregated or dimensionalized, performance against the logical mart will not usually be as good as against an independent mart. However, use of parallelism in the logical mart can overcome some of the limitations of the non-transformed data.

Dependent Data Marts

Dependent data marts provide all advantages of a logical mart and also allow for physical control of the data as it is extracted from the data warehouse. Because dependent marts use the warehouse as their foundation, they are generally considered a better solution than independent marts, but they take longer and are more expensive to implement.

 


A Teradata Database System

A Teradata Database system contains one or more nodes. A node is a term for a processing unit under the control of a single operating system. The node is where the processing occurs for the Teradata Database. There are two types of Teradata Database systems:

• Symmetric multiprocessing (SMP) - An SMP Teradata Database has a single node that contains multiple CPUs sharing a memory pool.
• Massively parallel processing (MPP) - Multiple SMP nodes working together comprise a larger, MPP implementation of a Teradata Database. The nodes are connected using the BYNET, which allows multiple virtual processors on multiple nodes to communicate with each other.

To manage a Teradata Database system, you use:

• SMP system: System Console (keyboard and monitor) attached directly to the SMP node
• MPP system: Administration Workstation (AWS)

To access a Teradata Database system, a user typically logs on through one of multiple client platforms (channel-attached mainframes or network-attached workstations). Client access is discussed in the next module.

 


Node Components

A node is a basic building block of a Teradata Database system, and contains a large number of hardware and software components. A conceptual diagram of a node and its major components is shown below. Hardware components are shown on the left side of the node and software components are shown on the right side.

Application

An application is software that accesses the Teradata Database. It can run on various platforms:

• Channel-attached client
• LAN-attached client
• Node

Channel Driver

Channel driver software is the means of communication between the PEs and applications running on channel-attached (mainframe) clients.

 


Gateway

The Teradata Gateway software is the means of communication between the PEs (on the node) and applications running on:

• Network-attached clients
• A node in the system

AMP

AMPs (Access Module Processors) are virtual processors (vprocs) that receive steps from PEs (Parsing Engines) and perform database functions to retrieve or update data. Each AMP is associated with one virtual disk (vdisk), where the data is stored. An AMP manages only its own vdisk, not the vdisk of any other AMP.

PE

PEs (Parsing Engines) are vprocs that receive SQL requests from the client and break the requests into steps. The PEs send the steps to the AMPs and subsequently return the answer to the client.

Teradata Database

The Teradata Database is a relational database management system (RDBMS) that runs as a Trusted Parallel Application (TPA) on the operating system.

A TPA implements virtual processors and runs on the operating system with PDE. The software components of the Teradata Database include:

• Channel Driver
• Teradata Gateway
• AMP
• PE

A Teradata Database system can have some nodes with the Database software, and some nodes without it.

• Nodes that contain the Teradata Database software are called "TPA nodes."
• Nodes that do not contain Teradata Database software generally have applications installed on them. They are called "non-TPA nodes," with the software label "NOTPA" (pronounced, "no TPA").


PDE

The PDE (Parallel Database Extensions) software layer runs on the operating system on each node. This additional software layer was created by NCR to support the parallel environment.

Vdisk (Virtual Disk)

A vdisk (pronounced, "VEE-disk") is the logical disk space that is managed by an AMP. Depending on the configuration, a vdisk may not be contained on the node; however, it is managed by an AMP, which is always a part of the node.

Using the BYNET

The BYNET (pronounced, "bye-net") is a high-speed interconnect (network) that enables multiple nodes in the system to communicate. The BYNET handles the internal communication of the Teradata Database. All communication between PEs and AMPs is done via the BYNET.

When the PE dispatches the steps for the AMPs to perform, they are dispatched onto the BYNET. The messages are routed to the appropriate AMP(s), where result sets and status information are generated. This response information is also routed back to the requesting PE via the BYNET. Depending on the nature of the dispatch request, the communication between nodes may be to all nodes (broadcast message) or to one specific node (point-to-point message) in the system.

1) BYNET Unique Features

The BYNET has several unique features:

• Scalable: As you add more nodes to the system, the overall network bandwidth scales linearly. This linear scalability means you can increase system size without a performance penalty, and sometimes even increase performance.
• High performance: An MPP system typically has two BYNET networks (BYNET 0 and BYNET 1). Because both networks in a system are active, the system benefits from having full use of the aggregate bandwidth of both the networks.
• Fault tolerant: Each network has multiple connection paths. If the BYNET detects an unusable path in either network, it will automatically reconfigure that network so all messages avoid the unusable path. Additionally, in the rare case that BYNET 0 cannot be reconfigured, hardware on BYNET 0 is disabled and messages are re-routed to BYNET 1.
• Load balanced: Traffic is automatically and dynamically distributed between both BYNETs.

2) BYNET Hardware and Software

The BYNET hardware and software handle the communication between the vprocs and the nodes.

• Hardware: The nodes of an MPP system are connected with the BYNET hardware, consisting of BYNET boards and cables.
• Software: The BYNET driver (software) is installed on every node. This BYNET driver is an interface between the PDE software and the BYNET hardware.

SMP systems do not contain BYNET hardware. The PDE and BYNET software emulate BYNET activity in a single-node environment.


3) Communication Between Nodes

The BYNET hardware can carry the following types of messages between nodes:

• Broadcast message to all nodes
• Point-to-point message from one node to another node

4) Communication Between Vprocs

On an MPP system, BYNET hardware is used to first send the communication across nodes (using either the point-to-point or broadcast messaging described previously).

On an SMP system, this first step is unnecessary since there is only one node.

Once a node receives a communication, vproc communication within the node is done by the PDE and BYNET software using the following types of messaging:

• Point-to-point
• Multicast
• Broadcast

Point-to-Point Messages

With point-to-point messaging between vprocs, a vproc can send a message to another vproc on:

• The same node (using PDE and BYNET software)
• A different node using two steps:
1. Send a point-to-point message from the sending node to the node containing the recipient vproc. This is a communication between nodes using the BYNET hardware.
2. Within the recipient node, the message is sent to the recipient vproc. This is a point-to-point communication between vprocs using the PDE and BYNET software.
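The point-to-point cases above can be sketched as a small routing function. It models only which layer carries each hop, not how the BYNET actually delivers messages; the vproc names, node names, and the node_of mapping are hypothetical.

```python
def route_point_to_point(sender, recipient, node_of):
    """Return the hops a vproc-to-vproc message takes.

    node_of maps a vproc name to the node hosting it (hypothetical layout).
    """
    src, dst = node_of[sender], node_of[recipient]
    hops = []
    if src != dst:
        # Step 1: node-to-node, carried by the BYNET hardware.
        hops.append(("hardware", src, dst))
    # Final step: delivery to the vproc inside the recipient node,
    # handled by the PDE and BYNET software.
    hops.append(("software", dst, recipient))
    return hops

# Hypothetical two-node layout.
node_of = {"PE-0": "node1", "AMP-0": "node1", "AMP-5": "node2"}
```

On a single-node (SMP) system every vproc maps to the same node, so the hardware hop never appears.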

Cliques

A clique (pronounced, "kleek") is a group of nodes that share access to the same disk arrays. Each multi-node system has at least one clique. The cabling determines which nodes are in which cliques: the nodes of a clique are connected to the disk array controllers of the same disk arrays.

Cliques Provide Resiliency

In the rare event of a node failure, cliques provide for data access through vproc migration. When a node resets, the following happens to the AMPs:

1. When the node fails, the Teradata Database restarts across all remaining nodes in the system.
2. The vprocs (AMPs) from the failed node migrate to the operational nodes in its clique.
3. Disks managed by the AMP remain available and processing continues while the failed node is being repaired.

Cliques in a System

Vprocs are distributed across all nodes in the system. Multiple cliques in the system should have the same number of nodes.

The diagram below shows three cliques. The nodes in each clique are cabled to the same disk arrays. The overall system is connected by the BYNET. If one node goes down in a clique the vprocs will migrate to the other nodes in the clique, so data remains available. However, system performance decreases due to the loss of a node. System performance degradation is proportional to clique size.
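Vproc migration within a clique can be sketched as a reassignment of the failed node's AMPs to the surviving nodes. The round-robin placement used here is an assumption of the sketch, not a description of how the Teradata Database chooses a target node, and the node and AMP names are hypothetical.

```python
def migrate_vprocs(clique, failed_node):
    """Reassign the failed node's AMPs round-robin to surviving clique nodes.

    clique maps node name -> list of AMP names (hypothetical layout).
    Round-robin placement is an assumption of this sketch.
    """
    survivors = sorted(n for n in clique if n != failed_node)
    if not survivors:
        raise RuntimeError("no surviving node in the clique")
    result = {n: list(clique[n]) for n in survivors}
    for i, amp in enumerate(clique[failed_node]):
        result[survivors[i % len(survivors)]].append(amp)
    return result

clique = {
    "node1": ["AMP-0", "AMP-1"],
    "node2": ["AMP-2", "AMP-3"],
    "node3": ["AMP-4", "AMP-5"],
}
after = migrate_vprocs(clique, "node2")
```

In this three-node layout each surviving node goes from two AMPs to three. Every AMP is still running, so its disks stay reachable, but each remaining node carries more work.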

 


Trusted Parallel Application (TPA)

A Trusted Parallel Application (TPA) uses PDE to implement virtual processors (vprocs). The Teradata Database is classified as a TPA. The four components of the Teradata Database TPA are:

• AMP (Top Right)
• PE (Bottom Right)
• Channel Driver (Top Left)
• Teradata Gateway (Bottom Left)


Teradata Database Software: PE

A Parsing Engine (PE) is a vproc that manages the dialogue between a client application and the Teradata Database, once a valid session has been established. Each PE can support a maximum of 120 sessions. The PE handles an incoming request in the following manner:

1. The Session Control component verifies the request for session authorization (user names and passwords), and either allows or disallows the request.
2. The Parser does the following:
o Interprets the SQL statement received from the application.
o Verifies SQL requests for the proper syntax and evaluates them semantically.
o Consults the Data Dictionary to ensure that all objects exist and that the user has authority to access them.
3. The Optimizer develops the least expensive plan (in terms of time) to return the requested response set. Processing alternatives are evaluated and the fastest alternative is chosen. This alternative is converted into executable steps, to be performed by the AMPs, which are then passed to the Dispatcher.

The Optimizer is "parallel aware," meaning that it has knowledge of the system components (how many nodes, vprocs, etc.), which enables it to determine the fastest way to process the query. In order to maximize throughput and minimize resource contention, the Optimizer must know about system configuration, available units of parallelism (AMPs and PEs), and data demographics. The Teradata Database Optimizer is robust and intelligent, and enables the Teradata Database to handle multiple complex, ad-hoc queries efficiently.

4. The Dispatcher controls the sequence in which the steps are executed and passes the steps received from the Parser on to the BYNET for execution by the AMPs.
5. After the AMPs process the steps, the PE receives their responses over the BYNET.
6. The Dispatcher builds a response message and sends the message back to the user.
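The hand-off between Session Control, Parser, Optimizer, and Dispatcher can be sketched as a single function with one stage per component. Everything concrete here is an assumption made for illustration: the request dictionary, the very loose syntax check, and the two toy plans with fixed costs bear no relation to how the real Parser or Optimizer works.

```python
def process_request(request, users, catalog):
    """Walk a request through Session Control, Parser, Optimizer, Dispatcher.

    users maps user name -> password; catalog is a set of known table names.
    The request shape and the toy plans are assumptions of this sketch.
    """
    # Session Control: allow or disallow the request.
    if users.get(request["user"]) != request["password"]:
        return {"status": "rejected", "reason": "session not authorized"}

    # Parser: check syntax (very loosely) and confirm the object exists.
    words = request["sql"].rstrip(";").split()
    if len(words) < 4 or words[0].upper() != "SELECT" or words[-2].upper() != "FROM":
        return {"status": "rejected", "reason": "syntax error"}
    table = words[-1]
    if table not in catalog:
        return {"status": "rejected", "reason": f"unknown object {table}"}

    # Optimizer: pick the cheapest of the candidate plans (toy costs).
    plans = [("full table scan", 10), ("index lookup", 3)]
    best_plan, _cost = min(plans, key=lambda p: p[1])

    # Dispatcher: hand the executable steps on, in order.
    steps = [f"lock {table}", f"{best_plan} on {table}", "return rows"]
    return {"status": "dispatched", "steps": steps}
```

A request that fails an early stage never reaches the later ones, which is why authorization and dictionary checks come before any plan is costed.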

Teradata Database Software: AMP

The AMP is a vproc that controls its portion of the data on the system. AMPs do the physical work associated with generating an answer set (output), including sorting, aggregating, formatting, and converting. The AMPs perform all database management functions on the required rows in the system. The AMPs work in parallel, each AMP managing the data rows stored on its single vdisk. AMPs are involved in data distribution and data access in different ways. AMPs perform all tasks in parallel, providing exceptional performance.

Data Distribution

When data is loaded, inserted, and updated, the AMP:

• Receives incoming data from the PE.
• Formats rows and distributes them on its vdisk.
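The idea that each AMP owns a disjoint share of the rows can be sketched as below. This section does not say how a row is assigned to an AMP, so the placement rule here (a CRC of the row key, modulo the AMP count) is purely a stand-in chosen for the sketch; the row layout and function names are likewise hypothetical.

```python
import zlib

def amp_for(key, amp_count):
    """Pick an AMP for a row key (illustrative placement rule only)."""
    return zlib.crc32(str(key).encode("utf-8")) % amp_count

def distribute(rows, amp_count):
    """Group rows by the AMP that would own them; each AMP sees only its own."""
    vdisks = {amp: [] for amp in range(amp_count)}
    for row in rows:
        vdisks[amp_for(row["id"], amp_count)].append(row)
    return vdisks

# Hypothetical customer rows spread over four AMPs.
rows = [{"id": i, "name": f"cust{i}"} for i in range(1000)]
vdisks = distribute(rows, amp_count=4)
```

Every row lands on exactly one vdisk, so a request that touches all rows can be worked on by all AMPs at once, each scanning only its own share.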

Data Access

When data is accessed, the AMP retrieves the rows requested by the PE in the following manner:

1. The database management subsystem receives the steps from the Dispatcher over the BYNET.
2. The database management subsystem processes the steps. The subsystem on the AMP can:
o Lock databases and tables
o Create, modify, or delete definitions of tables
o Join tables
o Insert, delete, or modify rows within tables
o Sort, aggregate, or format data
o Retrieve information from definitions and rows from tables
3. The database management subsystem returns responses over the BYNET to the Dispatcher.

3. CLIENT ACCESS

Client Connections

Users can access data in the Teradata Database through an application on both channel-attached and network-attached clients. Additionally, the node itself can act as a client. Teradata client software is installed on each client (channel-attached, network-attached, or node) and communicates with RDBMS software on the node. You may occasionally hear either type of client referred to by the legacy term of "host," though this term is not typically used in documentation or product literature.

Channel-Attached Client

Channel-attached clients are IBM-compatible mainframe systems supported by the Teradata Database. The following software components installed on the mainframe are responsible for communications between client applications and the Channel Driver on a Teradata Database node:

• Teradata Director Program (TDP) software to manage session traffic, installed on the channel-attached client.
• Call-Level Interface (CLI), a library of routines that are the lowest-level interface to the Teradata Database.

Communication with the Teradata Database System

Communication from client applications on the mainframe goes through the mainframe channel, to the Host Channel Adapter on the node, to the Channel Driver software.

Network-Attached Client

The Teradata Database supports network-attached clients connected to the node over a LAN. The following software components installed on the network-attached client are responsible for communication between client applications and the Teradata Gateway on a Teradata Database node:

• ODBC
• CLIv2

Communication with the Teradata Database System

Communication from applications on the network-attached client goes over the LAN, to the Ethernet card on the node, to the Teradata Gateway software.

On the database side, the Teradata Gateway software and the PE provide the connection to the Teradata Database. The Teradata Database is configured with 2 LAN connections for redundancy. This ensures high availability.

Node

The node is considered a network-attached client. If you install application software on a node, it will be treated like an application on a network-attached client. In other words, communications from applications on the node go through the Teradata Gateway. An application on a node can be executed through:

• System Console that manages an SMP system.
• Remote login, such as over a network-attached client connection.

Request Processing

A request like the one above is processed a little differently, depending on whether the user is accessing the Teradata Database through a channel-attached or network-attached client:

1. SQL request is sent from the client to the appropriate component on the node:
o Channel-attached client: request is sent to Channel Driver (through the TDP).
o Network-attached client: request is sent to Teradata Gateway (through CLIv2 or ODBC).
2. Request is passed to the PE(s).
3. PEs parse the request into AMP steps.
4. PE Dispatcher sends steps to the AMPs over the BYNET.
5. AMPs perform operations on data on the vdisks.
6. Response is sent back to PEs over the BYNET.
7. PE Dispatcher receives response.
8. Response is returned to the client (channel-attached or network-attached).
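The only part of the request path that differs by client type is the entry point on the node, which a short function can make explicit. This is a labeling exercise rather than working client code; the client_type strings and the function name are hypothetical.

```python
def request_path(client_type):
    """List the components a request crosses on its way in, per client type."""
    if client_type == "channel":
        entry = ["TDP", "Channel Driver"]
    elif client_type == "network":
        entry = ["CLIv2 or ODBC", "Teradata Gateway"]
    elif client_type == "node":
        # Applications on a node are treated like network-attached clients.
        entry = ["Teradata Gateway"]
    else:
        raise ValueError(f"unknown client type: {client_type}")
    # From the PE onward the path is the same for every client type.
    return entry + ["PE", "BYNET", "AMP", "vdisk"]
```

The response retraces the same components in reverse order.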

Teradata Client Utilities

Teradata has a robust suite of client utilities that enable users and system administrators to enjoy optimal response time and system manageability. Various client utilities are available for tasks from loading data to managing the system.

Teradata utilities leverage the Teradata Database's high performance capabilities and are fully parallel and scalable. The same utilities run on smaller entry-level systems, as well as the largest MPP implementations.

Teradata Database client utilities include the following, described in this section:

• Query Submitting Utilities
o BTEQ
o Teradata SQL Assistant
• Load and Unload Utilities
o FastLoad
o MultiLoad
o TPump
o FastExport
o Teradata Warehouse Builder
• Administrative Utilities
o Teradata Manager
o Teradata Dynamic Query Manager (TDQM)
o Teradata Analyst Pack
• Archive Utilities
o ARC
o NetVault
o NetBackup

Query Submitting Utilities

The Teradata Database provides a number of tools that are front-end interfaces for submitting SQL queries. Two mentioned in this section are BTEQ and Teradata SQL Assistant.

BTEQ

BTEQ (Basic Teradata Query), often pronounced "BEE-teek," is a Teradata Database tool used for submitting SQL queries on all platforms. BTEQ provides the following functionality:

• Standard report writing and formatting
• Basic import and export of small amounts of data to and from the Teradata Database across all platforms. For tables of more than a few thousand rows, the Teradata Database load utilities are recommended for more efficiency.
• Ability to submit SQL requests in the following ways:
o Interactive
o Batch


Teradata SQL Assistant

Teradata SQL Assistant (formerly known as Queryman) is an information discovery/query tool that runs on Microsoft Windows. Teradata SQL Assistant enables you to access the Teradata Database as well as other ODBC-based databases. Some of its features include:

• Ability to save data in PC-based formats, such as Microsoft Excel, Microsoft Access, and text files.
• History of submitted SQL syntax, to help you build scripts for data mining and knowledge discovery.
• Help with SQL syntax.
• Import and export of small amounts of data to and from ODBC-compliant databases. For tables of more than a few thousand rows, the Teradata Database load utilities are recommended for more efficiency.


D#t# L'#% #% U'#% Utiitie!

  n a data warehouse environment, the database tables are populated from avariety of sources, such as mainframe applications, operational data marts, orother distributed systems throu#hout a company. These systems are the sourceof data such as daily transaction files, orders, usa#e records, $R" (enterpriseresource plannin#' information, and nternet statistics. Teradata Division has asuite of data load and unload utilities optimi9ed for use with the TeradataDatabase. They run on any of the supported client platforms/

• Channel&attached client

•  3etwor&attached client

•  3ode

Using Teradata Load and Unload Utilities

Teradata load and unload utilities are fully parallel. Because the utilities are scalable, they accommodate the size of the system. Performance is not limited by the capacity of the load and unload tools.

The utilities have full restart capability. If a load or unload job is interrupted for some reason, it can be restarted from the last checkpoint, without having to start the job from the beginning.

The load and unload utilities are:

• FastLoad

• MultiLoad

• TPump

• FastExport

• Teradata Warehouse Builder

By default, you can run up to 15 instances of FastLoad, MultiLoad, and FastExport in any combination. There is no limit to the number of concurrent TPump jobs.

FastLoad

Use the FastLoad utility to load data into empty tables.

FastLoad can only work on one table at a time. FastLoad loads data into an empty table in parallel, using multiple sessions to transfer blocks of data. FastLoad achieves high performance by fully exploiting the resources of the system. After the data load is complete, the table can be made available to users.

MultiLoad

Use the MultiLoad utility to maintain tables by:

• Inserting rows into a populated or empty table

• Updating rows in a table

• Deleting multiple rows from a table

MultiLoad can load multiple input files concurrently and work on up to five tables at a time, using multiple sessions. MultiLoad is optimized to apply multiple rows in block-level operations. MultiLoad usually is run during a batch window, and places a lock on the destination table(s) to prevent user queries from getting inconsistent results before the data load or update is complete.

TPump

Use TPump to:

• Constantly load data into a table

• Continuously load, update, or delete data in tables

• Update lower volumes of data using fewer system resources than other load utilities

• Vary the resource consumption and speed of the data loading activity over time

The TPump utility complements MultiLoad as a data loading utility. A major difference is that TPump uses row hash locks, which eliminates the need for table locks and the "batch windows" typical with MultiLoad. Users can continue to run queries during TPump data loads. In addition, TPump is designed for smaller volumes of data than MultiLoad, and maintains up to 60 tables at a time.

TPump has a dynamic throttle that operators can set to specify the percentage of system resources to be used for an operation. This enables operators to set when TPump should run at full capacity during low system usage, or within limits when TPump may affect other business users of the Teradata Database.

FastExport

Use the FastExport utility to export data from multiple tables or views on the Teradata Database to a client-based application.

You can export data from any table or view to which you have the SELECT access privilege. The destination for the exported data can be a:

• Host file: A file on your channel-attached or network-attached client system

• User-written application: An Output Modification (OUTMOD) routine you write to select, validate, and preprocess the exported data.

FastExport is a data extract utility. It transfers large amounts of data using block transfers over multiple sessions and writes the data to a host file on the network-attached or channel-attached client. Typically, FastExport is run during a batch window, and the tables being exported are locked.

Teradata Warehouse Builder

Teradata Warehouse Builder (TWB) is a data warehouse loading tool that enables data extraction, transformation and loading processes common to all data warehouses.

Using built-in operators, Teradata Warehouse Builder combines the functionality of the Teradata utilities (FastLoad, MultiLoad, FastExport, and TPump) in a single parallel environment. Its extensible environment supports FastLoad INMODs, FastExport OUTMODs, and Access Modules to provide access to all the data sources you use today. There is a set of open APIs (Application Programming Interfaces) to add third-party or custom data transformation to Teradata Warehouse Builder scripts. Using multiple, parallel tasks, a single Teradata Warehouse Builder script can load data from disparate sources into the Teradata Database in the same job.

Teradata Warehouse Builder is scalable and enables end-to-end parallelism. The previous versions of utilities (like FastLoad) allow you to load data into the Teradata Database in parallel, but with a single input stream. Teradata Warehouse Builder allows you to run multiple instances of the extract, optional transformation, and load operators. You can have as many loads as you have sources in the same job. With multiple sources of data coming from multiple platforms, integration is important in a parallel environment.

Teradata Warehouse Builder eliminates the need for persistent storage. It stores data in data buffers so you no longer need to write data into a flat file. Since you don't need flat files, there is no longer a 2GB file limit.

Teradata Warehouse Builder provides a single, SQL-like scripting language, as well as a GUI to make scripting faster and easier. You can do the extract, some transformation, and loads all in one SQL-like scripting language. Once the dynamics of the language are learned, you can perform multiple tasks with a single script. You can use script converters to convert scripts on existing systems for utilities (FastLoad, MultiLoad, FastExport, and TPump) to Teradata Warehouse Builder scripts.

A single Teradata Warehouse Builder job can load data from multiple disparate sources into the Teradata Database, as indicated by the green arrow.

Teradata Warehouse Builder Operators

The operators are components that "plug" into the TWB infrastructure and actually perform the functions.

• The FastLoad INMOD and FastExport OUTMOD operators support the current FastLoad and FastExport INMOD/OUTMOD features.

• The Data Connector operator is an adapter for the Access Module or non-Teradata files.

• The SQL Select and Insert operators submit the Teradata SELECT and INSERT commands.

• The Load, Update, Export and Stream operators are similar to the current FastLoad, MultiLoad, FastExport and TPump utilities, but built for the TWB parallel environment.

The INMOD and OUTMOD adapters, Data Connector operator, and the SQL Select/Insert operators are included when you purchase the Infrastructure. The Load, Update, Export and Stream operators are purchased separately.

To simplify these new concepts, let's compare the Teradata Warehouse Builder Operators with the classic utilities that we just covered.

TWB Operator    Teradata Utility   Description

LOAD            FastLoad           A consumer-type operator that uses the Teradata FastLoad protocol. Supports error limits and checkpoint/restart. Both support Multi-Value Compression and PPI.

UPDATE          MultiLoad          Utilizes the Teradata MultiLoad protocol to enable job-based table updates. This allows highly scalable and parallel inserts and updates to a pre-existing table.

EXPORT          FastExport         A producer operator that emulates the FastExport utility.

STREAM          TPump              Uses multiple sessions to perform DML transactions in near real-time.

DataConnector   N/A                This operator emulates the Data Connector API. Reads external data files, writes data to external data files, reads an unspecified number of data files.

ODBC            N/A                Reads data from an ODBC Provider.

Teradata Dynamic Query Manager (TDQM)

Teradata Dynamic Query Manager (TDQM), formerly known as Database Query Manager (DBQM), is a query workload management tool that dynamically tunes the Teradata Database. TDQM can run, suspend, reschedule, or reject a query based on current workload and set thresholds.

For example, with TDQM a request can be scheduled to run periodically or during a specified time period without an active system connection. Results can be retrieved any time after the request has been submitted by TDQM and executed.

TDQM can restrict queries based on factors such as:

• Analysis control thresholds

• Object control thresholds

• Environmental factors

Teradata Analyst Pack

Teradata Analyst Pack is a suite of the following products.

Teradata Visual Explain

Teradata Visual Explain makes query plan analysis easier by providing the ability to capture and graphically represent the steps of the plan and perform comparisons of two or more plans. It is intended for application developers, database administrators and database support personnel to better understand why the Teradata Database Optimizer chooses a particular plan for a given SQL query. All information required for query plan analysis, such as database object definitions, data demographics, and cost and cardinality estimates, is available through the Teradata Visual Explain interface. The tool is very helpful in identifying the performance implications of data skew and bad or missing statistics. Visual Explain uses a Query Capture Database to store query plans, which can then be visualized or manipulated with other Teradata Analyst Pack tools.

Teradata System Emulation Tool (Teradata SET)

Teradata SET simplifies the task of emulating a target system by providing the ability to export and import all information necessary to fake out the optimizer in a test environment. This information can be used along with Teradata's Target Level Emulation feature to generate query plans on the test system as if they were run on the target system. This feature is useful for verifying queries and reproducing optimizer-related issues in a test environment.

Teradata SET allows the user to capture the following by database, query, or workload:

• System cost parameters

• Object definitions

• Random AMP samples

• Statistics

• Query execution plans

• Demographics

This tool does not export user data.

Teradata Index Wizard

Teradata Index Wizard automates the process of manual index design by recommending secondary indexes for a particular workload. Teradata Index Wizard provides a simple, easy-to-use graphical user interface (GUI) that guides the user through analyzing a database workload and provides recommendations for improving performance through the use of indexes.

Teradata Statistics Wizard

Teradata Statistics Wizard is a graphical tool that has been designed to automate the collection and re-collection of statistics, resulting in better query plans and helping the DBA to efficiently manage statistics.

The Statistics Wizard enables the DBA to:

• Specify a workload to be analyzed for recommendations specific to improving the performance of the queries in a workload.

• Select an arbitrary database or selection of tables, indexes, or columns for analysis, collection, or re-collection of statistics.

As changes are made within a database, the Statistics Wizard identifies those changes and recommends which tables should have statistics collected, based on age of data and table growth, and what columns/indexes would benefit from having statistics defined and collected for a specific workload. The DBA is then given the opportunity to accept or reject the recommendations.

Archival Utilities

Teradata has utilities specifically designed for data archive and recovery purposes. There are different utilities for channel-attached clients and network-attached clients.

Archiving on Channel-Attached Clients

In a channel-attached (mainframe) client environment, the Archive Recovery (ARC) utility is used to back up data. It supports commands written in Job Control Language (JCL). The ARC utility archives and restores database objects, allowing recovery of data that may have been damaged or lost.

There are several scenarios where restoring objects from external media may be necessary:

• Restoring non-Fallback tables after a disk failure.

• Restoring tables that have been corrupted by batch processes that may have left the data in an uncertain state.

• Restoring tables, views, or macros that have been accidentally dropped by the user.

• Miscellaneous user errors resulting in damaged or lost database objects.

With the ARC utility you can copy a table and restore it to another Teradata Database. It is scalable and parallel, and can run on a channel-attached client (or network-attached client) or a node.

Archiving on Network-Attached Clients

In a network-attached client environment, the Archive Recovery (ARC) utility is used to back up data, along with either of the following tape storage subsystems:

• NetVault (from BakBone Software Inc.)

• NetBackup (from VERITAS Software Corporation)

NetVault and NetBackup have modules created for Teradata Database systems for use in a scalable, parallel, enterprise environment. They run on network-attached clients or a node (Microsoft Windows or UNIX MP-RAS). Data is backed up into the NetVault or NetBackup tape storage subsystems using the ARC utility.

4. TERADATA SQL

What Is SQL?

The Teradata Database is accessed using SQL (Structured Query Language). SQL is the industry standard access language for communicating with a relational database. SQL is a set-oriented language for relational database management. A user or application can use SQL statements to perform operations on the data and define how an answer set should be returned from an RDBMS.

The Teradata Database supports two types of SQL:

• ANSI SQL: Teradata SQL is compliant with ANSI standards (an industry standard).

• Teradata SQL Extensions: NCR has added Teradata SQL extensions above and beyond standard SQL capabilities, including one-step SQL statements for complex administrative operations.

Teradata SQL Benefits

Teradata SQL is the set of SQL commands used with the Teradata Database. Some benefits of Teradata SQL are:

• Parallel Execution: The Optimizer breaks up an SQL statement into tasks that can be executed in parallel to minimize resource contention. The design of the Teradata Database, along with its automatic data distribution, balances the workload and reduces bottlenecks.

• ANSI Compliant: Teradata SQL is compliant with ANSI standards. If you have programs already written with ANSI-compliant SQL for a different relational database, you can run them with the Teradata Database, as well.

• High-Performance Extensions: NCR has added Teradata SQL extensions that are above and beyond the standard SQL capabilities, including one-step SQL statements for complex administrative operations.

Types of SQL Statements

SQL statements commonly are categorized as follows:

• Data Definition Language (DDL)

• Data Manipulation Language (DML)

• Data Control Language (DCL)

Data Definition Language (DDL)

Data Definition Language (DDL) is used to define and create Users, Databases, and the objects they contain (tables, views, macros, triggers, and stored procedures).

Examples:

CREATE - Define a new Database, User, database object, or index.

DROP - Remove an existing Database, User, database object, index, or statistics.

ALTER - Change table structure and protection definition, or enable and disable triggers.
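As a rough illustration of the three DDL statements listed above, the sketch below creates, alters, and then drops a small table. The table name employee_demo, its columns, and the choice of primary index are assumptions made up for this example; they are not objects from the course database.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE employee_demo
  ( employee_number  INTEGER NOT NULL
  , last_name        CHAR(20)
  , hire_date        DATE
  )
UNIQUE PRIMARY INDEX (employee_number);

-- ALTER changes the structure of an existing table.
ALTER TABLE employee_demo ADD first_name VARCHAR(30);

-- DROP removes the table definition together with its data.
DROP TABLE employee_demo;
```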

 

Data Manipulation Language (DML)

Data Manipulation Language (DML) is used to work with data, including tasks such as inserting data into a table, updating an existing record, or performing queries.

Examples:

SELECT - Perform relational query functions (Select, Join, Union, Intersect, Minus).

INSERT - Place a new row into a table.

UPDATE - Modify values in an existing row.

DELETE - Remove a row from a table.
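The DML statements above might be used as in the following sketch. It assumes an employee table with the columns named here and assumes that any columns not listed accept nulls or defaults; the employee number 1099 and the name values are invented for illustration.

```sql
-- Place a new row into the table (values are hypothetical).
INSERT INTO employee (employee_number, last_name, first_name, department_number)
VALUES (1099, 'Example', 'Pat', 401);

-- Modify a value in the existing row.
UPDATE employee
SET department_number = 403
WHERE employee_number = 1099;

-- Remove the row again.
DELETE FROM employee
WHERE employee_number = 1099;
```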

Data Control Language (DCL)

Data Control Language (DCL) is used for administrative tasks such as granting and revoking privileges on database objects or controlling ownership of those objects.

Examples:

GRANT - Give user privileges.

REVOKE - Remove user privileges.

GIVE - Transfer database ownership.
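A minimal sketch of the DCL statements above follows. The database hr_db and the users report_user and admin_user are hypothetical names chosen for this example.

```sql
-- Give report_user read access to one table.
GRANT SELECT ON hr_db.employee TO report_user;

-- Take that privilege away again.
REVOKE SELECT ON hr_db.employee FROM report_user;

-- Transfer ownership of the hr_db database to admin_user.
GIVE hr_db TO admin_user;
```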

 

The SELECT Statement

The SELECT statement is the most commonly used SQL statement. It is a DML statement that allows you to retrieve data from one or more tables. In its most common form, you specify certain rows to be returned as shown.

SELECT *
FROM employee
WHERE department_number = 401;

The asterisk, "*", is a "wild card" character. In this example, it specifies that when the result is displayed, we want to see all the columns of the rows where the department number is 401. The FROM clause specifies from which table in our database to retrieve the rows. The WHERE clause acts as a filter that passes only rows meeting the specified condition, in this case rows of employees in department 401.

NOTE: SQL does not require a trailing semicolon to end a statement, but the Basic Teradata Query (BTEQ) utility that can be used to enter SQL statements does. The semicolon is used in the examples, as if it were entered in BTEQ.

If you do not specify a WHERE clause, the query would return all columns and all rows from the employee table, for example:

SELECT * FROM employee;

EMPLOYEE  MANAGER          DEPARTMENT  JOB     LAST      FIRST    HIRE    BIRTH   SALARY
NUMBER    EMPLOYEE NUMBER  NUMBER      CODE    NAME      NAME     DATE    DATE    AMOUNT
1006      1019             301         312101  Stein     John     761015  531015  2945000
1008      1019             301         312102  Kanieski  Carol    770201  580517  2925000
1005      0801             403         431100  Ryan      Loretta  761015  550910  3120000
1004      1003             401         412101  Johnson   Darlene  761015  460423  3630000
1007      1005             403         432101  Villegas  Arnando  770102  370131  4970000
1003      0801             401         411100  Trader    James    760731  470619  3785000

Returning a Subset of Columns

Instead of using the asterisk symbol to specify all columns, we could name specific columns separated by commas:

SELECT employee_number, hire_date, last_name, first_name
FROM employee
WHERE department_number = 401;

Unsorted Results

Results include the columns named in the SQL statement. The results are unsorted unless you specify that you want them sorted in a certain way. How to retrieve ordered results is covered in the following section.

employee_number  hire_date  last_name  first_name
1004             76-10-15   Johnson    Darlene
1003             76-07-31   Trader     James
1013             77-04-01   Phillips   Charles
1010             77-03-01   Rogers     Frank
1022             79-03-01   Machado    Albert
1001             76-06-18   Hoover     William
1002             76-07-31   Brown      Alan

The ORDER BY Clause

To have your results displayed in a sorted order, use the ORDER BY clause, for example:

ORDER BY hire_date

Sort Order

Using this example, results are returned in ascending order. If a sort order is not specified, results are returned in ascending order by default. To specify ascending or descending order, add ASC or DESC to the end of your ORDER BY clause. The following is an example of specifying the results in ascending order.

SELECT employee_number, last_name, first_name, hire_date
FROM employee
WHERE department_number = 401
ORDER BY hire_date ASC;

Output

employee_number  last_name  first_name  hire_date
1001             Hoover     William     76-06-18
1003             Trader     James       76-07-31
1002             Brown      Alan        76-07-31
1004             Johnson    Darlene     76-10-15
1010             Rogers     Frank       77-03-01
1013             Phillips   Charles     77-04-01
1022             Machado    Albert      79-03-01

Naming

Specify the column to sort on by either naming it directly (for example, hire_date) or by naming its position within the SELECT statement. Since hire_date is the fourth column in the SELECT clause, the following ORDER BY clause is equivalent to the one in the example above:

ORDER BY 4 ASC
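As a further sketch of the sort options described above, the query below returns the most recent hires first and uses last_name as a second sort key for employees hired on the same date. It reuses the employee columns from the earlier examples; combining two sort keys is an addition made here for illustration.

```sql
SELECT employee_number, last_name, first_name, hire_date
FROM employee
WHERE department_number = 401
ORDER BY hire_date DESC   -- newest hire dates first
       , last_name ASC;   -- ties on hire_date are ordered by name
```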

User Assistance Statements and Modifiers

SQL user assistance statements (and modifiers) vary widely from database vendor to database vendor. The Teradata Database's user assistance statements are commonly called Teradata extensions. These Teradata extensions are additions to the DDL, DML, and DCL statements in standard SQL, and make some operations less time consuming.

This page discusses the following Teradata SQL user assistance commands:

• HELP

• HELP SESSION

• SHOW

This page also discusses the statement modifier:

• EXPLAIN

The HELP Statement

The HELP statement is used to display information about database objects. You can get help on the following:

HELP DATABASE
HELP USER
HELP TABLE
HELP VIEW
HELP MACRO
HELP TRIGGER
HELP PROCEDURE
HELP COLUMN
HELP INDEX
HELP STATISTICS
. . . and much more!

Example:

HELP DATABASE databasename;

Displays all the objects in the specified database.
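A few more HELP requests are sketched below, moving from a database down to a single column. The database name hr_db is hypothetical, and the example assumes the employee table from the SELECT examples is in your default database.

```sql
-- List the objects in a database.
HELP DATABASE hr_db;

-- List the columns of a table and their attributes.
HELP TABLE employee;

-- Show the attributes of a single column.
HELP COLUMN employee.hire_date;
```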

The HELP SESSION Statement

Use the HELP SESSION statement to see specific information about your SQL session.

Example:

HELP SESSION;

Displays the user name with which you logged in, the log-on date and time, your default database, and other information related to your current session.

 

The SHOW Statement

Use the SHOW statement to display the data definition language (DDL) associated with database objects (tables, views, macros, triggers, or stored procedures). You can show the DDL for the following:

SHOW TABLE
SHOW VIEW
SHOW MACRO
SHOW TRIGGER
SHOW PROCEDURE
SHOW JOIN INDEX

Example:

SHOW TABLE tablename;

Displays the CREATE TABLE statement that was used to create the specified table.

The EXPLAIN Modifier

The EXPLAIN modifier allows you to preview how the Teradata Database will execute an SQL request. It is a good way to see what database resources will be used in processing the request. Use the EXPLAIN modifier preceding any SQL statement to see a plan with:

• English text describing a plan for how the statement will be processed.

• An estimate of the number of rows involved.

• A relative cost of the request.

The relative cost is shown in units of time, and should not be used to predict actual response time for an SQL request. This time estimate can be used to compare the duration of request processing relative to other plans.

When you execute a request preceded by the EXPLAIN modifier, the request is not executed. Instead, the system:

• Fully parses the request.

• Optimizes the request.

• Reports the complete plan for executing the request in readable English.

Example:

EXPLAIN SELECT * FROM tablename;

Displays the steps involved in processing the request, SELECT * FROM the specified table.

 

5. DATA STRUCTURE

A Teradata Database

In Teradata Database systems, the words "database" and "user" have specific definitions.

Database: The Teradata Definition

In Teradata, a "database" provides a logical grouping of information. A Teradata Database also plays a key role in space allocation and access control. A Teradata Database is a defined, logical repository that can contain objects, including:

• Databases: A defined object that may contain a collection of Teradata Database objects.

• Users: Databases that each have a logon ID and password for logging on to the Teradata Database.

• Tables: Two-dimensional structures of columns and rows of data stored on the disk drives. (Require Perm Space)

• Views: A virtual "window" to subsets of one or more tables or other views, pre-defined using a single SELECT statement. (Use no Perm Space)

• Macros: Definitions of one or more Teradata SQL and report formatting commands. (Use no Perm Space)

• Triggers: One or more Teradata SQL statements associated with a table and executed when specified conditions are met. (Use no Perm Space)

• Stored Procedures: Combinations of procedural and non-procedural statements run using a single CALL statement. (Require Perm Space)

Note: A Database with no Perm Space can have views, macros, and triggers, but no tables or stored procedures.

These Teradata Database objects are created, maintained, and deleted using SQL, and are described in further detail in this section.

User: A Special Kind of Database

A User can be thought of as a collection of tables, views, macros, triggers, and stored procedures. A User is a specific type of Database, and has attributes in addition to the ones listed above:

• Logon ID

• Password

So, a User is the same as a Database except that a User can actually log on to the RDBMS. To log on to a Teradata Database, you need to specify a User to log on to (which is simply a Database with a password). You cannot log on to a Database because it has no password.

Note: In this course, we will use uppercase "U" for User and uppercase "D" for Database when referring to these specific Teradata Database objects.

 

Tables

A table in a relational database management system is a two-dimensional structure made up of columns and physical rows stored in data blocks on the disk drives.

Each column represents an attribute of the table. Attributes identify, describe, or qualify the table. Each column is named and all the information contained within it is of the same type, for example, Department Number.

Each row represents an instance of the table. A row could represent a particular person, thing, or event.

Views

A view is like a "window" into tables that allows multiple users to look at portions of the same base data. A view may access one or more tables, and may show only a subset of columns from the table(s).

A view does not exist as a real table and does not occupy disk space. It serves as a reference to existing tables or views. A view is a logical structure with no actual data; it accesses data that is stored in a table and returns the requested rows from the table to the user.

User privileges determine which views a user can see, and what the user can do with each view. If you have the privileges to change data in a table, then you can change it through its associated view. The system makes the changes to the underlying table, and the changes are reflected in the view.

Views are useful in an enterprise-wide data warehouse environment. Higher levels of management in an organization may want to see the "big picture" contained in the single, large storehouse of information, but various departments want or need to see only the portion they are concerned with. Using views, all levels of the organization are still accessing the same underlying data, for consistent results.

Views are often used for the following purposes:

• A view can be defined for a user (or group of users) to have read-only access, insulating the original table from inadvertent or unwelcome changes.

• A view can filter out extraneous columns for a user (or group of users). The view would contain a subset of table columns, or a combination of columns from different tables, that are appropriate for a specific task.

• Views can simplify or standardize data access techniques for different users across the company.
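The column-filtering use of views described above could look like the following sketch. The view name emp_dept_401_v is hypothetical; the view exposes only three columns of the employee table and only the rows for department 401, so sensitive columns such as salary stay hidden from users who are given access to the view alone.

```sql
-- A view that shows a subset of columns and rows.
CREATE VIEW emp_dept_401_v AS
SELECT employee_number, last_name, first_name
FROM employee
WHERE department_number = 401;

-- Users query the view as if it were a table.
SELECT *
FROM emp_dept_401_v
ORDER BY last_name;
```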

Macros

A macro is a Teradata Database extension to ANSI SQL that defines a sequence of prewritten Teradata SQL statements. Macros are pre-defined, stored sets of one or more SQL commands and/or report-formatting (BTEQ) commands. Macros can also contain comments.

Macros can be a convenient shortcut for executing groups of frequently-run or complex SQL statements (queries) or sets of operations. When you execute the macro, the statements execute as a single transaction. Macros reduce the number of keystrokes needed to perform a complex task. This saves you time, reduces the chance for errors, and reduces the communication volume to the Teradata Database.

Macros also have a performance benefit: they are automatically re-optimized each time they are run. As the database demographics evolve over time, you can be sure that the macros are optimized for the current data, not for data that existed when the macro was created. You can use the EXPLAIN modifier to compare a macro's execution plan as your data demographics change.

Macros can be executed interactively or by batch applications, and simplify access to the system. Macros are database objects. Because they are stored in the Teradata Database's Data Dictionary, they are available to all connected clients.

Macros also control access to the system. A Database Administrator can use macros to:

• Limit the tasks a user can perform, for example, by giving the user access to only a macro and not a whole database.

• Control which users can execute a macro.

• Restrict users to specific rows and columns of the database through the macro code.

Macros are owned by a User or a Database and can be run by Users who have EXECUTE privileges. Parameters allow you to customize a macro to suit your individual needs at run time. To execute the macro, you use one EXECUTE statement, and the statements in the macro are processed as a single transaction.

To work with macros, a User must have the following privileges:

• EXEC

• DROP

• CREATE
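A parameterized macro of the kind described above might be defined and run as in this sketch. The macro name dept_list_m and the parameter name dept are hypothetical; the query inside reuses the employee columns from the SELECT examples.

```sql
-- Define a macro with one parameter; :dept refers to the parameter value.
CREATE MACRO dept_list_m (dept INTEGER) AS
( SELECT employee_number, last_name, first_name
  FROM employee
  WHERE department_number = :dept
  ORDER BY last_name;
);

-- Run the macro for department 401.
EXEC dept_list_m (401);
```

Granting a user the EXEC privilege on the macro, without access to the employee table itself, is one way to apply the access-control idea from the list above.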

Triggers

A trigger is a set of SQL statements usually associated with a column or table that are programmed to be run (or "fired") when specified changes are made to the column or table. The pre-defined change is known as a triggering event, which causes the SQL statements to be processed.

As an example, a user with the appropriate privileges can create a trigger to keep company records consistent. The trigger would be associated with the DEPARTMENT table, which contains each department number in the company, as well as the employee number of the manager assigned to that department. This trigger has a triggering event and a triggered event:

• Triggering event: When a new manager is assigned to a department, the manager's employee number changes for that department in the DEPARTMENT table.

• Triggered event: SQL statements will be executed that update each affected employee's information in the EMPLOYEE table, which lists each employee, his or her employee number, and the employee number of his or her manager.
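The manager-change example above could be sketched roughly as follows. The trigger name and the column names manager_employee_number and department_number are assumptions about how the DEPARTMENT and EMPLOYEE tables are defined, and the sketch is deliberately simplified: it updates every employee in the department, including the new manager's own row.

```sql
-- Triggering event: the manager column of a DEPARTMENT row is updated.
CREATE TRIGGER dept_mgr_change_trg
AFTER UPDATE OF manager_employee_number ON department
REFERENCING NEW AS new_row
FOR EACH ROW
-- Triggered action: point the department's employees at the new manager.
UPDATE employee
SET manager_employee_number = new_row.manager_employee_number
WHERE department_number = new_row.department_number;
```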


Stored Procedures

A stored procedure is a pre-defined set of statements invoked through a single CALL statement in SQL. While a stored procedure may seem like a macro, it is different in that it can contain:

• Teradata SQL data manipulation statements (non-procedural)
• Procedural statements (in the Teradata Database, referred to as Stored Procedure Language)

A Stored Procedure Benefit: Ability to Use SPL

A stored procedure provides the benefit of SPL control and condition handling statements unavailable in Teradata SQL. Teradata Database macros can contain only Teradata SQL statements. The combined functionality of Teradata SQL and SPL statements in stored procedures provides a computationally complete programming language. Examples of SPL functionality include:

• IF-THEN-ELSE
• DO WHILE
• LOOP
• BEGIN - END

Note that in the ANSI SQL standard, the procedural statements are included as part of SQL. In the Teradata Database, the procedural statements are allowed only in a stored procedure, so the terms SQL and SPL are used to differentiate between the non-procedural and procedural statements.

Another Stored Procedure Benefit: Less I/O Overhead

Both macros and stored procedures eliminate the overhead of sending commands from a client over a connection to the PE and down to the AMPs. The commands for macros and stored procedures are resident on the Teradata Database, so there is less I/O (input/output) traffic used to execute them.

 


Creating Databases and Users

In the Teradata Database, Databases (including that special category of Databases called Users) have attributes assigned to them:

• Access Rights: Privileges that allow a User to perform operations (such as CREATE, DROP, and SELECT) against database objects. A User must have the correct access rights to a database object in order to access it.
• Perm Space: The maximum amount of Permanent Space assigned and available to a User or Database to store tables. Unlike some other relational databases, the Teradata Database does not physically pre-allocate Perm Space for Databases and Users at object definition time. Only the Permanent Space limit is defined, then the space is consumed dynamically as needed. All Databases have a defined upper limit of Permanent Space.
• Spool Space: The amount of space assigned and available to a User or Database to gather answer sets. For example, when executing a conditional query, qualifying rows are temporarily stored using Spool Space. Depending on how the system is set up, a single query could temporarily use all available System Space to store its result in spool. Permanent Space not being used for tables is available for Spool Space.
• Temp Space: The amount of space used for global temporary tables, and these results remain available to the User until the session is terminated. Tables created in Temp Space will survive a restart. Permanent Space not being used for tables is available for Temp Space as well as Spool Space.

A Logical Database Hierarchy

In a logical, hierarchical organization, Databases (including Users) are created subordinate to existing Databases or Users. The owning Database or User is called the parent. The subordinate Database or User is called the child. Permanent Space for the new Database or User comes from its immediate parent.

When the Teradata Database software is first installed, all Permanent Space is assigned to Database DBC (also a User in Teradata Database terminology, because you can log on to it with a userid and password). During installation, the following Databases are created:

• Database Crashdumps (initially empty)
• User SystemFE (with its views and macros)
• User SysAdm (with its views and macros)

Because Database DBC is the immediate parent of these child Databases, Permanent Space limits for the children are subtracted from Database DBC.

Creating a New Database

After the initial setup, customer tables can then be created. One way to set up a database hierarchy would be to create a Database Administrator User directly subordinate to Database DBC. Most of the system Permanent Space would be assigned to the Database Administrator User. This setup gives you the freedom to have multiple administrators logging on to the Database Administrator User, and to limit the number of people logging on directly to Database DBC (which has more access rights than any other User).

Next, all other Users and Databases would be created from the Database Administrator User, and their Permanent Space limits would be subtracted from the Database Administrator User's space limit. Your hierarchy would look like this:

• Database DBC at the highest level, the parent of all other Databases (including Users).
• User SysDBA (we called it SysDBA; you can assign it any name) with the majority of the system's Perm Space assigned to it.
• All customer Databases and Users in the system created from User SysDBA.
• Each table, view, macro, stored procedure, and trigger is owned by a Database (or User). You specify the owning Database when creating the objects. For example, when creating a table, you specify the table's owner in the CREATE TABLE statement. If no owner is specified, the system uses the User you are logged on to as the table's owner.

Maximum Perm Space Allocations: An Example

Below is an example of how Permanent Space limits for Users and Databases come from the immediate parent User or Database. In this case, the User SysDBA has 500 GB of maximum Permanent Space assigned to it.

The User HR is created from SysDBA with 200 GB of maximum Permanent Space. The 200 GB for HR is subtracted from SysDBA, which becomes 300 GB (500 GB minus 200 GB).

The User Payroll is created as a child of HR with 100 GB of Permanent Space. The 100 GB for Payroll is subtracted from HR, which becomes 100 GB (200 GB minus 100 GB).

At a different level under SysDBA, Database Marketing is created as a child of SysDBA, with 100 GB of maximum Permanent Space. The 100 GB for Marketing comes from its parent, SysDBA, which becomes 200 GB (300 GB minus 100 GB).
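The subtraction in this example can be traced with a small model. The sketch below is only an illustration in Python, not Teradata code: the `Space` class and its attribute names are hypothetical, and the only rule it encodes is the one stated above, that a child's Permanent Space limit is taken from its immediate parent.

```python
class Space:
    """A Database or User with a Permanent Space limit in GB (hypothetical model)."""

    def __init__(self, name, perm_gb, parent=None):
        self.name = name
        self.perm_gb = perm_gb
        self.parent = parent
        if parent is not None:
            if perm_gb > parent.perm_gb:
                raise ValueError(f"{parent.name} has only {parent.perm_gb} GB to give")
            # The child's limit is subtracted from its immediate parent only.
            parent.perm_gb -= perm_gb


sysdba = Space("SysDBA", 500)
hr = Space("HR", 200, parent=sysdba)                # SysDBA: 500 - 200 = 300
payroll = Space("Payroll", 100, parent=hr)          # HR: 200 - 100 = 100
marketing = Space("Marketing", 100, parent=sysdba)  # SysDBA: 300 - 100 = 200

for obj in (sysdba, hr, payroll, marketing):
    print(obj.name, obj.perm_gb, "GB")
```

Creating Payroll changes only HR's limit; SysDBA is unaffected because it is not Payroll's immediate parent.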


Spool Space

Maximum Spool Space

As mentioned previously in "Creating Databases and Users," Spool Space is working space used to hold intermediate answer sets. Any Perm Space currently unassigned is available as Spool Space.

Defining Spool Space is not required when Users and Databases are created. If it is not defined, the Spool Space for the User or Database is inherited from its parent. Thus, if no Spool Space limit were defined for any Users or Databases, an erroneous SQL request could create a "runaway transaction" that unintentionally consumes all of a system's resources. For this reason, defining maximum Spool Space for a User or Database is highly recommended.

The Spool Space limit for a Database or User is not subtracted from its immediate parent, but the Database or User's maximum spool allocation can only be as large as its immediate parent. For example:

• Database A has a Spool Space limit of 500 GB.
• Database B is created as a child of Database A. The maximum Spool Space that can be allocated to Database B is 500 GB.
• Database C is created as another child of Database A. The maximum Spool Space that can be allocated to Database C is also 500 GB.

Because Spool Space is working space, temporarily used and released by the system as needed, the total maximum Spool Space allocated for all the Databases and Users on the system can actually exceed the total system disk space. But this is not the amount of Spool Space actually consumed.

Consuming Spool Space

The maximum Spool Space for a Database (or User) is merely an upper limit of the Spool Space that the Database can use while processing a transaction. There are two limits to Spool Space utilization:

• The maximum Spool Space assigned to a User or Database. If a transaction is going to exceed its assigned limit, it is aborted and an error message is given stating that the maximum Spool Space was exceeded.
• Physical limitation of disk space. For a specific transaction, the system can only use the amount of Spool Space actually available on the disk drives at that particular time, whether a maximum spool limit has been defined or not. If a job is going to exceed the Spool Space available on the system, an error message is given stating that there is not enough space to process the job.

As the amount of Permanent Space used to store data varies over a long period of time, so will the amount of space available for spool (working space).
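The two spool rules (a child's limit is capped by its parent's but not subtracted from it, and a request fails against either the assigned limit or the free disk space) can be sketched as follows. This is a simplified Python illustration; the function names, exception names, and the order of the two checks are assumptions, not Teradata behavior taken from the text.

```python
class SpoolLimitExceeded(Exception):
    """The request would exceed the User's or Database's assigned spool limit."""


class NoSpoolAvailable(Exception):
    """The request would exceed the spool physically available on disk."""


def child_limit_is_valid(parent_limit_gb, child_limit_gb):
    # A child's spool limit may equal its parent's limit but not exceed it.
    # Nothing is subtracted from the parent.
    return child_limit_gb <= parent_limit_gb


def request_spool(needed_gb, user_limit_gb, free_disk_gb):
    """Return the spool granted, or raise when either limit would be exceeded."""
    # user_limit_gb is None when no maximum spool limit has been defined.
    if user_limit_gb is not None and needed_gb > user_limit_gb:
        raise SpoolLimitExceeded("maximum Spool Space exceeded")
    if needed_gb > free_disk_gb:
        raise NoSpoolAvailable("not enough space to process the job")
    return needed_gb


# Databases B and C can both be given the full 500 GB of their parent A.
print(child_limit_is_valid(500, 500), child_limit_is_valid(500, 600))
```

Because limits are caps rather than reservations, the second check is what stops a query when the sum of all defined limits exceeds the disk space actually free.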

 


Temporary Space

Temporary Space is Permanent Space currently not being used. Temporary Space is the amount of space used for global temporary tables, and these results remain available to the User until the session is terminated. Tables created in Temp Space will survive a restart.

Data Dictionary

The Data Dictionary is a set of relational tables that contains information about the RDBMS and database objects within it. It is the metadata or "data about the data" for a Teradata Database. The Data Dictionary resides in Database DBC. Some of the major items it tracks are:

• Disk space
• Access authorizations
• Ownership
• Data definitions

Disk Space

The Data Dictionary stores information about how much space is allocated for perm and spool for each Database and User. The table below shows an example of Data Dictionary information for space allocations. In this example, the Users Payroll and Benefits have no Permanent Space allocated or consumed because they do not contain tables.

Access

The Data Dictionary also stores information about which Users can access which database objects.

System Administrators often are responsible for archiving the system. In the example below, it is likely that the SysAdm User would have access to the tables in the Employee and Crashdumps databases, as well as other objects. When you grant and revoke access to any User for any database object, privileges are stored in the Data Dictionary.


Owners

The Data Dictionary also stores information about which Databases and Users own each database object.

Definitions

The Data Dictionary stores definitions of all database objects, their names, and their place in the hierarchy.

For macros, the Data Dictionary also stores the actual SQL statements of the macro. While stored procedures also contain statements (SQL and SPL statements), the statements for each stored procedure are kept in a separate table and distributed among the AMPs (like regular user data), not kept in the Data Dictionary.

6. DATA PROTECTION

Protecting Data

Several types of data protection are available with the Teradata Database. All the data protection methods shown on this page are covered in further detail later in this module.

RAID

Redundant Array of Inexpensive Disks (RAID) is a storage technology that provides data protection at the disk drive level. It uses groups of disk drives called "arrays" to ensure that data is available in the event of a failed disk drive or other component. The word "redundant" implies that data, functions, and/or components have been duplicated in the array's architecture. The industry has agreed on six RAID configuration levels (RAID 0 through RAID 5). The classifications do not imply superiority of one mode over the other, but differentiate how data is stored on the disk drives. With the Teradata Database, the two RAID technologies used most commonly are RAID 1 and RAID 5. On systems using EMC disk drives, RAID 5 is called RAID S.

Fallback

Fallback is a Teradata Database feature that protects data against AMP failure. As shown later in this module, Fallback uses groups of AMPs that provide for data availability and consistency if an AMP is unavailable.

Locks

Temporary locks can be placed on data to prevent multiple users from simultaneously changing it:

• Exclusive Lock
• Write Lock
• Read Lock
• Access Lock

Journals

The Teradata Database has journals that can be used for specific types of data or process recovery:

• Permanent Journals
• Recovery Journals

RAID 1

RAID 1 is a data protection scheme that uses mirrored pairs of disks to protect data from a single drive failure.


RAID 1: Effects on Your System

RAID 1 requires double the number of disks because every data block has an identical copy. RAID 1 is usually faster than RAID 5. The highest level of data protection is RAID 1 with Fallback.

RAID 1: How It Works

RAID 1 protects against a single disk failure using the following principles:

• Mirroring
• Reading

Mirroring: RAID 1 maintains a duplicate disk for each disk in the system.


Note: If you configure more than one pair of disks per AMP, the RDAC stripes the data across both the regular and mirror disks. Note that while striped mirroring (also called RAID 1 + 0) is available, it is not recommended for use with the Teradata Database. Striped mirroring is a method used to create parallelism for a non-parallel environment. Because the Teradata Database is already parallel, there is no benefit gained from using striped mirrors.

Reading: Using both copies of the data, the system reads data blocks from the first available disk.

RAID 1: How It Handles Failures

If a disk fails, the Teradata Database is unaffected and the following are each handled in a different way:

• Reads
• Writes
• Replacements


Reads: When a drive is down, the system reads the data from the functional drive only. There is a minor performance penalty because reads can occur from only one drive instead of two.

Writes: When a drive is down, the system writes to the functional drive. No mirror image exists at this time.

Replacements: After you replace the failed disk, the disk array controller automatically reconstructs the data on the new disk from the mirror image. Normal system performance is affected during the reconstruction of the failed disk.
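The read, write, and replacement behavior of a mirrored pair can be summarized in a short model. The Python sketch below is an illustration only: `MirroredPair` and its methods are hypothetical names, and each disk is reduced to a dictionary of blocks.

```python
class MirroredPair:
    """Two drives holding identical copies of every block (hypothetical model)."""

    def __init__(self):
        self.disks = [{}, {}]
        self.online = [True, True]

    def write(self, block, data):
        # Write to every functional drive; a down drive gets no mirror image.
        for i in (0, 1):
            if self.online[i]:
                self.disks[i][block] = data

    def read(self, block):
        # Read from the first available drive.
        for i in (0, 1):
            if self.online[i]:
                return self.disks[i][block]
        raise OSError("both drives in the pair are down")

    def fail(self, i):
        self.online[i] = False
        self.disks[i] = {}

    def replace(self, i):
        # Reconstruct the new drive from the surviving mirror image.
        self.disks[i] = dict(self.disks[1 - i])
        self.online[i] = True


pair = MirroredPair()
pair.write("block-1", "x")
pair.fail(0)                # drive 0 goes down
pair.write("block-2", "y")  # written to drive 1 only
pair.replace(0)             # drive 0 is rebuilt from drive 1
print(pair.disks[0] == pair.disks[1])
```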

RAID 5

RAID 5 is a data protection scheme that uses disk arrays to protect data from the failure of a single drive.

Note: RAID S is the name for RAID 5 implemented on EMC disk drives.


Disk arrays contain the following major components:

• SCSI bus
• Physical disks
• Disk array controllers

For maximum availability and performance, the Teradata Database always uses dual redundant disk array controllers. Having two disk array controllers adds a level of protection in case one controller fails, and provides parallelism for disk access.

RAID 5: Effects on Your System

The number of disks per rank varies from vendor to vendor. The number of disks in a rank impacts space utilization:

• 4 drives per rank requires a 33% increase in data space.
• 5 drives per rank requires a 25% increase in data space.

RAID 5 also requires some performance overhead during a write operation, because it has to read the data, then calculate and write the parity.
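The two percentages follow from simple arithmetic if one assumes that, in a rank of n drives, the equivalent of one drive holds parity and the other n - 1 hold data, so the extra space relative to the data is 1 / (n - 1). The Python sketch below works this out; the function name is hypothetical and the one-drive-of-parity assumption is ours, inferred from the figures above.

```python
def raid5_space_increase(drives_per_rank):
    """Extra space needed for parity, as a fraction of the data space."""
    if drives_per_rank < 2:
        raise ValueError("a rank needs at least two drives")
    # One drive's worth of parity protects (n - 1) drives' worth of data.
    return 1 / (drives_per_rank - 1)


for n in (4, 5):
    print(f"{n} drives per rank: {raid5_space_increase(n):.0%} increase in data space")
```

For comparison, RAID 1 corresponds to a 100% increase, since every block is stored twice.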

RAID 5: How It Works

RAID 5 uses a data parity scheme to provide data protection.

Rank: For the Teradata Database, RAID 5 uses the concept of a rank, which is a set of disks working together. Note that the disks in a rank are not directly cabled to each other.

Parity: In RAID 5, data is handled as follows:

• Data is striped across a rank of disks (spread across the disk drives) one segment at a time, using a binary "exclusive-or" (XOR) algorithm.
• Parity is also striped across all disk drives, interleaved with the data. A "parity byte" is an extra byte written to a drive in a rank. The process of writing data and parity to the disk drives includes a read-modify-write operation for each new segment:

1. Read existing data on the disk drives in the rank.
2. Read existing parity in that rank for the corresponding segment.
3. Calculate the parity: existing data + new data + existing parity = new parity (where + denotes the XOR operation).
4. Write new data.
5. Write new parity.

• If one of the disk drives in the rank becomes unavailable, the system uses the parity byte to calculate the missing data from the down drive so the system can remain operational. With a rank of 4 disks, if a disk fails, any missing data block may be reconstructed using the other 3 disks.

In the example below, data bytes are written to disk drives 1, 2, and 3. The system calculates the parity byte using the binary XOR algorithm and writes it to disk drive 4.
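The parity calculation, the read-modify-write update, and the reconstruction of a missing byte can all be expressed with XOR. The following Python sketch uses made-up byte values and hypothetical function names; it illustrates the arithmetic only, not how a disk array controller is implemented.

```python
from functools import reduce
from operator import xor


def parity(data_bytes):
    """XOR all data bytes in a stripe to get the parity byte."""
    return reduce(xor, data_bytes, 0)


def reconstruct(surviving_bytes, parity_byte):
    """Recover the byte on a failed drive from the surviving bytes plus parity."""
    return reduce(xor, surviving_bytes, parity_byte)


def updated_parity(existing_data, new_data, existing_parity):
    """Read-modify-write: existing data XOR new data XOR existing parity."""
    return existing_data ^ new_data ^ existing_parity


stripe = [0x5A, 0x3C, 0xF0]   # bytes on drives 1, 2, and 3 (made-up values)
p = parity(stripe)            # parity byte written to drive 4

# Drive 2 fails: rebuild its byte from drives 1 and 3 plus the parity byte.
rebuilt = reconstruct([stripe[0], stripe[2]], p)
print(hex(rebuilt))

# Overwrite the byte on drive 2 without re-reading drives 1 and 3.
new_p = updated_parity(stripe[1], 0x11, p)
print(new_p == parity([0x5A, 0x11, 0xF0]))
```

The update works because XOR-ing the old data into the old parity cancels its contribution, leaving room for the new data's contribution.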

RAID 5: How It Handles Failures

If a disk fails, the Teradata Database is unaffected and the following are each handled in different ways:

• Reads
• Writes
• Replacements

Reads: Data is reconstructed on-the-fly as users request data using the binary XOR algorithm.

Writes: When a drive is down, the system writes to the functional drives, but not to the failed drive.

Replacements: After you replace the failed disk, the disk array controller automatically reconstructs the data on the new disk, using known data values to calculate the missing data. Normal system performance is affected during reconstruction of the failed disk.

Disk Allocation

The operating system, PDE, and the Teradata Database do not recognize the physical disk hardware. Each software component recognizes and interacts with different components of the data storage environment:

• Operating system: Recognizes a logical unit (LUN). The operating system recognizes the LUN as its "disk," and is not aware that it is actually writing to spaces on multiple disk drives. This technique enables the use of RAID technology to provide data availability without affecting the operating system.
• PDE: Translates LUNs into vdisks using slices (in UNIX) or partitions (in Microsoft Windows) in conjunction with a Teradata utility:
  o UNIX - pdeconfig
  o Microsoft Windows - PUT (Parallel Upgrade Tool)
• Teradata Database: Recognizes a virtual disk (vdisk). Using vdisks instead of direct connections to physical disk drives enables the use of RAID technology without affecting the Teradata Database.

Creating LUNs

Space on the physical disk drives is organized into LUNs. The RAID level determines how the space is organized. For example, if you are using RAID 5, a LUN includes a region of space from each of the physical disk drives in a rank.

Pdisks: User Data Space

After a LUN is created, it is divided into partitions.

• In UNIX systems, a LUN consists of one partition, which is further divided into slices:
  o Boot slice (a very small slice, taking up only 35 sectors)
  o User slices for storing data. These user slices are called "pdisks" in the Teradata Database.
• In Microsoft Windows systems, a LUN consists of multiple partitions; there are no slices. Thus, LUNs in Microsoft Windows do not have a boot slice. Instead, they contain a "Master Boot Record" that includes information such as the partition layout. The partitions store data and are called "pdisks" in the Teradata Database.

In summary, pdisks are the user slices (UNIX) or partitions (Microsoft Windows) and are used for storage of the tables in a database. A LUN may have one or more pdisks.

Assigning Pdisks to AMPs

The pdisks (user slices or partitions, depending on the operating system) are assigned to an AMP through the software. No cabling is involved.

The combined space on the pdisks is considered the AMP's vdisk. An AMP manages only its own vdisk (disk space assigned to it), not the vdisk of any other AMP. All AMPs can then work in parallel, processing their portion of the data.

Each AMP in the system is assigned one vdisk. Although numerous configurations are possible, generally all pdisks from a rank (RAID 5) or mirrored pair (RAID 1) are assigned to the same AMP for optimal performance.

However, an AMP recognizes only the vdisk. The AMP has no control over the physical disks or ranks that compose the vdisk.

Fallback

Fallback is a Teradata Database feature that protects data in the case of an AMP vproc failure. Fallback is used to guarantee the maximum availability of data. You can use Fallback protection on a table-by-table basis. It is especially useful in applications that require high availability.

Fallback protects your data by storing a second copy of each row of a table on an alternate, Fallback AMP in the same cluster. If an AMP fails, the system accesses the Fallback rows to meet requests. Fallback provides AMP fault tolerance at the table level. With Fallback tables, if one AMP fails, all table data is still available. Users may continue to use Fallback tables without any loss of available data.

During table creation or after a table is created, you may specify whether or not the system should keep a Fallback copy of the table. If Fallback is specified, it is automatic and transparent.

Fallback guarantees that the two copies of a row will always be on different AMPs. If either AMP fails, the alternate row copy is still available on the other AMP.

Fallback: Effects on Your System

Fallback has the following effects on your system:

Space

In addition to the original database size, you need space for:

• Fallback-protected tables (100% additional storage space for each Fallback-protected table)
• RAID protection of Fallback-protected tables


Performance

There will be twice as much input/output (I/O) for Inserts, Updates, and Deletes of rows in Fallback-protected tables. No extra I/O is required for Select operations, as the Fallback I/O is performed in parallel with the Primary I/O.

Fallback benefits include:

• Adds a level of protection beyond RAID disk array protection.
• Can be specified on a table-by-table basis to protect data requiring the highest availability.
• Permits access to data while an AMP is off-line.
• Automatically restores data that was changed during the AMP off-line period.

The highest level of data protection is Fallback and RAID 1.

Fallback: Software Tools

The following Teradata utilities are used to recover a failed AMP:

• Vproc Manager: Enables you to:
  o Display and modify vproc states.
  o Initiate Teradata Database restarts.
• Table Rebuild: Reconstructs tables on an AMP from data on other AMPs in the cluster.
• Recovery Manager: Lets you monitor recovery processing.

Fallback: How It Works

Fallback is accomplished by grouping AMPs into clusters. When a table is defined as Fallback-protected, the system stores a second copy of each row in the table on the disk space managed by an alternate "Fallback AMP" in the AMP cluster.

Below is a cluster of four AMPs. Each AMP has a combination of Primary and Fallback data rows:

• Primary Data Row: A record in a database table that is used in normal system operation.
• Fallback Data Row: The online backup copy of a Primary data row that is used in the case of an AMP failure.

Write: Each Primary data row has a duplicate Fallback data row on another AMP. The Primary and Fallback data rows are written in parallel.

(Figure legend: P = Primary, F = Fallback)

Read: When you access data and all AMPs are operational, only Primary rows are read.

More Clusters: The diagram below shows how Fallback data is distributed among multiple clusters.

Fallback: How It Handles Failures

If two physical disks fail in the same RAID 5 rank or RAID 1 mirrored pair, the associated AMP vproc fails. Fallback protects against the failure of a single AMP in a cluster.

If two AMPs in a cluster fail, the system halts and must be restarted manually.

Reads: When an AMP fails, the system reads all rows it needs from the disk space of the remaining AMPs in the cluster. If the system needs to find a Primary row from the failed AMP, it reads the Fallback copy of that row, which is on the disk space of another AMP.

Writes: A failed AMP is not available, so the system cannot access any of the failed AMP's disk space. Copies of all its rows are available on disk space for other AMPs in the cluster (either as Primary or Fallback rows), and are updated there.

Replacement: Repairing the failed AMP requires replacing the failed physical disks and bringing the AMP online. Once the AMP is online, the system uses the Fallback data on the other AMPs to automatically re-create data on the newly replaced disks.
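The read path with and without a failed AMP can be sketched in a few lines. The Python model below is a deliberate simplification: the cluster, the function names, and the placement rule (Fallback copy on the next AMP in the cluster) are assumptions made for illustration. The only properties taken from the text are that the two copies are on different AMPs in the same cluster, that reads use the Primary copy when it is available, and that two failed AMPs in one cluster halt the system.

```python
CLUSTER = (0, 1, 2, 3)   # AMP numbers in one cluster (hypothetical)


def primary_amp(row_hash):
    return CLUSTER[row_hash % len(CLUSTER)]


def fallback_amp(row_hash):
    # Assumed placement: the next AMP in the cluster, never the Primary AMP itself.
    return CLUSTER[(row_hash + 1) % len(CLUSTER)]


def amp_to_read(row_hash, failed=()):
    """Return the AMP whose disk space serves a read of this row."""
    failed = set(failed) & set(CLUSTER)
    if len(failed) >= 2:
        raise RuntimeError("two AMPs in the cluster are down: system halts")
    primary = primary_amp(row_hash)
    if primary not in failed:
        return primary              # normal operation: only Primary rows are read
    return fallback_amp(row_hash)   # Primary AMP is down: read the Fallback copy


print(amp_to_read(5))               # all AMPs up
print(amp_to_read(5, failed={1}))   # the Primary AMP for this row is down
```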

Journals for Data Availability

The following journals are kept on the system to provide data availability in the event of a component or process failure in the system:

• Permanent Journals
• Recovery Journals

Permanent Journals

Permanent Journals are an optional feature of the Teradata Database to provide an additional level of data protection. You specify the use of Permanent Journals at the table level. It provides full-table recovery to a specific point in time. It also can reduce the need for costly and time-consuming full-table backups. Permanent Journals are tables stored on disk arrays like user data is, so they take up additional disk space on the system. The Teradata Database Administrator maintains the Permanent Journal entries (deleting, archiving, and so on).

How Permanent Journals Work

A Database (object) has a maximum of one Permanent Journal. When you create a table with Permanent Journaling, you must specify whether the Permanent Journal will capture:

• Before images -- for rollback to "undo" a set of changes to a previous state.
• After images -- for rollforward to "redo" to a specific state.

You can also specify that the system keep both before images and after images. In addition, you can choose that the system captures:

• Single images (the default) -- this means that the Permanent Journal table is not Fallback protected.
• Dual images -- this means that the Permanent Journal table is Fallback protected.

The Permanent Journal captures images concurrently with standard table maintenance and query activity. The additional disk space required may be calculated in advance to ensure adequate resources. Periodically, the Teradata Database Administrator can dump the Permanent Journal to external media, thus reducing the need for full-table backups since only changes are backed up and not the entire database.

Recovery Journals

The Teradata Database uses Recovery Journals to automatically maintain data integrity in the case of:

• An interrupted transaction (Transient Journal)
• An AMP failure (Down-AMP Recovery Journal)

Recovery Journals are created, maintained, and purged by the system automatically, so no DBA intervention is required. Recovery Journals are tables stored on disk arrays like user data is, so they take up additional disk space on the system.

Transient Journal

A Transient Journal maintains data integrity when in-flight transactions are interrupted (due to aborted transactions, system restarts, and so on). Data is returned to its original state after transaction failure.

A Transient Journal is used during normal system operation to keep "before images" of changed rows so the data can be restored to its previous state if the transaction is not completed. This happens on each AMP as changes occur. When a transaction is started, the system automatically stores a copy of all the rows affected by the transaction in the Transient Journal until the transaction is committed (completed). Once the transaction is complete, the "before images" are purged. In the event of a transaction failure, the "before images" are reapplied to the affected tables and deleted from the journal, and the "rollback" operation is completed.
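The before-image mechanism can be illustrated with a small in-memory model. The Python sketch below is not how the Teradata Database implements the journal; the class, its methods, and the dictionary standing in for a table are all hypothetical. It shows only the cycle described above: save a before image on first change, purge on commit, reapply on failure.

```python
_MISSING = object()   # marks a row that did not exist before the transaction


class TransientJournal:
    """Keeps before images of rows changed by one transaction (hypothetical model)."""

    def __init__(self, table):
        self.table = table          # dict: row id -> row value
        self.before_images = {}     # row id -> value before the transaction

    def update(self, row_id, new_value):
        # Save the before image the first time a row is touched.
        if row_id not in self.before_images:
            self.before_images[row_id] = self.table.get(row_id, _MISSING)
        self.table[row_id] = new_value

    def commit(self):
        # Transaction complete: the before images are purged.
        self.before_images.clear()

    def rollback(self):
        # Transaction failed: reapply before images, then delete them.
        for row_id, old in self.before_images.items():
            if old is _MISSING:
                self.table.pop(row_id, None)
            else:
                self.table[row_id] = old
        self.before_images.clear()


employees = {1: "Smith, dept 100"}
tj = TransientJournal(employees)
tj.update(1, "Smith, dept 200")
tj.update(2, "Jones, dept 200")
tj.rollback()
print(employees)
```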

Down-AMP Recovery Journal

A Down-AMP Recovery Journal allows continued system operation while an AMP is down (for example, when two disk drives fail in a rank or mirrored pair). A Down-AMP Recovery Journal is used with Fallback-protected tables to maintain a record of write transactions (updates, creates, inserts, deletes, etc.) on the failed AMP while it is unavailable.

A Down-AMP Recovery Journal starts automatically after the loss of an AMP in a cluster. Any changes to the data on the failed AMP are logged into the Down-AMP Recovery Journal by the other AMPs in the cluster. When the failed AMP is brought back online, the restart process includes applying the changes in the Down-AMP Recovery Journal to the recovered AMP. The journal is discarded once the process is complete, and the AMP is brought online, fully recovered.
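The log-then-replay cycle can be modeled as follows. This Python sketch is a simplification with hypothetical names: each AMP is a dictionary of rows, and a single journal list stands in for the Down-AMP Recovery Journal kept by the other AMPs in the cluster.

```python
class Cluster:
    """AMPs in one cluster, with a journal for writes aimed at a down AMP."""

    def __init__(self, amp_ids):
        self.rows = {amp: {} for amp in amp_ids}   # per-AMP row copies (simplified)
        self.down = None
        self.journal = []                          # Down-AMP Recovery Journal

    def write(self, row_id, value, amps):
        # amps lists the AMPs holding the Primary and Fallback copies of the row.
        for amp in amps:
            if amp == self.down:
                self.journal.append((amp, row_id, value))   # log instead of writing
            else:
                self.rows[amp][row_id] = value

    def fail(self, amp):
        self.down = amp
        self.journal = []          # the journal starts when the AMP is lost

    def recover(self):
        # Apply the logged changes to the recovered AMP, then discard the journal.
        for amp, row_id, value in self.journal:
            self.rows[amp][row_id] = value
        self.journal = []
        self.down = None


cluster = Cluster([0, 1])
cluster.write(7, "v1", amps=[0, 1])
cluster.fail(1)
cluster.write(7, "v2", amps=[0, 1])   # AMP 1 is down, so its change is journaled
stale = cluster.rows[1][7]
cluster.recover()
print(stale, cluster.rows[1][7])
```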

Locks

Locking prevents multiple users who are trying to access or change the same data simultaneously from violating data integrity. This concurrency control is implemented by locking the target data.

Locks are automatically acquired during the processing of a request and released when the request is terminated.

Lee! ' L'c0i,

  *ocs may be applied at three levels/

• D#t#b#!e L'c0!> !pply to all tables and views in the database.

• T#be L'c0!> !pply to all rows in the table.

• R'( H#!" L'c0!> !pply to a #roup of one or more rows in a table.

Types of Locks

The four types of locks are described below.

Exclusive

Exclusive locks are applied only to databases or tables, never to rows. They are the most restrictive type of lock. With an exclusive lock, no other user can access the database or table. Exclusive locks are used rarely, most often when structural changes are being made to the database. An exclusive lock on a database or table prevents other users from obtaining the following types of locks on the locked data:

• Exclusive locks
• Write locks
• Read locks
• Access locks

Write

Write locks enable users to modify data while maintaining data consistency. While the data has a write lock on it, other users can obtain an access lock only. During this time, all other lock requests are held in a queue until the write lock is released. Write locks prevent other users from obtaining the following locks on the locked data:

• Exclusive locks
• Write locks
• Read locks

Read

Read locks are used to ensure consistency during read operations. Several users may hold concurrent read locks on the same data, during which time no data modification is permitted. Read locks prevent other users from obtaining the following locks on the locked data:

• Exclusive locks
• Write locks

Access

Access locks can be specified by users unconcerned about data consistency. The use of an access lock allows for reading data while modifications are in process. Access locks are designed for decision support on large tables that are updated only by small, single-row changes. Access locks are sometimes called "stale read" locks, because you may get "stale data" that has not been updated. Access locks prevent other users from obtaining the following locks on the locked data:

• Exclusive locks
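As a minimal sketch of how a user asks for an access lock explicitly, Teradata SQL provides the LOCKING request modifier, which is written in front of the statement it applies to. The employee table and its columns are hypothetical names used only for illustration.

```sql
-- Read with an access lock at table level: the query is not queued behind
-- a concurrent write lock, but it may see rows that are being changed.
LOCKING TABLE employee FOR ACCESS
SELECT employee_number, last_name
FROM employee
WHERE department_number = 401;

-- The same idea at row hash level for a single-row lookup.
LOCKING ROW FOR ACCESS
SELECT employee_number, last_name
FROM employee
WHERE employee_number = 1001;
```

Without the modifier, the system chooses the lock type itself, as described at the start of this section.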

7). INDICES

Indexes in the Teradata Database

Indexes are used to access rows from a table without having to search the whole table. In the Teradata Database, an index is made up of one or more columns in a table. Once Teradata Database indexes are selected, they are maintained by the system. While other vendors may require data partitioning or index maintenance, these tasks are unnecessary with the Teradata Database.

In the Teradata Database, there are two types of indexes:

• Primary Indexes define the way the data is distributed.
• Primary Indexes and Secondary Indexes are used to locate the data rows more efficiently than scanning the whole table.

You specify which column or columns are used as the Primary Index when you create a table. Secondary Index columns can be specified when you create a table or at any time during the life of the table.

Data Distribution

When the Primary Index for a table is well chosen, the table rows are evenly distributed across the AMPs for the best performance. The way to guarantee even distribution of data is by choosing a Primary Index whose columns contain unique values. The values do not have to be evenly spaced, or even "truly random," they just have to be unique to be evenly distributed.

The even distribution enables each AMP to be responsible for only a subset of the rows in a table. If the data is evenly distributed, the work is evenly divided among the AMPs so they can work in parallel and complete their processing at about the same time. Even data distribution is critical to performance because it optimizes the parallel access to the data.

Unevenly distributed data, also called "skewed data," causes slower response time as the system waits for the AMP(s) with the most data to finish their processing. The slowest AMP becomes a bottleneck. If distribution is skewed, an all-AMP operation will take longer than if all AMPs were evenly utilized.
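One way to check a candidate Primary Index column for skew before committing to it is to count how many rows would land on each AMP. The sketch below uses the Teradata hash functions HASHROW, HASHBUCKET and HASHAMP; the employee table and the department_number column are hypothetical.

```sql
-- Rows per AMP if department_number were the Primary Index.
-- A few AMPs with far larger counts than the rest indicate skewed data.
SELECT HASHAMP(HASHBUCKET(HASHROW(department_number))) AS amp_number,
       COUNT(*) AS row_count
FROM employee
GROUP BY 1
ORDER BY 2 DESC;
```

If the counts are roughly equal across AMPs, the column distributes well; if one AMP holds most of the rows, it is likely to become the bottleneck described above.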

When data is loaded into the Teradata Database:

• The system automatically distributes the data across the AMPs based on row content (the Primary Index values).
• The distribution is the same regardless of the data volume being loaded. In other words, large tables are distributed the same way as small tables.

Data is not distributed in any particular order. The benefits of unordered data are that it does not need any maintenance to preserve order, and that it is independent of any query being submitted. The automatic, unordered distribution of data eliminates tasks for a Teradata Database Administrator that are necessary with some other relational database systems. The DBA does not waste time on labor-intensive data maintenance tasks.

Teradata Database Manageability

One of the key benefits of the Teradata Database is its manageability. The list of tasks that Teradata Database Administrators do not have to do is long, and illustrates why the Teradata Database system is so easy to manage and maintain compared to other databases.

Things Teradata Database Administrators Never Have to Do

Teradata Database Administrators never have to do the following tasks:

• Reorganize data or index space.
• Pre-allocate table/index space.
• Physically partition disk space.
  o While it is possible to have partitioned indexes in the Teradata Database, they are not required.
• Pre-prepare data for loading (convert, sort, split, etc.).
• Unload/reload data spaces due to expansion. With the Teradata Database, the data can be redistributed on the larger configuration with no offloading and reloading required.
• Write or run programs to split input source files into partitions for loading.

With the Teradata Database, the workload for creating a table of 100 rows is the same as creating a table with 1,000,000,000 rows. Teradata Database Administrators know that if data doubles, the system can expand easily to accommodate it. The Teradata Database provides huge cost advantages, especially when it comes to staffing Database Administrators. Customers tell us that their DBA staff requirements for administering non-Teradata databases are three to four times higher.

How Other Databases Store Rows and Manage Data

Even data distribution is not easy for most databases to do. Many databases use range distribution, which creates intensive maintenance tasks for the DBA. Others may use indexes as a way to select a small amount of data to return the answer to a query. They use them to avoid accessing the underlying tables if possible. The assumption is that the index will be smaller than the tables, so it will take less time to read. Because they scan indexes and use only part of the data in the index to search for answers to a query, they can carry extra data in the indexes, duplicating data in the tables. This way they do not have to read the table at all in some cases. As you will see, this is not nearly as efficient as the Teradata Database's method of data storage and access.

Other DBAs have to ask themselves questions like:

• How should I partition the data?
• How large should I make the partitions?
• Where do I have data contention?
• How are the users accessing the data?

Many other databases require the DBAs to manually partition the data. They might place an entire table in a single partition. The disadvantage of this approach is that it creates a bottleneck for all queries against that data. It is not the most efficient way to either store or access data rows.

With other databases, adding, updating and deleting data affects manual data distribution schemes, thereby reducing query performance and requiring reorganization. A Teradata Database provides high performance because it distributes the data evenly across the AMPs for parallel processing. No partitioning or data re-organizations are needed. With the Teradata Database, your DBA can spend more time with users developing strategic applications to beat your competition!

Primary Index

A Primary Index (PI) is the physical mechanism for assigning a data row to an AMP and a location on the AMP's disks. It is also used to access rows without having to search the entire table. A Primary Index operation is always a one-AMP operation. You specify the column(s) that comprise the Primary Index for a table when the table is created. For a given row, the Primary Index value is the combination of the data values in the Primary Index columns.

Choosing a Primary Index for a table is perhaps the most critical decision a database designer makes, because this choice affects both data distribution and access.

Primary Index Rules

The following rules govern how Primary Indexes in a Teradata Database must be defined as well as how they function:

Rule 1: One Primary Index per table.
Rule 2: A Primary Index value can be unique or non-unique.
Rule 3: The Primary Index value can be NULL.
Rule 4: The Primary Index value can be modified.
Rule 5: The Primary Index of a populated table cannot be modified.
Rule 6: A Primary Index has a limit of 64 columns.

Rule 1: One PI Per Table

Each table must have a Primary Index. The Primary Index is the only way for the system to determine where a row will be physically stored. While a Primary Index may be composed of multiple columns, the table can have only one (single- or multiple-column) Primary Index.

Rule 2: Unique or Non-Unique PI

There are two types of Primary Index:

• Unique Primary Index (UPI): For a given row, the combination of the data values in the columns of a Unique Primary Index is not duplicated in other rows within the table, so the columns used are unique. This uniqueness guarantees even data distribution and direct access. For example, in the case where old employee numbers are sometimes recycled, the combination of the Last Name and Employee Number columns would be a UPI. With a UPI, there is no duplicate row checking done during a load, which makes it a faster operation.

• Non-Unique Primary Index (NUPI): For a given row, the combination of the data values in the columns of a Non-Unique Primary Index can be duplicated in other rows within the table. So, there can be more than one row with the same PI value. A NUPI can cause skewed data, but in specific instances can still be a good Primary Index choice. For example, either the Department Number column or the Hire Date column might be a good choice for a NUPI if you will be accessing the table most often via these columns.

Rule 3: PI Can Be NULL

If the Primary Index is unique, you could have one row with a null value. If you have multiple rows with a null value, the Primary Index must be Non-Unique.

Rule 4: PI Value Can Be Modified

The Primary Index value can be modified. In the table below, if Loretta Ryan changes departments, the Primary Index value for her row changes.

When you update the index value in a row, the Teradata Database re-hashes it and redistributes the row to its new location based on its new index value.

Rule 5: PI Cannot Be Modified

The Primary Index of a populated table cannot be modified.

In the event that you need a new Primary Index for a populated table, you must drop the table, recreate it with the new Primary Index, and reload the table.

The ALTER TABLE statement allows you to change the PI of a table only if the table is empty.

Rule 6: PI Has 64-Column Limit

You can designate a Primary Index that is composed of 1 to 64 columns.

SQL Syntax for Creating a Primary Index

When a table is created, it must have a Primary Index. The Primary Index is normally specified in the CREATE TABLE statement in SQL.

If you do not specify a Primary Index in the CREATE TABLE statement, the system will use the Primary Key as the Primary Index. If a Primary Key has not been specified, the system will choose the first unique column. If there are no unique columns, the system will use the first column in the table and designate it as a Non-Unique Primary Index.

Creating a Unique Primary Index

The SQL syntax to create a Unique Primary Index is:

CREATE TABLE sample_1
  (col_a INT
  ,col_b INT
  ,col_c INT)
UNIQUE PRIMARY INDEX (col_b);

Creating a Non-Unique Primary Index

The SQL syntax to create a Non-Unique Primary Index is:

CREATE TABLE sample_2
  (col_x INT
  ,col_y INT
  ,col_z INT)
PRIMARY INDEX (col_x);

Modifying the Primary Index of a Table

As mentioned in the Primary Index rules, you cannot modify the Primary Index of a populated table. In the event that you need a new Primary Index, you must drop the table, recreate it with the new Primary Index, and reload the table.
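The two cases can be sketched as follows. The table and column names follow the sample tables used in this section, and sample_2_new is a hypothetical working name. The second case is one common variation on drop, recreate and reload: it builds the copy first and drops the original afterwards, so the data never has to leave the system.

```sql
-- Case 1: the table is empty, so the Primary Index can be changed in place.
ALTER TABLE sample_1 MODIFY UNIQUE PRIMARY INDEX (col_a);

-- Case 2: the table is populated. Create a copy that carries the new
-- Primary Index, load it from the original, then swap the names.
CREATE TABLE sample_2_new AS (SELECT * FROM sample_2)
WITH DATA
PRIMARY INDEX (col_y);

DROP TABLE sample_2;
RENAME TABLE sample_2_new TO sample_2;
```

Because the rows are re-hashed on the new column, the copy redistributes every row across the AMPs, so it is best planned for a quiet period on a large table.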

Data Mechanics of Primary Indexes

This section describes how Primary Indexes are used in:

• Data distribution
• Data access

Distributing Rows to AMPs

The Teradata Database uses hashing to randomly and evenly distribute data across all AMPs for balanced performance. For example, in a two-clique system, data is hashed across all AMPs in the system for even data distribution, which results in evenly distributed workloads. Each AMP is designed to hold a portion of the rows of each table. An AMP is responsible for the storage, maintenance, and retrieval of the data under its control. The Teradata Database's automatic hash distribution eliminates costly data maintenance tasks.

Rows are distributed to AMPs during the following operations:

• Loading data into a table (one or more rows, using a data loading utility)
• Inserting or updating rows (one or more rows, using SQL)
• Changing the system configuration (redistribution of data, caused by reconfigurations to add or delete AMPs)

When loading data or inserting rows, the data being affected by the load or insert is not available to other users until the transaction is complete. During a reconfiguration, no data is accessible to users until the system is operational in its new configuration.

Row Distribution Process

The process the system uses for inserting a row on an AMP is described below:

1. The system uses the Primary Index value in each row as input to the hashing algorithm.
2. The output of the hashing algorithm is the row hash value (in this example, 4).
3. The system looks at the hash map, which identifies the specific AMP where the row should be stored (in this example, AMP 3).
4. The row is stored on the target AMP.
   o UPI: The system automatically checks for duplicate UPI values when rows are loaded or inserted. If a row already exists with the UPI value, the new row is not added.
   o NUPI: The system does not check for duplicate NUPI values. If a row already exists with the NUPI value, the new row is added to the same AMP.
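The steps above can be observed from SQL, as a rough sketch, with the Teradata hash functions: HASHROW returns the row hash for a value, HASHBUCKET returns the hash map entry that the row hash falls into, and HASHAMP returns the AMP that owns that entry. The employee table and employee_number column are hypothetical.

```sql
-- For each Primary Index value, show the row hash and the AMP it maps to.
SELECT employee_number,
       HASHROW(employee_number)                      AS row_hash,
       HASHBUCKET(HASHROW(employee_number))          AS hash_bucket,
       HASHAMP(HASHBUCKET(HASHROW(employee_number))) AS target_amp
FROM employee;
```

Two rows with the same employee_number always show the same row_hash and target_amp, which is why duplicate NUPI values end up on the same AMP.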

Duplicate Row Hash Values

It is possible for the hashing algorithm to end up with the same row hash value for two different rows. There are two ways this could happen:

• Duplicate NUPI values: If a Non-Unique Primary Index is used, duplicate NUPI values will produce the same row hash value.
• Hash synonym: Also called a hash collision, this occurs when the hashing algorithm calculates an identical row hash value for two different Primary Index values. Hash synonyms are very rare. When using a Unique Primary Index, you will still get uniform data distribution.

To differentiate each row in a table, every row is assigned a unique Row ID. The Row ID is the combination of the row hash value and a uniqueness value.

Row ID = Row Hash Value + Uniqueness Value

The uniqueness value is used to differentiate between rows whose Primary Index values generate identical row hash values. In most cases, only the row hash value portion of the Row ID is needed to locate the row.

When each row is inserted, the AMP adds the Row ID, stored as a prefix of the row. The first row inserted with a particular row hash value is assigned a uniqueness value of 1. The uniqueness value is incremented by 1 for any additional rows inserted with the same row hash value.

Duplicate Rows

A duplicate row is a row in a table whose column values are identical to another row in the same table. In other words, the entire row is the same, not just an index. Although duplicate rows are not allowed in the relational model (because every Primary Key must be unique), the Teradata Database does allow duplicate rows because the capability is a part of the ANSI standard.

Because duplicate rows are allowed in the Teradata Database, how does it affect the UPI, which, by definition, is unique? When you create a table, the following definitions determine whether or not it can contain duplicate rows:

• MULTISET tables: May contain duplicate rows. The Teradata Database will not check for duplicate rows.
• SET tables: The default. The Teradata Database checks for and does not permit duplicate rows. If a SET table is created with a Unique Primary Index, the check for duplicate rows is replaced by a check for duplicate index values.
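The table kind is chosen with the SET or MULTISET keyword in CREATE TABLE. The following is a minimal sketch; all table and column names are hypothetical.

```sql
-- SET table with a NUPI: a row identical in every column to an existing
-- row is not stored, so the system performs a duplicate row check.
CREATE SET TABLE order_item_set (
    order_id INTEGER NOT NULL,
    item_id  INTEGER NOT NULL,
    quantity INTEGER
)
PRIMARY INDEX (order_id);

-- MULTISET table: identical rows are accepted and no duplicate row
-- check is made.
CREATE MULTISET TABLE order_item_multiset (
    order_id INTEGER NOT NULL,
    item_id  INTEGER NOT NULL,
    quantity INTEGER
)
PRIMARY INDEX (order_id);

-- SET table with a UPI: uniqueness of the index value already rules out
-- duplicate rows, so only the index value is checked.
CREATE SET TABLE order_header (
    order_id   INTEGER NOT NULL,
    order_date DATE
)
UNIQUE PRIMARY INDEX (order_id);
```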

Accessing a Row With a Primary Index

When a user submits an SQL request using the table name and Primary Index, the request becomes a one-AMP operation, which is the most direct and efficient way for the system to find a row. The process is explained below.

Hashing Process

1. The primary index value goes into the hashing algorithm.
2. The output of the hashing algorithm is the row hash value.
3. The hash map points to the specific AMP where the row resides.
4. The PE sends the request directly to the identified AMP.
5. The AMP locates the row(s) on its vdisk.
6. The row data is sent over the BYNET to the PE, and the PE sends the answer set on to the client application.

Choosing a Unique or Non-Unique Primary Index

Criteria for choosing a Primary Index include:

• Uniqueness: A UPI is often a good choice because it:
  o Guarantees even data distribution.
  o Eliminates duplicate row checking during a load, which makes it a faster operation.

A NUPI with few duplicate values could provide good (if not perfectly uniform) distribution, and might meet the other criteria better.

• Known Access Paths - Use in value access: Retrievals, updates, and deletes that specify the Primary Index are much faster than those that do not. Because a Primary Index is a known access path to the data, it is best to choose column(s) that will be frequently used for access. For example, the following SQL statement would directly access a row based on the equality WHERE clause:

SELECT * FROM employee WHERE employee_ID = 'ABC456';

A NUPI may be a better choice if the access is based on another, mostly unique column. For example, the table may be used by the Mail Room to track package delivery. In that case, a column containing room numbers or mail stops may not be unique if employees share offices, but may be a better choice for access.

• Join Performance - Use in join access: SQL requests that use a JOIN statement perform the best when the join is done on a Primary Index. Consider Primary Key and Foreign Key columns as potential candidates for Primary Indexes. For example, if the Employee table and the Payroll table are related by the Employee ID column, then the Employee ID column could be a good Primary Index choice for one or both of the tables.

For join performance, a NUPI can be a better choice than a UPI.

• Non-volatile values: Look for columns where the values do not change frequently. For example, in an Invoicing table, the outstanding balance column for all customers probably has few duplicates, but probably changes too frequently to make a good Primary Index. A customer ID, statement number, or other more stable columns may be better choices.

When choosing a Primary Index, try to find the column(s) that best fit these criteria and the business need.

What do you think are key considerations in choosing a Primary Index? (Choose three.)

A. Column(s) containing unique (or nearly unique) values for uniform distribution.
B. Column(s) with values in sequential order for best load and access performance.
C. Column(s) frequently used in queries to access data or to join tables.
D. Column(s) with values that are stable (do not change frequently), to minimize redistribution of table rows.
E. Column(s) with many duplicate values for redundancy.

Partitioned Primary Index

The Teradata Database provides an indexing mechanism called Partitioned Primary Index (PPI). PPI is used to:

• Improve performance for large tables when you submit queries that specify a range constraint.
• Reduce the number of rows to be processed by using a new technique called partition elimination.
• Increase performance for incremental data loads, deletes, and data access when working with large tables with range constraints.
• Instantly drop old data and rapidly add new data.
• Avoid full-table scans without the overhead of a Secondary Index.

How Does PPI Work?

Data distribution with PPI is still based on the Primary Index:

Primary Index -> Hash Value -> Determines which AMP gets the row

With PPI, the ORDER in which the rows are stored on the AMP is affected. Using the traditional method, No Partitioned Primary Index (NPPI), the rows are stored in row hash order.

4 AMPs with Orders Table Defined with NPPI

Using PPI, the rows are stored first by partition and then by row hash. In our example, there are four partitions. Within the partitions, the rows are stored in row hash order.

4 AMPs with Orders Table Defined with PPI on O_Date

Data Storage Using PPI

To store rows using PPI, specify Partitioning in the CREATE TABLE statement. The query will run through the hashing algorithm as normal, and come out with the Base Table ID, the Partition number(s), the Row Hash, and the Primary Index values.
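A minimal sketch of such a CREATE TABLE statement is shown below, partitioning an Orders table by month on its order date with the RANGE_N function. The column names and the date range are assumptions made for illustration.

```sql
-- Rows are still distributed to AMPs by the hash of o_orderid.
-- On each AMP they are stored by monthly partition first, then by row hash.
CREATE TABLE orders (
    o_orderid    INTEGER NOT NULL,
    o_custid     INTEGER NOT NULL,
    o_date       DATE NOT NULL,
    o_totalprice DECIMAL(12,2)
)
PRIMARY INDEX (o_orderid)
PARTITION BY RANGE_N(
    o_date BETWEEN DATE '2019-01-01' AND DATE '2019-12-31'
           EACH INTERVAL '1' MONTH
);
```

The Primary Index here is non-unique because the partitioning column o_date is not part of it; a Unique Primary Index on a PPI table has to include the partitioning columns.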


Access Without a PPI

Let's say you have a table with Store information by Location and did not use a PPI. If you query on Location 3 on this NPPI table, the entire table will be scanned to find records for Location 3 (Full-Table Scan).

QUERY
SELECT * FROM Employee_NPPI
WHERE Location_Number = 3;

PLAN
ALL-AMPs - Full-Table Scan

Access With a PPI

In the same example for a PPI table, you would partition the table with as many Locations as you have (or will soon have). Then if you query on Location 3, each AMP will use partition elimination and each AMP only has to scan partition 3 for the query. This query will run much faster than the Full-Table Scan in the previous example. Also, if you had an NPPI table with a Secondary Index, there would not be a Full-Table Scan, but there would be the overhead of using a Secondary Index, which is not a factor in a PPI table.

QUERY
SELECT * FROM Employee
WHERE Location_Number = 3;

PLAN
ALL-AMPs - Single Partition Scan
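As a sketch of how the partitioned table in this example might be declared, the statement below creates one partition per location number. The table name employee_ppi, the columns and the range 1 to 10 are assumptions. Prefixing the query with EXPLAIN asks the Optimizer to describe its plan, which is one way to check whether a single partition will be read.

```sql
-- One partition per location number, for locations 1 through 10.
CREATE TABLE employee_ppi (
    employee_number INTEGER NOT NULL,
    last_name       VARCHAR(30),
    location_number INTEGER NOT NULL
)
PRIMARY INDEX (employee_number)
PARTITION BY RANGE_N(location_number BETWEEN 1 AND 10 EACH 1);

-- Ask for the plan instead of running the query.
EXPLAIN
SELECT *
FROM employee_ppi
WHERE location_number = 3;
```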

 

Secondary Index

A Secondary Index (SI) is an alternate data access path. It allows you to access the data without having to do a full-table scan. Secondary indexes do not affect how rows are distributed among the AMPs.

You can drop and recreate secondary indexes dynamically, as they are needed. Unlike Primary Indexes, Secondary Indexes are stored in separate subtables that require extra overhead in terms of disk space and maintenance, which is handled automatically by the system. So, Secondary Indexes do require some system resources.

In what instances would it be a good idea to define a Secondary Index for a table? (This information will be covered in this module, but here is a preview.)

• The Primary Index exists for even data distribution and data access, but a Secondary Index is defined to efficiently generate monthly reports based on a different set of columns.
• The Product table is accessed by the retailer (who accesses data based on the retailer's product code column), and by a vendor (who accesses the same data based on the vendor's product code column).
• The table already has a Unique Primary Index, but a second column must also have unique values. The column is specified as a Unique Secondary Index (USI) to enforce uniqueness on the second column.
• All of the above.

Secondary Index Rules

Several rules that govern how Secondary Indexes must be defined and how they function are:

Rule 1: Secondary Indexes are optional.
Rule 2: Secondary Index values can be unique or non-unique.
Rule 3: Secondary Index values can be NULL.
Rule 4: Secondary Index values can be modified.
Rule 5: Secondary Indexes can be changed.
Rule 6: A Secondary Index has a limit of 64 columns.

Rule 1: Optional SI

While a Primary Index is required, a Secondary Index is optional. If one path to the data is sufficient, no Secondary Index need be defined.

You can define 0 to 32 Secondary Indexes on a table for multiple data access paths. Different groups of users may want to access the data in various ways. You can define a Secondary Index for each heavily used access path.

Rule 2: Unique or Non-Unique SI

Like Primary Indexes, Secondary Indexes can be unique or non-unique.

• A Unique Secondary Index (USI) serves two possible purposes:
  o Enforces uniqueness in a column or group of columns. The database will check USIs to see if the values are unique. For example, if you have chosen different columns for the Primary Key and Primary Index, you can make the Primary Key a USI to enforce uniqueness on the Primary Key.
  o Speeds up access to a row (data retrieval speed). Accessing a row with a USI requires one or two AMPs, which is less direct than a UPI (one AMP) access, but more efficient than a full-table scan.
• A Non-Unique Secondary Index (NUSI) is usually specified to prevent full-table scans, in which every row of a table is read. The Optimizer determines whether a full-table scan or NUSI access will be more efficient, then picks the best method. Accessing a row with a NUSI requires all AMPs.

Rule 3: SI Can Be NULL

As with the Primary Index, the Secondary Index column may contain NULL values.

Rule 4: SI Value Can Be Modified

The values in the Secondary Index column may be modified as needed.

Rule 5: SI Can Be Changed

Secondary Indexes can be changed. Secondary Indexes can be created and dropped dynamically as needed. When the index is dropped, the system physically drops the subtable that contained it.
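Creating and dropping a Secondary Index on an existing table is done with CREATE INDEX and DROP INDEX. The sketch below assumes a hypothetical employee table; the column names are illustrative only.

```sql
-- Add a NUSI on last_name to support lookups by name.
CREATE INDEX (last_name) ON employee;

-- Add a USI on a column that must hold unique values.
CREATE UNIQUE INDEX (badge_number) ON employee;

-- Remove the NUSI when it is no longer needed; its subtable is dropped.
DROP INDEX (last_name) ON employee;
```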

Rule 6: SI Has 64-Column Limit

You can designate a Secondary Index that is composed of 1 to 64 columns. To use the Secondary Index below, the user would specify both Budget and Manager Employee Number.

Other Secondary Indexes

Join Index

Join indexes have several uses:

• Define a pre-join table on frequently joined columns (with optional aggregation) without denormalizing the database.
• Create a full or partial replication of a base table with a primary index on a foreign key column to facilitate joins of very large tables by hashing their rows to the same AMP as the large table.
• Define a summary table without denormalizing the database.

You can define a join index on one or several tables. Single-table join index functionality is an extension of the original intent of join indexes, hence the confusing adjective "join" used to describe a single-table join index.
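A pre-join of two tables might be declared as in the sketch below. The employee and department tables, their columns and the index name are hypothetical.

```sql
-- Multi-table join index: stores the joined rows so that queries joining
-- employee to department on department_number need not repeat the join.
CREATE JOIN INDEX employee_department_ji AS
SELECT e.employee_number,
       e.last_name,
       d.department_number,
       d.department_name
FROM employee e, department d
WHERE e.department_number = d.department_number
PRIMARY INDEX (employee_number);
```

Users do not query the join index by name. The Optimizer decides whether to use it when a request against the base tables can be answered from it, and the system maintains it as the base tables change.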

Sparse Index

Any join index, whether simple or aggregate, multi-table or single-table, can be sparse. A sparse join index uses a constant expression in the WHERE clause of its definition to narrowly filter its row population. This is known as a Sparse Join Index.
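For illustration, a sparse single-table join index that keeps only one year of a hypothetical orders table could look like this; the table, columns, index name and date range are all assumptions.

```sql
-- Only rows that satisfy the constant WHERE condition are stored in the
-- index, which keeps it small and cheap to maintain.
CREATE JOIN INDEX orders_2019_ji AS
SELECT o_orderid, o_custid, o_date, o_totalprice
FROM orders
WHERE o_date BETWEEN DATE '2019-01-01' AND DATE '2019-12-31'
PRIMARY INDEX (o_custid);
```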

Hash Index

Hash indexes are used for the same purposes as single-table join indexes. Hash indexes create a full or partial replication of a base table with a primary index on a foreign key column to facilitate joins of very large tables by hashing them to the same AMP.

You can define a hash index on one table only. Hash indexes are not indexes in the usual sense of the word. They are base tables that cannot be accessed directly by a query.

Value-Ordered NUSI

Value-ordered NUSIs are very efficient for range conditions and conditions with an inequality on the secondary index column set. Because the NUSI rows are sorted by data value, it is possible to search only a portion of the index subtable for a given range of key values. Thus, the major advantage of a value-ordered NUSI is in the performance of range queries.

Value-ordered NUSIs have the following limitations:

• The sort key is limited to a single numeric column.
• The sort key column cannot exceed four bytes.
• They count as two indexes against the total of 32 non-primary indexes you can define on a base or join index table.
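A value-ordered NUSI is requested with the ORDER BY VALUES clause of CREATE INDEX. The sketch below assumes a hypothetical orders table with a DATE column o_date, which is stored in four bytes.

```sql
-- NUSI whose subtable rows are kept in o_date order instead of hash order.
CREATE INDEX (o_date) ORDER BY VALUES (o_date) ON orders;

-- A range request that can be satisfied by reading only the matching
-- portion of the index subtable.
SELECT o_orderid, o_date
FROM orders
WHERE o_date BETWEEN DATE '2019-03-01' AND DATE '2019-03-31';
```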

Using Secondary Indexes

In the table below, users will be accessing data based on the Department Name column. The values in that column are unique, so it has been made a USI for efficient access. In addition, the company wants reports on how many departments each manager is responsible for, so the Manager Employee Number can also be made a secondary index. It has duplicate values, so it is a NUSI.
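Declared at table creation time, the two Secondary Indexes in this example might look like the following sketch. The column names, data types and the choice of department_number as the Primary Index are assumptions.

```sql
CREATE TABLE department (
    department_number       INTEGER NOT NULL,
    department_name         VARCHAR(30) NOT NULL,
    budget_amount           DECIMAL(10,2),
    manager_employee_number INTEGER
)
UNIQUE PRIMARY INDEX (department_number),
UNIQUE INDEX (department_name),          -- USI: values are unique
INDEX (manager_employee_number);         -- NUSI: values repeat

-- Access through the USI (one or two AMPs).
SELECT department_number, manager_employee_number
FROM department
WHERE department_name = 'Marketing';

-- Access through the NUSI (all AMPs).
SELECT department_number, department_name
FROM department
WHERE manager_employee_number = 1005;
```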

How Secondary Indexes Are Stored

Secondary indexes are stored in index subtables. The subtables for USIs and NUSIs are distributed differently:

• USI: The Unique Secondary Indexes are hash distributed separately from the data rows, based on their USI value. (As you remember, the base table rows are distributed based on the Primary Index value.) The subtable row may be stored on the same AMP or a different AMP than the base table row, depending on the hash value.
• NUSI: The Non-Unique Secondary Indexes are stored in subtables on the same AMPs as their data rows. This reduces activity on the BYNET and essentially makes NUSI queries an AMP-local operation - the processing for the subtable and base table are done on the same AMP. However, in all NUSI access requests, all AMPs are activated because the non-unique value may be found on multiple AMPs.

Data Access Without a Primary Index

You can submit a request without specifying a Primary Index and still access the data. The following access methods do not use a Primary Index:

• Unique Secondary Index (USI)
• Non-Unique Secondary Index (NUSI)
• Full-Table Scan

Accessing Data with a USI

When a user submits an SQL request using the table name and a Unique Secondary Index, the request becomes a one- or two-AMP operation, as explained below.

USI Access

1. The SQL is submitted, specifying a USI (in this case, a customer number of 56).
2. The hashing algorithm calculates a row hash value (in this case, 602).
3. The hash map points to the AMP containing the subtable row corresponding to the row hash value (in this case, AMP 2).
4. The subtable indicates where the base row resides (in this case, row 778 on AMP 4).
5. The message goes back over the BYNET to the AMP with the row and the AMP accesses the data row (in this case, AMP 4).
6. The row is sent over the BYNET to the PE, and the PE sends the answer set on to the client application.

As shown in the example above, accessing data with a USI is typically a two-AMP operation. However, it is possible that the subtable row and base table row could end up being stored on the same AMP, because both are hashed separately. If both were on the same AMP, the USI request would be a one-AMP operation.

Acce!!i, D#t# (it" # ;USI

  6hen a user submits an S)* re2uest usin# the table name and a 3on&4ni2ueSecondary ndex, the re2uest becomes an all&!M" operation, as explained below.

;USI Acce!!

1. The SQL is submitted, specifying a NUSI (in this case, a last name of "Adams").
2. The hashing algorithm calculates a row hash value for the NUSI (in this case, 567).
3. All AMPs are activated to find the hash value of the NUSI in their index subtables. The AMPs whose subtables contain that value become the participating AMPs in this request (in this case, AMP 1 and AMP 2). The other AMPs discard the message.
4. Each participating AMP locates the row IDs (row hash value plus uniqueness value) of the base rows corresponding to the hash value (in this case, the base rows corresponding to hash value 567 are 640, 222, and 115).
5. The participating AMPs access the base table rows, which are located on the same AMP as the NUSI subtable (in this case, one row from AMP 1 and two rows from AMP 2).
6. The qualifying rows are sent over the BYNET to the PE, and the PE sends the answer set on to the client application (in this case, three qualifying rows are returned).
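The NUSI steps above can be sketched in the same toy style. Again this is a simplification, not Teradata's implementation: `NUM_AMPS`, `amp_for`, and `ToyNusiTable` are hypothetical, `zlib.crc32` stands in for the real hashing algorithm, and the primary index value stands in for the row ID.

```python
import zlib

NUM_AMPS = 4  # hypothetical system size


def amp_for(value) -> int:
    """Stand-in for hashing algorithm + hash map: value -> AMP number."""
    return zlib.crc32(repr(value).encode()) % NUM_AMPS


class ToyNusiTable:
    def __init__(self):
        # Per AMP: base rows keyed by primary index value.
        self.base = [dict() for _ in range(NUM_AMPS)]
        # Per AMP: NUSI subtable mapping NUSI value -> keys of rows on THIS AMP.
        self.nusi = [dict() for _ in range(NUM_AMPS)]

    def insert(self, pi_value, nusi_value, row):
        amp = amp_for(pi_value)  # rows are distributed by primary index
        self.base[amp][pi_value] = row
        # The NUSI subtable entry is kept on the same AMP as its base row.
        self.nusi[amp].setdefault(nusi_value, []).append(pi_value)

    def select_by_nusi(self, nusi_value):
        """Return (qualifying rows, set of participating AMPs)."""
        rows, participating = [], set()
        for amp in range(NUM_AMPS):  # every AMP checks its own subtable
            for pi_value in self.nusi[amp].get(nusi_value, []):
                rows.append(self.base[amp][pi_value])
                participating.add(amp)
        return rows, participating
```

Every AMP is asked, but only AMPs whose local subtable contains the value contribute rows, and each of them reads base rows from its own storage only.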

Accessing Data Without Indexes

In the Teradata Database, you can access data on any column, whether that column is an index or not. You can ask any question, of any data, at any time.

If the request does not use a defined index, the Teradata Database does a full-table scan. A full-table scan is another way to access data without using Primary or Secondary Indexes. In evaluating an SQL request, the Optimizer examines all possible access methods and chooses the one it believes to be the most efficient.

While Secondary Indexes generally provide a more direct access path, in some cases the Optimizer will choose a full-table scan because it is more efficient. A request could turn into a full-table scan when:

• An SQL request searches on a NUSI column with many duplicates. For example, if a request using last names in a Customer database searched on the very prevalent "Smith" in the United States, then the Optimizer may choose a full-table scan to efficiently find all the many matching rows in the result set.

• An SQL request uses a non-equality WHERE clause on an index column. For example, if a request searched an Employee database for all employees whose annual salary is greater than 100,000, then a full-table scan would be used, even if the Salary column is an index. In this example, a full-table scan can be avoided by using an equality WHERE clause on a defined index column.

• An SQL request uses a range WHERE clause on an index column. For example, if a request searched an Employee database for all employees hired between January 2001 and June 2001, then a full-table scan would be used, even if the Hire_Date column is an index.
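One way to see why equality predicates can use a hashed index while non-equality and range predicates fall back to a scan is the small sketch below. It models only a hash-organized index and is not the Optimizer's actual logic; `row_hash`, `build_hash_index`, `select_equal`, and `select_greater` are hypothetical names, and `zlib.crc32` is a stand-in hash.

```python
import zlib
from collections import defaultdict


def row_hash(value) -> int:
    """Stand-in hash; Teradata's real hashing algorithm is not reproduced here."""
    return zlib.crc32(repr(value).encode())


def build_hash_index(rows, column):
    """Group rows by the hash of their index column value."""
    index = defaultdict(list)
    for row in rows:
        index[row_hash(row[column])].append(row)
    return index


def select_equal(index, column, value):
    """Equality predicate: hash the value and read a single bucket."""
    bucket = index.get(row_hash(value), [])
    # Re-check the value in case two different values share a hash.
    return [r for r in bucket if r[column] == value], "index lookup"


def select_greater(rows, column, bound):
    """Non-equality predicate: there is no single value to hash,
    so every row has to be examined."""
    return [r for r in rows if r[column] > bound], "full-table scan"
```

With an equality condition there is exactly one value to hash; with `salary > 100000` the qualifying values are unknown in advance, and hash order says nothing about value order.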

For all requests, you must specify a value for each column in the index or the Teradata Database will do a full-table scan. A full-table scan is an all-AMP operation. Every data block must be read and each data row is accessed only once. As long as the choice of Primary Index has caused the table rows to distribute evenly across all of the AMPs, the parallel processing of the AMPs working simultaneously can accomplish the full-table scan quickly. However, if a Primary Index causes skewed data distribution, all-AMP operations will take longer.

While full-table scans are impractical and even disallowed on some commercial database systems, the Teradata Database routinely permits ad-hoc queries with full-table scans.
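The effect of skew on an all-AMP scan can be illustrated with simple arithmetic. The sketch below assumes, as a simplification, that every AMP scans its own rows at the same constant rate and that the scan finishes when the busiest AMP finishes; `scan_time`, the rate, and the row counts are made up for illustration.

```python
def scan_time(rows_per_amp, rows_per_second=1.0):
    """Elapsed time of a parallel scan: the slowest (fullest) AMP decides."""
    if not rows_per_amp:
        return 0.0
    return max(rows_per_amp) / rows_per_second


even = [250, 250, 250, 250]    # 1,000 rows spread evenly over 4 AMPs
skewed = [700, 100, 100, 100]  # the same 1,000 rows, badly skewed

print(scan_time(even))    # 250.0
print(scan_time(skewed))  # 700.0
```

Both layouts hold the same number of rows, yet in this model the skewed layout takes 2.8 times as long because three AMPs sit idle while one does most of the work.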

Summary of Keys and Indexes

Some fundamental differences between Keys and Indexes are shown below:

Keys | Indexes
A relational modeling convention used in a logical data model. | A Teradata Database mechanism used in a physical database design.
Uniquely identify a row (Primary Key). | Used for row distribution (Primary Index).
Establish relationships between tables (Foreign Key). | Used for row access (Primary Index and Secondary Index).

While most commercial database systems use the Primary Key as a way to retrieve data, a Teradata Database system does not. In a Teradata Database system, you use the Primary Key only when designing a database, as a mechanism for maintaining referential integrity according to relational theory. The Teradata Database itself does not require keys in order to manage the data, and can function fully with no awareness of Primary Keys.

The Teradata Database's parallel architecture uses Primary Indexes to distribute and access the data rows. A Primary Index is always required when creating a Teradata Database table.

A Primary Index may include the same columns as the Primary Key, but does not have to. In some cases, you may want the Primary Key and Primary Index to be different. For example, a credit card account number may be a good Primary Key, but customers may prefer to use a different kind of identification to access their accounts.

Rules for Keys and Indexes

A summary of the rules for keys (in the relational model) and indexes (in the Teradata Database) is shown below.

Rule | Primary Key | Foreign Key | Primary Index | Secondary Index
1 | One PK | Multiple FKs | One PI | 0 to 32 SIs
2 | Unique values | Unique or non-unique | Unique or non-unique | Unique or non-unique
3 | No NULLs | NULLs allowed | NULLs allowed | NULLs allowed
4 | Values should not change | Values may be changed | Values may be changed (redistributes row) | Values may be changed
5 | Column should not change | Column should not change | Column cannot be changed (drop and recreate table) | Index may be changed (drop and recreate index)
6 | No column limit | No column limit | 64-column limit | 64-column limit
7 | n/a | FK must exist as PK in the related table | n/a | n/a

Defining Primary and Foreign Keys in the Teradata Database

Although Primary Indexes are required and Primary Keys are not, you do have the option to define a Primary Key or Foreign Key for any table. When you define a Primary Key in a Teradata Database table, the RDBMS will implement the specified column(s) as an index. Because a Primary Key requires unique values, a defined Primary Key is implemented as one of the following:

• Unique Primary Index (if the DBA did not specify the Primary Index in the CREATE TABLE statement)

• Unique Secondary Index (if columns other than the Primary Index are chosen)
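The two cases above can be written down as a small decision function. This is a hypothetical sketch of the rule as stated in the text, not Teradata's DDL processing; `implement_primary_key` and `MAX_INDEX_COLUMNS` are made-up names, and the case where the Primary Key columns match a declared Primary Index is left unhandled because the text does not describe it.

```python
MAX_INDEX_COLUMNS = 64  # index column limit stated in the rules summary


def implement_primary_key(pk_columns, declared_primary_index=None):
    """Return the kind of index a declared PRIMARY KEY becomes in this toy model."""
    if len(pk_columns) > MAX_INDEX_COLUMNS:
        # Once a key is implemented as an index, index rules apply to it.
        raise ValueError("a Primary Key implemented as an index is limited to 64 columns")
    if declared_primary_index is None:
        # No Primary Index in CREATE TABLE: the key becomes the UPI.
        return "UNIQUE PRIMARY INDEX"
    if set(pk_columns) != set(declared_primary_index):
        # A different Primary Index was declared: the key becomes a USI.
        return "UNIQUE SECONDARY INDEX"
    raise NotImplementedError("case not covered by the text")
```

For example, `implement_primary_key(["cust_id"])` falls into the first case, while `implement_primary_key(["cust_id"], declared_primary_index=["acct_no"])` falls into the second; both column names are invented for the example.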

When a Primary Key is defined in Teradata SQL and implemented as an index, the rules that govern that type of index now apply to the Primary Key. For example, in relational theory, there is no limit to the number of columns in a Primary Key. However, if you specify a Primary Key in Teradata SQL, the 64-column limit for indexes now applies to that Primary Key.

THANK YOU