Tera-Tom Tera-Cram for Teradata Basics V12: Understanding is the Key!
by Tom Coffing Coffing Data Warehousing. (c) 2011. Copying Prohibited.
Reprinted for Mausam Upadhyay, Accenture
Reprinted with permission as a subscription benefit of Skillport, http://skillport.books24x7.com/
All rights reserved. Reproduction and/or distribution in whole or in part in electronic, paper or other forms without written permission is prohibited.
Chapter 1: The Teradata Architecture
The Teradata Architecture
Let me once again explain the rules. Teradata rules!
Hello friend! My name is Tom Coffing and I am going to guide you through the certification process. I have been writing about Teradata for over 15 years and I have developed a certification track that is second to none for you. The first goal is to get you through the Teradata V12 Basics test. To do so all you need to do is read this book. But I also have some wonderful surprises for you. The first surprise is that I have also developed easy to learn and fun to watch Videos. You will see links to the video in the book. If you are reading this book electronically you will be able to click on the link to see the videos of the subject you are currently learning. If you are reading this book from paper you can place the link in your web browser to see the video.
The second surprise I have is that you can take practice tests for the basics, but even better is that I have created a Video Game that will allow you to challenge yourself on what you have learned. The game blows up once you miss three questions and you must start over. Pass all three levels of the game and you know you are ready for the test. To play the Tera-Tom Certification game all you have to do is download the Nexus software from our website at www.CoffingDW.com, connect to a Teradata system, click on the button that says DBA and start playing.
Let's get started. Teradata relies on three architectural components that have set the rules for parallel processing. They are the Parsing Engine, which is also called the PE or the Optimizer; the Access Module Processors, which are referred to as the AMPs; and two BYNETs to communicate between PEs and AMPs.
The PE is the boss and tells the AMPs exactly what to do. The AMPs each have their own virtual disk, which no other AMP can read, and they merely read and write to their respective disks.
When a user logs on to Teradata, their logon is accepted or rejected by a Parsing Engine. The Parsing Engine will take care of that user for the entire session, which really means until that user logs off.
The Parsing Engine will accept each query from that user and come up with a plan for the AMPs to satisfy the request. The PE's plan is passed to the AMPs via the BYNET. The AMPs will retrieve the data requested from their virtual disks and pass it back up the BYNET to the PE. The PE will then deliver the data to the user.
The Parsing Engine
Fall seven times, stand up eight.
The Parsing Engine never falls seven times, but it can handle 120 stand-up sessions.
The Parsing Engines are perfectly balanced, with each having the capability to handle up to 120 users at a time. This could be 120 distinct users or a single user utilizing the power of all 120 sessions for a single application. That is why there are multiple PEs in every Teradata system. Each PE has total command over every AMP.
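To make the arithmetic concrete, here is a minimal sketch that estimates how many Parsing Engines would be needed for a given number of concurrent sessions, assuming only the 120-session limit per PE stated above. The function name and the sample session counts are hypothetical and chosen just for illustration.

```python
import math

SESSIONS_PER_PE = 120  # per-PE session limit quoted in the text


def min_parsing_engines(concurrent_sessions: int) -> int:
    """Smallest number of PEs able to host the given number of sessions."""
    if concurrent_sessions < 0:
        raise ValueError("session count cannot be negative")
    # Assumption for this sketch: report at least one PE even with zero sessions.
    return max(1, math.ceil(concurrent_sessions / SESSIONS_PER_PE))


print(min_parsing_engines(120))  # 1
print(min_parsing_engines(121))  # 2
print(min_parsing_engines(500))  # 5
```

Whether the sessions belong to 500 different users or to a handful of applications that each open many sessions makes no difference to this count.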
Divided they stand (PEs) and United are the AMPs!
Each PE will take the user's SQL and do three things:
The PE will check the user's SQL syntax. If there is a syntax error the user will receive an error. For example, if the user wanted to use the keyword SELECT and instead wrote SLLLECCCT the PE would reject the SQL, but be kind enough to send the user a message to help them correct the error. That's because the PEs are Stand-up guys!
If the SQL passes the syntax check the PE will check the user's ACCESS RIGHTS to ensure the user has permission to access the data in that table. If not then the user receives a message: ACCESS Denied!
If the user passes the Security Check then the Parsing Engine will come up with a PLAN to satisfy the user request. The fastest plan is a Single-AMP retrieve. The second fastest plan is a Two-AMP retrieve. The next fastest plan will be all AMPs reading only a portion of the table, and the slowest plan is the full table scan. That is where each AMP reads every row it contains for a table.
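The three checks above can be sketched as a small function. This is a toy illustration, not how the real Parsing Engine is implemented: the names `parse_request`, `Plan`, `SqlSyntaxError`, `AccessDenied`, and `KNOWN_KEYWORDS` are hypothetical, the syntax check only looks at the first word, and the list of plans that could answer the query is simply handed in by the caller rather than worked out by an optimizer.

```python
from enum import IntEnum


class Plan(IntEnum):
    """Plan types from the text; a lower value means a faster plan."""
    SINGLE_AMP = 1
    TWO_AMP = 2
    ALL_AMP_PARTIAL = 3
    FULL_TABLE_SCAN = 4


class SqlSyntaxError(Exception):
    """Raised when the toy syntax check fails."""


class AccessDenied(Exception):
    """Raised when the user has no rights on the table."""


KNOWN_KEYWORDS = {"SELECT"}  # toy keyword list for this sketch


def parse_request(sql, user, table, access_rights, possible_plans):
    # 1. Syntax check: this sketch only inspects the first word.
    words = sql.split()
    first = words[0].upper() if words else ""
    if first not in KNOWN_KEYWORDS:
        raise SqlSyntaxError(f"unrecognized keyword: {first!r}")
    # 2. Access rights check.
    if table not in access_rights.get(user, set()):
        raise AccessDenied(f"{user} may not access {table}")
    # 3. Plan: pick the fastest plan among those that could answer the query.
    return min(possible_plans)


rights = {"tom": {"Employee"}}
plan = parse_request("SELECT * FROM Employee", "tom", "Employee",
                     rights, [Plan.FULL_TABLE_SCAN, Plan.SINGLE_AMP])
print(plan.name)  # SINGLE_AMP
```

Passing `SLLLECCCT * FROM Employee` raises `SqlSyntaxError` before the access rights are ever looked at, which mirrors the order of the checks in the text.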
Not all who wander are lost.
J. R. R. Tolkien
The AMPs are never lost because the PE always tells them what to do. One PE to rule them all? No! Each PE rules them all because the rows of every table are spread across all the AMPs. The AMPs organize every table in separate blocks just like you might organize your clothes in separate dresser drawers. Organizing their tables and the rows they contain is an obsession with the AMPs. They make organization a hobbit!
The PE passes the PLAN to the AMPs over the BYNET. The AMPs then retrieve the rows they own from their disks and pass them back to the PE over the BYNET.
When a table is first created each AMP creates a table header on its disk. Even though the table is empty the AMPs at least know the table name, the columns in the table, and any indexes on the table.
When the table is loaded each AMP receives rows for that table that they and only they own. They carefully place the rows inside data blocks where they can easily be retrieved.
Now each AMP will own their own Table Header for the table and they will also own data blocks where they place the rows for that table. Now the AMP is truly Lord of the Disks!
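One way to picture what a single AMP keeps on its disk is the sketch below. It is a simplified model under stated assumptions: the class names `TableHeader` and `Amp`, the method names, and the tiny `rows_per_block` capacity are all hypothetical, and real data blocks are not sized by a fixed row count like this.

```python
from dataclasses import dataclass, field


@dataclass
class TableHeader:
    """What an AMP knows about a table even while it is empty."""
    name: str
    columns: list
    indexes: list


@dataclass
class Amp:
    amp_id: int
    headers: dict = field(default_factory=dict)      # table name -> TableHeader
    data_blocks: dict = field(default_factory=dict)  # table name -> list of blocks

    def create_table(self, name, columns, indexes):
        # Creating a table writes a header but no rows yet.
        self.headers[name] = TableHeader(name, list(columns), list(indexes))
        self.data_blocks[name] = []

    def insert_row(self, table, row, rows_per_block=2):
        if table not in self.headers:
            raise KeyError(f"AMP {self.amp_id} has no header for {table}")
        blocks = self.data_blocks[table]
        # Start a new block when there is none yet or the last one is full.
        if not blocks or len(blocks[-1]) >= rows_per_block:
            blocks.append([])
        blocks[-1].append(row)


amp = Amp(amp_id=0)
amp.create_table("Employee", ["emp_no", "name"], ["emp_no"])
for n in (1, 2, 3):
    amp.insert_row("Employee", {"emp_no": n, "name": f"employee {n}"})
print(len(amp.data_blocks["Employee"]))  # 2
```

Every AMP in the system would hold its own copy of the header and its own blocks, containing only the rows that AMP owns.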
Born to be Parallel
Only he who attempts the ridiculous may achieve the impossible.
The concept of parallel processing back in 1979 was almost as outrageous as attempting to go to the moon, but Teradata attempted the ridiculous and the impossible was achieved. Teradata took every table and spread the rows across all the AMPs in the system and the birth of parallel processing happened.
You will never see a Teradata table that is only on one AMP, because the parallel processing advantage would then be lost. Every Teradata table spreads its rows across all AMPs. Teradata was born to be parallel and the impossible was born.
The first picture on the opposite page never happens. The second picture below that is exactly the design behind Teradata.
Teradata NEVER lays out data like this!
Teradata lays out data like this!
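The layout described here, with every table's rows spread over all the AMPs, can be sketched as follows. This section does not say how a row's owning AMP is chosen, so the sketch assumes a simple stand-in: a CRC32 checksum of one key column, taken modulo the number of AMPs. The function names, the `emp_no` column, and the four-AMP system are hypothetical.

```python
import zlib


def owning_amp(key, amp_count):
    """Map a key to one AMP; the same key always lands on the same AMP."""
    return zlib.crc32(str(key).encode("utf-8")) % amp_count


def spread_rows(rows, key_column, amp_count):
    """Return one list of rows per AMP."""
    amps = [[] for _ in range(amp_count)]
    for row in rows:
        amps[owning_amp(row[key_column], amp_count)].append(row)
    return amps


rows = [{"emp_no": n, "name": f"employee {n}"} for n in range(1, 1001)]
amps = spread_rows(rows, "emp_no", amp_count=4)
for amp_id, owned in enumerate(amps):
    print(f"AMP {amp_id} owns {len(owned)} rows")
```

Each row ends up on exactly one AMP, and no AMP needs to know what the others hold, which is what lets all of them work on the same table at once.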
Every table spreads its rows over the AMPs
A Journey of a thousand miles begins with a single step.
The Parsing Engine passes a plan in Steps to the AMPs and the AMPs merely follow the plan. Those steps are passed over the BYNET. A Journey of a single step can be transferred to a thousand AMPs in a millisecond over the BYNET.
The BYNET is the communication network between AMPs and PEs. The PE comes up with a PLAN and passes the plan to the AMPs in steps over the BYNET. This step and all the steps of the plan travel down the BYNET highway which guarantees delivery to each AMP.
The AMPs then retrieve the data requested by the PE and they deliver their portion of the answer set to the PE over the BYNET.
The BYNET provides the communications between AMPs and PEs so no matter how large the data warehouse physically gets, the BYNET makes each AMP and PE think that they are right next to one another. The BYNET gets its name from the Banyan tree. The Banyan tree has the ability to continually plant new roots to grow forever. Likewise, the BYNET scales as the Teradata system grows in size. The BYNET is scalable.
There are always two BYNETs for redundancy and extra bandwidth. AMPs and PEs can use both BYNETs to send and receive data simultaneously. What a network! It is like having two phone lines to talk. Each AMP or PE can use one BYNET to send messages and simultaneously accept messages using the other BYNET. Both BYNETs can be used to send a message or to receive a message!
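The redundancy and extra bandwidth of two BYNETs can be pictured with the toy model below. It is only a sketch of the idea, not the real BYNET protocol: the class name `DualBynet`, the alternating choice of link, and the `ConnectionError` on total failure are all assumptions made for illustration.

```python
class DualBynet:
    """Two links: traffic alternates between them and survives one failure."""

    def __init__(self):
        self.up = {0: True, 1: True}  # health of BYNET 0 and BYNET 1
        self._next = 0                # link to try first on the next send

    def send(self, message):
        """Return the index of the BYNET that carried the message."""
        for offset in range(2):
            link = (self._next + offset) % 2
            if self.up[link]:
                # Prefer the other link next time to share the load.
                self._next = (link + 1) % 2
                return link
        raise ConnectionError("both BYNETs are down")


net = DualBynet()
print([net.send("step") for _ in range(4)])  # [0, 1, 0, 1]
net.up[0] = False                            # simulate losing one BYNET
print([net.send("step") for _ in range(2)])  # [1, 1]
```

With both links healthy the messages are shared between them, and with one link down the other quietly carries everything.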
Below are the steps to completely satisfy a query.
• The PE checks the user's SQL syntax;
• The PE checks the user's security rights;
• The PE comes up with a plan for the AMPs to follow;
• The PE passes the plan along to the AMPs over the BYNET;
• The AMPs follow the plan and retrieve the data requested;
• The AMPs pass the data to the PE over the BYNET; and
• The PE then passes the final data to the user.
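The second half of these steps, where the plan goes out to the AMPs and the partial answers come back, can be sketched as below. This is an illustration under assumptions, not Teradata's implementation: a thread pool stands in for the BYNET, each AMP is just a list of the rows it owns, and the names `amp_retrieve` and `run_query` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def amp_retrieve(amp_rows, predicate):
    """One AMP scans only the rows it owns and returns its part of the answer."""
    return [row for row in amp_rows if predicate(row)]


def run_query(amps, predicate):
    """PE side: send the same step to every AMP, then merge the partial answers."""
    # The thread pool is a stand-in for the BYNET in this sketch.
    with ThreadPoolExecutor(max_workers=len(amps)) as bynet:
        partials = bynet.map(amp_retrieve, amps, [predicate] * len(amps))
        return [row for part in partials for row in part]


# Three AMPs, each owning a different slice of the same table.
amps = [
    [{"emp": 1, "dept": 10}, {"emp": 5, "dept": 20}],
    [{"emp": 2, "dept": 20}],
    [{"emp": 3, "dept": 10}, {"emp": 4, "dept": 30}],
]
answer = run_query(amps, lambda row: row["dept"] == 10)
print(sorted(row["emp"] for row in answer))  # [1, 3]
```

The AMPs work at the same time on their own rows, and only the PE sees the combined answer set that goes back to the user.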
Watch the Tera-Tom Video on Architecture