Teradata TPump

Date posted: 28-Apr-2015

Teradata Utilities: TPump

Reprinted for KV Satish Kumar, IBM, kvskumar@in.ibm.com. Reprinted with permission as a subscription benefit of Books24x7, http://www.books24x7.com/

Table of Contents

Chapter 5: TPump
  Overview
  Why it is Called "TPump"
  TPump Has Many Unbelievable Abilities
  TPump Has Some Limits
  Supported Input Formats
  TPump Commands and Parameters
  LOAD Parameters IN COMMON with MultiLoad
  .BEGIN LOAD Parameters UNIQUE to TPump
  TPUMP Example
  Creating a Flatfile for our Tpump Job to Utilize
  Creating a Tpump Script
  Executing the Tpump Script
  TPump Script with Error Treatment Options
  A TPump Script that Uses Two Input Data Files
  A TPump UPSERT Sample Script
  Monitoring TPump
  Handling Errors in TPump Using the Error Table
  One Error Table
  Common Error Codes and What They Mean
  RESTARTing TPump
  TPump and MultiLoad Comparison Chart

Chapter 5: TPump

"Diplomacy is the art of saying 'Nice Doggie' until you can find a rock."
Will Rogers

Overview

The chemistry of relationships is very interesting. Frederick Buechner once stated, "My assumption is that the story of any one of us is in some measure the story of us all." In this chapter, you will find that TPump has similarities with the rest of the family of Teradata utilities. But this newer utility has been designed with fewer limitations and many distinguishing abilities that the other load utilities do not have.

Do you remember the first Swiss Army knife you ever owned? Aside from its original intent as a compact survival tool, this knife has thrilled generations with its multiple capabilities. TPump is the Swiss Army knife of the Teradata load utilities. Just as this knife was designed for small tasks, TPump was developed to handle batch loads with low volumes. And, just as the Swiss Army knife easily fits in your pocket when you are loaded down with gear, TPump is a perfect fit when you have a large, busy system with few resources to spare. Let's look in more detail at the many facets of this amazing load tool.

Why it is Called "TPump"

TPump is the shortened name for the load utility Teradata Parallel Data Pump. To understand this, you must know how the load utilities move the data. Both FastLoad and MultiLoad assemble massive volumes of data rows into 64K blocks and then move those blocks. Picture in your mind the way that huge ice blocks used to be floated down long rivers to large cities prior to the advent of refrigeration. There they were cut up and distributed to the people.

TPump does NOT move data in large blocks. Instead, it loads data one row at a time, using row hash locks. Because it locks at this level, and not at the table level like MultiLoad, TPump can make many simultaneous, or concurrent, updates on a table.

Envision TPump as the water pump on a well. Pumping in a very slow, gentle manner results in a steady trickle of water that could be pumped into a cup. But strong and steady pumping results in a powerful stream of water that would require a larger container. TPump is a data pump which, like the water pump, may allow either a trickle-feed of data to flow into the warehouse or a strong and steady stream. In essence, you may "throttle" the flow of data based upon your system and business user requirements. Remember, TPump is THE PUMP!
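The difference between moving blocks and pumping rows can be pictured with a small Python sketch. This is only an illustration of the two delivery styles described above, not a model of how Teradata implements them. The function names and the sample row size are assumptions made for the example; the 64K block size is the figure cited in the text.

```python
from typing import Iterable, Iterator, List

BLOCK_BYTES = 64 * 1024  # block size cited in the text for FastLoad and MultiLoad


def assemble_blocks(rows: Iterable[bytes],
                    block_bytes: int = BLOCK_BYTES) -> Iterator[List[bytes]]:
    """Group rows into blocks of at most block_bytes (block-utility style)."""
    block: List[bytes] = []
    size = 0
    for row in rows:
        # Close the current block when the next row would overflow it.
        if block and size + len(row) > block_bytes:
            yield block
            block, size = [], 0
        block.append(row)
        size += len(row)
    if block:
        yield block


def pump_rows(rows: Iterable[bytes]) -> Iterator[bytes]:
    """Hand rows over one at a time (TPump style)."""
    for row in rows:
        yield row


# Hypothetical input: 200 rows of 1,000 bytes each.
rows = [b"x" * 1000 for _ in range(200)]

blocks = list(assemble_blocks(rows))
rows_pumped = sum(1 for _ in pump_rows(rows))

print("units moved as blocks:", len(blocks))
print("units moved as rows:  ", rows_pumped)
```

The block path moves a handful of large units, while the row path moves every row individually, which is what makes row-level locking and a controllable flow rate possible.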

TPump Has Many Unbelievable Abilities

Just in Time: Transactional systems, such as those implemented for ATM machines or Point-of-Sale terminals, are known for their tremendous speed in executing transactions. But how soon can you get the information pertaining to that transaction into the data warehouse? Can you afford to wait until a nightly batch load? If not, then TPump may be the utility that you are looking for! TPump allows the user to accomplish near real-time updates from source systems into the Teradata data warehouse.

Throttle-switch Capability: What about the throttle capability that was mentioned above? With TPump you may stipulate how many updates may occur per minute. This is also called the statement rate. In fact, you may change the statement rate during the job, "throttling up" the rate with a higher number, or "throttling down" the number of updates with a lower one. An example: Having this capability, you might want to throttle up the rate during the period from 12:00 noon to 1:30 PM when most of the users have gone to lunch. You could then lower the rate when they return and begin running their business queries. This way, you need not have such clearly defined load windows, as the other utilities require. You can have TPump running in the background all the time, and just control its flow rate.

Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition
Reprinted for ibmkvskumar@in.ibm.com, IBM Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited

DML Functions: Like MultiLoad, TPump does DML functions, including INSERT, UPDATE and DELETE. These can be run solo, or in combination with one another. Note that, like MultiLoad, it also supports UPSERTs. But here is one place that TPump differs vastly from the other utilities: FastLoad can only load one table and MultiLoad can load five tables. But, when it pulls data from a single source, TPump can load more than 60 tables at a time! And the number of concurrent instances in such situations is unlimited. That's right, not 15, but unlimited for Teradata! Well, OK, maybe limited by your computer. I cannot imagine my laptop running 20 TPumps, but Teradata does not care.

How could you use this ability? Well, imagine partitioning a huge table horizontally into multiple smaller tables and then performing various DML functions on all of them in parallel. Keep in mind that TPump places no limit on the number of jobs that may be established. Now, think of ways you might use this ability in your data warehouse environment. The possibilities are endless.

More benefits: Just when you think you have pulled out all of the options on a Swiss Army knife, there always seems to be just one more blade or tool you had not noticed. Similar to the knife, TPump always seems to have another advantage in its list of capabilities. Here are several that relate to TPump requirements for target tables. TPump allows both Unique and Non-Unique Secondary Indexes (USIs and NUSIs), unlike FastLoad, which allows neither, and MultiLoad, which allows just NUSIs. Like MultiLoad, TPump allows the target tables to either be empty or to be populated with data rows. Tables allowing duplicate rows (MULTISET tables) are allowed. Besides this, Referential Integrity is allowed and need not be dropped. As to the existence of Triggers, TPump says, "No problem!"
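The idea of feeding several horizontally partitioned tables from a single source can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not TPump syntax: the record layout, the `sales_part_N` table names, and the modulo routing rule are all hypothetical and chosen only to show one input stream fanning out to several targets.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Tuple[int, str]  # (customer_id, payload): a hypothetical record layout


def target_table(customer_id: int, partitions: int) -> str:
    """Pick a hypothetical target table name from the row key."""
    return f"sales_part_{customer_id % partitions}"


def route(rows: Iterable[Row], partitions: int) -> Dict[str, List[Row]]:
    """Split one input stream into per-table lists of rows."""
    routed: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        routed[target_table(row[0], partitions)].append(row)
    return dict(routed)


# One source, three hypothetical target tables.
source = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
routed = route(source, partitions=3)

for table in sorted(routed):
    print(table, routed[table])
```

Each per-table list could then receive its own INSERT, UPDATE or DELETE, which is the kind of parallel work across smaller tables that the text has in mind.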
Support Environment compatibility: The Support Environment (SE) works in tandem with TPump to enable the operator to have even more control in the TPump load environment. The SE coordinates TPump activities, assists in managing the acquisition of files, and aids in the processing of conditions for loads. The Support Environment aids in the execution of DML and DDL that occur in Teradata, outside of the load utility.

Stopping without Repercussions: Finally, this utility can be stopped at any time and all of its locks may be dropped with no ill consequences.

Is this too good to be true? Are there no limits to this load utility? TPump does not like to steal any thunder from the other load utilities, but it just might become one of the most valuable survival tools for businesses in today's data warehouse environment.
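The throttle-switch capability described in this section comes down to simple arithmetic: a statement rate per minute fixes how long the pump waits between statements. The Python sketch below works through the lunch-hour example from the text. The two rate values and the function names are assumptions made for illustration; they are not TPump parameters or defaults.

```python
from datetime import time

# Hypothetical statement rates (statements per minute); not taken from the text.
LUNCH_RATE = 6000
BUSINESS_RATE = 600

# The lunch window used as the example in the text: 12:00 noon to 1:30 PM.
LUNCH_START = time(12, 0)
LUNCH_END = time(13, 30)


def statement_rate(now: time) -> int:
    """Throttle up during the lunch window, throttle down otherwise."""
    if LUNCH_START <= now < LUNCH_END:
        return LUNCH_RATE
    return BUSINESS_RATE


def seconds_between_statements(rate_per_minute: int) -> float:
    """Spacing between statements needed to hold a per-minute rate."""
    return 60.0 / rate_per_minute


for clock in (time(9, 0), time(12, 15), time(13, 30)):
    rate = statement_rate(clock)
    gap = seconds_between_statements(rate)
    print(f"{clock}: {rate} statements/minute, one every {gap:.3f} s")
```

A tenfold increase in the statement rate shrinks the gap between statements by the same factor, so the load can keep running all day while only its pace changes.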

TPump Has Some Limits

TPump has rightfully earned its place as a superstar in the family of Teradata load utilities. But this does not mean that it has no limits. It has a few that we will list here for you:

Rule #1: No concatenation of input data files is allowed. TPump is not designed to support this.

Rule #2: TPump will not process aggregates, arithmetic functions or exponentiation. If you need data conversions or mat