INDEX [ ] · PDF fileData Flow detail processing ETL, ... of SSIS 64-bit issues, 790–791...



INDEX

Symbols

“ “ (double quotes), building strings using, 175

+ (string concatenation), 177, 186–187
== (equivalence operator), 175
[ ] (bracket characters), qualification of column names using, 181–182

Numbers

32-bit mode
  DTExecUI and, 790
  Visual Studio and, 791
  Windows OSs in, 474
64-bit mode, 790–791
80/20 rule (Pareto principle), 228

A

absolute references, environment references, 765

Access (Microsoft)
  64-bit support in, 415–417
  accessing source data from, 414–415, 421–427
  referencing columns in expressions within, 181
accessibility, UI design principles, 667
ACE (Access Engine), for Microsoft Office, 415–417
ACE OLE DT Provider, 415
ACH (Automated Clearing House) files
  Control Flow batch creation, 850–853
  Control Flow loop, 846–848
  Control Flow retrieval of XML file size, 848–850
  Data Flow capturing total batch items, 859–860
  Data Flow detail processing ETL, 860–861
  Data Flow parsing and error handling, 854–856
  Data Flow validation, 853–854, 856–859
  input file specification, 800
  as load package, 845
  package structure, 801
  payments via, 806–807
  setting up, 845–846
  solution architecture, 803
AcquireConnection method
  adding connection time methods to components, 595
  building Destination adapter component, 627–628
  building Source adapter component, 604–606
  defined, 272
  retrieving data from database, 273–274
  retrieving files from FTP Server, 274–275
active time, Data Flow components, 506–507
administration, of SSIS
  64-bit issues, 790–791
  basic reporting, 791–795
  catalog and, 743–744
  clustering, 768–770
  command-line utilities, 774

COPYRIGHTED MATERIAL


administration, of SSIS (continued)
  creating central server, 766–768
  creating database (SSISDB), 747–748
  custom reporting, 795
  data taps, 765–766
  deployment models, 748
  DTExec, 774
  DTExecUI, 775–780
  DTUtil, 780–782
  environments, 760–765
  legacy security, 785–787
  monitoring package execution, 791
  overview of, 743
  package configuration, 770–773
  package deployment, 751–757
  performance counters, 796
  project deployment, 748–751
  scheduling packages, 787–790
  securing catalog, 782–785
  setting catalog properties, 744–747
  ssis_admin role, 782–783
  summary, 796
  T-SQL for managing security, 785
  T-SQL for package execution, 757–758
  T-SQL for setting parameter values, 758–759
  T-SQL querying tables to set parameter values, 759–760
administrators, Management Studio and, 36–37
ADO
  coding SQL statement property, 76
  executing parameterized SQL statements, 69–71
  populating recordsets, 117
ADO.NET
  coding SQL statement property, 76
  Connection Manager, 595, 851–852
  creating connection for CDC tools, 399
  executing parameterized SQL statements, 69–71
  outputting Analysis Services results to, 46
  sorting data with SQL Server, 383
  source in Data Flow, 11, 115
Advanced Editor
  design-time functionality and, 589
  Import Column Transformation using, 142–143
  OLE DB Command Transformation using, 146–147
  transformation outputs and, 497–498
  user interface as alternative to, 643
  user interface overriding, 651
  viewing components with, 666
Advanced Windowing Extensions (AWE), 475
AES (Advanced Encryption Standard), 746
Aggregate Transformation
  asynchronous transformation outputs and, 498
  as blocking transformation, 496–497
  in Data Flow, 119–121
  example using, 159
Agile
  iterative development, 525
  MSF Agile, 537–539
All Executions report, 792–794
Analysis Services. See SSAS (SQL Server Analysis Services)
ANDS, in data extraction, 381–382
annotations, on packages, 32, 805
Application object
  maintaining, 683
  operations of, 682
  package management and, 683–686
  package monitoring and, 686–687
applications, interaction with external. See external applications, interaction with
architecture
  data architecture, 805–806
  scaling out, 474
  of SSIS, 5
archiving files
  creating dynamic packages, 251–252
  overview of, 52
artifacts, in SDLC, 523
ASP.NET, 727–731
assemblies
  adding to GAC, 598–602


  creating new projects, 597
  example using custom .NET, 261–264
  strong names, 646–647, 651–652
  using managed, 260–261
asynchronous transformations
  identifying, 493, 500
  vs. synchronous transformations, 119, 498–500
  writing Script components to act as, 302–305
Audit Transformation
  in Data Flow, 128–129
  handling more bad data with, 248
auditing, SSIS database, 791
authentication
  types supported, 782
  Windows Authentication and, 18
Automated Clearing House files. See ACH (Automated Clearing House) files
Autos window, script debugging using, 309–310
AWE (Advanced Windowing Extensions), 475

B

backpressure
  in SSIS 2012, 488
  staging environments for source, 512
bad data, handling, 471–473
bank file package
  Control Flow batch creation, 828–832
  Control Flow file loop, 824–825
  Control Flow retrieval of file properties, 825–828
  Data Flow capturing total batch items, 840
  Data Flow detail processing ETL, 841–845
  Data Flow parsing and error handling, 832–835
  Data Flow validation, 832, 835–839
  flat files, 801
  setting up, 819–823
BaseSelect variable, using expressions in Data Flow, 198–199
batch operations
  ACH file package, 850–853
  bank file package, 828–832
  BankBatch table, 813–814
  BankBatchDetail table, 813–814
  batch entities in case study database, 809
  Data Flow capturing total batch items, 840, 859–860
  executing batch of SQL statements, 71–72
  stored procedures for adding, 816–817
  stored procedures for balancing, 818–819
  stored procedures for updating, 818
  stored procedures for working with, 816–819
bcp.exe, inserting data into SQL Server database, 64–65
Beginning C# 3.0: An Introduction to Object Oriented Programming (Purdum), 254
benchmarks, 796
BI (Business Intelligence) platform, 1
BI xPress, Pragmatic Works, 791
BIDS (Business Intelligence Development Studio), 4
BLOB (Binary Large Objects) counters, Performance Monitor, 519–520
blocking transformations
  Data Flow design practices, 508–510
  non-blocking, streaming, and row-based transformations, 493–495
  optimizing package processing and effects of, 516
  overview of, 496–497
  semi-blocking transformations, 495–496
Boole, George, 524
Boolean expressions
  in conditional expressions, 187–188
  precedence constraints used with, 551–555
  syntax of, 182–183
Boolean literals, 180
boot.ini file, 474


bottlenecks, in Data Flow, 516–518
bracket characters ([ ]), qualification of column names using, 181–182
branching, as source control method, 546
breakpoints
  adding to Data Flow Task, 638
  enabling and using, 569–572
  setting for debugging script, 308
buffer manager
  in asynchronous component outputs, 499
  in execution trees, 502
buffers
  Data Flow memory, 492–493
  Destination adapters de-allocating data in, 501
  in execution trees, 502
  monitoring Data Flow execution, 503–505
  optimizing package processing, 513–514
  performance counters, 519–520, 796
  synchronous transformation outputs and, 499–500
Build menu, projects and, 752
BULK INSERT statement, SQL, 64
Bulk Insert Task
  adding to Control Flow, 65–66
  overview of, 64–65
  using with typical data load, 67–68
Business Intelligence (BI) platform, 1
Business Intelligence Development Studio (BIDS), 4

C

C#
  expression language and, 174
  Hello World example, 257–258
  Script Task accessing C# libraries, 43–44
  scripting with, 254
  selecting as scripting language, 255–256
Cache Connection Manager (CCM). See CCM (Cache Connection Manager)
Cache Data Sources, Lookup Transformation, 474
cache options
  limitations of SCD, 336
  in Lookup Transformation, 474
Cache Transformation
  configuring Cache Connection Manager, 229
  Data Flow and, 124
  loading Lookup Cache with, 229–230
Call Stack window, 571
Candidate Key Profiles
  Data Profiling Task, 318–319
  turning results into actionable ETL steps, 321
capture instance (shadow or change) tables, in CDC
  overview of, 394–396
  querying, 401–405
  writing entries to, 394
cascaded Lookup operations, 227–228
case sensitivity, of variables, 170, 268
case study
  ACH Control Flow batch creation, 850–853
  ACH Control Flow loop, 846–848
  ACH Control Flow retrieval of XML file size, 848–850
  ACH Data Flow capturing total batch items, 859–860
  ACH Data Flow detail processing ETL, 860–861
  ACH Data Flow parsing and error handling, 854–856
  ACH Data Flow validation, 853–854, 856–859
  ACH file for bank payments, 806–807
  ACH load package, 845
  ACH package setup, 845–846
  advantages of, 798
  background information related to company in, 798–799
  bank file Control Flow batch creation, 828–832
  bank file Control Flow file loop, 824–825


  bank file Control Flow retrieval of file properties, 825–828
  bank file Data Flow capturing total batch items, 840
  bank file Data Flow detail processing ETL, 841–845
  bank file Data Flow parsing and error handling, 832–835
  bank file Data Flow validation, 832, 835–839
  bank file package and variable setup, 819–823
  BankBatch tables, 813–815
  business problem addressed by, 799
  corporate ledger data, 815–816
  customer table, 810–811
  CustomerLookup table, 813
  data architecture, 805–806
  database model for, 808–809
  database setup for, 810
  driver package setup, 880–881
  e-mail Control Flow processing, 862–865
  e-mail Data Flow processing, 865–866
  e-mail load package, 861–862
  e-mail package setup and file system tasks, 862
  ErrorDetail table, 816
  file storage locations, 806
  interpreting the results, 879–880
  invoice table, 811–813
  load packages, 819
  lockbox files, 807–808
  matching process Control Flow, 867
  matching process high-confidence Data Flow, 870–874
  matching process (invoice matching), 867
  matching process logic, 868–870
  matching process medium-confidence Data Flow, 875–878
  matching process package setup, 867–868
  naming conventions in, 804–805
  overview of, 797
  PayPal or direct credits to corporate account, 808
  solution architecture, 801–804
  solution summary, 799–800
  stored procedures working with batches, 816–819
  summary, 800–881
  testing, 866
  tips related to package development, 805
casting
  casting operator, 169–170
  conditional expression issues, 188
catalog
  built-in reporting, 791–792
  as central storage location, 743–744
  Create Catalog command, 747–748
  executing packages deployed to, 680–681
  logging, 582–584
  Managed Object Model and, 671
  managing, 672–673
  operation logs and, 703–704
  package monitoring and, 686–687
  permissions, 784
  project, folder, and package listings, 688–689
  project deployment model and, 749–751
  securing, 782–785
  setting catalog properties, 744–747
  stored procedures securing, 785
Catalog class, 671–673
CatalogCollection class, 672–673
CatalogFolder class
  folder management with, 673–674
  overview of, 672
  server deployment project, 679
.caw file, 229–230
CCM (Cache Connection Manager)
  defined, 203
  loading Lookup Cache from any source with, 229–230
  selecting in full-cache mode of Lookup Transformation, 216
CDC (Change Data Capture)
  API, 396–398
  benefits of, 392–393
  instance tables, 394–396


CDC (Change Data Capture) (continued)
  overview of, 391–392
  preparing, 393–394
  querying, 401–405
  sources in Data Flow, 11
  using new SSIS tools, 398–401
CDC Control Task, 398–400
CDC Source, 398–400
CDC Splitter, 398–401
change management, in development, 522
Change Tracking, 392
Changing Attributes
  complex dimension changes with SCD, 331–333
  dimension tables, 323
  updates output, 333
Character Map Transformation
  column properties in user interface assembly, 665–667
  in Data Flow, 129–130
  processing bank file check and invoice details, 841–842
checkpoints
  controlling start location, 463
  creating simple control flow, 456–457
  Data Flow restart using, 476–477
  effect of containers and transactions on, 457–459
  inside checkpoint file, 461–463
  restarting packages using, 454
  variations of FailPackageOnFailure property, 459–461
child packages, 80–81
Class Library, 596
classes, scripting in SSIS, 259–260
cleansing data. See data cleansing
Cleanup method, component runtime and, 594
CLR (Common Language Runtime), 670
CLS (Command Language Specification), 602
clustering, 768–770
code
  scripting in SSIS, 259–260
  source code control, 525–526
code reuse
  copy-and-paste operation for, 259–260
  custom assemblies for, 261–264
  managed assemblies for, 260–261
CodePlex.com, 795
Collection class, 688–689
Column NULL Ratio Profile, Data Profiling Task, 319, 321
Column Pattern Profile, Data Profiling Task, 320
Column Statistics Profile, Data Profiling Task, 320–321
Column Value Distribution Profile, Data Profiling Task, 319
columns
  Copy Column Transformation, 130
  Derived Column Transformation, 121–122
  design-time methods for column data types, 591
  design-time methods for setting column properties, 592
  Export Column Transformation, 131–133
  Import Column Transformation, 142–144
  referencing in expressions, 181–182
columns, in UI
  displaying, 654–657
  properties, 665–667
  selecting, 657–661
ComboBox control, for column selection, 658–661
comma-delimited files, Flat File sources as, 110
Command Language Specification (CLS), 602
command-line
  DTExec, 774
  DTExecUI, 775–780
  DTUtil, 780–782
  executing console application in Control Flow, 81–82
  utilities, 774
comment fields, analyzing with Term Extraction Transformation, 153–156
Common Language Runtime (CLR), 670
common table expressions (CTEs), 389–391


communication mechanism, of transformations, 493
comparison operations
  casting issues in, 170
  concatenation operator in, 186–187
complex queries, writing for Change Data Capture, 391
ComponentMetaData properties, Source Component, 603–604
components
  adding connection time functionality, 594–595
  adding design-time functionality, 589–593
  adding run-time functionality, 593–594
  adding to SSIS Toolbox, 633–634
  building, 595
  building complete package, 636–637
  component-level properties in user interface, 661–663
  design time debugging, 634–636
  Destination. See Destination Component
  Pipeline Component methods and, 588–589
  preparing for coding Pipeline Components, 596–602
  Row Count Component, 309
  runtime debugging, 637–640
  Script Component. See Script Component
  separating component projects from UI (user interface), 645
  Source Component. See Source Component
  Transformation. See Transformation Component
  types of, 586
  upgrading to SQL Server 2012, 641
composite domains, DQS, 367–368
compound expressions
  conditional, 188
  creating, 174
concatenation operator (+), string functions and, 186–187
conditional expressions
  building logical evaluation expressions, 187–188
  creating, 174
conditional operator, 174
Conditional Split Transformation
  capturing total batch items, 840
  connecting to Lookup Transformation, 245–246
  handling dirty data, 244
  loading fact tables, 342
  matching process medium-confidence, 875–876
  Merge Join Transformation using, 203
  processing bank file check and invoice details, 841
  querying CDC in SSIS, 404
  scaling across machines using, 477–478
Configuration object
  overview of, 707–708
  programming, 708–709
Connection Managers
  ADO.NET, 851
  Analysis Services, 46
  building Destination Component, 625
  building Source Component, 605–606
  Cache. See CCM (Cache Connection Manager)
  defining connection characteristics, 9
  expressions in properties of, 193–194
  File, 595, 625
  flat files, 595, 822
  Foreach ADO Enumerator example, 100
  FTP, 53
  HTTP, 55–56
  OLE DB, 67, 107–108, 727, 830
  overview of, 31
  Package Designer tab for, 32
  Project, 822–823
  properties, 193–194
  SMTP, 83–84, 868
  Source adapters and, 586–587
  sources pointing to, 106–107
  values returned by, 595
  WMI, 84


connection time, adding methods to components, 594–595
connections
  coding SQL statement property according to, 76
  creating across packages, 234–236
  to data sources in Script Task, 271–279
  executing parameterized SQL statement, 69–71
Connections collection, Script Task, 272
console application, executing in Control Flow, 81–82
constraints
  evaluating, 30
  precedence constraints. See precedence constraints
containers
  container tasks, 42
  in Control Flow architecture, 8–9
  effect on checkpoints, 457–459
  Foreach ADO Enumerator example, 100–102
  Foreach File Enumerator example, 98–99
  Foreach Loop Container, 97–98
  grouping tasks into, 31
  groups vs., 95
  logging, 576–577
  For Loop Container, 95–97
  precedence constraints controlling, 550–551
  Sequence Container, 94
  for storing parameters (environments), 674–676
  summary, 103
  Task Host Container, 93
Control Flow
  adding bulk insert to, 65
  checkpoints occurring only at, 454
  completing package, 239
  connections in, 31
  containers in. See containers
  customizing item properties, 28
  Data Flow compared with, 28, 105–106, 488–491
  defining for package, 237
  evaluating tasks, 30
  example using Script Task variables for, 269–271
  expressions in precedence, 195–196
  expressions in tasks, 194–195
  handling workflows with, 491
  looping and sequence tasks, 42–43
  options for setting variables, 13
  overview of, 6
  precedence constraints, 8, 29–30, 549
  Script Task in. See Script Task
  tasks in, 6–7, 194–195
  Toolbox tabs related to, 27–28
Control Flow, in case study
  ACH file batch creation, 850–853
  ACH file loop, 846–848
  ACH file retrieval of XML file size, 848–850
  bank file batch creation, 828–832
  bank file loop, 824–825
  bank file retrieval of file properties, 825–828
  e-mail package, 862–865
  invoice matching process, 867–870
control table, in parallel loading, 479–480
conversion
  rules for date/time types, 166–167
  Unicode and non-Unicode data type issues, 167–169
  using Data Conversion Transformation, 121
Copy Column Transformation, 130
copy-and-paste operation, code reuse with, 259–260
copy-on-first-write technology, database snapshots, 406
corporate ledger data, 815–816
correlation operations, in Data Flow design, 510–511
counters, Performance Monitor, 518–520
CPU cost, 376
Create Catalog command, 747–748
credentials, Windows Authentication and, 18


cross-data flow communication, 115
cross-package communication, 115
CTEs (common table expressions), 389–391
cubes, processing, 46
customers
  Customer table, 810–811
  CustomerLookup table, 813
  database entities in case study, 809
customizing SSIS
  adding connection time functionality, 594–595
  adding design-time functionality, 589–593
  adding run-time functionality, 593–594
  building complete package, 636–637
  building components, 595
  building Destination Component, 625–633
  building Source Component, 602–614
  building Transformation Component, 614–625
  debugging components, 634–636
  Destination Component, 588
  installing components, 633–634
  overview of, 585–586
  Pipeline Component methods and, 588–589
  preparing for coding Pipeline Components, 596–602
  runtime debugging, 637–640
  Source Component, 586–587
  summary, 641
  Transformation Component, 587
  UI component, 667
  upgrading components to SQL Server 2012, 641

D

Dashboard report, 794
data cleansing
  analyzing source data for. See data profiling
  Derived Column use, 354–357
  DQS (Data Quality Services), 366–370
  DQS Cleansing Transformation, 131, 370–373
  error outputs and, 471–473
  Fuzzy Grouping, 363–365
  Fuzzy Lookup, 357–363
  overview of, 353–354
  sources in this book for, 322
  summary, 373
  transformations in Data Flow design, 511–512
Data Conversion Transformation
  in Data Flow, 122
  in Excel Source, 109
  Unicode and non-Unicode data type issues, 168–169
Data Definition Language (DDL)
  defining Data Flow for package, 238
  Execute DDL Task, 45
Data Encryption Standard (DES), 746
data extraction
  Data Flow restart using, 476
  JOINS, UNIONS and subqueries in, 381–382
  modularizing, 384–385
  overview of, 376
  SELECT * problem in, 376–377
  set-based logic in, 389–391
  sorting databases, 382–384
  sources in this book for, 322
  SQL Server and text files, 385–389
  transformations during, 378–381
  WHERE clause tool in, 377–378
Data Flow
  connections, 31
  Control Flow compared with, 28
  creating for package, 237–239, 242
  customizing item properties, 28
  data taps for viewing data in, 765–766
  data viewers, 106
  destinations in. See destinations
  error handling and logging, 14
  Error Row Configuration properties, 572
  example, 157–160


Data Flow (continued)
  expressions in, 197–200
  matching process high-confidence, 870–874
  matching process medium-confidence, 875–878
  NULL values in, 183, 185
  overview of, 9–10
  performing query tuning when developing, 378–381
  pipeline and, 586
  restart, 475–477
  scripting in. See Script Component
  sources in. See sources
  summary, 160
  synchronous vs. asynchronous transformations, 294–302
  transformations in. See transformations
  understanding, 105–106
  working with, 34
Data Flow, in case study
  ACH file capturing total batch items, 859–860
  ACH file detail processing ETL, 860–861
  ACH file parsing and error handling, 854–856
  ACH file validation, 853–854, 856–859
  bank file capturing total batch items, 840
  bank file detail processing ETL, 841–845
  bank file parsing and error handling, 832–835
  bank file validation, 832, 835–839
  e-mail package, 865–866
Data Flow engine
  comparing with Control Flow, 488–491
  data processing in Data Flow, 491–492
  design practices, 508–513
  execution trees, 501–503
  handling workflows with Control Flow, 491
  memory buffer architecture, 492–493
  monitoring execution, 503–505
  optimizing package processing, 513–516
  overview of, 487–488
  pipeline execution reporting, 506–507
  pipeline execution tree log details, 505–506
  pipeline performance monitoring, 518–520
  SSIS engine, 488
  summary, 520
  transformations types, 493–501
  troubleshooting performance bottlenecks, 516–518
Data Flow Task
  adding to Control Flow, 34
  breakpoints added to, 638
  Data Flow restart using, 476–477
  defining Data Flow for package, 237–239
  Foreach ADO Enumerator example, 102
  implementing as checkpoint, 454
  For Loop Container, 97
  overview of, 10, 47–48
  in parallel loading, 483–484
  querying CDC in SSIS, 403
  referencing columns in expressions within, 181–182
data loading
  database snapshots and, 406–408
  MERGE operator and, 408–411
data mining
  Analysis Services tasks, 45
  Data Mining Query Task, 46–47
  mining objects, 46
Data Mining Extension (DMX), 47, 130–131
Data Mining Model Training Destination, 118
Data Mining Query Task, 46–47
Data Mining Query Transformation, 130–131
data pipeline architecture, parallelism, 474
data preparation tasks
  archiving files, 52
  Data Profiling Task, 48–50
  File System Task, 50–51
  FTP Task, 53–55
  overview of, 48
  Web Service Task, 55–60
  XML Task, 60–64
data processing, in Data Flow, 491–492


Data Profile Viewer, 318–321
data profiling
  defined, 315
  executing Data Profiling Task, 315–317
  overview of, 48–50
  turning results into actionable ETL steps, 321
  viewing results of Data Profiling Task, 317–321
Data Profiling Task
  initial execution of, 315–317
  overview of, 48–50
  viewing results of, 317–321
Data Quality Services. See DQS (Data Quality Services)
data scrubbing. See mainframe ETL, with data scrubbing
data sharpening, 378
data sources. See sources
data stores, 706
data taps, 765–766
Data Transformation Services. See DTS (Data Transformation Services)
data types
  configuring in Flat File sources, 111–112
  date and time support, 166
  design-time methods for column data types, 591
  Destination Component, 630–631
  impact on performance, 167
  mapping and converting as needed, 20
  parameters, 172–173
  Source Component, 608–609
  SSIS, 164–166
  tips related to working with large projects, 805
  Transformation Component, 618–619
  understanding, 164
  Unicode and non-Unicode conversion issues, 167–169
  variables, 172–173, 184
Data Viewer
  adding to Fuzzy Grouping Transformation, 364–365
  adding to Fuzzy Lookup, 360–361
  benefits of, 805
  CDC Splitter outputs, 401
  data taps and, 765–766
  overview of, 106
  querying CDC in SSIS, 405
  script debugging using, 309
  using relational join in source, 210
The Data Warehouse Toolkit (Kimball and Ross), 323
data warehouses
  data extraction and cleansing, 322
  data profiling. See data profiling
  dimension table loading. See dimension table loading
  fact table loading, 337–344
  overview of, 313–315
  SSAS processing, 345–350
  summary, 351
  using Master ETL package, 350–351
database
  building basic package for joining data, 207–209
  creating, 747–748
  retrieving data, 272–274
  snapshots, 406–408
  sorting data, 382–384
  Transfer Database Task, 88–89
  Transfer Error Messages Task, 89
  Transfer Logins Task, 89–90
  transferring SQL Server objects, 91–92
database, for case study
  BankBatch tables, 813–815
  corporate ledger data, 815–816
  customer table, 810–811
  CustomerLookup table, 813
  data architecture and, 805–806
  ErrorDetail table, 816
  invoice table, 811–813
  model used, 808–809
  setup, 810
  stored procedures working with batches, 816–819


DataReader Destination, 118, 728
DataView controls
  column display, 654
  column selection, 657–658
date
  adding new columns for Change Data Capture, 391
  data types, 166
  functions for expressions, 188–190
DatePart() expression function
  Boolean expressions and, 182
  overview of, 188–189
  string functions, 186
  T-SQL function vs., 175
DBAs, 521
DDL (Data Definition Language)
  defining Data Flow for package, 238
  Execute DDL Task, 45
debug mode, package execution and, 240
debugging
  breakpoints. See breakpoints
  components at design time, 634–636
  components at runtime, 637–640
  interacting with external applications and, 720
debugging, script
  Autos, Locals, and Watch windows, 309–310
  breakpoints, 308
  Immediate window, 310–311
  overview of, 308
  Row Count Component and Data Viewers, 309
de-duplication, in Fuzzy Grouping Transformation, 363–365
DELETE statements, 408–411
deployment
  of custom .NET assembly, 263
  executing packages deployed to catalog, 680–681
  executing packages with T-SQL, 736–737
  models, 748
  package model. See package deployment model
  project model. See project deployment model
  server deployment, 679–680
  utility for, 751–752, 754
deployment manifest, creating, 751–752
Derived Column Transformation
  advanced data cleansing with, 354–357
  as alternative to SCD, 336
  Audit Transformation compared with, 129
  configuring Lookup Transformation with, 225
  in Data Flow, 122–123
  example using, 158
  expressions and, 199–200
  handling dirty data, 243
  InfoPath example, 724–725
  loading fact tables, 338–339
  processing bank file check and invoice details, 841–843
DES (Data Encryption Standard), 746
DescribeRedirectedErrorCode method, 594
design practices, Data Flow
  data cleansing and transformation, 511–512
  data integration and correlation, 510–511
  leveraging Data Flow, 509–510
  overview of, 508–509
  staging environments, 512–513
design time
  adding methods to components, 589–593
  Advanced Editor and, 589
  component phases, 588
  creating package parameters, 172
  debugging components, 634–636
  defining variables, 170
  Transformation Component methods, 615–620
Destination adapters
  as integral to Data Flow, 500–501
  troubleshooting bottlenecks in Data Flow by removing, 516–517
Destination Assistant, 107, 237–238
Destination Component


  AcquireConnection method, 627–628
  ComponentType property and, 598
  configuring Script Component Editor, 289–290
  Connection Managers and, 625
  debugging, 634–636
  defined, 288
  installing, 633–634
  overview of, 588
  PreExecute method, 631–633
  ProcessInput method, 631–632
  ProvideComponentProperties method, 626–627
  ReinitializeMetaData method, 629–630
  SetUsageType method, 630–631
  types of pipeline components, 586
  Validate method, 628–629
DestinationConnection property, Foreach File Enumerator example, 99
destinations
  connectivity to, 719
  creating destination table, 20
  in Data Flow, 13, 88–89
  Data Mining Model Training, 118
  DataReader, 118
  Dimension and Partition Processing, 118
  dragging DataReader to Data Flow, 728
  Excel, 116
  Flat File, 116
  function of, 106
  OLE DB, 116–117, 843
  overview of, 115–116
  Raw File, 117
  Recordset, 117
  selecting for bulk insert, 65
  specifying in Import and Export Wizard, 19
  SQL Server and Mobile, 118
  troubleshooting bottlenecks in Data Flow, 516–517
development
  custom. See customizing SSIS
  software development. See SDLC (software development life cycle)
Diff operation, 721
Diffgram, 721
Dimension and Partition Processing Destination, 118
dimension table loading
  complex tables, alternatives to SCD Transformation, 335–336
  complex tables, preparing data, 327–331
  complex tables, using SCD Transformation, 331–335
  overview of, 332–333
  simple tables, 323–327
dimensions
  Dimension and Partition Processing Destination, 118
  processing, 46
  solving changing dimensions with SCD Transformation, 126
directives, creating new projects, 597
directories
  creating, 51
  polling for file delivery, 86–87
dirty data
  cleansing. See data cleansing
  handling, 242–246
disk I/O, 508–510
Distributed Transaction Coordinator Transactions. See DTC (Distributed Transaction Coordinator) Transactions
DMX (Data Mining Extension), 47, 130–131
Document Type Definitions (DTDs), 61
documents, MSF Agile, 538
domains, DQS
  DQS Cleansing Transformation and, 370–373
  overview of, 367–368
double quotes (“ “), building strings using, 175
DQS (Data Quality Services)
  as alternative to Integration Services, 799
  Cleansing Transformation, 131, 370–373
  data cleansing workflow of, 366–370
  KB (Knowledge Base), 366
  overview of, 366
driver package setup, 880–881

DT_DBDATE data type, 164–166
DT_DBTIME data type, 164–166
DT_DBTIME2 data type, 166
DT_DBTIMESTAMP2 data type, 166
DT_DBTIMESTAMPOFFSET data type, 166
DT_NUMERIC data type, 178
DT_UI4 data type, 178
DTC (Distributed Transaction Coordinator) Transactions
  defined, 463–464
  single package, multiple transactions, 466–468
  single package, single transaction, 464–466
  two packages, one transaction, 468–469
DTDs (Document Type Definitions), 61
DTExec
  32 and 64-bit versions, 791
  32-bit runtime executables in 64-bit mode, 416–417
  debugging components, 634
  executing packages, 774
  runtime debugging, 637–640
DTExecUI
  as 32-bit application, 790
  executing packages, 775–780
DTS (Data Transformation Services)
  Import and Export Wizard and, 2
  package failure. See package restartability
  runtime managed code library, 676
  SSIS compared with, 1–2
Dts object
  accessing variables in Script Task, 267–268
  configuring Script Task Editor, 265–266
  connecting to data sources in Script Task, 272–279
  overview of, 287–288
DtsDebugHost.exe, 638
DtsPipelineComponent attribute, 598
DTUtil, 780–782
dump and reload, for Change Data Capture, 391
dynamic packages, 162, 250–252

E

Edit Script button, 255–256
editors
  Advanced Editor. See Advanced Editor
  FTP Task Editor, 53
  Precedence Constraint Editor, 29
  Property Expressions Editor, 41
  Script Component Editor, 289–291
  Script Task Editor, 265–266
  task editors, 39–41, 50, 65
  Term Extraction Transformation Editor, 153–154
e-mail, Send Mail Task, 83
e-mail package
  Control Flow processing, 862–865
  Data Flow processing, 865–866
  as load package, 861–862
  payments via, 801
  setup and file system tasks, 862
encryption
  algorithms, 745–746
  data protection, 21
  re-encrypting all packages in a directory, 781
end-to-end packages. See package creation
EngineThreads property, Data Flow, 503
enumerators
  Foreach ADO Enumerator example, 100–102
  Foreach File Enumerator example, 98–99
  Foreach Loop Container, 97–98
environment references
  absolute and relative, 765
  configuring projects to use environments, 763
  EnvironmentReference object, 681–682
environment variables
  EnvironmentVariable class, 674
  package configuration and, 706
  referenced during package execution, 749
EnvironmentInfo class, 674
EnvironmentReference object, 681–682
environments
  configuring project to use, 763–765
  containers for storing parameters, 674–676
  creating and configuring project level parameters, 761
  Data Flow design practices for staging, 512–513
  Managed Object Model and, 671
  migrating packages between, 773
  overview of, 760
  package configuration and, 771
  referencing, 681–682
  setting up, 761–762
  setting up environment references, 765
  variables referenced during package execution, 749
equivalence operator (==), 175
error handling
  ACH file package, 854–856
  advanced precedence constraints, 551
  bank file package, 832–835
  basic precedence constraints, 549–551
  Boolean expressions used with precedence constraints, 551–555
  breakpoints, 569–572
  building Transformation Component and, 623–624
  catalog logging, 582–584
  combining expressions and multiple precedence constraints, 556–557
  error rows and, 572–576
  ErrorQueue table, 248–249
  in Excel Destination, 116
  log events, 577–581
  logging, 576–577
  logging providers, 577
  with Merge Transformation, 144
  in OLE DB Source, 109
  overview of, 14, 549
  staged data in, 513
  summary, 584
  user interface assembly and, 663–665
  working with multiple precedence constraints, 555–556
error messages
  in Excel Destination, 116
  Lookup Transformation and, 207, 223–226
  with Merge Transformation, 144
  in OLE DB Source, 109
  Transfer Error Messages Task, 89
error outputs, 471–473
error rows
  error handling in Data Flow, 572
  example demonstrating use of, 573–576
  table of error handlers and descriptions, 573
ErrorDetail table, 816
ErrorQueue table, SQL Server, 248–249
escape sequences, string literals, 179–180
ETL (extraction, transformation, and loading)
  ACH file detail processing, 860–861
  bad data handling with Fuzzy Lookup Transformation, 133–138
  bank file detail processing, 841–845
  data transformation aspect of, 47
  development and, 523
  Import and Export Wizard, 3
  mainframe ETL. See mainframe ETL, with data scrubbing
  Master ETL package, 350–351
  SSIS as ETL tool, 1–2, 5
  tasks in SSIS, 6–7
  team preparation and, 522–523
  turning Data Profile results into actionable ETL steps, 321
Evaluation Operations, in precedence constraints, 552
event handling
  breakpoints, 569–572
  catalog logging, 582–584
  events available at package level, 558–560
  inheritance, 567–569
  log events, 577–581
  logging and, 576–577
  logging providers, 577
  OnError events, 565–566
  OnPreExecute events, 567
  overview of, 557–558
  responding to events in Script Task, 283–284
  summary, 584
  working with event handlers, 34–35, 560–565
event logs
  log providers, 700–701
  programming to log providers, 703
  specifying events to log, 701–702
events
  available at package level, 558–559
  custom, 560
  defined, 281
  log events, 577–581
  log provider for Windows events, 577
  logging, 284–286, 576–577
  methods for firing, 281
  monitoring pipeline logging, 503–505
  OnError events, 565–566
  OnPreExecute events, 567
  raising in Script Component, 292–293
  raising in Script Task, 281–283
  responding to in Script Task, 283–284
  WMI Event Watcher Task, 86
Excel (Microsoft)
  64-bit support in, 110, 415–417
  accessing source data from, 414–415, 417–421
  destinations in Data Flow, 116
  executing parameterized SQL statement, 69–71
  expressions similar to cells in, 163
  referencing columns in expressions within, 181
  sources in Data Flow, 10, 109–110
EXCEPT, set-based logic for extraction, 389–391
exception handling. See also error handling, 305–308
exception logs, 703
Execute Package Task
  master ETL package and, 350–351
  overview of, 80–81
  package execution, 240
  scaling out memory pressures with, 475
Execute Package window, 507
Execute Process Task
  overview of, 81–82
  SSAS cube processing with, 345, 349
Execute SQL Task
  ADO.NET properties, 851–852
  capturing multi-row results, 73–75
  capturing singleton results, 72–73
  coin toss example, 552–555
  combining expressions and multiple precedence constraints, 556–557
  completing packages, 239
  creating simple Control Flow, 455–457
  e-mail Control Flow processing, 863
  executing batch of SQL statements, 71–72
  executing parameterized SQL statements, 69–71
  executing stored procedures, 75–78
  expressions in, 194–195
  Foreach ADO Enumerator example, 100–101
  matching process logic and, 868–870
  OLE DB properties and, 830
  overview of, 68–69
  in parallel loading, 481–484
  project deployment model and, 750
  retrieving output parameters from stored procedures, 78–80
execution, package
  from command-line with DTExec, 774
  from command-line with DTExecUI, 775–780
  monitoring Data Flow, 503–505
  monitoring execution, 791
  overview of, 240, 493–495
  total time in Data Flow vs. Control Flow, 490–491
  T-SQL for, 736–737, 757–758
Execution Results tab, 281–282
execution trees, Data Flow
  monitoring, 503–505
  optimizing package processing, 513–515
  overview of, 501–503
  pipeline log details, 505–506
  pipeline reporting, 506–507
ExecutionOperation objects, 686
explicit variable locking, in Script Task, 267
Export Column Transformation
  in Data Flow, 131–133
  optimizing processing with, 515
  task, 132
expression adorners, 163
Expression Builder
  creating dynamic packages, 251
  opening, 251
  referencing parameters, 181
  referencing variables, 180–181
  working with, 175–176
Expression Task, for setting variables, 13, 196–197
expressions
  Boolean expressions, 182–183
  Boolean literals, 180
  C#-like syntax of, 174–175
  casting, 169–170
  column references, 181–182
  combining with precedence constraints, 556–557
  conditional expressions, 187–188
  configuring Derived Column Transformation, 121–122
  in Connection Manager properties, 193–194
  in Control Flow, 194–196
  in Data Flow, 197–200
  data types, 164–170
  date and time functions, 188–190
  dealing with NULLs, 183–185
  dynamic package objects and, 162
  equivalence operator, 177
  evaluating, 30
  Expression Builder, 175–176
  Expression Task, 196–197
  Foreach ADO Enumerator example, 100
  line continuation, 177–178
  in Lookup Transformation, 226
  numeric literals, 178–179
  overview of, 163–164, 190
  parameter data types, 172–173
  parameter definition, 171–172
  parameter reference, 181
  parameters as, 162–163, 191–193
  reading string data conditionally, 121
  setting task properties at runtime, 40–41
  string concatenation, 177
  string functions, 185–187
  string literals, 179–180
  summary, 200
  variable data types, 172–173
  variable definition, 170–171
  variable references, 180–181
  variables as, 162, 191–193
Expressions tab, task editors, 40–41, 265
Extensible Markup Language. See XML (Extensible Markup Language)
Extensible Stylesheet Language Transformations (XSLT), 61, 722
external applications, interaction with
  InfoPath data source, 720–726
  outputting to ASP.NET, 727–731
  overview of, 719–720
  summary, 736–741
  T-SQL for package execution, 736–741
  Winform application for dynamic property assignment, 731–736
external management, of SSIS
  application object maintenance operations, 683
  catalog management, 672–673
  Configuration object and, 707–709
  deployment project model, 676–677
  DTS runtime managed code library, 676
  EnvironmentReference object, 681–682
  environments, 674–676
  event logging, 701–702
  executing packages deployed to catalog, 680–681
  folder management, 673–674
  LogProviders collection object and, 702–703
  managed code in, 670
  Managed Object Model code library, 671–672
  operation logs in SQL Server 2012, 703–705
  package configurations, 705–707
  package log providers, 699–701
  package maintenance, 684–686
  package management example, 689–699
  package monitoring, 686–687
  package operations, 682–684
  parameter objects, 677–678
  project, folder, and package listings, 688–689
  server deployment, 679–680
  setting up demonstration package for, 670–671
  summary, 716–718
  WMI Data Reader Task example, 710–715
  WMI Data Reader Task explained, 709–710
  WMI Event Watcher Task example, 716–718
  WMI Event Watcher Task explained, 715
  WMI task overview, 709
extraction. See data extraction
extraction, transformation, and loading. See ETL (extraction, transformation, and loading)

F

fact table, data warehouses and, 337–344
Fail Component, Lookup Transformation, 223, 225
FailPackageOnFailure property
  checkpoints and, 454
  creating simple control flow, 455–457
  variations of, 459–461
Failure value, constraints, 8
False
  Boolean expressions and, 182–183
  Boolean literals and, 180
  in conditional expressions, 187–188
fast load option, OLE DB Destination, 117
FastParse option, Flat File Source, 113–114
File Connection Manager
  building Destination Component, 625
  values returned by, 595
file system deployment, 752
File System Task
  ACH file package, 848–850
  archiving files, 52
  bank file batch creation, 828–832
  bank file package, 823, 825–828
  basic file operations, 50–51
  e-mail package, 862
  Foreach File Enumerator example, 99
File Transfer Protocol. See FTP (File Transfer Protocol)
files
  ACH. See ACH (Automated Clearing House) files
  archiving, 52, 251–252
  bank files. See bank file package
  checkpoint. See checkpoints
  copying assembly file into GAC, 263
  flat. See flat files
  generating unique filenames, 259–260
  lockbox files. See lockbox files
  locking for editing and committing changes, 531–532
  operations, 50–51
  polling a directory for file delivery, 86–87
  raw. See raw files
  represented in Solution Explorer, 27
  retrieving from FTP Server, 54–55, 274–275
  storage locations, 806
  text. See text files
  XML. See XML files
FileUsageType property, building Source Component, 605–606
fixed attributes, 331–332, 335
Flat File Destination
  in Data Flow, 116
  example using, 160
  Merge Join Transformation using, 211
Flat File Source
  Advanced page, 111–112
  Columns page, 111
  defined, 10
  exporting batches of text files, 385
  FastParse option, 113–114
  generating Unpivot Transformation, 151
  Import Column Transformation using, 142–143
  MultiFlatFile Connection Manager, 114
  overview of, 31, 110
  SQL Server data types and, 113
  text qualifier option, 110–111
flat files
  accessing source data from, 414, 442–447
  Connection Managers, 595, 822
  creating connection for, 235
folders
  data architecture, 806
  granting user access to, 783
  managing with CatalogFolder class, 673–674
  removing from catalog, 675
For Loop Container
  coin toss example, 552–555
  combining expressions and multiple precedence constraints, 556–557
  overview of, 95–97
  in parallel loading, 480–481
  tasks, 42–43
Foreach ADO Enumerator, 97, 100–102
Foreach File Enumerator, 97
Foreach Loop Container
  ACH file package and, 847
  creating loop with, 250
  Foreach ADO Enumerator example, 100–102
  Foreach File Enumerator example, 98–99
  lockbox files and, 824
  overview of, 97–98
  tasks, 42
forms
  building UI form, 653
  modifying form constructor, 653–654
  steps in building UI (user interface), 644
FTP (File Transfer Protocol)
  Connection Manager, 53
  FTP Task, 54–55
  FTP Task Editor, 53
  package deployment via, 53
  retrieving file from FTP server, 54–55, 274–276
full-cache mode, Lookup Transformation
  Cache Connection Manager option in, 230
  in cascaded Lookup operations, 227–228
  data preparation for complex dimension table, 329
  defined, 474
  features of, 205
  overview of, 202
  partial-cache mode option, 220–222
  trade-off between no-cache mode and, 220
  working in, 216–219
fully blocking transformations, 119
fully qualified variable names, Script Task, 268
Functional Dependency Profile, 319
functions
  Change Data Capture, 396–398
  date and time, 188–189
  expression, 174–175
  string, 185–187
Fuzzy Grouping Transformation
  advanced data cleansing with, 363–365
  in Data Flow, 138–141
  defined, 357
Fuzzy Lookup Transformation
  adding Data Viewer to, 360–361
  Advanced tab, 135, 359
  Columns tab, 135, 358
  connection to SQL Server database, 361–362
  defined, 357
  example of, 136–138
  handling bad data with, 133–134
  matching process high-confidence, 872–873
  matching process medium-confidence, 875–876
  output to, 134
  Reference Table tab, 134–135, 358

G

GAC (global assembly cache)
  adding assemblies to, 598–602
  copying assembly file into, 263
  installing user interface assembly in, 645–646
  Managed Object Model and, 671
  using managed assemblies, 260
gacutil.exe, 600, 652
GateKeeperSequence expression, for Control Flow precedence, 195–196
global assembly cache. See GAC (global assembly cache)
GridView controls
  column display, 654
  column selection, 657–658
  displaying SSIS data with ASP.NET control, 729
groups
  containers vs., 95
  Fuzzy Grouping, 138–141, 363–365
  highlighting tasks to create, 95
  task groups, 31, 94
GUI, managing security with, 783–784

H

Header Derived Column Transformation, 844
Hello World example, of SSIS scripting, 257–258
helper methods, Source Component, 608
heterogeneous data
  Access, 415–417, 421–427
  Excel, 415–421
  flat files, 442–447
  ODBC, 447–449
  Oracle, 427–430
  other sources, 450
  overview of, 413–415
  summary, 451
  XML and Web Services, 431–442
historical attribute, 331–334
horizontal partitioning, 477–478
HTTP Connection Manager, 55–56
HttpConnection property, Web Service Task, 56
hubs, creating central SSIS server, 766–768

I

IBM MQ Series, 82
icons, expression adorner and, 163
IDTSComponentEvents interface, 281
IDtsComponentUI interface
  Delete method, 648
  Edit method, 649–651
  Help method, 648
  implementing, 647–648
  Initialize method, 649
  New method, 648–649
  steps in building UI (user interface), 644
IF...THEN logic, in conditional expressions, 187–188
Ignore Failure, Lookup Transformation, 223–224
Immediate window, script debugging using, 310–311
implicit variable locking, in Script Task, 267
Import and Export Wizard
  as basic tool in ETL world, 3
  creating destination table, 20
  DTS and, 2
  moving data from sources, 17
  opening and selecting source in welcome screen, 18
  options for saving and executing package, 20–22
  specifying destination for data, 19
Import Column Transformation
  in Data Flow, 142–144
  optimizing processing with, 515
  saving file snapshots and, 844
inferred members
  fact tables and, 344
  SCD and, 332–333
  updates output, 335
InfoPath data source, 720–726
inheritance
  components and, 588
  event handling and, 567–569
Input tab, Web Service Task, 56
input verification
  design-time methods, 590
  Transformation Component, 619–620
Insert Destination
  complex dimension changes with SCD, 333
  limitations of SCD, 336
  optimizing SCD packages, 336
INSERT statements, MERGE operator for, 408–411
Integration Services. See SSIS (SQL Server Integration Services), introduction to
IntegrationServices class, 671–672
INTERSECT, set-based logic for extraction, 389–391
invoices, in case study
  as database entity, 809
  Invoice table, 811–813
  matching process Control Flow, 867
  matching process high-confidence Data Flow, 870–874
  matching process logic, 868–870
  matching process medium-confidence Data Flow, 875–878
  matching process package setup, 867–868
I/O cost, 376–377
ISNULL() expression function
  setting NULL values in Data Flow, 183, 185
  T-SQL function vs., 175
IsSorted property, data sources, 383
iterative methodology
  in MSF Agile, 537–539
  in SDLC, 525

J-K

JET (Joint Engine Technology)
  JET engine, 415
  OLE DB Provider, 415
jobs, SQL Server Agent, 91
joins
  contrasting SSIS and relational joins, 203–206
  in data extraction, 381–382
  overview of, 201–202
  summary, 231
joins, with Lookup Transformation
  building basic package, 207–209
  with cascaded operations, 227–228
  with CCM and Cache Transform, 229–230
  with expressionable properties, 226
  features of, 206–207
  in full-cache mode, 216–219
  in multiple outputs mode, 223–226
  in no-cache mode, 219–220
  overview of, 202–203
  in partial-cache mode, 220–222
  using relational join in source, 209–211
joins, with Merge Join Transformation
  building packages, 211–212
  overview of, 144–145
  retrieving relational data, 212–214
  specifying sort order, 214–216
  working with, 203

L

labeling (striping) source versions, 547–548
legacy security, 785–787
libraries
  Class Library, 596
  DTS runtime managed code library, 676
  Managed Object Model code library, 671–672
  of views, 745
line continuation characters, expression syntax and, 177–178
lineage number, referring to columns by, 182
LineageIDs
  asynchronous transformation outputs and, 498–499
  Source adapters and, 500
  synchronous transformation outputs and, 499–500
  transformation outputs and, 498
literals
  Boolean, 180
  numeric, 178–179
  string, 179–180
load packages, in case study
  ACH file package. See ACH (Automated Clearing House) files
  bank file package. See bank file package
  e-mail package. See e-mail package
loading
  Data Flow restart using, 476
  data warehouse. See data warehouses
  Lookup Cache from any source, 229–230
  scaling out using parallel, 479–485
localization, UI design principles, 667
Locals window, script debugging using, 310
lockbox files. See also bank file package
  looping, 824
  parsing and error handling, 832–835
  saving file snapshot to database, 844–845
  solution architecture for case study, 803
  specification for input files, 800
  structure of, 807–808
Log method, Dts object, 287–288
log providers
  overview of, 699–701
  programming for, 702–703
  in SSIS, 700–701
logging
  catalog logs, 582–584
  designing logging framework, 523
  event logging, 284–286, 577–581, 701–702
  LOGGING_LEVEL parameter, 739
  LogProviders collection object, 702–703
  monitoring pipeline events, 503–505
  operation logs in SQL Server 2012, 703–705
  overview of, 14, 576–577
  package log providers, 699–701
  pipeline execution reporting, 506–507
  pipeline execution tree log details, 505–506
  providers of, 577
  writing log entry in Script Component, 293–294
  writing log entry in Script Task, 287–288
logical AND, 29–30
logical expressions
  casting issues in, 170
  using with precedence constraints, 29–30
logical OR, 29–30
login, database, 89–90
LogProviders collection object, 702–703
Lookup Transformation
  ACH Data Flow validation, 856
  as alternative to SCD for dimension table data, 336
  building basic package, 207–209
  caching optimized in, 474
  caching smallest table in, 212
  with cascaded operations, 227–228
  with CCM and Cache Transform, 229–230
  for complex dimension table, 328–330
  in Data Flow, 123–124
  with expressionable properties, 226
  features of, 206–207
  in full-cache mode, 216–219
  Fuzzy Lookup compared with, 133–138, 357
  handling dirty data, 245–246
  loading fact table, 338
  matching process high-confidence, 871
  matching process medium-confidence, 876
  in multiple outputs, 223–226
  in no-cache mode, 219–220
  in partial-cache mode, 220–222
  relational joins compared with, 203–205
  relational joins performed with, 202–203
  relational joins used in source, 209–210
  for simple dimension table, 324–325
  Term Lookup, 156–157
looping
  ACH file package, 846–848
  bank file package, 824–825
  CCM enabling reuse of caches across iterations, 230
  Foreach Loop Container, 97–98
  For Loop Container, 95–97
  tasks, 42–43
LTRIM function, Conditional Split Transformation, 244

M

magic numbers, converting to NULLs, 378
mail servers, SMTP, 868
Main() function, Hello World example, 257–259
mainframe ETL, with data scrubbing
  creating Data Flow, 242
  finalizing, 246–247
  handling dirty data, 242–246
  handling more bad data, 247–249
  looping, 250
  overview of, 241–242
  summary, 252
maintenance, package, 684–686
Manage_Object_Permissions, granting, 783
managed assemblies
  code reuse and, 260–261
  using custom .NET assemblies, 261–264
managed code
  catalog management, 672–673
  deployment project model, 676–677
  DTS runtime managed code library, 676
  EnvironmentReference object, 681–682
  environments, 674–676
  executing packages deployed to catalog, 680–681
  external management of SSIS with, 670
  folder management with, 673–674
  Managed Object Model code library, 671–672
  overview of, 670
  parameter objects, 677–678
  server deployment, 679–680
  setting up demonstration package for, 670–671
Managed Object Model. See MOM (Managed Object Model)
Management Studio
  creating Customer table with, 810–811
  creating table with, 731–732
  overview of, 36–37
  package deployment with, 754
MapInputColumn/MapOutputColumn methods
  design-time methods, 590
  Source Component, 611
mapping
  defining Data Flow for package, 238–239
  DQS Cleansing Transformation and, 371–372
  handling more bad data, 249
  loading fact table, 342–343
  sources to destinations, 20
  variable data types to SSIS Data Flow types, 172–173
master ETL packages, 350–351
memory
  buffers, 492–493
  Data Flow and, 105–106
  design practices for, 508–510
  increasing in 32-bit Windows OS, 474
  Merge Join Transformation and, 203
  monitoring in blocking transformations, 508–509
  pipeline processing occurring in, 474–475
  transformations working in, 119
Merge Join Transformation
  as alternative to SCD for dimension table data, 336
  in Data Flow, 144–145
  features of, 203
  InfoPath example, 723–726
  loading fact tables, 339, 341–342, 344
  Lookup Transformation compared with, 203
  matching process high-confidence, 873
  matching process medium-confidence, 875–876
  pre-sorting data in, 127
  processing bank file check and invoice details, 842
  relational joins compared with, 203–204, 206
  semi-blocking nature of, 495–496
  working with, 211–216
Merge operation
  as source control method, 546–547
  XML Task, 721–722
MERGE operator, for mixed-operation data loads, 408–411
Merge Transformation
  in Data Flow, 144
  pre-sorting data in, 127
  semi-blocking nature of, 495–496
Message Queue Task
  For Loop Container, 95
  overview of, 82–83
messaging systems, 82–83
methodology, in SDLC
  iterative, 525
  overview of, 523
  waterfall, 524
Microsoft Access. See Access
Microsoft Excel. See Excel
Microsoft Message Queuing (MSMQ), 82–83
Microsoft Office, 720–726
Microsoft Solution Framework, 525
Microsoft Team Foundation Server, 526, 533–536
mining models, training, 118
mining objects, processing, 46
miss-cache feature, Lookup Transformation, 206–207, 221–222
modularize, in data extraction, 384–385
MOM (Managed Object Model)
  catalog management, 672–673
  code library, 671–672
  deployment projects, 676–677
  environment references, 681–682
  environments, 674–676
  executing packages deployed to catalog, 680–681
  folder management, 673–674
  package parameters, 678
  server deployment, 679–680
monitoring
  built-in reporting, 791–795
  custom reporting, 795
  Data Flow execution, 503–505
  package execution, 791
  packages, 686–687
MQ Series, IBM, 82
MSF Agile
  documents, 538
  overview of, 537
  reports, 538–539
  source control, 539
  team builds, 539
  work items, 537–538
MSMQ (Microsoft Message Queuing), 82–83
Multicast Transformation, in Data Flow, 145, 516–518
MultiFlatFile Connection Manager, in Data Flow, 114
multiple outputs, Lookup Transformation with, 223–226
MyExpressionTester variable, 181

N

naming conventions
  best practices, 241
  creating connections across packages, 235
  generating unique filename for archiving file, 259–260
  referencing columns in expressions and, 181–182
  SSIS data types, 164–166
  using fully qualified variable names, 268
  variables, 170
native transaction
  defined, 463
  single package in SQL Server using, 469–471
nesting
  conditional expressions, 188
  containers, 94
.NET
  ADO.NET. See ADO.NET
  ASP.NET, 727–731
  custom assemblies, 261–264
  scripts, 124–125
  Winform application for dynamic property assignment, 731–736
no-cache mode, of Lookup Transformation
  in cascaded Lookup operations, 227–228
  defined, 202, 474
  partial-cache mode option, 220–222
  trade-off between full-cache mode and, 220
  variables used in auditing, 32
  working with, 219–220
non-blocking transformations
  overview of, 493
  row-based, 494–495
  server resources required by, 495
  streaming, 493–494
  with synchronous outputs, 500
nonmatches, in Lookup Transformation, 207
normal load option, OLE DB Destination, 117
NULL values
  Boolean expressions used in, 182–183
  converting magic numbers to, 378
  in Data Flow, 121–122, 185
  Multicast Transformation compared with, 145
  variables and, 183–184
numeric literals, 178–179

O

objects
  buffers, 614
  data mining, 46
  Dts object. See Dts object
  dynamic package, 162
  environment references, 681–682
  external management, 707–709
  logging, 702–703
  package management, 683–686
  package monitoring, 686–687
  parameters, 677–678
  permissions, 784
  storing recordset in memory using object variables, 101
  tasks, 40
  transferring database objects between databases, 91–92
ODBC
  accessing source data from, 414, 447–449
  coding SQL statement property according to, 76
  executing parameterized SQL statements, 69–71
  sources in Data Flow, 11
ODS (operational data store), 48
Office (Microsoft), InfoPath example of interaction with, 720–726
OLAP (online analytical processing), 45
OLE DB
  coding SQL statement property, 76
  as Data Flow source, 10
  outputting Analysis Services results to, 46
OLE DB Command Transformation
  in Data Flow, 145–147
  loading fact table, 343
  optimizing processing with, 515–516
  optimizing SCD package by removing, 333
  using set-based update vs., 344
OLE DB Connection Manager
  adding connections across packages, 234–236
  adding new connections, 67
  configuring connections, 727
  Execute SQL Task properties, 830
  selecting, 107–108
  selecting in full-cache mode of Lookup Transformation, 216
OLE DB Destination
  adding, 843
  in Data Flow, 116–117
  e-mail Data Flow processing, 866
  finalizing package with scrubbed data, 246–247
  loading fact table, 339, 342
OLE DB Source
  ADO.NET Source vs., 115
  configuring, 727–728
  in Data Flow, 107–109
  data preparation for complex dimension table, 327–329
  loading fact table, 337, 340–341
  Merge Join Transformation and, 212–213
  querying CDC in SSIS, 403
  relational join for data extraction, 209–210
  sorting data with SQL Server in, 383
OnError events
  applying, 565–566
  defining event handler for, 560–565
  error handling and logging and, 14
  inheritance and, 567–569
  specifying events to log, 701
online analytical processing (OLAP), 45
online references
  for 64-bit version of Office 2010, 415
  for conversion rules for date/time types, 166
  for DQS (Data Quality Services), 366
  for regular expressions, 297
OnPreExecute events
  applying, 567
  defining event handler for, 560–565
OPENROWSET function
  data extraction and text files, 385–389
  MERGE operator and, 411
operational data store (ODS), 48
operations
  Application object, 682–683
  logging in SQL Server 2012, 703–705
  Managed Object Model and, 671
  package, 682–684
  Project class, 676–677
optimization, staging environments for Data Flow, 512–513
Oracle
  accessing source data from, 414, 427–430
  CDC option, 405
ORDER BY clause
  loading fact table and, 340
  Merge Join Transformation and, 213–214
Output tab, Web Service Task, 57
output verification
  design-time methods, 590–591
  Transformation Component, 619–620
outputs
  asynchronous and synchronous transformation, 498–500
  DQS Cleansing Transformation, 372
  evaluating results of Data Profiling Task, 317–321
  improving reliability and scalability of, 471–473
  Lookup Transformation multiple, 223–226
  turning Data Profile results into actionable ETL steps, 321
Overview report, in SSIS administration, 792–793
OverwriteDestination property, Foreach File Enumerator example, 99

P

Package Configuration Wizard, 771
package creation
  adding connections, 234–236
  basic transformation tutorial, 233–234
  completing, 239
  creating Control Flow, 237
  creating Data Flow, 237–239
  executing, 240
  making packages dynamic, 250–252
  performing mainframe ETL with data scrubbing. See mainframe ETL, with data scrubbing
  saving, 239
  summary, 252

package deployment model
  creating deployment manifest, 751–752
  list of, 748
  overview of, 751
  Package Deployment Wizard, 752–755
  SSIS Package Store and, 755–757
Package Deployment Wizard, 752–755
Package Designer
  annotations, 32
  Connection Manager tab, 31–32
  Control Flow tab, 29–31
  Data Flow tab, 34
  Event Handlers tab, 34–35
  grouping tasks, 31
  overview of, 28
  Package Explorer tab, 35–36
  Parameters tab, 34
  Variables window, 33–34
Package Explorer, 35–36
Package object
  LogProviders collection and, 702
  operations, 682
  package maintenance and, 684–686
Package Protection Levels, 21
package restartability
  containers within containers and checkpoints, 457–459
  FailPackageOnFailure property, 459–461
  inside checkpoint file, 461–463
  overview of, 453–455
  simple control flow, 455–457
  staging environments for, 512
package transactions
  effect on checkpoints, 457–459
  overview of, 463–464
  single package, multiple transactions, 466–468
  single package, single transaction, 464–466
  single package using native transaction in SQL Server, 469–471

  two packages, one transaction, 468–469
packages
  32-bit and 64-bit modes, 416–417
  annotations, 31, 805
  Application object maintenance operations, 683
  building basic, 207–209
  building custom, 636–637
  built-in reports, 791–792
  compiled assemblies in, 263–264
  Configuration object, 707–708
  configurations, 705–707, 770–773
  containers as miniature, 94
  Control Flow and, 6
  as core component in SSIS, 5
  creating first, 25–26
  creating to run parallel loads, 480
  deploying via FTP, 53
  deployment models. See package deployment model
  designing, 28
  executing, 36, 680–681
  execution time in Data Flow vs. Control Flow, 490–491
  expressions in. See expressions
  grouping tasks in Sequence Containers, 94
  handling corrupt, 781–782
  lists, 688–689
  log providers, 699–701
  maintaining, 684–686
  Managed Object Model and, 671
  management example, 689–699
  modular, 523
  monitoring, 686–687, 791
  naming conventions, 804
  operations, 682–684
  optimizing, 513–516
  parameters, 14, 162, 678
  parent and child, 80–81
  precedence constraints, 8
  properties of, 28
  re-encrypting, 781
  scheduling, 787–790
  security of, 782
  T-SQL for executing, 736–741, 757–758

packages, in case study
  ACH file package. See ACH (Automated Clearing House) files
  bank file package. See bank file package
  e-mail package. See e-mail package
parallel loading, scaling out with, 479–485
parameters
  compared with variables, 34
  creating and configuring project level, 761
  data types for, 172–173
  defining, 171–172
  Managed Object Model and, 671
  overview of, 162–163
  packages and, 14
  parameter objects, 677–678
  project deployment model and, 749
  referencing in expressions, 181
  T-SQL setting parameter values, 758–760
  using as expressions, 191–193
parent packages, 80–81
Pareto principle (80/20 rule), 228
parsing
  ACH file package, 854–856
  bank file package, 832–835
partial-cache mode, of Lookup Transformation
  in cascaded Lookup operations, 227–228
  defined, 202, 474
  overview of, 220–222
partially blocking transformations, 119
Partition Processing Destination, in Data Flow, 118
partitioned fact tables, considerations, 344
partitioning
  scaling across machines using horizontal, 477–478
  staged data as, 475–477
passwords, for data protection, 21
Patch operation, XML Task, 722
paths
  path argument, 730
  path attachment, 593

patterns, analyzing source data for. See data profiling
PDSA (Plan, Do, Study, and Act), 524
Percentage Sampling Transformation, in Data Flow, 147
perfmon, 796
performance counters, 796
performance metrics, 576
Performance Monitor, 518–520
performance monitoring
  of pipeline, 518–520
  troubleshooting bottlenecks in Data Flow, 516–518
performance overhead
  of data types, 167
  of database snapshots, 406
  of Fuzzy Lookup Transformation, 133, 136
  of Lookup Transformation caching modes, 217
PerformUpgrade method, design-time methods, 591
permissions
  catalog, 784
  folder, 783
  object, 784
persisted cache, Lookup Transformation, 474
persistent file storage, Lookup Transformation, 474
pipeline
  component types, 586
  debugging components in, 635–636
  defined, 474
  execution reports, 506–507
  execution tree log details, 505–506
  monitoring Data Flow execution, 503–505
  monitoring performance, 518–520
  overview of, 585–586
  scaling out memory pressures, 474–475
  troubleshooting bottlenecks, 516–518

Pipeline Components
  connection time functionality, 594–595
  design-time functionality, 589–593
  methods, 588–589
  preparing for coding, 596–602
  run-time functionality, 593–594
  UI (user interface) and, 643, 645
PipelineComponent base class, 598
pivot tables, 147–148
Pivot Transformation, in Data Flow, 147–150
placeholders, troubleshooting bottlenecks and, 516
Plan, Do, Study, and Act (PDSA), 524
PostExecute method
  adding runtime methods to components, 594
  Transformation Component, 625
Pragmatic Works BI xPress, 791
precedence, staging environments for, 512
Precedence Constraint Editor, 29, 181
precedence constraints
  advanced, 551
  basic, 549–551
  Boolean expressions used with, 551–555
  combining expressions and multiple precedence constraints, 556–557
  Control Flow and, 29–30, 195–196
  overview of, 8
  working with multiple, 555–556
predictive queries, Data Mining Query Task, 46–47
PreExecute method
  adding runtime methods to components, 593
  Destination Component, 631–633
  Transformation Component, 620
prefix, generating unique filename for archiving file, 259–260
PrepareForExecute method, adding runtime methods to components, 593
PrimeOutput method
  adding runtime methods to components, 594
  Transformation Component, 620, 625
processing windows, staging environments for, 512
ProcessInput method
  adding runtime methods to components, 594
  Destination Component, 631–632
  Transformation Component, 620–623, 625
Professional SQL Server Analysis Services 2012 with MDX and DAX (Harinath et al.), 118
profiling, Data Profiling Task, 48–50
programming custom features. See customizing SSIS
Project class, 676–677
Project Connection Managers, 822–823
project deployment model
  catalog logging and, 582
  deploying projects with, 761
  managed code and, 676–677
  overview of, 748–751

Project Portal, 540
projects
  adding to UI, 645–647
  Build menu, 752
  configuring to use environments, 763–765
  creating and aligning with solutions, 24–25
  creating and configuring project level parameters, 761
  creating from Solution Explorer window, 27
  defined, 24
  deploying. See project deployment model
  listings, 688–689
  Managed Object Model and, 671
  parameters, 162
  server deployment project, 679
  tips related to working with large projects, 805
  versioning in SQL Server 2012, 746
properties
  checkpoint file, 454
  configuring Script Task Editor, 265–266
  of Dts objects, 266
  groups vs. containers and, 95
  setting catalog properties, 744–747
  of tasks, 41–42
  using expressions in Connection Manager, 193–194
Properties windows, 28, 171
Property Expressions Editor, 41
ProvideComponentProperties method
  debugging components, 636
  design-time methods, 589–590
  Destination Component, 626–627
  Source Component, 602–603
  Transformation Component, 615–616
proxy accounts, 789–790

Q

QBE (Query-By-Example) tool, 71
queries
  catalog logging and, 583–584
  Data Mining Query Task, 46–47
  T-SQL querying tables to set parameter values, 759–760
  WQL queries, 84–85
Query Optimizer, 816
Query-By-Example (QBE) tool, 71
Quick Watch window, 310

R

Ragged Right option, in SSIS, 296
relational database management systems (RDBMS)
  reducing reliance on, 508–510
  types of database systems, 64
Raw File Destination, in Data Flow, 114, 117
Raw File Source
  configuring Cache Connection Manager, 230
  in Data Flow, 10, 114–115
raw files
  Data Flow restart using, 476–477
  scaling across machines using, 477–479
  scaling out by staging data using, 475
RDBMS (relational database management systems)
  reducing reliance on, 508–510
  types of database systems, 64

RDBMS Server tasks
  Bulk Insert Task, 64–68
  Execute SQL Task. See Execute SQL Task
  overview of, 64
ReadOnlyVariables property
  Script Component, 291–292
  Script Task, 267–268
ReadWriteVariables property
  Script Component, 291–292
  Script Task, 267–268
Recordset Destination, in Data Flow, 117
recordsets, Execute SQL Task, 73–75
Redirect Rows to Error Output, Lookup Transformation, 223–224
Redirect Rows to No Match Output, Lookup Transformation
  handling dirty data, 245–246
  in multiple outputs, 223–225
references
  columns, 181–182
  environment, 765
  parameters, 181
  variable, 180–181
RegisterEvents method, design-time methods, 591
registration, of assembly, 263
registry, package configuration and, 705–706, 771
regular expressions, validating data using, 297–298
ReinitializeMetaData method
  design-time methods, 590
  Destination Component, 629–630
  Source Component, 609
  Transformation Component, 617–618

relational engine
  Change Data Capture. See CDC (Change Data Capture)
  data extraction. See data extraction
  data loading, 405–411
  overview of, 375
relational joins. See also joins
  overview of, 203–204
  using in source, 209–210
relative references, environment references, 765
ReleaseConnections method, 595
reliability and scalability
  overview of, 453
  restarting packages for. See package restartability
  scaling out for. See scaling out
  summary, 485
  using error outputs for, 471–473
  using package transactions for data consistency. See package transactions
Rendezvous, Tibco, 82
Reporting Services, 795
reports
  All Executions report, 792–794
  catalog logging, 582–583
  custom, 795
  MSF Agile, 538–539
  options, 794–795
  performance bottlenecks, 516–518
  pipeline execution, 506–507
Required property, parameters, 162, 172
resources
  used by blocking transformations, 497
  used by non-blocking vs. semi-blocking transformations, 497
restarting packages. See package restartability
reusability, caching operations for. See Lookup Transformation
Reverse String Transformation
  building UI with, 643
  design time debugging, 635–636
  operating on user interface columns, 654–655
  runtime debugging, 637–640
Review Data Type Mapping screen, Import and Export Wizard, 20
root cause analysis, 576
Row Count Component, 309
Row Count Transformation
  capturing total batch items, 840
  in Data Flow, 124–125
row counters, Performance Monitor, 519–520
Row Number Transformation, 478
Row Sampling Transformation, 147
row-based transformations
  optimizing processing with, 515–516
  overview of, 494–495
Rows Read, performance counters, 796
Rows Written, performance counters, 796
rules
  conditional expressions, 188
  date/time type conversion, 166–167
  numeric literals, 178–179
runtime
  adding methods to components, 593–594
  component phases, 588
  debugging components, 634, 637–640
  defining variables, 170
  Source Component methods, 611–614
  Transformation Component methods, 620–625
  UI connections and, 658–661

S

Save and Execute Package screen, Import and Export Wizard, 20–21

Save SSIS Package Screen, Import and Export Wizard, 21

saving data, to XML file, 276–277
saving packages, 239
scaling out. See also reliability and scalability
  architectural features of, 474
  memory pressures, 474–475
  overview of, 473
  with parallel loading, 479–485
  by staging data, 475–479

SCD (Slowly Changing Dimension) Transformation
  complex dimension changes with, 331–335
  considerations and alternatives to, 335–336
  in Data Flow, 126
  loading simple dimension table with, 325–327
  querying CDC in SSIS, 402–403
scheduling packages
  overview of, 787
  proxy accounts and, 789–790
  SQL Server Agent for, 787–788
scope, variable, 31–32
Script Component
  accessing variables in, 291–292
  adding programmatic code to, 255–256
  as alternative to SCD for dimension table data, 336
  compiled assemblies in, 263–264
  configuring Script Component Editor, 289–291
  connecting to data sources, 292
  data validation example, 294–302
  editor, 289–291
  logging, 293–294
  overview of, 125–126, 288
  primary role of, 254
  raising events in, 292–293
  script debugging and troubleshooting, 308–310
  Script Task compared with, 288–289
  synchronous vs. asynchronous transformations, 302–305
  when to use, 255

Script tab, Script Task Editor, 265
Script Task
  accessing variables in, 267–271
  adding programmatic code to, 255–256
  breakpoints set in, 569
  checkpoint file and, 461–462
  coin toss example, 552–555
  compiled assemblies in, 263–264
  connecting to data sources in, 271–279
  in Control Flow, 264
  defined, 254
  Dts object, 266
  Foreach ADO Enumerator example, 102
  Hello World example, 257–258
  logging, 287–288
  For Loop Container, 95, 96–97
  overview of, 43–45
  raising events in, 281–286
  Script Component compared with, 288–291
  script debugging and troubleshooting, 308–311
  setting variables in, 13, 171
  SSAS cube processing with, 345, 349
  when to use, 255
Script Task Editor, 265–266
scripting
  adding code and classes, 259–260
  custom .NET assemblies for, 261–264
  debugging and troubleshooting, 308–311
  getting started, 255
  Hello World example, 257–258
  interacting with external applications and, 719
  introduction to, 253–254
  managed assemblies for, 260–261
  overview of, 253
  Script Component. See Script Component
  Script Task. See Script Task
  selecting scripting language, 255–256
  structured exception handling, 305–308
  summary, 311–312
  VSTA Scripting IDE, 256–257

scrubbing data. See mainframe ETL, with data scrubbing

SDLC (software development life cycle)
  branching, 546
  as development methodology, 719
  history of, 524
  iterative approach, 525
  labeling (striping) source versions, 547–548
  merging, 546–547
  MSF Agile and, 537–539
  overview of, 521–523
  Project Portal, 540
  shelving and unshelving, 544–546
  Subversion (SVN), 526–533
  summary, 548
  Team Foundation Server and, 533–536
  Team System features, 540–542
  Team System version and source control, 542–544
  versioning and source code control, 525–526
  waterfall approach, 524

security
  catalog, 782–785
  legacy security, 785–787
SEH (structured exception handling), 305–308
SELECT * statements
  with JOINS, UNIONS and subqueries, 381–382
  performing transformations, 380–381
  problems with, 376–377
  sorting data, 382–384
  WHERE clause and, 377–378
SelectSQL variable, 198–199
SelectSQL_ExpDateParm variable, 198–199
SelectSQL_UserDateParm variable, 198–199
semi-blocking transformations
  Data Flow design practices, 508–510
  overview of, 495–496
Send Mail Task
  adding, 870
  overview of, 83–84
Sequence Container
  overview of, 94
  in single package, multiple transactions, 467
  tasks, 42
sequence tasks, 42–43
serialization, XML object-based, 277–280
service-oriented architectures, Web services and, 55
set-based logic, in data extraction, 389–391
SetComponentProperty method, design-time methods, 591
SetUsageType method
  design-time methods, 592
  Destination Component, 630–631
  Source Component, 608–609
  Transformation Component, 618–619
shadow tables, SQL Server Agent writing entries to, 394
Shannon, Claude, 524
shared methods, 262
SharePoint Portal Services, 540
shelving/unshelving, source control and, 544–546
Shewhart, Dr. Walter, 524
shredding recordsets, Execute SQL Task, 73
signing assemblies, 262–263
Slowly Changing Dimension Transformation. See SCD (Slowly Changing Dimension) Transformation
SMO (SQL Management Objects)
  Managed Object Model based on, 671
  overview of, 87
SMO administration tasks
  overview of, 87–88
  Transfer Database Task, 88–89
  Transfer Error Messages Task, 89
  Transfer Job Task, 91
  Transfer Logins Task, 89–90
  Transfer Master Stored Procedures Task, 90
  Transfer SQL Server Objects Task, 91–92
SMTP
  Connection Manager, 83–84
  e-mail messages via, 83
  invoice matching process and, 868
  values returned by, 595
snapshots, database
  creating, 406–408
  saving ACH file to database, 861
  saving bank file to database, 844–845


Soft NUMA node, in parallel loading, 484–485

software development life cycle. See SDLC (software development life cycle)

Solution Explorer
  components in, 26
  creating new project, 27
  executing packages, 25–26, 36
  OLE DB Connection in, 822
Solution Framework, Microsoft, 525
solutions
  creating new project in, 27
  creating projects and aligning with, 24–25
  defined, 24
sort in database, data extraction and, 382–384
Sort Transformation
  asynchronous transformation outputs and, 498
  as blocking transformation, 496–497
  data flow example using, 159
  InfoPath example, 725
  loading fact table, 339
  overview of, 126–127
  presorting data for Data Mining Model Training Destination, 118
  presorting data for Merge Join Transformation, 212–215
  presorting data for Merge Transformation, 144
  processing bank file check and invoice details, 842
  sorting data in SQL Server compared with, 382–384
Source adapters
  debugging, 634–636
  installing, 633–634
  as integral to Data Flow, 500–501
  overview of, 586–587

Source Assistant
  accessing heterogeneous data in, 414
  configuring source in Data Flow with, 107
  defining Data Flow for packages, 237

Source Component
  AcquireConnections method, 604–606
  buffer objects and, 614
  columns and, 613–614
  ComponentMetaData properties, 603–604
  ComponentType property and, 598
  Connection Managers and, 605–606
  data types, 608–609
  debugging source adapter, 634–636
  FileUsageType property, 605–606
  helper methods, 608
  installing source adapter, 633–634
  MapInputColumn/MapOutputColumn methods, 611
  overview of source adapter, 586–587
  ParseTheFileAndAddToBuffer method, 612–613
  PrimeOutput method, 611–612
  ProvideComponentProperties method, 602–603
  querying CDC in SSIS, 404
  ReinitializeMetaData method, 609
  SetUsageType method, 608–609
  types of pipeline components, 586
  Validate method, 606–609
source control
  branching, 546
  iterative development and, 525
  labeling (striping) source versions, 547–548
  merging, 546–547
  MSF Agile, 539
  shelving and unshelving, 544–546
  Team System and, 542–544
  tools for, 523
  versioning and source code control, 525–526
Source type, of Script Component
  configuring Script Component Editor, 289–290
  connecting to data sources, 292
  defined, 288

sources
  ADO.NET Source, 115
  CDC Source, 398–400
  configuring destination vs., 115–116
  connecting in Script Component to, 292
  connecting in Script Task to, 271–279
  connectivity, 719
  in Data Flow, 10–11
  ETL development and, 523
  Excel Source, 109–110
  flat files. See Flat File Source
  function of, 106
  Import and Export Wizard and, 17–18
  mapping to destinations, 20
  OLE DB. See OLE DB Source
  overview of, 106–107
  permissions. See data profiling
  processing data from heterogeneous sources, 800
  raw files. See Raw File Source
  Transfer Database Task and, 88–89
  XML Source, 57–60, 115

space padding, string functions and, 186–187
SPC (statistical process control), 524
special characters, string literals with, 179–180
Spiral, iterative development, 525
SQL (Structured Query Language)
  capturing multi-row results, 73–75
  capturing singleton results, 72–73
  creating BankBatch table, 813–814
  creating BankBatchDetail table, 814–815
  creating corporate ledger data, 815
  creating CustomerLookup table, 813
  creating ErrorDetail table, 816
  creating Invoice table with, 811–812
  executing batch of statements, 71–72
  executing parameterized statements, 69–71
  executing stored procedure, 75–78
  Management Objects. See SMO (SQL Management Objects)
  retrieving output parameters from stored procedures, 78–80
SQL Profiler
  log provider for, 577
  package log provider for, 700
  programming to log providers, 703

SQL Server
  Analysis Services. See SSAS (SQL Server Analysis Services)
  authentication, 782
  Bulk Insert Task, 64–65
  CDC. See CDC (Change Data Capture)
  creating central server, 766–768
  Data Tools. See SSDT (SQL Server Data Tools)
  deploying SQL Server 2012, 679–680
  deployment options, 753
  destinations, 118
  editions, 14–15
  Integration Services. See SSIS (SQL Server Integration Services), introduction to
  log provider for, 577
  Management Studio. See Management Studio
  operation logs in SQL Server 2012, 703–705
  package configuration and, 705–706
  package log provider for, 700
  programming to log providers, 703
  project versioning in SQL Server 2012, 746
  single package using native transaction in, 469–471
  Transfer SQL Server Objects Task, 91–92
  upgrading components for SQL Server 2012, 641
  WMI Data Reader Task for gathering operational type data, 86
SQL Server Agent, 393–394, 787–788
SQL Server Business Intelligence Edition, 15
SQL Server Enterprise Edition, 14–15
SQL Server Standard Edition, 15
SQLCMD command, in parallel loading, 484
SQLMOBILE, 69–71
SQLStatement property, Execute SQL Task, 194–195, 470


SSAS (SQL Server Analysis Services)
  cube processing with, 314–315, 345–350
  Data Mining Query Task, 46–47
  Execute SQL Task, 45
  Processing Task, 46
SSDT (SQL Server Data Tools)
  adding components to, 633–634
  common task properties, 41–42
  creating deployment utility, 751–752
  creating first package, 25–26
  creating new project, 732
  data taps, 765–766
  debugging components, 634
  locating and opening, 23
  opening Import and Export Wizard, 18
  overview of, 4
  Properties windows, 28
  runtime debugging, 637–640
  Solution Explorer window, 26–27
  solutions and projects in, 24
  Toolbox items, 27–28
SSIS (SQL Server Integration Services), introduction to
  architecture of, 5
  containers, 8–9
  Control Flow, 6
  Data Flow, 9–10
  data tools. See SSDT (SQL Server Data Tools)
  destinations, 13
  error handling and logging, 14
  history of and what’s new, 2
  Import and Export Wizard, 3
  overview of, 1–2
  packages, 5–6
  parameters, 14
  precedence constraints, 8
  sources, 10–11
  SQL Server editions and, 14–15
  summary, 15–16
  tasks, 6–7
  transformations, 11–12
  variables, 13–14
SSIS external management. See external management, of SSIS
SSIS interaction with external applications. See external applications, interaction with
SSIS Package Configuration, 770–773
SSIS Package Store, 755–757
SSIS tools. See tools, SSIS
ssis_admin role, 782
staged data
  across machines, 477–479
  Data Flow design for, 508, 512–513
  Data Flow restart, 475–477
  scaling out by, 475

Standardize Zip Code Transformation, 243–244

static methods, 262
statistical process control (SPC), 524
Stephens’ Visual Basic Programming 24-Hour Trainer (Stephens), 254
steps (phases), in SDLC, 523
storage

  catalog for, 743
  of files, 806
  of packages, 683
stored procedures
  catalog security and, 785
  controlling and managing catalog with, 745–747
  in databases, 748
  encapsulating common queries in, 384–385
  executing, 75–78
  in parallel loading, 481
  querying CDC, 402–404
  retrieving output parameters, 78–80
  Transfer Master Stored Procedures Task, 90
  T-SQL for package execution, 758
  for working with batches, 816–819
streaming assertion, 387
strings
  concatenation (+) operator, 177
  functions, 185–187
  literals, 179–180


striping (labeling) source versions, 547–548
strong names
  GAC (global assembly cache) and, 598–599
  signing assembly with, 646–647, 651–652
structured exception handling (SEH), 305–308
Structured Query Language. See SQL (Structured Query Language)
subqueries, in data extraction, 381–382
SUBSTRING function, 243
Subversion (SVN). See SVN (Subversion)
success values, constraints, 8
suffixes, numeric literal, 178–179
surrogate keys, in data warehousing, 323, 338
SVN (Subversion)
  configuring, 526–527
  connecting project to, 529–531
  downloading and installing, 526
  locking files for editing and committing changes, 531–532
  overview of, 525–526
  testing integration with project, 531
  for version control for packages, 239
  walkthrough exercise, 527–529
Swap Inputs button, Merge Join Transformation, 215
synchronous processes
  limiting in Data Flow design, 509
  reducing in Data Flow design, 508
  reducing in data-staging environment, 513
  tasks in Control Flow, 491
synchronous transformations
  identifying, 493, 500
  vs. asynchronous, 119, 498–500
  writing Script components to act as, 302–303
SynchronousInputID property, 500
System Monitor, 796
system variables, 31–34

T

tab-delimited files, 110–111
tables
  creating with Management Studio, 731–732
  in databases, 748
  enabling CDC for, 394
  package configuration and, 705, 771
  T-SQL querying tables to set parameter values, 759–760
table-valued parameters, 389–391
Tabular Data Stream (TDS), 72
task editors
  Bulk Insert Task, 65
  data profiling and, 50
  Expressions tab, 40–41
  FTP Task Editor, 53
  overview of, 39
  Script Task Editor, 265–266

Task Host Container, 93
task objects, 40
tasks
  Analysis Services, 45–46
  archiving files, 52
  Bulk Insert Task, 64–68
  comparing Data Flow with Control Flow, 488–490
  Data Flow Task, 10, 47–48
  Data Mining Query Task, 46–47
  data preparation tasks, 48
  Data Profiling Task, 48–50
  defined, 39
  DQS (Data Quality Services), 366
  ETL tasks, 6–7
  evaluating, 30
  Execute Package Task, 80–81
  Execute Process Task, 81–82
  Execute SQL Task. See Execute SQL Task
  File System Task, 50–51
  FTP Task, 53–55
  grouping in containers, 31
  logging, 576–577
  looping and sequence tasks, 42–43
  Message Queue Task, 82–83
  opening for editing, 237
  overview of, 39
  precedence constraints controlling, 550
  properties of, 41–42
  RDBMS Server tasks, 64
  Script Task, 43–45
  Send Mail Task, 83–84
  SMO administration tasks, 87–88
  summary, 92
  Task Editor, 40–41
  Transfer Database Task, 88–89
  Transfer Error Messages Task, 89
  Transfer Job Task, 91
  Transfer Logins Task, 89–90
  Transfer Master Stored Procedures Task, 90
  Transfer SQL Server Objects Task, 91–92
  Web Service Task, 55–60
  WMI Data Reader Task, 84–86
  WMI Event Watcher Task, 86–87
  work flow tasks, 80
  working with multiple precedence constraints, 555–556
  XML Task, 60–64

TDS (Tabular Data Stream), 72
team builds, MSF Agile, 539
Team Foundation Server (Microsoft), 526, 533–536
team preparation, ETL and, 522–523
Team Project, setting up, 534–536
Team System. See VSTS (Visual Studio Team System)
Term Extraction Transformation
  Advanced tab, 154–156
  in Data Flow, 152–156
  Exclusion tab, 154
  overview of, 152–153
  Term Extraction Transformation Editor, 153–154
Term Frequency and Inverse Document Frequency (TFIDF) score, 152–156
Term Lookup Transformation, 156–157
testing
  case study packages, 866
  data flows during development with Union All Transformation, 210
  database snapshot functionality, 407
  expressions with Expression Builder, 175
  external applications, 720
  Immediate window for, 311
  UI component, 667
text
  comma-delimited file requirement, 110–111
  Derived Column for advanced data cleansing, 355–357
  Term Extraction Transformation, 152–156
  Term Lookup Transformation, 156–157
text files
  data extraction, 385–389
  log provider for, 577, 700
  MERGE operator and reading from, 411
  programming to log providers, 703
TFIDF (Term Frequency and Inverse Document Frequency) score, 152–156
third-party solutions
  Change Data Capture, 392
  trash destinations for testing purposes, 210
threads
  monitoring Data Flow execution, 502
  optimizing package processing, 513–515
Tibco Rendezvous, 82
time
  data types, 166
  functions, 188–190
Toolbox
  adding components to, 633–634
  working with, 27–28

tools, SSIS
  annotations, 31
  Connection Managers, 31
  Control Flow, 29–31
  creating first package, 25–26
  Data Flow, 34
  event handlers, 34–35
  executing packages, 36
  Import and Export Wizard and, 17–22
  Management Studio, 36–37
  overview of, 17
  Package Designer, 28
  Package Explorer, 35–36
  parameters, 34
  Properties windows, 28
  Solution Explorer window, 26–27
  SSDT (SQL Server Data Tools), 23–25
  summary, 37
  task groups, 31
  Toolbox items, 27–28
  variables, 31–34

TransactionOption property
    possible settings for, 464
    in single package, multiple transactions, 467
    in single package, single transaction, 465–466
    in two packages, one transaction, 468–469
transactions, package. See package transactions
Transfer Database Task, 88–89
Transfer Error Messages Task, 89
Transfer Job Task, 91
Transfer Logins Task, 89–90
Transfer Master Stored Procedures Task, 90
Transfer SQL Server Objects Task, 91–92
Transformation Component
    building, 614
    ComponentType property and, 598
    configuring Script Component Editor, 289–290
    debugging, 634–636
    defined, 288
    error handling, 623–624
    input/output verification methods, 619–620
    installing, 633–634
    overview of, 587
    PostExecute method, 625
    PreExecute method, 620
    PrimeOutput method, 620, 625
    ProcessInput method, 620–623, 625
    ProvideComponentProperties method, 615–616
    ReinitializeMetaData method, 617–618
    SetUsageType method, 618–619
    types of pipeline components, 586
    Validate method, 616–617

transformations
    Aggregate Transformation. See Aggregate Transformation
    asynchronous outputs, 498–499
    Audit Transformation, 128–129, 248
    blocking, 496–497
    Cache Transformation, 124, 229–230
    Character Map Transformation. See Character Map Transformation
    Conditional Split Transformation. See Conditional Split Transformation
    Copy Column Transformation, 130
    Data Conversion Transformation. See Data Conversion Transformation
    in Data Flow, 11–12, 47
    Data Flow and Control Flow comparison, 488–490
    Data Flow design for correlation and integration, 510–511
    Data Flow design for data cleansing, 511–512
    Data Flow restart using, 476
    Data Mining Query Transformation, 130–131
    Data Quality Services (DQS) Cleansing Transformation, 131
    Derived Column Transformation. See Derived Column Transformation
    DTS (Data Transformation Services). See DTS (Data Transformation Services)
    example, 98–99
    Export Column Transformation. See Export Column Transformation
    function of, 106
    Fuzzy Grouping Transformation. See Fuzzy Grouping Transformation


transformations (continued)
    Fuzzy Lookup Transformation. See Fuzzy Lookup Transformation
    Import Column Transformation. See Import Column Transformation
    InfoPath example, 723–726
    Lookup Transformation. See Lookup Transformation
    Merge Join Transformation. See Merge Join Transformation
    Merge Transformation. See Merge Transformation
    Multicast Transformation, 145, 516–518
    non-blocking (streaming and row-based), 493–495
    OLE DB Command Transformation. See OLE DB Command Transformation
    overview of, 119
    Percentage Sampling and Row Sampling Transformations, 147
    Pivot Transformation, 147–150
    Reverse String Transformation. See Reverse String Transformation
    Row Count Transformation, 124–125, 840
    Row Number Transformation, 478
    SCD (Slowly Changing Dimension) Transformation. See SCD (Slowly Changing Dimension) Transformation
    Script Component and, 125–126
    semi-blocking, 495–496
    Sort Transformation. See Sort Transformation
    Source and Destination adapters, 500–501
    Standardize Zip Code Transformation, 243–244
    synchronous outputs, 499–500
    synchronous vs. asynchronous, 119
    synchronous vs. asynchronous transformations, 294–302
    Term Extraction Transformation. See Term Extraction Transformation
    Term Lookup Transformation, 156–157

    troubleshooting bottlenecks in Data Flow, 517
    types of, 493–497
    Union All Transformation. See Union All Transformation
    Unpivot Transformation, 150–152
    when to use during data extraction, 378–381
    XSLT (Extensible Stylesheet Language Transformations), 61, 722
trash destinations, testing data flow in development with, 210
triggers, adding for Change Data Capture, 391
troubleshooting performance bottlenecks, 516–518
True

    Boolean expressions and, 182–183
    Boolean literals and, 180
    in conditional expressions, 187–188
truncation, during casting, 170
Try/Catch/Finally structure, in Visual Basic or C#, 305–308
T-SQL
    aggregating data, 119–120
    configuring projects to use environments, 763–764
    controlling environments with, 745
    DMX (Data Mining Extension) to, 47
    expression functions vs. functions in, 175
    managing security, 785
    for package execution, 736–741, 757–758
    querying tables to set parameter values, 759–760
    setting environments with, 762
    setting parameter values with, 758–759

U

UI (user interface)
    adding project to, 645–647
    building form for, 653
    column display in, 654–657
    column properties, 665–667


    column selection, 657–658
    component-level properties, 661–663
    design-time functionality and, 589
    Expression Builder, 175
    extending, 658
    handling errors and warnings, 663–665
    implementing IDtsComponentUI interface, 647–651
    managing security with GUI, 783–784
    modifying form constructor, 653–654
    overview of, 643
    runtime connections, 658–661
    setting UITypeName property, 651–653
    steps in building, 644
    summary, 667

UITypeName property, 644, 651–653
unchanged output, SCD Transformation, 335
Ungroup command, 95
Unicode
    conversion issues, 167–169
    string functions in, 186–187
UNION
    in data extraction, 381–382
    set-based logic for extraction, 389–391

Union All Transformation
    adding to Lookup Transformation, 225–226
    as asynchronous transformation, 500
    in Data Flow, 127–128
    data preparation for complex dimension table, 330
    in parallel loading, 483
    querying CDC in SSIS, 405
    sending cleansed data back into main data path with, 246
    testing data flow in development with, 210
    testing data flow with Fuzzy Lookup, 359
    testing Lookup Transformation, 217
    testing Merge Join Transformation, 215

Unpivot Transformation, in Data Flow, 150–152

UPDATE statements, 408–411
updates
    capture instance tables in CDC, 396
    complex dimension changes with SCD, 333–335
    limitations of SCD, 336
    loading simple dimension table, 327

upgrading components, to SQL Server 2012, 641

usability, UI design principles, 667
user interface. See UI (user interface)
user variables, 32

V

Validate method
    design-time methods, 590
    Destination Component, 628–629
    Source Component, 606–609
    Transformation Component, 616–617
Validate operation, XML Task, 722
validation
    ACH file package, 853–854, 856–859
    bank file package, 832, 835–839
    of data using Script Component, 294–302
    staged data in, 513
    timeout, 747
    of XML file, 62–64

Variable Mappings tab
    Foreach ADO Enumerator example, 101–102
    Foreach File Enumerator, 98–99
VariableDispenser object, Script Task, 267
variables
    accessing in Script Component, 291–292
    accessing in Script Task, 267–271
    ACH file package, 845–846
    adding to checkpoint file, 461
    data types for, 172–173
    defining, 170–171
    displaying list of, 32
    as expressions, 191–193
    Immediate window for changing value of, 311


variables (continued)
    matching process package, 867
    NULL values and, 183–184
    options for setting, 13–14
    overview of, 162
    package configuration and, 706, 771
    referencing in expressions, 180–181
    retrieving data from database into, 272–274
    scope, 31–32
    setting up for bank file load package, 819–823
    setting variable values in environments, 761–762
    types of, 31
Variables collection, Script Task, 267–268
VB (Visual Basic)
    Hello World example, 257–258
    overview of, 254
    Script Task accessing VB libraries, 43–44
    selecting as scripting language, 255–256
    using VSTA scripting IDE, 255–256

verification methods, Pipeline Components, 589

version control. See also source control
    packages, 239
    project versioning in SQL Server 2012, 746
    source code control and, 525–526
    Team System and, 542–544

Visual C#, creating Windows application project, 734–736

Visual Studio
    32-bit runtime executables in 64-bit mode, 416–417
    64-bit issues, 791
    creating Visual C# Windows application project, 734–736
    source control and, 526
    SSDT (SQL Server Data Tools) and, 4, 22
    Team System. See VSTS (Visual Studio Team System)
    Tools for Applications. See VSTA (Visual Studio Tools for Applications)

Visual Studio Team Explorer 2010, 533
VSTA (Visual Studio Tools for Applications)
    accessing with Script Task, 43
    Hello World example, 257–258
    Script Task and Script Component using, 254
    using managed assemblies for development purposes, 260
    using scripting IDE, 255–256
VSTS (Visual Studio Team System)
    features, 540–542
    source control and collaboration and, 522
    Team Foundation Server and, 533–536
    version and source control, 542–544

W

warnings
    packages and, 247
    user interface and, 663–665
watch windows
    script debugging using, 310
    viewing debugging with, 571
waterfall methodology, in SDLC, 524
Web Service Task
    General tab, 56
    Input tab, 56
    Output tab, 57
    overview of, 55
    retrieving data from XML source, 57–60

Web Services Description Language (WSDL), 55

Web Services, XML and, 414, 431–442
WHERE clause, in data extraction, 377–378
Windows Authentication
    credentials and, 18
    securing catalog and, 782
Windows clusters, 768–769
Windows Forms
    for displaying user interface, 647
    steps in building UI (user interface), 644

Windows Management Instrumentation. See WMI (Windows Management Instrumentation)


Windows OSs
    increasing memory in 32-bit OS, 474
    log providers for Windows events, 577, 701
Winform application, for dynamic property assignment, 731–736
WMI (Windows Management Instrumentation)
    Connection Managers, 84
    overview of, 709
    values returned by Connection Manager, 595
WMI Data Reader Task
    example, 710–715
    explained, 709–710
    overview of, 84–86
WMI Event Watcher Task
    example, 716–718
    explained, 715
    overview of, 86
    polling a directory for file delivery, 86–87
work flows
    Execute Package Task, 80–81
    Execute Process Task, 81–82
    handling with Control Flow, 491
    Message Queue Task, 82–83
    overview of, 80
    Send Mail Task, 83–84
    WMI Data Reader Task, 84–86
    WMI Event Watcher Task, 86–87

WQL queries, 84–85
wrapper classes, user interface and, 653–654
WSDL (Web Services Description Language), 55

X-Y

XML (Extensible Markup Language)
    retrieving data from XML source, 57–60
    retrieving XML-based result sets using Web service, 55
    sources in Data Flow, 11
    validating XML file, 62–64
    Web Services and, 414, 431–442
XML Diffgram, 62
XML files
    in case study solution architecture, 803
    log provider for, 577, 701
    package configuration and, 705–706, 771
    programming to log providers, 703
    retrieval of file size, 848–850
    saving data to, 276–277
    serializing data to, 277–280
    storing log information in, 576

XML Path Language (XPATH), 61, 722
XML Schema Definition (XSD), 61, 115
XML Source, in Data Flow, 115
XML Task
    configuring, 720–721
    InfoPath document consumed by, 723–726
    operation options, 61
    OperationType options, 61–62
    overview of, 60
    validating XML file, 62–64
XMLA code, 348–349
XPATH (XML Path Language), 61, 722
XSD (XML Schema Definition), 61, 115
XSLT (Extensible Stylesheet Language Transformations), 61, 722

Z

zip codes, handling dirty data, 243
