Pro SQL Server 2012 integration services : [build performance-driven ETL solutions using SSIS]

Pro SQL Server 2012 Integration Services, Francis Rodrigues, Michael Coles, David Dye, Apress

Page 1: Pro SQL Server 2012 integration services : [build …Pro SQL Server 2012 integration services : [build performance-driven ETL solutions using SSIS] Subject New York, NY, Apress, Springer,

Pro SQL Server 2012

Integration Services

Francis Rodrigues

Michael Coles

David Dye

Apress

Contents at a Glance

About the Authors xvii

About the Technical Reviewer xviii

Chapter 1: Introducing Integration Services 1

Chapter 2: BIDS and SSMS 11

Chapter 3: Hello World—Your First SSIS 2012 Package 43

Chapter 4: Connection Managers 83

Chapter 5: Control Flow Basics 107

Chapter 6: Advanced Control Flow Tasks 163

Chapter 7: Source and Destination Adapters 203

Chapter 8: Data Flow Transformations 245

Chapter 9: Variables, Parameters, and Expressions 325

Chapter 10: Scripting 361

Chapter 11: Events and Error Handling 405

Chapter 12: Data Profiling and Scrubbing 427

Chapter 13: Logging and Auditing 465

Chapter 14: Heterogeneous Sources and Destinations 487

Chapter 15: Data Flow Tuning and Optimization 511

Chapter 16: Parent-Child Design Pattern 525

Chapter 17: Dimensional Data ETL 543

Chapter 18: Building Robust Solutions 561

Chapter 19: Deployment Model 579

Index 605

Contents

About the Authors xvii

About the Technical Reviewer xviii

Chapter 1: Introducing Integration Services 1

A Brief History of Microsoft ETL 1

What Can SSIS Do for You? 2

What Is Enterprise ETL? 3

SSIS Architecture 5

New SSIS Features 8

Our Favorite People and Places 9

Summary 10

Chapter 2: BIDS and SSMS 11

SQL Server Business Intelligence Development Studio 11

Analysis Services Project 12

Integration Services Project 14

Report Server Project Wizard 15

Report Server Project 15

Import Analysis Services Database 16

Integration Services Project Wizard 16

Report Model Project 16

Integration Services 18

Project Files 19

Tool Windows 21

Designer Window 23

SQL Server Management Studio 33

Tool Windows 33

SQL Server Management Studio Project 37

Templates 37

Code Snippets 39

Queries for SSIS 42

Summary 42

Chapter 3: Hello World—Your First SSIS 2012 Package 43

Integration Services Project 43

Key Package Properties 44

Package Annotations 45

Package Property Categories 46

Hello World 47

Flat File Source Connection 49

OLE DB Destination Connection 53

Data Flow Task 57

Real World 70

Control Flow 70

Execute SQL Task 71

Data Flow Task 72

Summary 81

Chapter 4: Connection Managers 83

Commonly Used Connection Managers 83

OLE DB Connection Managers 85

File Connection Managers 87

ADO.NET Connection Manager 90

Cache Connection Manager 92

Other Connection Managers 94

FTP Connection Manager 94

HTTP Connection Manager 96

MSOLAP100 Connection Manager 98

DQS Connection Manager 99

MSMQ Connection Manager 99

SMO Connection Manager 100

SMTP Connection Manager 100

SQLMOBILE Connection Manager 101

WMI Connection Manager 104

Summary 105

Chapter 5: Control Flow Basics 107

What Is a Control Flow? 107

SSIS Toolbox for Control Flow 108

Favorite Tasks 110

Data Flow Task 110

Execute SQL Task 111

Common Tasks 119

Analysis Services Processing Task 120

Bulk Insert Task 126

Data Profiling Task 130

Execute Package Task 134

Execute Process Task 138

File System Task 140

FTP Task 141

Script Task 144

Send Mail Task 147

Web Service Task 149

XML Task 152

Precedence Constraints 155

Basic Containers 157

Containers 157

Groups 158

Breakpoints 159

Summary 161

Chapter 6: Advanced Control Flow Tasks 163

Advanced Tasks 163

Analysis Services Execute DDL Task 163

Data Mining Query Task 165

Message Queue Task 170

Transfer Database Task 175

Transfer Error Messages Task 177

Transfer Jobs Task 180

Transfer Logins Task 182

Transfer Master Stored Procedures Task 184

Transfer SQL Server Objects Task 186

WMI Data Reader Task 190

WMI Event Watcher Task 192

Advanced Containers 194

For Loop Container 194

Foreach Loop Container 196

Task Host Controller 202

Summary 202

Chapter 7: Source and Destination Adapters 203

The Data Flow 203

Sources and Destinations 205

Source Assistant 205

Destination Assistant 212

Database Sources and Destinations 217

OLE DB 218

ADO.NET 226

SQL Server Destination 226

SQL Server Compact 226

Files 226

Flat Files 227

Excel Files 233

Raw Files 242

XML Files 243

Special-Purpose Adapters 243

Analysis Services 244

Summary 244

Chapter 8: Data Flow Transformations 245

High-Level Data Flow 245

Types of Transformations 246

Synchronous Transformations 247

Asynchronous Transformations 247

Blocking Transformations 248

Row Transformations 249

Data Conversion 249

Character Map 254

Copy Column 257

Derived Column 259

Import Column 262

OLE DB Command 265

Export Column 269

Script Component 271

Rowset Transformations 280

Aggregate 281

Sort 283

Pivot 287

Percentage Sampling 289

Row Sampling 291

Unpivot 293

Splits and Joins 297

Lookup 297

Cache Transformation 303

Conditional Split 309

Multicast 312

Union All 313

Merge 314

Merge Join 316

Auditing 319

Row Count 319

Audit 321

Business Intelligence Transformations 323

Summary 324

Chapter 9: Variables, Parameters, and Expressions 325

What Are Variables and Expressions? 325

What Are Parameters? 328

SSIS Data Types 331

Variable Scope, Default Values, and Namespaces 334

Scope 334

Default Values 337

Namespaces 337

System Variables 337

Package System Variables 338

Container System Variable 339

Task System Variables 339

Event Handler System Variables 340

Accessing Variables 342

Parameterized Queries 343

Derived Column Transformations 344

Conditional Splits 345

Recordset Destinations 346

Foreach Loop Containers 348

Script Tasks 350

Execute SQL Task Result Sets 352

Source Types 353

Dynamic SQL 354

Passing Variables 356

SSIS Expression Language 357

Functions 357

Operators 359

Summary 360

Chapter 10: Scripting 361

Script Task 361

Advanced Functionality 366

Script Component Source 375

Synchronous Script Component Transformation 383

Asynchronous Script Component Transformation 388

Script Component Destination 396

Summary 403

Chapter 11: Events and Error Handling 405

SSIS Events 405

Logging Events 407

Script Events 418

Script Task Events 418

Script Component Events 421

Event Handlers 423

Summary 425

Chapter 12: Data Profiling and Scrubbing 427

Data Profiling 427

Data Profiling Task 428

Data Profile Viewer 433

Column Length Distribution Profile 436

Column Null Ratio Profile 438

Column Pattern Profile 440

Column Statistics Profile 443

Column Value Distribution Profile 445

Candidate Key Profile 447

Functional Dependency Profile 450

Fuzzy Searching 452

Fuzzy Lookup 452

Fuzzy Grouping 458

Data Previews 460

Data Viewer 460

Data Sampling 462

Summary 464

Chapter 13: Logging and Auditing 465

Logging 465

Enabling Logging 466

Choosing Log Events 470

On SQL Logging 471

Summary Auditing 472

Batch-Level Auditing 473

Package-Level Auditing 478

Adding Auditing to Packages 480

Simple Data Lineage 481

Summary 486

Chapter 14: Heterogeneous Sources and Destinations 487

SQL Server Sources and Destinations 487

Other RDBMS Sources and Destinations 494

Flat File Sources and Destinations 495

Excel Sources and Destinations 498

XML Sources 502

Raw File Sources and Destinations 504

SQL Server Analysis Services Sources 506

Recordset Destination 508

Summary 509

Chapter 15: Data Flow Tuning and Optimization 511

Limiting Rows at the Database 511

Performing Joins in the Database 515

Sorting in the Database 516

Performing Complex Preprocessing at the Database 516

Ensuring Security and "Read Auditing" 517

Pulling Too Many Columns 517

Using Execution Trees 518

Implementing Parallelism 522

Summary 523

Chapter 16: Parent-Child Design Pattern 525

Understanding the Parent-Child Design Pattern 525

Using Parameters to Pass Values 527

Working with Shared Configuration Information 530

Overriding Properties 530

Logging 531

Implementing Data-Driven ETL 531

Summary 542

Chapter 17: Dimensional Data ETL 543

Introducing Dimensional Data 543

Creating Quick Wins 546

Run in Optimized Mode 546

Remove "Dead-End" Components 547

Keep Package Size Small 548

Optimize Lookups 549

Keep Your Data Moving 549

Minimize Logging 549

Use the Fast Load Option 550

Understanding Slowly Changing Dimensions 550

Type 0 Dimensions 550

Type 1 Dimensions 550

Type 2 Dimensions 556

Type 3 Dimensions 558

Summary 559

Chapter 18: Building Robust Solutions 561

What Makes a Solution Robust 561

Resilience 562

Data Flow Task 562

Event Handlers 572

Dynamism 573

Accountability 574

Log Providers 574

Custom Logging 575

Summary 577

Chapter 19: Deployment Model 579

The Build Process 579

The Deployment Process 581

Environments 588

Execution 594

The Import Process 601

The Migration Process 601

Summary 604

Index 605
