SAP Community Network Wiki - Enterprise Information Management - EIM Home
12/05/2012 SAP Community Network Wiki - Enterprise Information Management - EIM Home
wiki.sdn.sap.com/wiki/display/EIM/EIM+Home
Added by Moshe Naveh, last edited by Vicky Bolster on Mar 26, 2012
EIM Home
Welcome to the SAP Enterprise Information Management (short: EIM) topic. Feel free to create new entries or add to existing
ones.
Empower your people to make better decisions, drive operational excellence, ensure regulatory compliance, and minimize IT costs based
on trusted information. SAP solutions for Enterprise Information Management help you deliver integrated, accurate, and timely data —
both structured and unstructured — across your enterprise.
Moderators:
Kendra Van Gundy | Brandon Jacobson | Kris
Sorenson
How to contribute:
Click here to submit content
Please use Page Template for all
Submissions
SCN Discussions (formerly
called Forums):
Postalsoft | Data Integration and Data Quality
EIM Staging Area
Data Services
Getting Started with DataServices
Compare DataServices with...
Comparing DataServices with hand coding
Challenges with Scripts
Development goodies in DataServices
Reducing the cost of ownership with DataServices
Scenario 1 - Copy
Scenario 1 - Copy via SQL
Scenario 1 - Copy via
DataServices
Scenario 2 - SCD2
Scenario 2 - SCD2 via SQL
Scenario 2 - SCD2 via
DataServices
Scenario 2 - SCD2 via SSIS
Scenario 3 - Fact Load
Scenario 3 - Fact Load via SQL
Scenario 3 - Fact Load via
DataServices
Video Tutorials
Supported Platforms documentation
How to download a new release
DataServices and the RampUp program
How to get a new License Key
I am new to DS, where to start?
Training and certification of DataServices and other EIM products
Postalsoft
ACE
Business Edition and DeskTop Mailer
DataRight IQ
Frequently Used USPS and Industry Links
Label Studio
Match Consolidate
Postal File Preparation Tool
Presort
PrintForm
Information Steward
Business Terms Glossary (Metapedia)
Cleansing Package Builder
Data Insight
Information Steward Installation
Metadata Management
Metapedia
User Management and Security
Data Quality Management for Enterprise Apps
Data Quality Management for Informatica
Data Quality Management for SAP
Data Quality Management for Siebel
RapidMarts
Introduction To RapidMarts
Welcome to EIM!
Data Services
Postalsoft
Data Quality Management for Enterprise Apps
Information Steward
Data Quality Management SDK
Enterprise Information Management Use Case Wiki
RapidMarts
Submit your content!
Click here to submit content
Please use Page Template for all Submissions
(All new content to be created in Staging Area until
Moderation and Point Assignment)
Watch out!
RSS Feeds - Coming Soon!
Click here to watch when content is updated
Help!
Our Wiki User Guide - Coming Soon!
Contact Moderators via contact information at top
of page
Last updated
Recently Updated
Report Generation updated by Brandon Law May 01
CGUL Tips and Tricks for Entity Extraction updated by Julie Oliver May 01
test updated by Shari Wennes May 01
girls.jpg attached by Shari Wennes May 01
Data Services User Resource limits on Unix systems updated by Lina Encinales May 01
How to file a support case
Where is a DS and DQ FAQ
Where to find documentation
Setup scripts
How To - typical DataServices questions
Access to Previous Row Values
Previous row processing via custom
function
Previous row processing via self join
Previous row processing via User
Defined Transform
Adapter SDK Tutorial
Integration of Adapters into DI
Types of Adapters
Document Source-Target
Table Source-Target
FunctionCall
Adapter Operations
Poll Operation - Realtime Service
bridge
Listener Operation - Outbound
Message
A simple Table Read Adapter
Interfaces to Implement
The Adapter Interface
The Session Interface
The MetadataBrowsing Interface
The MetadataNode for the
RootNode
The MetadataNode for the
FileNode
The MetadataImport Interface
ImportByName Interface
The TableSource Interface
The GUI Interfaces
For the ImportByName
For the DataStore
Example II - the eMail Adapter
The Coding
Installation
Using the Adapter
The XML and processing
options
Reading eMails in Batch
Processing eMails in Realtime
Installing the Adapter
Adding and starting the SFAdapter
Creating a new DataStore
Execute a DataFlow
ANSI 92 joins in DataServices
ANSI 92 right outer join in
DataServices
Auditing and Validation
Audit Points
Data Validation
Build a "where exists" query
Complex transformation rules
Routing via the Case Transform
Lookup_ext() function
lookup_ext() with pattern
lookup_ext() with return expressions
Control the Commit points
Create a Designer Desktop Icon
Creating a last insert and last update date
Consuming an external Web Service
Cumulative Sum
Database Session parameters
Datastore Configurations
Using configurations for porting
Datatype conversions
Debugging jobs using log files
Delta Load Implementation
Timestamp based delta
Logtable based delta
Database Transactionlog based delta
Replication based delta
Messaging based delta
Table Comparison based delta
Initial load as delta
Ignore data not possible to be modified for delta
Error Handling
ETL Project Guidelines
Goals
Global Variables
Initialize-End Script
AW_EndJob
AW_JOBEXECUTION
AW_StartJob
Components
Sections
Dimension Tables
Fact Tables
PreLoad Stored Procedure
PreLoad Stored Procedure for
Oracle
PostLoad Stored Procedure
PostLoad Stored Procedure for
Oracle
Initial vs. Delta Load
Other ETL Project Rules
Restartability for Initial Loads
Restartability for Delta Loads
Supports the Recovery Feature
Testing
Flat Files
Change the Row Delimiter
Errorhandling in file formats
Excel- Save to DI
How to create a flat file format from a Query
Multirecord Files
Reading a large XML file
Reading multiple files at once
File Group Reader
Selective Reading and
Postprocessing
Shared Directory access
Writing a large XML file
Help, my Dataflow consumes so much
memory!
How to call a function returning multiple
parameters
How to delete records?
How to split a comma separated String into multiple rows?
Identify a Bottleneck in a Dataflow
Display Optimized SQL
Monitor Log File
The lookup tuning
The Stepped Execution
Carrying Attributes
Where is the bottleneck now?
Thread Names in the Monitor Log
Identify long running dataflow
Installation and Architecture
Example 1 - Three parallel projects
Example 2 - One project with three developers
Example 3 - Remote Development
Architecture Details
Where to put the Jobserver
Where to place the Repos
Naming the database accounts
Moving to Prod via Designer - Push
into Target Repo
Moving to Prod via Designer -
Export into ATL File
Central Repository - Yes or No
Creating a Central Repo
Using a Central Repo
Installation Checklists
Prerequisites when installing Designer
Installing Designer
Creating a Repository
Prerequisites for Windows Jobserver
Prerequisites for Unix Jobserver
Installing Windows Jobserver
Installing Unix Jobserver
Loading log files (Error, Trace, Monitor) into a table
Multiple Codepages
Terminology
The different Software Layers
Data Integrator example
ODBC connections from a Linux (or UNIX)
jobserver
To configure Teradata ODBC on Linux
and Unix
To configure DataDirect ODBC on
Linux and Unix
Oracle CDC
The Publisher and the Subscriber
Setting Up a CDC Environment
Create a CDC Datastore
Creating the Job
Oracle Hints and DI
Parallel vs sequential Execution
Read from Excel
Realtime at a Glance
Batch or Realtime?
Batch vs. RealTime Flows
The Nested Relational Data Model
NRDM
What is it (good for)?
Master-Detail Table to NRDM
An additional Hierarchy Level
Unnest or NRDM join
One table to NRDM
Separating a NRDM node
Building an XML String
Memory requirements of NRDM
Realtime Objects in batch and vice
versa
The RealTime DataFlow
Acting as Server
Building the DTDs
The Job and the DataFlow
Caching and RealTime
How to test Realtime Jobs
Setting up the Service
Sending Messages to the
Service
Clienttest Utility
Connecting to a WebServer
Setting up as WebService
The Client Flow
Setting up the Webservice-
Adapter (pre DI 12.1)
Creating a datastore of type
WebService (DI 12.1 and higher)
Calling Webservices
Guaranteed Delivery
Recovery
Scheduling
Using other schedulers
Building Job Chains
Event based scheduling
Scheduling via WebServices
Using the SAP scheduler
Sharing Caches
SQL Server 2008 CDC
SQL Server 2008 CDC - Transactional
DataFlow
SQL Server 2008 CDC - Delta
DataFlow
SQL Server 2008 CDC - Impact on
source System
SQL Server DeadLocks
SQL Server Identity column
Staging tables automated
Cases w here an additional
Data_Transfer is added automatically
Performance of DataServices
Caching in DI
DI Caching Example
Pageable Cache and DSConfig
Pageable Cache and Sort Order
Caching and Execute as separate
Process
DataServices Performance example
DataServices Performance example -
ETL Speed
DataServices Performance
example - Details
DataServices Performance
example - DWH tasks missing
DataServices Performance
example - Why does it not scale?
Performance of the Customer
Dimension example
Customer SCD2 Initial Load
Customer SCD2 Delta Load
Customer SCD2 Initial Load with SQL
Customer SCD2 Delta Load with SQL
Performance of the Material Dimension example
Material Dimension Initial Load
Material Dimension Delta Load
Material Dimension Initial Load with SQL
Material Dimension Delta Load with SQL
Performance of the Order Fact example
Fact Initial Load
Fact Delta Load
Fact Initial Load with SQL
Fact Delta Load with SQL
Performance when DataServices is on a separate server
Customer SCD2 Initial Load -
DataServices separate
Fact Initial Load -
DataServices separate
Data Quality Performance example -
Address Cleanse and Geocoding
Data Quality Performance example
- Address Cleanse Details
Data Quality Performance example -
Match and Data Cleanse
How to use the DataServices sizing
dashboard
Degree of Parallelism
DoP and Partitions
High Performance Loads with Oracle
Inserts vs. Updates
Speed up Updates
Speed up Inserts
Speed up Inserts Part 2
Loads and Indexes
Parallel processing
Putting it all together
How to lookup a row
Database Join
DI Join
lookup_ext
I want to lookup in a selected dataset, not just a table
lookup_ext and constants
lookup, lookup_ext, lookup_seq - what is the difference??
multiple lookups
sql function
lookup_ext() inside a custom function
Calling a stored procedure
Dynamic Lookups
Joining tables in the engine
Monitor Sample Rate
Myths about ELT tools
Insert...select is fastest
If there is no database link
Database is faster for joining data
PL-SQL scripts are faster than any
ETL tool
Database Links
Implementing lookups
Nested SQL
SQL for a Slow Changing
Dimension
Having two target tables
Performance characteristics at
customers
Installation steps for the Benchmark
Installation for Oracle
Installation for SQL Server
Installation for others
Monitoring
Results
Results Version 1.0
Results with different DI Versions
Test Details
DF_Benchmark_read
DF_Benchmark_API_bulkloader
DF_Benchmark_regular_load
DF_Benchmark_single_thread
DF_Benchmark_lookup_DOP1
DF_Benchmark_lookup_DOP10
Performance of Functions
Performance of nesting and unnesting
Performance of Reader, Engine and
Loader
Source-Query-Target
Without any options
With Bulkloader turned on
Ignoring Reader and Loader
Ignoring the Loader
With API Bulkloader turned on
Performance of Self joins
Performance of Transforms
Generation Transforms
Date Generation Transform
Row Generation Transform
Streamline Transforms
Case Transform
History Preserving
Key Generation
Map CDC Operation
Map Operation
Merge Transform
Pivot
Query (simple)
Validation
Streamline Transforms with (SQL) overhead
SQL Transform
Table Comparison (row by row
setting)
Table Comparison (sorted input)
Cached Transforms
Hierarchy Flattening
Query with distinct
Query with group by
Query with order by
Table Comparison (cache mode)
Other Transforms
Effective Date
Reverse Pivot
Multiple Transforms working in conjunction
Loading a table with surrogate key
Slow Changing Dimension Type 2
Slow Changing Dimensions and
Deletes
Data Quality Transforms
CountryID - all in one line
CountryID - City centric
CountryID - Country centric
CountryID - Multilines
Data Cleanse - Name Parsing
Global Address Cleanse - EMEA
Engine
Global Address Cleanse - Global
Engine
Global Address Cleanse - US
Engine
Global Suggestion - Lookup City
Match Consolidate Transform
Match Consolidate - Household
Data
User Defined Transform - Python
Pushdown not working
Row creation time
The Impact of Number of Loaders
Impact of Number of Loaders (Oracle)
Impact of Number of Loaders (SQL
Server)
The impact of the CommitSize
The impact of the CommitSize (Oracle)
The impact of the CommitSize (SQL
Server)
What is better Table Comparison or
AutoCorrect Load?
Autocorrect Load Pushdown Example
SAP Topics
Overview SAP Interfaces
Direct SQL
ABAPs
RFC ReadTable
Extractors
RFC-BAPI
IDOCs
Connecting to SAP
Choosing the Transport Method
direct_download transport method
ftp transport method
shared_directory transport method
custom_transfer transport method
Reading via ABAP
How to read the ABAP
How to execute the ABAP
Moving to Production
Moving ABAP to Production (DI 12.1)
Common Questions
Custom ABAP Transform
Calling functions inside the ABAP
Reading R3 Hierarchies
Reading via RFC Read Table
Using Extractors as Source (Data
Services 4.0)
Releasing Extractors for use by the
ODP API
Importing Extractors into the Datastore
What Extractors to use
Identify the type of Extractor
Building Dataflows with Extractors
General considerations about Extractor based delta dataflows
The Extractor date-time field
When does the Extractor start collecting the delta?
Extractor RecordMode
Dataflows for each Extractor Delta Process Type
Dataflow for Extractor Delta
Process Type A
Dataflow for Extractor Delta
Process Type ABR
Dataflow for Extractor Delta
Process Type ABR1
Dataflow for Extractor Delta
Process Type ADD
Dataflow for Extractor Delta
Process Type ADDD
Dataflow for Extractor Delta
Process Type AIE
Dataflow for Extractor Delta
Process Type AIED
Dataflow for Extractor Delta
Process Type AIM
Dataflow for Extractor Delta
Process Type AIMD
Dataflow for Extractor Delta
Process Type CUBE
Dataflow for Extractor Delta
Process Type FULL
Dataflow for Extractor Delta
Process Type NEWD
Dataflow for Extractor Delta
Process Type NEWE
The Extractor does not contain all the
data needed
No ODP API and Extractors are shown still?? How is this possible?
Debugging DataServices issues with Extractors (SAP internal only)
Extractors with DataServices - Monitoring
Administration of Extractors (Data
Services 4.0)
Calling RFCs-BAPIs
Reading SAP BW
Configuring SAP BW Open Hub
Destination
Configuring SAP BW Open Hub
ProcessChain
Reading from an Open Hub Destination
Openhub Common Questions
Loading BW
Setup BW - DataServices
communication
Run the DataServices 3.2 RFC Server
Run the DataServices RFC Server
Prepare a BW InfoSource for Loading
via DataServices
Build the BW Load Job
Configure the Load Job in BW (DS 3.2)
Configure the Load Job in BW
BW Load Job and datatypes
Loading BW 7.x DataSources
Receiving IDOCs
Configure SAP to send IDOCs to DI
Building the RealTime DataFlow
Configure WebAdmin for IDOCs
Testing the IDOCs
Sending IDOCs
Function Example READ_TEXT
Function Example READ_TEXT ABAP wrapper function
Function Example READ_TEXT RFC
enabled
Data Quality
Continuous Monitoring
Data Assessment
Data Cleansing
Address Reference Data
Installing the Address Dictionaries
CountryID Transform
Data Cleanse Transform
Data Cleanse Transform (Data
Services 3.x)
Global Address Cleanse Transform
Installing and using the EMEA
Engine (Data Services 3.x)
Installing and using the US Engine
Chinese Pinyin Fuzzy Search
feature in Global Address Cleansing
Name, Title & Firm Cleansing Packages
Real-Time Address Validation
GAC Suggestion Lists (Data
Services 4.x or higher version)
Global Suggestion Lists Transform
(Data Services 3.x)
US Regulatory Address Cleanse
Transform
DSF2 Walk Sequencer
Enhance
Geocoder Reference Data
Geocoder Transform
Directory
Data Services 4.x Geocoder Transform work with US Tomtom Directories
US 2010 CENSUS DATA
Upgrade in SAP GEO Directories
US GEO Directories Download and Setup
Geocoder Labs
Add Location Awareness
Perform Address Geocoding
Perform Geo Spatial Search
Geocoder Options
Input fields
Output fields
POI and address geocoding
Geocoding scenario1
Geocoding scenario2
POI and address reverse
geocoding
Reverse geocoding scenario1
Reverse geocoding scenario2
POI Types
Understanding your output
What's new - Data Services 4.1
Geocoder transform features
Match and Consolidate
Associate Transform
Consumer Householding Match
Strategy
Corporate Householding Match
Strategy
Match Transform
Multinational Consumer Match Strategy
FIM
HANA
Prerequisites
SAP Data Services - SAP Business Suite
Data Extraction Options
ABAP Application Layer
Data Sources (Extractors)
Extractors - Full Refresh
Extractors - Source-Based CDC
Extractors - Target-Based CDC
Tables
Insert Only
Insert Only - Full Refresh
Insert Only - Target-Based
CDC
Insert Only - Timestamp-
Based CDC
Updateable
Updateable - Full Refresh
Updateable - Target-Based
CDC
Updateable - Timestamp-
Based CDC
Direct RDBMS Connection
RDBMS - Full Refresh
RDBMS - Source-Based Change
Data Capture (CDC)
Native RDBMS CDC
Timestamp-Based CDC
RDBMS - Target-Based CDC
SAP HANA
Bulk Loading
Table Creation
Before Data Services Job
Execution
Import
New Table Editor
SQL Statement
During Data Services Job Execution
Metadata
Text Data Processing
Configuring Extraction Options
How to Customize Rules
Mapping Input and Output Fields
Recommendations on Best Practices
Enterprise Information Management Use Case
Wiki
Data Integration and Data Quality Use Case
Data Migration Use Case
Data Services (DI, DQM) and Information Steward Webinar Series
EIM Use Case Submission guidelines
Enterprise Content Management Use Case
Event Processing Use Case
Information Governance Use Case
Information Lifecycle Management
Master Data Management Use Case
Data Quality Management SDK
Documentation Information and Downloads
DQM SDK Code Samples
DQM SDK Sample Transform Configurations
Using Customized Cleansing Packages with DQM SDK
Child Pages (10)
Copy of RapidMarts
Data Quality Management for Enterprise Apps
Data Quality Management SDK
Data Services
Data Services Wiki - Template
EIM Staging Area
Enterprise Information Management Use Case Wiki
Information Steward
Postalsoft
RapidMarts