Data Validation Error Handling
Transcript of Data Validation Error Handling
Data validation and Error handling
SAP NetWeaver Regional Implementation Group, Business Intelligence
SAP AG
© SAP AG 2004, Efficient usage of BW InfoProviders / 2
Contents
1 Motivation
2 Data validation with BW 3.x
3 Error handling with BW 3.x
4 Data repair
Why Data Validation?
BW data are expected to be of high quality
BW data needs to be complete
BW data needs to be correct
BW data needs to be up to date
BW requires high data accuracy for effective decision support
On top management level
To support operational processes
BW is used as an EDW (enterprise data warehouse), that is, as the central data store for consolidation and distribution of enterprise-wide data
BW data often serve as foundation for further processing
BW data are highly integrated
BW data are queried frequently
ROI of Data Quality - Data Quality as an Investment
You should ask…
What is the risk of incomplete / incorrect data sets?
What is the cost to fix data once it is contaminated?
What are the corporate quality standards?
In which time frame does incorrect data need to be repaired?
Availability of correct data in BW might be seen as being as critical as the availability of the operational system
However, you should also ask…
What is the reliability of the source data?
Where is the point of diminishing returns?
Can different data models / data validation procedures be set up in order to respond to different needs concerning availability and quality of data?
Sources for Dirty Data
Data are incorrect in source system
Data consolidation causes issues
Technical platforms are different (code pages, etc.)
Administration issues (double loadings,…)
Custom logic (errors in routines,…)
Technology issues (SW, DB, O/S, HW, …)
…
Data Contaminants - 1
012-3344  Cup Holder, green  US
012-3378  Cup Holder, red  US
012-4122  Lighter, black  US
012-552  white cover  US
012-7662  green Cup Holder  US
012-401  Cup Holder, green  JP
012-4122  phone plug  JP
012-661  channel  JP
013-1452  plastic cover, red  JP
013-1452  (pink version of above)  JP
red wheel, type "014-2221"  CA
blue wheel, type "012-3342"  CA
023-2211  white wheel  CA
multiple keys
inconsistent keys
invalid characters
surprises
free form fields
Data Contaminants - 2
XYZ.com Ltd.  10/10/2000  $ 67221
XYZ.com Ltd.  10/10/2000  $ 67221
XYZ.com Ltd.  10/10/2000  $ 67221
XYZ.com Ltd.  10/12/2000  $ 35332
XYZ.com Ltd.  10/14/2000  $ 31122
XYZ.com Ltd.  10/17/2000  $ 99999999
XYZ.com Ltd.  10/19/2000  $ 78882
XYZ.com Ltd.  10/10/99  $ 44332
XYZ.com Ltd.  10/12/99  $ 33222
ABC Co.  10/14/2000  $ 4333
LMN Ltd.  10/14/2000  $ 9000
XYZ.com Ltd.  10/14/2000  $ 31122
ZZZ Sl.  10/14/2000  $ 122211
data redundancy
data anomalies
data format
data redundancy
Data Contaminants - 3
Data contamination during upload via exits
Application Exits
Generic BW Exit RSAP0001
Transfer- / Update-Routines
Virtual Exits
Consider the following:
Timeliness of Data
Check for Versions
Check for Return Codes
Delta Trigger Capabilities
Performance and General Architecture
Contents
1 Motivation
2 Data validation with BW 3.x
3 Error handling with BW 3.x
4 Data repair techniques
Data validation
Data validation answers the questions:
Check what?
Technical quality
Semantic quality (business rules)
Completeness
Check where?
Automatically during data load
Rule driven
Check how?
Built-in
Routine
Formula (planned)
Where: Checks in BW 3.x
[Diagram: checks in the staging data flow. Labels: Data, InfoSource, Transfer Rules, Update Rules, and the data targets InfoCube, ODS Object, Master Data, Texts, Hierarchies]
What: possible checks
Check type (technical or business) by degree of detail:
Single field: empty field, correct data type, code page, master data check (SID)
Single record: free delivery: Revenue = 0; Supplier <> Receiver
Multi record: check for double records; records sent to BW = records updated; sum of single revenues < 200
Multi table: referential integrity (foreign key check)
Legend: black = built in, grey = not built in
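The single-record and multi-record business checks named on this slide can be illustrated with a short sketch. It is written in Python purely for illustration (in BW such checks would live in transfer or update routines); the field names and the functions check_record and find_double_records are hypothetical.

```python
from collections import Counter

def check_record(rec):
    """Single-record checks: return a list of rule violations for one record."""
    errors = []
    # Business rule from the slide: a free delivery must carry revenue = 0.
    if rec["free_delivery"] and rec["revenue"] != 0:
        errors.append("free delivery must have revenue = 0")
    # Business rule from the slide: Supplier <> Receiver.
    if rec["supplier"] == rec["receiver"]:
        errors.append("supplier must differ from receiver")
    return errors

def find_double_records(records, key_fields):
    """Multi-record check: return the keys that occur more than once."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    return [key for key, n in counts.items() if n > 1]

# Hypothetical sample data.
records = [
    {"doc": "A1", "supplier": "S1", "receiver": "R1", "free_delivery": True, "revenue": 0},
    {"doc": "A2", "supplier": "S2", "receiver": "S2", "free_delivery": False, "revenue": 50},
    {"doc": "A2", "supplier": "S2", "receiver": "R9", "free_delivery": True, "revenue": 10},
]

for rec in records:
    print(rec["doc"], check_record(rec))
print("double keys:", find_double_records(records, ["doc"]))
```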
What: Captured errors (1)
In a transfer rule:
Not allowed characteristic values
Lower case letters
Arithmetic and conversion errors
User built routine with returncode <> 0
No aggregation
Check for referential integrity on the InfoSource:
Against master data tables
Against ODS objects
In an update rule:
Arithmetic or conversion error
Master data read unsuccessful
Currency translation or time conversion error
User built routine with error message
No aggregation
What: Captured errors (2)
Checks during master data and text update:
Not allowed characteristic values
No SID for navigational attribute
No language in text upload
Double records concerning the key
Overlapping or invalid time intervals
Data does not match the scheduler selection
No aggregation
Checks during hierarchy update:
Errors in hierarchy structure
Overlapping time intervals
No aggregation
Errors during InfoCube update:
No SID for characteristic values
No aggregation
Check for Permitted Characters
Case A: characters not permitted
Case B: characters permitted
Permitted by standard:
!"%&'()*+,-/:;<=>?_0123456789
ABCDEFGHIJKLMNOPQRSTUVWXYZ
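A check against the permitted characters can be sketched as follows. This is an illustrative Python model, not BW code: the standard set is copied from the two lines above exactly as this transcript shows them, and the function invalid_characters and its extra_permitted parameter (standing in for additionally permitted characters, Case B) are hypothetical names.

```python
# Standard set as listed on the slide (copied from this transcript as-is).
STANDARD_PERMITTED = set(
    "!\"%&'()*+,-/:;<=>?_0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
)

def invalid_characters(value, extra_permitted=""):
    """Return the sorted list of characters in value that are not permitted."""
    allowed = STANDARD_PERMITTED | set(extra_permitted)
    return sorted({ch for ch in value if ch not in allowed})

# Case A: '#' and lower case letters are not permitted by the standard set.
print(invalid_characters("Cup#1"))
# Case B: '#' has been permitted additionally, and the value is upper case.
print(invalid_characters("CUP#1", extra_permitted="#"))
```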
Consistency Check for Characteristic Values
Checking for…
use of character values in the Data type NUMC fields
correct consideration of the conversion routine ALPHA
use of lower case letters
use of special characters
plausibility of date / time fields
Consider performance impacts !
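Three of the checks listed above (NUMC content, lower case letters, date plausibility) can be sketched in a few lines. This is a simplified Python illustration with hypothetical function names; it assumes dates arrive in the internal format YYYYMMDD and does not model the ALPHA or special-character checks.

```python
from datetime import datetime

def check_numc(value):
    """A NUMC field may contain digits only."""
    return value.isascii() and value.isdigit()

def has_lower_case(value):
    """True if the value contains at least one lower case letter."""
    return any(ch.islower() for ch in value)

def check_date(value):
    """Plausibility check for a date given as YYYYMMDD."""
    if len(value) != 8 or not check_numc(value):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True

print(check_numc("00123"), check_numc("12A3"))
print(has_lower_case("PC 9000"), has_lower_case("Pc 9000"))
print(check_date("20040229"), check_date("20040230"))
```

Each check costs time per field and per record, which is the performance impact the slide warns about.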
New in BW 3.x: Check of Referential Integrity
Communication structure (incoming records):
Bus. Partner  Material
1000  PC 9000
9000  PC 9000
ODS object defined in the InfoObject as check table
Enable check (optional)
Look up against InfoObject Bus. Partner (existing value: 1000), then error handling
Bus. Partner 9000 does not meet the referential integrity ('9000' not allowed) => the record is marked as erroneous
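The lookup on this slide can be sketched as a split of the incoming records into valid and erroneous ones. This is an illustrative Python model using the slide's values; the function name and the field names are hypothetical, and the check table is reduced to a plain set of permitted keys.

```python
def split_by_referential_integrity(records, field, check_table):
    """Split records into (valid, erroneous) by looking up field in check_table."""
    valid, erroneous = [], []
    for rec in records:
        if rec[field] in check_table:
            valid.append(rec)
        else:
            erroneous.append(rec)  # would be handed to error handling
    return valid, erroneous

# Values from the slide: only business partner 1000 exists in the check table.
check_table = {"1000"}
records = [
    {"bus_partner": "1000", "material": "PC 9000"},
    {"bus_partner": "9000", "material": "PC 9000"},
]

valid, erroneous = split_by_referential_integrity(records, "bus_partner", check_table)
print("valid:", valid)
print("erroneous:", erroneous)
```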
Check of Referential Integrity: Example
COSTC## Master Data
ODS-Object defined in InfoObject as check table
COSTC##_FLEX_MD
0COMP_CODE 9999
SUBSTRING ( TXTSH , ' 0' , ' 4' )
0TXTSH 9999-0000000001100
Check for existing master data not possible here
Communication Structure
Transfer Structure
ODS: Flexible Master Data Staging
Update Rules
MasterData ODS-Object
Master Data InfoSource
Business Partners
Master Data
Customers
Master Data
Vendors
Additional Master Data layer
Optional
Cleansing
Consolidation
Benefit: Flexibility
Difference to Master Data check in InfoPackage
Master Data Check:
All Data targets
Check after update rules for each data target
All InfoObjects
BW 3.0: Error handling (except ODS object)
BW 2.0 SP18: ODS object only if BEx-Reporting is active
Check only against SID-table
Referential Integrity:
All Data targets
One check in transfer rules
Only for selected InfoObjects
Error handling
Works for all ODS objects types
Check against MD-table or ODS object is possible
Check: No aggregation allowed
If you select this indicator, the request is regarded as incorrect if the number of records received in BW does not match the number of updated records.
That means the request is regarded as incorrect if records are sorted out, aggregated or newly created in any of the following:
transfer rules
update rules
update
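The rule behind this indicator is a plain record count comparison, sketched below in Python for illustration. The function name and the sample aggregation step are hypothetical; the sketch models only this one indicator, not any other reason a request may be flagged.

```python
def passes_no_aggregation_check(records_received, records_updated):
    """With the indicator set, the request is correct only if both counts match."""
    return records_received == records_updated

# Hypothetical load: three records arrive, an update rule aggregates by customer.
received = [("C1", 10), ("C1", 5), ("C2", 7)]
aggregated = {}
for customer, amount in received:
    aggregated[customer] = aggregated.get(customer, 0) + amount

print(passes_no_aggregation_check(len(received), len(aggregated)))
```

Here three records were received but only two were updated, so the request would be regarded as incorrect.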
Handling of double data records
Handling of double data records is available in the InfoPackage for time-independent master data and text DataSources
R/3 DataSources can deliver a flag indicating that they may transfer double data records
Handling of double data records (checked on the key fields of the characteristics) means that only the last data record is updated to the master data / text table
Checking for double data records is only possible if the update method is 'Only PSA – Update Subsequently in Data Targets'
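The "only the last data record is updated" behavior can be sketched as follows. This is an illustrative Python model with hypothetical names; it assumes the records are processed in the order in which they were delivered.

```python
def keep_last_per_key(records, key_fields):
    """For records sharing the same key, keep only the last one delivered."""
    last = {}
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        last[key] = rec  # a later record overwrites an earlier one with the same key
    return list(last.values())

# Hypothetical text records: material M1 is delivered twice.
records = [
    {"material": "M1", "text": "Cup Holder, green"},
    {"material": "M2", "text": "Lighter, black"},
    {"material": "M1", "text": "green Cup Holder"},
]

for rec in keep_last_per_key(records, ["material"]):
    print(rec)
```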
Customer-built checks
Implement checks in customer routines in the update or transfer rules
These checks can call the Error Handling
Early checking in customer routines can avoid time-consuming rollback and recovery of complex load scenarios!
Additional customer check scenarios might be:
Build business check procedure
Data Integrity Checks on Data Packages in PSA
Use custom check points during extraction
Check on master data completeness
Build Audit Dimension in Data Modeling
Build own business check
Data loaded to BW data targets is compared with source data using business rules like:
correct subtotals, correct +/-, etc.
Check can be undertaken using a MultiProvider query comparing source data in PSA or from the source system (possible usage of Virtual or Remote cubes) with data contained in BW data targets
Build an exception on column "Difference" <> 0
Proactive alerting of the administrator via Reporting Agent
Embed this check in BW 3.x process chains
Sales organization  Source Data  Data in InfoProvider  Difference
1000  120.000,00  120.000,00  0,00
2000  190.000,00  189.600,00  - 400,00
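The comparison shown in the table above can be sketched as follows. This Python sketch only illustrates the logic (in BW the comparison would be a MultiProvider query with an exception); the function names are hypothetical and the figures are the slide's values written with a decimal point.

```python
from decimal import Decimal

# Figures from the slide, keyed by sales organization.
source = {"1000": Decimal("120000.00"), "2000": Decimal("190000.00")}
loaded = {"1000": Decimal("120000.00"), "2000": Decimal("189600.00")}

def differences(source, loaded):
    """Difference = data in InfoProvider minus source data, per key."""
    keys = sorted(set(source) | set(loaded))
    zero = Decimal("0")
    return {k: loaded.get(k, zero) - source.get(k, zero) for k in keys}

def exceptions(diffs):
    """Exception on column Difference <> 0."""
    return {k: d for k, d in diffs.items() if d != 0}

diffs = differences(source, loaded)
for org, diff in diffs.items():
    print(org, diff)
print("alert administrator for:", sorted(exceptions(diffs)))
```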
Build own business check
[Diagram: within the Business Information Warehouse, a MultiCube sits on top of the Sales basis cube and a Remote Cube. Both are fed from the source system through transfer rules and an InfoSource; the diagram also shows PSA data and update rules. Steps shown: check and filter data; compare valid with loaded data. Variables shown: 0LSTRQID, 0MAPRQID]
*see: How to… Validate InfoCube Data by Comparing it With PSA Data
Data Integrity Checks on Data Packages in PSA
APIs are available to read PSA contents
Function RSAR_ODS_MAINTAIN,….
Check for reference between records
Summary checks, ….
Use Custom Check Points during extraction
Identify check points in source system
Write check point data to custom table
Use generic extractor for load
Populate check cube
Compress the check cube with zero suppression
Execute exception report
Check on master data completeness
Scenario: master data is loaded from different source systems
[Diagram: InfoObject 0Material with the attributes Material Type, Global Material, Packing Size and QM Status, filled from Source System 1, Source System 2 and Source System 3, which deliver different sets of attributes]
Check on master data completeness
[Diagram: master data from Source System 1, Source System 2 and Source System 3 (Material, Material Type, Global Material, Packing Size, QM Status) is passed through an export DataSource and the Myself Data Mart. A "check completeness" step marks each record as Complete or Incomplete]
Check on master data completeness
Material, Material Type, Global Material, Packing Size, QM Status
Records are marked as Complete or Incomplete
Consumer Report: "only complete information!"
Expert Report: "stored information!"
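The completeness scenario of the last three slides can be sketched as follows. This Python model is illustrative only: the attribute names follow the slides, while the functions merge_sources and completeness_status and the sample values are hypothetical, and which source system delivers which attribute is an assumption made for the example.

```python
REQUIRED = ("material_type", "global_material", "packing_size", "qm_status")

def merge_sources(*sources):
    """Combine the attributes that each source system delivers per material."""
    merged = {}
    for source in sources:
        for material, attrs in source.items():
            merged.setdefault(material, {}).update(attrs)
    return merged

def completeness_status(attrs, required=REQUIRED):
    """Mark a master data record as Complete or Incomplete."""
    filled = all(attrs.get(name) not in (None, "") for name in required)
    return "Complete" if filled else "Incomplete"

# Hypothetical deliveries from three source systems.
system_1 = {"M1": {"material_type": "FERT", "global_material": "G1", "packing_size": "10"},
            "M2": {"material_type": "ROH"}}
system_2 = {"M2": {"global_material": "G2"}}
system_3 = {"M1": {"qm_status": "released"}}

master = merge_sources(system_1, system_2, system_3)
status = {m: completeness_status(a) for m, a in master.items()}

expert_report = sorted(master)  # all stored information
consumer_report = sorted(m for m in master if status[m] == "Complete")
print(status, expert_report, consumer_report)
```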
Build Audit Dimensions in Data Modeling
Audit Dimensions can identify:
When were the data created?
Which source did the data come from?
Which tools were used for extraction?
Which rules touched the data?
…
Contents
1 Motivation
2 Data validation in BW 3.x
3 Error handling in BW 3.x
4 Data repair techniques
Data Cleansing
Where: In the source system? During data extraction? In the BW system?
• Data cleansing occurs at all levels.
• Avoid the tendency to attempt cleansing only within the BW extraction process.
• Often data cleansing is best performed at the legacy / source system level.
When: In the productive phase? In the test phase? In the blueprint phase?
• Data cleansing is one of the greatest risks in data movement efforts.
• Design belongs in the blueprint phase.
• Test data are often cleaner than real data.
Who: Is it a technical issue? Is it a project issue? Is it an organizational issue?
Often data quality and inconsistency issues are systemic in the organization and must be addressed at a higher level in the organization to get resolved.
Error handling in BW 3.x
Handling of invalid data answers the questions:
What to do in case of error?
Abort validation or continue loading
Book or don't book valid data
Report or don't report valid data
How to correct the invalid data?
Source System
PSA
During upload
How to re-book the corrected data?
Re-load from source system
Book from PSA to data target
Error handling features
Show error status of records in PSA table
In the scheduler you can choose to...
abort the process when errors occur
process the correct records but do not allow reporting on them
process the correct records and allow reporting on them
You can also choose the number of errors above which the whole request is regarded as incorrect
Write invalid records to a new request
Update the invalid records after correction
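The three scheduler options and the error threshold can be modeled as in the sketch below. This is a simplified Python illustration, not BW behavior in detail: the enum, the function process_request and the result keys are hypothetical, and the sketch assumes that an aborted request updates nothing and writes no error request.

```python
from enum import Enum

class ErrorHandling(Enum):
    NO_UPDATE_NO_REPORTING = 1
    UPDATE_VALID_NO_REPORTING = 2
    UPDATE_VALID_REPORTING_POSSIBLE = 3

def process_request(records, is_valid, mode, max_errors):
    """Apply the chosen error handling option to one request."""
    valid = [r for r in records if is_valid(r)]
    invalid = [r for r in records if not is_valid(r)]
    # Abort if errors occur and option 1 is chosen, or if there are too many errors.
    aborted = bool(invalid) and (
        mode is ErrorHandling.NO_UPDATE_NO_REPORTING or len(invalid) > max_errors
    )
    if aborted:
        return {"updated": [], "error_request": [], "reportable": False}
    # Valid records are updated; invalid ones go to a new (error) request.
    reportable = not invalid or mode is ErrorHandling.UPDATE_VALID_REPORTING_POSSIBLE
    return {"updated": valid, "error_request": invalid, "reportable": reportable}

amounts = [10, -5, 20]                      # hypothetical records
non_negative = lambda amount: amount >= 0   # hypothetical validity rule
for mode in ErrorHandling:
    print(mode.name, process_request(amounts, non_negative, mode, max_errors=5))
```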
Error handling
Error handling No Error handling
Restrictions on error handling capabilities in InfoPackages depend on…
Connected data targets (is data updated to an ODS object?) and
Update mode (is the data load a delta update?) and
Serialization (is serialization required?)
or
Transfer method (is transfer method IDoc used?)
Error handling: Overview
[Diagram: in the Business Information Warehouse, the Scheduler triggers the extract into the PSA; the Staging Engine passes on records that are OK and separates records in error]
Error handling options:
1 - No update, no reporting
2 - Valid records update, no reporting
3 - Valid records update, reporting possible
Correction of invalid data:
• within the source system
• manually in PSA
• by rule
Consider automation using Process Chains!
Error handling: Features
[Table: features available per error handling scenario, marked with X on the slide]
Columns: Monitor entry, Abort of update, Update valid records, Application log, Marked in PSA, Error request, Color of request (red or green)
Rows:
No error handling
Error handling, no PSA available (e.g. transfer via IDoc)
Error handling, PSA available: no update, no reporting
Error handling, PSA available: update valid records, no reporting
Error handling, PSA available: update valid records, reporting possible
Error handling in BW (2.0B)
[Diagram: staging data flow in BW 2.0B. Labels: Data, IDoc, InfoSource, Transfer Rules, Update Rules, and the data targets InfoCube, ODS object, Master Data, Texts, Hierarchies]
Error handling in BW (3.x)
[Diagram: staging data flow in BW 3.x. Labels: Data, PSA, IDoc, InfoSource, Transfer Rules, Update Rules, and the data targets InfoCube, ODS object, Master Data, Texts, Hierarchies]
Call Error handling from customer routines
Customer routines in the update rules or transfer rules can mark the record and call the Error Handling
From update rules append table MONITOR
From transfer rules append table G_T_ERRORLOG
process single record using field RECORD
If no Error Handling is needed, records or even the whole data package can be skipped:
RETURNCODE <> 0 means skipping the record
ABORT <> 0 means skipping the data package
With BW 3.0B the functions SKIP RECORD and ABORT PACKAGE exist in the Transformation library
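The meaning of RETURNCODE and ABORT described above can be illustrated with a small simulation. This is Python, not ABAP, and it only models the control flow: the function run_routine_on_package, the sample routine and its messages are hypothetical, and the monitor list merely stands in for the MONITOR / G_T_ERRORLOG tables.

```python
def run_routine_on_package(package, routine):
    """Simulate a customer routine being called once per record of a data package.

    routine(record) returns (returncode, abort, message).
    Returns (kept_records, monitor_entries, package_aborted).
    """
    monitor, kept = [], []
    for number, record in enumerate(package, start=1):
        returncode, abort, message = routine(record)
        if message:
            # Corresponds to appending an entry that refers to the single record.
            monitor.append({"record": number, "message": message})
        if abort != 0:
            return [], monitor, True   # ABORT <> 0: skip the whole data package
        if returncode != 0:
            continue                   # RETURNCODE <> 0: skip this record only
        kept.append(record)
    return kept, monitor, False

def sample_routine(record):
    """Hypothetical rule: a missing key aborts, a negative amount skips the record."""
    if record.get("key") is None:
        return 0, 4, "key missing"
    if record["amount"] < 0:
        return 4, 0, "negative amount"
    return 0, 0, ""

package = [{"key": "A", "amount": 10}, {"key": "B", "amount": -1}, {"key": "C", "amount": 3}]
print(run_routine_on_package(package, sample_routine))
```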
[Diagram: example comparing two routine designs between transfer structure and communication structure. Bad: lookup, then abort. Good: lookup + check + error handling]
Automate Error handling using a customer program
Automatic correction of the error request can be done in a customer program
To do so, use method GET_ERRORS of class CL_RSSM_ERROR_HANDLER
The program RS_ERRORLOG_EXAMPLE can be used as a template
Contents
1 Motivation
2 Data validation in BW 3.0
3 Error handling in BW 3.0
4 Data repair
BW Data and Metadata Test and Repair Environment
Transaction RSRV
Transaction RSRV checks the consistency of data stored in BW.
The transaction interface was re-designed for SAP BW Release 3.0
Conversion to consistent Internal Values
Converting Inconsistent Internal Characteristic Values
Find and correct incorrect internal characteristic values.
A characteristic value is inconsistent when the characteristic has a conversion routine and its value does not correspond to the internal format of that conversion routine.
The following conversion routines are covered:
ALPHA
NUMCV
GJAHR
Successful conversion of the system is required before upgrading to Release 3.x
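What "inconsistent internal value" means can be illustrated for ALPHA. The sketch below is a simplified Python model, not the SAP implementation: it assumes that ALPHA right-aligns purely numeric values and pads them with leading zeros to the field length, and leaves other values left-aligned. The function names are hypothetical; NUMCV and GJAHR are not modeled.

```python
def alpha_internal(value, length):
    """Simplified model of the internal format produced by ALPHA."""
    stripped = value.strip()
    if stripped.isascii() and stripped.isdigit():
        return stripped.rjust(length, "0")  # numeric: pad with leading zeros
    return stripped.ljust(length)           # other values: left-aligned

def is_consistent(stored_value, length):
    """A stored value is consistent if converting it again does not change it."""
    # Trailing blanks are ignored, as they carry no meaning in a character field.
    return stored_value.rstrip() == alpha_internal(stored_value, length).rstrip()

for stored in ["0000001234", "1234", "AB12", " AB12"]:
    print(repr(stored), is_consistent(stored, 10))
```

A value such as "1234" stored in a 10-character field is the kind of entry the conversion would have to correct to "0000001234".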
Conversion to consistent Internal Values
Once the conversion is complete, stricter checks / conversions are activated in the system to prevent new inconsistent values entering the system.
You can then run an Optional Conversion to correct internal values (according to conversion exit in the Transfer Rules)
Repair requests
With BW 3.x repair requests can be updated to an ODS object. This means:
A full update for correction purposes can be loaded into an ODS object that is usually updated using delta loads
In the InfoPackage menu the request is then marked as repair request
Before doing this, incorrect data can be deleted selectively from the ODS object
Possible approach in BW 2.0B / 2.1C:
Use generic DataSource based on PSA table to correct ODS object data via full uploads
Further Information
Public Web: www.sap.com > Solutions > SAP NetWeaver
SAP Service Marketplace:
http://service.sap.com/bw
Folder “Data Consistency”
SAP BW InfoIndex – Data Quality
Services & Implementation
HOW TO… Guides
Guide List SAP BW 2.x
“How to… Create monitor entries from an update routine”
Copyright 2004 SAP AG. All rights reserved
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice.
Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors.
Microsoft®, WINDOWS®, NT®, EXCEL®, Word®, PowerPoint® and SQL Server® are registered trademarks of Microsoft Corporation.
IBM®, DB2®, OS/2®, DB2/6000®, Parallel Sysplex®, MVS/ESA®, RS/6000®, AIX®, S/390®, AS/400®, OS/390®, and OS/400® are registered trademarks of IBM Corporation.
ORACLE® is a registered trademark of ORACLE Corporation.
INFORMIX®-OnLine for SAP and Informix® Dynamic ServerTM are registered trademarks of Informix Software Incorporated.
UNIX®, X/Open®, OSF/1®, and Motif® are registered trademarks of the Open Group.
Citrix®, the Citrix logo, ICA®, Program Neighborhood®, MetaFrame®, WinFrame®, VideoFrame®, MultiWin® and other Citrix product names referenced herein are trademarks of Citrix Systems, Inc.
HTML, DHTML, XML, XHTML are trademarks or registered trademarks of W3C®, World Wide Web Consortium, Massachusetts Institute of Technology.
JAVA® is a registered trademark of Sun Microsystems, Inc.
JAVASCRIPT® is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by Netscape.
SAP, SAP Logo, R/2, RIVA, R/3, SAP ArchiveLink, SAP Business Workflow, WebFlow, SAP EarlyWatch, BAPI, SAPPHIRE, Management Cockpit, mySAP.com Logo and mySAP.com are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other products mentioned are trademarks or registered trademarks of their respective companies.