SAS Macros are the Cure for Quality Control Pains Gary McQuown Data and Analytic Solutions.
SAS Macros are the Cure for Quality Control Pains
Gary McQuown
Data and Analytic Solutions
Rants and Raves of a SAS Programmer
Purpose
I. Quality Control
II. SAS Macros for Quality Control
III. Sources of SAS Macros and QC Code
I. Quality Control
An ongoing effort for validation, improvement and facilitation of data-related processes to ensure that data meets the business needs.
Quality Control
“Quality control means you can have what you need, how you need it, when you need it.” W. Edwards Deming
Why Practice QC?
It Saves Time
It Saves Money
It Makes Money
Ignorance is not Bliss
How Data Goes Bad
“Bad Genes” ... Poor Design and Collection
“Adoption” ... Someone Else’s Design
“Child Abuse” ... Poorly Nurtured
“Terrible Teens” ... Growing Pains
The QC Process
1. Define Requirements
2. Identify Data Issues
3. Analyze Options
4. Improve Data Quality
• Document every step and repeat
Define Requirements
What do you need?
Requires an understanding of the business process, the data, the operating system and the users.
Documentation, business specs and “experts”.
Devil’s Advocate
What is correct for one task / group may be incorrect for another.
What is correct now may be incorrect later.
What is correct now ... may not be repeatable.
Identify Data Issues
Accuracy
Completeness
Consistency
Timeliness
Uniqueness
Validity
EXCEPTION AND PERCENTAGE REPORT (EAP)

| Variable | Present | Conforming | Scrub? | Accurate | Complete | Consistency | Timeliness | Uniqueness | Validity |
|---|---|---|---|---|---|---|---|---|---|
| NAME_LAST | 99% | 88% | 11% | G | G | G | G | G | G |
| NAME_FIRST | 86% | 78% | 8% | G | G | F | G | F | F |
| GENDER | 63% | 59% | 4% | G | G | G | G | G | G |
| TELEPHONE | 100% | 6% | 94% | B | B | G | B | B | B |
| AGE | 57% | 55% | 2% | F | G | G | F | G | F |
| SPECIALTY | 76% | 72% | 5% | G | F | G | G | G | G |
| EDUCATION | 100% | 100% | 0% | G | G | G | G | B | B |

G = Good, F = Fair, B = Bad
Analyze Options
What do you need?
What do you have?
What changes need to be made?
Will you break anything along the way?
Improve Data Quality
Selective Processing
Clean Existing Values
Correct Existing Values
Delete “bad” data
Add additional data
• Document original and new values.
Documentation
Design Process ... business specs
“As You Go” ... in the code, log, email
Input and Output files (Freqs & Means)
Modifications ... “as per xxx”, email
Exceptions (Errors and Issues)
User’s Manual
Elizabeth Axelrod ... Big ‘D’
“Just Shoot Them”
General Suggestions
“Drive Out Fear”
Early Intervention
Obtain “Buy In” from all parties
Keep it “Simple” ... use macros
Be consistent ... use macros
Monitor results
Document everything, every time
II. SAS Macros
Macros allow you to use, re-use and share “object-oriented” code.
QC is very repetitive: the same or similar process is performed on each data set, each variable and each process.
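As a minimal sketch of that repetition wrapped in a macro, the following runs the same frequency check on each variable in a list. The macro name `for_each_var` and its parameters are hypothetical, and `sashelp.class` is used only because it ships with SAS.

```sas
/* Hypothetical helper: run the same QC step once per variable in a list. */
%macro for_each_var(data=, vars=);
  %local i var;
  %let i = 1;
  %let var = %scan(&vars, &i, %str( ));

  %do %while (%length(&var) > 0);
    title "QC frequencies for &var in &data";
    proc freq data=&data;
      tables &var / missing;   /* MISSING counts missing values as a level */
    run;

    %let i = %eval(&i + 1);
    %let var = %scan(&vars, &i, %str( ));
  %end;

  title;   /* clear the title when done */
%mend for_each_var;

%for_each_var(data=sashelp.class, vars=sex age)
```

Changing the data set or the variable list is then a one-line change to the call, which is the point of the slide.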
Reality
People are:
Ignorant
Forgetful
Busy
Lazy
Don’t Care
Why Macros
Minimal Effort
Parameters
Available (FREE)
FREQOUT
Produces frequencies for multiple variables
%FREQOUT(
data= /* input data set name */,
out=freqout /* output data set name */,
vars= /* list of variables */,
by= /* list of BY variables */,
fmtassign= /* var fmt var fmt */,
debugging=NO /* YES or NO */
)
Author: Ian Whitlock
Location: www.lexjansen.com and sconsig.com
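The idea of collecting frequencies for several variables in one output data set can also be sketched with built-in SAS features. This is not Ian Whitlock's implementation; it assumes the ODS table `OneWayFreqs` produced by PROC FREQ and uses `sashelp.class` as stand-in data.

```sas
/* Capture the one-way frequency tables as a data set. */
ods output OneWayFreqs=work.freqout;

proc freq data=sashelp.class;
  tables sex age / missing;   /* one table per listed variable */
run;

/* work.freqout holds one row per variable value, with columns such as
   Table, Frequency and Percent. */
proc print data=work.freqout;
run;
```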
EAP_RPT
%EAP_RPT (DSN=, LIBIN=, LIBOUT=, _VARS=, _FMTS=);
DSN = Name of input SAS data set
LIBIN = SAS library of input data set
LIBOUT = SAS library of output data set
_VARS = list of character variables to review, paired with _FMTS
_FMTS = list of formats to apply, paired with _VARS
Example:
%EAP_RPT(_VARS = AGE INCOME EDUCATION,
_FMTS = AGE INC EDU,
LIBIN = PROJ_IN,
LIBOUT = PROJ_OUT,
DSN = STUDY_1);
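The source of %EAP_RPT is not shown, so the following is only a sketch of how a format paired with a variable could yield percentages like those in the EAP report. It assumes that "Present" means non-missing, "Conforming" means the value maps to an accepted format category, and "Scrub" means present but not conforming. The format `$sexchk` is hypothetical.

```sas
/* Hypothetical format: accepted values map to OK, everything else to SCRUB. */
proc format;
  value $sexchk 'M', 'F' = 'OK'
                other    = 'SCRUB';
run;

proc sql;
  /* A true/false expression evaluates to 1/0, so its mean is a proportion. */
  select 100 * mean(not missing(sex))               as pct_present    format=5.1,
         100 * mean(put(sex, $sexchk.) = 'OK')      as pct_conforming format=5.1,
         100 * mean(not missing(sex)
                    and put(sex, $sexchk.) ne 'OK') as pct_scrub      format=5.1
  from sashelp.class;
quit;
```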
DATA CLEANING
TIP00128a - Cleansing Macro, Data Scrubbing routine (see tip 00128 for more)
%cleanse(schlib=work, schema=, strlen=50, var=, target=target, replace=replace, case=nocase);
Author: Charles Patridge
Version: 2.1 (sug. by Ian Whitlock)
Location: www.sconsig.com
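A minimal sketch of the kind of scrubbing such a macro automates, written as a plain DATA step. It does not reproduce %cleanse; the data set, the variable names and the replacement rule are all made up for illustration.

```sas
/* Hypothetical raw values with inconsistent case, spacing and wording. */
data work.raw;
  infile datalines truncover;
  input company $char50.;
  datalines;
acme   incorporated
  Acme Inc.
ACME INC
;
run;

data work.clean;
  set work.raw;
  length company_clean $50;
  /* The original column is kept so both old and new values are documented. */
  company_clean = compbl(strip(upcase(company)));                 /* case and spacing */
  company_clean = tranwrd(company_clean, 'INCORPORATED', 'INC');  /* target -> replace */
  company_clean = compress(company_clean, '.');                   /* drop periods */
run;
```

All three rows end up with the same cleaned value, ACME INC, while the original text stays available for review.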
REMOVE OUTLIERS
%outlier (data = _SAS_dataset_name_,
out = _SAS_output_dataset_name_,
var = _variable_to_screen_,
pass = _number_of_passes_,
except = _exception_report_data_set_,
mult = _multiplier_of_standard_deviations_)
The %OUTLIER macro completes outlier screens based on statistical values of a numeric variable in a SAS data set. It is set up to remove any records that lie more than a given number of standard deviations from the mean, and will run that screen a given number of times. For example, a "3-Pass-2" outlier screen will remove any values outside 3 standard deviations from the mean, and will run that outlier screen twice. The given numbers can be any integer.
Author: Unknown
Location: www.spikeware.com
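The multi-pass screen described above can be sketched as follows. This is not the %OUTLIER source; the macro name `trim_outliers` is hypothetical, the exception data set is left out for brevity, and the sketch assumes the input has no variables named `_mean` or `_std`.

```sas
%macro trim_outliers(data=, out=, var=, pass=1, mult=3);
  %local i;

  /* Work on a copy so the input data set is never modified. */
  data &out;
    set &data;
  run;

  %do i = 1 %to &pass;
    /* Recompute the mean and standard deviation on the surviving rows. */
    proc means data=&out noprint;
      var &var;
      output out=work._stats(keep=_mean _std) mean=_mean std=_std;
    run;

    data &out;
      if _n_ = 1 then set work._stats;
      set &out;
      /* Keep missing values and values within mult standard deviations. */
      if missing(&var) or abs(&var - _mean) <= &mult * _std;
      drop _mean _std;
    run;
  %end;
%mend trim_outliers;

/* A "3-Pass-2" screen: 3 standard deviations, run twice. */
%trim_outliers(data=sashelp.heart, out=work.heart_trim, var=cholesterol, pass=2, mult=3)
```

Because the statistics are recomputed on each pass, the second pass can remove values that looked acceptable before the most extreme records were dropped.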
CONT_COMPARE
Compares two data sets, lists all variables and reports potential issues:
1) Fields in Both
2) Type
3) Length
%cont_compare (dsn1, dsn2)
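A comparison of this kind can be sketched with PROC CONTENTS and a full join. The code below is an assumption about the approach, not the %cont_compare source; `compare_contents` and the work data set names are hypothetical.

```sas
%macro compare_contents(dsn1, dsn2);
  /* TYPE in the PROC CONTENTS output is 1 for numeric, 2 for character. */
  proc contents data=&dsn1 noprint out=work._c1(keep=name type length);
  run;
  proc contents data=&dsn2 noprint out=work._c2(keep=name type length);
  run;

  proc sql;
    create table work.cont_compare as
    select coalesce(a.name, b.name) as name,
           a.type   as type1,   b.type   as type2,
           a.length as length1, b.length as length2,
           case
             when a.name is null       then "Only in &dsn2"
             when b.name is null       then "Only in &dsn1"
             when a.type ne b.type     then 'Type differs'
             when a.length ne b.length then 'Length differs'
             else 'OK'
           end as issue length=60
    from work._c1 as a
    full join work._c2 as b
      on upcase(a.name) = upcase(b.name)
    order by 1;
  quit;
%mend compare_contents;

%compare_contents(sashelp.class, sashelp.classfit)
```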
KEEPDBLS: Documents Duplicates
TIP000367 - KeepDbls
%MACRO KeepDbls (SourceDs=_LAST_, TargetDs=, Overwrit=N, IdList=, Where=);
Moves duplicate observations to another file.
Author: Jim Groeneveld
Location: www.sconsig.com
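For comparison, PROC SORT can route duplicates to a separate data set on its own. This sketch is not the KeepDbls macro: it keeps the first observation per key and writes only the later ones to the DUPOUT= data set. The sample data is made up.

```sas
/* Hypothetical data: id 2 appears twice. */
data work.visits;
  input id visit $;
  datalines;
1 A
2 A
2 B
3 A
;
run;

/* NODUPKEY keeps the first row per BY value; DUPOUT= receives the rest. */
proc sort data=work.visits out=work.visits_unique
          nodupkey dupout=work.visits_dups;
  by id;
run;
```

Keeping the removed rows in `work.visits_dups` documents what was dropped instead of silently discarding it.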
CK_MISSING
Evaluates variables with regard to missing and non-missing status.
Default = _numeric_ missing. _character_ $missing.
Parms:
DSN = libname and name of data set. Default is the last read/created.
PATH = path to directory where QC info is stored.
VAR = list of variables to be evaluated.
FMT = format statement.
%ck_missing( dsn=mylib.recentfile,
var=UPB FICO1 FICO2 FICO3 CHANNEL,
fmt=UPB upb. FICO1 FICO2 FICO3 fico. CHANNEL $chnl. );
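The default behavior described above, one format for all numeric variables and one for all character variables, can be sketched with PROC FORMAT and PROC FREQ. The format names `missfmt` and `$missfmt` are hypothetical stand-ins for the macro's own formats, and special missing values such as .A are not handled here.

```sas
/* Collapse every value into one of two buckets. */
proc format;
  value  missfmt  .  = 'Missing'  other = 'Present';
  value $missfmt ' ' = 'Missing'  other = 'Present';
run;

/* One two-row table per variable: how many values are missing or present. */
proc freq data=sashelp.heart;
  tables _all_ / missing;
  format _numeric_ missfmt. _character_ $missfmt.;
run;
```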
LOG FILTER: Examines and Reports on SAS Log
Log Filter checks your log for errors, warnings, and other "interesting" messages. It then displays what it finds in its summary window. Double-click on a row and it'll reposition the log window to display the message in context (if it's an external log file, it'll open it in a viewer window and position it for you).
Author: Ratcliffe
Location: http://ratcliffe.co.uk/rest_logfilt.htm
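Log Filter itself is an interactive tool, but the core check can be sketched in batch with a DATA step that reads a saved log. The path in `logfile` is a placeholder, and the list of "interesting" notes is a small, assumed sample.

```sas
%let logfile = /path/to/job.log;   /* hypothetical path to a saved SAS log */

data work.log_issues;
  infile "&logfile" truncover lrecl=32767;
  input line $char256.;
  line_no = _n_;
  length kind $8;
  /* =: compares only the leading characters of the line. */
  if line =: 'ERROR' then kind = 'ERROR';
  else if line =: 'WARNING' then kind = 'WARNING';
  else if index(line, 'uninitialized') > 0
       or index(line, 'repeats of BY values') > 0 then kind = 'NOTE';
  if kind ne ' ';   /* keep only the flagged lines */
run;

proc print data=work.log_issues noobs;
  var line_no kind line;
run;
```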
MK_FORMATS
Create a format from a SAS data set.
Parms:
DSN = SAS data set
START = Unique key value, e.g. SSN
LABEL = Value to be associated with START, e.g. Full Name with SSN
FMTNAME =Name of Format (sans ".")
TYPE = C or N for Character or Numeric
LIBRARY = Libname of Format Library (default =work)
OTHER = Value to supply for missing (default =OTHER)
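Building a format from a data set is usually done through the CNTLIN= option of PROC FORMAT, and the parameters above map onto its control variables. The following is a sketch of that technique, not the MK_FORMATS source; the format name `name2sex` and the use of `sashelp.class` are for illustration only.

```sas
/* Build a control data set: one row per START value plus an OTHER row. */
data work.cntlin;
  length fmtname $32 type $1 start $8 label $40 hlo $1;
  retain fmtname 'NAME2SEX' type 'C';   /* character format $name2sex. */
  set sashelp.class(keep=name sex) end=last;
  start = name;    /* unique key value */
  label = sex;     /* value associated with the key */
  hlo   = ' ';
  output;
  if last then do;
    start = ' ';
    label = 'OTHER';
    hlo   = 'O';   /* O marks the catch-all OTHER range */
    output;
  end;
  keep fmtname type start label hlo;
run;

proc format cntlin=work.cntlin library=work;
run;

/* Usage: look up a value through the new format. */
data _null_;
  sex = put('Alfred', $name2sex.);
  put sex=;
run;
```

START values must be unique, otherwise PROC FORMAT rejects the control data set because the ranges overlap.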
III. Sources of SAS Macros and QC Code
www.sas.com (examples)
www.lexjansen.com (proceedings)
www.sconsig.com
www.ratcliffe.co.uk
www.statetechservices.com
www.spikeware.com
More Sources
www.mcw.edu/pcor/rsparapa/sasmacro.html
www.math.yorku.ca/scs/friendly.html
www.stat.ncsu.edu/sas/samples/index.html
www.dasconsultants.com
SAS-L
Books By Users:
Ron Cody’s Data Cleaning
Numerous books on Macros ... “By Example”