Lecture 2
Lecture 2: Getting your Data into SAS By: Jamie Romeiser *Lectures are not to be redistributed without prior written consent
HPH 562 DATA MANAGEMENT & INFORMATICS
Sept 5, 2012
Organization Tips:
  Create a folder for each class meeting.
  Save all class documents in that folder:
    Lecture
    Lecture Code
    Lecture Databases
  Use this folder as your SAS library folder for that class.
Lecture Outline
  Two Parts of SAS Programming: Data, Proc
  Conceptual Model: The Data Step, How It Works
  Temporary Vs. Permanent Datasets
    Temporary:
      Method 1: Importing DBMS Files
      Method 2: Table Entry
      Method 3: Internal Raw Data (Code)
      Method 4: External Raw Data
    Permanent:
      Method 1: Libname
      Method 2: Point and Click
  Types of Variables in Brief: Character, Numeric
  "Set" statement
  "Proc Contents"
Lecture Code
*-----------------------------------*
* HPH 562                           *
* Class 2                           *
*-----------------------------------*;
*------------------------------------------------------------------------------*
* Temporary Datasets
*   Method 1: Importing data through the import wizard
*   Method 2: Input data via SAS Table
*   Method 3a: Internal Raw Data, List Input
*   Method 3b: Internal Raw Data, Column Input
*   Method 4: External Raw Data
* Permanent Databases
*   Method 1: Libname
*   Method 2: Point and Click
* Set Statement
* Proc Contents
*------------------------------------------------------------------------------*;
Two Parts of SAS Programming Code
Data Steps:
  Creating new datasets
  Reading and modifying data
Proc Steps:
  Print reports
  Perform utility functions
  Analyze data
*Example of Creating New Dataset;
data examp1;
input ID BMI Gender $ VitDDeficient;
datalines;
4687 31 F 0
7542 17 M 1
9637 18 F 1
;
run;

*Example of Modifying a Dataset;
*Gender is a character variable, so it is compared to a quoted value;
data examp2;
set examp1;
if Gender = 'F' or BMI <= 18.5 then Frailty = 1;
else Frailty = 0;
run;

*Example of Report;
proc print data=examp1;
run;

*Example of Utility Function;
proc freq data=examp1;
table Gender*VitDDeficient;
run;

*Example of Analyzing Data;
proc means data=examp1;
var BMI;
run;
SAS Conceptual Model
Raw data to be analyzed
  -> DATA step (Data statement; more SAS statements;)
  -> SAS dataset
  -> PROC step (SAS procedure statements;)
  -> Results of analysis
  -> Data analysis in a finished report
SAS Conceptual Model (Vitamin D Paper Example)
Raw data: NHANES III surveys
  -> DATA step (Data statements;)
       Bring data into SAS;
       Apply inclusion criteria;
       Create outcome "Frailty" variable;
       Create other predictor variables;
  -> SAS dataset
  -> PROC step (Proc statements;)
       Frequency tables;
       Odds ratios (i.e., logistic regression);
  -> Results of analysis
  -> Data analysis in a finished report
SAS Conceptual Model: Data Step
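1. The Data statement is applied to dataset A.
2. Is there data to read?
     No: Done. Modifications are now in dataset B.
     Yes: SAS reads the data and executes the statement.
3. Is there another Data statement?
     Yes: SAS executes that statement, then asks again.
     No: SAS writes the observation into dataset B and returns to step 2.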
Example Code:
Data DatasetB;
set DatasetA;
Frail1=0; if BMI<=18.5 then Frail1 = 1;
Frail2=0; if SLOWWALK=1 then Frail2 = 1;
FRAILTY=0; if Frail1=1 or Frail2=1 then FRAILTY = 1;
Run;
Data DatasetB;  *Create a new dataset called datasetB;
set DatasetA;   *From datasetA, i.e. use DatasetA as the base;
Statement 1;    *I want you to make the following changes to datasetA...;
Statement 2;
Statement 3;
Statement 4;
Statement 5;
Statement 6;
Run;            *Execute my statements up to here;
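The loop described above (read one observation, execute the statements, write the observation, repeat) can be watched in the log. The sketch below is illustrative only: the three rows in DatasetA are made-up values, not data from the lecture, and the PUT statement is added purely to show one log line per pass through the data step.

```sas
/* Hypothetical toy data, for illustration only */
data DatasetA;
    input ID BMI SLOWWALK;
    datalines;
1 17.9 0
2 24.3 1
3 22.0 0
;
run;

data DatasetB;
    set DatasetA;
    Frail1 = 0;  if BMI <= 18.5 then Frail1 = 1;
    Frail2 = 0;  if SLOWWALK = 1 then Frail2 = 1;
    FRAILTY = 0; if Frail1 = 1 or Frail2 = 1 then FRAILTY = 1;
    /* _N_ is the automatic counter of passes through the data step */
    put _N_= ID= FRAILTY=;
run;
```

The log should show one line for each observation in DatasetA, which is the "Is there data to read?" question being answered "Yes" three times before the step ends.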
SAS Conceptual Model (Vitamin D Paper Example): Where is the SAS dataset stored?
Temporary Vs. Permanent Datasets
Temporary Datasets:
  Stored in the Work folder within SAS Libraries.
  All files that SAS stores in the WORK library are deleted at the end of a session (i.e., temporary).
Creating a Temporary Dataset:
  Method 1: Importing DBMS Files. Use the Import Wizard to bring in datasets stored in Excel or Access format.
  Method 2: Table Entry. Enter data into a SAS table.
  Method 3: Internal Raw Data (Code). Type it in your code.
  Method 4: External Raw Data. Pull in data stored in .txt format, specifying the variable names yourself.
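A minimal sketch of what "temporary" means in code: a dataset created with a one-level name lands in the WORK library, so it can also be referred to with the WORK prefix. The dataset name scratch and its single variable are hypothetical.

```sas
/* One-level name: SAS stores this as WORK.scratch */
data scratch;
    x = 1;
run;

/* The same dataset, referenced with its full two-level name */
proc print data=work.scratch;
run;
```

Once the SAS session ends, WORK.scratch is deleted along with everything else in the WORK library.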
Temporary Vs. Permanent Datasets
Creating a Temporary Dataset: Method 1: Importing Database Management System (DBMS) files
  Excel/Access 1997-2003
  Steps for Import Wizard:
    File -> Import Data -> Select Excel -> Find Dataset -> Select Sheet -> Name It -> Finish

PROC IMPORT OUT= WORK.datasetname
    DATAFILE= "DRIVE:\Foldername\datasetname.xls"
    DBMS=EXCEL REPLACE;
    RANGE="Sheet1$";
    GETNAMES=YES;
    MIXED=NO;
    SCANTEXT=YES;
    USEDATE=YES;
    SCANTIME=YES;
RUN;
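The code above targets the older .xls format. For a newer .xlsx workbook, a hedged sketch might look like the following. It assumes the SAS/ACCESS Interface to PC Files is licensed on your installation; the file path and the dataset name mydata are hypothetical.

```sas
/* Hypothetical path and names; adjust to your own folder */
proc import out=work.mydata
    datafile="C:\hypothetical\folder\mydata.xlsx"
    dbms=xlsx replace;
    sheet="Sheet1";   /* worksheet to read */
    getnames=yes;     /* first row holds the variable names */
run;
```

As with the Import Wizard, the result goes to the WORK library and is therefore temporary.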
Temporary Vs. Permanent Datasets
ID    Height  Gender  Intervention  Result  Year
46    67      F       0             Yes     2006
752   71      M       0             No      2006
9673  62      F       1             Yes     2006
969   69      M       1             Yes     2006
Creating a Temporary Dataset: Method 2: Table Entry
  Hand-entering data into a SAS table
  Steps (you must be in the Explorer window):
    File -> New -> Table -> Enter Data -> Label Variables
  You will hardly ever use this method.
Temporary Vs. Permanent Datasets
Creating a Temporary Dataset: Method 3: Internal Raw Data
  Input raw data via code
  (a) List Input: data values are separated by at least one space

data example2;
input ID Height Gender $ Intervention Result $ Year;
datalines;
46 67 F 0 Yes 2005
752 71 M 0 No 2005
9673 62 F 1 Yes 2005
969 69 M 1 Yes 2005
;
run;

  1. Write the input statement.
  2. Define the variables as character ($) or numeric.
  3. Specify the location of the raw data (in this case, your location is "datalines", meaning you're inputting the raw data directly in your code).
ID    Height  Gender  Intervention  Result  Year
46    67      F       0             Yes     2005
752   71      M       0             No      2005
9673  62      F       1             Yes     2005
969   69      M       1             Yes     2005
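One practical consequence of list input is worth a small sketch: because SAS finds values by counting spaces, a missing value cannot simply be left blank and has to be marked with a period. The dataset name example2_missing and the choice of which values are missing are hypothetical.

```sas
data example2_missing;
    input ID Height Gender $ Intervention Result $ Year;
    datalines;
46 67 F 0 Yes 2005
752 . M 0 No 2005
9673 62 F 1 . 2005
;
run;

proc print data=example2_missing;
run;
```

In the printed table, Height for ID 752 should show as a missing numeric value and Result for ID 9673 as a blank character value.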
Temporary Vs. Permanent Datasets
Creating a Temporary Dataset: Method 3: Internal Raw Data
  Input raw data via code
  (b) Column Input: each data value sits in a defined range of columns

  1. Write the input statement.
  2. Define the variables as character ($) or numeric, each followed by its column range.
  3. Specify the location of the raw data (in this case, your location is "datalines", meaning you're inputting the raw data directly in your code).
ID    Height  Gender  Intervention  Result  Year
46    67      F       0             Yes     2004
752   71      M       0             No      2004
9673  62      F       1             Yes     2004
969   69      M       1             Yes     2004

Column ruler (marks columns 1, 9, 17, 25 and 29; it is a reading aid, not part of the code):
1-------9-------7-------5---9---

data example3;
input ID 1-4 Height 9-10 Gender $ 17 Intervention 21 Result $ 25-27 Year 29-32;
datalines;
46      67      F   0   Yes 2004
752     71      M   0   No  2004
9673    62      F   1   Yes 2004
969     69      M   1   Yes 2004
;
run;
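Column input has two properties that list input lacks, sketched below under made-up data: a character value may contain a space, and a missing value can simply be left blank. The dataset example3_names, the Name variable and its values are hypothetical and are not part of the lecture data.

```sas
data example3_names;
    /* ID in columns 1-4, Name in 6-17, Height in 19-20 */
    input ID 1-4 Name $ 6-17 Height 19-20;
    datalines;
46   Mary Smith   67
752  Juan Perez
9673 Li Wei       62
;
run;

proc print data=example3_names;
run;
```

Here "Mary Smith" is read as one value because it falls inside columns 6-17, and Height for ID 752 is missing because columns 19-20 are blank on that line.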
Temporary Vs. Permanent Datasets
Creating a Temporary Dataset: Method 4: External Raw Data
  Input raw data, usually a .txt file, by pointing to the file

  1. Specify the location of the raw data (in this case, your location is the file path given in the infile statement).
  2. Write the input statement.
  3. Define the variables as character ($) or numeric.
ID    Height  Gender  Intervention  Result  Year
46    67      F       0             Yes     2004
752   71      M       0             No      2004
9673  62      F       1             Yes     2004
969   69      M       1             Yes     2004

data example4;
infile "G:\HPH 562\Class 2 Final\Example4\ex4.txt";
input ID Height Gender $ Intervention Result $ Year @@;
run;
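External files are often comma-separated with a header row rather than space-separated. A hedged sketch of how the infile statement could be adapted to that case follows; the file ex4.csv and its path are hypothetical, and the file is assumed to contain a header line followed by one observation per line.

```sas
data example4_csv;
    /* dlm=','   : values are separated by commas            */
    /* dsd       : two commas in a row mean a missing value  */
    /* firstobs=2: skip the header row                       */
    infile "C:\hypothetical\folder\ex4.csv" dlm=',' dsd firstobs=2;
    input ID Height Gender $ Intervention Result $ Year;
run;
```

The input statement itself is unchanged from list input; only the infile options tell SAS how the external file is laid out.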
Temporary Vs. Permanent Datasets
Permanent Datasets:
  Datasets have two names (a library name and a dataset name, written library.dataset).
  Stored in a folder you create within SAS Libraries (for example, a library called Class2a).
  Purposes:
    Store data as permanent SAS datasets
    Retrieve SAS datasets
Temporary Vs. Permanent Datasets
Creating a Library to create or access your Permanent databases
  Method 1: Libname Statement

libname Class2a "J:\HPH 562 2011\Lecture 2\Class2Data";run;

    Class2a: name of library
    "J:\HPH 562 2011\Lecture 2\Class2Data": location of your library

  Method 2: Non-programming method
    While the Explorer window is highlighted, click File/New.
    Fill in the required fields (name of folder, location of folder).

  Tip: If you're returning to the same code every time, it's easier to form your library through the Libname statement.
Example: the folder J:\CLASS2A holds the SAS database Hmwk2.

1. Make a folder called Class2a.
2. Download or save your SAS database into this folder.
3. Now, in SAS, create a permanent library referencing the folder where your database is saved.

libname PICKLE "J:\CLASS2A";run;

   If you did it correctly, you will see Hmwk2 appear in your PICKLE permanent library. The full name of this database is now PICKLE.Hmwk2. NOTE: You will also see any other SAS databases that are stored in that folder (J:\CLASS2A).
4. Finally... what can you do?

data VitD;
set PICKLE.Hmwk2;
run;
Temporary Vs. Permanent Datasets
Taking a Permanent Database and Making it Temporary:
*Permanent databases have 2 names (library name and database name)

Code:
Libname PermFoldername "BrowserAddress";run;
data nameoftemporarydatabase;
set Permfoldername.databasename;
run;

Example:
Libname BASIL "G:\HPH 562\DRAFT\Datasets\NAMCSIII";run;
data Slide20;
set BASIL.namcsedit;
run;

  BASIL: name of the library where my PERMANENT data is stored
  "G:\HPH 562\DRAFT\Datasets\NAMCSIII": location of the library where my PERMANENT data is stored
  Slide20: name of the new temporary database
  namcsedit: name of my permanent database

Taking a Temporary Database and Making it Permanent:

Code:
data PermFolderName.Permdatabase;
set tempdatabase;
run;

Example:
data BASIL.SLIDE;
set slide21;
run;
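The two directions can be combined into one round trip. The sketch below is an illustration under stated assumptions: the folder C:\hypothetical\folder must already exist, and the library name mylib and the dataset heights are made-up names, not ones used in the lecture.

```sas
/* Hypothetical folder; it must exist before this runs */
libname mylib "C:\hypothetical\folder";

/* Two-level name: written to the folder as a permanent dataset */
data mylib.heights;
    input ID Height;
    datalines;
46 67
752 71
;
run;

/* One-level name: a temporary copy in WORK */
data heights_copy;
    set mylib.heights;
run;

/* Release the library reference; the permanent file stays on disk */
libname mylib clear;
```

After the session ends, heights_copy is gone, while the permanent dataset remains in the folder and can be reached again by re-issuing the libname statement.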
In Summary: Getting Data into SAS
Accessing a SAS database:
  You MUST CREATE A LIBRARY WHERE THE PERMANENT SAS DATABASE IS STORED in order to access the database.
  The only way to look at an Excel database is through the Excel program. The same is true of SAS: the only way to look at a SAS database is through the SAS program. The difference between the two is that the SAS program does not automatically open when you try to open a SAS database. You must open SAS first and create a permanent library where that database is stored. Then you may look at the data.
  It is already permanent! Permanent means that it is a SAS database.
Excel/Access Databases:
  Use the Import Wizard! (point & click: file, import, name, etc.)
  It will be temporary! (Work folder)
Inputting Internal Raw Data:
  Use code (Input & Datalines commands).
  It will be temporary! (Work folder)
Inputting External Raw Data (.TXT data):
  Use code and cite the file location of the data (Input & INFILE commands).
  It will be temporary! (Work folder)
  We will go into this in more detail in Lecture 3.
Proc Contents
  Shows (in the output window) information about your dataset:
    Number of observations
    Variables: name, type, length

Code:
Proc Contents data=nameofdataset; run;

Example:
Proc Contents data=example3; run;
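Two variations on the example can be handy; both are offered as a sketch and assume example3 already exists in the WORK library. The VARNUM option lists variables in the order they were created instead of alphabetically, and the _ALL_ keyword with NODS lists every dataset in a library without the per-dataset detail.

```sas
/* Variables listed in creation order rather than alphabetically */
proc contents data=example3 varnum;
run;

/* A directory of every dataset in WORK, without the variable details */
proc contents data=work._all_ nods;
run;
```

The second form is a quick way to confirm which temporary datasets currently exist in a session.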