Lecture 2

Lecture 2: Getting your Data into SAS By: Jamie Romeiser *Lectures are not to be redistributed without prior written consent HPH 562 DATA MANAGEMENT & INFORMATICS Sept 5, 2012

Transcript of Lecture 2

Page 1: Lecture 2

Lecture 2: Getting your Data into SAS By: Jamie Romeiser *Lectures are not to be redistributed without prior written consent

HPH 562 DATA MANAGEMENT & INFORMATICS

Sept 5, 2012

Page 2: Lecture 2

Organization Tips:

Create a folder for each class meeting.

Save all class documents in that folder:
    Lecture
    Lecture Code
    Lecture Databases

Use this folder as your SAS library folder for that class.

Page 3: Lecture 2

Lecture Outline

Two Parts of SAS Programming
    Data
    Proc

Conceptual Model
    The Data Step: How it works

Temporary Vs. Permanent Datasets
    Temporary:
        Method 1: Importing DBMS Files
        Method 2: Table Entry
        Method 3: Internal Raw Data (Code)
        Method 4: External Raw Data
    Permanent:
        Method 1: Libname
        Method 2: Point and Click

Types of Variables in Brief: Character, Numeric

"Set" statement

"Proc Contents"

Page 4: Lecture 2

Lecture Code

*-----------------------------------------------------------*;
* HPH 562                                                   *;
* Class 2                                                   *;
*-----------------------------------------------------------*;
* Temporary Datasets                                        *;
*   Method 1: Importing data through the import wizard      *;
*   Method 2: Input data via SAS Table                      *;
*   Method 3a: Internal Raw Data, List Input                *;
*   Method 3b: Internal Raw Data, Column Input              *;
*   Method 4: External Raw Data                             *;
* Permanent Databases                                       *;
*   Method 1: Libname                                       *;
*   Method 2: Point and Click                               *;
* Set Statement                                             *;
* Proc Contents                                             *;
*-----------------------------------------------------------*;

Page 5: Lecture 2

Two Parts of SAS Programming Code

Data Steps
    Creating new datasets
    Reading and modifying data

Proc Steps
    Print reports
    Perform utility functions
    Analyze data

*Example of Creating New Dataset;
data examp1;
    input ID BMI Gender $ VitDDeficient;
    datalines;
4687 31 F 0
7542 17 M 1
9637 18 F 1
;
run;

*Example of Modifying a Dataset;
*Gender is a character variable, so it is compared to a quoted value;
data examp2;
    set examp1;
    if Gender = "F" or BMI >= 18.5 then Frailty = 1;
    else Frailty = 0;
run;

*Example of Report;
proc print data=examp1;
run;

*Example of Utility Function;
proc freq data=examp1;
    table Gender*VitDDeficient;
run;

*Example of Analyzing Data;
proc means data=examp1;
    var BMI;
run;

Page 6: Lecture 2

SAS Conceptual Model

Raw data to be analyzed
    -> DATA step (Data statement; More SAS Statements;)
    -> SAS dataset
    -> PROC step (SAS Procedure Statements;)
    -> Results of analysis
    -> Data analysis in a finished report

Page 7: Lecture 2

SAS Conceptual Model (Vitamin D Paper Example)

Raw Data: NHANES III Surveys
    -> DATA step (Data statements; Bring data into SAS; Apply Inclusion Criteria; Create Outcome "Frailty" Variable; Create other predictor Variables;)
    -> SAS dataset
    -> PROC step (Proc Statements; Frequency Tables; Odds Ratios (i.e. Logistic Regression);)
    -> Results of analysis
    -> Data analysis in a finished report

Page 8: Lecture 2

SAS Conceptual Model: Data Step

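Flow of a data step that reads dataset A and writes dataset B:

1. The Data statement is applied to dataset A.
2. Is there data to read? If No: done. The modifications are now in dataset B.
3. If Yes: SAS reads the data and executes the statement.
4. Is there another Data statement? If Yes: SAS executes it too (back to step 3). If No: SAS writes the observation into dataset B and returns to step 2.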

Example Code:

Data DatasetB;
set DatasetA;
Frail1=0;
if BMI<=18.5 then Frail1 = 1;
Frail2=0;
if SLOWWALK=1 then Frail2 = 1;
FRAILTY=0;
if Frail1=1 or Frail2=1 then FRAILTY = 1;
Run;

How to read it:

Data DatasetB;    *Create a new dataset called datasetB;
set DatasetA;     *From datasetA, i.e. use DatasetA as the base;
Statement 1;      *I want you to make the following changes to datasetA...;
Statement 2;
Statement 3;
Statement 4;
Statement 5;
Statement 6;
Run;              *Execute my statements up to here;
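
To try the loop described above without the lecture data, here is a minimal sketch. The three rows standing in for DatasetA are invented for illustration and are not from NHANES III; only the variable names BMI and SLOWWALK come from the slide.

```sas
* Hypothetical stand-in for DatasetA: values invented for illustration;
data DatasetA;
    input ID BMI SLOWWALK;
    datalines;
1 17.0 0
2 24.3 1
3 29.8 0
;
run;

* SAS reads one observation from DatasetA, runs every statement below
  on it, writes the result to DatasetB, then moves on to the next row;
data DatasetB;
    set DatasetA;
    Frail1 = 0;
    if BMI <= 18.5 then Frail1 = 1;
    Frail2 = 0;
    if SLOWWALK = 1 then Frail2 = 1;
    FRAILTY = 0;
    if Frail1 = 1 or Frail2 = 1 then FRAILTY = 1;
run;

proc print data=DatasetB;
run;
```

With these made-up rows, the printed table should show FRAILTY = 1 for IDs 1 and 2 (low BMI and slow walk, respectively) and FRAILTY = 0 for ID 3.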

Page 9: Lecture 2

SAS Conceptual Model (Vitamin D Paper Example)

Raw Data: NHANES III Surveys
    -> DATA step (Data statements; Bring data into SAS; Apply Inclusion Criteria; Create Outcome "Frailty" Variable; Create other predictor Variables;)
    -> SAS dataset (Where?)
    -> PROC step (Proc Statements; Frequency Tables; Odds Ratios (i.e. Logistic Regression);)
    -> Results of analysis
    -> Data analysis in a finished report

Page 10: Lecture 2

Temporary Vs. Permanent Datasets

Temporary Datasets:
    Stored in the Work folder within SAS Libraries.
    All files that SAS stores in the WORK library are deleted at the end of a session (i.e. temporary).
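
A small sketch of what "stored in Work" means in code, assuming default SAS settings: a dataset created with a one-level name lands in the WORK library, so the same dataset can also be referred to as WORK.name. The dataset name scratch is made up for this example.

```sas
* A one-level name is stored in the WORK library by default;
data scratch;
    x = 1;
run;

* WORK.scratch refers to the same temporary dataset;
proc print data=work.scratch;
run;
```

Once the SAS session is closed, scratch is gone and the data step has to be run again.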

Page 11: Lecture 2

Temporary Vs. Permanent Datasets

Creating a Temporary Dataset:

Method 1: Importing DBMS Files
    Use the Import Wizard to bring in datasets stored in Excel or Access format.

Method 2: Table Entry
    Enter data into a SAS table.

Method 3: Internal Raw Data (Code)
    Type it in your code.

Method 4: External Raw Data
    Pull in data stored in .txt format, specifying the variable names yourself.

Page 12: Lecture 2

Temporary Vs. Permanent Datasets

Creating a Temporary Dataset: Method 1: Importing Database Management System (DBMS) files

Excel/Access 1997-2003

Steps for Import Wizard: File > Import Data > Select Excel > Find Dataset > Select Sheet > Name It > Finish

PROC IMPORT OUT= WORK.datasetname
    DATAFILE= "DRIVE:\Foldername\datasetname.xls"
    DBMS=EXCEL REPLACE;
    RANGE="Sheet1$";
    GETNAMES=YES;
    MIXED=NO;
    SCANTEXT=YES;
    USEDATE=YES;
    SCANTIME=YES;
RUN;

ID      Height  Gender  Intervention  Result  Year
46      67      F       0             Yes     2006
752     71      M       0             No      2006
9673    62      F       1             Yes     2006
969     69      M       1             Yes     2006

Page 13: Lecture 2

Temporary Vs. Permanent Datasets

Creating a Temporary Dataset: Method 2: Table Entry

Hand entering data into a SAS table.

Steps (must be in the Explorer Window): File > New > Table > Enter Data > Label Variables

You will hardly ever use this method.

Page 14: Lecture 2

Temporary Vs. Permanent Datasets

Creating a Temporary Dataset: Method 3: Internal Raw Data

Input raw data via code.

(a) List Input: each data value is separated by at least one space.

data example2;
input ID Height Gender $ Intervention Result $ Year;
datalines;
46 67 F 0 Yes 2005
752 71 M 0 No 2005
9673 62 F 1 Yes 2005
969 69 M 1 Yes 2005
;
run;

1. Input statement.
2. Define the variables as character ($) or numeric.
3. Specify the location of the raw data (in this case, your location is "datalines", meaning you're inputting the raw data in the code itself).

Resulting dataset:

ID      Height  Gender  Intervention  Result  Year
46      67      F       0             Yes     2005
752     71      M       0             No      2005
9673    62      F       1             Yes     2005
969     69      M       1             Yes     2005
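
One practical consequence of list input: because values are found by the spaces between them, a missing value cannot simply be left blank. SAS expects a period as a placeholder. The sketch below illustrates this; the rows and the dataset name example2_missing are invented and are not part of the lecture code.

```sas
* Sketch only: rows are invented and the period marks a missing Height;
data example2_missing;
    input ID Height Gender $ Intervention Result $ Year;
    datalines;
46 67 F 0 Yes 2005
752 . M 0 No 2005
;
run;

proc print data=example2_missing;
run;
```

Without the period, SAS would read M as the Height for ID 752 and every later value on that line would shift into the wrong variable.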

Page 15: Lecture 2

Temporary Vs. Permanent Datasets

Creating a Temporary Dataset: Method 3: Internal Raw Data

Input raw data via code.

(b) Column Input: each data value sits in a defined range of columns.

Column ruler (a reading aid, not part of the code; it marks columns 1, 9, 17, 25 and 29):

                1       2   2
1-------9-------7-------5---9---

data example3;
input ID 1-4 Height 9-10 Gender $ 17 Intervention 21 Result $ 25-27 Year 29-32;
datalines;
46      67      F   0   Yes 2004
752     71      M   0   No  2004
9673    62      F   1   Yes 2004
969     69      M   1   Yes 2004
;
run;

1. Input statement.
2. Define the variables as character ($) or numeric, each with its column range.
3. Specify the location of the raw data (in this case, your location is "datalines", meaning you're inputting the raw data in the code itself).

Resulting dataset:

ID      Height  Gender  Intervention  Result  Year
46      67      F       0             Yes     2004
752     71      M       0             No      2004
9673    62      F       1             Yes     2004
969     69      M       1             Yes     2004

Page 16: Lecture 2

Temporary Vs. Permanent Datasets

Creating a Temporary Dataset: Method 4: External Raw Data

Input raw data, usually a .txt file, by pointing SAS to the file's location.

data example4;
infile "G:\HPH 562\Class 2 Final\Example4\ex4.txt";
input ID Height Gender $ Intervention Result $ Year @@;
run;

1. Specify the location of the raw data (in this case, your location is a browser location, i.e. the path to the file).
2. Input statement.
3. Define the variables as character ($) or numeric.

Resulting dataset:

ID      Height  Gender  Intervention  Result  Year
46      67      F       0             Yes     2004
752     71      M       0             No      2004
9673    62      F       1             Yes     2004
969     69      M       1             Yes     2004
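
The G:\ path above only exists on the instructor's machine, so here is a self-contained sketch of the same idea. It first writes a small text file to a temporary location and then reads it back with INFILE. The fileref rawtxt and the dataset name example4_sketch are made up, and the layout of the real ex4.txt is not shown in the lecture, so putting two observations on each line is an assumption chosen to show what the trailing @@ does.

```sas
* Temporary file reference: the name rawtxt is made up for this sketch;
filename rawtxt temp;

* Write a small text file with two observations per line;
data _null_;
    file rawtxt;
    put "46 67 F 0 Yes 2004 752 71 M 0 No 2004";
    put "9673 62 F 1 Yes 2004 969 69 M 1 Yes 2004";
run;

* The trailing @@ holds the current line so the next pass of the
  data step keeps reading from it instead of skipping to a new line;
data example4_sketch;
    infile rawtxt;
    input ID Height Gender $ Intervention Result $ Year @@;
run;

proc print data=example4_sketch;
run;
```

This should produce four observations from the two lines of text. Without @@, SAS would read one observation per line and discard the rest of each line.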

Page 17: Lecture 2

Temporary Vs. Permanent Datasets

Permanent Datasets:
    Datasets have two names.
    Stored in a folder you create within SAS Libraries (example library shown: Class2a).

Purposes:
    Store data as permanent SAS datasets.
    Retrieve SAS datasets.

Page 18: Lecture 2

Temporary Vs. Permanent Datasets

Creating a library to create or access your permanent databases:

Method 1: Libname Statement

libname Class2a "J:\HPH 562 2011\Lecture 2\Class2Data";

    Class2a = name of library
    "J:\HPH 562 2011\Lecture 2\Class2Data" = location of your library

Method 2: Non-programming method
    While the Explorer window is highlighted, click File/New.
    Fill in the required fields (name of folder, location of folder).

Tip: If you're returning to the same code every time, it's easier to form your library through the Libname statement.

Page 19: Lecture 2

Temporary Vs. Permanent Datasets

1. Make a folder called Class2a (here, J:\Class2a).

2. Download or save your SAS database (Hmwk2) into this folder.

3. Now, in SAS, create a permanent library referencing the folder where your database is saved.

libname PICKLE "J:\CLASS2A";

If you did it correctly, you will see Hmwk2 appear in your PICKLE permanent library. The full name of this database is now PICKLE.Hmwk2. NOTE: You will also see any other SAS databases that are stored in that folder (J:\CLASS2A).

4. Finally... What can you do?

data VitD;
set PICKLE.Hmwk2;
run;

Page 20: Lecture 2

Taking a Permanent Database and Making it Temporary:

Permanent databases have 2 names (libraryname.databasename).

Code:

libname PermFoldername "BrowserAddress";

data nameoftemporarydatabase;
set PermFoldername.databasename;
run;

Example:

libname BASIL "G:\HPH 562\DRAFT\Datasets\NAMCSIII";

data Slide20;
set BASIL.namcsedit;
run;

    BASIL = name of library where my PERMANENT data is stored
    "G:\HPH 562\DRAFT\Datasets\NAMCSIII" = location of library where my PERMANENT data is stored
    Slide20 = name of new temporary database
    namcsedit = name of my permanent database

Page 21: Lecture 2

Taking a Temporary Database and Making it Permanent:

Code:

data PermFolderName.Permdatabase;
set tempdatabase;
run;

Example:

data BASIL.SLIDE;
set slide21;
run;
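
The two directions can be tried together in one short round trip. This is a sketch under stated assumptions: the folder C:\SASClass\Lecture2 is hypothetical and must be replaced with a folder that exists on your machine, and the library name mylib, the dataset names and the two data rows are invented for illustration.

```sas
* Hypothetical folder: replace with a folder that exists on your machine;
libname mylib "C:\SASClass\Lecture2";

* Temporary dataset in WORK (values invented for illustration);
data slide_demo;
    input ID Height;
    datalines;
46 67
752 71
;
run;

* Temporary to permanent: two-level name, library.dataset;
data mylib.slide_demo;
    set slide_demo;
run;

* Permanent to temporary: read it back into WORK under a new name;
data slide_copy;
    set mylib.slide_demo;
run;
```

After the second step, a SAS dataset file for slide_demo should appear in the folder, and it stays there after the session ends. The copies in WORK do not.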

Page 22: Lecture 2

In Summary: Getting Data into SAS

Accessing a SAS database:
    You MUST CREATE A LIBRARY WHERE THE PERMANENT SAS DATABASE IS STORED in order to access the database.
    The only way to look at an Excel database is through the Excel program. Same thing with SAS: the only way to look at a SAS database is through the SAS program. The difference between the two is that SAS does not automatically open when you try to open a SAS database. You must physically open SAS first, and create a permanent library where that database is stored. Then, you may look at the data.
    It is already permanent! Permanent means that it is a SAS database.

Excel/Access Databases:
    Use the Import Wizard! (point & click: file, import, name, etc.)
    It will be temporary! (Work folder)

Inputting Internal Raw Data:
    Use code (Input & Datalines commands).
    It will be temporary! (Work folder)

Inputting External Raw Data (.TXT data):
    Use code and cite the browser location of the data (Input & INFILE commands).
    It will be temporary! (Work folder)
    We will go into this in more detail in Lecture 3.

Page 23: Lecture 2

Proc Contents

Shows (in the Output window) information about your dataset:
    Number of observations
    Variables: name, type, length

Code:

Proc Contents data=nameofdataset;
run;

Example:

Proc Contents data=example3;
run;
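
For a version that runs on its own, the sketch below builds a tiny dataset first and then describes it. The dataset name example3_demo and its two rows are invented for illustration.

```sas
* Small made-up dataset so the step below has something to describe;
data example3_demo;
    input ID Height Gender $;
    datalines;
46 67 F
752 71 M
;
run;

* Expect 2 observations and 3 variables: ID and Height numeric,
  Gender character;
proc contents data=example3_demo;
run;
```

Checking the Type column here is a quick way to confirm that the $ in the input statement did what you intended.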