SAS Workshop Lecture 1 Lecturer: Annie N. Simpson, MSc.

Summer 2007 SAS Workshop 2

Workshop Website

www.musc.edu/~simpsona/SASWorkshop/

Part I of Lecture 1
- What is SAS?
- Why do we need it?
- How to open/manipulate the windows
- Where to get help
- Example 1…give it a try

SAS Introduction
- “SAS” originally stood for Statistical Analysis System; it is no longer an acronym for anything
- Developed in the early 1970s at North Carolina State University
- Originally intended for management and analysis of agricultural field experiments
- Now one of the most widely used statistical software packages

What is the SAS System?
The SAS System is an integrated system of software packages that enables you to perform:
- Data entry, retrieval, and management
- Report writing and graphics
- Statistical and mathematical analysis
- Business planning, forecasting, and decision support
- Operations research and project management
- Quality improvement
- Applications development

What is the SAS System Really?

SAS is a COMPUTER PROGRAM for managing and analyzing data

It is a TOOL!

“I only want to know how to analyze data”

With respect to statistical programs, most applied statisticians/researchers spend most of their time on data management (manipulation) activities in preparation for analysis.

Statistical Programming:
- 85% Data Manipulation
- 10% Comments
- 5% Analysis

SAS Help Resources

Nothing replaces experience / trial and error

- Me!
- SAS books by users (I have a shelf full)
- SAS technical support on the web
- Help files from the program

3 Main Programming Windows
- Program Editor – Enter, edit, and submit (run) SAS programs
- Log – Displays messages about your SAS session and the programs that you submit
- Output – View output from SAS programs that have been run

*We will review these together in a moment!

SAS Programs
- A SAS program is a sequence of statements executed in order
- Every SAS statement ends with a semicolon (;)!
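
As a minimal sketch of the semicolon rule, the short program below spreads one statement over two lines. The data set name demo and the variables x and y are made up for illustration. SAS reads up to the semicolon, not to the end of the line, so line breaks and indentation are only there for the reader.

```sas
* Each statement ends with a semicolon, wherever the line breaks fall;
DATA demo;            /* hypothetical data set name */
    x = 2;
    y = x * 3;
RUN;

PROC PRINT
    DATA = demo;      /* one statement spread over two lines */
RUN;
```

Leaving out a semicolon is one of the most common mistakes, so it is a good first thing to check when the Log shows an error.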

Two main parts to SAS code

- DATA steps
  - read and modify data
  - create new variables
  - create a SAS data set
- PROC steps (or procedure steps)
  - analyze data
  - produce results or output (e.g. MEANS, FREQ, PRINT, CONTENTS)
- A step ends with a RUN statement
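
The two kinds of steps can be sketched side by side. The example below is only an illustration: the data set name heights, the variable names, and the three data lines are all made up. The DATA step reads the values and creates a new variable; the PROC step then summarizes the result.

```sas
/* DATA step: read raw values and create a new variable */
DATA heights;                      /* hypothetical data set name */
    INPUT Name $ Height_in;        /* Name is character, Height_in is numeric */
    Height_cm = Height_in * 2.54;  /* new variable computed from an existing one */
    DATALINES;
Ann 62
Bob 70
Cy 66
;
RUN;

/* PROC step: summarize the data set created above */
PROC MEANS DATA = heights;
    VAR Height_in Height_cm;
RUN;
```

DATALINES does the same job as the CARDS statement used elsewhere in this lecture.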

Comments - The “Third” Step

Comments are usually used to annotate the program, making it easier for someone to read your program and understand what you have done and why.

There are two styles of comments that you can use:

- One starts with an asterisk (*) and ends with a semicolon (;).
- The other starts with a slash asterisk (/*) and ends with an asterisk slash (*/).
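
Both comment styles are shown in the short sketch below. The data set name demo is made up for illustration.

```sas
* This is a statement-style comment and ends at the semicolon;

/* This is a block-style comment.
   It can span several lines. */

DATA demo;              /* hypothetical data set name */
    x = 1;  * a comment can follow a statement on the same line;
RUN;
```

The slash-asterisk style is handy for temporarily switching off a block of code, because it can enclose statements that contain semicolons.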

Let's open SAS together and take a look…
- Open SAS
- Review window structure
- Check out the help options
- Practice with a small program…
- Check out what errors look like…

SAS Example Program 1

Data FEV; *Create new data set called FEV;
    Input Id Age FEV Height Sex Smok_Stat;
    Cards;
23151 11 2.54200 62.0000 0 0
23401 10 2.60800 66.0000 1 0
23601 11 2.35400 62.0000 0 0
23651 13 2.59900 62.5000 0 1
23652 10 1.45800 57.0000 0 0
23901 10 3.79500 68.5000 1 0
24201 11 2.49100 59.0000 0 0
24251 13 3.06000 61.5000 0 0
24501 10 2.54500 65.0000 1 0
24543 11 2.99300 66.5000 1 0
24601 10 3.30500 65.0000 0 0
24642 13 4.75600 68.0000 1 1
24701 11 3.77400 67.0000 0 0
24741 10 2.85500 64.5000 1 0
24801 11 2.98800 70.0000 1 0
25041 11 2.49800 60.0000 1 0
25051 14 3.16900 64.0000 0 0
25501 11 2.88700 62.5000 1 0
25551 13 2.70400 61.0000 0 0
25901 11 3.51500 64.0000 0 0
;
RUN;

PROC PRINT DATA = FEV; /*Prints the data in FEV*/
RUN;

Part II of Lecture 1

Now that we know the basics, let's talk about:
- Some SAS rules of programming
- Big data sets…don’t want to type those in!

Variables and Observations
- Data consist of variables and observations (much like you are used to seeing in MS Excel spreadsheets)
- Variables – columns
- Observations – rows

Data Types
- In SAS there are just two data types: numeric and character
- Numeric fields are numbers
- Character data are everything else
- If it contains only numbers, then it may be numeric or character (example – zip codes)
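
The zip code case can be sketched as follows. The data set name addresses, the variable names, and the data values are all made up for illustration. A dollar sign after a variable name in the INPUT statement tells SAS to read that variable as character; without it the variable is read as numeric.

```sas
DATA addresses;                  /* hypothetical data set name */
    INPUT City $ Zip $ Pop;      /* $ marks the preceding variable as character */
    DATALINES;
Boston 02108 650000
Salem 01970 44000
;
RUN;

PROC CONTENTS DATA = addresses;  /* lists each variable and its type */
RUN;
```

Reading Zip as character keeps the leading zero. Read as numeric, 02108 would be stored as the number 2108.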

Missing Data
- Missing character data are represented by blanks
- Missing numeric data are represented by a single period (.)
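
A small sketch of how missing values can be typed into raw data. The data set name visits and all values are made up. With the simple space-separated INPUT style used in this lecture, a period serves as a placeholder for a missing value of either type; SAS then stores a missing character value as a blank and a missing numeric value as a period.

```sas
DATA visits;                     /* hypothetical data set name */
    INPUT Name $ Age Weight;
    Age_Missing = (Age = .);     /* 1 when Age is missing, 0 otherwise */
    DATALINES;
Ann 34 130
Bob . 182
. 51 160
;
RUN;

PROC PRINT DATA = visits;
RUN;
```

In the printed output, the second observation should show a period for Age and the third a blank for Name.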

Rules for SAS names
- Variable names must be 32 characters or fewer in length (the limit used to be 8, and some still like to stick to this shorter length)
- Names must start with a letter or an underscore (_)
- Names can contain only letters, numerals, or underscores
- Names can contain upper- and lowercase letters; SAS is insensitive to case
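
The naming rules can be tried out with a short sketch like the one below. All names here are made up for illustration, and the two invalid names are left inside comments so that the program still runs.

```sas
DATA name_rules;                 /* hypothetical data set name */
    Smok_Stat = 1;               /* letters and an underscore: valid */
    _temp2    = 5;               /* may start with an underscore */
    Total     = SMOK_STAT + _Temp2;   /* case differs, same variables */
    /* 2ndVisit = 1;        invalid: starts with a numeral */
    /* blood pressure = 1;  invalid: contains a space      */
RUN;

PROC PRINT DATA = name_rules;
RUN;
```

Because SAS ignores case in names, SMOK_STAT and Smok_Stat refer to the same variable.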

Don’t be afraid to type the wrong thing…Just give it a Try!

DATA step’s built-in loop
- DATA steps execute line by line and observation by observation
- Must create variables before you use them
- SAS takes the first observation and runs it all the way through the DATA step before looping back to pick up the second observation. In this way, SAS sees only one observation at a time.
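
One way to watch the built-in loop at work is to write a line to the Log on every pass, as in the sketch below. The data set name loop_demo and its values are made up. _N_ is an automatic variable that SAS maintains to count the passes through the DATA step.

```sas
DATA loop_demo;                    /* hypothetical data set name */
    INPUT Id Score;
    Doubled = Score * 2;           /* computed for the current observation only */
    PUT _N_= Id= Score= Doubled=;  /* writes one line to the Log per pass */
    DATALINES;
1 10
2 20
3 30
;
RUN;
```

After running it, the Log should contain one line per observation, showing that each observation went through all of the statements before the next one was read.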

Reading the SAS Log

Every time you run a SAS job, READ the Log window first!

Then go to the Output window to view your results; that way you know that your results are “real”.

Things to remember
- SAS does not automatically save any of the windows
- You must save each window individually!
- Saving one does not save any of the others.
- Give the saved files for related windows the same name. Ex:
  - Program Editor – ‘zoo.sas’ (SAS Program File)
  - Log Window – ‘zoo.log’ (SAS Log File)
  - Output Window – ‘zoo.lst’ (SAS Output File)

SAS Data Sets

Before SAS can read your data, it must be in a special form called a SAS data set.

What type of data did we just use in our first example?

How do you expect your data to normally be stored?

LIBNAME Statement
(i.e. SAS asks “Where are my SAS data sets stored?”)
- Use this statement to define your SAS Library location before using your SAS data sets
- Follows your file storage directory structure

Example:

LIBNAME ABC 'C:\DATA';

Proc Means Data = ABC.EX4A;
Run;
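
As a rough sketch of what the library name buys you, the program below writes a data set into the folder and then reads it back. The libref mylib and the data set name demo are made up, and the folder C:\DATA is assumed to exist already. A two-level name (library.dataset) refers to a data set stored in that library's folder; a one-level name goes to the temporary WORK library, which is cleared when the SAS session ends.

```sas
LIBNAME mylib 'C:\DATA';    /* hypothetical libref; the folder must already exist */

DATA mylib.demo;            /* two-level name: saved in the folder as a permanent data set */
    x = 1;
RUN;

DATA demo;                  /* one-level name: goes to the temporary WORK library */
    SET mylib.demo;
RUN;
```

Note the straight single quotes around the path; curly quotes pasted in from a word processor or slide will cause an error.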

Let's take a look at SAS to check it out…
- Look for SAS data sets inside your C:\DATA file folder
- Can you write a LIBNAME statement so that SAS can “see” those same files?
- Rule #1: This only works if the data is a SAS data set…not Excel, Access, etc.

SAS Example Program 2
Open SAS, type and run the following program:

Libname annie 'c:\DATA';

DATA new;
    Set annie.HTWT;
Run;

PROC CONTENTS DATA = new;
    TITLE "What is contained in the HTWT Data Set";
RUN;

PROC Print DATA = new;
    TITLE "Printing my HTWT data set";
RUN;