Learning SAS by Example - A Programmer’s Guide by Ron Cody: Solution

Solution for Even Numbered Problems for Chapters 7-15 from Learning SAS by Example - A Programmer’s Guide by Ron Cody. Vibeesh C S, Praxis Business School, 2015.

Chapter 7- Performing Conditional Processing

Question

/* 7.2 Using the SAS data set Hosp, use PROC PRINT to list observations for

Subject values of 5, 100, 150, and 200. Do this twice, once using OR

operators and once using the IN operator. Note: Subject is a numeric

variable */

Program

data a15031.hosp99l4;

set a15031.hosp; *USING "OR";

where Subject eq 5 or Subject eq 100 or Subject eq 150 or Subject eq

200;

run;

proc print data=a15031.hosp99l4;

run;

data a15031.hosin; *using "IN";

set a15031.hosp;

where Subject in(5,100,150,200);

run;

proc print data=a15031.hosin;

run;

Output using OR

Output using IN

Question

/* 7.4 Using the Sales data set, create a new, temporary SAS data set

containing Region and Total Sales plus a new variable called Weight with

values of 1.5 for the North Region, 1.7 for the South Region, and 2.0 for

the West and East Regions. Use a SELECT statement to do this */

Program

data a15031.sales11q;
   set a15031.sales(keep = Region TotalSales);
   * the KEEP= data set option reads only Region and TotalSales;
   select;
      when (Region = 'North') Weight = 1.5; * each WHEN assigns Weight for one region;
      when (Region = 'South') Weight = 1.7;
      when (Region = 'East') Weight = 2.0;
      when (Region = 'West') Weight = 2.0;
      otherwise;
   end;
run;

proc print data=a15031.sales11q;
run;

Output

Question

/* 7.6 Using the Sales data set, list all the observations where Region is

North and Quantity is less than 60. Include in this list any observations

where the customer name (Customer) is Pet's are Us */

Program

data a15031.sal55;
   set a15031.sales;
   * WHERE statement: Region is North and Quantity is below 60, or the customer is the named pet store;
   where (Region eq "North" and Quantity < 60) or Customer eq "Pet's are Us";
run;

proc print data=a15031.sal55;
run;

Learnings from this chapter

The KEEP= and DROP= data set options select the variables needed for an analysis. If a data set has a large number of variables, they let us work with only the variables of interest.

The WHERE statement in a DATA step filters the observations of a data set according to the requirements.

Logical operators (AND, OR, NOT, IN) let us combine conditions.

The SELECT statement lets us assign values according to a customized set of conditions.
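The points above can be sketched in one short program. This is only an illustration: the data set names (work.sales_demo, work.weighted_demo) and the data values are made up, and it assumes a WORK library is available.

```sas
* Hypothetical sales data, created inline so the sketch is self-contained;
data work.sales_demo;
   input Region : $5. Quantity TotalSales;
datalines;
North 50 1200
South 80 2500
East 20 400
West 65 1900
;

data work.weighted_demo;
   * the KEEP= option reads only these two variables;
   set work.sales_demo(keep = Region TotalSales);
   * WHERE filters observations as they are read;
   where Region in ('North', 'South', 'East');
   * SELECT assigns Weight according to the value of Region;
   select (Region);
      when ('North') Weight = 1.5;
      when ('South') Weight = 1.7;
      otherwise Weight = 2.0;
   end;
run;

proc print data=work.weighted_demo noobs;
run;
```

The listing should contain three observations (the West row is filtered out) with the variables Region, TotalSales, and Weight.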

Chapter 8 – Performing Iterative Processing

Question

/*8.2 Run the program here to create a temporary SAS data set (MonthSales):

data monthsales;

input month sales @@;

---add your line(s) here---

datalines;

1 4000 2 5000 3 . 4 5500 5 5000 6 6000 7 6500 8 4500

9 5100 10 5700 11 6500 12 7500

;

Modify this program so that a new variable, SumSales, representing Sales to

date, is added to the data set. Be sure that the missing value for Sales in

month 3 does not result in a missing value for SumSales */

Program

data a15031.monthsales;
   * the double trailing @@ holds the line so that several observations are read from it;
   input month sales @@;
datalines;
1 4000 2 5000 3 . 4 5500 5 5000 6 6000 7 6500 8 4500
9 5100 10 5700 11 6500 12 7500
;

proc print data=a15031.monthsales;
run;

data a15031.modifiedsales;
   set a15031.monthsales;
   * sum statement: sumsales starts at 0, is retained across observations, and ignores missing sales;
   sumsales + sales;
run;

proc print data=a15031.modifiedsales;
run;

Output

Question

/*8.4 Count the number of missing values for the variables A, B, and

C in the Missing data set. Add the cumulative number of missing

values to each observation (use variable names MissA, MissB, and

MissC). Use the MISSING function to test for the missing values */

Program

data a15031.missing1;

input G $ A B C ;

*a sum statement in each IF keeps a cumulative count of missing values;

if missing(G) then MissG+1;

if missing(A) then MissA+1;

if missing(B) then MissB+1;

if missing(C) then MissC+1;

datalines;

M 56 68 89

F 33 60 71

M 45 91 .

F 35 35 68

M . 71 81

M 50 68 71

. 23 60 46

M 65 72 103

. 35 65 67

M 15 71 75

;

proc print data=a15031.missing1 NOOBS;

run;

Output

Question

/*8.6 Repeat Problem 5, except have the range of N go from 5 to 100

by 5 */

Program

data a15031.loger2;

do n = 5 to 100 by 5;*using do loop creating values from 5 to 100 by

5;

log_of_n=log(n);

output;

end;

run;

proc print data=a15031.loger2;

run;

Output

Question

/*8.8 Use an iterative DO loop to plot the following equation:

Logit = log(p / (1 - p)). Use values of p from 0 to 1 (with a point at

every .05). Using the following GPLOT

statements will produce a very nice plot. (If you do not have

SAS/GRAPH

software, use PROC PLOT to plot your points).

goptions reset=all

ftext='arial'

htext=1.0

ftitle='arial/bo'

htitle=1.5

colors=(black);

symbol v=none i=sm;

title "Logit Plot";

proc gplot data=logitplot;

plot Logit * p;run;quit;*/

Program

data a15031.itrative1;
   * an integer loop index avoids accumulated rounding error in steps of 0.05;
   do i = 0 to 20;
      p = i / 20;
      * the logit is undefined at p = 0 and p = 1, so it is left missing there;
      if 0 < p < 1 then logit = log(p/(1-p));
      else logit = .;
      output;
   end;
   drop i;
run;

goptions reset=all ftext='arial' htext=1.0 ftitle='arial/bo'
         htitle=1.5 colors=(black);
symbol v=none i=sm;
title "Logit Plot";

proc gplot data=a15031.itrative1;
   plot logit * p; * the PLOT statement draws logit against p;
run;
quit;

proc print data=a15031.itrative1;
run;

Output

Question

/*8.10 You are testing three speed-reading methods (A, B, and C) by
randomly assigning 10 subjects to each of the three methods. You are
given the results as three lines of reading speeds, each line
representing the results from each of the three methods,
respectively. Here are the results:

250 255 256 300 244 268 301 322 256 333
267 275 256 320 250 340 345 290 280 300
350 350 340 290 377 401 380 310 299 399

Create a temporary SAS data set from these three lines of data. Each
observation should contain Method (A, B, or C), and Score. There
should be 30 observations in this data set. Use a DO loop to create
the Method variable and remember to use a single trailing @ in your
INPUT statement. Provide a listing of this data set using PROC PRINT
*/

Program

data a15031.speed;
   do method = "A", "B", "C";
      do n = 1 to 10; * inner loop: ten subjects per method;
         input score @; * the single trailing @ holds the line for the next INPUT;
         output;
      end;
   end;
   drop n;
datalines;
250 255 256 300 244 268 301 322 256 333
267 275 256 320 250 340 345 290 280 300
350 350 340 290 377 401 380 310 299 399
;

proc print data=a15031.speed;
run;

Output

Question

/* 8.12 You place money in a fund that returns a compound interest

of 4.25% annually. You

deposit $1,000 every year. How many years will it take to reach

$30,000? Do not

use compound interest formulas. Rather, use “brute force” methods

with DO WHILE

or DO UNTIL statements to solve this problem */

Program

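data a15031.money;
   interest = 0.0425; * annual return of 4.25 percent;
   total = 0;
   do until (total ge 30000);
      year + 1;
      total = total + 1000; * yearly deposit, assumed to be made at the start of the year;
      total = total + interest*total; * add the interest earned during the year;
      output;
   end;
run;

proc print data=a15031.money;
   format total dollar10.2;
run;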

Output

Question

/* 8.14 Generate a table of integers and squares starting at 1 and

ending when the square

value is greater than 100. Use either a DO UNTIL or DO WHILE

statement to accomplish this*/

Program

data a15031.table;

do n=1 to 100 until (square gt 100);

square= n**2;

*DO UNTIL tests its condition at the bottom of the loop, so the last row written is the first one whose square is greater than 100;

output;

end;

run;

proc print data=a15031.table ;

run;

Output

Learnings from this chapter

The importance of the sum statement and the RETAIN statement

Using the sum statement to count the number of missing values

The importance of the DO loop for iterative processing

Using the single trailing @ and the double trailing @@ to read the data

Using DO WHILE and DO UNTIL statements
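As a minimal sketch of the running-total technique used in this chapter, the step below builds the same cumulative sum two ways: with a sum statement, and with an explicit RETAIN plus the SUM function. The data set and variable names (work.running_demo, total_a, total_b) are made up for illustration.

```sas
data work.running_demo;
   input sales @@;
   * sum statement: total_a starts at 0, is retained, and skips missing values;
   total_a + sales;
   * equivalent long form: RETAIN keeps the value and the SUM function ignores missing values;
   retain total_b 0;
   total_b = sum(total_b, sales);
datalines;
4000 5000 . 5500
;

proc print data=work.running_demo noobs;
run;
```

Both totals should read 4000, 9000, 9000, 14500. Writing total_b = total_b + sales instead would make the total missing from the third observation onward, which is why the sum statement (or the SUM function) is the safer choice when the data contain missing values.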

Chapter 9 – Working with Dates

Question

/* 9.2 Using the following lines of data, create a temporary SAS

data set called ThreeDates. Each line of data contains three dates,

the first two in the form mm/dd/yyyy and the last in the

form ddmmmyyyy. Name the three date variables Date1, Date2, and

Date3. Format all three using the MMDDYY10. format. Include in your

data set the number of years from Date1 to Date2 (Year12) and the

number of years from Date2 to Date3 (Year23). Round these values to

the nearest year. Here are the lines of data (note that the columns

do not line up):

01/03/1950 01/03/1960 03Jan1970

05/15/2000 05/15/2002 15May2003

10/10/1998 11/12/2000 25Dec2005 */

Program

data a15031.threedate;
   * list input with the colon modifier, because the columns do not line up;
   input date1 : mmddyy10.
         date2 : mmddyy10.
         date3 : date9.;
   format date1 date2 date3 mmddyy10.;
   * YRDIF gives the number of years between two dates and ROUND rounds it to the nearest year;
   Year12 = round(yrdif(date1, date2, "actual"));
   Year23 = round(yrdif(date2, date3, "actual"));
datalines;
01/03/1950 01/03/1960 03Jan1970
05/15/2000 05/15/2002 15May2003
10/10/1998 11/12/2000 25Dec2005
;

proc print data=a15031.threedate noobs;
run;

Output

Question

/* 9.4 Using the Hosp data set, compute the subject’s ages two ways:

as of January 1, 2006(call it AgeJan1), and as of today’s date (call

it Age Today) The variable DOB represents the date of birth. Take

the integer portion of both ages. List the first 10

observations */

Hint: use YRDIF to find the difference between DOB and today’s date, and INT to keep only the integer part of the difference.

Program

data a15031.hospp;
   set a15031.hosp;
   * INT keeps only the integer portion of each age;
   AgeToday = int(yrdif(DOB, today(), "actual"));
   AgeJan1 = int(yrdif(DOB, "01jan2006"d, "actual"));
run;

proc print data=a15031.hospp(obs=10);
run;

Output

Question

/* 9.6 Using the Medical data set, compute frequencies for the days

of the week for the date of the visit (VisitDate). Supply a format

for the days of the week and months of the year */

Program

data a15031.medical;
   input @1 VisitDate mmddyy10. @12 patno $3.;
   format visitdate date9.;
   day_of_week = weekday(visitdate); * day of the week, 1 = Sunday through 7 = Saturday;
   month_of_year = month(visitdate); * month number, 1 through 12;
datalines;
11/29/2003 879
11/30/2003 880
09/04/2003 883
08/28/2003 884
09/04/2003 885
08/26/2003 886
08/31/2003 887
08/25/2003 888
11/16/2003 913
11/15/2003 914
;

* user-defined formats label the day and month numbers;
proc format;
   value dayfmt 1='Sun' 2='Mon' 3='Tue' 4='Wed' 5='Thu' 6='Fri' 7='Sat';
   value monfmt 1='Jan' 2='Feb' 3='Mar' 4='Apr' 5='May' 6='Jun'
                7='Jul' 8='Aug' 9='Sep' 10='Oct' 11='Nov' 12='Dec';
run;

proc freq data=a15031.medical;
   tables day_of_week month_of_year;
   format day_of_week dayfmt. month_of_year monfmt.;
run;

proc print data=a15031.medical;
run;

Output

Question

/* 9.8 Using the values for Day, Month, and Year in the raw data

below, create a temporary SAS data set containing a SAS date based

on these values (call it Date) and format this value using the

MMDDYY10. format. Here are the Day, Month, and Year values:

25 12 2005

1 1 1960

21 10 1946 */

Program

data a15031.date_it;

input Day Month Year;

datalines;

25 12 2005

1 1 1960

21 10 1946

;

data a15031.date_it1;
   set a15031.date_it; * the SET statement reads the data set created above;
   Date = mdy(Month, Day, Year); * MDY builds a SAS date from the month, day, and year values;
   format Date mmddyy10.;
run;

proc print data=a15031.date_it1;
run;

Output

Question

/* 9.10 Using the Hosp data set, compute the number of months from

the admission date (AdmitDate) and December 31, 2007 (call it

MonthsDec). Also, compute the number of months from the admission

date to today's date (call it MonthsToday). Use a date interval

function to solve this problem. List the first 20 observations for

your solution */

Program

data a15031.monthdec;
   set a15031.hosp; * the hosp data set is read from the permanent library;
   * the hosp data set is in the blog folder uploaded to Dropbox;
   * INTCK counts the months from AdmitDate to 31Dec2007 and from AdmitDate to today;
   MonthsDec = intck('month', AdmitDate, '31dec2007'd);
   MonthsToday = intck('month', AdmitDate, today());
run;

proc print data=a15031.monthdec(obs=20);
run;

Output

Question

/* 9.12 You want to see each patient in the Medical data set on the

same day of the week 5 weeks after they visited the clinic (the

variable name is VisitDate).Provide a listing of the patient number

(Patno), the visit date, and the date for the return visit */

Program

data a15031.med;
   set a15031.medical;
   * INTNX advances VisitDate by 5 weeks and SAMEDAY keeps the same day of the week;
   Followdate = intnx('week', VisitDate, 5, 'sameday');
run;

proc print data=a15031.med;
   var patno VisitDate Followdate;
   format Followdate VisitDate date9.;
run;

Output

Learnings from this chapter

The ways to read date variables

The ways to store date variables

The ways to extract day of a week or a month

Providing formats to dates

The importance and usage of the INTCK and INTNX date interval functions
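A small sketch of how the date interval functions behave may help here. It assumes the default (discrete) counting method of INTCK; the variable names are made up for illustration.

```sas
data _null_;
   * INTCK counts the month boundaries crossed, not the elapsed days;
   one_day = intck('month', '31jan2007'd, '01feb2007'd);  * one boundary crossed: 1;
   same_mon = intck('month', '01jan2007'd, '31jan2007'd); * no boundary crossed: 0;
   * INTNX moves a date forward by whole intervals and SAMEDAY keeps the day of the week;
   followup = intnx('week', '04sep2003'd, 5, 'sameday');  * five weeks later: 09OCT2003;
   put one_day= same_mon= followup= date9.;
run;
```

The log should show one_day=1, same_mon=0, and followup=09OCT2003. Because INTCK counts boundaries, two dates one day apart can be "one month" apart, so YRDIF or a plain date subtraction is often a better choice when elapsed time matters.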

Chapter 10 – Subsetting and Combining SAS Datasets

Question

/*10.2 Using the SAS data set Hosp, create a temporary SAS data set
called Monday2002, consisting of observations from Hosp where the
admission date (AdmitDate) falls on a Monday and the year is 2002.
Include in this new data set a variable called Age, computed as the
person’s age as of the admission date, rounded to the nearest
year */

Program

data monday20122;

set a15031.hosp;

Admit_day= weekday(AdmitDate); *week day of admit;

admit_year=year(admitdate); *year OF admit;

admit_month=month(AdmitDate);*month of admit;

day_of_admit=day(admitdate); *date of admit;

run;

proc print data=monday20122;

run;

Output

Program

data monday2002;
   set monday20122; * the temporary data set created in the previous step;
   where Admit_day = 2 and admit_year = 2002; * WEEKDAY returns 2 for Monday;
   Age = round(yrdif(DOB, AdmitDate, 'Actual')); * age at admission, rounded to the nearest year;
run;

proc print data=monday2002;
   format DOB AdmitDate date9.;
run;

Output

Question

/*10.4 Using the SAS data set Bicycles, create two temporary SAS

data sets as follows:

Mountain_USA consists of all observations from Bicycles where State

is Uttar Pradesh and Model is Mountain. Road_France consists of all

observations from Bicycles where State is Maharastra and Model is

Road Bike. Print these two data sets */

Program

title bicycle;

data a15031.bicycle;*create a data set;

set a15031.bicycles;

run;

proc contents data=a15031.bicycle;*check the content of data set;

run;

proc print data=a15031.bicycle;

run;

data a15031.mountain_usa a15031.road_france;
   set a15031.bicycle;
   * IF-THEN/ELSE with OUTPUT sends each matching observation to one of the two data sets named above;
   if state = "Uttar Pradesh" and model = "mountain bike" then output a15031.mountain_usa;
   else if state = "Maharastra" and model = "road bike" then output a15031.road_france;
run;

proc print data=a15031.mountain_usa;
run;

proc print data=a15031.road_france;
run;

Output

Question

/*10.6 Repeat Problem 5, except this time sort Inventory and

NewProducts first (create two temporary SAS data sets for the sorted

observations). Next, create a new, temporary SAS data set (Updated)

by interleaving the two temporary, sorted SAS data sets. Print out

the result.*/

Program

proc sort data=a15031.inventory out=a15031.inventory;
   by Model; * both data sets must be sorted by Model before they are interleaved;
run;

proc sort data=a15031.newproducts out=a15031.newproducts;
   by Model;
run;

data a15031.updated;
   * SET with a BY statement interleaves the two sorted data sets;
   set a15031.inventory a15031.newproducts;
   by Model;
run;

title "Listing of UPDATED";

proc print data=a15031.updated;

run;

Output

Question

/* 10.8 Run the program here to create a SAS data set called Markup:

data markup;

input manuf : $10. Markup;

datalines;

Cannondale 1.05

Trek 1.07

;

Combine this data set with the Bicycles data set so that each

observation in the Bicycles data set now has a markup value of 1.05

or 1.07, depending on whether the bicycle is made by Cannondale or

Trek. In this new data set(call it Markup_Prices),create a new

variable (NewTotal) computed as TotalCost times Markup */

Program

data a15031.markup;

input manuf : $10. Markup;

datalines;

Cannondale 1.05

Trek 1.07

;

proc print data = a15031.markup;

run;

proc contents data= a15031.markup;

run;

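proc sort data=a15031.bicycle;
   by manuf; * both data sets must be sorted by the BY variable before the merge;
run;

proc sort data=a15031.markup;
   by manuf;
run;

data a15031.merage;
   * the MERGE statement matches each bicycle with the markup for its manufacturer;
   merge a15031.bicycle a15031.markup;
   by manuf;
   newtotal = TotalCost * Markup; * TotalCost is the variable named in the question;
run;

proc print data = a15031.merage; * merged data set;
run;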

Output

Question

/*10.10 Using the Purchase and Inventory data sets, provide a list

of all Models (and the Price) that were not purchased*/

Program

proc sort data=a15031.inventory out=a15031.inventory;
   by Model; * sort the inventory data by Model;
run;

proc sort data=a15031.purchase out=a15031.purchase;
   by Model; * sort the purchase data by Model;
run;

data a15031.not_bought;
   * merge the sorted data sets and use the IN= flags to record where each Model came from;
   merge a15031.inventory(in=InInventory)
         a15031.purchase(in=InPurchase);
   by Model;
   if InInventory and not InPurchase; * keep models that are in inventory but were never purchased;
   keep Model Price; * keep only Model and Price;
run;

title "Listing of NOT_BOUGHT";

proc print data=a15031.not_bought noobs;

run;

Output

Question

/*10.12 You want to merge two SAS data sets, Demographic and

Survey1, based on an identifier. In Demographic, this identifier is

called ID; in Survey1, the identifier is called Subj. Both are

character variables.*/

Program

data a15031.demographic;

input ID : $3.

DOB : mmddyy10.

Gender : $1.;

format DOB mmddyy10.;

datalines;

012 10/10/37 M

535 7/12/87 F

723 1/5/2000 M

007 6/4/1966 F

;

*Data set SURVEY1;

data a15031.survey1;

input Subj : $3.

(Q1-Q5)($1.);

datalines;

535 13542

012 55443

723 21211

007 35142

;

*Data set SURVEY2;

data a15031.survey2;

input ID

(Q1-Q5)(1.);

datalines;

535 13542

012 55443

723 21211

007 35142

;

proc sort data=a15031.demographic out=demographic;

by ID;

run;

proc sort data=a15031.survey1 out=survey1;

by Subj;

run;

data a15031.combinech10;
   * merge the sorted WORK copies and rename Subj so that both data sets share the BY variable ID;
   merge demographic
         survey1 (rename=(Subj = ID));
   by ID;
run;

title "Listing of COMBINE";

proc print data=a15031.combinech10 noobs;

run;

Output

Question

/* 10.14 Data set Inventory contains two variables: Model (an 8-byte

character variable) and Price (a numeric value). The price of Model

M567 has changed to 25.95 and the price of Model X999 has changed to

35.99. Create a temporary SAS data set (call it NewPrices) by

updating the prices in the Inventory data set*/

Program

data a15031.modelnew;

input Model $ Price;

datalines;

M567 25.95

X999 35.99

;

*sorting inventory data by model variable;

proc sort data=a15031.inventory out=inventory;

by Model;

run;

*updating inventory data with modelnew for price for the models;

data a15031.updatedprices;
   * UPDATE takes the sorted master data set first, then the transaction data set;
   update inventory a15031.modelnew;
   by Model;
run;

title "Listing of NEWPRICES";

proc print data=a15031.updatedprices noobs;

run;

Output

Learnings from this chapter

The ways to subset a dataset based on the requirements

The ways to generate multiple subsets from the data in single data step

The ways to manipulate the data.

Adding observations, moving observations from datasets

How to produce summary of variables

Merging two datasets by performing one to one, one to many and many to many joins
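A one-to-many merge with IN= flags can be sketched as follows. The data sets (work.markup_demo, work.bikes_demo, work.merged_demo) and their values are hypothetical and are created inline, already sorted by the BY variable.

```sas
* Hypothetical lookup table: one row per manufacturer;
data work.markup_demo;
   input manuf : $10. markup;
datalines;
Cannondale 1.05
Trek 1.07
;

* Hypothetical detail table: several rows per manufacturer;
data work.bikes_demo;
   input manuf : $10. model : $10. cost;
datalines;
Cannondale Road 1200
Cannondale Mountain 900
Trek Road 1500
;

data work.merged_demo;
   * each IN= flag is 1 when the data set contributes to the current observation;
   merge work.bikes_demo(in=InBikes) work.markup_demo(in=InMarkup);
   by manuf;
   if InBikes and InMarkup; * keep only manufacturers found in both data sets;
   newcost = cost * markup;
run;

proc print data=work.merged_demo noobs;
run;
```

Within each BY group the single markup value is carried onto every matching bicycle row, which is what makes the one-to-many case work. Reversing the IF condition (for example, if InBikes and not InMarkup) lists the rows that have no match.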

Chapter 11 – Working with Numeric Functions

Question

/* 11.1 Using the SAS data set Health, compute the body mass index

(BMI) defined as the weight in kilograms divided by the height (in

meters) squared. Create four other variables based on BMI: 1)

BMIRound is the BMI rounded to the nearest integer, 2) BMITenth is

the BMI rounded to the nearest tenth, 3) BMIGroup is the BMI rounded

to the nearest 5, and 4) BMITrunc is the BMI with a fractional

amount truncated. Conversion factors you will need are: 1 Kg equals

2.2 Lbs and 1 inch = .0254 meters */

Program

data a15031.health;

set a15031.health;

BMI = (Weight / 2.2) / (Height*.0254)**2;

BMIRound=round(BMI);

BMITenth=round(BMI,.1);

BMIGroup=round(BMI,5);

BMITrunc=int(BMI);

run;

proc print data=a15031.health;

run;

Output

Question

/* 11.2 Count the number of missing values for WBC, RBC, and Chol in

the Blood data set.

Use the MISSING function to detect missing values */

Program

data a15031.hel;

set a15031.blood;

*blood dataset is present in the blog folder uploaded in dropbox

folder;

if missing(Gender) then MissG+1;

if missing(WBC) then MissWBC+1;

if missing(RBC) then MissRBC+1;

if missing(Chol) then MissChol+1;

*a sum statement in each IF keeps a running count of the missing values in each variable;

run;

proc print data=a15031.hel;

run;

Output

Question

/* 11.4 The SAS data set Psych contains an ID variable, 10 question

responses (Ques1–Ques10), and 5 scores (Score1–Score5). You want to

create a new, temporary SAS data set (Evaluate) containing the

following:

a. A variable called QuesAve computed as the mean of Ques1–Ques10.

Perform this computation only if there are seven or more non-missing

question values.

b. If there are no missing Score values, compute the minimum

score(MinScore),the maximum score (MaxScore), and the second highest

score (SecondHighest) */

Program

data a15031.evaluate;

set a15031.psych;

*pysch dataset is present in the blog folder uploaded in dropbox

folder;

if n(of Ques1-Ques10) ge 7 then QuesAve=mean(of Ques1-Ques10);

*N counts the nonmissing arguments, so N equal to 5 means that no Score value is missing;

if n(of Score1-Score5) eq 5 then do;

MinScore=min(of Score1-Score5);

MaxScore=max(of Score1-Score5);

SecondHighest=largest(2,of Score1-Score5);

end;

run;

proc print data=a15031.evaluate;

run;

Output

Question

/* 11.6 Write a short DATA _NULL_ step to determine the largest
integer you can store on your computer in 3, 4, 5, 6, and 7 bytes */

Program

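data _null_;
   * the EXACTINT constant gives the largest integer stored exactly in a given number of bytes;
   int3 = constant('exactint', 3); int4 = constant('exactint', 4); int5 = constant('exactint', 5);
   int6 = constant('exactint', 6); int7 = constant('exactint', 7);
   put int3= int4= int5= int6= int7= ;
run;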

Output of log window

Question

/*11.8 Create a temporary SAS data set (Random) consisting of 1,000

observations, each with a random integer from 1 to 5. Make sure that

all integers in the range are equally likely. Run PROC FREQ to test

this assumption */

Program

data a15031.random;
   do i = 1 to 1000;
      * the uniform random number lies between 0 and 1, so x is an integer from 1 to 5 with equal probability;
      x = int(rand('uniform')*5) + 1; /* OR x=int(ranuni(0)*5+1) */
      output;
   end;
run;

proc freq data=a15031.random;
   tables x / missing;
run;

Output

Question

/* 11.10 Data set Char_Num contains character variables Age and

Weight and numeric variables SS and Zip. Create a new, temporary SAS

data set called Convert with new variables NumAge and NumWeight that

are numeric values of Age and Weight, respectively, and CharSS and

CharZip that are character variables created from SS and Zip. CharSS

should contain leading 0s and dashes in the appropriate places for

Social Security numbers and CharZip should contain leading 0s

Hint: The Z5. format includes leading 0s for the ZIP code */

Program

data a15031.convert;

set a15031.char_num;

NumAge = input(Age,8.);

NumWeight = input(weight,8.);

*converting character variables weight and age into numeric

variables;

CharSS = put(SS,ssn11.);

CharZip = put(Zip,z5.);

*converting numeric variables SS and Zip into character variables;

run;

proc print data=a15031.convert;

run;

Output

Question

/* 11.12 Using the Stocks data set (containing variables Date and

Price), compute daily changes in the prices. Use the statements here

to create the plot. Note: If you do not have SAS/GRAPH installed,

use PROC PLOT and omit the GOPTIONS and SYMBOL statements. goptions

reset=all colors=(black) ftext=swiss htitle=1.5;

symbol1 v=dot i=smooth;

title "Plot of Daily Price Differences";

proc gplot data=difference;

plot Diff*Date;

run;

quit; */

Program

data a15031.price_difference;

set a15031.stocks;

Diff = Dif(Price);

*using dif function to calculate the difference in the price

compared to the previous price ;

run;

goptions reset=all colors=(black) ftext=swiss htitle=1.5;

symbol1 v=dot i=smooth;

title "Plot for Price Differences";

proc gplot data=a15031.price_difference;

plot Diff * Date;

run;

quit;

Output

Learnings from this chapter

The ways of rounding and truncating numerical values

The ways to detect missing values

The ways to treat missing values

The ways to assign data types to missing values

The usage of random numbers and the ways to generate random numbers
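The rounding, truncation, and missing-value functions listed above can be compared side by side in a short DATA _NULL_ step. The value 23.76 and the variable names are made up for illustration.

```sas
data _null_;
   bmi = 23.76;                      * hypothetical value;
   nearest_int = round(bmi);         * 24;
   nearest_tenth = round(bmi, .1);   * 23.8;
   nearest_five = round(bmi, 5);     * 25;
   truncated = int(bmi);             * 23;
   * MISSING returns 1 for a missing value and N counts the nonmissing arguments;
   x = .;
   is_missing = missing(x);          * 1;
   n_present = n(bmi, x);            * 1;
   put nearest_int= nearest_tenth= nearest_five= truncated= is_missing= n_present=;
run;
```

ROUND goes to the nearest multiple of its second argument, while INT simply drops the fractional part, so the two disagree whenever the fraction is 0.5 or more.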

Chapter 12 – Working with Character Functions

Question

/*12.2 Using the data set Mixed, create a temporary SAS data set

(also called Mixed) with the following new variables:

a. NameLow – Name in lowercase

b. NameProp – Name in proper case

c. (Bonus – difficult) NameHard – Name in proper case without using

the

PROPCASE function*/

Program

data a15031.mixed;
   set a15031.mixed;
   length First Last $ 15 NameHard $ 20;
   NameLow = lowcase(Name); * convert the whole name to lowercase;
   NameProp = propcase(Name); * capitalize the first letter of each word;
   * NameHard: lowercase each word, then uppercase only its first letter;
   First = lowcase(scan(Name,1,' '));
   Last = lowcase(scan(Name,2,' '));
   substr(First,1,1) = upcase(substr(First,1,1));
   substr(Last,1,1) = upcase(substr(Last,1,1));
   NameHard = catx(' ',First,Last);
   drop First Last;
run;

proc print data=a15031.mixed;

run;

Output

Question

/*12.4 Data set Names_And_More contains a character variable called

Height. As you can see in the listing in Problem 3, the heights are

in feet and inches. Assume that these units can be in upper- or

lowercase and there may or may not be a period following the units.

Create a temporary SAS data set (Height) that contains a numeric

variable (HtInches) that is the height in inches.*/

*Data set NAMES_AND_MORE;

Program

data a15031.height;
   set a15031.names_and_more(keep = Height);
   * remove the unit letters (I, N, F, T) and periods, ignoring case;
   Height = compress(Height,'INFT.','i');
   /* Alternative
   Height = compress(Height,' ','kd');
   *keep digits and blanks;
   */
   * SCAN extracts the first word (feet) and the second word (inches);
   Feet = input(scan(Height,1,' '),8.);
   Inches = input(scan(Height,2,' '),?? 8.); * the ?? modifier suppresses errors when inches are absent;
   if missing(Inches) then HtInches = 12*Feet;
   else HtInches = 12*Feet + Inches;
   drop Feet Inches;
run;

title "Listing of HEIGHT";

proc print data=a15031.height noobs;

run;

Output

Question

/*12.6 Data set Study (shown here) contains the character variables

Group and Dose. Create a new, temporary SAS data set (Study) with a

variable called GroupDose by putting these two values together,

separated by a dash. The length of the resulting variable should be

6 (test this using PROC CONTENTS or the SAS Explorer). Make sure

that there are no blanks (except trailing blanks) in this value. Try

this problem two ways: first using one of the CAT functions, and

second without using any CAT functions*/

*Using CAT functions;

Program

data a15031.study;
   set a15031.study;
   length GroupDose $ 6;
   GroupDose = catx('-',Group,Dose); * CATX strips blanks and joins the two values with a dash;
run;

title "Listing of STUDY";
proc print data=a15031.study noobs;
run;

*Without using CAT functions;
data a15031.study;
   set a15031.study;
   length GroupDose $ 6;
   * TRIM removes the trailing blanks from Group before the values are concatenated;
   GroupDose = trim(Group) || '-' || Dose;
run;

title "Listing of STUDY";
proc print data=a15031.study noobs;
run;

Output

Question

/*12.8 Notice in the listing of data set Study in Problem 6 that the

variable called Weight contains units (either lbs or kgs). These

units are not always consistent in case and may or may not contain a

period. Assume an upper- or lowercase LB indicates pounds and an

upper- or lowercase KG indicates kilograms. Create a new, temporary

SAS data set (Study) with a numeric variable also called Weight

(careful here) that represents weight in pounds, rounded to the

nearest 10th of a pound. Note: 1 kilogram = 2.2 pounds*/

Program

data a15031.study;
   set a15031.study(keep=Weight rename=(Weight = WeightUnits));
   * COMPRESS with the kd modifiers keeps only the digits, and INPUT converts them to a number;
   Weight = input(compress(WeightUnits,,'kd'),8.);
   * FIND with the i modifier ignores case when it searches for the units;
   if find(WeightUnits,'KG','i') then Weight = round(2.2*Weight,.1);
   else if find(WeightUnits,'LB','i') then Weight = round(Weight,.1);
run;

title "Listing of STUDY";
proc print data=a15031.study noobs;
run;

Output

Question

/*12.10 Data set Errors contains character variables Subj (3 bytes)

and PartNumber (8 bytes). (See the partial listing here.) Create a

temporary SAS data set (Check1) with any observation in Errors that

violates either of the following two rules: first, Subj should

contain only digits, and second, PartNumber should contain only the

uppercase letters L and S and digits.

Here is a partial listing of Errors:*/

Program

data a15031.violates_rules;

set a15031.errors;

*notdigit and verify return the position of the first invalid

character, or 0 if there is none. Trim is needed because without

it both functions would return the position of the first trailing

blank in each character value;

where notdigit(trim(Subj)) or

verify(trim(PartNumber),'0123456789LS');

run;

title "Listing of VIOLATES_RULES";

proc print data=a15031.violates_rules noobs;

run;

Output
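The effect of the trim function can be seen in a small test step. This is a sketch only: trim_demo and the literal values are hypothetical and chosen to show one valid and one invalid case.

```sas
*Hypothetical values used only to show what the functions return;
data trim_demo;
   length Subj $ 3;
   Subj = '12';                        *stored as 12 plus one trailing blank;
   NoTrim   = notdigit(Subj);          *3, the position of the trailing blank;
   WithTrim = notdigit(trim(Subj));    *0, every remaining character is a digit;
   BadPart  = verify(trim('L23x'),'0123456789LS'); *4, the position of the lowercase x;
run;

proc print data=trim_demo noobs;
run;
```

A result of 0 is treated as false in the WHERE expression, so only observations with a nonzero position are selected as rule violations.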

Question

/*12.12 List the subject number (Subj) for any observations in

Errors where PartNumber contains an upper- or lowercase X or D.*/

Program

title "Subjects with X or D in PartNumber";

proc print data=a15031.errors noobs;

*findc with the i modifier returns a nonzero position when

PartNumber contains X or D in either case;

where findc(PartNumber,'XD','i');

var Subj PartNumber;

run;

Output

Question

/*12.14 List all patients in the Medical data set where the word

antibiotics is in the comment field (Comment).*/

Program

proc print data=a15031.medical;

*indexw function finds the word antibiotics in the variable

Comment and returns 0 when the word is absent;

where indexw(Comment,'antibiotics');

run;

Output
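Note that indexw is case sensitive, so a comment containing Antibiotics with a capital A would not be listed by the program above. If that matters for the Medical data (an assumption, since the listing is not shown here), lowcase can be applied first. The step below is a sketch with a made-up comment value.

```sas
*Hypothetical comment text, not taken from the Medical data set;
data indexw_demo;
   length Comment $ 40;
   Comment = 'Patient given Antibiotics';
   Exact   = indexw(Comment,'antibiotics');          *0, the capital A does not match;
   AnyCase = indexw(lowcase(Comment),'antibiotics'); *15, the word starts at position 15;
run;

proc print data=indexw_demo noobs;
run;
```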

Question

/*12.16 Provide a list, in alphabetical order by last name, of the

observations in the Names_And_More data set. Set the length of the

last name to 15 and remove multiple blanks from Name.

Note: The variable Name contains a first name, one or more spaces,

and then a last name.*/

Program

data a15031.names;

set a15031.names_and_more;

length Last $ 15;

Name = compbl(Name);

*compbl reduces each run of multiple blanks to a single blank;

Last = scan(Name,2,' ');

*scan extracts the second word of Name and stores it in Last;

run;

proc sort data=a15031.names;

by Last;

run;

title "Observations in NAMES_AND_MORE in "

"Alphabetical Order";

proc print data=a15031.names;

id Name;

var Phone Height Mixed;

run;

Output
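The two functions used above can be checked on a single value. This is a minimal sketch, assuming a name made of a first name, several blanks and a last name; name_demo and the sample name are hypothetical.

```sas
*Hypothetical name used only for illustration;
data name_demo;
   length Name $ 20 Last $ 15;
   Name = 'Jane     Smith';
   Name = compbl(Name);       *Jane Smith, with one blank between the words;
   Last = scan(Name,2,' ');   *Smith, the second blank-delimited word;
run;

proc print data=name_demo noobs;
run;
```

Declaring Last in the length statement before the assignment is what fixes its length at 15, as the question requires.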

Learnings from this chapter

The ways to perform concatenation of strings

The ways to calculate the length of the string

The ways to remove leading and trailing blanks from string

Using compress and the NOT functions (such as notdigit)

Using the indexw function to find a word in a variable

Using notdigit to check for invalid characters in a variable

Chapter 13- Working with arrays

Question

/*Using the SAS data set Survey1, create a new, temporary SAS data

set (Survey1) where the values of the variables Ques1–Ques5 are

reversed as follows: 1 → 5; 2 → 4; 3 → 3; 4 → 2; 5 → 1.

Note: Ques1–Ques5 are character variables. Accomplish this using an

array.*/

*Data set SURVEY;

Program

data a15031.survey;

infile 'c:\books\learning\survey.txt' pad;

input ID : $3.

Gender : $1.

Age

Salary

(Ques1-Ques5)(1.);

run;

proc format library=a15031;

value $gender 'M' = 'Male'

'F' = 'Female'

' ' = 'Not entered'

other = 'Miscoded';

value age low-29 = 'Less than 30'

30-50 = '30 to 50'

51-high = '51+';

value $likert '1' = 'Strongly disagree'

'2' = 'Disagree'

'3' = 'No opinion'

'4' = 'Agree'

'5' = 'Strongly agree';

run;

data a15031.survey1;

set a15031.survey1;

array Ques{5} $ Q1-Q5;

*the array Ques groups the character variables Q1 to Q5;

do i = 1 to 5;

Ques{i} = translate(Ques{i},'54321','12345');

*the do loop steps the index i from 1 to 5, and translate reverses

each answer by replacing 1,2,3,4,5 with 5,4,3,2,1;

end;

drop i;

run;

title "List of SURVEY1 ";

proc print data=a15031.survey1;

run;

Output
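The argument order of translate is easy to get backwards: the replacement characters come second and the characters to be replaced come third. The step below is a sketch that lists all five mappings; translate_demo and the variable names Q and Reversed are made up for illustration.

```sas
*Hypothetical step that prints the mapping for each possible answer;
data translate_demo;
   length Q Reversed $ 1;
   do Q = '1','2','3','4','5';
      *second argument is the to list, third argument is the from list;
      Reversed = translate(Q,'54321','12345');
      output;
   end;
run;

proc print data=translate_demo noobs;
run;
```

The listing should pair 1 with 5, 2 with 4, 3 with 3, 4 with 2 and 5 with 1.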

Question

/*13.2 Redo Problem 1, except use data set Survey2.

Note: Ques1–Ques5 are numeric variables.*/

Program

data a15031.survey2;

set a15031.survey2;

array Ques{5} Q1-Q5;

do i = 1 to 5;

Ques{i} = 6 - Ques{i};

end;

drop i;

run;

title "List of SURVEY2 ";

proc print data=a15031.survey2;

run;

Output

Question

/*13.4 Data set Survey2 has five numeric variables (Q1–Q5), each

with values of 1, 2, 3, 4, or 5. You want to determine for each

subject (observation) if they responded with a 5 on any of the five

questions. This is easily done using the OR or the IN

operators. However, for this question, use an array to check each of

the five questions. Set variable (ANY5) equal to Yes if any of the

five questions is a 5 and No otherwise.*/

Program

data a15031.any5;

set a15031.survey2;

array Ques{5} Q1-Q5;

Any5 = 'No ';

do i = 1 to 5;

if Ques{i} = 5 then do;

Any5 = 'Yes';

leave;

end;

end;

drop i;

run;

title "Listing of ANY5";

proc print data=a15031.any5 noobs;

run;

Output
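As a cross-check on the loop above, the IN operator can also search an array directly. This is a sketch under the assumption that the SAS release in use accepts an array name after IN (SAS 9.2 or later). The data set any5_demo and its two observations are hypothetical.

```sas
*Hypothetical responses, not taken from Survey2;
data any5_demo;
   input Q1-Q5;
   array Ques{5} Q1-Q5;
   *IN searches every element of the array for the value 5;
   if 5 in Ques then Any5 = 'Yes';
   else Any5 = 'No';
datalines;
1 2 3 4 5
1 1 2 2 3
;
run;

proc print data=any5_demo noobs;
run;
```

The first observation should be flagged Yes and the second No, which is the same result the do loop with leave would give.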

Learnings from this chapter

The ways to create arrays

The ways of using arrays in creating new variables

Setting character and numeric values to missing

Importance of temporary arrays

Chapter 14 - Displaying your Data

Question

/*14.2 Using the data set Sales, create the report shown here:*/

Program

proc sort data=a15031.sales out=a15031.sales;

by Region;

run;

title "Sales ";

proc print data=a15031.sales;

by Region;

id Region;

var Quantity TotalSales;

sumby Region;

run;

Output

Question

/*14.1 List the first 10 observations in data set Blood. Include only the

variables Subject, WBC (white blood cell), RBC (red blood cell), and Chol.

Label the last three variables “White Blood Cells,” “Red Blood Cells,” and

“Cholesterol,” respectively. Omit the Obs column, and place Subject in the

first column. Be sure the column headings are the variable labels, not the

variable names.*/

Program

title " The First 10 Observations in BLOOD data";

proc print data=a15031.blood(obs=10) label;

id Subject;

var WBC RBC Chol;

label WBC = 'White Blood Cells'

RBC = 'Red Blood Cells'

Chol = 'Cholesterol';

run;

Output

Question

/* 14.3 Use PROC PRINT (without any DATA steps) to create a listing like

the one here. Note: The variables in the Hosp data set are Subject,

AdmitDate (Admission Date), DischrDate (Discharge Date), and DOB (Date of

Birth).*/

Program

proc print data=a15031.hosp

n='Number of Patients = '

label

double;

where Year(AdmitDate) eq 2004 and

Month(AdmitDate) eq 9 and

yrdif(DOB,AdmitDate,'Actual') ge 83;

id Subject;

var DOB AdmitDate DischrDate;

label AdmitDate = 'Admission Date'

DischrDate = 'Discharge Date'

DOB = 'Date of Birth';

run;

Output

Question

/*14.4 List the first five observations from data set Blood. Print only

variables Subject, Gender, and BloodType. Omit the Obs column.*/

Program

title "First 5 Observations";

proc print data=a15031.blood(obs=5) noobs;

var Subject Gender BloodType;

run;

Output

Learnings from this chapter

The ways to view the summary of the data

Listing the observations

Changing the appearance of the listing

Sorting by multiple variables

Computing total across variables

Chapter 15 – Creating Customized Reports

Question

/*15.2 Using the Blood data set, produce a summary report showing the

average WBC and RBC count for each value of Gender as well as an overall

average. Your report should look like this:*/

Program

proc report data=a15031.blood nowd headline;

column Gender WBC RBC;

define Gender / group width=6;

define WBC / analysis mean "Average WBC"

width=7 format=comma6.0;

define RBC / analysis mean "Average RBC"

width=7 format=5.2;

rbreak after / dol summarize;

run;

quit;

Output

Question

/*15.4 Using the SAS data set BloodPressure, compute a new variable in

your report. This variable (Hypertensive) is defined as Yes for females

(Gender=F) if the SBP is greater than 138 or the DBP is greater than 88 and

No otherwise. For males (Gender=M), Hypertensive is defined as Yes if the

SBP is over 140 or the DBP is over 90 and No otherwise. Your report should

look like this:*/

Program

proc report data=a15031.bloodpressure nowd;

column Gender SBP DBP Hypertensive;

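*display usage keeps Gender populated on every row, so the compute

block can test it (a group or order variable is blank on repeated rows);

define Gender / display width=6;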

define SBP / display width=5;

define DBP / display width=5;

define Hypertensive / computed "Hypertensive?" width=13;

compute Hypertensive / character length=3;

*the tests are chained with else if so that the check for males

does not overwrite a Yes already assigned to a female;

if Gender = 'F' and (SBP gt 138 or DBP gt 88)

then Hypertensive = 'Yes';

else if Gender = 'M' and (SBP gt 140 or DBP gt 90)

then Hypertensive = 'Yes';

else Hypertensive = 'No';

endcomp;

run;

quit;

Output
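The Hypertensive column of the report can be checked against a plain DATA step that applies the same rule. This is a sketch only: hyper_check and its three observations are hypothetical, picked to sit just above and below the cutoffs given in the question.

```sas
*Hypothetical readings chosen around the cutoffs in the question;
data hyper_check;
   input Gender $ SBP DBP;
   length Hypertensive $ 3;
   if Gender = 'F' and (SBP gt 138 or DBP gt 88) then Hypertensive = 'Yes';
   else if Gender = 'M' and (SBP gt 140 or DBP gt 90) then Hypertensive = 'Yes';
   else Hypertensive = 'No';
datalines;
F 139 80
M 139 80
M 120 92
;
run;

proc print data=hyper_check noobs;
run;
```

The three observations should come out as Yes, No and Yes: an SBP of 139 exceeds the female cutoff of 138 but not the male cutoff of 140, and a DBP of 92 exceeds the male cutoff of 90.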

Question

/*15.6 Using the SAS data set BloodPressure, produce a report showing

Gender, Age, SBP, and DBP. Order the report in Gender and Age order as shown

here:*/

Program

proc report data=a15031.bloodpressure nowd;

column Gender Age SBP DBP;

define Gender / order width=6;

define Age / order width=5;

define SBP / display "Systolic Blood Pressure" width=8;

define DBP / display "Diastolic Blood Pressure" width=9;

run;

quit;

Output

Question

/*15.8 Using the data set Blood, produce a report like the one here. The

numbers in the table are the average WBC and RBC counts for each

combination of blood type and gender.*/

Program

proc report data=a15031.blood nowd headline;

column BloodType Gender,WBC Gender,RBC;

define BloodType / group 'Blood Type' width=5;

define Gender / across width=8 '-Gender-';

define WBC / analysis mean format=comma8.;

define RBC / analysis mean format=8.2;

run;

quit;

Output

Learnings from this chapter

The importance and usage of PROC REPORT

Customising the report using the options available under PROC REPORT

Changing the order of the variables in the column statement