colby-stoever -
SAS for Beginners and Institutional Researchers
TAIR Summer Workshop 2015
Why Use SAS for IR?
• Data Manipulation
• Strong, reliable statistical package (regression, psychometrics, create your own formulas)
• Email reports (directly from SAS)
• Create reports in many different file types (doc, xls, xml, pdf…)
• Macros (variables that can be used throughout a SAS program)
• Create all THECB and IPEDS reports
• Create your own error check reports
• Create tables for tools like Tableau
• Write table directly to other data warehouses
• Record of exactly what you did
• Batch report (no need to even open SAS)
• Create Maps
• …
Workshop Goals
• Make you a SAS Expert in Three Hours
• Learn basic SAS programming for:
• Importing (in programming)
• The DATA step (the most important step in SAS)
• Proc Sort
• Exporting Data
• Proc Print
• Proc Freq (the easiest reporting procedure)
• Proc Summary
• Proc Tabulate (a little more difficult)
• If we have time:
• A touch of macros to make life easy
• Creating text files ready to send to the THECB
• Email reports
• Proc Report (we will show it; it’s great but harder to master)
Importing Data
• SAS File (.sas7bdat files)
• Excel
• Text
• Other Sources
Libname Statement
• libname Enrollme "\\w4aafs\ss\SSIT\CBM SAS Databases\CBM 001 Text Files\Final";
• In English:
• I want to create a library (a directory of files) in SAS named Enrollme that is stored here: \\w4aafs\ss\SSIT\CBM SAS Databases\CBM 001 Text Files\Final. The “;” ends the statement.
• A library name can be at most 8 characters long.
Run
Check the log (ALWAYS DO THIS STEP)
Blue (NOTE): everything is fine. Red (ERROR): something went wrong. Green (WARNING): you got something, but it may be wrong.
All of the SAS data files at that location
Windows File Location
Proc Import
• Colby, we have no SAS datasets, YET.
Proc Import file="\\w4aafs\ss\SSIT\Planning\TAIR\SAS DB Training 1.xlsx" DBMS=Excel out=Students replace;
Run;
I want to import a file located at \\w4aafs\ss\SSIT\Planning\TAIR\SAS DB Training 1.xlsx that is this type of file (DBMS=Excel). I want to name it “Students”, and if I rerun this code I want SAS to replace “Students”.
Please run this command;
http://support.sas.com/documentation/cdl/en/acpcref/63184/HTML/default/viewer.htm#a003102096.htm
All DBMS Options
Tips: In SAS programs, don’t use mapped drives. Use network addresses especially if you plan to share code.
The log
Note: SAS data set names in SAS code have two parts: the library name and the data set name. If you do not give a libname (library name), SAS assumes the data set goes to a temporary library called “work”.
Caution: SAS deletes all data sets in “work” once you close SAS.
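A minimal sketch of the two-part naming rule. The folder C:\temp\sasdemo, the library name demo, and the data set names are made up for illustration; point the libname at a folder that actually exists and is writable on your machine.

```sas
/* Hypothetical path: replace with a folder that exists on your machine. */
libname demo "C:\temp\sasdemo";

/* One-part name: SAS stores this in the temporary WORK library. */
data scratch;
   x = 1;
run;

/* Two-part name: stored as a permanent file in the DEMO library folder. */
data demo.keepme;
   set work.scratch;
run;
```

After closing SAS, work.scratch is gone, while demo.keepme remains in the folder as a .sas7bdat file.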
The exported file
Tip: If you have a data set open, SAS will not change it.
The Data Step - Creating a Dataset from Existing Data
• Data (Required)
• PUT (Optional)
• Set or Merge (Required most times)
• By (optional, but required if you use merge statements)
• Where or IF Statements (optional)
• Keep and Drop Statements (optional)
• Rename Statement (optional)
• Run; (Required)
Create a dataset with a new variable (field)
Data Statement
• Data work.StudentChanged;
• I want to create a dataset stored in the “work” library named “StudentChanged”.
Set Statement
• Data work.StudentChanged;
• Set work.students;
• I want to create a dataset stored in the “work” library and named “StudentChanged”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
Create a field with the same value for all records.
• Data work.StudentChanged;
• Set work.students;
• Studentmarker=1;
• I want to create a dataset stored in the “work” library and named “StudentChanged”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I want to add a field named Studentmarker where every record has the value of 1.
Note: SAS variables come in two types: numeric or character (string). Quotation marks (" " or ' ') must be used when creating or using character values.
Create a field with the same value for all records.
• Data work.StudentChanged;
• Set work.students;
• Studentmarker=1;
• Run;
• I want to create a dataset stored in the “work” library and named “StudentChanged”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I want to add a field named Studentmarker where every record has the value of 1.
• I want you to do this.
Tips: Notice the semicolon at the end of each statement. This is how SAS knows the statement has ended. YOU WILL GET AN ERROR WITHOUT THEM.
Create a dataset with only some students (records) based on selection criteria.
• I want a dataset of male students with a GPA greater than or equal to 3.5.
Data Statement
• Data StudentHiGPAMale;
• I want to create a dataset stored in the “work” library named “StudentHiGPAMale”.
Set Statement
• Data work.StudentHiGPAMale;
• Set work.students;
• I want to create a dataset stored in the “work” library and named “StudentHiGPAMale”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
Where Statement
• Data work.StudentHiGPAMale;
• Set work.students;
• Where Gender="M" and GPA>=3.5;
• Run;
• I want to create a dataset stored in the “work” library and named “StudentHiGPAMale”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I don’t want all of “work.students”. I only want students who are male and have GPAs greater than or equal to 3.5.
• I want you to do this.
Where Statement 2
• Data work.StudentHiGPAMale;
• Set work.students;
• Where Gender="M" or GPA>=3.5;
• Run;
• I want to create a dataset stored in the “work” library and named “StudentHiGPAMale”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I don’t want all of “work.students”. I only want students who are male or have GPAs greater than or equal to 3.5.
• I want you to do this.
SAS Operators (And/Or)
• AND: you only want records that meet both (or all) conditions.
• OR: you want records that meet either condition.
Where Statement 3
• Data work.StudentHiGPAMale;
• Set work.students;
• Where (Gender="M" or GPA>=3.5) and School="SOD";
• Run;
• I want to create a dataset stored in the “work” library and named “StudentHiGPAMale”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I don’t want all of “work.students”. I only want students who are male or have GPAs greater than or equal to 3.5, but only if they are in School SOD.
• I want you to do this.
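The parentheses in that Where statement matter because SAS evaluates AND before OR. The sketch below illustrates this with four made-up records; the data set names sample, with_parens, and without_parens are hypothetical and not part of the workshop data.

```sas
/* Four made-up records for illustration only. */
data sample;
   input Gender $ GPA School $;
   datalines;
M 3.9 SOD
M 2.1 SON
F 3.8 SON
F 2.5 SOD
;
run;

/* With parentheses: (male OR high GPA) AND in SOD.
   Only the first record qualifies. */
data with_parens;
   set sample;
   where (Gender="M" or GPA>=3.5) and School="SOD";
run;

/* Without parentheses AND is evaluated first:
   male OR (high GPA AND in SOD).
   Both male records qualify, including the one in SON. */
data without_parens;
   set sample;
   where Gender="M" or GPA>=3.5 and School="SOD";
run;
```

Checking the log after each step (one record read versus two) is a quick way to confirm which logic SAS applied.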
Where Statement 4
• Data work.StudentLisaJoe;
• Set work.students;
• Where First_name in ("lisa" "Joe");
• Run;
• I want to create a dataset stored in the “work” library and named “StudentLisaJoe”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I don’t want all of “work.students”. I only want students whose names are “lisa” or “Joe”.
• I want you to do this.
IF Statement 1
• Data work.StudentLisaJoe;
• Set work.students;
• if First_name in ("lisa" "Joe") then output;
• Run;
• I want to create a dataset stored in the “work” library and named “StudentLisaJoe”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I don’t want all of “work.students”. I only want students whose names are “lisa” or “Joe”.
• To drop the matching records instead, use “then delete” in place of “then output”.
• I want you to do this.
Where and IF Statement warning
• Data work.StudentLisaJoe;
• Set work.students;
• if First_name in ("Lisa" "Joe");
• Run;
• I want to create a dataset stored in the “work” library and named “StudentLisaJoe”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I don’t want all of “work.students”. I only want students whose names are “Lisa” or “Joe”.
• I want you to do this.
Warning: Where and IF comparisons are case sensitive. “Lisa” and “lisa” are different values.
Use functions to make records similar
• data students1;
• set students;
• First_name1=propcase(First_name);
• run;
• Data work.StudentLisaJoe;
• Set work.students1;
• if First_name1 in ("Lisa" "Joe");
• Run;
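An alternative to creating a proper-cased copy is to compare upper-cased values directly in the Where statement. This is a minimal sketch on made-up names; the data sets names and lisa_joe are hypothetical.

```sas
/* Made-up names with inconsistent capitalization. */
data names;
   input First_name $;
   datalines;
lisa
LISA
Joe
Ann
;
run;

data lisa_joe;
   set names;
   /* Compare upper-cased values so "lisa", "LISA" and "Lisa" all match. */
   where upcase(First_name) in ("LISA" "JOE");
run;
```

This keeps the original spelling in First_name and avoids an extra variable, at the cost of a slightly longer condition.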
• List of SAS Functions
• http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000245860.htm
Adding and Removing Variables from a Data Set
Keep Statement
• Data Studentonlynames;
• Set students;
• Keep First_name Last_name;
• Run;
• I want to create a dataset stored in the “work” library and named “Studentonlynames”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I want only the variables named First_name and Last_name to be in dataset “Studentonlynames”.
• I want you to do this.
Remember: if no libname is given, SAS assumes you mean the “work” library.
Keep will only keep the variables named in the statement.
Drop Statement
• Data Studentwonames;
• Set students;
• drop First_name Last_name;
• Run;
• I want to create a dataset stored in the “work” library and named “Studentwonames”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I want all the variables in dataset “Studentwonames” except First_name and Last_name.
• I want you to do this.
Remember: if no libname is given, SAS assumes you mean the “work” library.
Drop statement only removes the variables named after it.
Rename or Copy variables in a Dataset
Rename variables in a Dataset
• Data StudentsCopyStudent_Loan;
• Set students;
• Rename Student_loan=Student_Loan2 School=School2;
• Run;
• I want to create a dataset stored in the “work” library and named “StudentsCopyStudent_Loan”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I want the variable Student_loan to now be named “Student_Loan2” and School to be named “School2”.
• I want you to do this.
Copy variables in a Dataset
• Data StudentsCopyStudent_Loan;
• Set students;
• Student_loan2=Student_Loan;
• Run;
• I want to create a dataset stored in the “work” library and named “StudentsCopyStudent_Loan”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I want the variable Student_loan2 to be created by copying “Student_Loan”.
• I want you to do this.
Creating new variables from information already in your dataset
• Start with numeric values
Simple equations with Numeric variables
• Data StudentsStuloansbyhalf;
• Set students;
• Student_loanbyhalf=Student_loan*0.5;
• Run;
• I want to create a dataset stored in the “work” library and named “StudentsStuloansbyhalf”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I want a new variable created called Student_loanbyhalf which will be equal to Student_loan * 0.5.
• I want you to do this.
Simple equations with Numeric variables
• Data StudentsStuloanstimesyear;
• Set students;
• Student_loantimesyear=Student_loan*CalYear;
• Run;
• I want to create a dataset stored in the “work” library and named “StudentsStuloanstimesyear”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I want a new variable created called Student_loantimesyear which will be equal to Student_loan * CalYear .
• I want you to do this.
• All SAS Operators
• http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000780367.htm
Warning: A missing numeric value (.) is treated in comparisons as smaller than any number (negative infinity), not as zero.
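The warning about missing values can be illustrated with a small sketch. The three records and the data set names loans, low_loans, and low_loans_known are made up for illustration.

```sas
/* Made-up records: student 2 has a missing loan amount. */
data loans;
   input Stud_id Student_loan;
   datalines;
1 5000
2 .
3 0
;
run;

data low_loans;
   set loans;
   /* Missing sorts below every number, so this keeps
      students 2 and 3, not just student 3. */
   if Student_loan < 1000;
run;

data low_loans_known;
   set loans;
   /* Exclude missing values explicitly to keep student 3 only. */
   if not missing(Student_loan) and Student_loan < 1000;
run;
```

Adding the missing() check whenever a "less than" condition is used is one way to avoid counting unknown values as low values.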
Create new numeric variables with Functions
Round function
• Data StudentGPARound;
• Set students;
• GPAround=round(GPA,.01);
• Run;
• I want to create a dataset stored in the “work” library and named “StudentGPARound”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I want to create a new variable named “GPAround” by rounding “GPA” to the hundredths place.
Creating New Variables with String variables
Substr, compress, upcase Functions
• Data StudentIntials;
• Set students;
• Initials=compress(upcase(substr(first_name,1,1)) || "." || upcase(substr(last_name,1,1)));
• Run;
• I want to create a dataset stored in the “work” library and named “StudentIntials”.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• I want to create a variable named “Initials” where I take the first letter of the first name in upper case, add a period, take the first letter of the last name in upper case, and then remove all spaces.
Concept Check
• Data Fred;
• Set red.color;
• Run;
• Data Students2;
• Set students;
• Where semester=“Fall”
• Run;
• Data time.student2;
• Set student;
• Keep gpa semester last_name;
• Run;
• Data time.student2;
• Set student;
• Year2=year*100;
• Run;
Concept Check
• Create a dataset named “grow” using a SAS database named “flowers”.
• Create a dataset named “sun” using the dataset named “star”. Copy the variable named “mass” into a variable named “mass2”.
• Create a dataset named “highSAT” using a SAS database named “satscores”. Only include records if the variable SAT is greater than 1400.
Logic statements to create new variable
• data studentmarkloan;
• length highloans $30;
• set students;
• If student_loan>20000 then highloans="High Student Loans";
• else highloans="Not High Student Loans";
• run;
• I want to create a dataset stored in the “work” library and named “studentmarkloan”.
• I want highloans to be a character variable up to 30 characters long, so the longer label is not cut off.
• I want you to get the data to create this dataset from the dataset stored in the library named “work” and named “students”.
• If student_loan is greater than 20000, then I want highloans to equal “High Student Loans”; if not, I want it to equal “Not High Student Loans”.
Combining two datasets (SET)
• Data StudentDouble;
• Set studentmarkloan Students;
• Run;
• I want to create a dataset stored in the “work” library and named “StudentDouble”.
• I want you to get the data to create this dataset by stacking the records of datasets studentmarkloan and Students.
Proc sort
• Proc Sort data=StudentDouble;
• By Stud_id;
• Run;
• I want to sort all dataset records in ascending order using the variable “stud_id”.
• Proc Sort data=StudentDouble;
• By descending Stud_id;
• Run;
• I want to sort all dataset records in descending order using the variable “stud_id”.
Creating multiple datasets in one data step
• Data first second;
• Set StudentDouble;
• By stud_id;
• If first.stud_id then output first; *Use any kind of condition;
• Else output second;
• Run;
• I want to create two datasets stored in the “work” library and named “first” and “second”.
• I want to rely on the fact that “studentdouble” is sorted by “stud_id”.
• When you find the first record for a stud_id, store it in the dataset “first”. If it is not the first record for that stud_id, store it in the dataset “second”.
Combining two datasets (Merge one to one)
• Data CoursegradeMerge;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If a then output;
• Run;
Combining two datasets (Merge one to one)
• Data CoursegradeMerge;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If b then output;
• Run;
Different Ways to Merge
• Data CoursegradeMerge2;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If b then output;
• Run;
• Data CoursegradeMerge3;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If a and b then output;
• Run;
• Data CoursegradeMerge4;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If a and not b then output;
• Run;
• Data CoursegradeMerge5;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If B and not a then output;
• Run;
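To see what the in= flags hold in each of the merges above, the sketch below copies them into regular variables and prints them. The two small data sets and the names students_demo, grades_demo, merge_flags, in_students, and in_grades are made up for illustration.

```sas
/* Made-up data: student 1 has no grade, grade row 4 has no student. */
data students_demo;
   input stud_id First_name $;
   datalines;
1 Ann
2 Bob
3 Cal
;
run;

data grades_demo;
   input stud_id Grade $;
   datalines;
2 A
3 B
4 C
;
run;

/* Both inputs must be sorted by the BY variable before merging. */
proc sort data=students_demo; by stud_id; run;
proc sort data=grades_demo; by stud_id; run;

data merge_flags;
   merge students_demo (in=a) grades_demo (in=b);
   by stud_id;
   /* in= variables are temporary, so copy them to keep them. */
   in_students = a;   /* 1 when the id is found in students_demo */
   in_grades   = b;   /* 1 when the id is found in grades_demo   */
run;

proc print data=merge_flags noobs;
run;
```

With these records, "if a" would keep ids 1, 2, and 3; "if b" would keep 2, 3, and 4; "if a and b" would keep 2 and 3; "if a and not b" would keep 1; and "if b and not a" would keep 4.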
Proc Print
• Proc Print data=students;
• Var _all_;
• Run;
• I want to create output to the SAS viewer using data from a dataset named “students”.
• In the output, I want you to include all variables and records in the dataset.
More Proc print
• Proc Print data=students;
• Var First_name Last_name GPA;
• Run;
• I want to create output to the SAS viewer using data from a dataset named “students”.
• In the output, I want you to include First_name, Last_name, and GPA variables in that order.
• Proc Print data=students noobs label;
• where gpa>3.8;
• label first_name="First Name";
• Var First_name Last_name GPA;
• Run;
• I want to create output to the SAS viewer using data from a dataset named “students”.
• Please suppress the observation number and let me relabel the field names.
• I only want records with a GPA greater than 3.8 to be in this output.
• Please relabel the variable first_name as First Name.
• In the output, I want you to include First_name, Last_name, and GPA variables in that order.
Proc Freq
• Proc freq data=students;
• Table Last_name;
• Run;
• I want to create output in the form of a frequency table to the SAS viewer using data from a dataset named “students”.
• Please make a frequency table of the variable “last_name”.
• Proc freq data=students;
• Table Last_name*semester / NOPERCENT NOCOL;
• Run;
• I want to create output in the form of a frequency table to the SAS viewer using data from a dataset named “students”.
• Please make a crosstab table of the variables “last_name” and “semester”. In the output, please remove the overall percentages and column percentages.
http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_sect010.htm
Proc Tabulate
• Proc tabulate data=students;
• Class semester;
• Var gpa;
• Table semester, gpa*mean="Average";
• Run;
• I want to create output in the form of a summary table to the SAS viewer using data from a dataset named “students”.
• I want to include the categorical variable named “Semester” and the numeric variable “GPA”.
• Please build a table with semesters in the rows and the mean GPA for each semester, labeled “Average”.
• Proc tabulate data=students;
• Class semester ethnic;
• Table semester, ethnic*(n="Count");
• Run;
• I want to create output in the form of a frequency table to the SAS viewer using data from a dataset named “students”.
• I want to include the categorical variables named “Semester” and “ethnic”.
• Please build a table with semesters in the rows and ethnicity in the columns, and give me the frequency for each cell. Please add the label “Count” for the frequencies.
More Proc Tabulate
• Proc tabulate data=students;
• label ethnic="Ethnicity";
• Class semester ethnic;
• Table semester, ethnic*(n=" ");
• Run;
• I want to create output in the form of a frequency table to the SAS viewer using data from a dataset named “students”.
• Please relabel “ethnic” as “Ethnicity” in the table.
• I want to include the categorical variables name “Semester” and “ethnic”.
• Please build a table with semesters in the rows and ethnicity in the columns, and give me the frequency for each cell. Please remove the label for n on the columns.
• Proc tabulate data=students;
• where gpa>3.8;
• label ethnic="Ethnicity";
• Class semester ethnic;
• Table semester all="Total", ethnic*(n=" ") all="Total";
• Run;
• I want to create output in the form of a frequency table to the SAS viewer using data from a dataset named “students”.
• Please only include records with a GPA greater than a 3.8.
• Please relabel “ethnic” as “Ethnicity” in the table.
• I want to include the categorical variables name “Semester” and “ethnic”.
• Please build a table with semesters in the rows and ethnicity in the columns, and give me the frequency for each cell. Include row and column totals. Please remove the label for n on the columns.
More Proc Tabulate
• Proc tabulate data=students;
• label ethnic="Ethnicity";
• Class semester ethnic gender;
• Table semester*ethnic all="Total", gender*(n=" ") all="Total";
• Run;
• I want to create output in the form of a frequency table to the SAS viewer using data from a dataset named “students”.
• Please relabel “ethnic” as “Ethnicity” in the table.
• I want to include the categorical variables named “Semester”, “ethnic”, and “gender”. Every variable used in the Table statement must be listed in a Class or Var statement.
• Please build a table with semesters crossed by ethnicity in the rows and gender in the columns, and give me the frequency for each cell. Include row and column totals. Please remove the label for n on the columns.
Titles and Footnotes
Titles
• Title1 "Counts of all Students Last Names";
• Title2 "AY 2014-2015";
• Proc freq data=students;
• Table Last_name;
• Run;
Footnotes
• Title1 "Counts of all Students Last Names";
• Title2 "AY 2014-2015";
• Footnote1 "Source: Certified Enrollment Records";
• Footnote2 "Office of Institutional Research";
• Proc freq data=students notitle;
• Table Last_name;
• Run;
Proc Summary
• Proc Summary data=Students ;
• class Gender school;
• var GPA;
• output out=SchoolgenGPA1 mean=gpa;
• run;
Proc Summary
• Proc Summary data=Students nway;
• class Gender school;
• var GPA;
• output out=SchoolgenGPA2 mean=gpa;
• Run;
• Proc Summary data=Students nway;
• class Gender school;
• output out=SchoolgenGPA3 ;
• run;
• Proc Summary data=Students nway;
• Where ethnic="White";
• class Gender school;
• var gpa;
• output out=SchoolgenGPA4 max=gpa ;
• run;
Proc Format
• Proc format;
• value $gendern "M"="Male"
• "F"="Female";
• Value GPAcat low-3.5="Less than or equal to 3.5"
• 3.5<-high="Greater than 3.5";
• run;
• Title1 "Student list with Gender and GPA Category";
• Title2 "AY 2014-2015";
• Proc Print data=students;
• Var First_name Last_name gender GPA;
• format gender $gendern. GPA GPAcat.;
• Run;
Output Delivery System (PDF)
• options orientation=landscape;
• OdS pdf file="\\w4aafs\ss\SSIT\Planning\TAIR\report1.pdf";
• Proc tabulate data=students;
• where gpa>3.8;
• label ethnic="Ethnicity";
• Class semester ethnic;
• Table semester, ethnic*(n=" ");
• Run;
• ods pdf close;
Output Delivery System (RTF)
• OdS rtf file="\\w4aafs\ss\SSIT\Planning\TAIR\report1.doc";
• Proc tabulate data=students;
• where gpa>3.8;
• label ethnic="Ethnicity";
• Class semester ethnic;
• Table semester, ethnic*(n=" ");
• Run;
• ods rtf close;
Output Delivery System (Excel)
• ods tagsets.ExcelXP file="\\w4aafs\ss\SSIT\Planning\TAIR\report1.xls" style=SUGI31;
• ods tagsets.ExcelXP options(sheet_name='All Students');
• Proc tabulate data=students;
• label ethnic="Ethnicity";
• Class semester ethnic;
• Table semester, ethnic*(n=" ");
• Run;
• ods tagsets.ExcelXP options(sheet_name='Greater than 3.8');
• Proc tabulate data=students;
• where gpa>3.8;
• label ethnic="Ethnicity";
• Class semester ethnic;
• Table semester, ethnic*(n=" ");
• Run;
• ods tagsets.ExcelXP close;
Simple SAS Macros
• %let Year=2014;
• %let school="SOD" "SON";
• %let rdate = %sysfunc(today(),MMDDYYd10.);
• %let yearNext=%eval(&year +1 );
• %let GPAScr=%SYSEVALF(3.8);
• options mprint mlogic SYMBOLGEN MINOPERATOR papersize=letter orientation=portrait missing=. nonumber nodate;
• data macronext;
• set students;
• where school in (&school);
• Nextyear=&yearnext;
• run;
Simple SAS Macros
• ods tagsets.ExcelXP file="\\w4aafs\ss\SSIT\Planning\TAIR\report &rdate..xls" style=SUGI31;
• ods tagsets.ExcelXP options(sheet_name='All Students');
• Proc tabulate data=students;
• label ethnic="Ethnicity";
• Class semester ethnic;
• Table semester, ethnic*(n=" ");
• Run;
• ods tagsets.ExcelXP options(sheet_name="Greater than &gpascr");
• Proc tabulate data=students;
• where gpa>&gpascr ;
• label ethnic="Ethnicity";
• Class semester ethnic;
• Table semester, ethnic*(n=" ");
• Run;
• ods tagsets.ExcelXP close;
Creating a Text File
• data _NULL_;
• put n z5.;
• if 0 then set students nobs=n;
• call symputx('numofrecords',n);
• stop;
• run;
• DATA _null_;
• SET students nobs=j;
• FILE "\\w4aafs\ss\SSIT\Planning\TAIR\test1.txt" ;
• if _n_=1 then put "HY2K000040CBM001012016C0150Sharon Carpenter [email protected]";
• put
• @1 Gender
• @2 School
• @6 First_name
• @14 Last_name ;
• If _n_=j then put "EOF100&numofrecords";
• RUN;
What do you want to know?