PROC SQL – Select Codes To Master For Power Programming Codes and Examples from SAS.com Nethra Sambamoorthi, PhD Northwestern University Master of Science in Predictive Analytics Program

PROC SQL – Select Codes To Master For Power Programming

Codes and Examples from SAS.com

Nethra Sambamoorthi, PhD, Northwestern University

Master of Science in Predictive Analytics Program

Data Processing Terminologies Across Data Sciences…

Why PROC SQL or What Can It Do For Analysts?

• Generate reports
• Generate summary statistics
• Retrieve data from tables or views
• Combine data from tables or views
• Create tables, views, and indexes
• Update the data values in PROC SQL tables
• Update and retrieve data from database management system (DBMS) tables
• Modify a PROC SQL table by adding, modifying, or dropping columns
• PROC SQL can be used in an interactive SAS session or within batch programs, and it can include global statements, such as TITLE and OPTIONS.

An Example of Extracting, Summarizing, and Printing Using Base SAS Procedures

title 'Large Countries Grouped by Continent';

/* Extracting and summarizing */
proc summary data=sql.countries;
   where Population > 1000000;
   class Continent;
   var Population;
   output out=SumPop sum=TotPop;
run;

/* Sorting to arrange the output */
proc sort data=SumPop;
   by TotPop;
run;

/* Printing */
proc print data=SumPop noobs;
   var Continent TotPop;
   format TotPop comma15.;
   where _type_=1;
run;

Creating The Same Using PROC SQL

proc sql;
   title 'Population of Large Countries Grouped by Continent';
   select Continent, sum(Population) as TotPop format=comma15.
      from sql.countries
      where Population gt 1000000
      group by Continent
      order by TotPop;
quit;

Countries Table

WorldCityCoords Table

USCityCoords Table

UnitedStates Table

PostalCodes Table

Worldtemps Table

Oilprod Table

OILRSRVS Table

CONTINENTS Table

FEATURES Table

SELECT statement

Three Important Aspects – Describe, Print, Quit

/* Helps understand the structure of the table */
PROC SQL;
Describe table sql.unitedstates;
Quit;

SELECT means PRINTING is Included Unless

• SELECT * /* all columns */

• SELECT city, state /* specific columns */

• SELECT distinct continent /* specific columns, but avoid duplicates */

So it is possible to run this
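A minimal sketch of the three SELECT forms listed above. The table work.cities and its rows are hypothetical, made up only so the code runs on its own; they are not the tables from the slides.

```sas
/* Hypothetical sample data; not from the original slides */
data work.cities;
   input city $ state $ continent :$13.;
   datalines;
Albany NY NorthAmerica
Austin TX NorthAmerica
Paris . Europe
;
run;

proc sql;
   /* All columns */
   select * from work.cities;

   /* Specific columns */
   select city, state from work.cities;

   /* One row per distinct continent value */
   select distinct continent from work.cities;
quit;
```

Each SELECT prints its own result, so no separate PROC PRINT step is needed.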

The output is…

Suppress column headings…
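One way this might look, modeled on the LABEL='#' technique described in the SAS SQL Procedure User's Guide. The table work.codes and its two rows are assumptions for illustration.

```sas
/* Hypothetical sample data */
data work.codes;
   input name :$12. code $;
   datalines;
Alabama AL
Alaska AK
;
run;

proc sql;
   title 'Postal Codes Without Column Headings';
   /* label='#' asks PROC SQL to print no heading for that column;
      the quoted strings are constant text columns */
   select 'Postal code for', name label='#', 'is', code label='#'
      from work.codes;
quit;
title;   /* clear the title so it does not carry over */
```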

Calculated columns and alias name…
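A short sketch of a calculated column with an alias, using a Fahrenheit-to-Celsius conversion. The table work.temps and its values are hypothetical.

```sas
/* Hypothetical sample data: average high and low in degrees Fahrenheit */
data work.temps;
   input city :$12. avghigh avglow;
   datalines;
Algiers 90 45
Amsterdam 70 33
;
run;

proc sql;
   select city,
          (avghigh - 32) * 5/9 as highc format=5.1,
          (avglow  - 32) * 5/9 as lowc  format=5.1,
          /* CALCULATED reuses an alias defined earlier in the same SELECT */
          (calculated highc - calculated lowc) as range format=4.1
      from work.temps;
quit;
```

Without the CALCULATED keyword, PROC SQL would look for a stored column named highc and fail to find it.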

Retrieving Data From Multiple Tables

• Means we are JOINING tables
• If there is no JOIN keyword, listing several tables in FROM gives (1) a Cartesian product of records [no subset condition] or (2) an inner join [when a subset condition is supplied]
• Alias names can be used for tables too; they simplify references to specific columns of a table

SELECT … FROM table1, table2; A Cartesian Product
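A minimal sketch of the Cartesian product, assuming two small hypothetical tables work.one and work.two (names and values are illustrative only).

```sas
/* Hypothetical sample tables */
data work.one;
   input x y;
   datalines;
1 2
2 3
;
run;

data work.two;
   input x z;
   datalines;
2 5
3 6
4 9
;
run;

proc sql;
   /* No WHERE or ON condition: every row of ONE pairs with
      every row of TWO, so 2 x 3 = 6 rows are returned */
   select *
      from work.one, work.two;
quit;
```

The row count is the product of the two table sizes, which is why an unintended Cartesian product on large tables can be very expensive.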

Order the output from INNER JOIN

INNER JOIN can be used explicitly
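The two spellings of an inner join can be sketched side by side, with ORDER BY arranging the output. The tables work.one and work.two are hypothetical.

```sas
/* Hypothetical sample tables */
data work.one;
   input x y;
   datalines;
2 3
1 2
;
run;

data work.two;
   input x z;
   datalines;
2 5
1 7
4 9
;
run;

proc sql;
   /* Comma syntax: the WHERE condition turns the product into an inner join */
   select a.x, a.y, b.z
      from work.one as a, work.two as b
      where a.x = b.x
      order by a.x;

   /* Explicit syntax: INNER JOIN ... ON expresses the same match */
   select a.x, a.y, b.z
      from work.one as a inner join work.two as b
         on a.x = b.x
      order by a.x;
quit;
```

Both queries keep only rows whose x value appears in both tables.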

INNER JOIN with comparison values on another column…

Effect of Null Values on JOINS
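A small sketch of the effect, assuming hypothetical tables work.a and work.b that each contain one row with a missing key. PROC SQL treats missing values of the same type as equal in a join, unlike many DBMS products.

```sas
/* Hypothetical sample tables; "." is a missing numeric ID */
data work.a;
   input id val $;
   datalines;
1 a1
. a2
;
run;

data work.b;
   input id val $;
   datalines;
1 b1
. b2
;
run;

proc sql;
   /* The missing IDs compare equal, so the a2/b2 rows
      are paired along with the a1/b1 rows */
   select a.id as a_id, a.val as a_val,
          b.id as b_id, b.val as b_val
      from work.a as a, work.b as b
      where a.id = b.id;
quit;
```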

NOT MISSING option
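A sketch of excluding missing keys from a join with IS NOT MISSING. The tables work.a and work.b are hypothetical.

```sas
/* Hypothetical sample tables; "." is a missing numeric ID */
data work.a;
   input id val $;
   datalines;
1 a1
. a2
;
run;

data work.b;
   input id val $;
   datalines;
1 b1
. b2
;
run;

proc sql;
   /* IS NOT MISSING drops the pairing of missing IDs,
      leaving only the row where ID = 1 on both sides */
   select a.id, a.val as a_val, b.val as b_val
      from work.a as a, work.b as b
      where a.id = b.id
        and a.id is not missing;
quit;
```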

Multicolumn JOINS
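A sketch of why more than one column may be needed: city names repeat across states, so matching on city alone would pair the wrong rows. The tables work.capitals and work.coords are hypothetical, and the coordinates are rounded, illustrative values.

```sas
/* Hypothetical sample tables */
data work.capitals;
   input city :$12. state $;
   datalines;
Albany NY
Columbia SC
;
run;

data work.coords;
   input city :$12. state $ latitude longitude;
   datalines;
Albany NY 43 -74
Albany GA 32 -84
Columbia SC 34 -81
Columbia MO 39 -92
;
run;

proc sql;
   /* Match on BOTH city and state so each capital
      pairs with exactly one coordinate row */
   select c.city, c.state, g.latitude, g.longitude
      from work.capitals as c, work.coords as g
      where c.city = g.city
        and c.state = g.state;
quit;
```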

Columns are directly comparable between two tables…

Capitals FROM sql.unitedstates

City FROM sql.uscitycoords

Postalcodes FROM sql.postalcodes

Is it possible to do a SELF JOIN?
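It is: the same table can be listed twice under different aliases. A minimal sketch, assuming a hypothetical table work.staff in which mgr_id points at another row's id:

```sas
/* Hypothetical sample table: each row names its manager's ID */
data work.staff;
   input id name $ mgr_id;
   datalines;
1 Ana .
2 Ben 1
3 Cy 1
;
run;

proc sql;
   /* The table is joined to itself: E is the employee copy,
      M is the manager copy */
   select e.name as employee, m.name as manager
      from work.staff as e, work.staff as m
      where e.mgr_id = m.id;
quit;
```

The aliases are what make the query unambiguous, since both copies have identical column names.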

Two Types of OUTER JOIN – LEFT JOIN and RIGHT JOIN
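A sketch of both directions on the same pair of hypothetical tables, work.one and work.two:

```sas
/* Hypothetical sample tables */
data work.one;
   input x y;
   datalines;
1 2
2 3
;
run;

data work.two;
   input x z;
   datalines;
2 5
3 6
;
run;

proc sql;
   /* LEFT JOIN keeps every row of ONE; Z is missing where TWO has no match */
   select a.x, a.y, b.z
      from work.one as a left join work.two as b
         on a.x = b.x;

   /* RIGHT JOIN keeps every row of TWO; Y is missing where ONE has no match */
   select b.x, a.y, b.z
      from work.one as a right join work.two as b
         on a.x = b.x;
quit;
```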

FULL JOIN…

SPECIALTY JOINS

NATURAL is applicable for both LEFT and RIGHT JOIN. The purpose is to reduce verbosity when matching on multiple common columns…

Gives the same output;

Non matching rows have missing values
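A sketch comparing an explicit left join with its NATURAL form, assuming hypothetical tables work.one and work.two whose only common column is x:

```sas
/* Hypothetical sample tables; X is the only like-named column */
data work.one;
   input x y;
   datalines;
1 2
2 3
;
run;

data work.two;
   input x z;
   datalines;
2 5
3 6
;
run;

proc sql;
   /* Explicit form: the matching column is spelled out in ON */
   select a.x, a.y, b.z
      from work.one as a left join work.two as b
         on a.x = b.x;

   /* NATURAL form: PROC SQL matches on all like-named columns,
      so no ON clause is written */
   select *
      from work.one natural left join work.two;
quit;
```

In both queries the row of ONE with no partner in TWO is kept, with z missing.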

Use COALESCE to combine multiple columns to create new matching variables
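A sketch of COALESCE in a full join, again on hypothetical tables work.one and work.two:

```sas
/* Hypothetical sample tables */
data work.one;
   input x y;
   datalines;
1 2
2 3
;
run;

data work.two;
   input x z;
   datalines;
2 5
3 6
;
run;

proc sql;
   /* In a full join, a.x is missing for rows found only in TWO
      and b.x is missing for rows found only in ONE.
      COALESCE returns its first nonmissing argument,
      producing one complete key column. */
   select coalesce(a.x, b.x) as x, a.y, b.z
      from work.one as a full join work.two as b
         on a.x = b.x
      order by x;
quit;
```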

Using SUB QUERY or NESTED QUERY – SINGLE VALUE
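A minimal sketch of a subquery that returns a single value. The table work.places, its place names, and its population figures are all made up for illustration.

```sas
/* Hypothetical sample table */
data work.places;
   input name $ continent $ population;
   datalines;
Alpha East 500
Beta East 1500
Gamma West 800
Delta West 200
;
run;

proc sql;
   /* The inner query runs once and returns one number
      (the population of Gamma); the outer query compares
      every row against it */
   select name, population
      from work.places
      where population >
            (select population
                from work.places
                where name = 'Gamma');
quit;
```

If the inner query could return more than one row, a plain comparison operator would fail, and IN, ANY, or ALL would be needed instead.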


Correlated SUBQUERY = NESTED QUERY that refers to the outer query
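A sketch of a correlated subquery: the inner query mentions a column of the outer query, so it is evaluated for each outer row. The table work.places and its values are hypothetical.

```sas
/* Hypothetical sample table */
data work.places;
   input name $ continent $ population;
   datalines;
Alpha East 500
Beta East 1500
Gamma West 800
Delta West 200
;
run;

proc sql;
   /* For each outer row P, the inner query averages the
      population of P's own continent (note i.continent = p.continent) */
   select p.name, p.continent, p.population
      from work.places as p
      where p.population >
            (select avg(i.population)
                from work.places as i
                where i.continent = p.continent);
quit;
```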

Where “EXISTS” option
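A sketch of EXISTS, which tests only whether the subquery returns any row. The tables work.customers and work.orders are hypothetical.

```sas
/* Hypothetical sample tables */
data work.customers;
   input id name $;
   datalines;
1 Ana
2 Ben
3 Cy
;
run;

data work.orders;
   input order_id cust_id;
   datalines;
101 1
102 1
103 3
;
run;

proc sql;
   /* Keep a customer when at least one order row refers to it;
      each customer is listed once even with several orders */
   select c.name
      from work.customers as c
      where exists
            (select *
                from work.orders as o
                where o.cust_id = c.id);
quit;
```

Replacing EXISTS with NOT EXISTS would list the customers that have no orders.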

Multiple NESTED QUERY

Combine a JOIN with a SUBQUERY

QUERY strategies…

UNION is ROWWISE (PROC APPEND), while JOIN is COLUMNWISE (MERGE by)

Keep the duplicates
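A sketch of stacking rows with UNION, with and without duplicate removal. The tables work.q1 and work.q2 are hypothetical.

```sas
/* Hypothetical sample tables with overlapping rows */
data work.q1;
   input x y $;
   datalines;
1 one
2 two
2 two
;
run;

data work.q2;
   input x y $;
   datalines;
2 two
3 three
;
run;

proc sql;
   /* UNION stacks the rows and removes duplicate rows: 3 rows */
   select * from work.q1
   union
   select * from work.q2;

   /* UNION ALL stacks the rows and keeps duplicates: 5 rows */
   select * from work.q1
   union all
   select * from work.q2;
quit;
```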

KEEP ROWS ONLY FROM the first query – keyword EXCEPT (OUTER UNION, by contrast, concatenates the query results)
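A sketch of EXCEPT, which keeps the rows of the first query that do not appear in the second. The tables work.q1 and work.q2 are hypothetical.

```sas
/* Hypothetical sample tables with overlapping rows */
data work.q1;
   input x y $;
   datalines;
1 one
2 two
2 two
;
run;

data work.q2;
   input x y $;
   datalines;
2 two
3 three
;
run;

proc sql;
   /* Only the row (1, one) survives: the (2, two) rows
      also occur in Q2 and are removed */
   select * from work.q1
   except
   select * from work.q2;
quit;
```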

To overlay data better: keyword CORRESPONDING
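A sketch of OUTER UNION with and without CORRESPONDING (abbreviated CORR). The tables work.q1 and work.q2 are hypothetical and share only the column x.

```sas
/* Hypothetical sample tables; X is common, Y and Z are not */
data work.q1;
   input x y $;
   datalines;
1 one
2 two
;
run;

data work.q2;
   input x z $;
   datalines;
2 dos
3 tres
;
run;

proc sql;
   /* Plain OUTER UNION concatenates without overlaying:
      the result carries four columns (x, y, x, z) */
   select * from work.q1
   outer union
   select * from work.q2;

   /* With CORR, like-named columns are overlaid:
      the result carries three columns (x, y, z) */
   select * from work.q1
   outer union corr
   select * from work.q2;
quit;
```

The CORR form is the closer analogue of concatenating data sets with a SET statement in a DATA step.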
