WRDS SAS User Guide

West Virginia University

Available Data in WRDS

By Subject

Subject                  Data                      Contents
Stocks                   CRSP                      Security price, return, and volume data for main indexes
                         Dow Jones                 Dow Jones averages and total return
                         CBOE                      Key measure of market expectations of volatility
Fundamentals             Compustat North America   U.S. and Canadian accounting and market information
                         Compustat Global          Financial and accounting data of publicly traded companies
                         SEC                       Information about Disclosure of Order Execution Statistics
Bank                     Bank Regulatory           Financial information about U.S. banking institutions
Bond                     TRACE                     Transaction data for all eligible corporate bonds
                         CRSP Treasury             Historical info and market data including yields and durations
Interest Rate            FRB                       Databases collected from Federal Reserve Banks
Mergers & Acquisition    Bank Regulatory           Merger information concerning U.S. banking institutions
Currency Option          PHLX                      Philadelphia Stock Exchange's United Currency Options Market
Marketing                DEMF                      Customer buying history
Ownership                Blockholders              Standardized data for blockholders

Advantages of using SAS

• WRDS is built on SAS data sets, so manipulating data through SAS is easier than with almost any other querying tool.

• Any two databases can be combined.

• The web interface deletes observations for which the chosen variables have missing values, and there is no simple way to find out which observations were deleted.

SAS Sample programs

• For simple SAS code:
► SAS Sample Programs on WRDS Info Home

• For advanced SAS code:
► Support → WRDS Datasets and Sample program → SAS
► Support → Research Applications

Connect to WRDS

%let wrds = wrds.wharton.upenn.edu 4016;
options comamid=TCP remote=wrds;
signon username=_prompt_;
rsubmit;

* ---------- your code here ---------- *;

endrsubmit;

► Note: this code is always needed, because it connects your SAS session to WRDS.

Autoexec.sas

data _NULL_;
    file 'autoexec.sas';
    put "%include '!SASROOT/wrdslib.sas';";
run;

► Through this statement, WRDS assigns a list of important libnames.
► You need to run this code only if a libname error occurs.

Libname

• SAS library names are already defined in all user accounts on Unix.

• For example:
► Bank Regulatory: bank /wrds/bank/sasdata
► Compustat: comp /wrds/compustat/sasdata
► CRSP CCM: crsp /wrds/crsp/sasdata/cc
► CRSP Monthly stock: crsp /wrds/crsp/sasdata/sm

• Unix home directory:
► Temp /home/wvu/min06/temp

Data set

• In a DATA step, refer to a WRDS data set by its library name (libname) followed by the data set name, in the form libname.dataset. Because the libnames are already assigned, this is enough to read it. Example:
► For the CRSP monthly stock file: set crsp.msf;
► For the Compustat Industrial Fundamental file: set comp.ina;

• All data sets are SAS data files stored in Unix.
► A good way to fix an error is to check the variable names and the directory name of the SAS files in Unix.

Finding variables

• Web-based:
► Documents
► Tools (Searching variables)

• Using SAS:
► proc contents data=crsp.dsf; run;
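The PROC CONTENTS approach can also be used to search for a variable by name. The following is a minimal sketch, not a WRDS-provided program: it uses the built-in sashelp.class data set so that it runs without a WRDS connection, and the output name "varlist" and the search string are placeholders chosen for illustration. On WRDS, the same pattern should apply to a data set such as crsp.dsf.

```sas
/* Write the variable list of a data set to a table instead of the log.  */
/* sashelp.class ships with SAS; replace it with a WRDS data set such as */
/* crsp.dsf when connected. "varlist" is a hypothetical output name.     */
proc contents data=sashelp.class
              out=varlist(keep=name type length label)
              noprint;
run;

/* Search the list for variable names containing a given string. */
proc print data=varlist noobs;
    where upcase(name) contains 'HEIGHT';
run;
```

Saving the variable list to a table makes it easy to filter or sort when a file has hundreds of variables.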

Finding identifiers

• Web-based:
► Code lookup
► Tools

• Using SAS:
► For example, to find identifiers in Compustat:

data names;
    set comp.namesann;
    where coname contains 'IBM' or smbl contains 'IBM';
run;

proc print data=names;
run;

► For CRSP, the file name is “stocknames”.

• Using the Unix command: grep

a. Merge CRSP / Compustat using CUSIP

When merging two databases, we need a common ID.

• The best way is to match them on CUSIP:
► Names and tickers are problematic because they change through time and can be re-used, and therefore have different entries in different databases.
► CUSIPs change through time but are not re-used.
► The CUSIP used should be the historical one (NCUSIP).

Understanding CUSIP

• Example for IBM

Compustat:
► CNUM=459200 (first 6 digits of CUSIP)

CRSP:
► CUSIP=45920010

Full 9-character CUSIP: 459200 (issuer) 10 (issue) 1 (check digit)
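The breakdown of the IBM CUSIP can be reproduced with the SUBSTR function. This is a small illustrative sketch; the data set and variable names (cusip_parts, cusip9, issuer, issue, check) are made up for the example and do not come from CRSP or Compustat.

```sas
/* Split a 9-character CUSIP into its three parts.               */
/* All data set and variable names here are hypothetical.        */
data cusip_parts;
    length cusip9 $9 issuer $6 issue $2 check $1;
    cusip9 = '459200101';            /* IBM example from the slide      */
    issuer = substr(cusip9, 1, 6);   /* matches Compustat CNUM          */
    issue  = substr(cusip9, 7, 2);   /* issuer + issue = 8-digit CUSIP  */
    check  = substr(cusip9, 9, 1);   /* check digit                     */
run;

proc print data=cusip_parts noobs;
run;
```

The first six characters are the part the two databases share, which is why the merge that follows is built on them.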

Matching identifiers

Database     Ticker        CUSIP            GVKEY                    PERMNO
CRSP         YES           CUSIP, NCUSIP    NO                       YES (main identifier)
COMPUSTAT    YES (SMBL)    CNUM             YES (main identifier)    NO

► To create a common identifier (cnum), we take the first 6 digits of the CUSIP.

Step 1. Headers from Compustat

• From the Compustat header file “namesann”:

► Find “cnum” and “gvkey” for IBM
► Then exclude missing data
► Sort data by cnum

proc sort data=comp.namesann(keep=gvkey cnum smbl) out=comp(keep=gvkey cnum) nodupkey;
    /* smbl must be kept on input so that the where statement can use it */
    where missing(cnum)=0 and smbl = 'IBM';
    by cnum gvkey;
run;

Step 2. Headers from CRSP

• From the CRSP header file “stocknames”:

► Keep “ncusip” and “permco”
► Then exclude missing data
► Sort data by permco and ncusip
► Define the output as “mse”

proc sort data=crsp.stocknames(keep=permco ncusip) out=mse nodupkey;
    where missing(ncusip)=0;
    by permco ncusip;
run;

Step 3. Creating cnum from ncusip

• Create a 6-digit identifier (cnum) from ncusip so that CRSP and Compustat can be matched on cnum:

► Using the “length” statement and the “substr” function, create cnum from ncusip in “mse”
► Sort data by cnum
► Define the output as “mse3”

data mse2;
    length cnum $6;
    set mse;                        /* mse is the work data set created in Step 2 */
    cnum = substr(ncusip, 1, 6);
run;

proc sort data=mse2 out=mse3(keep=permco cnum) nodupkey;
    by cnum permco;
run;

Step 4. Merging

► Create temporary variables “aa” and “bb” using the “in=” option to track whether each data set contributed to the current observation

data joint2;
    merge comp(in=aa) mse3(in=bb);
    by cnum;

    /* Create dummies to test source of merging */
    if aa=1 then compustat=1; else compustat=0;
    if bb=1 then crsp=1; else crsp=0;
run;
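To see how the “in=” flags behave, the merge can be tried on two tiny hand-made data sets. This is only a sketch: the data set names (comp_demo, crsp_demo, joint_demo) are hypothetical and the identifier values are illustrative, not taken from WRDS.

```sas
/* Two small inputs, already sorted by cnum. Values are illustrative only. */
data comp_demo;
    length cnum $6 gvkey $6;
    input cnum $ gvkey $;
    datalines;
111111 001000
459200 006066
;
run;

data crsp_demo;
    length cnum $6;
    input cnum $ permco;
    datalines;
459200 20990
999999 12345
;
run;

/* Match-merge on cnum and record which source contributed each row. */
data joint_demo;
    merge comp_demo(in=aa) crsp_demo(in=bb);
    by cnum;
    compustat = aa;        /* 1 if the row came from comp_demo */
    crsp      = bb;        /* 1 if the row came from crsp_demo */
run;

/* Cross-tabulate the two flags to count matched and unmatched rows. */
proc freq data=joint_demo;
    tables compustat*crsp / nopercent norow nocol;
run;
```

Here cnum 459200 appears in both inputs, while 111111 and 999999 each appear in only one, so the cross-tabulation has one row in each of three cells. To keep only matched observations, a subsetting statement such as "if aa and bb;" could be added to the merge step.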

b. Extract data from CCM

Concepts needed:

• Historical identifier (NPERMNO)

• Linking file (cstlink2)
► see CCM guide

• Most of the SAS procedures on WRDS use SQL
► see SQL references

Step 1. Libname & Years

• Libname: ‘/wrds/crsp/sasdata/cc’

• Specify beginning and ending years:

%let beg_yr = 1995;
%let end_yr = 2003;

[Figure: Step 1 timeline marking BEGFYR and ENDFYR]

Step 2. Link file (cstlink2)

Specify link information:
► Create a data set (temp1) from “cstlink2”

data temp1;
    set crsp.CSTLINK2;
run;

► Select link types and links whose dates overlap the sample period (link starts no later than &end_yr+1 = 2004 and ends no earlier than &beg_yr-1 = 1994)
► Sort data by GVKEY
► Define the output as “lnk”

proc sort data=temp1 out=lnk;
    where NPERMNO in (12490, 11081, 10107)
      and LINKTYPE in ("LU", "LC", "LD", "LF", "LN", "LO", "LS", "LX")
      and (&end_yr+1 >= year(LINKDT) or LINKDT = .B)
      and (&beg_yr-1 <= year(LINKENDDT) or LINKENDDT = .E);
    by GVKEY LINKDT;
run;

[Figure: Step 2 timeline showing the sample period (BEGFYR to ENDFYR) against the link period (LINKDT to LINKENDDT)]

Step 3a.

A, B part → (LINKDT <= FYENDDT or LINKDT = .B) and (FYENDDT <= LINKENDDT or LINKENDDT = .E)

B part → (LINKDT <= FYBEGDT or LINKDT = .B) and (LINKENDDT >= FYENDDT or LINKENDDT = .E)

A, B, C part → (LINKDT <= FYENDDT or LINKDT = .B) and (LINKENDDT >= FYBEGDT or LINKENDDT = .E)

[Figure: Step 3 timeline showing fiscal years A, B, and C (each from BEGFYR to ENDFYR) against the link period (LINKDT to LINKENDDT)]
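The three date conditions can be checked on made-up fiscal years. This sketch assumes, reading the conditions rather than the original figure, that A is a fiscal year that starts before the link and ends inside it, B lies entirely inside the link period, and C starts inside it and ends after it. All dates and the names fy, fy_flags, part, cond_ab, cond_b, and cond_abc are hypothetical.

```sas
/* One hypothetical link period and three hypothetical fiscal years. */
data fy;
    format linkdt linkenddt fybegdt fyenddt date9.;
    linkdt    = '01JAN1996'd;
    linkenddt = '31DEC2000'd;

    part = 'A'; fybegdt = '01JUL1995'd; fyenddt = '30JUN1996'd; output;
    part = 'B'; fybegdt = '01JAN1998'd; fyenddt = '31DEC1998'd; output;
    part = 'C'; fybegdt = '01JUL2000'd; fyenddt = '30JUN2001'd; output;
run;

/* Evaluate each condition; a true comparison yields 1, a false one 0. */
data fy_flags;
    set fy;
    /* Fiscal year end falls inside the link period */
    cond_ab  = (linkdt <= fyenddt or linkdt = .B)
           and (fyenddt <= linkenddt or linkenddt = .E);
    /* Whole fiscal year falls inside the link period */
    cond_b   = (linkdt <= fybegdt or linkdt = .B)
           and (linkenddt >= fyenddt or linkenddt = .E);
    /* Fiscal year overlaps the link period at all */
    cond_abc = (linkdt <= fyenddt or linkdt = .B)
           and (linkenddt >= fybegdt or linkenddt = .E);
run;

proc print data=fy_flags noobs;
run;
```

With these dates, cond_ab is 1 for A and B only, cond_b is 1 for B only, and cond_abc is 1 for all three. The special missing values .B and .E are included so the same expressions also accept links with an open beginning or end date.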

Step 3b. Specify overlapping periods

► Create a table (named “mydata”) that keeps the variables listed below, starting from the file “lnk”

► Refer to “crsp.CSTANN” as “cst” (CSTANN is a file which contains all Compustat data)

► Specify date requirements (select A, B, or C)

► Using the GVKEYs we found, extract the data we need from “CSTANN” by the corresponding GVKEY (lnk.GVKEY = cst.GVKEY)

proc sql;
    create table mydata(keep=GVKEY NPERMNO NPERMCO SMBL YEARA LINKDT LINKENDDT LINKTYPE DATA6) as
    select *
    from lnk, crsp.CSTANN as cst
    where lnk.GVKEY = cst.GVKEY
      and (&beg_yr <= YEARA <= &end_yr)
      and (LINKDT <= cst.FYENDDT or LINKDT = .B)
      and (cst.FYENDDT <= LINKENDDT or LINKENDDT = .E);
quit;

References

To see all references and SAS programs: ► http://www.be.wvu.edu/wrds/home/index.html