Stata and logit recap

Topics

• Introduction to Stata
– Files / directories
– Stata syntax
– Useful commands / functions

• Logistic regression analysis with Stata
– Estimation
– Goodness of fit
– Coefficients
– Checking assumptions

Overview of Stata commands

• Note: we did this interactively for the larger part …

Stata file types

• .ado – Programs that add commands to Stata

• .do – Batch files that execute a set of Stata commands

• .dta – Data file in Stata’s format

• .log – Output saved as plain text by the log using command (you could add .txt as well)

The working directory

• The working directory is the default directory for any file operations such as using & saving data, or logging output

cd "d:\my work\"
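A minimal sketch of checking and changing the working directory. The folder d:\my work is only a placeholder taken from the slide, so the cd to it is wrapped in capture and does nothing if the folder does not exist on your machine.

```stata
* Show the current working directory
pwd

* Remember it, try a (hypothetical) project folder, then switch back
local startdir "`c(pwd)'"
capture cd "d:\my work"      // placeholder path; capture suppresses the error if it is missing
cd "`startdir'"              // return to where we started
```

Paths that contain spaces must be enclosed in straight double quotes.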

Saving output to log files

• Syntax for the log command

log using [filename], replace text

• To close a log file

log close
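As an illustration, the two commands above can be combined into a short session. The file name mylog is hypothetical; with the text option Stata writes mylog.log to the working directory.

```stata
capture log close                   // close any log that is still open, ignore the error otherwise
log using mylog, replace text       // mylog is a made-up name; creates mylog.log
display "This line ends up in the log"
log close                           // stop logging
type mylog.log                      // show the saved log on screen
```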

Using and saving datasets

• Load a Stata dataset
use d:\myproject\data.dta, clear

• Save
save d:\myproject\data, replace

• Or change the working directory first and omit the path
cd d:\myproject
use data, clear
save data, replace
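A small sketch you can run without having d:\myproject: it uses the auto example dataset that ships with Stata. The name mycopy is hypothetical and the file is written to the current working directory.

```stata
sysuse auto, clear        // example dataset shipped with Stata
save mycopy, replace      // made-up name; writes mycopy.dta
use mycopy, clear         // the .dta extension is assumed when omitted
describe, short           // quick check of what was loaded
```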

Entering data

• Data in other formats
– You can use SPSS to convert data into a format that can be read with Stata. Unfortunately, not the other way around (anymore)
– You can use the infile and insheet commands to import data in ASCII format (from Stata 13 on, import delimited supersedes insheet)
– Direct import and export of Excel files in Stata is possible too

• Entering data by hand (don’t do this …)
– Type edit or just click on the data-editor button
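A hedged sketch of a round trip through text and Excel files, assuming Stata 13 or later. The file names cars.csv and cars.xlsx are hypothetical; the example first writes them so that it does not depend on files you may not have.

```stata
sysuse auto, clear
export delimited using cars.csv, replace                    // write a comma-separated text file
import delimited using cars.csv, clear                      // read it back (older Stata: insheet)
export excel using cars.xlsx, firstrow(variables) replace   // variable names go in row 1
import excel using cars.xlsx, firstrow clear                // row 1 becomes the variable names
describe, short
```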

Do-files

• You can create a text file that contains a series of commands. It is the equivalent of SPSS syntax (but way easier to memorize)

• Use the do-file editor to work with do-files

Adding comments in do-files

• // or * denote comments Stata should ignore (* only works at the start of a line)

• Stata ignores whatever follows after /// and treats the next line as a continuation
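A short sketch of the three comment styles, meant to be run from a do-file (the /// continuation does not work when typed interactively). It uses the auto example dataset purely for illustration.

```stata
* A full-line comment starts with an asterisk
sysuse auto, clear            // a comment at the end of a line
summarize price mpg ///       the command continues on the next line
    weight if foreign == 0
```

Both // and /// need a blank in front of them when they follow a command.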

A recommended template for do-files

capture log close //if a log file is open, close it, otherwise disregard

set more off //don't pause when output scrolls off the page

cd d:\myproject //change directory to your working directory

log using myfile, replace text //log results to file myfile.log

… here you put the rest of your Stata commands …

log close //close the log file

Serious data analysis

• Ensure replicability: use do + log files

• Document your do-files
– What is obvious today is baffling in six months

• Keep a research log
– Diary that includes a description of every program you run

• Develop a system for naming files

Serious data analysis

• New variables should be given new names

• Use variable labels and notes (I don’t like value labels though)

• Double check every new variable

• ARCHIVE

Stata syntax examples

Stata syntax example

regress y x1 x2 if x3<20, cluster(x4)

1. regress = Command
– What action do you want performed

2. y x1 x2 = Names of variables, files or other objects
– On what things is the command performed

3. if x3<20 = Qualifier on observations
– On which observations should the command be performed

4. , cluster(x4) = Options appear behind the comma
– What special things should be done in executing the command
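The same four-part pattern can be tried on the auto example dataset that ships with Stata. The variable choice is arbitrary and only meant to show the syntax, not a sensible model; vce(cluster rep78) is the newer spelling of the cluster() option.

```stata
sysuse auto, clear
* command | variables       | qualifier        | options
regress     price mpg weight  if foreign == 0,   vce(cluster rep78)
```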

More examples

tabulate smoking race if agemother>30, row

More elaborate if-statements:

sum agemother if smoking==1 & weightmother<100

Elements used for logical statements

Operator  Definition                 Example
==        is equal in value to       if male == 1
!=        not equal in value to      if male != 1
>         greater than               if age > 20
>=        greater than or equal to   if age >= 21
<         less than                  if age < 66
<=        less than or equal to      if age <= 65
&         and                        if age == 21 & male == 1
|         or                         if age <= 21 | age >= 65
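A few of these operators in action, sketched on the auto example dataset (the variables male and age from the table are not in that dataset, so price, mpg and foreign stand in for them).

```stata
sysuse auto, clear
count if foreign == 1                       // equal to
count if foreign != 1                       // not equal to
count if price > 10000 | mpg >= 30          // either condition holds
tabulate foreign if mpg >= 20 & mpg <= 30   // both conditions hold
```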

Missing values

• Automatically excluded when Stata fits models (same as in SPSS); they are stored as the largest positive values

• Beware!!
– The expression “age>65” can thus also include missing values (these are also larger than 65)
– To be sure, type: “age>65 & age<.” (unlike “age!=.”, this also excludes the extended missing values .a to .z)
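The pitfall is easy to reproduce with a tiny made-up dataset of four ages, one of them missing. The numbers are invented for illustration only.

```stata
clear
input age
20
70
80
.
end
count if age > 65                  // counts 3: the missing value is included
count if age > 65 & age < .        // counts 2: missing values are excluded
count if age > 65 & !missing(age)  // same result, using the missing() function
```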

Selecting observations

drop [variable list]

keep [variable list]

drop if age<65

Note: the dropped variables or observations are then gone from the data in memory (and from the file as soon as you save over it). This is not SPSS’s [filter] command.
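If you only need a temporary selection, one option is to wrap the drop or keep in preserve and restore. A minimal sketch on the auto example dataset, with arbitrarily chosen variables and cut-off:

```stata
sysuse auto, clear
preserve                       // take a snapshot of the data in memory
keep make price mpg foreign    // keep these variables, drop all others
drop if price < 5000           // drop observations
count                          // number of observations left
restore                        // bring back the snapshot
count                          // all observations are back
```

For a single command, an if qualifier on that command is usually simpler than dropping anything.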

Creating new variables

Generating new variables

generate age2 = age*age

(for more complicated functions, there also exists a command “egen”, as we will see later)
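A brief sketch of the difference, using the auto example dataset: generate works row by row, whereas egen can compute something over all rows (or over groups) and store the result on every row. The new variable names are made up.

```stata
sysuse auto, clear
generate mpg2 = mpg*mpg                        // row-by-row calculation
egen mean_mpg = mean(mpg)                      // overall mean, repeated on every row
egen mean_mpg_grp = mean(mpg), by(foreign)     // mean within each group of foreign
list make mpg mpg2 mean_mpg mean_mpg_grp in 1/5
```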

Useful functions

Function  Definition        Example
+         Addition          gen y = a+b
-         Subtraction       gen y = a-b
/         Division          gen density = population/area
*         Multiplication    gen y = a*b
^         Take to a power   gen y = a^3
ln        Natural log       gen lnwage = ln(wage)
exp       Exponential       gen y = exp(b)
sqrt      Square root       gen agesqrt = sqrt(age)
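The functions in the table can be tried out on the auto example dataset; the variables wage, population and area from the slide are not available there, so price and weight are used instead and the new names are made up.

```stata
sysuse auto, clear
gen lnprice = ln(price)            // natural log
gen price_back = exp(lnprice)      // exp() undoes ln(), up to rounding
gen root_weight = sqrt(weight)     // square root
gen weight_cubed = weight^3        // power
gen price_per_lb = price/weight    // division
summarize lnprice price_back root_weight weight_cubed price_per_lb
```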

Replace command

• replace has the same syntax as generate but is used to change values of a variable that already exists

gen age_dum5 = .
replace age_dum5 = 0 if age < 5
replace age_dum5 = 1 if age >= 5 & age < .
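As an alternative sketch, the same dummy can be built in one line, because a logical expression evaluates to 1 (true) or 0 (false). The four ages below are invented for illustration; the if qualifier keeps the dummy missing where age is missing.

```stata
clear
input age
3
5
12
.
end
gen age_dum5 = age >= 5 if !missing(age)   // 1 if age >= 5, 0 if age < 5, missing otherwise
list
```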

Recode

• Change values of existing variables

– Change 1 to 2 and 3 to 4 in origvar, and call the new variable myvar1:
recode origvar (1=2) (3=4), gen(myvar1)

– Change 1’s to missings in origvar, and call the new variable myvar2:
recode origvar (1=.), gen(myvar2)
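Both recodes can be checked on a tiny made-up variable holding the values 1 to 4; cross-tabulating old against new is a quick way to double check a new variable.

```stata
clear
input origvar
1
2
3
4
end
recode origvar (1=2) (3=4), gen(myvar1)   // values not mentioned stay as they are
recode origvar (1=.), gen(myvar2)         // 1 becomes missing
tabulate origvar myvar1, missing          // compare old and new values
list
```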