INTRODUCTION TO STATA Võ Tuấn Khoa Trần Thế Trung.

Post on 17-Dec-2015


INTRODUCTION TO STATA

Võ Tuấn Khoa

Trần Thế Trung

Stata basics

• command-driven or menu-driven software

• models complex data from longitudinal studies or surveys; ideal for analyzing results from clinical trials or epidemiological studies

• provides a powerful programming language

Stata interface in Windows

Stata command

The basic language syntax for Stata commands is

[by varlist:] command [varlist] [=exp] [if exp] [in range] [weight] [using filename] [, options]

where the elements between brackets are optional.

Stata command

• [by varlist:] instructs Stata to repeat the command for each combination of values in the list of variables varlist.

• [command] is the name of the command and can be abbreviated; for example, the command display can be abbreviated as dis.

• [varlist] is the list of variables to which the command applies.

• [=exp] is an expression.

• [if exp] restricts the command to that subset of the observations that satisfies the logical expression exp.

• [in range] restricts the command to those observations whose indices lie in a particular range.

• [weight] allows weights to be associated with observations.

• [using filename] specifies the filename to be used.

• [options] are specific to the command and may be abbreviated.
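The elements above can be sketched on auto, the example dataset that ships with Stata. The variable choices and the file name auto_copy are assumptions made for illustration, not part of the original slides.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* [by varlist:] command [varlist] [if exp] [, options]
bysort foreign: summarize price mpg if mpg >= 20, detail

* [=exp]: create a variable from an expression
generate price_k = price / 1000

* [in range]: first five observations only
list make price_k in 1/5

* [weight]: analytic weights (using vehicle weight here is purely illustrative)
summarize mpg [aweight = weight]

* [using filename]: save a copy (hypothetical name), then reload two variables
save auto_copy, replace
use make price using auto_copy, clear
```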

Stata command

• Example 1

– Stata command:

. bysort black: summarize age if year >= 80, detail

– Result: summarizes age separately for each value of black, including only observations for which year >= 80, and reports extra detail.

Stata command

• Example 2

– Stata commands:

. generate agelt30 = age

. replace agelt30 = 1 if age < 30

. replace agelt30 = 0 if age >= 30 & age < .

– Result: variable agelt30 is set equal to 1, 0, or missing

– Generally, [=exp] is used with the commands generate and replace
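The same indicator can probably be built in one step, because a logical expression in Stata evaluates to 1 or 0. The sketch below uses a small made-up dataset entered with input; the ages are illustrative only.

```stata
* Tiny illustrative dataset: three ages and one missing value
clear
input age
25
30
45
.
end

* age < 30 evaluates to 1 or 0; the if qualifier leaves agelt30
* missing wherever age is missing
generate agelt30 = age < 30 if !missing(age)

list
```

Without the if qualifier, a missing age would be coded 0, since Stata treats numeric missing values as larger than any number.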

Stata command

• Click Help / Stata command

• Type a keyword (e.g., summarize)

• See details
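The same help pages can also be opened from the Command window. A minimal sketch; the search phrase is only an example.

```stata
* Open the help file for a specific command
help summarize

* Search the documentation by keyword when the command name is unknown
search missing values
```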

Do Files and Log Files

• A do-file is a text file of Stata commands that Stata runs line by line, as if the commands were typed in the Stata Command window.

• A log file is a text file with all the results that appear in the Stata Results window.

– the user selects when to start and when to stop logging to the log file
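A minimal sketch of starting and stopping a log, assuming the hypothetical file names session1.log and analysis.do:

```stata
* Start logging results to a plain-text log file (file name is hypothetical)
log using "session1.log", text replace

sysuse auto, clear
summarize price mpg

* Stop logging
log close

* Run a do-file (hypothetical file) from the Command window:
* do "analysis.do"
```

The replace option overwrites an existing log of the same name; append would add to it instead.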

Variable name

• Can have up to 32 characters, but shorter names are easier to type

• Stata names are case sensitive (age≠Age)

• Should be:

– short and lowercase

– a single word

– separated by underscores when several words are needed

effort

fpe

family_planning_effort

familyplanningeffort

Variable type

• Numeric variable

• String variable

• Missing value

– numeric: dot (.)

– string: “”
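Numeric missing values are treated as larger than any number, which matters in if conditions. A short sketch on the auto example dataset, where rep78 has some missing values:

```stata
sysuse auto, clear

* Count observations with a missing repair record
count if missing(rep78)

* This condition also matches the missing values, because . > 4 is true
count if rep78 > 4

* Exclude missing values explicitly
count if rep78 > 4 & rep78 < .
```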

Some Basic Commands

• computing basic statistics

– summarize ypc

– summarize ypcf [w=popwt]

– summarize ylab [w=popwt] if age >= 25 & age <= 55

• generate new variables

– generate ypc2 = ypc^2

• tabulate data

– table skill [w=popwt], c(mean ylab)

Some Basic Commands

• renaming variables

– rename ypc2 ypc22

• eliminating variables

– drop ypc22

• replacing values

– replace male = 0 if male == 1
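The commands above refer to a dataset that is not included here, so the sketch below repeats the same steps on the auto example dataset. The variable names and the use of vehicle weight as an analytic weight are assumptions for illustration. tabstat is used for the table of means because, as far as can be told, the c() option belongs to the table syntax from before Stata 17.

```stata
sysuse auto, clear

* Basic statistics, unweighted and weighted, with an if condition
summarize price
summarize price [aweight = weight] if mpg >= 20 & mpg <= 30

* Generate, rename, and drop a variable
generate price2 = price^2
rename price2 price_sq
drop price_sq

* Mean of price by group
tabstat price, by(foreign) statistics(mean)

* Replace values that meet a condition
replace mpg = 40 if mpg > 40 & mpg < .
```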

Open data from Excel format

• Import data from an Excel file
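The menu steps have a command equivalent. A minimal sketch, assuming a hypothetical workbook mydata.xlsx with a sheet named Sheet1 and variable names in its first row:

```stata
* File name and sheet name are hypothetical
import excel using "mydata.xlsx", sheet("Sheet1") firstrow clear

* Check what was imported
describe
```

The firstrow option tells Stata to read the first row as variable names; clear discards any data already in memory.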


Review data
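A few commands commonly used to review a dataset, sketched on the auto example dataset:

```stata
sysuse auto, clear

* Variable names, storage types, and labels
describe

* Show the first five observations for selected variables
list make price mpg in 1/5

* Detailed description of one variable, including missing values
codebook rep78

* Open the data in the spreadsheet-style Data Browser
browse
```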

Starting descriptive analysis
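A first descriptive pass might look like the sketch below, again on the auto example dataset; the choice of variables is illustrative.

```stata
sysuse auto, clear

* Means, standard deviations, and percentiles for continuous variables
summarize price mpg, detail

* One-way frequency table for a categorical variable
tabulate foreign

* Two-way table with row percentages
tabulate foreign rep78, row
```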


Output Window