
Getting Started With Stata: Session 1

Jim Anthony, John Troost

Department of Epidemiology, Michigan State University


Windowing and the Edit submenus


The Review window displays a record of the commands implied by your changes in the Data Editor. You can save these commands so that you do not have to enter the basic data structure next time.

The ‘Log’ or ‘Output’ window echoes back the commands and the result of each command. You’ll learn to save a log file that you can use to document your work, copy/paste tables to emails, or print it out.

VARIABLE window, with variables being created

COMMAND window


Enter the 1 0 1 0 sequence in the first four rows of var1 as shown here, and then click on the first row of the second column. In that cell of the dataset, enter 1 as shown below.


Repeat that process for the var2 variable: double-click on ‘var2’ at the top of the second column of values, and make the changes as shown below to label the unprotected sex variable. This is also a ‘dummy-coded’ exposure variable, which means that 1 is the code for exposed and 0 is the code for unexposed.

In the jargon of our field, any 0/1 coded variable is a ‘dummy-coded’ variable. Question for you: Is the aids variable a ‘dummy-coded’ variable even though it has to do with ‘case’ status and not with ‘exposure’ status? Think about it. The answer is on the next slide.
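The same renaming and labeling can also be typed as commands. The sketch below is only an illustration: the names aids and u_sex come from the slides, but the var2 values (1 1 0 0) and the label wording are assumptions, because the screenshot is not reproduced in this text.

```stata
* Sketch only: command equivalents of the Data Editor steps.
* Assumption: var2 holds 1 1 0 0, so that each aids/u_sex pair occurs once.
clear
input var1 var2
1 1
0 1
1 0
0 0
end

* Rename the default columns and attach descriptive labels
* (the label wording is hypothetical)
rename var1 aids
rename var2 u_sex
label variable aids  "AIDS case status (1 = case, 0 = non-case)"
label variable u_sex "Unprotected sex (1 = exposed, 0 = unexposed)"

* Check the result
describe
```

Commands like these are what the Review window records when you make the changes by pointing and clicking.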

Yes, the aids variable is a dummy-coded variable as well because it has the 0/1 coding scheme.

ANY 0/1 variable might be thought of as a dummy-coded variable, whether it applies to case status, exposure status, or any other kind of variable. A dummy-coded variable always is a ‘binary’ or ‘dichotomous’ variable.

I think it is helpful to reserve the term ‘dummy-coded’ for variables that are “nominal” in the variable’s level of measurement.

This concept of ‘level of measurement’ is an important one.

Nominal variables are at a very low level of measurement. The values are names, and the 0/1 coding may have nothing to do with units of measurement, unlike an ‘ordinal’ variable that conveys rank.

For example, an ordinal variable conveys class rank. The best student might get a class rank value of 1 (first in the class; best grade). The next best student would get a class rank value of 2 (second in class; next best score), and so on, with the integers actually conveying the ‘distance’ or ranking of each student in relation to an underlying scale, and we can interpret the meaning of a unit change in the class rank variable.

In this sense, we look across values of nominal variables, but can’t compare levels. Nominal variables can reveal group differences, but not levels of the variable.


Actually, you can leave this at float type and change the format to %4.0f, which gives a bit more generality. But name the variable struc1 (your first data structure).

The data structure you just created corresponds to a null association between aids and unprotected sex.

One way to think about this data structure is that it is the kind of structure we might generate if we had flipped two fair heads/tails coins for each of the 400 people, and then used the pattern of heads and tails to place each person into a case-exposure cell of the table.

If the laws of chance worked exactly as they should work with respect to these 400 people, and we are flipping Coin #1 and then Coin #2, and looking at the combinations of heads and tails on the two paired coins, then how many combinations of each type should we see?

How many ‘head-head’ combinations? How many ‘tail-tail’ combinations? How many ‘head-tail’ combinations? And how many ‘tail-head’ combinations?

Hint: The chance of a ‘head-head’ combination = the chance of a ‘tail-tail’ combination = the chance of a ‘head-tail’ combination = the chance of a ‘tail-head’ combination.

ANSWER IS ON THE NEXT PAGE


If the laws of chance work exactly as they should work with respect to these 400 people, then how many of each paired coin-flip combination should be generated?

100 of each combination type.
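The arithmetic behind that answer can be checked in the COMMAND window with Stata's display command, used here as a calculator. This is just a sketch of the reasoning: two independent fair coins give each paired outcome a probability of 0.5 × 0.5.

```stata
* Probability of any one paired outcome (e.g., head-head)
display 0.5 * 0.5

* Expected number of people in each of the four cells, out of 400
display 400 * 0.5 * 0.5
```

The second line should display 100, the expected count for each combination.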

This is the data structure we just created using the Stata Data Editor, for an initial look at the association between being a case of AIDS and prior unprotected sex exposure.
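For reference, a structure like this could also be typed in with the input command instead of the Data Editor. The sketch below assumes the variable names aids, u_sex, and struc1 used in these slides and assumes the u_sex rows are 1 1 0 0; it is not the authors' own do-file.

```stata
* Sketch only: build the null-association structure from commands.
* Each row is one case-exposure cell; struc1 is the number of people in it.
clear
input aids u_sex struc1
1 1 100
0 1 100
1 0 100
0 0 100
end

* Keep the float storage type but show whole numbers
format struc1 %4.0f

list
```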

Here we leave it as a float variable, but change the format to: %4.0f.

Name it struc2.


When you go back to the main Stata windows after closing the Data Editor window, you will find that Stata has kept a record of all your commands in the REVIEW window. You can save them and study the program syntax later. The next slide explains how to do it.

Your VARIABLES window now has a list of all the variables you created in the dataset, which you can save.

The BIG window is the OUTPUT or LOG window, and it shows your commands and their execution results. You probably won’t see anything in the COMMAND window at this time.

To save your commands for later inspection, move your cursor into the Review Window, and right click.

Slide your cursor to ‘SAVE ALL’ and you will be prompted to declare a location and file name where you can save them.

By tradition, the extension for Stata syntax files is ‘.do’ because these are ‘do’ files.

Save them with an informative name, such as ‘build26feb11.do’, so that you can remember how to ‘build’ a dataset from scratch using this file.

We can go over the other options later.


However, if you have issued some incorrect commands, you may want to separate the correct commands from the incorrect ones before you save your program syntax for later use. Incorrect commands show up in red font. To separate out the incorrect ones (if you have made any mistakes in issuing commands), slide the Review Window to the right (red arrow at bottom of the snapshot), and then click on the _rc letters printed at the top of the window.

Now, you can either select only the correct commands and save the selected ones, using the menu from the last slide, or you can save ALL of the commands, sorted by correct status.

Now, let’s have you look at the data structures you built, using a basic tabulate command, which is abbreviated by Stata as: tab

Start by typing tab in the COMMAND window, and follow the instructions below, step by step. Then go to next slide.


Now, position your cursor in the command window to the right of the word ‘aids’ as shown above. Use the space bar to add a space and ENTER this phrase

[fweight=struc1]

Then press the ENTER key.

This command applies ‘struc1’ values as ‘frequency weights’ and builds the 2x2 aids – u_sex table, as shown in the log window (next slide).

The result should look like what you see down below, but the command line should be empty. Look at the table before going to the bottom of this slide.
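Written out in full, the command described above might look like the sketch below. The order "u_sex aids" is inferred from the instruction to place the cursor to the right of the word ‘aids’, and the u_sex rows (1 1 0 0) are an assumption; the data are rebuilt first so the block runs on its own.

```stata
* Sketch only: rebuild the null-association structure, then tabulate it.
clear
input aids u_sex struc1
1 1 100
0 1 100
1 0 100
0 0 100
end

* 'tab' is the abbreviation of tabulate; struc1 supplies the cell counts
tabulate u_sex aids [fweight=struc1]

* The weighted total should be the 400 people in the example
display r(N)
```

Frequency weights tell Stata that each row of the dataset stands for that many identical people, so four rows are enough to describe all 400.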

Now, let’s apply the struc2 weights and see the positive association table. Do this by pressing the PgUp key to retrieve your just-issued command, and change struc1 to struc2. (You can just change the number. No need to type the entire word again.) Press ENTER to issue this command.
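The edited command differs only in the weight variable. In the sketch below, the struc2 counts are HYPOTHETICAL values chosen only to illustrate a positive association among 400 people; they are not the values from the slide, which are not reproduced in this text.

```stata
* Sketch only. The struc2 values here are hypothetical, for illustration.
clear
input aids u_sex struc2
1 1 150
0 1  50
1 0  50
0 0 150
end
format struc2 %4.0f

* Same command as before, with struc2 as the frequency weight
tabulate u_sex aids [fweight=struc2]
```

With counts like these, cases are concentrated among the exposed, which is what a positive association looks like in a 2x2 table.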


ANNOTATING THE OUTPUT WITH COMMENTS
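One simple way to annotate the output is to type comment lines, which begin with an asterisk. Stata echoes them in the log but does not execute them. The sketch below uses the same assumed null-association data as earlier; the wording of the comments is only an example.

```stata
* Sketch only: comment lines start with an asterisk.
clear
input aids u_sex struc1
1 1 100
0 1 100
1 0 100
0 0 100
end

* Null association: 100 people in each case-exposure cell
tabulate u_sex aids [fweight=struc1]
* End of the null-association table
```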


SAVING THE DATASET

SAVING THE OUTPUT IN A LOG FILE YOU CAN EDIT WITH NOTEPAD

SAVING THE COMMAND FILE YOU CAN EDIT WITH NOTEPAD

SEE SLIDE 24-25, AS SHOWN BEFORE
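The saving steps can also be issued as commands rather than through the menus. This is a minimal sketch, not the authors' procedure: the file names session1_demo.log and session1_demo.dta are hypothetical, and the files are written to Stata's current working directory.

```stata
* Sketch only: hypothetical file names, written to the working directory.
capture log close                        // close any log that is already open
log using "session1_demo.log", text replace   // plain-text log, editable in Notepad

clear
input aids u_sex struc1
1 1 100
0 1 100
1 0 100
0 0 100
end
tabulate u_sex aids [fweight=struc1]

save "session1_demo.dta", replace        // save the dataset
log close                                // stop logging
```

The text option is what makes the log a plain-text file; without it, Stata's default log format is SMCL, which is harder to read in Notepad.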


Second Part of Session 1

• As a work group or on your own, view the UCLA Stata introductory streaming video on other ways to bring data into the Stata environment (e.g., if you have an .xls spreadsheet version of the data):

http://www.ats.ucla.edu/stat/stata/notes_old/movies/IntroStata1.html

• This video also teaches some nifty Stata tricks about describing datasets, etc.

• Information about importing SPSS and SAS files into Stata can be found here:

http://www.ats.ucla.edu/stat/stata/faq/convert_pkg.htm

Other Stata aids at the UCLA site are here:

http://www.ats.ucla.edu/stat/stata/
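As a rough illustration of bringing spreadsheet data into Stata by command, the sketch below writes the example data to an .xls file and reads it back. It assumes Stata 12 or later, where import excel and export excel are available; the file name session1_demo.xls is hypothetical, and the video may demonstrate a different method.

```stata
* Sketch only: assumes Stata 12 or later; the file name is hypothetical.
clear
input aids u_sex struc1
1 1 100
0 1 100
1 0 100
0 0 100
end

* Write a spreadsheet with variable names in the first row
export excel using "session1_demo.xls", firstrow(variables) replace

* Read it back, treating the first row as variable names
import excel using "session1_demo.xls", firstrow clear
describe
list
```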

Session 2 Overview

• An overview of the Stata epitab commands will be provided.

• The ‘immediate’ commands will be covered in detail

http://www.stata.com/help.cgi?epitab

http://www.epi.msu.edu/janthony/Epidemiologic%20Analysis%20with%20a%20Programmable%20Calculator.pdf

In advance of Session 2, read Chapter 1 (3 pages) of this online text if you are new to epidemiology or need a quick refresher overview.
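As a small preview of what an ‘immediate’ command looks like, the sketch below passes the four cell counts of the null-association table from Session 1 directly to cci, with no dataset in memory. The order of the counts (exposed cases, unexposed cases, exposed non-cases, unexposed non-cases) follows the epitab help file; Session 2 covers the details.

```stata
* Sketch only: an immediate command takes numbers typed on the command line.
* Counts: exposed cases, unexposed cases, exposed non-cases, unexposed non-cases
cci 100 100 100 100

* The odds ratio is left behind in r(or); for these counts it is (100*100)/(100*100)
display r(or)
```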


End of Session 1

A copy of this PPT and an annotated Stata do-file with these commands can be found at the following URL:

http://www.epi.msu.edu/janthony/stata/session1/

Try the .zip file if you cannot access the individual files.