
An Introduction into Stata I

Prof. Dr. Herbert Brücker

University of Bamberg

Seminar “Migration and the Labour Market”

Session 3, June 9, 2011


Contents

1 Introduction into the workplan
2 Introduction into the dataset
3 Introduction into STATA I

• Overview on working with STATA
• Menus and editors
  • General editor
  • Data editor
  • Do-file editor
• The grammar of STATA
  • commands
  • loading data
  • describing data
  • graphs
• Working with do-files


1 Workplan

• Forming four teams of 4-5 students each
• Introduction and outline of research question
• Review of literature on labour market effects of migration (3-5 pages)
• Description of the dataset
  • Data sources and caveats
  • Descriptive statistics and graphs
• Presenting the empirical model
• Presenting and discussing the regression results
• Conclusions
• Presenting the papers in class


2 The dataset: general information

• The IAB employment sample (IABS)
• 2% random sample of all employees obliged to pay social security contributions and of recipients of unemployment benefits (e.g. SGB II and III)
• Precise information on wages and unemployment spells
• Information on education and work experience
• Period: 1974-2004 (meanwhile extended to 2008)
• Here we use 1980-2004, since information at the beginning of the sample period is less reliable
• Focus on Western Germany excl. (West) Berlin due to unification


2 The dataset: Caveats I

• Identification of foreigners by nationality
  • We use the nationality of the first spell to control for naturalisations
• Problem of identifying the immigration of ethnic Germans (Spätaussiedler)
  • We try to identify them via programme participation
• No civil servants (“Beamte”) and self-employed
  • There is nothing we can do about this
• Wages are censored at the legal pension threshold level (66,000 euros)
  • We impute wages above the threshold level


2 The dataset: Caveats II

• Missing education information (17%, about 35 per cent of foreigners)
  • We impute education information
• We have only daily wages (not hourly wages)
  • We exclude all part-time workers
• See Brücker/Jahn (2011), data section, and the FDZ at the IAB for a description of the data set


2 The dataset: Organisation

• We distinguish 25 years (1980-2004)
• We distinguish 64 labour market cells by education (4), work experience (8) and nationality (2)
  • 4 x 8 x 2 = 64
• We use the following indexes:
  • h = native (German)
  • f = foreigner
  • q = education
  • k = work experience
  • t = time
  • Note that the dataset also contains aggregates (e.g. wt, wqt, wqkt, and not only whqkt, wfqkt)
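A minimal sketch of how such a panel of groups could be laid out, assuming one row per year, education group and experience group, with native and foreign values stored side by side in variables such as whqkt and wfqkt. The layout and the numeric codes for q and k are assumptions for illustration only; the actual dataset may be organised differently.

```stata
* Skeleton of the group structure: 25 years x 4 education x 8 experience groups
* (run as a do-file; q and k are hypothetical numeric codes)
clear
set obs 25
generate year = 1979 + _n        // 1980, ..., 2004
expand 4                         // four copies of each year
bysort year: generate q = _n     // education group 1-4
expand 8                         // eight copies of each year-education row
bysort year q: generate k = _n   // experience group 1-8
count                            // 25 x 4 x 8 = 800 rows
```

With natives and foreigners held in separate variables on each row, the 800 rows correspond to 64 groups per year.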


General overview of STATA

The desktop of STATA is divided into four parts:

1. Review: shows executed commands
2. Results: shows the results of your commands
3. Variables: the current list of variables in the data set
4. Command: here the commands have to be typed in


Review window: lists your previous commands


Results window: shows the outcome of your current command


Variables window: shows the variables of your dataset


Command window: Here you can type your commands


STATA has the following menus/editors you can work with:

1. The desktop menu: you can run all commands here.
2. The data editor: here you can edit the data you have loaded.
3. The data browser: here you can browse the data you have loaded, but not edit them.
4. The do-file editor: the do-file is a file where you can edit and execute all types of commands. It is very useful for replication and for keeping a record of what you have done. We come back to this.


The Data Editor. You can change each cell by hand.

The Data Browser looks similar, but you cannot edit the data.


The Do-File Editor. You can type and execute your commands there.

(Lines that begin with a star are treated as comments, not as commands, e.g. * Note that …)


The Grammar of STATA

General Structure of STATA

[prefix :] command [varlist] [if] [in] [weight] [, options]


General structure of STATA

We will concentrate on:

[prefix :] command [varlist] [if] [in] [weight] [, options]

command: What do you want to do?


[prefix :] command [varlist] [if] [in] [weight] [, options]

First step: how to load data

> use "Filename", clear

Practice:

> use "C:\EigeneDateien\Stata\data1.dta", clear

Another way to load data: File -> Open -> choose your data file
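A minimal sketch of the load step that does not depend on any file on your disk: it types in a tiny toy data set, saves it to a temporary file and loads it again. The values and the second education label are made up for illustration; run the block as a whole, since the temporary file name is only known while the do-file runs.

```stata
* Build a tiny toy data set (hypothetical values), save it, and load it again
clear
input year str12 ed ex whqkt wfqkt
1980 "no voc train" 1 60 50
1981 "no voc train" 1 62 51
end
tempfile toy                 // temporary file, removed when the do-file ends
save "`toy'"
clear
use "`toy'", clear           // same syntax as with a real path
describe
```

The option "clear" drops whatever data are currently in memory without saving them, so save your work first if you still need it.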


General structure of STATA

There are two types of variables (data):

numerical variables, e.g.: 0, 1, 501, 0.5, -12 etc.

string variables, e.g.: no voc train, male, female etc.

How to deal with the data types:

Numerical variables: you can do all mathematical operations, e.g. var1 + var2, var1/var2, var1*var2 etc.

String variables: you have to put string values in quotation marks to identify them, e.g.

> replace var1 = 1 if sex == "female"
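The difference between the two types can be tried out on a small toy data set. This is only a sketch; the variable names wage1, wage2, total and female and all values are hypothetical.

```stata
* Toy data: one string variable (sex) and two numerical variables
clear
input str6 sex wage1 wage2
"female" 80 20
"male"   90 10
end
* numerical variables can be used in arithmetic
generate total = wage1 + wage2
* string values must be written in straight double quotes
generate female = 0
replace female = 1 if sex == "female"
list
```

Note that STATA needs straight quotation marks; typographic quotes copied from slides or a word processor cause an error.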


The variables shown in black are numerical variables.

The variables shown in red are string variables.


[prefix :] command [varlist] [if] [in] [weight] [, options]

Now that you have loaded the data:

How to get an overview of your data?

> describe

“describe” gives general information about the data, such as the number of observations, the number of variables, and the names and labels of the variables.


[prefix :] command [varlist] [if] [in] [weight] [, options]

How to get an overview of your data?

> list

lists the values of every single observation (e.g. persons, groups, classes) in the data set.

Attention: your data set might be really large! “-more-” indicates that more output is available; press any key to continue or “q” to quit.


General structure of STATA

We will concentrate on:

[prefix :] command [varlist] [if] [in] [weight] [, options]

varlist: Which variables are concerned?


[prefix :] command [varlist] [if] [in] [weight] [, options]

[varlist] stands for the variable or list of variables to which the command is applied.

[varlist] is set in brackets since it is an optional specification; if no [varlist] is specified, STATA will in most cases execute the command for all variables.

Practice:

In order to get information only about education and wages in the data set:

> list ed whqkt
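A short sketch of how a varlist narrows the output, using a toy data set with made-up values (the label "voc train" is hypothetical). It also uses the [in] part of the syntax to show only the first observations, which helps when the data set is large.

```stata
* Toy data (hypothetical values)
clear
input year str12 ed ex whqkt wfqkt
1980 "no voc train" 1 60 50
1981 "no voc train" 1 62 51
1980 "voc train"    1 80 70
1981 "voc train"    1 82 72
end
set more off                 // do not pause the output at -more-
describe ed whqkt            // overview of two variables only
list ed whqkt                // all observations, two variables
list ed whqkt in 1/2         // first two observations only
```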


[prefix :] command [varlist] [if] [in] [weight] [, options]

Further commands to describe the data set I:

> tabstat

gives a table with the mean of the variable(s); other statistics can be requested as options

> codebook

shows how each variable is coded, with information on the data type, range, units, unique values, missing values, mean, standard deviation and percentiles

In practice:

> tabstat whqkt wfqkt

> codebook

> tabstat whqkt
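As a sketch of what tabstat can do beyond the default mean, the following asks for several statistics at once and then repeats the table by education group. The data are a toy example with made-up values.

```stata
* Toy data (hypothetical values)
clear
input year str12 ed ex whqkt wfqkt
1980 "no voc train" 1 60 50
1981 "no voc train" 1 62 51
1980 "voc train"    1 80 70
1981 "voc train"    1 82 72
end
* mean, standard deviation, minimum, maximum and number of observations
tabstat whqkt wfqkt, statistics(mean sd min max n)
* the same statistics separately for each education group
tabstat whqkt wfqkt, statistics(mean sd min max n) by(ed)
* coding of selected variables only
codebook ed whqkt
```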


[prefix :] command [varlist] [if] [in] [weight] [, options]

Further commands to describe the data set II:

> summarize

gives the number of observations, the mean, the standard deviation, the minimum and the maximum of a variable

> tabulate

gives a table with the absolute and relative frequencies of the values of a variable

In practice:

> sum whqkt wfqkt

> tab ed


[prefix :] command [varlist] [if] [in] [weight] [, options]

Practice:

- how many observations there are
- the mean earnings and unemployment rate
- the standard deviation of earnings and of the unemployment rate
- the range of observations (minimum and maximum wage and unemployment rate)

Note that descriptive statistics already provide interesting information about the data; they help to detect outliers and measurement error and to interpret regression results (most results refer to the sample mean).
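The practice questions can be answered with summarize, as in this sketch on a toy data set with made-up values. After summarize, STATA keeps the results in r(N), r(mean), r(sd), r(min) and r(max), so they can be displayed or reused.

```stata
* Toy data (hypothetical values)
clear
input year str12 ed ex whqkt wfqkt
1980 "no voc train" 1 60 50
1981 "no voc train" 1 62 51
1980 "voc train"    1 80 70
1981 "voc train"    1 82 72
end
summarize whqkt wfqkt        // N, mean, std. dev., min and max
summarize whqkt, detail      // adds percentiles, useful to spot outliers
* the results of the last summarize can be reused
quietly summarize whqkt
display "N = " r(N) ", mean = " r(mean) ", range = " r(min) " to " r(max)
tabulate ed                  // frequencies of a categorical variable
```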


General structure of STATA

We will concentrate on:

[prefix :] command [varlist] [if] [in] [weight] [, options]

if: Under which condition?


[prefix :] command [varlist] [if] [in] [weight] [, options]

With [if] you can set a condition or restrict the command to a subset of the data,

e.g. in order to get only the average income of migrants with the lowest education (no vocational training):

> summarize wfqkt if ed == "no voc train"

“no voc train” is a value of the string variable ed (therefore the quotation marks) and indicates that an individual has no vocational training.
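A sketch of a few typical conditions on a toy data set with made-up values. Note the double equals sign for comparisons, and that several conditions can be combined with & (and) or | (or).

```stata
* Toy data (hypothetical values)
clear
input year str12 ed ex whqkt wfqkt
1980 "no voc train" 1 60 50
1981 "no voc train" 1 62 51
1980 "voc train"    1 80 70
1981 "voc train"    1 82 72
end
* one condition on a string variable
summarize wfqkt if ed == "no voc train"
* two conditions combined with & (and)
summarize wfqkt if ed == "no voc train" & year >= 1981
* != means "not equal"
count if ed != "no voc train"
```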


[prefix :] command [varlist] [if] [in] [weight] [, options]

How to create dummies?

What is a dummy variable? A dummy variable has a value of 0 or 1.

With STATA you can also create new variables from the data.

To do so you need the commands “generate” and “replace”:

> gen ed1 = 0

> replace ed1 = 1 if ed == "no voc train"

Other example:

> gen ex1 = 0

> replace ex1 = 1 if ex == 1
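The two-step approach can be compared with two shortcuts in the following sketch, run on a toy data set with made-up values. The names ed1_short and ed_ are hypothetical. A logical expression in parentheses evaluates to 1 if true and 0 if false, and tabulate with the generate() option creates one dummy per category.

```stata
* Toy data (hypothetical values)
clear
input year str12 ed ex whqkt wfqkt
1980 "no voc train" 1 60 50
1981 "no voc train" 1 62 51
1980 "voc train"    1 80 70
1981 "voc train"    1 82 72
end
* two steps, as described above
generate ed1 = 0
replace ed1 = 1 if ed == "no voc train"
* one step: the expression in parentheses is 1 if true, 0 if false
generate ed1_short = (ed == "no voc train")
* one dummy per education category: ed_1, ed_2, ...
tabulate ed, generate(ed_)
list ed ed1 ed1_short ed_1 ed_2
```

With numerical variables that contain missing values, check how the condition treats them before relying on the one-step form.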


[prefix :] command [varlist] [if] [in] [weight] [, options]

How to calculate and transform numerical variables

> generate newvar = var1 - var2

STATA knows the usual mathematical operations and functions (+, -, *, /, logs, etc.)

Practice: Create the log wage:

> generate ln_whqkt = ln(whqkt)
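A few more transformations of the same kind, as a sketch on a toy data set with made-up values. The new variable names wage_gap, rel_wage and ln_rel_wage are hypothetical.

```stata
* Toy data (hypothetical values)
clear
input year str12 ed ex whqkt wfqkt
1980 "no voc train" 1 60 50
1981 "no voc train" 1 62 51
1980 "voc train"    1 80 70
1981 "voc train"    1 82 72
end
generate ln_whqkt = ln(whqkt)                   // log of the native wage
generate wage_gap = whqkt - wfqkt               // absolute wage gap
generate rel_wage = wfqkt / whqkt               // foreign relative to native wage
generate ln_rel_wage = ln(wfqkt) - ln(whqkt)    // log relative wage
list year ed whqkt wfqkt wage_gap rel_wage
```

ln() of zero or of a negative number gives a missing value, so it is worth checking the minimum of a variable before taking logs.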


[prefix :] command [varlist] [if] [in] [weight] [, options]

How to modify variables/dummies?

> replace var = (var1 - var2)/2

STATA knows the usual mathematical operations and functions (+, -, *, /, log, etc.)

Practice: Replace the wage by the log wage only for the low-skilled (the variable on the left-hand side must already exist)

> replace ln_whfqkt = ln(whqkt) if ed == "no voc train"
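A minimal sketch of replace with a condition, on a toy data set with made-up values. Because replace only changes existing variables, the hypothetical variable ln_w_low is first generated as missing and then filled in for the low-skilled group only.

```stata
* Toy data (hypothetical values)
clear
input year str12 ed ex whqkt wfqkt
1980 "no voc train" 1 60 50
1981 "no voc train" 1 62 51
1980 "voc train"    1 80 70
1981 "voc train"    1 82 72
end
generate ln_w_low = .                                   // missing everywhere
replace ln_w_low = ln(whqkt) if ed == "no voc train"    // fill in one group
count if missing(ln_w_low)                              // rows left untouched
list ed whqkt ln_w_low
```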


[prefix :] command [varlist] [if] [in] [weight] [, options]

How to create graphics?

> graph twoway line var1 year [if] [in]

STATA produces two-dimensional graphs with lines, bars, dots, scatter plots etc. with the “graph twoway” command; the type of graph is specified after that, e.g. “line”.

Practice:

Graph the development of native and foreign wages over the years in our sample for a given education and experience group.

> graph twoway line whqkt wfqkt year if ed == "no voc train" & ex == 1

> graph twoway scatter whqkt wfqkt if ed == "no voc train" & ex == 1
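A sketch of the line graph with a few common options added, on a toy data set with made-up values. The axis title and legend labels are assumptions for illustration; "sort" makes sure the points are connected in the order of the years.

```stata
* Toy data (hypothetical values)
clear
input year str12 ed ex whqkt wfqkt
1980 "no voc train" 1 60 50
1981 "no voc train" 1 62 51
1982 "no voc train" 1 63 53
1980 "voc train"    1 80 70
1981 "voc train"    1 82 72
1982 "voc train"    1 85 73
end
* native and foreign wages over time for one education-experience group
twoway line whqkt wfqkt year if ed == "no voc train" & ex == 1, sort ///
    ytitle("Daily wage") legend(order(1 "Natives" 2 "Foreigners"))
```

The three slashes continue a command on the next line; this works in a do-file, not in the command window.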


The do-file

STATA also provides a do-file editor (= text editor), into which the commands can be written.

- the do-file editor can be opened with the command “doedit”, by pressing “Ctrl + 8”, or by clicking the do-file button in the toolbar.

How to execute commands in a do-file?

- you write the commands into the text editor, then mark the text and press “Ctrl + D”
- if no text is marked, the whole do-file is executed. That can cause trouble if there is a mistake in your list of commands. (That happens in most cases.)


The do-file

Reasons to use a do-file:

- your work is documented and reproducible!

- you can include comments in your work by putting a “*” at the very beginning of the line (they automatically get a green color), e.g.

> *load data

> use "C:\User\...data1.dta", clear

> *get an overview

> describe

- you can save your do-file: File -> Save
- you can also open do-files: File -> Open
- do-files have the extension “.do”
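A complete small do-file might look like the following sketch. It uses a toy data set typed in directly (made-up values) instead of a file on disk, so it runs as it stands, and it shows the comment styles that STATA accepts in do-files.

```stata
* A line starting with a star is a comment
// Two slashes also start a comment in a do-file
/* Text between slash-star and star-slash
   can span several lines */
set more off

* load data (here: a toy data set with hypothetical values)
clear
input year str12 ed ex whqkt wfqkt
1980 "no voc train" 1 60 50
1981 "no voc train" 1 62 51
1980 "voc train"    1 80 70
1981 "voc train"    1 82 72
end

* get an overview
describe
summarize whqkt wfqkt        // comments may also follow a command
```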


This is an example of a Do-File.

First I “set more off” and load the data.

Second I use a command for panel regressions.

Third I generate some variables.

The remarks in stars explain what I am doing.


Now I mark the lines where I have the commands I want to execute.

Then I press the execute button.


Next Meeting:

June 30, Room RZ 1.03!