SJTU CMGPD 2012 Methodological Lecture Day 2 TABLE, COLLAPSE, HISTOGRAM, TWOWAY BAR.
Descriptive statistics
• There are a number of ways in STATA of transforming the dataset to produce descriptive statistics to be plotted or put into a figure
• Slow, manual way
  – TABULATE
  – Copy results to Excel, parse, and plot
  – Not recommended
• Transformation to produce counts, averages etc. according to the values of specified variables to use as the basis of plots
  – TABLE, REPLACE
  – COLLAPSE
  – BYSORT combined with EGEN (to be discussed later)
Collapsing the data
• TABLE, REPLACE and COLLAPSE transform the data
• For each value of a specified variable, or each combination of values of specified variables, they produce a single observation with summary statistics of other specified variables
• These summary statistics can be counts, sums, means, etc.
COLLAPSE
Start with a hypothetical dataset
     +----------------+
     | x1   x2      y |
     |----------------|
  1. |  1    3     12 |
  2. |  2    3    100 |
  3. |  1    3     45 |
  4. |  2    3    -18 |
  5. |  1    3     73 |
     |----------------|
  6. |  2    4     22 |
  7. |  1    4   -129 |
  8. |  2    4   -100 |
  9. |  1    4     -9 |
 10. |  2    4    112 |
     +----------------+
Replace the dataset with one that contains, for each combination of x1 and x2, the mean of y
. collapse y, by(x1 x2)
. list
     +--------------------+
     | x1   x2          y |
     |--------------------|
  1. |  1    3   43.33333 |
  2. |  1    4        -69 |
  3. |  2    3         41 |
  4. |  2    4   11.33333 |
     +--------------------+
Or count the number of records with nonmissing y for each unique combination of x1 and x2
. collapse (count) y, by(x1 x2)
. list
     +-------------+
     | x1   x2   y |
     |-------------|
  1. |  1    3   3 |
  2. |  1    4   2 |
  3. |  2    3   2 |
  4. |  2    4   3 |
     +-------------+
Or do both at the same time, creating the count and the average in one step. The prefix avgy= stores the mean in a new variable named avgy, while the count keeps the name y.
. collapse (count) y (mean) avgy=y, by(x1 x2)
. list
     +------------------------+
     | x1   x2   y       avgy |
     |------------------------|
  1. |  1    3   3   43.33333 |
  2. |  1    4   2        -69 |
  3. |  2    3   2         41 |
  4. |  2    4   3   11.33333 |
     +------------------------+
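Because collapse replaces the dataset in memory, the three commands above cannot be run one after another on the same data without reloading it. The following is a minimal sketch, assuming the ten hypothetical observations shown earlier are typed in with input. preserve and restore are standard Stata commands but do not appear in the original slides, and the variable name n is chosen here only for illustration.

```stata
* Recreate the hypothetical dataset from the slides
clear
input x1 x2 y
1 3 12
2 3 100
1 3 45
2 3 -18
1 3 73
2 4 22
1 4 -129
2 4 -100
1 4 -9
2 4 112
end

* preserve keeps a copy of the data so it can be brought back after collapse
preserve
collapse (count) n=y (mean) avgy=y, by(x1 x2)
list
restore

* The original ten observations are back in memory
count
```

Wrapping each collapse in preserve and restore makes it possible to try several summaries in a row during exploration.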
TABLE, REPLACE
We can achieve the same thing with TABLE, REPLACE, though the resulting variable names are a bit cryptic
. table x1 x2, contents(count y mean y) replace
------------------------------
          |         x2
       x1 |        3          4
----------+-------------------
        1 |        3          2
          | 43.33333        -69
          |
        2 |        2          3
          |       41   11.33333
------------------------------
. list
     +-----------------------------+
     | x1   x2   table1     table2 |
     |-----------------------------|
  1. |  1    3        3   43.33333 |
  2. |  1    4        2        -69 |
  3. |  2    3        2         41 |
  4. |  2    4        3   11.33333 |
     +-----------------------------+
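One caution, offered as an assumption about the reader's setup rather than something stated in the lecture: the contents() and replace options belong to the table command as it existed in 2012. In Stata 17 and later, table was rewritten and no longer accepts these options, although the older command should still be reachable through version control. A minimal sketch, assuming Stata 17 or later and the same hypothetical x1, x2, y data (n_y and mean_y are illustrative names):

```stata
clear
input x1 x2 y
1 3 12
2 3 100
1 3 45
2 3 -18
1 3 73
2 4 22
1 4 -129
2 4 -100
1 4 -9
2 4 112
end

* Run the pre-17 table command under version control (Stata 17+ assumed)
version 16: table x1 x2, contents(count y mean y) replace

* Give the cryptic table1/table2 variables self-describing names
rename table1 n_y
rename table2 mean_y
list
```

On Stata 16 or earlier the version prefix is unnecessary, and the command on the slide can be run as written.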
histogram
Observations by year
The easy way to get a figure for numbers of observations by register year is to use histogram.
histogram YEAR, discrete frequency ytitle("Observations") xtitle("Year") xlabel(1750(25)1900)
To force a monochromatic color scheme, we can add scheme(s1mono)
To override the default numeric format of the vertical axis labels, we can add ylabel(, format(%5.0f))
histogram YEAR, discrete frequency ytitle("Observations") xtitle("Year") xlabel(1750(25)1900) ylabel(,format(%5.0f)) scheme(s1mono)
[Figure: histogram of observations by register year with default axis labels; vertical axis "Observations" in scientific notation, 0 to 8.0e+04; horizontal axis "Year", 1750 to 1900]
[Figure: the same histogram with ylabel(, format(%5.0f)); vertical axis "Observations", 0 to 80000; horizontal axis "Year", 1750 to 1900]
histogram
Restricting the data
• Often, in producing a histogram, it is necessary to prevent the display of invalid, implausible, or otherwise problematic observations.
  – Missing values are always coded as -98 or -99, and should be excluded from graphs
• Do this with an if restriction in the command
• This applies to tables as well.
• Compare the results of
  – histogram AGE_IN_SUI
  – histogram AGE_IN_SUI if AGE_IN_SUI >= 1 & AGE_IN_SUI <= 99
if and logical expressions in STATA
• if AGE_IN_SUI >= 1 & AGE_IN_SUI <= 99 restricts the command to observations where AGE_IN_SUI is >= 1 and <= 99.
• & represents AND
  – Expression is evaluated as true only if ALL expressions are TRUE
• | represents OR
  – Expression is evaluated as true if ANY of the expressions are TRUE
• May use parentheses (, ) to specify order of evaluation
• ! represents NOT
• In a logical expression, TRUE is typically indicated as 1, and FALSE is indicated as 0.
• If AGE_IN_SUI was 45, AGE_IN_SUI >= 1 would evaluate to 1, and AGE_IN_SUI <= 99 would evaluate to 1.
  – 1 & 1 would evaluate to 1, TRUE
• If AGE_IN_SUI was 105, AGE_IN_SUI >= 1 would evaluate to 1, and AGE_IN_SUI <= 99 would evaluate to FALSE or 0
  – 1 & 0 would evaluate to 0, FALSE
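The two worked cases above can be checked directly with display, which prints 1 for true and 0 for false. This is a small sketch for a do-file. The scalar name test_age is made up for illustration, and the last two lines add a point the slide does not make: Stata treats a numeric missing value (.) as larger than any number, so a lower bound by itself does not exclude it.

```stata
* The slide's two cases
scalar test_age = 45
display (test_age >= 1) & (test_age <= 99)     // 1 & 1, prints 1
scalar test_age = 105
display (test_age >= 1) & (test_age <= 99)     // 1 & 0, prints 0

* | is OR and ! is NOT
display (test_age < 1) | (test_age > 99)       // prints 1 for 105
display !(test_age >= 1 & test_age <= 99)      // prints 1 for 105

* Missing (.) passes a lower bound but fails an upper bound
scalar test_age = .
display (test_age >= 1)                        // prints 1
display (test_age >= 1) & (test_age <= 99)     // prints 0
```

This is one reason the restriction on the previous slide uses both a lower and an upper bound.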
histogram
Some additional options
• Tell STATA that the values are discrete, not continuous:
  – histogram AGE_IN_SUI if AGE_IN_SUI >= 1 & AGE_IN_SUI <= 99, discrete
• Set the Y-axis to represent percentages:
  – histogram AGE_IN_SUI if AGE_IN_SUI >= 1 & AGE_IN_SUI <= 99, percent discrete
• Customize labeling of the X-axis
  – histogram AGE_IN_SUI if AGE_IN_SUI >= 1 & AGE_IN_SUI <= 99, percent discrete xlabel(0(10)100)
• Add tick marks to the X axis
  – histogram AGE_IN_SUI if AGE_IN_SUI >= 1 & AGE_IN_SUI <= 99, percent discrete xlabel(0(10)100) xtick(0(5)100)
• Produce separate graphs according to the value of another variable
  – histogram AGE_IN_SUI if AGE_IN_SUI >= 1 & AGE_IN_SUI <= 99 & (SEX != -99), percent discrete xlabel(0(10)100) xtick(0(5)100) by(SEX)
table and bar to produce histograms
Observations by year
We could do the same thing with table to prepare the dataset, and then twoway bar.
table YEAR, contents(freq) replace
twoway bar table1 YEAR, scheme(s1mono) xlabel(1750(25)1900) ytitle("Number of observations")
Or if we want to do it as a scatter plot…
twoway scatter table1 YEAR, scheme(s1mono) xlabel(1750(25)1900) ytitle("Number of observations")
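For readers who prefer a named result variable, the same one-row-per-year counts can be produced with contract instead of table, replace. This is a sketch on simulated data, not the CMGPD-LN file: the YEAR values, the seed, and the variable name n_obs are all made up for illustration.

```stata
clear
set seed 2012
set obs 500
* Simulated register years: every third year from 1750 to 1897
generate YEAR = 1750 + 3*floor(50*runiform())

preserve
* contract leaves one observation per YEAR, with the frequency in n_obs
contract YEAR, freq(n_obs)
twoway bar n_obs YEAR, scheme(s1mono) xlabel(1750(25)1900) ytitle("Number of observations")
restore
```

The plotting command is unchanged apart from the variable name, so the choice between table, replace and contract is mostly a matter of taste.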
[Figure: bar chart from twoway bar; vertical axis "Number of observations", 0 to 80,000; horizontal axis "Year", 1750 to 1900]
[Figure: the same counts as a scatter plot from twoway scatter; vertical axis "Number of observations", 0 to 80,000; horizontal axis "Year", 1750 to 1900]
Registers by year
• The number of available registers varies year by year.
• This accounts for some of the year to year fluctuation in numbers of observations
• In some cases, may also account for some of the year to year fluctuation in other summary values
• We can do a year by year count of the number of available registers easily enough
Registers by year
table YEAR DATASET, replace
table YEAR, replace
twoway bar table1 YEAR, scheme(s1mono) ytitle("Registers")
Let’s use angle and labsize on xlabel to label each register year individually
twoway bar table1 YEAR, scheme(s1mono) ytitle("Registers") xlabel(1750(3)1909, angle(vertical) labsize(vsmall))
• Note that coverage is much more sparse before 1789.
• Some years (1810) are missing an especially large number of registers
• No registers at all from 1888 to 1903
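The two table commands work in sequence: the first collapses the data to one row per combination of YEAR and DATASET, and the second counts those rows within each YEAR, which gives the number of registers. A sketch of the same idea with egen and collapse on a five-row toy dataset follows. The values and the names one_per_register and registers are illustrative, and DATASET is treated as the register identifier, as the slide's usage implies.

```stata
clear
input YEAR DATASET
1789 1
1789 1
1789 2
1792 1
1792 1
end

* Flag exactly one observation per YEAR-DATASET combination
egen byte one_per_register = tag(YEAR DATASET)

* Summing the flags counts the distinct registers in each year
collapse (sum) registers = one_per_register, by(YEAR)
list
```

In this toy example 1789 has two registers and 1792 has one.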
[Figure: bar chart of registers by year; vertical axis "Registers", 0 to 30; horizontal axis "Year", labelled every 9 years from 1750 to 1912]
[Figure: the same chart with xlabel(1750(3)1909, angle(vertical) labsize(vsmall)); every third year from 1750 to 1909 labelled vertically in very small type]
Population by age group
Let’s use TABLE to look at the distribution of the population by age group
keep if PRESENT & AGE >= 1 & AGE <= 75
recode AGE_IN_SUI 1/15=1 16/55=16 56/75=56, generate(AGE_GROUP)
tab AGE_GROUP SEX if SEX >= 1, col row
table AGE_GROUP SEX if SEX >= 1, col row
recode maps values of an existing variable to new values, based on the specified rule. If generate is not specified, it transforms the existing variable. If generate is specified, it creates a new variable with the new values. In this case, all values of AGE_IN_SUI from 1 through 15 are converted to 1, 16 through 55 are converted to 16, and so forth.
Output of tab (each cell shows the frequency, then the row percentage, then the column percentage):
 RECODE of |
AGE_IN_SUI |
   (Age in |         Sex
      Sui) |    Female       Male |     Total
-----------+----------------------+----------
         1 |    36,300    234,332 |   270,632
           |     13.41      86.59 |    100.00
           |      6.88      28.12 |     19.88
-----------+----------------------+----------
        16 |   393,977    500,381 |   894,358
           |     44.05      55.95 |    100.00
           |     74.67      60.04 |     65.71
-----------+----------------------+----------
        56 |    97,333     98,716 |   196,049
           |     49.65      50.35 |    100.00
           |     18.45      11.84 |     14.40
-----------+----------------------+----------
     Total |   527,610    833,429 | 1,361,039
           |     38.77      61.23 |    100.00
           |    100.00     100.00 |    100.00

Output of table:
-------------------------------------
RECODE of |
AGE_IN_SU |
I (Age in |            Sex
Sui)      |  Female     Male    Total
----------+--------------------------
        1 |  36,300  234,332  270,632
       16 | 393,977  500,381  894,358
       56 |  97,333   98,716  196,049
          |
    Total | 527,610  833,429  1361039
-------------------------------------
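recode can also attach value labels while it creates the new variable, which makes tables easier to read than the bare codes 1, 16, and 56. A minimal sketch on a handful of made-up ages follows. The label text and the toy values are assumptions for illustration, not part of the lecture.

```stata
clear
input AGE_IN_SUI
1
15
16
40
55
56
75
end

* Each rule in parentheses may end with a quoted label for the new value
recode AGE_IN_SUI (1/15 = 1 "1-15 sui") (16/55 = 16 "16-55 sui") (56/75 = 56 "56-75 sui"), generate(AGE_GROUP)
tabulate AGE_GROUP
```

The underlying codes are still 1, 16, and 56, so later commands that refer to those values keep working.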
Counts, averages, proportions by age and time
• There are a variety of options for collapsing observations to produce counts, proportions, averages, etc. by year, age, etc.
• One simple approach is the table command, combined with the replace option
• This replaces the dataset in memory with a ‘collapsed’ version
• Values in the ‘collapsed’ version can be plotted with twoway bar etc.
table AGE_GROUP SEX if SEX >= 1, by(YEAR) replace
* Entries created for totals have missing values for AGE_GROUP
drop if AGE_GROUP == .
reshape wide table1, i(YEAR SEX) j(AGE_GROUP)
* Also need to remove newly created totals with missing values for SEX
drop if SEX == .
reshape wide table11 table116 table156, i(YEAR) j(SEX)
generate male_proportion_16_55 = table1162/(table112+table1162+table1562)
twoway bar male_proportion_16_55 YEAR, ytitle("Proportion of males who are 16 to 55 sui") xtitle("Year") ylabel(0(0.1)1) scheme(s1mono)
generate male_dependency_ratio = (table112+table1562)/(table1162)
twoway bar male_dependency_ratio YEAR, ytitle("Male dependency ratio ((1-15 + 56-75)/(16-55))") xtitle("Year") ylabel(0(0.1)1) scheme(s1mono)
generate child_sex_ratio = table112/table111
twoway bar child_sex_ratio YEAR, ytitle("Ratio of males to females aged 1-15 sui") xtitle("Year") scheme(s1mono) yscale(log) ylabel(1 2 5 10 20 50 100 200)
Reshape
• Notice that TABLE (and COLLAPSE) will produce one observation for each combination of YEAR, age_group, and SEX
• 50*3*2=300 observations (approximately)
  – 299 in reality because one cell is empty
• We would like one observation per year
  – In order to carry out calculations
• Use reshape to convert to one observation per combination of YEAR and SEX, with three variables, one for each of the age groups
• Use reshape again to convert to one observation per YEAR, with six variables per observation, one for each combination of SEX and age_group
• Can calculate dependency ratios, sex ratios etc. from these numbers
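The two-step reshape is easier to follow on a tiny example. The sketch below uses one invented year with six made-up counts, held in a variable called n rather than table1. It shows how the suffixes accumulate: the first reshape appends the age group to the stub, and the second appends the sex code, with 1 for female and 2 for male as in the lecture's tables.

```stata
clear
input YEAR SEX AGE_GROUP n
1800 1 1  10
1800 1 16 40
1800 1 56 5
1800 2 1  30
1800 2 16 50
1800 2 56 6
end

* Step 1: one row per YEAR and SEX; creates n1, n16, n56
reshape wide n, i(YEAR SEX) j(AGE_GROUP)

* Step 2: one row per YEAR; creates n11 n12 n161 n162 n561 n562
reshape wide n1 n16 n56, i(YEAR) j(SEX)

* Male dependency ratio: (males 1-15 + males 56-75) / males 16-55
generate male_dependency_ratio = (n12 + n562) / n162
list YEAR male_dependency_ratio
```

With these made-up counts the ratio is (30 + 6)/50 = 0.72. The variable n162 plays the same role here as table1162 in the lecture's code.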
[Figure: bar chart by year; vertical axis "Proportion of males who are 16 to 55 sui", 0 to 1; horizontal axis "Year", 1750 to 1900]
[Figure: bar chart by year; vertical axis "Male dependency ratio ((1-15 + 56-75)/(16-55))", 0 to 1; horizontal axis "Year", 1750 to 1900]
[Figure: bar chart by year; vertical axis "Ratio of males to females aged 1-15 sui", log scale labelled 1, 2, 5, 10, 20, 50, 100, 200; horizontal axis "Year", 1750 to 1900]
Proportions/means
Proportion ever married by year
We can also calculate means of specified variables by YEAR, AGE_IN_SUI, or other variables of interest
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta" if PRESENT & AGE >= 16 & AGE <= 50 & SEX == 2 & MARITAL_STATUS >= 0, clear
recode AGE_IN_SUI 16/30=16 31/40=31 41/50=41, generate(age_group)
generate ever_married = MARITAL_STATUS != 2
table YEAR age_group, contents(mean ever_married) replace
twoway bar table1 YEAR if age_group == 16, ylabel(0(0.1)1) ytitle("Proportion of men 16-30 ever married") xtitle("Year") scheme(s1mono)
twoway bar table1 YEAR if age_group == 31, ylabel(0(0.1)1) ytitle("Proportion of men 31-40 ever married") xtitle("Year") scheme(s1mono)
[Figure: bar chart by year; vertical axis "Proportion of men 16-30 ever married", 0 to 1; horizontal axis "Year", 1750 to 1900]
[Figure: bar chart by year; vertical axis "Proportion of men 31-40 ever married", 0 to 1; horizontal axis "Year", 1750 to 1900]
Proportion married by age
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta" if PRESENT & AGE >= 1 & AGE <= 50 & SEX == 2 & MARITAL_STATUS >= 0, clear
generate ever_married = MARITAL_STATUS != 2
table AGE_IN_SUI, contents(mean ever_married) replace
twoway bar table1 AGE_IN_SUI, ylabel(0(0.10)1) ytitle("Proportion of males ever married") xtitle("Age in sui") scheme(s1mono)
[Figure: bar chart by age; vertical axis "Proportion of males ever married", 0 to 1; horizontal axis "Age in sui", 0 to 50]
Multiple trends in the same graph
keep if SEX == 2 & PRESENT & BIRTHYEAR >= 1750 & BIRTHYEAR <= 1900
keep if MARITAL_STATUS > 0
keep if AGE_IN_SUI >= 11 & AGE_IN_SUI <= 40
recode AGE_IN_SUI 11/15=11 16/20=16 21/25=21 26/30=26 31/35=31 36/40=36, generate(age_group)
generate ever_married = MARITAL_STATUS != 2
table BIRTHYEAR age_group, contents(mean ever_married) replace
twoway line table1 BIRTHYEAR if age_group == 11 || line table1 BIRTHYEAR if age_group == 16 || line table1 BIRTHYEAR if age_group == 21 || line table1 BIRTHYEAR if age_group == 26 || line table1 BIRTHYEAR if age_group == 31 || line table1 BIRTHYEAR if age_group == 36 || , scheme(s1mono) legend(order(1 "11-15 sui" 2 "16-20 sui" 3 "21-25 sui" 4 "26-30 sui" 5 "31-35 sui" 6 "36-40 sui")) ytitle("Proportion of males ever married")
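An alternative to chaining one line plot per age group is to reshape the collapsed data to wide form, so that each age group becomes its own variable and a single twoway line call draws all of them. This is a sketch on invented numbers, with three birth years and three age groups. The proportions are made up purely to have something to plot, and the approach is a suggestion rather than the lecture's method.

```stata
clear
input BIRTHYEAR age_group table1
1800 11 0.05
1800 16 0.30
1800 21 0.60
1810 11 0.04
1810 16 0.28
1810 21 0.55
1820 11 0.06
1820 16 0.33
1820 21 0.62
end

* One row per BIRTHYEAR; creates table111, table116, table121
reshape wide table1, i(BIRTHYEAR) j(age_group)

* One line per age-group variable, all against BIRTHYEAR
twoway line table111 table116 table121 BIRTHYEAR, scheme(s1mono) legend(order(1 "11-15 sui" 2 "16-20 sui" 3 "21-25 sui")) ytitle("Proportion of males ever married")
```

The legend order follows the order of the variables in the twoway line command, so the labels must be listed in the same sequence.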
[Figure: line plot by year of birth, one line per age group (11-15, 16-20, 21-25, 26-30, 31-35, 36-40 sui); vertical axis "Proportion of males ever married", 0 to 1; horizontal axis "Year of Birth", 1750 to 1900]
Using COLLAPSE
keep if PRESENT & SEX == 2 & AGE_IN_SUI > 1 & AGE_IN_SUI <= 60
mvdecode _all, mv(-99 -98)
generate MARRIED = MARITAL_STATUS == 1
By default, collapse will create variables of the same name containing means
collapse MARRIED SON_COUNT DAUGHTER_COUNT FATHER_ALIVE MOTHER_ALIVE BROTHER_COUNT, by(AGE_IN_SUI)
Notice use of legend to specify a label for each of the 5 lines
twoway line FATHER_ALIVE MOTHER_ALIVE MARRIED SON_COUNT BROTHER_COUNT AGE_IN_SUI, scheme(s1mono) legend(order(1 "Father alive" 2 "Mother alive" 3 "Wife alive" 4 "Sons ever born" 5 "Brothers alive")) ytitle("Mean") lpattern(solid solid dash dot dash_dot)
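One detail worth checking when building 0/1 indicators after mvdecode: a logical expression such as MARITAL_STATUS == 1 evaluates to 0, not missing, when MARITAL_STATUS is missing, so records with unknown marital status would enter the mean as unmarried. A hedged sketch of a missing-safe variant on toy data follows. The four values are invented, and whether this matters for the lecture's sample depends on how many such records remain after the keep.

```stata
clear
input MARITAL_STATUS
1
2
1
-99
end

* Convert the missing-value code to Stata's system missing
mvdecode MARITAL_STATUS, mv(-99 -98)

* The if qualifier leaves MARRIED missing where the status is unknown
generate MARRIED = MARITAL_STATUS == 1 if !missing(MARITAL_STATUS)
summarize MARRIED
```

Here the mean of MARRIED is computed over the three known records only, since collapse and summarize both ignore missing values.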
[Figure: line plot by age with five lines (Father alive, Mother alive, Wife alive, Sons ever born, Brothers alive); vertical axis "Mean", 0 to 1.5; horizontal axis "Age in Sui", 0 to 60]
Calculating rates
• Calculation of demographic rates by age and so forth is straightforward, using the AT_RISK_* and NEXT_* flag variables.
• Let’s calculate and compare probability of marriage in the next three years by age, for men and women
keep if AT_RISK_MARRY == 1 & SEX > 0 & AGE_IN_SUI > 0 & AGE_IN_SUI <= 30
collapse NEXT_MARRY, by(AGE_IN_SUI SEX)
twoway line NEXT_MARRY AGE_IN_SUI if SEX == 1 || line NEXT_MARRY AGE_IN_SUI if SEX == 2 || , legend(order(1 "Female" 2 "Male")) scheme(s1mono)
[Figure: line plot by age with separate lines for Female and Male; vertical axis "Proportion marrying in next 3 years", 0 to .6; horizontal axis "Age in Sui", 1 to 31]