Demographers in SAS® – SAS® in Demography
KLÁRA HULÍKOVÁ
DEPARTMENT OF DEMOGRAPHY AND GEODEMOGRAPHY
FACULTY OF SCIENCE
CHARLES UNIVERSITY, CZECH REPUBLIC
Introduction
Why talk about demography in connection with SAS®?
◦ Demography studies the reproduction of populations – population growth, fertility, mortality, migration flows, population structure, and the factors of population development
◦ With increasing data availability and complexity, ever more sophisticated analytical methods are a crucial part of demographers' work
◦ Applications of demography
  ◦ Study and description of past & current demographic trends
  ◦ Life insurance, pension & social reforms, population policy, demographic and social forecasts
  ◦ Historical studies, study of demographic trends in developing countries
  ◦ Questions about sustainable population development and its conditions, risk of social conflicts
Demography is a rather small field of study, but one of growing importance…
Introduction
Increasing data availability and complexity
→ Need for more sophisticated and complex methods
  ◦ We cannot use them → still the same results & no new results
  ◦ We can use them → new results & new knowledge
Structure of the presentation
Part 1: Demographers in SAS®
◦ Situation at the Faculty of Science, Charles University – ways and methods of SAS® education
◦ Cooperation with the SAS® Institute, Czech Republic
Part 2: SAS® in Demography
◦ Selected case studies
◦ Real examples of current demographic research based on SAS®
How we teach SAS®
SAS® courses are part of the Master's studies in Demography
◦ Demographic applications in SAS I.
◦ Demographic applications in SAS II.
◦ Demographic applications in SAS III.
◦ Elementary econometrics
◦ Demography in Life Insurance
◦ Demographic applications in SAS I.
  ◦ Introduction to SAS programming
  ◦ Work with data sets
  ◦ Basic procedures – SORT, MEANS, TABULATE, FREQ, etc.
◦ Demographic applications in SAS II.
  ◦ Demographic standardization – proc STDRATE
  ◦ Plots
  ◦ Survival analysis, Cox regression – proc LIFETEST, LIFEREG, PHREG
  ◦ Logistic regression – proc LOGISTIC
◦ Demographic applications in SAS III.
  ◦ Specific issues in SAS® – box plots, cluster analysis
  ◦ Generalized linear models (GENMOD)
  ◦ Macros in SAS®
◦ Elementary econometrics
  ◦ Linear regression in SAS® – proc REG
  ◦ Basics of time series analysis
◦ Demography in Life Insurance
  ◦ Demographic modelling in SAS® – mortality models
◦ Our courses are traditionally focused on SAS® Base (not SAS EG)
◦ Knowledge of programming and its principles makes it possible to switch between statistical software packages when needed
How to teach the basics
Difficult start of anything new, including programming
… SAS® as a communication tool (between the computer and the user)
Please, draw a histogram for me:
proc univariate data=example;
var residuals;
histogram / normal(mu=est var=est) kernel;
run;
The answer in the SAS® log:
histogram / normal(mu=est var=est) kernel;
ERROR 22-322: Syntax error, expecting one of the following: a numeric constant, a datetime constant, ), COLOR, CONTENTS, FILL, L, MIDPERCENTS, MU, NOPRINT, PERCENT, PERCENTS, SIGMA, W.
ERROR 76-322: Syntax error, statement will be ignored.
?! Sorry, I do not understand you fully – I do not know some word(s)
Sorry, do you understand it now?
proc univariate data=example;
var residuals;
histogram / normal(mu=est sigma=est) kernel;
run;
Read as a sentence:
◦ SAS, please, use the procedure univariate and apply it to my data called (=) example;
◦ and please analyze the variable called residuals;
◦ I would like to see a histogram and moreover (/) a comparison with the normal distribution, with characteristics estimated from the observed values (mu=est sigma=est), and also with the kernel density estimate;
◦ Please, SAS, run it for me;
The answer in the SAS® log:
NOTE: PROCEDURE UNIVARIATE used (Total process time):
real time 0.33 seconds
cpu time 0.15 seconds
Great! I did it!
Sometimes, SAS® is the best teacher of SAS®…
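To repeat this dialogue in a fresh SAS® session, the data set has to exist first. The following is only a minimal sketch: it simulates a data set named example with a variable named residuals (the simulated values, the seed and the sample size are assumptions made for illustration; the slides do not show the original data) and then submits the corrected step.

```sas
/* Hypothetical stand-in for the data set used on the slides:
   200 simulated residuals from a standard normal distribution. */
data example;
  call streaminit(2016);              /* fixed seed, so the run is repeatable */
  do i = 1 to 200;
    residuals = rand('normal', 0, 1);
    output;
  end;
  drop i;
run;

/* Corrected request: the normal curve takes MU= and SIGMA=, not VAR=. */
proc univariate data=example;
  var residuals;
  histogram / normal(mu=est sigma=est) kernel;
run;
```

Changing sigma=est back to var=est should reproduce the kind of syntax error shown in the log above, which makes the step a convenient classroom exercise.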
Cooperation with the SAS® Institute, Czech Republic
Cooperation between the faculty and the SAS® Institute has been running for several years
◦ Academic licenses for research, education and students
◦ E-learning
◦ Support for teachers – consultations, materials
◦ SAS Prize
◦ SAS Certificates for the best students
◦ Workshops
SAS® in Demography: Case study 1
Latest development of fertility or mortality
◦ How to describe such a complex process as:
  ◦ Change of the age pattern of fertility and mortality – what is the intensity at each age
  ◦ Change of the overall fertility or mortality level, as well as of its age pattern, over time
◦ Why deal with it?
  ◦ Importance for demographic forecasts
  ◦ Detection of period or cohort effects, which reflect the effects of population policies or the specific behavior of some cohorts (generations)
  ◦ Could bring important knowledge for population policy, human resources, pension reforms, etc.
◦ What to use?
  ◦ Analytical methods
  ◦ Graphical methods – visualization of data is a contemporary trend in demography too
Hulíková Tesárková, K. 2013. Selected demographic methods of mortality analysis: Approaches focused on adults and the oldest age-groups using primarily cross-sectional data. LAP LAMBERT Academic Publishing (June 5, 2013), 404 pp. ISBN 978-3659404139.
[Figure: Latest development of fertility in Italy]
[Figure: History of mortality in Sweden]
SAS® in Demography: Case study 2
Mortality models:
◦ Express the intensity of mortality using a simple parametric function
◦ Smoothing of random variations and irregularities
◦ Possibility of extrapolating mortality to the highest ages, where reliable data are usually not available
◦ Important e.g. for life insurance, pension reforms, etc.
◦ Why deal with it?
  ◦ Important topic for pension agencies (preparation of pension reforms in aging populations)
  ◦ High importance for life insurance companies – mortality has to be estimated as precisely as possible
◦ What do we have to use?
  ◦ Generalized regression – estimation of parameters
  ◦ Generalization of the results
Hulíková, K., Burcin, B., Pachlová, T., Kašpar, D. 2016. Parametric Mortality Smoothing: Deciding on the Optimal Method. Annual Conference of Population Association of America 2016, 28th March – 2nd April, 2016, Washington D.C., USA
Analysis of mortality models
Many methods (functions) could be used
→ Current research: which function should be used?
→ Application of several functions to as many real data sets as possible
  ◦ Circa 5 200 populations – defined by country, sex or year
  ◦ Application of the 6 most often used functions – weighted generalized least squares
→ Evaluation of all the methods according to their suitability
→ Factors tied to the suitability of each particular model
Analysis of mortality models
Circa 5 200 populations
Estimation of the parameters of 6 different models using weighted generalized least squares
Evaluation of suitability using the AIC (Akaike Information Criterion)
Multinomial logistic regression
◦ Dependent (explained) variable: the most suitable model for each population
◦ Explanatory variables: calendar year, sex, country or region of the country, and the overall level of mortality expressed by life expectancy (life expectancy at a particular age is one of the basic demographic indicators; it represents the average number of years a person of that age is expected to live)
Generalization – which characteristics of a population could be tied to the suitability of each particular model
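The multinomial step could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: the data set best_model, its variables (year, sex, e0, best_function) and the labels model_A to model_C are hypothetical, and the data are simulated so that the block can be run on its own. Only three candidate functions and no country variable are used, to keep the sketch short.

```sas
/* Simulated stand-in for the real results: one row per population,
   carrying the label of the function with the lowest AIC.
   All names and numbers are hypothetical. */
data best_model;
  call streaminit(5200);
  length sex $6 best_function $8;
  do year = 1950 to 2010 by 5;
    do sex = 'male', 'female';
      do rep = 1 to 20;
        /* life expectancy rises with calendar year and is higher for females */
        e0 = 62 + 0.2*(year - 1950) + 5*(sex = 'female') + rand('normal', 0, 2);
        /* chance that model_A fits best grows with life expectancy */
        q = 1 / (1 + exp(-(e0 - 72)/4));
        pick = rand('table', 0.8*q, 0.8*(1 - q));   /* returns 1, 2 or 3 */
        select (pick);
          when (1) best_function = 'model_A';
          when (2) best_function = 'model_B';
          otherwise best_function = 'model_C';
        end;
        output;
      end;
    end;
  end;
  keep year sex e0 best_function;
run;

/* Multinomial (generalized logit) regression of the best-fitting function */
proc logistic data=best_model;
  class sex / param=ref;
  model best_function(ref='model_C') = year sex e0 / link=glogit;
run;
```

With link=glogit, each non-reference function gets its own set of coefficients relative to the reference category, which is what allows statements such as "the suitability of one model increases with life expectancy".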
Analysis of mortality models – examples of results
It was possible to confirm
◦ A systematic increase in the suitability of one of the models with increasing life expectancy (i.e. with improving mortality)
[Figure panels: MALES, FEMALES]
◦ On the other hand, the suitability of another function (in this case one currently used e.g. in Central European countries) is rapidly decreasing
[Figure panels: MALES, FEMALES]
Conclusion:
◦ Calculation of the official mortality tables could be modified
Analysis of mortality models – advantages of SAS® usage
Circa 5 200 x 6 (= 31 200) estimated models
◦ Macros
◦ Proc NLIN
Multinomial logistic regression applied to all the 31 200 cases
◦ Proc LOGISTIC
Advantages:
◦ Quick
◦ Effective
◦ Comprehensive – a single software package covers the analysis as well as the summary analysis, graphs and tables
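How a macro can wrap PROC NLIN so that the same model is fitted to many populations might look like the sketch below. It rests on several assumptions: the Gompertz function is used only because it is a well-known mortality function (the slides do not name the six functions), weighting by exposure is a simplified stand-in for the weighted generalized least squares mentioned above, and the data set mortality, the macro fit_gompertz and the population labels are hypothetical.

```sas
/* Simulated death rates for two hypothetical populations, ages 60-95 */
data mortality;
  call streaminit(31200);
  length population $8;
  do population = 'pop_A', 'pop_B';
    do age = 60 to 95;
      exposure = 10000 * exp(-0.08*(age - 60));   /* fewer people at high ages */
      mx = 0.00002 * exp(0.11*age) * exp(rand('normal', 0, 0.05));
      output;
    end;
  end;
run;

/* Fit mx = b*exp(c*age) to one population and store the estimates */
%macro fit_gompertz(pop);
  ods output ParameterEstimates = est_&pop;
  proc nlin data=mortality(where=(population = "&pop"));
    parms b = 0.00003 c = 0.105;       /* starting values (assumed) */
    model mx = b * exp(c * age);
    _weight_ = exposure;               /* assumed weights, for illustration */
  run;
%mend fit_gompertz;

%fit_gompertz(pop_A)
%fit_gompertz(pop_B)
```

For thousands of populations the macro calls would normally be generated from a list of populations rather than typed by hand; the resulting est_ data sets can then be stacked and passed on to the summary analysis.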
SAS® in Demography: Case study 3
Historical demography:
◦ Historical demography studies the reproduction of historical populations – e.g. in the 18th century
◦ Specific sources of data – parish registers
◦ Specific types of data and data problems – family histories, missing data
◦ Why deal with it?
  ◦ The aim is to learn more about past demographic trends and the factors influencing them
  ◦ Significant relevance to demographic development in currently developing countries
◦ What do we have to use?
  ◦ Event history analysis, Cox regression, censoring
  ◦ Tools for effective data organization
[Figure: timeline of an example family history: Marriage → Birth of the 1st child → Birth of the 2nd child → Death of the 1st child → Birth of the 3rd child → Birth of the 4th child → Death of the 4th child → Death of the man (= end of marriage) → Death of the woman]
Fialová, L., Hulíková, K., Kuprová, B. 2016. What stood behind the length of birth intervals in the past: a case study of Jablonec nad Nisou (Czech lands) from 17th to 19th century. Journal of Family History. [in the review process]
Historical demography
What are parish registers?
Historical demography
Original data structure (one row per family):
ID | Date of marriage | Date of birth (woman) | Date of birth (man) | Date of death (woman) | Date of death (man) | Date of birth – 1st child | Date of death – 1st child | Date of birth – 2nd child | Date of death – 2nd child
1 |
2 |
Re-organization of the data when the aim was to study reproduction (births of children, inter-birth intervals, child mortality)
◦ The data were no longer organized by family (ID); instead, each row represented one child together with the data about the corresponding family
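The re-organization from one row per family to one row per child can be sketched with a DATA step and arrays. Everything below is illustrative: the two families and their dates are invented, the variable names are hypothetical, and only two children per family are handled to keep the sketch short.

```sas
/* Invented example of the family-level (wide) structure */
data families;
  input id marriage :date9. birth1 :date9. death1 :date9.
        birth2 :date9. death2 :date9.;
  format marriage birth1 death1 birth2 death2 date9.;
  datalines;
1 12FEB1750 03JAN1751 . 20NOV1752 15MAR1753
2 05JUN1761 30APR1762 11AUG1762 . .
;
run;

/* One output row per child that was actually born */
data children;
  set families;
  array b{2} birth1-birth2;
  array d{2} death1-death2;
  do parity = 1 to 2;
    if not missing(b{parity}) then do;
      birth = b{parity};
      death = d{parity};              /* stays missing if no death was recorded */
      /* interval in days since the marriage or since the previous birth */
      if parity = 1 then interval_days = birth - marriage;
      else interval_days = birth - b{parity - 1};
      output;
    end;
  end;
  format birth death date9.;
  keep id marriage parity birth death interval_days;
run;
```

In this invented example family 1 contributes two rows and family 2 one row, each carrying the family identifier and the date of marriage alongside the child's own dates.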
Historical demography
Analysis
◦ Instead of dates of birth and death → time durations between specific events:
  ◦ Time from marriage to the birth of the 1st child
  ◦ Time between two consecutive births (birth interval)
  ◦ Time from birth to the death of the child
◦ Survival analysis, Cox regression
  ◦ Study of explanatory factors of the observed durations
  ◦ PHREG procedure
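A Cox regression of birth intervals with PROC PHREG might be set up as in the following sketch. It is not the model from the study: the data are simulated, the variable names (interval_months, censored, mother_age, prev_child_died) and the 60-month censoring limit are assumptions, and only two explanatory variables are included.

```sas
/* Simulated birth intervals (in months) for 500 hypothetical children */
data intervals;
  call streaminit(1800);
  do child = 1 to 500;
    mother_age      = 20 + floor(20 * rand('uniform'));
    prev_child_died = rand('bernoulli', 0.25);
    /* longer intervals for older mothers, shorter after an infant death */
    interval_months = rand('weibull', 2, 30)
                      * exp(0.02*(mother_age - 30) - 0.4*prev_child_died);
    /* intervals still open after 60 months are treated as censored */
    censored = (interval_months > 60);
    if censored then interval_months = 60;
    output;
  end;
run;

/* Cox proportional hazards model; censored = 1 marks censored intervals */
proc phreg data=intervals;
  model interval_months*censored(1) = mother_age prev_child_died
        / ties=efron risklimits;
run;
```

A hazard ratio above 1 corresponds to a higher "risk" of the next birth, that is, to shorter intervals, so the signs have to be read in the opposite direction to effects on interval length.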
Historical demography – analysis
Analyzed locality: Jablonec – a town in the northern part of the Czech Republic
Studied period: 18th century
Analyzed data: around 2 thousand families → more than 10 thousand children and data about their families
Studied behavior:
◦ Frequency of births in families and birth intervals – as an indicator of the type of reproductive behavior (tempo of family growth, level of reproductive health, or effects of infant mortality on fertility)
◦ Mortality of children within the first 10 years of life – as an indicator of health conditions and child care
Specific features of the study:
◦ Still rather unique in the world; classical historical demographic studies were not based on modern statistical methods
◦ Relatively big data set
Historical demography – analysis
Studied effects of various explanatory variables:
◦ In the analysis of the length of birth intervals: birth parity, age of the mother at the birth of the child, age of the mother at marriage, total number of births in the family, reverse birth order, and survival status of the previous child
◦ In the analysis of child mortality: age of the mother at the birth of the child, age of the mother at marriage, total number of births in the family, reverse birth order, survival status of the previous child in interaction with its sex, sex structure of the previous children in the family, and length of the previous birth interval
Historical demography – selected results
Average impact on the length of birth intervals
Prolongation
• With increasing age of the mother at the birth of the child
Shortening
• With increasing age of the mother at marriage
• With a higher number of children born in the family
• With increasing reverse birth order (longer interval before the birth of the last child in the family)
• When the previous child died within 12 months after birth
Historical demography – selected results
Average impact on child mortality
Increase (higher mortality)
• For a higher age of the mother at the birth of the child (above 25)
• In families with a higher total number of births
• For the last child born in the family
Decrease (lower mortality)
• When the previous child survived at least the first 12 months of life
• With each additional month of the birth interval
Historical demography – advantages of SAS® usage
Advantages of the more sophisticated methods used in the study
◦ The pure (net) effect of each explanatory variable was estimated
◦ Cox regression – confirmation of classical studies and more detailed results
◦ Several new and not yet fully described factors affecting the length of birth intervals and child mortality in the past were revealed
What we know from the study: reproductive behavior in 18th-century Europe was still influenced above all by natural processes (decrease of fecundity with age, increase of child mortality with the age of the mother, or shortening of the birth intervals)
Currently: cooperation with colleagues from developing countries – similar results are also being confirmed for the least developed populations today
Conclusion
Using the SAS® software enabled us to:
◦ Carry out more detailed and effective analyses
◦ Use more sophisticated methods
◦ Work effectively with large datasets
◦ Reveal some facts for the first time
Our future aims:
◦ Incorporate the SAS® software more into our research as well as education
◦ Motivate our students to use SAS® for their own research and master's/doctoral theses
◦ Encourage our students to participate in the SAS Prize and the SAS workshops organized for them, and then to use SAS in their professional practice
◦ Find effective ways to teach and explain SAS programming to beginners