SADC Course in Statistics Setting the scene (Session 01)

Page 1: SADC Course in Statistics Setting the scene (Session 01)

SADC Course in Statistics

Setting the scene

(Session 01)

Page 2: SADC Course in Statistics Setting the scene (Session 01)

Learning Objectives

At the end of this session, you will be able to

• recognise situations where statistical modelling is relevant

• understand the purpose of modelling

• for a given scenario, identify the key response variable of interest and the potential factors that may affect the variation in that response

Page 3: SADC Course in Statistics Setting the scene (Session 01)

Session Contents

In this session you will be

• provided with examples of situations where modelling is relevant to answer questions of importance in policy decisions

• given the opportunity to explore examples in order to develop some insight into modelling ideas

• introduced to the associated terminology

Page 4: SADC Course in Statistics Setting the scene (Session 01)

Examples where modelling is relevant

Two examples will be discussed initially…

• Child malnutrition and feeding practices in Malawi, in Food and Nutrition Bulletin, Volume 18, No. 2, 1997. United Nations University Press, Tokyo, Japan

• Gender-sensitive education statistics and indicators, in UNESCO Training Materials for workshops on Education Statistics and Indicators in Ghana (1996), Côte d’Ivoire (1997).

Page 5: SADC Course in Statistics Setting the scene (Session 01)

Example 1 - Nutrition:

The data come from the Malawi Demographic and Health Survey, 1992. Primary interest was in identifying factors affecting malnutrition. The factors were:

• gender, age, birth size, type of breast feeding, maternal education & area of residence amongst 4-11 month old infants

• age, birth size, preceding and succeeding birth interval, if still breast feeding, no. of days with diarrhoea in past 2 weeks and other household characteristics amongst 12-59 month old children

Page 6: SADC Course in Statistics Setting the scene (Session 01)

Example 2 - Education:

A cross-country study to determine factors which hinder gender equality in education. One outcome variable was a gender-equity sensitive indicator (GESI). Some factors studied were:

• Total fertility rate

• GNP per capita

• % female teachers in primary education

• Male & female enrolment ratios at primary and secondary education

Page 7: SADC Course in Statistics Setting the scene (Session 01)

Identifying response and regressor (explanatory) variables

In each of the above examples, there was a key response of interest. This is called the dependent variable, usually denoted by y.

Factors identified as possibly influencing the variability in y are called explanatory, or regressor variables. They form the x’s in the model. In statistical modelling, we assume they are measured without error.

What are the y and x’s in previous examples?

Page 8: SADC Course in Statistics Setting the scene (Session 01)

What is a statistical model?

A model is a simple equation which relates a key response (y) of interest to one or more other variables (x1, x2, …) which are believed to contribute to the variability in the key response.

For example, y = 38.1 - 1.91x, where y is perinatal mortality per 1000 live births and x is the number of health centres per 1000 households (HHs). This describes the relationship between mortality and availability of health facilities.
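
As a minimal sketch of how such an equation can be used, the Python function below evaluates the example line for a chosen number of health centres. Only the coefficients 38.1 and -1.91 come from the example; the function name and the input values 0, 2 and 5 are assumptions chosen for illustration.

```python
def predicted_perinatal_mortality(health_centres_per_1000_hh: float) -> float:
    """Evaluate the example model y = 38.1 - 1.91x.

    x is the number of health centres per 1000 households;
    y is perinatal mortality per 1000 live births.
    """
    intercept = 38.1  # value of y when x = 0
    slope = -1.91     # change in y for each extra health centre per 1000 households
    return intercept + slope * health_centres_per_1000_hh


# Illustrative inputs only (not taken from any survey).
for x in (0, 2, 5):
    print(x, round(predicted_perinatal_mortality(x), 2))
```

The negative slope means that, under this equation, each additional health centre per 1000 households is associated with 1.91 fewer perinatal deaths per 1000 live births.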

Page 9: SADC Course in Statistics Setting the scene (Session 01)

Purpose of Modelling

• To determine a simple summary of the way that a key response (y) relates to a set of x’s

• To understand factors (x’s) affecting y

• To use the model equation to make predictions about y

• To determine which values of the x’s will optimise y in some way
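
The first and third purposes above (summarising and predicting) can be sketched in a few lines of Python. This sketch assumes a straight-line model fitted by ordinary least squares, a method this session has not yet introduced, and the data values are made up purely for illustration. The names `fit_line` and `predict` are hypothetical helpers, not part of the course material.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Use the fitted equation to predict y at a new value of x."""
    return intercept + slope * x


# Made-up values for illustration only: x is a regressor, y a key response.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [36.0, 34.5, 32.0, 30.5, 28.0]

intercept, slope = fit_line(xs, ys)
print("summary: y = %.2f + (%.2f)x" % (intercept, slope))
print("prediction at x = 6:", round(predict(intercept, slope, 6.0), 2))
```

The two fitted numbers are the "simple summary" of how y relates to x, and the same numbers are reused to predict y at a value of x that was not observed.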

Page 10: SADC Course in Statistics Setting the scene (Session 01)

Types of key response

In the simplest type of statistical modelling, the key response is a quantitative measurement, assumed to follow a normal distribution. This module focuses on such responses.

However, there are other types of key response. Often the response is a binary variable, e.g. whether or not a household is below the poverty line, whether or not contraceptives are used, or whether or not a person is HIV positive.

Page 11: SADC Course in Statistics Setting the scene (Session 01)

Example 3: a binary response

See Impact of HIV on tuberculosis in Zambia: a cross-sectional study, in British Medical Journal, 1990, Vol.301, pp.412-5

This includes studying the relationship of HIV-1 antibody state (yes/no) to

• years of full-time education

• housing (no. of people sharing bedroom)

• marital state (married, single, other)

• history of treatment for sexually transmitted diseases (yes/no)

Page 12: SADC Course in Statistics Setting the scene (Session 01)

Example 4: a multinomial response

See Patterns of Tobacco Use in the Early Epidemic Stages: Malawi and Zambia, 2000-2002, in American J of Public Health, 2005, Vol. 95, No. 6, pp. 1009-1015.

This was a study relating tobacco use (none, light smoker, heavy smoker) to

• age, education, occupation, religion, and

• residence (rural/urban), and

• marital status (married, single, other)

Page 13: SADC Course in Statistics Setting the scene (Session 01)

Types of regressor variables

In the above examples, the explanatory (regressor) variables can be:

• quantitative measurements, e.g. years of education;

• ordered categorical variables, e.g. extent of smoking (low, medium, high);

• nominal variables, e.g. type of occupation;

• binary variables, e.g. possessing a specific asset or not.

Quantitative x’s will be considered in sessions 1-10, and other types in later sessions.
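
To make the four types of regressor concrete, the sketch below shows one common way each type might be turned into numbers before modelling. It is an assumption for illustration, not the coding scheme used in this course: the occupation categories, the function name `encode_respondent` and the choice of integer scores for the ordered variable are all hypothetical.

```python
SMOKING_LEVELS = ["low", "medium", "high"]     # ordered categories
OCCUPATIONS = ["farmer", "trader", "teacher"]  # hypothetical nominal categories


def encode_respondent(years_education, smoking, occupation, owns_asset):
    """Return a dict of numeric x's for one respondent."""
    row = {}
    # Quantitative: used as measured.
    row["years_education"] = float(years_education)
    # Ordered categorical: position in the ordering (0 = low, 2 = high).
    row["smoking_level"] = SMOKING_LEVELS.index(smoking)
    # Nominal: one 0/1 indicator per category, with the first
    # category ("farmer") left out as the reference level.
    for occ in OCCUPATIONS[1:]:
        row["occupation_" + occ] = 1 if occupation == occ else 0
    # Binary: 1 if the asset is possessed, 0 otherwise.
    row["owns_asset"] = 1 if owns_asset else 0
    return row


print(encode_respondent(8, "medium", "teacher", True))
```

Nominal categories have no natural order, which is why they are coded with separate indicators rather than a single score.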

Page 14: SADC Course in Statistics Setting the scene (Session 01)

Practical work follows to ensure learning objectives are achieved…