
Virtual University of Pakistan

Lecture No. 3 Statistics and Probability

By: Miss Saleha Naghmi Habibullah

IN THE LAST LECTURE, YOU LEARNT:

• Concept of sampling
• Random versus non-random sampling
• Simple random sampling
• A brief introduction to other types of random sampling
• Methods of data collection

TOPICS FOR TODAY

• Data Representation
• Tabulation
• Simple bar chart
• Component bar chart
• Multiple bar chart
• Pie chart

The tree-diagram below presents an outline of the various techniques:

TYPES OF DATA

Qualitative
    Univariate: Frequency Table, Percentages, Pie Chart, Bar Chart
    Bivariate: Frequency Table, Component Bar Chart, Multiple Bar Chart

Quantitative
    Discrete: Frequency Distribution, Line Chart
    Continuous: Frequency Distribution, Histogram, Frequency Polygon, Frequency Curve


In today’s lecture, we will be dealing with various techniques for summarizing and describing qualitative data.

Qualitative
    Univariate: Frequency Table, Percentages, Pie Chart, Bar Chart
    Bivariate: Frequency Table, Component Bar Chart, Multiple Bar Chart

We will begin with the univariate situation, and will proceed to the bivariate situation.

Example

Suppose that we are carrying out a survey of the first-year students of a large co-educational college, and that in all there are 1200 first-year students in this college. We wish to determine:

• What proportion of the students have come from Urdu medium schools?
• What proportion have come from English medium schools?

Interview Results

We will have an array of observations as follows: U, U, E, U, E, E, E, U, ……

(U: Urdu medium) (E: English medium)

Question: What should we do with this data?

Obviously, the first thing that comes to mind is to count the number of students who said “Urdu medium” as well as the number of students who said “English medium”.
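The counting step can be sketched in a few lines of Python. This is only an illustration: the list `responses` holds just the eight observations shown above (the full survey would have 1200), and the variable names are assumptions made for the example.

```python
from collections import Counter

# Only the eight responses shown on the slide; the real survey has 1200.
responses = ["U", "U", "E", "U", "E", "E", "E", "U"]

# Counter tallies how many times each distinct answer occurs.
counts = Counter(responses)

print("Urdu medium:", counts["U"])
print("English medium:", counts["E"])
```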

This will result in the following table:

Medium of Institution    No. of Students (f)
Urdu                     719
English                  481
Total                    1200

Important: The technical term for the numbers given in the second column of this table is “frequency”. It means “how frequently something happens”.

Out of the 1200 students, 719 stated that they had come from Urdu medium schools.

Dividing the cell frequencies by the total frequency and multiplying by 100 we obtain the following:

Medium of Institution    f       %
Urdu                     719     59.9% ≈ 60%
English                  481     40.1% ≈ 40%
Total                    1200

$$\text{Percentage} = \frac{\text{frequency}}{\text{Total No. of Students}} \times 100$$
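As a quick check of the arithmetic, the percentages can be computed as follows. This is a minimal sketch; the dictionary name `frequencies` is an assumption made for the example, and the figures are those of the table.

```python
# Frequencies from the table: medium of institution -> number of students.
frequencies = {"Urdu": 719, "English": 481}

total = sum(frequencies.values())  # total number of students

# Percentage = frequency / total * 100 for each category.
percentages = {medium: f / total * 100 for medium, f in frequencies.items()}

for medium, pct in percentages.items():
    print(f"{medium}: {pct:.1f}%")
```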

Diagrammatical Representation of Data

A pie chart consists of a circle which is divided into two or more parts in accordance with the number of distinct categories that we have in our data.

Medium of Institution    f       Angle
Urdu                     719     215.7°
English                  481     144.3°
Total                    1200

[Figure: pie chart with two sectors, Urdu 60% and English 40%]

$$\text{Division of Circle} = \frac{\text{Cell Frequency}}{\text{Total Frequency}} \times 360^\circ$$

For the example that we have just considered, the circle is divided into two sectors, the larger sector pertaining to students coming from Urdu medium schools and the smaller sector pertaining to students coming from English medium schools.

How do we decide where to cut the circle? The answer is very simple! All we have to do is to divide the cell frequency by the total frequency and multiply by 360.

This process will give us the exact value of the angle at which we should cut the circle.
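The sector angles for the example can be worked out as in the sketch below. The variable names are assumptions made for the example; the rule itself (cell frequency divided by total frequency, times 360) is the one stated above.

```python
# Frequencies from the survey example.
frequencies = {"Urdu": 719, "English": 481}

total = sum(frequencies.values())

# Angle of each sector = cell frequency / total frequency * 360 degrees.
angles = {medium: f / total * 360 for medium, f in frequencies.items()}

for medium, angle in angles.items():
    print(f"{medium}: {angle:.1f} degrees")
```

The two angles add up to 360 degrees, so the sectors fill the whole circle.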

Diagrammatical Representation of Data

SIMPLE BAR CHART

A simple bar chart consists of horizontal or vertical bars of equal width and lengths proportional to values they represent.

Example

Suppose we have available to us information regarding the turnover of a company for 5 years as given in the table below:

Years                1965      1966      1967      1968      1969
Turnover (Rupees)    35,000    42,000    43,500    48,000    48,500


In order to represent the above information in the form of a bar chart, all we have to do is to take the year along the x-axis and construct a scale for turnover along the y-axis.

[Figure: empty axes, with the years 1965 to 1969 along the x-axis and a turnover scale from 0 to 50,000 along the y-axis]

Next, against each year, we will draw vertical bars of equal width and different heights in accordance with the turnover figures that we have in our table.


As a result we obtain a simple and attractive diagram as shown below.

[Figure: simple bar chart of turnover (0 to 50,000 rupees) for the years 1965 to 1969]

When our values do not relate to time, they should be arranged in ascending or descending order before charting.
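To illustrate the idea of bars of equal width and proportional length, here is a rough text-only sketch of the turnover chart. It draws horizontal rather than vertical bars, and the scale `UNIT` (one character per 500 rupees) and the helper `bar` are assumptions made purely for this example. The years are kept in chronological order because these values do relate to time.

```python
# Turnover in rupees, from the table in the example.
turnover = {1965: 35_000, 1966: 42_000, 1967: 43_500, 1968: 48_000, 1969: 48_500}

UNIT = 500  # hypothetical scale: one '#' per 500 rupees


def bar(value, unit=UNIT):
    """Return a bar whose length is proportional to the value."""
    return "#" * (value // unit)


# One bar per year, all of the same thickness (one text row each).
for year, value in turnover.items():
    print(f"{year} {bar(value)} {value:,}")
```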

BIVARIATE FREQUENCY TABLE

What we have just considered was the univariate situation. In each of the two examples, we were dealing with one single variable. In the example of the first year students of a college, our lone variable of interest was ‘medium of schooling’. And in the second example, our one single variable of interest was turnover.

Example

Suppose that along with the enquiry about the Medium of Institution we are also recording the sex of the student.

Student No.    Medium    Gender
1              U         F
2              U         M
3              E         M
4              U         F
5              E         M
6              E         F
7              U         M
8              E         M
:              :         :
:              :         :

Now this is a bivariate situation; we have two variables, medium of schooling and sex of the student.

Bivariate Frequency Table

In order to summarize the above information, we will construct a table called a Bivariate Frequency Table, containing a box head and a stub as shown below:

Med. \ Sex    Male    Female    Total
Urdu
English
Total

Box Head: the column headings (Male, Female, Total).

Stub: the row headings (Urdu, English, Total).


Next, we will count the number of students falling in each of the following four categories:

• Male student coming from an Urdu medium school.

• Female student coming from an Urdu medium school.

• Male student coming from an English medium school.

• Female student coming from an English medium school.
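Counting the four categories amounts to tallying (medium, gender) pairs. The sketch below does this for only the eight students listed in the example, so its counts are not the survey totals; the names `records` and `cell_counts` are assumptions made for the illustration.

```python
from collections import Counter

# (medium, gender) for the eight students listed in the example.
records = [
    ("U", "F"), ("U", "M"), ("E", "M"), ("U", "F"),
    ("E", "M"), ("E", "F"), ("U", "M"), ("E", "M"),
]

# Each distinct pair is one cell of the bivariate frequency table.
cell_counts = Counter(records)

for medium in ("U", "E"):
    for gender in ("M", "F"):
        print(medium, gender, cell_counts[(medium, gender)])
```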

As a result, suppose we obtain the following figures:

Med. \ Sex    Male    Female    Total
Urdu          202     517       719
English       350     131       481
Total         552     648       1200

Bivariate Frequency Table pertaining to two qualitative variables.
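The row and column totals follow directly from the four cell frequencies, as this small sketch shows. The nested dictionary `cells` is just one convenient layout chosen for the example.

```python
# Cell frequencies: medium -> sex -> number of students.
cells = {
    "Urdu": {"Male": 202, "Female": 517},
    "English": {"Male": 350, "Female": 131},
}

# Row totals: add across the sexes for each medium.
row_totals = {medium: sum(by_sex.values()) for medium, by_sex in cells.items()}

# Column totals: add across the media for each sex.
col_totals = {sex: sum(cells[m][sex] for m in cells) for sex in ("Male", "Female")}

grand_total = sum(row_totals.values())

print(row_totals, col_totals, grand_total)
```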

Let us now consider how we will depict the above information diagrammatically.

COMPONENT BAR CHART

This can be accomplished by constructing the component bar chart. The component bar chart is also known as the subdivided bar chart.

[Figure: component bar chart with one bar for Male and one for Female, each subdivided into Urdu and English, on a scale from 0 to 800]


In the above figure, each bar has been divided into two parts.

The first bar represents the total number of male students whereas the second bar represents the total number of female students.

As far as the medium of schooling is concerned, the lower part of each bar represents the students coming from English medium schools, whereas the upper part of each bar represents the students coming from the Urdu medium schools.

The advantage of this kind of a diagram is that we are able to ascertain the situation of both the variables at a glance. We can compare the number of male students in the college with the number of female students, and at the same time we can compare the number of English medium students among the males with the number of English medium students among the females.
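One way to see how each bar is subdivided is to compute where every component starts and ends. The sketch below is only illustrative: the helper `segments` is a hypothetical name, and the figures are those of the bivariate frequency table, with English medium at the bottom and Urdu medium on top.

```python
# Components of each bar, listed from bottom to top.
cells = {
    "Male": {"English": 350, "Urdu": 202},
    "Female": {"English": 131, "Urdu": 517},
}


def segments(components):
    """Return (label, bottom, top) for each component, stacked in order."""
    result, bottom = [], 0
    for label, value in components.items():
        result.append((label, bottom, bottom + value))
        bottom += value  # the next component starts where this one ends
    return result


stacked = {sex: segments(parts) for sex, parts in cells.items()}

for sex, parts in stacked.items():
    print(sex, parts)
```

The top of the last segment in each bar equals the total for that sex (552 males, 648 females), which is why the component bar chart shows totals and their components together.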

The next diagram to be considered is the Multiple Bar Chart.

MULTIPLE BAR CHART

Used in a situation where we have two or more related sets of data.

Example: Suppose we have information regarding the imports and exports of Pakistan for the years 1970-71 to 1974-75 as shown in the table below:

Years      Imports (Crores of Rs.)    Exports (Crores of Rs.)
1970-71    370                        200
1971-72    350                        337
1972-73    840                        855
1973-74    1438                       1016
1974-75    2092                       1029

Source: State Bank of Pakistan

A multiple bar chart is a very useful and effective way of presenting this kind of information.

This kind of a chart consists of a set of grouped bars, the lengths of which are proportionate to the values of our variables, and each of which is shaded or colored differently in order to aid identification.

With reference to the above example, we obtain the multiple bar chart shown ahead:

[Figure: Multiple Bar Chart representing Imports & Exports of Pakistan (1970-71 to 1974-75); grouped bars for Imports and Exports in each year, on a scale from 0 to 2500 crores of Rs.]
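A rough text-only version of such grouped bars can be sketched as follows. The scale `UNIT` (one character per 50 crores), the helper `grouped_bars`, and the use of `#` and `=` in place of different shading are all assumptions made for this illustration.

```python
# Imports and exports in crores of rupees, from the table in the example.
trade = {
    "1970-71": (370, 200),
    "1971-72": (350, 337),
    "1972-73": (840, 855),
    "1973-74": (1438, 1016),
    "1974-75": (2092, 1029),
}

UNIT = 50  # hypothetical scale: one character per 50 crores


def grouped_bars(imports, exports, unit=UNIT):
    """Return one bar per series, drawn with different characters."""
    return "#" * (imports // unit), "=" * (exports // unit)


for year, (imports, exports) in trade.items():
    imp_bar, exp_bar = grouped_bars(imports, exports)
    print(f"{year} Imports {imp_bar} {imports}")
    print(f"{'':7} Exports {exp_bar} {exports}")
    print()  # blank line separates one year's group from the next
```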

Difference between Component Bar Chart and Multiple Bar Chart

Component Bar Chart: Information is available regarding totals and their components. For example: total no. of male students, i.e. English medium and Urdu medium.

Multiple Bar Chart: No information regarding totals. For example: imports and exports do not add up to give you the totality of some one thing.

Quantitative Variable

Discrete Variable
• Frequency Distribution
• Line Chart

Continuous Variable
• Frequency Distribution
• Histogram
• Frequency Polygon
• Ogive

Example

Suppose we walk into the nursery class of a school and count the number of books and copies that the students have in their bags.

Suppose the numbers of books and copies are 3, 5, 7, 9 and so on.

Representation of Data in a Discrete Frequency Distribution

X        Tally              Frequency
3        |                  1
4        |||                3
5        ||||| ||||         9
6        ||||| ||||| |||    13
7        ||||| |||||        10
8        |||                3
9        ||||| |            6
Total                       45
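The tally column can be generated from the frequencies with a short helper. This is a sketch under one stated assumption: each complete group of five is written as five plain strokes, since the crossing stroke of a hand-drawn tally cannot be shown in plain text. The function name `tally` is hypothetical.

```python
def tally(frequency):
    """Return tally marks for a count, in space-separated groups of five."""
    groups, remainder = divmod(frequency, 5)
    marks = ["|||||"] * groups          # complete groups of five
    if remainder:
        marks.append("|" * remainder)   # leftover strokes
    return " ".join(marks)


# Frequencies of the number of books and copies, from the table.
frequency = {3: 1, 4: 3, 5: 9, 6: 13, 7: 10, 8: 3, 9: 6}

for x, f in frequency.items():
    print(x, tally(f), f)
```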

Graphical Representation of Discrete Data

[Figure: line chart with X (No. of books and copies, 3 to 9) on the horizontal axis and No. of students (0 to 14) on the vertical axis]

Relative Frequency Distribution

X        Frequency    Relative Frequency
3        1            1/45 x 100 = 2.22%
4        3            3/45 x 100 = 6.67%
5        9            9/45 x 100 = 20%
6        13           13/45 x 100 = 28.89%
7        10           10/45 x 100 = 22.22%
8        3            3/45 x 100 = 6.67%
9        6            6/45 x 100 = 13.33%
Total    45
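The relative frequencies can be reproduced with the sketch below, which divides each frequency by the total and multiplies by 100. The variable names are assumptions made for the example.

```python
# Frequencies of the number of books and copies, from the table.
frequency = {3: 1, 4: 3, 5: 9, 6: 13, 7: 10, 8: 3, 9: 6}

total = sum(frequency.values())

# Relative frequency (as a percentage) = frequency / total * 100.
relative = {x: f / total * 100 for x, f in frequency.items()}

for x, pct in relative.items():
    print(f"{x}: {frequency[x]}/{total} x 100 = {pct:.2f}%")
```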

Cumulative Frequency Distribution

X        Frequency    Cumulative Frequency
3        1            1
4        3            1+3 = 4
5        9            4+9 = 13
6        13           13+13 = 26
7        10           26+10 = 36
8        3            36+3 = 39
9        6            39+6 = 45
Total    45
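The running totals in the last column can be sketched with `itertools.accumulate` from the Python standard library, which returns successive partial sums. The variable names are assumptions made for the example.

```python
from itertools import accumulate

# Frequencies of the number of books and copies, in increasing order of X.
frequency = {3: 1, 4: 3, 5: 9, 6: 13, 7: 10, 8: 3, 9: 6}

# Each cumulative frequency is the sum of all frequencies up to that X.
cumulative = dict(zip(frequency, accumulate(frequency.values())))

for x, cf in cumulative.items():
    print(x, frequency[x], cf)
```

The last cumulative frequency equals the total number of students, 45.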

IN TODAY’S LECTURE, YOU LEARNT

• Tabular and diagrammatic representation of Qualitative data (univariate and bivariate)
• Tabular and diagrammatic representation of a Discrete Quantitative variable

IN THE NEXT TWO LECTURES, YOU WILL LEARN

Tabular and Diagrammatic representation of a Continuous Quantitative Variable.

• Continuous Frequency Distribution
• Histogram
• Frequency polygon
• Frequency curve
• Cumulative frequency distribution (continuous)
• Cumulative frequency polygon (Ogive)