Descriptive statistics
Transcript of Descriptive statistics
![Page 1: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/1.jpg)
Slide 1
Lecture by Dr Zahid Khan, King Faisal University, KSA.
Descriptive Statistics
![Page 2: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/2.jpg)
Slide 2
04/11/2023
Types of Statistical Analyses
Five Types of Statistical Analysis:
1. Descriptive analysis: used to describe the data set
2. Inferential analysis: used to generate conclusions about the population’s characteristics based on the sample data
![Page 3: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/3.jpg)
Slide 3
Descriptive Statistics
Tools for summarising, organising, simplifying data.
1. Tables & Graphs
2. Measures of Central Tendency
3. Measures of Variability
![Page 4: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/4.jpg)
Slide 4
Understanding Data Via Descriptive Analysis
Two sets of descriptive measures:
– Measures of central tendency: used to report a single piece of information that describes the most typical response to a question
– Measures of variability: used to reveal the typical difference between the values in a set of values
![Page 5: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/5.jpg)
Slide 5
Understanding Data Via Descriptive Analysis
Measures of Central Tendency:
– Mode: the value in a string of numbers that occurs most often
– Median: the value whose occurrence lies in the middle of a set of ordered values
– Mean: sometimes referred to as the “arithmetic mean”; the average value characterizing a set of numbers
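The three measures above can be computed with Python's standard `statistics` module. This is only a small sketch; the sample `values` is hypothetical and was chosen for illustration, not taken from the lecture.

```python
import statistics

# Hypothetical sample, chosen only for illustration
values = [2, 3, 3, 5, 7, 10]

mode = statistics.mode(values)      # value that occurs most often
median = statistics.median(values)  # middle of the ordered values
mean = statistics.mean(values)      # arithmetic mean (sum / count)

print("mode:", mode, "median:", median, "mean:", mean)
```

For this sample the mode is 3, the median is 4 (halfway between 3 and 5) and the mean is 5, which shows that the three measures need not agree.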
![Page 6: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/6.jpg)
Slide 6
Understanding Data Via Descriptive Analysis
Measures of Variability:
– Frequency distribution reveals the number (percent) of occurrences of each number or set of numbers
– Range identifies the maximum and minimum values in a set of numbers
– Standard deviation indicates the typical distance of the values from the mean; for a bell-shaped (normal) distribution, about 68% of the values lie within one standard deviation of the mean
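As a rough illustration of the range and the standard deviation, the sketch below uses a hypothetical sample (not lecture data) and the sample standard deviation, which divides by n − 1.

```python
import statistics

# Hypothetical sample, chosen only for illustration
values = [2, 3, 3, 5, 7, 10]

value_range = max(values) - min(values)  # distance between the extremes
sample_sd = statistics.stdev(values)     # sample standard deviation (n - 1)

print("range:", value_range)
print("standard deviation:", round(sample_sd, 2))
```

`statistics.pstdev` would give the population version (dividing by n) if the data were the whole population rather than a sample.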
![Page 7: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/7.jpg)
Slide 7
Descriptive statistics
If we wanted to characterize the students in a first year medical class we would find that they are:
– Young
– Fit
– Males & females
How young? How fit is this class? What is the distribution of males and females?
![Page 8: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/8.jpg)
Slide 8
Frequency distribution
The frequency with which observations are assigned to each category or point on a measurement scale.
– Most basic form of descriptive statistics
– May be expressed as a percentage of the total sample found in each category
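A frequency distribution of this kind can be sketched with `collections.Counter`. The blood-group sample below is hypothetical and only stands in for any categorical variable recorded for a class.

```python
from collections import Counter

# Hypothetical categorical observations (blood groups of ten students)
blood_groups = ["A", "B", "A", "O", "A", "B", "AB", "O", "A", "O"]

counts = Counter(blood_groups)   # frequency of each category
total = sum(counts.values())
percent = {group: 100 * n / total for group, n in counts.items()}

for group, n in counts.most_common():
    print(f"{group:>2}  frequency={n}  percent={percent[group]:.1f}")
```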
![Page 9: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/9.jpg)
Slide 9
Frequency distribution
The distribution is “read” differently depending upon the measurement level
– Nominal scales are read as discrete measurements at each level
– Ordinal measures show tendencies, but categories should not be compared
– Interval and ratio scales allow for comparison among categories
![Page 10: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/10.jpg)
Slide 10
SPSS Output for Frequency Distribution
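
| IQ | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| 82.00 | 1 | 4.2 | 4.2 | 4.2 |
| 87.00 | 1 | 4.2 | 4.2 | 8.3 |
| 89.00 | 1 | 4.2 | 4.2 | 12.5 |
| 93.00 | 2 | 8.3 | 8.3 | 20.8 |
| 96.00 | 1 | 4.2 | 4.2 | 25.0 |
| 97.00 | 1 | 4.2 | 4.2 | 29.2 |
| 98.00 | 1 | 4.2 | 4.2 | 33.3 |
| 102.00 | 1 | 4.2 | 4.2 | 37.5 |
| 103.00 | 1 | 4.2 | 4.2 | 41.7 |
| 105.00 | 1 | 4.2 | 4.2 | 45.8 |
| 106.00 | 1 | 4.2 | 4.2 | 50.0 |
| 107.00 | 1 | 4.2 | 4.2 | 54.2 |
| 109.00 | 1 | 4.2 | 4.2 | 58.3 |
| 111.00 | 1 | 4.2 | 4.2 | 62.5 |
| 115.00 | 1 | 4.2 | 4.2 | 66.7 |
| 119.00 | 1 | 4.2 | 4.2 | 70.8 |
| 120.00 | 1 | 4.2 | 4.2 | 75.0 |
| 127.00 | 1 | 4.2 | 4.2 | 79.2 |
| 128.00 | 1 | 4.2 | 4.2 | 83.3 |
| 131.00 | 2 | 8.3 | 8.3 | 91.7 |
| 140.00 | 1 | 4.2 | 4.2 | 95.8 |
| 162.00 | 1 | 4.2 | 4.2 | 100.0 |
| Total | 24 | 100.0 | 100.0 | |
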
![Page 11: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/11.jpg)
Slide 11
Grouped Relative Frequency Distribution
Relative Frequency Distribution of IQ for Two Classes
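
| IQ | Frequency | Percent | Cumulative Percent |
|---|---|---|---|
| 80 – 89 | 3 | 12.5 | 12.5 |
| 90 – 99 | 5 | 20.8 | 33.3 |
| 100 – 109 | 6 | 25.0 | 58.3 |
| 110 – 119 | 3 | 12.5 | 70.8 |
| 120 – 129 | 3 | 12.5 | 83.3 |
| 130 – 139 | 2 | 8.3 | 91.7 |
| 140 – 149 | 1 | 4.2 | 95.8 |
| 150 and over | 1 | 4.2 | 100.0 |
| Total | 24 | 100.0 | 100.0 |
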
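The grouping step can be sketched in a few lines. The 24 scores in `iq` are read off the SPSS frequency output on the previous slide; the helper `class_label` and the other variable names are only illustrative.

```python
from collections import Counter

# The 24 IQ scores listed in the SPSS frequency output (93 and 131 occur twice)
iq = [82, 87, 89, 93, 93, 96, 97, 98, 102, 103, 105, 106,
      107, 109, 111, 115, 119, 120, 127, 128, 131, 131, 140, 162]

def class_label(score):
    """Return the class interval a score falls into (width 10, open top class)."""
    if score >= 150:
        return "150 and over"
    lower = (score // 10) * 10
    return f"{lower} - {lower + 9}"

counts = Counter(class_label(score) for score in iq)
labels = [f"{lo} - {lo + 9}" for lo in range(80, 150, 10)] + ["150 and over"]

table = []       # rows of (class, frequency, percent, cumulative percent)
cumulative = 0
for label in labels:
    freq = counts[label]
    cumulative += freq
    table.append((label, freq,
                  round(100 * freq / len(iq), 1),
                  round(100 * cumulative / len(iq), 1)))

for row in table:
    print(row)
```

Computing the cumulative percent from the running count, rather than by adding rounded percents, avoids small rounding drift in the last column.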
![Page 12: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/12.jpg)
Slide 12
Discrete and Continuous data
Data consisting of numerical (quantitative) variables can be further divided into two groups: discrete and continuous.
1. Discrete: the set of all possible values, when pictured on the number line, consists only of isolated points.
2. Continuous: the set of all possible values, when pictured on the number line, consists of intervals.
The most common type of discrete variable we will encounter is a counting variable.
![Page 13: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/13.jpg)
Slide 13
Accuracy & Precision
Accuracy: the closeness of the measurements to the “actual” or “real” value of the physical quantity.
– Statistically this is estimated using the standard error of the mean
Precision: the closeness with which the measurements agree with one another.
– Statistically the precision is estimated by the standard deviation of the mean
Precision is related to random errors, which can be dealt with using statistics
Accuracy is related to systematic errors, which are difficult to deal with using statistics
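As a hedged sketch of the quantities involved, the code below computes the sample standard deviation, the standard error of the mean (standard deviation divided by √n) and the offset of the mean from a reference value. The measurements and the reference value are hypothetical.

```python
import math
import statistics

# Hypothetical repeated measurements of one quantity
measurements = [9.8, 10.1, 10.0, 9.9, 10.2]
reference_value = 10.0  # hypothetical "real" value, assumed known here

mean = statistics.mean(measurements)
sd = statistics.stdev(measurements)        # spread of the individual readings
sem = sd / math.sqrt(len(measurements))    # standard error of the mean
offset = mean - reference_value            # how far the mean sits from the reference

print(f"mean={mean:.3f} sd={sd:.3f} sem={sem:.3f} offset={offset:+.3f}")
```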
![Page 14: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/14.jpg)
Slide 14
Accuracy and precision: The target analogy
High accuracy but low precision
High precision but low accuracy
What does High accuracy and high precision look like?
![Page 15: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/15.jpg)
Slide 15
Accuracy and precision: The target analogy
High accuracy and high precision
![Page 16: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/16.jpg)
Slide 16
Two types of error
Systematic error
– Poor accuracy
– Definite causes
– Reproducible
Random error
– Poor precision
– Non-specific causes
– Not reproducible
![Page 17: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/17.jpg)
Slide 17
Systematic error
Diagnosis
– Errors have consistent signs
– Errors have consistent magnitude
Treatment
– Calibration
– Correcting procedural flaws
– Checking with a different procedure
![Page 18: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/18.jpg)
Slide 18
Random error
Diagnosis
– Errors have random sign
– Small errors more likely than large errors
Treatment
– Take more measurements
– Improve technique
– Higher instrumental precision
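The difference between the two treatments can be illustrated with a small simulation. Everything here is an assumption made for the sketch: a true value of 10.0, a fixed instrument offset of 0.5 (systematic error) and Gaussian noise with standard deviation 1.0 (random error).

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_VALUE = 10.0  # assumed true value
OFFSET = 0.5       # assumed systematic error (e.g. a miscalibrated instrument)
NOISE_SD = 1.0     # assumed size of the random error

def measure():
    """One simulated reading: true value + systematic offset + random noise."""
    return TRUE_VALUE + OFFSET + random.gauss(0, NOISE_SD)

few = [measure() for _ in range(10)]
many = [measure() for _ in range(10_000)]

error_few = statistics.mean(few) - TRUE_VALUE
error_many = statistics.mean(many) - TRUE_VALUE

print(f"error of the mean, 10 readings:     {error_few:+.3f}")
print(f"error of the mean, 10,000 readings: {error_many:+.3f}")
```

Averaging many readings shrinks the random part of the error, but the mean settles near the offset rather than near zero, which is why systematic error calls for calibration instead of more measurements.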
![Page 19: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/19.jpg)
Slide 19
Statistical graphs of data
A picture is worth a thousand words!
Graphs for numerical data:
Histograms
Frequency polygons
Pie
Graphs for categorical data
Bar graphs
Pie
![Page 20: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/20.jpg)
Slide 20
Box-Plots
A way to graphically portray almost all the descriptive statistics at once is the box-plot.
A box-plot shows:
– Upper and lower quartiles
– Mean
– Median
– Range
– Outliers (1.5 IQR)
![Page 21: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/21.jpg)
Slide 21
Box-Plots
[Box plot of IQ, axis from 80.00 to 180.00: minimum 82, lower quartile 96.5, median 106.5, upper quartile 123.5, maximum 162; mean M = 110.5]
IQR = 27; there is no outlier.
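The box-plot figures can be reproduced from the 24 IQ scores in the SPSS frequency output. The sketch below assumes the quartiles are taken as the medians of the lower and upper halves of the ordered data, which is the convention that matches the slide's 96.5 and 123.5; other quartile conventions give slightly different values.

```python
import statistics

# The 24 IQ scores from the SPSS frequency output (93 and 131 occur twice)
iq = sorted([82, 87, 89, 93, 93, 96, 97, 98, 102, 103, 105, 106,
             107, 109, 111, 115, 119, 120, 127, 128, 131, 131, 140, 162])

n = len(iq)
lower_half = iq[: n // 2]        # values below the median position
upper_half = iq[(n + 1) // 2 :]  # values above the median position

q1 = statistics.median(lower_half)
median = statistics.median(iq)
q3 = statistics.median(upper_half)
iqr = q3 - q1

# 1.5 x IQR rule: points beyond these fences are flagged as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in iq if x < lower_fence or x > upper_fence]

mean = statistics.mean(iq)

print(q1, median, q3, iqr, (lower_fence, upper_fence), outliers, round(mean, 1))
```

The upper fence works out to 164, so the largest score, 162, falls just inside it and is not flagged.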
![Page 22: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/22.jpg)
Slide 22
Bar Graphs
– For categorical data
– Like a histogram, but with gaps between bars to show that each bar is a separate group
– Useful for showing two samples side-by-side
![Page 23: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/23.jpg)
Slide 23
Bar Graph
[Bar graph: Marada Inn Quality Ratings. x axis: Rating (Poor, Below Average, Average, Above Average, Excellent); y axis: Frequency (1 to 10). Callouts: Good? Bad?]
![Page 24: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/24.jpg)
Slide 24
Histograms
– f on y axis (could also plot p or %)
– X values (or midpoints of class intervals) on x axis
– Plot each f with a bar, equal size, touching
– No gaps between bars
![Page 25: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/25.jpg)
Slide 25
Pie Chart
The pie chart is a commonly used graphical device for presenting relative frequency distributions for qualitative data.
First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class.
![Page 26: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/26.jpg)
Slide 26
Pie Chart
[Pie chart: Toyota Quality Ratings. Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%]
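Subdividing the circle amounts to giving each class a sector of 360° times its relative frequency. A minimal sketch using the Toyota quality-rating percentages from this slide (the variable names are illustrative):

```python
# Percentages from the Toyota quality-ratings pie chart
ratings = {
    "Poor": 10,
    "Below Average": 15,
    "Average": 25,
    "Above Average": 45,
    "Excellent": 5,
}

total = sum(ratings.values())  # should be 100 for a complete pie

# Sector angle in degrees = 360 x relative frequency of the class
angles = {label: 360 * pct / total for label, pct in ratings.items()}

for label, angle in angles.items():
    print(f"{label:<14} {angle:6.1f} degrees")
```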
![Page 27: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/27.jpg)
Slide 27
Skewness of distributions
– Measures look at how lopsided distributions are: how far from the ideal of the normal curve they are
– When the median and the mean are different, the distribution is skewed. The greater the difference, the greater the skew.
– Distributions that trail away to the left are negatively skewed and those that trail away to the right are positively skewed
– If the skewness is extreme, the researcher should either transform the data to make them better resemble a normal curve or else use a different set of statistics (nonparametric statistics) to carry out the analysis
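A quick check of skew can be sketched by comparing the mean with the median and by computing a moment-based skewness coefficient. The scores are the 24 IQ values from the SPSS frequency output; the coefficient used here (third central moment divided by the 1.5 power of the second) is one common definition, and statistical packages may apply small-sample corrections to it.

```python
import statistics

# The 24 IQ scores from the SPSS frequency output
iq = [82, 87, 89, 93, 93, 96, 97, 98, 102, 103, 105, 106,
      107, 109, 111, 115, 119, 120, 127, 128, 131, 131, 140, 162]

mean = statistics.mean(iq)
median = statistics.median(iq)

n = len(iq)
m2 = sum((x - mean) ** 2 for x in iq) / n  # second central moment (variance)
m3 = sum((x - mean) ** 3 for x in iq) / n  # third central moment
skewness = m3 / m2 ** 1.5                  # > 0 means a longer right tail

print(f"mean={mean:.1f} median={median} skewness={skewness:.2f}")
```

Here the mean lies above the median and the coefficient is positive, so the IQ sample trails away to the right.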
![Page 28: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/28.jpg)
Slide 28
Symmetric
– Left tail is the mirror image of the right tail
– Examples: heights and weights of people
Histogram (Common categories)
[Histogram: relative frequency on the y axis, from 0 to .35 in steps of .05]
![Page 29: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/29.jpg)
Slide 29
Histogram
Moderately Skewed Left
– A longer tail to the left
– Example: exam scores
[Histogram: relative frequency on the y axis, from 0 to .35 in steps of .05]
![Page 30: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/30.jpg)
Slide 30
Moderately Right Skewed
– A longer tail to the right
– Example: housing values
Histogram
[Histogram: relative frequency on the y axis, from 0 to .35 in steps of .05]
![Page 31: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/31.jpg)
Slide 31
Positively Skewed
![Page 32: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/32.jpg)
Slide 32
Negatively Skewed
![Page 33: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/33.jpg)
Slide 33
Symmetry: Kurtosis
A high kurtosis distribution has a sharper "peak" and fatter "tails", while a low kurtosis distribution has a more rounded peak with wider "shoulders".
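Kurtosis can be sketched as the fourth central moment divided by the squared variance; subtracting 3 gives the excess kurtosis, which is 0 for a normal curve. The function name `excess_kurtosis` and both samples are hypothetical, chosen so the arithmetic is easy to follow.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (0 for a normal curve)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical samples
flat = [1, 2, 3, 4, 5]                            # evenly spread, no tails
heavy_tailed = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]  # sharp peak plus two extremes

print(excess_kurtosis(flat))
print(excess_kurtosis(heavy_tailed))
```

The evenly spread sample gives a negative value (low kurtosis) and the peaked sample with two extreme values gives a positive one (high kurtosis).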
![Page 34: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/34.jpg)
Slide 34
5. Measures of relationship…
– Spearman Rho
– Pearson r
![Page 35: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/35.jpg)
Slide 35
Correlations
– determine whether and to what degree a relationship exists between two or more quantifiable variables
– the degree of the relationship is expressed as a coefficient of correlation
– the presence of a correlation does not indicate a cause-effect relationship, primarily because of the possibility of multiple confounding factors
![Page 36: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/36.jpg)
Slide 36
…the presence of a correlation does not indicate a cause-effect relationship primarily because of the possibility of multiple confounding factors
![Page 37: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/37.jpg)
Slide 37
Correlation coefficient…
– −1.00: strong negative
– 0.00: no relationship
– +1.00: strong positive
![Page 38: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/38.jpg)
Slide 38
Spearman Rho...
…a measure of correlation used for rank and ordinal data
![Page 39: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/39.jpg)
Slide 39
Pearson r...
…a measure of correlation used for data of interval or ratio scales
…assumes that the relationship between the variables being correlated is linear
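Both coefficients can be sketched in plain Python. Spearman's rho is computed here as Pearson's r applied to the ranks of the data, with tied values sharing their average rank. The function names and the small data sets are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data
x = [1, 2, 3, 4, 5]
y_linear = [2, 4, 5, 4, 5]     # roughly linear trend with scatter
y_curved = [v ** 3 for v in x] # perfectly monotonic but not linear

print(round(pearson_r(x, y_linear), 3))
print(round(pearson_r(x, y_curved), 3), spearman_rho(x, y_curved))
```

For the curved data Spearman's rho is exactly 1 because the ordering is perfectly preserved, while Pearson's r stays below 1 because the relationship is not a straight line.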
![Page 40: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/40.jpg)
Slide 40
So
Descriptive statistics are used to summarize data from individual respondents, etc.
– They help to make sense of large numbers of individual responses, to communicate the essence of those responses to others
They focus on typical or average scores, the dispersion of scores over the available responses, and the shape of the response curve
![Page 41: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/41.jpg)
Slide 41
Chi square (χ²) test
![Page 42: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/42.jpg)
Slide 42
Used to:
– Test for goodness of fit
– Test for independence of attributes
– Test homogeneity
– Test a given population variance
![Page 43: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/43.jpg)
Slide 43
Introduction
The test we use to measure the differences between what is observed and what is expected according to an assumed hypothesis is called the chi-square test.
![Page 44: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/44.jpg)
Slide 44
Important
The chi square test can only be used on data that has the following characteristics:
– The data must be in the form of frequencies
– The frequency data must have a precise numerical value and must be organised into categories or groups
– The total number of observations must be greater than 30
– The expected frequency in any one cell of the table must be greater than 5
![Page 45: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/45.jpg)
Slide 45
Formula
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
– χ² = the value of chi square
– O = the observed value
– E = the expected value
– ∑ (O – E)²/E = the values of (O – E) squared, each divided by its E, then added together
![Page 46: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/46.jpg)
Slide 46
![Page 47: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/47.jpg)
Slide 47
Construct a table with the information you have observed or obtained.
Observed Frequencies (O)

| | Money | Health | Love | Row Total |
|---|---|---|---|---|
| Men | 82 | 446 | 355 | 883 |
| Women | 46 | 574 | 273 | 893 |
| Column Total | 128 | 1020 | 628 | 1776 |

![Page 48: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/48.jpg)
Slide 48
Work out the expected frequency.
Expected frequency = (row total × column total) / grand total

| | Money | Health | Love | Row Total |
|---|---|---|---|---|
| Men | 63.63 | 507.128 | 312.23 | 883 |
| Women | 64.36 | 512.87 | 315.76 | 893 |
| Column Total | 128 | 1020 | 628 | 1776 |

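The expected-frequency step can be sketched directly from the observed table of men's and women's choices. The dictionary layout and variable names are illustrative only.

```python
# Observed frequencies from the lecture's table
observed = {
    "men":   {"money": 82, "health": 446, "love": 355},
    "women": {"money": 46, "health": 574, "love": 273},
}
columns = ["money", "health", "love"]

row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
col_totals = {col: sum(observed[row][col] for row in observed) for col in columns}
grand_total = sum(row_totals.values())

# Expected frequency = (row total x column total) / grand total
expected = {
    row: {col: row_totals[row] * col_totals[col] / grand_total for col in columns}
    for row in observed
}

for row in expected:
    print(row, {col: round(value, 2) for col, value in expected[row].items()})
```

The last decimal can differ slightly from a hand-worked table depending on whether values are rounded or truncated.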
![Page 49: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/49.jpg)
Slide 49
For each of the cells calculate (O – E)²/E.

| | Money | Health | Love |
|---|---|---|---|
| Men | 5.30 | 7.37 | 5.85 |
| Women | 5.23 | 7.29 | 5.8 |

χ²Calc. = 36.84
![Page 50: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/50.jpg)
Slide 50
χ²Calc. = sum of all (O – E)²/E values in the cells.
Here χ²Calc. = 36.84
Find χ²Critical from the table with 2 degrees of freedom and a level of significance of 0.05: χ²Critical = 5.99
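The whole calculation can be sketched in a few lines. The observed counts are those of the lecture's table; the critical value 5.99 is the one quoted on the slide for 2 degrees of freedom at the 0.05 level and is hard-coded here rather than looked up. Recomputing from the observed counts gives a statistic of roughly 36.84.

```python
# Observed frequencies: rows = men, women; columns = money, health, love
observed = [
    [82, 446, 355],
    [46, 574, 273],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (O - E)^2 / E over every cell of the table
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi_square += (o - e) ** 2 / e

# Degrees of freedom for a contingency table: (rows - 1) x (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

CRITICAL_VALUE = 5.99  # quoted on the slide for df = 2, significance level 0.05
reject_null = chi_square > CRITICAL_VALUE

print(f"chi-square = {chi_square:.2f}, df = {df}, reject null: {reject_null}")
```

Because the statistic is far above the critical value, the decision to reject the null hypothesis does not depend on small rounding differences in the cell values.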
11/04/2023
![Page 51: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/51.jpg)
Slide 51
χ² table
![Page 52: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/52.jpg)
Slide 52
Conclusion
Compare χ²Calc. and χ²Critical obtained from the table.
If χ²Calc. is larger than χ²Critical, then reject the null hypothesis and accept the alternative.
Here, since χ²Calc. is much greater than χ²Critical, we can easily reject the null hypothesis; that is, there is a relationship between gender and the choice made.
![Page 53: Descrptive statistics](https://reader035.fdocuments.us/reader035/viewer/2022070313/554b42d8b4c905cd2c8b4dd6/html5/thumbnails/53.jpg)
Slide 53
Any questions?
Thank you.