Dr. Engr. Sami ur Rahman Data Analysis Lecture 3: Data Distribution Normal Distribution.
Dr. Engr. Sami ur Rahman
Data Analysis
Lecture 3: Data Distribution
Normal Distribution
University Of Malakand | Department of Computer Science | UoMIPS | Dr. Engr. Sami ur Rahman | 2
Introductory Statistics
Dispersion
The Normal Distribution Curve
Variability
Calculating a Mean and a Standard Deviation
Interpreting Distributions
Dispersion
Dispersion – The distribution of values around some central value, such as an average.
The distribution of a variable tells us which values it takes and how often each value occurs (its frequency)
Distribution
St id  Age    St id  Age
1      18     11     20
2      20     12     19
3      19     13     20
4      19     14     22
5      20     15     19
6      20     16     21
7      21     17     17
8      21     18     18
9      21     19     22
10     23     20     20
Distribution (Frequency)
St id  Age    St id  Age
1      18     11     20
2      20     12     19
3      19     13     20
4      19     14     22
5      20     15     19
6      20     16     21
7      21     17     17
8      21     18     18
9      21     19     22
10     23     20     20

Age  Frequency
17   1
18   2
19   4
20   6
21   4
22   2
23   1
Mean?
Median?
Mode?
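The three questions above can be checked with a short Python sketch. It assumes the 20 ages from the table (taking student 17's age as 17, the value the frequency table uses) and uses only the standard library; the variable names are illustrative.

```python
from collections import Counter
from statistics import mean, median, mode

# Ages of the 20 students in student-id order (assumed from the table above).
ages = [18, 20, 19, 19, 20, 20, 21, 21, 21, 23,
        20, 19, 20, 22, 19, 21, 17, 18, 22, 20]

# Frequency distribution: how often each age occurs.
frequency = Counter(ages)
for age in sorted(frequency):
    print(age, frequency[age])

print("mean:", mean(ages))      # sum of the ages divided by the number of students
print("median:", median(ages))  # middle value of the sorted ages
print("mode:", mode(ages))      # most frequent age
```

For this data set the mean, median and mode all come out as 20.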
Histogram
Age  Frequency
17   1
18   2
19   4
20   6
21   4
22   2
23   1
[Figure: Histogram of the age data; x-axis: Bin (age), y-axis: Frequency]
The Normal Distribution Curve
[Figure: normal distribution curve; x-axis from 0 to 100, with the centre labelled "Mean, Median, Mode"]
It is bell-shaped and symmetrical about the mean
The mean, median and mode are equal
It is a function of the mean and the standard deviation
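To make the last point concrete, here is a minimal sketch of the normal curve written as a function of the mean and the standard deviation. The function name and the parameter values (mean 50, standard deviation 15) are illustrative assumptions, not taken from the plot.

```python
import math

def normal_pdf(x, mean, sd):
    """Height of the normal curve at x for the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

# Illustrative parameters (assumed for this sketch).
mu, sigma = 50, 15

peak = normal_pdf(mu, mu, sigma)        # the curve is highest at the mean
left = normal_pdf(mu - 10, mu, sigma)   # 10 points below the mean
right = normal_pdf(mu + 10, mu, sigma)  # 10 points above the mean

print(peak, left, right)
```

The values at equal distances on either side of the mean are identical, which is the symmetry described above.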
Examples of Normal Distribution
There are many examples of approximately normal distributions in everyday life:
• Height
• Weight
• Shoe size
• Exam marks
Variation or Spread of Distributions
Measures that indicate the spread of scores:
Range
Standard Deviation
Variation or Spread of Distributions
Range
It compares the minimum score with the maximum score
Max score - Min score = Range
It is a crude indication of the spread of the scores because it does not tell us much about the shape of the distribution and how much the scores vary from the mean
Variation or Spread of Distributions
Standard Deviation
It tells us what is happening between the minimum and maximum scores
It tells us how much the scores in the data set vary around the mean
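The difference between the two measures can be sketched in Python. The two score lists below are hypothetical, chosen so that both have the same minimum and maximum; the helper name score_range is also just for illustration.

```python
from statistics import pstdev

# Two hypothetical sets of scores with the same minimum (10) and maximum (90).
clustered = [10, 48, 49, 50, 51, 52, 90]
spread_out = [10, 20, 35, 50, 65, 80, 90]

def score_range(scores):
    """Range = max score - min score."""
    return max(scores) - min(scores)

for name, scores in (("clustered", clustered), ("spread out", spread_out)):
    print(name, "range:", score_range(scores), "sd:", round(pstdev(scores), 2))
```

Both sets have a range of 80, but the standard deviation is smaller for the set whose scores sit close to the mean.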
Calculating Mean and Standard Deviation
Data    Deviation   Absolute deviation   Squared deviation
x       x - Mean    |x - Mean|           (x - Mean)²
10      -20         20                   400
20      -10         10                   100
30      0           0                    0
40      10          10                   100
50      20          20                   400
Sums    150   0     60                   1000
Means   30    0     12                   200
Variance = 200 (the mean of the squared deviations)
Standard deviation = √Variance = 14.1421356
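The same calculation can be reproduced step by step in Python. This is a sketch using the five data values from the table; note that the table divides by the number of values (population variance), whereas the sample standard deviation divides by n - 1.

```python
import math
from statistics import pvariance, pstdev, stdev

data = [10, 20, 30, 40, 50]

mean = sum(data) / len(data)            # 150 / 5 = 30
deviations = [x - mean for x in data]   # these always sum to 0
squared = [d ** 2 for d in deviations]  # these sum to 1000
variance = sum(squared) / len(data)     # 1000 / 5 = 200
sd = math.sqrt(variance)                # about 14.1421356

print(mean, variance, sd)

# The standard library's population functions give the same values.
print(pvariance(data), pstdev(data))

# The sample standard deviation divides by n - 1 instead: sqrt(1000 / 4).
print(stdev(data))
```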
Standard deviation (s)
Used as a measure of spread when the mean is used as the measure of center
Units of s are the same as the units of the data
s is never negative
Higher s means more spread
s = 0 means no spread: all observations are equal
s is affected by outliers
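Two of these properties are easy to check with a small sketch. The first list reuses the values 10 to 50 from the worked calculation; the outlier value 500 and the constant list are hypothetical additions for illustration.

```python
from statistics import pstdev

scores = [10, 20, 30, 40, 50]     # values from the worked calculation
with_outlier = scores + [500]     # hypothetical extreme value added
constant = [30, 30, 30, 30, 30]   # all observations equal

print("original:", pstdev(scores))
print("with outlier:", pstdev(with_outlier))
print("all equal:", pstdev(constant))
```

A single extreme value inflates s considerably, while identical observations give s = 0.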
Standard Deviation
A measure of dispersion around the mean. For normally distributed data, approximately 68 percent of the cases lie within plus or minus one standard deviation of the mean, 95 percent within two, and 99.7 percent within three standard deviations.
This is often referred to as the 68-95-99.7 rule
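The percentages in the 68-95-99.7 rule can be reproduced numerically. The sketch below assumes an exactly normal distribution and uses the error function from Python's math module; the function name fraction_within is illustrative.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, "s.d.:", round(100 * fraction_within(k), 1), "percent")
```

The rule rounds these values to 68, 95 and 99.7 percent.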
When to Use Standard Deviation
When you need to determine how much the scores in a set vary from each other.
Interpreting Distributions
[Figure: normal curve with Mean = 50 and Std Dev = 15; x-axis: scores from 0 to 100]
s.d.     -3    -2    -1    0     +1    +2    +3
scores   5     20    35    50    65    80    95
rank     0%    2%    16%   50%   84%   98%   100%
Area between adjacent s.d. marks, from left to right: 2%, 14%, 34%, 34%, 14%, 2%
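The scores and percentile ranks at each standard deviation mark can be recomputed with the standard library's NormalDist class (Python 3.8 or later). This is a sketch that assumes the scores are exactly normal with mean 50 and standard deviation 15.

```python
from statistics import NormalDist

# Assumed model: scores follow a normal distribution with mean 50, s.d. 15.
scores = NormalDist(mu=50, sigma=15)

for k in range(-3, 4):
    score = scores.mean + k * scores.stdev  # score lying k s.d. from the mean
    rank = scores.cdf(score)                # fraction of scores at or below it
    print(f"{k:+d} s.d.  score {score:.0f}  rank {100 * rank:.0f}%")
```

For example, a score of 65 is one standard deviation above the mean, so about 84 percent of scores fall at or below it.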
Interpreting Distributions
        School A   School B   School C
Mean    50         60         70
S.d.    10         10         10
[Figure: distribution curves for the three schools on one score axis from 0 to 120]
Interpreting Distributions
        School A   School B   School C
Mean    50         50         50
S.d.    10         13         16
[Figure: distribution curves for the three schools on one score axis from 0 to 120]
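One way to read this comparison is to ask what share of each school's scores falls in a fixed band around the common mean. The sketch below assumes each school's scores are exactly normal; the band from 40 to 60 is an arbitrary choice for illustration.

```python
from statistics import NormalDist

# All three schools have mean 50; standard deviations as listed above.
schools = {"School A": 10, "School B": 13, "School C": 16}

share_40_to_60 = {}
for name, sd in schools.items():
    dist = NormalDist(mu=50, sigma=sd)
    # Share of scores between 40 and 60 (assumed band, for illustration only).
    share_40_to_60[name] = dist.cdf(60) - dist.cdf(40)
    print(name, f"{100 * share_40_to_60[name]:.0f}% of scores between 40 and 60")
```

The larger the standard deviation, the smaller the share of scores close to the mean, even though the means are identical.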
Interpreting Distributions
        National Mean   School A   School B
Mean    55              60         40
S.d.    10              15         15
[Figure: national and school distribution curves on one score axis from -20 to 120]
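As a sketch of how these figures might be interpreted, the code below estimates the share of each school's scores that lie above the national mean of 55. It assumes each school's scores are exactly normal with the listed mean and standard deviation; the variable names are illustrative.

```python
from statistics import NormalDist

national_mean = 55

# School distributions built from the means and standard deviations above.
schools = {
    "School A": NormalDist(mu=60, sigma=15),
    "School B": NormalDist(mu=40, sigma=15),
}

# Share of each school's scores above the national mean.
share_above = {name: 1 - dist.cdf(national_mean) for name, dist in schools.items()}

for name, share in share_above.items():
    print(name, f"{100 * share:.0f}% above the national mean")
```

Under this assumption roughly 63 percent of School A's scores and 16 percent of School B's scores exceed the national mean.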
Example: Do women study more than men?
Variable: minutes studied on a typical weeknight of a first-year university class
Random samples of 30 women and 30 men:
Women: 180, 120, 150, 200, 120, 90, 120, 180, 120, 150, 60, 240, 180, 120, 180, 180, 120, 180, 360, 240, 180, 150, 180, 115, 240, 170, 150, 180, 180, 120
Men: 90, 90, 150, 240, 30, 0, 120, 45, 120, 60, 230, 200, 30, 30, 60, 120, 120, 120, 90, 120, 240, 60, 95, 120, 200, 75, 300, 30, 150, 180
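A first look at this example is to summarise each sample with its mean, median and standard deviation. The sketch below uses the two samples as listed and the sample standard deviation (dividing by n - 1), which is an assumption about the intended calculation.

```python
from statistics import mean, median, stdev

women = [180, 120, 150, 200, 120, 90, 120, 180, 120, 150, 60,
         240, 180, 120, 180, 180, 120, 180, 360, 240, 180, 150,
         180, 115, 240, 170, 150, 180, 180, 120]
men = [90, 90, 150, 240, 30, 0, 120, 45, 120, 60, 230, 200, 30,
       30, 60, 120, 120, 120, 90, 120, 240, 60, 95, 120, 200, 75,
       300, 30, 150, 180]

for label, sample in (("Women", women), ("Men", men)):
    print(label,
          "n =", len(sample),
          "mean =", round(mean(sample), 1),
          "median =", median(sample),
          "s.d. =", round(stdev(sample), 1))
```

In these samples the women's mean and median are higher than the men's. Whether that difference holds for all first-year students is a separate question that summary statistics alone do not settle.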
Thanks for your attention