Stat 155, Section 2, Last Time Reviewed Excel Computation of: –Time Plots (i.e. Time Series)...
Stat 155, Section 2, Last Time
• Reviewed Excel Computation of:
– Time Plots (i.e. Time Series)
– Histograms
• Modelling Distributions: Densities (Areas)
• Normal Density Curve (very useful model)
• Fitting Normal Densities
(using mean and s.d.)
Reading In Textbook
Approximate Reading for Today’s Material:
Pages 71-83, 102-112
Approximate Reading for Next Class:
Pages 123-127, 132-145
2 Views of Normal Fitting
1. “Fit Model to Data”
Choose $\mu = \bar{x}$ & $\sigma = s$.
2. “Fit Data to Model”
First Standardize Data
Then use Normal $N(0,1)$.
Note: same thing, just different rescalings
(choose scale depending on need)
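The note that the two views are the same computation on different scales can be checked numerically. A minimal sketch in Python, assuming the standard-library `statistics.NormalDist` (Python 3.8+); the data values and the cutoff 72 are invented for illustration and are not from the class examples:

```python
from statistics import NormalDist, mean, stdev

# Hypothetical data, invented for illustration only
data = [61.0, 64.5, 66.0, 68.2, 70.1, 71.3, 73.8, 75.0, 78.4, 81.7]
x_bar, s = mean(data), stdev(data)
cutoff = 72.0  # hypothetical cutoff on the original data scale

# View 1: fit the model to the data, i.e. use N(x_bar, s) on the data scale
area_view1 = NormalDist(x_bar, s).cdf(cutoff)

# View 2: fit the data to the model, i.e. standardize, then use N(0, 1)
z = (cutoff - x_bar) / s
area_view2 = NormalDist(0, 1).cdf(z)

# The two lower areas agree up to floating-point rounding
print(round(area_view1, 6), round(area_view2, 6))
```

Either scale gives the same area, so the choice depends only on which is more convenient.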
Normal Distribution Notation
The “normal distribution,
with mean $\mu$ & standard deviation $\sigma$”
is abbreviated as:
$N(\mu, \sigma)$
Interpretation of Z-scores
Recall Z-score Idea:
• Transform data $X_1, \ldots, X_n$
• By subtracting mean & dividing by s.d.
• To get $Z_i = (X_i - \bar{X}) / s$ (mean 0, s.d. 1)
• Interpret as $X_i = \bar{X} + Z_i \, s$
• I.e. “$X_i$ is $Z_i$ sd’s above the mean”
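The standardization can be sketched in a few lines of Python. The data values below are hypothetical, and `stdev` is the sample standard deviation (divisor n - 1), which is an assumption about which s.d. is meant:

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration only
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
x_bar, s = mean(data), stdev(data)

# Z-scores: subtract the mean, divide by the s.d.
z_scores = [(x - x_bar) / s for x in data]

# Undo the transformation: each X is Z s.d.'s above the mean
recovered = [x_bar + z * s for z in z_scores]

print([round(z, 3) for z in z_scores])
```

The Z-scores have mean 0 and s.d. 1 (up to rounding), and `recovered` reproduces the original data.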
Interpretation of Z-scores
Same idea for Normal Curves:
Z-scores are on $N(0,1)$ scale,
so use areas to interpret them
Important Areas:
1. Within 1 sd of mean: 68%
“the majority”
Interpretation of Z-scores
2. Within 2 sd of mean: 95%
“really most”
3. Within 3 sd of mean: 99.7%
“almost all”
Interpretation of Z-scores
Interactive Version (used for above pics)
From Publisher’s Website:
http://bcs.whfreeman.com/ips5e/
• Statistical Applets
• Normal Curve
Interpretation of Z-scores
Summary:
These relations are called the
“68 - 95 - 99.7 % Rule”
HW: 1.86 (a: 234-298, b: 234, 298),
1.87
Computation of Normal Areas
Classical Approach: Tables
• See inside covers of text
• Summarizes area computations
• Because can’t use calculus (the normal density has no closed-form antiderivative)
• Constructed by “computers”
(a job description in the early 1900’s!)
Computation of Normal Areas
EXCEL Computation:
works in terms of “lower areas”
E.g. for $N(1, 0.5)$
Area < 1.3
is 0.7257
Computation of Normal Areas
Interactive Version (used for above pic)
From Same Publisher’s Website:
http://bcs.whfreeman.com/ips5e/
• Statistical Applets
• Normal Curve
Computation of Normal Areas
EXCEL Computation:
(of above e.g.)
• Use NORMDIST
• Enter parameters
• x is “cutoff point”
• Return is Area
below x
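For readers without Excel, the same lower-area computation can be sketched in Python. This assumes Python 3.8+ and its standard-library `statistics.NormalDist`; its `cdf` method plays the role of NORMDIST with the cumulative option set to TRUE:

```python
from statistics import NormalDist

# N(1, 0.5): mean 1, standard deviation 0.5
model = NormalDist(mu=1, sigma=0.5)

# Area below the cutoff x = 1.3, analogous to NORMDIST(1.3, 1, 0.5, TRUE)
lower_area = model.cdf(1.3)
print(round(lower_area, 4))  # 0.7257
```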
Computation of Normal Areas
Computation of areas over intervals:
(use subtraction)
Area between a and b = (Area below b) - (Area below a)
Computation of Normal Areas
Computation of areas over intervals:
(use subtraction for EXCEL too)
E.g. Use Excel to check 68 - 95 - 99.7% Rule
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg9.xls
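The same check can be sketched outside Excel. The Python below is not the class spreadsheet; it only illustrates the subtraction of lower areas, using the standard-library `statistics.NormalDist`, with `area_between` as a hypothetical helper name:

```python
from statistics import NormalDist

std_normal = NormalDist(0, 1)

def area_between(dist, a, b):
    """Area under the normal curve between a and b, by subtracting lower areas."""
    return dist.cdf(b) - dist.cdf(a)

# Areas within 1, 2 and 3 s.d. of the mean on the N(0, 1) scale
for k in (1, 2, 3):
    pct = 100 * area_between(std_normal, -k, k)
    print(f"within {k} sd: {pct:.2f}%")
```

The computed areas are close to, but not exactly, 68%, 95% and 99.7%; the rule quotes rounded values.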
Normal Area HW
HW (use Excel):
1.94
1.97 (Hint: the % above 130 =
100% - % below 130)
1.99 (see discussion above)
1.113
Caution: Don’t just “twiddle EXCEL until answer appears”. Understand it!!!
And Now for Something Completely Different
A mind blowing video clip:
8 year old Skateboarding Twins:
http://www.youtube.com/watch?v=8X2_zsnPkq8&mode=related&search=
• Do they ever miss?
• You can explore farther…
Thanks to Devin Coley for the link
Inverse of Area Function
Inverse of Frequencies: “Quantiles”
Idea: Given area, find “cutoff” x
I.e. for
Area = 80%
This x
is the “quantile”
Inverse of Area Function
EXCEL Computation of Quantiles:
Use NORMINV
Continue Class Example:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg9.xls
• “Probability” is “Area”
• Enter mean and SD parameters
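A rough Python counterpart of NORMINV is `NormalDist.inv_cdf` from the standard library. The sketch below reuses the $N(1, 0.5)$ model from the area example as an assumption; the slide does not say which curve the 80% picture used:

```python
from statistics import NormalDist

# Assumed model for illustration: N(1, 0.5)
model = NormalDist(mu=1, sigma=0.5)

# Quantile: the cutoff x with 80% of the area below it,
# analogous to NORMINV(0.8, 1, 0.5)
x80 = model.inv_cdf(0.80)

# Round trip: the area below x80 should come back as 0.80
print(round(x80, 4), round(model.cdf(x80), 4))
```

The round trip shows that the quantile function undoes the area function.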
Inverse Area Example
When a machine works normally, it fills bottles with mean = 25 oz, and SD = 0.2 oz.
The machine is “out of control” when it overfills. Choose an “alarm level”, which will give only 1 % false alarms.
Want: cutoff, x, so that Area above = 1%
Note: Area below = 100% - Area above = 99%
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg9.xls
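The alarm level can also be sketched in Python, as a cross-check on the spreadsheet rather than a replacement for it. The variable names are illustrative; the mean, SD and 1% false-alarm rate come from the example:

```python
from statistics import NormalDist

# Fill amounts (oz) when the machine works normally: N(25, 0.2)
filling = NormalDist(mu=25, sigma=0.2)

false_alarm_rate = 0.01
# Area above the cutoff = 1%, so area below the cutoff = 99%
alarm_level = filling.inv_cdf(1 - false_alarm_rate)
print(round(alarm_level, 2))  # 25.47
```

So an alarm at roughly 25.47 oz would be triggered by only about 1% of bottles while the machine is in control.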
Inverse Area HW
1.95, 1.101, 1.107, 1.109
1.116 a (-0.674, 0.674)
1.117
1.118 (4.3%)
Normal Diagnostic
When is the Normal Model “good”?
Useful Graphical Device:
Q-Q plot = Normal Quantile Plot
Idea: look at plot which is approximately linear for data from Normal Model
Normal Quantile Plot
Approach, for data $X_1, \ldots, X_n$:
1. Sort data
2. Compute “Theoretical Proportions”: $i/(n+1)$, $i = 1, \ldots, n$
3. Compute “Theoretical Z-scores”: $\mathrm{NORMINV}(i/(n+1))$, $i = 1, \ldots, n$ (standard normal: mean 0, SD 1)
4. Plot Sorted Data (Y-axis) vs. Theoretical Z-scores (X-axis)
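The four steps can be sketched in Python as follows. This only computes the plot coordinates and leaves the drawing to any charting tool; `normal_quantile_points` is a hypothetical helper name and the data are invented:

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (theoretical z-score, sorted data value) pairs for a Q-Q plot."""
    n = len(data)
    sorted_data = sorted(data)                                # step 1
    proportions = [i / (n + 1) for i in range(1, n + 1)]      # step 2
    std_normal = NormalDist(0, 1)
    z_scores = [std_normal.inv_cdf(p) for p in proportions]   # step 3
    return list(zip(z_scores, sorted_data))                   # step 4: X = z, Y = data

# Hypothetical data, invented for illustration only
points = normal_quantile_points([5.1, 3.8, 4.4, 6.0, 4.9])
for z, x in points:
    print(f"{z:7.3f}  {x}")
```

Dividing by n + 1 rather than n keeps every proportion strictly between 0 and 1, so the inverse area function is defined at each one.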
Normal Quantile Plot
Several Examples:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg12.xls
• Show how to compute in Excel
• Steps as above
Normal Quantile Plot
Main Lessons:
• Melbourne Winter Temperature Data
– Gaussian is good, so looks ~ linear
– So OK to use normal model for these data
– Adding trendline helps in assessing linearity
Normal Quantile Plot
Main Lessons:
• Intro Stat Course Exam Scores Data
– Skewed distributions give nonlinearity
– Outliers show up clearly
– Normal model unreliable here
• Combined plot highlights
– Mean = Y-intercept
– Standard Deviation = Slope
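The intercept and slope reading can be illustrated with simulated data. The sketch below is an assumption-laden stand-in for the Excel trendline: it draws 1000 values from $N(10, 2)$ (parameters chosen arbitrarily), builds the Q-Q points, and fits an ordinary least-squares line by hand:

```python
import random
from statistics import NormalDist, mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable
# Simulated, sorted data from N(10, 2); these parameters are arbitrary
data = sorted(rng.gauss(10, 2) for _ in range(1000))

n = len(data)
# Theoretical z-scores at proportions i / (n + 1)
z = [NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)]

# Least-squares line through the Q-Q points (z on X-axis, data on Y-axis)
z_bar, y_bar = mean(z), mean(data)
sxy = sum((a - z_bar) * (b - y_bar) for a, b in zip(z, data))
sxx = sum((a - z_bar) ** 2 for a in z)
slope = sxy / sxx
intercept = y_bar - slope * z_bar

print("intercept vs mean:", round(intercept, 2), round(mean(data), 2))
print("slope vs s.d.:    ", round(slope, 2), round(stdev(data), 2))
```

The intercept matches the sample mean because the theoretical z-scores average to zero; the slope is only approximately the standard deviation.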
Normal Quantile Plot
Main Lessons:
• Simulated Bimodal Data
– Curve is flat near modes
– Roughly linear near peaks
– Corresponds to two normal subpopulations
– Goes up fast at valley
Normal Quantile Plot
Homework:
1.122
1.123
1.125
And now for something completely different
Recall Distribution of majors of students in this course:
[Bar chart: Stat 155, Section 2, Majors. Vertical axis: Frequency, 0 to 0.4. Categories: Business / Man., Biology, Public Policy / Health, Pharm / Nursing, Journalism / Comm., Env. Sci., Other, Undecided.]
And now for something completely different
How about a biology joke?
A seventh grade Biology teacher arranged a demonstration for his class. He took two earthworms and, in front of the class, did the following: He dropped the first worm into a beaker of water, where it dropped to the bottom and wriggled about. He dropped the second worm into a beaker of ethyl alcohol and it immediately shriveled up and died. He asked the class if anyone knew what this demonstration was intended to show them.
A boy in the second row immediately shot his arm up and, when called on said: "You're showing us that if you drink alcohol, you won't have worms."
Variable Relationships
Chapter 2 in Text
Idea: Look beyond single quantities, to how quantities relate to each other.
E.g. How do HW scores “relate”
to Exam scores?
Section 2.1: Useful graphical device:
Scatterplot
Plotting Bivariate Data
Toy Example:
(1,2)
(3,1)
(-1,0)
(2,-1)
[Figure: Toy Scatterplot, Separate Points. x-axis from -2 to 4, y-axis from -1.5 to 2.5.]
Plotting Bivariate Data
Sometimes:
Can see more
insightful patterns
by connecting
points
[Figure: Toy Scatterplot, Connected points. x-axis from -2 to 4, y-axis from -1.5 to 2.5.]
Plotting Bivariate Data
Sometimes:
Useful to switch off
points, and only
look at lines/curves
[Figure: Toy Scatterplot, Lines Only. x-axis from -2 to 4, y-axis from -1.5 to 2.5.]
Plotting Bivariate Data
Common Name: “Scatterplot”
A look under the hood:
EXCEL: Chart Wizard (colored bar icon)
• Chart Type: XY (scatter)
• Subtype controls points only, or lines
• Later steps similar to above
(can massage the pic!)
Scatterplot E.g.
Data from related Intro. Stat. Class
(actual scores)
A. How does HW score predict Final Exam?
$x_i$ = HW, $y_i$ = Final Exam
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls
i. In top half of HW scores:
Better HW $\Rightarrow$ Better Final
ii. For lower HW:
Final is much more “random”
Scatterplots
Common Terminology:
When thinking about “X causes Y”,
Call X the “Explanatory Var.” or “Indep. Var.”
Call Y the “Response Var.” or “Dep. Var.”
(think of “Y as function of X”)
(although not always sensible)
Scatterplots
Note: Sometimes think about causation,
Other times: “Explore Relationship”
HW: 2.1
Class Scores Scatterplots
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls
B. How does HW predict Midterm 1?
$x_i$ = HW, $y_i$ = MT1
i. Still better HW $\Rightarrow$ better Exam
ii. But for each HW, wider range of MT1 scores
iii. I.e. HW doesn’t predict MT1 as well as Final
iv. “Outliers” in scatterplot may not be outliers in either individual variable
e.g. HW = 72, MT1 = 94
(bad HW, but good MT1?, fluke???)
Class Scores Scatterplots
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls
C. How does MT1 predict MT2?
$x_i$ = MT1, $y_i$ = MT2
i. Idea: less “causation”, more “exploration”
ii. Still higher MT1 associated with higher MT2
iii. For each MT1, wider range of MT2
i.e. “not good predictor”
iv. Interesting Outliers:
MT1 = 100, MT2 = 56 (oops!)
MT1 = 23, MT2 = 74 (woke up!)
Important Aspects of Relations
I. Form of Relationship
II. Direction of Relationship
III. Strength of Relationship
I. Form of Relationship
• Linear: Data approximately follow a line
Previous Class Scores Example
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls
Final vs. High values of HW is “best”
• Nonlinear: Data follows different pattern
Nice Example: Bralower’s Fossil Data
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg11.xls
Bralower’s Fossil Data
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg11.xls
From T. Bralower, formerly of Geological Sci.
Studies Global Climate, millions of years ago:
• Ratios of Isotopes of Strontium
• Reflects Ice Ages, via Sea Level
(50 meter difference!)
• As function of time
• Clearly nonlinear relationship
II. Direction of Relationship
• Positive Association
X bigger $\Rightarrow$ Y bigger
• Negative Association
X bigger $\Rightarrow$ Y smaller
E.g. X = alcohol consumption, Y = Driving Ability
Clear negative association
III. Strength of Relationship
Idea: How close are points to lying on a line?
Revisit Class Scores Example:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls
• Final Exam is “closely related to HW”
• Midterm 1 less closely related to HW
• Midterm 2 even less closely related to Midterm 1