G. Sawitzki: Computational Statistics Basic Reisensburg, 30.6.2009 1
Computational Statistics: A Proposal for a Basic Course
Statistical Computing 2009, 28.6.-1.7.2009, Schloss Reisensburg
Gunther Sawitzki <[email protected]>
StatLab Heidelberg
30.6.2009. Typeset: July 26, 2009
Note
Page references refer to:
Computational Statistics: An Introduction to R.
Chapman & Hall/CRC Press, Boca Raton (FL), 2009.
ISBN: 978-1-4200-8678-2
See http://sintro.r-forge.r-project.org/.
Background
  Predecessors
  Audience
  Topics
  Structure
Contents
Summary
Aim
A concise course in computational statistics
Predecessors
One-week post-graduate course "Biometry in Medicine"
One-week course: R programming
Linear Models
Statistical Data Analysis
. . .
Audience
Computational Statistics: An Introduction to R
Designed for a mixed audience:
  researchers and post-graduates from applied areas (in particular from clinical departments and from the DKFZ, the German cancer research center), with some working knowledge in statistical methods and with considerable laboratory experience
  students from mathematics or computer science, with a basic knowledge in (mathematical) stochastics
As one of the participants from the applied field said: "We can look up the methods ourselves. What we need is a guide to the underlying concepts."
Topics
What do we need?
Try to illustrate/demonstrate:
What are the statistical concepts and methods that are essential for computational statistics on a scientific level?
What is not needed?
How to survive bolognese?
Statistical Topics
Idea: Select a small set of fairly general statistical topics.
distribution diagnostics: given $X_i$, $i = 1, \dots, n$, infer on $\mathcal{L}(X_i)$
regression models and regression diagnostics: $Y = m(X) + \varepsilon$
non-parametric comparisons
multivariate analysis
Note
Topics refer to statistical problem classes,
not specifically to heuristics such as least squares, maximum likelihood etc.,
not to specific models.
They try to mark a broad range of topics.
Topics may be used as self-contained teaching modules, with only limited cross-import. They can be taught as separate units.
The course may be presented as an introduction to R. But actually it is an invitation to statistical data analysis.
Time Table
Compact course (5 days).
or
One term, 2h lectures plus 2h exercises per week.
Details to come.
Structure
Chapter Structure
Content chapters - used as course material.
  Core content
  R supplement
  Statistical summary
Course Material Structure
Four chapters, by statistical topic - used as course material.
Appendix: R Reference sections by programming topic - used as supplement or for look-up
Contents
Background
Contents
  Ch. 1: Distribution Diagnostics
  Ch. 2: Linear Models and Regression Diagnostics
  Ch. 3: Non-parametric Comparisons
  Ch. 4: Multivariate Analysis
Summary
Ch. 1: Distribution Diagnostics
R Programming Conventions
Generation of Random Numbers and Patterns
Case Study: Distribution Diagnostics
  Distribution Functions
  Histograms
  Barcharts
  Statistics of Distribution Functions; Kolmogorov-Smirnov Tests
  Monte Carlo Confidence Bands
  Statistics of Histograms and Related Plots; χ²-Tests
Moments and Quantiles
Example 1.1: A Simple Plot (p.7)
Input:
    x <- runif(100)
    plot(x)
[Figure: plot of x against Index (1 to 100); values of x between 0 and 1]
Exercise 1.1 (p.7)
Try experimenting with these plots and runif(). Do the plots show images of random numbers?
To be more precise: do you accept these plots as images of 100 independent realisations of random numbers, distributed uniformly on (0, 1)?
Repeat your experiments and try to note as precisely as possible the arguments you have for or against (uniform) randomness. What is your conclusion?
Walk through your arguments and try to draft a test strategy to analyse a sequence of numbers for (uniform) randomness. Try to formulate your strategy as clearly as possible.
Exercise 1.2 (p.10)
Use
plot(sin(1:100))
to generate a plot of a discretised sine function. Use your strategy from Exercise 1.1.
Does your strategy detect that the sine function is not a random sequence?
[Figure: plot of sin(1:100) against Index (1 to 100); values between -1.0 and 1.0]
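One ingredient a test strategy might include is a check for serial dependence, since a plot of the values alone says little about independence. The sketch below is only an illustration under assumptions: the helper `lag_cor` is hypothetical (not from the book), and the lag-1 correlation is just one of many possible dependence checks. For `sin(1:100)` the lag-1 correlation should come out at roughly cos(1) ≈ 0.54, while for independent uniform numbers it should be close to 0.

```r
set.seed(1)                       # fixed seed, only for reproducibility
u <- runif(100)                   # candidate random sequence (Example 1.1)
s <- sin(1:100)                   # deterministic sequence (Exercise 1.2)

# Hypothetical helper: correlation between x[i] and x[i + lag]
lag_cor <- function(x, lag = 1) {
  n <- length(x)
  cor(x[1:(n - lag)], x[(1 + lag):n])
}

lag_cor(u)                        # expected to be near 0
lag_cor(s)                        # expected to be clearly positive

# A lag plot makes the same structure visible
op <- par(mfrow = c(1, 2))
plot(u[-100], u[-1], xlab = "u[i]", ylab = "u[i+1]", main = "runif(100)")
plot(s[-100], s[-1], xlab = "s[i]", ylab = "s[i+1]", main = "sin(1:100)")
par(op)
```

In the lag plot, the sine sequence traces out a closed curve, whereas the uniform sample scatters over the unit square.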
Note
Try to pose a challenge.
This is not something to solve on the fly.
It is something to come back to.
Running Exercise in Ch. 1:
Can you tell a uniform from a Gaussian distribution, based on a sample?
Various methods are discussed.
What is the minimum sample size at which the distribution is barely recognizable?
What is the sample size needed for a clear impression?
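The running exercise can also be explored by simulation. The following is a rough sketch, not the course's own method: it assumes the Shapiro-Wilk test (`shapiro.test`) as the method of comparison, a 5% level, 200 replications, and an arbitrary grid of sample sizes. The helper name `reject_rate` is hypothetical. It estimates how often a uniform sample of size n is recognised as non-Gaussian.

```r
set.seed(2009)                    # fixed seed, only for reproducibility

# Hypothetical helper: fraction of uniform samples of size n for which
# the Shapiro-Wilk test rejects normality at level alpha
reject_rate <- function(n, nsim = 200, alpha = 0.05) {
  mean(replicate(nsim, shapiro.test(runif(n))$p.value < alpha))
}

sizes <- c(10, 20, 50, 100, 200)  # assumed grid of sample sizes
rates <- sapply(sizes, reject_rate)
names(rates) <- sizes
rates
```

Reading the result as a function of sample size gives one possible answer to "barely recognizable" versus "clear impression". Other methods, such as plots, will generally give different sample sizes.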
Case Study: Distribution Diagnostics (p.10)
Distribution function
Histogram
Smoothing (kernel density estimation)
Exercise 1.4 (p.16)
[Figure: Empirical distribution function (X uniform); x-axis: x, y-axis: Fn(x)]
Note
We start with possibly competing approaches. There is more than one way.
For each approach, the theory (and pragmatics) is developed in steps.
After each step, the question is addressed how these approaches compare, and, ultimately, whether there is one which is to be preferred.
Case Study: Distribution Function (p.10)
start with simple prototypes
refine software, e.g. graphics
make mathematics correct.
This may need some theorems, e.g.
Theorem: $F(X_{(i)})$ has a beta distribution $\beta(i, n - i + 1)$.
Corollary: $E(F(X_{(i)})) = i/(n + 1)$.
[Figure: Empirical distribution function (X uniform); x-axis: x, y-axis: Fn(x)]
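The corollary can be checked by a small simulation. This is a minimal sketch under assumptions: the sample is taken to be uniform on (0, 1), so that F(x) = x and F(X_(i)) is simply the i-th order statistic; the sample size n = 10 and the number of replications are arbitrary choices.

```r
set.seed(42)                      # fixed seed, only for reproducibility
n    <- 10                        # assumed sample size
nsim <- 5000                      # assumed number of simulated samples

# Each column holds one sorted sample, i.e. X_(1), ..., X_(n).
# For U(0,1), F(x) = x, so these are also the values F(X_(i)).
os <- replicate(nsim, sort(runif(n)))

sim_mean  <- rowMeans(os)         # simulated E(F(X_(i)))
theo_mean <- (1:n) / (n + 1)      # value given by the corollary

round(cbind(i = 1:n, simulated = sim_mean, theoretical = theo_mean), 3)
max(abs(sim_mean - theo_mean))    # should be small
```

The same matrix `os` could be used to compare each row with the corresponding beta distribution, for example with a histogram overlaid by `dbeta(x, i, n - i + 1)`.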
Example 1.11 Monte Carlo Confidence Bands (p.23)
Simulation can also help us to get an impression of the typical fluctuation. We use random numbers to generate a small number of samples, and compare our sample in question with these simulations. For comparison, we generate envelopes of these simulations and check whether our sample lies within the area delimited by the envelopes.
[Figure: Monte Carlo Band: 19 Monte Carlo Samples; x-axis: x, y-axis: Fn]
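The envelope construction described above might be sketched as follows. This is not the book's code for Example 1.11. It assumes a uniform reference distribution, a sample size of 100, and a fixed evaluation grid; the "sample in question" is itself simulated here for illustration. The number of 19 Monte Carlo samples is taken from the figure title.

```r
set.seed(1)                       # fixed seed, only for reproducibility
n        <- 100                   # assumed sample size
nsamples <- 19                    # number of Monte Carlo samples (as in the figure)
grid     <- seq(0, 1, length.out = 201)

x  <- runif(n)                    # sample in question (simulated here)
Fn <- ecdf(x)(grid)               # its empirical distribution function on the grid

# Each column: empirical distribution function of one Monte Carlo sample
sims  <- replicate(nsamples, ecdf(runif(n))(grid))
lower <- apply(sims, 1, min)      # lower envelope
upper <- apply(sims, 1, max)      # upper envelope

plot(grid, Fn, type = "s", xlab = "x", ylab = "Fn",
     main = "Monte Carlo Band: 19 Monte Carlo Samples")
lines(grid, lower, type = "s", lty = 2)
lines(grid, upper, type = "s", lty = 2)

# Does the sample stay within the area delimited by the envelopes?
inside <- all(Fn >= lower & Fn <= upper)
inside
```

Note that the check is made only on the grid points, which is a simplification.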
From: Statistical Computing 1993
We claim: a diagnostic plot is only as good as the hard statistical theory that is supporting it.
G. Sawitzki in: Computational Statistics. Papers collected on the Occasion of the 25th Conference on Statistical Computing at Schloss Reisensburg. Edited by P. Dirschedl & R. Ostermann for the Working Groups . . . Physica/Springer: Heidelberg, 1994, ISBN 3-7908-0813-X, p. 237-258.
Plots and their statistical counterparts
Plot                     Statistics/Test
histogram                χ² tests
distribution function    Kolmogorov-Smirnov test
Statistics for the probability plot (p.27)
Theorem (Kolmogorov, Smirnov): For a continuous distribution function $F$, the distribution of $\sup_x |F_n - F|(x)$ is independent of $F$ (in general, it will depend on $n$).
Theorem (Kolmogorov): For a continuous distribution function $F$ and $n \to \infty$, the statistic $\sqrt{n}\,\sup |F_n - F|$ has asymptotically the distribution function
$F_{\text{Kolmogorov-Smirnov}}(y) = \sum_{m \in \mathbb{Z}} (-1)^m e^{-2 m^2 y^2}$ for $y > 0$.
Theorem (Massart 1990): For all integer $n$ and any positive $\lambda$, we have
$P(\sqrt{n}\,\sup |F_n - F| > \lambda) \le 2 e^{-2\lambda^2}$.
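Massart's bound translates directly into a finite-sample confidence band: setting 2 exp(-2 λ²) = α gives λ = sqrt(log(2/α)/2), so the band Fn ± λ/√n has coverage at least 1 - α for every n. The sketch below illustrates this under assumptions: the helper name `dkw_halfwidth`, the level α = 0.05, the sample size, and the uniform reference distribution used in the simulation are all choices made for illustration.

```r
# Hypothetical helper: half-width of the band implied by Massart's bound
dkw_halfwidth <- function(n, alpha = 0.05) {
  sqrt(log(2 / alpha) / (2 * n))
}

dkw_halfwidth(c(10, 50, 100, 1000))

# Simulation check for one sample size: how often does sup |Fn - F|
# exceed the half-width? The bound says: at most alpha.
set.seed(3)                       # fixed seed, only for reproducibility
n <- 50
D <- replicate(2000, ks.test(runif(n), "punif")$statistic)
mean(D > dkw_halfwidth(n))
```

The simulated exceedance frequency is an estimate and fluctuates from run to run, whereas the bound itself holds for every n.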
Note
With all respect: asymptotics should be put in its place.
Learning the difference between asymptotic statements (such as Kolmogorov) and finite sample bounds (like the Dvoretzky-Kiefer-Wolfowitz inequality studied by Massart) should start early.
Exercise 1.25 Sample Size (p. 41)
Generate a PP plot of the t(ν) distribution against the standard normal distribution in the range 0.01 ≤ p ≤ 0.99 for ν = 1, 2, 3, . . ..
Generate a QQ plot of the t(ν) distribution against the standard normal distribution in the range −3 ≤ x ≤ 3 for ν = 1, 2, 3, . . ..
How large must ν be so that the t distribution is barely different from the normal distribution in these plots?
How large must ν be so that the t distribution is barely different from the normal distribution if you compare the graphs of the distribution functions?
See also (p. 42 - p. 45).
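A possible starting point for this exercise is sketched below. It is not the book's solution and rests on assumptions: the normal distribution is put on the horizontal axis, p and x are interpreted as normal probabilities and normal quantiles, and the grid and the selection of ν are arbitrary. The maximal deviations from the diagonal are computed as one rough numerical summary of what the plots show.

```r
p   <- seq(0.01, 0.99, by = 0.01)   # probability range for the PP plot
x   <- seq(-3, 3, by = 0.05)        # quantile range for the QQ plot
nus <- c(1, 2, 3, 5, 10, 30)        # assumed selection of degrees of freedom

op <- par(mfrow = c(1, 2))

# PP plot: t probability against normal probability at the same point
plot(p, p, type = "l", lty = 2, xlab = "normal probability",
     ylab = "t probability", main = "PP plot")
for (nu in nus) lines(p, pt(qnorm(p), df = nu))

# QQ plot: t quantile against normal quantile at the same probability
plot(x, x, type = "l", lty = 2, xlab = "normal quantile",
     ylab = "t quantile", main = "QQ plot")
for (nu in nus) lines(x, qt(pnorm(x), df = nu))

par(op)

# Maximal deviation from the diagonal, per nu
pp_dev <- sapply(nus, function(nu) max(abs(pt(qnorm(p), df = nu) - p)))
qq_dev <- sapply(nus, function(nu) max(abs(qt(pnorm(x), df = nu) - x)))
cbind(nu = nus, pp_dev, qq_dev)
```

The QQ deviations are dominated by the tails, so the two plots will typically suggest different answers to "how large must ν be".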
Note
Where possible, we try to complement theoretical results by simulations.
At this step, we avoid concepts like power. Instead we draw attention to the question: what is the sample size we need to solve a certain task?
At this early point of the course, power differences are discussed in terms of required sample size.
We avoid introducing the term "relative efficiency", so as not to overload the chapter.
Contents
Ch. 2: Linear Models and Regression Diagnostics
General Regression Model
Linear Model
  Least Squares Estimation
  Regression Diagnostics (see p. 69 ff.)
  Model Formulae
  Gauss-Markov Estimator and Residual
Variance Decomposition by Orthogonal Complements, and Analysis of Variance
Simultaneous Inference
  Scheffé’s Confidence Bands (see p. 85 ff.)
  Tukey’s Confidence Intervals (see p. 87)
  Case Study: Titre Plates (see p. 88 ff.)
Beyond Linear Regression
  Just mentioned:
  Transformations
  Generalised Linear Models
  Local Regression
Note
This chapter: mainly textbook material by now, with some extensions from a data-analytical point of view ...
Still needed: point out the special role of the one-dimensional response situation, e.g. as expressed by the Gauss-Markov theorem.
Contents
Ch. 3: Non-parametric Comparisons
Shift/Scale Families, and Stochastic Order
QQ Plot, PP Plot, and Comparison of Distributions
Kolmogorov-Smirnov Tests
Tests for Shift Alternatives
A Road Map
Power and Confidence
  Theoretical Power and Confidence
  Simulated Power and Confidence
  Non-Parametric Quantile Estimation
Qualitative Features of Distributions
Exercise 3.2: Click Comparison (p. 109)
Try clicking on a random point, first with the left and then with the right hand.
Please click on the circle
Immediate impression: “feels different”.
One hand is more responsive.
Stochastic Order
Notation: A distribution with distribution function F_1 is stochastically smaller than a distribution with distribution function F_2 (in symbols, F_1 ≺ F_2) if a variable distributed as F_1 tends to take smaller values than a variable distributed as F_2. This means that F_1 increases sooner: F_1(x) ≥ F_2(x) for all x, and F_1(x) > F_2(x) for at least one x.
Shift/Scale Families
Notation: For a distribution with distribution function F, the family F_a(x) = F(x − a) is called the shift family for F. The parameter a is called the shift or location parameter.
Define shift/scale family, and relate to stochastic order.
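The definition of stochastic order can be checked directly on empirical distribution functions. The sketch below is in Python and is an illustration only; the sample values are made up and the function names are hypothetical. It also illustrates one link between shift families and stochastic order: since F(x − a) ≤ F(x) for a > 0, a positive shift yields a stochastically larger distribution.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical distribution function of a sample."""
    data = sorted(sample)
    n = len(data)
    return lambda x: bisect_right(data, x) / n


def stochastically_smaller(sample1, sample2):
    """True if ECDF1(x) >= ECDF2(x) for all x, with strict inequality somewhere."""
    f1, f2 = ecdf(sample1), ecdf(sample2)
    # Both functions are step functions that jump only at data points,
    # so it is enough to compare them on the pooled sample.
    grid = sorted(set(sample1) | set(sample2))
    diffs = [f1(x) - f2(x) for x in grid]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


base = [0.75, 1.0, 1.25, 2.0, 2.5]      # made-up values, for illustration only
shifted = [x + 0.5 for x in base]       # same sample shifted by a = 0.5
spread = [0.25, 1.0, 1.25, 2.75, 4.0]   # made-up sample with a wider spread

print(stochastically_smaller(base, shifted))  # True: a positive shift
print(stochastically_smaller(base, spread))   # False
print(stochastically_smaller(spread, base))   # False: the two functions cross
```

The last two lines show that stochastic order is only a partial order: when the distribution functions cross, neither sample is stochastically smaller.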
Note
Stochastic order and stochastic monotonicity are the core concepts that explain why in some situations statistical problems can be reduced to optimization problems.
They are at the core of much of theoretical statistics, e.g. Neyman-Pearson theory.
Recognizing stochastic order relations and stochastic monotonicity is a key competence in statistics.
“Monotone likelihood ratios” etc. only obscure the core argument.
Two Sample Comparison
Challenge: compare two samples.
[Figure: “Click Time”. Plot of the distribution functions Fn for right hand and left hand; x-axis: time [s], 0 to 5; y-axis: Fn, 0.0 to 1.0.]
Distribution functions for the right/left click time
Note: this is not a shift alternative.
Challenge: compare two samples.
t-test for normal shift families (recall from Ch. 2)
rank tests (Wilcoxon)
permutation tests
bootstrap
QQ plot
PP plot & Kolmogorov-Smirnov
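Of the methods listed, the permutation test is perhaps the easiest to write down from scratch. A minimal Python sketch follows, assuming the difference in means as test statistic and random permutations instead of full enumeration. The click times are invented for illustration and are not data from the course; the function names are hypothetical.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def permutation_test(x, y, n_perm=5000, seed=1):
    """Two-sided Monte Carlo permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        # Under the null hypothesis the group labels are exchangeable,
        # so any split of the pooled data is as likely as the observed one.
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_x]) - mean(pooled[n_x:]))
        if stat >= observed:
            hits += 1
    # Counting the observed split itself keeps the p-value away from 0.
    return (hits + 1) / (n_perm + 1)


# Invented click times in seconds, for illustration only.
right_hand = [0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.6, 1.8]
left_hand = [1.5, 1.7, 1.9, 2.2, 2.4, 2.8, 3.1, 3.9]

p_value = permutation_test(right_hand, left_hand)
print(f"permutation p-value: {p_value:.4f}")
```

The test statistic is a design choice: the same resampling loop works with a difference in medians or a rank sum, which changes what kind of difference the test is sensitive to.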
Lessons / Issues to point out
Statistical methods / tests may rely on different assumptions about the data for their validity.
Statistical methods / tests may have different targets. Which of these methods target shift alternatives? Which have a more general target?
Unsatisfied assumptions or missed targets do not necessarily imply that a method is unusable. For example, you can use a test targeted at shift alternatives to detect differences that are not covered by shift alternatives.
A thorough discussion is needed here to prepare for the next question: how to compare tests, or methods in general.
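The remark on shift-targeted tests can be illustrated by simulation. The Python sketch below applies the Wilcoxon rank-sum test, in its Mann-Whitney form with a normal approximation, to two exponential samples that differ by a scale factor rather than by a shift. The distributions, the sample size, and the critical value 1.96 are assumptions chosen for illustration; they are not taken from the course material.

```python
import math
import random


def mann_whitney_z(x, y):
    """Standardised Mann-Whitney U statistic (normal approximation, no tie correction)."""
    n, m = len(x), len(y)
    # U counts the pairs with x_i < y_j; ties count one half.
    u = sum((xi < yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    mean_u = n * m / 2
    sd_u = math.sqrt(n * m * (n + m + 1) / 12)
    return (u - mean_u) / sd_u


def rejection_rate(scale, n=20, n_sim=1000, z_crit=1.96, seed=1):
    """Share of simulated data sets in which the two-sided test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        x = [rng.expovariate(1.0) for _ in range(n)]
        y = [scale * rng.expovariate(1.0) for _ in range(n)]  # scale change, no shift
        if abs(mann_whitney_z(x, y)) > z_crit:
            rejections += 1
    return rejections / n_sim


null_rate = rejection_rate(scale=1.0)  # both samples from the same distribution
alt_rate = rejection_rate(scale=3.0)   # second sample stretched by a factor of 3
print(f"rejection rate, same distribution: {null_rate:.3f}")
print(f"rejection rate, scale factor 3:    {alt_rate:.3f}")
```

For positive variables a scale change produces a stochastically larger distribution, which is why a rank test aimed at shifts can still pick it up.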
Two Sample Comparison: Comparison of Methods
Two sample comparisons
t-test for normal shift families (recall from Ch. 2)
rank tests (Wilcoxon)
permutation tests
bootstrap
QQ plot
PP plot & Kolmogorov-Smirnov
Comparison of methods
sample size comparisons (relative efficiency)
theoretical power
power comparison by simulation
“test beds” and scenarios
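Power comparison by simulation can be sketched as follows. The Python code below uses a normal shift alternative as a small test bed and compares a z-type test on the difference in means with the two-sample Kolmogorov-Smirnov test using its asymptotic 5% critical value. Both critical values are large-sample approximations, and the shift, the sample sizes, and the function names are assumptions for illustration, not part of the course material.

```python
import math
import random
from bisect import bisect_right


def z_statistic(x, y):
    """Welch-type standardised difference in means."""
    n, m = len(x), len(y)
    mx, my = sum(x) / n, sum(y) / m
    vx = sum((v - mx) ** 2 for v in x) / (n - 1)
    vy = sum((v - my) ** 2 for v in y) / (m - 1)
    return (mx - my) / math.sqrt(vx / n + vy / m)


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_n(t) - G_m(t)|."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    return max(abs(bisect_right(xs, t) / n - bisect_right(ys, t) / m)
               for t in xs + ys)


def simulated_power(shift, n, n_sim=1000, seed=1):
    """Estimated rejection rates (z-type, KS) for N(0,1) against N(shift,1)."""
    rng = random.Random(seed)
    ks_crit = 1.358 * math.sqrt(2 / n)  # asymptotic 5% value for equal sample sizes
    z_hits = ks_hits = 0
    for _ in range(n_sim):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(shift, 1.0) for _ in range(n)]
        z_hits += abs(z_statistic(x, y)) > 1.96  # True counts as 1
        ks_hits += ks_statistic(x, y) > ks_crit
    return z_hits / n_sim, ks_hits / n_sim


for n in (10, 20, 40):
    power_z, power_ks = simulated_power(shift=1.0, n=n)
    print(f"n = {n:2d}: z-type test {power_z:.2f}, KS test {power_ks:.2f}")
```

Read across sample sizes, the output gives the sample-size view of the comparison: the n at which each test reaches a given rejection rate.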
Note
Comparison of methods, other than from the sample size point of view, has been postponed until there is a sufficient collection of competing methods.
There is no discussion of optimality in this course, except for marginal remarks.
“Optimality” is helpful if there is a one-dimensional optimality criterion. It may be a misleading focus if there is more than one aspect to cover.
Open Question
What is the state-of-the-art information we should give about two-sample comparison, keeping in mind that there are more possibilities for differences than those covered by shift alternatives?
Contents
Ch. 4: Multivariate Analysis
Dimensions
Selections
Projections
  Marginal Distributions and Scatter Plot Matrices
  Projection Pursuit
  Projections for Dimensions 1, 2, 3, ... 7
  Parallel Coordinates
Sections, Conditional Distributions and Coplots
Transformations and Dimension Reduction
Higher Dimensions
  Linear Case
  Partial Residuals and Added Variable Plots
  Non-Linear Case
  Example: Cusp Non-Linearity
  Case Study: Melbourne Temperature Data
  Curse of Dimensionality
  Case Study: Body Fat
High Dimensions
Open Questions
This chapter needs a revision.
What are the minimal concepts which we should teach about multivariate statistics?
What are the basic methods which should at least be mentioned?
Summary
Background
Contents
Summary
  Time Table
  Open Issues
  References
Summary
Time Table
Monday Tuesday Wednesday Thursday Friday
Basic Data Analysis (Ch. 1)
Regression (Ch. 2) Regression Regression: Discussion
Excerpts from Multivariate (Ch. 4)
Basic Data Analysis
Regression Regression Comparison (Ch. 3) Excerpts from Multivariate
Lunch Break
Basic Data Analysis
Exercises Unsupervised Exercises
Comparison Exercises & Discussion
Exercises Regression Unsupervised Exercises
Exercises Exercises & Discussion
Afternoon Break
Basic Data Analysis
Exercises Unsupervised Exercises
Discussion Discussion
Exercises & Discussion
Exercises & Discussion
Supplements from Ch. 03
Issues to Check: QQ-Plot, PP-Plot, Monte Carlo Bands
Issues to Check: Basic Diagnostics for Linear Regression
No Checks Today
Issues to Check: Stochastic Order; Shift Alternatives vs. General Differences
Issues to Check: Univariate vs. Multivariate Comparison
Summary
Open Issues
Ch. 1 Distribution Diagnostics
Essentially stable.
Ch. 2 Regression
Clarify Gauss-Markov theorem and role of dimension.
Can the chapter be cleaned up?
Ch. 3 Comparison (Still a placeholder)
What is an up-to-date discussion of the (non-shift) two sample case?
Clean up.
Ch. 4 Multivariate
Given the time limitations, is the current list of concepts sufficient?
Discuss scaling issues, e.g. with respect to PCA.
![Page 73: Computational Statistics A Proposal for a Basic …sintro.r-forge.r-project.org/CompStatBasic.pdfG. Sawitzki: Computational Statistics Basic Reisensburg, 30.6.2009 1 Background Contents](https://reader030.fdocuments.us/reader030/viewer/2022040105/5e5f56503406f8246b59f795/html5/thumbnails/73.jpg)
G. Sawitzki: Computational Statistics Basic Reisensburg, 30.6.2009 73
BackgroundContentsSummary
Time TableOpen IssuesReferences
SummaryOpen Issues
Ch. 1 Distribution Diagnostic
Essentially stable.
Ch. 2 Regression
Clarify Gauss-Markov theorem and role of dimension.Can the chapter be cleaned up ?
Ch. 3 Comparison (Still a placeholder)
What is an up-to-date discussion of the (non-shift) two sample case?Clean up.
Ch. 4 Multivariate
Given the time limitations, is the current list of concepts sufficient ?Discuss scaling issues, e.g. with respect to PCA.
![Page 74: Computational Statistics A Proposal for a Basic …sintro.r-forge.r-project.org/CompStatBasic.pdfG. Sawitzki: Computational Statistics Basic Reisensburg, 30.6.2009 1 Background Contents](https://reader030.fdocuments.us/reader030/viewer/2022040105/5e5f56503406f8246b59f795/html5/thumbnails/74.jpg)
G. Sawitzki: Computational Statistics Basic Reisensburg, 30.6.2009 74
BackgroundContentsSummary
Time TableOpen IssuesReferences
SummaryOpen Issues
Ch. 1 Distribution Diagnostic
Essentially stable.
Ch. 2 Regression
Clarify Gauss-Markov theorem and role of dimension.Can the chapter be cleaned up ?
Ch. 3 Comparison (Still a placeholder)
What is an up-to-date discussion of the (non-shift) two sample case?Clean up.
Ch. 4 Multivariate
Given the time limitations, is the current list of concepts sufficient ?Discuss scaling issues, e.g. with respect to PCA.
![Page 75: Computational Statistics A Proposal for a Basic …sintro.r-forge.r-project.org/CompStatBasic.pdfG. Sawitzki: Computational Statistics Basic Reisensburg, 30.6.2009 1 Background Contents](https://reader030.fdocuments.us/reader030/viewer/2022040105/5e5f56503406f8246b59f795/html5/thumbnails/75.jpg)
G. Sawitzki: Computational Statistics Basic Reisensburg, 30.6.2009 75
BackgroundContentsSummary
Time TableOpen IssuesReferences
SummaryOpen Issues
Ch. 1 Distribution Diagnostic
Essentially stable.
Ch. 2 Regression
Clarify Gauss-Markov theorem and role of dimension.Can the chapter be cleaned up ?
Ch. 3 Comparison (Still a placeholder)
What is an up-to-date discussion of the (non-shift) two sample case?Clean up.
Ch. 4 Multivariate
Given the time limitations, is the current list of concepts sufficient ?Discuss scaling issues, e.g. with respect to PCA.
![Page 76: Computational Statistics A Proposal for a Basic …sintro.r-forge.r-project.org/CompStatBasic.pdfG. Sawitzki: Computational Statistics Basic Reisensburg, 30.6.2009 1 Background Contents](https://reader030.fdocuments.us/reader030/viewer/2022040105/5e5f56503406f8246b59f795/html5/thumbnails/76.jpg)
G. Sawitzki: Computational Statistics Basic Reisensburg, 30.6.2009 76
BackgroundContentsSummary
Time TableOpen IssuesReferences
SummaryOpen Issues
Ch. 1 Distribution Diagnostic
Essentially stable.
Ch. 2 Regression
Clarify Gauss-Markov theorem and role of dimension.Can the chapter be cleaned up ?
Ch. 3 Comparison (Still a placeholder)
What is an up-to-date discussion of the (non-shift) two sample case?Clean up.
Ch. 4 Multivariate
Given the time limitations, is the current list of concepts sufficient ?Discuss scaling issues, e.g. with respect to PCA.
Summary: References
See http://sintro.r-forge.r-project.org/.
Private Note
Document identification:
$Source: /u/math/j40/cvsroot/lectures/src/SIntro/Reisensburg_2009/CompStatBasic.tex,v $
$Revision: 1.16 $
$Date: 2009/07/25 20:09:50 $
$Name: $
$Author: j40 $
LaTeX typeset: July 26, 2009