Post on 06-Apr-2018
8/3/2019 Statistics 0 Introduction
Part 0 -- Introduction
Statistics and Data Analysis
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics
[Slide background: collage of example plots. Panels: Pie Chart of Percent vs Type (Plain 32.5%, Pepperoni 21.8%, Mushroom 16.2%, Mushroom and Onion 9.2%, Pepper and Onion 7.3%, Sausage 5.8%, Meatball 5.0%, Garlic 2.3%); Boxplot of Listing; Scatterplot of Listing vs IncomePC; Probability Plot of Listing, Normal - 95% CI (Mean 369687, StDev 156865, N 51, AD 0.994, P-Value 0.012); Histogram of Listing; Empirical CDF of Listing, Normal; Marginal Plot of Listing vs IncomePC.]
Professor William Greene; Economics and IOMS Departments
Office: KMEC, 7-90 (Economics Department)
Office phone: 212-998-0876
Email: wgreene@stern.nyu.edu
URL: http://www.stern.nyu.edu/~wgreene
http://www.stern.nyu.edu/~wgreene/Statistics/Outline.htm
Course Objectives
Understand random outcomes and random information
Understand statistical information as the measured outcomes of random processes
Learn how to analyze statistical information
  Statistical analysis
  Model building
Learn how to present statistical information
What Does it Mean?
Slightly more than one-third of Americans have a favorable opinion of the Democratic-led Congress, a poll said Wednesday.
The Pew Research Center for the People & the Press said the 37% expressing a positive opinion represents a decline of 13 points since April.
The favorable percentage is one of the lowest in more than two decades of Pew surveys if not the lowest, the poll said. The previous low was 40% in January, but the result is not statistically significant because of the margin of error.
(USA Today, 9/3/09, page 4)
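One way to see why a 37% versus 40% gap can fail to be "statistically significant" is to compute a margin of error for the difference between two poll percentages. The sketch below is only an illustration: the article does not report sample sizes, so the 1,000 respondents per poll, the assumption of independent simple random samples, and the function name are all hypothetical.

```python
import math

def margin_of_error_diff(p1: float, n1: int, p2: float, n2: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for the difference of two independent sample proportions."""
    variance = p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2
    return z * math.sqrt(variance)

# Hypothetical sample sizes; the article does not give them.
n_assumed = 1000
current, previous_low = 0.37, 0.40

gap = previous_low - current
moe_diff = margin_of_error_diff(current, n_assumed, previous_low, n_assumed)

print(f"gap between polls:         {gap:.3f}")
print(f"margin of error for a gap: {moe_diff:.3f}")
print("gap smaller than margin of error:", gap < moe_diff)
```

Under these assumed sample sizes the 3-point gap is smaller than the margin of error for a difference, which is consistent with the article's wording. A different sample size could change that conclusion.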
Really?
To Get Rid of Hiccups, Have Someone Startle You.
The truth is: Most home remedies, like holding your breath or drinking from a glass of water backward, haven't been medically proven to be effective, says Pollack. However, you can try this trick dating back to 1971, when it was published in The New England Journal of Medicine: Swallow one teaspoon of white granulated sugar. According to the study, this tactic resulted in the cessation of hiccups in 19 out of 20 afflicted patients.
Posted August 31, 2010, cnn.com
http://www.cnn.com/2010/HEALTH/08/31/rs.12.health.myths/index.html?iref=allsearch
Heard on the Street?
Dear Professor Greene,
The WSN is trying to poll people on the Park51 Mosque debate. I saw that you were an statistics/data analysis professor and I was wondering if you could explain how we should go about conducting this poll. For example, approximatley [sic] how many people would we need to poll for the data to be completley [sic] unbaised?
Email received September 5, 2010
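A note on the question in the email: sample size controls how precise a poll is (its margin of error), while bias depends on how respondents are selected, so no sample size makes a poll "completely unbiased." The sketch below shows the usual sample-size calculation for a target margin of error. It assumes simple random sampling, 95% confidence (z = 1.96) and the worst case p = 0.5; the function name is made up for illustration.

```python
import math

def sample_size_for_margin(margin: float, p: float = 0.5, z: float = 1.96) -> int:
    """Smallest n whose approximate margin of error for a proportion is at most `margin`."""
    return math.ceil(z * z * p * (1.0 - p) / (margin * margin))

# Respondents needed for +/-3 and +/-5 percentage points.
n_for_3_points = sample_size_for_margin(0.03)
n_for_5_points = sample_size_for_margin(0.05)

print(n_for_3_points)  # 1068
print(n_for_5_points)  # 385
```

Tightening the margin from 5 points to 3 points nearly triples the required sample, because the sample size grows with the inverse square of the margin.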
Technical Help Wanted
Our firm is looking for a [Ph.D.-level] statistician to assist us in analyzing a simple database of compensation levels. Our database includes 93 unique records for different institutions. We expect to analyze two dependent variables against 13 independent variables.
We need to perform multivariate regression analysis to determine which of the variables are statistically significant. We also need to calculate the t-statistics for each of the independent variables and adjusted r-squared values for the multivariate regression model developed. We expect that some of the variables may need to be transformed prior to creating the regression analysis. Additional statistical approaches and techniques may be required as appropriate.
Subsequent to the analysis of each of the variables, we will require a brief write-up detailing any relationships (or lack thereof) uncovered through the analysis. We anticipate that this write-up will be approximately 2-3 pages in length, excluding any supporting appendices. This write up should describe, in plain English, all relevant details regarding the analysis.
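The ad mentions adjusted r-squared for a model with 93 records and 13 independent variables. As a small illustration, the standard adjustment is 1 - (1 - R^2)(n - 1)/(n - k - 1), which penalizes R^2 for the number of predictors. The R^2 value of 0.50 below is invented for the example, and the function name is hypothetical.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 for a regression with an intercept and n_predictors independent variables."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# 93 records and 13 independent variables come from the ad; R^2 = 0.50 is made up.
adj = adjusted_r_squared(0.50, n_obs=93, n_predictors=13)
print(f"adjusted R^2: {adj:.3f}")
```

With only 93 observations, 13 predictors take a visible bite out of the fit measure, which is one reason a data set like this calls for care.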
Course Prerequisites
Basic algebra. (Especially summation)
Geometry (straight lines)
Logs and exponents
  NOTE: I (you) will use only base e (natural) logs, not base 10 (common) logs in this course.
A smattering of simple calculus. (I may use two or three derivatives during the entire semester.)
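For readers who want to check the two prerequisites mentioned most specifically, here is a brief sketch of summation and of natural versus common logs. The values are arbitrary and the helper name is made up; the course itself uses Minitab, so Python here is only a convenient calculator.

```python
import math

def natural_log_from_common(x: float) -> float:
    """Convert a base-10 log to a natural log: ln(x) = log10(x) * ln(10)."""
    return math.log10(x) * math.log(10)

# Summation: 1 + 2 + ... + n equals n(n + 1)/2.
n = 10
total = sum(range(1, n + 1))
print(total, n * (n + 1) // 2)

# Natural (base e) log versus common (base 10) log of the same number.
x = 100.0
print(math.log(x))                 # natural log, the one used in this course
print(math.log10(x))               # common log
print(natural_log_from_common(x))  # matches math.log(x) up to rounding
```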
Course Materials
Notes: Distributed in first class
Text: Hildebrand, Ott and Gray. Basic Statistical Ideas for Managers, 2nd ed. (Recommended, not required)
On the course website:
  Miscellaneous notes and materials
  Class slide presentations
  Problem sets
http://www.stern.nyu.edu/~wgreene/Statistics/Outline.htm
Course Software: Minitab
The Current Version: Minitab 16
Buy: Professional Bookstore
Rent: www.e-academy.com e-Store
Course Outline and Overview
1. Presenting Data
Data
  Types
  Information content
Data Description
  Graphical devices: Plots, histograms
  Statistical: Summary statistics
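As a rough illustration of "summary statistics," the sketch below computes a mean, median and sample standard deviation. The five listing prices are invented for the example and are not the course data.

```python
import statistics

# Hypothetical listing prices (not the course data set).
listings = [200_000, 300_000, 400_000, 500_000, 900_000]

mean_listing = statistics.mean(listings)
median_listing = statistics.median(listings)
stdev_listing = statistics.stdev(listings)  # sample standard deviation (divides by n - 1)

print(f"mean:   {mean_listing:,.0f}")
print(f"median: {median_listing:,.0f}")
print(f"stdev:  {stdev_listing:,.0f}")
```

The single high value pulls the mean above the median, the kind of feature a boxplot or histogram makes visible at a glance.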
Data: House Price Listings and Income
How to describe/summarize them.
How to explain the variation across states
How to determine if there is any connection between the two variables.
Course Outline and Overview
2. Explaining How Random Data Arise
Probability: Understanding unpredictable outcomes
  Precise mathematical principles of random outcomes that occur naturally, e.g., gambling and games of chance
  Models = descriptions of random outcomes that occur in nature but don't have fixed mathematical laws
The Normal distribution
  THE fundamental model for outcomes involving behavior
  Model building for random outcomes using the normal distribution
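To make "model building using the normal distribution" concrete, here is a sketch that treats Listing as normally distributed with the Mean (369687) and StDev (156865) shown in the slide-background probability plot. Treating the normal as adequate is an assumption made only for illustration (that same plot reports a P-Value of 0.012), and the 500,000 threshold is arbitrary.

```python
from statistics import NormalDist

# Mean and standard deviation as reported on the probability plot of Listing.
listing_model = NormalDist(mu=369_687, sigma=156_865)

threshold = 500_000  # arbitrary cutoff chosen for the example
p_above = 1.0 - listing_model.cdf(threshold)
z_score = (threshold - listing_model.mean) / listing_model.stdev

print(f"z-score of {threshold:,}: {z_score:.2f}")
print(f"model probability of a listing above {threshold:,}: {p_above:.3f}")
```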
Course Outline and Overview
3. Modeling Relationships Between Outcomes
What is correlation?
Simple linear regression: Connecting one variable with another
Multiple regression
  Model building
  Understanding covariation of more than one variable.
[Figure: Scatterplot of Listing vs IncomePC. Annotations: "Correlation = 0.428. Is this large?" and "Hawaii. Outlier?"]
Course Outline and Overview
4. Statistical inference
Hypothesis testing: (Is the correlation large? Could it actually be zero?)
Hypothesis tests for specific applications
  Mean of a population: Is it a specific value?
  Pair of means: Are they equal?
  Applications in regression: Are the variables in the model really related?
  An application in marketing: Did the sales promotion work? How would you find out?
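The question "Is the correlation large? Could it actually be zero?" can be sketched with the usual t-statistic for a correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2). The value r = 0.428 is the correlation quoted for Listing vs IncomePC. Using n = 51 assumes the scatterplot has the same 51 observations reported for Listing, and the critical value of about 2.01 (two-sided 5% test, 49 degrees of freedom) is hard-coded here, not computed.

```python
import math

def correlation_t_statistic(r: float, n: int) -> float:
    """t-statistic for testing whether a population correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

r = 0.428              # correlation quoted on the scatterplot slide
n = 51                 # assumed: same N as reported for Listing
critical_value = 2.01  # approximate two-sided 5% cutoff for 49 degrees of freedom

t_stat = correlation_t_statistic(r, n)
print(f"t-statistic: {t_stat:.2f}")
print("reject 'correlation is zero' at the 5% level:", abs(t_stat) > critical_value)
```

Under these assumptions the correlation is distinguishable from zero. That does not settle whether it is "large" in a practical sense, and an outlier such as Hawaii could affect it.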