5-1 Chapter 5 Theory & Problems of Probability & Statistics Murray R. Spiegel Sampling Theory.
5-1
Chapter 5
Theory & Problems of
Probability & Statistics
Murray R. Spiegel
Sampling Theory
5-2
Outline Chapter 5
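Population X
mean and variance: $\mu$, $\sigma^2$
Sample
mean and variance: $\bar{X}$, $\hat{s}^2$
Sample Statistics
$\bar{X}$: mean and variance $\mu_{\bar{X}}$, $\sigma^2_{\bar{X}}$
$\hat{s}^2$: mean and variance $\mu_{\hat{s}^2}$, $\sigma^2_{\hat{s}^2}$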
5-3
Outline Chapter 5
Distributions
Population
Samples Statistics
Mean
Proportions
Differences and Sums
Variances
Ratios of Variances
5-4
Outline Chapter 5
Other ways to organize samples
Frequency Distributions
Relative Frequency Distributions
Computation of Statistics for Grouped Data
mean
variance
standard deviation
5-5
Population Parameters
A population - random variable X
probability distribution (function) f(x)
probability function
- discrete variable f(x)
density function
- continuous variable
f(x) function of several parameters, i.e.:
mean: $\mu$, variance: $\sigma^2$
want to know parameters for each f(x)
5-6
Example of a Population
5 project engineers in department
total experience of (X) 2, 3, 6, 8, 11 years
company performing statistical report
employees expertise based on experience
survey must include:
average experience
variance
standard deviation
5-7
Mean of Population
average experience (mean):
$\mu = \frac{2 + 3 + 6 + 8 + 11}{5} = \frac{30}{5} = 6$ years
5-8
Variance of Population
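variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{n}$
$\sigma^2 = \frac{(2-6)^2 + (3-6)^2 + (6-6)^2 + (8-6)^2 + (11-6)^2}{5}$
$\sigma^2 = \frac{16 + 9 + 0 + 4 + 25}{5} = \frac{54}{5} = 10.8$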
5-9
Standard Deviation of Population
standard deviation:
$\sigma = \sqrt{\sigma^2} = \sqrt{10.8} = 3.29$
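The three population parameters for the five engineers can be checked with a few lines of Python. This is a minimal sketch; the variable names (`population`, `mu`, `sigma2`, `sigma`) are illustrative and not from the slides.

```python
import math

# Years of experience of the five project engineers (the whole population).
population = [2, 3, 6, 8, 11]

n = len(population)
mu = sum(population) / n                              # population mean
sigma2 = sum((x - mu) ** 2 for x in population) / n   # population variance (divide by n)
sigma = math.sqrt(sigma2)                             # population standard deviation

print(mu, sigma2, round(sigma, 2))
```

Note the divisor n: the five engineers are the entire population, not a sample from it.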
5-10
Sample Statistics
What if we don't have the whole population?
Take random samples from population
estimate population parameters
make inferences
let's see how
How much experience in company?
hire for feasibility study
performance study
5-11
Sampling Example
manager assigns engineers at random
each time chooses first engineer she sees
same engineer could do both
let's say she picks (2,2)
mean of sample: $\bar{X} = (2+2)/2 = 2$
you want to make inferences about true µ
5-12
Samples of 2
sampling with replacement: she will go to project department twice
pick engineer randomly
potentially 25 possible teams
25 samples of size two
5 * 5 = 25
order matters (6, 11) is different from (11, 6)
5-13
Population of Samples
All possible combinations are:
(2,2) (2,3) (2,6) (2,8) (2,11)
(3,2) (3,3) (3,6) (3,8) (3,11)
(6,2) (6,3) (6,6) (6,8) (6,11)
(8,2) (8,3) (8,6) (8,8) (8,11)
(11,2) (11,3) (11,6) (11,8) (11,11)
5-14
Population of Averages
Average experience, or sample means $\bar{X}_i$, are:
(2) (2.5) (4) (5) (6.5)
(2.5) (3) (4.5) (5.5) (7)
(4) (4.5) (6) (7) (8.5)
(5) (5.5) (7) (8) (9.5)
(6.5) (7) (8.5) (9.5) (11)
5-15
Mean of Population Means
And mean of sampling distribution of means is:
$\mu_{\bar{X}} = \frac{2 + 2.5 + 4 + 5 + \dots + 11}{25} = \frac{150}{25} = 6$
This confirms theorem that states:
$E(\bar{X}) = \mu_{\bar{X}} = \mu = 6$
5-16
Variance of Sample Means
variance of sampling distribution of means uses squared deviations $(\bar{X}_i - \mu_{\bar{X}})^2$:
(2-6)^2 (2.5-6)^2 (4-6)^2 (5-6)^2 (6.5-6)^2
(2.5-6)^2 (3-6)^2 (4.5-6)^2 (5.5-6)^2 (7-6)^2
(4-6)^2 (4.5-6)^2 (6-6)^2 (7-6)^2 (8.5-6)^2
(5-6)^2 (5.5-6)^2 (7-6)^2 (8-6)^2 (9.5-6)^2
(6.5-6)^2 (7-6)^2 (8.5-6)^2 (9.5-6)^2 (11-6)^2
5-17
Variance of Sample Means
Calculating values:
16 12.25 4 1 0.25
12.25 9 2.25 0.25 1
4 2.25 0 1 6.25
1 0.25 1 4 12.25
0.25 1 6.25 12.25 25
5-18
Variance of Sample Means
variance is:
$\sigma^2_{\bar{X}} = \frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{25} = \frac{135}{25} = 5.4$
Therefore standard deviation is
$\sigma_{\bar{X}} = \sqrt{5.4} = 2.32$
5-19
Variance of Sample Means
These results hold for theorem:
$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$
Where n is size of samples. Then we see that:
$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n} = \frac{10.8}{2} = 5.40$
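The enumeration on the preceding slides can be reproduced directly. The sketch below lists all 25 ordered samples of size two drawn with replacement and compares the variance of the sample means with $\sigma^2/n$; the variable names are illustrative assumptions.

```python
from itertools import product

population = [2, 3, 6, 8, 11]
n = 2  # sample size

mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# All ordered samples of size n drawn with replacement: 5 * 5 = 25.
samples = list(product(population, repeat=n))
means = [sum(s) / n for s in samples]

mean_of_means = sum(means) / len(means)
var_of_means = sum((m - mean_of_means) ** 2 for m in means) / len(means)

# The last two printed values should agree: variance of means vs sigma^2 / n.
print(len(samples), mean_of_means, var_of_means, sigma2 / n)
```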
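5-20
Math Proof
$\bar{X}$ mean
$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n}$
$E(\bar{X}) = \frac{E(X_1) + E(X_2) + E(X_3) + \dots + E(X_n)}{n}$
$E(\bar{X}) = \frac{\mu + \mu + \mu + \dots + \mu}{n}$
$E(\bar{X}) = \mu$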
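5-21
Math Proof
$\bar{X}$ variance
$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n}$
$\mathrm{Var}(\bar{X}) = \sigma^2_{\bar{X}} = \frac{\sigma^2_x + \sigma^2_x + \sigma^2_x + \dots + \sigma^2_x}{n^2}$ (the $X_i$ are independent)
$= \frac{n \sigma^2_x}{n^2} = \frac{\sigma^2_x}{n}$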
5-22
Sampling Means No Replacement
manager picks two engineers at same time
order doesn't matter
order (6, 11) is same as order (11, 6)
5 choose 2: $\binom{5}{2} = \frac{5!}{2!\,(5-2)!} = 10$
10 possible teams, or 10 samples of size two.
5-23
Sampling Means No Replacement
All possible combinations are:
(2,3) (2,6) (2,8) (2,11) (3,6)
(3,8) (3,11) (6,8) (6,11) (8,11)
corresponding sample means are:
(2.5) (4) (5) (6.5) (4.5)
(5.5) (7) (7) (8.5) (9.5)
mean of corresponding sample of means is:
$\mu_{\bar{X}} = \frac{2.5 + 4 + 5 + \dots + 9.5}{10} = \frac{60}{10} = 6$
5-24
Sampling Variance No Replacement
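variance of sampling distribution of means is:
$\sigma^2_{\bar{X}} = \frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{10} = \frac{(2.5-6)^2 + (4-6)^2 + \dots + (9.5-6)^2}{10} = 4.05$
standard deviation is:
$\sigma_{\bar{X}} = \sqrt{\sigma^2_{\bar{X}}} = \sqrt{4.05} = 2.01$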
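5-25
Theorems on Sampling Distributions with No Replacements
1. $\mu_{\bar{X}} = \mu = 6$
2. $\sigma^2_{\bar{X}} = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1} = \frac{10.8}{2} \cdot \frac{5-2}{5-1} = \frac{10.8}{2} \cdot \frac{3}{4} = 4.05$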
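The no-replacement case can be verified the same way. This sketch enumerates the 10 unordered teams and compares the variance of their means with the finite-population formula $\frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$; the names `fpc_variance` and `var_of_means` are illustrative.

```python
from itertools import combinations

population = [2, 3, 6, 8, 11]
N = len(population)  # population size
n = 2                # sample size

mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N

# All unordered samples of size n drawn without replacement: C(5, 2) = 10.
samples = list(combinations(population, n))
means = [sum(s) / n for s in samples]

mean_of_means = sum(means) / len(means)
var_of_means = sum((m - mean_of_means) ** 2 for m in means) / len(means)

# Variance predicted by the theorem for sampling without replacement.
fpc_variance = (sigma2 / n) * (N - n) / (N - 1)

print(len(samples), mean_of_means, var_of_means, fpc_variance)
```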
5-26
Sum Up Theorems on Sampling Distributions
Theorem I: expected value of sample mean = population mean
$E(\bar{X}) = \mu_{\bar{X}} = \mu$
$\mu$: mean of population
Theorem II: infinite population or sampling with replacement
variance of sample mean is
$E[(\bar{X} - \mu)^2] = \sigma^2_{\bar{X}} = \sigma^2 / n$
$\sigma^2$: variance of population
5-27
Theorems on Sampling Distributions
Theorem III: population size is N
sampling with no replacement
sample size is n
then variance of sample mean is:
$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$
5-28
Theorems on Sampling Distributions
Theorem IV: population normally distributed
mean $\mu$, variance $\sigma^2$
then sample mean normally distributed
mean $\mu$, variance $\sigma^2/n$
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0,1)$
5-29
Theorems on Sampling Distributions
Theorem V:
samples are taken from distribution
mean $\mu$, variance $\sigma^2$
(not necessarily normally distributed)
standardized variable is asymptotically normal
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
5-30
Sampling Distribution of Proportions
Population properties:
* Infinite
* Binomially Distributed
( p “success”; q=1-p “fail”)
Consider all possible samples of size n
statistic for each sample
= proportion P of success
5-31
Sampling Distribution of Proportions
Sampling distribution of proportions:
mean: $\mu_P = p$
std. deviation: $\sigma_P = \sqrt{\frac{pq}{n}} = \sqrt{\frac{p(1-p)}{n}}$
5-32
Sampling Distribution of Proportions
large values of n (n>30): sampling distribution for P approximates normal distribution
finite population: sample without replacing
standardized P is
$Z = \frac{P - p}{\sqrt{pq/n}}$
5-33
Example Proportions
Oil service company
explores for oil
according to geological department
37% chances of finding oil
drill 150 wells
P(0.4<P<0.6)=?
5-34
Example Proportions
$Z = \frac{P - p}{\sqrt{pq/n}}$
P(0.4<P<0.6)=?
$P\left(\frac{0.4 - 0.37}{\sqrt{0.37 \cdot 0.63/150}} < \frac{P - 0.37}{\sqrt{pq/n}} < \frac{0.6 - 0.37}{\sqrt{0.37 \cdot 0.63/150}}\right) = ?$
5-35
Example Proportions
P(0.4<P<0.6)=P(0.76<Z<5.83)
=normsdist(5.83)-normsdist(0.76)= 0.224
Think about mean, variance and distribution of
np the number of successes
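The standardization for the oil-well example can be sketched in Python with the standard library's `statistics.NormalDist` in place of the spreadsheet function `normsdist`. The inputs are p = 0.37 and n = 150 as stated in the example, and the normal approximation is assumed to be adequate. The bounds depend strongly on n, so the printed values apply to n = 150 only.

```python
from math import sqrt
from statistics import NormalDist

p = 0.37   # probability of finding oil in one well
n = 150    # number of wells drilled
q = 1 - p

sigma_p = sqrt(p * q / n)      # standard deviation of the sample proportion P

# Standardize the two bounds of the event 0.4 < P < 0.6.
z_low = (0.4 - p) / sigma_p
z_high = (0.6 - p) / sigma_p

std_normal = NormalDist()      # mean 0, standard deviation 1
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(prob, 3))
```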
5-36
Sampling Distribution of Sums & Differences
Suppose we have two populations.
Population: $X_A$, $X_B$
Sample of size: $n_A$, $n_B$
Compute statistic: $S_A$, $S_B$
Samples are independent
Sampling distribution for $S_A$ and $S_B$ gives
mean: $\mu_{S_A}$, $\mu_{S_B}$
variance: $\sigma^2_{S_A}$, $\sigma^2_{S_B}$
5-37
Sampling Distribution of Sums and Differences
combination of 2 samples from 2 populations
sampling distribution of differences
$S = S_A \pm S_B$
For new sampling distribution we have:
mean: $\mu_S = \mu_{S_A} \pm \mu_{S_B}$
variance: $\sigma^2_S = \sigma^2_{S_A} + \sigma^2_{S_B}$
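5-38
Sampling Distribution of Sums and Differences
two populations $X_A$ and $X_B$
$S_A = \bar{X}_A$ and $S_B = \bar{X}_B$ sample means
mean: $\mu_{\bar{X}_A + \bar{X}_B} = \mu_{\bar{X}_A} + \mu_{\bar{X}_B} = \mu_A + \mu_B$
variance: $\sigma^2_{\bar{X}_A + \bar{X}_B} = \sigma^2_{\bar{X}_A} + \sigma^2_{\bar{X}_B} = \frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}$
Sampling from infinite population
Sampling with replacement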
5-39
Example Sampling Distribution of Sums
You are leasing oil fields from
two companies for two years
lease expires at end of each year
randomly assigned a new lease for next year
Company A - two oil fields
production XA: 300, 700 million barrels
Company B two oil fields
production XB: 500, 1100 million barrels
5-40
Population Means
•Average oil field size of company A: $\mu_A = \frac{300 + 700}{2} = 500$
•Average oil field size of company B: $\mu_B = \frac{500 + 1100}{2} = 800$
$\mu_A + \mu_B = 500 + 800 = 1300$
5-41
Population Variances
Company A - two oil fields
production XA: 300, 700 million barrels
Company B two oil fields
production XB: 500, 1100 million barrels
$\sigma^2_A = \frac{(300 - 500)^2 + (700 - 500)^2}{2} = 40{,}000$
$\sigma^2_B = \frac{(500 - 800)^2 + (1100 - 800)^2}{2} = 90{,}000$
5-42
Example Sampling Distribution of Sums
Interested in total production: XA + XB
Compute all possible leases assignments
Two choices XA, Two choices XB
XAi XBi
{300, 500}
{300, 1100}
{700, 500}
{700, 1100}
5-43
Example Sampling Distribution of Sums
XAi XBi
{300, 500}
{300, 1100}
{700, 500}
{700, 1100}
Then for each of the 4 possibilities –
4 choices year 1, four choices year 2 = 4*4 samples
5-44
Example Sampling Distribution of Sums
Samples XAi XBi XAi XBi
Year 1 300 500 300 1100
Year 2 300 500 300 500
Year 1 300 500 300 1100
Year 2 300 1100 300 1100
Year 1 300 500 300 1100
Year 2 700 500 700 500
Year 1 300 500 300 1100
Year 2 700 1100 700 1100
5-45
Example Sampling Distribution of Sums
Samples XAi XBi XAi XBi
Year 1 700 500 700 1100
Year 2 300 500 300 500
Year 1 700 500 700 1100
Year 2 300 1100 300 1100
Year 1 700 500 700 1100
Year 2 700 500 700 500
Year 1 700 500 700 1100
Year 2 700 1100 700 1100
5-46
Compute Sum and Means of each sample
Means XAi+XBi Mean XAi+XBi Mean
Year 1 800 800 1400 1100
Year 2 800 800
Year 1 800 1100 1400 1400
Year 2 1400 1400
Year 1 800 1000 1400 1300
Year 2 1200 1200
Year 1 800 1300 1400 1600
Year 2 1800 1800
5-47
Compute Sum and Means of each Sample
Means XAi+XBi Mean XAi+XBi Mean
Year 1 1200 1000 1800 1300
Year 2 800 800
Year 1 1200 1300 1800 1600
Year 2 1400 1400
Year 1 1200 1200 1800 1500
Year 2 1200 1200
Year 1 1200 1500 1800 1800
Year 2 1800 1800
5-48
Mean of Sum of Sample Means
Population of Samples
{800, 1100, 1000, 1300, 1100, 1400, 1300, 1600, 1000, 1300, 1200, 1500, 1300, 1600, 1500, 1800}
$\mu_{\bar{X}_A + \bar{X}_B} = \frac{800 + 1100 + 1000 + 1300 + 1100 + 1400 + 1300 + 1600 + 1000 + 1300 + 1200 + 1500 + 1300 + 1600 + 1500 + 1800}{16} = 1300$
5-49
Mean of Sum of Sample Means
This illustrates theorem on means: $\mu_{\bar{X}_A + \bar{X}_B} = 1300 = \mu_A + \mu_B = 500 + 800 = 1300$
What about variances of $\bar{X}_A + \bar{X}_B$?
5-50
Variance of Sum of Means
Population of samples
{800, 1100, 1000, 1300, 1100, 1400, 1300, 1600, 1000, 1300, 1200, 1500, 1300, 1600, 1500, 1800}
$\sigma^2_{\bar{X}_A + \bar{X}_B} = \{(800 - 1300)^2 + (1100 - 1300)^2 + (1000 - 1300)^2 + (1300 - 1300)^2 + (1100 - 1300)^2 + (1400 - 1300)^2 + (1300 - 1300)^2 + (1600 - 1300)^2 + (1000 - 1300)^2 + (1300 - 1300)^2 + (1200 - 1300)^2 + (1500 - 1300)^2 + (1300 - 1300)^2 + (1600 - 1300)^2 + (1500 - 1300)^2 + (1800 - 1300)^2\}/16$
$= 65{,}000$
5-51
Variance of Sum of Means
$\sigma^2_{\bar{X}_A + \bar{X}_B} = \frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B} = \frac{40{,}000}{2} + \frac{90{,}000}{2} = 65{,}000$
This illustrates theorem on variances
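The 16 two-year samples in the oil-lease example can be enumerated to check both theorems at once. This is a sketch under the example's assumptions: one field from each company per year, assigned independently each year. The helper names `pop_mean` and `pop_var` are illustrative.

```python
from itertools import product

fields_a = [300, 700]    # company A production, million barrels
fields_b = [500, 1100]   # company B production, million barrels
years = 2                # sample size for each company

def pop_mean(values):
    return sum(values) / len(values)

def pop_var(values):
    m = pop_mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# One year's lease: one field from A plus one field from B (4 possible totals).
yearly_totals = [a + b for a, b in product(fields_a, fields_b)]

# Two independent years give 4 * 4 = 16 samples; average the yearly totals.
sample_means = [sum(s) / years for s in product(yearly_totals, repeat=years)]

mean_of_means = pop_mean(sample_means)
var_of_means = pop_var(sample_means)

# Values predicted by the theorems on sums of sample means.
theory_mean = pop_mean(fields_a) + pop_mean(fields_b)
theory_var = pop_var(fields_a) / years + pop_var(fields_b) / years

print(mean_of_means, theory_mean, var_of_means, theory_var)
```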
5-52
Normalize to Make Inferences on Means
$Z = \frac{(\bar{X}_A \pm \bar{X}_B) - (\mu_A \pm \mu_B)}{\sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}}$
5-53
Estimators for Variance
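Two choices:
$S^2 = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}{n}$
use for populations
$\hat{S}^2 = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}{n-1}$, with $E(\hat{S}^2) = \sigma^2$
unbiased, better for smaller samples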
5-54
Sampling Distribution of Variances
All possible random samples of size n
each sample has a variance
all possible variances
give sampling distribution of variances
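sampling distribution of related random variable:
$\frac{n S^2}{\sigma^2} = \frac{(n-1)\hat{S}^2}{\sigma^2} = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}{\sigma^2}$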
5-55
Example Population of Samples
All possible teams are:
(2,2) (2,3) (2,6) (2,8) (2,11)
(3,2) (3,3) (3,6) (3,8) (3,11)
(6,2) (6,3) (6,6) (6,8) (6,11)
(8,2) (8,3) (8,6) (8,8) (8,11)
(11,2) (11,3) (11,6) (11,8) (11,11)
5-56
Compute Variance for Each Sample
sample variances $S^2$ (divisor n = 2) corresponding to each of the 25 possible
choices that the manager makes are:
0 0.25 4 9 20.25
0.25 0 2.25 6.25 16
4 2.25 0 1 6.25
9 6.25 1 0 2.25
20.25 16 6.25 2.25 0
e.g. for sample (2,11), with sample mean 6.5: $\frac{(2 - 6.5)^2 + (11 - 6.5)^2}{2} = 20.25$
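The two variance estimators can be compared on the same 25 samples. The sketch below averages each estimator over all samples drawn with replacement. Under that assumption the divisor-n version averages to $\frac{n-1}{n}\sigma^2$ and the divisor-(n-1) version averages to $\sigma^2$. The function names `var_n` and `var_n_minus_1` are illustrative.

```python
from itertools import product

population = [2, 3, 6, 8, 11]
n = 2

mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

def var_n(sample):
    """Sample variance with divisor n."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

def var_n_minus_1(sample):
    """Sample variance with divisor n - 1 (the unbiased estimator)."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)

samples = list(product(population, repeat=n))  # 25 samples, with replacement

mean_s2 = sum(var_n(s) for s in samples) / len(samples)
mean_s2_hat = sum(var_n_minus_1(s) for s in samples) / len(samples)

print(var_n((2, 11)), mean_s2, mean_s2_hat, sigma2)
```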
5-57
Sampling Distribution of Variance
Population of Variances
mean
variance
distribution
$\frac{(n-1)\hat{s}^2}{\sigma^2} \sim \chi^2_{n-1}$
5-58
What if Unknown Population Variance?
X is Normal($\mu$, $\sigma^2$)
to make inference on means we normalize
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
5-59
Unknown Population Variance
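$\chi^2_{n-1} = \frac{(n-1)\hat{S}^2}{\sigma^2}$
$t_{n-1} = \frac{\dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}}{\sqrt{\dfrac{(n-1)\hat{S}^2}{\sigma^2 (n-1)}}} = \frac{\bar{X} - \mu}{\hat{S} / \sqrt{n}}$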
5-60
Unknown Population Variance
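$P\left(t_{n-1,c_1} < \frac{\bar{X} - \mu}{\hat{S} / \sqrt{n}} < t_{n-1,c_2}\right)$
Use in the same way as for normal
except use different Tables
α = 0.05
n = 25, =tinv(0.05,24)= 2.0639
$P\left(-2.0639 < \frac{\bar{X} - \mu}{\hat{S} / \sqrt{n}} < 2.0639\right) = 1 - 0.05$
[Figure: t density with critical values -2.06 and 2.06 marked]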
5-61
Uses t -statistics
Will use for testing
means, sums, and differences of means
small samples when variable is normal
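substitute sample variance in for true variance
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \;\rightarrow\; t_{n-1} = \frac{\bar{X} - \mu}{s / \sqrt{n}}$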
5-62
Uses t -statistics
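sums and differences of means
$\frac{(\bar{X}_1 \pm \bar{X}_2) - (\mu_1 \pm \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0,1)$
unknown variance
$\frac{(\bar{X}_1 \pm \bar{X}_2) - (\mu_1 \pm \mu_2)}{\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{n_1 + n_2}{n_1 n_2}}} \sim t_{n_1 + n_2 - 2}$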
5-63
Uses $\chi^2$ statistic
$\chi^2_{n-1} = \frac{(n-1)\hat{S}^2}{\sigma^2}$
Inference on Variance
Large sample test
5-64
Inferences
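F Statistic
$F = \frac{\chi^2_{df_1} / df_1}{\chi^2_{df_2} / df_2} = \frac{\dfrac{(n_1 - 1)s_1^2}{\sigma_1^2 (n_1 - 1)}}{\dfrac{(n_2 - 1)s_2^2}{\sigma_2^2 (n_2 - 1)}}$
$\frac{s_1^2 \, \sigma_2^2}{s_2^2 \, \sigma_1^2} \sim F_{df_1, df_2}$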
5-65
F Statistic
Other tests
groups of coefficients
5-66
Other Statistics
Medians
n > 30, sampling distribution of medians
nearly normal if X is normal
$\mu_{med} = \mu$
$\sigma_{med} = \sigma \sqrt{\frac{\pi}{2n}} = \frac{1.2533\,\sigma}{\sqrt{n}}$
5-67
Frequency Distributions
If sample or population is large
difficult to compute statistics
(i.e. mean, variance, etc)
Organizing RAW DATA is useful
arrange into CLASSES or categories
determine number in each class
Class Frequency or Frequency Distribution
5-68
Frequency Distributions - Example
Example of Frequency Distribution:
middle size oil company
portfolio of 100 small oil reservoirs
reserves vary from 89 to 300 million barrels
5-69
Frequency Distributions - Example
arrange data into categories
create table showing ranges of reservoirs sizes
number of reservoirs in each range
Reserves (mmb)    Number of Fields
50-100            4
101-150           21
151-200           42
201-250           27
251-300           6
TOTAL             100
5-70
Frequency Distributions - Example
Class intervals are in ranges of 50 million barrels
Each class interval represented by its midpoint (class mark)
e.g. 200 up to 250 will be represented by 225
Can plot data
histogram
polygon
This plot represents the frequency distribution
5-71
Frequency Distributions Plotted - Example
[Figure: histogram of No. of Fields (0 to 45) against Reserves (mmb), class midpoints from 25 to 325, for the 100-reservoir frequency table]
5-72
Relative Frequency Distributions and Ogives
number of individuals
- frequency distribution
- empirical probability distribution
percentage of individuals
- relative frequency distribution
empirical cumulative probability distribution
- ogive
5-73
Percent Ogives
OGIVE for oil company portfolio of reservoirs
Shows percent reservoirs < than x reserves
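The relative frequencies and the points of a percent ogive follow directly from the class counts. This is a minimal sketch using the reservoir frequency table from the example; the variable names are illustrative, and each cumulative percentage is the share of reservoirs at or below the upper end of the class.

```python
# Class labels (reserves in million barrels) and number of fields per class.
classes = ["50-100", "101-150", "151-200", "201-250", "251-300"]
counts = [4, 21, 42, 27, 6]

total = sum(counts)

# Relative frequency: share of all reservoirs that fall in each class.
relative = [c / total for c in counts]

# Percent ogive: running total of counts, expressed as a percentage.
cumulative_percent = []
running = 0
for c in counts:
    running += c
    cumulative_percent.append(100 * running / total)

for label, c, r, cum in zip(classes, counts, relative, cumulative_percent):
    print(f"{label:>8}  {c:>3}  {r:.2f}  {cum:5.1f}%")
```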
5-74
Computation of Statistics for Grouped Data
can calculate mean and variance from grouped data
5-75
Computation of Statistics for Grouped Data
take 420 samples of an ore body
measure % concentration of Zinc (Zn)
frequency distribution of lab results
5-76
Computation of Statistics for Grouped Data
% Weight  Frequency    % Weight  Frequency
1.00      2            1.55      28
1.05      5            1.60      14
1.10      11           1.65      22
1.15      21           1.70      18
1.20      33           1.75      15
1.25      41           1.80      4
1.30      53           1.85      2
1.35      42           1.90      2
1.40      38           1.95      3
1.45      31           2.00      1
1.50      34           TOTAL     420
5-77
Computation of Statistics for Grouped Data
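mean will then be:
$\bar{x} = \frac{\sum f_i x_i}{n} = \frac{f_1 x_1 + f_2 x_2 + \dots + f_k x_k}{n}$
$n = \sum f_i = f_1 + f_2 + \dots + f_k$
And in our example:
$\bar{x} = \frac{\sum f_i x_i}{n} = \frac{1.00 \cdot 2 + 1.05 \cdot 5 + \dots + 1.45 \cdot 31 + \dots + 2.00 \cdot 1}{420} = 1.40$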
5-78
Computation of Statistics for Grouped Data
variance will then be:
$S^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n} = \frac{f_1 (x_1 - \bar{x})^2 + f_2 (x_2 - \bar{x})^2 + \dots + f_k (x_k - \bar{x})^2}{n}$
5-79
Computation of Statistics for Grouped Data
And in our example:
$S^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n}$
$S^2 = \frac{2 (1.00 - 1.40)^2 + 5 (1.05 - 1.40)^2 + \dots + 1 (2.00 - 1.40)^2}{420}$
$S^2 = 0.0365$
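The grouped-data formulas translate directly into code. The sketch below applies them to the zinc frequency table, with each class value weighted by its frequency; the list name `grouped` is an illustrative assumption.

```python
# (percent weight of Zn, frequency) pairs taken from the lab-results table.
grouped = [
    (1.00, 2), (1.05, 5), (1.10, 11), (1.15, 21), (1.20, 33), (1.25, 41),
    (1.30, 53), (1.35, 42), (1.40, 38), (1.45, 31), (1.50, 34), (1.55, 28),
    (1.60, 14), (1.65, 22), (1.70, 18), (1.75, 15), (1.80, 4), (1.85, 2),
    (1.90, 2), (1.95, 3), (2.00, 1),
]

n = sum(f for _, f in grouped)                    # total number of samples
mean = sum(f * x for x, f in grouped) / n         # sum(f_i * x_i) / n
variance = sum(f * (x - mean) ** 2 for x, f in grouped) / n

print(n, round(mean, 3), round(variance, 4))
```

The printed mean and variance can be compared with the values quoted in the example; small differences come from rounding the mean before it is used in the variance.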
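5-80
Computation of Statistics for Grouped Data
Similar formulas are available for higher moments:
moments about the mean:
$m_r = \frac{\sum f_i (x_i - \bar{x})^r}{n} = \frac{f_1 (x_1 - \bar{x})^r + f_2 (x_2 - \bar{x})^r + \dots + f_k (x_k - \bar{x})^r}{n}$
moments about the origin:
$m'_r = \frac{\sum f_i x_i^r}{n} = \frac{f_1 x_1^r + f_2 x_2^r + \dots + f_k x_k^r}{n}$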
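5-81
Sum up Chapter 5
Population X
mean and variance: $\mu$, $\sigma^2$
distribution
A Sample
statistic from sample
usually mean and variance: $\bar{X}$, $\hat{s}^2$
5-82
Sum up Chapter 5
Sample Statistics
$\bar{X}$: mean and variance $\mu_{\bar{X}}$, $\sigma^2_{\bar{X}}$
$\hat{s}^2$: mean and variance $\mu_{\hat{s}^2}$, $\sigma^2_{\hat{s}^2}$
Distribution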
Distribution
5-83
Sum Up Chapter 5
Samples Statistics
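Mean $\bar{X}$: mean $\mu$, variance $\sigma^2/n$
Distribution
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$, $t_{n-1} = \frac{\bar{X} - \mu}{s / \sqrt{n}}$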
5-84
Sum Up Chapter 5
Samples Statistics
Proportions P: mean p, variance p(1-p)/n
n>30
Distribution
$Z = \frac{P - p}{\sqrt{pq/n}}$
5-85
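Sum Up Chapter 5
Samples Statistics
Differences and Sums
$\bar{X}_1 \pm \bar{X}_2$: mean $\mu_1 \pm \mu_2$, variance $\sigma_1^2/n_1 + \sigma_2^2/n_2$
Distribution
$\frac{(\bar{X}_1 \pm \bar{X}_2) - (\mu_1 \pm \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0,1)$
$\frac{(\bar{X}_1 \pm \bar{X}_2) - (\mu_1 \pm \mu_2)}{\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{n_1 + n_2}{n_1 n_2}}} \sim t_{n_1 + n_2 - 2}$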
5-86
Sum Up Chapter 5
Samples Statistics
Variances
Distribution
$\frac{(n-1)\hat{S}^2}{\sigma^2} \sim \chi^2_{n-1}$
Mean = n-1
Variance = 2(n-1)
5-87
Sum Up Chapter 5
Samples Statistics
Ratios of Variances
$\frac{s_1^2 \, \sigma_2^2}{s_2^2 \, \sigma_1^2} \sim F_{df_1, df_2}$
5-88
Sum up Chapter 5
Other ways to organize samples
Frequency Distributions
Relative Frequency Distributions
Computation of Statistics for Grouped Data
mean
variance
standard deviation
5-89
THAT'S ALL FOR CHAPTER 5
THANK YOU!!