Estimators
Transcript of Estimators
-
An estimator is a statistical function of the observable sample data that is used to estimate an unknown parameter (which is called the estimand). The result of applying the function to a particular sample of data is called an estimate.
It is possible to construct many estimators for a given parameter.
The performance of an estimator may be evaluated using loss functions.
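As a minimal illustration (added here, not part of the original slides; all numbers are assumed), the sample mean is an estimator of the population mean, and its value on one particular sample is an estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

# The estimand: the unknown population mean (here we happen to know it).
population = rng.normal(loc=170, scale=10, size=100_000)

# The estimator is the function "sample mean"; applying it to one
# particular sample yields an estimate.
sample = rng.choice(population, size=50, replace=False)
estimate = sample.mean()
print(f"estimate of the population mean: {estimate:.2f}")
```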
-
To estimate a parameter (e.g., a population mean), the usual procedure is as follows:
Select a random sample from the population of interest.
Calculate the point estimate of the parameter.
Calculate a measure of variability, often a confidence interval.
Associate this measure of variability with the estimate.
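A minimal sketch of these steps (the sample values and the t-based interval are assumptions for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=170, scale=10, size=50)   # 1. random sample

point_estimate = sample.mean()                    # 2. point estimate

# 3./4. a 95% confidence interval based on Student's t distribution
sem = stats.sem(sample)
low, high = stats.t.interval(0.95, len(sample) - 1,
                             loc=point_estimate, scale=sem)
print(f"point estimate: {point_estimate:.2f}, 95% CI: ({low:.2f}, {high:.2f})")
```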
-
Point estimation makes use of sample data to calculate a single value which is to serve as a best guess for an unknown parameter.
Point estimation should be contrasted with general Bayesian methods of estimation, where the goal is usually to compute the posterior distributions of parameters and other quantities of interest.
The contrast here is between estimating a single point and estimating a weighted set of points (a probability density function).
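A minimal sketch of this contrast (the coin-flip data and the uniform Beta(1, 1) prior are assumptions chosen for illustration):

```python
from scipy import stats

heads, flips = 7, 10

# Point estimate: one single best guess for the success probability p.
p_hat = heads / flips

# Bayesian estimate: an entire posterior distribution over p.
# With a Beta(1, 1) (uniform) prior, conjugacy gives the posterior
# Beta(1 + heads, 1 + tails).
posterior = stats.beta(1 + heads, 1 + (flips - heads))
lo, hi = posterior.interval(0.95)

print(f"point estimate:   {p_hat:.2f}")
print(f"posterior mean:   {posterior.mean():.2f}")
print(f"posterior 95% CI: ({lo:.2f}, {hi:.2f})")
```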
-
This approach uses aspects of the scientific method, which involves collecting evidence that is meant to be consistent or inconsistent with a given hypothesis. As evidence accumulates, the degree of belief in a hypothesis ought to change.
Hypotheses with very high support should be accepted as true, and those with low support should be rejected.
-
Maximum Likelihood (ML)
Method of Moments
Minimum Mean Squared Error (MMSE)
Minimum Variance Unbiased Estimator (MVUE)
Best Linear Unbiased Estimator (BLUE)
Now we discuss each of them in detail.
-
Maximum likelihood is a popular statistical method used for fitting a statistical model to data and providing estimates for the model's parameters.
For example, suppose you are interested in the heights of Americans. You have a sample of some number of Americans, but not the entire population, and record their heights.
Further, you are willing to assume that heights are normally distributed with some unknown mean and variance. The sample mean is then the maximum likelihood estimator of the population mean, and the sample variance is a close approximation to the maximum likelihood estimator of the population variance.
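A minimal sketch of this example (the sample itself is simulated under assumed values): for a normal model the ML estimates have closed forms, namely the sample mean and the divide-by-n sample variance:

```python
import numpy as np

rng = np.random.default_rng(2)
heights = rng.normal(loc=175.0, scale=7.0, size=200)  # assumed sample

mu_hat = heights.mean()              # MLE of the population mean
sigma2_hat = heights.var(ddof=0)     # MLE of the variance (divides by n)
sigma2_usual = heights.var(ddof=1)   # usual unbiased version (n - 1)

print(f"mu_hat = {mu_hat:.2f}")
print(f"sigma2 MLE = {sigma2_hat:.2f}, unbiased = {sigma2_usual:.2f}")
```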
-
For a fixed set of data and underlying probability model, maximum likelihood picks the values of the model parameters that make the data "more likely" than any other parameter values would.
Maximum likelihood estimation gives a unique, easy-to-determine solution in the case of the normal distribution and many other problems, although in very complex problems this may not be the case.
If a uniform prior distribution is assumed over the parameters, the maximum likelihood estimate coincides with the most probable values thereof.
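In symbols (a standard formulation added for reference), for independent observations x_1, ..., x_n the ML estimate maximizes the likelihood:

```latex
\hat{\theta}_{\mathrm{ML}}
  = \arg\max_{\theta} L(\theta)
  = \arg\max_{\theta} \prod_{i=1}^{n} p(x_i \mid \theta)
```

With a uniform prior p(θ) ∝ 1 the posterior is proportional to the likelihood, which is why the MAP and ML estimates coincide in that case.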
-
This is a method of estimating population parameters such as the mean, variance, or median by equating sample moments with the (unobservable) population moments and then solving those equations for the quantities to be estimated.
Estimates by the method of moments may be used as a first approximation to the solutions of the likelihood equations, and successive improved approximations may then be found by the Newton-Raphson method.
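A minimal sketch (the gamma model and its parameter values are assumptions for illustration): a Gamma(k, θ) distribution has mean kθ and variance kθ², so matching the first two sample moments gives closed-form estimates:

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.gamma(shape=2.0, scale=3.0, size=5_000)

m1 = data.mean()        # first sample moment
v = data.var(ddof=0)    # second central sample moment

# Solve  m1 = k * theta  and  v = k * theta**2  for k and theta:
theta_hat = v / m1
k_hat = m1 / theta_hat

print(f"k_hat = {k_hat:.2f} (true 2.0), theta_hat = {theta_hat:.2f} (true 3.0)")
```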
-
In some cases, infrequent with large samples
but not so infrequent with small samples, the
estimates given by the method of moments
are outside of the parameter space; it does
not make sense to rely on them then.
Also, estimates by the method of moments are not necessarily sufficient statistics, i.e., they sometimes fail to take into account all relevant information in the sample.
-
The MSE of an estimator is one of many ways to quantify the difference between an estimator and the true value of the quantity being estimated.
The MSE is the second moment (about the origin) of the error, and thus incorporates both the variance of the estimator and its bias. For an unbiased estimator, the MSE is the variance.
In analogy to the standard deviation, taking the square root of the MSE yields the root mean squared error, or RMSE.
For an unbiased estimator, the RMSE is called the standard error.
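Written out (a standard identity added for reference):

```latex
\mathrm{MSE}(\hat{\theta})
  = \mathbb{E}\big[(\hat{\theta} - \theta)^2\big]
  = \mathrm{Var}(\hat{\theta}) + \mathrm{Bias}(\hat{\theta})^2,
\qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}
```

For an unbiased estimator the bias term vanishes, which is why the MSE reduces to the variance.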
-
Since the MSE is an expectation, it is a scalar and not a random variable. It may be a function of an unknown parameter, but it does not depend on any random quantities.
-
An MSE of zero, meaning that the estimator predicts observations of the parameter with perfect accuracy, is the ideal and forms the basis for the least squares method of regression analysis.
While particular values of the MSE other than zero are meaningless in and of themselves, they may be used for comparative purposes.
The unbiased model with the smallest MSE is generally interpreted as best explaining the variability in the observations.
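A minimal least-squares sketch (the line and noise level are assumed): the fitted coefficients are those that minimize the mean squared error between predictions and observations:

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=x.size)  # noisy line

# Least squares picks the slope/intercept minimizing the MSE.
A = np.column_stack([x, np.ones_like(x)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

mse = np.mean((A @ coef - y) ** 2)
print(f"slope = {coef[0]:.2f}, intercept = {coef[1]:.2f}, MSE = {mse:.2f}")
```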
-
Minimizing the MSE is a key criterion in selecting estimators. Among unbiased estimators, minimizing the MSE is equivalent to minimizing the variance, and the minimum is attained by the MVUE (minimum variance unbiased estimator).
Like the variance, the mean squared error has the disadvantage of heavily weighting outliers. This is a result of the squaring of each term, which effectively weights large errors more heavily than small ones. This property, undesirable in many applications, has led researchers to use alternatives such as the mean absolute error, or those based on the median.
-
In statistics, the mean absolute error is a quantity used to measure how close forecasts or predictions are to the eventual outcomes. The mean absolute error (MAE) is given by
MAE = (1/n) Σ |e_i|
As the name suggests, the mean absolute error is an average of the absolute errors e_i = f_i - y_i, where f_i is the prediction and y_i the true value.
The MAE and the RMSE can be used together to diagnose the variation in the errors in a set of forecasts. The RMSE will always be larger than or equal to the MAE; the greater the difference between them, the greater the variance in the individual errors in the sample. If RMSE = MAE, then all the errors are of the same magnitude.
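A quick numeric sketch of this diagnostic (the forecast values are assumed):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

errors = y_pred - y_true
mae = np.mean(np.abs(errors))
rmse = np.sqrt(np.mean(errors ** 2))

# RMSE >= MAE always; a large gap signals high variance in the errors.
print(f"MAE = {mae:.3f}, RMSE = {rmse:.3f}")
```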
-
In statistics and signal processing, an MMSE estimator describes the approach which minimizes the mean square error, a common measure of estimator quality.
Let X be an unknown random variable and Y a known random variable (the measurement). An estimator X̂(y) is any function of the measurement Y, and its MSE is given by
MSE = E[(X̂ - X)²]
The MMSE estimator is defined as the estimator achieving minimal MSE.
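A standard result worth recording here (added for reference): the minimizer is the conditional expectation of X given the measurement,

```latex
\hat{X}_{\mathrm{MMSE}}(y) = \mathbb{E}[X \mid Y = y]
```

When X and Y are jointly Gaussian this conditional expectation is a linear function of y, which is why linear MMSE estimators are exact in the Gaussian case.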
-
The MVUE has lower variance than any other unbiased estimator for all possible values of the parameter.
An efficient estimator need not exist, but if it does, it is the MVUE, because the MSE is the sum of the variance and the squared bias of an estimator.
The MVUE minimizes the MSE among unbiased estimators. In some cases biased estimators have lower MSE because they have a smaller variance than does any unbiased estimator.
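A small simulation sketch of that last point (the normal data and sample size are assumptions): the divide-by-n variance estimator is biased, yet has lower MSE than the unbiased divide-by-(n-1) version:

```python
import numpy as np

rng = np.random.default_rng(5)
true_var, n, trials = 4.0, 10, 100_000

samples = rng.normal(scale=np.sqrt(true_var), size=(trials, n))
biased = samples.var(axis=1, ddof=0)    # divides by n (biased)
unbiased = samples.var(axis=1, ddof=1)  # divides by n - 1 (unbiased)

mse_b = np.mean((biased - true_var) ** 2)
mse_u = np.mean((unbiased - true_var) ** 2)
print(f"MSE biased = {mse_b:.3f} < MSE unbiased = {mse_u:.3f}")
```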
-
It frequently occurs that the MVU estimator, even if it exists, cannot be found. For example, if the PDF is not known, the theory of sufficient statistics cannot be applied; and even when the PDF is known, that does not ensure a minimum variance estimator can be derived.
In such cases we have to resort to a suboptimal estimator approach: we can restrict the estimator to a linear form that is unbiased and has minimum variance among such linear estimators.
An example of this approach is the Best Linear Unbiased Estimator (BLUE).
-
This is a linear model in which the errors have zero expectation, are uncorrelated, and have equal variances.
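Under exactly these conditions (the Gauss-Markov assumptions), a standard result added here for reference is that ordinary least squares is the BLUE of the coefficients:

```latex
y = X\beta + \varepsilon,\quad
\mathbb{E}[\varepsilon] = 0,\quad
\mathrm{Cov}(\varepsilon) = \sigma^2 I
\;\Longrightarrow\;
\hat{\beta}_{\mathrm{OLS}} = (X^{\mathsf{T}} X)^{-1} X^{\mathsf{T}} y
```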
-
Maximum A Posteriori (MAP)
Wiener filter
Kalman filter
Particle filter
Markov chain Monte Carlo (MCMC)
-
Sometimes we have prior information about the PDF of the parameter to be estimated. The parameter is then modeled as a random variable, and the associated probabilities are called prior probabilities.
Bayes' theorem shows the way to incorporate prior information into the estimation process. Writing θ for the parameter and y for the data,
p(θ | y) = p(y | θ) p(θ) / p(y)
The term on the left-hand side is called the posterior; the numerator is the product of the likelihood term and the prior term; the denominator serves as a normalization term so that the posterior PDF integrates to unity.
-
In Bayesian statistics, the MAP estimate is a mode of the posterior distribution.
Bayesian inference produces a maximum a posteriori (MAP) estimate.
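In symbols (a standard formulation added for reference), the normalization term does not depend on the parameter and can be dropped from the maximization:

```latex
\hat{\theta}_{\mathrm{MAP}}
  = \arg\max_{\theta} p(\theta \mid y)
  = \arg\max_{\theta} p(y \mid \theta)\, p(\theta)
```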
-
The Wiener filter reduces the amount of noise present in a signal by comparison with an estimate of the desired noiseless signal. Since the filter assumes its inputs are stationary, it is not an adaptive filter.
Wiener filters are characterized by the following:
Assumption: the signal and the (additive) noise are stationary linear stochastic processes with known spectral characteristics or known autocorrelation and cross-correlation.
Requirement: the filter must be physically realizable, i.e., causal.
Performance criterion: minimum mean-square error (MMSE).
-
The input to the filter is assumed to be a signal s(t) corrupted by additive noise n(t). The output ŝ(t) is calculated by means of a filter g(t) using the convolution
ŝ(t) = g(t) * (s(t) + n(t))
where g(t) is the Wiener filter's impulse response.
The error is defined as
e(t) = s(t + α) - ŝ(t)
where α is the delay of the Wiener filter (since it is causal). In other words, the error is the difference between the estimated signal and the true signal shifted by α.
-
Clearly the squared error is given by
e²(t) = s²(t + α) - 2 s(t + α) ŝ(t) + ŝ²(t)
where s(t + α) is the desired output of the filter and e(t) is the error.
Depending on the value of α, the name of the problem changes:
If α > 0, the problem is that of prediction (the error is reduced when ŝ(t) is similar to a later value of s).
If α = 0, the problem is that of filtering (the error is reduced when ŝ(t) is similar to s(t)).
If α < 0, the problem is that of smoothing (the error is reduced when ŝ(t) is similar to an earlier value of s).
-
The Wiener filter problem has solutions for three possible cases:
One where a non-causal filter is acceptable (requiring an infinite amount of both past and future data).
The case where a causal filter is desired (using an infinite amount of past data).
The FIR case, where a finite amount of past data is used.
The first case is simple to solve but is not suited for real-time applications. Wiener's main accomplishment was solving the case where the causality requirement is in effect.
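A minimal FIR sketch (the signal model, noise level, and filter length are all assumptions): the optimal tap weights solve the normal equations R w = p, where R is the autocorrelation matrix of the noisy input and p is its cross-correlation with the desired signal:

```python
import numpy as np

rng = np.random.default_rng(6)
N, L = 20_000, 16                        # samples, filter length

# Desired signal: a slow AR(1) process; observation: signal + white noise.
s = np.zeros(N)
for t in range(1, N):
    s[t] = 0.95 * s[t - 1] + rng.normal(scale=0.3)
x = s + rng.normal(scale=0.5, size=N)

# Empirical correlations (filtering case, delay alpha = 0).
def xcorr(a, b, lags):
    return np.array([np.mean(a[k:] * b[:N - k]) for k in range(lags)])

r = xcorr(x, x, L)                       # autocorrelation of the input
p = xcorr(s, x, L)                       # cross-correlation with the signal

R = np.array([[r[abs(i - j)] for j in range(L)] for i in range(L)])
w = np.linalg.solve(R, p)                # FIR Wiener taps

s_hat = np.convolve(x, w)[:N]
print(f"MSE before: {np.mean((x - s) ** 2):.4f}, "
      f"after: {np.mean((s_hat - s) ** 2):.4f}")
```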
-
A major limitation towards more widespread implementation of Bayesian approaches is that obtaining the posterior distribution often requires the integration of high-dimensional functions. This can be computationally very difficult at times.
MCMC approaches are so named because one uses the previous sample values to randomly generate the next sample value, thus generating a Markov chain.
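A minimal Metropolis sketch (the target density, proposal width, and chain length are assumptions for illustration): each step proposes a move away from the current sample and accepts it with probability min(1, ratio of target densities), so each draw depends only on the previous one and the draws form a Markov chain:

```python
import numpy as np

def metropolis(log_target, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis: each new sample depends only on the
    previous one, so the draws form a Markov chain."""
    rng = np.random.default_rng(seed)
    x = x0
    chain = np.empty(n_steps)
    for i in range(n_steps):
        proposal = x + rng.normal(scale=step)
        # Accept with probability min(1, target(proposal) / target(x)).
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x = proposal
        chain[i] = x
    return chain

# Example target: a standard normal (log density up to a constant).
chain = metropolis(lambda x: -0.5 * x**2, x0=0.0, n_steps=50_000)
print(f"mean ~ {chain[5_000:].mean():.3f}, std ~ {chain[5_000:].std():.3f}")
```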