Properties of Estimators

When estimating a specific parameter or characteristic of a population, several possible estimators exist.

Example 1: Suppose that the underlying population distribution is symmetric. In this case the population expectation equals the population median. Thus the unknown expectation can be estimated using either the sample mean or the sample median. In general, the two estimators will provide different estimates. Which estimator should be used?

Example 2: To estimate the variance $\sigma^2$ we may use either of the following:
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2 \qquad \text{or} \qquad \tilde{S}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2.$$
Which estimator should be used?

Example 3: Suppose that the underlying population distribution is Poisson. For the Poisson distribution $E(X) = \text{Var}(X) = \lambda$. Therefore the unknown parameter $\lambda$ could be estimated using the sample mean or the sample variance. Again, in this case the two estimators will in general yield different estimates.

In order to obtain an objective comparison, we need to examine the properties of the estimators.
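To see that the competing estimators really do differ on a given sample, consider the following minimal Python sketch; the distribution and parameter values are arbitrary illustrative choices, not part of the original examples:

    import numpy as np

    rng = np.random.default_rng(seed=1)
    x = rng.normal(loc=5.0, scale=2.0, size=25)  # symmetric population: mean = median

    # Example 1: two estimators of the population expectation
    print("sample mean:  ", x.mean())
    print("sample median:", np.median(x))

    # Example 2: two estimators of the population variance
    print("variance, divisor n-1:", x.var(ddof=1))
    print("variance, divisor n:  ", x.var(ddof=0))

On any finite sample the four printed values differ, which is exactly why objective criteria for comparing estimators are needed.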

Mean Squared Error

A general measure of the accuracy of an estimator is the Mean Squared Deviation, or Mean Squared Error (MSE). The MSE measures the average squared distance between the estimator $\hat{\theta}$ and the true parameter $\theta$:
$$\text{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2].$$
It is straightforward to show that the MSE can be separated into two components:
$$\text{MSE}(\hat{\theta}) = \text{Var}(\hat{\theta}) + [\text{Bias}(\hat{\theta})]^2.$$
The first term on the right side is the variance of $\hat{\theta}$: $\text{Var}(\hat{\theta}) = E\{[\hat{\theta} - E(\hat{\theta})]^2\}$. The second term is the square of the bias $\text{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$. Hence the MSE is the sum of the variance and the squared bias of the estimator. If several estimators are available for an unknown parameter of the population, one would thus select the one with the smallest MSE. Starting from the MSE, three important properties of estimators are described below, which should facilitate the search for the "best" estimator.
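The decomposition can be checked numerically. The following simulation sketch assumes a normal population with arbitrary illustrative parameters and uses the biased divisor-$n$ variance estimator (discussed in detail further below) as the estimator under study:

    import numpy as np

    rng = np.random.default_rng(seed=42)
    sigma2, n, reps = 4.0, 10, 200_000

    # Sampling distribution of the divisor-n variance estimator
    samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
    est = samples.var(axis=1, ddof=0)  # divisor n, hence biased

    mse = np.mean((est - sigma2) ** 2)
    var = est.var()
    bias = est.mean() - sigma2

    print(f"MSE          : {mse:.4f}")
    print(f"Var + Bias^2 : {var + bias**2:.4f}")   # matches the MSE
    print(f"Bias         : {bias:.4f}")            # theory: -sigma^2/n = -0.4

The two printed accuracy measures agree up to simulation noise, illustrating that MSE = variance + squared bias.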

Unbiasedness

An estimator $\hat{\theta}$ of the unknown parameter $\theta$ is unbiased if the expectation of the estimator matches the true parameter value:
$$E(\hat{\theta}) = \theta.$$
That is, the mean of the sampling distribution of $\hat{\theta}$ equals the true parameter value $\theta$. For an unbiased estimator the MSE equals the variance of the estimator: $\text{MSE}(\hat{\theta}) = \text{Var}(\hat{\theta})$. Thus the variance of the estimator provides a good measure of its precision. If the estimator is biased, then the expectation of the estimator differs from the true parameter value, that is, $E(\hat{\theta}) \neq \theta$. An estimator is called asymptotically unbiased if
$$\lim_{n \to \infty} E(\hat{\theta}_n) = \theta,$$
i.e. the bias converges to zero with increasing sample size $n$.
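A standard example, anticipating the sample-variance discussion below: the divisor-$n$ variance estimator $\tilde{S}^2$ is biased but asymptotically unbiased, since its bias shrinks at rate $1/n$:
$$E(\tilde{S}^2) = \frac{n-1}{n}\,\sigma^2 \qquad \Longrightarrow \qquad \text{Bias}(\tilde{S}^2) = E(\tilde{S}^2) - \sigma^2 = -\frac{\sigma^2}{n} \longrightarrow 0 \quad \text{as } n \to \infty.$$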

Efficiency

Often there are several unbiased estimators available for the same parameter. In this case, one would like to select the one with the smallest variance (which in this case is equal to the MSE). Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of $\theta$ using a sample of size $n$. The estimator $\hat{\theta}_1$ is called relatively efficient in comparison to $\hat{\theta}_2$ if the variance of $\hat{\theta}_1$ is smaller than the variance of $\hat{\theta}_2$, i.e.,
$$\text{Var}(\hat{\theta}_1) < \text{Var}(\hat{\theta}_2).$$
The estimator $\hat{\theta}_1$ is called efficient if its variance is smaller than that of any other unbiased estimator.
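As a simple illustration: both the sample mean $\bar{X}$ and the single observation $X_1$ are unbiased estimators of $\mu$, but
$$\text{Var}(\bar{X}) = \frac{\sigma^2}{n} < \sigma^2 = \text{Var}(X_1) \qquad \text{for } n > 1,$$
so the sample mean is relatively efficient in comparison to $X_1$.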

Consistency

The consistency of an estimator is a property which focuses on the behavior of the estimator in large samples. In particular, consistency requires that the estimator be close to the true parameter value with high probability in large samples. It is sufficient if the bias and variance of the estimator converge to zero. Formally, suppose
$$\lim_{n \to \infty} E(\hat{\theta}_n) = \theta \qquad \text{and} \qquad \lim_{n \to \infty} \text{Var}(\hat{\theta}_n) = 0.$$
Then the estimator is consistent. Equivalently, the two conditions may be summarized using:
$$\lim_{n \to \infty} \text{MSE}(\hat{\theta}_n) = 0.$$
This notion of consistency is also referred to as 'mean squared consistency'. An alternative version, known as weak consistency, is defined by the following:
$$\lim_{n \to \infty} P(|\hat{\theta}_n - \theta| < \epsilon) = 1 \qquad \text{for every } \epsilon > 0.$$
That is, the probability that the estimator yields values within an arbitrarily small interval around the true parameter value $\theta$ converges to one with increasing sample size $n$. Equivalently, the probability that the estimator differs from the true parameter value by more than $\epsilon$ converges to zero with increasing sample size. That is,
$$\lim_{n \to \infty} P(|\hat{\theta}_n - \theta| \geq \epsilon) = 0.$$

The unknown mean $\mu$ and variance $\sigma^2$ will now be estimated. A random sample of size $n = 12$ was drawn from a population, yielding the following data: 1; 5; 3; 8; 7; 2; 1; 4; 3; 5; 3; 6. The sample mean $\bar{X}$ is an unbiased and efficient estimator of $\mu$. Substituting the sample values yields
$$\bar{x} = \frac{1}{12}(1 + 5 + 3 + 8 + 7 + 2 + 1 + 4 + 3 + 5 + 3 + 6) = \frac{48}{12} = 4.$$
This result constitutes a point estimate of $\mu$. The estimator of $\sigma^2$ is given by:
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2.$$
Substituting the sample values yields the point estimate
$$s^2 = \frac{56}{11} \approx 5.09.$$

Assume a population with mean $\mu$ and variance $\sigma^2$. Let $X_1, \dots, X_n$ be a random sample drawn from the population. Each random variable $X_i$ has $E(X_i) = \mu$ and $\text{Var}(X_i) = \sigma^2$. Consider the following three estimators of the population mean:
$$\hat{\mu}_1 = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \hat{\mu}_2 = \frac{1}{2}(X_1 + X_2), \qquad \hat{\mu}_3 = \frac{1}{3}X_1 + \frac{2}{3}X_2.$$

  • Which estimators are unbiased?
  • Which estimator is most efficient?

All of them are unbiased, since $E(\hat{\mu}_i) = \mu$ for $i = 1, 2, 3$:
$$E(\hat{\mu}_1) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \mu, \qquad E(\hat{\mu}_2) = \frac{1}{2}(\mu + \mu) = \mu, \qquad E(\hat{\mu}_3) = \frac{1}{3}\mu + \frac{2}{3}\mu = \mu.$$
The variance of each estimator is given by:
$$\text{Var}(\hat{\mu}_1) = \frac{\sigma^2}{n}, \qquad \text{Var}(\hat{\mu}_2) = \frac{\sigma^2}{2}, \qquad \text{Var}(\hat{\mu}_3) = \left(\frac{1}{9} + \frac{4}{9}\right)\sigma^2 = \frac{5}{9}\sigma^2.$$
The first estimator, because it uses all the data, is the most efficient (for $n > 2$). This estimator is of course the sample mean. Note that even though the second and third estimators each use two observations, the third is less efficient than the second because it does not weight the observations equally.
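These expectations and variances can be verified by simulation. The following sketch uses arbitrary illustrative values for $\mu$, $\sigma$ and $n$:

    import numpy as np

    rng = np.random.default_rng(seed=7)
    mu, sigma, n, reps = 10.0, 3.0, 12, 100_000

    x = rng.normal(mu, sigma, size=(reps, n))
    mu1 = x.mean(axis=1)                 # sample mean
    mu2 = 0.5 * (x[:, 0] + x[:, 1])      # equal weights on two observations
    mu3 = x[:, 0] / 3 + 2 * x[:, 1] / 3  # unequal weights on two observations

    for name, est, theory in [("mu1", mu1, sigma**2 / n),
                              ("mu2", mu2, sigma**2 / 2),
                              ("mu3", mu3, 5 * sigma**2 / 9)]:
        print(f"{name}: mean={est.mean():.3f} (mu={mu}), "
              f"var={est.var():.3f} (theory={theory:.3f})")

All three estimated means are close to $\mu$, and the estimated variances reproduce the ordering $\sigma^2/n < \sigma^2/2 < 5\sigma^2/9$.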

Mean Squared Error (MSE):

Recall the MSE is defined as
$$\text{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2].$$
Adding and subtracting $E(\hat{\theta})$ and expanding the expression, one obtains:
$$E[(\hat{\theta} - \theta)^2] = E\{[\hat{\theta} - E(\hat{\theta})]^2\} + 2E\{[\hat{\theta} - E(\hat{\theta})][E(\hat{\theta}) - \theta]\} + [E(\hat{\theta}) - \theta]^2.$$
For the middle term we have:
$$2[E(\hat{\theta}) - \theta]\, E[\hat{\theta} - E(\hat{\theta})] = 2[E(\hat{\theta}) - \theta] \cdot 0 = 0,$$
and consequently we have
$$\text{MSE}(\hat{\theta}) = \text{Var}(\hat{\theta}) + [\text{Bias}(\hat{\theta})]^2.$$
The MSE does not measure the actual estimation error that has occurred in a particular sample. It measures the average squared error that would occur in repeated samples.
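As a worked instance of this decomposition, consider the divisor-$n$ variance estimator $\tilde{S}^2$ under the additional assumption of a normal population (for which $\text{Var}(S^2) = 2\sigma^4/(n-1)$):
$$\text{Var}(\tilde{S}^2) = \left(\frac{n-1}{n}\right)^2 \frac{2\sigma^4}{n-1} = \frac{2(n-1)\sigma^4}{n^2}, \qquad [\text{Bias}(\tilde{S}^2)]^2 = \frac{\sigma^4}{n^2},$$
$$\text{MSE}(\tilde{S}^2) = \frac{2(n-1)\sigma^4}{n^2} + \frac{\sigma^4}{n^2} = \frac{(2n-1)\sigma^4}{n^2}.$$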

Unbiasedness

The following figure displays three estimators $\hat{\theta}_1$, $\hat{\theta}_2$ and $\hat{\theta}_3$ of a parameter $\theta$.

[Figure: sampling distributions of the three estimators; the true parameter value $\theta$ is marked by a vertical dashed line.]

The estimators $\hat{\theta}_1$ and $\hat{\theta}_2$ are unbiased, since their expectation coincides with the true parameter $\theta$ (denoted by the vertical dashed line). In contrast, the estimator $\hat{\theta}_3$ is biased. For both unbiased estimators $\text{MSE} = \text{Var}$ holds, as the bias equals zero. However, $\hat{\theta}_1$ has lower variance and is therefore preferred to $\hat{\theta}_2$. It is also preferred to $\hat{\theta}_3$, which has the same variance but exhibits substantial positive bias. Each of the following widely used estimators is unbiased.

Sample Mean

The sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is an unbiased estimator of the unknown expectation $\mu$, since $E(\bar{X}) = \mu$. See Section Distribution of the Sample Mean.
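The one-line derivation uses only the linearity of expectation:
$$E(\bar{X}) = E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\, n\mu = \mu.$$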

Sample Proportion

The sample proportion $\hat{\pi}$ is an unbiased estimator of the population proportion $\pi$, since $E(\hat{\pi}) = \pi$. See Section Distribution of the Sample Fraction.

Sample Variance

Assume a random sample of size $n$.

  1. If the expectation of the population is unknown and estimated using the sample mean, the estimator
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$$
is an unbiased estimator of $\sigma^2$, since $E(S^2) = \sigma^2$. See Section Distribution of the Sample Variance. The standard deviation $S$, which is the square root of the sample variance, is not an unbiased estimator of $\sigma$, as it tends to underestimate the population standard deviation.

    The estimator $\tilde{S}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$ is not unbiased, since $E(\tilde{S}^2) = \frac{n-1}{n}\sigma^2 \neq \sigma^2$. See Section Distribution of the Sample Variance. The bias is given by:
$$E(\tilde{S}^2) - \sigma^2 = -\frac{\sigma^2}{n}.$$
Using the estimator $\tilde{S}^2$, one will tend to underestimate the unknown variance. The estimator, however, is asymptotically unbiased, as with increasing sample size the bias converges to zero.

    Division by $n-1$ (as in $S^2$) rather than by $n$ (as in $\tilde{S}^2$) assures unbiasedness.
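A quick simulation sketch, with arbitrary illustrative parameter values, makes the bias of the divisor-$n$ estimator visible:

    import numpy as np

    rng = np.random.default_rng(seed=3)
    sigma2, n, reps = 9.0, 5, 200_000

    x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
    s2_unbiased = x.var(axis=1, ddof=1)  # divisor n-1
    s2_biased = x.var(axis=1, ddof=0)    # divisor n

    print("mean of S^2  (divisor n-1):", s2_unbiased.mean())  # ~ 9.0
    print("mean of S~^2 (divisor n)  :", s2_biased.mean())    # ~ (n-1)/n * 9 = 7.2
    print("theoretical bias -sigma^2/n:", -sigma2 / n)        # -1.8

Averaged over many samples, the divisor-$(n-1)$ estimator hits $\sigma^2$ while the divisor-$n$ estimator falls short by almost exactly $\sigma^2/n$.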

Efficiency

  • The sample mean $\bar{X}$ is an efficient estimator of the unknown population expectation $\mu$ within the class of linear unbiased estimators; this holds for any distribution with finite variance.
  • Suppose data are drawn from a $N(\mu, \sigma^2)$ distribution. The sample mean $\bar{X}$ is an efficient estimator of $\mu$. It can be shown that no unbiased estimator of $\mu$ exists which has a smaller variance.
  • The sample mean $\bar{X}$ is an efficient estimator for the unknown parameter $\lambda$ of a Poisson distribution.
  • The sample proportion $\hat{\pi}$ is an efficient estimator of the unknown population proportion $\pi$ for a dichotomous population, i.e. the underlying random variables have a common Bernoulli distribution.
  • For a normally distributed population, the sample mean $\bar{X}$ and the sample median $\tilde{X}$ are both unbiased estimators of the unknown expectation $\mu$. For random samples (with replacement) we have $\text{Var}(\bar{X}) = \sigma^2/n$. Furthermore, one can show that $\text{Var}(\tilde{X}) \approx \frac{\pi}{2} \cdot \frac{\sigma^2}{n}$, and hence $\text{Var}(\bar{X}) < \text{Var}(\tilde{X})$. The sample mean $\bar{X}$ is therefore relatively efficient in comparison to the sample median $\tilde{X}$ (see the simulation sketch after this list).
  • The relative efficiency of various estimators of the same parameter in general depends on the distribution from which one is drawing observations.
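The mean-versus-median comparison can be checked empirically. The following sketch, with arbitrary illustrative sample size, estimates both sampling variances for normal data; the ratio should approach $\pi/2 \approx 1.57$:

    import numpy as np

    rng = np.random.default_rng(seed=11)
    n, reps = 100, 20_000

    x = rng.normal(0.0, 1.0, size=(reps, n))
    var_mean = x.mean(axis=1).var()
    var_median = np.median(x, axis=1).var()

    print("Var(mean)  :", var_mean)               # ~ 1/n = 0.01
    print("Var(median):", var_median)             # ~ pi/(2n) = 0.0157
    print("ratio      :", var_median / var_mean)  # ~ pi/2 = 1.571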

Consistency

  • Consistency is usually considered to be a minimum requirement of an estimator. Of course, consistency does not preclude the estimator from having a large bias and variance in small or moderately sized samples. Consistency only guarantees that bias and variance go to zero for sufficiently large samples. On the other hand, since sample size cannot usually be increased at will, consistency may provide a poor guide to the finite sample properties of the estimator.

  • For random samples, the sample mean $\bar{X}$ is a consistent estimator of the population expectation $\mu$, since $E(\bar{X}) = \mu$ and the variance converges to zero, i.e., $\text{Var}(\bar{X}) = \sigma^2/n \to 0$ as $n \to \infty$.

  • For random samples, the sample proportion $\hat{\pi}$ is a consistent estimator of the population proportion $\pi$, as the estimator is unbiased and the variance converges to zero, i.e., $\text{Var}(\hat{\pi}) = \pi(1-\pi)/n \to 0$ as $n \to \infty$.

  • For a Gaussian distributed population, the sample median $\tilde{X}$ is a consistent estimator of the unknown parameter $\mu$.

  • For a Gaussian distribution, the estimator $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ is consistent for the unknown variance $\sigma^2$, since the estimator is unbiased and the variance converges to zero: $\text{Var}(S^2) = \frac{2\sigma^4}{n-1} \to 0$ as $n \to \infty$.

    The sample variance is also a consistent estimator of the population variance for arbitrary distributions which have a finite mean and variance.
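To close, a small sketch illustrating weak consistency of the sample mean empirically: the probability of deviating from $\mu$ by more than $\epsilon$ shrinks as $n$ grows. The population here is exponential, an arbitrary finite-variance choice:

    import numpy as np

    rng = np.random.default_rng(seed=5)
    mu, eps, reps = 2.0, 0.2, 10_000

    for n in (10, 100, 1000):
        x = rng.exponential(scale=mu, size=(reps, n))  # population mean is mu
        miss = np.mean(np.abs(x.mean(axis=1) - mu) >= eps)
        print(f"n={n:5d}: P(|mean - mu| >= {eps}) ~ {miss:.4f}")

The printed exceedance probabilities fall toward zero with increasing sample size, exactly as the weak consistency condition requires.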