Problem 15

Question

If \(X_{1}, X_{2}, \ldots, X_{n}\) and \(Y_{1}, Y_{2}, \ldots, Y_{m}\) are independent random samples from normal distributions with the same \(\sigma^{2}\), prove that their pooled sample variance, \(S_{p}^{2}\), is an unbiased estimator for \(\sigma^{2}\).

Step-by-Step Solution

Answer
The pooled sample variance, \(S_{p}^{2}\), is an unbiased estimator for \(\sigma^{2}\). The proof is based on calculating the expectation of \(S_{p}^{2}\) and showing that it equals \(\sigma^{2}\).
Step 1: Define the pooled sample variance
First, define the pooled sample variance, \(S_{p}^{2}\), as the weighted average of the individual sample variances \(S_{X}^{2}\) and \(S_{Y}^{2}\), with weights proportional to their degrees of freedom \(n-1\) and \(m-1\): \[S_{p}^{2} = \frac{(n-1)S_{X}^{2} + (m-1)S_{Y}^{2}}{n+m-2}.\]
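As a quick numerical illustration of this definition, here is a minimal Python sketch. The function name `pooled_variance` and the sample values are made up for this example and are not part of the problem.

```python
from statistics import variance  # sample variance with divisor n - 1


def pooled_variance(x, y):
    """Pooled sample variance of two samples, each of size at least 2."""
    n, m = len(x), len(y)
    # Weight each sample variance by its degrees of freedom.
    return ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)


# Hypothetical data: S_X^2 = 4 and S_Y^2 = 20/3.
x = [2.0, 4.0, 6.0]
y = [1.0, 3.0, 5.0, 7.0]
s_p2 = pooled_variance(x, y)  # (2 * 4 + 3 * 20/3) / 5 = 5.6, up to rounding
```

The larger sample (here `y`) receives the larger weight, so the pooled value lies between the two sample variances.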
Step 2: Write down the expectations of the sample variances
Recall that the sample variance of a random sample is an unbiased estimator of the population variance. Since both populations have the same variance \(\sigma^{2}\), it follows that \(E(S_{X}^{2}) = E(S_{Y}^{2}) = \sigma^{2}\).
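For completeness, here is a short sketch of why \(E(S_{X}^{2}) = \sigma^{2}\). It writes \(\mu_{X}\) for the mean of the \(X\) population, a symbol that does not appear in the problem statement; the \(Y\) sample is handled in the same way. Start from the identity \[\sum_{i=1}^{n}(X_{i}-\bar{X})^{2} = \sum_{i=1}^{n}X_{i}^{2} - n\bar{X}^{2}.\] Since \(E(X_{i}^{2}) = \sigma^{2} + \mu_{X}^{2}\) and \(E(\bar{X}^{2}) = \operatorname{Var}(\bar{X}) + \mu_{X}^{2} = \frac{\sigma^{2}}{n} + \mu_{X}^{2}\), taking expectations gives \[E\left(\sum_{i=1}^{n}(X_{i}-\bar{X})^{2}\right) = n\left(\sigma^{2}+\mu_{X}^{2}\right) - n\left(\frac{\sigma^{2}}{n}+\mu_{X}^{2}\right) = (n-1)\sigma^{2}.\] Dividing by \(n-1\) yields \(E(S_{X}^{2}) = \sigma^{2}\).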
Step 3: Substitute and simplify
Take the expectation of \(S_{p}^{2}\), apply linearity of expectation, and substitute the values from Step 2: \[E(S_{p}^{2}) = E\left(\frac{(n-1)S_{X}^{2} + (m-1)S_{Y}^{2}}{n+m-2}\right) = \frac{(n-1)E(S_{X}^{2}) + (m-1)E(S_{Y}^{2})}{n+m-2} = \frac{(n-1)\sigma^{2} + (m-1)\sigma^{2}}{n+m-2} = \frac{(n+m-2)\sigma^{2}}{n+m-2} = \sigma^{2}.\] The factor \(n+m-2\) cancels, leaving \(\sigma^{2}\).
Step 4: Conclude the proof
Since \(E(S_{p}^{2}) = \sigma^{2}\), the pooled sample variance is an unbiased estimator for the common variance \(\sigma^{2}\) of the two populations. This completes the proof.
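The result can also be checked empirically. The following is a minimal simulation sketch, not part of the proof: the sample sizes, means, \(\sigma\), number of trials, and the function names are all illustrative choices made for this example.

```python
import random


def sample_variance(values):
    """Sample variance with divisor n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def mean_pooled_variance(n, m, mu_x, mu_y, sigma, trials, seed=0):
    """Average S_p^2 over many simulated pairs of normal samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x = [rng.gauss(mu_x, sigma) for _ in range(n)]
        y = [rng.gauss(mu_y, sigma) for _ in range(m)]
        s_p2 = ((n - 1) * sample_variance(x)
                + (m - 1) * sample_variance(y)) / (n + m - 2)
        total += s_p2
    return total / trials


# Illustrative values: different means, common sigma = 2, so sigma^2 = 4.
estimate = mean_pooled_variance(n=5, m=8, mu_x=0.0, mu_y=10.0,
                                sigma=2.0, trials=20000)
print(estimate)  # should be close to 4, up to simulation error
```

The two populations are given different means on purpose: the pooled variance only requires a common variance, so the average of \(S_{p}^{2}\) should still settle near \(\sigma^{2} = 4\).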

Key Concepts

Unbiased Estimator
An estimator is said to be unbiased if its expected value equals the true value of the parameter it estimates. This concept is fundamental in statistics because an unbiased estimator does not systematically overestimate or underestimate the parameter.
For an unbiased estimator, we have:
  • Expectation of the estimator equals the parameter: \( E(\hat{\theta}) = \theta \)
  • Averaged over many repeated samples, the estimates center on the true parameter value.
The pooled sample variance \( S_p^2 \) is unbiased for the common variance \( \sigma^2 \) because its expectation equals \( \sigma^2 \). This was shown by substituting the expectations of the two sample variances into the expectation of \( S_p^2 \). The property makes \( S_p^2 \) a natural estimate of the variance when two samples share a common \( \sigma^2 \).
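The definition \( E(\hat{\theta}) = \theta \) can be verified exactly in a tiny case. The sketch below assumes a hypothetical three-value population sampled with replacement (not the normal populations of the problem) and compares the divisor \(n-1\) with the divisor \(n\) using exact fractions.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]  # hypothetical population, each value equally likely
n = 2                   # sample size

# True population variance (divisor N, since this is the whole population).
mu = Fraction(sum(population), len(population))
sigma2 = sum((v - mu) ** 2 for v in population) / len(population)


def expected_value(divisor):
    """Exact expectation of sum((x - xbar)^2) / divisor over all samples."""
    samples = list(product(population, repeat=n))  # equally likely ordered samples
    total = Fraction(0)
    for sample in samples:
        xbar = Fraction(sum(sample), n)
        total += sum((v - xbar) ** 2 for v in sample) / divisor
    return total / len(samples)


unbiased = expected_value(n - 1)  # equals sigma2
biased = expected_value(n)        # equals (n - 1) / n * sigma2
```

Here `unbiased` matches `sigma2` exactly, while `biased` is smaller by the factor \((n-1)/n\), which is the systematic underestimation that the divisor \(n-1\) removes.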
Normal Distribution
A normal distribution is a continuous probability distribution characterized by its bell-shaped curve, known as the Gaussian curve. It is defined by two parameters, the mean \( \mu \) and the variance \( \sigma^2 \).
  • The mean determines the center of the distribution.
  • The variance determines the spread or width of the distribution.
Normality is part of the problem statement, but the unbiasedness of the sample variances does not depend on it: \( E(S^2) = \sigma^2 \) holds for any random sample from a population with finite variance. Normality matters for distributional results, for example that \( (n+m-2)S_p^2/\sigma^2 \) follows a chi-square distribution with \( n+m-2 \) degrees of freedom.
Sample Variance
Sample variance is a measure of the spread of sample data. It is given by the formula: \[ S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2, \] where \(X_i\) are the sample values and \(\bar{X}\) is the sample mean.
  • Expresses how data points differ from the sample mean.
  • Used as an unbiased estimator of the population variance; this holds for any random sample from a population with finite variance, including normally distributed data.
  • Its quality as an estimator depends on how well the sample represents the population.
In the context of pooled sample variance, the individual sample variances \( S_X^2 \) and \( S_Y^2 \) are weighted by their degrees of freedom, \( n-1 \) and \( m-1 \), and combined to form \( S_p^2 \), giving a single estimate of the common variance that uses the data from both samples.
Expectation in Statistics
Expectation is the probability-weighted average of the values a random variable can take. It underlies inferential statistics because properties of estimators, such as bias and variance, are defined through it.
The general formula for the expectation of a random variable \(X\) is \[ E(X) = \sum_{i} x_i \, P(x_i) \] for discrete variables, or \[ E(X) = \int x f(x) \; dx \] for continuous variables.
  • Expectation indicates long-term average outcomes.
  • In the problem, expectations of sample variances equal \( \sigma^2 \).
Combined with linearity of expectation, this means the expectation of the pooled variance \( S_p^2 \) equals the true variance \( \sigma^2 \), so \( S_p^2 \) is an unbiased estimator. Expectation is thus the tool for checking whether an estimator is correct on average.
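The discrete formula for \(E(X)\) can be illustrated with a fair six-sided die, an example chosen here for illustration rather than taken from the problem.

```python
from fractions import Fraction

# Fair six-sided die: each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# E(X) = sum of x_i * P(x_i) over all possible values.
expectation = sum(x * p for x, p in pmf.items())  # Fraction(7, 2)
```

The result, 7/2, is not a value the die can show; it is the long-run average of many rolls.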