Problem 8

Question

Let \(\bar{X}_{n}\) and \(\bar{Y}_{m}\) be the sample means of two independent random samples of size \(n\) (resp. \(m\)) from the same distribution with mean \(\mu\). We combine these two estimators to a new estimator \(T\) by putting \[T = r \bar{X}_{n} + (1-r) \bar{Y}_{m},\] where \(r\) is some number between 0 and 1.

a. Show that \(T\) is an unbiased estimator for the mean \(\mu\).

b. Show that \(T\) is most efficient when \(r = n/(n+m)\).

Step-by-Step Solution

Answer
\(T\) is unbiased, and it is most efficient when \(r = n/(n+m)\).
Step 1: Define the Objective
We need to show that \(T = r \bar{X}_{n} + (1-r) \bar{Y}_{m}\) is an unbiased estimator for the mean \(\mu\). This means the expected value of \(T\) should be equal to \(\mu\).
Step 2: Calculate the Expected Value of T
We calculate \(E[T]\) using the linearity of expectation. We find that \[E[T] = E[r \bar{X}_{n} + (1-r) \bar{Y}_{m}] = r E[\bar{X}_{n}] + (1-r) E[\bar{Y}_{m}].\]
Step 3: Evaluate Expected Values of Sample Means
Knowing that \(\bar{X}_{n}\) and \(\bar{Y}_{m}\) are both unbiased estimators of \(\mu\), we have \(E[\bar{X}_{n}] = \mu\) and \(E[\bar{Y}_{m}] = \mu\).
Step 4: Show T is Unbiased
Substitute the expected values into the equation: \[E[T] = r\mu + (1-r)\mu = (r + 1 - r)\mu = \mu.\] Thus, \(T\) is an unbiased estimator for \(\mu\).
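
The result of part a can be checked numerically. The following is a minimal simulation sketch, not part of the original solution: it assumes a normal distribution and uses illustrative values (\(n = 5\), \(m = 20\), \(r = 0.3\), \(\mu = 2\), \(\sigma = 1.5\)); the helper names `combined_estimate` and `simulate_T` are hypothetical. Any distribution with mean \(\mu\) would serve equally well.

```python
import random
import statistics


def combined_estimate(xs, ys, r):
    """Return T = r * mean(xs) + (1 - r) * mean(ys)."""
    return r * statistics.fmean(xs) + (1 - r) * statistics.fmean(ys)


def simulate_T(n, m, r, mu, sigma, trials, seed=0):
    """Draw `trials` independent pairs of samples and return the T values.

    Assumption for illustration only: both samples are normal(mu, sigma).
    """
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        ys = [rng.gauss(mu, sigma) for _ in range(m)]
        values.append(combined_estimate(xs, ys, r))
    return values


values = simulate_T(n=5, m=20, r=0.3, mu=2.0, sigma=1.5, trials=20000)
# The average of T over many repetitions should be close to mu = 2.0.
print(statistics.fmean(values))
```

Changing `r` to any other value between 0 and 1 should leave the average near \(\mu\), since unbiasedness does not depend on the weight.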
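Step 5: Define Variance of T
We now need to show that \(T\) is most efficient when \(r = \frac{n}{n+m}\), that is, that this choice of \(r\) gives \(T\) the smallest variance. Because the two samples are independent, \(\bar{X}_{n}\) and \(\bar{Y}_{m}\) are independent, so the covariance term vanishes and \[\text{Var}(T) = r^2\text{Var}(\bar{X}_{n}) + (1-r)^2\text{Var}(\bar{Y}_{m}).\]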
Step 6: Calculate Variance of Sample Means
Assume the common distribution has finite variance \(\sigma^2\). Then \(\text{Var}(\bar{X}_{n}) = \frac{\sigma^2}{n}\) and \(\text{Var}(\bar{Y}_{m}) = \frac{\sigma^2}{m}\). Substituting these into the variance expression gives \[\text{Var}(T) = r^2\frac{\sigma^2}{n} + (1-r)^2\frac{\sigma^2}{m}.\]
Step 7: Optimize Variance with respect to r
Since \(\sigma^2\) is a positive constant, minimizing \(\text{Var}(T)\) is equivalent to minimizing \(\frac{r^2}{n} + \frac{(1-r)^2}{m}\). Take the derivative with respect to \(r\), set it equal to zero, and solve for \(r\). Differentiating: \[\frac{d}{dr}\left(r^2\frac{1}{n} + (1-r)^2\frac{1}{m}\right) = \frac{2r}{n} - \frac{2(1-r)}{m}.\] Setting the derivative to zero gives \[\frac{2r}{n} = \frac{2(1-r)}{m}.\]
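Step 8: Solve for r
Divide both sides of \[\frac{2r}{n} = \frac{2(1-r)}{m}\] by 2 to get \[\frac{r}{n} = \frac{1-r}{m}.\] Cross-multiplying gives \[mr = n - nr,\] so \[r(n+m) = n.\] Thus, \(r = \frac{n}{n+m}\). The second derivative, \(\frac{2}{n} + \frac{2}{m}\), is positive, so this critical point is a minimum, and it lies between 0 and 1 as required.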
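
As a sanity check on the algebra in Steps 6 to 8, the following minimal sketch evaluates \(\text{Var}(T)/\sigma^2\) exactly with rational arithmetic on a grid of \(r\) values. The sample sizes \(n = 5\) and \(m = 20\), the grid step of 0.01, and the function name `var_T_over_sigma2` are illustrative choices, not part of the exercise.

```python
from fractions import Fraction


def var_T_over_sigma2(r, n, m):
    """Return Var(T) / sigma^2 = r^2 / n + (1 - r)^2 / m (formula from Step 6)."""
    return r * r / Fraction(n) + (1 - r) ** 2 / Fraction(m)


n, m = 5, 20
r_star = Fraction(n, n + m)  # candidate optimum from Step 8

# Exact rational grid 0, 0.01, ..., 1 so that no rounding error is involved.
grid = [Fraction(k, 100) for k in range(101)]
best = min(grid, key=lambda r: var_T_over_sigma2(r, n, m))

print(r_star, best, var_T_over_sigma2(r_star, n, m))  # prints: 1/5 1/5 1/25
```

For these values the grid minimizer coincides with \(n/(n+m) = 1/5\), and the minimum \(1/25\) equals \(1/(n+m)\). This agrees with substituting \(r = \frac{n}{n+m}\) into the variance formula, which gives \(\text{Var}(T) = \frac{\sigma^2}{n+m}\).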

Key Concepts

Unbiased Estimator
In statistics, an unbiased estimator is a statistic used to estimate a parameter, where the expected value of the statistic is equal to the true value of the parameter. This means the estimator does not tend to overestimate or underestimate the parameter consistently.

For example, consider two independent sample means, \( \bar{X}_{n} \) and \( \bar{Y}_{m} \), from the same distribution with mean \( \mu \). When we form the new estimator \( T = r \bar{X}_{n} + (1-r) \bar{Y}_{m} \), we want its expected value to equal \( \mu \).

The key characteristic ensuring \( T \) is unbiased is the linearity of expectation. Specifically, the expectation of \( T \) is a weighted sum of the expected values of the sample means, which themselves are unbiased estimators of \( \mu \). Hence, \( E[T] = rE[\bar{X}_{n}] + (1-r)E[\bar{Y}_{m}] = r\mu + (1-r)\mu = \mu \), confirming that \( T \) is an unbiased estimator for \( \mu \).

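This property matters because it guarantees that the estimator has no systematic error: averaged over repeated samples, its value equals the parameter it is meant to estimate. Unbiasedness alone does not say how far a single estimate may fall from \( \mu \); that is governed by the variance.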
Variance Optimization
The variance of an estimator measures how widely its values spread around their expected value across repeated samples, and so indicates the estimator's precision.

For an unbiased estimator, lower variance means the estimate tends to fall closer to the parameter from sample to sample. In the exercise, the variance of the estimator \( T = r \bar{X}_{n} + (1-r) \bar{Y}_{m} \) is minimized by choosing the weight \( r \) appropriately.

With \( \text{Var}(\bar{X}_{n}) = \frac{\sigma^2}{n} \) and \( \text{Var}(\bar{Y}_{m}) = \frac{\sigma^2}{m} \), and using the independence of the two samples, the variance of \( T \) is \( \text{Var}(T) = r^2\frac{\sigma^2}{n} + (1-r)^2\frac{\sigma^2}{m} \). Minimizing this over \( r \) is a calculus problem: differentiate with respect to \( r \) and set the derivative to zero.

Solving gives \( r = \frac{n}{n+m} \), so each sample mean is weighted in proportion to its sample size. With this weight, \( T \) uses the information from both samples efficiently and has the smallest variance among estimators of this form. The same approach applies whenever several sample estimates are combined to achieve the best possible precision.
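
The effect of the weight on precision can also be seen empirically. The sketch below is a hedged illustration, assuming normally distributed data with \(\mu = 0\), \(\sigma = 1\), \(n = 5\), and \(m = 20\) (all illustrative values); the helper name `sample_mean_pairs` is hypothetical. It compares the empirical variance of \(T\) for a few values of \(r\) with the formula \(\sigma^2\left(\frac{r^2}{n} + \frac{(1-r)^2}{m}\right)\).

```python
import random
import statistics


def sample_mean_pairs(n, m, mu, sigma, trials, seed=1):
    """Simulate `trials` independent pairs (x_bar, y_bar) of sample means.

    Assumption for illustration only: observations are normal(mu, sigma).
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(trials):
        x_bar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        y_bar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(m)])
        pairs.append((x_bar, y_bar))
    return pairs


n, m, sigma = 5, 20, 1.0
pairs = sample_mean_pairs(n, m, mu=0.0, sigma=sigma, trials=20000)

empirical = {}
for r in (0.0, n / (n + m), 0.5, 1.0):
    # Reuse the same simulated sample means for every weight r.
    ts = [r * x_bar + (1 - r) * y_bar for x_bar, y_bar in pairs]
    empirical[r] = statistics.pvariance(ts)
    theory = sigma**2 * (r**2 / n + (1 - r) ** 2 / m)
    print(f"r={r:.2f}  empirical={empirical[r]:.4f}  theory={theory:.4f}")
```

Under these assumptions the smallest empirical variance should occur at \(r = n/(n+m) = 0.2\). Note that \(r = 0.5\), the plain average of the two sample means, does worse than using \(\bar{Y}_{m}\) alone (\(r = 0\)) when the sample sizes are this unequal.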
Sample Means
Sample means are powerful and simple measures used to estimate the true mean of a population based on a sample. When dealing with sample means, as in the situation of \( \bar{X}_{n} \) and \( \bar{Y}_{m} \), they provide unbiased estimates of the population mean \( \mu \).

In statistical inference, the use of sample means allows us to draw conclusions about a population from a small subset. The sample mean \( \bar{X}_{n} = \frac{1}{n}\sum_{i=1}^n X_i \) is calculated by summing up all sample observations and dividing by the number of observations \( n \).
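
The sample-mean formula also explains why \(r = n/(n+m)\) is a natural weight: with that choice, \(T\) is simply the sample mean of all \(n + m\) observations pooled together. The sketch below checks this identity exactly on made-up data; the numbers and the helper name `mean` are illustrative only.

```python
from fractions import Fraction


def mean(values):
    """Sample mean: sum of the observations divided by their count."""
    return sum(values, Fraction(0)) / len(values)


xs = [Fraction(v) for v in (2, 4, 9)]          # first sample, n = 3
ys = [Fraction(v) for v in (1, 5, 6, 7, 11)]   # second sample, m = 5
n, m = len(xs), len(ys)

r = Fraction(n, n + m)
T = r * mean(xs) + (1 - r) * mean(ys)
pooled = mean(xs + ys)  # mean of the combined sample of size n + m

print(T, pooled)  # prints: 45/8 45/8
```

The identity holds for any data, because \(\frac{n}{n+m}\bar{X}_{n} + \frac{m}{n+m}\bar{Y}_{m} = \frac{1}{n+m}\left(\sum_{i=1}^{n} X_i + \sum_{j=1}^{m} Y_j\right)\).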

Understanding sample means is crucial, as they form the basis of more complex statistical techniques and models. They are central to the law of large numbers, which tells us that as the sample size grows, the sample mean converges to the population mean. This reliability is what makes sample means a foundational concept in both descriptive and inferential statistics. Many estimators are built from sample means, including the unbiased estimator \( T \) discussed earlier.