Problem 7

Question

Leaves are divided into four different types: starchy-green, sugary-white, starchy-white, and sugary-green. According to genetic theory, the types occur with probabilities $\frac{1}{4}(\theta+2)$, $\frac{1}{4}\theta$, $\frac{1}{4}(1-\theta)$, and $\frac{1}{4}(1-\theta)$, respectively, where $0<\theta<1$. Suppose one has $n$ leaves. Then the number of starchy-green leaves is modeled by a random variable $N_{1}$ with a $\operatorname{Bin}\left(n, p_{1}\right)$ distribution, where $p_{1}=\frac{1}{4}(\theta+2)$, and the number of sugary-white leaves is modeled by a random variable $N_{2}$ with a $\operatorname{Bin}\left(n, p_{2}\right)$ distribution, where $p_{2}=\frac{1}{4}\theta$.

The following table lists the counts for the progeny of self-fertilized heterozygotes among 3839 leaves.

\begin{tabular}{lr}
\hline \hline
Type & Count \\
\hline
Starchy-green & 1997 \\
Sugary-white & 32 \\
Starchy-white & 906 \\
Sugary-green & 904 \\
\hline \hline
\end{tabular}

Source: R.A. Fisher. Statistical methods for research workers. Hafner, New York, 1958; Table 62 on page 299.

Consider the following two estimators for $\theta$:
$$ T_{1}=\frac{4}{n} N_{1}-2 \quad \text{and} \quad T_{2}=\frac{4}{n} N_{2}. $$

a. Check that both $T_{1}$ and $T_{2}$ are unbiased estimators for $\theta$.
b. Compute the value of both estimators for $\theta$.

Step-by-Step Solution

Answer

Both $T_1$ and $T_2$ are unbiased estimators for $\theta$. For the observed data, $T_1 \approx 0.0808$ and $T_2 \approx 0.0333$.

Step 1: Understand the Problem

You are given the probabilities of four types of leaves as functions of a parameter $ \theta $. The counts of starchy-green and sugary-white leaves among $ n $ leaves are modeled as binomial random variables. You need to verify that the given estimators $ T_1 $ and $ T_2 $ are unbiased for $ \theta $, and compute their values from the observed data.

Step 2: Define Unbiased Estimator

An estimator $ T $ for a parameter $ \theta $ is unbiased if the expected value of $ T $ is equal to $ \theta $, i.e., $ E[T] = \theta $. We need to check if this condition holds true for both $ T_1 $ and $ T_2 $.

Step 3: Check Unbiasedness of T1

$ T_1 = \frac{4}{n} N_1 - 2 $ is an unbiased estimator if $ E[T_1] = \theta $. Since $ N_1 $ has a $ \operatorname{Bin}(n, p_1) $ distribution, we have:
\[ E[N_1] = n \cdot p_1 = n \cdot \frac{1}{4}(\theta+2) \]
Hence, by linearity of expectation,
\[ E[T_1] = \frac{4}{n}E[N_1] - 2 = \frac{4}{n} \left( n \cdot \frac{1}{4}(\theta+2) \right) - 2 = (\theta + 2) - 2 = \theta \]

Step 4: Check Unbiasedness of T2

$ T_2 = \frac{4}{n} N_2 $ is an unbiased estimator if $ E[T_2] = \theta $. We have:
\[ E[N_2] = n \cdot p_2 = n \cdot \frac{1}{4} \theta \]
Thus,
\[ E[T_2] = \frac{4}{n} E[N_2] = \frac{4}{n} \left( n \cdot \frac{1}{4} \theta \right) = \theta \]
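The two expectations derived above can also be checked numerically. The following is a minimal sketch, not part of the original solution: it sums each estimator against the exact binomial probabilities for a small, hypothetical $ n $ and a hypothetical value of $ \theta $. The function names are illustrative only.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability that a Bin(n, p) variable equals k."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_value(estimator, n, p):
    """Exact expectation of estimator(N, n) when N ~ Bin(n, p)."""
    return sum(estimator(k, n) * binom_pmf(k, n, p) for k in range(n + 1))


def t1(n1, n):
    return 4 * n1 / n - 2


def t2(n2, n):
    return 4 * n2 / n


theta = 0.3  # hypothetical value, chosen only for illustration
n = 20       # small hypothetical sample size so the sum is cheap

e_t1 = expected_value(t1, n, (theta + 2) / 4)  # p1 = (theta + 2) / 4
e_t2 = expected_value(t2, n, theta / 4)        # p2 = theta / 4
print(e_t1, e_t2)  # both should agree with theta up to rounding error
```

Any other $ \theta $ in $ (0, 1) $ and any $ n \geq 1 $ should give the same agreement, since the algebra above does not depend on the particular values.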

Step 5: Compute T1

Use the count data: $ N_1 = 1997 $ and $ n = 3839 $ for starchy-green leaves.
\[ T_1 = \frac{4}{3839} \cdot 1997 - 2 = \frac{7988}{3839} - 2 = \frac{310}{3839} \approx 0.0808 \]

Step 6: Compute T2

Use the count data: $ N_2 = 32 $ and $ n = 3839 $ for sugary-white leaves.
\[ T_2 = \frac{4}{3839} \cdot 32 = \frac{128}{3839} \approx 0.0333 \]
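The arithmetic for both estimates is easy to reproduce. This short sketch uses only the counts from the table in the question; the variable names are illustrative.

```python
n = 3839   # total number of leaves
n1 = 1997  # starchy-green count
n2 = 32    # sugary-white count

t1 = 4 * n1 / n - 2  # estimate based on starchy-green leaves
t2 = 4 * n2 / n      # estimate based on sugary-white leaves

print(round(t1, 4), round(t2, 4))  # prints 0.0808 0.0333
```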

Key Concepts

Binomial Distribution

A binomial distribution describes the number of successes in a fixed number of independent trials. Each trial has only two possible outcomes, often labeled "success" and "failure", and the same success probability. The distribution is determined by two parameters: $ n $, the number of trials, and $ p $, the probability of success in each trial.
In the given problem, each leaf falls into one of four types. If we look at a single type, each leaf is a trial that either is of that type ("success") or is not ("failure"), so the count of leaves of that type follows a binomial distribution. Here, the parameter $ n $ is the total number of leaves, which is 3839. The probability that a leaf is starchy-green is $ p_1 = \frac{1}{4}(\theta+2) $, and the probability that it is sugary-white is $ p_2 = \frac{1}{4} \theta $. The expected counts are therefore $ n p_1 $ and $ n p_2 $, both of which depend on $ \theta $.
Comparing the observed counts with the counts expected for a given value of $ \theta $ shows how well that value agrees with the data.
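As an illustration of comparing observed and predicted counts, the sketch below computes the expected count $ n p $ for each leaf type at a trial value of $ \theta $. The value `theta_guess` is a hypothetical choice made only for this example, and the function name is illustrative.

```python
def type_probabilities(theta):
    """Probabilities of the four leaf types, as given in the question."""
    return {
        "starchy-green": (theta + 2) / 4,
        "sugary-white": theta / 4,
        "starchy-white": (1 - theta) / 4,
        "sugary-green": (1 - theta) / 4,
    }


# Observed counts from the table in the question.
observed = {
    "starchy-green": 1997,
    "sugary-white": 32,
    "starchy-white": 906,
    "sugary-green": 904,
}
n = sum(observed.values())

theta_guess = 0.05  # hypothetical trial value, not an estimate from the text
expected = {leaf: n * p for leaf, p in type_probabilities(theta_guess).items()}

for leaf, count in observed.items():
    print(f"{leaf:14s} observed {count:5d}   expected {expected[leaf]:8.1f}")
```

Trying different trial values of $ \theta $ shows how the expected counts move relative to the observed ones.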

Unbiased Estimator

An unbiased estimator is an estimator whose expected value equals the parameter it estimates. It does not hit the true value in every sample, but it has no systematic tendency to overestimate or underestimate it.
An estimator $ T $ of a parameter $ \theta $ is unbiased if the expected value of $ T $ equals $ \theta $, i.e., $ E[T] = \theta $. Consequently, if you repeat the sampling many times independently, the average of the resulting estimates converges to the true value of $ \theta $.
In the provided task, there are two estimators, $ T_1 = \frac{4}{n} N_1 - 2 $ based on the starchy-green count and $ T_2 = \frac{4}{n} N_2 $ based on the sugary-white count. Both are unbiased for $ \theta $: substituting the binomial expectations $ E[N_1] = n p_1 $ and $ E[N_2] = n p_2 $ gives $ E[T_1] = \theta $ and $ E[T_2] = \theta $.
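The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the true value `theta_true`, the sample size, and the number of repetitions are hypothetical, and $ N_1 $ and $ N_2 $ are drawn independently for simplicity, which preserves their binomial marginal distributions but ignores that the two counts come from the same leaves.

```python
import random


def simulate_estimates(theta, n, reps, seed=0):
    """Draw `reps` samples of size n and return the T1 and T2 values."""
    rng = random.Random(seed)
    p1 = (theta + 2) / 4
    p2 = theta / 4
    t1_values, t2_values = [], []
    for _ in range(reps):
        # A binomial count is a sum of n independent Bernoulli trials.
        n1 = sum(rng.random() < p1 for _ in range(n))
        n2 = sum(rng.random() < p2 for _ in range(n))
        t1_values.append(4 * n1 / n - 2)
        t2_values.append(4 * n2 / n)
    return t1_values, t2_values


theta_true = 0.3  # hypothetical true parameter
t1_values, t2_values = simulate_estimates(theta_true, n=200, reps=2000)

mean_t1 = sum(t1_values) / len(t1_values)
mean_t2 = sum(t2_values) / len(t2_values)
print(mean_t1, mean_t2)  # both averages should be close to theta_true
```

Individual values in `t1_values` and `t2_values` scatter around `theta_true`; only their averages settle near it.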

Parameter Estimation

Parameter estimation involves using data to determine the values of parameters that define a statistical model. It is essential in bridging the observed data and the theoretical framework that describes the underlying process.
In the context of the given problem, the task is to estimate $ \theta $, the genetic parameter governing the probabilities of the leaf types, using the observed counts. The estimators $ T_1 $ and $ T_2 $ are formulas into which the observed counts are substituted to obtain estimates of $ \theta $.
For the data in this problem, the calculations give:

$ T_1 $ is computed from the starchy-green leaf count, resulting in $ T_1 \approx 0.0808 $.
$ T_2 $ uses the sugary-white leaf count, giving $ T_2 \approx 0.0333 $.

Although both estimators are unbiased, the two estimates differ because each is computed from a different observed count, and each count is subject to random sampling variation. Unbiasedness is a statement about the average over many samples, not a guarantee about any single sample.
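How much each estimator varies from sample to sample can be worked out from the binomial variance $ \operatorname{Var}(N) = n p (1-p) $. The following derivation is an addition to the original solution and uses only the distributions stated in the question:
\[ \operatorname{Var}(T_1) = \frac{16}{n^2} \operatorname{Var}(N_1) = \frac{16}{n} \cdot \frac{1}{4}(\theta+2) \cdot \frac{1}{4}(2-\theta) = \frac{4-\theta^2}{n} \]
\[ \operatorname{Var}(T_2) = \frac{16}{n^2} \operatorname{Var}(N_2) = \frac{16}{n} \cdot \frac{1}{4}\theta \cdot \frac{1}{4}(4-\theta) = \frac{\theta(4-\theta)}{n} \]
Both variances are positive for $ 0<\theta<1 $, so neither estimate is expected to equal $ \theta $ exactly, and the two need not agree with each other.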
