Problem 7
Question
Leaves are divided into four different types: starchy-green, sugary-white, starchy-white, and sugary-green. According to genetic theory, the types occur with probabilities \(\frac{1}{4}(\theta+2)\), \(\frac{1}{4} \theta\), \(\frac{1}{4}(1-\theta)\), and \(\frac{1}{4}(1-\theta)\), respectively, where \(0<\theta<1\).
Suppose one has \(n\) leaves. Then the number of starchy-green leaves is modeled by a random variable \(N_{1}\) with a \(\operatorname{Bin}\left(n, p_{1}\right)\) distribution, where \(p_{1}=\frac{1}{4}(\theta+2)\), and the number of sugary-white leaves is modeled by a random variable \(N_{2}\) with a \(\operatorname{Bin}\left(n, p_{2}\right)\) distribution, where \(p_{2}=\frac{1}{4} \theta\).
The following table lists the counts for the progeny of self-fertilized heterozygotes among 3839 leaves.
\begin{tabular}{lr}
\hline \hline
Type & Count \\
\hline
Starchy-green & 1997 \\
Sugary-white & 32 \\
Starchy-white & 906 \\
Sugary-green & 904 \\
\hline \hline
\end{tabular}
Source: R.A. Fisher. Statistical methods for research workers. Hafner, New York, 1958; Table 62 on page 299.
Consider the following two estimators for \(\theta\):
$$ T_{1}=\frac{4}{n} N_{1}-2 \quad \text { and } \quad T_{2}=\frac{4}{n} N_{2} . $$
a. Check that both \(T_{1}\) and \(T_{2}\) are unbiased estimators for \(\theta\).
b. Compute the value of both estimators for \(\theta\).
Step-by-Step Solution
Key Concepts
Binomial Distribution
In the given problem, each leaf falls into one of four types determined by its genetic properties. Counting the leaves of a single type among \( n \) leaves amounts to counting successes in \( n \) trials, which is why the number of starchy-green leaves and the number of sugary-white leaves are each modeled by a binomial distribution. Here, the parameter \( n \) is the total number of leaves, which is 3839. The probability that a leaf is starchy-green is \( p_1 = \frac{1}{4}(\theta+2) \), and the probability that it is sugary-white is \( p_2 = \frac{1}{4} \theta \). Because both probabilities depend on \( \theta \), the observed counts carry information about \( \theta \).
Comparing the observed frequencies with those predicted by the binomial model then gives insight into the underlying genetic probabilities.
Unbiased Estimator
An estimator \( T \) of a parameter \( \theta \) is unbiased if its expected value equals \( \theta \), i.e., \( E[T] = \theta \), for every admissible value of \( \theta \). This means that if the experiment or sampling is repeated over and over again, the average of the resulting estimates converges to the true value of the parameter \( \theta \).
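The long-run-average interpretation above can be illustrated with a small simulation. The following is a minimal sketch, not part of the original solution: the value `theta=0.05`, the sample size `n=200`, the number of repetitions and the function name `simulate_means` are all assumptions chosen for illustration. Each simulated leaf is assigned a type using the probabilities from the question, and the estimators \( T_1 \) and \( T_2 \) are averaged over many repetitions.

```python
import random


def simulate_means(theta, n, reps, seed=0):
    """Average T1 and T2 over `reps` simulated samples of n leaves each.

    theta, n, reps and seed are illustrative assumptions, not values
    taken from the problem.
    """
    rng = random.Random(seed)
    p1 = (theta + 2) / 4  # probability of a starchy-green leaf
    p2 = theta / 4        # probability of a sugary-white leaf
    t1_sum = 0.0
    t2_sum = 0.0
    for _ in range(reps):
        n1 = 0  # starchy-green count in this sample
        n2 = 0  # sugary-white count in this sample
        for _ in range(n):
            u = rng.random()
            if u < p1:
                n1 += 1
            elif u < p1 + p2:
                n2 += 1
            # otherwise the leaf is starchy-white or sugary-green
        t1_sum += 4 * n1 / n - 2
        t2_sum += 4 * n2 / n
    return t1_sum / reps, t2_sum / reps


mean_t1, mean_t2 = simulate_means(theta=0.05, n=200, reps=5000)
print(mean_t1, mean_t2)
```

Both printed averages should lie close to the chosen `theta`, although any single simulated estimate can deviate from it considerably.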
In the provided task, there are two estimators: \( T_1 = \frac{4}{n} N_1 - 2 \), based on the starchy-green count, and \( T_2 = \frac{4}{n} N_2 \), based on the sugary-white count. Both are unbiased for \( \theta \): since \( N_1 \) and \( N_2 \) are binomial, their expected values are \( n p_1 \) and \( n p_2 \), and by linearity of expectation the expected value of each estimator equals \( \theta \).
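Written out for part a, the check uses only linearity of expectation and the binomial mean \( E[N_i] = n p_i \), with \( p_1 \) and \( p_2 \) as defined in the question:

$$
\begin{aligned}
E\left[T_{1}\right] &= \frac{4}{n} E\left[N_{1}\right]-2 = \frac{4}{n} \cdot n \cdot \frac{1}{4}(\theta+2)-2 = (\theta+2)-2 = \theta, \\
E\left[T_{2}\right] &= \frac{4}{n} E\left[N_{2}\right] = \frac{4}{n} \cdot n \cdot \frac{1}{4} \theta = \theta .
\end{aligned}
$$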
Parameter Estimation
In the given problem, the task is to estimate \( \theta \), the genetic parameter that governs the probabilities of the four leaf types, from the observed counts. The estimators \( T_1 \) and \( T_2 \) are formulas into which the observed counts are substituted to obtain estimates of \( \theta \).
With \( n = 3839 \), the estimates are computed as follows:
- \( T_1 \) uses the starchy-green leaf count \( N_1 = 1997 \): \( T_1 = \frac{4}{3839} \cdot 1997 - 2 = \frac{310}{3839} \approx 0.0808 \).
- \( T_2 \) uses the sugary-white leaf count \( N_2 = 32 \): \( T_2 = \frac{4}{3839} \cdot 32 = \frac{128}{3839} \approx 0.0333 \).
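The arithmetic for part b can be reproduced with a few lines of Python. This is a minimal sketch using the counts from the table in the question; the names `counts`, `estimate_t1` and `estimate_t2` are chosen here for illustration only.

```python
# Observed counts from the table in the question (Fisher, Table 62).
counts = {
    "starchy-green": 1997,
    "sugary-white": 32,
    "starchy-white": 906,
    "sugary-green": 904,
}

n = sum(counts.values())  # total number of leaves


def estimate_t1(n1, n):
    """T1 = (4/n) * N1 - 2, based on the starchy-green count."""
    return 4 * n1 / n - 2


def estimate_t2(n2, n):
    """T2 = (4/n) * N2, based on the sugary-white count."""
    return 4 * n2 / n


t1 = estimate_t1(counts["starchy-green"], n)
t2 = estimate_t2(counts["sugary-white"], n)
print(f"n = {n}, T1 = {t1:.4f}, T2 = {t2:.4f}")
# prints: n = 3839, T1 = 0.0808, T2 = 0.0333
```

The two estimates differ noticeably even though both estimators are unbiased, which is possible because unbiasedness concerns the long-run average and not any single sample.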