Problem 13

Question

A box contains an unknown number \(N\) of identical bolts. In order to get an idea of the size \(N\), we randomly mark one of the bolts from the box. Next we select at random a bolt from the box. If this is the marked bolt we stop, otherwise we return the bolt to the box, and we randomly select a second one, etc. We stop when the selected bolt is the marked one. Let \(X\) be the number of times a bolt was selected. Later (in Exercise 21.11) we will try to find an estimate of \(N\). Here we look at the probability distribution of \(X\).

a. What is the probability distribution of \(X\)? Specify its parameter(s)!

b. The drawback of this approach is that \(X\) can attain any of the values \(1,2,3, \ldots\), so that if \(N\) is large we might be sampling from the box for quite a long time. We decide to sample from the box in a slightly different way: after we have randomly marked one of the bolts in the box, we select at random a bolt from the box. If this is the marked one, we stop, otherwise we randomly select a second bolt (we do not return the selected bolt). We stop when we select the marked bolt. Let \(Y\) be the number of times a bolt was selected. Show that \(\mathrm{P}(Y=k)=1 / N\) for \(k=1,2, \ldots, N\) (\(Y\) has a so-called discrete uniform distribution).

c. Instead of randomly marking one bolt in the box, we mark \(m\) bolts, with \(m\) smaller than \(N\). Next, we randomly select \(r\) bolts; \(Z\) is the number of marked bolts in the sample. Show that $$ \mathrm{P}(Z=k)=\frac{\binom{m}{k}\binom{N-m}{r-k}}{\binom{N}{r}}, \quad \text{for} \quad k=0,1,2, \ldots, r $$ (\(Z\) has a so-called hypergeometric distribution, with parameters \(m, N\), and \(r\).)

Step-by-Step Solution

Answer
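a) \(X\) has a geometric distribution with parameter \(p = 1/N\). b) \(Y\) has a discrete uniform distribution on \(\{1, 2, \ldots, N\}\), so \(\mathrm{P}(Y=k)=1/N\). c) \(Z\) has a hypergeometric distribution with parameters \(m\), \(N\), and \(r\); its probabilities are given by the formula stated in the question.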
Step 1: Identify the Distribution for Part a
This is the classic setting of a geometric distribution. Because every selected bolt is returned to the box, the selections are independent and each one has the same probability of success. The random variable \( X \) is the number of trials up to and including the first success (selecting the marked bolt). The probability \( p \) of success on any given trial is \( \frac{1}{N} \), because there is 1 marked bolt among \( N \) bolts. Hence \( X \sim \text{Geometric}(p) \) with \( p = \frac{1}{N} \).
Step 2: Describe the Distribution's Parameter
The only parameter of a geometric distribution is \( p \), the probability of success on each trial. Here \( p = \frac{1}{N} \). We stop at trial \( k \) exactly when the first \( k-1 \) trials fail and trial \( k \) succeeds, so \[ \mathrm{P}(X = k) = (1-p)^{k-1} p = \left(1-\frac{1}{N}\right)^{k-1} \frac{1}{N} \] for \( k = 1, 2, 3, \ldots \)
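The formula can be checked numerically. The following is a minimal sketch, not part of the original solution: the function name `geometric_pmf` and the box size `n_bolts = 10` are assumptions chosen only for illustration. It uses exact fractions so that no rounding is involved.

```python
from fractions import Fraction


def geometric_pmf(k: int, n_bolts: int) -> Fraction:
    """P(X = k): k-1 misses followed by one hit, sampling with replacement."""
    if k < 1:
        return Fraction(0)
    p = Fraction(1, n_bolts)  # one marked bolt among n_bolts
    return (1 - p) ** (k - 1) * p


n_bolts = 10  # hypothetical box size, chosen for illustration
for k in range(1, 4):
    print(k, geometric_pmf(k, n_bolts))
# prints:
# 1 1/10
# 2 9/100
# 3 81/1000
```

The probabilities shrink by the factor \( 1 - \frac{1}{N} \) at each step but never reach zero, which is why \( X \) can take any of the values \( 1, 2, 3, \ldots \)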
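Step 3: Show that Y is Uniform for Part b
To have \( Y = k \), the first \( k-1 \) selections must all miss the marked bolt and the \( k \)th selection must hit it. Since selected bolts are not returned, the box shrinks by one bolt per draw, so \[ \mathrm{P}(Y = k) = \frac{N-1}{N} \cdot \frac{N-2}{N-1} \cdots \frac{N-k+1}{N-k+2} \cdot \frac{1}{N-k+1} = \frac{1}{N} \] for \( k = 1, 2, \ldots, N \), because the product telescopes (for \( k = 1 \) there are no misses and the probability is simply \( \frac{1}{N} \)). Equivalently, drawing without replacement puts the bolts in a random order, and the marked bolt is equally likely to occupy any of the \( N \) positions. Hence \( Y \) has a discrete uniform distribution on \( \{1, 2, \ldots, N\} \).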
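The argument for part b can also be checked by multiplying the conditional probabilities draw by draw. This is a small sketch under assumptions: the function name `uniform_pmf_by_product` and the box size 7 are illustrative and not from the exercise.

```python
from fractions import Fraction


def uniform_pmf_by_product(k: int, n_bolts: int) -> Fraction:
    """P(Y = k) as the product of k-1 misses and one hit, without replacement."""
    prob = Fraction(1)
    for removed in range(k - 1):
        remaining = n_bolts - removed  # bolts still in the box before this draw
        prob *= Fraction(remaining - 1, remaining)  # miss the single marked bolt
    remaining = n_bolts - (k - 1)
    prob *= Fraction(1, remaining)  # hit the marked bolt on draw k
    return prob


n_bolts = 7  # hypothetical box size, chosen for illustration
print([uniform_pmf_by_product(k, n_bolts) for k in range(1, n_bolts + 1)])
```

For every \( k \) from 1 to 7 the product comes out as the same fraction, \( \frac{1}{7} \), in line with the claim \( \mathrm{P}(Y=k) = \frac{1}{N} \).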
Step 4: Explain Hypergeometric Distribution Setup in Part c
Here \( m \) of the \( N \) bolts are marked, \( r \) bolts are selected at random without replacement, and \( Z \) counts the marked bolts in the sample. This is the setting of a hypergeometric distribution: the population is split into two categories (marked and unmarked), and the sample is drawn without replacement. Every set of \( r \) bolts is equally likely to be the sample, so \( \mathrm{P}(Z=k) \) is the number of samples containing exactly \( k \) marked and \( r-k \) unmarked bolts, divided by the total number of samples.
Step 5: Derive the Formula for Hypergeometric Distribution
The formula follows by counting with combinations. The total number of ways to choose \( r \) bolts from \( N \) is \( \binom{N}{r} \). The number of ways to choose \( k \) marked bolts from \( m \) is \( \binom{m}{k} \). The number of ways to choose \( r-k \) unmarked bolts from \( N-m \) is \( \binom{N-m}{r-k} \). Each choice of marked bolts can be combined with each choice of unmarked bolts, so there are \( \binom{m}{k} \binom{N-m}{r-k} \) favorable samples. Therefore \[ \mathrm{P}(Z=k) = \frac{\binom{m}{k} \binom{N-m}{r-k}}{\binom{N}{r}} \] where \( k = 0, 1, 2, \ldots, r \). A binomial coefficient whose lower index exceeds its upper index is taken to be 0, so the formula gives probability 0 when \( k > m \) or \( r-k > N-m \).
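The counting formula translates directly into code. The sketch below is illustrative only: the function name `hypergeometric_pmf` and the values \( N = 20 \), \( m = 5 \), \( r = 4 \) are assumptions, not part of the exercise. It relies on Python's `math.comb`, which returns 0 when the lower index exceeds the upper one, matching the convention for impossible values of \( k \).

```python
from fractions import Fraction
from math import comb


def hypergeometric_pmf(k: int, m: int, n_bolts: int, r: int) -> Fraction:
    """P(Z = k): k marked bolts in a sample of r drawn from n_bolts, m of them marked."""
    if k < 0 or k > r:
        return Fraction(0)
    favorable = comb(m, k) * comb(n_bolts - m, r - k)  # marked choices times unmarked choices
    return Fraction(favorable, comb(n_bolts, r))


# Hypothetical numbers, chosen for illustration: 20 bolts, 5 marked, sample of 4.
n_bolts, m, r = 20, 5, 4
pmf = [hypergeometric_pmf(k, m, n_bolts, r) for k in range(r + 1)]
print(pmf)
print(sum(pmf))  # the probabilities over k = 0, ..., r add up to 1
```

Working with `Fraction` keeps the probabilities exact, so the check that they sum to 1 is not affected by floating-point rounding.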

Key Concepts

Geometric Distribution, Discrete Uniform Distribution, Hypergeometric Distribution
Geometric Distribution
The geometric distribution is a fundamental concept in probability, often used to model the number of trials required for the first success in a series of independent trials. Each trial has the same probability of success. In this exercise, we see a situation where a marked bolt is mixed with identical bolts and we randomly select bolts until we pick the marked one.
  • The random variable, denoted as \(X\), counts the number of times we select a bolt, up to and including the selection of the marked one.
  • The probability \(p\) of choosing the marked bolt in any single attempt is \( \frac{1}{N} \), where \(N\) is the total number of bolts.
Under the geometric distribution, the probability of needing exactly \(k\) trials is \[ \mathrm{P}(X = k) = (1-p)^{k-1} p = \left(1-\frac{1}{N}\right)^{k-1} \frac{1}{N} \] for \(k = 1, 2, 3, \ldots\)
The factor \((1-p)^{k-1}\) is the probability of \(k-1\) failures in a row, and the factor \(p\) is the probability of success on trial \(k\). The distribution therefore suits scenarios where independent attempts are repeated until the first success.
Discrete Uniform Distribution
A discrete uniform distribution provides a framework where each outcome of a discrete random variable is equally likely. In this exercise, we explore the scenario where bolts are selected without replacement. After marking a bolt and starting the selection, we choose until we find the marked one.
  • Here, the random variable \(Y\) represents the number of trials needed until the marked bolt is selected.
  • All values from \(1\) to \(N\) are equally likely outcomes, because the single marked bolt is equally likely to turn up at any position among the \(N\) draws.
This type of distribution is defined by \( \mathrm{P}(Y=k) = \frac{1}{N} \) for \( k = 1, 2, \ldots, N \). Note that the conditional probability of success changes from draw to draw as the box empties; it is the draw number on which the marked bolt appears that is uniformly distributed.
Such distributions are crucial in simulations and games of chance, where each outcome is supposed to occur with equal likelihood.
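A quick simulation illustrates the uniform behavior of \(Y\). This is a rough sketch under stated assumptions: the function names, the box size of 5, the number of trials, and the seed are all hypothetical choices. Drawing without replacement is modeled by shuffling the bolts and reading off the position of the marked one.

```python
import random
from collections import Counter


def draws_until_marked(n_bolts: int, rng: random.Random) -> int:
    """Draw without replacement until the marked bolt (labelled 0) appears."""
    order = list(range(n_bolts))
    rng.shuffle(order)  # a random drawing order of all bolts
    return order.index(0) + 1  # 1-based draw number of the marked bolt


def empirical_distribution(n_bolts: int, n_trials: int, seed: int = 0) -> dict:
    """Relative frequency of each value of Y over n_trials simulated experiments."""
    rng = random.Random(seed)
    counts = Counter(draws_until_marked(n_bolts, rng) for _ in range(n_trials))
    return {k: counts[k] / n_trials for k in range(1, n_bolts + 1)}


# Hypothetical settings: 5 bolts, 50,000 repetitions.
for k, freq in empirical_distribution(5, 50_000).items():
    print(k, round(freq, 3))
```

With these settings each relative frequency should land close to \( \frac{1}{5} = 0.2 \), though the exact values depend on the seed.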
Hypergeometric Distribution
The hypergeometric distribution arises in situations involving selections without replacement from a finite population of different categories. In this problem, the distribution helps to model how many marked bolts are in a random sample.
  • This scenario is represented by the random variable \(Z\), which counts the number of marked bolts in a sample of \(r\) bolts.
  • The population includes \(N\) bolts in total, with \(m\) of them being marked.
The probability \( \mathrm{P}(Z=k) \) is the probability that the sample contains exactly \(k\) marked bolts and \(r-k\) unmarked bolts. It is given by \[\mathrm{P}(Z=k)=\frac{\binom{m}{k} \binom{N-m}{r-k}}{\binom{N}{r}}\] where \(k=0,1,2,\ldots,r\).
Applications of the hypergeometric distribution include quality control and lottery drawings, where items are sampled without replacement, so each draw changes the composition of what remains.