Problem 24
Question
Face recognition systems pick faces out of crowds at airports to see if any matches occur with law enforcement databases. Performance of the systems can be affected by lighting, gender and age of the target, and age of the database. In Problems 24-27, the tables give identification rates for faces under various conditions. Decide whether the rate of recognition is independent of the given factor. Table 17.22 compares face recognition under different lighting conditions.
$$
\begin{array}{c|c|c}
\hline
\text{Lighting} & \text{Did recognize} & \text{Did not recognize} \\
\hline
\text{Indoors} & 900 & 100 \\
\hline
\text{Outdoors} & 300 & 300 \\
\hline
\end{array}
$$
Step-by-Step Solution
Verified Answer
No, the rate of recognition in a face recognition system is not independent of lighting conditions. The chi-square test (χ² = 320 with df = 1, far above the critical value of 3.84 at α = 0.05) shows that the system performs differently indoors and outdoors, which affects its effectiveness in identifying individuals.
Step 1: State the null and alternative hypotheses
Null hypothesis (H0): The rate of recognition is independent of lighting conditions.
Alternative hypothesis (H1): The rate of recognition is not independent of lighting conditions.
Step 2: Calculate the expected frequencies
Assuming that the null hypothesis is true (recognition rate is independent of lighting conditions), we can calculate the expected frequencies for each cell in the table.
Expected frequency (E) = (row total * column total) / total observations
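Row totals: Indoors = 900 + 100 = 1000, Outdoors = 300 + 300 = 600
Column totals: Did recognize = 900 + 300 = 1200, Did not recognize = 100 + 300 = 400
Total observations = 900 + 100 + 300 + 300 = 1600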
For each cell:
$$E_{11} = \frac{1000 \times 1200}{1600} = 750$$
$$E_{12} = \frac{1000 \times 400}{1600} = 250$$
$$E_{21} = \frac{600 \times 1200}{1600} = 450$$
$$E_{22} = \frac{600 \times 400}{1600} = 150$$
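The expected-frequency calculation above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `observed` and `expected_counts` are assumptions chosen for this example, not part of the original solution.

```python
# Observed counts from Table 17.22: rows are lighting, columns are outcome.
observed = [
    [900, 100],  # Indoors: did recognize, did not recognize
    [300, 300],  # Outdoors: did recognize, did not recognize
]


def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
print(expected)  # [[750.0, 250.0], [450.0, 150.0]]
```

The same function works for any table of counts, not only a 2 x 2 one, because it is built from the row and column totals alone.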
Step 3: Calculate the chi-square statistic
The chi-square statistic (χ²) is computed using the formula:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where O represents the observed frequencies and E represents the expected frequencies.
For each cell:
$$\frac{(O_{11} - E_{11})^2}{E_{11}} = \frac{(900 - 750)^2}{750} = 30$$
$$\frac{(O_{12} - E_{12})^2}{E_{12}} = \frac{(100 - 250)^2}{250} = 90$$
$$\frac{(O_{21} - E_{21})^2}{E_{21}} = \frac{(300 - 450)^2}{450} = 50$$
$$\frac{(O_{22} - E_{22})^2}{E_{22}} = \frac{(300 - 150)^2}{150} = 150$$
Sum these values to obtain the chi-square statistic:
$$\chi^2 = 30 + 90 + 50 + 150 = 320$$
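As a check on the arithmetic, the sum can be reproduced with a short Python sketch. The function name `chi_square_statistic` is a hypothetical helper introduced here for illustration; the observed and expected counts are the ones used in this solution.

```python
# Observed counts from Table 17.22 and expected counts from Step 2.
observed = [[900, 100], [300, 300]]
expected = [[750, 250], [450, 150]]


def chi_square_statistic(obs, exp):
    """Sum (O - E)^2 / E over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(obs, exp)
        for o, e in zip(obs_row, exp_row)
    )


chi2 = chi_square_statistic(observed, expected)
print(chi2)  # 320.0
```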
Step 4: Determine the degrees of freedom
Degrees of freedom (df) = (number of rows - 1) * (number of columns - 1)
$$df = (2 - 1) * (2 - 1) = 1$$
Step 5: Calculate the p-value and make a decision
A decision can be reached either by computing the p-value with statistical software or by comparing the statistic with a critical value from a chi-square distribution table. Here we use the critical-value approach with df = 1 and a significance level (α) of 0.05.
From a table or calculator, we find that the critical value at α = 0.05 is χ² = 3.84. Since our calculated χ² (320) is greater than the critical value, we reject the null hypothesis.
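The p-value itself can also be computed without a table. The sketch below relies on a standard fact that holds only for df = 1: a chi-square variable with one degree of freedom is the square of a standard normal variable, so its upper-tail probability equals `erfc(sqrt(x / 2))`. The function name `chi2_sf_df1` is an assumption made for this example.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability P(X >= x) for a chi-square variable with df = 1.

    Valid only for one degree of freedom, where X is the square of a
    standard normal variable.
    """
    return math.erfc(math.sqrt(x / 2))


p_at_critical = chi2_sf_df1(3.84)  # should be close to the 0.05 significance level
p_value = chi2_sf_df1(320)         # p-value for the statistic from Step 3

print(round(p_at_critical, 3))  # 0.05
print(p_value < 0.05)           # True
```

For tables with more than one degree of freedom this shortcut does not apply, and a statistics library or a chi-square table would be needed instead.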
Therefore, we have enough evidence to conclude that the rate of recognition is not independent of lighting conditions. The face recognition system seems to perform differently when used indoors and outdoors, which could affect its effectiveness in identifying individuals from law enforcement databases.
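To see the direction of the difference, it can help to compare the observed recognition rates in each row of Table 17.22 with the pooled rate that independence would imply:

$$\frac{900}{1000} = 0.90 \ \text{(indoors)}, \qquad \frac{300}{600} = 0.50 \ \text{(outdoors)}, \qquad \frac{1200}{1600} = 0.75 \ \text{(pooled)}$$

The indoor rate lies well above the pooled rate and the outdoor rate well below it, which is the gap the chi-square statistic is measuring.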
Key Concepts
Null Hypothesis
When performing a Chi-Square Test, the null hypothesis is a vital starting point. It represents a statement that there is no effect or no association between variables. In the context of the exercise, the null hypothesis (\(H_0\)) states that the rate of recognition is independent of lighting conditions.
This means we assume that lighting does not influence whether the face recognition system recognizes a face. With the null hypothesis stated clearly, a statistical test can then assess whether the observed data are consistent with it.
The null hypothesis thus provides a foundation upon which statistical questions are framed and tested.
Expected Frequency
Expected frequencies are a critical component of the Chi-Square Test. They refer to the frequencies we would expect to find in each category if our null hypothesis were true.
Calculating expected frequencies involves the formula:
\[ E = \frac{(\text{Row Total}) \times (\text{Column Total})}{\text{Grand Total}} \]
This helps us understand whether the observed data significantly deviate from what we would expect.
For instance, if the null hypothesis is true, the observed counts under each lighting condition should be close to the expected frequencies, differing only by chance. Large deviations from the expected frequencies indicate that lighting conditions may influence recognition rates.
Thus, expected frequencies provide a benchmark by which the claim of independence is tested.
Degrees of Freedom
In a Chi-Square Test, degrees of freedom (df) determine the critical value against which the calculated chi-square statistic is compared. They count how many values in the table are free to vary once the row and column totals are fixed.
For a contingency table, the formula is:
\[ df = (\text{number of rows} - 1) \times (\text{number of columns} - 1) \]
For our table comparing face recognition under different lighting, with two rows and two columns, the calculation is straightforward: \( df = (2-1) \times (2-1) = 1 \). The degrees of freedom link the test statistic to the significance level by selecting which chi-square distribution is used.
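A short worked illustration of why only one value is free here, assuming the row totals (1000 and 600) and the column totals (1200 and 400) of Table 17.22 are held fixed: once the Indoors/Did recognize cell \(O_{11}\) is chosen, every other cell follows.
\[ O_{12} = 1000 - O_{11}, \qquad O_{21} = 1200 - O_{11}, \qquad O_{22} = 400 - O_{12} \]
With \(O_{11} = 900\), this gives \(O_{12} = 100\), \(O_{21} = 300\), and \(O_{22} = 300\), which matches the table.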
P-value
The p-value is the probability that the observed data, or something more extreme, would occur if the null hypothesis were true. It provides a precise statistical measure for determining whether to reject the null hypothesis.
In our example involving face recognition, a low p-value (often less than 0.05) indicates that the observed differences in recognition rates under various lighting conditions are statistically significant.
Here, a chi-square statistic of 320 with df = 1 gives a p-value far below 0.05, leading us to reject the null hypothesis. This means that the data provide strong evidence that the rate of recognition depends on lighting conditions.
The decision to reject or fail to reject the null hypothesis is made in the light of the calculated p-value, a key element in hypothesis testing.