Inference for Distributions of Categorical Data
The Practice of Statistics for AP · 110 exercises
Q. 5.2
Sample surveys on sensitive issues can give different results depending on how the question is asked. A University of Wisconsin study randomly divided 2400 respondents into three groups. All participants were asked if they had ever used cocaine. One group of 800 was interviewed by phone; 21% said they had used cocaine. Another 800 people were asked the question in a one-on-one personal interview; 25% said “Yes.” The remaining 800 were allowed to make an anonymous written response; 28% said “Yes.”
2. Make a two-way table of responses about cocaine use by how the survey was administered
2 step solution
Q. 5.3
Sample surveys on sensitive issues can give different results depending on how the question is asked. A University of Wisconsin study randomly divided 2400 respondents into three groups. All participants were asked if they had ever used cocaine. One group of 800 was interviewed by phone; 21% said they had used cocaine. Another 800 people were asked the question in a one-on-one personal interview; 25% said “Yes.” The remaining 800 were allowed to make an anonymous written response; 28% said “Yes.”
3. Are the differences between the three groups statistically significant? Give appropriate evidence to support your answer.
4 step solution
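The survey setup in Q. 5.2 and Q. 5.3 can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's worked solution: it converts the stated percentages into counts, builds the two-way table, and computes a chi-square statistic. The helper names (`expected_counts`, `chi_square`) are made up for this sketch, and the P-value line relies on the fact that with exactly 2 degrees of freedom the upper-tail area of the chi-square distribution is exp(-x/2).

```python
import math

def expected_counts(table):
    """Expected counts under H0: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    exp = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, exp)
        for o, e in zip(obs_row, exp_row)
    )

group_size = 800
percent_yes = {"Phone": 21, "Personal interview": 25, "Anonymous written": 28}

# Rows: survey method; columns: (Yes, No). Integer arithmetic avoids rounding noise.
table = []
for method, pct in percent_yes.items():
    yes = group_size * pct // 100
    table.append([yes, group_size - yes])

chi2 = chi_square(table)
df = (len(table) - 1) * (len(table[0]) - 1)

# Closed form for the upper tail, valid only because df == 2 here.
p_value = math.exp(-chi2 / 2)

for method, row in zip(percent_yes, table):
    print(f"{method:20s} Yes={row[0]:4d} No={row[1]:4d}")
print(f"chi-square = {chi2:.2f}, df = {df}, P-value = {p_value:.4f}")
```

For tables with other degrees of freedom, a statistics library or Table C would be needed for the P-value; the closed form above does not generalize.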
Q. 6.1
Many popular businesses are franchises—think of McDonald’s. The owner of a local franchise benefits from brand recognition, national advertising, and detailed guidelines provided by the franchise chain. In return, he or she pays fees to the franchise firm and agrees to follow its policies. The relationship between the local owner and the franchise firm is spelled out in a detailed contract.
One clause that the contract may or may not contain is the entrepreneur’s right to an exclusive territory. This means that the new outlet will be the only representative of the franchise in a specified territory and will not have to compete with other outlets of the same chain. How does the presence of an exclusive-territory clause in the contract relate to the survival of the business?
A study designed to address this question collected data from a random sample of new franchise firms. Two categorical variables were measured for each franchisor. First, the franchisor was classified as successful or not based on whether or not it was still offering franchises as of a certain date. Second, the contract each franchisor offered to franchisees was classified according to whether or not there was an exclusive-territory clause. Here are the count data, arranged in a two-way table:
Do these data provide convincing evidence of an association between an exclusive territory clause and business survival? Carry out an appropriate test at the level.
3 step solution
Q. 27
Do men and women participate in sports for the same reasons? One goal for sports participants is social comparison—the desire to win or to do better than other people. Another is mastery—the desire to improve one’s skills or to try one’s best. A study on why students participate in sports collected data from independent random samples of male and female undergraduates at a large university. Each student was classified into one of four categories based on his or her responses to a questionnaire about sports goals. The four categories were high social comparison– high mastery (HSC-HM), high social comparison– low mastery (HSC-LM), low social comparison–high mastery (LSC-HM), and low social comparison–low mastery (LSC-LM). One purpose of the study was to compare the goals of male and female students. Here are the data displayed in a two-way table:
(a) Calculate the conditional distribution (in proportions) of the reported sports goals for each gender.
(b) Make an appropriate graph for comparing the conditional distributions in part (a).
(c) Write a few sentences comparing the distributions of sports goals for male and female undergraduates.
6 step solution
Q. 28
The nonprofit group Public Agenda conducted telephone interviews with three randomly selected groups of parents of high school children. There were black parents, Hispanic parents, and white parents. One question asked was “Are the high schools in your state doing an excellent, good, fair, or poor job, or don’t you know enough to say?” Here are the survey results:
(a) Calculate the conditional distribution (in proportions) of responses for each group of parents.
(b) Make an appropriate graph for comparing the conditional distributions in part (a).
(c) Write a few sentences comparing the distributions of responses for the three groups of parents
6 step solution
Q. 29
Refer to Exercise 27. Do the data provide convincing evidence of a difference in the distributions of sports goals for male and female undergraduates at the university?
(a) State appropriate null and alternative hypotheses for a significance test to help answer this question.
(b) Calculate the expected counts. Show your work.
(c) Calculate the chi-square statistic. Show your work
6 step solution
Q. 30
Refer to Exercise 28. Do the data provide convincing evidence of a difference in the distributions of opinions about how high schools are doing among black, Hispanic, and white parents?
(a) State appropriate null and alternative hypotheses for a significance test to help answer this question.
(b) Calculate the expected counts. Show your work.
(c) Calculate the chi-square statistic. Show your work.
6 step solution
Q. 31
Why men and women play sports: Do men and women participate in sports for the same reasons? One goal for sports participants is social comparison—the desire to win or to do better than other people. Another is mastery—the desire to improve one’s skills or to try one’s best. A study on why students participate in sports collected data from independent random samples of 67 male and 67 female undergraduates at a large university. Each student was classified into one of four categories based on his or her responses to a questionnaire about sports goals. The four categories were high social comparison–high mastery (HSC-HM), high social comparison–low mastery (HSC-LM), low social comparison–high mastery (LSC-HM), and low social comparison–low mastery (LSC-LM). One purpose of the study was to compare the goals of male and female students. Here are the data displayed in a two-way table: Observed Counts for Sports Goals
(a) Check that the conditions for performing the chi-square test are met.
(b) Use Table C to find the P-value. Then use your calculator’s χ²cdf command.
(c) Interpret the P-value from the calculator in context.
(d) What conclusion would you draw? Justify your answer.
8 step solution
Q.32
How are schools doing? Refer to Exercises 28 and 30.
(a) Check that the conditions for performing the chi-square test are met.
(b) Use Table C to find the P-value. Then use your calculator's χ²cdf command.
(c) Interpret the P-value from the calculator in context.
(d) What conclusion would you draw? Justify your answer.
8 step solution
Q. 33
How is the hatching of water python eggs influenced by the temperature of the snake’s nest? Researchers randomly assigned newly laid eggs to one of three water temperatures: hot, neutral, or cold. Hot duplicates the extra warmth provided by the mother python, and cold duplicates the absence of the mother. Here are the data on the number of eggs and the number that hatched
(a) Make a two-way table of temperature by outcome (hatched or not). Calculate the proportion of eggs in each group that hatched. The researchers believed that eggs would not hatch in cold water. Do the data support that belief?
(b) Are the differences between the three groups statistically significant? Give appropriate evidence to support your answer
4 step solution
Q. 34
After randomly assigning subjects to treatments in a randomized comparative experiment, we can compare the treatment groups to see how well the random assignment worked. We hope to find no significant differences among the groups. A study on how to provide premature infants with a substance essential to their development assigned infants at random to receive one of four types of supplements, called PBM, NLCP, PL-LCP, and TG-LCP. The subjects were premature infants. In the experiment, were assigned to the PBM group and to each of the other treatments.
(a) The random assignment resulted in females in the TG-LCP group and 11 females in each of the other groups. Make a two-way table of the group by gender. Calculate the proportion of females in each treatment group. Does it appear that the random assignment roughly balanced the groups by gender? Explain.
(b) Are the differences between the groups statistically significant? Give appropriate evidence to support your answer.
4 step solution
Q. 35
How do U.S. residents who travel overseas for leisure differ from those who travel for business? The following is the breakdown by occupation:
Explain why we can’t use a chi-square test to learn whether these two distributions differ significantly.
2 step solution
Q. 36
Does eating chocolate trigger headaches? To find out, women with chronic headaches followed the same diet except for eating chocolate bars and carob bars that looked and tasted the same. Each subject ate both chocolate and carob bars in random order with at least three days between. Each woman then reported whether or not she had a headache within hours of eating the bar. Here is a two-way table of the results for the subjects:
The researchers carried out a chi-square test on this table to see if the two types of bar differ in triggering headaches. Explain why this test is incorrect.
2 step solution
Q. 37
How to quit smoking It’s hard for smokers to quit. Perhaps prescribing a drug to fight depression will work as well as the usual nicotine patch. Perhaps combining the patch and the drug will work better than either treatment alone. Here are data from a randomized, double-blind trial that compared four treatments. A “success” means that the subject did not smoke for a year following the beginning of the study.
Group | Treatment       | Subjects | Successes
1     | Nicotine patch  | 244      | 40
2     | Drug            | 244      | 74
3     | Patch plus drug | 245      | 87
4     | Placebo         | 160      | 25
(a) Summarize these data in a two-way table.
(b) Make a graph to compare the success rates for the four treatments. Describe what you see.
(c) Explain in words what the null hypothesis H0: p1 = p2 = p3 = p4 says about subjects’ smoking habits.
(d) Find the expected counts if H0 is true, and display them in a two-way table similar to the table of observed counts.
5 step solution
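Part (d) of Q. 37 asks for expected counts under H0. A minimal Python sketch of that calculation, using only the counts given in the exercise, might look like the following. The variable names are illustrative; the idea is that if all four treatments share one success rate, each group's expected successes equal its size times the overall success rate.

```python
treatments = ["Nicotine patch", "Drug", "Patch plus drug", "Placebo"]
subjects = [244, 244, 245, 160]
successes = [40, 74, 87, 25]

# Observed failures complete the two-way table of observed counts.
failures = [n - s for n, s in zip(subjects, successes)]

total_subjects = sum(subjects)
overall_rate = sum(successes) / total_subjects  # pooled success rate under H0

# Expected (successes, failures) for each treatment if H0 is true.
expected = {
    t: (n * overall_rate, n * (1 - overall_rate))
    for t, n in zip(treatments, subjects)
}

print(f"{'Treatment':16s} {'Exp. success':>13s} {'Exp. failure':>13s}")
for t in treatments:
    exp_s, exp_f = expected[t]
    print(f"{t:16s} {exp_s:13.2f} {exp_f:13.2f}")
```

This is equivalent to the usual (row total × column total) / table total rule, written in terms of the pooled success rate.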
Q.38
Aspirin prevents blood from clotting and so helps prevent strokes. The Second European Stroke Prevention Study asked whether adding another anticlotting drug named dipyridamole would be more effective for patients who had already had a stroke. Here are the data on strokes during the two years of the study:
(a) Summarize these data in a two-way table.
(b) Make a graph to compare the rates of strokes for the four treatments. Describe what you see.
(c) Explain in words what the null hypothesis H0: p1 = p2 = p3 = p4 says about the incidence of strokes.
(d) Find the expected counts if H0 is true, and display them in a two-way table similar to the table of observed counts
8 step solution
Q. 39
Refer to Exercise 37. Do the data provide convincing evidence of a difference in the effectiveness of the four treatments? Carry out an appropriate test at the significance level
2 step solution
Q. 40
Refer to Exercise 38. Do the data provide convincing evidence of a difference in the effectiveness of the four treatments? Carry out an appropriate test at the significance level.
2 step solution
Q. 41
Perform a follow-up analysis of the test in Exercise 39 by finding the individual components of the chi-square statistic. Which cell(s) contributed most to the final result?
2 step solution
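The follow-up analysis in Q. 41 breaks the chi-square statistic into one component per cell. A hedged sketch for the smoking data of Exercise 37 is shown here; it is one way to organize the arithmetic, not the textbook's solution, and the dictionary layout is an assumption made for readability.

```python
treatments = ["Nicotine patch", "Drug", "Patch plus drug", "Placebo"]
subjects = [244, 244, 245, 160]
successes = [40, 74, 87, 25]

# Observed counts keyed by (treatment, outcome).
observed = {}
for t, n, s in zip(treatments, subjects, successes):
    observed[(t, "success")] = s
    observed[(t, "failure")] = n - s

total = sum(subjects)
row_totals = dict(zip(treatments, subjects))
col_totals = {"success": sum(successes), "failure": total - sum(successes)}

# Each cell contributes (observed - expected)^2 / expected.
components = {}
for (t, outcome), o in observed.items():
    e = row_totals[t] * col_totals[outcome] / total
    components[(t, outcome)] = (o - e) ** 2 / e

chi2 = sum(components.values())
largest_cell = max(components, key=components.get)

for cell, value in sorted(components.items(), key=lambda kv: -kv[1]):
    print(f"{cell[0]:16s} {cell[1]:8s} {value:6.2f}")
print(f"chi-square = {chi2:.2f}; largest component: {largest_cell}")
```

Sorting the components from largest to smallest makes it easy to see which cells depart most from what H0 predicts.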
Q. 42
Perform a follow-up analysis of the test in Exercise 40 by finding the individual components of the chi-square statistic. Which cell(s) contributed most to the final result?
2 step solution
Q. 43
Gastric freezing was once a recommended treatment for ulcers in the upper intestine. The use of gastric freezing stopped after experiments showed it had no effect. One randomized comparative experiment found that of the gastric-freezing patients improved, while of the patients in the placebo group improved. We can test the hypothesis of “no difference” in the effectiveness of the treatments in two ways: with a two-sample z test or with a chi-square test.
(a) Minitab output for a chi-square test is shown below. State appropriate hypotheses and interpret the P-value in context. What conclusion would you draw?
Chi-Square Test: Gastric freezing, Placebo
Expected counts are printed below observed counts
Chi-Square contributions are printed below expected counts
(b) Minitab output for a two-sample z test is shown below. Explain how these results are consistent with the test in part (a).
4 step solution
Q. 44
The General Social Survey asked a random sample of adults, “Do you favour or oppose the death penalty for persons convicted of murder?” The following table gives the responses of people whose highest education was a high school degree and of people with a bachelor’s degree:
We can test the hypothesis of “no difference” in support for the death penalty among people in these educational categories in two ways: with a two-sample z test or with a chi-square test.
(a) Minitab output for a chi-square test is shown below. State appropriate hypotheses and interpret the P-value in context. What conclusion would you draw?
Chi-Square Test: C1, C2
Expected counts are printed below observed counts
Chi-Square contributions are printed below expected counts
(b) Minitab output for a two-sample z test is shown below. Explain how these results are consistent with the test in part (a).
4 step solution
Q. 45
Some people think recycled products are lower in quality than other products, a fact that makes recycling less practical. Here are data on attitudes toward coffee filters made of recycled paper from a random sample of adults:
(a) Make a bar graph that compares buyers’ and non-buyers’ opinions about recycled filters. Describe what you see
(b) Minitab output for a chi-square test using these data is shown below. Carry out the test. What conclusion do you draw?
4 step solution
Q. 46
The General Social Survey asked a random sample of adults their opinion about whether astrology is very scientific, sort of scientific, or not at all scientific. Here is a two-way table of counts for people in the sample who had three levels of higher education:
(a) Make a bar graph that compares opinions about astrology for the three education categories. Describe what you see.
(b) Minitab output for a chi-square test using these data is shown below. Carry out the test. What conclusion do you draw?
4 step solution
Q. 47
North Carolina State University studied student performance in a course required by its chemical engineering major. One question of interest was the relationship between time spent in extracurricular activities and whether a student earned a C or better in the course. Here are the data for the 119 students who answered a question about extracurricular activities:
(a) Calculate percentages and draw a bar graph that describes the nature of the relationship between time spent on extracurricular activities and performance in the course. Give a brief summary in words.
(b) Explain why you should not perform a chi-square test in this setting.
4 step solution
Q.48
Removing warts A recently reported study looked at the use of oral zinc supplements to get rid of warts. The treatment group took an oral zinc supplement. Of these people, were rid of their warts after one month, and additional patients were rid of them after two months. The subjects in a control group were given a placebo. Of these, was rid of his warts after one month and additional patients were rid of their warts after two months. The reported results of the study included this table.
(a) The researchers say they did a chi-square test on this table. Explain why that makes no sense. Then make a correct two-way table displaying the results of the experiment.
(b) Explain why it is not appropriate to use a chi-square test in this setting even with the correct two-way table.
4 step solution
Q.49
Regulating guns The National Gun Policy Survey asked a random sample of adults, “Do you think there should be a law that would ban possession of handguns except for the police and other authorized persons?” Here are the responses, broken down by the respondent’s level of education:
(a) How do opinions about banning handgun ownership seem to be related to the level of education? Make an appropriate graph to display this relationship. Describe what you see.
(b) Determine whether or not the sample provides convincing evidence that education level and opinion about a handgun ban are independent in the adult population
6 step solution
Q.50
Market research Before bringing a new product to market, firms carry out extensive studies to learn how consumers react to the product and how best to advertise its advantages. Here are data from a study of a new laundry detergent. The participants are a random sample of people who don’t currently use the established brand that the new product will compete with. Give subjects free samples of both detergents. After they have tried both for a while, ask which they prefer. The answers may depend on other facts about how people do laundry.
(a) How are laundry practices (water hardness and wash temperature) related to the choice of detergent? Make an appropriate graph to display this relationship. Describe what you see.
(b) Determine whether or not the sample provides convincing evidence that laundry practices and product preference are independent in the population of interest
5 step solution
Q.51
A survey by the National Institutes of Health asked a random sample of young adults (aged 19 to 25 years), “Where do you live now? That is, where do you stay most often?” Here is the full two-way table (omitting a few who refused to answer and one who claimed to be homeless):
(a) Should we use a chi-square test for homogeneity or a chi-square test of association/independence in this setting? Justify your answer.
(b) State appropriate hypotheses for performing the type of test you chose in part (a). Minitab output from a chi-square test is shown below
(c) Check that the conditions for carrying out the test are met.
(d) Interpret the P-value in context. What conclusion would you draw?
8 step solution
Q.52
What is the most important reason that students buy from catalogs? The answer may differ for different groups of students. Here are results for separate random samples of American and Asian students at a large midwestern university
(a) Should we use a chi-square test for homogeneity or a chi-square test of association/independence in this setting? Justify your answer.
(b) State appropriate hypotheses for performing the type of test you chose in part (a). Minitab output from a chi-square test is shown below.
(c) Check that the conditions for carrying out the test are met.
(d) Interpret the P-value in context. What conclusion would you draw?
8 step solution
Q.53
The appropriate null hypothesis for performing a chi-square test is that
(a) equal proportions of female and male teenagers are almost certain they will be married in years.
(b) there is no difference between female and male teenagers in this sample in their distributions of opinions about marriage.
(c) there is no difference between female and male teenagers in the population in their distributions of opinions about marriage.
(d) there is no association between gender and opinion about marriage in the sample.
(e) there is no association between gender and opinion about marriage in the population.
2 step solution
Q.54
The expected count of females who respond “almost certain” is
(a) .
(b) .
(c) .
(d) .
(e) None of these.
2 step solution
Q.55
The degrees of freedom for the chi-square test for this two-way table are
(a) .
(b) .
(c) .
(d) .
(e) None of these.
2 step solution
Q.56
The cell in the table that contributes the most to the chi-square statistic is
(a) Female, chance.
(b) Male, chance.
(c) Female, almost certain.
(d) Male, almost certain.
(e) All the cells contribute equally to the test statistic.
3 step solution
Q.57
Software gives test statistic and P-value close to 0. The correct interpretation of this result is
(a) the probability of getting a random sample of teens that yields a value of of or larger is basically .
(b) the probability of getting a random sample of teens that yields a value of of or larger if is true is basically .
(c) the probability of making a Type I error is basically .
(d) the probability of making a Type II error is basically .
(e) it's very unlikely that these data are true.
2 step solution
Q.58
Which of the following explains why one of the conditions for performing the chi-square test is met in this case?
(a) The sample is large, teenagers in all.
(b) The sample is random.
(c) All the observed counts are greater than .
(d) We used software to do the calculations.
(e) Both variables are categorical.
2 step solution
Q.59
Inference recap In each of the following settings, say which inference procedure from Chapter you would use. Be specific. For example, you might say “two-sample z test for the difference between two proportions.” You do not need to carry out any procedures.
(a) What is the average voter turnout during an election? A random sample of cities was asked to report the percent of registered voters who actually voted in the most recent election.
(b) Are blondes more likely to have a boyfriend than the rest of the single world? Independent random samples of blondes and nonblondes were asked whether they have a boyfriend.
4 step solution
Q.60
Inference recap In each of the following settings, say which inference procedure from Chapters you would use. Be specific. For example, you might say “two-sample z test for the difference between two proportions.” You do not need to carry out any procedures.
(a) Is there a relationship between attendance at religious services and alcohol consumption? A random sample of adults was asked whether they regularly attend religious services and whether they drink alcohol daily.
(b) Separate random samples of college students and high school students were asked how much time, on average, they spend watching television each week. We want to estimate the difference in the average amount of TV watched by high school and college students.
4 step solution
Q.61
Design (4.2) Was this an observational study or an experiment? Justify your answer.
2 step solution
Q.62
Sorry, no chi-square (11.1) Explain why it would not be appropriate to perform a chi-square goodness-of-fit test in this setting.
2 step solution
Q.63
Average ratings The students decided to compare the average ratings of the cafeteria food on the two scales.
(a) Find the mean and standard deviation of the ratings for the students who were given the -to- scale.
(b) For the students who were given the -to- scale, the ratings have a mean of and a standard deviation of . Since the scales differ by one point, the group decided to add to each of these ratings. What are the mean and standard deviation of the adjusted ratings?
(c) Would it be appropriate to compare the means from parts (a) and (b) using a two-sample t-test? Justify your answer
6 step solution
Q.64
No answer Explain carefully how nonresponse could lead to bias in this project
2 step solution
Q.1
Representative sample? For a class project, a group of statistics students is required to take an SRS of students from their large high school to take part in a survey. The students’ sample consists of freshmen, sophomores, juniors, and seniors. The school roster shows that of the students enrolled at the school are freshmen, are sophomores, are juniors, and are seniors.
(a) Construct a well-labeled bar graph that shows the distribution of grade levels (in percents) for the sample data. Do these data give you any reason to suspect that the statistics students’ sample is unusual? Explain. (b) Use an appropriate test to determine whether the sample data differ significantly from the actual distribution of students by grade level at the school.
6 step solution
Q.3
Stress and heart attack: You read a newspaper article that describes a study of whether stress management can help reduce heart attacks. The subjects all had reduced blood flow to the heart and so were at risk of a heart attack. They were assigned at random to three groups. The article goes on to say: One group took a four-month stress management program, another underwent a four-month exercise program, and the third received usual heart care from their personal physicians. In the next three years, only three of the people in the stress management group suffered “cardiac events,” defined as a fatal or non-fatal heart attack or a surgical procedure such as a bypass or angioplasty. In the same period, seven of the people in the exercise group and out of the 40 patients in usual care suffered such events.
(a) Use the information in the news article to make a two-way table that describes the study results.
(b) What are the success rates of the three treatments in preventing cardiac events?
(c) Is there a significant difference in the success rates for the three treatments? Give appropriate statistical evidence to support your answer.
7 step solution
Q.R11.2
Sorry, no chi-square We would prefer to learn from teachers who know their subject. Perhaps even pre-school children are affected by how knowledgeable they think teachers are. Assign three- and four-year-olds at random to be taught the name of a new toy by either an adult who claims to know about the toy or an adult who claims not to know about it. Then ask the children to pick out a picture of the new toy in a set of pictures of other toys and say its name. The response variable is the count of right answers in four tries. Here are the data:
The researchers report that children taught by the teacher who claimed to be knowledgeable did significantly better. Explain why this result isn't valid.
2 step solution
Q.R11.4
Researchers looked at a random sample of full-page ads that show a model in magazines aimed at young men, at young women, or at young adults in general. They classified the ads as “not sexual” or “sexual,” depending on how the model was dressed (or not dressed). Here are the data:
The figure below displays Minitab output for a chi-square test using these data
(a) Which type of chi-square test should be used in this case? Justify your answer.
(b) State an appropriate pair of hypotheses for the test you chose in part (a).
(c) Show how each of the numbers and was obtained for the “notsexy, Women” cell
(d) Assuming that the conditions for performing inference are met, what conclusion would you draw? Explain
8 step solution
Q. 1
A chi-square goodness-of-fit test is used to test whether a 0 to 9 spinner is "fair" (that is, the outcomes are all equally likely). The spinner is spun 100 times, and the results are recorded. The degrees of freedom for the test will be
(a) 8.
(b) 9.
(c) 10.
(d)
(e) None of these.
2 step solution
Q. 2
Which hypotheses would be appropriate for performing a chi-square test?
(a) The null hypothesis is that the closer students get to graduation, the less likely they are to be opposed to tuition increases. The alternative is that how close students are to graduation makes no difference in their opinion.
(b) The null hypothesis is that the mean number of students who are strongly opposed is the same for each of the four years. The alternative is that the mean is different for at least two of the four years.
(c) The null hypothesis is that the distribution of student opinion about the proposed tuition increase is the same for each of the four years at this university. The alternative is that the distribution is different for at least two of the four years.
(d) The null hypothesis is that year in school and student opinion about the tuition increase in the sample are independent. The alternative is that these variables are dependent.
(e) The null hypothesis is that there is an association between a year in school and opinion about the tuition increase at this university. The alternative hypothesis is that these variables are not associated.
2 step solution
Q.6
A study of identity theft looked at how well consumers protect themselves from this increasingly prevalent crime. The behaviors of randomly selected college students were compared with the behaviors of randomly selected nonstudents. One of the questions was “When asked to create a password, I have used either my mother’s maiden name, or my pet’s name, or my birth date, or the last four digits of my social security number, or a series of consecutive numbers.” For the students, agreed with this statement while of the nonstudents agreed.
(a) Display the data in a two-way table and perform the appropriate chi-square test. Summarize the results.
(b) Reanalyze the data using the methods for comparing two proportions that we studied in Chapter. Compare the results and verify that the chi-square statistic is the square of the z statistic.
4 step solution
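The relationship in part (b) of Q. 6, that the chi-square statistic for a 2 × 2 table equals the square of the pooled two-sample z statistic, can be checked numerically. The sketch below uses hypothetical counts (30 of 120 and 45 of 150), chosen only for illustration because the exercise's actual counts are not shown here; the function names are also assumptions of this sketch.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Two-sample z statistic using the pooled (combined) sample proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square statistic for the 2x2 table of agree / do not agree by group."""
    table = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand
            stat += (o - e) ** 2 / e
    return stat

# HYPOTHETICAL counts, not the data from the exercise.
x1, n1 = 30, 120
x2, n2 = 45, 150

z = two_proportion_z(x1, n1, x2, n2)
chi2 = chi_square_2x2(x1, n1, x2, n2)

print(f"z = {z:.4f}, z^2 = {z ** 2:.4f}, chi-square = {chi2:.4f}")
```

The agreement holds only when the z statistic uses the pooled proportion in its standard error, which is the form used for testing equality of two proportions.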
Q.R11.5
Who were the popular kids at your elementary school? Did they get good grades or have good looks? Were they good at sports? A study was performed to examine the factors that determine social status for children in grades , and . Researchers administered a questionnaire to a random sample of students in these grades. One of the questions they asked was “What would you most like to do at school: make good grades, be good at sports, or be popular?” The two-way table below summarizes the students’ responses.
(a) Construct an appropriate graph to compare male and female responses. Write a few sentences describing the relationship between gender and goals.
(b) Is there convincing evidence of an association between gender and goals for elementary school students? Carry out a test at the level and report your conclusion.
(c) Which cell contributes most to the chi-square statistic in part (b)? Explain
6 step solution
Q. 3
The conditions for carrying out the chi-square test in exercise T11.2 are
I. Separate random samples from the populations of interest.
II. Expected counts large enough.
III. The samples themselves and the individual observations in each sample are independent.
Which of the conditions is (are) satisfied in this case?
(a) I only
(b) II only
(c) I and II only
(d) II and III only
(e) I, II, and III
2 step solution