Exploring Data

The Practice of Statistics for AP · 193 exercises

Q. 1

Jake is a car buff who wants to find out more about the vehicles that students at his school drive. He gets permission to go to the student parking lot and record some data. Later, he does some research about each model of car on the Internet. Finally, Jake makes a spreadsheet that includes each car’s model, year, color, number of cylinders, gas mileage, weight, and whether it has a navigation system. 

Who are the individuals in Jake’s study? 

2 step solution

Q. 2

Jake is a car buff who wants to find out more about the vehicles that students at his school drive. He gets permission to go to the student parking lot and record some data. Later, he does some research about each model of car on the Internet. Finally, Jake makes a spreadsheet that includes each car’s model, year, color, number of cylinders, gas mileage, weight, and whether it has a navigation system.

What variables did Jake measure? Identify each as categorical or quantitative.

2 step solution

Q. 1

How can we help wood surfaces resist weathering, especially when restoring historic wooden buildings? In a study of this question, researchers prepared wooden panels and then exposed them to the weather. Here are some of the variables recorded: type of wood (yellow poplar, pine, cedar); type of water repellent (solvent-based, water-based); paint thickness (millimeters); paint color (white, gray, light blue); weathering time (months). 

Identify each variable as categorical or quantitative.

2 step solution

Q. 2

Medical study variables Data from a medical study contain values of many variables for each of the people who were the subjects of the study. Here are some of the variables recorded: gender (female or male); age (years); race (Asian, black, white, or other); smoker (yes or no); systolic blood pressure (millimetres of mercury); level of calcium in the blood (micrograms per millilitre).

Identify each as categorical or quantitative.

2 step solution

Q. 3

A class survey Here is a small part of the data set that describes the students in an AP Statistics class. The data come from anonymous responses to a questionnaire filled out on the first day of class.

a. What individuals does this data set describe?

b. Identify the quantitative variables.

c. Describe the individual in the highlighted row.

4 step solution

Q. 4

A class survey Here is a small part of the data set that describes the students in an AP Statistics class. The data come from anonymous responses to a questionnaire filled out on the first day of class.

   a. Description of a data set

   b. Identification of Quantitative variables

   c. Description of an individual in a highlighted row

3 step solution

Q. 5

Ranking colleges Popular magazines rank colleges and universities on their “academic quality” in serving undergraduate students. Describe two categorical variables and two quantitative variables that you might record for each college. Give the units of measurement for the quantitative variables.

2 step solution

Q 6.

You are preparing to study the television-viewing habits of high school students. Describe two categorical variables and two quantitative variables that you might record for each student. Give the units of measurement for the quantitative variables.

2 step solution

Q. 7

The individuals in this data set are

(a) households.

(b) people.

(c) adults.

(d) 120 variables.

(e) columns.

2 step solution

Q. 8


At the Census Bureau Web site, you can view detailed data collected by the American Community Survey. The table below includes data for 10 people chosen at random from the more than one million people in households contacted by the survey. “School” gives the highest level of education completed. 


This data set contains

(a) 7 variables, 2 of which are categorical.

(b) 7 variables, 1 of which is categorical.

(c) 6 variables, 2 of which are categorical.

(d) 6 variables, 1 of which is categorical.

(e) None of these.


3 step solution

Q. 1.1

Use the data in the two-way table on page 12 to calculate the marginal distribution (in percents) of gender. 

2 step solution

Q. 1.2

Make a graph to display the marginal distribution. Describe what you see.

2 step solution

Q 2.1.


Find the conditional distributions of gender among each of the other four opinion categories (we did “Almost no chance” earlier). Use Figure 1.5 or Figure 1.6 to check that your answers are approximately correct.

3 step solution
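The difference between a marginal and a conditional distribution can be sketched in a few lines of Python. The counts below are hypothetical placeholders, not the values from the two-way table in the textbook, and the function names are chosen only for illustration.

```python
# Hypothetical two-way table: opinion (rows) by gender (columns).
# These counts are invented for illustration; they are NOT the textbook's data.
table = {
    "Almost no chance": {"Female": 20, "Male": 30},
    "A 50-50 chance": {"Female": 60, "Male": 40},
    "Almost certain": {"Female": 120, "Male": 130},
}


def marginal_of_gender(table):
    """Percent of all respondents in each gender (column total / grand total)."""
    totals = {}
    for row in table.values():
        for gender, count in row.items():
            totals[gender] = totals.get(gender, 0) + count
    grand_total = sum(totals.values())
    return {gender: 100 * n / grand_total for gender, n in totals.items()}


def conditional_of_gender(table, opinion):
    """Percent of each gender among respondents who gave one opinion."""
    row = table[opinion]
    row_total = sum(row.values())
    return {gender: 100 * n / row_total for gender, n in row.items()}


marginal = marginal_of_gender(table)
conditional = conditional_of_gender(table, "Almost no chance")
print("Marginal distribution of gender:", marginal)
print("Gender among 'Almost no chance':", conditional)
```

The marginal distribution divides by the grand total, while each conditional distribution divides by the total of a single row, which is why the two can give different percents for the same gender.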

Q 2.2.

Make a revised version of Figure 1.4 that includes your results from Question 1.

3 step solution

Q 9.


Cool car colors The most popular colors for cars and light trucks change over time. Silver passed green in 2000 to become the most popular color worldwide, then gave way to shades of white in 2007. Here is the distribution of colors for vehicles sold in North America in 2008:

(a) What percent of vehicles had colors other than those listed?

(b) Display these data in a bar graph. Be sure to label your axes and title your graph.

(c) Would it be appropriate to make a pie chart of these data? Explain. 


5 step solution

Q 10.

Spam Email spam is the curse of the Internet. Here is a compilation of the most common types of spam:

(a) What percent of spam would fall in the “Other” category?

(b) Display these data in a bar graph. Be sure to label your axes and title your graph.

(c) Would it be appropriate to make a pie chart of these data? Explain. 


5 step solution

Q. 10

The solution also deals with the percentages of email spam and the email failing.

2 step solution

Q 11.

Birthdays Births are not evenly distributed across the days of the week. Here are the average numbers of babies born on each day of the week in the United States in a recent year:

(a) Present these data in a well-labeled bar graph. Would it also be correct to make a pie chart?

(b) Suggest some possible reasons why there are fewer births on weekends. 

3 step solution

Q. 11


Birthdays Births are not evenly distributed across the days of the week. Here are the average numbers of babies born on each day of the week in the United States in a recent year.

(a) Present these data in a well-labelled bar graph. Would it also be correct to make a pie chart?

(b) Suggest some possible reasons why there are fewer births on weekends.

2 step solution

Q 12.

Deaths among young people Among persons aged 15 to 24 years in the United States, the leading causes of death and number of deaths in a recent year were as follows: accidents, 15,567; homicide, 5359; suicide, 4139; cancer, 1717; heart disease, 1067; congenital defects, 483.

(a) Make a bar graph to display these data.

(b) To make a pie chart, you need one additional piece of information. What is it?

4 step solution
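As a rough sketch of part (a), the counts given in the exercise can be drawn as a text bar graph in Python. The helper name `text_bar_graph` and the scale of one `#` per 500 deaths are arbitrary choices made for this illustration.

```python
# Counts copied from the exercise: leading causes of death, ages 15 to 24.
deaths = {
    "Accidents": 15567,
    "Homicide": 5359,
    "Suicide": 4139,
    "Cancer": 1717,
    "Heart disease": 1067,
    "Congenital defects": 483,
}


def text_bar_graph(counts, scale=500):
    """Return one line per category, drawing one '#' per `scale` deaths (rounded)."""
    width = max(len(name) for name in counts)
    lines = []
    for name, count in counts.items():
        bar = "#" * round(count / scale)
        lines.append(f"{name:<{width}} | {bar} {count}")
    return lines


for line in text_bar_graph(deaths):
    print(line)

# Sum of the listed causes only; deaths from all other causes are not given.
listed_total = sum(deaths.values())
print("Deaths from the listed causes:", listed_total)
```

The six listed causes add up to 28,332 deaths, but that is not the total for the age group. A pie chart needs the number of deaths from all causes so that an "other causes" slice can complete the whole.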

Q. 12

Deaths among young people Among persons aged 15 to 24 years in the United States, the leading causes of death and number of deaths in a recent year were as follows:

(a) Make a bar graph to display these data.

(b) To make a pie chart, you need one additional piece of information. What is it?

 

3 step solution

Q 13.

Hispanic origins Below is a pie chart prepared by the Census Bureau to show the origin of the more than 43 million Hispanics in the United States in 2006. About what percent of Hispanics are Mexican? Puerto Rican?



Comment: You see that it is hard to determine numbers from a pie chart. Bar graphs are much easier to use. (The Census Bureau did include the percent in its pie chart.)

3 step solution

Q 14.

About 1.6 million first-year students enroll in colleges and universities each year. What do they plan to study? The pie chart displays data on the percent of first-year students who plan to major in several discipline areas. About what percent of first-year students plan to major in business? In social science?


3 step solution

Q 15.

Buying music online Young people are more likely than older folk to buy music online. Here are the percents of people in several age groups who bought music online in 2006:

(a) Explain why it is not correct to use a pie chart to display these data.

(b) Make a bar graph of the data. Be sure to label your axes and title your graph. 

4 step solution

Q 16.

The audience for movies Here are data on the percent of people in several age groups who attended a movie in the past 12 months: 

(a) Display these data in a bar graph. Describe what you see.

(b) Would it be correct to make a pie chart of these data? Why or why not?

(c) A movie studio wants to know what percent of the total audience for movies is 18 to 24 years old. Explain why these data do not answer this question. 

Age group            Movie attendance
18 to 24 years       83%
25 to 34 years       73%
35 to 44 years       68%
45 to 54 years       60%
55 to 64 years       47%
65 to 74 years       32%
75 years and over    20%

5 step solution
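A quick check, sketched in Python, shows why a pie chart does not suit these data. The percents are copied from the table in the exercise; the variable names are illustrative only.

```python
# Percent of each age group that attended a movie in the past 12 months.
attendance = {
    "18 to 24": 83,
    "25 to 34": 73,
    "35 to 44": 68,
    "45 to 54": 60,
    "55 to 64": 47,
    "65 to 74": 32,
    "75 and over": 20,
}

# A pie chart only makes sense when the categories are parts of one whole,
# so the percents would have to add up to 100.
total = sum(attendance.values())
parts_of_one_whole = total == 100

print("Sum of percents:", total)
print("Suitable for a pie chart:", parts_of_one_whole)
```

Each percent is measured against a different group (all people of that age), so the values do not describe shares of a single audience. The same reasoning applies to part (c): the table gives the share of each age group that attended, not each age group's share of the audience.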

Q 17.

Going to school Students in a high school statistics class were given data about the primary method of transportation to school for a group of 30 students. They produced the pictograph shown.

(a) How is this graph misleading?

(b) Make a new graph that isn’t misleading.

4 step solution

Q 18.

Oatmeal and cholesterol Does eating oatmeal reduce cholesterol? An advertisement included the following graph as evidence that the answer is “Yes.”

(a) How is this graph misleading?

(b) Make a new graph that isn’t misleading. What do you conclude about the effect of eating oatmeal on cholesterol reduction?

4 step solution

Q 19.

Attitudes toward recycled products Recycling is supposed to save resources. Some people think recycled products are lower in quality than other products, a fact that makes recycling less practical. People who actually use a recycled product may have different opinions from those who don’t use it. Here are data on attitudes toward coffee filters made of recycled paper among people who do and don’t buy these filters:

(a) How many people does this table describe? How many of these were buyers of coffee filters made of recycled paper?

(b) Give the marginal distribution of opinion about the quality of recycled filters. What percent of consumers think the quality of the recycled product is the same or higher than the quality of other filters?  

5 step solution

Q 20.

Smoking by students and parents Here are data from a survey conducted at eight high schools on smoking among students and their parents:


(a) How many students are described in the two-way table? What percent of these students smoke?

(b) Give the marginal distribution of parents’ smoking behaviour, both in counts and in percentages.

5 step solution

Q 21.

Attitudes toward recycled products Exercise 19 gives data on the opinions of people who have and have not bought coffee filters made from recycled paper.

To see the relationship between opinion and experience with the product, find the conditional distributions of opinion (the response variable) for buyers and nonbuyers. What do you conclude?

3 step solution

Q 22.

Smoking by students and parents Refer to Exercise 20. Calculate three conditional distributions of students’ smoking behavior: one for each of the three parental smoking categories. Describe the relationship between the smoking behaviors of students and their parents in a few sentences.

3 step solution

Q 23.

Popular colors—here and there Favorite vehicle colors may differ among countries. The side-by-side bar graph shows data on the most popular colors of cars in 2008 for the United States and Europe. Write a few sentences comparing the two distributions. 

3 step solution

Q. 23


Popular colours—here and there Favorite vehicle colours may differ among countries. The side-by-side bar graph shows data on the most popular colours of cars in 2008 for the United States and Europe. Write a few sentences comparing the two distributions.



2 step solution

Q 24.

Comparing car colours Favorite vehicle colours may differ among types of vehicles. Here are data on the most popular colours in 2008 for luxury cars and for SUVs, trucks, and vans.


(a) Make a graph to compare colours by vehicle type.

(b) Write a few sentences describing what you see.

4 step solution

Q 25.

Snowmobiles in the park Yellowstone National Park surveyed a random sample of 1526 winter visitors to the park. They asked each person whether they owned, rented, or had never used a snowmobile. Respondents were also asked whether they belonged to an environmental organization (like the Sierra Club). The two-way table summarizes the survey responses.



Do these data provide convincing evidence of an association between environmental club membership and snowmobile use for the population of visitors to Yellowstone National Park? Follow the four-step process.

3 step solution

Q. 26

Angry people and heart disease People who get angry easily tend to have more heart disease. That’s the conclusion of a study that followed a random sample of 12,986 people from three locations for about four years. All subjects were free of heart disease at the beginning of the study. The subjects took the Spielberger Trait Anger Scale test, which measures how prone a person is to sudden anger. Here are data for the 8474 people in the sample who had normal blood pressure. CHD stands for “coronary heart disease.” This includes people who had heart attacks and those who needed medical treatment for heart disease.

Do these data support the study’s conclusion about the relationship between anger and heart disease? Follow the four-step process.

3 step solution

Q 27.

The National Survey of Adolescent Health interviewed several thousand teens (grades 7 to 12). One question asked was “What do you think are the chances you will be married in the next ten years?” Here is a two-way table of the responses by gender:


The percentage of females among the respondents was

(a) 2625

(b) 4877

(c) about 46%.

(d) about 54%.

(e) None of these.

3 step solution

Q 28.

The National Survey of Adolescent Health interviewed several thousand teens (grades 7 to 12). One question asked was “What do you think are the chances you will be married in the next ten years?” Here is a two-way table of the responses by gender:

Your percent from the previous exercise is part of

(a) the marginal distribution of females.

(b) the marginal distribution of gender.

(c) the marginal distribution of opinion about marriage.

(d) the conditional distribution of gender among adolescents with a given opinion.

(e) the conditional distribution of opinion among adolescents of a given gender.

3 step solution

Q 29.

The National Survey of Adolescent Health interviewed several thousand teens (grades 7 to 12). One question asked was “What do you think are the chances you will be married in the next ten years?” Here is a two-way table of the responses by gender: 

What percent of females thought that they were almost certain to be married in the next ten years?

(a) About 16%

(b) About 24%

(c) About 40%

(d) About 45%

(e) About 61%

3 step solution

Q 30.

The National Survey of Adolescent Health interviewed several thousand teens (grades 7 to 12). One question asked was “What do you think are the chances you will be married in the next ten years?” Here is a two-way table of the responses by gender:

Your percent from the previous exercise is part of

(a) the marginal distribution of gender.

(b) the marginal distribution of opinion about marriage.

(c) the conditional distribution of gender among adolescents with a given opinion.

(d) the conditional distribution of opinion among adolescents of a given gender.

(e) the conditional distribution of “Almost certain” among females. 


3 step solution

Q 31.

The National Survey of Adolescent Health interviewed several thousand teens (grades 7 to 12). One question asked was “What do you think are the chances you will be married in the next ten years?” Here is a two-way table of the responses by gender:

What percent of those who thought they were almost certain to be married were female?

(a) About 16%

(b) About 24%

(c) About 40%

(d) About 45%

(e) About 61%


3 step solution

Q 32.

The National Survey of Adolescent Health interviewed several thousand teens (grades 7 to 12). One question asked was “What do you think are the chances you will be married in the next ten years?” Here is a two-way table of the responses by gender:

Your percent from the previous exercise is part of

(a) the marginal distribution of gender.

(b) the marginal distribution of opinion about marriage.

(c) the conditional distribution of gender among adolescents with a given opinion.

(d) the conditional distribution of opinion among adolescents of a given gender.

(e) the conditional distribution of females among those who said “Almost certain.”

3 step solution

Q 33.

Marginal distributions aren’t the whole story Here are the row and column totals for a two-way table with two rows and two columns:

Find two different sets of counts a, b, c, and d for the body of the table that gives these same totals. This shows that the relationship between two variables cannot be obtained from the two individual distributions of the variables.

3 step solution
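The idea behind this exercise can be sketched by listing every table body that matches a given set of totals. The totals used below (rows of 50 and 50, columns of 60 and 40) are assumed for illustration, because the table itself is not reproduced here, and the function name is hypothetical.

```python
def tables_with_margins(row_totals, col_totals):
    """List every 2x2 body (a, b, c, d) of non-negative counts with these totals.

    Layout:  a  b | row_totals[0]
             c  d | row_totals[1]
    """
    (r1, r2), (c1, c2) = row_totals, col_totals
    if r1 + r2 != c1 + c2:
        return []  # inconsistent totals: no table exists
    tables = []
    for a in range(min(r1, c1) + 1):
        b = r1 - a      # fills out the first row
        c = c1 - a      # fills out the first column
        d = r2 - c      # fills out the second row (and second column)
        if d >= 0:
            tables.append((a, b, c, d))
    return tables


# Assumed totals, for illustration only.
candidates = tables_with_margins((50, 50), (60, 40))
print("Number of tables with these totals:", len(candidates))
print("Two examples:", candidates[0], candidates[-1])
```

Once one cell is chosen, the other three are forced by the totals, yet many choices of that one cell are valid. The same marginal distributions are therefore compatible with very different relationships between the two variables.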

Q 34.

Baseball paradox Most baseball hitters perform differently against right-handed and left-handed pitching. Consider two players, Joe and Moe, both of whom bat right-handed. The table below records their performance against right-handed and left-handed pitchers:

(a) Use these data to make a two-way table of player (Joe or Moe) versus outcome (hit or no hit).

(b) Show that Simpson’s paradox holds: one player has a higher overall batting average, but the other player hits better against both left-handed and right-handed pitching.

(c) The manager doesn’t believe that one player can hit better against both left-handers and right-handers yet have a lower overall batting average. Explain in simple language why this happens to Joe and Moe.

5 step solution
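Simpson's paradox can be reproduced with a small Python sketch. The hit and at-bat counts below are invented for two hypothetical players and are not Joe's and Moe's records, which are not reproduced here.

```python
# Hypothetical (hits, at-bats) records, invented for illustration only.
records = {
    "Player A": {"vs right-handed": (40, 100), "vs left-handed": (80, 400)},
    "Player B": {"vs right-handed": (120, 400), "vs left-handed": (10, 100)},
}


def batting_average(hits, at_bats):
    """Proportion of at-bats that ended in a hit."""
    return hits / at_bats


def overall_average(splits):
    """Pool hits and at-bats over both kinds of pitching before dividing."""
    hits = sum(h for h, _ in splits.values())
    at_bats = sum(n for _, n in splits.values())
    return hits / at_bats


for player, splits in records.items():
    for pitching, (hits, at_bats) in splits.items():
        print(f"{player} {pitching}: {batting_average(hits, at_bats):.3f}")
    print(f"{player} overall: {overall_average(splits):.3f}")
```

In this made-up example Player A has the higher average against both kinds of pitching, but most of Player A's at-bats come against left-handers, where both players hit worse. Pooling the at-bats therefore gives Player A the lower overall average.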

Q 35.

Race and the death penalty Whether a convicted murderer gets the death penalty seems to be influenced by the race of the victim. Here are data on 326 cases in which the defendant was convicted of murder:

(a) Use these data to make a two-way table of the defendant’s race (white or black) versus the death penalty (yes or no).

(b) Show that Simpson’s paradox holds: a higher percent of white defendants are sentenced to death overall, but for both black and white victims a higher percent of black defendants are sentenced to death.

(c) Use the data to explain why the paradox holds in language that a judge could understand.

5 step solution

Q 36.

Fuel economy (Introduction) Here is a small part of a data set that describes the fuel economy (in miles per gallon) of model year 2009 motor vehicles:

(a) What are the individuals in this data set?

(b) What variables were measured? Identify each as categorical or quantitative.

4 step solution

Q 1.1.

The Fathom dot plot displays data on the number of siblings reported by each student in a statistics class. 

Describe the shape of the distribution.

3 step solution

Q 1.2.

The Fathom dot plot displays data on the number of siblings reported by each student in a statistics class. 

Describe the center of the distribution. 

3 step solution

Q 1.4.

The Fathom dot plot displays data on the number of siblings reported by each student in a statistics class. 

Identify any potential outliers.

3 step solution

Q 1.3.

The Fathom dot plot displays data on the number of siblings reported by each student in a statistics class.

Describe the spread of the distribution.

3 step solution
