Descriptive Methods in Regression and Correlation
Elementary Statistics · 256 exercises
Q.4.5
a. Find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain
b. Graph the regression equation and the data points.
4 step solution
Q.4.51
a. Find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain
b. Graph the regression equation and the data points.
4 step solution
Q.4.52
a. Find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain
b. Graph the regression equation and the data points.
4 step solution
Q.4.53
a. Find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain
b. Graph the regression equation and the data points.
4 step solution
Q. 4.54
4.54 The data points in Exercise
a. Find the regression equation for the data points. Use the defining formulas in Definition to obtain and .
b. Graph the regression equation and the data points.
5 step solution
Q. 4.55
4.55 The data points in Exercise 4.43
a. Find the regression equation for the data points. Use the defining formulas in Definition to obtain and .
b. Graph the regression equation and the data points.
5 step solution
Q. 4.56
4.56 The data points in Exercise 4.44
a. find the regression equation for the data points. Use the defining formulas in Definition to obtain and .
b. Graph the regression equation and the data points.
5 step solution
Q. 4.57
4.57 The data points in Exercise 4.45
a. find the regression equation for the data points. Use the defining formulas in Definition to obtain and .
b. graph the regression equation and the data points.
5 step solution
Q. 4.54
Data points:
| x | 1 | 3 | 4 | 4 |
| y | 8 | 0 | 3 | 1 |
- a. Find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain and .
- b. graph the regression equation and the data points.
5 step solution
Q. 4.55
Data points:
| X | 1 | 2 | 3 |
| Y | 4 | 3 | 8 |
- a. Find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain and .
- b. graph the regression equation and the data points.
5 step solution
Q. 4.56
Data points:
| X | 1 | 1 | 5 | 5 |
| Y | 1 | 3 | 2 | 4 |
- a. Find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain and .
- b. graph the regression equation and the data points.
5 step solution
Q. 4.57
Data points:
| X | 0 | 2 | 2 | 5 | 6 |
| Y | 4 | 2 | 0 | -2 | 1 |
- a. Find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain and .
- b. graph the regression equation and the data points.
5 step solution
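As a minimal sketch of part (a) for the data points listed under Q. 4.57, the Python below assumes that the defining formulas of Definition 4.4 are the usual sums S_xx = Σ(x_i − x̄)² and S_xy = Σ(x_i − x̄)(y_i − ȳ). The definition itself is not reproduced in this listing, so that reading is an assumption, and the helper name `regression_from_defining_formulas` is hypothetical.

```python
def regression_from_defining_formulas(xs, ys):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1 * x.

    Assumes the defining formulas are S_xx = sum((x - x_bar)^2) and
    S_xy = sum((x - x_bar) * (y - y_bar)).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx           # slope
    b0 = y_bar - b1 * x_bar    # y-intercept
    return b0, b1


# Data points as listed under Q. 4.57.
xs = [0, 2, 2, 5, 6]
ys = [4, 2, 0, -2, 1]

b0, b1 = regression_from_defining_formulas(xs, ys)
print(f"y-hat = {b0:.3f} + ({b1:.3f}) x")
```

For part (b), the fitted line can be drawn through any two points computed from `b0` and `b1`, alongside the five data points.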
Q. 4.6
Custom Homes. Hanna Properties specializes in custom home resales in the Equestrian Estates, an exclusive subdivision in Phoenix, Arizona. A random sample of nine custom homes currently listed for sale provided the following information on size and price. Here, x denotes the size, in hundreds of square feet, rounded to the nearest hundred, and y denotes price, in thousands of dollars, rounded to the nearest thousand. For part (g), predict the price of a 2600-sq.-ft. home in the Equestrian Estates.
- a. Find the regression equation for the data points.
- b. graph the regression equation and the data points.
- c. describe the apparent relationship between the two variables under consideration.
- d. interpret the slope of the regression line.
- e. identify the predictor and response variables.
- f. identify outliers and potential influential observations.
- g. predict the values of the response variable for the specified values of the predictor variable, and interpret your results.
10 step solution
Q. 4.61
Plant Emissions. Plants emit gases that trigger the ripening of fruit, attract pollinators, and cue other physiological responses. N. Agelopoulos et al. examined factors that affect the emission of volatile compounds by the potato plant Solanum tuberosum and published their findings in the paper "Factors Affecting Volatile Emissions of Intact Potato Plants, Solanum tuberosum: Variability of Quantities and Stability of Ratios" (Journal of Chemical Ecology, Vol. 26, No. 2, pp. 497-511). The volatile compounds analyzed were hydrocarbons used by other plants and animals. Following are data on plant weight (x), in grams, and quantity of volatile compounds emitted (y), in hundreds of nanograms, for 11 potato plants. For part (g), predict the quantity of volatile compounds emitted by a potato plant that weighs 75 grams.
| X | 57 | 85 | 57 | 65 | 52 | 67 | 62 | 80 | 77 | 53 | 68 |
| Y | 8.0 | 22.0 | 10.5 | 22.5 | 12.0 | 11.5 | 7.5 | 13.0 | 16.5 | 21.0 | 12.0 |
- a. Find the regression equation for the data points.
- b. graph the regression equation and the data points.
- c. describe the apparent relationship between the two variables under consideration.
- d. interpret the slope of the regression line.
- e. identify the predictor and response variables.
- f. identify outliers and potential influential observations.
- g. predict the values of the response variable for the specified values of the predictor variable, and interpret your results.
9 step solution
Q. 4.62
Crown-Rump Length. In the article "The Human Vomeronasal Organ. Part II: Prenatal Development" (Journal of Anatomy, Vol. 197, Issue 3, pp. 421-436), T. Smith and Bhatnagar examined the controversial issue of the human vomeronasal organ, regarding its structure, function, and identity. The following table shows the age of fetuses (x), in weeks, and the length of crown-rump (y), in millimeters. For part (g), predict the crown-rump length of a 19-week-old fetus.
| X | 10 | 10 | 13 | 13 | 18 | 19 | 19 | 23 | 25 | 28 |
| Y | 66 | 66 | 108 | 106 | 161 | 166 | 177 | 228 | 235 | 280 |
- a. Find the regression equation for the data points.
- b. graph the regression equation and the data points.
- c. describe the apparent relationship between the two variables under consideration.
- d. interpret the slope of the regression line.
- e. identify the predictor and response variables.
- f. identify outliers and potential influential observations.
- g. predict the values of the response variable for the specified values of the predictor variable, and interpret your results.
9 step solution
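Parts (a) and (g) of Q. 4.62 can be sketched in a few lines of Python. This is an illustration under the assumption that an ordinary least-squares line is what the exercise intends; the function names `fit_line` and `predict` are hypothetical, and the data are copied from the table above.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y-hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def predict(b0, b1, x_new):
    """Predicted response at predictor value x_new."""
    return b0 + b1 * x_new


# Fetal age in weeks (x) and crown-rump length in millimeters (y), from Q. 4.62.
age_weeks = [10, 10, 13, 13, 18, 19, 19, 23, 25, 28]
length_mm = [66, 66, 108, 106, 161, 166, 177, 228, 235, 280]

b0, b1 = fit_line(age_weeks, length_mm)
predicted_19 = predict(b0, b1, 19)
print(f"slope: {b1:.2f} mm per week, intercept: {b0:.2f} mm")
print(f"predicted crown-rump length at 19 weeks: {predicted_19:.1f} mm")
```

The slope printed here is the quantity part (d) asks to interpret: the estimated change in crown-rump length, in millimeters, per additional week of age.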
Q. 4.63
Study Time and Score. An instructor at Arizona State University asked a random sample of eight students to record their study times in a beginning calculus course. She then made a table for total hours studied (x) over 2 weeks and test score (y) at the end of the 2 weeks. Here are the results. For part (g), predict the score of a student who studies for 15 hours.
| X | 10 | 15 | 12 | 20 | 8 | 16 | 14 | 22 |
| Y | 92 | 81 | 84 | 74 | 85 | 80 | 84 | 80 |
- a. Find the regression equation for the data points.
- b. graph the regression equation and the data points.
- c. describe the apparent relationship between the two variables under consideration.
- d. interpret the slope of the regression line.
- e. identify the predictor and response variables.
- f. identify outliers and potential influential observations.
- g. predict the values of the response variable for the specified values of the predictor variable, and interpret your results.
9 step solution
Q. 4.59
Corvette Prices. The Kelley Blue Book provides information on wholesale and retail prices of cars. Following are age and price data for 10 randomly selected Corvettes between 1 and 6 years old.
Here, x denotes age, in years, and y denotes price, in hundreds of dollars. For part (g), predict the prices of a 2-year-old Corvette and a 3-year-old Corvette.
| X | 6 | 6 | 6 | 2 | 2 | 5 | 4 | 5 | 1 | 4 |
| Y | 290 | 280 | 295 | 425 | 384 | 315 | 355 | 328 | 425 | 325 |
- a. Find the regression equation for the data points.
- b. graph the regression equation and the data points.
- c. describe the apparent relationship between the two variables under consideration.
- d. interpret the slope of the regression line.
- e. identify the predictor and response variables.
- f. identify outliers and potential influential observations.
- g. predict the values of the response variable for the specified values of the predictor variable, and interpret your results.
10 step solution
Q. 4.58
Tax Efficiency. Tax efficiency is a measure, ranging from 0 to 100, of how much tax due to capital gains stock or mutual funds investors pay on their investments each year; the higher the tax efficiency, the lower is the tax. In the article "At the Mercy of the Manager" (Financial Planning, Vol. 30(5), pp. 54-56), C. Israelsen examined the relationship between investments in mutual fund portfolios and their associated tax efficiencies. The following table shows percentage of investments in energy securities and tax efficiency for 10 mutual fund portfolios. For part (g), predict the tax efficiency of a mutual fund portfolio with of its investments in energy securities and one with of its investments in energy securities.
- find the regression equation for the data points.
- graph the regression equation and the data points.
- describe the apparent relationship between the two variables under consideration.
- interpret the slope of the regression line.
- identify the predictor and response variables.
- identify outliers and potential influential observations.
- predict the values of the response variable for the specified values of the predictor variable, and interpret your results.
15 step solution
Q. 4.59
Corvette Prices. The Kelley Blue Book provides information on wholesale and retail prices of cars. Following are age and price data for 10 randomly selected Corvettes between 1 and 6 years old.
Here, x denotes age, in years, and y denotes price, in hundreds of dollars. For part (g), predict the prices of a 2-year-old Corvette and a 3-year-old Corvette.
- find the regression equation for the data points.
- graph the regression equation and the data points.
- describe the apparent relationship between the two variables under consideration.
- interpret the slope of the regression line.
- identify the predictor and response variables.
- identify outliers and potential influential observations.
- predict the values of the response variable for the specified values of the predictor variable, and interpret your results.
15 step solution
Q. 4.60
Custom Homes. Hanna Properties specializes in custom home resales in the Equestrian Estates, an exclusive subdivision in Phoenix, Arizona. A random sample of nine custom homes currently listed for sale provided the following information on size and price. Here, x denotes size, in hundreds of square feet, rounded to the nearest hundred, and y denotes price, in thousands of dollars, rounded to the nearest thousand. For part (g), predict the price of a 2600-sq.-ft. home in the Equestrian Estates.
- find the regression equation for the data points.
- graph the regression equation and the data points.
- describe the apparent relationship between the two variables under consideration.
- interpret the slope of the regression line.
- identify the predictor and response variables.
- identify outliers and potential influential observations.
- predict the values of the response variable for the specified values of the predictor variable, and interpret your results.
15 step solution
Q.4.70
Birdies and Score. How important are birdies (a score of one under par on a given golf hole) in determining the final total score of a woman golfer? From the U.S. Women's Open website, we obtained data on number of birdies during a tournament and final score for 63 women golfers. The data are presented on the WeissStats site.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
c. Determine and interpret the regression equation for the data.
d. Identify potential outliers and influential observations.
e. In case a potential outlier is present, remove it and discuss the effect.
f. In case a potential influential observation is present, remove it and discuss the effect.
12 step solution
Q.4.71
U.S. Presidents. The Information Please Almanac provides data on the ages at inauguration and of death for the presidents of the United States. We give those data on the WeissStats site for those presidents who are not still living at the time of this writing.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
c. Determine and interpret the regression equation for the data.
d. Identify potential outliers and influential observations.
e. In case a potential outlier is present, remove it and discuss the effect.
f. In case a potential influential observation is present, remove it and discuss the effect.
12 step solution
Q.4.72
Home Size and Value. On the WeissStats site are data on home size and assessed value for the same homes as in Exercise 4.73.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
c. Determine and interpret the regression equation for the data.
d. Identify potential outliers and influential observations.
e. In case a potential outlier is present, remove it and discuss the effect.
f. In case a potential influential observation is present, remove it and discuss the effect.
12 step solution
Q.4.75
High and Low Temperature. The National Oceanic and Atmospheric Administration publishes temperature information of cities around the world in Climates of the World. A random sample of 50 cities gave the data on average high and low temperatures in January shown on the WeissStats site.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
c. Determine and interpret the regression equation for the data.
d. Identify potential outliers and influential observations.
e. In case a potential outlier is present, remove it and discuss the effect.
f. In case a potential influential observation is present, remove it and discuss the effect.
12 step solution
Q.4.66
Anscombe's Quartet. In the article "Graphs in Statistical Analysis" (American Statistician, Vol. 27, Issue 1, pp. 17-21), F. Anscombe presented four sets of data points with almost identical basic statistical properties (means, standard deviations, regression lines, etc.) but quite different scatterplots. We have provided Anscombe's four sets of data points on the WeissStats site. Use the technology of your choice to solve the following problems.
a. Obtain the mean and standard deviation of each set of x-values. Compare your results.
b. Obtain the mean and standard deviation of each set of y-values. Compare your results.
c. Find the regression equation for each set of data points. Compare your results.
d. Draw a scatterplot with superimposed regression line for each set of data points.
e. Discuss your results in part (d) with respect to the importance of plotting data before analyzing it and to the effect of outliers.
10 step solution
Q. 4.64
Tax Efficiency. In Exercise 4.58, you determined a regression equation that relates the variables percentage of investments in energy securities and tax efficiency for mutual fund portfolios.
a. Should that regression equation be used to predict the tax efficiency of a mutual fund portfolio with of its investments in energy securities? with of its investments in energy securities? Explain your answers.
b. For which percentages of investments in energy securities is use of the regression equation to predict tax efficiency reasonable?
4 step solution
Q. 4.65
Corvette Prices. In Exercise 4.59, you determined a regression equation that can be used to predict the price of a Corvette, given its age.
a. Should that regression equation be used to predict the price of a 4-year-old Corvette? a 10-year-old Corvette? Explain your answers.
b. For which ages is the use of the regression equation to predict price reasonable?
4 step solution
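One way to frame Q. 4.65 is as a check on whether a new predictor value lies inside the range of ages actually observed. The sketch below uses the Corvette ages listed under Q. 4.59; the helper `within_observed_range` is a hypothetical name, and the check is only a rough screen for extrapolation, not a full answer to the exercise.

```python
def within_observed_range(x_new, xs):
    """True if x_new lies between the smallest and largest observed x-values."""
    return min(xs) <= x_new <= max(xs)


# Ages (in years) of the 10 sampled Corvettes, from the table under Q. 4.59.
ages = [6, 6, 6, 2, 2, 5, 4, 5, 1, 4]

for age in (4, 10):
    if within_observed_range(age, ages):
        print(f"age {age}: inside the observed range {min(ages)}-{max(ages)} years")
    else:
        print(f"age {age}: outside the observed range, so predicting here is extrapolation")
```

A value inside the observed range can still be poorly predicted if the relationship is not linear, so the scatterplot remains the first thing to inspect.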
Q. 4.67
Study Time and Score. The negative relation between study time and test score found in Exercise 4.63 has been discovered by many investigators. Provide a possible explanation for it.
2 step solution
Q. 4.68
Age and Price of Orions. In Table , we provided data on age and price for a sample of Orions between and years old. On the Weiss Stats site, we have given the ages and prices for a sample of Orions between and years old.
a. Obtain a scatterplot for the data.
b. Is it reasonable to find a regression line for the data? Explain your answer.
4 step solution
Q. 4.69
Wasp Mating Systems. In the paper "Mating System and Sex Allocation in the Gregarious Parasitoid Cotesia glomerata" (Animal Behaviour, Vol. , pp. ), H. Gu and S. Dorn reported on various aspects of the mating system and sex allocation strategy of the wasp C. glomerata. One part of the study involved the investigation of the percentage of male wasps dispersing before mating in relation to the brood sex ratio (proportion of males). The data obtained by the researchers are on the WeissStats site.
a. Obtain a scatterplot for the data.
b. Is it reasonable to find a regression line for the data? Explain your answer.
4 step solution
Q. 4.72
Movie Grosses. Box Office Mojo collects and posts data on movie grosses. For a random sample of 50 movies, we obtained both the domestic (U.S.) and overseas grosses, in millions of dollars. The data are presented on the WeissStats site.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
4 step solution
Q. 4.73
Acreage and Value. The document Arizona Residential Property Valuation System, published by the Arizona Department of Revenue, describes how county assessors use computerized systems to value single-family residential properties for property tax purposes. On the WeissStats site are data on lot size (in acres) and assessed value (in thousands of dollars) for a sample of homes in a particular area.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
4 step solution
Q. 4.76
PCBs and Pelicans. Polychlorinated biphenyls (PCBs), industrial pollutants, are known to be carcinogens and a great danger to natural ecosystems. As a result of several studies, PCB production was banned in the United States in and by the Stockholm Convention on Persistent Organic Pollutants in . One study, published in by R. Risebrough, is titled "Effects of Environmental Pollutants Upon Animals Other Than Man" (Proceedings of the 6th Berkeley Symposium on Mathematics and Statistics, VI, University of California Press, pp. ). In that study, Anacapa pelican eggs were collected and measured for their shell thickness, in millimeters (mm), and concentration of PCBs, in parts per million (ppm). The data are on the WeissStats site.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
4 step solution
Q. 4.77
More Money, More Beer? Does a higher state per capita income equate to a higher per capita beer consumption? From the document Survey of Current Business, published by the U.S. Bureau of Economic Analysis, and from the Brewer's Almanac, published by the Beer Institute, we obtained data on personal income per capita, in thousands of dollars, and per capita beer consumption, in gallons, for the states and Washington, D.C. Those data are provided on the WeissStats site.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
4 step solution
Q. 4.70
How important are birdies (a score of one under par on a given golf hole) in determining the final total score of a woman golfer? From the U.S. Women's Open website, we obtained data on number of birdies during a tournament and final score for \(63\) women golfers. The data are presented on the WeissStats site.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f)
c. Determine and interpret the regression equation for the data.
d. Identify potential outliers and influential observations.
e. In case a potential outlier is present, remove it and discuss the effect.
f. In case a potential influential observation is present, remove it and discuss the effect.
8 step solution
Q. 4.71
The Information Please Almanac provides data on the ages at inauguration and of death for the presidents of the United States. We give those data on the WeissStats site for those presidents who are not still living at the time of this writing.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f)
c. Determine and interpret the regression equation for the data.
d. Identify potential outliers and influential observations.
e. In case a potential outlier is present, remove it and discuss the effect.
f. In case a potential influential observation is present, remove it and discuss the effect.
8 step solution
Q. 4.74
On the WeissStats site are data on home size (in square feet) and assessed value (in thousands of dollars) for the same homes as in Exercise \(4.73\).
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f)
c. Determine and interpret the regression equation for the data.
d. Identify potential outliers and influential observations.
e. In case a potential outlier is present, remove it and discuss the effect.
f. In case a potential influential observation is present, remove it and discuss the effect.
8 step solution
Q. 4.75
The National Oceanic and Atmospheric Administration publishes temperature information of cities around the world in Climates of the World. A random sample of \(50\) cities gave the data on average high and low temperatures in January shown on the WeissStats site.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f)
c. Determine and interpret the regression equation for the data.
d. Identify potential outliers and influential observations.
e. In case a potential outlier is present, remove it and discuss the effect.
f. In case a potential influential observation is present, remove it and discuss the effect.
8 step solution
Q. 4.79
Top Wealth Managers. An issue of BARRON'S presented information on top wealth managers in the United States, based on individual clients with accounts of million or more. Data were given for various variables, two of which were number of private client managers and private client assets. Those data are provided on the WeissStats site, where private client assets are in billions of dollars.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
4 step solution
Q. 4.8
Shortleaf Pines. The ability to estimate the volume of a tree based on a simple measurement, such as the tree's diameter, is important to the lumber industry, ecologists, and conservationists. Data on volume, in cubic feet, and diameter at breast height, in inches, for shortleaf pines were reported in C. Bruce and F. X. Schumacher's Forest Mensuration (New York: McGraw-Hill, ) and analyzed by A. C. Atkinson in the article "Transforming Both Sides of a Tree" (The American Statistician, Vol. 48, pp. ). The data are presented on the WeissStats site.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
4 step solution
Q. 4.81
Sample Covariance. For a set of n data points, the sample covariance, \(s_{xy}\), is given by
\(s_{xy} = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}\). (4.1)
The sample covariance can be used as an alternative method for finding the slope and y-intercept of a regression line. The formulas are
\(b_1 = s_{xy} / s_x^2\) and \(b_0 = \bar{y} - b_1 \bar{x}\), (4.2)
where \(s_x\) denotes the sample standard deviation of the x-values.
a. Use Equation (4.1) to determine the sample covariance of the data points in Exercise 4.45.
b. Use Equation (4.2) and your answer from part (a) to find the regression equation. Compare your result to that found in Exercise 4.57.
4 step solution
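The covariance route in Q. 4.81 can be sketched as follows. The code assumes the standard definitions, namely that the sample covariance divides the sum of cross-deviations by n − 1, that the slope is the covariance divided by the sample variance of the x-values, and that the intercept is ȳ − b1·x̄. It uses the data points listed under Q. 4.57, which that exercise identifies as the data points in Exercise 4.45. The function name `sample_covariance` is hypothetical.

```python
from statistics import mean, variance


def sample_covariance(xs, ys):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / (len(xs) - 1)


# Data points listed under Q. 4.57 (the data of Exercise 4.45).
xs = [0, 2, 2, 5, 6]
ys = [4, 2, 0, -2, 1]

s_xy = sample_covariance(xs, ys)
b1 = s_xy / variance(xs)          # variance() is the sample variance s_x^2
b0 = mean(ys) - b1 * mean(xs)

print(f"s_xy = {s_xy}")
print(f"y-hat = {b0} + ({b1}) x")
```

Because the n − 1 factors cancel in the ratio, the slope agrees with the one obtained from S_xy / S_xx, which is the comparison part (b) asks for.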
Q. 4.82
Time Series. A collection of observations of a variable y taken at regular intervals over time is called a time series. Economic data and electrical signals are examples of time series. We can think of a time series as providing data points \((t_i, y_i)\), where \(t_i\) is the ith observation time and \(y_i\) is the observed value of y at time \(t_i\). If a time series exhibits a linear trend, we can find that trend by determining the regression equation for the data points. We can then use the regression equation for forecasting purposes.
As an illustration, consider the data on the WeissStats site that shows the U.S. population, in millions of persons, for the years 1900-2013, as provided by the U.S. Census Bureau.
a. Use the technology of your choice to obtain a scatterplot of the data.
b. Use the technology of your choice to find the regression equation.
c. Use your result from part (b) to forecast the U.S. population for the years 2014 and 2015.
6 step solution
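The forecasting idea in Q. 4.82 can be illustrated with a short sketch. The population data are not reproduced in this listing, so the series below is synthetic and perfectly linear, chosen only to show the mechanics; it is not the Census data, and `fit_trend` is a hypothetical helper name.

```python
def fit_trend(times, values):
    """Least-squares linear trend: returns (b0, b1) for y-hat = b0 + b1 * t."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    s_tt = sum((t - t_bar) ** 2 for t in times)
    s_ty = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    b1 = s_ty / s_tt
    return y_bar - b1 * t_bar, b1


# Synthetic illustration only (NOT the U.S. population data from the exercise).
years = [2000, 2001, 2002, 2003, 2004]
values = [10.0, 12.0, 14.0, 16.0, 18.0]

b0, b1 = fit_trend(years, values)

# Forecast the next two time points by extending the fitted line.
forecasts = {year: b0 + b1 * year for year in (2005, 2006)}
for year, value in forecasts.items():
    print(f"{year}: forecast {value:.1f}")
```

With the real series, the same two steps apply: fit the line to the observed years, then evaluate it at 2014 and 2015, keeping in mind that a forecast is an extrapolation beyond the observed times.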
Q. 4.78
Gas Guzzlers. The magazine Consumer Reports publishes information on automobile gas mileage and variables that affect gas mileage. In one issue, data on gas mileage (in miles per gallon) and engine displacement (in liters) were published for vehicles. Those data are available on the WeissStats site.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
4 step solution
Q.4.88
a. Compute the three sums of squares, SST, SSR, and SSE, using the defining formulas.
b. Verify the regression identity, SST = SSR + SSE.
c. Compute the coefficient of determination.
d. Determine the percentage of variation in the observed values of the response variable that is explained by the regression.
e. State how useful the regression equation appears to be for making predictions.
10 step solution
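The data for Q. 4.88 are not included in this listing, so the sketch below borrows the data points listed under Q. 4.57 purely as an illustration. It assumes the usual defining formulas SST = Σ(y_i − ȳ)², SSR = Σ(ŷ_i − ȳ)², and SSE = Σ(y_i − ŷ_i)², and takes the coefficient of determination as r² = SSR / SST.

```python
# Illustrative data only: the points listed under Q. 4.57, not the data of Q. 4.88.
xs = [0, 2, 2, 5, 6]
ys = [4, 2, 0, -2, 1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares line y-hat = b0 + b1 * x.
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * x for x in xs]

# Three sums of squares from their defining formulas.
sst = sum((y - y_bar) ** 2 for y in ys)               # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)          # explained by the regression
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # left unexplained

r_squared = ssr / sst

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
print(f"r^2 = {r_squared:.4f} ({100 * r_squared:.1f}% of the variation explained)")
```

Comparing the first two printed lines is the check part (b) asks for; the last line covers parts (c) and (d).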
Q.4.89
a. Compute the three sums of squares, SST, SSR, and SSE, using the defining formulas.
b. Verify the regression identity, SST = SSR + SSE.
c. Compute the coefficient of determination.
d. Determine the percentage of variation in the observed values of the response variable that is explained by the regression.
e. State how useful the regression equation appears to be for making predictions.
10 step solution
Q.4.90
a. Compute the three sums of squares, SST, SSR, and SSE, using the defining formulas.
b. Verify the regression identity, SST = SSR + SSE.
c. Compute the coefficient of determination.
d. Determine the percentage of variation in the observed values of the response variable that is explained by the regression.
e. State how useful the regression equation appears to be for making predictions.
10 step solution
Q. 4.84
A measure of the total variation in the observed values of the response variable is the------.
The mathematical abbreviation for it is----.
2 step solution
Q. 4.85
A measure of the amount of variation in the observed values of the response variable explained by the regression is the-----. The mathematical abbreviation for it is----.
2 step solution
Q. 4.86
A measure of the amount of variation in the observed values of the response variable not explained by the regression is the----. The mathematical abbreviation for it is----.
2 step solution
Q. 4.87
For regression analysis, and .
a. Obtain and interpret the coefficient of determination.
b. Determine SSE.
4 step solution