Chapter 11

An Introduction to Mathematical Statistics and Its Applications · 37 exercises

Problem 5

The following is the residual plot that results from fitting the equation \(y=6.0+2.0 x\) to a set of \(n=10\) points. What, if anything, would be wrong with predicting that \(y\) will equal \(30.0\) when \(x=12\)?

3 step solution

Problem 7

The relationship between school funding and student performance continues to be a hotly debated political and philosophical issue. Typical of the data available are the following figures, showing the per-pupil expenditures and graduation rate for twenty-six randomly chosen districts in Massachusetts. Graph the data and superimpose the least squares line, \(y=a+b x\). What would you conclude about the \(x y\) relationship? Use the following sums: $$ \begin{array}{ll} \sum_{i=1}^{26} x_{i}=360 & \sum_{i=1}^{26} y_{i}=2,256.6 \\ \sum_{i=1}^{26} x_{i}^{2}=5,365.08 & \sum_{i=1}^{26} x_{i} y_{i}=31,402 \end{array} $$

3 step solution
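As a rough sketch of the arithmetic, the slope and intercept can be computed directly from the sums quoted in Problem 7 using the standard least squares formulas. The function name `least_squares_from_sums` is illustrative and not taken from the text.

```python
def least_squares_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (a, b) for the line y = a + b*x from the usual summary sums."""
    # Slope: b = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: a = y_bar - b * x_bar
    a = (sum_y - b * sum_x) / n
    return a, b


# Sums quoted in Problem 7 (n = 26 districts).
a, b = least_squares_from_sums(26, 360, 2256.6, 5365.08, 31402)
print(f"y = {a:.3f} + {b:.3f} x")
```

The fitted slope is small and positive, so the graph is still needed to judge how strong the relationship is.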

Problem 9

An Atomic Energy Commission nuclear facility was established in Hanford, Washington, in 1943. Over the years, a significant amount of strontium 90 and cesium 137 leaked into the Columbia River. In a study to determine how much this radioactivity caused serious medical problems for those who lived along the river, public health officials created an index of radioactive exposure for nine Oregon counties in the vicinity of the river. As a covariate, cancer mortality was determined for each of the counties (45). The results are given in the table below. For the nine \(\left(x_{i}, y_{i}\right)\)'s in the table, $$ \begin{array}{ll} \sum_{i=1}^{9} x_{i}=41.56 & \sum_{i=1}^{9} x_{i}^{2}=289.4222 \\ \sum_{i=1}^{9} y_{i}=1,416.1 & \sum_{i=1}^{9} x_{i} y_{i}=7,439.37 \end{array} $$ $$ \begin{array}{lcc} \hline \text { County } & \text { Index of Exposure, } x & \text { Cancer Mortality per 100,000, } y \\ \hline \text { Umatilla } & 2.49 & 147.1 \\ \text { Morrow } & 2.57 & 130.1 \\ \text { Gilliam } & 3.41 & 129.9 \\ \text { Sherman } & 1.25 & 113.5 \\ \text { Wasco } & 1.62 & 137.5 \\ \text { Hood River } & 3.83 & 162.3 \\ \text { Portland } & 11.64 & 207.5 \\ \text { Columbia } & 6.41 & 177.9 \\ \text { Clatsop } & 8.34 & 210.3 \\ \hline \end{array} $$ Find the least squares straight line for these points. Also, construct the corresponding residual plot. Does it seem reasonable to conclude that \(x\) and \(y\) are linearly related?

5 step solution
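One possible way to check the fit for Problem 9 is to compute the line and its residuals from the nine tabulated points. This is a minimal sketch; the variable names are illustrative, and the residuals are printed rather than plotted so that no plotting library is assumed.

```python
# Data from the Problem 9 table: exposure index x, cancer mortality per 100,000 y.
x = [2.49, 2.57, 3.41, 1.25, 1.62, 3.83, 11.64, 6.41, 8.34]
y = [147.1, 130.1, 129.9, 113.5, 137.5, 162.3, 207.5, 177.9, 210.3]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Centered sums of products and squares.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

b = s_xy / s_xx          # least squares slope
a = y_bar - b * x_bar    # least squares intercept

# Residual = observed y minus fitted y; these are the heights in a residual plot.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(f"y = {a:.2f} + {b:.2f} x")
for xi, ri in zip(x, residuals):
    print(f"x = {xi:5.2f}  residual = {ri:7.2f}")
```

A residual plot with no visible pattern around zero would support the linear model; a curve or funnel shape would argue against it.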

Problem 10

Would you have any reservations about fitting the following data with a straight line? Explain. $$ \begin{array}{rr} \hline x & y \\ \hline 3 & 20 \\ 7 & 37 \\ 5 & 29 \\ 1 & 10 \\ 10 & 59 \\ 12 & 69 \\ 6 & 39 \\ 11 & 58 \\ 8 & 47 \\ 9 & 48 \\ 2 & 18 \\ 4 & 29 \\ \hline \end{array} $$

5 step solution

Problem 12

Verify that the coefficients \(a\) and \(b\) of the least squares straight line are solutions of the matrix equation $$ \left(\begin{array}{cc} n & \sum_{i=1}^{n} x_{i} \\ \sum_{i=1}^{n} x_{i} & \sum_{i=1}^{n} x_{i}^{2} \end{array}\right)\left(\begin{array}{l} a \\ b \end{array}\right)=\left(\begin{array}{l} \sum_{i=1}^{n} y_{i} \\ \sum_{i=1}^{n} x_{i} y_{i} \end{array}\right) $$

4 step solution

Problem 13

Prove that a least squares straight line must necessarily pass through the point \((\bar{x}, \bar{y})\).

4 step solution

Problem 14

In some regression situations, there are a priori reasons for assuming that the \(x y\)-relationship being approximated passes through the origin. If so, the equation to be fit to the \(\left(x_{i}, y_{i}\right)\) 's has the form \(y=b x\). Use the least squares criterion to show that the "best" slope in that case is given by $$ b=\frac{\sum_{i=1}^{n} x_{i} y_{i}}{\sum_{i=1}^{n} x_{i}^{2}} $$

4 step solution
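The formula in Problem 14 can be sanity-checked numerically before it is proved. The sketch below uses a small made-up data set (the four points are hypothetical, chosen only for illustration) and confirms that nudging the slope away from \(\sum x_{i} y_{i} / \sum x_{i}^{2}\) increases the sum of squared errors.

```python
def slope_through_origin(xs, ys):
    """Least squares slope for the no-intercept model y = b*x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def sse(b, xs, ys):
    """Sum of squared vertical deviations from the line y = b*x."""
    return sum((y - b * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, not from the text.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

b_hat = slope_through_origin(xs, ys)
best = sse(b_hat, xs, ys)

# Any perturbed slope should give a strictly larger SSE.
worse = [sse(b_hat + delta, xs, ys) for delta in (-0.1, -0.01, 0.01, 0.1)]
print(b_hat, best, all(w > best for w in worse))
```

A numerical check like this is not a proof; the exercise still asks for the derivation from the least squares criterion.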

Problem 16

Given a set of \(n\) linearly related points, \(\left(x_{1}, y_{1}\right),\left(x_{2}, y_{2}\right), \ldots\), and \(\left(x_{n}, y_{n}\right)\), use the least squares criterion to find formulas for (a) \(a\) if the slope of the \(x y\)-relationship is known to be \(b^{*}\). (b) \(b\) if the \(y\)-intercept of the \(x y\)-relationship is known to be \(a^{*}\).

3 step solution

Problem 19

Set up (but do not solve) the equations necessary to determine the least squares estimates for the trigonometric model, $$ y=a+b x+c \sin x $$ Assume that the data consist of the random sample \(\left(x_{1}, y_{1}\right),\left(x_{2}, y_{2}\right), \ldots\), and \(\left(x_{n}, y_{n}\right)\).

3 step solution

Problem 21

The growth of federal expenditures is one of the characteristic features of the U.S. economy. The rapidity of the increases from 2000 to 2015, as shown in the table below, suggests an exponential model. (a) Find the best-fitting exponential curve, using the method of least squares together with an appropriate linearizing transformation. Use the sums: \(\sum_{i=0}^{15} x_{i}=120\), \(\sum_{i=0}^{15} \ln y_{i}=37.04571\), and \(\sum_{i=0}^{15} x_{i} \cdot \ln y_{i}=307.6275\). (b) Calculate the residuals for the years 2009 through 2015. What does this say about the exponential model?

3 step solution
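For part (a) of Problem 21, the linearizing transformation \(\ln y=\ln a+b x\) turns the exponential fit into an ordinary straight-line fit. The sketch below assumes the years are coded as \(x_{i}=0,1, \ldots, 15\) (years since 2000), which is consistent with the quoted \(\sum x_{i}=120\) but is not stated explicitly in the problem; under that assumption \(\sum x_{i}^{2}\) can be computed directly.

```python
import math

n = 16                    # years 2000 through 2015
sum_x = 120.0             # quoted in the problem
sum_ln_y = 37.04571       # quoted in the problem
sum_x_ln_y = 307.6275     # quoted in the problem

# Assumption: x is coded 0, 1, ..., 15, so the sum of squares is 1240.
sum_x2 = sum(i * i for i in range(n))

# Ordinary least squares on (x, ln y).
b = (n * sum_x_ln_y - sum_x * sum_ln_y) / (n * sum_x2 - sum_x ** 2)
ln_a = (sum_ln_y - b * sum_x) / n
a = math.exp(ln_a)

print(f"y = {a:.3f} * exp({b:.4f} x)")
```

Part (b) needs the individual \(y_{i}\) values from the table, which are not reproduced here, so the residuals are not computed.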

Problem 24

Suppose a set of \(n\left(x_{i}, y_{i}\right)\) 's are measured on a phenomenon whose theoretical \(x y\)-relationship is of the form \(y=a e^{b x}\). (a) Show that \(\frac{d y}{d x}=b y\) implies that \(y=a e^{b x}\). (b) On what kind of graph paper would the \(\left(x_{i}, y_{i}\right)\) 's show a linear relationship?

4 step solution

Problem 26

Among mammals, the relationship between the age at which an animal develops locomotion and the age at which it first begins to play has been widely studied. The table below lists "onset" times for locomotion and for play in eleven different species (46). Fit the data to the \(y=a x^{b}\) model. $$ \begin{array}{lcc} \hline \text { Species } & \text { Locomotion Begins, } x \text { (days) } & \text { Play Begins, } y \text { (days) } \\ \hline \text { Homo sapiens } & 360 & 90 \\ \text { Gorilla gorilla } & 165 & 105 \\ \text { Felis catus } & 21 & 21 \\ \text { Canis familiaris } & 23 & 26 \\ \text { Rattus norvegicus } & 11 & 14 \\ \text { Turdus merula } & 18 & 28 \\ \text { Macaca mulatta } & 18 & 21 \\ \text { Pan troglodytes } & 150 & 105 \\ \text { Saimiri sciureus } & 45 & 68 \\ \text { Cercocebus alb. } & 45 & 75 \\ \text { Tamiasciurus hud. } & 18 & 46 \\ \hline \end{array} $$

4 step solution
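A common way to fit \(y=a x^{b}\) is to take logarithms of both variables, since \(\ln y=\ln a+b \ln x\) is linear in \(\ln x\). The sketch below applies that idea to the eleven points in the Problem 26 table; the variable names are illustrative, and least squares on the log scale is an assumption about the intended method.

```python
import math

# (locomotion onset x, play onset y) in days, from the Problem 26 table.
x = [360, 165, 21, 23, 11, 18, 18, 150, 45, 45, 18]
y = [90, 105, 21, 26, 14, 28, 21, 105, 68, 75, 46]

# Transform to the log-log scale, where the power model is a straight line.
u = [math.log(xi) for xi in x]
v = [math.log(yi) for yi in y]

n = len(u)
u_bar = sum(u) / n
v_bar = sum(v) / n

# Ordinary least squares of v on u.
b = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v)) / sum(
    (ui - u_bar) ** 2 for ui in u
)
ln_a = v_bar - b * u_bar
a = math.exp(ln_a)

print(f"y = {a:.2f} * x^{b:.3f}")
```

Note that this minimizes squared error in \(\ln y\), not in \(y\) itself, so the result can differ from a direct nonlinear fit.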

Problem 38

The sodium nitrate \(\left(\mathrm{NaNO}_{3}\right)\) solubility data in Question 11.2.3 is described nicely by the regression line \(y=67.508+0.871 x\), where \(s=0.959\). Construct a \(90 \%\) confidence interval for the \(y\)-intercept, \(\beta_{0}\).

3 step solution

Problem 41

Let \(\left(x_{1}, Y_{1}\right),\left(x_{2}, Y_{2}\right), \ldots\), and \(\left(x_{n}, Y_{n}\right)\) be a set of points satisfying the assumptions of the simple linear model. Prove that $$ E(\bar{Y})=\beta_{0}+\beta_{1} \bar{x} $$

3 step solution

Problem 42

Derive a formula for a \(95 \%\) confidence interval for \(\beta_{0}\) if \(n\left(x_{i}, Y_{i}\right)\) 's are taken on a simple linear model where \(\sigma\) is known.

3 step solution

Problem 44

State the decision rule and the conclusion if \(H_{0}: \sigma^{2}=12.6\) is to be tested against \(H_{1}: \sigma^{2} \neq 12.6\) where \(n=24\), \(s^{2}=18.2\), and \(\alpha=0.05\).

3 step solution
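A sketch of the test in Problem 44 follows. It assumes, from the surrounding regression exercises, that \(s^{2}\) is the variance estimate from a simple linear model, so the statistic \((n-2) s^{2} / \sigma_{0}^{2}\) is compared with a chi-square distribution on \(n-2=22\) degrees of freedom; if \(s^{2}\) were an ordinary sample variance, \(n-1\) degrees of freedom would apply instead. The critical values are the usual chi-square table entries for 22 degrees of freedom, hard-coded to avoid depending on a statistics library.

```python
n = 24
s_sq = 18.2
sigma0_sq = 12.6

# Assumption: regression setting, so the degrees of freedom are n - 2.
df = n - 2
chi_sq = df * s_sq / sigma0_sq

# Table values for 22 df at cumulative probabilities 0.025 and 0.975.
lower_crit = 10.982
upper_crit = 36.781

# Two-sided rule at alpha = 0.05: reject if the statistic falls in either tail.
reject = chi_sq <= lower_crit or chi_sq >= upper_crit
print(f"chi-square = {chi_sq:.2f}, reject H0: {reject}")
```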

Problem 47

Regression techniques can be very useful in situations where one variable, say \(y\), is difficult to measure but \(x\) is not. Once such an \(x y\)-relationship has been "calibrated," based on a set of \(\left(x_{i}, y_{i}\right)\)'s, future values of \(Y\) can be easily estimated using \(\hat{\beta}_{0}+\hat{\beta}_{1} x\). Determining the volume of an irregularly shaped object, for example, is often difficult, but weighing that object is likely to be easy. The following table shows the weights (in kilograms) and the volumes (in cubic decimeters) of eighteen children between the ages of five and eight (15). The estimated regression line has the equation \(y=-0.104+0.988 x\), where \(s=0.202\). (a) Construct a \(95 \%\) confidence interval for \(E(Y \mid 14.0)\). (b) Construct a \(95 \%\) prediction interval for the volume of a child weighing \(14.0\) kilograms. $$ \begin{array}{cccc} \hline \text { Weight, } x & \text { Volume, } y & \text { Weight, } x & \text { Volume, } y \\ \hline 17.1 & 16.7 & 15.8 & 15.2 \\ 10.5 & 10.4 & 15.1 & 14.8 \\ 13.8 & 13.5 & 12.1 & 11.9 \\ 15.7 & 15.7 & 18.4 & 18.3 \\ 11.9 & 11.6 & 17.1 & 16.7 \\ 10.4 & 10.2 & 16.7 & 16.6 \\ 15.0 & 14.5 & 16.5 & 15.9 \\ 16.0 & 15.8 & 15.1 & 15.1 \\ 17.8 & 17.6 & 15.1 & 14.5 \\ \hline \end{array} $$

4 step solution

Problem 52

Attorneys representing a group of male buyers employed by Flirty Fashions are filing a reverse discrimination suit against the female-owned company. Central to their case are the following data, showing the relationship between years of service and annual salary for the firm's fourteen buyers, six of whom are men. The plaintiffs claim that the difference in slopes (\(0.606\) for men versus \(1.07\) for women) is prima facie evidence that the company's salary policies discriminate against men. As the lawyer for Flirty Fashions, how would you respond? Use the following sums: $$ \sum_{i=1}^{6}\left(y_{i}-21.3-0.606 x_{i}\right)^{2}=5.983 $$ Also, \(\sum_{i=1}^{6}\left(x_{i}-\bar{x}\right)^{2}=31.33\) and \(\sum_{i=1}^{8}\left(x_{i}^{*}-\bar{x}^{*}\right)^{2}=46\).

4 step solution

Problem 53

Polls taken during a city's last two administrations (one Democratic, one Republican) suggested that public support of the two mayors fell off linearly with years in office. Can it be concluded from the following data that the rates at which the two administrations lost favor were significantly different? Let \(\alpha=0.05\). (Note: \(y=69.3077-3.4615 x\) with an estimated standard deviation of \(0.9058\) and \(y^{*}=59.9407-2.7373 x^{*}\) with an estimated standard deviation of 1.2368.)

5 step solution

Problem 56

Let \(X\) and \(Y\) have the joint pdf $$ f_{X, Y}(x, y)= \begin{cases}\frac{x+2 y}{22}, & \text { for }(x, y)=(1,1),(1,3),(2,1),(2,3) \\ 0, & \text { elsewhere }\end{cases} $$ Find \(\operatorname{Cov}(X, Y)\) and \(\rho(X, Y)\).

3 step solution
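Because the joint pdf in Problem 56 has only four support points, the moments can be tabulated exactly. The following sketch uses Python's `fractions` module so the covariance comes out as an exact rational number; the helper name `expect` is illustrative.

```python
import math
from fractions import Fraction

# Joint pmf f(x, y) = (x + 2y) / 22 on the four support points.
support = [(1, 1), (1, 3), (2, 1), (2, 3)]
pmf = {(x, y): Fraction(x + 2 * y, 22) for x, y in support}


def expect(g):
    """Expected value of g(X, Y) under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())


e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - e_x * e_y
var_x = expect(lambda x, y: x * x) - e_x ** 2
var_y = expect(lambda x, y: y * y) - e_y ** 2

# The correlation involves a square root, so it is reported as a float.
rho = float(cov) / math.sqrt(var_x * var_y)

print("Cov(X, Y) =", cov)
print("rho(X, Y) =", round(rho, 4))
```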

Problem 57

Suppose that \(X\) and \(Y\) have the joint pdf $$ f_{X, Y}(x, y)=x+y, \quad 0

3 step solution

Problem 58

If the random variables \(X\) and \(Y\) have the joint pdf $$ f_{X, Y}(x, y)= \begin{cases}8 x y, & 0 \leq y \leq x \leq 1 \\ 0, & \text { otherwise }\end{cases} $$ show that \(\operatorname{Cov}(X, Y)=\frac{8}{450}\). Calculate \(\rho(X, Y)\).

5 step solution

Problem 59

Suppose that \(X\) and \(Y\) are discrete random variables with the joint pdf $$ \begin{array}{cc} \hline(x, y) & f_{X, Y}(x, y) \\ \hline(1,2) & \frac{1}{2} \\ (1,3) & \frac{1}{4} \\ (2,1) & \frac{1}{8} \\ (2,4) & \frac{1}{8} \\ \hline \end{array} $$ Find the correlation coefficient between \(X\) and \(Y\).

4 step solution

Problem 60

Prove that \(\rho(a+b X, c+d Y)=\rho(X, Y)\) for constants \(a, b, c\), and \(d\) where \(b\) and \(d\) are positive. Note that this result allows for a change of scale to one convenient for computation.

4 step solution

Problem 61

Let the random variable \(X\) take on the values \(1,2, \ldots, n\), each with probability \(1 / n\). Define \(Y\) to be \(X^{2}\). Find \(\rho(X, Y)\) and \(\lim _{n \rightarrow \infty} \rho(X, Y)\).

6 step solution
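Before deriving the closed form in Problem 61, it can help to see numerically where the correlation is heading. The sketch below computes \(\rho(X, X^{2})\) by brute force for a few values of \(n\) and compares the result with \(\sqrt{15} / 4\), the value obtained from the continuous analogue (a uniform variable on \((0,1)\) and its square). The function name `rho_x_xsq` is illustrative.

```python
import math


def rho_x_xsq(n):
    """Correlation between X and Y = X^2 when X is uniform on 1..n (n >= 2)."""
    xs = range(1, n + 1)
    e_x = sum(xs) / n
    e_y = sum(x ** 2 for x in xs) / n        # E(Y) = E(X^2)
    e_xy = sum(x ** 3 for x in xs) / n       # E(XY) = E(X^3)
    e_y2 = sum(x ** 4 for x in xs) / n       # E(Y^2) = E(X^4)
    cov = e_xy - e_x * e_y
    var_x = e_y - e_x ** 2
    var_y = e_y2 - e_y ** 2
    return cov / math.sqrt(var_x * var_y)


for n in (2, 10, 100, 10000):
    print(n, round(rho_x_xsq(n), 5))

print("sqrt(15)/4 =", round(math.sqrt(15) / 4, 5))
```

With \(n=2\) the two points lie exactly on a line, so the correlation is 1; it then decreases toward the limit as \(n\) grows.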

Problem 62

(a) For random variables \(X\) and \(Y\), show that $$ \operatorname{Cov}(X+Y, X-Y)=\operatorname{Var}(X)-\operatorname{Var}(Y) $$ (b) Suppose that \(\operatorname{Cov}(X, Y)=0\). Prove that $$ \rho(X+Y, X-Y)=\frac{\operatorname{Var}(X)-\operatorname{Var}(Y)}{\operatorname{Var}(X)+\operatorname{Var}(Y)} $$

4 step solution

Problem 64

Let \(\left(x_{1}, y_{1}\right),\left(x_{2}, y_{2}\right), \ldots,\left(x_{n}, y_{n}\right)\) be a set of measurements whose sample correlation coefficient is \(r\). Show that $$ r=\hat{\beta}_{1} \cdot \frac{\sqrt{n \sum_{i=1}^{n} x_{i}^{2}-\left(\sum_{i=1}^{n} x_{i}\right)^{2}}}{\sqrt{n \sum_{i=1}^{n} y_{i}^{2}-\left(\sum_{i=1}^{n} y_{i}\right)^{2}}} $$ where \(\hat{\beta}_{1}\) is the maximum likelihood estimate for the slope.

3 step solution

Problem 66

Some baseball fans believe that the number of home runs a team hits is markedly affected by the altitude of the club's home park. The rationale is that the air is thinner at the higher altitudes, and balls would be expected to travel farther. The following table shows the altitudes \((X)\) of American League ballparks and the number of home runs \((Y)\) that each team hit during a recent season (183). Calculate the sample correlation coefficient, \(r\), using the sums below. What would you conclude? $$ \begin{gathered} \sum_{i=1}^{12} x_{i}=4936 \quad \sum_{i=1}^{12} y_{i}=1175 \\ \sum_{i=1}^{12} x_{i}^{2}=3,071,116 \quad \sum_{i=1}^{12} y_{i}^{2}=123,349 \\ \sum_{i=1}^{12} x_{i} y_{i}=480,565 \end{gathered} $$

3 step solution
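The sample correlation coefficient in Problem 66 can be evaluated directly from the five sums with the computing formula for \(r\). This is a minimal sketch; the function name `corr_from_sums` is illustrative.

```python
import math


def corr_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Sample correlation coefficient r from summary sums."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(
        n * sum_y2 - sum_y ** 2
    )
    return numerator / denominator


# Sums quoted in Problem 66 (n = 12 ballparks).
r = corr_from_sums(12, 4936, 1175, 3071116, 123349, 480565)
print(f"r = {r:.4f}")
```

A value of \(r\) this close to zero gives little support to the altitude theory, at least for these twelve parks.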

Problem 67

Many people believe that a salary bonus is a reward for good performance. The corporate world may have a different understanding. A random sample of thirty chief executive officers of large capitalization public companies recorded the cash bonus paid, \(x\) (in \(\$ 100,000\)), and the performance of the company, \(y\), as measured by percentage change in company revenues. The following sums resulted. $$ \begin{gathered} \sum_{i=1}^{30} x_{i}=1,300.69 \quad \sum_{i=1}^{30} y_{i}=323 \\ \sum_{i=1}^{30} x_{i}^{2}=86,754.6939 \quad \sum_{i=1}^{30} y_{i}^{2}=11,881 \\ \sum_{i=1}^{30} x_{i} y_{i}=7,807.36 \end{gathered} $$ Find the sample coefficient of correlation. What does this study say about the relationship between bonuses and performance?

5 step solution

Problem 68

The extent to which stress is a contributing factor to the severity of chronic illnesses was the focus of the study summarized in the following table (221). Seventeen conditions were compared on a Seriousness of Illness Rating Scale (SIRS). Patients with each of these conditions were asked to fill out a Schedule of Recent Experience (SRE) questionnaire. Higher scores on the SRE reflect presumably greater levels of stress. How much of the variation in the SIRS values can be attributed to the linear regression with SRE? Use the following sums: $$ \begin{array}{cl} \sum_{i=1}^{17} x_{i}=7,973 & \sum_{i=1}^{17} y_{i}=8,517 \\ \sum_{i=1}^{17} x_{i}^{2}=4,611,291 & \sum_{i=1}^{17} y_{i}^{2}=5,421,917 \\ \sum_{i=1}^{17} x_{i} y_{i}=4,759,470 \end{array} $$

3 step solution
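In Problem 68, the proportion of variation in \(y\) attributable to the linear regression on \(x\) is \(r^{2}\), the coefficient of determination. The sketch below computes it from the quoted sums without taking a square root; it assumes \(x\) denotes the SRE score and \(y\) the SIRS value, which the question implies but the sums do not label.

```python
# Sums quoted in Problem 68 (n = 17 conditions).
n = 17
sum_x, sum_y = 7973, 8517
sum_x2, sum_y2 = 4611291, 5421917
sum_xy = 4759470

# Pieces of the computing formula for r.
s_xy = n * sum_xy - sum_x * sum_y
s_xx = n * sum_x2 - sum_x ** 2
s_yy = n * sum_y2 - sum_y ** 2

# r^2 = s_xy^2 / (s_xx * s_yy): fraction of variation explained by the line.
r_squared = s_xy ** 2 / (s_xx * s_yy)
print(f"r^2 = {r_squared:.3f}")
```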

Problem 69

Burglary and larceny both involve the illegal taking of something of value. The difference, simply put, is that burglary involves unlawful entry to a structure, while larceny does not. While the two crimes might seem similar, the correlation between the two is quite low. A data set to be used for such an analysis is the annual rates of burglary, \(x\), and larceny, \(y\), from 1975 to 2010. Both variables give the number of crimes per 100,000 U.S. citizens. Calculate the \(x y\) correlation. Use the following sums: $$ \begin{aligned} \sum_{i=1}^{36} x_{i} &=994.7700, \quad \sum_{i=1}^{36} x_{i}^{2}=28462.1047 \\ \sum_{i=1}^{36} y_{i} &=254.6900, \quad \sum_{i=1}^{36} y_{i}^{2}=1816.1417 \\ \sum_{i=1}^{36} x_{i} y_{i} &=7051.2633 \end{aligned} $$

4 step solution

Problem 71

Suppose that \(X\) and \(Y\) have a bivariate normal pdf with \(\mu_{X}=3, \mu_{Y}=6, \sigma_{X}^{2}=4, \sigma_{Y}^{2}=10\), and \(\rho=\frac{1}{2}\). Find \(P\left(5

5 step solution

Problem 72

Suppose that \(X\) and \(Y\) have a bivariate normal distribution with \(\operatorname{Var}(X)=\operatorname{Var}(Y)\). (a) Show that \(X\) and \(Y-\rho X\) are independent. (b) Show that \(X+Y\) and \(X-Y\) are independent. [Hint: See Question 11.4.7(a).]

4 step solution

Problem 73

Suppose that \(X\) and \(Y\) have a bivariate normal distribution. (a) Prove that \(X+Y\) has a normal distribution when \(X\) and \(Y\) are standard normal random variables. (b) Find \(E(c X+d Y)\) and \(\operatorname{Var}(c X+d Y)\) in terms of \(\mu_{X}\), \(\mu_{Y}, \sigma_{X}, \sigma_{Y}\), and \(\rho(X, Y)\), where \(X\) and \(Y\) are arbitrary normal random variables.

3 step solution

Problem 74

Suppose that the random variables \(X\) and \(Y\) have a bivariate normal pdf with \(\mu_{X}=56, \mu_{Y}=11, \sigma_{X}^{2}=1.2\), \(\sigma_{Y}^{2}=2.6\), and \(\rho=0.6\). Compute \(P(10

3 step solution

Problem 75

If the joint pdf of the random variables \(X\) and \(Y\) is $$ f_{X, Y}(x, y)=k e^{-(2 / 3)\left[(1 / 4) x^{2}-(1 / 2) x y+y^{2}\right]} $$ find \(E(X), E(Y), \operatorname{Var}(X), \operatorname{Var}(Y), \rho(X, Y)\), and \(k\).

5 step solution

Problem 76

Give conditions on \(a>0, b>0\), and \(u\) so that $$ f_{X, Y}(x, y)=k e^{-\left(a x^{2}-2 u x y+b y^{2}\right)} $$ is the bivariate normal density of random variables \(X\) and \(Y\) each having expected value 0 . Also, find \(\operatorname{Var}(X)\), \(\operatorname{Var}(Y)\), and \(\rho(X, Y)\).

4 step solution
