Problem 67

Question

Many people believe that a salary bonus is a reward for good performance. The corporate world may have a different understanding. A random sample of thirty chief executive officers of large capitalization public companies recorded the cash bonus paid, \(x\) (in \(\$100,000\)), and the performance of the company, \(y\), as measured by percentage change in company revenues. The following sums resulted. $$ \begin{gathered} \sum_{i=1}^{30} x_{i}=1,300.69 \quad \sum_{i=1}^{30} y_{i}=323 \\ \sum_{i=1}^{30} x_{i}^{2}=86,754.6939 \quad \sum_{i=1}^{30} y_{i}^{2}=11,881 \\ \sum_{i=1}^{30} x_{i} y_{i}=7,807.36 \end{gathered} $$ Find the sample coefficient of correlation. What does this study say about the relationship between bonuses and performance?

Step-by-Step Solution

Answer
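The sample correlation coefficient is \( r \approx -0.388 \) (computed in Step 4). It measures the strength and direction of the linear relationship between X and Y. The closer \( r \) is to its extreme values (1 or -1), the stronger the relationship, while a value close to 0 suggests a weak or no linear correlation between the two variables. Here the value indicates a weak-to-moderate negative linear association: in this sample, larger bonuses tended to accompany smaller percentage changes in revenue.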
Step 1: Calculate Mean of X and Y
Calculate the average of each variable using the formula for the mean: \( \bar{x} = \sum{x} / n \) and \( \bar{y} = \sum{y} / n \). Here, \( n = 30 \), \( \sum{x} = 1300.69 \) and \( \sum{y} = 323 \), so \( \bar{x} = 1300.69 / 30 \approx 43.3563 \) and \( \bar{y} = 323 / 30 \approx 10.7667 \).
Step 2: Calculate Variance of X and Y
Next, calculate the sample variance of each variable using the formula \( S_x^2 = (\sum{x^2} - n\bar{x}^2) / (n - 1) \) for X and \( S_y^2 = (\sum{y^2} - n\bar{y}^2) / (n - 1) \) for Y. Here \( n = 30 \), \( \sum{x^2} = 86754.6939 \), \( \sum{y^2} = 11881 \), and \( \bar{x} \) and \( \bar{y} \) come from Step 1. Substituting gives \( S_x^2 \approx 1046.95 \) and \( S_y^2 \approx 289.77 \).
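Step 3: Calculate Covariance between X and Y
Now, calculate the sample covariance between X and Y using the formula \( Cov_{xy} = (\sum{xy} - n\bar{x}\bar{y}) / (n - 1) \), where \( \sum{xy} = 7807.36 \), \( n = 30 \), and \( \bar{x} \) and \( \bar{y} \) come from Step 1. Substituting gives \( Cov_{xy} \approx -213.68 \). The negative sign means that X and Y tend to move in opposite directions in this sample.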
Step 4: Calculate Sample Coefficient of Correlation
Finally, calculate the sample correlation coefficient, \( r \), using the formula \( r = Cov_{xy} / \sqrt{S_x^2 \cdot S_y^2} \), where \( Cov_{xy} \), \( S_x^2 \) and \( S_y^2 \) come from Steps 2 and 3. Substituting gives \( r \approx -213.68 / \sqrt{1046.95 \cdot 289.77} \approx -213.68 / 550.80 \approx -0.388 \).
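The four steps above can be checked numerically. The following is a minimal Python sketch, not part of the original solution; the function name `correlation_from_sums` and its parameter names are illustrative assumptions, and it works only from the summary sums given in the question.

```python
import math

def correlation_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Sample correlation r computed from summary sums only (hypothetical helper)."""
    # Step 1: means
    mean_x = sum_x / n
    mean_y = sum_y / n
    # Step 2: sample variances (shortcut formula, divisor n - 1)
    var_x = (sum_x2 - n * mean_x ** 2) / (n - 1)
    var_y = (sum_y2 - n * mean_y ** 2) / (n - 1)
    # Step 3: sample covariance
    cov_xy = (sum_xy - n * mean_x * mean_y) / (n - 1)
    # Step 4: correlation
    return cov_xy / math.sqrt(var_x * var_y)

# Sums taken from the question
r = correlation_from_sums(
    n=30,
    sum_x=1300.69,
    sum_y=323,
    sum_x2=86754.6939,
    sum_y2=11881,
    sum_xy=7807.36,
)
print(round(r, 3))  # -0.388
```

Because the \( (n - 1) \) divisors cancel in the ratio, the same value would result from using the raw sums of squares and cross-products directly.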
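Step 5: Interpret the Result
The correlation coefficient, \( r \), ranges from -1 to 1. If \( r \) is close to 1, there is a strong positive linear relationship. If \( r \) is close to -1, there is a strong negative linear relationship. If \( r \) is around 0, there is a weak or no linear relationship between X and Y. For this sample, \( r \approx -0.388 \), a weak-to-moderate negative relationship: CEOs with larger bonuses tended to lead companies with smaller percentage changes in revenue. The study therefore gives no support to the belief that bonuses reward good performance as measured here. Correlation alone does not show that one variable causes the other.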

Key Concepts

Mean
The term "mean" simply refers to the average value of a data set. It's a fundamental concept in statistics that helps to summarize data with a single value. The mean is calculated by adding up all the data points and then dividing by the number of data points. For instance, if you have a data set of bonuses paid to 30 CEOs, the mean would show you the average bonus amount per CEO. This allows us to get a general sense of the central tendency of the bonuses.
  • Formula for mean: \(\bar{x} = \frac{\sum{x}}{n}\)
  • For our exercise: \(\bar{x} = \frac{1300.69}{30} = 43.3563\)
The mean can provide a convenient summary of the data set, but it might not capture the variability within the data set. Remember, sometimes extremely high or low values can skew the mean, making it less representative of the entire distribution.
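To illustrate how a single extreme value can pull the mean, here is a small Python sketch. The bonus figures are hypothetical (in units of \$100,000) and are not the study's data, which the question gives only as sums.

```python
from statistics import mean

# Hypothetical bonuses in units of $100,000 (not the study's data)
bonuses = [10, 12, 11, 13, 12]

# Same data with one very large bonus added
with_outlier = bonuses + [120]

mean_without = mean(bonuses)       # 58 / 5
mean_with = mean(with_outlier)     # 178 / 6

print(mean_without)           # 11.6
print(round(mean_with, 2))    # 29.67
```

One value out of six more than doubles the mean, even though five of the six bonuses lie between 10 and 13.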
Variance
Variance measures how much the values in a data set differ from the mean. In simpler terms, it tells us how spread out the data points are. A higher variance means the numbers are more spread out over a larger range, while a lower variance indicates they are closer to the mean. This is crucial for understanding the consistency of a data set.
The formula to calculate variance is a bit more involved than the mean:
  • For variance of X: \(S_x^2 = \frac{\sum{x^2} - n\bar{x}^2}{n - 1}\)
  • For our exercise: \(S_x^2 = \frac{86754.6939 - 30(43.3563)^2}{29} \approx 1046.95\)
The variance helps in assessing the reliability of the mean. A low variance suggests that the mean is a reliable representation of the data, whereas a high variance might indicate outliers or a more diverse data set.
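The shortcut formula above needs only \( n \), \( \sum{x} \) and \( \sum{x^2} \), which is why it suits this exercise. The sketch below, with an illustrative helper named `variance_from_sums` and a small hypothetical data set, checks that it agrees with the standard library's sample variance and then applies it to the sums from the question.

```python
from statistics import variance

def variance_from_sums(n, sum_x, sum_x2):
    """Sample variance from summary sums (hypothetical helper, divisor n - 1)."""
    mean_x = sum_x / n
    return (sum_x2 - n * mean_x ** 2) / (n - 1)

# Hypothetical raw data, used only to compare the two approaches
data = [2.0, 4.0, 4.0, 5.0, 7.0]
shortcut = variance_from_sums(len(data), sum(data), sum(v * v for v in data))
direct = variance(data)  # definition-based sample variance
print(round(shortcut, 4), round(direct, 4))  # 3.3 3.3

# Sums from the question
var_x = variance_from_sums(30, 1300.69, 86754.6939)
var_y = variance_from_sums(30, 323, 11881)
print(round(var_x, 2), round(var_y, 2))  # 1046.95 289.77
```

The shortcut form can lose precision in floating point when the mean is large relative to the spread, so the definition-based form is usually preferable when the raw data are available.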
Covariance
Covariance is another statistical measure that indicates how two variables change together. Unlike the mean or variance, covariance specifically focuses on the relationship between two different variables. If one variable increases when the other increases, the covariance is positive. If one decreases when the other increases, the covariance is negative.
  • Formula for covariance: \(Cov_{xy} = \frac{\sum{xy} - n\bar{x}\bar{y}}{n - 1}\)
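  • In our case: \(Cov_{xy} = \frac{7807.36 - 30(43.3563)(10.7667)}{29} \approx -213.68\)
Covariance can be tricky to interpret because its magnitude depends on the units of both variables, so its size alone does not show how strong the relationship is; only its sign is directly informative. It is nonetheless a necessary step before calculating the correlation coefficient, which normalizes covariance by the two standard deviations.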
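The dependence on units can be seen by rescaling one variable. In the sketch below, the data and the helper names `sample_cov` and `sample_corr` are hypothetical; the bonuses are expressed first in units of \$100,000 and then in dollars.

```python
import math

def sample_cov(xs, ys):
    """Sample covariance with divisor n - 1 (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def sample_corr(xs, ys):
    """Sample correlation: covariance divided by the product of standard deviations."""
    return sample_cov(xs, ys) / math.sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))

# Hypothetical data, not the study's
bonus_100k = [1.0, 2.0, 3.0, 4.0, 5.0]        # bonus in units of $100,000
revenue_change = [8.0, 6.0, 7.0, 3.0, 4.0]    # percentage change in revenue
bonus_dollars = [b * 100_000 for b in bonus_100k]

cov_100k = sample_cov(bonus_100k, revenue_change)
cov_dollars = sample_cov(bonus_dollars, revenue_change)
corr_100k = sample_corr(bonus_100k, revenue_change)
corr_dollars = sample_corr(bonus_dollars, revenue_change)

print(cov_100k, cov_dollars)     # covariance scales by a factor of 100,000
print(corr_100k, corr_dollars)   # correlation is unchanged by the rescaling
```

Changing the unit of the bonus multiplies the covariance by 100,000 but leaves the correlation the same, which is why \( r \) is the quantity to interpret.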
Sample Correlation
The sample correlation coefficient, often symbolized by \( r \), is a statistical measure that describes the strength and direction of a linear relationship between two variables on a scatterplot. This measure normalizes covariance and converts it into a more intuitive value that ranges between -1 and 1.
  • Calculation formula: \(r = \frac{Cov_{xy}}{\sqrt{S_x^2 \cdot S_y^2}}\)
  • If \( r \) is close to 1, the relation between the variables is strong and positive.
  • If \( r \) is close to -1, the relation is strong but negative.
In our context of bonuses and company performance, a strong positive \( r \) would suggest that high bonuses correspond to better performance, while a strong negative \( r \) would imply the opposite. For the sums in this exercise, \( r \approx -0.388 \): a weak-to-moderate negative association, so larger bonuses did not go with better revenue performance in this sample. Correlation describes association only; it does not by itself establish that one variable causes the other.