Problem 23

Question

(Calculus needed.) Consider the multiple regression model: $$Y_{i}=\beta_{1} X_{i 1}+\beta_{2} X_{i 2}+\varepsilon_{i} \quad i=1, \ldots, n$$ where the $\varepsilon_{i}$ are uncorrelated, with $E\left[\varepsilon_{i}\right]=0$ and $\sigma^{2}\left\{\varepsilon_{i}\right\}=\sigma^{2}$. a. State the least squares criterion and derive the least squares estimators of $\beta_{1}$ and $\beta_{2}$. b. Assuming that the $\varepsilon_{i}$ are independent normal random variables, state the likelihood function and obtain the maximum likelihood estimators of $\beta_{1}$ and $\beta_{2}$. Are these the same as the least squares estimators?

Step-by-Step Solution


Answer

The least squares estimators (LSE) and the maximum likelihood estimators (MLE) of β₁ and β₂ are the same under normality.

Step 1: Define the Least Squares Criterion

The least squares criterion minimizes the sum of the squared residuals. The residual for each observation is the difference between the observed value and the model value, so the criterion is:
\[ RSS = \sum_{i=1}^{n} \left[ Y_{i} - (\beta_{1}X_{i1} + \beta_{2}X_{i2}) \right]^2 \]

Step 2: Form the Normal Equations

To find the least squares estimators, take the partial derivatives of RSS with respect to $\beta_{1}$ and $\beta_{2}$, and set them to zero:
\[\frac{\partial RSS}{\partial \beta_{1}} = -2 \sum_{i=1}^{n} X_{i1} (Y_{i} - \beta_{1}X_{i1} - \beta_{2}X_{i2}) = 0 \]
\[\frac{\partial RSS}{\partial \beta_{2}} = -2 \sum_{i=1}^{n} X_{i2} (Y_{i} - \beta_{1}X_{i1} - \beta_{2}X_{i2}) = 0 \]
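As a sanity check on the two partial derivatives, the sketch below compares the analytic expressions with central finite differences of RSS. The data values and the function names (`rss`, `rss_gradient`, `numeric_gradient`) are hypothetical, chosen only for illustration; they are not part of the exercise.

```python
# Hypothetical data, chosen only for illustration (not from the exercise).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [8.1, 6.9, 18.2, 16.8, 28.1]


def rss(b1, b2):
    """Residual sum of squares for the no-intercept, two-predictor model."""
    return sum((yi - b1 * a - b2 * b) ** 2 for a, b, yi in zip(x1, x2, y))


def rss_gradient(b1, b2):
    """Analytic partial derivatives of RSS with respect to b1 and b2."""
    resid = [yi - b1 * a - b2 * b for a, b, yi in zip(x1, x2, y)]
    d1 = -2.0 * sum(a * e for a, e in zip(x1, resid))
    d2 = -2.0 * sum(b * e for b, e in zip(x2, resid))
    return d1, d2


def numeric_gradient(b1, b2, h=1e-5):
    """Central finite differences, used only as an independent check."""
    d1 = (rss(b1 + h, b2) - rss(b1 - h, b2)) / (2 * h)
    d2 = (rss(b1, b2 + h) - rss(b1, b2 - h)) / (2 * h)
    return d1, d2


# Evaluate both gradients at an arbitrary trial point (b1, b2) = (1, 1).
analytic = rss_gradient(1.0, 1.0)
numeric = numeric_gradient(1.0, 1.0)
print("analytic:", analytic)
print("numeric: ", numeric)
```

Because RSS is quadratic in the coefficients, the two results should agree up to floating-point rounding at any trial point.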

Step 3: Solve the Normal Equations

Dividing the equations from Step 2 by $-2$ and rearranging gives two linear equations in $\beta_{1}$ and $\beta_{2}$. These can be written in matrix form and solved:
\[\begin{pmatrix} \sum X_{i1}^2 & \sum X_{i1}X_{i2} \\ \sum X_{i1}X_{i2} & \sum X_{i2}^2 \end{pmatrix} \begin{pmatrix} \beta_{1} \\ \beta_{2} \end{pmatrix} = \begin{pmatrix} \sum X_{i1}Y_{i} \\ \sum X_{i2}Y_{i} \end{pmatrix} \]
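For reference, the $2 \times 2$ system can be solved explicitly, for instance by Cramer's rule. Writing $D$ for the determinant of the coefficient matrix ($D$ is shorthand introduced here, not notation from the exercise), and assuming $D \neq 0$, which holds whenever the two predictor columns are not proportional to each other:
\[ D = \sum X_{i1}^2 \sum X_{i2}^2 - \left( \sum X_{i1}X_{i2} \right)^2 \]
\[ \hat{\beta}_{1} = \frac{\sum X_{i2}^2 \sum X_{i1}Y_{i} - \sum X_{i1}X_{i2} \sum X_{i2}Y_{i}}{D}, \qquad \hat{\beta}_{2} = \frac{\sum X_{i1}^2 \sum X_{i2}Y_{i} - \sum X_{i1}X_{i2} \sum X_{i1}Y_{i}}{D} \]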

Step 4: State the Likelihood Function

Given independent $\varepsilon_{i} \sim N(0,\sigma^2)$, the likelihood function $L$ is:
\[ L = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( \frac{-(Y_{i} - \beta_{1}X_{i1} - \beta_{2}X_{i2})^2}{2\sigma^2} \right) \]

Step 5: Obtain the Log-Likelihood Function

Take the natural log of the likelihood function:
\[ \ln L = -\frac{n}{2} \ln(2\pi) - \frac{n}{2} \ln(\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left[ Y_{i} - (\beta_{1}X_{i1} + \beta_{2}X_{i2}) \right]^2 \]

Step 6: Differentiate and Set to Zero

Differentiate the log-likelihood function with respect to $\beta_{1}$ and $\beta_{2}$, then set the derivatives to zero:
\[ \frac{\partial \ln L}{\partial \beta_{1}} = 0 \quad \text{and} \quad \frac{\partial \ln L}{\partial \beta_{2}} = 0 \]
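Written out, only the sum-of-squares term of $\ln L$ depends on $\beta_{1}$ and $\beta_{2}$, so the two derivatives are:
\[ \frac{\partial \ln L}{\partial \beta_{1}} = \frac{1}{\sigma^2} \sum_{i=1}^{n} X_{i1} (Y_{i} - \beta_{1}X_{i1} - \beta_{2}X_{i2}) = 0 \]
\[ \frac{\partial \ln L}{\partial \beta_{2}} = \frac{1}{\sigma^2} \sum_{i=1}^{n} X_{i2} (Y_{i} - \beta_{1}X_{i1} - \beta_{2}X_{i2}) = 0 \]
Since $\sigma^2 > 0$, multiplying through by $\sigma^2$ leaves exactly the normal equations of Step 2 (up to the constant factor $-2$).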

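Step 7: Solve for Maximum Likelihood Estimators

Solving these equations gives the maximum likelihood estimators (MLE) of $\beta_{1}$ and $\beta_{2}$. In matrix notation, with $X$ the $n \times 2$ matrix whose $i$-th row is $(X_{i1}, X_{i2})$ and $Y$ the vector of observations, they are:
\[ \begin{pmatrix} \hat{\beta}_{1} \\ \hat{\beta}_{2} \end{pmatrix} = \left( X^{T}X \right)^{-1} X^{T}Y \]
Here $X^{T}X$ and $X^{T}Y$ are the matrix and vector of sums that appear in Step 3.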

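Step 8: Compare Both Estimators

The MLEs of $\beta_{1}$ and $\beta_{2}$ are the same as the least squares estimators (LSE). Under the normality assumption on the error terms, $\ln L$ depends on $\beta_{1}$ and $\beta_{2}$ only through the sum of squared deviations, which enters with a negative coefficient, so maximizing the likelihood over $\beta_{1}$ and $\beta_{2}$ is equivalent to minimizing RSS.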

Key Concepts

Least Squares Estimation, Maximum Likelihood Estimation, Normal Equations, Log-Likelihood Function

Least Squares Estimation

Least Squares Estimation is a fundamental method used to estimate the parameters of a regression model. The goal is to minimize the sum of the squared differences (residuals) between the observed values and the values predicted by the model. This technique is especially convenient for linear regression models, where the minimization reduces to solving a system of linear equations.

In the given exercise, the residual for each observation is defined as the difference between the observed value, $Y_{i}$, and the predicted value, $\beta_{1}X_{i1} + \beta_{2}X_{i2}$. The least squares criterion is then expressed as the Residual Sum of Squares (RSS):
$RSS = \sum_{i=1}^{n} \left(Y_{i} - (\beta_{1}X_{i1} + \beta_{2}X_{i2})\right)^2$.

To find the least squares estimators ($\hat{\beta}_{1}$ and $\hat{\beta}_{2}$), we need to minimize the RSS. This involves taking the partial derivatives of the RSS with respect to each parameter, setting them to zero, and solving the resulting equations for $\beta_{1}$ and $\beta_{2}$. This method makes the sum of the squared residuals as small as possible, thereby providing the best fit to the given data in the least squares sense.

Maximum Likelihood Estimation

Maximum Likelihood Estimation (MLE) is another crucial method for estimating the parameters of a statistical model. The idea is to find the parameter values that maximize the likelihood function, which measures how likely it is to observe the given data under different parameter values.

In this exercise, we assume the errors $\varepsilon_{i}$ follow a normal distribution with mean 0 and variance $\sigma^2$. Given this assumption, the likelihood function $L$ for the observed data is:
\[ L = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( \frac{-(Y_{i} - \beta_{1}X_{i1} - \beta_{2}X_{i2})^2}{2\sigma^2} \right) \]

To simplify the optimization, we usually work with the natural log of the likelihood function, which is called the log-likelihood function $\ln L$:
\[ \ln L = -\frac{n}{2} \ln(2\pi) - \frac{n}{2} \ln(\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left( Y_{i} - (\beta_{1}X_{i1} + \beta_{2}X_{i2}) \right)^2 \]

We then differentiate the log-likelihood function with respect to the parameters, set these derivatives to zero, and solve for $\beta_{1}$ and $\beta_{2}$. This yields the maximum likelihood estimators, which, under the normality assumption, turn out to be the same as the least squares estimators.

Normal Equations

Normal Equations are a set of simultaneous linear equations derived from the method of least squares. These equations are used to find the estimators of the parameters in a linear regression model.

To derive these equations, we take the partial derivatives of the Residual Sum of Squares (RSS) with respect to each parameter and set them to zero. For the given model, the system of normal equations is:
\[\frac{\partial RSS}{\partial \beta_{1}} = -2 \sum_{i=1}^{n} X_{i1} (Y_{i} - \beta_{1}X_{i1} - \beta_{2}X_{i2}) = 0 \]
\[\frac{\partial RSS}{\partial \beta_{2}} = -2 \sum_{i=1}^{n} X_{i2} (Y_{i} - \beta_{1}X_{i1} - \beta_{2}X_{i2}) = 0 \]

These partial derivatives, when set to zero, result in a system of linear equations known as the normal equations. In matrix form, these can be written as:
\[\begin{pmatrix} \sum X_{i1}^2 & \sum X_{i1}X_{i2} \\ \sum X_{i1}X_{i2} & \sum X_{i2}^2 \end{pmatrix} \begin{pmatrix} \beta_{1} \\ \beta_{2} \end{pmatrix} = \begin{pmatrix} \sum X_{i1}Y_{i} \\ \sum X_{i2}Y_{i} \end{pmatrix} \]
By solving this system, we obtain the least squares estimators for $\beta_{1}$ and $\beta_{2}$. The solution involves inverting the matrix and multiplying by the vector of summed products, giving us the best-fit parameters.
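A minimal sketch of this solution step, using only the Python standard library: the helper below builds the five sums, solves the $2 \times 2$ system by Cramer's rule (equivalent to inverting the matrix), and then checks that the residuals are orthogonal to each predictor, which is what the normal equations require. The function name `least_squares_no_intercept` and the data are hypothetical, introduced only for illustration.

```python
def least_squares_no_intercept(x1, x2, y):
    """Solve the 2x2 normal equations for Y = b1*X1 + b2*X2 by Cramer's rule."""
    s11 = sum(a * a for a in x1)                 # sum of X_i1^2
    s22 = sum(b * b for b in x2)                 # sum of X_i2^2
    s12 = sum(a * b for a, b in zip(x1, x2))     # sum of X_i1 * X_i2
    s1y = sum(a * yi for a, yi in zip(x1, y))    # sum of X_i1 * Y_i
    s2y = sum(b * yi for b, yi in zip(x2, y))    # sum of X_i2 * Y_i
    det = s11 * s22 - s12 * s12
    if det == 0:
        # Exact-zero test is a simplification; real data may need a tolerance.
        raise ValueError("X1 and X2 are proportional; no unique solution")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Hypothetical data, chosen only for illustration (not from the exercise).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [8.1, 6.9, 18.2, 16.8, 28.1]

b1_hat, b2_hat = least_squares_no_intercept(x1, x2, y)
residuals = [yi - b1_hat * a - b2_hat * b for a, b, yi in zip(x1, x2, y)]

# The normal equations state that both of these sums are zero at the solution.
check1 = sum(a * e for a, e in zip(x1, residuals))
check2 = sum(b * e for b, e in zip(x2, residuals))
print(b1_hat, b2_hat, check1, check2)
```

Both check values should be zero up to floating-point rounding. For larger models, a library least squares routine would normally replace the hand-written Cramer's rule.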

Log-Likelihood Function

The log-likelihood function is a transformation of the likelihood function that simplifies the process of maximization. Since the likelihood function can be cumbersome due to its multiplicative nature, especially with normally distributed variables, taking the natural logarithm helps by turning products into sums.

In this exercise, the log-likelihood function $\ln L$ for our multiple regression model, given the normal distribution assumption for the error terms, is:
\[ \ln L = -\frac{n}{2} \ln(2\pi) - \frac{n}{2} \ln(\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left(Y_{i} - (\beta_{1} X_{i1} + \beta_{2} X_{i2})\right)^2 \]

By differentiating this log-likelihood function with respect to each parameter ($\beta_{1}$ and $\beta_{2}$), setting the derivatives to zero, and solving, we find the maximum likelihood estimators. Under the normality assumption, these estimators are the same as the least squares estimators. The equivalence holds because, for any fixed $\sigma^2$, only the sum-of-squares term of $\ln L$ depends on $\beta_{1}$ and $\beta_{2}$, so maximizing $\ln L$ is the same as minimizing RSS.
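The link between the log-likelihood and RSS can be illustrated numerically. The sketch below evaluates $\ln L$ at two arbitrary coefficient pairs for a fixed $\sigma^2$ and confirms that the difference in $\ln L$ equals $-1/(2\sigma^2)$ times the difference in RSS. The data, the value `sigma2 = 2.0`, and the candidate pairs are all assumptions made for illustration.

```python
import math

# Hypothetical data, chosen only for illustration (not from the exercise).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [8.1, 6.9, 18.2, 16.8, 28.1]


def rss(b1, b2):
    """Residual sum of squares for the no-intercept, two-predictor model."""
    return sum((yi - b1 * a - b2 * b) ** 2 for a, b, yi in zip(x1, x2, y))


def log_likelihood(b1, b2, sigma2):
    """Normal log-likelihood ln L for given coefficients and error variance."""
    n = len(y)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * n * math.log(sigma2)
            - rss(b1, b2) / (2 * sigma2))


sigma2 = 2.0          # assumed fixed error variance
cand_a = (1.8, 3.1)   # arbitrary candidate coefficients
cand_b = (1.0, 1.0)   # a second arbitrary candidate

diff_loglik = log_likelihood(*cand_a, sigma2) - log_likelihood(*cand_b, sigma2)
diff_rss = rss(*cand_a) - rss(*cand_b)

# The terms not involving the coefficients cancel in the difference.
print(diff_loglik, -diff_rss / (2 * sigma2))
```

The two printed numbers should match up to rounding, so whichever candidate has the smaller RSS also has the larger log-likelihood, regardless of the value chosen for `sigma2`.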
