Problem 5

Question

QR factorization and the hat matrix. Using the QR factorization defined in Appendix A.13, show that \(\mathbf{H}=\mathbf{Q Q}'\). Hence, if \(\mathbf{q}_{i}\) is the \(i\)th row of \(\mathbf{Q}\), \[ h_{i i}=\mathbf{q}_{i}^{\prime} \mathbf{q}_{i} \quad h_{i j}=\mathbf{q}_{i}^{\prime} \mathbf{q}_{j} \] This means that if the QR factorization of \(\mathbf{X}\) has been computed, \(h_{i i}\) is the sum of squares of the elements of \(\mathbf{q}_{i}\), and the less frequently used off-diagonal elements \(h_{i j}\) are the sums of products of the elements of \(\mathbf{q}_{i}\) and \(\mathbf{q}_{j}\).

Step-by-Step Solution

Answer
In summary, we showed that the hat matrix \(\mathbf{H}\) can be expressed as \(\mathbf{H} = \mathbf{Q Q}'\), where \(\mathbf{Q}\) is the matrix with orthonormal columns from the QR factorization \(\mathbf{X} = \mathbf{Q R}\). Consequently, the elements of the hat matrix can be computed from the rows of \(\mathbf{Q}\): \(h_{ii} = \mathbf{q}'_i \mathbf{q}_i\) and \(h_{ij} = \mathbf{q}'_i \mathbf{q}_j\), where \(\mathbf{q}_i\) denotes the \(i\)-th row of \(\mathbf{Q}\) written as a column vector.
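Step 1: Understand the QR factorization
Let \(\mathbf{X}\) be an \(n \times p\) matrix of full column rank, so that \((\mathbf{X}' \mathbf{X})^{-1}\) exists. The QR factorization writes \(\mathbf{X} = \mathbf{Q R}\), where \(\mathbf{Q}\) is an \(n \times p\) matrix with orthonormal columns, so that \(\mathbf{Q}' \mathbf{Q} = \mathbf{I}\), and \(\mathbf{R}\) is a \(p \times p\) upper triangular matrix. Because \(\mathbf{X}\) has full column rank, \(\mathbf{R}\) is invertible.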
Step 2: Express the hat matrix in terms of Q
The hat matrix \(\mathbf{H}\) is defined as \(\mathbf{H} = \mathbf{X}(\mathbf{X}' \mathbf{X})^{-1} \mathbf{X}'\). Substituting \(\mathbf{X} = \mathbf{Q R}\) into this formula gives: \[ \mathbf{H} = (\mathbf{Q R})((\mathbf{Q R})' (\mathbf{Q R}))^{-1} (\mathbf{Q R})' \] Now simplify, using \((\mathbf{Q R})' = \mathbf{R}' \mathbf{Q}'\) and the fact that the columns of \(\mathbf{Q}\) are orthonormal, so \(\mathbf{Q}' \mathbf{Q} = \mathbf{I}\): \[ \mathbf{H} = (\mathbf{Q R})(\mathbf{R}' \mathbf{Q}' \mathbf{Q R})^{-1} (\mathbf{R}' \mathbf{Q}') \] \[ \mathbf{H} = (\mathbf{Q R})(\mathbf{R}' \mathbf{I R})^{-1} (\mathbf{R}' \mathbf{Q}') \] \[ \mathbf{H} = (\mathbf{Q R})(\mathbf{R}' \mathbf{R})^{-1} (\mathbf{R}' \mathbf{Q}') \] This expression is the product of three matrices: \(\mathbf{Q}\), \(\mathbf{R} (\mathbf{R}' \mathbf{R})^{-1} \mathbf{R}'\), and \(\mathbf{Q}'\). Because \(\mathbf{R}\) is square and invertible (it is \(p \times p\) and \(\mathbf{X}\) has full column rank), \((\mathbf{R}' \mathbf{R})^{-1} = \mathbf{R}^{-1} (\mathbf{R}')^{-1}\), so the middle matrix reduces to the identity: \[ \mathbf{R} (\mathbf{R}' \mathbf{R})^{-1} \mathbf{R}' = \mathbf{R} \mathbf{R}^{-1} (\mathbf{R}')^{-1} \mathbf{R}' = \mathbf{I} \] Therefore: \[ \mathbf{H} = \mathbf{Q Q}' \] Note that \(\mathbf{Q Q}'\) is an \(n \times n\) matrix and is not the identity when \(n > p\): the columns of \(\mathbf{Q}\) are orthonormal, but its rows are not.
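The identity derived above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the small design matrix `X` is hypothetical data made up for illustration, and `np.linalg.qr` with `mode="reduced"` is used to obtain the \(n \times p\) factor \(\mathbf{Q}\).

```python
import numpy as np

# Hypothetical design matrix (assumption, not from the text):
# 6 observations, an intercept column and two predictors.
X = np.array([[1, 1, 2],
              [1, 2, 1],
              [1, 3, 5],
              [1, 4, 3],
              [1, 5, 8],
              [1, 6, 4]], dtype=float)

# Reduced QR: Q is 6 x 3 with orthonormal columns, R is 3 x 3 upper triangular.
Q, R = np.linalg.qr(X, mode="reduced")

# Hat matrix from its definition, X (X'X)^{-1} X'.
H_def = X @ np.linalg.inv(X.T @ X) @ X.T

# Hat matrix from the QR factor, Q Q'.
H_qr = Q @ Q.T

# The two constructions should agree up to round-off.
print(np.allclose(H_def, H_qr))
```

Forming the explicit inverse of \(\mathbf{X}'\mathbf{X}\) is done here only to mirror the definition; the QR route avoids it.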
Step 3: Compute the elements of the hat matrix
The \((i, j)\) element of a product \(\mathbf{Q Q}'\) is the inner product of row \(i\) of \(\mathbf{Q}\) with row \(j\) of \(\mathbf{Q}\). Writing \(\mathbf{q}_i\) for the \(i\)-th row of \(\mathbf{Q}\) as a column vector, we therefore have for \(h_{ii}\): \[ h_{ii} = \mathbf{q}'_i \mathbf{q}_i \] and for \(h_{ij}\): \[ h_{ij} = \mathbf{q}'_i \mathbf{q}_j \] Thus, once the QR factorization of \(\mathbf{X}\) is available, \(h_{ii}\) is the sum of squares of the elements of \(\mathbf{q}_i\), and \(h_{ij}\) is the sum of products of the corresponding elements of \(\mathbf{q}_i\) and \(\mathbf{q}_j\).
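As an illustration of the element formulas, the sketch below computes the diagonal elements as row sums of squares of \(\mathbf{Q}\) and one off-diagonal element as a row inner product, without forming \(\mathbf{H}\) first. It assumes NumPy; the matrix `X` and the names `leverages` and `h_01` are hypothetical choices made for this example.

```python
import numpy as np

# Hypothetical design matrix (assumption): intercept plus two predictors.
X = np.array([[1, 1, 2],
              [1, 2, 1],
              [1, 3, 5],
              [1, 4, 3],
              [1, 5, 8],
              [1, 6, 4]], dtype=float)

Q, _ = np.linalg.qr(X, mode="reduced")

# h_ii = q_i' q_i: sum of squares of the elements in row i of Q.
leverages = np.sum(Q**2, axis=1)

# h_ij = q_i' q_j: sum of products of rows i and j (here i = 0, j = 1).
h_01 = Q[0] @ Q[1]

# Full hat matrix, formed only to compare against the row-wise formulas.
H = Q @ Q.T
print(np.allclose(leverages, np.diag(H)))
print(np.isclose(h_01, H[0, 1]))
```

Computing only the row sums of squares needs \(O(np)\) storage, whereas the full matrix \(\mathbf{H}\) is \(n \times n\).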

Key Concepts

Hat Matrix
The concept of a hat matrix might sound a bit whimsical, but it is a fundamental tool in linear regression analysis. The hat matrix, often denoted as \(\mathbf{H}\), projects the vector of observed responses \(\mathbf{y}\) orthogonally onto the column space of \(\mathbf{X}\), which is where the fitted values live. With the QR factorization, the hat matrix can be represented as \(\mathbf{H} = \mathbf{Q} \mathbf{Q}'\), so it is determined entirely by the matrix \(\mathbf{Q}\) obtained during the factorization.

The significance of the hat matrix lies in its role in computing the fitted values: \(\hat{\mathbf{y}} = \mathbf{H} \mathbf{y}\), so \(\mathbf{H}\) transforms the vector of observed responses \(\mathbf{y}\) into their fitted counterparts.
  • The diagonal element \(h_{ii}\), called the leverage, measures the influence of the \(i\)-th observed response on its own fitted value.
  • The less frequently used off-diagonal element \(h_{ij}\) is the weight that the \(j\)-th observed response receives in the \(i\)-th fitted value, so it shows how much one observation influences the fit at another.
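The role of \(\mathbf{H}\) in producing fitted values can be sketched as follows. This is an illustrative example, assuming NumPy; the data `X` and `y` are hypothetical, and `np.linalg.lstsq` is used only as an independent reference fit.

```python
import numpy as np

# Hypothetical data (assumption): 6 observations, intercept plus two predictors.
X = np.array([[1, 1, 2],
              [1, 2, 1],
              [1, 3, 5],
              [1, 4, 3],
              [1, 5, 8],
              [1, 6, 4]], dtype=float)
y = np.array([2.1, 2.9, 6.2, 5.8, 9.9, 8.1])

Q, _ = np.linalg.qr(X, mode="reduced")

# Fitted values y_hat = H y = Q (Q' y), computed without forming the n x n matrix H.
y_hat = Q @ (Q.T @ y)

# Reference fitted values from NumPy's least squares solver.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(y_hat, X @ beta))

# Row i of H holds the weights that the observed responses receive in fitted value i.
H = Q @ Q.T
print(np.isclose(H[0] @ y, y_hat[0]))
```

Grouping the product as `Q @ (Q.T @ y)` keeps the work at two matrix-vector products, which is one reason the explicit \(\mathbf{H}\) is rarely needed in practice.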
Orthogonal Matrix
Orthogonal matrices are central to the QR factorization process. A square orthogonal matrix \(\mathbf{Q}\) has the property that its transpose is also its inverse, i.e., \(\mathbf{Q}' \mathbf{Q} = \mathbf{Q} \mathbf{Q}' = \mathbf{I}\), where \(\mathbf{I}\) is the identity matrix. In the QR factorization of an \(n \times p\) matrix \(\mathbf{X}\) with \(n > p\), the factor \(\mathbf{Q}\) is \(n \times p\) with orthonormal columns, so only \(\mathbf{Q}' \mathbf{Q} = \mathbf{I}\) holds; \(\mathbf{Q} \mathbf{Q}'\) is a projection matrix, not the identity.
  • Multiplying a vector by a matrix with orthonormal columns preserves the vector's length, which helps keep numerical computations stable.
  • Such transformations also preserve the angles between vectors, so they change only the orientation of the data, not its geometry.
In the QR factorization, the columns of \(\mathbf{Q}\) form an orthonormal basis for the column space of \(\mathbf{X}\), and \(\mathbf{R}\) records how the columns of \(\mathbf{X}\) are expressed in that basis. Working in an orthonormal basis makes the computations more resilient to numerical errors.

Understanding orthogonal matrices helps in comprehending why the hat matrix \(\mathbf{H}\) ends up being \(\mathbf{Q} \mathbf{Q'}\) in its composition.
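The distinction between \(\mathbf{Q}'\mathbf{Q}\) and \(\mathbf{Q}\mathbf{Q}'\) for a non-square \(\mathbf{Q}\) is easy to see numerically. The sketch below assumes NumPy; the matrix `X` and the test vector `v` are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical design matrix (assumption): n = 6 rows, p = 3 columns.
X = np.array([[1, 1, 2],
              [1, 2, 1],
              [1, 3, 5],
              [1, 4, 3],
              [1, 5, 8],
              [1, 6, 4]], dtype=float)

Q, _ = np.linalg.qr(X, mode="reduced")
n, p = Q.shape

# Columns of Q are orthonormal, so Q'Q is the p x p identity.
cols_orthonormal = np.allclose(Q.T @ Q, np.eye(p))

# Q Q' is n x n with rank p < n, so it cannot equal the n x n identity.
rows_orthonormal = np.allclose(Q @ Q.T, np.eye(n))

# Multiplying by Q preserves the length of a vector in R^p.
v = np.array([1.0, -2.0, 0.5])
length_preserved = np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v))

print(cols_orthonormal, rows_orthonormal, length_preserved)
```

This is why the derivation of \(\mathbf{H} = \mathbf{Q}\mathbf{Q}'\) may use \(\mathbf{Q}'\mathbf{Q} = \mathbf{I}\) but cannot simplify \(\mathbf{Q}\mathbf{Q}'\) any further.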
Matrix Decomposition
Matrix decomposition is a technique used to simplify matrix operations by breaking a matrix into simpler, constituent matrices. In the context of QR factorization, the original matrix \(\mathbf{X}\) is decomposed into a matrix \(\mathbf{Q}\) with orthonormal columns and an upper triangular matrix \(\mathbf{R}\), such that \(\mathbf{X} = \mathbf{Q} \mathbf{R}\). This technique makes complex matrix computations more manageable and efficient.
  • This decomposition helps in solving linear systems and least squares problems more effectively, which is crucial in statistical methods and numerical linear algebra.
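  • QR factorization is particularly advantageous because it avoids forming \(\mathbf{X}' \mathbf{X}\), whose condition number is the square of that of \(\mathbf{X}\), and so limits the amplification of round-off errors.
Within the framework of linear regression, substituting \(\mathbf{X} = \mathbf{Q} \mathbf{R}\) into the least squares solution \((\mathbf{X}' \mathbf{X})^{-1} \mathbf{X}' \mathbf{y}\) reduces it to \(\mathbf{R}^{-1} \mathbf{Q}' \mathbf{y}\), so the coefficients are obtained by solving the triangular system \(\mathbf{R} \hat{\boldsymbol{\beta}} = \mathbf{Q}' \mathbf{y}\) by back substitution.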
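A minimal sketch of computing regression coefficients through the QR factors, assuming NumPy, might look like this. The data `X` and `y` are hypothetical; `np.linalg.solve` is a general-purpose solver used here for simplicity, whereas a dedicated back-substitution routine would exploit the triangular structure of \(\mathbf{R}\).

```python
import numpy as np

# Hypothetical data (assumption): 6 observations, intercept plus two predictors.
X = np.array([[1, 1, 2],
              [1, 2, 1],
              [1, 3, 5],
              [1, 4, 3],
              [1, 5, 8],
              [1, 6, 4]], dtype=float)
y = np.array([2.1, 2.9, 6.2, 5.8, 9.9, 8.1])

Q, R = np.linalg.qr(X, mode="reduced")

# Coefficients from the triangular system R beta = Q'y.
beta_qr = np.linalg.solve(R, Q.T @ y)

# Reference coefficients from NumPy's least squares solver.
beta_ref, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_qr, beta_ref))

# Forming X'X squares the (2-norm) condition number of X.
cond_X = np.linalg.cond(X)
cond_XtX = np.linalg.cond(X.T @ X)
print(cond_X**2, cond_XtX)
```

The two printed condition numbers should agree closely, which illustrates why working with \(\mathbf{Q}\) and \(\mathbf{R}\) directly is preferred over the normal equations for ill-conditioned problems.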

When you think of matrix decomposition, consider it as dismantling a complex object into more manageable parts to reach a solution efficiently and accurately.