Matrix Decompositions

Breaking down the essentials of matrix decompositions, explaining how they are used in machine learning and why determinants, traces, eigenvalues, and eigenvectors enable us to draw better decision boundaries when designing our machine learning models.

Matrix factorization

Also known as matrix decomposition, this is the process of describing a matrix by just a few numbers (or factors) that characterize its overall properties.

Determinants

  • What is it?
    • A determinant maps a (square) matrix to a real number.
    • For a (square) matrix $\textbf{A} \in \mathbb{R}^{n \times n}$, the determinant is written as $\text{det}(\textbf{A})$.
    • For example, for $\textbf{A} \in \mathbb{R}^{2 \times 2}$, the determinant is:
      • $\text{det}(\textbf{A}) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$
  • What does it tell us?
    • For any square matrix $\textbf{A} \in \mathbb{R}^{n \times n}$, it holds that $\textbf{A}$ is invertible if and only if $\text{det}(\textbf{A}) \neq 0$.
    • It tells us how the linear transformation represented by the matrix changes area (in 2D) or volume (in higher dimensions).
      • Because the determinant measures area/volume, if the column vectors are linearly dependent, the area they span is zero and so is the determinant. Conversely, if the column vectors are linearly independent, the determinant is non-zero.
      • Swapping two columns of a matrix reverses the orientation, which flips the sign of the determinant. This can have an impact on our machine learning models and their decision boundaries.
        • Example: For the matrix $\begin{bmatrix} b & 0 \\ 0 & g \end{bmatrix}$, the determinant is $bg - 0 = bg$. If you swap the columns to get $\begin{bmatrix} 0 & b \\ g & 0 \end{bmatrix}$, the determinant becomes $0 - gb = -gb$. The magnitude is unchanged, but the sign flips, which shows that the orientation is reversed.
      • The sign of the determinant indicates the orientation of the spanning vectors with respect to the standard basis.
      • The determinant acts as a function that measures the signed volume of the parallelepiped spanned by the column vectors of a matrix.
    • Determinant as Scaling Factor:
      • If the determinant of a matrix is positive, it indicates that the transformation scales volumes by a positive factor. In other words, the transformation expands or contracts space without flipping it.
      • If the determinant is negative, the transformation includes a reflection. Volumes are scaled by the absolute value of the determinant, and the negative sign indicates a flip, i.e. an orientation reversal.
  • How to compute determinants
    • $2 \times 2$ matrix
    • $3 \times 3$ matrix
      • Laplace's expansion
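
The points above can be illustrated with a minimal sketch in plain Python. It assumes matrices are stored as lists of rows and uses Laplace expansion along the first row; the helper names (`minor`, `det`, `swap_columns`) are hypothetical and chosen only for this example. Laplace expansion is practical for small matrices only, since its cost grows factorially with the matrix size.

```python
def minor(matrix, row, col):
    """Return the submatrix with the given row and column removed."""
    return [
        [value for j, value in enumerate(r) if j != col]
        for i, r in enumerate(matrix)
        if i != row
    ]


def det(matrix):
    """Determinant via Laplace expansion along the first row."""
    n = len(matrix)
    if n == 0 or any(len(r) != n for r in matrix):
        raise ValueError("expected a non-empty square matrix")
    if n == 1:
        return matrix[0][0]
    if n == 2:
        # a11 * a22 - a12 * a21
        return matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]
    # Alternate signs across the first row, recursing on each minor.
    return sum(
        (-1) ** j * matrix[0][j] * det(minor(matrix, 0, j))
        for j in range(n)
    )


def swap_columns(matrix, a, b):
    """Return a copy of the matrix with columns a and b swapped."""
    out = [list(r) for r in matrix]
    for r in out:
        r[a], r[b] = r[b], r[a]
    return out


D = [[2, 0], [0, 3]]                      # the b, g example with b = 2, g = 3
print(det(D))                             # 6
print(det(swap_columns(D, 0, 1)))         # -6: same magnitude, flipped sign
print(det([[1, 2], [2, 4]]))              # 0: columns are linearly dependent
print(det([[1, 2, 3], [0, 1, 4], [5, 6, 0]]))  # 1
```

The second and third results mirror the bullets above: swapping columns flips the sign, and linearly dependent columns give a determinant of zero, so the matrix is not invertible.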

Trace

  • What is a trace?
    • The trace of a square matrix $\textbf{A} \in \mathbb{R}^{n \times n}$ is the sum of its diagonal elements:
    • $\text{tr}(\textbf{A}) := \sum_{i=1}^{n} a_{ii}$
    • An important property is that the trace is invariant under cyclic permutations, i.e. $\text{tr}(\textbf{AKL}) = \text{tr}(\textbf{KLA})$.
    • The trace of a linear mapping is independent of the basis (whereas the matrix representation of a linear mapping usually depends on the basis).
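
A small numerical check can make these trace properties concrete. The sketch below is plain Python with matrices as lists of rows; the helpers `trace` and `matmul` and the example matrices (including the change-of-basis matrix `S` and its inverse) are assumptions made up for illustration.

```python
def trace(matrix):
    """Sum of the diagonal elements of a square matrix."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


A = [[1, 2], [3, 4]]
K = [[0, 1], [1, 1]]
L = [[2, 0], [1, 3]]

# Cyclic permutation: tr(AKL) == tr(KLA)
print(trace(matmul(matmul(A, K), L)))  # 28
print(trace(matmul(matmul(K, L), A)))  # 28

# Basis independence: tr(S^-1 A S) == tr(A) for an invertible S.
S = [[1, 1], [0, 1]]
S_inv = [[1, -1], [0, 1]]              # inverse of S, written out by hand
B = matmul(matmul(S_inv, A), S)        # same mapping in a different basis
print(B)                               # [[-2, -4], [3, 7]]
print(trace(A), trace(B))              # 5 5
```

The matrices `A` and `B` look different, yet their traces agree, which is what independence of the basis means here.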

Characteristic Polynomial

  • The characteristic polynomial of a square matrix is the polynomial in $\lambda$ obtained by subtracting $\lambda$ times the identity matrix from the matrix itself and taking the determinant of the result. The roots of this polynomial are the eigenvalues of the matrix.
  • For $\textbf{A} \in \mathbb{R}^{n \times n}$ and $\lambda \in \mathbb{R}$, it is written as follows:
    1. $p_{\textbf{A}}(\lambda) := \text{det}(\textbf{A} - \lambda\textbf{I}) = c_{0} + c_{1}\lambda + c_{2}\lambda^{2} + \dots + c_{n-1}\lambda^{n-1} + (-1)^{n}\lambda^{n}$
    2. Here the coefficients satisfy $c_{0}, \dots, c_{n-1} \in \mathbb{R}$. Specifically:
    3. $c_{0} = \text{det}(\textbf{A}), \quad c_{n-1} = (-1)^{n-1}\text{tr}(\textbf{A})$
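
For a $2 \times 2$ matrix the formula above reduces to $p_{\textbf{A}}(\lambda) = \text{det}(\textbf{A}) - \text{tr}(\textbf{A})\lambda + \lambda^{2}$, so the eigenvalues can be found with the quadratic formula. The following is a minimal sketch of that special case only; the function names and the example matrix are assumptions for illustration, and complex roots are simply reported as "no real eigenvalues".

```python
import math


def char_poly_2x2(A):
    """Coefficients (c0, c1, c2) of p(lam) = c0 + c1*lam + c2*lam**2."""
    (a, b), (c, d) = A
    det_A = a * d - b * c
    tr_A = a + d
    # c0 = det(A), c1 = (-1)^(n-1) * tr(A) = -tr(A), leading term (-1)^n = 1
    return det_A, -tr_A, 1


def real_eigenvalues_2x2(A):
    """Real roots of the characteristic polynomial, in ascending order."""
    c0, c1, _ = char_poly_2x2(A)
    discriminant = c1 * c1 - 4 * c0
    if discriminant < 0:
        return []  # roots are complex, so there are no real eigenvalues
    root = math.sqrt(discriminant)
    return [(-c1 - root) / 2, (-c1 + root) / 2]


A = [[4, 2], [1, 3]]
print(char_poly_2x2(A))         # (10, -7, 1), i.e. 10 - 7*lam + lam**2
print(real_eigenvalues_2x2(A))  # [2.0, 5.0]
```

As a sanity check, the two eigenvalues sum to the trace (7) and multiply to the determinant (10).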

Eigenvalues & Eigenvectors

  • An eigenvalue of a linear mapping tells us how a special set of vectors, the eigenvectors, is transformed by the linear mapping.
  • An eigenvector is