Analytic Geometry

Inner products

Within a vector space or subspace, we want to give vectors a geometric interpretation: each vector has a length, and any two vectors have an angle between them and relative lengths we can compare. To do so, we introduce inner products and use them primarily to determine whether vectors are orthogonal (perpendicular) to each other.

  • Norm → A norm is a function on a vector space that assigns a length to each vector (the familiar Euclidean length follows from the principles of the Pythagorean theorem). To be a norm, the function must satisfy the following three properties:
    • Absolutely homogeneous → Scaling a vector scales its norm by the absolute value of the scalar: $\|\lambda \mathbf{x}\| = |\lambda| \, \|\mathbf{x}\|$.
    • Triangle inequality → The sum of the lengths of any two sides of a triangle must be greater than or equal to the length of the remaining side: $\|\mathbf{x} + \mathbf{y}\| \leq \|\mathbf{x}\| + \|\mathbf{y}\|$.
    • Positive definite → The norm of any vector is non-negative, and it is zero only for the zero vector: $\|\mathbf{x}\| \geq 0$ and $\|\mathbf{x}\| = 0 \Leftrightarrow \mathbf{x} = \mathbf{0}$.
  • Manhattan norm → The sum of the absolute values of a vector's components, $\|\mathbf{x}\|_1 = \sum_{i=1}^{n} |x_i|$. As a distance between two vectors, it is the sum of the absolute differences along each dimension. It’s also known as the $L_1$ norm.
  • Euclidean norm → The straight-line distance of a vector from the origin, calculated by taking the square root of the sum of the squares of its components: $\|\mathbf{x}\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}$. Also known as the $L_2$ norm.
  • The main difference between the Manhattan norm and the Euclidean norm is that the former is like the length of a path along city blocks, while the latter is the “as-the-crow-flies” distance.
  • We can also write the length of a vector using the dot product: the norm is the square root of the dot product of the vector with itself, $\|\mathbf{x}\| = \sqrt{\mathbf{x}^{\top}\mathbf{x}} = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}$.
    • Two vectors are orthogonal (perpendicular) if their dot product equals 0.
  • Metric → Also known as the distance function, this is written as $d(\mathbf{u}, \mathbf{v})$ and is the measure of the distance between two vectors. This comes in handy in differential geometry when looking to understand how something traverses n-dimensional space across various coordinate systems.
    • d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|
  • A bilinear mapping $\Omega$ that takes two vectors is symmetric if the following holds:
    1. \Omega (\mathbf{x}, \mathbf{y}) = \Omega (\mathbf{y}, \mathbf{x}) \textnormal{ for all }\mathbf{x}, \mathbf{y}\in V
    2. However, when the mapping is an inner product, we don’t write $\Omega (\mathbf{x}, \mathbf{y})$ but instead use $\langle \mathbf{x}, \mathbf{y} \rangle$.
  • Inner product space → Written as $(V, \langle \cdot , \cdot \rangle)$, this is a real vector space equipped with an inner product. If we use the dot product as the inner product, we call it a Euclidean vector space.
  • Because we can reinterpret any vector as a linear combination of basis vectors (multiplying scalar values with the basis vectors), we can break down the inner product in the following manner, where $A_{ij} := \langle \mathbf{b}_{i}, \mathbf{b}_{j} \rangle$ and where $\hat{\mathbf{x}}, \hat{\mathbf{y}}$ are the coordinates of $\mathbf{x}$ and $\mathbf{y}$ with respect to the basis $B$:
\langle \mathbf{x}, \mathbf{y}\rangle = \Big\langle \sum^{n}_{i=1}\psi_{i} \mathbf{b}_{i}, \sum^{n}_{j=1}\lambda_{j} \mathbf{b}_{j} \Big\rangle = \sum^{n}_{i=1}\sum^{n}_{j=1}\psi_{i}\langle \mathbf{b}_{i}, \mathbf{b}_{j}\rangle\lambda_{j} = \hat{\mathbf{x}}^{\top}\mathbf{A}\hat{\mathbf{y}}
  • Any inner product induces a norm via $\|\mathbf{x}\| := \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}$.
  • Inner products also capture the geometry of a vector space by giving the angle between two vectors. We use $\omega$ to denote the angle, which is measured in radians and lies in $[0, \pi]$.
    • Therefore, the angle $\omega$ between two vectors would be represented as follows:
    • \cos\omega = \frac{\langle \mathbf{x}, \mathbf{y} \rangle}{\|\mathbf{x}\| \, \|\mathbf{y}\|}
    • The angle is important because it tells us how similar the orientations of two vectors are. For example, if $\mathbf{y}$ is a scaled version of $\mathbf{x}$ (like $\mathbf{y} = 4\mathbf{x}$), then it has the same orientation because it does not point in any new direction; in that case $\cos\omega = 1$ and $\omega = 0$. A numeric sketch of these ideas follows this list.
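To make these ideas concrete, here is a minimal sketch (assuming NumPy; the vectors are arbitrary examples) of the Manhattan and Euclidean norms, the dot product as an inner product, and the angle formula above:

```python
import numpy as np

# Two example vectors (arbitrary choices for illustration)
x = np.array([1.0, 2.0, 2.0])
y = np.array([4.0, 8.0, 8.0])   # y = 4x, so it has the same orientation as x

# Manhattan (L1) and Euclidean (L2) norms
l1 = np.sum(np.abs(x))          # ||x||_1 = 5
l2 = np.sqrt(np.dot(x, x))      # ||x||_2 = 3, same as np.linalg.norm(x)

# Angle between x and y via cos(omega) = <x, y> / (||x|| ||y||)
cos_omega = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))   # 0 radians: same orientation

print(l1, l2, omega)
```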

Orthogonality

  • Orthogonality → Two vectors $\mathbf{x}$ and $\mathbf{y}$ are orthogonal if and only if $\langle \mathbf{x}, \mathbf{y} \rangle = 0$, which is written as $\mathbf{x} \perp \mathbf{y}$. What this definition tells us is that the 0-vector is orthogonal to every vector in the vector space. If we also have the condition that $\|\mathbf{x}\| = 1 = \|\mathbf{y}\|$, which means the vectors are unit vectors, we say that $\mathbf{x}$ and $\mathbf{y}$ are orthonormal.
    • A square matrix is an orthogonal matrix if and only if its columns are orthonormal. So $\textbf{A}\textbf{A}^{\top} = \textbf{I} = \textbf{A}^{\top}\textbf{A}$, which implies that $\textbf{A}^{-1} = \textbf{A}^{\top}$. This means that the inverse can be obtained by simply transposing the matrix. Under transformations with orthogonal matrices, the length of a vector is not changed. So for a vector $\mathbf{x}$ and an orthogonal matrix $\textbf{A}$, we have the following (see the sketch after this list for a numeric check):
    • \|\textbf{A}\mathbf{x}\|^{2} = (\textbf{A}\mathbf{x})^{\top}(\textbf{A}\mathbf{x}) = \mathbf{x}^{\top}\textbf{A}^{\top}\textbf{A}\mathbf{x} = \mathbf{x}^{\top}\textbf{I}\mathbf{x} = \mathbf{x}^{\top}\mathbf{x} = \|\mathbf{x}\|^{2}
    • Orthogonal matrices define transformations that are rotations (possibly combined with reflections), which makes sense considering the vector lengths and the angles between vectors remain the same.
  • Orthogonality also extends from vectors to subspaces of a vector space.
    • Let’s assume that $V$ is a vector space with $D$ dimensions, and $U$ is an $M$-dimensional subspace of $V$, represented as $U \subseteq V$.
    • The orthogonal complement $U^{\perp}$ is a $(D - M)$-dimensional subspace of $V$ and contains all vectors in $V$ that are orthogonal to every vector in $U$.
    • $U \cap U^{\perp} = \{\mathbf{0}\}$ ($\cap$ is a set intersection), which means that any vector $\mathbf{x} \in V$ can be uniquely decomposed into:
    • \mathbf{x} = \sum^{M}_{m=1}\lambda_{m}\mathbf{b}_{m} + \sum^{D-M}_{j=1}\psi_{j}\mathbf{b}^{\perp}_{j}, \;\;\; \lambda_{m}, \psi_{j} \in \mathbb{R}
    • where $(\mathbf{b}_{1}, ..., \mathbf{b}_{M})$ is a basis of $U$ and $(\mathbf{b}^{\perp}_{1}, ..., \mathbf{b}^{\perp}_{D-M})$ is a basis of $U^{\perp}$.
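As a numeric check of the orthogonal-matrix properties above, here is a minimal sketch (assuming NumPy and using a 2D rotation matrix as the orthogonal matrix):

```python
import numpy as np

# A rotation matrix is orthogonal: its columns are orthonormal
theta = np.pi / 3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([3.0, -1.0])   # arbitrary example vector

# Q^T Q = I, so the inverse is just the transpose
print(np.allclose(Q.T @ Q, np.eye(2)))        # True
print(np.allclose(np.linalg.inv(Q), Q.T))     # True

# Vector lengths are preserved under the transformation
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # True
```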

Dot products

Also known as the “scalar product”, the dot product of two vectors in $\mathbb{R}^n$ is defined as follows:

x^{\top}y = \sum^{n}_{i=1}x_{i}y_{i}

When multiplying matrices together, each entry of the product is computed with the dot product. We need to ensure that the number of columns in the first matrix equals the number of rows in the second matrix. Ordering here matters because matrix multiplication is not commutative (AB ≠ BA), so flipping the two before taking the product would generally give you an entirely different answer. This would be expressed as follows for matrices A and B.

\textbf{A}\in \mathbb{R}^{m \times n}, \; \textbf{B}\in \mathbb{R}^{n \times k}

With i indexing the row and j the column, we can compute the product of A and B as follows:

\textbf{C} = \textbf{AB}\in \mathbb{R}^{m \times k}

We would then compute this as follows:

c_{ij} = \sum^{n}_{l = 1}a_{il}b_{lj}

What this effectively tells us is that we multiply each element of a row in the first matrix by the corresponding element of a column in the second matrix, proceeding along the row and down the column, and sum those products to produce a single entry of the new matrix C.

For example, to produce the first entry $c_{11}$, we multiply the elements of the first row of A pairwise against the elements of the first column of B and sum them up. We then continue with the remaining columns of B until we have produced the first row of C, and proceed onwards through the remaining rows of A until every entry of C is filled in.
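Below is a minimal sketch (the example matrices are arbitrary, and NumPy is used only to double-check the result) that computes each entry $c_{ij}$ exactly as the summation above describes:

```python
import numpy as np

# Arbitrary example matrices: A is 2 x 3, B is 3 x 2
A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[ 7,  8],
              [ 9, 10],
              [11, 12]])

m, n = A.shape
n2, k = B.shape
assert n == n2, "columns of A must match rows of B"

# c_ij = sum over l of a_il * b_lj
C = np.zeros((m, k), dtype=A.dtype)
for i in range(m):
    for j in range(k):
        C[i, j] = sum(A[i, l] * B[l, j] for l in range(n))

print(C)                      # [[ 58  64]
                              #  [139 154]]
print(np.allclose(C, A @ B))  # True: matches NumPy's built-in matrix product
```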

Identity matrix

The identity matrix is a matrix with ones running from the upper-left element diagonally down to the bottom-right element, and zeroes elsewhere. Multiplying by it is akin to multiplying a number by 1: it leaves the matrix unchanged. It is indicated as follows.

\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
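A quick check of this property (a minimal sketch assuming NumPy; the matrix A is an arbitrary example):

```python
import numpy as np

I = np.eye(3)                        # 3 x 3 identity matrix
A = np.array([[1., 2., 1.],
              [4., 4., 5.],
              [6., 7., 7.]])

# Multiplying by the identity leaves A unchanged, like multiplying a number by 1
print(np.allclose(A @ I, A))   # True
print(np.allclose(I @ A, A))   # True
```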

Inverse matrix

For a square matrix $\textbf{A}$, if there exists a matrix $\textbf{B}$ such that multiplying the two in either order gives the identity matrix, we say that $\textbf{A}$ possesses an inverse:

\textbf{AB} = \textbf{BA} = \textbf{I}_n

So we would say that $\textbf{B}$ is the inverse of $\textbf{A}$, and we would indicate this as $\textbf{A}^{-1}$. When this is the case, we would say that $\textbf{A}$ is a regular matrix (or invertible, or nonsingular), while matrices without an inverse are known as singular or noninvertible. Below we have an example of a matrix and its inverse.

\textbf{A} = \begin{bmatrix} 1 & 2 & 1 \\ 4 & 4 & 5 \\ 6 & 7 & 7 \end{bmatrix}, \, \, \textbf{B} = \begin{bmatrix} -7 & -7 & 6 \\ 2 & 1 & -1 \\ 4 & 5 & -4 \end{bmatrix}, \quad \textbf{AB} = \textbf{I} = \textbf{BA}

We can verify that $\textbf{AB} = \textbf{I}$ using the dot product, computing each entry $c_{ij}$ as the dot product of row $i$ of $\textbf{A}$ with column $j$ of $\textbf{B}$:

\text{Row 1} \rightarrow \begin{bmatrix} 1(-7)+2(2)+1(4) = 1 & 1(-7)+2(1)+1(5) = 0 & 1(6)+2(-1)+1(-4) = 0 \end{bmatrix} \\ \text{Row 2} \rightarrow \begin{bmatrix} 4(-7)+4(2)+5(4) = 0 & 4(-7)+4(1)+5(5) = 1 & 4(6)+4(-1)+5(-4) = 0 \end{bmatrix} \\ \text{Row 3} \rightarrow \begin{bmatrix} 6(-7)+7(2)+7(4) = 0 & 6(-7)+7(1)+7(5) = 0 & 6(6)+7(-1)+7(-4) = 1 \end{bmatrix}

Our final product is the identity matrix:

\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

so $\textbf{B} = \textbf{A}^{-1}$. Carrying out the same entry-by-entry computation for $\textbf{BA}$ also yields the identity matrix, confirming that $\textbf{AB} = \textbf{I}_3 = \textbf{BA}$.
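The same verification can be done numerically; here is a minimal sketch (assuming NumPy) that checks B is the inverse of A and compares it with the inverse NumPy computes:

```python
import numpy as np

A = np.array([[1, 2, 1],
              [4, 4, 5],
              [6, 7, 7]], dtype=float)
B = np.array([[-7, -7,  6],
              [ 2,  1, -1],
              [ 4,  5, -4]], dtype=float)

# B is the inverse of A: multiplying in either order gives the identity
print(np.allclose(A @ B, np.eye(3)))        # True
print(np.allclose(B @ A, np.eye(3)))        # True

# NumPy's computed inverse agrees with B
print(np.allclose(np.linalg.inv(A), B))     # True
```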