
Linear Algebra

Basis Space and Basis Vectors

Imagine these in 2D as ‘tiling’ a vector space. Imagine making an even grid with those long pieces from Erector Sets. You can shear/smush them only at certain angles. Now imagine that the long pieces can only stretch or shrink lengthwise. That’s kinda what these are.

Now “applying” a Matrix means that you change this Vector Space and all the vectors you’ve embedded in it in some way. You shrink it, stretch it, rotate it by some angle, flip it inside-out, or just leave it alone! In some cases, you can even change your mind and smash an undo button called the “Inverse” (see Invertible Matrix below).
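To see this in code, here’s a minimal NumPy sketch (NumPy is my choice of tool here, not something these notes prescribe) that rotates, stretches, and then undoes a transform:

```python
import numpy as np

theta = np.pi / 2  # a quarter turn, counter-clockwise
rotate = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
stretch = np.array([[2.0, 0.0],    # double along x, leave y alone
                    [0.0, 1.0]])

v = np.array([1.0, 0.0])
print(rotate @ v)    # ~[0. 1.] -- rotated a quarter turn
print(stretch @ v)   # [2. 0.]  -- stretched along x
print(np.linalg.inv(rotate) @ (rotate @ v))  # ~[1. 0.] -- the "undo button"
```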

Vector Norm

This is a function that takes a vector and maps it to $\mathbb{R}_{\geq 0}$ (only the zero vector gets norm 0) so you get an idea of the ‘size’ or ‘length’ of the vector.

$$\|a\|_p = \left( \sum_{i=1}^{n} |a_i|^p \right)^{1/p}$$

That’s the p-norm. Using that, and for $p = 1, 2, \infty$, you get these norms:

  1. Manhattan $\|a\|_1 = |a_1| + |a_2| + \cdots + |a_n|$
  2. Euclidean $\|a\|_2 = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$
  3. Infinity $\|a\|_\infty = \max_i |a_i|$
    AKA The Fuck It Norm
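If you want to poke at these, here’s a quick NumPy sketch (the vector is made up for illustration):

```python
import numpy as np

a = np.array([3.0, -4.0])
print(np.linalg.norm(a, 1))       # 7.0 -- Manhattan: |3| + |-4|
print(np.linalg.norm(a, 2))       # 5.0 -- Euclidean: sqrt(3^2 + 4^2)
print(np.linalg.norm(a, np.inf))  # 4.0 -- Infinity: max(|3|, |-4|)
```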

TODO: When does one use each?

Eigenvectors (and Eigenvalues)

These are just vectors (of course) but they relate to matrices: an eigenvector $v$ of a matrix $A$ is one that $A$ merely scales, $Av = \lambda v$, and the scale factor $\lambda$ is the eigenvalue.

TODO: Finish this.
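Until then, here’s a minimal NumPy sketch of the $Av = \lambda v$ idea (the diagonal matrix is just a convenient toy example):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
vals, vecs = np.linalg.eig(A)
v = vecs[:, 0]             # eigenvector paired with vals[0]
print(vals)                # [2. 3.]
print(A @ v, vals[0] * v)  # both [2. 0.] -- A just scales v
```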

Dot and Cross Products

Vectors Only Please

Note that Dot and Cross Products are only defined for Vectors. I mean there are things like the Kronecker Product but that’s not what we’re dealing with here.

Dot Products

These are easy-peasy and tell you about how well two vectors vibe with each other. The result is a number. Consider two vectors of the same size, $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$:

$$\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\|\,\|\mathbf{b}\|\cos\theta = \sum_{k=1}^{n} a_k b_k$$

That’s about it. If you get a zero, they’re orthogonal (at $90^\circ$ in 2D space). That cosine is a good similarity measure that’s used in all manner of Machine Learning algos, including LLMs. E.g. recall that $\cos(90^\circ) = 0$, which you can take to mean that the vectors aren’t similar at all.
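Here’s a quick NumPy sketch of both the dot product and the cosine trick (the `cosine_similarity` helper is mine, not a library function):

```python
import numpy as np

def cosine_similarity(u, v):
    # cos(theta) = (u . v) / (||u|| ||v||)
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])        # orthogonal to a
c = np.array([2.0, 0.0])        # same direction as a, different length

print(np.dot(a, b))             # 0.0 -- no vibe at all
print(cosine_similarity(a, c))  # 1.0 -- maximum vibe
```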

Cross Products

These work in 3D for the most part and will give you a new vector that is orthogonal/perpendicular to the plane of the two input vectors (which are 3D!). I’ve never used them for anything. Read this for more.
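For completeness, a tiny NumPy sketch anyway:

```python
import numpy as np

a = np.array([1.0, 0.0, 0.0])      # the x axis
b = np.array([0.0, 1.0, 0.0])      # the y axis
n = np.cross(a, b)
print(n)                           # [0. 0. 1.] -- the z axis
print(np.dot(n, a), np.dot(n, b))  # 0.0 0.0 -- orthogonal to both inputs
```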

Matrix Rank

This is an easy concept but is pretty important downstream. It’s the number of linearly independent rows or columns of a matrix.

When do you pick rows versus columns? It doesn’t actually matter: the number of linearly independent rows always equals the number of linearly independent columns. For a ‘rectangular’ $m \times n$ matrix (always rows $\times$ columns), $\text{rank} \leq \min(m, n)$.

A “Full Rank” matrix is one where there are no linearly dependent (not independent!) rows or columns, i.e. the rank hits that $\min(m, n)$ ceiling. So if you have a matrix that’s 4 rows and 3 columns, the maximum rank possible is 3. Now look at the columns and see if you can figure out whether any column depends on the others. Didn’t find any? Awesome, you have a Full Rank matrix.

Found a column that depends on the others? Your rank is 2. Found two? Rank 1. See this Wikipedia article on Row Echelon Forms for more.
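NumPy will count this for you; a quick sketch (the matrices are made up for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0],
              [1.0, 0.0, 1.0]])
print(np.linalg.matrix_rank(A))  # 3 -- Full Rank for a 4x3 matrix

B = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [3.0, 6.0, 9.0]])  # every column is a multiple of the first
print(np.linalg.matrix_rank(B))  # 1
```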

Identity Matrix

A nice simple square matrix that looks like this.

$$I_n = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}$$

Determinants

This gives you a scalar (boring-ass number) from your matrix. This number tells you by how much applying the matrix will scale an area (2D) or volume (3D+) of the space, and its sign tells you about orientation: if the number’s negative you get a mirror image.
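A minimal NumPy sketch of both behaviors:

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])     # stretch x by 2, y by 3
print(np.linalg.det(A))        # 6.0 -- areas get 6x bigger

flip = np.array([[0.0, 1.0],
                 [1.0, 0.0]])  # swap x and y: a mirror image
print(np.linalg.det(flip))     # -1.0 -- area kept, orientation flipped
```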

TODO: More here…

Commutativity

In general, $AB \ne BA$. You can verify this yourself with two $2 \times 2$ matrices. But there are cases where this holds (sanity-checked in the sketch after this list):

  • $AI = IA = A$
  • $A0 = 0A = 0$ (the zero matrix annihilates everything)
  • If $B = \lambda I$ for some scalar $\lambda$ then $AB = \lambda A = BA$ (i.e. you can scale the Identity Matrix all you want)
  • Diagonal matrices commute with each other
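A quick NumPy sanity check of these rules (matrices made up for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
print(np.allclose(A @ B, B @ A))  # False -- order matters in general

D1 = np.diag([2.0, 3.0])
D2 = np.diag([5.0, 7.0])
print(np.allclose(D1 @ D2, D2 @ D1))  # True -- diagonal matrices commute
```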

Invertible Matrix

This is a square matrix $A \in \mathbb{R}^{n \times n}$ that has some other square matrix $B \in \mathbb{R}^{n \times n}$ such that

$$AB = BA = I_n$$

This other square matrix $B$ is the Inverse of $A$ and is denoted $A^{-1}$. An invertible $A$ has some properties:

  1. Its Determinant is not zero.
  2. It has Full Rank.
  3. If $x$ is some vector ($x \in \mathbb{R}^n$), $Ax = 0$ has only one solution: $x$ is full of zeroes!
  4. If $b$ is some vector ($b \in \mathbb{R}^n$), $Ax = b$ has just one solution: $x = A^{-1}b$
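A minimal NumPy sketch (one note of my own: in practice you’d call `np.linalg.solve` rather than forming the inverse, for numerical stability):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))  # True -- AB = I

b = np.array([3.0, 2.0])
print(A_inv @ b)               # [1. 1.] -- the unique solution to Ax = b
print(np.linalg.solve(A, b))   # same answer, computed more stably
```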

Singular Matrix

This is a square matrix ($n \times n$) where

  1. The Determinant is Zero
  2. It’s not Full Rank
  3. There’s some non-zero vector $x$ such that $Ax = 0$
  4. It is not invertible!

These things smush a vector space into fewer dimensions. Well, really they map everything into a lower-dimensional subspace (the original space is untouched), but yeah.
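A minimal NumPy sketch of a singular matrix ticking all four boxes:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])       # second row = 2 * first row
print(np.linalg.det(A))          # 0.0 -- determinant is zero
print(np.linalg.matrix_rank(A))  # 1   -- not Full Rank

x = np.array([2.0, -1.0])
print(A @ x)                     # [0. 0.] -- a non-zero x with Ax = 0
# np.linalg.inv(A) would raise LinAlgError: Singular matrix
```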

Transposes

Transposes are when you turn a matrix $A$’s rows into columns and vice-versa and denote the monstrosity $A^T$. They’re just a different kind of transformation and are useful depending on the problem you’re trying to solve. They have some properties (spot-checked after the list):

  • $(A^T)^T = A$
  • $(AB)^T = B^T A^T$
  • $(A+B)^T = A^T + B^T$
  • $\det(A^T) = \det(A)$
  • $(\alpha A)^T = \alpha A^T$
  • $(A^{-1})^T = (A^T)^{-1}$
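A quick NumPy spot-check of a few of these (the matrices are made-up examples):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [5.0, 6.0]])

print(np.allclose((A @ B).T, B.T @ A.T))                    # True -- order reverses
print(np.allclose(np.linalg.det(A.T), np.linalg.det(A)))    # True
print(np.allclose(np.linalg.inv(A).T, np.linalg.inv(A.T)))  # True
```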

Miscellaneous

Other Types of Matrices

  • An Orthogonal matrix is one where $A^T = A^{-1}$
  • A Symmetric matrix is one where $A = A^T$
  • A Conjugate matrix just flips the sign of the imaginary part of any complex numbers in a matrix.
  • A Hermitian matrix is when a matrix equals its Conjugate Transpose: $A = \bar{A}^T$
    Pretty important in ML and Quantum Mechanics
  • TODO: Conjugate and Adjoint matrices…
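A minimal NumPy sketch checking the first, second, and fourth definitions (toy matrices of my own):

```python
import numpy as np

theta = np.pi / 4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # rotations are orthogonal
print(np.allclose(Q.T, np.linalg.inv(Q)))        # True

S = np.array([[1.0, 2.0],
              [2.0, 3.0]])
print(np.allclose(S, S.T))                       # True -- symmetric

H = np.array([[2.0,    1 + 1j],
              [1 - 1j, 3.0   ]])
print(np.allclose(H, H.conj().T))                # True -- Hermitian
```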

Cramer’s Rule

Easier shown with an example. Heaven forbid you compute things by hand these days…

Solve $A\mathbf{x}=\mathbf{b}$ with

$$A=\begin{bmatrix} 2 & 1 & -1\\ -3 & -1 & 2\\ -2 & 1 & 2 \end{bmatrix},\quad \mathbf{x}=\begin{bmatrix}x\\y\\z\end{bmatrix},\quad \mathbf{b}=\begin{bmatrix}8\\-11\\-3\end{bmatrix},\qquad \det(A) = -1.$$

Replace the $i$-th column of $A$ by $\mathbf{b}$ to get $A_i$:

$$A_1= \begin{bmatrix} 8 & 1 & -1\\ -11 & -1 & 2\\ -3 & 1 & 2 \end{bmatrix},\quad A_2= \begin{bmatrix} 2 & 8 & -1\\ -3 & -11 & 2\\ -2 & -3 & 2 \end{bmatrix},\quad A_3= \begin{bmatrix} 2 & 1 & 8\\ -3 & -1 & -11\\ -2 & 1 & -3 \end{bmatrix}.$$

$$\det(A_1)=-2,\qquad \det(A_2)=-3,\qquad \det(A_3)=1.$$

By Cramer’s Rule,

$$x=\frac{\det(A_1)}{\det(A)}=\frac{-2}{-1}=2,\qquad y=\frac{\det(A_2)}{\det(A)}=\frac{-3}{-1}=3,\qquad z=\frac{\det(A_3)}{\det(A)}=\frac{1}{-1}=-1.$$

$$\boxed{(x,y,z)=(2,\,3,\,-1)}$$
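And a minimal NumPy sketch that replays the example (the column-swap loop is a direct transcription of the rule):

```python
import numpy as np

A = np.array([[ 2.0,  1.0, -1.0],
              [-3.0, -1.0,  2.0],
              [-2.0,  1.0,  2.0]])
b = np.array([8.0, -11.0, -3.0])

# Cramer's Rule: x_i = det(A_i) / det(A), A_i = A with column i replaced by b
x = np.empty(3)
for i in range(3):
    A_i = A.copy()
    A_i[:, i] = b
    x[i] = np.linalg.det(A_i) / np.linalg.det(A)

print(x)                      # [ 2.  3. -1.]
print(np.linalg.solve(A, b))  # same answer, and what you'd actually use
```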