LOGBOOK LOG-418
COMPLETE · MATHEMATICS · LINEAR-ALGEBRA · EIGENVALUES · EIGENVECTORS · PCA · FOUNDATIONS · MATRICES

Eigenvalues and Eigenvectors

The special directions a transformation leaves unchanged — what eigenvalues and eigenvectors are, how to find them, and why they're everywhere.

The Core Idea

Most vectors get knocked off their span when a matrix transformation is applied — they rotate and stretch in complicated ways. But some special vectors only get stretched or compressed. They point in the same direction before and after the transformation.

These are eigenvectors. The stretch factor is the eigenvalue.

Av = λv

v is an eigenvector of A, λ is the corresponding eigenvalue.

If λ = 3: the transformation stretches v to three times its length. If λ = −1: it flips v. If λ = 0: it collapses v to zero.
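
A quick numerical check of the definition, sketched with NumPy (the matrix is the same 2×2 example worked through by hand below):

import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns are eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

for lam, v in zip(eigenvalues, eigenvectors.T):
    # A v should equal λ v for each eigenpair
    print(lam, np.allclose(A @ v, lam * v))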


Finding Eigenvalues

Rearrange Av = λv:

Av = λIv
Av − λIv = 0
(A − λI)v = 0

For this to have a non-trivial solution (v ≠ 0), the matrix (A − λI) must be singular:

det(A − λI) = 0

This is the characteristic equation. Solving it gives the eigenvalues.

Example: A = [4 1; 2 3]

det([4−λ  1 ]) = 0
    [ 2  3−λ]

(4−λ)(3−λ) − 2 = 0
12 − 7λ + λ² − 2 = 0
λ² − 7λ + 10 = 0
(λ − 5)(λ − 2) = 0
λ = 5  or  λ = 2
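
The same roots can be recovered numerically; a minimal sketch, using the coefficients of λ² − 7λ + 10 and, independently, np.linalg.eigvals:

import numpy as np

# Roots of the characteristic polynomial λ² − 7λ + 10
print(np.roots([1, -7, 10]))        # [5. 2.]

# Or go straight to the eigenvalues of A
A = np.array([[4.0, 1.0], [2.0, 3.0]])
print(np.linalg.eigvals(A))         # [5. 2.]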

Finding Eigenvectors

For each eigenvalue λ, solve (A − λI)v = 0:

For λ = 5:

A − 5I = [−1  1]
         [ 2 −2]

Row reduce: [1 −1]  →  v₁ = v₂
            [0  0]

Eigenvector: (1, 1) (or any scalar multiple)

For λ = 2:

A − 2I = [2  1]
         [2  1]

Row reduce: [2 1]  →  2v₁ + v₂ = 0  →  v₂ = −2v₁
            [0 0]

Eigenvector: (1, −2)
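
A short NumPy check that both eigenpairs behave as claimed (a sketch; any scalar multiples of these vectors pass the same test):

import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])

# Eigenpairs found by hand above
pairs = [(5.0, np.array([1.0, 1.0])),
         (2.0, np.array([1.0, -2.0]))]

for lam, v in pairs:
    # (A − λI)v should be the zero vector
    print(lam, np.allclose((A - lam * np.eye(2)) @ v, 0.0))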

The Characteristic Polynomial

For an n×n matrix, det(A − λI) is a degree-n polynomial in λ. An n×n matrix has exactly n eigenvalues (counting multiplicity), possibly complex.

Trace and determinant:

Sum of eigenvalues = trace(A) = Σ Aᵢᵢ
Product of eigenvalues = det(A)

For A = [4 1; 2 3]: trace = 7 = 5+2 ✓, det = 10 = 5×2 ✓. Quick checks.
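
Both identities are easy to confirm numerically; a minimal sketch:

import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])
lams = np.linalg.eigvals(A)

print(np.isclose(lams.sum(), np.trace(A)))        # trace = sum of eigenvalues
print(np.isclose(lams.prod(), np.linalg.det(A)))  # det = product of eigenvalues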


Special Cases

Symmetric matrices (Aᵀ = A):

  • All eigenvalues are real
  • Eigenvectors for different eigenvalues are orthogonal
  • Always diagonalisable

Symmetric matrices arise naturally from covariance matrices, graph Laplacians, and physical systems. Their eigenvectors form a clean orthogonal basis.

Repeated eigenvalues: may or may not come with a full set of independent eigenvectors. A matrix whose repeated eigenvalue has fewer independent eigenvectors than its algebraic multiplicity is defective and can't be diagonalised.
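
The standard illustration is the shear [1 1; 0 1]: λ = 1 is repeated, but the eigenspace is only one-dimensional. A sketch:

import numpy as np

# Shear matrix: characteristic polynomial (1 − λ)², so λ = 1 with multiplicity 2
J = np.array([[1.0, 1.0],
              [0.0, 1.0]])

print(np.linalg.eigvals(J))                  # [1. 1.]

# Eigenspace dimension = nullity of (J − I) = 2 − rank(J − I)
eigenspace_dim = 2 - np.linalg.matrix_rank(J - np.eye(2))
print(eigenspace_dim)                        # 1 < 2 → defective, not diagonalisable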

Complex eigenvalues: always come in conjugate pairs for real matrices. Correspond to rotation in the transformation.


Diagonalisation

If A has n linearly independent eigenvectors v₁, …, vₙ, it can be written:

A = P D P⁻¹

Where:

  • P = matrix whose columns are the eigenvectors
  • D = diagonal matrix of eigenvalues

D = [λ₁  0 ]    P = [v₁ v₂]
    [ 0  λ₂]

Why this is useful: computing Aⁿ becomes trivial:

Aⁿ = P Dⁿ P⁻¹

And Dⁿ is just diagonal entries raised to the nth power:

Dⁿ = [λ₁ⁿ  0 ]
     [ 0   λ₂ⁿ]

Markov chain long-run behaviour, population growth models, solving differential equations — all use this.
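
A sketch of the idea in NumPy, reusing the 2×2 example (for a symmetric matrix np.linalg.eigh would be preferred, and for a one-off power np.linalg.matrix_power is the direct route):

import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])
lams, P = np.linalg.eig(A)          # columns of P are eigenvectors
D = np.diag(lams)

# A^10 via the diagonalisation A = P D P⁻¹
A_pow = P @ np.diag(lams ** 10) @ np.linalg.inv(P)

# Agrees with direct repeated multiplication
print(np.allclose(A_pow, np.linalg.matrix_power(A, 10)))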


Geometric Interpretation

Eigenvalue magnitude:

  • |λ| > 1: eigenvector direction gets stretched (expansion)
  • |λ| < 1: gets compressed (contraction)
  • |λ| = 1: stays the same length
  • λ < 0: gets flipped

Eigenvectors as axes: in the eigenvector basis, the transformation is just scaling along each axis. Diagonalisation literally means finding the coordinate system where the transformation is simplest.

The dominant eigenvector: the eigenvector with the largest |λ| dominates in the long run. Repeatedly applying A amplifies this direction and suppresses others. This is why PageRank converges — the dominant eigenvector of the link matrix is the stationary distribution.
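
Power iteration exploits exactly this: keep applying A and renormalising, and the iterate converges to the dominant eigenvector (assuming the largest eigenvalue is strictly bigger in magnitude than the rest and the start vector has a component along it). A minimal sketch:

import numpy as np

def power_iteration(A, num_iters=100):
    # Start from a random vector, apply A repeatedly, renormalise each time
    v = np.random.default_rng(0).standard_normal(A.shape[0])
    for _ in range(num_iters):
        v = A @ v
        v = v / np.linalg.norm(v)
    # Rayleigh quotient gives the matching eigenvalue estimate
    return v, v @ A @ v

A = np.array([[4.0, 1.0], [2.0, 3.0]])
v, lam = power_iteration(A)
print(lam)   # ≈ 5, the dominant eigenvalue
print(v)     # ≈ (1, 1)/√2, up to sign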


The Spectral Theorem

For symmetric matrices, the eigenvectors form an orthonormal basis for the whole space. Every symmetric matrix can be written:

A = Q D Qᵀ

where Q is orthogonal (Qᵀ = Q⁻¹) and D is diagonal. This is the spectral decomposition.

This is the foundation of PCA, the discrete Fourier transform, and the solution to many physics problems.
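
For symmetric matrices, NumPy's np.linalg.eigh returns exactly this factorisation: real eigenvalues and orthonormal eigenvectors. A sketch:

import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # symmetric

lams, Q = np.linalg.eigh(S)         # eigenvalues ascending, Q orthogonal

print(np.allclose(Q @ np.diag(lams) @ Q.T, S))   # A = Q D Qᵀ
print(np.allclose(Q.T @ Q, np.eye(2)))           # Qᵀ = Q⁻¹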


Principal Component Analysis (PCA)

PCA finds the directions of maximum variance in a dataset. It’s eigenanalysis of the covariance matrix.

  1. Centre the data (subtract mean)
  2. Compute covariance matrix Σ = XᵀX / (n−1)
  3. Find eigenvectors and eigenvalues of Σ
  4. Eigenvectors = principal components (directions of variance)
  5. Eigenvalues = amount of variance in each direction
  6. Project data onto top k eigenvectors to reduce dimensions

The first principal component is the eigenvector with the largest eigenvalue — it’s the direction of maximum variance. The second is orthogonal to the first, with the next most variance. And so on.

Why eigenvalues = variance: the covariance matrix is symmetric and positive semi-definite. Its eigenvalues are non-negative. They literally measure the spread of data in each eigenvector direction.
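
A sketch of the recipe above in NumPy, on a small synthetic dataset (variable names are illustrative, not from any particular library API):

import numpy as np

rng = np.random.default_rng(0)
# Synthetic data with very different spread along each axis
X = rng.standard_normal((200, 3)) @ np.diag([3.0, 1.0, 0.2])

# 1. Centre the data
X = X - X.mean(axis=0)

# 2. Covariance matrix
cov = X.T @ X / (X.shape[0] - 1)

# 3. Eigendecomposition (eigh, since the covariance matrix is symmetric)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4–5. Sort by variance, largest first
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# 6. Project onto the top k = 2 principal components
k = 2
X_reduced = X @ eigenvectors[:, :k]
print(eigenvalues)          # variance captured along each component
print(X_reduced.shape)      # (200, 2)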


Singular Value Decomposition (SVD)

For any m×n matrix A (not just square, not just symmetric):

A = U Σ Vᵀ

  • U: m×m orthogonal matrix (left singular vectors)
  • Σ: m×n diagonal matrix of singular values σᵢ ≥ 0
  • V: n×n orthogonal matrix (right singular vectors)

Singular values are √(eigenvalues of AᵀA). SVD generalises eigendecomposition to any matrix.

Applications:

  • Low-rank approximation: keep only the top k singular values/vectors — the best rank-k approximation (Eckart-Young theorem). Used in image compression, latent semantic analysis.
  • Pseudoinverse: A⁺ = V Σ⁺ Uᵀ — solves least squares problems even when A is not square or invertible.
  • PCA: the right singular vectors of centred data X are the principal components.

SVD is the most important matrix factorisation in applied mathematics.
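
A short NumPy sketch tying the pieces together: the singular values match √(eigenvalues of AᵀA), and truncating to the top k singular values gives the best rank-k approximation:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Singular values are the square roots of the eigenvalues of AᵀA
lams = np.linalg.eigvalsh(A.T @ A)[::-1]      # sort descending to match s
print(np.allclose(s, np.sqrt(np.clip(lams, 0, None))))

# Best rank-k approximation (Eckart-Young): keep the top k singular triplets
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.linalg.matrix_rank(A_k))             # 2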