Tuesday, August 30, 2016

Refresher on Eigenvalues and Eigenvectors



A lot of graph analysis strongly relies on linear algebra so here's a quick refresher on stuff you'll have studied in your undergraduate days.

In the mathematics of graphs, you'll see the mention of eigenvalues and eigenvectors. The general description of them (as in that Wikipedia link) focuses on the rotation, translation and shearing of images. Although absolutely correct, it takes the emphasis away from their most interesting quality: we're solving sets of equations with them.

A set of equations that are formed of constants and variables that are never raised to a power greater than one are called linear equations.

Representation

Let's take a straight-forward set of simultaneous equations:

x-y+4z=5
2x-3y+ 8z=4
x-2y+4z=9

We can represent the coefficients as a matrix:

1-145
2-384
1-249

Now, we can produce a row reduced matrix by employing any combination of:

  • interchanging two rows
  • multiplying a row by a non-zero constant
  • adding one row to another

to give us:

10411
0-10-6
000-20

where the the entire matrix we'll call matrix A but the purple subset we'll call matrix M. The rank (that is, the number of non-zero rows after all row reduction) is greater in A than M indicating that the equations have no solution (since no values of 0x, 0y and 0z can equal the -20 in the last row). It is inconsistent.

The rule here is:
  1. rank M < rank A: no solution
  2. rank M = rank A = number of unknowns: one solution
  3. rank M = rank A = R < n: R unknowns can be expressed in terms of (n - R) unknowns.
An example:

x+y-z=7
2x-y- 5z=2
-5x+4y+14z=1
3x+-y+-7z=5

when row reduced becomes:

10-23
0114
0000
0000

Since this is option #3, and R = 2 and n  = 3, it's not surprising that we can express the original equations in one unknown (z, say)

x = 3 + 2z
y = 4 - z

which with a bit of Python, Matlib and some help from here like this

import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import matplotlib.pyplot as plt

mpl.rcParams['legend.fontsize'] = 10
fig = plt.figure()
ax = fig.gca(projection='3d')
theta = np.linspace(-4 * np.pi, 4 * np.pi, 100)
z = np.linspace(-10, 10, 10)
x = 3 + (2 * z)
y = 4 - z
ax.plot(x, y, z, label='x = 3 + 2z; y = 4 - z')
ax.legend()

plt.show()

tells us that there are infinite solution - that is, anything on this line:


Determinants

Like the use of images to demonstrate the properties of matrices, determinants are often described geometrically in the way they define volumes (although there are other definitions). "For an n x n matrix with columns a1...an, the value of det A is the signed volume of the parallelepiped defined by the vectors a1...an." [2] This is similar to calculating the cross product. In fact, the triple product that uses both the cross product and the dot product is how to find the volume in 3 dimensions.

But, this is only part of the story. Determinants are interesting because of what they tell us about the matrix.

Calculating determinants is tedious. Thankfully, since my undergraduate days, there have been many libraries written to do it for me (floating-point caveats aside). Now, we just run something like:

import numpy as np

identity = np.eye(3)
print "det(I3) = ", np.linalg.det(identity)

which tells me the (not too surprising) fact that the determinant of the identity matrix (I) over field ℜ3 is 1.0.

More interestingly:

det AB = det BA = (det A) x (det B)

Why is this interesting? Well, if A is a matrix (and matrices can be treated as functions) that has an inverse:

A A-1 = I
det (A A-1) = det(A) x det(A-1) = det I = 1

If the product of terms equals 1, no term can be 0. So, for an inverse matrix to exist, a prerequisite is that the matrix's determinant must be non-zero.

Homogeneous Equations

When the constants on the right hand side of all linear equations in the set are 0, we call them homogeneous equations.

Unlike the equations at the top of this post, "homogeneous equations are never inconsistent; they always have the solution 'all unknowns = 0' (often called the 'trivial solution'). If the number of independent equations (that is, the rank of the matrix) is the same as the number of unknowns, this is the only solution." [1] That is, there are too many contradictory equations (see here for a nice demo).

"If the rank of the matrix is less than the number of unknowns, there are infinitely many solutions." [1] That is, we can only express unknowns in terms of other unknowns as there are no constants (try it!). However, the infinite solutions describe a line, a plane, a hyper-plane (depending on the number of dimensions in which we're working) and we can take unit-vectors over these spaces.

"A system of n homogeneous equations in n unknowns has solutions other than the trivial solution if and only if the determinant of the coefficients is zero" [1]

Eigen- vectors and values

Which finally brings us to eigen- vectors and values. What they are is simple. Again in geometric terms, we can consider them as vectors whose direction does not change under a transformation from a matrix but whose magnitude might. 

Why they are interesting is more subtle. They come in handy in problems as disparate as Latent Semantic Analysis where we want to know which documents contain which 'concepts' to determining if a graph is acyclic to producing an equation for the n-th element of the Fibinacci sequence (see Coding the Matrix, chapter 12) to finding the relationships between distributions in a multi-variate Gaussian... In short, the "why" is very domain specific.

Let's take their definition:

M v = λ v

(where M is a matrix, v an eigenvector and λ an eigenvalue - there may be more than one).

Not all matrices have eigen-vectors and -values (over the field on which the matrix acts - see here for more information). Again from a geometric point of view, if we imagine the matrix to represent a rotation, no vector stays pointing in the same direction (note: there may still be solutions but the eigenpairs will be made of complex numbers).

Not all eigenvectors are orthogonal. Generally, they are not although they are for symmetric matrices.

[1] Mathematical Method in the Physical Sciences - Boas.
[2] Coding the Matrix - Klein

No comments:

Post a Comment