Saturday, August 21, 2010

What is a matrix? (Part 1)

Functions are an important tool in mathematics, and are used to represent many different kinds of processes in nature. Like so many mathematical objects, however, functions can be difficult to use without making some simplifying assumptions. One particularly nice assumption that we will often make is that a function is linear in its arguments:

f(ax + by) = a f(x) + b f(y)

for any inputs x and y and any scalars a and b.
One can think of a linear function as one that leaves addition and scalar multiplication alone. To see where the name comes from, let's look at a few properties of a linear function f:

f(x + y) = f(x) + f(y)
f(ax) = a f(x)

Setting a = 0 in the second property implies that f(0) = 0 for any linear function. Next, suppose that f(x) = 1 for some real number x. Then, for any other number t,

f(t) = f((t/x) x) = (t/x) f(x) = t/x

This means that f represents a line passing through 0 with slope m = 1/x.
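As a quick sanity check, here is a small Python sketch (not from the original post; the value of x is made up for illustration) showing that a linear function on the real numbers is just multiplication by a fixed slope, and that it has the properties above:

```python
def make_linear(m):
    """Return the linear function f(t) = m * t."""
    return lambda t: m * t

x = 4.0
f = make_linear(1 / x)  # slope chosen so that f(x) = 1, as in the text

assert f(0) == 0                 # linear functions send 0 to 0
assert f(x) == 1                 # slope m = 1 / x gives f(x) = 1
assert f(2 + 3) == f(2) + f(3)   # additivity
assert f(5 * 2) == 5 * f(2)      # homogeneity
```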

So what does all this have to do with matrices? Suppose we have a linear function which takes vectors as inputs. (To avoid formatting problems, I'll write vectors as lowercase letters that are italicized and underlined when they appear in text, such as v.) In particular, let's consider a vector v in ℝ². If we use the {x, y} basis discussed last time, then we can write v = ax + by. Now, suppose we have a linear function f : ℝ² → ℝ² (that is, f takes ℝ² vectors as inputs and produces ℝ² vectors as outputs). We can use the linear property to specify how f acts on any arbitrary vector by specifying just a few values:

f(v) = f(ax + by) = a f(x) + b f(y)
This makes it plain that f(x) and f(y) contain all of the necessary information to describe f. Since each of these may itself be written in the {x, y} basis, we may as well just keep the coefficients of f(x) and f(y) in that basis:

f(x) = F₁₁ x + F₂₁ y
f(y) = F₁₂ x + F₂₂ y
We call the object F made up of the coefficients of f(x) and f(y) a matrix, and say that it has four elements. The element in the ith row and jth column is often written Fᵢⱼ. Application of the function f to a vector v can now be written as the matrix F multiplied by the column vector representation of v:

f(v) = F v = ( F₁₁  F₁₂ ) ( a ) = ( F₁₁ a + F₁₂ b )
             ( F₂₁  F₂₂ ) ( b )   ( F₂₁ a + F₂₂ b )
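This rule is easy to sketch in code. Here is a minimal Python illustration (the matrix and vector values are made up for the example): each entry of F v is the dot product of a row of F with v.

```python
def matvec(F, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(F_ik * v_k for F_ik, v_k in zip(row, v)) for row in F]

# F's columns hold the coefficients of f(x) and f(y) in the {x, y} basis.
F = [[2, 1],
     [0, 3]]
v = [5, 4]           # v = 5x + 4y

print(matvec(F, v))  # [2*5 + 1*4, 0*5 + 3*4] = [14, 12]
```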
We can take this as defining how a matrix gets multiplied by a vector, in fact. This approach gives us a lot of power. For instance, if we have a second linear function g : ℝ² → ℝ², then we can write out the composition (gf)(v) = g(f(v)) in the same way:

g(f(v)) = g(a f(x) + b f(y)) = a g(f(x)) + b g(f(y))

That means that we can find a matrix for gf from the matrices for g and f. The process for doing so is what we call matrix multiplication. Concretely, if we want to find (AB)ᵢⱼ, the element in the ith row and jth column of the product AB, we take the dot product of the ith row of A and the jth column of B, where the dot product of two lists of numbers is the sum of their products:

(AB)ᵢⱼ = Σₖ Aᵢₖ Bₖⱼ = Aᵢ₁ B₁ⱼ + Aᵢ₂ B₂ⱼ
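The same recipe fits in a few lines of Python. In this sketch (values are illustrative, not from the post), matmul builds each entry of the product as a row-by-column dot product, and we can check that applying F and then G agrees with applying the single matrix GF:

```python
def matvec(F, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(f * x for f, x in zip(row, v)) for row in F]

def matmul(A, B):
    """(AB)_ij is the dot product of row i of A with column j of B."""
    cols_B = list(zip(*B))  # transpose B so its columns are easy to iterate
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B]
            for row in A]

G = [[1, 2],
     [3, 4]]
F = [[2, 1],
     [0, 3]]
v = [5, 4]

# Composing the functions matches multiplying the matrices first.
assert matvec(G, matvec(F, v)) == matvec(matmul(G, F), v)
```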

To find the dot product of any two vectors, we write them each out in the same basis and use this formula. It can be shown that the answer doesn't depend on which basis you use, as long as the basis is orthonormal (like {x, y}).
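This basis-independence is easy to check numerically for orthonormal bases. The sketch below (with made-up vectors and rotation angle) computes the coefficients of two vectors in a rotated orthonormal basis and confirms that the dot product comes out the same:

```python
import math

def dot(u, v):
    """Dot product: the sum of the products of matching coefficients."""
    return sum(a * b for a, b in zip(u, v))

def coeffs_in_rotated_basis(v, theta):
    """Coefficients of v in the orthonormal basis obtained by rotating
    the standard basis through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    # The rotated basis vectors are (c, s) and (-s, c); the coefficient
    # along each one is the dot product of v with that basis vector.
    return [c * v[0] + s * v[1], -s * v[0] + c * v[1]]

u, v = [1.0, 2.0], [3.0, -1.0]
theta = 0.7  # any rotation angle works

u2 = coeffs_in_rotated_basis(u, theta)
v2 = coeffs_in_rotated_basis(v, theta)

# Same answer (up to rounding) in both orthonormal bases.
assert math.isclose(dot(u, v), dot(u2, v2))
```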

If this all seems arcane, try reading through it a few times; rest assured, it makes a lot more sense with some practice. Next time, we'll look at some particular matrices that have very useful applications.
