In the 3D graphics industry, math is perhaps the most important tool that is used to create stunning, photo-realistic still images - or to create realistic, real-time animations. In particular, vector math and matrix math are the two areas which are the most useful. On this page we will discuss not only the math, but the problems which the mathematics is used to solve. I start with a summary of the equations, then follow that with discussions about the equations - what they do, how they tie together, and why they are important.

Equation Summary

I hate to wait and I assume you do too. So, without further discussion, here are all of the equations that are discussed in the rest of this page.
 Translate             Scale                   Perspective
x' = x + dx          x' = x * Sx             x' = d * x / (d - z)
y' = y + dy          y' = y * Sy             y' = d * y / (d - z)
z' = z + dz          z' = z * Sz             z' = 0

        X-axis                   Y-axis                     Z-axis
x' = x                     x' = z sin(a) + x cos(a)   x' = x cos(a) - y sin(a)
y' = y cos(a) - z sin(a)   y' = y                     y' = x sin(a) + y cos(a)
z' = y sin(a) + z cos(a)   z' = z cos(a) - x sin(a)   z' = z

Dot Product
A · B = <x1, y1, z1> · <x2, y2, z2> = x1x2 + y1y2 + z1z2
A · B = |A| |B| cos(a)
                       | x2 |
A · B = <x1, y1, z1>   | y2 |
                       | z2 |

Cross Product
A x B = <x1, y1, z1> X <x2, y2, z2> = <y1z2 - z1y2, z1x2 - x1z2, x1y2 - y1x2>
        |  i  j  k |
A x B = | x1 y1 z1 | =  | y1 z1 |i -  | x1 z1 |j  +  | x1 y1 |k
        | x2 y2 z2 |    | y2 z2 |     | x2 z2 |      | x2 y2 |
|A x B| = |A| |B| sin(a)

Matrix Addition
A + B = | x1 y1 | + | c1 c2 | =  | x1+c1  y1+c2 |
        | x2 y2 |   | d1 d2 |    | x2+d1  y2+d2 |

Matrix Multiplication
| a  b  c |     | r  s |     | ar + bt + cv   as + bu + cw   |
| d  e  f |  X  | t  u |  =  | dr + et + fv   ds + eu + fw   |
                | v  w |     

2x2 Determinant
  D =  | a  b | = ad - bc
       | c  d |                                   

3x3 Determinant
       | c1 c2 c3 |
  D =  | x1 y1 z1 | = c1 | y1 z1 |  -  c2 | x1 z1 |  +  c3 | x1 y1 |
       | x2 y2 z2 |      | y2 z2 |        | x2 z2 |        | x2 y2 |

Translation Matrix             Scaling Matrix            Perspective
    |  1  0  0  0  |          | sx  0  0  0  |        |  1  0  0  0     |
    |  0  1  0  0  |          |  0 sy  0  0  |        |  0  1  0  0     |
    |  0  0  1  0  |          |  0  0 sz  0  |        |  0  0  0  -1/d  |
    | dx dy dz  1  |          |  0  0  0  1  |        |  0  0  0  1     |

  Rotation X-Axis            Rotation Y-Axis            Rotation Z-Axis
| 1    0      0      0 |  | cos(a)  0  -sin(a)  0 |  |  cos(a) sin(a)  0  0 |
| 0   cos(a) sin(a)  0 |  |    0    1     0     0 |  | -sin(a) cos(a)  0  0 |
| 0  -sin(a) cos(a)  0 |  | sin(a)  0   cos(a)  0 |  |   0       0     1  0 |
| 0     0      0     1 |  |    0    0     0     1 |  |   0       0     0  1 |

Standard Math

In the world of 3D animation there are three basic movements:

  • Translation
    - simple movement of an object in specified x, y, and z increments
  • Rotation
    - rotation of an object about a line, usually the x, y, or z axis
  • Scaling
    - changing the size of an object.

Translation and scaling are particularly simple to compute. For translation, simply add the amount of translation (dx, dy, and dz) in the direction of each axis to the corresponding x, y, and z component. The equation is:

x' = x + dx
y' = y + dy
z' = z + dz

For scaling, simply multiply each point's x, y, and z components by the scaling factor. The equation is:

x' = x * Sx
y' = y * Sy
z' = z * Sz
Note that scaling an object not centered at the origin also results in a translation. See the discussion below on how to handle scaling without translation.

The general problem of rotating a point around an arbitrary line is fairly complicated; however, rotation of a point around the x, y, or z axis has a fairly straightforward trigonometric solution:

        X-axis                    Y-axis                     Z-axis
x' = x                     x' = z sin(a) + x cos(a)   x' = x cos(a) - y sin(a)
y' = y cos(a) - z sin(a)   y' = y                     y' = x sin(a) + y cos(a)
z' = y sin(a) + z cos(a)   z' = z cos(a) - x sin(a)   z' = z
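
The rotation equations above translate almost directly into code. Here is a minimal sketch in Java; the class name AxisRotations and its method names are my own, chosen only for illustration, and I assume a point is stored as a three-element array {x, y, z} with the angle given in radians.

```java
// Sketch of the axis-rotation equations; all names are illustrative assumptions.
final class AxisRotations {
    // Each method takes a point {x, y, z} and an angle a in radians.
    static double[] rotateX(double[] p, double a) {
        double c = Math.cos(a), s = Math.sin(a);
        return new double[] { p[0], p[1] * c - p[2] * s, p[1] * s + p[2] * c };
    }

    static double[] rotateY(double[] p, double a) {
        double c = Math.cos(a), s = Math.sin(a);
        return new double[] { p[2] * s + p[0] * c, p[1], p[2] * c - p[0] * s };
    }

    static double[] rotateZ(double[] p, double a) {
        double c = Math.cos(a), s = Math.sin(a);
        return new double[] { p[0] * c - p[1] * s, p[0] * s + p[1] * c, p[2] };
    }

    public static void main(String[] args) {
        // A quarter turn about the z-axis carries the x-axis onto the y-axis.
        double[] q = rotateZ(new double[] { 1, 0, 0 }, Math.PI / 2);
        System.out.println(java.util.Arrays.toString(q));
    }
}
```

Rotating the point (1, 0, 0) a quarter turn about the z-axis gives approximately (0, 1, 0), with a tiny rounding error in the x component.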

To generate complicated 3D scene animations, multiple transformations may be required. For example, scaling an object not centered at the origin (0,0,0) also translates the object. So, scaling (or rotating) an object about a fixed point requires three separate transformations: 1) translate that point to the origin, 2) perform the scaling or rotation, and 3) translate the point back to its original position.
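
To make the three steps concrete, here is a small sketch that scales a point about a fixed point without moving that fixed point. The class PivotScale, the method scaleAbout, and the parameter name pivot are hypothetical names I've picked for the example.

```java
// Sketch: scaling about a fixed point in three steps. Names are illustrative assumptions.
final class PivotScale {
    // p and pivot are points {x, y, z}; s holds the scale factors {Sx, Sy, Sz}.
    static double[] scaleAbout(double[] p, double[] pivot, double[] s) {
        double[] out = new double[3];
        for (int i = 0; i < 3; i++) {
            double moved = p[i] - pivot[i];   // 1) translate the pivot to the origin
            double scaled = moved * s[i];     // 2) scale about the origin
            out[i] = scaled + pivot[i];       // 3) translate back
        }
        return out;
    }

    public static void main(String[] args) {
        double[] pivot = { 1, 1, 1 };
        double[] result = scaleAbout(new double[] { 3, 3, 3 }, pivot, new double[] { 2, 2, 2 });
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

The point (3, 3, 3) sits 2 units from the pivot (1, 1, 1) along each axis, so doubling its size about the pivot moves it to (5, 5, 5), while the pivot itself stays put.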

The approach just described - the application of multiple, simpler transformations to achieve a more complicated, single transformation is a very common approach in rendering 3D scenes.

As you will see in the next section, it is the handling of multiple transformations which has helped drive the 3D industry to depend on the use of vectors and matrices. These mathematical tools offer efficiencies (both in terms of coding and speed of performance) which are needed to provide real-time rendering of complex 3D objects.

Problems - 3D Computations

The following list identifies computations that must be made routinely in the creation of 3D images and summarizes the mathematics involved in each. The details of the math are covered in other sections on this page. Some of the descriptions may not make much sense until you read the details, but the descriptions should be sufficiently general to help you realize the way in which vector and matrix mathematics prove useful and necessary to generating 3D scenes.

The list certainly does not cover everything, but it serves to show how important mathematics is to the 3D graphics industry and to emphasize why the 3D graphics artist must understand vector and matrix mathematical concepts in order to get the most out of the software tools available to him.

I need to re-emphasize an aspect of 3D calculations that may not be obvious until it is pointed out. I've mentioned that vectors have only two properties - magnitude and direction. Strictly speaking, the sides of a triangle are not vectors. However, a triangle side does have a magnitude (the length of the side) and a direction (it is directed between two points). By applying vector math to line segments (sides of triangles) we can calculate information needed to render photo-realistic 3D scenes. Throughout these pages I'll use line segments (sides of polygons) interchangeably with vectors - but only because the fiction allows us to use vector math in useful ways.

  1. Angle of light incidence on a polygon
    The color that a polygon should be shaded is a function of the angle at which the light source strikes the polygon. If the light source strikes the polygon straight on - directly overhead - the surface of the polygon closely approximates the color of the light. Likewise, when the light source strikes the polygon at a low angle, the surface of the polygon darkens, tending towards black when the light source is parallel to the surface of the polygon.

    So, in order to correctly render (color) the polygons of a 3D scene, the first thing we need is the direction of a vector straight out (perpendicular to the surface) of the polygon. Treating two edges of a polygon as vectors and performing a vector cross product calculation returns a vector that is perpendicular to the surface of the polygon. This vector has a special name, a 'normal'. As you will see, normals (vectors perpendicular to a surface) are used extensively in 3D graphics computations.

    Then we use the direction of the light source in the scene (a vector) and the normal to the surface of the polygon to calculate the angle between the two, using a calculation called a vector dot product, or simply dot product. This angle is then used in rendering/shading/coloring the polygon.

    As part of the rendering process, these calculations are repeated for every polygon in the 3D scene. It may sound like a complicated process, but as you will see it's really only a few lines of code repeated over and over.
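
Here is a rough sketch of those few lines of code: a cross product to get the normal, then a dot product to get the cosine of the angle to the light. The class FlatShading and its method names are my own for illustration. I assume the light vector toLight points from the surface toward the light, that the triangle is not degenerate, and that negative results (light behind the surface) are clamped to zero, which is a common convention rather than something stated above.

```java
// Sketch: brightness of a triangle from its normal and the light direction.
// All names are illustrative assumptions.
final class FlatShading {
    static double[] sub(double[] a, double[] b) {
        return new double[] { a[0] - b[0], a[1] - b[1], a[2] - b[2] };
    }

    static double[] cross(double[] a, double[] b) {
        return new double[] {
            a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]
        };
    }

    static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    static double length(double[] v) {
        return Math.sqrt(dot(v, v));
    }

    // Returns a value from 0 (dark) to 1 (fully lit) for triangle p0, p1, p2.
    static double brightness(double[] p0, double[] p1, double[] p2, double[] toLight) {
        double[] normal = cross(sub(p1, p0), sub(p2, p0));   // perpendicular to the surface
        double cosAngle = dot(normal, toLight) / (length(normal) * length(toLight));
        return Math.max(0.0, cosAngle);                       // light from behind gives 0
    }

    public static void main(String[] args) {
        double[] p0 = { 0, 0, 0 }, p1 = { 1, 0, 0 }, p2 = { 0, 1, 0 };
        System.out.println(brightness(p0, p1, p2, new double[] { 0, 0, 1 }));
    }
}
```

With the light directly overhead the brightness is 1, and it falls to 0 as the light direction becomes parallel to the surface.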

  2. Direction of the 'face' of a polygon
    As I've mentioned, the polygon mesh that makes up the model for a 3D scene can consist of thousands of polygons - enough to tax the ability of even today's PCs to render complex scenes in real time.

    It's much faster to perform mathematical calculations than it is to draw on the computer screen, so we want to avoid drawing any polygons unless absolutely necessary. In particular, we would like not to draw the polygons facing away from a scene's point of view, because they cannot be seen.

    This highlights an interesting property of polygons. A polygon is not a 2-sided object. For the purposes of a 3D scene, a polygon only has 1 side, which may be visible or not depending on the direction it faces - towards or away from the point of view. Polygons facing away from the point of view are not visible and do not need to be drawn. Polygons which face the point of view can be visible, provided they are not behind another polygon, and should be drawn.

    What you might think of as the "back side" of a polygon is not visible and does not get rendered. If you have an object for which you want to have a visible front and back side, then you must use at least two polygons, one for each visible surface.

    Using this information about the visibility of a polygon can significantly reduce the amount of drawing that is required to render a 3D scene. During rendering, each polygon is analyzed and drawn, one at a time. At that point, a decision can be made whether to draw the polygon based on a vector calculation that will determine if the polygon is facing towards the point of view (and should be drawn), or facing away from it (and need not be drawn).

    This is a case where normals come into play again. By performing a dot product between the normal (vector perpendicular to the surface of the polygon) and the point of view vector, we can calculate the cosine of the angle between the two. The result of the dot product is a number which may be positive or negative, which identifies the direction of the face of the polygon as follows:

    • > 0 - the face is visible
    • 0 - the face appears as an edge
    • < 0 - the face is not visible

    If the polygon is visible, or appears as an edge, then we draw (render) it. If it is not visible then we do not draw (render) it.

    This step can be very important to real-time rendering of a 3D scene. As much as 75% of all polygons in a scene may not be visible in any particular point of view - significantly reducing the number of polygons to be drawn and thus the total rendering time.

    This technique of not drawing polygons which face away from the point of view is also called backface culling.
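
The visibility test itself is tiny. In this sketch the class BackfaceCull and the method shouldDraw are hypothetical names, and I assume the point of view vector (toViewer) points from the polygon toward the viewer, so that a positive dot product means the face is turned toward us.

```java
// Sketch: backface culling test. Names and the direction of toViewer are assumptions.
final class BackfaceCull {
    // normal: vector perpendicular to the polygon's front face.
    // toViewer: vector from the polygon toward the point of view.
    static boolean shouldDraw(double[] normal, double[] toViewer) {
        double d = normal[0] * toViewer[0] + normal[1] * toViewer[1] + normal[2] * toViewer[2];
        return d >= 0.0;   // > 0: visible, 0: seen edge-on, < 0: facing away
    }

    public static void main(String[] args) {
        double[] normal = { 0, 0, 1 };
        System.out.println(shouldDraw(normal, new double[] { 0, 0, 5 }));
        System.out.println(shouldDraw(normal, new double[] { 0, 0, -5 }));
    }
}
```

Note that only the sign of the dot product matters here, so neither vector needs to be normalized.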

  3. Intersection of two objects (collision)
    In 3D graphics we often need to know if two objects have collided. The equation for a plane allows us to determine whether a point lies in a plane.

    The minimum information needed to define a plane is a point on the plane (x0, y0, z0) and a normal vector (n1, n2, n3) to the plane. Using these two pieces of information, the equation for a plane can be derived:

    Plane Equation:  Ax + By + Cz + D = 0
    A = n1
    B = n2
    C = n3
    D = -(n1x0 + n2y0 + n3z0)
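
As a sketch of how the plane equation might be used, the code below builds the coefficients A, B, C, and D from a point and a normal, then evaluates the equation for a test point. The class PlaneTest, its method names, and the tolerance parameter eps are my own additions for illustration. A result of zero means the point is in the plane; the sign of a non-zero result tells you which side of the plane the point is on.

```java
// Sketch: plane from a point and a normal, plus a point-in-plane test.
// All names and the tolerance eps are illustrative assumptions.
final class PlaneTest {
    // Returns the coefficients {A, B, C, D} of Ax + By + Cz + D = 0.
    static double[] planeFrom(double[] p0, double[] n) {
        double d = -(n[0] * p0[0] + n[1] * p0[1] + n[2] * p0[2]);
        return new double[] { n[0], n[1], n[2], d };
    }

    // Evaluates Ax + By + Cz + D for point p.
    static double evaluate(double[] plane, double[] p) {
        return plane[0] * p[0] + plane[1] * p[1] + plane[2] * p[2] + plane[3];
    }

    // True when p lies in the plane, allowing for rounding error up to eps.
    static boolean onPlane(double[] plane, double[] p, double eps) {
        return Math.abs(evaluate(plane, p)) <= eps;
    }

    public static void main(String[] args) {
        double[] plane = planeFrom(new double[] { 0, 0, 2 }, new double[] { 0, 0, 1 });
        System.out.println(onPlane(plane, new double[] { 5, -3, 2 }, 1e-9));
        System.out.println(evaluate(plane, new double[] { 0, 0, 3 }));
    }
}
```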

  4. Sorting of Polygons
    When rendering a 3D scene the order in which the polygons or triangles are drawn onto the computer screen can make a big difference in how accurate the final rendering appears. An algorithm called the 'Painter's Algorithm' is used to address this issue. This algorithm simply states that items farther away from the viewer are rendered first. The algorithm works very well for simple geometric objects, but does not handle overlapping or intersecting polygons very well.

    There are several variations of this algorithm in use. The simplest one simply sorts the polygon list by the average depth of each polygon. The average depth can be calculated by simply adding the z-coordinates of all the points on the polygon and then dividing by the number of points.

    Sorting is not one of the topics I cover on this page, but on the source code pages I provide sorting algorithms. It's worth noting that the choice of sorting algorithm can be very important. The various algorithms can be as much as 10X different in how fast they sort. So, with a 3D scene of thousands of polygons the ability to quickly render a 3D scene is heavily impacted by the sorting algorithm.
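
For a feel of the simplest variation, here is a sketch that sorts polygons by average depth using Java's built-in list sort rather than a hand-written sorting algorithm. The class PainterSort and its method names are hypothetical. I assume each polygon is an array of {x, y, z} vertices and that smaller z means farther from the viewer; if your scene uses the opposite convention, swap the arguments in the comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: painter's algorithm ordering by average depth.
// Names and the "smaller z is farther" convention are assumptions.
final class PainterSort {
    // Average of the z-coordinates of all vertices of one polygon.
    static double averageDepth(double[][] polygon) {
        double sum = 0.0;
        for (double[] vertex : polygon) {
            sum += vertex[2];
        }
        return sum / polygon.length;
    }

    // Returns a new list ordered farthest-first; the input list is not modified.
    static List<double[][]> farthestFirst(List<double[][]> polygons) {
        List<double[][]> sorted = new ArrayList<>(polygons);
        sorted.sort((a, b) -> Double.compare(averageDepth(a), averageDepth(b)));
        return sorted;
    }

    public static void main(String[] args) {
        double[][] front = { { 0, 0, 3 }, { 1, 0, 3 }, { 0, 1, 3 } };
        double[][] back = { { 0, 0, -5 }, { 1, 0, -5 }, { 0, 1, -5 } };
        for (double[][] polygon : farthestFirst(Arrays.asList(front, back))) {
            System.out.println(averageDepth(polygon));
        }
    }
}
```

Drawing the polygons in the returned order lets nearer polygons paint over farther ones.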

  5. Reduction of Computational Steps
    Well, actually there are dozens of techniques that might fit this description. In this case the intent is to talk about a common animation event, completing multiple transformations in a single step. For example, a 3D graphics designer might want an object in a 3D scene to be moved to the left while at the same time rotating half way around the y-axis. Combinations of translation, rotation or scaling are often required.

    One approach to completing the multiple transformations is to perform each one sequentially on every point in the 3D scene. However, as you will see later in this page, each of these transformations may involve a dozen or more multiplications or additions.

    A technique frequently used is to write each transformation as a matrix operation (matrices will be discussed below). The specific matrix for each operation can be combined into a single transformation matrix which is then applied to the points in the 3D scene. Use of a transformation matrix can reduce the number of multiplications or additions by over 50%, providing significant speed improvements in the rendering process.

  6. Undo Function
    Sometimes an animation involves moving an object within a 3D scene back and forth between two positions - basically an Undo function. As was discussed in the topic above, a transformation matrix is often used to perform one or more movements on points.

    A matrix can be manipulated to create what is called the inverse of the original matrix. Applying the inverse matrix to the result of a transformation matrix reverses the actions taken by the transformation matrix.

    While a generic solution to inverse matrices is not simple, for orthogonal systems (such as our usual x-y-z coordinate system), creating the inverse matrix is extremely simple and is substantially faster than creating a totally new transformation matrix from each individual action. For a pure rotation matrix, for example, the inverse is simply the transpose. As you will see later, the inverse matrix is very easily created.
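
As a small illustration of the Undo idea, the sketch below rotates a point about the z-axis and then undoes the rotation by applying the transpose of the rotation matrix, which for a pure rotation is its inverse. The class UndoRotation and its method names are my own for the example, and this shortcut is only valid for pure rotations, not for matrices that also scale or translate. Points are treated as row vectors multiplied on the left of a 3x3 matrix.

```java
// Sketch: undoing a pure rotation with the transposed matrix.
// Names are illustrative assumptions; valid only for pure rotation matrices.
final class UndoRotation {
    // 3x3 rotation about the z-axis, for row vectors (v' = v * M).
    static double[][] rotationZ(double a) {
        double c = Math.cos(a), s = Math.sin(a);
        return new double[][] { { c, s, 0 }, { -s, c, 0 }, { 0, 0, 1 } };
    }

    // Swaps rows with columns.
    static double[][] transpose(double[][] m) {
        double[][] t = new double[3][3];
        for (int row = 0; row < 3; row++) {
            for (int col = 0; col < 3; col++) {
                t[col][row] = m[row][col];
            }
        }
        return t;
    }

    // Multiplies row vector v by matrix m.
    static double[] apply(double[] v, double[][] m) {
        double[] out = new double[3];
        for (int col = 0; col < 3; col++) {
            for (int row = 0; row < 3; row++) {
                out[col] += v[row] * m[row][col];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] rotate = rotationZ(0.7);
        double[] moved = apply(new double[] { 1, 2, 3 }, rotate);
        double[] restored = apply(moved, transpose(rotate));
        System.out.println(java.util.Arrays.toString(restored));
    }
}
```

Up to rounding error, the restored point equals the original (1, 2, 3).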

Vector Math

Vectors and vector math are used extensively in generating 3D images. So what are vectors?

A vector is not a real, physical object. It is a mathematical concept which is written as three numbers (called components), where each number corresponds to the x, y, and z directions of the coordinate system. An example of vector notation is:

P = <4, -5, 9>

A vector has only two properties - magnitude and direction. The direction of a vector is that of a line drawn from the origin of the coordinate system to the point given by the three components of the vector. The magnitude of a vector is the square root of the sum of the squares of the three numbers (the 3D version of the Pythagorean equation for calculating the hypotenuse of a right triangle), written as:

|P| = sqrt ( x^2 + y^2 + z^2)

Graphically, a vector is shown as an arrow that starts at the origin of the coordinate system and which ends at the x-y-z coordinates given by the three numbers. An alternate way of graphically showing a vector is to draw an arrow starting at one point in space (not necessarily the origin) and ending at a second point.

Both representations of a vector are just aids in displaying the concept of a vector. It's important to note that a vector has no position - its only properties are direction and magnitude. The points used to graphically display a vector are not properties of the vector.

Despite its technical inaccuracy, it is convenient to represent a vector as a line segment in space - starting at one point and ending at another. In such a case, the vector is considered to have a magnitude equal to the length of the line segment. The direction of the vector is the same as the direction of the line segment. For a line segment that starts at point (x1, y1, z1) and ends at point (x2, y2, z2), the vector itself is given by <x2-x1, y2-y1, z2-z1>. Again, the points are not properties of the vector.

The benefit of representing a vector as a line segment in space is that since the polygons (triangles) of a 3D scene are made up of line segments, we can use vector mathematics to manipulate the positions of the line segments. The power of this statement will become more obvious as we progress throughout this tutorial.

A pair of points connected by a line is sometimes called a "directed line segment", referring to the ability of the construct to denote direction. A line segment between two points A and B may be referred to as AB, which can represent a vector whose direction is from A to B. The line segment BA would represent a vector of the same magnitude, but opposite in direction.

As we will see later, the front surface of a triangle is also denoted by the order in which the points are listed. If a triangle is defined by three points A, B, and C, then the front surface of triangle ABC is the side toward which the thumb of the right hand points when the fingers curl through the points in the order listed. Viewed from the front, the points are traversed in a counter-clockwise direction.

The back surface of a triangle is the opposite side. Viewed from the back, the same three points are traversed in a clockwise direction.

Throughout the discussions on vectors, the order of point listing will be important in assigning direction.

You may have noted that the notation for writing vectors is to include the three numbers in <> brackets, whereas coordinate points are written with () brackets. This is fairly universal notation - (4,-5,9) is a point and <4,-5,9> is a vector, but you may run across variations in the literature.

As noted earlier, there are two equally correct ways of interpreting the three components of a vector. One approach is to say that the vector starts at the origin of the coordinate system (0,0,0) and ends at the point (x, y, z). This interpretation of a vector is called a position vector.

It is also common within the graphics and mathematics world to represent a position vector as (Ai + Bj + Ck), where i, j, and k are vectors of magnitude 1 and a direction parallel to the axes of the coordinate system. A, B, and C are the components of the vector as before. The use of the i, j, and k vector notation will be useful when we discuss matrix multiplication.

A second way of interpreting this vector is to treat the components as direction information from whatever point the vector is considered to begin. This interpretation is known as a displacement vector, where regardless of the starting point the end of the vector will be moved x, y, and z units in the direction of the corresponding coordinate system axes.

In both interpretations it is important to note that the numbers used to describe the vectors are not coordinates. They are the "components" of the vector (displacement values). For this reason, many texts use nomenclature such as <a1, b1, c1> or <a, b, c> for the components - just to avoid the tendency of readers to interpret x, y, and z as actual coordinates.

Throughout this site the x, y, and z notation is used. When you get around to actually using the equations that are discussed here, use of x, y, and z will be helpful by avoiding the mental conversion between a1, b1, c1, or whatever letters might be used, to x, y, and z.

There are a number of mathematical operations which can be performed on vectors to compute the information needed to correctly and efficiently move and render the polygons which make up the objects within a 3D scene.

Vector Addition
Vectors can be added, subtracted, and multiplied. To add two vectors, simply add the corresponding components of the two vectors together as shown in the following example:

P3 = P1 + P2 = <x1, y1, z1> + <x2, y2, z2> = <x1 + x2, y1 + y2, z1 + z2>

Again, remember that the values are the displacement components of the vectors, not the starting or ending coordinates.

Graphically, addition of two vectors can be done by placing the tail of the second vector at the head of the first. The resulting vector goes from the tail of the first vector to the head of the second vector.

Likewise, subtraction of two vectors is also very simple. Just subtract the corresponding components of the two vectors as shown in the following example:

P3 = P1 - P2 = <x1, y1, z1> - <x2, y2, z2> = <x1 - x2, y1 - y2, z1 - z2>

Multiplication of two vectors is the far more useful operation in 3D graphics. There are actually two kinds of vector multiplication - each of which has its own use, as you saw in the list of Problems above. Strictly speaking, the "multiplications" are a series of operations on the components of two vectors, but it is commonplace to describe the operations simply as multiplication.

Dot Product
The first type of vector multiplication is called a dot product. A dot product is generated by multiplying two vectors together and results in a scalar (number). It is sometimes called the scalar product. The dot product can be calculated in one of two ways:

A · B = <x1, y1, z1> · <x2, y2, z2> = x1x2 + y1y2 + z1z2
A · B = |A| |B| cos(a)

where a is the angle between the two vectors.

This is a good time to point out again that the x, y, and z variables listed in the equation are the components of the vectors - the displacement components. If the dot product is applied to two sides of a triangle the components of the vectors must be calculated by subtracting the coordinates of the two points. As was noted above, the vector components of a directed line segment are calculated as:

<x2-x1, y2-y1, z2-z1>

where (x1, y1, z1) and (x2, y2, z2) are the starting and ending points of the line segment.

One way to physically interpret the result of a dot product is that it is the magnitude of the component of A that extends in the direction of B (which is called the projection of A onto B) times the magnitude of B.

Two dot product calculations of interest are:

  • A · B = 0 when the two vectors are perpendicular to one another
  • A · B = |A| |B| when the vectors are parallel and point in the same direction

As was noted earlier, one use of the dot product is to compute the angle at which a light source strikes a polygon, which in turn is used to determine how to render the polygon. There are several other uses of the dot product in rendering 3D scenes.
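
Since both forms of the dot product give the same number, the component form can be used to compute the product and the second form can then be solved for the angle. Here is a minimal sketch; the class DotAngle and its method names are hypothetical, and both vectors are assumed to have non-zero length.

```java
// Sketch: dot product and the angle between two vectors.
// Names are illustrative assumptions; vectors must be non-zero.
final class DotAngle {
    static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    // Returns the angle between a and b in radians, from 0 to pi.
    static double angleBetween(double[] a, double[] b) {
        double cos = dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
        cos = Math.max(-1.0, Math.min(1.0, cos));   // guard against rounding just outside [-1, 1]
        return Math.acos(cos);
    }

    public static void main(String[] args) {
        double angle = angleBetween(new double[] { 1, 0, 0 }, new double[] { 0, 1, 0 });
        System.out.println(Math.toDegrees(angle));
    }
}
```

In practice the cosine itself is often all that is needed, which avoids the relatively slow call to acos.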

Cross Product
The second type of vector multiplication is called a cross product. A cross product results in a vector which points in a direction perpendicular to the plane which contains the two vectors that were multiplied together. The calculation of a cross product is performed as shown in this example:

A X B = <x1, y1, z1> X <x2, y2, z2> = <y1z2 - z1y2, z1x2 - x1z2, x1y2 - y1x2>

The magnitude of the resultant vector is given by:

|A X B| = |A| |B| sin(a)

where a is the angle between the two vectors.

When a cross product is applied to two line segments (edges) of a triangle the result is a vector perpendicular to the surface of the triangle - where the surface of the triangle is determined by the order in which the points are listed, as was discussed above. This vector perpendicular to the surface of the triangle is called a normal and is very important in rendering a 3D scene.

As with algebraic equations there are a number of manipulations of vector equations which can be applied:

  • Dot Products
    - the following rules apply to dot product calculations
  • Cross Products
    - the following rules apply to cross product calculations
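    • A X B does not equal B X A. Reversing the order of the vectors reverses the direction of the result, so A X B = -(B X A). The two are equal only under very special circumstances, such as when the vectors are parallel and the result is zero.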

Matrix Math

To move an object in a 3D scene, you must move every point in the model - corresponding to every vertex of every polygon in the 3D scene. As you will see, there are three common types of movements of points:

  • Translation
    Translation is simple movement in the x, y, or z direction (or any combination)
  • Rotation
    Rotation is movement of a point around another point, or line, by a given angle. 3D scene rotations are often accomplished by successive rotations, up to one for each axis. In this approach, the order of rotation is important. Changing the order of rotation can change the way an object appears after all rotations are done.
  • Scaling
    Scaling is a resizing of the object.

Movement of these points is called a transformation and can be performed in a number of ways. One way involves individually manipulating the geometric and trigonometric equations to calculate the new x, y, and z coordinates.

A second approach is to perform a calculation on a point by using matrix operations, which is the approach used most often in 3D graphics engines. As we shall see, using matrix operations can significantly speed up the rendering process by reducing the number of multiply operations involved in a transformation - by up to 75%. This tutorial focuses on the use of matrices to perform point transformations.

Matrix math is not specific to vectors and 3D graphics. It was developed originally for its generic ability to solve multiple equations with multiple unknowns. Many algebra classes include limited exposure to matrix math.

A matrix is simply a rectangular table of numbers. The horizontal lines are called rows and the vertical lines are called columns. A 4-by-3 matrix has 4 rows and 3 columns, such as this example which defines a matrix A:

    | 1  3  4 |
A = | 5  2  7 |
    | 1  2  9 |
    | 1  0  4 |

Each number in the matrix can be identified, such as A[2,3], which is the element in the 2nd row and 3rd column - the number 7 in this example.

Matrices can be added, subtracted, or multiplied. As you will see, multiplication in particular will prove most valuable in rendering 3D scenes. Matrix multiplications can be used to implement point transformations (movement) as well as to calculate dot and cross products of vectors for use in rendering 3D scenes.

Even though the example matrix above had multiple rows and columns, a matrix can just as easily have a single row or column. In fact, representing a vector as a matrix with a single row or column is a common way of applying matrix operations to vectors - which, as we have shown, is equivalent to operating on the coordinates of the points that make up the line segments (edges) of polygons (triangles) in a 3D scene!

What will be seen is that one matrix (loaded with vector components) is multiplied by a second matrix (loaded with the terms of the transformation equations) in order to transform the original vectors - translation, scaling, and rotation. We will also see that matrix multiplication can be used to combine multiple transformations into a single matrix operation, resulting in a significant reduction in the number of multiply operations - very necessary for real-time rendering of complex 3D scenes.

Matrix Addition
For the sake of completeness, definitions of matrix addition and subtraction are provided here. However, neither will be used much in creating 3D graphics images.

To add two matrices, both must contain the same number of rows and columns. Addition is accomplished by adding corresponding elements of each matrix, as in this example:

A + B = | x1 y1 | + | c1 c2 | =  | x1+c1  y1+c2 |
        | x2 y2 |   | d1 d2 |    | x2+d1  y2+d2 |
Subtraction is handled in a similar fashion, except that the values are subtracted from one another.

Matrix Multiplication
As has been stated, matrix multiplication is the primary matrix operation of interest in working with 3D images.

To multiply two matrices, the number of columns in the first matrix must be the same as the number of rows in the second matrix. Matrix multiplication results in a matrix with the number of rows of the first matrix and the number of columns of the second matrix.

Matrix multiplication is performed as follows:

| a  b  c |     | r  s |     | ar + bt + cv   as + bu + cw   |
| d  e  f |  X  | t  u |  =  | dr + et + fv   ds + eu + fw   |
                | v  w |     

As you might be able to see, AB is not equal to BA - that is, matrix multiplication is not commutative. Changing the order of the matrices in the multiplication changes the result (there are exceptions to this rule).
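
The row-times-column pattern shown above takes only a few lines of code. This is a plain sketch rather than an optimized routine; the class MatMul and the method multiply are names I've made up for the example, and matrices are assumed to be rectangular arrays indexed as [row][column].

```java
// Sketch: general matrix multiplication. Names are illustrative assumptions.
final class MatMul {
    // Multiplies an n-by-k matrix a by a k-by-m matrix b, giving an n-by-m result.
    static double[][] multiply(double[][] a, double[][] b) {
        if (a[0].length != b.length) {
            throw new IllegalArgumentException("columns of a must equal rows of b");
        }
        int rows = a.length, cols = b[0].length, inner = b.length;
        double[][] out = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                for (int t = 0; t < inner; t++) {
                    out[i][j] += a[i][t] * b[t][j];   // row i of a times column j of b
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] x = { { 1, 2 }, { 3, 4 } };
        double[][] y = { { 0, 1 }, { 1, 0 } };
        System.out.println(java.util.Arrays.deepToString(multiply(x, y)));
        System.out.println(java.util.Arrays.deepToString(multiply(y, x)));
    }
}
```

The two products printed by main differ: multiplying by y on the right swaps the columns of x, while multiplying by y on the left swaps its rows.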

There are rules which can be applied to matrix multiplication. The rules are NOT the same as the rules for simple algebraic variable multiplication!

(AB)C    = A(BC)    for all k-by-m matrices A, 
                    m-by-n matrices B and n-by-p matrices C
(A + B)C = AC + BC  for all m-by-n matrices A and B 
                    and n-by-k matrices C
C(A + B) = CA + CB  for all m-by-n matrices A and B 
                    and k-by-m matrices C 

Some matrices are seen so often, or have such special use, that they are given a special name that describes the content of the matrix. These special matrices will also be used in various aspects of rendering, so it's useful to know them by name.

  • Square matrix
    Has the same number of rows and columns
  • Identity matrix
    All elements are zero except those of the main diagonal - top left to bottom right. All values in the main diagonal are 1. Here's an example of an identity matrix:

    | 1 0 0 |
    | 0 1 0 |
    | 0 0 1 |

  • Diagonal matrix
    All entries not on the main diagonal (the diagonal from the upper left to the lower right corner) are zero. This is similar to the identity matrix without the requirement for the main diagonal entries to have values of 1.

  • Triangular matrix
    All entries above the main diagonal are zero or all below the main diagonal are zero.

    | 1 0 0 |      | 4 3 2 |
    | 2 3 0 |  or  | 0 5 1 |
    | 5 4 1 |      | 0 0 1 |

  • Transpose matrix
    A transpose matrix is formed by swapping the rows with the columns of a matrix.

    | a  b  c |       | a  d  g |
    | d  e  f | --->  | b  e  h |
    | g  h  i |       | c  f  i |

Before we get further into the details of how matrix math applies to 3D graphics scene rendering, the concept of a matrix determinant needs discussion. A determinant is defined for a 2x2 matrix as follows:

  D =  | a  b | = ad - bc
       | c  d |

A determinant for a larger matrix is calculated by breaking the matrix down into smaller matrices, until 2x2 matrices are reached, for which the definition of a 2x2 determinant is applied. For example, the determinant of a 3x3 matrix is defined as follows:

| c1 c2 c3 |
| x1 y1 z1 | = c1 | y1 z1 |  -  c2 | x1 z1 |  +  c3 | x1 y1 |
| x2 y2 z2 |      | y2 z2 |        | x2 z2 |        | x2 y2 |
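
The expansion above can be coded directly, with the 3x3 determinant calling the 2x2 determinant three times. This is only a sketch; the class Det3 and its method names are my own, and the matrix is assumed to be stored as [row][column].

```java
// Sketch: 2x2 and 3x3 determinants by expansion along the first row.
// Names are illustrative assumptions.
final class Det3 {
    // Determinant of | a b |
    //                | c d |
    static double det2(double a, double b, double c, double d) {
        return a * d - b * c;
    }

    // Determinant of a 3x3 matrix m, indexed as m[row][column].
    static double det3(double[][] m) {
        return m[0][0] * det2(m[1][1], m[1][2], m[2][1], m[2][2])
             - m[0][1] * det2(m[1][0], m[1][2], m[2][0], m[2][2])
             + m[0][2] * det2(m[1][0], m[1][1], m[2][0], m[2][1]);
    }

    public static void main(String[] args) {
        double[][] diagonal = { { 2, 0, 0 }, { 0, 3, 0 }, { 0, 0, 4 } };
        System.out.println(det3(diagonal));
    }
}
```

For a diagonal matrix such as the one in main, the determinant is just the product of the diagonal entries, here 2 * 3 * 4 = 24.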

We're finally at the point where matrix operations can be applied to vector operations. You may recall that the equation for a cross product between two vectors was a bit lengthy - difficult to remember. If we load the components of two vectors into a 3x3 matrix as follows, the determinant will exactly equal the cross product. Consider the following example of two vectors A and B, from which we form the following matrix:

A x B = <x1, y1, z1> X <x2, y2, z2> 

        | i   j  k |
A x B = | x1 y1 z1 | = | y1 z1 | i  -  | x1 z1 | j  +  | x1 y1 | k
        | x2 y2 z2 |   | y2 z2 |       | x2 z2 |       | x2 y2 |

A X B = <y1z2 - z1y2, z1x2 - x1z2, x1y2 - y1x2>

In addition to the components of the two vectors, this matrix is loaded with the unit vectors - <i, j, k> - where i, j, and k are vectors of length one that are parallel to the x, y, and z axes of the coordinate system. These were discussed earlier in this tutorial.

The main point to recognize from these equations is that the final line matches exactly the definition of a cross product! The format of a matrix is easy to remember, and its determinant can be calculated to arrive at the cross product of two vectors without having to memorize the final equation.
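
As a rough illustration, the three 2x2 determinants in the expansion become the three components of the result. This Java sketch assumes vectors are stored as double[3] arrays in x, y, z order; the class name CrossProduct and the method name cross are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative only.
final class CrossProduct {
    // Each component is one 2x2 determinant from the expansion.
    // The j term carries the minus sign, which flips its two products.
    static double[] cross(double[] a, double[] b) {
        return new double[] {
            a[1] * b[2] - a[2] * b[1],   // i component: y1z2 - z1y2
            a[2] * b[0] - a[0] * b[2],   // j component: z1x2 - x1z2
            a[0] * b[1] - a[1] * b[0]    // k component: x1y2 - y1x2
        };
    }

    public static void main(String[] args) {
        double[] xAxis = {1, 0, 0};
        double[] yAxis = {0, 1, 0};
        // The cross product of the x and y unit vectors is the z unit vector.
        System.out.println(Arrays.toString(cross(xAxis, yAxis)));
    }
}
```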

Matrix operations can also be used to perform a dot product between two vectors. Simply put the first vector in a single row matrix and the second vector in a single column matrix and then perform matrix multiplication. The result is the dot product of the two vectors, as seen in this example:

                         | x2 |
A . B  =  <x1, y1, z1>   | y2 |
                         | z2 |
A . B  =  x1x2 + y1y2 + z1z2

As you can see, the final equation is exactly the definition of a dot product!
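
To see this in code, the sketch below computes a dot product by literally multiplying a 1x3 matrix by a 3x1 matrix. It is an illustration only: the class name DotAsMatrix and its methods are hypothetical, and in practice the three products would simply be summed directly.

```java
// Hypothetical helper class; the names are illustrative only.
final class DotAsMatrix {
    // General matrix product: (n x m) times (m x p) gives (n x p).
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;
        int m = b.length;
        int p = b[0].length;
        double[][] out = new double[n][p];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < p; j++) {
                for (int k = 0; k < m; k++) {
                    out[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return out;
    }

    // Puts a in a single-row matrix and b in a single-column matrix.
    // The 1x1 product holds the dot product.
    static double dot(double[] a, double[] b) {
        double[][] row = { a };                           // 1 x 3
        double[][] col = { {b[0]}, {b[1]}, {b[2]} };      // 3 x 1
        return multiply(row, col)[0][0];
    }

    public static void main(String[] args) {
        System.out.println(dot(new double[] {1, 2, 3}, new double[] {4, 5, 6}));
    }
}
```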

Now that the use of matrix operations for calculating dot products and cross products has been demonstrated, we'll show how matrix multiplications can also be used to perform transformations (translation, scaling, and rotation) of the points in a 3D graphics scene.

Matrix Transformations
Another, more important use of matrices is performing all of the transformations (translation, scaling, and rotation) on points. Specifically, a 4x4 matrix can be defined which, when multiplied by a one-row matrix loaded with the vector components, achieves any of the desired transformations. These 4x4 matrices are appropriately called transformation matrices.

Here's an example of a 4x4 matrix which can be multiplied by the vector matrix to achieve translation of each point by an amount dx, dy, and dz, respectively:

<x', y', z', 1> = <x, y, z, 1>  |  1  0  0  0  | 
                                |  0  1  0  0  |
                                |  0  0  1  0  |
                                | dx dy dz  1  |
<x', y', z', 1> = <x+dx, y+dy, z+dz, 1>

The first 3 values in the result are the new coordinates of the point, translated by dx, dy, and dz respectively. The fourth value is an artifact of using a 4x4 matrix to perform the transformation and is ignored. The 4x4 matrices used for translation, scaling, and rotation all have a fourth column containing 0,0,0,1, so the fourth value of the result is always 1.

This particular example may seem too trivial to require the use of matrix multiplication, but it does show the concept that a 4x4 matrix can be used to effect a vector (or point) transformation.
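
A minimal Java sketch of this multiplication follows, using the same row-vector convention as the equations (the point is a single row placed to the left of the matrix). The class name TranslateDemo and the method names translation and transform are hypothetical and are not taken from the source code provided on this site.

```java
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative only.
final class TranslateDemo {
    // Builds the 4x4 translation matrix, with dx, dy, dz in the bottom row.
    static double[][] translation(double dx, double dy, double dz) {
        return new double[][] {
            { 1,  0,  0,  0 },
            { 0,  1,  0,  0 },
            { 0,  0,  1,  0 },
            { dx, dy, dz, 1 }
        };
    }

    // Multiplies the row vector <x, y, z, 1> by a 4x4 matrix.
    static double[] transform(double[] v, double[][] m) {
        double[] out = new double[4];
        for (int col = 0; col < 4; col++) {
            for (int row = 0; row < 4; row++) {
                out[col] += v[row] * m[row][col];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] p = { 1, 2, 3, 1 };
        double[] moved = transform(p, translation(10, 20, 30));
        System.out.println(Arrays.toString(moved)); // prints [11.0, 22.0, 33.0, 1.0]
    }
}
```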

A similar 4x4 matrix can be constructed to scale a vector, as shown here:

<x', y', z', 1> = <x, y, z, 1>  | Sx  0  0  0  |
                                |  0 Sy  0  0  |
                                |  0  0 Sz  0  |
                                |  0  0  0  1  |
<x', y', z', 1> = <x*Sx, y*Sy, z*Sz, 1>

As before, the first 3 elements of the result are the new coordinates of the point, scaled by the factors Sx, Sy, and Sz respectively.

Though simple, both of these also illustrate the concept of combining transformation matrices in order to reduce the total number of calculations needed to perform 3D transformations. If we let |A| and |B| be the translation and scaling matrices, then performing both operations on a vector |X| can be written as:

X' = (|X| * |A| ) * |B|
X' = |X| * (|A| * |B|)
X' = |X| * |K|

where |K| = |A| * |B|. This points out the idea of combining matrices before applying them to a vector. When thousands of points in a 3D graphics scene must be transformed, applying the single combined matrix |K| to each point, instead of applying each matrix in turn, can reduce the number of calculations by as much as 75% - a significant boost for real-time 3D animation.
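
The sketch below shows the idea in Java under the same row-vector convention: |K| is built once, and applying it gives the same result as applying |A| and then |B|. The class name CombineDemo and all of its method names are hypothetical, and the translation and scale values are arbitrary.

```java
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative only.
final class CombineDemo {
    static double[][] translation(double dx, double dy, double dz) {
        return new double[][] {
            { 1, 0, 0, 0 }, { 0, 1, 0, 0 }, { 0, 0, 1, 0 }, { dx, dy, dz, 1 }
        };
    }

    static double[][] scaling(double sx, double sy, double sz) {
        return new double[][] {
            { sx, 0, 0, 0 }, { 0, sy, 0, 0 }, { 0, 0, sz, 0 }, { 0, 0, 0, 1 }
        };
    }

    // 4x4 matrix product a * b.
    static double[][] multiply(double[][] a, double[][] b) {
        double[][] out = new double[4][4];
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < 4; j++) {
                for (int k = 0; k < 4; k++) {
                    out[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return out;
    }

    // Multiplies the row vector <x, y, z, 1> by a 4x4 matrix.
    static double[] transform(double[] v, double[][] m) {
        double[] out = new double[4];
        for (int col = 0; col < 4; col++) {
            for (int row = 0; row < 4; row++) {
                out[col] += v[row] * m[row][col];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] a = translation(1, 1, 1);
        double[][] b = scaling(2, 2, 2);
        double[][] k = multiply(a, b);          // combined once, reused for every point

        double[] x = { 1, 2, 3, 1 };
        double[] twoSteps = transform(transform(x, a), b);   // (X * A) * B
        double[] oneStep = transform(x, k);                  // X * K
        System.out.println(Arrays.toString(twoSteps));
        System.out.println(Arrays.toString(oneStep));
    }
}
```

Note that the order matters: |A| * |B| translates first and then scales, while |B| * |A| would scale first and then translate, giving a different result.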

Rotation Matrices
Rotation of objects in a 3D scene is at the top of the list of transformations used in 3D animation. In general, rotation is not that difficult. Earlier in this discussion the trigonometric equations for rotation were presented. Those equations were for the rotation of a point about one of the axes of the coordinate system, which is a very common transformation.

Rotations about an arbitrary point or line are much more complicated and are seldom done as a single transformation. Generally, the 3D graphics industry performs complicated transformations as a combination of simpler transformations. For example, rotating an object simultaneously about all three axes is typically achieved by rotating about the x axis, followed by rotation about the y axis, and then rotation about the z axis. The result is the same, but the individual calculations are simpler - at the expense of performing more operations.

The equations that were presented earlier in this tutorial can be inserted into a matrix to form transformation matrices which will rotate about the x, y, or z axes.

The following matrix multiplication will rotate a point around the x-axis through an angle a.

<x', y', z', 1> = <x, y, z, 1>  | 1     0       0       0 |
                                | 0    cos(a)  sin(a)   0 |
                                | 0   -sin(a)  cos(a)   0 |
                                | 0      0       0      1 |

The following matrix multiplication will rotate a point around the y-axis through an angle a.

<x', y', z', 1> = <x, y, z, 1>  | cos(a)   0   -sin(a)   0 |
                                |    0     1      0      0 |
                                | sin(a)   0    cos(a)   0 |
                                |    0     0      0      1 |

The following matrix multiplication will rotate a point around the z-axis through an angle a.

<x', y', z', 1> = <x, y, z, 1>  |  cos(a)  sin(a)  0     0 |
                                | -sin(a)  cos(a)  0     0 |
                                |    0       0     1     0 |
                                |    0       0     0     1 |

By way of clarification, the angle used here is the incremental angle of rotation, not the cumulative angle of rotation.
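
As a sketch of how the three rotation matrices might be built in Java, the code below fills in each matrix exactly as laid out above and applies it with the row-vector convention. The angle is assumed to be in radians, and the class name RotationDemo and its method names are hypothetical.

```java
// Hypothetical helper class; the names are illustrative only.
final class RotationDemo {
    // Rotation about the x-axis through angle a (radians).
    static double[][] rotationX(double a) {
        double c = Math.cos(a), s = Math.sin(a);
        return new double[][] {
            { 1,  0, 0, 0 },
            { 0,  c, s, 0 },
            { 0, -s, c, 0 },
            { 0,  0, 0, 1 }
        };
    }

    // Rotation about the y-axis through angle a (radians).
    static double[][] rotationY(double a) {
        double c = Math.cos(a), s = Math.sin(a);
        return new double[][] {
            { c, 0, -s, 0 },
            { 0, 1,  0, 0 },
            { s, 0,  c, 0 },
            { 0, 0,  0, 1 }
        };
    }

    // Rotation about the z-axis through angle a (radians).
    static double[][] rotationZ(double a) {
        double c = Math.cos(a), s = Math.sin(a);
        return new double[][] {
            {  c, s, 0, 0 },
            { -s, c, 0, 0 },
            {  0, 0, 1, 0 },
            {  0, 0, 0, 1 }
        };
    }

    // Multiplies the row vector <x, y, z, 1> by a 4x4 matrix.
    static double[] transform(double[] v, double[][] m) {
        double[] out = new double[4];
        for (int col = 0; col < 4; col++) {
            for (int row = 0; row < 4; row++) {
                out[col] += v[row] * m[row][col];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Rotating the point (1, 0, 0) by 90 degrees about the z-axis
        // moves it onto the y-axis, up to floating-point rounding.
        double[] p = transform(new double[] { 1, 0, 0, 1 }, rotationZ(Math.toRadians(90)));
        System.out.printf("%.3f %.3f %.3f%n", p[0], p[1], p[2]);
    }
}
```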

To see how these transformation matrices can be programmed, take a look in the code section of this site. Both Visual Basic and Java applet source code are provided which utilize matrix multiplications.

Homogeneous Coordinates
When the idea of 4x4 transformation matrices was introduced, the derivation for the matrices was not discussed. Since the combination of individual transformations into a single transformation has been shown to be desirable, the first criterion was that translation, scaling, and rotation should have a common transformation matrix size. Scaling and rotation can each be performed using a 3x3 transformation matrix, but translation cannot be written as a 3x3 matrix multiplication and requires a 4x4 transformation matrix. So the scaling and rotation matrices have been increased by one dimension in order to allow combination of all three transformation types - translation, scaling, and rotation.

The 4 dimensions used in the transformation matrices are known as homogeneous coordinates.

Regardless of the approach taken to calculating point locations after a transformation - straight trigonometric equations or matrix multiplications - the next step is to display the 3D scene information on the 2D surface of the computer screen. There are two basic methods of projecting the 3D scene onto the computer screen - parallel and perspective projection. In either case, the goal is to draw each point of the 3D scene onto the computer screen - converting the x, y, and z coordinates to a corresponding x' and y' coordinate on the screen.

Parallel Projection
This is certainly the simplest approach to placing a 3D scene onto a 2D screen. In this approach, there is no perspective at all - the x and y values of the objects in the scene are the same as on the screen.

x' = x
y' = y
This has the serious drawback that all objects, no matter how "distant" in the 3D scene, look the same size. So, while it's easy to draw, it's seldom used because the images are not realistic.

However, it also has the advantage of being simpler to code, plus for very simple 3D scenes, particularly CAD-type drawings, the result may be totally adequate.

Perspective Projection
This approach provides the visual cues of distance that you would expect in a photo-realistic drawing - distant objects seem smaller. There are several variations of perspective projection, but the general technique is the standard for all 3D graphics editing software.

From a geometrical point of view, a perspective projection results in an image where parallel lines (from the 3D scene) appear to converge (on the projected image). The point of convergence is called the vanishing point. A vanishing point where lines parallel to a coordinate axis would meet is called a principal vanishing point.

Not all parallel lines run parallel to the coordinate axes, but the principle of perspective requires that parallel lines converge, so a 3D scene may include other non-principal vanishing points.

In the standard derivation of a perspective transformation, it is assumed that the center of projection (the viewer's eye point) is at some positive distance d along the z axis (0,0,d) and that a point P of coordinates (x,y,z) is projected onto the x-y plane to a new point P'. Under those circumstances, the equations for the new x' and y' coordinates become:

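x' = d * x / (d - z)
y' = d * y / (d - z)
z' = 0

And the equivalent transformation matrix can be written as:

<x, y, z, 1>  |  1  0  0  0     |  =  <x, y, 0, (d-z)/d>
              |  0  1  0  0     |
              |  0  0  0  -1/d  |
              |  0  0  0  1     |

Unlike the translation, scaling, and rotation matrices, the fourth value of this result is not 1. Dividing all four values by (d-z)/d gives <d*x/(d-z), d*y/(d-z), 0, 1>, which matches the equations for x', y', and z'.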
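
The following Java sketch computes the perspective projection two ways: directly, using x' = d*x/(d-z) and y' = d*y/(d-z), and by multiplying by the perspective matrix and then dividing by the fourth value of the result. The class name PerspectiveDemo and its method names are hypothetical, and the code assumes z is never equal to d (a point at the center of projection cannot be projected).

```java
// Hypothetical helper class; the names are illustrative only.
final class PerspectiveDemo {
    // Direct form of the projection onto the x-y plane.
    // Assumes z != d.
    static double[] project(double x, double y, double z, double d) {
        double scale = d / (d - z);
        return new double[] { x * scale, y * scale };
    }

    // Matrix form: multiply <x, y, z, 1> by the perspective matrix,
    // then divide by the fourth value of the result.
    static double[] projectWithMatrix(double x, double y, double z, double d) {
        double[][] m = {
            { 1, 0, 0, 0 },
            { 0, 1, 0, 0 },
            { 0, 0, 0, -1.0 / d },
            { 0, 0, 0, 1 }
        };
        double[] v = { x, y, z, 1 };
        double[] h = new double[4];
        for (int col = 0; col < 4; col++) {
            for (int row = 0; row < 4; row++) {
                h[col] += v[row] * m[row][col];
            }
        }
        // h is <x, y, 0, (d-z)/d>; dividing by h[3] completes the projection.
        return new double[] { h[0] / h[3], h[1] / h[3] };
    }

    public static void main(String[] args) {
        // A point 10 units behind the x-y plane, viewed from d = 10,
        // appears at half its original x and y.
        double[] a = project(2, 4, -10, 10);
        double[] b = projectWithMatrix(2, 4, -10, 10);
        System.out.println(a[0] + ", " + a[1]);
        System.out.println(b[0] + ", " + b[1]);
    }
}
```

Points with z = 0 already lie on the projection plane, so both methods leave their x and y values unchanged.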

Practical Matters of Displaying a Projection
Regardless of the projection approach that you use, there are other decisions to be made when attempting to create a 3D scene. For example, consider the idea of a stand alone 3D model of a simple cube.

When defining the 8 points that define the cube it is common to define the coordinates in terms of the center of the cube. Then, when it is placed into a 3D scene, the cube (its points or polygons) is translated to the location in the 3D scene.

This compares to giving a cube world coordinates which are with respect to a particular coordinate system origin. The approach works fine as long as the model is placed in the coordinate system which matches its coordinates, but complicates placement if the model is to be re-used in another 3D scene.

Another decision to be made is the location of the origin of the world coordinate system with respect to the computer screen. It is common to place the origin at the center of the screen, with the positive x direction pointing to the right, the positive y direction pointing to the top of the screen, and positive z pointing into the computer screen.

Unfortunately, the various computer languages don't use this convention, nor do all the computer languages use the same convention amongst themselves. Visual Basic, for example, uses the 0,0 point as the upper left hand corner of the screen.

So, when coding a 3D application you may have to make language-specific translations when placing the object in the 3D scene for the first time. Once placed correctly, future transformations will give the expected results.
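
As an example of such a language-specific translation, here is a minimal Java sketch that maps a projected point from the convention described above (origin at the center of the screen, positive y pointing up) to pixel coordinates with the origin at the upper left corner and y increasing downward, which is the convention Java's own drawing surfaces use. The class name ScreenMapping, the method name toScreen, and the 640x480 size are assumptions made for illustration.

```java
// Hypothetical helper class; the names are illustrative only.
final class ScreenMapping {
    // Converts a projected point (x, y), measured from the screen center with
    // y pointing up, into pixel coordinates measured from the upper left
    // corner with y pointing down.
    static int[] toScreen(double x, double y, int width, int height) {
        int sx = (int) Math.round(width / 2.0 + x);    // shift the origin right
        int sy = (int) Math.round(height / 2.0 - y);   // shift the origin down and flip y
        return new int[] { sx, sy };
    }

    public static void main(String[] args) {
        int[] center = toScreen(0, 0, 640, 480);
        int[] upperRight = toScreen(100, 50, 640, 480);
        System.out.println(center[0] + ", " + center[1]);
        System.out.println(upperRight[0] + ", " + upperRight[1]);
    }
}
```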