Idea: take any 3D vector $(x, y, z)$ and add a fourth component called the $w$-component. This transforms the vector from Cartesian coordinates to homogeneous coordinates $(x, y, z, w)$. To convert back to Cartesian, we divide the first three components by $w$.
The most common value of $w$ is 1. Dividing by 1 gives us $(x, y, z)$ back, so to convert Cartesian to homogeneous we simply set $w = 1$.
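The round trip described above can be sketched in a few lines of Python. The function names `to_homogeneous` and `to_cartesian` are made up for illustration and don't come from any particular library.

```python
def to_homogeneous(v):
    """Cartesian (x, y, z) -> homogeneous (x, y, z, 1)."""
    x, y, z = v
    return (x, y, z, 1.0)


def to_cartesian(h):
    """Homogeneous (x, y, z, w) -> Cartesian, dividing by w (assumes w != 0)."""
    x, y, z, w = h
    return (x / w, y / w, z / w)


p = (1.0, 2.0, 3.0)
round_trip = to_cartesian(to_homogeneous(p))   # same point we started with
same_point = to_cartesian((2.0, 4.0, 6.0, 2.0))  # w = 2, but still the same point
```

Both `round_trip` and `same_point` come out as `(1.0, 2.0, 3.0)`, because the division by $w$ undoes whatever scale the homogeneous vector carries.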
Why? Big scary math stuff that I don’t care too much to understand. The math doesn’t really come up anyway. The following are some applications and examples that give some intuition about what the $w$-component means in day-to-day usage.
2D example
A 2D point $(x, y)$ has an equivalent homogeneous point $(x, y, 1)$. Unlike 3D, however, we can visualize this. In particular, every possible 2D point exists on the 2D plane $w = 1$, which represents all homogeneous points $(x, y, 1)$.
For each Cartesian 2D point $(x, y)$, there are an infinite number of homogeneous points $(kx, ky, k)$, where $k \neq 0$.
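A quick way to convince yourself of this is to generate a few homogeneous points for one Cartesian point and convert them all back. This is only a sketch: the helper `to_cartesian_2d` and the sample values are made up for illustration.

```python
def to_cartesian_2d(h):
    """Homogeneous (x, y, w) -> Cartesian (x, y), assuming w != 0."""
    x, y, w = h
    return (x / w, y / w)


point = (3.0, 4.0)
scales = [1.0, 2.0, -0.5, 10.0]  # any nonzero k works

# Every (k*x, k*y, k) is a different vec3, but the same 2D point.
homogeneous = [(k * point[0], k * point[1], k) for k in scales]
recovered = [to_cartesian_2d(h) for h in homogeneous]
```

Every entry of `recovered` is `(3.0, 4.0)`. Geometrically, all of these homogeneous points sit on one line through the origin, and dividing by $w$ picks out where that line crosses the $w = 1$ plane.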
Enables representing translation transformations
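A translation of $(t_x, t_y, t_z)$ is represented by the translation matrix (written here for column vectors):

$$
\begin{bmatrix}
1 & 0 & 0 & t_x \\
0 & 1 & 0 & t_y \\
0 & 0 & 1 & t_z \\
0 & 0 & 0 & 1
\end{bmatrix}
$$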
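This is a good thing: a 3×3 matrix can’t translate a 3D vector, because matrix multiplication always maps the origin to itself, but a 4×4 matrix acting on homogeneous coordinates can. Furthermore, it allows us to compose translations with the scale and rotation transformation matrices by multiplying them together. See Homogeneous transformation matrices.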
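Here is a minimal sketch of that composition in plain Python, assuming the column-vector convention (the translation sits in the last column and matrices apply right to left). The helpers `translation`, `scale`, `matmul`, and `transform` are written just for this example.

```python
def translation(tx, ty, tz):
    """4x4 translation matrix (column-vector convention)."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]


def scale(sx, sy, sz):
    """4x4 scale matrix."""
    return [[sx, 0, 0, 0],
            [0, sy, 0, 0],
            [0, 0, sz, 0],
            [0, 0, 0, 1]]


def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform(m, v):
    """Apply a 4x4 matrix to a vec4 treated as a column vector."""
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(4))


# Scale by 2 first, then translate by (1, 2, 3): one combined matrix.
combined = matmul(translation(1, 2, 3), scale(2, 2, 2))
result = transform(combined, (1, 1, 1, 1))
```

The point $(1, 1, 1)$ is scaled to $(2, 2, 2)$ and then moved to $(3, 4, 5)$, so `result` is `(3, 4, 5, 1)`. The single `combined` matrix does both steps in one multiplication.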
Differentiates between “points” and “directions”
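Consider a translation of $(t_x, t_y, t_z)$. To apply it to a point, we do:

$$
\begin{bmatrix}
1 & 0 & 0 & t_x \\
0 & 1 & 0 & t_y \\
0 & 0 & 1 & t_z \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
=
\begin{bmatrix} t_x \\ t_y \\ t_z \\ 1 \end{bmatrix}
$$

This translates the origin point $(0, 0, 0)$ to $(t_x, t_y, t_z)$. And if we look at the math, it’s the $w$-component value of 1 that is doing all the heavy lifting: it multiplies the last column, which holds the translation. Contrast this with a vector that represents a direction, let’s say $(d_x, d_y, d_z, 0)$. Let’s try “translating” this direction:

$$
\begin{bmatrix}
1 & 0 & 0 & t_x \\
0 & 1 & 0 & t_y \\
0 & 0 & 1 & t_z \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} d_x \\ d_y \\ d_z \\ 0 \end{bmatrix}
=
\begin{bmatrix} d_x \\ d_y \\ d_z \\ 0 \end{bmatrix}
$$

It doesn’t do anything because, mathematically, the $w$-component being zero zeroes out the translation terms. But also, it doesn’t make sense to translate a direction in the first place, because it just doesn’t have that notion.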
So another way of seeing the $w$-component is its ability to tell us whether the `vec4` we’re working with is a 3D point or just a direction. For example, this is why the $w$-component of a surface normal vector is zero, because it’s a direction.
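The point-versus-direction behavior is easy to check numerically. The sketch below writes the translation matrix multiplication out by hand (column-vector convention assumed); `translate` and the sample values are hypothetical, chosen only for illustration.

```python
def translate(v, t):
    """Multiply vec4 v by the 4x4 translation matrix for offset t.

    Written out row by row: each translation term is multiplied by w,
    so it only contributes when w is nonzero.
    """
    x, y, z, w = v
    tx, ty, tz = t
    return (x + tx * w, y + ty * w, z + tz * w, w)


offset = (5, -2, 3)

origin = (0, 0, 0, 1)   # w = 1: a point
up = (0, 1, 0, 0)       # w = 0: a direction, e.g. a surface normal

moved_point = translate(origin, offset)
moved_direction = translate(up, offset)
```

`moved_point` is `(5, -2, 3, 1)`, while `moved_direction` is still `(0, 1, 0, 0)`: the same matrix moves the point and leaves the direction alone, with no special-casing needed.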
Allows us to represent infinity
Setting $w = 0$ means that we cannot divide by $w$ to get back to Cartesian coordinates. This tells us something important: Cartesian coordinates literally can’t represent the value captured by the homogeneous point $(x, y, z, 0)$.
This makes sense if we think of such points as directions. A direction is simply a point at infinity. And as seen above, the math also shows this. It doesn’t make sense to “translate” a point at infinity, just like how you can’t translate a direction.
Furthermore, it explains why we need an additional fourth component. Standard Cartesian points simply don’t contain enough information to encapsulate infinity. Homogeneous points do.
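One way to build intuition for “a direction is a point at infinity” is to keep $(x, y, z)$ fixed and shrink $w$ toward zero. This is a sketch only: returning `None` for $w = 0$ is a choice made for this example, not a standard convention.

```python
def to_cartesian(h):
    """Homogeneous (x, y, z, w) -> Cartesian, or None when w == 0."""
    x, y, z, w = h
    if w == 0:
        return None  # point at infinity: no Cartesian equivalent
    return (x / w, y / w, z / w)


# Same (x, y, z), smaller and smaller w: the point runs off along +x.
ws = [1.0, 0.5, 0.25, 0.0]
points = [to_cartesian((1.0, 0.0, 0.0, w)) for w in ws]
```

The Cartesian points are $(1, 0, 0)$, $(2, 0, 0)$, $(4, 0, 0)$, and finally `None`. As $w$ shrinks, the point moves farther along the $x$ axis; at $w = 0$ there is no finite position left, only the direction it was heading in.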
Also see
- More on Matrices - 3D Math Primer for Graphics and Game Development: good overview from the online version of the book.
- Homogeneous coordinates - Wikipedia: especially the “Introduction” section that talks about homogeneous points at infinity.