On GitHub: mollerse / sjs14
Scotland JS 2014
Stian Veum Møllersen / @mollerse
Let's talk about transforms.
translate(175, 200);
translate(175, 200); scale(2);
translate(175, 200); scale(2); rotate(45);
translate(175, 200); scale(2); rotate(45); skew(15,15);
Transforms enable us to alter the position and shape of objects. We can move objects in space. We can scale objects. We can even rotate and skew, or shear, objects. We can use one transform or combine many, giving objects new shapes or throwing them from side to side.
So, what is actually going on here? We can observe an effect, but what is happening under the hood?
Why does an object morph into that exact shape when a transform is applied? Why does the application of a transform give different results when applied after another transform rather than alone? In order to figure this out, we will need to look at some math.
First we need to look at this thing called a matrix. Not the movie, but the mathematical concept. A matrix is a rectangular array of values, arranged in rows and columns. Matrices have many uses and they form the basis for many calculations. They even have their own calculus.
Defines the mapping of one set of coordinates to another
The underlying representation of a transform is a matrix. Applying a matrix to the coordinates of a point produces a new set of coordinates which decides the new location of the point.
When we transform objects, we are actually transforming each of the points that define the object using the same transformation matrix. When the points that define the object change their location, the object changes its form and location.
Mapping from [x, y] to [x', y']
To map from one space to another we multiply the original point with a transform defined in a matrix. The coordinates of the new point are defined by multiplying the transform with the original coordinates. We can break this multiplication down into three pieces.
The first piece defines the x-coordinate of the new point. The first row of the transformation matrix is dot-multiplied with the original point. We repeat this for the y-coordinate and the perspective component.
We now have the complete set of equations we need to solve to get the new coordinates of the point after the transformation has been applied. We can see how each value in the matrix affects the coordinates of the new point. We will come back to this later.
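As a rough sketch of those three pieces in TypeScript: each component of the new point is one row of the matrix dot-multiplied with the original point. The type names, the function names and the example matrix are illustrative assumptions, not something taken from the slides.

```typescript
type Vec3 = [number, number, number];
type Mat3 = [Vec3, Vec3, Vec3];

// Dot product of one matrix row with the point.
function dot(row: Vec3, p: Vec3): number {
  return row[0] * p[0] + row[1] * p[1] + row[2] * p[2];
}

// x', y' and the perspective component each come from one row of the matrix.
function apply(m: Mat3, p: Vec3): Vec3 {
  return [dot(m[0], p), dot(m[1], p), dot(m[2], p)];
}

// Hypothetical matrix: scales x by 2 and y by 3, then moves by (10, 20).
const m: Mat3 = [
  [2, 0, 10],
  [0, 3, 20],
  [0, 0, 1],
];

const result = apply(m, [5, 5, 1]); // [20, 35, 1]
```

The third row is (0, 0, 1), so the third component of the point stays at 1.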
We are using homogeneous coordinates
After the previous slide, you might have some questions percolating in the back of your heads. Why do we have 3 values defining a point and why is the 3rd component set to 1?
When dealing with graphics we use a representation for coordinates called homogeneous coordinates. In this representation we include a value to represent the perspective of the canvas. Varying this value shrinks or enlarges the projected canvas, but it does not affect the relative distances in the image. To keep things ordered and easier to deal with, we keep this component at the value 1.
We need 3 columns to enable translation
You might also wonder why we are using 3x3 matrices for defining 2D transforms. The reason is that in order to do translation-transforms we need to have one more column than we have dimensions. This means that we need 3 columns for a 2D translation.
The rules of matrix multiplication say that we can only multiply two matrices when the second matrix has the same number of rows as the first matrix has columns. This is where homogeneous coordinates come in. Homogeneous coordinates have three rows, so we can use them with the translation transform.
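One way to see why the extra column is needed, sketched in TypeScript with made-up names and values: a 2x2 matrix can never move the origin, because every term gets multiplied by 0. With a third column and the constant 1 in the point, there is finally something to add.

```typescript
type Vec2 = [number, number];
type Mat2 = [Vec2, Vec2];
type Vec3 = [number, number, number];
type Mat3 = [Vec3, Vec3, Vec3];

function apply2(m: Mat2, p: Vec2): Vec2 {
  return [
    m[0][0] * p[0] + m[0][1] * p[1],
    m[1][0] * p[0] + m[1][1] * p[1],
  ];
}

function apply3(m: Mat3, p: Vec3): Vec3 {
  return [
    m[0][0] * p[0] + m[0][1] * p[1] + m[0][2] * p[2],
    m[1][0] * p[0] + m[1][1] * p[1] + m[1][2] * p[2],
    m[2][0] * p[0] + m[2][1] * p[1] + m[2][2] * p[2],
  ];
}

// Whatever values a 2x2 matrix holds, the origin maps to itself.
const stuck = apply2([[2, 7], [3, 5]], [0, 0]); // [0, 0]

// The 3rd column is multiplied by the constant 1, so it is added to the result.
const translate: Mat3 = [
  [1, 0, 175],
  [0, 1, 200],
  [0, 0, 1],
];
const moved = apply3(translate, [0, 0, 1]); // [175, 200, 1]
```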
To avoid messing with the perspective component
So why are we not doing anything in the 3rd row of the transformation matrix? This is so that we don't affect the perspective-component of the homogenous coordinate, the 3rd component in the point.
In most cases this is not something we want to fiddle with. In fact, none of the low-level native 2D matrix APIs expose the ability to manipulate this row.
Transformation context
[Figure: coordinate system with axes X and Y, labeled from (0,0) to (150,150)]
The second piece of the puzzle is the coordinate system. The coordinate system forms the context for our transforms.
The way most coordinate systems are laid out when working with computer graphics is with the origin in the top left corner. This allows us to define points within the system with positive x and y coordinates. This way we won't have to deal with negative coordinates unless we venture outside the context.
scale(2);
The origin of the coordinate system decides the coordinates of a point. The coordinates of a point decide what the outcome of a transform will look like.
Here we have a box and the lower right corner of the box has the coordinates 65,65 within the coordinate system. If we apply a scale-transform of 2 to the box we see that the coordinates of the point change accordingly.
scale(2);
If we change the origin slightly, for instance to the middle of the box, we get a different effect from applying the same transform. This effect is not entirely unexpected, since the coordinates of the lower right corner of the box changed when we changed the origin of the coordinate system.
The different ways of doing graphics on the web have different ways of setting the origin of the coordinate system. CSS places the origin relative to the object, in the middle by default, but Canvas and SVG place the origin in the upper left corner of the object's container.
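A small sketch of the two cases in TypeScript. The corner at 65,65 comes from the example above. The box center at 40,40 is an assumption for illustration (it supposes a box spanning 15,15 to 65,65), since the exact size of the box is not stated here.

```typescript
type Point = [number, number];

// Scale a point around an origin: measure the point relative to the origin,
// scale that offset, then express the result in the original coordinates.
function scaleAround(p: Point, factor: number, origin: Point): Point {
  return [
    origin[0] + (p[0] - origin[0]) * factor,
    origin[1] + (p[1] - origin[1]) * factor,
  ];
}

const corner: Point = [65, 65];    // lower right corner of the box
const boxCenter: Point = [40, 40]; // hypothetical center of the box

const fromTopLeft = scaleAround(corner, 2, [0, 0]);   // [130, 130]
const fromCenter = scaleAround(corner, 2, boxCenter); // [90, 90]
```

Same point, same scale factor, two different outcomes. The only thing that changed is where the origin sits.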
From shorthand to matrix representation
Now that we have the theoretical foundation, let's revisit the shorthand transformation functions I showed at the beginning and look at how they relate to the underlying matrix representation.
We can think of the matrix representation of a transform as analogous to the hex code of a color. We can do some things with shorthands, but if we really want the power and freedom, we need to access the underlying representation.
translate(x,y)
The translation transformation works by adding a value to the existing value of either the x or the y coordinate. The values are located in the upper two rows of the 3rd column of the transform matrix.
translate(,)
The effect of a translation matrix is pretty straightforward. As expected the object moves a set distance along either the x or the y axis when the values change.
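Sketched in TypeScript, with the translation values in the upper two rows of the 3rd column. The function names are made up for this example.

```typescript
type Vec3 = [number, number, number];
type Mat3 = [Vec3, Vec3, Vec3];

// Identity matrix with tx and ty placed in the 3rd column.
function translateMatrix(tx: number, ty: number): Mat3 {
  return [
    [1, 0, tx],
    [0, 1, ty],
    [0, 0, 1],
  ];
}

// Apply the matrix to the point (x, y, 1) and return the new x and y.
function applyToPoint(m: Mat3, x: number, y: number): [number, number] {
  return [
    m[0][0] * x + m[0][1] * y + m[0][2],
    m[1][0] * x + m[1][1] * y + m[1][2],
  ];
}

const moved = applyToPoint(translateMatrix(175, 200), 10, 20); // [185, 220]
```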
scale(sx, sy)
The scale transform works by multiplying the value of the x and y coordinate by a factor. The scaling factors are located along the diagonal of the transform matrix.
scale(,)
When changing the scale, the coordinates of the point are multiplied by the scaling factor, resulting in the object increasing in size along the x or the y axis.
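The same idea for scaling, as a TypeScript sketch with illustrative names: the factors sit on the diagonal, so each coordinate is multiplied by its own factor.

```typescript
type Vec3 = [number, number, number];
type Mat3 = [Vec3, Vec3, Vec3];

// Scaling factors along the diagonal of the matrix.
function scaleMatrix(sx: number, sy: number): Mat3 {
  return [
    [sx, 0, 0],
    [0, sy, 0],
    [0, 0, 1],
  ];
}

// Apply the matrix to the point (x, y, 1) and return the new x and y.
function applyToPoint(m: Mat3, x: number, y: number): [number, number] {
  return [
    m[0][0] * x + m[0][1] * y + m[0][2],
    m[1][0] * x + m[1][1] * y + m[1][2],
  ];
}

const uniform = applyToPoint(scaleMatrix(2, 2), 65, 65); // [130, 130]
const stretched = applyToPoint(scaleMatrix(2, 3), 65, 65); // [130, 195]
```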
rotate(a)
The rotation transform works by moving both the x and the y component of a point either clockwise or counterclockwise by a value. This value is decided by the trigonometric functions in the upper left corner of the transform matrix.
These functions are chosen so that the points move along the edge of a circle that has its center at the origin of the coordinate system.
rotate(deg)
In this demonstration I have set the origin of the coordinate system to be in the middle, so we can observe the rotation of the object around its center when the values change.
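A TypeScript sketch of the rotation matrix, assuming the usual cos/sin layout in the upper left corner and an angle given in degrees. The names are illustrative. It also checks the circle property: the distance from the origin does not change.

```typescript
type Vec3 = [number, number, number];
type Mat3 = [Vec3, Vec3, Vec3];

// Trigonometric functions in the upper left 2x2 corner of the matrix.
function rotateMatrix(deg: number): Mat3 {
  const rad = (deg * Math.PI) / 180;
  const c = Math.cos(rad);
  const s = Math.sin(rad);
  return [
    [c, -s, 0],
    [s, c, 0],
    [0, 0, 1],
  ];
}

// Apply the matrix to the point (x, y, 1) and return the new x and y.
function applyToPoint(m: Mat3, x: number, y: number): [number, number] {
  return [
    m[0][0] * x + m[0][1] * y + m[0][2],
    m[1][0] * x + m[1][1] * y + m[1][2],
  ];
}

// A quarter turn moves (1, 0) to approximately (0, 1).
const quarter = applyToPoint(rotateMatrix(90), 1, 0);

// The point stays on the same circle: both distances are approximately 5.
const rotated = applyToPoint(rotateMatrix(30), 3, 4);
const before = Math.hypot(3, 4);
const after = Math.hypot(rotated[0], rotated[1]);
```

With the origin in the top left corner and the y-axis pointing down, a positive angle in this layout turns the object clockwise on screen.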
skew(a, b)
The skew transform works by moving the x-coordinate by a factor of the y coordinate, and vice versa for the y-coordinate. Very similar to the scale transform.
The factor is decided by using the tangent function. This is because we want to move the x or y coordinate in such a way that the angle of the new position to the opposite axis changes by an angle of a or b from the previous position.
skew(deg,deg)
Here we can see the x-coordinates of the points shifting and forming an angle of a degrees with the y-axis. And vice versa for the y-coordinates.
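As a TypeScript sketch, assuming the tangent factors sit off the diagonal and the angles are given in degrees (the names are made up for this example):

```typescript
type Vec3 = [number, number, number];
type Mat3 = [Vec3, Vec3, Vec3];

// tan(a) shifts x by a factor of y, tan(b) shifts y by a factor of x.
function skewMatrix(aDeg: number, bDeg: number): Mat3 {
  const toRad = Math.PI / 180;
  return [
    [1, Math.tan(aDeg * toRad), 0],
    [Math.tan(bDeg * toRad), 1, 0],
    [0, 0, 1],
  ];
}

// Apply the matrix to the point (x, y, 1) and return the new x and y.
function applyToPoint(m: Mat3, x: number, y: number): [number, number] {
  return [
    m[0][0] * x + m[0][1] * y + m[0][2],
    m[1][0] * x + m[1][1] * y + m[1][2],
  ];
}

// A point on the y-axis, skewed by 45 degrees in x, lands at roughly (100, 100).
const skewed = applyToPoint(skewMatrix(45, 0), 0, 100);

// Its angle to the y-axis is now approximately 45 degrees.
const angleToYAxis = (Math.atan2(skewed[0], skewed[1]) * 180) / Math.PI;
```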
Matrix multiplication
We have seen how transforms affect the object independently, but what if we want to apply more than one transform? What is the outcome of the composition? To answer this we need to take a look at some properties of matrix multiplication.
B ⋅ (A ⋅ x⃗) = (B ⋅ A) ⋅ x⃗
One of the great things about matrix multiplication is that it is associative. This means that applying multiple transforms to a point is the same as applying the product of all the transforms to the point.
And that means that the combined transformation matrix can be calculated up front, which is more performant for the computer. We can also invert the entire transformation with one operation.
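A quick check of that property in TypeScript, using a scale for A and a translate for B. The helper names and the example point are illustrative assumptions.

```typescript
type Vec3 = [number, number, number];
type Mat3 = [Vec3, Vec3, Vec3];

// Standard 3x3 matrix product: row r of a dotted with column c of b.
function multiply(a: Mat3, b: Mat3): Mat3 {
  const cell = (r: number, c: number): number =>
    a[r][0] * b[0][c] + a[r][1] * b[1][c] + a[r][2] * b[2][c];
  return [
    [cell(0, 0), cell(0, 1), cell(0, 2)],
    [cell(1, 0), cell(1, 1), cell(1, 2)],
    [cell(2, 0), cell(2, 1), cell(2, 2)],
  ];
}

function apply(m: Mat3, p: Vec3): Vec3 {
  return [
    m[0][0] * p[0] + m[0][1] * p[1] + m[0][2] * p[2],
    m[1][0] * p[0] + m[1][1] * p[1] + m[1][2] * p[2],
    m[2][0] * p[0] + m[2][1] * p[1] + m[2][2] * p[2],
  ];
}

const A: Mat3 = [[2, 0, 0], [0, 2, 0], [0, 0, 1]];   // scale by 2
const B: Mat3 = [[1, 0, 15], [0, 1, 15], [0, 0, 1]]; // translate by (15, 15)
const x: Vec3 = [10, 20, 1];

const stepByStep = apply(B, apply(A, x));  // B ⋅ (A ⋅ x) = [35, 55, 1]
const combined = multiply(B, A);           // calculated once, up front
const allAtOnce = apply(combined, x);      // (B ⋅ A) ⋅ x = [35, 55, 1]
```

Once `combined` has been calculated, every point of the object needs only one matrix application instead of one per transform.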
Order matters
B ⋅ A ⋅ x⃗ ≠ A ⋅ B ⋅ x⃗
But we need to remember that matrix multiplication is not commutative. This means that the order in which transformations are applied matters. A then B produces a different outcome than B then A. This is the thing that makes complex transformations harder to reason about.
scale(2) translate(15, 15)
The first example of transformation composition is a scale transform followed by a translate transform.
If we look at the row-column pairs that define the translation part of the final matrix we see that the final translation is 30 in the x-direction and 30 in the y-direction.
translate(15, 15) scale(2)
If we reverse the order of the transforms, we get a different outcome.
We look at the translation part again and notice that the translation in the x-direction is now 15 and the same for the y-direction.
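Both compositions can be reproduced with a small TypeScript sketch. It assumes the transform list is multiplied left to right, so "scale(2) translate(15, 15)" becomes the scale matrix times the translate matrix. That assumption matches the 30 and 15 seen in the two examples. The names are illustrative.

```typescript
type Vec3 = [number, number, number];
type Mat3 = [Vec3, Vec3, Vec3];

// Standard 3x3 matrix product: row r of a dotted with column c of b.
function multiply(a: Mat3, b: Mat3): Mat3 {
  const cell = (r: number, c: number): number =>
    a[r][0] * b[0][c] + a[r][1] * b[1][c] + a[r][2] * b[2][c];
  return [
    [cell(0, 0), cell(0, 1), cell(0, 2)],
    [cell(1, 0), cell(1, 1), cell(1, 2)],
    [cell(2, 0), cell(2, 1), cell(2, 2)],
  ];
}

const scale2: Mat3 = [[2, 0, 0], [0, 2, 0], [0, 0, 1]];
const translate15: Mat3 = [[1, 0, 15], [0, 1, 15], [0, 0, 1]];

// scale(2) translate(15, 15): the translation itself gets scaled.
const scaleThenTranslate = multiply(scale2, translate15);
// [[2, 0, 30], [0, 2, 30], [0, 0, 1]]

// translate(15, 15) scale(2): the translation is left untouched.
const translateThenScale = multiply(translate15, scale2);
// [[2, 0, 15], [0, 2, 15], [0, 0, 1]]
```

The scale part is identical in both results. Only the 3rd column, the translation part, differs.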
Transforms are represented as matrices
... that map from one coordinate to another
... and compose through multiplication
To wrap this up, here's a quick summary. The transform shorthands we see in web graphics, translate, scale, rotate and skew, are actually shorthands for matrices. These matrices define the mapping of the current coordinates of a point to the new coordinates. And these transforms compose through multiplication, but remember that the ordering matters.
The math is the same, "just" one more dimension
A quick note about 3D transforms. The math is the same, all the same principles apply. But, we have one more dimension so there is more of the math.