Why do rotations happen around an axis?

October 13, 2024

Partway through your introductory linear algebra course, you were introduced to the idea of an orthogonal matrix, characterized by the property that its columns are orthonormal -- mutually orthogonal, each with unit length. You were then told that rotations correspond exactly to the orthogonal matrices with determinant 1.

Intuitively, this is clear; orthogonal transformations are rigid, like rotations. But if we dig into the technical definitions, things start getting murky. In 3D, any rotation is around an axis, but what does that have to do with orthogonality? In 2D and 3D, a combination of two rotations is another rotation, but in 4D that's no longer true. And in 2D, there's no axis of rotation at all. What's going on here? What are rotations, really?

Rotations in 2D

Orthogonal Matrices

An orthogonal matrix $M$ is one with orthonormal columns. In other words, if the matrix has columns $c_i$, i.e. it looks like

$$\left(\begin{array}{c|c|c|c} c_1 & c_2 & \cdots & c_n \end{array}\right) \quad \text{then} \quad c_i \cdot c_j = \begin{cases} 1 & \mathrm{if\ } i = j \\ 0 & \mathrm{if\ } i \neq j \end{cases}$$

or more concisely, $c_i \cdot c_j = \delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta. The dot product is important here: in the case where it equals 1, it means that $c_i \cdot c_i = \lVert c_i \rVert^2 = 1$, so each vector has unit length, i.e. is normalized. And when $c_i \cdot c_j = 0$ for $i \neq j$, this tells us that differing columns are orthogonal. Hence, such a set of vectors is called orthonormal.
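To see the condition concretely, here's a minimal NumPy sketch (my own illustration, not from the original post): the statement $c_i \cdot c_j = \delta_{ij}$ for all $i, j$ packages into the single matrix equation $M^\top M = I$.

```python
import numpy as np

# A sample orthogonal matrix: the 2D rotation by 30 degrees.
theta = np.pi / 6
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# c_i . c_j = delta_ij for all i, j is equivalent to M^T M = I,
# since (M^T M)_ij is exactly the dot product c_i . c_j.
print(np.allclose(M.T @ M, np.eye(2)))  # True
```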

All eigenvalues of orthogonal matrices have magnitude 1, a fact we will make great use of. This follows from orthogonal matrices preserving lengths: $\lVert Mx \rVert^2 = (Mx) \cdot (Mx) = x \cdot (M^\top M x) = \lVert x \rVert^2$, so an eigenvector can't be stretched or shrunk. Roughly, this means that no stretching or squishing is going on, and this linear transformation is rigid.
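As a quick numerical illustration (a sketch assuming NumPy, not part of the original post), we can sample a random orthogonal matrix and confirm its eigenvalue magnitudes:

```python
import numpy as np

# A standard trick: the Q factor of the QR decomposition of a random
# matrix has orthonormal columns, i.e. is orthogonal.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))

# Every eigenvalue has magnitude 1 (up to floating-point rounding).
print(np.abs(np.linalg.eigvals(Q)))
```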

The orthogonal matrices in 2D are easy to describe[1]. The first column has to be a unit vector, and all of the 2D unit vectors can be parameterized by $(\cos\theta, \sin\theta)$. Then, the second column has to be orthogonal to the first, so a 90° rotation of the first column gives us $(-\sin\theta, \cos\theta)$. (The only other choice, $(\sin\theta, -\cos\theta)$, has determinant $-1$ -- a reflection, not a rotation.) So our final matrix looks like

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \quad \theta \in [0, 2\pi)$$
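To ground that parenthetical about the two choices of second column (an illustration of my own), here they are side by side; the determinant is what separates the rotation from the reflection:

```python
import numpy as np

theta = 1.0  # any angle in [0, 2*pi)
first = np.array([np.cos(theta), np.sin(theta)])

# The only two unit vectors orthogonal to the first column:
rotation   = np.column_stack([first, [-np.sin(theta),  np.cos(theta)]])
reflection = np.column_stack([first, [ np.sin(theta), -np.cos(theta)]])

print(np.linalg.det(rotation))    # +1: a rotation
print(np.linalg.det(reflection))  # -1: a reflection
```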

It's easy to check that the determinant is 1 and that the columns are orthonormal (by construction). $e^{i\theta}$ is an eigenvalue of this matrix, which intuitively makes sense, since it also corresponds to a $\theta$-radian rotation, just in the complex plane. Its eigenvector is $(i, 1)$, which corresponds to the orthogonality of the coordinate axes (observe that $i$ and $1$ are orthogonal in the complex plane). And due to the complex conjugate root theorem, $e^{-i\theta}$ is an eigenvalue as well, corresponding to a rotation of $-\theta$ radians, with the conjugate eigenvector $(-i, 1)$.[2]
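We can verify the eigenvalue claim directly (a sketch assuming NumPy, not from the post): apply the rotation matrix to $(i, 1)$ and its conjugate, and compare against multiplication by $e^{\pm i\theta}$.

```python
import numpy as np

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# R sends (i, 1) to e^{i*theta} * (i, 1) ...
v = np.array([1j, 1.0])
print(np.allclose(R @ v, np.exp(1j * theta) * v))                 # True
# ... and the conjugate eigenvector (-i, 1) to e^{-i*theta} * (-i, 1).
print(np.allclose(R @ v.conj(), np.exp(-1j * theta) * v.conj()))  # True
```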

Rotations in 3D

This is where things get interesting. Unlike in 2D, the 3D orthogonal matrices don't lend themselves to a simple classification. But we can do a bit of work to extract some useful results. First, since complex eigenvalues come in conjugate pairs, there'll always be an even number of them. Since 3 is odd, we have one real eigenvalue left, which must be $\lambda_1 = 1$ since orthogonal matrices have eigenvalues of unit length.[3] So its corresponding eigenvector $v_1$ doesn't change at all under the rotation, and in fact any vector on the axis spanned by $v_1$ won't change either. Thus, the existence of an axis of rotation is guaranteed.
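Numerically, that guaranteed axis is easy to find (my sketch, not the post's): it's the eigenvector attached to the eigenvalue 1.

```python
import numpy as np

# An arbitrary 3D rotation: compose rotations about the z- and x-axes.
cz, sz = np.cos(0.9), np.sin(0.9)
cx, sx = np.cos(0.4), np.sin(0.4)
R = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]) @ \
    np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])

# The axis is the eigenvector for the eigenvalue closest to 1.
vals, vecs = np.linalg.eig(R)
axis = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
print(np.allclose(R @ axis, axis))  # True: vectors on the axis are fixed
```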

But there's more. Remember that conjugate pair of eigenvalues from before? Well, since our eigenvalues have unit length, that means they're of the form $e^{i\theta}$ and $e^{-i\theta}$, which are exactly the eigenvalues of a 2D rotation! In other words, we have a 2D rotation embedded inside our 3D rotation, acting in what's known as the plane of rotation, orthogonal to the axis of rotation.

We can make this intuition a bit more precise. In general, any 3D rotation matrix can be block-diagonalized (in a suitable orthonormal basis) to look like this:

$$\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

There's a 2D rotation matrix hiding in there! This form tells us that there's always a 2D subspace being acted on by a rotation, and then an unchanged 1D subspace left over, which corresponds to the axis of rotation. In particular, because this matrix is block-diagonal, we know that it acts on each subspace independently.
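One way to surface this form numerically (a sketch assuming SciPy; not something the post describes) is the real Schur decomposition. For a normal matrix -- and orthogonal matrices are normal -- the quasi-triangular Schur factor is actually block diagonal, though the block ordering can vary.

```python
import numpy as np
from scipy.linalg import schur

# The same kind of 3D rotation as before.
cz, sz = np.cos(0.9), np.sin(0.9)
cx, sx = np.cos(0.4), np.sin(0.4)
R = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]) @ \
    np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])

# R = Z @ T @ Z.T with Z orthogonal; T holds a 2x2 cos/sin rotation
# block and a 1x1 block containing 1 (up to rounding and ordering).
T, Z = schur(R, output='real')
print(np.round(T, 6))
```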

Rotations in Higher Dimensions

Higher-dimensional rotations have a similar story to 3D. They can always be decomposed into 2D rotations acting independently on orthogonal subspaces, and if the dimension is odd, then there's a 1D axis of rotation left over. Every rotation matrix can be block-diagonalized into the form

$$\begin{pmatrix} \cos\theta_1 & -\sin\theta_1 & 0 & 0 & & 0 \\ \sin\theta_1 & \cos\theta_1 & 0 & 0 & & 0 \\ 0 & 0 & \cos\theta_2 & -\sin\theta_2 & & 0 \\ 0 & 0 & \sin\theta_2 & \cos\theta_2 & & 0 \\ & & & & \ddots & \\ 0 & 0 & 0 & 0 & & 1 \end{pmatrix}$$

just like in 3D. And the eigenvalues are $e^{\pm i\theta_k}$, one conjugate pair per block, like before. But some interesting things happen in higher dimensions. For example, any composition of rotations in 3D is a rotation, but not so in 4D! Why? Well, every rotation comes with a pair of conjugate eigenvalues, so 3 dimensions can only fit one pair (i.e. one rotation), but 4 dimensions let us fit two pairs. More concretely, this corresponds to independent rotations, say in the XY and ZW planes, that can't be combined into a single rotation.
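Here's a sketch of that 4D phenomenon (the plane_rotation helper is my own construction, not from the post): compose independent XY- and ZW-plane rotations and inspect the eigenvalues.

```python
import numpy as np

def plane_rotation(n, i, j, theta):
    # n x n identity with a 2D rotation inserted in coordinates i and j.
    R = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    R[i, i], R[i, j], R[j, i], R[j, j] = c, -s, s, c
    return R

# Independent rotations in the XY and ZW planes of 4D space.
R = plane_rotation(4, 0, 1, 0.5) @ plane_rotation(4, 2, 3, 1.2)

# Two distinct conjugate pairs (angles +/-0.5 and +/-1.2) and no
# eigenvalue 1: no fixed axis, and no single plane of rotation.
print(np.round(np.angle(np.linalg.eigvals(R)), 3))
```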

So it would appear that rotation is fundamentally a 2D concept, and higher dimensions simply emulate the process in 2D subspaces. And the 2D nature of rotations is due to the two dimensions of the complex plane, a concept uniquely well-suited to describing rotations. Multiplying by complex numbers is rotation, after all, so the connection is impossible to avoid.

  1. There's technically a single 1D rotation matrix, the identity $(1)$, since it's the only $1 \times 1$ orthogonal matrix with determinant 1. It's a zero-degree rotation!
  2. The linked theorem leads immediately to eigenvalues coming in conjugate pairs. Eigenvalues are exactly the solutions to the characteristic polynomial $\det(M - \lambda I) = 0$, so if $M$ has all real entries, then the characteristic polynomial has real coefficients and the theorem applies.
  3. $-1$ is the only other real number with unit length, but since rotations have positive determinant equal to the product of all eigenvalues (and each conjugate pair multiplies to $|e^{i\theta}|^2 > 0$), it has to be $+1$.