\( \newcommand{\CC}{\mathbb{C}} \newcommand{\NN}{\mathbb{N}} \newcommand{\RR}{\mathbb{R}} \newcommand{\TT}{\mathbb{T}} \newcommand{\ZZ}{\mathbb{Z}} \newcommand{\H}{\mathcal{H}} \newcommand{\e}{\epsilon} \newcommand{\x}{\mathbf{x}} \newcommand{\y}{\mathbf{y}} \)

\( \newcommand{\HH}{\mathbb{H}} \newcommand{\one}{\mathbb{1}} \newcommand{\i}{\mathbf{i}} \newcommand{\j}{\mathbf{j}} \newcommand{\k}{\mathbf{k}} \newcommand{\wdotsigma}{\mathbf{w}\cdot\boldsymbol{\sigma}} \)

Notes on Quaternions, \(SU(2)\), and \(SO(3)\)

These notes dot the i's and cross the t's of some of the facts about the quaternions, \(SO(3)\) and \(SU(2)\) which are used in Woit, Ch 6. See also my related page on Pauli matrices.

The Basics of \(SO(3)\)

Properties of matrices in \(O(2)\)

Lemma 1. Every matrix in \(SO(2)\) is a rotation, of the form \[ \left[ \begin{array}{cc} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{array} \right] \]

Proof. Let \(S\in SO(2)\). Then \[ S = \left[ \begin{array}{cc} a & c \\ b & d \\ \end{array} \right] \] where, since the columns of \(S\) are orthonormal, \(a^2 + b^2 = c^2 + d^2 = 1\) and \(ac + bd = 0\). Thus \[ \begin{align} ac &= -bd \\ a^2c^2 &= b^2d^2 = b^2(1-c^2) \\ (a^2 + b^2)c^2 &= b^2 \\ c^2 &= b^2 \\ d^2 &= 1 - c^2 = 1 - b^2 = a^2 \\ \end{align} \] Then the relation \(ac = -bd\) implies \(c = \pm b\) and \(d = \mp a\), and since \(S\in SO(2)\) we have \(ad - bc = 1\), so the only choice is \(c = -b\) and \(d = a\). Finally choose \(0 \le\theta <2\pi\) such that \(a=\cos\theta\) and \(b = \sin\theta\), and we are done.

Corollary 2. If \(S\in SO(2)\) and \(S \not= \pm I\) then \(S\) has no real eigenvalues.

Proof. By Lemma 1, the characteristic polynomial of \(S\) is \(\lambda^2 - 2\cos(\theta)\lambda + 1\), and its discriminant is \(4(\cos^2(\theta) - 1)\). This is non-negative only for \(\theta = 0, \pi\), that is, only when \(S = \pm I\).

Lemma 3. Every matrix in \(O(2)\) with determinant \(-1\) is a reflection, meaning that it is orthogonally equivalent to \[ \left[ \begin{array}{cc} 1 & 0 \\ 0 & -1 \\ \end{array} \right] \] In particular, \(1\) is an eigenvalue.

Proof. Let \(S\) be such a matrix. Its characteristic polynomial is a monic quadratic which takes the value \(\det(S) = -1\) at the origin. Thus it has two distinct real roots, one positive and one negative. Since \(S\) preserves lengths, every real eigenvalue of \(S\) is \(\pm 1\), so the roots must be \(1\) and \(-1\). If \(\mathbf{v}_1\) and \(\mathbf{v}_{-1}\) are corresponding eigenvectors then \[ \mathbf{v}_1 \cdot \mathbf{v}_{-1} = S\mathbf{v}_1 \cdot S\mathbf{v}_{-1} = - \mathbf{v}_1 \cdot \mathbf{v}_{-1} \] Therefore \(\mathbf{v}_1 \cdot \mathbf{v}_{-1} = 0\), so the eigenvectors are orthogonal and \(S\) is orthogonally diagonalizable.

Matrices in \(SO(3)\) are rotations

Lemma 4. Every matrix in \(SO(3)\) has \(1\) as an eigenvalue.

Proof. Let \(S\in SO(3)\). The characteristic polynomial of \(S\) is a cubic polynomial which, being of odd degree, always has at least one real root. Let \(\lambda\) be a real eigenvalue of \(S\) and \(\mathbf{x}\) a corresponding unit eigenvector. Then \[ 1 = \mathbf{x}\cdot \mathbf{x} = S\mathbf{x}\cdot S\mathbf{x} = \lambda^2 \] and so \(\lambda = \pm 1\). If \(\lambda = 1\) we are done, so suppose \(\lambda = -1\) and let \(\mathbf{v}_{-1}\) be a unit eigenvector for it. Extend this to an orthonormal basis \(\{\mathbf{u}, \mathbf{w}, \mathbf{v}_{-1}\}\). Since \(S\) is orthogonal and maps the line through \(\mathbf{v}_{-1}\) to itself, it also maps the orthogonal plane spanned by \(\mathbf{u}\) and \(\mathbf{w}\) to itself, so with respect to this basis \[ S = \left[ \begin{array}{ccc} a & c & 0 \\ b & d & 0 \\ 0 & 0 & -1 \\ \end{array} \right] \] Since \(\det(S) = 1\), \[ \det \left[ \begin{array}{cc} a & c \\ b & d \end{array} \right] = -1 \] and, by Lemma 3, this \(2\times 2\) block (which lies in \(O(2)\)) has \(1\) as an eigenvalue; thus so does \(S\).

Theorem 5. Every matrix in \(SO(3)\) is a rotation about a fixed axis.

Proof. Let \(S\in SO(3)\) and, by Lemma 4, let \(\mathbf{v}_1\) be a unit eigenvector for eigenvalue \(1\). Extend this to an orthonormal basis \(\{\mathbf{u} , \mathbf{w}, \mathbf{v}_1\}\) so that with respect to this basis \[ S = \left[ \begin{array}{ccc} a & c & 0 \\ b & d & 0 \\ 0 & 0 & 1 \\ \end{array} \right] \] Since clearly \[ \det \left[ \begin{array}{cc} a & c \\ b & d \end{array} \right] = 1 \] this \(2\times 2\) matrix is in \(SO(2)\) and, by Lemma 1, is a rotation by some angle \(\theta\). Thus \(S\) is a rotation of angle \(\theta\) about the axis \(\mathbf{v}_1\).

Corollary 6. If \(S\in SO(3)\) and \(S \not= I\) then the eigenvalue \(1\) has multiplicity \(1\).

Proof. As in the proof of Theorem 5, let \[ S = \left[ \begin{array}{ccc} a & c & 0 \\ b & d & 0 \\ 0 & 0 & 1 \\ \end{array} \right] \] with respect to a suitably chosen orthonormal basis. If the \(1\)-eigenspace is greater than \(1\)-dimensional, then it must contain a unit vector \([x, y, 0]^T\). But then \([x, y]^T\) is an eigenvector, with eigenvalue \(1\), of \[ \left[ \begin{array}{cc} a & c \\ b & d \end{array} \right] = \left[ \begin{array}{cc} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{array} \right] \] which is only possible if \(\theta = 0\), that is, if \(S = I\).

Euler angles

Write \(R_x(\theta)\) (resp. \(R_y(\theta)\), \(R_z(\theta)\)) for the matrix in \(SO(3)\) of a rotation of angle \(\theta\) about the \(x\) (resp. \(y\), \(z\)) axis. Then given a fixed unit vector \(\mathbf{w}\in\RR^3\), there are two angles \(\alpha\), \(\beta\) such that \[ R_y(\beta)R_z(\alpha)\mathbf{w} = \mathbf{e}_3 \] These are the Euler angles, and are found by rotating \(\mathbf{w}\) about the \(z\)-axis until it lies in the \(xz\)-plane, and then rotating about the \(y\)-axis until the vector points in the positive \(z\)-direction.

From this it follows that the matrix \(R_\mathbf{w}(\theta)\) for rotation by angle \(\theta\) about \(\mathbf{w}\) is given by \[ R_\mathbf{w}(\theta) = R_z(-\alpha)R_y(-\beta)R_z(\theta)R_y(\beta)R_z(\alpha) \]
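The construction above can be sketched numerically. The following is a minimal sketch, not a definitive implementation: the helper names (`euler_angles`, `R_axis`, `matmul`, `apply`) are hypothetical, and it assumes the usual convention that \(R_z(\theta)\) and \(R_y(\theta)\) are counterclockwise rotations about their axes.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(A, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def R_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def R_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def euler_angles(w):
    """Angles (alpha, beta) with R_y(beta) R_z(alpha) w = e_3, for a unit vector w."""
    alpha = -math.atan2(w[1], w[0])   # rotate about z into the xz-plane
    r = math.hypot(w[0], w[1])        # x-component after that rotation
    beta = -math.atan2(r, w[2])       # rotate about y onto the positive z-axis
    return alpha, beta

def R_axis(w, theta):
    """R_w(theta) = R_z(-alpha) R_y(-beta) R_z(theta) R_y(beta) R_z(alpha)."""
    alpha, beta = euler_angles(w)
    factors = [R_z(-alpha), R_y(-beta), R_z(theta), R_y(beta), R_z(alpha)]
    M = factors[0]
    for F in factors[1:]:
        M = matmul(M, F)
    return M

w = [2 / 3, -1 / 3, 2 / 3]            # an arbitrary unit vector
alpha, beta = euler_angles(w)
e3 = apply(matmul(R_y(beta), R_z(alpha)), w)
R = R_axis(w, 0.9)
```

Here `e3` should agree with \([0, 0, 1]^T\) up to rounding, and `R` should fix `w` and have trace \(1 + 2\cos(0.9)\), as any rotation by that angle must.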

The Quaternions

The quaternions are the set \(\HH\) consisting of all numbers of the form \[ a\one + b\i + c\j + d\k \] with \(a, b, c, d \in \RR\), subject to the lovely relations due to Hamilton, \[ \i^2 = \j^2 = \k^2 = \i\j\k = -1 \]

From these we can easily deduce the full (non-commutative) multiplication table: \[ \begin{align} \i\j &= -\j\i = \k \\ \j\k &= -\k\j = \i \\ \k\i &= -\i\k = \j \end{align} \]

(Details: \(\i\j = -(\i\j\k)\k = \k\) and \(\j\k = -\i(\i\j\k) = \i\). Thus \(-1 = (\i\j)^2\) and so \(-\i\j = \i(\i\j\i\j)\j = \j\i\) and, similarly, \(-1 = (\j\k)^2\) and so \(-\j\k = \j(\j\k\j\k)\k = \k\j\). This establishes the first two lines of equalities. Finally, \(-1 = \i\j\k = -\i\k\j\) and so \(-\j = (-\i\k\j)\j = \i\k\), and \(\k\i = (\i\j)(\j\k) = -\i\k\), completing the final line.)
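The multiplication table can also be checked mechanically. The sketch below is an illustration only: it stores \(a\one + b\i + c\j + d\k\) as the tuple `(a, b, c, d)` and encodes the table in a hypothetical helper `qmul`, then confirms Hamilton's relations and the anticommutation rules.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,   # coefficient of 1
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,   # coefficient of i
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,   # coefficient of j
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)   # coefficient of k

def neg(q):
    """Negative of a quaternion."""
    return tuple(-x for x in q)

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# Hamilton's relations: i^2 = j^2 = k^2 = ijk = -1
squares = [qmul(I, I), qmul(J, J), qmul(K, K), qmul(qmul(I, J), K)]
```

Integer coefficients are used so that every comparison is exact.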

The conjugate

The quaternions are a 4-dimensional real vector space with basis \(\one, \i, \j, \k\). Define the conjugate of a quaternion \(q = a\one + b\i + c\j + d\k\) to be \[ \bar{q} = a\one - b\i - c\j - d\k \] This is clearly a real-linear map of \(\HH\) to itself and is an involution (applying it twice gives the identity). Since both sides of the multiplicative rule \(\overline{pq} = \bar{q}\bar{p}\) are real-bilinear in \(p\) and \(q\), to see that conjugation satisfies this rule it suffices to check it for the basis elements. For example, \(\overline{\i\j} = \overline{\k} = -\k\) and \(\overline{\j}\,\overline{\i} = (-\j)(-\i) = \j\i = -\k\). The other relations follow in exactly the same way.

Next define \[ p\cdot q = \frac{1}{2}(p\overline{q} + q\overline{p}) \] This is clearly a real bilinear form on \(\HH\) and, writing \(e_1 = \one, e_2 = \i, e_3 = \j, e_4 = \k\), we can easily verify the relation \[ e_i\cdot e_j = \delta_{ij} \] from the definition of the conjugate and the anticommutativity relations (here, as usual, real multiples of \(\one\) are identified with real numbers). From this it follows that for \(p = \sum_i p_i e_i\) and \(q=\sum_i q_i e_i\), \[ p\cdot q = \sum_i p_i q_i \qquad\text{and}\qquad |p|^2 \stackrel{\text{def}}{=} p\cdot p = \sum_i p_i^2 \] and so \(p\cdot q\) is a real inner product on \(\HH\).
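As a numerical illustration of the two facts just established, the following sketch checks \(\overline{pq} = \bar{q}\bar{p}\) and \(p\cdot q = \sum_i p_i q_i\) on one arbitrarily chosen pair of quaternions. The helper names `qmul`, `conj` and `dot` are hypothetical, and quaternions are stored as tuples `(a, b, c, d)`.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conj(q):
    """Quaternion conjugate: negate the i, j, k coefficients."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def dot(p, q):
    """p . q = (p conj(q) + q conj(p)) / 2, returned as a full quaternion."""
    s = qmul(p, conj(q))
    t = qmul(q, conj(p))
    return tuple((x + y) / 2 for x, y in zip(s, t))

p = (1, 2, 3, 4)
q = (5, -6, 7, 8)
lhs = conj(qmul(p, q))                      # conjugate of the product
rhs = qmul(conj(q), conj(p))                # product of conjugates, reversed
euclid = sum(x * y for x, y in zip(p, q))   # ordinary dot product in R^4
```

The value `dot(p, q)` comes back as a quaternion whose imaginary coefficients vanish and whose real coefficient equals `euclid`, which is the content of the formula above.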

Connection with scalar and vector products in \(\RR^3\)

The real part of the quaternion \[ p = p_0\one + p_1\i + p_2\j + p_3\k \] is the real number \(p_0\), and the imaginary part is the quaternion \(p_1\i + p_2\j + p_3\k\). The purely imaginary quaternions (i.e., those with real part equal to zero) can be identified with vectors in \(\RR^3\). Write \(\mathbf{p}\) for the 3-vector corresponding to the purely imaginary quaternion \(p\). In this case, by slight abuse of notation, if \(p\) and \(q\) are purely imaginary then \[ \tag{1} pq = (-\mathbf{p}\cdot\mathbf{q})\one + \mathbf{p}\times\mathbf{q} \]
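As a small worked instance of (1), chosen here purely for illustration, take \(p = \i + \j\) and \(q = \j + \k\), so that \(\mathbf{p} = [1, 1, 0]^T\) and \(\mathbf{q} = [0, 1, 1]^T\). Multiplying out with the table above, \[ pq = \i\j + \i\k + \j^2 + \j\k = \k - \j - \one + \i \] while \(\mathbf{p}\cdot\mathbf{q} = 1\) and \(\mathbf{p}\times\mathbf{q} = [1, -1, 1]^T\), so that \[ (-\mathbf{p}\cdot\mathbf{q})\one + \mathbf{p}\times\mathbf{q} = -\one + \i - \j + \k \] in agreement with the direct computation.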

(This formula can be extended by linearity to a somewhat more complicated form for the product of arbitrary quaternions, but we don't need that here.)

Action on \(\RR^3\)

Let \(u\) be a fixed unimodular quaternion. Then for any vector \([x, y, z]^T \in \RR^3\), compute \[ u(x\i + y\j + z\k)\overline{u} \] It turns out (see below) that this quaternion is also purely imaginary, and so is of the form \(x'\i + y'\j + z'\k\), which we can map back to the vector \([x', y', z']^T \in \RR^3\). This mapping \([x, y, z]^T \mapsto [x', y', z']^T\) is clearly real linear and we shall see that it defines an element of \(SO(3)\).

First note that since \(u\) is unimodular, \(u\overline{u} = \overline{u}u = |u|^2 = 1\) and so for any \(p, q \in \HH\), \[ \begin{align} (u p \bar{u})\cdot(u q \bar{u}) &= \frac{1}{2}(u p \bar{u}u \bar{q} \bar{u} + u q \bar{u} u \bar{p} \bar{u}) \\ &= u\left[ \frac{1}{2}(p\bar{q} + q\bar{p}) \right] \bar{u} \\ &= u(p\cdot q)\bar{u} \\ &= p\cdot q \end{align} \] where the last step uses that \(p\cdot q\) is real and so commutes with \(u\). In particular, conjugation by \(u\) maps orthogonal spaces to orthogonal spaces. Thus since conjugation by \(u\) clearly maps \(\RR\one\) to itself, it must also map its orthogonal complement, the purely imaginary quaternions, to itself. Thus the linear map \(U\) induced on \(\RR^3\) is well-defined and, in addition, is clearly an orthogonal map, i.e., in \(O(3)\). Finally, writing \(\mathbf{e}_x\), \(\mathbf{e}_y\), and \(\mathbf{e}_z\) for the standard basis of \(\RR^3\), note that the determinant of \(U\) is equal to the scalar triple product \[ U(\mathbf{e}_x)\cdot U(\mathbf{e}_y)\times U(\mathbf{e}_z) \] However, from (1) above, this triple product is equal to the negative of the real part of the product \[ (u\i\overline{u})(u\j\overline{u})(u\k\overline{u}) = u\i\j\k\overline{u} = -1 \] so \(\det U = 1\). It follows that \(U\in SO(3)\).
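The argument above can be illustrated numerically. The sketch below (hypothetical helper names, quaternions stored as tuples `(a, b, c, d)`) builds the matrix \(U\) column by column from the images of \(\i\), \(\j\), \(\k\) under \(v \mapsto uv\overline{u}\) for one arbitrarily chosen unimodular \(u\), so that one can confirm \(U^TU = I\) and \(\det U = 1\) up to rounding.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def conjugate_by(u, v):
    """Return u v conj(u) as a quaternion."""
    return qmul(qmul(u, v), conj(u))

def phi(u):
    """3x3 matrix of [x, y, z] -> [x', y', z'] induced by v -> u v conj(u)."""
    basis = [(0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
    images = [conjugate_by(u, e) for e in basis]
    # Column j holds the i, j, k coefficients of the image of the j-th basis vector.
    return [[images[j][i + 1] for j in range(3)] for i in range(3)]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

norm = math.sqrt(1 + 4 + 9 + 16)
u = tuple(x / norm for x in (1.0, 2.0, 3.0, 4.0))   # an arbitrary unit quaternion
U = phi(u)
# Gram matrix U^T U, which should be the identity for an orthogonal matrix
gram = [[sum(U[k][i] * U[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]
```

The real coefficient of each image `conjugate_by(u, e)` is zero up to rounding, which is why `phi` may simply discard it.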

Double cover of \(SO(3)\)

Now write \(\Phi\) for the map which takes a unimodular quaternion to an element of \(SO(3)\). It's easily seen that \(\Phi\) is multiplicative (i.e., a group homomorphism) and is (at least) two-to-one, because \(u\) and \(-u\) induce the same linear map on \(\RR^3\).

To see that \(\Phi\) is a double cover of \(SO(3)\), we must show that it is surjective and precisely two-to-one. This follows from the following lemma:

Lemma 7. Let \(u = \cos\theta\one + \sin\theta w\), where \(\theta\in\RR\) and \(w\) is a unimodular, purely imaginary quaternion. Then \(U := \Phi(u)\) is the transformation of rotation by angle \(2\theta\) about the vector \(\mathbf{w}\).

Proof. First consider \(w = \k\) (i.e., for rotation about the \(z\)-axis). For \(\mathbf{v} = [x, y, z]^T \in \RR^3\), write \(v = x\i + y\j + z\k\). Then \[ \begin{align} uv\overline{u} &= (\cos\theta\one + \sin\theta \k)(x\i + y\j + z\k) (\cos\theta\one - \sin\theta \k) \\ &= \cos^2\theta(x\i + y\j + z\k) \\ &\qquad + \sin\theta\cos\theta(x\k\i + y\k\j + z\k^2) \\ &\qquad - \cos\theta\sin\theta(x\i\k + y\j\k + z\k^2) \\ &\qquad - \sin^2\theta(x\k\i\k + y\k\j\k + z\k^3) \\ &= \cos^2\theta(x\i + y\j + z\k) \\ &\qquad + \sin\theta\cos\theta(x\j - y\i - z\one) \\ &\qquad - \cos\theta\sin\theta(-x\j + y\i - z\one) \\ &\qquad - \sin^2\theta(x\i + y\j - z\k) \\ &= (-\sin\theta\cos\theta z + \cos\theta\sin\theta z)\one \\ &\qquad + (\cos^2\theta x - \sin\theta\cos\theta y - \cos\theta\sin\theta y - \sin^2\theta x)\i \\ &\qquad + (\cos^2\theta y + \sin\theta\cos\theta x + \cos\theta\sin\theta x - \sin^2\theta y)\j \\ &\qquad + (\cos^2\theta z + \sin^2\theta z)\k \\ &= (\cos(2\theta)x - \sin(2\theta)y)\i + (\sin(2\theta)x + \cos(2\theta)y)\j + z\k \end{align} \] As an action on \(\RR^3\) this means that \(\Phi(u)\) acts as \[ \left[ \begin{array}{ccc} \cos(2\theta) & -\sin(2\theta) & 0 \\ \sin(2\theta) & \cos(2\theta) & 0 \\ 0 & 0 & 1 \end{array} \right] \] which is a rotation of angle \(2\theta\) about the \(z\)-axis. Similar calculations, mutatis mutandis, show that the corresponding results hold for \(w=\i\) and \(w=\j\).

To handle the general case, write the matrix \(R_\mathbf{w}(\theta)\) for rotation by angle \(\theta\) about \(\mathbf{w}\) as \[ R_\mathbf{w}(\theta) = R_z(-\alpha)R_y(-\beta)R_z(\theta)R_y(\beta)R_z(\alpha) \] (see the section on Euler angles). Writing, for the rest of this proof, \[ u = \cos(\alpha/2)\one + \sin(\alpha/2) \k \quad\text{and}\quad v = \cos(\beta/2)\one + \sin(\beta/2) \j \] the cases already established give \(\Phi(u) = R_z(\alpha)\) and \(\Phi(v) = R_y(\beta)\), and since \(\Phi\) is a homomorphism, \[ \begin{align} R_\mathbf{w}(2\theta) &= \Phi(\bar{u}\bar{v}(\cos(\theta)\one + \sin(\theta) \k)vu) \\ &= \Phi(\cos(\theta)\one + \sin(\theta) (\bar{u}\bar{v}\k vu)) \end{align} \] But we already know that \(\bar{u}\bar{v}\k vu\) gives the action of \(R_z(-\alpha)R_y(-\beta)\) on \(\mathbf{e}_3\), which is \(w\). It follows that \[ R_\mathbf{w}(2\theta) = \Phi(\cos(\theta)\one + \sin(\theta)w) \] as required.
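As a quick sanity check of Lemma 7 in the simplest case, chosen here only for illustration, take \(w = \k\) and \(\theta = \pi/4\), so that the unimodular quaternion in question is \(\frac{1}{\sqrt{2}}(\one + \k)\). Then \[ \begin{align} \tfrac{1}{2}(\one + \k)\,\i\,(\one - \k) &= \tfrac{1}{2}(\i + \j)(\one - \k) \\ &= \tfrac{1}{2}(\i - \i\k + \j - \j\k) \\ &= \tfrac{1}{2}(\i + \j + \j - \i) = \j \end{align} \] so \(\mathbf{e}_x\) is sent to \(\mathbf{e}_y\), exactly as a rotation by \(2\theta = \pi/2\) about the \(z\)-axis should do.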

Theorem 8. The map \(\Phi\) is a homomorphism from the group of unimodular quaternions onto \(SO(3)\) and is exactly two-to-one.

Proof. Every matrix in \(SO(3)\) is a rotation of the form \(R_\mathbf{w}(\theta)\) for some \(\theta\in\RR\) and unit vector \(\mathbf{w}\). By the last lemma, \(R_\mathbf{w}(\theta) = \Phi(\cos(\theta/2)\one + \sin(\theta/2)w)\) and so \(\Phi\) is surjective. We have seen that \(\Phi(u) = \Phi(-u)\) for every unimodular \(u\). It remains to show that if \(\Phi(u) = \Phi(v)\) then \(u = \pm v\).

First, consider the case when \(\Phi(u) = I\). Since \(u\) is unimodular we can find an angle \(0\le \theta < 2\pi\) and a unimodular purely imaginary quaternion \(\hat{u}\) such that \[ u = \cos\theta\one + \sin\theta\hat{u} \] But by Lemma 7 this means that the matrix \(R_{\hat{\mathbf{u}}}(2\theta) = I\), so \(2\theta\) is a multiple of \(2\pi\) and \(\theta\) can only be \(0\) or \(\pi\). Correspondingly, \(u = \pm \one\).

In general, if \(\Phi(u) = \Phi(v)\) then since \(\Phi\) is a homomorphism, \(\Phi(uv^{-1}) = I\), so \(uv^{-1} = \pm \one\), that is, \(u = \pm v\), and we are done.
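The statements of Lemma 7 and Theorem 8 lend themselves to a numerical spot check. The following is a minimal sketch with hypothetical helper names (`qmul`, `conj`, `phi`, `matmul`, `close`), storing quaternions as tuples `(a, b, c, d)`. It checks on arbitrarily chosen inputs that \(\Phi(u) = \Phi(-u)\), that \(\Phi(uv) = \Phi(u)\Phi(v)\), and that \(\Phi(\cos\theta\one + \sin\theta w)\) fixes \(\mathbf{w}\) and has the trace \(1 + 2\cos(2\theta)\) of a rotation by \(2\theta\).

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def phi(u):
    """3x3 matrix induced on R^3 by v -> u v conj(u), for a unit quaternion u."""
    basis = [(0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
    images = [qmul(qmul(u, e), conj(u)) for e in basis]
    return [[images[j][i + 1] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two 3x3 matrices."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))

def axis_angle(w, theta):
    """The unit quaternion cos(theta) 1 + sin(theta) w for a unit 3-vector w."""
    s = math.sin(theta)
    return (math.cos(theta), s * w[0], s * w[1], s * w[2])

w = (2 / 3, -1 / 3, 2 / 3)
theta = 0.4
u = axis_angle(w, theta)
v = axis_angle((0.0, 0.6, 0.8), 1.1)
minus_u = tuple(-x for x in u)
U = phi(u)
Uw = [sum(U[i][k] * w[k] for k in range(3)) for i in range(3)]
trace = U[0][0] + U[1][1] + U[2][2]
```

A check like this is of course no substitute for the proof; it only guards against sign and factor-of-two slips in the conventions.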

The Group \(SU(2)\)

Action on \(\RR^3\)

Let \(w\) be a unimodular, purely imaginary quaternion, \(\theta\) a real number, and \(u = \cos\theta\one + \sin\theta w\); and let \(v = x\i + y\j + z\k\). We saw above that \(v \mapsto uv\bar u\) acts as a rotation of the point \([x,y,z]^T\) by the angle \(2\theta\) about the axis \(\mathbf{w}\). In \(SU(2)\) the corresponding action is \[ (xX_1 + yX_2 + zX_3) \longmapsto U(xX_1 + yX_2 + zX_3)U^* \] where \[ U = \cos\theta I + \sin\theta (w_1 X_1 + w_2X_2 + w_3 X_3) \] It's convenient to write \[ w_1 X_1 + w_2X_2 + w_3 X_3 = \mathbf{w}\cdot\mathbf{X} = -i\wdotsigma \] where \(\sigma_1, \sigma_2, \sigma_3\) are the Pauli matrices and \(X_j = -i\sigma_j\).

Exponentials

If \(\mathbf{w}\) is a unit vector in \(\RR^3\) then, by direct multiplication using the fundamental relations of the quaternions, \[ w^2 = -\one \quad\text{and}\quad (-i \wdotsigma)^2 = - I \] Focusing on the case of \(SU(2)\) this says that \((\wdotsigma)^2 = I\) and so \[ (\wdotsigma)^k = \begin{cases} \wdotsigma & k \text{ is odd} \\ I & k \text{ is even} \end{cases} \] Thus \[ \begin{align} e^{-i\theta \wdotsigma} &= \sum_{k=0}^\infty \frac{(-i\theta \wdotsigma)^k}{k!} \\ &= \sum_{k \text{ even}} \frac{(-i\theta)^k}{k!}I + \sum_{k \text{ odd}} \frac{(-i\theta)^k}{k!}\wdotsigma \\ &= \cos\theta I - i\sin\theta\; \wdotsigma \\ &= \cos\theta I + \sin\theta\; \mathbf{w}\cdot\mathbf{X} \end{align} \] and so conjugation by \(e^{-i\theta \wdotsigma}\) implements rotation by angle \(2\theta\) about \(\mathbf{w}\) on the copy of \(\RR^3\) embedded in \(M_2(\CC)\) as \[ xX_1 + yX_2 + zX_3 = \begin{bmatrix} - i z & -y - i x \\ y - i x & i z \end{bmatrix} \]
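The closed form for the exponential can be compared against a truncated power series. The sketch below is illustrative only: it assumes the standard Pauli matrices, so that \(\wdotsigma\) has rows \([w_3,\ w_1 - i w_2]\) and \([w_1 + i w_2,\ -w_3]\), and the helper names `mat_mul`, `mat_exp` and `w_dot_sigma` are hypothetical.

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(A, terms=40):
    """Truncated power series sum_{k=0}^{terms-1} A^k / k! for a 2x2 matrix."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]    # running sum, starts at the k = 0 term I
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]      # current term A^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def w_dot_sigma(w):
    """w1*sigma1 + w2*sigma2 + w3*sigma3 for the standard Pauli matrices (assumed)."""
    w1, w2, w3 = w
    return [[w3, w1 - 1j * w2], [w1 + 1j * w2, -w3]]

theta = 0.7
S = w_dot_sigma((2 / 3, -1 / 3, 2 / 3))      # an arbitrary unit vector w
S_squared = mat_mul(S, S)                    # should be the identity
series = mat_exp([[-1j * theta * x for x in row] for row in S])
closed = [[math.cos(theta) * (i == j) - 1j * math.sin(theta) * S[i][j]
           for j in range(2)] for i in range(2)]
```

With forty terms the truncation error at \(\theta = 0.7\) is far below double-precision rounding, so `series` and `closed` should agree to about machine accuracy.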