\(
\newcommand{\CC}{\mathbb{C}}
\newcommand{\NN}{\mathbb{N}}
\newcommand{\RR}{\mathbb{R}}
\newcommand{\TT}{\mathbb{T}}
\newcommand{\ZZ}{\mathbb{Z}}
\newcommand{\H}{\mathcal{H}}
\newcommand{\e}{\epsilon}
\newcommand{\x}{\mathbf{x}}
\newcommand{\y}{\mathbf{y}}
\)
# Notes on Quaternions, \(SU(2)\), and \(SO(3)\)

## The Basics of \(SO(3)\)

### Properties of matrices in \(O(2)\)

### Matrices in \(SO(3)\) are rotations

### Euler angles

## The Quaternions

### The conjugate

### Connection with scalar and vector products in \(\RR^3\)

### Action on \(\RR^3\)

### Double cover of \(SO(3)\)

## The Group \(SU(2)\)

### Action on \(\RR^3\)

### Exponentials

\( \newcommand{\HH}{\mathbb{H}} \newcommand{\one}{\mathbb{1}} \newcommand{\i}{\mathbf{i}} \newcommand{\j}{\mathbf{j}} \newcommand{\k}{\mathbf{k}} \newcommand{\wdotsigma}{\mathbf{w}\cdot\boldsymbol{\sigma}} \)

These notes dot the i's and cross the t's of some of the facts about the quaternions, \(SO(3)\) and \(SU(2)\) which are used in Woit, Ch 6. See also my related page on Pauli matrices.

Lemma 1. Every matrix in \(SO(2)\) is a rotation, of the form \[ \left[ \begin{array}{cc} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{array} \right] \]

Proof. Let \(S\in SO(2)\). Then \[ S = \left[ \begin{array}{cc} a & c \\ b & d \\ \end{array} \right] \] where \(a^2 + b^2 = c^2 + d^2 = 1\) and \(ac + bd = 0\). Thus \[ \begin{align} ac &= -bd \\ a^2c^2 &= b^2d^2 = b^2(1-c^2) \\ (a^2 + b^2)c^2 &= b^2 \\ c^2 &= b^2 \\ d^2 &= 1 - c^2 = 1 - b^2 = a^2 \\ \end{align} \] Then the relation \(ac = -bd\) implies \(c = \pm b\) and \(d = \mp a\) and since \(S\in SO(2)\), \(ad - bc = 1\), and so the only choice is \(c = -b\) and \(d = a\). Finally choose \(0 \le\theta <2\pi\) such that \(a=\cos\theta\) and \(b = \sin\theta\) and we are done.

Corollary 2. If \(S\in SO(2)\) and \(S \not= \pm I\) then \(S\) has no real eigenvalues.

Proof. The characteristic equation of \(S\) is \(\lambda^2 - 2\cos(\theta)\lambda + 1\) and its discriminant is \(4(\cos^2(\theta) - 1)\). This is only non-negative for \(\theta = 0, \pi\).

Lemma 3. Every matrix in \(O(2)\) with determinant \(-1\) is a reflection, meaning that it is orthogonally equivalent to \[ \left[ \begin{array}{cc} 1 & 0 \\ 0 & -1 \\ \end{array} \right] \] In particular, \(1\) is an eigenvalue.

Proof. The characteristic polynomial is a monic quadratic which is \(-1\) at the origin. Thus it has two distinct roots, which therefore must be \(\pm 1\). If \(\mathbf{v}_1\) and \(\mathbf{v}_{-1}\) are corresponding eigenvectors then \[ \mathbf{v}_1 \cdot \mathbf{v}_{-1} = S\mathbf{v}_1 \cdot S\mathbf{v}_{-1} = - \mathbf{v}_1 \cdot \mathbf{v}_{-1} \] Therefore \(\mathbf{v}_1\) and \(\mathbf{v}_{-1}\) are orthogonal and \(S\) is orthogonally diagonalizable.

Lemma 4. Every matrix in \(SO(3)\) has \(1\) as an eigenvalue.

Proof. Let \(S\in SO(3)\). The characteristic polynomial for \(S\) is a cubic polynomial which, being of odd degree, always has at least one real root. Let \(\lambda\) be an eigenvalue of \(S\) and \(\mathbf{x}\) a corresponding unit eigenvector. Then \[ 1 = \mathbf{x}\cdot \mathbf{x} = \mathbf{Sx}\cdot \mathbf{Sx} = \lambda^2 \] and so \(\lambda = \pm 1\). If \(\lambda = 1\) we are done so suppose \(\lambda = -1\) and let \(\mathbf{v}_{-1}\) be a unit eigenvector for it. Construct an orthonormal basis \(\{\mathbf{u}, \mathbf{w}, \mathbf{v}_{-1}\}\) so that with respect to this basis \[ S = \left[ \begin{array}{ccc} a & c & 0 \\ b & d & 0 \\ 0 & 0 & -1 \\ \end{array} \right] \] Since \(\det(S) = 1\), \[ \det \left[ \begin{array}{cc} a & c \\ b & d \end{array} \right] = -1 \] and, by Lemma 3, this has 1 as an eigenvalue, thus, so does \(S\).
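Lemma 4 can be illustrated numerically: generate a matrix in \(SO(3)\) and confirm that \(1\) appears among its eigenvalues. This is a sketch assuming numpy; the QR trick for producing a random orthogonal matrix is standard but not part of the text.

```python
import numpy as np

# Produce a pseudo-random orthogonal matrix via QR factorization,
# then flip a column if necessary to land in SO(3).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] = -Q[:, 0]

assert np.isclose(np.linalg.det(Q), 1.0)
# Lemma 4: 1 is an eigenvalue (up to floating-point error).
assert np.min(np.abs(np.linalg.eigvals(Q) - 1)) < 1e-12
```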

Theorem 5. Every matrix in \(SO(3)\) is a rotation about a fixed axis.

Proof. Let \(S\in SO(3)\) and, by Lemma 4, let \(\mathbf{v}_1\) be a unit eigenvector for eigenvalue \(1\). Extend this to an orthonormal basis \(\{\mathbf{u} , \mathbf{w}, \mathbf{v}_1\}\) so that with respect to this basis \[ S = \left[ \begin{array}{ccc} a & c & 0 \\ b & d & 0 \\ 0 & 0 & 1 \\ \end{array} \right] \] Since clearly \[ \det \left[ \begin{array}{cc} a & c \\ b & d \end{array} \right] = 1 \] this matrix is in \(SO(2)\) and by Lemma 1, is a rotation by angle \(\theta\). Thus \(S\) is a rotation of angle \(\theta\) about the axis \(\mathbf{v}_1\).

Corollary 6. If \(S\in SO(3)\) and \(S \not= I\) then the eigenvalue \(1\) has multiplicity \(1\).

Proof. As in the proof of the Theorem 5, let \[ S = \left[ \begin{array}{ccc} a & c & 0 \\ b & d & 0 \\ 0 & 0 & 1 \\ \end{array} \right] \] with respect to a suitably-chosen orthonormal basis. If the \(1\)-eigenspace is greater than \(1\)-dimensional, then it must contain a unit vector \([x, y, 0]^T\). But then \([x, y]^T\) is an eigenvector of \[ \left[ \begin{array}{cc} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{array} \right] \] which is only possible if \(\theta = 0\).

Write \(R_x(\theta)\) (resp. \(R_y(\theta)\), \(R_z(\theta)\)) for the matrix in \(SO(3)\) of a rotation of angle \(\theta\) about the \(x\) (resp \(y\), \(z\)) axis. Then given a fixed unit vector \(\mathbf{w}\in\RR^3\), there are two angles \(\alpha\), \(\beta\) such that \[ R_y(\beta)R_z(\alpha)\mathbf{w} = \mathbf{e}_3 \] These are the Euler angles, and are found by rotating \(\mathbf{w}\) about the \(z\)-axis until it lies in the \(xz\)-plane, and then rotating about the \(y\)-axis until the vector points in the positive \(z\)-direction.

From this it follows that the matrix \(R_\mathbf{w}(\theta)\) for rotation by angle \(\theta\) about \(\mathbf{w}\) is given by \[ R_\mathbf{w}(\theta) = R_z(-\alpha)R_y(-\beta)R_z(\theta)R_y(\beta)R_z(\alpha) \]
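The construction above can be checked numerically. The following sketch (assuming numpy; the helpers `Rz` and `Ry` and the atan2-based choice of angles are mine, not from the text) finds \(\alpha\), \(\beta\) for a sample unit vector and verifies that the resulting \(R_\mathbf{w}(\theta)\) fixes \(\mathbf{w}\):

```python
import numpy as np

def Rz(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

def Ry(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0., s], [0., 1., 0.], [-s, 0., c]])

w = np.array([1., 2., 2.]) / 3.0            # a unit vector
alpha = -np.arctan2(w[1], w[0])             # rotate w into the xz-plane
r = np.hypot(w[0], w[1])
beta = np.arctan2(-r, w[2])                 # then onto the positive z-axis
assert np.allclose(Ry(beta) @ Rz(alpha) @ w, [0., 0., 1.])

theta = 0.7
Rw = Rz(-alpha) @ Ry(-beta) @ Rz(theta) @ Ry(beta) @ Rz(alpha)
assert np.allclose(Rw @ w, w)               # w is fixed by the rotation
assert np.allclose(Rw @ Rw.T, np.eye(3))    # Rw is orthogonal
assert np.isclose(np.linalg.det(Rw), 1.0)   # and in SO(3)
```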

The quaternions are the set \(\HH\) consisting of all numbers of the form \[ a\one + b\i + c\j + d\k \] with the lovely relations due to Hamilton, \[ \i^2 = \j^2 = \k^2 = \i\j\k = -1 \]

From these we can easily deduce the full (non-commutative) multiplication table: \[ \begin{align} \i\j &= -\j\i = \k \\ \j\k &= -\k\j = \i \\ \k\i &= -\i\k = \j \end{align} \]

(Details: \(\i\j = -(\i\j\k)\k = \k\) and \(\j\k = -\i(\i\j\k) = \i\). Thus \(-1 = (\i\j)^2\) and so \(-\i\j = \i(\i\j\i\j)\j = \j\i\) and, similarly, \(-1 = (\j\k)^2\) and so \(-\j\k = \j(\j\k\j\k)\k = \k\j\). This establishes the first two lines of equalities. Finally, \(-1 = \i\j\k = -\i\k\j\) and so \(-\j = (-\i\k\j)\j = \i\k\), and \(\k\i = (\i\j)(\j\k) = -\i\k\), completing the final line.)
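The multiplication table is easy to confirm numerically. Below is a minimal sketch (assuming numpy) in which `qmul` is an ad hoc implementation of Hamilton's product, with quaternions stored as 4-vectors \((a, b, c, d)\):

```python
import numpy as np

def qmul(p, q):
    # Hamilton product of p = a + bi + cj + dk and q = w + xi + yj + zk
    a, b, c, d = p
    w, x, y, z = q
    return np.array([
        a*w - b*x - c*y - d*z,
        a*x + b*w + c*z - d*y,
        a*y - b*z + c*w + d*x,
        a*z + b*y - c*x + d*w,
    ])

one = np.array([1., 0., 0., 0.])
i = np.array([0., 1., 0., 0.])
j = np.array([0., 0., 1., 0.])
k = np.array([0., 0., 0., 1.])

assert np.allclose(qmul(i, j), k)              # ij = k
assert np.allclose(qmul(j, i), -k)             # ji = -k
assert np.allclose(qmul(j, k), i)              # jk = i
assert np.allclose(qmul(k, i), j)              # ki = j
assert np.allclose(qmul(i, i), -one)           # i^2 = -1
assert np.allclose(qmul(qmul(i, j), k), -one)  # ijk = -1
```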

The quaternions are a 4-dimensional real vector space with
basis \(\one, \i, \j, \k\). Define the *conjugate* of a quaternion
\(q = a\one + b\i + c\j + d\k\) to be
\[
\bar{q} = a\one - b\i - c\j - d\k
\]
This is clearly an involutive, real-linear map of \(\HH\) to itself.
Thus to see that conjugation also satisfies the multiplicative rule
\(\overline{pq} = \bar{q}\bar{p}\) it suffices to check the relation
for the basis elements. For example, \(\overline{\i\j} = \overline{\k} = -\k\)
and \(\overline{\j}\overline{\i} = (-\j)(-\i) = \j\i = -\k\).
The other relations follow in exactly the same way.

Next define \[ p\cdot q = \frac{1}{2}(p\overline{q} + q\overline{p}) \] This is clearly a real bilinear form on \(\HH\) and, writing \(e_1 = \one, e_2 = \i, e_3 = \j, e_4 = \k\), we can easily verify the relation \[ e_i\cdot e_j = \delta_{ij} \] from the definition of the conjugate and the anticommutativity relations. From this it follows that for \(p = \sum p_ie_i\) and \(q=\sum_i q_i e_i\), \[ p\cdot q = \sum_i p_i q_i \qquad\text{and}\qquad |p|^2 \stackrel{\text{def}}{=} p\cdot p = \sum p_i^2 \] and so \(p\cdot q\) is a real inner product on \(\HH\).

The *real part* of the quaternion
\[
p = p_0\one + p_1\i + p_2\j + p_3\k
\]
is the real number \(p_0\), and the *imaginary part* is the quaternion
\(p_1\i + p_2\j + p_3\k\). The purely imaginary quaternions (i.e., those
with real part equal to zero) can be identified with vectors in
\(\RR^3\). Write \(\mathbf{p}\) for the 3-vector corresponding to the
purely imaginary quaternion \(p\). In this case, by slight abuse of notation,
if \(p\) and \(q\) are purely imaginary then
\[
\tag{1}
pq = (-\mathbf{p}\cdot\mathbf{q})\one + \mathbf{p}\times\mathbf{q}
\]

(This formula can be extended by linearity to a somewhat more complicated form for the product of arbitrary quaternions, but we don't need that here.)
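Formula (1) can be spot-checked numerically. In this sketch (assuming numpy) `qmul` is again an ad hoc Hamilton product, and `pv`, `qv` are the 3-vector parts of two purely imaginary quaternions:

```python
import numpy as np

def qmul(p, q):
    # Hamilton product on quaternions stored as 4-vectors (a, b, c, d)
    a, b, c, d = p
    w, x, y, z = q
    return np.array([a*w - b*x - c*y - d*z,
                     a*x + b*w + c*z - d*y,
                     a*y - b*z + c*w + d*x,
                     a*z + b*y - c*x + d*w])

pv = np.array([1., -2., 0.5])                   # vector part of p
qv = np.array([0.3, 4., -1.])                   # vector part of q
p = np.concatenate([[0.], pv])                  # purely imaginary quaternions
q = np.concatenate([[0.], qv])

lhs = qmul(p, q)
rhs = np.concatenate([[-pv @ qv], np.cross(pv, qv)])  # (-p.q)1 + p x q
assert np.allclose(lhs, rhs)
```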

Let \(u\) be a fixed unimodular quaternion. Then for any vector \([x, y, z]^T \in \RR^3\), compute \[ u(x\i + y\j + z\k)\overline{u} \] It turns out (see below) that this quaternion is also purely imaginary, and so is of the form \(x'\i + y'\j + z'\k\), which we can map back to the vector \([x', y', z']^T \in \RR^3\). This mapping \([x, y, z]^T \mapsto [x', y', z']^T\) is clearly real linear and we shall see that it defines an element of \(SO(3)\).

First note that since \(u\) is unimodular, \(u\overline{u} = |u|^2 = 1\) and so for any \(p, q \in \HH\), \[ \begin{align} (u p \bar{u})\cdot(u q \bar{u}) &= \frac{1}{2}(u p \bar{u}u \bar{q} \bar{u} + u q \bar{u} u \bar{p} \bar{u}) \\ &= u\left[ \frac{1}{2}(p\bar{q} + q\bar{p}) \right] \bar{u} \\ &= u(p\cdot q)\bar{u} = p\cdot q \end{align} \] where the last step holds because \(p\cdot q\) is a real multiple of \(\one\) and so commutes with \(u\). In particular, conjugation by \(u\) preserves the inner product and maps orthogonal spaces to orthogonal spaces. Thus since conjugation by \(u\) clearly maps \(\RR\one\) to itself, it must also map the purely imaginary quaternions to themselves. Thus the linear map \(U\) induced on \(\RR^3\) is well-defined and, in addition, is clearly an orthogonal map, i.e., in \(O(3)\). Finally, writing \(\mathbf{e}_x\), \(\mathbf{e}_y\), and \(\mathbf{e}_z\) for the standard basis of \(\RR^3\), note that the determinant of \(U\) is equal to the scalar triple product \[ U(\mathbf{e}_x)\cdot U(\mathbf{e}_y)\times U(\mathbf{e}_z) \] However, from (1) above, this triple product is equal to the negative of the real part of the product \[ (u\i\overline{u})(u\j\overline{u})(u\k\overline{u}) = u\i\j\k\overline{u} = -1 \] It follows that \(U\in SO(3)\).

Now write \(\Phi\) for the map which takes a unimodular quaternion to the corresponding element of \(SO(3)\). It's easily seen that \(\Phi\) is multiplicative (i.e., a group homomorphism) and is (at least) two-to-one, because \(u\) and \(-u\) induce the same linear map on \(\RR^3\).

To see that \(\Phi\) is a double cover of \(SO(3)\), we must show that it is surjective and precisely two-to-one. This follows from the following lemma:

Lemma 7. Let \(u = \cos\theta\one + \sin\theta w\), where \(\theta\in\RR\) and \(w\) is a unimodular, purely imaginary quaternion. Then \(U := \Phi(u)\) is the transformation of rotation by angle \(2\theta\) about the vector \(\mathbf{w}\).

Proof.
First consider \(w = \k\) (i.e., for rotation about the \(z\)-axis).
For \(\mathbf{v} = [x, y, z]^T \in \RR^3\), write
\(v = x\i + y\j + z\k\). Then
\[
\begin{align}
uv\overline{u}
&= (\cos\theta\one + \sin\theta \k)(x\i + y\j + z\k)
(\cos\theta\one - \sin\theta \k) \\
%
&= \cos^2\theta(x\i + y\j + z\k) \\
&\qquad + \sin\theta\cos\theta(x\k\i + y\k\j + z\k^2) \\
&\qquad - \cos\theta\sin\theta(x\i\k + y\j\k + z\k^2) \\
&\qquad - \sin^2\theta(x\k\i\k + y\k\j\k + z\k^3) \\
%
&= \cos^2\theta(x\i + y\j + z\k) \\
&\qquad + \sin\theta\cos\theta(x\j - y\i - z\one) \\
&\qquad - \cos\theta\sin\theta(-x\j + y\i - z\one) \\
&\qquad - \sin^2\theta(x\i + y\j - z\k) \\
%
&= (-\sin\theta\cos\theta z + \cos\theta\sin\theta z)\one \\
&\qquad + (\cos^2\theta x - \sin\theta\cos\theta y -
\cos\theta\sin\theta y - \sin^2\theta x)\i \\
&\qquad + (\cos^2\theta y + \sin\theta\cos\theta x +
\cos\theta\sin\theta x - \sin^2\theta y)\j \\
&\qquad + (\cos^2\theta z + \sin^2\theta z)\k \\
&= (\cos(2\theta)x - \sin(2\theta)y)\i
+ (\sin(2\theta)x + \cos(2\theta)y)\j + z\k
\end{align}
\]
As an action on \(\RR^3\) this means that \(\Phi(u)\) acts as
\[
\left[
\begin{array}{ccc}
\cos(2\theta) & -\sin(2\theta) & 0 \\
\sin(2\theta) & \cos(2\theta) & 0 \\
0 & 0 & 1
\end{array}
\right]
\]
which is a rotation of angle \(2\theta\) about the \(z\)-axis. Similar
calculations, *mutatis mutandis*, show that the corresponding
results hold for \(w=\i\) and \(w=\j\).

To handle the general case, write the matrix \(R_\mathbf{w}(\theta)\) for rotation by angle \(\theta\) about \(\mathbf{w}\) as \[ R_\mathbf{w}(\theta) = R_z(-\alpha)R_y(-\beta)R_z(\theta)R_y(\beta)R_z(\alpha) \] (see the section on Euler angles). Write \[ u = \cos(\alpha/2)\one + \sin(\alpha/2) \k \quad\text{and}\quad v = \cos(\beta/2)\one + \sin(\beta/2) \j \] Then, since \(\Phi\) is a homomorphism, \[ \begin{align} R_\mathbf{w}(2\theta) &= \Phi(\bar{u}\bar{v}(\cos(\theta)\one + \sin(\theta) \k)vu) \\ &= \Phi(\cos(\theta)\one + \sin(\theta) (\bar{u}\bar{v}\k vu)) \\ \end{align} \] But we already know that \(\bar{u}\bar{v}\k vu\) gives the action of \(R_z(-\alpha)R_y(-\beta)\) on \(\mathbf{e}_3\), which is \(w\). It follows that \[ R_\mathbf{w}(2\theta) = \Phi(\cos(\theta)\one + \sin(\theta)w) \] as required.
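Lemma 7 can be verified numerically in the \(z\)-axis case. In this sketch (assuming numpy; `Phi` and `qmul` are ad hoc names), \(\Phi(u)\) is built directly from the conjugation \(v \mapsto uv\bar{u}\), taking as columns the images of the standard basis:

```python
import numpy as np

def qmul(p, q):
    # Hamilton product on quaternions stored as 4-vectors (a, b, c, d)
    a, b, c, d = p
    w, x, y, z = q
    return np.array([a*w - b*x - c*y - d*z,
                     a*x + b*w + c*z - d*y,
                     a*y - b*z + c*w + d*x,
                     a*z + b*y - c*x + d*w])

def conj(q):
    return q * np.array([1., -1., -1., -1.])

def Phi(u):
    # columns are the images of e_x, e_y, e_z under v -> u v ubar
    cols = []
    for e in np.eye(3):
        v = np.concatenate([[0.], e])
        cols.append(qmul(qmul(u, v), conj(u))[1:])
    return np.array(cols).T

theta = 0.4
u = np.array([np.cos(theta), 0., 0., np.sin(theta)])   # cos(t)1 + sin(t)k
Rz2t = np.array([[np.cos(2*theta), -np.sin(2*theta), 0.],
                 [np.sin(2*theta),  np.cos(2*theta), 0.],
                 [0., 0., 1.]])
assert np.allclose(Phi(u), Rz2t)       # rotation by 2*theta about z
assert np.allclose(Phi(-u), Phi(u))    # u and -u give the same rotation
```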

Theorem 8. The map \(\Phi\) is a homomorphism from the group of unimodular quaternions onto \(SO(3)\) and is exactly two-to-one.

Proof. Every matrix in \(SO(3)\) is a rotation of the form \(R_\mathbf{w}(\theta)\) for some \(\theta\in\RR\) and unit vector \(\mathbf{w}\). By the last lemma, \(R_\mathbf{w}(\theta) = \Phi(\cos(\theta/2)\one + \sin(\theta/2)w)\) and so \(\Phi\) is surjective. We have seen that \(\Phi(u) = \Phi(-u)\) for every unimodular \(u\). It remains to show that if \(\Phi(u) = \Phi(v)\) then \(u = \pm v\).

First, consider the case when \(\Phi(u) = I\). Since \(u\) is unimodular we can find an angle \(0\le \theta < 2\pi\) and a unimodular purely imaginary quaternion \(\hat{u}\) such that \[ u = \cos\theta\one + \sin\theta\hat{u} \] But this means that the matrix \(R_{\hat{\mathbf{u}}}(2\theta) = I\), and so \(\theta\) can only be \(0\) or \(\pi\). Correspondingly, \(u = \pm 1\).

In general, if \(\Phi(u) = \Phi(v)\) then since \(\Phi\) is a homomorphism, \(\Phi(uv^{-1}) = I\), so \(uv^{-1} = \pm 1\), and we are done.

See also my page on Pauli matrices.

The Pauli matrices are defined to be:

\[ \begin{align} \sigma_x &= \sigma_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \\ \sigma_y &= \sigma_2 = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \\ \sigma_z &= \sigma_3 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \end{align} \]

From direct matrix multiplication we get the fundamental relations \[ \sigma_1^2 = \sigma_2^2 = \sigma_3^2 = -i \sigma_1 \sigma_2 \sigma_3 = I \]
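These relations are quickly confirmed by carrying out that multiplication, e.g. with numpy:

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2)

# sigma_j^2 = I for each j
assert np.allclose(s1 @ s1, I)
assert np.allclose(s2 @ s2, I)
assert np.allclose(s3 @ s3, I)
# -i sigma_1 sigma_2 sigma_3 = I
assert np.allclose(-1j * s1 @ s2 @ s3, I)
```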

Lemma 9. \(SU(2)\) consists of precisely those complex matrices of the form \[ \begin{bmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{bmatrix} \] where \(|\alpha|^2 + |\beta|^2 = 1\)

Proof. Direct multiplication easily shows that every matrix of this form is in \(SU(2)\). To see the converse, consider \[ S = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \in SU(2) \] This implies \[ \begin{align} \alpha\bar\alpha + \gamma\bar\gamma &= 1 \\ \beta\bar\beta + \delta\bar \delta &= 1 \\ \alpha\bar\beta + \gamma\bar\delta &= 0 \\ \alpha\delta - \beta\gamma &= 1 \\ \end{align} \] If \(\gamma = 0\) one can very quickly chase through the relations to see that \(|\alpha| = 1\), \(\beta = 0\), and \(\delta = 1/\alpha = \bar\alpha\). Thus \(S\) is \[ \begin{bmatrix} \alpha & 0 \\ 0 & \bar\alpha \end{bmatrix} \] In the more general case when \(\gamma\not=0\), \[ \delta = -\frac{\bar\alpha \beta}{\bar\gamma} \] and so \[ 1 = -\frac{\alpha\bar\alpha \beta}{\bar\gamma} -\beta\gamma = -\frac{\beta}{\bar\gamma}(\alpha\bar\alpha + \gamma\bar\gamma) = -\frac{\beta}{\bar\gamma} \] Thus \(\beta = -\bar\gamma\) and so also, substituting back, \(\delta = \bar\alpha\) and \[ S = \begin{bmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{bmatrix} \]

Lemma 10. \(SU(2)\) is isomorphic, as a Lie group, to the group of unimodular quaternions.

Proof. Take \(X_0 = I\) and \(X_j = -i\sigma_j\) (\(j=1,2,3\)). Knowing that \[ \sigma_1^2 = \sigma_2^2 = \sigma_3^2 = -i \sigma_1 \sigma_2 \sigma_3 = I \] it follows that \[ X_1^2 = X_2^2 = X_3^2 = X_1X_2X_3 = -I \] Thus the map \[ a_0\one + a_1\i + a_2\j + a_3\k \longmapsto a_0I + a_1X_1 + a_2X_2 + a_3X_3 \] is a multiplicative map from the unimodular quaternions into the \(2\times 2\) complex matrices. Moreover, \[ a_0 I + a_1X_1 + a_2X_2 + a_3X_3 = \begin{bmatrix} a_0 - i a_3 & -a_2 - i a_1 \\ a_2 - i a_1 & a_0 + i a_3 \end{bmatrix} \] and this is of the form identified in Lemma 9 to belong to \(SU(2)\) (unimodularity of the quaternion ensures that the condition on the moduli of the matrix entries is met). Inspection of the matrix form, together with this lemma, demonstrates that the map is bijective.
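The correspondence can be checked numerically. In this sketch (assuming numpy; `to_su2` and `qmul` are ad hoc names) the map is verified to be multiplicative and to land in \(SU(2)\):

```python
import numpy as np

def qmul(p, q):
    # Hamilton product on quaternions stored as 4-vectors (a0, a1, a2, a3)
    a, b, c, d = p
    w, x, y, z = q
    return np.array([a*w - b*x - c*y - d*z,
                     a*x + b*w + c*z - d*y,
                     a*y - b*z + c*w + d*x,
                     a*z + b*y - c*x + d*w])

def to_su2(q):
    # a0*I + a1*X1 + a2*X2 + a3*X3 with X_j = -i*sigma_j
    a0, a1, a2, a3 = q
    return np.array([[a0 - 1j*a3, -a2 - 1j*a1],
                     [a2 - 1j*a1,  a0 + 1j*a3]])

p = np.array([0.5, 0.5, 0.5, 0.5])     # unimodular quaternions
q = np.array([0.6, 0.0, 0.8, 0.0])
assert np.allclose(to_su2(qmul(p, q)), to_su2(p) @ to_su2(q))  # multiplicative
U = to_su2(p)
assert np.allclose(U @ U.conj().T, np.eye(2))   # unitary
assert np.isclose(np.linalg.det(U), 1.0)        # determinant 1
```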

Remark 11. Topologically, the unimodular quaternions are homeomorphic to the 3-sphere, \(S^3\), and so the same is true for \(SU(2)\).

Let \(w\) be a unimodular quaternion, \(\theta\) a real number, and \(u = \cos\theta\one + \sin\theta w\); and let \(v = x\i + y\j + z\k\). We saw above that \(v \mapsto uv\bar u\) acts as a rotation of the point \([x,y,z]^T\) by the angle \(2\theta\) about the axis \(\mathbf{w}\). In \(SU(2)\) the corresponding action is \[ (xX_1 + yX_2 + zX_3) \longmapsto U(xX_1 + yX_2 + zX_3)U^* \] where \[ U = \cos\theta I + \sin\theta (w_1 X_1 + w_2X_2 + w_3 X_3) \] It's convenient to write \[ w_1 X_1 + w_2X_2 + w_3 X_3 = \mathbf{w}\cdot\mathbf{X} = -i\wdotsigma \]

If \(\mathbf{w}\) is a unit vector in \(\RR^3\) then, by direct multiplication using the fundamental relations of the quaternions, \[ w^2 = -\one \quad\text{and}\quad (-i \wdotsigma)^2 = - I \] Focusing on the case of \(SU(2)\) this says that \((\wdotsigma)^2 = I\) and so \[ (\wdotsigma)^k = \begin{cases} \wdotsigma & k \text{ is odd} \\ I & k \text{ is even} \end{cases} \] Thus \[ \begin{align} e^{-i\theta \wdotsigma} &= \sum_{k=0}^\infty \frac{(-i\theta \wdotsigma)^k}{k!} \\ &= \sum_{k \text{ even}} \frac{(-i\theta)^k}{k!}I + \sum_{k \text{ odd}} \frac{(-i\theta)^k}{k!}\wdotsigma \\ &= \cos\theta I - i\sin\theta\; \wdotsigma \\ &= \cos\theta I + \sin\theta\; \mathbf{w}\cdot\mathbf{X} \end{align} \] and so conjugation by \(e^{-i\theta \wdotsigma}\) implements rotation by angle \(2\theta\) about \(\mathbf{w}\), at least on the copy of \(\RR^3\) embedded in the \(2\times 2\) complex matrices as \[ xX_1 + yX_2 + zX_3 = \begin{bmatrix} - i z & -y - i x \\ y - i x & i z \end{bmatrix} \] By the standard form of matrices in \(SU(2)\) above, this correspondence identifies the unit vectors in \(\RR^3\) precisely with the elements of \(SU(2)\) of trace zero.
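The exponential formula can be confirmed against a general-purpose matrix exponential. This is a sketch assuming scipy is available (its `scipy.linalg.expm` computes the matrix exponential directly from the power series):

```python
import numpy as np
from scipy.linalg import expm

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

w = np.array([2.0, 1.0, 2.0]) / 3.0          # a unit vector
wds = w[0]*s1 + w[1]*s2 + w[2]*s3            # w . sigma
assert np.allclose(wds @ wds, np.eye(2))     # (w.sigma)^2 = I

theta = 1.1
lhs = expm(-1j * theta * wds)                             # matrix exponential
rhs = np.cos(theta) * np.eye(2) - 1j * np.sin(theta) * wds  # closed form
assert np.allclose(lhs, rhs)
```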