# Joan Lindsay Orr


# The Pauli Matrices and the Bloch Sphere

These notes are an exposition of the basic facts about the Pauli matrices and the Bloch Sphere. The goal is to give a completely mathematically rigorous exposition of the core facts about the action of the Pauli matrices as rotations on the Bloch Sphere, and to do so in a way where the reasons for this strange correspondence between vectors in $$\CC^2$$ and $$\RR^3$$ become clearer. To this end, I've tried to avoid matrix multiplications and complicated trig formulas almost completely, and to let the underlying algebraic patterns shine through.

This page is not a good introduction to the properties of the Bloch Sphere. For that I recommend Nielsen and Chuang or Rieffel and Polak. Hopefully it may be of interest to someone who has read those books and would like to see proofs of all the gory details and, perhaps, gain some insight into why this correspondence works so well.

I am also indebted to Ian Glendinning's lovely slide deck on the same topics.

## The Pauli Matrices

### Definition and simple properties

The Pauli Matrices are defined to be:

\begin{align} X & = \sigma_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \\ Y & = \sigma_2 = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \\ Z & = \sigma_3 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \end{align}

Routine calculation shows that $$X^2 = Y^2 = Z^2 = -i XYZ = I$$. All other multiplicative identities involving $$X$$, $$Y$$, and $$Z$$ can be deduced from these. For example, \begin{align} XY & = (XYZ)Z = iZ, \\ YZ & = X(XYZ) = iX, \\ XZ & = X(-iXY) = -iY, \\ ZY & = (-iXY)Y = -iX, \\ YX & = Y(-iYZ) = -iZ, \\ ZX & = (iYX)X = iY. \end{align} (Of course these identities could also be obtained by direct matrix multiplication.) Note in particular that distinct $$\sigma_i$$ anticommute, that is, $$\sigma_i \sigma_j = -\sigma_j\sigma_i$$ for $$i\not=j$$. Also, as observed, $$\sigma_i^2 = I$$ for $$i=1,2,3$$. We can summarize the rules for $$i\not=j$$ as $$\sigma_i\sigma_j = (-1)^s i\sigma_k$$ where $$i,j,k$$ are distinct, the factor $$i$$ in front of $$\sigma_k$$ is the imaginary unit rather than the index, and $$s$$ is $$0$$ or $$1$$ according to whether the permutation $$1\mapsto i, 2\mapsto j, 3\mapsto k$$ is even or odd.
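As a sanity check, the multiplication rules can be confirmed by machine. The following is only a minimal sketch in plain Python; the helper names `mul` and `scale` are my own illustrative choices and the matrices are stored as nested lists.

```python
# 2x2 matrices as nested lists; all entries are exact Gaussian integers.
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def mul(A, B):
    """Matrix product of two 2x2 matrices."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def scale(s, A):
    """Scalar multiple s * A."""
    return [[s * entry for entry in row] for row in A]

# The relations X^2 = Y^2 = Z^2 = -i XYZ = I.
assert mul(X, X) == I2 and mul(Y, Y) == I2 and mul(Z, Z) == I2
assert scale(-1j, mul(mul(X, Y), Z)) == I2

# sigma_i sigma_j = +i sigma_k for cyclic (even) orderings of (i, j, k),
# and -i sigma_k when i and j are swapped (odd orderings).
sigma = [X, Y, Z]
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    assert mul(sigma[i], sigma[j]) == scale(1j, sigma[k])
    assert mul(sigma[j], sigma[i]) == scale(-1j, sigma[k])
```

Because every entry is $$0$$, $$\pm 1$$ or $$\pm i$$, the arithmetic is exact and plain equality can be used instead of a tolerance.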

### Relation to the quaternions

If we set $\i = iX, \quad \j = iY, \quad\text{ and }\quad \k = -iZ$ then $$\i^2 = \j^2 = \k^2 = \i\j\k = -I$$. These are the defining relations of the quaternions, and so the real linear combinations $$a_0 I + a_1 \i + a_2 \j + a_3 \k$$ (with $$a_0, a_1, a_2, a_3\in\RR$$) provide a representation of the algebra of quaternions.

### Matrix properties

Note that $$X, Y, Z$$ are self-adjoint matrices. If $$a_0, a_1, a_2, a_3 \in\RR$$ then $a_0 I + a_1 X + a_2 Y + a_3 Z = \begin{bmatrix} a_0 + a_3 & a_1 - i a_2 \\ a_1 + i a_2 & a_0 - a_3 \end{bmatrix}$ is also a self-adjoint matrix and it is clear that every self-adjoint $$2\times 2$$ matrix can be expressed in this form. Thus $$I, X, Y, Z$$ span the real vector space $$M_2^{sa}(\CC)$$ of $$2\times 2$$ self-adjoint matrices, and since this space is 4-dimensional, they are a basis. Note in particular that $$\tr(a_0 I + a_1 X + a_2 Y + a_3 Z) = 2a_0$$ and so $$X, Y, Z$$ span the space of trace-zero self-adjoint matrices. Also, by direct computation, $\det(a_0 I + a_1 X + a_2 Y + a_3 Z) = a_0^2 - (a_1^2 + a_2^2 + a_3^2)$

Now we write $$\vecsigma = (\sigma_1, \sigma_2, \sigma_3)$$ which is called the Pauli vector and, by slight abuse of notation, for $$\veca = (a_1, a_2, a_3) \in \RR^3$$ we write $$\veca\cdot\vecsigma = a_1\sigma_1 + a_2\sigma_2 + a_3\sigma_3.$$ Using the multiplication rules for the $$\sigma_i$$ we calculate \begin{align} (\veca\cdot\vecsigma)(\vecb\cdot\vecsigma) & = (a_1\sigma_1 + a_2\sigma_2 + a_3\sigma_3)(b_1\sigma_1 + b_2\sigma_2 + b_3\sigma_3) \\ & = (a_1b_1 + a_2b_2 + a_3b_3)I + (a_2b_3 - a_3b_2)i\sigma_1 - (a_1b_3 - a_3b_1)i\sigma_2 + (a_1b_2 - a_2b_1)i\sigma_3 \\ & = (\veca\cdot\vecb)I + i(\veca\times\vecb)\cdot\vecsigma \end{align} In particular, since $$\veca\times\veca = 0$$, $$(\veca\cdot\vecsigma)^2 = \|\veca\|^2 I$$ and if $$\vecn$$ is a unit vector in $$\RR^3$$ then $$(\vecn\cdot\vecsigma)^2 = I$$.
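The product formula is easy to test on a concrete pair of vectors. This is a small sketch with hypothetical helper names (`pauli_vector`, `product_formula`); integer vectors are used so that the comparison is exact.

```python
def pauli_vector(a):
    """Return a . sigma = a1*X + a2*Y + a3*Z as a nested list."""
    a1, a2, a3 = a
    return [[a3, a1 - 1j * a2], [a1 + 1j * a2, -a3]]

def mul(A, B):
    """Matrix product of two 2x2 matrices."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def product_formula(a, b):
    """Right-hand side (a . b) I + i (a x b) . sigma."""
    d = dot(a, b)
    C = pauli_vector(cross(a, b))
    return [[d * (r == c) + 1j * C[r][c] for c in range(2)] for r in range(2)]

a, b = (1, 2, 3), (-4, 5, 6)
assert mul(pauli_vector(a), pauli_vector(b)) == product_formula(a, b)

# Special case b = a: the square is |a|^2 I, here 1 + 4 + 9 = 14.
assert mul(pauli_vector(a), pauli_vector(a)) == [[14, 0], [0, 14]]
```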

### Eigenvalues and eigenvectors

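Now let $$\veca$$ be a unit vector in $$\RR^3$$ and $$H = \veca \cdot \vecsigma$$. Clearly $$H$$ is Hermitian, and $$H^2 = I$$ since $$(\veca\cdot\vecsigma)^2 = \|\veca\|^2 I$$. Because $$H$$ has trace zero it is not $$\pm I$$, and so the spectrum of $$H$$ is $$\{1, -1\}$$ with spectral projections $$(I+H)/2$$ and $$(I-H)/2$$ respectively. Reading off the first columns of $$I+H$$ and $$I-H$$, we find that (not necessarily normalized) eigenvectors for $$1$$ and $$-1$$ respectively are $\begin{bmatrix} a_3 + 1 \\ a_1 + i a_2 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} a_3 - 1 \\ a_1 + i a_2 \end{bmatrix}$ (Note these are orthogonal since $$a_1^2+a_2^2+a_3^2 = 1$$. The first vanishes when $$a_3 = -1$$ and the second when $$a_3 = 1$$; in either of these cases $$H = \pm Z$$ and the missing eigenvector is $$\begin{bmatrix} 0 \\ 1 \end{bmatrix}$$.)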
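To double-check which eigenvector goes with which eigenvalue, here is a short numerical sketch for one particular unit vector. The choice $$\veca = (2/3, -1/3, 2/3)$$ and the helper names are arbitrary assumptions made for illustration.

```python
def pauli_vector(a):
    """Return a . sigma as a nested list."""
    a1, a2, a3 = a
    return [[a3, a1 - 1j * a2], [a1 + 1j * a2, -a3]]

def apply(M, v):
    """Apply a 2x2 matrix to a vector in C^2."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def close(u, v, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(u, v))

a = (2 / 3, -1 / 3, 2 / 3)               # unit vector: 4/9 + 1/9 + 4/9 = 1
H = pauli_vector(a)
v_plus = [a[2] + 1, a[0] + 1j * a[1]]    # candidate eigenvector for +1
v_minus = [a[2] - 1, a[0] + 1j * a[1]]   # candidate eigenvector for -1

assert close(apply(H, v_plus), v_plus)                   # H v = +v
assert close(apply(H, v_minus), [-x for x in v_minus])   # H v = -v

# The two eigenvectors are orthogonal in C^2.
inner = sum(complex(x).conjugate() * y for x, y in zip(v_plus, v_minus))
assert abs(inner) < 1e-12
```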

If $$A$$ is any matrix with $$A^2=I$$ then \begin{align} e^{itA} & = \sum_{k=0}^\infty \frac{(itA)^k}{k!} \\ & = \sum_{k=0}^\infty \frac{(itA)^{2k}}{(2k)!} + \sum_{k=0}^\infty \frac{(itA)^{2k + 1}}{(2k + 1)!} \\ & = \left(\sum_{k=0}^\infty (-1)^k\frac{t^{2k}}{(2k)!}\right) I + i \left(\sum_{k=0}^\infty (-1)^k\frac{t^{2k + 1}}{(2k + 1)!} \right)A \\ & = \cos(t) I + i \sin(t) A \end{align} (In fact this is true in any unital Banach algebra.) Thus if $$\vecn$$ is a unit vector in $$\RR^3$$ and $$a\in\RR$$ then $e^{i a \vecn\cdot\vecsigma} = \cos(a) I + i \sin(a)\vecn\cdot\vecsigma$ which is a generalized Euler's Identity.
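The generalized Euler identity can be checked against a truncated power series. This is a rough sketch only: the number of terms (40), the value of $$t$$ and the unit vector are arbitrary choices, and `exp_series` is a hypothetical helper rather than a library routine.

```python
import math

def pauli_vector(a):
    """Return a . sigma as a nested list."""
    a1, a2, a3 = a
    return [[a3, a1 - 1j * a2], [a1 + 1j * a2, -a3]]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def exp_series(M, terms=40):
    """Partial sum of the exponential series: sum of M^k / k! for k < terms."""
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mul(term, M)]   # M^k / k!
        result = [[result[r][c] + term[r][c] for c in range(2)]
                  for r in range(2)]
    return result

def max_abs_diff(A, B):
    return max(abs(A[r][c] - B[r][c]) for r in range(2) for c in range(2))

n = (2 / 3, -1 / 3, 2 / 3)     # unit vector, so (n . sigma)^2 = I
t = 0.7
A = pauli_vector(n)

lhs = exp_series([[1j * t * x for x in row] for row in A])
rhs = [[math.cos(t) * (r == c) + 1j * math.sin(t) * A[r][c] for c in range(2)]
       for r in range(2)]
assert max_abs_diff(lhs, rhs) < 1e-12
```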

Note that by the anticommutation relations, if $$i\not= j$$ then $$\sigma_i^k \sigma_j = (-1)^k\sigma_j\sigma_i^k$$ and so $e^{ia\sigma_i}\sigma_j = \sum_{k=0}^\infty \frac{(ia\sigma_i)^k\sigma_j}{k!} = \sum_{k=0}^\infty \frac{\sigma_j(-ia\sigma_i)^k}{k!} = \sigma_j e^{-ia\sigma_i}$ Thus $e^{ia\sigma_i}\sigma_j e^{-ia\sigma_i} = \begin{cases} e^{2ia\sigma_i}\sigma_j & i \not= j, \\ \sigma_j & i = j. \end{cases}$ Taking $$a = -\theta/2$$ and using Euler's identity together with $$\sigma_i\sigma_j = (-1)^s i\sigma_k$$ (where $$s$$ is $$0$$ or $$1$$ according to whether the permutation $$1\mapsto i, 2\mapsto j, 3\mapsto k$$ is even or odd), we find that when $$i,j$$ are distinct $e^{-i\frac{\theta}{2}\sigma_i}\sigma_j e^{i\frac{\theta}{2}\sigma_i} = (\cos(\theta) I - i\sin(\theta)\sigma_i)\sigma_j = \cos(\theta)\sigma_j + (-1)^s \sin(\theta)\sigma_k$ For example \begin{align} e^{-i\frac{\theta}{2}X} X e^{i\frac{\theta}{2}X} & = X \\ e^{-i\frac{\theta}{2}X} Y e^{i\frac{\theta}{2}X} & = \cos(\theta)Y + \sin(\theta)Z \\ e^{-i\frac{\theta}{2}X} Z e^{i\frac{\theta}{2}X} & = - \sin(\theta)Y + \cos(\theta)Z \end{align} In the coordinate system of $$X, Y, Z$$ this acts as a rotation of $$\theta$$ about the $$X$$ axis. In other words $e^{-i\frac{\theta}{2}X} (xX + yY + zZ) e^{i\frac{\theta}{2}X} = x'X + y'Y + z'Z$ where $\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos{\theta} & -\sin{\theta} \\ 0 & \sin{\theta} & \cos{\theta} \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}$ (Conjugating in the opposite order, that is, by $$e^{i\frac{\theta}{2}X}$$, gives the rotation of $$-\theta$$ instead.)

The analogous computations show that conjugation by $$e^{-i\frac{\theta}{2}Y}$$ and $$e^{-i\frac{\theta}{2}Z}$$ also gives rotations of $$\theta$$ about the $$Y$$ and $$Z$$ axes respectively.

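Thus we write \begin{align} R_X(\theta) & = e^{-i\frac{\theta}{2}X} = \cos(\theta/2) I - i\sin(\theta/2) X \\ R_Y(\theta) & = e^{-i\frac{\theta}{2}Y} = \cos(\theta/2) I - i\sin(\theta/2) Y \\ R_Z(\theta) & = e^{-i\frac{\theta}{2}Z} = \cos(\theta/2) I - i\sin(\theta/2) Z \end{align} for these operators of rotation about the coordinate axes. Note the half angle, which comes from applying Euler's identity with $$a = -\theta/2$$, and the minus sign in the exponent: it is conjugation by $$e^{-i\frac{\theta}{2}\sigma_i}$$, rather than by $$e^{i\frac{\theta}{2}\sigma_i}$$, that rotates by $$+\theta$$.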
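The direction of rotation is easy to get wrong, so it is worth checking numerically. The sketch below takes $$U = e^{-i\frac{\theta}{2}X} = \cos(\theta/2) I - i\sin(\theta/2) X$$ and reads off the $$3\times 3$$ matrix of $$\sigma_j \mapsto U\sigma_jU^*$$ using $$\frac{1}{2}\tr(\sigma_i U \sigma_j U^*)$$. The function names `rotation_operator` and `induced_matrix` are hypothetical.

```python
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [X, Y, Z]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def dagger(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[c][r]).conjugate() for c in range(2)] for r in range(2)]

def rotation_operator(sigma, theta):
    """exp(-i theta/2 sigma) = cos(theta/2) I - i sin(theta/2) sigma."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * I2[r][k] - 1j * s * sigma[r][k] for k in range(2)]
            for r in range(2)]

def induced_matrix(U):
    """Real 3x3 matrix R with U sigma_j U^* = sum_i R[i][j] sigma_i."""
    Ud = dagger(U)
    R = [[0.0] * 3 for _ in range(3)]
    for j, sj in enumerate(PAULIS):
        conjugated = mul(mul(U, sj), Ud)
        for i, si in enumerate(PAULIS):
            M = mul(si, conjugated)
            R[i][j] = ((M[0][0] + M[1][1]) / 2).real   # (1/2) trace
    return R

theta = 0.9
c, s = math.cos(theta), math.sin(theta)
R = induced_matrix(rotation_operator(X, theta))
expected = [[1, 0, 0], [0, c, -s], [0, s, c]]          # rotation by +theta about X
assert all(abs(R[i][j] - expected[i][j]) < 1e-12
           for i in range(3) for j in range(3))

# Conjugating by the adjoint exp(+i theta/2 X) gives the inverse rotation.
R_inv = induced_matrix(dagger(rotation_operator(X, theta)))
assert all(abs(R_inv[i][j] - expected[j][i]) < 1e-12
           for i in range(3) for j in range(3))
```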

### General rotations

Now for a unit vector $$\vecn \in \RR^3$$ we consider the operator $R_{\vecn\cdot\vecsigma}(\alpha) = e^{-i\frac{\alpha}{2}\vecn\cdot\vecsigma} = \cos(\alpha/2) I - i\sin(\alpha/2) \vecn\cdot\vecsigma$ and show that conjugation by this operator corresponds to a rotation of $$\alpha$$ about the axis $$\vecn$$. Write $$\vecn$$ in spherical coordinates as $\vecn = (n_x, n_y, n_z) = (\cos\phi\sin\theta, \sin\phi\sin\theta, \cos\theta)$ for suitable $$0\le\theta\le\pi$$ and $$0\le\phi<2\pi$$. Now, a rotation of $$\alpha$$ about $$\vecn$$ can be done by the following steps:

1. rotate by $$-\phi$$ about $$Z$$ to put $$\vecn$$ in the $$XZ$$ plane,
2. rotate by $$-\theta$$ about $$Y$$ to put $$\vecn$$ along the $$Z$$ axis,
3. rotate by $$\alpha$$ about the $$Z$$ axis,
4. rotate back by $$\theta$$ about $$Y$$, and
5. rotate back by $$\phi$$ about $$Z$$.

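This sequence of rotations is implemented by conjugation with the following matrix (the rightmost factor acts first) $R_Z(\phi)R_Y(\theta)R_Z(\alpha)R_Y(-\theta)R_Z(-\phi)$ A routine calculation shows that this is indeed $$R_{\vecn\cdot\vecsigma}(\alpha)$$. Here $$R_Z(\alpha) = e^{-i\frac{\alpha}{2}Z} = \cos(\alpha/2) I - i\sin(\alpha/2)Z$$, and the conjugation rules give $$R_Y(\theta)ZR_Y(-\theta) = \cos(\theta)Z + \sin(\theta)X$$ and $$R_Z(\phi)XR_Z(-\phi) = \cos(\phi)X + \sin(\phi)Y$$, so that $$R_Z(\phi)R_Y(\theta)ZR_Y(-\theta)R_Z(-\phi) = \cos\phi\sin\theta\, X + \sin\phi\sin\theta\, Y + \cos\theta\, Z = \vecn\cdot\vecsigma$$. Hence:

\begin{align} R_Z(\phi)R_Y(\theta)R_Z(\alpha)R_Y(-\theta)R_Z(-\phi) & = R_Z(\phi)R_Y(\theta)(\cos(\alpha/2) I - i\sin(\alpha/2)Z)R_Y(-\theta)R_Z(-\phi)\\ & = \cos(\alpha/2) I - i\sin(\alpha/2) R_Z(\phi)R_Y(\theta)ZR_Y(-\theta)R_Z(-\phi)\\ & = \cos(\alpha/2) I - i\sin(\alpha/2) (\vecn\cdot\vecsigma)\\ & = R_{\vecn\cdot\vecsigma}(\alpha) \end{align}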
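The five-step decomposition can be verified numerically. The sketch below assumes the convention $$R_\sigma(t) = \cos(t/2) I - i\sin(t/2)\sigma$$ for each factor; the angles are arbitrary test values and the helper `rot` is a hypothetical name.

```python
import math
from functools import reduce

I2 = [[1, 0], [0, 1]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def rot(sigma, angle):
    """cos(angle/2) I - i sin(angle/2) sigma, valid whenever sigma^2 = I."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c * I2[r][k] - 1j * s * sigma[r][k] for k in range(2)]
            for r in range(2)]

def pauli_vector(a):
    a1, a2, a3 = a
    return [[a3, a1 - 1j * a2], [a1 + 1j * a2, -a3]]

theta, phi, alpha = 1.1, 2.3, 0.8
n = (math.cos(phi) * math.sin(theta),
     math.sin(phi) * math.sin(theta),
     math.cos(theta))

# R_Z(phi) R_Y(theta) R_Z(alpha) R_Y(-theta) R_Z(-phi), multiplied left to right.
product = reduce(mul, [rot(Z, phi), rot(Y, theta), rot(Z, alpha),
                       rot(Y, -theta), rot(Z, -phi)])

# The same operator written directly in terms of n . sigma.
direct = rot(pauli_vector(n), alpha)

assert all(abs(product[r][c] - direct[r][c]) < 1e-12
           for r in range(2) for c in range(2))
```

Note that `rot` is reused for $$\vecn\cdot\vecsigma$$, which is legitimate because $$(\vecn\cdot\vecsigma)^2 = I$$ for a unit vector $$\vecn$$.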

## The Bloch Sphere

### The forward mapping

Given a unit vector $$\psi\in\CC^2$$, consider the pure state $$\ket{\psi}\bra{\psi}$$. This is a projection and so $$P = \ket{\psi}\bra{\psi}$$ can be written as $$P = \frac{1}{2}(I + S)$$ where $$S = 2P - I$$ is a self-adjoint unitary, also known as a symmetry. Since $$P$$ is rank-1, $$1 = \tr(P) = 1 + \frac{1}{2}\tr(S)$$ and so $$\tr(S)=0$$. Thus $$S = \veca\cdot\vecsigma$$ for some $$\veca\in\RR^3$$, and since $$I = S^2 = \|\veca\|^2 I$$, $$\veca$$ is a point on the unit sphere of $$\RR^3$$.

Thus the mapping $\psi \in \CC^2 \longmapsto \ket{\psi}\bra{\psi} = \frac{1}{2}(I + \veca\cdot\vecsigma) \longmapsto \veca = (a_1, a_2, a_3) \in \RR^3$ gives a correspondence of unit vectors in $$\CC^2$$ with unit vectors in $$\RR^3$$. Since $$\psi$$ is determined up to a unimodular multiple by the range of $$P$$, it can also be recovered from $$\veca$$, at least up to a physically irrelevant multiple of $$e^{i\alpha}$$.
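Concretely, comparing entries of $$\ket{\psi}\bra{\psi}$$ with $$\frac{1}{2}(I + \veca\cdot\vecsigma)$$ gives $$a_1 + ia_2 = 2\psi_1\overline{\psi_0}$$ and $$a_3 = |\psi_0|^2 - |\psi_1|^2$$. Here is a minimal sketch of the forward mapping; the function name `bloch_vector` is my own and the test states are just familiar examples.

```python
import cmath
import math

def bloch_vector(psi):
    """Bloch vector (a1, a2, a3) of a unit vector psi = (psi0, psi1) in C^2."""
    p0, p1 = complex(psi[0]), complex(psi[1])
    lower_left = p1 * p0.conjugate()      # entry of |psi><psi|, equals (a1 + i a2)/2
    return (2 * lower_left.real, 2 * lower_left.imag,
            abs(p0) ** 2 - abs(p1) ** 2)

def close(u, v, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(u, v))

r = 1 / math.sqrt(2)
assert close(bloch_vector((1, 0)), (0, 0, 1))        # |0> is the north pole
assert close(bloch_vector((0, 1)), (0, 0, -1))       # |1> is the south pole
assert close(bloch_vector((r, r)), (1, 0, 0))        # (|0> + |1>)/sqrt(2)
assert close(bloch_vector((r, 1j * r)), (0, 1, 0))   # (|0> + i|1>)/sqrt(2)

# A unimodular multiple of psi gives the same point of the sphere.
phase = cmath.exp(1.3j)
psi = (0.6, 0.8j)
assert close(bloch_vector(psi),
             bloch_vector((phase * psi[0], phase * psi[1])))
```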

### The action of Pauli exponentials

The unitary $$R_{\vecn\cdot\vecsigma}(\alpha)$$ acts on $$\psi$$ and induces an action on $$\RR^3$$ as: $\Rna\psi \in \CC^2 \longmapsto \ket{\Rna\psi}\bra{\Rna\psi} = \frac{1}{2}(I + \Rna(\veca\cdot\vecsigma)\Rna^\adj) \longmapsto \veca' = (a'_1, a'_2, a'_3) \in \RR^3$ We know that conjugation by $$\Rna$$ acts on $$\veca\cdot\vecsigma$$ as rotation of $$\veca$$ about the $$\vecn$$ axis, and so the action of $$\Rna$$ on $$\psi$$ corresponds to a rotation of $$(a_1, a_2, a_3)$$ about $$\vecn$$ by an angle of $$\alpha$$ to $$(a'_1, a'_2, a'_3)$$.

### Antipodal points

If $$\phi$$ and $$\psi$$ are unit vectors in $$\CC^2$$ then they are orthogonal if and only if $$(\ket{\phi}\bra{\phi})(\ket{\psi}\bra{\psi}) = \ket{\phi}\ip{\phi}{\psi}\bra{\psi} = 0$$. This happens if and only if $(I + \veca\cdot\vecsigma)(I + \vecb\cdot\vecsigma) = (1 + \veca\cdot\vecb)I + (\veca + \vecb + i\veca\times\vecb)\cdot\vecsigma = 0$ (where $$\phi, \psi$$ correspond to $$\veca, \vecb$$ respectively). Because the $$\sigma_i$$ ($$i=0,1,2,3$$, with $$\sigma_0 = I$$) are linearly independent over $$\CC$$, if the product is zero then $$\veca\cdot\vecb = -1$$ and by the extremal case of the Cauchy-Schwarz Inequality, $$\vecb = -\veca$$. Conversely if $$\vecb = -\veca$$ then the product is zero. Thus $$\phi$$ and $$\psi$$ are orthogonal if and only if they correspond to antipodal points on the Bloch Sphere.
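A quick numerical illustration: for a unit vector $$\psi = (\psi_0, \psi_1)$$, the vector $$(-\overline{\psi_1}, \overline{\psi_0})$$ is a unit vector orthogonal to it, and the two Bloch vectors should be negatives of each other. The sketch below assumes the entrywise formula $$a_1 + ia_2 = 2\psi_1\overline{\psi_0}$$, $$a_3 = |\psi_0|^2 - |\psi_1|^2$$ for the Bloch vector; the helper names are hypothetical.

```python
def bloch_vector(psi):
    """Bloch vector (a1, a2, a3) of a unit vector psi = (psi0, psi1) in C^2."""
    p0, p1 = complex(psi[0]), complex(psi[1])
    lower_left = p1 * p0.conjugate()      # equals (a1 + i a2)/2
    return (2 * lower_left.real, 2 * lower_left.imag,
            abs(p0) ** 2 - abs(p1) ** 2)

def orthogonal_to(psi):
    """A unit vector orthogonal to psi, namely (-conj(psi1), conj(psi0))."""
    p0, p1 = complex(psi[0]), complex(psi[1])
    return (-p1.conjugate(), p0.conjugate())

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(complex(x).conjugate() * complex(y) for x, y in zip(phi, psi))

psi = (0.6, 0.8j)
psi_perp = orthogonal_to(psi)
assert abs(inner(psi, psi_perp)) < 1e-12

a = bloch_vector(psi)
b = bloch_vector(psi_perp)
assert all(abs(x + y) < 1e-12 for x, y in zip(a, b))   # b = -a
```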

### The action of unitaries

Let $$U$$ be a unitary $$2\times 2$$ complex matrix and consider the map $$\ad{U}:A\in M_2(\CC) \mapsto UAU^\adj\in M_2(\CC)$$. Let $$\phi$$ be a unit vector in $$\CC^2$$ which corresponds to $$\veca$$ in the Bloch Sphere. Then $$U\phi$$ is also a unit vector in $$\CC^2$$, which corresponds, say, to $$\veca'$$ in the Bloch Sphere. Thus \begin{align} \frac{1}{2}(I + \veca'\cdot\vecsigma) & = \ket{U\phi}\bra{U\phi} \\ & = \ad{U}(\ket{\phi}\bra{\phi}) \\ & = \frac{1}{2}\ad{U}(I + \veca\cdot\vecsigma) \\ & = \frac{1}{2}(I + \ad{U}(\veca\cdot\vecsigma)) \end{align} Thus $$U\phi$$ corresponds to $$\veca'$$ in the Bloch Sphere, where the coordinates of $$\veca'$$ are given by $$\veca'\cdot\vecsigma = \ad{U}(\veca\cdot\vecsigma)$$.

Now $$\ad{U}$$ is a linear map which preserves both self-adjointness and the trace, and so it takes the space of trace-0 self-adjoint matrices to itself. Since $$X, Y, Z$$ is a basis for this space, $$\ad{U}$$ induces a linear map of $$\RR^3$$ to itself. Moreover, since $\veca\cdot\vecb = \frac{1}{2}\tr((\veca\cdot\vecsigma)(\vecb\cdot\vecsigma))$ it follows that if $$\veca'\cdot\vecsigma = \ad{U}(\veca\cdot\vecsigma)$$ and $$\vecb'\cdot\vecsigma = \ad{U}(\vecb\cdot\vecsigma)$$ then \begin{align} \veca'\cdot\vecb' & = \frac{1}{2}\tr((\veca'\cdot\vecsigma)(\vecb'\cdot\vecsigma)) \\ & = \frac{1}{2}\tr(\ad{U}(\veca\cdot\vecsigma)\ad{U}(\vecb\cdot\vecsigma)) \\ & = \frac{1}{2}\tr((\veca\cdot\vecsigma)(\vecb\cdot\vecsigma)) \\ & = \veca\cdot\vecb \end{align} Thus $$\ad{U}$$ induces a real unitary (or orthogonal) matrix on $$\RR^3$$. That is, it belongs to $$O(3)$$.

Next, if $$\veca,\vecb, \vecc\in\RR^3$$ then we can compute \begin{align} (\veca\cdot\vecsigma)(\vecb\cdot\vecsigma)(\vecc\cdot\vecsigma) & = (\veca\cdot\vecsigma)((\vecb\cdot\vecc)I + i(\vecb\times\vecc)\cdot\vecsigma) \\ & = (\vecb\cdot\vecc)\veca\cdot\vecsigma + i(\veca\cdot(\vecb\times\vecc))I - (\veca\times(\vecb\times\vecc))\cdot\vecsigma \\ & = i(\veca\cdot(\vecb\times\vecc))I + ((\vecb\cdot\vecc)\veca - \veca\times(\vecb\times\vecc))\cdot\vecsigma \\ & = i(\veca\cdot(\vecb\times\vecc))I + ((\vecb\cdot\vecc)\veca - (\veca\cdot\vecc)\vecb + (\veca\cdot\vecb)\vecc)\cdot\vecsigma \end{align} by the triple product formula. So if $$\veca'\cdot\vecsigma = \ad{U}(\veca\cdot\vecsigma)$$, $$\vecb'\cdot\vecsigma = \ad{U}(\vecb\cdot\vecsigma)$$, and $$\vecc'\cdot\vecsigma = \ad{U}(\vecc\cdot\vecsigma)$$ then \begin{align} i(\veca'\cdot(\vecb'\times\vecc'))I + (\cdots)\cdot\vecsigma & = (\veca'\cdot\vecsigma)(\vecb'\cdot\vecsigma)(\vecc'\cdot\vecsigma) \\ & = \ad{U}(\veca\cdot\vecsigma)\ad{U}(\vecb\cdot\vecsigma)\ad{U}(\vecc\cdot\vecsigma) \\ & = \ad{U}((\veca\cdot\vecsigma)(\vecb\cdot\vecsigma)(\vecc\cdot\vecsigma)) \\ & = \ad{U}(i(\veca\cdot(\vecb\times\vecc))I + (\cdots)\cdot\vecsigma) \\ & = i(\veca\cdot(\vecb\times\vecc))I + (\cdots)\cdot\vecsigma \end{align} Comparing the coefficients of $$I$$, we conclude that $$\veca'\cdot(\vecb'\times\vecc') = \veca\cdot(\vecb\times\vecc)$$. Taking $$\veca, \vecb, \vecc$$ to be the standard basis vectors shows that the induced matrix has determinant $$1$$, and so $$\ad{U}$$ induces an orientation-preserving map on $$\RR^3$$. This map is therefore in $$SO(3)$$ and so acts as a rotation.
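For a concrete unitary one can compute the induced $$3\times 3$$ matrix $$R$$, with entries $$R_{ij} = \frac{1}{2}\tr(\sigma_i U \sigma_j U^\adj)$$, and check that $$R^T R = I$$ and $$\det R = 1$$. This is a sketch under the assumption that $$U$$ is parametrized as a phase times $$\begin{bmatrix} \alpha & -\bar\beta \\ \beta & \bar\alpha \end{bmatrix}$$ with $$|\alpha|^2 + |\beta|^2 = 1$$; the specific numbers are arbitrary and the helper names are hypothetical.

```python
import cmath

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [X, Y, Z]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def dagger(A):
    return [[complex(A[c][r]).conjugate() for c in range(2)] for r in range(2)]

def induced_matrix(U):
    """Real 3x3 matrix R with U sigma_j U^* = sum_i R[i][j] sigma_i."""
    Ud = dagger(U)
    R = [[0.0] * 3 for _ in range(3)]
    for j, sj in enumerate(PAULIS):
        conjugated = mul(mul(U, sj), Ud)
        for i, si in enumerate(PAULIS):
            M = mul(si, conjugated)
            R[i][j] = ((M[0][0] + M[1][1]) / 2).real
    return R

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# An arbitrary 2x2 unitary: a phase times an SU(2) matrix.
alpha = 0.6 * cmath.exp(0.4j)
beta = 0.8 * cmath.exp(-1.1j)
phase = cmath.exp(0.25j)
U = [[phase * alpha, -phase * beta.conjugate()],
     [phase * beta, phase * alpha.conjugate()]]

R = induced_matrix(U)
gram = [[sum(R[k][i] * R[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]
assert all(abs(gram[i][j] - (i == j)) < 1e-12
           for i in range(3) for j in range(3))   # R is orthogonal
assert abs(det3(R) - 1) < 1e-12                    # and orientation preserving
```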

### Spherical coordinates

To explicitly see the reverse mapping from $$\veca$$ to $$\psi$$, consider the spherical coordinates for $$\veca$$ as $(a_1, a_2, a_3) = (\cos\phi\sin\theta, \sin\phi\sin\theta, \cos\theta)$ and take $\psi = \cos(\theta/2)\ket{0} + e^{i\phi}\sin(\theta/2)\ket{1}$ Straightforward calculation shows that \begin{align} \ket{\psi}\bra{\psi} & = \begin{bmatrix}\cos(\theta/2) \\ e^{i\phi}\sin(\theta/2)\end{bmatrix} \begin{bmatrix}\cos(\theta/2) & e^{-i\phi}\sin(\theta/2)\end{bmatrix} \\ & = \begin{bmatrix} \cos^2(\theta/2) & e^{-i\phi}\cos(\theta/2)\sin(\theta/2) \\ e^{i\phi}\cos(\theta/2)\sin(\theta/2) & \sin^2(\theta/2) \end{bmatrix} \\ & = \frac{1}{2} \begin{bmatrix} 1 + \cos\theta & e^{-i\phi}\sin\theta \\ e^{i\phi}\sin\theta & 1 - \cos\theta \end{bmatrix} \\ & = \frac{1}{2} \begin{bmatrix} 1 + \cos\theta & \cos\phi\sin\theta - i\sin\phi\sin\theta \\ \cos\phi\sin\theta + i\sin\phi\sin\theta & 1 - \cos\theta \end{bmatrix} \\ & = \frac{1}{2} \begin{bmatrix} 1 + a_3 & a_1 - i a_2 \\ a_1 + i a_2 & 1 - a_3 \end{bmatrix} \\ & = \frac{1}{2}(I + \veca\cdot\vecsigma) \end{align}

This establishes the usual spherical coordinatization of the Bloch sphere.
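The round trip between angles, state and Bloch vector can be sketched in a few lines. The function names are hypothetical, and recovering the angles with `acos` and `atan2` is one convenient choice of mine rather than anything forced by the derivation; at the poles $$\phi$$ is not determined.

```python
import cmath
import math

def state_from_angles(theta, phi):
    """psi = cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))

def bloch_vector(psi):
    """Bloch vector (a1, a2, a3) of a unit vector psi = (psi0, psi1) in C^2."""
    p0, p1 = complex(psi[0]), complex(psi[1])
    lower_left = p1 * p0.conjugate()      # equals (a1 + i a2)/2
    return (2 * lower_left.real, 2 * lower_left.imag,
            abs(p0) ** 2 - abs(p1) ** 2)

def angles_from_vector(a):
    """Spherical angles with 0 <= theta <= pi and 0 <= phi < 2 pi."""
    theta = math.acos(max(-1.0, min(1.0, a[2])))   # clamp against rounding
    phi = math.atan2(a[1], a[0]) % (2 * math.pi)
    return theta, phi

theta, phi = 1.1, 4.0
a = bloch_vector(state_from_angles(theta, phi))
expected = (math.cos(phi) * math.sin(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(theta))
assert all(abs(x - y) < 1e-12 for x, y in zip(a, expected))

theta_back, phi_back = angles_from_vector(a)
assert abs(theta_back - theta) < 1e-9 and abs(phi_back - phi) < 1e-9
```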

### Mixed states

A mixed state is represented by a density matrix, which is a positive semi-definite matrix with trace 1. The set of density matrices is convex and the extremal points of this convex set are the pure states, which can be shown to be the vector states, i.e., of the form $$\ket{\psi}\bra{\psi}$$. The pure states can also be recognized as the density matrices $$\rho$$ which satisfy $$\tr(\rho^2) = \tr(\rho)$$.

Density matrices can be mapped to $$\RR^3$$ in exactly the same way as we mapped rank-1 projections, and in this case density matrices correspond to points $$\veca$$ with $$\|\veca\| \le 1$$, and the points on the boundary (i.e., $$\|\veca\| = 1$$) correspond to the pure states.

If $$\rho\in M_2(\CC)$$ is a density matrix then $$2\rho - I$$ is a trace-0 self-adjoint matrix and so there is a unique $$\veca\in\RR^3$$ such that $$2\rho - I = \veca\cdot\vecsigma$$ or, equivalently, $$\rho = \frac{1}{2}(I + \veca\cdot\vecsigma)$$. Thus we extend our mapping into $$\RR^3$$ to the density matrices. Note that $$\|\veca\|^2 I = (\veca\cdot\vecsigma)^2 = (2\rho - I)^2 = 4(\rho^2 - \rho) + I$$. Taking the trace we see that $\|\veca\|^2 = 1 + 2\tr(\rho^2 - \rho).$ Now the eigenvalues of $$\rho$$ are $$p$$ and $$1-p$$ for some $$0\le p \le 1$$ and so the eigenvalues of $$\rho^2 - \rho$$ are $$p^2-p$$ and $$(1-p)^2 - (1-p)$$. Thus we can compute $\tr(\rho^2 - \rho) = p^2 - p + (1-p)^2 - (1-p) = 2p(p - 1)$ This quantity clearly varies between 0 and -1/2 for $$0\le p\le 1$$, and so $$\|\veca\| \le 1$$ and $$\veca$$ is in the unit ball of $$\RR^3$$. Further $$\|\veca\| = 1$$ if and only if $$\tr(\rho^2 - \rho) = 0$$ which, as remarked, corresponds to a pure state.

Note also that the correspondence $$\rho = \frac{1}{2}(I + \veca\cdot\vecsigma) \mapsto \veca \in\RR^3$$ is an affine map and that convex combinations of density matrices map to convex combinations of vectors in $$\RR^3$$. Thus, again, extremal points of the unit ball (that is, points of the sphere) correspond to extremal points of the set of density matrices, which we have seen are vector states. The centre of the sphere, on the other hand, corresponds to the matrix $$\frac{1}{2}I$$ which, physically, represents a mixed state of maximum uncertainty (e.g., the states $$\ket{0}$$ and $$\ket{1}$$ are equally likely).
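These facts about mixed states can be illustrated with a two-state mixture. The sketch below is only an example: the mixture $$\rho = 0.3\,\ket{0}\bra{0} + 0.7\,\ket{+}\bra{+}$$ is an arbitrary choice, and the helper names are hypothetical.

```python
import math

def density(psi):
    """Rank-one density matrix |psi><psi| of a unit vector in C^2."""
    p = [complex(psi[0]), complex(psi[1])]
    return [[p[r] * p[c].conjugate() for c in range(2)] for r in range(2)]

def mix(p, rho1, rho2):
    """Convex combination p*rho1 + (1-p)*rho2."""
    return [[p * rho1[r][c] + (1 - p) * rho2[r][c] for c in range(2)]
            for r in range(2)]

def bloch_vector(rho):
    """Vector a with rho = (1/2)(I + a . sigma), read off from the entries."""
    lower_left = complex(rho[1][0])               # equals (a1 + i a2)/2
    return (2 * lower_left.real, 2 * lower_left.imag,
            (complex(rho[0][0]) - complex(rho[1][1])).real)

def trace_of_square(rho):
    return sum(rho[r][c] * rho[c][r] for r in range(2) for c in range(2)).real

r = 1 / math.sqrt(2)
rho_zero = density((1, 0))       # Bloch vector (0, 0, 1)
rho_plus = density((r, r))       # Bloch vector (1, 0, 0)
p = 0.3
rho = mix(p, rho_zero, rho_plus)
a = bloch_vector(rho)

# The map is affine: the Bloch vector of the mixture is the mixture of vectors.
assert all(abs(x - y) < 1e-12 for x, y in zip(a, (1 - p, 0.0, p)))

# |a|^2 = 1 + 2 tr(rho^2 - rho) = 2 tr(rho^2) - 1, which is < 1 for a mixed state.
norm_sq = sum(x * x for x in a)
assert abs(norm_sq - (2 * trace_of_square(rho) - 1)) < 1e-12
assert norm_sq < 1

# The maximally mixed state I/2 sits at the centre of the ball.
assert bloch_vector([[0.5, 0], [0, 0.5]]) == (0.0, 0.0, 0.0)
```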