Joan Lindsay Orr

\( \newcommand{\CC}{\mathbb{C}} \newcommand{\NN}{\mathbb{N}} \newcommand{\RR}{\mathbb{R}} \newcommand{\TT}{\mathbb{T}} \newcommand{\ZZ}{\mathbb{Z}} \newcommand{\H}{\mathcal{H}} \newcommand{\e}{\epsilon} \newcommand{\x}{\mathbf{x}} \newcommand{\y}{\mathbf{y}} \)

\( \newcommand{\bra}[1]{\langle#1|} \newcommand{\ket}[1]{|#1\rangle} \newcommand{\ip}[2]{\langle #1 | #2\rangle} \newcommand{\dual}[1]{{#1}^*} \DeclareMathOperator{\tr}{tr} \)

The trace, partial traces, and entanglement

Formalization of bra and ket

On this page it is helpful to think carefully about the exact vector spaces involved, and so we start by treating the bra-ket notation rather more formally than usual. Let \(V\) be a finite-dimensional Hilbert space and for \(v\in V\) define \(\bra{v}\) to be the linear map \(u\in V\mapsto \ip{v}{u} \in \CC\) (where the inner product is conjugate-linear in the first place and linear in the second place) and \(\ket{v}\) to be the linear map \(\lambda\in\CC \mapsto \lambda v \in V\). Thus, looked at this way, \(\bra{v}\in L(V, \CC)\) and \(\ket{v}\in L(\CC, V)\).

The map \(v\in V \mapsto \bra{v}\in L(V, \CC)\) is a conjugate-linear isometric isomorphism (see the Riesz Representation Theorem) and the map \(v\in V\mapsto \ket{v}\in L(\CC, V)\) is a linear isometric isomorphism.

It's easily seen in terms of operator composition that \((\bra{u})(\ket{v})\) is the operator in \(L(\CC)\) which maps \(\lambda\in\CC\) to \(\ip{u}{\lambda v} = \lambda\ip{u}{v}\). Making the natural identification between \(L(\CC)\) and \(\CC\), this corresponds to \(\ip{u}{v}\), which justifies the notation \((\bra{u})(\ket{v}) = \ip{u}{v}\). Likewise \((\ket{v})(\bra{u}) \in L(V)\) and maps \(w\in V\) to \((\ket{v})(\ip{u}{w}) = \ip{u}{w} v\). This is the canonical form of a rank-1 operator in \(L(V)\).
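The operator picture above can be checked numerically by representing \(\ket{v}\) as a column matrix and \(\bra{v}\) as its conjugate-transpose row. The following is a minimal NumPy sketch under that assumption; the helper names `ket` and `bra` and the sample vectors are illustrative choices, not part of the text.

```python
import numpy as np

def ket(v):
    """Column matrix: the map lambda -> lambda * v from C to V."""
    return np.asarray(v, dtype=complex).reshape(-1, 1)

def bra(v):
    """Row matrix: the map u -> <v|u>, conjugate-linear in v."""
    return ket(v).conj().T

# Arbitrary sample vectors in V = C^2 (illustrative values only).
u = [1, 1j]
v = [2, 1 - 1j]
w = [0.5, -1j]

inner = (bra(u) @ ket(v))[0, 0]   # (bra u)(ket v), a 1x1 matrix identified with <u|v>
rank_one = ket(v) @ bra(u)        # (ket v)(bra u), an operator in L(V)
applied = rank_one @ ket(w)       # should equal <u|w> v
```

Composing in the order bra-then-ket gives a scalar, while ket-then-bra gives a matrix of rank 1, matching the two compositions described above.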

We now blur the distinction between \(L(\CC, V)\) and \(V\), and write \(\dual{V}\) (the dual of \(V\)) for \(L(V, \CC)\). Thus we regard \(\bra{v}\) as belonging to \(\dual{V}\) and \(\ket{v}\) as belonging to \(V\).

Tensor products

The map \[ (\ket{a}, \bra{b}) \in V \times \dual{V} \longmapsto \ket{a} \bra{b} \in L(V) \] is a bilinear map and extends by linearity to a map \(V \otimes \dual{V} \rightarrow L(V)\), which can be shown to be an isomorphism.

Likewise, for any \(A\in L(U)\) and \(B\in L(V)\), the pair \((A,B)\) induces a linear map on \(U\otimes V\) by extending \(u\otimes v\mapsto Au\otimes Bv\). This gives a map \(L(U)\otimes L(V)\rightarrow L(U\otimes V)\) which can also be shown to be an isomorphism.
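In coordinates, the tensor product of vectors and of operators is commonly realized by the Kronecker product. The sketch below assumes that identification (via `np.kron`) and uses randomly generated matrices purely as test data to check that \(A\otimes B\) sends \(u\otimes v\) to \(Au\otimes Bv\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Random test data: A in L(U) with dim U = 2, B in L(V) with dim V = 3.
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
u = rng.normal(size=2) + 1j * rng.normal(size=2)
v = rng.normal(size=3) + 1j * rng.normal(size=3)

lhs = np.kron(A, B) @ np.kron(u, v)   # (A tensor B) applied to (u tensor v)
rhs = np.kron(A @ u, B @ v)           # (A u) tensor (B v)
```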

The trace

The map \[ (\ket{a}, \bra{b}) \in V \times \dual{V} \longmapsto \ip{b}{a} \in\CC \] is bilinear on \(V\times \dual{V}\) and so induces a linear map \(V\otimes\dual{V}\rightarrow\CC\). Since \(V\otimes\dual{V}\) is isomorphic to \(L(V)\), there is a unique linear map on \(L(V)\) which takes \[ \ket{a} \bra{b} \in L(V) \longmapsto \ip{b}{a}\in\CC \] This map is our coordinate-free definition of the trace, and we can easily see its other familiar properties from this formulation. First note that \[ \tr((\ket{a}\bra{b})(\ket{c}\bra{d})) = \ip{b}{c}\tr(\ket{a}\bra{d}) = \ip{b}{c}\ip{d}{a} \] and, by the same token, \(\tr((\ket{c}\bra{d})(\ket{a}\bra{b})) = \ip{d}{a}\ip{b}{c}\). Thus \(\tr(AB) = \tr(BA)\) when \(A\) and \(B\) are elementary tensors and so, by linearity, this holds on all of \(L(V)\).
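As a numerical sanity check of these identities, the sketch below compares the coordinate-free definition with NumPy's matrix trace on random rank-one operators. The helper `outer` and the random vectors are illustrative assumptions; `np.vdot(x, y)` conjugates its first argument, matching the convention \(\ip{x}{y}\) used here.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_vec(n):
    """Random complex vector, used only as test data."""
    return rng.normal(size=n) + 1j * rng.normal(size=n)

def outer(x, y):
    """Matrix of the rank-one operator |x><y|."""
    return np.outer(x, y.conj())

a, b, c, d = (random_vec(3) for _ in range(4))

tr_ab = np.trace(outer(a, b))                 # should equal <b|a>
lhs = np.trace(outer(a, b) @ outer(c, d))     # should equal <b|c><d|a>
rhs = np.trace(outer(c, d) @ outer(a, b))     # same value, by tr(AB) = tr(BA)
```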

Another useful property, when \(A\in L(U)\) and \(B\in L(V)\), is that \(\tr(A\otimes B) = \tr(A)\tr(B)\). This is easily seen when \(A=\ket{a}\bra{b}\) and \(B=\ket{c}\bra{d}\) because \[ (\ket{a}\bra{b})\otimes(\ket{c}\bra{d}) = \ket{a\otimes c}\bra{b\otimes d} \] and the result follows in general by (bi)linearity.
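The multiplicativity of the trace over tensor products can be illustrated the same way. This sketch again assumes the Kronecker-product model of \(\otimes\) and uses random vectors as test data; it checks both the rank-one identity used in the argument and the resulting formula \(\tr(A\otimes B) = \tr(A)\tr(B)\).

```python
import numpy as np

rng = np.random.default_rng(3)

def random_vec(n):
    """Random complex vector, used only as test data."""
    return rng.normal(size=n) + 1j * rng.normal(size=n)

def outer(x, y):
    """Matrix of the rank-one operator |x><y|."""
    return np.outer(x, y.conj())

a, b = random_vec(2), random_vec(2)   # vectors in U = C^2
c, d = random_vec(3), random_vec(3)   # vectors in V = C^3

A, B = outer(a, b), outer(c, d)
tensor = np.kron(A, B)                               # (|a><b|) tensor (|c><d|)
rank_one = outer(np.kron(a, c), np.kron(b, d))       # |a tensor c><b tensor d|
```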

Writing \(\ket{i}\) (\(i=1,2,\ldots, n\)) for the standard basis of \(V = \CC^n\), note that \(\sum_{i=1}^n \ket{i}\bra{i} = I\) and so \[ \tr(A) = \sum_{i=1}^n \tr(A\ket{i}\bra{i}) = \sum_{i=1}^n \langle i | A | i \rangle \] which is the familiar formula for the trace as the sum of diagonal matrix entries.

Partial traces

The map \[ (A, B) \in L(U) \times L(V) \longmapsto \tr(A) B \in L(V) \] is a bilinear map and so extends uniquely to a map \(\tr_U:L(U) \otimes L(V) \rightarrow L(V)\) such that \[ \tr_U(A\otimes B) = \tr(A) B \] Clearly for any \(C\in L(U)\), \[ \tr_U((A\otimes B)(C\otimes I)) = \tr_U(AC\otimes B) = \tr(AC)B = \tr(CA)B = \tr_U((C\otimes I)(A\otimes B)) \] Thus by linearity, \(\tr_U(X(C\otimes I)) = \tr_U((C\otimes I)X)\) for any \(X\in L(U\otimes V)\). Note also that \(\tr(\tr_U(X)) = \tr(X)\). This follows easily for elementary tensors \(A\otimes B\), \[ \tr(\tr_U(A\otimes B)) =\tr(A)\tr(B) = \tr(A\otimes B) \] and so in general by taking sums.

Another useful formula which generalizes the last one is \(\tr(\tr_U(X)C) = \tr(X(I\otimes C))\). Again we see this easily for elementary tensors and obtain the general result by linearity: \[ \begin{align} \tr((A\otimes B)(I\otimes C)) &= \tr(A\otimes BC) \\ &= \tr(A)\tr(BC) \\ &= \tr(\tr(A)B\, C) \\ &= \tr(\tr_U(A\otimes B)C) \end{align} \]

To get the coordinatized formula for the partial trace, let \(U=\CC^m\) and \(V=\CC^n\), and let \(\ket{ij}\) for \(i=1,2,\ldots, m\) and \(j=1,2,\ldots, n\) be the standard basis for \(U\otimes V\). Note that \(\ket{ij}\bra{kl} = (\ket{i}\bra{k})\otimes(\ket{j}\bra{l})\). Then for \(A\in L(U\otimes V)\), \[ A = \sum_{i=1}^m\sum_{j=1}^n\sum_{k=1}^m\sum_{l=1}^n a_{ijkl} \ket{ij}\bra{kl} \] and so \[ \tr_U(A) = \sum_{i,j,k,l} a_{ijkl} \tr_U(\ket{ij}\bra{kl}) = \sum_{i,j,k,l} a_{ijkl} \tr(\ket{i}\bra{k})\,\ket{j}\bra{l} = \sum_{i,j,k,l} a_{ijkl} \ip{k}{i}\,\ket{j}\bra{l} = \sum_{j,l} \sum_{i=1}^m a_{ijil} \ket{j}\bra{l} \] In other words the entry in the \(j,l\) position of the matrix of \(\tr_U(A)\) is \(\sum_{i=1}^m a_{ijil}\).
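The coordinate formula translates almost directly into array code. The following is a minimal sketch, assuming operators on \(U\otimes V\) are stored as \(mn\times mn\) matrices in the Kronecker-product ordering (so that reshaping to four indices yields \(a_{ijkl}\)); the function name `partial_trace_U` is a hypothetical helper, not a library routine.

```python
import numpy as np

def partial_trace_U(X, m, n):
    """Trace out the first factor U = C^m of an operator X on C^m tensor C^n."""
    T = np.asarray(X).reshape(m, n, m, n)   # T[i, j, k, l] = a_{ijkl}
    return np.einsum('ijil->jl', T)         # entry (j, l) is sum_i a_{ijil}

rng = np.random.default_rng(4)
m, n = 2, 3

# Random test operators A in L(U), B and C in L(V), X in L(U tensor V).
A = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
C = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
X = rng.normal(size=(m * n, m * n)) + 1j * rng.normal(size=(m * n, m * n))

reduced_elementary = partial_trace_U(np.kron(A, B), m, n)   # should be tr(A) B
reduced_X = partial_trace_U(X, m, n)
```

With these definitions one can check the properties derived above: `reduced_elementary` agrees with `np.trace(A) * B`, the trace of `reduced_X` equals the trace of `X`, and `np.trace(reduced_X @ C)` equals `np.trace(X @ np.kron(np.eye(m), C))`.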


A system in a mixed state is described by a density operator \(\rho\), which is a positive, trace-1 operator. In more detail, if \(\rho\) is a density operator and \(P_i\) are the spectral projections of an observable \(O\), then the probability of outcome \(i\) is given by \[ p_i = \tr(P_i \rho) \] The probabilities so obtained for all observables are a full picture of all the information which we can physically extract from the system in state \(\rho\).

Now suppose that \(\rho\) is a mixed state on the system \(U\otimes V\) but that we can only make measurements in \(V\). Then we are restricted to measurement projections of the form \(I\otimes P\) where \(P\in L(V)\). The probabilities which we can extract from \(\rho\) with such projections represent the sum total of all the information about \(\rho\) which is available by means of measurements in \(V\) alone. The probability of getting a measurement outcome with projection \(P\) on \(V\) is \(p = \tr((I\otimes P) \rho)\). However by properties of the partial trace seen above, this probability is \[ p = \tr((I\otimes P) \rho) = \tr(P \tr_U(\rho)) \] Thus \(\tr_U(\rho)\) holds all the information about \(\rho\) which is available to us from measurements in \(V\).

Furthermore, \(\tr_U(\rho)\) is a density operator on \(V\). To see that it is positive, compute for an arbitrary unit vector \(x\in V\), \[ \begin{align} \ip{x}{\tr_U(\rho)x} &= \tr(\tr_U(\rho) \ket{x}\bra{x}) \\ &= \tr(\rho (I\otimes \ket{x}\bra{x})) \\ &= \tr((I\otimes \ket{x}\bra{x}) \rho (I\otimes \ket{x}\bra{x})) \\ &\ge 0 \end{align} \] Here the second line uses the formula \(\tr(\tr_U(X)C) = \tr(X(I\otimes C))\), the third uses \(Q^2 = Q\) for the projection \(Q = I\otimes \ket{x}\bra{x}\) together with \(\tr(AB) = \tr(BA)\), and the last holds because \(Q\rho Q\) is a positive operator. Similarly, \(\tr(\tr_U(\rho)) = \tr(\rho) = 1\), and so \(\tr_U(\rho)\) is a density operator, which contains all the physical information about the system which is available to an observer measuring in \(V\).
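The two claims above, that \(\tr((I\otimes P)\rho) = \tr(P\,\tr_U(\rho))\) and that \(\tr_U(\rho)\) is again a density operator, can be checked on a randomly generated state. This is a sketch under stated assumptions: the Kronecker-product model of \(U\otimes V\), a hypothetical helper `partial_trace_U`, and a density matrix built as \(GG^*/\tr(GG^*)\) from a random matrix \(G\), which is just one convenient way to produce test data.

```python
import numpy as np

def partial_trace_U(X, m, n):
    """Trace out the first factor U = C^m of an operator X on C^m tensor C^n."""
    T = np.asarray(X).reshape(m, n, m, n)
    return np.einsum('ijil->jl', T)

m, n = 2, 3
rng = np.random.default_rng(2)

# Random density operator on U tensor V: positive by construction, normalized to trace 1.
G = rng.normal(size=(m * n, m * n)) + 1j * rng.normal(size=(m * n, m * n))
rho = G @ G.conj().T
rho /= np.trace(rho).real

# A rank-one measurement projection P = |x><x| on V (arbitrary unit vector x).
x = np.array([1, 1j, 0]) / np.sqrt(2)
P = np.outer(x, x.conj())

p_full = np.trace(np.kron(np.eye(m), P) @ rho).real   # tr((I tensor P) rho)
rho_V = partial_trace_U(rho, m, n)                    # tr_U(rho)
p_reduced = np.trace(P @ rho_V).real                  # tr(P tr_U(rho))
eigenvalues = np.linalg.eigvalsh(rho_V)               # nonnegative up to rounding
```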

In particular, even if \(\ket{\psi}\) is a pure state on \(U\otimes V\), the part of \(\ket{\psi}\) available in \(V\) is in general a mixed state, with density operator \(\tr_U(\ket{\psi}\bra{\psi})\). As we shall see next, \(\tr_U(\ket{\psi}\bra{\psi})\) is pure (in \(V\)) if and only if \(\psi\) is unentangled. Thus mixed states arise very naturally from pure states when considering partial measurements of entangled systems.

A state \(\psi\) in \(U\otimes V\) is said to split if there are \(u\in U\) and \(v\in V\) such that \(\psi = u\otimes v\). Otherwise \(\psi\) is said to be entangled.

Recall the Schmidt Decomposition Theorem:

(Schmidt Decomposition) Let \(\psi\) be a unit vector in \(U\otimes V\). Then there are orthonormal sequences \(\{u_1, \ldots, u_m\}\) in \(U\) and \(\{v_1, \ldots, v_m\}\) in \(V\), and positive numbers \(\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m > 0\) satisfying \(\sum \lambda_i^2 = 1\), such that \[ \psi = \sum_{i=1}^m \lambda_i u_i\otimes v_i \] The sequence \(\lambda_i\) (in order) is uniquely determined by \(\psi\) and \(m\le\min\{\dim U, \dim V\}\).

Proof. See here.

As a consequence of the Schmidt Decomposition Theorem, if \(\psi\) is a state (unit vector) on \(U\otimes V\) then it is entangled if and only if \(m>1\). Furthermore, \[ \begin{align} \tr_U(\ket{\psi}\bra{\psi}) &= \sum_{i,j=1}^m \lambda_i \bar{\lambda_j} \tr_U(\ket{u_i\otimes v_i}\bra{u_j\otimes v_j})\\ &= \sum_{i,j=1}^m \lambda_i \bar{\lambda_j} \tr_U(\ket{u_i}\bra{u_j}\otimes\ket{v_i}\bra{v_j})\\ &= \sum_{i,j=1}^m \lambda_i \bar{\lambda_j} \delta_{ij}\ket{v_i}\bra{v_j}\\ &= \sum_{i=1}^m |\lambda_i|^2\ket{v_i}\bra{v_i} \end{align} \] Thus \(\tr_U(\ket{\psi}\bra{\psi})\) is pure in \(V\) if and only if \(m=1\), which happens if and only if \(\psi\) is unentangled.

From this calculation we can also observe that \(\tr_V(\ket{\psi}\bra{\psi}) = \sum_{i} |\lambda_i|^2\ket{u_i}\bra{u_i}\). Since the nonzero eigenvalues of \(\tr_U(\ket{\psi}\bra{\psi})\) and \(\tr_V(\ket{\psi}\bra{\psi})\) are the same (namely the \(\lambda_i^2\), with the same multiplicities), this reveals the interesting fact that exactly the same information content is available from the compound system in each of \(U\) and \(V\) separately.
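The Schmidt coefficients and both reduced density operators can be computed from a singular value decomposition. The sketch below assumes \(\psi\in\CC^m\otimes\CC^n\) is stored in the Kronecker-product ordering, so that reshaping gives the coefficient matrix \(C = (c_{ij})\) with \(\psi = \sum c_{ij}\ket{ij}\); then the singular values of \(C\) are the \(\lambda_i\), while \(\tr_U(\ket{\psi}\bra{\psi}) = C^T\bar{C}\) and \(\tr_V(\ket{\psi}\bra{\psi}) = CC^*\). The function name `schmidt_data` and the tolerance are illustrative choices.

```python
import numpy as np

def schmidt_data(psi, m, n, tol=1e-12):
    """Return (Schmidt coefficients, tr_V|psi><psi|, tr_U|psi><psi|) for psi in C^m tensor C^n."""
    C = np.asarray(psi, dtype=complex).reshape(m, n)   # coefficient matrix c_{ij}
    s = np.linalg.svd(C, compute_uv=False)             # singular values, in decreasing order
    lam = s[s > tol]                                   # keep the strictly positive ones
    rho_U = C @ C.conj().T                             # reduced state on U
    rho_V = C.T @ C.conj()                             # reduced state on V
    return lam, rho_U, rho_V

# An entangled state (a Bell state) and a split state u tensor v, both in C^2 tensor C^2.
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
product = np.kron([1, 0], [1, 1]) / np.sqrt(2)

lam_bell, _, rho_V_bell = schmidt_data(bell, 2, 2)
lam_prod, _, rho_V_prod = schmidt_data(product, 2, 2)

# A random unit vector in C^2 tensor C^3, to compare the two reduced spectra.
rng = np.random.default_rng(5)
psi = rng.normal(size=6) + 1j * rng.normal(size=6)
psi /= np.linalg.norm(psi)
lam, rho_U, rho_V = schmidt_data(psi, 2, 3)
```

For the Bell state there are two Schmidt coefficients and the reduced state is \(I/2\), which is mixed; for the split state there is one coefficient and the reduced state is a rank-one projection. In the random \(2\times 3\) example the reduced state on \(V\) has one extra zero eigenvalue, but its nonzero eigenvalues agree with those of the reduced state on \(U\).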

Conversely, just as we have seen that mixed states arise naturally from pure states when considering partial measurements of entangled systems, it also turns out that any mixed state can be obtained as the partial trace of a pure state on a larger product space.

To see this, let \(\rho\) be a density matrix on \(V\) with spectral decomposition \(\rho = \sum_{i=1}^m a_i \ket{v_i}\bra{v_i}\) where \(a_i\ge 0\) and \(\sum_i a_i = 1\). Take \(U\) to be any finite-dimensional Hilbert space of dimension at least \(m\) and let \(\{u_i\}_1^m\) be an orthonormal sequence in \(U\). Then the previous calculation shows that if \(\psi\) is the pure state \(\sum_i \sqrt{a_i}(u_i\otimes v_i)\) in \(U\otimes V\), then \(\rho\) is the part of \(\psi\) measurable in \(V\). The state vector \(\psi\) is called a purification of \(\rho\).
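The construction of a purification can be sketched in a few lines. The code below assumes the Kronecker-product model of \(U\otimes V\), takes \(U = \CC^m\) with the standard basis as the orthonormal sequence \(\{u_i\}\) (one arbitrary choice among many), and drops zero eigenvalues using a small tolerance; `purify` and `partial_trace_U` are hypothetical helper names, and the sample \(\rho\) is illustrative.

```python
import numpy as np

def partial_trace_U(X, m, n):
    """Trace out the first factor U = C^m of an operator X on C^m tensor C^n."""
    T = np.asarray(X).reshape(m, n, m, n)
    return np.einsum('ijil->jl', T)

def purify(rho, tol=1e-12):
    """Return (psi, m, n) with psi in C^m tensor C^n and tr_U|psi><psi| = rho."""
    a, vecs = np.linalg.eigh(rho)        # eigenvalues a_i, eigenvectors v_i as columns
    keep = a > tol                       # discard zero eigenvalues
    a, vecs = a[keep], vecs[:, keep]
    m, n = len(a), rho.shape[0]
    u_basis = np.eye(m)                  # u_i: standard basis of U = C^m (assumed choice)
    psi = sum(np.sqrt(a[i]) * np.kron(u_basis[i], vecs[:, i]) for i in range(m))
    return psi, m, n

# A sample density matrix on V = C^2: Hermitian, trace 1, positive definite.
rho = np.array([[0.75, 0.25],
                [0.25, 0.25]], dtype=complex)

psi, m, n = purify(rho)
recovered = partial_trace_U(np.outer(psi, psi.conj()), m, n)   # tr_U |psi><psi|
```

Here `recovered` reproduces `rho` and `psi` is a unit vector. The purification is not unique: any other orthonormal sequence in \(U\) would serve equally well.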