\documentclass[11pt]{article}
\usepackage{latexsym}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{hyperref}
\usepackage{graphicx}
%\usepackage{epsfig}
%\usepackage{psfig}
\newcommand{\handout}[5]{
\noindent
\begin{center}
\framebox{
\vbox{
\hbox to 5.78in { {\bf PHYS 7895: Quantum Information Theory } \hfill #2 }
\vspace{4mm}
\hbox to 5.78in { {\Large \hfill #5 \hfill} }
\vspace{2mm}
\hbox to 5.78in { {\em #3 \hfill #4} }
}
}
\end{center}
\vspace*{4mm}
}
\newcommand{\lecture}[4]{\handout{#1}{#2}{#3}{Scribe: #4}{Lecture #1}}
\newtheorem{theorem}{Theorem}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{observation}[theorem]{Observation}
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{definition}[theorem]{Definition}
\newtheorem{claim}[theorem]{Claim}
\newtheorem{fact}[theorem]{Fact}
\newtheorem{assumption}[theorem]{Assumption}
\newtheorem{criterion}[theorem]{Criterion}
\newtheorem{remark}[theorem]{Remark}
\newtheorem{exercise}[theorem]{Exercise}
\newtheorem{notation}[theorem]{Notation}
\newtheorem{property}[theorem]{Property}
% 1-inch margins, from fullpage.sty by H.Partl, Version 2, Dec. 15, 1988.
\topmargin 0pt
\advance \topmargin by -\headheight
\advance \topmargin by -\headsep
\textheight 8.9in
\oddsidemargin 0pt
\evensidemargin \oddsidemargin
\marginparwidth 0.5in
\textwidth 6.5in
\parindent 0in
\parskip 1.5ex
%\renewcommand{\baselinestretch}{1.25}
\begin{document}
\lecture{18 --- October 26, 2015}{Fall 2015}{Prof.\ Mark M.\ Wilde}{Mark M.~Wilde}
This document is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
\section{Overview}
In the previous lecture, we discussed classical entropy and entropy inequalities.
In this lecture, we discuss several information
measures that are important for quantifying the amount of information and
correlations that are present in quantum systems. The first fundamental
measure that we introduce is the
\index{von Neumann entropy}%
von Neumann entropy. It is the quantum generalization of the Shannon entropy,
but it captures both classical and quantum uncertainty in a quantum
state. The von Neumann
entropy gives meaning to a notion of the \textit{information qubit}. This
notion is different from that of the physical qubit, which is the description
of a quantum state of an electron or a photon. The information qubit is the
fundamental quantum informational unit of measure, determining how much
quantum information is present in a quantum system.
The initial definitions here are analogous to the classical definitions of
entropy, but we soon discover a radical departure from the intuitive classical
notions from the previous lecture: the conditional quantum entropy can be
negative for certain quantum states. In the classical world, this negativity
simply does not occur, but it takes on a special meaning in quantum information
theory. Pure quantum states that are entangled have stronger-than-classical
correlations and are examples of states that have negative conditional
entropy. The negative of the conditional quantum entropy is so important in
quantum information theory that we even have a special name for it: the
\index{coherent information}%
coherent information. We discover that the coherent information obeys a
quantum data-processing inequality, placing it on a firm footing as a
particular informational measure of quantum correlations.
We then define several other quantum information measures, such as quantum
mutual information, that bear similar definitions as in the classical world,
but with Shannon entropies replaced with von Neumann entropies. This
replacement may seem to make quantum entropy somewhat trivial on the surface,
but a simple calculation reveals that a maximally entangled state on two
qubits registers \textit{two bits} of quantum mutual information (recall that
the largest the mutual information can be in the classical world is
\textit{one bit} for the case of two maximally correlated bits).
%We then
%discuss several entropy inequalities that play an important role in quantum
%information processing:\ the monotonicity of quantum relative entropy, strong
%subadditivity, the quantum data-processing inequalities, and continuity of
%quantum entropy.
\section{Quantum Entropy}
We might expect a measure of the entropy of a quantum system to be vastly
different from the classical measure of entropy from the previous lecture
because a quantum system possesses not only classical uncertainty but also
quantum uncertainty that arises from the uncertainty principle. But recall
that the density operator captures both types of uncertainty and allows us to
determine probabilities for the outcomes of any measurement on system $A$.
Thus, a quantum measure of uncertainty should be a direct function of the
density operator, just as the classical measure of uncertainty is a direct
function of a probability density function. It turns out that this function
has a strikingly similar form to the classical entropy, as we see below.
\begin{definition}
[Quantum Entropy]\label{def:quantum-entropy}%
\index{von Neumann entropy}%
Suppose that Alice prepares some quantum system $A$ in a state $\rho_{A}%
\in\mathcal{D}(\mathcal{H}_{A})$. Then the entropy $H(A)_{\rho}$\ of the state
is\ as follows:%
\begin{equation}
H(A)_{\rho}\equiv-\operatorname{Tr}\left\{ \rho_{A}\log\rho_{A}\right\} .
\end{equation}
\end{definition}
The entropy of a quantum system is also known as the \textit{von Neumann
entropy} or the \textit{quantum entropy} but we often simply refer to it as
the \textit{entropy}. We can denote it by $H(A)_{\rho}$ or $H(\rho_{A})$ to
show the explicit dependence on the density operator $\rho_{A}$. The von
Neumann entropy has a special relation to the eigenvalues of the density
operator, as the following exercise asks you to verify.
\begin{exercise}
\label{ex-qie:eigen-von-neumann}Consider a density operator $\rho_{A}$ with
the following spectral decomposition:%
\begin{equation}
\rho_{A}=\sum_{x}p_{X}(x)|x\rangle\langle x|_{A}.
\end{equation}
Show that the entropy $H(A)_{\rho}$\ is the same as the Shannon entropy
$H(X)$\ of a random variable $X$ with probability distribution $p_{X}(x)$.
\end{exercise}
In our definition of quantum entropy, we use the same notation $H$ as in the
classical case to denote the entropy of a quantum system. It should be clear
from the context whether we are referring to the entropy of a quantum or
classical system.
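As a numerical companion to Exercise~\ref{ex-qie:eigen-von-neumann}, the following sketch (Python with NumPy; the helper names \texttt{vn\_entropy} and \texttt{shannon\_entropy} are ours, not standard library functions) computes the von Neumann entropy from the eigenvalues of a density operator and compares it with the Shannon entropy of the eigenvalue distribution:

```python
import numpy as np

def vn_entropy(rho):
    """Base-2 von Neumann entropy -Tr(rho log rho), via eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]              # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

def shannon_entropy(p):
    """Base-2 Shannon entropy of a probability vector p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

# A density operator diagonal in the basis {|x>}: its eigenvalues are
# p_X(x), so the von Neumann entropy equals the Shannon entropy H(X).
p = np.array([0.5, 0.25, 0.25])
rho_A = np.diag(p)
print(vn_entropy(rho_A))                # 1.5 bits, same as shannon_entropy(p)
```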
The quantum entropy admits an intuitive interpretation. Suppose that Alice
generates a quantum state $|\psi_{y}\rangle$ in her lab according to some
probability density $p_{Y}(y)$, corresponding to a random variable $Y$.
Suppose further that Bob has not yet received the state from Alice and does
not know which one she sent. The expected density operator from Bob's point of
view is then%
\begin{equation}
\sigma=\mathbb{E}_{Y}\left\{ |\psi_{Y}\rangle\langle\psi_{Y}|\right\}
=\sum_{y}p_{Y}(y)|\psi_{y}\rangle\langle\psi_{y}|.
\end{equation}
The interpretation of the entropy $H(\sigma)$\ is that it quantifies Bob's
uncertainty about the state Alice sent---his expected information gain is
$H(\sigma)$ qubits upon receiving and measuring the state that Alice sends.
\subsection{Mathematical Properties of Quantum Entropy}
We now discuss several mathematical properties of the quantum
entropy:\ non-negativity, its minimum value, its maximum value, its invariance
with respect to isometries, and concavity. The first three of these properties
follow from the analogous properties in the classical world because the von
Neumann entropy of a density operator is the Shannon entropy of its
eigenvalues (see Exercise~\ref{ex-qie:eigen-von-neumann}). We state them
formally below:
\begin{property}
[Non-Negativity]The von Neumann entropy%
\index{von Neumann entropy!positivity}
$H( \rho) $ is non-negative for any density operator $\rho$:%
\begin{equation}
H( \rho) \geq0.
\end{equation}
\end{property}
\begin{proof}
This follows from non-negativity of Shannon entropy.
\end{proof}
\begin{property}
[Minimum Value]The minimum value of the von Neumann entropy is zero, and it
occurs when the density operator is a pure state.
\end{property}
\begin{proof}
The minimum value equivalently occurs when the eigenvalues of a density
operator are distributed with all the probability mass on one eigenvector and
zero on the others, so that the density operator is rank one and corresponds
to a pure state.
\end{proof}
Why should the entropy of a pure quantum state vanish?\ It seems that there is
quantum uncertainty inherent in the state itself and that a measure of quantum
uncertainty should capture this fact. This last observation only makes sense
if we do not know anything about the state that is prepared. But if we know
exactly how it was prepared, we can perform a special quantum measurement to
verify that the quantum state was prepared, and we do not learn anything from
this measurement because the outcome of it is always certain. For example,
suppose that Alice prepares the state $|\phi\rangle$ and Bob knows that she
does so. He can then perform the following measurement $\left\{ |\phi
\rangle\left\langle \phi\right\vert ,I-|\phi\rangle\langle\phi|\right\} $ to
verify that she prepared this state. He always receives the first outcome from
the measurement and thus never gains any information from it. Thus, in this
sense it is reasonable that the entropy of a pure state vanishes.
\begin{property}
[Maximum Value]\label{prop:max-val-von-ent}The maximum value of the von
Neumann entropy is $\log d$ where $d$ is the dimension of the system, and it
occurs for the maximally mixed state.
\end{property}
\begin{proof}
A proof of the above property is the same as that for the classical case.
\end{proof}
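The minimum and maximum values can be verified numerically (a sketch in Python with NumPy; \texttt{vn\_entropy} is a hypothetical helper computing the base-2 von Neumann entropy from eigenvalues):

```python
import numpy as np

def vn_entropy(rho):
    """Base-2 von Neumann entropy -Tr(rho log rho), via eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]              # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

d = 4
psi = np.zeros(d); psi[0] = 1.0
pure = np.outer(psi, psi)               # rank-one projector |psi><psi|
mixed = np.eye(d) / d                   # maximally mixed state pi

print(np.isclose(vn_entropy(pure), 0.0))   # True: minimum value
print(vn_entropy(mixed))                   # 2.0 = log2(4): maximum value
```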
\begin{property}
[Concavity]\label{prop-qie:concavity}%
\index{von Neumann entropy!concavity}%
Let $\rho_{x}\in\mathcal{D}(\mathcal{H})$ and let $p_{X}(x)$ be a probability
distribution. The entropy is concave in the density operator:%
\begin{equation}
H(\rho)\geq\sum_{x}p_{X}(x)H(\rho_{x}),
\end{equation}
where $\rho\equiv\sum_{x}p_{X}(x)\rho_{x}$.
\end{property}
The physical interpretation of concavity is as before for classical entropy:
entropy can never decrease under a mixing operation. This inequality is a
fundamental property of the entropy, and we prove it after developing some
important entropic tools.
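Concavity can at least be spot-checked numerically (a Python/NumPy sketch; the random-state construction and helper names are our own choices, not from the text):

```python
import numpy as np

def vn_entropy(rho):
    """Base-2 von Neumann entropy -Tr(rho log rho), via eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]              # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

def random_density_operator(d, rng):
    """A random d x d density operator: A A-dagger normalized to unit trace."""
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(7)
p = np.array([0.3, 0.7])
states = [random_density_operator(2, rng) for _ in p]
rho = sum(px * sx for px, sx in zip(p, states))       # the mixture

lhs = vn_entropy(rho)
rhs = sum(px * vn_entropy(sx) for px, sx in zip(p, states))
print(lhs >= rhs - 1e-12)               # True: entropy cannot decrease under mixing
```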
\begin{property}
[Isometric Invariance]Let $\rho\in\mathcal{D}(\mathcal{H})$ and $U:\mathcal{H}%
\rightarrow\mathcal{H}^{\prime}$ be an isometry. The entropy of a density
operator is invariant with respect to isometries, in the following sense:%
\begin{equation}
H(\rho)=H(U\rho U^{\dag}).
\end{equation}
\end{property}
\begin{proof}
Isometric invariance of entropy follows by observing that the eigenvalues of a
density operator are invariant with respect to an isometry.
\end{proof}
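A small check of isometric invariance, embedding a qubit state into a qutrit (a Python/NumPy sketch; helper names are ours):

```python
import numpy as np

def vn_entropy(rho):
    """Base-2 von Neumann entropy -Tr(rho log rho), via eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]              # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

# An isometry U : C^2 -> C^3 embedding the qubit into the first two
# levels of a qutrit; U-dagger U = I_2, but U U-dagger is only a projector.
U = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
rho = np.array([[0.75, 0.25],
                [0.25, 0.25]])          # a valid qubit density operator

assert np.allclose(U.conj().T @ U, np.eye(2))
print(np.isclose(vn_entropy(rho), vn_entropy(U @ rho @ U.conj().T)))  # True
```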
\section{Joint Quantum Entropy}
The joint quantum entropy
\index{von Neumann entropy!joint}%
$H(AB)_{\rho}$\ of the density operator $\rho_{AB}\in\mathcal{D}%
(\mathcal{H}_{A}\otimes\mathcal{H}_{B})$\ for a bipartite system $AB$ follows
naturally from the definition of quantum entropy:%
\begin{equation}
H(AB)_{\rho}\equiv-\operatorname{Tr}\left\{ \rho_{AB}\log\rho_{AB}\right\} .
\end{equation}
Now suppose that $\rho_{ABC}$ is a tripartite state, i.e., in $\mathcal{D}%
(\mathcal{H}_{A}\otimes\mathcal{H}_{B}\otimes\mathcal{H}_{C})$. Then the
entropy $H(AB)_{\rho}$ in this case is defined as above, where $\rho
_{AB}=\operatorname{Tr}_{C}\{\rho_{ABC}\}$. This is a convention that we take
throughout. We introduce a few of the properties of joint quantum
entropy in the subsections below.
\subsection{Marginal Entropies of a Pure Bipartite State}
The five properties of quantum entropy in the previous section may give you
the impression that the nature of quantum information is not too different
from that of classical information. We proved all these properties for the
classical case, and their proofs for the quantum case seem similar. The first
three even resort to the proofs in the classical case!
Theorem~\ref{thm-ie:marginal-entropies-pure-state} below is where we observe
our first radical departure from the classical world. It states that the
marginal entropies of a pure bipartite state are equal, while the entropy of
the overall state is equal to zero. Recall that the joint entropy $H(X,Y)$ of
two random variables $X$ and $Y$ is never less than either of the marginal
entropies $H(X)$ or $H(Y)$:%
\begin{equation}
H(X,Y)\geq H(X),\ \ \ \ \ \ \ \ \ \ H(X,Y)\geq H(Y).
\end{equation}
The above inequalities follow from the non-negativity of classical conditional
entropy. But in the quantum world, these inequalities do not always have to
hold, and the following theorem demonstrates that they do not hold for an
arbitrary pure bipartite quantum state with Schmidt rank greater than one. The fact that
the joint quantum entropy can be less than the marginal quantum entropy is one
of the most fundamental differences between classical and quantum information.
\begin{theorem}
\label{thm-ie:marginal-entropies-pure-state}The marginal entropies
$H(A)_{\phi}$ and $H(B)_{\phi}$ of a pure bipartite state $|\phi\rangle_{AB}$
are equal:%
\begin{equation}
H(A)_{\phi}=H(B)_{\phi},
\end{equation}
while the joint entropy $H(AB)_{\phi}$ vanishes:%
\begin{equation}
H(AB)_{\phi}=0.
\end{equation}
\end{theorem}
\begin{proof}
The crucial ingredient for a proof of this theorem is the Schmidt
decomposition. Recall that any bipartite state
$\vert\phi\rangle_{AB}$ admits a Schmidt decomposition of the following form:%
\begin{equation}
\vert\phi\rangle_{AB}=\sum_{i}\sqrt{\lambda_{i}}\left\vert i\right\rangle
_{A}\vert i\rangle_{B},
\end{equation}
where $\{\vert i\rangle_{A}\}$ is some orthonormal set of vectors on system
$A$ and $\{\vert i\rangle_{B}\}$ is some orthonormal set on system $B$. Recall
that the Schmidt rank is equal to the number of non-zero coefficients
$\lambda_{i}$. Then the respective marginal states $\rho_{A}$\ and $\rho_{B}$
on systems $A$ and $B$ are as follows:%
\begin{equation}
\rho_{A} =\sum_{i}\lambda_{i}\vert i\rangle\langle i\vert_{A},
\ \ \ \ \ \ \rho_{B} =\sum_{i}\lambda_{i}\vert i\rangle\langle i\vert_{B}.
\end{equation}
Thus, the marginal states admit a spectral decomposition with the same
eigenvalues. The theorem follows because the von Neumann entropy depends only
on the eigenvalues of a given spectral decomposition.
\end{proof}
The theorem applies not only to two systems $A$ and $B$, but it also applies
to any number of systems if we make a bipartite cut of the systems. For
example, if the state is $\vert\phi\rangle_{ABCDE}$, then the following
equalities (and others from different combinations) hold by applying
Theorem~\ref{thm-ie:marginal-entropies-pure-state}:%
\begin{align}
H( A) _{\phi} & =H( BCDE) _{\phi},\\
H( AB) _{\phi} & =H( CDE) _{\phi},\\
H( ABC) _{\phi} & =H( DE) _{\phi},\\
H( ABCD) _{\phi} & =H( E) _{\phi}.
\end{align}
The closest analogy in the classical world to the above property is when we
copy a random variable $X$. That is, suppose that $X$ has a distribution
$p_{X}(x)$ and $\hat{X}$ is some copy of it so that the distribution of the
joint random variable $X\hat{X}$\ is $p_{X}(x)\delta_{x,\hat{x}}$. Then the
marginal entropies $H(X)$ and $H(\hat{X})$ are equal. But observe that
the joint entropy $H(X\hat{X})$ is also equal to $H(X)$, and this is where the
analogy breaks down. That is, there is no good classical analog of the
notion of purification.
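Theorem~\ref{thm-ie:marginal-entropies-pure-state} can be checked numerically from a Schmidt decomposition (a Python/NumPy sketch; the coefficient-matrix representation and helper names are ours):

```python
import numpy as np

def vn_entropy(rho):
    """Base-2 von Neumann entropy -Tr(rho log rho), via eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]              # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

# |phi>_AB = sqrt(0.8)|00> + sqrt(0.2)|11>, written via its coefficient
# matrix C, with |phi> = sum_{ij} C[i,j] |i>_A |j>_B.
C = np.diag([np.sqrt(0.8), np.sqrt(0.2)])
phi = C.flatten()                       # the state vector in C^4
rho_AB = np.outer(phi, phi.conj())
rho_A = C @ C.conj().T                  # marginal on A
rho_B = C.T @ C.conj()                  # marginal on B

print(np.isclose(vn_entropy(rho_AB), 0.0))               # True: joint state is pure
print(np.isclose(vn_entropy(rho_A), vn_entropy(rho_B)))  # True: marginals agree
```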
\subsection{Additivity}
\begin{property}
[Additivity]Let $\rho_{A}\in\mathcal{D}(\mathcal{H}_{A})$ and $\sigma_{B}%
\in\mathcal{D}(\mathcal{H}_{B})$. The quantum entropy is additive
\index{von Neumann entropy!additivity}%
for tensor-product states:%
\begin{equation}
H(\rho_{A}\otimes\sigma_{B})=H(\rho_{A})+H(\sigma_{B}). \label{eq-qie:add-ent}%
\end{equation}
\end{property}
One can verify this property simply by diagonalizing both density operators
and resorting to the additivity of the joint Shannon entropies of the eigenvalues.
Additivity is an intuitive property that we would like to hold for any measure
of information. For example, suppose that Alice generates a large sequence
$\left\vert \psi_{x_{1}}\right\rangle \left\vert \psi_{x_{2}}\right\rangle
\cdots\left\vert \psi_{x_{n}}\right\rangle $\ of quantum states according to
the ensemble $\left\{ p_{X}(x),|\psi_{x}\rangle\right\} $. She may be aware
of the classical indices $x_{1}x_{2}\cdots x_{n}$, but a third party to whom
she sends the quantum sequence may not be aware of these values. The
description of the state to this third party is then $\rho\otimes\cdots
\otimes\rho$, where $\rho\equiv\mathbb{E}_{X}\left\{ |\psi_{X}\rangle
\langle\psi_{X}|\right\} $, and the quantum entropy of this $n$-fold tensor
product state is $H(\rho\otimes\cdots\otimes\rho)=nH(\rho)$, by applying
\eqref{eq-qie:add-ent} inductively.
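Additivity for a tensor-product state is easy to check numerically (a Python/NumPy sketch; helper names are ours):

```python
import numpy as np

def vn_entropy(rho):
    """Base-2 von Neumann entropy -Tr(rho log rho), via eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]              # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

rho_A = np.diag([0.5, 0.5])             # one bit of entropy
sigma_B = np.diag([0.9, 0.1])
joint = np.kron(rho_A, sigma_B)         # the tensor-product state

print(np.isclose(vn_entropy(joint),
                 vn_entropy(rho_A) + vn_entropy(sigma_B)))   # True
```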
\subsection{Joint Quantum Entropy of a Classical--Quantum State}
Recall that a
classical--quantum state is a bipartite state in which a classical system and
a quantum system are classically correlated. An example of such a state is as
follows:%
\begin{equation}
\rho_{XB}\equiv\sum_{x}p_{X}( x) \vert x\rangle\langle x\vert_{X}\otimes
\rho_{B}^{x}. \label{eq-qie:cq-state}%
\end{equation}
The joint quantum entropy of this state takes on a special form that appears
similar to entropies in the classical world.
\begin{theorem}
\label{thm-qie:joint-ent-cq-state}The joint entropy $H(XB)_{\rho}$ of a
classical--quantum state, as given in \eqref{eq-qie:cq-state}, is as follows:%
\begin{equation}
H(XB)_{\rho}=H(X)+\sum_{x}p_{X}(x)H(\rho_{B}^{x}),
\end{equation}
where $H(X)$ is the entropy of a random variable $X$ with distribution
$p_{X}(x)$.
\end{theorem}
\begin{proof}
Consider that%
\begin{equation}
H(XB)_{\rho}=-\operatorname{Tr}\left\{ \rho_{XB}\log\rho_{XB}\right\} .
\end{equation}
So we need to evaluate the operator $\log\rho_{XB}$, and we can find a
simplified form for it because $\rho_{XB}$ is a classical--quantum state:%
\begin{align}
\log\rho_{XB} & =\log\left[ \sum_{x}p_{X}(x)|x\rangle\langle x|_{X}%
\otimes\rho_{B}^{x}\right] \\
& =\log\left[ \sum_{x}|x\rangle\langle x|_{X}\otimes p_{X}(x)\rho_{B}%
^{x}\right] \\
& =\sum_{x}|x\rangle\langle x|_{X}\otimes\log\left[ p_{X}(x)\rho_{B}%
^{x}\right] .
\end{align}
Then%
\begin{align}
& -\operatorname{Tr}\left\{ \rho_{XB}\log\rho_{XB}\right\} \nonumber\\
& =-\operatorname{Tr}\left\{ \left[ \sum_{x}p_{X}(x)|x\rangle\langle
x|_{X}\otimes\rho_{B}^{x}\right] \left[ \sum_{x^{\prime}}|x^{\prime}%
\rangle\langle x^{\prime}|_{X}\otimes\log\left[ p_{X}(x^{\prime})\rho
_{B}^{x^{\prime}}\right] \right] \right\} \\
& =-\operatorname{Tr}\left\{ \sum_{x}p_{X}(x)|x\rangle\langle x|_{X}%
\otimes\left( \rho_{B}^{x}\log\left[ p_{X}(x)\rho_{B}^{x}\right] \right)
\right\} \\
& =-\sum_{x}p_{X}(x)\operatorname{Tr}\left\{ \rho_{B}^{x}\log\left[
p_{X}(x)\rho_{B}^{x}\right] \right\} . \label{eq-qie:ent-cq-state}%
\end{align}
Consider that%
\begin{equation}
\log\left[ p_{X}(x)\rho_{B}^{x}\right] =\log\left( p_{X}(x)\right)
I+\log\rho_{B}^{x},
\end{equation}
which implies that \eqref{eq-qie:ent-cq-state} is equal to%
\begin{align}
& -\sum_{x}p_{X}(x)\left[ \operatorname{Tr}\left\{ \rho_{B}^{x}\log\left[
p_{X}(x)\right] \right\} +\operatorname{Tr}\left\{ \rho_{B}^{x}\log\rho
_{B}^{x}\right\} \right] \\
& =-\sum_{x}p_{X}(x)\left[ \log\left[ p_{X}(x)\right] +\operatorname{Tr}%
\left\{ \rho_{B}^{x}\log\rho_{B}^{x}\right\} \right] .
\end{align}
This last line is then equivalent to the statement of the theorem.
\end{proof}
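Theorem~\ref{thm-qie:joint-ent-cq-state} can be verified numerically on a small example, since a classical--quantum state is block diagonal with blocks $p_{X}(x)\rho_{B}^{x}$ (a Python/NumPy sketch; helper names are ours):

```python
import numpy as np

def vn_entropy(rho):
    """Base-2 von Neumann entropy -Tr(rho log rho), via eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]              # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

def shannon_entropy(p):
    """Base-2 Shannon entropy of a probability vector p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

# Classical-quantum state on a two-valued X and a qubit B:
# rho_XB = sum_x p(x) |x><x| tensor rho_B^x, i.e. block diagonal.
p = np.array([0.25, 0.75])
rho_x = [np.diag([1.0, 0.0]), np.diag([0.5, 0.5])]
rho_XB = np.zeros((4, 4))
rho_XB[:2, :2] = p[0] * rho_x[0]
rho_XB[2:, 2:] = p[1] * rho_x[1]

lhs = vn_entropy(rho_XB)
rhs = shannon_entropy(p) + sum(px * vn_entropy(rx) for px, rx in zip(p, rho_x))
print(np.isclose(lhs, rhs))             # True: H(XB) = H(X) + sum_x p(x) H(rho_B^x)
```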
\section{Conditional Quantum Entropy}%
\index{von Neumann entropy!conditional}%
The definition of conditional quantum entropy that has been most useful in
quantum information theory is the following simple one, inspired by the
relation between joint entropy and marginal entropy in
the classical case.
\begin{definition}
[Conditional Quantum Entropy]\label{eq-ie:cond-quantum-entropy}Let $\rho
_{AB}\in\mathcal{D}(\mathcal{H}_{A}\otimes\mathcal{H}_{B})$. The conditional
quantum entropy $H(A|B)_{\rho}$ of $\rho_{AB}$ is equal to the difference of
the joint quantum entropy $H(AB)_{\rho}$ and the marginal entropy $H(B)_{\rho
}$:%
\begin{equation}
H(A|B)_{\rho}\equiv H(AB)_{\rho}-H(B)_{\rho}.
\end{equation}
\end{definition}
The above definition is the most natural one, both because it is
straightforward to compute for any bipartite state and because it obeys many
relations that the classical conditional entropy obeys (such as chaining rules
and conditioning reduces entropy). We explore many of these relations in the
forthcoming sections. For now, we state \textquotedblleft conditioning cannot
increase entropy\textquotedblright\ as the following theorem and tackle its
proof later on after developing a few more tools.
\begin{theorem}
[Conditioning Does Not Increase Entropy]\label{thm-qie:cond-reduce-ent}%
Consider a bipartite quantum state $\rho_{AB}$. Then the following inequality
applies to the marginal entropy $H( A) _{\rho}$ and the conditional quantum
entropy $H( A|B) _{\rho}$:%
\begin{equation}
H( A) _{\rho}\geq H( A|B) _{\rho}.
\end{equation}
We can interpret the above inequality as stating that conditioning cannot
increase entropy, even if the conditioning system is quantum.
\end{theorem}
\subsection{Conditional Quantum Entropy for Classical--Quantum States}
\label{sec-qie:cond-ent-cq}A classical--quantum state is an example of a state
for which conditional quantum entropy behaves as in the classical world.
Suppose that two parties share a classical--quantum state $\rho_{XB}$ of the
form in~\eqref{eq-qie:cq-state}. The system $X$ is classical and the system
$B$ is quantum, and the correlations between these systems are entirely
classical, determined by the probability distribution $p_{X}(x)$. Let us
calculate the conditional quantum entropy $H(B|X)_{\rho}$ for this state:%
\begin{align}
H(B|X)_{\rho} & =H(XB)_{\rho}-H(X)_{\rho}\\
& =H(X)_{\rho}+\sum_{x}p_{X}(x)H(\rho_{B}^{x})-H(X)_{\rho}\\
& =\sum_{x}p_{X}(x)H(\rho_{B}^{x}).
\end{align}
The first equality follows from Definition~\ref{eq-ie:cond-quantum-entropy}.
The second equality follows from Theorem~\ref{thm-qie:joint-ent-cq-state}, and
the final equality results from algebra.
The above form for conditional entropy is completely analogous to the
classical formula and holds whenever the
conditioning system is classical.
\subsection{Negative Conditional Quantum Entropy}
One of the properties of the conditional quantum entropy in
Definition~\ref{eq-ie:cond-quantum-entropy} that seems counterintuitive at
first sight is that it can be negative. This negativity holds for an ebit
$\left\vert \Phi^{+}\right\rangle _{AB}$ shared between Alice and Bob. The
marginal state on Bob's system is the maximally mixed state $\pi_{B}$. Thus,
the marginal entropy $H( B) $ is equal to one, but the joint entropy vanishes,
and so the conditional quantum entropy $H( A|B) =-1$.
What do we make of this result? Well, this is one of the fundamental
differences between the classical world and the quantum world, and perhaps is
the very essence of the departure from an informational standpoint. The
informational statement is that we can sometimes be more certain about the
joint state of a quantum system than we can be about any one of its individual
parts, and this is the reason that conditional quantum entropy can be
negative. This is in fact the same observation that Schr\"{o}dinger made
concerning entangled states:
\begin{quotation}
\textquotedblleft When two systems, of which we know the states by their
respective representatives, enter into temporary physical interaction due to
known forces between them, and when after a time of mutual influence the
systems separate again, then they can no longer be described in the same way
as before, viz.~by endowing each of them with a representative of its own. I
would not call that one but rather the characteristic trait of quantum
mechanics, the one that enforces its entire departure from classical lines of
thought. By the interaction the two representatives [the quantum states] have
become entangled. Another way of expressing the peculiar situation is: the
best possible knowledge of a whole does not necessarily include the best
possible knowledge of all its parts, even though they may be entirely separate
and therefore virtually capable of being `best possibly known,' i.e., of
possessing, each of them, a representative of its own. The lack of knowledge
is by no means due to the interaction being insufficiently known --- at least
not in the way that it could possibly be known more completely --- it is due
to the interaction itself.\textquotedblright
\end{quotation}
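The ebit calculation above can be verified directly (a Python/NumPy sketch; the \texttt{vn\_entropy} helper and the reshape-based partial trace are our own constructions):

```python
import numpy as np

def vn_entropy(rho):
    """Base-2 von Neumann entropy -Tr(rho log rho), via eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]              # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

# Ebit |Phi+> = (|00> + |11>)/sqrt(2) shared between qubits A and B
phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_AB = np.outer(phi, phi)

# Partial trace over A: reshape to indices (a, b, a', b') and trace a = a'
rho_B = rho_AB.reshape(2, 2, 2, 2).trace(axis1=0, axis2=2)

H_cond = vn_entropy(rho_AB) - vn_entropy(rho_B)   # H(A|B) = H(AB) - H(B)
print(H_cond)                            # -1.0
```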
\section{Coherent Information}
Negativity of the conditional quantum entropy is so important in quantum
information theory that we even have an information quantity and a special
notation to denote the negative of the conditional quantum entropy:
\begin{definition}
[Coherent Information]The
\index{coherent information}%
coherent information $I(A\rangle B)_{\rho}$\ of a bipartite state $\rho
_{AB}\in\mathcal{D}(\mathcal{H}_{A}\otimes\mathcal{H}_{B})$ is as follows:%
\begin{equation}
I(A\rangle B)_{\rho}\equiv H(B)_{\rho}-H(AB)_{\rho}.
\end{equation}
\end{definition}
You should immediately notice that this quantity is the negative of the
conditional quantum entropy in Definition~\ref{eq-ie:cond-quantum-entropy},
but it is perhaps more useful to think of the coherent information not merely
as the negative of the conditional quantum entropy, but as an information
quantity in its own right. This is why we employ a separate notation for it.
The \textquotedblleft$I$\textquotedblright\ is present because the coherent
information is an information quantity that measures quantum correlations,
much like the mutual information does in the classical case. For example, we
have already seen that the coherent information of an ebit is equal to one.
Thus, it is measuring the extent to which we know less about part of a system
than we do about its whole. Perhaps surprisingly, the coherent information
obeys a quantum data-processing inequality, which gives further support for
it having an \textquotedblleft$I$\textquotedblright\ present in its notation.
The Dirac symbol \textquotedblleft$\rangle$\textquotedblright\ is present to
indicate that this quantity is a quantum information quantity, having a good
meaning really only in the quantum world. The choice of \textquotedblleft%
$\rangle$\textquotedblright\ over \textquotedblleft$\langle$\textquotedblright%
\ also indicates a directionality from Alice to Bob, and this notation will
make more sense when we begin to discuss the coherent information of a quantum
channel.
\begin{exercise}
\label{ex-qie:alternate-coh-info}Let $\rho_{AB}\in\mathcal{D}(\mathcal{H}%
_{A}\otimes\mathcal{H}_{B})$. Consider a purification $|\psi\rangle_{ABE}$ of
this state to some environment system $E$. Show that%
\begin{equation}
I(A\rangle B)_{\rho}=H(B)_{\psi}-H(E)_{\psi}.
\end{equation}
Thus, there is a sense in which the coherent information measures the
difference in the uncertainty of Bob and the uncertainty of the environment.
\end{exercise}
\begin{exercise}
[Duality of Conditional Entropy]\label{ex-qie:other-coh-info}Show that
$-H(A|B)_{\rho}=I(A\rangle B)_{\rho}=H(A|E)_{\psi}$ for the purification in
the above exercise.
\end{exercise}
The coherent information can be negative or positive, depending on the
bipartite state for which we evaluate it, but it cannot be arbitrarily large
or arbitrarily small. The following theorem places a useful bound on its
absolute value.
\begin{theorem}
\label{thm-qie:bound-cond-ent}Let $\rho_{AB}\in\mathcal{D}(\mathcal{H}%
_{A}\otimes\mathcal{H}_{B})$. The following bound applies to the absolute
value of the conditional entropy $H(A|B)_{\rho}$:%
\begin{equation}
\left\vert H(A|B)_{\rho}\right\vert \leq\log\dim(\mathcal{H}_{A}).
\end{equation}
The bounds are saturated for $\rho_{AB}=\pi_{A}\otimes\sigma_{B}$, where
$\pi_{A}$ is the maximally mixed state and $\sigma_{B}\in\mathcal{D}%
(\mathcal{H}_{B})$, and for $\rho_{AB}=\Phi_{AB}$ (the maximally entangled state).
\end{theorem}
\begin{proof}
We first prove the inequality $H(A|B)_{\rho}\leq\log\dim(\mathcal{H}_{A})$ in
two steps:%
\begin{align}
H(A|B)_{\rho} & \leq H(A)_{\rho}\\
& \leq\log\dim(\mathcal{H}_{A}).
\end{align}
The first inequality follows because conditioning reduces entropy
(Theorem~\ref{thm-qie:cond-reduce-ent}), and the second inequality follows
because the maximum value of the entropy $H(A)_{\rho}$ is $\log\dim
(\mathcal{H}_{A})$. We now prove the inequality $H(A|B)_{\rho}\geq-\log
\dim(\mathcal{H}_{A})$. Consider a purification $|\psi\rangle_{ABE}$\ of the
state $\rho_{AB}$. We then have that%
\begin{align}
H(A|B)_{\rho} & =-H(A|E)_{\psi}\\
& \geq-H(A)_{\rho}\\
& \geq-\log\dim(\mathcal{H}_{A}).
\end{align}
The first equality follows from Exercise~\ref{ex-qie:other-coh-info}. The
two inequalities follow for the same reasons as those
in the previous paragraph.
\end{proof}
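Both saturating examples from Theorem~\ref{thm-qie:bound-cond-ent} can be checked numerically for qubits (a Python/NumPy sketch; \texttt{cond\_entropy} is a hypothetical helper for two-qubit states):

```python
import numpy as np

def vn_entropy(rho):
    """Base-2 von Neumann entropy -Tr(rho log rho), via eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]              # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

def cond_entropy(rho_AB):
    """H(A|B) = H(AB) - H(B) for a two-qubit state rho_AB."""
    rho_B = rho_AB.reshape(2, 2, 2, 2).trace(axis1=0, axis2=2)
    return vn_entropy(rho_AB) - vn_entropy(rho_B)

# Upper bound: pi_A tensor sigma_B gives H(A|B) = log2(2) = 1
pi_A = np.eye(2) / 2
sigma_B = np.diag([0.7, 0.3])
print(np.isclose(cond_entropy(np.kron(pi_A, sigma_B)), 1.0))   # True

# Lower bound: the maximally entangled state gives H(A|B) = -1
phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
print(np.isclose(cond_entropy(np.outer(phi, phi)), -1.0))      # True
```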
\begin{exercise}
[Conditional Coherent Information]Consider a tripartite state $\rho_{ABC}$.
Show that%
\begin{equation}
I(A\rangle BC)_{\rho}=I(A\rangle B|C)_{\rho},
\end{equation}
where $I(A\rangle B|C)_{\rho}\equiv H(B|C)_{\rho}-H(AB|C)_{\rho}$ is the
conditional coherent information%
\index{coherent information!conditional}%
.
\end{exercise}
\begin{exercise}
[Conditional Coherent Information of a Classical--Quantum State]%
\label{ex-qie:cond-coh-info}Suppose we have a classical--quantum state
$\sigma_{XAB}$ where%
\begin{equation}
\sigma_{XAB}=\sum_{x}p_{X}(x)|x\rangle\langle x|_{X}\otimes\sigma_{AB}^{x},
\label{eq-qie:cqq-state}%
\end{equation}
$p_{X}$ is a probability distribution on a finite alphabet $\mathcal{X}$ and
$\sigma_{AB}^{x}\in\mathcal{D}(\mathcal{H}_{A}\otimes\mathcal{H}_{B})$ for all
$x\in\mathcal{X}$. Show that
\begin{equation}
I(A\rangle BX)_{\sigma}=\sum_{x}p_{X}(x)I(A\rangle B)_{\sigma^{x}}.
\end{equation}
\end{exercise}
\section{Quantum Mutual Information}
The standard informational measure of correlations in the classical world is
the mutual information, and such a quantity plays a prominent role in
measuring classical and quantum correlations in the quantum world as well.
\begin{definition}
[Quantum Mutual Information]The quantum mutual information%
\index{quantum mutual information}
of a bipartite state $\rho_{AB}\in\mathcal{D}(\mathcal{H}_{A}\otimes
\mathcal{H}_{B})$ is defined as follows:%
\begin{equation}
I(A;B)_{\rho}\equiv H(A)_{\rho}+H(B)_{\rho}-H(AB)_{\rho}.
\end{equation}
\end{definition}
The following relations hold for quantum mutual information, in analogy with
the classical case:%
\begin{align}
I(A;B)_{\rho} & =H(A)_{\rho}-H(A|B)_{\rho}\label{eq-ie:expand-quantum-MI}\\
& =H(B)_{\rho}-H(B|A)_{\rho}.
\end{align}
These immediately lead to the following relations between quantum mutual
information and the coherent information:%
\begin{align}
I(A;B)_{\rho} & =H(A)_{\rho}+I(A\rangle B)_{\rho}\\
& =H(B)_{\rho}+I(B\rangle A)_{\rho}.
\end{align}
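As claimed in the overview, a maximally entangled state on two qubits registers two bits of quantum mutual information, twice the classical maximum. A quick numerical check (a Python/NumPy sketch; helper names and the reshape-based partial traces are ours):

```python
import numpy as np

def vn_entropy(rho):
    """Base-2 von Neumann entropy -Tr(rho log rho), via eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]              # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)   # ebit |Phi+>
rho_AB = np.outer(phi, phi)
T = rho_AB.reshape(2, 2, 2, 2)          # indices (a, b, a', b')
rho_A = T.trace(axis1=1, axis2=3)       # trace out B
rho_B = T.trace(axis1=0, axis2=2)       # trace out A

I_AB = vn_entropy(rho_A) + vn_entropy(rho_B) - vn_entropy(rho_AB)
print(I_AB)                              # 2.0: twice the classical maximum
```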
The theorem below gives a fundamental lower bound on the quantum mutual
information---we merely state it for now and give a full proof later.
\begin{theorem}
[Non-Negativity of Quantum Mutual Information]\label{thm-ie:QMI-positive}
\index{quantum mutual information!positivity}%
The quantum mutual information $I( A;B) _{\rho}$ of any bipartite quantum
state $\rho_{AB}$ is non-negative:%
\begin{equation}
I( A;B) _{\rho}\geq0.
\end{equation}
\end{theorem}
\begin{exercise}
[Conditioning Does Not Increase Entropy]\label{ex-qie:cond-red-ent}Show that
non-negativity of quantum mutual information implies that conditioning does
not increase entropy (Theorem~\ref{thm-qie:cond-reduce-ent}).
\end{exercise}
\begin{exercise}
[Bound on Quantum Mutual Information]\label{ex-qie:dim-bound-MI}Let $\rho
_{AB}\in\mathcal{D}(\mathcal{H}_{A}\otimes\mathcal{H}_{B})$. Prove that the
following bound applies to the quantum mutual information:%
\begin{equation}
I(A;B)_{\rho}\leq2\log\left[ \min\left\{ \dim(\mathcal{H}_{A}),\dim
(\mathcal{H}_{B})\right\} \right] .
\end{equation}
What is an example of a state that saturates the bound?
\end{exercise}
\section{Conditional Quantum Mutual Information}
\label{sec-qie:cond-MI}We define the conditional quantum mutual information
$I(A;B|C)_{\rho}$ of any tripartite state $\rho_{ABC}\in\mathcal{D}%
(\mathcal{H}_{A}\otimes\mathcal{H}_{B}\otimes\mathcal{H}_{C})$ in analogy
with the classical definition:%
\begin{equation}
I(A;B|C)_{\rho}\equiv H(A|C)_{\rho}+H(B|C)_{\rho}-H(AB|C)_{\rho}.
\end{equation}
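In analogy with \eqref{eq-ie:expand-quantum-MI}, the conditional quantum
mutual information can be rewritten as a difference of conditional entropies:%
\begin{align}
I(A;B|C)_{\rho} & =H(A|C)_{\rho}-H(A|BC)_{\rho}\\
& =H(B|C)_{\rho}-H(B|AC)_{\rho},
\end{align}
where the first equality follows because $H(AB|C)_{\rho}-H(B|C)_{\rho}
=H(ABC)_{\rho}-H(BC)_{\rho}=H(A|BC)_{\rho}$.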
In what follows, we sometimes abbreviate \textquotedblleft conditional quantum
mutual information\textquotedblright\ as CQMI.
One can exploit the above definition and the definition of quantum mutual
information to prove a chain rule for quantum mutual information.
\begin{property}
[Chain Rule for Quantum Mutual Information]\label{prop-qie:chain-CMI}The
quantum mutual information
\index{quantum mutual information!chain rule}
obeys a chain rule:%
\begin{equation}
I(A;BC)_{\rho}=I(A;B)_{\rho}+I(A;C|B)_{\rho}.
\end{equation}
The interpretation of the chain rule is that we can build up the correlations
between $A$ and $BC$ in two steps: in a first step, we build up the
correlations between $A$ and $B$, and now that $B$ is available (and thus
conditioned on), we build up the correlations between $A$ and $C$.
\end{property}
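Since the chain rule is an identity among entropies, it can be verified numerically for a randomly chosen state. The following sketch (illustrative helper names \texttt{entropy} and \texttt{ptrace} are our own) checks it on a random three-qubit density matrix:

```python
import numpy as np

def entropy(rho):
    """von Neumann entropy in bits: -sum_i lam_i log2(lam_i) over nonzero eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log2(lam)))

def ptrace(rho, dims, keep):
    """Partial trace of a multipartite density matrix, keeping the subsystems in `keep`."""
    n = len(dims)
    r = rho.reshape(dims + dims)  # axes: ket indices 0..n-1, bra indices n..2n-1
    for q in sorted(set(range(n)) - set(keep), reverse=True):
        r = np.trace(r, axis1=q, axis2=q + r.ndim // 2)
    d = int(np.prod([dims[q] for q in keep]))
    return r.reshape(d, d)

# Random full-rank three-qubit density matrix rho_ABC (systems A=0, B=1, C=2)
rng = np.random.default_rng(7)
g = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
rho = g @ g.conj().T
rho /= np.trace(rho)

dims = (2, 2, 2)
H = lambda keep: entropy(ptrace(rho, dims, keep))

I_A_BC = H([0]) + H([1, 2]) - H([0, 1, 2])                     # I(A;BC)
I_A_B = H([0]) + H([1]) - H([0, 1])                            # I(A;B)
I_A_C_given_B = H([0, 1]) + H([1, 2]) - H([0, 1, 2]) - H([1])  # I(A;C|B)
# Chain rule: I(A;BC) = I(A;B) + I(A;C|B), up to floating-point error
```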
\begin{exercise}
\label{ex-qie:chain-rule-mut-info}Use the chain rule for quantum mutual
information to prove that%
\begin{equation}
I(A;BC)_{\rho}=I(AC;B)_{\rho}+I(A;C)_{\rho}-I(B;C)_{\rho}.
\end{equation}
\end{exercise}
\subsection{Non-negativity of CQMI}
In the classical world, non-negativity of conditional mutual information
follows trivially from non-negativity of mutual information. The proof of non-negativity of conditional
quantum mutual information is far from trivial in the quantum world, unless
the conditioning system is classical (see Exercise~\ref{ex-qie:trivial-SSA}).
Non-negativity of this quantity is a foundational result, because so much of
quantum information theory rests upon this theorem's shoulders (indeed, we
could say that this inequality is one of the \textquotedblleft
bedrocks\textquotedblright\ of quantum information theory). The list of its
corollaries includes the quantum data-processing inequality, the answers to
some additivity questions in quantum Shannon theory, the Holevo bound, and
others. The proof of Theorem~\ref{thm-ie:ssa-quantum}\ follows directly from
monotonicity of quantum relative entropy (Theorem~\ref{thm-qie:mono-rel-ent}),
which we prove later. In fact, it is possible to
show that monotonicity of quantum relative entropy follows from strong
subadditivity as well, so that these two entropy inequalities are essentially
equivalent statements.
\begin{theorem}
[Non-Negativity of CQMI]\label{thm-ie:ssa-quantum}Let $\rho_{ABC}%
\in\mathcal{D}(\mathcal{H}_{A}\otimes\mathcal{H}_{B}\otimes\mathcal{H}_{C})$.
Then the conditional quantum mutual information
\index{quantum mutual information!conditional!positivity}
is non-negative:%
\begin{equation}
I(A;B|C)_{\rho}\geq0.
\end{equation}
This condition is equivalent to the strong subadditivity%
\index{strong subadditivity}
inequality, so we also refer to this
entropy inequality as strong subadditivity.
\end{theorem}
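To see the equivalence, expand each conditional entropy in the definition of
$I(A;B|C)_{\rho}$ using $H(A|C)_{\rho}=H(AC)_{\rho}-H(C)_{\rho}$. The
inequality $I(A;B|C)_{\rho}\geq0$ is then the same statement as%
\begin{equation}
H(AC)_{\rho}+H(BC)_{\rho}\geq H(ABC)_{\rho}+H(C)_{\rho},
\end{equation}
which is the form in which strong subadditivity is traditionally stated.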
\begin{exercise}
[CQMI of Classical--Quantum States]\label{ex-qie:trivial-SSA}Consider a
classical--quantum state $\sigma_{XAB}$\ of the form in
\eqref{eq-qie:cqq-state}. Prove the following relation:%
\begin{equation}
I(A;B|X)_{\sigma}=\sum_{x}p_{X}(x)I(A;B)_{\sigma^{x}}.
\end{equation}
Conclude that non-negativity of conditional quantum mutual information is
trivial in this special case in which the conditioning system is classical,
simply by exploiting non-negativity of quantum mutual information
(Theorem~\ref{thm-ie:QMI-positive}).
\end{exercise}
\begin{exercise}
[Conditioning Does Not Increase Entropy]\label{ex-qie:SSA->cond-red-ent}Let
$\rho_{ABC}\in\mathcal{D}(\mathcal{H}_{A}\otimes\mathcal{H}_{B}\otimes
\mathcal{H}_{C})$. Show that Theorem~\ref{thm-ie:ssa-quantum}\ is equivalent
to the following stronger form of Theorem~\ref{thm-qie:cond-reduce-ent}:%
\begin{equation}
H(B|C)_{\rho}\geq H(B|AC)_{\rho}.
\end{equation}
\end{exercise}
%\bibliography{mybib}
\bibliographystyle{alpha}
\end{document}