Norms and Dual Norms as Supremums and Infimums

Let \(\mathcal{H}\) be a finite-dimensional Hilbert space over \(\mathbb{R}\) or \(\mathbb{C}\) (the fields of real and complex numbers, respectively). If we let \(\|\cdot\|\) be a norm on \(\mathcal{H}\) (not necessarily the norm induced by the inner product), then the dual norm of \(\|\cdot\|\) is defined by

\(\displaystyle\|\mathbf{v}\|^\circ := \sup_{\mathbf{w} \in \mathcal{H}}\Big\{ \big| \langle \mathbf{v}, \mathbf{w} \rangle \big| : \|\mathbf{w}\| \leq 1 \Big\}.\)

The double-dual of a norm is equal to itself (i.e., \(\|\cdot\|^{\circ\circ} = \|\cdot\|\)) and the norm induced by the inner product is the unique norm that is its own dual. More generally, if \(\|\cdot\|_p\) is the vector p-norm with \(1 \leq p \leq \infty\), then \(\|\cdot\|_p^\circ = \|\cdot\|_q\), where \(q\) satisfies \(1/p + 1/q = 1\) (with the convention \(1/\infty = 0\)).
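The p-norm duality can be checked numerically. The following is a minimal sketch, assuming a real vector and \(1 < p < \infty\); the helper names `p_norm` and `dual_maximizer` are made up for illustration. It builds the standard Hölder equality-case vector \(\mathbf{w}\) with \(\|\mathbf{w}\|_p = 1\) and checks that \(\langle \mathbf{v}, \mathbf{w} \rangle = \|\mathbf{v}\|_q\), so the supremum defining \(\|\mathbf{v}\|_p^\circ\) is at least \(\|\mathbf{v}\|_q\) (Hölder's inequality gives the reverse bound).

```python
def p_norm(v, p):
    """Vector p-norm for 1 <= p < infinity."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def dual_maximizer(v, p):
    """Return w with ||w||_p = 1 and <v, w> = ||v||_q, where 1/p + 1/q = 1.

    Assumes 1 < p < infinity and that v is a nonzero real vector.
    """
    q = p / (p - 1.0)
    norm_q = p_norm(v, q)
    # w_i = sign(v_i) * |v_i|^(q-1) / ||v||_q^(q-1)
    return [(1.0 if x >= 0 else -1.0) * abs(x) ** (q - 1) / norm_q ** (q - 1)
            for x in v]


v = [3.0, -4.0, 1.0]
p = 3.0
q = p / (p - 1.0)  # conjugate exponent, here 1.5
w = dual_maximizer(v, p)
inner = sum(x * y for x, y in zip(v, w))

print(p_norm(w, p))          # should be 1 up to rounding
print(inner, p_norm(v, q))   # should agree up to rounding
```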

In this post, we show that \(\|\cdot\|^\circ\) has an equivalent characterization as an infimum, and we use this characterization to give simple derivations of several known (but perhaps not well-known) formulas for norms such as the operator norm of matrices.

For certain norms (such as the “separability norms” presented at the end of this post), computing the norm exactly may be difficult. Being able to write the norm as both an infimum and a supremum is then useful, because it lets us bound the norm from both sides: any feasible point of the infimum gives an upper bound, and any feasible point of the supremum gives a lower bound.

The Dual Norm as an Infimum

Theorem 1. Let \(S \subseteq \mathcal{H}\) be a bounded set satisfying \({\rm span}(S) = \mathcal{H}\) and define a norm \(\|\cdot\|\) by

\(\displaystyle\|\mathbf{v}\| := \sup_{\mathbf{w} \in S}\Big\{ \big| \langle \mathbf{v}, \mathbf{w} \rangle \big| \Big\}.\)

Then \(\|\cdot\|^\circ\) is given by

\(\displaystyle\|\mathbf{v}\|^\circ = \inf\Big\{ \sum_i |c_i| : \mathbf{v} = \sum_i c_i \mathbf{v}_i, \mathbf{v}_i \in S \ \forall \, i \Big\},\)

where the infimum is taken over all such decompositions of \(\mathbf{v}\).

Before proving the result, we make two observations. Firstly, the quantity \(\|\cdot\|\) described by Theorem 1 really is a norm: boundedness of \(S\) ensures that the supremum is finite, and \({\rm span}(S) = \mathcal{H}\) ensures that \(\|\mathbf{v}\| = 0 \implies \mathbf{v} = 0\). Secondly, every norm on \(\mathcal{H}\) can be written in this way: we can always choose \(S\) to be the unit ball of the dual norm \(\|\cdot\|^\circ\). However, there are times when other choices of \(S\) are more useful or enlightening (as we will see in the examples).

Proof of Theorem 1. Begin by noting that if \(\mathbf{w} \in S\) and \(\|\mathbf{v}\| \leq 1\) then \(\big| \langle \mathbf{v}, \mathbf{w} \rangle \big| \leq 1\). It follows that \(\|\mathbf{w}\|^{\circ} \leq 1\) whenever \(\mathbf{w} \in S\). In fact, we now show that \(\|\cdot\|^\circ\) is the largest norm on \(\mathcal{H}\) with this property. To this end, let \(\|\cdot\|_\prime\) be another norm satisfying \(\|\mathbf{w}\|_{\prime}^{\circ} \leq 1\) whenever \(\mathbf{w} \in S\). Then

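\(\displaystyle \| \mathbf{v} \| = \sup_{\mathbf{w} \in S} \Big\{ \big| \langle \mathbf{w}, \mathbf{v} \rangle \big| \Big\} \leq \sup_{\mathbf{w}} \Big\{ \big| \langle \mathbf{w}, \mathbf{v} \rangle \big| : \|\mathbf{w}\|_{\prime}^{\circ} \leq 1 \Big\} = \|\mathbf{v}\|_\prime.\)

Here the final equality uses \(\|\cdot\|_\prime^{\circ\circ} = \|\cdot\|_\prime\). Since taking duals reverses inequalities between norms, it follows that \(\|\cdot\|_\prime^\circ \leq \|\cdot\|^\circ\), as claimed.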

For the remainder of the proof, we denote the infimum in the statement of the theorem by \(\|\cdot\|_{{\rm inf}}\). Our goal now is to show that: (1) \(\|\cdot\|_{{\rm inf}}\) is a norm, (2) \(\|\cdot\|_{{\rm inf}}\) satisfies \(\|\mathbf{w}\|_{{\rm inf}} \leq 1\) whenever \(\mathbf{w} \in S\), and (3) \(\|\cdot\|_{{\rm inf}}\) is the largest norm satisfying property (2). The fact that \(\|\cdot\|_{{\rm inf}} = \|\cdot\|^\circ\) will then follow from the first paragraph of this proof.

To see (1) (i.e., to prove that \(\|\cdot\|_{{\rm inf}}\) is a norm), we only prove the triangle inequality, since positive homogeneity and the fact that \(\|\mathbf{v}\|_{{\rm inf}} = 0\) if and only if \(\mathbf{v} = 0\) are both straightforward (try them yourself!). Fix \(\varepsilon > 0\) and let \(\mathbf{v} = \sum_i c_i \mathbf{v}_i\), \(\mathbf{w} = \sum_i d_i \mathbf{w}_i\) be decompositions of \(\mathbf{v}, \mathbf{w}\) with \(\mathbf{v}_i, \mathbf{w}_i \in S\) for all i, satisfying \(\sum_i |c_i| \leq \|\mathbf{v}\|_{{\rm inf}} + \varepsilon\) and \(\sum_i |d_i| \leq \|\mathbf{w}\|_{{\rm inf}} + \varepsilon\). Then

\(\displaystyle \|\mathbf{v} + \mathbf{w}\|_{{\rm inf}} \leq \sum_i |c_i| + \sum_i |d_i| \leq \|\mathbf{v}\|_{{\rm inf}} + \|\mathbf{w}\|_{{\rm inf}} + 2\varepsilon.\)

Since \(\varepsilon > 0\) was arbitrary, the triangle inequality follows, so \(\|\cdot\|_{{\rm inf}}\) is a norm.

To see (2) (i.e., to prove that \(\|\mathbf{v}\|_{{\rm inf}} \leq 1\) whenever \(\mathbf{v} \in S\)), we simply write \(\mathbf{v}\) in its trivial decomposition \(\mathbf{v} = \mathbf{v}\), which gives the single coefficient \(c_1 = 1\), so \(\|\mathbf{v}\|_{{\rm inf}} \leq \sum_i |c_i| = |c_1| = 1\).

To see (3) (i.e., to prove that \(\|\cdot\|_{{\rm inf}}\) is the largest norm on \(\mathcal{H}\) satisfying condition (2)), begin by letting \(\|\cdot\|_\prime\) be any norm on \(\mathcal{H}\) with the property that \(\|\mathbf{v}\|_{\prime} \leq 1\) for all \(\mathbf{v} \in S\). Then using the triangle inequality for \(\|\cdot\|_\prime\) shows that if \(\mathbf{v} = \sum_i c_i \mathbf{v}_i\) is any decomposition of \(\mathbf{v}\) with \(\mathbf{v}_i \in S\) for all i, then

\(\displaystyle\|\mathbf{v}\|_\prime = \Big\|\sum_i c_i \mathbf{v}_i\Big\|_\prime \leq \sum_i |c_i| \|\mathbf{v}_i\|_\prime \leq \sum_i |c_i|.\)

Taking the infimum over all such decompositions of \(\mathbf{v}\) shows that \(\|\mathbf{v}\|_\prime \leq \|\mathbf{v}\|_{{\rm inf}}\), which completes the proof.
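The two characterizations can be used together to pin down a dual norm: any decomposition of \(\mathbf{v}\) gives an upper bound on \(\|\mathbf{v}\|^\circ\), and any \(\mathbf{w}\) with \(\|\mathbf{w}\| \leq 1\) gives a lower bound. The following is a minimal sketch, assuming a small hypothetical set \(S \subset \mathbb{R}^2\) chosen purely for illustration (it is not taken from any of the examples in this post).

```python
# Hypothetical spanning set S in R^2 (an assumption made for illustration).
S = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def inner(x, y):
    return sum(a * b for a, b in zip(x, y))


def sup_norm(v):
    """The norm of Theorem 1: the largest |<v, w>| over w in S."""
    return max(abs(inner(v, w)) for w in S)


def decomposition_cost(v, coeffs):
    """Check v = sum_i c_i * S[i]; return sum_i |c_i| (upper bound on the dual norm)."""
    recon = [sum(c * s[k] for c, s in zip(coeffs, S)) for k in range(len(v))]
    assert all(abs(r - x) < 1e-12 for r, x in zip(recon, v)), "not a decomposition"
    return sum(abs(c) for c in coeffs)


def witness_value(v, w):
    """Check ||w|| <= 1; return |<v, w>| (lower bound on the dual norm)."""
    assert sup_norm(w) <= 1.0 + 1e-12, "w is outside the unit ball"
    return abs(inner(v, w))


v = (2.0, 3.0)
upper = decomposition_cost(v, [0.0, 1.0, 2.0])  # v = 1*(0,1) + 2*(1,1)
lower = witness_value(v, (0.0, 1.0))            # ||(0,1)|| = max(0, 1, 1) = 1
print(lower, upper)
```

Since the two bounds coincide here, both the decomposition and the witness are optimal, and the dual norm of this particular \(\mathbf{v}\) equals their common value.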

The remainder of this post is devoted to investigating what Theorem 1 says about certain specific norms.

Injective and Projective Cross Norms

If we let \(\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2\), where \(\mathcal{H}_1\) and \(\mathcal{H}_2\) are themselves finite-dimensional Hilbert spaces, then one often considers the injective and projective cross norms on \(\mathcal{H}\), defined respectively as follows:

\(\displaystyle \|\mathbf{v}\|_{I} := \sup\Big\{ \big| \langle \mathbf{v}, \mathbf{a} \otimes \mathbf{b} \rangle \big| : \|\mathbf{a}\| = \|\mathbf{b}\| = 1 \Big\} \text{ and}\)

\(\displaystyle \|\mathbf{v}\|_{P} := \inf\Big\{ \sum_i \| \mathbf{a}_i \| \| \mathbf{b}_i \| : \mathbf{v} = \sum_i \mathbf{a}_i \otimes \mathbf{b}_i \Big\},\)

where \(\|\cdot\|\) here refers to the norm induced by the inner product on \(\mathcal{H}_1\) or \(\mathcal{H}_2\). The fact that \(\|\cdot\|_{I}\) and \(\|\cdot\|_{P}\) are duals of each other is simply Theorem 1 in the case when \(S\) is the set of product vectors:

\(\displaystyle S = \big\{ \mathbf{a} \otimes \mathbf{b} : \|\mathbf{a}\| = \|\mathbf{b}\| = 1 \big\}.\)

In fact, the typical proof that the injective and projective cross norms are duals of each other is very similar to the proof of Theorem 1 provided above (see [1, Chapter 1]).
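For a concrete feel for these two norms, one can identify \(\mathbf{v} \in \mathbb{R}^2 \otimes \mathbb{R}^2\) with its \(2 \times 2\) coefficient matrix \(M\), so that \(\langle \mathbf{v}, \mathbf{a} \otimes \mathbf{b} \rangle = \mathbf{a}^T M \mathbf{b}\). Under this identification the injective norm is the largest singular value of \(M\) and the projective norm is the sum of its singular values. The sketch below assumes the real \(2 \times 2\) case only, and the function names are made up for illustration. It computes the singular values in closed form and compares the largest one against a grid search over unit vectors.

```python
import math


def singular_values_2x2(M):
    """Singular values s1 >= s2 of a real 2x2 matrix, using
    s1^2 + s2^2 = ||M||_F^2 and s1 * s2 = |det M|."""
    (m11, m12), (m21, m22) = M
    frob_sq = m11 ** 2 + m12 ** 2 + m21 ** 2 + m22 ** 2
    det = abs(m11 * m22 - m12 * m21)
    s_sum = math.sqrt(frob_sq + 2.0 * det)             # s1 + s2
    s_diff = math.sqrt(max(frob_sq - 2.0 * det, 0.0))  # s1 - s2
    return (s_sum + s_diff) / 2.0, (s_sum - s_diff) / 2.0


def injective_lower_bound(M, steps=720):
    """Grid-search lower bound on sup |a^T M b| over real unit vectors a, b."""
    (m11, m12), (m21, m22) = M
    best = 0.0
    for i in range(steps):
        t = math.pi * i / steps  # a half circle suffices because of the absolute value
        x, y = math.cos(t), math.sin(t)
        # For fixed a = (x, y), the best unit b is parallel to M^T a,
        # which gives the value ||M^T a||.
        best = max(best, math.hypot(m11 * x + m21 * y, m12 * x + m22 * y))
    return best


M = ((3.0, 0.0), (4.0, 5.0))
s1, s2 = singular_values_2x2(M)
injective = s1        # largest singular value
projective = s1 + s2  # sum of singular values
grid = injective_lower_bound(M)
print(injective, grid, projective)
```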

Maximum and Taxicab Norms

Use \(n\) to denote the dimension of \(\mathcal{H}\) and let \(\{\mathbf{e}_i\}_{i=1}^n\) be an orthonormal basis of \(\mathcal{H}\). If we let \(S = \{\mathbf{e}_i\}_{i=1}^n\) then the norm \(\|\cdot\|\) in the statement of Theorem 1 is the maximum norm (i.e., the p = ∞ norm):

\(\displaystyle\|\mathbf{v}\|_\infty = \sup_i\Big\{\big|\langle \mathbf{v}, \mathbf{e}_i \rangle \big| \Big\} = \max \big\{ |v_1|,\ldots,|v_n|\big\},\)

where \(v_i = \langle \mathbf{v}, \mathbf{e}_i \rangle\) is the i-th coordinate of \(\mathbf{v}\) in the basis \(\{\mathbf{e}_i\}_{i=1}^n\). The theorem then says that the dual of the maximum norm is

\(\displaystyle \|\mathbf{v}\|_\infty^\circ = \inf \Big\{ \sum_i |c_i| : \mathbf{v} = \sum_i c_i \mathbf{e}_i \Big\} = \sum_{i=1}^n |v_i|,\)

which is the taxicab norm (i.e., the p = 1 norm), as we expect.
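This duality is easy to confirm by brute force in low dimensions. The following is a minimal sketch, assuming a real vector given by its coordinates in the basis \(\{\mathbf{e}_i\}\); the function names are made up for illustration. It evaluates the supremum form of the dual norm over the vertices of the max-norm unit ball and compares it with the taxicab norm.

```python
from itertools import product


def max_norm(v):
    return max(abs(x) for x in v)


def taxicab_norm(v):
    return sum(abs(x) for x in v)


def dual_of_max_norm(v):
    """sup of |<v, w>| over the max-norm unit ball, for real v.

    The supremum of a convex function over the cube [-1, 1]^n is attained
    at a vertex, so it suffices to try every sign vector w.
    """
    return max(abs(sum(x * s for x, s in zip(v, signs)))
               for signs in product((-1.0, 1.0), repeat=len(v)))


v = [2.0, -5.0, 0.5]
print(dual_of_max_norm(v), taxicab_norm(v))
```

The search visits \(2^n\) sign vectors, so it is only meant as a sanity check for small \(n\).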

Operator and Trace Norm of Matrices

If we let \(\mathcal{H} = M_n\), the space of \(n \times n\) complex matrices with the Hilbert–Schmidt inner product

\(\displaystyle \big\langle A, B \big\rangle := {\rm Tr}(AB^*),\)

then it is well-known that the operator norm and the trace norm are dual to each other:

\(\displaystyle \big\| A \big\|_{op} := \sup_{\mathbf{v}}\Big\{ \big\|A\mathbf{v}\big\| : \|\mathbf{v}\| = 1 \Big\} \text{ and}\)

\(\displaystyle \big\| A \big\|_{op}^\circ = \big\|A\big\|_{tr} := \sup_{U}\Big\{ \big| {\rm Tr}(AU) \big| : U \in M_n \text{ is unitary} \Big\},\)

where \(\|\cdot\|\) is the Euclidean norm on \(\mathbb{C}^n\). If we let \(S\) be the set of unitary matrices in \(M_n\), then Theorem 1 provides the following alternate characterization of the operator norm:

Corollary 1. Let \(A \in M_n\). Then

\(\displaystyle \big\|A\big\|_{op} = \inf\Big\{ \sum_i |c_i| : A = \sum_i c_i U_i \text{ and each } U_i \text{ is unitary} \Big\}.\)
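One way to see that the infimum in Corollary 1 is actually attained is sketched below, under the assumption that \(A \neq 0\) has a singular value decomposition \(A = W \Sigma V^*\). The symbols \(W\), \(V\), \(\Sigma\), \(s\) and \(D_\pm\) are introduced here for illustration and do not appear elsewhere in the post.

```latex
% Assumed notation (requires amsmath): A = W \Sigma V^* is a singular value
% decomposition, \Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n), and
% s := \|A\|_{op} = \max_k \sigma_k > 0.
\begin{align*}
D_{\pm} &:= \mathrm{diag}\Big( \tfrac{\sigma_k}{s} \pm i\sqrt{1 - \tfrac{\sigma_k^2}{s^2}} \Big)_{k=1}^n
  && \text{(diagonal with unimodular entries, hence unitary)} \\
\Sigma &= \tfrac{s}{2}\, D_{+} + \tfrac{s}{2}\, D_{-} \\
A &= \tfrac{s}{2}\, W D_{+} V^* + \tfrac{s}{2}\, W D_{-} V^*
  && \text{(a decomposition with } \textstyle\sum_i |c_i| = s = \|A\|_{op}\text{)}
\end{align*}
```

Combined with the bound \(\|A\|_{op} \leq \sum_i |c_i| \|U_i\|_{op} = \sum_i |c_i|\), which holds for every decomposition by the triangle inequality, this suggests that two unitaries always suffice to attain the infimum.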

As an application of Corollary 1, we are able to provide the following characterization of unitarily-invariant norms (i.e., norms \(\|\cdot\|_{\prime}\) with the property that \(\big\|UAV\big\|_{\prime} = \big\|A\big\|_{\prime}\) for all unitary matrices \(U, V \in M_n\)):

Corollary 2. Let \(\|\cdot\|_\prime\) be a norm on \(M_n\). Then \(\|\cdot\|_\prime\) is unitarily-invariant if and only if

\(\displaystyle \big\|ABC\big\|_\prime \leq \big\|A\big\|_{op}\big\|B\big\|_{\prime}\big\|C\big\|_{op}\)

for all \(A, B, C \in M_n\).

Proof of Corollary 2. The “if” direction is straightforward: if we let \(A\) and \(C\) be unitary, then

\(\displaystyle \big\|B\big\|_\prime = \big\|A^*ABCC^*\big\|_\prime \leq \big\|ABC\big\|_\prime \leq \big\|B\big\|_{\prime},\)

where we used the fact that \(\big\|A\big\|_{op} = \big\|C\big\|_{op} = 1\). It follows that \(\big\|ABC\big\|_\prime = \big\|B\big\|_\prime\), so \(\|\cdot\|_\prime\) is unitarily-invariant.

To see the “only if” direction, write \(A = \sum_i c_i U_i\) and \(C = \sum_j d_j V_j\) with each \(U_i\) and \(V_j\) unitary. Then

\(\displaystyle \big\|ABC\big\|_\prime = \Big\|\sum_{i,j}c_i d_j U_i B V_j\Big\|_\prime \leq \sum_{i,j} |c_i| |d_j| \big\|U_i B V_j\big\|_\prime = \sum_{i,j} |c_i| |d_j| \big\|B\big\|_\prime.\)

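The right-hand side factors as \(\big(\sum_i |c_i|\big)\big(\sum_j |d_j|\big)\big\|B\big\|_\prime\). Taking the infimum over all decompositions of \(A\) and \(C\) of the given form and applying Corollary 1 to each factor gives \(\big\|ABC\big\|_\prime \leq \big\|A\big\|_{op}\big\|B\big\|_\prime\big\|C\big\|_{op}\), as desired.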

An alternate proof of Corollary 2, making use of some results on singular values, can be found in [2, Proposition IV.2.4].

Separability Norms

As our final (and least well-known) example, let \(\mathcal{H} = M_m \otimes M_n\), again with the usual Hilbert–Schmidt inner product. If we let

\(\displaystyle S = \{ \mathbf{a}\mathbf{b}^* \otimes \mathbf{c}\mathbf{d}^* : \|\mathbf{a}\| = \|\mathbf{b}\| = \|\mathbf{c}\| = \|\mathbf{d}\| = 1 \},\)

where \(\|\cdot\|\) is the Euclidean norm on \(\mathbb{C}^m\) or \(\mathbb{C}^n\), then Theorem 1 tells us that the following two norms are dual to each other:

\(\displaystyle \big\|A\big\|_s := \sup\Big\{ \big| (\mathbf{a}^* \otimes \mathbf{c}^*)A(\mathbf{b} \otimes \mathbf{d}) \big| : \|\mathbf{a}\| = \|\mathbf{b}\| = \|\mathbf{c}\| = \|\mathbf{d}\| = 1 \Big\} \text{ and}\)

\(\displaystyle \big\|A\big\|_s^\circ = \inf\Big\{ \sum_i \big\|A_i\big\|_{tr}\big\|B_i\big\|_{tr} : A = \sum_i A_i \otimes B_i \Big\}.\)

There’s actually a little bit of work to be done to show that \(\|\cdot\|_s^\circ\) has the given form, but it’s only a couple lines – consider it an exercise for the interested reader.

Both of these norms come up frequently when dealing with quantum entanglement. The norm \(\|\cdot\|_s^\circ\) was the subject of [3], where it was shown that a quantum state \(\rho\) is entangled if and only if \(\|\rho\|_s^\circ > 1\) (I use the above duality relationship to provide an alternate proof of this fact in [4, Theorem 6.1.5]). On the other hand, the norm \(\|\cdot\|_s\) characterizes positive linear maps of matrices and was the subject of [5, 6].

References

[1] J. Diestel, J. H. Fourie, and J. Swart. The Metric Theory of Tensor Products: Grothendieck’s Résumé Revisited. American Mathematical Society, 2008.
[2] R. Bhatia. Matrix Analysis. Springer, 1997.
[3] O. Rudolph. A separability criterion for density operators. J. Phys. A: Math. Gen., 33:3951–3955, 2000. E-print: arXiv:quant-ph/0002026
[4] N. Johnston. Norms and Cones in the Theory of Quantum Entanglement. PhD thesis, University of Guelph, 2012.
[5] N. Johnston and D. W. Kribs. A Family of Norms With Applications in Quantum Information Theory. Journal of Mathematical Physics, 51:082202, 2010.
[6] N. Johnston and D. W. Kribs. A Family of Norms With Applications in Quantum Information Theory II. Quantum Information & Computation, 11(1 & 2):104–123, 2011.

Nathaniel Johnston
