Riesz representation theorem

From Wikipedia, the free encyclopedia

The Riesz representation theorem, sometimes called the Riesz–Fréchet representation theorem after Frigyes Riesz and Maurice René Fréchet, establishes an important connection between a Hilbert space and its continuous dual space. If the underlying field is the real numbers, the two are isometrically isomorphic; if the underlying field is the complex numbers, the two are isometrically anti-isomorphic. The (anti-) isomorphism is a particular natural isomorphism.

Preliminaries and notation

Let $H$ be a Hilbert space over a field $\mathbb{F}$, where $\mathbb{F}$ is either the real numbers $\mathbb{R}$ or the complex numbers $\mathbb{C}$. If $\mathbb{F} = \mathbb{C}$ (resp. if $\mathbb{F} = \mathbb{R}$) then $H$ is called a complex Hilbert space (resp. a real Hilbert space). Every real Hilbert space can be extended to be a dense subset of a unique (up to bijective isometry) complex Hilbert space, called its complexification, which is why Hilbert spaces are often automatically assumed to be complex. Real and complex Hilbert spaces have in common many, but by no means all, properties and results/theorems.

This article is intended for both mathematicians and physicists and will describe the theorem for both. In both mathematics and physics, if a Hilbert space is assumed to be real (that is, if $\mathbb{F} = \mathbb{R}$) then this will usually be made clear. Often in mathematics, and especially in physics, unless indicated otherwise, "Hilbert space" is usually automatically assumed to mean "complex Hilbert space." Depending on the author, in mathematics, "Hilbert space" usually means either (1) a complex Hilbert space, or (2) a real or complex Hilbert space.

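Linear and antilinear maps

By definition, an antilinear map (also called a conjugate-linear map) $f : H \to Y$ is a map between vector spaces that is additive:
$$f(x + y) = f(x) + f(y) \quad \text{for all } x, y \in H,$$
and antilinear (also called conjugate-linear or conjugate-homogeneous):
$$f(c x) = \overline{c} f(x) \quad \text{for all } x \in H \text{ and all scalars } c \in \mathbb{F},$$
where $\overline{c}$ is the conjugate of the complex number $c = a + b i$, given by $\overline{c} = a - b i$.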

In contrast, a map $f : H \to Y$ is linear if it is additive and homogeneous:
$$f(c x) = c f(x) \quad \text{for all } x \in H \text{ and all scalars } c \in \mathbb{F}.$$

The constant $0$ map is both linear and antilinear. If $\mathbb{F} = \mathbb{R}$ then the definitions of linear maps and antilinear maps are identical. A linear map from a Hilbert space into a Banach space (or more generally, from any Banach space into any topological vector space) is continuous if and only if it is bounded; the same is true of antilinear maps. The inverse of any antilinear (resp. linear) bijection is again an antilinear (resp. linear) bijection. The composition of two antilinear maps is a linear map.
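The defining identities can be checked numerically on the one-dimensional space $\mathbb{C}$. The following is a minimal sketch, not part of the article's sources: the names `conj_map`, `scaled_conj`, and `compose` and the sample values are illustrative assumptions.

```python
def conj_map(z: complex) -> complex:
    """Complex conjugation z -> conj(z), an antilinear map on C."""
    return z.conjugate()


def scaled_conj(z: complex) -> complex:
    """A second antilinear map: z -> (2 - 1j) * conj(z)."""
    return (2 - 1j) * z.conjugate()


def compose(f, g):
    """Return the composition 'f after g'."""
    return lambda z: f(g(z))


# Arbitrary illustrative scalar c and vectors x, y in C.
c, x, y = 1 + 2j, 3 - 1j, -2 + 0.5j

# Additivity: f(x + y) = f(x) + f(y).
additive = conj_map(x + y) == conj_map(x) + conj_map(y)

# Conjugate homogeneity: f(c x) = conj(c) f(x).
conjugate_homogeneous = conj_map(c * x) == c.conjugate() * conj_map(x)

# Composing two antilinear maps gives a linear map: h(c x) = c h(x).
h = compose(scaled_conj, conj_map)
composition_is_linear = abs(h(c * x) - c * h(x)) < 1e-12
```

Here `h` works out to multiplication by `2 - 1j`, which is linear because the two conjugations cancel.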

Continuous dual and anti-dual spaces

A functional on $H$ is a function $H \to \mathbb{F}$ whose codomain is the underlying scalar field $\mathbb{F}$. Denote by $H^*$ (resp. by $\overline{H}^*$) the set of all continuous linear (resp. continuous antilinear) functionals on $H$, which is called the (continuous) dual space (resp. the (continuous) anti-dual space) of $H$.[1] If $\mathbb{F} = \mathbb{R}$ then linear functionals on $H$ are the same as antilinear functionals and consequently, the same is true for such continuous maps: that is, $H^* = \overline{H}^*$.

One-to-one correspondence between linear and antilinear functionals

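Given any functional $f : H \to \mathbb{F}$, the conjugate of $f$ is the functional
$$\overline{f} : H \to \mathbb{F}, \qquad h \mapsto \overline{f(h)}.$$

This assignment is most useful when $\mathbb{F} = \mathbb{C}$ because if $\mathbb{F} = \mathbb{R}$ then $f = \overline{f}$ and the assignment $f \mapsto \overline{f}$ reduces to the identity map.

The assignment $f \mapsto \overline{f}$ defines an antilinear bijective correspondence from the set of all functionals (resp. all linear functionals, all continuous linear functionals $H^*$) on $H$ onto the set of all functionals (resp. all antilinear functionals, all continuous antilinear functionals $\overline{H}^*$) on $H$.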

Mathematics vs. physics notations and definitions of inner product

The Hilbert space $H$ has an associated inner product $H \times H \to \mathbb{F}$ valued in $H$'s underlying scalar field $\mathbb{F}$ that is linear in one coordinate and antilinear in the other (as described in detail below). If $H$ is a complex Hilbert space (meaning, if $\mathbb{F} = \mathbb{C}$), which is very often the case, then which coordinate is antilinear and which is linear becomes a very important technicality. However, if $\mathbb{F} = \mathbb{R}$ then the inner product is a symmetric map that is simultaneously linear in each coordinate (that is, bilinear) and antilinear in each coordinate. Consequently, the question of which coordinate is linear and which is antilinear is irrelevant for real Hilbert spaces.

Notation for the inner product

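In mathematics, the inner product on a Hilbert space $H$ is often denoted by $\langle \cdot, \cdot \rangle$ or $\langle \cdot, \cdot \rangle_H$, while in physics, the bra–ket notation $\langle \cdot \mid \cdot \rangle$ or $\langle \cdot \mid \cdot \rangle_H$ is typically used instead. In this article, these two notations will be related by the equality:
$$\langle x, y \rangle := \langle y \mid x \rangle \quad \text{for all } x, y \in H.$$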

Competing definitions of the inner product

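The maps $\langle \cdot, \cdot \rangle$ and $\langle \cdot \mid \cdot \rangle$ are assumed to have the following two properties:

  1. The map $\langle \cdot, \cdot \rangle$ is linear in its first coordinate; equivalently, the map $\langle \cdot \mid \cdot \rangle$ is linear in its second coordinate. Explicitly, this means that for every fixed $y \in H$, the map that is denoted by $\langle y \mid \cdot \rangle = \langle \cdot, y \rangle : H \to \mathbb{F}$ and defined by $h \mapsto \langle y \mid h \rangle = \langle h, y \rangle$ for all $h \in H$
    is a linear functional on $H$.
    • In fact, this linear functional is continuous, so $\langle y \mid \cdot \rangle = \langle \cdot, y \rangle \in H^*$.
  2. The map $\langle \cdot, \cdot \rangle$ is antilinear in its second coordinate; equivalently, the map $\langle \cdot \mid \cdot \rangle$ is antilinear in its first coordinate. Explicitly, this means that for every fixed $y \in H$, the map that is denoted by $\langle \cdot \mid y \rangle = \langle y, \cdot \rangle : H \to \mathbb{F}$ and defined by $h \mapsto \langle h \mid y \rangle = \langle y, h \rangle$ for all $h \in H$
    is an antilinear functional on $H$.
    • In fact, this antilinear functional is continuous, so $\langle \cdot \mid y \rangle = \langle y, \cdot \rangle \in \overline{H}^*$.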

In mathematics, the prevailing convention (i.e. the definition of an inner product) is that the inner product is linear in the first coordinate and antilinear in the other coordinate. In physics, the convention/definition is unfortunately the opposite, meaning that the inner product is linear in the second coordinate and antilinear in the other coordinate. This article will not choose one definition over the other. Instead, the assumptions made above make it so that the mathematics notation $\langle \cdot, \cdot \rangle$ satisfies the mathematical convention/definition for the inner product (that is, linear in the first coordinate and antilinear in the other), while the physics bra–ket notation $\langle \cdot \mid \cdot \rangle$ satisfies the physics convention/definition for the inner product (that is, linear in the second coordinate and antilinear in the other). Consequently, the above two assumptions make the notation used in each field consistent with that field's convention/definition for which coordinate is linear and which is antilinear.
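The two conventions can be compared side by side on $\mathbb{C}^n$ with the standard inner product. This is a minimal sketch under that assumption; the function names `inner_math` and `braket` and the sample vectors are illustrative, not notation taken from the article's sources.

```python
def inner_math(x, y):
    """Mathematics convention <x, y>: linear in x, antilinear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def braket(y, x):
    """Physics convention <y|x>: antilinear in y, linear in x."""
    return sum(a.conjugate() * b for a, b in zip(y, x))


# Arbitrary illustrative vectors in C^2 and a scalar c.
x = [1 + 1j, 2j]
y = [3 + 0j, 1 - 1j]
c = 2 - 3j

# The article's identification <x, y> := <y|x>.
same_value = abs(inner_math(x, y) - braket(y, x)) < 1e-12

# Scaling the linear slot pulls out c; scaling the antilinear slot pulls out conj(c).
cx = [c * a for a in x]
cy = [c * b for b in y]
linear_in_first = abs(inner_math(cx, y) - c * inner_math(x, y)) < 1e-12
antilinear_in_second = abs(inner_math(x, cy) - c.conjugate() * inner_math(x, y)) < 1e-12
```

Both functions compute the same number; they differ only in which argument position carries the complex conjugate.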

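Canonical norm and inner product on the dual space and anti-dual space

If $x = y$ then $\langle x \mid x \rangle = \langle x, x \rangle$ is a non-negative real number and the map
$$\|x\| := \sqrt{\langle x, x \rangle} = \sqrt{\langle x \mid x \rangle}$$
defines a canonical norm on $H$ that makes $H$ into a normed space.[1] As with all normed spaces, the (continuous) dual space $H^*$ carries a canonical norm, called the dual norm, that is defined by[1]
$$\|f\|_{H^*} := \sup_{\|x\| \leq 1,\, x \in H} |f(x)| \quad \text{for every } f \in H^*.$$
The canonical norm on the (continuous) anti-dual space $\overline{H}^*$, denoted by $\|f\|_{\overline{H}^*}$, is defined by using this same equation:[1]
$$\|f\|_{\overline{H}^*} := \sup_{\|x\| \leq 1,\, x \in H} |f(x)| \quad \text{for every } f \in \overline{H}^*.$$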

This canonical norm on $H^*$ satisfies the parallelogram law, which means that the polarization identity can be used to define a canonical inner product on $H^*$, which this article will denote by the notations
$$\langle f, g \rangle_{H^*} := \langle g \mid f \rangle_{H^*},$$
where this inner product turns $H^*$ into a Hilbert space. There are now two ways of defining a norm on $H^*$: the norm induced by this inner product (that is, the norm defined by $f \mapsto \sqrt{\langle f, f \rangle_{H^*}}$) and the usual dual norm (defined as the supremum over the closed unit ball). These norms are the same; explicitly, this means that the following holds for every $f \in H^*$:
$$\sup_{\|x\| \leq 1,\, x \in H} |f(x)| = \|f\|_{H^*} = \sqrt{\langle f, f \rangle_{H^*}} = \sqrt{\langle f \mid f \rangle_{H^*}}.$$

As will be described later, the Riesz representation theorem can be used to give an equivalent definition of the canonical norm and the canonical inner product on $H^*$.

The same equations that were used above can also be used to define a norm and inner product on $H$'s anti-dual space $\overline{H}^*$.[1]

Canonical isometry between the dual and antidual

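The complex conjugate $\overline{f}$ of a functional $f$, which was defined above, satisfies
$$\|f\|_{H^*} = \left\|\overline{f}\right\|_{\overline{H}^*} \quad \text{and} \quad \left\|\overline{g}\right\|_{H^*} = \|g\|_{\overline{H}^*}$$
for every $f \in H^*$ and every $g \in \overline{H}^*$. This says exactly that the canonical antilinear bijection defined by
$$\operatorname{Cong} : H^* \to \overline{H}^*, \qquad f \mapsto \overline{f},$$
as well as its inverse $\operatorname{Cong}^{-1} : \overline{H}^* \to H^*$, are antilinear isometries and consequently also homeomorphisms. The inner products on the dual space $H^*$ and the anti-dual space $\overline{H}^*$, denoted respectively by $\langle \cdot, \cdot \rangle_{H^*}$ and $\langle \cdot, \cdot \rangle_{\overline{H}^*}$, are related by
$$\langle \overline{f} \mid \overline{g} \rangle_{\overline{H}^*} = \overline{\langle f \mid g \rangle_{H^*}} = \langle g \mid f \rangle_{H^*} \quad \text{for all } f, g \in H^*$$
and
$$\langle \overline{f} \mid \overline{g} \rangle_{H^*} = \overline{\langle f \mid g \rangle_{\overline{H}^*}} = \langle g \mid f \rangle_{\overline{H}^*} \quad \text{for all } f, g \in \overline{H}^*.$$

If $\mathbb{F} = \mathbb{R}$ then $H^* = \overline{H}^*$ and this canonical map $\operatorname{Cong} : H^* \to \overline{H}^*$ reduces to the identity map.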

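Riesz representation theorem

Two vectors $x$ and $y$ are orthogonal if $\langle x, y \rangle = 0$, which happens if and only if $\|y\| \leq \|y + s x\|$ for all scalars $s$.[2] The orthogonal complement of a subset $X \subseteq H$ is
$$X^{\bot} := \{\, y \in H : \langle y, x \rangle = 0 \text{ for all } x \in X \,\},$$
which is always a closed vector subspace of $H$. The Hilbert projection theorem guarantees that for any nonempty closed convex subset $C$ of a Hilbert space there exists a unique vector $m \in C$ such that $\|m\| = \inf_{c \in C} \|c\|$; that is, $m \in C$ is the (unique) global minimum point of the function $C \to [0, \infty)$ defined by $c \mapsto \|c\|$.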

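Statement

Riesz representation theorem: Let $H$ be a Hilbert space whose inner product $\langle x, y \rangle$ is linear in its first argument and antilinear in its second argument and let $\langle y \mid x \rangle := \langle x, y \rangle$ be the corresponding physics notation. For every continuous linear functional $\varphi \in H^*$, there exists a unique vector $f_\varphi \in H$, called the Riesz representation of $\varphi$, such that[3]
$$\varphi(x) = \langle x, f_\varphi \rangle = \langle f_\varphi \mid x \rangle \quad \text{for all } x \in H.$$
Importantly for complex Hilbert spaces, $f_\varphi$ is always located in the antilinear coordinate of the inner product.[note 1]

Furthermore, the length of the representation vector is equal to the norm of the functional:
$$\|f_\varphi\|_H = \|\varphi\|_{H^*},$$
and $f_\varphi$ is the unique vector $f_\varphi \in (\ker \varphi)^{\bot}$ with $\varphi(f_\varphi) = \|\varphi\|^2$. It is also the unique element of minimum norm in $C := \varphi^{-1}\left(\|\varphi\|^2\right)$; that is to say, $f_\varphi$ is the unique element of $C$ satisfying $\|f_\varphi\| = \inf_{c \in C} \|c\|$. Moreover, any non-zero $q \in (\ker \varphi)^{\bot}$ can be written as $q = \left(\|q\|^2 / \overline{\varphi(q)}\right) f_\varphi$.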

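Corollary: The canonical map from $H$ into its dual $H^*$[1] is the injective antilinear operator isometry[note 2][1]
$$\Phi : H \to H^*, \qquad y \mapsto \langle \cdot, y \rangle = \langle y \mid \cdot \rangle.$$

The Riesz representation theorem states that this map is surjective (and thus bijective) when $H$ is complete and that its inverse is the bijective isometric antilinear isomorphism
$$\Phi^{-1} : H^* \to H, \qquad \varphi \mapsto f_\varphi.$$
Consequently, every continuous linear functional on the Hilbert space $H$ can be written uniquely in the form $\langle y \mid \cdot \rangle$,[1] where $\|\langle y \mid \cdot \rangle\|_{H^*} = \|y\|_H$ for every $y \in H$. The assignment $y \mapsto \langle y, \cdot \rangle = \langle \cdot \mid y \rangle$ can also be viewed as a bijective linear isometry $H \to \overline{H}^*$ into the anti-dual space of $H$,[1] which is the complex conjugate vector space of the continuous dual space $H^*$.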

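The inner products on $H$ and $H^*$ are related by
$$\langle \Phi h, \Phi k \rangle_{H^*} = \overline{\langle h, k \rangle_H} = \langle k, h \rangle_H \quad \text{for all } h, k \in H,$$
and similarly,
$$\langle \Phi^{-1} \varphi, \Phi^{-1} \psi \rangle_H = \overline{\langle \varphi, \psi \rangle_{H^*}} = \langle \psi, \varphi \rangle_{H^*} \quad \text{for all } \varphi, \psi \in H^*.$$

The set $C := \varphi^{-1}\left(\|\varphi\|^2\right)$ satisfies $C = f_\varphi + \ker \varphi$ and $C - f_\varphi = \ker \varphi$, so when $f_\varphi \neq 0$ then $C$ can be interpreted as being the affine hyperplane[note 3] that is parallel to the vector subspace $\ker \varphi$ and contains $f_\varphi$.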

For $y \in H$, the physics notation for the functional $\Phi(y) \in H^*$ is the bra $\langle y |$, where explicitly this means that $\langle y | := \Phi(y)$, which complements the ket notation $| y \rangle$ defined by $| y \rangle := y$. In the mathematical treatment of quantum mechanics, the theorem can be seen as a justification for the popular bra–ket notation. The theorem says that every bra $\langle \psi |$ has a corresponding ket $| \psi \rangle$, and the latter is unique.

Historically, the theorem is often attributed simultaneously to Riesz and Fréchet in 1907 (see references).

Proof[4]

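Let $\mathbb{F}$ denote the underlying scalar field of $H$.

Proof of norm formula:

Fix $y \in H$. Define $\Lambda : H \to \mathbb{F}$ by $\Lambda(z) := \langle y \mid z \rangle$, which is a linear functional on $H$ since $z$ is in the linear argument. By the Cauchy–Schwarz inequality,
$$|\Lambda(z)| = |\langle y \mid z \rangle| \leq \|y\| \|z\|,$$
which shows that $\Lambda$ is bounded (equivalently, continuous) and that $\|\Lambda\| \leq \|y\|$. It remains to show that $\|y\| \leq \|\Lambda\|$. By using $y$ in place of $z$, it follows that
$$\|y\|^2 = \langle y \mid y \rangle = \Lambda y = |\Lambda(y)| \leq \|\Lambda\| \|y\|$$
(the equality $\Lambda y = |\Lambda(y)|$ holds because $\Lambda y = \|y\|^2 \geq 0$ is real and non-negative). If $y = 0$ the inequality $\|y\| \leq \|\Lambda\|$ is trivial; otherwise, dividing by $\|y\|$ gives it. Thus $\|\Lambda\| = \|y\|$.

The proof above did not use the fact that $H$ is complete, which shows that the formula for the norm $\|\langle y \mid \cdot \rangle\|_{H^*} = \|y\|_H$ holds more generally for all inner product spaces.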
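The two halves of the norm formula can be illustrated numerically on $\mathbb{C}^3$ with the standard inner product. This is a sketch for intuition only, not a proof; the vector `y`, the random sample, and the helper names are illustrative assumptions.

```python
import math
import random


def braket(y, z):
    """Physics-convention inner product <y|z> on C^n (antilinear in y)."""
    return sum(a.conjugate() * b for a, b in zip(y, z))


def norm(v):
    """Norm induced by the inner product."""
    return math.sqrt(braket(v, v).real)


random.seed(0)  # fixed seed so the sample is reproducible
y = [2 - 1j, 0.5j, -3 + 0j]  # arbitrary illustrative vector


def Lam(z):
    """The functional Lambda(z) = <y|z>."""
    return braket(y, z)


# Cauchy-Schwarz: |Lambda(z)| / ||z|| never exceeds ||y|| on any sample.
samples = [
    [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in y]
    for _ in range(200)
]
largest_ratio = max(abs(Lam(z)) / norm(z) for z in samples)

# The bound is attained at the unit vector y / ||y||.
unit_y = [a / norm(y) for a in y]
attained = abs(Lam(unit_y))
```

The sampled ratios stay below `norm(y)`, while `attained` equals `norm(y)` up to rounding, matching the two inequalities in the proof.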


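Proof that a Riesz representation of $\varphi$ is unique:

Suppose $f, g \in H$ are such that $\varphi(z) = \langle f \mid z \rangle$ and $\varphi(z) = \langle g \mid z \rangle$ for all $z \in H$. Then
$$\langle f - g \mid z \rangle = \langle f \mid z \rangle - \langle g \mid z \rangle = \varphi(z) - \varphi(z) = 0 \quad \text{for all } z \in H,$$
which shows that $\Lambda := \langle f - g \mid \cdot \rangle$ is the constant $0$ linear functional. Consequently $0 = \|\langle f - g \mid \cdot \rangle\| = \|f - g\|$, which implies that $f - g = 0$.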


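Proof that a vector $f_\varphi$ representing $\varphi$ exists:

Let $K := \ker \varphi := \{ m \in H : \varphi(m) = 0 \}$. If $K = H$ (or equivalently, if $\varphi = 0$) then taking $f_\varphi := 0$ completes the proof, so assume that $K \neq H$ and $\varphi \neq 0$. The continuity of $\varphi$ implies that $K$ is a closed subspace of $H$ (because $K = \varphi^{-1}(\{0\})$ and $\{0\}$ is a closed subset of $\mathbb{F}$). Let
$$K^{\bot} := \{\, v \in H : \langle v \mid k \rangle = 0 \text{ for all } k \in K \,\}$$
denote the orthogonal complement of $K$ in $H$. Because $K$ is closed and $H$ is a Hilbert space,[note 4] $H$ can be written as the direct sum $H = K \oplus K^{\bot}$[note 5] (a proof of this is given in the article on the Hilbert projection theorem). Because $K \neq H$, there exists some non-zero $p \in K^{\bot}$. For any $h \in H$,
$$\varphi[(\varphi h) p - (\varphi p) h] = \varphi[(\varphi h) p] - \varphi[(\varphi p) h] = (\varphi h) \varphi p - (\varphi p) \varphi h = 0,$$
which shows that $(\varphi h) p - (\varphi p) h \in \ker \varphi = K$, where now $p \in K^{\bot}$ implies
$$0 = \langle p \mid (\varphi h) p - (\varphi p) h \rangle = \langle p \mid (\varphi h) p \rangle - \langle p \mid (\varphi p) h \rangle = (\varphi h) \langle p \mid p \rangle - (\varphi p) \langle p \mid h \rangle.$$
Solving for $\varphi h$ shows that
$$\varphi h = \frac{(\varphi p) \langle p \mid h \rangle}{\|p\|^2} = \left\langle \frac{\overline{\varphi p}}{\|p\|^2} p \,\Big|\, h \right\rangle \quad \text{for every } h \in H,$$
which proves that the vector $f_\varphi := \frac{\overline{\varphi p}}{\|p\|^2} p$ satisfies
$$\varphi h = \langle f_\varphi \mid h \rangle \quad \text{for every } h \in H.$$

Applying the norm formula that was proved above with $y := f_\varphi$ shows that $\|\varphi\|_{H^*} = \|\langle f_\varphi \mid \cdot \rangle\|_{H^*} = \|f_\varphi\|_H$. Also, the vector $u := \frac{p}{\|p\|}$ has norm $\|u\| = 1$ and satisfies $f_\varphi = \overline{\varphi(u)} u$.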


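It can now be deduced that $K^{\bot}$ is $1$-dimensional when $\varphi \neq 0$. Let $q \in K^{\bot}$ be any non-zero vector. Replacing $p$ with $q$ in the proof above shows that the vector $g := \frac{\overline{\varphi q}}{\|q\|^2} q$ satisfies $\varphi(h) = \langle g \mid h \rangle$ for every $h \in H$. The uniqueness of the (non-zero) vector $f_\varphi$ representing $\varphi$ implies that $f_\varphi = g$, which in turn implies that $\overline{\varphi q} \neq 0$ and $q = \frac{\|q\|^2}{\overline{\varphi q}} f_\varphi$. Thus every vector in $K^{\bot}$ is a scalar multiple of $f_\varphi$.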

The formulas for the inner products follow from the polarization identity.
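As a concrete check of the existence argument, the following worked example carries out the construction on $H = \mathbb{C}^2$ with the standard inner product $\langle z \mid w \rangle = \overline{z_1} w_1 + \overline{z_2} w_2$. The functional $\varphi(h) = h_1 + i h_2$ is a hypothetical choice made only for illustration; it does not come from the cited proof.

```latex
\begin{aligned}
K &= \ker\varphi = \{(h_1, h_2) : h_1 + i h_2 = 0\} = \operatorname{span}\{(-i, 1)\}, \\
K^{\bot} &= \{ v : \langle v \mid (-i, 1) \rangle = 0 \}
          = \{ v : -i\,\overline{v_1} + \overline{v_2} = 0 \}
          = \operatorname{span}\{(1, -i)\}, \\
p &= (1, -i), \qquad \varphi(p) = 1 + i(-i) = 2, \qquad \|p\|^2 = 2, \\
f_\varphi &= \frac{\overline{\varphi(p)}}{\|p\|^2}\, p = (1, -i), \\
\langle f_\varphi \mid h \rangle &= \overline{1}\, h_1 + \overline{(-i)}\, h_2 = h_1 + i h_2 = \varphi(h), \\
\|f_\varphi\| &= \sqrt{2} = \|\varphi\|.
\end{aligned}
```

Choosing any other non-zero element of $K^{\bot}$, such as $2p$ or $ip$, yields the same $f_\varphi$, consistent with the uniqueness part of the proof.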

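Observations

If $\varphi \in H^*$ then
$$\varphi(f_\varphi) = \langle f_\varphi, f_\varphi \rangle = \|f_\varphi\|^2 = \|\varphi\|^2.$$
So in particular, $\varphi(f_\varphi) \geq 0$ is always real and furthermore, $\varphi(f_\varphi) = 0$ if and only if $f_\varphi = 0$ if and only if $\varphi = 0$.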

Linear functionals as affine hyperplanes

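A non-trivial continuous linear functional $\varphi$ is often interpreted geometrically by identifying it with the affine hyperplane $A := \varphi^{-1}(1)$ (the kernel $\ker \varphi = \varphi^{-1}(0)$ is also often visualized alongside $A := \varphi^{-1}(1)$, although knowing $A$ is enough to reconstruct $\ker \varphi$ because if $A = \varnothing$ then $\ker \varphi = H$ and otherwise $\ker \varphi = A - A$). In particular, the norm of $\varphi$ should somehow be interpretable as the "norm of the hyperplane $A$". When $\varphi \neq 0$ then the Riesz representation theorem provides such an interpretation of $\|\varphi\|$ in terms of the affine hyperplane[note 3] $A := \varphi^{-1}(1)$ as follows: using the notation from the theorem's statement, from $\|\varphi\|^2 \neq 0$ it follows that $C := \varphi^{-1}\left(\|\varphi\|^2\right) = \|\varphi\|^2 \varphi^{-1}(1) = \|\varphi\|^2 A$ and so $\|\varphi\| = \|f_\varphi\| = \inf_{c \in C} \|c\|$ implies $\|\varphi\| = \inf_{a \in A} \|\varphi\|^2 \|a\|$ and thus $\|\varphi\| = \frac{1}{\inf_{a \in A} \|a\|}$. This can also be seen by applying the Hilbert projection theorem to $A$ and concluding that the global minimum point of the map $A \to [0, \infty)$ defined by $a \mapsto \|a\|$ is $\frac{f_\varphi}{\|\varphi\|^2} \in A$. The formulas
$$\|\varphi\| = \frac{1}{\inf_{a \in \varphi^{-1}(1)} \|a\|} = \sup_{a \in \varphi^{-1}(1)} \frac{1}{\|a\|}$$
provide the promised interpretation of the linear functional's norm entirely in terms of its associated affine hyperplane $A = \varphi^{-1}(1)$ (because with this formula, knowing only the set $A$ is enough to describe the norm of its associated linear functional). Defining $\frac{1}{\infty} := 0$, the infimum formula
$$\|\varphi\| = \frac{1}{\inf_{a \in \varphi^{-1}(1)} \|a\|}$$
will also hold when $\varphi = 0$. When the supremum is taken in $\mathbb{R}$ (as is typically assumed), then the supremum of the empty set is $\sup \varnothing = -\infty$, but if the supremum is taken in the non-negative reals $[0, \infty)$ (which is the image/range of the norm $\|\cdot\|$ when $\dim H > 0$) then this supremum is instead $\sup \varnothing = 0$, in which case the supremum formula $\|\varphi\| = \sup_{a \in \varphi^{-1}(1)} \frac{1}{\|a\|}$ will also hold when $\varphi = 0$ (although the atypical equality $\sup \varnothing = 0$ is usually unexpected and so risks causing confusion).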
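The hyperplane interpretation can be made concrete with a small worked example on the real Hilbert space $H = \mathbb{R}^2$ with the dot product. The functional $\varphi(x_1, x_2) = 3 x_1 + 4 x_2$ is a hypothetical choice for illustration only.

```latex
\begin{aligned}
f_\varphi &= (3, 4), \qquad \|\varphi\| = \|f_\varphi\| = 5, \\
A &= \varphi^{-1}(1) = \{(x_1, x_2) : 3 x_1 + 4 x_2 = 1\}, \\
\frac{f_\varphi}{\|\varphi\|^2} &= \left(\tfrac{3}{25}, \tfrac{4}{25}\right) \in A,
\qquad
\left\| \frac{f_\varphi}{\|\varphi\|^2} \right\| = \tfrac{5}{25} = \tfrac{1}{5} = \inf_{a \in A} \|a\|, \\
\|\varphi\| &= \frac{1}{\inf_{a \in A} \|a\|} = \frac{1}{1/5} = 5.
\end{aligned}
```

In this example the norm of the functional is the reciprocal of the distance from the origin to the line $A$.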

Constructions of the representing vector

Using the notation from the theorem above, several ways of constructing $f_\varphi$ from $\varphi \in H^*$ are now described. If $\varphi = 0$ then $f_\varphi := 0$; in other words,
$$f_0 = 0.$$
This special case of $\varphi = 0$ is henceforth assumed to be known, which is why some of the constructions given below start by assuming $\varphi \neq 0$.

Orthogonal complement of kernel

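If $\varphi \neq 0$ then for any $0 \neq u \in (\ker \varphi)^{\bot}$,
$$f_\varphi := \frac{\overline{\varphi(u)}\, u}{\|u\|^2}.$$
If $u \in (\ker \varphi)^{\bot}$ is a unit vector (meaning $\|u\| = 1$) then
$$f_\varphi := \overline{\varphi(u)}\, u$$
(this is true even if $\varphi = 0$ because in this case $f_\varphi = \overline{\varphi(u)}\, u = \overline{0}\, u = 0$). If $u$ is a unit vector satisfying the above condition then the same is true of $-u$, which is also a unit vector in $(\ker \varphi)^{\bot}$. However, $\overline{\varphi(-u)}(-u) = \overline{\varphi(u)}\, u = f_\varphi$, so both these vectors result in the same $f_\varphi$.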

Orthogonal projection onto kernel

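If $x \in H$ is such that $\varphi(x) \neq 0$ and if $x_K$ is the orthogonal projection of $x$ onto $\ker \varphi$ then[proof 1]
$$f_\varphi = \frac{\|\varphi\|^2}{\varphi(x)} \left(x - x_K\right).$$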
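The kernel-based constructions can be sketched on $\mathbb{C}^n$ with the standard inner product: project a vector $x$ with $\varphi(x) \neq 0$ onto $\ker \varphi$, keep the orthogonal remainder $p = x - x_K$, and rescale it. This is a minimal illustration, not a general implementation; the helper names, the assumption that the first coefficient of $\varphi$ is non-zero, and the sample functional are all hypothetical.

```python
import math


def braket(y, x):
    """Physics-convention inner product <y|x> on C^n (antilinear in y)."""
    return sum(a.conjugate() * b for a, b in zip(y, x))


def norm(v):
    return math.sqrt(braket(v, v).real)


def apply(coeffs, w):
    """Evaluate phi(w) = sum_i coeffs[i] * w[i]."""
    return sum(c * a for c, a in zip(coeffs, w))


def kernel_orthonormal_basis(coeffs):
    """Gram-Schmidt orthonormal basis of ker(phi); assumes coeffs[0] != 0."""
    n = len(coeffs)
    basis = []
    for i in range(1, n):
        # v = e_i - (coeffs[i] / coeffs[0]) e_0 satisfies phi(v) = 0.
        v = [0j] * n
        v[0], v[i] = -coeffs[i] / coeffs[0], 1 + 0j
        for b in basis:
            coef = braket(b, v)
            v = [vj - coef * bj for vj, bj in zip(v, b)]
        length = norm(v)
        basis.append([vj / length for vj in v])
    return basis


def perp_component(coeffs, x):
    """Return x - x_K, with x_K the orthogonal projection of x onto ker(phi)."""
    p = list(x)
    for b in kernel_orthonormal_basis(coeffs):
        coef = braket(b, x)
        p = [pj - coef * bj for pj, bj in zip(p, b)]
    return p


def riesz_vector(coeffs, x):
    """f_phi = conj(phi(p)) p / ||p||^2 with p = x - x_K; needs phi(x) != 0."""
    p = perp_component(coeffs, x)
    scale = apply(coeffs, p).conjugate() / norm(p) ** 2
    return [scale * pj for pj in p]


coeffs = [1 + 2j, 3j, -1 + 0j]  # hypothetical functional on C^3
x = [1 + 0j, 0j, 0j]            # phi(x) = 1 + 2j, which is non-zero
f = riesz_vector(coeffs, x)     # approximately [1-2j, -3j, -1]
```

The result agrees, up to rounding, with the entrywise conjugate of `coeffs`, and the projection formula $\frac{\|\varphi\|^2}{\varphi(x)}(x - x_K)$ gives the same vector once $\|\varphi\|$ is replaced by `norm(f)`.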

Orthonormal basis

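Given an orthonormal basis $\{e_i\}_{i \in I}$ of $H$ and a continuous linear functional $\varphi \in H^*$, the vector $f_\varphi \in H$ can be constructed uniquely by
$$f_\varphi = \sum_{i \in I} \overline{\varphi(e_i)}\, e_i,$$
where all but at most countably many $\varphi(e_i)$ will be equal to $0$ and where the value of $f_\varphi$ does not actually depend on choice of orthonormal basis (that is, using any other orthonormal basis for $H$ will result in the same vector). If $y \in H$ is written as $y = \sum_{i \in I} a_i e_i$ then
$$\varphi(y) = \sum_{i \in I} \varphi(e_i)\, a_i = \langle f_\varphi \mid y \rangle$$
and
$$\|f_\varphi\|^2 = \varphi(f_\varphi) = \sum_{i \in I} \varphi(e_i)\, \overline{\varphi(e_i)} = \sum_{i \in I} |\varphi(e_i)|^2 = \|\varphi\|^2.$$

If the orthonormal basis $\{e_i\}_{i=1}^{\infty}$ is a sequence then this becomes
$$f_\varphi = \overline{\varphi(e_1)}\, e_1 + \overline{\varphi(e_2)}\, e_2 + \cdots$$
and if $y \in H$ is written as $y = a_1 e_1 + a_2 e_2 + \cdots$ then
$$\varphi(y) = \varphi(e_1)\, a_1 + \varphi(e_2)\, a_2 + \cdots = \langle f_\varphi \mid y \rangle.$$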
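Example in finite dimensions using matrix transformations

Consider the special case of $H = \mathbb{C}^n$ (where $n > 0$ is an integer) with the standard inner product
$$\langle z \mid w \rangle := \overline{\vec{z}}^{\operatorname{T}} \vec{w} \quad \text{for all } w, z \in H,$$
where $w$ and $z$ are represented as column matrices $\vec{w} := \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}$ and $\vec{z} := \begin{bmatrix} z_1 \\ \vdots \\ z_n \end{bmatrix}$ with respect to the standard orthonormal basis $e_1, \ldots, e_n$ on $H$ (here, $e_i$ is $1$ at its $i$th coordinate and $0$ everywhere else; as usual, $H^*$ will now be associated with the dual basis) and where $\overline{\vec{z}}^{\operatorname{T}} := \left[\overline{z_1}, \ldots, \overline{z_n}\right]$ denotes the conjugate transpose of $\vec{z}$. Let $\varphi \in H^*$ be any linear functional and let $\varphi_1, \ldots, \varphi_n \in \mathbb{C}$ be the unique scalars such that
$$\varphi(w_1, \ldots, w_n) = \varphi_1 w_1 + \cdots + \varphi_n w_n \quad \text{for all } w := (w_1, \ldots, w_n) \in H,$$
where it can be shown that $\varphi_i = \varphi(e_i)$ for all $i = 1, \ldots, n$. Then the Riesz representation of $\varphi$ is the vector
$$f_\varphi := \overline{\varphi_1}\, e_1 + \cdots + \overline{\varphi_n}\, e_n = \left(\overline{\varphi_1}, \ldots, \overline{\varphi_n}\right) \in H.$$
To see why, identify every vector $w = (w_1, \ldots, w_n)$ in $H$ with the column matrix $\vec{w}$ so that $f_\varphi$ is identified with $\vec{f_\varphi} := \begin{bmatrix} \overline{\varphi_1} \\ \vdots \\ \overline{\varphi_n} \end{bmatrix}$. As usual, also identify the linear functional $\varphi$ with its transformation matrix, which is the row matrix $\vec{\varphi} := \left[\varphi_1, \ldots, \varphi_n\right]$, so that $\vec{f_\varphi} = \overline{\vec{\varphi}}^{\operatorname{T}}$ and the function $\varphi$ is the assignment $\vec{w} \mapsto \vec{\varphi}\, \vec{w}$, where the right hand side is matrix multiplication. Then for all $w = (w_1, \ldots, w_n) \in H$,
$$\varphi(w) = \varphi_1 w_1 + \cdots + \varphi_n w_n = \left[\varphi_1, \ldots, \varphi_n\right] \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix} = \overline{\vec{f_\varphi}}^{\operatorname{T}} \vec{w} = \langle f_\varphi \mid w \rangle,$$
which shows that $f_\varphi$ satisfies the defining condition of the Riesz representation of $\varphi$. The bijective antilinear isometry $\Phi : H \to H^*$ defined in the corollary to the Riesz representation theorem is the assignment that sends $z = (z_1, \ldots, z_n) \in H$ to the linear functional $\Phi(z) \in H^*$ on $H$ defined by
$$w = (w_1, \ldots, w_n) \mapsto \langle z \mid w \rangle = \overline{z_1} w_1 + \cdots + \overline{z_n} w_n,$$
where under the identification of vectors in $H$ with column matrices and vectors in $H^*$ with row matrices, $\Phi$ is just the assignment
$$\vec{z} \mapsto \overline{\vec{z}}^{\operatorname{T}}.$$
As described in the corollary, $\Phi$'s inverse $\Phi^{-1} : H^* \to H$ is the antilinear isometry $\varphi \mapsto f_\varphi$, which was just shown above to be:
$$\varphi \mapsto f_\varphi := \left(\overline{\varphi(e_1)}, \ldots, \overline{\varphi(e_n)}\right),$$
where in terms of matrices, $\Phi^{-1}$ is the assignment
$$\vec{\varphi} \mapsto \overline{\vec{\varphi}}^{\operatorname{T}}.$$
Thus in terms of matrices, each of $\Phi : H \to H^*$ and $\Phi^{-1} : H^* \to H$ is just the operation of conjugate transposition $\vec{v} \mapsto \overline{\vec{v}}^{\operatorname{T}}$ (although between different spaces of matrices: if $H$ is identified with the space of all column (respectively, row) matrices then $H^*$ is identified with the space of all row (respectively, column) matrices).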

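This example used the standard inner product, which is the map $\langle z \mid w \rangle := \overline{\vec{z}}^{\operatorname{T}} \vec{w}$, but if a different inner product is used, such as $\langle z \mid w \rangle_M := \overline{\vec{z}}^{\operatorname{T}} M \vec{w}$ where $M$ is any Hermitian positive-definite matrix, or if a different orthonormal basis is used, then the transformation matrices, and thus also the above formulas, will be different.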
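To see how the formula changes under a weighted inner product $\langle z \mid w \rangle_M = \overline{\vec{z}}^{\operatorname{T}} M \vec{w}$, note that $\varphi(w) = \langle f \mid w \rangle_M$ for all $w$ is equivalent to $M f = \overline{\vec{\varphi}}^{\operatorname{T}}$ when $M$ is Hermitian. The sketch below solves this $2 \times 2$ system by Cramer's rule. The matrix `M`, the coefficients, and all function names are assumptions chosen for illustration; this derivation is not quoted from the article's sources.

```python
def inner_M(M, z, w):
    """<z|w>_M = conj(z)^T M w for a 2x2 Hermitian positive-definite M."""
    Mw = [M[0][0] * w[0] + M[0][1] * w[1],
          M[1][0] * w[0] + M[1][1] * w[1]]
    return z[0].conjugate() * Mw[0] + z[1].conjugate() * Mw[1]


def solve_2x2(M, b):
    """Solve M f = b by Cramer's rule (M assumed invertible)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]


def riesz_vector_M(M, coeffs):
    """Solve M f = conj(coeffs)^T so that phi(w) = <f|w>_M for all w."""
    return solve_2x2(M, [c.conjugate() for c in coeffs])


def phi(coeffs, w):
    """Evaluate phi(w) = coeffs[0] * w[0] + coeffs[1] * w[1]."""
    return coeffs[0] * w[0] + coeffs[1] * w[1]


identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
M = [[2 + 0j, 1j], [-1j, 2 + 0j]]  # Hermitian with eigenvalues 1 and 3
coeffs = [1 - 1j, 2 + 3j]          # hypothetical functional on C^2

f_std = riesz_vector_M(identity, coeffs)  # standard case: conjugate transpose
f_M = riesz_vector_M(M, coeffs)           # weighted case: a different vector
```

With `identity` the result is the plain conjugate transpose of the coefficient row, while with `M` the representing vector changes even though the functional is the same.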

Relationship with the associated real Hilbert space

Assume that $H$ is a complex Hilbert space with inner product $\langle \cdot \mid \cdot \rangle$. When the Hilbert space