,r .
(1.5)
It turns out, as we explain in Sect. III, that $\tau(A)$ is, for $A \in \mathfrak{C}(\mathcal{Q})$, nothing other than the usual normalized trace on $\mathcal{H}$ applied to $A$. The basic ingredients of a non-commutative integration theory are an algebra of operators such as $\mathfrak{C}(\mathcal{Q})$, and a linear functional on it such as $\tau$. Norms analogous to the $L^p$ norms of commutative integration theory can now be introduced by
$$\|A\|_p = \left(\tau\big((A^*A)^{p/2}\big)\right)^{1/p}$$
for $1 \le p < \infty$; $\|A\|_\infty$ denotes the operator norm of $A$. Let $\mathfrak{C}^p(\mathcal{Q})$ denote $\mathfrak{C}(\mathcal{Q})$ equipped with the corresponding norm. Having expanded our horizons beyond Hilbert space, we can ask for bounds between $\|P_tA\|_q$ and $\|A\|_p$ for different $q$, $p$ and $t$. Our main result, Theorem 4, is the optimal fermion hypercontractivity inequality; i.e. for all $1 < p \le q < \infty$, and all $A$ in $\mathfrak{C}^p(\mathcal{Q})$,
$$\|P_tA\|_q \le \|A\|_p \quad\text{when}\quad e^{-2t} \le \frac{p-1}{q-1}, \tag{1.6}$$
and the $t$ saturating the inequality on the right is the smallest for which the inequality on the left always holds. We prove this for $n$ degrees of freedom with $n$ an arbitrary finite integer. Since the estimate is independent of $n$, a theorem of Gross [Gr72] implies that it holds as well with infinitely many degrees of freedom. The result (1.6) was conjectured by Gross [Gr75] who proved it [Gr72] in the special case $p = 2$, $q = 4$. The cases $p = 2$, $q = 2m$, where $m$ is an integer, were proved by Lindsay and Meyer [LiMe] following earlier work by Lindsay [Lin] on the case $p = 2$, $q = 2^n$. Since $P_t$ is self adjoint, duality yields a corresponding family of results in which $q = 2$. Until now all other cases (except when $n = 1$ or $2$) had remained open. The optimal relation between $t$, $p$ and $q$ found here for fermion hypercontractivity is the same as that found by Nelson [Ne73] for boson hypercontractivity. By now there are many proofs of Nelson's inequality. Neveu's elegant proof [Nev], like Nelson's original proof, is based on probabilistic methods. Proofs based on geometric methods have been given by Carlen and Loss [CL90] and by Lieb [Li90] who considers generalizations in which the Mehler kernel (the kernel for the
With E. Carlen in Commun. Math. Phys. 155, 27-46 (1993) E.A. Carlen and E.H. Lieb
boson oscillator semigroup) is replaced by an arbitrary Gaussian kernel. More references can be found in the bibliography to Gross's articles [Gr89, Gr92]. However, none of the existing approaches to the boson problem has been found to solve the fermion problem. See also [Far].

As a corollary to the optimal fermion hypercontractivity inequality, we obtain the optimal fermion logarithmic Sobolev inequality:
$$\tau\big(|A|^2 \ln |A|^2\big) - \big(\|A\|_2^2 \ln \|A\|_2^2\big) \le 2\,\tau(A^*H_0A), \tag{1.7}$$
where $|A| = (A^*A)^{1/2}$.

This inequality was conjectured by Gross, and proved by him in a weaker form with the constant on the right increased by a factor of $\ln 3$. In studying perturbations of $H_0$ by multiplication operators $V$, this inequality plays the same role as does the usual Sobolev inequality in studying perturbations of $-\Delta$ by multiplication operators $V$ on $L^2(\mathbb{R}^n, d^nx)$ [Fe69]. (The multiplication operator associated to a self-adjoint element $V$ of $\mathfrak{C}(\mathcal{Q})$ is defined to be the average of left and right multiplication by $V$, and is denoted here by the same symbol $V$.) Again, in the standard Fock space setting, there is no natural way to formulate such an a-priori regularity inequality, or even to introduce the notion of a multiplication operator.

This paper is organized as follows: In Sect. II we study the structure of $\mathfrak{C}(\mathcal{Q})$ for finite $n$. It is actually simpler, as well as technically advantageous, to consider it as a subalgebra of the algebra $\mathfrak{C}(\mathcal{K})$ generated by the identity, the configuration observables $Q_1, \ldots, Q_n$ and their conjugate momenta $P_1, \ldots, P_n$. Since they both turn out to be certain Clifford algebras, their structure was worked out long ago along with the representation theory of the orthogonal groups. Thus, this section contains no new result but simply introduces notation and prepares the way for what follows. Of particular use are an explicit spin-chain representation of the algebra and the Jordan-Wigner transform identifying it with the operator algebra generated by $n$ "hard core bosons."

Section III concerns properties of the spaces $\mathfrak{C}^p(\mathcal{Q})$ and their norms. The main result here is an optimal uniform convexity inequality for $\mathfrak{C}^p(\mathcal{Q})$, $1 < p \le 2$, which is joint work with Keith Ball. We need only a special case of this inequality here, and a proof is provided in an appendix for the reader's convenience.

Section IV introduces a convenient expression for $P_t$ in terms of the conditional expectation $\pi_{\mathcal{Q}}$ of $\mathfrak{C}(\mathcal{K})$ with respect to $\mathfrak{C}(\mathcal{Q})$. The main result in this section is an inequality for conditional expectations which enables us to prove that
$$\sup\{\|P_tA\|_q : \|A\|_p = 1\} = \sup\{\|P_tA\|_q : A \ge 0 \text{ and } \|A\|_p = 1\}.$$
Thus, to establish (1.6) in general, we need only consider positive $A$.

In Sect. V we establish the optimal fermion hypercontractivity bounds and the corresponding optimal fermion logarithmic Sobolev inequality. This is done in several steps. First, using results collected in Sects. II and III we establish that (1.6) holds for $1 < p \le 2$ and $q = 2$. At $t = 0$ this is an equality; differentiating it there yields the logarithmic Sobolev inequality (1.7). Gross showed that (1.6) would follow from (1.7), if it were true (as we show here), for all self adjoint $A$. His result rests on a deep inequality he established for positive operators. Since it is not in general true that $\|P_tA\|_q \le \|P_t|A|\|_q$, as is trivially true in the commutative case, Gross's result does not allow us to conclude (1.6) for general (i.e. non-self adjoint) $A$ from (1.7). The results of Sect. IV do, however, allow us to draw this conclusion.
Optimal Hypercontractivity for Fermi Fields and Related Non-Commutative Integration Inequalities
Finally, in Sect. VI we show that the same hypercontractivity relation (1.6) holds for a mixed system of bosons and fermions.

Non-commutative probability theory has grown into a substantial branch of analysis with a number of physical applications. The mathematical theory is reviewed and developed in [Me85] and [Me86], while other sorts of physical applications, besides those discussed here, are treated in [Da76] and [HuPa] for example.

It is a pleasure to thank Leonard Gross for discussing his results and conjectures with us, and for encouraging us to take up the latter. We are indebted to Keith Ball for his collaboration on the subject of convexity inequalities that led to Theorem 1 [BCL], which is one of the key ingredients in the present work. Thanks are also due to G.-F. Dell'Antonio, A. Jaffe and A. Wightman for useful discussions.

II. Fermions and the Clifford Algebra
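We begin by recalling for later use some well known facts about fermions. The fundamental observables for a system of $n$ fermion degrees of freedom are configuration operators $Q_1, \ldots, Q_n$ together with their conjugate momentum operators $P_1, P_2, \ldots, P_n$, all acting as operators on a complex Hilbert space $\mathcal{H}$ and satisfying the canonical anticommutation relations:
$$P_jP_k + P_kP_j = 2\delta_{jk}, \qquad Q_jQ_k + Q_kQ_j = 2\delta_{jk}, \qquad P_jQ_k + Q_kP_j = 0. \tag{2.1}$$
We denote the complex algebra generated by the identity and the configuration observables by $\mathfrak{C}(\mathcal{Q})$, and the complex algebra generated by the identity, the configuration observables and the momentum observables all together by $\mathfrak{C}(\mathcal{K})$. $\mathfrak{C}(\mathcal{Q})$ is the object of primary interest; but many aspects of its structure are most readily seen within the larger algebra $\mathfrak{C}(\mathcal{K})$. This algebra can be concretely represented as the algebra of observables for a spin chain as follows. We define the matrices
$$I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad U = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \qquad Q = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad P = \begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix}.$$
Let $\mathcal{H}$ denote the $n$-fold tensor product of $\mathbb{C}^2$ with itself,
$$\mathcal{H} = \mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2 \quad (n \text{ times}),$$
and on $\mathcal{H}$ define the operators
$$Q_j = U \otimes \cdots \otimes U \otimes Q \otimes I \otimes \cdots \otimes I, \qquad P_j = U \otimes \cdots \otimes U \otimes P \otimes I \otimes \cdots \otimes I, \tag{2.2}$$
where the $Q$ and the $P$ occur in the $j$th places.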
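The spin-chain construction (2.2) can be checked numerically for small $n$. The following is a minimal sketch in plain Python, not part of the original paper. It assumes the $2 \times 2$ matrices $U = \mathrm{diag}(1,-1)$, $Q$ the off-diagonal matrix of ones, and $P$ with entries $i$ and $-i$ off the diagonal (chosen so that $PQ = iU$); the helper names `kron`, `chain`, `fermion_ops` and `check_car` are illustrative only.

```python
from itertools import product

# 2x2 building blocks. P is chosen so that P Q = i U (an assumption of this sketch).
I2 = [[1, 0], [0, 1]]
U = [[1, 0], [0, -1]]
Q = [[0, 1], [1, 0]]
P = [[0, 1j], [-1j, 0]]

def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    nb = len(b)
    n = len(a) * nb
    return [[a[r // nb][c // nb] * b[r % nb][c % nb] for c in range(n)]
            for r in range(n)]

def matmul(a, b):
    """Plain matrix product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def chain(factors):
    """Tensor product of a list of 2x2 factors, leftmost factor first."""
    out = factors[0]
    for f in factors[1:]:
        out = kron(out, f)
    return out

def fermion_ops(n):
    """Operators Q_j, P_j as in (2.2): U's left of site j, identities right of it."""
    qs = [chain([U] * j + [Q] + [I2] * (n - j - 1)) for j in range(n)]
    ps = [chain([U] * j + [P] + [I2] * (n - j - 1)) for j in range(n)]
    return qs, ps

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][c] + ba[r][c] for c in range(n)] for r in range(n)]

def is_scalar(m, s):
    """True if m equals s times the identity, up to rounding."""
    n = len(m)
    return all(abs(m[r][c] - (s if r == c else 0)) < 1e-12
               for r in range(n) for c in range(n))

def check_car(n):
    """Check the anticommutation relations (2.1) for n degrees of freedom."""
    qs, ps = fermion_ops(n)
    for j, k in product(range(n), repeat=2):
        d = 2 if j == k else 0
        if not is_scalar(anticommutator(qs[j], qs[k]), d):
            return False
        if not is_scalar(anticommutator(ps[j], ps[k]), d):
            return False
        if not is_scalar(anticommutator(ps[j], qs[k]), 0):
            return False
    return True

if __name__ == "__main__":
    print(check_car(3))
```

The strings of $U$ factors to the left of site $j$ are what turn commuting single-site operators into anticommuting ones; replacing them by identities gives commuting operators on different sites instead.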
The operators $Q_1, \ldots, Q_n, P_1, \ldots, P_n$ just defined are easily seen to satisfy the canonical anticommutation relations. Of great use in studying the algebra that they generate is the fact that it is also the algebra generated by $n$ "hard core boson" degrees of freedom. More explicitly, put
$$\hat{Q}_j = I \otimes \cdots \otimes I \otimes Q \otimes I \otimes \cdots \otimes I, \qquad \hat{P}_j = I \otimes \cdots \otimes I \otimes P \otimes I \otimes \cdots \otimes I, \tag{2.3}$$
and call the algebra generated by the operators $\hat{Q}_1, \ldots, \hat{Q}_n, \hat{P}_1, \ldots, \hat{P}_n$ the hard core boson algebra, and denote it by $\hat{\mathfrak{C}}(\mathcal{K})$. To see that the two algebras coincide, put
$$U_j = I \otimes \cdots \otimes I \otimes U \otimes I \otimes \cdots \otimes I \quad\text{and}\quad V_k = \prod_{j=1}^{k-1} U_j. \tag{2.4}$$
Then since $P_jQ_j = \hat{P}_j\hat{Q}_j = iU_j$, each $V_k$ belongs to both the hard core boson algebra and $\mathfrak{C}(\mathcal{K})$. Moreover
$$Q_k = V_k\hat{Q}_k \quad\text{and}\quad P_k = V_k\hat{P}_k \tag{2.5}$$
with the inverse relation given as well by
$$\hat{Q}_k = V_kQ_k \quad\text{and}\quad \hat{P}_k = V_kP_k. \tag{2.6}$$
Thus, $Q_k$, $P_k$ are in $\hat{\mathfrak{C}}(\mathcal{K})$ and $\hat{Q}_k$, $\hat{P}_k$ are in $\mathfrak{C}(\mathcal{K})$.

What we call the hard core boson algebra was initially introduced by Jordan and Klein [JoKl] as a first attempt to implement the Pauli exclusion principle mathematically. The transformation of observables (2.6) was discovered by Jordan and Wigner [JoWi] and used by them to write the algebraic relations characterizing the algebra in the familiar covariant form (2.1); today it is known as the Jordan-Wigner transform. It has been used many times since, for example, in a solution of the two dimensional Ising model by Schultz, Mattis and Lieb, whose paper [SML] can be consulted for references to other applications. It is also the key to Brauer and Weyl's treatment [BrWe] of the Clifford algebra, on which the exposition in the rest of this section is largely based. Following them, we now show that $\mathfrak{C}(\mathcal{K})$ is the full matrix algebra on $\mathcal{H}$.
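First, we introduce a basis in $\mathcal{H}$. Let
$$e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad\text{and}\quad e_{-1} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$
be the standard basis of $\mathbb{C}^2$. For each $j = 1, \ldots, n$, let $\alpha_j$ be either $1$ or $-1$. Then the unit vectors
$$e_{\alpha_1, \ldots, \alpha_n} = e_{\alpha_1} \otimes \cdots \otimes e_{\alpha_n}$$
provide a natural orthonormal basis for $\mathcal{H}$. Next introduce the fermion creation and annihilation operators $c_j^* = \tfrac12(Q_j - iP_j)$ and $c_j = \tfrac12(Q_j + iP_j)$ and their hard core boson analogs $\hat{c}_j^* = \tfrac12(\hat{Q}_j - i\hat{P}_j)$ and $\hat{c}_j = \tfrac12(\hat{Q}_j + i\hat{P}_j)$. Now put $L(\alpha_1, \ldots, \alpha_n) = \prod_{j=1}^n B_j$, where $B_j = \hat{c}_j$ if $\alpha_j = 1$ and $B_j = \hat{c}_j\hat{c}_j^*$ if $\alpha_j = -1$. Then
$$L(\alpha_1, \ldots, \alpha_n)\, e_{\alpha_1, \ldots, \alpha_n} = \Omega, \quad\text{where}\quad \Omega = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \cdots \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} \quad (n \text{ times}) \tag{2.7}$$
is a distinguished unit vector in $\mathcal{H}$ called the ground state. Also, $L(\alpha_1, \ldots, \alpha_n)$ annihilates all the other basis vectors. Moreover, $L^*(\alpha_1, \ldots, \alpha_n)\,\Omega = e_{\alpha_1, \ldots, \alpha_n}$. Thus,
$$L^*(\beta_1, \ldots, \beta_n)\, L(\alpha_1, \ldots, \alpha_n)\, e_{\alpha_1, \ldots, \alpha_n} = e_{\beta_1, \ldots, \beta_n}$$
and this operator annihilates all other basis vectors. Manifestly this operator belongs to $\hat{\mathfrak{C}}(\mathcal{K})$ and hence to $\mathfrak{C}(\mathcal{K})$ as well, and the $2^{2n}$ operators of this kind form a basis for the full matrix algebra on $\mathcal{H}$.

This concrete description of $\mathfrak{C}(\mathcal{K})$ in terms of spin-chain observables is the most useful for many purposes. Still, it is also useful to have a characterization of $\mathfrak{C}(\mathcal{K})$ which is less dependent on coordinates; i.e. on the choice of fermi configuration observables $Q_1, \ldots, Q_n$. Toward this end, consider the standard $n$-dimensional Hilbert space $\mathbb{C}^n$ equipped with its standard inner product $(\cdot, \cdot)$ and complex conjugation. Let $\mathcal{K}$ denote $\mathbb{C}^n$ considered as a real $2n$-dimensional Hilbert space equipped with the inner product $\langle x, y \rangle_{\mathcal{K}} = \Re(x, y)$.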
Then complex conjugation on $\mathbb{C}^n$ induces an involutory orthogonal transformation $J$ on $\mathcal{K}$. Let $\mathcal{Q}$ and $\mathcal{P}$ respectively denote the eigenspaces of $J$ corresponding to the eigenvalues $+1$ and $-1$. The bilinear form on $\mathcal{K}$ given by $\Im(x, y)$ is symplectic so that $\mathcal{K}$ is naturally endowed with the structure possessed by the classical phase space of a system of $n$ linear degrees of freedom. $\mathcal{Q}$ is called the configuration space, $\mathcal{P}$ the momentum space, and the complex conjugation $J$ is usually identified with time reversal.

The Clifford algebra $\mathfrak{C}(\mathcal{K})$ is characterized up to automorphism as the algebra with unit $I$ such that:

(i) There is a linear imbedding $\mathcal{J}: \mathcal{K} \to \mathfrak{C}(\mathcal{K})$, and $\mathcal{J}(\mathcal{K})$ generates $\mathfrak{C}(\mathcal{K})$.

(ii) For all $x, y \in \mathcal{K}$,
$$\mathcal{J}(x)\mathcal{J}(y) + \mathcal{J}(y)\mathcal{J}(x) = 2\langle x, y \rangle_{\mathcal{K}}. \tag{2.8}$$
To make contact with our previous concrete description, let $\{q_1, \ldots, q_n\}$ be an orthonormal basis of $\mathbb{C}^n$ consisting of purely real vectors. For each $j$, let $p_j = iq_j$. Evidently $\mathcal{Q}$ is spanned by $\{q_1, \ldots, q_n\}$ and $\mathcal{P}$ is spanned by $\{p_1, \ldots, p_n\}$. Any $x \in \mathcal{K}$ can then be written as $x = \sum_j \xi_jq_j + \sum_j \eta_jp_j$. Using the notation introduced above, define $\mathcal{J}: \mathcal{K} \to \mathfrak{C}(\mathcal{K})$ by
$$\mathcal{J}(x) = \sum_{j=1}^n \xi_jQ_j + \sum_{j=1}^n \eta_jP_j.$$
Let $\{x_1, \ldots, x_{2n}\}$ be any orthonormal basis of $\mathcal{K}$. Then the monomials
$$\mathcal{J}(x_{\alpha_1})\,\mathcal{J}(x_{\alpha_2}) \cdots \mathcal{J}(x_{\alpha_k})$$
together with $I$ form a basis for the algebra. It is easy to see that the product of any two such monomials is a third. Though the multiplication rule can be simply expressed in terms of certain contraction rules, its precise form is not useful to us here. What is useful to observe is that since the right side of (2.8) is invariant under orthogonal transformations of $\mathcal{K}$, the multiplication law of these monomials does not depend on the choice of the orthonormal basis. For this reason, any orthogonal transformation $R$ of $\mathcal{K}$ induces an automorphism of $\mathfrak{C}(\mathcal{K})$ which we shall also denote by $R$. Indeed, with $R: \mathfrak{C}(\mathcal{K}) \to \mathfrak{C}(\mathcal{K})$ defined by
$$R\big(\mathcal{J}(x_{\alpha_1}) \cdots \mathcal{J}(x_{\alpha_k})\big) = \mathcal{J}(R(x_{\alpha_1})) \cdots \mathcal{J}(R(x_{\alpha_k})),$$
$R$ is evidently invertible, and by the remarks made just above one easily sees that for all $A, B \in \mathfrak{C}(\mathcal{K})$, $R(AB) = R(A)R(B)$, which is to say that $R$ is an automorphism.
Finally, we remark that $\mathfrak{C}(\mathcal{K})$ is a $*$-algebra; there is a unique conjugate linear involutory antiautomorphism $A \mapsto A^*$ which is the identity on $\mathcal{J}(\mathcal{K})$. It is given by
$$\big(\mathcal{J}(x_{\alpha_1}) \cdots \mathcal{J}(x_{\alpha_k})\big)^* = \mathcal{J}(x_{\alpha_k}) \cdots \mathcal{J}(x_{\alpha_1}).$$
Of course, on $\mathfrak{C}(\mathcal{K})$ regarded as a matrix algebra, this is just the usual adjoint. Evidently the automorphism $R$ of $\mathfrak{C}(\mathcal{K})$ induced by an orthogonal transformation $R$ of $\mathcal{K}$ is a $*$-automorphism; i.e. $R(A^*) = (R(A))^*$.

The facts that orthogonal transformations of $\mathcal{K}$ induce automorphisms of $\mathfrak{C}(\mathcal{K})$, that $\mathfrak{C}(\mathcal{K})$ is a full matrix algebra, and that all automorphisms of full matrix algebras are inner, i.e. of the form $A \mapsto SAS^{-1}$ for some nonsingular matrix $S$, are the basis of Brauer and Weyl's treatment of the spin representations of the orthogonal groups [BrWe]. We will use the fact that all automorphisms of $\mathfrak{C}(\mathcal{K})$ are inner several times in what follows.

III. Analysis on the Clifford Algebra
Let $\mathcal{A}$ be a von Neumann algebra of operators on some finite dimensional Hilbert space. By a trace on $\mathcal{A}$ we shall mean a linear functional $\tau$ which is positive in the sense that $\tau(A^*A) > 0$ for all non zero $A$ in $\mathcal{A}$, and cyclic in the sense that $\tau(AB) = \tau(BA)$ for all $A$ and $B$ in $\mathcal{A}$. Such a functional is evidently unitarily invariant in the sense that whenever $A$ and $U$ belong to $\mathcal{A}$, and $U$ is unitary, then $\tau(U^*AU) = \tau(A)$. Since $\mathfrak{C}(\mathcal{K})$ is a full matrix algebra, it contains all unitaries. Hence any trace on $\mathfrak{C}(\mathcal{K})$ must assign the same value to all rank one projections, and thus must be a scalar multiple of the standard trace $\mathrm{Tr}$ on the matrix algebra. Henceforth, $\tau$ shall denote this trace normalized by the condition that $\tau(I) = 1$, and $\mathrm{Tr}$ shall denote the standard unnormalized trace.

In the non-commutative integration theories of Dixmier [Di53] and Segal [Se53], the trace functional $\tau$ is the non-commutative analog of the functional that assigns to an integrable function its integral. When the Hilbert space is infinite dimensional, some further regularity properties are required of $\tau$ in order to obtain a useful analog. Since all of our estimations will be carried out in the finite dimensional setting, we shall not go into this here, but shall simply refer the reader to these original papers as well as the accounts in [Gr72] and [Ne74].

Norms on $\mathfrak{C}(\mathcal{K})$ which are the non-commutative analogs of the $L^p$ norms can now be introduced; namely for $1 \le p < \infty$ we put
$$\|A\|_p = \left(\tau\big((A^*A)^{p/2}\big)\right)^{1/p} \tag{3.1}$$
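and denote the operator norm of $A$ by $\|A\|_\infty$. $\mathfrak{C}^p(\mathcal{K})$ shall denote $\mathfrak{C}(\mathcal{K})$ equipped with the norm $\|\cdot\|_p$; evidently $\mathfrak{C}^2(\mathcal{K})$ is the Hilbert space of $2^n \times 2^n$ matrices equipped with the Hilbert-Schmidt norm. Consider the monomials
$$E_{(\alpha_1, \ldots, \alpha_j; \beta_1, \ldots, \beta_k)} = Q_{\alpha_1} \cdots Q_{\alpha_j} P_{\beta_1} \cdots P_{\beta_k},$$
where $\alpha_1 > \cdots > \alpha_j$ and $\beta_1 > \cdots > \beta_k$ and $j + k > 0$. Evidently
$$E^*_{(\alpha_1, \ldots, \alpha_j; \beta_1, \ldots, \beta_k)}\, E_{(\alpha_1, \ldots, \alpha_j; \beta_1, \ldots, \beta_k)} = I$$
and thus $\|E_{(\alpha_1, \ldots, \alpha_j; \beta_1, \ldots, \beta_k)}\|_p = 1$ for all $p$. Moreover
$$\tau\big(E_{(\alpha_1, \ldots, \alpha_j; \beta_1, \ldots, \beta_k)}\big) = 0.$$
To see this, first consider the case in which $j + k$ is odd. The inversion $x \mapsto -x$ on $\mathcal{K}$ is orthogonal. Hence it induces an automorphism of $\mathfrak{C}(\mathcal{K})$, and hence there is an invertible $S$ in $\mathfrak{C}(\mathcal{K})$ so that
$$-E_{(\alpha_1, \ldots, \alpha_j; \beta_1, \ldots, \beta_k)} = S\,E_{(\alpha_1, \ldots, \alpha_j; \beta_1, \ldots, \beta_k)}\,S^{-1}.$$
Then, by using cyclicity of the trace we get the desired result. Next consider the case in which $j + k$ is even and, say, $j > 0$. Then write $E_{(\alpha_1, \ldots, \alpha_j; \beta_1, \ldots, \beta_k)} = Q_{\alpha_1}X$, and note that by (2.1), $Q_{\alpha_1}X = -XQ_{\alpha_1}$. Again the desired conclusion follows from cyclicity of the trace. It is easy to see from this that
$$\tau\big(E^*_{(\alpha_1, \ldots, \alpha_j; \beta_1, \ldots, \beta_k)}\, E_{(\gamma_1, \ldots, \gamma_l; \delta_1, \ldots, \delta_m)}\big) = 0 \tag{3.4}$$
unless the two monomials coincide. Thus, together with the identity, the assemblage of such monomials forms an orthonormal basis for $\mathfrak{C}^2(\mathcal{K})$. Finally observe that since $Qe_{-1} = e_1$,
$$\langle \Omega, E_{(\alpha_1, \ldots, \alpha_j)}\,\Omega \rangle_{\mathcal{H}} = 0 \tag{3.5}$$
whenever $j \ge 1$, and, as indicated, $k = 0$. It now follows that, restricted to $\mathfrak{C}(\mathcal{Q})$,
$$\tau(A) = \langle \Omega, A\,\Omega \rangle_{\mathcal{H}} \tag{3.6}$$
for all $A$ in $\mathfrak{C}(\mathcal{Q})$. Formula (3.6) is very important for us. It permits us to calculate the "physically" relevant quantity $\langle \Omega, A\,\Omega \rangle_{\mathcal{H}}$ in terms of the apparently mathematically simpler quantity $\tau(A)$.

Many familiar inequalities for $L^p$ norms hold for the $\mathfrak{C}^p$ norms as well [Di53].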
This is true in particular of the Hölder inequality
$$\|AB\|_r \le \|A\|_p\|B\|_q, \qquad \frac{1}{r} = \frac{1}{p} + \frac{1}{q}.$$
Certain optimal inequalities expressing the uniform convexity properties of the $L^p$ norms also hold for the $\mathfrak{C}^p$ norms, and this fact constitutes one cornerstone of our analysis. The modulus of convexity $\delta_p$ of $\mathfrak{C}^p(\mathcal{K})$ is defined by
$$\delta_p(\varepsilon) = \inf\left\{1 - \tfrac12\|A + B\|_p : \|A\|_p = \|B\|_p = 1,\ \|A - B\|_p = \varepsilon\right\} \tag{3.7}$$
for $0 < \varepsilon < 2$. For $1 < p < \infty$, $\delta_p$ is always positive, which means these norms are uniformly convex. Useful geometric information is contained in the rate at which $\delta_p(\varepsilon)$ tends to zero with $\varepsilon$. It is known [TJ74] that for $2 < p < \infty$, $\delta_p(\varepsilon) \sim \varepsilon^p$, but that for $1 < p \le 2$, $\delta_p(\varepsilon) \sim \varepsilon^2$. An optimal expression of this fact is given by the following theorem which was proved jointly with Keith Ball [BCL]:
Theorem 1. (Optimal 2-uniform convexity for matrices). For all $m \times m$ matrices $A$ and $B$ and all $p$ for $1 \le p \le 2$,
$$\left(\frac{\mathrm{Tr}|A + B|^p + \mathrm{Tr}|A - B|^p}{2}\right)^{2/p} \ge \big(\mathrm{Tr}|A|^p\big)^{2/p} + (p - 1)\big(\mathrm{Tr}|B|^p\big)^{2/p}. \tag{3.8}$$
For $1 < p < 2$, there is equality only when $B = 0$.

This result, which we interpret here as a statement about $\mathfrak{C}^p(\mathcal{K})$, is proved in the appendix in the special case that both $A + B$ and $A - B$ are positive; this is the only case in which we shall use it here, and the proof is considerably simpler in this case. The full result is proved in [BCL], in which other geometric inequalities for trace norms are proved as well. The theorem implies that
$$\delta_p(\varepsilon) \ge \frac{p - 1}{2}\left(\frac{\varepsilon}{2}\right)^2 \quad\text{for}\quad 1 < p \le 2,$$
as one sees by considering $A = (C + D)/2$, $B = (C - D)/2$, $\|C\|_p = \|D\|_p = 1$ and $\|C - D\|_p = \varepsilon$. It is easily seen that the constant $(p - 1)/8$ cannot be improved.
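In the simplest case $m = 1$, where $A$ and $B$ are real numbers $a$ and $b$, inequality (3.8) reduces to a two-point inequality that is easy to probe numerically. The sketch below is an illustration only, not the argument of the paper; the function names `lhs`, `rhs` and `worst_gap`, the grid $[-2, 2]^2$, and the trial constant `c` are assumptions made here. It also illustrates that a constant larger than $p - 1$ fails for small $b$.

```python
def lhs(a, b, p):
    """Left side of (3.8) for 1x1 matrices: ((|a+b|^p + |a-b|^p) / 2)^(2/p)."""
    return ((abs(a + b) ** p + abs(a - b) ** p) / 2) ** (2 / p)

def rhs(a, b, p, c=None):
    """Right side of (3.8) for 1x1 matrices, with trial constant c (default p - 1)."""
    if c is None:
        c = p - 1
    return abs(a) ** 2 + c * abs(b) ** 2

def worst_gap(p, steps=200):
    """Smallest value of lhs - rhs over a uniform grid of (a, b) in [-2, 2]^2."""
    grid = [-2 + 4 * i / steps for i in range(steps + 1)]
    return min(lhs(a, b, p) - rhs(a, b, p) for a in grid for b in grid)

if __name__ == "__main__":
    for p in (1.0, 1.25, 1.5, 1.75, 2.0):
        # A gap that is non-negative up to rounding is consistent with (3.8).
        print(p, worst_gap(p))
    # With a constant slightly larger than p - 1 the inequality fails for small b.
    print(lhs(1.0, 0.01, 1.5) - rhs(1.0, 0.01, 1.5, c=0.6))
```

A grid search of this kind can only fail to find a counterexample; it is not a proof, and it says nothing about the genuinely non-commutative case $m > 1$.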
We make our main application of this result in Sect. V. There we will also need to know that the norms on $\mathfrak{C}^p(\mathcal{K})$ are continuously differentiable away from the origin for $1 < p < \infty$. This is known [Gr75], but a simple proof can be based on inequalities of the form $\delta_p(\varepsilon) \ge K\varepsilon^{r(p)}$ such as we have found above for $1 < p \le 2$. This proof, moreover, gives the modulus of continuity of the derivative, and is sketched in the appendix as well. Again, these estimates are independent of the dimension and therefore apply to the case of infinitely many degrees of freedom.
IV. Conditional Expectations and the Fermion Oscillator Semigroup

We are particularly concerned with the subalgebra $\mathfrak{C}(\mathcal{Q})$ of $\mathfrak{C}(\mathcal{K})$, and the conditional expectation [Di53, Um54] with respect to it shall play a basic role in our investigation. For any $A$ in $\mathfrak{C}(\mathcal{K})$, the conditional expectation $\pi_{\mathcal{Q}}(A)$ of $A$ with respect to $\mathfrak{C}(\mathcal{Q})$ is defined to be the unique element of $\mathfrak{C}(\mathcal{Q})$ such that $\tau(B^*\pi_{\mathcal{Q}}(A)) = \tau(B^*A)$ for all $B$ in $\mathfrak{C}(\mathcal{Q})$. Otherwise said, $\pi_{\mathcal{Q}}$ is the orthogonal projection from $\mathfrak{C}^2(\mathcal{K})$ onto $\mathfrak{C}^2(\mathcal{Q})$. It is well known that the conditional expectation is positivity preserving; a familiar argument shows that $\pi_{\mathcal{Q}}(A^*A) \ge \pi_{\mathcal{Q}}(A)^*\pi_{\mathcal{Q}}(A)$. We can use the conditional expectation to give a useful expression for the oscillator semigroup for fermion fields.

Let $R_\theta$ be the orthogonal transformation of $\mathcal{K}$ given by
$$R_\theta(q_j) = (\cos\theta)\,q_j + (\sin\theta)\,p_j \tag{4.1}$$
for each $j$. Of course $R_\theta$ gives the evolution at time $\theta$ on phase space $\mathcal{K}$ generated by the classical oscillator Hamiltonian $H(p, q) = \tfrac12\sum_j (p_j^2 + q_j^2)$. Let $R_\theta$ also denote the automorphism of $\mathfrak{C}(\mathcal{K})$ generated by the orthogonal transformation $R_\theta$ as described in Sect. II. For each $t > 0$, define $\theta(t) = \arccos(e^{-t})$ and define the operator $P_t$ on $\mathfrak{C}(\mathcal{Q})$ by
$$P_tA = \pi_{\mathcal{Q}} \circ R_{\theta(t)} \circ \pi_{\mathcal{Q}}^*A. \tag{4.2}$$
Note that $\pi_{\mathcal{Q}}^*$ is the natural imbedding of $\mathfrak{C}(\mathcal{Q})$ into $\mathfrak{C}(\mathcal{K})$, and regarded as such, it is a $*$-automorphism. Formula (4.2) is the analog of the familiar expression for the boson oscillator semigroup on $L^2(\mathcal{Q}, (2\pi)^{-n/2}e^{-q^2/2}\,d^nq)$, i.e. the Mehler semigroup
$$P_t^{(\mathrm{boson})}A(q) = \int A\big(e^{-t}q + (1 - e^{-2t})^{1/2}p\big)\,(2\pi)^{-n/2}e^{-p^2/2}\,d^np.$$
Note that since all of the operators on the right in (4.2) are positivity preserving, so is $P_t$. Also, since the first two operations on the right preserve the $\mathfrak{C}^p$-norms, and since the conditional expectation is readily seen to be a contraction from $\mathfrak{C}^p(\mathcal{K})$ to $\mathfrak{C}^p(\mathcal{Q})$ for each $p$, it is readily seen that $P_t$ possesses this property as well.
To obtain a more familiar expression for $P_t$ note that
$$R_{\theta(t)}\big(E_{(\alpha_1, \ldots, \alpha_k)}\big) = e^{-kt}E_{(\alpha_1, \ldots, \alpha_k)} + (\text{terms annihilated by } \pi_{\mathcal{Q}}).$$
Hence
$$P_t\big(E_{(\alpha_1, \ldots, \alpha_k)}\big) = e^{-kt}E_{(\alpha_1, \ldots, \alpha_k)}. \tag{4.3}$$
Evidently $\{P_t : t \ge 0\}$ is generated by $H_0$, where $H_0E_{(\alpha_1, \ldots, \alpha_k)} = kE_{(\alpha_1, \ldots, \alpha_k)}$. It is easy to see that under the unitary equivalence between $\mathfrak{C}^2(\mathcal{Q})$ and fermion Fock space $\mathcal{F}$ described by Segal [Se56], $H_0$ is equivalent to the usual number operator, or in other words, the oscillator Hamiltonian on $\mathcal{F}$.

Our primary goal is to prove optimal hypercontractivity bounds for $P_t$. That is, given $1 < p < q < \infty$ we want to show that for some finite $t$, $P_t$ is a contraction from $\mathfrak{C}^p(\mathcal{Q})$ to $\mathfrak{C}^q(\mathcal{Q})$, and to find the smallest such $t$. Let
$$\|P_t\|_{p \to q} = \sup\{\|P_tA\|_q : \|A\|_p = 1\}. \tag{4.4}$$
As a first reduction, we shall show that the supremum on the right in (4.4) can be restricted to the positive operators $A$ with $\|A\|_p = 1$. In the boson case this follows immediately from the fact that, in ordinary probability theory, the absolute value of a conditional expectation is no greater than the conditional expectation of the absolute value.

In general, matters concerning the absolute value in the non-commutative setting are more troublesome than in the commutative setting. An example is provided by the Araki-Yamagami inequality [ArYa] which, specialized to our context, asserts that the map $A \mapsto |A|$ is Lipschitz continuous on $\mathfrak{C}^2(\mathcal{K})$ with constant $\sqrt{2}$ instead of the constant $1$ which we would have in the commutative setting.

Thus while the conditional expectation in an operator algebra has many properties analogous to those of the conditional expectation in ordinary probability theory [Um54], it is not in general true that $|\pi_{\mathcal{Q}}(A)|$ will be a smaller operator than $\pi_{\mathcal{Q}}(|A|)$. The following theorem expresses a useful property in this direction which does hold, and after proving it we shall show by example that stronger properties do not hold. The theorem and its proof are easily extended to a more general von Neumann algebra setting by the methods in [Ru72].

Theorem 2. (A Schwarz inequality for conditional expectations). For all $A$ in $\mathfrak{C}(\mathcal{K})$ and all $p$ with $1 \le p \le \infty$,
$$\|\pi_{\mathcal{Q}}(A)\|_p \le \|\pi_{\mathcal{Q}}(|A|)\|_p^{1/2}\,\|\pi_{\mathcal{Q}}(|A^*|)\|_p^{1/2}. \tag{4.5}$$
Remark. If we let $F(A)$ denote $\|\pi_{\mathcal{Q}}A\|_p$, then the same argument which we shall use to prove Theorem 2 also establishes that
$$F(A^*B) \le F(A^*A)^{1/2}F(B^*B)^{1/2}. \tag{4.6}$$
In this form, the term "Schwarz inequality," by which we referred to (4.5), is more evidently appropriate. Moreover, inequalities of the type (4.6) are well known in matrix analysis for many familiar functions; for example when $F(A)$ is the determinant of $A$ or the spectral radius of $A$. Further examples can be found in [MeDS]. In [Li76] it is shown that for a function $F$ that satisfies (4.6), and which is monotone increasing, i.e. satisfies $F(B) \ge F(A)$ for all $B \ge A \ge 0$, the following inequalities hold:
$$F\Big(\sum_{j=1}^m A_j^*B_j\Big) \le F\Big(\sum_{j=1}^m A_j^*A_j\Big)^{1/2}F\Big(\sum_{j=1}^m B_j^*B_j\Big)^{1/2}$$
and
$$F\Big(\sum_{j=1}^m A_j\Big) \le F\Big(\sum_{j=1}^m |A_j|\Big)^{1/2}F\Big(\sum_{j=1}^m |A_j^*|\Big)^{1/2}.$$
In particular, these inequalities hold for $F(A) = \|\pi_{\mathcal{Q}}A\|_p$. Specializing the last inequality to the case $m = 1$ then yields (4.5). In our present case however, the proof of (4.6) is essentially the same as the direct proof of (4.5). Nonetheless, it should not be considered novel that by taking $|A^*|$ into consideration as well as $|A|$, we can obtain a suitable bound on $\|\pi_{\mathcal{Q}}A\|_p$.
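Proof. Let $A = U|A|$ be the polar decomposition of $A$. Then $\|\pi_{\mathcal{Q}}(A)\|_p = \tau(CU|A|)$ for some $C$ in $\mathfrak{C}(\mathcal{Q})$ with $\|C\|_{p'} = 1$. Let $C = V|C|$ be the polar decomposition of $C$. Both $V$ and $|C|$ belong to $\mathfrak{C}(\mathcal{Q})$ as well. Thus
$$\begin{aligned}
\|\pi_{\mathcal{Q}}(A)\|_p &= \tau\big(CU|A|^{1/2}|A|^{1/2}\big) = \tau\big(|C|^{1/2}U|A|^{1/2}|A|^{1/2}V|C|^{1/2}\big) \\
&\le \tau\big(|C|^{1/2}U|A|U^*|C|^{1/2}\big)^{1/2}\,\tau\big(|C|^{1/2}V^*|A|V|C|^{1/2}\big)^{1/2} \\
&= \tau\big(|C|(U|A|U^*)\big)^{1/2}\,\tau\big((V|C|V^*)|A|\big)^{1/2} \\
&\le \|\pi_{\mathcal{Q}}(U|A|U^*)\|_p^{1/2}\,\|\pi_{\mathcal{Q}}(|A|)\|_p^{1/2}.
\end{aligned}$$
Finally, we note that $U|A|U^* = |A^*|$. $\square$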
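Example. Let $A$ be the matrix
$$A = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}. \quad\text{Then}\quad |A| = \sqrt{2}\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \quad\text{and}\quad |A^*| = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}.$$
Note that $|A^*|$ is in $\mathfrak{C}(\mathcal{Q})$, but $A$ and $|A|$ are not. One easily finds
$$\pi_{\mathcal{Q}}(A) = \frac12\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \qquad \pi_{\mathcal{Q}}(|A|) = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$
$$\|\pi_{\mathcal{Q}}(A)\|_p = 2^{-1/p}, \qquad \|\pi_{\mathcal{Q}}(|A|)\|_p = 2^{-1/2}, \qquad\text{and}\qquad \|\pi_{\mathcal{Q}}(|A^*|)\|_p = 2^{1/2 - 1/p},$$
so that $\|\pi_{\mathcal{Q}}(A)\|_p > \|\pi_{\mathcal{Q}}(|A|)\|_p$ for $p > 2$.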
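The one-degree-of-freedom example can be reproduced with a few lines of arithmetic. The sketch below is an illustration under stated assumptions, not taken from the paper: it takes $A$ to be the $2 \times 2$ matrix with rows $(0, 1)$ and $(0, 1)$, takes $\mathfrak{C}(\mathcal{Q})$ to be the span of the identity and $Q$, and computes the conditional expectation as the projection $M \mapsto \tau(M)\,I + \tau(QM)\,Q$ for the normalized trace $\tau$. All function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)
Q = [[0.0, 1.0], [1.0, 0.0]]
A = [[0.0, 1.0], [0.0, 1.0]]            # assumed form of the example matrix
ABS_A = [[0.0, 0.0], [0.0, SQRT2]]      # |A| = (A^T A)^(1/2)
ABS_A_STAR = [[1 / SQRT2, 1 / SQRT2],
              [1 / SQRT2, 1 / SQRT2]]   # |A^*| = (A A^T)^(1/2)

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def tau(m):
    """Normalized trace on 2x2 matrices."""
    return (m[0][0] + m[1][1]) / 2

def cond_exp(m):
    """Projection of a real 2x2 matrix onto span{I, Q} w.r.t. <X, Y> = tau(X^T Y)."""
    a, b = tau(m), tau(matmul(Q, m))
    return [[a, b], [b, a]]

def p_norm_sym(m, p):
    """p-norm (normalized trace) of a real symmetric 2x2 matrix via its eigenvalues."""
    mean = (m[0][0] + m[1][1]) / 2
    rad = math.hypot((m[0][0] - m[1][1]) / 2, m[0][1])
    return ((abs(mean + rad) ** p + abs(mean - rad) ** p) / 2) ** (1 / p)

def schwarz_sides(p):
    """Both sides of the Schwarz inequality (4.5) for this particular A."""
    left = p_norm_sym(cond_exp(A), p)
    right = math.sqrt(p_norm_sym(cond_exp(ABS_A), p)
                      * p_norm_sym(cond_exp(ABS_A_STAR), p))
    return left, right

if __name__ == "__main__":
    for p in (1, 2, 4, 8):
        print(p, schwarz_sides(p), p_norm_sym(cond_exp(ABS_A), p))
```

For $p > 2$ the first number in each pair exceeds the norm of the conditional expectation of $|A|$ alone, while staying below the geometric mean that also involves $|A^*|$.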
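Theorem 3. ($P_t$ has positive maximizers). The norm of $P_t$ from $\mathfrak{C}^p(\mathcal{Q})$ to $\mathfrak{C}^q(\mathcal{Q})$ is achieved on the positive operators; i.e.
$$\|P_t\|_{p \to q} = \sup\{\|P_tA\|_q : A \ge 0 \text{ and } \|A\|_p = 1\}. \tag{4.7}$$
Proof. Since $R_\theta$ and $\pi_{\mathcal{Q}}^*$ are both $*$-automorphisms, $|R_{\theta(t)} \circ \pi_{\mathcal{Q}}^*A| = R_{\theta(t)} \circ \pi_{\mathcal{Q}}^*|A|$ and $|(R_{\theta(t)} \circ \pi_{\mathcal{Q}}^*A)^*| = R_{\theta(t)} \circ \pi_{\mathcal{Q}}^*|A^*|$. Thus, by Theorem 2,
$$\|P_tA\|_q \le \|P_t|A|\|_q^{1/2}\,\|P_t|A^*|\|_q^{1/2},$$
and of course $\||A^*|\|_p = \||A|\|_p = \|A\|_p$. $\square$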
V. Hypercontractivity for Fermions

Our main result is the following theorem which is established in this section.

Theorem 4. (Optimal fermion hypercontractivity). For all $1 < p \le q < \infty$, $\|P_t\|_{p \to q} = 1$ exactly when
$$e^{-2t} \le \frac{p - 1}{q - 1}.$$
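For a single degree of freedom the statement of Theorem 4 can be explored directly, since $1 + aQ$ has eigenvalues $1 \pm a$ and, by the action of $P_t$ on monomials, $P_t(1 + aQ) = 1 + e^{-t}aQ$. The following sketch is a numerical illustration only; the amplitude `a`, the grid, and the function names are assumptions introduced here. It compares $\|P_t(1 + aQ)\|_q$ with $\|1 + aQ\|_p$ at the critical time and slightly below it.

```python
import math

def norm_p(a, p):
    """p-norm of 1 + a*Q for the normalized trace; the eigenvalues are 1 + a and 1 - a."""
    return ((abs(1 + a) ** p + abs(1 - a) ** p) / 2) ** (1 / p)

def ratio(a, t, p, q):
    """||P_t(1 + aQ)||_q / ||1 + aQ||_p, using P_t(1 + aQ) = 1 + exp(-t)*a*Q."""
    return norm_p(math.exp(-t) * a, q) / norm_p(a, p)

def critical_time(p, q):
    """The time t with exp(-2t) = (p - 1)/(q - 1)."""
    return 0.5 * math.log((q - 1) / (p - 1))

def max_ratio(t, p, q, steps=200):
    """Largest ratio over a uniform grid of amplitudes a in [-3, 3]."""
    grid = [-3 + 6 * i / steps for i in range(steps + 1)]
    return max(ratio(a, t, p, q) for a in grid)

if __name__ == "__main__":
    p, q = 2.0, 4.0
    tc = critical_time(p, q)
    print(max_ratio(tc, p, q))          # should not exceed 1, up to rounding
    print(ratio(0.05, 0.9 * tc, p, q))  # exceeds 1 for a small amplitude
```

The violation below the critical time shows up only for small amplitudes, which is why the sketch evaluates the ratio at `a = 0.05` rather than at `a = 1`.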
The heart of the matter is the following lemma:

Lemma. For all $1 < p \le 2$, $\|P_t\|_{p \to 2} = 1$ exactly when $e^{-2t} \le (p - 1)$.
Proof. Fix a positive element $A$ of $\mathfrak{C}(\mathcal{Q})$. Pick a basis $\{q_1, \ldots, q_n\}$ of $\mathcal{Q}$, and let $\mathfrak{C}(\mathcal{Q}_{(n-1)})$ denote the Clifford algebra associated with the span of the first $n - 1$ of these basis elements. It is evident from the form of our standard basis of $\mathfrak{C}(\mathcal{Q})$ that $A$ can be uniquely decomposed as $A = B + CQ_n$, where $B$ and $C$ belong to $\mathfrak{C}(\mathcal{Q}_{(n-1)})$. Then using the Jordan-Wigner transform we can write
$$A = B + CV_n\hat{Q}_n.$$
Now write
$$\mathcal{H} = \mathcal{H}_{(n-1)} \otimes \mathbb{C}^2 \tag{5.1}$$
so that $B$, $C$ and $V_n$ can be considered as operators on the first factor $\mathcal{H}_{(n-1)}$. Let $u_\pm = (e_+ \pm e_-)/\sqrt{2}$, so that $Qu_\pm = \pm u_\pm$. Then if $v$ is any vector in $\mathcal{H}_{(n-1)}$,
$$\langle (v \otimes u_\pm), A(v \otimes u_\pm) \rangle_{\mathcal{H}} = \langle v, (B \pm CV_n)v \rangle_{\mathcal{H}_{(n-1)}}. \tag{5.2}$$
We see from this that since $A \ge 0$, so are both $B + CV_n \ge 0$ and $B - CV_n \ge 0$. Now let $\mathrm{Tr}_1$ and $\mathrm{Tr}_2$ denote the partial traces over the first and second factors in (5.1), so that with $\mathrm{Tr}$ still denoting the full trace, we have $\mathrm{Tr} = \mathrm{Tr}_1\mathrm{Tr}_2$. Now applying Theorem 1 in the special case which is proved in the appendix,
$$\begin{aligned}
\|A\|_p^2 &= 2^{-2(n-1)/p}\left(\frac{\mathrm{Tr}_1|B + CV_n|^p + \mathrm{Tr}_1|B - CV_n|^p}{2}\right)^{2/p} \\
&\ge 2^{-2(n-1)/p}\Big(\big(\mathrm{Tr}_1|B|^p\big)^{2/p} + (p - 1)\big(\mathrm{Tr}_1|C|^p\big)^{2/p}\Big),
\end{aligned}$$
since $V_n$ is unitary. Thus $\|A\|_p^2 \ge \|B\|_p^2 + (p - 1)\|C\|_p^2$, where the norms on the right are all norms on $\mathfrak{C}^p(\mathcal{Q}_{(n-1)})$.
Now we make the inductive assumption that the lemma has been established for $\mathfrak{C}(\mathcal{Q}_{(n-1)})$. This is clearly the case when $n = 1$. We then have
$$\|A\|_p^2 \ge \|P_tB\|_2^2 + (p - 1)\|P_tC\|_2^2,$$
where $e^{-2t} = (p - 1)$. But clearly from (4.3), $\|P_tCQ_n\|_2^2 = (p - 1)\|P_tC\|_2^2$, where the norms are once again norms on $\mathfrak{C}^2(\mathcal{Q})$. Moreover by (3.4), $P_tB$ and $P_tCQ_n$ are orthogonal. Thus
$$\|P_tB\|_2^2 + (p - 1)\|P_tC\|_2^2 = \|P_t(B + CQ_n)\|_2^2 = \|P_tA\|_2^2. \qquad \square$$

Theorem 5. (Optimal fermion logarithmic Sobolev inequality). For all $A \in \mathfrak{C}(\mathcal{Q})$,
$$\tau\big(|A|^2 \ln |A|^2\big) - \big(\|A\|_2^2 \ln \|A\|_2^2\big) \le 2\,\tau(A^*H_0A). \tag{5.3}$$
Proof. By the lemma, $\|P_tA\|_2 \le \|A\|_{1 + e^{-2t}}$ and there is equality at $t = 0$. Both sides are continuously differentiable, and comparing derivatives at $t = 0$ we obtain the result. Indeed,
$$\frac{d}{dp}\|A\|_p = \frac{1}{p}\,\|A\|_p^{1-p}\Big(\tau\big(|A|^p \ln |A|\big) - \|A\|_p^p \ln \|A\|_p\Big) \tag{5.4}$$
and of course
$$\frac{d}{dt}\|P_tA\|_2\Big|_{t=0} = -\frac{\tau(A^*H_0A)}{\|A\|_2}. \qquad \square$$
Gross refers to the quadratic form on the right side of (5.3) as the Clifford Dirichlet form since it shares many properties of Dirichlet forms in the ordinary commutative setting. An approach to the development of a theory of Dirichlet forms in the non-commutative setting can be found in [AlHK].

Proof of Theorem 4. By a deep result of Gross, when $A \ge 0$ and $1 < p < \infty$,
$$\tau\big(A^{p/2}H_0A^{p/2}\big) \le \frac{(p/2)^2}{p - 1}\,\tau\big(A^{p-1}H_0A\big).$$
Replacing $A$ in (5.3) by $A^{p/2}$ and using the inequality just quoted we obtain, following Gross's ideas [Gr75],
$$\tau\big(A^p \ln A\big) - \|A\|_p^p \ln \|A\|_p \le \frac{p}{2(p - 1)}\,\tau\big(A^{p-1}H_0A\big).$$
By combining this with (5.4) a differential inequality is obtained which implies that $\|P_tA\|_{q(t)}$ is a decreasing function of $t$ when $q(t) = 1 + e^{2t}(p - 1)$. This establishes the result for $A \ge 0$, and by Theorem 3 it is established in general. By (4.3), $P_tI = I$, and therefore $\|P_t\|_{p \to q}$ is always at least $1$ for all $p$ and $q$. That the inequality is best possible follows from a direct computation with one degree of freedom. To be precise, $\|P_t(I + \varepsilon Q)\|_q = \|I + \varepsilon e^{-t}Q\|_q$ is easily computed and compared with $\|I + \varepsilon Q\|_p$ [Gr72]. The first quantity is greater than the second for sufficiently small $\varepsilon$ if $e^{-2t} > (p - 1)/(q - 1)$. $\square$

VI. Hypercontractivity for Bosons and Fermions Together
As a result of the present work and of earlier work on bosons, we know that for
t given by a-2t = (p - l)/(q - 1), both the fermion and the boson oscillator
164
Optimal Hypercontractivity for Fermi Fields and Related Non-Commutative Integration Inequalities Optimal Hypercontractivity for Fermi Fields
41
semigroups are contractive from the appropriate p-spaces to the appropriate q-spaces, and that this value of t is optimal for each case separately. It is natural to expect that the same condition governs hypercontractivity in a situation in which we have bosons and fermions together. This is indeed the case, as we now show using Minkowski's inequality in an argument based on Segal's
method for showing that the optimal conditions for hypercontractivity with m boson degrees of freedom are the same as for one degree of freedom. Let p(dx) = (2n)-,"'2 a"12 dx be the unit Gauss measure on IRt. Then in our mixed setting, with m boson degrees of freedom and n fcrmion degrees of freedom, the relevant p-space is
which may be regarded as consisting of 16p(:21,) valued measurable functions x" A (x) such that IIIAIII;= I IIA(x)II°p(dx) R^'
is
finite. This equation defines the norm on M°. For p = 2, .V° is naturally
isomorphic to the tensor product of the symmetric tensor algebra over C' and the antisymmetric tensor algebra over C" as shown by Segal. On the latter space we
have the mixed oscillator semigroup generated by the sum of the boson and fermion number operators
l
;_
,
111
Considered as operators on M", the operators °, which constitute this semigroup are given by
Y,A(x)= J M,(x,x')P,[A(x')]p(dx'),
(6.1)
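where $M_t(x,x')$ is the Mehler kernel; i.e., the positive integral kernel for the boson oscillator semigroup $P_t^{(\mathrm{boson})}$ discussed in Sect. IV. Of course $P_t$ denotes the fermion oscillator semigroup studied throughout this paper. Now successively applying Minkowski's inequality, our theorem on optimal fermion hypercontractivity, and Nelson's theorem on optimal boson hypercontractivity, we have for $e^{-2t} \le (p-1)/(q-1)$:
$$
\begin{aligned}
|||\mathcal{P}_t A|||_q
&= \left( \int_{\mathbb{R}^m} \left\| \int_{\mathbb{R}^m} M_t(x,x')\,P_t[A(x')]\,\mu(dx') \right\|_q^q \mu(dx) \right)^{1/q} \\
&\le \left( \int_{\mathbb{R}^m} \left( \int_{\mathbb{R}^m} M_t(x,x')\,\|P_t A(x')\|_q\,\mu(dx') \right)^q \mu(dx) \right)^{1/q} \\
&\le \left( \int_{\mathbb{R}^m} \left( \int_{\mathbb{R}^m} M_t(x,x')\,\|A(x')\|_p\,\mu(dx') \right)^q \mu(dx) \right)^{1/q} \\
&\le \left( \int_{\mathbb{R}^m} \|A(x)\|_p^p\,\mu(dx) \right)^{1/p} = |||A|||_p .
\end{aligned}
$$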
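The simplest instance of the hypercontractivity condition can be checked numerically. The sketch below assumes (this reduction is not spelled out in the text above) that for a single degree of freedom the inequality becomes the scalar two-point inequality, in which $A = a + bQ$ has $p$-norm $((|a+b|^p + |a-b|^p)/2)^{1/p}$ and the semigroup multiplies $b$ by $e^{-t}$. The function names are illustrative only.

```python
import itertools
import math

def two_point_norm(a, b, p):
    """p-norm of the function taking the values a+b and a-b with weight 1/2 each."""
    return ((abs(a + b) ** p + abs(a - b) ** p) / 2.0) ** (1.0 / p)

def hypercontractive_gap(a, b, p, q):
    """||a + e^{-t} b||_q - ||a + b||_p at the critical time e^{-2t} = (p-1)/(q-1)."""
    damping = math.sqrt((p - 1.0) / (q - 1.0))  # this is e^{-t}
    return two_point_norm(a, damping * b, q) - two_point_norm(a, b, p)

# Scan a grid of real (a, b) and several exponent pairs with 1 < p <= q.
grid = [k / 10.0 for k in range(-30, 31)]
exponents = [(1.5, 2.0), (2.0, 4.0), (1.2, 3.0), (3.0, 7.0)]
worst_gap = max(
    hypercontractive_gap(a, b, p, q)
    for (p, q) in exponents
    for a, b in itertools.product(grid, grid)
)
print("largest value of ||P_t A||_q - ||A||_p on the grid:", worst_gap)
```

The gap should never be positive beyond rounding error. Using a damping factor larger than $\sqrt{(p-1)/(q-1)}$ breaks the inequality for small $b$, which is one way to see that the time is optimal.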
With E. Carlen in Commun. Math. Phys. 155, 27-46 (1993) E.A. Carlen and E.H. Lieb
Appendix
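Proof of Theorem 1 when $A \pm B \ge 0$. Let $Z$ and $W$ be the $2m \times 2m$ matrices given by
$$Z = \begin{bmatrix} A & 0 \\ 0 & A \end{bmatrix}, \qquad W = \begin{bmatrix} B & 0 \\ 0 & -B \end{bmatrix}.$$
Our goal is to establish that for all $r$ with $0 < r \le 1$,
$$\left( \frac{\mathrm{Tr}(A + rB)^p + \mathrm{Tr}(A - rB)^p}{2} \right)^{2/p} \ge (\mathrm{Tr}\,A^p)^{2/p} + r^2 (p-1) (\mathrm{Tr}\,|B|^p)^{2/p},$$
or what is the same,
$$\left(\mathrm{Tr}(Z + rW)^p\right)^{2/p} \ge (\mathrm{Tr}\,Z^p)^{2/p} + r^2 (p-1) (\mathrm{Tr}\,|W|^p)^{2/p}. \qquad (A.1)$$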
First, note that the null space of Z + r W is exactly the null space of Z for 0 < r < 1.
Thus by carrying out all of the following computations on the orthogonal complement of this fixed null space, we may freely assume that $Z + rW > 0$ for all $0 \le r < 1$. Next, both sides of (A.1) agree at $r = 0$, and the first derivatives in $r$ of both sides vanish there as well. We define $\psi(r)$ to be $\mathrm{Tr}(Z + rW)^p$. Then the second derivative in $r$ of the left side of (A.1) satisfies
$$\frac{d^2}{dr^2} \left(\psi(r)\right)^{2/p} \ge \frac{2}{p}\,\psi(r)^{(2-p)/p}\,\frac{d^2}{dr^2}\psi(r).$$
The second derivative on the right side is just $2(p-1)(\mathrm{Tr}\,|W|^p)^{2/p}$, and we are left with showing that
$$\frac{1}{p}\,\psi(r)^{(2-p)/p}\,\frac{d^2}{dr^2}\psi(r) \ge (p-1)(\mathrm{Tr}\,|W|^p)^{2/p} \qquad (A.2)$$
for all $0 < r < 1$. By redefining $Z$ to be $Z + rW$, it suffices to establish (A.2) at $r = 0$.
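Now $\frac{d}{dr}\psi(r) = p\,\mathrm{Tr}\left((Z + rW)^{p-1} W\right)$. Since $A \pm B \ge 0$, $Z + rW \ge 0$ for small $r$, and we can use the integral representation
$$(Z + rW)^{p-1} = c_p \int_0^\infty t^{p-1} \left[ \frac{1}{t} - \frac{1}{t + (Z + rW)} \right] dt$$
to conclude that
$$\frac{d^2}{dr^2}\psi(0) = p\,c_p \int_0^\infty t^{p-1}\,\mathrm{Tr}\left[ \frac{1}{t + Z}\,W\,\frac{1}{t + Z}\,W \right] dt. \qquad (A.3)$$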
Consider the right side as a function, $f(Z)$, of $Z$ for fixed $W$. It is easy to see that $f$ is convex in $Z$. (Simply replace $Z$ by $Z + tX$, with $X$ self-adjoint, and then differentiate twice with respect to $t$; the positivity follows from the Schwarz inequality for traces.) Also, $f(UZU^*) = f(Z)$ provided $U$ is unitary and $U$ commutes with $W$. In a basis in which $W$ is diagonal, we form the set $\mathcal{U}$ consisting of the $2^{2m}$ distinct diagonal unitary matrices, each with $+1$ or $-1$ in each diagonal entry. Each of these clearly commutes with $W$. Then
$$f(Z) = 2^{-2m} \sum_{U \in \mathcal{U}} f(UZU^*) \ge f\left( 2^{-2m} \sum_{U \in \mathcal{U}} UZU^* \right) = f(Z_{\mathrm{diag}}),$$
where $Z_{\mathrm{diag}}$ is the matrix that is diagonal in the basis diagonalizing $W$, and whose diagonal entries are those of $Z$ in this basis. Replacing $Z$ by $Z_{\mathrm{diag}}$ in (A.3), the integration can be carried out, and we obtain
$$\frac{d^2}{dr^2}\psi(0) \ge p(p-1) \sum_{j=1}^{2m} z_j^{p-2} w_j^2,$$
where $z_j$ and $w_j$, respectively, denote the $j$-th diagonal entries of $Z$ and $W$ in a basis diagonalizing $W$. Now consider $\psi(0) = \mathrm{Tr}(Z^p)$ as a function of $Z$. It is clearly convex, and thus by the averaging method just employed, we obtain
$$\psi(0) \ge \sum_{j=1}^{2m} z_j^p.$$
To establish (A.2), we are only left with showing that
$$\left( \sum_{j=1}^{2m} z_j^p \right)^{(2-p)/p} \left( \sum_{j=1}^{2m} z_j^{p-2} w_j^2 \right) \ge \left( \sum_{j=1}^{2m} |w_j|^p \right)^{2/p}, \qquad (A.4)$$
but this follows immediately from Hölder's inequality.
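The averaging step used above (conjugating $Z$ by every diagonal matrix of signs and averaging) can be illustrated on a small example. This is a sketch with a hypothetical $3 \times 3$ symmetric matrix written in the basis where $W$ is assumed diagonal; it only demonstrates that the average removes the off-diagonal entries and leaves the diagonal untouched.

```python
import itertools

def conjugate_by_signs(matrix, signs):
    """Return U Z U* for the diagonal unitary U = diag(signs), each sign being +1 or -1."""
    n = len(matrix)
    return [[signs[i] * matrix[i][j] * signs[j] for j in range(n)] for i in range(n)]

def sign_average(matrix):
    """Average of U Z U* over all 2^n diagonal sign matrices U."""
    n = len(matrix)
    total = [[0.0] * n for _ in range(n)]
    count = 0
    for signs in itertools.product((1, -1), repeat=n):
        conjugated = conjugate_by_signs(matrix, signs)
        for i in range(n):
            for j in range(n):
                total[i][j] += conjugated[i][j]
        count += 1
    return [[entry / count for entry in row] for row in total]

# Hypothetical example matrix (not taken from the paper).
Z = [[2.0, 0.5, -1.0],
     [0.5, 3.0, 0.25],
     [-1.0, 0.25, 1.0]]
Z_diag = sign_average(Z)
print(Z_diag)
```

Each off-diagonal entry $Z_{ij}$ is multiplied by $s_i s_j$, which is $+1$ for half of the sign choices and $-1$ for the other half, so it averages to zero; the diagonal entries are multiplied by $s_i^2 = 1$.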
To complete the proof, observe that equality in (A.1) for $r = 1$ and 1 [...]
denotes the $j$-th diagonal element of $Z + rW$; these are the numbers $z_j + r w_j$, where $z_j$ denotes the $j$-th diagonal element of $Z$. Let us assume that $w_j \ne 0$ for some $j$. Then equality in Hölder's inequality (A.4) requires that the vector with positive components $z_j + r w_j$ be proportional to the vector with components $|w_j|$. Thus, for almost every $r$ in $[0, 1]$ we require
$$z_j + r w_j = c(r)\,|w_j|$$
for some number $c(r)$ that depends on $r$ but not on $j$. The left side above is a linear function, and thus $c(r) = a + rb$ for some numbers $a$ and $b$. But then clearly $b = w_j/|w_j|$, and all non-zero eigenvalues of $W$ would necessarily have the same sign. This is impossible since $\mathrm{Tr}\,W = 0$.

We now give an application of the uniform convexity implied by this theorem to the differentiability of the $\mathcal{C}^p(\mathcal{Q})$ norms. First we recall that for $2 \le p < \infty$, the modulus of convexity is given by an analog of an inequality of Clarkson for integrals which Dixmier [Di53] established for traces. Specializing to $\mathcal{C}^p(\mathcal{Q})$, this inequality reads
$$\left\| \frac{A + B}{2} \right\|_p^p + \left\| \frac{A - B}{2} \right\|_p^p \le \frac{1}{2}\left[ \|A\|_p^p + \|B\|_p^p \right], \qquad 2 \le p < \infty,$$
which implies that in this range (cf. (3.7))
p (OP
(A.5)
For any non-zero $A$ in $\mathcal{C}^p(\mathcal{Q})$, define $\mathcal{D}(A)$ by
$$\mathcal{D}(A) = \|A\|_p^{1-p}\,|A|^{p-1}\,U^*,$$
where $A = U|A|$ is the polar decomposition of $A$. Let $p'$ be defined by $1/p + 1/p' = 1$. Then for $1 < p < \infty$, $\|\mathcal{D}(A)\|_{p'} = 1$ and $\tau(\mathcal{D}(A)A) = \|A\|_p$. Moreover, $\mathcal{D}(A)$ is the unique element of $\mathcal{C}^{p'}(\mathcal{Q})$ with this property. We call the map $A \mapsto \mathcal{D}(A)$ the gradient map on $\mathcal{C}^p(\mathcal{Q})$. The next theorem sharpens a result of Gross [Gr75].

Theorem 6. (Hölder continuity of the gradient map on $\mathcal{C}^p(\mathcal{Q})$). For all $1 < p < \infty$, the gradient map is norm continuous. Moreover,
I'IA -Blip (P-1) forI
(A.6)
and I19(A)-9(B)11p.<4(p-1)
IIA - Blk
fort
IIA+Blip) Proof. First observe that for all I < p < a, ,
.
(A.7)
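$$
\begin{aligned}
\|\mathcal{D}(A) + \mathcal{D}(B)\|_{p'}\,\|A + B\|_p
&\ge \mathrm{Re}\,\tau\left( (\mathcal{D}(A) + \mathcal{D}(B))(A + B) \right) \\
&= 2\left( \|A\|_p + \|B\|_p \right) - \mathrm{Re}\,\tau\left( (\mathcal{D}(A) - \mathcal{D}(B))(A - B) \right) \\
&\ge 2\,\|A + B\|_p - \|\mathcal{D}(A) - \mathcal{D}(B)\|_{p'}\,\|A - B\|_p .
\end{aligned}
$$
Thus,
$$\left\| \frac{\mathcal{D}(A) + \mathcal{D}(B)}{2} \right\|_{p'} \ge 1 - \left\| \frac{\mathcal{D}(A) - \mathcal{D}(B)}{2} \right\|_{p'} \frac{\|A - B\|_p}{\|A + B\|_p}. \qquad (A.8)$$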
But, by (A.5) and (3.7), we have
9(A)-1(B)
9(A) + 9(B) 2
p
p
p
(A.9)
2
for $1 < p \le 2$. By combining (A.9) and (A.8) we obtain (A.6). Similarly, by combining (3.9) with (A.8) we obtain (A.7).

The continuity of the gradient map for $\mathcal{C}^p(\mathcal{Q})$ has been established by Gross [Gr72], but his proof is more involved and does not yield an estimate of the modulus of continuity. It is now easy to establish continuous differentiability of the $\mathcal{C}^p(\mathcal{Q})$ norms away from the origin since, with $h(t) = \|A + tB\|_p$, with $A$ different from 0 and with $t$ and $s$ sufficiently small, we have
$$\mathrm{Re}\,\tau\left(\mathcal{D}(A + tB)\,B\right) \le \frac{h(t + s) - h(t)}{s} \le \mathrm{Re}\,\tau\left(\mathcal{D}(A + (t + s)B)\,B\right) \qquad (A.10)$$
when $s$ is positive. To see this, observe that $h(t) = \mathrm{Re}\,\tau(\mathcal{D}(A + tB)(A + tB))$, and that $h(t + s) \ge \mathrm{Re}\,\tau(\mathcal{D}(A + tB)(A + (t + s)B))$ by Hölder's inequality. By subtracting the expression for $h(t)$ from the estimate for $h(t + s)$ and dividing by $s$, we obtain the inequality on the left in (A.10). The inequality on the right is obtained in an analogous manner. When $s$ is negative, the inequalities are clearly reversed. Letting $s$ tend to zero, we obtain
$$\frac{d}{dt}\,\|A + tB\|_p \Big|_{t=0} = \mathrm{Re}\,\tau\left(\mathcal{D}(A)\,B\right).$$
References

[AlHK] Albeverio, S., Høegh-Krohn, R.: Dirichlet forms and Markov semigroups on C*-algebras. Commun. Math. Phys. 56, 173-187 (1977)
[ArYa] Araki, H., Yamagami, S.: An inequality for the Hilbert-Schmidt norm. Commun. Math. Phys. 81, 89-96 (1981)
[BCL] Ball, K., Carlen, E.A., Lieb, E.H.: preprint, 1992
[BrWe] Brauer, R., Weyl, H.: Spinors in n dimensions. Am. J. Math. 57, 425-449 (1935)
[CL91] Carlen, E.A., Loss, M.: Extremals of functionals with competing symmetries. J. Funct. Anal. 88, 437-456 (1991)
[Da76] Davies, E.B.: Quantum Theory of Open Systems. New York: Academic Press, 1976
[Di53] Dixmier, J.: Formes linéaires sur un anneau d'opérateurs. Bull. Soc. Math. France 81, 222-245 (1953)
[Far] Faris, W.: Product spaces and Nelson's inequality. Helv. Phys. Acta 48, 721-730 (1975)
[Fe69] Federbush, P.: A partially alternate derivation of a result of Nelson. J. Math. Phys. 10, 50-52 (1969)
[Gr72] Gross, L.: Existence and uniqueness of physical ground states. J. Funct. Anal. 10, 52-109 (1972)
[Gr75] Gross, L.: Hypercontractivity and logarithmic Sobolev inequalities for the Clifford-Dirichlet form. Duke Math. J. 42, 383-396 (1975)
[Gr89] Gross, L.: Logarithmic Sobolev inequalities for the heat kernel on a Lie group and a bibliography on logarithmic Sobolev inequalities and hypercontractivity. In: White Noise Analysis, Mathematics and Applications. Hida et al. (eds.) Singapore: World Scientific, 1990, pp. 108-130
[Gr92] Gross, L.: Logarithmic Sobolev inequalities and contractivity properties of semigroups. 1992 Varenna summer school lecture notes (preprint)
[HuPa] Hudson, R., Parthasarathy, K.R.: Quantum Itô's formula and stochastic evolutions. Commun. Math. Phys. 93, 301-323 (1984)
[JoKl] Jordan, P., Klein, O.: Zum Mehrkörperproblem der Quantentheorie. Zeits. für Phys. 45, 751-765 (1927)
[JoWi] Jordan, P., Wigner, E.P.: Über das Paulische Äquivalenzverbot. Zeits. für Phys. 47, 631-651 (1928)
[Li76] Lieb, E.H.: Inequalities for some operator and matrix functions. Adv. Math. 20, 174-178 (1976)
[Li90] Lieb, E.H.: Gaussian kernels have only Gaussian maximizers. Invent. Math. 102, 179-208 (1990)
[Lin] Lindsay, M.: Gaussian hypercontractivity revisited. J. Funct. Anal. 92, 313-324 (1990)
[LiMe] Lindsay, M., Meyer, P.A.: preprint, 1991
[MeDS] Merris, R., Dias da Silva, J.A.: Generalized Schur functions. J. Lin. Algebra 3, 442-448 (1975)
[Me85] Meyer, P.A.: Éléments de probabilités quantiques, exposés I-V. In: Sém. de Prob. XX, Lecture Notes in Math. 1204, New York: Springer, 1985, pp. 186-312
[Me86] Meyer, P.A.: Éléments de probabilités quantiques, exposés VI-VIII. In: Sém. de Prob. XXI, Lecture Notes in Math. 1247, New York: Springer, 1986, pp. 27-81
[Ne66] Nelson, E.: A quartic interaction in two dimensions. In: Mathematical Theory of Elementary Particles, R. Goodman, I. Segal (eds.) Cambridge, MA: MIT Press, 1966
[Ne73] Nelson, E.: The free Markov field. J. Funct. Anal. 12, 211-227 (1973)
[Ne74] Nelson, E.: Notes on non-commutative integration. J. Funct. Anal. 15, 103-116 (1974)
[Nev] Neveu, J.: Sur l'espérance conditionnelle par rapport à un mouvement Brownien. Ann. Inst. H. Poincaré Sect. B (N.S.) 12, 105-109 (1976)
[Ru72] Ruskai, M.B.: Inequalities for traces on Von Neumann algebras. Commun. Math. Phys. 26, 280-289 (1972)
[SML] Schultz, T.D., Mattis, D.C., Lieb, E.H.: Two dimensional Ising model as a soluble problem of many fermions. Rev. Mod. Phys. 36, 856-871 (1964)
[Se53] Segal, I.E.: A non-commutative extension of abstract integration. Ann. Math. 57, 401-457 (1953)
[Se56] Segal, I.E.: Tensor algebras over Hilbert spaces II. Ann. Math. 63, 160-175 (1956)
[Se70] Segal, I.E.: Construction of non-linear local quantum processes: I. Ann. Math. 92, 462-481 (1970)
[TJ74] Tomczak-Jaegermann, N.: The moduli of smoothness and convexity and Rademacher averages of trace classes S_p (1 ≤ p < ∞). Studia Mathematica 50, 163-182 (1974)
[Um54] Umegaki, H.: Conditional expectation in operator algebras I. Tohoku Math. J. 6, 177-181 (1954)

Communicated by A. Jaffe
With K. Ball and E. Carlen in Invent. Math. 115, 463-482 (1994)
Invent. math. 115, 463-482 (1994)
Inventiones mathematicae

Sharp uniform convexity and smoothness inequalities for trace norms

Keith Ball (1,*), Eric A. Carlen (2,**), and Elliott H. Lieb (3,***)

(1) Department of Mathematics, Texas A&M University, College Station, TX 77843, USA
(2) School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332, USA
(3) Departments of Mathematics and Physics, Princeton University, P.O. Box 708, Princeton, NJ 08544, USA

Oblatum 7-VII-1993

Summary. We prove several sharp inequalities specifying the uniform convexity and uniform smoothness properties of the Schatten trace ideals $C_p$, which are the analogs of the Lebesgue spaces $L_p$ in non-commutative integration. The inequalities are all precise analogs of results which had been known in $L_p$, but were only known in $C_p$ for special values of $p$. In the course of our treatment of uniform convexity and smoothness inequalities for $C_p$ we obtain new and simple proofs of the known inequalities for $L_p$.

I Introduction
The concepts of uniform convexity and its dual property, uniform smoothness, play an important role in analysis. After reviewing these concepts in the $L_p$ function spaces, we shall consider their extension to the Schatten trace ideals, $C_p$, i.e., the setting in which functions are replaced by operators, and integrals are replaced by traces. The emphasis throughout will be on the optimal constants appearing in the various inequalities. These optimal constants are "natural", as will be explained later in the introduction: they are the constants one would obtain from an informed guess using elementary calculus. However, as is often the case in such matters, no ready-made arguments suffice to validate the informed guesses.

A normed space $X$ is said to be uniformly convex if, for each $\varepsilon > 0$, there is a $\delta > 0$ such that if $x$ and $y$ are unit vectors in $X$ with $\|x - y\| \ge 2\varepsilon$, then the average $(x + y)/2$ has norm at most $1 - \delta$. A normed space $X$ is said to be uniformly smooth if, for all $\varepsilon > 0$, there is a $\tau > 0$ such that if $x$ and $y$ are unit vectors in $X$ with $\|x - y\| \le 2\tau$, then the average $(x + y)/2$ has norm at least $1 - \varepsilon\tau$.
* Work partially supported by US National Science Foundation grant DMS 88-07243
** Work partially supported by US National Science Foundation grant DMS 92-07703
*** Work partially supported by US National Science Foundation grant PHY90-19433 A02
© 1993 by the authors. Reproduction of this article, in its entirety, by any means is permitted for noncommercial purposes
Figuratively speaking, the unit ball of a uniformly convex space is uniformly free of "flat spots", and the unit ball of a uniformly smooth space is uniformly free of "corners". Since the unit ball of $X^*$, the dual of $X$, is the polar conjugate of the unit ball of $X$, it is not difficult to show that $X$ is uniformly convex (and hence reflexive) if and only if $X^*$ is uniformly smooth [D].

Many applications of uniform convexity and smoothness require quantitative versions of these notions. The function $\delta_X$ given by
$$\delta_X(\varepsilon) := \inf\left\{ 1 - \tfrac{1}{2}\|x + y\| : \|x\| = \|y\| = 1,\ \|x - y\| \ge 2\varepsilon \right\} \qquad (1.1)$$
is called the modulus of convexity of $X$. (N.B. The function $\delta_X$ is frequently defined with $\varepsilon$ in place of $2\varepsilon$. The definition used here simplifies several of the formulae involving $\delta_X$ and fits more naturally with the definition of the modulus of smoothness given below.) Clearly, $X$ is uniformly convex if and only if $\delta_X$ is strictly positive for every $\varepsilon > 0$.

It might seem natural to define the modulus of smoothness by setting it equal, at $\tau$, to
$$\sup\left\{ 1 - \tfrac{1}{2}\|x + y\| : \|x\| = \|y\| = 1,\ \|x - y\| \le 2\tau \right\}. \qquad (*)$$
Clearly, $X$ is uniformly smooth if and only if this supremum is $o(\tau)$ at $\tau = 0$. The definition (*), however, would not be well adapted to the duality between uniform convexity and uniform smoothness. Instead, the function $\rho_X$ given by
$$\rho_X(\tau) := \sup\left\{ \frac{\|u + v\| + \|u - v\|}{2} - 1 : \|u\| = 1,\ \|v\| = \tau \right\} \qquad (1.2)$$
is called the modulus of smoothness of $X$. This definition arises from (*) if we rewrite the quantity to be maximized there in terms of $u = (x + y)/2$ and $v = (x - y)/2$, and change the constraint from $\|u + v\| = \|u - v\| = 1$ to simply $\|u\| = 1$. For small $\tau$, there is no substantial difference, and it is easy to show (see [Kö]) that $X$ is uniformly smooth if and only if $\lim_{\tau \to 0} \rho_X(\tau)/\tau = 0$.

Lindenstrauss [L] has shown that with these definitions, the modulus of convexity of a normed space $X$ and the modulus of smoothness of its dual $X^*$ are related by
$$\rho_{X^*}(\tau) = \sup\left\{ \tau\varepsilon - \delta_X(\varepsilon) : 0 \le \varepsilon \le 1 \right\}. \qquad (1.3)$$
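As a sanity check on these definitions, the duality formula (1.3) can be tested on Hilbert space, which is its own dual. The sketch below uses the closed forms $\delta(\varepsilon) = 1 - \sqrt{1 - \varepsilon^2}$ and $\rho(\tau) = \sqrt{1 + \tau^2} - 1$, which follow from the parallelogram law under the normalizations (1.1) and (1.2) in dimension at least two; these formulas and the grid search are our own illustration, not taken from the paper.

```python
import math

def delta_hilbert(eps):
    """Modulus of convexity of Hilbert space under (1.1), where ||x - y|| >= 2*eps."""
    return 1.0 - math.sqrt(1.0 - eps * eps)

def rho_hilbert(tau):
    """Modulus of smoothness of Hilbert space under (1.2)."""
    return math.sqrt(1.0 + tau * tau) - 1.0

def lindenstrauss_sup(tau, steps=20000):
    """Grid approximation of sup{tau*eps - delta(eps) : 0 <= eps <= 1}."""
    return max(tau * (k / steps) - delta_hilbert(k / steps) for k in range(steps + 1))

for tau in (0.1, 0.5, 1.0, 2.0):
    print(tau, rho_hilbert(tau), lindenstrauss_sup(tau))
```

The two columns agree up to the grid resolution; the supremum is attained at $\varepsilon = \tau/\sqrt{1 + \tau^2}$.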
This is a quantitative version of Day's duality theorem [D].

Uniform convexity was introduced by Clarkson [C], who proved that every $L_p$ space with $1 < p < \infty$ is both uniformly convex and uniformly smooth. Clarkson proved inequalities which give bounds of the form
$$\delta_{L_p}(\varepsilon) \ge (\varepsilon/K_{p,r})^r \qquad (1.4)$$
where $r = p$ for $2 \le p < \infty$, and $r = p/(p - 1)$ for $1 < p \le 2$. A normed space $X$ is called $r$-uniformly convex if $\delta_X(\varepsilon) \ge (\varepsilon/C)^r$ for some constant $C$. (After Eq. (2.6) below, we make an apparently more restrictive definition of $r$-uniform convexity. The two definitions will be shown to be consistent in Proposition 7, and the present definition is the simplest to use in the introduction.) Clarkson's bounds (1.4) only show that $L_p$ is $r$-uniformly convex with $r > 2$ for all $p \ne 2$ while, actually, $L_p$ is 2-uniformly convex for $1 < p \le 2$.
The 2-uniform convexity of $L_p$ for $1 < p \le 2$ follows from a result of Hanner [H], who proved an inequality from which $\delta_{L_p}$ can be easily computed. Hanner's result is recalled in part (a) of Theorem 2 below. The best constant $K_{p,2}$ in (1.4) seems to have been first determined by Ball and Pisier [BP], who gave a simple direct proof, independent of Hanner's calculation, that $L_p$ is 2-uniformly convex for such $p$. Their optimal 2-uniform convexity inequality is:
$$\delta_{L_p}(\varepsilon) \ge \frac{p - 1}{2}\,\varepsilon^2 \quad \text{for} \quad 1 < p \le 2. \qquad (1.5)$$
Because of the dual nature of the notions of uniform convexity and smoothness, the modulus of smoothness of $L_p$ for $2 \le p < \infty$ satisfies an inequality of the form $\rho_{L_p}(\tau) \le (K_{p,2}\,\tau)^2$. Again, this is a better estimate than that which follows from Clarkson's inequalities. A more detailed history of these and related inequalities will be presented in Sect. II of our paper.

Less is known about the corresponding inequalities for the trace classes $C_p$. Clarkson's inequalities were extended to $C_p$, partly by Dixmier [Di], and fully by Klaus [Si], with precisely the same constants and exponents as in the $L_p$ case. Tomczak-Jaegermann later showed that, as with $L_p$, $C_p$ is actually 2-uniformly convex for $1 < p \le 2$:
$$\delta_{C_p}(\varepsilon) \ge (\varepsilon/K_{p,2})^2 \quad \text{for} \quad 1 < p \le 2. \qquad (1.6)$$
Her proof proceeds by establishing the $C_p$ analog of Hanner's inequality when $p$ is an even integer, then deducing the 2-uniform smoothness of $C_p$ for all $p \ge 2$ from this by interpolation, and then using Lindenstrauss's duality result to obtain the 2-uniform convexity of $C_p$ for $1 < p \le 2$. Implicit in her proof is the fact that when $p = 2k/(2k - 1)$ for some positive integer $k$, the sharp constants $K_{p,2}$ for $C_p$ coincide with those of $L_p$; i.e. $K_{p,2} = (p - 1)^{-1/2}$ for such values of $p$.
The principal results in our paper are the determination of the best possible constants for all $p$ in Tomczak-Jaegermann's theorem, and the proof that the $C_p$ analog of Hanner's inequality holds for $1 \le p \le 4/3$, and in the dual range $4 \le p < \infty$. Our two main theorems are the following (in which $\|\cdot\|_p$ denotes the $L_p$ or the $C_p$ norm):

Theorem 1 (Optimal 2-uniform convexity) For $1 \le p \le 2$, the inequality
$$\left( \frac{\|X + Y\|_p^p + \|X - Y\|_p^p}{2} \right)^{2/p} \ge \|X\|_p^2 + (p - 1)\|Y\|_p^2 \qquad (1.7)$$
holds in the following cases:

(a) $X$ and $Y$ are functions in $L_p$.
(b) $X$ and $Y$ are matrices in $C_p$.

If $2 \le p < \infty$, the inequality is reversed.

The title of this theorem will be explained more fully in Sect. II. The point, of course, is that validity of the inequality (1.7) implies 2-uniform convexity. Part (a) is an unpublished result of Ball and Pisier, and the cases of part (b) for $p = 2k/(2k - 1)$ are, as we have said before, implicit in the paper [TJ] of Tomczak-Jaegermann. The rest is new. The constant $p - 1$ in (1.7) is clearly seen to be optimal as well as natural from the point of view of elementary calculus: if $X$ and $Y$ are real numbers with $|Y|$ much smaller than $X$, then the two sides of (1.7) agree to second order
in $Y$. The fact that (1.7) is true for all pairs of numbers is Gross's two-point inequality [G].

Theorem 2 (Extension of Hanner's inequality to $C_p$) For $1 \le p \le 2$, the inequality
$$\|X + Y\|_p^p + \|X - Y\|_p^p \ge \left( \|X\|_p + \|Y\|_p \right)^p + \left| \|X\|_p - \|Y\|_p \right|^p \qquad (1.8)$$
holds in the following cases:

(a) $X$ and $Y$ are functions in $L_p$.
(b) $p \le 4/3$ and $X$ and $Y$ are matrices in $C_p$.
(c) $X$ and $Y$ are matrices in $C_p$ such that both $X + Y$ and $X - Y$ are positive semidefinite.

For $2 \le p < \infty$, the inequality is reversed and the restriction in (b) becomes $p \ge 4$, and the restriction in (c) changes to the restriction that $X$ and $Y$ are positive semidefinite.

Part (a) is Hanner's inequality, and the cases of part (b) in which $p = 2k$ are due to Tomczak-Jaegermann [TJ]. The rest is new. As we explain in the first proof of Proposition 3 below, the inequality (1.8), whenever it holds, implies the inequality (1.7). Thus, if the conditions under which we establish (1.8) were not more restrictive than those under which we establish (1.7), Theorem 1 would be a corollary of Theorem 2.

The paper is organized as follows: In Sect. II, we review the large number of inequalities bearing on the uniform convexity of $L_p$ spaces. Thus, Sect. II consists largely of known results which are presented because of the light they shed on the problems solved in this paper regarding the uniform convexity of $C_p$, and those that remain open. To our knowledge, such a systematic compendium of these inequalities has not appeared before, and we hope it will be found useful. There are however some new results and some new, simpler proofs. Finally in Sect. III we prove Theorem 1, and in Sect. IV we prove Theorem 2.

Theorem 1 has been applied by Carlen and Lieb [CL] to prove a conjecture of Gross, which arose in his work on quantum field theory. Other applications of the kinds of the inequalities that we discuss here are given, for example, in Pisier's book [P].

Although all of our theorems are stated and proved in the language of matrices, the proofs go through without any change in the context of linear operators on a Hilbert space. By the results of Ruskai [Ru], they can even be extended to a natural Von Neumann algebra context.
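Inequality (1.7) of Theorem 1(b) can be spot-checked on small matrices. The following is a sketch only: it restricts to real $2 \times 2$ matrices so that singular values have a closed form, uses the unnormalized trace norm $\|X\|_p = (\sum_j s_j^p)^{1/p}$, and samples random pairs with a fixed seed. The helper names are ours.

```python
import math
import random

def singular_values_2x2(m):
    """Singular values of a real 2x2 matrix [[a, b], [c, d]] in closed form."""
    (a, b), (c, d) = m
    frob2 = a * a + b * b + c * c + d * d      # equals s1^2 + s2^2
    det = a * d - b * c                        # |det| equals s1 * s2
    disc = math.sqrt(max(frob2 * frob2 - 4.0 * det * det, 0.0))
    return math.sqrt((frob2 + disc) / 2.0), math.sqrt(max((frob2 - disc) / 2.0, 0.0))

def schatten_norm(m, p):
    """Trace norm ||m||_p computed from the singular values."""
    s1, s2 = singular_values_2x2(m)
    return (s1 ** p + s2 ** p) ** (1.0 / p)

def combine(x, y, sign):
    """Entrywise x + sign * y."""
    return [[x[i][j] + sign * y[i][j] for j in range(2)] for i in range(2)]

def convexity_gap(x, y, p):
    """Left side minus right side of (1.7)."""
    mean = (schatten_norm(combine(x, y, 1.0), p) ** p
            + schatten_norm(combine(x, y, -1.0), p) ** p) / 2.0
    return mean ** (2.0 / p) - (schatten_norm(x, p) ** 2
                                + (p - 1.0) * schatten_norm(y, p) ** 2)

def random_matrix(rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]

rng = random.Random(0)
pairs = [(random_matrix(rng), random_matrix(rng)) for _ in range(2000)]
min_gap_convex = min(convexity_gap(x, y, 1.5) for x, y in pairs)  # p <= 2: gap >= 0
max_gap_smooth = max(convexity_gap(x, y, 3.0) for x, y in pairs)  # p >= 2: gap <= 0
print(min_gap_convex, max_gap_smooth)
```

A random search of this kind cannot prove the theorem, but a sign error in a reconstruction of (1.7) would show up immediately as a violated bound.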
II Uniform convexity and smoothness in $L_p$

While this section is largely focused on inequalities relating to $L_p$ spaces, we state certain definitions and prove certain results in the general normed space setting so that they are available to us in the next section. For the rest of the paper, $q$ denotes the dual index of $p$, i.e., $1/p + 1/q = 1$.

The notion of uniform convexity was introduced by Clarkson who proved four inequalities. The two that imply the uniform convexity of $L_p$ spaces are the
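following, in which $x$ and $y$ are functions in $L_p$:
$$\left( \left\| \frac{x + y}{2} \right\|_p^p + \left\| \frac{x - y}{2} \right\|_p^p \right)^{1/p} \le \left( \frac{\|x\|_p^p + \|y\|_p^p}{2} \right)^{1/p} \quad \text{for} \quad 2 \le p \le \infty \qquad (2.1)$$
and
$$\left( \left\| \frac{x + y}{2} \right\|_p^q + \left\| \frac{x - y}{2} \right\|_p^q \right)^{1/q} \le \left( \frac{\|x\|_p^p + \|y\|_p^p}{2} \right)^{1/p} \quad \text{for} \quad 1 \le p \le 2. \qquad (2.2)$$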
In the cases $1 \le p \le 2$ and $2 \le p \le \infty$, the inequalities in (2.1) and (2.2), respectively, hold in the reversed sense. These are the other two Clarkson inequalities, the ones which imply that the $L_p$ spaces are uniformly smooth. They follow from (2.1) and (2.2) by an elementary duality argument.

The inequality (2.1), which involves only $p$ powers and not $q$ powers as well, is simpler to prove, and is known as the "easy" Clarkson inequality. In fact, (2.1) is not only easier to prove, it is actually a consequence of (2.2). This is so because both inequalities can be viewed as statements about the norms of certain linear operators. Viewed as such, (2.1) is weaker than the dual inequality to (2.2). More concretely, for $1 \le s, t \le \infty$, equip $L_s \times L_s$ with the norm $\|\cdot\|_{s,t}$ given by $\|(x, y)\|_{s,t} = \left( (\|x\|_s^t + \|y\|_s^t)/2 \right)^{1/t}$, and write $L_{s,t}$ for the resulting space. Also, define the operator $B : L_{s,t} \to L_{s,t}$ by $B(x, y) = ((x + y)/2, (x - y)/2)$. Then (2.2) is equivalent to the statement that $B$ is a bounded operator from $L_{p,p}$ to $L_{p,q}$ for $1 \le p \le 2$ with norm $2^{-1/q}$. But since $B$ is self-adjoint, it has the same norm as an operator between the dual spaces; i.e. $B$ is bounded from $L_{q,p}$ to $L_{q,q}$ with norm $2^{-1/q}$ for $1 \le p \le 2$. Finally, since $p \le q$, $\|(x, y)\|_{q,p} \le \|(x, y)\|_{q,q}$, and hence $B$ has norm $2^{-1/q}$ from $L_{q,q}$ to $L_{q,q}$ for $2 \le q \le \infty$, which is clearly equivalent to (2.1).

Since these bounds on the norm of $B$ (which are equivalent to (2.1) and (2.2)) are log-linear in $1/p$, they can be proved by interpolation between the elementary cases $p = 1$ (Minkowski's inequality for $L_1$), $p = 2$ (the parallelogram law) and $p = \infty$ (Minkowski's inequality for $L_\infty$), as observed by Boas [Bo]. This same approach was later used by Klaus [Si] to establish the $C_p$ analogs of Clarkson's inequalities.

It is convenient also to have the following inequality, obtained from (2.2) by rearranging some powers of 2. If $x$ and $y$ belong to $L_s$ where either $s = p$ or $s = q$, then
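$$\left( \frac{\|x + y\|_s^q + \|x - y\|_s^q}{2} \right)^{1/q} \le \left( \|x\|_s^p + \|y\|_s^p \right)^{1/p} \quad \text{for} \quad 1 \le p \le 2. \qquad (2.3)$$
Replacing $x$ and $y$ respectively with $x + y$ and $x - y$, and rearranging some powers of 2, we see that the inequality reverses when $p$ and $q$ are interchanged. That is,
$$\left( \frac{\|x + y\|_s^p + \|x - y\|_s^p}{2} \right)^{1/p} \ge \left( \|x\|_s^q + \|y\|_s^q \right)^{1/q} \quad \text{for} \quad 1 \le p \le 2. \qquad (2.4)$$
It follows directly from (2.3) that if $x$ and $y$ are unit vectors in one of the spaces $L_p$ or $L_q$, and $\|x - y\| = 2\varepsilon$, then
$$\left\| \frac{x + y}{2} \right\| \le (1 - \varepsilon^q)^{1/q} \le 1 - \frac{\varepsilon^q}{q},$$
so that $\delta_{L_s}(\varepsilon) \ge \varepsilon^q/q$. Similarly, from (2.4) one sees that $\rho_{L_s}(\tau) \le \tau^p/p$.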
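The two Clarkson inequalities (2.1) and (2.2) are easy to test on finite sequences, i.e. on $L_p$ of a finite set with counting measure. The sketch below assumes the standard forms of the two inequalities, with (2.1) in the range $p \ge 2$ and (2.2) in the range $1 < p \le 2$; the sample size, dimension and seed are arbitrary choices.

```python
import random

def lp_norm(v, p):
    """l_p norm of a finite sequence (L_p for counting measure)."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def half_sum_and_diff(x, y):
    """Return ((x + y)/2, (x - y)/2) entrywise."""
    return ([(a + b) / 2.0 for a, b in zip(x, y)],
            [(a - b) / 2.0 for a, b in zip(x, y)])

def clarkson_easy_gap(x, y, p):
    """Right side minus left side of (2.1); expected >= 0 when p >= 2."""
    hs, hd = half_sum_and_diff(x, y)
    left = (lp_norm(hs, p) ** p + lp_norm(hd, p) ** p) ** (1.0 / p)
    right = ((lp_norm(x, p) ** p + lp_norm(y, p) ** p) / 2.0) ** (1.0 / p)
    return right - left

def clarkson_hard_gap(x, y, p):
    """Right side minus left side of (2.2); expected >= 0 when 1 < p <= 2."""
    q = p / (p - 1.0)  # dual index, 1/p + 1/q = 1
    hs, hd = half_sum_and_diff(x, y)
    left = (lp_norm(hs, p) ** q + lp_norm(hd, p) ** q) ** (1.0 / q)
    right = ((lp_norm(x, p) ** p + lp_norm(y, p) ** p) / 2.0) ** (1.0 / p)
    return right - left

rng = random.Random(1)
samples = [([rng.uniform(-1.0, 1.0) for _ in range(5)],
            [rng.uniform(-1.0, 1.0) for _ in range(5)]) for _ in range(2000)]
min_easy = min(clarkson_easy_gap(x, y, 3.0) for x, y in samples)
min_hard = min(clarkson_hard_gap(x, y, 1.5) for x, y in samples)
print(min_easy, min_hard)
```

Both minima should be non-negative up to rounding error.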
We next turn to the sort of inequalities that figure in our Theorem 1. We begin with some definitions and general considerations intended to clarify the relations among all the inequalities that we consider. Let $X$ be a uniformly convex normed space, and suppose that $\delta_X(\varepsilon) \ge (\varepsilon/C)^r$ for some $C$ and $r > 1$. Then with $\|x\| = \|y\| = 1$ and $\|x - y\| = 2\varepsilon$, we have that
x+yl 2
5 (1 - (E/CYY 5 1 -
)
r
(E/CY,
and thus, with K = C(1/t)-", 1/t + 1/r = 1,
$$\left\| \frac{x + y}{2} \right\|^r + \left\| \frac{x - y}{2K} \right\|^r \le \frac{\|x\|^r + \|y\|^r}{2} \qquad (2.5)$$
for all $x$ and $y$ such that $\|x\| = \|y\|$. By replacing $x$ with $x + y$ and $y$ with $x - y$, we find that
$$\frac{\|x + y\|^r + \|x - y\|^r}{2} \ge \|x\|^r + \|K^{-1} y\|^r \qquad (2.6)$$
for all $x$ and $y$ such that $\|x + y\| = \|x - y\|$.

As promised in the introduction (after (1.4)), we now impose a definition of $r$-uniform convexity that may seem more restrictive than the one we gave before. Proposition 7 below shows that the two definitions are equivalent, up to constants. It is the constant in (2.6), figuring in the second definition, that is the main object of our attention.

A normed space $X$ is said to be $r$-uniformly convex for some $r \in [2, \infty)$ if there is
a constant $K$ such that (2.6) holds for all $x, y \in X$. The best constant $K$ is called the $r$-uniform convexity constant of $X$. When $X$ is $r$-uniformly convex, so that (2.6) and hence (2.5) hold, it is immediate from the latter that $\delta_X(\varepsilon) \ge \frac{1}{r}(\varepsilon/K)^r$. Thus $r$-uniform convexity implies the validity of a lower bound of the form $\delta_X(\varepsilon) \ge (\varepsilon/C)^r$ for the modulus of convexity; i.e. the condition under which we called $X$ $r$-uniformly convex in the introduction.

Similarly, $X$ is said to be $t$-uniformly smooth for some $t \in (1, 2]$ if
$$\frac{\|x + y\|^t + \|x - y\|^t}{2} \le \|x\|^t + \|Ky\|^t \qquad (2.7)$$
for some $K$ and all $x, y \in X$. The best constant $K$ is called the $t$-uniform smoothness constant of $X$. We shall show at the end of this section that the $t$-uniform smoothness constant of a normed space $X$ equals the $r$-uniform convexity constant of its dual $X^*$ where, as usual, $1/r + 1/t = 1$. When (2.7) holds, we have that for all $x$ and $y$ with $\|x\| = 1$ and $\|y\| = \tau$,
$$\frac{\|x + y\| + \|x - y\|}{2} \le \left( \frac{\|x + y\|^t + \|x - y\|^t}{2} \right)^{1/t} \le \left( 1 + (K\tau)^t \right)^{1/t}.$$
Hence, by (1.2), $t$-uniform smoothness implies an estimate of the form $\rho_X(\tau) \le (C\tau)^t$. Proposition 7 shows that the reverse implication holds as well.

The parallelogram identity shows that Hilbert space is 2-uniformly convex and 2-uniformly smooth, and it is readily seen that the exponent 2 is the best that can occur for each property. Clarkson's inequality shows that when $1 \le p \le 2$ then
each of $L_p$ and $L_q$ is $q$-uniformly convex and $p$-uniformly smooth. As we have remarked, these exponents are not in general the best possible, despite the fact that the constants in Clarkson's inequalities are always sharp. The actual situation is the following: For $1 < p \le 2$, $L_p$ is 2-uniformly convex though no better than $p$-uniformly smooth, while for $2 \le q < \infty$, $L_q$ is 2-uniformly smooth as well as $q$-uniformly convex.

These facts follow from Hanner's inequality (Theorem 2(a) of the introduction) which determines exactly the moduli of convexity and smoothness of all $L_p$ spaces. The optimal 2-uniform convexity inequality is the following:

Proposition 3 (Optimal 2-uniform convexity for $L_p$) If $1 \le p \le 2$ and $x$ and $y \in L_p$ then
$$\frac{\|x + y\|_p^2 + \|x - y\|_p^2}{2} \ge \|x\|_p^2 + (p - 1)\|y\|_p^2. \qquad (2.8)$$
For $2 \le p < \infty$, the inequality is reversed.

Remark. The inequality (2.8) holds for any normed space for which (1.8) holds, as we will soon show. Inequality (2.8) does not seem to appear in the literature in quite this form but it is probably folk-lore. Ball and Pisier noticed that it follows from Gross's two-point inequality using arguments which (in the context of general Banach lattices) go back to Figiel [F]. The reader will note that (2.8) is not identical to (1.7), but we shall soon see that the validity of (2.8) for all $L_p$ spaces implies the validity of the apparently stronger (1.7).

First proof. To deduce (2.8) from Hanner's inequality recall that a special case of Gross's inequality [G] states that if $1 \le p \le 2$ and $a$ and $b$ are real,
$$\left( \frac{|a + b|^p + |a - b|^p}{2} \right)^{1/p} \ge \left( a^2 + (p - 1) b^2 \right)^{1/2}.$$
Now if $x, y \in L_p$,
$$
\begin{aligned}
\left( \frac{\|x + y\|_p^2 + \|x - y\|_p^2}{2} \right)^{1/2}
&\ge \left( \frac{\|x + y\|_p^p + \|x - y\|_p^p}{2} \right)^{1/p} \\
&\ge \left( \frac{(\|x\|_p + \|y\|_p)^p + \left| \|x\|_p - \|y\|_p \right|^p}{2} \right)^{1/p} \\
&\ge \left( \|x\|_p^2 + (p - 1)\|y\|_p^2 \right)^{1/2},
\end{aligned}
$$
where we have used, in succession, Hölder's inequality, Hanner's inequality (1.8), and Gross's inequality with $a = \|x\|_p$ and $b = \|y\|_p$. □
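The three steps of the first proof (Hölder, Hanner, Gross) can be followed numerically for finite sequences. This is an illustrative sketch for $1 \le p \le 2$ with $L_p$ taken over a finite set with counting measure; the function `proof_chain` and the random sampling are ours, and the four returned quantities are expected to be non-increasing.

```python
import math
import random

def lp_norm(v, p):
    """l_p norm of a finite sequence (L_p for counting measure)."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def proof_chain(x, y, p):
    """The four quantities compared in the first proof of Proposition 3, in order."""
    nx, ny = lp_norm(x, p), lp_norm(y, p)
    s = lp_norm([a + b for a, b in zip(x, y)], p)   # ||x + y||_p
    d = lp_norm([a - b for a, b in zip(x, y)], p)   # ||x - y||_p
    two_mean = math.sqrt((s * s + d * d) / 2.0)
    p_mean = ((s ** p + d ** p) / 2.0) ** (1.0 / p)
    hanner = (((nx + ny) ** p + abs(nx - ny) ** p) / 2.0) ** (1.0 / p)
    gross = math.sqrt(nx * nx + (p - 1.0) * ny * ny)
    return two_mean, p_mean, hanner, gross

rng = random.Random(2)
p = 1.5
chains = [proof_chain([rng.uniform(-1.0, 1.0) for _ in range(4)],
                      [rng.uniform(-1.0, 1.0) for _ in range(4)], p)
          for _ in range(2000)]
chain_is_monotone = all(
    c[0] >= c[1] - 1e-12 and c[1] >= c[2] - 1e-12 and c[2] >= c[3] - 1e-12
    for c in chains
)
print(chain_is_monotone)
```

Squaring the first and last quantities recovers (2.8), while squaring the second and last recovers (1.7) for sequences.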
Second proof. An alternative proof of Proposition 3 consists simply of showing that for $\|y\|$ small,
$$\frac{\|x + y\|_p^2 + \|x - y\|_p^2}{2} \ge \|x\|_p^2 + (p - 1)\|y\|_p^2 + o(\|y\|_p^2), \qquad (2.9)$$
and then observing that this infinitesimal form of (2.8) is equivalent to the full statement. That is, (2.9) is the same as
$$\frac{d^2}{ds^2} \left( \frac{\|x + sy\|_p^p + \|x - sy\|_p^p}{2} \right) \Bigg|_{s=0} \ge p(p - 1)\,\|x\|_p^{p-2}\,\|y\|_p^2, \qquad (2.10)$$
which is easy to establish for $L_p$ functions by elementary calculus. Proposition 3 follows from this by integration with respect to $s \in [0, 1]$. □

As will be shown later, inequality (2.10) is also true for matrices in $C_p$, and this will form the basis of the proof of Theorem 1 given in Sect. III. But there is an important difference between the commutative and non-commutative cases. For functions $x$ and $y$ in $L_p$,
$$\frac{d^2}{ds^2} \left( \frac{\|x + sy\|_p^p + \|x - sy\|_p^p}{2} \right) \Bigg|_{s=0} = p(p - 1) \int |x|^{p-2} |y|^2, \qquad (2.11)$$
and the latter dominates $p(p - 1)\,\|x\|_p^{p-2}\,\|y\|_p^2$ by Hölder's inequality (since $p < 2$). For matrices, the analogue of (2.11) is false: one always has
$$\frac{d^2}{ds^2} \left( \frac{\|X + sY\|_p^p + \|X - sY\|_p^p}{2} \right) \Bigg|_{s=0} \le p(p - 1)\,\mathrm{Tr}\,|X|^{p-2} |Y|^2$$
and equality need not hold. The problem is to find a replacement for (2.11) in the non-commutative setting.

Inequality (2.8), as we said, is apparently weaker than (1.7) since
$$\left( \frac{\|x + y\|_p^p + \|x - y\|_p^p}{2} \right)^{1/p} \le \left( \frac{\|x + y\|_p^2 + \|x - y\|_p^2}{2} \right)^{1/2}$$
for $1 \le p \le 2$. However, a simple doubling argument shows that (1.7) is actually a consequence of the fact that (2.8) holds for all $L_p$ spaces. This argument is also valid for $C_p$, and we give it in detail in Sect. III, Eqs. (3.5)-(3.6). The reader can easily translate the $C_p$ version into the $L_p$ version.

Note also that in the first proof of Proposition 3, which was based on Hanner's inequality, we actually arrived at (1.7) in an intermediate step. In the $C_p$ setting, we do not possess a full analog of Hanner's inequality, and so we shall prove Theorem 1 by adapting the second proof of Proposition 3 to the $C_p$ setting.

The following diagram shows the relationships between the several expressions mentioned above. Connecting lines indicate inequality between the expressions. All of the indicated inequalities hold in both $C_p$ and $L_p$ except for that indicated by the line labeled Hanner, which is only known to hold in $C_p$ in the special cases specified in Theorem 2. In each expression $x$ and $y$ are elements of $L_p$ or of $C_p$, and $q$ is the index conjugate to $p$. For $1 \le p \le 2$, the quantities increase as one goes up the page; for $2 \le p \le \infty$, they decrease.
[Figure: diagram of the expressions compared in this section, with connecting lines labeled Hölder, BCL (Thm. 1), Hanner (Thm. 2), "strong" Clarkson, and Gross (for numbers).]

Fig. 1. Relationships among the inequalities
We turn now to a proof of Hanner's inequality for $L_p$ which yields Clarkson's inequalities along the way, and to a simple duality result which shows that optimal constants obtained for $q$-uniform convexity of a normed space immediately yield optimal constants for $p$-uniform smoothness of its dual (and conversely).

Lemma 4 (Variational characterization of sums of $p$-th powers) For $1 < p < \infty$ define $\alpha = \alpha_p : [0, \infty) \to [0, \infty)$ by
$$\alpha(r) = (1 + r)^{p-1} + |1 - r|^{p-1}\,\mathrm{sign}(1 - r).$$
Then for all $x, y \in \mathbb{R}$,
$$|x + y|^p + |x - y|^p = \left\{ \begin{matrix} \sup \\ \inf \end{matrix} \right\} \left\{ \alpha(r)\,|x|^p + \alpha(1/r)\,|y|^p : 0 < r < \infty \right\},$$
the sup or inf being taken according as $p < 2$ or $p > 2$.
With K. Ball and E. Carlen in Invent. Math. 115, 463-482 (1994)
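Proof. Assume $1 < p \le 2$; the proof for $p \ge 2$ is similar. Plainly, it may be assumed that $0 < y \le 1 = x$. For $r = y$, one readily checks that
\[
\alpha(r) + \alpha(1/r)\,y^p = (1+y)^p + (1-y)^p.
\]
To see that $(1+y)^p + (1-y)^p \ge \alpha(r) + \alpha(1/r)\,y^p$ for all $r$, it suffices to check that the latter quantity attains its maximum when $r = y$. But
\begin{align*}
\frac{d}{dr}\left(\alpha(r) + \alpha(1/r)\,y^p\right) &= \alpha'(r) - \frac{1}{r^2}\,\alpha'(1/r)\,y^p \\
&= (p-1)\left[(1+r)^{p-2} - |1-r|^{p-2} - \frac{y^p}{r^2}\left(\left(1+\frac{1}{r}\right)^{p-2} - \left|1-\frac{1}{r}\right|^{p-2}\right)\right] \\
&= (p-1)\left(1 - \left(\frac{y}{r}\right)^p\right)\left[(1+r)^{p-2} - |1-r|^{p-2}\right].
\end{align*}
Since $p - 2 \le 0$ and $1 + r \ge |1 - r|$, the last factor is non-positive. Thus, the whole is non-negative for $0 < r < y$ and non-positive for $r > y$.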
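The variational formula of Lemma 4 can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: the helper names (`alpha`, `variational_value`, `grid_extremum`, `exact`) and the grid parameters are choices made for this illustration only. It evaluates $\alpha(r)|x|^p + \alpha(1/r)|y|^p$ on a grid of $r$ and compares the grid supremum (for $p < 2$) or infimum (for $p > 2$) with $|x+y|^p + |x-y|^p$.

```python
def alpha(r: float, p: float) -> float:
    """alpha_p(r) = (1+r)^(p-1) + |1-r|^(p-1) * sign(1-r), as in Lemma 4."""
    sign = (1 - r > 0) - (1 - r < 0)
    return (1 + r) ** (p - 1) + abs(1 - r) ** (p - 1) * sign


def variational_value(x: float, y: float, r: float, p: float) -> float:
    """The trial quantity alpha(r)|x|^p + alpha(1/r)|y|^p for a given r > 0."""
    return alpha(r, p) * abs(x) ** p + alpha(1 / r, p) * abs(y) ** p


def grid_extremum(x: float, y: float, p: float, n: int = 4000, r_max: float = 20.0) -> float:
    """Sup (p < 2) or inf (p > 2) of the trial quantity over a uniform grid in (0, r_max]."""
    values = [variational_value(x, y, r_max * k / n, p) for k in range(1, n + 1)]
    return max(values) if p < 2 else min(values)


def exact(x: float, y: float, p: float) -> float:
    """The left side |x+y|^p + |x-y|^p of Lemma 4."""
    return abs(x + y) ** p + abs(x - y) ** p
```

With the default grid, the optimal value $r = |y|/|x|$ is a grid point for the pairs $(x, y) = (1, 0.5)$ and $(2, 3)$, so for these inputs the grid extremum should agree with `exact` up to rounding.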
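Proof of Theorem 2(a) (Hanner's inequality) Again assume $1 < p \le 2$ and let $x, y \in L_p$. Then
\begin{align*}
\|x+y\|_p^p + \|x-y\|_p^p &= \int\left(|x+y|^p + |x-y|^p\right) = \int \sup_r\left\{\alpha(r)|x|^p + \alpha(1/r)|y|^p\right\} \\
&\ge \sup_r \int\left(\alpha(r)|x|^p + \alpha(1/r)|y|^p\right) = \sup_r\left\{\alpha(r)\|x\|_p^p + \alpha(1/r)\|y\|_p^p\right\} \\
&= \left(\|x\|_p + \|y\|_p\right)^p + \left|\|x\|_p - \|y\|_p\right|^p.
\end{align*}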
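For the finite sequence space $l_p^n$ (counting measure), Hanner's inequality can be spot-checked on random vectors. This is only an illustrative sketch; the function names `lp_norm` and `hanner_gap` are hypothetical helpers introduced here. The gap below should be non-negative for $1 < p \le 2$ and non-positive for $p \ge 2$, up to rounding.

```python
import random


def lp_norm(v, p):
    """The l_p norm of a finite real sequence."""
    return sum(abs(t) ** p for t in v) ** (1 / p)


def hanner_gap(x, y, p):
    """||x+y||^p + ||x-y||^p minus (||x||+||y||)^p + | ||x||-||y|| |^p."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    lhs = lp_norm(plus, p) ** p + lp_norm(minus, p) ** p
    nx, ny = lp_norm(x, p), lp_norm(y, p)
    rhs = (nx + ny) ** p + abs(nx - ny) ** p
    return lhs - rhs


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.uniform(-1, 1) for _ in range(5)]
    y = [rng.uniform(-1, 1) for _ in range(5)]
    print(hanner_gap(x, y, 1.5), hanner_gap(x, y, 3.0))
```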
Proof of (2.2) Let us also show how Lemma 4 can be used to deduce the "hard" Clarkson inequality. This time we shall prove it in the uniform smoothness range, i.e. for $L_p$ with $2 \le p < \infty$. Since
\[
\|x+y\|_p^p + \|x-y\|_p^p \le \alpha(r)\|x\|_p^p + \alpha(1/r)\|y\|_p^p
\]
for all $r$, it is enough to find an $r$ for which the right side equals $2\left(\|x\|_p^q + \|y\|_p^q\right)^{p/q}$. Set $u = \|x\|_p^q$, $v = \|y\|_p^q$, $r = v/u$, and assume that $v \le u$. Then
\[
\alpha(r)\|x\|_p^p + \alpha(1/r)\|y\|_p^p = \alpha(v/u)\,u^{p-1} + \alpha(u/v)\,v^{p-1} = 2(u+v)^{p-1} = 2\left(\|x\|_p^q + \|y\|_p^q\right)^{p/q}.
\]
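The evaluation at $r = v/u$ can be written out term by term. The following worked step is a sketch using only the definition of $\alpha$ from Lemma 4 and the relation $p/q = p - 1$; it assumes $0 < v \le u$, as in the proof.

```latex
\begin{align*}
\|x\|_p^p &= u^{p/q} = u^{p-1}, \qquad \|y\|_p^p = v^{p/q} = v^{p-1}, \\
\alpha(v/u)\,u^{p-1} &= \left[(1 + v/u)^{p-1} + (1 - v/u)^{p-1}\right]u^{p-1} = (u+v)^{p-1} + (u-v)^{p-1}, \\
\alpha(u/v)\,v^{p-1} &= \left[(1 + u/v)^{p-1} - (u/v - 1)^{p-1}\right]v^{p-1} = (u+v)^{p-1} - (u-v)^{p-1}, \\
\alpha(v/u)\,u^{p-1} + \alpha(u/v)\,v^{p-1} &= 2(u+v)^{p-1} = 2(u+v)^{p/q}.
\end{align*}
```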
We now prove the duality results mentioned earlier. These results hold in general; no reference to $L_p$ or $C_p$ is made.
Lemma 5 (Duality for $q$-uniform convexity and $p$-uniform smoothness) Let $X$ be a normed space with dual $X^*$. The $p$-uniform smoothness constant of $X$ (the constant $K$ in (2.7)) is equal to the $q$-uniform convexity constant of $X^*$ (the constant $K$ in (2.6)).
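Proof. Suppose that the $q$-uniform convexity constant of $X^*$ is $K$ and let $x, y \in X$. We denote norms in $X$ and $X^*$ indiscriminately by $\|\cdot\|$ and trust that the meaning will be clear. There are unit vectors $\lambda$ and $\mu$ in $X^*$ such that $\lambda(x+y) = \|x+y\|$ and $\mu(x-y) = \|x-y\|$. Define $\phi, \psi \in X^*$ by $\phi = Z^{-1/q}\|x+y\|^{p-1}\lambda$ and $\psi = Z^{-1/q}\|x-y\|^{p-1}\mu$ with $Z = (\|x+y\|^p + \|x-y\|^p)/2$. Then $\|\phi\|^q + \|\psi\|^q = 2$ and we have
\begin{align*}
\left(\frac{\|x+y\|^p + \|x-y\|^p}{2}\right)^{1/p} &= \frac12\phi(x+y) + \frac12\psi(x-y) = \left(\frac{\phi+\psi}{2}\right)(x) + \left(\frac{\phi-\psi}{2}\right)(y) \\
&\le \left(\left\|\frac{\phi+\psi}{2}\right\|^q + \left\|\frac{\phi-\psi}{2K}\right\|^q\right)^{1/q}\left(\|x\|^p + \|Ky\|^p\right)^{1/p} \\
&\le \left(\frac{\|\phi\|^q + \|\psi\|^q}{2}\right)^{1/q}\left(\|x\|^p + \|Ky\|^p\right)^{1/p} \\
&= \left(\|x\|^p + \|Ky\|^p\right)^{1/p}.
\end{align*}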
The first inequality is Hölder's inequality for numbers, and the second is (2.5) with $r = q$. The other implication is similar.

Lemma 6 (Duality for Hanner's inequality) Let $X$ be a normed space with dual $X^*$. Let $1 < p \le 2$ and $1/p + 1/q = 1$. Then the validity of
\[
\|\phi+\psi\|^q + \|\phi-\psi\|^q \le \left(\|\phi\| + \|\psi\|\right)^q + \left|\|\phi\| - \|\psi\|\right|^q \tag{2.12}
\]
for all $\phi, \psi \in X^*$ implies the validity of
\[
\|y+z\|^p + \|y-z\|^p \ge \left(\|y\| + \|z\|\right)^p + \left|\|y\| - \|z\|\right|^p \tag{2.13}
\]
for all $y, z \in X$. Similarly, the validity of (2.13) in $X$ implies the validity of (2.12) in $X^*$.
Proof. Suppose first that (2.12) holds in $X^*$. To establish (2.13), we first rewrite it in terms of $u = y + z$ and $v = y - z$, so that what we must show is:
\[
2^p\left(\|u\|^p + \|v\|^p\right) \ge \left(\|u+v\| + \|u-v\|\right)^p + \left|\|u+v\| - \|u-v\|\right|^p. \tag{2.14}
\]
We may assume without loss of generality that $\|u+v\| = 1$ and that $r := \|u-v\| \le 1$. Then the right side of (2.14), which we call $R^p$, can be rewritten (as in Lemma 4) as
\[
R^p = \alpha\|u+v\|^p + \beta\|u-v\|^p = (1+r)^p + (1-r)^p,
\]
where $\alpha = (1+r)^{p-1} + (1-r)^{p-1}$ and $\beta = r^{1-p}\left[(1+r)^{p-1} - (1-r)^{p-1}\right]$. As in the proof of Lemma 5, we choose unit vectors $\lambda$ and $\mu$ in $X^*$ such that $\lambda(u+v) = \|u+v\|$ and $\mu(u-v) = \|u-v\|$. Then we define
\[
\phi = \alpha R^{-p/q}\|u+v\|^{p-1}\lambda \quad\text{and}\quad \psi = \beta R^{-p/q}\|u-v\|^{p-1}\mu.
\]
Thus,
\begin{align*}
R = \phi(u+v) + \psi(u-v) &= (\phi+\psi)(u) + (\phi-\psi)(v) \le \|\phi+\psi\|\,\|u\| + \|\phi-\psi\|\,\|v\| \\
&\le \left(\|\phi+\psi\|^q + \|\phi-\psi\|^q\right)^{1/q}\left(\|u\|^p + \|v\|^p\right)^{1/p}.
\end{align*}
To complete the demonstration of (2.14), we have to show that $T := \|\phi+\psi\|^q + \|\phi-\psi\|^q \le 2^q$. By (2.12),
\begin{align*}
T &\le R^{-p}\left[\alpha\|u+v\|^{p-1} + \beta\|u-v\|^{p-1}\right]^q + R^{-p}\left|\alpha\|u+v\|^{p-1} - \beta\|u-v\|^{p-1}\right|^q \\
&= R^{-p}\left[\alpha + \beta r^{p-1}\right]^q + R^{-p}\left[\alpha - \beta r^{p-1}\right]^q = 2^q.
\end{align*}
A similar proof works in the other direction to go from (2.13) to (2.12).
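The final equality $T \le 2^q$ in the proof of Lemma 6 rests on two elementary identities. The following worked step is a sketch using only the definitions of $\alpha$, $\beta$ and $R$ given in that proof, together with $(p-1)q = p$.

```latex
\begin{align*}
\alpha + \beta r^{p-1} &= \left[(1+r)^{p-1} + (1-r)^{p-1}\right] + \left[(1+r)^{p-1} - (1-r)^{p-1}\right] = 2(1+r)^{p-1}, \\
\alpha - \beta r^{p-1} &= 2(1-r)^{p-1}, \\
R^{-p}\left[\alpha + \beta r^{p-1}\right]^q + R^{-p}\left[\alpha - \beta r^{p-1}\right]^q
 &= 2^q R^{-p}\left[(1+r)^{(p-1)q} + (1-r)^{(p-1)q}\right] \\
 &= 2^q R^{-p}\left[(1+r)^p + (1-r)^p\right] = 2^q .
\end{align*}
```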
Remark. In our application of Lemma 6 to the proof of Theorem 2, we need the following refinement whose truth is evident from the proof given above. We take $X = C_p$ and $X^* = C_q$. Then the validity of (2.13) with the extra constraint that $y + z$ and $y - z$ are positive semidefinite matrices implies the validity of (2.12) when $\phi$ and $\psi$ are positive semidefinite matrices.

We close this section with a proposition showing the consistency of the two definitions that we have given for $r$-uniform convexity; i.e., the ones following (1.4) and (2.6).
Proposition 7 (Equivalence of definitions of $r$-uniform convexity) Let $X$ be a normed space. Then (2.5) holds for some constant $K$ and all $x, y \in X$ if and only if $\delta_X(\varepsilon) \ge (\varepsilon/C)^r$ for some constant $C$. Similarly, (2.7) holds for some constant $K$ and all $x, y \in X$ if and only if $\rho_X(\tau) \le (C\tau)^r$ for some constant $C$.

Proof. We have already seen that (2.5) and (2.7) imply the indicated bounds on $\delta_X$ and $\rho_X$ respectively. Suppose first that $\rho_X(\tau) \le (C\tau)^r$ for some constant $C$. Of course, $1 < r \le 2$. Then for all $\|x\| = 1$ and $\|y\| \le 1$,
\[
\frac{\|x+y\| + \|x-y\|}{2} - 1 \le (C\|y\|)^r.
\]
Define numbers $b$ and $\beta$ by
\[
b := \frac{\|x+y\| + \|x-y\|}{2} \quad\text{and}\quad \beta := \frac{\|x+y\| - \|x-y\|}{\|x+y\| + \|x-y\|}.
\]
Then
\[
\left(\frac{\|x+y\|^r + \|x-y\|^r}{2}\right)^{1/r} - \frac{\|x+y\| + \|x-y\|}{2} = b\left[\left(\frac{(1+\beta)^r + (1-\beta)^r}{2}\right)^{1/r} - 1\right]. \tag{2.15}
\]
The function of $\beta$ on the right side in (2.15) vanishes quadratically at the origin. Thus, a simple estimation using Taylor's theorem shows that it is no greater than $D_r\beta^2$ for some constant $D_r$ depending only on $r$. Then, since $|\beta| \le \|y\|/b \le \|y\|$ and $1 \le r \le 2$, we have from (2.15) and the assumption on $\rho_X$ that
\[
\frac{\|x+y\|^r + \|x-y\|^r}{2} \le 1 + K_r^r\|y\|^r \tag{2.16}
\]
for all $x$ and $y$ with $\|y\| \le \|x\| = 1$, where $K_r$ depends only on $C$ and $r$. Therefore,
\[
\frac{\|x+y\|^r + \|x-y\|^r}{2} \le \|x\|^r + K_r^r\|y\|^r \tag{2.17}
\]
for all $x$ and $y$ with $\|y\| \le \|x\|$. Finally, since we may assume that $K_r \ge 1$, (2.17) holds for all $x$ and $y$.

Next, suppose that $\delta_X(\varepsilon) \ge (\varepsilon/C)^r$ for some constant $C$. Then by (1.3),
\[
\rho_{X^*}(\tau) = \sup\{\tau\varepsilon - \delta_X(\varepsilon) : 0 \le \varepsilon \le 1\} \le \sup\{\tau\varepsilon - (\varepsilon/C)^r : 0 \le \varepsilon < \infty\} = (C'\tau)^{r'},
\]
where $1/r + 1/r' = 1$ and $C'$ is a constant depending only on $C$ and $r$. Then, by what we have shown above, there is a constant $K$ so that (2.7) is valid in $X^*$. But then by Lemma 5, (2.5) is valid in $X$ with the same constant $K$. □

III Optimal 2-uniform convexity inequalities for trace norms

Norms on the space of $n \times n$ matrices which are non-commutative analogs of the $L_p$ norms can be defined in terms of the trace by
\[
\|X\|_p = \left(\mathrm{Tr}\left((X^*X)^{p/2}\right)\right)^{1/p} = \left(\mathrm{Tr}\left((XX^*)^{p/2}\right)\right)^{1/p} \tag{3.1}
\]
for $1 \le p < \infty$. For $p = \infty$, $\|X\|_p$ denotes the operator norm of $X$, as usual. The
analogy can be made quite close, and it has been developed in a von Neumann
algebra context by Segal [Se] and Dixmier [Di] as part of their theories of non-commutative integration. Many familiar inequalities for $L_p$ norms also hold for the $C_p$ norms. This is true, in particular, of the Hölder inequality
\[
\|XY\|_r \le \|X\|_p\|Y\|_q, \qquad 1/r = 1/p + 1/q.
\]
There are, however, other inequalities for $L_p$ norms which do not hold for the $C_p$ norms. Many examples are connected with the poor behavior of the map
\[
X \mapsto |X| = (X^*X)^{1/2}. \tag{3.2}
\]
For example, if $f$ and $g$ are complex valued functions in some $L_p$ space, then $\||f| - |g|\|_p \le \|f - g\|_p$. This is not true for $C_p$, and, when $p = 2$, the factor of $\sqrt{2}$ in the Araki-Yamagami inequality [ArYa]
\[
\||X| - |Y|\|_2 \le \sqrt{2}\,\|X - Y\|_2
\]
is optimal whenever $n \ge 2$. As we have asserted in Theorems 1 and 2, however, almost all of the optimal inequalities expressing uniform convexity and smoothness properties of $L_p$ spaces have exact analogs which hold for the $C_p$ norms.

Most of this section is devoted to the proof of Theorem 1. Before giving the proof we briefly discuss the history of uniform convexity inequalities for $C_p$ as we know it. The first such inequality was established by Dixmier [Di] who proved the $C_p$ analog of the "easy" Clarkson inequality (2.1) by means of interpolation. As with $L_p$, this implies that for $2 \le p < \infty$, $\delta_{C_p}(\varepsilon) \ge (1/p)\varepsilon^p$.
Interpolation was later used to establish the analog of the "hard" Clarkson inequality (2.2) which implies the uniform convexity of $L_p$ for $1 < p \le 2$. Such a proof has been given by Martin Klaus, and is sketched in [Si]; to some extent it is modeled on Boas' proof [Bo] of (2.2) for $L_p$. This result implies that for $1 < p < 2$, $C_p$ is at least $q$-uniformly convex. Later, Tomczak-Jaegermann [TJ] showed that for $1 < p \le 2$, $C_p$ is actually 2-uniformly convex and $C_q$ is 2-uniformly smooth. Moreover, she showed for $q = 2k$, that the sharp 2-uniform smoothness constants of $C_q$ are the same as those for $L_q$ (so that the corresponding equalities of 2-uniform convexity constants hold by Lemma 5).
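Before we prove Theorem 1, note that $t \mapsto t^{p/2}$ is concave for $1 < p < 2$ and therefore (1.7) immediately implies that
\[
\frac{\|X+Y\|_p^2 + \|X-Y\|_p^2}{2} \ge \|X\|_p^2 + (p-1)\|Y\|_p^2, \tag{3.3}
\]
which expresses the 2-uniform convexity of $C_p$ in the usual way. It follows that
\[
\delta_{C_p}(\varepsilon) \ge \frac{p-1}{2}\,\varepsilon^2 \quad\text{for}\quad 1 < p \le 2, \tag{3.4}
\]
and thus the analog of (1.5) holds for $C_p$. We now observe that (1.7) is only formally stronger than (3.3). To see that (3.3) implies (1.7), consider the $2n \times 2n$ matrices given in block form by
\[
Z = \begin{pmatrix} X & 0 \\ 0 & X \end{pmatrix}, \qquad W = \begin{pmatrix} Y & 0 \\ 0 & -Y \end{pmatrix}. \tag{3.5}
\]
Then
\[
\mathrm{Tr}|Z+W|^p = \mathrm{Tr}|Z-W|^p = \mathrm{Tr}|X+Y|^p + \mathrm{Tr}|X-Y|^p
\]
and thus,
\[
\|Z+W\|_p^2 = \|Z-W\|_p^2 = \left(\|X+Y\|_p^p + \|X-Y\|_p^p\right)^{2/p}.
\]
Since also $\|Z\|_p^2 = 2^{2/p}\|X\|_p^2$ and $\|W\|_p^2 = 2^{2/p}\|Y\|_p^2$, (3.3) implies
\begin{align*}
\|X\|_p^2 + (p-1)\|Y\|_p^2 &= 2^{-2/p}\left(\|Z\|_p^2 + (p-1)\|W\|_p^2\right) \\
&\le 2^{-2/p}\,\frac{\|Z+W\|_p^2 + \|Z-W\|_p^2}{2} \\
&= \left(\frac{\|X+Y\|_p^p + \|X-Y\|_p^p}{2}\right)^{2/p}, \tag{3.6}
\end{align*}
which is (1.7).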
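Inequality (1.7) can be spot-checked for real symmetric $2 \times 2$ matrices, whose eigenvalues are available in closed form. This is a minimal sketch under that restriction, not a proof; the helper names (`sym2_eigs`, `schatten`, `bcl_gap`) and the encoding of a symmetric matrix as a triple `(a, b, d)` are assumptions made for this illustration.

```python
import math
import random


def sym2_eigs(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    rad = math.hypot((a - d) / 2, b)
    return (mean - rad, mean + rad)


def schatten(m, p):
    """C_p (trace) norm of a symmetric 2x2 matrix given as the triple (a, b, d)."""
    return sum(abs(ev) ** p for ev in sym2_eigs(*m)) ** (1 / p)


def bcl_gap(x, y, p):
    """Left side of (1.7) minus its right side, for symmetric 2x2 matrices x and y."""
    plus = tuple(s + t for s, t in zip(x, y))
    minus = tuple(s - t for s, t in zip(x, y))
    lhs = ((schatten(plus, p) ** p + schatten(minus, p) ** p) / 2) ** (2 / p)
    rhs = schatten(x, p) ** 2 + (p - 1) * schatten(y, p) ** 2
    return lhs - rhs


if __name__ == "__main__":
    rng = random.Random(0)
    x = tuple(rng.uniform(-1, 1) for _ in range(3))
    y = tuple(rng.uniform(-1, 1) for _ in range(3))
    print(bcl_gap(x, y, 1.5))
```

For $1 \le p \le 2$ the gap should be non-negative up to rounding; at $p = 2$ it reduces to the parallelogram identity and is zero.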
Proof of Theorem 1. First, we reduce to the case in which $X$ and $Y$ are self-adjoint. Consider the $2n \times 2n$ matrices given in block form by
\[
C = \begin{pmatrix} 0 & X \\ X^* & 0 \end{pmatrix}, \qquad D = \begin{pmatrix} 0 & Y \\ Y^* & 0 \end{pmatrix}.
\]
Clearly, if (1.7) holds for the $2n \times 2n$ matrices $C$ and $D$, it holds for $X$ and $Y$. Since $C$ and $D$ are self-adjoint, it suffices to prove inequality (3.3) for such matrices. We therefore assume without loss of generality that $X$ and $Y$ are self-adjoint. Let $Z$ and $W$ be defined in terms of $X$ and $Y$ as in (3.5). Then we can rewrite (1.7) as
\[
\left(\mathrm{Tr}|Z + rW|^p\right)^{2/p} \ge \left(\mathrm{Tr}|Z|^p\right)^{2/p} + r^2(p-1)\left(\mathrm{Tr}|W|^p\right)^{2/p}. \tag{3.7}
\]
First, note that without loss of generality we may assume by continuity that the union of the ranges of $Z$ and $W$ span $\mathbb{C}^{2n}$. Then $\det(Z + rW)$ is a polynomial of order exactly $2n$ in $r$, and it has at most $2n$ zeros for $0 \le r \le 1$. We will avoid these values of $r$ below in our computations. We define $\psi(r)$ by
\[
\psi(r) = \mathrm{Tr}|Z + rW|^p = \mathrm{Tr}\left(\left(Z^2 + r(ZW + WZ) + r^2W^2\right)^{p/2}\right).
\]
Then $\psi(r)$ is continuously differentiable and
\[
\frac{d\psi}{dr} = \frac{p}{2}\,\mathrm{Tr}\left(\left(Z^2 + r(ZW + WZ) + r^2W^2\right)^{p/2-1}\left((ZW + WZ) + 2rW^2\right)\right).
\]
With the aid of the integral representation
\[
\left(Z^2 + r(ZW + WZ) + r^2W^2\right)^{p/2-1} = \beta_p\int_0^\infty t^{p/2-1}\,\frac{1}{t + \left(Z^2 + r(ZW + WZ) + r^2W^2\right)}\,dt, \tag{3.8}
\]
we see that $d\psi(r)/dr$ is again continuously differentiable. Now, both sides of (3.7) agree at $r = 0$, and the first derivatives in $r$ of both sides vanish there as well. Moreover, the second derivative in $r$ of the left hand side of (3.7) satisfies
\[
\frac{d^2}{dr^2}\,\psi(r)^{2/p} \ge \frac{2}{p}\,\psi(r)^{(2-p)/p}\,\frac{d^2}{dr^2}\psi(r),
\]
while the second derivative on the right side of (3.7) is just $2(p-1)\left(\mathrm{Tr}|W|^p\right)^{2/p}$. It therefore suffices to show that
\[
\frac{1}{p}\,\psi(r)^{(2-p)/p}\,\frac{d^2}{dr^2}\psi(r) \ge (p-1)\left(\mathrm{Tr}|W|^p\right)^{2/p} \tag{3.9}
\]
for all $0 < r < 1$. By redefining $Z$ to be $Z + rW$, it suffices to establish (3.9) at $r = 0$. Since $Z + rW$ is non-singular, after the redefinition, $|Z|$ will be strictly positive. We now claim that
\[
\left.\frac{d^2}{dr^2}\mathrm{Tr}|Z + rW|^p\right|_{r=0} \ge \left.\frac{d^2}{dr^2}\mathrm{Tr}\big||Z| + rW\big|^p\right|_{r=0}. \tag{3.10}
\]
To see this, note that by the integral formula (3.8),
\[
\left.\frac{d^2}{dr^2}\mathrm{Tr}|Z + rW|^p\right|_{r=0} = p\,\mathrm{Tr}|Z|^{p-2}W^2 - \frac{p}{2}\,\beta_p\int_0^\infty t^{p/2-1}\,\mathrm{Tr}\left(\frac{1}{Z^2 + t}(ZW + WZ)\frac{1}{Z^2 + t}(ZW + WZ)\right)dt. \tag{3.11}
\]
The trace under the integral sign consists of four terms which, using the cyclicity of the trace, can be rewritten as
\[
2\,\mathrm{Tr}\left(WZ\,\frac{1}{Z^2 + t}\,WZ\,\frac{1}{Z^2 + t}\right) + 2\,\mathrm{Tr}\left(W\,\frac{1}{Z^2 + t}\,W\,\frac{Z^2}{Z^2 + t}\right).
\]
Since only $Z^2$ enters the second of these two terms, this term is unchanged when $Z$ is replaced by $|Z|$. Upon writing out the first term in a basis that diagonalizes $Z$, that term becomes
\[
2\sum_{i,j}\frac{1}{z_i^2 + t}\,\frac{1}{z_j^2 + t}\,|w_{ij}|^2 z_i z_j.
\]
Clearly this term, and hence the integral in (3.11), increases when $Z$ is replaced by $|Z|$. The first term in (3.11), being a function of $Z^2$, is invariant under the substitution, and the assertion (3.10) is established.
Therefore, without loss of generality we may assume that $Z > 0$. Then, of course, $Z + rW > 0$ for all $r$ sufficiently small, and we no longer need to square $Z + rW$ to obtain a positive operator whose powers can be expressed as an integral over its resolvent. Working directly with $Z + rW$, we can use the simpler integral representation
\[
(Z + rW)^{p-1} = \gamma_p\int_0^\infty t^{p-1}\left[\frac{1}{t} - \frac{1}{t + (Z + rW)}\right]dt \tag{3.12}
\]
to conclude that
\[
\psi''(0) = p\,\gamma_p\int_0^\infty t^{p-1}\,\mathrm{Tr}\left[\frac{1}{t + Z}\,W\,\frac{1}{t + Z}\,W\right]dt. \tag{3.13}
\]
Consider the right side of (3.13) as a function of $Z$ for fixed $W$. We claim that it is convex in $Z$. To prove this, it suffices to prove the following inequality for every self-adjoint matrix $A$:
\[
\Delta(A) := \left.\frac{d^2}{ds^2}\,\mathrm{Tr}\left[\frac{1}{t + (Z + sA)}\,W\,\frac{1}{t + (Z + sA)}\,W\right]\right|_{s=0} \ge 0.
\]
There are six terms. If we define
\[
C = (t + Z)^{-1/2}A(t + Z)^{-1/2} \quad\text{and}\quad D = (t + Z)^{-1/2}W(t + Z)^{-1/2},
\]
then the result of the computation is
\[
\Delta(A) = 4\,\mathrm{Tr}\,C^2D^2 + 2\,\mathrm{Tr}\,CDCD.
\]
But by the Schwarz inequality,
\[
|\mathrm{Tr}(CDCD)| \le \{\mathrm{Tr}(CD^2C)\}^{1/2}\{\mathrm{Tr}(DC^2D)\}^{1/2} = \mathrm{Tr}\,C^2D^2.
\]
Thus, $\Delta(A) \ge 0$ and the integrand in (3.13) is a convex function of $Z$. Now fix $W$ and $t$, and define
\[
F(Z) = \mathrm{Tr}\left[\frac{1}{t + Z}\,W\,\frac{1}{t + Z}\,W\right].
\]
Clearly, when $U$ is any unitary matrix that commutes with $W$, $F(UZU^*) = F(Z)$. Let $\{e_1, \dots, e_{2n}\}$ be an orthonormal basis of eigenvectors of $W$. Let $\{U_j\}$ be some enumeration of the $2^{2n}$ unitary matrices with the property that $U_je_k = \pm e_k$ for each $k$. Clearly each of these unitaries commutes with $W$. Thus, by the convexity of $F$ which we have established in the last paragraph,
\[
F(Z) = 2^{-2n}\sum_{j=1}^{2^{2n}}F(U_jZU_j^*) \ge F\left(2^{-2n}\sum_{j=1}^{2^{2n}}U_jZU_j^*\right) = F(Z_{\mathrm{diag}}),
\]
where $Z_{\mathrm{diag}}$ is the matrix whose diagonal entries, in the basis specified above, are those of $Z$, and whose off-diagonal entries are all zero. Replacing $Z$ by $Z_{\mathrm{diag}}$ in (3.13), the integration can be carried out, and we obtain
\[
\psi''(0) \ge p(p-1)\sum_{j=1}^{2n}z_j^{p-2}w_j^2,
\]
where $z_j$ and $w_j$, respectively, denote the $j$th diagonal entries of $Z$ and $W$ in the $W$-basis specified above.
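The averaging step above uses the fact that the mean of $UZU^*$ over all diagonal sign matrices $U$ is the diagonal part of $Z$. The following small sketch illustrates this for a $3 \times 3$ real symmetric example; the helper names `conjugate_by_signs` and `sign_average`, and the sample matrix `z`, are made up for this illustration.

```python
from itertools import product


def conjugate_by_signs(z, signs):
    """U Z U* for U = diag(signs): entry (i, j) is multiplied by signs[i] * signs[j]."""
    n = len(z)
    return [[signs[i] * z[i][j] * signs[j] for j in range(n)] for i in range(n)]


def sign_average(z):
    """Average of U Z U* over all 2^n diagonal matrices U with entries +1 or -1."""
    n = len(z)
    total = [[0.0] * n for _ in range(n)]
    count = 0
    for signs in product((1, -1), repeat=n):
        u = conjugate_by_signs(z, signs)
        for i in range(n):
            for j in range(n):
                total[i][j] += u[i][j]
        count += 1
    return [[total[i][j] / count for j in range(n)] for i in range(n)]


# Sample symmetric matrix, written in a basis where W is assumed diagonal.
z = [[2.0, 0.5, -1.0], [0.5, 3.0, 0.25], [-1.0, 0.25, 1.0]]

if __name__ == "__main__":
    for row in sign_average(z):
        print(row)
```

Each off-diagonal entry picks up the sign $+1$ and $-1$ equally often and so averages to zero, while the diagonal entries are left unchanged.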
Now consider $\psi(0) = \mathrm{Tr}(Z^p)$ as a function of $Z$. It is clearly convex. Thus, by the averaging method just employed, we obtain
\[
\psi(0) \ge \sum_{j=1}^{2n}z_j^p.
\]
To establish (3.9), it only remains to check that
\[
\left(\sum_{j=1}^{2n}z_j^p\right)^{(2-p)/p}\left(\sum_{j=1}^{2n}z_j^{p-2}w_j^2\right) \ge \left(\sum_{j=1}^{2n}|w_j|^p\right)^{2/p},
\]
but this follows immediately from Hölder's inequality. □
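The last inequality is Hölder's inequality applied to $|w_j|^p = \left(z_j^{p-2}w_j^2\right)^{p/2} z_j^{(2-p)p/2}$ with exponents $2/p$ and $2/(2-p)$. A quick numerical spot check, as a sketch only (the function name `holder_gap` and the sampling ranges are choices made for this illustration), is:

```python
import random


def holder_gap(z, w, p):
    """(sum z^p)^((2-p)/p) * (sum z^(p-2) w^2) minus (sum |w|^p)^(2/p), for z_j > 0."""
    s_zp = sum(t ** p for t in z)
    s_mix = sum(t ** (p - 2) * u * u for t, u in zip(z, w))
    s_wp = sum(abs(u) ** p for u in w)
    return s_zp ** ((2 - p) / p) * s_mix - s_wp ** (2 / p)


if __name__ == "__main__":
    rng = random.Random(0)
    z = [rng.uniform(0.1, 2.0) for _ in range(6)]
    w = [rng.uniform(-1.0, 1.0) for _ in range(6)]
    print(holder_gap(z, w, 1.5))
```

For $1 < p < 2$ and strictly positive $z_j$ the gap should be non-negative up to rounding.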
IV Hanner's inequality for matrices

This section is devoted to the proof of parts (b) and (c) of Theorem 2. We begin with the proof of Theorem 2(c), and then show how that implies Theorem 2(b).

Proof of Theorem 2(c) First, let $Y$ be a fixed self-adjoint $n \times n$ matrix, and consider the set $M_Y$ of $n \times n$ self-adjoint matrices given by
\[
M_Y := \{X : X + Y > 0 \text{ and } X - Y > 0\}.
\]
Clearly $M_Y$ is convex, and if $X \in M_Y$, then $X > 0$. We claim that
\[
G(X) := \|X + Y\|_p^p + \|X - Y\|_p^p - 2\|X\|_p^p \tag{4.1}
\]
is a convex function on $M_Y$. By the averaging method employed in the proof of Theorem 1, this convexity would imply that
\[
\|X + Y\|_p^p + \|X - Y\|_p^p - a\|X\|_p^p \ge \|X_{\mathrm{diag}} + Y\|_p^p + \|X_{\mathrm{diag}} - Y\|_p^p - a\|X_{\mathrm{diag}}\|_p^p \tag{4.2}
\]
for any $0 \le a \le 2$, where $X_{\mathrm{diag}}$ denotes the diagonal part of $X$ in a basis diagonalizing $Y$. (Note that if $X \in M_Y$, then $X_{\mathrm{diag}} \in M_Y$.) By Lemma 4 and Hanner's inequality in $l_p$,
\[
\|X_{\mathrm{diag}} + Y\|_p^p + \|X_{\mathrm{diag}} - Y\|_p^p - \alpha(r)\|X_{\mathrm{diag}}\|_p^p \ge \alpha(1/r)\|Y\|_p^p
\]
for all $r$, where $\alpha(r)$ is the function defined in Lemma 4. (Here we are making use of the easily checked fact that for $1 < p < 2$, $\alpha(r)$ and $\alpha(1/r)$ never exceed 2.) Combining this with (4.2), we would obtain
\[
\|X + Y\|_p^p + \|X - Y\|_p^p \ge \alpha(r)\|X\|_p^p + \alpha(1/r)\|Y\|_p^p.
\]
Then, by another application of Lemma 4, the inequality (1.8) would be established for $1 \le p \le 2$ for all matrices $X$ and $Y$ such that $X + Y$ and $X - Y$ are positive semidefinite. By Lemma 6, and the remark that follows it, (1.8) would be established for $2 \le p < \infty$ and all positive semidefinite matrices $X$ and $Y$.

It remains to establish the convexity of $G(X)$. We choose a self-adjoint matrix $A$ and define
\[
\phi(s) = \|(X + sA) + Y\|_p^p + \|(X + sA) - Y\|_p^p - 2\|X + sA\|_p^p. \tag{4.3}
\]
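Then
\[
\frac{d}{ds}\phi(s) = p\,\mathrm{Tr}\left[\left((X + sA) + Y\right)^{p-1} + \left((X + sA) - Y\right)^{p-1} - 2(X + sA)^{p-1}\right]A.
\]
Using the integral representation formula to compute the next derivative, we have
\[
\phi''(0) = p\,\gamma_p\int_0^\infty t^{p-1}\,\mathrm{Tr}\left(\left[\frac{1}{t + X + Y}A\frac{1}{t + X + Y} + \frac{1}{t + X - Y}A\frac{1}{t + X - Y} - 2\,\frac{1}{t + X}A\frac{1}{t + X}\right]A\right)dt.
\]
This is positive by the convexity of
\[
X \mapsto \mathrm{Tr}\left[\frac{1}{t + X}\,A\,\frac{1}{t + X}\,A\right],
\]
which we established in the last section. □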
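Theorem 2(c) can be spot-checked for real symmetric $2 \times 2$ matrices with $X + Y$ and $X - Y$ positive semidefinite. The sketch below is an illustration only: the helper names, the encoding of a symmetric matrix as a triple `(a, b, d)`, and the way random positive semidefinite matrices are generated (as $GG^T$) are all assumptions made here.

```python
import math
import random


def sym2_eigs(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    rad = math.hypot((a - d) / 2, b)
    return (mean - rad, mean + rad)


def schatten(m, p):
    """C_p norm of a symmetric 2x2 matrix given as the triple (a, b, d)."""
    return sum(abs(ev) ** p for ev in sym2_eigs(*m)) ** (1 / p)


def random_psd(rng):
    """A random positive semidefinite 2x2 matrix G G^T, returned as (a, b, d)."""
    g = [rng.uniform(-1, 1) for _ in range(4)]
    return (g[0] * g[0] + g[1] * g[1], g[0] * g[2] + g[1] * g[3], g[2] * g[2] + g[3] * g[3])


def hanner_matrix_gap(plus, minus, p):
    """Hanner gap for X = (plus + minus)/2 and Y = (plus - minus)/2, with plus = X+Y, minus = X-Y."""
    x = tuple((s + t) / 2 for s, t in zip(plus, minus))
    y = tuple((s - t) / 2 for s, t in zip(plus, minus))
    lhs = schatten(plus, p) ** p + schatten(minus, p) ** p
    nx, ny = schatten(x, p), schatten(y, p)
    return lhs - ((nx + ny) ** p + abs(nx - ny) ** p)


if __name__ == "__main__":
    rng = random.Random(0)
    print(hanner_matrix_gap(random_psd(rng), random_psd(rng), 1.5))
```

For $1 \le p \le 2$ the gap should be non-negative up to rounding, in line with the statement proved above.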
Proof of Theorem 2(b) We use a power doubling argument inspired by the 2-convexification method developed by Figiel and Johnson [FJ]. First, consider the case $p \ge 4$. As in the proof of Theorem 1, we can assume that $X$ and $Y$ are self-adjoint $n \times n$ matrices. The spectrum of the $2n \times 2n$ matrix
\[
\begin{pmatrix} X & Y \\ Y & X \end{pmatrix} \tag{4.4}
\]
consists of the union of the spectra of $X + Y$ and of $X - Y$. Thus, the $p$th power of its $C_p$ norm equals the left side of (1.8). By the same spectral considerations, one sees that the $p$th power of the $C_p$ norm of the $2 \times 2$ matrix
\[
\begin{pmatrix} \|X\| & \|Y\| \\ \|Y\| & \|X\| \end{pmatrix} \tag{4.5}
\]
equals the right side of (1.8). Thus, our problem is to show that the $C_p$ norm of the $2 \times 2$ matrix in (4.5) exceeds the $C_p$ norm of the $2n \times 2n$ matrix in (4.4).
Now
\[
\left\|\begin{pmatrix} X^2 + Y^2 & XY + YX \\ XY + YX & X^2 + Y^2 \end{pmatrix}\right\|_{p/2} = \left\|\begin{pmatrix} X & Y \\ Y & X \end{pmatrix}\right\|_p^2. \tag{4.6}
\]
The matrix on the left, being the square of the self-adjoint matrix in (4.4), is positive semidefinite, and it has the special block form $\begin{pmatrix} A & B \\ B & A \end{pmatrix}$. Block matrices of this form are characterized by the fact that they commute with $\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}$, where $I$ is the $n \times n$ identity matrix. Evidently, all powers of a positive semidefinite block matrix of this special form have the same special form. Thus, if $r$ is the index conjugate to $p/2$, there is a positive matrix $\begin{pmatrix} C & D \\ D & C \end{pmatrix}$ whose $C_r$-norm is 1 with the property that the norm in (4.6) is realized as
\begin{align*}
\mathrm{Tr}\left[\begin{pmatrix} X^2 + Y^2 & XY + YX \\ XY + YX & X^2 + Y^2 \end{pmatrix}\begin{pmatrix} C & D \\ D & C \end{pmatrix}\right] &= 2\,\mathrm{Tr}(X^2 + Y^2)C + 2\,\mathrm{Tr}(XY + YX)D \\
&\le 2\|C\|_r\left(\|X^2\|_{p/2} + \|Y^2\|_{p/2}\right) + 4\|D\|_r\|X\|_p\|Y\|_p
\end{align*}
by the Hölder inequality for traces of matrices.
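Let us define $\|C\| := \|C\|_r$, $\|D\| := \|D\|_r$, $\|X\| := \|X\|_p$ and $\|Y\| := \|Y\|_p$. The last expression is
\begin{align*}
2\|C\|\left(\|X\|^2 + \|Y\|^2\right) + 4\|D\|\,\|X\|\,\|Y\| &= \mathrm{Tr}\left[\begin{pmatrix} \|C\| & \|D\| \\ \|D\| & \|C\| \end{pmatrix}\begin{pmatrix} \|X\|^2 + \|Y\|^2 & 2\|X\|\,\|Y\| \\ 2\|X\|\,\|Y\| & \|X\|^2 + \|Y\|^2 \end{pmatrix}\right] \\
&\le \left\|\begin{pmatrix} \|C\| & \|D\| \\ \|D\| & \|C\| \end{pmatrix}\right\|_r\left\|\begin{pmatrix} \|X\| & \|Y\| \\ \|Y\| & \|X\| \end{pmatrix}\right\|_p^2.
\end{align*}
The positivity of the matrix $\begin{pmatrix} C & D \\ D & C \end{pmatrix}$ guarantees that both $C + D$ and $C - D$ are positive. Since $1 \le r \le 2$, Theorem 2(c) implies that
\[
\left\|\begin{pmatrix} \|C\| & \|D\| \\ \|D\| & \|C\| \end{pmatrix}\right\|_r \le \left\|\begin{pmatrix} C & D \\ D & C \end{pmatrix}\right\|_r = 1.
\]
Consequently,
\[
\left\|\begin{pmatrix} X & Y \\ Y & X \end{pmatrix}\right\|_p^2 \le \left\|\begin{pmatrix} \|X\| & \|Y\| \\ \|Y\| & \|X\| \end{pmatrix}\right\|_p^2,
\]
as required. Finally, by Lemma 6, we obtain the validity of (1.8) for $1 < p \le 4/3$.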
Remark. For all $p$, (1.8) holds (with the appropriate direction of inequality) in $C_p$ when $\|Y\| = \|X\|$ since (1.8) is then a special case of the "easy" Clarkson inequality (2.1) which was extended to $C_p$ by Dixmier [Di]. It also holds to leading order for small $Y$, as one can verify using Theorem 1. We make the natural conjecture that (1.8) holds in $C_p$ for $1 \le p < 2$, without the restrictions imposed in part (c) of Theorem 2.
References
[ArYa] Araki, H., Yamagami, S.: An inequality for the Hilbert-Schmidt norm. Commun. Math. Phys. 81, 89-96 (1981)
[BP] Ball, K., Pisier, G.: Unpublished result; private communication.
[Bo] Boas, R.P.: Some uniformly convex spaces. Bull. Am. Math. Soc. 46, 304-311 (1940)
[C] Clarkson, J.A.: Uniformly convex spaces. Trans. Am. Math. Soc. 40, 396-414 (1936)
[CL] Carlen, E., Lieb, E.: Optimal hypercontractivity for fermi fields and related non-commutative integration inequalities. Commun. Math. Phys. 155, 27-46 (1993); for a slightly different presentation, see: Optimal two-uniform convexity and fermion hypercontractivity. In: Araki, H., Ito, K.R., Kishimoto, A., Ojima, I. (eds.) Quantum and non-commutative analysis. London New York: Kluwer (in press)
[D] Day, M.: Uniform convexity in factor and conjugate spaces. Ann. Math. 45, 375-385 (1944)
[Di] Dixmier, J.: Formes linéaires sur un anneau d'opérateurs. Bull. Soc. Math. Fr. 81, 222-245 (1953)
[F] Figiel, T.: On the moduli of convexity and smoothness. Studia Math. 56, 121-155 (1976)
[FJ] Figiel, T., Johnson, W.B.: A uniformly convex Banach space which contains no $l_p$. Compos. Math. 29, 179-190 (1974)
[Gr] Gross, L.: Logarithmic Sobolev inequalities. Am. J. Math. 97, 1061-1083 (1975)
[H] Hanner, O.: On the uniform convexity of $L^p$ and $l^p$. Ark. Math. 3, 239-244 (1956)
[Ko] Köthe, G.: Topologische lineare Räume. Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen, Bd. 107. Berlin Heidelberg New York: Springer 1960
[L] Lindenstrauss, J.: On the modulus of smoothness and divergent series in Banach spaces. Mich. Math. J. 10, 241-252 (1963)
[P] Pisier, G.: The volume of convex bodies and Banach space geometry. Cambridge: Cambridge University Press, 1989
[Ru] Ruskai, M.B.: Inequalities for traces on von Neumann algebras. Commun. Math. Phys. 26, 280-289 (1972)
[Se] Segal, I.E.: A non-commutative extension of abstract integration. Ann. Math. 57, 401-457 (1953)
[Si] Simon, B.: Trace ideals and their applications. (See p. 22) Cambridge: Cambridge University Press, 1979
[TJ] Tomczak-Jaegermann, N.: The moduli of smoothness and convexity and Rademacher averages of trace classes $S_p$ ($1 \le p < \infty$). Studia Math. 50, 163-182 (1974)
With E. Carlen in in Advances in Math. Sciences, Amer. Math. Soc. Transl. (2), 189, 59-62 (1999)
Amer. Math. Soc. Transl. (2) Vol. 189, 1999

A Minkowski Type Trace Inequality and Strong Subadditivity of Quantum Entropy
Eric A. Carlen and Elliott H. Lieb
Dedicated to M. Sh. Birman on his seventieth birthday

ABSTRACT. We consider the following trace function on $n$-tuples of positive operators:
\[
\Phi_p(A_1, A_2, \dots, A_n) = \mathrm{Tr}\left(\left(\sum_{j=1}^n A_j^p\right)^{1/p}\right)
\]
and prove that it is jointly concave for $0 < p < 1$ and convex for $p = 2$. We then derive from this a Minkowski type inequality for operators on a tensor product of three Hilbert spaces, and show how this implies the strong subadditivity of quantum mechanical entropy. For $p > 2$, $\Phi_p$ is neither convex nor concave. We conjecture that $\Phi_p$ is convex for $1 < p < 2$, but our methods do not show this.
I. Introduction

Let $P_{\mathcal{H}}$ denote the set of all positive semidefinite operators on a finite dimensional Hilbert space $\mathcal{H}$ with inner product $(\cdot, \cdot)$. Then, for any finite natural number $n$, any finite $p > 0$, and any finite $n$-tuple $(A_1, A_2, \dots, A_n)$ of elements of $P_{\mathcal{H}}$, define
\[
\Phi_p(A_1, A_2, \dots, A_n) = \mathrm{Tr}\left(\left(\sum_{j=1}^n A_j^p\right)^{1/p}\right). \tag{1.1}
\]
The main result of this paper is the following

THEOREM 1. For $0 < p < 1$, $\Phi_p$ is a jointly concave function of its arguments. For $p = 2$, $\Phi_p$ is jointly convex. For $p > 2$, $\Phi_p$ is neither convex nor concave.

We conjecture that $\Phi_p$ is jointly convex for $1 < p < 2$. We state all of the theorems in a finite dimensional context, and some of our methods of proof explicitly involve this finite dimension. Nonetheless, the results themselves do not depend on the dimension, and therefore easily extend to the appropriate trace classes on an infinite dimensional Hilbert space.

1991 Mathematics Subject Classification. Primary 47A63, 15A90. Copyright 1997 in image and content by the authors. Reproduction of this article in its entirety by any means is permitted. The work of the first-named author was supported by U. S. National Science Foundation grant no. DMS 9500840. The work of the second-named author was supported by U. S. National Science Foundation grant no. PHY95-19433 A01.

We note that the trace in Theorem 1 is essential; the asserted trace inequalities do not hold as operator inequalities. If they did, we would have, for example at $p = 2$, that $(A^2 + B^2)^{1/2} \le A + B$. This is of course not true for general positive operators, as is well known and easily checked.
We shall use Theorem 1 to derive a Minkowski type inequality for traces of operators on a product of three Hilbert spaces. To set this in perspective, recall that the Minkowski inequality says that for non-negative measurable functions $f$ on the Cartesian product of two measure spaces $(X, \mu)$ and $(Y, \nu)$,
\[
\left(\int_X\left(\int_Y f(x, y)\,d\nu\right)^p d\mu\right)^{1/p} \le \int_Y\left(\int_X f^p(x, y)\,d\mu\right)^{1/p}d\nu \tag{1.2}
\]
for $p > 1$, and that the opposite inequality holds for $0 < p < 1$. A direct analog of (1.2) holds for positive operators $A$ on the tensor product of two Hilbert spaces $\mathcal{H}_1 \otimes \mathcal{H}_2$. To state it, let $\mathrm{Tr}_1 A$ denote the positive operator on $\mathcal{H}_2$ that is given as a quadratic form by
\[
(v, \mathrm{Tr}_1 A\,v) = \sum_i\left(u_i \otimes v, A(u_i \otimes v)\right),
\]
where $v \in \mathcal{H}_2$ and the $u_i$ constitute an orthonormal basis of $\mathcal{H}_1$. As is well known, the quadratic form on the left is independent of the choice of the orthonormal basis on the right. The operator $\mathrm{Tr}_1 A$ so defined is called the partial trace of $A$ over $\mathcal{H}_1$. It will be convenient, and generally clearer, in what follows to write $\mathrm{Tr}_1$ also to denote the usual trace on $\mathcal{H}_1$ for operators $A$ on $\mathcal{H}_1$ alone. The following is the tracial analog of (1.2).

THEOREM 2. Let $A$ be a positive operator on the tensor product of two Hilbert spaces $\mathcal{H}_1 \otimes \mathcal{H}_2$. Then for all $p > 1$,
\[
\left(\mathrm{Tr}_2(\mathrm{Tr}_1 A)^p\right)^{1/p} \le \mathrm{Tr}_1\left((\mathrm{Tr}_2 A^p)^{1/p}\right) \tag{1.3}
\]
and inequality (1.3) reverses for $0 < p < 1$.
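When $A$ is diagonal in a product basis, with entries $a_{ij} \ge 0$ ($i$ indexing $\mathcal{H}_1$, $j$ indexing $\mathcal{H}_2$), inequality (1.3) reduces to (1.2) with counting measures. The following is a minimal sketch of that commutative special case only; the function name `minkowski_sides` and the sample array are made up for this illustration.

```python
def minkowski_sides(a, p):
    """For a[i][j] >= 0, return (lhs, rhs) of the diagonal case of (1.3).

    lhs = (sum_j (sum_i a_ij)^p)^(1/p), rhs = sum_i (sum_j a_ij^p)^(1/p).
    """
    rows, cols = len(a), len(a[0])
    col_sums = [sum(a[i][j] for i in range(rows)) for j in range(cols)]
    lhs = sum(s ** p for s in col_sums) ** (1 / p)
    rhs = sum(sum(a[i][j] ** p for j in range(cols)) ** (1 / p) for i in range(rows))
    return lhs, rhs


if __name__ == "__main__":
    sample = [[1.0, 2.0], [3.0, 4.0]]
    print(minkowski_sides(sample, 2.0))   # lhs <= rhs expected for p > 1
    print(minkowski_sides(sample, 0.5))   # lhs >= rhs expected for 0 < p < 1
```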
Returning to (1.2), note that it has a trivial extension to functions of three (or more) variables. Though trivial, it has an interesting consequence. If one considers a non-negative measurable function $f(x, y, z)$ on the Cartesian product of three measure spaces $(X, \mu)$, $(Y, \nu)$ and $(Z, \rho)$, and simply holds $z$ fixed as a parameter, one gets
\[
\left(\int_X\left(\int_Y f(x, y, z)\,d\nu\right)^p d\mu\right)^{1/p} \le \int_Y\left(\int_X f^p(x, y, z)\,d\mu\right)^{1/p}d\nu \tag{1.4}
\]
pointwise in $z$ for $p > 1$. Integrating in $z$ then yields
\[
\int_Z\left(\int_X\left(\int_Y f(x, y, z)\,d\nu\right)^p d\mu\right)^{1/p}d\rho \le \int_Z\int_Y\left(\int_X f^p(x, y, z)\,d\mu\right)^{1/p}d\nu\,d\rho \tag{1.5}
\]
for $p > 1$, and of course the inequality reverses for $0 < p < 1$.
Now, since (1.5) is an equality at $p = 1$, we get another inequality by differentiating (1.5) with respect to $p$ at $p = 1$. This yields an entropy inequality. In fact, using the homogeneity of (1.5), we can normalize $f$ so that it is a probability density. Recall that for any probability density $\rho$ on any measure space $(X, \mu)$, the entropy $S(\rho)$ is defined as
\[
S(\rho) = -\int_X \rho\ln\rho\,d\mu. \tag{1.6}
\]
We denote various marginal densities of $f$ as follows:
\[
f_{2,3}(y, z) = \int_X f(x, y, z)\,d\mu, \qquad f_{1,3}(x, z) = \int_Y f(x, y, z)\,d\nu, \qquad f_3(z) = \int_X\int_Y f(x, y, z)\,d\mu\,d\nu.
\]
Then the derivative of (1.5) at $p = 1$ is
\[
S(f_{1,3}) + S(f_{2,3}) \ge S(f_{1,2,3}) + S(f_3), \tag{1.7}
\]
which is the strong subadditivity of the classical entropy; see [L75]. Now consider operators on the product of three Hilbert spaces, and a density matrix $A$; i.e., a positive operator on $\mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3$ with $\mathrm{Tr}\,A = 1$. The entropy $S(A)$ of a density matrix $A$ is defined by
\[
S(A) = -\mathrm{Tr}(A\ln A). \tag{1.8}
\]
The operator analog of (1.7) is the Lieb-Ruskai [LR] strong subadditivity inequality for the quantum mechanical entropy:
\[
S(A_{1,3}) + S(A_{2,3}) \ge S(A_{1,2,3}) + S(A_3), \tag{1.9}
\]
where, in analogy with our notational conventions for marginal densities, we define
\[
A_{1,2,3} = A, \qquad A_{2,3} = \mathrm{Tr}_1 A, \qquad A_3 = \mathrm{Tr}_1\mathrm{Tr}_2 A
\]
and so forth.
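The classical inequality (1.7) is easy to test on finite probability spaces with counting measure. The sketch below is illustrative only: the function names (`entropy`, `ssa_gap`, `random_joint`) and the nested-list layout `f[x][y][z]` are assumptions made here, and it covers only the commutative inequality, not the operator inequality (1.9).

```python
import math
import random


def entropy(probs):
    """Shannon entropy -sum q ln q of a list of probabilities (0 ln 0 taken as 0)."""
    return -sum(q * math.log(q) for q in probs if q > 0)


def ssa_gap(f):
    """S(f_13) + S(f_23) - S(f_123) - S(f_3) for a joint distribution f[x][y][z]."""
    nx, ny, nz = len(f), len(f[0]), len(f[0][0])
    f123 = [f[x][y][z] for x in range(nx) for y in range(ny) for z in range(nz)]
    f13 = [sum(f[x][y][z] for y in range(ny)) for x in range(nx) for z in range(nz)]
    f23 = [sum(f[x][y][z] for x in range(nx)) for y in range(ny) for z in range(nz)]
    f3 = [sum(f[x][y][z] for x in range(nx) for y in range(ny)) for z in range(nz)]
    return entropy(f13) + entropy(f23) - entropy(f123) - entropy(f3)


def random_joint(rng, nx, ny, nz):
    """A random strictly positive joint distribution on an nx * ny * nz grid."""
    w = [[[rng.uniform(0.01, 1.0) for _ in range(nz)] for _ in range(ny)] for _ in range(nx)]
    total = sum(w[x][y][z] for x in range(nx) for y in range(ny) for z in range(nz))
    return [[[w[x][y][z] / total for z in range(nz)] for y in range(ny)] for x in range(nx)]


if __name__ == "__main__":
    print(ssa_gap(random_joint(random.Random(0), 2, 3, 2)))
```

The gap is the conditional mutual information of the first two variables given the third, so it should be non-negative and should vanish for product distributions.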
Thus, the differential form of Minkowski type inequality (1.7) is known to hold at $p = 1$ for operators. It is therefore natural to enquire whether there exists an operator analog of the three-variable Minkowski inequality (1.5) for other values of $p$. Unfortunately, the methods at our disposal suffice to establish this only for $0 < p \le 1$ and $p = 2$.

THEOREM 3. Let $A$ be a positive operator on the tensor product of three Hilbert spaces $\mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3$. Then
\[
\mathrm{Tr}_3\left(\mathrm{Tr}_2(\mathrm{Tr}_1 A)^p\right)^{1/p} \le \mathrm{Tr}_{1,3}\left((\mathrm{Tr}_2 A^p)^{1/p}\right) \tag{1.10}
\]
for $p = 2$ and, trivially, $p = 1$, while the reverse inequality holds for $0 < p < 1$.

This is, nonetheless, enough to imply the strong subadditivity (1.9): one simply takes the left derivative at $p = 1$. It is readily seen by considering block-diagonal matrices that the inequality of Theorem 3 implies the convexity of $\Phi_p$ for $p = 2$, and the concavity of $\Phi_p$ for $0 < p < 1$. By the same token, (1.10) cannot hold in general for $p > 2$ since this would imply the convexity of $\Phi_p$ for such $p$, and Theorem 1 precludes this. This is in contrast to Theorem 2, the Minkowski inequality for two spaces, which holds for all $p > 1$.

The fact that there is such an easy passage from the Minkowski inequality in two variables to that in three variables may leave one surprised that there should be any difficulty in making the same passage with operators. But difficulty there is. In fact, even the simple version in Theorem 2 seems to require a more intricate proof than does the corresponding statement for integrals, which after all is simply the statement that the unit ball in $L^p$ is convex for $p > 1$. In fact, we know of no previous proof of Theorem 2. We emphasize that there is no operator analog of the pointwise inequality (1.4). That is, if we omit $\mathrm{Tr}_3$ on both sides of (1.10), the result will be two operators on $\mathcal{H}_3$, and these two operators do not satisfy the corresponding operator inequality.

We present a proof of Theorem 2 in Section II. Then in Section III we prove Theorem 1. In Section IV, we recast Theorem 1 into an equivalent form, from which Theorem 3 is readily derived in Section IV. Section V contains a brief comment on a relation between the conjectured convexity for $1 < p < 2$ and a very interesting trace inequality of Birman, Koplienko and Solomyak [BKS].
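II. Proof of Theorem 2

The following proof of Theorem 2 is given for matrices, but is easily extended to operators as the statement is dimension independent.

Let $A$ be a positive operator on $\mathcal{H}_1 \otimes \mathcal{H}_2$, the tensor product of two finite dimensional Hilbert spaces. Suppose first that $p > 1$. We proceed by duality. There is a positive operator $B$ in $P_{\mathcal{H}_2}$ with $(\mathrm{Tr}_2 B^q)^{1/q} = 1$, where $1/q + 1/p = 1$, such that
\[
\left(\mathrm{Tr}_2(\mathrm{Tr}_1 A)^p\right)^{1/p} = \mathrm{Tr}_{1,2}\left((I \otimes B)A\right) = \sum_{i,j}\left(u_i \otimes v_j, (I \otimes B)A(u_i \otimes v_j)\right) = \sum_{i,j}\left(u_i \otimes Bv_j, A(u_i \otimes v_j)\right)
\]
for any pair of orthonormal bases $\{u_i\}$ and $\{v_j\}$. We now choose the $\{v_j\}$ to be a basis of eigenvectors of $B$, and let $\{\lambda_j\}$ be the corresponding eigenvalues. Then the right hand side above becomes
\[
\sum_{i,j}\lambda_j\left(u_i \otimes v_j, A(u_i \otimes v_j)\right) \le \sum_i\left(\sum_j\lambda_j^q\right)^{1/q}\left(\sum_j\left(u_i \otimes v_j, A(u_i \otimes v_j)\right)^p\right)^{1/p} = \sum_i\left(\sum_j\left(u_i \otimes v_j, A(u_i \otimes v_j)\right)^p\right)^{1/p}.
\]
Next, by the spectral theorem, for each $i$ and $j$,
\[
\left(u_i \otimes v_j, A(u_i \otimes v_j)\right)^p \le \left(u_i \otimes v_j, A^p(u_i \otimes v_j)\right).
\]
Using this, one arrives at
\[
\left(\mathrm{Tr}_2(\mathrm{Tr}_1 A)^p\right)^{1/p} \le \sum_i\left(\sum_j\left(u_i \otimes v_j, A^p(u_i \otimes v_j)\right)\right)^{1/p} = \sum_i\left(u_i, \mathrm{Tr}_2 A^p\,u_i\right)^{1/p}.
\]
Now we choose the $\{u_i\}$ to be a basis of eigenvectors of $\mathrm{Tr}_2 A^p$. Then
\[
\sum_i\left(u_i, \mathrm{Tr}_2 A^p\,u_i\right)^{1/p} = \sum_i\left(u_i, (\mathrm{Tr}_2 A^p)^{1/p}u_i\right) = \mathrm{Tr}_1(\mathrm{Tr}_2 A^p)^{1/p}
\]
and the desired inequality is proved for $p > 1$. Note that this part of the proof works for all $p > 1$, not only $1 < p < 2$.

Now suppose $0 < p < 1$, and define $r = 1/p$ and $B = A^p$ so that $A = B^r$. Since $r > 1$, the inequality proved above says
\[
\mathrm{Tr}_1\left((\mathrm{Tr}_2 B^r)^{1/r}\right) \ge \left(\mathrm{Tr}_2(\mathrm{Tr}_1 B)^r\right)^{1/r}.
\]
Rewriting this in terms of $A$ and $p$, and switching the roles of $\mathcal{H}_1$ and $\mathcal{H}_2$, one obtains the desired result for $0 < p < 1$. □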
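III. Proof of Theorem 1

As before we give the proof for matrices. Consider first the case $0 < p < 1$. The proof in this case proceeds by reduction to a theorem of Epstein [E] concerning the function
\[
A \mapsto \mathrm{Tr}\left((BA^pB)^{1/p}\right)
\]
on $P_{\mathcal{H}}$, where $B$ is any given element of $P_{\mathcal{H}}$. Epstein's theorem says that this function is concave for $0 < p < 1$. To apply this, consider first the case $n = 2$ in (1.1), and define
\[
\mathcal{A} = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix} \quad\text{and}\quad \sigma = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]
Then
\[
\mathcal{A}^p + \sigma\mathcal{A}^p\sigma = \begin{pmatrix} A_1^p + A_2^p & 0 \\ 0 & A_1^p + A_2^p \end{pmatrix}.
\]
But
\[
\mathcal{A}^p + \sigma\mathcal{A}^p\sigma = 2\left(\frac{1 + \sigma}{2}\right)\mathcal{A}^p\left(\frac{1 + \sigma}{2}\right) + 2\left(\frac{1 - \sigma}{2}\right)\mathcal{A}^p\left(\frac{1 - \sigma}{2}\right).
\]
Now define
\[
\Pi_\pm = \frac{1 \pm \sigma}{2}
\]
and observe that these are complementary orthogonal projections. Thus,
\[
2\,\mathrm{Tr}\left((A_1^p + A_2^p)^{1/p}\right) = 2^{1/p}\,\mathrm{Tr}\left((\Pi_+\mathcal{A}^p\Pi_+)^{1/p}\right) + 2^{1/p}\,\mathrm{Tr}\left((\Pi_-\mathcal{A}^p\Pi_-)^{1/p}\right). \tag{3.1}
\]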
Epstein's theorem, with A = A and B = lI±, now implies that each term on the right hand side of (3.1) is a concave function of A, which means that the left hand side is a jointly concave function of AI and A2. This concludes the proof for n = 2. One now easily iterates this procedure to obtain the result for all dyadic powers n = 2'`, and hence for all n. To prove the convexity of 4>2 there are several ways to proceed, but the simplest was pointed out to us by S. Sahi. Namely, let n be given and consider the block
With E. Carlen in Advances in Math. Sciences, Amer. Math. Soc. Transl. (2), 189, 59-62 (1999)
ERIC A. CARLEN AND ELLIOTT H. LIEB
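matrix $\mathcal{A}$ given by
$$\mathcal{A} = \begin{bmatrix} A_1 & 0 & \cdots & 0\\ A_2 & 0 & \cdots & 0\\ \vdots & \vdots & & \vdots\\ A_n & 0 & \cdots & 0\end{bmatrix}.$$
Then $\Phi_2(A_1, A_2, \dots, A_n) = \mathrm{Tr}\,|\mathcal{A}|$, where $|X|$ is the usual operator absolute value; i.e., $|X| = (X^*X)^{1/2}$. In other words, $\Phi_2(A_1, A_2, \dots, A_n)$ is simply the trace norm of $\mathcal{A}$, and is therefore clearly jointly convex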
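in $A_1, A_2, \dots, A_n$.

Finally, we show that convexity fails to hold for $p > 2$. To see this, choose any pair $A_1, A_2 \in \mathcal{P}_{\mathcal{H}}$, and any unit vector $v$ such that
$$\langle v, ((A_1^p + A_2^p)/2)v\rangle < \langle v, ((A_1 + A_2)/2)^pv\rangle. \qquad (3.2)$$
Note the strict inequality here. It is always possible to find such $A_1$, $A_2$ and $v$ for $p > 2$ since, for such $p$, $X \mapsto X^p$ is not operator convex. Now let $\Pi_v$ denote the orthogonal projection onto the span of $v$, and let $\Pi_v^\perp = I - \Pi_v$ denote its orthogonal complement. Then, for a large number $\lambda$ to be fixed below, put
$$B = \Pi_v + \lambda\Pi_v^\perp.$$
Then, if $\Phi_p$ were convex, we would have
$$\Phi_p\Big(t\frac{A_1 + A_2}{2}, B\Big) - \frac12\Phi_p(tA_1, B) - \frac12\Phi_p(tA_2, B) \le 0. \qquad (3.3)$$
However, for small $t > 0$,
$$\Phi_p\Big(t\frac{A_1 + A_2}{2}, B\Big) = \mathrm{Tr}\Big(t^p\Big(\frac{A_1 + A_2}{2}\Big)^p + B^p\Big)^{1/p} = \mathrm{Tr}\,B + \frac{t^p}{p}\mathrm{Tr}\Big(B^{1-p}\Big(\frac{A_1 + A_2}{2}\Big)^p\Big) + O(t^{2p})$$
and
$$\frac12\Phi_p(tA_1, B) + \frac12\Phi_p(tA_2, B) = \mathrm{Tr}\,B + \frac{t^p}{p}\Big(\frac12\mathrm{Tr}\,B^{1-p}A_1^p + \frac12\mathrm{Tr}\,B^{1-p}A_2^p\Big) + O(t^{2p}).$$
Thus,
$$\lim_{t\to0}\frac{p}{t^p}\Big(\Phi_p\Big(t\frac{A_1 + A_2}{2}, B\Big) - \frac12\Phi_p(tA_1, B) - \frac12\Phi_p(tA_2, B)\Big)$$
$$= \mathrm{Tr}\Big(B^{1-p}\Big(\frac{A_1 + A_2}{2}\Big)^p\Big) - \Big(\frac12\mathrm{Tr}\,B^{1-p}A_1^p + \frac12\mathrm{Tr}\,B^{1-p}A_2^p\Big)$$
$$= \Big\langle v, \Big(\frac{A_1 + A_2}{2}\Big)^pv\Big\rangle - \Big(\frac{\langle v, A_1^pv\rangle}{2} + \frac{\langle v, A_2^pv\rangle}{2}\Big) + O(\lambda^{1-p}).$$
Now taking $\lambda$ sufficiently large, this last expression is strictly positive by (3.2). This contradicts (3.3), and thus convexity does not hold, not even separately. $\Box$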
IV. Corollary of Theorem 1 and proof of Theorem 3

A corollary of Theorem 1 is obtained by writing the partial trace as an average, and exploiting the convexity and concavity established above. Let $A$ be a positive operator on $\mathcal{H}_1\otimes\mathcal{H}_2$. Next, suppose the dimension of $\mathcal{H}_2$ is $N$, and fix some orthonormal basis $\{e_1, e_2, \dots, e_N\}$. With respect to this basis, define the self-adjoint unitary operators $U_{i,j}$ and $V_i$ on $\mathcal{H}_2$ by
$$U_{i,j} = I - E_{i,i} - E_{j,j} + E_{i,j} + E_{j,i}, \qquad V_i = I - 2E_{i,i},$$
where $i$ and $j$ are a distinct pair of indices, and $E_{i,j}$ in this basis has the matrix with 1 in the $i,j$th place, and 0 elsewhere. Let $G$ be the subgroup of the group of unitary operators on $\mathcal{H}_2$ that is generated by this family together with the identity. Each operator $W$ in this group acts by
$$We_j = (-1)^{s(j)}e_{\pi(j)}$$
for some permutation $\pi$ and some map $s : \{1, 2, \dots, N\} \to \{0, 1\}$. Thus, the size of the group is $2^NN!$, and the point about it is that any operator on $\mathcal{H}_2$ that commutes with every element of this group is necessarily a multiple of the identity on $\mathcal{H}_2$. Then
$$\frac{1}{2^NN!}\sum_{W\in G}(I\otimes W^*)A(I\otimes W) = \frac1N\mathrm{Tr}_2(A)\otimes I_{\mathcal{H}_2}. \qquad (4.1)$$
This way of writing partial traces can be traced back to Uhlmann [U]. From here one easily arrives at the following result:
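THEOREM 4. For $p > 0$, let the map $\Psi_p(A)$ from positive operators $A$ on $\mathcal{H}_1\otimes\mathcal{H}_2$ to $\mathbb{R}_+$ be given by
$$\Psi_p(A) = \mathrm{Tr}_1((\mathrm{Tr}_2A^p)^{1/p}). \qquad (4.2)$$
Then this map is concave for $0 < p < 1$, convex for $p = 2$, and neither for $p > 2$.

PROOF. We shall assume that the dimension of $\mathcal{H}_2$ is $N$ so that we may apply the averaging formula introduced above. We then have
$$\Psi_p(A) = N^{1/p-1}\,\mathrm{Tr}_{1,2}\Big(\Big(\frac{1}{2^NN!}\sum_{W\in G}(I\otimes W^*)A^p(I\otimes W)\Big)^{1/p}\Big)$$
$$= N^{1/p-1}\Big(\frac{1}{2^NN!}\Big)^{1/p}\mathrm{Tr}_{1,2}\Big(\Big(\sum_{W\in G}((I\otimes W^*)A(I\otimes W))^p\Big)^{1/p}\Big).$$
The result now follows directly from Theorem 1. $\Box$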
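Notice that the conclusion of Theorem 4 not only follows from Theorem 1, but also implies it. To see this, suppose that the $A$ in Theorem 4 is block diagonal with
$$A = \begin{bmatrix} A_1 & 0 & \cdots & 0\\ 0 & A_2 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & A_N\end{bmatrix}.$$
Then clearly $\Psi_p(A) = \Phi_p(A_1, \dots, A_N)$.

For Theorem 3, let $N$ be the dimension of $\mathcal{H}_1$. Then
$$\mathrm{Tr}_3(\mathrm{Tr}_2(\mathrm{Tr}_1A)^p)^{1/p} = \mathrm{Tr}_{1,3}\Big(\mathrm{Tr}_2\Big(\frac1N\mathrm{Tr}_1A\otimes I_{\mathcal{H}_1}\Big)^p\Big)^{1/p} = \Psi_p\Big(\frac1N\mathrm{Tr}_1A\otimes I_{\mathcal{H}_1}\Big),$$
where the pair of spaces in the definition of $\Psi_p$ is taken to be $\mathcal{H}_2$ and $\mathcal{H}_1\otimes\mathcal{H}_3$. Then by (4.1), applied with $W$ acting on $\mathcal{H}_1$, and the convexity of $\Psi_p$ established in Theorem 4,
$$\Psi_p\Big(\frac1N\mathrm{Tr}_1A\otimes I_{\mathcal{H}_1}\Big) = \Psi_p\Big(\frac{1}{2^NN!}\sum_{W\in G}(I\otimes W^*)A(I\otimes W)\Big) \le \frac{1}{2^NN!}\sum_{W\in G}\Psi_p((I\otimes W^*)A(I\otimes W)).$$
The last term above is
$$\frac{1}{2^NN!}\sum_{W\in G}\mathrm{Tr}_{1,3}(\mathrm{Tr}_2((I\otimes W^*)A^p(I\otimes W)))^{1/p} = \frac{1}{2^NN!}\sum_{W\in G}\mathrm{Tr}_{1,3}((I\otimes W^*)(\mathrm{Tr}_2A^p)(I\otimes W))^{1/p}$$
$$= \frac{1}{2^NN!}\sum_{W\in G}\mathrm{Tr}_{1,3}((I\otimes W^*)(\mathrm{Tr}_2A^p)^{1/p}(I\otimes W)) = \mathrm{Tr}_{1,3}((\mathrm{Tr}_2A^p)^{1/p}),$$
which is the desired result. $\Box$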
V. The BKS inequality and the $1 < p < 2$ conjecture

Birman, Koplienko and Solomyak [BKS] proved that for $p > 1$, and $A$ and $B$ positive semidefinite operators,
$$\mathrm{Tr}(B^p - A^p)_+^{1/p} \ge \mathrm{Tr}(B - A)_+, \qquad (5.1)$$
where $X_+$ denotes the positive part of a selfadjoint operator $X$; i.e., $X_+ = (X + |X|)/2$. In (5.1), neither $B$ nor $A$ needs to be bounded, but it is assumed that $(B^p - A^p)_+^{1/p}$ is trace class. Though the inequality in (5.1) is only one of
several very interesting inequalities proved in [BKS], we refer to it here as the BKS inequality.
The proof is in two parts, the first of which is to reduce consideration to the case $B^p \ge A^p$, in which case one has $B = (A^p + C^p)^{1/p}$ with $C \ge 0$. Then (5.1) becomes
$$\mathrm{Tr}(C + A) \ge \mathrm{Tr}(A^p + C^p)^{1/p} \qquad (5.2)$$
for all $A \ge 0$ and $C \ge 0$. It is (5.2) that interests us here. Clearly (5.2) can be rewritten as
$$\Phi_p(A, C) \le \Phi_p(A, 0) + \Phi_p(0, C), \qquad (5.3)$$
which is a subadditivity property of $\Phi_p$ for all $p > 1$. Since $\Phi_p$ is homogeneous of degree 1, subadditivity and convexity are the same thing. Thus for $p = 2$, (5.3) is a special case of the convexity of $\Phi_2$ proved in Theorem 1, and for $1 < p < 2$, it would be a consequence of the conjectured convexity for these $p$. However, the BKS inequality holds for all $p > 1$, not only for $1 < p < 2$.

There is a simple proof of (5.2) for matrices. Let
$$M_\pm = \begin{bmatrix} A^{p/2} & 0\\ \pm C^{p/2} & 0\end{bmatrix}$$
so that
$$\mathrm{Tr}(M_\pm^*M_\pm)^{1/p} = \mathrm{Tr}(A^p + C^p)^{1/p}.$$
On the other hand, the spectrum of $M_\pm^*M_\pm$ is the same as the spectrum of $M_\pm M_\pm^*$, so
$$\mathrm{Tr}(A^p + C^p)^{1/p} = \mathrm{Tr}(M_\pm M_\pm^*)^{1/p}.$$
One computes
$$M_\pm M_\pm^* = \begin{bmatrix} A^p & \pm J\\ \pm J^* & C^p\end{bmatrix}$$
with $J = A^{p/2}C^{p/2}$. Since $X \mapsto \mathrm{Tr}(X^{1/p})$ is concave for $p > 1$, one has that
$$\mathrm{Tr}(A + C) = \mathrm{Tr}\Big(\frac{M_+M_+^* + M_-M_-^*}{2}\Big)^{1/p} \ge \mathrm{Tr}(A^p + C^p)^{1/p}.$$
A recent application of the BKS inequality, and a different proof of (5.1) that holds in the case of unbounded operators, can be found in [LSS].
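The matrix form (5.2) of the BKS inequality is easy to probe numerically. The following is only an illustrative sketch, not part of the original argument: it restricts to real symmetric $2\times2$ matrices so that fractional powers can be computed in closed form with the standard library, and the helper names (`sym2_power`, `random_psd`, `bks_gap`) are hypothetical.

```python
import math
import random


def sym2_power(matrix, exponent):
    """Return M**exponent for a real symmetric PSD 2x2 matrix.

    The matrix [[a, b], [b, d]] is stored as the tuple (a, b, d) and the
    power is computed from its spectral decomposition.
    """
    a, b, d = matrix
    theta = 0.5 * math.atan2(2.0 * b, a - d)  # rotation angle that diagonalizes M
    c, s = math.cos(theta), math.sin(theta)
    lam1 = a * c * c + 2.0 * b * c * s + d * s * s  # eigenvalue for (c, s)
    lam2 = a * s * s - 2.0 * b * c * s + d * c * c  # eigenvalue for (-s, c)
    # Clip tiny negative rounding errors before taking real powers.
    f1 = max(lam1, 0.0) ** exponent
    f2 = max(lam2, 0.0) ** exponent
    return (f1 * c * c + f2 * s * s, (f1 - f2) * c * s, f1 * s * s + f2 * c * c)


def trace(matrix):
    a, _, d = matrix
    return a + d


def random_psd(rng):
    """Random PSD matrix of the form G^T G with G upper triangular."""
    x, y, z = (rng.uniform(-2.0, 2.0) for _ in range(3))
    return (x * x, x * y, y * y + z * z)


def bks_gap(A, C, p):
    """Tr(A + C) - Tr((A^p + C^p)^(1/p)); by (5.2) this is >= 0 for p > 1."""
    Ap, Cp = sym2_power(A, p), sym2_power(C, p)
    total = tuple(u + v for u, v in zip(Ap, Cp))
    return trace(A) + trace(C) - trace(sym2_power(total, 1.0 / p))


if __name__ == "__main__":
    rng = random.Random(0)
    worst = min(bks_gap(random_psd(rng), random_psd(rng), 3.0) for _ in range(1000))
    print("smallest gap found for p = 3:", worst)
```

The gap vanishes when $A$ and $C$ have orthogonal ranges (for example two orthogonal rank-one projections) and is strictly positive for $A = C = I$, where it equals $4 - 2\cdot2^{1/p}$.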
Acknowledgements. We thank T. Ando and F. Hiai for a careful reading of this paper, and for pointing out many misprints in an earlier draft.
References

[AH] T. Ando and F. Hiai, Hölder type inequalities for matrices, Preprint, 1997.
[BKS] M. S. Birman, L. S. Koplienko, and M. Z. Solomyak, Estimates for the spectrum of the difference between fractional powers of two selfadjoint operators, J. Soviet Math. 19 (1975), no. 3, 1-6.
[E] H. Epstein, On a concavity theorem of Lieb, Commun. Math. Phys. 31 (1973), 317-327.
[L75] E. H. Lieb, Some convexity and subadditivity properties of entropy, Bull. Amer. Math. Soc. 81 (1975), 1-13.
[LR] E. H. Lieb and M. B. Ruskai, Proof of the strong subadditivity of quantum-mechanical entropy, J. Math. Phys. 14 (1973), 1938-1941.
[LSS] E. H. Lieb, H. Siedentop, and J. P. Solovej, Stability and instability of relativistic electrons in magnetic fields, J. Stat. Phys. 89 (1997), 37-59.
[U] A. Uhlmann, Sätze über Dichtematrizen, Wiss. Z. Karl-Marx Univ. Leipzig 20 (1971), 633-653.

SCHOOL OF MATHEMATICS, GEORGIA INSTITUTE OF TECHNOLOGY, ATLANTA, GEORGIA 30332
DEPARTMENTS OF MATHEMATICS AND PHYSICS, PRINCETON UNIVERSITY, PRINCETON, NEW JERSEY 08544-0708
Part III
Inequalities Related to the Stability of Matter
With W. Thirring in Studies in Mathematical Physics, E. Lieb, B. Simon, A. Wightman eds.
INEQUALITIES FOR THE MOMENTS OF THE EIGENVALUES OF THE SCHRODINGER HAMILTONIAN AND THEIR RELATION TO SOBOLEV INEQUALITIES
Elliott H. Lieb* and Walter E. Thirring

* Work supported by U. S. National Science Foundation Grant MPS 71-03375-A03.

1. Introduction

Estimates for the number of bound states and their energies, $e_j \le 0$, are of obvious importance for the investigation of quantum mechanical Hamiltonians. If the latter are of the single particle form $H = -\Delta + V(x)$ in $\mathbb{R}^n$, we shall use available methods to derive the bounds
$$\sum_j|e_j|^\gamma \le L_{\gamma,n}\int d^nx\,|V(x)|_-^{\gamma+n/2}, \qquad \gamma > \max(0, 1 - n/2). \qquad (1.1)$$
Here, $|V(x)|_- = -V(x)$ if $V(x) \le 0$ and is zero otherwise.

Of course, in many-body theory, one is more interested in Hamiltonians of the form $-\sum_i\Delta_i + \sum_{i>j}v(x_i - x_j)$. It turns out, however, that the energy bounds for the single particle Hamiltonian yield a lower bound for the kinetic energy, $T$, of $N$ fermions in terms of integrals over the single particle density defined by
$$\rho(x) = N\int|\psi(x, x_2, \dots, x_N)|^2\,d^nx_2\cdots d^nx_N, \qquad (1.2)$$
where $\psi$ is an antisymmetric, normalized function of the $N$ variables $x_i \in \mathbb{R}^n$. Our main results, in addition to (1.1), will be of the form
$$T \equiv \sum_{i=1}^N\int|\nabla_i\psi(x_1, \dots, x_N)|^2\,d^nx_1\cdots d^nx_N \ge K_{p,n}\Big[\int d^nx\,\rho(x)^{p/(p-1)}\Big]^{2(p-1)/n} \qquad (1.3)$$
when $\max\{n/2, 1\} \le p \le 1 + n/2$. For $N = 1$, $p = n/2$, (1.3) reduces to the well-known Sobolev inequalities. (1.3) is therefore a partial generalization of these inequalities, and we shall expand on this in Section 3. Our constants $K_{p,n}$ are not always the best possible ones, but nevertheless, they may be useful for many purposes. In particular, in ref. [1], a special case of (1.3) was used to give a simple proof of the stability of matter, with a constant of the right order of magnitude. The result for $q$ species of fermions ($2m = e = \hbar = 1$) moving in the field of $M$ nuclei with positive charges $Z_j$ is
$$H \ge -1.31\,q^{2/3}N\Big[1 + \Big(\sum_{j=1}^MZ_j^{7/3}/N\Big)^{1/2}\Big]^2. \qquad (1.4)$$
In particular, if $q = 2$ (spin 1/2 electrons), we have a bound $\sim N$, and if we set $q = N$, we get a bound $\sim N^{5/3}$ if no symmetry requirement is imposed on the wave function; a fortiori this is a bound for bosons. Our bound implies stability of matter in its intuitive meaning such that the volume occupied by $N$ particles will be $\sim N$ (Bohr radius)$^3$. To give a formal demonstration of this fact, one might use a method which gives lower bounds for the radii of complex atoms (compare Equation (3.6, 38)
of ref. [20]). As a first observation, one calculates the ground state energy of $N$ electrons (with spin) in a harmonic potential. Filling the oscillator levels, one finds
$$\sum_{i=1}^N(-\Delta_i + \omega^2x_i^2) \ge \omega N^{4/3}\,\frac{3^{4/3}}{2}\,(1 + O(N^{-1/3})). \qquad (1.5)$$
Next, take the expectation value of this operator inequality with the ground state of $H$, set
$$\omega = \frac{4}{3^{4/3}}\,\frac{\langle-\sum_i\Delta_i\rangle}{N^{4/3}} \qquad (1.6)$$
and use the virial theorem
$$\Big\langle-\sum_i\Delta_i\Big\rangle = -E_0 \le 2.08\,N\Big[1 + \Big(\sum_{j=1}^MZ_j^{7/3}/N\Big)^{1/2}\Big]^2. \qquad (1.7)$$
Altogether we find
$$\sum_{i=1}^N\langle x_i^2\rangle \ge \frac{(3N)^{8/3}}{16\,\langle-\sum_i\Delta_i\rangle} \ge \frac{3^{8/3}N^{5/3}}{16\cdot2.08\,\big[1 + \big(\sum_jZ_j^{7/3}/N\big)^{1/2}\big]^2}, \qquad (1.8)$$
$$\langle x_i^2\rangle^{1/2} \ge \frac{.75\,N^{1/3}}{1 + \big(\sum_jZ_j^{7/3}/N\big)^{1/2}}. \qquad (1.9)$$
Therefore, if the system is not compressed by other forces, so that the virial theorem is valid, it will not collapse, but will adjust its volume to a size proportional to the number of particles. Regarding the $Z$-dependence, we see that with $Z = Z_j = N/M$ we have (for large $Z$)
$$\langle x_i^2\rangle^{1/2} \gtrsim M^{1/3}Z^{-1/3}.$$
That is, the mean atomic radius is predicted to be $\gtrsim Z^{-1/3}$. A better result can hardly be expected since for $M = 1$, this is the correct $Z$-dependence for large $Z$.

Although we have no results on the best possible constants, $K_{p,n}$, except in a few special cases, experience drawn from computer calculations suggests that there is a critical value $\gamma_{c,n}$ above which the classical value gives a bound:
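$$\sum_j|e_j|^\gamma \le \Big(\sum_j|e_j|^\gamma\Big)_{\text{classical}} = (2\pi)^{-n}\int d^np\,d^nx\,|p^2 + V(x)|_-^\gamma = L^c_{\gamma,n}\int|V(x)|_-^{\gamma+n/2}\,d^nx, \qquad \gamma \ge \gamma_{c,n}, \qquad (1.10)$$
and where $L^c_{\gamma,n}$, given by the above integral, is
$$L^c_{\gamma,n} = 2^{-n}\pi^{-n/2}\,\Gamma(\gamma+1)/\Gamma(\gamma + 1 + n/2). \qquad (1.11)$$
We conjecture $\gamma_{c,1} = 3/2$, $\gamma_{c,3} \approx .863$ and $\gamma_{c,n} = 0$, all $n \ge 8$. If this conjecture were to be true, the constants in (1.3, 1.4) could be further improved.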
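The classical constant (1.11) is a ratio of Gamma functions and can be evaluated directly. The short sketch below (the function name `classical_constant` is a label chosen here, not the authors') tabulates it and checks two closed-form values that follow from (1.11): $L^c_{3/2,1} = 3/16$ and $L^c_{1,3} = 1/(15\pi^2)$.

```python
import math


def classical_constant(gamma, n):
    """Classical constant L^c_{gamma,n} from (1.11)."""
    return math.gamma(gamma + 1.0) / (
        2.0 ** n * math.pi ** (n / 2.0) * math.gamma(gamma + 1.0 + n / 2.0)
    )


if __name__ == "__main__":
    for n in (1, 2, 3):
        for gamma in (0.5, 1.0, 1.5, 2.0):
            print(n, gamma, classical_constant(gamma, n))
```

Because $\Gamma(x+1) = x\Gamma(x)$, the same function also satisfies $L^c_{\gamma,n} = L^c_{\gamma-1,n}\,\gamma/(\gamma+n/2)$, a relation used later in Section 2.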
In the next section we shall deduce bounds for $\sum_j|e_j|^\gamma$ and use them in Section 3 to derive (1.3). In Section 4 we shall discuss our conjectures and support them for $n = 1$ with results from the Korteweg-de Vries equation. Section 5 contains new results added in proof. In Appendix A, generously contributed by J. F. Barnes, further evidence from computer studies is presented. We are extremely grateful to Dr. Barnes for taking an interest in this problem, for without his results we would have been hesitant to put forth our conjectures.

2. Bounds for Moments of the Eigenvalues

In this section we shall deduce bounds of the form (1.1), and we shall compare our $L_{\gamma,n}$ with the classical values which one gets by replacing $\sum_j|e_j|^\gamma$ by
$$(2\pi)^{-n}\int d^nx\,d^np\,|p^2 + V(x)|_-^\gamma.$$
For $n = 3$ and $\gamma = 1$, the latter are smaller by about an order of magnitude.
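Our inequalities are based on the Birman-Schwinger [2, 3] method for estimating $N_E$, the number of bound states of $H = -\Delta + V(x)$ having an energy $\le E$. Since $\partial N_E/\partial E = \sum_j\delta(E - e_j)$, we have
$$\sum_j|e_j|^\gamma = \gamma\int_0^\infty d\alpha\,\alpha^{\gamma-1}N_{-\alpha}. \qquad (2.1)$$
Now, according to Birman-Schwinger [2, 3], for all $\alpha \ge 0$, $m \ge 1$ and $t \in [0, 1]$,
$$N_{-\alpha} \le \mathrm{Tr}\big(|V + (1-t)\alpha|_-^{1/2}(-\Delta + t\alpha)^{-1}|V + (1-t)\alpha|_-^{1/2}\big)^m. \qquad (2.2)$$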
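Identity (2.1) is the usual "layer cake" representation of a moment in terms of a counting function. As a small illustration (assuming a finite list of eigenvalues; the function names are hypothetical), the integral can be evaluated exactly because $N_{-\alpha}$ is piecewise constant in $\alpha$:

```python
def moment_direct(energies, gamma):
    """Sum of |e_j|**gamma over the negative eigenvalues."""
    return sum(abs(e) ** gamma for e in energies if e < 0)


def moment_layer_cake(energies, gamma):
    """Right side of (2.1) for gamma > 0.

    N_{-alpha} counts eigenvalues with e_j <= -alpha, i.e. |e_j| >= alpha.
    It is constant between consecutive values of |e_j|, and on each such
    interval gamma * integral of alpha**(gamma - 1) equals a difference
    of gamma-th powers.
    """
    levels = sorted(abs(e) for e in energies if e < 0)
    total, lower, remaining = 0.0, 0.0, len(levels)
    for level in levels:
        # For alpha in (lower, level], exactly `remaining` eigenvalues count.
        total += remaining * (level ** gamma - lower ** gamma)
        lower = level
        remaining -= 1
    return total


if __name__ == "__main__":
    spectrum = [-4.0, -1.0, -0.25, 0.5]
    print(moment_direct(spectrum, 1.5), moment_layer_cake(spectrum, 1.5))
```

Any upper bound on $N_{-\alpha}$, such as (2.2), therefore translates into an upper bound on the moment after integration against $\gamma\alpha^{\gamma-1}$.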
REMARKS ABOUT (2.2):
1. We are only interested in potentials such that $V_- \in L^{\gamma+n/2}(\mathbb{R}^n)$ for $\gamma > \max(0, 1 - n/2)$. For such potentials (2.2) is justified, and a complete discussion is given in Simon [4, 5]. Moreover, it is sufficient to consider $V \in C_0^\infty(\mathbb{R}^n)$ in (2.2), and in the rest of this paper, and then to use a limiting argument. Such potentials have the advantage that they have only a finite number of bound states [5].
2. Since we are interested in maximizing $\sum_j|e_j|^\gamma/\int|V|_-^{\gamma+n/2}$, we may as well assume that $V(x) \le 0$, i.e. $V = -|V|_-$. This follows from the max-min principle [4] which asserts that $e_j(V) \ge e_j(-|V|_-)$, all $j$, including multiplicity.

To evaluate the trace in (2.2), we use the inequality
$$\mathrm{Tr}(B^{1/2}AB^{1/2})^m \le \mathrm{Tr}\,B^{m/2}A^mB^{m/2} \qquad (2.3)$$
when $A$, $B$ are positive operators and $m \ge 1$. When $m$ is integral and $A$, $B$ are of our special form, (2.3) is a consequence of Hölder's inequality. For completeness, we shall give a more general derivation of (2.3) in Appendix B.
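To calculate
$$\mathrm{Tr}\,|V + (1-t)\alpha|_-^m(-\Delta + t\alpha)^{-m}, \qquad (2.4)$$
we shall use an $x$-representation where $(-\Delta + t\alpha)^{-m}$ is the kernel
$$G^{(m)}_{t\alpha}(x - y) = (2\pi)^{-n}\int d^np\,(p^2 + t\alpha)^{-m}e^{ip(x-y)} \qquad (2.5)$$
if $m > n/2$. Using
$$\int d^np = \frac{2\pi^{n/2}}{\Gamma(n/2)}\int_0^\infty dp\,p^{n-1}, \qquad (2.6)$$
we easily compute
$$G^{(m)}_{t\alpha}(0) = (2\pi)^{-n}\pi^{n/2}(t\alpha)^{-m+n/2}\,\frac{\Gamma(m - n/2)}{\Gamma(m)}, \qquad m > n/2. \qquad (2.7)$$
Thus,
$$N_{-\alpha} \le (4\pi)^{-n/2}\,\frac{\Gamma(m - n/2)}{\Gamma(m)}\,(t\alpha)^{-m+n/2}\int d^nx\,|V(x) + (1-t)\alpha|_-^m. \qquad (2.8)$$
Next, we substitute (2.8) into (2.1). If we impose the condition that $t < 1$, it is easy to prove that one can interchange the $\alpha$ and the $x$ integration. Changing variables $\alpha \to (1-t)^{-1}|V(x)|_-\beta$ leads to
$$\sum_j|e_j|^\gamma \le \gamma(4\pi)^{-n/2}\,t^{-m+n/2}(1-t)^{m-\gamma-n/2}\,\frac{m\,\Gamma(\gamma - m + n/2)\,\Gamma(m - n/2)}{\Gamma(\gamma + 1 + n/2)}\int d^nx\,|V(x)|_-^{\gamma+n/2} \qquad (2.9)$$
provided $n/2 < m < n/2 + \gamma$, $m \ge 1$ and $0 < t < 1$. The optimal $t$ is $t = (m - n/2)/\gamma$. If we put our results together, we obtain the following (see note added in proof, Section 5).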
THEOREM 1. Let $V_- \in L^{\gamma+n/2}(\mathbb{R}^n)$, $\gamma > \max(0, 1 - n/2)$. Let $H = -\Delta + V(x)$, and let $e_j \le 0$ be the negative energy bound states of $H$. Then
$$\sum_j|e_j|^\gamma \le L_{\gamma,n}\int|V(x)|_-^{\gamma+n/2}\,d^nx \qquad (2.10)$$
where
$$L_{\gamma,n} \le \bar{L}_{\gamma,n} \equiv \min_m\,(4\pi)^{-n/2}\gamma^{\gamma+1}\,m\,\Gamma(\gamma + n/2 + 1)^{-1}F(m - n/2)\,F(\gamma + n/2 - m), \qquad (2.11)$$
and where $F(x) = \Gamma(x)x^{-x}$, $\max\{1, n/2\} \le m < n/2 + \gamma$.

REMARKS:
1. When $\gamma = 0$, $\sum_j|e_j|^0$ means the number of bound states, including zero energy states. For $n \ge 2$, our $\bar{L}_{0,n} = \infty$. In Section 4, we shall discuss the $\gamma = 0$ case further. See also Section 5.
2. In (2.11), $\bar{L}_{\gamma,n}$ is the bound we have obtained using the Birman-Schwinger principle. We shall henceforth reserve the symbol $L_{\gamma,n}$ for the quantity
$$L_{\gamma,n} = \sup_V\,\sum_j|e_j|^\gamma\Big/\int|V|_-^{\gamma+n/2}. \qquad (2.12)$$
Optimization with respect to $m$ in (2.11) can be done either numerically or analytically in the region where Stirling's formula
$$F(x) \approx e^{-x}\sqrt{2\pi/x} \qquad (2.13)$$
can be applied. In [1], for $n = 3$, $\gamma = 1$, we used the value 2 for $m$. A marginal improvement can be obtained with $m = 1.9$. If (2.13) were exact, the best $m$ would be
$$m = n(\gamma + n/2)/(n + \gamma). \qquad (2.14)$$
Note that as $\gamma \to \infty$, $m$ is bounded by $n$. Using (2.14), together with (2.13), which is valid when $\gamma n(\gamma + n)^{-1}$ is large,
$$\bar{L}_{\gamma,n} \approx (4\pi)^{1-n/2}\,\frac{\gamma^\gamma e^{-\gamma}}{\Gamma(\gamma + n/2)}\Big[\frac{n/2}{\gamma + n/2}\Big]^{1/2}. \qquad (2.15)$$
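The bound (2.11) and the choice of $t$ in (2.9) are simple enough to evaluate numerically. The sketch below is an illustration under the formulas as printed above; the function names are hypothetical. It evaluates the expression inside the minimum in (2.11) for a given $m$, and the $t$-dependent prefactor of (2.9).

```python
import math


def F(x):
    """F(x) = Gamma(x) * x**(-x), as defined after (2.11)."""
    return math.gamma(x) * x ** (-x)


def t_prefactor(t, gamma, n, m):
    """The t-dependent factor t^(-m+n/2) (1-t)^(m-gamma-n/2) in (2.9)."""
    return t ** (-m + n / 2.0) * (1.0 - t) ** (m - gamma - n / 2.0)


def optimal_t(gamma, n, m):
    """Minimizer of t_prefactor on (0, 1): t = (m - n/2) / gamma."""
    return (m - n / 2.0) / gamma


def birman_schwinger_bound(gamma, n, m):
    """Expression minimized over m in (2.11), for one admissible m."""
    a = m - n / 2.0
    b = gamma + n / 2.0 - m
    if m < 1.0 or a <= 0.0 or b <= 0.0:
        raise ValueError("need m >= 1 and n/2 < m < n/2 + gamma")
    return (
        (4.0 * math.pi) ** (-n / 2.0)
        * gamma ** (gamma + 1.0)
        * m
        * F(a)
        * F(b)
        / math.gamma(gamma + 1.0 + n / 2.0)
    )


if __name__ == "__main__":
    print("n=3, gamma=1, m=2.0:", birman_schwinger_bound(1.0, 3, 2.0))
    print("n=3, gamma=1, m=1.9:", birman_schwinger_bound(1.0, 3, 1.9))
```

For $n = 3$, $\gamma = 1$, $m = 2$ the expression reduces to $4/(15\pi)$, which is $4\pi$ times the classical value $L^c_{1,3} = 1/(15\pi^2)$ obtained from (1.11); this is the "order of magnitude" gap mentioned at the start of the section. Taking $m = 1.9$ lowers the bound slightly, in line with the remark above.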
Finally, we want to compare our bounds with their classical values, $L^c_{\gamma,n}$. From the results of Martin [6] and Tamura [7], one has the following

THEOREM 2. If $V(x) \le 0$ and $V \in C_0^\infty(\mathbb{R}^n)$, then
$$\lim_{\lambda\to\infty}\,\sum_j|e_j(\lambda V)|^\gamma\Big/\int|\lambda V|^{\gamma+n/2} = L^c_{\gamma,n}. \qquad (2.16)$$

COROLLARY.
$$L_{\gamma,n} \ge L^c_{\gamma,n}. \qquad (2.17)$$
Our $\bar{L}_{\gamma,n}$ satisfies (2.17); in particular, in the asymptotic region (2.15), we find
$$\bar{L}_{\gamma,n}/L^c_{\gamma,n} \approx [4\pi n(\gamma + n/2)]^{1/2}\gamma^{-1/2}. \qquad (2.18)$$
We conjecture in Section 4 that for $\gamma$ sufficiently large, the best possible $L_{\gamma,n}$ should be $L^c_{\gamma,n}$, a result which does not follow from the Birman-Schwinger method employed here. For small $\gamma$, we know that $L^c_{\gamma,n}$ is not a bound. We conclude this section with a theorem about $L_{\gamma,n}$ which will be useful in the discussion of the one-dimensional case in Section 4.

THEOREM 3. Let $\gamma > 1 + \max(0, 1 - n/2)$. Then
$$L_{\gamma,n} \le L_{\gamma-1,n}\,[\gamma/(\gamma + n/2)]. \qquad (2.19)$$
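PROOF. Choose $\varepsilon > 0$. We can find a $V \in C_0^\infty(\mathbb{R}^n)$, with $V \le 0$, such that
$$L_{\gamma,n}(V) \equiv \sum_j|e_j(V)|^\gamma\Big/\int|V|^{\gamma+n/2} \ge L_{\gamma,n} - \varepsilon.$$
Let $g \in C_0^\infty(\mathbb{R}^n)$ be such that $0 \le g(x) \le 1$, $\forall x$, and $V(x) < 0$ implies $g(x) = 1$. Let $V_\lambda(x) = V(x) - \lambda g(x)$, $\lambda \le 0$. The functions $|e_j(V_\lambda)|$ are continuous and monotone increasing in $\lambda$. Furthermore, there are a finite number of values $-\infty < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_k \le 0$ with $\lambda_j$ being the value of $\lambda$ at which $e_j(V_\lambda)$ first appears. $\lambda_1$ is finite because $V_\lambda$ is nonnegative for $\lambda$ sufficiently negative. $e_j(V_\lambda)$ is continuously differentiable on $\mathcal{A} = \{\lambda \mid 0 \ge \lambda > \lambda_1, \lambda \ne \lambda_i\}$, and
$$de_j(V_\lambda)/d\lambda = -\int|\psi_j(x; V_\lambda)|^2g(x)\,d^nx$$
by the Feynman-Hellmann theorem. It is easy to prove that if $f, g \in L^p(\mathbb{R}^n)$, $p > 1$, then
$$h(\lambda) \equiv \int|f(x) - \lambda g(x)|^p\,d^nx$$
is differentiable, $\forall\lambda$, and
$$dh/d\lambda\,|_{\lambda=0} = p\int|f(x)|^{p-1}g(x)\,d^nx.$$
Thus $L_{\gamma,n}(V_\lambda)$ is piecewise $C^1$ on $\mathcal{A}$ and its derivative, $\dot{L}_{\gamma,n}$, is given by
$$\dot{L}_{\gamma,n}\int|V_\lambda|^{\gamma+n/2} = \gamma\sum_j|e_j(V_\lambda)|^{\gamma-1}\int g(x)|\psi_j(x; V_\lambda)|^2\,d^nx - (\gamma + n/2)\,L_{\gamma,n}(V_\lambda)\int|V_\lambda(x)|^{\gamma+n/2-1}g(x)\,d^nx.$$
By the stated properties of $L_{\gamma,n}(V_\lambda)$, there exists a $\lambda \in (\lambda_1, 0]$ such that (i) $\dot{L}_{\gamma,n}(V_\lambda) \ge 0$; (ii) $L_{\gamma,n}(V_\lambda) \ge L_{\gamma,n} - 2\varepsilon$. Thus, using the properties of $g$,
$$0 \le \gamma\sum_j|e_j(V_\lambda)|^{\gamma-1} - L_{\gamma,n}(V_\lambda)(\gamma + n/2)\int|V_\lambda|^{\gamma+n/2-1}. \qquad (2.20)$$
Since $\varepsilon$ was arbitrary, (2.20) implies the theorem.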
If we use (2.17) together with the fact that $L^c_{\gamma,n} = L^c_{\gamma-1,n}[\gamma/(\gamma + n/2)]$, we have

COROLLARY. If for some $\gamma > \max(0, 1 - n/2)$, $L_{\gamma,n} = L^c_{\gamma,n}$, then
$$L_{\gamma+j,n} = L^c_{\gamma+j,n}, \qquad j = 0, 1, 2, 3, \dots$$

REMARK. By the same proof
$$L^1_{\gamma,n} \le L^1_{\gamma-1,n}\,[\gamma/(\gamma + n/2)] \qquad (2.21)$$
(see (3.1) for the definition of $L^1_{\gamma,n}$).

3. Bounds for the Kinetic Energy

In this section, we shall use Theorem 1 to derive inequalities of the type (1.3). We recall the definition (2.12) and we further define
$$L^1_{\gamma,n} = \sup_V\,|e_1|^\gamma\Big/\int|V|_-^{\gamma+n/2}. \qquad (3.1)$$
Clearly,
$$L^1_{\gamma,n} \le L_{\gamma,n}. \qquad (3.2)$$
If $\psi \in \mathcal{H}_{N,n,q}$ = the $N$-fold antisymmetric tensor product of $L^2(\mathbb{R}^n; \mathbb{C}^q)$, we can write $\psi$ pointwise as $\psi(x_1, \dots, x_N; \sigma_1, \dots, \sigma_N)$ with $x_j \in \mathbb{R}^n$, $\sigma_j \in \{1, 2, \dots, q\}$ and $\psi \to -\psi$ if $(x_i, \sigma_i)$ is permuted with $(x_j, \sigma_j)$. $q = 2$ for spin 1/2 fermions. We can extend the definition (1.2) to
$$\rho_\sigma(x) = N\sum_{\sigma_2=1}^q\cdots\sum_{\sigma_N=1}^q\int|\psi(x, x_2, \dots, x_N; \sigma, \sigma_2, \dots, \sigma_N)|^2\,d^nx_2\cdots d^nx_N. \qquad (3.3)$$
We also define
$$T_\psi = \sum_{j=1}^N\sum_{\sigma_1=1}^q\cdots\sum_{\sigma_N=1}^q\int|\nabla_j\psi(X; \sigma)|^2\,d^{nN}X, \qquad (3.4)$$
$$\|\psi\|_2^2 = \sum_{\sigma_1=1}^q\cdots\sum_{\sigma_N=1}^q\int|\psi(X; \sigma)|^2\,d^{nN}X. \qquad (3.5)$$
Our result is

THEOREM 4. Let $p$ satisfy $\max\{n/2, 1\} \le p \le 1 + n/2$ and suppose that $L_{p-n/2,n} < \infty$. If $\|\psi\|_2 = 1$, then, except for the case $n = 2$, $p = 1$, there exists a positive constant $K_{p,n}$ such that
$$T_\psi \ge K_{p,n}\sum_{\sigma=1}^q\Big[\int\rho_\sigma(x)^{p/(p-1)}\,d^nx\Big]^{2(p-1)/n}, \qquad (3.6)$$
$$K_{p,n} \ge \tfrac12\,n\,p^{-2p/n}(p - n/2)^{-1+2p/n}\big(L^1_{p-n/2,n}/L_{p-n/2,n}\big)^{-1+2p/n}\big(L^1_{p-n/2,n}\big)^{-2/n}. \qquad (3.7)$$
Before giving the proof of Theorem 4, we discuss its relation to the well-known Sobolev inequalities [9, 10]:

THEOREM 5 (Sobolev-Talenti-Aubin). Let $\nabla\psi \in L^r(\mathbb{R}^n)$ with $1 < r < n$. Let $t = nr/(n - r)$. Then
$$\int|\nabla\psi|^r \ge C_{r,n}\Big[\int|\psi|^t\Big]^{r/t} \qquad (3.8)$$
for some $C_{r,n} > 0$.

Talenti [11] and Aubin [21] have given the best possible $C_{r,n}$ (for $n = 3$, $r = 2$, $t = 6$, $C_{2,n}$ is also given in [8] and [12]):
$$C_{r,n} = n\,\pi^{r/2}\Big(\frac{n - r}{r - 1}\Big)^{r-1}\Big\{\frac{\Gamma(1 + n - n/r)\,\Gamma(n/r)}{\Gamma(n)\,\Gamma(1 + n/2)}\Big\}^{r/n}. \qquad (3.9)$$
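As a quick sanity check on (3.9), the constant can be evaluated for $r = 2$, where it should collapse to the familiar closed form $\pi n(n-2)[\Gamma(n/2)/\Gamma(n)]^{2/n}$, equal to $3(\pi/2)^{4/3}$ when $n = 3$. The sketch below assumes the form of (3.9) as written above; the function names are illustrative only.

```python
import math


def sobolev_constant(r, n):
    """C_{r,n} from (3.9), valid for 1 < r < n."""
    if not 1.0 < r < n:
        raise ValueError("need 1 < r < n")
    gamma_ratio = (
        math.gamma(1.0 + n - n / r) * math.gamma(n / r)
        / (math.gamma(n) * math.gamma(1.0 + n / 2.0))
    )
    return n * math.pi ** (r / 2.0) * ((n - r) / (r - 1.0)) ** (r - 1.0) * gamma_ratio ** (r / n)


def sobolev_constant_r2(n):
    """Closed form of the r = 2 case: pi n (n-2) [Gamma(n/2)/Gamma(n)]^(2/n)."""
    return math.pi * n * (n - 2.0) * (math.gamma(n / 2.0) / math.gamma(n)) ** (2.0 / n)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, sobolev_constant(2.0, n), sobolev_constant_r2(n))
```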
Our inequality (3.6) relates only to the $r = 2$ case in (3.8), in which case $t = 2n/(n - 2)$. Consider (3.8) with $r = 2$ and $\|\psi\|_2 = 1$. Using Hölder's inequality on the right side of (3.8), one gets
$$\int|\nabla\psi|^2 \ge C_{2,n}\Big[\int|\psi|^{2p/(p-1)}\Big]^{2(p-1)/n}\Big[\int|\psi|^2\Big]^{-2(p-n/2)/n} \qquad (3.10)$$
whenever $n > 2$ and $p \ge n/2$. However, $C_{2,n}$ is not necessarily the best constant in (3.10) when $p \ne n/2$ ($p = n/2$ corresponds to $r = 2$ in (3.8)). Indeed, Theorem 4 says something about this question.

In the case that $N = 1$ and $q = 1$, Theorem 4 is of the same form as (3.10) (since $\rho = |\psi|^2$ and $\|\psi\|_2 = 1$). We note two things:
1. For $n > 2$ and $p = n/2$, (3.6) agrees with (3.8) except, possibly, for a different constant. We have, therefore, an alternative proof of the usual Sobolev inequality (for the $r = 2$ case). As we shall also show $K_{n/2,n} = C_{2,n}$, so we also have the best possible constant for this case.
2. If $\max\{n/2, 1\} < p \le 1 + n/2$, Theorem 4 gives an improved version of (3.10), even if $n = 1$ or 2 (in which cases $C_{2,n} = 0$, but $K_{p,n} > 0$). For $p > 1 + n/2$, one can always use Hölder's inequality on the $p = 1 + n/2$ result to get a nontrivial bound of the form (3.10). However, in Theorem 4, the restriction $p \le 1 + n/2$ is really necessary. This has to do with the dependence of $T_\psi$ on $N$ rather than on $n$, as we shall explain shortly.

Next we turn to the case $N > 1$. To illustrate the nature of (3.6), we may as well suppose $q = 1$. To fix ideas, we take a special, but important form for $\psi$, namely
$$\psi(x_1, \dots, x_N) = (N!)^{-1/2}\,\mathrm{Det}\{\phi^i(x_j)\}_{i,j=1}^N \qquad (3.11)$$
and where the $\phi^i$ are orthonormal functions in $L^2(\mathbb{R}^n)$. Then, suppressing the subscript $\sigma$ because $q = 1$,
$$\rho(x) = \sum_{i=1}^N\rho^i(x), \qquad \rho^i(x) = |\phi^i(x)|^2, \qquad T_\psi = \sum_{i=1}^Nt^i, \qquad t^i = \int|\nabla\phi^i|^2. \qquad (3.12)$$
Theorem 4 says that
$$\sum_{i=1}^Nt^i \ge K_{p,n}\Big\{\int\Big[\sum_{i=1}^N\rho^i(x)\Big]^{p/(p-1)}d^nx\Big\}^{2(p-1)/n}. \qquad (3.13)$$
If we did not use the orthogonality of the $\phi^i$, all we would be able to conclude, using (3.6) with $N = 1$, $N$ times, would be
$$\sum_{i=1}^Nt^i \ge K_{p,n}\sum_{i=1}^N\Big\{\int\rho^i(x)^{p/(p-1)}\,d^nx\Big\}^{2(p-1)/n}. \qquad (3.14)$$
If $p = n/2$, then (3.14) is better than (3.13), by convexity. In the opposite case, $p = 1 + n/2$, (3.13) is superior. For in between cases, (3.13) is decidedly better if $N$ is large and if the $\rho^i$ are close to each other (in the $L^{p/(p-1)}(\mathbb{R}^n)$ sense). Suppose $\rho^i(x) = \rho(x)/N$, $i = 1, \dots, N$. Then the right side of (3.13) is proportional to $N^{2p/n}$ while the right side of (3.14) grows only as $N$. This difference is caused by the orthogonality of the $\phi^i$, or the Pauli principle. In fact, the last remark shows why $p \le 1 + n/2$ is important in Theorem 4.

If $\rho^i = \rho/N$, all $i$, then the best bound, insofar as the $N$ dependence is concerned, occurs when $p$ is as large as possible. It is easy to see
by example, however, that the largest growth for $T_\psi$ due to the orthogonality condition can only be $N^{(n+2)/n}$.

PROOF OF THEOREM 4. Let $V(x) \le 0$ be a potential in $\mathbb{R}^n$ with at least one bound state. If $e_1 = \min\{e_j\}$, then, for $\gamma \in [0, 1]$,
$$\sum_j|e_j|^\gamma \ge |e_1|^{\gamma-1}\sum_j|e_j|.$$
Using the definitions (2.12) and (3.1), we have that
$$\sum_j|e_j| \le \Lambda_{\gamma,n}\Big\{\int|V|^{\gamma+n/2}\Big\}^{1/\gamma}, \qquad (3.15)$$
$$\Lambda_{\gamma,n} = L_{\gamma,n}\,(L^1_{\gamma,n})^{-1+1/\gamma} \qquad (3.16)$$
when $1 \ge \gamma > \max(0, 1 - n/2)$. (3.15) holds even if $V$ has no bound state.

Let $\pi_\sigma$, $\sigma = 1, \dots, q$, be the projection onto the state $\sigma$, i.e. $(\pi_\sigma\psi)(x, \nu) = \psi(x, \nu)$ if $\sigma = \nu$ and zero otherwise. Choose $\psi \in \mathcal{H}_{N,n,q}$, $\gamma = p - n/2$. Let $\{\rho_\sigma\}_{\sigma=1}^q$ be given by (3.3) and, for $\alpha_\sigma \ge 0$, $\sigma = 1, \dots, q$, define
$$h = -\Delta - \sum_{\sigma=1}^q\alpha_\sigma\,\rho_\sigma(x)^{1/(\gamma+n/2-1)}\,\pi_\sigma \qquad (3.17)$$
to be an operator on $L^2(\mathbb{R}^n; \mathbb{C}^q)$ in the usual way. Define
$$H_N = \sum_{i=1}^Nh_i \qquad (3.18)$$
where $h_i$ means $h$ acting on the $i$-th component of $\mathcal{H}_{N,n,q}$. Finally, let $E = \inf\,\mathrm{spec}\,H_N$. Now, by the Rayleigh-Ritz variational principle
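$$E \le \langle\psi, H_N\psi\rangle = T_\psi - \sum_{\sigma=1}^q\alpha_\sigma\int\rho_\sigma^{p/(p-1)}. \qquad (3.19)$$
On the other hand,
$$E \ge \text{the sum of all the negative eigenvalues of } h \ge -\Lambda_{\gamma,n}\sum_{\sigma=1}^q\alpha_\sigma^{p/\gamma}\Big\{\int\rho_\sigma^{p/(p-1)}\Big\}^{1/\gamma} \qquad (3.20)$$
by (3.15). Combining (3.19) and (3.20) with
$$\alpha_\sigma = \Big\{\int\rho_\sigma^{p/(p-1)}\Big\}^{2(\gamma-1)/n}\Big\{\frac{\gamma}{(\gamma + n/2)\Lambda_{\gamma,n}}\Big\}^{2\gamma/n},$$
the theorem is proved.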
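The final optimization over $\alpha_\sigma$ can be checked numerically for a single spin state. Writing $I = \int\rho^{p/(p-1)}$, (3.19) and (3.20) give $T_\psi \ge \alpha I - \Lambda\alpha^{p/\gamma}I^{1/\gamma}$ for every $\alpha \ge 0$. The sketch below (hypothetical function names; $\Lambda$ and $I$ are treated as given positive numbers) confirms that the stated choice of $\alpha$ maximizes this expression and that the maximum equals $K\,I^{2(p-1)/n}$ with $K = (n/2p)\,[\gamma/(p\Lambda)]^{2\gamma/n}$, which is the right side of (3.7) rewritten in terms of $\Lambda_{\gamma,n}$.

```python
def kinetic_lower_bound(alpha, I, Lam, p, n):
    """alpha * I - Lam * alpha^(p/gamma) * I^(1/gamma), with gamma = p - n/2."""
    g = p - n / 2.0
    return alpha * I - Lam * alpha ** (p / g) * I ** (1.0 / g)


def optimal_alpha(I, Lam, p, n):
    """The choice of alpha made at the end of the proof of Theorem 4."""
    g = p - n / 2.0
    return I ** (2.0 * (g - 1.0) / n) * (g / (p * Lam)) ** (2.0 * g / n)


def K_from_Lambda(Lam, p, n):
    """Resulting constant K = (n / 2p) * (gamma / (p * Lam))^(2 gamma / n)."""
    g = p - n / 2.0
    return (n / (2.0 * p)) * (g / (p * Lam)) ** (2.0 * g / n)


if __name__ == "__main__":
    I, Lam, p, n = 2.0, 0.0849, 2.5, 3  # illustrative numbers only
    a_star = optimal_alpha(I, Lam, p, n)
    print(kinetic_lower_bound(a_star, I, Lam, p, n))
    print(K_from_Lambda(Lam, p, n) * I ** (2.0 * (p - 1.0) / n))
```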
Note that when $p = 1 + n/2$ (corresponding to $\gamma = 1$ in the proof), $L^1_{1,n}$ does not appear in (3.7). In this case, the right side of (3.7) is the best possible value of $K_{1+n/2,n}$, as we now show.

LEMMA 6. From (3.7), define
$$L'_{1,n} \equiv [n/(2K_{1+n/2,n})]^{n/2}(1 + n/2)^{-1-n/2}.$$
Then $L_{1,n} = L'_{1,n}$.

PROOF. By (3.7), we only have to prove that $L'_{1,n} \ge L_{1,n}$. Let $V \le 0$, $V \in C_0^\infty(\mathbb{R}^n)$ and let $H = -\Delta + V$. Let $\{\phi_i, e_i\}_{i=1}^N$ be the bound state eigenfunctions and eigenvalues of $H$. Let $\psi$ and $\rho^i$ be as defined in (3.11), (3.12). Then
$$\sum_i|e_i| = -\int V\rho - T_\psi \le \|V\|_p\|\rho\|_{p/(p-1)} - T_\psi$$
with $p = 1 + n/2$. Using Theorem 4 for $T_\psi$, one has that
$$\sum_i|e_i| \le \max_{Y\ge0}\big\{\|V\|_pY - K_{1+n/2,n}Y^{2p/n}\big\} = L'_{1,n}\|V\|_p^p.$$
We conclude with an evaluation of $K_{n/2,n}$ for $n > 2$ as promised. By a simple limiting argument
$$K_{n/2,n} \ge \lim_{p\downarrow n/2}(\text{right side of (3.7)}). \qquad (3.21)$$
Our bound (2.11) on $L_{p-n/2,n}$ shows that
$$\lim_{p\downarrow n/2}(L_{p-n/2,n})^{-1+2p/n} = 1. \qquad (3.22)$$
Hence
$$K_{n/2,n} \ge (L^1_{0,n})^{-2/n}. \qquad (3.23)$$
On the other hand, by the method of Lemma 6 applied to the $N = 1$ case, $K_{n/2,n} \le (L^1_{0,n})^{-2/n}$. The value of $L^1_{0,n}$ is given in (4.24). To be honest, its evaluation requires the solution of the same variational problem as given in [8, 11, 12]. Substitution of (4.24) into (3.23) yields the required result
$$K_{n/2,n} = C_{2,n} = \pi n(n-2)[\Gamma(n/2)/\Gamma(n)]^{2/n}. \qquad (3.24)$$
If we examine (3.23) when $n = 2$, one gets $K_{1,2} \ge 0$ since $L^1_{0,2} = \infty$. This reflects the known fact [5] that an arbitrarily small $V < 0$ always has a bound state in two dimensions. This observation can be used to show that
$$K_{1,2} = 0. \qquad (3.25)$$
When $n = 1$, the smallest allowed $p$ is $p = 1$. In this case, (3.6) reads
$$T_\psi \ge K_{1,1}\sum_{\sigma=1}^q\|\rho_\sigma\|_\infty^2. \qquad (3.26)$$
Using (3.7) and (4.20),
$$K_{1,1} \ge \big[4L_{1/2,1}L^1_{1/2,1}\big]^{-1}. \qquad (3.27)$$
If one accepts the conjecture of Section 4 that $L_{1/2,1} = L^1_{1/2,1} = 1/2$, then
$$K_{1,1} = 1. \qquad (3.28)$$
The reason for the equality in (3.28) is that $K_{1,1} = 1$ is well known to be the best possible constant in (3.26) when $q = 1$ and $N = 1$.

4. Conjecture About $L_{\gamma,n}$

We have shown that for the bound state energies $\{e_j\}$ of a potential $V$ in $n$ dimensions and with
$$L_{\gamma,n}(V) \equiv \sum_j|e_j|^\gamma\Big/\int|V|^{\gamma+n/2}, \qquad (4.1)$$
then
$$L_{\gamma,n} \equiv \sup_{V\in L^{\gamma+n/2}}L_{\gamma,n}(V) \qquad (4.2)$$
is finite whenever $\gamma + n/2 > 1$ and $\gamma > 0$. The "boundary points" are
$$\gamma = 1/2,\ n = 1; \qquad \gamma = 0,\ n \ge 2. \qquad (4.3)$$
We showed that for $n = 1$, $L_{1/2,1} < \infty$. For $\gamma < 1/2$, $n = 1$, there cannot be a bound of this kind, for consider $V_L(x) = -1/L$ for $|x| < L$ and zero otherwise. For $L \to 0$, this converges towards $-2\delta(x)$ and thus has a bound state of finite energy (which is $-1$ for $-2\delta(x)$). On the other hand,
$$\lim_{L\to0}\int dx\,|V_L|^{1/2+\gamma} = 0 \quad\text{for}\quad \gamma < 1/2.$$
For $n = 2$, $\gamma = 0$ is a "double boundary point" and $L_{0,2} = \infty$, i.e. there is no upper bound on the number of bound states in two dimensions. (Cf. [5].)
For $n \ge 3$, $L_{0,n}$ is conjectured to be finite (see note added in proof, Section 5); for $n = 3$, this is the well-known $\int|V|^{3/2}$ conjecture on the number, $N_0(V)$, of bound states (cf. [5]). The best that is known at present is that
$$N_0(V) \le c\Big[\int|V|^{3/2}\Big]^{4/3} \qquad (4.4)$$
NO(V) < I (1 + 4 in Il I = 4(3rr2 31
,
/2)-1 fvI3I2
(4.5)
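In (1.10) and (3.1), we introduced $L^c$ and $L^1$ and showed that
$$L_{\gamma,n} \ge \max(L^c_{\gamma,n}, L^1_{\gamma,n}). \qquad (4.6)$$
A parallel result is Simon's [22] for $n \ge 3$:
$$N_0(V) \le D_{n,\varepsilon}\big(\|V_-\|_{\varepsilon+n/2} + \|V_-\|_{-\varepsilon+n/2}\big)^{n/2}$$
with $D_{n,\varepsilon} \to \infty$ as $\varepsilon \to 0$. In our previous paper [1], we conjectured that $L_{1,3} = L^c_{1,3}$, and we also pointed out that $L^1_{1,1} > L^c_{1,1}$. A remark of Peter Lax (private communication), which will be explained presently, led us to the following:

CONJECTURE. For each $n$, there is a critical value of $\gamma$, $\gamma_{c,n}$, such that
$$L_{\gamma,n} = L^c_{\gamma,n}, \quad \gamma \ge \gamma_c; \qquad L_{\gamma,n} = L^1_{\gamma,n}, \quad \gamma \le \gamma_c.$$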
$\gamma_c$ is defined to be that $\gamma$ for which $L^c_{\gamma,n} = L^1_{\gamma,n}$; the uniqueness of this $\gamma_c$ is part of the conjecture. Furthermore, $\gamma_{c,1} = 3/2$, $\gamma_{c,2} \approx 1.2$, $\gamma_{c,3} \approx .86$ and the smallest $n$ such that $\gamma_{c,n} = 0$ is $n = 8$.

(A) Remarks on $L^1_{\gamma,n}$

We want to maximize
$$\Big|\int\big[\psi^2V + |\nabla\psi|^2\big]d^nx\Big|^\gamma\Big/\int|V|^{\gamma+n/2} \qquad (4.7)$$
with respect to $V$, and where $\int|\psi|^2 = 1$ and $(-\Delta + V)\psi = e_1\psi$. By the variational principle, we can first maximize (4.7) with respect to $V$, holding $\psi$ fixed. Hölder's inequality immediately yields
$$V(x) = -\alpha|\psi(x)|^{2/(\gamma+n/2-1)}$$
with $\alpha > 0$. The kinetic energy, $\int|\nabla\psi|^2$, is not increased if $\psi(x)$ is replaced by $|\psi(x)|$ and, by the rearrangement inequality [13], this is not increased if $|\psi|$ is replaced by its symmetric decreasing rearrangement. Thus, we may assume that $|V|$ and $|\psi|$ are spherically symmetric, nonincreasing functions. By the methods of [8] or [11], (4.7) can be shown to have a maximum when $\gamma + n/2 > 1$. The variational equation is
$$-\Delta\psi(x) - \alpha\,\psi(x)^{(\gamma+n/2+1)/(\gamma+n/2-1)} = e_1\psi(x) \qquad (4.8)$$
with
$$\alpha = \Big\{\frac{\gamma|e_1|^{\gamma-1}}{(\gamma + n/2)\,L^1_{\gamma,n}}\Big\}^{1/(\gamma+n/2-1)}. \qquad (4.9)$$
Equation (4.8) determines $\psi$ up to a constant and up to a change of scale in $x$. The former can be used to make $\int\psi^2 = 1$ and the latter leaves (4.7) invariant.
Equation (4.8) can be solved analytically in two cases, to which we shall return later:
(i) $n = 1$, all $\gamma > 1/2$;
(ii) $n \ge 3$, $\gamma = 0$.

(B) The One-Dimensional Case

Lax's remark was about a result of Gardner, Greene, Kruskal and Miura [14] to the effect that
$$L_{3/2,1} = L^c_{3/2,1} = 3/16. \qquad (4.10)$$
To see this, we may assume $V \in C_0^\infty(\mathbb{R})$, and use the theory of the Korteweg-de Vries (KdV) equation [14]:
$$W_t = 6WW_x - W_{xxx}. \qquad (4.11)$$
There are two remarkable properties of (4.11):
(i) As $W$ evolves in time, $t$, the eigenvalues of $-d^2/dx^2 + W$ remain invariant.
(ii) $\int W^2\,dx$ is constant in time.
Let $W(x, t)$ be given by (4.11) with the initial data $W(x, 0) = V(x)$. Then $L_{3/2,1}(W(\cdot, t))$ is independent of $t$, and may therefore be evaluated by studying its behavior as $t \to \infty$. There exist traveling wave solutions to (4.11), called solitons, of the form
$$W(x, t) = f(x - ct).$$
Equation (4.11) becomes
$$-cf_x = -f_{xxx} + 6ff_x. \qquad (4.12)$$
The solutions to (4.12) which vanish at $\infty$ are
$$f_\alpha(x) = -2\alpha^2\cosh^{-2}(\alpha x), \qquad c = 4\alpha^2. \qquad (4.13)$$
Any solution (4.13), regarded as a potential in the Schrodinger equation
has, as we shall see shortly, exactly one negative energy bound state with energy and wave function
e = -a2 0(x) = cosh-I (ax)
.
(4.14)
Now the theory of the KdV equation says that as t - -, W evolves into a sum of solitons (4.13) plus a part that goes to zero in L°°(R) norm (but not necessarily in L2(R) norm). The solitons are well separated since they have different velocities. Because the number of bound states is finite, the non-soliton part of W can be ignored as t -. Hence, for the initial V,
I
I jej13/2
a3
(4.15)
solitons while
f V(x)2 dx
I
ffa(thx
(4.16)
solitons
Since 4 JG cosh -4(x)dx = 16/3, we conclude that c 1'3/2,1 = 1'3/2,1 = 3/16
(4.17)
with equality if and only if W(x, t) is composed purely of solitons as w. For the same reason, t
L'3/2,1 = LC3/2,1
(4.18)
(cf. (4.21)).
Not only do we have an evaluation of $L_{3/2,1}$, (4.17), but we learn something more. When $\gamma = 3/2$, there is an infinite family of potentials for which $L_{3/2,1}(V) = L_{3/2,1}$, and these may have any number of bound states = number of solitons. What we believe to be the case is that when $\gamma < 3/2$, the optimizing potential for $L_{\gamma,n}$ has only one bound state, and satisfies (4.8). When $\gamma > 3/2$, the optimizing potential is, loosely speaking, infinitely deep and has infinitely many bound states; thus $L_{\gamma,n} = L^c_{\gamma,n}$. An additional indication that the conjecture is correct is furnished by the solution to (4.8). When $\gamma = 3/2$, this agrees with (4.14). In general, one finds that, apart from scaling, the nodeless solution to (4.8) is
$$\psi_\gamma(x) = \Gamma(\gamma)^{1/2}\,\pi^{-1/4}\,\Gamma(\gamma - 1/2)^{-1/2}\cosh^{-\gamma+1/2}(x),$$
$$V_\gamma(x) = -(\gamma^2 - 1/4)\cosh^{-2}(x), \qquad e_1 = -(\gamma - 1/2)^2. \tag{4.19}$$
Thus,
$$L^1_{\gamma,1} = \pi^{-1/2}\,\frac{1}{\gamma - 1/2}\,\frac{\Gamma(\gamma+1)}{\Gamma(\gamma+1/2)}\left(\frac{\gamma - 1/2}{\gamma + 1/2}\right)^{\gamma+1/2}. \tag{4.20}$$
When $L^1_{\gamma,1}$ is compared with $L^c_{\gamma,1}$, one finds that
$$L^1_{\gamma,1} > L^c_{\gamma,1} \quad (\gamma < 3/2), \qquad L^1_{\gamma,1} < L^c_{\gamma,1} \quad (\gamma > 3/2). \tag{4.21}$$
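As a numerical illustration of (4.20) and (4.21), the following Python sketch evaluates the one-bound-state constant $L^1_{\gamma,1}$ next to the classical constant $L^c_{\gamma,n} = 2^{-n}\pi^{-n/2}\Gamma(\gamma+1)/\Gamma(\gamma+1+n/2)$ quoted in Appendix A. The function names are illustrative choices, not notation from the paper.

```python
import math

def L1_one_dim(g):
    """One-bound-state constant L^1_{gamma,1} of (4.20); requires gamma > 1/2."""
    return (math.pi ** -0.5 / (g - 0.5)
            * math.gamma(g + 1) / math.gamma(g + 0.5)
            * ((g - 0.5) / (g + 0.5)) ** (g + 0.5))

def Lc(g, n=1):
    """Classical constant L^c_{gamma,n} = 2^-n pi^(-n/2) Gamma(g+1)/Gamma(g+1+n/2)."""
    return 2.0 ** -n * math.pi ** (-n / 2) * math.gamma(g + 1) / math.gamma(g + 1 + n / 2)

# Print both constants on either side of the crossing at gamma = 3/2.
for g in (0.75, 1.0, 1.5, 2.0, 3.0):
    print(g, L1_one_dim(g), Lc(g))
```

The two columns coincide at $\gamma = 3/2$, where both equal $3/16$, and swap order on either side, as stated in (4.21).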
This confirms at least part of the conjecture. However, more is true. For $\gamma = 3/2$, $V_\gamma$ has a zero energy single node bound state
$$\phi(x) = \tanh(x).$$
Since $V_\gamma$ is monotone in $\gamma$, it follows that $V_\gamma$ has only one bound state for $\gamma < 3/2$ and at least two bound states for $\gamma > 3/2$. The (unnormalized) second bound state can be computed to be
$$\phi(x) = \sinh(x)\cosh^{-\gamma+1/2}(x), \qquad e_2 = -(\gamma - 3/2)^2. \tag{4.22}$$
In like manner, one can find more bound states as $\gamma$ increases even further.

Thus we see that the potential that optimizes the ratio $|e_1|^\gamma / \int |V|^{\gamma+1/2}$ automatically has a second bound state when $\gamma > \gamma_c$. Finally, we remark that Theorem 3, together with (4.10), shows that

THEOREM 7. $L_{\gamma,1} = L^c_{\gamma,1}$ for $\gamma = 3/2, 5/2, 7/2$, etc.
An application of Theorem 7 to scattering theory will be made in Section 4(D).

(C) Higher Dimensions

We have exhibited the solution to the variational equation (4.8) for $L^1_{\gamma,1}$. When $n > 2$ and $\gamma = 0$, we clearly want to take $e_1 = 0$ in order to maximize $L_{0,n}(V)$. (4.8) has the zero energy solution
$$\psi(x) = (1+|x|^2)^{1-n/2}, \qquad V(x) = -\alpha\,\psi(x)^{2/(\gamma+n/2-1)} = -n(n-2)(1+|x|^2)^{-2} \tag{4.23}$$
(note: $\psi \in L^2(\mathbb{R}^n)$ if and only if $n > 4$, but $V \in L^{n/2}(\mathbb{R}^n)$ always). This leads to
$$L^1_{0,n} = [\pi n(n-2)]^{-n/2}\,\Gamma(n)/\Gamma(n/2). \tag{4.24}$$
The smallest dimension for which $L^1_{0,n} < L^c_{0,n}$ is $n = 8$. If we suppose that the ratio $L^1_{\gamma,n}/L^c_{\gamma,n}$ is monotone decreasing in $\gamma$ (as it is when $n = 1$ and as it is when $n = 3$ on the basis of the numerical solution of (4.10) by J. F. Barnes, given in Appendix A), and if our conjecture is correct, then $L_{\gamma,n} = L^c_{\gamma,n}$ for $n \ge 8$. The value of $\gamma_c$ obtained numerically is
$$\gamma_c = 1.165 \quad (n = 2), \qquad \gamma_c = 0.863 \quad (n = 3). \tag{4.26}$$
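The claim that $n = 8$ is the first dimension in which $L^1_{0,n}$ drops below $L^c_{0,n}$ can be checked directly from (4.24) and the classical formula $L^c_{0,n} = 2^{-n}\pi^{-n/2}/\Gamma(1+n/2)$. The sketch below does this in Python; the function names are illustrative.

```python
import math

def L1_zero(n):
    """L^1_{0,n} = [pi n (n-2)]^(-n/2) Gamma(n)/Gamma(n/2), as in (4.24); n >= 3."""
    return (math.pi * n * (n - 2)) ** (-n / 2) * math.gamma(n) / math.gamma(n / 2)

def Lc_zero(n):
    """Classical constant L^c_{0,n} = 2^-n pi^(-n/2) / Gamma(1 + n/2)."""
    return 2.0 ** -n * math.pi ** (-n / 2) / math.gamma(1 + n / 2)

# First dimension n >= 3 in which the one-bound-state value falls below the classical one.
first_crossing = next(n for n in range(3, 50) if L1_zero(n) < Lc_zero(n))
print(first_crossing, L1_zero(3))
```

For $n = 3$ the same function reproduces the value $4\pi^{-2}3^{-3/2}$ quoted in Appendix A.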
The other bit of evidence, apart from the monotonicity of $L^1_{\gamma,n}/L^c_{\gamma,n}$, for the correctness of our conjecture is a numerical study of the energy levels of the potential
$$V_\lambda(x) = -\lambda e^{-|x|}, \qquad \lambda > 0,$$
in three dimensions. This is given in Appendix A. The energy levels of the square well potential are given in [15, 16]. In both cases, one finds that
$$\lim_{\lambda\to\infty} L_{1,3}(V_\lambda) = L^c_{1,3}$$
and the limit is approached from below. Unfortunately, it is not true, as one might have hoped, that $L_{1,3}(V_\lambda)$ is monotone increasing in $\lambda$.

(D) Bounds on One-Dimensional Scattering Cross-Sections

In their study of the KdV equation, (4.11), Zakharov and Fadeev [17] showed how to relate the solution $W(x,t)$ to the scattering reflection coefficient $R(k)$ and the bound state eigenvalues $\{e_j\}$ of the initial potential $V(x)$. There are infinitely many invariants of (4.11) besides $\int W^2$ and these have simple expressions in terms of $R(k)$, $\{e_j\}$. Thus, for any potential $V$,
$$\int V^2 = (16/3)\sum_j |e_j|^{3/2} - 4\int k^2\,T(k)\,dk \tag{4.27}$$
$$\int V^3 + \tfrac{1}{2}\int V_x^2 = -(32/5)\sum_j |e_j|^{5/2} - 8\int k^4\,T(k)\,dk \tag{4.28}$$
$$\int V^4 + 2\int V V_x^2 + \tfrac{1}{5}\int V_{xx}^2 = (256/35)\sum_j |e_j|^{7/2} - (64/5)\int k^6\,T(k)\,dk \tag{4.29}$$
where
$$T(k) = \pi^{-1}\ln\left(1 - |R(k)|^2\right) \le 0. \tag{4.30}$$
These are only the first three invariants; a recursion relation for the others can be found in [17].

Notice that 3/16, 5/32, 35/256 are, respectively, $L^c_{3/2,1}$, $L^c_{5/2,1}$, $L^c_{7/2,1}$. Since $\int V^2 \ge \int |V|_-^2$, (4.27) establishes that $L_{3/2,1} = L^c_{3/2,1}$, as mentioned earlier. For the higher invariants, the signs in (4.28) and (4.29) are not as fortunately disposed and we cannot use these equations to prove Theorem 7. But, given that Theorem 7 has already been proved, we can conclude that
THEOREM 8. For any nonpositive potential $V(x)$,
$$\int V_x^2 \ge -16\int k^4\,T(k)\,dk. \tag{4.31}$$
For any potential $V(x)$,
$$2\int V V_x^2 + (1/5)\int V_{xx}^2 \le -(64/5)\int k^6\,T(k)\,dk. \tag{4.32}$$
The first inequality, (4.31), is especially transparent: If $V(x)$ is very smooth, it cannot scatter very much.

5. Note Added in Proof
After this paper was written, M. Cwikel and Lieb, simultaneously and by completely different methods, showed that the number of bound states, $N_0(V)$, for a potential, $V$, can be bounded (when $n \ge 3$) by
$$N_0(V) \le A_n \int |V(x)|_-^{n/2}\,d^n x. \tag{5.1}$$
Cwikel exploits the weak trace ideal method of Simon [22]; his method is more general than Lieb's, but for the particular problem at hand, (5.1), his $A_n$ does not seem to be as good. Lieb's method uses Wiener integrals and the general result is the following:
$$N_{-\alpha}(V) \le \int d^n x \int_0^\infty dt\; t^{-1} e^{-\alpha t}(4\pi t)^{-n/2}\, f\big(t\,|V(x)|_-\big) \tag{5.2}$$
for any non-negative, convex function $f : [0,\infty) \to [0,\infty)$ satisfying
$$1 = \int_0^\infty t^{-1} f(t)\,e^{-t}\,dt. \tag{5.3}$$
For $\alpha = 0$, one can choose $f(t) = c(t-b)$, $t \ge b$, $f(t) = 0$, $t \le b$. This leads to (5.1), and optimizing with respect to $b$, one finds that
$$A_3 = 0.116, \qquad A_4 = 0.0191 \tag{5.4}$$
and, as $n \to \infty$,
$$A_n/L^c_{0,n} = (n\pi)^{1/2} + O(n^{-1/2}). \tag{5.5}$$
Note that $A_3/L^1_{0,3} = 1.49$, i.e. $A_3$ exceeds $L_{0,3}$ by at most 49%. Since $N_{-\alpha}(V) \le N_0(-|V+\alpha|_-)$, one can use (5.1) and (2.1) to deduce that for $\gamma \ge 0$ and $n \ge 3$,
$$L_{\gamma,n} \le L^c_{\gamma,n}\,\big(A_n/L^c_{0,n}\big). \tag{5.6}$$
This is better than (2.11), (2.18). In particular, for $n = 3$, $\gamma = 1$, the improvement of (5.6) over (2.11) with $m = 2$ is a factor of 1.83. The factor 1.31 in Equation (1.4) can therefore be replaced by $1.31\,(1.83)^{-2/3} = 0.87$.
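The numbers in (5.4) can be reproduced with a short computation. The sketch below rests on an assumption about how the optimization is carried out, since the intermediate steps are not given in the text: inserting $f(t) = c\,(t-b)_+$ into (5.2) with $\alpha = 0$ and rescaling $t \to t/|V(x)|_-$ gives $A_n(b) = (4\pi)^{-n/2}\, b^{1-n/2} / [(n/2)(n/2-1)\,g(b)]$, where $g(b) = \int_b^\infty (1 - b/t)\,e^{-t}\,dt = 1/c$ by (5.3). All function names are illustrative.

```python
import math

def g(b, upper=50.0, steps=20000):
    """g(b) = int_b^inf (1 - b/t) e^{-t} dt, by Simpson's rule in u = t - b on [0, upper]."""
    h = upper / steps  # steps must be even for Simpson's rule
    def integrand(u):
        return u / (b + u) * math.exp(-(b + u))
    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3

def A(n, b):
    """Constant in (5.1) for the trial function f(t) = (t - b)_+ / g(b); needs n > 2."""
    half = n / 2
    return (4 * math.pi) ** (-half) * b ** (1 - half) / (half * (half - 1) * g(b))

def golden_min(fn, lo, hi, tol=1e-4):
    """Golden-section search for the minimizer of a unimodal function on [lo, hi]."""
    r = (math.sqrt(5) - 1) / 2
    c, d = hi - r * (hi - lo), lo + r * (hi - lo)
    fc, fd = fn(c), fn(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - r * (hi - lo)
            fc = fn(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + r * (hi - lo)
            fd = fn(d)
    return (lo + hi) / 2

b3 = golden_min(lambda b: A(3, b), 0.05, 2.0)
b4 = golden_min(lambda b: A(4, b), 0.05, 2.0)
A3, A4 = A(3, b3), A(4, b4)
print(b3, A3, b4, A4)
```

Under these assumptions the minimizing $b$ is near 0.25 for $n = 3$ and near 0.6 for $n = 4$, and the resulting constants agree with (5.4) to the digits quoted there.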
APPENDIX A. NUMERICAL STUDIES

John F. Barnes
Theoretical Division
Los Alamos Scientific Laboratory
Los Alamos, New Mexico 87545

I. Evaluation of $L^1_{\gamma,n}$, $n = 1, 2, 3$

The figure shows the numerical evaluation of $L^1_{\gamma,n}$ as well as $L^c_{\gamma,n}$. The latter is given in (1.11):
$$L^c_{\gamma,n} = 2^{-n}\pi^{-n/2}\,\Gamma(\gamma+1)/\Gamma(\gamma+1+n/2).$$
The former is obtained by solving the differential equation (4.8) in polar coordinates and choosing $\alpha$ such that $\psi(x) \to 0$ as $|x| \to \infty$. Note that by scaling, one can take $e_1 = -1$, whence
$$\big(L^1_{\gamma,n}\big)^{-1} = \alpha^{(\gamma+n/2)}\int |\psi(x)|^{(2\gamma+n)/(\gamma-1+n/2)}\,d^n x.$$
In one dimension, $L^1_{\gamma,1}$ is known analytically and is given in (4.20). Another exact result, (4.24), is
$$L^1_{0,3} = 4\pi^{-2}\,3^{-3/2} = 0.077997.$$
The critical values of $\gamma$, at which $L^1_{\gamma,n} = L^c_{\gamma,n}$, are:
$$\gamma_{c,1} = 3/2, \qquad \gamma_{c,2} = 1.165, \qquad \gamma_{c,3} = 0.8627.$$
[Figure: numerical evaluation of $L^1_{\gamma,n}$ and $L^c_{\gamma,n}$ for $n = 1, 2, 3$, plotted against $\gamma$ from 0 to 2.8.]
II. The Exponential Potential

To test the conjecture that $L_{1,3} = L^c_{1,3}$, the eigenvalues of the potential $V_\lambda = -\lambda\exp(-|x|)$ in three dimensions were evaluated for $\lambda$ = 5, 10, 20, 30, 40, 50, and 100. These are listed in the table according to angular momentum and radial nodes. These numbers have been corroborated by H. Grosse, and they can be used to calculate $L_{\gamma,3}(V_\lambda)$ for any $\gamma$. The final column gives $L_{1,3}(V_\lambda)$, since $\int |V_\lambda|^{5/2} = \lambda^{5/2}(64\pi)/125$. It is to be noted that the classical value, $L^c_{1,3} = 0.006755$, is approached from below, in agreement with the conjecture, but not monotonically.
Table: energy levels of $V_\lambda = -\lambda e^{-r}$ in three dimensions. For each angular momentum $\ell$, the levels $|e|$ are listed with the number of radial nodes in parentheses. "States" and $\sum|e|$ include the $(2\ell+1)$-fold degeneracy. The final entry for each $\lambda$ is $\sum|e| \big/ \big(\lambda^{5/2}\,64\pi/125\big)$.

λ = 5
  ℓ = 0: 0.55032 (0); states 1; Σ|e| = 0.55032
  Total: states 1; Σ|e| = 0.55032; Σ|e|/(λ^{5/2} 64π/125) = 0.006120

λ = 10
  ℓ = 0: 0.06963 (1), 2.18241 (0); states 2; Σ|e| = 2.2520
  ℓ = 1: 0.33405 (0); states 3; Σ|e| = 1.0022
  Total: states 5; Σ|e| = 3.2542; Σ|e|/(λ^{5/2} 64π/125) = 0.006398

λ = 20
  ℓ = 0: 0.00869 (2), 1.42562 (1), 6.62410 (0); states 3; Σ|e| = 8.0584
  ℓ = 1: 0.16327 (1), 2.71482 (0); states 6; Σ|e| = 8.6342
  ℓ = 2: 0.43136 (0); states 5; Σ|e| = 2.1568
  Total: states 14; Σ|e| = 18.8494; Σ|e|/(λ^{5/2} 64π/125) = 0.006551

λ = 30
  ℓ = 0: 0.58894 (2), 3.83072 (1), 11.84999 (0); states 3; Σ|e| = 16.270
  ℓ = 1: 1.39458 (1), 6.12302 (0); states 6; Σ|e| = 22.553
  ℓ = 2: 0.00593 (1), 2.36912 (0); states 10; Σ|e| = 11.875
  ℓ = 3: 0.07595 (0); states 7; Σ|e| = 0.532
  Total: states 26; Σ|e| = 51.230; Σ|e|/(λ^{5/2} 64π/125) = 0.006461

λ = 40
  ℓ = 0: 0.07676 (3), 1.86961 (2), 6.88198 (1), 17.53345 (0); states 4; Σ|e| = 26.362
  ℓ = 1: 0.41991 (2), 3.35027 (1), 10.13596 (0); states 9; Σ|e| = 41.718
  ℓ = 2: 0.93459 (1), 5.03378 (0); states 10; Σ|e| = 29.842
  ℓ = 3: 1.54738 (0); states 7; Σ|e| = 10.832
  Total: states 30; Σ|e| = 108.754; Σ|e|/(λ^{5/2} 64π/125) = 0.006682

λ = 50
  ℓ = 0: 0.60190 (3), 3.66447 (2), 10.39110 (1), 23.53215 (0); states 4; Σ|e| = 38.190
  ℓ = 1: 1.43321 (2), 5.81695 (1), 14.56904 (0); states 9; Σ|e| = 65.458
  ℓ = 2: 0.07675 (2), 2.45887 (1), 8.19840 (0); states 15; Σ|e| = 53.670
  ℓ = 3: 0.26483 (1), 3.61626 (0); states 14; Σ|e| = 27.168
  ℓ = 4: 0.49009 (0); states 9; Σ|e| = 4.411
  Total: states 51; Σ|e| = 188.897; Σ|e|/(λ^{5/2} 64π/125) = 0.006643

λ = 100
  ℓ = 0: 0.39275 (5), 2.91408 (4), 8.29231 (3), 17.44909 (2), 32.07168 (1), 56.28824 (0); states 6; Σ|e| = 117.41
  ℓ = 1: 1.10170 (4), 4.76748 (3), 11.62740 (2), 22.79910 (1), 40.45495 (0); states 15; Σ|e| = 242.25
  ℓ = 2: 0.02748 (4), 2.04022 (3), 6.85633 (2), 15.22147 (1), 28.46495 (0); states 25; Σ|e| = 263.05
  ℓ = 3: 0.22692 (3), 3.14743 (2), 9.13429 (1), 19.04073 (0); states 28; Σ|e| = 220.85
  ℓ = 4: 0.52962 (2), 4.37856 (1), 11.56470 (0); states 27; Σ|e| = 148.26
  ℓ = 5: 0.88997 (1), 5.69707 (0); states 22; Σ|e| = 72.46
  ℓ = 6: 1.26789 (0); states 13; Σ|e| = 16.48
  Total: states 136; Σ|e| = 1080.76; Σ|e|/(λ^{5/2} 64π/125) = 0.006719
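The final column of the table can be recomputed from the listed levels. The sketch below is a consistency check only: it takes the levels for $\lambda$ = 10, 20 and 50 from the table, weights each level by its $(2\ell+1)$-fold degeneracy, and divides by $\int|V_\lambda|^{5/2} = \lambda^{5/2}\,64\pi/125$. The dictionary layout and function names are illustrative.

```python
import math

# Levels |e| per angular momentum l, copied from the table (lambda = 10, 20, 50).
LEVELS = {
    10: {0: [0.06963, 2.18241], 1: [0.33405]},
    20: {0: [0.00869, 1.42562, 6.62410], 1: [0.16327, 2.71482], 2: [0.43136]},
    50: {0: [0.60190, 3.66447, 10.39110, 23.53215],
         1: [1.43321, 5.81695, 14.56904],
         2: [0.07675, 2.45887, 8.19840],
         3: [0.26483, 3.61626],
         4: [0.49009]},
}

def ratio_L13(lam):
    """Sum of |e| with (2l+1) degeneracy, divided by lam^(5/2) * 64 pi / 125."""
    total = sum((2 * l + 1) * sum(levels) for l, levels in LEVELS[lam].items())
    return total / (lam ** 2.5 * 64 * math.pi / 125)

# Classical value L^c_{1,3} = 2^-3 pi^(-3/2) Gamma(2) / Gamma(7/2).
Lc_13 = 2.0 ** -3 * math.pi ** -1.5 * math.gamma(2) / math.gamma(3.5)

for lam in sorted(LEVELS):
    print(lam, ratio_L13(lam), Lc_13)
```

Each computed ratio lies below the classical value, consistent with the statement that the limit is approached from below.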
APPENDIX B: PROOF OF (2.3)

THEOREM 9. Let $\mathcal{H}$ be a separable Hilbert space and let $A$, $B$ be positive operators on $\mathcal{H}$. Then, for $m \ge 1$,
$$\mathrm{Tr}\,(B^{1/2} A B^{1/2})^m \le \mathrm{Tr}\, B^{m/2} A^m B^{m/2}. \tag{B.1}$$

REMARK. When $\mathcal{H} = L^2(\mathbb{R}^n)$ and $A$ is a kernel $a(x-y)$ and $B$ is a multiplication operator $b(x)$ (as in our usage (2.2)), Seiler and Simon [19] have given a proof of (B.1) using interpolation techniques. Simon (private communication) has extended this method to the general case. Our proof is different and shows a little more than just (B.1).

PROOF. For simplicity, we shall only give the proof when $A$ and $B$ are matrices; for the general case, one can appeal to a limiting argument. For $m = 1$, the theorem is trivial, so assume $m > 1$. Let $C = A^m$ and $f(C) = g(C) - h(C)$, where $g(C) = \mathrm{Tr}\,(B^{1/2} C^{1/m} B^{1/2})^m$ and $h(C) = \mathrm{Tr}\, B^{m/2} C B^{m/2}$. Let $M^+$ be the positive matrices. Clearly $M^+ \ni C \to h(C)$ is linear. Epstein [18] has shown that $M^+ \ni C \to g(C)$ is concave (actually, he showed this for $m$ integral, but his proof is valid generally for $m \ge 1$). Write $C = C_D + C_O$ where $C_D$ is the diagonal part of $C$ in a basis in which $B$ is diagonal. $C_\lambda \equiv C_D + \lambda C_O = \lambda C + (1-\lambda) C_D$ is in $M^+$ for $\lambda \in [0,1]$, because $C_D \in M^+$. Then $\lambda \to f(C_\lambda) \equiv R(\lambda)$ is concave on $[0,1]$. Our goal is to show that $R(1) \le 0$. Since $[C_D, B] = 0$, $R(0) = 0$ and, by concavity, it is sufficient to show that $R(\lambda) \le 0$ for $\lambda > 0$ and $\lambda$ small. $h(C_\lambda) = h(C_D)$ for $\lambda \in [0,1]$. Since $f(C)$ is continuous in $C$, we can assume that $C_D$ is nondegenerate and strictly positive, and that $C_\lambda$ is positive when $\lambda \ge -\varepsilon$ for some $\varepsilon > 0$. Then $R(\lambda)$ is defined and concave on $[-\varepsilon, 1]$. $\lambda \to C_\lambda^{1/m}$ is differentiable at $\lambda = 0$ and its derivative at $\lambda = 0$ has zero diagonal elements. (To see this, use the representation $C^{1/m} = K\int_0^\infty dx\, x^{-1+1/m}\, C(C + xI)^{-1}$.) Likewise, the derivative of $(B^{1/2}(D + \lambda O)B^{1/2})^m$ at $\lambda = 0$ has zero diagonal elements when $O$ has and when $D$ is diagonal. Thus
$$dR(\lambda)/d\lambda\,\big|_{\lambda=0} = 0.$$
Since $R$ is concave on $[-\varepsilon, 1]$ with $R(0) = 0$ and vanishing derivative at $\lambda = 0$, it follows that $R(\lambda) \le 0$ on $[0,1]$; in particular $R(1) \le 0$, which is (B.1).
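A quick numerical sanity check of (B.1) is possible in the special case $m = 2$, where no fractional powers are needed: by cyclicity of the trace the inequality reads $\mathrm{Tr}\,(ABAB) \le \mathrm{Tr}\,(A^2B^2)$. The Python sketch below tests this on random positive semidefinite matrices built as $MM^T$. It is only an illustration of the statement, not the method of the proof, and the helper names are illustrative.

```python
import random

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def random_positive(n, rng):
    """Random positive semidefinite matrix of the form M M^T."""
    M = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    Mt = [list(col) for col in zip(*M)]
    return matmul(M, Mt)

def gap(A, B):
    """Tr(A^2 B^2) - Tr(ABAB): right side minus left side of (B.1) for m = 2."""
    AB = matmul(A, B)
    return trace(matmul(matmul(A, A), matmul(B, B))) - trace(matmul(AB, AB))

rng = random.Random(0)
gaps = [gap(random_positive(4, rng), random_positive(4, rng)) for _ in range(200)]
print(min(gaps))
```

For Hermitian $A$, $B$ the gap equals half the squared Hilbert-Schmidt norm of the commutator $[A,B]$, so it should be nonnegative up to rounding.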
Acknowledgment

One of the authors (Walter Thirring) would like to thank the Department of Physics of the University of Princeton for its hospitality.

ELLIOTT H. LIEB
DEPARTMENTS OF MATHEMATICS AND PHYSICS
PRINCETON UNIVERSITY
PRINCETON, NEW JERSEY

WALTER E. THIRRING
INSTITUT FÜR THEORETISCHE PHYSIK
DER UNIVERSITÄT WIEN, AUSTRIA
REFERENCES

[1] E. H. Lieb and W. E. Thirring, Phys. Rev. Lett. 35, 687 (1975). See Phys. Rev. Lett. 35, 1116 (1975) for errata.
[2] M. S. Birman, Mat. Sb. 55 (97), 125 (1961); Amer. Math. Soc. Translations Ser. 2, 53, 23 (1966).
[3] J. Schwinger, Proc. Nat. Acad. Sci. 47, 122 (1961).
[4] B. Simon, "Quantum Mechanics for Hamiltonians Defined as Quadratic Forms," Princeton University Press, 1971.
[5] B. Simon, "On the Number of Bound States of the Two Body Schrödinger Equation - A Review," in this volume.
[6] A. Martin, Helv. Phys. Acta 45, 140 (1972).
[7] H. Tamura, Proc. Japan Acad. 50, 19 (1974).
[8] V. Glaser, A. Martin, H. Grosse and W. Thirring, "A Family of Optimal Conditions for the Absence of Bound States in a Potential," in this volume.
[9] S. L. Sobolev, Mat. Sb. 46, 471 (1938), in Russian.
[10] S. L. Sobolev, Applications of Functional Analysis in Mathematical Physics, Leningrad (1950), Amer. Math. Soc. Transl. of Monographs, 7 (1963).
[11] G. Talenti, Best Constant in Sobolev's Inequality, Istituto Matematico, Università degli Studi di Firenze, preprint (1975).
[12] G. Rosen, SIAM Jour. Appl. Math. 21, 30 (1971).
[13] H. J. Brascamp, E. H. Lieb and J. M. Luttinger, Jour. Funct. Anal. 17, 227 (1974).
[14] C. S. Gardner, J. M. Greene, M. D. Kruskal and R. M. Miura, Commun. Pure and Appl. Math. 27, 97 (1974).
[15] S. A. Moszkowski, Phys. Rev. 89, 474 (1953).
[16] A. E. Green and K. Lee, Phys. Rev. 99, 772 (1955).
[17] V. E. Zakharov and L. D. Fadeev, Funkts. Anal. i Ego Pril. 5, 18 (1971). English translation: Funct. Anal. and its Appl. 5, 280 (1971).
[18] H. Epstein, Commun. Math. Phys. 31, 317 (1973).
[19] E. Seiler and B. Simon, "Bounds in the Yukawa Quantum Field Theory," Princeton preprint (1975).
[20] W. Thirring, T7 Quantenmechanik, Lecture Notes, Institut für Theoretische Physik, University of Vienna.
[21] T. Aubin, C. R. Acad. Sc. Paris 280, 279 (1975). The results are stated here without proof; there appears to be a misprint in the expression for $C_m$.
[22] B. Simon, "Weak Trace Ideals and the Bound States of Schrödinger Operators," Princeton preprint (1975).
With M. Aizenman in Phys. Lett. 66A, 427-429 (1978)

PHYSICS LETTERS, Volume 66A, number 6, 26 June 1978

ON SEMI-CLASSICAL BOUNDS FOR EIGENVALUES OF SCHRÖDINGER OPERATORS*

Michael AIZENMAN
Department of Physics, Princeton University, Princeton, NJ 08540, USA
and
Elliott H. LIEB
Departments of Mathematics and Physics, Princeton University, Princeton, NJ 08540, USA

Received 27 April 1978

Our principal result is that if the semiclassical estimate is a bound for some moment of the negative eigenvalues (as is known in some cases in one-dimension), then the semiclassical estimates are also bounds for all higher moments.

* Work partly supported by U.S. National Science Foundation grant MCS 75-21684 A02.

Bounds on the moments of energy levels of Schrödinger operators have been the object of several studies [1, 2, 5-8]. In [1] such bounds were used to obtain a lower bound for the kinetic energy of fermions in terms of their one particle density and thereby prove the stability of matter. In the notation of [2],
$$\sum_j |e_j(V)|^\gamma \le L_{\gamma,n}\int d^n x\,|V(x)|_-^{\gamma+n/2} \tag{1}$$
where $e_j(V)$ are the negative eigenvalues of $-\Delta + V(x)$ defined in $L^2(\mathbb{R}^n)$ and $|y|_- = \max(-y, 0)$. $L_{\gamma,n}$ denotes the smallest number for which (1) holds independently of $V$. The case $\gamma = 1$ is the one needed for the kinetic energy bound. It was shown in [2] that $L_{\gamma,n} < \infty$ for $\gamma > \max(0, 1 - n/2)$. Eq. (1) also holds for $\gamma = 0$, $n \ge 3$ but the proof is quite different (see refs. [5, 6, 8]). For $\gamma \ge 0$ we use the notation
$$H(x,p) = p^2 + V(x), \qquad R_n(\gamma, V) = \sum_j |e_j(V)|^\gamma \Big/ \int d^n z\,|H(x,p)|_-^\gamma, \tag{2}$$
$$R_n(\gamma) = \sup_V\,\{R_n(\gamma, V)\}, \tag{3}$$
and $d^n z = d^n x\,d^n p\,(2\pi)^{-n}$. $R_n(\gamma, V)$ is the ratio of the moments of the binding energies of a quantum mechanical hamiltonian to the moments of its classical analog. The integral in (1) comes from doing the $d^n p$ integration in (2). In the notation of [2],
$$R_n(\gamma) = L_{\gamma,n}/L^c_{\gamma,n}. \tag{4}$$
For $V \le 0$, $V \in C_0^\infty(\mathbb{R}^n)$ it is known [3, 4, 9] that
$$R_n(\gamma, \lambda V) \to 1 \tag{5}$$
as $\lambda \to \infty$, which is the semiclassical limit. Thus
$$R_n(\gamma) \ge 1. \tag{6}$$
In [2] it was conjectured that $R_n(\gamma) = 1$ for certain $\gamma$ and $n$, in particular for $\gamma = 1$, $n = 3$ which is the case of primary physical interest. $R_3(1) = 1$ would imply that the Thomas-Fermi theory of atoms and molecules (together with a modified treatment of the electron-electron repulsion) gives a lower bound to the true Schrödinger ground state energy (see [1]). The only cases where the value of $R_n(\gamma)$ is known are $n = 1$, $\gamma = 3/2, 5/2, 7/2, \ldots$, where $R_1(\gamma) = 1$. Part (a) of the following theorem, together with (6), settles the question for $n = 1$, $\gamma \ge 3/2$.

Theorem:
(a) For any $n$, $R_n(\gamma)$ is a monotone nonincreasing function of $\gamma$.
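(b) If, for some $\gamma > \max(0, 1 - n/2)$, the supremum in (3) is attained, i.e. $R_n(\gamma) = R_n(\gamma, V)$ for some $V$ with $|V|_- \in L^{\gamma+n/2}$, then $R_n(\cdot)$ is strictly decreasing from the left at $\gamma$. In fact
$$\liminf_{\delta\to 0+}\,[R_n(\gamma-\delta) - R_n(\gamma)]/\delta > 0.$$

Proof.
(a) Fix $V$. For $\gamma \ge 0$, $\delta > 0$, let
$$I(\gamma,\delta) = \int_0^1 d\lambda\,\lambda^{-1+\delta}\,|1-\lambda|^\gamma < \infty. \tag{7}$$
By scaling, for any $e \in \mathbb{R}$
$$|e|_-^{\gamma+\delta} = I(\gamma,\delta)^{-1}\int_0^\infty d\lambda\,\lambda^{-1+\delta}\,|e+\lambda|_-^\gamma. \tag{8}$$
Thus, for any Schrödinger potential $V$,
$$\sum_j |e_j(V)|^{\gamma+\delta} = I(\gamma,\delta)^{-1}\int_0^\infty d\lambda\,\lambda^{-1+\delta}\sum_j |e_j(V)+\lambda|_-^\gamma. \tag{9}$$
However, $e_j(V) + \lambda$ are the eigenvalues of the potential $V(x) + \lambda$. Therefore, by definition (3),
$$\sum_j |e_j(V)+\lambda|_-^\gamma \le R_n(\gamma)\int d^n z\,|H(x,p)+\lambda|_-^\gamma \tag{10}$$
and, by substitution in (9) and using (8),
$$\sum_j |e_j(V)|^{\gamma+\delta} \le R_n(\gamma)\,I(\gamma,\delta)^{-1}\int_0^\infty d\lambda\,\lambda^{-1+\delta}\int d^n z\,|H(x,p)+\lambda|_-^\gamma = R_n(\gamma)\int d^n z\,|H(x,p)|_-^{\gamma+\delta}. \tag{11}$$
Hence
$$R_n(\gamma+\delta, V) \le R_n(\gamma), \tag{12}$$
and therefore $R_n(\gamma+\delta) \le R_n(\gamma)$.

(b) Let $\gamma > \max(0, 1 - n/2)$ and assume that for some $V$ with $|V|_- \in L^{\gamma+n/2}(\mathbb{R}^n)$
$$R_n(\gamma, V) = R_n(\gamma). \tag{13}$$
In particular (13) implies that
$$0 > e_0 = \inf \mathrm{spec}(-\Delta + V) > \mathrm{ess\,inf}\,\{V(x)\}. \tag{14}$$
We shall prove that $R_n(\cdot)$ is strictly decreasing from the left at $\gamma$ by showing that
$$\liminf_{\delta\to 0+}\,[R_n(\gamma-\delta) - R_n(\gamma)]/\delta \ge R_n(\gamma)\,\frac{\int d^n z\,|H(x,p)|_-^\gamma\,\eta_\gamma(|e_0/H|)}{\int d^n z\,|H(x,p)|_-^\gamma} > 0 \tag{15}$$
where
$$\eta_\gamma(t) = \int_t^1 d\lambda\,(1-\lambda)^\gamma/\lambda \ge (1-t)^{1+\gamma}/(1+\gamma). \tag{16}$$
The key fact which will be used is that the integral in (9) can be cut off from above at $|e_0|$. For any $\delta > 0$
$$\sum_j |e_j(V)|^\gamma = I(\gamma-\delta,\delta)^{-1}\int_0^{|e_0|} d\lambda\,\lambda^{-1+\delta}\sum_j |e_j(V)+\lambda|_-^{\gamma-\delta}$$
$$\le R_n(\gamma-\delta)\,I(\gamma-\delta,\delta)^{-1}\int d^n z\int_0^{|e_0|} d\lambda\,\lambda^{-1+\delta}\,|H(x,p)+\lambda|_-^{\gamma-\delta}$$
$$= R_n(\gamma-\delta)\int d^n z\,|H(x,p)|_-^\gamma - R_n(\gamma-\delta)\,I(\gamma-\delta,\delta)^{-1}\int d^n z\int_{|e_0|}^\infty d\lambda\,\lambda^{-1+\delta}\,|H(x,p)+\lambda|_-^{\gamma-\delta}. \tag{17}$$
Therefore, using (2), (13) and $R_n(\gamma-\delta) \ge R_n(\gamma)$,
$$[R_n(\gamma-\delta) - R_n(\gamma)]/\delta \ge \frac{R_n(\gamma)}{\delta\,I(\gamma-\delta,\delta)}\;\frac{\int_{H<e_0} d^n z\,|H(x,p)|^\gamma\int_{|e_0/H|}^1 d\lambda\,\lambda^{-1+\delta}(1-\lambda)^{\gamma-\delta}}{\int d^n z\,|H(x,p)|_-^\gamma}. \tag{18}$$
Using $\lim_{\delta\to 0}\,\delta I(\gamma-\delta,\delta) = 1$ and Fatou's lemma, we obtain (15). In view of (6) the theorem implies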
Corollary 1: If for some $\gamma$, $R_n(\gamma) = 1$ then $R_n(\gamma') = 1$ for all $\gamma' \ge \gamma$. Moreover, for $\gamma' > \gamma$ the supremum in (3) is not attained by any potential.

This proves part of a conjecture made in [2] (another part of the conjecture was disproved, for $n > 7$, in [7]). In one dimension we can say even more since it is known, [2], that $R_1(3/2) = 1$ ($R_1(\gamma) > 1$ for $\gamma < 3/2$).

Corollary 2: For $\gamma \ge 3/2$, $R_1(\gamma) = 1$.

One may also study bounds like (3) for some restricted classes of potentials, $V$, as was done in [7] for the spherically symmetric ones (the constants thus obtained are no larger than $R_n(\gamma)$ but it is not known whether any of them are strictly smaller). The theorem and its proof extend to such bounds as long as the class of potentials is closed under the addition of constants.

References

[1] E.H. Lieb and W.E. Thirring, Phys. Rev. Lett. 35 (1975) 687. See Phys. Rev. Lett. 35 (1975) 1116 for errata. Also E.H. Lieb, Rev. Mod. Phys. 48 (1976) 553.
[2] E.H. Lieb and W.E. Thirring, in: Studies in mathematical physics, Essays in honor of V. Bargmann (Princeton Univ. Press, Princeton, N.J., 1976).
[3] A. Martin, Helv. Phys. Acta 45 (1972) 140.
[4] H. Tamura, Proc. Japan Acad. 50 (1974) 19.
[5] M. Cwikel, Ann. of Math. 106 (1977) 93.
[6] E.H. Lieb, Bull. Amer. Math. Soc. 82 (1976) 751.
[7] V. Glaser, H. Grosse and A. Martin, Bounds on the Number of Eigenvalues of the Schrödinger Operator, CERN preprint TH2432 (1977).
[8] G.V. Rosenblum, The distribution of the discrete spectrum for singular differential operators, Isvestia Math. 164 No. 1 (1976) 75.
[9] M.S. Birman and V.Y. Borzov, On the asymptotics of the discrete spectrum of some singular differential operators, Topics in Math. Phys. 5 (1972) 19.
Proceedings of the Amer. Math. Soc. Symposia in Pure Math. 36, 241-252 (1980)

THE NUMBER OF BOUND STATES OF ONE-BODY SCHROEDINGER OPERATORS AND THE WEYL PROBLEM

Elliott H. Lieb¹

ABSTRACT. If $\tilde N(\Omega,\lambda)$ is the number of eigenvalues of $-\Delta$ that are $\le \lambda$ in a domain $\Omega$ in a suitable Riemannian manifold of dimension $n$, we derive bounds of the form $\tilde N(\Omega,\lambda) \le D_n\lambda^{n/2}|\Omega|$ for all $\Omega$, $\lambda$, $n$. Likewise, if $N_\alpha(V)$ is the number of nonpositive eigenvalues of $-\Delta + V(x)$ which are $\le \alpha \le 0$, then $N_\alpha(V) \le L_n\int_M [V-\alpha]_-^{n/2}$ for all $\alpha$ and $V$ and $n \ge 3$.

1980 Mathematics Subject Classification 35P15.
¹ Work supported by U.S. National Science Foundation grants PHY-7825390 and INT 78-01160.

I. INTRODUCTION AND BACKGROUND.

Two closely related problems will concern us here: One is to bound the nonpositive eigenvalues of the one-body Schroedinger operator
$$H = -\Delta + V(x). \tag{1.1}$$
The other is to find an upper bound for
$$\tilde N(\Omega,\lambda) = \text{number of eigenvalues of } -\Delta \le \lambda \tag{1.2}$$
in a domain $\Omega$ with Dirichlet boundary conditions.

In both cases the setting is a Riemannian manifold, $M$, and $\Delta$ is the Laplace-Beltrami operator. The only way in which the properties of the manifold will appear in our results will be through the fundamental solution of the heat equation
$$G(x,y;t) = [\exp(t\Delta)](x,y) \tag{1.3}$$
evaluated on the diagonal $x = y$, or else through the Green function
$$(-\Delta + e)^{-1}(x,y) = \int_0^\infty dt\,e^{-et}\,G(x,y;t) \tag{1.4}$$
for $e > 0$. (Note: $G$ is defined for the whole manifold, not the subdomain $\Omega$ in the case of (1.2). One could, of course, use $G$ defined for the domain $\Omega$ in all the formulas, but then the dependence of the result on $\Omega$ will be complicated. It is precisely to avoid this complication that we use the $G$ for the whole of $M$.)

Let us begin with the problem defined by (1.2), which we may term the Weyl problem.
Weyl [1] proved the asymptotic formula (for suitable domains)
$$\tilde N(\Omega,\lambda) \sim C_n\,\lambda^{n/2}\,|\Omega| \tag{1.5}$$
where $|\Omega|$ is the Riemannian volume of $\Omega$, $n$ is the dimension of $M$ and
$$C_n = (4\pi)^{-n/2}\,\Gamma(1+n/2)^{-1} = (2\pi)^{-n}\,\tau_n, \tag{1.6}$$
where $\tau_n$ is the volume of the unit ball in $\mathbb{R}^n$. The constant $C_n$ is called the classical constant for reasons which will become clear later. This result (1.5) is discussed in [2] and [3], §XIII.15, vol. 4. The proof uses Dirichlet-Neumann bracketing.

Polya's conjecture is that (1.5), with $\le$ in place of $\sim$, holds in $\mathbb{R}^n$ for all $\lambda$ and $\Omega$, not just asymptotically. Here we will prove

THEOREM 1. For all $\lambda$ and $\Omega$ there exist constants $D_n$ and $E_n$ (depending on the manifold $M$) such that
$$\tilde N(\Omega,\lambda) \le D_n\,\lambda^{n/2}\,|\Omega| \tag{1.7}$$
if
$$G(x,x;t) \le A_n\,t^{-n/2}, \quad \forall x \in M,\ \forall t > 0 \tag{1.8}$$
for some $A_n < \infty$, while
$$\tilde N(\Omega,\lambda) \le (D_n\,\lambda^{n/2} + E_n)\,|\Omega| \tag{1.9}$$
if
$$G(x,x;t) \le A_n\,t^{-n/2} + B_n, \quad \forall x \in M,\ \forall t > 0. \tag{1.10}$$
$D_n$ and $E_n$ are proportional to $A_n$ and $B_n$ respectively.

In particular (1.8) and (1.7) hold for $\mathbb{R}^n$ (with $A_n = (4\pi)^{-n/2}$) and for many noncompact $M$, e.g. homogeneous spaces with curvature $\le 0$. (1.10) and (1.9) hold for compact $M$.
Next we turn to the Schroedinger problem (1.1). Let $E_1(V) \le E_2(V) \le \ldots \le 0$ be the nonpositive eigenvalues of $H$ on $L^2(M)$. If we write
$$V = V_+ - V_- \tag{1.11}$$
with
$$V_-(x) = |V(x)| \ \text{ when } V(x) \le 0, \qquad V_-(x) = 0 \ \text{ otherwise}, \tag{1.12}$$
then the negative spectrum of $H$ is discrete if $V_- \in L^{n/2}$, for example.

DEFINITION. $N_\alpha(V)$ is the number of eigenvalues of $H$ which are $\le \alpha \le 0$.

Our main result is

THEOREM 2. Suppose (1.8) holds and suppose $n \ge 3$. Then
$$N_0(V) \le L_n\int_M V_-(x)^{n/2}\,dx \tag{1.13}$$
for some constant $L_n$ depending on $M$.

There are many remarks to be made about Theorem 2 and its connection with Theorem 1. First, the history of (1.13). Rosenbljum [4] first announced Theorem 2 in 1972. Unaware of this, Simon [5] proved an inequality of the form
$$N_0(V) \le S_{n,\varepsilon}\,\big(\|V_-\|_{n/2+\varepsilon} + \|V_-\|_{n/2-\varepsilon}\big)^{n/2}$$
with $S_{n,\varepsilon} \to \infty$ as $\varepsilon \to 0$. Also unaware of [4], Cwikel and myself [6, 7] simultaneously found a proof of Theorem 2. Reed and Simon [3] call Theorem 2 the Cwikel-Lieb-Rosenbljum bound. Cwikel's method exploits some ideas in [5]. All three methods are different. Cwikel's and Rosenbljum's are applicable to a wider class of operators than the Schroedinger operator, but my method [7], based on Wiener integrals, which is the one presented here, gives the best constant by far. This result was announced in [7] and the proof was written up in [3], Theorem XIII.12 and in [8]. Because all the technical details can easily be found in [8], the presentation given here will ignore technicalities. Not only am I indebted to B. Simon for his help with the technical details, as just mentioned, but I also wish to acknowledge his role in stimulating my interest in the problem, and for his constant encouragement while the ideas were taking shape.
The connection between the two theorems is

PROPOSITION 3. Let $\alpha \le 0$. Then for all $M$
$$\tilde N(\Omega,\lambda) \le N_\alpha\big((\alpha-\lambda)\chi_\Omega\big) \tag{1.14}$$
where $\chi_\Omega$ is the characteristic function of $\Omega$.

PROOF. Let $\psi_j$ be the $j$th eigenfunction of $-\Delta$ in $\Omega$ and let $\tilde\psi_j$, defined on all of $M$, be $\psi_j$ in $\Omega$ and zero outside. (1.14) is obtained by using the $\tilde\psi_j$ as variational functions for $-\Delta + (\alpha-\lambda)\chi_\Omega$ in the Rayleigh-Ritz variational principle. QED.

Similarly, we have

PROPOSITION 4. If $0 \le \beta \le -\alpha$, then for all $M$
$$N_\alpha(V) \le N_{\alpha+\beta}\big(-[V+\beta]_-\big). \tag{1.15}$$

PROOF. Same as for proposition 3. Alternatively, one can remark that $V(x) \ge -[V+\beta]_-(x) - \beta$, and adding a positive operator can never decrease any eigenvalue. QED.
In particular, $N_\alpha(V) \le N_0(-[V-\alpha]_-)$ and $N_0(V) \le N_0(-V_-)$, whence we have

COROLLARY 5. If (1.13) holds, and $n \ge 3$, then
$$N_\alpha(V) \le L_n\int_M [V-\alpha]_-(x)^{n/2}\,dx. \tag{1.16}$$
Moreover, to prove (1.13) it is sufficient to consider $V$ satisfying $V(x) \le 0$, $\forall x$.

Proposition 3 will be used to derive Theorem 1 from Theorem 2 (actually from a generalization of Theorem 2, namely Theorem 8 and (4.3), which holds for all $n$). However, at this point we can, under the assumption that (1.8) holds, deduce (1.7) of Theorem 1 from proposition 3 and Theorem 2. Choosing $\alpha = 0$, we have
$$\tilde N(\Omega,\lambda) \le L_n\int_M \lambda^{n/2}\,\chi_\Omega(x)^{n/2}\,dx = L_n\,\lambda^{n/2}\,|\Omega| \tag{1.17}$$
for $n \ge 3$.
It is to be emphasized that Theorem 2 is more delicate than Theorem 1. For one thing Theorem 1 holds for all $n$, whereas Theorem 2 holds only for $n \ge 3$. The analogue of Theorem 2 is definitely false for $n = 1$ and 2. In $\mathbb{R}^n$, at least, an arbitrarily small, nonpositive potential always has a negative eigenvalue when $n < 3$; cf. [3], Theorem XIII.11. For another thing, the best constant in (1.7) is, according to the Polya conjecture, the classical constant $C_n$ (in $\mathbb{R}^n$ at least). However, the best constant $L_n$ is bigger than $C_n$ in general. It is easy to prove [9] (by considering a very "large" $V$ and using Dirichlet-Neumann bracketing) that $L_n \ge C_n$. In [9] it was shown that $L_n > C_n$ for $3 \le n \le 7$ and recently [16] it was shown that $L_n > C_n$ for $n > 7$; in both cases explicit examples were constructed.

It is somewhat ironic that although Theorem 2 starts to hold only for $n = 3$, Theorem 1 is easiest to prove for $\mathbb{R}^n$ with $n = 1$. In that case the only domain that need be considered is a finite interval, and there the eigenvalues of $-\Delta$ can be computed explicitly. The Polya conjecture is easily seen to be true.
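The one-dimensional case can be made concrete. On an interval of length $\ell$ with Dirichlet boundary conditions the eigenvalues of $-\Delta$ are $(k\pi/\ell)^2$, $k = 1, 2, \ldots$, so the count is the integer part of $\ell\lambda^{1/2}/\pi$, while (1.6) gives $C_1 = 1/\pi$. The Python sketch below compares the two; the function names are illustrative.

```python
import math

def classical_constant(n):
    """C_n = (4 pi)^(-n/2) / Gamma(1 + n/2), as in (1.6)."""
    return (4 * math.pi) ** (-n / 2) / math.gamma(1 + n / 2)

def dirichlet_count(length, lam):
    """Number of Dirichlet eigenvalues (k pi / length)^2, k >= 1, not exceeding lam."""
    k = 0
    while ((k + 1) * math.pi / length) ** 2 <= lam:
        k += 1
    return k

C1 = classical_constant(1)
cases = [(1.0, 50.0), (2.5, 10.0), (7.0, 3.3)]
for length, lam in cases:
    # Exact count versus the classical bound C_1 * lam^(1/2) * |Omega|.
    print(length, lam, dirichlet_count(length, lam), C1 * math.sqrt(lam) * length)
```

In each case the exact count is bounded by $C_1\lambda^{1/2}\ell$, which is the Polya inequality for an interval.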
The intuition behind Theorem 2, and thereby the reason for calling $C_n$ the classical constant, is important. In the semiclassical picture of quantum mechanics in $\mathbb{R}^n$, which is similar to WKBJ theory, one has the mystical postulate that "each nice set in phase space $\Gamma = \{(p,x) \mid p \in \mathbb{R}^n,\ x \in \mathbb{R}^n\}$ of volume $(2\pi)^n$ can accommodate one eigenstate of $H$". This postulate can be made more precise by means of the Dirichlet-Neumann bracketing method mentioned before. In any event, since the "eigenvalues" of $-\Delta$ are $p^2$ and the "eigenvalues" of $V$ are $V(x)$, the postulate implies that
$$N_\alpha(V) \approx (2\pi)^{-n}\int_{\mathbb{R}^n\times\mathbb{R}^n} dp\,dx\;\theta\big(\alpha - (p^2 + V(x))\big) \tag{1.18}$$
with $\theta(a) = 1$ for $a \ge 0$ and $= 0$ for $a < 0$. The integration in (1.18), for fixed $x$, is easy to do, namely
$$\int_{p^2 \le \alpha - V(x)} dp = \tau_n\,[V-\alpha]_-(x)^{n/2}. \tag{1.19}$$
Thus, (1.18) yields
$$N_\alpha(V) \approx C_n\int [V-\alpha]_-(x)^{n/2}\,dx. \tag{1.20}$$
While the chief purpose of this paper is to prove Theorems 1 and 2, quantities of no less interest are the moments of the nonpositive eigenvalues of $H$.

DEFINITION. For $\gamma > 0$
$$I_\gamma(V) = \sum_i |E_i(V)|^\gamma = \gamma\int_{-\infty}^0 |\alpha|^{\gamma-1}\,N_\alpha(V)\,d\alpha. \tag{1.21}$$
$I_0(V)$ is defined to be $N_0(V)$.

For $n \ge 3$ we can use Corollary 5 and Fubini's theorem to obtain
$$I_\gamma(V) \le \gamma L_n\int_M dx\int_{-\infty}^0 |\alpha|^{\gamma-1}\,[V-\alpha]_-(x)^{n/2}\,d\alpha = \gamma L_n\int_M dx\int_{-V_-(x)}^0 d\alpha\,|\alpha|^{\gamma-1}\big(V_-(x)+\alpha\big)^{n/2} = L_{\gamma,n}\int_M V_-(x)^{\gamma+n/2}\,dx \tag{1.22}$$
with
$$L_{\gamma,n} = L_n\,\Gamma(\gamma+1)\,\Gamma(1+n/2)\,\Gamma(1+\gamma+n/2)^{-1}. \tag{1.23}$$
There are several things to be said about (1.22). Although it was derived from Corollary 5 under the assumptions (1.13) and $n \ge 3$, it holds much more generally. For example it holds in $\mathbb{R}^n$ for $n = 2$, $\gamma > 0$ and for $n = 1$, $\gamma > 1/2$ provided a > 0. This was first given in [9]. In [9] it was stated that it holds for $n = 1$ and $\gamma = 1/2$. That was an error; it is not known if (1.22) holds for $n = 1$, $\gamma = 1/2$ but it is known [9] that (1.22) does not hold for $n = 1$, $\gamma < 1/2$. In section II we shall briefly mention how to deduce (1.22).
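The Gamma-function factor in (1.23) is the value of the $\alpha$-integral in (1.22) after scaling out $V_-(x)$: with $a = |\alpha|/V_-(x)$ one gets $\gamma\int_0^1 a^{\gamma-1}(1-a)^{n/2}\,da = \Gamma(\gamma+1)\Gamma(1+n/2)/\Gamma(1+\gamma+n/2)$. The Python sketch below checks this numerically and also evaluates the product with $C_n$ from (1.6), which is the classical constant $(4\pi)^{-n/2}\Gamma(\gamma+1)/\Gamma(1+\gamma+n/2)$. Function names are illustrative.

```python
import math

def moment_factor_closed(g, n):
    """Gamma(g+1) Gamma(1+n/2) / Gamma(1+g+n/2), the factor in (1.23)."""
    return math.gamma(g + 1) * math.gamma(1 + n / 2) / math.gamma(1 + g + n / 2)

def moment_factor_numeric(g, n, steps=100000):
    """g * int_0^1 a^(g-1) (1-a)^(n/2) da by the midpoint rule, after a = t^(1/g)."""
    h = 1.0 / steps
    return h * sum((1.0 - ((k + 0.5) * h) ** (1.0 / g)) ** (n / 2)
                   for k in range(steps))

def classical_moment_constant(g, n):
    """C_n times the factor in (1.23), with C_n = (4 pi)^(-n/2) / Gamma(1+n/2)."""
    C_n = (4 * math.pi) ** (-n / 2) / math.gamma(1 + n / 2)
    return C_n * moment_factor_closed(g, n)

for g, n in [(1.0, 3), (0.5, 1), (2.0, 2), (1.5, 4)]:
    print(g, n, moment_factor_numeric(g, n), moment_factor_closed(g, n))
print(classical_moment_constant(1.0, 3))
```

The substitution $a = t^{1/\gamma}$ removes the endpoint singularity of $a^{\gamma-1}$ for $\gamma < 1$, so a plain midpoint rule is adequate for this check.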
The best constant LY n in (1.22) is not given by (1.23), as the foregoing remark already indicates.
If we use C
n
classical value of L
in place of LY n in (1.23) we have the
namely, Y ,n
LY n -
As in the case cof Ln,
(4v)-n/2
r(Y+1) r(l+y+n/2)-1
(1.24)
it is easy to prove that LY The classin cal constant LY n can also be "derived" from the semiclassical assumption as .
in (1.18), (1.19), namely
247
Proceedings of the Amer. Math. Soc. Symposia in Pure Math. 36, 241-252 (1980) ELLIOTT H. LIES
246
(2r)-n fRrRn
EIE1IY =
dpdx
2 I p
+ V(x)IY 0(-p2-V(x)).
(1.25)
If the p integration is done in (1.25), the result is (1.22) with 'Y
,n
An important question is when is $L_{\gamma,n} = L^c_{\gamma,n}$. It seems to be true that $L_{\gamma,n} = L^c_{\gamma,n}$ for $\gamma$ large enough, depending on $n$. This is known to be true [9, 10] for $n = 1$ and $\gamma \ge 3/2$. In fact [10], if $L_{\gamma_0,n} = L^c_{\gamma_0,n}$ for some $\gamma_0$, then equality holds for all $\gamma > \gamma_0$. The case of primary physical interest is $\gamma = 1$, $n = 3$, where it is conjectured [9] that equality holds. If this were so, it would have important consequences for physics and it is hoped that someone will be motivated to solve the problem.
We now turn to the proof of Theorems 1 and 2 in the next three sections.

II. THE BIRMAN-SCHWINGER KERNEL

As stated in Corollary 5 we can assume $V(x) \le 0$. Therefore write
$$V(x) = -U(x), \qquad U \ge 0.$$
A useful device for studying the nonpositive eigenvalues of $-\Delta - U$ was discovered by Birman [11] and Schwinger [12]. If $(-\Delta - U)\psi = E\psi$, $E < 0$, then
$$\psi = (-\Delta + |E|)^{-1} U \psi. \tag{2.1}$$
Defining $U^{1/2}\psi = \phi$, and multiplying (2.1) by $U^{1/2}$, we have
$$\phi = K_{|E|}(U)\, \phi \tag{2.2}$$
where $K_e(U)$, for $e > 0$, is the positive Birman-Schwinger kernel given explicitly by
$$K_e(x, y; U) = U(x)^{1/2}\, (-\Delta + e)^{-1}(x, y)\, U(y)^{1/2}. \tag{2.3}$$
What (2.2) says is that for every nonpositive eigenvalue, $E$, of $-\Delta - U$, $K_{|E|}(U)$ has an eigenvalue 1. The converse is also easily seen to hold (see [3, 8] for more details). $K_e(U)$ is to be thought of as an operator on $L^2$; we will see that it is compact, when $e > 0$ at least, and $U$ is in a suitable $L^p$ space.
In addition to the advantage that the study of the $E$'s reduces to the study of a compact operator, there is the following important fact: Since $(-\Delta + e)^{-1}$ is operator monotone decreasing as a function of $e$, so is $K_e$. Hence (with $V = -U$),
$$N_\alpha(V) = k_{|\alpha|}(U) \equiv \text{number of eigenvalues of } K_{|\alpha|}(U) \ge 1. \tag{2.4}$$
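The counting identity (2.4) has an exact finite-dimensional analogue, which the following sketch illustrates. It is an assumption-laden toy model, not the construction of the paper: $-\Delta$ is replaced by a finite-difference Dirichlet Laplacian on an interval, $U$ is a hypothetical Gaussian well, and eigenvalues are counted through the signs of elimination pivots (Sylvester's law of inertia) rather than by diagonalization. The count of eigenvalues of $-\Delta - U$ below $-e$ is compared with the count of eigenvalues of the discretized $K_e(U)$ above 1.

```python
import math


def negative_pivots(matrix):
    """Number of negative pivots in Gaussian elimination without pivoting.

    For a symmetric matrix with nonzero leading principal minors this equals
    the number of negative eigenvalues (Sylvester's law of inertia).
    """
    m = [row[:] for row in matrix]
    n = len(m)
    count = 0
    for k in range(n):
        pivot = m[k][k]
        if pivot < 0.0:
            count += 1
        for i in range(k + 1, n):
            factor = m[i][k] / pivot
            for j in range(k, n):
                m[i][j] -= factor * m[k][j]
    return count


def invert(matrix):
    """Gauss-Jordan inverse without pivoting (adequate for a positive definite matrix)."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for k in range(n):
        pivot = aug[k][k]
        aug[k] = [value / pivot for value in aug[k]]
        for i in range(n):
            if i != k and aug[i][k] != 0.0:
                factor = aug[i][k]
                aug[i] = [vi - factor * vk for vi, vk in zip(aug[i], aug[k])]
    return [row[n:] for row in aug]


def bound_state_counts(e, size=60, half_width=6.0, depth=5.0):
    """Return (#eigenvalues of -Lap - U below -e, #eigenvalues of K_e(U) above 1)."""
    h = 2.0 * half_width / (size + 1)
    xs = [-half_width + (i + 1) * h for i in range(size)]
    u = [depth * math.exp(-x * x) for x in xs]  # hypothetical potential U >= 0
    lap = [[0.0] * size for _ in range(size)]   # finite-difference -Laplacian
    for i in range(size):
        lap[i][i] = 2.0 / h ** 2
        if i + 1 < size:
            lap[i][i + 1] = lap[i + 1][i] = -1.0 / h ** 2
    shifted = [row[:] for row in lap]           # -Lap - U + e
    free = [row[:] for row in lap]              # -Lap + e
    for i in range(size):
        shifted[i][i] += e - u[i]
        free[i][i] += e
    resolvent = invert(free)
    one_minus_k = [[(1.0 if i == j else 0.0)
                    - math.sqrt(u[i]) * resolvent[i][j] * math.sqrt(u[j])
                    for j in range(size)] for i in range(size)]
    return negative_pivots(shifted), negative_pivots(one_minus_k)


direct, via_kernel = bound_state_counts(1.0)
print(direct, via_kernel)
```

The agreement of the two counts rests on the factorization $-\Delta - U + e = (-\Delta+e)^{1/2}\,[1 - (-\Delta+e)^{-1/2}U(-\Delta+e)^{-1/2}]\,(-\Delta+e)^{1/2}$, which preserves the number of negative eigenvalues, and on the fact that $(-\Delta+e)^{-1/2}U(-\Delta+e)^{-1/2}$ and $K_e(U)$ have the same nonzero spectrum.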
(2.4) will be exploited in the following way.
PROPOSITION 6.
Let $F : \mathbb{R}^+ \to \mathbb{R}^+$ be any function such that $F(x) \ge 1$ for $x \ge 1$.
The Number of Bound States of One-Body Schrodinger Operators and the Weyl Problem
Then
$$\operatorname{Tr} F(K_e(U)) = \sum_i F(\lambda_e^i(U)) \ge k_e(U) = N_{-e}(V) \tag{2.5}$$
where Tr means trace and the $\lambda_e^i(U)$ are the eigenvalues of $K_e(U)$. For example, consider $\mathbb{R}^n$, $n \le 3$, and $F(x) = x^2$. Then
$$\operatorname{Tr} K_e(U)^2 = \iint U(x)\, U(y)\, \bigl[(-\Delta + e)^{-1}(x - y)\bigr]^2\, dx\, dy \le \|U\|_2^2\, \|(-\Delta + e)^{-1}(x)\|_2^2 \tag{2.6}$$
by Young's inequality. The last factor is of the form
$$\|(-\Delta + e)^{-1}(x)\|_2^2 = h_n\, e^{-2+n/2}. \tag{2.7}$$
This shows that $K_e(U)$ is Hilbert-Schmidt when $U \in L^2$ and $e > 0$. When $n > 3$, $h_n = \infty$ but one can show that $K_e(U)$ is compact by considering the trace of a higher power of $K_e(U)$, cf. [9].
At this point we can derive the aforementioned bound, (1.22), on $I_\gamma(V)$. If we use (2.5) and (2.6) and insert the latter in (1.21), the $\alpha$ integration will diverge. The trick [9] is to use Proposition 4 with $2\delta = e = -\alpha$. Thus
$$N_{-e}(V) \le \operatorname{Tr} K_{e/2}\bigl([V + e/2]_-\bigr)^2 \le \|(U - e/2)_+\|_2^2\; h_n\, (e/2)^{-2+n/2}. \tag{2.8}$$
Inserting (2.8) into (1.21), and doing the $\alpha$ integration first, we obtain
$$I_\gamma(V) \le 2^{2-n/2}\, \gamma\, h_n \int dx \int_{-2U(x)}^{0} d\alpha\, |\alpha|^{\gamma-3+n/2}\, [U(x) + \alpha/2]^2 = L_{\gamma,n} \int U(x)^{\gamma+n/2}\, dx \tag{2.9}$$
if $\gamma > 2 - n/2$ and $n \le 3$. (2.9) can be extended to other values of $n$ and $\gamma$ (but with $\gamma > 1/2$ for $n = 1$), cf. [9]. There is no way, however, to make this method work with $F(x) = x^a$ when $\gamma = 0$ and $n \ge 3$. Quiet reflection shows that if Theorem 2 is to be provable by this method then we need $x^{-n/2} F(x) \to 0$ as $x \to 0$ but $x^{-1} F(x)$ remains bounded as $x \to \infty$. The tool we will use to bound $\operatorname{Tr} F(K_e(U))$ for such $F$'s is the Wiener integral. That is the subject of the next section, which is really the main point of this paper.
III. THE WIENER INTEGRAL REPRESENTATION OF $F(K_e(U))$

Let $d\mu_{x,y;t}$ be conditional Wiener measure on paths $\omega$ with $\omega(0) = x$ and $\omega(t) = y$. This measure gives a representation for $G$ (cf. [3]) by
$$\int d\mu_{x,y;t}(\omega) = G(x, y; t) = e^{t\Delta}(x, y). \tag{3.1}$$
$G$ itself has the semigroup property
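$$\int_M G(x, y; t)\, G(y, z; s)\, dy = G(x, z; t+s). \tag{3.2}$$
The well known Feynman-Kac formula [13] is
$$\int d\mu_{x,y;t}(\omega)\, \exp\Bigl[-\lambda \int_0^t U(\omega(s))\, ds\Bigr] = e^{-t(-\Delta + \lambda U)}(x, y). \tag{3.3}$$
(Note the signs in (3.3).) Take $U \ge 0$, $e \ge 0$, $\lambda \ge 0$, and multiply (3.3) by $U(x)^{1/2} U(y)^{1/2} \exp(-et)$ and integrate with respect to $t$. The result is
$$A \equiv U(x)^{1/2} U(y)^{1/2} \int_0^\infty dt\, e^{-et} \int d\mu_{x,y;t}(\omega)\, \exp\Bigl(-\lambda \int_0^t U(\omega(s))\, ds\Bigr) = \bigl\{U^{1/2} (-\Delta + \lambda U + e)^{-1} U^{1/2}\bigr\}(x, y). \tag{3.4}$$
Now
$$(-\Delta + e)^{-1} = (-\Delta + \lambda U + e)^{-1} + \lambda\, (-\Delta + \lambda U + e)^{-1}\, U\, (-\Delta + e)^{-1}. \tag{3.5}$$
Multiplying (3.5) on both sides by $U^{1/2}$ we obtain
$$A = \bigl\{K_e(U)\, [1 + \lambda K_e(U)]^{-1}\bigr\}(x, y). \tag{3.6}$$
(3.6) can be cast in a more general form. If
$$g(x) = \exp(-\lambda x) \quad \text{and} \quad F(x) = x (1 + \lambda x)^{-1}$$
then
$$F(x) = x \int_0^\infty dy\, e^{-y}\, g(xy) \tag{3.7}$$
and
$$U(x)^{1/2} U(y)^{1/2} \int_0^\infty dt\, e^{-et} \int d\mu_{x,y;t}(\omega)\, g\Bigl(\int_0^t U(\omega(s))\, ds\Bigr) = F(K_e(U))(x, y). \tag{3.8}$$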
Next we want to take the trace of both sides of (3.8). This is obtained by setting $y = x$ and integrating. (To be precise, one must first take $U \in C_0^\infty$ and then use a limiting argument.) The point to notice is that the $x$ dependence occurs through
$$\int_M dx\, U(x)\, d\mu_{x,x;t}(\omega) = \int_M dx\, d\mu_{x,x;t}(\omega)\, U(\omega(0)). \tag{3.9}$$
By the semigroup property of $\exp[-t(-\Delta + \lambda U)]$, however, the same result is obtained (after the $d\mu$ integration) if $U(\omega(0))$ in (3.9) is replaced by $U(\omega(s))$ for any $0 \le s \le t$. Thus,
$$\int_0^\infty dt\, t^{-1} e^{-et} \int_M dx \int d\mu_{x,x;t}(\omega)\, f\Bigl(\int_0^t U(\omega(s))\, ds\Bigr) = \operatorname{Tr} F(K_e(U)), \tag{3.10}$$
with
$$f(x) = x\, g(x), \tag{3.11}$$
i.e.
$$F(x) = \int_0^\infty dy\, y^{-1} e^{-y} f(xy). \tag{3.12}$$
Now the relations between $F$, $f$ and $g$ are linear, as are the relations (3.8) and (3.10). The latter therefore extend to a large class of $f$'s.
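The transform (3.12) can be checked on the exponential example. The sketch below is an added illustration with hypothetical names and a hypothetical value of $\lambda$: it evaluates $\int_0^\infty y^{-1} e^{-y} f(xy)\,dy$ by the midpoint rule on a truncated interval for $f(x) = x\exp(-\lambda x)$ and compares the result with $F(x) = x(1+\lambda x)^{-1}$ from (3.7).

```python
import math


def transform(f, x, upper=60.0, steps=120_000):
    """Midpoint-rule value of int_0^upper y^(-1) e^(-y) f(x y) dy, cf. (3.12).

    The tail beyond `upper` is neglected; it is of order e^(-upper) here.
    """
    h = upper / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        total += math.exp(-y) * f(x * y) / y
    return total * h


lam = 0.7  # hypothetical value of lambda


def f(t):
    """f(t) = t g(t) with g(t) = exp(-lam t), cf. (3.11)."""
    return t * math.exp(-lam * t)


def F_closed(x):
    """F(x) = x (1 + lam x)^(-1), cf. (3.7)."""
    return x / (1.0 + lam * x)


for x in (0.5, 1.0, 3.0):
    print(x, transform(f, x), F_closed(x))
```

Since the map $f \mapsto F$ is linear, the same routine applies to finite linear combinations of such exponentials.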
Since we are here not particularly interested in the operator version, (3.8), we will concentrate on the trace version (3.10). The $g$'s of the form $\exp(-\lambda x)$ are norm dense in the continuous functions which vanish at infinity. Furthermore, using the semigroup property of $\exp[-t(-\Delta + U)]$, one can see explicitly that (3.8) and (3.10) hold for $f$'s of the form $x^k \exp(-\lambda x)$, with $k$ a positive integer. By a monotone convergence argument one arrives at
THEOREM 7.
Let $f$ be a nonnegative lower semicontinuous function on $[0, \infty)$ satisfying:
(i) $f(0) = 0$
(ii) $x^{-p} f(x) \to 0$ as $x \to \infty$ for some $p < \infty$.
Let $U \ge 0$ and $U \in L^p + L^q$ with $p = n/2$ $(n \ge 3)$, $p > 1$ $(n = 2)$, $p = 1$ $(n = 1)$ and $p < q < \infty$. Then (3.10) holds, with $F$ given by (3.12), in the sense that both sides may be $+\infty$.
The reader is referred to [8] for details. Obviously the class of $f$'s in Theorem 7 is not the largest possible, but it is more than adequate for our intended application.
The remark that allows us to bound the left side of (3.10) is the following.
THEOREM 8.
Suppose that $f$ satisfies the conditions of Theorem 7 and that $f$ is also convex. Then
$$\operatorname{Tr} F(K_e(U)) \le \int_0^\infty dt\, t^{-1} e^{-et} \int_M dx\, G(x, x; t)\, f(tU(x)). \tag{3.13}$$
PROOF.
By Jensen's inequality, for any fixed path $\omega$,
$$f\Bigl(\int_0^t U(\omega(s))\, ds\Bigr) = f\Bigl(\int_0^t (t^{-1} ds)\, t\, U(\omega(s))\Bigr) \le \int_0^t (t^{-1} ds)\, f(tU(\omega(s))).$$
By the same remark as that preceding (3.10), $\int_M dx \int d\mu_{x,x;t}(\omega)\, h(\omega(s))$ is independent of $s$ for any fixed function $h$. Inserting this in (3.10) gives (3.13).
QED.
IV. APPLICATIONS OF THEOREM 8
PROOF OF THEOREM 1:
We use Proposition 3 and Proposition 6. $f$ is chosen to be of the following form for some $0 < a < b$:
$$f(x) = 0, \quad 0 \le x \le a; \qquad f(x) = b(x - a), \quad a \le x. \tag{4.1}$$
$F$ given by (3.12) is monotone, and the condition that $F(1) = 1$ is that $a$ and $b$ are related by
$$1 = h(a, b) \equiv b \int_a^\infty dy\, (1 - a/y)\, e^{-y} = b e^{-a} - ab\, E_1(a) \tag{4.2}$$
where $E_1$ is the exponential integral. Assuming (1.10), (1.9) will be proved; (1.7) follows from the special assumption $B_n = 0$ in what follows.
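Relation (4.2) determines $b$ once $a$ is fixed. The following sketch, added as an illustration, evaluates $E_1$ from its standard convergent series $E_1(x) = -\gamma_E - \ln x - \sum_{k\ge1} (-x)^k/(k\,k!)$ and solves (4.2) for $b$. The function names are hypothetical, and $a = 0.25$ is the value used later in the paper's numerical illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def exp_integral_e1(x, terms=60):
    """E_1(x) = int_x^inf e^(-t)/t dt via the series -gamma - ln x - sum (-x)^k / (k k!)."""
    total = -EULER_GAMMA - math.log(x)
    term = 1.0
    for k in range(1, terms + 1):
        term *= -x / k        # term now equals (-x)^k / k!
        total -= term / k
    return total


def slope_b(a):
    """Solve 1 = b e^(-a) - a b E_1(a), i.e. relation (4.2), for b."""
    return 1.0 / (math.exp(-a) - a * exp_integral_e1(a))


a = 0.25
print(exp_integral_e1(a), slope_b(a))
```

The series is practical only for moderate $x$; for large arguments a continued fraction or asymptotic expansion would be the usual choice.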
In Proposition 3 write $\alpha = -e$ so that $\tilde{N}(\Omega, \lambda) \le$ (3.13) with $U = (e + \lambda)\chi_\Omega$. The $x$ integration can be done last in (3.13) in which case $f(tU(x)) = 0$ if $x \notin \Omega$. For $x \in \Omega$ we change variables to $t \to t/(e + \lambda)$. Thus,
$$b\, |\Omega| \int_a^\infty dt\, (1 - a/t)\, \bigl(A_n (e + \lambda)^{n/2} t^{-n/2} + B_n\bigr)\, \exp[-et/(e + \lambda)] \ge \tilde{N}(\Omega, \lambda). \tag{4.3}$$
The simplest choice for $e$, which is arbitrary, is $e = 0$. This will work only when $n \ge 3$ and $B_n = 0$. But for all cases we can choose $e = c\lambda$, $c > 0$ and thereby prove Theorem 1.
QED.

PROOF OF THEOREM 2:
Proposition 6 is used again and the proof parallels that of Theorem 1. (4.1) and (4.2) are assumed. We change variables in (3.13) to $t \to tU(x)^{-1}$ if $U(x) \ne 0$. If $e = 0$ the result is (1.13) with
$$L_n = A_n \int_0^\infty dt\, t^{-1-n/2} f(t) = 4 A_n\, b\, a^{1-n/2}\, [n(n-2)]^{-1}. \tag{4.4}$$
QED.

REMARKS:
(i) If $\alpha = -e \ne 0$ then the only estimate we have for $N_\alpha(V)$ is contained in Corollary 5, which is valid only for $n \ge 3$. Alternatively, one could try to estimate (3.13) directly with $e \ne 0$, but this is messy. As stated earlier, no inequality of the form (1.16) holds for all $\alpha$, $V$ when $n = 1$ or 2. But recently Ito [14] has bounded (3.13) when $e \ne 0$ and $n = 2$. He uses the fact that $f(x) \le bx$ for $x > a$ and obtains a complicated upper bound for $N_\alpha(V)$ in terms of $\|V_-\|_2$ and $\|V_- |\ln V_-|^{1/2}\|_2$.
(ii) If $n \ge 3$ and $B_n = 0$ we can choose $e = 0$ in (4.3). This estimate for $D_n$ is, of course, the same as $L_n$ given by (4.4).
As an illustration of how good our bound is let us consider the case of $\mathbb{R}^3$, where $A_3 = (4\pi)^{-3/2}$. We choose $a = 0.25$ in (4.1) and find that $E_1(0.25) = 1.0443$ and $b = 1.9315$ according to (4.2). Using (4.4),
$$D_3 = L_3 = 0.1156. \tag{4.5}$$
This value of $D_3$ can be used in (1.7). When compared with $C_3 = 0.0169$, which is supposed to be the sharp constant, it is not very good. The estimate for $D_3$ can be improved by using (4.3) with $e = c\lambda$, $c > 0$.
If, however, the same number, $L_3$, is used in (1.13) the result is quite good. As already stated, the best $L_3 > C_3$. In fact, by an explicit example,
$$L_3 \ge (3\pi)^{-3/2}\, \Gamma(3)/\Gamma(3/2) = 0.0780, \tag{4.6}$$
cf. [9], eqn. (4.24). It is conjectured that the right side of (4.6) is, in fact, the sharp constant in (1.13) for $\mathbb{R}^3$. In any case our result, (4.5), is off by at most 49%.
As stated in Section 1, a quantity of physical interest is $I_1$, the sum of the absolute values of the eigenvalues, in $\mathbb{R}^3$. Using the bound (1.22), (1.23), together with (4.5), we have
$$L_{1,3} \le (2/5) L_3 = 0.04624. \tag{4.7}$$
This result was announced in [15].
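The numbers quoted in (4.5)-(4.7) can be reproduced from the closed forms (4.4), (4.6) and (1.23). The sketch below is an added check, not part of the paper; it takes $a = 0.25$ and $b = 1.9315$ as given, and it assumes that the classical constant is $C_n = (4\pi)^{-n/2}/\Gamma(1+n/2)$, an assumption that reproduces the quoted value 0.0169. All names are hypothetical.

```python
import math


def l_n(a, b, n):
    """L_n = 4 A_n b a^(1-n/2) / (n (n-2)) with A_n = (4 pi)^(-n/2), as in (4.4)."""
    a_n = (4.0 * math.pi) ** (-n / 2.0)
    return 4.0 * a_n * b * a ** (1.0 - n / 2.0) / (n * (n - 2))


l3 = l_n(0.25, 1.9315, 3)                                   # (4.5)
c3 = (4.0 * math.pi) ** (-1.5) / math.gamma(2.5)            # assumed form of C_3
lower = (3.0 * math.pi) ** (-1.5) * math.gamma(3.0) / math.gamma(1.5)  # (4.6)
l13 = l3 * math.gamma(2.0) * math.gamma(2.5) / math.gamma(3.5)         # (1.23) with gamma = 1

print(l3, c3, lower, l3 / lower, l13)
```

The ratio `l3 / lower` is about 1.48, which is the sense in which (4.5) is off by at most 49%.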
BIBLIOGRAPHY
1. H. Weyl, "Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen", Math. Ann. 71 (1911), 441-469.
2. M. Kac, "Can one hear the shape of a drum?", Slaught Memorial Papers, no. 11, Amer. Math. Monthly 73 (1966), no. 4, part II, 1-23.
3. M. Reed and B. Simon, Methods of Modern Mathematical Physics, Acad. Press, N. Y., 1978.
4. G. V. Rosenbljum, "Distribution of the discrete spectrum of singular differential operators", Dokl. Akad. Nauk SSSR 202 (1972), 1012-1015 (MR 45 #4216). The details are given in "Distribution of the discrete spectrum of singular differential operators", Izv. Vyss. Ucebn. Zaved. Matematika 164 (1976), 75-86. [English trans. Sov. Math. (Iz. VUZ) 20 (1976), 63-71.]
5. B. Simon, "Weak trace ideals and the number of bound states of Schroedinger operators", Trans. Amer. Math. Soc. 224 (1976), 367-380.
6. M. Cwikel, "Weak type estimates for singular values and the number of bound states of Schroedinger operators", Ann. Math. 106 (1977), 93-100.
7. E. Lieb, "Bounds on the eigenvalues of the Laplace and Schroedinger operators", Bull. Amer. Math. Soc. 82 (1976), 751-753.
8. B. Simon, Functional Integration and Quantum Physics, Academic Press, N. Y., to appear 1979.
9. E. Lieb and W. Thirring, "Inequalities for the moments of the eigenvalues of the Schroedinger equation and their relation to Sobolev inequalities", in Studies in Mathematical Physics: Essays in Honor of Valentine Bargmann (E. Lieb, B. Simon and A. Wightman eds.), Princeton Univ. Press, Princeton, N. J., 1976. These ideas were first announced in "Bound for the kinetic energy of fermions which proves the stability of matter", Phys. Rev. Lett. 35 (1975), 687-689, Errata 35 (1975), 1116.
10. M. Aizenman and E. Lieb, "On semi-classical bounds for eigenvalues of Schroedinger operators", Phys. Lett. 66A (1978), 427-429.
11. M. Birman, "The spectrum of singular boundary problems", Math. Sb. 55 (1961), 124-174. (Amer. Math. Soc. Trans. 53 (1966), 23-80).
12. J. Schwinger, "On the bound states of a given potential", Proc. Nat. Acad. Sci. U.S.A. 47 (1961), 122-129.
13. M. Kac, "On some connections between probability theory and differential and integral equations", Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, Univ. of Calif. Press, Berkeley, 1951, 189-215.
14. K. R. Ito, "Estimation of the functional determinants in quantum field theories", Res. Inst. for Math. Sci., Kyoto Univ. (1979), preprint.
15. E. Lieb, "The stability of matter", Rev. Mod. Phys. 48 (1976), 553-569.
16. V. Glaser, H. Grosse and A. Martin, "Bounds on the number of eigenvalues of the Schroedinger operator", Commun. Math. Phys. 59 (1978), 197-212.
DEPARTMENTS OF MATHEMATICS AND PHYSICS
PRINCETON UNIVERSITY
JADWIN HALL
P.O. BOX 708
PRINCETON, N. J. 08544
With S. Oxford in Int. J. Quant. Chem. 19, 427-439 (1981)
Improved Lower Bound on the Indirect Coulomb Energy* ELLIOTT H. LIEB AND STEPHEN OXFORD Departments of Mathematics and Physics, Princeton University, Princeton, New Jersey 08544, U.S.A.
Abstract
For a Coulomb system of particles of charge $e$, it has previously been shown that the indirect part of the repulsive Coulomb energy (exchange plus correlation energy) has a lower bound of the form $-Ce^{2/3} \int \rho(x)^{4/3}\, dx$, where $\rho$ is the single particle charge density. Here we lower the constant $C$ from the 8.52 previously given to 1.68. We also show that the best possible $C$ is greater than 1.23.
1. Introduction
In the study of quantum Coulomb systems of charged particles (atoms, molecules, and solids), it is frequently desirable to estimate various energies in terms of the (diagonal) single particle charge density $\rho_\psi(x)$ belonging to a given state $\psi$ of $N$ particles. We will be concerned with the repulsive Coulomb energy
$$I_\psi = \Bigl\langle \psi \Bigm| \sum_{i<j}^{N} e_i e_j |x_i - x_j|^{-1} \Bigm| \psi \Bigr\rangle, \tag{1}$$
where $e_i > 0$ are the particle charges (by a trivial change we could assume all $e_i < 0$). $\psi$ is any normalized state, not necessarily an eigenstate of any given Hamiltonian; it could also be a density matrix. Our goal is to find a lower bound to $I_\psi$. [See note added in proof.] We define
$$D_\psi = D(\rho_\psi, \rho_\psi) \tag{2}$$
to be the direct part of $I_\psi$, where, in general, we have
$$D(f, g) = \tfrac{1}{2} \iint f(x)\, g(y)\, |x - y|^{-1}\, dx\, dy.$$
The density $\rho_\psi$ is the sum
$$\rho_\psi(x) = \sum_{i=1}^{N} \rho_\psi^i(x) \tag{3}$$
of the $N$ individual single particle charge densities defined by
$$\rho_\psi^i(x) = e_i \sum_{\alpha_1, \ldots, \alpha_N} \int |\psi(x_1, \ldots, x_{i-1}, x, x_{i+1}, \ldots, x_N; \alpha_1, \ldots, \alpha_N)|^2\, d\hat{x}_i. \tag{4}$$
Work partially supported by U.S. National Science Foundation Grant No. PHY-7825390 A01.
Here the $\alpha_i$ are any quantum numbers (e.g., spin) and $d\hat{x}_i$ means integration over all variables except $x_i$. The indirect part $E_\psi$ of $I_\psi$ is defined by
$$E_\psi + D_\psi = I_\psi. \tag{5}$$
$E_\psi$ includes all "exchange" and "correlation" effects. It was shown by one of us [1], that for all $\psi$ and $C = 8.52$, we have
$$E_\psi \ge -C e^{2/3} \int \rho_\psi(x)^{4/3}\, dx. \tag{6}$$
A more complicated bound was given when the $e_i$ are not identical. The main result of this paper is given in Section 2. We show that
$$E_\psi \ge -C \Bigl[\int \Bigl(\sum_{i=1}^{N} e_i \rho_\psi^i(x)\Bigr)^{4/3} dx\Bigr]^{1/2} \Bigl[\int \rho_\psi(x)^{4/3}\, dx\Bigr]^{1/2} \tag{7}$$
even in the case that the $e_i$ may be distinct, with $C = 1.68$. With this improved value of $C$, the bound for $E_\psi$ may have computational value as well as theoretical value. Inequality (7) reduces to the form of Eq. (6) when all $e_i = e$. Our proof differs from that in Ref. 1 chiefly in avoiding the use of the Hardy-Littlewood maximal function. We also show in Section 3 that the best (i.e., smallest) possible $C$, in Eq. (6), must be greater than 1.23 by explicitly calculating $E_\psi$ for a two-particle wave function $\psi$.
It should be noted that Eq. (6) is a special case of the lower bound for $N$-particle states $\psi$,
$$E_\psi \ge -C_p\, e^{(5p-6)/(3p-3)} \Bigl[\int \rho_\psi(x)^p\, dx\Bigr]^{1/(3p-3)} N^{(3p-4)/(3p-3)}, \tag{8}$$
which is a consequence of Thomas-Fermi theory for $p > 3/2$ (see Refs. 1-3). Hölder's inequality applied to Eq. (6) gives Eq. (8) for all $p \ge 4/3$ with the same $C$ as in Eq. (6). However, the best $C_p$ in Eq. (8) will in general be smaller than this $C$ and it would be interesting to have an estimate for it. [In Ref. 1 it was incorrectly stated that Thomas-Fermi theory gives Eq. (8) for $p > 4/3$. This is wrong because Thomas-Fermi theory is bounded below only for $p > 3/2$. Equation (8), for $p = 5/3$, was first given in Ref. 3, where it was used to prove the stability of matter.]
The bound, equations (6) or (7), can be examined in a number of ways. One might try to find a constant $C$ in Eq. (6) only for $\psi$ symmetric or antisymmetric, or only for $\psi$ with a particular spin value. However any such restriction on $\psi$ cannot improve the constant in Eq. (6). Let all $e_i = e$. Suppose we take an
arbitrary $\psi(x_1, \ldots, x_N; \alpha_1, \ldots, \alpha_N)$, where $\alpha_1, \ldots, \alpha_N$ are arbitrary quantum numbers. Let
$$f_\psi(x_1, \ldots, x_N) = \sum_{\alpha} |\psi(x_1, \ldots, x_N; \alpha_1, \ldots, \alpha_N)|^2,$$
and let the symmetrized version of $f_\psi$ be given as
$$F(x_1, \ldots, x_N) = \frac{1}{N!} \sum_{P \in S_N} f_\psi(x_{P1}, \ldots, x_{PN}).$$
We can define a symmetric $\psi_s$ such that $E_{\psi_s} = E_\psi$, $I_{\psi_s} = I_\psi$, and $\rho_{\psi_s}(x) = \rho_\psi(x)$. We merely take $\psi_s(x_1, \ldots, x_N) = F(x_1, \ldots, x_N)^{1/2}$. This shows that $C$ cannot be improved by excluding bosonic $\psi$. In a similar way, we define an antisymmetric $\psi_a$,
$$\psi_a = F(x_1, \ldots, x_N)^{1/2}\, \theta(x_1, \ldots, x_N),$$
where $\theta$ is any antisymmetric function which takes on the values $\pm 1$ except on a set of measure zero. We see that we cannot improve the constant by excluding $\psi$ with Fermi statistics. A similar construction shows that restrictions on the spin quantum number cannot improve $C$.
It will be noted that the right-hand side of Eq. (6) is of the form given by Dirac [4] to approximate the exchange energy. There is, however, a difference between the two if spin is taken into account. The Dirac approximation is
$$E_\psi = -C q^{-1/3} \int \rho_\psi(x)^{4/3}\, dx, \tag{9}$$
with $C = 0.93$ and $q$ is the number of spin states ($q = 2$ for electrons). Dirac computed Eq. (9) using a plane wave determinant for $\psi$. This determinant depends upon $q$. In view of Eq. (9) one might infer that a good lower bound to $E_\psi$ should have a $q^{-1/3}$ factor. This, as we have just noted above, is not the case. Another way to say this is the following: a "diagonal" operator, such as the Coulomb repulsion, cannot distinguish the spin of a particle. To "see" the spin it is necessary to examine an off-diagonal operator such as the kinetic energy $T_\psi$. A lower bound for $T_\psi$ does exist [3] and it does have a factor $q^{-2/3}$. In order to have a lower bound for $E_\psi$ that measures $q$ it would be necessary to use an expectation value such as $T_\psi$ that is "off-diagonal." A useful bound of this kind might exist, but a bound such as ours that involves only the "diagonal" quantity $\rho_\psi$ has no $q$ dependence.
It is true however, that the constant in Eq. (6) or (7) can be improved if we specify the number of particles. Consider the case of equal charges. Define $C_N$ to be the best constant for Eq. (6) when we consider only $N$-particle states $\psi$, i.e.,
$$C_N = \sup_{N\text{-particle } \psi} \Bigl(-e^{2/3} \int \rho_\psi(x)^{4/3}\, dx\Bigr)^{-1} E_\psi.$$
In Appendix A we compute $C_1 = 1.092$, following a treatment by Gadre, Bartolotti, and Handy [5]. One notes that $C_1$ is less than the constant 1.23 computed for the variational two-particle wave function of Section 3, which is itself less than $C_2$. In Appendix B, we show that this is a general phenomenon, i.e., $C_N \le C_{N+1}$. Presumably, one might take advantage of this if one were only dealing with a fixed low number of particles. However, we have no method available for computing an upper bound for $C_N$, $N \ge 2$, except 1.68, which holds for all $N$.
Inequality (7) might be improved upon in another way in the case of unequal charges. One might try to prove that a universal constant C existed such that the following is true (for fixed 0
N
Er,
- C [ r ( Y_I
e;'
I -a
o
dx]
2 poi (x))
p(x)4/3 dx]
[J
(10)
It is an open question whether a bound exists when a = 1 in Eq. (10), called the symmetric form. Holder's inequality implies that for Eq. (10), (right-hand side, a = 1) a- (right-hand side, a s 1); thus the symmetric form gives the best lower bound. In Ref. I a bound with constant 8.52 was proved for a =', and in this paper we sacrifice even more "symmetry" in the bound to show that Eq. (10) holds for a = 2 with the improved constant 1.68. In Ref. 5, the comment was made that no one has yet produced any upper bound for E*. The following simple remark is relevant. There certainly cannot be any upper bound of form C f p(x)4/3 dx. The reason is that E,I,/f p(x)4/3 dx can be made arbitrarily positive simply by taking a G for two particles of the form 4, =ALk -(XI-x2)]exp(-1x112- 1x212-Aixl-x212,
where $k$ is some fixed vector. As $\lambda$ tends to infinity, $E_\psi$ will tend to $+\infty$ while $\int \rho_\psi^{4/3}\, dx$ will remain bounded. This $\psi$ is antisymmetric.

2. A Lower Bound for $E_\psi$
We will use an argument similar to that given in Ref. 1 to derive Eq. (7) for $C = 1.68$.
We first fix charges $e_1 > 0, \ldots, e_N > 0$. Let $f_\psi(x_1, \ldots, x_N)$ be the particle density associated with an $N$-particle wave function $\psi$, as given in Section 1. Let $\rho_\psi$ be the associated single-particle charge density, equation (4), and let $x_1, \ldots, x_N$ be distinct but otherwise arbitrary points in $\mathbb{R}^3$. We take $\mu(y)$ to be some function satisfying the following: (i) $\mu$ is non-negative; (ii) $\mu$ is spherically symmetric about the origin and $\mu(x) = 0$ if $|x| > 1$; and (iii) $\int \mu(y)\, dy = 1$. Let $\lambda$ be a positive constant to be determined later. We now define a function
$$\mu_x(y) = \lambda^3 \rho_\psi(x)\, \mu\bigl(\lambda \rho_\psi(x)^{1/3} (y - x)\bigr), \tag{11}$$
if $\rho_\psi(x) > 0$, and $\mu_x(y) \equiv 0$ if $\rho_\psi(x) = 0$. We see that $\mu_x$ is a non-negative function which satisfies (i) its integral (with respect to $y$) is 1 if $\rho_\psi(x) > 0$; (ii) it is spherically symmetric about $x$; (iii) $\mu_x(y) = 0$ if $|y - x| > \lambda^{-1} \rho_\psi(x)^{-1/3}$.
We observe that Lemma 1 of Ref. 1 may be applied to this choice of $\mu$. Namely, we prove

Lemma 1:
$$\sum_{1 \le i < j \le N} e_i e_j |x_i - x_j|^{-1} \ge -D(\rho_\psi, \rho_\psi) + 2 \sum_{i=1}^{N} D(\rho_\psi, e_i \mu_{x_i}) - \sum_{i=1}^{N} D(e_i \mu_{x_i}, e_i \mu_{x_i}). \tag{12}$$
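Proof. It is a well-known fact [1] that the potential $\phi$ generated by a non-negative spherically symmetric charge distribution of total charge 1 satisfies $\phi(x) \le 1/|x|$. In particular, taking $x_i$ to be the origin, $\int \mu_{x_i}(y) |x - y|^{-1}\, dy \le |x - x_i|^{-1}$. Hence
$$D(e_i \mu_{x_i}, e_j \mu_{x_j}) = \tfrac{1}{2} e_i e_j \iint \mu_{x_i}(y) |x - y|^{-1} \mu_{x_j}(x)\, dx\, dy \le \tfrac{1}{2} e_i e_j \int |x - x_i|^{-1} \mu_{x_j}(x)\, dx \le \tfrac{1}{2} e_i e_j |x_i - x_j|^{-1}. \tag{13}$$
We now observe that
$$D\Bigl(\rho_\psi - \sum_{i=1}^{N} e_i \mu_{x_i},\; \rho_\psi - \sum_{i=1}^{N} e_i \mu_{x_i}\Bigr) \ge 0 \tag{14}$$
by the positive definiteness of the Coulomb kernel. Expanding the left-hand side of Eq. (14) and rearranging, one has that
$$2 \sum_{i<j} D(e_i \mu_{x_i}, e_j \mu_{x_j}) \ge -D(\rho_\psi, \rho_\psi) + 2 \sum_{i=1}^{N} D(\rho_\psi, e_i \mu_{x_i}) - \sum_{i=1}^{N} D(e_i \mu_{x_i}, e_i \mu_{x_i}). \tag{15}$$
The lemma follows by applying Eq. (13) to the left-hand side of Eq. (15).
Now let $\delta_{x_i}$ be a point charge distribution of charge one centered at $x_i$. Adding and subtracting terms on the right-hand side of inequality (12), we get
$$\sum_{i<j} e_i e_j |x_i - x_j|^{-1} \ge -D(\rho_\psi, \rho_\psi) + 2 \sum_{i=1}^{N} D(\rho_\psi, e_i \delta_{x_i}) - \Bigl(2 \sum_{i=1}^{N} D(\rho_\psi, e_i \delta_{x_i} - e_i \mu_{x_i}) + \sum_{i=1}^{N} D(e_i \mu_{x_i}, e_i \mu_{x_i})\Bigr). \tag{16}$$
We now integrate Eq. (16) against $f_\psi(x_1, \ldots, x_N)$, whence
$$I_\psi \ge D(\rho_\psi, \rho_\psi) - \Bigl(\sum_{i=1}^{N} \int 2 D(\rho_\psi, \delta_{x_i} - \mu_{x_i})\, \rho_\psi^i(x_i)\, dx_i + \sum_{i=1}^{N} \int D(\mu_{x_i}, \mu_{x_i})\, e_i\, \rho_\psi^i(x_i)\, dx_i\Bigr). \tag{17}$$
We wish to find an upper bound of the appropriate kind to the expression in large parentheses in Eq. (17). Let us denote the first sum in large parentheses as (*) and the second as (**). We rewrite (*) as follows:
$$(*) = \int \rho_\psi(x)\, 2 D(\rho_\psi, \delta_x - \mu_x)\, dx = \iint \rho_\psi(y)\, F_\lambda(\rho_\psi(x), |x - y|)\, dx\, dy, \tag{18}$$
where we define
$$F_\lambda(a, r) = \bigl[a r^{-1} - \lambda a^{4/3} \phi(\lambda a^{1/3} r)\bigr], \tag{19}$$
and $\phi$ is the potential generated by our fixed $\mu$. Hereafter we require that $\mu$ be bounded. In this case, $F_\lambda(a, r)$, considered as a function of $a$ on $(0, \infty)$, satisfies the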
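following for $r > 0$: (i) it is continuously differentiable; (ii) $F_\lambda(0, r) = 0$ and $F_\lambda(a, r) = 0$ if $\lambda a^{1/3} r > 1$. These properties follow from the continuity and differentiability of $\phi$ and the relation $\phi(x) = 1/|x|$ for $|x| > 1$. For $a > 0$, let
$$\chi_a(x) = \theta[\rho_\psi(x) - a],$$
where $\theta(t) = 0$ if $t \le 0$, and $\theta(t) = 1$ if $t > 0$. $\chi_a$ is the characteristic function of the set $\rho_\psi(x) > a$. By Fubini's theorem and the fundamental theorem of calculus, one has that
$$\int_0^\infty da \int \chi_a(x) \frac{\partial}{\partial a} F_\lambda(a, r)\, dx = \int dx \int_0^{\rho_\psi(x)} \frac{\partial}{\partial a} F_\lambda(a, r)\, da = \int dx\, F_\lambda(\rho_\psi(x), r), \tag{20}$$
and thus
$$(*) = \int_0^\infty \!\! \int_0^\infty da\, db \iint dx\, dy\, \chi_a(x)\, \chi_b(y)\, \frac{\partial}{\partial a} F_\lambda(a, |x - y|)$$
where we have used the representation $\rho_\psi(y) = \int_0^\infty db\, \chi_b(y)$. We bound (*) as follows: Let $(y)_+ = y$ if $y \ge 0$, and $(y)_+ = 0$ if $y \le 0$. Then
$$\int_0^\infty \!\! \int_0^\infty da\, db \iint dx\, dy\, \chi_a(x)\, \chi_b(y)\, \frac{\partial}{\partial a} F_\lambda(a, |x - y|) \le \iint_{a \ge b} da\, db \iint dx\, dy\, \chi_a(x) \Bigl(\frac{\partial}{\partial a} F_\lambda(a, |x - y|)\Bigr)_+ + \iint_{a < b} da\, db \iint dx\, dy\, \chi_b(y) \Bigl(\frac{\partial}{\partial a} F_\lambda(a, |x - y|)\Bigr)_+. \tag{21}$$
By scaling properties of $(\partial/\partial a) F_\lambda(a, |x - y|)$, one has that
$$\int \Bigl(\frac{\partial}{\partial a} F_\lambda(a, |x - y|)\Bigr)_+ dx = \int \Bigl(\frac{\partial}{\partial a} F_\lambda(a, |x - y|)\Bigr)_+ dy = \lambda^{-2} K a^{-2/3}, \tag{22}$$
where $K = \int [(\partial/\partial a) F_1(1, |z|)]_+\, dz$ and $K$ only depends on the original choice of $\mu$. We, therefore, have that
$$(*) \le \lambda^{-2} K \int dx \Bigl(\int_0^\infty da\, \chi_a(x)\, a^{-2/3} \int_0^a db + \int_0^\infty \chi_b(x)\, db \int_0^b a^{-2/3}\, da\Bigr) = 4 \lambda^{-2} K \int dx \int_0^\infty \chi_a(x)\, a^{1/3}\, da = 3 \lambda^{-2} K \int \rho_\psi^{4/3}(x)\, dx, \tag{23}$$
where we have used the representation $\rho_\psi^{4/3}(x) = (4/3) \int_0^\infty a^{1/3} \chi_a(x)\, da$. The second sum (**) in the large parentheses of (17) can be written
$$(**) = \sum_{i=1}^{N} \int D(\mu_x, \mu_x)\, e_i\, \rho_\psi^i(x)\, dx = \lambda D(\mu, \mu) \int \sum_{i=1}^{N} e_i\, \rho_\psi^i(x)\, \rho_\psi(x)^{1/3}\, dx \tag{24}$$
$$\le \lambda D(\mu, \mu) \Bigl[\int \Bigl(\sum_{i=1}^{N} e_i \rho_\psi^i(x)\Bigr)^{4/3} dx\Bigr]^{3/4} \Bigl[\int \rho_\psi(x)^{4/3}\, dx\Bigr]^{1/4}. \tag{25}$$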
Equation (24) follows from simple scaling and Eq. (25) is the Hölder inequality. Optimizing Eqs. (23) and (25) with respect to $\lambda$ yields
$$(*) + (**) \le \tfrac{3}{2} \bigl[6 K D(\mu, \mu)^2\bigr]^{1/3} \Bigl[\int \Bigl(\sum_{i=1}^{N} e_i \rho_\psi^i(x)\Bigr)^{4/3} dx\Bigr]^{1/2} \Bigl[\int \rho_\psi(x)^{4/3}\, dx\Bigr]^{1/2}. \tag{26}$$
A variational argument shows that the optimum choice of $\mu$ would be the uniform ball if $[(\partial/\partial a) F_\lambda(a, r)]_+$ were replaced by $(\partial/\partial a) F_\lambda(a, r)$ [in which case the constant in Eq. (26) would be 1.45]. However, trial and error indicates this choice is also approximately best with the cutoff. We find that $[\partial F_1(1, r)/\partial a]_+ = \partial F_1(1, r)/\partial a$ if and only if $r \le R$ with $R = (5^{1/2} - 1)/2$. Then $K = 0.6489$ and $D(\mu, \mu) = 3/5$. The constant in Eq. (26) is then 1.68. Thus we reach the conclusion that
$$E_\psi \ge -1.68 \Bigl[\int \Bigl(\sum_{i=1}^{N} e_i \rho_\psi^i(x)\Bigr)^{4/3} dx\Bigr]^{1/2} \Bigl[\int \rho_\psi(x)^{4/3}\, dx\Bigr]^{1/2}. \tag{27}$$
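The constants 0.6489, 1.68 and 1.45 can be recomputed for the uniform ball. The sketch below is an added check under stated assumptions: for the uniform unit ball of total charge 1 the potential is $\phi(r) = (3 - r^2)/2$ for $r \le 1$, so that $\partial F_1(1, r)/\partial a = 1/r - 2 + r^2$ there, and $D(\mu,\mu) = 3/5$. Minimizing $3K\lambda^{-2} + D(\mu,\mu)\lambda$ over $\lambda$ gives $\tfrac{3}{2}[6KD(\mu,\mu)^2]^{1/3}$. The function names are hypothetical.

```python
import math

R = (math.sqrt(5.0) - 1.0) / 2.0   # radius beyond which dF/da turns negative
D_BALL = 3.0 / 5.0                 # D(mu, mu) for the uniform unit ball


def dF_da(r):
    """d/da F_1(a, r) at a = 1 for the uniform unit ball, valid for 0 < r <= 1."""
    return 1.0 / r - 2.0 + r * r


def k_constant(cutoff, steps=10_000):
    """4 pi int_0^cutoff dF_da(r) r^2 dr by the midpoint rule."""
    h = cutoff / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += dF_da(r) * r * r
    return 4.0 * math.pi * total * h


def optimal_constant(k, d):
    """min over lam > 0 of 3 k lam^(-2) + d lam, attained at lam^3 = 6 k / d."""
    return 1.5 * (6.0 * k * d * d) ** (1.0 / 3.0)


K_cut = k_constant(R)        # positive part only, as in (22)
K_uncut = k_constant(1.0)    # without the positive-part cutoff
print(K_cut, optimal_constant(K_cut, D_BALL), optimal_constant(K_uncut, D_BALL))
```

The integral is also available in closed form, $K = 4\pi(R^2/2 - 2R^3/3 + R^5/5)$, which the quadrature reproduces.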
3. A Lower Bound for $C_2$

We now exhibit a lower bound to $C_2$ [and thus to the best possible $C$ in Eq. (6)] which is greater than $C_1$. We choose a singular $\psi(x, y)$, and take $e_1 = e_2 = 1$. Let $t = |x|$ and $s = |y|$ for $x, y \in \mathbb{R}^3$, and let $e$ and $h$ be unit vectors $e = x/|x|$ and $h = y/|y|$. We define
$$|\psi|^2[(t, e), (s, h)] \equiv f = (15/4\pi^2)\, \delta(1 - t - s)\, \delta(e \cdot h + 1)\, \theta(1 - t)\, \theta(1 - s).$$
We check the following:
$$\int f[(t, e), (s, h)]\, s^2\, ds\, dh = \frac{15}{4\pi^2} \int \delta(1 - t - s)\, s^2\, ds \int_0^{2\pi} \!\! \int_0^{\pi} \delta(\cos\vartheta + 1) \sin\vartheta\, d\vartheta\, d\varphi\; \theta(1 - t) = \frac{15}{2\pi} (1 - t)^2\, \theta(1 - t) \equiv \rho^1(t).$$
We have used spherical coordinates to evaluate the above integral with the north pole in the (fixed) "$e$" direction. Similarly, the $s$ marginal is $\rho^2(s) = \rho^1(s)$.
One checks that $\int \rho^1 = 1$ and hence $f$ is properly normalized. We have that $\rho(t) = 2\rho^1(t)$. Trivially, $I_\psi = 1$ since the particles are always one unit apart. We have that
$$\int \rho(x)^{4/3}\, dx = 4\pi\, (15/\pi)^{4/3} \int_0^1 (1 - t)^{8/3}\, t^2\, dt = 2.084.$$
By Newton's theorem
$$D(\rho, \rho) = (4\pi)^2 \int_0^1 \rho(t)\, t\, dt \int_0^t \rho(s)\, s^2\, ds = 3.572.$$
The result is that
$$C_2 \ge -E_\psi \Big/ \int \rho_\psi(x)^{4/3}\, dx = 1.234.$$
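The three numbers of this section follow from radial integrals of $\rho(t) = (15/\pi)(1-t)^2$ on $0 \le t \le 1$, and can be reproduced numerically. The sketch below is an added check with hypothetical names; it uses Newton's theorem shell by shell (each shell interacts with the charge it encloses as if that charge sat at the origin) and then forms $-E_\psi / \int \rho^{4/3}$ with $I_\psi = 1$ and $e_1 = e_2 = 1$.

```python
import math


def rho(t):
    """Total radial density rho(t) = 2 rho^1(t) = (15/pi) (1 - t)^2 for 0 <= t <= 1."""
    return 15.0 / math.pi * (1.0 - t) ** 2


def radial_integrals(steps=4000):
    """Return (total charge, int rho^(4/3) dx, D(rho, rho)) by midpoint quadrature."""
    h = 1.0 / steps
    charge = 0.0   # running value of 4 pi int_0^t rho(s) s^2 ds
    rho43 = 0.0
    direct = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        shell = 4.0 * math.pi * rho(t) * t * t * h   # charge of the shell at radius t
        rho43 += 4.0 * math.pi * rho(t) ** (4.0 / 3.0) * t * t * h
        # Shell energy against the enclosed charge; half of the shell itself
        # is counted as enclosed at its midpoint.
        direct += shell * (charge + 0.5 * shell) / t
        charge += shell
    return charge, rho43, direct


charge, rho43, direct = radial_integrals()
c2_lower = (direct - 1.0) / rho43   # -E = D - I with I = 1
print(charge, rho43, direct, c2_lower)
```

The direct energy is in fact exactly $25/7$, since the double integral reduces to $3600 \cdot (1/1008)$ for this polynomial density.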
Appendix A: Evaluation of $C_1$

For one particle $I_\psi = 0$. Therefore $C_1$ is the maximum of $H(\rho)$ given in Eq. (28). We prove the following theorem and then compute $C_1$. This was done in Ref. 5. The purpose of this appendix is only to give a rigorous justification of the calculation done in Ref. 5.
Theorem. There exists a symmetric decreasing function $\rho$ which maximizes the functional
$$H(\rho) = \Bigl(\int \rho(x)^{4/3}\, dx\Bigr)^{-1} D(\rho, \rho) \tag{28}$$
over the set $A = \{\rho \mid \rho(x) \ge 0,\ \rho \in L^{4/3}(\mathbb{R}^3),\ \int \rho(x)\, dx = 1\}$.
Proof. For any $\rho \ge 0$ and $\rho \in L^{4/3}(\mathbb{R}^3) \cap L^1(\mathbb{R}^3)$, let $(\rho)_\lambda$ be a scaled version of $\rho$, i.e., $(\rho)_\lambda(x) = \lambda^3 \rho(\lambda x)$. It is simple to check that $\int (\rho)_\lambda(x)\, dx = \int \rho(x)\, dx$, $\int (\rho)_\lambda(x)^{4/3}\, dx = \lambda \int \rho(x)^{4/3}\, dx$, and
$$H[(\rho)_\lambda] = H(\rho). \tag{29}$$
Let $\rho_j \in A$, $j = 1, 2, \ldots$ be such that $\lim_j H(\rho_j) = \sup_{\rho \in A} H(\rho)$. This supremum may a priori be infinite, but we will see that it is finite. By scaling $\rho_j$ and using Eq. (29), we may assume henceforth that $\int \rho_j(x)^{4/3}\, dx = 1$. By the Riesz inequality for symmetric decreasing rearrangements [6],† we have $H(\rho^*) \ge H(\rho)$, where $\rho^*$ is the symmetric decreasing rearrangement of $\rho$. One also has that $\int \rho^*(x)\, dx = \int \rho(x)\, dx$, $\int \rho^*(x)^{4/3}\, dx = \int \rho(x)^{4/3}\, dx$; therefore, by replacing $\rho_j$ by $\rho_j^*$ if necessary, we may also assume that the $\rho_j$ are symmetric decreasing. We now use an idea used in Ref. 7. By the symmetric decreasing property of $\rho_j$, we have [writing $\rho_j(x) = \rho_j(|x|)$]
$$\frac{4\pi}{3} R^3 \rho_j(R) \le \int_{|x| \le R} \rho_j(x)\, dx \le \int \rho_j(x)\, dx = 1.$$
† The 3-dimensional proof can be found in Brascamp, Lieb, and Luttinger, J. Funct. Anal. 17, 227 (1974).
Hence
$$\rho_j(R) \le k_1/R^3, \quad \text{all } j. \tag{30}$$
Similarly,
$$\frac{4\pi}{3} R^3 \rho_j(R)^{4/3} \le \int \rho_j(x)^{4/3}\, dx = 1, \qquad \rho_j(R) \le k_2/R^{9/4}, \quad \text{all } j. \tag{31}$$
We define $f(R) = \min(k_1/R^3, k_2/R^{9/4})$. Since the $\rho_j$ are symmetric decreasing and uniformly bounded by $f$ (which is finite except at 0), by a variant of Helly's theorem [8], some subsequence of the $\rho_j$ (which we continue to denote by $\rho_j$) converges pointwise almost everywhere to some symmetric decreasing $\rho(x)$ and $\rho(x) \le f(x)$. We will see that $\rho(x) \not\equiv 0$. We now show that the $\rho$ we have found satisfies the conditions of the theorem.
By calculation $D(f, f) < \infty$. We therefore apply the dominated convergence theorem to conclude that
$$\lim_j D(\rho_j, \rho_j) = D(\rho, \rho). \tag{32}$$
In particular,
$$0 < \sup_A H(\rho) = \lim_j D(\rho_j, \rho_j). \tag{33}$$
Furthermore, by Fatou's lemma we have that
$$\int \rho(x)^{4/3}\, dx \le \lim \int \rho_j(x)^{4/3}\, dx = 1, \tag{34}$$
$$\int \rho(x)\, dx \le \lim \int \rho_j(x)\, dx = 1. \tag{35}$$
Therefore by Eqs. (32)-(34), $H(\rho) \ge \sup_{\rho \in A} H(\rho)$. By Eq. (35) we can multiply $\rho$ by a scalar $\lambda \ge 1$ so that $\int \lambda \rho(x)\, dx = 1$. By definition of $A$, $\lambda\rho \in A$ and
$$H(\lambda\rho) = \lambda^{2/3} H(\rho) \ge H(\rho) \ge \sup_{\rho \in A} H(\rho). \tag{36}$$
It must be that the inequalities in Eq. (36) are actually equalities and thus that $\lambda = 1$. We therefore have that $\rho$ belongs to $A$ and maximizes $H$ on that set.
We shall now show that the constant $C_1$ can be calculated. By usual variational arguments [7], one knows that the optimizing $\rho$ satisfies the following variational equations:
$$\phi(x) - \tfrac{4}{3} C_1 \rho(x)^{1/3} + \lambda = 0 \quad \text{if } \rho(x) > 0, \tag{37}$$
$$\phi(x) - \tfrac{4}{3} C_1 \rho(x)^{1/3} + \lambda \le 0 \quad \text{if } \rho(x) = 0, \tag{38}$$
where $\phi(x)$ is the potential generated by $\rho$,
$$\phi(x) = \int \rho(y) |x - y|^{-1}\, dy, \tag{39}$$
and where $\lambda$ is a real-valued Lagrange multiplier.
We first note that Eqs. (37) and (38) imply that $\rho$ is compactly supported. If not, then one must have that $\rho(x) > 0$ for all $x$ by the symmetric decreasing property of $\rho$. By letting $|x| \to \infty$ one has that $\lambda = 0$, since $\rho$ and $\phi$ tend to zero in Eq. (37). We then would have by Eq. (37), $\rho(x) = (\text{constant})\, \phi(x)^3$, where the constant is positive. For sufficiently large $|x|$, we see that $\rho(x) \ge (\text{constant}) |x|^{-3}$. This implies that $\rho$ is not integrable, contradicting the fact that $\int \rho(x)\, dx = 1$. Let $r_0$ be the distance at which $\rho$ first vanishes. We now apply the Laplacian to Eq. (39) and use Eq. (37),
$$-\Delta\phi(x) = \begin{cases} 4\pi\, (3/4C_1)^3\, [\phi(x) + \lambda]^3, & |x| \le r_0, \\ 0, & |x| \ge r_0. \end{cases} \tag{40}$$
Let $f(r) = (3\pi/C_1)^{3/2} [\phi(x) + \lambda]/4\pi$. We rewrite Eq. (40) in spherical coordinates
$$\frac{1}{r} \frac{d^2}{dr^2}\, r f(r) = -f(r)^3, \quad r \le r_0. \tag{41}$$
Equations (40) and (41) hold in the distributional sense. We now argue that $f(r)$ is continuously differentiable and that $f'(0) = 0$. This will also imply that Eq. (41) is supplemented by $f(r_0) = 0$. In spherical coordinates, one can write
$$\phi(r) = \begin{cases} 4\pi r^{-1} \int_0^r \rho(s)\, s^2\, ds + 4\pi \int_r^{r_0} \rho(s)\, s\, ds, & r \le r_0, \\ r^{-1}, & r \ge r_0. \end{cases} \tag{42}$$
We apply Hölder's inequality to the first integral and use the fact that $\rho \in L^{4/3}$:
$$\int_0^r \rho(s)\, s^2\, ds \le \Bigl(\int_0^r \rho(s)^{4/3} s^2\, ds\Bigr)^{3/4} \Bigl(\int_0^r s^2\, ds\Bigr)^{1/4} \le (\text{constant})\, r^{3/4}.$$
This and a similar inequality satisfied by the second integral imply that $\phi(r) = O(r^{-1/4})$ near the origin. By Eq. (37) one has that $\rho(r) = O(r^{-3/4})$ near $r = 0$. This in turn implies that $\phi$ is bounded at the origin, and therefore by Eq. (37), $\rho$ is also bounded at the origin. Since $\phi$ is the potential of a bounded, compactly supported charge distribution, it is also continuous, hence $\rho$ is continuous by Eqs. (37) and (38). One now can see that $\phi$ (hence $f$) is $C^1$ by examining Eq. (42). Since $\rho$ is continuous and bounded, the first term of Eq. (42) is of the form $r^{-1} g(r)$, where $g(r)$ is continuously differentiable for $r > 0$, $g(r) = O(r^3)$ and $g'(r) = O(r^2)$ near the origin. Hence the first term is continuously differentiable for $r \ge 0$, and has vanishing derivative at $r = 0$. The preceding statement is true of the second term in
264
Improved Lower Bound on the Indirect Coulomb Energy
INDIRECT COULOMB ENERGY
437
Eq. (42) by inspection. Thus A(r) is continuously differentiable for r>0 and 0'(0) = 0. Equation (41) holds in the strong sense because its right-hand side is C'.
As first noted by Gadre, Bartolotti, and Handy [5], Eq. (41) is the Emden equation of order 3. One may rescale p(x)-. a3p(ax) to ensure that f(0) = 1. The two conditions f (0) = 1 and f'(0) = 0 uniquely determine the solution of the ordinary differential equation (41).
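If $r_0$ is the first zero of the solution, we have that $\rho(r) = 0$ if $r \ge r_0$ and $\rho(r) = (3\pi)^{-3/2}C_1^{3/2}f(r)^3$ if $r \le r_0$. In Ref. 5 it was noted that this equation determines the constant $C_1$. Namely, we have that

$$1 = 4\pi\int_0^{r_0}\rho(r)r^2\,dr = 4\cdot 3^{-3/2}\pi^{-1/2}C_1^{3/2}\int_0^{r_0} r^2 f(r)^3\,dr = -4\cdot 3^{-3/2}\pi^{-1/2}C_1^{3/2}\int_0^{r_0} r\,[r f(r)]''\,dr = -4\cdot 3^{-3/2}\pi^{-1/2}C_1^{3/2}\,r_0^2 f'(r_0). \tag{43}$$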
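As a numerical cross-check of the tabulated Emden data used with Eq. (43), the following sketch integrates Eq. (41), written as $f'' + (2/r)f' = -f^3$ with $f(0) = 1$ and $f'(0) = 0$, locates the first zero $r_0$, and then solves the normalization condition for $C_1$. The function names, the fixed step size, the Runge-Kutta scheme, and the power-series start near $r = 0$ are illustrative assumptions and are not taken from the paper.

```python
import math


def emden_first_zero(h=1e-3):
    """Integrate f'' + (2/r) f' = -f**3 with f(0)=1, f'(0)=0 up to the first zero.

    Returns (r0, f'(r0)). Sketch only: fixed-step RK4 with linear
    interpolation across the sign change of f.
    """
    def rhs(r, f, g):
        # g stands for f'; the second component returned is f''.
        return g, -f ** 3 - 2.0 * g / r

    # Start slightly away from the singular point r = 0 with the
    # power series f = 1 - r^2/6 + r^4/40 - ...
    r = h
    f = 1.0 - r ** 2 / 6.0 + r ** 4 / 40.0
    g = -r / 3.0 + r ** 3 / 10.0
    while True:
        k1f, k1g = rhs(r, f, g)
        k2f, k2g = rhs(r + h / 2, f + h / 2 * k1f, g + h / 2 * k1g)
        k3f, k3g = rhs(r + h / 2, f + h / 2 * k2f, g + h / 2 * k2g)
        k4f, k4g = rhs(r + h, f + h * k3f, g + h * k3g)
        f_new = f + h / 6 * (k1f + 2 * k2f + 2 * k3f + k4f)
        g_new = g + h / 6 * (k1g + 2 * k2g + 2 * k3g + k4g)
        if f_new <= 0.0:
            t = f / (f - f_new)  # fraction of the step at which f vanishes
            return r + t * h, g + t * (g_new - g)
        r, f, g = r + h, f_new, g_new


def c1_from_emden(r0, fprime_r0):
    # Normalization: 1 = -4 * 3**(-3/2) * pi**(-1/2) * C1**(3/2) * r0**2 * f'(r0)
    c1_power = 3.0 ** 1.5 * math.sqrt(math.pi) / (4.0 * r0 ** 2 * (-fprime_r0))
    return c1_power ** (2.0 / 3.0)


r0, fprime_r0 = emden_first_zero()
c1 = c1_from_emden(r0, fprime_r0)
print(f"r0 = {r0:.5f}, f'(r0) = {fprime_r0:.5f}, C1 = {c1:.3f}")
```

The printed values can be compared with the tabulated zero and slope of the Emden function and with the resulting constant $C_1$ quoted in this appendix.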
Emden functions are tabulated [9]. We find that $r_0 = 6.89684$, $f'(r_0) = -0.04243$. Equation (43) then gives $C_1 = 1.092$.

Appendix B: Monotonicity of $C_N$
We show that $C_N \le C_{N+1}$, where $C_N$ is defined in Section 1 as the best constant in Eq. (6) for an N-particle state. We consider the case $e_i = e$. Let $\varepsilon > 0$ be arbitrary but fixed. We let $f_N(x_1, \dots, x_N)$ be an N-particle density which vanishes for $|x_i| > L$ for $1 \le i \le N$, where $L$ is some finite number, and furthermore, let $f_N$ have the property that

$$-\Bigl(e^{2/3}\int \rho_{f_N}(x)^{4/3}\,dx\Bigr)^{-1} E_{f_N} \ge C_N - \varepsilon. \tag{44}$$
A simple approximation argument using dominated convergence shows that $L$ and $f_N$ can be found satisfying Eq. (44). Let $x_0 \in \mathbb{R}^3$ be chosen such that $|x_0| > L + 2R$, where $R$ will be determined later. We define a one-particle density $f_1(x) = (\tfrac{4}{3}\pi R^3)^{-1}\theta(R - |x - x_0|)$ and we also define the $(N+1)$-particle density $f_{N+1}(x_1, \dots, x_{N+1}) = f_N(x_1, \dots, x_N)f_1(x_{N+1})$. One sees that $\rho_{f_{N+1}}(x) = \rho_{f_N}(x) + e f_1(x)$. Since $\rho_{f_N}$ and $f_1$ are never simultaneously nonzero, we have

$$e^{2/3}\int \rho_{f_{N+1}}(x)^{4/3}\,dx = e^{2/3}\Bigl(\int \rho_{f_N}(x)^{4/3}\,dx + e^{4/3}\int f_1(x)^{4/3}\,dx\Bigr) = e^{2/3}\int \rho_{f_N}(x)^{4/3}\,dx + \frac{e^2(3/4\pi)^{1/3}}{R}. \tag{45}$$
With S. Oxford in Int. J. Quant. Chem. 19, 427-439 (1981)
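We also have that

$$I_{f_{N+1}} = I_{f_N} + e^2\sum_{i=1}^N\int f_N(x_1, \dots, x_N)f_1(x_{N+1})\,|x_i - x_{N+1}|^{-1}\,dx_1\cdots dx_{N+1} \le I_{f_N} + e^2N/R, \tag{46}$$

by the definition of $f_1$. The evident inequality $D_{f_{N+1}} \ge D_{f_N}$ together with Eq. (46) implies that $E_{f_{N+1}} \le E_{f_N} + e^2N/R$. This and Eq. (45) imply that

$$C_{N+1} \ge -\Bigl(e^{2/3}\int \rho_{f_{N+1}}(x)^{4/3}\,dx\Bigr)^{-1}E_{f_{N+1}} \ge -\Bigl(e^{2/3}\int \rho_{f_N}(x)^{4/3}\,dx + \frac{e^2(3/4\pi)^{1/3}}{R}\Bigr)^{-1}\Bigl(E_{f_N} + \frac{e^2N}{R}\Bigr). \tag{47}$$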
We now choose $R$ so large that the right-most term in Eq. (47) is greater than or equal to

$$-\Bigl(e^{2/3}\int \rho_{f_N}(x)^{4/3}\,dx\Bigr)^{-1}E_{f_N} - \varepsilon.$$

Recalling Eq. (44), we have the result that $C_{N+1} \ge C_N - 2\varepsilon$. Since $\varepsilon$ was arbitrary, $C_{N+1} \ge C_N$.
In the case of distinct ei's, one may define (for some fixed a, 0 < a s 1) 4/3
N
CN(el,
Sup
, eN) =
X
-LJ( e, /2ap/N(x)1
(J PIN (x)413 dx)
dx] 11
EIN.
A similar argument shows that these constants also increase, i.e., $C_N(e_1, \dots, e_N) \le C_{N+1}(e_1, \dots, e_N, e_{N+1})$, where $e_{N+1} \ge 0$ is arbitrary. Of course Section 2 shows that $C_N(e_1, \dots, e_N) < 1.68$ for all $N$ and $e_i$ when $a = 1/2$.

Note added in proof:
In the text we proved the inequalities, Eqs. (6) and (7), when $\psi$ is a wave function (pure state), and remarked that the inequalities also hold for a density matrix. To prove this, note that any density matrix, $\mu$, can be written as $\mu = \sum_\beta \lambda_\beta\,|\psi_\beta\rangle\langle\psi_\beta|$. In the definition, Eq. (4), simply regard $\beta$ as just one more quantum number to sum over, on the same footing as the $\sigma$'s. The rest of the proof is then the same as in the pure state case.

Bibliography

[1] E. H. Lieb, Phys. Lett. 70A, 444 (1979).
[2] E. H. Lieb, Rev. Mod. Phys. 48, 553 (1976).
[3] E. H. Lieb and W. E. Thirring, Phys. Rev. Lett. 35, 687 (1975); 35, 1116 (1975) (errata).
[4] P. A. M. Dirac, Proc. Cambridge Philos. Soc. 26, 376 (1930).
[5] S. R. Gadre, L. J. Bartolotti, and N. C. Handy, J. Chem. Phys. 72, 1034 (1980).
[6] F. Riesz, J. London Math. Soc. 5, 162 (1930).
[7] E. H. Lieb, Stud. Appl. Math. 57, 93 (1977).
[8] W. Feller, An Introduction to Probability Theory and its Applications (Wiley, New York, 1966), Vol. 2, p. 261.
[9] British Association for the Advancement of Science Mathematical Tables (Office of the British Assoc., Burlington House, London, 1932), Vol. 2.
Received June 10, 1980. Accepted for publication September 18, 1980
Int. J. Quant. Chem. 24, 243-277 (1983)
INTERNATIONAL JOURNAL OF QUANTUM CHEMISTRY, VOL. XXIV, 243-277 (1983)

Density Functionals for Coulomb Systems (a revised version of no. 144)

ELLIOTT H. LIEB
Departments of Mathematics and Physics, Princeton University, P.O.B. 708, Princeton, New Jersey 08544, U.S.A.
Abstract
This paper has three aims: (i) To discuss some of the mathematical connections between N-particle wave functions $\psi$ and their single-particle densities $\rho(x)$. (ii) To establish some of the mathematical underpinnings of "universal density functional" theory for the ground state energy as begun by Hohenberg and Kohn. We show that the HK functional is not defined for all $\rho$ and we present several ways around this difficulty. Several less obvious problems remain, however. (iii) Since the functional mentioned above is not computable, we review examples of explicit functionals that have the virtue of yielding rigorous bounds to the energy.
Introduction
It is a pleasure to dedicate this article to Laszlo Tisza on the occasion of his seventy-fifth birthday. As a colleague at MIT he was a source of inspiration and encouragement, especially in drawing our attention to the importance of careful and precise thought in mathematical physics. The subject, if not the content, of this article may therefore not be inappropriate in a book dedicated to Professor Tisza (see the Acknowledgment).
The idea of trying to represent the ground state (and perhaps some of the excited states as well) of atomic, molecular, and solid state systems in terms of the diagonal part of the one-body reduced density matrix p(x) is an old one. It goes back at least to the work of Thomas [1] and Fermi [2] in 1927. In 1964 the idea was conceptually extended by Hohenberg and Kohn (HK) [3]. Since then many variations on the theme have been introduced. As the present article is not meant to be a review, I shall not attempt to list the papers in the field. Some recent examples of applications are Refs. 4 and 5. Some recent examples of theoretical papers which will play a role here are Refs. 6-12. A bibliography can be found in the recent review article of Bamzai and Deb [13]. This article has three aims:
(i) To discuss and prove some of the mathematical relations between N-particle functions $\psi$ and their corresponding single-particle densities $\rho$.

(ii) To discuss the mathematical underpinnings of general density functional theory along the lines initiated by HK. In that theory a universal energy functional $F(\rho)$ is introduced. Despite the hopes of HK, $F(\rho)$ is not defined for all $\rho$ because it is not true (see Theorem 3.4) that every $\rho$ (even a "nice" $\rho$) comes from the ground state of some single-particle potential $v(x)$. This problem can be remedied by replacing the HK functional by the Legendre transform of the energy, as is done here. However, the new theory is also not free of difficulties, and these can be traced to the fact that the connection between $v$ and $\rho$ is extremely complicated and poorly understood.

(iii) To present briefly another approach to the ground state energy problem by means of functionals that, while not exact, are explicitly computable and yield upper and lower bounds to the energy.

© 1983 John Wiley & Sons, Inc. CCC 0020-7608/83/090243-35$04.50
The analysis in this paper gives rise to many interesting open problems. It is my hope that the incompleteness of the results presented here will be partly compensated if others are encouraged to pursue some of the questions raised by them. It is not my intention to present a brief for HK theory. However, it deserves to be analyzed for at least two reasons: The HK theory is used by many workers and it gives rise to some deep problems in analysis. While it is my opinion that density functionals are a useful way to approach Coulomb systems, there are other approaches besides the HK approach [e.g., see (iii) above]. Apart from the difficulties mentioned above, the HK approach may be too general because all potentials have to be considered. Coulomb potentials are special and do lend themselves to a density functional approach; for example, Thomas-Fermi theory is asymptotically exact as $Z \to \infty$ (see Sect. 5E and Ref. 14). In addition to this question of generality there is also the crucial point that the "universal functional" is very complicated and essentially uncomputable. If one is going to make uncontrolled approximations for this functional, then the general theory is not very helpful.

It is a pleasure to thank Barry Simon for some very helpful conversations and the proofs of Theorems 4.4 and 4.8. I also thank Haim Brezis for the proof of Theorem 1.3.

1. Single-Particle Densities
The first order of business is to describe the single-particle densities of interest. For simplicity we confine our attention to three dimensions whenever dimensionality is important. $z = (x, \sigma)$ will denote a space-spin variable, that is, $x \in \mathbb{R}^3$ and $\sigma \in \{1, \dots, q\}$. $q = 2$ for electrons, of course, but one might wish to consider $q = 1$, which would mean that a ferromagnetic state is under consideration. We use the notation

$$\int dz = \sum_{\sigma=1}^{q}\int dx.$$

Let $\psi = \psi(z_1, \dots, z_N)$ be an N-particle function (which may be complex valued). To simplify notation we will not indicate $N$ explicitly except where needed. However, the condition of fixed $N$ is crucial and frequently glossed over. The density functionals that will be introduced later are explicitly $N$ dependent in a highly nontrivial way (see Sect. 4A). $\psi$ is assumed to be normalized:

$$\int |\psi|^2 = 1, \tag{1.2}$$

(with $\int = \int dz_1 \cdots dz_N$) and to have finite kinetic energy, that is,

$$T(\psi) = \sum_{i=1}^{N}\int |\nabla_i\psi|^2 < \infty. \tag{1.3}$$
Notes. $f \in L^p$ means that $f$ is a function satisfying $\|f\|_p = \{\int |f|^p\}^{1/p} < \infty$. In terms of the Fourier transform,

$$T(f) = (2\pi)^{-3}\int k^2|\hat f(k)|^2\,dk, \tag{1.4}$$

where $\hat f$ is the Fourier transform of $f$. Since $f \in L^2$, $\hat f$ exists and $\hat f \in L^2$. $H^1$ is a Hilbert space with inner product $(f, g) = \int f^*g + \int \nabla f^*\cdot\nabla g$.

In most of the following it will be assumed that $\psi$ satisfies the Pauli principle, that is, $\psi$ is antisymmetric. However, some of the theorems are easier for symmetric (i.e., bosonic) $\psi$ with $q = 1$, and occasionally this will be mentioned explicitly. In either case, the symmetry implies that

$$T(\psi) = N\int |\nabla_1\psi|^2. \tag{1.5}$$

We define the single-particle density to be [see Eq. (A.1)]

$$\rho(x) = N\sum_{\sigma_1, \dots, \sigma_N}\int |\psi((x, \sigma_1), (x_2, \sigma_2), \dots, (x_N, \sigma_N))|^2\,dx_2\cdots dx_N. \tag{1.6}$$

Notice that $\int \rho(x)\,dx = N$, not 1.

Determinants. If $\phi_1(z), \dots, \phi_N(z)$ are orthonormal functions, we can form the determinantal

$$\psi(z_1, \dots, z_N) = (N!)^{-1/2}\det\{\phi_i(z_j)\}, \tag{1.7}$$

which is normalized. Then

$$\rho(x) = \sum_{i=1}^{N}\sum_{\sigma=1}^{q}|\phi_i(x, \sigma)|^2, \tag{1.8}$$

$$T(\psi) = \sum_{i=1}^{N}\sum_{\sigma=1}^{q}\int |\nabla\phi_i(x, \sigma)|^2\,dx. \tag{1.9}$$

Returning to the general case, the finiteness of $T(\psi)$ implies the following [15].
Theorem 1.1. $\rho(x)^{1/2} \in L^2(\mathbb{R}^3)$ and $\nabla\rho(x)^{1/2} \in L^2(\mathbb{R}^3)$, that is, $\rho(x)^{1/2} \in H^1(\mathbb{R}^3)$. Moreover, $\int (\nabla\rho^{1/2})^2 \le T(\psi)$.

Proof. $\rho^{1/2} \in L^2(\mathbb{R}^3)$ because $\int\rho = N$. Now $\nabla\rho(x) = N\int' [(\nabla_1\psi)^*\psi + \psi^*\nabla_1\psi]$, where $\int'$ means the integral in (1.6). By the Schwarz inequality,

$$[\nabla\rho(x)]^2 \le 4N\rho(x)\int' |\nabla_1\psi|^2.$$

Thus

$$\int (\nabla\rho^{1/2})^2\,dx = \tfrac{1}{4}\int (\nabla\rho)^2\rho^{-1}\,dx \le T(\psi).$$

We know $\rho^{1/2} \in H^1(\mathbb{R}^3) = \{f \mid f \in L^2, \nabla f \in L^2\}$. (Here we use the standard convention that $\{A \mid C\}$ means the set of $A$ such that condition $C$ holds.) To discuss the converse of Theorem 1.1 some definitions are useful.

Definition. $\mathcal{I}_N = \{\rho \mid \rho(x) \ge 0,\ \rho^{1/2} \in H^1(\mathbb{R}^3),\ \int\rho(x)\,dx = N\}$.

Definition. $\mathcal{R}_N = \{\rho \mid \rho(x) \ge 0,\ \int\rho(x)\,dx = N,\ \rho \in L^3(\mathbb{R}^3)\}$.

$\mathcal{R}_N$ contains $\mathcal{I}_N$ by the Sobolev inequality (see Ref. 16) because if $f \in H^1(\mathbb{R}^3)$, then

$$\int |\nabla f(x)|^2\,dx \ge 3(\pi/2)^{4/3}\Bigl[\int |f(x)|^6\,dx\Bigr]^{1/3}. \tag{1.10}$$

Equation (1.10) is true only in three dimensions, but analogous inequalities hold in other dimensions. By Theorem 1.1, $T(\psi) \ge 3(\pi/2)^{4/3}\|\rho\|_3$.
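The constant in the Sobolev inequality (1.10) can be sanity-checked numerically for radial functions. The sketch below evaluates both sides for a Gaussian trial function and for $f(x) = (1 + |x|^2)^{-1/2}$, for which the two sides agree. The helper names, the midpoint quadrature, and the choice of trial functions are assumptions made for illustration only.

```python
import math

SOBOLEV_CONST = 3.0 * (math.pi / 2.0) ** (4.0 / 3.0)  # constant in (1.10)


def midpoint(func, a, b, n=20000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def sobolev_sides_radial(f, fprime, rmax):
    """Return (left side, right side) of (1.10) for a radial f on |x| < rmax."""
    grad_sq = 4.0 * math.pi * midpoint(lambda r: (fprime(r) * r) ** 2, 0.0, rmax)
    sixth = 4.0 * math.pi * midpoint(lambda r: f(r) ** 6 * r ** 2, 0.0, rmax)
    return grad_sq, SOBOLEV_CONST * sixth ** (1.0 / 3.0)


# Gaussian trial function f(r) = exp(-r^2): strict inequality is expected.
lhs_gauss, rhs_gauss = sobolev_sides_radial(
    lambda r: math.exp(-r * r),
    lambda r: -2.0 * r * math.exp(-r * r),
    rmax=8.0,
)

# f(r) = (1 + r^2)^(-1/2): with r = tan(theta) the radial integrals become
# integrals of sin^4 and sin^2 cos^2 over [0, pi/2].
lhs_opt = 4.0 * math.pi * midpoint(lambda t: math.sin(t) ** 4, 0.0, math.pi / 2)
sixth_opt = 4.0 * math.pi * midpoint(
    lambda t: (math.sin(t) * math.cos(t)) ** 2, 0.0, math.pi / 2
)
rhs_opt = SOBOLEV_CONST * sixth_opt ** (1.0 / 3.0)

print(lhs_gauss, rhs_gauss)
print(lhs_opt / rhs_opt)
```

For the second function both sides equal $3\pi^2/4$ analytically, so the printed ratio indicates how sharp the constant is.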
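$\mathcal{R}_N$ is clearly a convex set; that is, if $\rho_1$ and $\rho_2 \in \mathcal{R}_N$, then $\rho \equiv \lambda\rho_1 + (1 - \lambda)\rho_2 \in \mathcal{R}_N$ for all $0 \le \lambda \le 1$. $\mathcal{I}_N$ is also convex by the same proof as in Theorem 1.1; that is, by the Schwarz inequality

$$(\nabla\rho)^2 \le 4\rho\,[\lambda(\nabla\rho_1^{1/2})^2 + (1 - \lambda)(\nabla\rho_2^{1/2})^2].$$

In particular, the functional $\int [\nabla\rho^{1/2}]^2$ is convex. The convexity of $\mathcal{I}_N$ will be important in Sect. 3.

Definition. A function (or functional) $f$ is convex if $f(\lambda x + (1 - \lambda)y) \le \lambda f(x) + (1 - \lambda)f(y)$ for all $x$, $y$ and $0 \le \lambda \le 1$.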
Theorem 1.2. Suppose $\rho \in \mathcal{I}_N$. Then for either Bose or Fermi statistics there exists a $\psi$ (which is a determinant in the fermion case) such that (1.6) holds and, moreover,

$$T(\psi) = \int [\nabla\rho^{1/2}(x)]^2\,dx \qquad \text{(bosons)}, \tag{1.11}$$

$$T(\psi) \le (4\pi)^2N^2\int [\nabla\rho^{1/2}(x)]^2\,dx \qquad \text{(fermions)}. \tag{1.12}$$
Proof. For bosons the proof is easy; simply take

$$\psi(x_1, \dots, x_N) = \prod_{i=1}^{N}\bigl(\rho(x_i)/N\bigr)^{1/2}.$$

For fermions the construction is much more complicated. Some ideas from Ref. 17 will be used in the following. Write $x = (x^1, x^2, x^3)$ and define

$$f(x^1) = \Bigl(\frac{2\pi}{N}\Bigr)\int_{-\infty}^{x^1} ds\int_{-\infty}^{\infty} dt\int_{-\infty}^{\infty} du\;\rho(s, t, u).$$

Then $f$ is monotone increasing from 0 to $2\pi$. For $k = 0, \dots, N - 1$ define

$$\phi^k(x) = [\rho(x)/N]^{1/2}\exp[ikf(x^1)].$$

It is easy to check that the $\phi^k$ are orthonormal functions in $L^2(\mathbb{R}^3)$. (First do the $x^2$ and $x^3$ integrations and then note that the overlap integral is of the form $\int_{-\infty}^{\infty}(df/dx^1)\exp[i(\Delta k)f(x^1)]\,dx^1 = \{\exp[i(\Delta k)f(\infty)] - \exp[i(\Delta k)f(-\infty)]\}/i(\Delta k) = 0$.) Furthermore,

$$N\int |\nabla\phi^k|^2 = \int (\nabla\rho^{1/2})^2 + \Bigl(\frac{2\pi k}{N}\Bigr)^2\int g(s)^6\,ds, \tag{1.13}$$

with

$$g(s)^2 = \iint dt\,du\;\rho(s, t, u).$$

As in Theorem 1.1, we conclude that $g \in H^1(\mathbb{R}^1)$ and

$$\int \Bigl(\frac{dg}{ds}\Bigr)^2 ds \le \int (\nabla\rho^{1/2})^2 \equiv A.$$

Since

$$g(s)^2 = 2\int_{-\infty}^{s} g(y)\,\frac{dg}{dy}(y)\,dy,$$

we conclude by the Schwarz inequality that $g(s)^4 \le 4[\int g^2][\int (dg/dy)^2]$. Thus, the last term in (1.13) is less than $4(2\pi k/N)^2N^2A$. Finally, we take $\psi$ to be a determinant as in (1.7) using the functions $\phi^k(x) \times$ (spin up). Equation (1.12) follows by summing on $k$.

Theorem 1.2 is closely related to the results of Gilbert [8] and Harriman [9]. For fermions, the extra factor $N^2$ in (1.12) is noticeably different from the factor $N^0$ in Theorem 1.1. Although (1.12) can be improved, it is not easy to do so. In any case, the conclusion is that the map from $\psi$ to $\rho^{1/2}$ given by (1.6) is a map from $H^1(\mathbb{R}^{3N})$ onto $H^1(\mathbb{R}^3)$. But the map is clearly not 1 : 1; different $\psi$'s can give the same $\rho$.
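The orthonormality of the phase-modulated orbitals used in the fermion construction can be checked numerically. The sketch below works in a one-dimensional analogue (an assumption made only to keep the example short): the density is sampled on a grid, the phase function $f$ is built as $2\pi$ times the cumulative fraction of the density, and the overlap matrix of $\phi^k = (\rho/N)^{1/2}e^{ikf}$ is computed by a midpoint sum. The function name, the grid, and the test density are hypothetical.

```python
import cmath
import math


def orbital_overlaps(rho_vals, dx, n_particles):
    """Overlap matrix of phi^k = sqrt(rho/N) * exp(i k f), k = 0..N-1, on a grid.

    rho_vals holds the density at the cell midpoints; f is evaluated at the
    same midpoints as 2*pi times the cumulative fraction of the density.
    """
    total = sum(rho_vals) * dx
    # Each weight equals (rho/N) dx after numerical normalisation, so that
    # f runs from 0 to 2*pi across the grid.
    weights = [value * dx / total for value in rho_vals]
    phases = []
    running = 0.0
    for w in weights:
        phases.append(2.0 * math.pi * (running + 0.5 * w))
        running += w
    return [
        [
            sum(w * cmath.exp(1j * (k - j) * f) for w, f in zip(weights, phases))
            for k in range(n_particles)
        ]
        for j in range(n_particles)
    ]


# Hypothetical two-bump density on [-10, 10]; its overall scale is irrelevant.
n_cells = 4000
dx = 20.0 / n_cells
grid = [-10.0 + (i + 0.5) * dx for i in range(n_cells)]
rho = [math.exp(-(x - 1.5) ** 2) + 0.5 * math.exp(-2.0 * (x + 2.0) ** 2) for x in grid]

overlaps = orbital_overlaps(rho, dx, n_particles=3)
max_offdiag = max(abs(overlaps[j][k]) for j in range(3) for k in range(3) if j != k)
print(max_offdiag)
```

Every orbital has modulus squared $\rho/N$, so a determinant built from them reproduces $\rho$; the price, as (1.13) shows, is the extra kinetic energy carried by the phases.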
Question 1. Is this map continuous as a map from $H^1(\mathbb{R}^{3N})$ to $H^1(\mathbb{R}^3)$? That is, if $\psi$ is fixed and $\psi_j$ is a sequence (with corresponding $\rho$ and $\rho_j$) such that $\int |\psi - \psi_j|^2 \to 0$ and $\int |\nabla\psi - \nabla\psi_j|^2 \to 0$, does it follow that $\int |\rho^{1/2} - \rho_j^{1/2}|^2 \to 0$ and $\int |\nabla\rho^{1/2} - \nabla\rho_j^{1/2}|^2 \to 0$?

Question 2. Although the map is not invertible (since it is not 1 : 1), we can ask the following: Given a sequence $\rho_j^{1/2}$ that converges to $\rho^{1/2}$ in the above $H^1(\mathbb{R}^3)$ sense, and given some $\psi$ satisfying (1.6) for $\rho$, does there exist a sequence $\psi_j$ [related to $\rho_j$ by (1.6)] that converges to $\psi$ in the above $H^1(\mathbb{R}^{3N})$ sense? [This is equivalent to the statement that the map $\psi \mapsto \rho^{1/2}$ is "open," that is, the map takes open sets in $H^1(\mathbb{R}^{3N})$ into open sets in $H^1(\mathbb{R}^3)$.]

Intuitively, the answer to both questions should be affirmative. The continuity can indeed be proved, but the proof is not entirely elementary. A proof of Theorem 1.3, due to H. Brezis, is given in the appendix.

Theorem 1.3. The map $\psi \mapsto \rho^{1/2}$ given by (1.6) is continuous as a map from $H^1(\mathbb{R}^{3N})$ to $H^1(\mathbb{R}^3)$.

I cannot offer any proof of the openness of the map, however. The fact that these questions do not have simple answers should serve as a warning that the connection between $\psi$ and $\rho$ is not as obvious as one might intuitively think.

2. Single-Particle Density Matrices
If $\psi$ is given as before, we can define the single-particle density matrix

$$\gamma(x, x') = N\sum_{\sigma_1, \dots, \sigma_N}\int \psi((x, \sigma_1), \dots, (x_N, \sigma_N))\,\psi((x', \sigma_1), \dots, (x_N, \sigma_N))^*\,dx_2\cdots dx_N. \tag{2.1}$$

This definition is different from the usual one because we sum on $\sigma_1$ in (2.1). Usually one defines the quantity $\tilde\gamma(x, \sigma; x', \sigma')$, so our $\gamma(x, x') = \sum_\sigma \tilde\gamma(x, \sigma; x', \sigma)$. Clearly, $\rho(x) = \gamma(x, x)$.

Theorem 2.1. $\gamma$ satisfies

(i) $\operatorname{Tr}\gamma = \int \gamma(x, x)\,dx = N$.

(ii) As an operator, $0 \le \gamma \le qI$ for fermions; that is, $0 \le (f, \gamma f) \le q(f, f)$. For bosons, $0 \le \gamma \le NI$.

Proof. (i) is "obvious" but not trivial. The point is that if an operator $K$ is given, then its kernel $K(x, y)$ is defined only almost everywhere. In particular, $K(x, x)$ can be anything. Thus, $\operatorname{Tr}K$ need not be $\int K(x, x)\,dx$. However, (i) can be proved from (2.1). This is left as an exercise. To prove (ii) let $M(x, x') = f(x)f(x')^*$ be a one-particle operator with $(f, f) = 1$. Then $A = \sum_{i=1}^{N} M(x_i, x_i')$ has as its largest eigenvalue on the antisymmetric space the value $q$. Moreover, $A$ is clearly positive semidefinite. Thus, $0 \le (f, \gamma f) = \operatorname{Tr}\gamma M = (\psi, A\psi) \le q$.

Definition. Let $\gamma(x, y)$ be any kernel. $\gamma$ is said to be admissible if $\operatorname{Tr}\gamma = N$ and $0 \le \gamma \le qI$ (fermions) or $0 \le \gamma \le NI$ (bosons). The set of admissible $\gamma$ is clearly convex; that is, if $\gamma$ and $\delta$ are admissible, then so is $a\gamma + (1 - a)\delta$ for $0 \le a \le 1$.

Now we come to a subtle point. If $\gamma$ is an admissible operator, we can ask two questions:

Question 3. Does an N-particle density matrix $\Gamma$ always exist, where $\Gamma = \Gamma(z_1, \dots, z_N; z_1', \dots, z_N')$, so that $\gamma$ is given by (2.1) with $\psi\psi^*$ replaced by $\Gamma$? ($\Gamma$ is a density matrix if $0 \le \Gamma$ and $\operatorname{Tr}\Gamma = 1$. $\Gamma$ must also satisfy the appropriate symmetry.)

Question 4. Does a $\psi$ always exist so that (2.1) holds; that is, can $\Gamma$ be chosen to be a pure state, namely, $\Gamma = |\psi\rangle\langle\psi|$?
The answer to question 4 is No! (for fermions). For bosons, the answer is Yes. The affirmative answer to question 3 (which we now call Theorem 2.2) has been known for a long time. An explicit construction is given in Ref. 26. An example in which $\Gamma$ fails to be of the form $|\psi\rangle\langle\psi|$, for $N = 2$ and $q = 1$, is the case in which $\gamma$ has three nonzero eigenvalues $1, \tfrac12, \tfrac12$. To see this, let the normalized eigenvectors of $\gamma$ be $f(x)$, $g(x)$, and $h(x)$, respectively; that is,

$$\gamma(x, x') = f(x)f(x')^* + \tfrac12 g(x)g(x')^* + \tfrac12 h(x)h(x')^*.$$

Let $A = -\gamma(x_1, x_1') - \gamma(x_2, x_2')$ be an operator on the antisymmetric states. Its lowest eigenvalue is $-1 - 1/2 = -3/2$, which is doubly degenerate. If $\Gamma = |\psi\rangle\langle\psi|$, then $\psi$ must be a ground state since $\operatorname{Tr}\Gamma A = -\operatorname{Tr}\gamma^2 = -1 - 1/4 - 1/4 = -3/2$. But every ground state is of the form $\psi = 2^{-1/2}\det(f, p)$, where $p = ag + bh$, $|a|^2 + |b|^2 = 1$. But then $\gamma = |f\rangle\langle f| + |p\rangle\langle p|$, and this is never of the form $|f\rangle\langle f| + \tfrac12|g\rangle\langle g| + \tfrac12|h\rangle\langle h|$.
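A complementary way to see the obstruction, different from the ground-state argument used in the text, is linear-algebraic. Assume a two-fermion pure state is expanded in three orthonormal orbitals, $\psi = \sum_{ij} C_{ij}e_i(x_1)e_j(x_2)$ with $C$ antisymmetric and $\sum_{ij}|C_{ij}|^2 = 1$. Its one-body matrix in that basis is $2CC^\dagger$, and an antisymmetric $3 \times 3$ matrix always has vanishing determinant, so $\gamma$ cannot have three nonzero eigenvalues. The sketch below checks this for a randomly chosen $C$; all names are hypothetical.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def random_antisymmetric(rng):
    """Random complex antisymmetric 3x3 matrix with sum |C_ij|^2 = 1."""
    a, b, c = (complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3))
    norm = (2.0 * (abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2)) ** 0.5
    a, b, c = a / norm, b / norm, c / norm
    return [[0, a, b], [-a, 0, c], [-b, -c, 0]]


def one_body_matrix(c):
    """gamma = 2 C C^dagger for the two-fermion state with coefficients C."""
    return [
        [2.0 * sum(c[i][j] * c[k][j].conjugate() for j in range(3)) for k in range(3)]
        for i in range(3)
    ]


rng = random.Random(0)
coeffs = random_antisymmetric(rng)
gamma = one_body_matrix(coeffs)
trace_gamma = sum(gamma[i][i] for i in range(3))
det_gamma = det3(gamma)
print(abs(trace_gamma), abs(det_gamma))
```

The trace equals $N = 2$ while $\det\gamma = 8|\det C|^2$ vanishes, consistent with the conclusion above that the spectrum $1, \tfrac12, \tfrac12$ is not attainable by a pure state.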
The moral of all this is the following: On the one-particle level we can study density matrices $\gamma(x, x')$ or densities, $\rho(x) = \gamma(x, x)$. The former do not always come from pure states $|\psi\rangle\langle\psi|$. The latter do, as Theorem 1.2 shows. While $\gamma$ is more complicated than $\rho$ (it has two variables), it has the distinct advantage that the map $\Gamma \mapsto \gamma$ is linear! The map $\psi \mapsto \rho$ is nonlinear, and this, as will be seen, is the source of some difficulty.

The relation among $\psi$, $\Gamma$, $\gamma$, and $\rho$ can be summarized by the following diagram:

$$\psi \mapsto \Gamma \mapsto \gamma \mapsto \rho, \tag{2.2}$$

by which we mean (i) the map $\psi \mapsto \Gamma = |\psi\rangle\langle\psi|$, (ii) $\Gamma \mapsto \gamma$ by (2.1) with $\psi\psi^*$ replaced by $\Gamma$, (iii) $\gamma \mapsto \gamma(x, x) = \rho(x)$. (ii) and (iii) are linear while (i) is nonlinear.

Notation. We shall use the symbol $\psi \mapsto \rho$ (or any other combination such as $\gamma \mapsto \rho$) to indicate that $\psi$ and $\rho$ are related by the above maps.

Technical remarks. Since $\gamma$ is self-adjoint and trace class, it can always be written in the form

$$\gamma(x, x') = \sum_{j=1}^{\infty}\lambda_j f_j(x)f_j(x')^*, \tag{2.3}$$

where the $f_j$ are orthonormal and $0 \le \lambda_j$, with

$$N = \operatorname{Tr}\gamma = \sum_{j=1}^{\infty}\lambda_j. \tag{2.4}$$
3. General Density Functional Theory

The problem that will concern us is calculating the ground state energy for $N$ electrons interacting with each other via a repulsive Coulomb potential $|x_i - x_j|^{-1}$ and also interacting with a single-particle potential $v(x)$. If $v = 0$, the Hamiltonian is

$$H_0 = K + \sum_{1 \le i < j \le N}|x_i - x_j|^{-1}, \tag{3.1}$$

where $K$ is the kinetic energy operator

$$K = -\sum_{i=1}^{N}\Delta_i \tag{3.2}$$

in units in which $\hbar^2/2m = 1$. Also of interest is the case where $H_0 = K$ alone (see Sec. 4C). Recall that $N$ is fixed and will not be mentioned unless necessary.
Also, to simplify matters we shall confine our attention in the following to fermions. However, many of the following results have obvious analogs for bosons. The total Hamiltonian is

$$H_v = H_0 + V, \tag{3.3}$$

where

$$V = \sum_{i=1}^{N} v(x_i). \tag{3.4}$$

The ground state energy $E(v)$ is defined to be

$$E(v) = \inf\{(\psi, H_v\psi) \mid \psi \in \mathcal{W}_N\}, \tag{3.5}$$

where

$$\mathcal{W}_N = \{\psi \mid \|\psi\|_2 = 1,\ T(\psi) < \infty\}. \tag{3.6}$$
Technical remark. Something should be said about the meaning of $(\psi, H_v\psi)$ and about the class of $v$'s under consideration. We shall always interpret $(\psi, H_v\psi)$ in the sense of a quadratic form; in particular, this means that $(\psi, K\psi) \equiv T(\psi)$. It is not assumed that $\Delta\psi \in L^2$. Since $\psi \in H^1$, it is easy to prove that $(\psi, |x_i - x_j|^{-1}\psi)$ is finite for all $i \ne j$. The part containing $v$ is $\int \rho(x)v(x)\,dx$. As $\rho \in L^1$ and $\rho \in L^3$ (since $\nabla\rho^{1/2} \in L^2$), $\rho \in L^p$ for all $1 \le p \le 3$. The integral is then well defined if $v \in L^{3/2} + L^\infty$. This means that we consider $v$'s that can be written as $v = v_{3/2} + v_\infty$ with $v_{3/2} \in L^{3/2}$ and with $|v_\infty|$ a bounded function. This choice precludes $v$'s that go to $\infty$ as $|x| \to \infty$, such as the harmonic oscillator potential. Unbounded potentials can also be handled by the methods given here, but then we have to place additional restrictions on $\rho$ so that $\int v\rho$ makes sense. We restrict ourselves here to $L^{3/2} + L^\infty$ for simplicity of exposition. The class includes Coulomb potentials because $|x|^{-1} = \theta(x)|x|^{-1} + [1 - \theta(x)]|x|^{-1}$ with $\theta(x) = 1$ if $|x| < 1$, $\theta(x) = 0$ if $|x| > 1$. The two terms on the right are in $L^{3/2}$ and in $L^\infty$, respectively. $L^{3/2} + L^\infty$ is a Banach space with the norm

$$\|v\| = \inf\{\|g\|_{3/2} + \|h\|_\infty \mid g + h = v\}. \tag{3.7}$$
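As a worked check of the claim that the Coulomb potential lies in $L^{3/2} + L^\infty$, the two pieces of the splitting with the unit-ball cutoff $\theta$ can be evaluated directly. The resulting number is only an upper bound for the norm (3.7), since the infimum there runs over all decompositions, not just this one.

```latex
\[
\bigl\|\theta\,|x|^{-1}\bigr\|_{3/2}^{3/2}
  = \int_{|x|<1} |x|^{-3/2}\,dx
  = 4\pi\int_0^1 r^{1/2}\,dr
  = \frac{8\pi}{3},
\qquad
\bigl\|(1-\theta)\,|x|^{-1}\bigr\|_{\infty} = 1,
\]
\[
\text{so that}\qquad
\bigl\|\,|x|^{-1}\bigr\| \;\le\; \Bigl(\frac{8\pi}{3}\Bigr)^{2/3} + 1 .
\]
```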
Technical remark. $\mathcal{R}_N$ is a subset of the Banach space $X = L^3 \cap L^1$. $X^*$, the dual of $X$, is $Y = L^{3/2} + L^\infty$. However, the dual of $Y$ is not $X$ because while $L^\infty$ is the dual of $L^1$, $L^1$ is not the dual of $L^\infty$. However, $X \subset Y^*$. The duality will be useful.

There may or may not be a minimizing $\psi$ for (3.5), and if there is one it may not be unique (for bosons it is unique because it is a positive function). Any minimizing $\psi$ (called a ground state) would satisfy

$$H_v\psi = E(v)\psi \tag{3.8}$$

in the distributional sense. The proof of this assertion is not difficult. For example, a minimizing $\psi$ will not exist if $v$ is an attractive square well and if $N$ is too large; the extra, unbound electrons will simply "leak away" to infinity. In such a case, $E(v)$ would still have physical significance. It would be the ground state energy for fewer than $N$ particles. There are three simple, but important, properties of $E(v)$:
Theorem 3.1. (i) $E(v)$ is concave in $v$: that is,

$$E(v) \ge aE(v_1) + (1 - a)E(v_2), \tag{3.9}$$

for all $v_1$, $v_2$, $0 \le a \le 1$ and $v = av_1 + (1 - a)v_2$.

(ii) $E(v)$ is monotone increasing: that is, if $v_1(x) \le v_2(x)$ for all $x$, then $E(v_1) \le E(v_2)$.

(iii) $E(v)$ is continuous in the $L^{3/2} + L^\infty$ norm and is, moreover, locally Lipschitz. In particular, $E(v)$ is finite.

Proof. (i) If $\psi \in \mathcal{W}_N$, then

$$(\psi, H_v\psi) = a(\psi, H_{v_1}\psi) + (1 - a)(\psi, H_{v_2}\psi) \ge aE(v_1) + (1 - a)E(v_2).$$

(ii) is obvious. (iii) Fix $v_0$ and let $\delta = v - v_0$. We want to show that $|E(v) - E(v_0)| \le C\|\delta\|$ when $\|\delta\| \le L/3$, for some $C$ independent of $v$. [Here, $L$ is the constant in (1.10).] Since $E(v)$ is concave, it is sufficient to show that for some fixed $D$, $E(v) - E(v_0) \ge D$ whenever $\|\delta\| = L/3$; because if $0 \le \gamma \le 1$,

$$\gamma[E(v_0 + \delta) - E(v_0)] \le E(v_0 + \gamma\delta) - E(v_0) \le \gamma[E(v_0) - E(v_0 - \delta)].$$

Let $E(v, \tfrac12)$ denote (3.5) with $K$ replaced by $K/2$. Then

$$E(v) \ge E(v_0, \tfrac12) + \inf_\psi\,\bigl(\psi, [\tfrac12 K + \textstyle\sum_i \delta(x_i)]\psi\bigr).$$

The last term is bounded below by $-LN/2$ because $\delta = g + h$ with $\|g\|_{3/2} < L/2$ and $\|h\|_\infty < L/2$. Thus,

$$\int \delta\rho \ge -(L/2)[\|\rho\|_3 + N].$$

But $(\psi, K\psi)/2 \ge (L/2)\|\rho\|_3$ by (1.10) and Theorem 1.1. Finally, note that $E(v_0, \tfrac12) - E(v_0)$ is a constant, $D'$, independent of $v$.
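The concavity and monotonicity of $E(v)$ can be illustrated in a finite-dimensional toy model, which is an assumption made here purely for illustration and is not the Coulomb problem of the text: a single particle hopping between two sites with amplitude $t$, for which $E(v)$ is the lowest eigenvalue of a $2 \times 2$ matrix and is known in closed form. All names below are hypothetical.

```python
import math
import random


def ground_energy(v1, v2, t=1.0):
    """Lowest eigenvalue of [[v1, -t], [-t, v2]] (two-site toy Hamiltonian)."""
    return 0.5 * (v1 + v2) - math.hypot(0.5 * (v1 - v2), t)


def check_concavity_and_monotonicity(samples=2000, seed=1):
    """Return the worst violations found over random potentials (<= 0 is good)."""
    rng = random.Random(seed)
    worst_concavity = 0.0
    worst_monotonicity = 0.0
    for _ in range(samples):
        v = [rng.uniform(-5.0, 5.0) for _ in range(2)]
        w = [rng.uniform(-5.0, 5.0) for _ in range(2)]
        a = rng.random()
        mix = [a * v[i] + (1.0 - a) * w[i] for i in range(2)]
        # Concavity: E(a v + (1-a) w) >= a E(v) + (1-a) E(w).
        gap = a * ground_energy(*v) + (1.0 - a) * ground_energy(*w) - ground_energy(*mix)
        worst_concavity = max(worst_concavity, gap)
        # Monotonicity: raising the potential pointwise cannot lower E.
        raised = [v[i] + rng.uniform(0.0, 2.0) for i in range(2)]
        worst_monotonicity = max(worst_monotonicity, ground_energy(*v) - ground_energy(*raised))
    return worst_concavity, worst_monotonicity


worst_concavity, worst_monotonicity = check_concavity_and_monotonicity()
print(worst_concavity, worst_monotonicity)
```

Both properties follow here for the same reason as in the proof above: $E(v)$ is an infimum over $\psi$ of expressions that are linear and nondecreasing in $v$.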
Now we begin the study of density functional theory in the manner of Hohenberg and Kohn. Their work is based on the following theorem [3]:

Theorem 3.2. Suppose $\psi_1$ (respectively, $\psi_2$) is a ground state for $v_1$ (respectively, $v_2$) and $v_1 \ne v_2 + \text{constant}$. Then $\rho_1 \ne \rho_2$.

Proof. Suppose $\rho_1 = \rho_2 = \rho$. $\psi_1 \ne \psi_2$ because they satisfy different Schrödinger equations, (3.8). [Note. To prove this we must know that $V_1\psi = V_2\psi$ implies that $v_1 = v_2$. This, in turn, requires that $\psi(x)$ does not vanish on a set of positive measure. This technical point is discussed in remark (ii) preceding Theorem 3.5.] Moreover, $\psi_2$ (respectively, $\psi_1$) does not satisfy (3.8) for $v_1$ (respectively, $v_2$). Therefore,

$$E(v_1) < (\psi_2, H_{v_1}\psi_2) = E(v_2) + \int (v_1 - v_2)\rho.$$

Likewise, $E(v_2) < E(v_1) + \int (v_2 - v_1)\rho$. This is a contradiction.
Hohenberg and Kohn assume that every $\rho$ comes from some $\psi$ that is a ground state for some $v$. For such $\rho$ they define the functional

$$F_{HK}(\rho) = E(v) - \int v\rho, \tag{3.10}$$

and we shall retain this definition for $\rho \in \mathcal{A}_N$, where

$$\mathcal{A}_N = \{\rho \mid \rho \text{ comes from a ground state}\}. \tag{3.11}$$

$\mathcal{A}_N \ne \mathcal{I}_N$, as remarked earlier, and it is not convex (see Theorem 3.4)! The definition given by (3.10) requires Theorem 3.2, according to which there is a unique $v$ (up to a constant) associated with $\rho$. We can also define

$$\mathcal{V}_N = \{v \mid H_v \text{ has a ground state}\}. \tag{3.12}$$

It then follows easily that for $v \in \mathcal{V}_N$

$$E(v) = \min\Bigl\{F_{HK}(\rho) + \int v\rho \Bigm| \rho \in \mathcal{A}_N\Bigr\}. \tag{3.13}$$

This is the HK variational principle, but it is important to note that it holds only for $v \in \mathcal{V}_N$, which is unknown, and that the variation is restricted to the unknown set $\mathcal{A}_N$.
We also do not know what $F_{HK}$ is, and that is a very serious problem. But there are also conceptual problems, which will be addressed here. If $F$ is to be used in a variational principle, it is clearly desirable that $F$ be a convex functional. In particular, it should be defined everywhere on $\mathcal{I}_N$, or at least on some known convex subset of $\mathcal{I}_N$. The domain of $F_{HK}$ (i.e., $\mathcal{A}_N$) is not all of $\mathcal{I}_N$ and it is not convex. This last fact is closely connected with the following difficulty: One can define a functional for all $\rho$ in $\mathcal{I}_N$ by*

$$\tilde F(\rho) = \inf\{(\psi, H_0\psi) \mid \psi \mapsto \rho,\ \psi \in \mathcal{W}_N\}. \tag{3.14}$$

It then follows trivially that

$$E(v) = \inf\Bigl\{\tilde F(\rho) + \int v\rho \Bigm| \rho \in \mathcal{I}_N\Bigr\}, \tag{3.15}$$

$$\tilde F(\rho) = F_{HK}(\rho) \quad \text{if } \rho \in \mathcal{A}_N. \tag{3.16}$$

So far, so good. The difficulty is that $\tilde F$ is not convex either. However, $\tilde F$ has one important property that is proved in the appendix.

Theorem 3.3. For each $\rho$ in $\mathcal{I}_N$ there is a $\psi \in \mathcal{W}_N$ such that $\tilde F(\rho) = (\psi, H_0\psi)$. In other words, the infimum in (3.14) is a minimum.
The following functional $F$ is one choice for "the density functional" that remedies the difficulties mentioned so far:

$$F(\rho) = \sup\Bigl\{E(v) - \int v\rho \Bigm| v \in L^{3/2} + L^\infty\Bigr\}. \tag{3.17}$$

We shall explore the properties of $F$, but it, too, will be seen to have subtle difficulties of its own.
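The Legendre-transform definition (3.17) can be made concrete in a toy model, assumed here only for illustration: one particle on two sites with unit hopping, so that $\rho = (\rho_1, 1 - \rho_1)$ and $E(v)$ is the lowest eigenvalue of a $2 \times 2$ matrix. Adding a constant to $v$ shifts $E(v)$ and $\sum_i v_i\rho_i$ equally, so it suffices to scan potentials of the form $v = (s, -s)$. The sketch compares a brute-force supremum over $s$ with the constrained-search value $\min\{(\psi, H_0\psi) : |\psi_i|^2 = \rho_i\} = -2(\rho_1\rho_2)^{1/2}$ of this model. Function names, the scan range, and the grid are hypothetical.

```python
import math


def legendre_functional(rho1, s_max=50.0, steps=200001):
    """Brute-force sup over v = (s, -s) of E(v) - sum_i v_i rho_i (toy model)."""
    d = 2.0 * rho1 - 1.0  # rho_1 - rho_2
    best = -math.inf
    for i in range(steps):
        s = -s_max + 2.0 * s_max * i / (steps - 1)
        # Lowest eigenvalue of [[s, -1], [-1, -s]] is -sqrt(s^2 + 1).
        value = -math.hypot(s, 1.0) - s * d
        best = max(best, value)
    return best


def constrained_search(rho1):
    """min (psi, H0 psi) over psi with |psi_1|^2 = rho1, |psi_2|^2 = 1 - rho1."""
    return -2.0 * math.sqrt(rho1 * (1.0 - rho1))


comparison = {
    rho1: (legendre_functional(rho1), constrained_search(rho1))
    for rho1 in (0.2, 0.5, 0.8)
}
for rho1, (sup_value, search_value) in comparison.items():
    print(rho1, sup_value, search_value)
```

In this toy model the constrained-search functional happens to be convex, so the two values coincide; the text's point is that for $N$ fermions with $N > q$ this need not be so.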
Remarks. (i) (3.17) defines $F(\rho)$ for all $\rho \in \mathcal{R}_N$, not just $\mathcal{I}_N$, provided $F$ is interpreted in the extended sense as a function that can have the value $+\infty$. In fact, (3.17) defines $F$ on the much larger set $X = L^3 \cap L^1$, without the restrictions $\rho(x) \ge 0$ and $\int\rho = N$. As Theorem 3.5 shows, however, it is only necessary to consider $F$ on the convex subset $\mathcal{I}_N$ of $X$.

(ii) Recall that $F$ depends explicitly on $N$ through $E$.

(iii) Since $F$ is the supremum of a family of linear functionals, it is convex.

(iv) Theorem 3.8 shows that $F(\rho) = +\infty$ if $\rho \notin \mathcal{I}_N$. There is an alternative definition of $F$, namely, $F'$, by which $F'$ is finite on the set $\mathcal{I} \equiv \{\rho \mid \rho(x) \ge 0 \text{ and } \nabla\rho^{1/2} \in L^2\}$, without requiring $\int\rho = N$. This is

$$F'(\rho) = \Bigl(\int\rho\Bigr)F\Bigl(\rho\Big/\!\int\rho\Bigr), \quad \rho \ne 0; \qquad F'(0) = 0. \tag{3.18}$$

It is easy to check that the convexity and lower semicontinuity (a concept to be defined later) of $F$ carry over to $F'$. This definition has the virtue that $F'$ is finite on a dense subset of the set of nonnegative functions in $X$. However, this does not change the theory in any important way, so we shall continue to use the definition given by (3.17).

* Levy [10] also defined $\tilde F(\rho)$, which he called $Q$, and derived (3.15). He did not prove Theorem 3.3, but assumed the existence of a minimizing $\psi$. Also, he did not establish the connection between $\tilde F$ and the Legendre transform, $F$ (Theorem 3.7). In Ref. 11, Levy proved Theorem 3.4(ii), independently and virtually at the same time as myself, using essentially the same construction. See Ref. 12 for additional remarks about $Q$.

(v) Other characterizations of $F$, directly in terms of $\tilde F$, are given in Theorem 3.7, and in Eqs. (4.5)-(4.7). There is an obvious relation between $F$ and $\tilde F$, namely,

$$F(\rho) \le \tilde F(\rho) \quad \text{for all } \rho \in \mathcal{I}_N, \tag{3.19}$$

since $E(v) \le \tilde F(\rho) + \int v\rho$ for all $\rho \in \mathcal{I}_N$. Furthermore, since $F$ is convex and $\tilde F$ is not convex (by Theorem 3.4), there are $\rho$'s in $\mathcal{I}_N$ for which $F(\rho) < \tilde F(\rho)$.
First we prove that not all $\rho$'s come from ground states. The essential ingredient is the existence of $v$ with a degenerate ground state. (Such $v$'s, incidentally, preclude the existence of a map $v \mapsto \rho$.)

Theorem 3.4. Let $N > q$ = number of spin states. Then (i) $\tilde F(\rho)$ is not convex. (ii) There exists a $\rho \in \mathcal{I}_N$ that does not come from a ground state $\psi$. Moreover this $\rho$ is a convex combination of $\rho$'s that do come from a ground state.

Proof. Let $v$ be a spherically symmetric potential having a ground state and with the property that its ground state has orbital angular momentum $L \ge 1$. We assume the degeneracy is no greater than necessary, namely $M = 2L + 1$. The orthonormal ground states are $\psi_1, \dots, \psi_M$ and $\psi_i \mapsto \rho_i$. Under simultaneous rotation of all $N$ coordinates, they transform as a basis for the $M$-dimensional irreducible representation of O(3). The following fact is easy to prove: (a) If $\rho = M^{-1}\sum_i \rho_i$, then $\rho(x)$ is spherically symmetric: that is, $\rho$ depends only on $r = |x|$. A second fact that will be needed is (b): if $\psi$ is any ground state (and hence a linear combination of the $\psi_i$) and $\psi \mapsto \rho'$, then $\rho'$ is not spherically symmetric.

This fact must follow from some group-theoretic argument, but I have not found one. However, it is not hard to see that (b) is equivalent to (c): There exists a perturbation of $v$, $v(x) \to v(x) + \lambda w(x)$, with $w$ bounded and of compact support, so that to first order in $\lambda$ the $M$-fold degeneracy is broken. Such pairs $v$ and $w$ certainly exist, so we can henceforth assume that (b) holds. [A proof that a $v$ satisfying (b) exists is the following. First, take the case that $H_0 = K$, that is, independent particles. The ground states are determinants. Choose $v$ so that the ground state has $L \ge 1$, in which case (b) obviously holds. Next, consider $H = K + \lambda\sum |x_i - x_j|^{-1} + V$. Angular momentum is still conserved and for sufficiently small $\lambda$ the ground state will have the same $L$ and, by continuity of the ground states, (b) will continue to hold for small $\lambda$. We are interested in $\lambda = 1$ but, under the scaling

$$x \to x/\lambda, \qquad v(x) \to \lambda^{-2}v(x/\lambda) = v'(x),$$

the $v$, $\lambda$ problem is converted into the $v'$, $\lambda = 1$ problem. Thus, $v'$ has the desired properties. I thank B. Simon for this remark.]

Clearly $\tilde F(\rho_i) = \text{constant} \equiv D = E(v) - \int v\rho_i$. We claim $\tilde F(\rho) > D$, thereby proving lack of convexity. Obviously, $\tilde F(\rho) \ge D$, for otherwise we could use $\rho$ instead of $\rho_i$ in (3.15). (Note: $\int v\rho = \int v\rho_i = \text{constant} \equiv C$.) Suppose $\tilde F(\rho) = D$. Then $\rho$ comes from some $\psi$ that must be a ground state for $v$. But (a) and (b) show this to be impossible. Thus $\tilde F(\rho) > D$. Moreover, $\rho$ cannot come from any ground state $\psi'$ for any other $v'$. If it did, then

$$E(v') = \tilde F(\rho) + \int v'\rho > M^{-1}\sum_{i=1}^{M}\Bigl(\tilde F(\rho_i) + \int v'\rho_i\Bigr).$$

This implies that for some $1 \le i \le M$, $\tilde F(\rho_i) + \int v'\rho_i < E(v')$, which is a contradiction.
Remarks. (i) The foregoing proof holds just as well if $(\psi, H_0\psi)$ is replaced by $T(\psi)$ in the definition of $\tilde F$. [This functional will be denoted by $\tilde T(\rho)$ and the analog of (3.17) by $T(\rho)$.] In other words, the interelectron Coulomb repulsion plays no role in Theorem 3.4 (see Sec. 4C).

(ii) There are other $\rho$'s that do not come from some v, namely, those $\rho \in \mathcal{I}_N$ that vanish on a nonempty open set. If $v \in L^{3/2} + L^\infty$ and $\psi$ is its ground state, then $\psi$ cannot vanish on an open set by the unique continuation theorem [18]. (Strictly speaking, this theorem is only known to hold for $v \in L^3_{\rm loc}$, but it is believed to hold for $L^{3/2} + L^\infty$.) Presumably, such $\rho$'s can, in many cases, be obtained as limits in which $v \to \infty$ on the open set. Therefore, if the set of allowed v's can be extended properly to include infinite v's, the existence of such $\rho$'s may not have any particular importance. The question is very delicate, however, as Englisch and Englisch [7] showed recently. Even for one particle there are densities which never vanish but which do not come from any v, even if one allows density matrices (see Sect. 4B) instead of pure states. These densities have regions in which they are "small" so that the obvious v (defined by $v = \rho^{-1/2}\Delta\rho^{1/2}$) has the property that $-\Delta + v$ cannot be defined as a semibounded operator.

Theorem 3.5.
$$E(v) = \inf\Big\{F(\rho) + \int \rho v \;\Big|\; \rho \in L^3 \cap L^1\Big\}, \tag{3.20}$$
$$E(v) = \inf\Big\{F(\rho) + \int \rho v \;\Big|\; \rho \in \mathcal{I}_N\Big\}. \tag{3.21}$$
Remark. The right sides of (3.20) and (3.21) are automatically concave functionals, which is a property we already proved for E.
Proof. Let $M^-(v)$ [respectively, $M^+(v)$] be the infimum in (3.20) [respectively, (3.21)]. Obviously, $M^-(v) \le M^+(v)$. First, pick $v_0$. Clearly $F(\rho) \ge E(v_0) - \int v_0\rho \equiv F_1(\rho)$. Therefore,
$$M^-(v) \ge \inf\Big\{F_1(\rho) + \int \rho v \;\Big|\; \rho \in L^3 \cap L^1\Big\},$$
Int. J. Quant. Chem. 24, 243-277 (1983) LIEB
and hence $M^-(v_0) \ge E(v_0)$. Second, by (3.19), $F(\rho) \le \tilde F(\rho)$, so that $M^+(v) \le \inf\{\tilde F(\rho) + \int \rho v \mid \rho \in \mathcal{I}_N\} = E(v)$.
Let us pause briefly to review the situation. Three functionals have been defined:

$F_{HK}$, defined on $\mathcal{A}_N \subset \mathcal{I}_N$;
$\tilde F$, defined on $\mathcal{I}_N \subset L^3 \cap L^1$;
$F$, defined on $X = L^3 \cap L^1$.

Of these, only $F$ is convex and only $\tilde F$ and $F$ satisfy the variational principle for all v.
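The relation between the nonconvex functional of (3.14) and the convex functional of (3.17) can be illustrated by a one-dimensional toy model. The sketch below is only an analogy under stated assumptions: a real variable `x` stands in for the density, a real number `v` for the potential, and the double-well function `f_tilde`, the grids `xs` and `vs`, and all other names are hypothetical choices made for this illustration, not objects from the text.

```python
# Toy analog: x plays the role of the density, v the role of the potential.
def f_tilde(x):
    """A nonconvex double well, standing in for the functional of (3.14)."""
    return (x * x - 1.0) ** 2

xs = [-2.0 + 0.01 * i for i in range(401)]    # grid for x in [-2, 2]
vs = [-30.0 + 0.25 * k for k in range(241)]   # grid for v in [-30, 30]
f_vals = [f_tilde(x) for x in xs]

# Analog of (3.15): E(v) = inf_x { f_tilde(x) + v*x }.
E = [min(fv + v * x for fv, x in zip(f_vals, xs)) for v in vs]

# Analog of (3.17): F(x) = sup_v { E(v) - v*x }, a supremum of affine functions of x.
F = [max(e - v * x for e, v in zip(E, vs)) for x in xs]

# Analog of (3.20)/(3.21): using F in the variational principle reproduces E(v).
E_from_F = [min(Fx + v * x for Fx, x in zip(F, xs)) for v in vs]

print("f_tilde(0) =", f_vals[200], " F(0) =", F[200])
print("max |E - E_from_F| =", max(abs(a - b) for a, b in zip(E, E_from_F)))
```

In this toy model `F` lies below `f_tilde`, fills in the nonconvex region between the two wells, and yields the same `E`, which mirrors the statement that both functionals satisfy the variational principle while only one of them is convex.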
The next step is to find out something of the nature of F. It is at this point that the analysis becomes complicated and where difficulties and incompleteness
arise. The basic reason is that the connection between v and $\rho$ is anything but simple. We have $X = L^3 \cap L^1$ and its dual $X^* = L^{3/2} + L^\infty$. Although X is not the dual of $X^*$, it is a subset of $X^{**}$, the dual of $X^*$.

Definitions. (i) A sequence $\rho_n \in X$ is said to converge to $\rho \in X$ ($\rho_n \to \rho$) if and only if $\|\rho_n - \rho\|_3 \to 0$ and $\|\rho_n - \rho\|_1 \to 0$. This is also called norm convergence. $\rho_n$ converges weakly to $\rho$ ($\rho_n \rightharpoonup \rho$) if and only if $\int v(\rho_n - \rho) \to 0$ for all $v \in Y = X^*$. Clearly, strong convergence implies weak convergence.

(ii) A functional f on X is continuous (or norm continuous) if and only if $\rho_n \to \rho$ implies $f(\rho_n) \to f(\rho)$. Weak continuity requires the concept of nets to define but, if f is weakly continuous, then whenever $\rho_n \rightharpoonup \rho$, $f(\rho_n) \to f(\rho)$. Weak continuity implies norm continuity.

(iii) A real functional f on X is lower semicontinuous (l.s.c.) if and only if $\rho_n \to \rho$ implies $f(\rho) \le \liminf f(\rho_n)$. Weak lower semicontinuity requires nets to define, but if f is weakly l.s.c. then $\rho_n \rightharpoonup \rho$ implies $f(\rho) \le \liminf f(\rho_n)$. (Weak) lower semicontinuity is equivalent to the following: $\{\rho \mid f(\rho) \le \lambda\}$ is (weakly) closed for all real $\lambda$.
Remarks. (i) Weak lower semicontinuity always implies lower semicontinuity, but not conversely. It is a theorem of Mazur [19], however, that if f is convex and norm l.s.c., then it is automatically weakly l.s.c. (ii) The function $\rho(x) \equiv 0$ is not in the $L^3 \cap L^1$ weak closure of $\mathcal{I}_N$.

The reader may be puzzled by all these definitions, especially lower semicontinuity, because finite convex functions on $R^n$ are always continuous. Unfortunately, this is not true in infinite-dimensional spaces such as the space X we are considering. Even l.s.c. cannot be taken for granted.

Theorem 3.6. $F(\rho)$ is weakly (and hence also norm) lower semicontinuous.

Proof. Let
$$K_\lambda = \{\rho \mid F(\rho) \le \lambda\} = \Big\{\rho \;\Big|\; E(v) - \int v\rho \le \lambda \text{ for all } v \in Y\Big\}.$$
Now if $\rho_n \to \rho$ in norm and $\rho_n \in K_\lambda$ then, for each $v \in Y$,
$$E(v) - \int v\rho = \lim_{n\to\infty} \Big(E(v) - \int v\rho_n\Big) \le \lambda.$$
Therefore $K_\lambda$ is norm closed, so that F is norm l.s.c. Weak l.s.c. is a consequence of Mazur's theorem.

Next we define the convex envelope (CE).
Definition. Let f be a real functional defined on a subset A of X. $f(\rho)$ is allowed to be $+\infty$, but not for all $\rho \in A$. CE f is defined on all of X as follows:
$$\mathrm{CE}\,f(\rho) = \sup\{g(\rho) \mid g \text{ is weakly l.s.c., } g \text{ is convex on } X, \text{ and } g(\rho') \le f(\rho') \text{ for all } \rho' \in A\}.$$
It is easy to check that CE f is convex and weakly l.s.c. and $\mathrm{CE}\,f(\rho) \le f(\rho)$ for all $\rho \in A$. However, $\mathrm{CE}\,f(\rho)$ may be $+\infty$ for some $\rho$.

The function of interest is $\mathrm{CE}\,\tilde F$ with $A = \mathcal{I}_N$. Note that A is convex and that $\tilde F$ (and hence $\mathrm{CE}\,\tilde F$) is finite on A by Theorem 1.2. Since $\mathrm{CE}\,\tilde F \le \tilde F$ on A, it is obvious from (3.19) and Theorem 3.6 that $F \le \mathrm{CE}\,\tilde F$ on X. On the other hand, suppose we use $\mathrm{CE}\,\tilde F$ instead of $\tilde F$ in (3.15). This gives a new function, which we call $E'$. Clearly $E' \le E$. Then, if $E'$ is used in (3.17), we get a new function $F'$, and $F' \le F$. However, an infinite-dimensional generalization of Fenchel's theorem [29] (which uses the Hahn-Banach theorem) states that if the original function (in our case, $\mathrm{CE}\,\tilde F$) is convex and weakly l.s.c. on X, then its double Legendre transform (in our case $F'$) is equal to the original function. Thus, $F' = \mathrm{CE}\,\tilde F$ and we have

Theorem 3.7. $F(\rho) = \mathrm{CE}\,\tilde F(\rho)$ for all $\rho \in L^3 \cap L^1$.

The reader may wonder what Theorem 3.7 is good for; the following is an example of the usefulness of the foregoing functional analysis (see Theorem 4.3).
Theorem 3.8. For all $\rho \in L^3 \cap L^1$ let
$$G(\rho) = \begin{cases} \int (\nabla \rho(x)^{1/2})^2\,dx & \text{if } \rho \in \mathcal{I}_N, \\ +\infty & \text{otherwise.} \end{cases}$$
Then $F(\rho) \ge G(\rho)$ for all $\rho \in L^3 \cap L^1$.

Proof. G is obviously convex on X [see the remark after (1.10)]. We claim that G is norm l.s.c. (Note: The norm in question is $L^3 \cap L^1$, not the $H^1$ norm on $\rho^{1/2}$.) If so, we are done because G is then weakly l.s.c. and, by Theorem 1.1, $G \le \tilde F$ on $\mathcal{I}_N$; but then $G \le \mathrm{CE}\,\tilde F = F$. To prove norm l.s.c., let $\rho_n$ be any sequence in X with $\rho_n \to \rho$; that is, $\|\rho_n - \rho\|_1 \to 0$ and $\|\rho_n - \rho\|_3 \to 0$. We can assume that $\bar G \equiv \lim G(\rho_n)$ exists and is finite, and we have to show that $\bar G \ge G(\rho)$. We can also assume $\rho(x) \ge 0$ a.e. because if $\rho < 0$ on a set S of positive measure, then, for sufficiently large n, $\rho_n < 0$ on some set of positive measure; hence $\rho_n \notin \mathcal{I}_N$ and $G(\rho_n) = \infty$. For a similar reason we can assume $\int \rho = N$. Since $G(\rho_n) < \infty$, $\rho_n \in \mathcal{I}_N$. Thus, if we define $g_n = \rho_n^{1/2}$ and $g = \rho^{1/2}$, we have: (a) $g_n$ is bounded in $H^1$; (b) $g_n^2 \to g^2$ in $L^3$ and $L^1$. By the Banach-Alaoglu theorem there is an $f \in H^1$ such that $g_n \rightharpoonup f$ and $\nabla g_n \rightharpoonup \nabla f$ weakly in $L^2$. Clearly $f(x) \ge 0$. It is not hard to prove that if $g_n \rightharpoonup f$ in $L^2$ and $g_n^2 \to g^2$ in $L^1$, then $g = f$. Hence $\nabla g = \nabla f$, and thus $\nabla g_n \rightharpoonup \nabla g$. But since $\int (\nabla g)^2$ is $H^1$-norm continuous, it is $H^1$ weakly l.s.c., so that $\lim G(\rho_n) \ge G(\rho)$.
Theorem 3.8 is certainly not obvious. Among other things it says that if $\rho \notin \mathcal{I}_N$ (and such $\rho$'s can be quite smooth and innocent looking), then there exists a sequence of potentials $v_n$ such that $E(v_n) - \int v_n \rho \to \infty$. The reader is asked to reflect on this fact. Another interesting fact is that F is convex and finite on $\mathcal{I}_N$, but infinite off $\mathcal{I}_N$. However, the complement of $\mathcal{I}_N$ (in X) is dense (in the X norm) in $\mathcal{I}_N$ and $\mathcal{I}_N$ is dense in the cone of nonnegative functions in X.

The following upper bound complements Theorem 3.8.

Theorem 3.9. If $\rho \in \mathcal{I}_N$, then
$$F(\rho) \le \tilde F(\rho) \le (4\pi)^2 N^2 G(\rho) + \tfrac12 \iint \rho(x)\rho(y)|x - y|^{-1}\,dx\,dy. \tag{3.22}$$
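Proof. Use the definition (3.14). By Theorem 1.2 there is a determinantal $\psi$, with $\psi \mapsto \rho$, such that (1.12) holds. With this $\psi$ we can calculate the Coulomb repulsion $I = (\psi, \sum_{i<j} |x_i - x_j|^{-1}\psi)$. I has a direct term, given in (3.22), plus an exchange term. The latter is negative, as is well known, since $|x - y|^{-1}$ is a positive definite kernel. Thus $\tilde F(\rho) \le$ right side of (3.22). Then use (3.19).

Remark. By one of Sobolev's inequalities,
$$D \equiv \tfrac12 \iint \rho(x)\rho(y)|x - y|^{-1}\,dx\,dy \le (\text{const.})\,\|\rho\|_{6/5}^2.$$
By Hölder's inequality, $\|\rho\|_{6/5} \le \|\rho\|_1^{3/4}\|\rho\|_3^{1/4}$. Since $\|\rho\|_1 = N$ and $\|\rho\|_3$ is bounded by a constant times $G(\rho)$, this gives
$$D \le (\text{const.})\,N^{3/2} G(\rho)^{1/2} \le (\text{const.})\,[N + N^2 G(\rho)].$$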
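The exponents in the remark above can be checked by a short interpolation computation. The following worked derivation is a sketch; the constants $C_1$ and $C_2$ are placeholders introduced here for the two Sobolev-type constants and are not values taken from the text.

```latex
% Interpolation exponent: solve 5/6 = \theta/1 + (1-\theta)/3, which gives \theta = 3/4.
\|\rho\|_{6/5} \le \|\rho\|_1^{3/4}\,\|\rho\|_3^{1/4} = N^{3/4}\,\|\rho\|_3^{1/4}.
% Insert into D \le C_1 \|\rho\|_{6/5}^2 and use \|\rho\|_3 \le C_2\,G(\rho):
D \le C_1\,N^{3/2}\,\|\rho\|_3^{1/2} \le C_1 C_2^{1/2}\,N^{3/2}\,G(\rho)^{1/2}.
% Arithmetic-geometric mean inequality, with N^{3/2} G^{1/2} = \sqrt{N \cdot N^2 G}:
N^{3/2}\,G(\rho)^{1/2} \le \tfrac12\,\bigl[\,N + N^2 G(\rho)\,\bigr].
```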
To continue the study of F the following concept is needed.
Definition. Let f be a real functional on a subset A of a Banach space B, and let $\rho_0 \in A$. A linear functional L on B is said to be a tangent functional (TF) at $\rho_0$ if and only if for all $\rho \in A$
$$f(\rho) \ge f(\rho_0) - L(\rho - \rho_0). \tag{3.23}$$
L may not be unique. If L is continuous, then L is a continuous tangent functional at $\rho_0$.
L is a continuous linear functional on X if and only if it has the form $\int v\rho$ with $v \in X^* = Y$. If f is convex, then at every point $\rho_0$ at which f is finite, f has at least one TF. This is guaranteed by the Hahn-Banach theorem. However, f may have no continuous TF at $\rho_0$. The functional of interest is obviously F. In general, $F \le \tilde F$ but the following says something about those $\rho$ for which $F(\rho) = \tilde F(\rho)$.

Theorem 3.10. Let $\rho_0 \in \mathcal{I}_N$. The following are equivalent:
(1) $F(\rho_0) = \tilde F(\rho_0)$ and F has a continuous TF at $\rho_0$.
(2) $\rho_0 \in \mathcal{A}_N$.
(3) $\tilde F$ has a continuous TF at $\rho_0$; that is, $\tilde F(\rho) \ge \tilde F(\rho_0) - \int v(\rho - \rho_0)$ for $\rho \in \mathcal{I}_N$.
(4) (3) and (5) hold with the same v.
(5) $E(v) = \tilde F(\rho_0) + \int v\rho_0$ for some v.
(6) (5) holds and, in addition, $v \in \mathcal{V}_N$ and $v \mapsto \rho_0$.
(7) $\tilde F$ has a continuous TF at $\rho_0$ and v is unique up to a constant. Moreover, F has the same continuous TFs at $\rho_0$, and no others.
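Proof. (1)⇒(3): For $\rho \in \mathcal{I}_N$,
$$\tilde F(\rho) \ge F(\rho) \ge F(\rho_0) - \int v(\rho - \rho_0) = \tilde F(\rho_0) - \int v(\rho - \rho_0).$$
(3)⇒(4): Let $F_1(\rho) = \tilde F(\rho_0) - \int v(\rho - \rho_0) \le \tilde F(\rho)$. Then
$$\tilde F(\rho_0) + \int v\rho_0 \ge E(v) \ge \inf\Big\{F_1(\rho) + \int \rho v \;\Big|\; \rho \in \mathcal{I}_N\Big\} = \tilde F(\rho_0) + \int v\rho_0.$$
(4)⇒(5), (7)⇒(3), (6)⇒(5): All trivial.
(5)⇒(1): $\tilde F(\rho_0) + \int \rho_0 v = E(v)$ and $\tilde F(\rho_0) \ge F(\rho_0)$ together imply $\tilde F(\rho_0) = F(\rho_0)$, since $F(\rho_0) \ge E(v) - \int \rho_0 v$ by (3.17). Then, for all $\rho \in X$, $F(\rho) + \int \rho v \ge E(v) = F(\rho_0) + \int \rho_0 v$.
(2)⇒(5): By (3.16).
(5)⇒(2), (6): By Theorem 3.3, $\tilde F(\rho_0) = (\psi, H_0\psi)$ for some $\psi$ with $\psi \mapsto \rho_0$. Then $E(v) = (\psi, H_0\psi) + \int v\rho_0$, so $\psi$ is a ground state for v; hence $v \in \mathcal{V}_N$ and $v \mapsto \rho_0$.

Thus (1)-(6) are equivalent and (7)⇒(3). Now we show that (1)-(6)⇒(7). If v is a continuous TF for F, then v is a continuous TF for $\tilde F$ [by the proof of (1)⇒(3)]. If v is a continuous TF for $\tilde F$, then $F(\rho) \ge E(v) - \int v\rho$, so v is a continuous TF for F. Suppose $\tilde F$ has two continuous TFs v and w with $v - w \ne$ constant. Then $E(v) = \tilde F(\rho_0) + \int v\rho_0$ and $E(w) = \tilde F(\rho_0) + \int w\rho_0$. Since $\rho_0 \in \mathcal{A}_N$, this is impossible by Theorem 3.2.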
It should be noted that the only place that the HK Theorem 3.2 entered in the analysis of F was in establishing the uniqueness (modulo constants) in (7).

Now we turn to two important questions whose answers we cannot give but that are obviously important for the theory. We replaced $F_{HK}$ by F because $F_{HK}$ was not defined on all of $\mathcal{I}_N$. Theorem 3.10 states that on $\mathcal{A}_N$, where $F_{HK}$ is defined, $F = \tilde F = F_{HK}$ and F has an essentially unique continuous TF.

Question 5. For which points of $\mathcal{I}_N$ does F have a continuous TF? Where there is one, is it unique (modulo adding a constant to v)?

Question 6. If F has a continuous TF at $\rho_0 \in \mathcal{I}_N$ given by some $v \in L^{3/2} + L^\infty$, is this $v \in \mathcal{V}_N$?
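Questions 5 and 6 have alternative formulations, given below.

Theorem 3.11. Let $\rho_0 \in \mathcal{I}_N$ and $v \in L^{3/2} + L^\infty$. v is not necessarily in $\mathcal{V}_N$. Then, for all $\rho$,
$$F(\rho) \ge F(\rho_0) - \int v(\rho - \rho_0) \quad \text{(continuous TF)} \tag{3.24}$$
if and only if
$$E(v) = F(\rho_0) + \int v\rho_0 \quad [\text{minimum in (3.21)}]. \tag{3.25}$$

Proof. Assume (3.24) and let $F_1(\rho)$ be its right side. Then
$$E(v) \ge \inf\Big\{F_1(\rho) + \int \rho v\Big\} = F(\rho_0) + \int v\rho_0 \ge E(v).$$
For the converse,
$$F(\rho) + \int v\rho \ge E(v) = F(\rho_0) + \int v\rho_0.$$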
Question 5 is equivalent to the following: For which $\rho_0 \in \mathcal{I}_N$ is there a v such that (3.25) holds? Is this v unique (up to constants)? Question 6 is the following: If (3.25) holds, is $v \in \mathcal{V}_N$?

Some insight into the continuous TFs of F is provided by the Bishop-Phelps theorem. We refer the reader to Ref. 20 for this as well as other interesting facts about convexity. A definition is needed.

Definition. Let F be a real functional on a real Banach space B with dual $B^*$ (the set of continuous linear functionals on B). $b^* \in B^*$ is said to be F-bounded if there is a constant C (depending on $b^*$ but not on b) such that $F(b) \ge b^*(b) + C$ for all $b \in B$.

In our case $B = X$ and F is our density functional.

Theorem 3.12. Every $v \in X^* = L^{3/2} + L^\infty$ is F-bounded.
Proof. By Theorem 3.8, $F(\rho) = \infty$ if $\rho \notin \mathcal{I}_N$, so we only have to consider $\rho \in \mathcal{I}_N$ and prove that $G(\rho) \ge \int v\rho + C$ for some C. The proof of this is identical to the last part of the proof of Theorem 3.1.

The Bishop-Phelps theorem is the following.

Theorem 3.13. Let F be a l.s.c. convex functional on a real Banach space B. (Note: Norm and weak l.s.c. are identical.) F can take the value $+\infty$, but not everywhere. Then
(i) The continuous tangent functionals to F (over all of B) are $B^*$-norm dense in the set of F-bounded functionals in $B^*$.
(ii) Suppose $b_0 \in B$ and $b_0^* \in B^*$ with $F(b_0) < \infty$. For every $\varepsilon > 0$ there exist $b_\varepsilon \in B$ and $b_\varepsilon^* \in B^*$ such that $\|b_\varepsilon^* - b_0^*\|_{B^*} \le \varepsilon$ and
$$\varepsilon\,\|b_\varepsilon - b_0\|_B \le F(b_0) + b_0^*(b_0) - \inf\{F(x) + b_0^*(x) \mid x \in B\}.$$
Moreover, $b_\varepsilon^*$ is tangent to F at $b_\varepsilon$, namely $F(b) \ge F(b_\varepsilon) - b_\varepsilon^*(b - b_\varepsilon)$ for all b.
The significance of Theorem 3.13(i) is the following. There are certainly many v's in Y that are not in $\mathcal{V}_N$. (Example: Suppose $v \in L^{3/2}$ and $\|v\|_{3/2} < L$, where L is the constant in (1.10). Then $(\psi, H_v\psi) > 0$ for all $\psi$, but $E(v) = 0$ because we can always take a sequence $\psi_n$ that "leaks away to infinity.") Let $v \notin \mathcal{V}_N$, whence $\tilde F(\rho) + \int v\rho$ does not have a minimum. What Theorem 3.13 says is that there always exists a sequence $v_n$ (not necessarily in $\mathcal{V}_N$) such that
(a) $F(\rho) + \int \rho v_n$ has a minimum at some $\rho_n \in \mathcal{I}_N$ and this minimum is $E(v_n)$;
(b) $F(\rho) \ge F(\rho_n) - \int v_n(\rho - \rho_n)$ for all $\rho$;
(c) $v_n \to v$ in the $L^{3/2} + L^\infty$ norm.

Point (c) means the following: $v_n = v + g_n + h_n$ with $\|g_n\|_{3/2} \to 0$ and $\|h_n\|_\infty \to 0$. In particular, if $v \in L^{3/2}$ with $\|v\|_{3/2}$ [...]
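One consequence of Theorem 3.13(ii) is the following.

Theorem 3.14. Let $\rho_0 \in \mathcal{I}_N$. Then there exists a sequence $\rho_n \in \mathcal{I}_N$ such that
(i) $\rho_n \to \rho_0$ in $L^3 \cap L^1$ norm.
(ii) F has a continuous TF at $\rho_n$.

Proof. Given $n > 0$, by (3.17) there exists $v_n$ such that
$$E(v_n) - \int \rho_0 v_n \ge F(\rho_0) - 1/n.$$
Hence
$$F(\rho) \ge E(v_n) - \int \rho v_n \ge F(\rho_0) - \int v_n(\rho - \rho_0) - 1/n.$$
Take $\varepsilon = 1$ in Theorem 3.13. There exists $w_n \in Y$ such that $w_n$ is a continuous TF at some $\rho_n$ and
$$\|\rho_n - \rho_0\| \le F(\rho_0) + \int \rho_0 v_n - Z$$
with
$$Z = \inf\Big\{F(\rho) + \int \rho v_n\Big\}.$$
By the above, $Z \ge F(\rho_0) + \int \rho_0 v_n - 1/n$, so that $\|\rho_n - \rho_0\| \le 1/n$.

4. Additional Remarks about Density Functionals

A. The N-Dependence of F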
As was stressed earlier, any functional F that satisfies (3.20) or (3.21) must
depend explicitly on the particle number N. This fact is unavoidable and
frequently overlooked. Let us denote the N dependence by $F(N, \rho)$. It might be hoped that F is jointly convex in N and $\rho$ in the sense that for $N \ge 2$
$$F(N+1, \rho_1) + F(N-1, \rho_2) \ge 2F(N, \tfrac12\rho_1 + \tfrac12\rho_2). \tag{4.1}$$
This convexity definitely does not hold as a general feature, as will be demonstrated. The importance of convexity is shown by the following.

Theorem 4.1. Consider the following two statements about any two functionals, F and E:
(i) $F(N, \rho)$ is jointly convex in N and $\rho$ in the sense of (4.1).
(ii) $E(N, v)$ is convex in N for all fixed v; that is, for $N \ge 2$
$$E(N+1, v) + E(N-1, v) \ge 2E(N, v). \tag{4.2}$$
(a) If (3.17) holds, then (ii) implies (i).
(b) If either (3.20) or (3.21) holds, then (i) implies (ii).
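Proof. (a) For each v, $E(N, v) - \int v\rho$ is jointly $(N, \rho)$ convex. By (3.17), $F(N, \rho)$ is the supremum of such convex functions and hence is convex.
(b) Pick $\varepsilon > 0$. For $N + 1$ there is a $\rho_+$ such that
$$A \equiv F(N+1, \rho_+) + \int \rho_+ v \le E(N+1, v) + \varepsilon.$$
Likewise,
$$B \equiv F(N-1, \rho_-) + \int \rho_- v \le E(N-1, v) + \varepsilon.$$
For the N problem, define $2\rho = \rho_+ + \rho_-$. Then
$$2E(N, v) \le 2\Big\{F(N, \rho) + \int \rho v\Big\} \le A + B.$$
Since this holds for all $\varepsilon > 0$, (ii) is proved.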
Equation (4.2) has a simple physical meaning. The ionization potential increases as the number of electrons is decreased. This is intuitively expected to be true, but if it is true, it must be because of some special property of the Coulomb repulsion. A non-Coulombic counterexample is given below. The kinetic energy functional $\tilde T(N, \rho)$ is not even convex in $\rho$ (Theorem 3.4), but the Legendre transform $T(N, \rho)$ is jointly convex. This is so because $E(N, v)$ is indeed convex in N for independent particles, as is shown by the explicit expression for E as the sum of the first N eigenvalues (counted with an extra multiplicity q).
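The convexity of the independent-particle energy in N can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the one-body spectrum `levels` is a hypothetical list (hydrogen-like values are used only as an example), and `independent_particle_energy` is a name introduced here, not one from the text.

```python
def independent_particle_energy(levels, n_particles, q):
    """Sum of the lowest n_particles one-body levels, each usable q times."""
    available = sorted(e for e in levels for _ in range(q))
    if len(available) < n_particles:
        raise ValueError("not enough levels for this many particles")
    return sum(available[:n_particles])

# Hypothetical spectrum: values -1/(4 n^2), each listed n^2 times, for n = 1..4.
levels = [-1.0 / (4 * n * n) for n in range(1, 5) for _ in range(n * n)]
q = 2

energies = {N: independent_particle_energy(levels, N, q) for N in range(1, 41)}

# Second differences E(N+1) + E(N-1) - 2 E(N); (4.2) says these are nonnegative.
second_differences = [
    energies[N + 1] + energies[N - 1] - 2 * energies[N] for N in range(2, 40)
]
print(min(second_differences) >= -1e-12)
```

Each second difference equals the gap between two consecutive filled levels, which is why sorting the levels is all that is needed for convexity in this case.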
What about the convexity of F when the Coulomb repulsion is included? While it has been conjectured that E(N, v) is convex in N (for all v) in the case of Coulomb repulsion, this has never been proved. It has not even been proved
that $E(3, v) + E(1, v) \ge 2E(2, v)$.
Lest the reader think that convexity in N is a general feature, we present a counterexample. Replace $|x|^{-1}$ by the hard-core repulsion $\Theta(x) = \infty$ if $|x| < 1$ and $\Theta(x) = 0$ otherwise. Pick four distinct points $x_0, y_1, y_2, y_3$ in $R^3$ such that $|y_i - y_j| > 1$ for all $i \ne j$ but $|x_0 - y_i| < 1$ for all i. Let $v(x) = -2\lambda < 0$ in small balls about the $y_i$, $v(x) = -3\lambda$ in a small ball about $x_0$, and $v(x) = 0$ otherwise. If the kinetic energy is neglected, then $E(1, v) = -3\lambda$, $E(2, v) = -4\lambda$, and $E(3, v) = -6\lambda$. Convexity does not hold. This can be turned into a proper example by letting $\lambda$ be sufficiently large so that the kinetic energy can effectively be neglected; it is also possible to replace the hard core by a soft core.
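The three energies in the counterexample can be reproduced by brute-force enumeration. The sketch below makes several assumptions for illustration only: the small balls are idealized as points, the geometry (an equilateral triangle of side 1.2 with the deep well at its centroid) is one hypothetical choice satisfying the distance conditions, and the names `sites`, `allowed`, and `ground_energy` are introduced here.

```python
from itertools import combinations
from math import dist, sqrt

LAMBDA = 1.0  # hypothetical unit for the well depth
side = 1.2    # pairwise distance between the shallow wells (> 1)

# Each site: (position, potential value). The centroid is at distance
# side / sqrt(3), about 0.69 (< 1), from every vertex.
sites = {
    "x0": ((0.0, 0.0, 0.0), -3 * LAMBDA),
    "y1": ((side / sqrt(3), 0.0, 0.0), -2 * LAMBDA),
    "y2": ((-side / (2 * sqrt(3)), side / 2, 0.0), -2 * LAMBDA),
    "y3": ((-side / (2 * sqrt(3)), -side / 2, 0.0), -2 * LAMBDA),
}

def allowed(config):
    """Hard core: every pair of occupied sites must be at distance >= 1."""
    return all(dist(sites[a][0], sites[b][0]) >= 1.0
               for a, b in combinations(config, 2))

def ground_energy(n):
    """Lowest potential energy over allowed n-site occupations (no kinetic energy)."""
    return min(sum(sites[s][1] for s in c)
               for c in combinations(sites, n) if allowed(c))

E1, E2, E3 = (ground_energy(n) for n in (1, 2, 3))
print(E1, E2, E3, "convex at N = 2:", E3 + E1 >= 2 * E2)
```

The deep well is optimal for one particle but is excluded as soon as a second particle is present, which is what breaks the inequality (4.2) at N = 2.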
Remark. The foregoing example is not applicable if $\Theta$ is replaced by $|x|^{-1}$, thereby keeping alive the hope that convexity holds in the Coulomb case. The reason is the following: Given any four points $x_0, y_1, y_2, y_3$, let $|x_0 - y_1| = \max_i \{|x_0 - y_i|\}$.
Then
Ixo-Y1I
'+IYt-Y31
The proof of this is left as an exercise, as well as the implication that if the kinetic energy is neglected, then convexity holds in the Coulomb case.
Question 7. For the case of Coulomb repulsion, is $F(N, \rho)$ jointly convex in N and $\rho$?

B. Density Matrices
Another possible modification of the theory of Sect. 3 is to replace densities $\rho(x)$ by single-particle admissible density matrices $\gamma(x, x')$. (See Questions 3 and 4 in Sec. 2. We do not restrict ourselves to $\gamma$'s that come from pure states $|\psi)(\psi|$.) This set of $\gamma$'s is convex, and $F(\gamma)$, defined analogously to (3.14), is convex [see the proof of Theorem 4.1(b)].

Despite the attractive feature just mentioned, there are three drawbacks to the approach: (i) The problems about continuous tangent functionals remain and may even be more complex than before. (ii) The original aim of the theory was to express the energy in terms of $\rho(x)$ and not $\gamma(x, x')$. (iii) While the set of admissible $\gamma$'s is well defined, it is not easy to identify. Given some $\gamma$, it is easy to verify that $\mathrm{Tr}\,\gamma = N$, but it is difficult to verify that $0 \le \gamma \le qI$.

Still another possible modification is to retain $\rho$ but to consider all N-particle density matrices $\Gamma$ instead of merely pure states $|\psi)(\psi|$. In other words, consider $\Gamma \mapsto \rho$ instead of $\psi \mapsto \rho$ and define
$$F_{DM}(\rho) = \inf\{\mathrm{Tr}\,H_0\Gamma \mid \Gamma \mapsto \rho\} \tag{4.3}$$
on $\mathcal{I}_N$ and $F_{DM}(\rho) = +\infty$ otherwise. Because $\Gamma \mapsto \rho$ is linear, $F_{DM}$ is convex on $\mathcal{I}_N$. (Note: The example in Theorem 3.4 does not yield nonconvexity of $F_{DM}$.)
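Obviously, the analog of (3.15) holds, namely,
$$E(v) = \inf\Big\{F_{DM}(\rho) + \int \rho v \;\Big|\; \rho \in \mathcal{I}_N\Big\}. \tag{4.4}$$
Since $F_{DM}$ is convex, (4.4) can be used directly instead of (3.20) or (3.21). Both F and $F_{DM}$ are convex. The amusing fact is that
$$F(\rho) = F_{DM}(\rho), \qquad \rho \in \mathcal{I}_N. \tag{4.5}$$
Equation (4.5) is not at all obvious, but it does say that the modification does not change the theory in any way. Equation (4.5) also yields another characterization of F. Equation (4.5) is proved in Theorem 4.3.

First, $\Gamma$ is admissible if and only if
$$\Gamma = \sum_{i=1}^{\infty} \lambda_i\,|\psi_i)(\psi_i|$$
with $0 \le \lambda_i$, $\sum \lambda_i = 1$, and the $\psi_i$ are orthonormal. If $\psi_i \mapsto \rho_i$, then $\mathrm{Tr}\,H_0\Gamma = \sum \lambda_i(\psi_i, H_0\psi_i)$. Thus we conclude that for all $\rho \in \mathcal{I}_N$
$$F_{DM}(\rho) = \inf\Big\{\sum_{i=1}^{\infty} \lambda_i \tilde F(\rho_i) \;\Big|\; \sum \lambda_i\rho_i = \rho,\ \rho_i \in \mathcal{I}_N,\ \lambda_i \ge 0,\ \sum \lambda_i = 1\Big\}. \tag{4.6}$$
A simpler expression (which has to be proved) is
$$F_{DM}(\rho) = \inf\Big\{\sum \lambda_i \tilde F(\rho_i) \;\Big|\; \sum \lambda_i\rho_i = \rho,\ \rho_i \in \mathcal{I}_N,\ \lambda_i \ge 0,\ \sum \lambda_i = 1\Big\}, \tag{4.7}$$
where the sums in (4.7) are restricted to finite sums. In view of (4.5), (4.7) is an alternative characterization of $F(\rho)$ for $\rho \in \mathcal{I}_N$.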
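Theorem 4.2. Equation (4.7) is true.

Proof. Pick $\varepsilon > 0$. Using (4.6), let $\{\lambda_i, \rho_i\}$ be an infinite sequence satisfying $\sum \lambda_i\rho_i = \rho$, $\rho_i \in \mathcal{I}_N$, and $F_{DM}(\rho) \ge \sum \lambda_i \tilde F(\rho_i) - \varepsilon$. Since $\sum \lambda_i = 1$ and $\sum \lambda_i \tilde F(\rho_i) < \infty$, there exists K such that $A \equiv \sum_{i=K}^{\infty} \lambda_i \le \varepsilon$ and $B \equiv \sum_{i=K}^{\infty} \lambda_i \tilde F(\rho_i) \le \varepsilon$. Assume $A > 0$ for otherwise we are done. By Theorem 1.1 and the convexity of $G(\rho) = \int (\nabla\rho^{1/2})^2$,
$$\varepsilon \ge B \ge \sum_{i=K}^{\infty} \lambda_i G(\rho_i) \ge A\,G(\rho^K)$$
with $\rho^K = \sum_{i=K}^{\infty} \lambda_i\rho_i / A \in \mathcal{I}_N$. By Theorem 3.9 and the remark following it, $\tilde F(\rho^K) \le C[N^2 G(\rho^K) + N]$. Therefore the finite sequence $\{\lambda_i, \rho_i\}_{i=1}^{K}$ with $(\lambda_K, \rho_K) \equiv (A, \rho^K)$ satisfies
$$\sum \lambda_i \tilde F(\rho_i) \le F_{DM}(\rho) + \varepsilon CN(N+1) + \varepsilon.$$

Theorem 4.3. Equation (4.5) is true.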
Proof. The easy part is that for $\rho \in \mathcal{I}_N$, $F_{DM}(\rho) \ge F(\rho)$. By (4.4), $E(v) \le F_{DM}(\rho) + \int \rho v$ for all v. Hence, by (3.17), $F_{DM}(\rho) \ge F(\rho)$. The hard part is contained in Corollary 4.5, which will be assumed for now. Then: (i) $F_{DM}(\rho) \le \tilde F(\rho)$ by (4.6); (ii) $F_{DM}$ is convex and l.s.c. Hence $F_{DM}(\rho) \le \mathrm{CE}\,\tilde F(\rho) = F(\rho)$ by Theorem 3.7.

Theorem 4.4. Suppose $\{\rho_n\}$ and $\rho \in \mathcal{I}_N$ and $\rho_n \to \rho$ weakly in $L^1$. Then there exists a density matrix $\Gamma$, with $\Gamma \mapsto \rho$, such that $\mathrm{Tr}\,H_0\Gamma \le \liminf F_{DM}(\rho_n)$.

The proof of Theorem 4.4, due to Barry Simon, is given in the appendix.

Corollary 4.5. (i) $F_{DM}$ is (norm and weakly) l.s.c.
(ii) If $\rho \in \mathcal{I}_N$, there exists a density matrix $\Gamma$ with $\Gamma \mapsto \rho$ such that $\mathrm{Tr}\,H_0\Gamma = F_{DM}(\rho)$ (see Theorem 3.3).

Proof. (i) If $\rho_n \to \rho$, $F_{DM}(\rho) \le \mathrm{Tr}\,H_0\Gamma \le \liminf F_{DM}(\rho_n)$. Since $F_{DM}$ is convex, norm l.s.c. implies weak l.s.c. (Mazur's theorem).
(ii) Take $\rho_n = \rho$ in Theorem 4.4.

C. The Kinetic Energy Functional
Kohn and Sham (KS) [30] define a kinetic energy functional $T_{KS}(\rho)$. There are several other possible kinetic energy functionals and we shall explore their interrelations, as well as the fact that $T_{KS}$ does not have a property assumed by KS. KS define the exchange and correlation functional $E_{xc}(\rho)$ by
$$F_{HK}(\rho) = \tfrac12 \iint \rho(x)\rho(y)|x - y|^{-1}\,dx\,dy + T_{KS}(\rho) + E_{xc}(\rho). \tag{4.8}$$
$F_{HK}$ and $T_{KS}$ are defined on different subsets of $\mathcal{I}_N$, so $E_{xc}$ is defined only on a third unknown subset of $\mathcal{I}_N$. This difficulty can be remedied by using $\tilde F$ and $\tilde T$ in (4.8), but there is another point that should be stressed: There is no reason to believe that $E_{xc}$ is convex on $\mathcal{I}_N$.

First, let us give some definitions. These use K instead of $H_0$ but otherwise are self-explanatory (with the aid of the equation numbers on the left):

(3.5): $E'(v)$ on $L^{3/2} + L^\infty$;
(3.10), (3.11): $T_{KS}(\rho)$ on $\mathcal{A}'_N$;
(3.14): $\tilde T(\rho)$ on $\mathcal{I}_N$;
(3.17): $T(\rho)$ on $L^3 \cap L^1$. (4.9)

($T(\psi) = (\psi, K\psi)$ was defined in (1.3) but it is quite different from $T(\rho)$ above. It is hoped that this notational lapse will not be confusing.) All the previous theorems [except for 3.9, wherein the last term in (3.22) should be omitted] carry over to these quantities. The primes on $E'(v)$ and $\mathcal{A}'_N$ indicate that these are different from before. Since Theorem 3.4 still holds, $\mathcal{A}'_N$ is not $\mathcal{I}_N$. It is left as an exercise to show that $\mathcal{A}'_N \ne \mathcal{A}_N$.

Question 8. What is $\mathcal{A}_N \cap \mathcal{A}'_N$?
There is one more kinetic energy functional that can be defined on $\mathcal{I}_N$, namely,
$$T_{\det}(\rho) = \inf\{(\psi, K\psi) \mid \psi \mapsto \rho,\ \psi \in \mathcal{W}_N,\ \psi \text{ is a determinant}\}. \tag{4.10}$$
Clearly, $T_{\det}(\rho) \ge \tilde T(\rho)$. The question to be addressed is whether $T_{\det} = \tilde T$. The answer is No!, not even on all of $\mathcal{A}'_N$. KS assumed implicitly that $T_{KS}(\rho) = T_{\det}(\rho)$ for $\rho \in \mathcal{A}'_N$; any such $\rho$ comes from a minimizer of $K + V$, but it is not true that such a $\rho(x)$ can always be written as $\sum_{i=1}^{N} |\phi_i(x)|^2$ with the $\phi_i$ being orthonormal functions on $R^3$. (Spin is a complication that is ignored at this point for simplicity.) In other words, not every ground state of $K + V$ is a determinant when degeneracy is present. I thank B. Simon for drawing my attention to this subtlety and for the construction in Theorem 4.8, which is reminiscent of the construction in Theorem 3.4. Of course, $T_{KS} = \tilde T$ on $\mathcal{A}'_N$ by definition. Also $\tilde T = T$ on $\mathcal{A}'_N$ by Theorem 3.10. The following shows that there are cases in which $\tilde T = T_{\det}$.

Theorem 4.6. Suppose $\rho \in \mathcal{A}'_N$ so that $K + V$ has a ground state. If this ground state is nondegenerate, then $T_{\det}(\rho) = \tilde T(\rho)$.
Proof. The $\psi$ that minimizes $(\psi, [K + V]\psi)$ is, of course, a determinant.

The following analog of Theorem 3.3 will be needed for Theorem 4.8.

Theorem 4.7. Let $\rho \in \mathcal{I}_N$. Then there exists a determinant that minimizes $(\psi, K\psi)$ under the condition that $\psi \mapsto \rho$, $\psi \in \mathcal{W}_N$, and $\psi$ is a determinant. Thus, (4.10) is actually a minimum.

Proof. Let $D^j$ be a sequence of determinants with $D^j \mapsto \rho$ and $\lim_j (D^j, KD^j) = T_{\det}(\rho)$. The proof of Theorem 3.3 shows that $\psi$ exists such that (i) $\psi \mapsto \rho$; (ii) $(\psi, K\psi) \le T_{\det}(\rho)$; (iii) $D^j \to \psi$ strongly in $L^2$. It suffices to show that $\psi$ is a determinant. Let $f_i^j$, $i = 1, \dots, N$, be the orthonormal single-particle functions of $D^j$. By the Banach-Alaoglu theorem, N functions $f^1, \dots, f^N$ exist so that (after passing to a subsequence) $f_i^j \rightharpoonup f^i$ weakly. The $f^i$ are not necessarily orthonormal. The function $P^j(z_1, \dots, z_N) = \prod_{i=1}^{N} f_i^j(z_i)$ then converges weakly to $P = \prod_{i=1}^{N} f^i(z_i)$. This is so because any $\psi \in L^2(R^{3N})$ can be approximated in norm by sums of product functions. Therefore,
$$D^j \rightharpoonup (N!)^{-1/2} \det[f^i(z_k)] \equiv D \quad \text{weakly.}$$
Since $D^j \to \psi$ strongly, $\psi = D$ is a determinant.
Theorem 4.8. Let $N = 7$ and $q = 1$. Then there is a $\rho \in \mathcal{A}'_N$ such that $T_{\det}(\rho) > \tilde T(\rho)$.

Proof. Take $v(x) = -|x|^{-1}$, the hydrogen potential. The eigenvalues of $-\Delta + v$ are $-1/4$ (onefold), $-1/16$ (fourfold), $-1/36$ (ninefold). All other eigenvalues are greater than $-1/36$. The ground state for $N = 7$ and $q = 1$ is $\binom{9}{2} = 36$-fold degenerate, and a basis for this eigenspace consists of the determinants $(7!)^{-1/2} \det(1S, 2S, 2P_1, 2P_2, 2P_3, f, g)$ where f and g are any orthonormal functions in the nine-dimensional space M spanned by $S, P_1, P_2, P_3, D_1, \dots, D_5$ (an orthonormal set for the 3S, 3P, and 3D waves). Let $d(f, g)$ denote the above normalized determinant and let $3^{1/2}\psi = d(S, D_1) + d(D_2, D_3) + d(D_4, D_5)$. Then $\psi \mapsto \rho$ with $\rho = \rho_a + \rho_b$ and
$$\rho_a(x) = |1S(x)|^2 + |2S(x)|^2 + \sum_{i=1}^{3} |2P_i(x)|^2,$$
$$3\rho_b(x) = |S(x)|^2 + \sum_{i=1}^{5} |D_i(x)|^2.$$
Clearly $\rho \in \mathcal{A}'_N$ since $\psi$ is a ground state.

If $T_{\det}(\rho) = \tilde T(\rho)$, then there exists a determinant $\phi$ with $\phi \mapsto \rho$ and such that $\phi$ must be a ground state. Therefore $\phi = d(f, g)$ for some orthonormal $f, g \in M$. Thus,
$$|f(x)|^2 + |g(x)|^2 = \rho_b(x). \tag{4.11}$$
I claim that this is impossible. Write $f = A + D$ and $g = B + d$, with A and B being linear combinations of S and the $P_i$ while D and d are linear combinations of the $D_i$. Now the S, P, and D waves behave as $|x|^0$, $|x|^1$, $|x|^2$, respectively, near the origin. By examining the behavior of (4.11) near the origin we conclude that
$$|D(x)|^2 + |d(x)|^2 = \tfrac13 \sum_{i=1}^{5} |D_i(x)|^2.$$
Since all the $D_i$ waves have the same radial wave functions, this is really an equality about spherical harmonics. The right side of the last equality is spherically symmetric, so the problem is to find two linear combinations F and G of the $Y_{2m}$ such that $|F(\Omega)|^2 + |G(\Omega)|^2 = \text{constant} > 0$.

This is impossible, and the proof is left as an exercise. (It is easily carried out if the following five basis functions are used: $xyr^{-2}$, $yzr^{-2}$, $xzr^{-2}$, $3x^2r^{-2} - 1$, $3y^2r^{-2} - 1$, with $r^2 = x^2 + y^2 + z^2$.)
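The spherical symmetry of the full sum over the five D waves, which the argument above relies on, can be checked numerically. The sketch below does not solve the exercise; it only illustrates that the sum of squares of five mutually orthogonal, equally normalized real $l = 2$ angular functions is constant on the unit sphere, while a single one of them is not. The particular real basis used here is an assumption of this illustration and differs from the basis suggested in the text.

```python
import math
import random

def d_wave_sum(x, y, z):
    """Sum of squares of five real l = 2 angular functions on the unit sphere.

    The functions are proportional, with one common factor, to an
    orthonormal set of real spherical harmonics of degree 2.
    """
    funcs = (
        x * y,
        y * z,
        x * z,
        (x * x - y * y) / 2,
        (3 * z * z - 1) / (2 * math.sqrt(3)),
    )
    return sum(f * f for f in funcs)

random.seed(0)
points = []
for _ in range(1000):
    g = [random.gauss(0.0, 1.0) for _ in range(3)]
    r = math.sqrt(sum(c * c for c in g))
    points.append(tuple(c / r for c in g))  # uniformly random direction

full_sum = [d_wave_sum(*p) for p in points]
single = [(p[0] * p[1]) ** 2 for p in points]

print("spread of full sum:", max(full_sum) - min(full_sum))
print("spread of one term:", max(single) - min(single))
```

The first spread is zero up to rounding, in line with the statement that the right side is spherically symmetric; the second is visibly nonzero.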
Remarks. (i) $N = 7$ is not special; it was chosen for convenience in the proof. (ii) An alternative way of viewing Theorem 4.8 is the following. Suppose $K + V$ has a degenerate ground state, so that the ground eigenspace G is more than one-dimensional. Any $\psi \in G$ is a linear combination of determinants. Consider a perturbation w of v, namely, $v \to v + \lambda w$. In first-order perturbation theory, $V + \lambda W$ picks out a subspace g of G as the new ground eigenspace. If g is one dimensional, then g consists of one determinant since the ground eigenspace of $V + \lambda W$ always contains determinants (see Theorem 4.6). Now we ask, given $\psi_0 \in G$ and $\psi_0 \mapsto \rho_0$, can w be chosen so that g is one dimensional and $g = \{\psi_0\}$? Alternatively, can w be chosen so that $\min\{\int w\rho \mid \psi \mapsto \rho \text{ and } \psi \in G\}$ occurs uniquely for $\rho = \rho_0$? If so, $\psi_0$ is a determinant. Theorem 4.8 says that there can be a $\rho_0$ such that no w can pick it out uniquely.
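Even though $T_{\det}(\rho) > \tilde T(\rho)$ for some $\rho$, $T_{\det}$ still satisfies the variational principle for $E'(v)$.

Theorem 4.9. For all $v \in L^{3/2} + L^\infty$
$$E'(v) = \inf\Big\{T_{\det}(\rho) + \int \rho v \;\Big|\; \rho \in \mathcal{I}_N\Big\}. \tag{4.12}$$

Proof. Equation (4.12) is equivalent to the following:
$$E'(v) \equiv \inf\{(\psi, [K + V]\psi) \mid \psi \in \mathcal{W}_N\} = \inf\{(\psi, [K + V]\psi) \mid \psi \in \mathcal{W}_N,\ \psi \text{ is a determinant}\} \equiv \bar E(v).$$
Clearly $E'(v) \le \bar E(v)$. Consider the operator $-\Delta + v(x)$. We define its "eigenvalues" $e_1 \le e_2 \le \dots$ (here, spin degeneracy is included) by the min-max principle: $e_{k+1} = \sup_{\phi_1, \dots, \phi_k} \mu(\phi_1, \dots, \phi_k)$, where
$$\mu(\phi_1, \dots, \phi_k) = \inf\{(\phi, [-\Delta + v]\phi) \mid \phi \in H^1,\ \|\phi\|_2 = 1 \text{ and } \phi \text{ is orthogonal to } \phi_1, \dots, \phi_k\}.$$
From this definition, it follows by a standard argument that
$$E_N(v) \equiv \sum_{i=1}^{N} e_i = \inf\Big\{\sum_{i=1}^{N} (\phi_i, [-\Delta + v]\phi_i) \;\Big|\; \phi_1, \dots, \phi_N \text{ are orthonormal}\Big\}. \tag{4.13}$$
But this last infimum equals
$$\inf\Big\{\sum_{i=1}^{\infty} \lambda_i(\phi_i, [-\Delta + v]\phi_i) \;\Big|\; \phi_1, \phi_2, \dots \text{ are orthonormal},\ 0 \le \lambda_i \le 1 \text{ and } \sum_{i=1}^{\infty} \lambda_i = N\Big\}.$$
This is easy to verify. Let $\psi \in \mathcal{W}_N$ and let $\gamma = \sum \lambda_i\,|f_i)(f_i|$ be its one-particle density matrix (including spin and with the $f_i$ orthonormal, $0 \le \lambda_i \le 1$, $\sum \lambda_i = N$). Then $(\psi, [K + V]\psi) = \sum \lambda_i(f_i, [-\Delta + v]f_i)$.

Thus $E'(v) \ge E_N(v)$. But $E_N(v) = \bar E(v)$ by inspection.

Remark. This proof gives a formula for $E'(v)$, namely, $E_N(v)$.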
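The step from integer occupations to fractional ones in the proof can be illustrated in a simplified setting. The sketch below assumes a finite list of one-body energies (`levels`, hypothetical random numbers) and keeps the orthonormal functions fixed to the corresponding eigenfunctions, so that only the occupation numbers vary; this commuting special case is an assumption of the illustration and is not the full statement used in the proof. The helper names are introduced here.

```python
import random

def lowest_sum(levels, n):
    """Sum of the n lowest levels: the value claimed for the constrained minimum."""
    return sum(sorted(levels)[:n])

def random_occupations(n_levels, n, rng, n_terms=6):
    """A random point of {0 <= lam_i <= 1, sum lam_i = n}.

    It is built as a convex combination of 0/1 occupation vectors,
    each with exactly n ones, so the constraints hold by construction.
    """
    weights = [rng.random() for _ in range(n_terms)]
    total = sum(weights)
    lam = [0.0] * n_levels
    for w in weights:
        for i in rng.sample(range(n_levels), n):
            lam[i] += w / total
    return lam

rng = random.Random(1)
levels = [rng.uniform(-5.0, 5.0) for _ in range(12)]  # hypothetical energies
N = 4
bound = lowest_sum(levels, N)

trial_values = []
for _ in range(2000):
    lam = random_occupations(len(levels), N, rng)
    trial_values.append(sum(l * e for l, e in zip(lam, levels)))

print("no trial beats the filled-levels value:", min(trial_values) >= bound - 1e-9)
```

Because the energy is linear in the occupation numbers, its minimum over the constraint set is attained at an extreme point, that is, at a 0/1 occupation of the N lowest levels.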
The situation is complicated, so let us summarize it. $T_{KS}$ is defined only on $\mathcal{A}'_N$, the set of $\rho$'s that come from ground states for some v. $\mathcal{A}'_N$ has a smaller subset in which $\rho$ comes from a determinantal ground state. This determinantal subset includes, but is larger than, the set of $\rho$'s that come from nondegenerate ground states. (Note: By Theorem 3.2 any $\rho$ comes from a unique v (up to constants). Thus, if $\rho$ comes from a determinant in a degenerate ground eigenspace, then $\rho$ belongs to the determinantal subset.) On the determinantal subset we have
$$T_{KS}(\rho) = T_{\det}(\rho) = \tilde T(\rho).$$
Elsewhere on $\mathcal{A}'_N$,
$$T_{\det}(\rho) > T_{KS}(\rho) = \tilde T(\rho).$$
Thus, there are two choices for (4.8): either $T_{KS}$ or $T_{\det}$. On the complement of $\mathcal{A}'_N$, $T_{KS}$ is not defined (but $\tilde T$, $T_{\det}$, and T are defined). The preferred functional here is $T(\rho)$ because it is convex and hence most manageable. On this complement $T(\rho)$ is probably strictly less than $\tilde T(\rho) \le T_{\det}(\rho)$; at least this is so when T has a continuous tangent functional (Theorem 3.10). Such points are dense (Theorem 3.14). In any case, $\tilde T$, T, and $T_{\det}$ can be used interchangeably in (4.12); it makes no difference which is used as far as $E'(v)$ is concerned.
5. Some Density Functionals That Are Bounds

In this section we forego the abstract functional theory of the previous sections and instead expound a different philosophy. Rather than pursuing "the correct density functional," which seems to be uncomputable, we shall content ourselves here with finding upper and lower bounds to the various quantities of interest in terms of $\rho(x)$. This latter program can provide rigorous bounds on ground state energies that, while they may not always be extremely accurate, do have a proper place in our conceptual scheme. Some of these bounds will be briefly displayed here; the interested reader is referred to the original papers for proofs. It should be remembered that if one has bounds on two quantities (e.g., T and I; see below) and even if these bounds are optimal, then, in general, the sum of the bounds is not optimal for the sum of the two quantities (e.g., $T + I$).

A. Kinetic Energy Lower Bound

Lieb and Thirring (LT) [21] (also see Ref. 16) proved (for fermions in three dimensions) that if $\psi \mapsto \rho$, then (for all N)
$$T(\psi) \ge K^c (4\pi)^{-2/3} q^{-2/3} \int \rho(x)^{5/3}\,dx, \tag{5.1}$$
where $K^c$ is the "classical" value $(3/5)(6\pi^2)^{2/3}$. LT conjectured that (5.1) holds in three dimensions with the $(4\pi)^{-2/3}$ deleted. [Note: Although an analog of (5.1) holds in all dimensions, the corresponding constant is definitely less than $K^c$ in one and two dimensions.] In Ref. 22 (also see Ref. 16) $(4\pi)^{-2/3}$ was replaced by $1.496(4\pi)^{-2/3}$.
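For orientation, the constants in (5.1) can be evaluated numerically. The sketch below is a minimal illustration: the function `kinetic_lower_bound`, its Riemann-sum treatment of the integral, and the sample density are hypothetical choices made here, and units are those of the text.

```python
import math

K_classical = 0.6 * (6 * math.pi ** 2) ** (2 / 3)  # (3/5)(6 pi^2)^(2/3)
lt_factor = (4 * math.pi) ** (-2 / 3)              # factor appearing in (5.1)
improved_factor = 1.496 * lt_factor                # factor quoted from Ref. 22

def kinetic_lower_bound(rho_values, cell_volume, q=2, factor=improved_factor):
    """Riemann-sum estimate of factor * K^c * q^(-2/3) * integral of rho^(5/3).

    rho_values are samples of the density on cells of equal volume.
    """
    integral = sum(r ** (5 / 3) for r in rho_values) * cell_volume
    return factor * K_classical * q ** (-2 / 3) * integral

print(f"K^c             = {K_classical:.4f}")
print(f"(4 pi)^(-2/3)   = {lt_factor:.4f}")
print(f"improved factor = {improved_factor:.4f}")

# Hypothetical example: uniform density 0.5 on 1000 cells of volume 0.01.
print(kinetic_lower_bound([0.5] * 1000, 0.01))
```

The improved factor is still well below 1, which is the value that the conjecture mentioned above would give.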
Incidentally, the statement $T(\psi) \ge Kq^{-2/3} \int \rho^{5/3}$ for all $\psi$, all N, and some K is equivalent to the following [21]. Let v be any nonpositive potential in $L^{5/2}(R^3)$ and let $e_1 \le e_2 \le \dots$ be the negative eigenvalues (if any) of $-\Delta + v(x)$ counting degeneracy, but not counting the q-fold degeneracy. Then
$$\sum e_i \ge -L \int |v(x)|^{5/2}\,dx$$
with $K = \frac35 \left(\frac{2}{5L}\right)^{2/3}$.

B. Kinetic Energy Upper Bound
There is, of course, no upper bound for $T(\psi)$ in terms of $\rho$. March and Young (MY) [17] proposed that for all $\rho \in \mathcal{I}_N$ there is a determinantal $\psi$, with $\psi \mapsto \rho$, such that
$$T(\psi) \le q^{-2/n} K^c \int \rho(x)^{(n+2)/n}\,dx + \int (\nabla\rho^{1/2}(x))^2\,dx, \tag{5.2}$$
where n is the dimension and $K^c = \pi^2/3$ for $n = 1$. (Compare (5.2) with Theorem 1.2.) They proved (5.2) for $n = 1$, but their proof for $n > 1$ has an error. Equation (5.2) for $n > 1$ is still an open problem. The MY construction for $n = 1$ motivated the construction in the proof of Theorem 1.2.

C. Lower Bound for the Indirect Part of the Coulomb Repulsion
Let $\Gamma$ be a density matrix (which may be a pure state, $\Gamma = |\psi\rangle\langle\psi|$) with $\Gamma \mapsto \rho$. Let
$$I(\Gamma) = \mathrm{Tr}\Big\{\Gamma \sum_{1 \le i < j \le N} |x_i - x_j|^{-1}\Big\} \tag{5.3}$$
be the Coulomb repulsive energy. The indirect part of this energy, $E(\Gamma)$, is defined by
$$I(\Gamma) = D(\rho) + E(\Gamma), \tag{5.4}$$
with
$$D(\rho) = \tfrac{1}{2} \iint \rho(x)\rho(y)|x - y|^{-1}\,dx\,dy \tag{5.5}$$
being the direct part. In Ref. 23 it was shown that
$$E(\Gamma) \ge -C \int \rho(x)^{4/3}\,dx \tag{5.6}$$
with $C = 8.5$. In Ref. 24 this was improved to $C = 1.68$. The sharp (i.e., best) $C$ in (5.6) is not known, but it is larger than 1.23.
Density Functionals for Coulomb Systems (a revised version of no. 144)
It is well known that in any pure, determinantal state, $E(\Gamma) < 0$. For other states, $E(\Gamma)$ can be positive. Indeed, for any fixed $\rho$ there is no upper bound for $E(\Gamma)$ (see Ref. 24). There is no $q$-dependence in (5.6) and, indeed, (5.6) holds for all statistics (i.e., $C$ does not depend on statistics). This is explained in Ref. 24. The Dirac approximation has $C_D q^{-1/3}$ in (5.6) with $C_D = 3(6/\pi)^{1/3}/4 = 0.93$, but this $q$ dependence is an artifact of the particular $q$-dependent determinantal $\psi$ used to evaluate $E$ from (5.4). It should be noted that the bound
$$I(\Gamma) \ge D(\rho) - C \int \rho^{4/3} \tag{5.7}$$
is not convex in $\rho$. It is not even positive. These two faults lead to absurd conclusions when the right side of (5.7) is used in Thomas-Fermi-Dirac theory (see Ref. 25). Since $\Gamma \mapsto \rho$ is linear,
$$\tilde I(\rho) = \inf\{I(\Gamma) \mid \Gamma \mapsto \rho\} \tag{5.8}$$
is convex in $\rho$. In other words, an optimal positive, convex lower bound must exist. Any reader who is devoted to abstract density functional theory, in the spirit of Sec. 3 or (5.8), should try to guess a plausible form for $\tilde I(\rho)$. (Proving it is another matter.) It will quickly be seen that $\tilde I(\rho)$ must be extremely complicated, and to say that it is "nonlocal" is an understatement. To see this, consider $N = 2$ and $\rho$ consisting of two "bumps," $\rho_1$ and $\rho_2$, very far apart. As long as $\int \rho_1 = \int \rho_2 = 1$, $\tilde I(\rho) - D(\rho) \approx 0$, independently of $\rho_1$ and $\rho_2$. But when $\int \rho_1 > 1$, $\int \rho_2 < 1$, then $\tilde I(\rho) - D(\rho)$ depends heavily on $\rho_1$ but not on $\rho_2$. The reason is that in the former case the two electrons can be far apart in the two bumps; in the latter case the two electrons must partly be close together in the first bump.
A problem that is physically more relevant and that illustrates the hidden complexity of density functional theory is the following problem about induced dipolar (or Van der Waals) forces raised in Ref. 25. When two atoms are a distance $R$ apart, and $R$ is large, there is an attraction $-R^{-6}$ (neglecting retardation effects). This attraction comes from the Coulomb repulsion, but it is not a static effect. The atomic dipole moment is almost zero. (There are, in fact, tiny dipole moments, but these are opposite in sign by symmetry, and hence repulsive. They must exist by the Feynman-Hellmann theorem: $dE/dR$ = electric potential at the nucleus. I thank C. Herring for this remark.) There is almost no static dipole moment because to create one would cost a polarization energy $\alpha d^2$. The attractive energy is $-d^2 R^{-3}$ and, if $R^{-3} < \alpha$, $d = 0$ for minimum energy. The cause of the $-R^{-6}$ energy is more subtle, but it has a semiclassical basis: The electrons in each atom move in phase while maintaining the spherical symmetry about each atom. The energy cost is then $\alpha d^4$ and the minimum energy occurs when $2\alpha d^2 = R^{-3}$. Thus, the $-R^{-6}$ attraction comes from the fact that the electron cloud cannot be thought of as a simple "fluid." This effect is somehow built into $\tilde I(\rho)$, but an explicit form of $\tilde I(\rho)$ that will produce this effect has yet to be displayed.
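The semiclassical estimate just described can be checked with a few lines of arithmetic. The sketch below is only an illustration of that heuristic model: the energy function $\alpha d^4 - d^2 R^{-3}$, the function names, and the sample value of $\alpha$ are assumptions made for the example, not quantities taken from Ref. 25.

```python
def vdw_model_energy(d, R, alpha):
    """Model energy alpha*d^4 - d^2/R^3 for a correlated dipole of size d."""
    return alpha * d ** 4 - d ** 2 / R ** 3

def optimal_dipole_squared(R, alpha):
    """Stationary point of the model energy: 2*alpha*d^2 = R^-3."""
    return 1.0 / (2.0 * alpha * R ** 3)

def minimum_energy(R, alpha):
    """Energy at the stationary point; analytically -1/(4*alpha*R^6)."""
    d = optimal_dipole_squared(R, alpha) ** 0.5
    return vdw_model_energy(d, R, alpha)

alpha = 1.5  # hypothetical polarization constant, arbitrary units
for R in (5.0, 10.0, 20.0):
    e_min = minimum_energy(R, alpha)
    # e_min * R^6 should be the same number for every R
    print(f"R = {R:5.1f}   E_min = {e_min:.3e}   E_min * R^6 = {e_min * R ** 6:.6f}")
```

The product $E_{\min} R^6$ is independent of $R$ in this model, which is the $-R^{-6}$ law stated in the text.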
D. A Variational Principle
$E(v)$, given by (3.5), satisfies (by definition) the well-known variational principle
$$E(v) \le (\psi, H_v \psi). \tag{5.9}$$
Can an upper bound for $E(v)$ be given in terms of $\rho$ alone? If (5.2) were true, then, for any $\rho \in \mathcal{I}_N$,
$$E(v) \le \text{right side of (5.2)} + D(\rho) + \int v\rho. \tag{5.10}$$
[See the remark about $E(\Gamma)$ for determinants in Sec. 5C.] An upper bound for $E(v)$ can, indeed, be given in terms of the one-particle density matrices $\gamma(z, z')$ as follows [26]: Let $\gamma$ be any admissible one-particle density matrix ($0 \le \gamma \le 1$, $\mathrm{Tr}\,\gamma = N$). (Note: $\gamma$ includes spin. It was called i in Sec. 2). Then
$$E(v) \le \mathrm{Tr}\,\gamma(-\Delta + v(x)) + \tfrac{1}{2} \iint K_2(z, z')|x - x'|^{-1}\,dz\,dz', \tag{5.11}$$
where $\int dz = \sum_\sigma \int dx$ and
$$K_2(z, z') = \gamma(z, z)\gamma(z', z') - |\gamma(z, z')|^2. \tag{5.12}$$
The form (5.11) is well known if $\gamma$ came from a pure state $|\psi\rangle\langle\psi|$ with $\psi$ being a determinant. The point about (5.11) is that it holds for all admissible $\gamma$. Incidentally, the minimum of (5.11) over all admissible $\gamma$ occurs when $\gamma$ comes from a determinantal $\psi$. In other words, the best Hartree-Fock function minimizes (5.11), but (5.11) is interesting precisely because this HF function is unknown.
E. Thomas-Fermi Theory

This theory (see Ref. 25 for an exposition) does not yield bounds and therefore does not properly belong here. However, it illustrates the usefulness of the bounds in Secs. 5A-C. The TF functional is
$$\mathcal{E}^{TF}(\rho) = K^c q^{-2/3} \int \rho^{5/3} + D(\rho) + \int v\rho, \tag{5.13}$$
while the TF Weizsaecker functional $\mathcal{E}^{TFW}(\rho)$ is the right-hand side of (5.10). If $-C \int \rho^{4/3}$ is added to the right-hand side of (5.13), the result is TF Dirac theory. The TF energy for $N$ particles is defined by
$$E^{TF} = \inf\Big\{\mathcal{E}^{TF}(\rho) \;\Big|\; \int \rho = N\Big\} \tag{5.14}$$
and similarly for $E^{TFW}$ and $E^{TFD}$.
Now suppose that $v$ is an atomic or molecular potential, that is,
$$v(x) = -\sum_j z_j |x - R_j|^{-1}, \tag{5.15}$$
with the $z_j > 0$. It is a fact [14] that under the scaling $z_j \to \lambda z_j$ and $N \to \lambda N$, as $\lambda \to \infty$
$$E^{TF}/E(v) \to 1, \tag{5.16}$$
where $E(v)$ is the true ground state energy. Note that (5.16) also holds if $E^{TF}$ is replaced by $E^{TFW}$ or $E^{TFD}$ (see Ref. 25). Thus we see that if the conjecture in Sec. 5A holds, then, combining (5.1) with (5.7), TFD theory is a lower bound that is asymptotically exact. Similarly, if (5.2) holds, then, as remarked in Sec. 5C, TFW theory is an upper bound that is asymptotically exact.

F. Two-Body Density Matrices

If one is willing to go beyond the one-body density $\rho$ or one-body density matrix $\gamma$ and consider the two-body reduced density matrix $\gamma^{(2)}$, then $E(v)$ is directly and exactly expressible in terms of $\gamma^{(2)}$, since $H_v$ has only one- and two-body terms. The problem is that it is very difficult to decide when a given $\gamma^{(2)}$ is, in fact, the reduction of an admissible $N$-body density matrix $\Gamma$. This is called the $N$-representability problem and it has not been solved. (This is to be compared with the fact that there is a simple necessary and sufficient condition for a one-body $\gamma$ to be $N$-representable; see Sec. 2). It is possible, however, to find some necessary conditions and some sufficient conditions for $\gamma^{(2)}$ to be $N$-representable. Using these, bounds on $E(v)$ can be derived. Since this approach is outside the scope of this article, we refer the reader to the excellent review of Percus [27].
Appendix: Proofs of Theorems 1.3, 3.3, and 4.4
The following proof of Theorem 1.3 is due to H. Brezis (private communication).

Proof. For simplicity of presentation we take $N = 2$ and $q = 1$ (no spin). Therefore we have $\psi(x, y)$ and
$$F(x) \equiv \Big(\int |\psi(x, y)|^2\,dy\Big)^{1/2} = [\rho(x)/2]^{1/2}.$$
Suppose that $\psi_n \to \psi$ in $H^1(\mathbb{R}^3 \times \mathbb{R}^3)$; that is, $\psi_n \to \psi$ and $\nabla\psi_n \to \nabla\psi$ in $L^2$. We want to show that $F_n \to F$ in $L^2(\mathbb{R}^3)$ and $\nabla F_n \to \nabla F$ in $L^2(\mathbb{R}^3)$. The former is trivial:
By the Schwarz inequality,
$$\int |F_n - F|^2\,dx = \int dx \Big[\int |\psi_n(x, y)|^2\,dy + \int |\psi(x, y)|^2\,dy - 2\Big(\int |\psi_n(x, y)|^2\,dy \int |\psi(x, y)|^2\,dy\Big)^{1/2}\Big] \le \int dx \int |\psi_n(x, y) - \psi(x, y)|^2\,dy,$$
and the right-hand side converges to zero.

The proof that $\nabla F_n \to \nabla F$ is the difficult one. It is sufficient to prove convergence for some subsequence $n_j$, $j = 1, 2, \ldots$. (If $\nabla F_n \not\to \nabla F$, then there is some subsequence and some $\varepsilon > 0$ such that $\|\nabla F_{n_j} - \nabla F\| > \varepsilon$. But then this subsequence clearly does not have a subsequence which converges to $\nabla F$.) Now since $\psi_n \to \psi$ in $H^1$ there is some subsequence and some function $G \in H^1$ such that
$$|\psi_{n_j}(x, y)| \le G(x, y) \quad\text{and}\quad |\nabla\psi_{n_j}(x, y)| \le G(x, y),$$
a.e. in $\mathbb{R}^6$. (The proof of this fact is the same as the first half of the proof of the Riesz-Fischer lemma that $L^1$ is complete.) Henceforth, we shall replace $n_j$ by $n$. We shall also assume, for simplicity, that $F_n(x)$ and $F(x) > 0$ for all $x$ (otherwise, an approximation argument can be used). Now $\nabla F_n(x) = A_n(x) + \text{c.c.}$ with
$$A_n(x) = [2F_n(x)]^{-1} \int B_n(x, y)\,dy$$
and
$$B_n(x, y) = \psi_n^*(x, y)\nabla_x\psi_n(x, y).$$
As we saw, $F_n \to F$ in $L^2$, so, by passing to a subsequence, we can assume $F_n(x) \to F(x)$ a.e. Furthermore, a.e.
$$|B_n(x, y)| \le G(x, y)^2 \in L^1(\mathbb{R}^6).$$
By passing to a subsequence we can assume $\nabla\psi_n \to \nabla\psi$ and $\psi_n \to \psi$ a.e. Thus, by dominated convergence, $B_n \to B$ in $L^1(\mathbb{R}^6)$. For this subsequence $|B_n(x, y) - B(x, y)| \to 0$, a.e. ($\mathbb{R}^6$). Then, for a.e. $x$, $|B_n(x, y) - B(x, y)| \to 0$ a.e. $y$. Thus, by dominated convergence $\int dy\,|B_n(x, y) - B(x, y)| \to 0$, a.e. $x$. In other words, for some subsequence, $\nabla F_n(x) \to \nabla F(x)$, a.e. Finally, we note that, by the Schwarz inequality,
$$|\nabla F_n(x)|^2 \le \int |\nabla\psi_n(x, y)|^2\,dy \le \int G(x, y)^2\,dy \equiv C(x)^2.$$
Since $C$ is a fixed $L^2$ function, $\nabla F_n \to \nabla F$ in $L^2$ by dominated convergence.
Proof of Theorem 3.3. Let $\psi_j$ (with $\psi_j \mapsto \rho$) be a minimizing sequence for $F(\rho)$. The $\psi_j$ are obviously bounded in $H^1(\mathbb{R}^{3N})$, so, by the Banach-Alaoglu theorem, there is a $\psi \in H^1(\mathbb{R}^{3N})$ such that $\psi_j \to \psi$ weakly in $H^1(\mathbb{R}^{3N})$. Obviously, $\psi$ has the same symmetry as the $\psi_j$. It is well known that under weak limits positive quadratic forms decrease. Thus
$$F(\rho) = \lim (\psi_j, H_0\psi_j) \ge (\psi, H_0\psi).$$
If we can show that $\psi \mapsto \rho$, we are done. To do so it is sufficient to prove that $\psi_j \to \psi$ strongly because if $\psi \mapsto \tilde\rho$ we have, by the easy part of Theorem 1.3, that $\rho^{1/2} = \rho_j^{1/2} \to \tilde\rho^{1/2}$ in $L^2$, so that $\rho = \tilde\rho$.

Strong convergence will be proved by showing that $\int |\psi|^2 = 1$. Let $S$ be the characteristic function of some bounded set in $\mathbb{R}^{3N}$. By the Rellich-Kondrachov theorem [28] there is a subsequence (which can be chosen independent of $S$) of the $\psi_j$ such that $S\psi_j$ converges strongly (in $L^2$) to $S\psi$. Pick $\varepsilon > 0$ and let $\chi$ be the characteristic function of a bounded set in $\mathbb{R}^3$ such that
$$\varepsilon > \int \rho(1 - \chi) = \int |\psi_j|^2 \sum_i [1 - \chi(x_i)].$$
But $\sum_i [1 - \chi(x_i)] \ge 1 - S$, where $S = \prod_i \chi(x_i)$. Thus, $\int |\psi_j|^2 S \ge 1 - \varepsilon$. Since $\int |\psi_j|^2 S \to \int |\psi|^2 S$, we have that
$$\int |\psi|^2 \ge \int |\psi|^2 S \ge 1 - \varepsilon \quad\text{for all } \varepsilon > 0.$$

Remark. The symmetry of $\psi$ was not needed in this proof provided one generalizes definition (1.6) to
$$\rho(x) = \sum_{i=1}^{N} \sum_{\sigma} \int |\psi(z_1, \ldots, z_N)|^2 \Big|_{x_i = x}\, \prod_{j \ne i} dz_j. \tag{A.1}$$
The following proof of Theorem 4.4 is due to B. Simon (private communication). It is closely related to the proof of Theorem 3.3 just given.

Proof. Without loss, replace $H_0$ by $h^2 = H_0 + 1$ in the definitions. $h^{-1}$ is a bounded operator. We can assume that $g = \lim g_n$ exists, and $\mathrm{Tr}\,\Gamma_n h^2 \le g_n + 1/n$ with $\Gamma_n \mapsto \rho_n$. Thus, $y_n = h\Gamma_n h$ is uniformly bounded in the trace norm. The dual of the compact operators, com, is the trace class operators, $t$, and $y \in t$ takes $A \in$ com into $\mathrm{Tr}\,yA$. A sequence $y_n \in t$ converges to $y \in t$, in the weak* topology, if and only if $\mathrm{Tr}\,y_n A \to \mathrm{Tr}\,yA$ for all $A \in$ com. The Banach-Alaoglu theorem states that a norm-closed ball of finite radius in $t$ is compact in the weak* topology. For us this means that there exists $y$ with $y_n \to y$ in the weak* topology (for a subsequence). Clearly $y \ge 0$ and therefore $\lim \mathrm{Tr}\,y_n \ge \mathrm{Tr}\,y$. Also, $y$ obviously has the correct (Pauli) symmetry. If we can show that $\Gamma = h^{-1} y h^{-1}$ (which is in trace class) satisfies $\Gamma \mapsto \rho$, we are done. To do this we shall show that if $\Gamma \mapsto \rho'$, then $\int (\rho_n - \rho') f \to 0$ for any $f \in L^\infty$. This would mean that $\rho_n \to \rho'$ weakly in $L^1$. But since $\rho_n \to \rho$ in $L^1$, $\rho' = \rho$.
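As in the proof of Theorem 3.3, for any $\varepsilon > 0$ there is a $\chi$ (= characteristic function of a bounded set in $\mathbb{R}^3$) such that
$$\int \rho(1 - \chi) < \varepsilon \quad\text{and}\quad \int \rho'(1 - \chi) < \varepsilon.$$
Since $\rho_n \to \rho$, $\int \rho_n(1 - \chi) < \varepsilon$ for $n$ sufficiently large. If
$$\Phi_n(x_1, \ldots, x_N) = \Gamma_n(x_1, \ldots, x_N; x_1, \ldots, x_N)$$
(after summing on spins), and similarly for $\Phi$, we have (as in Theorem 3.3)
$$\int \Phi_n(1 - S) < \varepsilon \quad\text{and}\quad \int \Phi(1 - S) < \varepsilon,$$
where $S = \prod_i \chi(x_i)$. In view of this, it is sufficient to show that
$$\int \Phi_n P \to \int \Phi P \quad\text{with}\quad P = S \sum_i f(x_i).$$
Let $P = P(x_1, \ldots, x_N)$ be any bounded function of compact support and let $M_P$ be the operator (in $L^2$) of multiplication by $P$. It is a fact that $A_P = h^{-1} M_P h^{-1}$ is compact. (This is essentially the same as the Rellich-Kondrachov theorem used in Theorem 3.3.) Therefore
$$\mathrm{Tr}\,\Gamma_n M_P = \mathrm{Tr}\,y_n A_P \to \mathrm{Tr}\,y A_P = \mathrm{Tr}\,\Gamma M_P.$$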
Acknowledgment

This work was partially supported by U.S. National Science Foundation grant No. PHY-7825390-A02. This paper is a revised version of a paper with the same title that appeared in Physics as Natural Philosophy: Essays in Honor of Laszlo Tisza on his 75th Birthday, H. Feshbach and A. Shimony, Eds. (M.I.T. Press, Cambridge, 1982), pp. 111-149.
Bibliography

[1] L. H. Thomas, Proc. Camb. Phil. Soc. 23, 542 (1927).
[2] E. Fermi, Rend. Accad. Naz. Lincei 6, 602 (1927).
[3] P. Hohenberg and W. Kohn, Phys. Rev. B 136, 864 (1964).
[4] M. M. Morrell, R. G. Parr and M. Levy, J. Chem. Phys. 62, 549 (1975).
[5] R. G. Parr, S. Gadre and L. J. Bartolotti, Proc. Natl. Acad. Sci. USA 76, 2522 (1979).
[6] R. A. Donnelly and R. G. Parr, J. Chem. Phys. 69, 4431 (1978).
[7] H. Englisch and R. Englisch, "Hohenberg-Kohn theorem and non-v-representable densities," Physica A, to be published.
[8] T. L. Gilbert, Phys. Rev. B 6, 211 (1975).
[9] J. E. Harriman, Phys. Rev. A 6, 680 (1981).
[10] M. Levy, Proc. Natl. Acad. Sci. USA 76, 6062 (1979).
[11] M. Levy, Phys. Rev. A 26, 1200 (1982).
[12] S. M. Valone, J. Chem. Phys. 73, 1344 (1980); ibid. 73, 4653 (1980).
[13] A. S. Bamzai and B. M. Deb, Rev. Mod. Phys. 33, 95 (1981). Erratum, 33, 593 (1981).
[14] E. H. Lieb and B. Simon, Adv. Math. 23, 22 (1977). See also Thomas-Fermi theory revisited, Phys. Rev. Lett. 31, 681 (1973). See also Refs. 16 and 25.
[15] M. Hoffmann-Ostenhof and T. Hoffmann-Ostenhof, Phys. Rev. A 16, 1782 (1977).
[16] E. H. Lieb, Rev. Mod. Phys. 48, 553 (1976).
[17] N. H. March and W. H. Young, Proc. Phys. Soc. 72, 182 (1958).
[18] M. Reed and B. Simon, Methods of Modern Mathematical Physics (Academic, New York, 1978), Vol. 4.
[19] S. Mazur, Studia Math. 4, 70 (1933).
[20] R. B. Israel, Convexity in the Theory of Lattice Gases (Princeton U.P., Princeton, NJ, 1979).
[21] E. H. Lieb and W. E. Thirring, "Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities," in Studies in Mathematical Physics, E. H. Lieb, B. Simon, and A. S. Wightman, Eds. (Princeton U.P., Princeton, NJ, 1976). See also Phys. Rev. Lett. 35, 687 (1975); Errata, 35, 1116 (1975).
[22] E. H. Lieb, Am. Math. Soc. Proc. Symp. Pure Math. 36, 241 (1980).
[23] E. H. Lieb, Phys. Lett. A 70, 444 (1979).
[24] E. H. Lieb and S. Oxford, Int. J. Quantum Chem. 19, 427 (1981).
[25] E. H. Lieb, Rev. Mod. Phys. 53, 603 (1981); Errata, 54, 311 (1982).
[26] E. H. Lieb, Phys. Rev. Lett. 46, 457 (1981); Erratum, 47, 69 (1981).
[27] J. K. Percus, Int. J. Quantum Chem. 13, 89 (1978).
[28] R. A. Adams, Sobolev Spaces (Academic Press, New York, 1975).
[29] W. Fenchel, Can. J. Math. 1, 23 (1949).
[30] W. Kohn and L. J. Sham, Phys. Rev. A 140, 1133 (1965).
Received October 19, 1982
Accepted for publication March 11, 1983
Commun. Math. Phys. 92, 473-480 (1984)
Communications in Mathematical Physics
© Springer-Verlag 1984

On Characteristic Exponents in Turbulence

Elliott H. Lieb*
Departments of Mathematics and Physics, Princeton University, P.O. Box 708, Princeton, NJ 08544, USA

Abstract. Ruelle has found upper bounds to the magnitude and to the number of non-negative characteristic exponents for the Navier-Stokes flow of an incompressible fluid in a domain $\Omega$. The latter is particularly important because it yields an upper bound to the Hausdorff dimension of attracting sets. However, Ruelle's bound on the number has three deficiencies: (i) it relies on some unproved conjectures about certain constants; (ii) it is valid only in dimensions $\ge 3$ and not 2; (iii) it is valid only in the limit $\Omega \to \infty$. In this paper these deficiencies are remedied and, in addition, the final constants in the inequality are improved.
Ruelle [1] has derived upper bounds on the magnitude and number of nonnegative characteristic exponents of the Navier-Stokes equation for the flow of an incompressible fluid in a domain $\Omega \subset \mathbb{R}^d$. The bound on the number, $N(\mu)$ [defined in (42)], is particularly interesting because it leads to an upper bound on the Hausdorff dimension of a compact attracting set [1, Corollary 2.3]. Unfortunately, the bounds in [1] on $N(\mu)$, unlike those on the magnitude, have certain deficiencies which are

(i) They rely for their validity on some conjectured, but as yet unproved, relations between the sharp constants in two known inequalities.
(ii) They are valid only for $d \ge 3$.
(iii) Because Weyl's asymptotic formula for the eigenvalues of the Laplacian in $\Omega$ is used, the inequalities are not valid for any fixed $\Omega$, but only in the limit $\Omega \to \infty$.

In this paper a different proof of Ruelle's inequality for the number will be given so that the above three deficiencies are remedied. The result is contained in Eqs. (40)-(43).

Let $v : \Omega \to \mathbb{R}^d$ denote a solution to the Navier-Stokes equation, and let $\mu_1 \ge \mu_2 \ge \cdots$ be the characteristic exponents corresponding to a probability measure $\rho(dv)$ on the space of solutions that is ergodic with respect to the Navier-Stokes time evolution. Ruelle shows [1] that for all $n \ge 1$
$$\sum_{i=1}^{nd} \mu_i \le -d \sum_{i=1}^{n} \langle e_i \rangle = -d\langle E_n \rangle. \tag{1}$$
The brackets $\langle\,\cdot\,\rangle$ denote average with respect to $\rho$, $E_n = \sum_{i=1}^{n} e_i$, and the $e_i = e_i(v)$ are ordered such that $e_1 \le e_2 \le \cdots$ and are the eigenvalues of the Schrödinger operator
$$H = -\nu\Delta - w(x) \tag{2}$$
with Dirichlet boundary conditions on $\Omega$. Here, $\nu$ is the kinematic viscosity and $w(x) \ge 0$ with
$$w(x)^2 = [(d-1)/4d] \sum_{i,j} (\partial v_i/\partial x_j + \partial v_j/\partial x_i)^2 = [(d-1)/2\nu d]\,\varepsilon(x). \tag{3}$$
The quantity $\varepsilon(x)$ is the rate of energy dissipation per unit mass in the flow $v$. In (2), (3) and henceforth, explicit dependence of the various quantities on $v$ is understood but not explicitly indicated unless necessary.

* Work partially supported by U.S. National Science Foundation grant No. PHY-8116101-A01
One might try to take additional advantage of the fact that $\mathrm{div}\,v = 0$ but, as in [1], we shall merely assume that $w$ is some given non-negative function. It will, however, be assumed, as in [1], that
$$w \in L^{1+d/2}(\Omega). \tag{4}$$

Remark. The definition (3) has a factor $(d-1)/d$, which is an improvement over that in [1]. The reason is the following: Ruelle starts with an operator on $L^2(\mathbb{R}^d) \otimes \mathbb{C}^d$ given by $\mathcal{H} = -\nu\Delta\,\delta_{ij} + W_{ij}(x)$, where $W_{ij}(x)$ is the $d \times d$ symmetric matrix $W_{ij}(x) = (\partial v_i/\partial x_j + \partial v_j/\partial x_i)/2$. Ruelle notes that the eigenvalues of $\mathcal{H}$ will satisfy (1) if $w(x)$ in (2) is the largest eigenvalue of the matrix $W_{ij}(x)$. This he estimates by $(\mathrm{Tr}\,W^2)^{1/2}$, and this leads to (3) without $(d-1)/d$. Since $\mathrm{div}\,v = 0$, however, $\mathrm{Tr}\,W = 0$. If $\lambda_1 \ge \lambda_2 \ge \cdots$ are the eigenvalues of $W$, then $\mathrm{Tr}\,W^2 = \sum_i \lambda_i^2$ and $\mathrm{Tr}\,W = \sum_i \lambda_i$. But
$$(d-1) \sum_{i=2}^{d} \lambda_i^2 \ge \Big(\sum_{i=2}^{d} \lambda_i\Big)^2 = \lambda_1^2,$$
and hence $(d-1)\,\mathrm{Tr}\,W^2 \ge d\lambda_1^2$. In addition to the condition $\mathrm{div}\,v = 0$, $\mathcal{H}$ is supposed to be restricted to the space of divergenceless functions. This restriction might improve (1) but, as in [1], it will not be used here.
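The eigenvalue inequality used in the Remark, $(d-1)\,\mathrm{Tr}\,W^2 \ge d\lambda_1^2$ for a traceless symmetric matrix, can be spot-checked numerically. The sketch below works directly with randomly generated spectra that sum to zero rather than with matrices; the helper names and the sampling scheme are assumptions made for illustration only.

```python
import random

def traceless_spectrum(d, rng):
    """Return d random real eigenvalues shifted so that they sum to zero."""
    lam = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    mean = sum(lam) / d
    return [x - mean for x in lam]

def satisfies_bound(lam, tol=1e-12):
    """Check (d-1) * Tr W^2 >= d * lambda_1^2, with lambda_1 the largest eigenvalue."""
    d = len(lam)
    trace_w2 = sum(x * x for x in lam)
    lam_max = max(lam)
    return (d - 1) * trace_w2 >= d * lam_max ** 2 - tol

rng = random.Random(0)
all_ok = all(
    satisfies_bound(traceless_spectrum(d, rng))
    for d in (2, 3, 4, 5)
    for _ in range(1000)
)
print("inequality held in every trial:", all_ok)

# The spectrum (2, -1, -1) saturates the bound for d = 3.
print("equality case d = 3:", 2 * (4 + 1 + 1), "=", 3 * 2 ** 2)
```

The equality case shows that the factor $(d-1)/d$ cannot be improved by this argument alone.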
The domain $\Omega \subset \mathbb{R}^d$ is assumed to be an open set of finite volume $|\Omega|$; boundedness is not required. Condition (4) insures that the quadratic form on $H_0^1(\Omega)$, defined by
$$Q(\phi, \phi) = \nu \int |\nabla\phi|^2 - \int w|\phi|^2, \tag{5}$$
is bounded below and thus defines $H$ as a self-adjoint operator. (Integrals, here and henceforth, are over $\Omega$.) For our purposes, self-adjointness is not important; the only important consideration is the max-min principle which can be used as a definition of the $e_i$:
$$E_n = \inf \sum_{i=1}^{n} Q(\phi_i, \phi_i), \tag{6}$$
where $\{\phi_1, \ldots, \phi_n\}$ is any $L^2$ orthonormal set in $H_0^1(\Omega)$. It is, in fact, the right side of (6) that enters in the derivation of the bound (1).
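The goal is to find upper bounds on the following two quantities:

(a) $E(\gamma) = \sum_{e_i \le 0} |e_i|^{\gamma}$, $\gamma \ge 0$. (7)
(b) $E_n$ itself for fixed $n$. (8)

It is a consequence of (1) that for $\gamma \ge 1$
$$\sum_{\mu_i \ge 0} \mu_i^{\gamma} \le d \sum_{\langle e_i \rangle \le 0} |\langle e_i \rangle|^{\gamma} \le d\langle E(\gamma) \rangle. \tag{9}$$
This is Karamata's theorem which, more generally, states that when $f : \mathbb{R} \to \mathbb{R}$ is convex and non-decreasing then (1) implies that
$$\sum_{i=1}^{nd} f(\mu_i) \le d \sum_{i=1}^{n} f(-\langle e_i \rangle) \le d \sum_{i=1}^{n} \langle f(-e_i) \rangle \tag{10}$$
for all $n$. If, in addition, $f(t) = 0$ for $t < 0$, then
$$\sum_{\mu_i \ge 0} f(\mu_i) \le d \sum_{\langle e_i \rangle \le 0} f(-\langle e_i \rangle) \le d\Big\langle \sum_{e_i \le 0} f(-e_i) \Big\rangle. \tag{11}$$
[Actually, Karamata's inequality gives the left-hand inequalities in (9)-(11). The right-hand inequalities come from Jensen's inequality $f(\langle x \rangle) \le \langle f(x) \rangle$.] It is (9) that gives information about the magnitude of the $\mu_i$. The bound used in [1] [except for the factor $(d-1)/d$ in (3)] was
$$E(\gamma) \le L_{\gamma,d}\,\nu^{-d/2} \int w(x)^{\gamma + d/2}\,dx. \tag{12}$$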
The present knowledge about (12) is the following:

(1) $L_{\gamma,d} < \infty$ for $\gamma > 1/2$ ($d = 1$), $\gamma > 0$ ($d = 2$), $\gamma \ge 0$ ($d \ge 3$). No such bound exists for $\gamma < 1/2$ ($d = 1$) or $\gamma = 0$ ($d = 2$). The case $\gamma = 1/2$, $d = 1$ does not seem to have been settled. (The claim in [2] that $L_{1/2,1} < \infty$ is not justified.) Bounds on $L_{1,3}$ were first given by Lieb and Thirring in [3] and on $L_{\gamma,d}$ for $\gamma > 0$ ($d = 2, 3$) and $\gamma > 1/2$ ($d = 1$) in [2]. Bounds on $L_{0,d}$, $d \ge 3$, were first given by Cwikel [4], Lieb [5, 6], and Rosenbljum [7]. The best upper bound for $L_{0,3}$ is in [6], namely $0.0780 = 4\pi^{-2}3^{-3/2} \le L_{0,3} \le 0.1156$. The lower bound is from [2, Eq. (4.24)]. Recently, by a simpler method, Li and Yau [8] derived upper bounds for $L_{0,d}$, $d \ge 3$, which they claimed were better than that in [6]; unfortunately a numerical error was made in [8] and their bound for $L_{0,3}$ is three times larger than that in [6].

(2) The sharp constant $L_{\gamma,d}$ in (12) cannot depend on $\Omega$, i.e. $L_{\gamma,d}(\Omega) = L_{\gamma,d}(\mathbb{R}^d)$. To see this, assume that $0 \in \Omega$ and, given $w$ on $\mathbb{R}^d$, consider $w_c(x) = c^2 w(cx)$ on $\Omega$. Then let $c \to \infty$. This situation is in contrast with the $|\Omega|$ dependent bound for $E_n$ to be derived later.

(3) There is a natural "guess" for $L_{\gamma,d}$ given by the semiclassical formula
$$E(\gamma) \approx (2\pi)^{-d} \iint dp\,dx\,|\nu p^2 - w(x)|_-^{\gamma} = L^c_{\gamma,d}\,\nu^{-d/2} \int w(x)^{\gamma + d/2}\,dx \tag{13}$$
with $|a|_- = \max(0, -a)$. An easy integration gives
$$L^c_{\gamma,d} = 2^{-d}\pi^{-d/2}\,\Gamma(\gamma + 1)/\Gamma(\gamma + 1 + d/2). \tag{14}$$

(4) It is a fact [2] that
$$L_{\gamma,d} \ge L^c_{\gamma,d}. \tag{15}$$
In [2, 3] it was conjectured that $L_{1,d} = L^c_{1,d}$ for $d \ge 3$. It is known [2] that for each $d < 7$ there is a $\gamma_c > 0$ such that $L_{\gamma,d} > L^c_{\gamma,d}$ when $\gamma < \gamma_c$. When $d = 1$ or 2, $\gamma_c > 1$. It is also known [9] that $L_{\gamma,1} = L^c_{\gamma,1}$ for $\gamma \ge 3/2$. In fact [9] the ratio $R_d(\gamma) = L_{\gamma,d}/L^c_{\gamma,d}$ is monotone non-increasing in $\gamma$; thus if $R_d(\gamma_0) = 1$ for some $\gamma_0$, then $R_d(\gamma) = 1$ for all $\gamma > \gamma_0$. Glaser et al. [10] have shown that $L_{0,d} > L^c_{0,d}$ for $d \ge 7$. They also evaluate $L_{0,4}$ exactly (it is a Sobolev constant) provided $w$ is restricted to be spherically symmetric. For related results see [11].

(5) Inequality (12) for $\gamma = 1$ is equivalent [2] to
$$\sum_{i=1}^{n} \int |\nabla\phi_i(x)|^2\,dx \ge K_d \int \rho_\phi(x)^{1+2/d}\,dx \tag{16}$$
where the $\{\phi_i\}$ is any $L^2$ orthonormal set in $H^1(\mathbb{R}^d)$ [or $H_0^1(\Omega)$] and
$$\rho_\phi(x) = \sum_{i=1}^{n} |\phi_i(x)|^2. \tag{17}$$
The sharp constants in (12) and (16) are related by
$$L_{1,d} = [d/2K_d]^{d/2}\,(1 + d/2)^{-1-d/2}. \tag{18}$$
[Note: If $n$ is specified then the sharp constant in (16) may depend on $n$, i.e. $K_d(n)$. $K_d$, the sharp constant in (16), (18) is defined to be $\sup_n K_d(n)$.] Corresponding to $L^c_{1,d}$ in (14) there is a classical value $K_d^c$ given by (18):
$$K_d^c = 4\pi d\,\Gamma(1 + d/2)^{2/d}/(2 + d). \tag{19}$$
By (15), $K_d \le K_d^c$.

An inequality related to (16), and which will be used later in the event that $K_d < K_d^c$, is the following [8]:
$$\sum_{i=1}^{n} \int |\nabla\phi_i(x)|^2\,dx > K_d^c\,n^{1+2/d}\,|\Omega|^{-2/d}. \tag{20}$$
(The strict inequality in (20) is, in fact, implied by the proof in [8].) Before turning to our estimate for $E_n$ let us make a few additional remarks about (12).
(α) Combining (3), (9), (12) we see that the right side of (12) is suitable for passing to the "infinite volume" limit, i.e. in some vague sense it is proportional to the volume. The upper bound we shall obtain later for the quantity introduced in [1],
$$N(w) = \text{smallest } n \text{ such that } E_n > 0, \tag{21}$$
will also have this extensivity property. By (1), $dN(w)$ is related to the number of nonnegative characteristic exponents and an upper bound on $N(w)$ will yield a bound on the number of non-negative characteristic exponents [see (43)].

(β) The bound on $N(w)$ in [1] relied on the fact that $L_{0,d} < \infty$ (which is true if and only if $d \ge 3$) and on the conjecture that $L_{1,d}$ [...] the best bound published so far [6] for $L_{1,3}$ is
$$L_{1,3} \le (6.844)\,L^c_{1,3} = 0.04624, \tag{22}$$
and this exceeds $L^c_{0,3} = 0.01689$. However, the bound can be improved slightly to 0.04030 [see (51) below].
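The numerical values quoted in remark (β) follow from the semiclassical formula (14). As a hedged illustration (the function name is ours, and only the standard library gamma function is used), one can evaluate $L^c_{\gamma,d}$ directly:

```python
import math

def semiclassical_L(gamma, d):
    """L^c_{gamma,d} = 2^-d * pi^(-d/2) * Gamma(gamma+1) / Gamma(gamma+1+d/2), as in (14)."""
    return (2.0 ** (-d) * math.pi ** (-d / 2.0)
            * math.gamma(gamma + 1.0) / math.gamma(gamma + 1.0 + d / 2.0))

L0_3_classical = semiclassical_L(0, 3)
L1_3_classical = semiclassical_L(1, 3)
published_bound = 6.844 * L1_3_classical                 # right side of (22)
L0_3_lower = 4.0 * math.pi ** (-2) * 3.0 ** (-1.5)       # lower bound on L_{0,3}

print(f"L^c_(0,3)          = {L0_3_classical:.5f}")
print(f"L^c_(1,3)          = {L1_3_classical:.6f}")
print(f"6.844 * L^c_(1,3)  = {published_bound:.5f}")
print(f"4 pi^-2 3^(-3/2)   = {L0_3_lower:.4f}")
```

The computed product agrees with the 0.04624 in (22) to about four decimal places; the small difference presumably reflects rounding of the factor 6.844.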
(γ) Inequality (12) can be used to derive a lower bound for each $e_n$. If $e_n(V)$ is the $n$th eigenvalue for the potential $V$ in place of $-w$ in (2) then, for any number $e$, it is clear that $e_n(-w) \ge e_n(-(w + e)_+) + e$. Take $\gamma = 0$ in (12) and set $e = e_n$. Then the number of non-positive eigenvalues for $V = -(w + e_n)_+$ is at least $n$, and (12) yields
$$n \le L_{0,d}\,\nu^{-d/2} \int (w(x) + e_n)_+^{d/2}\,dx. \tag{23}$$
The integral on the right side of (23) is finite if $e_n < 0$ or if $|\Omega|$ is finite. It is also monotone in $e_n$ and thus (23) yields a lower bound for $e_n$.

Now we turn to our main goal which is an upper bound for $E_n$. Let $\phi_1, \ldots, \phi_n$ be the eigenfunctions corresponding to $e_1 \le e_2 \le \cdots \le e_n$. [By virtue of (6) and a limiting argument, any approximating orthonormal set such that $\sum_i Q(\phi_i, \phi_i) \to E_n$ will suffice.] Let $\rho_\phi(x) = \sum_{i=1}^{n} |\phi_i(x)|^2$. By (6), (16) and with $p = 1 + 2/d$,
$$E_n \ge F(\rho_\phi), \tag{24}$$
with
$$F(\rho) = \nu K_d \|\rho\|_p^p - \int w\rho, \tag{25}$$
which in turn is greater or equal to
$$G(\rho) = \nu K_d \|\rho\|_p^p - \|w\|_{p'} \|\rho\|_p. \tag{26}$$
Thus,
$$E_n \ge \inf\Big\{F(\rho) \;\Big|\; \int \rho = n,\ \rho(x) \ge 0\Big\} \tag{27}$$
$$\ge \inf\Big\{G(\rho) \;\Big|\; \int \rho = n,\ \rho(x) \ge 0\Big\}. \tag{28}$$
However, $\|\rho\|_p\,|\Omega|^{1/p'} \ge \int \rho$, and therefore if we define the function $J$ (for $X > 0$), and $\tilde E_n$ by
$$J(X) = \nu K_d X^p - \|w\|_{p'} X, \tag{29}$$
$$\tilde E_n = \inf\{J(X) \mid X \ge n|\Omega|^{-1/p'}\}, \tag{30}$$
we have that
$$E_n > \tilde E_n. \tag{31}$$
The strict inequality in (31) is justified by the fact that $\rho_\phi$ cannot satisfy the Hölder inequality after (28) with equality, i.e. $\rho_\phi$ cannot be constant in $\Omega$. [It is left as an exercise, using the fact that $\|\rho\|_p/\|\rho\|_1$ can be made arbitrarily large, that $\tilde E_n$ is indeed the infimum in (27).] The minimum in (30) can be computed to be
$$\tilde E_n = \begin{cases} J(n|\Omega|^{-1/p'}), & n \ge |\Omega|^{1/p'} X_0, \\ J(X_0), & n \le |\Omega|^{1/p'} X_0, \end{cases} \tag{32}$$
where $J'(X_0) = 0$, namely
$$p\nu K_d X_0^{p-1} = \|w\|_{p'}. \tag{33}$$
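In particular, if $n|\Omega|^{-1/p'}$ is greater than or equal to the value $X_1 > 0$ such that $J(X_1) = 0$, then $E_n > 0$. Therefore, $N(w)$ defined by (21) satisfies
$$N(w) \le \mathcal{S}\,|\Omega|^{1/p'} \{\|w\|_{p'}/\nu K_d\}^{1/(p-1)} = \mathcal{S}\,|\Omega| \Big\{\int dx\,w(x)^{1+d/2}/|\Omega|\Big\}^{d/(2+d)} (\nu K_d)^{-d/2}. \tag{34}$$
The symbol $\mathcal{S}$ denotes "the smallest integer $\ge$."

If $K_d < K_d^c$, (34) can be improved as follows. Let $0 \le b \le 1$. Then
$$E_n > F_b(\rho_\phi), \tag{35}$$
with
$$F_b(\rho) = (1-b)\nu K_d^c\,n^p |\Omega|^{1-p} + b\nu K_d \|\rho\|_p^p - \int w\rho. \tag{36}$$
As before,
$$E_n > E_n(b) \equiv (1-b)\nu K_d^c\,n^p |\Omega|^{1-p} + \inf\{b\nu K_d X^p - \|w\|_{p'} X \mid X \ge n|\Omega|^{-1/p'}\}. \tag{37}$$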
Previously, in (32), we discussed the inf in (37). Thus $E_n(b) \ge 0$ if $n$ satisfies the following two conditions:
$$n \ge |\Omega|^{1/p'} X_0\,b^{-1/(p-1)} \quad [\text{see (32), (33)}], \tag{38}$$
$$n^{p-1}\nu |\Omega|^{1-p} \{(1-b)K_d^c + bK_d\} \ge \|w\|_{p'}\,|\Omega|^{-1/p'}. \tag{39}$$
Condition (39) implies that $E_n(b) \ge 0$, provided (38) is satisfied. Choose $b$ so that (38) and (39) are the same, namely $b = K_d^c [2K_d/d + K_d^c]^{-1}$. Inserting this in (37), we have as before
$$N(w) \le \mathcal{S}\,A_d\,|\Omega|\,\nu^{-d/2} \Big\{\int dx\,w(x)^{1+d/2}/|\Omega|\Big\}^{d/(d+2)}, \tag{40}$$
$$(A_d)^{2/d} = [2K_d + dK_d^c]\,[(d+2)K_d K_d^c]^{-1}. \tag{41}$$
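The choice of $b$ that makes (38) and (39) coincide, and the resulting constant (41), can be verified with a short computation. The sketch below is an illustration under stated assumptions: the function names are ours, and the sample inputs $K_3 = 2.7709$, $K_3^c = 9.116$ are the values quoted later in the text.

```python
def matching_b(K, Kc, d):
    """b = Kc / (2K/d + Kc): the value for which conditions (38) and (39) coincide."""
    return Kc / (2.0 * K / d + Kc)

def A_from_condition_38(K, Kc, d):
    """A_d read off from (38) at the matching b: A_d^(2/d) = 1 / (p * K * b), p = 1 + 2/d."""
    p = 1.0 + 2.0 / d
    return (1.0 / (p * K * matching_b(K, Kc, d))) ** (d / 2.0)

def A_closed_form(K, Kc, d):
    """A_d from (41): A_d^(2/d) = (2K + d*Kc) / ((d + 2) * K * Kc)."""
    return ((2.0 * K + d * Kc) / ((d + 2.0) * K * Kc)) ** (d / 2.0)

d, K, Kc = 3, 2.7709, 9.116
b = matching_b(K, Kc, d)
p = 1.0 + 2.0 / d

# At the matching b, the coefficient p*K*b from (38) equals (1-b)*Kc + b*K from (39).
print("b                 =", round(b, 5))
print("p*K*b             =", round(p * K * b, 5))
print("(1-b)*Kc + b*K    =", round((1.0 - b) * Kc + b * K, 5))
print("A_3 via (38)      =", round(A_from_condition_38(K, Kc, d), 5))
print("A_3 via (41)      =", round(A_closed_form(K, Kc, d), 5))
```

When $K_d = K_d^c$ the formula gives $b = d/(d+2)$ and $A_d = (K_d^c)^{-d/2}$, the value one would obtain from (34) alone.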
The inequality (40), (41) is our main result.

We now wish to relate (40), (41) to the turbulence problem, i.e. we want to find an upper bound to
$$N(\mu) = \text{smallest } n \text{ such that } \sum_{i=1}^{n} \mu_i < 0. \tag{42}$$
By (1),
$$N(\mu) \le d\,\{\text{smallest integer } n \text{ such that } \langle E_n \rangle > 0\} \le d\,\{\text{smallest integer } n \text{ such that } \langle \hat E_n \rangle > 0\},$$
where, for each $w$,
$$\hat E_n = \sup\{E_n(b) \mid 0 \le b \le 1\}.$$
For each fixed $n$ and $b$, $E_n(b)$ and $\tilde E_n$ are functions of $t \equiv \|w\|_{p'}$. Denote them by $E_n(b, t)$ and $\tilde E_n(t)$. Direct calculation, using (32), shows that $\tilde E_n(t)$ is a convex function of $t$ (not $t^{p'} = \|w\|_{p'}^{p'}$). Since $E_n(b, t)$ differs from $\tilde E_n(t)$ in a trivial way, $E_n(b, t)$ is also a convex function of $t$. Since $\hat E_n(t)$ is the supremum of convex functions, it too is convex in $t$. By Jensen's inequality $\langle \hat E_n \rangle \ge \hat E_n(\langle t \rangle)$.

Thus, by expressing the right side of (40) in terms of $\|w\|_{p'}$ and then averaging with respect to $\rho(dv)$ we obtain the bound sought in [1]:
$$[\ldots]^{\,1+d/2}/|\Omega|\}^{d/(d+2)}. \tag{43}$$
Finally, let us record some available information about the constants in (41). Using (19) we have
$$K_1^c = \pi^2/3 = 3.290, \qquad K_2^c = 2\pi = 6.283, \qquad K_3^c = 3(6\pi^2)^{2/3}/5 = 9.116. \tag{44}$$
To bound $K_d$ a bound on $L_{1,d}$ is needed.

$d = 1$: The bound in [2, Eq. (2.11)] with $m = 1$, $n = 1$ is
$$L_{1,1} \le (4\pi)^{-1/2}\,\Gamma(5/2)^{-1}\,\Gamma(1/2)^2\,(1/2)^{-1} = 4/3. \tag{45}$$

$d = 2, 3$: In this case we use the formula [6]
$$\sum_{e_i \le 0} |e_i|^{\gamma} = \gamma \int_{-\infty}^{0} |e|^{\gamma-1} N_e\,de, \tag{46}$$
where $N_e$ is the number of eigenvalues of $H \le e$. In [6] it is shown (with $\nu = 1$) that
$$N_e \le (4\pi)^{-d/2} \int dx \int_0^{\infty} dt\,t^{-1-d/2}\,e^{et} f(t w(x)), \tag{47}$$
with
$$f(t) = \max(0, b(t - a)) \quad\text{and}\quad 1/b = \int_a^{\infty} (1 - a/y)\,e^{-y}\,dy. \tag{48}$$
Inserting (47) in (46), then doing the $e$ integration, then the $t$ integration [after a change of variable to $t w(x)$] and finally the $x$ integration, one finds
$$L_{\gamma,d} \le b\,(4\pi)^{-d/2}\,\Gamma(\gamma + 1)\,(d/2 - 1 + \gamma)^{-1}(d/2 + \gamma)^{-1}\,a^{1-\gamma-d/2}. \tag{49}$$
The optimum constant $a$ satisfies
$$a e^{a} \int_a^{\infty} e^{-y}\,dy/y = (d/2 + \gamma - 1)/(d/2 + \gamma). \tag{50}$$
When $\gamma = 1$ we take $a = 0.61$, $b = 3.6807$ for $d = 2$ and $a = 1.02$, $b = 6.9358$ for $d = 3$. Inserting this in (49) yields
$$L_{1,2} \le 0.24008, \qquad L_{1,3} \le 0.040304. \tag{51}$$
Using (18),
$$K_1 \ge 1/12 = 0.08333, \qquad K_2 \ge 1.0413, \qquad K_3 \ge 2.7709, \tag{52}$$
which, by (41), leads to
$$A_1 = 2.050, \qquad A_2 = 0.5597, \qquad A_3 = 0.1329. \tag{53}$$
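The chain of constants (44), (51), (52), (53) can be reproduced from formulas (19), (49), (18) and (41). The following is a minimal sketch: the function names are ours, and the inputs $a$, $b$ are the values quoted in the text rather than values recomputed from (48) and (50).

```python
import math

def K_classical(d):
    """Classical constant (19): 4*pi*d*Gamma(1+d/2)^(2/d) / (2+d)."""
    return 4.0 * math.pi * d * math.gamma(1.0 + d / 2.0) ** (2.0 / d) / (2.0 + d)

def L_upper_bound(gamma, d, a, b):
    """Right side of (49) for given trial constants a and b."""
    s = d / 2.0 + gamma
    return (b * (4.0 * math.pi) ** (-d / 2.0) * math.gamma(gamma + 1.0)
            * a ** (1.0 - s) / ((s - 1.0) * s))

def K_from_L(L, d):
    """Invert (18): an upper bound on L_{1,d} gives a lower bound on K_d."""
    return (d / 2.0) * (L * (1.0 + d / 2.0) ** (1.0 + d / 2.0)) ** (-2.0 / d)

def A_constant(K, Kc, d):
    """Constant (41): A_d^(2/d) = (2K + d*Kc) / ((d+2)*K*Kc)."""
    return ((2.0 * K + d * Kc) / ((d + 2.0) * K * Kc)) ** (d / 2.0)

L_bounds = {
    1: 4.0 / 3.0,                             # (45)
    2: L_upper_bound(1, 2, 0.61, 3.6807),     # (51)
    3: L_upper_bound(1, 3, 1.02, 6.9358),     # (51)
}
results = {}
for d in (1, 2, 3):
    Kc = K_classical(d)
    K = K_from_L(L_bounds[d], d)
    results[d] = (Kc, L_bounds[d], K, A_constant(K, Kc, d))
    print(f"d={d}: K^c={Kc:.3f}  L_1d<={L_bounds[d]:.5f}  K_d>={K:.4f}  A_d={results[d][3]:.4f}")
```

Replacing $K_3$ by $K_3^c$ in `A_constant` gives $(K_3^c)^{-3/2}$, which is the conjectured value 0.03633.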
This value for $A_3$ can be compared with the value in [1, Footnote 7], which is obtained under the conjectured assumptions $L_{0,3} = 0.0780$ and $L_{1,3} = L^c_{1,3}$, namely
$$L_{0,3}\,[1 - (L_{1,3}/L_{0,3})^{2/5}]^{-3/2} = 0.459. \tag{54}$$
If $K_3 = K_3^c$, which is conjectured to be true, (41) yields
$$A_3 = (K_3^c)^{-3/2} = 0.03633. \tag{55}$$
In addition to the improvement in (53) over (54), we also note the additional factor $(d-1)/d$ in (3) which yields a factor $[(d-1)/d]^{d/4}$ when the right sides of (40), (43) are expressed in terms of $\varepsilon(x)$. This factor is 0.7378 for $d = 3$ and 0.7071 for $d = 2$.

Acknowledgement. I should like to thank David Ruelle for stimulating this work and for several helpful conversations.

References

1. Ruelle, D.: Large volume limit of the distribution of characteristic exponents in turbulence. Commun. Math. Phys. 87, 287-302 (1982)
2. Lieb, E., Thirring, W.: Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities. In: Studies in mathematical physics: essays in honor of Valentine Bargmann, Lieb, E., Simon, B., Wightman, A. (eds.), pp. 269-303. Princeton, NJ: Princeton University Press 1976
3. Lieb, E., Thirring, W.: Bound for the kinetic energy of fermions which proves the stability of matter. Phys. Rev. Lett. 35, 687-689 (1975); 35, 1116 (1975) (Erratum)
4. Cwikel, M.: Weak type estimates for singular values and the number of bound states of Schrödinger operators. Ann. Math. 106, 93-100 (1977)
5. Lieb, E.: Bounds on the eigenvalues of the Laplace and Schrödinger operators. Bull. Am. Math. Soc. 82, 751-753 (1976); the details appear in [6]
6. Lieb, E.: The number of bound states of one-body Schrödinger operators and the Weyl problem. Proc. Am. Math. Soc. Symp. in Pure Math., Osserman, R., Weinstein, A. (eds.), Vol. 36, pp. 241-252 (1980). Much of this material is reviewed in Simon, B.: Functional integration and quantum physics, pp. 88-100. New York: Academic Press 1979
7. Rosenbljum, G.: Distribution of the discrete spectrum of singular differential operators. Dokl. Akad. Nauk SSSR 202, 1012-1015 (1972) (MR45 No. 4216). The details are given in: Distribution of the discrete spectrum of singular differential operators. Izv. Vyss. Ucebn. Zaved. Matem. 164, 75-86 (1976) [English transl. Sov. Math. (Iz. VUZ) 20, 63-71 (1976)]
8. Li, P., Yau, S.-T.: On the Schrödinger equation and the eigenvalue problem. Commun. Math. Phys. 88, 309-318 (1983)
9. Aizenman, M., Lieb, E.: On semiclassical bounds for eigenvalues of Schrödinger operators. Phys. Lett. 66 A, 427-429 (1978)
10. Glaser, V., Grosse, H., Martin, A.: Bounds on the number of eigenvalues of the Schrödinger operator. Commun. Math. Phys. 59, 197-212 (1978)
11. Grosse, H.: Quasiclassical estimates on moments of the energy levels. Acta Phys. Austr. 52, 89-105 (1980)
Communicated by A. Jaffe
Received October 21, 1983
Phys. Rev. Lett. 54, 1987-1990 (1985)
PHYSICAL REVIEW LETTERS
VOLUME 54, NUMBER 18
6 MAY 1985

Baryon Mass Inequalities in Quark Models

Elliott H. Lieb
Departments of Mathematics and Physics, Princeton University, Princeton, New Jersey 08544
(Received 28 February 1985)

Recently conjectured three- (and more-) body mass inequalities are investigated for the quark models of baryons in which it is assumed that baryon masses are the ground-state energies of Schrödinger-type operators with pair potentials $V$. It is proved that these inequalities hold (even with a "relativistic" form for the kinetic energy) if $V$ belongs to a certain class (which includes many potentials commonly used), but that they do not hold for all $V$ (even in the nonrelativistic case). One example of our results is $2M(cqs) \ge M(cqq) + M(css)$.

PACS numbers: 12.70.+q, 12.35.Eq, 12.35.Ht, 12.40.Qq
In the nonrelativistic quark model of mesons and baryons, the masses of these composite particles are estimated by the ground-state energies of simple two- (for mesons) or three- (for baryons) body Schrödinger operators with ordinary pair potentials. A question of some recent interest is whether one can derive inequalities among these masses (in this model) that hold for all (or a large class of) pair potentials. Such "potential-independent" results about the masses of mesons and baryons are obviously conceptually interesting. The two- and three-body Hamiltonians have the form
$$H^{(2)} = T_1(x_1) + T_2(x_2) + V(x_1,x_2), \tag{1}$$

$$H^{(3)} = T_1(x_1) + T_2(x_2) + T_3(x_3) + V_{12}(x_1,x_2) + V_{23}(x_2,x_3) + V_{13}(x_1,x_3). \tag{2}$$

Here, $x_1$, $x_2$, $x_3$ are the particle (quark) coordinates and the quantities in parentheses [e.g., $(x_1,x_2)$] denote the particle coordinates on which the operators act. (The dimensionality of space is unimportant in most of the following.) The subscripts (e.g., 1 or 12) are simply labels to designate possibly different operators (e.g., $T_1$ and $T_2$ might be kinetic energy operators with respective masses $m_1$ and $m_2$, but no special form of $T_j$ other than that it is a one-body operator is assumed at this point). The potentials, $V$, are usual multiplication operators; initially, neither translation invariance [$V(x,y) = V(x-y)$] nor symmetry [$V(x,y) = V(y,x)$] is assumed. The ground-state energy of a Hamiltonian is

$$E = \inf\,(\psi, H\psi), \quad \text{with } (\psi,\psi) = 1. \tag{3}$$

The absolute ground state is implied, i.e., no symmetry restriction is imposed on $\psi$. The reason is that there are enough internal quantum numbers (color and flavor) associated with the particles so that the Pauli principle can be satisfied by "internal" antisymmetry.

The known two-body inequality$^{1-3}$ concerns the following situation: Fix $V(x_1,x_2) = V(x_1-x_2)$ in (1) and let $T_1$ and $T_2$ be even functions of the momentum only (i.e., they are translation and inversion invariant). Consider three different systems in which the first two terms in (1) are respectively (a) $T_1,T_1$; (b) $T_1,T_2$; (c) $T_2,T_2$. The desired inequality is

$$E(a) + E(c) \le 2E(b). \tag{4}$$

This can be proved by noting that $E(a)$ is also the ground-state energy of $\tilde H_a = 2T_1(x_1) + V(x_1-x_2)$ and similarly for $E(c)$. [Proof: Since $H_a = \tfrac12(\tilde H_a + \bar H_a)$ with $\bar H_a = 2T_1(x_2) + V(x_1-x_2)$ and since $\bar H_a$ is, by the evenness of $T_1$, unitarily equivalent to $\tilde H_a$, we have $E(a) \ge \tilde E(a)$. Conversely, $\tilde E(a) = \inf\,(\phi, h\phi)$, $(\phi,\phi) = 1$, with $\phi$ being a one-particle function and $h = 2T(x) + V(x)$. As a variational function in (3) for $E(a)$ take $\psi(x_1,x_2) = C\phi(x_1-x_2)\exp[-\varepsilon(x_1+x_2)^2]$ and let $\varepsilon \to 0$. This yields $E(a) \le \tilde E(a)$.] The operator equality $2H(b) = \tilde H(a) + \tilde H(c)$ implies (4). Inequalities similar to (4) have been proved in a field-theoretic context by Weingarten$^4$ and Witten.$^5$ Note that (4) need not hold if the $T$ are arbitrary single-particle operators [e.g., $T = p^2 + U(x)$]. Counterexample I, at the end of this paper, shows this.

An inequality relating $H^{(3)}$ to $H^{(2)}$ has also been derived.$^{6-8}$ Let $T_1$, $T_2$, and $T_3$ be arbitrary single-particle operators. In (2) let $V_{12}$, $V_{23}$, and $V_{13}$ be arbitrary two-body operators (not necessarily multiplication operators). Consider three two-body problems as in (1): (a) $H_a = T_1 + T_2 + 2V_{12}$; (b) $H_b = T_2 + T_3 + 2V_{23}$; (c) $H_c = T_1 + T_3 + 2V_{13}$. Then $2H^{(3)} = H_a + H_b + H_c$, which implies that

$$2E^{(3)} \ge E(a) + E(b) + E(c). \tag{5}$$

If the $V$'s and $T$'s are identical then $2E^{(3)} \ge 3E^{(2)}$ or, in terms of baryon and meson masses, $m_B \ge 3m_M/2$.
© 1985 The American Physical Society
The purpose of this paper is to investigate $n$-body inequalities with $n \ge 3$. I am grateful to P. Taxil and J. M. Richard for communicating the following problem to me and for their very helpful correspondence on the subject.

In (2), let $V_{12} = V_{13} = V_{23} = V$ with $V(x,y) = V(x-y) = V(y-x)$. Let $T_i$ be nonrelativistic kinetic energy operators: $T_i = p^2/2m_i$. Denote the energy by $E(m_1,m_2,m_3)$. Given $m$ and $M$, is it true that

$$E(m,m,m) + E(m,M,M) \le 2E(m,m,M)? \tag{6}$$

The physically interesting case is $m < M$, where $m$ is the mass of a $u$ or $d$ quark and $M$ is the mass of a strange quark, in which case (6) is related to the Gell-Mann$^9$-Okubo$^{10}$ mass formula.

Unfortunately (6) is not true for all $m$, $M$, and $V$, as counterexample II (with $m \gg M$) given at the end of this paper shows. Although it was thought that a straightforward "convexity" argument similar to the proof of (4) and (5) would yield (6), a recent critique$^{11}$ (and our counterexample) dispels this idea. For $m \ll M$, I do not have a counterexample, and the status of (6) could conceivably be different in the two cases $m < M$ and $m > M$.

It will be shown here that for suitable $V$, (6) is indeed true for all $m$ and $M$. This class of $V$ is large enough to include many of the potentials actually used in these quark-model calculations. As partial compensation for the restriction on $V$, a larger class of one-body operators, $T$, will be allowed [in particular, the "relativistic" $T(p) = (p^2c^2 + m^2c^4)^{1/2}$]. Furthermore, our result extends to $n >$ three bodies.

First, let us define the one-body operators, $T$, to be considered. It will always be assumed that the kernel $K_\beta(x,y) = (e^{-\beta T})(x,y)$ is real for all $x,y$ and all $\beta > 0$. This condition is automatically satisfied when $T = T(p) + U(x)$ with $U$ and $T$ real and $T(p) = T(-p)$. We also define a special subclass $A$ by saying that $T \in A$ if $K_\beta(x,y) \ge 0$ for all $x,y$ and all $\beta > 0$. Examples of such $T$'s in $A$ are$^{12}$

$$T = p^2/2m + U(x), \tag{7}$$

$$T = (p^2c^2 + m^2c^4)^{1/2} + U(x), \tag{8}$$

for any real $U(x)$. [Remark: Hidden in this, and the following, is the tacit assumption that various operators such as (1), (2), (7), and (8) are bounded below and self-adjoint. This restricts the singularities of $U$ and $V$ in well-known ways.]

Next we define a class, $B$, of two-body potentials, $V$. We say that $V \in B$ if $V(x,y) = V(y,x)$ and the kernel $L_\beta(x,y) = \exp[-\beta V(x,y)]$ is positive semidefinite for all $\beta > 0$. [This means that $\iint \overline{f(x)}\, f(y)\, L_\beta(x,y)\,dx\,dy \ge 0$ for all $f$ for which the integral is absolutely convergent.] $B$ is a cone, namely if $V_1 \in B$ and $V_2 \in B$ then $aV_1 + bV_2 \in B$ for all $a$ and $b \ge 0$. A subset of $B$ that is physically important is $\tilde B \subset B$ consisting of translation-invariant potentials $V(x-y)$. In this case, the ordinary functions $F_\beta(x) = \exp[-\beta V(x)]$ are known as infinitely divisible distributions. $V \in \tilde B$ if and only if the Fourier transform (FT) of $F_\beta$ is positive (as a distribution) for all $\beta > 0$. The Levy-Khintchine formula$^{12}$ provides a necessary and sufficient condition for $V \in \tilde B$, but it is not particularly transparent.
The physically interesting case is $V(x) = V(r)$, with $r = |x|$. Call this class $\hat B \subset \tilde B$. I shall give two different sufficient conditions for $V(r) \in \hat B$. The first is dimension dependent and is due to Askey$^{13}$ who proves that $f(r)$ has a positive FT in $d$ dimensions if, for all $r > 0$,

$$(-1)^j f^{(j)}(r) \ge 0, \quad 0 \le j \le 2 + d/2, \tag{9}$$

where $f^{(j)} = d^j f/dr^j$, $j = 0, 1, 2, \ldots$. (When $d = 1$, the result goes back to W. H. Young and to Polya.) If $V(r)$ satisfies

$$(-1)^j V^{(j)}(r) \le 0, \quad 1 \le j \le 2 + d/2, \tag{10}$$

then $F_\beta$ will satisfy (9). Thus, in three dimensions, it suffices that $V(r)$ satisfies $V' \ge 0$, $V'' \le 0$, and $V''' \ge 0$.

In many derivations of $V$ from lattice gauge field theory, one automatically gets $V' \ge 0$ and $V'' \le 0$ for the $q\bar q$ potential by reflection positivity. (10) requires only the additional condition $V''' \ge 0$.

Another sufficient condition for $V \in \hat B$ is this: Write $V(r) = W(r^2)$ and demand that for all $s \ge 0$ and $j \ge 1$,

$$(-1)^j W^{(j)}(s) \le 0. \tag{11}$$

If (11) holds then $g(s) = e^{-\beta W(s)}$ satisfies $(-1)^j g^{(j)}(s) \ge 0$ and, by Bernstein's theorem, is the Laplace transform of a positive measure $\mu_\beta$ on $[0,\infty)$, whence $\exp[-\beta V(x)] = \int_0^\infty \exp(-t x^2)\,d\mu_\beta(t)$, which is obviously positive definite. Some potentials in $\hat B$ of this form are (with $a$ and $b \ge 0$)

$$V(r) = (ar^2 + b)^q, \quad 0 \le q \le 1; \qquad V(r) = -(ar^2 + b)^q, \quad q \le 0. \tag{12}$$

In particular, $V(r) = ar^p - br^{-q}$, $a,b,q,p \ge 0$, $p \le 2$, which is a choice frequently employed,$^{2,14}$ is allowed.
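Condition (11) can be spot-checked numerically for the family in (12). The following is a minimal sketch, not part of the original paper: the helper names (`falling_product`, `W_derivative`, `satisfies_condition_11`), the sample points, and the cutoff `j_max` are illustrative assumptions. It uses the closed-form derivatives of $W(s) = \pm(as+b)^q$ and only samples finitely many $s$ and $j$, so it is a sanity check rather than a proof.

```python
def falling_product(q, j):
    """Return q (q-1) ... (q-j+1), the factor from differentiating x**q j times."""
    out = 1.0
    for k in range(j):
        out *= q - k
    return out


def W_derivative(s, j, a, b, q, sign=1.0):
    """j-th derivative in s of W(s) = sign * (a*s + b)**q (closed form)."""
    return sign * falling_product(q, j) * a ** j * (a * s + b) ** (q - j)


def satisfies_condition_11(a, b, q, sign, j_max=8, s_values=(0.0, 0.5, 1.0, 4.0)):
    """Check (-1)**j W^(j)(s) <= 0 for 1 <= j <= j_max at the sampled s values.

    Assumes b > 0 so that (a*s + b) is strictly positive at s = 0.
    """
    return all(
        (-1) ** j * W_derivative(s, j, a, b, q, sign) <= 1e-12
        for j in range(1, j_max + 1)
        for s in s_values
    )


if __name__ == "__main__":
    print(satisfies_condition_11(1.0, 1.0, 0.5, +1.0))   # (a r^2 + b)^q, 0 <= q <= 1
    print(satisfies_condition_11(2.0, 1.0, -1.0, -1.0))  # -(a r^2 + b)^q, q <= 0
    print(satisfies_condition_11(1.0, 1.0, 2.0, +1.0))   # q = 2 lies outside (12)
```

The third call illustrates why the restriction $0 \le q \le 1$ matters: for $q = 2$ the second derivative of $W$ is positive, so (11) fails at $j = 2$.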
Again, note that $\hat B$ is a cone which implies that $V(r) \in \hat B$ if it can be decomposed into $V = V_1 + V_2$ with $V_1$ satisfying (10) and $V_2$ satisfying (11).

Now our theorem can be stated: Let $T_1$, $T_2$, $T_3$ be three one-body operators, with $T_1 \in A$. Let $V_1$, $V_2$, and $V_3$ be two-body potentials with $V_1 \in B$. ($T_2$ and $T_3$ do not have to be in $A$ and $V_2$ and $V_3$ do not have to be in $B$.) Consider the three three-body Hamiltonians

$$H_a = T_1(x_1) + T_2(x_2) + T_2(x_3) + V_{22},$$
$$H_b = T_1(x_1) + T_2(x_2) + T_3(x_3) + V_{23}, \tag{13}$$
$$H_c = T_1(x_1) + T_3(x_2) + T_3(x_3) + V_{33},$$

$$V_{\alpha\beta} = V_\alpha(x_1,x_2) + V_\beta(x_1,x_3) + V_1(x_2,x_3). \tag{14}$$

Then

$$E(a) + E(c) \le 2E(b). \tag{15}$$

Note that $T_1$, $T_2$, and $T_3$ are unrelated, as are $V_1$, $V_2$, $V_3$. Thus, the theorem covers the case of three different quarks, e.g., mass$(cqq)$ + mass$(css)$ $\le 2\,\times$ mass$(cqs)$ if one assumes $V_{qq} = V_{qs} = V_{ss} \in B$.

Proof: Choose some $X = (x_1,x_2,x_3)$ and let $Z_\beta = (e^{-\beta H})(X,X)$. As $\beta \to \infty$, $\beta^{-1}\ln Z_\beta \to -E$, so that it suffices to prove $Z_\beta(a)Z_\beta(c) \ge Z_\beta(b)^2$ for all $\beta > 0$. The Trotter product formula asserts that with

$$Z_\beta^{(N)} = \bigl[(e^{-\beta T/N} e^{-\beta V/N})^N\bigr](X,X),$$

$Z_\beta = \lim_{N\to\infty} Z_\beta^{(N)}$, and it suffices to prove the inequality for each $N$. $Z_\beta^{(N)}$ gives us a polygonal path approximation. Let $X_1$ be a path of $x_1$ [i.e., $N+1$ points $x_1(0) = x_1$, $x_1(\beta/N)$, ..., $x_1(N\beta/N) = x_1$] and similarly for $X_2$, $X_3$. The contribution of the single-particle terms to a path in $H_a$ has the form $F_a = F_1(X_1)F_2(X_2)F_2(X_3)$. For $H_b$ and $H_c$ the terms are $F_b = F_1(X_1)F_2(X_2)F_3(X_3)$ and $F_c = F_1(X_1)F_3(X_2)F_3(X_3)$. By assumption $F_1(X) \ge 0$ and $F_2$ and $F_3$ are real. The contribution of $V$ to a path is of the form $G_2(X_1,X_2)G_2(X_1,X_3)G_1(X_2,X_3)$ for $H_a$, $G_2G_3G_1$ for $H_b$, and $G_3G_3G_1$ for $H_c$. By assumption, $G_2$ and $G_3$ are real functions and $G_1$ is positive definite [reason: $G_1$ is of the form

$$\prod_{j=1}^{N} \exp\bigl[-(\beta/N)\,V_1\bigl(x_2(\beta j/N),\,x_3(\beta j/N)\bigr)\bigr],$$

and the $N$-fold tensor product of positive semidefinite operators is positive semidefinite]. Thus $Z_\beta^{(N)}(a)$ has the form

$$\int d^{3(N-1)}X_1\, F_1(X_1) \Bigl\{ \int d^{3(N-1)}X_2\, d^{3(N-1)}X_3\; G_2(X_1,X_2)F_2(X_2)\,G_2(X_1,X_3)F_2(X_3)\,G_1(X_2,X_3) \Bigr\}.$$

A similar form holds for $Z_\beta^{(N)}(b)$ and $Z_\beta^{(N)}(c)$. For any real number $\lambda$, the quantity $\Delta(\lambda) = Z_\beta^{(N)}(a) + \lambda^2 Z_\beta^{(N)}(c) - 2\lambda Z_\beta^{(N)}(b)$ can be written

$$\Delta(\lambda) = \int d^{3(N-1)}X_1\, F_1(X_1) \Bigl\{ \int d^{3(N-1)}X_2\, d^{3(N-1)}X_3\; J_\lambda(X_1,X_2)\,J_\lambda(X_1,X_3)\,G_1(X_2,X_3) \Bigr\}$$

with $J_\lambda(X,Y) = G_2(X,Y)F_2(Y) - \lambda G_3(X,Y)F_3(Y)$. Since $G_1$ is positive semidefinite, the inner integral (in curly braces) is nonnegative. Since $F_1 \ge 0$, $\Delta(\lambda) \ge 0$. Minimizing $\Delta(\lambda)$ with respect to $\lambda$ yields $Z_\beta^{(N)}(a)Z_\beta^{(N)}(c) \ge [Z_\beta^{(N)}(b)]^2$, Q.E.D.

We remark that the theorem can obviously be extended easily to appropriate two-body operators $V$ (instead of
merely potentials), but we shall not indicate this explicitly. Another extension is to replace $V$ in (14) by a genuine three-body potential $V(x_1,x_2,x_3)$ (the same $V$ for $a$, $b$, $c$) with the property that for every fixed $x_1$ and $\beta > 0$, $e^{-\beta V}$ is positive semidefinite as a function of $x_2$ and $x_3$.

The theorem can also be extended to $n > 3$ bodies as follows. Let $T_1, \ldots, T_n$ be one-body operators with $T_3, \ldots, T_n \in A$. Let $W$ be an arbitrary $(n-2)$-body potential, $U$ an arbitrary $(n-1)$-body potential, and $V$ a two-body potential in $B$. Let

$$H_{\alpha\beta} = T_\alpha(x_1) + T_\beta(x_2) + \sum_{k=3}^{n} T_k(x_k) + W(x_3,\ldots,x_n) + U(x_1,x_3,\ldots,x_n) + U(x_2,x_3,\ldots,x_n) + V(x_1,x_2). \tag{16}$$

Then $E(H_{11}) + E(H_{22}) \le 2E(H_{12})$. Again, obvious generalizations suggest themselves for $n > 3$ as they did for $n = 3$.

Counterexample I. Inequality (4) is false for arbitrary $T_1$, $T_2$. Take $T_1 = p^2 + U(x)$, $T_2 = p^2$, and $V(x,y) = V(x-y)$. Let $U$ be a deep, narrow square well and let $V$ be smooth, but with $V(0) = Q$ a local maximum. In (4) we have (essentially) that $E(a) = e + e + Q$, where $e$ is the ground-state energy of $T_1$. Similarly, $E(b) = e + E_1$ and $E(c) = E_2$, where $E_n$ is the ground-state energy of $h_n = np^2 + V(x)$. For (4) to hold would require $2E_1 \ge E_2 + Q$, but this is false for $Q$ large enough.

Counterexample II. Inequality (6) can be false. We shall show that with $m = \infty$ and $V$ an infinite square well [$V(x) = 0$ for $|x| \le 1$, and $V(x) = \infty$ otherwise] the inequality in (6) is reversed and strict (i.e., $\le$ is replaced by $>$). By continuity, (6) will continue to fail for $m$ finite, but large, and $V$ bounded and smooth.

When $m = \infty$, one simply fixes the coordinates of those particles with mass $m$, then does the minimization in (3) for the other variables and, finally, minimizes the energy with respect to the assumed fixed coordinates (i.e., the Born-Oppenheimer approximation is exact). For $(m,m,m)$ we obviously fix $x_1 = x_2 = x_3 = 0$ (since one cannot do better than $V_{12} = V_{13} = V_{23} = 0$), so that $E(m,m,m) = 0$. For $(m,m,M)$ we note that if $V_{12}$ is ignored, we should clearly take $x_1 = x_2$. [This follows from concavity or by noting that given any $\psi(x_3)$ one should place $x_1$ and $x_2$, which are independent, at the point $x$ that minimizes the effective potential $\int |\psi|^2 V$.] But $x_1 = x_2$ also minimizes $V_{12}$; hence $x_1 = x_2$ is the best choice. Thus $E(m,m,M) = E(\tilde h)$ with $\tilde h = p^2/2M + 2V(x)$. Likewise $E(m,M,M) = E(\hat h)$ with

$$\hat h = p_2^2/2M + p_3^2/2M + V(x_2) + V(x_3) + V(x_2 - x_3).$$

Since $V_{23} \ge 0$, $E(\hat h) > 2E(h) > 0$ with $h = p^2/2M + V(x)$. (It is easy to see that the inequality is strict.) Since $2V = V$, $E(\tilde h) = E(h)$ and thus (6) is reversed.
Helpful conversations with E. Witten and R. Askey are also gratefully acknowledged, as is the partial support of the U. S. National Science Foundation (PHY-8116101-A03).
1. R. Bertlmann and A. Martin, Nucl. Phys. B168, 111 (1980).
2. J. M. Richard and P. Taxil, Ann. Phys. (N.Y.) 150, 267 (1983).
3. S. Nussinov, Phys. Rev. Lett. 52, 966 (1984).
4. D. Weingarten, Phys. Rev. Lett. 51, 1830 (1983).
5. E. Witten, Phys. Rev. Lett. 51, 2351 (1983).
6. J. P. Ader, J. M. Richard, and P. Taxil, Phys. Rev. D 25, 2370 (1982).
7. S. Nussinov, Phys. Rev. Lett. 51, 2081 (1983).
8. J. M. Richard, Phys. Lett. 139B, 408 (1984).
9. M. Gell-Mann, Phys. Rev. 125, 1067 (1962).
10. S. Okubo, Prog. Theor. Phys. 27, 949 (1962).
11. J. M. Richard and P. Taxil, Phys. Rev. Lett. 54, 847 (1985).
12. M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. 4 (Academic, New York, 1978).
13. R. Askey, in Harmonic Analysis on Homogeneous Spaces: Proceedings of the Symposia in Pure Mathematics, Vol. 26 (American Mathematical Society, Providence, 1973), pp. 335-338. In this paper Askey proves the sufficiency of (9) if a conjecture about Bessel functions holds. This he proves for d odd in R. Askey, Trans. Am. Math. Soc. 179, 71 (1973). The even-d case was proved in J. Fields and M. Ismail, J. Math. Anal. 6, 551 (1975).
14. C. Quigg and J. Rosner, Phys. Rep. 56, 167 (1979).
Schrödinger Operators, Proceedings, Sønderborg, Denmark 1988, H. Holden and A. Jensen, eds.

KINETIC ENERGY BOUNDS AND THEIR APPLICATION TO THE STABILITY OF MATTER
Elliott H. Lieb
Departments of Mathematics and Physics, Princeton University
P.O. Box 708, Princeton, NJ 08544
The Sobolev inequality on $\mathbf{R}^n$, $n \ge 3$, is very important because it gives a lower bound for the kinetic energy $\int |\nabla f|^2$ in terms of an $L^p$ norm of $f$. It is the following.

$$\int_{\mathbf{R}^n} |\nabla f|^2 \ge S_n \Bigl\{ \int_{\mathbf{R}^n} |f|^{2n/(n-2)} \Bigr\}^{(n-2)/n} = S_n \|f\|^2_{2n/(n-2)}. \tag{1}$$

Applying Hölder's inequality to the right side we obtain the following modification of (1).

$$\int_{\mathbf{R}^n} |\nabla f|^2 \ge K_n^1 \Bigl\{ \int_{\mathbf{R}^n} \rho^{(n+2)/n} \Bigr\} \Bigl\{ \int_{\mathbf{R}^n} \rho \Bigr\}^{-2/n} = K_n^1\, \|\rho\|_{(n+2)/n}^{(n+2)/n}\, \|\rho\|_1^{-2/n} \tag{2}$$

with $\rho(x) = |f(x)|^2$. The superscript 1 on $K_n^1$ indicates that in (2) we are considering only one function, $f$. Hölder's inequality implies that $K_n^1 \ge S_n$ but, in fact, the sharp value of $K_n^1$ (which can be obtained by solving a nonlinear PDE) is larger than $S_n$. In particular, $K_n^1 > 0$ for all $n \ge 1$, even though $S_n = 0$ for $n < 3$.

Inequality (2), unlike (1), has the following important property: The non-linear term $\int \rho^{(n+2)/n}$ enters with the power 1 (and not $(n-2)/n$) and is therefore "extensive." The price we have to pay for this is the factor $\|f\|_2^{4/n} = \|\rho\|_1^{2/n}$ in the denominator, but since we shall apply (2) to cases in which $\|f\|_2 = 1$ ($L^2$ normalization condition) this is not serious.

Inequality (2) is equivalent to the following: Consider the Schrödinger operator on $\mathbf{R}^n$

$$H = -\Delta - V(x) \tag{3}$$

and let $e_1 = \inf \operatorname{spec}(H)$. (We assume $H$ is self-adjoint.) Let $V_+(x) = \max\{V(x), 0\}$. Then

$$e_1 \ge -L^1_{1,n} \int V_+(x)^{(n+2)/2}\,dx = -L^1_{1,n}\, \|V_+\|_{(n+2)/2}^{(n+2)/2} \tag{4}$$

with

$$L^1_{1,n} = \Bigl(\frac{n}{2K_n^1}\Bigr)^{n/2} \Bigl(1 + \frac{n}{2}\Bigr)^{-(n+2)/2}. \tag{5}$$

The reason for the subscript 1 in $L^1_{1,n}$ will be clarified in eq. (8).
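The relation (5) between the two constants comes from minimizing the one-variable function $K X^{(n+2)/n} - aX$, where $a$ stands for $\|V_+\|_{(n+2)/2}$. The following is a minimal numerical sketch of that step, not taken from the original text; the function names and the sample values of `K`, `a`, and `n` are illustrative assumptions. It compares the closed-form minimum with $-L^1_{1,n}\,a^{(n+2)/2}$ and with a brute-force grid search.

```python
def L1_from_K(K, n):
    """Constant of eq. (5): (n / 2K)^(n/2) * (1 + n/2)^(-(n+2)/2)."""
    return (n / (2.0 * K)) ** (n / 2.0) * (1.0 + n / 2.0) ** (-(n + 2) / 2.0)


def J(X, K, a, n):
    """The function K X^((n+2)/n) - a X that is minimized over X > 0."""
    return K * X ** ((n + 2) / n) - a * X


def closed_form_minimum(K, a, n):
    """Stationary point X* of J and the value J(X*).

    Setting dJ/dX = 0 gives X*^(2/n) = a n / (K (n + 2)).
    """
    p = (n + 2) / n
    X_star = (a / (K * p)) ** (n / 2.0)
    return X_star, J(X_star, K, a, n)


def grid_minimum(K, a, n, steps=20000):
    """Brute-force minimum of J on a uniform grid over (0, 3 X*]."""
    X_star, _ = closed_form_minimum(K, a, n)
    X_max = 3.0 * X_star
    return min(J(X_max * k / steps, K, a, n) for k in range(1, steps + 1))


if __name__ == "__main__":
    K, a, n = 2.0, 3.0, 3  # arbitrary illustrative values
    _, j_min = closed_form_minimum(K, a, n)
    print(j_min, -L1_from_K(K, n) * a ** ((n + 2) / 2.0), grid_minimum(K, a, n))
```

The three printed numbers should agree to within the grid resolution, which is the content of the statement that the optimized bound has the form (4) with the constant (5).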
Here is the proof of the equivalence. We have

$$e_1 \ge \inf\Bigl\{ \int |\nabla f|^2 - \int \rho V_+ \;\Big|\; \|f\|_2 = 1 \text{ and } \rho = |f|^2 \Bigr\}.$$

Use (2) and Hölder to obtain (with $X = \|\rho\|_{(n+2)/n}$)

$$e_1 \ge \inf_X \bigl\{ K_n^1 X^{(n+2)/n} - \|V_+\|_{(n+2)/2}\, X \bigr\}. \tag{6}$$

Minimizing (6) with respect to $X$ yields (4). To go from (4) to (2), take $V = V_+ = \alpha |f|^{4/n} = \alpha \rho^{2/n}$ in (3). Then $-L^1_{1,n}\,\alpha^{(n+2)/2} \int \rho^{(n+2)/n} \le e_1 \le (f, Hf) = \int |\nabla f|^2 - \alpha \int \rho^{(n+2)/n}$. Optimizing this with respect to $\alpha$ yields (2).

So far this is trivial, but now we turn to a more interesting question. Let $e_1 \le e_2 \le \ldots \le 0$ be the negative spectrum of $H$ (which may be empty). Is there a bound of the form

$$\sum_i e_i \ge -L_{1,n} \int V_+(x)^{(n+2)/2}\,dx \tag{7}$$

for some universal, $V$ independent, constant $L_{1,n} > 0$ (which, of course, is $\ge L^1_{1,n}$)? The point is that the right side of (7) has the same form as the right side of (4). More generally, given $\gamma \ge 0$, does

$$\sum_i |e_i|^\gamma \le L_{\gamma,n} \int V_+(x)^{\gamma + n/2}\,dx \tag{8}$$

hold for suitable $L_{\gamma,n}$? When $\gamma = 0$, $\sum |e_i|^0$ is interpreted as the number of $e_i \le 0$. The answer to these questions is yes in the following cases:

$n = 1$: All $\gamma > 1/2$. The case $\gamma = 1/2$ is unsettled. For $\gamma < 1/2$, examples show there can be no bound of the form (8).

$n = 2$: All $\gamma > 0$. There can be no bound when $\gamma = 0$.

$n \ge 3$: All $\gamma \ge 0$.

The cases $\gamma > 0$ were first done in [10], [11]. The $\gamma = 0$ case for $n \ge 3$ was done in [3], [6], [14], with [6] giving the best estimate for $L_{0,n}$. For a review of what is currently known about these constants and conjectures about the sharp values of $L_{\gamma,n}$, see [8]. The proof of (8) is involved (especially when $\gamma = 0$) and will not be given here. It uses the Birman-Schwinger kernel, $V_+^{1/2}(-\Delta + \lambda)^{-1} V_+^{1/2}$.
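There is a natural "guess" for $L_{\gamma,n}$ in terms of a semiclassical approximation (and which is not unrelated to the theory of pseudodifferential operators):

$$\sum_i |e_i|^\gamma \approx (2\pi)^{-n} \iint_{\mathbf{R}^n \times \mathbf{R}^n,\; p^2 \le V(x)} [V(x) - p^2]^\gamma \,dp\,dx \tag{9}$$

$$= L^c_{\gamma,n} \int V_+(x)^{\gamma + n/2}\,dx. \tag{10}$$

From (9),

$$L^c_{\gamma,n} = (4\pi)^{-n/2}\, \Gamma(\gamma + 1)/\Gamma(1 + \gamma + n/2). \tag{11}$$

It is easy to prove that

$$L_{\gamma,n} \ge L^c_{\gamma,n}. \tag{12}$$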
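As an illustrative check of the classical constant, the sketch below (not from the original text; the function names and the midpoint-rule discretization are assumptions made here) evaluates the Gamma-function formula (11) and compares it, for $n = 1$ and a constant potential $V$ on a unit interval, with a direct numerical evaluation of the phase-space integral (9). For a constant $V$ the right side of (10) is simply $L^c_{\gamma,1} V^{\gamma + 1/2}$.

```python
import math


def L_classical(gamma, n):
    """Classical constant of eq. (11)."""
    return (4.0 * math.pi) ** (-n / 2.0) * math.gamma(gamma + 1.0) / math.gamma(1.0 + gamma + n / 2.0)


def phase_space_integral_1d(gamma, V=1.0, steps=200000):
    """(2 pi)^(-1) times the integral of (V - p^2)^gamma over p^2 <= V.

    Midpoint rule in p; V is a positive constant and the x-integration
    runs over an interval of unit length, so it contributes a factor 1.
    """
    r = math.sqrt(V)
    h = 2.0 * r / steps
    total = 0.0
    for k in range(steps):
        p = -r + (k + 0.5) * h
        total += (V - p * p) ** gamma
    return total * h / (2.0 * math.pi)


if __name__ == "__main__":
    print(L_classical(1.5, 1))                      # equals 3/16
    print(phase_space_integral_1d(1.0), L_classical(1.0, 1))
    print(phase_space_integral_1d(1.5, V=2.0), L_classical(1.5, 1) * 2.0 ** 2.0)
```

The value $L^c_{3/2,1} = 3/16$ follows from $\Gamma(5/2) = 3\sqrt{\pi}/4$ and $\Gamma(3) = 2$.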
The evaluation of the sharp $L_{\gamma,n}$ is an interesting open problem, especially $L_{1,n}$.

In particular, for which $\gamma, n$ is $L_{\gamma,n} = L^c_{\gamma,n}$? It is known [1] that for each fixed $n$, $L_{\gamma,n}/L^c_{\gamma,n}$ is nonincreasing in $\gamma$. Thus, if $L_{\gamma_0,n} = L^c_{\gamma_0,n}$ for some $\gamma_0$, then $L_{\gamma,n} = L^c_{\gamma,n}$ for all $\gamma \ge \gamma_0$. In particular, $L_{3/2,1} = L^c_{3/2,1}$ [11], so $L_{\gamma,1} = L^c_{\gamma,1}$ for $\gamma \ge 3/2$. No other sharp values of $L_{\gamma,n}$ are known. It is also known [11] that $L_{\gamma,1} > L^c_{\gamma,1}$ for $\gamma < 3/2$ and $L_{\gamma,n} > L^c_{\gamma,n}$ for $n = 2, 3$ and small $\gamma$.

Just as (4) is related to (2), inequality (7) is related to a generalization of (2). (The proof is basically the same.) Let $\phi_1, \ldots, \phi_N$ be any set of $L^2$ orthonormal functions on $\mathbf{R}^n$ ($n \ge 1$) and define

$$\rho(x) = \sum_{i=1}^{N} |\phi_i(x)|^2, \tag{13}$$

$$T = \sum_{i=1}^{N} \int |\nabla \phi_i|^2. \tag{14}$$

Then we have The Main Inequality

$$T \ge K_n \int \rho(x)^{1+2/n}\,dx \tag{15}$$

with $K_n$ related to $L_{1,n}$ as in (5), i.e.

$$L_{1,n} = \Bigl(\frac{n}{2K_n}\Bigr)^{n/2} \Bigl(1 + \frac{n}{2}\Bigr)^{-(n+2)/2}. \tag{16}$$

The best current value of $K_n$, for $n = 1, 2, 3$ is in [8]; in particular $K_3 \ge 2.7709$. We might call (15) a Sobolev type inequality for orthonormal functions. The point is that if the $\phi_i$ are merely normalized, but not orthogonal, then the best one could say is

$$T \ge N^{-2/n} K_n^1 \int \rho(x)^{1+2/n}\,dx. \tag{17}$$
The orthogonality eliminates the factor $N^{-2/n}$, but replaces $K_n^1$ by the slightly smaller value $K_n$. One should notice, especially, the $N$ dependence in (15). The right side, loosely speaking, is proportional to $N^{(n+2)/n}$, whereas the right side of (17) appears, falsely, to be proportional to $N^1$, which is the best one could hope for without orthogonality. The difference is crucial for applications. In fact, if one is willing to settle for $N^1$ one can proceed directly from (1) (for $n \ge 3$). One then has (with $p = n/(n-2)$)

$$T \ge S_n \Bigl\{ \int \rho(x)^p\,dx \Bigr\}^{1/p} \quad (n \ge 3). \tag{17a}$$

This follows from $\sum_i \bigl\| |\phi_i|^2 \bigr\|_p \ge \bigl\| \sum_i |\phi_i|^2 \bigr\|_p$. Eq. (11) gives a "classical guess" for $L_{\gamma,n}$. Using that, together with (16), we have a "classical guess" for $K_n$, namely

$$K_n^c = 4\pi n\, \Gamma\Bigl(\frac{n+2}{2}\Bigr)^{2/n} \Big/ (2+n) = \tfrac35 (6\pi^2)^{2/3} = 9.1156 \text{ for } n = 3. \tag{18}$$
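The numerical value in (18) and its consistency with (11) and (16) can be verified directly. The sketch below is an illustration added here, not part of the original text; the function names are assumptions. It evaluates $K_n^c$, compares the $n = 3$ value with $\tfrac35(6\pi^2)^{2/3}$, and confirms that inserting $K_n^c$ into (16) reproduces the classical constant $L^c_{1,n}$ of (11).

```python
import math


def K_classical(n):
    """Classical guess of eq. (18): 4 pi n Gamma((n+2)/2)^(2/n) / (2 + n)."""
    return 4.0 * math.pi * n * math.gamma((n + 2) / 2.0) ** (2.0 / n) / (2.0 + n)


def L_from_K(K, n):
    """Relation (16) between the kinetic-energy and eigenvalue-sum constants."""
    return (n / (2.0 * K)) ** (n / 2.0) * (1.0 + n / 2.0) ** (-(n + 2) / 2.0)


def L_classical(gamma, n):
    """Classical constant of eq. (11)."""
    return (4.0 * math.pi) ** (-n / 2.0) * math.gamma(gamma + 1.0) / math.gamma(1.0 + gamma + n / 2.0)


if __name__ == "__main__":
    print(K_classical(3), 0.6 * (6.0 * math.pi ** 2) ** (2.0 / 3.0))
    for n in (1, 2, 3):
        print(n, L_from_K(K_classical(n), n), L_classical(1.0, n))
```

For $n = 3$ the two expressions in the first line agree because $4\,(3/4)^{2/3} = 6^{2/3}$.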
Since $L_{1,n} \ge L^c_{1,n}$, we have $K_n \le K_n^c$. A conjecture in [11] is that $K_3 = K_3^c$, and it would be important to settle this.

Inequality (15) can be easily extended to the following: Let $\psi(x_1,\ldots,x_N) \in L^2((\mathbf{R}^n)^N)$, $x_i \in \mathbf{R}^n$. Suppose $\|\psi\|_2 = 1$ and $\psi$ is antisymmetric in the $N$ variables, i.e.,

$$\psi(x_1,\ldots,x_i,\ldots,x_j,\ldots,x_N) = -\psi(x_1,\ldots,x_j,\ldots,x_i,\ldots,x_N).$$

Define

$$\rho_i(x) = \int |\psi(x_1,\ldots,x_{i-1},x,x_{i+1},\ldots,x_N)|^2\,dx_1 \ldots \widehat{dx_i} \ldots dx_N, \tag{19}$$

$$T_i = \int |\nabla_i \psi|^2\,dx_1 \ldots dx_N, \tag{20}$$

$$\rho(x) = \sum_{i=1}^{N} \rho_i(x), \qquad T = \sum_{i=1}^{N} T_i. \tag{21}$$

(Note that $\rho(x) = N\rho_1(x)$ and $T = NT_1$ since $\psi$ is antisymmetric, but the general form (19)-(21) will be used in the next paragraph.) Then (15) holds with $\rho$ and $T$ given by (19)-(21) (with the same $K_n$ as in (15)). This is a generalization of (13)-(15) since we can take $\psi(x_1,\ldots,x_N) = (N!)^{-1/2} \det\{\phi_i(x_j)\}_{i,j=1}^{N}$, which leads to (13) and (14).
A variant of (15) is given in (52) below. It is a consequence of the fact that (17) and (17a) also hold with the definitions (19)-(21). Antisymmetry of $\psi$ is not required. The proof of (17a) just uses (1) as before plus Minkowski's inequality, namely for $p \ge 1$

$$\int \Bigl\{ \int |F(x,y)|^p\,dy \Bigr\}^{1/p} dx \ge \Bigl\{ \int \Bigl\{ \int |F(x,y)|\,dx \Bigr\}^p dy \Bigr\}^{1/p}.$$

We turn now to some applications of these inequalities.
Application 1. Inequality (15) can be used to bound $L^p$ norms of Riesz and Bessel potentials of orthonormal functions [7]. Again, $\phi_1, \ldots, \phi_N$ are $L^2$ orthonormal and let

$$u_i = (-\Delta + m^2)^{-1/2} \phi_i, \tag{22}$$

$$\rho(x) = \sum_{i=1}^{N} |u_i(x)|^2. \tag{23}$$

Then there are constants $L$, $B_p$, $A_n$ (independent of $m$) such that

$$n = 1: \quad \|\rho\|_\infty \le L/m, \quad m > 0, \tag{24}$$

$$n = 2: \quad \|\rho\|_p \le B_p\, m^{-2/p} N^{1/p}, \quad 1 \le p < \infty,\; m > 0, \tag{25}$$

$$n \ge 3: \quad \|\rho\|_p \le A_n N^{1/p}, \quad p = n/(n-2),\; m \ge 0. \tag{26}$$

If the orthogonality condition is dropped then the right sides of (24)-(26) have to be multiplied by $N$, $N^{1-1/p}$, $N^{1-1/p}$ respectively. Possibly the absence of $N$ in (24) is the most striking. Similar results can be derived [7] for $(-\Delta + m^2)^{-\alpha/2}$ in place of $(-\Delta + m^2)^{-1/2}$, with $\alpha < n$ when $m = 0$.

Inequality (15) also has applications in mathematical physics.
Application 2. (Navier-Stokes equation.) Suppose $\Omega \subset \mathbf{R}^n$ is an open set with finite volume $|\Omega|$ and consider $H = -\Delta - V(x)$ on $\Omega$ with Dirichlet boundary conditions. Let $\lambda_1 \le \lambda_2 \le \ldots$ be the eigenvalues of $H$. Let $\tilde N$ be the smallest integer $N$ such that

$$E_N \equiv \sum_{i=1}^{N} \lambda_i \ge 0. \tag{27}$$

We want to find an upper bound for $\tilde N$.

If $\phi_1, \phi_2, \ldots$ are the normalized eigenfunctions then, from (13)-(15) with $\phi_1, \ldots, \phi_N$,

$$E_N = T - \int \rho V \ge K_n \int \rho^{1+2/n} - \int V_+ \rho \ge G(\rho), \tag{28}$$

where (with $p = 1 + 2/n$ and $q = 1 + n/2$, so that $1/p + 1/q = 1$)

$$G(\rho) = K_n \|\rho\|_p^p - \|V_+\|_q \|\rho\|_p. \tag{29}$$

Thus, for all $N$,

$$E_N \ge \inf\{ G(\rho) \mid \|\rho\|_1 = N,\; \rho(x) \ge 0 \}. \tag{30}$$

But $\|\rho\|_p |\Omega|^{1/q} \ge \|\rho\|_1 = N$ so, with $X = \|\rho\|_p$,

$$E_N \ge \inf\{ J(X) \mid X \ge N |\Omega|^{-1/q} \} \tag{31}$$

where

$$J(X) = K_n X^p - \|V_+\|_q X. \tag{32}$$

Now $J(X) \ge 0$ for $X \ge X_0 = \{\|V_+\|_q / K_n\}^{1/(p-1)}$, whence we have the following implication:

$$N \ge |\Omega|^{1/q} \{\|V_+\|_q / K_n\}^{1/(p-1)} \;\Longrightarrow\; E_N \ge 0. \tag{33}$$

Therefore

$$\tilde N \le |\Omega|^{1/q} \{\|V_+\|_q / K_n\}^{1/(p-1)}. \tag{34}$$
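The step from (32) to (34) is elementary algebra, and a short sketch makes the exponents concrete. This is an illustration added here under stated assumptions: it takes the conjugate pair $p = 1 + 2/n$, $q = 1 + n/2$ (the pair for which $\int \rho^{1+2/n} = \|\rho\|_p^p$ and Hölder's inequality applies), and the function names and sample numbers are hypothetical.

```python
def exponents(n):
    """Conjugate exponents used above: p = 1 + 2/n for rho, q = 1 + n/2 for V_+."""
    return 1.0 + 2.0 / n, 1.0 + n / 2.0


def J(X, K, V_norm_q, p):
    """J(X) = K X^p - ||V_+||_q X, as in (32)."""
    return K * X ** p - V_norm_q * X


def X0(K, V_norm_q, p):
    """Positive zero of J: X0 = (||V_+||_q / K)^(1/(p-1))."""
    return (V_norm_q / K) ** (1.0 / (p - 1.0))


def dimension_bound(volume, V_norm_q, K, n):
    """Right side of (34): |Omega|^(1/q) * (||V_+||_q / K)^(1/(p-1))."""
    p, q = exponents(n)
    return volume ** (1.0 / q) * X0(K, V_norm_q, p)


if __name__ == "__main__":
    n, K = 3, 2.7709            # K_3 lower bound quoted in the text
    p, q = exponents(n)         # p = 5/3, q = 5/2 in three dimensions
    print(p, q, 1.0 / p + 1.0 / q)
    print(dimension_bound(volume=8.0, V_norm_q=10.0, K=K, n=n))
```

In three dimensions this gives $1/(p-1) = 3/2$, so the bound scales as $|\Omega|^{2/5}\,\|V_+\|_{5/2}^{3/2}$.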
The bound (34) can be applied [8] (following an idea of Ruelle) to the Navier-Stokes equation. There, $\tilde N$ is interpreted as the Hausdorff dimension of an attracting set for the N-S equation, while $V(x) = \nu^{-3/2}\varepsilon(x)^{1/2}$, where $\varepsilon(x) = \nu|\nabla v(x)|^2$ is the average energy dissipation per unit mass in a flow $v$, and $\nu$ is the viscosity.
Application 3. (Stability of matter.) This is the original application [10,11]. In the quantum mechanics of Coulomb systems (electrons and nuclei) one wants a lower bound for the Hamiltonian operator:

$$H = -\sum_{i=1}^{N} \Delta_i - \sum_{i=1}^{N}\sum_{j=1}^{K} z_j |x_i - R_j|^{-1} + \sum_{1 \le i < j \le N} |x_i - x_j|^{-1} + \sum_{1 \le i < j \le K} z_i z_j |R_i - R_j|^{-1} \tag{35}$$

on the $L^2$ space of antisymmetric functions $\psi(x_1,\ldots,x_N)$, $x_i \in \mathbf{R}^3$. Here, $N$ is the number of electrons (with coordinates $x_i$) and $R_1,\ldots,R_K \in \mathbf{R}^3$ are fixed vectors representing the locations of fixed nuclei of charges $z_1,\ldots,z_K > 0$. The desired bound is linear:

$$H \ge -A(N + K) \tag{36}$$

for some $A$ independent of $N$, $K$, $R_1,\ldots,R_K$ (assuming all $z_i <$ some $z$). The main point is that antisymmetry of $\psi$ is crucial for (36) and this is reflected in the fact that (15) holds with antisymmetry, but only (17) holds without it. Without the antisymmetry condition, $H$ would grow as $-(N+K)^{5/3}$. This is discussed in Application 6 below.

By using (15) one can eliminate the differential operators $\Delta_i$. The functional $\psi \to (\psi, H\psi)$, with $(\psi,\psi) = 1$, can be bounded below using (15) by a functional (called the Thomas-Fermi functional) involving only $\rho(x)$ defined in (21). The minimization of this latter functional with respect to $\rho$ is tractable and leads to (36).
Application 4. (Stellar structure.) Going from atoms to stars, we now consider $N$ neutrons which attract each other gravitationally with a coupling constant $\kappa = Gm^2$, where $G$ is the gravitational constant and $m$ is the neutron mass. There are no Coulomb forces. Moreover, a "relativistic" form is assumed for the kinetic energy, which means that $-\Delta$ is replaced by $(-\Delta)^{1/2}$. Thus (35) is replaced by

$$H_N = \sum_{i=1}^{N} (-\Delta_i)^{1/2} - \kappa \sum_{1 \le i < j \le N} |x_i - x_j|^{-1} \tag{37}$$

(again on antisymmetric functions). One finds asymptotically for large $N$, that

$$\inf \operatorname{spec}(H_N) = 0 \;\text{ if } \kappa \le CN^{-2/3}, \qquad \inf \operatorname{spec}(H_N) = -\infty \;\text{ if } \kappa > CN^{-2/3} \tag{38}$$

for some constant, $C$. Without antisymmetry, $N^{-2/3}$ must be replaced by $N^{-1}$. Equation (38) is proved in [12]. An important role is played by Daubechies's generalization [4] of (15) to the operator $(-\Delta)^{1/2}$ on $L^2(\mathbf{R}^n)$, namely (for antisymmetric $\psi$ with $\|\psi\|_2 = 1$)

$$\Bigl(\psi, \sum_{i=1}^{N} (-\Delta_i)^{1/2} \psi\Bigr) \ge B_n \int \rho(x)^{1+1/n}\,dx \tag{39}$$

with $\rho$ given by (19), (21). In general, one has

$$\Bigl(\psi, \sum_{i=1}^{N} (-\Delta_i)^{p} \psi\Bigr) \ge C_{p,n} \int \rho(x)^{1+2p/n}\,dx. \tag{40}$$
Recently [13] there has been considerable progress in this problem beyond that in [12]. Among other results there is an evaluation of the sharp asymptotic $C$ in (38), i.e. if we first define $\kappa^c(N)$ to be the precise value of $\kappa$ at which $\inf \operatorname{spec}(H_N) = -\infty$, we then define

$$C = \lim_{N \to \infty} N^{2/3} \kappa^c(N). \tag{41}$$

Let $B_n^c$ be the "classical guess" in (39). This can be calculated from the analogue of (9) (using $|p|$ instead of $p^2$, and which leads to $\sum |e_i| \approx L^c \int V_+^{n+1}$) and then from the analogue of (16), namely $L = C_n B_n^{-n}$. One finds $B_3^c = (3/4)(6\pi^2)^{1/3}$ (cf. (18)). Using $B_3^c$, we introduce the functional

$$\mathcal{E}^c(\rho) = B_3^c \int \rho^{4/3} - \tfrac12 \kappa \iint \rho(x)\rho(y)|x - y|^{-1}\,dx\,dy \tag{42}$$

for $\rho \in L^1(\mathbf{R}^3) \cap L^{4/3}(\mathbf{R}^3)$ and define the energy

$$E^c(N) = \inf\Bigl\{ \mathcal{E}^c(\rho) \;\Big|\; \int \rho = N \Bigr\}. \tag{43}$$

One finds there is a finite $\alpha^c > 0$ such that $E^c(N) = 0$ if $\kappa N^{2/3} \le \alpha^c$ and $E^c(N) = -\infty$ if $\kappa N^{2/3} > \alpha^c$. (This $\alpha^c$ is found by solving a Lane-Emden equation.)

Now (42) and (43) constitute the semiclassical approximation to $H_N$ in the following sense. We expect that if we set $\kappa = \alpha N^{-2/3}$ in (37), with $\alpha$ fixed, then if $\alpha < \alpha^c$

$$\lim_{N \to \infty} \inf \operatorname{spec}(H_N) = 0 \tag{44}$$

while if $\alpha > \alpha^c$ there is an $N_0$ such that

$$\inf \operatorname{spec}(H_N) = -\infty \;\text{ if } N > N_0. \tag{45}$$

Indeed, (44) and (45) are true [13], and thus $\alpha^c$ is the sharp asymptotic value of $C$ in (38).

An interesting point to note is that Daubechies's $B_3$ in (39) is about half of $B_3^c$. The sharp value of $B_3$ is unknown. Nevertheless, with some additional tricks one can get from (37) to (42) with $B_3^c$ and not $B_3$. Inequality (39) plays a role in [13], but it is not sufficient.
Application 5. (Stability of atoms in magnetic fields.) This is given in [9]. Here $\psi(x_1,\ldots,x_N)$ becomes a spinor-valued function, i.e. $\psi$ is an antisymmetric function in $\bigwedge^N L^2(\mathbf{R}^3; \mathbf{C}^2)$. The operator $H$ of interest is as in (35) but with the replacement

$$-\Delta \to \{\sigma \cdot (i\nabla - A(x))\}^2 \tag{46}$$

where $\sigma_1, \sigma_2, \sigma_3$ are the $2 \times 2$ Pauli matrices (i.e. generators of SU(2)) and $A(x)$ is a given vector field (called the magnetic vector potential). Let

$$E_0(A) = \inf \operatorname{spec}(H) \tag{47}$$

after the replacement of (46) in (35). As $A \to \infty$ (in a suitable sense), $E_0(A)$ can go to $-\infty$. The problem is this: Is
E(A) = Eo(A) + I 1(,,r] A)2
(48)
bounded below for all $A$? In [9] the problem is resolved for $K = 1$, all $N$ and $N = 1$, all $K$. It turns out that $E(A)$ is bounded below in these cases if and only if all the $z_i$ satisfy $z_i < z^c$ where $z^c$ is some fixed constant independent of $N$ and $K$. The problem is still open for all $N$ and all $K$.

One of the main problems in bounding $E(A)$ is to find a lower bound for the kinetic energy (the first term in (35) after the replacement given in (46)) for an antisymmetric $\psi$. First, there is the identity

$$\Bigl(\psi, \sum_{i=1}^{N} \{\sigma_i \cdot (i\nabla_i - A(x_i))\}^2 \psi\Bigr) = T(\psi, A) - \Bigl(\psi, \sum_{i=1}^{N} \sigma_i \cdot B(x_i)\, \psi\Bigr) \tag{49}$$

with $B = \operatorname{curl} A$ being the magnetic field and

$$T(\psi, A) = \Bigl(\psi, \sum_{i=1}^{N} [i\nabla_i - A(x_i)]^2 \psi\Bigr). \tag{50}$$

The last term on the right side of (49) can be controlled, so it will be ignored here. The important term is $T(\psi, A)$. Since Pauli matrices do not appear in (50) we can now let $\psi$ be an ordinary complex valued (instead of spinor valued) function.

It turns out that (8), and hence (15), hold with some $\tilde L_{\gamma,n}$ which is independent of $A$. The $T$ in (15) is replaced, of course, by the $T(\psi, A)$ of (50). To be more precise, the sharp constants $L_{\gamma,n}$ and $\tilde L_{\gamma,n}$ are unknown (except for $\gamma \ge 3/2$, $n = 1$ in the case of $L_{\gamma,n}$) and conceivably $\tilde L_{\gamma,n} > L_{\gamma,n}$. However, all the current bounds for $L_{\gamma,n}$ (see [8]) also hold for $\tilde L_{\gamma,n}$. Thus, for $n = 3$ we have

$$T(\psi, A) \ge K_3 \int \rho^{5/3} \tag{51}$$

with $K_3$ being the value given in [8], namely 2.7709.
However, in [9] another inequality is needed:

$$T(\psi, A) \ge C \Bigl\{ \int \rho^2 \Bigr\}^{2/3}. \tag{52}$$

It seems surprising that we can go from an $L^{5/3}$ estimate to an $L^2$ estimate, but the surprise is diminished if (17a) with its $L^3$ estimate is recalled. First note that (1) holds (with the same $S_n$) if $|\nabla f|^2$ is replaced by $|[i\nabla - A(x)]f|^2$. (By writing $f = |f|e^{i\theta}$ one finds that $|\nabla |f||^2 \le |[i\nabla - A(x)]f|^2$.) Then (17a) holds since only convexity was used. Thus, using the mean of (15) and (17a),

$$T(\psi, A) \ge (S_3 K_3)^{1/2}\, \|\rho\|_3^{1/2}\, \|\rho\|_{5/3}^{5/6}. \tag{53}$$

An application of Hölder's inequality yields (52) with $C^2 = S_3 K_3$.
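The Hölder step from (53) to (52) is an $L^p$ interpolation between $L^3$ and $L^{5/3}$. The sketch below is an illustration added here, not part of the original text: the function names and the sample density are hypothetical. It computes the interpolation weight $\theta$ from $1/2 = \theta/3 + (1-\theta)\cdot 3/5$ and checks numerically, on a discretized density, that $\{\int \rho^2\}^{2/3} = \|\rho\|_2^{4/3} \le \|\rho\|_3^{1/2}\,\|\rho\|_{5/3}^{5/6}$.

```python
import math


def interpolation_theta(p0, p1, r):
    """Solve 1/r = theta/p0 + (1 - theta)/p1 for theta."""
    return (1.0 / r - 1.0 / p1) / (1.0 / p0 - 1.0 / p1)


def lp_norm(values, p, cell):
    """Discrete L^p norm of nonnegative samples on cells of measure `cell`."""
    return (sum(v ** p for v in values) * cell) ** (1.0 / p)


def holder_sides(values, cell):
    """Return (||rho||_2^(4/3), ||rho||_3^(1/2) * ||rho||_(5/3)^(5/6))."""
    lhs = lp_norm(values, 2.0, cell) ** (4.0 / 3.0)
    rhs = lp_norm(values, 3.0, cell) ** 0.5 * lp_norm(values, 5.0 / 3.0, cell) ** (5.0 / 6.0)
    return lhs, rhs


if __name__ == "__main__":
    theta = interpolation_theta(3.0, 5.0 / 3.0, 2.0)
    print(theta, (4.0 / 3.0) * theta, (4.0 / 3.0) * (1.0 - theta))  # 3/8, 1/2, 5/6

    # Hypothetical one-dimensional sample density: a Gaussian plus a narrow bump.
    h = 0.01
    xs = [-5.0 + h * k for k in range(1001)]
    rho = [math.exp(-x * x) + 3.0 * math.exp(-50.0 * (x - 1.0) ** 2) for x in xs]
    print(holder_sides(rho, h))
```

With $\theta = 3/8$, raising $\|\rho\|_2 \le \|\rho\|_3^{\theta}\,\|\rho\|_{5/3}^{1-\theta}$ to the power $4/3$ gives exactly the exponents $1/2$ and $5/6$ appearing in (53).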
Application 6. (Instability of bosonic matter.) As remarked in Application 3, dropping the antisymmetry requirement on $\psi$ (the particles are now bosons) makes $\inf \operatorname{spec}(H)$ diverge as $-(N+K)^{5/3}$. The extra power 2/3, relative to (36), can be traced directly to the factor $N^{-2/3}$ in (17). An interesting problem is to allow the positive particles also to be movable and to have charge $z_i = 1$. This should raise $\inf \operatorname{spec} H$, but by how much? For $2N$ particles the new $H$ is

$$H = -\sum_{i=1}^{2N} \Delta_i + \sum_{1 \le i < j \le 2N} e_i e_j |x_i - x_j|^{-1} \tag{54}$$

with $e_i = +1$ for $1 \le i \le N$ and $e_i = -1$ for $N < i \le 2N$. Dyson [5] proved that $\inf \operatorname{spec}(H) \le -AN^{7/5}$ for some $A > 0$. Thus, stability (i.e. a linear law (36)) is not restored, but the question of whether the correct exponent is 7/5 or 5/3, or something in between, remained open. It has now been proved [2] that $N^{7/5}$ is correct, $\inf \operatorname{spec}(H) \ge -BN^{7/5}$. The proof is much harder than for (36) because no simple semiclassical theory (like Thomas-Fermi theory) is a good approximation to $H$. Correlations are crucial.
Application 7. (Stability of relativistic matter.) Let us return to Coulomb systems (electrons and nuclei) as in application 3, but with (35) replaced by N
H=
{(-0;+m2)1/2 -m} +QVV(x1i...,xN;R1,...,RK)
(55)
with a = e2 = electron charge squared (and h = c = 1) and where N
VV =
326
-
E
K E zjlxi - RiI-1 +
i=1 j=1
E Ix; - xjI_1 + E zizjIR; - Rj[-1 1
(56)
Kinetic Energy Bounds and Their Application to the Stability of Matter
381
is the Coulomb potential. The electron charge, $\alpha^{1/2}$, is explicitly displayed in (55) for a reason to be discussed presently. Also (55) differs from (35) in that the kinetic energy operator $-\Delta$ is replaced by the relativistic form $(-\Delta + m^2)^{1/2} - m$, where m is the electron mass. Since $(-\Delta)^{1/2} - m \le (-\Delta + m^2)^{1/2} - m \le (-\Delta)^{1/2}$, the difference of these two operators is a bounded operator and therefore, as far as the stability question is concerned, we may as well use the simplest operator $(-\Delta)^{1/2}$ in (55), which will be done henceforth. This, in fact, was already done in (37). We define

$$E_{N,K}(R_1,\dots,R_K) = \inf \operatorname{spec} H \tag{57}$$

$$E_{N,K} = \inf_{R_1,\dots,R_K} E_{N,K}(R_1,\dots,R_K) \tag{58}$$

$$E = \inf_{N,K} E_{N,K}. \tag{59}$$

Under scaling (dilation of coordinates in $\mathbb{R}^{3N+3K}$) the operators $(-\Delta)^{1/2}$ and $|x|^{-1}$ behave the same (proportional to (length)$^{-1}$) and hence we conclude that

$$E_{N,K} = 0 \quad \text{or} \quad -\infty. \tag{60}$$
The system is said to be stable if E = 0. For simplicity of exposition let us take all $z_j$ to be some common value, z. For the hydrogenic atom N = K = 1 the only constant that appears is the combination $z\alpha$. It is known that $E_{1,1} = 0$ if and only if $z\alpha \le 2/\pi$. In the many-body case there are two constants (which can be taken to be $z\alpha$ and $\alpha$) and the question is whether the system is stable all the way up to $z\alpha = 2/\pi$ for $\alpha$ less than some small, but fixed, $\alpha_c > 0$. The answer will depend on q, the number of spin states allowed for the fermionic electrons. (Note: in application 3 we implicitly took q = 1. In fact q = 2 in nature. To say that there are q spin states means that under permutations $\psi(x_1,\dots,x_N)$ belongs to a Young tableau of q or fewer columns.) This problem is resolved in [15] where it is shown that stability occurs if

$$q\alpha \le 1/47. \tag{61}$$

The kinetic energy bound (39) plays a crucial role in the proof (but, of course, many other inequalities are also needed). It is also shown in [15] that stability definitely fails to occur if

$$\alpha > 36\, q^{-1/3} z^{2/3} \tag{62}$$

or if

$$\alpha > 128/15\pi. \tag{63}$$

If (63) holds then instability occurs for every z > 0, no matter how small.
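As a rough arithmetic illustration of conditions (61) to (63), the following sketch evaluates them for given q, z and α. The function names and the numerical value used for the fine structure constant (about 1/137.036) are assumptions introduced here, not taken from the text, and the sufficient condition for stability is read as (61) together with the hydrogenic restriction zα ≤ 2/π.

```python
import math

# Approximate physical fine structure constant (assumed value, not from the text).
ALPHA_PHYSICAL = 1.0 / 137.036

def is_stable_sufficient(q, z, alpha):
    """Sufficient condition for stability: (61) together with z*alpha <= 2/pi."""
    return q * alpha <= 1.0 / 47.0 and z * alpha <= 2.0 / math.pi

def is_unstable(q, z, alpha):
    """True if either instability condition, (62) or (63), holds."""
    by_62 = alpha > 36.0 * q ** (-1.0 / 3.0) * z ** (2.0 / 3.0)
    by_63 = alpha > 128.0 / (15.0 * math.pi)
    return by_62 or by_63

def largest_stable_charge(alpha):
    """Largest nuclear charge z compatible with z*alpha <= 2/pi."""
    return 2.0 / (math.pi * alpha)

if __name__ == "__main__":
    print("q = 2, z = 1 stable:", is_stable_sufficient(2, 1, ALPHA_PHYSICAL))
    print("largest z allowed by z*alpha <= 2/pi:", largest_stable_charge(ALPHA_PHYSICAL))
```

The two functions are deliberately not complements of each other: between the sufficient condition for stability and the conditions for instability there is a range of parameters about which (61) to (63) say nothing.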
Schrödinger Operators, Proceedings Sønderborg Denmark 1988, H. Holden and A. Jensen eds.
REFERENCES
[1] M. Aizenman and E.H. Lieb, On semiclassical bounds for eigenvalues of Schrödinger operators, Phys. Lett. 66A, 427-429 (1978).
[2] J. Conlon, E.H. Lieb and H.-T. Yau, The $N^{7/5}$ law for bosons, Commun. Math. Phys. (submitted).
[3] M. Cwikel, Weak type estimates for singular values and the number of bound states of Schrödinger operators, Ann. Math. 106, 93-100 (1977).
[4] I. Daubechies, Commun. Math. Phys. 90, 511-520 (1983).
[5] F.J. Dyson, Ground state energy of a finite system of charged particles, J. Math. Phys. 8, 1538-1545 (1967).
[6] E.H. Lieb, The number of bound states of one-body Schrödinger operators and the Weyl problem, A.M.S. Proc. Symp. in Pure Math. 36, 241-251 (1980). The results were announced in Bull. Amer. Math. Soc. 82, 751-753 (1976).
[7] E.H. Lieb, An $L^p$ bound for the Riesz and Bessel potentials of orthonormal functions, J. Funct. Anal. 51, 159-165 (1983).
[8] E.H. Lieb, On characteristic exponents in turbulence, Commun. Math. Phys. 92, 473-480 (1984).
[9] E.H. Lieb and M. Loss, Stability of Coulomb systems with magnetic fields: II. The many-electron atom and the one-electron molecule, Commun. Math. Phys. 104, 271-282 (1986).
[10] E.H. Lieb and W.E. Thirring, Bounds for the kinetic energy of fermions which proves the stability of matter, Phys. Rev. Lett. 35, 687-689 (1975). Errata 35, 1116 (1975).
[11] E.H. Lieb and W.E. Thirring, "Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities" in Studies in Mathematical Physics (E. Lieb, B. Simon, A. Wightman eds.) Princeton University Press, 1976, pp. 269-304.
[12] E.H. Lieb and W.E. Thirring, Gravitational collapse in quantum mechanics with relativistic kinetic energy, Ann. of Phys. (NY) 155, 494-512 (1984).
[13] E.H. Lieb and H.-T. Yau, The Chandrasekhar theory of stellar collapse as the limit of quantum mechanics, Commun. Math. Phys. 112, 147-174 (1987).
[14] G.V. Rosenbljum, Distribution of the discrete spectrum of singular differential operators. Dokl. Akad. Nauk SSSR 202, 1012-1015 (1972). (MR 45 #4216). The details are given in Izv. Vyss. Ucebn. Zaved. Matem. 164, 75-86 (1976). (English trans. Sov. Math. (Iz VUZ) 20, 63-71 (1976).)
[15] E.H. Lieb and H.-T. Yau, The stability and instability of relativistic matter, Commun. Math. Phys. 118, 177-213 (1988). For a short summary see: Many body stability implies a bound on the fine structure constant, Phys. Rev. Lett. 61, 1695-1697 (1988).
With D. Hundertmark and L.E. Thomas in Adv. Theor. Math. Phys. 2, 719-731 (1998)
© The Authors Adv. Theor. Math. Phys. 2 (1998) 719 - 731
A sharp bound for an eigenvalue moment of the one-dimensional Schrödinger operator

Dirk Hundertmark (a, 1), Elliott H. Lieb (a), Lawrence E. Thomas (b)

(a) Department of Physics and Mathematics, Jadwin Hall, Princeton University, P.O. Box 708, Princeton, New Jersey 08544
(b) Department of Mathematics, University of Virginia, Charlottesville, Virginia 22903

Abstract. We give a proof of the Lieb-Thirring inequality in the critical case $d = 1$, $\gamma = 1/2$, which yields the best possible constant.

e-print archive: http://xxx.lanl.gov/abs/hep-th/9806012
(1) On leave from NWF I - Mathematik, Universität Regensburg, D-93040 Regensburg
© The Authors. Reproduction of this article in its entirety, by any means, is permitted for non-commercial purposes.
1 Introduction
There is a family of inequalities [9], [10] that has proved to be useful in various areas of mathematical physics, especially in the proofs of stability of matter. They state that given a Schrödinger operator $-\Delta + V$ on $L^2(\mathbb{R}^d)$, the sum of the moments of the negative eigenvalues $-E_1 < -E_2 \le -E_3 \le \dots < 0$ (if any) of this operator is bounded by

$$\sum_i E_i^{\gamma} \le L_{\gamma,d} \int_{\mathbb{R}^d} V_-(x)^{\gamma + d/2}\, dx \tag{1}$$

with $V_-(x) := \max(-V(x),0)$. These inequalities have been generalized in several directions, e.g. manifolds instead of $\mathbb{R}^d$. Here we are concerned with the case d = 1. The cases originally shown to hold [10] are $d = 1$, $\gamma > 1/2$; $d = 2$, $\gamma > 0$; and $d \ge 3$, $\gamma > 0$. When d = 2 there cannot be any bound for $\gamma = 0$ (meaning the number of negative eigenvalues) since at least one negative eigenvalue always exists for arbitrarily small negative perturbations of the free Laplacian in two dimensions [5, pages 156-157], [15].

The critical case $d \ge 3$ and $\gamma = 0$ was open for a while and proved independently by Cwikel [4], Lieb [7], and Rozenbljum [11]. Still later, different proofs were given by Conlon [3] and Li and Yau [6]. The sharp constants are still not known, but the best one so far is in [7].
If d = 1 it is not hard to see that the inequality cannot hold for $\gamma < 1/2$. To prove this choose a sequence of approximate δ-functions. They converge to zero in $L^{\gamma+1/2}(\mathbb{R})$ but the limit may have a negative eigenvalue; see the discussion of a Dirac potential below. In the critical case $d = 1$, $\gamma = 1/2$, which concerns us here, it was not known until recently whether $L_{1/2,1}$ is finite. This case was settled by Timo Weidl [17] who showed that $L_{1/2,1} < 1.005$. Unfortunately his method of proof cannot be improved to yield the sharp constant, as can be seen from the following argument: his method is also applicable for a half-line problem corresponding to a Schrödinger operator on $\mathbb{R}_+$ with Neumann boundary conditions at the origin; in fact he reduces the full problem (but not the determination of the sharp constant) to this case. Since in this half-line problem the trivial lower bound for the sharp constant is given by 1, his method cannot yield a better bound than 1 in the problem concerning us here. Hence, the sharp constant $L_{1/2,1}$ remained undetermined, a tantalizing situation, since there is an obvious conjecture about the value of this constant [10].

In one dimension the potential can be a measure (thanks to the fact that $H^1(\mathbb{R}^1)$ functions are continuous) and when $\gamma = 1/2$ the right hand side of (1) is simply the total mass of this measure. In order to maximize the sum of the square roots of the eigenvalues it is reasonable to suppose that one should concentrate the potential at one point and the extreme case should hence correspond to a δ-function. It is well-known that $-\partial_x^2 - c\delta$ is a well-defined closed quadratic form on the Sobolev space $H^1(\mathbb{R}^1)$ and the Hamiltonian corresponding to this form is used in textbooks as a simple solvable model in quantum mechanics. An exercise shows that the only bound state of this operator for positive c is given by $\psi(x) = \exp(-c|x|/2)$ with eigenvalue $-c^2/4$.
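The δ-potential exercise can be checked numerically. The following minimal sketch (the function names and the step sizes h are illustrative assumptions, not part of the paper) verifies by finite differences that $\psi(x) = \exp(-c|x|/2)$ satisfies $-\psi'' = -(c^2/4)\psi$ away from the origin and has the derivative jump $\psi'(0+) - \psi'(0-) = -c\,\psi(0)$ produced by the term $-c\delta$. For this potential $\sqrt{E_1} = c/2$, which is exactly half the total mass c of the measure.

```python
import math

def psi(x, c):
    """Candidate bound state exp(-c|x|/2) of -d^2/dx^2 - c*delta(x)."""
    return math.exp(-c * abs(x) / 2.0)

def local_energy(x, c, h=1e-4):
    """Finite-difference value of (-psi'')/psi at a point with |x| > h."""
    second = (psi(x + h, c) - 2.0 * psi(x, c) + psi(x - h, c)) / h**2
    return -second / psi(x, c)

def derivative_jump(c, h=1e-6):
    """One-sided difference approximation of psi'(0+) - psi'(0-)."""
    right = (psi(h, c) - psi(0.0, c)) / h
    left = (psi(0.0, c) - psi(-h, c)) / h
    return right - left

c = 2.0
energy = local_energy(0.7, c)   # approximates the eigenvalue -c**2 / 4
jump = derivative_jump(c)       # approximates -c * psi(0) = -c
sqrt_E1 = math.sqrt(-energy)    # approximates c / 2

if __name__ == "__main__":
    print("eigenvalue estimate:", energy, "expected:", -c**2 / 4.0)
    print("derivative jump:", jump, "expected:", -c)
    print("sqrt(E_1):", sqrt_E1, "half the total mass:", c / 2.0)
```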
If it is true that this Dirac potential is the optimal case we conclude that the sharp constant in the Lieb-Thirring inequality for $d = 1$, $\gamma = 1/2$ is given by $L_{1/2,1} = 1/2$. The proof of this statement is the main result of this paper. A corollary of our result is that for the half-line problem with Neumann boundary conditions considered by Weidl, the sharp constant is 1. Before turning to the proof let us note the corresponding (still unproved) conjecture when $1/2 < \gamma < 3/2$. The optimal potential should be given by

$$V(x) = -\frac{1}{\gamma^2 - 1/4}\left(\cosh\left(\frac{x}{\gamma^2 - 1/4}\right)\right)^{-2}$$

and the sharp constant is supposed to be [10]

$$L_{\gamma,1} = \pi^{-1/2}\,\frac{1}{\gamma - 1/2}\,\frac{\Gamma(\gamma+1)}{\Gamma(\gamma+1/2)}\left(\frac{\gamma - 1/2}{\gamma + 1/2}\right)^{\gamma+1/2} = 2L^c_{\gamma,1}\left(\frac{\gamma - 1/2}{\gamma+1/2}\right)^{\gamma - 1/2}.$$

Here $L^c_{\gamma,1} := (4\pi)^{-1/2}\,\Gamma(\gamma+1)/\Gamma(\gamma+3/2)$ is its classical value. Unlike the case $\gamma < 3/2$ the optimal constant in one dimension and $\gamma \ge 3/2$ is known [1], [10] to be $L_{\gamma,1} = L^c_{\gamma,1}$. Using the fact proved in [1] that $L_{\gamma,1}/L^c_{\gamma,1}$ is monotone decreasing in $\gamma$ and the sharp value for $L_{1/2,1}$ obtained here we conclude that $L_{\gamma,1} \le 2L^c_{\gamma,1}$ for all $\gamma \ge 1/2$. As a last remark, let us note that our proof uses no special 1-D technique, except for the explicit form of the Birman-Schwinger kernel (3) in one dimension.
2 Proof of the main result for potentials
The principal result of this paper is

Theorem 1. For a Schrödinger operator $-\partial_x^2 + V$ in one dimension the optimal constant $L_{1/2,1}$ is 1/2, i.e.

$$\sum_i \sqrt{E_i} \le \frac{1}{2}\int_{\mathbb{R}} V_-(x)\,dx. \tag{2}$$

The inequality is strict if the negative part $V_-$ is a non-zero $L^1$ function.

In this section we prove this theorem in the case the potential is an $L^1$ function. In the last section we extend the bound (2) to potentials that are (finite) measures and prove that the δ-function is the unique maximizer up to translations. By the minmax principle it suffices to investigate the operator $-\partial_x^2 - V_-$. We will henceforth assume $V = -U$ with U non-negative and integrable.
To study the bound state energies of a Schrödinger operator it is often useful to investigate another problem. To do so we need some more notation. For E > 0 let

$$\mathcal{K}_E(x,y) := \sqrt{U(x)}\,\frac{\exp(-\sqrt{E}\,|x-y|)}{2\sqrt{E}}\,\sqrt{U(y)}, \quad \text{for all } x,y \in \mathbb{R} \tag{3}$$

be the Birman-Schwinger kernel for the Schrödinger operator $-\partial_x^2 - U$ in $L^2(\mathbb{R})$. $\mathcal{K}_E$ also stands for the integral operator given by this kernel. The Birman-Schwinger principle [2, 13] states that $-E_n < 0$ is the nth eigenvalue of $-\partial_x^2 - U$ if and only if the nth eigenvalue of $\mathcal{K}_{E_n}$ equals one. The explicit expression (3) suggests that multiplying (3) by $2\sqrt{E}$ will yield a still implicit but perhaps more flexible expression for $E_n$. This is exactly what we are going to do. Let us define, for $\mu \ge 0$,

$$\mathcal{L}_\mu(x,y) := \sqrt{U(x)}\,e^{-\mu|x-y|}\,\sqrt{U(y)}, \quad \text{for all } x,y \in \mathbb{R}. \tag{4}$$
Moreover, given some arbitrary non-negative locally finite Borel measure $\kappa$ on $\mathbb{R}$, we can generalize the kernel (4) to

$$\mathcal{L}^\kappa(x,y) := \sqrt{U(x)}\,e^{-|J(x)-J(y)|}\,\sqrt{U(y)}, \quad \text{for all } x,y \in \mathbb{R}, \tag{5}$$

where the function J is given by

$$J(x) := \int_0^x \kappa(dz). \tag{6}$$

Again $\mathcal{L}_\mu$ and $\mathcal{L}^\kappa$ are the corresponding integral operators. Of course, $\mathcal{L}_\mu$ in (4) corresponds to $\kappa(dz) = \mu\,dz$. Both $\mathcal{K}_E$ and $\mathcal{L}^\kappa$ are compact integral operators; their Hilbert-Schmidt norms are bounded by $\int U(x)\,dx/(2\sqrt{E})$ and $\int U(x)\,dx$, respectively. For a positive compact operator A we denote its ordered eigenvalues by $\lambda_1(A) \ge \lambda_2(A) \ge \dots \ge 0$. With the help of the Fourier transform ($\exp(-\varepsilon|x|)/(2\varepsilon) = \int e^{ipx}/(p^2+\varepsilon^2)\,dp/(2\pi)$) one sees the following facts:
(i) $\mathcal{L}^\kappa$ and $\mathcal{K}_E$ are positive definite operators, and hence the (ordered) eigenvalues $\lambda_j(\mathcal{L}^\kappa)$ obey
(ii) $\lambda_1(\mathcal{L}^\kappa) > \lambda_2(\mathcal{L}^\kappa) \ge \lambda_3(\mathcal{L}^\kappa) \ge \dots \ge 0$, with a similar statement for $\lambda_j(\mathcal{K}_E)$. The strict inequality follows from the positivity of the integral kernel and the Perron-Frobenius theorem. The trace of $\mathcal{L}^\kappa$ is given by
(iii) $\operatorname{tr}\mathcal{L}^\kappa = \int U(x)\,dx$, independent of $\kappa$, and
(iv) $\mathcal{L}^0 = \mathcal{L}_0$ is a rank one operator with eigenvalue $\int U(x)\,dx$.

The discussion above suggests that the sum of the square roots of the eigenvalues of the one dimensional Schrödinger operator is related to the sum of the eigenvalues of $\mathcal{L}_\mu$. Indeed we have the following bound:
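Theorem 2 (Domination by $\mathcal{L}_\mu$). Suppose $U \ge 0$ with $U \in L^1(\mathbb{R})$ and let $-E_1 < -E_2 \le -E_3 \le \dots < 0$ be the negative eigenvalues counting multiplicity of the Schrödinger operator $-\partial_x^2 - U$ given by the minmax principle. Furthermore, we denote by $\lambda_i(\mathcal{L}_\mu)$ the eigenvalues of $\mathcal{L}_\mu$ in (4). Then, for all $n \in \mathbb{N}$ and $0 \le E \le E_n$,

$$2\sum_{i \le n}\sqrt{E_i} \le \sum_{i \le n}\lambda_i(\mathcal{L}_{\sqrt{E}}) + \lambda_1(\mathcal{L}_{\sqrt{E_1}}) - \lambda_1(\mathcal{L}_{\sqrt{E_2}}). \tag{7}$$

In (7) we set $E_{j+1} = 0$ in case the Schrödinger operator happens to have only j negative eigenvalues.

Proof. As already mentioned, the Birman-Schwinger principle gives a one-to-one correspondence between negative eigenvalues of a Schrödinger operator and the eigenvalues of $\mathcal{K}_E$: $\lambda_i(\mathcal{K}_{E_i}) = 1$. Multiplying this equality by $2\sqrt{E_i}$ yields $2\sqrt{E_i} = 2\sqrt{E_i}\,\lambda_i(\mathcal{K}_{E_i}) = \lambda_i(\mathcal{L}_{\sqrt{E_i}})$ for all i such that $E_i > 0$. Note that $\lambda_i(\mathcal{L}_0) = 0$ if $i \ge 2$ since $\mathcal{L}_0$ is a rank one operator. Therefore we have

$$2\sum_{i \le n}\sqrt{E_i} = \sum_{i \le n}\lambda_i(\mathcal{L}_{\sqrt{E_i}}) \tag{8}$$

for arbitrary $n \in \mathbb{N}$. If the eigenvalues of $\mathcal{L}_\mu$ were monotonically decreasing as $\mu \ge 0$ increases this would immediately imply

$$2\sum_{i \le n}\sqrt{E_i} = \sum_{i \le n}\lambda_i(\mathcal{L}_{\sqrt{E_i}}) \le \sum_{i \le n}\lambda_i(\mathcal{L}_{\sqrt{E}}) \quad \text{for } 0 \le E \le E_n.$$

However, such a monotonicity cannot hold since the trace of $\mathcal{L}_\mu$ is independent of $\mu \ge 0$. Nevertheless, the partial sums $\sum_{i \le n}\lambda_i(\mathcal{L}_\mu)$ are monotone decreasing in $\mu$ (Lemma 4 below). In particular,

$$2\sqrt{E_1} = \lambda_1(\mathcal{L}_{\sqrt{E_1}}) = \lambda_1(\mathcal{L}_{\sqrt{E_2}}) + \lambda_1(\mathcal{L}_{\sqrt{E_1}}) - \lambda_1(\mathcal{L}_{\sqrt{E_2}}) \le \lambda_1(\mathcal{L}_{\sqrt{E}}) + \lambda_1(\mathcal{L}_{\sqrt{E_1}}) - \lambda_1(\mathcal{L}_{\sqrt{E_2}})$$

for all $0 \le E \le E_2$, where we take $E_2 = 0$ if the potential has only one negative eigenvalue. If there are two or more negative eigenvalues it follows by induction that

$$2\sum_{i \le n}\sqrt{E_i} + 2\sqrt{E_{n+1}} \le \sum_{i \le n}\lambda_i(\mathcal{L}_{\sqrt{E_{n+1}}}) + \lambda_{n+1}(\mathcal{L}_{\sqrt{E_{n+1}}}) + \lambda_1(\mathcal{L}_{\sqrt{E_1}}) - \lambda_1(\mathcal{L}_{\sqrt{E_2}}) \le \sum_{i \le n+1}\lambda_i(\mathcal{L}_{\sqrt{E}}) + \lambda_1(\mathcal{L}_{\sqrt{E_1}}) - \lambda_1(\mathcal{L}_{\sqrt{E_2}})$$

for all $0 \le E \le E_{n+1}$ and $n \in \mathbb{N}$.

Before proving the Lemma, we note a simple consequence of this theorem which proves our main bound (2).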
Corollary 3 (Sharp constant). Under the hypotheses of Theorem 2 and for $U \not\equiv 0$,

$$2\sum_{i \in \mathbb{N}}\sqrt{E_i} < \int U(x)\,dx.$$

Proof. From the theorem we get

$$2\sum_{i \in \mathbb{N}}\sqrt{E_i} \le \lambda_1(\mathcal{L}_0) + \lambda_1(\mathcal{L}_{\sqrt{E_1}}) - \lambda_1(\mathcal{L}_{\sqrt{E_2}}) = \int U(x)\,dx + \lambda_1(\mathcal{L}_{\sqrt{E_1}}) - \lambda_1(\mathcal{L}_{\sqrt{E_2}}), \tag{9}$$

since $\mathcal{L}_0$ is a rank one operator with eigenvalue $\int U(x)\,dx$. To conclude the strict inequality also note that $\lambda_1(\mathcal{L}_{\sqrt{E}})$ is strictly monotone decreasing in $E \ge 0$ by Lemma 4. The Perron-Frobenius theorem [12, Theorem XIII.44] implies $E_1$ is simple and hence $\lambda_1(\mathcal{L}_{\sqrt{E_1}}) - \lambda_1(\mathcal{L}_{\sqrt{E_2}}) < 0$.
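Lemma 4 (Monotonicity). For all $n \in \mathbb{N}$ the nth partial sum of the eigenvalues of the operator $\mathcal{L}^\kappa$ defined in (5) is monotonically decreasing in the sense that

$$\sum_{i \le n}\lambda_i(\mathcal{L}^{\kappa'}) \le \sum_{i \le n}\lambda_i(\mathcal{L}^{\kappa}) \tag{10}$$

if $\kappa'([s,t]) \ge \kappa([s,t])$ for all $s < t \in \mathbb{R}$. Moreover the largest eigenvalue $\lambda_1(\mathcal{L}^\kappa)$ is strictly monotone decreasing in $\kappa$.

Proof. To clarify the line of reasoning we consider first a toy-model given by an $(m+1) \times (m+1)$ matrix where the two variables x and y in (5) take on m + 1 values $x_0 < \dots < x_m$. With $a_i = \exp(-|J(x_i) - J(x_{i-1})|) \le 1$ (where J is defined in (6) and with U = 1 on $\{x_0,\dots,x_m\}$ for simplicity) the operator given in (5) has the matrix

$$\mathcal{L}(\{a_i\}) = \begin{pmatrix} 1 & a_1 & a_1a_2 & \cdots & a_1\cdots a_m \\ a_1 & 1 & a_2 & \cdots & a_2\cdots a_m \\ a_1a_2 & a_2 & 1 & \cdots & a_3\cdots a_m \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_1\cdots a_m & a_2\cdots a_m & a_3\cdots a_m & \cdots & 1 \end{pmatrix}.$$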
Let $\lambda_1(\{a_i\}) \ge \lambda_2(\{a_i\}) \ge \dots \ge \lambda_{m+1}(\{a_i\})$ be the ordered eigenvalues of $\mathcal{L}(\{a_i\})$. We investigate the sum of the largest n eigenvalues in the cube given by $0 \le a_k \le 1$ for all $k \in \{1,\dots,m\}$ and want to show that it is a (separately) monotone increasing function of each $a_k$ in the interval $0 \le a_k \le 1$. Fix $k \in \{1,\dots,m\}$ and $\{a_i\}_{i \ne k}$. For simplicity we write $\mathcal{L}(a_k)$ for $\mathcal{L}(\{a_i\}_{i \ne k}, a_k)$. The matrix $\mathcal{L}(a_k)$ has the form

$$\mathcal{L}(a_k) = \begin{pmatrix} A & a_kW \\ a_kW^\dagger & B \end{pmatrix} = \mathcal{L}(0) + a_kT$$

with

$$\mathcal{L}(0) := \mathcal{L}(\{a_i\}_{i \ne k}, 0) = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \quad \text{on } \mathbb{C}^k \oplus \mathbb{C}^{m+1-k} = \mathbb{C}^{m+1}$$

and the perturbation

$$T = \begin{pmatrix} 0 & W \\ W^\dagger & 0 \end{pmatrix}, \quad W : \mathbb{C}^{m+1-k} \to \mathbb{C}^k,$$

where A, B, and W are $k \times k$, $(m+1-k) \times (m+1-k)$, and $k \times (m+1-k)$ matrices respectively, depending only on $\{a_i\}_{i \ne k}$. This shows that the dependence of $\mathcal{L}$ on $a_k$ (for fixed $\{a_i\}_{i \ne k}$) is affine-linear. Now the claimed monotonicity of the sum of the largest n eigenvalues in $0 \le a_k \le 1$ is easily seen by the usual quantum mechanics textbook arguments of perturbation theory, cf. [16, chapter 3.5]: The sum is given by

$$\sum_{i \le n}\lambda_i(\mathcal{L}(a_k)) = \sup_{0 \le d \le 1,\ \operatorname{tr} d = n}\operatorname{tr}(d\,\mathcal{L}(a_k)) = \sup_{0 \le d \le 1,\ \operatorname{tr} d = n}\{\operatorname{tr}(d\,\mathcal{L}(0)) + a_k\operatorname{tr}(dT)\}$$

where $d : \mathbb{C}^{m+1} \to \mathbb{C}^{m+1}$ is a density matrix. Consequently, being a supremum of affine-linear functions, it is convex. To conclude monotonicity in $a_k$ it is enough to show that the derivative of the sum with respect to $a_k$ at $a_k = 0$ is non-negative. If the eigenvalues of $\mathcal{L}(0)$ are non-degenerate this follows immediately from the Feynman-Hellmann theorem of perturbation theory: Since $\mathcal{L}(0)$ leaves the decomposition $\mathbb{C}^{m+1} = \mathbb{C}^k \oplus \mathbb{C}^{m+1-k}$ invariant, its eigenvectors $\psi_i$ live either in the subspace $\mathbb{C}^k$ or in $\mathbb{C}^{m+1-k}$, so $(\psi_i, T\psi_i) = 0$. Thus by the Feynman-Hellmann formula each eigenvalue has derivative 0 at $a_k = 0$, and for this reason each partial sum has zero derivative at $a_k = 0$.
In the degenerate case a single eigenvalue might have a negative derivative at $a_k = 0$ but the partial sum of the largest n eigenvalues always has a non-negative derivative. Indeed, if the eigenvalues are degenerate we first have to diagonalize the perturbation T in the corresponding eigenspace h of $\mathcal{L}(0)$. This eigenspace, however, can be decomposed into $h = h_1 \oplus h_2$, with $h_1 \subset \mathbb{C}^k$, $h_2 \subset \mathbb{C}^{m+1-k}$, and $h_1$ or $h_2$ possibly empty. With $P_{h_i}$ being the orthogonal projection onto $h_i$, i = 1, 2, the perturbation T restricted to the subspace h is again of the form $T|_h = P_hTP_h = \widetilde{W} + \widetilde{W}^\dagger$, i.e.,

$$T|_h = \begin{pmatrix} 0 & \widetilde{W} \\ \widetilde{W}^\dagger & 0 \end{pmatrix} \quad \text{with } \widetilde{W} := P_{h_1}WP_{h_2} : h_2 \to h_1.$$

This gives $\operatorname{tr}_h T = \operatorname{tr} T|_h = 0$. The Feynman-Hellmann formula tells us that the eigenvalues of the restricted perturbation $T|_h$ are the derivatives of the eigenvalue branches emerging from this degeneracy subspace at $a_k = 0$. Since even the perturbation restricted to the eigenspace h has trace zero, the sum of any number of its largest eigenvalues is non-negative, and we conclude that the derivative of the sum at $a_k = 0$ is greater than or equal to zero.

For the strict monotonicity of the largest eigenvalue $\lambda_1(\mathcal{L}(\{a_i\}))$ in the cube $0 < a_i < 1$, $i \in \{1,\dots,m\}$, note that by the Perron-Frobenius theorem the corresponding eigenvector $\phi(\{a_i\})$ has only positive entries. Consequently for $0 < a_i < a_i' < 1$, all $i \in \{1,\dots,m\}$, the minmax principle implies

$$\lambda_1(\mathcal{L}(\{a_i\})) = (\phi(\{a_i\}), \mathcal{L}(\{a_i\})\phi(\{a_i\})) < (\phi(\{a_i\}), \mathcal{L}(\{a_i'\})\phi(\{a_i\})) \le (\phi(\{a_i'\}), \mathcal{L}(\{a_i'\})\phi(\{a_i'\})) = \lambda_1(\mathcal{L}(\{a_i'\})).$$
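The toy model lends itself to a small numerical experiment. The following sketch builds the matrix with entries equal to products of consecutive $a_k$, computes its eigenvalues with a plain Jacobi rotation method (chosen here only to keep the example free of external libraries; it is not the method of the paper), and compares partial sums of the largest eigenvalues for two parameter sets with $a_k \le a_k'$. All function names and the sample values of the $a_k$ are illustrative assumptions.

```python
import math

def toy_matrix(a):
    """(m+1)x(m+1) matrix whose (i, j) entry is the product of a[min(i,j)] ... a[max(i,j)-1]."""
    size = len(a) + 1
    L = [[1.0] * size for _ in range(size)]
    for i in range(size):
        prod = 1.0
        for j in range(i + 1, size):
            prod *= a[j - 1]
            L[i][j] = L[j][i] = prod
    return L

def symmetric_eigenvalues(matrix, tol=1e-13, max_rotations=10000):
    """Eigenvalues (decreasing order) of a real symmetric matrix via classical Jacobi rotations."""
    A = [row[:] for row in matrix]
    n = len(A)
    for _ in range(max_rotations):
        # Locate the largest off-diagonal entry; stop once it is negligible.
        p, q, biggest = 0, 1, 0.0
        for i in range(n):
            for j in range(i + 1, n):
                if abs(A[i][j]) > biggest:
                    p, q, biggest = i, j, abs(A[i][j])
        if biggest < tol:
            break
        # Rotation angle that annihilates the entry A[p][q].
        theta = 0.5 * math.atan2(2.0 * A[p][q], A[q][q] - A[p][p])
        c, s = math.cos(theta), math.sin(theta)
        for k in range(n):  # A <- A R
            akp, akq = A[k][p], A[k][q]
            A[k][p], A[k][q] = c * akp - s * akq, s * akp + c * akq
        for k in range(n):  # A <- R^T A
            apk, aqk = A[p][k], A[q][k]
            A[p][k], A[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((A[i][i] for i in range(n)), reverse=True)

def partial_sums(values):
    """Running sums of a list of numbers."""
    sums, total = [], 0.0
    for v in values:
        total += v
        sums.append(total)
    return sums

small = [0.3, 0.5, 0.2]   # sample values of a_1, a_2, a_3
large = [0.4, 0.6, 0.9]   # entrywise larger values, still inside [0, 1]
sums_small = partial_sums(symmetric_eigenvalues(toy_matrix(small)))
sums_large = partial_sums(symmetric_eigenvalues(toy_matrix(large)))

if __name__ == "__main__":
    for n, (s, l) in enumerate(zip(sums_small, sums_large), start=1):
        print(f"n = {n}: partial sum {s:.6f} (small a)  {l:.6f} (large a)")
```

The last partial sum is the trace and equals m + 1 for both parameter sets, which is why the individual eigenvalues cannot all be monotone even though every partial sum is.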
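Remark: The above reasoning for the toy model remains valid if $\mathcal{L}$ is replaced by $M\mathcal{L}M$ where M is a multiplication operator, i.e. a diagonal matrix, so that the partial sums of the eigenvalues for $M\mathcal{L}M$ are also monotone.

To apply this reasoning to our operator $\mathcal{L}^\kappa$ it is enough to show the monotonicity (10) for finite discrete measures $\kappa = \sum c_j\delta_{x_j}$ and $\kappa' = \sum c_j'\delta_{x_j}$ with $c_j' \ge c_j$. Indeed, approximate $\kappa$ and $\kappa' - \kappa$ by finite sums $\kappa_m$ and $\theta_m$ of δ-functions. This is possible since they are weakly dense in the set of locally finite Borel measures. It is easy to see that the corresponding operators $\mathcal{L}^{\kappa_m}$ and $\mathcal{L}^{\kappa_m+\theta_m}$ converge in Hilbert-Schmidt norm to $\mathcal{L}^{\kappa}$ and $\mathcal{L}^{\kappa'}$. Monotonicity of the partial sums of eigenvalues of $\mathcal{L}^\kappa$ for arbitrary $\kappa$ then follows by approximation and, without loss of generality, we may assume

$$\kappa = \sum_{j=1}^{m} c_j\delta_{x_j}, \qquad \kappa' = \sum_{j=1}^{m} c_j'\delta_{x_j} \quad \text{for some } m \in \mathbb{N}$$

with $c_j' \ge c_j \ge 0$, $j \in \{1,\dots,m\}$, and $-\infty < x_1 < \dots < x_m < \infty$. For $x < y$ we have

$$|J(x) - J(y)| = \int_x^y \kappa(dz) = \sum_{x < x_j \le y} c_j$$

and

$$\mathcal{L}^{\kappa}(x,y) = \sqrt{U(x)}\,\exp\Big(-\sum_{x < x_j \le y} c_j\Big)\sqrt{U(y)} = \sqrt{U(x)}\prod_{x < x_j \le y} e^{-c_j}\,\sqrt{U(y)} = \sqrt{U(x)}\prod_{x < x_j \le y} a_j\,\sqrt{U(y)} =: \mathcal{L}(\{a_j\})(x,y), \tag{11}$$

where $a_j := e^{-c_j}$, $j = 1,\dots,m$.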
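As in the matrix case the dependence of $\mathcal{L}(\{a_j\})$ on a single $a_k$ (for fixed $\{a_j\}_{j \ne k}$) is affine-linear and the decomposition of the Hilbert space is now given by $L^2(\mathbb{R}) = L^2(-\infty, x_k) \oplus L^2(x_k, \infty)$. Hence we are in precisely the same situation as for our $M\mathcal{L}M$ toy-model, and we infer that the partial sums of the largest eigenvalues are monotone in $\kappa$ for $\mathcal{L}^{\kappa_m}$. By the above limiting argument this holds therefore for $\mathcal{L}^\kappa$ and in particular for $\mathcal{L}_\mu$.

Strict monotonicity of the largest eigenvalue $\lambda_1(\mathcal{L}^\kappa)$ in $\kappa$, i.e. $\lambda_1(\mathcal{L}^{\kappa'}) < \lambda_1(\mathcal{L}^{\kappa})$ if $\kappa' > \kappa$, follows from the Perron-Frobenius theorem, the minmax principle, and the strict monotonicity of the kernel (5) in $\kappa$. One can, however, avoid the minmax principle in this conclusion. The Perron-Frobenius theorem states that the eigenvectors $\Phi_1^{\kappa}$ and $\Phi_1^{\kappa'}$ corresponding to $\lambda_1(\mathcal{L}^{\kappa})$ and $\lambda_1(\mathcal{L}^{\kappa'})$ are non-negative and strictly positive on the support of the potential U. By definition $\lambda_1(\mathcal{L}^{\kappa})\Phi_1^{\kappa} = \mathcal{L}^{\kappa}\Phi_1^{\kappa}$, and the same for $\kappa'$. From this we get

$$\lambda_1(\mathcal{L}^{\kappa'})(\Phi_1^{\kappa}, \Phi_1^{\kappa'}) - \lambda_1(\mathcal{L}^{\kappa})(\Phi_1^{\kappa'}, \Phi_1^{\kappa}) = (\Phi_1^{\kappa}, \mathcal{L}^{\kappa'}\Phi_1^{\kappa'}) - (\Phi_1^{\kappa'}, \mathcal{L}^{\kappa}\Phi_1^{\kappa}). \tag{12}$$

Since $(\Phi_1^{\kappa}, \Phi_1^{\kappa'}) > 0$ and the scalar products in (12) are real, hence symmetric, we get by interchanging the integration variables

$$\lambda_1(\mathcal{L}^{\kappa'}) - \lambda_1(\mathcal{L}^{\kappa}) = \frac{\iint \Phi_1^{\kappa}(x)\,\Phi_1^{\kappa'}(y)\,\big(\mathcal{L}^{\kappa'}(x,y) - \mathcal{L}^{\kappa}(x,y)\big)\,dx\,dy}{(\Phi_1^{\kappa}, \Phi_1^{\kappa'})} < 0$$

by the strict monotonicity of the kernel $\mathcal{L}^{\kappa}(x,y)$ in $\kappa$ and the strict positivity of $\Phi_1^{\kappa}$, $\Phi_1^{\kappa'}$ on the support of U. This concludes the proof of the monotonicity lemma.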
3 Extension to `potentials' that are measures
In this section we extend Theorem 1 to measure perturbations of $-\partial_x^2$. As mentioned in the introduction the Sobolev inequality in one dimension, cf. [8, Theorem 8.5], ensures that a finite measure $\tau$ on $\mathbb{R}$ yields a quadratic form $\tau[\psi] := \int |\psi(x)|^2\,\tau(dx)$ that is infinitesimally form bounded with respect to the Laplacian in one dimension. The quadratic form

$$(\psi, H\phi) = (\psi, -\partial_x^2\phi) + (\psi, \tau\phi) := (\partial_x\psi, \partial_x\phi) + \int_{\mathbb{R}}\overline{\psi(x)}\,\phi(x)\,\tau(dx) \tag{13}$$

is thus closed on the Sobolev space $H^1(\mathbb{R})$ and defines a unique self-adjoint operator $H = -\partial_x^2 + \tau$ on $L^2(\mathbb{R})$. By the minmax principle for forms it is again enough to consider the case $\tau = -\nu$ for some positive bounded measure $\nu$ on $\mathbb{R}$. We will hence consider $H = -\partial_x^2 - \nu$. Our result is
Theorem 5. Suppose $\nu$ is a non-negative measure with $\nu(\mathbb{R}) < \infty$ and let $-E_1 < -E_2 \le -E_3 \le \dots < 0$ be the negative eigenvalues counting multiplicity of the Schrödinger operator $-\partial_x^2 - \nu$ (if any) given by the corresponding quadratic form. Then

$$\sum_{i=1}^{\infty}\sqrt{E_i} \le \frac{1}{2}\,\nu(\mathbb{R}) \tag{14}$$

with equality if and only if the measure $\nu$ is a single Dirac measure.
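For a concrete measure the bound (14) can be tested directly. Take $\nu = c\,\delta_0 + c\,\delta_a$ with $c, a > 0$, so that $\nu(\mathbb{R}) = 2c$. Matching decaying exponentials at the two points gives the conditions $2\kappa = c(1 \pm e^{-\kappa a})$ for $\kappa = \sqrt{E}$; this is a standard textbook computation that is assumed here and not taken from the paper. The sketch below solves both conditions by bisection (function names and sample parameters are illustrative assumptions) and compares $\sum_i\sqrt{E_i}$ with $\nu(\mathbb{R})/2 = c$.

```python
import math

def bisect_root(f, lo, hi, iterations=200):
    """Bisection for a root of f on [lo, hi], assuming f changes sign there."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def double_delta_kappas(c, a):
    """Values sqrt(E_i) for -d^2/dx^2 - c*delta(x) - c*delta(x - a), with c, a > 0."""
    # Even state: 2k = c(1 + exp(-k a)); the root lies between c/2 and c.
    even = bisect_root(lambda k: 2.0 * k - c * (1.0 + math.exp(-k * a)), c / 2.0, c)
    kappas = [even]
    if c * a > 2.0:
        # Odd state: 2k = c(1 - exp(-k a)); a positive root exists only if c*a > 2.
        lower = (c * a - 2.0) / (c * a * a)   # the function is negative at this point
        odd = bisect_root(lambda k: 2.0 * k + c * math.expm1(-k * a), lower, c / 2.0)
        kappas.append(odd)
    return kappas

if __name__ == "__main__":
    for c, a in ((3.0, 2.0), (1.0, 1.0), (1.0, 1e-6)):
        kappas = double_delta_kappas(c, a)
        print(f"c = {c}, a = {a}: sum sqrt(E_i) = {sum(kappas):.9f}, nu(R)/2 = {c}")
```

As the separation a shrinks the two point masses merge into a single Dirac measure of mass 2c and the sum approaches the bound, in line with the equality statement of the theorem.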
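Proof. One obstacle in the proof of this theorem is to construct an analog of the Birman-Schwinger kernel (3) for measures. It is given by

$$\mathcal{K}_E[\nu](x,y) = \int \frac{1}{\sqrt{p^2+E}}(x,\xi)\,\frac{1}{\sqrt{p^2+E}}(\xi,y)\,\nu(d\xi) \tag{15}$$

where we set $p^2 := -\partial_x^2$ for convenience. A given measure $\nu$ can be approximated by smooth functions by convolving it with an approximate δ-function, $\nu \to \nu_\varepsilon = \delta_\varepsilon * \nu$. Of course $\nu_\varepsilon \to \nu$ weakly and the operators $\mathcal{K}_E[\nu_\varepsilon]$ converge to $\mathcal{K}_E[\nu]$ for large E in Hilbert-Schmidt norm, hence in the usual operator norm, too. By Tiktopoulos' formula [14] this shows the norm convergence of the resolvents $(p^2 - \nu_\varepsilon + E)^{-1}$ to $(p^2 - \nu + E)^{-1}$ and thus any finite collection of eigenvalues of $p^2 - \nu_\varepsilon$ converges to those of $p^2 - \nu$. So, writing $-E_i(\nu_\varepsilon)$ for the eigenvalues of $p^2 - \nu_\varepsilon$ and applying the results of the last section, we have for any partial sum, i.e. any $n \in \mathbb{N}$,

$$2\sum_{i \le n}\sqrt{E_i} = \lim_{\varepsilon \to 0} 2\sum_{i \le n}\sqrt{E_i(\nu_\varepsilon)} \le \lim_{\varepsilon \to 0}\Big(\int \nu_\varepsilon(x)\,dx + \lambda_1(\mathcal{L}_{\sqrt{E_1(\nu_\varepsilon)}}[\nu_\varepsilon]) - \lambda_1(\mathcal{L}_{\sqrt{E_2(\nu_\varepsilon)}}[\nu_\varepsilon])\Big) = \int \nu(dx) + \lim_{\varepsilon \to 0}\Big(\lambda_1(\mathcal{L}_{\sqrt{E_1(\nu_\varepsilon)}}[\nu_\varepsilon]) - \lambda_1(\mathcal{L}_{\sqrt{E_2(\nu_\varepsilon)}}[\nu_\varepsilon])\Big)$$

where for $\mu > 0$ the operator $\mathcal{L}_\mu[\nu_\varepsilon]$ is defined by the right hand side of (4) with U(x) replaced by $\nu_\varepsilon(x)$. For any positive bounded measure $\nu$ let $\widetilde{\mathcal{L}}_\mu[\nu] = 2\mu\,(p^2+\mu^2)^{-1/2}\,\nu\,(p^2+\mu^2)^{-1/2}$ be defined by its kernel

$$\widetilde{\mathcal{L}}_\mu[\nu](x,y) = 2\mu\int \frac{1}{\sqrt{p^2+\mu^2}}(x,\xi)\,\frac{1}{\sqrt{p^2+\mu^2}}(\xi,y)\,\nu(d\xi).$$

Since the spectrum of an operator of the form $AA^\dagger$ is the same as that of $A^\dagger A$ except at zero we conclude for $\mu > 0$

$$\lambda_1(\mathcal{L}_\mu[\nu_\varepsilon]) = \lambda_1(\widetilde{\mathcal{L}}_\mu[\nu_\varepsilon]) \to \lambda_1(\widetilde{\mathcal{L}}_\mu[\nu]) \quad \text{as } \varepsilon \to 0,$$

since $\lambda_1(\mathcal{L}_\mu[\nu_\varepsilon]) > 0$ and the operators $\widetilde{\mathcal{L}}_\mu[\nu_\varepsilon]$ converge to $\widetilde{\mathcal{L}}_\mu[\nu]$ in Hilbert-Schmidt norm as $\varepsilon \to 0$. Thus the equivalent of (9) in the measure case is given by

$$2\sum_{i \in \mathbb{N}}\sqrt{E_i} \le \nu(\mathbb{R}) + \lambda_1(\widetilde{\mathcal{L}}_{\sqrt{E_1}}[\nu]) - \lambda_1(\widetilde{\mathcal{L}}_{\sqrt{E_2}}[\nu]). \tag{16}$$

By the Perron-Frobenius theorem for quadratic forms we know that the lowest negative eigenvalue $-E_1$ of $p^2 - \nu$ is simple, i.e. $E_1 > E_2$. So (14) will follow from (16) once we prove that $0 < \mu \mapsto \lambda_1(\widetilde{\mathcal{L}}_\mu[\nu])$ is (strictly) monotone decreasing. The operator $\widetilde{\mathcal{L}}_\mu[\nu]$ is given by a strictly positive integral kernel and hence the eigenvector $\phi_\mu$ corresponding to the largest eigenvalue is strictly positive. Rewriting $\widetilde{\mathcal{L}}_\mu[\nu]\phi_\mu = \lambda_1(\widetilde{\mathcal{L}}_\mu[\nu])\phi_\mu$ with $\psi_\mu = (p^2+\mu^2)^{-1/2}\phi_\mu > 0$ we get $2\mu\,(p^2+\mu^2)^{-1}\nu\,\psi_\mu = \lambda_1(\widetilde{\mathcal{L}}_\mu[\nu])\,\psi_\mu$. Consequently for $0 < \mu_1, \mu_2$

$$\lambda_1(\widetilde{\mathcal{L}}_{\mu_1}[\nu])\,(\psi_{\mu_2}, \nu\psi_{\mu_1}) = 2\mu_1\Big(\psi_{\mu_2}, \nu\,\frac{1}{p^2+\mu_1^2}\,\nu\psi_{\mu_1}\Big)$$

and similarly for $\lambda_1(\widetilde{\mathcal{L}}_{\mu_2}[\nu])$ with $\mu_1$ and $\mu_2$ interchanged. As in the end of the proof of Lemma 4 we can subtract these equations and interchange the integration variables to arrive at

$$\lambda_1(\widetilde{\mathcal{L}}_{\mu_1}[\nu]) - \lambda_1(\widetilde{\mathcal{L}}_{\mu_2}[\nu]) = \frac{\iint \nu(dx)\,\nu(dy)\,\psi_{\mu_1}(x)\,\psi_{\mu_2}(y)\,\big(e^{-\mu_1|x-y|} - e^{-\mu_2|x-y|}\big)}{(\psi_{\mu_2}, \nu\psi_{\mu_1})} \le 0 \quad \text{for } 0 < \mu_2 < \mu_1.$$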
Acknowledgment: D.H. and L.T. would like to thank the physics department of Princeton University for its warm hospitality and we thank Wolfgang Spitzer for discussions. The authors also thank the following organizations for their support: Deutsche Forschungsgemeinschaft, grant Hu 773/1-1 (DH), and the U.S. National Science Foundation, grant PHY9513072 A02 (EHL), and grant DMS 9801329 (LET).
References
[1] M. Aizenman and E. H. Lieb: On semi-classical bounds for eigenvalues of Schrödinger operators. Phys. Lett. 66A (1978), 427-429.
[2] M. S. Birman: The spectrum of singular boundary problems. Mat. Sb. 55 No. 2 (1961), 125-174, translated in Amer. Math. Soc. Trans. (2), 53 (1966), 23-80.
[3] J. G. Conlon: A new proof of the Cwikel-Lieb-Rosenbljum bound. Rocky Mountain J. Math., 15, no. 1 (1985), 117-122.
[4] M. Cwikel: Weak type estimates for singular values and the number of bound states of Schrödinger operators. Trans. AMS, 224 (1977), 93-100.
[5] L. D. Landau and E. M. Lifshitz: Quantum Mechanics. Non-relativistic theory. Volume 3 of Course of Theoretical Physics, Pergamon Press (1958).
[6] P. Li and S.-T. Yau: On the Schrödinger equation and the eigenvalue problem. Comm. Math. Phys., 88 (1983), 309-318.
[7] E. H. Lieb: The number of bound states of one body Schrödinger operators and the Weyl problem. Bull. Amer. Math. Soc., 82 (1976), 751-753. See also Proc. A.M.S. Symp. Pure Math. 36 (1980), 241-252.
[8] E. H. Lieb and M. Loss: Analysis. Graduate Studies in Mathematics 14, American Mathematical Society 1997.
[9] E. H. Lieb and W. Thirring: Bound for the kinetic energy of fermions which proves the stability of matter. Phys. Rev. Lett., 35 (1975), 687-689. Errata 35 (1975), 1116.
[10] E. H. Lieb and W. Thirring: Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities. Studies in Math. Phys., Essays in Honor of Valentine Bargmann, Princeton (1976).
[11] G. V. Rozenbljum: Distribution of the discrete spectrum of singular differential operators. Dokl. AN SSSR, 202, N 5 1012-1015 (1972), Izv. VUZov, Matematika, N. 1 (1976), 75-86.
[12] M. Reed and B. Simon: Methods of modern mathematical physics IV: Analysis of operators. Academic Press, New York 1978.
[13] J. Schwinger: On the bound states of a given potential. Proc. Nat. Acad. Sci. U.S.A. 47, (1961), 122-129.
[14] B. Simon: Quantum mechanics for Hamiltonians defined as quadratic forms. Princeton Series in Physics, Princeton University Press, New Jersey, 1971.
[15] B. Simon: The bound state of weakly coupled Schrödinger operators in one and two dimensions. Ann. Physics 97, no. 2, (1976), 279-288.
[16] W. Thirring: A course in mathematical physics. Vol. 3. Quantum mechanics of atoms and molecules. Translated from the German by Evans M. Harrell. Lecture Notes in Physics, 141. Springer-Verlag, New York-Vienna, 1981.
[17] T. Weidl: On the Lieb-Thirring constants $L_{\gamma,1}$ for $\gamma \ge 1/2$. Comm. Math. Phys., 178, no. 1, (1996), 135-146.
Part IV
Coherent States
Commun. Math. Phys. 31, 327-340 (1973) © by Springer-Verlag 1973

The Classical Limit of Quantum Spin Systems

Elliott H. Lieb*
Institut des Hautes Etudes Scientifiques, Bures-sur-Yvette, France

Received February 28, 1973

* On leave from the Department of Mathematics, M.I.T., Cambridge, Mass. 02139, USA. Work partially supported by National Science Foundation Grant GP-31674X and by a Guggenheim Memorial Foundation Fellowship.

Abstract. We derive a classical integral representation for the partition function, $Z_Q$, of a quantum spin system. With it we can obtain upper and lower bounds to the quantum free energy (or ground state energy) in terms of two classical free energies (or ground state energies). These bounds permit us to prove that when the spin angular momentum $J \to \infty$ (but after the thermodynamic limit) the quantum free energy (or ground state energy) is equal to the classical value. In normal cases, our inequality is $Z_C(J) \le Z_Q(J) \le Z_C(J+1)$.

I. Introduction

It is generally believed in statistical mechanics that if one takes a quantum spin system of N spins, each having angular momentum J, normalizes the spin operators by dividing by J, and takes the limit $J \to \infty$, then one obtains the corresponding classical spin system wherein the spin variables are replaced by classical vectors and the trace is replaced by an integration over the unit sphere. Indeed, Millard and Leff [1] have shown this to be true for the Heisenberg model when N is held fixed. Their proof is quite complicated and it is therefore not surprising that this goal was not achieved before 1971. Despite that success, however, the problem is not finished. One wants to show that one can interchange the limit $N \to \infty$ with the limit $J \to \infty$, i.e. is the classical system obtained if we first let $N \to \infty$ and then let $J \to \infty$? In the Millard-Leff proof the control over the N dependence of the error is not good enough to achieve this desideratum. A more useful result, and one which would include the above, would be to obtain, for each J, upper and lower bounds to the quantum free energy in terms of the free energies of two classical systems such that those two bounds have a common classical limit as $J \to \infty$. In this paper we do just that, and the result is surprisingly simple: In most cases of interest (including the Heisenberg model), the classical upper bound is obtained by replacing the quantum spin by (J + 1) times the classical unit vector, while the lower bound is obtained by using J instead of (J + 1). Symbolically,

$$Z_C(J) \le Z_Q(J) \le Z_C(J+1). \tag{1.1}$$
In other cases the result is a little more complicated to state, but it is of the same nature. With an upper and lower bound in hand, it is then possible to derive rigorous bounds on expectation values, as we shall describe in Sections V and VI.
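Inequality (1.1) can be checked explicitly for the simplest example, a single spin in a magnetic field with Hamiltonian $-hS_z$, which is linear in the spin operator and is assumed here to belong to the normal cases. The normalizations in this sketch are also assumptions made for illustration: the quantum partition function is taken as $(2J+1)^{-1}\operatorname{Tr}\exp(\beta hS_z)$ and the classical one, for a vector of length $\ell$, as $(4\pi)^{-1}\int \exp(\beta h\ell\cos\theta)\,d\Omega = \sinh(\beta h\ell)/(\beta h\ell)$.

```python
import math

def z_quantum(two_j, x):
    """(2J+1)^(-1) * sum over M = -J..J of exp(x*M), with x = beta*h and J = two_j/2."""
    j = two_j / 2.0
    return sum(math.exp(x * (j - k)) for k in range(two_j + 1)) / (two_j + 1)

def z_classical(length, x):
    """(4*pi)^(-1) * integral of exp(x*length*cos(theta)) over the unit sphere."""
    t = x * length
    return 1.0 if t == 0.0 else math.sinh(t) / t

def bounds_hold(two_j, x):
    """Check Z_C(J) <= Z_Q(J) <= Z_C(J+1) for this single-spin example."""
    j = two_j / 2.0
    return z_classical(j, x) <= z_quantum(two_j, x) <= z_classical(j + 1.0, x)

if __name__ == "__main__":
    for two_j in (1, 2, 5):
        for x in (0.5, 2.0):
            j = two_j / 2.0
            print(f"J = {j}, beta*h = {x}: "
                  f"{z_classical(j, x):.6f} <= {z_quantum(two_j, x):.6f} <= {z_classical(j + 1.0, x):.6f}")
```

For small $\beta h$ the three quantities behave like $1 + (\beta h)^2 c/6$ with $c = J^2$, $J(J+1)$ and $(J+1)^2$ respectively, which makes the ordering in (1.1) visible at a glance.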
The main tool in our derivation will be what has been termed by Arecchi et al. [2] the Bloch coherent state representation. These states and some of their properties were obtained earlier [3, 4], but the most complete account is in Ref. [2]. Our lower bound is obtained by a variational calculation, while the upper bound is obtained from a representation of the quantum partition function that bears some similarity to the Wiener (or path) integral. Apart from its use in deriving the upper bound, the representation may be of theoretical value in proving other properties of quantum spin systems. In particular, it provides a sensible definition of the quantum partition function for all complex J, not just when J is half an integer, and one may discuss the existence or non-existence of a phase transition as a function of the continuous parameter J. In a forthcoming paper [7] it will be shown how to apply the methods and bounds developed herein (using not only the Bloch states but the Glauber coherent photon states as well) to certain models of the interaction of atoms with a quantized radiation field, for example the Dicke Maser model.
II. Bloch Coherent States
In this section we recapitulate results derived in Refs. [2] and [3]. We consider a single quantum spin of fixed total angular momentum and shall denote by $S \equiv (S_x, S_y, S_z)$ the usual angular momentum operators:
$$[S_x, S_y] = i S_z \ \text{and cyclically}, \qquad S_\pm = S_x \pm i S_y. \tag{2.1}$$
We denote by $J$ the total angular momentum, i.e.
$$S^2 = S_x^2 + S_y^2 + S_z^2 = J(J+1). \tag{2.2}$$
The Hilbert space on which these operators act has dimension $2J + 1$, i.e. it is $\mathbb{C}^{2J+1}$.
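On the classical side, we denote by $\mathcal{S}$ the unit sphere in three dimensions:
$$\mathcal{S} = \{(x, y, z) \mid x^2 + y^2 + z^2 = 1\}, \tag{2.3}$$
and by $L^2(\mathcal{S})$ the space of square integrable functions on $\mathcal{S}$ with the usual measure
$$\Omega = (\theta, \varphi), \quad 0 \le \theta \le \pi, \quad 0 \le \varphi < 2\pi, \tag{2.4}$$
$$d\Omega = \sin\theta \, d\theta \, d\varphi, \tag{2.5}$$
$$x = \sin\theta\cos\varphi, \quad y = \sin\theta\sin\varphi, \quad z = \cos\theta. \tag{2.6}$$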
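(Note: In Ref. [2], but not Ref. [3], the "south pole", instead of the customary "north pole", corresponds to $\theta = 0$. Hence our formulas will differ from Ref. [2] by the replacement $\theta \to \pi - \theta$.) With $|J\rangle \in \mathbb{C}^{2J+1}$ being a normalized "spin up" state, $S_z|J\rangle = J|J\rangle$, one defines the Bloch state $|\Omega\rangle \in \mathbb{C}^{2J+1}$ by
$$|\Omega\rangle = \exp\{\tfrac12\theta[S_- e^{i\varphi} - S_+ e^{-i\varphi}]\}\,|J\rangle = [\cos\tfrac12\theta]^{2J} \exp\{(\tan\tfrac12\theta)\, e^{i\varphi} S_-\}\,|J\rangle = \sum_{M=-J}^{J} \binom{2J}{M+J}^{1/2} (\cos\tfrac12\theta)^{J+M} (\sin\tfrac12\theta)^{J-M} \exp[i(J-M)\varphi]\,|M\rangle, \tag{2.7}$$
where $|M\rangle$ is the normalized state
$$|M\rangle = \binom{2J}{M+J}^{-1/2} [(J-M)!]^{-1} (S_-)^{J-M}\,|J\rangle \tag{2.8}$$
such that
$$S_z|M\rangle = M|M\rangle. \tag{2.9}$$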
It is clear from (2.7) that the set of states $|\Omega\rangle$ is complete in $\mathbb{C}^{2J+1}$. Their overlap is given by
$$K_J(\Omega', \Omega) \equiv \langle\Omega'|\Omega\rangle = \{\cos\tfrac12\theta\cos\tfrac12\theta' + e^{i(\varphi - \varphi')}\sin\tfrac12\theta\sin\tfrac12\theta'\}^{2J} \tag{2.10}$$
so that if we think of $K_J(\Omega', \Omega)$ as the kernel of a linear transformation on $L^2(\mathcal{S})$ it is selfadjoint and compact. In fact, it is positive semidefinite. We also have
$$|K_J(\Omega', \Omega)|^2 = [\cos\tfrac12\Theta]^{4J}, \tag{2.11}$$
where
$$\cos\Theta = \cos\theta\cos\theta' + \sin\theta\sin\theta'\cos(\varphi - \varphi') \tag{2.12}$$
is the cosine of the angle between $\Omega$ and $\Omega'$. In particular $|\Omega\rangle$ is normalized since $K_J(\Omega, \Omega) = 1$.
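The expansion (2.7) and the overlap formulas (2.10)-(2.12) can be checked against each other numerically. The following is only a minimal sketch under stated assumptions: the function names (`bloch_coeffs`, `overlap`, `overlap_closed_form`, `cos_angle`) are illustrative and not from the paper, and a point of the sphere is passed as a pair `(theta, phi)`.

```python
import cmath
import math


def bloch_coeffs(J, theta, phi):
    """Coefficients <M|Omega> read off from the sum in (2.7), ordered M = -J, ..., J."""
    n = round(2 * J)
    coeffs = []
    for k in range(n + 1):  # k = J + M, so J - M = n - k
        amp = (math.sqrt(math.comb(n, k))
               * math.cos(theta / 2) ** k
               * math.sin(theta / 2) ** (n - k))
        coeffs.append(amp * cmath.exp(1j * (n - k) * phi))
    return coeffs


def overlap(J, a, b):
    """<a|b> computed directly from the coefficients; a and b are (theta, phi) pairs."""
    ca = bloch_coeffs(J, *a)
    cb = bloch_coeffs(J, *b)
    return sum(x.conjugate() * y for x, y in zip(ca, cb))


def overlap_closed_form(J, a, b):
    """Right side of (2.10) for K_J(a, b), with a playing the role of Omega'."""
    (tp, pp), (t, p) = a, b
    base = (math.cos(t / 2) * math.cos(tp / 2)
            + cmath.exp(1j * (p - pp)) * math.sin(t / 2) * math.sin(tp / 2))
    return base ** round(2 * J)


def cos_angle(a, b):
    """Right side of (2.12): cosine of the angle between the two directions."""
    (tp, pp), (t, p) = a, b
    return (math.cos(t) * math.cos(tp)
            + math.sin(t) * math.sin(tp) * math.cos(p - pp))
```

For example, with `J = 1.5` one can compare `abs(overlap(J, a, b)) ** 2` with `((1 + cos_angle(a, b)) / 2) ** (2 * J)`, which is $[\cos\tfrac12\Theta]^{4J}$ written in terms of $\cos\Theta$.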
Now let $\mathcal{M}^{2J+1}$ be the set of linear transformations on $\mathbb{C}^{2J+1}$ (i.e. operators on the spin space) and, for a given $G \in L^1(\mathcal{S})$, define $A_G \in \mathcal{M}^{2J+1}$ by
$$A_G = \frac{2J+1}{4\pi} \int d\Omega \, G(\Omega)\, |\Omega\rangle\langle\Omega|. \tag{2.13}$$
(Note: $\int d\Omega$ always means $\int_{\mathcal{S}} d\Omega$.) Since the Hilbert space is finite dimensional there is no problem in giving a meaning to (2.13). It is a remarkable fact that every operator in $\mathcal{M}^{2J+1}$ can be written in the form (2.13). In particular,
$$1 = \frac{2J+1}{4\pi} \int d\Omega \, |\Omega\rangle\langle\Omega|. \tag{2.14}$$
Thus, to every operator $A \in \mathcal{M}^{2J+1}$ there correspond two functions:
$$g(\Omega) \equiv \langle\Omega|A|\Omega\rangle, \tag{2.15}$$
and the $G(\Omega)$ of (2.13). The former is, of course, unique, but the latter is not. However, it is always possible to choose $G(\Omega)$ to be infinitely differentiable. In Table 1 we list some function pairs for operators of common interest and useful formulas for calculation are given in Appendix A.
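Table 1. Expectation values, $g(\Omega)$, and operator kernels, $G(\Omega)$ [cf. (2.13), (2.15)], for various operators commonly appearing in quantum spin Hamiltonians

Operator | $g(\Omega)$, (2.15) | $G(\Omega)$, (2.13)
$S_z$ | $J\cos\theta$ | $(J+1)\cos\theta$
$S_x$ | $J\sin\theta\cos\varphi$ | $(J+1)\sin\theta\cos\varphi$
$S_y$ | $J\sin\theta\sin\varphi$ | $(J+1)\sin\theta\sin\varphi$
$S_z^2$ | $J(J-\tfrac12)(\cos\theta)^2 + J/2$ | $(J+1)(J+3/2)(\cos\theta)^2 - \tfrac12(J+1)$
$S_x^2$ | $J(J-\tfrac12)(\sin\theta\cos\varphi)^2 + J/2$ | $(J+1)(J+3/2)(\sin\theta\cos\varphi)^2 - \tfrac12(J+1)$
$S_y^2$ | $J(J-\tfrac12)(\sin\theta\sin\varphi)^2 + J/2$ | $(J+1)(J+3/2)(\sin\theta\sin\varphi)^2 - \tfrac12(J+1)$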
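The $S_z$ and $S_z^2$ rows of Table 1 can be checked directly from (2.7), because $|\langle M|\Omega\rangle|^2$ is a binomial distribution in $J + M$. The sketch below is illustrative only; the helper names (`weights`, `g_Sz`, `g_Sz2`, `trace_Sz2_from_G`) are assumptions, not notation from the paper. The last function uses (2.17) together with the sphere average $\langle\cos^2\theta\rangle = 1/3$ to test the $G$ column for $S_z^2$.

```python
import math


def weights(J, theta):
    """Pairs (M, |<M|Omega>|^2) from (2.7); the weights are binomial in k = J + M."""
    n = round(2 * J)
    c2 = math.cos(theta / 2) ** 2
    s2 = math.sin(theta / 2) ** 2
    return [(k - J, math.comb(n, k) * c2 ** k * s2 ** (n - k)) for k in range(n + 1)]


def g_Sz(J, theta):
    """g(Omega) = <Omega|S_z|Omega>."""
    return sum(M * w for M, w in weights(J, theta))


def g_Sz2(J, theta):
    """g(Omega) = <Omega|S_z^2|Omega>."""
    return sum(M * M * w for M, w in weights(J, theta))


def trace_Sz2_from_G(J):
    """Tr S_z^2 from (2.17), using the Table 1 kernel G and <cos^2> = 1/3 on the sphere."""
    return (2 * J + 1) * ((J + 1) * (J + 1.5) / 3 - (J + 1) / 2)
```

Comparing `g_Sz(J, theta)` with `J * math.cos(theta)`, and `trace_Sz2_from_G(J)` with the direct sum of $M^2$ over $M = -J, \ldots, J$, reproduces the corresponding table entries.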
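We need three final remarks. The first is that if we consider $|\Omega\rangle\langle\Omega'| \in \mathcal{M}^{2J+1}$ then
$$\mathrm{Tr}\,|\Omega\rangle\langle\Omega'| = K_J(\Omega', \Omega) \tag{2.16}$$
(where Tr means trace) as may be seen from (2.7). Hence, from (2.13),
$$\mathrm{Tr}\, A_G = \frac{2J+1}{4\pi} \int d\Omega \, G(\Omega). \tag{2.17}$$
The second is that
$$\frac{2J+1}{4\pi} \int d\Omega \, K_J(\Omega', \Omega)\, K_J(\Omega, \Omega'') = K_J(\Omega', \Omega''), \tag{2.18}$$
as may be seen from (2.14). Thus, $K_J$ reproduces itself under convolution.
The third remark is that for any $A \in \mathcal{M}^{2J+1}$ we can use (2.14) to obtain
$$\mathrm{Tr}\, A = \frac{2J+1}{4\pi} \int d\Omega \, \mathrm{Tr}\,|\Omega\rangle\langle\Omega| A = \frac{2J+1}{4\pi} \int d\Omega \sum_{M=-J}^{J} \langle M|\Omega\rangle\langle\Omega|A|M\rangle = \frac{2J+1}{4\pi} \int d\Omega \, \langle\Omega|A|\Omega\rangle. \tag{2.19}$$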
III. Lower Bound to the Quantum Partition Function
We consider a system of $N$ quantum spins and shall label the operators and the angular momenta (which need not all be the same) by a superscript $i$, $i = 1, \ldots, N$. The Hamiltonian, $H$, can be completely general but, in any event, it can always be written as a polynomial in the $3N$ spin operators. The partition function is
$$Z_Q = a_N \, \mathrm{Tr}\exp(-\beta H), \tag{3.1}$$
where
$$a_N = \prod_{i=1}^{N} (2J^i + 1)^{-1}. \tag{3.2}$$
[The normalization factor $a_N$ is inessential; it is chosen to agree with the classical partition function when $\beta = 0$.] The Hilbert space is
$$\mathcal{H}_N = \bigotimes_{i=1}^{N} \mathbb{C}^{2J^i+1}. \tag{3.3}$$
We denote by $|\Omega_N\rangle$ the complete, normalized set of states on $\mathcal{H}_N$ defined by
$$|\Omega_N\rangle = \bigotimes_{i=1}^{N} |\Omega^i\rangle, \tag{3.4}$$
by $\mathcal{S}_N$ the Cartesian product of $N$ copies of the unit sphere, and by $d\Omega_N$ the product measure (2.4), (2.5) and (2.6) on $\mathcal{S}_N$. Using (2.19),
$$Z_Q = (4\pi)^{-N} \int d\Omega_N \, \langle\Omega_N| e^{-\beta H} |\Omega_N\rangle. \tag{3.5}$$
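By the Peierls-Bogoliubov inequality, $\langle\psi| e^X |\psi\rangle \ge \exp\langle\psi|X|\psi\rangle$ for any normalized $\psi \in \mathcal{H}_N$ and $X$ selfadjoint. Thus,
$$Z_Q \ge (4\pi)^{-N} \int d\Omega_N \exp\{-\beta\langle\Omega_N|H|\Omega_N\rangle\}. \tag{3.6}$$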
Suppose, at first, that the polynomial, $H$, is linear in the operators $S^i$ of each spin. That is, we allow multiple site interactions of arbitrary complexity such as $S_x^1 S_y^2 S_y^3 S_z^4$, but do not allow monomials such as $(S_x^1)^2$ or $S_x^1 S_y^1$. In this case, which we shall refer to as the normal case, we see from (2.15) and Table 1 that the right side of (3.6) is precisely the classical partition function in which each $S^i$ is replaced by $J^i$ times a vector in $\mathcal{S}$, i.e.
$$S^i \to J^i(\sin\theta^i\cos\varphi^i, \ \sin\theta^i\sin\varphi^i, \ \cos\theta^i). \tag{3.7}$$
Thus, in the normal case,
$$Z_Q \ge Z_C(J^1, \ldots, J^N), \tag{3.8}$$
where $Z_C$ means the classical partition function (with the normalization $(4\pi)^{-N}$).
In more complicated cases, (3.7) is not correct and $S_z^i$, for example, has to be replaced by $J^i\cos\theta^i$ if it appears linearly in $H$, $(S_z^i)^2$ has to be replaced by $[J^i\cos\theta^i]^2 + J^i(\sin\theta^i)^2/2$ and so forth (see Table 1). However, to leading order in $J$, (3.7) is correct.
We note in passing that it is not necessary to use the Peierls-Bogoliubov inequality for all operators appearing in $H$. Thus, suppose the whole Hilbert space is $\mathcal{H}' = \mathcal{H} \otimes \mathcal{H}_N$, where $\mathcal{H}$ is the Hilbert space of some additional degrees of freedom (which may or may not themselves be spins) and $H$ is selfadjoint on $\mathcal{H}'$. Then (by a generalized Peierls-Bogoliubov inequality)
$$Z_Q = a_N \, \mathrm{Tr}_{\mathcal{H}} \mathrm{Tr}_{\mathcal{H}_N} \exp(-\beta H) \ge \mathrm{Tr}_{\mathcal{H}} \, (4\pi)^{-N} \int d\Omega_N \exp\{-\beta\langle\Omega_N|H|\Omega_N\rangle\}, \tag{3.9}$$
where $\langle\Omega_N|H|\Omega_N\rangle$ is a partial expectation value and defines a selfadjoint operator on $\mathcal{H}$. We shall give an example of (3.9) in Appendix B.
It is clear that if $\mathcal{H}$ is itself a spin space, then (3.9) gives a better bound than (3.6) applied to the full space $\mathcal{H}'$.
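IV. Upper Bound to the Quantum Partition Function
Returning to the definitions (3.1) and (3.3) we note that
$$Z_Q = \lim_{n\to\infty} Z(n), \tag{4.1}$$
where
$$Z(n) = a_N \, \mathrm{Tr}(1 - \beta n^{-1} H)^n. \tag{4.2}$$
Now, let $H$ be represented by some $G(\Omega_N)$ as in (2.13), whence $1 - \beta n^{-1} H$ is represented by
$$F_n(\Omega_N) = 1 - \beta n^{-1} G(\Omega_N). \tag{4.3}$$
Using (2.10), (2.13) and (2.16), we can represent $Z(n)$ as an $nN$-fold integral:
$$Z(n) = a_N \int d\Omega_N^1 \cdots \int d\Omega_N^n \prod_{j=1}^{n} F_n(\Omega_N^j)\, L_J(\Omega_N^j, \Omega_N^{j+1}) \tag{4.4}$$
with $n + 1 \equiv 1$ in the last factor, and where
$$L_J(\Omega_N', \Omega_N) = (4\pi)^{-N} a_N^{-1} \prod_{i=1}^{N} K_{J^i}(\Omega'^i, \Omega^i). \tag{4.5}$$
Thus
$$L_J(\Omega_N, \Omega_N) = (4\pi)^{-N} a_N^{-1}, \tag{4.6}$$
$$\int d\Omega_N \, L_J(\Omega_N', \Omega_N)\, L_J(\Omega_N, \Omega_N'') = L_J(\Omega_N', \Omega_N''). \tag{4.7}$$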
Equations (4.1) and (4.4) are our desired integral representation for $Z_Q$. To use them to obtain a bound, we think of $F_n$ as a multiplication operator and of $L_J$ as the kernel of a compact, selfadjoint operator on $L^2(\mathcal{S}_N)$. If $B(\Omega_N', \Omega_N)$ is such a kernel, then
$$\mathrm{Tr}\, B = \int d\Omega_N \, B(\Omega_N, \Omega_N) \tag{4.8}$$
is the trace on $L^2(\mathcal{S}_N)$. Thus,
$$Z(n) = a_N \, \mathrm{Tr}(F_n L_J)^n. \tag{4.9}$$
In general, if $m = 2^j$, $j = 0, 1, 2, 3, \ldots$,
$$|\mathrm{Tr}(AB)^{2m}| \le \mathrm{Tr}(A^2 B^2)^m \tag{4.10}$$
whenever $A$ and $B$ are selfadjoint. This follows from the Schwarz inequality (see Ref. [5] for details). Hence, if we take a sequence $n = 2^j$, $j = 1, 2, \ldots$ in (4.2) and use (4.7) $n$ times and (4.6), we obtain, in the limit $n \to \infty$,
$$Z_Q \le (4\pi)^{-N} \int d\Omega_N \exp[-\beta G(\Omega_N)]. \tag{4.11}$$
(4.11) is our desired classical upper bound. It is just like (3.6). In the normal case we see from Table 1 that $S^i$ is replaced by $(J^i + 1)$ times a classical unit vector. In other cases, $G(\Omega_N)$ is a bit more complicated, but the same remarks as in Section III apply. Thus, in the normal case
$$Z_C(J^1, \ldots, J^N) \le Z_Q \le Z_C(J^1 + 1, \ldots, J^N + 1). \tag{4.12}$$
This inequality says that as $J$ increases the quantum and classical free energies form two decreasing, interlacing sequences.
As in Section III, if $\mathcal{H}' = \mathcal{H} \otimes \mathcal{H}_N$ an inequality similar to (4.11) can be shown to hold, i.e.
$$Z_Q \le \mathrm{Tr}_{\mathcal{H}} \, (4\pi)^{-N} \int d\Omega_N \exp[-\beta H(\cdot, \Omega_N)], \tag{4.13}$$
where $H(\cdot, \Omega_N)$ is a selfadjoint operator on $\mathcal{H}$ obtained by replacing each monomial in the spin operators in $H$ by the appropriate $G(\Omega_N)$ function found in Table 1. We shall illustrate (4.13) in Appendix B. If $\mathcal{H}$ is a spin space then (4.13) gives a better bound than (4.11) applied to the full $\mathcal{H}'$.
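The two-sided bound of the normal case can be illustrated on the simplest example, a single spin in a field, $H = -hS_z$, for which both sides are elementary. This is only a sketch under that assumption; `z_quantum`, `z_classical` and the combined parameter `a` $= \beta h$ are illustrative names.

```python
import math


def z_quantum(J, a):
    """(2J+1)^{-1} Tr exp(a S_z): the normalized quantum partition function, a = beta*h."""
    n = round(2 * J)
    return sum(math.exp(a * (k - J)) for k in range(n + 1)) / (n + 1)


def z_classical(x, a):
    """(4 pi)^{-1} times the sphere integral of exp(a x cos(theta)), i.e. sinh(a x)/(a x)."""
    t = a * x
    return 1.0 if t == 0 else math.sinh(t) / t
```

With these definitions the lower bound of Section III corresponds to `z_classical(J, a)` and the upper bound of this section to `z_classical(J + 1, a)`; for small `a` the three quantities expand as $1 + a^2 c/6$ with $c = J^2$, $J(J+1)$ and $(J+1)^2$ respectively.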
V. Bounds on Expectation Values and the Ground State Energy
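The expectation value of a quantum operator (observable), $A$, is
$$\langle A\rangle_Q = \mathrm{Tr}\, A\, e^{-\beta H} / \mathrm{Tr}\, e^{-\beta H}. \tag{5.1}$$
We can always assume $A$ is selfadjoint (otherwise consider $A + A^\dagger$ and $iA - iA^\dagger$), in which case the Peierls-Bogoliubov inequality reads, for $\lambda$ real,
$$\lambda\langle A\rangle_Q \ge f(\lambda) - f(0), \tag{5.2}$$
where
$$f(\lambda) = -\beta^{-1} \ln \mathrm{Tr}\exp[-\beta(H + \lambda A)] \tag{5.3}$$
is a free energy. Hence, with $\lambda > 0$,
$$[f(0) - f(-\lambda)]/\lambda \ge \langle A\rangle_Q \ge [f(\lambda) - f(0)]/\lambda. \tag{5.4}$$
The upper and lower bounds to $f(\lambda)$ derived in the preceding two sections can be used to advantage in (5.4). In particular, we use (5.4) in the next section to derive $J \to \infty$ limits of quantum expectation values.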
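The difference-quotient bounds (5.4) can be seen at work in a case where everything is diagonal. The sketch below assumes a single spin with $H = -hS_z$ and $A = S_z$; the function names and parameters are illustrative, not from the paper.

```python
import math


def free_energy(J, beta, h, lam):
    """f(lam) = -(1/beta) ln Tr exp[-beta (H + lam A)] with H = -h S_z and A = S_z."""
    n = round(2 * J)
    trace = sum(math.exp(-beta * (lam - h) * (k - J)) for k in range(n + 1))
    return -math.log(trace) / beta


def expectation(J, beta, h):
    """<S_z> in the Gibbs state of H = -h S_z, as in (5.1)."""
    n = round(2 * J)
    boltzmann = [math.exp(beta * h * (k - J)) for k in range(n + 1)]
    return sum((k - J) * w for k, w in enumerate(boltzmann)) / sum(boltzmann)


def bounds(J, beta, h, lam):
    """(lower, upper) bounds on <A> from (5.4), for lam > 0."""
    f0 = free_energy(J, beta, h, 0.0)
    lower = (free_energy(J, beta, h, lam) - f0) / lam
    upper = (f0 - free_energy(J, beta, h, -lam)) / lam
    return lower, upper
```

As `lam` decreases the two bounds close in on the expectation value, which is the mechanism used in Section VI.B.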
If we take the limit $\beta \to \infty$ in (3.1) we obtain bounds on the quantum ground state energy:
$$E_C^- \ge E_Q \ge E_C^+, \tag{5.5}$$
where $E_C$ is the classical ground state energy (i.e. the minimum of the classical Hamiltonian over $\mathcal{S}_N$) and the $+$ (resp. $-$) refers to the substitution of the appropriate $G(\Omega_N)$ (resp. $g(\Omega_N)$) functions from Table 1. In the normal case
$$E_C(J^1, \ldots, J^N) \ge E_Q \ge E_C(J^1 + 1, \ldots, J^N + 1). \tag{5.6}$$
As ground state expectation values obey an inequality similar to (5.2), with $f$ replaced by $E$, a bound similar to (5.4) holds for $E$. This is merely the variational principle. The upper bound in (5.6) is easy to obtain directly by a variational calculation, but the lower bound is not. It is not easy to find a direct proof of it in a system consisting of three spins antiferromagnetically coupled to each other.
VI. The Thermodynamic Limit
A. The Free Energy
We shall, for simplicity, consider only the normal case here. The general case can be handled in a similar manner.
Let $H_N$ be a Hamiltonian (polynomial) of $N$ spins in which each spin has angular momentum one. Replace each spin operator $S^i$ by $(J)^{-1} S^i$ and let $S^i$ now have angular momentum $J$. We shall denote this symbolically by $H_N^Q(J)$ and the partition function, (3.1), by $Z_N^Q(J)$. [It would equally be possible to allow different $J$ values for different spins, but that is a needless complication. Also, the factor $J^{-1}$ is not crucial. One could as well use $J^{-1/2}(J+1)^{-1/2}$.] Denoting the free energy per spin by $f_N^Q(J) = -(N\beta)^{-1} \ln Z_N^Q(J)$, the theorem to be proved is that
$$\lim_{J\to\infty} \lim_{N\to\infty} f_N^Q(J) = f^C \equiv \lim_{N\to\infty} f_N^C, \tag{6.1}$$
where $f_N^C$ is the free energy per spin of the classical partition function in which each $S^i$ is replaced by a classical unit vector. It is assumed that $H_N$ is known to have a thermodynamic limit for the free energy per spin. We also want to prove an analogous formula for the ground state energy per spin.
Our bounds are
$$f_N^C \ge f_N^Q(J) \ge f_N^C(\delta_J), \tag{6.2}$$
where the right side is the classical free energy per spin in which each vector is multiplied by $\delta_J = (J+1)/J$. If we think of $\delta_J$ as a variable, $\delta$, then $H_N^C(\delta)$, the classical Hamiltonian as a function of $\delta$, is continuous in $\delta$. Moreover, $N^{-1} H_N^C(\delta)$ is equicontinuous in $N$, i.e. given any $\varepsilon > 0$ it is possible to find a $\gamma > 0$ such that $\|N^{-1}[H_N^C(\delta + x) - H_N^C(\delta)]\| \le \varepsilon$ for $|x| < \gamma$, independent of $N$, where $\|\cdot\|$ means the uniform norm on $\mathcal{S}_N$. Hence, the limit function
$$f^C(\delta) = \lim_{N\to\infty} f_N^C(\delta) \tag{6.3}$$
is continuous in $\delta$. This, together with (6.2), proves (6.1). The same equicontinuity holds for the classical ground state energy. Thus, the analogue of (6.1) is also true for the ground state energy per spin:
$$\lim_{J\to\infty} \lim_{N\to\infty} N^{-1} E_N^Q(J) = \lim_{N\to\infty} N^{-1} E_N^C. \tag{6.4}$$
B. Expectation Values
We consider expectation values of intensive observables $N^{-1} A_N$. For example, $A_N$ might be the Hamiltonian itself, in which case $N^{-1}\langle A_N\rangle$ is the energy per spin. Alternatively, $A_N$ could be $\sum_{i=1}^{N} S_z^i$ so that $N^{-1}\langle A_N\rangle$ is the magnetization per spin. As before, we replace each $S^i$ by $(J)^{-1}$ times a quantum spin of angular momentum $J$, both in the Hamiltonian and in $A_N$. Then, using inequality (5.4) and the bounds (6.2) we have, for each positive $\lambda$, fixed $N$ and fixed $J$,
$$\lambda^{-1}[f_N^C(0; 1) - f_N^C(-\lambda; \delta_J)] \ge N^{-1}\langle A_N\rangle_Q \ge \lambda^{-1}[f_N^C(\lambda; \delta_J) - f_N^C(0; 1)], \tag{6.5}$$
where $f_N^C(\lambda; \delta)$ is the classical free energy per spin when the Hamiltonian is $H_N^C + \lambda A_N$ and where each classical spin unit vector in $H_N$ and $A_N$ is multiplied by $\delta$. We are interested in $\delta_J = (J+1)/J$. Now take the limit $N \to \infty$ and then the limit $J \to \infty$ in (6.5). By the same equicontinuity remark as in Section VI.A, for each $\lambda > 0$,
$$\lambda^{-1}[f^C(0) - f^C(-\lambda)] \ge \limsup_{J\to\infty} \limsup_{N\to\infty} N^{-1}\langle A_N\rangle_Q \ge \liminf_{J\to\infty} \liminf_{N\to\infty} N^{-1}\langle A_N\rangle_Q \ge \lambda^{-1}[f^C(\lambda) - f^C(0)]. \tag{6.6}$$
In (6.6), $f^C(\lambda)$ is the limiting classical free energy per spin for the Hamiltonian $H_N^C + \lambda A_N$ (with $\delta = 1$). It is easy to see that $f^C(\lambda)$ is concave in $\lambda$ and hence $\lim_{\lambda\downarrow 0} \lambda^{-1}[f^C(\lambda) - f^C(0)] \equiv G^+$ and $\lim_{\lambda\downarrow 0} \lambda^{-1}[f^C(0) - f^C(-\lambda)] \equiv G^-$ exist everywhere. If $G^+ = G^-$ (i.e. the right derivative equals the left derivative) then by a theorem of Griffiths [6]
$$\lim_{N\to\infty} \frac{d}{d\lambda} f_N^C(\lambda) = \frac{d}{d\lambda} f^C(\lambda). \tag{6.7}$$
This is the case in which the classical expectation value $N^{-1}\langle A_N\rangle_C$ has a thermodynamic limit, and then
$$\lim_{J\to\infty} \lim_{N\to\infty} N^{-1}\langle A_N\rangle_Q = \lim_{N\to\infty} N^{-1}\langle A_N\rangle_C,$$
as one sees by taking the limit $\lambda \downarrow 0$ in (6.6). In other words, we have proved that for intensive observables, as defined above, the quantum expectation value equals the classical expectation value after first taking the thermodynamic limit and then taking the classical limit $J \to \infty$. If one takes the limits in the opposite order the theorem is trivially true and uninteresting. Note that we have not proved that the quantum thermodynamic limit, $\lim_{N\to\infty} N^{-1}\langle A_N\rangle_Q$, exists. It may not.
The same proof obviously goes through for ground state expectation values, as in Section VI.A, because the ground state energy is also concave in $\lambda$.

Acknowledgements. The author thanks the Institut des Hautes Etudes Scientifiques for its hospitality, as well as the Chemistry Laboratory III, University of Copenhagen, where part of this work was done. The financial assistance of the Guggenheim Memorial Foundation is gratefully acknowledged. The author also acknowledges his gratitude to Dr. N. W. Dalton who suggested the problem to him in 1967.
Appendix A: Some Useful Formulas
The algebra $\mathcal{M}^{2J+1}$ has $S_+$, $S_-$ and $S_z$ as generators. Hence, the following generating function permits, by differentiation, easy calculation of $g(\Omega)$ in (2.15) or Table 1 for any operator. It is to be found, with appropriate modifications, in Ref. [2].
{[e-
(A.1)
fl12+eR2[cos20]2}2J.
Turning to (2.13), we calculate $A_G$ for a sufficiently large class of functions $G(\Omega)$. Let
$$G(\Omega) = e^{im\varphi} (\cos\tfrac12\theta)^p (\sin\tfrac12\theta)^q, \tag{A.2}$$
where $m$ is an integer and $p$ and $q$ are complex numbers. Defining $A(m, p, q) \equiv A_G$, the matrix elements of this operator can be calculated using (2.7) to be
$$A(m, p, q; M, M') = \delta(M - M' - m)\, \Gamma(J + \alpha + 1 + p/2)\, \Gamma(J - \alpha + 1 + q/2)\, [(J + \alpha + m/2)!\,(J + \alpha - m/2)!\,(J - \alpha - m/2)!\,(J - \alpha + m/2)!]^{-1/2}\, (2J+1)!/\Gamma(2J + 2 + p/2 + q/2), \tag{A.3}$$
where $\delta$ is the Kronecker delta function, $\Gamma$ is the gamma function and $\alpha = (M + M')/2$. This formula has been used to calculate Table 1.
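Formula (A.3) can be used to confirm the $G$ column of Table 1 for the operators linear in the spin. The sketch below restricts $p$ and $q$ to real values (the paper allows complex ones), and the function name `a_matrix_element` is an illustrative assumption. It relies on $\cos\theta = (\cos\tfrac12\theta)^2 - (\sin\tfrac12\theta)^2$ and $\sin\theta\, e^{i\varphi} = 2\cos\tfrac12\theta\,\sin\tfrac12\theta\, e^{i\varphi}$.

```python
import math


def a_matrix_element(J, m, p, q, M, Mp):
    """Right side of (A.3): <M| A(m, p, q) |M'> for real p, q; zero unless M - M' = m.

    M and Mp must lie in {-J, ..., J}; no range checking is done here.
    """
    if abs(M - Mp - m) > 1e-9:
        return 0.0
    alpha = (M + Mp) / 2

    def fact(x):
        return math.gamma(x + 1)

    numerator = math.gamma(J + alpha + 1 + p / 2) * math.gamma(J - alpha + 1 + q / 2)
    denominator = math.sqrt(fact(J + alpha + m / 2) * fact(J + alpha - m / 2)
                            * fact(J - alpha - m / 2) * fact(J - alpha + m / 2))
    return (numerator / denominator
            * fact(2 * J + 1) / math.gamma(2 * J + 2 + p / 2 + q / 2))
```

For $G = (J+1)\cos\theta$ the diagonal elements are `(J + 1) * (a_matrix_element(J, 0, 2, 0, M, M) - a_matrix_element(J, 0, 0, 2, M, M))`, which should reproduce $M$. For $G = (J+1)\sin\theta\cos\varphi$ the element between $M'$ and $M'+1$ is `(J + 1) * a_matrix_element(J, 1, 1, 1, M' + 1, M')`, which should reproduce $\tfrac12[(J-M')(J+M'+1)]^{1/2}$.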
Appendix B: Application to the One Dimensional Heisenberg Chain
To illustrate the methods of this paper, we derive bounds for the free energy of a Heisenberg chain whose Hamiltonian is
$$H = -\sum_{i=1}^{N-1} S^i \cdot S^{i+1}. \tag{B.1}$$
Each spin is assumed to have angular momentum $J$. We have chosen the isotropic case for simplicity, but one could equally well handle the anisotropic Hamiltonian with a magnetic field. Note that $\beta > 0$ is the ferromagnetic case while $\beta < 0$ is the antiferromagnetic case. The classical partition function is
$$Z_N^C(\beta, x) = (4\pi)^{-N} \int d\Omega_N \exp\Big\{\beta x^2 \sum_{i=1}^{N-1} \Omega^i \cdot \Omega^{i+1}\Big\} \tag{B.2}$$
with free energy per spin
$$f^C(\beta, x) = \lim_{N\to\infty} -(N|\beta|)^{-1} \ln Z_N^C(\beta, x). \tag{B.3}$$
Our bounds are that
$$f^C(\beta, J) \ge f^Q(\beta, J) \ge f^C(\beta, J+1). \tag{B.4}$$
It is easy to evaluate (B.2) by the transfer matrix method. The normalized eigenfunction (of $\Omega$) giving the largest eigenvalue is obviously the constant function $(4\pi)^{-1/2}$. Thus,
$$f^C(\beta, x) = -|\beta|^{-1} \ln \lambda(\beta, x), \tag{B.5}$$
where
$$\lambda(\beta, x) = (4\pi)^{-1} \int d\Omega \exp(\beta x^2 \, \Omega \cdot \Omega') = (\beta x^2)^{-1} \sinh(\beta x^2), \tag{B.6}$$
and $\lambda(\beta, x)$ is independent of $\Omega'$ as it should be. In this approximation, (B.4), one cannot distinguish between the ferro- and antiferromagnetic cases as far as the free energy is concerned.
To illustrate the idea mentioned at the ends of Sections III and IV, we suppose that the chain has $2N + 1$ spins and we let $\mathcal{H}_N$ (resp. $\mathcal{H}$) be the Hilbert space for the odd (resp. even) numbered spins. $\mathcal{H}' = \mathcal{H} \otimes \mathcal{H}_N$ is the whole space. Our bounds are
$$g(\beta, J) \ge f^Q(\beta, J) \ge g(\beta, J+1), \tag{B.7}$$
where
$$g(\beta, x) = \lim_{N\to\infty} -(2N|\beta|)^{-1} \ln\big((2J+1)^{-N} Z_N(\beta, x)\big), \tag{B.8}$$
$$Z_N(\beta, x) = (4\pi)^{-N} \int d\Omega_N \, \mathrm{Tr}\exp\Big\{\beta x \sum_{i=1}^{N} S^{2i} \cdot (\Omega^{2i-1} + \Omega^{2i+1})\Big\}, \tag{B.9}$$
and where $d\Omega_N = d\Omega^1 \, d\Omega^3 \cdots d\Omega^{2N+1}$ and the trace is over the Hilbert space of $S^2, S^4, \ldots, S^{2N}$. Since the remaining spin operators no longer interact, it is easy to calculate the trace. For a single spin:
$$\mathrm{Tr}\exp[b\, S \cdot v] = \sum_{M=-J}^{J} \exp[bMv], \tag{B.10}$$
where $b$ is a constant and $v$ is a vector of length $v$. Now we can do the integration over $\mathcal{S}_N$ by the transfer matrix method (with the same eigenvector $(4\pi)^{-1/2}$) and obtain
$$g(\beta, x) = -\tfrac12 |\beta|^{-1} \ln[\lambda(\beta, x)/(2J+1)], \tag{B.11}$$
where
$$\lambda(\beta, x) = (4\pi)^{-1} \int d\Omega \sum_{M=-J}^{J} \exp\{\beta x M |\Omega + \Omega'|\} = 2\int_0^1 y \, dy \, \sinh[(2J+1)\beta x y]/\sinh[\beta x y]. \tag{B.12}$$
Again, no distinction between the ferro- and antiferromagnetic cases appears.
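The two pairs of bounds, (B.4) and (B.7), are easy to evaluate and compare. The sketch below evaluates (B.5)-(B.6) in closed form and (B.12) by Simpson's rule; the function names and the `steps` parameter are illustrative assumptions. Here `x` is the classical scaling ($J$ or $J+1$) and `J` is the angular momentum of the even-numbered spins that remain quantum.

```python
import math


def f_classical(beta, x):
    """(B.5)-(B.6): -|beta|^{-1} ln[ sinh(beta x^2) / (beta x^2) ], for beta != 0."""
    a = beta * x * x
    return -math.log(math.sinh(a) / a) / abs(beta)


def lam_partial(beta, x, J, steps=2000):
    """(B.12) by Simpson's rule: 2 * int_0^1 y sinh[(2J+1) beta x y] / sinh[beta x y] dy."""
    def integrand(y):
        u = beta * x * y
        ratio = (2 * J + 1) if u == 0 else math.sinh((2 * J + 1) * u) / math.sinh(u)
        return y * ratio

    h = 1.0 / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return 2 * total * h / 3


def g_partial(beta, x, J):
    """(B.11): -(1/2) |beta|^{-1} ln[ lambda(beta, x) / (2J+1) ]."""
    return -0.5 * math.log(lam_partial(beta, x, J) / (2 * J + 1)) / abs(beta)
```

For $J = 1/2$ and $\beta = 1$ this gives, in increasing order, `f_classical(1, 1.5)`, `g_partial(1, 1.5, 0.5)`, `g_partial(1, 0.5, 0.5)` and `f_classical(1, 0.5)`, so the partial-trace bounds (B.7) lie inside the fully classical bounds (B.4), as remarked at the ends of Sections III and IV.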
References
1. Millard, K., Leff, H.: J. Math. Phys. 12, 1000-1005 (1971).
2. Arecchi, F. T., Courtens, E., Gilmore, R., Thomas, H.: Phys. Rev. A 6, 2211-2237 (1972).
3. Radcliffe, J. M.: J. Phys. A 4, 313-323 (1971).
4. Kutzner, J.: Phys. Lett. A 41, 475-476 (1972). Atkins, P. W., Dobson, J. C.: Proc. Roy. Soc. (London) A 321, 321-340 (1971).
5. Golden, S.: Phys. Rev. B 137, 1127-1128 (1965).
6. Griffiths, R. B.: J. Math. Phys. 5, 1215-1222 (1964).
7. Hepp, K., Lieb, E. H.: The equilibrium statistical mechanics of matter interacting with the quantized radiation field. Preprint.

E. H. Lieb
I.H.E.S.
F-91440 Bures-sur-Yvette, France
Commun. Math. Phys. 62, 35-41 (1978)
© by Springer-Verlag 1978
Proof of an Entropy Conjecture of Wehrl
Elliott H. Lieb*
Departments of Mathematics and Physics, Princeton University, Princeton, New Jersey 08540, USA

Abstract. Wehrl has proposed a new definition of classical entropy, $S$, in terms of coherent states and conjectured that $S \ge 1$. A proof of this is given. We discuss the analogous problem for Bloch coherent spin states, but in this case the conjecture is still open. An inequality for the entropy of convolutions is also given.

I. Introduction
In a recent paper [1], A. Wehrl introduced a new definition of the "classical" entropy corresponding to a quantum system, proved that it had several interesting properties that deserve to be studied further, and posed a conjecture about the minimum value of this "classical" entropy. The main purpose of this paper is to prove Wehrl's conjecture. It is somewhat surprising that while the conjecture appears to be almost obvious, the proof we give requires some difficult theorems in Fourier analysis. The conjecture may or may not be important physically, but it reveals an interesting feature of coherent states.
To briefly recapitulate Wehrl's analysis, consider a single particle in one dimension, so that the Hilbert space is $L^2(\mathbb{R})$. (The generalization to $\mathbb{R}^N$ is trivial.) For each $z = (p, q) \in \mathbb{R}^2$, define the normalized vector $|z\rangle$ in $L^2(\mathbb{R})$ by
$$|z\rangle \equiv (\pi\hbar)^{-1/4} \exp\{[-(x-q)^2/2 + ipx]/\hbar\} \equiv R(x|p,q). \tag{1.1}$$
These vectors are the coherent states used by Schrödinger [2], Bargmann [3], Klauder [4], and Glauber [5]. If
$$P_z = |z\rangle\langle z| \tag{1.2}$$
is the orthogonal projection onto $|z\rangle$ then
$$\int P_z \, dz/\pi = 1, \tag{1.3}$$
where $dz/\pi \equiv dp\,dq/2\pi\hbar$ and 1 = identity. The integral in (1.3) can be defined as a weak integral and (1.3) is simply the Plancherel equality.

* Work partially supported by US National Science Foundation grant MCS 75-21684 A02

For a "density matrix" $\rho^Q$ (a positive semidefinite operator of trace 1) on $L^2(\mathbb{R})$, its quantum entropy is
$$S^Q(\rho^Q) \equiv -\mathrm{Tr}\, \rho^Q \ln \rho^Q \ge 0. \tag{1.4}$$
The right side of (1.4) is well defined, although it may be $+\infty$. For a nonnegative function $f$ on $\mathbb{R}^2$, with $\int f(z)\,dz/\pi = 1$, its classical entropy is
$$S(f) = -\int \frac{dz}{\pi} f(z) \ln f(z). \tag{1.5}$$
In general this integral may not be well defined, but even if it is it can be negative. Given a quantum density matrix $\rho^Q$, Wehrl defines the function
$$\rho^{cl}(z) = \langle z|\rho^Q|z\rangle, \tag{1.6}$$
whence $0 \le \rho^{cl}(z) \le 1$. Then
$$S^{cl}(\rho^Q) \equiv S(\rho^{cl}). \tag{1.7}$$
This is the classical entropy of $\rho^Q$. [Note that by (1.3), $\int \rho^{cl}(z)\,dz/\pi = 1$.] Since $0 \le \rho^{cl}(z) \le 1$, the integral in (1.5) is now well defined, and $S^{cl} \ge 0$.
The positivity of $S^{cl}$ is one advantage of Wehrl's definition. On the contrary, if, as is usual, $\rho^Q = Z_Q^{-1} \exp[-\beta(-\hbar^2\Delta/2m + V(q))]$, the customary classical approximation is $f(z) = Z_{cl}^{-1} \exp[-\beta(p^2/2m + V(q))]$. The difficulty with $f$ is that $S(f)$ can be negative and, in general, $S(f) \to -\infty$ as $\beta \to \infty$.
A second advantage of Wehrl's definition is that $S^{cl}$ is monotonic. If $\rho^Q_{12}$ is a density matrix on $L^2(\mathbb{R}) \otimes L^2(\mathbb{R})$, and $|z_1, z_2\rangle \equiv |z_1\rangle \otimes |z_2\rangle$, one defines
$$\rho^{cl}_{12}(z_1, z_2) = \langle z_1, z_2|\rho^Q_{12}|z_1, z_2\rangle. \tag{1.8}$$
One can then define $\rho^{cl}_1(z_1)$ by partial trace on 2 (either first on $\rho^Q_{12}$ or else on the right side of (1.8); by (1.3) they are identical). Wehrl shows that the entropies satisfy
$$S^{cl}_{12} \equiv S(\rho^{cl}_{12}) \ge S(\rho^{cl}_1) \equiv S^{cl}_1, \tag{1.9}$$
in an obvious notation. This property, which is obviously desirable physically, does not hold in general for either the quantum entropies or for ordinary classical continuous entropies (see [6] for further details). It does hold for these particular classical entropies.
Not only is $S^{cl} \ge 0$, but Wehrl proves [1]
$$S^{cl}(\rho^Q) \ge S^Q(\rho^Q). \tag{1.10}$$
[To prove $\ge$ note that $s(x) = -x\ln x$ is concave, so $\langle z|s(\rho^Q)|z\rangle \le s(\rho^{cl}(z))$. But $S^Q(\rho^Q) = \int \langle z|s(\rho^Q)|z\rangle \, dz/\pi$.] While the minimum of $S^Q$ is zero (for any pure state, i.e. one dimensional projection) the minimum of $S^{cl}$ is not zero. Wehrl's conjecture is the following:
Theorem 1. The minimum of $S^{cl}$ is 1 (independent of $\hbar$). This minimum occurs if $\rho^Q = P_z$ for any $z$.
Remarks. 1) There is no upper bound or lower bound (other than zero) for $S^{cl}(\rho^Q) - S^Q(\rho^Q)$. 2) It is easy to see from Theorem 1 that in $L^2(\mathbb{R}^N)$, the minimum of $S^{cl}$ is $N$.
The proof of Theorem 1 will be given in Section II. An analogous conjecture can be posed for Bloch coherent spin states and this is discussed, but not proved, in Section III. In Section II an inequality (Theorem 3) on $L^p$ norms is also presented. Section IV contains an inequality which may be of use for related problems.
II. Proof of Wehrl's Conjecture
From now on we set $\hbar = 1$. As a preliminary remark we note:
Lemma 2. If $\rho^Q$ minimizes $S^{cl}$, $\rho^Q$ must be a pure state.
Proof. If $\rho^Q = \sum_i \lambda_i \pi_i$, the $\pi_i$ being one dimensional orthogonal projections, $\lambda_i > 0$ and $\sum_i \lambda_i = 1$, then $\rho^{cl}(z) = \sum_i \lambda_i \rho_i(z)$ with $\rho_i(z) = \langle z|\pi_i|z\rangle$. By concavity of $S$, $S(\rho^{cl}) \ge \sum_i \lambda_i S(\rho_i)$, with equality if and only if $\rho_i(z) = \rho_j(z)$ almost everywhere for all $i, j$.
Suppose $\pi_i$ is a projection onto $\psi_i \in L^2(\mathbb{R})$. Let $w = q + ip \in \mathbb{C}$ and let $f_i(w) = \int \psi_i(x) \exp[-x^2/2 + wx]\,dx$, which is an entire analytic function of $w$ [3]. Then equality almost everywhere implies that $|f_i(w)| = |f_j(w)|$, all $w$, and hence $f_i(w) = f_j(w)\exp(i\theta(w))$ and $\theta$ is real and analytic on the complement of the zeros of $f_j$. Hence, $\theta(w) = \text{const}$. By the uniqueness of the Fourier transform, $\psi_i = \alpha\psi_j$, with $|\alpha| = 1$, almost everywhere, and, hence $\pi_i = \pi_j$, which is a contradiction. $\square$
Thus, to prove Theorem 1 we have to consider
$$f(p, q) = \int \psi(x) R(x|p,q)\,dx \tag{2.1}$$
with $\|\psi\|_2 = 1$, and show that
$$S(|f|^2) \ge 1 \tag{2.2}$$
with equality if $\psi(x) = R(x|p,q)$ for some $(p, q)$.
We will first prove Theorem 3 which concerns $L^p$ norms of $f(p, q)$. Theorem 1 is a corollary of Theorem 3.
Theorem 3. Let $\psi \in L^2(\mathbb{R})$ with $\|\psi\|_2 = 1$, and $f$ given by (2.1) and (1.1). Then, for $s \ge 2$,
$$I_s \equiv \int |f(p, q)|^s \, dp\,dq/2\pi \le 2/s \tag{2.3}$$
with equality for $s > 2$ if $\psi(x) = \alpha R(x|p,q)$ for some $p$, $q$ and $|\alpha| = 1$. For $s = 2$, (2.3) is an equality for all $\psi$.
To prove Theorem 3 we will require the following two lemmas (for $N = 1$). The first (best constant in the Hausdorff-Young inequality) was proved by Beckner [7] and the second (best constant in Young's inequality) simultaneously by Beckner [7] and Brascamp and Lieb [8].
Lemma 4. Let $f \in L^p(\mathbb{R}^N)$, $1 \le p \le 2$, and $\hat f$ its Fourier transform ($\hat f(k) = \int f(x) e^{ikx}\,dx$). Then, with $1/p + 1/p' = 1$,
$$\|\hat f\|_{p'} \le \{C_p (2\pi)^{1/p'}\}^N \|f\|_p, \tag{2.4}$$
where $C_p^2 = p^{1/p}(p')^{-1/p'}$ and $C_1 = C_\infty = 1$.
Remark. Equality holds in (2.4) if $f$ is any Gaussian, i.e. $f(x) = a\exp\{-(x, Mx) + (x, b)\}$, $a \in \mathbb{C}$, $b \in \mathbb{C}^N$, and $M$ positive definite.
Lemma 5. Let $f \in L^p(\mathbb{R}^N)$, $g \in L^q(\mathbb{R}^N)$, $1 \le p, q \le \infty$. Then, with $1 + 1/r = 1/p + 1/q$, $r \ge 1$, and $*$ = convolution,
$$\|f * g\|_r \le \{C_p C_q/C_r\}^N \|f\|_p \|g\|_q. \tag{2.5}$$
Equality holds [8] for $r > 1$ and $N = 1$ if and only if $f(x) = a\exp[-p'(x-b)^2 + i\delta x]$ and $g(x) = \alpha\exp[-q'(x-\beta)^2 + i\delta x]$ for some $a, \alpha \in \mathbb{C}$ and $b, \beta, \delta \in \mathbb{R}$. For $r = 1$ (all $N$), $p = q = 1$ and (2.5) is an equality for all positive $f, g$.
Remark. In the classical inequalities, $C_p$ is replaced everywhere by 1 in Lemmas 4 and 5.
Proof of Theorem 3. As a first step apply Lemma 4 (with $p' = s$) to the function $g_q(x) = \psi(x)\pi^{-1/4}\exp[-(x-q)^2/2]$, with $q$ regarded as a parameter. ($g_q \in L^{s'}(\mathbb{R})$ by Hölder's inequality.) Thus,
$$\int |f(p, q)|^s \, dp/2\pi \le C_{s'}^s \, \pi^{-s/4} \, \phi_s(q)^{s/s'}, \tag{2.6}$$
where $\phi_s$ is the convolution
$$\phi_s = |\psi(x)|^{s'} * \exp[-s'x^2/2]. \tag{2.7}$$
The second step is to integrate (2.6) over $q$ and use Lemma 5 with $p = q = 2/s'$ and $r = s/s'$. Since $\|\exp(-x^2/2)\|_2 = \pi^{1/4}$, one obtains $I_s \le 2/s$.
Equality holds in the first step if $\psi$ is any Gaussian. In the second step, since $p = q = 2/s'$, equality holds for $s > 2$ if $\psi$ is a Gaussian with the same variance as $\exp(-x^2/2)$, which is the condition stated in the theorem. When $s = s' = 2$, equality for all $\psi$ is a simple consequence of the Plancherel formula. $\square$
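The way the two sharp constants combine to give exactly $2/s$ can be checked by direct arithmetic. In the two steps of the proof the powers of $\pi$ cancel ($\pi^{-s/4}$ from (2.6) against $\|\exp(-s'x^2/2)\|_{2/s'}^{s/s'} = \pi^{s/4}$), leaving $C_{s'}^s\,(C_p^2/C_r)^r$ with $p = 2/s'$ and $r = s/s'$. The sketch below evaluates this product for $s > 2$; the function names are illustrative assumptions.

```python
import math


def c_squared(p):
    """C_p^2 = p^{1/p} (p')^{-1/p'} with 1/p + 1/p' = 1, for 1 < p < infinity."""
    p_conj = p / (p - 1)
    return p ** (1 / p) / p_conj ** (1 / p_conj)


def theorem3_constant(s):
    """Product of the constants from (2.4) and (2.5) as chained in the proof, s > 2."""
    s_conj = s / (s - 1)          # s'
    p = 2 / s_conj                # p = q in Lemma 5
    r = s / s_conj                # r in Lemma 5
    hausdorff_young = c_squared(s_conj) ** (s / 2)              # C_{s'}^s
    young = (c_squared(p) / math.sqrt(c_squared(r))) ** r       # (C_p C_q / C_r)^r, p = q
    return hausdorff_young * young
```

For instance, `theorem3_constant(4.0)` should agree with $2/4$ up to rounding. With the classical constants ($C_p = 1$ throughout) the same chain would give only $I_s \le 1$, which is not enough for Theorem 1.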
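Proof of Theorem 1. We continue to use the notation of Theorem 3. Let $\varepsilon > 0$. Since $I_2 = 1$, $K_\varepsilon \equiv \varepsilon^{-1}\{I_2 - I_{2(1+\varepsilon)}\} \ge (1+\varepsilon)^{-1}$ by Theorem 3. Assuming $S(|f|^2) < \infty$ (otherwise, there is nothing to prove), we claim that $\lim_{\varepsilon\downarrow 0} K_\varepsilon = S(|f|^2)$, which proves that $S(|f|^2) \ge 1$. To see this note that by Theorem 3 or by the Schwarz inequality, $|f(p, q)| \le 1$ and hence $0 \le \varepsilon^{-1}|f|^2(1 - |f|^{2\varepsilon}) \le -|f|^2\ln|f|^2$. Thus, $K_\varepsilon \to S(|f|^2)$ by dominated convergence. $\square$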
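For the minimizing state the quantities in the proof can be computed explicitly. Taking $\psi = R(x|0,0)$ (with $\hbar = 1$) gives $|f(p,q)|^2 = \exp[-(p^2+q^2)/2]$, and in polar coordinates $dp\,dq/2\pi = r\,dr$. The sketch below, with illustrative names and a truncation radius `R` chosen as an assumption, evaluates $I_s$, the entropy (1.5) and $K_\varepsilon$ by Simpson's rule.

```python
import math


def simpson(func, a, b, steps=4000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3


def rho(r):
    """|f(p,q)|^2 for the coherent state at the origin, with r^2 = p^2 + q^2."""
    return math.exp(-r * r / 2)


def I(s, R=12.0):
    """I_s of (2.3), using dp dq / 2 pi = r dr and truncating at radius R."""
    return simpson(lambda r: rho(r) ** (s / 2) * r, 0.0, R)


def wehrl_entropy(R=12.0):
    """S(|f|^2) of (1.5) for this state, truncated at radius R."""
    return simpson(lambda r: -rho(r) * math.log(rho(r)) * r, 0.0, R)


def K(eps):
    """K_eps = (I_2 - I_{2(1+eps)}) / eps from the proof of Theorem 1."""
    return (I(2.0) - I(2.0 * (1 + eps))) / eps
```

Here `I(s)` should reproduce the equality case $2/s$ of Theorem 3, `wehrl_entropy()` the minimum value 1 of Theorem 1, and `K(eps)` the value $(1+\varepsilon)^{-1}$, which tends to 1 as $\varepsilon \downarrow 0$.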
III. Bloch Coherent Spin States
Instead of $L^2(\mathbb{R})$, one can consider the finite dimensional vector space $\mathcal{H}_J = \mathbb{C}^{2J+1}$, $J = 1/2, 1, 3/2, \ldots$. The analogue of the vectors $|z\rangle$ are the Bloch coherent states [9-13] in $\mathcal{H}_J$. These have been used to prove the classical limit of quantum spin systems [13]. For each unit vector $\Omega \in \mathbb{R}^3$, the vector $|\Omega\rangle \in \mathcal{H}_J$ is defined as the normalized vector (unique up to the phase) satisfying
$$(\Omega \cdot S)|\Omega\rangle = J|\Omega\rangle, \tag{3.1}$$
where $S = (S_x, S_y, S_z)$ are the usual angular momentum operators satisfying $[S_x, S_y] = iS_z$ and cyclically. An explicit representation is
$$|\Omega\rangle = \sum_{M=-J}^{J} A_M(\theta)\exp(-iM\varphi)|M\rangle, \tag{3.2}$$
$$A_M(\theta) = \binom{2J}{M+J}^{1/2} [\cos(\theta/2)]^{J+M} [\sin(\theta/2)]^{J-M}, \tag{3.3}$$
where $(\theta, \varphi)$ are the polar coordinates of $\Omega$. $|M\rangle$ is the normalized vector satisfying $S_z|M\rangle = M|M\rangle$ and whose phase is given by $|M\rangle = (\text{pos. const.})(S_x - iS_y)^{J-M}|J\rangle$. With the measure
$$d\mu_J(\Omega) = (2J+1)\sin\theta\,d\theta\,d\varphi/4\pi \tag{3.4}$$
on the unit sphere $S^2$, and
$$P_\Omega = |\Omega\rangle\langle\Omega| \tag{3.5}$$
the projection onto $|\Omega\rangle$, one has the analogue of (1.3):
$$\int d\mu_J(\Omega) P_\Omega = 1. \tag{3.6}$$
Now given a density matrix $\rho^Q$ on $\mathcal{H}_J$ one can imitate the Wehrl construction:
$$\rho^{cl}(\Omega) = \langle\Omega|\rho^Q|\Omega\rangle \tag{3.7}$$
and $S^{cl}(\rho^Q) = S(\rho^{cl})$ with
$$S(f) = -\int f(\Omega)\ln f(\Omega)\,d\mu_J(\Omega). \tag{3.8}$$
The monotonicity of $S^{cl}$ and the inequality $S^{cl} \ge S^Q$ carry over to this case. It is easy to compute that, since [13] $\langle\Omega'|P_\Omega|\Omega'\rangle = [\cos\tfrac12\Theta]^{4J}$, where $\Theta$ is the angle between $\Omega$ and $\Omega'$,
$$S^{cl}(P_\Omega) = 2J/(2J+1). \tag{3.9}$$
The analogue of Theorem 1 is then
Conjecture. $S^{cl}(\rho^Q) \ge 2J/(2J+1)$.
We will have to content ourselves with the following remarks.
Remark A. Suppose $\rho^Q$ is of the form
$$\rho^Q = \int d\mu_J(\Omega)\,h(\Omega) P_\Omega \tag{3.10}$$
with $h(\Omega) \ge 0$ and $\int h\,d\mu_J = 1$. Every $\rho^Q$ can be written in the form (3.10) with $h$ real but, for $J \ge 1$, not necessarily with $h \ge 0$, even though $\rho^Q$ is positive. However $P_\Omega$ is of this form with $h$ being a delta function. By (3.10)
$$\rho^{cl}(\Omega) = \int d\mu_J(\Omega')\,[\cos\tfrac12\Theta]^{4J}\,h(\Omega'). \tag{3.11}$$
Since $\rho^{cl}(\Omega)$ is then a convex combination of the functions $[\cos\tfrac12\Theta]^{4J}$, the concavity of $S$ leads to $S^{cl}(\rho^Q) \ge S^{cl}(P_{\Omega'}) = 2J/(2J+1)$ if $h(\Omega) \ge 0$. The analogue of this remark would, of course, also hold for the original Wehrl problem.
Remark B. Lemma 2 holds for the Bloch case as well. Thus we can assume $\rho^Q$ is a projection onto $\psi \in \mathcal{H}_J$. Then
$$\rho^{cl}(\Omega) = |f(\Omega)|^2, \tag{3.12}$$
$$f(\Omega) = \sum_{M=-J}^{J} C_M A_M(\theta) e^{-iM\varphi} \tag{3.13}$$
and $\sum_M |C_M|^2 = 1$.
If $J = 1/2$, every $\psi = \alpha|\Omega\rangle$ for some $|\Omega\rangle$ and $\alpha$. Thus the conjecture is manifestly true for $J = 1/2$.
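The value (3.9), and the conjectured bound for one non-coherent pure state, can be evaluated numerically. The sketch below treats projections onto the basis vectors $|M\rangle$, for which $\rho^{cl}(\Omega) = |A_M(\theta)|^2$ depends on $\theta$ only; the function names and the index `k` $= J + M$ are illustrative assumptions. The case $k = 2J$ is the coherent state at the north pole.

```python
import math


def simpson(func, a, b, steps=4000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3


def a_squared(J, k, theta):
    """|A_M(theta)|^2 of (3.3), with k = J + M."""
    n = round(2 * J)
    return (math.comb(n, k)
            * math.cos(theta / 2) ** (2 * k)
            * math.sin(theta / 2) ** (2 * (n - k)))


def entropy_of_M_state(J, k):
    """S^cl of the projection onto |M>, M = k - J, from (3.7)-(3.8)."""
    def integrand(theta):
        rho = a_squared(J, k, theta)
        return 0.0 if rho <= 0.0 else -rho * math.log(rho) * math.sin(theta)

    # d mu_J = (2J+1) sin(theta) d theta d phi / (4 pi); the phi integral gives 2 pi
    return (2 * J + 1) / 2 * simpson(integrand, 0.0, math.pi)
```

For $J = 1$, `entropy_of_M_state(1, 2)` should reproduce $2/3$ as in (3.9), while the state $|M = 0\rangle$ gives `entropy_of_M_state(1, 1)`, which works out to $5/3 - \ln 2$ and lies above the conjectured minimum.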
IV. An Inequality for Entropy of Convolutions
Lemmas 4 and 5 yielded a lower bound for $S^{cl}$. Lemma 5 alone yields the following entropy inequality which, while not strictly related to coherent states, may be useful for related problems. We first remark that if $f$ is a nonnegative function on $\mathbb{R}^N$ with $\int f(x)\,dx = 1$, and if $f \in L^s(\mathbb{R}^N)$ for some $s > 1$, then $S(f)$ is well defined in the sense that $\int f(x)\ln f(x)\,dx < \infty$. $S(f)$ may be $+\infty$, however.
Theorem 6. Suppose $f$ and $g$ are nonnegative functions on $\mathbb{R}^N$ with $\int f = \int g = 1$ and $f, g \in L^s(\mathbb{R}^N)$ for some $s > 1$. Then $f * g$ has the same properties and
$$\exp[2S(f * g)/N] \ge \exp[2S(f)/N] + \exp[2S(g)/N]. \tag{4.1}$$
(4.1) is equivalent to the following:
$$2S(f * g) \ge 2\lambda S(f) + 2(1-\lambda)S(g) - N\lambda\ln\lambda - N(1-\lambda)\ln(1-\lambda) \tag{4.2}$$
for all $\lambda \in [0, 1]$.
Corollary. $S(f * g) \ge \tfrac12[S(f) + S(g) + N\ln 2]$.
Remark. (4.1) is an equality if $f$ and $g$ are any two Gaussians of the form $f(x) \sim \exp[-(x, Mx) + (b, x)]$, $g(x) \sim \exp[-\alpha(x, Mx) + (c, x)]$ with $\alpha > 0$, $b, c \in \mathbb{R}^N$ and $M$ positive definite.
Proof. By Lemma 5, $(f * g) \in L^p(\mathbb{R}^N)$ for $p = 1$ and for $p = s(2-s)^{-1}$. Hence $S(f * g)$ is well defined. (4.2) $\Rightarrow$ (4.1): Choose
$$\lambda = \{\exp[2S(f)/N] + \exp[2S(g)/N]\}^{-1}\exp[2S(f)/N].$$
(4.1) $\Rightarrow$ (4.2): Geometric-arithmetic mean inequality. We now prove (4.2). In Lemma 5, choose $p' = r'/\lambda$, $q' = r'/(1-\lambda)$, so that $1 + r^{-1} = p^{-1} + q^{-1}$. By convexity, $f \in L^1 \cap L^s$ implies $f \in L^t$ for $1 < t < s$ and $t \mapsto \|f\|_t$ is continuous for $t \in [1, s]$. For $r$ close enough to 1, $p, q < s$, so $f * g \in L^r$ and (2.5) holds. Furthermore, (2.5) is an equality for $r = p = q = 1$ so one can take the right derivative at $r = 1$. Without loss we can assume $S(f)$ and $S(g) < \infty$, for otherwise $S(f * g) = \infty$ by concavity and there is nothing to prove. For the same reason, one can assume $S(f * g) < \infty$. Next, we claim that if $F \in L^1 \cap L^s$, $s > 1$, and $S(F) < \infty$ then $\lim_{\varepsilon\downarrow 0} \varepsilon^{-1}\int F(1 - F^\varepsilon) = S(F)$. To see this, let $A = \{x \mid F(x) \le 1\}$. Then for $x \in A$, $0 \le 1 - F(x)^\varepsilon \le -\varepsilon\ln F(x)$. For $x \in A^c$ and $0 < \varepsilon < s - 1$, $0 \le F(x)^\varepsilon - 1 \le \varepsilon(s-1)^{-1}\{F(x)^{s-1} - 1\}$. The claim follows by dominated convergence. Thus, the right side of (2.5) is differentiable at $r = 1$ and Theorem 6 follows by explicit calculation. This calculation can be avoided by noting that as $r$ varies, $p'/q' = \text{const} = (1-\lambda)/\lambda$. As noted in Lemma 5, (if $N = 1$, and hence for all $N$) (2.5) is saturated for the Gaussians $f(x) = \exp(-x^2/\lambda)$, $g(x) = \exp(-x^2/(1-\lambda))$, independent of $r$. But these Gaussians also give equality in (4.1). $\square$
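Inequality (4.1) can be illustrated in one dimension with two elementary cases. For normalized Gaussians the entropy is $\tfrac12\ln(2\pi e\sigma^2)$ and variances add under convolution, which gives equality. For $f = g$ equal to the indicator function of $[0,1]$, $S(f) = S(g) = 0$ and $f * g$ is the triangular density on $[0,2]$. The sketch below uses illustrative function names and a midpoint rule; it is not taken from the paper.

```python
import math


def gaussian_entropy(var):
    """-int f ln f for a normalized one-dimensional Gaussian with variance var."""
    return 0.5 * math.log(2 * math.pi * math.e * var)


def entropy_power(S, N=1):
    """exp(2 S / N), the quantity appearing on both sides of (4.1)."""
    return math.exp(2 * S / N)


def triangular_entropy(steps=20000):
    """-int h ln h for h = f*g with f = g = indicator of [0,1].

    h(t) = t on [0,1] and 2 - t on [1,2]; the midpoint rule is applied on [0,1]
    and the result doubled by symmetry about t = 1.
    """
    width = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * width
        total += -t * math.log(t) * width
    return 2 * total
```

The triangular density has entropy $1/2$, so the left side of (4.1) is $e \approx 2.72$ against a right side of 2, and the Corollary reads $1/2 \ge \tfrac12\ln 2$.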
References 1. Wehrl, A.: On the relation between classical and quantum-mechanical entropy. Rept. Math. Phys. 2. Schrodinger,E.: Naturwissenschaften 14, 664-666 (1926) 3. Bargmann,V.: Commun. Pure Appl. Math. 14, 187-214 (1961); 20, 1-101 (1967) 4. Klauder,J.R.: Ann. Phys. (N.Y.) It, 123 (1960) 5. Glauber,R.J.: Phys. Rev. 131, 2766 (1963)
6. Lieb,E.H.: Bull. Am. Math. Soc. 81, 1-13 (1975) 7. Beckner,W.: Ann. Math. 102, 159-182 (1975) 8. Brascamp,H.J., Lieb,E.H.: Advan. Math. 20, 151-173 (1976) 9. RadclifS,J.M.: J. Phys. A4, 313-323 (1971) 10. Kutzner,J.: Phys. Lett. A41, 475-476 (1972) 11. Atkins,P.W., Dobson,J.C.: Proc. Roy. Soc. (London) A321, 321-340 (1971) 12. Arrechi,F.T., Courtens,E., Gilmore,R., Thomas,H.: Phys. Rev. A6, 2211-2237 (1972) 13. Lieb,E.H.: Commun. math. Phys. 31, 327-340 (1973) Communicated by J. Glimm Received May 12, 1978
With J.P. Solovej in Lett. Math. Phys. 22, 145-154 (1991)
Letters in Mathematical Physics 22: 145-154, 1991.
© 1991 Kluwer Academic Publishers. Printed in the Netherlands.
Quantum Coherent Operators: A Generalization of Coherent States
ELLIOTT H. LIEB*
Departments of Mathematics and Physics, Princeton University, PO Box 708, Princeton, NJ 08544-0708, U.S.A.
and
JAN PHILIP SOLOVEJ**
School of Mathematics, Institute for Advanced Study, Princeton, NJ 08540, U.S.A.
(Received: 15 March 1991)
Abstract. We introduce a technique to compare different, but related, quantum systems, thereby generalizing the way that coherent states are used to compare quantum systems to classical systems in semiclassical analysis. We then use this technique to estimate the dependence of the free energy of the quantum Heisenberg model on the spin value, and to estimate the relation between the ferromagnetic and antiferromagnetic free energies.
AMS subject classifications (1991). 81R30, 81S30.
1. Introduction
Coherent states have been used since the origin of quantum mechanics as one possible approach to semiclassical analysis, i.e., to compare quantum systems to corresponding classical systems. A complete list of references would be enormous. To mention just a few, see [2, 3, 6, 9, 11, 12, 14] for applications to continuous systems and [1, 8, 13] for applications to spin systems. For reviews, see [5, 7, 10]. In this Letter, we introduce a technique that can be used to compare two different quantum systems in very much the same way as regular coherent states compare a quantum system to a classical system.
Coherent states can be introduced from several points of view. The first is simply to think of them as being an interesting parameterization, by points $z$ in the classical phase space $\mathcal{V}$, of a complete set of vectors $\psi_z$ in the Hilbert space, $\mathcal{H}$, describing the quantum system. In one dimension, $\mathcal{H} = L^2(\mathbb{R})$ and the classical phase space is $\mathbb{R}^2$. The latter can be identified with $\mathbb{C}$ through $z = q + ip$. The usual coherent states in $L^2(\mathbb{R})$ are then
$\psi_z(x) = \pi^{-1/4} \exp(zx - \tfrac12(x^2 + |z|^2))$.  (1)
* Work supported in part by the U.S. National Science Foundation grant PHY-9019433.
** Work supported in part by the U.S. National Science Foundation grant DMS-9002416. Current address: Department of Mathematics, Princeton University, Fine Hall, Washington Road, Princeton, NJ 08544-1000, U.S.A.
Another way, mainly emphasized by Segal [12] and Bargmann [2, 3], is to consider the coherent states as defining an isometry between two Hilbert spaces, namely between $\mathcal{H}$ and a suitable subspace of the space of square integrable functions on $\mathcal{V}$. In the case of (1), we map $L^2(\mathbb{R})$ into the subspace of analytic functions of $L^2(\mathbb{C};\, \pi^{-1} e^{-|z|^2} d^2z)$ by
$f \mapsto \int \exp(zx - \tfrac12 x^2) f(x)\,dx$.
Bargmann proved that this map is, in fact, a bijection. A third point of view (see [4] and [8]), which is the one that will be generalized in this paper, emphasizes the one-dimensional projections $\Pi(z) = |\psi_z\rangle\langle\psi_z|$ onto the coherent states rather than the vectors $\psi_z$ themselves. One point is that this suppresses the unimportant information of the phase of $\psi_z$. The completeness can now be written as
$\int_{\mathcal{V}} \Pi(z)\,dz = 1_{\mathcal{H}}$  (2)
for a certain measure $dz$ on $\mathcal{V}$, proportional to the Liouville measure in the case of a Hamiltonian system. If $\mathcal{H}$ is finite-dimensional then $\int_{\mathcal{V}} dz = \dim \mathcal{H}$. Here, $1_{\mathcal{H}}$ denotes the identity on $\mathcal{H}$. The general philosophy is that all interesting operators $A$ can be written as (or approximated by) operators that can be written in 'diagonal form' as $A = \int G(z)\Pi(z)\,dz$. The function $G$ is called the upper symbol for the operator $A$. The function $g(z) = \langle\psi_z|A|\psi_z\rangle$ is called the lower symbol. It is the third point of view that is useful for proving the classical limit of quantum systems [8] and, generally, the Berezin-Lieb inequalities [4, 8]. We shall illustrate this technique in the case of spin systems.
Quantum spin systems are given by representation spaces $\mathcal{H}_J = \mathbb{C}^{2J+1}$ for SU(2), where the spin $J$ is a half-integer. The corresponding classical phase space is $J\mathbb{S}^2$, namely vectors in $\mathbb{R}^3$ with length $J$. To a point $\Omega \in \mathbb{S}^2$ we associate the Bloch coherent state vector $|\Omega\rangle_J \in \mathcal{H}_J$, defined up to an arbitrary phase by $\Omega \cdot S_J |\Omega\rangle_J = J|\Omega\rangle_J$, where $S_J = (S_J^x, S_J^y, S_J^z)$ is the vector of spin operators on $\mathcal{H}_J$. The projectors $\Pi_J(\Omega) = |\Omega\rangle_J\,{}_J\langle\Omega|$ satisfy
$\frac{2J+1}{4\pi} \int \Pi_J(\Omega)\,d\Omega = 1_{\mathcal{H}_J}$,  (3)
where $d\Omega$ is the normalized Euclidean measure on $\mathbb{S}^2$. In this case it is, indeed, true that all operators on $\mathcal{H}_J$ can be expressed in diagonal form.
EXAMPLE. Consider the Heisenberg model of interacting spins at inverse temperature $\beta$. We denote the free energy by $f(J, \beta)$ in the quantum case (for the precise definition see (19) below) or $f^{\mathrm{class}}(J, \beta)$ in the classical case. The Bloch coherent states can be used (see [8]) to prove that
$f^{\mathrm{class}}(J, \beta) \ge f(J, \beta) \ge f^{\mathrm{class}}(J+1, \beta)$.  (4)
By using these inequalities twice together with the fact that the classical free energy depends on $J$ through the simple scaling relation $f^{\mathrm{class}}(J, \beta) = f^{\mathrm{class}}(1, J^2\beta)$, we can relate quantum spin systems with spins $K$ and $J$. We get
$f\big(K, (\tfrac{J}{K+1})^2\beta\big) \ge f(J, \beta) \ge f\big(K, (\tfrac{J+1}{K})^2\beta\big)$.  (5)
The point is now that the route through the classical system is not an optimal procedure to obtain inequalities like (5). An obvious drawback of Equation (5) is that it does not reduce to equalities when $K = J$. Our new result here will be Equation (20) in Theorem 7. As an illustration, suppose we wish to compare spin 1 and spin 1/2. Then (5), with $K = 1$, $J = 1/2$, says
$f(1, \tfrac{1}{16}\beta) \ge f(\tfrac12, \beta) \ge f(1, \tfrac94\beta)$,
whereas (20) gives the better bound
$f(1, \tfrac14\beta) \ge f(\tfrac12, \beta) \ge f(1, \tfrac{9}{16}\beta)$.
The technique presented here is intrinsically quantum mechanical. In Theorem 8,
we also compare the antiferromagnetic and ferromagnetic free energies on a bipartite lattice for the same spin values. Classically, there is no real distinction between antiferromagnets and ferromagnets. The free energies are the same by a simple change of variable. On the other hand, in quantum mechanics the two systems are not unitarily equivalent, and the free energies are, indeed, different. Our bounds delimit that difference.
To describe the framework of our generalization, consider two Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$, either both finite dimensional or both infinite dimensional. A positive semi-definite operator $\Gamma$ on $\mathcal{H}_1 \otimes \mathcal{H}_2$ is called a quantum coherent operator for the pair $(\mathcal{H}_1, \mathcal{H}_2)$ if it satisfies
$\widetilde{\mathrm{Tr}}_{\mathcal{H}_1} \Gamma = 1_{\mathcal{H}_2}$,  (6)
$\widetilde{\mathrm{Tr}}_{\mathcal{H}_2} \Gamma = 1_{\mathcal{H}_1}$,  (7)
where $\widetilde{\mathrm{Tr}}_{\mathcal{H}_i}$ denotes the normalized partial trace, i.e., $\widetilde{\mathrm{Tr}}_{\mathcal{H}_i} X = (\dim \mathcal{H}_i)^{-1}\,\mathrm{Tr}_{\mathcal{H}_i} X$ if $\mathcal{H}_i$ is finite-dimensional, and $\widetilde{\mathrm{Tr}}_{\mathcal{H}_i} X = \mathrm{Tr}_{\mathcal{H}_i} X$ if $\mathcal{H}_i$ is infinite-dimensional.
The definition of partial trace over $\mathcal{H}_1$, which gives an operator on $\mathcal{H}_2$, is well known. To make an analogy with (2) we can pretend that $\mathcal{H}_2 = \mathcal{H}$ and that $\mathcal{H}_1$ is the classical phase space $\mathcal{V}$. Then (6) is the same as (2), whereas (7) imitates the trace condition
$\mathrm{Tr}\,\Pi(z) = 1$.  (8)
However, (6) and (7) bring out the symmetry between the two spaces $\mathcal{H}_1$ and $\mathcal{H}_2$.
If $A$ is an operator on $\mathcal{H}_1$ we define an operator $\hat{A}$ on $\mathcal{H}_2$ by
$\hat{A} = \widetilde{\mathrm{Tr}}_{\mathcal{H}_1}(\Gamma A)$.  (9)
(Really, $A$ means $A \otimes 1_{\mathcal{H}_2}$ in (9).) Since $\Gamma$ can be written as a linear combination of tensor products of an operator on $\mathcal{H}_1$ with an operator on $\mathcal{H}_2$, we see from the cyclicity of the $\mathcal{H}_1$-trace that $\widetilde{\mathrm{Tr}}_{\mathcal{H}_1}(\Gamma A) = \widetilde{\mathrm{Tr}}_{\mathcal{H}_1}(A\Gamma)$ and, hence, if $A$ is Hermitean, then so is $\hat{A}$. One virtue of this generalization is that it establishes a complete symmetry between the upper and lower symbols, i.e., $\hat{A}$ is the (unique) lower symbol for $A$, and $A$ is an upper symbol for $\hat{A}$. The main comparison inequality that generalizes the Berezin-Lieb inequalities such as (4) is
THEOREM 1. If $A$ is a Hermitean operator on $\mathcal{H}_1$ and $f$ is any convex function on the reals, then
$\widetilde{\mathrm{Tr}}_{\mathcal{H}_1} f(A) \ge \widetilde{\mathrm{Tr}}_{\mathcal{H}_2} f(\hat{A})$.
If $f$ is concave the inequality is reversed. An equivalent restatement of this is that if $A$ is any upper symbol for $\hat{A}$, then $\widetilde{\mathrm{Tr}}_{\mathcal{H}_2} f(\hat{A}) \le \widetilde{\mathrm{Tr}}_{\mathcal{H}_1} f(A)$.
Proof.
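$\widetilde{\mathrm{Tr}}_{\mathcal{H}_1} f(A) = \widetilde{\mathrm{Tr}}_{\mathcal{H}_2} \widetilde{\mathrm{Tr}}_{\mathcal{H}_1}(\Gamma f(A)) = \frac{1}{\dim \mathcal{H}_2} \sum_{\nu} \widetilde{\mathrm{Tr}}_{\mathcal{H}_1}(\Gamma_{\nu} f(A))$,
(for the finite-dimensional case, and without $1/\dim \mathcal{H}_2$ in the infinite-dimensional case), where $\{|\nu\rangle\}$ is the orthonormal basis in $\mathcal{H}_2$ consisting of eigenfunctions for $\hat{A}$. Here $\Gamma_{\nu} = \langle\nu|\Gamma|\nu\rangle$ is a positive operator on $\mathcal{H}_1$ with $\widetilde{\mathrm{Tr}}_{\mathcal{H}_1} \Gamma_{\nu} = 1$.
It then follows immediately from the spectral theorem and Jensen's inequality that
$\widetilde{\mathrm{Tr}}_{\mathcal{H}_1} f(A) \ge \frac{1}{\dim \mathcal{H}_2} \sum_{\nu} f\big(\widetilde{\mathrm{Tr}}_{\mathcal{H}_1}(\Gamma_{\nu} A)\big) = \widetilde{\mathrm{Tr}}_{\mathcal{H}_2} f(\hat{A})$,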
since all the $|\nu\rangle$ are eigenfunctions for $\hat{A}$. $\square$
Remarks. (1) An important open problem is to decide what condition on $\Gamma$, $\mathcal{H}_1$, and $\mathcal{H}_2$ will guarantee that every operator $\hat{A}$ on $\mathcal{H}_2$ has an upper symbol $A$, satisfying (9). We can call this operator completeness. Obviously, $\dim \mathcal{H}_1 \ge \dim \mathcal{H}_2$ is needed for operator completeness, but conditions on $\Gamma$ are also needed. If $\Gamma$ is the identity operator on $\mathcal{H}_1 \otimes \mathcal{H}_2$ then $\Gamma = 1_{\mathcal{H}_1} \otimes 1_{\mathcal{H}_2}$ and, for any $A$, the $\hat{A}$ of (9) is proportional to $1_{\mathcal{H}_2}$. Operator completeness clearly fails in this case. A less trivial case of operator incompleteness, due to G. M. Graf, is mentioned in the acknowledgement at the end. As already mentioned, operator completeness holds in the case (3) above. For a proof of this see [7], pp. 29-34 or the remark after Theorem 5 below.
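The trivial case in which the coherent operator is the identity gives a quick sanity check of Theorem 1: the lower symbol is then the normalized trace of $A$ times the identity, and the theorem reduces to Jensen's inequality for the eigenvalues of $A$. The following is a minimal sketch in Python; the spectrum `eigs` and the helper names are hypothetical, chosen only for illustration.

```python
import math

def normalized_trace_f(eigs, f):
    """Normalized trace of f(A) for a Hermitian A with eigenvalues `eigs`."""
    return sum(f(x) for x in eigs) / len(eigs)

def lower_symbol_trivial(eigs):
    """For Gamma = identity, A-hat is (normalized trace of A) times the identity."""
    return sum(eigs) / len(eigs)

# Hypothetical spectrum, chosen only for illustration.
eigs = [-1.0, 0.0, 0.5, 2.0]
f = math.exp  # a convex function

lhs = normalized_trace_f(eigs, f)    # normalized trace of f(A)
rhs = f(lower_symbol_trivial(eigs))  # normalized trace of f(A-hat)
print(lhs >= rhs)
```

Replacing `math.exp` by a concave function such as `lambda t: -t * t` reverses the inequality, in line with the statement of the theorem.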
(2) If $\rho_2$ is a density matrix on $\mathcal{H}_2$, i.e., a positive semidefinite matrix with the somewhat unusual normalization $\widetilde{\mathrm{Tr}}_{\mathcal{H}_2} \rho_2 = 1$, we find that $\hat{\rho}_2$ is a density matrix on $\mathcal{H}_1$. Furthermore, if $H_1$ is any operator on $\mathcal{H}_1$ we get (10)
In the following section, we shall give interesting examples of quantum coherent operators, $\Gamma$, for spin systems. Moreover, they will be operator complete from the big space to the small one; see Theorem 5. We shall also study how coherent these operators are. More precisely, we shall define entropies related to these operators and estimate them in (13). In Section 3, we use these operators to give the stated estimates on the Heisenberg free energies.
2. Coherent Operators for Spin Systems
Let $\mathcal{H}_J = \mathbb{C}^{2J+1}$ and $\mathcal{H}_K = \mathbb{C}^{2K+1}$ be representation spaces for SU(2) corresponding to spin values $K \ge J$, where $K$ and $J$ are half-integers. Let $P_L$, $L = K - J, \dots, K + J$ be the projection from $\mathcal{H}_K \otimes \mathcal{H}_J$ onto its subspace in which $(S_K + S_J)^2 = L(L + 1)$. Here $S_K = (S_K^x, S_K^y, S_K^z)$ is the vector of spin operators on $\mathcal{H}_K$ which we also identify as an operator on $\mathcal{H}_K \otimes \mathcal{H}_J$ (really $S_K \otimes 1_{\mathcal{H}_J}$). Define
$\Gamma_L = \frac{(2K+1)(2J+1)}{2L+1} P_L$.
LEMMA 2. For $L = K - J, \dots, K + J$, $\Gamma_L$ is a quantum coherent operator for $(\mathcal{H}_K, \mathcal{H}_J)$.
Proof. Let $U_J = U_J(R)$ and $U_K = U_K(R)$ denote the action of some $R \in$ SU(2) in the representation spaces $\mathcal{H}_J$ and $\mathcal{H}_K$. Then
$U_J (\mathrm{Tr}_{\mathcal{H}_K} P_L) U_J^{-1} = \mathrm{Tr}_{\mathcal{H}_K}[U_K^{-1} U_K U_J P_L U_J^{-1} U_K^{-1} U_K] = \mathrm{Tr}_{\mathcal{H}_K}[U_K^{-1} P_L U_K] = \mathrm{Tr}_{\mathcal{H}_K} P_L$.
Here we have used the cyclicity of the trace on $\mathcal{H}_K$ and the fact that $P_L$ commutes with $U_K U_J = U_K(R) U_J(R)$. Since this holds for all $R \in$ SU(2) and since both representations $U_J$ and $U_K$ are irreducible, it follows from Schur's Lemma that $\mathrm{Tr}_{\mathcal{H}_K} P_L$ is a multiple of the identity. The same is true, of course, with $K$ and $J$ interchanged. The lemma then follows from $\mathrm{Tr}_{\mathcal{H}_K \otimes \mathcal{H}_J} P_L = 2L + 1$, from which the relevant constants can be computed. $\square$
The next question is how $\Gamma_L$ transforms the regular spin operators according to (9).
LEMMA 3. We have
$\hat{S}_K = \widetilde{\mathrm{Tr}}_{\mathcal{H}_K}(\Gamma_L S_K) = \frac{L(L+1) - K(K+1) - J(J+1)}{2J(J+1)}\, S_J$,
and the same identity with $J$ and $K$ interchanged.
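Proof. Using the same argument as in Lemma 2 we find that
$U_J \hat{S}_K U_J^{-1} = \widetilde{\mathrm{Tr}}_{\mathcal{H}_K}[\Gamma_L U_K S_K U_K^{-1}]$.  (11)
If we now set $U_J = \exp[itS_J^z]$ in (11) and take the derivative with respect to $t$ at $t = 0$, we infer the commutation relation $[S_J^z, \hat{S}_K^{\pm}] = \pm\hat{S}_K^{\pm}$. As usual $S^{\pm} = S^x \pm iS^y$. Likewise, we infer $[S_J^z, \hat{S}_K^z] = 0$. It follows easily that $\hat{S}_K^z = \kappa S_J^z$ for some scalar $\kappa$. Since we can interchange $x$, $y$ and $z$, we have shown that $\hat{S}_K = \kappa S_J$. We compute $\kappa$ from
$\kappa J(J+1) = S_J \cdot \hat{S}_K = \widetilde{\mathrm{Tr}}_{\mathcal{H}_K}(S_J \cdot S_K\, \Gamma_L) = \tfrac12\big(L(L+1) - J(J+1) - K(K+1)\big)$. $\square$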
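The scalar of Lemma 3 is easy to tabulate with exact rational arithmetic. The sketch below is only an illustration; the function name `kappa` and the sample spin values are assumptions made here, not notation from the paper. It evaluates the formula of Lemma 3 for the two extreme values of $L$ and for the case $K = J = 2$, $L = 3$ mentioned in the acknowledgement.

```python
from fractions import Fraction

def kappa(K, J, L):
    """Scaling factor of Lemma 3: S_K-hat = kappa * S_J for the operator Gamma_L."""
    K, J, L = Fraction(K), Fraction(J), Fraction(L)
    return (L * (L + 1) - K * (K + 1) - J * (J + 1)) / (2 * J * (J + 1))

K, J = Fraction(1), Fraction(1, 2)
print(kappa(K, J, K + J))  # L = K + J gives K/(J+1)
print(kappa(K, J, K - J))  # L = K - J gives -(K+1)/(J+1)
print(kappa(2, 2, 3))      # K = J = 2, L = 3: the spin operator is mapped to zero
```

Using `Fraction` keeps half-integer spins exact, so the comparison with the closed forms $K/(J+1)$ and $-(K+1)/(J+1)$ involves no rounding.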
The most interesting cases are $L_{\max} = K + J$ and $L_{\min} = K - J$. We denote the respective operators by $\Gamma_{\max}$ and $\Gamma_{\min}$. Then
$\widetilde{\mathrm{Tr}}_{\mathcal{H}_K}(S_K \Gamma_{\min}) = -\frac{K+1}{J+1}\, S_J$ and $\widetilde{\mathrm{Tr}}_{\mathcal{H}_J}(S_J \Gamma_{\min}) = -\frac{J}{K}\, S_K$,
$\widetilde{\mathrm{Tr}}_{\mathcal{H}_K}(S_K \Gamma_{\max}) = \frac{K}{J+1}\, S_J$ and $\widetilde{\mathrm{Tr}}_{\mathcal{H}_J}(S_J \Gamma_{\max}) = \frac{J}{K+1}\, S_K$.  (12)
Equation (12) has a surprising lack of symmetry between $J$ and $K$; but here the condition $K \ge J$ has to be remembered. We see that $\Gamma_{\min}$, except for a minus sign, gives a scaling of the spin. A natural question is now whether we can find a coherent operator that acts like $\Gamma_{\min}$ but without the minus sign. In the remark after Theorem 8 below, we shall see that such an operator does not exist.
We now prove in a very precise sense that $\Gamma_{\min}$ is a better coherent operator than $\Gamma_{\max}$. In fact, if we are given a density matrix $\rho_J$ on $\mathcal{H}_J$ we can compute entropies relative to $\mathcal{H}_K$ as $\sigma_K(\hat{\rho}_J^{\min})$ and $\sigma_K(\hat{\rho}_J^{\max})$, where
$\sigma_K(\rho) = \frac{1}{\dim \mathcal{H}_K}\, \mathrm{Tr}_{\mathcal{H}_K} f(\rho)$ and $f(t) = -t \ln t$.
This definition is similar to the definition of Wehrl's [15] classical entropy given in [9], i.e.,
$\sigma_J^{\mathrm{cl}}(\rho) = \frac{1}{4\pi} \int f\big({}_J\langle\Omega|\rho|\Omega\rangle_J\big)\, d\Omega$.
Notice that we are using our unconventional normalization of always working with normalized traces.
THEOREM 4.
$\sigma_J(\rho_J) \le \sigma_K(\hat{\rho}_J^{\min}) \le \sigma_J^{\mathrm{cl}}(\rho_J) \le \sigma_K(\hat{\rho}_J^{\max})$.  (13)
Proof. The first inequality follows from Theorem 1 and the fact that $f(t)$ is a concave function. The second follows from the inequality $\sigma_J(\rho) \le \sigma_J^{\mathrm{cl}}(\rho)$ in [9] together with the fact that
$\sigma_K^{\mathrm{cl}}(\hat{\rho}_J^{\min}) = \sigma_J^{\mathrm{cl}}(\rho_J)$.  (14)
This identity follows from
$\widetilde{\mathrm{Tr}}_{\mathcal{H}_K}(\Gamma_{\min} \Pi_K(\Omega)) = \frac{2J+1}{2K+1}\, \Pi_J(-\Omega)$,  (15)
which, since it is rotation invariant, can be checked by choosing $|\Omega\rangle_K$ to be the maximal weight vector $|K\rangle_K$ (i.e., $S_K^z |K\rangle_K = K|K\rangle_K$). Before proving the last inequality in (13), we first notice that it follows from the theory of coherent states that
$\Gamma_{\max} = \frac{(2K+1)(2J+1)}{4\pi} \int \Pi_K(\Omega) \otimes \Pi_J(\Omega)\, d\Omega$,  (16)
because $|\Omega\rangle_K \otimes |\Omega\rangle_J$ is a coherent state in the subspace on which $P_{\max}$ projects. From (16) we easily conclude that $\hat{\rho}_J^{\max}$ has the following simple representation in terms of Bloch coherent states.
$\hat{\rho}_J^{\max} = \frac{2K+1}{4\pi} \int {}_J\langle\Omega|\rho_J|\Omega\rangle_J\, \Pi_K(\Omega)\, d\Omega$.  (17)
The last inequality in (13) now follows from a proof almost identical to the proof of Theorem 1. $\square$
If $\rho_J$ is a pure state then $\sigma_J(\rho_J) = -\ln(2J + 1)$ which is the smallest possible value with our normalization. It is now clear that $\hat{\rho}_J^{\min}$ is not a pure state if $K > J$ because $\sigma_K(\hat{\rho}_J^{\min}) \ge -\ln(2J + 1)$ by (13), whereas $\sigma_K(\rho) = -\ln(2K + 1)$ for any pure state $\rho$. We shall now show operator completeness for $\Gamma_{\max}$ and $\Gamma_{\min}$.
THEOREM 5. If $K \ge J$, then the coherent operators $\Gamma_{\max}$ and $\Gamma_{\min}$ are operator complete from $\mathcal{H}_K$ to $\mathcal{H}_J$.
Proof. We have to show that we can get all operators (matrices) in End($\mathcal{H}_J$) as $\hat{A}$ in (9) with $A$ an operator in End($\mathcal{H}_K$). In the case of $\Gamma_{\min}$ this follows from (15), since we know from the operator completeness of (3) that the projections $\Pi_J(\Omega)$ span all operators. For $\Gamma_{\max}$ we note that the group SU(2) acts on operators in End($\mathcal{H}_J$) or End($\mathcal{H}_K$) through the adjoint representation $\mathrm{ad}_{U_K} A = U_K A U_K^{-1}$. As in Lemma 3, we see that for all the coherent operators $\Gamma_L$, $(\mathrm{ad}_{U_K} A)\hat{} = \mathrm{ad}_{U_J}(\hat{A})$.
Thus, End($\mathcal{H}_K$) is a vector space on which SU(2) acts, and it is clear that End($\mathcal{H}_K$) can be written as a direct sum of irreducible representations for SU(2)
corresponding to spins $M = 0, \dots, 2K$. The map $A \mapsto \hat{A}$ will map the subspaces corresponding to representations with $M = 0, \dots, 2J$ into the corresponding subspaces of End($\mathcal{H}_J$), while the subspaces with $M = 2J + 1, \dots, 2K$ are in the kernel.
To see that the map is onto, we have to show that none of the representation subspaces with $M = 0, \dots, 2J$ in End($\mathcal{H}_K$) are mapped to zero, and thus, from irreducibility, conclude that they are disjoint from the kernel. In the adjoint representation, the generators $S_K$ act via commutators, and from the identities
$[S_K^x, [S_K^x, (S_K^+)^M]] + [S_K^y, [S_K^y, (S_K^+)^M]] + [S_K^z, [S_K^z, (S_K^+)^M]] = M(M+1)(S_K^+)^M$, and
$[S_K^z, (S_K^+)^M] = M(S_K^+)^M$
we see that $(S_K^+)^M$ is a highest weight vector in the irreducible subspace of End($\mathcal{H}_K$) with spin $M$. It is therefore enough to show that $(S_K^+)^M$ is not mapped to zero. From [8] formula (A.1) we can calculate
${}_K\langle\Omega|(S_K^{\pm})^M|\Omega\rangle_K = C_K^{(M)} (\Omega^x \pm i\Omega^y)^M$,
where $C_K^{(M)} > 0$ for $M = 0, \dots, 2K$. Using these lower symbols we get from (16) that if $M = 0, \dots, 2J$
$\widetilde{\mathrm{Tr}}_{\mathcal{H}_J}\big[\widehat{(S_K^+)^M}\, (S_J^-)^M\big] = C_K^{(M)} C_J^{(M)} (4\pi)^{-1} \int |\Omega^x + i\Omega^y|^{2M}\, d\Omega > 0$,
from which the theorem follows in the case of $\Gamma_{\max}$. $\square$
Remark. In view of (16) operator completeness in the case of $\Gamma_{\max}$ is clearly a stronger statement than operator completeness in the classical case (3). Therefore, the above proof for $\Gamma_{\max}$ automatically gives an alternative proof of completeness in the classical case, and shows, moreover, that one can always choose an upper symbol from the subspace of all lower symbols (see [7], pp. 29-34). From the above proof we also immediately get the following corollary.
COROLLARY 6. If $K \ge J$ then the subspace of End($\mathcal{H}_K$) consisting of matrices $A$ with $\hat{A} \in$ End($\mathcal{H}_J$) (for both $\Gamma_{\min}$ and $\Gamma_{\max}$) is the direct sum of irreducible subspaces under the adjoint representation with spin values $M = 0, \dots, 2J$.
3. Free Energy of the Heisenberg Model
In this section, we shall use the method described in the previous sections to estimate the free energy of the Heisenberg model of interacting spins. For simplicity, we take the same spin $J$ on each site, but this is not necessary. Let $\Lambda$ denote a finite collection of $|\Lambda|$ points and define the Heisenberg Hamiltonian $H(J)$ on $\mathcal{H}_J(\Lambda) = \bigotimes_{i \in \Lambda} \mathcal{H}_J$ by
$H(J) = \sum_{i,j \in \Lambda} \varepsilon_{ij}\, S_J(i) \cdot S_J(j)$,  (18)
where $\varepsilon_{ij}$ are real numbers. No assumption is made about the sign of the $\varepsilon_{ij}$. The partition function is defined to be
$Z_\Lambda(J, \beta) = (2J + 1)^{-|\Lambda|}\, \mathrm{Tr}\, e^{-\beta H(J)}$.
The normalized free energies are
$f(J, \beta) = -\frac{1}{|\Lambda|} \ln Z_\Lambda(J, \beta)$.  (19)
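The operator
$\Gamma(\Lambda) = \bigotimes_{i \in \Lambda} \Gamma_{\min}(i)$
on $\mathcal{H}_K(\Lambda) \otimes \mathcal{H}_J(\Lambda)$ is a coherent operator for $(\mathcal{H}_K(\Lambda), \mathcal{H}_J(\Lambda))$. We get from (12) for $K \ge J$
$\hat{H}(J) = \big(\tfrac{J}{K}\big)^2 H(K)$ and $\hat{H}(K) = \big(\tfrac{K+1}{J+1}\big)^2 H(J)$.
Thus, Theorem 1 implies that
$Z_\Lambda(J, \beta) \ge Z_\Lambda\big(K, (\tfrac{J}{K})^2 \beta\big)$,
$Z_\Lambda(K, \beta) \ge Z_\Lambda\big(J, (\tfrac{K+1}{J+1})^2 \beta\big)$.
Using (19) we arrive at Theorem 7.
THEOREM 7. If $K \ge J$
$f\big(K, (\tfrac{J}{K})^2 \beta\big) \ge f(J, \beta) \ge f\big(K, (\tfrac{J+1}{K+1})^2 \beta\big)$,  (20)
$f\big(J, (\tfrac{K+1}{J+1})^2 \beta\big) \ge f(K, \beta) \ge f\big(J, (\tfrac{K}{J})^2 \beta\big)$.  (21)
Inequalities (20) and (21) are the same, but both are given here for the sake of clarity.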
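To see how much (20) improves on the classical route (5), one can compare the coefficients multiplying $\beta$. The sketch below assumes the readings $f(K, (J/K)^2\beta) \ge f(J, \beta) \ge f(K, ((J+1)/(K+1))^2\beta)$ for (20) and $f(K, (J/(K+1))^2\beta) \ge f(J, \beta) \ge f(K, ((J+1)/K)^2\beta)$ for (5); the function names are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def new_bounds(K, J):
    """Coefficients (a, b) with f(K, a*beta) >= f(J, beta) >= f(K, b*beta), as in (20)."""
    K, J = Fraction(K), Fraction(J)
    return (J / K) ** 2, ((J + 1) / (K + 1)) ** 2

def classical_route_bounds(K, J):
    """The same pair of coefficients obtained from (5), i.e. via the classical system."""
    K, J = Fraction(K), Fraction(J)
    return (J / (K + 1)) ** 2, ((J + 1) / K) ** 2

K, J = 1, Fraction(1, 2)
print(new_bounds(K, J))              # coefficients 1/4 and 9/16
print(classical_route_bounds(K, J))  # coefficients 1/16 and 9/4
print(new_bounds(1, 1))              # both coefficients equal 1 when K = J
```

For $K = 1$, $J = 1/2$ both new coefficients lie between the classical-route ones, so if $f(K, \cdot)$ is monotone in $\beta$ the interval given by (20) is nested inside the one given by (5); for $K = J$ the new bounds collapse to equalities, which (5) does not achieve.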
Finally, let us compare the free energy for the Hamiltonian $H$ given in (18) with that of $-H$, i.e., we reverse the sign of all the $\varepsilon_{ij}$. (Recall that the sign of each $\varepsilon_{ij}$ is arbitrary but fixed.) With an application to ferro- and antiferromagnets in mind, we shall call the former case (with $H$) the ferromagnet and shall call the latter case (with $-H$) the antiferromagnet. Subscripts $a$ and $f$ will denote the two cases. One important new assumption must now be made, however. We assume that $\Lambda$ is bipartite. This means that $\Lambda = A \cup B$ with $A \cap B$ empty and with $\varepsilon_{ij} = 0$ whenever $i \in A$ and $j \in A$ or else $i \in B$ and $j \in B$. The coherent operator to be used is
$\Gamma' = \bigotimes_{i \in A} \Gamma_{\min}(i) \otimes \bigotimes_{i \in B} \Gamma_{\max}(i)$
on $\mathcal{H}_J(\Lambda) \otimes \mathcal{H}_J(\Lambda)$. Note the combination of min and max used here. We get
$\hat{H}_f(J) = \frac{J}{J+1}\, H_a(J)$ and $\hat{H}_a(J) = \frac{J}{J+1}\, H_f(J)$.
Thus
THEOREM 8. (ferro- and antiferromagnetic comparison).
$f_a\big(J, \tfrac{J}{J+1}\beta\big) \ge f_f(J, \beta) \ge f_a\big(J, \tfrac{J+1}{J}\beta\big)$.
Remark. If we could find a coherent operator inducing the same transformation on the basic spin operators as $\Gamma_{\min}$, but without the minus sign, then the above proof would give $f_a = f_f$, which is, of course, wrong.
Acknowledgements
We are grateful to G. M. Graf for pointing out to us that $\Gamma_L$ need not be operator complete for every $J$, $K$, and $L$. By Lemma 3 we see that $S_K$ is mapped to zero when $K = J = 2$ and $L = 3$; from this we conclude that $S_J$ cannot be in the range of the $\Gamma_L$ transform in this case.
References
1. Arecchi, F. T., Courtens, E., Gilmore, R., and Thomas, H., Atomic coherent states in quantum optics, Phys. Rev. A 6, 2211-2237 (1972).
2. Bargmann, V., On a Hilbert space of analytic functions and an associated integral transform, part 1, Comm. Pure Appl. Math. 14, 187-214 (1961).
3. Bargmann, V., On a Hilbert space of analytic functions and an associated integral transform, part 2, Comm. Pure Appl. Math. 20, 1-101 (1967).
4. Berezin, F. A., Izv. Akad. Nauk SSSR Ser. Mat. 36(5), 1134-1167 (1972). English translation: Covariant and contravariant symbols of operators, Math. USSR-Izv. 6(5), 1117-1151 (1972) and F. A. Berezin, General concept of quantization, Comm. Math. Phys. 40, 153-174 (1975).
5. Feng, D. H., Gilmore, R., and Zhang, W-M., Coherent states: Theory and some applications, Rev. Mod. Phys. 62, 867-927 (1990).
6. Klauder, J. R., The action option and a Feynman quantization of spinor fields in terms of ordinary c-numbers, Ann. Phys. 11, 123 (1960).
7. Klauder, J. R., and Skagerstam, B-S., Coherent States, World Scientific, Singapore, 1985.
8. Lieb, E. H., The classical limit of quantum spin systems, Comm. Math. Phys. 31, 327-340 (1973).
9. Lieb, E. H., Proof of an entropy conjecture of Wehrl, Comm. Math. Phys. 62, 35-41 (1978).
10. Perelomov, A., Generalized Coherent States and their Applications, Springer-Verlag, New York, Berlin, Heidelberg, 1986.
11. Schrödinger, E., Der stetige Übergang von der Mikro- zur Makromechanik, Naturwiss. 14, 664-666 (1926).
12. Segal, I. E., Mathematical characterizations of the physical vacuum for the linear Bose-Einstein fields, Illinois J. Math. 6, 500-523 (1962).
13. Simon, B., The classical limit of quantum partition functions, Comm. Math. Phys. 71, 247-276 (1980).
14. Thirring, W. E., A lower bound with the best possible constant for Coulomb hamiltonians, Comm. Math. Phys. 79, 1-7 (1981).
15. Wehrl, A., On the relation between classical and quantum-mechanical entropy, Rep. Math. Phys. 12, 385 (1977).
Proceedings of the Symposium on Coherent States, past, present and future, Oak Ridge
COHERENT STATES AS A TOOL FOR OBTAINING RIGOROUS BOUNDS Elliott H. Lieb* Departments of Mathematics and Physics, Princeton University Princeton, NJ 08544 USA
ABSTRACT This talk reviews some of the ways in which coherent states can be used to give rigorous bounds for quantities of physical interest and, in certain cases, can yield exact asymptotic formulas. Three main topics will be discussed.
1. The Berezin-Lieb inequalities which yield upper and lower bounds to quantum mechanical free energies in terms of classical free energies. 2. Coherent states (combined with a variational principle and correlation inequality) can generate upper and lower bounds for the ground state energies of atoms and other Coulomb systems. 3. Wehrl's conjecture about the entropy of coherent states.
0. Introduction This talk reviews some of the ways in which coherent states can be used to give
rigorous bounds to quantities of physical interest and, in certain cases, can yield exact asymptotic formulas. Three main topics will be discussed. 1. The Berezin-Lieb inequalities, which yield upper and lower bounds to quantum mechanical free energies in terms of classical free energies: Some applications are (a) upper and lower bounds to the free energy of quantum spin systems in terms of the corresponding classical spin systems and (b) the exact evaluation of the ground state energy and free energy of the Dicke laser model. (The generalization to bounds of one quantum system in terms of another quantum system are given in J.P. Solovej's talk.) * Work partially supported by U.S. National Science Foundation grant PHY 9019433 A02.
©1993 by the author. Reproduction of this article, by any means, is permitted for non-commercial purposes.
2. Coherent states (combined with a variational principle and a correlation inequality) can generate upper and lower bounds for the ground state energies of atoms and other Coulomb systems. In the limit $Z \to \infty$ these bounds coincide and thereby establish the asymptotic exactness of Thomas-Fermi theory. 3. Wehrl's conjecture about the entropy of coherent states, its resolution for Glauber states (which also leads to $L^p$ bounds for Wigner distribution functions), and the open conjecture about SU(2) (or Bloch) coherent states.
1. Free Energies
It is helpful to have an example in mind, and for this it is convenient to take the Heisenberg model in which each spin has a value $S$ (which can be 1/2, 1, 3/2, ...) and the interaction is $S_i \cdot S_j$ for nearest neighbor pairs $\langle i, j\rangle$. Thus, the Hamiltonian is
$H(S) = S^{-2} \sum_{\langle i,j\rangle} S_i \cdot S_j$.  (1.1)
The normalization $1/S^2$ is taken for convenience so that $H(S)$ has, in some sense to be determined, a nice limit as $S \to \infty$. We are interested in the partition function
$Z^Q(S) = (2S+1)^{-N}\, \mathrm{Tr}\, e^{-\beta H(S)} = e^{-\beta F(S)}$  (1.2)
and will try to bound it in terms of a classical partition function $Z^{\mathrm{cl}}$ given by
$Z^{\mathrm{cl}} = (4\pi)^{-N} \int \exp[-\beta H^{\mathrm{cl}}(\Omega_1, \dots, \Omega_N)]\, d\Omega_1 \cdots d\Omega_N$,  (1.3)
$H^{\mathrm{cl}}(\Omega_1, \dots, \Omega_N) = \sum_{\langle i,j\rangle} \Omega_i \cdot \Omega_j$,  (1.4)
with $|\Omega_i| = 1$. The $N$ integrations in (1.3) are each over the unit sphere $\mathbb{S}^2$.
The general situation is the following. We are given a Hilbert space $\mathcal{H}$ and a family of coherent states $|z\rangle$, parametrized symbolically by $z$ (and which might, in fact, be $\Omega \in \mathbb{S}^2$ as a particular case), satisfying
$\langle z|z\rangle = 1$ (normalization)  (1.5)
and
$\int |z\rangle\langle z|\, dz = 1$ (resolution of identity),  (1.6)
for a suitable measure $dz$ on the parameter space. For each operator $H$ on $\mathcal{H}$ we can define the lower symbol, $\check{H}(z)$, which is a function on the parameter space, by
$\check{H}(z) = \langle z|H|z\rangle$.  (1.7)
Usually there is at least one upper symbol, $\hat{H}(z)$, which is a measurable function satisfying
$H = \int \hat{H}(z)\, |z\rangle\langle z|\, dz$.  (1.8)
Such a function may not exist (it does not exist for the Coulomb potential, $|x|^{-1}$, for example, using Glauber coherent states, or even generalized states of the type given in (2.8)) and if it exists it is not always unique. In the finite dimensional case (i.e., spins) it always exists, but is never unique.
An important - and frustrating - point is that while the lower symbol is always a positive function when $H$ is positive, the upper symbol $\hat{H}$ need not be positive. For example, with Glauber coherent states, the lower symbol for the oscillator Hamiltonian $H = a^\dagger a$ is $p^2 + q^2$ while the upper symbol is $p^2 + q^2 - 1$. The Berezin-Lieb inequalities are as follows.
THEOREM 1: The partition function $Z^Q := \mathrm{Tr}\, e^{-\beta H}$ satisfies
$\int \exp[-\beta \check{H}(z)]\, dz \le Z^Q \le \int \exp[-\beta \hat{H}(z)]\, dz$.  (1.9)
This relates a quantum partition function to a classical partition function. Recent developments, with J.P. Solovej, relate one quantum $Z$ to another quantum $Z$. Note that this inequality also holds for any convex function of $H$, not just for the exponential function. Returning now to our example with spins, we take $|\Omega\rangle$ to be the Bloch coherent state, i.e., the vector in $\mathbb{C}^{2S+1}$ defined (up to a phase) by
$(S \cdot \Omega)|\Omega\rangle = S|\Omega\rangle$.  (1.10)
The measure on $\mathbb{S}^2$ for (1.6) and (1.8) is $(4\pi)^{-1} d\Omega$. We then have the following upper and lower symbols for the three spin operators $S = (S^x, S^y, S^z)$.
$\hat{S}(\Omega) = (S + 1)\Omega$, $\check{S}(\Omega) = S\Omega$.  (1.11)
Recalling the $1/S^2$ normalization convention in (1.1), the Berezin-Lieb inequalities yield
$Z^{\mathrm{cl}}(\beta) \le Z^Q(S, \beta) \le Z^{\mathrm{cl}}\big(\big(\tfrac{S+1}{S}\big)^2 \beta\big)$.  (1.12)
With these upper and lower bounds, which are uniform in the size of the system, it is a trivial matter to prove the classical limit of the quantum spin systems, i.e., if we define $f(S) = N^{-1}F(S)$ to be the free energy per spin, then
$\lim_{S \to \infty} f^Q(S) = f^{\mathrm{cl}} = \lim_{N \to \infty} -(\beta N)^{-1} \ln Z^{\mathrm{cl}}$.  (1.13)
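The sandwich between classical partition functions can be checked by hand in the simplest possible case. The sketch below does not use the Heisenberg Hamiltonian (1.1); as an illustrative assumption it takes a single spin with $H = -S^z$, whose lower and upper symbols are $-S\Omega^z$ and $-(S+1)\Omega^z$ by (1.11), so that both classical integrals reduce to $\sinh(x)/x$. The function names are hypothetical.

```python
import math

def z_quantum(S, beta):
    """(2S+1)^(-1) Tr exp(beta * S^z) for a single spin S (H = -S^z, an illustrative choice)."""
    n = int(round(2 * S))
    ms = [-S + k for k in range(n + 1)]  # eigenvalues of S^z
    return sum(math.exp(beta * m) for m in ms) / (n + 1)

def z_classical(r, beta):
    """(4 pi)^(-1) times the integral over the unit sphere of exp(beta * r * cos(theta))."""
    x = beta * r
    return math.sinh(x) / x

beta = 1.0
for S in (0.5, 1.0, 1.5, 5.0):
    lower = z_classical(S, beta)      # from the lower symbol S * Omega^z
    upper = z_classical(S + 1, beta)  # from the upper symbol (S + 1) * Omega^z
    zq = z_quantum(S, beta)
    print(S, lower <= zq <= upper)
```

In this linear example the upper bound carries the factor $(S+1)/S$ once rather than squared, because the Hamiltonian contains a single spin operator instead of a product of two.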
Another example of the use of these inequalities to evaluate a quantum free energy is the Dicke laser model, defined by the Hamiltonian
$H = a^\dagger a + \varepsilon S^z + N^{-1/2}(a + a^\dagger)S^x$  (1.14)
with
$S = \sum_{j=1}^{N} S_j$.  (1.15)
Each $S_j$ is a spin 1/2 particle (in reality, a 2-level atom with "spin-up" being an excited state and "spin down" being the ground state). The operators $a$ and $a^\dagger$ are annihilation and creation operators for a single photon mode in a cavity. The first term in $H$ is the photon energy. The last term in $H$ is the atom-photon interaction. There are $N$ atoms and we want to take the thermodynamic limit $N \to \infty$ and find the free energy per atom.
It turns out that this system has a phase transition (in the thermodynamic limit) as a function of $\beta$ from a low $\beta$ state in which the average photon number $\langle a^\dagger a\rangle$ is $O(1)$ to a high $\beta$ state in which $\langle a^\dagger a\rangle$ is $O(N)$. K. Hepp and I were initially able to prove this only with great difficulty. Later we realized it could be done with ease using a variant of Theorem 1.
This variant noted in Ref. 2 and also in Ref. 5 is that when $\mathcal{H}$ is a tensor product the inequalities (1.9) hold if we replace only some of the operators by their symbols and then take the proper trace over the remaining operators. Thus, when the total spin $S$, which is a conserved quantity, has the value $S$, we can replace the spin operators by their lower symbols, for example, and we have (with $\Pi_S$ being the projection onto spin $S$)
$\mathrm{Tr}\, \Pi_S e^{-\beta H} \ge \nu_S\, \frac{2S+1}{4\pi} \int \mathrm{Tr}_a \exp\{-\beta[a^\dagger a + \varepsilon S\Omega^z + (a + a^\dagger) N^{-1/2} S\Omega^x]\}\, d\Omega$.  (1.16)
Here $\nu_S$ is the number of ways of getting spin $S$ with $N$ spin 1/2 particles and $\mathrm{Tr}_a$ is the trace over the photon field. A similar upper bound is obtained using upper symbols. The photon field trace in the right side of (1.16) is easy to compute because it is just the partition function of a displaced oscillator, namely $(1 - e^{-\beta})^{-1} \exp\{-\beta[\varepsilon S\Omega^z - S^2 N^{-1}(\Omega^x)^2]\}$. The $\Omega$ integration can then be done by steepest descent as $N$ tends to $\infty$. Finally, the expression in (1.16) has to be maximized with respect to $S$.
Alternatively, we can get upper/lower bounds to the partition function by replacing $a^\dagger a$ and $a + a^\dagger$ by their upper/lower symbols with respect to Glauber coherent states and then taking the trace over the spin operators. Either way, the upper bounds and the lower bounds converge as $N \to \infty$ and the earlier results in Ref. 4 appear in a simple way.
2. Coulomb Systems
We are interested in computing the ground state energy, $E^Q(N, Z)$, of an atom consisting of $N$ electrons and a nucleus of charge $Z$; units in which $e = \hbar = 2m = 1$ will be used. The nucleus is assumed to be infinitely massive and fixed at the origin in $\mathbb{R}^3$. The well known non-relativistic Hamiltonian is
$H = \sum_{j=1}^{N} h_j + \sum_{1 \le i < j \le N} |x_i - x_j|^{-1}$  (2.1)
with the one-body operator $h$ given by
$h = -\Delta - Z/|x|$.  (2.2)
(What follows can also be generalized to a "relativistic" Hamiltonian in which -A is replaced by -c2 ++ m2c;.)
Our goal is to show that when $\lambda = N/Z$ is fixed (usually $\lambda = 1$, which is the neutral case) and $Z \to \infty$ then
$\lim_{Z \to \infty} E^Q(N, Z)/E^{TF}(N, Z) = 1$  (2.3)
where $E^{TF}(N, Z)$ is the Thomas-Fermi (TF) energy of an atom.
The TF energy is defined by minimizing the following functional of a nonnegative density $\rho(x)$, $x \in \mathbb{R}^3$, under the condition $\int \rho = N$.
$\mathcal{E}(\rho) = \tfrac35 (3\pi^2)^{2/3} \int \rho(x)^{5/3}\, d^3x - Z \int \rho(x)|x|^{-1}\, d^3x + D(\rho, \rho)$,  (2.4)
where D(f,g) := 1 f f f(x)g(y)lx - yI-ldaxd3y is the Coulomb repulsion functional. We denote the unique minimizer by pN z(x).
There is no space here to go into the details of TF theory, but it is a fact that (2.3) is true, as shown by Lieb and Simons using a rather involved proof. The point here is that coherent states offer a much easier route. The following ideas can be found in the review article'. A lower bound to EQ using coherent states was given
at the same time by Thirrings. We note that ETF has the form ETF = C(A)Z'13.
(2.5)
An upper bound to E2 of the form C(A)Z7/1 +O(Z2) and a lower bound to Ey of the form C(A)Z'/3 - O(Z'13-1/30) can be derived using coherent states.
The following is a very sketchy derivation of the upper and lower bounds. For details see Ref. 7. For simplicity of exposition here I shall deal with spinless electrons, which means that $3\pi^2$ must be replaced by $6\pi^2$ in (2.4).
2.1 Upper Bound to $E^Q(N,Z)$
One necessary input is the following variational principle (Ref. 9), whose proof was later simplified by Bach (Ref. 10). The one-particle reduced density matrix, $\gamma(x,y)$, $x \in \mathbf{R}^3$, $y \in \mathbf{R}^3$, of any N-fermion density matrix satisfies
$$0 \le \gamma \le 1 \quad \text{(as an operator)} \qquad (2.6a)$$
$$\mathrm{Tr}\, \gamma = N \qquad (2.6b)$$
Given any $\gamma$ satisfying (2.6) (which we call admissible), the following inequality (or variational principle) holds:
$$E^Q(N,Z) \le \mathrm{Tr}\, \gamma h + \tfrac{1}{2} \int\!\!\int |x - y|^{-1} \left[ \gamma(x,x)\gamma(y,y) - |\gamma(x,y)|^2 \right] d^3x\, d^3y. \qquad (2.7)$$
The right side of (2.7) is well known for a Hartree-Fock $\gamma$ (i.e., a $\gamma$ that is a projection); the interesting point is that it holds for all admissible $\gamma$'s.
Next, we introduce the family of coherent states parametrized by $p \in \mathbf{R}^3$, $q \in \mathbf{R}^3$,
$$f_{p,q}(x) = g(x - q) e^{i p \cdot x}, \qquad \int |g(x)|^2 d^3x = 1. \qquad (2.8)$$
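(g will not be a Gaussian as in the Glauber states; in fact it is convenient to let g have compact support.) We define our variational $\gamma$ by
$$\gamma(x,y) = (2\pi)^{-3} \int\!\!\int M(p,q)\, f_{p,q}(x)\, \overline{f_{p,q}(y)}\, d^3p\, d^3q \qquad (2.9)$$
with
$$M(p,q) = \theta\left( [6\pi^2 \rho^{TF}_{N,Z}(q)]^{2/3} - p^2 \right). \qquad (2.10)$$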
Here, $\theta$ is the step-function ($\theta(t) = 1$ if $t > 0$ and $\theta(t) = 0$ if $t < 0$). Since the function M satisfies $0 \le M(p,q) \le 1$, it follows that $\gamma$ satisfies (2.6a). Since $\int \rho^{TF}_{N,Z}(x) d^3x = N$, it follows that $\gamma$ satisfies (2.6b). This construction, (2.9), is simple and effective. It generates a useful admissible density matrix without having to construct a Hartree-Fock determinantal wave function. The construction eliminates what used to be called "The orthogonality problem (or catastrophe)". If we substitute (2.9) into (2.7) and do some computations we find that
$$E^Q \le E^{TF} + O(Z^2), \qquad (2.11)$$
as required.
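The momentum integrals behind (2.6b) and the kinetic term of (2.4) can be checked numerically. The following is a minimal sketch, assuming the phase-space measure $d^3p\, d^3q/(2\pi)^3$ and the spinless Fermi radius $[6\pi^2\rho]^{2/3}$ of (2.10); the function names and the test density are illustrative choices, not part of the original text. At a fixed point q with density $\rho$, integrating M over p should return $\rho$, and integrating $p^2 M$ should return the spinless TF kinetic energy density $\tfrac{3}{5}(6\pi^2)^{2/3}\rho^{5/3}$.

```python
import math

def fermi_momentum(rho):
    """Radius of the Fermi ball in (2.10) for spinless electrons (assumed form)."""
    return (6.0 * math.pi ** 2 * rho) ** (1.0 / 3.0)

def momentum_integral(rho, power, steps=20000):
    """Midpoint rule for (2*pi)^-3 times the integral of |p|^power over |p| < p_F."""
    p_f = fermi_momentum(rho)
    dp = p_f / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * dp
        # 4*pi*p^2 dp is the volume of a thin spherical shell in momentum space
        total += 4.0 * math.pi * p ** (2 + power) * dp
    return total / (2.0 * math.pi) ** 3

rho = 0.37  # arbitrary test value of the density at one point q
density = momentum_integral(rho, 0)   # should reproduce rho
kinetic = momentum_integral(rho, 2)   # should reproduce the TF kinetic density
tf_kinetic = 0.6 * (6.0 * math.pi ** 2) ** (2.0 / 3.0) * rho ** (5.0 / 3.0)
print(density, rho)
print(kinetic, tf_kinetic)
```

Integrating the first identity over q gives $\mathrm{Tr}\,\gamma = \int \rho^{TF}_{N,Z} = N$, which is the content of (2.6b) for this trial density matrix.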
2.2 Lower Bound to $E^Q(N,Z)$
Let $\psi(x_1, \ldots, x_N)$ be any normalized function satisfying the Pauli principle (i.e., $\psi$ is antisymmetric) and define the one-body density matrix
$$\gamma(x,y) = N \int \psi(x, x_2, \ldots, x_N)\, \overline{\psi(y, x_2, \ldots, x_N)}\, d^3x_2 \cdots d^3x_N, \qquad (2.12)$$
and the one-body density $\rho(x) := \gamma(x,x)$. A second input we shall need is the following inequality (Ref. 11), which controls the difference between the true Coulomb repulsion and the classical analogue, $D(\rho,\rho)$:
$$\Big\langle \psi \Big| \sum_{1 \le i < j \le N} |x_i - x_j|^{-1} \Big| \psi \Big\rangle \ge D(\rho,\rho) - (1.68) \int \rho^{4/3}(x)\, d^3x. \qquad (2.13)$$
By Schwarz's inequality for the Coulomb potential we have
$$D(\rho,\rho) \ge 2 D(\rho^{TF}_{N,Z}, \rho) - D(\rho^{TF}_{N,Z}, \rho^{TF}_{N,Z}). \qquad (2.14)$$
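Thus
$$\langle \psi | H | \psi \rangle \ge \mathrm{Tr}\, \gamma h - D(\rho^{TF}_{N,Z}, \rho^{TF}_{N,Z}) - 1.68 \int \rho^{4/3}(x)\, d^3x, \qquad (2.15)$$
where h is now the one-body Schroedinger operator
$$h = -\Delta - \phi^{TF}(x) \quad \text{with} \quad \phi^{TF}(x) = \frac{Z}{|x|} - |x|^{-1} * \rho^{TF}_{N,Z}.$$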
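We now use coherent states (2.8) to find a lower bound to $\mathrm{Tr}\, \gamma h$. Define $0 \le M(p,q) \le 1$ by
$$M(p,q) = \langle f_{p,q} | \gamma | f_{p,q} \rangle \le 1. \qquad (2.16)$$
We then have
$$\mathrm{Tr}[-\nabla^2 \gamma] = (2\pi)^{-3} \int\!\!\int M(p,q)\, p^2\, d^3p\, d^3q - N \int |\nabla g(x)|^2 d^3x \qquad (2.17)$$
$$\mathrm{Tr}[\phi^{TF} \gamma] = (2\pi)^{-3} \int\!\!\int M(p,q)\, \phi^{TF}(q)\, d^3p\, d^3q + \text{"controllable error"}. \qquad (2.18)$$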
The last term, "controllable error", is a bit mysterious and a bit hard to compute, but it can be evaluated and shown to be $O(Z^{7/3 - 1/30})$ when combined with the other "errors" $-N \int |\nabla g|^2$ and $(-1.68) \int \rho^{4/3}$ in (2.13). The quantity $\mathrm{Tr}\, \gamma h$ is bounded below by the sum of the negative eigenvalues of h, i.e. $-\mathrm{Tr}\, g(h)$. Here $g(t) = -t\theta(-t)$ is a convex function. If h had an upper symbol we could get a lower bound to (2.15) by using the Berezin-Lieb inequality.
The "controllable error" in (2.18) is due to the fact that the TF potential, like the Coulomb potential, does not have an upper symbol and, therefore, must be approximated by a potential that does have one.
The sum of the two main terms in (2.17) and (2.18) can now be minimized with respect to all functions M(p,q) satisfying $0 \le M(p,q) \le 1$. The minimizer is found, with the help of the TF equation, to be the same as in (2.10). If this is substituted into (2.17), (2.18) and if $D(\rho^{TF}_{N,Z}, \rho^{TF}_{N,Z})$ is subtracted from this (as in (2.15)), the result is precisely the anticipated quantity $E^{TF}(N,Z)$.
3. Wehrl's Entropy
In 1977 Wehrl (Ref. 12) proposed a classical interpretation of quantum-mechanical entropy. While it differs from both the usual "classical entropy" and from the true quantum-mechanical entropy, it remedies some deficiencies of both.
We begin with the density matrix of a system with Hamiltonian H,
$$\Gamma := e^{-\beta H}/\mathrm{Tr}\, e^{-\beta H}. \qquad (3.1)$$
Next, we define the function on phase space
$$\rho(z) := \langle z | \Gamma | z \rangle \qquad (3.2)$$
with $|z\rangle$ being a family of coherent states, as in (1.5)-(1.7). Thus, $\rho(z)$ is the lower symbol of $\Gamma$, and we note that
$$0 \le \rho(z) \le 1, \qquad (3.3)$$
$$\int \rho(z)\, dz = 1 \qquad (3.4)$$
since $\mathrm{Tr}\, \Gamma = 1$. Wehrl's entropy is given by
$$S^W(\Gamma) := -\int \rho(z) \ln \rho(z)\, dz, \qquad (3.5)$$
i.e., $S^W$ is the ordinary classical or Shannon entropy of the density $\rho$.
The entropy $S^W$ is not the classical entropy of $\Gamma$. That quantity is usually defined by attributing a classical function H(z) to H (say, the symbol defined in (1.7)) and then defining
$$\rho^{cl}(z) := \exp[-\beta H(z)] \Big/ \int \exp[-\beta H(z)]\, dz, \qquad (3.6)$$
and then setting
$$S^{cl}(\Gamma) := -\int \rho^{cl}(z) \ln \rho^{cl}(z)\, dz. \qquad (3.7)$$
The major drawback to (3.7) is that $\rho^{cl}(z)$ can easily exceed 1 for some large $\beta$. Indeed as $\beta \to \infty$, $S^{cl}(\Gamma) \to -\infty$. This is in stark contradiction with the fact that the quantum entropy
$$S^Q(\Gamma) := -\mathrm{Tr}\, \Gamma \ln \Gamma \qquad (3.8)$$
is always nonnegative. As Wehrl points out (Ref. 12), this simple observation shows the impossibility (for all $\beta$) of $S^{cl}$ tending to $S^Q$ as Planck's constant tends to zero. On the other hand, since $\rho(z)$ in (3.2) never exceeds 1, $S^W(\Gamma) \ge 0$ for all $\beta$, and thus $S^W$ behaves better than $S^{cl}$ from this perspective. Indeed, Wehrl proved that
$$S^W(\Gamma) \ge S^Q(\Gamma) \ge 0. \qquad (3.9)$$
The left side of (3.9) is the Berezin-Lieb inequality applied to the convex function $z \ln z$.
The quantum entropy $S^Q$ also has a serious defect, this time a physical one. If our Hilbert space $\mathcal{H}$ is the tensor product of two spaces, $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$, and $\Gamma$ is an operator on $\mathcal{H}$, we have the entropy $S^Q(\Gamma)$ which we shall call $S_{12}$. Using the partial trace we can also define $\Gamma_1$ by $\Gamma_1 := \mathrm{Tr}_{\mathcal{H}_2} \Gamma$ as an operator on $\mathcal{H}_1$, with corresponding entropy $S^Q(\Gamma_1) := S_1$. Likewise, $S_2$ is defined. If we did this with classical discrete densities and replaced partial traces by sums, we would have the inequalities
$$S_1 \le S_{12} \le S_1 + S_2. \qquad (3.10)$$
It turns out that the subadditivity inequality $S_{12} \le S_1 + S_2$ does hold for the quantum entropies and classical continuous entropy (in which sums become integrals) but the monotonicity, $S_1 \le S_{12}$, fails in general for quantum systems! While it does hold for classical discrete systems, monotonicity also fails for classical continuous systems (cf. Ref. 13). Thus, the universe could be in a pure state, and hence have zero entropy, while $S_1$, the entropy of Earth, is quite large.
An advantage of the Wehrl entropy is that both parts of (3.10) hold! This means that we define
$$\rho_{12}(z_1, z_2) = \langle z_1, z_2 | \Gamma | z_1, z_2 \rangle, \qquad (3.11)$$
where $|z_1, z_2\rangle$ is the ordinary tensor product of two coherent states on $\mathcal{H}_1$ and $\mathcal{H}_2$. We can then define
$$\rho_1(z_1) = \int \rho_{12}(z_1, z_2)\, dz_2 = \langle z_1 | \mathrm{Tr}_{\mathcal{H}_2} \Gamma | z_1 \rangle, \qquad (3.12)$$
noting that the two possible definitions of $\rho_1$ are, in fact, the same. In addition to (3.10) the Wehrl entropy also satisfies all the other nice properties of entropy such as concavity in $\Gamma$ and strong subadditivity (Ref. 13).
Returning now to (3.9) we can ask for the minimum (with respect to all $\Gamma$'s) of the value of $S^W(\Gamma)$. By concavity, it is easy to prove that a minimizing $\Gamma$ must be a pure state, i.e., $\Gamma = |\phi\rangle\langle\phi|$ for some normalized vector $|\phi\rangle$ in the Hilbert space. In case that $\mathcal{H} = L^2(\mathbf{R}^n)$ and the coherent states are the Glauber coherent states (i.e. (2.8) with $g(x) = \exp(-x^2)$), Wehrl conjectured that the minimizing $|\phi\rangle$ must itself be a coherent state, i.e., $|\phi\rangle$ is an $f_{p,q}$ in (2.8). Any one will do. An easy computation would then show
$$\min_{\Gamma} S^W(\Gamma) = n. \qquad (3.13)$$
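For one degree of freedom the value of the Wehrl entropy at a coherent state can be checked numerically. The sketch below assumes a normalized Gaussian $g(x) = \pi^{-1/4} e^{-x^2/2}$ and the measure $dp\, dq/(2\pi)$, so that the lower symbol of a coherent state is $\rho = e^{-u}$ with $u = (p^2 + q^2)/2$ and the measure becomes $du$ for rotationally symmetric densities; these normalizations and all names are illustrative assumptions. For comparison it also evaluates the first excited oscillator state, whose lower symbol is $u e^{-u}$ and whose entropy works out to $1 + \gamma_E$ with $\gamma_E$ Euler's constant.

```python
import math

def wehrl_entropy(density, u_max=60.0, steps=200000):
    """Midpoint rule for -int_0^inf rho(u) ln rho(u) du, where u = (p^2 + q^2)/2
    and dp dq / (2 pi) reduces to du for a rotationally symmetric density."""
    du = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        rho = density(u)
        if rho > 0.0:
            total -= rho * math.log(rho) * du
    return total

def coherent(u):
    """Lower symbol of a Glauber coherent state (assumed normalization)."""
    return math.exp(-u)

def first_excited(u):
    """Lower symbol of the first excited oscillator state."""
    return u * math.exp(-u)

s_coherent = wehrl_entropy(coherent)
s_excited = wehrl_entropy(first_excited)
print(s_coherent, s_excited)
```

The coherent state gives the value 1 and the excited state gives a strictly larger value, which is consistent with the conjectured minimizer but of course proves nothing by itself.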
This conjecture was proved (Ref. 14), but the strange fact was that two deep theorems in harmonic analysis had to be used: the sharp constant in the Hausdorff-Young inequality and the sharp constant in Young's inequality. In view of the Heisenberg group lying behind Glauber coherent states (which are minimal weight vectors), it is tempting to suppose that a much simpler proof, perhaps group theoretical, of Wehrl's conjecture is possible. This is an interesting open mathematical problem.
Another interesting mathematical problem concerns the obvious analog of Wehrl's conjecture, made in Ref. 14, for the spin S Bloch coherent spin states $|\Omega\rangle$ used in Sect. 1. If, as the conjecture states, the minimum Wehrl entropy occurs when $|\phi\rangle$ is an $|\Omega\rangle$, the entropy, which is independent of $|\Omega\rangle$ and easy to calculate, is
$$\min_{\Gamma} S^W(\Gamma) = \frac{2S}{2S+1}. \qquad (3.14)$$
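The value in (3.14) can be reproduced by direct quadrature. This is a minimal sketch, assuming the usual normalization of Bloch coherent states in which the measure is $(2S+1)\, d\Omega/(4\pi)$ and the lower symbol of a coherent state pointing at the north pole is $\rho(\theta) = \cos^{4S}(\theta/2)$; with $u = (1 + \cos\theta)/2$ this becomes $\rho = u^{2S}$ and the measure becomes $(2S+1)\, du$ on $[0,1]$. The function name is hypothetical.

```python
import math

def spin_coherent_wehrl_entropy(spin, steps=100000):
    """Midpoint rule for -(2S+1) * int_0^1 u^(2S) ln(u^(2S)) du."""
    power = 2.0 * spin
    du = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        rho = u ** power          # lower symbol of the coherent state
        total -= rho * math.log(rho) * du
    return (2.0 * spin + 1.0) * total

results = {s: spin_coherent_wehrl_entropy(s) for s in (0.5, 1.0, 1.5, 5.0)}
for s, value in results.items():
    print(s, value, 2.0 * s / (2.0 * s + 1.0))
```

The closed form follows from $\int_0^1 u^{m} \ln u\, du = -1/(m+1)^2$ with $m = 2S$.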
For S = 1/2 the proof is trivial since all vectors in $\mathbf{C}^2$ are coherent states. But no proof exists for any other S value, even though many attempts have been made to find one. It would be very nice if someone could solve this 15 year old problem! Clearly, we do not know everything there is to be known about SU(2).
A final remark about the Wehrl conjecture for Glauber states is its generalization (Ref. 14) in $\mathbf{R}^n$:
$$(2\pi)^{-n} \int\!\!\int |\langle \phi | f_{p,q} \rangle|^{2r}\, d^np\, d^nq \le r^{-n} \qquad (3.15)$$
for all $|\phi\rangle$ satisfying $\langle \phi | \phi \rangle = 1$ and all $r \ge 1$. Since (3.15) is always an equality when r = 1, we can deduce (3.13) from (3.15) by differentiating (3.15) at r = 1. Further generalizations, related to radar signal analysis, wavelets and Wigner distribution functions, were also obtained (Ref. 15). Among them there is the following. Let the $f_{p,q}$ in (3.15) be given by (2.8) but with an arbitrary normalized $g(x - q)$, so that the left side of (3.15) now involves two arbitrary functions $\phi$ and g. The inequality (3.15) remains true!
References
1. F.A. Berezin, Covariant and contravariant symbols of operators, Izv. Akad. SSSR Ser. Mat. 6 (1972) 1134-1167.
2. E.H. Lieb, The classical limit of quantum spin systems, Commun. Math. Phys. 31 (1973) 327-340.
3. E.H. Lieb and J.P. Solovej, Quantum coherent operators: A generalization of coherent states, Lett. Math. Phys. 22 (1991) 145-154.
4. K. Hepp and E.H. Lieb, On the superradiant phase transition for molecules in a quantized radiation field, Ann. of Phys. (NY) 76 (1973) 360-404.
5. K. Hepp and E.H. Lieb, The equilibrium statistical mechanics of matter interacting with the quantized radiation field, Phys. Rev. A8 (1973) 2517-2525.
6. E.H. Lieb and B. Simon, Thomas-Fermi theory of atoms, molecules and solids, Adv. in Math. 23 (1977) 22-116.
7. E.H. Lieb, Thomas-Fermi and related theories of atoms and molecules, Rev. Mod. Phys. 53 (1981) 603-641; errata 54 (1981) 311. See Sect. V.
8. W. Thirring, A lower bound with the best possible constants for Coulomb Hamiltonians, Commun. Math. Phys. 79 (1981) 1-7.
9. E.H. Lieb, A variational principle for many-fermion systems, Phys. Rev. Lett. 46 (1981) 457-459; errata 47 (1981) 69.
10. V. Bach, Error bounds for the Hartree-Fock energy of atoms and molecules, Commun. Math. Phys. 147 (1992) 527-548.
11. E.H. Lieb and S. Oxford, An improved lower bound on the indirect Coulomb energy, Int. J. Quant. Chem. 19 (1981) 427-439.
12. A. Wehrl, On the relation between classical and quantum-mechanical entropy, Rept. Math. Phys. 16 (1979) 353-358.
13. E.H. Lieb, Some convexity and subadditivity properties of entropy, Bull. Amer. Math. Soc. 81 (1975) 1-13.
14. E.H. Lieb, Proof of an entropy conjecture of Wehrl, Commun. Math. Phys. 62 (1978) 35-41.
15. E.H. Lieb, Integral bounds for radar ambiguity functions and Wigner distributions, J. Math. Phys. 31 (1990) 594-599.
Part V
Brunn-Minkowski Inequality and Rearrangements
With H.J. Brascamp and J.M. Luttinger in J. Funct. Anal. 17, 227-237 (1975)
Reprinted from JOURNAL OF FUNCTIONAL ANALYSIS
All Rights Reserved by Academic Press, New York and London
Vol. 17, No. 2, October 1974 Printed in Belgium
A General Rearrangement Inequality for Multiple Integrals
H. J. BRASCAMP*
The Institute for Advanced Study, Princeton, New Jersey 08540
ELLIOTT H. LIEB†
Departments of Mathematics and Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
AND
J. M. LUTTINGER‡
Department of Physics, Columbia University, New York, New York 10027
Communicated by Irving Segal
Received March 21, 1974
* Work partially supported by National Science Foundation Grant GP-16147 A#1.
† Work partially supported by National Science Foundation Grant GP-31674 X.
‡ Work partially supported by a grant from the National Science Foundation.
In this paper we prove a rearrangement inequality that generalizes inequalities given in the book by Hardy, Littlewood and Pólya [1] and by Luttinger and Friedberg [2]. The inequality for an integral of a product of functions of one variable is further extended to the case of functions of several variables.
I. INTRODUCTION
Rearrangement inequalities were studied by Hardy, Littlewood and Pólya in the last chapter of their book "Inequalities." Let us start by recapitulating the definition of the symmetric decreasing rearrangement of a function, and the integral inequalities following from that definition. Our new results are contained in Theorems 1.2 and 3.4. In the following, measure always means Lebesgue measure and is denoted by $\mu$.
DEFINITION 1.1. Let f be a nonnegative measurable function on $\mathbf{R}$, let $K_y^f = \{x \mid f(x) > y\}$ and let $M_y^f = \mu(K_y^f)$. Assume that $M_a^f < \infty$ for some $a < \infty$. If $f^*$ is another function on $\mathbf{R}$ with the same properties as f and, additionally,
(a) $f^*(x) = f^*(-x)$, $\forall x$,
(b) $0 < x_1 < x_2 \Rightarrow f^*(x_2) \le f^*(x_1)$,
(c) $M_y^{f^*} = M_y^f$, $\forall y > 0$,
then $f^*$ is called a symmetric decreasing rearrangement of f.
Remarks. (1) If g and h are two symmetric decreasing rearrangements of f, then g(x) = h(x) a.e.
(2) If $\chi$ is the characteristic function of a measurable set, we can define $\chi^*(x) = 1$ if $2|x| \le \int \chi$ and $\chi^*(x) = 0$, otherwise. For a general function f, define $\chi_y(x) = 1$ if $f(x) > y$ and $\chi_y(x) = 0$, otherwise. Then $f(x) = \int_0^\infty dy\, \chi_y(x)$, and
$$f^*(x) = \int_0^\infty dy\, \chi_y^*(x)$$
is a symmetric decreasing rearrangement of f. The fact that $M_a^f < \infty$ implies that $f^*(x) < \infty$, $\forall x \ne 0$.
(3) In the following theorems we shall always be dealing with integrals. Consequently, by remark (1), $f^*$ is unique for our purposes.
Trivially, $f \in L^1(\mathbf{R})$ iff $f^* \in L^1(\mathbf{R})$ and $\int f = \int f^*$. The inequalities to be found in [1] are
$$\int_{\mathbf{R}} dx\, f(x) g(x) \le \int_{\mathbf{R}} dx\, f^*(x) g^*(x);$$
$$\int_{\mathbf{R}^2} dx_1 dx_2\, f(x_1) g(x_2) h(x_1 - x_2) \le \int_{\mathbf{R}^2} dx_1 dx_2\, f^*(x_1) g^*(x_2) h^*(x_1 - x_2),$$
the latter being due to Riesz [3].
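The first of these inequalities has a simple discrete analogue that can be tried out directly. The sketch below is an illustration under stated assumptions, not the construction of [1]: a function is replaced by its samples on a grid centred at the origin, and `symmetric_decreasing` (a hypothetical helper) places the samples so that they decrease with distance from the centre, which preserves the multiset of values and is symmetric only up to grid effects. Because f and g are rearranged with the same placement order, the comparison reduces to the classical rearrangement inequality for finite sequences.

```python
import random

def symmetric_decreasing(values):
    """Place the samples so they decrease with distance from the grid centre.
    Discrete stand-in for f*: the multiset of values is preserved."""
    ordered = sorted(values, reverse=True)
    centre = len(values) // 2
    # grid positions listed in order of increasing distance from the centre
    positions = sorted(range(len(values)), key=lambda i: (abs(i - centre), i))
    result = [0.0] * len(values)
    for position, value in zip(positions, ordered):
        result[position] = value
    return result

def integral_of_product(f, g, dx=1.0):
    """Riemann sum for the integral of f(x) g(x)."""
    return dx * sum(a * b for a, b in zip(f, g))

random.seed(0)
f = [random.random() for _ in range(101)]
g = [random.random() ** 2 for _ in range(101)]
f_star = symmetric_decreasing(f)
g_star = symmetric_decreasing(g)
lhs = integral_of_product(f, g)
rhs = integral_of_product(f_star, g_star)
print(lhs, rhs)
```

The Riesz inequality with three functions is not tested here, since its discrete form requires more care with the placement of equal-distance grid points.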
A generalization due to Luttinger and Friedberg [2] reads
$$\int_{\mathbf{R}^n} d^nx \prod_{j=1}^{n} f_j(x_j)\, h_j(x_j - x_{j+1}) \le \int_{\mathbf{R}^n} d^nx \prod_{j=1}^{n} f_j^*(x_j)\, h_j^*(x_j - x_{j+1}),$$
where $x_{n+1} \equiv x_1$. This formula was derived for the purpose of physical applications (inequalities for Green's functions, Luttinger [4]).
In the present paper we give a further generalization, one which was already conjectured in [2].
THEOREM 1.2. Let $f_j$, $1 \le j \le k$, be nonnegative measurable functions on $\mathbf{R}$, and let $a_{jm}$, $1 \le j \le k$, $1 \le m \le n$, be real numbers. Then
$$\int_{\mathbf{R}^n} d^nx \prod_{j=1}^{k} f_j\Big( \sum_{m=1}^{n} a_{jm} x_m \Big) \le \int_{\mathbf{R}^n} d^nx \prod_{j=1}^{k} f_j^*\Big( \sum_{m=1}^{n} a_{jm} x_m \Big).$$
Remark. Theorem 1.2 is nontrivial only for k > n. If k < n, both integrals diverge. If k = n and $\det |a_{jm}| = 0$, both integrals diverge. If k = n and $\det |a_{jm}| \ne 0$, equality holds (change variables to $y_j = \sum_{m=1}^{n} a_{jm} x_m$ and then use the fact that $\int f_j = \int f_j^*$).
A proof of Theorem 1.2 is given in Section 2. An important tool is Brunn's part of the Brunn-Minkowski theorem, which we recall here (see e.g., [5] Section 11.48). Note that every convex set in $\mathbf{R}^n$ is measurable.
LEMMA 1.3. Let C be a convex set in $\mathbf{R}^{n+1}$, let $p \in \mathbf{R}^{n+1}$, and let V(t) be the family of planes $\langle x, p \rangle = t$, $-\infty < t < \infty$. Let S(t) be the n-dimensional volume of the convex set $V(t) \cap C$. Then $S(t)^{1/n}$ is a concave function of t in the interval where S(t) > 0.
COROLLARY 1.4. Let C, p and S(t) be as in Lemma 1.3 and, in addition, let C be balanced (i.e., $x \in C \Rightarrow -x \in C$). Then $S(t) = S(-t)$ and $S(t_2) \le S(t_1)$ for $t_2 > t_1 > 0$.
In Section 3 we generalize Theorem 1.2 to the Schwarz symmetrization (Definition 3.3) of functions of several variables. An auxiliary lemma that we need for this purpose is given in the Appendix.
II. PROOF OF THEOREM 1.2
Although in general $f \to f^*$ is not linear, by Remark (2) following Definition 1.1 it is sufficient to assume that each $f_j$ is the characteristic function of some measurable set. By standard approximation arguments we may assume this set to be a finite union of disjoint compact intervals (cf. [1], Section 10.14). We start by assuming that each $f_j$ is the characteristic function of one interval.
LEMMA 2.1. Let $f_j$, $1 \le j \le k$, be the characteristic functions of the intervals
$$b_j - c_j \le x \le b_j + c_j,$$
and define $f_j(x \mid t) = f_j(x + b_j t)$. Then
$$I(t) = \int_{\mathbf{R}^n} d^nx \prod_{j=1}^{k} f_j\Big( \sum_{m=1}^{n} a_{jm} x_m \,\Big|\, t \Big)$$
is a nondecreasing function of $t \in [0, 1]$.
Remark. Note that $f_j(x \mid 0) = f_j(x)$ and $f_j(x \mid 1) = f_j^*(x)$, so Lemma 2.1 includes a special case of Theorem 1.2.
Proof of Lemma 2.1. I(t) is the volume of the intersection of the k strips
$$S_j^t = \Big\{ x \in \mathbf{R}^n \,\Big|\, b_j(1 - t) - c_j \le \sum_{m=1}^{n} a_{jm} x_m \le b_j(1 - t) + c_j \Big\}.$$
In $\mathbf{R}^{n+1}$, consider the set
$$C = \bigcap_{j=1}^{k} \Big\{ x \in \mathbf{R}^{n+1} \,\Big|\, -c_j \le \sum_{m=1}^{n} a_{jm} x_m - b_j x_{n+1} \le c_j \Big\}.$$
I(t) is the volume of the intersection of C with the plane $x_{n+1} = 1 - t$. Since C is convex and balanced, I(t) is nondecreasing for $t \in [0, 1]$ by Corollary 1.4. Q.E.D.
We now conclude the proof of Theorem 1.2 with the following lemma.
LEMMA 2.2. Theorem 1.2 holds under the restriction that each $f_j$ is the characteristic function of a finite union of disjoint compact intervals.
Proof. Let $f_j$ be the characteristic function of $n_j$ intervals. We prove the lemma by induction on $N = (n_1, n_2, \ldots, n_k)$, with fixed k. We say that $M \prec N$ if $m_j \le n_j$, $1 \le j \le k$, and $m_i < n_i$ for some i.
Lemma 2.2 is true for $N = (1, 1, \ldots, 1)$ by Lemma 2.1. Now assume that Lemma 2.2 is true for all $M \prec N$. Let $f_j(x)$ be the characteristic function of
$$\bigcup_{p=1}^{n_j} \{ x \in \mathbf{R} \mid b_{jp} - c_{jp} \le x \le b_{jp} + c_{jp} \},$$
with $b_{jp} + c_{jp} < b_{j,p+1} - c_{j,p+1}$, and let $f_j(x \mid t)$ be the characteristic function of
$$\bigcup_{p=1}^{n_j} \{ x \in \mathbf{R} \mid b_{jp}(1 - t) - c_{jp} \le x \le b_{jp}(1 - t) + c_{jp} \}$$
for $0 \le t \le \tau$, where
$$\tau = \min_{j,p} \left[ 1 - (b_{j,p+1} - b_{jp})^{-1} (c_{j,p+1} + c_{jp}) \right] > 0.$$
For $0 \le t < \tau$, the intervals belonging to each function $f_j$ remain disjoint; at $t = \tau$ at least two intervals coalesce for some j. Since each $f_j$ is a positive sum of characteristic functions of single intervals of the type stated in the hypothesis of Lemma 2.1, we can apply that lemma interval by interval and find
$$\int_{\mathbf{R}^n} d^nx \prod_{j=1}^{k} f_j\Big( \sum_{m=1}^{n} a_{jm} x_m \Big) \le \int_{\mathbf{R}^n} d^nx \prod_{j=1}^{k} f_j\Big( \sum_{m=1}^{n} a_{jm} x_m \,\Big|\, \tau \Big).$$
At $t = \tau$, the family of functions $\{ f_j(x \mid \tau) \}$ satisfies the hypothesis of Lemma 2.2, except that N has been reduced to some $M \prec N$. Therefore, by assumption,
$$\int_{\mathbf{R}^n} d^nx \prod_{j=1}^{k} f_j\Big( \sum_{m=1}^{n} a_{jm} x_m \,\Big|\, \tau \Big) \le \int_{\mathbf{R}^n} d^nx \prod_{j=1}^{k} f_j^*\Big( \sum_{m=1}^{n} a_{jm} x_m \Big)$$
because $f_j(\cdot \mid \tau)$ and $f_j$ have the same symmetric decreasing rearrangement. This proves Lemma 2.2 and at the same time Theorem 1.2.
III. GENERALIZATION TO FUNCTIONS OF SEVERAL VARIABLES
In this section we indicate how to generalize Theorem 1.2 to functions of several variables (Lemma 3.2 and Theorem 3.4). The intuitive idea was given in [4], p. 1450.
Let f be a nonnegative, measurable function on $\mathbf{R}^p$, and let V be a (p - 1)-dimensional plane through the origin of $\mathbf{R}^p$. Choose an orthogonal coordinate system in $\mathbf{R}^p$ such that the $x^1$-axis is perpendicular to V.
DEFINITION 3.1. A nonnegative, measurable function $f^*(x \mid V)$ on $\mathbf{R}^p$ is called a Steiner symmetrization with respect to V of the function f(x), if $f^*(x^1, x^2, \ldots, x^p \mid V)$ is a symmetric decreasing rearrangement with respect to $x^1$ of $f(x^1, x^2, \ldots, x^p)$ for each fixed $x^2, \ldots, x^p$.
Remark. The notion of Steiner symmetrization is usually reserved for sets; for any y > 0, the set $\{x \in \mathbf{R}^p \mid f^*(x \mid V) > y\}$ is a Steiner symmetrization with respect to V of the set $\{x \in \mathbf{R}^p \mid f(x) > y\}$ (see e.g., Pólya and Szegő [6], Note A).
LEMMA 3.2. Let $f_j(x)$, $1 \le j \le k$, be nonnegative measurable functions on $\mathbf{R}^p$, let $a_{jm}$, $1 \le j \le k$, $1 \le m \le n$, be real numbers, and let V be any plane through the origin of $\mathbf{R}^p$. Then
$$\int_{\mathbf{R}^{np}} d^{np}x \prod_{j=1}^{k} f_j\Big( \sum_{m=1}^{n} a_{jm} x_m \Big) \le \int_{\mathbf{R}^{np}} d^{np}x \prod_{j=1}^{k} f_j^*\Big( \sum_{m=1}^{n} a_{jm} x_m \,\Big|\, V \Big),$$
where $\mathbf{R}^{np} \ni x = (x_1, \ldots, x_n)$, $x_m \in \mathbf{R}^p$.
Proof. Choose appropriate orthogonal coordinates in $\mathbf{R}^p \ni x = (x^1, \ldots, x^p)$ as above, so that the $x^1$-axis is orthogonal to V. Then, by Theorem 1.2, the inequality already holds for the integration over $x_1^1, x_2^1, \ldots, x_n^1$ for any fixed $x_m^q$, $1 \le m \le n$, $2 \le q \le p$. Q.E.D.
DEFINITION 3.3. Let f be a nonnegative measurable function on $\mathbf{R}^p$, let $K_y^f = \{x \mid f(x) > y\}$ and let $M_y^f = \mu(K_y^f)$. Assume that $M_a^f < \infty$ for some $a < \infty$. If $f^{**}$ is another function on $\mathbf{R}^p$ with the same properties as f and, additionally,
(a) $f^{**}(x_1) = f^{**}(x_2)$ when $|x_1| = |x_2|$,
(b) $0 < |x_1| < |x_2| \Rightarrow f^{**}(x_2) \le f^{**}(x_1)$,
(c) $M_y^{f^{**}} = M_y^f$, $\forall y > 0$,
then $f^{**}$ is called a Schwarz symmetrization of f.
Remarks. (1) The remarks after Definition 1.1 apply, mutatis mutandis, to Schwarz symmetrization.
(2) The notion of Schwarz symmetrization is usually reserved for sets; the set in $\mathbf{R}^{p+1}$ under the graph of $y = f^{**}(x)$ is the Schwarz symmetrization with respect to the y-axis of the set under the graph of $y = f(x)$ (see [5], Note A).
It is intuitively clear that the Schwarz symmetrization can be obtained as the $L^1(\mathbf{R}^p)$ limit of a sequence of Steiner symmetrizations with respect to different planes. That fact will be proved in the Appendix for the characteristic function of a bounded measurable set (Lemma A.1). For the moment we use it, together with Lemma 3.2 and the remarks at the beginning of Section 2, to conclude our main theorem, which is the following.
THEOREM 3.4. Under the assumptions of Lemma 3.2,
$$\int_{\mathbf{R}^{np}} d^{np}x \prod_{j=1}^{k} f_j\Big( \sum_{m=1}^{n} a_{jm} x_m \Big) \le \int_{\mathbf{R}^{np}} d^{np}x \prod_{j=1}^{k} f_j^{**}\Big( \sum_{m=1}^{n} a_{jm} x_m \Big).$$
APPENDIX
We give the lemma that suffices to establish Theorem 3.4. For two sets A and B, $A \,\Delta\, B = (A \cup B) \setminus (A \cap B)$. $\mu_p$ denotes Lebesgue measure in $\mathbf{R}^p$.
LEMMA A.1. Let K be a bounded measurable set in $\mathbf{R}^p$, and let S be the ball centered at the origin with $\mu_p(S) = \mu_p(K)$. Then there exists a sequence of sets $K_n$, where $K_0 = K$ and where $K_{n+1}$ is obtained from $K_n$ by Steiner symmetrization with respect to some (p - 1)-dimensional subspace of $\mathbf{R}^p$, such that
$$\lim_{n \to \infty} \mu_p(K_n \,\Delta\, S) = 0.$$
Remark. There exist various theorems stating the convergence of $K_n$ to S in the Hausdorff metric ([7], Section 21 for compact convex sets; [8], Section 4.5.3 and [9], Section 2.10.31 for general compact sets).
Let us first give a precise definition of the Steiner symmetrization for arbitrary measurable sets (cf. the Remark following Definition 3.1).
DEFINITION A.2. Let K be a bounded measurable set in $\mathbf{R}^p$, and let V be a (p - 1)-dimensional subspace of $\mathbf{R}^p$. Then the set $K_V^*$ is called a Steiner symmetrization of K with respect to V, if, for every straight line L perpendicular to V with $K \cap L$ measurable in $\mathbf{R}$, $K_V^* \cap L$ is a segment (open or closed) with center in V and $\mu_1(K_V^* \cap L) = \mu_1(K \cap L)$.
Remarks. (1) Let K be open (resp. closed) and take for $K_V^* \cap L$ in Definition A.2 the open (resp. closed) segments. Then $K_V^*$ is open (resp. closed).
To prove this, choose coordinates $x = (x^1, \ldots, x^p) \in \mathbf{R}^p$ with $x^1$ in the direction orthogonal to V. Let $\chi_K$ be the characteristic function of K. Then the statement is true if the function $\mathbf{R}^{p-1} \ni y \mapsto \int dx^1\, \chi_K(x^1, y)$ is lower (resp. upper) semicontinuous. But this follows from the fact that $\chi_K(x^1, y)$ is lower (resp. upper) semicontinuous in $\mathbf{R}^p$.
(2) For arbitrary measurable K, all Steiner symmetrizations are measurable and satisfy (Fubini's theorem)
$$\mu_p(K_V^*) = \mu_p(K).$$
Two Steiner symmetrizations can only differ by a set of measure zero. All this is readily seen by sandwiching K between closed sets from within and open sets from without.
(3) If K and M are measurable sets, Lemma 3.2 gives that
$$\mu_p(K_V^* \cap M_V^*) \ge \mu_p(K \cap M),$$
and therefore
$$\mu_p(K_V^* \,\Delta\, M_V^*) \le \mu_p(K \,\Delta\, M).$$
In particular, if K and M differ only by a set of measure zero, so do $K_V^*$ and $M_V^*$.
(4) In view of Remarks 2 and 3, we shall further speak of the Steiner symmetrization of a measurable set, which in fact associates with each equivalence class of measurable sets a unique equivalence class of measurable sets.
PROPOSITION A.3.
Let K and S be as in Lemma A.1. Then
$$\mu_p(K_V^* \,\Delta\, S) \le \mu_p(K \,\Delta\, S)$$
and the equality holds for all subspaces V iff K = S.
Proof. The $\le$ inequality holds by Remark 3 above, since $S_V^* = S$.
Denote by L(v) the straight line perpendicular to V through $v \in V$. Let $K(v) = K \cap L(v)$, and let $\pi_V(K)$ be the projection of K on V, $\pi_V(K) = \{ v \in V \mid \mu_1(K(v)) > 0 \}$.
Now let $K \ne S$ so that $\mu_p(K \setminus S) = \mu_p(S \setminus K) > 0$. It can be shown by a tedious but trivial argument that there exists a subspace V such that $P = \pi_V(K \setminus S) \cap \pi_V(S \setminus K)$ has positive $\mu_{p-1}$ measure. If $v \in P$, neither $K(v) \subset S(v)$ nor $S(v) \subset K(v)$; therefore
$$\mu_1(K_V^*(v) \,\Delta\, S(v)) = |\mu_1(K(v)) - \mu_1(S(v))| < \mu_1(K(v) \,\Delta\, S(v))$$
for all $v \in P$. Because, generally, for all $v \in V$
$$\mu_1(K_V^*(v) \,\Delta\, S(v)) \le \mu_1(K(v) \,\Delta\, S(v)),$$
we have for the particular subspace V under consideration
$$\mu_p(K_V^* \,\Delta\, S) < \mu_p(K \,\Delta\, S).$$
This proves Proposition A.3.
Let us now specify the sequence of sets in Lemma A.1. Given $K_n$, choose a subspace $V_1$ such that
$$\mu_p((K_n)_{V_1}^* \,\Delta\, S) < \inf_V \mu_p((K_n)_V^* \,\Delta\, S) + n^{-1}.$$
Then construct $K_{n+1}$ from $K_n$ by p consecutive Steiner symmetrizations with respect to a set of (p - 1)-dimensional subspaces $V_1, V_2, \ldots, V_p$ (beginning with $V_1$ specified above) whose orthogonal complements are pairwise orthogonal. In that way,
$$\mu_p(K_{n+1} \,\Delta\, S) < \mu_p((K_n)_W^* \,\Delta\, S) + n^{-1}$$
for all n and for all subspaces W.
PROPOSITION A.4.
There exist a subsequence $K_{n_k}$ and a measurable set M such that
$$\lim_{k \to \infty} \mu_p(K_{n_k} \,\Delta\, M) = 0.$$
Proof. Express a point $x \in \mathbf{R}^p$ in coordinates $(x^1, x^2, \ldots, x^p)$ corresponding to the planes used to construct $K_n$. Then, it is not difficult to show that for n > 0 (i.e., after the first set of p orthogonal symmetrizations), $x \in K_n$ implies $y \in K_n$ if $|y^m| \le |x^m|$, $m = 1, \ldots, p$. Therefore, if $\chi_n$ is the characteristic function of $K_n$,
$$\int dx^m\, |\chi_n(x^1, \ldots, x^m + y^m, \ldots, x^p) - \chi_n(x^1, \ldots, x^m, \ldots, x^p)| \le 2|y^m|.$$
Note that by assumption K is contained in some ball B of radius R centered at the origin; then also $K_n \subset B$. This implies that
$$\int_{\mathbf{R}^p} d^px\, |\chi_n(x + y) - \chi_n(x)| \le 2(2R)^{p-1} \sum_{m=1}^{p} |y^m|.$$
In other words
$$\lim_{y \to 0} \int_{\mathbf{R}^p} d^px\, |\chi_n(x + y) - \chi_n(x)| = 0$$
uniformly in n. Hence the family of functions $\{\chi_n\}$ is conditionally compact in $L^1(\mathbf{R}^p)$ (Dunford and Schwartz [10], Theorem IV.8.21). Q.E.D.
Propositions A.3 and A.4 immediately give the following.
COROLLARY A.5. $\mu_p(K_n \,\Delta\, S)$ decreases monotonically to $\mu_p(M \,\Delta\, S)$.
Let us now conclude the proof of Lemma A.1. Assume that $M \ne S$; we shall show that this leads to a contradiction.
Let $\mu_p(M \,\Delta\, S) = \delta > 0$. Then there exist a (p - 1)-dimensional subspace W and an $\epsilon > 0$ such that $\mu_p(M_W^* \,\Delta\, S) = \delta - \epsilon$, by Proposition A.3. Also
$$\lim_{k \to \infty} \mu_p((K_{n_k})_W^* \,\Delta\, M_W^*) = 0, \qquad \lim_{k \to \infty} \mu_p((K_{n_k})_W^* \,\Delta\, S) = \delta - \epsilon.$$
Then there exists an $n_k$ such that $n_k > 2\epsilon^{-1}$ and $\mu_p((K_{n_k})_W^* \,\Delta\, S) < \delta - \epsilon/2$.
But by the construction of the sequence $K_n$,
$$\mu_p(K_{n_k+1} \,\Delta\, S) < \mu_p((K_{n_k})_W^* \,\Delta\, S) + n_k^{-1} < \delta,$$
which contradicts Corollary A.5.
Thus we find that M = S; then by Corollary A.5, $\mu_p(K_n \,\Delta\, S)$ decreases monotonically to zero. This proves Lemma A.1.
REFERENCES
1. G. H. HARDY, J. E. LITTLEWOOD, AND G. PÓLYA, Inequalities, Cambridge University Press, London and New York (1952).
2. J. M. LUTTINGER AND R. FRIEDBERG, Preprint, A New Rearrangement Inequality for Multiple Integrals (1973).
3. F. RIESZ, Sur une Inégalité Intégrale, J. L.M.S. 5 (1930), 162-168.
4. J. M. LUTTINGER, Generalized isoperimetric inequalities, J. Math. Phys. 14 (1973), 586-593, 1444-1447, 1448-1450.
5. T. BONNESEN AND W. FENCHEL, Theorie der Konvexen Körper, Chelsea, New York (1948).
6. G. PÓLYA AND G. SZEGŐ, Isoperimetric Inequalities in Mathematical Physics, Princeton Univ. Press, Princeton (1951).
7. W. BLASCHKE, Kreis und Kugel, Veit und Comp., Leipzig (1916).
8. H. HADWIGER, Vorlesungen über Inhalt, Oberfläche und Isoperimetrie, Springer, Berlin-Göttingen-Heidelberg (1957).
9. H. FEDERER, Geometric Measure Theory, Springer, New York (1969).
10. N. DUNFORD AND J. T. SCHWARTZ, Linear Operators, Part I, Interscience, New York and London (1958).
With H.J. Brascamp in Functional Integration and Its Applications, A.M. Arthurs, ed.
Some inequalities for Gaussian measures and the long-range order of the one-dimensional plasma H. J. BRASCAMP AND E. H. LIEB
1.1. Introduction
THE following is a preliminary report on some recent work, the full details of which will be published elsewhere. We have come across some inequalities about integrals and moments of log concave functions which hold in the multidimensional case and which are useful in obtaining estimates for multidimensional modified Gaussian measures. By making a small jump (we shall not go into the technical details) from the finite to the infinite dimensional case, upper and lower bounds to certain types of functional integrals can be obtained. As a non-trivial application of the latter we shall, for the first time, prove that the one-dimensional one-component quantum-mechanical plasma has long-range order when the interaction is strong enough. In other words, the Wigner lattice can exist, in one dimension at least. As another application we shall prove a log concavity theorem about the fundamental solution (Green's function) of the diffusion equation.
1.2. Basic concavity theorem
We begin with a theorem (Theorem 1.1) which, to the best of our knowledge, is new and which constitutes the basis of all our other inequalities.
DEFINITION 1.1. A function F from $\mathbf{R}^n$ to $\mathbf{R}$ is a log concave function if $F(x) \ge 0$, $\forall x \in \mathbf{R}^n$, and $F(x)^{\lambda} F(y)^{1-\lambda} \le F[\lambda x + (1 - \lambda) y]$, $\forall x, y \in \mathbf{R}^n$ and $\lambda \in (0, 1)$. If the inequality is reversed, we say that F is log convex. We shall sometimes write $F(x) = e^{f(x)}$ and f is concave, but it then is understood that f can take on the value $-\infty$. We say that F is even if $F(x) = F(-x)$, $\forall x$. Two important examples of log concave functions are:
(a) $F(x) = \exp[-(x, Ax)]$, where A is any symmetric real positive-semidefinite quadratic form on $\mathbf{R}^n$.
(b) Let C be any convex set in $\mathbf{R}^n$ and let $\chi_C(x) = 1$ for $x \in C$, $\chi_C(x) = 0$ for $x \notin C$ be the characteristic function of C. Then $\chi_C$ is a log concave function. $\chi_C$ is even if and only if C is balanced, i.e. $x \in C \Rightarrow -x \in C$.
THEOREM 1.1. Let F be a log concave function on $\mathbf{R}^{m+n}$ and $F: (x, y) \mapsto F(x, y)$ for $x \in \mathbf{R}^m$, $y \in \mathbf{R}^n$. Then $G(x) = \int_{\mathbf{R}^n} F(x, y)\, dy$ is a log concave function on $\mathbf{R}^m$.
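Before the proof, the statement of Theorem 1.1 can be illustrated numerically in the simplest case m = n = 1. The sketch below uses an assumed test function $F(x, y) = \exp[-(x^2 - xy + y^2) - y^4]$, which is jointly log concave because the quadratic form is positive definite and $y^4$ is convex; the grid sizes and names are illustrative choices. It approximates G(x) by a midpoint sum in y and checks that the second differences of $\ln G$ on a uniform x grid are negative.

```python
import math

def log_F(x, y):
    """Logarithm of an assumed jointly log concave test function on R^2."""
    return -(x * x - x * y + y * y) - y ** 4

def marginal(x, y_max=6.0, steps=2000):
    """Midpoint rule for G(x) = integral of F(x, y) dy over [-y_max, y_max]."""
    dy = 2.0 * y_max / steps
    total = 0.0
    for j in range(steps):
        y = -y_max + (j + 0.5) * dy
        total += math.exp(log_F(x, y))
    return total * dy

xs = [-2.0 + 0.1 * i for i in range(41)]
log_G = [math.log(marginal(x)) for x in xs]
# second differences of ln G; concavity of ln G means they are all <= 0
second_differences = [
    log_G[i - 1] - 2.0 * log_G[i] + log_G[i + 1] for i in range(1, len(xs) - 1)
]
print(max(second_differences))
```

A finite check of this kind can only detect a violation; it is no substitute for the proof.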
We have four different proofs of this theorem, one of which is the following.
Proof. It is sufficient to prove the theorem when m = n = 1; the general case follows by Fubini's theorem and induction. Choose two points x and x' such that $G(x) \ne 0$ and $G(x') \ne 0$. We may assume that
$$\sup_y \{F(x, y)\} = \sup_y \{F(x', y)\},$$
for otherwise we can replace F(x, y) by $e^{bx} F(x, y)$ with b suitably chosen. For each $z \ge 0$, define $C(z) = \{(x, y) \mid F(x, y) \ge z\} \subset \mathbf{R}^2$, $C(x, z) = \{y \mid F(x, y) \ge z\} \subset \mathbf{R}$ and $g(x, z) = \mathrm{meas}\{C(x, z)\}$. Then
(i) C(z) is convex and thus C(x, z) is an interval;
(ii) $G(x) = \int_0^\infty g(x, z)\, dz$;
(iii) for all $0 \le \lambda \le 1$, $g(\lambda x + (1 - \lambda) x', z) \ge \lambda g(x, z) + (1 - \lambda) g(x', z)$.
This last fact follows easily from the convexity of C(z); it is also the Brunn-Minkowski theorem which, in one dimension, is trivial. Thus
$$G(\lambda x + (1 - \lambda) x') \ge \lambda G(x) + (1 - \lambda) G(x') \ge G(x)^{\lambda} G(x')^{1 - \lambda}.$$
Q.E.D.
Theorem 1.1 should not be confused with the following theorem, which is much simpler and which follows directly from Hoelder's inequality.
THEOREM 1.2. Let $F: \mathbf{R}^{m+n} \to \mathbf{R}$ and, for $x \in \mathbf{R}^m$, $y \in \mathbf{R}^n$, let F(x, y) be log convex in x for each fixed y. Then $G(x) = \int_{\mathbf{R}^n} F(x, y)\, dy$ is log convex on $\mathbf{R}^m$.
An immediate consequence of Theorem 1.1 is the following.
THEOREM 1.3. The convolution of two log concave functions on $\mathbf{R}^n$ is log concave.
Proof. $H(x) = \int_{\mathbf{R}^n} F(x - y) G(y)\, dy$ is log concave since $F(x - y) G(y)$ is jointly log concave in $(x, y) \in \mathbf{R}^{2n}$. Q.E.D.
REMARK. In the case of $\mathbf{R}$, Theorem 1.3 is known [1].
1.3. Application of Theorem 1.1 to Gaussian measures
A Gaussian measure on $R^n$ is given by an (unnormalized) density function $W(x) = \exp[-(x, Ax)/2]$, $A > 0$. The expectation value of a real-valued function H, on $R^n$, is given by
$\langle H \rangle_0 = \int H(x) W(x)\,dx \,/\, \int W(x)\,dx$.
Now suppose that W(x) is replaced by $W_F(x) = W(x)F(x)$, where F is a log concave function. With respect to the new weight we define $\langle H \rangle_F$ as above. How does $\langle H \rangle_F$ compare with $\langle H \rangle_0$?
THEOREM 1.4. The covariance matrix $M^F$, whose elements are $M^F_{ij} = \langle x_i x_j \rangle_F - \langle x_i \rangle_F \langle x_j \rangle_F$, satisfies
$M^F \le M^0 = A^{-1}$
in the sense of forms, i.e. $M^0 - M^F$ is positive-semidefinite.
Proof. Consider the function T on $R^{2n}$ defined by
$T(x, y) = W(x)F(x) \exp[-(y, A^{-1}y)/2 + (y, x)] = W(x - A^{-1}y)F(x)$.
Then T is log concave and, by Theorem 1.1, $U(y) = \int dx\, T(x, y)$ is log concave on $R^n$. Thus, the matrix $\partial^2 \ln U(y)/\partial y_i \partial y_j |_{y=0} = M^F - M^0$ is negative-semidefinite. Q.E.D.
THEOREM 1.5. If, in the above, we replace F by a log convex function then $M^F \ge M^0 = A^{-1}$.
Proof. Write $U(y) = \int_{R^n} W(x) F(x + A^{-1}y)\,dx$ and use Theorem 1.2.
Q.E.D.
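Theorems 1.4 and 1.5 can be illustrated in one dimension, where the covariance matrix is just a variance. The following is a minimal sketch under assumed choices: the weight $W(x) = \exp(-a x^2/2)$ with a = 2, the indicator of [0, 1.5] as the log concave factor and cosh x as the log convex factor. The helper `variance` is hypothetical.

```python
import math

def variance(weight, lo=-12.0, hi=12.0, steps=24000):
    """Variance of x under an unnormalized weight, by the midpoint rule."""
    h = (hi - lo) / steps
    m0 = m1 = m2 = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        w = weight(x)
        m0 += w
        m1 += w * x
        m2 += w * x * x
    mean = m1 / m0
    return m2 / m0 - mean * mean

a = 2.0                                            # A is the 1x1 matrix (a)
W = lambda x: math.exp(-a * x * x / 2.0)
F_concave = lambda x: 1.0 if 0.0 <= x <= 1.5 else 0.0   # indicator of a convex set
F_convex = lambda x: math.cosh(x)                        # log convex factor

var_free = variance(W)                                   # M^0 = 1/a
var_concave = variance(lambda x: W(x) * F_concave(x))    # expected <= 1/a (Thm 1.4)
var_convex = variance(lambda x: W(x) * F_convex(x))      # expected >= 1/a (Thm 1.5)
print(var_free, var_concave, var_convex)
```

For the cosh factor the weight is a mixture of two Gaussians centred at $\pm 1/a$, so the variance is $1/a + 1/a^2$ in closed form, which gives an independent check on the quadrature.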
As an application of Theorem 1.4, consider an Ising model with Boltzmann factor $B(\sigma) = \exp[\frac12 (\sigma, A\sigma)]$, $\sigma_i = \pm 1$, $i = 1, \ldots, n$. By adding an unimportant multiple of the identity to A, we can always assume A > 0. Since
$B(\sigma) = (2\pi)^{-n/2} [\det A]^{-1/2} \int_{R^n} \exp[-(x, A^{-1}x)/2 + (x, \sigma)]\,dx$
we find that the covariance matrix of the $\sigma$'s, $N_{ij} = \langle \sigma_i \sigma_j \rangle$, is simply related to the covariance matrix $M^F$, introduced above (with A replaced by $A^{-1}$), by
$N = A^{-1} M^F A^{-1} - A^{-1}$, where $F(x) = \sum_{\sigma} e^{(x, \sigma)} = \prod_{i=1}^n 2\cosh x_i$.
Now F(x) is log convex, so Theorem 1.5 states that $M^F \ge A$, which implies that $N \ge 0$, hardly an interesting result. Note, however, that
$G(x) = F(x) \exp(-\frac12 \sum_{i=1}^n x_i^2)$
is log concave. Therefore, provided $A^{-1} > I$ (equivalently, A < I) we can write
$\exp[-(x, A^{-1}x)/2] F(x) = \exp[-(x, (A^{-1} - I)x)/2] G(x)$
and Theorem 1.4 states that $M^F \le (A^{-1} - I)^{-1}$ and $N \le (I - A)^{-1}$. In the physical situation, A is a matrix whose eigenvalues are of O(1) independent of n and A < I occurs for sufficiently high temperature, independently of n.
Hence, for high temperature, the eigenvalues of N are O(1); this means there is no long-range order. Although previously there existed elementwise bounds on N for special choices of A ([2] and [3] inequalities), our result is the first case of a quadratic form inequality on N.
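The bound $N \le (I - A)^{-1}$ can be checked by brute force on the smallest example. The sketch below assumes two spins and the symmetric matrix A = [[a, b], [b, a]] with eigenvalues $a \pm b$ in (0, 1); this choice and the helper name `two_spin_check` are illustrative assumptions. N and $(I - A)^{-1}$ then share the eigenvectors (1, 1) and (1, -1), so the form inequality reduces to two scalar comparisons.

```python
import math
from itertools import product

def two_spin_check(a, b):
    """Return the correlation c = <s1 s2> and, for each common eigenvector,
    the pair (eigenvalue of N, eigenvalue of (I - A)^{-1})."""
    Z = 0.0
    corr = 0.0
    for s1, s2 in product((-1, 1), repeat=2):
        # Boltzmann factor exp[(sigma, A sigma)/2] for A = [[a, b], [b, a]]
        w = math.exp(0.5 * (a * s1 * s1 + 2.0 * b * s1 * s2 + a * s2 * s2))
        Z += w
        corr += w * s1 * s2
    c = corr / Z                       # <s_i> = 0 by symmetry, so N = [[1, c], [c, 1]]
    along_11 = (1.0 + c, 1.0 / (1.0 - (a + b)))
    along_1m1 = (1.0 - c, 1.0 / (1.0 - (a - b)))
    return c, along_11, along_1m1

c, plus, minus = two_spin_check(0.5, 0.3)
print(c, plus, minus)
```

In this example c equals tanh b exactly, because the diagonal part of A only contributes a constant factor to the Boltzmann weight.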
We now quote an assortment of theorems, to indicate some of the directions in which Theorem 1.4 can be generalized.
THEOREM 1.6. Consider the weight $W_F(x) = W(x)F(x)$ with F log concave, as in Theorem 1.4, and let F be even. Let L be any symmetric, real, n-square matrix. Then
$\langle (x, Lx)^2 \rangle_F - \langle (x, Lx) \rangle_F^2 \le 2 \langle (x, L A^{-1} L x) \rangle_F$.   (1.1)
Proof. We consider the case in which A = I; the general case can be handled by the change of variables $x \to A^{-1/2}x$. Let $Z = \int_{R^n} W_F(x)\,dx$. Then
$2\Delta \equiv 2\langle (x, Lx)^2 \rangle_F - 2\langle (x, Lx) \rangle_F^2$
$= Z^{-2} \int_{R^n} \int_{R^n} W_F(x) W_F(y) [(x, Lx) - (y, Ly)]^2\,dx\,dy$
$= Z^{-2} \int_{R^n} \int_{R^n} W(u) W(v) F(2^{-1/2}(u + v)) F(2^{-1/2}(u - v))\, 4(u, Lv)^2\,du\,dv$
after the change of variables $x = 2^{-1/2}(u + v)$, $y = 2^{-1/2}(u - v)$. Now do the v integration and recall that the matrix $\langle v_i v_j \rangle \le I$ for each u, by Theorem 1.4. Thus, $2\Delta \le 4\langle (u, L^2 u) \rangle$. Returning to the original x, y variables, one notes that $2(u, L^2 u) = (x, L^2 x) + (y, L^2 y) + 2(x, L^2 y)$. Finally $\langle (x, L^2 x) \rangle = \langle (y, L^2 y) \rangle = \langle (x, L^2 x) \rangle_F$ and $\langle (x, L^2 y) \rangle = 0$.
Q.E.D.
REMARKS. (i) If F is log convex, the inequality in Theorem 1.6 is reversed.
(ii) The significance of Theorem 1.6 is that if L and A are of the order of 1, the left side of (1.1) is the difference of two terms of $O(n^2)$, while the right side is O(n). Choosing L = A, the left side of (1.1) is like n times a specific heat, while the right side is like n times an internal energy, to use the language of statistical mechanics. Usually, it is difficult to obtain an upper bound on a specific heat.
COROLLARY 1.7. Let A and L be symmetric, n-square matrices with A non-singular, let F be even and log concave and let $\lambda$ be real. Then
$Z(\lambda) = \int_{R^n} \exp[-(x, A e^{\lambda L} A x)] F(x)\,dx$
is log concave in $\lambda$.
Proof. Compute $d^2 \ln Z/d\lambda^2$ and compare with Theorem 1.6.
Q.E.D.
THEOREM 1.8. Let $W_F(x) = e^{-x^2/2}F(x)$ be a weight in R with F log concave. Define $\langle \cdot \rangle_F$ and $\langle \cdot \rangle_0$ as before. Then
$\langle |x - \langle x \rangle_F|^{\alpha} \rangle_F \le \langle |x - \langle x \rangle_0|^{\alpha} \rangle_0$
for $\alpha \ge 1$.
The proof of Theorem 1.8 is lengthy and will not be given here. The theorem says that multiplying a Gaussian weight on R by a log concave function may, if the function is not even, shift the mean, but all moments, higher than the first, with respect to the new mean are decreased. We present next a theorem which will play an important role in the next section.
THEOREM 1.9. Let A be a real positive-definite (n+m)-square matrix partitioned as
$A = \begin{pmatrix} \alpha & \beta \\ \beta^T & \gamma \end{pmatrix}$,
where $\alpha$ is n-square, $\gamma$ is m-square, $\beta$ is $n \times m$, and T means transpose. Let F be a log concave function on $R^{n+m}$ and form the unnormalized weight on $R^{n+m}$: $W_F(x) = W(x)F(x)$, $W(x) = \exp[-(x, Ax)/2]$. Denoting, as before, a point $x \in R^{n+m}$ as x = (y, z), $y \in R^n$, $z \in R^m$, define the unnormalized weight V on $R^n$ by
$V(y) = \int_{R^m} W_F(y, z)\,dz$.
If we define $G: R^n \to R$ by
$V(y) = \exp[-(y, By)/2] G(y)$,
with $B = \alpha - \beta \gamma^{-1} \beta^T > 0$, then G is log concave.
Proof. Note that the (n+m)-square matrix
$C \equiv A - \begin{pmatrix} B & 0 \\ 0 & 0 \end{pmatrix}$
is positive-semidefinite. Hence $U_F(x) \equiv \exp[-(x, Cx)/2]F(x)$ is log concave on $R^{n+m}$. Since
$V(y) = \exp[-(y, By)/2] \int_{R^m} U_F(y, z)\,dz$,
Theorem 1.9 follows from Theorem 1.1.
Q.E.D.
REMARKS. (i) Mutatis mutandis, if F is replaced by a log convex function, then G is log convex on $R^n$.
(ii) If F(x) is a constant, then G(y) is also a constant. Thus, Theorem 1.9 states that if one does a partial integration over a Gaussian weight times a log concave function, the result is the Gaussian weight one would have obtained without the log concave multiplier times a new log concave function.
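Remark (ii) can be checked directly for n = m = 1 and F = 1: integrating out z should leave the Gaussian with $B = \alpha - \beta^2/\gamma$ times the constant $\sqrt{2\pi/\gamma}$. The numbers below are arbitrary illustrative values and `residual_factor` is a hypothetical helper.

```python
import math

def residual_factor(y, alpha, beta, gamma, span=12.0, steps=6000):
    """G(y) = V(y) * exp(+B y^2 / 2), where V(y) is the integral over z of
    exp(-(alpha y^2 + 2 beta y z + gamma z^2)/2) and B = alpha - beta^2/gamma."""
    h = 2.0 * span / steps
    total = 0.0
    for i in range(steps):
        z = -span + (i + 0.5) * h      # midpoint rule in z
        total += math.exp(-0.5 * (alpha * y * y + 2.0 * beta * y * z + gamma * z * z))
    B = alpha - beta * beta / gamma    # Schur complement of gamma in A
    return total * h * math.exp(0.5 * B * y * y)

alpha, beta, gamma = 2.0, 0.8, 1.5     # positive definite: alpha*gamma > beta^2
values = [residual_factor(y, alpha, beta, gamma) for y in (-2.0, -1.0, 0.0, 1.0, 2.0)]
print(values, math.sqrt(2.0 * math.pi / gamma))
```

With a non-constant log concave F in the integrand, the same routine would return a y-dependent G(y), which Theorem 1.9 asserts is log concave.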
To pursue the ideas of Theorem 1.9 a bit further, let us formulate the Brunn-Minkowski theorem for Gaussian measures. We recall the classical Brunn-Minkowski theorem [4].
THEOREM 1.10. Let $C_0$, $C_1$ be non-empty convex sets in $R^n$, and let
$C_\lambda = \lambda C_1 + (1 - \lambda) C_0$, $0 \le \lambda \le 1$.
Denote by |C| the n-dimensional Lebesgue measure of C. Then
$|C_\lambda|^{1/n} \ge \lambda |C_1|^{1/n} + (1 - \lambda) |C_0|^{1/n}$.
REMARK. If $C_0 = \{0\}$, then $C_\lambda = \lambda C_1$.
In the case of Gaussian measures we have the following.
THEOREM 1.11. Let $C_0$, $C_1$ and $C_\lambda$ be as in Theorem 1.10, and let A be a real, positive-definite, n-square matrix. Let
$\mu_G(C) = \int_C \exp[-(x, Ax)/2]\,dx$.
Then
$\mu_G(C_\lambda) \ge \mu_G(C_1)^{\lambda} \mu_G(C_0)^{1-\lambda}$.
Proof. Define the convex set
$D = \{(\lambda, x) \mid 0 \le \lambda \le 1,\ x \in C_\lambda\} \subset R^{n+1}$,
so that $\mu_G(C_\lambda) = \int \chi_D(\lambda, x) \exp[-(x, Ax)/2]\,dx$. Since the integrand is log concave in $(\lambda, x)$, $\mu_G(C_\lambda)$ is log concave by Theorem 1.1. Q.E.D.
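In one dimension Theorem 1.11 can be checked in closed form, because the Gaussian measure of an interval is a difference of error functions. The sketch below assumes intervals $C_0$, $C_1$ given as (left, right) pairs; the function names are hypothetical.

```python
import math

def gaussian_measure(a, b, A=1.0):
    """mu_G([a, b]) = integral of exp(-A x^2 / 2) over [a, b] (unnormalized)."""
    s = math.sqrt(A / 2.0)
    return math.sqrt(math.pi / (2.0 * A)) * (math.erf(b * s) - math.erf(a * s))

def minkowski_interval(c0, c1, lam):
    """C_lambda = lam * C_1 + (1 - lam) * C_0 for intervals given as (left, right)."""
    return (lam * c1[0] + (1.0 - lam) * c0[0], lam * c1[1] + (1.0 - lam) * c0[1])

def brunn_minkowski_gap(c0, c1, lam, A=1.0):
    """mu_G(C_lambda) - mu_G(C_1)^lam * mu_G(C_0)^(1 - lam); Theorem 1.11 says >= 0."""
    lhs = gaussian_measure(*minkowski_interval(c0, c1, lam), A)
    rhs = gaussian_measure(*c1, A) ** lam * gaussian_measure(*c0, A) ** (1.0 - lam)
    return lhs - rhs

gaps = [brunn_minkowski_gap((-1.0, 0.5), (2.0, 5.0), i / 10.0) for i in range(11)]
print(min(gaps))
```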
As a corollary to Theorem 1.11 we quote a theorem of L. Gross [5]. The Gaussian measure $\mu_n$ on $R^n$ defines the measure of a Borel set $B \subset R^n$ to be
$\mu_n(B) = (2\pi)^{-n/2} \int_B \exp[-(x, x)/2]\,dx$.
THEOREM 1.12. (L. Gross) Let C be a convex, balanced set in $R^m \times R^n$, let D be the intersection of C with $R^m$, and let E be the projection of C on $R^n$. Then $\mu_{m+n}(C) \le \mu_n(E)\,\mu_m(D)$.
Proof. Let $C_x$ be the intersection of C with the plane parallel to $R^m$ through $x \in R^n$; in particular, $C_0 = D$. By the symmetry of C and Theorem 1.11, $x \mapsto \mu_m(C_x)$ is log concave and even on $R^n$, and hence $\mu_m(C_x)$ is maximal for x = 0. Thus
$\mu_{m+n}(C) = \int_E \mu_m(C_x)\,d\mu_n(x) \le \mu_m(D) \int_E d\mu_n(x) = \mu_n(E)\,\mu_m(D)$.
Q.E.D.
Let us return to the Brunn-Minkowski theorem (1.11) for Gaussians. By passing to the limit $n \to \infty$, the same theorem obviously remains true for infinite-dimensional Gaussian measures, for example, the Wiener measure. In that case we should deal with measurable, convex sets of Wiener paths. We shall consider here particular convex sets of paths, namely those passing through a convex set $C_\lambda \subset R^n$ for all t. With $C_\lambda$, $0 \le \lambda \le 1$, defined as in Theorem 1.10, consider the fundamental solution $G_\lambda(x, y; t)$, $t \ge 0$, of the diffusion equation with potential V, in $R^n$, defined by
$[\partial/\partial t - \frac12 \Delta_x + V(x)]\, G_\lambda(x, y; t) = 0$, $x, y \in C_\lambda$;
$G_\lambda(x, y; 0) = \delta(x - y)$, $x, y \in C_\lambda$;
$G_\lambda(x, y; t) = 0$, $x \in \partial C_\lambda$;
$G_\lambda(x, y; t) = 0$, $x \notin C_\lambda$ or $y \notin C_\lambda$.
THEOREM 1.13. Let V(x) be a convex function. Then $G_\lambda(x, y; t)$ is log concave in $(x, y, \lambda) \in R^n \times R^n \times [0, 1]$.
Proof. Use the Trotter product formula with $x_0 = x$, $x_N = y$:
$G_\lambda(x, y; t) = \lim_{N \to \infty} (2\pi t/N)^{-nN/2} \int dx_1 \ldots dx_{N-1} \times \prod_{i=1}^{N} \{ \exp[-\frac{N}{2t}(x_i - x_{i-1})^2 - \frac{t}{N} V(x_i)]\, \chi_{C_\lambda}(x_i) \}$.
The integrand is log concave in $(x, x_1, \ldots, x_{N-1}, y, \lambda)$. Finally the pointwise limit of a sequence of log concave functions is log concave. Q.E.D.
COROLLARY 1.14. In addition to the hypotheses of Theorem 1.13, either let $C_0$ and $C_1$ be compact or let $\exp(-tV)$ be in $L^1(R^n)$, $\forall t > 0$. Define
$Z_\lambda(t) = \int_{C_\lambda} G_\lambda(x, x; t)\,dx = \mathrm{tr}\, e^{-tH}$,
with $H = -\frac12 \Delta + V$. Then $Z_\lambda(t) < \infty$ and $Z_\lambda(t)$ is log concave in $\lambda$.
Proof. That $Z_\lambda(t)$ is finite is a standard result and can be proved from the Trotter product formula above using Hoelder's inequality. The log concavity of $Z_\lambda(t)$ follows from Theorems 1.1 and 1.13.
Q.E.D.
COROLLARY 1.15. Let V(x) and $C_\lambda$ be as in Corollary 1.14. Let $\varepsilon_0(\lambda)$ be the lowest eigenvalue of the equation
$[-\frac12 \Delta + V(x)] f(x) = \varepsilon_0(\lambda) f(x)$, with f(x) = 0 for $x \notin C_\lambda$.
Then $\varepsilon_0(\lambda)$ is a convex function of $\lambda \in [0, 1]$.
Proof. Since $e^{-tH}$ is trace class, $Z_\lambda = \sum_{i \ge 0} \exp[-t\varepsilon_i(\lambda)]$, $\varepsilon_{i+1}(\lambda) \ge \varepsilon_i(\lambda)$ and each $\varepsilon_i(\lambda)$ has finite multiplicity. Then
$\varepsilon_0(\lambda) = -\lim_{t \to \infty} t^{-1} \ln Z_\lambda(t)$,
and, since the pointwise limit of a sequence of convex functions is convex, Corollary 1.15 is proved. Q.E.D.
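The simplest instance of Corollary 1.15 has V = 0 and symmetric intervals $C_\lambda = [-L_\lambda, L_\lambda]$ with $L_\lambda = \lambda L_1 + (1 - \lambda) L_0$. There the ground state of $-\frac12 d^2/dx^2$ with Dirichlet conditions is $\cos(\pi x/2L_\lambda)$, with eigenvalue $\pi^2/(8 L_\lambda^2)$. The sketch below, with hypothetical names and illustrative half-widths, checks midpoint convexity of that closed form.

```python
import math

def lowest_eigenvalue(lam, L0=1.0, L1=3.0):
    """epsilon_0(lambda) for -(1/2) d^2/dx^2 on [-L, L], L = lam*L1 + (1-lam)*L0,
    with zero boundary values: epsilon_0 = pi^2 / (8 L^2)."""
    L = lam * L1 + (1.0 - lam) * L0
    return math.pi ** 2 / (8.0 * L * L)

def is_midpoint_convex(f, points):
    """Check f((s + t)/2) <= (f(s) + f(t))/2 for all pairs of sample points."""
    return all(f(0.5 * (s + t)) <= 0.5 * (f(s) + f(t)) + 1e-15
               for s in points for t in points)

samples = [i / 20.0 for i in range(21)]
print(is_midpoint_convex(lowest_eigenvalue, samples))
```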
1.4. The one-dimensional plasma
In this section we apply the previous theorems to an old problem in physics, namely, to the one-dimensional, one-component plasma in a neutralizing background. We shall consider both the classical and quantum-mechanical cases. The latter requires the introduction of the Wiener integral, and thus provides another example of the application of our theorems to functional integrals. The object of our investigations is to show that long-range order exists for sufficiently large coupling constant, i.e. that the one-particle distribution function is a non-constant periodic function. The occurrence of this phenomenon was first predicted by Wigner [6].
Let $x = (x_{-n}, \ldots, x_n)$ be the coordinates of (2n + 1) one-dimensional particles, each having a negative charge of one unit. The one-dimensional Coulomb potential between two unit charges separated by a distance |x| is $-|x|$. Then the total potential energy of (2n + 1) particles in a 'box' [-L, L] with a fixed uniform positive charge background of density $\rho$ is
$\phi(x) = -\sum_{-n \le i < j \le n} |x_i - x_j| + \rho \sum_{i=-n}^{n} \int_{-L}^{L} |x_i - x|\,dx - \frac12 \rho^2 \int_{-L}^{L} \int_{-L}^{L} |x - y|\,dx\,dy$.   (1.2)
We shall further assume that the total charge is zero, i.e. $2L\rho = 2n + 1$. Since $\phi$ is symmetric in the $x_i$ it is sufficient to consider the convex domain
$C = \{x \mid -L \le x_{-n} \le x_{-n+1} \le \ldots \le x_n \le L\}$.   (1.3)
In C,
$\phi(x) = \rho \sum_{i=-n}^{n} (x_i - i/\rho)^2$,   (1.4)
where a constant term in the potential has been neglected. Our methods are capable of handling the domain C as it stands, but then the function we wish to calculate, $\rho(x)$, will not be strictly periodic, except in the thermodynamic limit $n \to \infty$, $\rho$ = constant. To circumvent this difficulty we extend C to the larger convex domain
$D = \{x \mid x_{-n} \le x_{-n+1} \le \ldots \le x_n \le x_{-n} + 2L\}$.   (1.5)
The domain D no longer confines the particles to the box, and we shall cheat a little by supposing that the expression (1.4) for $\phi$ is valid in all of D. The original walls of the box are still visible in $\phi$.
Remark then that the domain D and the potential $\phi$ are invariant under the linear transformations R (reflection) and T (translation), defined by
$(Rx)_i = -x_{-i}$;   (1.6)
$(Tx)_i = x_{i+1} - 1/\rho$, $-n \le i \le n - 1$; $(Tx)_n = x_{-n} + 2n/\rho$.   (1.7)
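The reduction of (1.2) to the quadratic form (1.4), and the invariance of (1.4) under T, can be verified numerically. The sketch below assumes ordered configurations inside the box and uses the exact integrals $\int_{-L}^{L} |x_i - t|\,dt = x_i^2 + L^2$ (for $|x_i| \le L$) and $\int\int |s - t|\,ds\,dt = 8L^3/3$. The function names and sample configurations are hypothetical.

```python
def phi_full(x, rho, L):
    """Eq. (1.2) for a configuration with all |x_i| <= L."""
    m = len(x)
    pair = sum(abs(x[i] - x[j]) for i in range(m) for j in range(i + 1, m))
    background = rho * sum(xi * xi + L * L for xi in x)   # particle-background term
    self_energy = 0.5 * rho * rho * 8.0 * L ** 3 / 3.0    # background-background term
    return -pair + background - self_energy

def phi_quadratic(x, rho):
    """Eq. (1.4); list position k holds particle i = k - n."""
    n = (len(x) - 1) // 2
    return rho * sum((x[k] - (k - n) / rho) ** 2 for k in range(len(x)))

def translate(x, rho):
    """The map T of eq. (1.7)."""
    n = (len(x) - 1) // 2
    return [xi - 1.0 / rho for xi in x[1:]] + [x[0] + 2.0 * n / rho]

rho, n = 1.0, 2
L = (2 * n + 1) / (2.0 * rho)                 # charge neutrality: 2 L rho = 2n + 1
x1 = [-2.0, -0.7, 0.1, 0.9, 2.3]              # ordered, inside [-L, L]
x2 = [-1.5, -1.0, 0.0, 1.2, 1.4]
const1 = phi_full(x1, rho, L) - phi_quadratic(x1, rho)
const2 = phi_full(x2, rho, L) - phi_quadratic(x2, rho)
print(const1, const2)
```

The two printed values agree, which is the statement that (1.2) and (1.4) differ only by a configuration-independent constant in C.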
1.4.1. The classical case
The Gibbs distribution function of the jth particle is defined by ($\beta$ is the reciprocal temperature)
$\rho_j^{(n)}(x) = \int_D \delta(x - x_j) \exp[-\beta\phi(x)]\,dx \,/\, \int_D \exp[-\beta\phi(x)]\,dx$.
The symmetry properties (1.6) and (1.7) imply that
$\rho_0^{(n)}(x) = \rho_0^{(n)}(-x)$;   (1.8)
$\rho_j^{(n)}(x) = \rho_0^{(n)}(x - j/\rho)$.   (1.9)
Since D is a convex domain in $R^{2n+1}$, direct application of Theorem 1.9 gives that
$\rho_0^{(n)}(x) = \exp(-\beta\rho x^2) F^{(n)}(x)$,   (1.10)
where $F^{(n)}$ is log concave; by eqn (1.8), $F^{(n)}(x)$ is also even. We shall not go into the details of the existence of the limiting distribution functions
$\rho_j(x) = \lim_{n \to \infty} \rho_j^{(n)}(x)$
in the thermodynamic limit ($n \to \infty$, $L \to \infty$, $2n + 1 = 2L\rho$). Obviously, properties (1.8)-(1.10) remain true in the limit. It is also fairly clear that the use of domain C instead of domain D would give the same distribution functions in the limit. Thus far we have established part (i) of the following theorem:
THEOREM 1.16. (i) The one-particle distribution functions of the classical one-dimensional plasma computed in D satisfy
$\rho_0(x) = \rho_0(-x)$, $\rho_j(x) = \rho_0(x - j/\rho)$, $\rho_0(x) = \exp[-\beta\rho x^2] F(x)$,
where F(x) is a log concave, even function.
(ii) $\int_R |x|^{\alpha} \rho_0(x)\,dx \le (\beta\rho/\pi)^{1/2} \int_R |x|^{\alpha} \exp[-\beta\rho x^2]\,dx$ for $\alpha \ge 1$.
(iii) For large values of $\beta/\rho$, the total density
$\rho(x) = \sum_{j=-\infty}^{\infty} \rho_j(x)$
is non-trivially periodic.
Proof. (ii) Follows from Theorem 1.8, (iii) will be proved in § 1.4.3, Theorem 1.18.
REMARKS. (i) Theorem 1.16 (ii) states that the moments of the one-particle distribution functions are smaller than they would be without the restriction $x_j \le x_{j+1}$.
(ii) The interpretation of Theorem 1.16 (iii) is that the plasma is in a crystalline state. The specific position of the crystal is a consequence of the hard walls that were imposed at $\pm L = \pm (n + \frac12)/\rho$. This fact is reflected not only in the domain D (eqn (1.5)) but also in the expression (1.4) for $\phi(x)$. Hard walls at positions $\pm L + \delta$ would translate the crystal through a distance $\delta$.
(iii) The fact that $\rho(x)$ is not a constant has recently been proved by Kunz [7] who, by other methods, showed that to be true for all $\beta$, $\rho$ except possibly for a countable number of values of $\beta/\rho$.
1.4.2. The quantum-mechanical case
The quantum-mechanical Hamiltonian of the system defined by equation (1.2), with $\hbar^2/m = 1$, is
$H = -\frac12 \Delta + \phi(x)$, where $\Delta = \sum_{i=-n}^{n} \partial^2/\partial x_i^2$.
We consider the case that the particles are spinless fermions. This means that H acts on square integrable, antisymmetric functions. As is well known, an equivalent statement in one dimension is that H acts on square integrable functions defined on $E = \{x \mid x_{-n} \le x_{-n+1} \le \ldots \le x_n\}$ which vanish on the boundary of E. The 'box' condition requires that the functions vanish on the boundary of $C \subset E$. As in the classical case, we shall use the larger domain D instead of C. The distribution function of the jth particle is then
$\rho_j^{(n)}(x) = \mathrm{tr}_D[e^{-\beta H} \delta_j(x)] \,/\, \mathrm{tr}_D\, e^{-\beta H}$,
where $\delta_j(x)$ is the operator of multiplication by $\delta(x - x_j)$. Since H and D are invariant under the transformations R and T (eqns (1.6), (1.7)), the distribution functions again have the symmetry properties ((1.8), (1.9)).
To find the analogue of eqn (1.10), use the Trotter product formula for $\exp(-\beta H)$, which gives, with $x^0 = x^N$,
$\mathrm{tr}_D\, e^{-\beta H} \delta_0(x) = \lim_{N \to \infty} (2\pi\beta/N)^{-(2n+1)N/2} \int_{D^N} dx^1 \ldots dx^N \times \prod_{k=1}^{N} \exp\{-\sum_{j=-n}^{n} [\frac{N}{2\beta}(x_j^k - x_j^{k-1})^2 + \frac{\beta\rho}{N}(x_j^k - j/\rho)^2]\}\, \delta(x_0^1 - x)$.   (1.11)
Since $D^N$ is convex in $R^{(2n+1)N}$, we can apply Theorem 1.9 to conclude that
$\rho_0^{(n)}(x) = \exp(-\gamma x^2) H^{(n)}(x)$,   (1.12)
where $H^{(n)}$ is log concave, and where $\exp(-\gamma x^2)$ is, up to multiplication by an x-independent constant, what eqn (1.11) would give if D were replaced by $R^{2n+1}$. But in that case the integrations separate into (2n + 1) independent integrations over $R^N$. Therefore, $\exp(-\gamma x^2)$ is proportional to $G(x, x; \beta)$, where G(x, y; t) is the fundamental solution (Green's function) of the differential equation, for t > 0,
$(\partial/\partial t - \frac12 \partial^2/\partial x^2 + \rho x^2)\, G(x, y; t) = 0$; $G(x, y; 0) = \delta(x - y)$.
Using the well-known expression [8] for G, we obtain
$\gamma = (2\rho)^{1/2} \tanh \beta(\rho/2)^{1/2}$.   (1.13)
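Equation (1.13) has two limits that are easy to inspect numerically: for small $\beta$ (high temperature) $\gamma$ approaches the classical exponent $\beta\rho$ of eqn (1.10), and for large $\beta$ it saturates at $(2\rho)^{1/2}$. Since $\tanh u \le u$, one always has $\gamma \le \beta\rho$. The function name below is hypothetical.

```python
import math

def gamma_quantum(beta, rho):
    """Gaussian exponent of eqn (1.13): (2 rho)^(1/2) * tanh(beta * (rho/2)^(1/2))."""
    return math.sqrt(2.0 * rho) * math.tanh(beta * math.sqrt(rho / 2.0))

rho = 2.0
for beta in (1e-4, 0.1, 1.0, 10.0, 50.0):
    # columns: beta, quantum exponent gamma, classical exponent beta * rho
    print(beta, gamma_quantum(beta, rho), beta * rho)
```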
The analogue of Theorem 1.16 is now immediate.
THEOREM 1.17. Theorem 1.16 is correct for the quantum-mechanical plasma of spinless fermions except that in part (ii) $\beta\rho$ is replaced by $\gamma$ (eqn (1.13)) and in part (iii) $\beta/\rho$ is replaced by $\gamma/\rho^2$. Remarks (i) and (ii) after Theorem 1.16 also apply here.
We turn next to the demonstration that parts (iii) of Theorems 1.16 and 1.17 follow from parts (i) of those theorems.
1.4.3. Can modified theta functions be constant?
Let $f(x) = \exp(-\lambda x^2) F(x)$ with F(x) even and log concave and $\lambda > 0$. Consider
$\rho(x) = \sum_{j=-\infty}^{\infty} f(x - j)$.   (1.14)
The question to which we address ourselves here is whether or not F can be chosen so that $\rho(x)$ is constant. The answer, surprisingly, depends on $\lambda$. As Theorem 1.18 shows, $\rho(x)$ cannot be constant when $\lambda$ is large, and thus parts (iii) of Theorems 1.16 and 1.17 are proved. Define the Fourier transform by
$\hat f(k) = \int_R e^{2\pi i k x} f(x)\,dx$.   (1.15)
Then, by the Poisson summation formula,
$\rho(x) = \sum_{j=-\infty}^{\infty} \hat f(j)\, e^{-2\pi i j x}$.
Therefore $\rho(x)$ is constant iff
$\hat f(j) = 0$ for $j = \pm 1, \pm 2, \ldots$.   (1.16)
THEOREM 1.18. Let $\rho(x)$ be defined as in eqn (1.14). Then there exists a $\lambda_0$, $0 < \lambda_0 < \infty$, such that
(a) For all $\lambda > \lambda_0$ and for all F even and log concave, $\rho(x)$ is not constant;
(b) For all $\lambda < \lambda_0$, $\lambda > 0$ there exists an even, log concave F such that $\rho(x)$ = constant.
Proof. (i) Existence of $\lambda_0$: If, for some $\lambda$, there is an F(x) that leads to a constant $\rho(x)$, then, for $\mu < \lambda$, the log concave function $F(x) \exp[(\mu - \lambda)x^2]$ gives the same constant $\rho(x)$.
(ii) $\lambda_0 < \infty$: Normalize to F(0) = 1. Then $\rho(0) \ge 1$, and
$\rho(\frac12) \le 2e^{-\lambda/4} \sum_{j=0}^{\infty} e^{-2\lambda j} = 2e^{-\lambda/4}(1 - e^{-2\lambda})^{-1}$.
This gives the simple estimation $\lambda_0 < 3$.
(iii) $\lambda_0 > 0$: We indicate how to construct an example of constant $\rho$ for $\lambda$ sufficiently small. Choose a non-constant, even, log concave function G, and normalize it so that $g(x) = \exp(-\lambda x^2) G(x)$ satisfies $\int_R g(x)\,dx = \hat g(0) = 1$. Define
$\hat f(k) = \prod_{j=1}^{\infty} \hat g(k/j)$,   (1.17)
which is the Fourier transform of the convolution
$f(x) = *_{j=1}^{\infty}\; j \exp(-\lambda j^2 x^2) G(jx)$.   (1.18)
The infinite product (1.17) is defined and $\hat f(k) > 0$ in a neighbourhood of k = 0, since
$1 \ge \hat g(k/j) \approx 1 + \frac12 (k/j)^2 \hat g''(0) > 0$ for $|k/j| \ll 1$.
Equation (1.18) then follows from the Lebesgue dominated convergence theorem, and $f \not\equiv 0$. Now Theorem 1.9 applied to eqn (1.18) gives
$f(x) = \exp(-\alpha\lambda x^2) F(x)$,
where F(x) is log concave and even and $\alpha = (\sum_{j=1}^{\infty} j^{-2})^{-1} = 6\pi^{-2}$. It is now sufficient to determine $\lambda$ and G such that $\hat g(\pm 1) = 0$; then, by eqn (1.17),
$\hat f(j) = 0$ for all integers $j \ne 0$,
and we are done. Take G(x)=f 1 +'/(A/ar)]X(x), where X is the characteristic function of [ -;, :). Then lim g(k) _ (irk)-' sin 2irk; lim g(k) = 1. A+O
A-
Therefore $\lambda$ can be chosen such that $\hat g(\pm 1) = 0$.
Q.E.D.
Acknowledgements
This work has been partially supported by U.S. National Science Foundation Grants GP-31674X and GP-16147A # 1.
References
1. SCHOENBERG, I. J. On Polya frequency functions I: The totally positive functions and their Laplace transforms. J. Anal. math. 1, 331-74 (1951).
2. GRIFFITHS, R. B. Correlations in Ising ferromagnets, I, II, III. J. Math. Phys. 8, 478-83, 484-9 (1967); Commun. Math. Phys. 6, 121-7 (1967); KELLY, D. and SHERMAN, S. General Griffiths inequalities on correlations in Ising ferromagnets. J. Math. Phys. 9, 466-84 (1968).
3. FORTUIN, C. M., KASTELEYN, P. W., and GINIBRE, J. Correlation functions on some partially ordered sets. Commun. Math. Phys. 22, 89-103 (1971).
4. BONNESEN, T. and FENCHEL, W. Theorie der Konvexen Koerper. Chelsea, New York (1948).
5. GROSS, L. Measurable functions on Hilbert space. Trans. Am. math. Soc. 105, 372-90 (1962); see also DUDLEY, R. M., FELDMAN, J., and LE CAM, L. On seminorms and probabilities, and abstract Wiener spaces. Ann. Math. 93, Ser. 2, 390-408 (1971).
6. WIGNER, E. P. Effects of the electron interaction on the energy levels of electrons in metals. Trans. Faraday Soc. 34, 678-85 (1938).
7. KUNZ, H. Equilibrium properties of the one-dimensional classical electron gas. Preprint E.P.F.L., Lausanne (1974).
8. MERZBACHER, E. Quantum mechanics, Chapter 8. Wiley, New York (1961).
Note
Since this work was completed we have found that Theorem 1.1 has been proved independently by A. Prekopa, Acta Math. Szeged 32, 301-15 (1971), and Y. Rinott, Thesis, Weizmann Institute, Rehovoth, Israel (Nov. 1973).
With H.J. Brascamp in Adv. in Math. 20, 151-172 (1976) ADVANCES IN MATHEMATICS 20, 151-173 (1976)
Best Constants in Young's Inequality, Its Converse, and Its Generalization to More than Three Functions
HERM JAN BRASCAMP*
Department of Physics, Princeton University, Princeton, New Jersey 08540
AND
ELLIOTT H. LIEB*
Departments of Mathematics and Physics, Princeton University, Princeton, New Jersey 08540
The best possible constant $D_{pqt}$ in the inequality $|\int\int dx\,dy\, f(x) g(x - y) h(y)| \le D_{pqt} \|f\|_p \|g\|_q \|h\|_t$, $p, q, t \ge 1$, $1/p + 1/q + 1/t = 2$, is determined; the equality is reached if f, g, and h are appropriate Gaussians. The same is shown to be true for the converse inequality ($0 < p, q < 1$, $t < 0$), in which case the inequality is reversed. Furthermore, an analogous property is proved for an integral of k functions over n variables, each function depending on a linear combination of the n variables; some of the functions may be taken to be fixed Gaussians. Two applications are given, one of which is a proof of Nelson's hypercontractive inequality.
* Work partially supported by National Science Foundation Grant Number GP-31674X.
1. INTRODUCTION
The classical inequality of Young is that
$\|f * g\|_r \le \|f\|_p \|g\|_q$,   (1.1)
where * means convolution, $1/p + 1/q = 1 + 1/r$, $p, q, r \ge 1$ and f and g are functions on R. Alternatively, (1.1) is equivalent to
$I = |\int\int f(x) g(x - y) h(y)\,dx\,dy| \le \|f\|_p \|g\|_q \|h\|_t$   (1.2)
when $1/p + 1/q + 1/t = 2$. Unlike Holder's inequality, $\|fg\|_r \le \|f\|_p \|g\|_q$ ($1/p + 1/q = 1/r$), the best possible constant in (1.1) is not unity. About a year ago, Beckner [1] showed that Gaussians give the best constant when $1 \le p, q, t \le 2$, by finding the best constant in the Hausdorff-Young inequality. The latter result is very deep, but will not play a role in this paper. Thus the conjecture was raised that for all p, q, r, Gaussians give the best constant in (1.1) and (1.2). This fact was proved simultaneously by Beckner [1] and us by different methods. We report our method here because it also leads to a generalization of (1.2), namely to integrals involving k functions instead of 3 and to integrations over n variables instead of 2. This is contained in Theorem 1 and the explicit constant for (1.2) is in Eqs. (2.19) and (2.20).
In Section 3, we also find the best constant in the converse of Young's inequality (Eq. (1.1) with the reversed inequality, for $0 < p, q, r \le 1$), first shown by Leindler [2]. In particular, we rederive the Prekopa-Leindler inequality [3-5]. In Section 4 we show that, as far as Young's inequality and its converse are concerned, the equality holds uniquely for Gaussians. We are not able to show this for the general inequality of Theorem 1; this remains an open question. Section 5 contains two applications of Theorem 1: Nelson's hypercontractivity theorem and an inequality in statistical mechanics.
An amusing example of Theorem 1 which shows how Gaussians arise, and which can be done by elementary methods is the following: let
$J = \int\int f(x) g(y) h(x - y) k(x + y)\,dx\,dy$.
Thus, using the Schwarz inequality,
$|J| \le [\int\int |f(x) g(y)|^2\,dx\,dy]^{1/2} [\int\int |h(x - y) k(x + y)|^2\,dx\,dy]^{1/2} = 2^{-1/2} \|f\|_2 \|g\|_2 \|h\|_2 \|k\|_2$.
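The Schwarz bound just obtained can be checked by direct quadrature. The sketch below is an illustration only: it assumes the Gaussian quadruple $f = g = \exp(-2x^2)$, $h = k = \exp(-x^2)$ as a candidate for equality, and a second, arbitrary quadruple of identical Gaussians as a case of strict inequality. The helper names and grid sizes are hypothetical.

```python
import math

def double_integral(func, half_width=6.0, steps=240):
    """Midpoint rule for a rapidly decaying func(x, y) on a square."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        for j in range(steps):
            y = -half_width + (j + 0.5) * h
            total += func(x, y)
    return total * h * h

def l2_norm(func, half_width=8.0, steps=3200):
    """L^2 norm on the line by the midpoint rule."""
    h = 2.0 * half_width / steps
    return math.sqrt(sum(func(-half_width + (i + 0.5) * h) ** 2 for i in range(steps)) * h)

def J(f, g, h, k):
    return double_integral(lambda x, y: f(x) * g(y) * h(x - y) * k(x + y))

def schwarz_bound(f, g, h, k):
    return 2.0 ** -0.5 * l2_norm(f) * l2_norm(g) * l2_norm(h) * l2_norm(k)

narrow = lambda x: math.exp(-2.0 * x * x)
wide = lambda x: math.exp(-x * x)

J_opt, bound_opt = J(narrow, narrow, wide, wide), schwarz_bound(narrow, narrow, wide, wide)
J_other, bound_other = J(wide, wide, wide, wide), schwarz_bound(wide, wide, wide, wide)
print(J_opt, bound_opt, J_other, bound_other)
```

For the first quadruple both sides equal $\pi/4$; for the second, $J = \pi/3$ lies strictly below the bound $2^{-1/2}\pi/2$.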
Equality holds if f(x)g(y) is proportional to h(x - y)k(x + y). But this is true for the Gaussians $f(x) = g(x) = \exp(-2x^2)$, $h(x) = k(x) = \exp(-x^2)$. Thus, $2^{-1/2}$ is the best constant.
The general case is not as simple as this example. The idea behind our proof is that $I^M$ can trivially be written as an integral over $R^M \times R^M$. However, by the rearrangement inequality [6-9], $|I^M|$ can be increased by replacing $f(x_1, \ldots, x_M) = |f(x_1)| \cdots |f(x_M)|$ by its spherically symmetric, decreasing rearrangement, F, and similarly for g and h. This rearrangement does not affect the $L^p$ norms. The main fact is that for large M, all spherically symmetric, decreasing functions look like Gaussians in some sense. The proof is concluded by letting $M \to \infty$.
2. THE MAIN THEOREM
In this section we prove the following theorem.
THEOREM 1. Let n and k be integers with $1 \le n \le k$. Let $p_j$, $1 \le j \le k$, be real numbers such that $1 \le p_j \le \infty$, $\sum_{j=1}^{k} 1/p_j = n$. Let $f_j$, $1 \le j \le k$, be complex-valued functions on R, and let $f_j \in L^{p_j}(R)$. Let $a^j$, $1 \le j \le k$, be vectors in $R^n$, and let
$I(\{f_j\}) = \int_{R^n} d^n x \prod_{j=1}^{k} f_j(\langle a^j, x \rangle)$.   (2.1)
Then
$|I(\{f_j\})| \le D \prod_{j=1}^{k} \|f_j\|_{p_j}$,   (2.2)
where
$D = \sup\{I(\{\phi_j\}) \mid \phi_j \in G,\ \|\phi_j\|_{p_j} = 1,\ j = 1, \ldots, k\}$,   (2.3)
the supremum being taken over the class G of all Gaussian functions with maximum at the origin.
The value of D will be exhibited in Section 2.3.
2.1 Auxiliary Remarks
Let us first pave the way for the proof of Theorem 1 with some remarks and propositions. Obviously it is sufficient to consider only non-negative functions $f_j$, since taking the absolute values of all $f_j$ increases $|I(\{f_j\})|$ and does not change the $L^p$ norms. In the same way, one can restrict oneself to symmetric decreasing functions. To see this, let us introduce the symmetric decreasing rearrangement f* of a non-negative function f [6]: f* is the symmetric decreasing function that is equimeasurable to f, i.e., the sets $\{x \in R \mid f(x) > z\}$ and $\{x \in R \mid f^*(x) > z\}$ have equal Lebesgue measures for all z > 0. Obviously, f and f* have equal p-norms. Also, according to a theorem proved by Luttinger and ourselves [9],
$\int_{R^n} d^n x \prod_{j=1}^{k} f_j(\langle a^j, x \rangle) \le \int_{R^n} d^n x \prod_{j=1}^{k} f_j^*(\langle a^j, x \rangle)$.   (2.4)
We shall further need a generalization of the inequality (2.4) to functions of several variables, also given in [9]. Given a non-negative function f(x), $x \in R^M$, its Schwarz symmetrization f* (spherically symmetric decreasing rearrangement) is defined as the spherically symmetric function which is decreasing in radial directions and which is equimeasurable to f. Then the inequality reads
$\int_{R^{nM}} d^n x_1 \ldots d^n x_M \prod_{j=1}^{k} f_j(\langle a^j, x_1 \rangle, \ldots, \langle a^j, x_M \rangle) \le \int_{R^{nM}} d^n x_1 \ldots d^n x_M \prod_{j=1}^{k} f_j^*(\langle a^j, x_1 \rangle, \ldots, \langle a^j, x_M \rangle)$.   (2.5)
The derivation in [9] of Eq. (2.5) from Eq. (2.4) follows Sobolev's method [8].
Now, restricting ourselves to non-negative, symmetric decreasing functions $f_j$, each of those functions can be approximated pointwise from below by functions of the type
$f_j^K(x) = \sum_{m=1}^{K} g_j^m \chi_j^m(x)$.   (2.6)
Here, the $\chi_j^m$ are characteristic functions of symmetric intervals $[-l_j^m, l_j^m]$, with $l_j^m > l_j^{m+1}$. Note that the function $f_j^K(x)$ takes only K different positive values, namely, $h_j^1 = g_j^1$, $h_j^2 = g_j^1 + g_j^2$, ..., $h_j^K = g_j^1 + \ldots + g_j^K$.
As $K \to \infty$, $f_j^K(x) \uparrow f_j(x)$ for all $x \in R$, and hence by monotone convergence
$\|f_j^K\|_{p_j} \uparrow \|f_j\|_{p_j}$; $I(\{f_j^K\}) \uparrow I(\{f_j\})$.   (2.7)
The latter remains true if $I(\{f_j\}) = \infty$. As a consequence of Eq. (2.7), it suffices now to prove Theorem 1 for step functions of the form given in Eq. (2.6). We conclude this subsection with two useful propositions.
PROPOSITION 2. Let $\phi_j$, $1 \le j \le K$, be non-negative functions in $L^p(R^M)$, $p \ge 1$. Then
$\|\sum_{j=1}^{K} \phi_j\|_p \ge K^{-1/q} \sum_{j=1}^{K} \|\phi_j\|_p$,
where 1/p + 1/q = 1.
Proof. Holder's inequality applied to a finite sum yields
$\sum_{j=1}^{K} \|\phi_j\|_p \le K^{1/q} (\sum_{j=1}^{K} \|\phi_j\|_p^p)^{1/p} = K^{1/q} (\int dx \sum_{j=1}^{K} \phi_j(x)^p)^{1/p}$.
However,
$\sum_{j=1}^{K} \phi_j(x)^p \le (\sum_{j=1}^{K} \phi_j(x))^p$.
Q.E.D.
PROPOSITION 3. Let $\eta_a$ be the characteristic function of the ball $\{x \in R^M : |x| \le a\}$, and let
$\phi_a(x) = \exp[(1 - x^2/a^2) M/2p]$,
so that $\eta_a(x) \le \phi_a(x)$. Then $\|\phi_a\|_p \le \|\eta_a\|_p (3\sqrt{M})^{1/p}$.
Proof. If $\Omega_M$ is the surface area of a unit sphere in M dimensions,
$\|\eta_a\|_p^p = \Omega_M a^M/M$;
$\|\phi_a\|_p^p = \Omega_M e^{M/2} \int_0^{\infty} dx\, x^{M-1} e^{-Mx^2/2a^2} = \Omega_M e^{M/2} a^M (2/M)^{M/2}\, \Gamma(M/2)/2$.
Hence, by Stirling's formula
$(\|\phi_a\|_p/\|\eta_a\|_p)^p = \Gamma(M/2 + 1)/[M/(2e)]^{M/2} < (\pi M)^{1/2} e^{1/(6M)} < 3\sqrt{M}$.
Q.E.D.
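The Stirling estimate in Proposition 3 is easy to tabulate. The sketch below evaluates the closed form $\Gamma(M/2 + 1)/[M/(2e)]^{M/2}$ through `math.lgamma` and compares it with the two bounds; it also shows that the M-th root of $3\sqrt{M}$ tends to 1, which is what the limit $M \to \infty$ in the proof of Theorem 1 relies on. The function name is hypothetical.

```python
import math

def norm_ratio_p(M):
    """(||phi_a||_p / ||eta_a||_p)^p = Gamma(M/2 + 1) / (M / (2e))^(M/2)."""
    half = M / 2.0
    return math.exp(math.lgamma(half + 1.0) - half * math.log(M / (2.0 * math.e)))

for M in (1, 2, 10, 100):
    stirling = math.sqrt(math.pi * M) * math.exp(1.0 / (6.0 * M))
    # columns: M, exact ratio, Stirling bound, 3 sqrt(M), M-th root of 3 sqrt(M)
    print(M, norm_ratio_p(M), stirling, 3.0 * math.sqrt(M), (3.0 * math.sqrt(M)) ** (1.0 / M))
```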
These foundations being laid, we can turn to the proof of Theorem 1.
2.2 Proof of Theorem 1
As explained above, each of the functions $f_j$ can be taken to be a step function of the form (2.6), all with the same number of steps, say K. Now write
$I(\{f_j\})^M = \int_{R^{nM}} d^n x_1 \ldots d^n x_M \prod_{j=1}^{k} \prod_{m=1}^{M} f_j(\langle a^j, x_m \rangle)$.
Let $F_j(x_1, \ldots, x_M)$ be the $R^M$ Schwarz symmetrization of $\prod_{m=1}^{M} f_j(x_m)$. Then by Eq. (2.5) the last integral is not larger than
$\int_{R^{nM}} d^n x_1 \ldots d^n x_M \prod_{j=1}^{k} F_j(\langle a^j, x_1 \rangle, \ldots, \langle a^j, x_M \rangle)$.
Now notice that $f_j(x)$ only takes K positive values, say $h_j^1, \ldots, h_j^K$. Then $\prod_{m=1}^{M} f_j(x_m)$ and $F_j$ take the values $(h_j^1)^{\alpha_1} (h_j^2)^{\alpha_2} \cdots (h_j^K)^{\alpha_K}$, with $\alpha_m \in \{0, \ldots, M\}$, and $\sum_{m=1}^{K} \alpha_m = M$. The number of values taken by $F_j$ is thus certainly smaller than $(M + 1)^K$. We can write
$F_j = \sum_{m=1}^{(M+1)^K} H_j^m \eta_j^m$,
where the $\eta_j^m$ are characteristic functions of M-dimensional balls centered at the origin. Then, by Proposition 2,
$\|f_j\|_{p_j}^M = \|F_j\|_{p_j} \ge (M + 1)^{-K/q_j} \sum_m H_j^m \|\eta_j^m\|_{p_j}$,
where $1/p_j + 1/q_j = 1$. Altogether, this gives
$[I(\{f_j\})/\prod_{j=1}^{k} \|f_j\|_{p_j}]^M \le (M + 1)^{(k-n)K} \times \frac{\sum_{m_1, \ldots, m_k} H_1^{m_1} \cdots H_k^{m_k} \int_{R^{nM}} d^n x_1 \ldots d^n x_M \prod_{j=1}^{k} \eta_j^{m_j}(\langle a^j, x_1 \rangle, \ldots, \langle a^j, x_M \rangle)}{\sum_{m_1, \ldots, m_k} H_1^{m_1} \cdots H_k^{m_k} \prod_{j=1}^{k} \|\eta_j^{m_j}\|_{p_j}}$.   (2.9)
Now pick any k-tuple of characteristic functions $\eta_1, \ldots, \eta_k$ of balls with radii $b_1, \ldots, b_k$. Define
$\phi_j = \exp[(1 - x^2/b_j^2) M/2p_j]$.
Then, by Proposition 3,
$\frac{\int_{R^{nM}} d^n x_1 \ldots d^n x_M \prod_{j=1}^{k} \eta_j(\langle a^j, x_1 \rangle, \ldots, \langle a^j, x_M \rangle)}{\prod_{j=1}^{k} \|\eta_j\|_{p_j}} \le (3\sqrt{M})^n\, \frac{\int_{R^{nM}} d^n x_1 \ldots d^n x_M \prod_{j=1}^{k} \phi_j(\langle a^j, x_1 \rangle, \ldots, \langle a^j, x_M \rangle)}{\prod_{j=1}^{k} \|\phi_j\|_{p_j}}$.   (2.10)
Since each $\phi_j$ is a product of M one-dimensional Gaussians, the quotient on the right side of Eq. (2.10) is at most $D^M$, by the very definition of D, Eq. (2.3). Using this together with Eqs. (2.9) and (2.10), we get
$I(\{f_j\}) \le [(M + 1)^{(k-n)K} (3\sqrt{M})^n]^{1/M}\, D \prod_{j=1}^{k} \|f_j\|_{p_j}$.
The desired result is obtained by letting M go to $\infty$.
Q.E.D.
2.3 Computation of the Best Constant, D
We now proceed to compute the supremum D in Eq. (2.2). Let $\phi_j(x) = \exp(-z_j x^2)$. Then
$\int_{R^n} d^n x \prod_{j=1}^{k} \phi_j(\langle a^j, x \rangle) = \pi^{n/2} (\det A)^{-1/2}$,
where the $n \times n$ matrix A is given by
$A_{uv} = \sum_{j=1}^{k} z_j a_u^j a_v^j$, $1 \le u, v \le n$.
PROPOSITION 4.
$\det A = \sum_{|S| = n} J_S z_S$, where $z_S = \prod_{j \in S} z_j$,   (2.11)
and $J_S$, $S = \{j_1, \ldots, j_n\}$, is defined as
$J_S = [\det(a^{j_1} \ldots a^{j_n})]^2$.   (2.12)
Proof. Note that det A is homogeneous of degree n in the $z_j$. Further
$\frac{\partial}{\partial z_j} \det A = \sum_{u,v=1}^{n} a_u^j a_v^j \det A_{u;v} (-1)^{u+v}$,
where the $(n-1) \times (n-1)$ matrix $A_{u;v}$ is obtained from A by removing the uth row and vth column. Repeating this procedure, one gets
$\frac{\partial^2}{\partial z_j \partial z_l} \det A = \sum_{u \ne w} \sum_{v \ne t} a_u^j a_w^l a_v^j a_t^l\, \eta(u, w)\, \eta(v, t) \det A_{uw;vt} (-1)^{u+v+w+t}$,
where $\eta(u, w) = 1$ if u < w, and $-1$ if u > w. In particular, $(\partial^2/\partial z_j^2) \det A = 0$. Differentiating n times, one ends up with
$(\partial^n/\prod_{j \in S} \partial z_j) \det A = J_S$.
Q.E.D.
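Proposition 4 is an instance of the Cauchy-Binet formula and can be verified with exact integer arithmetic. The sketch below assumes the vectors $a^1 = (1, 0)$, $a^2 = (0, 1)$, $a^3 = (1, -1)$, $a^4 = (1, 1)$ and $z = (2, 2, 1, 1)$, which correspond to the example with $f(x)g(y)h(x - y)k(x + y)$ in the Introduction, plus an arbitrary three-dimensional case. The function names are hypothetical.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for c in range(len(m)):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * m[0][c] * det(minor)
    return total

def weighted_gram(vectors, z):
    """A_uv = sum over j of z_j a^j_u a^j_v."""
    n = len(vectors[0])
    return [[sum(zj * a[u] * a[v] for zj, a in zip(z, vectors)) for v in range(n)]
            for u in range(n)]

def subset_sum(vectors, z):
    """Right side of (2.11): sum over n-element subsets S of J_S * z_S."""
    n = len(vectors[0])
    total = 0
    for S in combinations(range(len(vectors)), n):
        columns = [[vectors[j][u] for j in S] for u in range(n)]
        J_S = det(columns) ** 2
        z_S = 1
        for j in S:
            z_S *= z[j]
        total += J_S * z_S
    return total

vecs2, z2 = [(1, 0), (0, 1), (1, -1), (1, 1)], (2, 2, 1, 1)
vecs3, z3 = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1), (1, 2, 3)], (1, 2, 3, 4, 5)
print(det(weighted_gram(vecs2, z2)), subset_sum(vecs2, z2))
print(det(weighted_gram(vecs3, z3)), subset_sum(vecs3, z3))
```

In the two-dimensional example det A = 16, so the Gaussian integral $\pi^{n/2}(\det A)^{-1/2}$ equals $\pi/4$.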
'Es Since 11 ¢; 112
V1
_ (7r/z1p,)h/Ps, we get k
D2 =
sup
fl (z, Pj)1 ; /Y- Jszs IS
z,..... sk>O 7-1
Now consider the function '(z1 ,..., xk)
_
k
I,n,/y
xs =1
Jsxs
S
defined on W = (R+)k. By Schwarz's inequality, II Y'((x1t1)1/2,..., (zktk)"2)2 i Y'(x1 ,..., zk) +b(t1 ,..., 1k).
In other words, log
(2.13)
is a concave function of the log z, . Therefore,
if the variational equations I/Pt = Y- Jszs/Y- Jszs
sa/
S
have a solution in W, / reaches its absolute maximum there. We show now that the variational equations have a unique solution
(modulo the trivial rescaling $z_j \to c z_j$) if $J_S > 0$, $1 < p_j < \infty$. Firstly, the equality sign in Eq. (2.13) holds if $J_S z_S = (\text{const})\, J_S t_S$. If $J_S > 0$, this implies that $z_j = c t_j$. Thus, modulo rescaling, $\log \psi$ is strictly concave.
Secondly, let $z_j$ approach the boundary of $W$; say $z_j \sim N^{\alpha_j}$ with $N \to \infty$ and any real $\alpha_j$. Let
\[ \gamma = \sum_{j=1}^{k} \alpha_j/p_j - \max_{S} \sum_{j \in S} \alpha_j. \]
If $J_S > 0$, $\psi \sim N^{\gamma}$; moreover, if $1 < p_j < \infty$, $\gamma < 0$ and $\psi \to 0$ (unless $\alpha_j = \text{const}$ for all $j$, which again corresponds to the rescaling). The results are summarized in the following theorem.
THEOREM 5.
Under the assumptions of Theorem 1, and with the notation of Eqs. (2.11, 12), let the equations
\[ 1/p_j = \sum_{S \ni j} J_S z_S \Big/ \sum_S J_S z_S, \qquad j = 1,\dots,k \tag{2.14} \]
have a solution for $0 < z_j < \infty$. Then the constant $D$ in Theorem 1 is given by
\[ D^2 = \prod_{j=1}^{k} (z_j p_j)^{1/p_j} \Big/ \sum_S J_S z_S \tag{2.15} \]
and the equality sign in Eq. (2.2) holds for
\[ f_j(x) = \exp(-z_j x^2). \tag{2.16} \]
If $1 < p_j < \infty$ and $J_S > 0$, the Eqs. (2.14) have a unique solution satisfying $0 < z_j < \infty$ (modulo the rescaling $z_j \to c z_j$).
Remark. If $J_S = 0$ for some $S$, Eqs. (2.14) may or may not have a solution and $D$ may be finite or infinite. If $J_S > 0$ and some $p_j = 1$, Eq. (2.14) formally leads to $z_j = \infty$. If $J_S > 0$ and some $p_j = \infty$, Eq. (2.14) formally leads to $z_j = 0$. In both cases this gives the right value for $D$.
An important consequence of Theorem 5 is this: Normally one would apply Theorem 1 with fixed values of $p_1,\dots,p_k$, but then the determination of $z_1,\dots,z_k$ from Eq. (2.14) may not be easy to do when $k$ is
large. It may be much easier to fix the values of the $z_j$, whence the $p_j$ are trivially given by Eq. (2.14). Eq. (2.15) then correctly gives the value of $D$ for those $p_j$. Examples of such usage are given in Section 5.
2.4 A Generalization of Theorem 1
THEOREM 6. Let $m$, $n$, $k$ be integers with $0 \le m \le n$, $1 \le n \le k + m$. Let $p_j$, $1 \le j \le k$, satisfy $1 \le p_j \le \infty$ and
\[ n - m \le \sum_{j=1}^{k} 1/p_j \le n. \]
Let $f_j$, $a^j$, $1 \le j \le k$, be as in Theorem 1. Finally, let $B$ be a nonnegative real $n \times n$ matrix of rank $m$:
\[ B_{uv} = \sum_{j=k+1}^{k+m} z_j a^j_u a^j_v. \]
Then
\[ \Big| \int_{\mathbb{R}^n} d^n x \prod_{j=1}^{k} f_j(\langle a^j, x\rangle)\, \exp(-\langle x, Bx\rangle) \Big| \le E \prod_{j=1}^{k} \|f_j\|_{p_j}, \tag{2.17} \]
where the optimal constant $E$ can be determined by restricting the $f_j$ to be Gaussians.
If the equations
\[ 1/p_j = \sum_{S \ni j} J_S z_S \Big/ \sum_S J_S z_S, \qquad 1 \le j \le k \tag{2.18} \]
(with $S$ running over the $n$-point subsets of $\{1,\dots,k+m\}$) have a solution satisfying $0 < z_j < \infty$ for $1 \le j \le k$, $E$ is given by
\[ E^2 = \pi^{\,n - \sum_{j=1}^{k} 1/p_j} \prod_{j=1}^{k} (z_j p_j)^{1/p_j} \Big/ \sum_S J_S z_S, \]
and the equality sign in Eq. (2.17) holds if
\[ f_j(x) = \exp(-z_j x^2), \qquad 1 \le j \le k. \]
Proof. If Eqs. (2.18) have a solution, one can define $p_j$, $k+1 \le j \le k+m$, by Eq. (2.18) extended to $k+1 \le j \le k+m$. Then Theorem 6 reduces to Theorems 1 and 5.
In the general case, Theorem 6 can be proved in the same way as Theorem 1, following the lines of Sections 2.1 and 2.2. During that operation, exp(-<x, Bx>) is kept fixed.
Q.E.D.
2.5 Young's Inequality
Theorems 1 and 5 contain the following special case, which gives the best possible improvement to Young's inequality.
\[ \Big| \int_{\mathbb{R}^2} dx\, dy\, f(x)\, g(x-y)\, h(y) \Big| \le C_p C_q C_t\, \|f\|_p \|g\|_q \|h\|_t, \tag{2.19} \]
where
\[ C_p^2 = p^{1/p} / p'^{\,1/p'}, \tag{2.20} \]
$1 < p, q, t < \infty$, $1/p + 1/q + 1/t = 2$, $1/p + 1/p' = 1$. [Throughout the remainder of this paper we use the convention $1/p' = 1 - 1/p$.] The equality sign holds if
\[ f(x) = \exp(-p'x^2), \qquad g(x) = \exp(-q'x^2), \qquad h(x) = \exp(-t'x^2). \tag{2.21} \]
Eqs. (2.20, 21) can be immediately read off from Eq. (2.14-16). In Section 4 we shall show that (2.21) is essentially the only choice to obtain equality in (2.19). An equivalent form of Eq. (2.19) is
\[ \|f * g\|_r \le C_p C_q C_{r'}\, \|f\|_p \|g\|_q, \qquad 1/p + 1/q = 1 + 1/r. \tag{2.22} \]
Repeated application of the last equation gives
\[ \|f_1 * \cdots * f_n\|_r \le C_{r'} \prod_{j=1}^{n} C_{p_j} \|f_j\|_{p_j}, \tag{2.23} \]
where $1 < p_j < \infty$, $\sum_{j=1}^{n} 1/p_j = n - 1 + 1/r$. The constant in Eq. (2.23) is the best possible, the equality sign holding for
\[ f_j(x) = \exp(-p_j' x^2). \]
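The statement that Eqs. (2.20, 21) can be read off from Eqs. (2.14-16) can be illustrated numerically. In the sketch below (an illustration only, not the authors' computation) the three functions in (2.19) correspond to the vectors $a^1 = (1,0)$, $a^2 = (1,-1)$, $a^3 = (0,1)$, for which every $J_S$ equals 1. Following the remark after Theorem 5, the $z_j$ are fixed first, here to $(p', q', t')$ as suggested by (2.21), and the exponents and $D^2$ are then computed from (2.14) and (2.15). The function names and the sample exponents are assumptions made for the example.

```python
import math


def conjugate(p):
    """Conjugate exponent p' defined by 1/p + 1/p' = 1 (p must differ from 1)."""
    return p / (p - 1.0)


def young_C(p):
    """C_p from Eq. (2.20): C_p^2 = p^(1/p) / p'^(1/p')."""
    pc = conjugate(p)
    return math.sqrt(p ** (1.0 / p) / pc ** (1.0 / pc))


def theorem5_quantities(z, J):
    """Given z_j > 0 and a dict J mapping index tuples S to J_S, return the
    exponents p_j defined by Eq. (2.14) and D^2 from Eq. (2.15)."""
    k = len(z)
    weight = {S: J_S * math.prod(z[i] for i in S) for S, J_S in J.items()}
    total = sum(weight.values())
    p = [total / sum(w for S, w in weight.items() if j in S) for j in range(k)]
    D2 = math.prod((z[j] * p[j]) ** (1.0 / p[j]) for j in range(k)) / total
    return p, D2


# Young's inequality: k = 3 functions, n = 2 variables, all J_S = 1.
J_young = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0}
p, q = 1.5, 2.0                          # sample exponents (assumed)
t = 1.0 / (2.0 - 1.0 / p - 1.0 / q)      # enforces 1/p + 1/q + 1/t = 2
z = [conjugate(p), conjugate(q), conjugate(t)]
exponents, D2 = theorem5_quantities(z, J_young)
C2 = (young_C(p) * young_C(q) * young_C(t)) ** 2
```

With these inputs `exponents` reproduces $(p, q, t)$ and `D2` agrees with $(C_pC_qC_t)^2$ up to rounding.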
In Section 3 we shall show that the inequality (2.23) is reversed if the exponents $p_1,\dots,p_n$ lie between 0 and 1.
2.6 A Multi-Dimensional Version of Theorem 1
Theorem 1 has been stated and proved for functions $f_j$ from $\mathbb{R} \to \mathbb{C}$. We now state a generalization of that theorem for functions from $\mathbb{R}^M \to \mathbb{C}$.
THEOREM 7. With the same assumptions as in Theorem 1, let $f_j \in L^{p_j}(\mathbb{R}^M)$, $1 \le j \le k$. Let $\{a_i^j\}$, $1 \le i \le M$, $1 \le j \le k$, be vectors in $\mathbb{R}^n$. Then
\[ \Big| \int d^n x_1 \cdots d^n x_M \prod_{j=1}^{k} f_j(\langle a_1^j, x_1\rangle,\dots,\langle a_M^j, x_M\rangle) \Big| \le D \prod_{j=1}^{k} \|f_j\|_{p_j}, \]
and $D$ is determined by taking the supremum over Gaussian functions.
Proof. We note that the rearrangement inequality, cf. Eq. (2.5), is not true for such integrals. However, the theorem easily follows from Theorem 1 by integrating first over $x_1$, then over $x_2$, etc. In this way one finds that the optimal Gaussians are of the form
\[ \phi_j(x_1,\dots,x_M) = \exp\Big[-\sum_{i=1}^{M} c_i^j x_i^2\Big]. \]
Q.E.D.
An open question is the following: Let $B_1,\dots,B_k$ be $k$ linear maps from $\mathbb{R}^N$ to $\mathbb{R}^M$ and let $f_1,\dots,f_k$ be functions in $L^{p_j}(\mathbb{R}^M)$. Let
\[ I = \int_{\mathbb{R}^N} d^N x \prod_{j=1}^{k} f_j(B_j(x)). \]
When can $I$ be bounded by a constant times $\prod_{j=1}^{k} \|f_j\|_{p_j}$, and when is the optimal set of $f$'s Gaussian?
3. THE CONVERSE INEQUALITY
This section is devoted to the following theorem.
THEOREM 8. Let $p_j$, $1 \le j \le n$, and $r$ satisfy $0 < p_j, r \le 1$. Let $\sum_{j=1}^{n} 1/p_j = n - 1 + 1/r$. With $1/p + 1/p' = 1$, let
\[ C_p^2 = |p|^{1/p} / |p'|^{1/p'}. \]
Finally, let $f_j$, $1 \le j \le n$, be non-negative functions in $L^{p_j}(\mathbb{R})$. Then
\[ \|f_1 * \cdots * f_n\|_r \ge C_{r'} \prod_{j=1}^{n} C_{p_j} \|f_j\|_{p_j}. \tag{3.1} \]
The equality sign holds (for $p_j \ne 1$) if
\[ f_j(x) = \exp(p_j' x^2). \]
3.1 Preliminary Remarks
It is sufficient to prove Theorem 8 for $n = 2$ ($0 < p, q, r \le 1$, $1/p + 1/q = 1 + 1/r$):
\[ \|f * g\|_r \ge C_p C_q C_{r'}\, \|f\|_p \|g\|_q; \tag{3.2} \]
the general case then follows by repeated application. A weaker form of Eq. (3.2) was found by Leindler [4]:
\[ \|f * g\|_r \ge \|f\|_p \|g\|_q. \tag{3.3} \]
If $p = 1$, then $q = r$ and Eq. (3.2) is the same as Eq. (3.3). Thus we shall further restrict ourselves to $0 < p, q < 1$. As in Section 2, we shall need a rearrangement inequality.
PROPOSITION 9.
Let $f, g: \mathbb{R}^M \to \mathbb{R}^+$ and let $0 < r \le 1$. Then
\[ \|f * g\|_r \ge \|f^* * g^*\|_r. \tag{3.4} \]
Proof. If $r = 1$, Eq. (3.4) is a trivial equality. For $0 < r < 1$ and $f, h \ge 0$, Hölder's inequality becomes
\[ \int f(x)\, h(x)\, dx \ge \|f\|_r \|h\|_{r'}. \]
Hence
\[ \|f * g\|_r = \inf\Big\{ \int f(x-y)\, g(y)\, h(x)\, d^M x\, d^M y \ \Big|\ h(x) \ge 0,\ \|h\|_{r'} = 1 \Big\}. \tag{3.5} \]
Note that $r' < 0$. Define the symmetric increasing rearrangement ${}^*h$ of $h$ by
\[ {}^*h = [(h^{-1})^*]^{-1}. \]
Then $\|{}^*h\|_{r'} = \|h\|_{r'}$. For $A > 0$, let
\[ h^A(x) = \min[A, h(x)]; \qquad k^A(x) = A - h^A(x). \]
Then, as $A \to \infty$,
\[ h^A(x) \uparrow h(x); \qquad A - k^{A*}(x) \uparrow {}^*h(x). \tag{3.6} \]
We can assume that $\|f * g\|_r < \infty$. But in that case, Leindler's inequality (3.3) implies that $f, g \in L^1 \cap L^r$. Then by the rearrangement inequality (2.5) and Eq. (3.6) together with monotone convergence, we have
\[ \int f(x-y)\, g(y)\, h(x)\, d^M x\, d^M y = \lim_{A \to \infty} \Big[ A \|f\|_1 \|g\|_1 - \int f(x-y)\, g(y)\, k^A(x)\, d^M x\, d^M y \Big] \]
\[ \ge \lim_{A \to \infty} \Big[ A \|f^*\|_1 \|g^*\|_1 - \int f^*(x-y)\, g^*(y)\, k^{A*}(x)\, d^M x\, d^M y \Big] \]
\[ = \int f^*(x-y)\, g^*(y)\, {}^*h(x)\, d^M x\, d^M y. \]
Eq. (3.4) now follows from Eq. (3.5).
Q.E.D.
A consequence of Proposition 9 is that we can restrict ourselves to symmetric decreasing functions in proving Eq. (3.2). Then we can find sequences of simple step functions as in Eq. (2.6) such that
\[ f^n(x) \le f(x), \quad \|f^n\|_p \to \|f\|_p; \qquad g^n(x) \le g(x), \quad \|g^n\|_q \to \|g\|_q. \]
This means that it suffices to prove Eq. (3.2) for step functions of the form given in Eq. (2.6). We need the analogue of Proposition 3, which reads
PROPOSITION 10. Let $\phi_j$, $1 \le j \le K$, be non-negative functions in $L^p(\mathbb{R}^M)$, $0 < p < 1$. Then
II 0, II,, i=1
i-1
T
where $1/p + 1/p' = 1$.
Proof. The first inequality follows from the fact that $\|\phi\|_p$ can be written as an infimum (cf. Eq. (3.5)); the second one is proved as in Proposition 3, where it should be noted that both inequalities encountered in that proof are reversed. Q.E.D.
We also need some comparison between characteristic functions of balls and Gaussians, as in Proposition 4. It is true that Proposition 4 remains valid for 0 < p < 1; however the direction of the inequality signs makes it quite useless here. In fact, no such simple trick seems to be
available now, and we are obliged to make a brutal computation of the volume of the intersection of two balls. PROPOSITION 11.
Let $\eta_a$ be the characteristic function of the ball $\{x \in \mathbb{R}^M : |x| < a\}$, and let (with $0 < p, q, r < 1$, $1/p + 1/q = 1 + 1/r$)
\[ \psi_M(a/b) = \|\eta_a * \eta_b\|_r \big/ \|\eta_a\|_p \|\eta_b\|_q. \]
Then
\[ \psi_M(a/b) \ge (C_p C_q C_{r'})^M. \]
Proof. Note that by the rearrangement inequality in Proposition 9
\[ \psi_{MN}(a/b) \le [\psi_M(a/b)]^N; \]
hence it suffices to show that
\[ \lim_{M \to \infty} [\psi_M(a/b)]^{1/M} \ge C_p C_q C_{r'}. \]
The intersection in $\mathbb{R}^M$ of a ball with radius $a$ centered at the origin and a ball with radius $b$ centered at the point $x$ can be thought of as the union of $(M-1)$-dimensional balls, each centered on the line connecting the origin with $x$. The greatest radius $h(x)$ occurring among these balls is
\[ h(x) = \min(a, b), \qquad 0 \le x \le |a^2 - b^2|^{1/2}; \]
\[ h(x) = [-x^4 + 2(a^2 + b^2)x^2 - (a^2 - b^2)^2]^{1/2} / 2x, \qquad |a^2 - b^2|^{1/2} \le x \le a + b; \]
\[ h(x) = 0, \qquad x \ge a + b. \]
Then
\[ (\eta_a * \eta_b)(x) \approx \Omega_M\, h(x)^M \]
(i.e., the $M$th root of the ratio of both members goes to 1 as $M \to \infty$). In the same way
\[ \|\eta_a * \eta_b\|_r \approx \Omega_M \big[ \max_x \{ x^{1/r} h(x) \} \big]^M. \]
The maximum on the right side is reached for $|a^2 - b^2|^{1/2} \le x \le a + b$; hence
\[ \lim_{M \to \infty} [\psi_M(a/b)]^{1/M} = \tfrac12 \max_x \big\{ (x/a)^{1/p} (x/b)^{1/q} [-1 + 2(a^2/x^2 + b^2/x^2) - (a^2/x^2 - b^2/x^2)^2]^{1/2} \big\}. \]
Let
\[ \phi(A, B) = A^{-1/p} B^{-1/q} [-1 + 2(A + B) - (A - B)^2]. \]
Straightforward calculation gives that the unique solution to
\[ \partial\phi/\partial A = \partial\phi/\partial B = 0 \]
is given by
\[ A = rr'/pp'; \qquad B = rr'/qq'. \]
Since $[\psi_M(a/b)]^{1/M} \to \infty$ for $a/b \to 0$ or $a/b \to \infty$, substitution of these values for $A$ and $B$ into $\phi(A, B)$ must lead to the minimum over $a/b$ of $\lim_{M \to \infty} [\psi_M(a/b)]^{1/M}$. The result is
\[ \min_{a/b}\ \lim_{M \to \infty} [\psi_M(a/b)]^{1/M} = C_p C_q C_{r'}. \]
Q.E.D.
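The "straightforward calculation" at the end of the proof can be spot-checked numerically. The sketch below (an illustration, not part of the original argument) evaluates $\phi$ at $A = rr'/pp'$, $B = rr'/qq'$, estimates both partial derivatives there by central differences, and compares $\tfrac12\phi^{1/2}$ with $C_pC_qC_{r'}$. The sample exponents and the step size are assumptions.

```python
import math


def conjugate(p):
    """Conjugate exponent p' with 1/p + 1/p' = 1; negative when 0 < p < 1."""
    return p / (p - 1.0)


def C(p):
    """C_p with C_p^2 = |p|^(1/p) / |p'|^(1/p'), as defined in Theorem 8."""
    pc = conjugate(p)
    return math.sqrt(abs(p) ** (1.0 / p) / abs(pc) ** (1.0 / pc))


def phi(A, B, p, q):
    """phi(A, B) = A^(-1/p) B^(-1/q) [-1 + 2(A + B) - (A - B)^2]."""
    return A ** (-1.0 / p) * B ** (-1.0 / q) * (-1.0 + 2.0 * (A + B) - (A - B) ** 2)


def check(p, q, eps=1e-6):
    """Return (dphi/dA, dphi/dB, (1/2) sqrt(phi), C_p C_q C_r') at the claimed point."""
    r = 1.0 / (1.0 / p + 1.0 / q - 1.0)          # from 1/p + 1/q = 1 + 1/r
    pc, qc, rc = conjugate(p), conjugate(q), conjugate(r)
    A = r * rc / (p * pc)
    B = r * rc / (q * qc)
    dA = (phi(A + eps, B, p, q) - phi(A - eps, B, p, q)) / (2.0 * eps)
    dB = (phi(A, B + eps, p, q) - phi(A, B - eps, p, q)) / (2.0 * eps)
    value = 0.5 * math.sqrt(phi(A, B, p, q))
    target = C(p) * C(q) * C(rc)
    return dA, dB, value, target


results = [check(0.5, 0.5), check(0.5, 2.0 / 3.0)]
```

For $p = q = 1/2$ one finds $A = B = 1/3$ and $\phi = 27$, so both sides equal $\sqrt{27}/2$.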
3.2 Proof of Theorem 8
Theorem 8 is now proved along the same lines as Theorem 1. Given step functions $f$, $g$ as in Eq. (2.6), define $F(x_1,\dots,x_M)$ [resp. $G(x_1,\dots,x_M)$] as the Schwarz symmetrization of $\prod_{m=1}^{M} f(x_m)$ [resp. $\prod_{m=1}^{M} g(x_m)$]; then $F$, $G$ are as in Eq. (2.8). We have, by Proposition 9,
\[ (\|f * g\|_r)^M \ge \|F * G\|_r. \]
Hence, by Propositions 10 and 11,
[Ilf
rgLr/I f11.I1
111M
(1 .1 + . 1)(1/D'i l/4')K
H1mH2n
Lm n H I
* n Ilr
II 7)'n 11 -9!!n !I nn 11q
(M + 1)"n>'+114')x(CDCaCr.)M.
The proof is again concluded by taking the Mth root.
Q.E.D.
The proof given above does not allow for a generalization of Theorem 8 to a full analogue of Theorem 1, concerning $k$ functions and $n$ variables. In fact, a converse rearrangement inequality as in Proposition 9 only seems true if $k = n + 1$ (cf. the proof of Proposition 9).
3.3 A Limiting Case of Theorem 8
Theorem 8 allows us to rederive a theorem due to Prékopa [3, 5] and Leindler [4].
THEOREM 12 (Prékopa-Leindler). Let $f, g \ge 0$, $f, g \in L^1(\mathbb{R})$, and let $\lambda \in (0, 1)$. Let
\[ h(x) = \operatorname*{ess\,sup}_{y}\ f\Big(\frac{x-y}{\lambda}\Big)^{\lambda} g\Big(\frac{y}{1-\lambda}\Big)^{1-\lambda}. \]
Then $h$ is measurable and
\[ \|h\|_1 \ge \|f\|_1^{\lambda}\, \|g\|_1^{1-\lambda}. \]
Proof. The measurability of $h$ is proved in [10]. Let $f^{(n)}$ (resp. $g^{(n)}$) be a sequence of bounded functions of compact support which approach $f$ (resp. $g$) in $L^1$ norm and such that $f^{(n)}(x) \le f(x)$, $g^{(n)}(x) \le g(x)$, $\forall x$. Defining $h^{(n)}$ using $f^{(n)}$ and $g^{(n)}$, one has that $\|h^{(n)}\|_1 \le \|h\|_1$, and hence it is sufficient to prove the theorem for bounded functions of compact support. For such functions
\[ h(x) = \lim_{R \to \infty} h_R(x), \qquad h_R(x) = \Big[ \int f\Big(\frac{x-y}{\lambda}\Big)^{\lambda R} g\Big(\frac{y}{1-\lambda}\Big)^{(1-\lambda)R} dy \Big]^{1/(R-1)}, \]
\[ \|h\|_1 = \lim_{R \to \infty} \|h_R\|_1. \]
The interchange of the $R$ limit and the integral is allowed by dominated convergence since the $h_R$ are uniformly bounded and their supports lie in some common compact set. Now for $R > \max(\lambda^{-1}, (1-\lambda)^{-1})$, let $1/p = \lambda R$, $1/q = (1-\lambda)R$, $1/r = R - 1$, $1/r' = 2 - R$. Using (3.2) one has, with $t = R(R-1)^{-1}$,
\[ \|h_R\|_1 \ge (C_p C_q C_{r'})^{r}\, [\lambda \|f\|_1]^{\lambda t}\, [(1-\lambda) \|g\|_1]^{(1-\lambda)t}. \]
When $R \to \infty$, $t \to 1$ and $(C_p C_q C_{r'})^{r} \to \lambda^{-\lambda} (1-\lambda)^{-(1-\lambda)}$. Q.E.D.
Note that Prékopa and Leindler proved a slightly weaker form of Theorem 12, concerning sup instead of ess sup. Variants of their theorem were later found by Rinott [11] and ourselves [12]. Much simpler proofs are possible without using Theorem 8 and these will be published in the Journal of Functional Analysis.
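The limit of the constant used at the end of the proof of Theorem 12 can be examined numerically. The sketch below is only an illustration: it evaluates $(C_pC_qC_{r'})^r$ with $1/p = \lambda R$, $1/q = (1-\lambda)R$, $1/r = R - 1$ for a large $R$, working with logarithms to avoid overflow, and compares the result with $\lambda^{-\lambda}(1-\lambda)^{-(1-\lambda)}$. The parametrization by $u = 1/p$, the function names and the sample values of $\lambda$ and $R$ are choices made here.

```python
import math


def log_C(u):
    """log C_p written in terms of u = 1/p, using
    C_p^2 = |p|^(1/p) / |p'|^(1/p') and 1/p' = 1 - u."""
    v = 1.0 - u
    return 0.5 * (-u * math.log(abs(u)) + v * math.log(abs(v)))


def constant(lam, R):
    """(C_p C_q C_r')^r for 1/p = lam*R, 1/q = (1-lam)*R, 1/r = R-1, 1/r' = 2-R."""
    r = 1.0 / (R - 1.0)
    return math.exp(r * (log_C(lam * R) + log_C((1.0 - lam) * R) + log_C(2.0 - R)))


def limit(lam):
    """Claimed limit lam^(-lam) * (1-lam)^(-(1-lam)) as R -> infinity."""
    return lam ** (-lam) * (1.0 - lam) ** (-(1.0 - lam))


lam = 0.3
approx = constant(lam, 1.0e6)
exact = limit(lam)
```

The difference between `approx` and `exact` is expected to shrink roughly like $(\log R)/R$, so the agreement improves only slowly with $R$.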
4. UNIQUENESS
In this section we show that Eqs. (2.22) and (3.2) hold as equalities only if $f$ and $g$ are Gaussians.
THEOREM 13. Let $f \in L^p(\mathbb{R})$, $g \in L^q(\mathbb{R})$, with either $1 < p, q < \infty$ or $0 < p, q < 1$. In the latter case, let $f(x) \ge 0$, $g(x) \ge 0$. Let $1 + 1/r = 1/p + 1/q$, and let
\[ \|f * g\|_r = C_p C_q C_{r'}\, \|f\|_p \|g\|_q. \tag{4.1} \]
Then
\[ f(x) = A \exp[-\gamma |p'| (x - \alpha)^2 + i\delta x], \qquad g(x) = B \exp[-\gamma |q'| (x - \beta)^2 + i\delta x], \tag{4.2} \]
with constants $\gamma \in \mathbb{R}^+$, $\alpha, \beta, \delta \in \mathbb{R}$, and $A, B \in \mathbb{R}^+$, $\delta = 0$ if $0 < p, q < 1$; $A, B \in \mathbb{C}$ if $p, q > 1$.
Proof. If $p, q > 1$, the equality (4.1) implies that $f \ge 0$, $g \ge 0$ (apart from arbitrary multiplicative constants (see note added in proof)). Eq. (4.1) holds if there exists a function $h \in L^{r'}(\mathbb{R})$ such that
\[ \int_{\mathbb{R}^2} dx\, dy\, f(x - y)\, g(y)\, h(x) = C_p C_q C_{r'}\, \|f\|_p \|g\|_q \|h\|_{r'}. \tag{4.3} \]
In fact, by Hölder's inequality the only possible choice for $h$ is
\[ h = (\text{const})\, (f * g)^{r/r'}. \tag{4.4} \]
Now let Eq. (4.3) be satisfied for the triples $f, g, h$ and $f_1, g_1, h_1$. Then
\[ \int_{\mathbb{R}^4} dx\, dy\, du\, dv\, f(x-y)\, f_1(u-v-x+y)\, g(y)\, g_1(v-y)\, h(x)\, h_1(u-x) = (C_p C_q C_{r'})^2\, \|f\|_p \|f_1\|_p \|g\|_q \|g_1\|_q \|h\|_{r'} \|h_1\|_{r'}. \]
Now first integrate over $(x, y)$ and then over $(u, v)$. Using Eq. (2.22), resp. Eq. (3.2), twice, this implies that, for almost all $(u, v)$, Eq. (4.3) is satisfied for the triple $f(x) f_1(u - v - x)$, $g(x) g_1(v - x)$, $h(x) h_1(u - x)$. Therefore, this triple must satisfy an equation of the form (4.4), with the constant depending on $(u, v)$. As a special choice, take
\[ f_1(x) = \exp(-|p'| x^2/2), \qquad g_1(x) = \exp(-|q'| x^2/2), \qquad h_1(x) = \exp[-r (\operatorname{sgn} r') x^2/2]. \]
Define
\[ F(x) = f(x) \exp(-|p'| x^2/2), \qquad G(x) = g(x) \exp(-|q'| x^2/2), \qquad H(x) = h(x)^{r'/r} \exp(-|r'| x^2/2). \]
Then, for almost all $(u, v)$, we have for almost all $x$
\[ H(x) \exp(r'ux) = K(u, v) \int dy\, F(x - y) \exp[p'(u - v)(x - y)]\, G(y) \exp(q'vy). \tag{4.5} \]
Define the two-sided Laplace transform by
\[ \tilde{A}(s) = \int_{-\infty}^{\infty} dx\, A(x)\, e^{-sx}. \]
Since $F$, $G$, and $H$ contain a Gaussian factor, their Laplace transforms are defined and analytic in the whole complex $s$-plane. Eq. (4.5) becomes
\[ \tilde{H}(s - r'u) = K(u, v)\, \tilde{F}(s - p'(u - v))\, \tilde{G}(s - q'v). \]
By a shift $s \to s + r'u$, this becomes
\[ \tilde{H}(s) = K(u, v)\, \tilde{F}(s + p't)\, \tilde{G}(s - q't), \tag{4.6} \]
with
\[ t = v - ur'/q'. \]
Since $\tilde{H}$ does not depend on $u$ and $v$, $K(u, v)$ can only depend on $t$. Since $\tilde{F}$, $\tilde{G}$ and $\tilde{H}$ are entire functions and are strictly positive for real arguments, one can take the second logarithmic derivative with respect to $s$ and $t$ of (4.6). One then finds that
\[ \tilde{F}(s) = D \exp(\mu s^2/p' + \delta s), \qquad \tilde{G}(s) = E \exp(\mu s^2/q' + \epsilon s), \]
with constants $D$, $E$, $\mu$, $\delta$, $\epsilon$. With the inverse Laplace transform, this leads to Eq. (4.2). Q.E.D.
Remarks. Obviously, the uniqueness of the Gaussians can be proved in the same way for multiple convolutions, as in Eq. (2.23) and Theorem 8. However, the above proof fails for the general case of Theorem 1, if
$k \ge n + 2$. Then introduction of the Laplace transform in an equality like (4.5) does not lead to a simple product, as in Eq. (4.6). Theorem 13 does not extend to the case in which $p$ or $q$ is one.
5. APPLICATIONS
5.1 A Theorem of Nelson
We are now in the position to give a simple proof of Nelson's hypercontractivity theorem [13]. On $\mathbb{R}$, consider the Gaussian measure
\[ d\mu(x) = (2\pi)^{-1/2} e^{-x^2/2}\, dx, \]
with the corresponding spaces $L^q(\mathbb{R}, \mu)$. If $f \in L^q(\mathbb{R}, \mu)$, the map $\Gamma(c)$, $0 < c < 1$, is defined by
\[ (\Gamma(c) f)(x) = [2\pi(1 - c^2)]^{-1/2} \int_{-\infty}^{\infty} \exp\Big[-\frac{(y - cx)^2}{2(1 - c^2)}\Big] f(y)\, dy. \]
THEOREM 14. Let $1 < q \le p < \infty$. Then $\Gamma(c)$ is a contraction from $L^q(\mathbb{R}, \mu)$ to $L^p(\mathbb{R}, \mu)$ if
\[ c \le [(q - 1)/(p - 1)]^{1/2}. \tag{5.1} \]
The contraction constant is 1.
Proof. It has to be shown that
\[ (2\pi)^{-1} (1 - c^2)^{-1/2} \int\!\!\int \exp\Big[-\frac{x^2}{2} - \frac{(y - cx)^2}{2(1 - c^2)}\Big] f(y)\, g(x)\, dx\, dy \le \|f\|_{q,\mu} \|g\|_{p',\mu} \tag{5.2} \]
with $1/p + 1/p' = 1$. If we write
\[ F(x) = f(x) \exp(-x^2/2q), \qquad G(x) = g(x) \exp(-x^2/2p'), \]
we are in the situation of Theorem 6; however, for that theorem to apply, the quadratic form
\[ -\frac{y^2}{2q} + \frac{x^2}{2} + \frac{(y - cx)^2}{2(1 - c^2)} - \frac{x^2}{2p'} \]
must be non-negative definite. This is equivalent to the condition (5.1).
If it fails, one can choose Gaussians for $F$ and $G$ such that the left side of Eq. (5.2) diverges.
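The equivalence between non-negative definiteness of the quadratic form and condition (5.1) can be checked on sample values. In the sketch below (an illustration only) the form $-y^2/2q + x^2/2 + (y-cx)^2/2(1-c^2) - x^2/2p'$ is written as $\tfrac12(a x^2 + 2bxy + dy^2)$, using $1 - 1/p' = 1/p$, and a symmetric $2\times2$ form is non-negative exactly when $a \ge 0$, $d \ge 0$ and $ad - b^2 \ge 0$. The function names and test values are assumptions.

```python
def form_is_nonnegative(p, q, c):
    """Test whether the quadratic form in the proof of Theorem 14 is
    non-negative definite, for exponents 1 < q <= p and 0 < c < 1."""
    s = 1.0 - c * c
    a = 1.0 / p + c * c / s    # x^2 coefficient: 1 - 1/p' + c^2/(1 - c^2)
    d = 1.0 / s - 1.0 / q      # y^2 coefficient
    b = -c / s                 # xy coefficient
    return a >= 0.0 and d >= 0.0 and a * d - b * b >= 0.0


def nelson_condition(p, q, c):
    """Condition (5.1): c <= [(q - 1)/(p - 1)]^(1/2)."""
    return c * c <= (q - 1.0) / (p - 1.0)


# Sample exponents q = 2, p = 4, for which the threshold is c = 3^(-1/2).
p, q = 4.0, 2.0
samples = [0.1, 0.3, 0.5, 0.6, 0.8, 0.95]
agreement = [form_is_nonnegative(p, q, c) == nelson_condition(p, q, c) for c in samples]
```

Values of $c$ very close to the threshold are avoided in the samples, since the comparison there is sensitive to rounding.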
Now let us assume (5.1) to hold. If we put $f(x) = \exp(-\alpha x^2/2)$, $g(x) = \exp(-\beta x^2/2)$, the ratio of the left and right sides of Eq. (5.2) is
\[ \big\{ (\beta p' + 1)^{1/p'} (\alpha q + 1)^{1/q}\, [\alpha\beta(1 - c^2) + \alpha + \beta + 1]^{-1} \big\}^{1/2}. \tag{5.3} \]
This expression reaches an extremum if
\[ (\beta p' + 1)^{-1} = [\alpha(1 - c^2) + 1]\, [\alpha\beta(1 - c^2) + \alpha + \beta + 1]^{-1}; \]
\[ (\alpha q + 1)^{-1} = [\beta(1 - c^2) + 1]\, [\alpha\beta(1 - c^2) + \alpha + \beta + 1]^{-1}. \]
It is ensured by the general concavity argument in Section 2.3 that any solution to these equations gives the absolute maximum; hence we can take $\alpha = \beta = 0$ and the maximal ratio of the left and right sides of Eq. (5.2) is 1. Q.E.D.
5.2 The Anharmonic Crystal in Statistical Mechanics
We consider a $d$-dimensional crystal of size $L$. This means that we have $N = L^d$ particles. The equilibrium position of the $n$th particle is the vector $n = \{n_1,\dots,n_d\} \in \mathbb{Z}^d$, with $0 \le n_j \le L - 1$, $j = 1,\dots,d$. The vector $n$ labels the particles and the $n$'s are distinct. We assume that each particle has a one-dimensional motion with coordinate $x_n$. Neighboring particles interact through a potential $\phi(x_n - x_m)$, $\phi(x) = \phi(-x)$. Let us take periodic boundary conditions, that is, particles numbered $(n_1,\dots,L-1,\dots,n_d)$ and $(n_1,\dots,0,\dots,n_d)$ interact. Fixing the center of mass, we define the partition function
\[ Z_N(\phi) = \int_{\mathbb{R}^N} d^N x\ \delta\Big(N^{-1/2} \sum_n x_n\Big) \exp\Big[-\sum \phi(x_n - x_m)\Big], \]
where the summation in the exponential extends over all pairs of nearest neighbours. We now apply Theorem 6, with the $\delta$-function playing the role of a fixed Gaussian. We get
\[ Z_N(\phi) \le \sup_{\gamma} \big\{ \|e^{-\phi}\|_p \big/ \|e^{-\gamma x^2}\|_p \big\}^{Nd}\, Z_N(\gamma x^2). \tag{5.4} \]
Note that we have chosen all exponents $p_j$ in Theorem 6 to be equal; then, by symmetry, the Gaussians giving the best constant $E$ can all be taken the same. The condition on the $p_j$ in Theorem 6 now becomes
\[ N - 1 \le Nd/p \le N. \]
However, the expression on the right side of Eq. (5.4) contains $\gamma^{-(N-1)/2 + Nd/2p}$ as a factor; hence, the supremum is finite only if $Nd/p = N - 1$, so that
\[ Z_N(\phi) \le \big\{ \|e^{-\phi}\|_{Nd/(N-1)} \big/ \|e^{-x^2}\|_{Nd/(N-1)} \big\}^{Nd}\, Z_N(x^2). \]
The partition function for the harmonic crystal, $Z_N(x^2)$, is easily computed by going over to normal modes; one has
\[ \sum (x_n - x_m)^2 = \sum_k \omega_k |q_k|^2, \]
where
\[ q_k = N^{-1/2} \sum_n x_n \exp[2\pi i \langle k, n\rangle / L], \qquad \omega_k = 2\Big[d - \sum_{j=1}^{d} \cos(2\pi k_j/L)\Big], \]
\[ k = \{k_1,\dots,k_d\} \in \mathbb{Z}^d, \qquad 0 \le k_j \le L - 1. \]
Hence
\[ Z_N(x^2) = \prod_{k \ne 0} (\pi/\omega_k)^{1/2}. \]
For the free energy,
\[ f(\phi) = -\lim_{N \to \infty} N^{-1} \log Z_N(\phi), \]
this gives the lower bound
\[ f(\phi) \ge -d \log \|e^{-\phi}\|_d - \tfrac12 \log(d/2) + \tfrac12 \int_0^1 dk_1 \cdots \int_0^1 dk_d\, \log\Big(d - \sum_{j=1}^{d} \cos 2\pi k_j\Big). \]
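The normal-mode identity used for the harmonic crystal can be illustrated in the simplest case $d = 1$, where the nearest-neighbour pairs of a periodic chain of $L \ge 3$ sites are $(n, n+1 \bmod L)$. The sketch below is an illustration under that assumption; the sample coordinates and function names are hypothetical.

```python
import cmath
import math


def pair_sum(x):
    """Sum of (x_n - x_m)^2 over nearest-neighbour pairs of a periodic chain."""
    L = len(x)
    return sum((x[n] - x[(n + 1) % L]) ** 2 for n in range(L))


def mode_sum(x):
    """Sum over k of omega_k |q_k|^2 with omega_k = 2[1 - cos(2 pi k / L)]."""
    L = len(x)
    total = 0.0
    for k in range(L):
        q_k = sum(x[n] * cmath.exp(2j * math.pi * k * n / L) for n in range(L)) / math.sqrt(L)
        omega_k = 2.0 * (1.0 - math.cos(2.0 * math.pi * k / L))
        total += omega_k * abs(q_k) ** 2
    return total


def log_Z_harmonic(L):
    """log Z_N(x^2) = (1/2) sum over k != 0 of log(pi / omega_k), for d = 1."""
    return 0.5 * sum(
        math.log(math.pi / (2.0 * (1.0 - math.cos(2.0 * math.pi * k / L))))
        for k in range(1, L)
    )


x = [0.3, -1.2, 2.0, 0.7, -0.5]   # hypothetical displacements, L = 5
lhs = pair_sum(x)
rhs = mode_sum(x)
```

For $L = 3$ both nonzero modes have $\omega_k = 3$, so $Z_3(x^2) = \pi/3$.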
Note added in proof. In the original version of Theorem 13 we did not include the term $i\delta x$ in (4.2). We are indebted to J. Fournier for pointing out this oversight to us.
REFERENCES
1. W. BECKNER, Inequalities in Fourier analysis, Ann. of Math. 102 (1975), 159-182.
2. L. LEINDLER, On a certain converse of Hölder's inequality, in "Linear Operators and Approximation," Proceedings of the 1971 Oberwolfach Conference, Birkhäuser Verlag, Basel-Stuttgart, 1972.
3. A. PRÉKOPA, Logarithmic concave measures with application to stochastic programming, Acta Sci. Math. Szeged 32 (1971), 301-315.
4. L. LEINDLER, On a certain converse of Hölder's inequality. II, Acta Sci. Math. Szeged 33 (1972), 217-223.
5. A. PRÉKOPA, On logarithmic measures and functions, Acta Sci. Math. Szeged 34 (1973), 335-343.
6. G. H. HARDY, J. E. LITTLEWOOD, AND G. PÓLYA, "Inequalities," Cambridge University Press, London and New York, 1952.
7. F. RIESZ, Sur une inégalité intégrale, J. L.M.S. 5 (1930), 162-168.
8. S. SOBOLEV, On a theorem of functional analysis, Mat. Sb. (N.S.) 4 (1938), 471-497; Amer. Math. Soc. Transl. 34, 2 (1963), 39-68.
9. H. J. BRASCAMP, E. H. LIEB, AND J. M. LUTTINGER, A general rearrangement inequality for multiple integrals, J. Funct. Anal. 17 (1974), 227-237.
10. P. R. CHERNOFF, Advanced problems and solutions, Amer. Math. Monthly 81 (1974), 1038-1039.
11. Y. RINOTT, Thesis, Tel Aviv, 1973.
12. H. J. BRASCAMP AND E. H. LIEB, Some inequalities for Gaussian measures and the long range order of the one-dimensional plasma, in "Functional Integration and Its Applications" (A. M. Arthurs, Ed.), Clarendon, Oxford, 1975.
13. E. NELSON, The free Markoff field, J. Funct. Anal. 12 (1973), 211-227.
With H.J. Brascamp in J. Funct. Anal. 22, 366-389 (1976) JOURNAL OF FUNCTIONAL ANALYSIS 22, 366-389 (1976)
On Extensions of the Brunn-Minkowski and Prekopa-Leindler Theorems, Including Inequalities for Log Concave Functions, and with an Application to the Diffusion Equation
HERM JAN BRASCAMP*
Department of Physics, Princeton University, Princeton, N.J. 08540
AND
ELLIOTT H. LIEB*
Departments of Mathematics and Physics, Princeton University, Princeton, N.J. 08540
Communicated by the Editors
We extend the Prekopa-Leindler theorem to other types of convex combinations of two positive functions and we strengthen the Prekopa-Leindler and Brunn-Minkowski theorems by introducing the notion of essential addition.
Our proof of the Prekopa-Leindler theorem is simpler than the original one. We sharpen the inequality that the marginal of a log concave function is log concave, and we prove various moment inequalities for such functions. Finally, we use these results to derive inequalities for the fundamental solution of the diffusion equation with a convex potential.
1. INTRODUCTION
In this paper we give various extensions of the Brunn-Minkowski and Prékopa-Leindler theorems. The Brunn-Minkowski theorem for the convex addition $D = \lambda A + (1-\lambda)B := \{x \in \mathbb{R}^n \mid x = \lambda y + (1-\lambda)z,\ y \in A,\ z \in B\}$ of two nonempty, measurable sets $A, B \subset \mathbb{R}^n$ reads [1, 2]
\[ \mu_n(D)^{1/n} \ge \lambda \mu_n(A)^{1/n} + (1-\lambda) \mu_n(B)^{1/n}, \tag{1.1} \]
where $\mu_n$ means Lebesgue measure in $\mathbb{R}^n$. The requirement that $A$ and $B$ are nonempty is crucial. The Prékopa-Leindler theorem [3, 4, 5] reads
\[ \|k\|_1 \ge \|f\|_1^{\lambda}\, \|g\|_1^{1-\lambda}, \tag{1.2} \]
where
\[ k(x \mid f, g) = \sup_{y \in \mathbb{R}^n} f\Big(\frac{x-y}{\lambda}\Big)^{\lambda} g\Big(\frac{y}{1-\lambda}\Big)^{1-\lambda} \tag{1.3} \]
and $f$, $g$ are nonnegative, measurable functions on $\mathbb{R}^n$. If $f$ and $g$ are the characteristic functions of $A$ and $B$, respectively, $k$ is the characteristic function of $D$. Thus, Eq. (1.2) states that $\mu_n(D) \ge 1$ if $\mu_n(A) = \mu_n(B) = 1$. By the scaling property, $\mu_n(\lambda A) = \lambda^n \mu_n(A)$. Thus Eq. (1.2) implies Eq. (1.1). In that sense, the Prékopa-Leindler theorem can be viewed as an extension of the Brunn-Minkowski theorem.
* Work partially supported by National Science Foundation Grant MPS71-03375 A03 at M.I.T.
These theorems are extended here in the following ways.
EXTENSION 1.
The sup in Eq. (1.3) is replaced by ess sup:
\[ h(x \mid f, g) = \operatorname*{ess\,sup}_{y \in \mathbb{R}^n} f\Big(\frac{x-y}{\lambda}\Big)^{\lambda} g\Big(\frac{y}{1-\lambda}\Big)^{1-\lambda}. \tag{1.4} \]
The Prékopa-Leindler theorem strengthened in this way is contained in Theorems 3.2 and 3.3. Our new version really is stronger than the old; generally, $\|h\|_1 \le \|k\|_1$, and there are functions $f$ and $g$ such that $h$ differs greatly from $k$. It is a fact, however, established in the Appendix, that $f$ and $g$ can always be replaced by functions $f^*$ and $g^*$ which differ only by null functions from $f$ and $g$ such that
\[ h(x \mid f, g) = h(x \mid f^*, g^*) = k(x \mid f^*, g^*). \]
Thus, once one knows how to construct $f^*$ and $g^*$, the strengthened Prékopa-Leindler theorem follows from the known one.
However, we prefer to work with the essential supremum $h$, because (1) $h(x)$ is unaltered if null functions are added to $f$ and $g$, and (2) $h(x)$ is lower semicontinuous for any measurable $f$ and $g$. The supremum $k$ has neither property. By taking characteristic functions for $f$ and $g$, a stronger form of the Brunn-Minkowski theorem results; as above, it can be derived from the known theorem (see the Appendix). The proof given here of the Prékopa-Leindler theorem is based on the Brunn-Minkowski theorem; it is simpler than the original proof by Prékopa and Leindler.
The idea of our proof is already contained in [6]. Another (rather involved) proof of the strengthened Prékopa-Leindler theorem is given by us in [7].
EXTENSION 2. Other types of convex combinations, $h_\alpha$, of two functions $f$ and $g$ are defined for $\alpha \in [-\infty, \infty]$; see Eqs. (2.1-2.3). The convex combination in Eq. (1.4) is the case $\alpha = 0$. In Section 3 theorems of the Prékopa-Leindler type are given for general $\alpha$ (Theorems 3.1-3.3). A Brunn-Minkowski-like version of these theorems is contained in Corollary 3.4. For the case $\alpha = 0$ and with sup instead of ess sup, it was first given by Prékopa [2, 4]. A much simpler proof for that case was found by Rinott [8]; his proof is completely different from ours. Rinott also found the case $\alpha = -1/n$ in Corollary 3.4. Moreover, he found
,
Corollary 3.4, saying that Eq. (3.8) for all A, B implies of a log concave density function. In Section 4 we consider log concave functions. A corollary of the Prékopa-Leindler theorem is that $\int F(x, y)\, dy$ is log concave in $x$ if $F(x, y)$ is log concave in $(x, y)$. This result is sharpened in Theorem
4.2. In Theorem 4.1 a Sobolev-type inequality for log concave measures is given. Some theorems on log concave functions have counterparts for log
convex functions (Theorems 4.3, 5.1, and 6.1). However, these counterparts are comparatively trivial; they essentially follow from the usual convexity arguments (Hölder's inequality). We stress that the log concave theorems and other Brunn-Minkowski and Prékopa-Leindler-like theorems do not follow trivially from Hölder's inequality.
In Section 5 we give inequalities for the moments of a Gaussian distribution, compared with the moments of the same distribution perturbed by a log concave (or log convex) function (Theorem 5.1). In Section 6 we give an application to the diffusion equation in $\mathbb{R}^n$ with convex potential. More applications (the Ising model, the one dimensional Coulomb plasma) are given in [6].
2. NOTATION
Given nonnegative measurable functions $f(x)$, $g(x)$ on $\mathbb{R}^n$, we shall introduce various convex combinations of them, parametrized by the real number $\alpha \in [-\infty, \infty]$. With $0 < \lambda < 1$, we define
\[ h_\alpha(x \mid f, g) = \operatorname*{ess\,sup}_{y \in \mathbb{R}^n} \Big\{ \lambda f\Big(\frac{x-y}{\lambda}\Big)^{\alpha} \oplus (1-\lambda)\, g\Big(\frac{y}{1-\lambda}\Big)^{\alpha} \Big\}^{1/\alpha}. \tag{2.1} \]
The symbol $\oplus$ differs from the ordinary addition $+$ in that for $f = 0$ or $g = 0$,
\[ \{\lambda f^\alpha \oplus (1-\lambda) g^\alpha\}^{1/\alpha} = 0. \tag{2.2} \]
Otherwise, $\oplus$ and $+$ are the same: For $f > 0$ and $g > 0$,
\[ \{\lambda f^\alpha \oplus (1-\lambda) g^\alpha\}^{1/\alpha} = \{\lambda f^\alpha + (1-\lambda) g^\alpha\}^{1/\alpha}, \quad \text{if } -\infty < \alpha < 0 \text{ or } 0 < \alpha < \infty; \]
\[ = \min(f, g), \quad \text{if } \alpha = -\infty; \]
\[ = \max(f, g), \quad \text{if } \alpha = \infty; \]
\[ = f^{\lambda} g^{1-\lambda}, \quad \text{if } \alpha = 0. \tag{2.3} \]
Note that $\oplus$ and $+$ are completely identical for $\alpha \le 0$; however, for $\alpha > 0$ Eq. (2.2) makes them essentially different. Note further that
\[ h_\alpha(x) \le h_\beta(x), \quad \text{if } \alpha \le \beta. \]
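The pointwise combination defined by Eqs. (2.2) and (2.3) can be written out as a small function. The sketch below is only an illustration for two nonnegative numbers (the function name and test values are chosen here); it shows the convention that the result is zero whenever one argument vanishes, and the monotonicity in $\alpha$ that underlies $h_\alpha \le h_\beta$.

```python
import math


def oplus_mean(f, g, lam, alpha):
    """{lam f^alpha (+) (1-lam) g^alpha}^(1/alpha) for numbers f, g >= 0,
    following Eq. (2.2) (zero if f or g vanishes) and Eq. (2.3)."""
    if f == 0 or g == 0:
        return 0.0                                  # Eq. (2.2)
    if alpha == -math.inf:
        return float(min(f, g))
    if alpha == math.inf:
        return float(max(f, g))
    if alpha == 0:
        return f ** lam * g ** (1.0 - lam)
    return (lam * f ** alpha + (1.0 - lam) * g ** alpha) ** (1.0 / alpha)


alphas = [-math.inf, -2.0, -1.0, 0.0, 1.0, 2.0, math.inf]
values = [oplus_mean(2.0, 8.0, 0.3, a) for a in alphas]

# With ordinary addition and alpha > 0 the result would be positive when f = 0.
plain = (0.7 * 5.0 ** 2.0) ** (1.0 / 2.0)
with_oplus = oplus_mean(0.0, 5.0, 0.3, 2.0)
```

Here `values` is nondecreasing from $\min(f,g) = 2$ to $\max(f,g) = 8$, while `with_oplus` is zero although `plain` is not.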
We shall often write $h_\alpha(f, g)$, $h_\alpha(x)$ or $h_\alpha$ if the dependence of $h_\alpha(x \mid f, g)$ on $x$, $f$ and $g$, or both is obvious. The dependence on $\lambda$ is not displayed, $\lambda$ being held fixed. As a particular case, take for $f$ and $g$ characteristic functions of measurable sets $A, B \subset \mathbb{R}^n$: $f = \chi_A$, $g = \chi_B$. Then by Eqs. (2.2, 2.3),
\[ \{\lambda f^\alpha \oplus (1-\lambda) g^\alpha\}^{1/\alpha} = 0 \ \text{or} \ 1, \]
independent of $\alpha$. Hence, there is a set $C$ such that
\[ h_\alpha(\chi_A, \chi_B) = \chi_C, \qquad \forall \alpha. \]
We shall use the notation $C = \operatorname{ess}\{\lambda A + (1-\lambda)B\}$.
To stress the difference with the ordinary Brunn-Minkowski addition we give appropriate definitions:
\[ \lambda A + (1-\lambda)B = \{x \in \mathbb{R}^n \mid (x - \lambda A) \cap (1-\lambda)B \ne \emptyset\}, \]
\[ \operatorname{ess}\{\lambda A + (1-\lambda)B\} = \{x \in \mathbb{R}^n \mid \mu_n[(x - \lambda A) \cap (1-\lambda)B] > 0\}. \tag{2.4} \]
The ordinary addition results if ess sup in Eq. (2.1) is replaced by sup. The ordinary and the essential additions may differ considerably, as can be seen by taking for $A$ a single point. However, there always exist sets $A^*$ and $B^*$ which differ from $A$ and $B$ by null sets and such that
\[ A^* + B^* = \operatorname{ess}(A^* + B^*) = \operatorname{ess}(A + B) \tag{2.5} \]
(see the Appendix). Equation (2.5) and the Brunn-Minkowski theorem, Eq. (1.1), immediately imply the strengthened Brunn-Minkowski theorem
\[ \mu_n(C)^{1/n} \ge \lambda \mu_n(A)^{1/n} + (1-\lambda) \mu_n(B)^{1/n}, \tag{2.6} \]
if $\mu_n(A) > 0$, $\mu_n(B) > 0$. In the next section we show how Eq. (2.6) extends to inequalities for $\|h_\alpha\|_1$ in terms of $\|f\|_1$ and $\|g\|_1$.
3. INEQUALITIES FOR $\|h_\alpha\|_1$
The following theorem is basic.
THEOREM 3.1. Let $f$, $g$ be nonnegative, measurable functions on $\mathbb{R}$ and define $h_{-\infty}$ as in Eqs. (2.1-2.3):
\[ h_{-\infty}(x) = \operatorname*{ess\,sup}_{y \in \mathbb{R}} \min\Big\{ f\Big(\frac{x-y}{\lambda}\Big),\ g\Big(\frac{y}{1-\lambda}\Big) \Big\}. \]
Let $\|f\|_\infty = \|g\|_\infty = m$. Then
\[ \|h_{-\infty}\|_1 \ge \lambda \|f\|_1 + (1-\lambda) \|g\|_1. \]
Proof. For $z > 0$, define the sets
\[ A(z) = \{x \in \mathbb{R} \mid f(x) > z\}, \qquad B(z) = \{x \in \mathbb{R} \mid g(x) > z\}, \qquad D(z) = \{x \in \mathbb{R} \mid h_{-\infty}(x) > z\}. \]
Then
\[ D(z) \supset \operatorname{ess}\{\lambda A(z) + (1-\lambda) B(z)\}, \]
by the definitions of $h_{-\infty}$ and of the essential addition.
If $z < m$, $\mu_1(A(z)) > 0$ and $\mu_1(B(z)) > 0$. Thus, by Eq. (2.6),
\[ \mu_1(D(z)) \ge \lambda \mu_1(A(z)) + (1-\lambda) \mu_1(B(z)). \]
Note, further, that $\mu_1(D(z)) = \mu_1(A(z)) = \mu_1(B(z)) = 0$ for $z \ge m$, and that
\[ \|f\|_1 = \int_0^\infty \mu_1(A(z))\, dz, \ \text{etc.} \]
This gives the desired result.
Q.E.D.
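By a simple rescaling, Theorem 3.1 immediately leads to
THEOREM 3.2. Let $f$, $g$ be nonnegative measurable functions on $\mathbb{R}$ and define $h_\alpha$ as in Eqs. (2.1-2.3). Let $\|f\|_1 > 0$, $\|g\|_1 > 0$. Then, for $\alpha \ge -1$,
\[ \|h_\alpha\|_1 \ge \{\lambda \|f\|_1^{\beta} + (1-\lambda) \|g\|_1^{\beta}\}^{1/\beta}, \tag{3.1} \]
with $\beta = \alpha/(1 + \alpha)$. In particular,
\[ \|h_0\|_1 \ge \|f\|_1^{\lambda}\, \|g\|_1^{1-\lambda}. \tag{3.2} \]
Proof. It is sufficient to consider bounded functions $f$ and $g$, since any $f$, $g$ can be approximated from below in $L^1$ by bounded functions. Now define
\[ F(x) = f(x)/\|f\|_\infty; \qquad G(x) = g(x)/\|g\|_\infty. \]
Let us first consider the case $\alpha \ne 0$. Then
\[ h_\alpha(x \mid f, g) = \operatorname*{ess\,sup}_{y \in \mathbb{R}} \Big\{ \lambda \|f\|_\infty^{\alpha} F\Big(\frac{x-y}{\lambda}\Big)^{\alpha} \oplus (1-\lambda) \|g\|_\infty^{\alpha} G\Big(\frac{y}{1-\lambda}\Big)^{\alpha} \Big\}^{1/\alpha} \]
\[ = [\lambda \|f\|_\infty^{\alpha} + (1-\lambda) \|g\|_\infty^{\alpha}]^{1/\alpha}\ \operatorname*{ess\,sup}_{y \in \mathbb{R}} \Big\{ \theta F\Big(\frac{x-y}{\lambda}\Big)^{\alpha} \oplus (1-\theta)\, G\Big(\frac{y}{1-\lambda}\Big)^{\alpha} \Big\}^{1/\alpha}, \]
with the obvious meaning of $\theta$, $0 < \theta < 1$. Thus
\[ h_\alpha(x \mid f, g) \ge [\lambda \|f\|_\infty^{\alpha} + (1-\lambda) \|g\|_\infty^{\alpha}]^{1/\alpha}\, h_{-\infty}(x \mid F, G), \]
and by Theorem 3.1
\[ \|h_\alpha\|_1 \ge [\lambda \|f\|_\infty^{\alpha} + (1-\lambda) \|g\|_\infty^{\alpha}]^{1/\alpha} \Big[ \lambda \frac{\|f\|_1}{\|f\|_\infty} + (1-\lambda) \frac{\|g\|_1}{\|g\|_\infty} \Big]. \tag{3.3} \]
Now Eq. (3.1) for $-1 \le \alpha < 0$ or $0 < \alpha < \infty$ follows by Hölder's inequality.
For $\alpha = 0$,
\[ h_0(f, g) = \|f\|_\infty^{\lambda} \|g\|_\infty^{1-\lambda}\, h_0(F, G) \ge \|f\|_\infty^{\lambda} \|g\|_\infty^{1-\lambda}\, h_{-\infty}(F, G). \]
Then Theorem 3.1 gives
\[ \|h_0\|_1 \ge \|f\|_\infty^{\lambda} \|g\|_\infty^{1-\lambda} \Big[ \lambda \frac{\|f\|_1}{\|f\|_\infty} + (1-\lambda) \frac{\|g\|_1}{\|g\|_\infty} \Big], \tag{3.4} \]
and Eq. (3.2) follows by the arithmetic-geometric mean inequality. Q.E.D.
Remarks. 1. Equation (3.3) (supplemented with Eq. (3.4) for $\alpha = 0$) holds for all $\alpha \in [-\infty, \infty]$. The restriction $\alpha \ge -1$ arises from the final application of Hölder's inequality.
2. Theorem 3.2 does not hold if $\alpha > 0$, $\|f\|_1 = 0$, $\|g\|_1 > 0$; in that case $h_\alpha = 0$. Analogously, the extended Brunn-Minkowski theorem [Eq. (2.6)] is not true if $A$ or $B$ has measure zero.
The $n$-dimensional version of Theorem 3.2 reads thus.
THEOREM 3.3. Let $f$, $g$ be nonnegative measurable functions on $\mathbb{R}^n$ and define $h_\alpha$ as in Eqs. (2.1-2.3). Let $\|f\|_1 > 0$, $\|g\|_1 > 0$. Then for $\alpha \ge -1/n$,
\[ \|h_\alpha\|_1 \ge \{\lambda \|f\|_1^{\gamma} + (1-\lambda) \|g\|_1^{\gamma}\}^{1/\gamma}, \tag{3.5} \]
with $\gamma = \alpha/(1 + n\alpha)$. In particular, $\|h_0\|_1 \ge \|f\|_1^{\lambda} \|g\|_1^{1-\lambda}$.
Proof.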
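Write $R^n \ni x = (y, z)$, with $y \in R$, $z \in R^{n-1}$. Define
$$F(z) = \int dy\, f(y,z); \qquad G(z) = \int dy\, g(y,z). \tag{3.6}$$
Since
$$h_\alpha(y, z \mid f, g) = \operatorname*{ess\,sup}_{w \in R^{n-1}}\ \operatorname*{ess\,sup}_{v \in R} \Big\{\lambda f\Big(\frac{y-v}{\lambda}, \frac{z-w}{\lambda}\Big)^{\alpha} \oplus (1-\lambda)\, g\Big(\frac{v}{1-\lambda}, \frac{w}{1-\lambda}\Big)^{\alpha}\Big\}^{1/\alpha},$$
it follows from Theorem 3.2 that
$$\int dy\, h_\alpha(y, z \mid f, g) \ge h_\beta(z \mid F, G), \tag{3.7}$$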
447
With H.J. Brascamp in J. Funct. Anal. 22, 366-389 (1976) LOG CONCAVE FUNCTIONS
373
with $\beta = \alpha/(\alpha + 1)$. Note that we used that
$$\int dy\ \operatorname*{ess\,sup}_{w} \ \ge\ \operatorname*{ess\,sup}_{w} \int dy.$$
Note further that Theorem 3.2 does not apply if $z$ and $w$ are such that $F((z - w)/\lambda) = 0$ or $G(w/(1 - \lambda)) = 0$. However, Eq. (3.7) is saved by the $\oplus$ sign in the definition of $h_\beta$ [cf. Eq. (2.2)].
If we assume Theorem 3.3 to be true for $n - 1$, we have that
$$\|h_\beta(F, G)\|_1 \ge \{\lambda \|F\|_1^{\gamma} + (1-\lambda)\|G\|_1^{\gamma}\}^{1/\gamma},$$
with $\gamma = \beta/[1 + (n-1)\beta] = \alpha/(1 + n\alpha)$. With Eqs. (3.6, 3.7) and Fubini's theorem, this leads to Eq. (3.5). Thus Theorem 3.3 is proved by induction. Q.E.D.
As an introduction to two corollaries of Theorem 3.3, let us define the classes of functions $K_\alpha(R^n)$.
DEFINITION. $K_\alpha(R^n)$ consists of the nonnegative, measurable functions $F$ on $R^n$ such that for all $\lambda \in (0, 1)$, $F = h_\alpha(F, F)$ a.e.
In more pedestrian terms, this means that $F$ has the following convexity properties (apart from null functions).
$\alpha = -\infty$: $F$ is unimodal, i.e., the sets $\{x \mid F(x) > z\}$ are convex.
$-\infty < \alpha < 0$: $F^{\alpha}$ is convex.
$\alpha = 0$: $F$ is logarithmically concave, i.e., $F(\lambda x + (1-\lambda)y) \ge F(x)^{\lambda} F(y)^{1-\lambda}$.
$0 < \alpha < \infty$: $F^{\alpha}$ is concave on a convex set, and $F(x) = 0$ outside this set.
$\alpha = \infty$: $F(x) = $ const. on a convex set, and $F(x) = 0$ outside this set.
Note that $K_\alpha \subset K_\beta$ if $\alpha > \beta$. This follows from Jensen's inequality.
COROLLARY 3.4. Let $A$, $B$ be measurable sets in $R^n$ of positive measure, and let $C = \operatorname{ess}\{\lambda A + (1 - \lambda)B\}$.
Let $F \in K_\alpha(R^n)$, $\alpha \ge -1/n$, and let
$$\mu_F(A) = \int_A F(x)\, dx.$$
Then, with $\gamma = \alpha/(1 + n\alpha)$,
$$\mu_F(C) \ge \{\lambda \mu_F(A)^{\gamma} + (1-\lambda)\mu_F(B)^{\gamma}\}^{1/\gamma}.$$
In particular, if $F$ is log concave,
$$\mu_F(C) \ge \mu_F(A)^{\lambda}\, \mu_F(B)^{1-\lambda}.$$
Proof. Let $f = F\chi_A$ and $g = F\chi_B$. Then $h_\alpha(f, g) \le \chi_C\, h_\alpha(F, F) = \chi_C F$. Apply Theorem 3.3 to complete the proof. Q.E.D.
EXAMPLES. (1) Let $F(x) \equiv 1 \in K_\infty$. Then $\gamma = 1/n$ and we recover the Brunn-Minkowski theorem, Eq. (2.6).
(2) Let $G(x) = \exp(-x^2) \in K_0$. Then in any $R^n$
$$\mu_G(C) \ge \mu_G(A)^{\lambda}\, \mu_G(B)^{1-\lambda}.$$
(3) Let $L(x) = (1 + x^2)^{-1} \in K_{-1/2}$. Then
$$\mu_L(C) \ge \{\lambda \mu_L(A)^{-1} + (1-\lambda)\mu_L(B)^{-1}\}^{-1}, \quad \text{in } R,$$
$$\mu_L(C) \ge \min\{\mu_L(A), \mu_L(B)\}, \quad \text{in } R^2.$$
COROLLARY 3.5. Let $F(x, y) \in K_\alpha(R^{m+n})$, $x \in R^m$, $y \in R^n$. Let
$$G(x) = \int_{R^n} F(x, y)\, dy.$$
Then $G \in K_\gamma(R^m)$, $\gamma = \alpha/(1 + n\alpha)$. In particular, if $F$ is log concave, so is $G$.
Proof. Since $F(x, y) > 0$ on a convex set in $R^{m+n}$, $G(x) > 0$ on a convex set in $R^m$. Now fix points $x_0$, $x_1$ in this set, and define $f(y) = F(x_1, y)$, $g(y) = F(x_0, y)$. Then $F(\lambda x_1 + (1 - \lambda)x_0, y) \ge h_\alpha(y \mid f, g)$. Now apply Theorem 3.3 to $h_\alpha(y \mid f, g)$. Q.E.D.
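Examples (2) and (3) of Corollary 3.4 can be sanity-checked numerically in one dimension. The sketch below is only an illustration under stated assumptions: the sets A and B are taken to be intervals (so that the essential sum is again an interval of the same measure), the endpoints and the value of λ are arbitrary choices, and the helper names (`gauss_measure`, `cauchy_measure`, `power_mean`) are hypothetical.

```python
import math

def gauss_measure(a, b):
    """Integral of exp(-x^2) over the interval [a, b], via the error function."""
    return 0.5 * math.sqrt(math.pi) * (math.erf(b) - math.erf(a))

def cauchy_measure(a, b):
    """Integral of 1/(1 + x^2) over the interval [a, b]."""
    return math.atan(b) - math.atan(a)

def combine(A, B, lam):
    """The interval lam*A + (1 - lam)*B for intervals A = (a1, a2), B = (b1, b2)."""
    return (lam * A[0] + (1 - lam) * B[0], lam * A[1] + (1 - lam) * B[1])

def power_mean(u, v, lam, gamma):
    """{lam*u^gamma + (1-lam)*v^gamma}^(1/gamma), with gamma = 0 read as the geometric mean."""
    if gamma == 0:
        return u ** lam * v ** (1 - lam)
    return (lam * u ** gamma + (1 - lam) * v ** gamma) ** (1.0 / gamma)

# Arbitrary illustrative data (assumptions, not taken from the paper).
A, B, lam = (-0.5, 1.0), (2.0, 2.7), 0.3
C = combine(A, B, lam)

# Example (2): log concave weight exp(-x^2), gamma = 0.
gauss_lhs = gauss_measure(*C)
gauss_rhs = power_mean(gauss_measure(*A), gauss_measure(*B), lam, 0)

# Example (3) in R: weight (1 + x^2)^(-1) lies in K_{-1/2}, so gamma = -1.
cauchy_lhs = cauchy_measure(*C)
cauchy_rhs = power_mean(cauchy_measure(*A), cauchy_measure(*B), lam, -1)

print(gauss_lhs, gauss_rhs)
print(cauchy_lhs, cauchy_rhs)
```

In both cases the first printed number should not be smaller than the second; a numerical check of this kind illustrates the inequalities but of course proves nothing.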
4. LOG CONCAVE FUNCTIONS AND MEASURES
In this section we prove a Sobolev-type inequality (Theorem 4.1) for log concave measures (i.e., measures given by a log concave density function). We shall write $F(x) = \exp[-f(x)]$, $x \in R^n$; $F(x)$ is log concave if $f(x)$ is convex. If $f(x)$ is twice continuously differentiable, this means that the second derivative matrix, $f_{xx}$, is nonnegative.
It is often convenient to write $R^{n+m} \ni x = (y, z)$, $y \in R^m$, $z \in R^n$. The matrix $f_{xx}$ is then partitioned in an obvious way as
$$f_{xx} = \begin{pmatrix} f_{yy} & f_{yz} \\ f_{zy} & f_{zz} \end{pmatrix}. \tag{4.1}$$
We shall often encounter
$$G(y) = \exp[-g(y)] = \int_{R^n} F(y, z)\, dz. \tag{4.2}$$
Then $G(y)$ is log concave by Corollary 3.5. A sharper form of this result will be given in Theorem 4.2. With $F$ as a density function, define
$$\langle A \rangle = \int A(x) F(x)\, dx \Big/ \int F(x)\, dx, \qquad \operatorname{var} A = \langle |A - \langle A \rangle|^2 \rangle, \qquad \operatorname{cov}(A, B) = \langle (A - \langle A \rangle)^{*}(B - \langle B \rangle) \rangle. \tag{4.3}$$
If $x = (y, z)$, $y \in R^m$, $z \in R^n$, we write
$$\langle A \rangle_z (y) = \int_{R^n} A(y, z) F(y, z)\, dz \Big/ \int_{R^n} F(y, z)\, dz, \qquad \langle B \rangle_y = \int_{R^m} B(y) G(y)\, dy \Big/ \int_{R^m} G(y)\, dy, \tag{4.4}$$
so that $\langle A \rangle = \langle \langle A \rangle_z \rangle_y$. In analogy with Eq. (4.3), $\operatorname{var}_y$, $\operatorname{cov}_y$, $\operatorname{var}_z$, and $\operatorname{cov}_z$ are defined.
THEOREM 4.1. Let $F(x) = \exp[-f(x)]$, $x \in R^n$, let $f$ be twice continuously differentiable and let $f$ be strictly convex. Let $f$ have a minimum, so that $F$ decreases exponentially in all directions; then
$$\int F(x)\, dx < \infty.$$
Let $h \in C^1(R^n)$, and let $\operatorname{var} h < \infty$. Then
$$\operatorname{var} h \le \langle (h_x, (f_{xx})^{-1} h_x) \rangle, \tag{4.5}$$
where the inner product is with respect to $C^n$, and $h_x$ denotes the gradient of $h$.
It is convenient to postpone the proof of Theorem 4.1 a moment. We prefer to give an immediate corollary first.
THEOREM 4.2. Let $F(x) = F(y, z) = \exp[-f(y, z)]$, $y \in R^m$, $z \in R^n$, satisfy the assumptions of Theorem 4.1. Moreover, let the integrals
$$\int_{R^n} (\phi, f_{yy}\phi)\, F\, dz, \qquad \int_{R^n} (\phi, f_y)^2\, F\, dz \tag{4.6}$$
converge uniformly in $y$ in a neighborhood of a given point $y_0 \in R^m$, for all vectors $\phi \in R^m$. Then, with the notation of Eqs. (4.1, 4.2, 4.4), $g(y)$ is twice continuously differentiable near $y_0$, and
$$g_{yy} \ge \langle f_{yy} - f_{yz}(f_{zz})^{-1} f_{zy} \rangle_z \tag{4.7}$$
as a matrix inequality.
Proof. We denote differentiation in a direction $t$ at $y_0$ by a subscript $t$. Then Eq. (4.7) is equivalent to saying that for all directions $t$
$$g_{tt} \ge \langle f_{tt} - (f_{tz}, f_{zz}^{-1} f_{tz}) \rangle_z.$$
By differentiating $g(y) = -\log \int F(y, z)\, dz$, one gets
$$g_{tt} = \langle f_{tt} \rangle_z - \operatorname{var}_z f_t. \tag{4.8}$$
The differentiation can be done under the integral sign by the uniform convergence of the integrals (4.6), which also ensures the continuity of $g_{tt}$. The result (4.7) follows by applying Theorem 4.1 with $h(z) = f_t(y_0, z)$. Q.E.D.
Remark. Even though $F$ is assumed to be a log concave function, decreasing exponentially in all directions, the convergence of the integrals (4.6) does not follow automatically. For example, define the convex function $\phi(x)$, $x \in R$, by $\phi(0) = \phi'(0) = 0$, and
$$\phi''(x) = \sum_{n \ne 0} a_n \delta(x - n), \qquad a_n > 0, \quad a_n = a_{-n}.$$
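Then
$$\int \phi''(x) \exp[-\phi(x)]\, dx = 2 \sum_{n=1}^{\infty} a_n \exp\Big[-\sum_{k=1}^{n-1} (n - k)\, a_k\Big],$$
which can be made divergent by an appropriate recursive definition of $a_n$. If we take
$$f(y, z) = y^2 + \phi(y + z), \qquad y, z \in R,$$
the integrals (4.6) obviously diverge for all $y$. The function $\phi$ can be approximated by a $C^2$ function without changing the conclusion.
Proof of Theorem 4.1. We can obviously restrict $h$ to be real valued. Let us first give the proof for $R^1$. If $f(x)$ has its unique minimum at $x = a$, write
$$h(x) - h(a) = f'(x)\, k(x).$$
Then $k(x)$ is continuously differentiable, except possibly at $x = a$. However, if we set $k(a) = h'(a)/f''(a)$, $k$ is continuous at $x = a$. Now
$$\int \frac{(h')^2}{f''}\, F\, dx = \int \Big[\frac{(k' f')^2}{f''} + 2 k k' f' + k^2 f''\Big] F\, dx$$
$$= \int \Big[\frac{(k' f')^2}{f''} + (k f')^2\Big] F\, dx + [k^2 f' F]_{-\infty}^{a} + [k^2 f' F]_{a}^{\infty} \ \ge\ \int [h(x) - h(a)]^2 F(x)\, dx.$$
Here the second line follows from one integration by parts, using $F' = -f' F$; the boundary terms are nonnegative because $f' < 0$ for $x < a$ and $f' > 0$ for $x > a$. Equation (4.5) follows by noting that $\operatorname{var} h \le \langle [h - h(a)]^2 \rangle$.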
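The one-dimensional case of Eq. (4.5), var h ≤ ⟨(h')²/f''⟩, can be illustrated numerically. The following sketch assumes a particular strictly convex f and test function h (both arbitrary choices, not taken from the paper) and uses a plain midpoint rule; the helper names `integrate` and `average` are hypothetical.

```python
import math

def integrate(func, lo=-8.0, hi=8.0, n=40000):
    """Composite midpoint rule on [lo, hi]; the weight is negligible outside."""
    step = (hi - lo) / n
    return step * sum(func(lo + (i + 0.5) * step) for i in range(n))

# Assumed example: f(x) = x^4/4 + x^2/2 + 0.3 x, so f''(x) = 3 x^2 + 1 > 0.
def f(x):
    return x ** 4 / 4 + x ** 2 / 2 + 0.3 * x

def f_second(x):
    return 3 * x ** 2 + 1

# Assumed test function h and its derivative.
def h(x):
    return math.sin(x) + 0.5 * x

def h_prime(x):
    return math.cos(x) + 0.5

def weight(x):
    return math.exp(-f(x))   # F = exp(-f)

Z = integrate(weight)

def average(func):
    """Average with respect to the normalized density F / Z."""
    return integrate(lambda x: func(x) * weight(x)) / Z

mean_h = average(h)
var_h = average(lambda x: (h(x) - mean_h) ** 2)
bound = average(lambda x: h_prime(x) ** 2 / f_second(x))

print(var_h, bound)
```

The first number printed (the variance) should not exceed the second (the right side of (4.5)); for h proportional to f' plus a constant the two sides coincide.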
Now assume that Theorem 4.1 has been proved for $x \in R^{n-1}$. Hence we also have Theorem 4.2 for $z \in R^{n-1}$ at our disposition. Write $R^n \ni x = (y, z)$, $y \in R$, $z \in R^{n-1}$. Then
$$\operatorname{var} h = \langle \operatorname{var}_z h \rangle_y + \operatorname{var}_y \langle h \rangle_z \tag{4.9}$$
with the notation of Eqs. (4.3, 4.4). Let us first restrict ourselves to functions $h$ with compact support. This has the advantage that $F$ can be modified outside the support of $h$ in such a way that it satisfies all the assumptions of Theorem 4.2
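for all $y$. Then $G(y) = \int F(y, z)\, dz$ satisfies the assumptions of Theorem 4.1, so that
$$\operatorname{var}_y \langle h \rangle_z \le \Big\langle \Big(\frac{d}{dy} \langle h \rangle_z\Big)^2 \Big/ g_{yy} \Big\rangle_y.$$
Now all differentiations can be carried out under the integral signs, since $h$ has compact support and $F$ has been appropriately modified. Thus we find (cf. Eq. (4.8))
$$\operatorname{var} h \le \langle B \rangle_y, \qquad B \equiv \operatorname{var}_z h + \frac{[\langle h_y \rangle_z - \operatorname{cov}_z(h, f_y)]^2}{\langle f_{yy} \rangle_z - \operatorname{var}_z f_y}.$$
Applying Theorem 4.1 for $z \in R^{n-1}$, with fixed $y \in R$, we have
$$\operatorname{var}_z H \le \langle (H_z, f_{zz}^{-1} H_z) \rangle_z.$$
Since this is true for $H = \lambda h + \mu f_y$, with arbitrary $\lambda$ and $\mu$, we get
$$B \le \langle (h_z, f_{zz}^{-1} h_z) \rangle_z + \frac{[\langle h_y - (h_z, f_{zz}^{-1} f_{zy}) \rangle_z]^2}{\langle f_{yy} - (f_{zy}, f_{zz}^{-1} f_{zy}) \rangle_z}.$$
Since $f$ is convex, the denominator above is positive and we can use Schwarz's inequality to obtain
$$B \le \Big\langle (h_z, f_{zz}^{-1} h_z) + \frac{[h_y - (h_z, f_{zz}^{-1} f_{zy})]^2}{f_{yy} - (f_{zy}, f_{zz}^{-1} f_{zy})} \Big\rangle_z = \langle (h_x, f_{xx}^{-1} h_x) \rangle_z. \tag{4.10}$$
Eq. (4.5) follows by combining Eqs. (4.9) and (4.10).
Now only the restriction that $h$ has compact support remains to be removed. As an intermediate step, let us show that for all $h$ and $F$ satisfying the assumptions of Theorem 4.1
$$\operatorname{var}_S h \le \langle (h_x, f_{xx}^{-1} h_x) \rangle_S, \tag{4.11}$$
where the averages are taken over a ball with radius $S$ centered at the origin, instead of over all $R^n$.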
Modify $h$ outside the ball smoothly to a function $k$ with compact support, and let
$$f^{(N)}(x) = f(x), \quad \text{if } |x| \le S; \qquad f^{(N)}(x) = f(x) + N(|x| - S)^4, \quad \text{if } |x| \ge S.$$
By our results until now, we have that
$$\operatorname{var}_N k \le \langle (k_x, (f^{(N)}_{xx})^{-1} k_x) \rangle_N$$
with averages with respect to the weight $\exp[-f^{(N)}(x)]$. Equation (4.11) is proved by taking the limit $N \to \infty$ and using the monotone convergence theorem.
Now let $S \to \infty$ in Eq. (4.11). Then $\operatorname{var}_S h \to \operatorname{var} h$, and
$$\int_{|x| \le S} (h_x, f_{xx}^{-1} h_x)\, F\, dx$$
increases (it may actually increase to $\infty$). This concludes the proof. Q.E.D.
EXAMPLES.
1. Let $M_{ij} = \operatorname{cov}(x_i, x_j)$. Then we have the matrix inequality
$$M \le \langle (f_{xx})^{-1} \rangle, \tag{4.12}$$
as can be seen by taking $h(x) = (\phi, x)$ for any $\phi \in R^n$ in Theorem 4.1. As a curiosity, compare (4.12) with the one dimensional inequality
$$\operatorname{var} x \ge \langle f'' \rangle^{-1}, \tag{4.13}$$
which holds for general weights $F$. The proof is
$$1 = [\operatorname{cov}(x, f')]^2 \le \operatorname{var} f' \, \operatorname{var} x = \langle f'' \rangle \operatorname{var} x,$$
with Schwarz's inequality and two integrations by parts.
2. For the Gaussian weight $F(x) = \exp[-(x, Ax)]$,
$$\operatorname{var} h \le \langle (h_x, (2A)^{-1} h_x) \rangle. \tag{4.14}$$
In particular, if $F(x) = \exp[-(x, x)/2]$,
$$\operatorname{var} h \le \langle |h_x|^2 \rangle. \tag{4.15}$$
3. If $F(x) = \exp[-(x, Ax)]$, $M = (2A)^{-1}$, and thus the inequality in (4.12) holds as an equality.
4. The analog of Example 3 in the setting of Theorem 4.2 concerns the Gaussian
$$\Phi(x, y) = \exp\Big[-\Big((x, y), \begin{pmatrix} A & B \\ B^{*} & C \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}\Big)\Big], \qquad (x, y) \in R^m \times R^n, \tag{4.16}$$
with a real, positive matrix $\begin{pmatrix} A & B \\ B^{*} & C \end{pmatrix}$. Then
$$\int \Phi(x, y)\, dy = \text{const.} \exp[-(x, Dx)], \tag{4.17}$$
with
$$D = A - BC^{-1}B^{*}. \tag{4.18}$$
Thus for Gaussians the equality sign in Eq. (4.7) holds.
THEOREM 4.3. With the notation of Eqs. (4.16-4.18), let $G(x)$ be defined by
$$\int \Phi(x, y) F(x, y)\, dy = G(x) \exp[-(x, Dx)].$$
Then, if $F(x, y)$ is log concave, $G(x)$ is log concave; if $F(x, y)$ is log convex, $G(x)$ is log convex.
Proof. Write
$$\Phi(x, y) = \exp[-(x, Dx) - (y', Cy')], \qquad y' = y + C^{-1}B^{*}x.$$
Then
$$G(x) = \int \exp[-(y, Cy)]\, F(x, y - C^{-1}B^{*}x)\, dy. \tag{4.19}$$
If $F(x, y)$ is log concave, the integrand in Eq. (4.19) is log concave. Then $G(x)$ is log concave by Corollary 3.5. If $F(x, y)$ is log convex, the integrand is log convex in $x$ for all fixed $y$. Then $G(x)$ is log convex by Hölder's inequality. Q.E.D.
Note, that the log concave part of Theorem 4.3 also follows from Theorem 4.2.
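The Gaussian marginal formula of Eqs. (4.16)-(4.18) can be checked by direct quadrature in the simplest case m = n = 1, where the matrix entries are numbers and D = A - B²/C. The sketch below is an illustration only: the numerical values of a, b, c, the integration window and the helper names are assumptions.

```python
import math

def integrate(func, lo=-12.0, hi=12.0, n=20000):
    """Composite midpoint rule; very accurate for rapidly decaying smooth integrands."""
    step = (hi - lo) / n
    return step * sum(func(lo + (i + 0.5) * step) for i in range(n))

# Assumed 2x2 positive matrix [[a, b], [b, c]]: a > 0 and a*c - b*b > 0.
a, b, c = 2.0, 0.7, 1.5
d = a - b * b / c                      # scalar version of D = A - B C^{-1} B*

def marginal(x):
    """Integral over y of exp(-(a x^2 + 2 b x y + c y^2))."""
    return integrate(lambda y: math.exp(-(a * x * x + 2 * b * x * y + c * y * y)))

const = math.sqrt(math.pi / c)         # value of the Gaussian integral over y
sample_points = [-1.0, 0.0, 0.5, 2.0]
errors = [abs(marginal(x) - const * math.exp(-d * x * x)) for x in sample_points]

print(max(errors))
```

The printed maximal deviation should be at the level of rounding error, consistent with the change of variable y' = y + C⁻¹B*x used in the proof of Theorem 4.3.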
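5. MOMENT INEQUALITIES
THEOREM 5.1. Let $F(x)$ be a nonnegative function on $R^n$, and let $A$ be a real, positive definite, $n \times n$ matrix. Assume $\exp[-(x, Ax)]F(x) \in L^1$ and define
$$\langle k \rangle_F = \int k(x) \exp[-(x, Ax)] F(x)\, dx \Big/ \int \exp[-(x, Ax)] F(x)\, dx.$$
If $F(x) = 1$ we write $\langle \cdot \rangle_1$. Let $\phi \in R^n$, $a \in R$. Then
$$\langle |(\phi, x) - \langle (\phi, x) \rangle_F|^{\alpha} \rangle_F \le \langle |(\phi, x)|^{\alpha} \rangle_1,$$
when $F$ is log concave and $\alpha \ge 1$;
$$\langle |(\phi, x) - a|^{\alpha} \rangle_F \ge \langle |(\phi, x)|^{\alpha} \rangle_1, \quad \text{if } \alpha \ge 0,$$
$$\langle |(\phi, x) - a|^{\alpha} \rangle_F \le \langle |(\phi, x)|^{\alpha} \rangle_1, \quad \text{if } -1 < \alpha \le 0,$$
when $F$ is log convex.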
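Proof. By a linear transformation such that $(\phi, x) \to x$, and by Theorem 4.3, it suffices to prove Theorem 5.1 for the one-dimensional case. This will be done in Lemmas 5.2 and 5.3. Q.E.D.
LEMMA 5.2. Let $F(x)$ be a log convex function on $R$, and let the averages $\langle \cdot \rangle_F$ and $\langle \cdot \rangle_1$ be computed with the weights $\exp(-x^2)F(x)$ and $\exp(-x^2)$, respectively. Let $a \in R$. Then
$$\langle |x - a|^{\alpha} \rangle_F \ge \langle |x|^{\alpha} \rangle_1, \quad \text{if } \alpha \ge 0; \tag{5.1}$$
$$\langle |x - a|^{\alpha} \rangle_F \le \langle |x|^{\alpha} \rangle_1, \quad \text{if } -1 < \alpha \le 0. \tag{5.2}$$
Proof. Note that
$$\langle |x - a|^{\alpha} \rangle_F = \langle |x|^{\alpha} \rangle_G = \langle |x|^{\alpha} \rangle_H,$$
where
$$G(x) = F(x + a) \exp(-2ax), \qquad H(x) = G(x) + G(-x).$$
Since $F$ is log convex, $G$ and $H$ are log convex; moreover, $H$ is even. Thus, for $\alpha > 0$, it has to be shown that
$$\langle x^{\alpha} H(x) \rangle \ge \langle x^{\alpha} \rangle \langle H(x) \rangle, \tag{5.3}$$
with the averages computed over $x > 0$ with the weight $\exp(-x^2)$. But this is equivalent to the inequality
$$\int_0^{\infty}\!\!\int_0^{\infty} dx\, dy\, \exp(-x^2 - y^2)\,[H(x) - H(y)](x^{\alpha} - y^{\alpha}) \ge 0, \tag{5.4}$$
which is obvious, since $H(x)$ and $x^{\alpha}$ are increasing functions for $x > 0$. If $-1 < \alpha < 0$, $x^{\alpha}$ is decreasing for $x > 0$, and hence
$$\langle x^{\alpha} H(x) \rangle \le \langle x^{\alpha} \rangle \langle H(x) \rangle.$$
This proves Eq. (5.2). Q.E.D.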
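The correlation inequality (5.3) and its reversed form for negative exponents can be illustrated numerically. In the sketch below the even, log convex function is taken to be H(x) = cosh x, which is an assumption made only for the example; the quadrature is a simple midpoint rule on (0, 8), and the function names are hypothetical.

```python
import math

def integrate(func, hi=8.0, n=40000):
    """Composite midpoint rule on (0, hi); midpoints avoid the point x = 0."""
    step = hi / n
    return step * sum(func((i + 0.5) * step) for i in range(n))

def H(x):
    return math.cosh(x)   # even and log convex: (log cosh)'' = 1/cosh^2 > 0

def gauss(x):
    return math.exp(-x * x)

def covariance(alpha):
    """<x^alpha H> - <x^alpha><H> over x > 0 with weight exp(-x^2)."""
    norm = integrate(gauss)
    mean_power = integrate(lambda x: x ** alpha * gauss(x)) / norm
    mean_H = integrate(lambda x: H(x) * gauss(x)) / norm
    mean_product = integrate(lambda x: x ** alpha * H(x) * gauss(x)) / norm
    return mean_product - mean_power * mean_H

cov_positive_exponent = covariance(2.0)    # x^2 increasing: expect >= 0
cov_negative_exponent = covariance(-0.5)   # x^(-1/2) decreasing: expect <= 0

print(cov_positive_exponent, cov_negative_exponent)
```

The sign change between the two printed values mirrors the two cases (5.1) and (5.2) of Lemma 5.2.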
LEMMA 5.3. Let $F(x)$ be a log concave function on $R$. Then, with the notation of Lemma 5.2,
$$\langle |x - \langle x \rangle_F|^{\alpha} \rangle_F \le \langle |x|^{\alpha} \rangle_1, \quad \text{if } \alpha \ge 1. \tag{5.5}$$
Proof. Write $\langle |x - \langle x \rangle_F|^{\alpha} \rangle_F = \langle |x|^{\alpha} \rangle_G$, with
$$G(x) = F(x + \langle x \rangle_F) \exp(-2x \langle x \rangle_F).$$
Then $G(x)$ is log concave, and $\langle x \rangle_G = 0$. By approximation, it is sufficient to assume $G \in C^1$. Hence
$$\int dx\, \exp(-x^2)\, G'(x) = 2 \int dx\, x \exp(-x^2)\, G(x) = 0. \tag{5.6}$$
Moreover, there must exist a number $K$ such that $G(x)$ is increasing for $x < K$; decreasing for $x > K$. By Eq. (5.6) $K$ must be finite and we can assume that $K \ge 0$, say. Then $G'(x) \ge 0$ for $x < 0$, and Eq. (5.6) implies that
$$\int_0^{\infty} dx\, \exp(-x^2)\, G'(x) \le 0. \tag{5.7}$$
It has to be shown that
$$\langle x^{\alpha}[G(x) + G(-x)] \rangle \le \langle x^{\alpha} \rangle \langle G(x) + G(-x) \rangle, \tag{5.8}$$
where the averages are with respect to $\exp(-x^2)$, $x > 0$. We assumed that $G'(x) \ge 0$ for $x < 0$, and thus [cf. Eqs. (5.3, 5.4)]
$$\langle x^{\alpha} G(-x) \rangle \le \langle x^{\alpha} \rangle \langle G(-x) \rangle.$$
We wish to show the same inequality for the $G(x)$ part in Eq. (5.8), which is equivalent to
$$\int_0^{\infty} dx \int_0^{x} dy\, \exp(-x^2 - y^2)\,[G(x) - G(y)](x^{\alpha} - y^{\alpha}) \le 0. \tag{5.9}$$
If we write
$$G(x) - G(y) = \int_y^x G'(z)\, dz,$$
Eq. (5.9) becomes
$$\int_0^{\infty} dz\, \phi(z) \exp(-z^2)\, G'(z) \le 0, \tag{5.10}$$
$$\phi(z) = \exp(z^2) \int_z^{\infty} dx \int_0^{z} dy\, \exp(-x^2 - y^2)(x^{\alpha} - y^{\alpha}). \tag{5.11}$$
If we manage to show that $\phi(z)$ is an increasing function for $z > 0$, Eq. (5.10) follows from Eq. (5.7) and the fact that $G'(x) \ge 0$ for $0 < x < K$; $G'(x) \le 0$ for $x > K$, and Lemma 5.3 is proved. After some manipulation, we find that
$$\phi'(z) = \int_z^{\infty} dx\, \exp(-x^2)(x^{\alpha} - z^{\alpha}) + z \exp(z^2) \int_z^{\infty} dx \int_0^{z} dy\, \exp(-x^2 - y^2)\,[(\alpha - 1)x^{\alpha - 2} + y^{\alpha} x^{-2}].$$
Thus, if $\alpha \ge 1$, $\phi'(z) \ge 0$. Q.E.D.
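Inequality (5.5) can be illustrated with a log concave function that is not symmetric, so that the mean ⟨x⟩_F is nonzero. The sketch below assumes F(x) = exp(-eˣ) (an arbitrary example with log F concave) and compares centered absolute moments under exp(-x²)F(x) with the plain Gaussian moments; the helper names are hypothetical.

```python
import math

def integrate(func, lo=-8.0, hi=8.0, n=40000):
    """Composite midpoint rule on [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(func(lo + (i + 0.5) * step) for i in range(n))

def F(x):
    return math.exp(-math.exp(x))      # log F = -e^x is concave

def weight_F(x):
    return math.exp(-x * x) * F(x)

def weight_1(x):
    return math.exp(-x * x)

def average(func, weight):
    """Average of func with respect to the normalized weight."""
    return integrate(lambda x: func(x) * weight(x)) / integrate(weight)

mean_F = average(lambda x: x, weight_F)

moments = {}
for alpha in (1.0, 2.0, 4.0):
    centered = average(lambda x: abs(x - mean_F) ** alpha, weight_F)
    gaussian = average(lambda x: abs(x) ** alpha, weight_1)
    moments[alpha] = (centered, gaussian)

for alpha, (centered, gaussian) in moments.items():
    print(alpha, centered, gaussian)
```

For each exponent α ≥ 1 the centered moment in the second column should not exceed the Gaussian moment in the third.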
Remark. Here, as well as in Theorem 4.3, the log convex case is much simpler than the log concave case. We leave as an open question the correct generalization of Eq. (5.5) when $-1 < \alpha < 1$. If $F(x)$ is symmetric decreasing, which implies that $\langle x \rangle_F = 0$ but does not imply that $F$ is log concave, then Eq. (5.5) trivially generalizes to
$$\langle |x|^{\alpha} \rangle_F \le \langle |x|^{\alpha} \rangle_1, \quad \text{if } \alpha \ge 0; \qquad \langle |x|^{\alpha} \rangle_F \ge \langle |x|^{\alpha} \rangle_1, \quad \text{if } -1 < \alpha \le 0.$$
THEOREM 5.4. Under the assumptions of Theorem 5.1, let $M$ be the covariance matrix
$$M_{ij} = \langle x_i x_j \rangle_F - \langle x_i \rangle_F \langle x_j \rangle_F.$$
Then
$$M \le \langle (2A + f_{xx})^{-1} \rangle_F \le (2A)^{-1}, \quad \text{if } F = \exp(-f) \text{ is log concave};$$
$$M \ge (2A)^{-1}, \quad \text{if } F \text{ is log convex}. \tag{5.12}$$
Proof. Setting $\alpha = 2$ in Theorem 5.1 leads to $M \le (2A)^{-1}$ resp. $M \ge (2A)^{-1}$. The stronger inequality (5.12) is obtained from Theorem 4.1 by taking $h(x) = (\phi, x)$ and replacing the weight $F(x)$ by $\exp[-(x, Ax)]F(x)$. Q.E.D.
6. THE DIFFUSION EQUATION
Consider the diffusion equation in $R^n$
$$\partial \phi/\partial t = -H_A \phi \tag{6.1}$$
with the Hamiltonian
$$(H_A \phi)(x) = -\tfrac{1}{2}(\Delta \phi)(x) + V(x)\phi(x), \tag{6.2}$$
defined on an open, connected region $A \subset R^n$, with zero boundary conditions. The potential $V(x)$ is assumed to be convex; in particular, $V(x)$ may be $\infty$ outside a convex set $D$. Further we assume the region $A$ to be such that
$$\int_A \exp[-tV(x)]\, dx < \infty, \qquad \forall t > 0. \tag{6.3}$$
(This means that $A$ is bounded in the directions for which $V(x)$ does not go to $\infty$ as $|x| \to \infty$.) The fundamental solution $G_A(x, y; t)$ of Eq. (6.1) is defined by
$$((\partial/\partial t) + H_{A,x})\, G_A(x, y; t) = 0, \qquad x, y \in A \cap D,\ t > 0;$$
$$G_A(x, y; 0) = \delta(x - y), \qquad x, y \in A \cap D;$$
$$G_A(x, y; t) = 0, \qquad x \in \partial(A \cap D);$$
$$G_A(x, y; t) = 0, \qquad x \notin A \cap D \ \text{or}\ y \notin A \cap D.$$
We could, of course, replace $A$ by $A \cap D$ without changing $G_A$, but the point is that in Theorem 6.2 we want to vary $A$ while keeping $D$ fixed.
Using the Trotter product formula, we can write
$$G_A(x, y; t) = \lim_{M \to \infty} \Big(\frac{2\pi t}{M}\Big)^{-nM/2} \int_A dx_1 \cdots \int_A dx_{M-1} \prod_{j=1}^{M} \exp\Big[-\frac{M}{2t}(x_j - x_{j-1})^2 - \frac{t}{M} V(x_j)\Big], \tag{6.4}$$
where $x_0 = x$, $x_M = y$.
Define the partition function by
$$Z_A(t) = \operatorname{Tr} \exp(-tH_A) = \int_A G_A(x, x; t)\, dx. \tag{6.5}$$
Then Eq. (6.3) guarantees that $Z_A(t) < \infty$ for all $t > 0$, so that $H_A$ has a pure point spectrum. In fact, Hölder's inequality applied to Eqs. (6.4, 6.5) gives that
$$Z_A(t) \le \int_A G^0(x, x; t) \exp[-tV(x)]\, dx = (2\pi t)^{-n/2} \int_A \exp[-tV(x)]\, dx,$$
where $G^0$ is the fundamental solution of Eq. (6.1) with $V(x) = 0$. Moreover the ground state is nondegenerate and the corresponding eigenfunction is nonnegative [9].
THEOREM 6.1. Let $A = R^n$, and let the potential be of the form
$$V(x) = \tfrac{1}{2}\omega^2 x^2 + W(x), \qquad \omega > 0, \tag{6.6}$$
with a convex function $W(x)$. Then the ground state wave function $\phi_0(x)$ is of the form
$$\phi_0(x) = \exp(-\tfrac{1}{2}\omega x^2)\, \psi(x),$$
where $\psi(x)$ is log concave.
Proof. Let $G_\omega(x, y; t)$ be the fundamental solution of Eq. (6.1) for $V(x) = \tfrac{1}{2}\omega^2 x^2$. Then the fundamental solution for the potential (6.6) is of the form
$$G(x, y; t) = G_\omega(x, y; t)\, H(x, y; t),$$
where $H(x, y; t)$ is log concave in $(x, y)$ for all $t$. This follows directly from Theorem 4.3 applied to Eq. (6.4). If $\epsilon$ is the ground state energy,
$$\phi_0(x)\phi_0(y) = \lim_{t \to \infty} G(x, y; t) \exp(\epsilon t).$$
Since the pointwise limit of log concave functions is log concave, the theorem follows. Q.E.D.
Remark. If $W(x)$ is concave instead of convex (but such that Eq. (6.3) still holds), the log convex part of Theorem 4.3 implies in the same way as above that $\psi(x)$ is log convex.
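THEOREM 6.2. Let $A$ and $B$ be open, connected regions, let $C = \lambda A + (1 - \lambda)B$, and let $V(x)$ be convex. Then
$$Z_C(t) \ge Z_A(t)^{\lambda}\, Z_B(t)^{1-\lambda}; \tag{6.7}$$
$$\epsilon_C \le \lambda \epsilon_A + (1 - \lambda)\epsilon_B, \tag{6.8}$$
where $\epsilon_A$ ($\epsilon_B$, $\epsilon_C$) is the ground state energy of $H_A$ ($H_B$, $H_C$).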
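Theorem 6.2 can be made concrete in the simplest setting: one dimension, V = 0, and A, B intervals of lengths L_A and L_B, so that C is an interval of length λL_A + (1-λ)L_B. The sketch below assumes the normalization H = -(1/2)d²/dx² with zero boundary conditions, for which the eigenvalues on an interval of length L are k²π²/(2L²); the lengths, λ, the values of t, and the function names are illustrative assumptions.

```python
import math

def partition_function(length, t, terms=400):
    """Z_L(t) = sum over k >= 1 of exp(-t k^2 pi^2 / (2 L^2)), truncated after `terms` terms."""
    return sum(math.exp(-t * (k * math.pi / length) ** 2 / 2) for k in range(1, terms + 1))

def ground_state_energy(length):
    """Lowest Dirichlet eigenvalue of -(1/2) d^2/dx^2 on an interval of the given length."""
    return math.pi ** 2 / (2 * length ** 2)

length_A, length_B, lam = 1.0, 3.0, 0.5
length_C = lam * length_A + (1 - lam) * length_B

# Inequality (6.7) at a few times t.
checks = []
for t in (0.1, 1.0, 5.0):
    lhs = partition_function(length_C, t)
    rhs = partition_function(length_A, t) ** lam * partition_function(length_B, t) ** (1 - lam)
    checks.append((t, lhs, rhs))

# Inequality (6.8).
energy_C = ground_state_energy(length_C)
energy_bound = lam * ground_state_energy(length_A) + (1 - lam) * ground_state_energy(length_B)

for t, lhs, rhs in checks:
    print(t, lhs, rhs)
print(energy_C, energy_bound)
```

In this special case (6.8) reduces to the convexity of L ↦ 1/L², while (6.7) expresses the log concavity of L ↦ Z_L(t).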
Proof. Equations (6.4, 6.5) together give an expression for the partition function. We note that we can apply Corollary 3.4 to the sets $A^M$, $B^M$, and $C^M$. This proves Eq. (6.7). Further
$$\epsilon_A = -\lim_{t \to \infty} t^{-1} \log Z_A(t),$$
which gives Eq. (6.8). Q.E.D.
APPENDIX
THEOREM A.1. For measurable sets $A$ and $B \subset R^n$, define the essential sum $C = \operatorname{ess}(A + B)$ as in Eq. (2.4). Then $C$ is open, and
$$\mu_n(C)^{1/n} \ge \mu_n(A)^{1/n} + \mu_n(B)^{1/n}. \tag{A.1}$$
THEOREM A.2. For nonnegative, measurable functions $f(x)$ and $g(x)$ on $R^n$, define
$$H_\alpha(x \mid f, g) = \operatorname*{ess\,sup}_{y \in R^n} \{f(x - y)^{\alpha} \oplus g(y)^{\alpha}\}^{1/\alpha}, \tag{A.2}$$
cf. Eqs. (2.1-2.3). Then $H_\alpha(x)$ is lower semicontinuous in $x$ for all $\alpha$.
Proof of Theorem A.1. All the above facts are based on the following observation: For an arbitrary measurable set $A \subset R^n$, define
$$A^{*} = \{x \in R^n \mid \mu_n[A \cap V(\epsilon, x)]/W_n(\epsilon) \to 1 \ \text{for}\ \epsilon \downarrow 0\}, \tag{A.3}$$
where $V(\epsilon, x)$ is the open ball of radius $\epsilon$ centered at $x$, and $W_n(\epsilon)$ is its volume. Then $A^{*}$ is measurable and $\mu_n(A^{*} \Delta A) = 0$, where $\Delta$ means symmetric difference [2, Theorem 2.9.11]. Hence
$$\operatorname{ess}(A + B) = \operatorname{ess}(A^{*} + B^{*}), \tag{A.4}$$
and it is sufficient to prove the theorem when $A$ and $B$ are replaced by $A^{*}$ and $B^{*}$.
Let $x \in A^{*} + B^{*}$, i.e., there is a point $y \in A^{*} \cap (x - B^{*})$. Notice that $A^{**} = A^{*}$; thus for some $\epsilon > 0$,
$$\mu_n[A^{*} \cap V(\epsilon, y)] > \tfrac{1}{2} W_n(\epsilon), \qquad \mu_n[(x - B^{*}) \cap V(\epsilon, y)] > \tfrac{1}{2} W_n(\epsilon).$$
Hence $\mu_n[A^{*} \cap (v - B^{*})] > 0$ for all $v$ in some neighborhood $V(\delta, x)$, which implies that $A^{*} + B^{*}$ is open, and that
$$A^{*} + B^{*} = \operatorname{ess}(A^{*} + B^{*}). \tag{A.5}$$
Equation (A.1) now follows from Eqs. (A.4, A.5) and the Brunn-Minkowski theorem, Eq. (1.1). Q.E.D.
Proof of Theorem A.2. For a nonnegative, measurable function $f$, let
$$A_f = \{(x, z) \in R^{n+1} \mid 0 \le z \le f(x)\}. \tag{A.6}$$
Define $A_f^{*}$ as in (A.3). If $(x, z) \in A_f^{*}$, then $(x, t) \in A_f^{*}$ for all $t$, $0 < t < z$. Thus it makes sense to define
$$f^{*}(x) = \sup\{z \mid (x, z) \in A_f^{*}\}. \tag{A.7}$$
The supremum over the empty set is taken to be zero. Given $f^{*}$, define $A_{f^{*}}$ according to definition (A.6). Clearly $A_f$, $A_f^{*}$ and $f^{*}$ are all measurable. By (A.6) and (A.7), $A_{f^{*}} \supset A_f^{*}$. Since
$$A_{f^{*}} \setminus A_f^{*} \subset G \equiv \{(x, f^{*}(x)) \mid x \in R^n\},$$
and since $\mu_{n+1}(G) = 0$, it follows that $\mu_{n+1}(A_{f^{*}} \setminus A_f^{*}) = 0$. In general, $\int f = \mu_{n+1}(A_f)$. Therefore
$$\int |f^{*} - f|\, dx = \mu_{n+1}(A_{f^{*}} \Delta A_f) = \mu_{n+1}(A_f^{*} \Delta A_f) = 0. \tag{A.8}$$
As a consequence of (A.8),
$$H_\alpha(f, g) = H_\alpha(f^{*}, g^{*}). \tag{A.9}$$
Now consider the function
$$K_\alpha(x \mid f, g) = \sup_{y \in R^n} \{f(x - y)^{\alpha} \oplus g(y)^{\alpha}\}^{1/\alpha}. \tag{A.10}$$
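Note that generally $K_\alpha(x) \ge H_\alpha(x)$. Let
$$D(z) = \{x \in R^n \mid K_\alpha(x \mid f^{*}, g^{*}) > z\}, \qquad z > 0. \tag{A.11}$$
Choose $z > 0$, $x \in D(z)$. By definitions (A.10) and (A.11), there is a $y \in R^n$, and numbers $b, c > 0$ such that $z < (b^{\alpha} + c^{\alpha})^{1/\alpha}$, $f^{*}(x - y) > b$, $g^{*}(y) > c$. In other words
$$\beta \equiv (x - y, b) \in A_f^{*}, \qquad \gamma \equiv (y, c) \in A_g^{*}.$$
Then for all $\delta > 0$ there exist balls $V(\epsilon, \beta)$ and $V(\epsilon, \gamma)$ in $R^{n+1}$ such that, in the notation of (A.3),
$$\mu_{n+1}(A_f^{*} \cap V(\epsilon, \beta)) \ge (1 - \delta) W_{n+1}(\epsilon), \qquad \mu_{n+1}(A_g^{*} \cap V(\epsilon, \gamma)) \ge (1 - \delta) W_{n+1}(\epsilon).$$
If $\delta$ is small enough, it follows that the sets
$$\{v \in V(\epsilon, x - y) \mid f^{*}(v) > b\}, \qquad \{w \in V(\epsilon, y) \mid g^{*}(w) > c\}$$
have measure at least equal to $\tfrac{1}{2} W_n(\epsilon)$. This implies (1) that $H_\alpha(x \mid f^{*}, g^{*}) > z$, so that in fact
$$H_\alpha(f^{*}, g^{*}) = K_\alpha(f^{*}, g^{*}), \tag{A.12}$$
and (2) that $D(z)$ contains a neighborhood of $x$, such that $D(z)$ is open. Hence $K_\alpha(f^{*}, g^{*})$ is lower semicontinuous. By Eqs. (A.9, A.12), so is $H_\alpha(f, g)$. Q.E.D.
REFERENCES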
1. L. LUSTERNIK, Die Brunn-Minkowskische Ungleichung für beliebige messbare Mengen, C. R. Dokl. Acad. Sci. URSS No. 3, 8 (1935), 55-58.
2. H. FEDERER, "Geometric Measure Theory," Springer, New York, 1969.
3. A. PRÉKOPA, Logarithmic concave measures with application to stochastic programming, Acta Sci. Math. (Szeged) 32 (1971), 301-315.
4. L. LEINDLER, On a certain converse of Hölder's inequality II, Acta Sci. Math. (Szeged) 33 (1972), 217-223.
5. A. PRÉKOPA, On logarithmic concave measures and functions, Acta Sci. Math. (Szeged) 34 (1973), 335-343.
6. H. J. BRASCAMP AND E. H. LIEB, Some inequalities for Gaussian measures, in "Functional Integral and its Applications" (A. M. Arthurs, Ed.), Clarendon Press, Oxford, 1975.
7. H. J. BRASCAMP AND E. H. LIEB, Best constants in Young's inequality, its converse and its generalization to more than three functions, Advances in Math. 20 (1976).
8. Y. RINOTT, On convexity of measures, Thesis, Weizmann Institute, Rehovot, Israel, November 1973, to appear.
9. B. SIMON AND R. HØEGH-KROHN, Hypercontractive semigroups and two-dimensional self-coupled Bose fields, J. Functional Analysis 9 (1972), 121-180.
Note added in proof. After this paper was submitted for publication we discovered that Corollary 3.4 and its converse were proved by C. Borell:
C. BORELL, Convex measures on locally convex spaces, Ark. Mat. 12 (1974), 239-252.
C. BORELL, Convex set functions, Period. Math. Hungar. 6 (1975), 111-136.
Studies in Appl. Math. 57, 93-105 (1977)
Existence and Uniqueness of the Minimizing Solution of Choquard's Nonlinear Equation
By Elliott H. Lieb
The equation dealt with in this paper is
$$\Big\{-\Delta - 2\int |\phi(y)|^2 |x - y|^{-1}\, dy\Big\}\phi = e\phi$$
in three dimensions. It comes from minimizing the functional
$$\mathcal{E}(\phi) = \int |\nabla\phi|^2\, dx - \iint |\phi(x)|^2 |x - y|^{-1} |\phi(y)|^2\, dx\, dy,$$
which, in turn, comes from an approximation to the Hartree-Fock theory of a plasma. It describes an electron trapped in its own hole. The interesting mathematical aspect of the problem is that $\mathcal{E}$ is not convex, and usual methods to show existence and uniqueness of the minimum do not apply. By using symmetric decreasing rearrangement inequalities we are able to prove existence and uniqueness (modulo translations) of a minimizing $\phi$. To prove uniqueness a strict form of the inequality, which we believe is new, is employed.
I. Introduction
We consider the functional
$$\mathcal{E}(\phi) = \int |\nabla\phi(x)|^2\, dx - \iint |\phi(x)|^2 |x - y|^{-1} |\phi(y)|^2\, dx\, dy \tag{1.1}$$
on $W^1(R^3)$, the space of functions on $R^3$ such that $\|\nabla\phi\|_2$ and $\|\phi\|_2$ are finite.
Work supported by U.S. National Science Foundation grant MCS 75-21684.
Copyright O 1977 by The Massachusetts Institute of Technology Published by Elsevier North-Holland. Inc.
This functional arises in a certain approximation to Hartree-Fock theory for a one component plasma. Ph. Choquard proposed it for investigation at the Symposium on Coulomb Systems, Lausanne, July, 1976. If one defines
$$E(\lambda) = \inf\{\mathcal{E}(\phi) \mid \phi \in W^1(R^3),\ \|\phi\|_2 \le \lambda\}, \tag{1.2}$$
intuition suggests that:
(i) $E(\lambda)$ is finite.
(ii) There is a minimizing $\phi$ for $E(\lambda)$ which satisfies the nonlinear Schrödinger equation
$$\{-\Delta + V_\phi(x)\}\phi(x) = e\phi(x) \tag{1.3}$$
with
$$V_\phi(x) = -2\int |\phi(y)|^2 |x - y|^{-1}\, dy. \tag{1.4}$$
(iii) The minimizing $\phi$ is unique except for translations (i.e., $\phi(x) \to \phi(x + a)$, $a \in R^3$) and $\|\phi\|_2 = \lambda$. Furthermore, $\phi$ is infinitely differentiable. Thus,
$$E(\lambda) = \inf\{\mathcal{E}(\phi) \mid \phi \in W^1(R^3),\ \|\phi\|_2 = \lambda\}.$$
These facts will be proved in this paper. The mathematical difficulty of the problem stems from the minus sign in $\mathcal{E}$, which precludes the conventional arguments about convex functionals. To overcome the lack of convexity, the theory of symmetric decreasing functions will be employed. This is reviewed in Sec. III. To prove uniqueness of the minimum a strict form of the inequality is used. This we believe to be new, and it is given in the appendix. The uniqueness proof is technically the hardest, if not the most novel, part of this paper. The proof in Sec. VI would not have been possible without important insights generously contributed by S. Patter and M. Steuerwalt.
For the uniqueness proof we rely heavily on the fact that the kernel in (1.1) is $|x - y|^{-1}$; in particular, the kernel yields a useful scaling relation. On the other hand, for the existence proof we only use the fact that $|x|^{-1}$ is a symmetric decreasing function. Thus, our method should be applicable in a wider context. For example, the existence proof is applicable for the functional $\mathcal{E}(\phi) - \int R(x)|\phi(x)|^2\, dx$, where $R$ is a symmetric decreasing function in $L^{3/2}(R^3) + L^{\infty}(R^3)$. This latter functional arises in the Hartree-Fock theory of the helium atom.
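II. Boundedness of $\mathcal{E}(\phi)$
We use the notation
$$T(\phi) = \int |\nabla\phi(x)|^2\, dx = \|\nabla\phi\|_2^2, \qquad W(\phi) = \iint |\phi(x)|^2 |x - y|^{-1} |\phi(y)|^2\, dx\, dy. \tag{2.1}$$
Sobolev's inequality in $R^3$ states that for $\phi \in W^1$,
$$T(\phi) \ge K\|\phi\|_6^2, \tag{2.2}$$
with $K$ given by [1, 5, 7]
$$K = 3(\pi/2)^{4/3} \approx 5.478. \tag{2.3}$$
If we define
$$\rho_\phi(x) = |\phi(x)|^2, \tag{2.4}$$
then $\phi \in W^1$ implies $\rho_\phi \in L^3$ and
$$T(\phi) \ge K\|\rho_\phi\|_3. \tag{2.5}$$
To discuss $E(\lambda)$ we also assume $\rho_\phi \in L^1$ with $\|\rho_\phi\|_1 \le \lambda^2$.
The function $|x|^{-1}$ can be written
$$|x|^{-1} = h_1(x) + h_2(x) \tag{2.6}$$
with $h_1 \in L^{3/2}$ and $h_2 \in L^{\infty}$, where $h_1(x) = |x|^{-1}$ for $|x| \le R$, $h_1(x) = 0$ otherwise. For any $\lambda > 0$ we can choose $R$ such that
$$\|h_1\|_{3/2} = K\lambda^{-2}/2, \tag{2.7}$$
and we then define
$$b(\lambda) = \|h_2\|_\infty = \text{const}\ \lambda^2. \tag{2.8}$$
By Young's inequality,
$$\iint \rho(x) h_1(x - y) \rho(y)\, dx\, dy \le \|h_1\|_{3/2} \|\rho\|_3 \|\rho\|_1, \tag{2.9}$$
$$\iint \rho(x) h_2(x - y) \rho(y)\, dx\, dy \le \|h_2\|_\infty \|\rho\|_1^2. \tag{2.10}$$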
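From the above facts we can conclude
LEMMA 1. If $\phi \in W^1$ and $\|\phi\|_2 \le \lambda$, then
$$\mathcal{E}(\phi) \ge -b(\lambda)\lambda^4. \tag{2.11}$$
Furthermore,
(i) $$E(\lambda) < 0; \tag{2.12}$$
(ii) if $\mathcal{E}(\phi) \le E(\lambda) + 1$, then
$$T(\phi) \le 2[1 + b(\lambda)\lambda^4]. \tag{2.13}$$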
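The quantities entering Lemma 1 can be made explicit. The sketch below computes the constant K of (2.3), the cutoff radius R fixed by (2.7) (using ‖h₁‖_{3/2} = (8π/3)^{2/3} R for the truncated Coulomb kernel, so that b(λ) = 1/R), and the energy of a Gaussian trial function φ(x) = a exp(-bx²) with ‖φ‖₂ = λ. The closed forms T = 3b‖φ‖₂² and W = 2‖φ‖₂⁴(b/π)^{1/2} for Gaussians are standard integrals rather than formulas from the paper, and the function names and the value of λ are assumptions made for illustration.

```python
import math

K = 3 * (math.pi / 2) ** (4.0 / 3.0)          # Sobolev constant of Eq. (2.3)

def cutoff_radius(lam):
    """R solving (8 pi / 3)^(2/3) * R = K * lam^-2 / 2, i.e. Eq. (2.7)."""
    return K / (2 * lam ** 2 * (8 * math.pi / 3) ** (2.0 / 3.0))

def b_of_lambda(lam):
    """b(lam) = sup of |x|^-1 outside the ball of radius R, i.e. 1/R; proportional to lam^2."""
    return 1.0 / cutoff_radius(lam)

def gaussian_energy(lam, width):
    """T - W for phi(x) = a exp(-width * x^2) normalized so that ||phi||_2 = lam."""
    norm_sq = lam ** 2
    kinetic = 3 * width * norm_sq
    coulomb = 2 * norm_sq ** 2 * math.sqrt(width / math.pi)
    return kinetic - coulomb

lam = 1.3
best_width = lam ** 4 / (9 * math.pi)          # minimizes the Gaussian energy over the width
trial_energy = gaussian_energy(lam, best_width)
lower_bound = -b_of_lambda(lam) * lam ** 4     # right side of (2.11)

print(K)
print(trial_energy, lower_bound)
```

The trial energy is negative, which is the content of (2.12), and it lies above the lower bound (2.11), as it must; within the Gaussian family the best value is -λ⁶/(3π).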
Proof: (2.11) follows from (2.5), (2.7), (2.9) and (2.10). To prove (2.12) it is sufficient to find some $\phi \in W^1$ such that $\mathcal{E}(\phi) < 0$ and $\|\phi\|_2 \le \lambda$. A Gaussian $\phi(x) = a\exp(-bx^2)$ will do this. For (2.13) we note that for $\|\phi\|_2 \le \lambda$,
$$W(\phi) \le \tfrac{1}{2}K\|\rho_\phi\|_3 + b(\lambda)\lambda^4 \le \tfrac{1}{2}T(\phi) + b(\lambda)\lambda^4. \tag{2.14}$$
Hence, $1 \ge E(\lambda) + 1 \ge \mathcal{E}(\phi) \ge T(\phi)/2 - b(\lambda)\lambda^4$.
COROLLARY 2. If $\phi \in W^1$, $\|\phi\|_2 \le \lambda$ and $\mathcal{E}(\phi) = E(\lambda)$, then $\|\phi\|_2 = \lambda$.
Proof: If $\|\phi\|_2 = \gamma < \lambda$, let $\psi = (\lambda/\gamma)\phi$. Then $\mathcal{E}(\psi) = (\lambda/\gamma)^2[T(\phi) - (\lambda/\gamma)^2 W(\phi)] < (\lambda/\gamma)^2 E(\lambda) < E(\lambda)$.
III. Symmetric decreasing rearrangements
It is necessary to use some inequalities about the symmetric decreasing rearrangement of a function, and we therefore briefly review some of the main facts. (See [2] for details and generalizations.) Let
$$S = \{f : R^3 \to [0, \infty] \mid f(x) \ge f(y) \ \text{if}\ |x| \le |y|\} \tag{3.1}$$
be the symmetric decreasing functions, and let
$$S' = \{f : R^3 \to [0, \infty] \mid f(x - v) = g(x) \ \text{a.e. for some}\ v \in R^3 \ \text{and}\ g \in S\} \tag{3.2}$$
be the translates (a.e.) of functions in $S$. The functions in $S'$ are Lebesgue measurable. Two functions $f$ and $h$ in $S'$ are said to be equicentered if the same $v$ can be chosen for $f$ and $h$ in (3.2). If $\chi$ is the characteristic function of a measurable set in $R^3$, we define $\chi^{*}$ by
$$\chi^{*}(x) = 1 \ \ \text{if}\ 4\pi|x|^3/3 \le \|\chi\|_1, \qquad \chi^{*}(x) = 0 \ \ \text{otherwise}. \tag{3.3}$$
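Clearly $\chi^{*} \in S$ and $\|\chi^{*}\|_1 = \|\chi\|_1$. Given $f : R^3 \to [0, \infty]$, let $\chi_a^f(x) = 1$ if $f(x) > a$, $\chi_a^f(x) = 0$ otherwise. Then
$$f(x) = \int_0^{\infty} \chi_a^f(x)\, da, \tag{3.4}$$
and we define
$$f^{*}(x) = \int_0^{\infty} \chi_a^{f*}(x)\, da. \tag{3.5}$$
If $f : R^3 \to C$ we define
$$f^{*} = |f|^{*}. \tag{3.6}$$
Clearly $f^{*} \in S$ and for all $a$, $\mu\{x \mid f^{*}(x) > a\} = \mu\{x \mid |f(x)| > a\}$, where $\mu$ is Lebesgue measure. This implies that for all $p$,
$$\|f^{*}\|_p = \|f\|_p. \tag{3.7}$$
It is easy to check that for $\alpha > 0$,
$$(f^{\alpha})^{*} = (f^{*})^{\alpha} \quad \text{pointwise}. \tag{3.8}$$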
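The two properties (3.7) and (3.8) are easy to see in a discrete toy model. The sketch below is not the rearrangement on R³ used in the paper; it is a one-dimensional lattice analogue, assumed purely for illustration, in which the values of |f| on the sites -m, ..., m are sorted and placed around the central site in the order 0, +1, -1, +2, -2, .... The function name `rearrange` and the sample data are hypothetical.

```python
import math

def rearrange(values):
    """Discrete 1D analogue of f -> f*: sort |values| in decreasing order and
    place them on sites 0, +1, -1, +2, -2, ... of a lattice with an odd number of sites."""
    n = len(values)
    if n % 2 != 1:
        raise ValueError("use an odd number of sites so that there is a central site")
    center = n // 2
    ordered = sorted((abs(v) for v in values), reverse=True)
    result = [0.0] * n
    for rank, value in enumerate(ordered):
        offset = (rank + 1) // 2
        site = center + offset if rank % 2 == 1 else center - offset
        result[site] = value
    return result

def p_norm(values, p):
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

f = [0.2, -1.5, 0.0, 3.0, 0.7, -0.1, 2.2]
f_star = rearrange(f)

# Analogue of (3.7): all p-norms are preserved.
norm_gaps = [abs(p_norm(f, p) - p_norm(f_star, p)) for p in (1, 2, 3, 6)]

# Analogue of (3.8) with alpha = 2: rearranging |f|^2 equals squaring the rearrangement.
square_then_rearrange = rearrange([abs(v) ** 2 for v in f])
rearrange_then_square = [v ** 2 for v in f_star]

print(f_star)
print(norm_gaps)
```

Both properties hold here for the same reason as in the continuum: the rearrangement only moves values around, and a monotone function of |f| preserves their ordering.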
The inequality of Riesz [4] states that for any three measurable functions on $R^3$,
$$\Big|\iint f(x) g(x - y) h(y)\, dx\, dy\Big| \le \iint f^{*}(x) g^{*}(x - y) h^{*}(y)\, dx\, dy. \tag{3.9}$$
To prove uniqueness we will need the following strict version of (3.9) (see the appendix).
LEMMA 3. If $g \in S$ and $g$ is positive and strictly decreasing (i.e., $|x| < |y| \Rightarrow g(x) > g(y) > 0$), then (3.9) is a strict inequality when the right side is finite unless $f$ and $h$ are equicentered functions in $S'$.
Since $g(x) = |x|^{-1}$ satisfies the hypothesis of Lemma 3, and since $|\phi| \in S' \Leftrightarrow |\phi|^2 \in S'$, we have
COROLLARY 4. If $|\phi| \notin S'$, then $W(\phi) < W(\phi^{*})$.
Next we turn to $T(\phi)$.
LEMMA 5. If $\phi \in W^1$, then $\phi^{*} \in W^1$ and $T(\phi) \ge T(\phi^{*})$.
This lemma is well known, but what we believe to be an original and simple proof is given in the appendix. Probably a strict version of Lemma 5 is true, but we do not need it, since we have the strict inequality for $W(\phi)$. The results of this section can be summarized as follows:
LEMMA 6. (a) There exists a sequence of symmetric decreasing functions $\phi^{(j)} \in W^1$ such that $\|\phi^{(j)}\|_2 = \lambda$ and $\mathcal{E}(\phi^{(j)}) \to E(\lambda)$.
(b) If $\phi \in W^1$, $\|\phi\|_2 = \lambda$ and $\mathcal{E}(\phi) = E(\lambda)$, then $\phi \in S'$.
Proof: If $\{\phi^{(j)}\}$ is a minimizing sequence for $\mathcal{E}(\phi)$ and if $\phi^{(j)}$ is replaced by $\phi^{(j)*}$, then (a) follows from Corollaries 2 and 4 and Lemma 5. (b) follows from Corollary 4.
Remark: Part (b) is crucial for the uniqueness question, because it is then sufficient to prove uniqueness among the functions in $S'$.
IV. Existence of a minimum and its properties
THEOREM 7. There exists a $\phi \in S$ with $\|\phi\|_2 = \lambda$ such that $\mathcal{E}(\phi) = E(\lambda)$.
Proof: Let $\phi^{(j)} \in S$ be a minimizing sequence for $E(\lambda)$. $W^1$ is a Hilbert space with norm $\|\psi\| = \|\psi\|_2 + \|\nabla\psi\|_2$, and $\|\nabla\phi^{(j)}\|_2$ is bounded by Lemma 1. By the Banach-Alaoglu theorem there exists a $W^1$-weakly convergent subsequence which we shall denote by $\phi^{(j)}$. If $\phi$ is the weak limit then $\liminf_{j \to \infty} T(\phi^{(j)}) \ge T(\phi)$ and $\|\phi\|_2 \le \lambda$.
Now consider $\rho^{(j)}(x) = \phi^{(j)}(x)^2$. We abuse notation by writing $\rho(r)$ with $r = |x|$ for spherically symmetric functions. Since $\rho^{(j)} \in S$ and $\|\rho^{(j)}\|_1 = \lambda^2$, we have, for any $R > 0$,
$$\rho^{(j)}(R)\, 4\pi R^3/3 \le 4\pi \int_0^R \rho^{(j)}(s) s^2\, ds \le \|\rho^{(j)}\|_1 = \lambda^2.$$
Likewise, $\|\rho^{(j)}\|_3 \le C$ by (2.2) and Lemma 1, and hence $[\rho^{(j)}(R)]^3\, 4\pi R^3/3 \le C^3$. Thus $\rho^{(j)}(R) \le f(R)$, where
$$f(r) = Ar^{-1} \ \text{for}\ r \le 1, \qquad f(r) = Ar^{-3} \ \text{for}\ r > 1. \tag{4.1}$$
By a trivial generalization of Helly's theorem [3], we can find a further subsequence such that $\rho^{(j)}(r) \to \rho(r) \le f$ pointwise for $r > 0$. Hence $\phi^{(j)} \to \psi \equiv \rho^{1/2}$ pointwise on $R^3 \setminus \{0\}$. We also know that $\phi^{(j)} \to \phi$ in weak $L^2$. Since $\phi^{(j)} \le f^{1/2} \in L^2_{\mathrm{loc}}$, it is easy to see that $\psi = \phi$. [Proof: If $g \in C_0^{\infty}$, then $\int g(\phi^{(j)} - \phi) \to 0$ by the weak convergence, while $\int g(\phi^{(j)} - \psi) \to 0$ by dominated convergence. Hence $\int g(\phi - \psi) = 0$ for all $g \in C_0^{\infty}$, which implies that $\phi = \psi$.] Since $\rho^{(j)} \to \rho = \phi^2$ pointwise, and $\rho \le f$, we have, by dominated convergence, that $W(\phi^{(j)}) \to W(\phi)$ provided $W(f^{1/2}) < \infty$. This latter fact is easy to verify. In summary, $E(\lambda) = \liminf_{j \to \infty} \mathcal{E}(\phi^{(j)}) \ge \mathcal{E}(\phi)$, so $\phi$ is a minimum for $E(\lambda)$.
We turn next to some properties of any minimizing function $\phi$. In the next theorem we do not use the fact that $\phi$ is symmetric decreasing.
THEOREM 8. If $\phi \in W^1$, $\|\phi\|_2 = \lambda$ and $\mathcal{E}(\phi) = E(\lambda)$, then $\phi$ satisfies the (distributional) equation
$$(-\Delta + V_\phi(x))\phi(x) = e\phi(x) \tag{1.3}$$
for some $e < 0$. $V_\phi$ is given in (1.4). If $\phi \in W^1$ is any complex valued function (not necessarily minimizing) satisfying (1.3) in the distributional sense for any $e$ then:
(ia) $V_\phi \in L^p$, $4 < p < \infty$.
(ib) $\phi V_\phi \in L^p$, $1 < p < 6$.
(ii) $V_\phi$ is a continuous function which goes to zero at infinity.
(iii) If $e < 0$, then $\phi \in C^{\infty}$ and goes to zero at infinity, and hence $\phi$ is a strong solution of (1.3).
Proof: The proof of (1.3) is standard. Simply replace $\phi$ by $\phi + hg$, $g \in \mathcal{S}$ (Schwartz space), and compute the derivative at $h = 0$ of $\mathcal{E}(\phi + hg)$. To see that $e < 0$ for a minimizing $\phi$, multiply (1.3) by $\phi(x)$ and integrate. Then
$$\mathcal{E}(\phi) = E(\lambda) = e\lambda^2 + W(\phi) \tag{4.2}$$
and $E(\lambda) < 0$, while $W(\phi) > 0$.
For the second part we write IxI '=h1(x)+h2(x) as in (2.6) and note that h, E L3/2 and h2 E L4. By Young's inequality, if f E LP, g E LQ, p -' + q-' - I + r -', then f *g E L'. (ia) follows from the fact that p4 = I0I2 E V for 1
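kernel for $(-\Delta - e)^{-1}$. As $Y \in L^p$, $p = 2$, (ib) implies that $\phi$ is a continuous function which goes to zero at infinity. Now fix $x_0 \in \mathbf{R}^3$, and let $\psi \in C_0^\infty$ be a function which is 1 near $x_0$. Let $\phi_1(x) = \psi(x)\phi(x)$, $\phi_2 = \phi - \phi_1$. Let $\phi = \phi_a + \phi_b$, where $\phi_a = -Y * (V_\phi \phi_2)$. Since $\phi_2$ vanishes near $x_0$, $\phi_a$ is $C^\infty$ near $x_0$. Assuming that $\phi$ is $C^k$ ($k \ge 0$) in a neighborhood of $x_0$, we shall prove that $\phi$ is $C^{k+1}$ near $x_0$. Write $\rho_\phi = \rho^1 + \rho^2$, where $\rho^1 = |\phi_1|^2$ and $\rho^2 = |\phi_2|^2 + \phi_1\bar\phi_2 + \bar\phi_1\phi_2$. Since $\rho^2$ is zero near $x_0$, $|x|^{-1} * \rho^2$ is harmonic and hence $C^\infty$ near $x_0$. Since $\rho^1$ has compact support, it is in all $L^p$, and $\rho^1$ is $C^k$ near $x_0$. Then $|x|^{-1} * \rho^1$ is $C^k$ near $x_0$. Therefore $\phi_1 V_\phi$ is $C^k$ near $x_0$ and has compact support. Hence $\phi_b = -Y * (\phi_1 V_\phi)$ is $C^{k+1}$ near $x_0$.

V. Scaling properties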
In this section we shall exploit the fact that the kernel in (1.1) is $|x|^{-1}$. For $z > 0$ consider the functional
$\mathcal{E}_z(\phi) = T(\phi) - zW(\phi)$ (5.1)
on $W^1$ and
$E(\lambda, z) = \inf\{\mathcal{E}_z(\phi) \mid \phi \in W^1,\ \|\phi\|_2 = \lambda\}$. (5.2)
The results of the previous sections carry through mutatis mutandis.
Studies in Appl. Math. 57, 93-105 (1977) Elliott H. Lieb
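THEOREM 9. Let $\phi(x;\lambda,z)$ be a minimizing function for $E(\lambda,z)$, and let $e(\lambda,z)$ be the eigenvalue in the analogue of (1.3), i.e.,
$(-\Delta + zV_\phi(x))\,\phi(x;\lambda,z) = e(\lambda,z)\,\phi(x;\lambda,z)$. (5.3)
Let $\phi_1(x)$, $E_1$, and $e_1$ denote such a triplet when $\lambda = z = 1$. Then for every solution to the $(\lambda, z)$ problem there is a solution to the $(1,1)$ problem and conversely. These are related by
$\phi(x;\lambda,z) = z^{3/2}\lambda^4\,\phi_1(z\lambda^2 x)$,
$zV_{\phi(\cdot;\lambda,z)}(x) = z^2\lambda^4\,V_{\phi_1}(z\lambda^2 x)$,
$E(\lambda,z) = z^2\lambda^6 E_1$, $e(\lambda,z) = z^2\lambda^4 e_1$,
$T(\phi(\cdot;\lambda,z)) = z^2\lambda^6\,T(\phi_1)$, $zW(\phi(\cdot;\lambda,z)) = z^2\lambda^6\,W(\phi_1)$.
Proof: Trivial.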
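The exponents in Theorem 9 can be checked by a change of variables. The following sketch assumes the standard forms $T(\phi) = \int |\nabla\phi|^2\,dx$ and $W(\phi) = \int\!\!\int |\phi(x)|^2 |x-y|^{-1} |\phi(y)|^2\,dx\,dy$ on $\mathbf{R}^3$ (these definitions are not restated in this section, so they are an assumption here), and the auxiliary symbols $a$, $b$ are introduced only for the computation.

```latex
\begin{align*}
\phi(x) &= a\,\phi_1(bx), \qquad a = z^{3/2}\lambda^{4},\quad b = z\lambda^{2},\\
\|\phi\|_2^2 &= a^2 b^{-3}\,\|\phi_1\|_2^2 = \lambda^2\,\|\phi_1\|_2^2,\\
T(\phi) &= a^2 b^{2}\, b^{-3}\,T(\phi_1) = z^2\lambda^6\,T(\phi_1),\\
z\,W(\phi) &= z\,a^4 b^{-5}\,W(\phi_1) = z^2\lambda^6\,W(\phi_1),\\
\mathcal{E}_z(\phi) &= T(\phi) - zW(\phi) = z^2\lambda^6\,\bigl[T(\phi_1) - W(\phi_1)\bigr].
\end{align*}
```

Since $\|\phi_1\|_2 = 1$ corresponds to $\|\phi\|_2 = \lambda$, taking the infimum on both sides gives $E(\lambda,z) = z^2\lambda^6 E_1$.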
VI. Uniqueness of the minimum
If $\phi$ minimizes $\mathcal{E}(\phi)$ subject to $\|\phi\|_2 = \lambda$, Lemma 6 asserts that $\phi \in S$.
THEOREM 10. If $\phi$ is minimizing for $E(\lambda)$ and $\phi \in S$, then $\phi$ is unique.
Proof: By Theorem 9, we know that if we prove uniqueness for any $\lambda > 0$, then we have uniqueness for all $\lambda > 0$. If $\phi$ is minimizing for some $\lambda_0$, then, by Theorem 9, for every $\lambda > 0$ there is a scaled copy of $\phi$ that minimizes for $\lambda$. Consider $I(\lambda) \equiv \int |\phi(x;\lambda)|^2 |x|^{-1}\,dx$. Since $\phi \in L^2 \cap L^6$, $I(\lambda)$ is finite. By scaling, $I(\lambda) = A\lambda^4$ and $e(\lambda) = $ [eigenvalue in (5.3)] $= B\lambda^4$. By Newton's 1687 theorem, and using the fact that $\phi$ is spherical, we can conveniently express $V_\phi$ in polar coordinates as
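$V_\phi(x) = -8\pi r^{-1}\int_0^r |\phi(s;\lambda)|^2 s^2\,ds - 8\pi\int_r^\infty |\phi(s;\lambda)|^2 s\,ds = \int_0^r K(r,s)\,|\phi(s;\lambda)|^2\,ds - 2I(\lambda)$, (6.1)
where $r = |x|$ and, for $r \ge s$,
$K(r,s) = 8\pi s^2 (s^{-1} - r^{-1}) \ge 0$. (6.2)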
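For readers who want the intermediate step behind (6.1), here is a short derivation. It assumes $V_\phi(x) = -2\int |\phi(y)|^2 |x-y|^{-1}\,dy$, which is the form of (1.4) suggested by the coefficients in (6.1) but is not restated in this section.

```latex
\begin{align*}
\int_{|y|=s} \frac{d\sigma(y)}{|x-y|} &= \frac{4\pi s^2}{\max(r,s)}, \qquad r = |x|
  \quad\text{(Newton's theorem)},\\
V_\phi(x) &= -\frac{8\pi}{r}\int_0^r |\phi(s)|^2 s^2\,ds
             - 8\pi\int_r^\infty |\phi(s)|^2 s\,ds,\\
8\pi\int_r^\infty |\phi(s)|^2 s\,ds &= 2I - 8\pi\int_0^r |\phi(s)|^2 s\,ds,
  \qquad I = 4\pi\int_0^\infty |\phi(s)|^2 s\,ds,\\
V_\phi(x) &= \int_0^r 8\pi s^2\bigl(s^{-1} - r^{-1}\bigr)\,|\phi(s)|^2\,ds - 2I .
\end{align*}
```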
Thus, (5.3) reads
$\{-\Delta + U_\phi(x)\}\,\phi(x) = (e + 2I)\,\phi(x)$ (6.3)
and
$U_\phi(r) = \int_0^r K(r,s)\,|\phi(s)|^2\,ds$. (6.4)
(6.3) is a Schrödinger equation with potential $U_\phi$. As $U_\phi \ge 0$, we see that $e + 2I$ must be positive. Since $e + 2I = (B + 2A)\lambda^4$, $B + 2A > 0$. Now choose $\lambda$ such that $(B + 2A)\lambda^4 = 1$. Then we have the following canonical form of (1.3) for spherical functions:
$\{-\Delta + U_\phi(x)\}\,\phi(x) = \phi(x)$. (6.5)
In other words, every $W^1$ spherically symmetric solution of (1.3), whether minimizing or not and whether $e < 0$ or not, obeys (6.5) after a suitable scale transformation. Our goal will be to show that (6.5) has only one (non-null) nonnegative solution in $W^1$, for this will imply that the minimum is unique (modulo translations).
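An advantage of (6.5) is that the parameter $\lambda$ appears nowhere. Another advantage of (6.5), especially for numerical work, is that $U_\phi(r)$ depends upon $\phi(s)$ only for $s \le r$. Hence one can integrate from $r = 0$ outwards. Given any solution $\phi$ of (6.5), we can reconstruct the original problem by
$\bar\lambda^2 = 4\pi\int_0^\infty |\phi(r)|^2 r^2\,dr$,
$I = 4\pi\int_0^\infty |\phi(r)|^2 r\,dr$,
$T(\phi) = 4\pi\int_0^\infty |d\phi/dr|^2 r^2\,dr$,
$W(\phi) = -2\pi\int_0^\infty |\phi(r)|^2 U_\phi(r)\,r^2\,dr + \bar\lambda^2 I$,
$\phi(x;\lambda) = (\lambda/\bar\lambda)^4\,\phi((\lambda/\bar\lambda)^2 x)$. (6.10)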
If $\phi$ is minimizing,
$E(\bar\lambda) = T(\phi) - W(\phi)$ (6.11)
and, for any $\lambda > 0$,
$E(\lambda) = (\lambda/\bar\lambda)^6 E(\bar\lambda)$. (6.12)
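The remark that (6.5) can be integrated from $r = 0$ outwards can be sketched numerically. The code below is an illustration under stated assumptions, not the author's procedure: it uses the radial form $u'' = (U_\phi - 1)u$ with $u = r\phi$, rewrites (6.4) as $U_\phi(r) = 8\pi(a - b/r)$ with the running integrals $a = \int_0^r u^2 s^{-1}\,ds$ and $b = \int_0^r u^2\,ds$, and advances the system with a classical Runge-Kutta step. The function name `integrate_outward`, the starting radius `r0`, and the step count are hypothetical choices made for this sketch.

```python
import math


def integrate_outward(phi0, r_max, n_steps=2000, r0=1e-6):
    """Integrate u'' = (U(r) - 1) u outward from r0 with classical RK4.

    State y = (u, u', a, b) with u = r*phi, a = int_0^r u^2/s ds and
    b = int_0^r u^2 ds, so that U(r) = 8*pi*(a - b/r) as in (6.4).
    Returns the radii, the values phi(r) = u/r and the values U(r).
    """
    def potential(r, a, b):
        return 8.0 * math.pi * (a - b / r)

    def rhs(r, y):
        u, du, a, b = y
        return (du, (potential(r, a, b) - 1.0) * u, u * u / r, u * u)

    def shifted(y, k, step):
        return tuple(yi + step * ki for yi, ki in zip(y, k))

    h = (r_max - r0) / n_steps
    r = r0
    # Starting values from phi(r) ~ phi0 near the origin (assumption).
    y = (phi0 * r0, phi0, 0.5 * phi0**2 * r0**2, phi0**2 * r0**3 / 3.0)
    rs, phis, us = [], [], []
    for _ in range(n_steps):
        k1 = rhs(r, y)
        k2 = rhs(r + 0.5 * h, shifted(y, k1, 0.5 * h))
        k3 = rhs(r + 0.5 * h, shifted(y, k2, 0.5 * h))
        k4 = rhs(r + h, shifted(y, k3, h))
        y = tuple(yi + h / 6.0 * (c1 + 2 * c2 + 2 * c3 + c4)
                  for yi, c1, c2, c3, c4 in zip(y, k1, k2, k3, k4))
        r += h
        rs.append(r)
        phis.append(y[0] / r)
        us.append(potential(r, y[2], y[3]))
    return rs, phis, us
```

In a shooting approach one would vary `phi0` and look for the value at which the solution stays nonnegative and decays; that search is not shown here. For very small `phi0` the potential is negligible and the solution should stay close to `phi0 * sin(r) / r`, which gives a simple sanity check.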
We turn next to the uniqueness proof for (6.5). Suppose $\phi \in W^1$ is a nonnegative solution of (6.5). $\phi$ and $U_\phi$ can also be thought of as functions on $\mathbf{R}^3$. Consider the following functional on $W^1(\mathbf{R}^3)$:
$A_\phi(\psi) \equiv \int |\nabla\psi(x)|^2\,dx + \int |\psi(x)|^2 U_\phi(x)\,dx$. (6.13)
Let $\Gamma_\phi = \inf\{A_\phi(\psi) \mid \psi \in W^1,\ \|\psi\|_2 = 1\}$. It follows by standard arguments [or, since $U_\phi(x)$ is symmetric increasing and $U_\phi(x)$ is bounded, by the methods of Theorem 7] that there is a minimizing function for $\Gamma_\phi$. This function satisfies (6.5) and is positive and (also by standard arguments) unique. Therefore it must be proportional to $\phi$ itself. Hence, $A_\phi(\phi) = \|\phi\|_2^2$ and
$A_\phi(\psi) \ge \|\psi\|_2^2$ (6.14)
for any $\psi \in W^1$, and equality holds in (6.14) when $\psi = \phi$.
Suppose there are two different, non-null, nonnegative solutions $\phi_1$ and $\phi_2$ of (6.5). Denote the potentials simply by $U_1$ and $U_2$ and the functionals in (6.13) by $A_1$ and $A_2$. Consider first the case that $\phi_1(r) \ge \phi_2(r)$, all $r \ge 0$. Then $U_1(r) \ge U_2(r)$, all $r \ge 0$. It is easy to check that $B \equiv \int [U_1(x) - U_2(x)]\,\phi_1(x)^2\,dx > 0$. Then $\|\phi_1\|_2^2 \le A_2(\phi_1) = A_1(\phi_1) - B < \|\phi_1\|_2^2$, and this is a contradiction.
Next, suppose that $\psi \equiv \phi_1 - \phi_2$ is not of one sign. It is easy to see by the methods of Theorem 8 that $\phi_1$ and $\phi_2$, and hence $\psi$, are continuous. There are two cases: (i) $\psi(0) \ne 0$, in which case we can assume $\psi(0) > 0$; (ii) there exists an $R \ge 0$ such that $\psi(r) = 0$ for $0 \le r \le R$ and $\psi$ is not identically zero in any open interval of the form $I_\epsilon = (R, R + \epsilon)$. In case (ii) we write $\phi_i(r) = u_i(r)/r$, and (6.5) can be solved for $r \in I_\epsilon$ as
$u_i(r) = u_i(R) + a_i (r - R) + T(r, u_i)$, (6.15)
where
$T(r, f) = \int_R^r (r - s)\,[U_f(s) - 1]\,f(s)\,ds$, (6.16)
and $U_f$ is given by (6.4) with $\phi(r) = \phi_1(r) = \phi_2(r)$ for r
$\sup\{u_i(r) \mid r \in I_\epsilon,\ i = 1, 2\}$ and such that $f(R) = u_1(R) = u_2(R)$. Equip $D_\epsilon$ with the usual sup norm. For sufficiently small $\epsilon > 0$, $T(r, \cdot)$ is a strict contraction on $D_\epsilon$, and hence $u_1 = u_2$ in $I_\epsilon$ if $a_1 = a_2$. If $a_1 > a_2$, then $u_1(r) > u_2(r)$ for some (possibly smaller) open interval $I_\epsilon$. Thus we can say, in either case, that there exists an $R > 0$ such that $\psi(R) = 0$, $\psi(r) \ge 0$ for $r \in I \equiv [0, R]$ and $\psi(r)$ does not vanish identically in $I$. This implies
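that the function $F(r) \equiv \tfrac12 [U_1(r) - U_2(r)] \ge 0$ in $I$. Define $\chi \in W^1$ to be the function $\chi(x) = \psi(x)$, $|x| \le R$, $\chi(x) = 0$, $|x| > R$. From (6.5), $\chi$ satisfies the following equation for $|x| < R$:
$-\Delta\chi(x) + \tfrac12 [U_1(x) + U_2(x)]\,\chi(x) + F(x)\,[\phi_1(x) + \phi_2(x)] = \chi(x)$.
Multiplying this by $\chi$ and integrating by parts yields
$K \equiv \tfrac12 [A_1(\chi) + A_2(\chi)] = \|\chi\|_2^2 - L$,
where $L = \int \chi(x) F(x) [\phi_1(x) + \phi_2(x)]\,dx$. It is easy to see that $L > 0$. On the other hand, $K \ge \|\chi\|_2^2$ by (6.14). This is a contradiction.

Appendix: Proof of two theorems on symmetric decreasing rearrangements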
Given a nonnegative (respectively complex valued) function, $f$, on $\mathbf{R}^n$, then $f^*$ denotes the symmetric decreasing rearrangement of $f$ (respectively $|f|$). We turn first to the strong form of the Riesz theorem [4].
LEMMA 3. Suppose $g$ is a positive spherically symmetric decreasing function on $\mathbf{R}^n$ and $g$ is strictly decreasing (i.e., $|x| < |y| \Rightarrow g(x) > g(y) > 0$). For any two nonnegative functions $f \in L^p(\mathbf{R}^n)$, $h \in L^q(\mathbf{R}^n)$ define
$I(f, g, h) = \int\!\!\int f(x)\,g(x - y)\,h(y)\,dx\,dy$. (A.1)
Then
$I(f, g, h) \le I(f^*, g, h^*)$ (A.2)
with strict inequality whenever $I(f^*, g, h^*) < \infty$, unless the following holds: For some $v \in \mathbf{R}^n$, $f(x - v) = f^*(x)$ and $h(x - v) = h^*(x)$ a.e.
Proof: The Riesz theorem which gives $\le$ in (A.2) will be assumed; our problem will be to prove "less than". By subtracting positive constants, if necessary, from $f$, $g$, and $h$ we can suppose without loss of generality that $f^*$, $h^*$, and $g = g^*$ go to zero at infinity. It can also be assumed that neither $f$ nor $h$ is a null function. We first prove the lemma for $\mathbf{R}^1$. $g$ can be written as
$g(x) = \int \chi_r(x)\,d\mu(r)$, where $\mu$ is a positive measure on $A = [0, \infty]$ and $\chi_r$ is the characteristic function of the interval $[-r, r]$. The hypothesis about $g$ implies that $\mu((a, b)) > 0$ for every open interval $(a, b)$ in $A$. Now suppose that $f$ and $h$ are characteristic functions of two sets $F$ and $H$ of finite measure. Then $f^*$ (resp. $h^*$) is the characteristic function of the closed interval $[-c, c]$ (resp. $[-d, d]$), where $2c = \mathrm{meas}(F)$ ($2d = \mathrm{meas}(H)$). Let $B = [-c - d, c + d]$. If $m \equiv f * h$ (convolution), then $m$ is continuous (by the remark in the proof of Theorem 8, [5]), and $\mathrm{supp}(m) \subset B$ if and only if $F$ and $H$ are equicentered intervals. Let $[-R, R]$ be the smallest symmetric interval that contains $\mathrm{supp}(m)$. For any $f$, $g$, and $h$, $I(f, g, h) = \int g(x)\,m(x)\,dx$. Suppose that $F$ and $H$ are not equicentered intervals. Then for $r \in (c + d, R)$ we have that
$J(r) \equiv \int_{-r}^{r} m(x)\,dx < \int m(x)\,dx = \int f \int h$.
For all $r > 0$, $J(r) \le K(r)$ by the Riesz theorem, where $K(r)$ denotes the same integral computed with $f^*$ and $h^*$ in place of $f$ and $h$; for $r \ge c + d$, $K(r) = \int f \int h$. Therefore
$\int_{c+d}^{R} [K(r) - J(r)]\,d\mu(r) > 0$,
and this proves the lemma for characteristic functions. For arbitrary $f$ and $h$ we can write
$f(x) = \int_0^\infty \chi_a^f(x)\,da$, (A.3)
where $\chi_a^f$ is the characteristic function of the set $\{x \mid f(x) > a\}$, and similarly for $h$. By Fubini's theorem, to have equality in (A.2) we must have for almost all $(a, b)$ (in the sense of $\mathbf{R}^2$ Lebesgue measure) that there exists a $v \in \mathbf{R}^1$ such that $\chi_a^f(\cdot - v)$ and $\chi_b^h(\cdot - v)$ are (a.e.) the characteristic functions of symmetric intervals. This $v$, if it exists, cannot depend on $a$ or $b$. [To see this, choose an $a$ such that $\chi_a^f$ is not null. Then the $v$ such that $\chi_a^f(\cdot - v)$ is symmetric is unique, and hence cannot depend on $b$.] Hence, for equality, there exists a fixed $v$ such that $\chi_a^f(\cdot - v)$ [$\chi_b^h(\cdot - v)$] is symmetric (a.e.) for almost all $a$ (in the $\mathbf{R}^1$ sense) [almost all $b$]. By (A.3), $f$ and $h$ then satisfy the last line of the lemma.
Next we turn to $\mathbf{R}^{n+1}$ and suppose the lemma to be true for $\mathbf{R}^n$. $f$ and $h$ can be assumed to be Borel measurable. If $x = (x_1, \ldots, x_n) \in \mathbf{R}^n$ and $y \in \mathbf{R}^1$, consider $F_y(x) = f(x_1, \ldots, x_n, y)$ to be a function on $\mathbf{R}^n$. $G_y$ and $H_y$ are defined similarly. $G_y$ satisfies the hypothesis of the lemma for each $y$. In (A.1) first do the integral over $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$, holding $x_{n+1}$ and $y_{n+1}$ fixed. By induction, equality holds in (A.2) only if $F_y$ and $H_z$ are equicentered functions in $S$ for almost all $(y, z)$ (in the $\mathbf{R}^2$ sense). By the same argument as given above for the $\mathbf{R}^1$ case, the displacement $v \in \mathbf{R}^n$ must be independent of $y$ and $z$. If the argument is repeated holding some other coordinate [not necessarily orthogonal to the $(n+1)$th] fixed, we conclude that for equality there exists $w \in \mathbf{R}^{n+1}$ such that the two translated functions $f' \equiv f(\cdot - w)$ and $h' \equiv h(\cdot - w)$ have the following property: Let $P_t$ be any family of parallel $n$-dimensional hyperplanes in $\mathbf{R}^{n+1}$ parametrized by the distance $t$ from the origin, and let $f_t$ be $f'$ restricted to $P_t$. Then for almost all $t$, $f_t$ can be modified on a set of measure zero such that $f_t$ is symmetric decreasing.
By standard but tedious arguments (see the appendix of [2] for details), this implies that the last line of the lemma holds.
The next theorem concerns the behavior of the $W^1$ norm under rearrangement of a function.
LEMMA 5. If $\phi \in W^1(\mathbf{R}^n)$, then $\phi^* \in W^1(\mathbf{R}^n)$ and $\|\nabla\phi\|_2 \ge \|\nabla\phi^*\|_2$.
Proof: Let $t > 0$ and consider the following function on $\mathbf{R}^n$: $G_t(x) = (4\pi t)^{-n/2} \exp(-x^2/4t)$. $G_t$ is a kernel for $e^{t\Delta}$, the fundamental solution of the heat equation. $G_t$ is in all the $L^p$ spaces, so
$I_t(\phi) = t^{-1}\{\int |\phi(x)|^2\,dx - \int\!\!\int \bar\phi(x)\,G_t(x - y)\,\phi(y)\,dx\,dy\}$ (A.4)
is well defined. By Riesz's rearrangement theorem (3.9), $I_t(\phi) \ge I_t(\phi^*)$, since $G_t(x)$ is symmetric decreasing. $\phi^* \in L^2(\mathbf{R}^n)$, since $\phi \in L^2(\mathbf{R}^n)$. To complete the proof we have to show that for any $f \in L^2(\mathbf{R}^n)$,
$\lim_{t \downarrow 0} I_t(f) = \|\nabla f\|_2^2$ if $f \in W^1$, (A.5)
$\lim_{t \downarrow 0} I_t(f) = \infty$ if $f \notin W^1$. (A.6)
Recall that for $f \in L^2(\mathbf{R}^n)$, $\|\nabla f\|_2^2 = \int k^2 |\hat f(k)|^2\,dk$ by definition, where $\hat f$ is the Fourier transform of $f$. We can rewrite (A.4) as
$I_t(f) = \int |\hat f(k)|^2 \{t^{-1}[1 - \exp(-k^2 t)]\}\,dk$. (A.7)
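The passage from (A.4) to (A.7), and the two bounds used to finish the proof, can be written out as follows. This sketch assumes the unitary Fourier convention $\hat f(k) = (2\pi)^{-n/2}\int f(x) e^{-ik\cdot x}\,dx$, under which convolution with $G_t$ becomes multiplication by $e^{-k^2 t}$; the convention is not fixed in the text.

```latex
\begin{align*}
\int\!\!\int \overline{f(x)}\,G_t(x-y)\,f(y)\,dx\,dy
  &= \int |\hat f(k)|^2\, e^{-k^2 t}\,dk,\\
I_t(f) &= \int |\hat f(k)|^2\; t^{-1}\bigl[1 - e^{-k^2 t}\bigr]\,dk,\\
\frac{k^2}{1 + k^2 t} \;\le\; t^{-1}\bigl[1 - e^{-k^2 t}\bigr] &\;\le\; k^2,
  \qquad t^{-1}\bigl[1 - e^{-k^2 t}\bigr] \to k^2 \ \text{as } t \downarrow 0 .
\end{align*}
```

The upper bound supplies the dominating function $k^2|\hat f(k)|^2$ when $f \in W^1$; the lower bound increases to $k^2$ as $t \downarrow 0$, so the integral diverges when $\int k^2 |\hat f(k)|^2\,dk = \infty$.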
Suppose $f \in W^1$. Since $1 - e^{-x} \le x$, $t^{-1}[1 - \exp(-k^2 t)] \le k^2$ and (A.5) is true by dominated convergence. Suppose $f \notin W^1$. Since $1 - e^{-x} \ge 1 - (1 + x)^{-1} = x(1 + x)^{-1}$, $t^{-1}[1 - \exp(-k^2 t)] \ge k^2 (1 + k^2 t)^{-1}$. (A.6) follows from this.

References
1. T. AUBIN, Problèmes isopérimétriques et espaces de Sobolev, C. R. Acad. Sci. Paris 280, 279-281 (1975).
2. H. J. BRASCAMP, E. H. LIEB, and J. M. LUTTINGER, A general rearrangement inequality for multiple integrals, J. Funct. Anal. 17, 227-237 (1974).
3. W. FELLER, An Introduction to Probability Theory and its Applications, Vol. 2, Wiley, New York, 1966, p. 261.
4. F. RIESZ, Sur une inégalité intégrale, J. LMS 5, 162-168 (1930).
5. G. ROSEN, Minimum value for c in the Sobolev inequality $\|\phi^3\| \le c\|\nabla\phi\|^3$, SIAM J. Appl. Math. 21, 30-32 (1971).
6. W. RUDIN, Fourier Analysis on Groups, Interscience, New York, 1962.
7. G. TALENTI, Best constant in the Sobolev inequality, to be published.
PRINCETON UNIVERSITY
(Received November 15, 1976)
With F. Almgren in Bull. Amer. Math. Soc. 20, 177-180 (1989)
BULLETIN (New Series) OF THE AMERICAN MATHEMATICAL SOCIETY, Volume 20, Number 2, April 1989

SYMMETRIC DECREASING REARRANGEMENT CAN BE DISCONTINUOUS
FREDERICK J. ALMGREN, JR. AND ELLIOTT H. LIEB
Suppose $f(x_1, x_2) \ge 0$ is a continuously differentiable function supported in the unit disk in the plane. Its symmetric decreasing rearrangement is the rotationally invariant function $f^*(x_1, x_2)$ whose level sets are circles enclosing the same area as the level sets of $f$. Such rearrangement preserves $L^p$ norms but decreases convex gradient integrals, e.g. $\|\nabla f^*\|_p \le \|\nabla f\|_p$ ($1 \le p < \infty$). Now suppose that $f_j(x_1, x_2) \ge 0$ ($j = 1, 2, 3, \ldots$) is a sequence of infinitely differentiable functions also supported in the unit disk which converge uniformly together with first derivatives to $f$. The symmetrized functions also converge uniformly. The real question is about convergence of the derivatives of the symmetrized functions. We announce that the derivatives of the symmetrized functions need not converge strongly, e.g. it can happen that $\|\nabla f_j^* - \nabla f^*\|_p \not\to 0$ for every $p$. We further characterize exactly those $f$'s for which convergence is assured and for which it can fail.
The rearrangement map $\mathcal{R}: f \to f^*$ in general dimensions also decreases gradient norms. For this reason alone, rearrangement has long been a basic tool in the calculus of variations and in the theory of those PDE's that arise as Euler-Lagrange equations of variational problems; it permits one to concentrate attention on radial, monotone functions and thereby reduces many problems to simple one dimensional ones. Some examples are (i) the lowest eigenfunction of the Laplacian in a ball is symmetric decreasing; (ii) the body with smallest capacity for a given volume is a ball [PS]; (iii) the optimal functions for the Sobolev and Hardy-Littlewood-Sobolev inequalities are symmetric decreasing and can be explicitly calculated [LE]. Other examples are given in [KB].
Obviously $\mathcal{R}$ is highly nonlocal, nonlinear, and nonintuitive, but the property of decreasing gradient norms would lead one to surmise that $\mathcal{R}$ is a smoothing operator in some sense. Thus when W. Ni and L. Nirenberg asked, some years ago, whether $\mathcal{R}$ is continuous in the $W^{1,p}(\mathbf{R}^n)$ topology the answer appeared to be that it should be so (it is easy to prove that $\mathcal{R}$ is always a contraction in $L^p$). Indeed, by an elegant analysis Coron [CJ] proved this in $\mathbf{R}^1$. An affirmative answer to this question would have meant that the mountain-pass lemma could be used to establish spherically symmetric solutions of certain PDE's, and Coron's result led to just such an application [RS]. Our result is that $\mathcal{R}$ is not continuous in $W^{1,p}(\mathbf{R}^n)$ for $n \ge 2$ and it is surprising, to us at least. Since almost all applications of $\mathcal{R}$, apart from the mountain-pass application, do not rely on continuity, our result does not have much immediate impact on applications. It reveals, however, an unexpected subtlety about the geometry of level sets of functions and shows that intuition can be very wrong. More precisely, our analysis has led us to isolate a property of functions on their critical sets which we call co-area regularity, in terms of which we prove [AL].

Received by the editors October 17, 1988 and, in revised form, November 29, 1988.
1980 Mathematics Subject Classification (1985 Revision). Primary 46E35; Secondary 26B99, 47B38.
©1989 American Mathematical Society
MAIN THEOREM. The rearrangement map $\mathcal{R}$ is $W^{1,p}(\mathbf{R}^n)$ continuous at a function $f$ if and only if $f$ is co-area regular.

Each $W^{1,p}$ function on the line is automatically co-area regular. In higher dimensions both the regular and irregular functions are dense in $W^{1,p}$.
The symmetric decreasing rearrangement of a vector valued function $f$ is defined by setting $f^* = |f|^*$. One can also replace norms by gradient integrals of other convex integrands $\psi: \mathbf{R}^+ \to \mathbf{R}^+$, i.e. $\|\nabla f\|_p^p = \int |\nabla f|^p\,d\mathcal{L}^n$ ($\mathcal{L}^n$ is Lebesgue measure) is replaced by $\int \psi(|\nabla f|)\,d\mathcal{L}^n$. Our conclusions about continuity remain the same. However, for each $0 < \alpha < 1$, each $p > 1$, and each $n \ge 1$ we show that the rearrangement map $\mathcal{R}$ is continuous everywhere on the fractional Sobolev space $W^{\alpha,p}(\mathbf{R}^n)$. We thus have the curious fact that co-area regularity plays a role for $W^{\alpha,p}$ only when $\alpha = 1$.
DEFINITION. Suppose $f: \mathbf{R}^n \to \mathbf{R}^+$ and set
$\mathcal{G}_f(y) = \int \chi_{\{f > y\}}\,\chi_{\{\nabla f = 0\}}\,d\mathcal{L}^n$
for each positive number $y$; here $\chi_A$ denotes the characteristic function of the set $A$. Since $\mathcal{G}_f: \mathbf{R}^+ \to \mathbf{R}^+$ is nonincreasing, its distribution first derivative $\mathcal{G}_f'$ is a (negative) measure. Our function $f$ is called co-area regular if and only if the measure $\mathcal{G}_f'$ is purely singular with respect to $\mathcal{L}^1$. Otherwise $f$ is called co-area irregular. The term co-area regular was suggested by H. Federer's "co-area formula" for the absolutely continuous function
$y \mapsto \int \chi_{\{f > y\}}\,\chi_{\{\nabla f \ne 0\}}\,d\mathcal{L}^n$
which is complementary to our $\mathcal{G}_f$. We also announce
THEOREM. For each $n \ge 2$ and each $0 < \lambda < 1$, there is (by construction) a positive constant $C$ and a function $f: \mathbf{R}^n \to [0, 1]$ in $C^{n-1,\lambda}$ whose support is the unit cube $Q$ such that $\mathcal{G}_f(y) = C(1 - y)$ for each $0 < y < 1$. In particular, the measure $\mathcal{G}_f'$ is absolutely continuous with respect to $\mathcal{L}^1$; thus $f$ is a co-area irregular function.
Each $f$ in $C^{n-1,1}$ turns out to be co-area regular and both the regular and the irregular functions are dense in $W^{1,p}$ for $n \ge 2$.
The idea behind the construction above is to decompose $Q$ into $2^n$ cubes $Q_j$ of half the size, then decompose each of those into $2^n$ $Q_{jk}$'s and so on. We first set $f(x) = \sum_{i=1}^\infty a_i(x)\,2^{-ni}$ where $a_i(x)$ equals $(l - 1)$ when $x$ belongs to the cube $Q_{\ldots l \ldots}$ and $l \in \{1, \ldots, 2^n\}$ is the index in the $i$th position. This $f$ is not continuous but its range is uniformly spread over $(0, 1)$. The second step is to "smooth" this $f$ in such a way that it belongs to $C^{n-1,\lambda}$ and $\mathcal{L}^n\{x: \nabla f = 0\} > 0$.
A fuller statement of failure of continuity is the following.
THEOREM (DISCONTINUITY AT CO-AREA IRREGULAR FUNCTIONS). Suppose $n \ge 2$ and $f$ is a co-area irregular function belonging to $W^{1,p}(\mathbf{R}^n)$. Then there is a sequence $f_1, f_2, f_3, \ldots$ of infinitely differentiable functions in $W^{1,p}(\mathbf{R}^n)$ such that $f_j \to f$ in $W^{1,p}(\mathbf{R}^n)$ as $j \to \infty$ but $f_j^* \not\to f^*$ in $W^{1,p}(\mathbf{R}^n)$.
The basic idea behind the proof is the following. Let $U_j$ be a suitable smooth approximation of $\chi_{\{\nabla f = 0\}}$ and set
$f_j(x) = f(x) + j^{-1}\,U_j(x)\,\sin(j f(x))$
for each $x$. We confirm that $f_j \to f$ in $W^{1,p}$ as $j \to \infty$. Defining sets $K_j(y) = \{x: f_j(x) > y\}$ and $K(y) = \{x: f(x) > y\}$ for each $y$, we check for integers $m$ that $K(y) = K_j(y)$ when $y = (2m)(\pi/j)$ while $K(y)$ is generally a proper subset of $K_j(y)$ when $0 < \alpha < 1$ and $y = (2m + \alpha)(\pi/j)$. Since $K_j$ and $K$ define $f_j^*$ and $f^*$ one can estimate (by using the Schwarz inequality several times and a simple Sobolev inequality) that $\|\nabla f_j^* - \nabla f^*\|_p \ge (\text{constant}) \int h^{1/2}\,d\mathcal{L}^1$, where $\mathcal{L}^1 \wedge h$ denotes the absolutely continuous part of $-\mathcal{G}_f'$.
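The relation between the level sets $K(y)$ and $K_j(y)$ follows from a one-variable monotonicity argument. The sketch below assumes that the perturbation has the form $f_j = f + j^{-1} U_j \sin(jf)$ with $0 \le U_j \le 1$ (the coefficient $j^{-1}$ and the bound on $U_j$ are read from context and are not stated explicitly in the text); the helper function $g_x$ is introduced only for this computation.

```latex
\begin{align*}
g_x(s) &= s + j^{-1}\,U_j(x)\,\sin(js), \qquad f_j(x) = g_x\bigl(f(x)\bigr),\\
g_x'(s) &= 1 + U_j(x)\cos(js) \;\ge\; 0 \qquad (0 \le U_j \le 1),\\
g_x(2m\pi/j) &= 2m\pi/j \;\Longrightarrow\; K_j(y) = K(y) \quad\text{for } y = 2m\pi/j,\\
g_x(s) &\ge s \quad\text{for } 2m\pi \le js \le (2m+1)\pi
  \;\Longrightarrow\; K(y) \subset K_j(y) \quad\text{for } y = (2m+\alpha)\pi/j .
\end{align*}
```

Since $g_x'$ vanishes at most at isolated points, $g_x$ is strictly increasing, which is what makes the level sets coincide exactly at the heights $y = 2m\pi/j$.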
Now, suppose that $f_j \to f$ in $W^{1,p}$ and that $f$ is co-area regular. As a further part of our Main Theorem we will indicate why $f_j^* \to f^*$ in $W^{1,p}$. We infer, using Federer's co-area formula and dominated convergence, that
$\omega_f(y) = \int_{f^{-1}(y)} \frac{1}{|\nabla f|}\,d\mathcal{H}^{n-1}$
is well defined and finite for $\mathcal{L}^1$ almost every $y$ ($\mathcal{H}^{n-1}$ denotes Hausdorff measure) and that
$\int \omega_f\,d\mathcal{L}^1 = \int \chi_{\{\nabla f \ne 0\}}\,d\mathcal{L}^n$.
The co-area formula fails to give information about the set $\{x: \nabla f = 0\}$. This missing information is contained in $\mathcal{G}_f$.
To compute $f^*$ we must compute $\alpha_f(y) = \int \chi_{\{f > y\}}\,d\mathcal{L}^n$ and we have $\alpha_f'(y) = -\mathcal{L}^1 \wedge \omega_f(y) + \mathcal{G}_f'(y)$. We show that $\liminf_{j \to \infty} \omega_{f_j}(y) \ge \omega_f(y)$ for almost every $y$. Let $\mathcal{L}^1 \wedge \delta^{(j)}(y)$ be the absolutely continuous part of $\alpha_{f_j}'(y)$. Since $\mathcal{G}_f'$ is purely singular (this is where co-area regularity is used) we infer that
(1) $\liminf_{j \to \infty} \delta^{(j)}(y) \ge \delta(y)$.
To prove the convergence of $\nabla f_j^*$ to $\nabla f^*$ we prove the convergence of arc length of the one dimensional graphs representing these functions in polar coordinates (this is the geometrically invariant notion). It turns out (using several involved convexity arguments) that if we use the $L^p$ convergence of $f_j^*$ to $f^*$ (so that the graphs converge pointwise) only the absolutely continuous pieces $\delta^{(j)}$ are needed and that (1) suffices for our purposes.

REFERENCES
[AL] F. Almgren and E. Lieb, Symmetric decreasing rearrangement is sometimes continuous (submitted).
[CJ] J-M. Coron, The continuity of the rearrangement in $W^{1,p}(\mathbf{R})$, Ann. Scuola Norm. Sup. Pisa Ser. 4 11 (1984), 57-85.
[KB] B. Kawohl, Rearrangements and convexity of level sets in PDE, Lecture Notes in Math., vol. 1150, Springer-Verlag, Berlin and New York, 1985, 134 pp.
[LE] E. Lieb, Sharp constants in the Hardy-Littlewood-Sobolev and related inequalities, Ann. of Math. (2) 118 (1983), 349-374.
[PS] G. Pólya and G. Szegő, Isoperimetric inequalities in mathematical physics, Ann. of Math. Studies no. 27, Princeton Univ. Press, Princeton, N. J., 1952.
[RS] B. Ruf and S. Solimini, On a class of superlinear Sturm-Liouville problems with arbitrarily many solutions, SIAM J. Math. Anal. 17 (1986), 761-771.
DEPARTMENT OF MATHEMATICS, PRINCETON UNIVERSITY, PRINCETON, NEW JERSEY 08544
With F. Almgren in Symposia Mathematica, vol. XXX, 89-102 (1989)
THE (NON) CONTINUITY OF SYMMETRIC DECREASING REARRANGEMENT
FREDERICK J. ALMGREN JR. - ELLIOTT H. LIEB

Abstract. The operation $\mathcal{R}$ of symmetric decreasing rearrangement maps $W^{1,p}(\mathbf{R}^n)$ to $W^{1,p}(\mathbf{R}^n)$. Even though it is norm decreasing we show that $\mathcal{R}$ is not continuous for $n \ge 2$. The functions at which $\mathcal{R}$ is continuous are precisely characterized by a new property called co-area regularity. Every sufficiently differentiable function is co-area regular, and both the regular and the irregular functions are dense in $W^{1,p}(\mathbf{R}^n)$.
1. INTRODUCTION
Suppose $f(x^1, x^2) \ge 0$ is a continuously differentiable function supported in the unit disk in the plane. Its rearrangement is the rotationally invariant function $f^*(x^1, x^2)$ whose level sets are circles enclosing the same area as the level sets of $f$, i.e.
$\mathcal{L}^2\{(x^1, x^2): f^*(x^1, x^2) > y\} = \mathcal{L}^2\{(x^1, x^2): f(x^1, x^2) > y\}$
for each positive height $y$ ($\mathcal{L}^n$ denotes Lebesgue measure over $\mathbf{R}^n$). Such rearrangement preserves $L^p$ norms, i.e.
$\int |f^*|^p\,d\mathcal{L}^2 = \int |f|^p\,d\mathcal{L}^2$ ($1 \le p < \infty$)
but decreases convex gradient integrals, e.g.
$\int |\nabla f^*|^p\,d\mathcal{L}^2 \le \int |\nabla f|^p\,d\mathcal{L}^2$.
Now suppose that $f_j(x^1, x^2) \ge 0$ ($j = 1, 2, 3, \ldots$) is a sequence of continuously differentiable functions also supported in the unit disk which converge uniformly together with first derivatives to $f$, i.e.
$f_j(x^1, x^2) \to f(x^1, x^2)$ and $\nabla f_j(x^1, x^2) \to \nabla f(x^1, x^2)$
uniformly in $(x^1, x^2)$ as $j \to \infty$. It is not difficult to check that the symmetrized functions also converge uniformly. The real question is about convergence of the derivatives of the symmetrized functions. It is certainly plausible that they should converge strongly (we believed it for some time). Our principal new result is that the derivatives of the symmetrized functions need not converge strongly, e.g. for special $f$'s and $f_j$'s satisfying our conditions above it can happen that for every $p$
$\liminf_{j \to \infty} \int |\nabla f_j^* - \nabla f^*|^p\,d\mathcal{L}^2 > 0$.
Furthermore, we are able to characterize exactly those $f$'s for which convergence is assured and for which it can fail.
The general notion of the symmetric decreasing rearrangement $f^*$ of a function $f: \mathbf{R}^n \to \mathbf{R}^+$ is important in various parts of analysis. For example, various rotationally invariant variational integrals (like the gradient norms mentioned above) are not increased by symmetrization of competing functions. One is then free to search for a minimum among rotationally invariant decreasing functions (which are much easier to analyze since they are essentially functions of a single independent variable). A particular application of this technique has been in the computation of optimal constants for Sobolev inequalities.
Some years ago W. Ni and L. Nirenberg raised the question whether the rearrangement map $\mathcal{R}: f \to f^*$ is strongly continuous in the $W^{1,p}(\mathbf{R}^n)$ topology for all $1 \le p < \infty$ (this would facilitate application of the «mountain pass lemma», for example). J-M. Coron [CJ] showed such strong continuity (and more) to be true in case $n = 1$, and we, at least, were led to the «obvious» conjecture that continuity holds for all $n$. We have settled this question [AL]: rearrangement is not continuous in dimensions larger than one. As indicated above, we can also identify precisely those $f$'s at which the map $\mathcal{R}$ is continuous and those at which it is not. Our analysis has led us to isolate a property of functions which we call co-area regularity which deals with the behavior of functions on their critical sets. For $W^{1,p}$ functions our main result is
THEOREM 1. [AL] For each $1 \le p < \infty$ the rearrangement map $\mathcal{R}$ is $W^{1,p}(\mathbf{R}^n)$ continuous at a function $f$ if and only if $f$ is co-area regular.
Each $W^{1,p}$ function on the line turns out to be necessarily co-area regular so that our theorem is consistent with Coron's result. For higher dimensional domains, however, there are always functions which are not co-area regular. In particular, in $\mathbf{R}^n$ ($n \ge 2$) there are irregular functions in $C^{n-1,\lambda}$ for each $0 < \lambda < 1$ (i.e. $f$'s which are $n - 1$ times continuously differentiable with $(n-1)$th derivatives which are Hölder continuous with exponent $\lambda$). In fact these irregular functions are dense in $W^{1,p}(\mathbf{R}^n)$. However, each $f$ with Lipschitz $(n-1)$th derivatives (i.e. $\lambda = 1$) is co-area regular.
In this note we shall briefly review symmetric rearrangement, introduce co-area regularity, sketch the construction of a co-area irregular function, give the reason that co-area irregularity implies lack of continuity of $\mathcal{R}$ in $W^{1,p}$, and finally sketch the reason that co-area regularity implies continuity of $\mathcal{R}$. Our proof of continuity discussed here uses the theory of rectifiable currents in an essential way. The version in [AL] uses more traditional functional analysis instead.
REMARK. One sometimes defines the symmetric decreasing rearrangement of a vector valued function $f: \mathbf{R}^n \to \mathbf{R}^m$ (as well as functions $\mathbf{R}^n \to \mathbf{R}^+$) by setting $f^* = |f|^*$. Sometimes it is also of interest to replace $W^{1,p}$ norms by gradient energies associated with integrals of other convex integrands $\psi: \mathbf{R}^+ \to \mathbf{R}^+$, i.e. $\|\nabla f\|_p^p = \int |\nabla f|^p\,d\mathcal{L}^n$ is replaced by $\int \psi(|\nabla f|)\,d\mathcal{L}^n$. These two generalizations are carried out in [AL] but are omitted here for simplicity. The conclusions about continuity remain the same.
THEOREM 2. For each 0 < a < I, each l < p < co, and each n > 1, the marrangement map R is continuous on the fractional Sobolev space W( R") .
For 0 < a < I the norm IIfIIwu, is given by
f (v)IP1,-vI-"-P°dC"zdf"y. We have the curious conclusion that co-area regularity plays a role for W°-P only when a = I. Fractional derivatives, of course, are not a local construct.
2. REARRANGEMENTS AND CO-AREA REGULARITY

2.1. Rearrangements
We review the definition and basic properties of the symmetric decreasing rearrangement $f^* = \mathcal{R}f$ of a function $f: \mathbf{R}^n \to \mathbf{R}^+$. It is convenient to use the notation $\chi_{\{A\}}: \mathbf{R}^n \to \{0, 1\}$ symbolically to denote a function which takes value 1 when the test $A$ is passed and takes value 0 otherwise; e.g. $\chi_{\{f > y\}}(x)$ equals 1 in case $f(x) > y$ and equals 0 otherwise. Also we associate to a fixed function $f$ a radius function $R: \mathbf{R}^+ \to \mathbf{R}^+$ defined by requiring
(2.1) $\alpha(n)\,R(y)^n = \int \chi_{\{f > y\}}\,d\mathcal{L}^n$
for each $y$; here $\alpha(n)$ is the volume of the unit ball in $\mathbf{R}^n$. We further denote by $\chi_R: \mathbf{R}^n \to \{0, 1\}$ the characteristic function of the open ball centered at the origin and of radius $R$. Finally, our rearranged function $f^*$ is defined by setting
(2.2) $f^*(x) = \int_{y > 0} \chi_{R(y)}(x)\,d\mathcal{L}^1 y$
for each $x$. It is immediate to check that $f^*$ is symmetric and decreasing, i.e. $f^*(x) = f^*(z)$ if $|z| = |x|$ and $0 \le f^*(x) \le f^*(z)$ if $|x| \ge |z|$. It is also clear that $f^*$ is equimeasurable with $f$, i.e.
(2.3) $\mathcal{L}^n(\{x: f(x) > y\}) = \mathcal{L}^n(\{x: f^*(x) > y\})$
for each $y > 0$.
Equation (2.3) implies immediately that rearrangement preserves $L^p$ norms, i.e.
(2.4) $\|f\|_p = \|f^*\|_p$.
Moreover [CG], rearrangement is a contraction on $L^p$, i.e.
(2.5) $\|f - g\|_p \ge \|f^* - g^*\|_p$
whenever $f, g \in L^p$.
In particular, $\mathcal{R}$ is a continuous map from $L^p$ into $L^p$.
(2.6)
livfllp >
This implies that Rf also belongs to Wt "P. (Actually, when p = I it is not obvious that f' is in W1,1 and not merely in BV; this was proved by Hildcn
486
The (Non) Continuity of Symmetric Decreasing Rearrangement
The (non) conuwity of rynrnaric decreeing wr igement
93
[H].) However, 7L is not a contraction mapping. Indeed, (IVf - VgjIP can be arbitrarily large compared to ITV f - Vg*11p. To see why this can happen, suppose
that f, g : R -+ R' are smooth functions with f (z) = g(x) for x < 0 and f (x) > g(x) for x > 0. Suppose also, for x < 0, that both V f (and hence Vg) are very large in Lo norm while, for z > 0, both V f and Vg are of order I in LP norm. Then JjV f - VgDDP is of order one because of the cancellation for x < 0. On the other hand it is easy to arrange things so that the rearrangement destroys this cancellation so that l V f - Vg' 11P will be large. These facts suggest some of the subtlely of questions about the continuity of R on W' JP. We can phrase our question in the following way.
Given f, f1, f2.... in W'-P with ff - fin W'.P, isit tnie that Af = IIV f f V f IP ultimately converges to Oar j -+ oo even though Al maybe large for very many j 's 7
2.2. Co-area Regularity
Instead of the integral in (2.1) representing the full cross-sectional area at height $y$ of the subgraph of our function $f$, consider the integral
(2.7) $\mathcal{G}_f(y) = \int \chi_{\{f > y\}}\,\chi_{\{\nabla f = 0\}}\,d\mathcal{L}^n$
which, for each $y$, represents that part of the cross-section of the subgraph associated with critical points of $f$. Since our function $\mathcal{G}_f: \mathbf{R}^+ \to \mathbf{R}^+$ is nonincreasing its distribution first derivative $\mathcal{G}_f'$ is a (negative) measure. Since a smooth function must be constant on any connected open set on which its gradient vanishes, there are many functions $f$ for which the contributions to the integral in (2.7) come only from flat parts of the graph corresponding to those positive numbers $y$ for which the set $\{x: f(x) = y\}$ has positive measure. Since there can be at most countably many such $y$'s, the measure $\mathcal{G}_f'$ would then be singular with respect to $\mathcal{L}^1$ on $\mathbf{R}^+$. This situation is not the most general one, however, and there are «irregular» smooth functions $f$ for which the measure $\mathcal{G}_f'$ has an absolutely continuous piece as well. Indeed, we have the following theorem.
THEOREM 3. [AL] For each $n \ge 2$ and each $0 < \lambda < 1$, there is (by construction) a positive constant $C$ and a function $f: \mathbf{R}^n \to [0, 1]$ with the following properties.
(1) The function $f$ belongs to $C^{n-1,\lambda}(\mathbf{R}^n)$ and has support equal to the cube
$Q = \{x: |x^i| \le 1 \text{ for each } i = 1, \ldots, n\}$ of side length 2.
(2) For each $0 < y < 1$,
$\mathcal{G}_f(y) = C(1 - y)$.
In particular, the measure $\mathcal{G}_f'$ is absolutely continuous with respect to $\mathcal{L}^1$. Thus $f$ is co-area irregular. (See the Definition below.)
It can be difficult to picture such a function. Somehow its gradient vanishes on a set of positive $\mathcal{L}^n$ measure containing no open subsets or flat spots, i.e. $\mathcal{L}^n(\{x : f(x) = y\}) = 0$ for every $y$. Furthermore, the image of the critical set is distributed uniformly over all $y$ values in the range $[0, 1]$. Theorem 3 also tells us that the following definition is not an empty one.
DEFINITION. A function $f$ in $W^{1,p}$ is called co-area regular if and only if the measure $\mathcal{G}_f'$ (see (2.7)) is purely singular with respect to $\mathcal{L}^1$. Otherwise $f$ is called co-area irregular.
The term co-area in these definitions was suggested by H. Federer's "co-area formula" which gives an integral representation of the absolutely continuous function
$y \mapsto \int \chi_{\{f > y\}}\, \chi_{\{\nabla f \ne 0\}}\, d\mathcal{L}^n$.
A mild generalization [AL] of the Morse-Sard-Federer theorem shows that each $f$ belonging to $C^{n-1,1}$ is automatically co-area regular. An easy argument then shows
THEOREM 4. [AL] For each $n \ge 2$ and each $p \ge 1$, the co-area regular and the co-area irregular functions are each dense in $W^{1,p}(\mathbf{R}^n)$.
Questions of the behavior of functions on their critical sets have a substantial mathematical heritage both in theory and in examples. We here sketch the construction of a function $f$ as in Theorem 3 when $n = 2$. First set $f(x) = 0$ for $x \notin Q$. For $x \in Q$ we will use 4-adic notation to express the values of our $f$, i.e. we will write
$f(x) = \sum_{l=1}^{\infty} 4^{-l} a_l(x)$
with $a_l(x) \in \{0, 1, 2, 3\}$. First divide $Q$ in the obvious way into four squares each of side length 1 and label these squares $S_0^{(1)}, S_1^{(1)}, S_2^{(1)}, S_3^{(1)}$ in clockwise order. Set $a_1(x) = j$ if $x \in S_j^{(1)}$ (don't worry about the boundaries of the $S^{(1)}$'s). Next, divide each $S_j^{(1)}$ into four squares each of side length $\frac{1}{2}$ and label these $S_{jk}^{(2)}$ (with $k = 0, 1, 2, 3$) in the
same clockwise order. Set $a_2(x) = k$ if $x \in S_{jk}^{(2)}$. The construction continues in the obvious way ultimately to define an $f$. For each $0 < a < b < 1$ we have $\mathcal{L}^2(f^{-1}(a, b)) = 4(b - a)$. At present our $f$ is not even continuous, much less smooth. We fix this up by modifying this construction. We replace each $a_l$ by a carefully constructed smooth function $b_l$ in our sum above. The support of each $b_l$ is contained within the $4^{l-1}$ squares on which $b_{l-1}$ assumes constant values, and $b_l$ assumes constant values on $4^l$ squares nested within the $b_{l-1}$ constant value squares. The subgraph then resembles a union of step pyramids (like Zoser not Cheops) with those at the $l$-th level having bases on the tops of those at the $(l-1)$-th level. With some effort one can construct the $b_l$'s so that $f \in C^{1,\lambda}$ and $\{x : \nabla f = 0\}$ has positive measure. As expected the measure of the set $\{x : \nabla f = 0\}$ goes to zero as $\lambda$ approaches 1.
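The discontinuous first stage of this construction (before the digits are smoothed) is easy to experiment with. The following is only an illustrative sketch: the function names, the choice of the upper-left square as label 0, and the truncation to finitely many digits are assumptions made here, not part of the construction in [AL]. It checks that the truncated function spreads area uniformly over heights, in line with $\mathcal{L}^2(f^{-1}(a,b)) = 4(b-a)$.

```python
def digit(u, v):
    """4-adic digit of (u, v) in [0,1)^2: the four sub-squares are labelled
    0, 1, 2, 3 clockwise, starting (by an arbitrary choice) at the upper left."""
    top, right = v >= 0.5, u >= 0.5
    if top:
        return 1 if right else 0
    return 2 if right else 3


def f_truncated(x, y, levels):
    """Sum of the first `levels` terms 4^(-l) * a_l for (x, y) in Q = [-1,1)^2."""
    u, v = (x + 1.0) / 2.0, (y + 1.0) / 2.0   # rescale Q to the unit square
    value = 0.0
    for l in range(1, levels + 1):
        value += digit(u, v) * 4.0 ** (-l)
        # zoom into the sub-square that was just selected
        u, v = (2.0 * u) % 1.0, (2.0 * v) % 1.0
    return value


def level_set_area(a, b, levels):
    """Area of {a < f_truncated < b}; f_truncated is constant on each of the
    4^levels grid cells, so evaluating at cell centres is exact."""
    side = 2 ** levels
    cell_area = 4.0 / (side * side)
    count = 0
    for i in range(side):
        for j in range(side):
            x = -1.0 + (2.0 * i + 1.0) / side
            y = -1.0 + (2.0 * j + 1.0) / side
            if a < f_truncated(x, y, levels) < b:
                count += 1
    return count * cell_area
```

With four digits, `level_set_area(0.25, 0.75, 4)` differs from $4(b-a) = 2$ only by the area of one grid cell, and the discrepancy shrinks as more digits are kept.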
3. REARRANGEMENT IS DISCONTINUOUS AT CO-AREA IRREGULAR FUNCTIONS
THEOREM 5. [AL] Suppose $n \ge 2$ and $f$ is a co-area irregular function belonging to $W^{1,p}(\mathbf{R}^n)$. Then there is a sequence $f_1, f_2, f_3, \ldots$ of functions in $W^{1,p}(\mathbf{R}^n)$ such that $f_j \to f$ in $W^{1,p}(\mathbf{R}^n)$ as $j \to \infty$ but $f_j^* \not\to f^*$. Moreover, for each $\varepsilon > 0$, the $f_j$'s can be chosen with the following properties.
(1) The sequence of differences $f_j - f$ converges to zero in $L^\infty(\mathbf{R}^n)$.
(2) There is a positive number $Y$ such that
$f_j(x) = f(x)$ whenever $f(x) \notin (Y, Y + \varepsilon)$
and
$Y < f_j(x) < Y + \varepsilon$ whenever $Y < f(x) < Y + \varepsilon$.
(3) For $\mathcal{L}^n$ almost every $x$,
$|\nabla f_j(x)| \le 2\,|\nabla f(x)| + \varepsilon$.
(4) The measure of the set
$\{x : \nabla f(x) \ne 0 \text{ and } \nabla f_j(x) \ne \nabla f(x)\}$
converges to zero as $j \to \infty$.
If we do not require properties (2), (3), (4) then the difference $f_j - f$ can be chosen to belong to $C^\infty$. If we drop all four properties then each $f_j$ can be chosen to belong to $C^\infty$. The basic idea behind the proof of Theorem 5 (omitting refinements
(1), (2), (3), (4)) is the following. Let $W$ be the characteristic function of the critical set of $f$, i.e. the set for which $\nabla f = 0$, and set
$f_j(x) = f(x) + \frac{1}{2j}\, W(x) \sin(j f(x))$    (3.1)
for each $x$. Then clearly $f_j \to f$ in $L^p$ as $j \to \infty$. For the gradients we compute formally
$\nabla f_j(x) - \nabla f(x) = \frac{1}{2}\, W(x)\, \nabla f(x) \cos(j f(x)) + \frac{1}{2j}\, \nabla W(x) \sin(j f(x))$.    (3.2)
The first term on the right side is zero since $W$ vanishes when $\nabla f$ does not vanish. The second term on the right side in (3.2) is a bit problematic since $\nabla W$ is not $p$-th power summable. This defect, however, can be remedied (with some effort) by mollifying $W$ in a $j$-dependent way so that $\|\nabla W\|_\infty < j^{1/2}$. This establishes the $L^p$ convergence of $\nabla f_j$ to $\nabla f$. Now define sets
$K_j(y) = \{x : f_j(x) > y\}$ $(j = 1, 2, 3, \ldots)$
and
$K(y) = \{x : f(x) > y\}$
for each $y$. Since the function $t \mapsto t + \frac{1}{2j} \sin(tj)$ is increasing (check the derivative) we infer that $K_j(y) = K(y)$ whenever $m$ is an integer and $y = 2m\pi/j$. For these special $y$ values we infer that the radius functions are equal, i.e. $R_j(y) = R(y)$ (recall (2.1)). On the other hand, if $0 < a < 1$ and $y = (2m + a)(\pi/j)$, then $R_j(y) \ge R(y)$ and, in general, $R_j(y) > R(y)$. Think of the graphs of $f_j^*$ and $f^*$ parametrized by the height $y$ instead of the radius $|x|$. When $y = 2m\pi/j$ the graphs intersect. When $y = (2m + a)(\pi/j)$ and $0 < a < 1$, the graph of $f_j^*$ lies to the right of the graph of $f^*$. For our purposes it suffices to show that the numbers $B_j \equiv \|\nabla f_j^* - \nabla f^*\|_1$ are bounded away from zero. We then try to estimate the $B_j$'s in terms of the distribution $\mathcal{G}_f$ from (2.7). Using the Schwarz inequality several times and a simple Sobolev inequality we are able to estimate
$B_j \ge (\text{constant}) \int |h|^{1/2}\, d\mathcal{L}^1$;    (3.3)
here $\mathcal{L}^1 \wedge h$ denotes the absolutely continuous part of our $\mathcal{G}_f'$. It is reassuring that the bound (3.3) above involves $|h|^{1/2}$ instead of $|h|$. This is so because "the square root of a singular measure is zero"; by this we mean that if the singular part of $\mathcal{G}_f'$ (which cannot contribute to the lack of convergence, as we assert in the next section) is approximated by absolutely continuous measures $\mathcal{L}^1 \wedge h^{(k)}$ $(k = 1, 2, 3, \ldots)$, then $\int |h^{(k)}|^{1/2}\, d\mathcal{L}^1$ converges to zero as $k \to \infty$.
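Two elementary facts used in this section can be checked numerically. The sketch below assumes the perturbation amplitude in (3.1) is $1/(2j)$, so that the height map is $t \mapsto t + \sin(jt)/(2j)$, and it uses the simplest approximation of a unit point mass, $h_k = k$ on $[0, 1/k]$; the function names are illustrative only.

```python
import math


def perturb(t, j):
    """Height map t -> t + sin(j t) / (2 j) acting on values taken on the critical set."""
    return t + math.sin(j * t) / (2.0 * j)


def is_increasing(j, samples=10000, t_max=1.0):
    """Grid check of monotonicity; the derivative 1 + cos(j t) / 2 is at least 1/2."""
    values = [perturb(t_max * i / samples, j) for i in range(samples + 1)]
    return all(b > a for a, b in zip(values, values[1:]))


def sqrt_mass(k):
    """Integral of h_k^(1/2) where h_k = k on [0, 1/k] has total mass 1.
    The value is k^(1/2) * (1/k) = k^(-1/2), which tends to zero."""
    return math.sqrt(k) * (1.0 / k)
```

Heights of the form $2m\pi/j$ are fixed by `perturb`, which is why the sets $K_j(y)$ and $K(y)$ agree there, while `sqrt_mass(k)` decays like $k^{-1/2}$ even though each $h_k$ has mass one.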
4. REARRANGEMENT IS CONTINUOUS AT CO-AREA REGULAR FUNCTIONS
The proof [AL] that the co-area regularity of $f$ implies $W^{1,p}$ continuity of $\mathcal{R}$ at $f$ is quite technical. We will attempt to outline some of the main ideas. In our proof in [AL] sections 4.2 and 4.3 below are replaced by more traditional methods in functional analysis.
4.1. Reduction to $W^{1,1}$
Our first step is to establish the fact that continuity of $\mathcal{R}$ in $W^{1,p}$ is implied by continuity of $\mathcal{R}$ in $W^{1,1}$. This may seem surprising since ordinarily nothing can be inferred about $\|\nabla f_j - \nabla f\|_p$ from information about $\|\nabla f_j - \nabla f\|_1$. In the present case, however, our rearrangement operator $\mathcal{R}$ acts independently on slabs $\{x : y_1 < f(x) < y_2\}$. We can then surgically remove small, well chosen slabs from the $f_j$ and $f$ on which $|\nabla f_j|$ or $|\nabla f|$ is large. On these slabs we can control $\|\nabla f_j^* - \nabla f^*\|_p$ in terms of $\|\nabla f_j^*\|_p$ and $\|\nabla f^*\|_p$, and these quantities can, in turn, be controlled by $\|\nabla f_j\|_p$ and $\|\nabla f\|_p$ with use of the basic inequality (2.6). After these small slabs are removed, the $f_j$ and $f$ effectively have bounded gradients and then $W^{1,1}$ convergence implies $W^{1,p}$ convergence.
4.2. The co-area formula and co-area regularity
The basic tool in our second step is H. Federer's co-area formula as extended by J. Brothers and W. Ziemer [BZ]. Suppose $f \in W^{1,1}(\mathbf{R}^n)$ and $g$ is a nonnegative Borel function. Then the slice integral
$A(y) = \int_{f^{-1}\{y\}} g\, d\mathcal{H}^{n-1}$    (4.1)
exists for $\mathcal{L}^1$ almost every positive number $y$ and we have the co-area formula
$\int_{y > 0} A\, d\mathcal{L}^1 = \int g\, |\nabla f|\, d\mathcal{L}^n$;    (4.2)
here $\mathcal{H}^{n-1}$ denotes Hausdorff's $(n-1)$-dimensional measure over $\mathbf{R}^n$. In one application of (4.2) we replace $f(x)$ by $F_t(x) = \max\{f(x), t\}$ (with $t > 0$), then replace $g(x)$ by $(|\nabla f(x)| + \delta)^{-1}$, then let $\delta \to 0+$, and finally use Lebesgue's monotone convergence theorem applied to each side of (4.2) to infer
$\int_{y > t} w_f(y)\, d\mathcal{L}^1 y = \int \chi_{\{f > t\}}\, \chi_{\{\nabla f \ne 0\}}\, d\mathcal{L}^n = \gamma_f(t)$    (4.3)
where we have written
$w_f(y) = \int_{f^{-1}\{y\}} |\nabla f|^{-1}\, d\mathcal{H}^{n-1}$    (4.4)
for each $y$. In other words, the basic distribution integral on the right side of (2.1) (call it $\alpha_f(y)$) breaks up naturally into two pieces
$\alpha_f(y) = \gamma_f(y) + \mathcal{G}_f(y)$    (4.5)
and (4.3) states that $\gamma_f$ is absolutely continuous with derivative $-w_f$. The KEY POINT is: the only absolutely continuous part of the measure $-\alpha_f'$ is $w_f$ if and only if $f$ is co-area regular.
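As a sanity check of the co-area formula (4.2) with $g = 1$, one can compare both sides for a function whose level sets are known explicitly. The sketch below uses the cone $f(x) = \max(1 - |x|, 0)$ on $\mathbf{R}^2$, an example chosen here purely for illustration: its level set at height $y \in (0,1)$ is a circle of length $2\pi(1-y)$, and $|\nabla f| = 1$ inside the unit disc. The function names and grid sizes are assumptions of the sketch.

```python
import math


def cone(x, y):
    """f(x, y) = max(1 - |(x, y)|, 0); |grad f| = 1 where 0 < f < 1, and 0 outside the disc."""
    return max(1.0 - math.hypot(x, y), 0.0)


def gradient_integral(cells=400):
    """Midpoint-rule approximation of the integral of |grad f| over [-1, 1]^2."""
    h = 2.0 / cells
    total = 0.0
    for i in range(cells):
        for j in range(cells):
            x = -1.0 + (i + 0.5) * h
            y = -1.0 + (j + 0.5) * h
            if cone(x, y) > 0.0:      # inside the unit disc, where |grad f| = 1
                total += h * h
    return total


def slice_integral(steps=1000):
    """Midpoint-rule approximation of the integral over 0 < y < 1 of
    H^1(f^{-1}{y}) = 2 * pi * (1 - y)."""
    dy = 1.0 / steps
    return sum(2.0 * math.pi * (1.0 - (k + 0.5) * dy) * dy for k in range(steps))
```

Both quantities approximate $\pi$; the grid estimate of the right side of (4.2) is only accurate up to the cells that straddle the unit circle.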
4.3. Currents and the lower semicontinuity of slice integrals
Suppose that we have a sequence $f_j$ converging to $f$ in $W^{1,1}$ and that $f$ is co-area regular. Henceforth we will omit the subscript $f$ (e.g. $\alpha_f$ will be denoted $\alpha$) when referring to $f$, and will use a subscript $j$ when referring to $f_j$ (e.g. $\alpha_{f_j}$ will be denoted $\alpha_j$). We assert that
$\liminf_{j \to \infty} w_j(y) \ge w(y)$ for $\mathcal{L}^1$ almost every $y$.    (4.6)
To show this it suffices to prove that
$\lim_{j \to \infty} \int_{f_j^{-1}\{y\}} g\, d\mathcal{H}^{n-1} = \int_{f^{-1}\{y\}} g\, d\mathcal{H}^{n-1}$ for $\mathcal{L}^1$ almost every $y$    (4.7)
whenever $g \in L^\infty$. An approximation argument shows it is sufficient to prove (4.7) for $g \in C_0^\infty$. It is here that we need to utilize the inherent current structure of the graph and subgraph of $f$ and the $f_j$'s and the inherent convergence as currents. To do this we form the $n+1$ dimensional current
$Q = \mathbf{E}^{n+1} \llcorner \{(x, y) : x \in \mathbf{R}^n,\ y < f(x)\}$
whose boundary $T = \partial Q$ is the current associated with the graph of $f$. The current $T$ can then be sliced by the coordinate function $y$ to obtain an $n - 1$ dimensional slice current $T(y)$ corresponding to the level set $f^{-1}\{y\}$ for $\mathcal{L}^1$ almost every $y$. Likewise, we define $Q_j$, $T_j$, $T_j(y)$ for the various $j$'s and further set $S_j = Q - Q_j$ with associated slice currents $S_j(y)$. Since "slicing commutes with boundaries" in the current setting we infer $\partial S_j(y) = T(y) - T_j(y)$ for almost every $y$.
Since the mass $\mathbf{M}$ of a current corresponds to its volume, we readily check that
$\mathbf{M}(S_j) = \mathbf{M}(Q - Q_j) = \|f - f_j\|_1 \to 0$ as $j \to \infty$.    (4.8)
Since $\mathbf{M}(S_j) = \int \mathbf{M}(S_j(y))\, d\mathcal{L}^1 y$ for each $j$, there will be a subsequence (still denoted by $j$'s) such that
$\lim_{j \to \infty} \mathbf{M}(S_j(y)) = 0$ for almost every $y$.    (4.9)
Since $\partial S_j(y) = T(y) - T_j(y)$ we conclude the convergence of the $T_j(y)$'s to $T(y)$ for almost every $y$. The lower semicontinuity of mass under such convergence then implies
$\liminf_{j \to \infty} \mathbf{M}(T_j(y)) \ge \mathbf{M}(T(y))$ for $\mathcal{L}^1$ almost every $y$.    (4.10)
Using, for example, J. Michael's [M] Lipschitz approximation theorem we readily infer
$\mathbf{M}(T_{(j)}(y)) = \mathcal{H}^{n-1}(f_{(j)}^{-1}\{y\})$ for $\mathcal{L}^1$ almost every $y$;    (4.11)
here $(j)$ denotes either $j$ or no $j$. We use the co-area formula again to infer
$\int \mathbf{M}(T_{(j)}(y))\, d\mathcal{L}^1 y = \int |\nabla f_{(j)}|\, d\mathcal{L}^n$.    (4.12)
However, $\int |\nabla f_j|\, d\mathcal{L}^n \to \int |\nabla f|\, d\mathcal{L}^n$ by the assumed $L^1$ convergence of $\nabla f_j$ to $\nabla f$. The following is a general lemma. Suppose $\mu$ is a measure and $h, h_1, h_2, h_3, \ldots$ are nonnegative, summable functions such that $\liminf_{j \to \infty} h_j(x) \ge h(x)$ for $\mu$ almost every $x$. In case $\int h_j\, d\mu \to \int h\, d\mu$ as $j \to \infty$ then there is a subsequence $j(k)$ of the $j$'s such that $h_{j(k)}(x) \to h(x)$ as $k \to \infty$ for $\mu$ almost every $x$. We apply this lemma to the case at hand to infer that, for a further subsequence,
$\liminf_{j \to \infty} \mathbf{M}(T_j(y)) = \mathbf{M}(T(y))$ for $\mathcal{L}^1$ almost every $y$.    (4.13)
Equation (4.13), with a little more work, then leads to (4.7). As an application of (4.7) we return to (4.4) and prove that
$\liminf_{j \to \infty} w_j(y) \ge w(y)$ for $\mathcal{L}^1$ almost every $y$.    (4.14)
This result is crucial for us. To prove it, we use (4.7) with $g_{(j)} = (|\nabla f_{(j)}| + \delta)^{-1}$ (as in the proof of (4.4)) and then let $\delta \to 0$.
4.4. Graph arc length as an invariant measure
The last main step in our proof is to combine (4.14), the co-area regularity of $f$, and the $W^{1,1}$ convergence of the $f_j$'s to $f$ to show that the $\nabla f_j^*$'s converge to $\nabla f^*$ in $L^1$. Since $f^*(x)$ is really only a function of $r = |x|$, our considerations are essentially one-dimensional. (It is true that the real measure is $r^{n-1}\, dr$ and not $dr$, but this is merely a nuisance which one can handle.) Let us suppose then $n = 1$ and we will denote $d/dr$ by a prime. Think of the graph of $f^*$ (or $f_j^*$) which is a curve in $\mathbf{R}^2$. The geometrically invariant notion is not $f^{*\prime}$ (which is the quantity in which we are really interested) but rather the arc length derivative $(1 + (f^{*\prime})^2)^{1/2}$. The arc length can be computed in two different ways. The first way is to use the height $y$ as parameter. Then the arc length of the graph of $f_{(j)}^*$ equals
$\int (1 + (\rho_{(j)}(y))^2)^{1/2}\, d\mathcal{L}^1 y + \int d\nu_{(j)}$;
here $\nu_{(j)}$ is the singular part of the measure $-(d\alpha_{(j)}/dy)$ while $\mathcal{L}^1 \wedge \rho_{(j)}$ is the absolutely continuous part of $-(d\alpha_{(j)}/dy)$. The crucial point is the following: The co-area regularity of $f$ implies that $\rho(y) = w(y)$. For $f_j^*$, all we can say is that $\rho_j(y) \ge w_j(y)$; but this is of no concern since, from (4.14), we have
$\liminf_{j \to \infty} \rho_j(y) \ge \rho(y)$ for $\mathcal{L}^1$ almost every $y$.    (4.15)
Concerning the singular components $\nu_j$ and $\nu$ one knows nothing. However, by the $L^1$ convergence of $f_j^*$ to $f^*$ (see (2.5)) we can infer that the arcs converge pointwise, i.e. for any $0 < a < b$
$\int_a^b \rho_j\, d\mathcal{L}^1 + \int_a^b d\nu_j \to \int_a^b \rho\, d\mathcal{L}^1 + \int_a^b d\nu$.    (4.16)
It is then a simple exercise to show that (4.15), (4.16) alone imply arc length convergence, i.e.
$\int (1 + \rho_j^2)^{1/2}\, d\mathcal{L}^1 + \int d\nu_j \to \int (1 + \rho^2)^{1/2}\, d\mathcal{L}^1 + \int d\nu$.    (4.17)
Now think about this arc length convergence (4.17) in terms of the radius parameterization, i.e.
$\int (1 + (f_j^{*\prime}(r))^2)^{1/2}\, d\mathcal{L}^1 r \to \int (1 + (f^{*\prime}(r))^2)^{1/2}\, d\mathcal{L}^1 r$.    (4.18)
There is no singular part of the measure (since $f_j^{*\prime}$ is a function). Intuitively, it is clear (by drawing a few graphical examples) that arc length convergence implies $L^1$ convergence of $f_j^{*\prime}$ to $f^{*\prime}$ because the function $t \mapsto (1 + t^2)^{1/2}$ is strictly convex. This is indeed correct as the following general theorem [AL] states.
THEOREM 6. Suppose $\psi : \mathbf{R}^n \to \mathbf{R}^+$ is a convex function. Suppose also that $f, f_1, f_2, f_3, \ldots$ are functions in $L^1_{\mathrm{loc}}(\mathbf{R}^n, \mathbf{R})$ having distributional gradients which are functions in $L^1_{\mathrm{loc}}(\mathbf{R}^n, \mathbf{R}^n)$. Suppose that $\psi(\nabla f), \psi(\nabla f_1), \psi(\nabla f_2), \psi(\nabla f_3), \ldots$ also are functions in $L^1(\mathbf{R}^n)$ and that $f_j - f \to 0$ in $L^1(\mathbf{R}^n)$ as $j \to \infty$. Then (as has been known for some time [Si])
(1) $\liminf_{j \to \infty} \int \psi(\nabla f_j)\, d\mathcal{L}^n \ge \int \psi(\nabla f)\, d\mathcal{L}^n$.
(2) Suppose further that equality holds in (1) and that $\psi$ is strictly convex (i.e. $\psi(x) + \psi(y) > 2\psi\left(\frac{x + y}{2}\right)$ whenever $x \ne y$). Uniform convexity is not assumed. Then $\psi(\nabla f_j) \to \psi(\nabla f)$ in $L^1(\mathbf{R}^n)$ as $j \to \infty$. Furthermore, there is a subsequence $j(1), j(2), j(3), \ldots$ of $1, 2, 3, \ldots$ such that $\nabla f_{j(k)}(x) \to \nabla f(x)$ for $\mathcal{L}^n$ almost every $x$ as $k \to \infty$.
(3) Finally, suppose $\psi(\xi) \to \infty$ as $|\xi| \to \infty$ (e.g. our function $(1 + t^2)^{1/2}$). Then, for every measurable subset $\Omega$ of $\mathbf{R}^n$ of finite measure, $\nabla f_j \to \nabla f$ in $L^1(\Omega, \mathbf{R}^n)$.

REFERENCES
[AL] F. ALMGREN and E. LIEB: Symmetric decreasing rearrangement is sometimes continuous. J. Amer. Math. Soc. 2, 683-773 (1989).
[B] C. BANDLE: Isoperimetric inequalities and applications. Pitman (Boston, London, Melbourne), 1980.
[BZ] J. BROTHERS and W. ZIEMER: Minimal rearrangements of Sobolev functions. Journ. Reine Angew. Math. 394, 153-179 (1988).
[CG] G. CHITI: Rearrangements of functions and convergence in Orlicz spaces. Appl. Anal. 9, 23-27 (1979).
[CJ] J-M. CORON: The continuity of the rearrangement in $W^{1,p}(\mathbf{R})$. Ann. Scuol. Norm. Sup. Pisa, Ser. 4, 11, 57-85 (1984).
[H] K. HILDEN: Symmetrization of functions in Sobolev spaces and the isoperimetric inequality. Manuscr. Math. 18, 215-235 (1976).
[K] B. KAWOHL: Rearrangements and convexity of level sets in partial differential equations. Lect. Notes in Math. 1150, Springer (Berlin, Heidelberg, New York), 1985.
[L] E. LIEB: Existence and uniqueness of the minimizing solution of Choquard's nonlinear equation. Stud. Appl. Math. 57, 93-105 (1977). See appendix.
[M] J. MICHAEL: Lipschitz approximations to summable functions. Acta Math. 111, 73-94 (1964).
[PS] G. POLYA and G. SZEGO: Isoperimetric inequalities in mathematical physics. Ann. Math. Stud. 27, Princeton University Press (Princeton) (1951).
[Si] J. SERRIN: On the definition and properties of certain variational integrals. Trans. Amer. Math. Soc. 101, 139-167 (1961).
[S1] E. SPERNER: Zur Symmetrisierung von Funktionen auf Sphären. Math. Z. 134, 317-327 (1973).
[S2] E. SPERNER: Symmetrisierung für Funktionen mehrerer reeller Variablen. Manuscr. Math. 11, 159-170 (1974).
[T] G. TALENTI: Best constant in Sobolev inequality. Ann. Pura Appl. 110, 353-372 (1976).
With L. Caffarelli and D. Jerison in Adv. in Math. 117, 193-207 (1996)
ADVANCES IN MATHEMATICS 117, 193-207 (1996)
ARTICLE NO. 0008

On the Case of Equality in the Brunn-Minkowski Inequality for Capacity

LUIS A. CAFFARELLI*
School of Mathematics, Institute for Advanced Study, Princeton, New Jersey 08540; and Courant Institute for the Mathematical Sciences, New York, New York 10012-1110

DAVID JERISON*
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307

AND

ELLIOTT H. LIEB*
Departments of Mathematics and Physics, Jadwin Hall, Princeton University, P.O. Box 708, Princeton, New Jersey 08544-0708

Received July 28, 1995
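Suppose that $\Omega_0$ and $\Omega_1$ are convex, open subsets of $\mathbf{R}^N$. Denote their convex combination by
$\Omega_t = (1 - t)\Omega_0 + t\Omega_1 = \{(1 - t)x + ty : x \in \Omega_0 \text{ and } y \in \Omega_1\}$.
The Brunn-Minkowski inequality says that
$(\operatorname{vol} \Omega_t)^{1/N} \ge (1 - t)(\operatorname{vol} \Omega_0)^{1/N} + t\,(\operatorname{vol} \Omega_1)^{1/N}$
for $0 \le t \le 1$. Moreover, if there is equality for some $t$ other than an endpoint, then the domains $\Omega_1$ and $\Omega_0$ are translates and dilates of each other.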
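The inequality and its equality case are easy to see for axis-parallel boxes, where the Minkowski combination simply combines side lengths. The following sketch is an illustration under that simplifying assumption (boxes only, with hypothetical helper names); it is not part of the paper's argument.

```python
import math


def minkowski_combination(box0, box1, t):
    """Side lengths of (1 - t) * box0 + t * box1 for axis-parallel boxes
    described by their side lengths (Minkowski sums of boxes add side lengths)."""
    return [(1 - t) * a + t * b for a, b in zip(box0, box1)]


def volume(box):
    return math.prod(box)


def brunn_minkowski_gap(box0, box1, t):
    """vol(Omega_t)^(1/N) minus (1 - t) vol(Omega_0)^(1/N) + t vol(Omega_1)^(1/N)."""
    n = len(box0)
    lhs = volume(minkowski_combination(box0, box1, t)) ** (1.0 / n)
    rhs = (1 - t) * volume(box0) ** (1.0 / n) + t * volume(box1) ** (1.0 / n)
    return lhs - rhs


# Boxes that are not dilates of each other give a strictly positive gap;
# a box and its dilate give equality up to rounding.
gap_generic = brunn_minkowski_gap([1.0, 2.0, 3.0], [3.0, 1.0, 1.0], 0.5)
gap_dilate = brunn_minkowski_gap([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], 0.5)
```

For boxes the inequality reduces to the arithmetic-geometric mean inequality, which is why equality forces the side lengths to be proportional.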
* The work of the first author was partially supported by NSF Grant DMS-9101324. The work of the second author was partially supported by NSF Grant DMS-9401355. The work of the third author was partially supported by NSF Grant PHY90-19433 A04.

Borell proved an analogue of the Brunn-Minkowski inequality with capacity (defined below) in place of volume. Borell's theorem [B] says
THEOREM A. Let $\Omega_t = t\Omega_1 + (1 - t)\Omega_0$ be a convex combination of two convex subsets of $\mathbf{R}^N$, $N \ge 3$. Then
$(\operatorname{cap} \Omega_t)^{1/(N-2)} \ge (1 - t)(\operatorname{cap} \Omega_0)^{1/(N-2)} + t\,(\operatorname{cap} \Omega_1)^{1/(N-2)}$
for $0 \le t \le 1$.
The main purpose of this note is to prove
THEOREM B. There is equality in the inequality of Theorem A if and only if $\Omega_1$ is a translate and dilate of $\Omega_0$.
The case of equality in the classical Brunn-Minkowski inequality can be used to prove uniqueness in the Minkowski problem described below. In particular, it implies that any two convex bodies with the same Gauss curvature (as a function of the unit normal) are translates of each other. Theorem B will be used to prove uniqueness for an analogous problem associated to the first variation of capacity [J1, J2]. There is a similar theory in the case $N = 2$ in which the capacity is replaced by the transfinite diameter (the exponential of the logarithmic capacity).
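The "if" direction of Theorem B can be seen directly in the simplest case of two balls. The following worked computation is only an illustration; it assumes nothing beyond the scaling law $\operatorname{cap}(s\Omega) = s^{N-2} \operatorname{cap} \Omega$ and translation invariance of capacity, and the radii $r_0, r_1$ and centers $x_0, x_1$ are illustrative symbols.

```latex
% Two balls: the Minkowski combination of balls is again a ball.
\Omega_0 = B_{r_0}(x_0), \quad \Omega_1 = B_{r_1}(x_1)
\;\Longrightarrow\;
\Omega_t = B_{(1-t) r_0 + t r_1}\bigl((1-t) x_0 + t x_1\bigr).

% Scaling gives cap B_r = r^{N-2} cap B_1, so with c = (cap B_1)^{1/(N-2)}:
(\operatorname{cap} \Omega_t)^{1/(N-2)}
  = c \,\bigl((1-t) r_0 + t r_1\bigr)
  = (1-t)\,(\operatorname{cap} \Omega_0)^{1/(N-2)} + t\,(\operatorname{cap} \Omega_1)^{1/(N-2)} .
```

So equality holds in Theorem A for every $t$ when the two domains are balls; the content of Theorem B is that, up to translation and dilation, coincidence of the two domains is the only way equality can occur.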
1. THE MINKOWSKI PROBLEM AND ITS VARIATIONAL FORMULATION
Let $g$ denote the Gauss map, that is, the map from $\partial\Omega$ to $S^n$, $n = N - 1$, that sends a point $X$ in $\partial\Omega$ to the outer unit normal to $\Omega$ at $X$. The mapping $g$ is defined almost everywhere with respect to surface measure $d\sigma$ on $\partial\Omega$. We define a measure $\mu_\Omega$ on $S^n$ by $d\mu_\Omega = g_*(d\sigma)$, i.e.,
$\mu_\Omega(E) = \sigma(g^{-1}(E))$
for every Borel subset $E$ of $S^n$. The Minkowski problem asks under what conditions on $\mu$ one can find a convex, bounded open set $\Omega$ such that $\mu_\Omega = \mu$. In the case of measures that consist of a finite number of point masses, each mass corresponds to the area of a face of a convex polyhedron and the location on $S^n$ of the point mass is the unit normal to the face. Thus the problem is to find a convex polyhedron given the areas of its faces and the normals to the faces. In the case the measure $\mu$ has a smooth positive density with respect to the uniform measure $d\xi$ on the sphere, $d\mu = (1/K)\, d\xi$, the function $K$ is the Gauss curvature of $\partial\Omega$ and the problem can be restated as the problem of finding a convex body given its Gauss curvature as a function of the unit normal. Here are the basic existence and uniqueness theorems:
THEOREM 1.1. Let $\mu$ be a positive Borel measure on $S^n$, $n = N - 1$. There exists a bounded, convex open set $\Omega \subset \mathbf{R}^N$ such that $\mu_\Omega = \mu$ if and only if
(a) $\mu(\{\xi : \xi \cdot e > 0\}) > 0$ for every $e \in S^n$ and
(b) $\int_{S^n} \xi \cdot e\, d\mu(\xi) = 0$ for every $e \in S^n$.
THEOREM 1.2. $\mu_{\Omega_0} = \mu_{\Omega_1}$ if and only if $\Omega_0$ and $\Omega_1$ are translates of each other.
The Minkowski problem can be solved variationally. Let $\Omega$ be a convex domain. The support function $u_\Omega$ of $\Omega$ is the function defined for $\xi \in S^n$ by
$u_\Omega(\xi) = \sup\{\xi \cdot X : X \in \Omega\}$.
The support function determines $\Omega$ because
$\Omega = \{X \in \mathbf{R}^N : X \cdot \xi < u_\Omega(\xi) \text{ for all } \xi \in S^n\}$.
Consider the functional
$\lambda = \inf\left\{ \int_{S^n} u_\Omega\, d\mu : \text{convex } \Omega \text{ such that } \operatorname{vol} \Omega \ge 1 \right\}$.    (*)
THEOREM 1.3. If $\mu$ is a finite positive measure satisfying (a) and (b) of Theorem 1.1, then $\lambda > 0$ and a minimizer $\Omega$ of (*) exists. Moreover, it is unique up to translation, and it solves $\mu_\Omega = N\lambda^{-1}\, d\mu$. One then recovers the solution of Theorem 1.1 by dilation.
The Lagrange multiplier factor $N\lambda^{-1}$ arises from the volume constraint and the relation
$\operatorname{vol} \Omega = \frac{1}{N} \int_{S^n} u_\Omega\, d\mu_\Omega$.    (1.4)
The proofs of Theorems 1.1-1.3 are contained in [BF, CY]. In parallel with the Minkowski problem there is a problem of prescribing the first variation of capacity [J2]. To define capacity, let $N \ge 3$ and let $\Omega$ be a bounded, convex, open subset of $\mathbf{R}^N$. The equilibrium potential of $\Omega$ is the continuous function $U$ defined in $\Omega' = \mathbf{R}^N \setminus \Omega$ satisfying
$\Delta U = 0$ in $\Omega'$ and $U = 1$ on $\partial\Omega'$
and such that $U$ tends to zero at infinity. The electrostatic capacity of $\Omega$ is defined as the constant $\gamma = \operatorname{cap} \Omega$ such that
$U(x) = \gamma\, a_N\, |x|^{2-N} + O(|x|^{1-N})$ as $x \to \infty$
where the dimensional constant $a_N$ is chosen according to the fundamental solution of Laplace's equation
$\Delta(-a_N |x|^{2-N}) = \delta_0$.
By a theorem of Dahlberg, $|\nabla U|^2$ is defined almost everywhere on $\partial\Omega$ and integrable with respect to surface measure. Define $\nu_\Omega$ by
$d\nu_\Omega = g_*(|\nabla U|^2\, d\sigma)$.
The analogous problem is to find a convex domain $\Omega$ such that $\nu_\Omega = \nu$. The associated functional is
$m = \inf\left\{ \int_{S^n} u_\Omega\, d\nu : \text{convex } \Omega \text{ such that } \operatorname{cap} \Omega \ge 1 \right\}$.    (†)
The results analogous to Theorems 1.1 through 1.3 are
THEOREM 1.5. Let $N \ge 4$, $n = N - 1$. Suppose that $\nu$ is a positive measure on $S^n$. There exists a bounded, convex, open set $\Omega \subset \mathbf{R}^N$ such that $\nu_\Omega = \nu$ if and only if
(a) $\nu(\{\xi : \xi \cdot e > 0\}) > 0$ for every $e \in S^n$ and
(b) $\int_{S^n} \xi \cdot e\, d\nu(\xi) = 0$ for every $e \in S^n$.
When $N = 3$, conditions (a) and (b) hold if and only if there exists a number $c > 0$ and a bounded, convex, open set $\Omega$ such that $\nu_\Omega = c\nu$.
THEOREM 1.6. Let $N \ge 4$. Then $\nu_{\Omega_0} = \nu_{\Omega_1}$ if and only if $\Omega_0$ and $\Omega_1$ are translates of each other. When $N = 3$, $\nu_{\Omega_0} = \nu_{\Omega_1}$ if and only if $\Omega_0$ and $\Omega_1$ are translates and dilates of each other.
THEOREM 1.7. If $N \ge 4$, and $\nu$ is a finite, positive measure satisfying (a) and (b) of Theorem 1.5, then $m > 0$ and a minimizer $\Omega$ of (†) exists. Moreover, it is unique up to translation, and it solves $g_*(|\nabla U|^2\, d\sigma) = (N - 2)\, m^{-1}\, d\nu$. When $N = 3$, the result is the same except that $\Omega$ is unique up to translation and dilation.
When $N \ge 4$, a dilation of the minimizer given in Theorem 1.7 solves the equation in Theorem 1.5. But when $N = 3$, $\nu_\Omega$ is dilation invariant. Therefore the statements of the theorems must be modified as indicated. When $N = 3$, there is exactly one constant $c$, $c = m^{-1}$, for which the equation $\nu_\Omega = c\nu$ has a solution. The uniqueness statements in Theorems 1.2 and 1.3 are not logically equivalent, although this is a problematic distinction to make between two
true statements. The distinction is that the uniqueness in Theorem 1.2 applies to any stationary point of the functional, whereas the one in Theorem 1.3 refers only to minimizers. (This distinction is a trivial one because it follows from convexity of the functional that all stationary points are minimizers; see Proposition 5.2.) More important to the present article, the fact that the minimizer of (*) is unique up to translation follows from Theorem 1.2 only after one proves the variational equation $\mu_\Omega = N\lambda^{-1}\mu$ for the minimizing body $\Omega$. The situation in the case of the capacity theorems is less complete than it appears. Although it is not hard to show directly that the minimizer of (†) exists, we cannot confirm directly that it satisfies the equation $\nu_\Omega = (N - 2)\, m^{-1} \nu$. Instead, we will prove Theorem 1.7 using Theorem B and Theorem 1.5. Theorem 1.5 is proved in [J], using a mixture of variational and limiting techniques. It would be nice to have a direct proof of Theorem 1.7. This problem will be discussed again at the end of the paper. We will frequently identify the boundary of $\Omega$ with the unit sphere by the Gauss map. In particular, we will abuse notation by considering the support function as a function on $\partial\Omega$: $u(x) = u(g(x))$ is defined almost everywhere on $\partial\Omega$.
2. FIRST AND SECOND VARIATIONS OF CAPACITY
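The analogue of formula 1.4 for capacity is [J2]
$\operatorname{cap} \Omega = \frac{1}{N - 2} \int_{S^n} u_\Omega\, d\nu_\Omega$.    (2.1)
The following first variation formula, proved by Poincaré in the smooth case, says that $|\nabla U|^2\, d\sigma$ is the first variation of capacity in the same sense in which $d\sigma$ is the first variation of volume.
PROPOSITION 2.2 [J]. Let $u_0$ and $u_1$ be support functions for convex domains $\Omega_0$ and $\Omega_1$, respectively. Let $\nu_0 = \nu_{\Omega_0}$, then
(a) $\frac{d}{dt} \operatorname{cap}(\Omega_0 + t\Omega_1)\Big|_{t = 0+} = \int_{S^n} u_1\, d\nu_0$
and
(b) $\frac{d}{dt} \operatorname{cap}((1 - t)\Omega_0 + t\Omega_1)\Big|_{t = 0+} = \int_{S^n} (u_1 - u_0)\, d\nu_0$.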
Next, we describe the second variation, that is the Fréchet derivative of the mapping $\Omega \to \nu_\Omega$. Following [J2, J1, CY], we write this only in the smooth strictly convex case and express it in terms of the variation of the support function $u_\Omega$. Let $e_1, \ldots, e_n$ be an orthonormal frame for $S^n$, and let covariant derivatives with respect to this frame be denoted $\nabla_i$ and $\nabla_{ij}$. Denote $\mathcal{U} = \{u \in C^\infty(S^n) : \nabla_{ij} u + u\delta_{ij} > 0\}$. It can be shown that the correspondence $\Omega \to u_\Omega$ is a one-to-one correspondence between $C^\infty$ convex domains with strictly positive Gauss curvature and functions of $\mathcal{U}$. Let $b \in \mathbf{R}^N$. Translation of the domain $\Omega$ to $\Omega + b$ corresponds to the change in $u$ to $u + b \cdot \xi$. Denote the $N$-dimensional space $\mathcal{Y}_1 = \operatorname{span}\{\xi_1, \ldots, \xi_N\}$. The Gauss mapping $g$ is a diffeomorphism and we denote the inverse mapping by $F : S^n \to \partial\Omega$. It is given by the formula $F = \nabla \bar{u}$, where $\bar{u}$ is the extension of $u$ from $S^n$ to $\mathbf{R}^N$ as a homogeneous function of degree 1: $\bar{u}(r\xi) = r u(\xi)$ for all $\xi \in S^n$. The Gauss curvature $K$ can be defined as a function of the unit normal by $g_*(d\sigma) = (1/K(\xi))\, d\xi$, where $d\xi$ is the uniform measure on the Gauss sphere. The density $1/K$ can be computed in terms of $u$ and written
$1/K = \det(\nabla_{ij} u + u\delta_{ij})$.    (2.3)
$K$ is unchanged by translation of $\Omega$. In fact, each individual entry of the matrix whose determinant is $1/K$ is unchanged by translation: if $v \in \mathcal{Y}_1$, then
$\nabla_{ij} v + v\delta_{ij} = 0$ for all $i, j$.    (2.4)
Define the coefficients $c_{ij}$ of the cofactor matrix of $\nabla_{ij} u + u\delta_{ij}$ by
$c_{ij}(\nabla_{jk} u + u\delta_{jk}) = \delta_{ik} \det(\nabla_{pq} u + u\delta_{pq}) = \delta_{ik}/K$.    (2.5)
Here and in subsequent formulas we follow the convention that repeated indices are summed. Define the density $S \in C^\infty(S^n)$ by $g_*(|\nabla U|^2\, d\sigma) = S\, d\xi$, and define the mapping $\mathcal{F} : \mathcal{U} \to C^\infty(S^n)$ by $\mathcal{F}(u) = S$. We have the formula
$S(\xi) = h(F(\xi))^2 / K(\xi)$    (2.6)
where $h(x) = |\nabla U(x)|$ for $x \in \partial\Omega$.
Let $f \in C^\infty(S^n)$ and let $w$ be the harmonic function in $\Omega'$ that vanishes at infinity and has boundary values $w(x) = f(\xi)$ at $x = F(\xi)$ on $\partial\Omega'$. Define the operator $\Lambda$ acting on $C^\infty(S^n)$ by $\Lambda(f) =$ the normal derivative of the harmonic extension $w$. Let $v \in C^\infty(S^n)$. For $t$ sufficiently small, $u + tv \in \mathcal{U}$. Furthermore, if $v$ is the support function of a domain $\Omega_1$, then $u + tv$ is the support function of $\Omega + t\Omega_1$.
PROPOSITION 2.7 [J]. The directional derivative of $\mathcal{F}$ is given by
$\frac{d}{dt} \mathcal{F}(u + tv)\Big|_{t = 0} = Lv$,
where $L = L_u$ is defined as
$Lv = \nabla_i(h^2 c_{ij} \nabla_j v) - (2/K)\, h\, \Lambda(hv) - h^2 \operatorname{Tr}(c_{ij})\, v$.
Remark 2.8. Green's formula implies that $\Lambda$ is selfadjoint on $L^2(\partial\Omega, d\sigma)$. It follows that $L$ is selfadjoint on $L^2(S^n, d\xi)$.
3. UNIQUENESS FOR SMALL PERTURBATIONS OF THE SPHERE
We analyze the second variation $L$ to deduce uniqueness for small perturbations of the sphere.
LEMMA 3.1. Let $\Omega_0$ be the domain with support function $u$. If $u - 1$ has sufficiently small $C^{2,\alpha}(S^n)$ norm, and $N \ge 4$, then the null space of $L$ is $\mathcal{Y}_1$ and there is an orthonormal basis for the orthogonal complement of the null space of the form $\{\phi_k\}$, $k = 0, 1, \ldots$ with
$L\phi_0 = \alpha_0 \phi_0$ and $L\phi_k = -\alpha_k \phi_k$, $k = 1, 2, \ldots$
and $\alpha_k > 1$ for all $k = 0, 1, \ldots$ and $\alpha_k = O(k^{2/n})$. In the case $N = 3$, the null space is the span of $\mathcal{Y}_1$ and the additional vector $u$. Furthermore, the complement of the null space has a basis $\{\phi_k\}$, with $k = 1, 2, \ldots$, that is, all the rest of the eigenvalues are strictly negative.
Proof. Denote $\mathcal{F}(u) = S$. Dilation gives $\mathcal{F}((1 + t)u) = (1 + t)^{N-3} S$, so that
$Lu = (N - 3) S$.    (3.2)
Translation gives $\mathcal{F}(u + v) = S$ for all $v \in \mathcal{Y}_1$, so that
$Lv = 0$ for all $v \in \mathcal{Y}_1$.    (3.3)
Thus the null space contains $\mathcal{Y}_1$ (and $u$ in the case $N = 3$). The asymptotic size of the eigenvalues follows from standard elliptic estimates. The fact that there are no other zero eigenvalues and the uniform lower bound on the eigenvalues follows from perturbation and the explicit calculation of the case of the unit sphere that follows.
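The unit-sphere eigenvalues referred to at the end of this proof can be checked mechanically. The sketch below assumes the formula for $L$ in Proposition 2.7 together with the sphere values $h = N - 2$, $K = 1$, $c_{ij} = \delta_{ij}$ (so $\operatorname{Tr}(c_{ij}) = N - 1$), the spherical Laplacian eigenvalue $-k(k + N - 2)$ and $\Lambda(P_k) = (2 - N - k)P_k$ on degree-$k$ harmonics. The function name is illustrative, and the factorization tested below is an observation made here, not a formula quoted from the paper.

```python
def sphere_eigenvalue(N, k):
    """Eigenvalue of L_1 on spherical harmonics of degree k on S^(N-1),
    assembled term by term from L v = div(h^2 c grad v) - (2/K) h Lambda(h v) - h^2 Tr(c) v."""
    h = N - 2
    laplace_beltrami = -k * (k + N - 2)     # eigenvalue of the Laplacian on the sphere
    dirichlet_to_neumann = 2 - N - k        # Lambda(P_k) = (2 - N - k) P_k
    trace_c = N - 1                         # trace of the identity on an (N-1)-sphere
    return h * h * (laplace_beltrami - 2 * dirichlet_to_neumann - trace_c)
```

Under these assumptions the eigenvalue equals $-(N-2)^2 (k-1)(k+N-3)$: it is $(N-2)^2(N-3)$ for $k = 0$, zero for $k = 1$ (translations), and negative for every $k \ge 2$.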
In the case of the sphere, $u = 1$, $U(x) = |x|^{2-N}$, $h = N - 2$, $K = 1$, and $c_{ij} = \delta_{ij}$. The operator $\Lambda$ can be computed from the observation that if $P_k(x)$ is a homogeneous harmonic polynomial of degree $k$, then its extension to the exterior of the ball is given by $w(x) = |x|^{2-N} P_k(x/|x|^2)$ which is homogeneous of degree $2 - N - k$. Thus,
$\Lambda(P_k) = (2 - N - k) P_k$.
The Laplace-Beltrami operator on the sphere satisfies
$c_{ij} \nabla_{ij} P_k = -k(k + N - 2) P_k$.
Therefore, if $L_1$ denotes the operator for $u = 1$,
$L_1 P_k = -(N - 2)^2 \bigl(k(k + N - 2) - 2(N + k - 2) + (N - 1)\bigr) P_k$.
In particular,
$L_1 P_0 = (N - 2)^2 (N - 3) P_0$ and $L_1 P_1 = 0$,
and the remaining eigenvalues are negative integers strictly less than $-1$. Let $L = L_u$. Standard perturbation theory implies that for $u$ sufficiently close to 1, all the small eigenvalues of $L$ are within, say, unit distance of corresponding eigenvalues of $L_1$. The asymptotic estimate from above and below $\alpha_k = O(k^{2/n})$ follows from standard elliptic theory. This proves all the assertions of Lemma 3.1 provided we can show that the null space of $L$ is the space $V$ defined by $V = \mathcal{Y}_1$ if $N \ge 4$ and $V = \operatorname{span}(\xi_1, \xi_2, \xi_3, u)$ when $N = 3$. The null space of $L_1$ is $\mathcal{Y}_1$ when $N \ge 4$ and $\operatorname{span}(\xi_1, \xi_2, \xi_3, 1)$ when $N = 3$. Let $T_1$ denote the projection onto the null space of $L_1$. Let $A$ be the partial inverse of $L_1$ with the same null space as $L_1$ and satisfying $AL_1 = 1 - T_1$. Let $\|\cdot\|$ denote the norm of $L^2(S^n, d\xi)$. For $u$ sufficiently close to 1,
$\|A(L - L_1) w\| \le \|w\|/4$.
If $w$ is orthogonal to $V$, then for $u$ sufficiently close to 1, $\|T_1 w\| \le \|w\|/4$. (3.2) and (3.3) imply that $V$ is contained in the null space of $L$. In order to show that $V$ is the null space of $L$, consider a function $w$ that is orthogonal to $V$ and satisfies $Lw = 0$. Then
$0 = ALw = A(L - L_1) w + AL_1 w = A(L - L_1) w + w - T_1 w$.
Therefore,
11w)
and we conclude that tv = 0. This proves Lemma 3.1. Let f2o be the domain with support function u and let
PROPOSITION 3.4.
Q, be the domain with support function v. If u - I and v - I have small C2''(S") norm, and -t)Qo+tQ,)1/1N-21=(1 -t)cap(Qo)1/N 2+Icap(Q,)11(N 2)
cap((1
for some t e (0, 1), then v is a linear combination of u and a first-order spherical harmonic: au(k) + b for some a > 0 and some b c- RN. In other words, 12, is a translate and dilate of Q0. Proof.
Let
$f(t) = \mathrm{cap}(\Omega_0 + t\Omega_1), \qquad m(t) = \mathrm{cap}((1-t)\Omega_0 + t\Omega_1)^{1/(N-2)} = (1-t)\,f(t/(1-t))^{1/(N-2)}.$
Formula 2.1 implies
$f(t) = \frac{1}{N-2}\int_{S^n} (u + tv)\,\mathcal{F}(u + tv)\,d\xi.$
Proposition 2.2 implies
$f'(t) = \int_{S^n} v\,\mathcal{F}(u + tv)\,d\xi.$
Consequently, Proposition 2.7 implies
$f''(0) = \int_{S^n} v\,Lv\,d\xi.$
Since $m$ is concave and agrees with a linear function at 0, $t$ and 1, it must be linear. Thus $m'' = 0$. We can calculate
$m''(0) = (N-2)^{-2}\,f(0)^{1/(N-2) - 2}\,[(N-2)\,f(0)f''(0) - (N-3)\,f'(0)^2].$
Thus $m''(0) = 0$ implies
$(N-2)\,f(0)f''(0) = (N-3)\,f'(0)^2.$
(3.5)
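The formula for $m''(0)$ can be checked numerically. The sketch below is only a sanity check under stated assumptions: the quadratic `f` is a hypothetical test function with prescribed $f(0)$, $f'(0)$, $f''(0)$ (it is not a capacity), and the helper names are illustrative. It compares the closed form with a central second difference of $m(t) = (1-t)\,f(t/(1-t))^{1/(N-2)}$ at $t = 0$.

```python
# Sanity check of the closed form for m''(0).
# Assumption: f is an arbitrary smooth positive test function, not a capacity.

def m_second_derivative_closed_form(N, f0, f1, f2):
    """(N-2)^(-2) * f(0)^(1/(N-2) - 2) * [(N-2) f(0) f''(0) - (N-3) f'(0)^2]."""
    p = 1.0 / (N - 2)
    return (N - 2) ** -2 * f0 ** (p - 2) * ((N - 2) * f0 * f2 - (N - 3) * f1 ** 2)

def m_second_derivative_numeric(N, f, h=1e-4):
    """Central second difference of m(t) = (1 - t) * f(t / (1 - t))^(1/(N-2)) at t = 0."""
    p = 1.0 / (N - 2)
    m = lambda t: (1.0 - t) * f(t / (1.0 - t)) ** p
    return (m(h) - 2.0 * m(0.0) + m(-h)) / h ** 2

# Hypothetical test function with f(0) = 2, f'(0) = 3, f''(0) = -1.
f = lambda s: 2.0 + 3.0 * s - 0.5 * s * s

for N in (3, 4, 7):
    exact = m_second_derivative_closed_form(N, 2.0, 3.0, -1.0)
    approx = m_second_derivative_numeric(N, f)
    print(N, exact, approx)
```

For $N = 3$ the exponent $1/(N-2)$ equals 1, so the closed form reduces to $f''(0)$, which gives a quick consistency test of the bracket.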
Denote by $(\cdot,\cdot)$ the inner product on $L^2(S^n, d\xi)$; then
$f(0) = (u, \mathcal{F}(u))/(N-2), \qquad f'(0) = (v, \mathcal{F}(u)), \qquad f''(0) = (v, Lv).$
With L. Caffarelli and D. Jerison in Adv. in Math. 117, 193-207 (1996)
CAFFARELLI, JERISON, AND LIEB
When $N = 3$, (3.5) implies $(v, Lv) = f''(0) = 0$. By Lemma 3.1, this implies $Lv = 0$ and $v$ belongs to the span of $u$ and $\xi_1$, $\xi_2$, $\xi_3$. The case $N \ge 4$ is more complicated because $L$ has a positive eigenvalue $\alpha_0$. Rewrite equation (3.5) as
$(u, Lu)(v, Lv) = (v, Lu)^2.$
(3.6)
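If we let $a_k = (u, \varphi_k)$ and $b_k = (v, \varphi_k)$, then $u - \sum_k a_k\varphi_k \in \mathcal{Y}$ and $v - \sum_k b_k\varphi_k \in \mathcal{Y}$, and (3.6) can be restated as
$\Bigl(-\alpha_0 a_0^2 + \sum_{k\ge1}\alpha_k a_k^2\Bigr)\Bigl(-\alpha_0 b_0^2 + \sum_{k\ge1}\alpha_k b_k^2\Bigr) = \Bigl(-\alpha_0 a_0 b_0 + \sum_{k\ge1}\alpha_k a_k b_k\Bigr)^2.$
(3.7)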
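Given $\epsilon > 0$, we can choose $u$ and $v$ sufficiently close to 1 that $|a_0 - 1| < \epsilon$, $|b_0 - 1| < \epsilon$ and
$\sum_{k\ge1} k^2(|a_k|^2 + |b_k|^2) < \epsilon.$
In particular, recalling the eigenvalue estimates of Lemma 3.1, we can choose $\epsilon$ small enough that
$\sum_{k\ge1}\alpha_k a_k^2 < \alpha_0 a_0^2$ and $\sum_{k\ge1}\alpha_k b_k^2 < \alpha_0 b_0^2.$
(3.8)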
Our goal is to deduce from (3.7) and (3.8) that $a_k/a_0 = b_k/b_0$ for all $k = 1, 2, \ldots$. It then follows that $v/b_0 - u/a_0 \in \mathcal{Y}$, which is what we want to prove. Let
$x_k = \sqrt{\alpha_k}\,a_k/(\sqrt{\alpha_0}\,a_0), \qquad y_k = \sqrt{\alpha_k}\,b_k/(\sqrt{\alpha_0}\,b_0).$
Then (3.7) and (3.8) are rephrased as
$(-1 + |x|^2)(-1 + |y|^2) = (-1 + (x, y))^2$   (3.7')
and
$|x| < 1, \qquad |y| < 1,$   (3.8')
where $|\cdot|$ denotes the norm on $\ell^2$. The conclusion that we wish to draw is
that $x = y$. To prove this, let $A = |x|$ and $B = |y|$. Then $x/AB$ and $y/AB$ have length greater than 1. If $x \ne y$, then
$|x/A - y/B|^2 < |x/AB - y/AB|^2 = |x - y|^2/A^2B^2.$
Furthermore,
$AB - (x, y) = \frac{AB}{2}\,|x/A - y/B|^2.$
Therefore,
$A^2B^2 - (x, y)^2 = (AB + (x, y))(AB - (x, y)) = (AB + (x, y))\,\frac{AB}{2}\,|x/A - y/B|^2 \le A^2B^2\,|x/A - y/B|^2 < |x - y|^2.$
The equation in the hypothesis can be written as
$|x|^2 + |y|^2 - 2(x, y) = |x|^2|y|^2 - (x, y)^2.$
Thus,
$|x - y|^2 = A^2B^2 - (x, y)^2 < |x - y|^2.$
This is a contradiction, so it must be that $x = y$.
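The key inequality of this argument, $A^2B^2 - (x, y)^2 < |x - y|^2$ for distinct $x$, $y$ of length less than 1, can be probed numerically. The sketch below is illustrative only: it samples finite-dimensional vectors (dimension 5 is an arbitrary stand-in for $\ell^2$) and the helper names are hypothetical.

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def random_point_in_unit_ball(dim, rng):
    """Random nonzero vector whose length lies strictly between 0 and 1."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    r = rng.uniform(0.05, 0.95)
    n = norm(v)
    return [r * c / n for c in v]

def gap(x, y):
    """|x - y|^2 - (A^2 B^2 - (x, y)^2), with A = |x| and B = |y|."""
    A, B = norm(x), norm(y)
    diff = [a - b for a, b in zip(x, y)]
    return dot(diff, diff) - (A * A * B * B - dot(x, y) ** 2)

rng = random.Random(0)
worst = min(
    gap(random_point_in_unit_ball(5, rng), random_point_in_unit_ball(5, rng))
    for _ in range(10000)
)
print("smallest gap found:", worst)
```

A positive smallest gap is consistent with the strict inequality; equality in (3.7') would force the gap to vanish, which is why only $x = y$ is compatible with the hypothesis.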
4. ANALYTIC CONTINUATION
We can now prove the main result, Theorem B. Note that $\mathrm{cap}(s\Omega) = s^{N-2}\,\mathrm{cap}\,\Omega$.
Consider the regions $\Omega_0$ and $\Omega_1$ of Theorem B. After dilation, one can assume without loss of generality that $\mathrm{cap}\,\Omega_0 = \mathrm{cap}\,\Omega_1$. Furthermore, if equality holds for one value of $t$, then it holds for all values because a concave function that agrees with a linear function at three points is linear. Let $U_t$ be the equilibrium potential of $\Omega_t$, $0 \le t \le 1$. Let $\Omega_t(\lambda) = \{x \in \Omega_t' : U_t(x) > \lambda\} \cup \Omega_t$. The equilibrium potential of $\Omega_t(\lambda)$ is $U_t/\lambda$, so that
$\mathrm{cap}\,\Omega_t(\lambda) = \lambda^{-1}\,\mathrm{cap}\,\Omega_t.$
(4.1)
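Therefore, the hypothesis of Theorem B implies
$(\mathrm{cap}\,\Omega_t(\lambda))^{1/(N-2)} = (1-t)\,(\mathrm{cap}\,\Omega_0(\lambda))^{1/(N-2)} + t\,(\mathrm{cap}\,\Omega_1(\lambda))^{1/(N-2)}.$
(4.2)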
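Borell's inequality and (4.2) imply
$\mathrm{cap}((1-t)\Omega_0(\lambda) + t\Omega_1(\lambda))^{1/(N-2)} \ge (1-t)\,\mathrm{cap}\,\Omega_0(\lambda)^{1/(N-2)} + t\,\mathrm{cap}\,\Omega_1(\lambda)^{1/(N-2)} = (\mathrm{cap}\,\Omega_t(\lambda))^{1/(N-2)}.$
Borell [B], in the process of proving Theorem A, shows that if $0 \le t \le 1$ and $x_t = (1-t)x_0 + tx_1$, then
$U_t(x_t) \ge \min(U_0(x_0), U_1(x_1))$ for all $x_0 \in \Omega_0'$, $x_1 \in \Omega_1'$.
This can be rephrased as
$\Omega_t(\lambda) \supset (1-t)\Omega_0(\lambda) + t\Omega_1(\lambda).$
On the other hand, the capacity of the smaller set is at least as large as the larger, so
$\Omega_t(\lambda) = (1-t)\Omega_0(\lambda) + t\Omega_1(\lambda)$
holds for all $\lambda < 1$ and all $t$, $0 \le t \le 1$. Furthermore,
$\mathrm{cap}((1-t)\Omega_0(\lambda) + t\Omega_1(\lambda))^{1/(N-2)} = (1-t)\,\mathrm{cap}\,\Omega_0(\lambda)^{1/(N-2)} + t\,\mathrm{cap}\,\Omega_1(\lambda)^{1/(N-2)}.$
(4.3)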
We will show that as $\lambda$ tends to 0, the domains $\Omega_0(\lambda)$ and $\Omega_1(\lambda)$ approach spheres. We will then be able to apply Proposition 3.4. Let $\lambda = cs^{N-2}$, where $c = a_N\,\mathrm{cap}\,\Omega_0 = a_N\,\mathrm{cap}\,\Omega_1$. For $z$ a unit vector in $R^N$, define $\rho(z, s)$ implicitly by
$U_0(s^{-1}\rho(z, s)\,z) = cs^{N-2}.$
(There is a unique value of $\rho$ because the radial derivative of $U_0$ is negative.) There is a harmonic function $\Phi$ defined in the image of $\Omega_0'$ by the mapping $x \to x/|x|^2$ satisfying $\Phi(0) = c$ and
$U_0(x) = |x|^{2-N}\,\Phi(x/|x|^2).$
The equation for $\rho$ can be written
$\rho^{2-N}\,\Phi(sz/\rho) = c.$
The implicit function theorem shows that $\rho$ is a real analytic function of $(z, s)$ near $s = 0$ and that $\rho(z, 0) = 1$ for all $z$ and $\rho(z, s)$ tends to the function 1 on $S^n$ in the $C^\infty$ topology as $s$ tends to 0. Thus a suitable dilate of $\Omega_0(\lambda)$ is very close to the unit ball:
$s\Omega_0(\lambda) = \{rz : z \in R^N,\ |z| = 1,\ 0 \le r < \rho(z, s)\}.$
Moreover, (4.3) and the scaling of capacity give
$\mathrm{cap}((1-t)\,s\Omega_0(\lambda) + t\,s\Omega_1(\lambda))^{1/(N-2)} = (1-t)\,\mathrm{cap}(s\Omega_0(\lambda))^{1/(N-2)} + t\,\mathrm{cap}(s\Omega_1(\lambda))^{1/(N-2)}.$
(4.4)
Fix $s$ and $\lambda$ sufficiently small that both $s\Omega_0(\lambda)$ and $s\Omega_1(\lambda)$ are close to the unit ball. Then Proposition 3.4 says that they are translates and dilates of each other. In fact, since we have normalized the capacities to be equal, they are translates of each other. Therefore, the same is true of $\Omega_0(\lambda)$ and $\Omega_1(\lambda)$. It follows that there is a vector $b \in R^N$ such that
$U_1(x - b) = U_0(x)$ provided $U_0(x) \le \lambda$.
By analytic continuation, this equality holds for all $x$ and we are done.
5. APPLICATIONS TO EXISTENCE AND UNIQUENESS IN THE VARIATIONAL PROBLEM
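We can now deduce Theorem 1.6. Consider first the case $N \ge 4$. Suppose that $\Omega_0$ and $\Omega_1$ are two domains satisfying $\nu = \nu_{\Omega_0} = \nu_{\Omega_1}$. Let $u_0$ and $u_1$ denote the support functions of $\Omega_0$ and $\Omega_1$, and denote $\Omega_t = (1-t)\Omega_0 + t\Omega_1$. Then Proposition 2.2 and formula 2.1 imply
$\frac{d}{dt}\,\mathrm{cap}\,\Omega_t\Big|_{t=0+} = \int_{S^n} (u_1 - u_0)\,d\nu_{\Omega_0}.$
(5.1)
Denote $m(t) = (\mathrm{cap}\,\Omega_t)^{1/(N-2)}$. Then
$m'(0+) = (\mathrm{cap}\,\Omega_0)^{(3-N)/(N-2)}\,(\mathrm{cap}\,\Omega_1 - \mathrm{cap}\,\Omega_0) = m(0)^{3-N}\,(m(1)^{N-2} - m(0)^{N-2}).$
Because $m$ is concave, $m'(0+) \ge m(1) - m(0)$. This can be rewritten as
$m(1)^{N-3} \ge m(0)^{N-3}.$
By symmetry we have the opposite inequality. Therefore $m(0) = m(1)$ and $\mathrm{cap}\,\Omega_1 = \mathrm{cap}\,\Omega_0$. Consequently, $m'(0+) = 0$; and since $m$ is concave, it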
must be constant. Therefore, by Theorem B, $\Omega_1$ is a translate and dilate of $\Omega_0$. Since the capacities are the same, $\Omega_1$ must be a translate of $\Omega_0$.
Next, suppose that $N = 3$, and suppose that $\Omega_0$ and $\Omega_1$ are two domains with corresponding measures $c_0\nu$ and $c_1\nu$. Formula 5.1 yields
$m'(0+) = \int_{S^2} (u_1 - u_0)\,c_0\,d\nu = (c_0/c_1)\,\mathrm{cap}\,\Omega_1 - \mathrm{cap}\,\Omega_0.$
Because $m(t)$ is concave,
$(c_0/c_1)\,\mathrm{cap}\,\Omega_1 - \mathrm{cap}\,\Omega_0 = m'(0+) \ge m(1) - m(0) = \mathrm{cap}\,\Omega_1 - \mathrm{cap}\,\Omega_0,$
so that $c_0 \ge c_1$. By symmetry, $c_0 = c_1$. It follows that $m'(0+) = m(1) - m(0)$, and since $m$ is concave, it must be linear. Finally, Theorem B implies that $\Omega_1$ is a translate and dilate of $\Omega_0$.
Next let us deduce Theorem 1.7 from Theorem 1.5 and Theorem B. Fix a positive measure $\nu$ on $S^n$ satisfying the two necessary conditions (a) and (b). Theorem 1.5 implies that there is a bounded, convex open set $\Omega_0$ and a positive number $c$ such that $\Omega_0$ has capacity 1 and induced measure $d\nu_{\Omega_0} = c\,d\nu$. Formula 2.1 implies
$1 = \mathrm{cap}\,\Omega_0 = \frac{1}{N-2}\int_{S^n} u_0\,c\,d\nu,$
where $u_0$ is the support function of $\Omega_0$. Let
$\mathcal{C} = \{\Omega : \Omega \text{ is convex and open},\ \mathrm{cap}\,\Omega \ge 1\}.$
Denote
$F(\Omega) = \int_{S^n} u\,d\nu,$ where $u$ is the support function of $\Omega$.
PROPOSITION 5.1. $\Omega_0$ is the unique minimizer of $F$ in the class $\mathcal{C}$, up to translation.
Proof. Let $\Omega_1 \in \mathcal{C}$, and let $u_1$ be the support function of $\Omega_1$. Suppose that
$F(\Omega_1)$
Since $(\mathrm{cap}\,\Omega_t)^{1/(N-2)}$ is concave and equals 1 at $t = 0$ and $t = 1$, it must be constant. Therefore by Theorem B, $\Omega_1$ is a translate and dilate of $\Omega_0$. (No dilation is needed because the two convex bodies have the same capacity.)
It is probably possible to carry out a direct variational proof of Theorem 1.5, but we did not do so because we have not proved that minimizers satisfy the natural Euler-Lagrange equation. To make this remark more precise, consider an arbitrary positive, continuous function $u$ on $S^n$, which need not be the support function of a convex domain. Denote for all $\xi \in S^n$).
If u* is the support function of Q[u], then 0
dcapQ[u+tv]I,_O=J
s
dt
(5.3)
were proved for all u e C(S") and all support functions u. In (5.1), this is proved for t = 0 + provided the function u is also a support function of a convex domain. The corresponding identity for volume is true for all continuous v. The proof is not immediate, but follows from the fact that the Gauss mapping g is continuous almost everywhere with respect to da.
REFERENCES
[BF] T. BONNESEN AND W. FENCHEL, "Theory of Convex Bodies," BCS Associates, Moscow, ID, 1987, Translation from German.
[B] C. BORELL, Capacitary inequalities of the Brunn-Minkowski type, Math. Ann. 263 (1983), 179-184.
[CY] S.-Y. CHENG AND S.-T. YAU, On the regularity of the solution of the n-dimensional Minkowski problem, Comm. Pure Appl. Math. 29 (1976), 495-516.
[D] B. E. J. DAHLBERG, Estimates for harmonic measure, Arch. Rational Mech. Anal. 65 (1977), 275-283.
[J1] D. JERISON, Prescribing harmonic measure on convex domains, Invent. Math. 105 (1991), 375-400.
[J2] D. JERISON, A Minkowski problem for electrostatic capacity, Acta Math., to appear.
Part VI
General Analysis
J. Funct. Anal. 51, 159-165 (1983)
Reprinted from JOURNAL OF FUNCTIONAL ANALYSIS, Vol. 51, No. 2, April 1983. Printed in Belgium.
All Rights Reserved by Academic Press, New York and London
An $L^p$ Bound for the Riesz and Bessel Potentials of Orthonormal Functions
ELLIOTT H. LIEB
Institute for Advanced Study, Princeton, New Jersey 08540
Communicated by Irving Segal
Received September 14, 1982
Let $\psi_1, \ldots, \psi_N$ be orthonormal functions in $\mathbb{R}^d$ and let $u_i = (-\Delta)^{-1/2}\psi_i$, or $u_i = (-\Delta + m^2)^{-1/2}\psi_i$, and let $\rho(x) = \sum |u_i(x)|^2$. $L^p$ bounds are proved for $\rho$, an example being $\|\rho\|_p \le A_d N^{1/p}$ for $d \ge 3$, with $p = d(d-2)^{-1}$. The unusual feature of these bounds is that the orthogonality of the $\psi_i$ yields a factor $N^{1/p}$ instead of $N$, as would be the case without orthogonality. These bounds prove some conjectures of Battle and Federbush (A Phase Cell Cluster Expansion for Euclidean Field Theories, I, 1982, preprint) and of Conlon (Comm. Math. Phys., in press).
The genesis of this paper is a problem posed by Battle and Federbush [2] in connection with their new approach to Euclidean $\varphi^4_3$ quantum field theory. The problem is related to the proof of stability in [2, Sect. 7]. Let $\psi_1, \ldots, \psi_N$ be $N$ orthonormal complex valued functions in $L^2(\mathbb{R}^d)$ and define
$u_i = (-\Delta + m^2)^{-1/2}\psi_i,$   (1)
$\rho(x) = \sum_{i=1}^N |u_i(x)|^2,$   (2)
where $\Delta$ is the Laplacian and $m \ge 0$. Battle and Federbush prove that for $d = 1$, 2, and 3 and $m > 0$ there is a universal constant $W$ such that
$\|\rho\|_2 \le W m^{-2+d/2} N^{1/2}.$   (3)
They ask whether (3) also holds for $d = 4$, and also point out that no such bound can hold for $d \ge 5$. Equation (3) looks like a Sobolev inequality, and it is, except for one important innovation. The standard Sobolev inequality would have a factor $N$, not $N^{1/2}$, in (3). What makes (3) interesting is that the orthogonality of
* Permanent address: Jadwin Hall, Princeton University, P.O. Box 708, Princeton, N.J. 08544. Work partially supported by National Science Foundation Grant PHY-8116101.
the $\psi_i$ yields $N^{1/2}$. If the $\psi_i$ were normalized, but not orthogonal, the best estimate in (3) would have $N$. Indeed, (3) is easily seen to hold with a factor $N$ by the standard Sobolev inequality, even for $d = 4$.
Here we shall not only prove (3) for $d = 4$ but will generalize the inequality (for all $d \ge 1$) to what is essentially the best possible, in the sense that any proposed strengthening will fail even for $N = 1$.
The main results are the following, but generalizations are given in Theorems 3 and 4:
THEOREM 1. Let $\psi_1, \ldots, \psi_N$ be orthonormal in $L^2(\mathbb{R}^d)$ with $u_i$ and $\rho$ given by (1) and (2). Then
(i) $d = 1$. For $m > 0$, $\rho \in C^{0,1/2}$ (the Hölder continuous functions with exponent $\frac12$) and $\rho \in L^\infty$. There is a universal constant $L$ such that
$\|\rho\|_\infty \le L/m.$   (4)
(ii) $d = 2$. For $m > 0$ and all $1 \le p < \infty$, $\rho \in L^p$ and
$\|\rho\|_p \le B_p\,m^{-2/p}\,N^{1/p},$   (5)
where $B_p$ is a universal constant. ($\rho$ is not necessarily in $L^\infty$.)
(iii) $d \ge 3$. For all $m \ge 0$ (including $m = 0$), and with $p = d(d-2)^{-1}$, $\rho \in L^p$ and
$\|\rho\|_p \le A_d\,N^{1/p},$   (6)
where $A_d$ is a universal constant (independent of $m$).
Remark 1. If the orthogonality (but not the normality) of the $\psi_i$ is omitted, then the right side of (4) has to be multiplied by $N$, and $N^{1/p}$ has to be replaced by $N$ on the right sides of (5) and (6). In some sense the effect of orthogonality is most striking in (4).
Remark 2. The theorem yields (3) for $d = 4$. For $d = 1$, 2, or 3, the theorem also yields (3) via the Hölder inequality and the obvious fact (for all $d$), which follows by taking Fourier transforms, that
$\|\rho\|_1 \le m^{-2}N.$   (7)
Proof of Theorem 1. Let us first study the situation for $N = 1$. The operator $(-\Delta)^{-1/2} \equiv I$ is the Riesz operator while $(-\Delta + m^2)^{-1/2} \equiv J_m$ is the Bessel operator. We refer to [7, Chap. V] for a discussion and definitions. What will concern us here is
(a) For $d \ge 3$, $I$ is a bounded map from $L^2$ to $L^r$ and from $L^s$ to $L^2$ with $r = 2d(d-2)^{-1}$ and $s = r' = 2d(d+2)^{-1}$.
(b) For all $d$, $J_m$ is given by an integral kernel of the form
$G_m(x - y) = m^{d-1}\,G(m(x - y)).$   (8)
The $m$ dependence given by (8) accounts trivially (by scaling) for the $m$ dependence in (4)-(7). Henceforth we shall take $m = 1$ and drop the $m$ subscript. $J$ has the same properties as $I$ in (a) for $d \ge 3$.
(c) For all $d$ and 1
$H = V^{1/2}J, \qquad H^* = JV^{1/2}.$   (9)
By (c), H is a bounded operator from L2 to L2 and H* is its adjoint. Let K = H*H. K is compact, for TrK= JJv(x)R(x-y)dxdy
(10)
The last inequality comes from the fact that $R(x) = G * G(x) = C_2\exp(-|x|)$. Let $\lambda_1 \ge \lambda_2 \ge \cdots$ be the eigenvalues (including multiplicity) of $K$. Then for any $N$ orthonormal functions $\psi_i$,
$\sum_{i=1}^N \|H\psi_i\|_2^2 \le \sum_{i=1}^N \lambda_i \le \mathrm{Tr}\,K.$   (11)
However, the left side of (11) is just $\int \rho V$. Since $\int \rho V \le C_1\|V\|_1$ for any $V \in L^1 \cap L^\infty$, (4) is proved.
(ii) $d = 2$. The $p = 1$ case is given in (7), so assume $1 < p < \infty$ and consider the operator $K$ with $V = \rho^{p-1} \in L^r$ with $r = p'$. By (c), $H$ is bounded from $L^2$ to $L^2$ and $H^*$ is its adjoint. Let $T = \mathrm{Tr}\,K^r$. Then $T^{1/r} \le C_r\|V\|_r$. (To prove this we can appeal to a general result of Cwikel [3] (see also Theorem 2 and [6, Theorem 4.1]) that $|||f(x)g(-i\nabla)|||_{2r} \le C_r\,\|f\|_{2r}\,\|g\|_{2r}$, where $|||\cdot|||_1$ is the trace norm.) Using the same variational principle as in (i) we have that
$\int \rho V \le \sum_{i=1}^N \lambda_i \le N^{1/p}\Bigl(\sum_{i=1}^N \lambda_i^r\Bigr)^{1/r} \le N^{1/p}\,T^{1/r} \le C_r\,N^{1/p}\,\|V\|_r.$   (12)
Since $\int \rho V = \|\rho\|_p^p$ and $\|V\|_r = \|\rho\|_p^{p-1}$, (5) follows.
(iii) $d \ge 3$. First consider $m > 0$. For reasons of clarity we reintroduce the parameter $m$, namely, $H = V^{1/2}J_m$. With $V^{1/2} \in L^d$, $H$ and $H^*$ are bounded from $L^2$ to $L^2$ by (a). If we try to imitate the $d = 2$ proof (with $V = \rho^{p-1}$) we would have, as in (12), $\int \rho V \le C\,N^{1/p}\,|||K|||_t$ with $t = p' = d/2$. However, $|||K|||_t$ need not be finite; it is certainly not bounded by $\|V\|_t$. A new idea is needed and this is provided by the Cwikel-Lieb-Rosenbljum bound [3-5]. (This bound was proved by these authors by completely independent methods. The Cwikel and Rosenbljum methods extend to a wider class of operators, but for the operator of interest $K$, Lieb's bound gives the best constant of the three.) First, $K$ is compact. The nonzero eigenvalues of $K$ are, of course, the same as those of $B = HH^*$. $B$ is called the Birman-Schwinger kernel [6]. Second, let $n(V)$ denote the number of eigenvalues of $K$ which are $\ge 1$. Then
$n(V) \le C_d\int V^{d/2}.$   (13)
Here $C_d$ is independent of $m$ (as it must be). Since $K$ is linear in $V$, (13) can be inverted to read
$\lambda_j \le (C_d/j)^{2/d}\,\|V\|_{d/2}.$   (14)
(Simply consider $V/\lambda_j$ and $n(V/\lambda_j) = j$ in (13).) Now we can imitate (12). Take $V = \rho^{p-1}$ whence
$\int V\rho \le \sum_{j=1}^N \lambda_j \le C_d^{2/d}\,\|V\|_{d/2}\sum_{j=1}^N j^{-2/d}.$   (15)
This completes the proof for $m > 0$.
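The passage from (15) to the factor $N^{1/p}$ rests on bounding a power sum. Assuming the sum in (15) is $\sum_{j \le N} j^{-2/d}$, comparison with an integral gives $\sum_{j=1}^N j^{-2/d} \le \frac{d}{d-2}N^{1-2/d}$, and $1 - 2/d = 1/p$ for $p = d(d-2)^{-1}$. The sketch below checks this comparison numerically; the function names are illustrative.

```python
def partial_sum(N, d):
    """Sum of j^(-2/d) for j = 1, ..., N."""
    return sum(j ** (-2.0 / d) for j in range(1, N + 1))

def integral_bound(N, d):
    """Integral of x^(-2/d) over [0, N], namely d/(d-2) * N^(1 - 2/d); needs d >= 3."""
    return d / (d - 2.0) * N ** (1.0 - 2.0 / d)

for d in (3, 4, 5):
    for N in (1, 10, 1000):
        print(d, N, partial_sum(N, d), integral_bound(N, d))
```

Because $x^{-2/d}$ is decreasing, each term $j^{-2/d}$ is at most the integral over $[j-1, j]$, which is the whole argument behind the bound.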
For $m = 0$ we take $H = V^{1/2}I$, $H^* = IV^{1/2}$. By (a), these are bounded from $L^2$ to $L^2$ with a bound $C\,\|V^{1/2}\|_d$. Bound (13) continues to hold, and (15) is again true. Alternatively, we can note that for fixed $\psi$, $u_m = J_m\psi$ converges pointwise a.e. to $u = I\psi$ as $m \to 0$ by dominated convergence using the explicit integral kernels for $J_m$ and $I$ (see [7, Chap. V, Theorem 1]). Then (6) follows by Fatou's lemma. ∎
VARIATIONS ON THE THEME
An obvious generalization is to replace (1) by
$u_i = (-\Delta + m^2)^{-\alpha/2}\psi_i,$   (16)
with $\alpha > 0$, and with $\alpha < d$ if $m = 0$ (see [7]). Here $\rho$ is still defined by (2). Equation (7) becomes
$\|\rho\|_1 \le m^{-2\alpha}N.$   (17)
Cwikel's theorem [3], the first half of which was mentioned just before (12), will be needed. See [6, Theorems 4.1 and 4.2].
THEOREM 2 (Cwikel). (i) If $f, g \in L^q(\mathbb{R}^d)$ with $2 \le q < \infty$ then, with $|||X|||_q = \{\mathrm{Tr}\,|X|^q\}^{1/q}$,
$|||f(x)g(-i\nabla)|||_q \le (2\pi)^{-d/q}\,\|f\|_q\,\|g\|_q.$   (18)
(ii) If $f \in L^q(\mathbb{R}^d)$ and $g \in L^q_w(\mathbb{R}^d)$, then for $2 < q < \infty$ there is a finite constant $C_{q,d}$ such that
$|||f(x)g(-i\nabla)|||_{q,w} \le C_{q,d}\,\|f\|_q\,\|g\|_{q,w}.$   (19)
By definition,
$\|g\|_{q,w} = \sup_{t>0}\,t\,\mathrm{meas}\{x : t < |g(x)|\}^{1/q}$ and $|||O|||_{q,w} = \sup_n n^{1/q}\lambda_n,$
where $\lambda_1 \ge \lambda_2 \ge \cdots$ are the eigenvalues of the (compact) operator $(O^*O)^{1/2}$. Note that the nonzero eigenvalues of $O^*O$ and $OO^*$ are the same.
In our application
$g(k) = (k^2 + m^2)^{-\alpha/2}$
and
$\|g\|_q = R_{q,d,\alpha}\,m^{-\alpha + d/q}$ if $\alpha q > d$ and $m > 0$,   (20)
$\|g\|_{q,w} = T_{d,\alpha}$ if $q\alpha = d$ and $m = 0$.   (21)
With this information, and by imitating the proof of Theorem 1, case $d \ge 3$, we have the following generalization of Theorem 1:
THEOREM 3. (i) For all $d$, $m > 0$, and $\alpha > 0$ a finite universal constant $B_{p,d,\alpha}$ exists such that
$\|\rho\|_p \le B_{p,d,\alpha}\,m^{d - 2\alpha - d/p}\,N^{1/p}$   (22)
provided that
$1 \le p \le \infty$ when $2\alpha > d$,
$1 \le p < \infty$ when $2\alpha = d$,
$1 \le p \le d(d - 2\alpha)^{-1}$ when $2\alpha < d$.
(ii) For all $d$, $m = 0$, $0 < \alpha < d/2$, and $p = d(d - 2\alpha)^{-1}$, a finite universal constant $A_{d,\alpha}$ exists such that
$\|\rho\|_p \le A_{d,\alpha}\,N^{1/p}.$   (23)
Note. The $q$ in (20), (21) is chosen to be $2p(p-1)^{-1}$ when $p > 1$. Also, $V = \rho^{p-1}$.
In the foregoing, the operators $H$ and $H^*$, given by (9), were used with $V = \rho^r$ for some suitable $r$. Now let us consider the following problem as suggested by Conlon [8]: Consider the operator $L$ given by the kernel
$L(x, y) = \sum_{i=1}^N \psi_i(x)\,G_{m,\alpha}(x - y)\,\bar\psi_i(y),$   (24)
where $G_{m,\alpha}$ is the kernel for $(-\Delta + m^2)^{-\alpha/2}$ with $\alpha > 0$, and with $\alpha < d$ if $m = 0$. Again, the $\{\psi_i\}$ are an orthonormal set. For $d = 3$, $m = 0$, and $\alpha = 2$, Conlon [8] proved that when $1/r + 1/s = 2/3$, $2 < r, s < 6$,
$|(f, Lg)| \le (\mathrm{const})\,N^{1/2}\,\|f\|_r\,\|g\|_s$
(with $(v, u) = \int \bar v u$). In this case, the operator $L$ is the exchange Coulomb energy operator of Hartree-Fock theory. Conlon [8] suggested that the exponent $\frac12$ could be improved to $\frac13$ by using the results of Cwikel [3]. Subsequently, Conlon (private communication) was able to prove the $N^{1/3}$ bound for $r = s = 3$ by a completely different method from that given below. The general case is contained in
THEOREM 4. With $L$ given by (24) and $f \in L^r(\mathbb{R}^d)$, $g \in L^s(\mathbb{R}^d)$, there are universal constants $C$, independent of the $\{\psi_i\}$, such that
(i) For all d and all m > 0,
I(fLg)1
(25)
when 1/r + 1/s = a/d and 2 < r, s < oo. (ii)
For all d and m > 0 I(J:Lg)I
1,'r
"' 11 f 11r 11 9L
when I/r + 1/s < a/d and 2 < r, s < oo. m2)-"2 f Let Hf= (-A + and HR = (-A + m2)-'12g with Q+y=a. Then I(wi,Hf HRW,)I
Proof.
I
,11Hrwi11'}''2.
rQ > d, sy > d. For part (i), one mimics the proof of Theorem 1. d >, 3. For
both parts, it is necessary to note that the orthonormality of the $\{\psi_i\}$ implies that $\sum_{i=1}^N \|H\psi_i\|_2^2 \le \sum_{j=1}^N \lambda_j$, where $\lambda_1 \ge \lambda_2 \ge \cdots$ are the eigenvalues of $H^*H$. ∎
ACKNOWLEDGMENTS
I thank Professor Paul Federbush for drawing my attention to the $d = 4$ problem contained in [2] and I thank Professor Joseph Conlon for valuable discussions about the problem raised in [8]. The Institute for Advanced Study is thanked for its hospitality and support.
REFERENCES
1. R. A. ADAMS, "Sobolev Spaces," Academic Press, New York, 1975.
2. G. A. BATTLE III AND P. FEDERBUSH, A Phase Cell Cluster Expansion for Euclidean Field Theories, Part I, 1982, preprint.
3. M. CWIKEL, Weak type estimates for singular values and the number of bound states of Schroedinger operators, Ann. of Math. 106 (1977), 93-102.
4. E. H. LIEB, The number of bound states of one-body Schroedinger operators and the Weyl problem, Proc. A.M.S. Symposia in Pure Math., Vol. 36, pp. 241-252, 1980; these results were announced in Bounds on the eigenvalues of the Laplace and Schroedinger operators, Bull. Amer. Math. Soc. 82 (1976), 751-753.
5. G. V. ROSENBLJUM, Distribution of the discrete spectrum of singular differential operators, Dokl. Akad. Nauk SSSR 202 (1972), 1012-1015 (MR 45, No. 4216); the details are given in Distribution of the discrete spectrum of singular differential operators, Izv. Vyss. Ucebn. Zaved. Matematika 164 (1976), 75-86; English trans., Soviet Math. (Iz. VUZ) 20 (1976), 367-380.
6. B. SIMON, "Trace Ideals and their Applications," Cambridge Univ. Press, Cambridge, 1979.
7. E. M. STEIN, "Singular Integrals and Differentiability Properties of Functions," Princeton Univ. Press, Princeton, N.J., 1970.
8. J. G. CONLON, Semi-classical limit theorems for Hartree-Fock theory, Comm. Math. Phys., in press.
With H. Brezis in Proc. Amer. Math. Soc. 88, 486-490 (1983)
PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY, Volume 88, Number 3, July 1983
A RELATION BETWEEN POINTWISE CONVERGENCE OF FUNCTIONS AND CONVERGENCE OF FUNCTIONALS
HAIM BREZIS AND ELLIOTT LIEB¹
ABSTRACT. We show that if $\{f_n\}$ is a sequence of uniformly $L^p$-bounded functions on a measure space, and if $f_n \to f$ pointwise a.e., then $\lim_{n\to\infty}\{\|f_n\|_p^p - \|f_n - f\|_p^p\} = \|f\|_p^p$ for all $0 < p < \infty$. This result is also generalized in Theorem 2 to some functionals other than the $L^p$ norm, namely $\int |j(f_n) - j(f_n - f) - j(f)| \to 0$ for suitable $j: C \to C$ and a suitable sequence $\{f_n\}$. A brief discussion is given of the usefulness of this result in variational problems.
1. Introduction. Let $(\Omega, \Sigma, \mu)$ be a measure space and let $\{f_n\}_{n=1}^\infty$ be a sequence of complex valued measurable functions which are uniformly bounded in $L^p = L^p(\Omega, \Sigma, \mu)$ for some $0 < p < \infty$. Suppose that $f_n \to f$ pointwise almost everywhere (a.e.). What can be said about $\|f\|_p$? The simplest tool for estimating $\|f\|_p$ is Fatou's lemma, which yields
$\|f\|_p \le \liminf_{n\to\infty}\|f_n\|_p.$
The purpose of this note is to point out that much more can be said, namely
$\lim_{n\to\infty}\bigl(\|f_n\|_p^p - \|f_n - f\|_p^p\bigr) = \|f\|_p^p.$   (1)
More generally, if $j: C \to C$ is a continuous function such that $j(0) = 0$, then, when $f_n \to f$ a.e. and $\int |j(f_n(x))|\,d\mu(x) \le C < \infty$, it follows that
$\lim_{n\to\infty}\int [j(f_n) - j(f_n - f)] = \int j(f)$   (2)
under suitable conditions on $j$ and/or $\{f_n\}$.
Heuristically, (2) says the following. If we write $f_n = f + g_n$ with $g_n \to 0$ a.e., then, for large $n$, $\int j(f + g_n)$ decouples into two parts, namely $\int j(f)$ and $\int j(g_n)$.
Equation (1) is not merely an idle exercise, but it is actually useful in the calculus of variations to prove the existence of maximizing (resp. minimizing) functions in some cases in which compactness is not available. In fact (1) was first used by one of us (E. Lieb), but with a different notion of convergence than pointwise convergence of $f_n - f$, to solve a variational problem [1]. Later, it was also used in another variational problem [2]. At the end of this note we shall give a brief account of how (1) can be used.
Received by the editors August 9, 1982 and, in revised form, November 17, 1982.
1980 Mathematics Subject Classification. Primary 28A20, 35J60, 46E30.
Key words and phrases. Convergence of functionals, pointwise convergence, $L^p$ spaces.
¹Work partially supported by U.S. National Science Foundation Grant PHY-8116101.
Two theorems will be stated: (i) the $L^p$ case ($0 < p < \infty$), (ii) the general case (2). Although (i) is a corollary of (ii) we state it separately because it is an important special case and because the assumptions are especially transparent.
E. Lieb is most grateful to the Institute for Advanced Study for its support and hospitality. Both authors thank the Summer Research Institute for bringing them together in Melbourne, Australia, where this note had its origin.
2. The $L^p$ case ($0 < p < \infty$).
(ii) In case $0 < p \le 1$, and if we assume that $f \in L^p$, then we do not need the hypothesis that $\|f_n\|_p$ is uniformly bounded. (This follows from the inequality $\bigl|\,|f_n|^p - |f_n - f|^p\bigr| \le |f|^p$ and the dominated convergence theorem.) However, when $1 < p < \infty$, the hypothesis that $\|f_n\|_p$ is uniformly bounded is really necessary (even if we assume that $f \in L^p$) as a simple counterexample shows.
(iii) When $1 < p < \infty$, the hypotheses of Theorem 1 imply that $f_n \to f$ weakly in $L^p$. [By the Banach-Alaoglu theorem, for some subsequence, $f_n$ converges weakly to some $g$; but $g = f$ since $f_n \to f$ a.e.] However, weak convergence in $L^p$ is insufficient to conclude that (1) holds, except in the case $p = 2$. When $p \ne 2$ it is easy to construct counterexamples to (1) under the assumption only of weak convergence. When $p = 2$ the proof of (1) is trivial under the assumption of weak convergence.
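Equation (1) can be illustrated on a concrete sequence. The example below is a hypothetical one chosen for this sketch (it is not taken from the paper): on $[0, 1]$ with Lebesgue measure let $f = 1$ and $f_n = f + g_n$, where $g_n = n^{1/p}$ on $[0, 1/n]$ and 0 elsewhere. Then $g_n \to 0$ a.e., $\|g_n\|_p^p = 1$ for every $n$, and the integrals can be written in closed form.

```python
def brezis_lieb_difference(n, p):
    """||f_n||_p^p - ||f_n - f||_p^p for f = 1 on [0, 1] and f_n = f + g_n,
    where g_n = n^(1/p) on [0, 1/n] and 0 elsewhere (exact integrals)."""
    height = n ** (1.0 / p)
    norm_fn = (1.0 - 1.0 / n) + (1.0 / n) * (1.0 + height) ** p
    norm_gn = (1.0 / n) * height ** p  # equals 1 for every n, up to rounding
    return norm_fn - norm_gn

for p in (0.5, 1.0, 2.0, 3.0):
    print(p, [round(brezis_lieb_difference(n, p), 4) for n in (10, 10**3, 10**6)])
```

The differences approach $\|f\|_p^p = 1$ even though $\|f_n\|_p^p$ itself tends to 2, so Fatou's lemma alone would only give $\|f\|_p^p \le 2$.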
3. The general case. In order to prove (2), some conditions are needed on the function $j$ and the sequence $\{f_n\}$. To make this point clear we shall later give an example for which (2) fails. On the other hand, we shall not attempt to find the most general conditions for which (2) holds but shall, instead, content ourselves here with conditions which are reasonably simple, yet general enough to cover many examples.
Let $j: C \to C$ be a continuous function with $j(0) = 0$. In addition let $j$ satisfy the following hypothesis: For every sufficiently small $\epsilon > 0$ there exist two continuous, nonnegative functions $\varphi_\epsilon$ and $\psi_\epsilon$ such that
$|j(a + b) - j(a)| \le \epsilon\varphi_\epsilon(a) + \psi_\epsilon(b)$   (3)
for all $a, b \in C$.
THEOREM 2. Let $j$ satisfy the above hypothesis and let $f_n = f + g_n$ be a sequence of measurable functions from $\Omega$ to $C$ such that:
(i) $g_n \to 0$ a.e.
(ii) $j(f) \in L^1$.
(iii) $\int \varphi_\epsilon(g_n(x))\,d\mu(x) \le C < \infty$, for some constant $C$, independent of $\epsilon$ and $n$.
(iv) $\int \psi_\epsilon(f(x))\,d\mu(x) < \infty$ for all $\epsilon > 0$.
Then, as $n \to \infty$,
$\int |j(f + g_n) - j(g_n) - j(f)|\,d\mu \to 0.$   (4)
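REMARKS. (i) It is not assumed that $j(f_n)$ or $j(g_n)$ are separately in $L^1$. (ii) Note that the convergence in (4) is in the strong $L^1$ topology. This is a stronger statement than (2).
PROOF OF THEOREM 2. Fix $\epsilon > 0$ and let
$W_{\epsilon,n}(x) = \bigl[\,|j(f_n(x)) - j(g_n(x)) - j(f(x))| - \epsilon\varphi_\epsilon(g_n(x))\,\bigr]_+,$
where $[a]_+ = \max(a, 0)$. As $n \to \infty$, $W_{\epsilon,n}(x) \to 0$ a.e. On the other hand,
$|j(f_n) - j(g_n) - j(f)| \le |j(f_n) - j(g_n)| + |j(f)| \le \epsilon\varphi_\epsilon(g_n) + \psi_\epsilon(f) + |j(f)|.$
Therefore, $W_{\epsilon,n} \le \psi_\epsilon(f) + |j(f)| \in L^1$. By dominated convergence, $\int W_{\epsilon,n}\,d\mu \to 0$ as $n \to \infty$. However,
$|j(f_n) - j(g_n) - j(f)| \le W_{\epsilon,n} + \epsilon\varphi_\epsilon(g_n)$
and, thus,
$I_n \equiv \int |j(f_n) - j(g_n) - j(f)|\,d\mu \le \int [W_{\epsilon,n} + \epsilon\varphi_\epsilon(g_n)]\,d\mu.$
Consequently, $\limsup I_n \le \epsilon C$. Now let $\epsilon \to 0$. ∎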
EXAMPLES. (a) $j(t) = |t|^p$, $0 < p < \infty$.
(b) Suppose that $j$ is a continuous, convex function from $C$ to $R$ with $j(0) = 0$. Choose some number $k > 1$. Then (3) holds for $\epsilon k < 1$ with
$\varphi_\epsilon(t) = j(kt) - kj(t)$
and
$\psi_\epsilon(t) = |j(C_\epsilon t)| + |j(-C_\epsilon t)|,$
with $1/C_\epsilon = \epsilon(k - 1)$. This is proved in Lemma 3 below. Therefore, the hypotheses of Theorem 2 are satisfied if there is some fixed $k > 1$ such that $\{j(kg_n) - kj(g_n)\}$ is uniformly bounded in $L^1$, and if $j(Mf)$ is in $L^1$ for every real $M$.
(c) The condition in example (b) that $\{j(kg_n) - kj(g_n)\}$ is uniformly bounded in $L^1$ for some constant $k > 1$ can be essential, not only for the hypotheses of Theorem 2 but for the conclusion as well. Let $\Omega = [0, 1]$, $j(t) = e^{|t|} - 1$, $d\mu = dx$, $f(x) = 1$, and $g_n(x) = \ln(1 + n)$ if $0 \le x \le 1/n$, 0 otherwise. Then $\int j(g_n) = 1$ and $\int j(f) = e - 1$, whereas $\int j(f_n) = 2e - 1$ for every $n$. In this example we see that (2) does not hold even though $\{j(g_n)\}$ is uniformly bounded in $L^1$ and $j(Mf) \in L^1$ for all real $M$. Note that for this sequence $\{g_n\}$, $j(kg_n)$ is not uniformly bounded when $k > 1$. However since $j(t)$ is convex, (b) above tells us that the conclusion of Theorem 2 would be valid for any other sequence $g_n$ such that $j(kg_n)$ is uniformly bounded in $L^1$ for some $k > 1$.
LEMMA 3. Let $j: C \to R$ be convex and let $k > 1$. Then
$|j(a + b) - j(a)| \le \epsilon[j(ka) - kj(a)] + |j(C_\epsilon b)| + |j(-C_\epsilon b)|$
for all $a, b \in C$, $0 < \epsilon < 1/k$ and $1/C_\epsilon = \epsilon(k - 1)$.
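Both example (c) and the inequality of Lemma 3 lend themselves to a quick numerical check. The sketch below assumes $j(t) = e^{|t|} - 1$ as in example (c), restricts Lemma 3 to real $a$, $b$, and uses $k = 2$; the sampling ranges and function names are arbitrary choices made for illustration.

```python
import math
import random

def j(t):
    """The convex function of example (c): j(t) = e^{|t|} - 1."""
    return math.exp(abs(t)) - 1.0

def lemma3_slack(a, b, k, eps):
    """Right side minus left side of the Lemma 3 inequality, for real a and b."""
    C = 1.0 / (eps * (k - 1.0))
    rhs = eps * (j(k * a) - k * j(a)) + abs(j(C * b)) + abs(j(-C * b))
    return rhs - abs(j(a + b) - j(a))

rng = random.Random(1)
k = 2.0  # Lemma 3 then requires 0 < eps < 1/k = 0.5
min_slack = min(
    lemma3_slack(rng.uniform(-3, 3), rng.uniform(-3, 3), k, rng.uniform(0.01, 0.49))
    for _ in range(20000)
)
print("smallest slack in Lemma 3:", min_slack)

def example_c_integrals(n):
    """Exact integrals over [0, 1] for f = 1 and g_n = ln(1 + n) on [0, 1/n]."""
    int_j_fn = (1.0 - 1.0 / n) * j(1.0) + (1.0 / n) * j(1.0 + math.log(1.0 + n))
    int_j_gn = (1.0 / n) * j(math.log(1.0 + n))
    int_j_f = j(1.0)
    return int_j_fn, int_j_gn, int_j_f

for n in (10, 1000):
    fn, gn, f0 = example_c_integrals(n)
    print(n, fn - gn, f0)
```

In example (c) the difference $\int j(f_n) - \int j(g_n)$ stays at $2(e - 1)$ for every $n$, twice the value $\int j(f) = e - 1$ that (2) would predict.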
PROOF. Let $\alpha = 1 - k\epsilon$, $\beta = \epsilon$, $\gamma = (k - 1)\epsilon$. Then $\alpha + \beta + \gamma = 1$ and $(a + b) = \alpha a + \beta(ka) + \gamma(C_\epsilon b)$. By convexity, $j(a + b) \le \alpha j(a) + \beta j(ka) + \gamma j(C_\epsilon b)$. This implies that
$j(a + b) - j(a) \le \epsilon[j(ka) - kj(a)] + |j(C_\epsilon b)|.$
For the reverse inequality let
$\alpha = 1/(1 + k\epsilon), \qquad \beta = \epsilon/(1 + k\epsilon), \qquad \gamma = \epsilon(k - 1)/(1 + k\epsilon),$
whence $a = \alpha(a + b) + \beta(ka) + \gamma(-C_\epsilon b)$. Then
$j(a) - j(a + b) \le \epsilon[j(ka) - kj(a)] + \epsilon(k - 1)\,j(-C_\epsilon b).$ ∎
4. Applications. In the calculus of variations an oft-met problem is to show that an infimum or supremum is achieved. We shall outline by two examples how Theorem 1 can be used for this purpose.
(A) If $K$ is the sharp constant in the inequality $\|Af\|_q \le K\|f\|_p$, where $A$ is a bounded linear operator from $L^p$ to $L^q$, can one find $f$ such that equality holds? We shall assume that $\infty > q > p > 1$. In fact, the problem in [1] that motivated Theorem 1 was the Hardy-Littlewood-Sobolev inequality on $L^p(R^n, dx)$. Namely, $A$ is the integral kernel $A(x, y) = |x - y|^{-\lambda}$, $0 < \lambda < n$ and $p^{-1} + \lambda/n = 1 + q^{-1}$. Let $K = \sup\{R(f) \mid f \in L^p, f \ne 0\}$, where $R(f) = \|Af\|_q/\|f\|_p$. The problem we address here is to prove the existence of a maximizing $f$, i.e. $R(f) = K$. Suppose that an $L^p$-bounded sequence $\{f_n\}$ can be found such that (i) $R(f_n) \to K$, (ii) $f_n \to f$ a.e., (iii) $f \ne 0$. (For the H.L.S. inequality, this can be done by using a rearrangement inequality.) The difficulty that one faces is to show $R(f) = K$. This difficulty can be overcome by Theorem 1 if we make the additional assumption that $Af_n \to Af$ a.e. (This can also be verified for the H.L.S. problem.) With these assumptions we have that
$K = \lim_{n\to\infty}\frac{\|Af_n\|_q}{\|f_n\|_p} = \lim_{n\to\infty}\frac{(\|Af\|_q^q + \|Ag_n\|_q^q)^{1/q}}{(\|f\|_p^p + \|g_n\|_p^p)^{1/p}}$
with $f_n = f + g_n$ as before. Since $p/q < 1$ and $(a + b)^t \le a^t + b^t$ for $a, b \ge 0$ and $0 \le t \le 1$, and since $\|Ag_n\|_q \le K\|g_n\|_p$ (by definition), it follows that $K^p \le \|Af\|_q^p/\|f\|_p^p$. Thus $f$ is maximizing, as desired. For further details see [1].
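The last step of (A) can be spelled out as a short chain of inequalities. The following is a sketch of one way to arrange it, assuming (as in the text) that Theorem 1 applies to both $\{f_n\}$ in $L^p$ and $\{Af_n\}$ in $L^q$, and using $t = p/q$ in the elementary inequality $(a + b)^t \le a^t + b^t$; the $o(1)$ terms tend to 0 as $n \to \infty$ because the sequences are bounded.

```latex
\begin{align*}
K^p\bigl(\|f\|_p^p + \|g_n\|_p^p\bigr) + o(1)
  &= \bigl(\|Af\|_q^q + \|Ag_n\|_q^q\bigr)^{p/q} \\
  &\le \|Af\|_q^p + \|Ag_n\|_q^p \\
  &\le \|Af\|_q^p + K^p\,\|g_n\|_p^p .
\end{align*}
```

Cancelling the bounded term $K^p\|g_n\|_p^p$ and letting $n \to \infty$ leaves $K^p\|f\|_p^p \le \|Af\|_q^p$, and since $f \ne 0$ this is $K \le R(f)$; the reverse inequality holds by the definition of $K$.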
(B) This is taken from [2]. Let $\Omega \subset R^n$, $n \ge 3$, be a bounded domain. Let $\lambda > 0$ and let
$R_\lambda(f) = \frac{\int |\nabla f|^2 - \lambda\int |f|^2}{\|f\|_p^2}$ with $p = \frac{2n}{n-2}$.
The problem is to show that $K_\lambda = \inf\{R_\lambda(f) \mid f \in H_0^1(\Omega), f \ne 0\}$ is achieved. Suppose that we know that $K_\lambda < K_0$ (this is indeed the case for every $\lambda > 0$ when $n \ge 4$, and for $\lambda$ sufficiently large when $n = 3$; see [2]); then $K_\lambda$ is achieved. To prove this, let $\{f_n\}$ be a minimizing sequence with $\|f_n\|_p = 1$. Since $f_n$ is bounded in $H^1(\Omega)$ we may assume that $f_n \to f$ weakly in $H^1$, $f_n \to f$ strongly in $L^2$ and $f_n \to f$ a.e. We have
$\int |\nabla f_n|^2 - \lambda\int |f_n|^2 = K_\lambda + o(1),$
and since $\int |\nabla f_n|^2 \ge K_0\|f_n\|_p^2 = K_0$ (by definition of $K_0$), it follows that $\lambda\int |f|^2 \ge K_0 - K_\lambda > 0$. Therefore $f \ne 0$. On the other hand, let $g_n = f_n - f$. We have
$\int |\nabla f_n|^2 - \lambda\int |f_n|^2 = K_\lambda\|f_n\|_p^2 + o(1),$
and since $g_n \to 0$ weakly in $H^1$, we obtain
$\int |\nabla f|^2 + \int |\nabla g_n|^2 - \lambda\int |f|^2 = K_\lambda\|f_n\|_p^2 + o(1).$
Consequently,
$\int |\nabla f|^2 + K_0\|g_n\|_p^2 - \lambda\int |f|^2 \le K_\lambda\|f_n\|_p^2 + o(1).$
On the other hand, it follows from Theorem 1 that
$\|f_n\|_p^p = \|f\|_p^p + \|g_n\|_p^p + o(1).$
Since $p \ge 2$ we deduce that
$\|f_n\|_p^2 \le \|f\|_p^2 + \|g_n\|_p^2 + o(1).$
If $K_\lambda > 0$, we conclude that
$K_\lambda\|f_n\|_p^2 \le K_\lambda\|f\|_p^2 + K_0\|g_n\|_p^2 + o(1)$
and, therefore,
$\int |\nabla f|^2 - \lambda\int |f|^2 \le K_\lambda\|f\|_p^2.$
THE INSTITUTE FOR ADVANCED STUDY, PRINCETON, NEW JERSEY 08540
Permanent address: Departments of Mathematics and Physics, Princeton University, Jadwin Hall, P.O. Box 708, Princeton, New Jersey 08544
Annals of Math. 118, 349-374 (1983) Annals of Mathematics. 118 (1983), 349-374
Sharp constants in the Hardy-Littlewood-Sobolev and related inequalities
By ELLIOTT H. LIEB*
Abstract
A maximizing function, $f$, is shown to exist for the HLS inequality on $R^n$: $\||x|^{-\lambda} * f\|_q \le N_{p,\lambda,n}\|f\|_p$, with $N$ being the sharp constant and $1/p + \lambda/n = 1 + 1/q$. When $p = q'$ or $p = 2$ or $q = 2$, $f$ and $N$ are explicitly evaluated. A maximizing $f$ is also shown to exist for other inequalities:
(i) The Okikiolu, Glaser, Martin, Grosse, Thirring inequality: $K_{n,p}\|\nabla f\|_2 \ge \||x|^{-b}f\|_p$, $n \ge 3$, $0 \le b < 1$, $p = 2n/(2b + n - 2)$. (This was known before, but the proof here has certain simplifications.)
(ii) The doubly weighted HLS inequality of Stein and Weiss:
$\Bigl\|\int V(x, y)f(y)\,dy\Bigr\|_q \le P_{\alpha,\beta,p,\lambda,n}\|f\|_p$
with $V(x, y) = |x|^{-\beta}|x - y|^{-\lambda}|y|^{-\alpha}$, $0 \le \alpha < n/p'$, $0 \le \beta < n/q$, $1/p + (\lambda + \alpha + \beta)/n = 1 + 1/q$.
(iii) The weighted Young inequality: $\||x|^\gamma f\|_p^m \ge Q_{p,m,n}\|f^{(m)}\|_\infty$, where $f^{(m)}(x)$ is the $m$-fold convolution of $f$ with itself, $m \ge 3$, $m/(m-1) \le p \le m$, $\gamma/n + 1/p = (m-1)/m$. When $p = m/(m-1)$ or $p = 2$, $f$ and $Q$ are explicitly evaluated.
I. Introduction
A classical inequality, due to Hardy and Littlewood [15], [16] and Sobolev [26] (see also [11]) states that
$\Bigl|\iint f(x)|x - y|^{-\lambda}g(y)\,dx\,dy\Bigr| \le N_{p,\lambda,n}\|f\|_p\|g\|_t$   (1.1)
for all $f \in L^p(R^n)$, $g \in L^t(R^n)$, $1 < p, t < \infty$, $1/p + 1/t + \lambda/n = 2$ and $0 < \lambda < n$. (By notational definition, $N_{p,\lambda,n}$ is the best, or sharp constant in (1.1).)
*Work partially supported by the U.S. National Science Foundation under grant no. PHY-8116101.
The main purpose of this paper is two-fold. In Section II, it is shown that a maximizing pair f, g exists for (1.1), i.e., a pair that gives equality in (1.1). This will require the use of two rearrangement inequalities and a new compactness technique (Lemma 2.7) for maximizing sequences. From the point of view of general methodology, this is perhaps the most interesting part of this work. In
Section III, $N$ and $f$, $g$ are explicitly computed for the case $t = p$ and, as a corollary, for the cases $t = 2$ or $p = 2$. This part is amusing for the following reason: one can guess what the pair $f$, $g$ ought to be and verify that they satisfy the integral equations (Euler-Lagrange equations) for (1.1). But these equations are nonlinear and it is far from clear that this choice is actually maximal. The proof that it is so requires use of stereographic projection from $\mathbb{R}^n$ to the sphere $S^n$ and exploitation of the symmetry of $f$, $g$ given by the rearrangement inequalities used in Section II. Additional examples are given of the techniques of Section II. Section IV contains a comparatively simple proof of the existence of a maximizing $f$ for the Sobolev inequality

(1.2) $K_n \|\nabla f\|_2 \ge \|f\|_{2n/(n-2)}, \quad n \ge 3,$

and its generalization due to Okikiolu [21], Glaser, Martin, Grosse and Thirring [14]:

(1.3) $K_{n,p} \|\nabla f\|_2 \ge \| |x|^{-b} f \|_p, \quad n \ge 3,$

for $0 \le b < 1$ and $p = 2n/(2b + n - 2)$. Of course (1.2) has been treated before by Aubin [2] and Talenti [31] and (1.3) in [14], but the directness of the proofs given here may be of some value.
Section V uses the techniques of Section II to prove the existence of a maximizing $f$, $g$ for the doubly weighted HLS inequality of Stein and Weiss [29]:

(1.4) $\iint g(x)\, V(x,y)\, f(y)\, dx\, dy \le P_{\alpha,\beta,p,\lambda,n} \|f\|_p \|g\|_t$

with $V(x,y) = |x|^{-\beta} |x-y|^{-\lambda} |y|^{-\alpha}$, $0 < \alpha < n/p'$, $0 < \beta < n/t'$, $1/p + 1/t + (\lambda + \alpha + \beta)/n = 2$. Finally, the weighted Young inequality is shown to have a maximizing $f$:

(1.5) $Q\, \|f^{(m)}\|_\infty \le \| |x|^{\gamma} f \|_p^m, \quad m \ge 3,$

where $f^{(m)}(x)$ is the $m$-fold convolution of $f$ with itself and $m/(m-1) < p < m$, $\gamma/n + 1/p = (m-1)/m$. Moreover, $Q$ can be evaluated in two cases: $p = m/(m-1)$ and $p = 2$. The latter case turns out, by Fourier transformation, to be (1.1) in disguise with $p = t$. Thus, the evaluation of the sharp constant in (1.5) for $p = 2$ brings the work full circle.
My indebtedness to Alan Sokal is profound. He stimulated this investigation
by suggesting that (1.5) was true for m = 4, p = 2, a case which arose in his study of quantum field theory [27]. Later he proposed the general case of (1.5).
He also suggested that the techniques of Section II would work for (1.4). Throughout the course of this work he was a constant source of encouragement and stimulation. I am also indebted to Henri Berestycki for his encouragement. I thank Haim Brezis for pointing out the last part of Lemma 2.7 and I thank the referee for many helpful remarks, in particular for drawing my attention to [3]. I
am most grateful to the Institute for Advanced Study for its support and hospitality.
A technical remark can be made about (1.1) in the context of weak $L^q$ spaces, $L_w^q(\mathbb{R}^n)$. There are two definitions of what is meant by $\|h\|_{q,w}$ for $1 < q < \infty$. One is

(1.6) $\|h\|_{q,w} = \sup_{\alpha > 0} \alpha \left[ (n/a_n)\, \mu\{x : |h(x)| > \alpha\} \right]^{1/q}$

where $a_n$ is the area of the unit sphere, (2.13), and $\mu$ is Lebesgue measure. This is not a true norm (the triangle inequality is not satisfied), but it is convenient and it is equivalent to the following, due to Calderón, which is a true norm:

(1.7) $\|h\|_{q,w}^{*} = (1/q')\,(n/a_n)^{1/q} \sup_A \mu(A)^{-1/q'} \int_A |h(x)|\, dx$

for $0 < \mu(A) < \infty$. Clearly $|x|^{-\lambda}$ has unit norm in both definitions ($q = n/\lambda$). The generalization of (1.1) is

(1.8) $\iint f(x)\, h(x-y)\, g(y)\, dx\, dy \le N_{p,\lambda,n} \|f\|_p \|g\|_t \|h\|_{q,w}$

with $q = n/\lambda$, $1/p + 1/t + 1/q = 2$, $1 < p, t, q < \infty$, and the same $N_{p,\lambda,n}$ is sharp in (1.8) as in (1.1). Either one of the two definitions, (1.6) or (1.7), may be used in (1.8), and the same $N$ is sharp for both.
The justification for (1.8) is the following: if we replace $f$, $g$, $h$ by their symmetric decreasing rearrangements, $f^*$, $g^*$, $h^*$ (see Section II), the left side of (1.8) does not decrease. All the norms on the right side of (1.8) are invariant. The maximizing $f$, $g$ for (1.1) satisfies $f = f^*$, $g = g^*$ (Section II). We note that $|x|^{-\lambda} = \sup\{h(x) : \|h\|_{q,w} \le 1,\ h = h^*\}$. Thus, (1.8) holds with (1.6). The proof for (1.7) is trickier. Let $b = f * g$. Clearly, $b = b^*$ since $f = f^*$, $g = g^*$. Thus, $b(x) = \int_0^\infty d\alpha\, \chi_\alpha(x)$ where $\chi_\alpha$ is the characteristic function of $T_\alpha = \{x : b(x) > \alpha\}$, which is a ball of some radius $R_\alpha$. Assume that $\|h\|_{q,w}^{*} = 1$. The left side of (1.8) is

$\int b(x) h(x)\, dx = \int_0^\infty d\alpha \int \chi_\alpha(x) h(x)\, dx$

$\le \int_0^\infty d\alpha\, q'(a_n/n) R_\alpha^{n-\lambda} = \int_0^\infty d\alpha \int \chi_\alpha(x) |x|^{-\lambda}\, dx$

$= \int b(x) |x|^{-\lambda}\, dx.$
II. Existence of a maximizing function
Here we shall establish the existence of a maximizing pair of functions $f$, $g$ giving equality in (1.1). This means finding $f \in L^p$, $p^{-1} + \lambda/n = 1 + q^{-1}$, such that if

(2.1) $R(F) = \| |x|^{-\lambda} * F \|_q / \|F\|_p, \quad F \ne 0,$

then

(2.2) $R(f) = N_{p,\lambda,n} \equiv \sup\{ R(F) : F \in L^p,\ F \ne 0 \}.$
Some remarks might be helpful to explain the difficulties to be faced in finding this $f$. First, the usual way to find $f$ is by a compactness argument. But $R(F)$ is not upper-semicontinuous in the $L^p$ weak topology. Second, $R(F)$ is invariant under the conformal group of dilations, rotations and translations, namely,

(2.3) $F(x) \to F(\gamma \mathcal{R} x + y), \quad \gamma > 0,\ \mathcal{R} \in O(n),\ y \in \mathbb{R}^n.$

Furthermore, if $q = p'$, a case for which we shall explicitly find $f$, an inversion symmetry also exists, i.e.,

(2.4) $F(x) \to |x|^{\lambda - 2n} F(x |x|^{-2}).$

The existence of this large invariance group means that a maximizing $f$ cannot be unique and also that it is easy for a weakly convergent maximizing sequence $\{f_j\}$ to converge to zero. Third, if the kernel $|x-y|^{-\lambda}$ is changed slightly, a maximizing $f$ need not exist. Explicitly, let $K(r)$, for $r > 0$, be any positive function such that $r^{\lambda} K(r)$ is strictly monotone increasing. Consider $K(|x-y|)$ in place of $|x-y|^{-\lambda}$ as a kernel in (2.1). The Fourier transform of the Bessel potential, $(1 + |x-y|^2)^{-\lambda/2}$, is a good example; it is even positive definite. For any $f \in L^p$, let $F(x) = |f(x/2)|$. It is easy to check that $R(F) > R(f)$, and hence, that no maximum can exist.

One of the key tools we shall use (several times, in fact) is Riesz's rearrangement inequality [24] (for a generalization see [7]) in the strong form given by Lieb [20]. It is recalled in Lemma 2.1.
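The dilation invariance in (2.3) comes down to exponent bookkeeping, which the following minimal sketch makes explicit. The helper names `dilation_exponent` and `q_from` are illustrative and not from the paper; the sketch only tracks the power of $\gamma$ picked up by the numerator and denominator of $R(F)$ when $F(x)$ is replaced by $F(\gamma x)$.

```python
from fractions import Fraction

def dilation_exponent(n, p, q, lam):
    """Power of gamma picked up by R(F) when F(x) is replaced by F(gamma * x).

    |x|^(-lam) * F(gamma .) equals gamma^(lam - n) times (|x|^(-lam) * F)(gamma .),
    and the L^q norm of G(gamma .) equals gamma^(-n/q) times the L^q norm of G
    (likewise for L^p).
    """
    n, p, q, lam = map(Fraction, (n, p, q, lam))
    numerator_power = (lam - n) - n / q   # from || |x|^(-lam) * F(gamma .) ||_q
    denominator_power = -n / p            # from || F(gamma .) ||_p
    return numerator_power - denominator_power

def q_from(n, p, lam):
    """Solve 1/p + lam/n = 1 + 1/q for q."""
    n, p, lam = map(Fraction, (n, p, lam))
    return 1 / (1 / p + lam / n - 1)

# Illustrative parameters (an assumption): n = 3, lam = 1, p = 6/5.
n, lam, p = 3, 1, Fraction(6, 5)
q = q_from(n, p, lam)
print("q =", q, " exponent =", dilation_exponent(n, p, q, lam))
print("exponent with a mismatched q = 5:", dilation_exponent(n, p, 5, lam))
```

The exponent vanishes exactly when $1/p + \lambda/n = 1 + 1/q$, which is why a maximizing sequence can be dilated freely without changing $R$.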
Definition 1. Let $f : \mathbb{R}^n \to \mathbb{C}$ satisfy $\mu_f(a) \equiv \mu\{x : |f(x)| > a\} < \infty$ for all $a > 0$. (Here, $\mu$ is Lebesgue measure.) $f^* : \mathbb{R}^n \to [0, \infty)$ is a symmetric decreasing rearrangement of $f$ if $f^*(x)$ depends only on $|x|$, and $f^*(x_1) \ge f^*(x_2) \ge 0$ if $|x_1| \le |x_2|$, and $\mu_{f^*}(a) = \mu_f(a)$ for all $a > 0$.

It is easy to check that $f^*$ always exists and it is defined uniquely almost everywhere (see [20]). Henceforth, notation will be abused in the sense that any function $f(x)$ that depends only on $|x|$ will sometimes be written as $f(|x|)$. It is convenient to introduce the following sets of functions from $\mathbb{R}^n \to [0, \infty)$ (where T denotes "translate"):

SD = {$f$ : $f$ is symmetric decreasing, i.e., $f = f^*$};
SSD = {$f$ : $f \in$ SD and $f$ is strictly monotone decreasing};
TSD = {$f$ : $f_y(x) \equiv f(x + y)$ and $f_y \in$ SD for some $y \in \mathbb{R}^n$};
TSSD = {$f$ : $f \in$ TSD and $f_y \in$ SSD}.

LEMMA 2.1. Let $f$, $g$, $h$ be functions on $\mathbb{R}^n$ satisfying the conditions of Definition 1 and let

$I(f, g, h) = \iint f(x)\, g(x - y)\, h(y)\, dx\, dy.$

Then

(i) $I(f^*, g^*, h^*) \ge |I(f, g, h)|$.

If, in addition, $g^* \in$ SSD then

(ii) $I(f^*, g^*, h^*) > |I(f, g, h)|$ unless $f(x) = f^*(x + y)$ and $h(x) = h^*(x + y)$ for some (common) $y \in \mathbb{R}^n$.
The first part of this lemma has been generalized to more than three functions and more than two variables in [7]. Another closely related fact that will be needed later is Lemma 2.2. We omit the easy proof (which mimics the proof of Lemma 2.1); it can also be obtained from Lemma 2.1 by suitable choice of h.
LEMMA 2.2. Let $g = g^*$, $f = f^*$. Suppose the convolution $k \equiv g * f$ satisfies
k(x)
THEOREM 2.3. Let $1/p + \lambda/n = 1 + 1/q$ with $0 < \lambda < n$, $1 < p, q < \infty$. Then

(i) $N_{p,\lambda,n}$ in (2.2) is finite and there exists an $f \in L^p$ that maximizes $R$, i.e., $R(f) = N_{p,\lambda,n}$.

(ii) After multiplication by a suitable complex constant, every maximizing $f$ is in TSSD and satisfies the pair of equations

(2.5) $|x|^{-\lambda} * f = g^{t-1}, \quad |x|^{-\lambda} * g = f^{p-1}$

for some $g \in L^t$ and $g \in$ TSSD. (Here $t = q' = q/(q-1)$.) After a common translation, $f, g \in$ SSD.

(iii) When $q' = p = t$, then $g = f$.

(iv) Let $q' = p = t$ and let $f$ be translated so that $f = f^*$. Then, possibly after a dilation $f(r) \to \gamma^{n/p} f(\gamma r)$, $f$ has the inversion symmetry of (2.4):

(2.6) $f(1/r) = r^{2n/p} f(r).$
In the following, irrelevant positive constants will all be denoted by the common symbol $C$.

Proof of (ii) and (iii): These two parts are easy in view of Lemma 2.1. $N$ in (2.2), which is here assumed to be finite, can be written as

(2.7) $N_{p,\lambda,n} = \sup_{f,g} \iint f(x)\, g(y)\, K(x - y)\, dx\, dy \,/\, \|f\|_p \|g\|_t,$

where $f \in L^p$, $g \in L^t$ and $K(x) = |x|^{-\lambda}$ and $1/t + 1/p + \lambda/n = 2$. Since the rearrangements $f \to f^*$, $g \to g^*$ do not change the norms $\|f\|_p$, $\|g\|_t$, and since $K = K^* \in$ SSD, Lemma 2.1(ii) shows that $f, g \in$ TSD (possibly after the multiplication by constants). The equations (2.5) are then easy to derive by letting $f \to f + \varepsilon\varphi$, $g \to g + \delta\psi$ and setting the derivatives at $\varepsilon = \delta = 0$ equal to zero. (Again, it may be necessary to multiply $f$ and $g$ by constants to get unity on the right side of (2.5).) By Lemma 2.2, equations (2.5) imply that (after a translation), $f, g \in$ SSD. (iii) follows from (2.7) and the fact that $K(x - y)$ is positive definite. In fact, $|x|^{-\lambda} = C |x|^{-(n+\lambda)/2} * |x|^{-(n+\lambda)/2}$. (See (3.6).)
Note that (2.7) implies

(2.8) $N_{p,\lambda,n} = N_{t,\lambda,n}, \quad 1/p + 1/t + \lambda/n = 2.$

Beginning of Proof of (i). Let $\{f_j\}$ be a maximizing sequence, i.e., $R(f_j) \to N$. Assume $\|f_j\|_p = 1$. Since $f_j \in L^p$, $f_j^*$ exists and $\|f_j^*\|_p = 1$. By Lemma 2.1 (see the proof of (ii)), $R(f_j^*) \ge R(f_j)$, so we can henceforth assume that $f_j = f_j^*$. Now

(2.9) $1 = C \int_0^\infty r^{n-1} f_j(r)^p\, dr \ge C \int_0^R r^{n-1} f_j(r)^p\, dr \ge C R^n f_j(R)^p$, so that $0 \le f_j(r) \le C r^{-n/p}$.

By passing to a subsequence, we can then assume

(2.10) $f_j(r) \to f(r)$ and $0 \le f(r) \le C r^{-n/p}$

for all rational $r$. Since $f_j(r)$ is non-increasing in $r$, it is easy to see that $f_j(r)$ converges for almost all $r > 0$ and therefore that $f = f^*$. (This is essentially Helly's theorem.) By Fatou's lemma, $f \in L^p$. The problem we face is that $f$ could easily be zero because of the dilation subgroup mentioned in (2.3). Even if $f \ne 0$ it is not obvious that $R(f) \ge N_{p,\lambda,n}$, but this fact will be proved with the help of the following lemma.
LEMMA 2.4. Let $1/p + \lambda/n = 1 + 1/q$, with $0 < \lambda < n$. Suppose $f \in L^p(\mathbb{R}^n)$ is spherically symmetric and $|f(r)| \le \varepsilon r^{-n/p}$ for all $r > 0$. There is a constant, $C_n$, independent of $f$ and $\varepsilon$ such that

$\| |x|^{-\lambda} * f \|_q \le C_n \|f\|_p^{p/q}\, \varepsilon^{1 - p/q}.$

(Note that $p < q$.)

Remarks. (i) Lemma 2.4 and (2.9) obviously imply that $N < \infty$. (ii) Lemma 2.4 follows from known results about the Lorentz spaces $L(p, q)$ (see [22], [9], [28]). We give our own proof, which is based on a transformation to logarithmic radial variables, for two reasons: (a) In conjunction with Lemma 2.1 it provides an alternative strategy for proving many known facts about $L(p, q)$ spaces; (b) The formulation given in our proof will be needed later in order to establish (and hence to exploit) the inversion symmetry for $q = p'$ given in Theorem 2.3(iv).
Proof. Define $F : \mathbb{R} \to \mathbb{R}$ by

(2.11) $F(u) = e^{un/p} f(e^u),$

whence

(2.12) $\|f\|_{L^p(\mathbb{R}^n)} = a_n^{1/p} \|F\|_{L^p(\mathbb{R})}.$

Here, $a_n$ is the area of the unit sphere in $\mathbb{R}^n$,

(2.13) $a_n = 2\pi^{n/2} / \Gamma(n/2).$

Without loss of generality, we can assume $f(r) \ge 0$. Define $h = |x|^{-\lambda} * f$, which is spherically symmetric, and $H : \mathbb{R} \to \mathbb{R}$ by $H(u) = e^{un/q} h(e^u)$. As in (2.12), $\|h\|_q = a_n^{1/q} \|H\|_q$. An explicit form for $H$, which is easily obtained by integrating $d^n x$ over angles in $\mathbb{R}^n$, is the following:

(2.14) $H(u) = \int_{-\infty}^{\infty} L_n(u - v) F(v)\, dv,$

where

(2.15) $L_n(u) = 2^{-\lambda/2} \exp(u(n/q - \lambda/2))\, Z_n(u),$

(2.16) $Z_n(u) = a_{n-1} \int_0^\pi [\cosh u - \cos\theta]^{-\lambda/2} (\sin\theta)^{n-2}\, d\theta, \quad n \ge 2,$
$\phantom{(2.16)\ } Z_n(u) = [\cosh u + 1]^{-\lambda/2} + [\cosh u - 1]^{-\lambda/2}, \quad n = 1.$

Now, $L_n \in L^1(\mathbb{R})$ and $F \in L^p(\mathbb{R})$ and $\|F\|_\infty \le \varepsilon$. (Note that $|n/q - \lambda/2| < \lambda/2$, since $p > 1$, and that the singularity, if any, in $Z_n(u)$ for $u$ near zero is integrable, since $\lambda < n$.) By Young's inequality, $\|H\|_p \le C \|F\|_p$ and $\|H\|_\infty \le C \|F\|_\infty \le C\varepsilon$. Since $q > p$, the lemma follows from Hölder's inequality.
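The change to logarithmic radial variables in the proof above can be checked numerically. The sketch below is only an illustration: the Gaussian profile, the values $n = 3$, $p = 1.5$, and the helper names are assumptions, not taken from the paper. It compares $\|f\|_p^p$ computed radially in $\mathbb{R}^n$ with $a_n \|F\|_p^p$ computed on the line, where $F(u) = e^{un/p} f(e^u)$.

```python
import math

def unit_sphere_area(n):
    """a_n = 2 pi^(n/2) / Gamma(n/2), the area of the unit sphere in R^n."""
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)

def midpoint(func, lo, hi, steps):
    """Plain midpoint rule on [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))

n, p = 3, 1.5  # illustrative choices

def f(r):
    # illustrative radial profile (an assumption, not from the paper)
    return math.exp(-r * r)

def F(u):
    # logarithmic radial variable: F(u) = e^(u n / p) f(e^u)
    return math.exp(u * n / p) * f(math.exp(u))

# ||f||_p^p computed radially: a_n * int_0^inf r^(n-1) f(r)^p dr
lhs = unit_sphere_area(n) * midpoint(lambda r: r ** (n - 1) * f(r) ** p, 0.0, 8.0, 20000)
# a_n * ||F||_p^p computed on the real line
rhs = unit_sphere_area(n) * midpoint(lambda u: F(u) ** p, -12.0, 3.0, 20000)
print(lhs, rhs)
```

The two numbers agree up to quadrature error, because $r^{n-1}\,dr$ becomes $e^{un}\,du$ under $r = e^u$ and the factor $e^{un}$ is absorbed into $F^p$.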
Before returning to the proof of Theorem 2.3(i), let us draw two conclusions from the construction, (2.11)-(2.16). First, the original problem (2.2) is equivalent to the one-dimensional problem

(2.17) $N_{p,\lambda,n} = a_n^{1/q - 1/p} \sup\{ \|L_n * F\|_q / \|F\|_p : 0 \ne F \in L^p(\mathbb{R}) \}.$

In particular, $L_n *$ is a bounded operator from $L^p(\mathbb{R})$ to $L^q(\mathbb{R})$. The second conclusion is

Proof of Theorem 2.3(iv). Make the change of variables given in (2.11) and note that $n/q - \lambda/2 = 0$ in (2.15) when $q = p'$. Thus, $L_n \in$ SSD($\mathbb{R}$). From (2.17) and by the same proof as for Theorem 2.3(ii), $F \in$ TSSD($\mathbb{R}$). Translating $F$, namely $F(u) \to F(u + \gamma)$, is the same as dilation of $f$. With $F \in$ SSD($\mathbb{R}$), inverting (2.11) gives the desired result.
It is worth noting that the strong rearrangement inequality had to be used twice to prove Theorem 2.3(iv).

Lemma 2.4 and (2.9) not only imply that $N < \infty$, they also imply that, after a suitable $f_j$-dependent dilation, we can assume that the limit, $f$, in (2.10) is not zero. To see this, let

$a_j = \sup_r r^{n/p} f_j(r).$

By Lemma 2.4, $a_j \not\to 0$, for otherwise $\| |x|^{-\lambda} * f_j \|_q \to 0$ while $\|f_j\|_p = 1$, which would mean that $\{f_j\}$ is not a maximizing sequence. Thus, $a_j > 2\beta > 0$. Replace $f_j(r)$ by $\gamma_j^{n/p} f_j(\gamma_j r)$, which does not change the norm of $f_j$. We can now choose $\gamma_j$ so that $f_j(1) > a_j/2 > \beta > 0$. Therefore, $f(1) \ge \beta$ and, since $f \in$ SD, $f(r) \ge \beta$ for $r \le 1$. Thus, $f$ is not zero.

Let us briefly review the situation about Theorem 2.3(i). We have a maximizing sequence $\{f_j\}$ of non-negative symmetric decreasing functions which converge pointwise, almost everywhere to $f \ne 0$. By Fatou's lemma, $\|f\|_p \le \lim \|f_j\|_p = 1$; therefore $f$ will be maximizing if $I(f_j) \to I(f)$, where

(2.18) $I(g) = \| |x|^{-\lambda} * g \|_q.$

The convergence of $I(f_j)$ to $I(f)$ will be proved, but only after we first prove that $R(f_j) \to R(f)$ and $\|f\|_p = 1$. Before doing so let us first consider a related problem which is interesting in its own right, for which it is easy to establish that $I(f_j) \to I(f)$. This other problem and its solution are stated as the following theorem.
THEOREM 2.5. Let $1/p + 1/t + \lambda/n = 2$ with $0 < \lambda < n$ and $1 < p, t < \infty$ as before, and consider the ratio in (2.7) but with $g$ restricted to be $f$; i.e.,

(2.19) $\tilde{N} = \sup \iint f(x) f(y) |x - y|^{-\lambda}\, dx\, dy \,/\, \|f\|_p \|f\|_t$

with $f \in L^p \cap L^t$ and $f \ne 0$. (Naturally, $\tilde{N} \le N$ and $\tilde{N} = N$ when $t = p = q'$ as stated in Theorem 2.3(iii).) Let $t \ne p$. Then there exists a maximizing $f$ for $\tilde{N}$. Furthermore, after multiplication by a constant, a dilation and a translation (i.e., $f(x) \to c f(\gamma x + y)$) this $f$ is in SSD and satisfies

(2.20) $|x|^{-\lambda} * f = f^{p-1} + f^{t-1}.$

Proof. All of the argument is as before, but with one additional fact at our disposal. We can (after dilation and multiplication by a constant) assume that $\|f_j\|_p = \|f_j\|_t = 1$. From (2.9) the limit $f$ satisfies $f(r) \le C r^{-n/p}$ and $f(r) \le C r^{-n/t}$ (same $C$). Let $h(x) = C \min\{ |x|^{-n/p}, |x|^{-n/t} \}$. Although $h$ is neither in $L^p$ nor in $L^t$, the function $h(x) h(y) |x - y|^{-\lambda} \in L^1(\mathbb{R}^n \times \mathbb{R}^n)$. (To see this, note that $h \in L^s$ when $\min(t, p) < s < \max(t, p)$. Choose $s$ so that $1/s + 1/s + \lambda/n = 2$. But we already proved that $h \to |x|^{-\lambda} * h$ is bounded from $L^s$ to $L^{s'}$.) Therefore, if $I(f)$ denotes the integral in (2.19), we have that $I(f_j) \to I(f)$ by dominated convergence. □
Returning to Theorem 2.3(i), we see that establishing the convergence of $I(f_j)$ to $I(f)$ is more delicate than in Theorem 2.5, even if $p \ne t$, because all we know is that $f \in L^p$, and not necessarily in $L^t$. Therefore, the dominated convergence argument cannot be used.

To control the convergence of $R(f_j)$ to $R(f)$, the following lemma due to Brezis and myself [8] is useful.

LEMMA 2.6. Let $0 < p < \infty$. Let $(M, \Sigma, \mu)$ be a measure space and let $\{f_j\}$ be a uniformly norm-bounded sequence in $L^p(M, \Sigma, \mu)$ that converges pointwise, almost everywhere to $f$. (By Fatou's lemma $f \in L^p$.) Then the following limit exists and equality holds.

$\lim_{j \to \infty} \int \left| |f_j(x)|^p - |f_j(x) - f(x)|^p - |f(x)|^p \right| d\mu(x) = 0.$

Remarks. Lemma 2.6 says more than that $\|f_j\|_p^p - \|f_j - f\|_p^p - \|f\|_p^p \to 0$. It improves Fatou's lemma which says that $\liminf \|f_j\|_p \ge \|f\|_p$. In [8] a similar theorem is proved for functionals of the form $f \to \int J(f)\, d\mu$. The conclusion of Lemma 2.6 does not hold (except when $p = 2$) if pointwise is replaced by weak convergence. Note that the lemma holds even for $0 < p < 1$. Lemma 2.6 can be proved simply without using the general results in [8]. Note that

$\left| |f_j|^p - |f_j - f|^p - |f|^p \right| \le 2|f|^p, \quad 0 < p \le 1,$

$\left| |f_j|^p - |f_j - f|^p - |f|^p \right| \le p\, 2^{p-1} \{ |f_j - f|^{p-1} |f| + |f_j - f|\, |f|^{p-1} \}, \quad 1 \le p < \infty.$

The lemma follows from the first inequality for $0 < p < 1$ by dominated convergence; for $1 \le p < \infty$ it follows from the second by Egorov's theorem.

The utility of Lemma 2.6 for problems in the calculus of variations is given in the next lemma.
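Before moving on, Lemma 2.6 can be illustrated numerically. The following is a minimal sketch under stated assumptions: the sequence $f_j(x) = e^{-x^2} + e^{-(x-j)^2}$ on the real line and the exponent $p = 3$ are illustrative choices, not from the paper. Here $f_j \to f = e^{-x^2}$ pointwise while $\|f_j - f\|_p$ stays constant, so the convergence is not in norm, yet the integral in the lemma still tends to zero.

```python
import math

def integrate(func, lo, hi, steps):
    """Plain midpoint rule on [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))

p = 3.0  # illustrative exponent

def f(x):
    return math.exp(-x * x)

def f_j(x, j):
    # f plus a bump of fixed L^p norm that drifts off to infinity,
    # so f_j -> f pointwise while ||f_j - f||_p stays constant
    return f(x) + math.exp(-(x - j) ** 2)

def defect(j):
    """Integral of | |f_j|^p - |f_j - f|^p - |f|^p | over the line."""
    def integrand(x):
        return abs(abs(f_j(x, j)) ** p - abs(f_j(x, j) - f(x)) ** p - abs(f(x)) ** p)
    return integrate(integrand, -10.0, j + 10.0, 4000 * (j + 20))

defects = [defect(j) for j in (1, 2, 4, 8)]
print(defects)
```

The printed values shrink rapidly as the bump moves away, which is the splitting $\|f_j\|_p^p \approx \|f_j - f\|_p^p + \|f\|_p^p$ that the proof of Lemma 2.7 relies on.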
LEMMA 2.7. Let $(M, \Sigma, \mu)$ and $(M', \Sigma', \mu')$ be measure spaces and let $X$ (resp. $Y$) be $L^p(M, \Sigma, \mu)$ (resp. $L^q(M', \Sigma', \mu')$) with $1 \le p \le q < \infty$. Let $A$ be a bounded linear operator from $X$ to $Y$. For $f \in X$, $f \ne 0$ let

$R(f) = \|Af\|_Y / \|f\|_X$ and $N = \sup\{ R(f) : f \ne 0 \}.$

Let $\{f_j\}$ be a uniformly norm-bounded maximizing sequence for $N$ and suppose that $f_j \to f \ne 0$ and that $Af_j \to Af$ pointwise almost everywhere. Then $f$ maximizes, i.e., $R(f) = N$. Moreover, if $p < q$ and if $\lim \|f_j\|_X = C$ exists, then $\|f\|_X = C$, and hence $\|Af_j\|_Y \to \|Af\|_Y$.

Proof. By Lemma 2.6, $\|f_j\|_X^p = \|f_j - f\|_X^p + \|f\|_X^p + o(1)^p$ (where $o(1)$ denotes something that goes to zero as $j \to \infty$) and $\|Af_j\|_Y^q = \|Af_j - Af\|_Y^q + \|Af\|_Y^q + \bar{o}(1)^q$. If $a, b, c \ge 0$, then $(a^q + b^q + c^q)^{p/q} \le a^p + b^p + c^p$. Thus,

$R(f_j)^p \le \{ \|Af\|_Y^p + \|A(f_j - f)\|_Y^p + \bar{o}(1)^p \} / \{ \|f\|_X^p + \|f_j - f\|_X^p + o(1)^p \}.$

Now $\|A(f_j - f)\|_Y \le N \|f_j - f\|_X$ for every $j$, and $o(1), \bar{o}(1) \to 0$, and $R(f_j)^p \to N^p$. Since $f \ne 0$, we must have that $\|Af\|_Y \ge N \|f\|_X$, and hence $\|Af\|_Y = N \|f\|_X$. We also have that $\lim \|A(f_j - f)\|_Y - N \|f_j - f\|_X = 0$. For the last part, let $p < q$ and suppose that $\lim \|f_j - f\|_X > 0$. However, $(a^q + b^q)^{p/q} < a^p + b^p$ unless $a = 0$ or $b = 0$. Thus, $\lim R(f_j)^p < N^p$, which is a contradiction.

The last sentence of Lemma 2.7, and the proof, were pointed out to me by Haim Brezis.
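The elementary inequality $(a^q + b^q + c^q)^{p/q} \le a^p + b^p + c^p$ for $p \le q$, used in the proof above, is easy to spot-check. The sketch below is only an illustration on a small grid; the exponents $p = 1.5$, $q = 4$ and the grid values are arbitrary assumptions.

```python
import itertools

def lhs(a, b, c, p, q):
    return (a ** q + b ** q + c ** q) ** (p / q)

def rhs(a, b, c, p):
    return a ** p + b ** p + c ** p

p, q = 1.5, 4.0                 # any exponents with 0 < p <= q
grid = [0.0, 0.3, 1.0, 2.5]     # illustrative non-negative test values
gaps = [rhs(a, b, c, p) - lhs(a, b, c, p, q)
        for a, b, c in itertools.product(grid, repeat=3)]
print(min(gaps))

# With two nonzero entries and p < q the inequality is strict,
# which is what drives the contradiction in the last part of the proof.
strict_gap = rhs(1.0, 1.0, 0.0, p) - lhs(1.0, 1.0, 0.0, p, q)
print(strict_gap)
```

The gap is zero (up to rounding) only when at most one of $a$, $b$, $c$ is nonzero.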
Conclusion of Proof of Theorem 2.3(i). Let us return to the one-dimensional equivalent formulation given in (2.11)-(2.17). Given the maximizing sequence $f_j \in$ SD with $\|f_j\|_p = 1$, (2.11) defines a sequence $F_j : \mathbb{R} \to \mathbb{R}$ with $\|F_j\|_p = a_n^{-1/p}$ and $\|F_j\|_\infty \le C$ by (2.9). Also, $f_j \to f \ne 0$ pointwise, so $F_j \to F \ne 0$ pointwise.

Lemma 2.7 can now be applied to finish the proof provided the operator $A = L_n * : L^p(\mathbb{R}) \to L^q(\mathbb{R})$ satisfies $AF_j \to AF$ pointwise. But since $L_n \in L^1(\mathbb{R})$ and since $\|F_j - F\|_\infty \le C$, $L_n * (F_j - F)(x) \to 0$ everywhere by dominated convergence. The fact that $I(f_j) \to I(f)$ is contained in the last sentence of Lemma 2.7. In our case, $p < q$ since $\lambda/n < 1$.
III. The maximizing function when p' = q or p = 2 or q = 2

In certain cases the equations (2.5) for the maximizing $f$ can be solved and the constant $N$ computed explicitly. In these cases a solution to (2.5) can be easily guessed and verified. The difficult part will be to prove that this $f$ is actually maximizing. To prove this it will be necessary to use stereographic projection to recast (2.5) as an equation on the sphere $S^n$.

Recall that $1/p + 1/t + \lambda/n = 2$ or $1/p + \lambda/n = 1 + 1/q$ with $t = q' = q/(q-1)$ and $0 < \lambda < n$.

THEOREM 3.1. Let $p = t = q' = 2n/(2n - \lambda)$. The maximizing $f$ for (2.1) is (after multiplication by a constant and dilation) uniquely

(3.1) $f(x) = (1 + |x|^2)^{-n/p}$

and

(3.2) $N_{p,\lambda,n} = \pi^{\lambda/2}\, \dfrac{\Gamma(n/2 - \lambda/2)}{\Gamma(n - \lambda/2)} \left\{ \dfrac{\Gamma(n/2)}{\Gamma(n)} \right\}^{-1 + \lambda/n}.$
COROLLARY 3.2. (i) Let $q = t = 2$ and $p = 2n/(3n - 2\lambda)$, which requires $n < 2\lambda < 2n$. The maximizing $f$ for (2.1) is (after multiplication by a constant and dilation) uniquely

(3.3) $f(x) = (1 + |x|^2)^{-n/p}$

and

(3.4) $N_{p,\lambda,n} = \pi^{\lambda/2}\, \dfrac{\Gamma(n/2 - \lambda/2)}{\Gamma(\lambda/2)} \left\{ \dfrac{\Gamma(\lambda - n/2)}{\Gamma(3n/2 - \lambda)} \right\}^{1/2} \left\{ \dfrac{\Gamma(n/2)}{\Gamma(n)} \right\}^{-1 + \lambda/n}.$

(ii) Let $p = 2$, $t = 2n/(3n - 2\lambda)$, $q = 2n/(2\lambda - n)$, which requires $n < 2\lambda < 2n$. The maximizing $f$ for (2.1) is (after multiplication by a constant and dilation) uniquely

(3.5) $f = |x|^{-\lambda} * (1 + |x|^2)^{-n/t}$

and $N_{p,\lambda,n}$ is given by the right side of (3.4).
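Proof of Corollary 3.2. Note that for $0 < \lambda, \gamma < n$ and $\lambda + \gamma > n$,

(3.6) $|x|^{-\lambda} * |x|^{-\gamma} = D(\lambda, \gamma)\, |x|^{n - \lambda - \gamma},$

(3.7) $D(\lambda, \gamma) = \pi^{n/2}\, \Gamma(n/2 - \lambda/2)\, \Gamma(n/2 - \gamma/2)\, \Gamma(\lambda/2 + \gamma/2 - n/2) \times \{ \Gamma(\lambda/2)\, \Gamma(\gamma/2)\, \Gamma(n - \lambda/2 - \gamma/2) \}^{-1}.$

This follows from the fact [28, Theorem IV.4.1] that the Fourier transform $w_\lambda(k) = \int |x|^{-\lambda} \exp(ik \cdot x)\, dx$ is

(3.8) $w_\lambda(k) = |k|^{\lambda - n}\, \pi^{n/2}\, 2^{n - \lambda}\, \Gamma(n/2 - \lambda/2) / \Gamma(\lambda/2).$

When $q = 2$,

$N^2 = \| |x|^{-\lambda} * f \|_2^2 / \|f\|_p^2 = D(\lambda, \lambda)\, (f, |x|^{n - 2\lambda} * f) / \|f\|_p^2.$

But this maximization problem is, by Theorem 2.3(iii) and (2.7), the same as in Theorem 3.1 (but with $\lambda \to 2\lambda - n$). When $p = 2$ the proof is similar, using (2.5). See (2.8). □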
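The constant $D(\lambda, \gamma)$ in (3.7) can be cross-checked against (3.8): since convolution turns into multiplication of Fourier transforms, the product of the coefficients of $w_\lambda$ and $w_\gamma$ should equal $D(\lambda, \gamma)$ times the coefficient of $w_{\lambda + \gamma - n}$. The sketch below is a consistency check only; the function names and the sample values $n = 3$, $\lambda = 2$, $\gamma = 2.5$ are illustrative assumptions.

```python
import math

def w_coeff(n, lam):
    """Coefficient of |k|^(lam - n) in the Fourier transform of |x|^(-lam), as in (3.8)."""
    return (math.pi ** (n / 2) * 2 ** (n - lam)
            * math.gamma((n - lam) / 2) / math.gamma(lam / 2))

def D(n, lam, gam):
    """The constant of (3.6)-(3.7)."""
    num = (math.pi ** (n / 2) * math.gamma((n - lam) / 2) * math.gamma((n - gam) / 2)
           * math.gamma((lam + gam - n) / 2))
    den = math.gamma(lam / 2) * math.gamma(gam / 2) * math.gamma(n - (lam + gam) / 2)
    return num / den

n, lam, gam = 3, 2.0, 2.5   # needs 0 < lam, gam < n and lam + gam > n
# |x|^(-lam) * |x|^(-gam) = D |x|^(-(lam + gam - n)), so on the Fourier side:
product = w_coeff(n, lam) * w_coeff(n, gam)
predicted = D(n, lam, gam) * w_coeff(n, lam + gam - n)
print(product, predicted)
```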
Beginning of Proof of Theorem 3.1. By Theorem 2.3, the $f$ we seek must have (after dilation, etc.) two properties:

(a) The inversion symmetry $f(1/r) = r^{2n/p} f(r)$ and

(b) It satisfies (2.5) up to a constant, namely

(3.9) $|x|^{-\lambda} * f = B f^{p-1}.$

Clearly, (a) holds for (3.1). The fact that (3.1) satisfies (3.9) can be seen in several ways. One way is to note that

(3.10) $f_\mu(x) \equiv (1 + |x|^2)^{-\mu}$

has the Fourier transform

(3.11) $\hat{f}_\mu(k) = (2\pi)^{n/2} |k|^{1 - n/2} \int_0^\infty r^{n/2} J_{n/2 - 1}(|k| r) f_\mu(r)\, dr = \pi^{n/2}\, 2^{1 - \mu + n/2}\, \Gamma(\mu)^{-1} |k|^{\mu - n/2} K_{\mu - n/2}(|k|).$

Here, $K$ is a Bessel function and satisfies $K_\nu(x) = K_{-\nu}(x)$. If we set $\mu = n/p = n - \lambda/2$ and use (3.8), we find that

(3.12) $|x|^{-\lambda} * f_{n/p} = B_\lambda f_{\lambda/2} = B_\lambda (f_{n/p})^{p-1},$

(3.13) $B_\lambda = \pi^{n/2}\, \Gamma(n/2 - \lambda/2) / \Gamma(n - \lambda/2).$

It follows that $R(f_{n/p})$ in (2.1) is

(3.14) $\| |x|^{-\lambda} * f_{n/p} \|_q / \|f_{n/p}\|_p = B_\lambda [\hat{f}_n(0)]^{1/q - 1/p} = $ right side of (3.2).
Sharp Constants in the Hardy-Littlewood-Sobolev and Related Inequalities
HARDY-LITTLEWOOD-SOBOLEV INEQUALITIES
361
The calculation just given is slightly formal, but it can easily be made rigorous. The real problem that faces us, however, is this: (3.12) shows that $f$ in (3.1) (hereafter we shall denote $f_{n/p}$ by $f$) satisfies (3.9). It also has the correct inversion symmetry. Is this $f$ maximizing? Is it the unique maximizer (up to a constant)? We do not know that (3.9) has an (essentially) unique solution, even if we restrict to the SSD category, and we shall offer no proof of this kind of uniqueness. This is an open problem! But it will be shown that $f$ is (essentially) unique in the category of maximizers. In the course of this proof, (3.12)-(3.14) will be rederived in a simpler way. For the proof, a change of variables will be required, namely stereographic projection.

Stereographic Projection. Consider the sphere $S^n$ in $\mathbb{R}^{n+1}$, $S^n = \{ \Omega \in \mathbb{R}^{n+1} : |\Omega| = 1 \}$. Consider the invertible map $\mathcal{S} : \mathbb{R}^n \to S^n \setminus (0, \ldots, 0, -1)$,

(3.15) $\mathcal{S}(x) = (\rho, \xi) = (2x/(1 + |x|^2),\ (1 - |x|^2)/(1 + |x|^2))$

where $\rho \in \mathbb{R}^n$. Conversely, if $\rho \in \mathbb{R}^n$, $\xi \in (-1, 1]$ and $(\rho, \xi) \in S^n$,

(3.16) $\mathcal{S}^{-1}((\rho, \xi)) = \rho/(1 + \xi).$

Apart from trivial constants this is the usual stereographic projection with $0 \in \mathbb{R}^n \leftrightarrow$ "north pole". Let $x_1, x_2 \in \mathbb{R}^n$ and $\Omega_i = \mathcal{S}(x_i)$ with $\Omega_i = (\rho_i, \xi_i)$. Then

(3.17) $|x_1 - x_2| = |\Omega_1 - \Omega_2|\, \{ (1 + \xi_1)(1 + \xi_2) \}^{-1/2}.$

Here, $|\Omega_1 - \Omega_2|$ means Euclidean distance in $\mathbb{R}^{n+1}$, not geodesic distance on $S^n$. Let $d\Omega$ be the rotation invariant measure on $S^n$ with the normalization

(3.18) $\int d\Omega = 2\pi^{(n+1)/2}\, \Gamma((n+1)/2)^{-1}$

which is the area of the unit sphere in $\mathbb{R}^{n+1}$ (see (2.13)). Then the Jacobian of $\mathcal{S}$ is given by

(3.19) $d\Omega = d\rho/|\xi| = 2^n (1 + |x|^2)^{-n}\, dx = (1 + \xi)^n\, dx.$
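The distance identity (3.17) and the inverse map (3.16) are easy to verify numerically. The following is a minimal sketch; the function names and the two sample points in $\mathbb{R}^3$ are illustrative assumptions.

```python
import math

def stereo(x):
    """Map x in R^n to (rho, xi) on the unit sphere in R^(n+1), as in (3.15)."""
    s = sum(c * c for c in x)
    rho = [2 * c / (1 + s) for c in x]
    xi = (1 - s) / (1 + s)
    return rho, xi

def stereo_inverse(rho, xi):
    """Inverse map (3.16); undefined at the excluded point xi = -1."""
    return [c / (1 + xi) for c in rho]

def dist(u, v):
    """Euclidean distance between two points given as lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

x1, x2 = [0.3, -1.2, 2.0], [1.5, 0.4, -0.7]   # arbitrary sample points
(rho1, xi1), (rho2, xi2) = stereo(x1), stereo(x2)

on_sphere = sum(c * c for c in rho1) + xi1 ** 2        # |Omega_1|^2
chordal = dist(rho1 + [xi1], rho2 + [xi2])             # Euclidean distance in R^(n+1)
flat = dist(x1, x2)                                    # distance in R^n
rhs_317 = chordal * ((1 + xi1) * (1 + xi2)) ** -0.5    # right side of (3.17)
round_trip = stereo_inverse(rho1, xi1)
print(on_sphere, flat, rhs_317, round_trip)
```

The identity follows from $1 + \xi = 2/(1 + |x|^2)$, which is also the source of the Jacobian factor $(1 + \xi)^n$ in (3.19).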
With any $f : \mathbb{R}^n \to \mathbb{C}$ we associate $F : S^n \to \mathbb{C}$ (denoted by $f \leftrightarrow F$) by

(3.20) $F(\Omega) = (1 + \xi)^{-\mu} f(\mathcal{S}^{-1}(\Omega)), \quad f(x) = 2^{\mu} (1 + |x|^2)^{-\mu} F(\mathcal{S}(x)),$

with $\mu = n/p = n - \lambda/2$. ($\lambda$ enters at this point.) Clearly,

(3.21) $\|F\|_p = \|f\|_p.$

In particular, $f$ given by (3.1) corresponds to $F(\Omega) = \text{constant} = 2^{-\mu}$.
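From (3.15)-(3.18) we have that when $f \leftrightarrow F$,

(3.22) $(|x|^{-\lambda} * f)(x) = (1 + \xi)^{\lambda/2}\, (|\Omega|^{-\lambda} * F)(\Omega), \quad \Omega = \mathcal{S}(x),$

where

(3.23) $(|\Omega|^{-\lambda} * F)(\Omega) \equiv \int d\Omega'\, |\Omega - \Omega'|^{-\lambda} F(\Omega').$

(Again, note that Euclidean distance in $\mathbb{R}^{n+1}$ is used.) Equation (3.9) then takes the following simple form (with the same $B$):

(3.24) $|\Omega|^{-\lambda} * F = B F^{p-1}.$

This, together with (3.20), (3.21), gives another equivalent form for $N$ (cf. (2.17)):

(3.25) $N_{p,\lambda,n} = \sup\{ \| |\Omega|^{-\lambda} * F \|_q / \|F\|_p : F \in L^p(S^n),\ F \ne 0 \}.$

As stated before, $F \equiv 2^{-\mu} \leftrightarrow f$ in (3.1). To check (3.12)-(3.14) we must compute

(3.26) $I \equiv \int d\Omega'\, |\Omega - \Omega'|^{-\lambda} = a_n \int_0^\pi d\theta\, (\sin\theta)^{n-1} (2 - 2\cos\theta)^{-\lambda/2}$

$\phantom{(3.26)\ I} = a_n\, 2^{n - \lambda} \int_0^{\pi/2} d\varphi\, (\cos\varphi)^{n-1} (\sin\varphi)^{n - 1 - \lambda}$

$\phantom{(3.26)\ I} = a_n\, 2^{n - \lambda - 1}\, \Gamma(n/2)\, \Gamma(n/2 - \lambda/2) / \Gamma(n - \lambda/2).$

Thus, $B = I\, 2^{\lambda - n} = B_\lambda$ in (3.13). Furthermore,

(3.27) $\| |\Omega|^{-\lambda} * F \|_q / \|F\|_p = B F^{p-2} a_{n+1}^{1/q - 1/p} = $ right side of (3.2),

if we use the duplication formula for the gamma function.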
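The chain (3.26)-(3.27) can be checked numerically. The sketch below is an illustration under stated assumptions: it uses $n = 3$, $\lambda = 1$, takes (3.2) with the factor $\pi^{\lambda/2}$, and uses $1/q - 1/p = -1 + \lambda/n$ (valid for $q = p'$, $p = 2n/(2n - \lambda)$); the function names are not from the paper. It compares the closed form of $I$ with direct quadrature, and the sphere expression $I\, a_{n+1}^{1/q - 1/p}$ with the right side of (3.2).

```python
import math

def sphere_area(m):
    """Area of the unit sphere in R^m: 2 pi^(m/2) / Gamma(m/2)."""
    return 2 * math.pi ** (m / 2) / math.gamma(m / 2)

def I_closed(n, lam):
    """Closed form on the last line of (3.26)."""
    return (sphere_area(n) * 2 ** (n - lam - 1) * math.gamma(n / 2)
            * math.gamma((n - lam) / 2) / math.gamma(n - lam / 2))

def I_quadrature(n, lam, steps=200000):
    """Midpoint rule for a_n * int_0^pi (sin t)^(n-1) (2 - 2 cos t)^(-lam/2) dt."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.sin(t) ** (n - 1) * (2 - 2 * math.cos(t)) ** (-lam / 2)
    return sphere_area(n) * h * total

def N_sharp(n, lam):
    """Right side of (3.2)."""
    return (math.pi ** (lam / 2) * math.gamma((n - lam) / 2) / math.gamma(n - lam / 2)
            * (math.gamma(n / 2) / math.gamma(n)) ** (-1 + lam / n))

def N_from_sphere(n, lam):
    """Ratio (3.27) for constant F: I * a_{n+1}^(1/q - 1/p), with 1/q - 1/p = -1 + lam/n."""
    return I_closed(n, lam) * sphere_area(n + 1) ** (-1 + lam / n)

n, lam = 3, 1.0   # illustrative values
I_exact, I_numeric = I_closed(n, lam), I_quadrature(n, lam)
N_a, N_b = N_sharp(n, lam), N_from_sphere(n, lam)
print(I_exact, I_numeric)
print(N_a, N_b)
```

Agreement of the last two numbers is exactly the statement that the duplication formula turns $I\, a_{n+1}^{-1 + \lambda/n}$ into the right side of (3.2).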
Equation (3.24) has one very great advantage over (3.9). The O(n + 1) rotation invariance of (3.24) allows us to generate new solutions from old ones.
This fact will eventually permit us to conclude that (3.1) is the (essentially) unique maximizer.
As an interesting aside before returning to the proof of Theorem 3.1, let us consider some other solutions to (3.9) and (3.24). (Irrelevant constants will be suppressed here.)

(a) We have $f(x) = (1 + |x|^2)^{-\mu} \leftrightarrow F(\Omega) = 1$. However, by translation and dilation of $f$ in $\mathbb{R}^n$, there is an $n + 1$ parameter family of solutions as follows:

(3.28) $f(x) = [b^2 + |x - z|^2]^{-\mu} \leftrightarrow F(\Omega) = [1 + (w, \Omega)]^{-\mu},$

where $b \in \mathbb{R}$, $z \in \mathbb{R}^n$, $w \in \mathbb{R}^{n+1}$. All $b$, $z$ and $w$ are allowed, except for the condition $|w| < 1$. The SSD category corresponds to $z = 0$ and $w = (0, \ldots, 0, c)$ with $|c| < 1$.

These other solutions are interesting for the following reason. Since $F = \text{const.}$ satisfies (3.24) and since this solution has the maximum possible $O(n + 1)$ symmetry, it might have been supposed that $F = \text{const.}$ is the unique maximizer. But we see from (3.28) that there are other equivalent solutions with less symmetry. This is indeed surprising.

(b) $f(x) = |x - z|^{-\mu}$ with $z \in \mathbb{R}^n$ also satisfies (3.9). It is not allowed since it is not in $L^p$. This is an $n$ parameter family and the correspondence is

(3.29) $f(x) = |x - z|^{-\mu} \leftrightarrow F(\Omega) = (1 + \xi)^{-\mu/2} |\Omega - \Omega'|^{-\mu}$

with $\Omega' = \mathcal{S}(z)$. Even more solutions can be obtained from (3.29) by applying an $O(n + 1)$ rotation to $\Omega$ and $\Omega'$. The function $2(1 + \xi)$ becomes $|\Omega - \Omega''|^2$, with $\Omega' \ne \Omega''$. Thus we have a $2n$ parameter family of solutions:

(3.30) $F(\Omega) = |\Omega - \Omega'|^{-\mu} |\Omega - \Omega''|^{-\mu} \leftrightarrow f(x) = |x - z'|^{-\mu} |x - z''|^{-\mu}$

with $\Omega'$, $\Omega''$, $z'$, $z''$ arbitrary except that $\Omega' \ne \Omega''$ and $z' \ne z''$. This is amusing because the rotation invariance of (3.24) allowed us to generate a nontrivial $2n$ parameter family of solutions starting from $|x|^{-\mu}$.
Conclusion of Proof of Theorem 3.1. The $f$ given in (3.1) satisfies (3.9) and we want to show that it maximizes and that it is (essentially) unique. Let $f$ be any maximizer. By Theorem 2.3 we can assume three things (after translation and dilation): (a) $f$ satisfies (3.9). This means, in particular, that $f(x)$ is defined for all $x$ by the left side of (3.9), not for just almost every $x$; (b) $f \in$ SSD; (c) $f(1/r) = r^{2\mu} f(r)$. Let $F \leftrightarrow f$. By (b), $F(\Omega)$ depends only on $\xi$: $F(\Omega) = \varphi(\xi)$. By (c), $\varphi(\xi) = \varphi(-\xi)$. Thus, $f(x) = (1 + |x|^2)^{-\mu} \varphi((1 - |x|^2)/(1 + |x|^2))$. Let $R \in O(n + 1)$ be the following rotation:

$R : (\rho_1, \ldots, \rho_n, \xi) \to (\rho_1 \cos\theta - \xi \sin\theta,\ \rho_2, \ldots, \rho_n,\ \xi \cos\theta + \rho_1 \sin\theta).$

By the rotation invariance of (3.24),

$f_R(x) = (1 + |x|^2)^{-\mu} F(R\mathcal{S}(x)) = (1 + |x|^2)^{-\mu} \varphi([(1 - |x|^2)\cos\theta + 2x_1 \sin\theta]/[1 + |x|^2])$

also maximizes. Therefore, by Theorem 2.3, there exists a unique $y \in \mathbb{R}^n$ such that $f_R(x + y) \in$ SSD. This $y$ is the unique solution to $f_R(y) = \max_x f_R(x)$. By the $O(n - 1)$ rotation invariance of $f_R$ in $(x_2, \ldots, x_n)$, we see that $y_2 = \cdots = y_n = 0$. Since $f_R(x + y) \in$ SSD($\mathbb{R}^n$), $g(x_1) = f_R(x_1 + y_1, x_2, \ldots, x_n) \in$ SSD($\mathbb{R}^1$) for any fixed $x_2, \ldots, x_n$. But $f_R((1, 0, \ldots, 0)) = f_R((-1, 0, \ldots, 0))$, because $|x| = 1$ at both points and $\varphi$ is even. Therefore $y_1 = 0$ also, and $y = 0$. This means that $\varphi([\ ]/[\ ])$ is spherically symmetric in $x$ for all $\theta$ and I claim that $\varphi$ must then be a constant. To see this, let $u_+, u_- \in [-1, 1]$ and let $x_\pm = (\pm b, 0, \ldots, 0)$. Since $f_R(x_+) = f_R(x_-)$, we shall have $\varphi(u_+) = \varphi(u_-)$ if we can find $b$ and $\theta$ such that $u_\pm = [(1 - b^2)\cos\theta \pm 2b \sin\theta]/[1 + b^2]$. Let $b = -\tan(\psi/2)$. These equations then read $u_\pm = \cos(\psi \pm \theta)$, and we see that a solution is trivial.
It should be noted that the proof above used the strong rearrangement inequality (Lemma 2.1) and Lemma 2.2 twice: once to show that any maximizer, $f$, is in TSSD and once to show that $f(1/r) = r^{2\mu} f(r)$. For the latter, the formulation in (2.17) was essential.

The following is another way to conclude the proof of Theorem 3.1. It uses the rearrangement inequality on $S^n$ of Baernstein and Taylor [3] which is a generalization of the inequality on $S^1$ of Friedberg and Luttinger [13]. The inequality is:

(3.31) $\iint F(\Omega) K(\Omega \cdot \Omega') G(\Omega')\, d\Omega\, d\Omega' \le \iint F^*(\Omega) K(\Omega \cdot \Omega') G^*(\Omega')\, d\Omega\, d\Omega',$

where $K : [-1, 1] \to \mathbb{R}$ is non-decreasing, and where $F^*$ is equimeasurable with $F$, $F^*(\rho, \xi)$ depends only on $\xi$ and is non-increasing in $\xi$ (and likewise for $G^*$). Unfortunately, Baernstein and Taylor do not prove a strong inequality analogous to Lemma 2.1(ii), which, it may be conjectured, exists. If it did hold, then the proof could be simplified. In our case $K(\Omega \cdot \Omega') = |\Omega - \Omega'|^{-\lambda}$ which is strictly increasing. Let $F(\Omega) = \varphi(\xi)$ be the maximizer that satisfies $\varphi(\xi) = \varphi(-\xi)$. If the strong version of (3.31) held, we could immediately conclude that $\varphi$ is either non-increasing or non-decreasing, in which case $\varphi = $ constant and the proof would be finished. In the absence of this fact let $F^*(\Omega) = \psi(\xi)$ with $\psi(\xi)$ non-increasing. By (3.31), $F^*$ also maximizes. Using (3.20), $F^* \leftrightarrow h$ with $h(x) = h(|x|)$. By the strong rearrangement inequality on $\mathbb{R}^n$, for some $\gamma > 0$, $h_\gamma(x) = h(\gamma x)$ satisfies $h_\gamma(1/r) = r^{2\mu} h_\gamma(r)$. In general (without assuming any symmetry), if $F \leftrightarrow f$, then $F_\gamma \leftrightarrow f_\gamma$, where $F_\gamma(\rho, \xi) = F(2\gamma\rho/w, v/w)$ with $w = 1 + \xi + \gamma^2(1 - \xi)$ and $v = 1 + \xi - \gamma^2(1 - \xi)$. In our case $(F^*)_\gamma(\Omega)$ depends on $\xi$ through $\psi(v/w)$. Setting $\xi = 1$ we conclude that $\psi(1) = \psi(-1)$. Since $\psi$ is non-increasing, $\psi = $ constant and the proof is completed.

IV. The Sobolev inequality
As another application of the method of Section II, we shall prove here the existence of a maximizing function $f$ on $\mathbb{R}^n$ for the sharp constant in the Sobolev inequality

(4.1) $K_n \|\nabla f\|_2 \ge \|f\|_{2^*}, \quad 2^* = 2n/(n - 2), \quad n \ge 3.$
I thank H. Brezis and P. L. Lions for suggesting to me that there should be a simple, direct "rearrangement inequality proof" of the existence of this $f$. Existence proofs already exist, of course (see [2], [31] and also [25]). (A generalization of (4.1) using Lorentz space norms was given by Alvino [1].) What is offered below, it is hoped, is a more direct and simpler argument. A generalization of (4.1), useful in the theory of the Schrödinger equation, was given in [14] for $n = 3$:

(4.2) $K_{n,p} \|\nabla f\|_2 \ge \| |x|^{-b} f \|_p, \quad n \ge 3,$

for $0 \le b < 1$ and $p = 2n/(2b + n - 2)$. In [14], an interesting extension of (4.2) is also given. If $1 - n/2 < b < 0$, no inequality of this type is possible for all $f$. But if $f$ is restricted to be spherically symmetric (not necessarily symmetric decreasing), then a bound as in (4.2) holds and there is also a maximizing $f$. This is also given in Theorem 4.3. Generalizations of (4.2) can be found in [12] and [21]. Flett [12] gives [21], Theorem 6.5.8, as the earliest reference to (4.2). Glaser, Martin, Grosse and Thirring [14] were unaware of this, but seem to have been the first to compute the sharp constant in (4.2). Generalizations of (4.1) can be found in [32], [33]. See also [5], [16].
It will be recalled that the rearrangement inequality, Lemma 2.1, was used twice in the proof of Theorem 2.3. First, it was used to show that a maximizing sequence could be sought in the SD(R") category. Second, when p' = q, it was used in the one-dimensional formulation (2.11)-(2.17) to deduce (2.6). The dual usage will also be needed here because we are faced with the same problem as that outlined in the beginning of Section II: The variational problem posed by (4.1) and (4.2) is invariant under the same conformal group (including inversion).
We begin with the fact that for $f \in W^{1,p}(\mathbb{R}^n) = \{f \mid f \text{ and } \nabla f \in L^p(\mathbb{R}^n)\}$, $\|\nabla f\|_p \ge \|\nabla f^*\|_p$, where $f^*$ is the symmetric decreasing rearrangement of $|f|$. This fact has been known for a long time (see [10], [23], for example), but all proofs of it seem to be complicated. There is one case, suitable for our purposes here, in which the following simple proof can be given [20], and it would be desirable to be able to extend this argument to the $W^{1,p}$ case, $p \ne 2$. It would also be desirable to have a strict inequality as in Lemma 2.1(ii).

LEMMA 4.1. Let $f \in W^{1,2}(\mathbb{R}^n) = H^1(\mathbb{R}^n)$. Then $f^* \in H^1(\mathbb{R}^n)$ and $\|\nabla f\|_2 \ge \|\nabla f^*\|_2$.
Proof. Let t > 0 and $g_t(x - y) = e^{t\Delta}(x, y)$ be the kernel of the heat equation semigroup. Let $f \in L^2$. It is easy to see that if $A(f, t) \equiv (1/t)[\|f\|_2^2 - (f, g_t * f)]$, then $\lim_{t \to 0} A(f, t) = \|\nabla f\|_2^2$ or $+\infty$ according as $\nabla f \in L^2$ or not. (See [20] for a proof of this fact.) For each t > 0, $g_t(\cdot)$ is a Gaussian and hence in SSD($\mathbb{R}^n$). Therefore, $(f, g_t * f) \le (f^*, g_t * f^*)$ by Lemma 2.1. Since $\|f^*\|_2 = \|f\|_2$, the lemma is proved.
With this preparation we can now prove the following about $\mathbb{R}^1$:

THEOREM 4.2. Let $F \in H^1(\mathbb{R}^1)$ and $2 < p < \infty$. For $F \ne 0$ let

$T(F) = \|F\|_p^2 / \{\|F'\|_2^2 + \|F\|_2^2\} = (\|F\|_p / \|F\|_{H^1})^2$, (4.3)

$M_p = \sup\{T(F) \mid F \in H^1, F \ne 0\}$. (4.4)

Then $M_p$ is finite and there exists a maximizing $F \in SD \cap H^1$, i.e., $T(F) = M_p$. This F is unique (up to a constant and to translation). With $r = 2/(p - 2)$,

$F(x) = (\text{const.})\,(\cosh(x/r))^{-r}$, (4.5)

$M_p = \{(2r + 1)\Gamma(2r) / r\Gamma(r)^2\}^{1 - 2/p}\,(r/4)^{2/p}\,(r + 1)^{-1}$. (4.6)
Proof. By Lemma 4.1, $T(F^*) \ge T(F)$, so henceforth we can restrict attention to $F \in SD$. Then $F \in L^\infty$ since $F(x) \to 0$ as $x \to -\infty$ and $F(x)^2 = 2\int_{-\infty}^x F'(y)F(y)\,dy \le 2\|F'\|_2 \|F\|_2$. Let $\{F_j\}$ be a maximizing sequence for T; we can assume $\|F_j'\|_2^2 + \|F_j\|_2^2 = 1$. By (2.9), $F_j(x) \le C|x|^{-1/2}$. By the $L^\infty$ bound just given, $F_j(x) \le C$. Therefore, $F_j(x) \le h(x) \equiv \min(C, C|x|^{-1/2}) \in L^p$ since p > 2. As in Theorem 2.3, we can assume $F_j \to F \in SD$ pointwise. We can also assume (Banach-Alaoglu theorem) that $F_j' \to G'$ and $F_j \to G$ weakly in $L^2$. Clearly, G = F. Then

$1 = \|F_j'\|_2^2 + \|F_j\|_2^2$ for all j, and, by weak convergence, $1 \ge \|F'\|_2^2 + \|F\|_2^2$.

It remains to show that $M_p = \lim \|F_j\|_p^2 = \|F\|_p^2$, which will also prove the crucial fact that $F \ne 0$. This follows by dominated convergence since $F_j(x) \le h(x)$.

This maximizing F can easily be found as follows: By letting $F \to F + \varepsilon\eta$, $\eta \in C_0^\infty$, and equating the derivative at $\varepsilon = 0$ to zero,

$F'' = F - F^{p-1}/M_p$ (4.7)

in the distributional sense. By standard ODE methods, there is only one solution to (4.7) that vanishes as $|x| \to \infty$. (Recall that $F(x) = F(-x)$ and $\|F\|_p = 1$.) This solution is (4.5), (4.6).
It should be noted that the last step, the calculation of F and $M_p$, was very easy compared to the proof of Theorem 3.1. Here, it is easy to verify that (4.5) is the (essentially) unique positive solution to (4.7). In Theorem 3.1, on the other hand, it was difficult to verify that (3.1) is the desired maximizing solution to (3.9); the apparatus of stereographic projection had to be used.
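The closed forms in Theorem 4.2 lend themselves to a quick numerical sanity check. The following Python sketch uses only the standard library; the helper names (`simpson`, `profile`, `ratio_T`, `closed_form_M`, `ode_residual`) are illustrative and do not come from the paper. Under the assumption that (4.6) reads $M_p = \{(2r+1)\Gamma(2r)/r\Gamma(r)^2\}^{1-2/p}(r/4)^{2/p}(r+1)^{-1}$, it evaluates T(F) by quadrature for $F(x) = \cosh(x/r)^{-r}$ on a truncated interval, compares the result with that closed form, and checks by finite differences that this unnormalized F satisfies $F'' = F - ((r+1)/r)F^{p-1}$, which is (4.7) up to the normalization of F.

```python
import math

def simpson(func, a, b, n=8000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def profile(x, p):
    """Candidate maximizer F(x) = cosh(x/r)^(-r), with r = 2/(p-2)."""
    r = 2.0 / (p - 2.0)
    return math.cosh(x / r) ** (-r)

def ratio_T(p, half_width=40.0):
    """T(F) = ||F||_p^2 / (||F'||_2^2 + ||F||_2^2), by quadrature on a truncated line."""
    r = 2.0 / (p - 2.0)
    F = lambda x: profile(x, p)
    dF = lambda x: -profile(x, p) * math.tanh(x / r)  # exact derivative of F
    norm_p = simpson(lambda x: F(x) ** p, -half_width, half_width) ** (1.0 / p)
    h1_sq = simpson(lambda x: dF(x) ** 2 + F(x) ** 2, -half_width, half_width)
    return norm_p ** 2 / h1_sq

def closed_form_M(p):
    """Right side of (4.6), in the reading assumed in the lead-in."""
    r = 2.0 / (p - 2.0)
    brace = (2 * r + 1) * math.gamma(2 * r) / (r * math.gamma(r) ** 2)
    return brace ** (1 - 2.0 / p) * (r / 4) ** (2.0 / p) / (r + 1)

def ode_residual(x, p, step=1e-3):
    """F'' - F + ((r+1)/r) F^(p-1), with F'' from a central difference."""
    r = 2.0 / (p - 2.0)
    F = lambda s: profile(s, p)
    second = (F(x + step) - 2 * F(x) + F(x - step)) / step ** 2
    return second - F(x) + (r + 1) / r * F(x) ** (p - 1)

if __name__ == "__main__":
    for p in (3.0, 4.0, 6.0):
        print(p, ratio_T(p), closed_form_M(p), ode_residual(0.7, p))
```

For p = 4 (r = 1) the profile is sech x and both computations should agree with $\sqrt{3}/4$, which can also be verified by hand from $\int \mathrm{sech}^4 = 4/3$, $\int \mathrm{sech}^2 = 2$ and $\int \mathrm{sech}^2\tanh^2 = 2/3$.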
Next we turn to the problem posed by (4.1) and (4.2).

THEOREM 4.3. Let $n \ge 3$ and let $f \in H^1(\mathbb{R}^n)$. Let $1 - n/2 < b < 1$ and $p = 2n/(2b + n - 2) > 2$. Let

$R(f) = \| |x|^{-b} f \|_p / \|\nabla f\|_2$, $f \ne 0$, (4.8)

$K_{n,p} = \sup\{R(f) \mid f \in H^1, f \ne 0\}$. (4.9)

(i) If $1 > b \ge 0$, $K_{n,p}$ is finite and a maximizing $f \in SD$ exists, i.e., $R(f) = K_{n,p}$:

$f(x) = \{1 + |x|^{2t/r}\}^{-r}$, (4.10)

$K_{n,p} = a_n^{-1/2 + 1/p}\, t^{-1/2 - 1/p}\, M_p^{1/2}$, (4.11)

with $r = 2/(p - 2)$, $t = -1 + n/2$, $M_p$ in (4.6), $a_n$ in (2.13). $K_{n,p} = [\pi n(n - 2)]^{-1/2} [\Gamma(n)/\Gamma(n/2)]^{1/n}$ when b = 0.
(ii) If $1 - n/2 < b < 0$, R(f) is unbounded on $H^1$, but R(f) restricted to spherically symmetric functions (not necessarily decreasing) in $H^1$ (denoted by $H_R^1$) is bounded. If $K_{n,p}^R = \sup\{R(f) \mid f \in H_R^1, f \ne 0\}$ then there is a maximizing $f \in SD$, $R(f) = K_{n,p}^R$, given by (4.10) and $K_{n,p}^R$ is given by the right side of (4.11).
Note. When n > 4, the f in (4.10) is in $H^1(\mathbb{R}^n)$. When n = 3 or 4, this $f \notin H^1(\mathbb{R}^n)$ but R(f) is well defined.

Proof. (i) Since $|x|^{-b} \in SD$, we have that $\| |x|^{-b} f \|_p \le \| |x|^{-b} f^* \|_p$. By Lemma 4.1, $\|\nabla f\|_2 \ge \|\nabla f^*\|_2$. Thus, we can henceforth restrict our attention to $f \in SD$. As in (2.11), let $F: \mathbb{R} \to \mathbb{R}$ be given by

$F(tu) = e^{tu} f(e^u)$ (4.12)

with $t \equiv -1 + n/2 > 0$. Then

$(a_n/t)^{1/p} \|F\|_p = \| |x|^{-b} f \|_p$, (4.13)

$(a_n t)^{1/2} \|F' - F\|_2 = \|\nabla f\|_2$, (4.14)

where $a_n$ is given by (2.13). Since $f \in L^2$, as in (2.9) we have $F(u) \le C e^{-u/t}$. Now assume $f \in L^\infty$ in addition to $f \in H^1 \cap SD$, whence $F(u) \le C\exp(-|u|/2t)$. Then $\int F'F = 0$, and thus

$R(f)^2 = a_n^{-1 + 2/p}\, t^{-1 - 2/p}\, T(F)$, (4.15)

with T(F) given by (4.3). Since $\|\nabla f\|_2 < \infty$, $F \in H^1(\mathbb{R}^1)$. Thus, for $f \in L^\infty$, Theorem 4.2 completes the proof. (Note that (4.5) and (4.12) are consistent with $f \in L^\infty$.) For $f \notin L^\infty$ we use the fact that $L^\infty \cap H^1$ is dense in $H^1$. Thus, there exists a sequence $\{g_n\}$ in $L^\infty \cap H^1$ such that $\|\nabla g_n\|_2 \to \|\nabla f\|_2$ and $\|g_n\|_2 \to \|f\|_2$. By passing to a subsequence, $g_n \to f$ pointwise almost everywhere and hence, $\| |x|^{-b} f \|_p \le \liminf \| |x|^{-b} g_n \|_p$. Therefore, $R(f) \le \sup\{R(f) \mid f \in L^\infty \cap H^1, f \ne 0\}$.

(ii) If b < 0 we cannot say that we can restrict our attention to $f \in SD$. But for $f \in H_R^1(\mathbb{R}^n)$ we can make the same change of variables as in (4.12)-(4.14). The proof proceeds essentially as before. □
It is worth remarking about Theorem 4.3 as $b \to 1$ and $p \to 2$. From (4.11) we are led to believe that $K_{n,2}$ is finite, but that there is no maximizing f since (4.10) tends to unity as $p \to 2$. This is indeed correct (see [19, Lemma 2.7] where the authors attribute the result to Karlsson [18] and to Herbst [17]), and $K_{n,2}$ is given by the limit of (4.11) as $p \to 2$, namely

$(-1 + n/2)\, \| |x|^{-1} f \|_2 \le \|\nabla f\|_2$, $n \ge 3$, (4.16)

and this constant is the best possible.
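Inequality (4.16) can be illustrated numerically for radial trial functions. The sketch below is an illustration only, not part of the paper: the helper names `simpson` and `hardy_quotient` and the choice of trial function $f(r) = e^{-r}$ are assumptions made for this example. For radial f the angular factor cancels, so the quotient $\|\nabla f\|_2^2 / \| |x|^{-1} f \|_2^2$ reduces to a ratio of one-dimensional integrals, and (4.16) says that this quotient is at least $t^2 = (n/2 - 1)^2$.

```python
import math

def simpson(func, a, b, n=6000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def hardy_quotient(f, df, n_dim, r_max=60.0):
    """||grad f||_2^2 / || |x|^{-1} f ||_2^2 for a radial function f(r) in R^n.

    The surface area of the unit sphere cancels in the ratio, leaving
    int f'(r)^2 r^(n-1) dr  /  int f(r)^2 r^(n-3) dr.
    """
    grad_sq = simpson(lambda r: df(r) ** 2 * r ** (n_dim - 1), 0.0, r_max)
    weighted_sq = simpson(lambda r: f(r) ** 2 * r ** (n_dim - 3), 0.0, r_max)
    return grad_sq / weighted_sq

if __name__ == "__main__":
    f = lambda r: math.exp(-r)      # hypothetical trial function
    df = lambda r: -math.exp(-r)    # its radial derivative
    for n_dim in (3, 4, 5):
        t = n_dim / 2 - 1
        print(n_dim, hardy_quotient(f, df, n_dim), t ** 2)
```

For this trial function the integrals are Gamma functions and the quotient equals $(n-1)(n-2)/4$, which exceeds $t^2 = (n-2)^2/4$ for every $n \ge 3$. This is consistent with (4.16) and with the statement that the constant is not attained.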
Another remark concerns the relation of Theorem 4.3 with b = 0 and Corollary 3.2. With $p = 2n/(n - 2)$, $\|f\|_p \le K_{n,p}\|\nabla f\|_2 = K_{n,p}\|(-\Delta)^{1/2} f\|_2$. Formally, this is equivalent to $\|(-\Delta)^{-1/2} g\|_p \le K_{n,p}\|g\|_2$. But

$2\pi^{(n+1)/2} (-\Delta)^{-1/2} g = \Gamma(n/2 - 1/2)\, |x|^{-\lambda} * g$

with $\lambda = n - 1$, [30, p. 117]. Thus, we should have

$2\pi^{(n+1)/2} K_{n,2n/(n-2)} = \Gamma(n/2 - 1/2)\, N_{2,n-1,n}$, (4.17)

which is confirmed by (3.4) and (4.11).
V. Doubly weighted HLS inequality and weighted Young inequality

Two more illustrations will be given of the use of the methods of Section II. The first is the doubly weighted Hardy-Littlewood-Sobolev inequality [15], [29] which generalizes the HLS inequality considered before.

THEOREM 5.1. Let $0 < \lambda < n$, $1 < p \le q < \infty$, $0 \le \alpha < n/p'$ (with $1/p + 1/p' = 1$), $0 \le \beta < n/q$, $1/p + (\lambda + \alpha + \beta)/n = 1 + 1/q$. Let

$V(x, y) = |x|^{-\beta}\, |x - y|^{-\lambda}\, |y|^{-\alpha}$ (5.1)

be an integral kernel on $\mathbb{R}^n$. Then $f \to Vf$, $(Vf)(x) = \int V(x, y) f(y)\,dy$, is a bounded map from $L^p(\mathbb{R}^n)$ to $L^q(\mathbb{R}^n)$. Moreover, if p < q,

$R(f) = \|Vf\|_q / \|f\|_p$, $f \ne 0$, (5.2)

and

$P_{\alpha,\beta,p,\lambda,n} = \sup\{R(f) \mid f \in L^p(\mathbb{R}^n), f \ne 0\}$ (5.3)

then there is a maximizing $f \in SD \cap L^p$, i.e., $R(f) = P_{\alpha,\beta,p,\lambda,n}$.
Remarks. (i) In [29] the condition $0 \le \alpha, \beta$ is relaxed to $\alpha + \beta \ge 0$. However, the stronger condition is needed here in order to use rearrangement inequalities.

(ii) Obviously, (5.2), (5.3) are equivalent to

$P_{\alpha,\beta,p,\lambda,n} = \sup \iint g(x)\,|x - y|^{-\lambda} f(y)\,dx\,dy \,/\, \| |x|^{\alpha} f \|_p\, \| |x|^{\beta} g \|_{q'}$. (5.4)

(iii) When p = q a maximizing f cannot be expected to exist. See the remark at the end of Section IV which corresponds to the case p = q = 2, $\lambda = n - 1$, $\beta = 1$, $\alpha = 0$, $n \ge 3$. See also [17].

An extension of Lemma 2.4 is needed.
LEMMA 5.2. Let the hypothesis be the same as in Theorem 5.1 except that the condition $0 \le \alpha, \beta$ is eliminated. Let $f \in L^p(\mathbb{R}^n)$ be spherically symmetric and $|f(r)| \le \varepsilon r^{-n/p}$ for all r > 0. Then $\|Vf\|_q \le C_n \|f\|_p^{p/q}\, \varepsilon^{1 - p/q}$ for some $C_n$ independent of f and $\varepsilon$.

Proof. This is the same as the proof of Lemma 2.4 except that (2.15) changes to

$L_n^{\alpha,\beta}(u) = 2^{-\lambda/2} \exp\{u(n/q - \lambda/2 - \beta)\}\, Z_n(u)$. (5.5)

The hypothesis guarantees that $|n/q - \lambda/2 - \beta| < \lambda/2$ so that $L_n^{\alpha,\beta} \in L^1(\mathbb{R})$.

Proof of Theorem 5.1. Since $|x|^{-\alpha}$ and $|x|^{-\beta} \in SD$ (here we use the fact that $\alpha, \beta \ge 0$) the generalization of the Riesz inequality given in [7] implies that a maximizing sequence $\{f_j\}$ can be taken in SD. Lemma 5.2 implies that R(f) is bounded if we take $\|f\|_p = 1$ so that $f(r) \le C r^{-n/p}$ as in (2.9). As in (2.10), we can assume $f_j(r) \to f(r) \le C r^{-n/p}$ almost everywhere. If q > p we can use Lemma 5.2 to dilate each $f_j$ so that $f \ne 0$ (see the remarks after the proof of Theorem 2.3(iv)). The final step is as in the conclusion of the proof of Theorem 2.3(i), using Lemma 2.7.
The second illustration is what A. Sokal has called the weighted Young inequality. Let $f: \mathbb{R}^n \to \mathbb{R}^+$ and let

$f^{(m)} = f * f * \cdots * f$ (m factors) (5.6)

be the convolution of f with itself m times. We consider $m \ge 3$. Now $f^{(m)}(0)$ makes sense, even if f is defined only almost everywhere, because

$f^{(m)}(0) = \int f(-x_{m-1})\, f(x_{m-1} - x_{m-2}) \cdots f(x_2 - x_1)\, f(x_1)\, dx_1 \cdots dx_{m-1}$. (5.7)

Let p and $\gamma$ satisfy the conditions

$m/(m - 1) \le p \le m$ and $\gamma/n + 1/p = (m - 1)/m$, $\gamma \ge 0$. (5.8)

Our interest will be in the ratio

$R(f) = f^{(m)}(0) / \| |x|^{\gamma} f \|_p^m$, $f \ne 0$, (5.9)

$Q_{p,m,n} = \sup\{R(f) \mid |x|^{\gamma} f \in L^p(\mathbb{R}^n), f \ne 0\}$. (5.10)
By the generalization of the Riesz rearrangement inequality in [7], $f^{(m)}(x) \le (f^*)^{(m)}(0)$ and $\| |x|^{\gamma} f^* \|_p \le \| |x|^{\gamma} f \|_p$. Thus,

$Q_{p,m,n} = \sup\{ \|f^{(m)}\|_\infty / \| |x|^{\gamma} f \|_p^m \mid |x|^{\gamma} f \in L^p(\mathbb{R}^n), f \ne 0\}$. (5.11)
The idea that R(f) should be bounded was suggested to me by A. Sokal. Initially, he was interested in the case p = 2, m = 4, $\gamma = n/4$ for use in a problem in quantum field theory [27]. As will shortly be seen, the p = 2 case reduces to the HLS inequality itself (with p = 2). But that is a case for which the sharp constant was derived in Section III, and thus we shall be able to compute Q when p = 2. Another case for which Q can be found is $p = m/(m - 1)$ and this is given in (5.12).

THEOREM 5.3. Assuming (5.8), $Q_{p,m,n} < \infty$ for all p, n and $m \ge 3$. Moreover, if $m/(m - 1) < p < m$, there is a maximizing function, $f \in SD$, i.e., $R(f) = Q_{p,m,n}$.

Remarks. (i) When $\gamma = 0$, $p = m/(m - 1)$, (5.9) is one of the generalized Young inequalities treated in [6]. The ordinary Young inequality shows that R(f) is bounded. In [6] it was shown that a maximizing f exists and that it is a Gaussian, $f(x) = \exp(-|x|^2)$. Then $Q_{p,m,n}$ can be easily computed in this case:

$Q_{p,m,n} = (p^{m-1}/m)^{n/2}$, $p = m/(m - 1)$. (5.12)

An alternative derivation [6] of (5.12) can be obtained from the sharp constants in the ordinary Young inequality, which was also derived in [4]. If that inequality is iterated (m - 2) times, one obtains $Q_{p,m,n} \le$ (right side of (5.12)). However, the explicit choice of a Gaussian for f gives a lower bound which establishes (5.12).

(ii) A generalization of Theorem 5.3 (at least the first part) is obviously possible, namely

$(f_1 * f_2 * \cdots * f_m)(0) \le C \prod_{j=1}^{m} \| |x|^{\gamma_j} f_j \|_{p_j}$

with $\sum_j (\gamma_j + n/p_j) = n(m - 1)$ and $\gamma_j \ge 0$, $p_j < m$, for all j. This can easily be proved by imitating the following proof.
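The Gaussian case (5.12) of Remark (i) can be checked directly in the simplest setting n = 1, m = 3, p = 3/2, $\gamma = 0$. The sketch below is an illustration under stated assumptions: it reads (5.9) as $R(f) = f^{(3)}(0)/\|f\|_p^3$ and (5.12) as $Q_{p,m,n} = (p^{m-1}/m)^{n/2}$, and the helper names (`triple_convolution_at_zero`, `lp_norm`, `young_ratio`, `sharp_constant`) are hypothetical. It evaluates the double integral (5.7) by the midpoint rule on a truncated square.

```python
import math

def triple_convolution_at_zero(f, half_width=10.0, n=400):
    """Midpoint-rule value of (f*f*f)(0) = double integral of f(-x2) f(x2-x1) f(x1)."""
    h = 2 * half_width / n
    pts = [-half_width + (i + 0.5) * h for i in range(n)]
    reflected = [f(-x2) for x2 in pts]  # f(-x2) does not depend on x1
    total = 0.0
    for x1 in pts:
        fx1 = f(x1)
        for x2, fr in zip(pts, reflected):
            total += fr * f(x2 - x1) * fx1
    return total * h * h

def lp_norm(f, p, half_width=10.0, n=400):
    """Midpoint-rule approximation of ||f||_p on the truncated line."""
    h = 2 * half_width / n
    s = sum(f(-half_width + (i + 0.5) * h) ** p for i in range(n))
    return (s * h) ** (1.0 / p)

def young_ratio(f, p=1.5):
    """R(f) of (5.9) for n = 1, m = 3, gamma = 0."""
    return triple_convolution_at_zero(f) / lp_norm(f, p) ** 3

def sharp_constant(p, m, n_dim):
    """Right side of (5.12), valid for p = m/(m-1)."""
    return (p ** (m - 1) / m) ** (n_dim / 2)

if __name__ == "__main__":
    gaussian = lambda x: math.exp(-x * x)
    two_sided_exp = lambda x: math.exp(-abs(x))  # a non-Gaussian comparison function
    print(young_ratio(gaussian), sharp_constant(1.5, 3, 1))
    print(young_ratio(two_sided_exp))
```

The Gaussian should reproduce $(p^2/3)^{1/2} = \sqrt{3}/2$ for p = 3/2, while the comparison function $e^{-|x|}$, for which the exact ratio is $1.5/(4/3)^2 = 0.84375$, should stay strictly below it.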
Proof. Let $\{f_j\}$ be a maximizing sequence and let $g_j(x) = |x|^{\gamma} f_j(x)$. The denominator in $R(f_j)$ is $\|g_j\|_p^m$. Then $f_j^{(m)}(0)$ is an integral over a product of $g_j(x_i - x_k)$ and $|x_i - x_k|^{-\gamma}$ factors. By the general rearrangement inequality in [7], the numerator does not decrease if we replace $g_j$ by $g_j^*$. Henceforth, assume $g_j = g_j^*$ and $\|g_j\|_p = 1$. As in (2.10), we can assume $g_j(r) \to g(r) \le C r^{-n/p}$ everywhere and $g_j(r) \le C r^{-n/p}$. Let $g_j(r) = r^{-n/p} h_j(r)$, so that $\|h_j\|_\infty \le C$ and $A_j \equiv \int h_j(x)^p |x|^{-n}\,dx \le C$.

First we show that R(f) is bounded. Substitute $f_j(x) = h_j(x)|x|^{-\gamma - n/p}$ in (5.7). By Hölder's inequality, $f_j^{(m)}(0) \le \prod_k I_k(h_j)^{1/m}$ where $I_k(h_j)$ is (5.7) with $h_j(x)^m |x|^{-\gamma - n/p}$ in the kth position and $|x|^{-\gamma - n/p}$ in the other (m - 1) positions. It is easy to do the trivial integrals and one finds that all $I_k$ have the common value $C\int h_j(x)^m |x|^{-n}\,dx \le C A_j \|h_j\|_\infty^{m-p}$. This shows not only that $f_j^{(m)}(0)$ is bounded but it also shows that when p < m, $\|h_j\|_\infty$ cannot go to zero as $j \to \infty$. Therefore there are dilations of $g_j$ so that $g \ne 0$ (see the remarks after the proof of Theorem 2.3(iv)).

It remains to show that $f(r) = r^{-\gamma} g(r)$ maximizes. Write $f_j = f + b_j$, with $b_j \to 0$ almost everywhere. I claim that

$f_j^{(m)}(0) = f^{(m)}(0) + b_j^{(m)}(0) + o(1)$ (5.13)

when p < m. This will complete the proof (using Lemma 2.6) by the strategy of Lemma 2.7. One merely sets p/q = p/m in the last part of the proof of Lemma 2.7.

To prove (5.13), we have to show that when $f + b_j$ is inserted in (5.7) and expanded out into $2^m$ terms, those terms that contain at least one f and one $b_j$ factor vanish as $j \to \infty$. Write $f = r^{-(\gamma + n/p)}\varphi$ and $b_j = r^{-(\gamma + n/p)}\beta_j$. We shall use a Hölder inequality as in the proof that $f^{(m)}(0)$ is bounded, but with a slight change. Consider a term, I, in $f_j^{(m)}(0)$ that has $\varphi$ $m_1$ times and $\beta_j$ $m_2$ times with $m_1 + m_2 = m$ and $0 < m_1 < m$. All orderings of these functions give the same integral (by changing variables). Let $a = (m - p)/(m - 1) > 0$. Then $I \le I_1^{m_1/m} I_2^{m_2/m}$ where $I_1$ has $\varphi^p$ once, $\varphi^a$ $(m_1 - 1)$ times and $\beta_j^a$ $m_2$ times. $I_2$ has $\varphi^a$ $m_1$ times, $\beta_j^p$ once and $\beta_j^a$ $(m_2 - 1)$ times.

First consider $I_1$. We know that $\varphi$ and $\beta_j$ are bounded by a constant C (since $f_j r^{n/p + \gamma}$ is so bounded). Suppose the integration variable of $\varphi^p$ is $x_1$. Then do all the other (m - 2) integrations and call the result $z_j(x_1)$. If we replace all the other (m - 1) functions by C, the (m - 2) integrals are finite for all $x_1 \ne 0$, namely $|x_1|^{-n}$. Therefore, by dominated convergence, $z_j(x_1) \to 0$ as $j \to \infty$ for every $x_1 \ne 0$. (Note: It is important that there is at least one factor of $\beta_j$ and that a > 0.) Furthermore, $z_j(x_1)$ has the form $|x_1|^{-n} w_j(x_1)$ with $w_j(x_1)$ uniformly bounded (in $x_1$ and j). Thus, the final integral is $I_1 = \int \varphi(x_1)^p |x_1|^{-n} w_j(x_1)\,dx_1$. Since $|x|^{-n}\varphi(x)^p \in L^1$ and $w_j(x) \le C$, $I_1 \to 0$ as $j \to \infty$ by dominated convergence. $I_2$ is uniformly bounded since $I_2 \le C\int \beta_j(x)^p |x|^{-n}\,dx \le C\|g_j\|_p^p = C$. Therefore $I \to 0$ as $j \to \infty$ and (5.13) is proved. □
The value of $Q_{p,m,n}$ has already been given in (5.12) when $p = m/(m - 1)$. Let us conclude by evaluating Q when p = 2. The function $g(x) = |x|^{\gamma} f(x)$ is in $L^2$. Let G be the Fourier transform of g. Then $\|g\|_2 = (2\pi)^{-n/2}\|G\|_2$. The Fourier transform of $|x|^{-\gamma}$ is $w_\gamma(k)$ in (3.8) and has the form $E_{\gamma,n}|k|^{\gamma - n}$. Thus, the Fourier transform of f is $F = (2\pi)^{-n} E_{\gamma,n}|k|^{\gamma - n} * G$ and $f^{(m)}(0) = (2\pi)^{-n}\int F(k)^m\,dk$. Hence

$R(f)^{1/m} = (2\pi)^{-n(1/2 + 1/m)} E_{\gamma,n} \left\{\int [\,|k|^{\gamma - n} * G\,]^m\,dk\right\}^{1/m} / \|G\|_2$. (5.14)

Comparing this with (2.1) (with p = 2, q = m, $\lambda = n - \gamma$ and $f \to G$), we see that apart from a constant, the two expressions are almost the same. The one difference is that $\| |k|^{-\lambda} * G \|_m$ is replaced by the integral in (5.14). However, the maximizing f for (2.1) is non-negative by Corollary 3.2(ii). It can be used as G in (5.14) and the two expressions are then the same. The maximizing G is thus

$G = |k|^{\gamma - n} * (1 + |k|^2)^{-\gamma - n/2}$. (5.15)

This is unique (up to dilations, etc.).

This paper is thus brought full circle by the identification of the HLS problem and Theorem 5.3 for p = 2. The maximizing f for (5.9) is unique (up to dilations, etc.) and is

$f(x) = |x|^{-\gamma} K_\gamma(|x|)$, p = 2, (5.16)

where $K_\gamma$ is a Bessel function (see (3.11)). Equation (5.16) should be compared with the $p = m/(m - 1)$ case in which f is a Gaussian (see (5.12)). Q can be computed from (5.14), (3.4) and (3.8).

$Q_{2,m,n} = \pi^{n(m-2)/4} \left[\dfrac{\Gamma(n/m)}{\Gamma(n - n/m)}\right]^{m/2} \left[\dfrac{\Gamma(n)}{\Gamma(n/2)}\right]^{m/2 - 1}$. (5.17)
THE INSTITUTE FOR ADVANCED STUDY, PRINCETON, NJ
PERMANENT ADDRESS: PRINCETON UNIVERSITY, PRINCETON, NJ (MATHEMATICS & PHYSICS DEPARTMENTS)

REFERENCES

[1] A. ALVINO, Sulla diseguaglianza di Sobolev in spazi di Lorentz, Boll. Unione Mat. Ital. 14A (1977), 148-156.
[2] T. AUBIN, Problèmes isopérimétriques et espaces de Sobolev, Compt. Rend. Acad. Sci. Paris 280A (1975), 279-281. See also J. Diff. Geom. 11 (1976), 573-598.
[3] A. BAERNSTEIN II and B. A. TAYLOR, Spherical rearrangements, sub-harmonic functions and *-functions in n-space, Duke Math. J. 43 (1976), 245-268.
[4] W. BECKNER, Inequalities in Fourier analysis, Ann. of Math. 102 (1975), 159-182.
[5] G. A. BLISS, An integral inequality, J. London Math. Soc. 5 (1930), 40-46.
[6] H. J. BRASCAMP and E. H. LIEB, Best constants in Young's inequality, its converse, and its generalization to more than three functions, Adv. in Math. 20 (1976), 151-173.
[7] H. J. BRASCAMP, E. H. LIEB and J. M. LUTTINGER, A general rearrangement inequality for multiple integrals, J. Funct. Anal. 17 (1974), 227-237.
[8] H. BREZIS and E. H. LIEB, A relation between pointwise convergence of functions and convergence of functionals, Proc. A. M. S. 88 (1983), 486-490.
[9] H. BREZIS and S. WAINGER, A note on limiting cases of Sobolev embeddings and convolution inequalities, Commun. Part. Diff. Eq. 5 (1980), 773-789.
[10] G. F. D. DUFF, A general integral inequality for the derivative of an equimeasurable rearrangement, Can. J. Math. 28 (1976), 793-804.
[11] N. DU PLESSIS, Some theorems about the Riesz fractional integral, Trans. A. M. S. 80 (1955), 124-134.
[12] T. M. FLETT, On a theorem of Pitt, J. London Math. Soc. 7 (1973), 376-384.
[13] R. FRIEDBERG and J. M. LUTTINGER, Rearrangement inequality for periodic functions, Arch. Rat. Mech. Anal. 61 (1976), 35-44.
[14] V. GLASER, A. MARTIN, H. GROSSE and W. THIRRING, A family of optimal conditions for the absence of bound states in a potential, in Studies in Mathematical Physics, E. H. Lieb, B. Simon and A. S. Wightman, eds., Princeton University Press (1976), 169-194.
[15] G. H. HARDY and J. E. LITTLEWOOD, Some properties of fractional integrals (1), Math. Zeitschr. 27 (1928), 565-606.
[16] G. H. HARDY and J. E. LITTLEWOOD, On certain inequalities connected with the calculus of variations, J. London Math. Soc. 5 (1930), 34-39.
[17] I. W. HERBST, Spectral theory of the operator $(p^2 + m^2)^{1/2} - Ze^2/r$, Commun. Math. Phys. 53 (1977), 285-294.
[18] B. KARLSSON, Self adjointness of Schroedinger operators, Inst. Mittag-Leffler Report no. 6 (1976).
[19] V. F. KOVALENKO, M. A. PERELMUTER and YA. A. SEMENOV, Schroedinger operators with $L_w^{l/2}(\mathbb{R}^l)$ potentials, J. Math. Phys. 22 (1981), 1033-1044.
[20] E. H. LIEB, Existence and uniqueness of the minimizing solution of Choquard's nonlinear equation, Studies in Appl. Math. 57 (1977), 93-105.
[21] G. O. OKIKIOLU, Aspects of the Theory of Bounded Integral Operators in $L^p$-Spaces, Academic Press, N.Y., 1971.
[22] R. O'NEIL, Convolution operators and L(p, q) spaces, Duke Math. J. 30 (1963), 129-142.
[23] G. PÓLYA and G. SZEGŐ, Isoperimetric Inequalities in Mathematical Physics, Princeton University Press, 1951.
[24] F. RIESZ, Sur une inégalité intégrale, J. London Math. Soc. 5 (1930), 162-168.
[25] G. ROSEN, Minimum value for c in the Sobolev inequality $\|\phi^3\| \le c\|\nabla\phi\|^3$, SIAM J. Appl. Math. 21 (1971), 30-32.
[26] S. L. SOBOLEV, On a theorem of functional analysis, Mat. Sb. (N.S.) 4 (1938), 471-479. A. M. S. Transl. Ser. 2, 34 (1963), 39-68.
[27] A. SOKAL, Improved upper bound for the renormalized four-point coupling, in preparation.
[28] E. M. STEIN and G. WEISS, Introduction to Fourier Analysis on Euclidean Spaces, Princeton University Press, 1971.
[29] E. M. STEIN and G. WEISS, Fractional integrals in n-dimensional Euclidean space, J. Math. Mech. 7 (1958), 503-514.
[30] E. M. STEIN, Singular Integrals and Differentiability Properties of Functions, Princeton University Press, 1970.
[31] G. TALENTI, Best constant in Sobolev inequality, Ann. di Matem. Pura ed Appl. 110 (1976), 353-372.
[32] M. WEINSTEIN, Nonlinear Schrödinger equations and sharp interpolation estimates, Comm. Math. Phys. 87 (1983), 567-576.
[33] F. WEISSLER, Logarithmic Sobolev inequalities for the heat-diffusion semigroup, Trans. A. M. S. 237 (1978), 255-269.

(Received January 31, 1983)
Annals of Math. 118, 349-374 (1983)
Inventiones mathematicae
Invent. math. 74, 441-448 (1983)
© Springer-Verlag 1983

On the lowest eigenvalue of the Laplacian for the intersection of two domains

Elliott H. Lieb*
Departments of Mathematics and Physics, Princeton University, P.O.B. 708, Princeton, NJ 08544, USA

Abstract. If A and B are two bounded domains in $\mathbb{R}^n$ and $\lambda(A)$, $\lambda(B)$ are the lowest eigenvalues of $-\Delta$ with Dirichlet boundary conditions, then there is some translate, $B_x$, of B such that $\lambda(A \cap B_x) < \lambda(A) + \lambda(B)$. There are two corollaries of this theorem: (i) A lower bound for $\sup_x$ (volume $(A \cap B_x)$) in terms of $\lambda(A)$, when B is a ball; (ii) A compactness lemma for certain sequences in $W^{1,p}(\mathbb{R}^n)$.

1. The main theorem
The chief purpose of this paper is to prove Theorem 1 which contains, as a corollary, the answer to a geometric question about domains in $\mathbb{R}^n$. Theorem 1 is generalized to Theorem 3 in Sect. 2, and this leads to the compactness lemma of Sect. 3, which was another motivation for Theorem 1.

Let us begin with a discussion of the geometric question. If A is an open set in $\mathbb{R}^n$ (bounded or unbounded), let $\lambda(A)$ denote the lowest eigenvalue of $-\Delta$ in A with Dirichlet boundary conditions. $\lambda(A) = \infty$ if A is empty. To be precise, $\lambda(A)$ is defined by (1.7), (1.8). Intuitively, if $\lambda(A)$ is small then A must be large in some sense. One well known result in this direction is the inequality of Faber [5] and Krahn [7] which states that among all domains with a given volume $|A|$, the ball has the smallest $\lambda$. Thus,

$\lambda(A) \ge \beta_n |A|^{-2/n}$ (1.1)

where $\beta_n$ is the lowest eigenvalue of a ball of unit volume.

Equation (1.1) clearly does not tell the whole story. If $\lambda(A)$ is small then A must not only have a large volume, it must also be "fat" in some sense. One might suppose that there is a constant $\alpha_n$ such that $\lambda(A) \ge \alpha_n R^{-2}$, where R is the radius of the largest ball contained in A. Unfortunately, this is not generally true. The situation is the following:

* Work partially supported by U.S. National Science Foundation grant PHY-8116101 A01. AMS(MOS) Classification: 35P15
n = 1: $\alpha_1 = \pi^2/4$.

n = 2: In general, there is no universal lower bound to $\alpha_2$. Hayman [6] showed that $\alpha_2 \ge 1/900$ provided A is simply connected. Osserman [8] improved this to $\alpha_2 > 1/4$ (see also Osserman [9], [10]). Osserman [8] extended Hayman's result to $\alpha_2 \ge k^{-2}$ for domains of connectivity $k \ge 2$; Croke [4] improved this for $k \ge 2$ to $\alpha_2 > (2k)^{-1}$. (Earlier, Taylor [11] and Cheng [3] had found bounds of the Croke type but with worse constants.) A related result is that of Cheeger [2], valid for $n \ge 2$, namely $\lambda(A) \ge \inf S^2/4V^2$, where the infimum is over all relatively compact subdomains of A of surface area S and volume V. However, Cheeger's result does not imply any universal lower bound to $\alpha_n$. We shall return to this quantity, inf S/V, later in (2.6).

$n \ge 3$: No such inequality is possible, even if topological properties are taken into account. Hayman [6] points out that if A is a ball with many narrow, inward pointing spikes removed from it, then $\lambda(A) \approx \lambda(\text{ball})$ but $R \approx 0$.

In some special cases, however, a lower bound can be given for $\alpha_n$. Hayman [6] shows that this can be done if every boundary point, x, of A has the property that every ball centered at x has a fixed fraction of its volume outside A. Another example is Osserman's result [10] that $\lambda(A) \ge (2R)^{-2}$ for convex domains, based on Cheeger's result [2] and a result of Brascamp and Lieb [1] about the level sets of the lowest eigenfunction.

What these results show, in a word, is that when n > 1, A need not contain any ball of fixed radius R no matter how small $\lambda(A)$ may be. Small holes and spikes do not influence $\lambda(A)$ very much but they do have a great effect on the ability to insert a ball.

Nevertheless, the intuition persists that if $\lambda(A)$ is small then A contains "most of" a ball of radius $R \sim \lambda^{-1/2}$. The holes and spikes cannot prevent this. More precisely, for each fraction $\varphi < 1$ there should be a constant $\alpha_n(\varphi)$, with $\alpha_n(\varphi) \to 0$ as $\varphi \to 1$, such that

$\lambda(A) \ge \alpha_n(\varphi) R^{-2}$ (1.2)

where R is the largest radius such that $|A \cap B_R| \ge \varphi |B_R|$ for some ball $B_R$ of radius R. This is the content of Corollary 2. Equation (1.2) is the aforementioned geometric motivation. The following is the key to proving it.

Theorem 1. Let A and B be non-empty open sets in $\mathbb{R}^n$, $n \ge 1$, and let $\lambda(A)$, $\lambda(B)$ be the lowest eigenvalue of $-\Delta$ with Dirichlet boundary conditions. Let $B_x$ denote B translated by $x \in \mathbb{R}^n$. Let $\varepsilon > 0$. Then there exists an x such that

$\lambda(A \cap B_x) < \lambda(A) + \lambda(B) + \varepsilon$. (1.3)

If A and B are both bounded then there is an x such that

$\lambda(A \cap B_x) < \lambda(A) + \lambda(B)$. (1.4)

Equivalently

$\lambda(A) \ge \inf_x \lambda(A \cap B_x) - \lambda(B)$, (1.3')

$\lambda(A) > \inf_x \lambda(A \cap B_x) - \lambda(B)$, A, B bounded. (1.4')

Moreover, $x \mapsto \lambda(A \cap B_x)$ is upper-semicontinuous, so that the set of x's for which (1.3) or (1.4) holds is open.
Remark 1. No assumption is made about the smoothness of the boundaries of A and B.

Remark 2. (1.3) and (1.4) are, in one sense, best possible as the following examples show.
Example 1. In R2, let A be the strip A = {(x, y)10 <x
B be the perpendicular strip B= {(x, y)I - oc. <x< a;, 0s2, +t2 for suitable x.
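Example 3. In $\mathbb{R}^2$, let B be a ball of radius $2^{1/2}$. Let $\mathbb{Z}^2$ be the lattice with integer points, and for each $y \in \mathbb{Z}^2$ let $b_y$ be the ball $b_y = \{x \mid |x - y| \le r_y\}$. Let $A = \mathbb{R}^2 \setminus \bigcup_{y \in \mathbb{Z}^2} b_y$. If $r_y \to 0$ as $|y| \to \infty$ then $\lambda(A) = 0$. For $|x|$ sufficiently large $\lambda(A \cap B_x) < 0 + \lambda(B) + \varepsilon$. However, $\lambda(A \cap B_x) > 0 + \lambda(B)$ for every x. Similar examples hold in $\mathbb{R}^n$, n > 2.

Corollary 2. Let A be a non-empty (bounded or unbounded) open set in $\mathbb{R}^n$, $n \ge 1$, and let $B_r$ be a ball of radius r of volume $|B_r| = r^n/\delta_n$, $\delta_n = [n\Gamma(n/2)/2]\pi^{-n/2}$. Let $\beta_n$ be as given in (1.1). Let $0 < \varphi < 1$ be fixed and let

$\alpha_n(\varphi) = \beta_n \delta_n^{2/n} [\varphi^{-2/n} - 1] > 0$. (1.5)

Suppose that for $0 < R \le \infty$,

$\lambda(A) \le \alpha_n(\varphi) R^{-2}$. (1.6)

Then for every $0 < r < R$ there exists an $x \in \mathbb{R}^n$ and a ball $B_{r,x}$ of radius r such that $|A \cap B_{r,x}| > \varphi |B_r| = \varphi r^n/\delta_n$.

This corollary has an obvious analogue for domains B other than balls.

Proof of Corollary. Let r < R and choose $\varepsilon = (r^{-2} - R^{-2})\alpha_n(\varphi)$. By Theorem 1 there is an x such that $\alpha_n(\varphi)R^{-2} \ge \lambda(A) > \lambda(A \cap B_{r,x}) - \lambda(B_r) - \varepsilon$. However, $\lambda(B_r) = \beta_n |B_r|^{-2/n}$ and, by (1.1), $\lambda(A \cap B_{r,x}) \ge \beta_n |A \cap B_{r,x}|^{-2/n}$. □

We turn now to the proof of Theorem 1. The basic idea is really very simple and is most clearly displayed in the proof of the first part, (1.3). In the proof of the second part, (1.4), the basic idea is obscured by technicalities. I am indebted to Haim Brezis for helpful ideas about the second part. First, let us define

$\lambda(A) = \inf\{J(f) \mid f \in H_0^1(A), f \ne 0\} = \inf\{J(f) \mid f \in C_0^\infty(A), f \ne 0\}$. (1.7)

$J(f) = \int |\nabla f|^2 / \int |f|^2$. (1.8)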
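Proof of (1.3). There exist $f \in C_0^\infty(A)$, $g \in C_0^\infty(B)$, $f, g \ne 0$, such that $J(f) < \lambda(A) + \varepsilon/2$, $J(g) < \lambda(B) + \varepsilon/2$ and $\int f^2 = \int g^2 = 1$. Let $h_x(y) = f(y)g(y - x)$ and

$T(x) = \int |\nabla h_x|^2$, $D(x) = \int |h_x|^2$. (1.9)

Clearly,

$\int D(x)\,dx = \iint f(y)^2 g(y - x)^2\,dy\,dx = 1$. (1.10)

We now compute

$|\nabla h_x|^2(y) = |\nabla f|^2(y)\, g^2(y - x) + f^2(y)\, |\nabla g|^2(y - x) + (\nabla f^2)(y) \cdot (\nabla g^2)(y - x)/2$. (1.11)

The last term can be written as $-(\nabla f^2)(y) \cdot \nabla_x [g^2(y - x)]/2$. Thus the integral (over x) of this term vanishes and

$\int T(x)\,dx = \int |\nabla f|^2 + \int |\nabla g|^2 < \lambda(A) + \lambda(B) + \varepsilon \equiv \Lambda$. (1.12)

Therefore, $\int dx\,[T(x) - \Lambda D(x)] < 0$ and hence $\Lambda D(x) > T(x) \ge 0$ on a set of positive measure. (1.3) then follows from the Definition (1.7), (1.8).

To prove that $x \mapsto \lambda(A \cap B_x)$ is upper-semicontinuous, we note the easily proved fact that $C_0^\infty(A \cap B_x) = \{f(\cdot)g(\cdot - x) \mid f \in C_0^\infty(A), g \in C_0^\infty(B)\}$. For any such product function, T(x) and D(x) in (1.9) are clearly continuous. The function j given by $j(x) = T(x)/D(x)$ if D(x) > 0 and $j(x) = \infty$ otherwise is thus upper-semicontinuous. Equation (1.7) then gives $\lambda(A \cap B_x)$ as the infimum (over f, g) of upper-semicontinuous functions. □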
Proof of (1.4). Since A and B are bounded there exist $F \in H_0^1(A)$ and $G \in H_0^1(B)$ such that $J(F) = \lambda(A)$ and $J(G) = \lambda(B)$, with $\int F^2 = \int G^2 = 1$. This is a simple consequence of the Rellich-Kondrachov compactness theorem. (Again, we extend F to all of $\mathbb{R}^n$ with F(x) = 0, $x \notin A$, and similarly for G; it is easy to see that $F, G \in H^1(\mathbb{R}^n)$.) Define $H_x(y) = F(y)G(y - x)$. Since $F, G \in L^2$, $H_x \in L^1$. Likewise, since $\nabla F$ and $\nabla G$ are $L^2$ functions, $W_x(y) \equiv (\nabla F)(y)G(y - x) + F(y)\nabla G(y - x) \in L^1$. It is easy to see (by approximating F, G by $C_0^\infty$ functions) that $\nabla H_x = W_x$ in the sense of distributions.

It is not a-priori clear that $H_x$ or $W_x \in L^2$. (They are, in fact, in $L^2$ because F and $G \in L^\infty$. However, this latter fact is not elementary and we prefer to avoid using it. For our purpose it suffices to show that $H_x$ and $W_x \in L^2$ for almost all x, and this can be done by the following elementary argument.) We note that (by Fubini) $D(x) = \int H_x^2$ satisfies $\int D(x)\,dx = 1$ as in (1.10). Thus, $D(x) < \infty$, a.e. and $H_x \in L^2$ a.e. (dx). Likewise, repeating the argument in (1.11), (1.12) for $T(x) = \int |W_x|^2$,

$\int T(x)\,dx = \lambda(A) + \lambda(B) \equiv \Lambda$. (1.13)

(Here, one has to note as above that $\nabla F(y)G(y - x) \in L^2(dy)$ a.e. (dx) and $F(y)\nabla G(y - x) \in L^2(dy)$ a.e. (dx). By the Schwarz inequality, $Z_x(y) \equiv F(y)G(y - x)\nabla F(y) \cdot \nabla G(y - x) \in L^1(dy)$ a.e. (dx) and $Z_x(y) \in L^1(dx\,dy)$. Finally, $2G\nabla G = \nabla G^2$ in the distributional sense, whence $\iint Z_x(y)\,dx\,dy = 0$.) Thus, for almost all x, $H_x \in H_0^1(A \cap B_x)$. We have that $\int (T - \Lambda D) = 0$. The remainder of the proof is as before, except that in order to prove the strict inequality (1.4) we must show that $T(x) = \Lambda D(x)$ cannot hold a.e. To see this let $K_A$ (resp. $K_B$) be the characteristic function of A (resp. B). Then $(K_A * K_B)(x) \equiv K(x) = |A \cap B_x|$ is a continuous function of compact support. For $\varepsilon > 0$ there is an open set C such that $0 < K(x) < \varepsilon$ for $x \in C$. (Here F > 0, G > 0.) Thus, D(x) > 0, $x \in C$. If $T(x) = \Lambda D(x)$, $x \in C$, then $\lambda(A \cap B_x) \le \Lambda$, but this is impossible for sufficiently small $\varepsilon$ by the Faber-Krahn inequality (1.1). □
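The Faber-Krahn inequality (1.1), invoked in the last step above, is easy to test on explicit planar domains. The sketch below is an illustration only; the names `J01`, `BETA_2`, `rectangle_eigenvalue` and `faber_krahn_bound` are introduced for this example. It uses two standard facts: the lowest Dirichlet eigenvalue of a disk of radius $\rho$ is $j_{0,1}^2/\rho^2$, where $j_{0,1} \approx 2.404826$ is the first positive zero of the Bessel function $J_0$, so that $\beta_2 = \pi j_{0,1}^2$; and the lowest Dirichlet eigenvalue of an a-by-b rectangle is $\pi^2(1/a^2 + 1/b^2)$.

```python
import math

J01 = 2.404825557695773        # first positive zero of the Bessel function J_0
BETA_2 = math.pi * J01 ** 2    # lowest Dirichlet eigenvalue of the disk of unit area

def rectangle_eigenvalue(a, b):
    """Lowest Dirichlet eigenvalue of -Laplacian on an a-by-b rectangle."""
    return math.pi ** 2 * (1.0 / a ** 2 + 1.0 / b ** 2)

def faber_krahn_bound(area):
    """Right side of (1.1) for n = 2: beta_2 * |A|^(-2/n) = beta_2 / |A|."""
    return BETA_2 / area

if __name__ == "__main__":
    for a, b in [(1.0, 1.0), (2.0, 0.5), (10.0, 0.1), (0.01, 0.01)]:
        print(a, b, rectangle_eigenvalue(a, b), faber_krahn_bound(a * b))
```

The last sample shows the mechanism used in the proof: as the area of the domain shrinks, the lower bound $\beta_n |A|^{-2/n}$ grows without limit, so a set of very small volume cannot have an eigenvalue below a fixed $\Lambda$.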
2. Some extensions of Theorem 1

Instead of the ordinary eigenvalue given in (1.7), (1.8) we can consider the eigenvalue, $1 \le p < \infty$, given by

$\lambda_p(A) = \inf\{J(f) \mid f \in W_0^{1,p}(A), f \ne 0\} = \inf\{J(f) \mid f \in C_0^\infty(A), f \ne 0\}$ (2.1)

$J(f) = \int |\nabla f|^p / \int |f|^p$. (2.2)

As in (1.1) there is an isoperimetric inequality for $\lambda_p(A)$, which is also proved by rearrangement inequalities. Given $|A|$, the minimum is achieved for a ball:

$\lambda_p(A) \ge \beta_{n,p} |A|^{-p/n}$, (2.3)

where $\beta_{n,p}$ is $\lambda_p$ for a ball of unit volume.

The analogue of Theorem 1 is

Theorem 3. Let A and B be non-empty open sets in $\mathbb{R}^n$, $n \ge 1$. Let $1 \le p < \infty$ and let $\lambda_p(A)$, $\lambda_p(B)$ be given by (2.1), (2.2). Let $\varepsilon > 0$. Then there exists an $x \in \mathbb{R}^n$ such that

$\lambda_p(A \cap B_x)^{1/p} < \lambda_p(A)^{1/p} + \lambda_p(B)^{1/p} + \varepsilon$. (2.4)

If A and B are bounded and $1 < p < \infty$ then there is an x such that

$\lambda_p(A \cap B_x)^{1/p} < \lambda_p(A)^{1/p} + \lambda_p(B)^{1/p}$. (2.5)
(2.4), (2.5) are equivalent to

$\lambda_p(A)^{1/p} \ge \inf_x \lambda_p(A \cap B_x)^{1/p} - \lambda_p(B)^{1/p}$ (2.4')

$\lambda_p(A)^{1/p} > \inf_x \lambda_p(A \cap B_x)^{1/p} - \lambda_p(B)^{1/p}$. (2.5')

For $1 \le p < \infty$, $x \mapsto \lambda_p(A \cap B_x)$ is upper-semicontinuous so that the set of x's for which (2.4) or (2.5) holds is open.

Proof. A few remarks about the necessary changes in the proof should suffice.

(i) For (2.4) we again use $C_0^\infty$ approximants. The exponents 1/p in (2.4), (2.5) (which a clever reader might be able to eliminate) result from the fact that $|\nabla h_x|^p$ is more complicated than in (1.11). All we can say is that $\|g\nabla f + f\nabla g\|_p \le \|g\nabla f\|_p + \|f\nabla g\|_p$.
(ii) For (2.5) we note (by the Rellich-Kondrachov theorem) that when A is bounded and 1
automatically imply that combinations such as $F(y)G(y - x)$ or $\nabla F(y)G(y - x)$, etc. are in $L^1(dy)$. (We could use the fact that $F, G \in L^\infty$ but, as before, we prefer to give an elementary proof.) The earlier proof (see (1.13)) does show, however, that they are in $L^p(dy)$ a.e. (dx). That they are also in $L^1(dy)$ a.e. (dx) follows from this and the fact that their supports are bounded for all x. Hence $W_x(y) = \nabla F(y)G(y - x) + F(y)\nabla G(y - x)$ makes sense as a distribution a.e. (dx) and it is then easy to see that $W_x(y) = \nabla_y [F(y)G(y - x)]$ a.e. (dx).

Remark. When p = 1 there is no minimizing function, F, for (2.1), (2.2), even if A is bounded. However [12],

$\lambda_1(A) = \inf S(D)/|D|$, (2.6)

where the infimum is over all relatively compact subdomains, D, of A with surface area S(D) and volume $|D|$.

Corollary 2 has the following analogue.

Corollary 4. Let A be a non-empty open set in $\mathbb{R}^n$, $n \ge 1$, let $1 \le p < \infty$, and let $B_r$ be a ball of radius r with $|B_r| = r^n/\delta_n$. Let $0 < \varphi < 1$ be fixed and let

$\alpha_{n,p}(\varphi) = \beta_{n,p}\, \delta_n^{p/n} [\varphi^{-1/n} - 1]^p$. (2.7)

If $\lambda_p(A) \le \alpha_{n,p}(\varphi) R^{-p}$, $0 < R \le \infty$, then for every $0 < r < R$ there exists an $x \in \mathbb{R}^n$ and a ball $B_{r,x}$ of radius r such that $|A \cap B_{r,x}| > \varphi |B_r| = \varphi r^n/\delta_n$.

Proof. The proof imitates that of Corollary 2, using Theorem 3, with a suitable choice of $\varepsilon$. □

Another variation on Theorem 1 was suggested to me by S.T. Yau. One can consider manifolds other than $\mathbb{R}^n$ and symmetry groups other than the translation group. As an illustration, consider the sphere $S^n$ and the rotation group O(n + 1). If $B \subset S^n$, $R \in O(n + 1)$ then $B_R = \{x \in S^n \mid x = Ry, y \in B\}$. Let $d\mu(R)$ denote normalized Haar measure on O(n + 1). The analogue of Theorems 1 and 3 is the following.
I0. Then there exists an $R$ in $O(n+1)$ such that $\lambda_p(A \cap B_R)^{1/p} < \lambda_p(A)^{1/p} + \lambda_p(B)^{1/p}$
I
(2.8)
$\lambda_p(A \cap B_R)^{1/p} < \lambda_p(A)^{1/p} + \lambda_p(B)^{1/p} + \varepsilon,$
1
(2.9)
If $p = 2$ then the exponent $1/p$ can be replaced by 1 in (2.8), (2.9). The map $R \mapsto \lambda_p(A \cap B_R)$ is upper-semicontinuous.
The proof is as before provided we note that for all $y \in S^n$ and $f : S^n \to \mathbb{C}$,
$\int d\mu(R)\, f(Ry) = |S^n|^{-1} \int_{S^n} f(y)\,dy.$
3. A compactness lemma
One of the motivations, in addition to the geometric one mentioned in Sect. 1, for proving Theorems 1 and 3 was to prove the following compactness lemma. It is useful in the calculus of variations to show that, under some condition, a bounded sequence of functions in $W^{1,p}(\mathbb{R}^n)$ can, after suitable translations, be assumed to have a weak limit that is not zero. (See [13, 14] for example.)
Lemma 6. Let Ie} satisfies $|E_j| \ge C$ for some fixed $\varepsilon, C > 0$. Then there exists a sequence of translations $\{t_j\}$, $F_j(y) \equiv f_j(t_j y) = f_j(y + x_j)$, such that $F_{n_j} \rightharpoonup F$ weakly in $W^{1,p}(\mathbb{R}^n)$ and $F \not\equiv 0$, for some subsequence $\{n_j\}$.
Proof. By density, we can assume that fjeCo so that Ej is open and bounded (replace
a
by e/2
Let gj(y)=max(fj(y)-e/2,0)eW'"t' and Let Aj={ylgj(y)>0}=)Ej, which is also open and
if necessary).
bounded. Then gjeWW' (Aj) and JIPgjjP(e/2)"C. Thus AP(A)r"/2b,. Choose tj: Let PER' denote the characteristic function of B,,o, so that
J Fj f Zer"/4b"=K. By the Banach-Alaoglu theorem there is a subsequence such that Fj--F. But F$0 since IF =K. Motivated by the foregoing, H. Brezis (private communication) found another proof that does not use Corollary 4.
Brezis' proof of Lemma6. We start with a simple remark. Let uEL?", with Let B. denote the unit pueLP and 11VuIIP51. Set (for ball in R" centered at x and let Yx be its characteristic function. Clearly there is some x such that J I Pulpy.
(3.1)
On the other hand, by Sobolev's inequality we have IQPu1°+Iul1flx>S IIufj11q
where q-'+n-'=p-' if pn, q is arbitrary with p O depends only on p, q. Combining (3.1), (3.2) and Holder's inequality we obtain S <(k+ 1) 1 B. nsupp u1' -pfa.
(3.3)
Let us apply the previous remark to u=max(fj-e/2,O). For simplicity, we assume that 11 I fj11 P < 1 so that 11 Pu11v 51. From the assumptions of Lemma 6 we
have 11u11 o=(e/2)o IE jI z (e/2y' C, and thus k <-1 +(2/e)°/C. From (3.3) we deduce
that there exists some $x_j$ such that
$|B_{x_j} \cap \{x \mid f_j(x) > \varepsilon/2\}| \ge K$ for some constant $K$ depending only on $p$, $q$, $\varepsilon$, $C$. The conclusion follows as in the previous proof.
References
1. Brascamp, H.J., Lieb, E.H.: Some inequalities for Gaussian measures and the long-range order of the one-dimensional plasma. In: Functional Integration and its Applications, Arthurs, A.M. (ed.) pp. 1-14. Oxford: Clarendon Press 1975
2. Cheeger, J.: A lower bound for the smallest eigenvalue of the Laplacian. In: Problems in Analysis, a Symposium in Honor of Salomon Bochner, Gunning, R.C. (ed.) pp. 145-199. Princeton, N.J.: Princeton University Press 1970
3. Cheng, S.Y.: On the Hayman-Osserman-Taylor inequality (preprint)
4. Croke, C.B.: The first eigenvalue of the Laplacian for plane domains. Proc. Amer. Math. Soc. 81, 304-305 (1981)
5. Faber, C.: Beweis, dass unter allen homogenen Membranen von gleicher Fläche und gleicher Spannung die kreisförmige den tiefsten Grundton gibt. Sitzungsber. Bayer. Akad. der Wiss. Math. Phys., Munich 1923, pp. 169-172
6. Hayman, W.K.: Some bounds for principal frequency. Applic. Anal. 7, 247-254 (1977/1978)
7. Krahn, E.: Über eine von Rayleigh formulierte Minimaleigenschaft des Kreises. Math. Ann. 94, 97-100 (1925)
8. Osserman, R.: A note on Hayman's theorem on the bass note of a drum. Comment. Math. Helv. 52, 545-555 (1977)
9. Osserman, R.: The isoperimetric inequality. Bull. Amer. Math. Soc. 84, 1182-1238 (1978)
10. Osserman, R.: Bonnesen-style isoperimetric inequalities. Amer. Math. Monthly 86, 1-29 (1979)
11. Taylor, M.: Estimate on the fundamental frequency of a drum. Duke Math. J. 46, 447-453 (1979)
12. Yau, S.T.: Isoperimetric constants and the first eigenvalue of a compact manifold. Ann. Sci. Ecole Norm. Sup. 8, 487-507 (1975)
13. Lieb, E.H.: Some vector field equations. In: Proceedings of the March 1983 University of Alabama, Birmingham International Conference on Partial Differential Equations, Knowles, I. (ed.), North-Holland (in press)
14. Brezis, H., Lieb, E.H.: Minimum action solutions to some vector field equations (in preparation)
Oblatum 22-IV-1983
With H. Brezis in Commun. Math. Phys. 96, 97-113 (1984)
Communications in Mathematical Physics
© Springer-Verlag 1984
Minimum Action Solutions of Some Vector Field Equations
Haim Brezis¹ and Elliott H. Lieb²*
1 Département de Mathématiques, Université Paris VI, 4, Place Jussieu, F-75230 Paris, Cedex 05, France
2 Departments of Mathematics and Physics, Princeton University, P.O. Box 708, Princeton, NJ 08544, USA
Abstract. The system of equations studied in this paper is $-\Delta u_i = g^i(u)$ on $\mathbb{R}^d$, $d \ge 2$, with $u : \mathbb{R}^d \to \mathbb{R}^n$ and $g^i(u) = \partial G/\partial u_i$. Associated with this system is the action, $S(u) = \int \{\tfrac12 |\nabla u|^2 - G(u)\}$. Under appropriate conditions on $G$ (which differ for $d = 2$ and $d \ge 3$) it is proved that the system has a solution, $u \not\equiv 0$, of finite action and that this solution also minimizes the action within the class $\{v$ is a solution, $v$ has finite action, $v \not\equiv 0\}$.
I. Introduction
The purpose of this paper is to demonstrate the existence of solutions to a class of systems of partial differential equations that arises in several branches of mathematical physics (e.g. calculating lifetimes of metastable states, estimates of large order behavior of perturbation theory, Ginzburg-Landau theory, density of states in disordered systems). The systems to be considered are of the form
$-\Delta u_i(x) = g^i(u(x)), \quad i = 1, \ldots, n.$ (1.1)
Furthermore, it will be shown that among the nonzero solutions to (1.1) there is one that minimizes the action, $S(u)$, associated with (1.1). The meaning of the quantities in (1.1) is the following: $u = (u_1, \ldots, u_n) \in \mathbb{R}^n$ and each $u_i : \mathbb{R}^d \to \mathbb{R}$ with $d \ge 2$. We require that $u_i(x) \to 0$ as $|x| \to \infty$ in a weak sense described below (namely $u \in \mathcal{C}$). (Note: In some applications it is required that $u(x) \to c$ as $|x| \to \infty$ but, by redefining $u \to u - c$ and by redefining $g^i$, the problem can be reduced to the $u(x) \to 0$ case.) The $n$ functions $g^i : \mathbb{R}^n \to \mathbb{R}$ are the gradients of some function $G \in C^1(\mathbb{R}^n \setminus \{0\})$, namely
$g^i(u) = \partial G(u)/\partial u_i$ for $u \ne 0$, $\quad g^i(u) = 0$ for $u = 0$, (1.2)
* Work partially supported by U.S. National Science Foundation Grant PHY-81-16101-A02
and $G$ satisfies certain properties described in Sect. II ($d \ge 3$) and Sect. III ($d = 2$). In particular, we emphasize that $G(u)$ need not be differentiable at $u = 0$ so that, for example, $G(u)$ could be $-|u|$ near $u = 0$. The Action associated with (1.1) is
$S(u) = K(u) - V(u),$ (1.3)
$K(u) = \tfrac12 \int |\nabla u(x)|^2\,dx \equiv \tfrac12 \sum_{i=1}^n \int |\nabla u_i(x)|^2\,dx,$ (1.4)
$V(u) = \int G(u(x))\,dx.$ (1.5)
In general, $S(u)$ is not bounded below, and one of our goals is to show that, under suitable conditions, $S(u) > -\infty$ if $u$ satisfies (1.1) and that $S(u)$ actually has a minimum in the set of non-trivial solutions to (1.1). The word non-trivial (meaning $u \not\equiv 0$) is important; it will be shown later that when $d = 2$ the function $u \equiv 0$ satisfies (1.1) and minimizes $S(u)$, but the non-trivial solutions to (1.1) all have $S(u) > 0$. When $d \ge 3$, the $u \equiv 0$ solution never has the minimum action.
The class of functions to which we shall restrict our investigation of (1.1) as an equation in $\mathcal{D}'$ is
$\mathcal{C} = \{u \mid u \in L^1_{\mathrm{loc}}(\mathbb{R}^d),\ \nabla u \in L^2(\mathbb{R}^d),\ G(u) \in L^1(\mathbb{R}^d),\ \mu([|u| > a]) < \infty \text{ for all } a > 0\}.$ (1.6)
Here, the symbol $[f > a]$ denotes the set $\{x \mid f(x) > a\}$. The same symbol, $[f > a]$, will also be used to denote the characteristic function of this set. Lebesgue measure is denoted by $\mu$. The set
$\mathcal{G} = \{u \mid u \in \mathcal{C},\ g(u) \in L^1_{\mathrm{loc}}(\mathbb{R}^d),\ u \text{ satisfies (1.1) in } \mathcal{D}',\ u \not\equiv 0\}$ (1.7)
is the subset of $\mathcal{C}$ which we shall prove is non-empty and in which there is a $\bar u$ such that
$S(\bar u) \le S(u)$, all $u \in \mathcal{G}$. (1.8)
The solution of this problem was reported in 1983 and an outline of the proof was given [13]. The purpose of the present paper is to present all the details of the proof and certain additional refinements. Probably the earliest general treatment of existence of finite action solutions to (1.1) was by Strauss [20] for $n = 1$, $d \ge 1$. (The case $n = 1$ is called the scalar case.)
While this work was very important because it introduced new techniques, it imposed severe restrictions on the function $G$. Moreover, Strauss did not explicitly consider the question of whether or not his solution to (1.1) minimized the action. Strauss and Vázquez extended this work to the vector case and to the "zero mass" case [22]. The next step was taken by Coleman et al. [10] who made an important contribution to the problem by their "constrained minimum method" which not only yields a solution for $d \ge 3$ but also yields a minimum action solution. They discovered almost optimal assumptions on $G$ so that the problem has a solution, but their method for finding a minimum action solution was restricted in an essential way to $d \ge 3$ and $n = 1$. A detailed treatment of the Coleman, Glaser, Martin method, together with some improvements and other theorems useful in the analysis of this and related problems was given by Berestycki and Lions [5, 6]. Then, in the same generality, Berestycki and Lions [6]
went on to prove the existence of infinitely many finite action solutions to (1.1). Strauss [20] also had results in this direction. (Infinitely many solutions for the so-called zero mass case were found in [7].) As before, all of this was for $n = 1$, $d \ge 3$. In view of the aforementioned work, two natural extensions suggest themselves. One is to $d = 2$ and the other is to $n > 1$. We thank Ian Affleck for suggesting both problems to one of us (E. L.). Affleck was interested in the $d = 2$ case for physical applications [1]. Some results for both $d = 2$ and for $n > 1$ were obtained by one of us (E.L.) in 1983, and these were subsequently strengthened in collaboration with H.B. to the level of generality given here and in [13]. Independently, in 1982, Berestycki, Gallouet and Kavian had solved the $d = 2$, $n = 1$ case (with stronger hypotheses than in the present paper; in particular they do not treat the zero mass case) and this was published recently [3] (see also [4]). [However, they also showed there are infinitely many solutions of (1.1) for $d = 2$, $n = 1$.]
The proofs for $n = 1$ all relied on the fact that one could look for minima in the class of radial functions (by rearrangement inequalities), and that these functions have certain compactness properties [20, 6]. For $n > 1$, one can still restrict attention to radial solutions, although it is not known whether the minimum action solution lies in this class (because rearrangement inequalities are not applicable). Berestycki and Lions [5] showed how to prove the existence of radial solutions that minimize the action among all radial solutions of (1.1). The extension to $n > 1$ requires a new compactness device. In this paper, the heart of the matter is contained in Lemma 2.2. It should be noted that Lions has developed a general compactness principle [15, 16] which allows him to deal with the cases $d \ge 2$, $n \ge 1$.
II. The Case of Three or More Dimensions
A. The Minimization Problem
Let $G : \mathbb{R}^n \to \mathbb{R}$ be continuous with $G(0) = 0$. In this subsection we shall consider a minimization problem that leads to (1.1) if $G$ happens to be differentiable, but here we shall make no assumptions about the differentiability of $G$. Here, and henceforth, $C > 0$ will denote an inessential, positive constant. $G$ satisfies the following four conditions (2.2)-(2.5). [Note $G(u)$, not $|G(u)|$ in (2.2), (2.3).]
$\limsup_{|u| \to 0} |u|^{-p}\, G(u) \le 0,$ (2.2)
where, for $d \ge 3$, $p$ always denotes $p = 2^* = 2d/(d-2)$,
$\limsup_{|u| \to \infty} |u|^{-p}\, G(u) \le 0,$ (2.3)
$G(u_0) > 0$ for some $u_0 \in \mathbb{R}^n,$ (2.4)
For all $\gamma > 0$ there exists $C_\gamma$ such that for all $u, w \in \mathbb{R}^n$
$|G(u+w) - G(u)| \le \gamma\,[\,|G(u)| + |u|^p\,] + C_\gamma\,[\,|G(w)| + |w|^p + 1\,].$ (2.5)
Remark 1. Condition (2.5) looks awkward, but it holds in several cases such as (2.6) or (2.7) or (2.8): lim
Iul-PIG(u)1=0,
G C- C' (R"'\{0})
and g = VG satisfies
Ig(u)1
all u$0. and
G e C'(R'"\{0})
g = VG satisfies
lg(u)I
allu*0and allueR
(2.8)
C, a > 0 .
The main result of this section is the solution to the following minimization problem. We define
$T = \inf\{\tfrac12 \int |\nabla u|^2 \mid u \in \mathcal{C},\ \int G(u) \ge 1\}.$ (2.9)
Theorem 2.1. Assume (2.2)-(2.5). Then there exists $v \in \mathcal{C}$ such that
$\tfrac12 \int |\nabla v|^2 = T$ (2.10)
and
$\int G(v) = 1.$ (2.11)
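Remark 2. Using (2.4) it is easy to see that there is some $u \in \mathcal{C}$ such that $\int G(u) = 1$.
Remark 3. Let $u \in L^1_{\mathrm{loc}}$, and $\nabla u \in L^2$, such that $u(x) \to 0$ as $|x| \to \infty$ in the weak sense of (1.6), namely $\mu([|u| > a]) < \infty$, all $a > 0$. Then $u \in L^p$ and $\|u\|_p \le C\|\nabla u\|_2$. Thus, the class $\mathcal{C}$ in (1.6) can be characterized (for $d \ge 3$) as $\mathcal{C} = \{u \mid u \in L^p(\mathbb{R}^d),\ \nabla u \in L^2(\mathbb{R}^d),\ G(u) \in L^1(\mathbb{R}^d)\}$.
To prove this, let $\chi_n(x) = \chi(x/n)$, where $\chi \in C_0^\infty$ and $\chi \equiv 1$ near 0. Let $\varepsilon > 0$ be fixed. Assume, provisionally, that $u \in \mathcal{C}$ and also $u \in L^\infty$. By Sobolev's inequality,
$\|\chi_n(|u| - \varepsilon)_+\|_p \le C\,\|\nabla[\chi_n(|u| - \varepsilon)_+]\|_2 \le C\{\|\nabla u\|_2 + [\int_A |\nabla\chi_n|^2]^{1/2}\} \le C\|\nabla u\|_2 + C_\varepsilon/n,$
where $A = [|u| > \varepsilon]$ and $C_\varepsilon$ is some constant depending on $\varepsilon$. We conclude (in this $L^\infty$ case) by letting $n \to \infty$ and then $\varepsilon \to 0$. If $u \in \mathcal{C}$ but $u \notin L^\infty$, we may truncate $u$, then use the foregoing, and then remove the truncation by Fatou's lemma. In the following, $\{u^j\}$ denotes a minimizing sequence for (2.9).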
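The cutoff step in the proof of Remark 3 can be spelled out as follows. This is only a sketch under the assumptions stated there ($u \in L^\infty$, $\chi \in C_0^\infty$, and $A = [|u| > \varepsilon]$ of finite measure); the explicit form of the constants is our own bookkeeping and is not taken from the original text.

```latex
% Sketch: why the cutoff term is O(1/n).
% Assumes u bounded, A = [|u| > eps], mu(A) finite, chi_n(x) = chi(x/n).
\begin{align*}
\nabla\bigl[\chi_n(|u|-\varepsilon)_+\bigr]
  &= \chi_n\,\nabla(|u|-\varepsilon)_+ \;+\; (|u|-\varepsilon)_+\,\nabla\chi_n ,\\
\bigl\|\chi_n\,\nabla(|u|-\varepsilon)_+\bigr\|_2
  &\le \|\chi\|_\infty\,\|\nabla u\|_2 ,\\
\int_A |\nabla\chi_n|^2\,dx
  &= n^{-2}\int_A \bigl|(\nabla\chi)(x/n)\bigr|^2\,dx
   \;\le\; n^{-2}\,\|\nabla\chi\|_\infty^2\,\mu(A) ,\\
\bigl\|(|u|-\varepsilon)_+\,\nabla\chi_n\bigr\|_2
  &\le \|u\|_\infty\,\|\nabla\chi\|_\infty\,\mu(A)^{1/2}\,n^{-1} .
\end{align*}
```

The second line uses $|\nabla(|u|-\varepsilon)_+| \le |\nabla u|$ a.e.; the last line uses that $(|u|-\varepsilon)_+$ vanishes outside $A$.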
Lemma 2.1. There exist $\varepsilon, \delta > 0$ such that for all $j$, $\mu([|u^j| > \varepsilon]) \ge \delta$.
Proof. Since $\nabla u^j$ is bounded in $L^2$, Sobolev's inequality implies that
$\|u^j\|_p^p \le C.$ (2.12)
Let $\gamma = 1/(2C)$. By (2.2), (2.3) there exists $1 > \varepsilon > 0$ such that
$G(v) \le \gamma |v|^p$ for $|v| \le \varepsilon$ or $|v| \ge 1/\varepsilon.$ (2.13)
Thus, we have that
$1 \le \int G(u^j) \le \gamma \int |u^j|^p + \int G(u^j)\,[\varepsilon < |u^j| < 1/\varepsilon] \le \tfrac12 + C_\varepsilon\, \mu([|u^j| > \varepsilon]).$
This implies the lemma with $\delta = 1/(2C_\varepsilon)$. Next, we recall the following [14]:
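Lemma 2.2. Let $v$ be a function such that $v \in L^1_{\mathrm{loc}}$, $\nabla v \in L^2$, $\|\nabla v\|_2 < C$ and $\mu([|v| > \varepsilon]) \ge \delta > 0$. Then, there exists a shift $Tv(x) = v(x+y)$ such that, for some constant $\alpha = \alpha(C, \delta, \varepsilon) > 0$, $\mu(B \cap [|Tv| > \varepsilon/2]) > \alpha$, where $B = \{x \in \mathbb{R}^d \mid |x| \le 1\}$.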
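Using Lemmas 2.1 and 2.2 we can shift each $u^j$ in such a way that $\mu(B \cap [|Tu^j| > \varepsilon/2]) \ge \alpha$, where $\alpha > 0$ is independent of $j$. Thus, we may assume without loss of generality that $\mu(B \cap [|u^j| > \varepsilon/2]) \ge \alpha$. After extracting a subsequence we may also assume that (cf. (2.12))
$u^j \rightharpoonup u$ weakly in $L^p$, $\quad \nabla u^j \rightharpoonup \nabla u$ weakly in $L^2$,
$u^j \to u$ a.e. on $\mathbb{R}^d$, $\quad \mu(B \cap [|u| \ge \varepsilon/2]) \ge \alpha$.
Finally, we have $G(u) \in L^1$. To prove this, let us write $G = G_+ - G_-$ with $G_+ = \max\{G, 0\}$ and $G_- = \max\{-G, 0\}$. We have
$\int G_+(u^j) \le \gamma \int |u^j|^p \{[|u^j| \le \varepsilon] + [|u^j| \ge 1/\varepsilon]\} + \int G_+(u^j)\,[\varepsilon < |u^j| < 1/\varepsilon] \le \text{const}.$
(The last integral is bounded since $\mu([|u^j| > \varepsilon]) \le (C/\varepsilon)^p$; moreover, $G_+$ is bounded on $(\varepsilon, 1/\varepsilon)$ since $G_+$ is continuous.) We also have $\int G_-(u^j) \le \int G_+(u^j) - 1$. Hence, $\int |G(u^j)| \le \text{const}$, and we deduce from Fatou's lemma that $G(u) \in L^1$. Thus, $u \in \mathcal{C}$. We conclude the proof of Theorem 2.1 with
Lemma 2.3. The limit function satisfies $\int G(u) = 1$ and $\tfrac12 \int |\nabla u|^2 = T$, where $T$ is defined in (2.9).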
Remark 4. It follows from Lemma 2.3 that in fact $\nabla u^j \to \nabla u$ strongly in $L^2$ and thus $u^j \to u$ strongly in $L^p$.
Proof of Lemma 2.3. It is easily seen by scaling [i.e. $v(x) \to v(\lambda x)$] that
$\tfrac12 \int |\nabla v|^2 \ge T\,[\int G(v)]^{(d-2)/d}$, all $v \in \mathcal{C}$ with $\int G(v) > 0$. (2.14)
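The scaling argument behind (2.14) is short enough to write out. The following is a sketch in our own notation ($v_\lambda(x) := v(\lambda x)$ is a label introduced here, not one used in the original), assuming only the definition (2.9) of $T$.

```latex
% Sketch of the scaling step; v_lambda(x) := v(lambda x), lambda > 0.
\begin{align*}
\int |\nabla v_\lambda|^2\,dx &= \lambda^{2-d}\int |\nabla v|^2\,dx ,
&
\int G(v_\lambda)\,dx &= \lambda^{-d}\int G(v)\,dx .
\end{align*}
% Choose lambda^d = \int G(v) > 0, so that \int G(v_lambda) = 1 and
% v_lambda is admissible in (2.9):
\begin{align*}
T \;\le\; \tfrac12\int |\nabla v_\lambda|^2
  \;=\; \lambda^{2-d}\,\tfrac12\int |\nabla v|^2
\quad\Longrightarrow\quad
\tfrac12\int |\nabla v|^2 \;\ge\; T\,\lambda^{d-2}
  \;=\; T\Bigl[\int G(v)\Bigr]^{(d-2)/d} .
\end{align*}
```

Both sides of (2.14) scale by the same factor $\lambda^{2-d}$, so the inequality is invariant under $v \to v_\lambda$.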
Let $\phi \in L^p$ with $G(\phi) \in L^1$ and with $\phi$ having compact support. We claim that, as $j \to \infty$,
$\int G(u^j + \phi) \ge 1 + \int G(u + \phi) - \int G(u) + o(1).$ (2.15)
[Note that the integrals in (2.15) make sense because of (2.5).]
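Verification of (2.15). Let $K = \operatorname{Supp}\phi$; we have
$\int G(u^j + \phi) = \int_K G(u^j + \phi) + \int_{K^c} G(u^j) \ge 1 + \int_K [G(u^j + \phi) - G(u^j)] = 1 + \int_K [G(u + \phi) - G(u)] + o(1).$
The last equality follows from Egorov's (or Vitali's) lemma. Indeed, given $\varepsilon > 0$ we fix $\gamma > 0$ small enough so that $\gamma \int (|G(u^j)| + |u^j|^p) < \varepsilon/2$.
By (2.5) we have that
$\int_A |G(u^j + \phi) - G(u^j)| \le \varepsilon/2 + C_\gamma \int_A [\,|G(\phi)| + |\phi|^p + 1\,] \le \varepsilon$
for any set $A \subset K$ with $\mu(A)$ small enough. Now let $\phi$ be such that
$\int [G(u + \phi) - G(u)] > -1.$ (2.16)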
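For $j$ large enough we may insert $v = u^j + \phi$ in (2.14) and, in the limit, we find that
$T + \int \nabla u \cdot \nabla\phi + \tfrac12 \int |\nabla\phi|^2 \ge T\,[1 + \int G(u + \phi) - \int G(u)]^{1-2/d}.$
That is,
$T + \tfrac12 \int |\nabla(u + \phi)|^2 - \tfrac12 \int |\nabla u|^2 \ge T\,[1 + \int G(u + \phi) - \int G(u)]^{1-2/d}.$ (2.17)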
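Let $\lambda > 0$ be fixed. We can find a mapping $S : \mathbb{R}^d \to \mathbb{R}^d$, bijective with $S$ and $S^{-1}$ smooth such that
$S(x) = \lambda x$ for $|x| < 1$, $\quad S(x) = x$ for $|x| > R$
(for some $R$ depending on $\lambda$). Set $S_n(x) = nS(x/n)$ and $\phi_n(x) = u(S_n(x)) - u(x)$, so that $\phi_n \in H^1$ and $\phi_n$ has compact support and $G(\phi_n) \in L^1$. [The last assertion is obtained by choosing $w = u(x)$ in (2.5).] We claim that as $n \to \infty$
$\int G(u + \phi_n) = \int G(u(\lambda x))\,dx + o(1) = \lambda^{-d} \int G(u) + o(1),$ (2.18)
and
$\tfrac12 \int |\nabla(u + \phi_n)|^2 = \tfrac12 \int |\nabla[u(\lambda x)]|^2\,dx + o(1) = \tfrac12 \lambda^{2-d} \int |\nabla u|^2 + o(1).$ (2.19)
Indeed we write
$\int G(u + \phi_n) = \int G(u(S_n(x)))\,dx = \int G(u(y))\,J_n(y)\,dy,$
where $J_n$ denotes the Jacobian determinant of the mapping $y \mapsto S_n^{-1}(y)$; it is easy to see that $|J_n| \le C$, $C$ independent of $n$, and $J_n(y) \to \lambda^{-d}$ as $n \to \infty$ for all $y$. Thus (2.18) follows by dominated convergence. The same argument applies to (2.19). We fix $\lambda > 0$ with $|\lambda - 1|$ so small that $(\lambda^{-d} - 1) \int G(u) > -1$. Thus $\phi = \phi_n$ satisfies (2.16) for $n$ large enough. Hence (2.17) holds for $\phi = \phi_n$ and in the limit (as $n \to \infty$) we obtain
$T + \tfrac12 (\lambda^{2-d} - 1) \int |\nabla u|^2 \ge T\,[1 + (\lambda^{-d} - 1) \int G(u)]^{1-2/d}.$ (2.20)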
Finally we choose $\lambda = 1 \pm \varepsilon$ in (2.20) and, as $\varepsilon \to 0$, we see that $\tfrac12 \int |\nabla u|^2 = T \int G(u)$. Since $u \not\equiv 0$ we have $\int G(u) > 0$, and we deduce from (2.14) (applied to $v = u$) that $\int G(u) \ge 1$. On the other hand, since $\nabla u^j \rightharpoonup \nabla u$ weakly in $L^2$, we obtain, by lower semicontinuity, that $\tfrac12 \int |\nabla u|^2 \le T$. Therefore $\int G(u) = 1$ and $\tfrac12 \int |\nabla u|^2 = T$. This concludes the proof of Theorem 2.1. □
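The passage from (2.20) to $\tfrac12\int|\nabla u|^2 = T\int G(u)$ is a first-order expansion in $\lambda$ around 1. A sketch, with the abbreviations $a$, $b$ and $s$ introduced here purely for illustration:

```latex
% Sketch: expand (2.20) at lambda = 1 + s, s -> 0 (s of either sign).
% Abbreviations (ours): a = (1/2) \int |\nabla u|^2,  b = \int G(u).
\begin{align*}
\lambda^{2-d} - 1 &= (2-d)\,s + O(s^2), &
\lambda^{-d} - 1 &= -d\,s + O(s^2), \\
T\bigl[1 + (\lambda^{-d}-1)\,b\bigr]^{1-2/d}
  &= T\bigl[1 - (d-2)\,b\,s\bigr] + O(s^2). &&
\end{align*}
% Inserting into (2.20):  T + (lambda^{2-d} - 1) a >= T[...]^{1-2/d}
\begin{align*}
-(d-2)\,a\,s \;\ge\; -(d-2)\,T\,b\,s + O(s^2)
\quad\text{for } s \to 0^{\pm}
\qquad\Longrightarrow\qquad a = T\,b .
\end{align*}
```

Taking $s > 0$ gives $a \le Tb$ and $s < 0$ gives $a \ge Tb$, which is why both signs $\lambda = 1 \pm \varepsilon$ are used.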
B. Further Properties of u
Throughout this section we assume that $G$ is differentiable on $\mathbb{R}^n \setminus \{0\}$. More precisely, let $G : \mathbb{R}^n \to \mathbb{R}$ be continuous (on all of $\mathbb{R}^n$) with $G(0) = 0$. Assume that $G$ satisfies (2.2)-(2.4) and $G \in C^1(\mathbb{R}^n \setminus \{0\})$. We set
$g(v) = \nabla G(v)$ if $v \ne 0$, $\quad g(v) = 0$ if $v = 0$.
We assume (2.8). For every $v \in \mathcal{C}$ we define its action to be $S(v) = \tfrac12 \int |\nabla v|^2 - \int G(v)$.
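Theorem 2.2. Let $u$ be given by Theorem 2.1. Then after some appropriate scaling, $\bar u(x) = u(\theta x)$, $(\theta > 0)$, $\bar u$ satisfies
$-\Delta \bar u = g(\bar u)$ in $\mathcal{D}'$. (2.21)
Moreover,
$0 < S(\bar u) \le S(v)$, all $v \in \mathcal{C} \cap L^\infty_{\mathrm{loc}}$, $v \not\equiv 0$, $-\Delta v = g(v)$ in $\mathcal{D}'$.
[In some cases, any solution $v$ is automatically in $L^\infty_{\mathrm{loc}}$ (see Theorem 2.3).]
Proof. Fix $\phi \in C_0^\infty$. We see easily by dominated convergence that as $t \to 0$
$\int [G(u + t\phi) - G(u)]\,[u \ne 0] = t \int [g(u) \cdot \phi]\,[u \ne 0] + o(t).$ (2.22)
Here we use (2.8). Also, we have that
$\int |G(u + t\phi) - G(u)|\,[u = 0] \le C|t| \int |\phi|\,[u = 0] + o(t).$ (2.23)
From (2.14), and using (2.22) and (2.23) we deduce that, for $|t|$ small enough,
$\tfrac12 \int |\nabla(u + t\phi)|^2 \ge T\{\int G(u + t\phi)\}^{1-2/d} \ge T\{1 + t \int g(u) \cdot \phi - C|t| \int |\phi|\,[u = 0] + o(t)\}^{1-2/d}$
$\ge T\{1 + t\left(\frac{d-2}{d}\right) \int g(u) \cdot \phi - C|t|\left(\frac{d-2}{d}\right) \int |\phi|\,[u = 0] + o(t)\}.$
Consequently,
$\left|\int \nabla u \cdot \nabla\phi - T\left(\frac{d-2}{d}\right) \int g(u) \cdot \phi\right| \le C \int |\phi|\,[u = 0]$, all $\phi \in C_0^\infty$.
We deduce from the Riesz representation theorem that there exists some $h \in L^\infty$ such that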
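$-\Delta u = T\left(\frac{d-2}{d}\right) g(u) + h\,[u = 0]$ in $\mathcal{D}'$.
Finally, we have $u \in L^p$ and (by (2.8)) $g(u) \in L^{q/(q-1)}_{\mathrm{loc}}$. We deduce from the elliptic regularity theory that $u \in W^{2,q/(q-1)}_{\mathrm{loc}}$ [since $q/(q-1) > 1$]. Therefore $\Delta u = 0$ a.e. on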
the set $[u = 0]$ (see [19] or [11]). Hence we have proved that
$-\Delta u = T\left(\frac{d-2}{d}\right) g(u)$ in $\mathcal{D}'$,
and therefore that $\bar u(x) = u(\theta x)$ satisfies (2.21) with $\theta^2 = d\,[(d-2)T]^{-1}$. In order to complete the proof of Theorem 2.2 we must establish Pohozaev's identity [18] in a setting slightly more general than usual.
Lemma 2.4. Assume $G \in C^1(\mathbb{R}^n \setminus \{0\})$ and let $v \in \mathcal{C} \cap L^\infty_{\mathrm{loc}}$ be any solution of (1.1) in $\mathcal{D}'$. Then
$\tfrac12 \int |\nabla v|^2 = \frac{d}{d-2} \int G(v).$ (2.24)
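Proof. Since $v \in L^\infty_{\mathrm{loc}}$, it follows from (1.1) and the elliptic regularity theory that $v \in W^{2,t}_{\mathrm{loc}}$, all $t < \infty$. Note that $\partial G(v)/\partial x_i = g(v) \cdot \partial v/\partial x_i$ in $\mathcal{D}'$. Indeed, choose a smoothing sequence $G_k$ for $G$ so that $G_k \to G$ uniformly on compact sets of $\mathbb{R}^n$ and $g_k = \nabla G_k$ tends to $g$ pointwise on $\mathbb{R}^n \setminus \{0\}$. We have $\partial G_k(v)/\partial x_i = g_k(v) \cdot \partial v/\partial x_i$, and thus, for $\phi \in C_0^\infty$,
$\int \frac{\partial}{\partial x_i}(G_k(v))\,\phi = -\int G_k(v)\,\frac{\partial\phi}{\partial x_i} \to -\int G(v)\,\frac{\partial\phi}{\partial x_i}$
and
$\int g_k(v) \cdot \frac{\partial v}{\partial x_i}\,\phi \to \int g(v) \cdot \frac{\partial v}{\partial x_i}\,\phi$
by dominated convergence (recall that $\partial v/\partial x_i = 0$ a.e. on the set $[v = 0]$). Next we multiply Eq. (1.1) by $\phi \sum_i x_i\,\partial v/\partial x_i$, where $\phi \in C_0^\infty$. Note that
$\int \phi\, g(v) \cdot \sum_i x_i \frac{\partial v}{\partial x_i} = -d \int G(v)\,\phi - \int G(v) \sum_i x_i \frac{\partial\phi}{\partial x_i},$
while
$\int (-\Delta v) \cdot \phi \sum_i x_i \frac{\partial v}{\partial x_i} = \left(1 - \frac{d}{2}\right) \int \phi\,|\nabla v|^2 + \sum_{i,j} \int \frac{\partial\phi}{\partial x_j}\, x_i\, \frac{\partial v}{\partial x_i} \cdot \frac{\partial v}{\partial x_j} - \frac12 \int |\nabla v|^2 \sum_i x_i \frac{\partial\phi}{\partial x_i}.$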
Finally we choose $\phi(x) = \phi_n(x) = \phi_1(x/n)$, where $\phi_1$ is any function in $C_0^\infty$ such that $\phi_1(x) = 1$ for $|x| \le 1$ and $\phi_1(x) = 0$ for $|x| \ge 2$. As $n \to \infty$ we obtain (2.24).
Proof of Theorem 2.2 Concluded. We have
$\tfrac12 \int |\nabla \bar u|^2 = \frac{d}{d-2} \int G(\bar u), \qquad \tfrac12 \int |\nabla v|^2 = \frac{d}{d-2} \int G(v),$
and on the other hand we also have
$\tfrac12 \int |\nabla \bar u|^2 = T\,[\int G(\bar u)]^{1-2/d}, \qquad \tfrac12 \int |\nabla v|^2 \ge T\,[\int G(v)]^{1-2/d}.$
Combining these relations we see that $\int G(v) \ge \int G(\bar u)$. However,
$S(v) = \tfrac12 \int |\nabla v|^2 - \int G(v) = \frac{2}{d-2} \int G(v),$
$S(\bar u) = \tfrac12 \int |\nabla \bar u|^2 - \int G(\bar u) = \frac{2}{d-2} \int G(\bar u),$
and we obtain $0 < S(\bar u) \le S(v)$ for all $v \in \mathcal{C} \cap L^\infty_{\mathrm{loc}}$, $v \not\equiv 0$, $-\Delta v = g(v)$ in $\mathcal{D}'$.
C. Regularity and Behavior at Infinity
In this section we shall only assume that $g : \mathbb{R}^n \to \mathbb{R}^n$ is any mapping bounded on bounded sets and such that $g(v) \cdot v \le C|v| + C|v|^p$, all $v \in \mathbb{R}^n$.
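As a cross-check of the algebra that closed the proof of Theorem 2.2 above, the two bookkeeping steps (the choice of the scaling parameter and the comparison of actions) can be written out as follows. This is a sketch in our own notation; $w$ stands for either the rescaled minimizer or a competing solution $v$.

```latex
% (a) Scaling: \bar u(x) = u(\theta x), with -\Delta u = T((d-2)/d) g(u).
\begin{align*}
-\Delta \bar u(x) = \theta^2\,(-\Delta u)(\theta x)
  = \theta^2\,T\,\tfrac{d-2}{d}\; g(\bar u(x))
\quad\Longrightarrow\quad
\theta^2 = \frac{d}{(d-2)\,T}.
\end{align*}
% (b) Pohozaev (2.24) combined with (2.14), for a solution w with \int G(w) > 0:
\begin{align*}
\frac{d}{d-2}\int G(w) = \tfrac12\int|\nabla w|^2
  \;\ge\; T\Bigl[\int G(w)\Bigr]^{1-2/d}
\quad\Longrightarrow\quad
\Bigl[\int G(w)\Bigr]^{2/d} \;\ge\; \frac{(d-2)\,T}{d},
\end{align*}
% with equality for w = \bar u, since (2.14) is scale invariant and is an
% equality for u. Hence \int G(v) >= \int G(\bar u), and
\begin{align*}
S(w) = \Bigl(\frac{d}{d-2} - 1\Bigr)\int G(w) = \frac{2}{d-2}\int G(w).
\end{align*}
```

The positivity $\int G(v) > 0$ needed in step (b) follows from (2.24) itself, because $v \not\equiv 0$ in $\mathcal{C}$ forces $\int |\nabla v|^2 > 0$.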
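Theorem 2.3. Let $v \in \mathcal{C}$ with $g(v) \in L^1_{\mathrm{loc}}$ be any solution of $-\Delta v = g(v)$ in $\mathcal{D}'$. Then
$v \in W^{2,q}_{\mathrm{loc}}$, all $q < \infty$ (and consequently $v \in C^{1,\alpha}$ for all $\alpha < 1$) (2.25)
and
$v \in L^\infty$ with $\lim_{|x| \to \infty} v(x) = 0.$ (2.26)
Assume, in addition, that
$g(v) \cdot v \le -C|v|^r$, all $v \in \mathbb{R}^n$ with $|v| < \delta,$ (2.27)
for some constants $C > 0$, $\delta > 0$, $1 \le r \le 2$. Then
if $r = 2$, $v(x)$ decays exponentially as $|x| \to \infty,$ (2.28)
if $1 \le r < 2$, $v(x)$ has compact support. (2.29)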
Proof. By Kato's inequality (see Kato [12]) we have
in -9', and thus
-AIv15-Av - =g(v)'v
ICI
=
Therefore
-Alvl+lvl
(2.30)
where
A=Clvlp-2[lvl> 1], so that A E L°l2 and p(SuppA) < eo. We deduce from (2.30) that
lvl
Applying Lemma A.I (in the appendix) with a= d/(d - 2) and fi = 2d/(d - 2), we see that f IviQ
all B with µ(B)
(2.31)
B
In order to prove (2.26) we note that (2.32)
Given c>0, we have, for some 6>0,
-AIvl2+lvl26], and thus Ivl2
(2.33)
where 0=CIvI°[lvl>6]. From (2.31) we deduce that 0 e LQ, all q< oo. Since, on the other hand, Ye L` for as IxI -oo. Using (2.33) we obtain
all 1
VELand Jim sup iv(x)l2 < CE.
This implies (2.26) since c is arbitrary. Therefore we have g(v) a L°° and consequently v E WoC'° for all q < oo. Finally we assume (2.27). Combining (2.26), (2.27), and (2.32) we see that
-AIvI2+2CIvi'<0 for lxi>R,
(2.34)
(R large enough). We easily deduce (2.28) and (2.29) from (2.34) by comparison with radial supersolutions. (When r = 2 this is standard, when I <- r < 2, see e. g. Benilan-
Brezis-Crandall [2]. A systematic survey of available methods for proving compact support can be found in the book of Diaz [23].) U III. The Two-Dimensional Case
Let $G : \mathbb{R}^n \to \mathbb{R}$ be continuous with $G(0) = 0$, and $G \in C^1(\mathbb{R}^n \setminus \{0\})$. We set
$g(v) = \nabla G(v)$ if $v \ne 0$, $\quad g(v) = 0$ if $v = 0$.
We make the following assumptions
$G(v) < 0$ for $0 < |v| \le \varepsilon$, for some $\varepsilon > 0,$ (3.1)
$G(v_0) > 0$ for some $v_0,$ (3.2)
$|g(v)| \le C + C|v|^{p-1}$ for all $v$, for some $p < \infty.$ (3.3)
The class $\mathcal{C}$ of functions is given in (1.6).
Theorem 3.1. Assume (3.1), (3.2), (3.3). Then
$T = \inf\{\tfrac12 \int |\nabla v|^2 \mid v \in \mathcal{C},\ v \not\equiv 0,\ \int G(v) \ge 0\}$ (3.4)
is achieved by some $u \in \mathcal{C}$, $u \not\equiv 0$ such that $\int G(u) = 0$. Moreover $u$ satisfies
$-\Delta u = g(u)$ in $\mathcal{D}'$, (3.5)
and
$0 < S(u) \le S(v),$ (3.6)
for all $v \in \mathcal{C}$ such that $v \not\equiv 0$ and $-\Delta v = g(v)$ in $\mathcal{D}'$.
It is important to note that Theorem 3.1 states that the unique minimizer of $S(u)$ in the set of functions that are in $\mathcal{C}$ and that satisfy (1.1) is, in fact, $u \equiv 0$. The existence of this trivial solution of (1.1) of lowest action is special to $d = 2$. It is the chief difficulty in the two-dimensional case for the obvious reason that the minimum of $\int |\nabla u|^2$ with $\int G(u) = 0$ would be $u \equiv 0$. Therefore we must impose the extra condition $u \not\equiv 0$. (Independently, Keller [21] introduced the $u \not\equiv 0$ constraint, but for $d \ge 3$. Berestycki et al. [3, 4] used it for $d = 2$.)
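The reason the constraint in (3.4) is $\int G(v) \ge 0$ rather than $\int G \ge 1$ as in (2.9) can be seen from how the two integrals scale when $d = 2$. A short sketch, in our own notation $v_\lambda(x) := v(\lambda x)$:

```latex
% Sketch: scaling in d = 2, with v_lambda(x) := v(lambda x), lambda > 0.
\begin{align*}
\int_{\mathbb{R}^2} |\nabla v_\lambda|^2\,dx &= \int_{\mathbb{R}^2} |\nabla v|^2\,dx ,
&
\int_{\mathbb{R}^2} G(v_\lambda)\,dx &= \lambda^{-2}\int_{\mathbb{R}^2} G(v)\,dx .
\end{align*}
% The Dirichlet integral is scale invariant, so the value of \int G can be
% rescaled freely without changing (1/2)\int |\nabla v|^2; only its sign matters.
% For a solution with \int G(v) = 0 the action reduces to the Dirichlet term:
\begin{align*}
S(v) = \tfrac12\int |\nabla v|^2 - \int G(v) = \tfrac12\int |\nabla v|^2 .
\end{align*}
```

This is also why the scaling argument that produced (2.14) for $d \ge 3$ gives no information here: the exponent $(d-2)/d$ vanishes.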
We do not have a general result (as in the d>-3 case) for the existence of a minimum in (3.4) without assuming the differentiability of G on R"\(O). However, if we assume that for some a>O, sup IG(tv)I
O
for all lul
Proof. Let $\{u^j\}$ be a minimizing sequence for (3.4). Note that $\mu([|u^j| > \varepsilon]) > 0$, since $u^j \not\equiv 0$ and $\int G(u^j) \ge 0$. On the other hand the expression $\int |\nabla u|^2$ is invariant under scaling. Thus we may always assume that
$\mu([|u^j| > \varepsilon]) = 1.$ (3.7)
Also, after a shift, we may assume that
$\mu(B \cap [|u^j| > \varepsilon/2]) \ge \alpha > 0,$ (3.8)
where $B$ is the unit ball (the argument is the same as for $d \ge 3$). The following lemma is needed in the proof of Theorem 3.1.
Lemma 3.1. We have that
$\int |u^j|^q\,[|u^j| > \varepsilon] \le C_q$, all $q < \infty$. (3.9)
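Proof. First we claim that
$\|\phi\|_q \le C_q\,\|\nabla\phi\|_2\,\mu(\operatorname{Supp}\phi)^{1/q}$ for all $1 \le q < \infty$, (3.10)
all $\phi \in L^1_{\mathrm{loc}}$, $\nabla\phi \in L^2$, $\mu(\operatorname{Supp}\phi) < \infty$.
The conclusion of Lemma 3.1 follows by choosing $\phi = (|u^j| - \varepsilon)_+$ in (3.10), and we obtain $\|(|u^j| - \varepsilon)_+\|_q \le C_q$.
Step (i). We start with the well-known inequality
$\|\phi\|_2 \le C\,\|\nabla\phi\|_1,$ (3.11)
(See e.g. Nirenberg [17].)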
V#II2-
(3.12)
with a= 2/q, q< oo, all 0 e Co. Step (iii). Smoothing by convolution, we see that (3.12) holds when 0 E Lm, V¢ E L2
and 0 has compact support. Step (iv). Inequality (3.12) still holds when 0 E L, V# a L2 and U(Suppo) < oo. Indeed use step (iii) with where X.(x) = x,(x/n) and X, E C' with y,,(x) = I for IxI < I. Note that II#VX"II -0 and II#Px"112-'0.
Step (v). Inequality (3.12) is valid for 4 e LL, V# E L2 and p(Supp#) < oo. Indeed, we can use Step (iv) on truncated #'s.
Step (vi). We obtain (3.10) from (3.12) by the Cauchy-Schwarz inequality.
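Steps (ii) and (vi) can be made explicit for even exponents $q = 2k$. The sketch below assumes a scalar $\phi \ge 0$ in $C_0^\infty(\mathbb{R}^2)$ (as is the case for $\phi = (|u^j| - \varepsilon)_+$ after approximation); the explicit constant $(k!)^{1/k} C$ is our own bookkeeping, and the remaining values of $q$ are covered by interpolation as the text indicates.

```latex
% Step (ii) for q = 2k: apply (3.11) to phi^k and use Cauchy-Schwarz.
\begin{align*}
\|\phi\|_{2k}^{k} = \|\phi^k\|_2
  \;\le\; C\,\|\nabla(\phi^k)\|_1
  = C\,k\,\|\phi^{k-1}\nabla\phi\|_1
  \;\le\; C\,k\,\|\phi\|_{2(k-1)}^{\,k-1}\,\|\nabla\phi\|_2 .
\end{align*}
% Induction on k, starting from \|\phi\|_2 <= C \|\nabla\phi\|_1:
\begin{align*}
\|\phi\|_{2k}^{k} \;\le\; k!\,C^{k}\,\|\nabla\phi\|_1\,\|\nabla\phi\|_2^{\,k-1}
\quad\Longrightarrow\quad
\|\phi\|_{2k} \;\le\; (k!)^{1/k}\,C\,
  \|\nabla\phi\|_1^{1/k}\,\|\nabla\phi\|_2^{\,1-1/k},
\qquad a = \tfrac1k = \tfrac2q .
\end{align*}
% Step (vi): \nabla\phi = 0 a.e. outside Supp(phi), so by Cauchy-Schwarz
\begin{align*}
\|\nabla\phi\|_1 \le \mu(\operatorname{Supp}\phi)^{1/2}\,\|\nabla\phi\|_2
\quad\Longrightarrow\quad
\|\phi\|_q \le C_q\,\|\nabla\phi\|_2\,\mu(\operatorname{Supp}\phi)^{a/2}
  = C_q\,\|\nabla\phi\|_2\,\mu(\operatorname{Supp}\phi)^{1/q} .
\end{align*}
```

The exponent $1/q$ on the measure of the support is also what dimensional analysis in $d = 2$ requires, since $\|\nabla\phi\|_2$ is scale invariant.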
Returning now to Theorem 3.1, we deduce from Lemma 3.1 that liu'Ile,a(Q)
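$u^j \to u$ a.e. on $\mathbb{R}^2$, $\quad \nabla u^j \rightharpoonup \nabla u$ weakly in $L^2(\mathbb{R}^2)$,
$\mu(B \cap [|u| \ge \varepsilon/2]) \ge \alpha > 0$ (in particular $u \not\equiv 0$), $\quad \mu([|u| \ge \varepsilon]) \le 1$.
Moreover, we have $G(u) \in L^1$. Indeed, writing $G = G_+ - G_-$, we have that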
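$\int G_+(u^j) = \int G_+(u^j)\,[|u^j| > \varepsilon] \le \int C|u^j|^p\,[|u^j| > \varepsilon] \le C.$
We also deduce that $u \in \mathcal{C}$ since $\mu([|u| > \varepsilon]) \le 1$ and $G(u) \in L^1$ [here we use assumption (3.1)]. Note that for any set $B$ of finite measure we have
$\int_B |u^j|^q \le C(q, |B|)$, all $q < \infty$, (3.13)
and, also in the limit
$\int_B |u|^q \le C(q, |B|)$, all $q < \infty$. (3.14)
Indeed we may write
$\int_B |u^j|^q \le \int |u^j|^q\,[|u^j| > \varepsilon] + \varepsilon^q \mu(B) \le C + \varepsilon^q \mu(B).$
Let us introduce the class of functions from $\mathbb{R}^2 \to \mathbb{R}^n$,
$\mathcal{H} = \{\phi \mid \phi \in L^1_{\mathrm{loc}},\ \nabla\phi \in L^2,\ \mu(\operatorname{Supp}\phi) < \infty\}.$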
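Lemma 3.2. Let $\phi \in \mathcal{H}$ be such that
$\int G(u + \phi) - G(u) > 0.$ (3.15)
Then
$\int \nabla u \cdot \nabla\phi + \tfrac12 \int |\nabla\phi|^2 \ge 0.$ (3.16)
Note that $\int G(u + \phi)$ makes sense for $\phi \in \mathcal{H}$; indeed if $B = \operatorname{Supp}\phi$ then
$\int |G(u + \phi)| \le \int_B |G(u + \phi)| + \int_{\mathbb{R}^2 \setminus B} |G(u)| < \infty.$
Proof. We have, with $B = \operatorname{Supp}\phi$,
$\int G(u^j + \phi) = \int_B G(u^j + \phi) + \int_{\mathbb{R}^2 \setminus B} G(u^j) \ge \int_B [G(u^j + \phi) - G(u^j)] \to \int_B G(u + \phi) - G(u).$
For the last assertion we note that $G(u^j + \phi) - G(u^j) \to G(u + \phi) - G(u)$ a.e. On the other hand, if $A \subset B$ we have
$\int_A |G(u^j + \phi) - G(u^j)| \le \int_A (C + C|u^j|^p + C|\phi|^p),$
and the last term can be made arbitrarily small by choosing $\mu(A)$ small enough. We conclude the proof by Egorov's or Vitali's lemma. Thus, if (3.15) holds, we have $\int G(u^j + \phi) > 0$ for $j$ large enough, and therefore $\tfrac12 \int |\nabla(u^j + \phi)|^2 \ge T$. Since $\tfrac12 \int |\nabla u^j|^2 \to T$, we obtain (3.16) in the limit. □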
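Lemma 3.3. There is a constant $C_1$ (depending only on $G$) such that if $\phi \in \mathcal{H}$, then
$\int g(u) \cdot \phi - C_1 \int |\phi|\,[u = 0] > 0$ implies $\int \nabla u \cdot \nabla\phi \ge 0.$ (3.17)
Note that $\int g(u) \cdot \phi$ makes sense since $g(u) \in L^2(B)$ and $\phi \in L^2(B)$ (here $B = \operatorname{Supp}\phi$). On the other hand, $\int |\phi|\,[u = 0]$ also makes sense since $\phi \in L^1(B)$.
Proof. By dominated convergence we have, as $t \to 0$,
$\int [G(u + t\phi) - G(u)]\,[u \ne 0] = t \int g(u) \cdot \phi + o(t).$ (3.18)
On the other hand we have
$\left|\int G(t\phi)\,[u = 0]\right| \le C_1 |t| \int |\phi|\,[u = 0] + o(t)$ (3.19)
[here we use assumption (3.3) to deduce that $|G(v)| \le C_1|v| + C|v|^p$, all $v$]. Let $\phi \in \mathcal{H}$ be such that
$\int g(u) \cdot \phi - C_1 \int |\phi|\,[u = 0] > 0.$ (3.20)
We deduce from (3.18) and (3.19) that $\int [G(u + t\phi) - G(u)] > 0$ for $t > 0$ and small enough. Therefore, by Lemma 3.2,
$t \int \nabla u \cdot \nabla\phi + \tfrac{t^2}{2} \int |\nabla\phi|^2 \ge 0.$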
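As $t \to 0$ we have $\int \nabla u \cdot \nabla\phi \ge 0$, which is precisely (3.17). □
Lemma 3.4. Consider the linear functional $L(\phi) = \int \nabla u \cdot \nabla\phi$. There is some $\phi \in \mathcal{H}$ such that $L(\phi) \ne 0$ and $\phi = 0$ on $[u = 0]$.
Proof. Assuming the contrary, we should have $L(\phi) = 0$, for all $\phi \in \mathcal{H}$ such that $\phi = 0$ on $[u = 0]$. In particular, taking $\phi = (\phi_1, 0, \ldots, 0)$, we should have $\int \nabla u_1 \cdot \nabla\phi_1 = 0$, for all $\phi_1 \in \mathcal{H}$ such that $\phi_1 = 0$ on $[u_1 = 0]$. We choose $\phi_1 = (u_1 - \delta)_+$, $\delta > 0$. Then, from the above, $\int |\nabla(u_1 - \delta)_+|^2 = 0$, which implies $(u_1 - \delta)_+ = C$, which in turn implies $(u_1 - \delta)_+ \equiv 0$ (since $u_1 \to 0$ at infinity in the weak sense). Hence, $u_1 \le \delta$, and thus $u_1 \le 0$. The same argument applied to each component leads to $u \equiv 0$, which is a contradiction.
Lemma 3.5. There is a constant $k \ge 0$ such that
$\left|\int g(u) \cdot \phi - kL(\phi)\right| \le C_1 \int |\phi|\,[u = 0],$ (3.21)
for all $\phi \in \mathcal{H}$.
Proof. We fix some $\phi_0 \in \mathcal{H}$ such that $L(\phi_0) = -1$ and $\phi_0 = 0$ on $[u = 0]$. (See Lemma 3.4.) Given $\phi \in \mathcal{H}$, note that $W = \phi + L(\phi)\phi_0 + \alpha\phi_0$, $\alpha > 0$, satisfies $L(W) = -\alpha < 0$ and, by Lemma 3.3, we have that
$\int g(u) \cdot [\phi + L(\phi)\phi_0 + \alpha\phi_0] - C_1 \int |\phi|\,[u = 0] \le 0$
(since $\phi_0 = 0$ on $[u = 0]$). As $\alpha \to 0$ we find that
$\int g(u) \cdot \phi - kL(\phi) \le C_1 \int |\phi|\,[u = 0]$ for all $\phi \in \mathcal{H}$,
where $k = -\int g(u) \cdot \phi_0 \ge 0$ [by (3.17)]. By considering the two choices $\pm\phi$, we obtain (3.21).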
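Lemma 3.6. For $v \in H^1_{\mathrm{loc}}$ we have that
$\partial G(v)/\partial x_i = g(v) \cdot \partial v/\partial x_i$ in $\mathcal{D}'$. (3.22)
Note that $g(v) \cdot \partial v/\partial x_i \in L^1_{\mathrm{loc}}$ and $G(v) \in L^1_{\mathrm{loc}}$ so that (3.22) makes sense in $\mathcal{D}'$.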
pointwise on R.
Proof. Choose a smoothing sequence Gk for G so that .9k = VGk tends to g pointwise on R"\{O}. Moreover,
IGk(v)I
1.
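We have that $\partial G_k(v)/\partial x_i = g_k(v) \cdot \partial v/\partial x_i$, and thus, for $\phi \in C_0^\infty$,
$\int \phi\,\partial G_k(v)/\partial x_i = -\int G_k(v)\,\partial\phi/\partial x_i \to -\int G(v)\,\partial\phi/\partial x_i$
and
$\int \phi\, g_k(v) \cdot \partial v/\partial x_i \to \int \phi\, g(v) \cdot \partial v/\partial x_i$
by dominated convergence (recall that $\partial v/\partial x_i = 0$ a.e. on the set $[v = 0]$).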
□
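Proof of Theorem 3.1 Concluded. The linear functional $M(\phi) = \int g(u) \cdot \phi - kL(\phi)$, $\phi \in C_0^\infty$, satisfies, by (3.21), $|M(\phi)| \le C_1\|\phi\|_{L^1}$. Thus by the Riesz representation theorem there is some function $h \in L^\infty(\mathbb{R}^2)$, $h : \mathbb{R}^2 \to \mathbb{R}^n$, such that $M(\phi) = \int h \cdot \phi$, for all $\phi \in C_0^\infty$. Moreover, by (3.21), we have $|\int h \cdot \phi| \le C \int |\phi|\,[u = 0]$, for all $\phi \in C_0^\infty$, and hence for all $\phi \in L^1$. Thus, $h = 0$ for $u \ne 0$ and, therefore, $g(u) + k\Delta u = [u = 0]\,h$. It follows that $k \ne 0$ (and thus $k > 0$), for otherwise $k = 0 \Rightarrow g(u) \equiv 0 \Rightarrow \partial G(u)/\partial x_j = 0$ by Lemma 3.6 $\Rightarrow G(u) = C \Rightarrow G(u) = 0$ (since $G(u) \in L^1$) $\Rightarrow$ for a.e. $x$ we have either $u(x) = 0$ or $|u(x)| > \varepsilon$. On the other hand, $|u| \in H^1_{\mathrm{loc}}$ and thus it has a mean value property; therefore we would have either $u = 0$ a.e. on $\mathbb{R}^2$ or $|u| \ge \varepsilon$ a.e. on $\mathbb{R}^2$. Both cases are excluded (since $u \not\equiv 0$). Hence we have proved that $k > 0$ and $u$ satisfies $-\Delta u = g(u)/k + [u = 0]\,h'$ for some $h' \in L^\infty$. It follows from the elliptic regularity theory that $u \in W^{2,q}_{\mathrm{loc}}$, all $q < \infty$, and therefore $\Delta u = 0$ a.e. on $[u = 0]$. Consequently $h' = 0$ a.e. on $[u = 0]$, i.e. we have
$-\Delta u = g(u)/k$ for some $k > 0$. (3.23)
When $d = 2$, Pohozaev's identity (the proof of which is similar to Lemma 2.4) states that $\int G(u) = 0$. On the other hand, since $\nabla u^j \rightharpoonup \nabla u$ weakly in $L^2$, we have, by lower semicontinuity, $\tfrac12 \int |\nabla u|^2 \le T$. Thus, in fact, $\tfrac12 \int |\nabla u|^2 = T$ and $u$ is a minimizer for (3.4). After scaling we can always assume that $u$ also satisfies $-\Delta u = g(u)$. Finally, if $v \in \mathcal{C}$ satisfies $-\Delta v = g(v)$ in $\mathcal{D}'$, then $v \in L^q_{\mathrm{loc}}$, all $q < \infty$, $\Rightarrow g(v) \in L^q_{\mathrm{loc}} \Rightarrow v \in L^\infty_{\mathrm{loc}}$. By Pohozaev's identity we have $\int G(v) = 0$, and thus if $v \not\equiv 0$ we obtain $\tfrac12 \int |\nabla v|^2 \ge T$. Therefore,
$S(v) = \tfrac12 \int |\nabla v|^2 \ge T = \tfrac12 \int |\nabla u|^2 = S(u).$ □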
Behavior at Infinity
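Here we assume only that $g : \mathbb{R}^n \to \mathbb{R}^n$ is any mapping such that for some $p < \infty$,
$|g(v)| \le C + C|v|^{p-1}$ for all $v \in \mathbb{R}^n$.
Theorem 3.2. Let $v \in \mathcal{C}$ be any solution of (1.1) in $\mathcal{D}'$. Then
$\lim_{|x| \to \infty} v(x) = 0.$
Assume in addition that $g(v) \cdot v \le -C|v|^r$ for all $v \in \mathbb{R}^n$ with $|v| < \delta$, for some constants $C > 0$, $\delta > 0$, $1 \le r \le 2$. Then
(i) if $r = 2$, $v(x)$ decays exponentially as $|x| \to \infty$,
(ii) if $1 \le r < 2$, $v(x)$ has compact support.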
Proof. For any 6>0 the function ¢=(Ivl-b)+ satisfies ¢eLI
VoeL2, p(Suppo) < oo, and thus, by (3.10), O e LQ(R2) for all q < oo. Hence Iv1Q[Iu1>6]0. We note that
-AIv12=
Given any a>0, we have, for some S>0, -AIv12+Iu125CIrl y1112+CIuhSa+C(IvI2+IvI°)[lul>8]'
577
With H. Brezis in Commun. Math. Phys. 96, 97-113 (1984) H. Brezis and E. H. Lieb
and thus $|v|^2 \le C\alpha + (Y * \psi)$, where $Y$ is the Yukawa (or Bessel) potential and $\psi$ is the term supported on $[|v| > \delta]$ on the right side of the preceding inequality. Thus $\psi \in L^q$, all $1 \le q < \infty$. On the other hand $Y \in L^t$ for all $1 \le t < \infty$. It follows that $(Y * \psi) \to 0$ as $|x| \to \infty$. Therefore, $\limsup_{|x| \to \infty} |v(x)|^2 \le C\alpha$, which implies that $v(x) \to 0$ as $|x| \to \infty$, since $\alpha$ is arbitrary. The rest of the proof is the same as in Theorem 2.3. □
Appendix
Lemma A.I. Let I 0. We assume that f 5 I + Y* (Af ), where * denotes convolution. Then
Ilfl°
Let
X
be
the characteristic function
of BvSuppA. We have
Xf<X+X[Y*AXf]. Let g=Xf, whence g<X+X[Y*(Ag)], and geLA with .u(Suppg) < oc. Let Q : c F-+ Y * (A(b). Note that Q is a well defined bounded operator from L' into L' for all a
(1/iJ-1/a',
tl/(fl+1),
if 0 _a'.
Note that /i, > ft. We shall prove that g e LL g e L°'. Iterating this fact with replaced by /3, we find that geL"k for an increasing sequence fl,,-+oc. This will prove the lemma. Write A = A, + A2 with A, a L' and A 2 such that K : 01-+ Y* (A 2c5) is a bounded operator from Lfl into La and L" into La' with norm
< 1. We have that
g :! [X+X(Y*(A,g))]+[Y*(A2g)]=h+Kg. Note that he La'. We have that m
g< Y_ K'h+Km+tg. =t
K'h is a norm convergent series in La' while K'+'g-+0 in L. Thus g e La'. Lemma A.1 is closely related to, and in fact implies some results in [8]. References
1. Affleck, I.: Two dimensional disorder in the presence of a uniform magnetic field. J. Phys. C 16, 5839-5848 (1983)
2. Benilan, Ph., Brezis, H., Crandall, M.: A semilinear equation in $L^1(\mathbb{R}^N)$. Ann. Scuola Norm. Sup. Pisa 2, 523-555 (1975)
3. Berestycki, H., Gallouet, Th., Kavian, O.: Equations de champs scalaires Euclidiens non linéaires dans le plan. Compt. Rend. Acad. Sci. 297, 307-310 (1983)
4. Berestycki, H., Gallouet, Th., Kavian, O.: Semilinear elliptic problems in $\mathbb{R}^2$ (in preparation)
5. Berestycki, H., Lions, P.-L.: Existence of stationary states of non-linear scalar field equations. In: Bifurcation phenomena in mathematical physics and related topics. Bardos, C., Bessis, D. (eds.). Proc. NATO ASI, Cargese, 1979, Reidel, 1980
6. Berestycki, H., Lions, P.-L.: Nonlinear scalar field equations. I. Existence of a ground state. Arch. Rat. Mech. Anal. 84, 313-345 (1983). See also II: Existence of infinitely many solutions. Arch. Rat. Mech. Anal. 84, 347-375 (1983). See also An O.D.E. approach to the existence of positive solutions for semilinear problems in $\mathbb{R}^N$ (with L.A. Peletier). Ind. Univ. Math. J. 30, 141-157 (1981). See also Une méthode locale pour l'existence de solutions positives de problèmes semilinéaires elliptiques dans $\mathbb{R}^N$. J. Anal. Math. 38, 144-187 (1980)
7. Berestycki, H., Lions, P.-L.: Existence d'états multiples dans les équations de champs scalaires non linéaires dans le cas de masse nulle. Compt. Rend. Acad. Sci. 297, I, 267-270 (1983)
8. Brezis, H., Kato, T.: Remarks on the Schrödinger operator with singular complex potentials. J. Math. Pures Appl. 58, 137-151 (1979)
9. Brezis, H., Lieb, E.H.: A relation between pointwise convergence of functions and convergence of functionals. Proc. Am. Math. Soc. 88, 486-490 (1983)
10. Coleman, S., Glaser, V., Martin, A.: Action minima among solutions to a class of Euclidean scalar field equations. Commun. Math. Phys. 58, 211-221 (1978)
11. Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. Berlin, Heidelberg, New York: Springer 1977
12. Kato, T.: Schrödinger operators with singular potentials. Israel J. Math. 13, 135-148 (1972)
13. Lieb, E.H.: Some vector field equations. In: Differential equations. Proc. of the Conference Held at the University of Alabama in Birmingham, USA, March 1983, Knowles, I., Lewis, R. (eds.). Math. Studies Series, Vol. 92. Amsterdam: North-Holland 1984
14. Lieb, E.H.: On the lowest eigenvalue of the Laplacian for the intersection of two domains. Invent. Math. 74, 441-448 (1983)
15. Lions, P.-L.: Principe de concentration-compacité en calcul des variations. Compt. Rend. Acad. Sci. 294, 261-264 (1982)
16. Lions, P.-L.: The concentration-compactness principle in the calculus of variations: The locally compact case, Parts I and II. Ann. Inst. H. Poincaré. Anal. Non-lin. (submitted)
17. Nirenberg, L.: On elliptic partial differential equations. Ann. Scuola Norm. Sup. Pisa 13, 115-162 (1959)
18. Pohozaev, S.I.: Eigenfunctions of the equation $\Delta u + \lambda f(u) = 0$. Sov. Math. Dokl. 6, 1408-1411 (1965)
19. Stampacchia, G.: Equations elliptiques du second ordre à coefficients discontinus. Montréal: Presses de l'Université de Montréal 1966
20. Strauss, W.A.: Existence of solitary waves in higher dimensions. Commun. Math. Phys. 55, 149-162 (1977)
21. Keller, C.: Large-time asymptotic behavior of solutions of nonlinear wave equations perturbed from a stationary ground state. Commun. Partial Diff. Equations 8, 1013-1099 (1983)
22. Strauss, W.A., Vazquez, L.: Existence of localized solutions for certain model field theories. J. Math. Phys. 22, 1005-1009 (1981)
23. Diaz, J.I.: Nonlinear partial differential equations and free boundaries. London: Pitman (in preparation)

Communicated by A. Jaffe
Received March 30, 1984; in revised form May 18, 1984
With H. Brezis in J. Funct. Anal. 62, 73-86 (1985)
Reprinted from JOURNAL OF FUNCTIONAL ANALYSIS, Vol. 62, No. 1, June 1, 1985
Printed in Belgium
All Rights Reserved by Academic Press, New York and London

Sobolev Inequalities with Remainder Terms

HAIM BREZIS
Département de Mathématiques, Université Paris VI, 4, Place Jussieu, 75230 Paris, Cedex 05, France
AND
ELLIOTT H. LIEB*
Departments of Mathematics and Physics, Princeton University, Princeton, New Jersey 08544
Communicated by the Editors
Received September 14, 1984
The usual Sobolev inequality in $\mathbb{R}^n$, $n \ge 3$, asserts that $\|\nabla f\|_2^2 \ge S_n \|f\|_{2^*}^2$, with $S_n$ being the sharp constant. This paper is concerned, instead, with functions restricted to bounded domains $\Omega \subset \mathbb{R}^n$. Two kinds of inequalities are established: (i) If $f = 0$ on $\partial\Omega$, then $\|\nabla f\|_2^2 \ge S_n \|f\|_{2^*}^2 + C(\Omega)\|f\|_{p,w}^2$ with $p = 2^*/2$ and $\|\nabla f\|_2^2 \ge S_n \|f\|_{2^*}^2 + D(\Omega)\|\nabla f\|_{q,w}^2$ with $q = n/(n-1)$. (ii) If $f \ne 0$ on $\partial\Omega$, then $\|\nabla f\|_2 + C(\Omega)\|f\|_{q,\partial\Omega} \ge S_n^{1/2}\|f\|_{2^*}$ with $q = 2(n-1)/(n-2)$. Some further results and open problems in this area are also presented. © 1985 Academic Press, Inc.
1. INTRODUCTION
The usual Sobolev inequality in $\mathbb{R}^n$, $n \ge 3$, for the $L^2$ norm of the gradient is
$\|\nabla f\|_2^2 \ge S_n \|f\|_{2^*}^2$, $2^* = 2n/(n-2)$, (1.1)
for all functions $f$ with $\nabla f \in L^2$ and with $f$ vanishing at infinity in the weak sense that $\mathrm{meas}\{x : |f(x)| > a\} < \infty$ for all $a > 0$ (see [12]). The sharp constant $S_n$ is known to be
$S_n = \pi n(n-2)[\Gamma(n/2)/\Gamma(n)]^{2/n}$. (1.2)
The constant $S_n$ is achieved in (1.1) if and only if
$f(x) = a[\varepsilon^2 + |x - y|^2]^{(2-n)/2}$ (1.3)
for some $a \in \mathbb{C}$, $\varepsilon \ne 0$, and $y \in \mathbb{R}^n$ [1, 2, 6, 7, 9, 11].

* Work partially supported by U.S. National Science Foundation Grant PHY-8116101A02.
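As a numerical sanity check on (1.2) and (1.3), the following minimal sketch evaluates the closed form for $S_n$ and compares it with the Sobolev quotient of the function in (1.3) taken with $a = 1$, $\varepsilon = 1$, $y = 0$. The function names, the substitution $r = \tan\theta$, and the midpoint rule are illustrative choices and not part of the paper.

```python
import math


def sobolev_constant(n: int) -> float:
    """Sharp constant S_n = pi*n*(n-2)*[Gamma(n/2)/Gamma(n)]^(2/n) of (1.2)."""
    if n < 3:
        raise ValueError("the formula is stated for n >= 3")
    return math.pi * n * (n - 2) * (math.gamma(n / 2) / math.gamma(n)) ** (2.0 / n)


def unit_sphere_area(n: int) -> float:
    """Surface area 2*pi^(n/2)/Gamma(n/2) of the unit sphere in R^n."""
    return 2.0 * math.pi ** (n / 2) / math.gamma(n / 2)


def sobolev_quotient_of_extremal(n: int, steps: int = 20000) -> float:
    """||grad f||_2^2 / ||f||_{2*}^2 for f(x) = (1 + |x|^2)^((2-n)/2).

    The radial integrals are computed with the midpoint rule after the
    substitution r = tan(theta), which maps [0, inf) onto [0, pi/2).
    """
    h = (math.pi / 2) / steps
    grad_sq = 0.0  # integral of |f'(r)|^2 r^(n-1) dr, without the factor (n-2)^2
    crit = 0.0     # integral of f(r)^(2*) r^(n-1) dr
    for i in range(steps):
        theta = (i + 0.5) * h
        s, c = math.sin(theta), math.cos(theta)
        grad_sq += s ** (n + 1) * c ** (n - 3) * h
        crit += s ** (n - 1) * c ** (n - 1) * h
    area = unit_sphere_area(n)
    grad_sq *= area * (n - 2) ** 2
    crit *= area
    # ||f||_{2*}^2 = (integral of f^(2*))^(2/2*) and 2/2* = (n-2)/n
    return grad_sq / crit ** ((n - 2) / n)
```

For $n = 3$ the closed form reduces to $3(\pi/2)^{4/3}$, and the quotient computed from (1.3) should agree with `sobolev_constant(n)` up to quadrature error.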
In this paper we consider appropriate modifications of (1.1) when $\mathbb{R}^n$ is replaced by a bounded domain $\Omega \subset \mathbb{R}^n$. There are two main problems:

PROBLEM A. If $f = 0$ on $\partial\Omega$, then (1.1) still holds (with $L^p$ norms in $\Omega$, of course), since $f$ can be extended to be zero outside of $\Omega$. In this case (1.1) becomes a strict inequality when $f \not\equiv 0$ (in view of (1.3)). However, $S_n$ is still the sharp constant in (1.1) (since $\|\nabla f\|_2/\|f\|_{2^*}$ is scale invariant). Our goal, in this case, is to give a lower bound to the difference of the two sides in (1.1) for $f \in H_0^1(\Omega)$. In Section II we shall prove the following inequalities (1.4) and (1.6):
$\|\nabla f\|_2^2 \ge S_n\|f\|_{2^*}^2 + C(\Omega)\|f\|_{p,w}^2$, (1.4)
where $C(\Omega)$ depends on $\Omega$ (and $n$), $p = n/(n-2) = 2^*/2$, and $w$ denotes the weak $L^p$ norm defined by
$\|f\|_{p,w} = \sup_A |A|^{-1/p'} \int_A |f(x)|\,dx$,
with $A$ being a set of finite measure $|A|$.
The inequality (1.4) was motivated by the weaker inequality in [3],
$\|\nabla f\|_2^2 \ge S_n\|f\|_{2^*}^2 + C_p(\Omega)\|f\|_p^2$, (1.5)
which holds for all $p < n/(n-2)$ (with $C_p(\Omega) \to 0$ as $p \to n/(n-2)$). The proof of (1.5) in [3] was very indirect compared to the proof of (1.4) given here. Inequality (1.4) is best possible in the sense that (1.5) cannot hold with $p = n/(n-2)$; this can be shown by taking the $f$ in (1.3), applying a cutoff function to make $f$ vanish on the boundary, and then expanding the integrals (as in [3]) near $\varepsilon = 0$. An inequality stronger than (1.4), and involving the gradient norm, is
$\|\nabla f\|_2^2 \ge S_n\|f\|_{2^*}^2 + D(\Omega)\|\nabla f\|_{q,w}^2$, (1.6)
with $q = n/(n-1)$. (The reason that (1.6) is stronger than (1.4) is that the Sobolev inequality has an extension to the weak norms, by Young's inequalities in weak $L^p$ spaces.) Among the open questions concerning (1.4)-(1.6) are the following:
(a) What are the sharp constants in (1.4)-(1.6)? Are they achieved? Except in one case, they are not known, even for a ball. If $n = 3$, $\Omega$ is a ball of radius $R$ and $p = 2$ in (1.5), then $C_2(\Omega) = \pi^2/(4R^2)$; however, this constant is not achieved [3].
(b) What can replace the right side of (1.4)-(1.6) when $\Omega$ is unbounded, e.g., a half-space?
(c) Is there a natural way to bound $\|\nabla f\|_2^2 - S_n\|f\|_{2^*}^2$ from below in terms of the "distance" of $f$ from the set of optimal functions (1.3)?

PROBLEM B. If $f \ne 0$ on $\partial\Omega$, then (1.1) does not hold in $\Omega$ (simply take $f \equiv 1$ in $\Omega$). Let us assume now that $\Omega$ is not only bounded but that $\partial\Omega$ (the boundary of $\Omega$) has enough smoothness. Then (1.1) might be expected to hold if suitable boundary integrals are added to the left side. In Section III we shall prove that for $f = \text{constant} \equiv f(\partial\Omega)$ on $\partial\Omega$
$\|\nabla f\|_2^2 + E(\Omega)|f(\partial\Omega)|^2 \ge S_n\|f\|_{2^*}^2$. (1.7)
On the other hand, if $f$ is not constant on $\partial\Omega$, then the following two inequalities hold:
$\|\nabla f\|_2^2 + F(\Omega)\|f\|_{H^{1/2}(\partial\Omega)}^2 \ge S_n\|f\|_{2^*}^2$, (1.8)
$\|\nabla f\|_2 + G(\Omega)\|f\|_{q,\partial\Omega} \ge S_n^{1/2}\|f\|_{2^*}$, (1.9)
with $q = 2(n-1)/(n-2)$, which is sharp. (Note the absence of the exponent 2 in (1.9).) In addition to the obvious analogues of questions (a)-(c) for Problem B, one can also ask whether (1.9) can be improved to
$\|\nabla f\|_2^2 + H(\Omega)\|f\|_{q,\partial\Omega}^2 \ge S_n\|f\|_{2^*}^2$. (1.10)
We do not know.
If $\Omega$ is a ball of radius $R$, we shall establish that the sharp constant in (1.7) is $E(\Omega) = \sigma_n R^{n-2}(n-2)$, where $\sigma_n$ is the surface area of the ball of unit radius in $\mathbb{R}^n$. With this $E(\Omega)$, (1.7) is a strict inequality. Given this fact, one suspects (in view of the solution to Problem A) that some term could be added to the right side of (1.7). However, such a term cannot be any $L^p(\Omega)$ norm of $f$, as will be shown.
To conclude this Introduction, let us mention two related inequalities. First, if one is willing to replace $S_n$ on the right side of (1.10) by the smaller constant $2^{-2/n}S_n$, then for a ball one can obtain the inequality
$\int |\nabla f|^2 + I(\Omega)\|f\|_{2,\partial\Omega}^2 \ge 2^{-2/n}S_n\|f\|_{2^*}^2$. (1.11)
This is proved in Section III. Inequalities related to (1.11) were derived by Cherrier [4] for general manifolds.
Second, one can consider the doubly weighted Hardy-Littlewood-Sobolev inequality [7, 10] which in some sense is the dual of (1.1), namely,
$\int\int f(x)f(y)|x - y|^{-\lambda}|x|^{-\alpha}|y|^{-\alpha}\,dx\,dy \le P\|f\|_p^2$ (1.12)
with $p' = 2n/(\lambda + 2\alpha)$, $0 < \lambda < n$, $0 \le \alpha < n/p'$. If $f$ is restricted to have support in a bounded domain $\Omega$ and if $P$ is (by definition) the sharp constant in $\mathbb{R}^n$, one should expect to be able to add some additional term to the left side of (1.12). When $p = 2$ this is indeed possible, and the additional term is a constant multiple of
$\{\int f(x)|x|^{-\alpha}\,dx\}^2$. (1.13)
This was proved in [5] for $n = 3$, $\lambda = 2$, $\alpha = 1/2$, and $\Omega$ being a ball, but the method easily extends (for a ball) to other $n$, $\lambda$. The result (1.13) further extends to general $\Omega$ (with the same constant) by using the Riesz rearrangement inequality. On the other hand, when $p \ne 2$, it does not seem to be easy to find the additional term on the left side of (1.12): at least we have not succeeded in doing so. This is an open problem. In particular, in Section III we prove that when $p = 6/5$, $n = 3$, $\lambda = 1$, $\alpha = 0$, one cannot even add $\|f\|_1^2$ to the left side of (1.12).
II. PROOF OF INEQUALITIES (1.4) AND (1.6)

Proof of Inequality (1.4). By the rearrangement inequality for the $L^2$ norm of the gradient we have
$\|\nabla f^*\|_2 \le \|\nabla f\|_2$ (2.1)
(see, e.g., [8]); in addition we have
$\|f^*\|_{2^*} = \|f\|_{2^*}$, $\|f^*\|_{p,w} = \|f\|_{p,w}$. (2.2)
Here, $f^*$ denotes the symmetric decreasing rearrangement of the function $f$ extended to be zero outside $\Omega$. Therefore, it suffices to consider the case in which $\Omega$ is a ball of radius $R$ (chosen to have the same volume as the original domain) and $f$ is symmetric decreasing.
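Let $g \in L^\infty(\Omega)$ and define $u$ to be the solution of
$\Delta u = g$ in $\Omega$, $u = 0$ on $\partial\Omega$. (2.3)
Let
$\phi(x) = f(x) + u(x) + \|u\|_\infty$ in $\Omega$, $\phi(x) = \|u\|_\infty (R/|x|)^{n-2}$ in $\Omega^c$. (2.4)
The Sobolev inequality in all of $\mathbb{R}^n$ applied to $\phi$ yields
$\int_\Omega |\nabla(f + u)|^2 + \|u\|_\infty^2 R^{n-2}(n-2)\sigma_n \ge S_n\|f\|_{2^*}^2$ (2.5)
since $f \ge 0$ and $u + \|u\|_\infty \ge 0$. Here
$\sigma_n = 2\pi^{n/2}/\Gamma(n/2)$
is the surface area of the unit ball in $\mathbb{R}^n$. Therefore, we find
$\int |\nabla f|^2 - 2\int fg + \int |\nabla u|^2 + k\|u\|_\infty^2 \ge S_n\|f\|_{2^*}^2$, (2.6)
where $k = R^{n-2}(n-2)\sigma_n$. Replacing $g$ by $\lambda g$ and $u$ by $\lambda u$ and optimizing with respect to $\lambda$ we obtain
$\int |\nabla f|^2 \ge S_n\|f\|_{2^*}^2 + (\int fg)^2 / [\int |\nabla u|^2 + k\|u\|_\infty^2]$. (2.7)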
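The optimization over $\lambda$ that leads from (2.6) to (2.7) can be written out as follows. This is a sketch of the elementary step only; the abbreviations $a$ and $b$ are introduced here for convenience and do not appear in the paper.

```latex
% Abbreviations (introduced for this sketch only):
%   a = \int f g, \qquad b = \int |\nabla u|^2 + k \|u\|_\infty^2 .
% Replacing (g, u) by (\lambda g, \lambda u) in (2.6) gives, for every real \lambda,
\[
  \int |\nabla f|^2 - S_n \|f\|_{2^*}^2 \;\ge\; 2\lambda a - \lambda^2 b .
\]
% The right side is a concave quadratic in \lambda, maximized at \lambda = a/b:
\[
  \max_{\lambda \in \mathbb{R}} \bigl( 2\lambda a - \lambda^2 b \bigr) = \frac{a^2}{b},
\]
% which is exactly the remainder term in (2.7):
\[
  \int |\nabla f|^2 \;\ge\; S_n \|f\|_{2^*}^2
  + \frac{\bigl(\int f g\bigr)^2}{\int |\nabla u|^2 + k \|u\|_\infty^2} .
\]
```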
In inequality (2.7) we can obviously maximize the right side with respect to $g$. In view of the definition of the weak norm we shall in fact restrict our attention to $g = 1_A$, namely, the characteristic function of some set $A$ in $\Omega$. We shall now establish some simple estimates for all the quantities in (2.7), in which $C_n$ generically denotes constants depending only on $n$:
$\int fg = \int_A f$, (2.8)
$\int |\nabla u|^2 \le C_n |A|^{1 + (2/n)}$, (2.9)
$\|u\|_\infty \le C_n |A|^{2/n}$. (2.10)
Indeed we have, by multiplying (2.3) by $u$ and using Hölder's inequality,
$\int |\nabla u|^2 = -\int_A u \le \|u\|_{2^*} |A|^{(1/2) + (1/n)} \le S_n^{-1/2} \|\nabla u\|_2 |A|^{(1/2) + (1/n)}$, (2.11)
which implies (2.9). Next we have, by comparison with the solution in $\mathbb{R}^n$,
$|u| \le C_n |x|^{-n+2} * 1_A \le C_n |A|^{2/n}$ (2.12)
since the function $|x|^{-n+2}$ belongs to $L_w^{n/(n-2)}$. Since $|A| \le |\Omega| = \sigma_n R^n/n$ we obtain
$\int |\nabla u|^2 + k\|u\|_\infty^2 \le C_n |A|^{4/n} R^{n-2}$. (2.13)
Hence (1.4) has been proved (for all $\Omega$) with a constant
$C(\Omega) = C_n |\Omega|^{(2-n)/n}$. (2.14)
Proof of Inequality (1.6). To a certain extent the previous proof can be imitated except for one important ingredient, namely, the rearrangement technique cannot be used since it is not true that $\|\nabla f^*\|_{q,w} \ge \|\nabla f\|_{q,w}$. Consequently we have to use a direct approach and the constant $D(\Omega)$ in (1.6) will not depend only on $|\Omega|$; it will in fact depend on the capacity of $\Omega$. It is an open question whether (1.6) holds with $D(\Omega)$ depending only on $|\Omega|$. Our result is that
$D(\Omega) = C_n/\mathrm{cap}(\Omega)$. (2.15)
We begin as before with (2.3), but (2.4) is replaced by
$\phi = f + u + \|u\|_\infty$ in $\Omega$, $\phi = \|u\|_\infty v$ in $\Omega^c$, (2.16)
where $v$ is the solution of
$\Delta v = 0$ in $\Omega^c$, $v = 1$ on $\partial\Omega$, (2.17)
with $v \to 0$ at infinity. By definition,
$\mathrm{cap}(\Omega) = \int_{\Omega^c} |\nabla v|^2$. (2.18)
Inequality (2.7) still holds but with the constant $k$ replaced by $k = \mathrm{cap}(\Omega)$. Also we note that (2.7) can be written as
$\int |\nabla f|^2 \ge S_n\|f\|_{2^*}^2 + (\int \nabla f\cdot\nabla u)^2 / [\int |\nabla u|^2 + k\|u\|_\infty^2]$, (2.19)
which holds for any $u \in C_0^\infty(\Omega)$. By density, (2.19) still holds for every $u$ in $H_0^1 \cap L^\infty$ (the reason is that for every such $u$ there is a sequence $u_j \in C_0^\infty(\Omega)$ with $u_j \to u$ in $H_0^1$ and $\|u_j\|_\infty \to \|u\|_\infty$).
We now choose $u$ to be the solution of (2.3) with
$g = \frac{\partial}{\partial x_i}[(\mathrm{sgn}\,\partial f/\partial x_i)\,1_A]$. (2.20)
This function $u$ is in $L^\infty$ as we now verify. We can write $u = w + h$, where $w$ satisfies $\Delta w = g$ in all of $\mathbb{R}^n$, namely,
$w = C_n |x|^{2-n} * g$. (2.21)
Clearly $h$ is harmonic and $h = -w$ on $\partial\Omega$. Therefore $\|h\|_\infty \le \|w\|_\infty$, and hence $\|u\|_\infty \le 2\|w\|_\infty$. On the other hand,
$w = C_n (\frac{\partial}{\partial x_i}|x|^{2-n}) * [(\mathrm{sgn}\,\partial f/\partial x_i)\,1_A]$,
and thus
$|w| \le C_n (n-2)|x|^{1-n} * 1_A$. (2.22)
Since $|x|^{1-n} \in L_w^{n/(n-1)}$ we obtain
$\|u\|_\infty \le 2\|w\|_\infty \le C_n |A|^{1/n}$.
Moreover,
$\int |\nabla u|^2 = \int (\mathrm{sgn}\,\partial f/\partial x_i)\,1_A\,(\partial u/\partial x_i) \le [\int |\nabla u|^2]^{1/2} |A|^{1/2}$
and thus
$\int |\nabla u|^2 \le |A|$. (2.24)
Finally, since $f = 0$ on $\partial\Omega$,
$\int \nabla f\cdot\nabla u = -\int f\,\Delta u = \int |\partial f/\partial x_i|\,1_A$. (2.25)
Using these estimates in (2.19) we find
$\int |\nabla f|^2 \ge S_n\|f\|_{2^*}^2 + C_n (\int_A |\partial f/\partial x_i|)^2 / (\mathrm{cap}(\Omega)\,|A|^{2/n})$,
since $|A|^{1-(2/n)} \le |\Omega|^{1-(2/n)} \le S_n^{-1}\,\mathrm{cap}(\Omega)$ by Sobolev's inequality applied to the function $\phi = v$ in $\Omega^c$ and $\phi = 1$ in $\Omega$. This completes the proof of (1.6) with the constant given in (2.15).
III. PROOFS OF (1.7)-(1.9) AND RELATED MATTERS

Proof of (1.8). Let us define
$\phi = f$ in $\Omega$, $\phi = w$ in $\Omega^c$, (3.1)
where $w$ is the harmonic function that vanishes at infinity and agrees with $f$ on $\partial\Omega$. Using $\phi$ in (1.1) we find
$\int_\Omega |\nabla f|^2 + \int_{\Omega^c} |\nabla w|^2 \ge S_n\|f\|_{2^*}^2$. (3.2)
On the other hand, we have
$\int_{\Omega^c} |\nabla w|^2 \le F(\Omega)\|f\|_{H^{1/2}(\partial\Omega)}^2$. (3.3)
This concludes the proof of (1.8).
Proof of (1.7). Now suppose that $f$ is a constant on $\partial\Omega$. We shall first investigate the case that $\Omega$ is a ball of radius $R$ centered at zero. In this case $w(x) = f(\partial\Omega) R^{n-2} |x|^{2-n}$. Inequality (3.2) then yields (1.7) with
$E(\Omega) = \mathrm{cap}(\Omega) = \sigma_n R^{n-2}(n-2) = \sigma_n (n-2)\{n|\Omega|/\sigma_n\}^{(n-2)/n}$. (3.4)
Furthermore, (1.7) is a strict inequality with this $E(\Omega)$ because the function $\phi$ in (3.1) is not of the form (1.3). Also, $E(\Omega)$ given by (3.4) is the sharp constant in (1.7). To see this we apply (1.7) with $f = f_\varepsilon$ given by (1.3) with $a = 1$ and $y = 0 =$ center of the ball. We have
$\int_{\mathbb{R}^n} |\nabla f_\varepsilon|^2 = S_n \|f_\varepsilon\|_{2^*,\mathbb{R}^n}^2$. (3.5)
On the other hand (as $\varepsilon \to 0$),
$\int_{\mathbb{R}^n} |\nabla f_\varepsilon|^2 = \int_\Omega |\nabla f_\varepsilon|^2 + \int_{\Omega^c} |\nabla f_\varepsilon|^2 = \int_\Omega |\nabla f_\varepsilon|^2 + \mathrm{cap}(\Omega)|f_\varepsilon(\partial\Omega)|^2 + o(1)$. (3.6)
Here we have to note that as $\varepsilon \to 0$, $f_\varepsilon(x) \to |x|^{2-n}$ for $|x| > R$ in the appropriate topologies. On the other hand,
$\int_{\mathbb{R}^n} |f_\varepsilon|^{2^*} - \int_\Omega |f_\varepsilon|^{2^*} = \int_{\Omega^c} |f_\varepsilon|^{2^*} \le C$.
Thus
$\|f_\varepsilon\|_{2^*,\mathbb{R}^n}^2 = \|f_\varepsilon\|_{2^*,\Omega}^2 + o(1)$. (3.7)
This proves that $E(\Omega)$ in (1.7) is greater than or equal to $\mathrm{cap}(\Omega)$ when $\Omega$ is a ball, and thus that (3.4) is sharp. The same calculation with $f_\varepsilon$ as above shows that if $\Omega$ is a ball there is no inequality of the type
$\int |\nabla f|^2 + \mathrm{cap}(\Omega)|f(\partial\Omega)|^2 \ge S_n\|f\|_{2^*}^2 + d\|f\|_1^2$ (3.8)
with $d > 0$, because the additional term $\|f_\varepsilon\|_1$ is of order one (it does not tend to zero) as $\varepsilon \to 0$.
Now we consider a general domain with $f(\partial\Omega) = \text{constant} \equiv C$. We can assume $C \ge 0$ and note that we can also assume $f \ge C$ in $\Omega$. (This is so because replacing $f$ by $|f - C| + C \ge f$ does not decrease the $L^{2^*}$ norm and leaves $\|\nabla f\|_2$ invariant.) Consider the function $g = f - C \ge 0$ which vanishes on $\partial\Omega$ and hence can be extended to be zero on $\Omega^c$. Apply to $g$ the rearrangement inequality for the $L^2$ norm of the gradient, as was done in Section II. Finally consider $\tilde{f} = g^* + C$ in the ball $\Omega^*$ whose volume is $|\Omega|$. Since $\tilde{f}(\partial\Omega^*) = C = f(\partial\Omega)$ we have
$\int_{\Omega^*} |\nabla\tilde{f}|^2 + E(\Omega^*)|f(\partial\Omega)|^2 \ge S_n\|\tilde{f}\|_{2^*,\Omega^*}^2$.
As we remarked, $\|\nabla f\|_2 \ge \|\nabla\tilde{f}\|_2$. Also since $f \ge C$, it is easy to check that $\|\tilde{f}\|_{2^*,\Omega^*} = \|f\|_{2^*,\Omega}$.
The conclusion to be drawn from this exercise is that (1.7) holds for general $\Omega$ with $E(\Omega)$ given by (3.4), namely, $\mathrm{cap}(\Omega^*)$. We also note that (1.7), with this $E(\Omega)$, is strict, since it is strict for a ball.

QUESTION. Is $E(\Omega)$ given by (3.4) the sharp constant in general?
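The two expressions for the capacity of a ball in (3.4) can be checked against each other, and against a direct evaluation of the Dirichlet energy of $w(x) = R^{n-2}|x|^{2-n}$ over the exterior of the ball. The sketch below is illustrative only: the function names, the substitution $r = R/s$, and the midpoint rule are assumptions made for this example, not part of the paper.

```python
import math


def unit_sphere_area(n: int) -> float:
    """sigma_n = 2*pi^(n/2)/Gamma(n/2)."""
    return 2.0 * math.pi ** (n / 2) / math.gamma(n / 2)


def ball_capacity(n: int, radius: float) -> float:
    """First form of (3.4): sigma_n * R^(n-2) * (n-2)."""
    return unit_sphere_area(n) * radius ** (n - 2) * (n - 2)


def ball_capacity_from_volume(n: int, volume: float) -> float:
    """Second form of (3.4): sigma_n (n-2) {n |Omega| / sigma_n}^((n-2)/n)."""
    sigma = unit_sphere_area(n)
    return sigma * (n - 2) * (n * volume / sigma) ** ((n - 2) / n)


def exterior_dirichlet_energy(n: int, radius: float, steps: int = 20000) -> float:
    """Integral of |grad w|^2 over |x| > R for w(x) = R^(n-2) |x|^(2-n).

    |grad w|^2 = (n-2)^2 R^(2n-4) r^(2-2n), so the radial integral is that of
    r^(1-n) over (R, inf).  With r = R/s it becomes R^(2-n) * int_0^1 s^(n-3) ds,
    which is evaluated here by the midpoint rule.
    """
    h = 1.0 / steps
    integral = sum(((i + 0.5) * h) ** (n - 3) for i in range(steps)) * h
    radial = radius ** (2 - n) * integral
    return unit_sphere_area(n) * (n - 2) ** 2 * radius ** (2 * n - 4) * radial
```

With $|\Omega| = \sigma_n R^n/n$, all three functions should return the same number up to quadrature error.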
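Proof of (1.9). Given $f$ in $\Omega$ we consider the harmonic function $h$ in $\Omega$ which equals $f$ on $\partial\Omega$. We write
$f = h + u$ (3.9)
with $u = 0$ on $\partial\Omega$ and thus
$\int |\nabla u|^2 \ge S_n\|u\|_{2^*}^2$. (3.10)
On the one hand
$\int |\nabla u|^2 = \int |\nabla(f - h)|^2 = \int |\nabla f|^2 - \int |\nabla h|^2$ (3.11)
(note that $\int_\Omega |\nabla h|^2 = \int_{\partial\Omega} h(\partial h/\partial n) = \int_{\partial\Omega} f(\partial h/\partial n) = \int_\Omega \nabla f\cdot\nabla h$). On the other hand, by the triangle inequality,
$\|u\|_{2^*} \ge \|f\|_{2^*} - \|h\|_{2^*}$. (3.12)
Inserting (3.11) and (3.12) in (3.10) we obtain
$\|\nabla f\|_2 + S_n^{1/2}\|h\|_{2^*} \ge S_n^{1/2}\|f\|_{2^*}$. (3.13)
It remains to show that
$\|h\|_{2^*} \le C\|f\|_{q,\partial\Omega}$ (3.14)
with $q = 2(n-1)/(n-2)$, which will complete the proof of (1.9). The proof of (3.14) is a standard duality argument. Indeed, let $\psi$ be the solution of
$\Delta\psi = Y$ in $\Omega$, $\psi = 0$ on $\partial\Omega$, (3.15)
where $Y$ is some arbitrary function in $L^t$, $t = 2n/(n+2)$. We have, by multiplying by $h$ and integrating by parts,
$\int_\Omega hY = \int_{\partial\Omega} f\,\frac{\partial\psi}{\partial n}$. (3.16)
However, the $L^p$ regularity theory shows that $\psi \in W^{2,t}$ with $\|\psi\|_{W^{2,t}(\Omega)} \le C\|Y\|_t$. In particular, $\|\nabla\psi\|_{W^{1,t}(\Omega)} \le C\|Y\|_t$ and, by trace inequalities,
$\|\partial\psi/\partial n\|_{r,\partial\Omega} \le C\|Y\|_t$, (3.17)
where
$\frac{1}{r} = \frac{1}{n-1}\left(\frac{n}{t} - 1\right)$, i.e., $r = 2(n-1)/n$. (3.18)
Therefore, by (3.16) and Hölder's inequality,
$|\int hY| \le C\|f\|_{q,\partial\Omega}\|Y\|_t$ (3.19)
where $1/r + 1/q = 1$. Since (3.19) holds for all $Y$ we conclude that
$\|h\|_{2^*} \le C\|f\|_{q,\partial\Omega}$.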
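The bookkeeping of exponents in the duality argument can be checked with exact rational arithmetic. The sketch below assumes the standard trace embedding $W^{1,t}(\Omega) \to L^{t(n-1)/(n-t)}(\partial\Omega)$ for $t < n$; the function name and the symbol $t$ for the exponent of $Y$ are choices made for this illustration.

```python
from fractions import Fraction


def duality_exponents(n: int):
    """Return (t, r, q) for the duality argument behind (3.14).

    t : conjugate exponent of 2* = 2n/(n-2), the space of the test function Y
    r : trace exponent of W^{1,t} on the boundary (assumed standard embedding)
    q : conjugate exponent of r, the boundary norm appearing in (1.9)
    """
    if n < 3:
        raise ValueError("n must be at least 3")
    two_star = Fraction(2 * n, n - 2)
    t = two_star / (two_star - 1)   # equals 2n/(n+2)
    r = t * (n - 1) / (n - t)       # equals 2(n-1)/n
    q = r / (r - 1)                 # equals 2(n-1)/(n-2)
    return t, r, q
```

For example, `duality_exponents(3)` gives $t = 6/5$, $r = 4/3$ and $q = 4$, in agreement with $q = 2(n-1)/(n-2)$.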
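Finally, we claim that there is no inequality of the type (1.9) with $q < 2(n-1)/(n-2)$. Indeed, suppose (1.9) holds with some such $q$. We choose $f = f_\varepsilon$ as in (1.3) with $a = 1$ and $y \in \partial\Omega$. It is obvious that as $\varepsilon \to 0$
$\int_\Omega |\nabla f_\varepsilon|^2 / \int_{\mathbb{R}^n} |\nabla f_\varepsilon|^2 = 1/2 + o(1)$, $\int_\Omega |f_\varepsilon|^{2^*} / \int_{\mathbb{R}^n} |f_\varepsilon|^{2^*} = 1/2 + o(1)$,
while
$\int_{\mathbb{R}^n} |\nabla f_\varepsilon|^2 = S_n\|f_\varepsilon\|_{2^*,\mathbb{R}^n}^2$ and $\|f_\varepsilon\|_{q,\partial\Omega} / \|\nabla f_\varepsilon\|_{2,\mathbb{R}^n} = o(1)$.
This contradicts (1.9).

Remark. The last exercise with $f_\varepsilon$ given above shows that it is not possible to apply rearrangement techniques when $f$ is not constant on $\partial\Omega$, even if $\Omega$ is a ball. It also shows that there is no inequality for all $f \in H^1$ of the type $\|\nabla f\|_2^2 + C\|f\|_{q,\Omega}^2 \ge S_n\|f\|_{2^*}^2$ with $q < 2^*$.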
Proof of (1.11). Let $\Omega$ be a ball of radius $R$ centered at zero. For simplicity, assume $R = 1$. Define
$g(x) = f(x)$ for $|x| \le 1$, $g(x) = |x|^{2-n} f(x|x|^{-2})$ for $|x| \ge 1$, (3.20)
and apply the usual Sobolev inequality (1.1) to $g$. We note (by a change of variables) that
$\int_\Omega |g|^{2^*} = \int_{\Omega^c} |g|^{2^*}$,
$\int_\Omega |\nabla g|^2 = \int_{\Omega^c} |\nabla g|^2 - (n-2)\|f\|_{2,\partial\Omega}^2$. (3.21)
Inserting (3.21) into (1.1) yields (1.11) with $I(\Omega) = (n-2)/2$.
REMARK ON THE HARDY-LITTLEWOOD-SOBOLEV INEQUALITY

Consider the inequality (in $\mathbb{R}^3$)
$I(f) \le P\|f\|_{6/5}^2$ (3.22)
with
$I(f) = \int\int f(x)f(y)|x - y|^{-1}\,dx\,dy \ge 0$. (3.23)
The sharp constant $P$ is known to be [7]
$P = 4^{5/3}/[3\pi^{1/3}]$. (3.24)
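The value (3.24) can be cross-checked numerically from the optimizer $f(x) = (1 + |x|^2)^{-5/2}$ of (3.22). The sketch below uses the Euler equation for an unnormalized optimizer, $|x|^{-1} * f = P\|f\|_{6/5}^{4/5} f^{1/5}$, evaluated at $x = 0$ where $f(0) = 1$. The function names, the substitution $r = \tan\theta$, and the midpoint rule are illustrative assumptions.

```python
import math


def hls_constant_closed_form() -> float:
    """P = 4^(5/3) / (3 pi^(1/3)), the value quoted in (3.24)."""
    return 4.0 ** (5.0 / 3.0) / (3.0 * math.pi ** (1.0 / 3.0))


def hls_constant_from_extremal(steps: int = 20000) -> float:
    """Recover P from f(x) = (1 + |x|^2)^(-5/2) via the Euler equation at x = 0.

    (|x|^(-1) * f)(0) = 4 pi int_0^inf f(r) r dr and
    ||f||_{6/5}^{6/5} = 4 pi int_0^inf f(r)^(6/5) r^2 dr.
    Both radial integrals are computed with r = tan(theta).
    """
    h = (math.pi / 2) / steps
    potential = 0.0  # int_0^inf (1 + r^2)^(-5/2) r dr
    norm_pow = 0.0   # int_0^inf (1 + r^2)^(-3) r^2 dr
    for i in range(steps):
        theta = (i + 0.5) * h
        s, c = math.sin(theta), math.cos(theta)
        potential += s * c ** 2 * h
        norm_pow += s ** 2 * c ** 2 * h
    potential *= 4.0 * math.pi
    norm_pow *= 4.0 * math.pi
    # ||f||_{6/5}^{4/5} = (||f||_{6/5}^{6/5})^(2/3)
    return potential / norm_pow ** (2.0 / 3.0)
```

The two functions should agree up to quadrature error, which supports the reading $4^{5/3}/(3\pi^{1/3})$ of the constant.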
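Let $\Omega$ be a ball of radius one centered at zero and assume that $f = 0$ outside $\Omega$. In this case, (3.22) is strict because the only functions that give equality in (3.22) are of the form [7]
$f_\varepsilon(x) = a[\varepsilon^2 + |x - y|^2]^{-5/2}$. (3.25)
For $f = 0$ outside $\Omega$, we ask whether (3.22) can be improved to
$C\|f\|_1^2 + I(f) \le P\|f\|_{6/5}^2$. (3.26)
Our conclusion is that (3.26) fails for any $C > 0$. Take $f = \tilde{f}_\varepsilon \equiv 1_\Omega f_\varepsilon$, with $f_\varepsilon$ given by (3.25) and with $y = 0$ and with $a = a_\varepsilon$ chosen so that $\|f_\varepsilon\|_{6/5,\mathbb{R}^3} = 1$. The function $f_\varepsilon$ satisfies the following (Euler) equation on $\mathbb{R}^3$,
$|x|^{-1} * f_\varepsilon = P f_\varepsilon^{1/5}$. (3.27)
However, for $|x| < 1$
$(|x|^{-1} * f_\varepsilon)(x) = (|x|^{-1} * \tilde{f}_\varepsilon)(x) + K_\varepsilon$, (3.28)
where $K_\varepsilon$ is a constant bounded above by $D_\varepsilon = \int_{|x| \ge 1} f_\varepsilon$. Multiply (3.27) by $\tilde{f}_\varepsilon$ and integrate over $\Omega$. Then
$I(\tilde{f}_\varepsilon) + T_\varepsilon\|\tilde{f}_\varepsilon\|_1^2 \ge I(\tilde{f}_\varepsilon) + K_\varepsilon\int\tilde{f}_\varepsilon = P\|\tilde{f}_\varepsilon\|_{6/5}^{6/5} \ge P\|\tilde{f}_\varepsilon\|_{6/5}^2$ (3.29)
where $T_\varepsilon = D_\varepsilon/\int\tilde{f}_\varepsilon$. From (3.29), we see that (3.26) fails if $C > T_\varepsilon$ for any $\varepsilon > 0$. However, it is obvious that $T_\varepsilon \to 0$ as $\varepsilon \to 0$.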
REFERENCES

1. T. AUBIN, Problèmes isopérimétriques et espaces de Sobolev, C. R. Acad. Sci. Paris 280A (1975), 279-281; J. Diff. Geom. 11 (1976), 573-598.
2. G. A. BLISS, An integral inequality, J. London Math. Soc. 5 (1930), 40-46.
3. H. BREZIS AND L. NIRENBERG, Positive solutions of nonlinear elliptic equations involving critical Sobolev exponents, Comm. Pure Appl. Math. 36 (1983), 437-477.
4. P. CHERRIER, Problèmes de Neumann nonlinéaires sur les variétés Riemanniennes, J. Funct. Anal. 57 (1984), 154-206.
5. I. DAUBECHIES AND E. LIEB, One-electron relativistic molecules with Coulomb interaction, Comm. Math. Phys. 90 (1983), 497-510.
6. B. GIDAS, W. M. NI, AND L. NIRENBERG, Symmetry of positive solutions of nonlinear elliptic equations in $\mathbb{R}^n$, in "Mathematical Analysis and Applications" (L. Nachbin, Ed.), pp. 370-401, Academic Press, New York, 1981.
7. E. LIEB, Sharp constants in the Hardy-Littlewood-Sobolev and related inequalities, Ann. of Math. 118 (1983), 349-374.
8. E. LIEB, Existence and uniqueness of the minimizing solution of Choquard's nonlinear equation, Stud. Appl. Math. 57 (1977), 93-105.
9. G. ROSEN, Minimum value for c in the Sobolev inequality $\|\phi\|_6 \le c\|\nabla\phi\|_2$, SIAM J. Appl. Math. 21 (1971), 30-32.
10. E. STEIN AND G. WEISS, Fractional integrals in n-dimensional Euclidean space, J. Math. Mech. 7 (1958), 503-514. See also, G. HARDY AND J. LITTLEWOOD, Some properties of fractional integrals (1), Math. Z. 27 (1928), 565-606.
11. G. TALENTI, Best constant in Sobolev inequality, Ann. Mat. Pura Appl. 110 (1976), 353-372.
12. H. BREZIS AND E. LIEB, Minimum action solutions of some vector field equations, Comm. Math. Phys. 96 (1984), 97-113. See Remark 3 on p. 100.
Invent. Math. 102, 179-208 (1990)
Inventiones mathematicae

Gaussian kernels have only Gaussian maximizers*
Elliott H. Lieb
Department of Mathematics, Princeton University, Princeton, NJ 08544, USA
Oblatum 19-XII-1989

Abstract. A Gaussian integral kernel $G(x, y)$ on $\mathbb{R}^n \times \mathbb{R}^n$ is the exponential of a quadratic form in $x$ and $y$; the Fourier transform kernel is an example. The problem addressed here is to find the sharp bound of $G$ as an operator from $L^p(\mathbb{R}^n)$ to $L^q(\mathbb{R}^n)$ and to prove that the $L^p(\mathbb{R}^n)$ functions that saturate the bound are necessarily Gaussians. This is accomplished generally for $1 < p \le q < \infty$ and also for $p > q$ in some special cases. Besides greatly extending previous results in this area, the proof technique is also essentially different from earlier ones. A corollary of these results is a fully multidimensional, multilinear generalization of Young's inequality.

1. Introduction
The classic Hausdorff-Young-Titchmarsh [T] inequality for Fourier integrals states that for $1 < p < 2$ the Fourier transform on $L^p(\mathbb{R}^n)$ is a bounded map into $L^{p'}(\mathbb{R}^n)$ with a bound that is at most 1; here $1/p' + 1/p = 1$. In 1961 Babenko [BA] showed that when $p'$ is an even integer greater than 2 and $n = 1$ the bound is in fact less than 1, and he determined its value. This bound is achieved for Gaussian functions and Babenko states, but does not demonstrate explicitly, that Gaussians are the only functions with this property. Babenko's method was to apply analytic function theory to the Euler-Lagrange equation associated with the maximization problem.
The Fourier integral is but one example of a transform given by a Gaussian integral kernel $G(x, y)$, i.e., the exponential of a quadratic plus linear form in $x$ and $y$. In the Fourier transform case in $\mathbb{R}^n$ the kernel is $G(x, y) = \exp\{-2i(x, y)\}$. Another well known example in $\mathbb{R}^n$ is the purely real operator $\mathcal{G} = \exp\{t\Delta - 2tx\cdot\nabla\}$ on Gauss space (with measure $d\mu = \exp\{-|x|^2\}\,dx$) investigated by Nelson [N1; N2] as an operator from $L^p(\mathbb{R}^n, d\mu)$ to $L^q(\mathbb{R}^n, d\mu)$. In terms of Lebesgue measure, this amounts to considering the kernel
$G(x, y) = \exp\{-\frac{1}{q}|x|^2 + \frac{1}{p}|y|^2 - \frac{|y - cx|^2}{1 - c^2}\}$
from $L^p(\mathbb{R}^n)$ to $L^q(\mathbb{R}^n)$ for $0 \le c = e^{-t} < 1$. Nelson defined the operator $\mathcal{G}$ by $(\mathcal{G}f)(x) = \int G(x, y)f(y)\,dy$ and showed that $\mathcal{G}$ is bounded from $L^p(\mathbb{R}^n)$ to $L^q(\mathbb{R}^n)$ when $p \le q$ if and only if $(q - 1)c^2 \le p - 1$; he also derived the explicit value of the bound, which again is achieved when $f$ is a Gaussian. This is the famous hypercontractivity theorem. [In [N1] Nelson showed that $\mathcal{G}$ is bounded if $c$ is small enough; Glimm [GL] used this fact plus the spectral gap in the generator to show that $\mathcal{G}$ is a contraction on Gauss space for some still smaller $c$. Finally Nelson [N2] proved the sharp bound as stated above. In 1976 Neveu [NE] and Brascamp and Lieb [BL] found other proofs, and Simon [SI] found a proof for $p = 2$ and $q = 2, 4, 6, 8, \ldots$. Recently, Carlen and Loss [CL] have used their method of competing symmetries to construct another proof of the hypercontractivity theorem.] However, Nelson's method seems incapable of showing that Gaussians are the only maximizers; the proof of this fact, as well as a completely different proof (using rearrangement inequalities) of the hypercontractivity theorem was given by Brascamp and Lieb [BL]. The method in [CL] also yields uniqueness. Nelson's original proof used stochastic integrals and Gaussian processes in $\mathbb{R}^n$ (in fact it even extends to infinite dimensions). Segal [S] showed how to use Minkowski's inequality [HLP] to reduce the $\mathbb{R}^n$ case of Nelson's kernel to the $\mathbb{R}^1$ case; he also showed that $\mathcal{G}$ is a contraction on Gauss space for small $c$. The $\mathbb{R}^1$ case was simplified by Gross [G] who showed the equivalence of hypercontractivity with logarithmic Sobolev inequalities and built up one-dimensional Gauss measure from two-point measures via the central limit theorem. See the survey by Davies et al. [DGS].

* Work partially supported by U.S. National Science Foundation grant PHY-85-15288-A03

In his important 1975 paper, Beckner [B1; B2] used the Nelson-Gross machinery and the Hermite semigroup to settle the question raised by Babenko. By using the tensor product structure of Fourier transforms and an application of Minkowski's inequality related to, but distinct from, Segal's [S], he reduced the $\mathbb{R}^n$ case to the $\mathbb{R}^1$ case. He also showed that for all $1 \le p < 2$ the sharp constant in the Hausdorff-Young-Titchmarsh inequality is given by Gaussian functions, as found by Babenko. However, this method also leaves open the question of whether Gaussian functions are the only maximizers.
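Nelson's boundedness criterion quoted above, $(q - 1)c^2 \le p - 1$ with $c = e^{-t}$, is easy to turn into a small calculator. The sketch below is illustrative only; the function names are hypothetical and the code encodes nothing beyond the stated inequality.

```python
import math


def nelson_bounded(p: float, q: float, c: float) -> bool:
    """Nelson's criterion: for 1 < p <= q the operator with parameter c is
    bounded from L^p to L^q if and only if (q - 1) * c^2 <= p - 1."""
    if not (1.0 < p <= q):
        raise ValueError("the criterion is stated for 1 < p <= q")
    if not (0.0 <= c < 1.0):
        raise ValueError("c = exp(-t) must lie in [0, 1)")
    return (q - 1.0) * c * c <= p - 1.0


def critical_time(p: float, q: float) -> float:
    """Smallest t >= 0 such that c = exp(-t) satisfies the criterion,
    i.e. t = (1/2) * log((q - 1) / (p - 1))."""
    if not (1.0 < p <= q):
        raise ValueError("the criterion is stated for 1 < p <= q")
    return 0.5 * math.log((q - 1.0) / (p - 1.0))
```

For $p = 2$, $q = 4$ the critical time is $\frac{1}{2}\log 3$; any larger $t$ (smaller $c$) gives a bounded operator, and any smaller $t$ does not.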
Since then the Nelson-Gross-Beckner method has been extended to other complex (as distinct from purely real or purely imaginary) Gaussian kernels in $\mathbb{R}^n$ (i.e., the complex Mehler kernel) [C; E; J; W]. In this paper the general problem in $\mathbb{R}^n$ in the $p \le q$ case will be settled by a completely different method and, moreover, the maximizers will be shown to be Gaussian functions. Some of the $p > q$ cases will be settled as well.
Before discussing the earlier results in detail it is necessary to define the problem more completely. The most general Gaussian kernel on $\mathbb{R}^n \times \mathbb{R}^n$ is
$G(x, y) = \exp\{-(x, Ax) - (y, By) - 2(x, Dy) + 2(L, \binom{x}{y})\}$ (1.1)
and its action on complex valued, measurable functions $f : \mathbb{R}^n \to \mathbb{C}$ is formally given by
$(\mathcal{G}f)(x) = \int G(x, y)f(y)\,dy$. (1.2)
In (1.1) $A$, $B$ and $D$ are (complex) $n \times n$ matrices with $A$ and $B$ being symmetric while $L$ is a vector in $\mathbb{C}^{2n}$. The Fourier transform corresponds to $A = B = 0$, $L = 0$ and $D = iI$, with $I$ denoting the identity.
Notation. If $\alpha$ and $\beta$ are vectors in $\mathbb{C}^n$ then $(\alpha, \beta) = \sum_{j=1}^n \alpha_j\beta_j$ and not $\sum_{j=1}^n \bar{\alpha}_j\beta_j$. Lebesgue integration over $\mathbb{R}^n$ is denoted simply by $\int dx$ whenever the $n$ in question is clear from the context. The $L^p(\mathbb{R}^n)$ norm of a measurable function $f$ will be denoted by $\|f\|_p$, i.e., $\{\int |f(x)|^p\,dx\}^{1/p}$. The notation
$\begin{pmatrix} A & D \\ D^T & B \end{pmatrix} = M + iN$ (1.3)
will also be used, where $M$ and $N$ are real, symmetric $2n \times 2n$ matrices. The sole condition imposed on $G$ is that $M$ is positive semidefinite. $G$ is said to be nondegenerate if $M$ is positive definite, while $G$ is said to be degenerate if $M$ has a zero eigenvalue. The Fourier transform kernel and Nelson's kernel with $(q - 1)c^2 = (p - 1)$ are examples of degenerate kernels. The operator $\mathcal{G}$ should perhaps be written $\mathcal{G}_G$, but this will not be done since the pairing of $\mathcal{G}$ and $G$ will always be clear from the context.
The linear operator $\mathcal{G}$ associated to $G$ will be studied as an operator from $L^p(\mathbb{R}^n)$ to $L^q(\mathbb{R}^n)$ for $1 < p < \infty$ and $1 < q < \infty$. (The cases $p$ or $q = 1$ or $\infty$ can also be analyzed by the methods of this paper but they will be omitted since these cases involve extra technical considerations.) When $G$ is nondegenerate the definition of $\mathcal{G}$ in (1.2) makes sense (by Hölder's inequality) but if $G$ is degenerate then (1.2) is meaningless unless $f$ is also in $L^p(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$. Assuming that $\mathcal{G}$, when restricted to $L^p(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$, is bounded from $L^p(\mathbb{R}^n)$ to $L^q(\mathbb{R}^n)$ then, for any $f \in L^p(\mathbb{R}^n)$, $\mathcal{G}f \in L^q(\mathbb{R}^n)$ is uniquely defined by taking any sequence $f_j \in L^p(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$ that converges to $f$ in $L^p(\mathbb{R}^n)$ and then noting that $\mathcal{G}f = \lim_{j \to \infty} \mathcal{G}f_j$ is well defined since $\mathcal{G}f_j$ is a Cauchy sequence in $L^q(\mathbb{R}^n)$. This definition is well known and is, in fact, the way that the Fourier transform is defined when $1 < p < 2$.
Associated to $G$ and the numbers $p$ and $q$ with $1 \le p \le \infty$ and $1 \le q \le \infty$ is the ratio
$\mathcal{R}_{p \to q}(f) = \|\mathcal{G}f\|_q / \|f\|_p$ (1.4)
for $f \in L^p(\mathbb{R}^n)$, $f \ne 0$ and, in case $G$ is degenerate, $f \in L^1(\mathbb{R}^n)$ as well. The norm of $\mathcal{G}$ from $L^p(\mathbb{R}^n)$ to $L^q(\mathbb{R}^n)$ is defined to be
$C_{p \to q} = \sup_f \mathcal{R}_{p \to q}(f)$ (1.5)
in which the supremum is over the class of $f$'s just stated. In case $0 \ne f \in L^p(\mathbb{R}^n)$ and $C_{p \to q} < \infty$ and $\|\mathcal{G}f\|_q = C_{p \to q}\|f\|_p$ (using the above definition of $\mathcal{G}f$ as a limit when $G$ is degenerate) then $f$ is said to be a maximizer for $\mathcal{G}$ (or for $G$). If there is any ambiguity about the $G$ under discussion (e.g., in Theorem 3.3) the notation $\mathcal{R}_{p \to q}(G, f)$ and $C_{p \to q}(G)$ will be used.
Functions from R" to C of the form ,q(x) = p exp( - (x, ix) + (i, x);
(1.6)
597
Invent. Math. 102, 179-208 (1990) E.H. Lieb
182
with $0 \neq \mu \in \mathbf{C}$, $l \in \mathbf{C}^n$ and $J$ a symmetric $n \times n$ matrix with $\mathrm{Re}(J)$ positive definite will be called Gaussian functions. In case $L = 0$ in (1.1) or $l = 0$ in (1.6) then $G$ (resp. $g$) will be called a centered Gaussian kernel (resp. function). If $A$, $B$, $D$ and $L$ in (1.1) are real then $G$ is said to be a real Gaussian kernel. Likewise, if $J$ and $l$ (but not necessarily $\mu$) in (1.6) are real then $g$ is said to be a real Gaussian function.
A preliminary simplification of $G$ can be made. Without loss of generality it can be assumed that $A$ and $B$ are real matrices because the imaginary part of $B$ can be absorbed into $f$ in (1.4) without changing $\|f\|_p$. The imaginary part of $A$ can be omitted without changing $\|\mathcal{G}f\|_q$. For the same reason the vector $L$ can be assumed to be real. Furthermore, when $G$ is nondegenerate then we can also set $L$ (which is now real) equal to zero. The reason is simply that the affine change of variables $\binom{x}{y} \to \binom{x}{y} - V$, with $V$ being the unique solution of the equation $MV = L$ in $\mathbf{R}^{2n}$, eliminates the real linear term from (1.1) and merely changes $C_{p\to q}$ into $C_{p\to q}\exp\{(L, V)\}$. When $G$ is degenerate, $L$ can also be eliminated in the same way provided $MV = L$ has a solution. Because $\mathrm{Rank}(M) < 2n$ in the degenerate case, such a solution conceivably might not exist, but it turns out that a solution does indeed exist whenever $\mathcal{G}$ is bounded. This is the content of Lemma 2.2 below. Therefore, without loss of generality, the only $G$'s that need to be studied are those for which
(i) $A$ and $B$ are real, symmetric $n \times n$ matrices,
(ii) $L = 0$, i.e., $G$ is centered.
These assumptions will be made in the theorems in this paper. On the other hand, suppose that the supremum of $\mathcal{R}_{p\to q}(f)$ in (1.5) is taken over Gaussian functions only (which are automatically in $L^p(\mathbf{R}^n)$ for every $p$). Then, according to Lemma 2.3 below, only centered Gaussian functions need be considered in (1.5). This is a considerable simplification that is not altogether obvious and it is important in the application of Theorem 4.1 which states that this restricted supremum is all that need be considered.
The results of this paper can be summarized as follows. Three cases are treated. With the assumptions (i) and (ii) above,
(A) $D$ is real and $1 < p < \infty$, $1 < q < \infty$.
(B) $D$ is imaginary and either $1 < p \le 2$, $1 < q < \infty$ or $1 < p < \infty$, $2 \le q < \infty$.
(C) $D$ is complex and $1 < p \le q < \infty$.
If $G$ is nondegenerate then $\mathcal{R}_{p\to q}$ has exactly one maximizer and it is a centered Gaussian function. These are Theorems 3.2, 3.3 and 3.4. If $G$ is degenerate then in all cases
$$C_{p\to q} = \sup_g \mathcal{R}_{p\to q}(g), \tag{1.7}$$
where the supremum is over centered Gaussian functions. This is Theorem 4.1. Furthermore, if the supremum in (1.7) is achieved for some Gaussian function then, when $p < q$, every maximizer is a Gaussian function, as Theorem 4.5 shows. Theorem 4.3 gives a sufficient condition for the achievement of the maximum in (1.7) in the degenerate case; in Case (A) it is necessary as well. Thus, Case (A) is
Gaussian Kernels have only Gaussian Maximizers Gaussian kernels have only Gaussian maximizers
settled completely: in the degenerate case with $p \ge q$ there is no maximizer of any kind, while if $p < q$ all maximizers are Gaussian functions. In general, the question of the existence of a maximizer in the degenerate case is a subtle one. For the Fourier transform (which is both Case (B) and (C)), every function in $L^2(\mathbf{R}^n)$ is a maximizer when $p = q = 2$; on the other hand the seemingly harmless modification of the Fourier transform in 4.2(5) below is bounded but has no maximizer of any kind when $q = p' \ge 2$. When $q = p' > 2$ the Fourier transform on $\mathbf{R}^1$ has a three real parameter family of maximizers, $f(y) = \exp\{-Jy^2 + ly\}$ with $J > 0$ and $l \in \mathbf{C}$. When $p < q$ the convolution kernel $G(x, y) = \exp\{-(x - y)^2\}$ on $\mathbf{R}^1$ has a one real parameter family of maximizers, $f(y) = \exp\{-Jy^2 + ly\}$ with $l \in \mathbf{R}$ and $J = p'/q' - 1$; when $p = q$, $G$ is bounded but there is no maximizer (see 4.2 below). There does not seem to be any simple rule. In simple cases (which include all the standard ones in $\mathbf{R}^n$ and all the cases in $\mathbf{R}^1$) the existence of a Gaussian maximizer in (1.7) can be decided by computation. Otherwise, (1.7) reduces to a complicated algebraic problem and precise conditions are
not given here. Moreover it is not even proved that the absence of a Gaussian maximizer in (1.7) precludes the existence of a non-Gaussian maximizer, although a conjecture to this effect is made in 4.4.
All these results extend to Gaussian kernels on $\mathbf{R}^m \times \mathbf{R}^n$, in which $A$ is $m \times m$, $B$ is $n \times n$, $D$ is $m \times n$ and $L \in \mathbf{C}^{m+n}$. The proof is given in Sect. V. This generalization, while it is an easy one, does occur in applications, e.g., the entropy bound for coherent states in [L1].
Multilinear Gaussian forms are discussed in Sect. VI and it is proved there that the methods and results of Sects. II-V carry through for real forms. As an application of the real multilinear result in Sect. 6.1, the fully multidimensional Young inequality for K functions (which was left unresolved in [BL], p. 162) is proved in 6.2. The method of proof is, of course, quite different from that in [BL]; there, rearrangement inequalities were used and they were not flexible enough to encompass the fully multidimensional case.
The relationship of the results of this paper to earlier results on Gaussian kernels (beyond [BA; N1; N2; B1; B2]) can be summarized as follows. In 1976 Brascamp and Lieb [BL] found the norm for Case (A) in $\mathbf{R}^n$ (Theorem 7) and proved that Gaussian functions are the unique maximizer in $\mathbf{R}^1$ in the degenerate case (Theorem 13); this latter proof easily extends to $\mathbf{R}^n$ and to the nondegenerate case. In fact, by a simple change of variables (see the proof of Theorem 4.3 below) the $\mathbf{R}^n$ Case (A) reduces to a simple tensor product of $\mathbf{R}^1$ kernels. In 1979 Coifman et al. [C] used Beckner's result and an interpolation technique to deduce the norm for the complex Mehler kernel in $\mathbf{R}^1$ for $q = p' \ge 2$ (which is in Case (C)). In the same year Weissler [W] extended Nelson's and Beckner's results to the complex Mehler kernel in $\mathbf{R}^1$ with the exception of $2 < p < q < 3$ and $3/2 < p < q < 2$. In 1988, Epperson [E] found the norm for the following nondegenerate cases in $\mathbf{R}^1$: Case (C), Case (B), the case $p \ge 2 \ge q$. He also found the norm for certain $\mathbf{R}^1$ cases $q < p < 2$ and $2 < q < p$ with sufficiently nondegenerate kernels (Theorem 2.10), and for the $\mathbf{R}^1$ degenerate Case (C) if $A > 0$ and $B > 0$ (corresponding to Theorem 4.3 here).
The only complex cases in $\mathbf{R}^n$ that were known prior to Epperson's work were the simple tensor products of $\mathbf{R}^1$ kernels; these could be analyzed for $p \le q$ via Minkowski's inequality, as shown by Beckner [B1; B2]. Epperson was able
to handle the nondegenerate Case (C) for which there is an $n \times n$ complex symmetric matrix $W$ with $\|W\| \le 1$ such that $A = W(I - W^2)^{-1}W + \frac{1}{q}I$, $B = (I - W^2)^{-1} - \frac{1}{p}I$ and $D = W(I - W^2)^{-1}$. Here, $I$ is the identity matrix.
It will be seen from the above summary that all the previous cases, except for Epperson's $\mathbf{R}^1$ cases of $p \ge 2 \ge q$ and the special $q < p < 2$ and $2 < q < p$ cases, are covered in the cases (A), (B) and (C) treated in this paper. Moreover cases (A), (B) and (C) are resolved here in full $\mathbf{R}^n$ generality (i.e., not only for simple $n$-fold tensor products of $\mathbf{R}^1$ kernels). The main methodological point of this paper, however, is that all the previous results, except for [BL] and [BA], ultimately rely on the Nelson-Gross machinery which, while it is natural in its original context of quantum field theory and Gauss measures, is conceptually complicated in the context of general Gaussian kernels with Lebesgue measure. The two settings (Gauss measure and Lebesgue measure) for Gaussian kernels are mathematically equivalent, however, and the choice is a matter of taste. Lebesgue measure is used in this paper because it is felt that it is more natural to retain translation invariance (e.g., in the Fourier transform). Prior to Epperson's work all results in the field, except for [BL] and [BA], came from translating Gauss measure bounds for products of complex $\mathbf{R}^1$ Mehler kernels into $\mathbf{R}^n$ results via Beckner's Minkowski lemma. The proofs here use only Minkowski's inequality and simple facts about analytic functions (which appear to be unrelated to Babenko's use of analyticity; the Euler-Lagrange equation is not used).
Basically there is one idea that runs through Theorems 3.1, 3.3 and 4.5, although the technicalities are different in each. The main idea is to study $\mathcal{G} \otimes \mathcal{G}$ from $L^p(\mathbf{R}^{2n})$ to $L^q(\mathbf{R}^{2n})$ and use Minkowski's inequality. By considering the $\mathcal{G} \otimes \mathcal{G}$ maximizer $F(y_1, y_2) = f\big(\tfrac{y_1 + y_2}{\sqrt{2}}\big) f\big(\tfrac{y_1 - y_2}{\sqrt{2}}\big)$, where $f$ is a maximizer for $\mathcal{G}$, it is possible to conclude that $f$ must be a Gaussian. It will be noted that some of the proofs are long, and so it may appear at first that their structure is not really very simple. To a large extent the length is due to the fact that proving uniqueness raises technical considerations that would be absent if only inequalities were proved, e.g., it is not sufficient here to prove the inequalities for a dense set of smooth functions.
Apart from the extension to $\mathbf{R}^n$ (which is handled here in a natural way) the main new theorem in this paper is that a maximizer must be a Gaussian, and it is unique in the nondegenerate case. In the degenerate case $C_{p\to q}(G)$ is determined by examining only Gaussian functions and, if a Gaussian maximizer exists, every maximizer is a Gaussian. This is Theorem 4.5 and it can be useful as in [L1] and [L2]. Except for the real case [BL], it was previously known only that Gaussian functions were among the maximizers. The one exception to this rule was pointed out by Beckner (private communication) for the Fourier transform from $L^p(\mathbf{R}^n)$ to $L^{p'}(\mathbf{R}^n)$ with the restriction $p' > 4$. His proof that a maximizer must be a Gaussian function in this case uses a result in [BL]; the proof is
$$\|\hat f\|_{p'}^2 = \|(\hat f)^2\|_{r'} = \|\widehat{f * f}\|_{r'} \le (C_r)^n \|f * f\|_r \le \mu_r (C_r)^n \|f\|_p^2$$
with $r' = p'/2 > 2$, with $(C_r)^n$ being the sharp Beckner (or Babenko) constant for the Fourier transform (denoted by $\hat{\ }$), and with $\mu_r$ being the sharp constant in Young's convolution inequality which was derived simultaneously in [B1; B2] and in [BL]. A Gaussian function $f(y) = \exp\{-Jy^2 + ly\}$, with $J > 0$ and $l \in \mathbf{C}$ gives
equality above. However, [BL] (Theorem 13) proved that such functions are the only ones that give equality in Young's inequality.
It is a pleasure to acknowledge my debt to Eric Carlen. He helped to stimulate my interest in this problem and to understand the literature in the field. He also critically examined the work as it took shape. Thanks are also due to the Institute for Advanced Study for its hospitality during part of this work, and to Michael Loss for valuable discussions.
II. Some basic properties of Gaussians
2.1. Lemma (nondegenerate Gaussian kernels are compact and have maximizers). Let $G$ be a centered, nondegenerate Gaussian kernel in $\mathbf{R}^n \times \mathbf{R}^n$ as in (1.1) with $M$ in (1.3) positive definite and $L = 0$. Let $1 < p < \infty$ and $1 < q < \infty$. Then $\mathcal{G}$ in (1.2) is a compact operator from $L^p(\mathbf{R}^n)$ to $L^q(\mathbf{R}^n)$ and there is at least one maximizer $f \in L^p(\mathbf{R}^n)$ (i.e., $\mathcal{R}_{p\to q}(f) = C_{p\to q}$). Every such maximizer $f: \mathbf{R}^n \to \mathbf{C}$ has the following three properties, in which $\alpha$ and $\beta$ are positive constants that depend on $G$, $p$ and $q$ but not on $f$.
(a) There is an entire analytic function of order at most 2, $m: \mathbf{C}^n \to \mathbf{C}$, such that $f(x) = |m(x)|^{p'-2}\,\overline{m(x)}$ for $x \in \mathbf{R}^n$. Here $1/p + 1/p' = 1$. Moreover, for $z \in \mathbf{C}^n$,
$$|m(z)| \le \alpha \|f\|_p^{p-1} \exp\{\beta |z|^2\}.$$
(b) The function $|f|^{2(p-1)}$ from $\mathbf{R}^n$ to $\mathbf{R}$ has an extension to an entire analytic function from $\mathbf{C}^n$ to $\mathbf{C}$ whose order is at most 2. If $g: \mathbf{C}^n \to \mathbf{C}$ is this extension then for $z \in \mathbf{C}^n$
$$|g(z)| \le \alpha \|f\|_p^{2(p-1)} \exp\{\beta |z|^2\}.$$
(c) For $x \in \mathbf{R}^n$
$$|f(x)| \le \alpha \|f\|_p \exp\{-\beta (x, x)\}.$$
Finally, if $f_j \in L^p(\mathbf{R}^n)$ for $j = 1, 2, 3, \ldots$ is an $L^p$ bounded maximizing sequence for $G$ (i.e., $\mathcal{R}_{p\to q}(f_j) \to C_{p\to q}$) then there is a function $f \in L^p(\mathbf{R}^n)$ and a subsequence $j(1), j(2), \ldots$ such that $f_{j(k)} \to f$ strongly in $L^p(\mathbf{R}^n)$ as $k \to \infty$. If $f \neq 0$ (i.e., if $\|f_j\|_p \not\to 0$ as $j \to \infty$) then $f$ is a maximizer.
Proof. For any $f \in L^p(\mathbf{R}^n)$, Hölder's inequality can be used to deduce
$$|(\mathcal{G}f)(x)| \le T(x)\|f\|_p \tag{1}$$
with $T(x) = \|G(x, \cdot)\|_{p'}$. Simple computation shows that there are positive numbers $\gamma$ and $\delta$ depending only on $G$ and $p$ such that $|T(x)| \le \gamma \exp\{-\delta(x, x)\}$. The fact that $G$ is nondegenerate is crucial for this result. The fact that $T \in L^1(\mathbf{R}^n) \cap L^\infty(\mathbf{R}^n)$ shows that $\mathcal{G}$ is bounded from $L^p(\mathbf{R}^n)$ to $L^q(\mathbf{R}^n)$. Now suppose that $f_j \in L^p(\mathbf{R}^n)$ is a sequence that converges weakly in $L^p(\mathbf{R}^n)$ to some $f \in L^p(\mathbf{R}^n)$ as $j \to \infty$. Since, for each $x \in \mathbf{R}^n$, $G(x, \cdot)$ is in $L^{p'}(\mathbf{R}^n)$, it follows that $(\mathcal{G}f_j)(x) \to (\mathcal{G}f)(x)$ as $j \to \infty$ for each $x \in \mathbf{R}^n$. It can be assumed that the $f_j$ and $f$ satisfy $\|f_j\|_p$ and $\|f\|_p \le C$ for some $C > 0$ and hence, from (1), the functions $\mathcal{G}f_j$ and $\mathcal{G}f$ are bounded pointwise by the function $CT$. Since $T \in L^q(\mathbf{R}^n)$, $\|\mathcal{G}f_j - \mathcal{G}f\|_q \to 0$ by dominated convergence. Thus $\mathcal{G}$ takes weakly convergent sequences in $L^p(\mathbf{R}^n)$ into strongly convergent sequences in $L^q(\mathbf{R}^n)$, and so $\mathcal{G}$ is compact.
Now let $f_j$ be a bounded maximizing sequence, i.e., $\mathcal{R}_{p\to q}(f_j) \to C_{p\to q}$ as $j \to \infty$. We can assume $\|f_j\|_p = 1$ for each $j$. By the Banach-Alaoglu theorem, there is an $f \in L^p(\mathbf{R}^n)$ and a subsequence $j(1), j(2), \ldots$ such that $f_{j(k)} \to f$ weakly in $L^p(\mathbf{R}^n)$. As is well known, $\|f\|_p \le 1$. Then, by the strong convergence proved above,
$$C_{p\to q} = \lim_{k\to\infty} \|\mathcal{G}f_{j(k)}\|_q = \|\mathcal{G}f\|_q \le C_{p\to q}\|f\|_p \le C_{p\to q}.$$
This implies that $\|f\|_p = 1$ and that $f$ is a maximizer. Moreover, the fact that $\|f\|_p = 1$ implies (by the uniform convexity of the $L^p$ norm) that $f_{j(k)}$ converges to $f$ strongly in $L^p(\mathbf{R}^n)$. Thus, the first and last assertions of the lemma have been proved.
It remains to prove that a maximizer $f$ satisfies conditions (a), (b) and (c) and it suffices to assume that $\|f\|_p = 1$. There is a function $h \in L^{q'}(\mathbf{R}^n)$ such that $\|h\|_{q'} = 1$ and $C_{p\to q} = \|\mathcal{G}f\|_q = \int h(x)(\mathcal{G}f)(x)\,dx$. Let
$$m(y) = \int G(x, y)h(x)\,dx = e^{-(y, By)} \int e^{-(x, Ax) - 2(x, Dy)} h(x)\,dx \tag{2}$$
so that, as in the proof of (1) above, $|m(y)| \le W(y) \equiv \mu \exp\{-\nu(y, y)\}$ for suitable positive numbers $\mu$ and $\nu$ which depend only on $G$ and $q$. Hölder's inequality implies that the function $(x, y) \to h(x)G(x, y)f(y)$ is in $L^1(\mathbf{R}^n \times \mathbf{R}^n)$, and Fubini's theorem then implies that $\|\mathcal{G}f\|_q = \int m(y)f(y)\,dy$. If $m(y) = |m(y)|\exp\{i\theta(y)\}$, the optimum choice for $f$ is $f(y) = [|m(y)|/\|m\|_{p'}]^{p'-1}\exp\{-i\theta(y)\}$, for otherwise $\mathcal{R}_{p\to q}(f)$ can be increased. The function $m: \mathbf{R}^n \to \mathbf{C}$ has an extension to an entire analytic function on $\mathbf{C}^n$ of order at most 2. This can be seen easily from the representation (2) above and Hölder's inequality; if $y_j = u_j + iv_j$ for $j = 1, \ldots, n$ and $D = E + iH$ with $u_j$, $v_j$, $E$ and $H$ real then
$$|m(y)| \le \exp\{(v, Bv) - (u, Bu)\}\Big[\int \exp\{-q(x, Ax) - 2q(x, Eu) + 2q(x, Hv)\}\,dx\Big]^{1/q} = (\text{const.})\exp\{(v, Bv) - (u, Bu) + (Eu - Hv, A^{-1}(Eu - Hv))\}.$$
Thus $|m(y)| \le (\text{const.})\exp\{(\text{const.})[(u, u) + (v, v)]\}$ which implies that the order of $m$ is at most 2. This establishes conclusion (a). Since $m$ is entire, the function $\overline{m(\bar y)}$ (with the bar denoting complex conjugate) is also entire, and hence $N(y) \equiv m(y)\overline{m(\bar y)}$ is also entire with order at most 2 and with a pointwise bound that is independent of $f$. However, when $y \in \mathbf{R}^n$ (i.e., $v_j = 0$ for all $j$) then $N(y) = |m(y)|^2$. Conclusion (b) is then an immediate consequence of the relation between $f$ and $m$ which implies that for $y \in \mathbf{R}^n$, $|f(y)|^{2(p-1)} = \|m\|_{p'}^{-2}|m(y)|^2 = \|m\|_{p'}^{-2}N(y)$; thus $|f|^{2(p-1)}$ has an analytic extension of order at most 2, namely $\|m\|_{p'}^{-2}N$. It only has to be shown that $\|m\|_{p'}^{-2}$ is universally bounded, but this follows from the relation $C_{p\to q} = \|\mathcal{G}f\|_q = \int mf = \|m\|_{p'}$. Conclusion (c) follows from the fact that when $y \in \mathbf{R}^n$ then $|f(y)| = [|m(y)|/\|m\|_{p'}]^{p'-1} \le W(y)^{p'-1}\|m\|_{p'}^{1-p'}$. □
The next two lemmas validate the assertion in Sect. I that linear terms can be eliminated from Gaussians.
2.2. Lemma (elimination of linear terms from Gaussian kernels). Let $G$ be the (degenerate or nondegenerate) Gaussian kernel given in (1.1) with positive definite or semidefinite real quadratic form $M$ in (1.3) and with real linear term $L \in \mathbf{R}^{2n}$. Let $G_0$ denote the Gaussian kernel with no linear term, which is obtained from $G$ by setting
$L = 0$, i.e., $G_0(x, y) = G(x, y)\exp\{-2(L, \binom{x}{y})\}$. Let $1 < p \le \infty$ and $1 \le q < \infty$. Then the following conditions are equivalent.
(i) $\mathcal{G}_0$ is bounded from $L^p(\mathbf{R}^n)$ to $L^q(\mathbf{R}^n)$ and the equation $MV = L$ has a solution $V \in \mathbf{R}^{2n}$.
(ii) $\mathcal{G}$ is bounded from $L^p(\mathbf{R}^n)$ to $L^q(\mathbf{R}^n)$.
In case these conditions are both satisfied the relation between the norms is
$$C_{p\to q}(G) = C_{p\to q}(G_0)\exp\{(L, V)\}.$$
The number $(L, V)$ is uniquely defined even if the vector $V$ is not unique. $\mathcal{G}$ has a unique maximizer if and only if $\mathcal{G}_0$ has one.
Proof. (i) ⇒ (ii). This was explained in Sect. I. Simply change variables; writing $V = \binom{a}{b}$, let $x \to x + a$ and $y \to y + b$. Then
$$G \to G_0 \exp\{(L, V) - 2i(\tbinom{x}{y}, NV) - i(V, NV)\}.$$
The imaginary terms above do not affect the norm. Since $M$ is Hermitian $L$ must be orthogonal to $\mathcal{K} \equiv$ kernel of $M \subset \mathbf{R}^{2n}$, while any two solutions $V_1$ and $V_2$ differ by an element of $\mathcal{K}$. Thus, $(L, V)$ is unique. This change of variables also shows that $\mathcal{G}$ has a unique maximizer if and only if $\mathcal{G}_0$ has one.
(ii) ⇒ (i). Suppose that $MV = L$ has no solution. Then, since $M$ is Hermitian, $L$ is not orthogonal to $\mathcal{K}$ and thus there is a vector $W = \binom{s}{t} \in \mathcal{K}$ such that the number $P = (W, L)$ is positive. Make the change of variables $x \to x + s$ and $y \to y + t$. Then, since $MW = 0$, $G$ becomes
$$\tilde G(x, y) = G(x, y)\exp\{-i(W, NW) - 2i(\tbinom{x}{y}, NW) + 2P\}.$$
The change of variables is an isometry so the norm of $\tilde{\mathcal{G}}$ is the same as the norm of $\mathcal{G}$ and, since the imaginary terms are irrelevant, we have $C_{p\to q}(G) = C_{p\to q}(\tilde G) = e^{2P}C_{p\to q}(G)$. This is a contradiction since $C_{p\to q}(G) \neq 0$. Thus $MV = L$ has a solution and the same change of variables can be made as before to derive the relation between the norms of $\mathcal{G}$ and $\mathcal{G}_0$. □
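The change of variables in this proof amounts to completing the square. The following is a minimal numerical sketch of that identity in the real case (N = 0) for one 2 x 2 example. It assumes that the linear term enters the exponent of (1.1) as 2(L, z) with z = (x, y), which is what the definition of G_0 in the lemma suggests. The helper names (`dot`, `quad_form`, `solve_2x2`, `exponent_G`, `exponent_shifted`) and the sample values of M and L are illustrative choices, not taken from the paper.

```python
def dot(u, v):
    """Real inner product (u, v)."""
    return sum(a * b for a, b in zip(u, v))

def quad_form(M, z):
    """Quadratic form (z, M z) for a square matrix M given as a list of rows."""
    return dot(z, [dot(row, z) for row in M])

def solve_2x2(M, L):
    """Solve M V = L for an invertible 2 x 2 matrix M by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(L[0] * M[1][1] - M[0][1] * L[1]) / det,
            (M[0][0] * L[1] - M[1][0] * L[0]) / det]

def exponent_G(M, L, z):
    """Exponent of the kernel with linear term (assumed form): -(z, M z) + 2 (L, z)."""
    return -quad_form(M, z) + 2.0 * dot(L, z)

def exponent_shifted(M, L, V, z):
    """Exponent of the centered kernel evaluated at z - V, plus the constant (L, V)."""
    shifted = [zi - vi for zi, vi in zip(z, V)]
    return -quad_form(M, shifted) + dot(L, V)

# Illustrative data: M symmetric positive definite, L real.
M = [[2.0, 0.5], [0.5, 1.0]]
L = [1.0, -3.0]
V = solve_2x2(M, L)
```

For any point z, `exponent_G(M, L, z)` and `exponent_shifted(M, L, V, z)` agree up to rounding, which is the statement that shifting by V removes the linear term at the cost of the constant factor exp{(L, V)}.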
2.3. Lemma (elimination of linear terms from maximizers). Let $G$ be a centered Gaussian kernel (degenerate or nondegenerate) and let $1 < p < \infty$ and $1 < q < \infty$. Assume $\mathcal{G}: L^p(\mathbf{R}^n) \to L^q(\mathbf{R}^n)$ is bounded (which is automatically true in the nondegenerate case). If $g(x) = \exp\{-(x, Jx) + (l, x)\}$ is a Gaussian function that maximizes $\mathcal{R}_{p\to q}(g)$ among all Gaussian functions then $g_0(x) = \exp\{-(x, Jx)\}$ is also a maximizer. Moreover, if $\mathcal{R}_{p\to q}(g)$ does not have a maximizer among Gaussian functions (which can happen only if $G$ is degenerate) then the supremum of $\mathcal{R}_{p\to q}(g)$ over Gaussian functions equals the supremum over centered Gaussian functions. Finally, if $G$ is nondegenerate then $g = g_0$, i.e., $l = 0$, and therefore $g$ is centered.
Proof. Consider the functions $g_\lambda(x) = \exp\{-(x, Jx) + \lambda(l, x)\}$ with $\lambda$ a real parameter. Clearly $g_\lambda \in L^p(\mathbf{R}^n)$ for all $\lambda$ and, by a well known property of Gaussian
integrals,
$$\|g_\lambda\|_p = \|g_0\|_p\, e^{\alpha\lambda^2} \quad\text{and}\quad \|\mathcal{G}g_\lambda\|_q = \|\mathcal{G}g_0\|_q\, e^{\beta\lambda^2}$$
for some real constants $\alpha$ and $\beta$. There are three cases to be considered:
(i) $\alpha > \beta$. By setting $\lambda = 0$, $\mathcal{R}_{p\to q}$ is increased, i.e., $\mathcal{R}_{p\to q}(g_0) > \mathcal{R}_{p\to q}(g)$. This means that $g$ is not a maximizer, which is a contradiction.
(ii) $\alpha < \beta$. By letting $\lambda$ tend to infinity we conclude that $\mathcal{R}_{p\to q}$ (and hence also $\mathcal{G}$) is unbounded, which is a contradiction.
(iii) $\alpha = \beta$. In this case $g_\lambda$ is a maximizer for every $\lambda$ and hence $g_0$ is a maximizer, as claimed.
These considerations prove all but the last sentence of the lemma.
If $G$ is nondegenerate it is possible to go further. Consider the following sequence of functions with $\lambda = j$, namely $h_j = Z_j g_j$ for $j = 1, 2, 3, \ldots$, where the numbers $Z_j$ are chosen so that $\|h_j\|_p = 1$ for each $j$. This is a bounded maximizing sequence and, by a trivial modification of the last part of Lemma 2.1 (using the fact that a nonzero $L^p(\mathbf{R}^n)$ weak limit of Gaussian functions is a Gaussian function), there is a nonzero Gaussian function $h \in L^p(\mathbf{R}^n)$ and a subsequence $j(1), j(2), \ldots$ such that $h_{j(k)} \to h$ strongly in $L^p(\mathbf{R}^n)$ as $k \to \infty$. If $l \neq 0$, however, it is easy to check that $h_j \to 0$ weakly in $L^p(\mathbf{R}^n)$ as $j \to \infty$. This contradicts the supposed strong convergence to a nonzero function. □
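The scaling of the norm with the parameter lambda used in this proof can be checked numerically in the simplest setting. The sketch below is restricted to one dimension with real J > 0 and real l, where a direct Gaussian integral gives the exponent alpha = l^2 / (4J). That closed form is derived for this sketch only and is not quoted from the paper. The function names, the integration window and the step count are illustrative assumptions.

```python
import math

def lp_norm(func, p, lo=-20.0, hi=20.0, steps=20000):
    """Approximate the L^p norm of func on the real line with the midpoint rule."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * width
        total += abs(func(x)) ** p
    return (total * width) ** (1.0 / p)

def gaussian(J, l, lam):
    """Return g_lam(x) = exp(-J x^2 + lam * l * x) for real J > 0 and real l."""
    return lambda x: math.exp(-J * x * x + lam * l * x)

def alpha(J, l):
    """Exponent in ||g_lam||_p = ||g_0||_p * exp(alpha * lam^2) (1-D, real l)."""
    return l * l / (4.0 * J)

def norm_ratio(J, l, lam, p):
    """Numerical value of ||g_lam||_p / ||g_0||_p."""
    return lp_norm(gaussian(J, l, lam), p) / lp_norm(gaussian(J, l, 0.0), p)
```

For example, `norm_ratio(1.5, 0.8, 2.0, 3.0)` can be compared with `math.exp(alpha(1.5, 0.8) * 4.0)`. In this real one-dimensional case the exponent does not depend on p.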
III. Nondegenerate Gaussian kernels
A main ingredient in the following theorems is Minkowski's inequality for integrals. It was exploited by Beckner [B1; B2] to prove that the sharp bound for the tensor product of two operators (e.g., Fourier transforms) is often the product of the individual bounds. In particular, the bound for the Fourier transform from $L^p(\mathbf{R}^n)$ to $L^{p'}(\mathbf{R}^n)$ is $(C_p)^n$ where $C_p$ is the sharp constant for $\mathbf{R}^1$. A proof of Minkowski's inequality can be found in [HLP]. Of crucial importance here is the sharp form in which the necessary and sufficient condition for equality is specified; this condition was not used before to analyze Gaussian kernels.
3.1. Lemma (Minkowski's inequality). Let $f: \mathbf{R}^n \times \mathbf{R}^m \to [0, \infty]$ be Lebesgue measurable and let $1 \le r < \infty$. Suppose that the measurable function $M$, defined for almost every $x \in \mathbf{R}^n$ by
$$M(x) = \int_{\mathbf{R}^m} f(x, y)^r\,dy,$$
is finite for almost every $x$ and that $M^{1/r} \in L^1(\mathbf{R}^n)$. Then the measurable function
$$N(y) = \int_{\mathbf{R}^n} f(x, y)\,dx$$
is finite for almost every $y \in \mathbf{R}^m$ and
$$\int_{\mathbf{R}^m} N^r \le \Big\{\int_{\mathbf{R}^n} M^{1/r}\Big\}^r. \tag{$*$}$$
Furthermore, if $r > 1$ and if there is equality in ($*$) then there are nonnegative, measurable functions $A \in L^1(\mathbf{R}^n)$ and $B \in L^r(\mathbf{R}^m)$ such that $f(x, y) = A(x)B(y)$ for almost every $(x, y) \in \mathbf{R}^n \times \mathbf{R}^m$.
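A small finite example may help fix the direction of the inequality and the role of product functions. The sketch below replaces the Lebesgue integrals by finite sums with unit weights (counting measure on finite sets), which is a special case of the general measure-space form of the inequality and not the setting of the lemma itself. The function name and the sample arrays are illustrative assumptions.

```python
def minkowski_sides(f, r):
    """Return (lhs, rhs) for a nonnegative array f[i][j] sampling f(x_i, y_j).

    lhs = sum_j ( sum_i f[i][j] )^r            (discrete analogue of the integral of N^r)
    rhs = ( sum_i ( sum_j f[i][j]^r )^(1/r) )^r (discrete analogue of (integral of M^(1/r))^r)
    """
    n_x, n_y = len(f), len(f[0])
    lhs = sum(sum(f[i][j] for i in range(n_x)) ** r for j in range(n_y))
    rhs = sum(sum(f[i][j] ** r for j in range(n_y)) ** (1.0 / r)
              for i in range(n_x)) ** r
    return lhs, rhs

def product_array(A, B):
    """Build f[i][j] = A[i] * B[j], the product form singled out by the equality case."""
    return [[a * b for b in B] for a in A]

# A generic array gives strict inequality; a product array gives equality.
generic = [[1.0, 2.0, 0.5], [3.0, 0.1, 2.0]]
product = product_array([1.0, 2.0], [0.5, 1.0, 3.0])
```

With r = 3, `minkowski_sides(generic, 3.0)` returns a left side strictly smaller than the right side, while `minkowski_sides(product, 3.0)` returns two equal numbers up to rounding.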
Remark. This lemma extends to an arbitrary pair of measure spaces $(X, \mu)$ and $(Y, \nu)$ in place of $(\mathbf{R}^n, dx)$ and $(\mathbf{R}^m, dy)$ when $\mu$ and $\nu$ are sigma finite.
As a first application of Minkowski's inequality the uniqueness of maximizers for real, nondegenerate Gaussian kernels for all $p$ and $q$ will be proved. This is Case (A) of Sect. I. It is to be noted that the order of integration in Theorem 3.2 is as in [S] and is opposite to that of Theorem 3.4 and opposite to the order in Beckner's lemma. Analyticity considerations play only a subsidiary role in Theorem 3.2 and can be bypassed if desired, but they are important later. Theorem 3.2 was already essentially contained in [BL] Theorems 7 and 13. The following proof is offered because (i) it is different from the [BL] approach and (ii) it illustrates the techniques of the present paper.
3.2. Theorem (unique Gaussian maximizer for all p and q in the real nondegenerate case). Let $G$ be a real, nondegenerate, centered Gaussian kernel, i.e., the matrix $N$ in (1.3) is zero. Let $1 < p < \infty$ and $1 < q < \infty$. Then $\mathcal{G}$ has exactly one maximizer, $f$, (up to a multiplicative constant) from $L^p(\mathbf{R}^n)$ to $L^q(\mathbf{R}^n)$ and $f$ is a real, centered Gaussian, i.e., $f(x) = \exp\{-(x, Jx)\}$ with $J$ being a real, positive definite matrix.
Proof. Consider the linear operator $\mathcal{G}^{(2)} \equiv \mathcal{G} \otimes \mathcal{G}: L^p(\mathbf{R}^{2n}) \to L^q(\mathbf{R}^{2n})$ given by the Gaussian kernel $G^{(2)}((x_1, x_2), (y_1, y_2)) = G(x_1, y_1)G(x_2, y_2)$ with $x_1$, $x_2$, $y_1$ and $y_2$ in $\mathbf{R}^n$. The first goal is to prove that $C_{p\to q}(G^{(2)}) = C_{p\to q}(G)^2$. If $F \in L^p(\mathbf{R}^{2n})$ then $G^{(2)}((x_1, x_2), (y_1, y_2))F(y_1, y_2)$ is in $L^1(\mathbf{R}^{2n})$ for every $(x_1, x_2)$ because $G^{(2)}$ is nondegenerate. Fubini's theorem and Minkowski's inequality yield
$$\|\mathcal{G}^{(2)}F\|_q^q = \int\Big\{\int\Big|\int\Big(\int G(x_1, y_1)G(x_2, y_2)F(y_1, y_2)\,dy_1\Big)dy_2\Big|^q dx_1\Big\}dx_2 \tag{1}$$
$$\le \int\Big\{\int\Big[\int G(x_2, y_2)|K(x_1, y_2)|\,dy_2\Big]^q dx_1\Big\}dx_2 \tag{2}$$
(with $K(x_1, y_2) = \int G(x_1, y_1)F(y_1, y_2)\,dy_1$)
$$\le \int\Big\{\int\Big[\int G(x_2, y_2)^q |K(x_1, y_2)|^q\,dx_1\Big]^{1/q} dy_2\Big\}^q dx_2 \tag{3}$$
$$\le (C_{p\to q}(G))^q \int\Big\{\int G(x_2, y_2)\Big[\int |F(y_1, y_2)|^p\,dy_1\Big]^{1/p} dy_2\Big\}^q dx_2 \tag{4}$$
$$\le (C_{p\to q}(G))^{2q}\Big\{\iint |F(y_1, y_2)|^p\,dy_1\,dy_2\Big\}^{q/p}. \tag{5}$$
(Notes: (2) → (3) is Minkowski's inequality. (3) → (4) uses $C_{p\to q}(G) \ge \mathcal{R}_{p\to q}(F(\cdot, y_2))$ for each $y_2$. (4) → (5) uses $C_{p\to q}(G) \ge \mathcal{R}_{p\to q}\big((\int |F(y_1, \cdot)|^p\,dy_1)^{1/p}\big)$. The fact that $G(x, y) \ge 0$ is crucial. Here the $x_1$ integration was done before the $x_2$ integration; in Theorem 3.4 the $x_2$ integration will be done first.) Inequalities (1)-(5) establish that $C_{p\to q}(G^{(2)}) \le C_{p\to q}(G)^2$. Clearly, by considering $F$'s of the product form $F(y_1, y_2) = h(y_1)h(y_2)$, the reverse inequality is obtained, and so the goal is reached.
Suppose now that $F: \mathbf{R}^{2n} \to \mathbf{C}$ is a maximizer for $\mathcal{G}^{(2)}$. Since $G^{(2)}$ is nondegenerate, it has a maximizer by Lemma 2.1. Since $G(x, y) > 0$ for all $x$ and $y$, it is clear that $F = \lambda|F|$ and $|\lambda| = 1$, for otherwise replacing $F$ by $|F|$ will increase the quotient $\mathcal{R}_{p\to q}$ for $G^{(2)}$. It can be assumed henceforth that $F \ge 0$. Since $F$ is
a maximizer all the inequalities in (1)-(5) must be equalities. Equality of (2) and (3) implies, by Lemma 3.1, that for almost every $x_2$ there are measurable functions $A_{x_2}$ and $B_{x_2}: \mathbf{R}^n \to (0, \infty)$ such that
$$G(x_2, y_2)K(x_1, y_2) = A_{x_2}(x_1)B_{x_2}(y_2) \tag{6}$$
for almost every $x_1$ and $y_2$. Since $G > 0$, this equation can be divided by $G(x_2, y_2)$ to obtain $K(x_1, y_2) = A_{x_2}(x_1)E_{x_2}(y_2)$ with $E_{x_2}(y) \equiv B_{x_2}(y)/G(x_2, y)$. However, $K(x_1, y_2)$ is independent of $x_2$ and therefore if any particular value of $x_2$ is chosen for which (6) holds for almost every $x_1$ and $y_2$, and if the functions $A$ and $E: \mathbf{R}^n \to (0, \infty)$ are defined by $A \equiv A_{x_2}$ and $E \equiv E_{x_2}$ for this value of $x_2$, then
$$K(x_1, y_2) = A(x_1)E(y_2)$$
for almost every $x_1$ and $y_2$. If this equation is multiplied by $G(x_2, y_2)$ and integrated over $y_2$ the result is
$$(\mathcal{G}^{(2)}F)(x_1, x_2) = A(x_1)Z(x_2)$$
for almost every $x_1$ and $x_2$ with $Z = \mathcal{G}E$. Since $G^{(2)} > 0$, both $A$ and $Z$ are strictly positive functions.
There is a function $H \in L^{q'}(\mathbf{R}^{2n})$ with $\|H\|_{q'} = 1$, such that $\|\mathcal{G}^{(2)}F\|_q = \int H\,\mathcal{G}^{(2)}F$. In fact
$$H(x_1, x_2) = (\text{const.})[(\mathcal{G}^{(2)}F)(x_1, x_2)]^{q-1} = (\text{const.})A(x_1)^{q-1}Z(x_2)^{q-1}.$$
The point here is that $H$ is a product function. Then, as in the proof of Lemma 2.1, $F$ satisfies
$$F(y_1, y_2) = (\text{const.})\Big\{\iint G(x_1, y_1)G(x_2, y_2)H(x_1, x_2)\,dx_1\,dx_2\Big\}^{p'-1} = \alpha(y_1)\beta(y_2) \tag{7}$$
for some positive functions $\alpha$ and $\beta: \mathbf{R}^n \to (0, \infty)$. In brief, $F$ must be a product function, and this fact is crucial for the next step.
One example of a maximizer is $F(y_1, y_2) = f(y_1)f(y_2)$, where $f$ is an $L^p(\mathbf{R}^n) \to L^q(\mathbf{R}^n)$ maximizer for $G$ (whose existence is guaranteed by Lemma 2.1). For the reason given before about $F$, we can and do assume that $f(x) \ge 0$ for all $x \in \mathbf{R}^n$. A more interesting maximizer is
$$F(y_1, y_2) = f\Big(\frac{y_1 - y_2}{\sqrt{2}}\Big) f\Big(\frac{y_1 + y_2}{\sqrt{2}}\Big). \tag{8}$$
Here, the essential property of O(2) rotation invariance of products of centered Gaussians and of Lebesgue measure is being exploited. If $\theta$ is any fixed angle and if $x_1'$, $x_2'$, $y_1'$, $y_2'$ in $\mathbf{R}^n$ are defined by $x_1' = x_1\cos\theta - x_2\sin\theta$, $x_2' = x_1\sin\theta + x_2\cos\theta$, $y_1' = y_1\cos\theta - y_2\sin\theta$, $y_2' = y_1\sin\theta + y_2\cos\theta$, the O(2) invariance of Lebesgue measure is that $dx_1\,dx_2 = dx_1'\,dx_2'$ and $dy_1\,dy_2 = dy_1'\,dy_2'$. The O(2) invariance of centered Gaussian functions is that $g(x_1)g(x_2) = g(x_1')g(x_2')$, while for centered Gaussian kernels $G(x_1, y_1)G(x_2, y_2) = G(x_1', y_1')G(x_2', y_2')$. With the choice $\theta = \pi/4$, these observations lead to (8). Combining (7) and (8),
$$f\Big(\frac{y_1 - y_2}{\sqrt{2}}\Big) f\Big(\frac{y_1 + y_2}{\sqrt{2}}\Big) = \alpha(y_1)\beta(y_2) \tag{9}$$
for almost every $y_1$ and $y_2$.
Equation (9) implies that $f$ is a Gaussian. Instead of proving this in full generality for $L^p(\mathbf{R}^n)$ functions, as is done by Carlen [CA], it is easier to simplify the proof here by taking the $2(p - 1)$ power of (9) and by taking advantage of the analyticity result Lemma 2.1(b). Introducing $h \equiv f^{2(p-1)}$, $\gamma \equiv \alpha^{2(p-1)}$ and $\delta \equiv \beta^{2(p-1)}$, it is seen from (9) (by fixing $y_2$) that $\gamma$ is analytic; likewise $\delta$ is analytic. Thus, (9) holds for all $y_1$ and $y_2$ because when two analytic functions on $\mathbf{C}^n \times \mathbf{C}^n$ agree almost everywhere on $\mathbf{R}^n \times \mathbf{R}^n$ then they agree everywhere. Furthermore $f$ never vanishes for real $y$ because if $f(Y) = 0$ then, setting $y_1 = y_2 + \sqrt{2}\,Y$, we would have that $0 = \gamma(y_2 + \sqrt{2}\,Y)\delta(y_2)$ for all $y_2$; this is impossible, given that $\gamma$ and $\delta$ are analytic, unless $\gamma \equiv 0$ or $\delta \equiv 0$, which contradicts the assumption that $f \not\equiv 0$. Thus, the logarithms of $h$, $\gamma$ and $\delta$ are real analytic and
$$\ln\Big[h\Big(\frac{y_1 - y_2}{\sqrt{2}}\Big)\Big] + \ln\Big[h\Big(\frac{y_1 + y_2}{\sqrt{2}}\Big)\Big] = \ln[\gamma(y_1)] + \ln[\delta(y_2)]. \tag{10}$$
If $\partial_i$ denotes the derivative with respect to the $i$th coordinate, and $\partial_i$ with respect to $y_1$ and $\partial_j$ with respect to $y_2$ is taken in (10), then
$$(\partial_i\partial_j \ln h)\Big(\frac{y_1 - y_2}{\sqrt{2}}\Big) = (\partial_i\partial_j \ln h)\Big(\frac{y_1 + y_2}{\sqrt{2}}\Big)$$
which implies that the function $\partial_i\partial_j \ln h$ is a constant (call it $4(1 - p)J_{ij}$) and therefore, up to an additive constant, $\ln[f(y)] = \frac{1}{2(p-1)}\ln[h(y)] = -(y, Jy) + (l, y)$ for some vector $l$. According to Lemma 2.3, $l = 0$ since $G$ is centered and nondegenerate. This completes the proof that $f$ must be a centered Gaussian.
It remains to prove that $f$ is unique (i.e., the matrix $J$ above is unique). One way would be to compute $\mathcal{R}_{p\to q}(\exp\{-(x, Jx)\})$ for $G$ and then deduce that there is only one optimum $J$. A very much easier route is to suppose that there are two maximizers $f^1$ and $f^2$ with $f^i(y) = \exp\{-(y, J^i y)\}$. Then, for the same reason as before (O(2) symmetry) the function
$$F(y_1, y_2) = f^1\Big(\frac{y_1 - y_2}{\sqrt{2}}\Big) f^2\Big(\frac{y_1 + y_2}{\sqrt{2}}\Big) \tag{11}$$
is a maximizer for $\mathcal{G}^{(2)}$. There are two ways in which this implies that $f^1 = f^2$. The first is to use (7), namely $F$ must be a product function, and to note that this product structure is true if and only if $J^1 = J^2$. The second way is to note that since the $F$ in (11) is never zero and, since (3) → (4) must be an equality, we have that the function $y_1 \mapsto h_{y_2}(y_1) \equiv F(y_1, y_2)$ must be a maximizer for $\mathcal{G}$ for almost every $y_2$. Although the function $h_{y_2}$ is a Gaussian for each $y_2$, the Gaussian will have a linear term for each $y_2 \neq 0$ unless $J^1 = J^2$. However, Lemma 2.3 precludes the existence of such a linear term, so $J^1 = J^2$. □
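The direct route mentioned in the uniqueness argument, computing the ratio on centered Gaussians and locating the optimum, can be carried out explicitly in one dimension. The sketch below does this for the kernel G(x, y) = exp(-a x^2 - b y^2 - 2 d x y) with ab > d^2 and trial functions f(y) = exp(-J y^2). The closed forms used (the Gaussian integral for the image of f and the quadratic equation for the stationary J) are derived for this sketch under those assumptions and are not quoted from the paper. All function names and the sample parameters are illustrative.

```python
import math

def log_ratio(J, a, b, d, p, q):
    """log of ||Gf||_q / ||f||_p for f(y) = exp(-J y^2).

    Uses (Gf)(x) = sqrt(pi / (b + J)) * exp(-c x^2) with c = a - d^2 / (b + J),
    which is positive when a * b > d^2.
    """
    c = a - d * d / (b + J)
    log_norm_Gf = 0.5 * math.log(math.pi / (b + J)) + math.log(math.pi / (q * c)) / (2.0 * q)
    log_norm_f = math.log(math.pi / (p * J)) / (2.0 * p)
    return log_norm_Gf - log_norm_f

def best_J(a, b, d, p, q, lo=1e-6, hi=1e6, iters=200):
    """Ternary search for the maximizing J, carried out in the variable log J."""
    u_lo, u_hi = math.log(lo), math.log(hi)
    for _ in range(iters):
        u1 = u_lo + (u_hi - u_lo) / 3.0
        u2 = u_hi - (u_hi - u_lo) / 3.0
        if log_ratio(math.exp(u1), a, b, d, p, q) < log_ratio(math.exp(u2), a, b, d, p, q):
            u_lo = u1
        else:
            u_hi = u2
    return math.exp(0.5 * (u_lo + u_hi))

def stationary_J(a, b, d, p, q):
    """Positive root of the quadratic obtained by setting d/dJ log_ratio = 0."""
    p_dual = p / (p - 1.0)
    k2 = -a / p_dual
    k1 = a * b / p - (a * b - d * d) / p_dual - d * d / q
    k0 = (b / p) * (a * b - d * d)
    disc = k1 * k1 - 4.0 * k2 * k0
    return (-k1 - math.sqrt(disc)) / (2.0 * k2)
```

For a = b = 1, d = 0.5, p = 1.5, q = 3 the search and the quadratic agree on a single optimum J. In this family the stationarity condition is a quadratic with exactly one positive root, so there is only one optimal J among centered Gaussians, in line with the theorem.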
The next theorem concerns Case (B) of Sect. I.
3.3. Theorem (unique Gaussian maximizers in the imaginary, nondegenerate case). Let $G$ be a centered, nondegenerate Gaussian kernel with a real diagonal part and a purely imaginary off-diagonal part, i.e.,
$$G(x, y) = \exp\{-(x, Ax) - (y, By) - 2i(x, Dy)\}$$
where $A$, $B$ and $D$ are real $n \times n$ matrices and $A$ and $B$ are positive definite. Let $1 < p \le 2$ and $1 < q < \infty$ or else $1 < p < \infty$ and $2 \le q < \infty$. Then, in either case, $\mathcal{G}$ has exactly one maximizer, $f$, (up to a multiplicative constant) from $L^p(\mathbf{R}^n)$ to $L^q(\mathbf{R}^n)$ and this $f$ is a real, centered Gaussian, i.e., $f(x) = \exp\{-(x, Jx)\}$ with $J$ being a real, positive definite matrix.
Proof. Assume at first that $D$ is nonsingular. Since $A$ and $B$ are positive definite there are nonsingular real matrices $U$ and $V$ so that the change of variables $x \to Ux$ and $y \to Vy$ changes $A$ and $B$ to the identity matrix, $I$, that is $I = U^T A U = V^T B V$, where $T$ denotes transpose. Then $(x, Dy) \to (x, D'y)$ with $D' = U^T D V$. The polar decomposition of $D'$ is $D' = W|D'|$, where $W$ is orthogonal and $|D'|$ is positive definite (the assumption that $D$ is nonsingular is used here). Then there is an orthogonal matrix $Y$ such that $Y^T|D'|Y$ is diagonal and there is a real diagonal matrix $Z$ such that $ZY^T|D'|YZ = I$. Now make one more change of variables: $x \to WYZx$ and $y \to YZy$ so that $(x, D'y) \to (WYZx, W|D'|YZy) = (x, y)$ and $(x, x) = (x, Ix) \to (WYZx, WYZx) = (x, Z^2x)$ and $(y, y) \to (YZy, YZy) = (y, Z^2y)$. These two changes of variables affect $\mathcal{R}_{p\to q}$ in a trivial way (involving only $p$ and $q$ and the determinants of $U$, $V$ and $Z$) and, most importantly, take Gaussian functions into Gaussian functions. In short, it can be assumed without loss of generality that $G$ has the canonical form
$$G(x, y) = \exp\{-(x, Ax) - (y, Ay) - 2i(x, y)\}, \tag{1}$$
where $A$ is positive definite and diagonal.
By duality $C_{p\to q}(G) = C_{q'\to p'}(G^T)$ with $G^T(x, y) = G(y, x) = G(x, y)$, so it suffices to consider only the case $1 < p \le 2$ and $1 < q < \infty$. It is easily seen that $(\mathcal{G}f)(x) = \exp\{-(x, Ax)\}\hat h(x)$ where $\hat h$ is the Fourier transform of the function $h(y) = \exp\{-(y, Ay)\}f(y)$. Since $f \in L^p(\mathbf{R}^n)$ it has a Fourier transform $\hat f$ and Beckner's theorem (which will also be proved here in Theorem 4.1 and 4.2(1)) states that $\|\hat f\|_{p'} \le (C_p)^n\|f\|_p$, where Beckner's constant $C_p$ is the sharp constant for the $p \to p'$ norm of the Fourier transform in $\mathbf{R}^1$. By the convolution formula, $\hat h$ satisfies
$$\hat h(x) = \mu \int \exp\{-(x - y, A^{-1}(x - y))\}\hat f(y)\,dy,$$
where $\mu > 0$ is a constant which depends only on $A$. Therefore $(\mathcal{G}f)(x) = \mu(\tilde{\mathcal{G}}\hat f)(x)$ where $\tilde G$ is the real, centered, nondegenerate Gaussian
$$\tilde G(x, y) = \exp\{-(x, Ax) - (x - y, A^{-1}(x - y))\}. \tag{2}$$
Thus
$$\mathcal{R}_{p\to q}(G, f)\|f\|_p = \|\mathcal{G}f\|_q = \mu\|\tilde{\mathcal{G}}\hat f\|_q \le \mu C_{p'\to q}(\tilde G)\|\hat f\|_{p'} \le \mu C_{p'\to q}(\tilde G)(C_p)^n\|f\|_p \tag{3}$$
from which it follows that $C_{p\to q}(G) \le \mu(C_p)^n C_{p'\to q}(\tilde G)$. However, equality can be achieved in (3) in exactly one way (up to a multiplicative constant). By Theorem 3.2 there is exactly one choice for $\hat f$ that will make the first inequality in (3) into an equality. This $\hat f$ is a real, centered Gaussian, $\hat f(x) = \exp\{-(x, Jx)\}$. Its inverse transform $f$ is also a real, centered Gaussian, i.e., $f(x) = (\text{const.})\exp\{-(x, J^{-1}x)\}$. The second inequality in (3) (Beckner's) is an equality for any real Gaussian (in particular, our $f$), and therefore $f$ is the unique maximizer as asserted in the theorem.
Gaussian kernels have only Gaussian maximizers
In case $D$ is singular, a change of variables similar to the above replaces the canonical form (1) by $G(x, y) = \exp\{-(x, Ax) - (y, Ay) - 2i(x, Py)\}$ where $P = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}$ is a diagonal projection onto $R^m$ (with $m < n$ being the rank of $D$) and $A = \begin{pmatrix} \alpha & 0 \\ 0 & I \end{pmatrix}$ with $\alpha$ positive definite, $m \times m$ and diagonal. Writing $x \in R^n$ as $(x_1, x_2)$ with $x_1 \in R^m$ and $x_2 \in R^{n-m}$, define $g: R^m \to C$ by
$$g(y_1) = \int_{R^{n-m}} \exp\{-(y_2, y_2)\}f(y_1, y_2)\,dy_2, \tag{4}$$
and $G^*: R^m \times R^m \to C$ by $G^*(x_1, y_1) = \exp\{-(x_1, \alpha x_1) - (y_1, \alpha y_1) - 2i(x_1, y_1)\}$. Then, using Fubini's theorem and the same analysis as before with $\alpha$ and $m$ in place of $A$ and $n$, and with $\tilde G^*: R^m \times R^m \to (0, \infty)$ as given in (2),
$$\mathcal{R}_{p\to q}(G, f)\|f\|_{L^p(R^n)} = \nu\,\mathcal{R}_{p\to q}(G^*, g)\|g\|_{L^p(R^m)} \le \nu\mu C_{p'\to q}(\tilde G^*)(C_p^B)^m\|g\|_{L^p(R^m)} \tag{5}$$
where $\nu$ is the $L^q(R^{n-m})$ norm of the Gaussian function $\exp\{-(x_2, x_2)\}$. As was the case for (3) and the subsequent argument, equality in (5) is uniquely achieved by a real, centered Gaussian
$$g(y_1) = \exp\{-(y_1, Ey_1)\}, \tag{6}$$
where $E$ is a real, positive definite $m \times m$ matrix. By Hölder's or Minkowski's inequality, it follows from (4) that
$$\|g\|_{L^p(R^m)} \le N\|f\|_{L^p(R^n)}, \tag{7}$$
where $N$ is the $L^{p'}(R^{n-m})$ norm of the Gaussian function $\exp\{-(y_2, y_2)\}$. Equality in (7) is compatible with (6) if and only if $f(y) = f(y_1, y_2) = (\text{const.})\exp\{-(y_1, Ey_1) - (p' - 1)(y_2, y_2)\}$. The reason is that equality in (7) requires that $f(y_1, y_2) = \lambda(y_1)\exp\{-(p' - 1)(y_2, y_2)\}$ for almost every $y_1 \in R^m$. Then, computing the integral in (4), one finds that $\exp\{-(y_1, Ey_1)\} = g(y_1) = (\text{const.})\lambda(y_1)$. □
Finally, Case (C) of Sect. I will be considered.

3.4. Theorem (unique Gaussian maximizer when $p \le q$ in the general nondegenerate case). Let $G$ be a centered, nondegenerate Gaussian kernel and let $1 < p \le q < \infty$. Then $\mathcal{G}$ has exactly one maximizer, $f$, (up to a multiplicative constant) from $L^p(R^n)$ to $L^q(R^n)$ and $f$ is a centered Gaussian function.
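Proof. As in the proof of Theorem 3.2, the key is to study the kernel $G^{(2)} = G \otimes G$ by means of Minkowski's inequality. Now, however, the $x_2$ integration is done first. Thus, for $F \in L^p(R^{2n})$,
$$\|\mathcal{G}^{(2)}F\|_q^q = \int\Big\{\int\Big|\int G(x_2, y_2)\Big(\int G(x_1, y_1)F(y_1, y_2)\,dy_1\Big)dy_2\Big|^q dx_2\Big\}dx_1 \tag{1}$$
$$= \int\Big\{\int\Big|\int G(x_2, y_2)K(x_1, y_2)\,dy_2\Big|^q dx_2\Big\}dx_1 \quad\text{with } K(x_1, y_2) = \int G(x_1, y_1)F(y_1, y_2)\,dy_1 \tag{2}$$
$$\le (C_{p\to q}(G))^q\int\Big\{\int|K(x_1, y_2)|^p\,dy_2\Big\}^{q/p}dx_1 \tag{3}$$
$$\le (C_{p\to q}(G))^q\Big\{\int\Big[\int|K(x_1, y_2)|^q\,dx_1\Big]^{p/q}dy_2\Big\}^{q/p} \tag{4}$$
$$\le (C_{p\to q}(G))^{2q}\Big\{\int\!\!\int|F(y_1, y_2)|^p\,dy_1\,dy_2\Big\}^{q/p}. \tag{5}$$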
Invent. Math. 102, 179-208 (1990)
E.H. Lieb
(Notes: (2) → (3) uses $C_{p\to q}(G) \ge \mathcal{R}_{p\to q}(G, K(x_1, \cdot))$ for each $x_1$. (3) → (4) is Minkowski's inequality for the exponent $r = q/p \ge 1$ and the function $|K(x_1, y_2)|^p$. (4) → (5) uses $C_{p\to q}(G) \ge \mathcal{R}_{p\to q}(G, F(\cdot, y_2))$ for each $y_2$.) This inequality (along with consideration of $F$'s of the form $F(y_1, y_2) = h(y_1)h(y_2)$) shows that
$$C_{p\to q}(G^{(2)}) = C_{p\to q}(G)^2.$$
Suppose now that $F: R^{2n} \to C$ is a maximizer for $\mathcal{G}^{(2)}$. Since $G^{(2)}$ and $G$ are nondegenerate, maximizers exist for each of them by Lemma 2.1. Then all the inequalities in (1)-(5) must be equalities. In particular, inequality (4) → (5) implies that the function $y_1 \mapsto F(y_1, y_2)$ must either be the zero function or it must be a maximizer for $\mathcal{G}$ for almost every $y_2 \in R^n$. (It is well known that this function is in $L^p(R^n)$ for almost every $y_2$.) As in the proof of Theorem 3.2, the $O(2)$ invariance of $G^{(2)}$ implies that the function given by
$$F(y_1, y_2) = f\Big(\frac{y_1 + y_2}{\sqrt 2}\Big)f\Big(\frac{y_1 - y_2}{\sqrt 2}\Big) \tag{6}$$
is a maximizer for $\mathcal{G}^{(2)}$ when $f$ is a maximizer for $\mathcal{G}$, as will henceforth be assumed. Thus, for almost every $z$ in $R^n$, the function $g_z(y) = F(y, z)$ is in $L^p(R^n)$ and either (a) it is a maximizer for $\mathcal{G}$ or (b) $g_z$ is the zero function. The second possibility (b) can be excluded by Lemma 2.1(b). If $g_z \equiv 0$ then, from (6), $f(w) = 0$ for all $w$ in some set $A \subset R^n$ of positive Lebesgue measure. But $|f|^{2(p-1)}$ is analytic and this is impossible unless $f \equiv 0$. Thus it can be assumed that $g_z$ is indeed a maximizer for almost every $z$, i.e., $g_z \not\equiv 0$.

In fact $g_z$ is an $L^p(R^n)$ maximizer for every $z \in R^n$. To prove this assertion, fix $z$ and let $z_1, z_2, \dots$ be any sequence in $R^n$ such that $z_j \to z$ as $j \to \infty$ and such that $g_{z_j}$ is an $L^p(R^n)$ maximizer for each $j$. Such a sequence exists because $g_z$ is a maximizer for $z$'s in a dense set. Define $h_j(y) = Z_j g_{z_j}(y)$ where $Z_j$ is chosen so that $\|h_j\|_p = 1$ for each $j$. By Lemma 2.1, there is a subsequence (still denoted by $h_j$) and a maximizing function $h \in L^p(R^n)$ such that $h_j \to h$ strongly as $j \to \infty$. By passing to a further subsequence this convergence can also be assumed to be pointwise almost everywhere. However, translation is a continuous operation in $L^p(R^n)$ and thus, by passing to a further subsequence, $f((y + z_j)/\sqrt 2)$ converges pointwise to $f((y + z)/\sqrt 2)$ for almost every $y$. Likewise, by passing to a further subsequence, $f((y - z_j)/\sqrt 2)$ converges pointwise to $f((y - z)/\sqrt 2)$ for almost every $y$. It follows then that the maximizer $h$ satisfies
$$h(y) = f\Big(\frac{y + z}{\sqrt 2}\Big)f\Big(\frac{y - z}{\sqrt 2}\Big)\lim_{j\to\infty} Z_j$$
for almost every $y$. Therefore $\lim_{j\to\infty} Z_j$ exists and $g_z$ is a maximizer for every $z \in R^n$.

Our first application of this result will be the proof that there is a Gaussian maximizer. Take $z = 0$ so that $f^{(2)}(y) \equiv f(y/\sqrt 2)^2$ is a maximizer. Then apply the same conclusion to $f^{(2)}$ so that $f^{(4)}(y) \equiv f(y/2)^4$ is also a maximizer. Repeating this indefinitely, the sequence of $L^p(R^n)$ functions given by
$$g_j(y) = N_j f(y/\sqrt j)^j \tag{7}$$
is a sequence of maximizers for $j = 2, 4, 8, 16, \dots$. The number $N_j$ is chosen in each case so that $\|g_j\|_p = 1$. Using Lemma 2.1 again we infer the existence of a subsequence (still denoted by $j$) and a maximizer $g$ such that $g_j \to g$ strongly in $L^p(R^n)$ and pointwise almost everywhere. Our goal will be to prove that $g$ is a Gaussian. This can be inferred from the central limit theorem, but the following argument is more direct and will be needed later for the proof that every maximizer is a Gaussian.
The first step is to prove that $f(0) \ne 0$. Recall from Lemma 2.1(b) that $R \equiv |f|^{2(p-1)}$ is analytic. Likewise $S \equiv |g|^{2(p-1)}$ is also analytic and
$$S(y) = \lim_{j\to\infty} N_j^{2(p-1)}R(y/\sqrt j)^j \tag{8}$$
for almost all $y \in R^n$. Since $S_j(y) \equiv N_j^{2(p-1)}R(y/\sqrt j)^j$ is the $2(p-1)$th power of the modulus of a maximizer with unit $L^p(R^n)$ norm (namely $g_j$), Lemma 2.1(b) states that the analytic extension of $S_j$ is uniformly bounded on compact subsets of $C^n$. The almost everywhere convergence in (8) then implies (by Vitali's theorem) that (8) holds for all $y \in C^n$ and that all partial derivatives with respect to $y$ of the sequence of functions $S_j$ also converge as $j \to \infty$ to the corresponding derivatives of $S$. However, it is easily seen by Leibniz's rule that if $R(0) = 0$ then every derivative of $S_j$ at $y = 0$ converges to zero as $j \to \infty$. This is impossible unless $S(y)$ vanishes identically, which contradicts the fact that $\|g\|_p = 1$.

The second step is to prove that $g$ is a Gaussian. By Lemma 2.1(a), for $y \in R^n$, $f(y) = |m(y)|^{p'}/m(y)$, where $m: C^n \to C$ is entire analytic. Since $f(0) \ne 0$, also $m(0) \ne 0$ and hence there is a neighborhood $U$ of $0 \in C^n$ on which $f$ has an analytic extension and on which $f$ is never zero. [Reason: $m_1(y) \equiv \mathrm{Re}(m(y))$ can be written as a Taylor series for $y \in R^n$, and so can $m_2(y) \equiv \mathrm{Im}(m(y))$. Consequently $m_1$ and $m_2$ extend to entire functions. Then $(m_1^2 + m_2^2)^{p'/2}$ is analytic on $U$ since $m_1(0)^2 + m_2(0)^2 = |m(0)|^2 \ne 0$.] Therefore $f$ has a logarithm, $H$, which is analytic on $U$, i.e., $f(y) = f(0)\exp\{H(y)\}$. The function $H$ can be written as
$$H(y) = (V, y) - (y, Jy) + O(y^3)$$
for some $V \in C^n$ and $J$ a symmetric matrix. For each $y \in R^n$, the point $y/\sqrt j$ lies in $U$ for all sufficiently large $j$ and therefore, by (7),
$$g(y) = \lim_{j\to\infty} N_j f(0)^j\exp\{\sqrt j\,(V, y) - (y, Jy) + O(y^3 j^{-1/2})\}$$
for almost every $y \in R^n$. The factor $\exp\{O(y^3 j^{-1/2})\}$ converges to 1 as $j \to \infty$ and thus
$$g(y) = \exp\{-(y, Jy)\}\lim_{j\to\infty} N_j f(0)^j\exp\{\sqrt j\,(V, y)\}.$$
Clearly this last limit can exist for almost every $y$ if and only if $V = 0$ and $N_j f(0)^j$ has a finite limit (which cannot be zero since $\|g\|_p = 1$). This proves that $g$ must be a Gaussian as claimed (and hence $\mathrm{Re}(J)$ is positive definite) but we also note that the argument proves the following three statements: Whenever $f$ is a maximizer then (i) $f$ is analytic in some complex neighborhood of 0; (ii) $f(0) \ne 0$;
(iii) $(\partial f/\partial y^i)(0) = 0$, for $i = 1, \dots, n$.

The second assertion of the theorem is that every other maximizer, $f$, is proportional to the one just found, namely $g(y) = \exp\{-(y, Jy)\}$. Instead of (6) take
$$F(y_1, y_2) = g\Big(\frac{y_1 - y_2}{\sqrt 2}\Big)f\Big(\frac{y_1 + y_2}{\sqrt 2}\Big)$$
which is obviously also a maximizer for $\mathcal{G}^{(2)}$. By the same reasoning as before, $F$ has the property that $y \mapsto k_z(y) \equiv F(y, z)$ is a maximizer for each fixed $z \in R^n$. By the three statements just made above, we conclude that $k_z$ is analytic near 0, $k_z(0) \ne 0$ and $(\partial k_z/\partial y^i)(0) = 0$. This is equivalent to the statement that for every $z \in R^n$, $f$ is analytic near $z/\sqrt 2$, $f(z/\sqrt 2) \ne 0$ and
$$(\partial f/\partial y^i)(z/\sqrt 2) = -\sqrt 2\,[Jz]_i\,f(z/\sqrt 2),$$
which shows that $f = g$ up to a multiplicative constant. □
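The scaling limit behind (7) can be illustrated numerically. The sketch below uses a toy profile `f` chosen only for illustration (it is not claimed to be a maximizer of any kernel in the text): it satisfies $f(0) = 1$ and has no linear term in its logarithm, so with the simplifying assumption $N_j = 1$ the powers $f(y/\sqrt j)^j$ should approach $\exp\{-Jy^2\}$ with $J = 1/2$.

```python
import math

def f(y):
    # Toy profile (an assumption for illustration, not taken from the paper):
    # f(0) = 1 and log f(y) = -0.5*y**2 + O(y**4), so V = 0 and J = 0.5.
    return math.exp(-y * y) * (1.0 + 0.5 * y * y)

def g_j(y, j):
    # g_j(y) = N_j * f(y / sqrt(j))**j, with N_j = 1 here because f(0) = 1.
    return f(y / math.sqrt(j)) ** j

def gaussian_limit(y, J=0.5):
    # The Gaussian exp{-J y^2} predicted by the quadratic term of log f.
    return math.exp(-J * y * y)

def error(y, j):
    # Pointwise distance between g_j and the predicted Gaussian.
    return abs(g_j(y, j) - gaussian_limit(y))
```

Evaluating `error(1.3, j)` along $j = 2, 4, 8, 16, \dots$ shows the gap shrinking roughly like $1/j$ for this particular profile.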
IV. Degenerate Gaussian kernels
In the three cases (A), (B) and (C) of Sect. I, which correspond to Theorems 3.2, 3.3 and 3.4, every nondegenerate Gaussian kernel has a unique maximizer which is a Gaussian function. By taking suitable limits the following formula 4.1(a), which is one of the main results of this paper, can be deduced for the $L^p(R^n)$ to $L^q(R^n)$ norm of degenerate kernels. This formula is, of course, trivially true in the nondegenerate case.

4.1. Theorem (the sharp bound for degenerate kernels). Let $G$ be a centered Gaussian kernel as in (1.1) with $L = 0$ and let $p$ and $q$ satisfy the appropriate conditions given in (A), (B) or (C) of Sect. I, according to the properties of $G$. Then $\mathcal{G}$ is bounded from $L^p(R^n)$ to $L^q(R^n)$ if and only if the following supremum is finite, in which case the supremum is equal to $C_{p\to q}$:
$$\sup_g \mathcal{R}_{p\to q}(g) = C_{p\to q}, \tag{a}$$
where the supremum is taken over all centered Gaussian functions, and in Cases (A) and (B) they can be restricted to be real.
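Proof. For each $\varepsilon > 0$ let $h_\varepsilon(x) \equiv \exp\{-\varepsilon(x, x)\}$ and define $G_\varepsilon(x, y) \equiv G(x, y)h_\varepsilon(x)h_\varepsilon(y)$, which is nondegenerate. Correspondingly, there is the linear operator $\mathcal{G}_\varepsilon$. For each $f \in L^p(R^n)$,
$$\mathcal{R}_{p\to q}(G_\varepsilon, f)\|f\|_p = \|h_\varepsilon\mathcal{G}(h_\varepsilon f)\|_q \le \|\mathcal{G}(h_\varepsilon f)\|_q \le C_{p\to q}(G)\|h_\varepsilon f\|_p \le C_{p\to q}(G)\|f\|_p. \tag{1}$$
This proves that $C_{p\to q}(G_\varepsilon) \le C_{p\to q}(G)$.

On the other hand, assuming that $C_{p\to q}(G) < \infty$, for each $\delta > 0$ there is an $f_\delta \in L^p(R^n)$ with $\|f_\delta\|_p = 1$ such that $\|\mathcal{G}f_\delta\|_q > C_{p\to q}(G) - \delta$. Then
$$C_{p\to q}(G_\varepsilon) \ge \mathcal{R}_{p\to q}(G_\varepsilon, f_\delta) = \|\mathcal{G}_\varepsilon(f_\delta)\|_q = \|h_\varepsilon\mathcal{G}(h_\varepsilon f_\delta)\|_q. \tag{2}$$
As $\varepsilon \to 0$, $h_\varepsilon f_\delta \to f_\delta$ strongly in $L^p(R^n)$, so $\mathcal{G}(h_\varepsilon f_\delta) \to \mathcal{G}(f_\delta)$ strongly in $L^q(R^n)$. This implies that $h_\varepsilon\mathcal{G}(h_\varepsilon f_\delta) \to \mathcal{G}(f_\delta)$ strongly in $L^q(R^n)$ as well, and thus, from (2), $\liminf_{\varepsilon\to 0} C_{p\to q}(G_\varepsilon) \ge C_{p\to q}(G) - \delta$. Since $\delta$ is arbitrary, and in view of (1),
$$\lim_{\varepsilon\to 0} C_{p\to q}(G_\varepsilon) = C_{p\to q}(G). \tag{3}$$
A similar argument shows that (3) holds even if $C_{p\to q}(G) = \infty$.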
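The convergence (3) can be checked in the simplest degenerate case, the one-dimensional Fourier kernel $\exp\{2ixy\}$ with $q = p'$. The sketch below is an illustration under stated assumptions: it evaluates the ratio only on real centered Gaussians $\exp\{-Jy^2\}$ and takes the maximum over a finite grid of $J$ values, so `sup_over_real_gaussians` is a lower bound for $C_{p\to p'}(G_\varepsilon)$, not the constant itself. The function names are hypothetical, and the limiting value is the constant quoted in 4.2(1).

```python
import math

def beckner_constant(p):
    # C_p^B = pi^(1/p') p^(1/(2p)) (p')^(-1/(2p')) for the kernel exp{2ixy} on R^1.
    q = p / (p - 1.0)
    return math.pi ** (1.0 / q) * p ** (1.0 / (2 * p)) * q ** (-1.0 / (2 * q))

def mollified_ratio(p, eps, J):
    # ||G_eps f||_{p'} / ||f||_p for G_eps(x,y) = exp{2ixy - eps x^2 - eps y^2}
    # and f(y) = exp{-J y^2}; all integrals are closed-form Gaussian integrals.
    q = p / (p - 1.0)
    s = J + eps
    # (G_eps f)(x) = sqrt(pi/s) * exp{-(eps + 1/s) x^2}
    norm_Gf = math.sqrt(math.pi / s) * (math.pi / (q * (eps + 1.0 / s))) ** (1.0 / (2 * q))
    norm_f = (math.pi / (p * J)) ** (1.0 / (2 * p))
    return norm_Gf / norm_f

def sup_over_real_gaussians(p, eps):
    # Maximum over a logarithmic grid of J in [1e-3, 1e3] (grid choice is arbitrary).
    grid = [10.0 ** (k / 50.0) for k in range(-150, 151)]
    return max(mollified_ratio(p, eps, J) for J in grid)
```

For $p = 1.5$ the grid maxima at $\varepsilon = 10^{-1}, 10^{-2}, 10^{-3}$ increase toward `beckner_constant(1.5)` from below, in line with (1) and (3).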
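Now let $g_\varepsilon$ denote the maximizer for $\mathcal{G}_\varepsilon$, which is a centered Gaussian function. Assume $\|g_\varepsilon\|_p = 1$. Then
$$\|h_\varepsilon g_\varepsilon\|_p C_{p\to q}(G) \ge \|h_\varepsilon g_\varepsilon\|_p\mathcal{R}_{p\to q}(G, h_\varepsilon g_\varepsilon) = \|\mathcal{G}(h_\varepsilon g_\varepsilon)\|_q \ge \|h_\varepsilon\mathcal{G}(h_\varepsilon g_\varepsilon)\|_q = \|\mathcal{G}_\varepsilon(g_\varepsilon)\|_q = C_{p\to q}(G_\varepsilon). \tag{4}$$
Assuming $\mathcal{G}$ to be bounded, (4) together with (3) and the fact that $\|h_\varepsilon g_\varepsilon\|_p \le \|g_\varepsilon\|_p = 1$ implies that $\|h_\varepsilon g_\varepsilon\|_p \to 1$ as $\varepsilon \to 0$. Then
$$C_{p\to q}(G) = \lim_{\varepsilon\to 0} C_{p\to q}(G_\varepsilon) \le \lim_{\varepsilon\to 0}\mathcal{R}_{p\to q}(G, h_\varepsilon g_\varepsilon) \le C_{p\to q}(G). \tag{5}$$
This proves the theorem in the bounded case since $h_\varepsilon g_\varepsilon$ is a Gaussian function (which is real in Cases (A) and (B)).

In case $\mathcal{G}$ is unbounded, (4) and (5) imply that
$$\infty = \lim_{\varepsilon\to 0} C_{p\to q}(G_\varepsilon) \le \lim_{\varepsilon\to 0}\|h_\varepsilon g_\varepsilon\|_p\mathcal{R}_{p\to q}(G, h_\varepsilon g_\varepsilon) \le \lim_{\varepsilon\to 0}\mathcal{R}_{p\to q}(G, h_\varepsilon g_\varepsilon)$$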
which proves the theorem since $h_\varepsilon g_\varepsilon$ is a Gaussian function. □

4.2. Remarks and examples. Theorem 4.1(a) is a formula for the $L^p(R^n) \to L^q(R^n)$ norm of $\mathcal{G}$. The same formula is, of course, also valid for nondegenerate kernels, but in that case we are assured that there is precisely one $g$ that achieves the supremum. In the degenerate case a maximizer may not exist, even if $\mathcal{G}$ is bounded, as the examples below show. In any event, the evaluation of this formula is, in general, a difficult nonlinear algebraic exercise, although it is simple in many applications. For example, when $G(x, y) = \exp\{2i(x, y)\}$ (the Fourier transform kernel), it is easy to deduce from 4.1(a) that $\mathcal{G}$ is bounded if and only if $q = p' \ge 2$, in which case a Gaussian function is a maximizer if and only if it has the form
$$g(x) = \mu\exp\{-(x, Jx) + (l, x)\}$$
with $J$ positive definite, real and symmetric and $l \in C^n$. Both $J$ and $l$ are arbitrary. This $g$ is not necessarily centered even though $G$ is. In the degenerate case it is not asserted that every maximizer must be centered when $G$ is centered. The sharp constant is then $C_{p\to p'} = (C_p^B)^n$ with
$$C_p^B = \pi^{1/p'}p^{1/2p}(p')^{-1/2p'}. \tag{1}$$
[Note: The Fourier transform is an example of both Cases (B) and (C). While the proof of Theorem 3.3 (Case (B)) required 4.1(a) and 4.2(1), the proof of Theorem 3.4 (Case (C)) did not. Therefore no circular reasoning is involved because 3.4 ⇒ 4.1(a) for Case (C) ⇒ 4.2(1) ⇒ 3.3 ⇒ 4.1(a) for Case (B).]
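As a quick check of 4.2(1) in one dimension, the ratio $\|\mathcal{G}f\|_{p'}/\|f\|_p$ can be written in closed form for a real centered Gaussian $f(y) = \exp\{-Jy^2\}$ and compared with $C_p^B$. This is a minimal sketch; the function names are hypothetical and the kernel normalization $\exp\{2ixy\}$ is the one used in the text.

```python
import math

def conjugate(p):
    # Dual exponent p' defined by 1/p + 1/p' = 1.
    return p / (p - 1.0)

def beckner_constant(p):
    # C_p^B = pi^(1/p') p^(1/(2p)) (p')^(-1/(2p')), formula 4.2(1) with n = 1.
    pp = conjugate(p)
    return math.pi ** (1.0 / pp) * p ** (1.0 / (2 * p)) * pp ** (-1.0 / (2 * pp))

def gaussian_ratio_fourier(p, J):
    # ||Gf||_{p'} / ||f||_p for G(x,y) = exp{2ixy} and f(y) = exp{-J y^2}, J > 0.
    pp = conjugate(p)
    # (Gf)(x) = sqrt(pi/J) * exp{-x^2/J}
    norm_Gf = math.sqrt(math.pi / J) * (math.pi * J / pp) ** (1.0 / (2 * pp))
    norm_f = (math.pi / (p * J)) ** (1.0 / (2 * p))
    return norm_Gf / norm_f
```

The ratio does not depend on $J$, which matches the statement that $J$ is arbitrary among the Gaussian maximizers, and at $p = 2$ the constant reduces to $\sqrt\pi$.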
Another example is the (real) convolution operator $G(x, y) = \exp\{-\lambda(x - y, x - y)\}$ which, using Theorem 4.1, turns out to be bounded if and only if $p \le q$ (see [BL] Section 4 for more details). There is a maximizing Gaussian function if and only if $p < q$ and it must have the form
$$g(x) = \exp\{-J(x, x) + (l, x)\} \tag{2}$$
with $J = \lambda\big(\frac{p'}{q'} - 1\big)$ and with $l \in R^n$ arbitrary. Also
$$(C_{p\to q})^{2/n} = (\pi/\lambda)^{1/q + 1/p'}\,p^{1/p}q^{-1/q}\,(p' - q')^{1/p - 1/q}\,(q')^{1/p'}(p')^{-1/q'}. \tag{3}$$
When $p = q$ the limiting value $(C_{p\to p})^{2/n} = \pi/\lambda$ is correct but, since $J = 0$ in this case, there is no Gaussian maximizer. Indeed, there is no maximizer of any kind in this case. To prove this, note that $G(x, y) = H(x - y)$ with $H(x) = \exp\{-\lambda(x, x)\}$ and $\int H(x - y)f(y)\,dy = \int f(x - y)H(y)\,dy$. Then, by Minkowski's inequality,
$$\Big\{\int\Big|\int f(x - y)H(y)\,dy\Big|^p dx\Big\}^{1/p} \le \int\Big\{\int|f(x - y)|^p H(y)^p\,dx\Big\}^{1/p}dy = \|f\|_p\int H(y)\,dy = \Big(\frac{\pi}{\lambda}\Big)^{n/2}\|f\|_p. \tag{4}$$
Since the condition in Lemma 3.1 for equality is clearly not satisfied, and since $(\pi/\lambda)^{n/2}$ has already been shown to be the sharp bound, a maximizer cannot exist.
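The one-dimensional case of this example can be explored numerically. The sketch below assumes $n = 1$ and real centered trial functions $f(y) = \exp\{-Jy^2\}$, for which the squared ratio $\|\mathcal{G}f\|_q^2/\|f\|_p^2$ is a closed-form Gaussian integral; the function names are hypothetical. It compares the value at $J = \lambda(p'/q' - 1)$ with nearby $J$ and with the closed form (3), and looks at the $p = q$ case, where the ratio only approaches $\pi/\lambda$ as $J \to 0$.

```python
import math

def conjugate(p):
    # Dual exponent p' defined by 1/p + 1/p' = 1.
    return p / (p - 1.0)

def conv_ratio_sq(p, q, lam, J):
    # (||Gf||_q / ||f||_p)^2 for G(x,y) = exp{-lam (x-y)^2}, f(y) = exp{-J y^2} on R^1.
    return (math.pi ** (1 + 1 / q - 1 / p) * p ** (1 / p) * q ** (-1 / q)
            * lam ** (-1 / q) * J ** (1 / p - 1 / q) * (lam + J) ** (1 / q - 1))

def optimal_J(p, q, lam):
    # Stationary point of conv_ratio_sq in J: J = lam * (p'/q' - 1).
    return lam * (conjugate(p) / conjugate(q) - 1.0)

def sharp_constant_sq(p, q, lam):
    # Right side of (3) with n = 1.
    pp, qq = conjugate(p), conjugate(q)
    return ((math.pi / lam) ** (1 / q + 1 / pp) * p ** (1 / p) * q ** (-1 / q)
            * (pp - qq) ** (1 / p - 1 / q) * qq ** (1 / pp) * pp ** (-1 / qq))
```

For example, with $p = 1.5$, $q = 3$, $\lambda = 2$ the stationary point is $J = 2$, and `conv_ratio_sq(2.0, 2.0, lam, J)` stays strictly below $\pi/\lambda$ for every $J > 0$.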
A second example of a degenerate $G$ that is bounded but does not have a maximizer is the following modification of the Fourier transform in $R^1$ with $\lambda > 0$:
$$G_\lambda(x, y) = \exp\{-\lambda y^2 - 2ixy\}. \tag{5}$$
It is easily verified for all $p$ that $\mathcal{R}_{p\to q}(g)$ is unbounded on complex Gaussian functions when $q < 2$. Thus, it can be assumed that $q \ge 2$, which places us in Case (B) of Sect. I. If $f_J(x) = \exp\{-Jx^2\}$ is an arbitrary Gaussian function, one finds that when $q \ge 2$ the optimum choice is $J$ real and
$$[\mathcal{R}_{p\to q}(f_J)]^2 = \pi^{1/q + 1/p'}p^{1/p}q^{-1/q}J^{1/p}(\lambda + J)^{-1/q'}. \tag{6}$$
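The behavior of the right side of (6) in $J$ is easy to tabulate. The following sketch, with hypothetical function names, evaluates (6) for real $J > 0$ and, for $p = q'$, compares it with the $J \to \infty$ limit $\pi^{2/p'}p^{1/p}(p')^{-1/p'}$, which is the square of the one-dimensional constant in 4.2(1).

```python
import math

def conjugate(p):
    # Dual exponent p' defined by 1/p + 1/p' = 1.
    return p / (p - 1.0)

def ratio_sq(p, q, lam, J):
    # Right side of (6): squared ratio for G_lam(x,y) = exp{-lam y^2 - 2ixy}, f_J = exp{-J x^2}.
    pp, qq = conjugate(p), conjugate(q)
    return (math.pi ** (1 / q + 1 / pp) * p ** (1 / p) * q ** (-1 / q)
            * J ** (1 / p) * (lam + J) ** (-1 / qq))

def limit_sq(p):
    # Limit of (6) as J -> infinity when p = q'; it does not depend on lam.
    pp = conjugate(p)
    return math.pi ** (2 / pp) * p ** (1 / p) * pp ** (-1 / pp)
```

With $p = 1.5$, $q = 3$ the values increase in $J$ and approach `limit_sq(1.5)` without reaching it, for any $\lambda > 0$; with $p = 1.2 < q' = 1.5$ they grow without bound.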
By maximizing this with respect to $J$ one finds that $C_{p\to q}$ is finite whenever $p \ge q'$ and $C_{p\to q} = \infty$ when $p < q'$. If $p = q'$ there is no $J$ that maximizes the right side of (6) (i.e., $J \to \infty$), although the right side is bounded. Indeed, there is no maximizer of any kind when $p = q'$. If there were a maximizer $f \in L^p(R^1)$ then, by imitating the proof of Theorem 4.1, it is easily seen that $C_{p\to p'}(G_{\mu\lambda}) > C_{p\to p'}(G_\lambda)$ when $0 < \mu < 1$. This contradicts the conclusion of Theorem 4.1 which states that the supremum over $J$ of the right side of (6) correctly gives $C_{p\to p'}(G_\lambda)$ for every $\lambda$, but this supremum is obviously independent of $\lambda$. These examples motivate the following theorem.

4.3. Theorem (a condition for Gaussian maximizers). Let $G$ be a degenerate Gaussian kernel with the property that the $n \times n$ real, symmetric matrices $A$ and $B$ in (1.1) are both positive definite. If $1 < p \le q < \infty$ then $\mathcal{G}$ is bounded from $L^p(R^n)$ to $L^q(R^n)$. If, additionally, $p < q$ then $\mathcal{G}$ has a maximizer which is a Gaussian function. If $G$ is also real then obviously $A$ and $B$ must be positive definite if $\mathcal{G}$ is bounded at all. In this real, degenerate case $\mathcal{G}$ is unbounded when $1 < q < p < \infty$ and $\mathcal{G}$ has no maximizer of any kind when $1 < p = q < \infty$.

Proof. It can be assumed that $G$ is centered and, as in the proof of Theorem 3.3, we can use the fact that $A$ and $B$ are positive definite to change variables so that $G(x, y)$ is brought into the canonical form
$$G(x, y) = \exp\{-(x, x) - (y, y) - 2(x, Ey) - 2i(x, Hy)\}, \tag{1}$$
where $E$ and $H$ are real matrices and $E$ is also diagonal. In the real case $H = 0$. Since $M = \begin{pmatrix} I & E \\ E & I \end{pmatrix}$ must be positive semidefinite, the eigenvalues $e_1, \dots, e_n$ of $E$ must be in the interval $[-1, 1]$. Since $G$ is degenerate at least one of the $e_i$'s (say $e_1$) is $+1$ or $-1$ and, by changing $y$ to $-y$ if necessary, we can assume that $e_1 = -1$. Thus, $G(x, y)$ contains the factor $\exp\{-(x_1 - y_1)^2\}$.

In the real case, $H = 0$, $G$ in (1) is seen to be a tensor product of operators on $R^1$, i.e., $G(x, y) = G_1(x_1, y_1)\cdots G_n(x_n, y_n)$. If $p > q$ the operator $\mathcal{G}_1$ corresponding to $e_1$ is unbounded, as shown in 4.2, so $\mathcal{G}$ is unbounded as well. In case $p \le q$ the Minkowski inequality argument in the first part of the proof of Theorem 3.2 (applied sequentially to $\mathcal{G}_1, \mathcal{G}_2, \dots, \mathcal{G}_n$) shows that any maximizer, $F$, for $\mathcal{G}$ must be of the product form, i.e., $F(y_1, \dots, y_n) = f_1(y_1)\cdots f_n(y_n)$ and each $f_i$ must be a maximizer for the corresponding $\mathcal{G}_i$. When $p = q$, however, $\mathcal{G}_1$ does not have a maximizer as stated in 4.2 and therefore $\mathcal{G}$ has no maximizer. When $p < q$ we
know from 4.2 that each $\mathcal{G}_i$ has a Gaussian maximizer $g_i$. Since $\prod_i g_i(x_i)$ is a Gaussian function on $R^n$, the proof for the real case is complete.

For the general case with $p \le q$, let $G^0(x, y)$ be the real kernel given by (1) but with $H$ set equal to zero and let $\mathcal{G}^0$ denote the corresponding operator. If $f \in L^p(R^n) \cap L^1(R^n)$ then clearly $\mathcal{R}_{p\to q}(G, f) \le \mathcal{R}_{p\to q}(G^0, F)$, where $F \equiv |f|$. Since $\mathcal{G}^0$ is bounded when $p \le q$, then $\mathcal{G}$ is also bounded. Referring now to Theorem 4.1 let $G_\varepsilon$ be the kernel defined in that proof, i.e., $G_\varepsilon(x, y) = G(x, y)h_\varepsilon(x)h_\varepsilon(y)$ with $h_\varepsilon(x) = \exp\{-\varepsilon(x, x)\}$, and let $g_\varepsilon$ denote its unique Gaussian maximizer with $\|g_\varepsilon\|_p = 1$. Let $g_\varepsilon(x) = \mu_\varepsilon\exp\{-(x, J_\varepsilon x) - i(x, K_\varepsilon x)\}$ with $J_\varepsilon$ and $K_\varepsilon$ real, symmetric and with $J_\varepsilon$ positive definite. Define $g_\varepsilon^0(x) = \mu_\varepsilon\exp\{-(x, J_\varepsilon x)\}$. Let $\varepsilon \to 0$ through the sequence $\varepsilon = 1/j$ with $j = 1, 2, 3, \dots$. There is a subsequence of the $j$'s (which we continue to denote by $j$) such that the eigenvectors of $J_\varepsilon$ and $K_\varepsilon$ have limits as $j \to \infty$ (because the manifold $O(n)$ is compact). The corresponding eigenvalues of $J_\varepsilon$ must be uniformly bounded away from 0 and $\infty$ since otherwise $\mathcal{R}_{p\to q}(G^0, g_\varepsilon^0)$ will converge to zero, as the following computation shows. Apart from irrelevant constants, $\|g_\varepsilon^0\|_p = |J_\varepsilon|^{-1/2p}$, where $|\cdot|$ denotes the determinant. Also, $\|\mathcal{G}^0(g_\varepsilon^0)\|_q = |J_\varepsilon + I|^{-1/2}\,|I - E(J_\varepsilon + I)^{-1}E|^{-1/2q}$. Using the fact that $|I - M^TM| = |I - MM^T|$ for any real matrix $M$, we have
$$|I - E(J_\varepsilon + I)^{-1}E| = |I - (J_\varepsilon + I)^{-1/2}E^2(J_\varepsilon + I)^{-1/2}| = |J_\varepsilon + I|^{-1}|J_\varepsilon + I - E^2| \ge |J_\varepsilon + I|^{-1}|J_\varepsilon|.$$
Therefore
$$\mathcal{R}_{p\to q}(G^0, g_\varepsilon^0) \le |J_\varepsilon|^{1/2p - 1/2q}|J_\varepsilon + I|^{-1/2q'} = \prod_{i=1}^n J_i^{1/2p - 1/2q}(1 + J_i)^{-1/2q'},$$
where the $J_i$'s are the eigenvalues of $J_\varepsilon$. Since $p < q$ the function $t^{1/2p - 1/2q}(1 + t)^{-1/2q'}$ is bounded and goes to zero as $t \to 0$ or $t \to \infty$. The possibility that $\mathcal{R}_{p\to q}(G^0, g_\varepsilon^0) \to 0$ is not allowed by 4.1 and the fact that $\mathcal{R}_{p\to q}(G^0, g_\varepsilon^0) \ge \mathcal{R}_{p\to q}(G, g_\varepsilon)$. Thus we can pass to a further subsequence such that $J_\varepsilon$ has a positive definite limit $J$ as $\varepsilon \to 0$. This implies that $\mu_\varepsilon$ also has a finite, nonzero limit $\mu$. The eigenvalues of $K_\varepsilon$ must also stay bounded away from infinity for otherwise $g_\varepsilon$ would tend weakly to zero in $L^p(R^n)$ and then the function $\mathcal{G}(g_\varepsilon)$ would tend to zero pointwise. (This is so because the function $y \mapsto G(x, y)\exp\{-\tfrac12(y, Jy)\}$ is in $L^{p'}(R^n)$ for each $x$.) But $|\mathcal{G}(g_\varepsilon)|$ is bounded above pointwise by $\mathcal{G}^0(g_\varepsilon^0)$, and the pointwise convergence to zero would imply by dominated convergence that $\mathcal{G}(g_\varepsilon)$ converges to zero in $L^q(R^n)$ norm. Thus, by passing to a further subsequence, $J_\varepsilon$ and $K_\varepsilon$ have limits $J$ and $K$. From this it follows that $g_\varepsilon$ converges strongly in $L^p(R^n)$ norm to $g(x) = \mu\exp\{-(x, Jx) - i(x, Kx)\}$.

The Gaussian function $g$ is the desired maximizer for $\mathcal{G}$. First note that $h_\varepsilon g \to g$ in $L^p(R^n)$ norm as $\varepsilon \to 0$. Also $g_\varepsilon \to g$, and thus we can write $h_\varepsilon g_\varepsilon = g + \Delta_\varepsilon$ with $\delta_\varepsilon \equiv \|\Delta_\varepsilon\|_p \to 0$ as $\varepsilon \to 0$. Then, since $\mathcal{G}$ is bounded, $C_{p\to q}(G) \ge \mathcal{R}_{p\to q}(G, g) \ge \mathcal{R}_{p\to q}(G, h_\varepsilon g_\varepsilon) - C_{p\to q}\delta_\varepsilon$. Taking the limit $\varepsilon \to 0$,
$$C_{p\to q}(G) \ge \mathcal{R}_{p\to q}(G, g) \ge \limsup_{\varepsilon\to 0}\mathcal{R}_{p\to q}(G, h_\varepsilon g_\varepsilon).$$
But by Eq. (5) of the proof of Theorem 4.1, this latter limit equals $C_{p\to q}(G)$. □
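The scalar function that controls the eigenvalues of $J_\varepsilon$ in the proof above, $\varphi(t) = t^{1/2p - 1/2q}(1 + t)^{-1/2q'}$, can be inspected directly. This is a small sketch with hypothetical names; the location of the maximum, $t^* = p'(1/p - 1/q)$, comes from setting the logarithmic derivative to zero and is not stated in the text.

```python
def phi(t, p, q):
    # phi(t) = t^(1/(2p) - 1/(2q)) * (1 + t)^(-1/(2q')), for t > 0 and p < q.
    q_conj = q / (q - 1.0)
    a = 1.0 / (2 * p) - 1.0 / (2 * q)
    b = 1.0 / (2 * q_conj)
    return t ** a * (1.0 + t) ** (-b)

def argmax_phi(p, q):
    # Zero of d/dt log(phi): a/t = b/(1+t), i.e. t = a/(b - a) = p' * (1/p - 1/q).
    p_conj = p / (p - 1.0)
    return p_conj * (1.0 / p - 1.0 / q)
```

For $p = 1.5$, $q = 3$ the maximum sits at $t^* = 1$, and $\varphi$ is already below $0.02$ at $t = 10^{-12}$ and at $t = 10^{12}$, which is the decay at both ends that the proof relies on.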
4.4. Remarks and conjectures. Formula 4.1(a) gives the sharp bound. The question that is incompletely resolved here is whether there is a Gaussian maximizer in the degenerate case or, indeed, any maximizer at all. In the cases of most interest (e.g., Nelson's kernel of Sect. I and the Fourier transform) the existence of a Gaussian maximizer can easily be verified by simple computation. The general case is algebraically complex, although Theorem 4.3 does give a criterion for a Gaussian maximizer and it completely settles the case of real Gaussian kernels. Indeed, as shown in 4.2, a maximizer need not exist even if $\mathcal{G}$ is bounded. The examples given here lead to the following conjectures.
(1) If there is a maximizer for cases (A), (B) or (C) of Sect. I then there is a Gaussian maximizer.
(2) There is a maximizer in these cases if and only if the unique Gaussian maximizer $g_\varepsilon$ for the mollified kernel $G_\varepsilon(x, y) = G(x, y)h_\varepsilon(x)h_\varepsilon(y)$ defined in the proof of Theorem 4.1 has a strong limit $g$ in $L^p(R^n)$ as $\varepsilon \to 0$.

Maximizers need not be unique, as shown in 4.2, but if there is any Gaussian maximizer for $p < q$ then every maximizer is a Gaussian. This is Theorem 4.5, and it completely settles the Fourier transform case, for example. (Note that when $p = q = 2$, every function in $L^2(R^n)$ is a maximizer for the Fourier transform and thus there is at least one case in which there are maximizers that are not Gaussians.) Theorem 4.5 also completely settles the real Case (A) because, by Theorem 4.3, no maximizer exists in this case when $p \ge q$ and a Gaussian maximizer does exist when $p < q$.
4.5. Theorem (when $p < q$, a Gaussian maximizer implies all maximizers are Gaussians). Let $1 < p < q < \infty$ and let $G$ be a degenerate Gaussian kernel. Assume that $\mathcal{G}$ is a bounded operator from $L^p(R^n)$ to $L^q(R^n)$ and that $g$ is a Gaussian function that is a maximizer for $\mathcal{G}$. If $f \in L^p(R^n)$ is another maximizer for $\mathcal{G}$ then $f$ is also a Gaussian (but $f$ is not necessarily proportional to $g$ and $f$ is not necessarily centered even if $G$ is).
Proof. Step 1. According to Lemmas 2.2 and 2.3 it can be assumed without loss of generality that both $G$ and $g$ are centered. As in the proof of Theorem 3.4, we study the kernel $G^{(2)} = G \otimes G$. For $F \in L^p(R^{2n}) \cap L^1(R^{2n})$ the inequalities (1)-(5) there are valid and we conclude that $C_{p\to q}(G^{(2)}) = (C_{p\to q})^2$, where $C_{p\to q} \equiv C_{p\to q}(G)$.

Step 2. If $f \in L^p(R^n)$ is a maximizer for $\mathcal{G}$ then, using $O(2)$ invariance again,
$$F(y_1, y_2) = f\Big(\frac{y_1 + y_2}{\sqrt 2}\Big)g\Big(\frac{y_1 - y_2}{\sqrt 2}\Big) \tag{1}$$
is obviously a maximizer for $\mathcal{G}^{(2)}$ if $f$ is also in $L^1(R^n)$, in which case $F \in L^1(R^{2n})$. If $f \notin L^1(R^n)$ consider the mollified function $F_j(y_1, y_2) = F(y_1, y_2)\exp\{-(y_1 + y_2, y_1 + y_2)/j\}$ for $j = 1, 2, \dots$, which is in $L^1(R^{2n})$. Clearly $F_j \to F$ strongly in $L^p(R^{2n})$ as $j \to \infty$. The function $\mathcal{G}^{(2)}F_j$ can be computed as a $dy_1\,dy_2$ integral of $G^{(2)}F_j$ and the result (using the $O(2)$ invariance of $G^{(2)}$ and a change of variables) is
$$(\mathcal{G}^{(2)}F_j)(x_1, x_2) = (\mathcal{G}f_j)\Big(\frac{x_1 + x_2}{\sqrt 2}\Big)\cdot(\mathcal{G}g)\Big(\frac{x_1 - x_2}{\sqrt 2}\Big)$$
with $f_j(y) = f(y)\exp\{-2(y, y)/j\}$. Now the $q$th power integral of $\mathcal{G}^{(2)}F_j$ can be computed by changing variables again and the result is $\|\mathcal{G}^{(2)}F_j\|_q^q = \|\mathcal{G}f_j\|_q^q\,\|\mathcal{G}g\|_q^q$. However $\|\mathcal{G}f_j\|_q \to \|\mathcal{G}f\|_q = C_{p\to q}\|f\|_p$ as $j \to \infty$ since $f_j \to f$ in $L^p(R^n)$ norm, and we conclude that $\|\mathcal{G}^{(2)}F\|_q = \lim_{j\to\infty}\|\mathcal{G}^{(2)}F_j\|_q$ (by definition) $= (C_{p\to q})^2\|F\|_p$, so that $F$ is indeed a maximizer for $\mathcal{G}^{(2)}$.
Step 3. Since $g$ is a Gaussian, it is obvious that the function $z \mapsto F(z, y)$ is in $L^p(R^n) \cap L^1(R^n)$ for each $y$ and therefore that
$$K(x, y) = \int G(x, z)F(z, y)\,dz \tag{2}$$
is well defined for each $x$ and $y$ in $R^n$. Since $\mathcal{G}$ is a bounded operator, the function $x \mapsto K(x, y)$ is in $L^q(R^n)$ for each $y$. We now assert that the function $y \mapsto K(x, y)$ is in $L^p(R^n)$ for almost every $x \in R^n$ and that this function satisfies
$$\Big\{\int\Big[\int|K(x, y)|^q\,dx\Big]^{p/q}dy\Big\}^{q/p} = \int\Big\{\int|K(x, y)|^p\,dy\Big\}^{q/p}dx \tag{3}$$
with the understanding that both sides of (3) are finite. Formally, this assertion is a consequence of inequality (3) → (4) in the proof of Theorem 3.4 and the fact that all the inequalities (1)-(5) must be equalities since $F(y_1, y_2)$ is a maximizer for $\mathcal{G}^{(2)}$. If $F \in L^1(R^{2n})$ this would be correct, but if $F \notin L^1(R^{2n})$ a proof is needed. Set $F_j(y_1, y_2) = F(y_1, y_2)\exp\{-(y_2, y_2)/j\}$ for $j = 1, 2, \dots$. Clearly $F_j \in L^1(R^{2n})$ and $F_j \to F$ strongly in $L^p(R^{2n})$ as $j \to \infty$. (Note that this $F_j$ is not the same one as in step 2.) Let $K_j(x, y)$ be as in (2) with $F$ replaced by $F_j$, so that $K_j(x, y) = K(x, y)\exp\{-(y, y)/j\}$. The inequalities (1)-(5) in the proof of Theorem 3.4 are then valid with $F$ replaced by $F_j$. As $j \to \infty$ the left side of these inequalities, namely $\|\mathcal{G}^{(2)}F_j\|_q^q$, converges to $\|\mathcal{G}^{(2)}F\|_q^q = (C_{p\to q})^{2q}\|F\|_p^q \equiv (C_{p\to q})^q Z$ since $F$ is a maximizer. Likewise, the right side, namely $(C_{p\to q})^{2q}\|F_j\|_p^q$, converges to $(C_{p\to q})^q Z$ since $F_j \to F$. Therefore the numbers
$$B_j \equiv \Big\{\int\Big[\int|K_j(x, y)|^q\,dx\Big]^{p/q}dy\Big\}^{q/p} - \int\Big\{\int|K_j(x, y)|^p\,dy\Big\}^{q/p}dx \tag{4}$$
(which are nonnegative by Minkowski's inequality) must converge to zero as $j \to \infty$. Moreover, each term in $B_j$ is bounded by $(C_{p\to q})^q\|F_j\|_p^q \le Z$, and each term converges to $Z$ as $j \to \infty$ (because of inequalities (1)-(5)). The first term in $B_j$ is
$$A_j = \Big\{\int\Big[\int|K(x, y)|^q\,dx\Big]^{p/q}\exp\Big\{-\frac{p}{j}(y, y)\Big\}dy\Big\}^{q/p}$$
and, by the monotone convergence theorem, $A_j$ converges to $A \equiv$ (the left side of (3)). Therefore $A = Z$. The second term in $B_j$ is
$$D_j = \int\Big\{\int|K(x, y)|^p\exp\Big\{-\frac{p}{j}(y, y)\Big\}dy\Big\}^{q/p}dx.$$
The inner integral (call it $E_j(x)$) converges (by monotone convergence) to $E(x) \equiv \int|K(x, y)|^p\,dy$. The function $E$ is measurable since it is the monotone limit of measurable functions $E_j$. Then $\int\{E_j\}^{q/p}$ converges to $\int\{E\}^{q/p}$ by monotone convergence, so $D_j$ converges to the right side of (3). But, as stated above, $D_j$ also converges to $Z$, so the two sides of (3) are equal and $E(x)$ is finite for almost every $x$, as asserted.
Step 4. Since $q > p$, the strong form of Minkowski's inequality and the equality in (3) implies the existence of measurable functions $\alpha$ and $\beta: R^n \to [0, \infty)$ such that
$$|K(x, y)| = \alpha(x)\beta(y) \tag{5}$$
for almost every $x$ and $y$ in $R^n$. Writing $G(x, y) = \exp\{-(x, Ax) - (y, By) - 2(x, Dy)\}$ as usual (with $A$ and $B$ real, symmetric, positive definite), and writing $g(y) = \exp\{-(y, Jy)\}$ (with $J$ symmetric and $\mathrm{Re}(J)$ positive definite) a simple computation gives
$$K(x, y) = \exp\{-(x, Ax) + (D^Tx, (B + J)^{-1}D^Tx) - (y, Jy)\}\,Q((B + J)y - D^Tx) \tag{6}$$
with $Q: C^n \to C$ given by
$$Q(w) = \exp\{-(w, (B + J)^{-1}w)\}\int f(z/\sqrt 2)\exp\{-(z, (B + \tfrac12 J)z) + 2(z, w)\}\,dz. \tag{7}$$
Evidently $Q$ is an entire analytic function of order at most 2. Define the function $M: R^{2n} \to C$ by $M(x, y) = Q((B + J)y - D^Tx)$. Plainly, since $Q$ is entire $M$ has an extension to an entire analytic function from $C^{2n}$ to $C$; call this extension $N$. The function $N^*: C^{2n} \to C$ defined by $N^*(x, y) = \overline{N(\bar x, \bar y)}$ for $x$ and $y \in C^n$ is also entire analytic, and thus $P \equiv NN^*$ is entire analytic as well. It is also true that $P(x, y) = |M(x, y)|^2$ when $x$ and $y$ are in $R^n$. From (5) and (6),
$$P(x, y) = \gamma(x)\delta(y) \tag{8}$$
for almost every $x$ and $y$ in $R^n$, where $\gamma$ and $\delta: R^n \to [0, \infty)$ are the measurable functions given by $\gamma(x) = \alpha(x)^2\exp\{2(x, Ax) - 2\mathrm{Re}((D^Tx, (B + J)^{-1}D^Tx))\}$ and $\delta(y) = \beta(y)^2\exp\{2\mathrm{Re}((y, Jy))\}$. If $y_0$ is a value of $y$ such that $\delta(y_0) \ne 0$ and such that (8) holds for almost every $x$, we see by substituting this $y_0$ in (8) that $\gamma$ has an extension to an entire analytic function. Likewise, $\delta$ has an extension. Thus (8) holds for every $x$ and $y$ in $C^n$ (because if two entire functions agree almost everywhere on $R^n \times R^n$ then they agree on all of $C^n \times C^n$).

Now suppose that $\gamma(x_0) = 0$ for some $x_0$ in $C^n$. Then, by (8), $P(x_0, y) = 0$ for every $y \in C^n$, which implies that for each $y$ either (i) $N(x_0, y) = 0$ or (ii) $N^*(x_0, y) = 0$. This, in turn, means that for each $y \in C^n$ either (i) $N(x_0, y) \equiv Q((B + J)y - D^Tx_0) = 0$ or (ii) $N(\bar x_0, \bar y) \equiv Q((B + J)\bar y - D^T\bar x_0) = 0$. Necessarily, either case (i) holds for all $y$ in some set $S \subset C^n$ of positive $2n$-dimensional Lebesgue measure $\mathcal{L}^{2n}$ or case (ii) holds in some set $S$ of positive $\mathcal{L}^{2n}$ measure. As $y$ ranges over $S$ both $(B + J)y$ and $(B + J)\bar y$ range over sets of positive $\mathcal{L}^{2n}$ measure (because $\mathrm{Re}(B + J)$ is positive definite and therefore $\mathrm{Rank}(B + J) = n$). An analytic function that vanishes on a set of positive $\mathcal{L}^{2n}$ measure vanishes identically, and thus $Q$ would vanish identically if $\gamma(x_0) = 0$. This contradicts the fact that $K(x, y)$ is not identically zero. Thus, the assumption that $\gamma(x_0) = 0$ is not possible, and it will be assumed henceforth that $\gamma(x) \ne 0$ for all $x \in C^n$.

Define the set $A = \{y \in R^n: \delta(y) \ne 0\} \subset R^n$. This set $A$ has positive $n$-dimensional Lebesgue measure $\mathcal{L}^n$, for otherwise $K(x, y) = 0$, $\mathcal{L}^{2n}$ almost everywhere. (In fact $\mathcal{L}^n(R^n \setminus A) = 0$ because $\delta$ is analytic and $\delta$ does not vanish identically, but this fact is not needed.) For $y \in A$, the function $Z_y: C^n \to C$ defined by $Z_y(x) = K(x, y)$ is entire analytic of order at most 2 and never zero (because $\gamma(x)$ is never zero). Then $Z_y$ has the form
$$Z_y(x) = K(x, y) = \exp\{-(x, T_yx) - (R_y, x) + \mu_y\} \tag{9}$$
where $T_y$ is a complex, symmetric matrix, $R_y \in C^n$ and $\mu_y \in C$ (all of which depend on $y$). I thank Eric Carlen for the simple proof of this fact, which is that $Z_y$, being zero free, has an entire analytic logarithm, i.e., $Z_y = \exp\{H_y\}$. Then, since $Z_y$ has order at most 2, $|H_y(x)|$ is bounded above by $(\text{const.})|x|^2$. By a well known argument using Cauchy's integral formula, $H_y$ must be a polynomial whose order is at most 2, i.e., $Z_y$ has the form stated in (9).
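The Cauchy-formula step can be spelled out in one complex variable. The display below is a sketch under the assumption that the growth bound takes the form $|H(x)| \le c(1 + |x|^2)$ on all of $C$, where $c$ is a hypothetical constant; in $C^n$ the same estimate can be applied to the restriction of $H_y$ to each complex line through the origin.

```latex
% Taylor coefficients of an entire function H(x) = \sum_{k \ge 0} a_k x^k,
% estimated on the circle |x| = r:
a_k = \frac{1}{2\pi i}\oint_{|x|=r}\frac{H(x)}{x^{k+1}}\,dx ,
\qquad
|a_k| \le \frac{\max_{|x|=r}|H(x)|}{r^{k}} \le \frac{c\,(1+r^{2})}{r^{k}} .
% Letting r \to \infty gives a_k = 0 for every k \ge 3, hence
H(x) = a_0 + a_1 x + a_2 x^{2} .
```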
Step 5. As noted in step 2, the function $K(\cdot, y)$ is in $L^q(R^n)$ for almost every $y \in R^n$. By (4) → (5) of Theorem 3.4, the function $z \mapsto F(z, y)$ (which is in $L^p(R^n)$ for almost every $y$) must be a maximizer of $\mathcal{R}_{p\to q}$ for almost every $y$. (Note that $z \mapsto F(z, y)$ cannot be the zero function for any $y$ since $g$ never vanishes.) Thus there is at least one point $y_0 \in R^n$ such that $\delta(y_0) \ne 0$ and (9) holds and such that $z \mapsto F(z, y_0)$ is a maximizer in $L^p(R^n)$. Fix this $y_0$ henceforth and denote the matrix in (9) simply by $T$. Since $K(\cdot, y_0) \in L^q(R^n)$, the matrix $T$ must satisfy $\mathrm{Re}(T)$ is positive definite and therefore $K(\cdot, y_0)$ is a Gaussian. There is then a function $h \in L^{q'}(R^n)$ with $\|h\|_{q'} = 1$ such that
$$\int h(x)K(x, y_0)\,dx = \|K(\cdot, y_0)\|_q = C_{p\to q}\|F(\cdot, y_0)\|_p. \tag{10}$$
The optimum $h$ satisfies $h(x) = (\text{const.})|K(x, y_0)|^q/K(x, y_0)$ for $x \in R^n$ and therefore $h$ is also a Gaussian (and hence $h \in L^1(R^n)$). As remarked in step 3, $F(\cdot, y_0)$ is in $L^1(R^n)$. Therefore the function $(x, y) \mapsto h(x)G(x, y)F(y, y_0)$ is in $L^1(R^{2n})$ and Fubini's theorem can be applied to (10). Thus,
$$\int h(x)K(x, y_0)\,dx = \int\Big\{\int h(x)G(x, z)\,dx\Big\}F(z, y_0)\,dz. \tag{11}$$
Since $h$ is a Gaussian, the inner integral in (11) (call it $k(z)$) is also a Gaussian. Since $F(\cdot, y_0)$ is a maximizer, $F(z, y_0) = (\text{const.})|k(z)|^{p'}/k(z) \equiv r(z)$ for almost every $z \in R^n$. Clearly $r$ is a Gaussian and, by (1),
$$f\Big(\frac{z + y_0}{\sqrt 2}\Big)g\Big(\frac{z - y_0}{\sqrt 2}\Big) = r(z) \tag{12}$$
for almost every $z \in R^n$. Setting $z = w - y_0$, (12) yields $f(w/\sqrt 2) = r(w - y_0)/g((w - 2y_0)/\sqrt 2)$, which is a Gaussian (in $w$) as asserted in the theorem. □
V. Gaussian kernels from $L^p(R^n)$ to $L^q(R^m)$

This section consists essentially of a simple remark, but it can be a useful one in applications, e.g., in [LI]. Let $G$ be a Gaussian kernel on $R^m \times R^n$ with $m \ne n$, i.e., $G(x, y)$ is given by (1.1) with $A$ $m \times m$ symmetric, $B$ $n \times n$ symmetric, $D$ $m \times n$ and $L \in C^{m+n}$, and with $M$ in (1.3) a positive semidefinite $(m + n) \times (m + n)$ matrix. Evidently Lemmas 2.1, 2.2 and 2.3 continue to hold in this case, and it can be assumed without loss of generality that $A$ and $B$ are real and $L = 0$. The linear operator $\mathcal{G}$ from $L^p(R^n)$ to $L^q(R^m)$ and the norm $C_{p\to q}(G)$ are defined, mutatis mutandis, as in Sect. I. The remark is the following.

5.1. Theorem (extension to $m \ne n$). Let $G$ be a Gaussian kernel on $R^m \times R^n$ as defined above. Then all the preceding theorems and lemmas in this paper hold, mutatis mutandis, in this more general case.

Proof. Suppose $m < n$ and extend $G$ to a Gaussian kernel, $\tilde G$, on $R^n \times R^n$ by
$$\tilde G(x, y) = h(x_1)G(x_2, y)$$
where $x \in R^n$ is written as $(x_1, x_2)$ with $x_1 \in R^{n-m}$ and $x_2 \in R^m$, and where $h(x_1) \equiv \exp\{-(x_1, x_1)\}$. Let $\tilde{\mathcal{G}}$ be the corresponding operator from $L^p(R^n)$ to $L^q(R^n)$. Note that $\tilde G$ has the same properties as $G$, i.e., the degeneracy or nondegeneracy of $\tilde G$ is the same as that of $G$; $\tilde G$ is in Case (A), (B) or (C) if $G$ is; the $n \times n$ matrix $\begin{pmatrix} I & 0 \\ 0 & A \end{pmatrix}$ is positive definite if and only if $A$ is. Also, $\tilde{\mathcal{G}}$ is unbounded if $\mathcal{G}$ is, and it will be assumed henceforth that $\mathcal{G}$ is bounded. If $f \in L^p(R^n)$ then evidently, as functions in $L^q(R^n)$, $(\tilde{\mathcal{G}}f)(x) = h(x_1)(\mathcal{G}f)(x_2)$ and thus $\|\tilde{\mathcal{G}}f\|_{L^q(R^n)} = \|h\|_{L^q(R^{n-m})}\|\mathcal{G}f\|_{L^q(R^m)}$. This proves that $C_{p\to q}(\tilde G) = C_{p\to q}(G)\|h\|_{L^q(R^{n-m})}$ and that $f$ is a maximizer for $\tilde{\mathcal{G}}$ if and only if $f$ is a maximizer for $\mathcal{G}$. This concludes the $m < n$ case. If $m > n$ duality can be used: $C_{p\to q}(G) = C_{q'\to p'}(G^T)$ where $G^T(x, y) = G(y, x)$. This changes the $m > n$ case into the $m < n$ case and, since all the theorems in this paper are "duality invariant", the $m > n$ case is proved. □

Remark. Clearly the proof of Theorem 5.1 is such that if other cases with $m = n$ are settled in the future then Theorem 5.1 for $m \ne n$ holds for those cases as well.

VI. Multilinear forms in the real case and Young's inequality
After Sects. I to V were completed, Eric Carlen suggested that the same methods should yield similar results for real multilinear forms. Indeed this is so and the proof is outlined here (the omitted details are merely a repetition of those given before). Some remarks about the complex case will also be made here. Finally, Theorem 6.2 contains an application of the result in Sect. 6.1 for real multilinear forms: The truly multidimensional generalization of Young's inequality, which was surmised in [BL, p. 162], will be proved.
6.1. Multilinear forms. For $i = 1, 2, \ldots, K$ let $n_i$ be a positive integer and let $x_i$ denote a point in $\mathbb{R}^{n_i}$. The point $X = (x_1, \ldots, x_K)$ denotes a point in $\mathbb{R}^N$ with $N = \sum_{i=1}^K n_i$. Let $G(X)$ be a "Gaussian kernel", i.e.,
$$G(X) = \exp\Big\{ -\sum_{i=1}^{K} \sum_{j=1}^{K} (x_i, A_{ij} x_j) + 2(L, X) \Big\},$$
where $A_{ij}$ is an $n_i \times n_j$ matrix with $A_{ij} = A_{ji}^T$ and where $L \in \mathbb{C}^N$. The $N \times N$ symmetric matrix $A$ is the matrix whose blocks are the $A_{ij}$'s and $G$ is said to be nondegenerate if $M \equiv \mathrm{Re}(A)$ is positive definite. Otherwise $M \ge 0$ and $G$ is degenerate.
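The nondegeneracy condition above can be checked mechanically for a concrete kernel. The following is a minimal sketch, not part of the paper: the helper names `assemble`, `is_positive_definite` and `is_nondegenerate` are hypothetical, the blocks are passed as nested Python lists, and symmetry of $A$ is assumed rather than verified. It stacks the blocks $A_{ij}$ into the $N \times N$ matrix $A$ and tests whether $M = \mathrm{Re}(A)$ is positive definite by attempting a Cholesky factorization.

```python
import math


def assemble(blocks):
    """Stack a K x K array of blocks A_ij (each a list of rows) into one N x N matrix."""
    rows = []
    for block_row in blocks:
        height = len(block_row[0])          # n_i, the number of rows in this block row
        for r in range(height):
            row = []
            for blk in block_row:
                row.extend(blk[r])
            rows.append(row)
    return rows


def is_positive_definite(m, tol=1e-12):
    """Cholesky test for a real symmetric matrix (assumed symmetric, not checked)."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= tol:                # non-positive pivot: not positive definite
                    return False
                low[i][i] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True


def is_nondegenerate(blocks):
    """G is nondegenerate when M = Re(A) is positive definite."""
    a = assemble(blocks)
    m = [[complex(z).real for z in row] for row in a]
    return is_positive_definite(m)
```

For example, with $K = 2$ and $n_1 = n_2 = 1$, the blocks `[[[[1.0]], [[0.5]]], [[[0.5]], [[1.0]]]]` give a nondegenerate kernel, while replacing the off-diagonal entries by `1.0` gives a degenerate one (the real part is only positive semidefinite).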
Let $P = (p_1, \ldots, p_K)$ satisfy $1 < p_i < \infty$ for each $i$. The multilinear form is
$$\mathcal{G}_P(f_1, \ldots, f_K) = \int G(x_1, \ldots, x_K) \prod_{i=1}^{K} f_i(x_i)\, dx_1 \cdots dx_K, \tag{1}$$
where the integration is over $\mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \cdots \times \mathbb{R}^{n_K}$ and each $f_i \in L^{p_i}(\mathbb{R}^{n_i})$. The problem is to evaluate
$$C_P = \sup |\mathcal{G}_P(f_1, \ldots, f_K)| \tag{2}$$
where the supremum is over $f_i$'s with $\|f_i\|_{p_i} = 1$. As before, if $G$ is degenerate we have to take $f_i \in L^{p_i}(\mathbb{R}^{n_i}) \cap L^1(\mathbb{R}^{n_i})$ and then take limits. The cases treated in Sects. I to V correspond to $K = 2$ with $p_1 = p$ and $p_2 = q'$. The case $K = 1$ is trivial by Hölder's inequality.

Lemma 2.1 is easily generalized to the complex, nondegenerate multilinear case; the details are left to the reader. The conclusion of Lemma 2.1 holds for each $f_i$ in a maximizing set $(f_1, \ldots, f_K)$. The conclusions (a), (b) and (c) follow by fixing all the $f_j$'s with $j \neq i$ and then investigating the dependence of $\mathcal{G}_P(f_1, \ldots, f_K)$ on $f_i$. Lemma 2.2 obviously carries through as well; that is $A$ can be assumed to be real and $G$ can be assumed to be centered, i.e., $L = 0$. Likewise Lemma 2.3 carries through: When $G$ is centered (i.e., $L = 0$) and when the supremum in (2) is restricted to Gaussian functions $f_i$, then each $f_i$ can be taken to be centered and, in the nondegenerate case, each $f_i$ must be centered.

Let us now turn to the real case, i.e., each $A_{ij}$ is real and $L = 0$. Theorem 3.2 for the nondegenerate case carries through for every choice of $P$. The maximizing $K$-tuple $(f_1, \ldots, f_K)$ is unique (up to multiplicative constants) and each $f_i(x) = \exp\{-(x, J_i x)\}$ with $J_i$ being real and positive definite. To prove that $f_1$, say, has this property we write (with $q = p_1'$)
$$C_P = \sup \|\hat{\mathcal{G}}(f_2, \ldots, f_K)\|_q$$
where the supremum is on $f_2, \ldots, f_K$ with $\|f_i\|_{p_i} = 1$ and where
$$\hat{\mathcal{G}}(f_2, \ldots, f_K)(x_1) = \int G(x_1, \ldots, x_K) \prod_{j=2}^{K} f_j(x_j)\, dx_2 \cdots dx_K.$$
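As before, we replace $\hat{\mathcal{G}}$ by $\hat{\mathcal{G}}^{(2)}$ and $f_2(x_2), \ldots, f_K(x_K)$ by $F_2(x_2, y_2), \ldots, F_K(x_K, y_K)$ with $F_i \in L^{p_i}(\mathbb{R}^{2n_i})$. To imitate the inequalities (1)-(5) in Theorem 3.2, define
$$K(x_1, y_2, \ldots, y_K) = \int G(x_1, x_2, \ldots, x_K) \prod_{j=2}^{K} F_j(x_j, y_j)\, dx_2 \cdots dx_K.$$
Then, proceeding as in (1)-(5) (and with the $F_i$ nonnegative for the same reason as before)
$$\begin{aligned}
\|\hat{\mathcal{G}}^{(2)}(F_2, \ldots, F_K)\|_q^q
&= \int \Big[ \int G(y_1, \ldots, y_K)\, K(x_1, y_2, \ldots, y_K)\, dy_2 \cdots dy_K \Big]^q dx_1\, dy_1 \\
&\le \int \Big\{ \int G(y_1, \ldots, y_K) \Big[ \int K(x_1, y_2, \ldots, y_K)^q\, dx_1 \Big]^{1/q} dy_2 \cdots dy_K \Big\}^q dy_1 \\
&\le (C_P)^q \int \Big\{ \int G(y_1, \ldots, y_K) \prod_{j=2}^{K} h_j(y_j)\, dy_2 \cdots dy_K \Big\}^q dy_1 \\
&\le (C_P)^{2q} \prod_{j=2}^{K} \|F_j\|_{p_j}^q
\end{aligned}$$
(with $h_j(y) = [\int F_j(x, y)^{p_j}\, dx]^{1/p_j}$).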
As before, Minkowski's inequality implies that
$$K(x_1, y_2, \ldots, y_K) = A(x_1)\,E(y_2, \ldots, y_K)$$
which then implies that
$$\hat{\mathcal{G}}^{(2)}(F_2, \ldots, F_K)(x_1, y_1) = A(x_1)\,Z(y_1).$$
However, $\hat{\mathcal{G}}^{(2)}(F_2, \ldots, F_K) = F_1^{p_1 - 1}$, and hence $F_1$ is a product function. The rest of the proof is identical to the proof of Theorem 3.2.

By taking limits, the analogue of Theorem 4.1 holds in the degenerate case whenever it is known that the nondegenerate case has a Gaussian maximizer $K$-tuple. In particular, Theorem 4.1 holds in the real case. The analogue of 4.1 (a) is that $C_P$ is given by (2) with the supremum restricted to centered Gaussian functions. Likewise, Theorem 4.3 extends to the multilinear case under the same assumption about the nondegenerate case; the analogous hypothesis is that each $A_{ii}$ is positive definite. In particular Theorem 4.3 holds in the real case.

These results can be used to derive the sharp constants in the fully multidimensional generalized Young's inequality. Recall that Young's original inequality states that if $f \in L^p(\mathbb{R}^n)$ and $g \in L^q(\mathbb{R}^n)$ then $f * g \in L^r(\mathbb{R}^n)$ with $1/p + 1/q = 1 + 1/r$; here $*$ denotes convolution. The sharp constant in this inequality was derived simultaneously by Beckner [B1, B2] and by Brascamp and Lieb [BL]. Another way to state the inequality is that
$$\int_{\mathbb{R}^n} \int_{\mathbb{R}^n} h(x) f(x - y) g(y)\, dx\, dy \le C\, \|h\|_r \|f\|_p \|g\|_q \tag{3}$$
with $1/p + 1/q + 1/r = 2$. The Beckner, Brascamp-Lieb result is that $C$ can be determined by restricting $f$, $g$ and $h$ to be Gaussian functions. (These, in fact, are the only maximizers, as shown in [BL].)

Young's inequality (3) was generalized in several ways in [BL]. The first way is to allow an arbitrary number of functions $f_1, \ldots, f_K$ instead of merely three as in (3). These are functions from $\mathbb{R}^n$ to $\mathbb{C}$ and $f_i \in L^{p_i}(\mathbb{R}^n)$. This is Theorem 7 of [BL]. The integration is then over $(\mathbb{R}^m)^n$ and the arguments of the $f_i$'s are taken to be $((a^i, x_1), \ldots, (a^i, x_n)) \in \mathbb{R}^n$, where $a^i \in \mathbb{R}^m$ are specified vectors and $x_j \in \mathbb{R}^m$. Unfortunately, this is not a fully $mn$-dimensional generalization of the $n = 1$ result because $\mathbb{R}^{mn}$ is split unnaturally into $(\mathbb{R}^m)^n$. Following Theorem 7 in [BL] we asked whether the full generalization is possible and Theorem 6.2 below gives it. A second generalization was the incorporation of a fixed Gaussian function in the integral, as in Theorem 6 of [BL]. Again, the Gaussian in [BL] was completely general when $n = 1$, but not otherwise. In Theorem 6.2 it is completely general.
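The statement that the sharp constant in (3) is attained on Gaussians can be illustrated in one dimension, where every quantity has a closed form. The sketch below is an illustration and not taken from the paper. It assumes the standard one-dimensional value of the sharp constant, $C = A_p A_q A_r$ with $A_s^2 = s^{1/s}/s'^{1/s'}$, and the standard maximizers $f(x) = e^{-p'x^2}$, $g(x) = e^{-q'x^2}$, $h(x) = e^{-r'x^2}$; the function names are hypothetical.

```python
import math


def dual(p):
    """Dual index p' with 1/p + 1/p' = 1 (requires p > 1)."""
    return p / (p - 1.0)


def beckner(p):
    """A_p = sqrt(p^(1/p) / p'^(1/p')) for 1 < p < infinity (assumed standard form)."""
    pp = dual(p)
    return math.sqrt(p ** (1.0 / p) / pp ** (1.0 / pp))


def gaussian_norm(a, p):
    """L^p(R) norm of exp(-a x^2), a > 0."""
    return (math.pi / (a * p)) ** (1.0 / (2.0 * p))


def young_ratio(a, b, c, p, q, r):
    """Left side of (3) over the product of norms for
    f = exp(-a x^2) in L^p, g = exp(-b x^2) in L^q, h = exp(-c x^2) in L^r."""
    # Quadratic form c x^2 + a (x - y)^2 + b y^2 has determinant ab + bc + ca.
    integral = math.pi / math.sqrt(a * b + b * c + c * a)
    return integral / (gaussian_norm(c, r) * gaussian_norm(a, p) * gaussian_norm(b, q))


def sharp_constant(p, q, r):
    """Assumed sharp constant for n = 1 when 1/p + 1/q + 1/r = 2."""
    return beckner(p) * beckner(q) * beckner(r)
```

With $p = 1.2$, $q = 1.5$, $r = 2$ the ratio evaluated at $(a, b, c) = (p', q', r') = (6, 3, 2)$ agrees with `sharp_constant(p, q, r)` to rounding error, and other choices of $(a, b, c)$ give a smaller ratio.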
6.2. Theorem (fully generalized Young's inequality). Fix $K > 1$, $n_1, \ldots, n_K$ and $p_1, \ldots, p_K > 1$ as before. Let $M \ge 1$ be an integer and let $B_i$ (for $i = 1, \ldots, K$) be a linear mapping from $\mathbb{R}^M$ to $\mathbb{R}^{n_i}$. For nonnegative functions $f_1, \ldots, f_K$, with $f_i \in L^{p_i}(\mathbb{R}^{n_i})$ consider
$$I(f_1, \ldots, f_K) = \int_{\mathbb{R}^M} \prod_{i=1}^{K} f_i(B_i x)\, dx. \tag{1}$$
More generally, let $g: \mathbb{R}^M \to \mathbb{R}^+$, $g(x) = \exp\{-(x, Jx)\}$, be a fixed, centered, real Gaussian function and consider
$$I_g(f_1, \ldots, f_K) = \int_{\mathbb{R}^M} \prod_{i=1}^{K} f_i(B_i x)\, g(x)\, dx. \tag{2}$$
Let
$$C_g = \sup_{f_1, \ldots, f_K} \{ I_g(f_1, \ldots, f_K) : \|f_i\|_{p_i} = 1 \} \tag{3}$$
and similarly for $C$ (with $I_g$ replaced by $I$). Then
$$C_g = \sup \{ I_g(f_1, \ldots, f_K) : f_1, \ldots, f_K \text{ are real, centered Gaussian functions with } \|f_i\|_{p_i} = 1 \}, \tag{4}$$
and similarly for $C$.
Proof. Suppose the theorem is false and that the right side of (4) (call it $D_g$) is strictly smaller than $C_g$. (Alternatively, $D < C$.) Then there are nonnegative summable functions that are not all Gaussians, $f_1, \ldots, f_K$ of unit $L^{p_i}$ norm, such that $I_g(f_1, \ldots, f_K) > D_g$ (or $I(f_1, \ldots, f_K) > D$). Consider the functions $f_i^{(l)}: \mathbb{R}^{n_i} \to \mathbb{R}^+$ given by $f_i^{(l)} \equiv f_i * g_i^{(l)}$ for $l$ a positive integer, where $g_i^{(l)}(x) = (l/\pi)^{n_i/2} \exp\{-l(x, x)\}$ is an $L^1(\mathbb{R}^{n_i})$ normalized Gaussian function. We note that $\|f_i^{(l)}\|_{p_i} \le 1$ and that $f_i^{(l)} \to f_i$ in $L^{p_i}(\mathbb{R}^{n_i})$ as $l \to \infty$. By passing to a subsequence (henceforth still denoted by $l$) we can assume that $f_i^{(l)}(x) \to f_i(x)$ for almost every $x$ in $\mathbb{R}^{n_i}$.

Evidently we can assume that $M \ge \max\{n_1, \ldots, n_K\}$ and that the rank of $B_i$ is $n_i$ for all $i$. Otherwise, $I$ or $I_g$ involves knowledge of some $f_i$ only on a hyperplane in $\mathbb{R}^{n_i}$ and this means that $I$ or $I_g$ can be made arbitrarily large (with all $f_i$'s being Gaussian functions) while preserving $\|f_i\|_{p_i} = 1$; the theorem would then be true in this case because both sides of (4) would be infinite. Similarly, the mapping $W = J + \sum_{i=1}^{K} B_i^* B_i$ (with $*$ denoting adjoint) from $\mathbb{R}^M$ to $\mathbb{R}^M$ is positive definite; otherwise $I_g$ can again be made arbitrarily large with Gaussian $f$'s. A similar condition holds for $I$ with $J = 0$.

Since $B_i$ is linear and has full rank $n_i$, the almost everywhere pointwise convergence of $f_i^{(l)}$ to $f_i$ in $\mathbb{R}^{n_i}$ implies that $f_i^{(l)}(B_i x) \to f_i(B_i x)$ for almost every $x$ in $\mathbb{R}^M$. By Fatou's lemma
$$C_g' \equiv \liminf_{l \to \infty} I_g(f_1^{(l)}, \ldots, f_K^{(l)}) \ge I_g(f_1, \ldots, f_K) > D_g \tag{5}$$
and similarly for $C'$ (with $I$ in place of $I_g$). By Fubini's theorem, however,
$$I_g(f_1^{(l)}, \ldots, f_K^{(l)}) = \int_{\mathbb{R}^N} G^{(l)}(y_1, \ldots, y_K) \prod_{i=1}^{K} f_i(y_i)\, dy_1 \cdots dy_K. \tag{6}$$
Here $N = \sum_{i=1}^{K} n_i$ as in Sect. 6.1, $y_i \in \mathbb{R}^{n_i}$, and $G^{(l)}$ is the centered Gaussian kernel
$$G^{(l)}(y_1, \ldots, y_K) = \int_{\mathbb{R}^M} \prod_{i=1}^{K} g_i^{(l)}(B_i x - y_i)\, g(x)\, dx. \tag{7}$$
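The mollification step used in this proof, $f_i^{(l)} = f_i * g_i^{(l)}$ with an $L^1$-normalized Gaussian, can be illustrated numerically in the simplest case $n_i = 1$. This is a rough sketch under stated assumptions, not part of the paper: the test function (a triangle), the grid, and all function names are hypothetical choices, and integrals are replaced by Riemann sums. It shows the two properties the proof relies on: the $L^p$ norm does not increase under mollification, and $f^{(l)} \to f$ in $L^p$ as $l$ grows.

```python
import math


def mollifier(l, x):
    """L^1-normalised Gaussian g^(l)(x) = sqrt(l/pi) * exp(-l x^2) on R (case n_i = 1)."""
    return math.sqrt(l / math.pi) * math.exp(-l * x * x)


def lp_norm(values, h, p):
    """Riemann-sum approximation of the L^p norm on a uniform grid of spacing h."""
    return (h * sum(abs(v) ** p for v in values)) ** (1.0 / p)


def mollify(values, grid, h, l):
    """Riemann-sum approximation of (f * g^(l)) evaluated on the same grid."""
    return [h * sum(fv * mollifier(l, x - y) for fv, y in zip(values, grid))
            for x in grid]


def demo(p=1.5, h=0.02, half_width=3.0, levels=(4, 16, 64)):
    """Return (l, ||f^(l)||_p, ||f^(l) - f||_p) for a unit-norm triangle function f."""
    n = int(round(half_width / h))
    grid = [k * h for k in range(-n, n + 1)]
    f = [max(0.0, 1.0 - abs(x)) for x in grid]      # hypothetical test function
    norm_f = lp_norm(f, h, p)
    f = [v / norm_f for v in f]                     # normalise to unit L^p norm
    results = []
    for l in levels:
        fl = mollify(f, grid, h, l)
        dist = lp_norm([u - v for u, v in zip(fl, f)], h, p)
        results.append((l, lp_norm(fl, h, p), dist))
    return results
```

Calling `demo()` returns one tuple per mollification level; the second entry stays at or below 1 (up to discretization error) and the third decreases as $l$ increases.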
Similarly, (6) and (7) hold for $I$ in place of $I_g$ by deleting the $g$. (Note: Because $W$ is positive definite, the integral in (7) is always finite.)

The number $C_g'$ defined in (5) is either finite or infinite. In either case, there is some finite integer $k$ such that $C_g^{(k)} \equiv I_g(f_1^{(k)}, \ldots, f_K^{(k)}) > D_g$. However, by (6) we see that $C_g^{(k)}$ is a multilinear form as in 6.1 (1). Such a form has the property, as we have seen in Section 6.1, that its supremum over $f$'s with $\|f_i\|_{p_i} = 1$ is equal to its supremum over real, centered Gaussian functions. But if we set all the $f_i$'s equal to Gaussian functions we have that the $f_i^{(k)}$'s are also Gaussian functions and $\|f_i^{(k)}\|_{p_i} \le 1$. This means that $C_g^{(k)} \le D_g$, and this is a contradiction. The same proof holds for $I$ in place of $I_g$. $\square$
References

[BA] Babenko, K.I.: An inequality in the theory of Fourier integrals. Izv. Akad. Nauk SSSR Ser. Mat. 25, 531-542 (1961); English transl. Am. Math. Soc. Transl. (2) 44, 115-128 (1965)
[B1] Beckner, W.: Inequalities in Fourier analysis. Ann. Math. 102, 159-182 (1975)
[B2] Beckner, W.: Inequalities in Fourier analysis on $\mathbb{R}^n$. Proc. Natl. Acad. Sci. USA 72, 638-641 (1975)
[BL] Brascamp, H.J., Lieb, E.H.: Best constants in Young's inequality, its converse, and its generalization to more than three functions. Adv. Math. 20, 151-173 (1976)
[CA] Carlen, E.: Superadditivity of Fisher's information and logarithmic Sobolev inequalities. J. Funct. Anal. (in press)
[CL] Carlen, E., Loss, M.: Extremals of functionals with competing symmetries. J. Funct. Anal. 88, 437-456 (1990)
[C] Coifman, R., Cwikel, M., Rochberg, R., Sagher, Y., Weiss, G.: Complex interpolation for families of Banach spaces. Am. Math. Soc. Proc. Symp. Pure Math. 35, 269-282 (1979)
[DGS] Davies, E.B., Gross, L., Simon, B.: Hypercontractivity: a bibliographic review. Proceedings of the Hoegh-Krohn memorial conference. Albeverio, S. (ed.) Cambridge: Cambridge University Press 1990
[E] Epperson, Jr., J.B.: The hypercontractive approach to exactly bounding an operator with complex Gaussian kernel. J. Funct. Anal. 87, 1-30 (1989)
[GL] Glimm, J.: Boson fields with nonlinear self-interaction in two dimensions. Commun. Math. Phys. 8, 12-25 (1968)
[G] Gross, L.: Logarithmic Sobolev inequalities. Am. J. Math. 97, 1061-1083 (1975)
[HLP] Hardy, G.H., Littlewood, J.E., Pólya, G.: Inequalities. See Theorem 202 on p. 148. Cambridge: Cambridge University Press 1959
[J] Janson, S.: On hypercontractivity for multipliers on orthogonal polynomials. Ark. Mat. 21, 97-110 (1983)
[L1] Lieb, E.H.: Proof of an entropy conjecture of Wehrl. Commun. Math. Phys. 62, 35-41 (1978)
[L2] Lieb, E.H.: Integral bounds for radar ambiguity functions and Wigner distributions. J. Math. Phys. 31, 594-599 (1990)
[N1] Nelson, E.: A quartic interaction in two dimensions. In: Mathematical theory of elementary particles. Goodman, R., Segal, I. (eds.), pp. 69-73. Cambridge: M.I.T. Press 1966
[N2] Nelson, E.: The free Markov field. J. Funct. Anal. 12, 211-227 (1973)
[NE] Neveu, J.: Sur l'espérance conditionnelle par rapport à un mouvement Brownien. Ann. Inst. H. Poincaré Sect. B. (N.S.) 12, 105-109 (1976)
[SI] Simon, B.: A remark on Nelson's best hypercontractive estimates. Proc. Am. Math. Soc. 55, 376-378 (1976)
[S] Segal, I.: Construction of non-linear local quantum processes: I. Ann. Math. 92, 462-481 (1970)
[T] Titchmarsh, E.C.: A contribution to the theory of Fourier transforms. Proc. London Math. Soc. Ser. 2, 23, 279-289 (1924)
[W] Weissler, F.B.: Two-point inequalities, the Hermite semigroup, and the Gauss-Weierstrass semigroup. J. Funct. Anal. 32, 102-121 (1979)
J. Math. Phys. 31, 594-599 (1990)
Integral bounds for radar ambiguity functions and Wigner distributions
Elliott H. Lieb
Departments of Mathematics and Physics, Princeton University, P. O. Box 708, Princeton, New Jersey 08544
(Received 10 November 1989; accepted for publication 22 November 1989)
An upper bound is proved for the $L^p$ norm of Woodward's ambiguity function in radar signal analysis and of the Wigner distribution in quantum mechanics when $p > 2$. A lower bound is proved for I
The ambiguity function introduced by Woodward$^1$ is important in radar signal analysis. It is a function of two real variables, $\tau$ (the time) and $\omega$ (with $2\pi\omega$ being the frequency), and is defined as follows in terms of two given functions $f$ and $g$ of one variable:
$$A_{f,g}(\tau, \omega) = \int f(t - \tfrac{1}{2}\tau)\, g^*(t + \tfrac{1}{2}\tau)\, e^{-2\pi i \omega t}\, dt. \tag{1.1}$$
(Our conventions will be that $*$ as a superscript denotes complex conjugate and all integrals are from $-\infty$ to $+\infty$.) Strictly speaking, $A_{f,g}$ is called the cross-ambiguity function of $f$ and $g$ while $A_{f,f}$ is the proper ambiguity function of $f$. Usually, one assumes that $f$ and $g$ are square integrable, which guarantees that the integrand of (1.1) is a summable function of $t$ for every $\tau$. The summability can also be guaranteed by Hölder's inequality and the alternative assumption that $f \in L^a$ and $g \in L^b$ (with $1/a + 1/b = 1$ and 1

chanics and defined by
$$W_{f,g}(\tau, \omega) = \int f(\tau + \tfrac{1}{2}s)\, g^*(\tau - \tfrac{1}{2}s)\, e^{2\pi i \omega s}\, ds. \tag{1.2}$$
The relation is
$$W_{f,g}(\tau, \omega) = 2 A_{\tilde{f},g}(2\tau, 2\omega), \tag{1.3}$$
where $\tilde{f}$ denotes the function given by
$$\tilde{f}(t) \equiv f(-t). \tag{1.4}$$
$W_{f,f}$ is called the Wigner distribution (or density) of $f$. Because of (1.3) the bounds obtained here for $A_{f,g}$ apply mutatis mutandis to $W_{f,g}$.

Ideally, one would like to choose $f$ and $g$ so that $A$ is sharply peaked around some point $(\tau_0, \omega_0)$ but, as is well known, there are severe limitations to the peaking that can be achieved. These limitations are inherent in the definition (1.1). Let us define, for $p > 0$,
$$I_{f,g}(p) = \int\!\!\int |A_{f,g}(\tau, \omega)|^p\, d\tau\, d\omega. \tag{1.5}$$
If $A_{f,g}$ were highly peaked then $I_{f,g}(p)$ would be very large for large $p$ and very small for small $p$. The dividing line is $p = 2$ since, by Parseval's inversion formula, we have the identity
$$I_{f,g}(2) = \int |f(t)|^2\, dt \int |g(t)|^2\, dt. \tag{1.6}$$
In this paper, limitations on the sharpness of $A_{f,g}$ will be established by proving that $I_{f,g}(p)$ is universally bounded above when $p > 2$ (Theorem 1) and universally bounded below when 1

are Gaussians. It is remarkable that Gaussians both maximize and minimize $I_{f,g}(p)$, depending on the value of $p$.

When $p = 2$ the identity (1.6) holds for any $f$ and $g$, so the obvious quantity to consider is the derivative with respect to $p$ of $I_{f,g}(p)$ at $p = 2$ under the normalization assumption that the right side of (1.6) is unity. This derivative, multiplied by $-2$, is the entropy given by
$$S_{f,g} = -\int\!\!\int |A_{f,g}(\tau, \omega)|^2 \ln |A_{f,g}(\tau, \omega)|^2\, d\tau\, d\omega, \tag{1.7}$$
with $0 \ln 0 = 0$. It will be proved that when the right side of (1.6) is unity the integral in (1.7) is well defined and (Theorem 3)
$$S_{f,g} \ge 1. \tag{1.8}$$
This constant is sharp since it is achieved by Gaussians.

To state the theorems precisely it is first necessary to make some definitions.

Definition 1: $f(t)$ is said to be a Gaussian if
$$f(t) = \exp[-\alpha t^2 + \beta t + \gamma] \tag{1.9}$$
with $\alpha$, $\beta$, and $\gamma$ being complex numbers and with $\mathrm{Re}(\alpha) > 0$; $f(t)$ is a real Gaussian if $\alpha$, $\beta$, and $\gamma$ are real numbers with $\alpha > 0$. Two functions $f$ and $g$ are said to be a matched Gaussian pair if they are both Gaussians with the same $\alpha$ but with possibly different $\beta$'s and $\gamma$'s.

Definition 2: For $0 < p < \infty$
$$\|f\|_p = \Big\{ \int |f(t)|^p\, dt \Big\}^{1/p} \tag{1.10}$$
and for $p = \infty$
$$\|f\|_\infty = \operatorname*{ess\,sup}_t |f(t)|. \tag{1.11}$$
We say that $f \in L^p$ if and only if the right side of (1.10) or (1.11) is finite.
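The quantities $I_{f,g}(p)$ and $S_{f,g}$ can be evaluated numerically for a concrete pair, which gives a quick sanity check of the claims that Gaussians achieve the constants $2/p$ and $1$. The sketch below is an illustration, not part of the paper. It assumes the reading of (1.1) with kernel $e^{-2\pi i \omega t}$, takes $f = g = 2^{1/4} e^{-\pi t^2}$ (unit $L^2$ norm), and replaces all integrals by Riemann sums on truncated grids; the function names and grid parameters are hypothetical. For this pair one expects $|A_{f,f}(\tau, \omega)| = e^{-\pi(\tau^2 + \omega^2)/2}$, hence $I_{f,f}(p) = 2/p$ and $S_{f,f} = 1$.

```python
import cmath
import math


def unit_gaussian(t):
    """Gaussian 2^(1/4) exp(-pi t^2), which has unit L^2 norm."""
    return 2.0 ** 0.25 * math.exp(-math.pi * t * t)


def ambiguity(f, g, tau, omega, t_grid, dt):
    """Riemann-sum approximation of A_{f,g}(tau, omega) as in (1.1)."""
    total = 0j
    for t in t_grid:
        phase = cmath.exp(-2j * math.pi * omega * t)
        total += f(t - 0.5 * tau) * complex(g(t + 0.5 * tau)).conjugate() * phase
    return total * dt


def moments_and_entropy(ps, step=0.2, half_width=4.0, dt=0.05):
    """Approximate I_{f,f}(p) for each p in ps, and the entropy S_{f,f}."""
    n = int(round(half_width / step))
    axis = [k * step for k in range(-n, n + 1)]
    nt = int(round(half_width / dt))
    t_grid = [k * dt for k in range(-nt, nt + 1)]
    mods = [abs(ambiguity(unit_gaussian, unit_gaussian, tau, om, t_grid, dt))
            for tau in axis for om in axis]
    area = step * step
    i_p = {p: area * sum(m ** p for m in mods) for p in ps}
    entropy = 0.0
    for m in mods:
        sq = m * m
        if sq > 0.0:                        # convention 0 ln 0 = 0
            entropy -= area * sq * math.log(sq)
    return i_p, entropy
```

Calling `moments_and_entropy([1.0, 2.0, 3.0])` takes about a second in pure Python and returns values close to $2$, $1$ and $2/3$ for $I_{f,f}(p)$, together with an entropy close to $1$.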
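Definition 3: Let $0 < p \le \infty$ and let $q$ be the dual index given by $1/p + 1/q = 1$, i.e., $q = p/(p - 1)$. Note that $\infty > q > 1$ if $1 < p < \infty$ and $0 > q > -\infty$ if $0 < p < 1$. Define $C_p$ by
$$C_p^2 = p^{1/p}\, |q|^{-1/q} \tag{1.12}$$
for $p \neq 1$ or $\infty$ while
$$C_1 = C_\infty = 1. \tag{1.13}$$
Note that $C_2 = 1$.
Definition 4: Let p and q be as in Definition 3 with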
IOby H(p,a,b)' = abP-'Ip - 21'
-o
XIp-a[-'' `1p-bI
Pie.
(1.14)
with the convention that 0"= 1. When a orb = p
H(P.a.b)z =
(._.
H(l,l,w) = H(l,w,l) = I.
(1.15)
(1.22)
Note that a or b= p/(p-1) is allowed here. Remarks: (I) Even iff and g are Gaussians, it is not possible to have equality in (1.20) for all a and b simultaneously, as (1.21) shows.
(2) In view of the symmetry of Air between the pair fg and the Fourier transforms f,J expressed by (2.4) below, Theorem I remains true iff and g are replaced by f and g on the right side of (1.19) et seq. In the case that p is an even integer, Theorem 1(a) and (b) [ under the additional assumption for (b) that fend g are
twice continuously differentiable and never vanish [ was proved by Price and Hofstetter' by an ingenious application
The next theorem gives reversed inequalities for p <2.
Theorem2:Assumethat t
We also define K( p,a,b)>0 by
every r, so that the definition (1.1) of A,.s(r,w) makes sense. (This L 'condition can be satisfied, for example, by assuming
K(p,a,b )' = p-'2' - Pa" b Pie and
K(I,1,w)=F22.
(1.16)
The following relations (with I/p+ I/q= I) are noteworthy forp> 1:
H(p,a,b) = Co{C ,,C>,v/C,,,,}"9,
(1.17)
H( p,a,b)"PH(q,a,b)1" "ep - "Pq
(1.18)
Theorem 1: Let $p > 2$ and assume that $f$ and $g \in L^2$. Then
(a) $I_{f,g}(p) \le (2/p)\{\|f\|_2 \|g\|_2\}^p$. (1.19)
(b) Equality is achieved in (1.19) if and only if $f$ and $g$ are a matched Gaussian pair.
(c) More generally, if $f \in L^a$, $g \in L^b$ with $1/a + 1/b = 1$ and with $p/(p-1) \le a, b \le p$, then
$$I_{f,g}(p) \le H(p, a, b)\{\|f\|_a \|g\|_b\}^p. \tag{1.20}$$
When both $a$ and $b > p/(p-1)$ equality is achieved in (1.20) if and only if $f$ and $g$ are Gaussians that satisfy
$$f(t) = \exp[-(\alpha m' + i\lambda)t^2 + \beta t + \gamma], \qquad g(t) = \exp[-(\alpha n' + i\lambda)t^2 + \tilde{\beta} t + \tilde{\gamma}], \tag{1.21}$$
with $\alpha$, $\lambda$ real, $\alpha > 0$ and $\beta, \tilde{\beta}, \gamma, \tilde{\gamma}$ complex and with $m' = a(p-1)/(ap - a - p)$ and $n' = b(p-1)/(bp - b - p)$. [Note that $(p-1)/(p-2) < m', n' < \infty$ under the stated conditions.] When $a$ or $b = p/(p-1)$, (1.20) is best possible, but equality is never achieved.
(d) If the additional condition that g=f is imposed (which means that the proper ambiguity function Am is being
l,, (p) and
Equality is achieved in (1.22) if and only iff is any Gaussian.
Theorem 1(a) and (b) for all p > 2 in their footnote 10. The Price-Hofstetter bounds have found application in the work of Janssen' for example.
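The constants entering Theorem 1 can be cross-checked numerically. The sketch below is an illustration and not the paper's own computation. It assumes $C_p^2 = p^{1/p}|q|^{-1/q}$ as in Definition 3 and reads the garbled relation (1.17) as $H(p,a,b) = C_q^p \{C_{a/q} C_{b/q} / C_{p/q}\}^{p/q}$, which is what the chain "Hausdorff-Young, then sharp Young" produces; both readings are assumptions, and the function names are hypothetical. Under these assumptions $H(p, 2, 2)$ should reduce to the constant $2/p$ of (1.19), and the Gaussian $e^{-\pi t^2}$ (which is its own Fourier transform in the $e^{-2\pi i \omega t}$ convention) should give equality in the Hausdorff-Young inequality with constant $C_q$.

```python
import math


def dual(p):
    """Dual index q = p / (p - 1)."""
    return p / (p - 1.0)


def c_const(p):
    """C_p with C_p^2 = p^(1/p) * |q|^(-1/q); C_1 = C_infinity = 1 (assumed reading)."""
    if p == 1.0 or math.isinf(p):
        return 1.0
    q = dual(p)
    return math.sqrt(p ** (1.0 / p) * abs(q) ** (-1.0 / q))


def h_const(p, a, b):
    """Assumed form of (1.17): H = C_q^p * (C_{a/q} C_{b/q} / C_{p/q})^(p/q), for p > 2."""
    q = dual(p)
    ratio = c_const(a / q) * c_const(b / q) / c_const(p / q)
    return c_const(q) ** p * ratio ** (p / q)


def gaussian_lp_norm(s):
    """L^s norm of exp(-pi t^2), namely s^(-1/(2s))."""
    return s ** (-1.0 / (2.0 * s))
```

For instance `h_const(3.0, 2.0, 2.0)` agrees with $2/3$ to rounding error, and `gaussian_lp_norm(p) / gaussian_lp_norm(dual(p))` agrees with `c_const(dual(p))` for any $p > 2$.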
and
before)
of the Cauchy-Schwarz inequality. They conjectured
)P
iJ
= K(p,a,b) '1PK(q,a,b) "v =
considered) or else that g=f - (which means that the proper Wigner distribution Wt is being considered) then (1.20) can be improved. In these cases (and with a and b restricted as
thatfeL' and gnL8 forsome 1
/,F(p)>H(p,a.b)(1 11.11g11e)' In particular.
(1.23)
1,(p)>(2/p){I[/' ,IIgllr)'.
(1.24)
Ifg =for g= f - (as in Theorem 1(d# then (1.23) can be improved to
li, ( P) >K( p,a,b)(I[fII.IIgHI,}'.
(1.25)
If $p = 1$, equality occurs in (1.23) if $f$ and $g$ are given by (1.21), but $\alpha|m'|$ and $\alpha|n'|$ have to be interpreted as $\alpha a$ and $\alpha b$, respectively (since $|m'|/|n'| \to a/b$ but $m', n' \to 0$ as $p \to 1$).
Remarks: (3) When $p = 1$ and $a, b > 1$ the Gaussians referred to in the last part of Theorem 2 are, in fact, the only functions for which equality holds in (1.21). A proof can be constructed by using ideas in Ref. 4, but it will not be given here. The uniqueness of Gaussian minimizers for $p = 1$ and $a = b = 2$ is closely related to and can be inferred from a theorem of Hudson$^5$ (see also Ref. 6) which says that the only way in which the function $A_{f,g}(\tau, \omega)$ can be a non-negative function of $\tau$ and $\omega$ is when $f = \lambda g$ for some $\lambda > 0$ and $f$ is a Gaussian. (Actually, Hudson does this in the context of the Wigner distribution, but that is immaterial; also he proves the theorem only for $W_{f,f}$ but his method extends to the general case.) The connection is established by first noting the relation for summable $A_{f,g}$ (which is easy to derive, at least formally)
so that
f
f f(t)g'(t)dt.
.At) = f f(w)e'--dw
(2.2)
(1.26) and Parseval's relation is
On the other hand, by Theorem 2(a) withp = 1, IUII2 = 11(112
f
a)Idrdw>21UII21Ig1J,.
(1.27)
If A, >0, the left sides of (1.26) and (1.27) are identical, which then requires that f= Ag and that equality holds in (1.27). Thus A, >0 is equivalent to equality in (1.24) for
P=1.
(4) Theorem 2(c) is striking when p =a= I
and
b = ao. Then
f IA,.,(r,w)Idrdw>IVII,IIgII..,.
(1.28)
This says that if f is fixed and g-0 in all L° norms except p = W. then j IA I does not go to zero. I For example,
g(t) = exp[ - Air] with A - oo.) The Fourier transform also has this property (cf. (2.9)1 and it is inherited by Atf. A tempting conjecture is that inequality (1.24). at least, should hold if OI. It is instructive to compare Theorems I (a) and (1.24) by considering Gaussians f(t) = exp( -at 2) and g(t) = exp( -,6t') with Re a and Re #> 0. Then one finds
li..(P)IUII: °11g11;-°
variables are
A,,(r,w) = Aft( - w. r),
(2.4)
Afe(r,w) =A,,( -r,-(o),
(2.5)
IA,.,(r. )I
(2.6)
I/a+ I/b= I
More generally, if fFL", geL' with
and
a> I,b> I, as in Theorems I and 2, Holder's inequality yields the pointwise bound
(2.7)
IA,. (r,w)I
Inequality (2.6) is important because it implies that In IA,, (r,w) I'<0when IUI12119112 = I and henceS,f isalways
well defined by the right side of (1.7) (although it might be
+ W). Three inequalities in Fourier analysis will be needed. The first fact is the sharp constant in the Hausdorff-Young inequality (2.8) proved by Beckner." The criterion for equality is due to Lieb.' Lemma 1: Let 2
=(p-1)/p. !ffvL' then fete and
_ (2/p)[ReaRe/3 1e" 1S1(a+(3')/21' (1.29) Since Re
(2.3)
The equality (1.6) follows from (2.3). Some other important facts about A, which follow easily from (2.3). the Cauchy-Schwarz inequality and a change of integration
a Re /3
(1.19) holds forp>2 and that the reverse inequality holds for all 0
IUII,
(2.8)
Conversely, let I
which case f exists by (2.8) (with q=r there.) If feL' then with q = (P -1)/p and
IUII,>C.IVII,.
Theorem3:Assume thatfandgeL'with IVIIZIIgJ6=1 Then
(2.9)
Equality isachieved in (2.8) when 2
no
and in (2.9) when
I
S., >1.
Remarks: (5) It is possible to show that equality is
Proof, Inequality (2.8) is Beckner's result, and the condition for equality when 2
achieved in Theorem 3 only when fandgare matched Gausaians. The proof is complicated and s ill not be given; the reader is invited to find a simple proof. The method of proof of these three theorems follows
Therefore, geL' f1 L' and hence, by convexity, geL '. Thusg exists and, by the L 2 Fourier inversion formula, g =f-. By (2.8), f-eL' and (using C,C, = I) C9IUII,
Equality is achieved ill andg are a matched Gaussian pair.
closely the methods used in Ref. 7 to prove LP bounds of coherent state transforms. The coherent state transform off
is A1,( - r, - w) exp(irrwr) with g being the fixed Gaussian g(t) = i "' exp( -12/2). From the mathematical point of view there is, however, a genuinely new development in the present paper, namely the proof that Gaussians uniquely saturate the bounds. This uses Ref. 4.
The following convention for the Fourier transformf of a function f will be employed:
=Cvllf Its
II. PRELIMINARY LEMMAS
f(w) = f f(t)e 2' "dt.
(2.9),let g,:fSince feL',geL' [with s=r/(r-1)>21.
(2.1)
to Brascamp and Lieb.$^9$ In the following a midline asterisk denotes convolution
$$(f * g)(t) = \int f(t - s)\, g(s)\, ds. \tag{2.10}$$
Lemma 2: Let 1/m+l/n=l+1/r with I<m<°o. geL", fgoL' and 1
Lemma 4: LetW and 0 be complex valued. Lebesgue mea-
/+1(t)/=I for all t. surable functions on R that satisfy Suppose there are real valued functions. p and v. on R (which are not a priori measurable) such that for almost every r the following holds for almost every is
(b) When or > ! and n> 1. equality holds in (2. 11) if and only
g(t-}r)p(t+}r) =expIiu(r)t+iv(r)1.
if
(2.17)
Then there are real constants. A. a. Q y and d such that
f(t) =expl -am't'+13t+ y1,
f(t) = exp 1Ut2+iat+iyl ,
g(t) =expl -an't'+13t+i'1.
(2.12) but
r) (t) = exp [ - iAt' - i11t - i6 l
( 2.18 )
.
with
Proof. Let . '1 denote the set of r such that (2.17) holds
Im(/3) = Im(/3). Here, m'=m/(m-1)and n'=n/(n-1).
for almost all t. Let X(t) = /(t) exp( - t') and Y(t) _
ifm = ! or n= ! and r> 1. (2.1 1) is at best possible but equa-
(I/n(t) I exp( - t'). Using the definition (2.1) ofthe Four-
lity is never achieved. If m=n=r=1. equality is achieved
ier transform, it is a simple matter to use the Gaussian bound
when f and g are any pair of non-negative, real valued functions
on X(t) to deduce that X is an entire analytic function of
a> O
with
complex
real,
(c11fg'=f or g'=f (2.13) IV°gII."''"(2n)'nIVII,"IUII" For all m.1 and n.l and r> 1 equality is achieved in (2.13) if and only if f is a Gaussian given by (1.9) with a real (ifg =f -). and with Q real
Remarks: (7) The classical inequality of Young is (2.11) but with C",C,/C, replaced by the larger value I. (8) Lemma 2(c) was not given in Ref. 9 because it did not occur to us at the time that it might be useful. It is however, a simple consequence of the analysis in Ref. 9. The third inequality is the converse of Young's inequali-
ty. It was first proved by Leindler'° with
I
in place of
order at most 2, i.e., I X(&)) I <expl C + D Iwl' I for suitable C,D> 0 all oEC. fact, and (In
IX(w) I (Fr exp(rr=(Im w)' J.) The same is true of Y(w). From (2.17). for every rEf97 the following holds for almost every t:
X(t - jr) = Y(t+ jr) exp(t Up (r) + 2r) + iv(r)}. (2.19) Taking Fourier transforms of (2.19) with respect tot we find
that
X(w) exp( - niwr) _ Y(w
p2n) + rr )
The sharp form below is due to Brascamp and
xexpnirar -
Lieb.°
2
ip(r)r+iv(r)-r
(2.20)
Lemma 3: Let flt) and g(t) be non-negative. real-valued
functions that are not identically zero and assume that
fgeL'. Let 1/m+l/n=l+l/r with 0<mel. 0
X(rao) = 0. Then Y(w) = 0 whenever w satisfies
w=wv - (I/2rr)p(r) + (i/n)7,
(a)
)IUII Ilgll".
if
We claim that X(w) has no zeros, for otherwise suppose that
(2.14) IV.gll,> Equality holds in (2.14) when m < I and n < i ifand only
(2.21)
for some real. As r ranges over the uncountable set s.9, the
right side of (2.21) ranges over an uncountable set in the complex plane. ( Note that p (r) is real and iris imaginary so there can be no cancellation in (2.21).) The only entire func-
tion with uncountably many zeros is the zero function, so Y(w) =0. This implies that Y(t) = 0, which is a contradic-
f(t) = explam't +0t + yl g(t) = explan't' +Qt + rl . with a> O real and 11, y, (1,y m/(m-1)<0and n'=n/(n -1) <0.
(2.15) real.
m'=
Here,
tion. By reversing the roles of X and Y we find that Y(w) has no zeros. Because X and Y are entire analytic and zero free
they have analytic logarithms. e.g., X(,w) = exp(m(ro) I for
with equality (for all m and n) if and only if f is a real
some entire analytic function m. Since X has order at most 2, It)(w) I (C Ita12 + D forsuitable C,D> 0. But then o must be a polynomial oforder 2, i.e., X is a Gaussian. The same is true of Y. By taking the inverse Fourier transform, we have that X and Y are Gaussians, which, by inspection, proves
Gaussian.
(2.18).
(b) If g' = f org = f (2.14) can beimproved to
12'(2m)"2"'(2n)"'"IVII,.IVII..
(2.16)
Q.E.D.
Remark: (9) Lemma 3(b) was not given in Ref. 9 but it is a simple consequence of the analysis given there.
The next lemma is an extension of the Cauchy functional equation to quadratics. I One form of Cauchy s equa-
Ill. PROOF OF THEOREM 1
4'(t) = be ",t)(t) = ce", and p(r) = bee" for some con-
Step 1: Fix rER. Since feL" and gvL' with I/a + I/ b = I, the function t-f(t - fr)g(t + jr) is in L'. Since A,,, is the Fourier transform of this L' function, we can use Lemma I with q = p/(p - 1) <2 in place of p there and
stants A. b, c. I
obtain
tion is (t - Ir)?I(t + jr) = p(r) with g and ly being Lebesque
measurable
functions;
the
only
solution
is
Integral Bounds for Radar Ambiguity Functions and Wigner Distributions
IV. PROOF OF THEOREM 2
J (Am rw)11dw
2
r) g(t+ 2 r)I'dt(~'
(3.1)
Before proving this theorem, it is perhaps worth noting a proof strategy that works when a = to orb = p, but otherwise yields a weaker result. This strategy does not require Lemma 3. From Parseval's relation one has the identity
Note that the right-hand integral may be finite or infinitedepending on r. If it is infinite then (3.1) is trivially true; if it is finite then the use of Lemma I is justified. We shall see in step 2 that this integral is finite for almost every r. Step 2: The integral on the right side of (3.1) is just the convolution
J(r)=(V 1°°1g1°)(r). (3.2) Integrating (3.1) over rand applying Lemma 2 toJ(r) with
r=p/q>land in =a/q>I,n=b/q>I,we have
= f f(t)h*(t)dt f g°(t)j(t)dr,
(4.1)
for any four functions fg,h, and j. Let f= Ifle" and g = Igle" and choose h(t) = I/(t)I' 'e'"" and
j(t)=Ig(t)I'
1e,°.".Then
(4.2)
R,,.., = IUII:IIgl1%.
It.,(P)
f At,(r,w)A;!,J(r,w)drdo,
RI.x,,,
On the other hand, by Holder's inequality.
(3.3)
The inequalities (1.19) and (1.20) are obtained by using (1.17). Step 3: It is an elementary exercise to show that Gaussians ofthe form (1.21) give equality in (3.1) and (3.3 ), and
hence that H(p,a,b) is the sharp constant in (1.19) and (1.20). We want to prove that these Gaussians uniquely saturate the bounds. Assume that m> I and n> 1. If there is equality in (1.19) or (1.20) then (3.1) must be an equality for almost every r and (3.3) must be an equality. By Lemma I, the following must be true for almost every r:
f(t- jr)g°(t+}r) =D(r)expl -o(r)t'+6(r)t 1 (3.4)
for almost every t, with a(r)ER and D(r),6(r)EC. By
I R6e.e,iI < If..x (p)
(4.3)
"Ie,,(q) "°.
If I2 and we can use Theorem 1(c) for the right-most factor in (4.3):
{Ir,,,(q)}`°
If.(P)>H(p,a,b)L(p,a,b)"{1llll°IIgII0}".
(4.4) we can
(4.5)
where
L(P,a,b) = p"'q"°a - "°b - '"(4.6) If a orb = to then L(p,a,b) = I and (4.5) is the desired inequality. Unfortunately, ifp I, which is the case we consider first,
Lemma 2, equality in (3.3) requires
l/(t) l = cxp) - am't' +Qr + Yl ,
the proof is virtually the same, mutatis mutandis, as for
Ig(t)I=cxp)-ant'+ft+fl .
(3.5)
Theorem 1.
by (3.5). Then, comparing (3.4) and (3.5), we find that
Step 1: Using inequality (2.9) (with r = I) we have that (3.1) holds, but with the reversed inequality. Note that the left side of (3.1) is finite for almost every r since 1{1IA,. Idw}dr< oo by assumption. Step 2: By (3.2) and Lemma 3, (3.3) holds with the
and V satisfy the hypotheses of Lemma 4. The conclusion of
reversed inequality. In particular, fEL° and gEL". This
with
m' = m/(m - 1), n' = n/(n - 1),
a> 0,
and
ywR.
Let us define fi(t) =f(r)/I/(t)I and r1(r) =g°(r)/ Ig(t) 1, which makes sense sincef(r) and g(t) never vanish Lemma 4, together with (3.5), gives (1.21).
Step 4: When a-p/(p - 1) then m'- oo and it -r. By taking limits of Gaussians in (1.21) with m - oo we see that (1.20) is best possible in this case. Equality is never achieved, however. An informal way to see this is to note tht m' must be infinity. A formal proof is to note that (2.11 ) or
(3.3) cannot be an equality when in = I and n = r Ias is stated in Lemma 2(b) I because of the strict convexity of the
L' norm. Step 5: W hen g
=forg =f- we proceed as in steps I to
3, making the appropriate changes and using Lemma 2(c). From this we infer (1.22) and conclude that f must be a Gaussian in order to have equality. Upon inserting a Gaussian (1.9) for f and g (or g- ) in (1.1), one finds by inspection that equality in (1.22) does not impose any restriction on the Gaussian. Q.E.D.
J. Math. Phys., Vol. 31, No. 3, March 1990
proves (1.23). Similarly, Lemma 3(b) leads to (1.25). The cases ofequality for I
Finally we turn to the casep = I. Step3: Suppose p = I
=IAI..,,(r.w)1/1VII.IIgII0 1, establishes (1.23) for to = I. A similar proof holds for (1.25). Step 4: Suppose to = a = It = I. For each a,b> I such that I /a + 1 /b = I inequality (1.23) holds by step 3. As a I I and b I oo we have that H( 1.0) -H( 1. 1. m ). Also, it is a Elliott H Lleb
J. Math. Phys. 31, 594-599 (1990)
standard fact that $\|f\|_a \to \|f\|_1$ and $\|g\|_b \to \|g\|_\infty$. A similar proof works for Eq. (1.25). Q.E.D.
V. PROOF OF THEOREM 3

It is assumed that $f$ and $g \in L^2$ and $\|f\|_2 \|g\|_2 = 1$. By (1.6), $I_{f,g}(2) = 1$ and, by (2.6), $|A_{f,g}(\tau,\omega)|^2 \le 1$, whence, by Theorem 1, $I_{f,g}(p) \le 2/p$. If we define, for $\varepsilon > 0$,

$$K(\varepsilon) = \varepsilon^{-1}\{I_{f,g}(2) - I_{f,g}(2 + 2\varepsilon)\}, \qquad (5.1)$$

we have that

$$K(\varepsilon) \ge (1 + \varepsilon)^{-1}. \qquad (5.2)$$

Assume now that $S_{f,g}$, defined by (1.7), is finite; otherwise the inequality (1.8) is trivial. (Note that $|A_{f,g}| \le 1$.) We claim that

$$\lim_{\varepsilon \downarrow 0} K(\varepsilon) = S_{f,g}, \qquad (5.3)$$

which, in view of (5.2), proves the inequality. Since $|A_{f,g}| \le 1$ we have, for each $\tau$ and $\omega$, that

$$0 \le \varepsilon^{-1}\{|A_{f,g}|^{2} - |A_{f,g}|^{2 + 2\varepsilon}\} \le -|A_{f,g}|^{2} \ln |A_{f,g}|^{2}. \qquad (5.4)$$

(The last inequality is simply $1 + \varepsilon \ln X \le X^{\varepsilon}$ for all $X > 0$.) Now $K(\varepsilon)$ is just the integral of the middle function in (5.4) (which is non-negative), and we see that this function is uniformly dominated by an integrable function. Furthermore, as $\varepsilon \downarrow 0$ the middle function in (5.4) converges pointwise to the right-hand function. Equation (5.3) then follows by Lebesgue's dominated convergence theorem. Q.E.D.

ACKNOWLEDGMENTS

I thank E. Carlen, I. Daubechies, P. Flandrin, and A. Grossman for helpful discussions and A. J. E. M. Janssen for a helpful correspondence and for encouraging me to write this paper. In fact, the results in this paper have already been quoted and used by Janssen (Ref. 11). This work was partially supported by U.S. National Science Foundation Grant PHY 85-15288-A03.
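Returning to inequality (5.4) in the proof of Theorem 3, the elementary facts it rests on can be checked numerically. The following is a minimal sketch, not part of the paper: it writes $X = |A_{f,g}|^2 \in (0, 1]$, tests $0 \le \varepsilon^{-1}(X - X^{1+\varepsilon}) \le -X \ln X$ on a grid, and confirms that the bound $I_{f,g}(2 + 2\varepsilon) \le 1/(1+\varepsilon)$ gives exactly the right side of (5.2). The function names are illustrative assumptions.

```python
import math


def middle(x, eps):
    """Middle function of (5.4), written in terms of X = |A|^2 in (0, 1]."""
    return (x - x ** (1.0 + eps)) / eps


def right(x):
    """Right-hand function of (5.4): -X ln X."""
    return -x * math.log(x)


def check(eps_values, grid=200):
    """Return True if 0 <= middle <= right on a grid of X values in (0, 1]."""
    for eps in eps_values:
        for k in range(1, grid + 1):
            x = k / grid
            m, r = middle(x, eps), right(x)
            if m < -1e-12 or m > r + 1e-12:
                return False
    return True


def lower_bound_gap(eps):
    """K(eps) - 1/(1+eps) when I(2+2*eps) equals its largest allowed value 1/(1+eps)."""
    k_eps = (1.0 - 1.0 / (1.0 + eps)) / eps
    return k_eps - 1.0 / (1.0 + eps)
```

Calling `check([0.001, 0.1, 0.5])` exercises the pointwise bound, and `middle(x, eps)` approaches `right(x)` as `eps` shrinks, which is the pointwise convergence used in the dominated convergence step.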
1. P. M. Woodward, Probability and Information Theory with Applications to Radar (McGraw-Hill, New York, 1953), p. 120.
2. R. Price and E. M. Hofstetter, "Bounds on the volume and height distributions of the ambiguity function," IEEE Trans. Inf. Theory IT-11, 207-214 (1965).
3. A. J. E. M. Janssen, "Positivity properties of phase-plane distribution functions," J. Math. Phys. 25, 2240-2252 (1984).
4. E. H. Lieb, "Gaussian kernels have only Gaussian maximizers," to be published in Invent. Math. (1990).
5. R. L. Hudson, "When is the Wigner quasi-probability density non-negative?," Rep. Math. Phys. 6, 249-252 (1974).
6. A. J. E. M. Janssen, "Bilinear phase-plane distribution functions and positivity," J. Math. Phys. 26, 1986-1994 (1985).
7. E. H. Lieb, "Proof of an entropy conjecture of Wehrl," Commun. Math. Phys. 62, 35-41 (1978).
8. W. Beckner, "Inequalities in Fourier analysis," Ann. Math. 102, 159-192 (1975).
9. H. J. Brascamp and E. H. Lieb, "Best constants in Young's inequality, its converse, and its generalization to more than three functions," Adv. Math. 20, 151-173 (1976).
10. L. Leindler, "On a certain converse of Hölder's inequality. II," Acta Math. Szeged 33, 217-223 (1972).
11. A. J. E. M. Janssen, "Wigner weight functions and Weyl symbols of non-negative definite linear operators," Philips J. Res. 6, 7-42 (1989).
Part VII
Inequalities Related to Harmonic Maps
With H. Brezis and J-M. Coron in C. R. Acad. Sci. Paris 303 Ser. 1, 207-210 (1986)
C. R. Acad. Sc. Paris, t. 303, Série I, n° 5, 1986
CALCULUS OF VARIATIONS. - Energy estimates for maps from $\mathbb{R}^3$ into $S^2$. Note by Haïm Brezis, Jean-Michel Coron and Elliott H. Lieb, presented by Jean Leray.

Two problems concerning maps $\varphi$ with point singularities from a domain $\Omega \subset \mathbb{R}^3$ to $S^2$ are solved. The first is to determine the minimum energy of $\varphi$ when the location and topological degree of the singularities are prescribed. In the second problem $\Omega$ is the unit ball and $\varphi = g$ is given on $\partial\Omega$: we show that $g(x/|x|)$ minimizes the energy if and only if $g = \text{const.}$ or $g(x) = \pm Rx$ with $R$ a rotation.

We consider several problems related to energy estimates for maps $\varphi$ from $\mathbb{R}^3$ into $S^2$ that are discontinuous at isolated points.

1. PRESCRIBED SINGULARITIES. - We fix points $a_1, a_2, \ldots, a_N$ in $\mathbb{R}^3$ and integers $d_1, d_2, \ldots, d_N \in \mathbb{Z}$ with $d_i \neq 0$ for all $i$. We introduce the class of maps $\varphi : \mathbb{R}^3 \to S^2$ defined by

$$\mathcal{E} = \Big\{ \varphi \in C^1\Big(\mathbb{R}^3 \setminus \bigcup_{i=1}^{N} \{a_i\};\ S^2\Big) \ \Big|\ \int |\nabla\varphi|^2 < \infty \ \text{and}\ \deg(\varphi, a_i) = d_i \ \text{for all } i \Big\}.$$

Here $\nabla\varphi$ is understood in the sense of $\mathcal{D}'(\mathbb{R}^3)$ and $\deg(\varphi, a_i)$ denotes the topological degree of $\varphi$ restricted to a sphere centered at $a_i$ of sufficiently small radius $r$. One easily checks that $\mathcal{E}$ is nonempty if and only if

$$\sum_{i=1}^{N} d_i = 0 \qquad (1)$$

and we make this assumption from now on. We are interested in the minimal deformation energy

$$E = \inf_{\varphi \in \mathcal{E}} \int_{\mathbb{R}^3} |\nabla\varphi|^2. \qquad (2)$$

This quantity, which has the homogeneity of a length, depends very explicitly on the position of the points $a_i$ and on the degrees $d_i$. To express this dependence we introduce the notion of a minimal connection. We say that $a_i$ is a positive (resp. negative) point if $d_i > 0$ (resp. $d_i < 0$). Let

$$Q = \sum_{d_i > 0} d_i = -\sum_{d_i < 0} d_i$$

be the sum of the positive degrees. We list the positive points, repeating each point $d_i$ times, and denote this list by $p_1, p_2, \ldots, p_Q$. We proceed in the same way with the negative points, repeating each of them $|d_i|$ times, and denote this list by $n_1, n_2, \ldots, n_Q$. We set

$$L = \min_{\sigma} \sum_{i=1}^{Q} |p_i - n_{\sigma(i)}| \qquad (3)$$
where the minimum is taken over the group of permutations $\sigma$ of the set $\{1, 2, \ldots, Q\}$. A minimal connection is the union of segments $C = \bigcup_{i=1}^{Q} [p_i, n_{\sigma(i)}]$, where $\sigma$ is one of the permutations achieving the minimum in (3). Of course, several minimal connections may exist.
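The definition of $L$ in (3) can be evaluated directly for small configurations. The sketch below is an illustration only, assuming a brute-force search over all permutations (factorial cost, so suitable only for small $Q$); the function name `minimal_connection` is hypothetical and not from the Note.

```python
import math
from itertools import permutations


def minimal_connection(points, degrees):
    """Return (L, pairs): the length L of (3) and one minimizing pairing.

    points  -- list of 3-tuples a_i
    degrees -- list of nonzero integers d_i whose sum is zero, as in (1)
    """
    if sum(degrees) != 0:
        raise ValueError("the degrees must sum to zero, as required by (1)")
    positives, negatives = [], []
    for a, d in zip(points, degrees):
        # Repeat each point |d_i| times in the list matching its sign.
        (positives if d > 0 else negatives).extend([a] * abs(d))
    best_len, best_pairs = math.inf, []
    for sigma in permutations(range(len(negatives))):
        pairs = [(p, negatives[j]) for p, j in zip(positives, sigma)]
        length = sum(math.dist(p, n) for p, n in pairs)
        if length < best_len:
            best_len, best_pairs = length, pairs
    return best_len, best_pairs
```

For a dipole with $p = (0,0,0)$ and $n = (0,0,2)$ this returns $L = 2$. For the four corners of a unit square with alternating signs, both pairings give $L = 2$, which illustrates that a minimal connection need not be unique.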
We denote by $\delta_C$ the Hausdorff measure of the minimal connection $C$, that is, $\delta_C = \sum_{i=1}^{Q} \delta_i$, where $I_i = [p_i, n_{\sigma(i)}]$ and $\delta_i$ is the uniform Hausdorff measure on the segment $I_i$.

THEOREM 1. - We have

$$E = 8\pi L. \qquad (4)$$

Moreover, the infimum in (2) is not attained: if $(\varphi^n)$ is a minimizing sequence for (2), then there exist a subsequence $(\varphi^{n_k})$ and a minimal connection $C$ such that $\varphi^{n_k}$ converges to a constant a.e. and $|\nabla\varphi^{n_k}|^2$ converges in the sense of measures to $8\pi\,\delta_C$.

We emphasize that, even if several minimal connections exist, $|\nabla\varphi^{n_k}|^2$ concentrates on a single minimal connection (and not on a union of minimal connections). This is not the case for the minimization problem in $D$ [see (7)].

Outline of the proof of (4). - We proceed in two steps. For the upper bound $E \le 8\pi L$, we first consider the case of a dipole, that is, a positive point $p$ of degree $+1$ and a negative point $n$ of degree $-1$ separated by a distance $L$. Given $\varepsilon > 0$, we construct explicitly a map $\varphi_\varepsilon \in \mathcal{E}$ such that

$$\int |\nabla\varphi_\varepsilon|^2 \le 8\pi L + \varepsilon,$$

with $\varphi_\varepsilon$ constant outside a neighborhood of order $\varepsilon$ of the segment $[p, n]$. In the general case we prove that $E \le 8\pi L$ by gluing dipoles together. The lower bound, $E \ge 8\pi L$, is more difficult. For this purpose we introduce a very useful concept. To every map $\varphi \in \mathcal{E}$ we associate the vector field $D$ with components

$$D = (\varphi \cdot \varphi_y \wedge \varphi_z,\ \varphi \cdot \varphi_z \wedge \varphi_x,\ \varphi \cdot \varphi_x \wedge \varphi_y)$$

(where $\varphi_x = \partial\varphi/\partial x$, ...). One shows that

$$\operatorname{div} D = 4\pi \Big( \sum_{i=1}^{Q} \delta_{p_i} - \sum_{i=1}^{Q} \delta_{n_i} \Big) = 4\pi\rho. \qquad (5)$$

Since, on the other hand,

$$|\nabla\varphi|^2 \ge 2|D|, \qquad (6)$$

it follows that

$$E \ge 8\pi \inf_{\operatorname{div} D = \rho} \int |D|. \qquad (7)$$

A duality argument leads to the equality

$$\inf_{\operatorname{div} D = \rho} \int |D| = \max_{\zeta \in K} \int \zeta \, d\rho,$$

where $K = \{\zeta : \mathbb{R}^3 \to \mathbb{R};\ \|\zeta\|_{\mathrm{Lip}} \le 1\}$ and

$$\|\zeta\|_{\mathrm{Lip}} = \sup_{x \neq y} |\zeta(x) - \zeta(y)| / |x - y|.$$
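As an illustration of the vector field $D$ and of inequality (6), one can evaluate both sides numerically for the radial map $\varphi(x) = x/|x|$, for which $D(x) = x/|x|^3$ and $|\nabla\varphi|^2 = 2/|x|^2$, so that (6) holds with equality. The sketch below is not from the Note: it uses central finite differences, and all helper names are assumptions made for this example.

```python
import math


def hedgehog(x):
    """The radial map x -> x/|x| from R^3 minus the origin to S^2."""
    r = math.sqrt(sum(c * c for c in x))
    return [c / r for c in x]


def partial(phi, x, i, h=1e-5):
    """Central finite-difference approximation of the i-th partial derivative of phi at x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return [(a - b) / (2 * h) for a, b in zip(phi(xp), phi(xm))]


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def d_field(phi, x):
    """Components (phi . phi_y ^ phi_z, phi . phi_z ^ phi_x, phi . phi_x ^ phi_y)."""
    f = phi(x)
    fx, fy, fz = (partial(phi, x, i) for i in range(3))
    return [dot(f, cross(fy, fz)), dot(f, cross(fz, fx)), dot(f, cross(fx, fy))]


def grad_sq(phi, x):
    """|grad phi|^2, the sum of squares of all first partial derivatives."""
    return sum(dot(g, g) for g in (partial(phi, x, i) for i in range(3)))
```

At $x = (1, 2, 2)$, where $|x| = 3$, `d_field(hedgehog, x)` is close to $x/27$ and `grad_sq(hedgehog, x)` is close to $2/9 = 2|D|$.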
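Finally one proves that $\max_{\zeta \in K} \int \zeta \, d\rho = L$ with the help of a theorem of Kantorovich (see [1] and [2]) and of Birkhoff's theorem on doubly stochastic matrices (see for example [3]).

Remark 1. - Relation (4) extends to more general situations. Consider, for example, an open set $\Omega$ of $\mathbb{R}^3$ containing the points $a_i$ and let

$$\mathcal{E}_1 = \Big\{ \varphi \in C^1\Big(\Omega \setminus \bigcup_{i=1}^{N} \{a_i\};\ S^2\Big) \ \Big|\ \int_\Omega |\nabla\varphi|^2 < \infty \ \text{and}\ \deg(\varphi, a_i) = d_i \ \text{for all } i \Big\}, \qquad E_1 = \inf_{\varphi \in \mathcal{E}_1} \int_\Omega |\nabla\varphi|^2.$$

Then $E_1 = 8\pi L_1$, where $L_1 = \min_{\sigma} \sum_{i=1}^{Q} D(p_i, n_{\sigma(i)})$ and

$$D(p, n) = \min\{|p - n|,\ \operatorname{dist}(p, \partial\Omega) + \operatorname{dist}(n, \partial\Omega)\}.$$

In the same spirit one can consider

$$\mathcal{E}_2 = \{\varphi \in \mathcal{E}_1 \mid \varphi \text{ is constant on } \partial\Omega\} \quad \text{and} \quad E_2 = \inf_{\varphi \in \mathcal{E}_2} \int_\Omega |\nabla\varphi|^2.$$

Then $E_2 = 8\pi L_2$, where $L_2 = \min_{\sigma} \sum_{i=1}^{Q} d_\Omega(p_i, n_{\sigma(i)})$ and $d_\Omega(p, n)$ denotes the geodesic distance in $\Omega$ between $p$ and $n$.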
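To see how the modified distance $D(p, n)$ of Remark 1 changes the answer, here is a small sketch for the special case where $\Omega$ is the open unit ball, so that $\operatorname{dist}(x, \partial\Omega) = 1 - |x|$. The choice of domain and the function names are assumptions made for illustration; the Note treats a general open set.

```python
import math
from itertools import permutations


def dist_to_sphere(x):
    """Distance from a point x of the open unit ball to its boundary."""
    return 1.0 - math.hypot(*x)


def boundary_distance(p, n):
    """D(p, n) of Remark 1 for the unit ball: connect directly or through the boundary."""
    return min(math.dist(p, n), dist_to_sphere(p) + dist_to_sphere(n))


def l1(positives, negatives):
    """L_1 by brute force over permutations; the lists already repeat points by degree."""
    return min(
        sum(boundary_distance(p, negatives[j]) for p, j in zip(positives, sigma))
        for sigma in permutations(range(len(negatives)))
    )
```

For a dipole at $(0, 0, \pm 0.9)$ the direct segment has length $1.8$ while connecting each point to the boundary costs $0.2$, so $L_1 = 0.2$. For a dipole at $(0, 0, \pm 0.1)$ the direct segment of length $0.2$ is shorter.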
Remark 2. - The preceding cases can be included in a still more general situation in which the points $a_i$ are replaced by "holes" $H_i$ (disjoint compact subsets of $\Omega$). To define $\deg(\varphi, H_i)$ one proceeds as in the case of a point. The conclusion is again that $E = 8\pi L$, where $L$ involves an appropriate distance between the holes. Here we no longer assume $d_i \neq 0$, and holes of degree zero can play a role in the computation of the distance between the holes.

2. MINIMIZATION WITH BOUNDARY CONDITION. - Let $\Omega$ be a bounded open set of $\mathbb{R}^3$ and let $g : \partial\Omega \to S^2$ be a given boundary datum. We are interested in the problem

$$E(g) = \min\Big\{ \int_\Omega |\nabla\varphi|^2 \ \Big|\ \varphi \in H^1(\Omega; S^2) \ \text{and}\ \varphi = g \ \text{on}\ \partial\Omega \Big\}. \qquad (8)$$

It is clear that the minimum in (8) is attained, and it is known, by a result of Schoen and Uhlenbeck [4], that if $\varphi$ achieves the minimum, then $\varphi$ has at most a finite number of points of discontinuity. Our main results are the following.

THEOREM 2. - Assume that $\Omega = \{x \in \mathbb{R}^3 \mid |x| < 1\}$ and that $g(x) = x$. Then $\psi(x) = x/|x|$ achieves the minimum in (8). In fact, $\psi(x)$ is the unique minimizer in (8).

THEOREM 3. - Assume that $\Omega = \{x \in \mathbb{R}^3 \mid |x| < 1\}$ and that $g$ is arbitrary. Then $\psi(x) = g(x/|x|)$ does not achieve the minimum in (8), except if $\pm g$ is a rotation or a constant.

Returning to the case of a general domain $\Omega$ and an arbitrary datum $g$, Theorems 2 and 3, together with [4] and [5], yield the following.
COROLLARY 4. - Assume that $\varphi$ achieves the minimum in (8). Then the degree of $\varphi$ at each singular point $x_0$ is $\pm 1$ and

$$\varphi(x) \simeq \pm R(x - x_0)/|x - x_0| \quad \text{as } x \to x_0,$$

where $R$ is a rotation.

Outline of the proof of Theorem 3. - It is clear that if $\psi(x) = g(x/|x|)$ achieves the minimum in (8), then $g$ is necessarily a harmonic map. If the degree of $g$ is $\pm 1$ and $g$ is not an isometry, then one can decrease the energy by moving the singularity toward the "center of mass" of $|\nabla g|^2$, i.e. $\int_{\partial\Omega} |\nabla g|^2 \sigma \, d\sigma \neq 0$. If the degree of $g$ is different from $0, \pm 1$, then one can decrease the energy by splitting the singularity into several points.

Remark 3. - The original motivation for this work is related to questions arising in the study of liquid crystals (see [6], [7], [8]). (In that case $S^2$ must be replaced by $\mathbb{R}P^2$, which is easily done, see [9].) Corollary 4 explains the fact that only singularities of degree $\pm 1$ are observed experimentally (see for example [10]) (in an earlier work, Hardt, Kinderlehrer and Lin [11] had established that the degree of the singularities is bounded by a universal constant). We thank J. Ericksen and D. Kinderlehrer, who drew our attention to these questions. The details of the proofs will appear in [9].

Received May 12, 1986.

REFERENCES

[1] L. V. Kantorovich, Dokl. Akad. Nauk S.S.S.R., 37, no. 7-8, 1942, p. 227-229.
[2] S. T. Rachev, Theory of Probability and its Appl., 29, 1985, p. 647-676.
[3] H. Minc, Permanents, Encyclopedia of Math. and Appl., 6, Addison-Wesley, Reading, Mass., 1978.
[4] R. Schoen and K. Uhlenbeck, J. Diff. Geom., 17, 1982, p. 307-335 and 18, 1983, p. 253-268.
[5] L. Simon, Annals of Math., 118, 1983, p. 525-571.
[6] P. G. de Gennes, The physics of liquid crystals, Clarendon Press, Oxford, 1974.
[7] M. Kléman, Points, lignes, parois, Les Éditions de Physique, Orsay, 1977.
[8] J. Ericksen, in Advances in liquid crystals, 2, G. Brown ed., Acad. Press, New York, 1976.
[9] H. Brezis, J.-M. Coron and E. Lieb, Harmonic maps with defects (to appear).
[10] W. Brinkman and P. Cladis, Physics Today, 35, 1982, p. 48-54.
[11] R. Hardt, D. Kinderlehrer and F. H. Lin, in preparation.

H. B.: Université Paris-VI, 4, place Jussieu, 75252 Paris Cedex 05;
J.-M. C.: École Polytechnique, 91128 Palaiseau Cedex;
E. L.: I.H.E.S., 91440 Bures-sur-Yvette and Princeton University.
With F. Almgren in Bull. Amer. Math. Soc. 17, 304-306 (1987)
BULLETIN (New Series) OF THE AMERICAN MATHEMATICAL SOCIETY, Volume 17, Number 2, October 1987

SINGULARITIES OF ENERGY-MINIMIZING MAPS FROM THE BALL TO THE SPHERE
FREDERICK J. ALMGREN, JR. AND ELLIOTT H. LIEB

We study maps $\varphi$ from the unit ball $B$ in $\mathbb{R}^3$ to the unit sphere $S^2$ in $\mathbb{R}^3$ which minimize Dirichlet's energy integral

$$\mathcal{E}(\varphi) = \int_B |\nabla\varphi|^2 \, dV.$$

If such a $\varphi$ minimized Dirichlet's integral among mappings into $\mathbb{R}^3$ rather than being constrained to lie in $S^2$ it would then be a classical smooth harmonic function. A minimizing constrained $\varphi$, however, sometimes has isolated point discontinuities. We here announce several new estimates on the number and arrangement of such singular points [AL]. The $\varphi$'s we consider have well defined values $\psi$ on the boundary $\partial B$ of $B$, and the boundary Dirichlet's energy integral is

$$\partial\mathcal{E}(\psi) = \int_{\partial B} |\nabla_T \psi|^2 \, dA,$$

where $\nabla_T \psi$ denotes the tangential gradient. In our theorems and examples below each $\psi$ has finite energy. One of our principal results is

MAIN THEOREM. Suppose $\varphi$ minimizes Dirichlet's integral among all functions mapping $B$ to $S^2$ and having boundary value function $\psi$ on $\partial B$. Then the number of points of discontinuity of $\varphi$ is bounded by a constant times $\partial\mathcal{E}(\psi)$.

This linear law is noteworthy because examples illustrate linear growth of the number of singularities with $\partial\mathcal{E}(\psi)$ while other examples show that the number of singularities cannot be bounded by $\mathcal{E}(\varphi)$. This shows that the number and location of singularities cannot be inferred from simple energy comparisons alone. The subtlety of this estimate is further illustrated by

EXAMPLES. There are boundary value functions $\psi$ for which the minimizing $\varphi$'s are unique and have an arbitrarily large number of singular points stacked arbitrarily high near the boundary, like bubbles in a pan of water that is almost ready to boil. The number of stacks is also arbitrarily large.

Such examples show the necessity of an analysis containing several different length scales in proving the principal result above; the length scale of a singular point is its distance to the boundary.

Received by the editors April 20, 1987.
1980 Mathematics Subject Classification (1985 Revision). Primary 58E20; Secondary 58E30, 82A50.
One might expect that if $\psi$ mapped $\partial B$ to cover only small area in $S^2$ then there could not be too many singular points of $\varphi$ in $B$. Indeed, prior to our work, all examples of boundary values $\psi$ with many singularities also had boundary mapping area proportional to $\partial\mathcal{E}(\psi)$. Such a relationship turns out not to hold in general and, as another of our principal results, we show

EXAMPLES. For any preassigned number $N$, there is a smooth boundary value mapping $\psi$ of $\partial B$ to $S^2$ with the following properties: (i) the image of $\psi$ in $S^2$ consists of a single smooth curve $\Gamma$ near the equator ($\psi$ thus has zero mapping area), and (ii) any minimizing $\varphi$ has at least $N$ singularities.

One key ingredient of these examples is the existence of two different parametrizations of $\Gamma$ from the boundary $\partial D$ of the unit disk $D$ such that the least energy extension of the first parametrization maps $D$ to cover the north pole of $S^2$ while the least energy extension of the second parametrization maps $D$ to cover the south pole. This then leads directly to an example in which $B$ is replaced by a large solid torus with cross-section $D$ and the boundary parametrizations alternate as one goes along the torus. We effectively embed such a torus in $B$ using the conformal equivalence between the disk and the half-plane.

Another natural question one might ask is whether minimizers respect boundary value symmetries (if any), as is true for classical harmonic functions. This is not the case as we illustrate by

EXAMPLES. There are boundary value functions $\psi$ which are symmetric about the midplane of $B$ but for which any minimizer cannot possess such a symmetry (nor can its set of discontinuities).

The basic existence and regularity (interior and boundary) theorems for $\varphi$'s and $\psi$'s as above appear in papers of R. Schoen and K. Uhlenbeck [SU1, SU2]. It is their work which guarantees that the interior discontinuities for $\varphi$'s are isolated. The uniqueness of tangential approximations at such points of discontinuity follows from the work of L. Simon [S]. Following initial estimates by R. Hardt, D. Kinderlehrer, and M. Luskin [HKL], H. Brezis, J.-M. Coron, and Lieb showed that the only possible tangential approximation to a minimizing $\varphi$ at any singular point is the function $x/|x|$ composed with an orthogonal mapping of $S^2$ [BCL]. Hardt and F. H. Lin showed in [HL] how to construct boundary values $\psi$ which would guarantee many singularities in a minimizing $\varphi$. Except for this, little was known about the number and location of singularities in a minimizer when the present work began.

Much of the basic analysis in the literature mentioned above has been based ultimately on compactness arguments, i.e. failure of a desired estimate for all constants leads to an impossible situation. Such compactness arguments are central to the present work as well; they lead fairly directly to the following important estimate (Hardt and Lin have informed us of their independent discovery of this fact).

THEOREM. The distance between any two singularities $p$ and $q$ in a minimizing $\varphi$ is at least a fixed constant multiple of the distance from $p$ to $\partial B$.
Another compactness argument which combines the theorem above with the boundary regularity theory enables us to conclude that the existence of a singularity at distance $\delta$ from $\partial B$ implies that the boundary function $\psi$ must have nearby Dirichlet integral at scales comparable to $\delta$ independent of boundary energy distribution at much larger or much smaller scales. A combinatorial analysis on a Cayley tree based on these differing length scales permits us to sum these different energies in proving our main theorem.

As one might suspect our main theorem remains true (with appropriate constants) if $B$ is replaced by considerably more general domains in $\mathbb{R}^3$, while the second theorem holds with the same constant. One of the original motivations for studying mappings to $S^2$ (or to $\mathbb{R}P^2$) was the mathematical analysis of liquid crystal configurations; in this context one usually regards $\varphi$ as a unit vectorfield in $B$. Because we base our analysis on compactness arguments we can also readily conclude that a unit vectorfield $\varphi$ which minimizes any nematic liquid crystal energy integral sufficiently close to Dirichlet's integral must have at most isolated point discontinuities and the number of these discontinuities is dominated by boundary energy.

REFERENCES

[AL] F. J. Almgren, Jr. and E. H. Lieb, Singularities of energy-minimizing maps from the ball to the sphere: examples, counterexamples, and bounds, in preparation.
[BCL] H. Brezis, J.-M. Coron and E. H. Lieb, Harmonic maps with defects, Comm. Math. Physics 107 (1986), 649-705.
[HKL] R. Hardt, D. Kinderlehrer and M. Luskin, Remarks about the mathematical theory of liquid crystals, IMA Preprint #276, October 1986.
[HL] R. Hardt and F. H. Lin, A remark on $H^1$ mappings, Manuscripta Math. 56 (1986), 1-10.
[SU1] R. Schoen and K. Uhlenbeck, A regularity theory for harmonic maps, J. Differential Geom. 17 (1982), 307-335.
[SU2] R. Schoen and K. Uhlenbeck, Boundary regularity and the Dirichlet problem of harmonic maps, J. Differential Geom. 18 (1983), 253-268.
[S] L. Simon, Asymptotics for a class of non-linear evolution equations with applications to geometric problems, Ann. of Math. (2) 118 (1983), 525-571.

DEPARTMENT OF MATHEMATICS, PRINCETON UNIVERSITY, PRINCETON, NEW JERSEY 08544
With F. Almgren and W. Browder in Springer Lecture Notes in Math. 1306, 1-22 (1988)

CO-AREA, LIQUID CRYSTALS, AND MINIMAL SURFACES$^1$

F. Almgren, W. Browder, and E. H. Lieb
Department of Mathematics, Princeton University
Princeton, New Jersey 08544, USA

Abstract. Oriented $n$ area minimizing surfaces (integral currents) in $M^{m+n}$ can be approximated by level sets (slices) of nearly $m$-energy minimizing mappings $M^{m+n} \to S^m$ with essential but controlled discontinuities. This gives new perspective on multiplicity, regularity, and computation questions in least area surface theory.

In this paper we introduce a collection of ideas showing relations between co-area, liquid crystals, area minimizing surfaces, and energy minimizing mappings. We state various theorems and sketch several proofs. A full treatment of these ideas is deferred to another paper.

Problems inspired by liquid crystal geometries.$^2$ Suppose $\Omega$ is a region in 3 dimensional space $\mathbb{R}^3$ and $f$ maps $\Omega$ to the unit 2 dimensional sphere $S^2$ in $\mathbb{R}^3$. Such an $f$ is a unit vectorfield in $\Omega$ to which we can associate an 'energy'

$$\mathcal{E}(f) = \frac{1}{8\pi} \int_\Omega |Df|^2 \, d\mathcal{L}^3;$$

here $Df$ is the differential of $f$ and $|Df|^2$ is the square of its Euclidean norm; in terms of coordinates,

$$|Df(x)|^2 = \sum_{i=1}^{3} \sum_{k=1}^{3} \left( \frac{\partial f^k}{\partial x_i}(x) \right)^2$$

for each $x$. The factor $1/8\pi$, which equals 1 divided by twice the area of $S^2$, is a useful normalizing constant. It is straightforward to show the existence of $f$'s of least energy for given boundary values (in an appropriate function space).
Such boundary value problems have been associated with liquid crystals.$^3$ In this context, a "liquid crystal" in a container $\Omega$ is a fluid containing long rod like molecules whose directions are specified by a unit vectorfield. These molecules have a preferred alignment relative to each other; in the present case the preferred alignment is parallel. If we imagine the molecule orientations along $\partial\Omega$ to be fixed (perhaps by suitably etching container walls) then interior parallel alignment may not be possible. In one model the system is assumed to have 'free energy' given by our function $\mathcal{E}$ and the crystal geometry studied is that which minimizes this free energy.

$^1$ This research was supported in part by grants from the National Science Foundation.
$^2$ The research which led to the present paper began as an investigation of a possible equality between infimums of $m$-energy and the $n$ area of $n$ dimensional area minimizing manifolds in $\mathbb{R}^{m+n}$ suggested in section VIII(C) of the paper, Harmonic maps with defects [BCL] by H. Brezis, J-M. Coron, and E. Lieb. Although the specific estimates suggested there do not hold (by virtue of counterexamples [MF][W1][YL]) their general thrust does manifest itself in the results of the present paper.
$^3$ See, for example, the discussion by R. Hardt, D. Kinderlehrer, and M. Luskin in [HKL].
If $\Omega$ is the unit ball and $f(x) = x$ for $|x| = 1$, then there is no continuous extension of these boundary values to the interior; indeed the unique least energy $f$ is given by setting $f(x) = x/|x|$ for each $x$. It turns out that this singularity is representative, and the general theorem is that least energy $f$'s exist and are smooth except at isolated points $p$ of discontinuity where 'tangential structure' is $\pm x/|x|$ (up to a rotation), e.g. $f$ has local degree equal to $\pm 1$ [SU] [BCL VII]. As a further step towards an understanding of the geometry of energy minimizing $f$'s one might seek estimates on the number of points of discontinuity which such an $f$ can have, e.g. if the boundary values are not too wild must the number of points of discontinuity be not too big?$^4$ An alternative problem to this is to seek a lower bound on the energy when the points of discontinuity are prescribed together with the local degrees of the mapping being sought. This question has a surprisingly simple answer as follows.

THEOREM. Suppose $p_1, \ldots, p_N$ are points in $\mathbb{R}^3$ and $d_1, \ldots, d_N \in \mathbb{Z}$ are the prescribed degrees with $\sum_{i=1}^{N} d_i = 0$. Let $\inf \mathcal{E}$ denote the infimum of the energies of (say, smooth) mappings from $\mathbb{R}^3 \sim \{p_1, \ldots, p_N\}$ to $S^2$ which map to the 'south pole' outside some bounded region in $\mathbb{R}^3$ and which, for each $i$, map small spheres around $p_i$ to $S^2$ with degree $d_i$. Then $\inf \mathcal{E}$ equals the least mass $M(T)$ of integral 1 currents $T$ in $\mathbb{R}^3$ with

$$\partial T = \sum_{i=1}^{N} d_i [\![p_i]\!].$$

This fact (stated in slightly different language) is one of the central results of [BCL]. We would like to sketch a proof in two parts: first by showing that $\inf \mathcal{E} \le \inf M$ (with the obvious meanings) and then by showing that $\inf M \le \inf \mathcal{E}$. The proof of the first part follows [BCL] while the second part is new. It is in this second part that the coarea formula makes its appearance.
Proof that $\inf \mathcal{E} \le \inf M$. The first inequality is proved by construction as illustrated in Figure 1. We there represent that case in which $N$ equals 2 and $p_1$ and $p_2$ are distinct points with $d_1 = -1$ and $d_2 = +1$. We choose and fix a smooth curve $C$ connecting these two points and orient $C$ by a smoothly varying unit tangent vector field $\zeta$ which points away from $p_1$ and towards $p_2$. The associated 1 dimensional integral current is $T = t(C, 1, \zeta)$ and its mass $M(T)$ is the length of $C$ since the density specified is everywhere equal to 1.$^5$ We now choose (somewhat arbitrarily) and fix two smoothly varying unit normal vector fields $\eta_1$ and $\eta_2$ along $C$ which are perpendicular to each other and for which, at each point $x$ of $C$, the 3-vector $\eta_1(x) \wedge \eta_2(x) \wedge \zeta(x)$ equals the orienting 3-vector $e_1 \wedge e_2 \wedge e_3$ for $\mathbb{R}^3$. These two vector fields are a 'framing' of the normal bundle of $C$.

We then construct a mapping $\gamma$ of $\mathbb{R}^2$ onto the unit 2 sphere $S^2$ which is a slight modification of the inverse to stereographic projection. To construct such $\gamma$ we fix a huge radius $R$ in $\mathbb{R}^2$ and require: (i) if $|y| \le R$ then $\gamma(y)$ is that point in $S^2$ which maps to $y$ under stereographic projection $S^2 \to \mathbb{R}^2$ from the south pole $q$ of $S^2$; (ii) if $|y| \ge 2R$ then $\gamma(y) = q$; (iii) for $R < |y| < 2R$, $\gamma(y)$ is suitably interpolated. See Appendix A.2.

Next we choose some smoothly varying (and very small) radius function $\delta$ on $C$ which vanishes only at the endpoints $p_1$ and $p_2$. Finally, as our mapping $f$ from $\mathbb{R}^3$ to $S^2$ with which to estimate $\mathcal{E}(f)$ we specify the following. If $p$ in $\mathbb{R}^3$ can be written $p = x + s\,\eta_1(x) + t\,\eta_2(x)$ for some $x$ in $C$ and some $s$ and $t$ with $s^2 + t^2 \le \delta(x)^2$, then

$$f(p) = \gamma\left( \frac{2Rs}{\delta(x)},\ \frac{2Rt}{\delta(x)} \right).$$

Otherwise, $f(p) = q$. We leave it as an exercise to the reader to use the fact that $\gamma$ is conformal for $|y| < R$ to check that $\mathcal{E}(f)$ very nearly equals $M(T)$; see Appendix A.2. The remainder of the proof that $\inf \mathcal{E} \le \inf M$ is also left to the reader.

$^4$ As it turns out, away from the boundary of $\Omega$, the number of these points is bounded a priori independent of boundary values.
$^5$ Formally, a 1 current such as $T$ is a linear functional on smooth differential 1 forms in $\mathbb{R}^3$. If $\varphi$ is such a 1 form then
$$T(\varphi) = \int_{x \in C} \langle \zeta(x), \varphi(x) \rangle \, d\mathcal{H}^1 x.$$
To each point $p$ in $\mathbb{R}^3$ is associated the 0 dimensional current $[\![p]\!]$ which maps the smooth function $\psi$ to the number $\psi(p)$. See Appendix A.4.

Figure 1. Construction of a mapping $f$ (indicated by dashed arrows) from $\mathbb{R}^3$ to $S^2$ having energy $\mathcal{E}(f)$ not much greater than the length of the curve $C$ connecting the points $p_1$ and $p_2$. Small disks normal to $C$ map by $f$ to cover $S^2$ once in a nearly conformal way. This implies that small spheres around $p_1$ map to $S^2$ with degree $-1$ while small spheres around $p_2$ map with degree $+1$. The 1 current $t(C, 1, \zeta)$ is the slice $\langle E^3, f, p \rangle$ of the Euclidean 3 current $E^3$ by the mapping $f$ and the 'north pole' $p$ of $S^2$. (Label in the figure: $\gamma$, inverse to stereographic projection (modified).)
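The construction of $\gamma$ and $f$ above can be sketched in code for the simplest case. The following is an illustration under stated assumptions, not the authors' construction: the interpolation in step (iii) is one hypothetical choice (the radius is stretched so that $|y| \to 2R$ is sent to infinity before applying inverse stereographic projection), the curve $C$ is taken to be the straight segment from $p_1 = (0,0,0)$ to $p_2 = (0,0,1)$ with framing $\eta_1 = e_1$, $\eta_2 = e_2$, $\zeta = e_3$, and the radius function is assumed to be $\delta(x) = \varepsilon z (1 - z)$.

```python
import math

SOUTH_POLE = (0.0, 0.0, -1.0)


def inverse_stereographic(y1, y2):
    """Point of S^2 whose stereographic projection from the south pole is (y1, y2)."""
    s = y1 * y1 + y2 * y2
    return (2 * y1 / (1 + s), 2 * y2 / (1 + s), (1 - s) / (1 + s))


def gamma(y1, y2, R):
    """Modified inverse stereographic projection, following rules (i)-(iii)."""
    rho = math.hypot(y1, y2)
    if rho <= R:                      # (i) exact inverse stereographic projection
        return inverse_stereographic(y1, y2)
    if rho >= 2 * R:                  # (ii) constant south pole
        return SOUTH_POLE
    # (iii) hypothetical interpolation: stretch the radius from [R, 2R) onto [R, infinity)
    stretched = R * R / (2 * R - rho)
    scale = stretched / rho
    return inverse_stereographic(scale * y1, scale * y2)


def dipole_map(p, R=50.0, eps=0.1):
    """The map f for the straight segment from (0,0,0) to (0,0,1) (assumed setup)."""
    s, t, z = p
    if not 0.0 < z < 1.0:
        return SOUTH_POLE
    delta = eps * z * (1.0 - z)       # small radius function vanishing at the endpoints
    if s * s + t * t > delta * delta:
        return SOUTH_POLE
    return gamma(2 * R * s / delta, 2 * R * t / delta, R)
```

With these choices points on the segment map to the north pole $(0, 0, 1)$, points outside the thin tube of radius $\delta$ map to the south pole, and each small normal disk covers $S^2$ once.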
Proof that $\inf M \le \inf \mathcal{E}$. Suppose that $f$ does map $\mathbb{R}^3$ to $S^2$, has degree $d_i$ at each $p_i$, and maps to the south pole outside some bounded region. From dimensional considerations one would expect that for most points $w$ in $S^2$ the inverse image $f^{-1}\{w\}$ would be a collection of curves connecting the various points $p_1, \ldots, p_N$. H. Federer's coarea formula is what enables one to quantify this idea; see Appendix A.5. This formula asserts

$$\int_{w \in S^2} \mathcal{H}^1(f^{-1}\{w\}) \, d\mathcal{H}^2 w = \int_{x \in \mathbb{R}^3} J_2 f(x) \, d\mathcal{L}^3 x;$$

here $\mathcal{H}^1$ and $\mathcal{H}^2$ are Hausdorff's 1 and 2 dimensional measures in $\mathbb{R}^3$ and $\mathcal{L}^3$ is Lebesgue's 3 dimensional measure for $\mathbb{R}^3$. Also $J_2 f(x)$ here denotes the 2 dimensional Jacobian of $f$ at $x$ and a key observation (as noted in [BCL]) is that $J_2 f(x)$ is always less than or equal to half of $|Df(x)|^2$ with equality only if the differential mapping $Df(x) : \mathbb{R}^3 \to \mathrm{Tan}(S^2, f(x))$ is maximally conformal; see Appendix A.1.3. Also central to the present analysis is the manner in which the curves $f^{-1}\{w\}$ connect the various points $p_1, \ldots, p_N$ and how they relate to the prescribed degrees $d_1, \ldots, d_N$. This connectivity is naturally measured by the current structure of these $f^{-1}\{w\}$'s which comes from the slicing theory for currents; see Appendix A.5. To set this up we regard $\mathbb{R}^3$ as the Euclidean current $E^3$ (oriented by the 3 vector $e_1 \wedge e_2 \wedge e_3$). The slice of $E^3$ by the map $f$ at the point $w$ in $S^2$ is the current

$$\langle E^3, f, w \rangle = t(f^{-1}\{w\}, 1, \zeta);$$

the meanings here are the same as for the current $T$ discussed above. A check of orientations and degrees shows that

$$\partial \langle E^3, f, w \rangle = \sum_{i=1}^{N} d_i [\![p_i]\!];$$

compare with our construction of $\eta_1$ and $\eta_2$ above. It follows immediately that

$$4\pi \inf M(T) = \mathcal{H}^2(S^2) \inf M(T) \le \int_{w \in S^2} M(\langle E^3, f, w \rangle) \, d\mathcal{H}^2 w = \int_{\mathbb{R}^3} J_2 f \, d\mathcal{L}^3 \le \frac{1}{2} \int_{\mathbb{R}^3} |Df|^2 \, d\mathcal{L}^3.$$

This finishes the proof that $\inf M \le \inf \mathcal{E}$.
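As a sanity check on the coarea formula and on the bound $J_2 f \le \tfrac12 |Df|^2$, one can work out the radial map by hand. This worked example is an added illustration restricted to the unit ball $B$ (so it does not satisfy the "south pole outside a bounded region" condition of the theorem); it only verifies the two identities used in the chain of inequalities. For $f(x) = x/|x|$ one has $|Df(x)|^2 = 2/|x|^2$ and $J_2 f(x) = 1/|x|^2$, so equality holds in $J_2 f \le \tfrac12 |Df|^2$, and each fiber $f^{-1}\{w\} \cap B$ is a radius of length 1:

```latex
\begin{align*}
\int_{w \in S^2} \mathcal{H}^1\bigl(f^{-1}\{w\} \cap B\bigr)\, d\mathcal{H}^2 w
  &= \int_{S^2} 1 \, d\mathcal{H}^2 = 4\pi, \\
\int_{B} J_2 f \, d\mathcal{L}^3
  &= \int_0^1 \frac{1}{r^2}\, 4\pi r^2 \, dr = 4\pi, \\
\mathcal{E}(f)
  &= \frac{1}{8\pi} \int_B |Df|^2 \, d\mathcal{L}^3
   = \frac{1}{8\pi} \int_0^1 \frac{2}{r^2}\, 4\pi r^2 \, dr = 1 .
\end{align*}
```

The two sides of the coarea formula agree, and the normalized energy equals the length of a single radius, consistent with the role of the factor $1/8\pi$.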
First Generalization. Since the methods used in the proofs of the two inequalities are quite general one might correctly suspect that considerable generalization is possible. Suppose,
for example, we fix B = (PI,... ,PN) as a general boundary set and let To be the family of those mappings f of R3 to S2 which are locally Lipschitzian except possibly on B, which map to the southpole outside some bounded region, and which have finite energy. Since deformations of mappings in To do not alter discrete combinatorial structures we are led to study properties of homotopy classes fl(To) of mappings in To-it is most useful here if our homotopies X0,11 x R3 -. S2 are permitted to have isolated point discontinuties; see Appendix A.3.
Our conditions about mapping degrees above generalize to requirements about degrees d(f, S) of f on general integral 2 dimensional cycles S in R3 - B. It turns out that such a degree d(f, S) depends only on the homotopy class of f and on the homology class of S.
It also turns out that the relative homology classes of the slices $\langle \mathbf{E}^3, f, w\rangle$ depend only on the homotopy class $[f]$ of $f$. We denote this homology class by $s[f]$. The Kronecker index is a pairing between 2 dimensional cycles $S$ in $\mathbf{R}^3 \sim B$ and 1 currents $T$ having boundary in $B$. In general the Kronecker index $k(S,T)$ is the sum over points of intersection of $S$ and $T$ of an index of relative orientations; see Appendix A.6. These various ideas are related in the following theorem.
THEOREM. The diagram below is commutative. Furthermore, $s$ is an isomorphism, and $d$ and $k$ are injections.
$$\begin{array}{ccc}
\Pi(\mathcal{F}_0) & \xrightarrow{\ s\ } & H_1(\mathbf{R}^3, B; \mathbf{Z}) \\
 & {\scriptstyle d}\searrow \qquad \swarrow{\scriptstyle k} & \\
 & \mathrm{Hom}(H_2(\mathbf{R}^3 \sim B; \mathbf{Z}), \mathbf{Z}) &
\end{array}$$
With F. Almgren and W. Browder in Springer Lecture Notes in Math. 1306, 1-22 (1988) 6
Here
$s[f] = \text{``}[f^{-1}\{w\}]\text{''} = [\langle \mathbf{E}^3, f, w\rangle]$ = the integral homology class of the 1 current slice;
$d[f][S] = d(f,S)$ = the degree of $f$ on the 2 cycle $S$;
$k[T][S] = k(S,T)$ = the Kronecker index of the 2 cycle $S$ and the 1 current $T$.
Our relations between energy minimization and area minimization become the following.
THEOREM. Suppose that $P$ is an integral 1 current in $\mathbf{R}^3$ with the support of $\partial P$ in $B$. Suppose also that $T_{\mathbf{Z}}$ has least mass among all integral 1 currents which are homologous to $P$ over the integers $\mathbf{Z}$ and that $T_{\mathbf{R}}$ has least mass among all integral 1 currents which are homologous to $P$ over the real numbers $\mathbf{R}$. Then
$$\mathbf{M}(T_{\mathbf{Z}}) = \inf\{\mathcal{E}(f) : s[f] = [P]\}$$
and
$$\mathbf{M}(T_{\mathbf{R}}) = \inf\{\mathcal{E}(f) : d[f] = k[P]\}.$$
Moreover, $\mathbf{M}(T_{\mathbf{Z}}) = \mathbf{M}(T_{\mathbf{R}})$ (because of our special situation).
Further generalizations. The essential ingredients of the analyses above remain, for example, if $\mathbf{R}^3$ is replaced by a general $m+n$ dimensional manifold $M$ (without boundary) which is smooth, compact, and oriented (or $M = \mathbf{R}^{m+n}$), and $B$ is replaced by a sufficiently nice (possibly empty) compact subset of $M$ of dimension $n-1$. To study $n$ dimensional integral currents in $M$ having boundary in $B$ we consider mappings $f$ of $M$ to a sphere of the complementary dimension $m$. The spaces $\mathcal{F}$ and $\mathcal{F}_0$ of such mappings and the homotopy classes $\Pi(\mathcal{F})$ are specified in sections A.3.1 and A.3.2 of the Appendix. Some discontinuities are essential (see footnote 6). It seems worthwhile to consider three different energies $\mathcal{E}_1$, $\mathcal{E}_2$, and $\mathcal{E}_3$ for mappings in $\mathcal{F}_0$. $\mathcal{E}_1$ is a normalization of the usual '$m$ energy' of mappings, $\mathcal{E}_3$ is a normalized Jacobian integral associated with the coarea formula, and $\mathcal{E}_2$ is an intermediate energy; see Appendix A.3.2. As indicated above, mapping degrees and the Kronecker index have general meanings which are set forth in sections A.6 and A.7 of the Appendix. These various ideas are related as the following theorem shows.
THEOREM. The diagram of mappings below is well defined and is commutative. In particular, the images of $d$ and $k$ and $j$ in $\mathrm{Hom}(H_m(M \sim B; \mathbf{Z}), \mathbf{Z})$ are the same. Furthermore, $s$ is an isomorphism.
$$\begin{array}{ccccccc}
\Pi(\mathcal{F}) & \xrightarrow{\ s\ } & H_n(M, B; \mathbf{Z}) & \xrightarrow{\ c\ } & c[H_n(M, B; \mathbf{Z})] & \xrightarrow{\ i\ } & H_n(M, B; \mathbf{R}) \\
 & {\scriptstyle d}\searrow & \downarrow{\scriptstyle k} & \swarrow{\scriptstyle j} & & & \\
 & & \mathrm{Hom}(H_m(M \sim B; \mathbf{Z}), \mathbf{Z}) & & & &
\end{array}$$
6 Suppose $m = 2$ and $n = 5$ and $M = \mathbf{R}^7$, and $B$ is a smoothly embedded copy of 2 dimensional complex projective space $\mathbf{CP}(2)$. Then there are no continuous mappings $f$ from the complement of $B$ to $S^2$ such that small 2 spheres $S$ which link $B$ once map to $S^2$ with degree one. Any $f$ satisfying such a linking condition for general position $S$'s near $B$ must have interior discontinuities of dimension at least 3.
Here
$s[f] = \text{``}[f^{-1}\{p\}]\text{''} = [\langle [[M]], f, p\rangle]$ = the integral homology class of the $n$ current slice;
$d[f][S] = d(f,S)$ = the degree of $f$ on the $m$ cycle $S$;
$k[T][S] = k(S,T)$ = the Kronecker index of the $m$ cycle $S$ and the $n$ current $T$ (see footnote 7);
$c$ is induced by the coefficient inclusion $\mathbf{Z} \to \mathbf{R}$; $i$ is the inclusion; and $j$ is defined by commutativity.
We defer proof of this theorem to our fuller treatment of this subject. The natural setting and generality of such relationships are still under investigation. The relations between energy minimization and area minimization then become the following.
MAIN THEOREM. Suppose $P$ is an integral current in $M$ with the support of $\partial P$ contained in $B$ so that the integral homology class $[P]$ of $P$ belongs to $H_n(M, B; \mathbf{Z})$. Let $T_{\mathbf{Z}}$ be an integral current of least mass among all integral currents belonging to the same integral homology class as $P$ in $H_n(M, B; \mathbf{Z})$, and let $T_{\mathbf{R}}$ be an integral current of least mass among all integral currents belonging to the same real homology class as $P$ in $H_n(M, B; \mathbf{R})$. Then
$$\mathbf{M}(T_{\mathbf{Z}}) = \inf\{\mathcal{E}_1(f) : s[f] = [P]\} = \inf\{\mathcal{E}_2(f) : s[f] = [P]\} = \inf\{\mathcal{E}_3(f) : s[f] = [P]\}$$
and
$$\mathbf{M}(T_{\mathbf{R}}) = \inf\{\mathcal{E}_1(f) : d[f] = k[P]\} = \inf\{\mathcal{E}_2(f) : d[f] = k[P]\} = \inf\{\mathcal{E}_3(f) : d[f] = k[P]\}.$$
7 Suppose $m = 2$ and $n = 1$ and $M$ is a 3 dimensional real projective space $\mathbf{RP}(3)$ and $T = t(X, 1, \xi)$; here $X$ is a 1 dimensional real projective space $\mathbf{RP}(1)$ sitting in $\mathbf{RP}(3)$ in the usual way and $\xi$ is some orientation function. Since $T$ is not a boundary while $2T$ is, we conclude that the homology class $[T] \in H_1(M, \emptyset; \mathbf{Z}) = \mathbf{Z}_2$ is not the 0 class although $k(S,T) = 0$ for each 2 cycle $S$ in $M$. In particular, the mapping $k$ is generally not an injection.
In general, of course, $\mathbf{M}(T_{\mathbf{R}}) \le \mathbf{M}(T_{\mathbf{Z}})$. Although we again defer complete proofs to our fuller treatment of this subject, it does seem useful to sketch some of the main ideas.
Proof of the inequality "$\inf \mathcal{E} \le \inf \mathbf{M}$". The proof here is again by construction. We will indicate the main ingredients in a special case. Suppose, say, $M = \mathbf{R}^{m+n}$, $B$ is polyhedral, and $T$ is an integral $n$ current which is mass minimizing subject to some appropriate constraints as in the Main Theorem above. We will construct a mapping $f: \mathbf{R}^{m+n} \to S^m$ in the relevant homotopy class such that $\mathcal{E}_1(f)$, $\mathcal{E}_2(f)$, and $\mathcal{E}_3(f)$ are nearly equal and are not much bigger than $\mathbf{M}(T)$. By virtue of the Strong Approximation Theorem for integral currents [F1 4.2.20] we can modify $T$ slightly to become simplicial with only a slight increase in mass.
Suppose then that we can express
$$T = \sum_{\alpha=1}^{M} t(\Delta_\alpha, z_\alpha, \xi_\alpha)$$
as a 'simplicial' integral current (with the obvious interpretation). For each $k = 0, \ldots, n$ we denote by $K_k$ the collection of closed $k$ simplexes which occur as $k$ dimensional faces of $n$ simplexes among the $\Delta_\alpha$'s. We then choose numbers $0 < \delta_n \ll \delta_{n-1} \ll \delta_{n-2} \ll \cdots \ll \delta_0 \ll 1$ and define sets $N_0, N_1, \ldots, N_n$ in $\mathbf{R}^{m+n}$ by setting
$$N_0 = \{x : \mathrm{dist}(x, \cup K_0) < \delta_0\}$$
and, for each $k = 1, \ldots, n$,
$$N_k = \{x : \mathrm{dist}(x, \cup K_k) < \delta_k\} \sim (N_{k-1} \cup N_{k-2} \cup \cdots \cup N_0).$$
We assume that $\delta_0, \ldots, \delta_n$ have been chosen so that the distinct components of each $N_k$ correspond to distinct $k$ simplexes in $K_k$.
We now define mappings $f_{n+1}, f_n, \ldots, f_0 = f$ as follows.
First, the mapping $f_{n+1}: \mathbf{R}^{m+n} \sim (N_n \cup \cdots \cup N_0) \to S^m$ is defined by setting $f_{n+1}(x) = q$ for each $x$.
Second, the mapping $f_n: \mathbf{R}^{m+n} \sim (N_{n-1} \cup \cdots \cup N_0) \to S^m$ is constructed geometrically in virtually the same manner as the mapping $g$ in the example A.8 in the Appendix. Details are left to the reader.
Third, the mapping $f_{n-1}: \mathbf{R}^{m+n} \sim (N_{n-2} \cup \cdots \cup N_0) \to S^m$ is constructed geometrically in a manner virtually identical with the construction of the mapping $f_{\delta,r}$ of example A.8 of the Appendix (with $\delta$, $r$ replaced by $\delta_n/2$, $\delta_{n-1}$ respectively there). The mapping $f_{n-1}$ is Lipschitz across parts of $n-1$ simplexes which do not lie in $B$ and is discontinuous on those $n-1$ simplexes which contain part of $\partial T$.
Assuming $f_{n+1}, f_n, \ldots, f_{k+1}$ have been constructed we define
$$f_k: \mathbf{R}^{m+n} \sim (N_{k-1} \cup \cdots \cup N_0) \to S^m$$
as follows. Each point $v$ in $N_k \sim (N_{k-1} \cup \cdots \cup N_0)$ can be written uniquely in the form $v = v_0 + (v - v_0)$ where $v_0$ is the unique closest point in $\cup K_k$ to $v$ and $|v - v_0| < \delta_k$. If $v \ne v_0$ we note that
$$v_1 = v_0 + \delta_k\,\frac{v - v_0}{|v - v_0|} \in \mathrm{dmn}(f_{k+1})$$
and we set $f_k(v) = f_{k+1}(v_1)$. A direct extension of the estimates used for the example A.8 of the Appendix shows that the energies $\mathcal{E}_1(f)$, $\mathcal{E}_2(f)$, and $\mathcal{E}_3(f)$ very nearly equal $\mathbf{M}(T)$.
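The inductive step above extends a map inward from the boundary of a tube by holding it constant along rays from the closest skeleton point. A minimal Python sketch of that one step follows. The names `push_to_tube_boundary`, `extend_radially` and `closest_point` are hypothetical helpers introduced only for illustration, and the skeleton used in the test (the third coordinate axis in $\mathbf{R}^3$) is a toy assumption, not the simplicial complex of the proof.

```python
import math


def push_to_tube_boundary(v, v0, delta_k):
    """Return v1 = v0 + delta_k * (v - v0) / |v - v0| (requires v != v0)."""
    diff = [a - b for a, b in zip(v, v0)]
    dist = math.sqrt(sum(c * c for c in diff))
    if dist == 0.0:
        raise ValueError("v coincides with its closest point v0; v1 is undefined")
    return [b + delta_k * c / dist for b, c in zip(v0, diff)]


def extend_radially(f_next, closest_point, delta_k):
    """Build f_k from f_{k+1} via f_k(v) = f_{k+1}(v1).

    closest_point is a hypothetical callable returning the nearest point
    of the k-skeleton; it stands in for 'the unique closest point in K_k'.
    """
    def f_k(v):
        v0 = closest_point(v)
        return f_next(push_to_tube_boundary(v, v0, delta_k))
    return f_k
```

The extended map is constant along each ray leaving the skeleton, which is why it is in general discontinuous on the skeleton itself.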
Proof of the inequality "$\inf \mathbf{M} \le \inf \mathcal{E}$". The argument here is a direct extension of the corresponding argument given above and is left to the reader.
Remarks. (1) One of the main reasons for analyzing relations between the energy of mappings and the area of currents is that it provides a way to study $n$ dimensional area minimizing integral currents (whose geometry is not specified ahead of time) by studying functions and integrals over the given ambient manifold. This seems the first such scheme which works in general codimensions. For real currents, however, differential forms play a role roughly analogous to that of our function spaces $\mathcal{F}_0$; in this regard see, for example, the paper of H. Federer, Real flat chains, cochains, and variational problems [F2 4.10(4), 4.11(2)]. Incidentally, in the language of [F2 5.12, page 400], examples show that the equation in question there is not always true under the alternative hypotheses of [F2 5.10].
(2) Suppose $C$ consists of smooth simple closed curves in $\mathbf{R}^3$ oriented by $\xi$. Suppose also for positive integers $\nu$ we have reasonable mappings $f_\nu$ from the complement of $C$ in $\mathbf{R}^3$ to the circle $S^1$ with the property that small circles which link $C$ once are mapped to $S^1$ by $f_\nu$ with degree $\nu$. Because of the dimensions we have
$$\mathcal{E}_1(f_\nu) = \mathcal{E}_2(f_\nu) = \mathcal{E}_3(f_\nu) = \frac{1}{2\pi}\int |Df_\nu|\,d\mathcal{L}^3.$$
If $f_\nu$ is nearly $\mathcal{E}_1$ energy minimizing then for most $w$'s in $S^1$ the slice $\langle \mathbf{E}^3, f_\nu, w\rangle$ will be defined with boundary $t(C, \nu, \xi)$ and will be nearly mass minimizing. H. Parks, in his memoir, Explicit determination of area minimizing hypersurfaces, II [PH], used a similar energy for mappings to the real numbers $\mathbf{R}$ (instead of to $S^1$) and was able to exhibit an algorithm for finding area minimizing surfaces. The technique used by Parks requires that $C$ be extreme, i.e. that it lie on the boundary of its convex hull. The analysis of our paper on the other hand applies to any collection of curves which, for example, may be knotted or linked in any way. One of our hopes is to develop a method of computation analogous to that of Parks.
(3) Suppose that $C$ and the mappings $f_\nu$ have the same meaning as in (2) above. If $\theta$ denotes the usual (multiple-valued radian) angle function on $S^1$ then $d\theta$ is a well defined closed 1-form whose pullbacks $f_\nu^{\#}d\theta$ give closed 1 forms on the complement of $C$ in $\mathbf{R}^3$ with $|f_\nu^{\#}d\theta| = |Df_\nu|$. For fixed $x_0$ in the complement of $C$ we define functions $g_\nu$ mapping the complement of $C$ to $S^1$ by requiring that
$$\theta \circ g_\nu(x) = \theta \circ f_\nu(x_0) + \int_{\gamma(x)} f_\nu^{\#}d\theta \pmod{2\pi}$$
for each $x$ (with the obvious meanings); here $\gamma(x)$ denotes any oriented path in the complement of $C$ starting at $x_0$ and ending at $x$. It is immediate to check that $g_\nu = f_\nu$ for each $\nu$. If we write $\nu = \lambda \cdot \mu$ for some $\lambda$ and $\mu$, we define $h_\lambda(x)$ in $S^1$ by requiring
$$\theta \circ h_\lambda(x) = \frac{1}{\mu}\int_{\gamma(x)} f_\nu^{\#}d\theta \pmod{2\pi}$$
for $\gamma$ as above. The mapping $h_\lambda$ maps small circles with the same degrees as does $f_\lambda$. Taking $\mu = \nu$ we readily conclude, for example, that
$$\inf\{\mathbf{M}(T) : \partial T = t(C, \nu, \xi)\} = \nu\,\inf\{\mathbf{M}(T) : \partial T = t(C, 1, \xi)\}$$
for each $\nu$. This estimate implies that integral and real mass minimizing 2 currents having boundary $t(C, 1, \xi)$ have the same masses [F2 5.8]; although this has been known for some time, the present proof by factoring mappings seems new and simpler. This fact (and our proof) extend to $n-1$ dimensional boundaries in general manifolds $M$ of dimension $n+1$ with, for example, the property that each 1 cycle is a boundary. There are counterexamples to such equalities in higher codimensions given first by L. C. Young [YL] and later by F. Morgan [MF] and B. White [W1]. How badly such an equality can fail remains an important open question. It is not even known, for example, if the number
$$\inf\{\mathbf{M}(S)/\mathbf{M}(T) : S, T \in \mathbf{I}_2(\mathbf{R}^4, \mathbf{R}^4) \text{ are mass minimizing with } 0 \ne \partial S = 2\,\partial T\}$$
is positive; note, however, the isoperimetric inequality [A1 2.6].
(4) Suppose $M$ is a complex submanifold of some complex projective space $\mathbf{CP}(n)$ (or, more generally, $M$ is a Kähler manifold). Then any complex analytic (meromorphic) function $f$ from $M$ to the Riemann sphere $\mathbf{CP}(1) = S^2$ has integral current slices which are absolutely mass minimizing in their integral homology classes [F1 5.4.19]. Such $f$'s are thus necessarily maximally conformal and minimize each of the energies $\mathcal{E}_1$, $\mathcal{E}_2$, and $\mathcal{E}_3$ among functions in the same homotopy classes.
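The factoring step in Remark (3) can be spelled out as a short worked estimate. This is only a sketch, and it assumes the identification of infimal energy with infimal mass stated in the theorems above; the symbols $m_\nu$ and $e_\nu$ are shorthand introduced here and do not appear in the original.

```latex
% Shorthand (introduced here): m_nu = inf{ M(T) : dT = t(C, nu, xi) },
% e_nu = infimum of E_1 over maps of linking degree nu.
\begin{align*}
  |Dh_1| &= \tfrac{1}{\nu}\,|Df_\nu|
     && \text{(take } \mu = \nu,\ \lambda = 1\text{)} \\
  \mathcal{E}_1(h_1) &= \frac{1}{2\pi}\int |Dh_1|\,d\mathcal{L}^3
     = \tfrac{1}{\nu}\,\mathcal{E}_1(f_\nu)
     && \Longrightarrow\quad e_1 \le \tfrac{1}{\nu}\,e_\nu \\
  \partial(\nu T) &= t(C,\nu,\xi), \quad \mathbf{M}(\nu T) = \nu\,\mathbf{M}(T)
     && \Longrightarrow\quad m_\nu \le \nu\, m_1 \\
  m_\nu &= e_\nu \ \text{ and } \ m_1 = e_1
     && \Longrightarrow\quad m_\nu = \nu\, m_1 .
\end{align*}
```

The first two lines use only the definition of $h_\lambda$; the third is the trivial competitor $\nu T$; the last line is where the energy and mass infima are assumed to agree.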
(5) In the context of this paper, if the mass minimizing current $T$ being sought happens to be unique then most slices of nearly minimizing mappings will be close to that current. In a sense this describes the asymptotic behavior of a sequence $\{f_k\}_k$ of mappings in $\mathcal{F}_0$ converging towards energy minimization; in particular, the real currents
$$\frac{1}{(m+1)\alpha(m+1)}\,[[M]] \llcorner f_k^{\#}\sigma^*$$
must converge to $T$ as $k \to \infty$. If $m = 2$ then the energy $\mathcal{E}_1$ is Dirichlet's integral which is widely studied in the general theory of harmonic mappings between manifolds pioneered by J. Eells and J. Sampson. In any codimension $m$ each $n$ dimensional mass minimizing integral current is a regular minimal submanifold except possibly on a singular set of dimension not exceeding $n - 2$ as shown by F. Almgren in [A2]. It is not yet clear to what extent the present new setup will provide new tools for study of the regularity and singularity properties of mass minimizing integral currents. This could be one of its most important potential uses.
APPENDIX
When not otherwise specified we follow the general terminology of pages 669-671 of H. Federer's treatise, Geometric Measure Theory [F1] or the newer standardized terminology of the 1984 AMS Summer Research Institute in Geometric Measure Theory and the Calculus of Variations as summarized in pages 124-130 of F. Almgren's paper, Deformations and multiple-valued functions [A1].
A.1 Terminology.
A.1.1 We fix positive integers $m$ and $n$ and suppose that $M$ is an $m+n$ dimensional submanifold (without boundary) of $\mathbf{R}^N$ (some $N$) which is smooth, compact, and oriented by the continuous unit $(m+n)$-vectorfield $\varsigma: M \to \Lambda_{m+n}\mathbf{R}^N$; alternatively $M = \mathbf{R}^{m+n}$ with standard orthonormal basis vectors $e_1, \ldots, e_{m+n}$ and orienting $(m+n)$-vector $e_1 \wedge \cdots \wedge e_{m+n}$. We also suppose that $B$ is a finite (possibly empty) union of various (curvilinear) $n-1$ simplexes $\Delta_1, \Delta_2, \ldots, \Delta_J$ associated with some smooth triangulation of $M$.
A.1.2 We denote by $S^m$ the unit sphere in $\mathbf{R} \times \mathbf{R}^m = \mathbf{R}^{1+m}$ with its usual orientation given by the unit $m$-vectorfield $\sigma: S^m \to \Lambda_m\mathbf{R}^{1+m}$; in particular, for each $w \in S^m \subset \mathbf{R}^{1+m} = \Lambda_1\mathbf{R}^{1+m}$, $\sigma(w) = *w$. It is convenient to let $z, y_1, \ldots, y_m$ denote the usual orthonormal coordinates for $\mathbf{R} \times \mathbf{R}^m$ and also let $p, \epsilon_1, \ldots, \epsilon_m$ be the associated orthonormal basis vectors. In particular, $\sigma(p) = *p = \epsilon_1 \wedge \cdots \wedge \epsilon_m$. We regard $p$ as the 'north pole' of $S^m$. The 'south pole' is $q = -p$. We denote by $\sigma^*$ the differential $m$ form (the 'volume form') on $S^m$ dual to $\sigma$.
A.1.3 If $L$ is a linear mapping $\mathbf{R}^{m+n} \to \mathbf{R}^m$ then the polar decomposition theorem guarantees the existence of orthonormal coordinates for $\mathbf{R}^{m+n}$ and $\mathbf{R}^m$ with respect to which $L$ has the matrix representation
$$L = \begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda_m & 0 & \cdots & 0
\end{pmatrix}$$
with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m \ge 0$. In these coordinates we can express the Euclidean norm $|L|$ of $L$ as
$$|L| = (\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_m^2)^{1/2},$$
express the mapping norm $\|L\|$ of $L$ as
$$\|L\| = \lambda_1,$$
and express the mapping norm $\|\Lambda_m L\|$ of the linear mapping $\Lambda_m L$ of $m$-vectors induced by $L$ as
$$\|\Lambda_m L\| = \lambda_1 \lambda_2 \cdots \lambda_m.$$
Whenever $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m \ge 0$ we have
$$\lambda_1 \lambda_2 \cdots \lambda_m \le m^{-m/2}\,(\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_m^2)^{m/2}.$$
If $f$ is a mapping and $L = Df(a)$ is the differential of $f$ at $a$, then $|Df(a)|^2$ is the value of Dirichlet's integrand of $f$ at $a$, and $J_m f(a) = \|\Lambda_m Df(a)\|$ is the $m$ dimensional Jacobian of $f$ at $a$.
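The three norms of A.1.3 depend only on the numbers $\lambda_1, \ldots, \lambda_m$, so they are easy to tabulate. The following Python sketch takes those numbers as input (it does not compute a polar decomposition) and evaluates the right-hand side of the inequality between the Jacobian and the Euclidean norm. The function names are hypothetical and chosen only for this illustration.

```python
import math


def norms_from_singular_values(lams):
    """Return (|L|, ||L||, ||Lambda_m L||) for singular values lams of L."""
    lams = sorted((float(x) for x in lams), reverse=True)
    if not lams or lams[-1] < 0.0:
        raise ValueError("need a non-empty list of non-negative numbers")
    euclidean = math.sqrt(sum(x * x for x in lams))  # |L|
    mapping = lams[0]                                # ||L|| = lambda_1
    jacobian = math.prod(lams)                       # ||Lambda_m L||
    return euclidean, mapping, jacobian


def jacobian_upper_bound(lams):
    """Return m^(-m/2) * |L|^m, which bounds lambda_1 * ... * lambda_m."""
    m = len(lams)
    euclidean, _, _ = norms_from_singular_values(lams)
    return m ** (-m / 2.0) * euclidean ** m
```

When all the $\lambda_i$ coincide (the conformal case) the bound is attained, which is the situation exploited in the estimates of A.2.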
A.2 Modified Stereographic Projection. Stereographic projection of $S^m$ onto $\mathbf{R}^m$ from the south pole $q$ maps $(z, y) \in S^m \sim \{q\}$ to $2y/(1 + z) \in \mathbf{R}^m$ while the inverse mapping $\gamma_0: \mathbf{R}^m \to S^m$ sends $y \in \mathbf{R}^m$ to
$$\gamma_0(y) = \left(\frac{4 - |y|^2}{4 + |y|^2},\ \frac{4y}{4 + |y|^2}\right) \in S^m \sim \{q\}.$$
$\gamma_0$ is an orientation preserving conformal diffeomorphism between $\mathbf{R}^m$ and $S^m \sim \{q\}$ as is readily checked.
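The two formulas of A.2 can be checked against each other numerically. The sketch below is a plain transcription into Python, with points of $S^m$ represented as a pair `(z, y)`; the function names `stereographic` and `gamma0` are labels chosen here for illustration.

```python
import math


def stereographic(z, y):
    """Project (z, y) on S^m minus the south pole to 2*y/(1 + z) in R^m."""
    if math.isclose(z, -1.0):
        raise ValueError("the south pole q has no image")
    return [2.0 * c / (1.0 + z) for c in y]


def gamma0(u):
    """Inverse projection: send u in R^m to a point (z, y) of S^m."""
    s = sum(c * c for c in u)
    z = (4.0 - s) / (4.0 + s)
    y = [4.0 * c / (4.0 + s) for c in u]
    return z, y
```

A sphere of radius $\rho$ about the origin in $\mathbf{R}^m$ is sent by `gamma0` to the latitude sphere at colatitude $2\arctan(\rho/2)$, which is the relation used to fix the radii $r$ and $R$ in the modified projection.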
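For convenience we let $\theta: S^m \to [0, \pi]$ denote angular distance in radians (equivalently, geodesic distance in $S^m$) to $p$. General level sets of $\theta$ are thus $m-1$ spheres of constant latitude while $\theta(p) = 0$ and $\theta(q) = \pi$. Also for $(z, y) \in S^m$ we have $z = \cos\theta(z, y)$ and $|y| = \sin\theta(z, y)$. Latitude lines on $S^m$ are level sets of the function $\omega$ which maps $(z, y) \in S^m \sim \{p, q\}$ to
$$\omega(z, y) = \frac{y}{|y|} \in S^{m-1} \subset \mathbf{R}^m.$$
Certain mappings derived from $\gamma_0$ are important in our constructions. If $0 < \delta \ll 1/2$ is a given very small number we fix $0 < r = r(\delta) \ll R < \infty$ by requiring that $R$ be the radius of the sphere in $\mathbf{R}^m$ which $\gamma_0$ maps to the latitude sphere $\theta = \pi - \delta$ near $q$ in $S^m$ and that $rR/\delta$ be the radius of the sphere in $\mathbf{R}^m$ which $\gamma_0$ maps to the latitude sphere $\theta = \delta$ near $p$. We now modify $\gamma_0$ to obtain a mapping $\gamma = \gamma_\delta = \gamma_{\delta,1}$ which maps $\mathbf{R}^m$ onto all of $S^m$ and which maps points $y$ in $\mathbf{R}^m$ with norm less than $r^2$ to $p$, maps points $y$ in $\mathbf{R}^m$ with norm greater than $2\delta$ to $q$, maps points $y$ in $\mathbf{R}^m$ with norm between $r$ and $\delta$ to $\gamma_0(Ry/\delta)$ and suitably interpolates in the two remaining annular regions. More precisely, we set
$$\gamma(y) = \begin{cases}
p & \text{if } 0 \le |y| \le r^2, \\[4pt]
\left(\cos\!\left(\delta\,\dfrac{|y| - r^2}{r - r^2}\right),\ \sin\!\left(\delta\,\dfrac{|y| - r^2}{r - r^2}\right)\dfrac{y}{|y|}\right) & \text{if } r^2 \le |y| \le r, \\[8pt]
\gamma_0(Ry/\delta) & \text{if } r \le |y| \le \delta, \\[4pt]
\left(\cos(\pi + |y| - 2\delta),\ \sin(\pi + |y| - 2\delta)\,\dfrac{y}{|y|}\right) & \text{if } \delta \le |y| \le 2\delta, \\[8pt]
q & \text{if } 2\delta \le |y|.
\end{cases}$$
In the region $r^2 \le |y| \le r$ the local Lipschitz constants of $\gamma$ do not exceed $\delta/(r - r^2)$, which is less than $2\delta/r$ since $r < 1/2$. Hence
$$\int_{r^2 \le |y| \le r} |D\gamma|^m\,d\mathcal{L}^m \le m^{m/2}\left(\frac{2\delta}{r}\right)^{m}\alpha(m)\,r^m = 2^m m^{m/2}\alpha(m)\,\delta^m$$
which is small if $\delta$ is small. Similarly, in the region $\delta < |y| < 2\delta$ we estimate that the local Lipschitz constants do not exceed 1. Hence
$$\int_{\delta \le |y| \le 2\delta} |D\gamma|^m\,d\mathcal{L}^m \le m^{m/2}\alpha(m)(2\delta)^m = 2^m m^{m/2}\alpha(m)\,\delta^m$$
which is small if $\delta$ is small. Finally we note that, in the region $r < |y| < \delta$, the mapping $\gamma$ is conformal so that
$$\int_{r \le |y| \le \delta} J_m\gamma\,d\mathcal{L}^m = \int_{r \le |y| \le \delta} \|D\gamma\|^m\,d\mathcal{L}^m = m^{-m/2}\int_{r \le |y| \le \delta} |D\gamma|^m\,d\mathcal{L}^m = \mathcal{H}^m\big(S^m \cap \theta^{-1}[\delta, \pi - \delta]\big).$$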
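A small numerical model of the modified projection helps confirm that the five pieces fit together continuously. The Python sketch below works with the colatitude $\theta$ as a function of $|y|$. It assumes that the interpolation on $r^2 \le |y| \le r$ is linear in $|y|$; that form is an assumption made for this sketch, chosen to be consistent with the Lipschitz bound $2\delta/r$ quoted in the text. The closed forms for $r$ and $R$ are derived here from the two latitude conditions and are not stated in the original.

```python
import math


def radii(delta):
    """Return (r, R) determined by the two latitude conditions of A.2.

    gamma_0 sends the sphere of radius rho to colatitude 2*atan(rho/2), so
    R = 2*cot(delta/2) lands on theta = pi - delta, and the condition
    r*R/delta = 2*tan(delta/2) gives r = delta * tan(delta/2)**2.
    """
    R = 2.0 / math.tan(delta / 2.0)
    r = delta * math.tan(delta / 2.0) ** 2
    return r, R


def colatitude(t, delta):
    """Colatitude theta of gamma_delta(y) as a function of t = |y|."""
    r, R = radii(delta)
    if t <= r * r:
        return 0.0                                   # mapped to the north pole p
    if t <= r:
        # ASSUMPTION: linear interpolation from theta = 0 to theta = delta.
        return delta * (t - r * r) / (r - r * r)
    if t <= delta:
        return 2.0 * math.atan(R * t / (2.0 * delta))  # conformal piece gamma_0(R*y/delta)
    if t <= 2.0 * delta:
        return math.pi + t - 2.0 * delta             # unit-speed approach to q
    return math.pi                                   # mapped to the south pole q


def gamma_delta(y, delta):
    """Return gamma_delta(y) as a pair (z, y_part) on S^m."""
    t = math.sqrt(sum(c * c for c in y))
    theta = colatitude(t, delta)
    if t == 0.0:
        return 1.0, [0.0] * len(y)
    scale = math.sin(theta) / t
    return math.cos(theta), [scale * c for c in y]
```

With $\delta = 0.1$ the inner radius $r$ is of order $10^{-4}$, which illustrates how thin the inner interpolation annulus is compared with the conformal region.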
Our mapping $\gamma_{\delta,1}$ from $\mathbf{R}^m$ to $S^m$ preserves orientations and covers once. It is useful to have mappings $\gamma_{\delta,\nu}$ with similar conformal properties but covering $\nu$ times. To do this we fix a ratio $\rho = r(\delta)^2/\delta$ and let $\tau(z, y) = (-z, -y_1, y_2, \ldots, y_m)$ for $(z, y) \in S^m$; the map $\tau$ thus interchanges the north and south poles of $S^m$ while preserving orientation. We then define
$$\gamma_{\delta,\nu}(y) = \begin{cases}
\gamma_\delta(y) & \text{if } \rho\delta \le |y|, \\
\tau^k \circ \gamma_\delta(y/\rho^k) & \text{if } k \in \{1, \ldots, \nu - 2\} \text{ and } \rho^{k+1}\delta \le |y| \le \rho^k\delta, \\
\tau^{\nu-1} \circ \gamma_\delta(y/\rho^{\nu-1}) & \text{if } 0 \le |y| \le \rho^{\nu-1}\delta.
\end{cases}$$
A.3 Mappings and homotopies from $M$ to $S^m$ with controlled discontinuities.
A.3.1 Whenever $f: M \to S^m$ we denote by $C_f$ the closure of the set of points of discontinuity of $f$. We then let $\mathcal{F}$ be the collection of all functions $f: M \to S^m$ such that the closure of $C_f \sim B$ (recall A.1.1) has dimension not exceeding $n - 2$. In case $m$ equals 1 we require that $C_f \subset B$ for functions $f$ in $\mathcal{F}$. Also, if $M$ is $\mathbf{R}^{m+n}$ we require that $f(x) = q$ whenever $|x|$ is sufficiently large.
Similarly, whenever $h: [0,1] \times M \to S^m$ we denote by $C_h$ the closure of the set of discontinuities of $h$. We then say that $f$ and $g$ in $\mathcal{F}$ are s-homotopic provided there is a function $h: [0,1] \times M \to S^m$ such that $h(0, \cdot) = f$ and $h(1, \cdot) = g$ and also
$$C_h \sim \big((\{0\} \times C_f) \cup (\{1\} \times C_g) \cup ([0,1] \times B)\big)$$
lies in $(0,1) \times M$ and has dimension not exceeding $n - 1$ (in case $M$ is $\mathbf{R}^{m+n}$ we additionally require that $h(t, x) = q$ for all $t$ when $|x|$ is sufficiently large); such a function $h$ is called an s-homotopy between $f$ and $g$. We then denote by $\Pi(\mathcal{F})$ the s-homotopy equivalence classes of $\mathcal{F}$.
A.3.2 We denote by $\mathcal{F}_0$
those functions $f$ in $\mathcal{F}$ for which $f|(M \sim C_f)$ is locally Lipschitz and then associate to each such $f$ three energies $\mathcal{E}_1(f)$, $\mathcal{E}_2(f)$, and $\mathcal{E}_3(f)$ given by setting
$$\mathcal{E}_1(f) = \frac{m^{-m/2}}{(m+1)\alpha(m+1)}\int_M |Df|^m\,d\mathcal{H}^{m+n},$$
$$\mathcal{E}_2(f) = \frac{1}{(m+1)\alpha(m+1)}\int_M \|Df\|^m\,d\mathcal{H}^{m+n},$$
$$\mathcal{E}_3(f) = \frac{1}{(m+1)\alpha(m+1)}\int_M J_m f\,d\mathcal{H}^{m+n}.$$
For some analyses (beyond the scope of this present paper) it is important to recognize that
$$J_m f(x) = \big|\langle \sigma^*(f(x)),\ \Lambda^m Df(x)\rangle\big|.$$
We also call the reader's attention to the paper Homotopy classes in Sobolev spaces and the existence of energy minimizing mappings [W2] by B. White in which $p$ energy minimization is studied in homotopy classes of mappings which are not necessarily continuous.
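The normalizing constant $(m+1)\alpha(m+1)$ in the three energies is the $m$ dimensional area of the unit sphere $S^m$, where $\alpha(k)$ denotes the volume of the unit ball in $\mathbf{R}^k$. A short Python sketch makes the constants concrete for small $m$; the function names are hypothetical and the formula $\alpha(k) = \pi^{k/2}/\Gamma(k/2 + 1)$ is the standard one.

```python
import math


def alpha(k):
    """Volume of the unit ball in R^k."""
    return math.pi ** (k / 2.0) / math.gamma(k / 2.0 + 1.0)


def sphere_area(m):
    """Area of the unit sphere S^m, i.e. (m + 1) * alpha(m + 1)."""
    return (m + 1) * alpha(m + 1)


def e1_constant(m):
    """Coefficient m^(-m/2) / ((m + 1) * alpha(m + 1)) in front of E_1."""
    return m ** (-m / 2.0) / sphere_area(m)
```

For $m = 2$ the coefficient of $\int |Df|^2$ is $1/(8\pi)$, and for $m = 1$ the coefficient of $\int |Df|$ is $1/(2\pi)$.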
A.3.3 A basic fact is the following.
PROPOSITION. (1) Each s-homotopy class in $\Pi(\mathcal{F})$ contains a representative $f$ which belongs to $\mathcal{F}_0$ and for which each of the energies $\mathcal{E}_1(f)$, $\mathcal{E}_2(f)$, and $\mathcal{E}_3(f)$ is finite.
(2) Suppose $f$ and $g$ belong to $\mathcal{F}_0$ and are representatives of the same s-homotopy class in $\Pi(\mathcal{F})$. Suppose also that $\mathcal{E}_1(f)$ and $\mathcal{E}_1(g)$ are both finite. Then there is an s-homotopy $h$ between $f$ and $g$ such that $h|([0,1] \times M \sim C_h)$ is locally Lipschitz and
$$\int_{[0,1] \times M} |Dh|^m\,d\mathcal{H}^{m+n+1} < \infty.$$
A.4 Currents. A general $k$ (dimensional) current $T$ is a continuous linear functional on an appropriate space of smooth differential $k$ forms in $\mathbf{R}^N$. The boundary of a $k$ current $T$ is the $k-1$ current $\partial T$ which maps a smooth differential $k-1$ form $\omega$ to the number $\partial T(\omega) = T(d\omega)$; Stokes's theorem becomes a definition. In this paper we are concerned with currents of the form $T = t(E, \theta, \xi)$. In writing such an expression we mean that $\mathrm{set}(T) = E$ is a (bounded) $\mathcal{H}^k$ measurable and $(\mathcal{H}^k, k)$ rectifiable subset of $M$, and that the density function $\theta: E \to \mathbf{R}^+$ is $\mathcal{H}^k \llcorner E$ summable, and that the orientation $\xi$ is an $\mathcal{H}^k \llcorner E$ measurable function whose simple unit $k$ vector values are compatible with the tangent plane structure of $E$. Such a $k$ current $T$ maps a differential $k$ form $\varphi$ to the number
$$T(\varphi) = \int_{x \in E} \langle \xi(x), \varphi(x)\rangle\,\theta(x)\,d\mathcal{H}^k x.$$
Associated with $M$ itself is the $m+n$ current
$$[[M]] = t(M, 1, \varsigma);$$
if $M = \mathbf{R}^{m+n}$ a standard notation is
$$\mathbf{E}^{m+n} = t(\mathbf{R}^{m+n}, 1, \varsigma)$$
with $\varsigma(x) = e_1 \wedge \cdots \wedge e_{m+n}$ for each $x$. The area of a current $T = t(E, \theta, \xi)$ weighted with its density gives its mass,
$$\mathbf{M}(T) = \int_E \theta\,d\mathcal{H}^k = \sup\{T(\varphi) : \|\varphi\| \le 1\}.$$
The theorems of this paper relate to minimization of this mass rather than, say, the $k$ areas of the underlying set $E$ (which is called the size of $T$ and is denoted $\mathbf{S}(T)$). The measure $\|T\|$ associated with mass is thus $\mathcal{H}^k \llcorner E \wedge \theta$ so that $\mathbf{M}(T) = \|T\|(M) = \|T\|(\mathbf{R}^N)$.
A general fact about such a current $T = t(E, \theta, \xi)$ is that its general current boundary ignores closed sets of zero $k-1$ measure, e.g. if $U \subset \mathbf{R}^N$ is open and the support of $\partial T$ inside $U$ has zero $\mathcal{H}^{k-1}$ measure, then $\partial T(\omega) = 0$ for each $\omega$ supported in $U$ [F1 4.1.20].
Suppose that $T = t(E, \theta, \xi)$ is an $n$ current such that the support of $\partial T$ lies in $B$. Because of our special assumptions about $B$ in A.1.1 we can use [F1 4.1.31] together with our preceding remark to infer for each $k = 1, \ldots, J$ the existence of nonnegative real numbers $r_k$ and continuous orientation functions $\xi_k$ on $\Delta_k$ such that
$$\partial T = \sum_{k=1}^{J} t(\Delta_k, r_k, \xi_k).$$
For general (possibly empty) subsets $A$ and $C$ of $M$ with $C \subset A$ we denote by $\mathbf{R}_k(A, C)$ the vector space of those $k$ currents $T = t(E, \theta, \xi)$ with the closure of $E$ contained in $A$ such that $\partial T = t(E', \theta', \xi')$ for some $E', \theta', \xi'$ with the closure of $E'$ contained in $C$. We further let $\mathbf{I}_k(A, C)$ denote the subgroup of those currents $T = t(E, \theta, \xi)$ in $\mathbf{R}_k(A, C)$ such that $\theta$ assumes only positive integer values. It follows from [F1 4.2.16(2)] that $\partial T \in \mathbf{I}_{k-1}(C, \emptyset)$ whenever $T \in \mathbf{I}_k(A, C)$.
When convenient we will denote by $\mathrm{spt}\,T$ the support of a current $T$.
A.5 The coarea formula and slices of currents. A key ingredient of the present paper is slicing the current $[[M]]$ by mappings $f: M \to S^m$ belonging to $\mathcal{F}_0$ and use of the coarea formula to estimate the masses of these slices in terms of the energy $\mathcal{E}_3(f)$. As a consequence of [F1 3.2.22, 4.3.8, 4.3.11] we infer that for $\mathcal{H}^m$ almost every $w \in S^m$ the slice
$$\langle [[M]], f, w\rangle = t(f^{-1}\{w\}, 1, \zeta)$$
is well defined as an $n$ dimensional current. Here, for $\mathcal{H}^n$ almost every $x \in f^{-1}\{w\}$, if $\eta(x)$ is that simple unit $m$ vector associated with the $m$ plane $\ker Df(x)^{\perp}$ in $\mathrm{Tan}(M, x)$ for which
$$\langle \eta(x), \Lambda_m Df(x)\rangle \bullet \sigma(w) > 0$$
then we specify $\zeta(x)$ to be that simple unit $n$ vector associated with $\ker Df(x)$ in $\mathrm{Tan}(M, x)$ for which $\varsigma(x) = \eta(x) \wedge \zeta(x)$; we have used the symbol $\bullet$ to denote the inner product in $\Lambda_m\mathbf{R}^{m+1}$. We further infer from the coarea formula [F1 3.2.22] that
$$(m+1)\alpha(m+1)\,\mathcal{E}_3(f) = \int_{w \in S^m} \mathbf{M}(\langle [[M]], f, w\rangle)\,d\mathcal{H}^m w.$$
Since $\partial [[M]] = 0$ we readily infer from [F1 4.3.1] together with A.3.2 and A.4 above that for $\mathcal{H}^m$ almost every $w \in S^m$, $\partial\langle [[M]], f, w\rangle$ belongs to $\mathbf{I}_{n-1}(B, \emptyset)$.
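A worked instance of the coarea identity may help fix the normalization. The example below is an illustration chosen here, not one from the original: it takes $m = 2$, $n = 1$ and the radial map $f(x) = x/|x|$ restricted to a spherical shell $A = \{a < |x| < b\}$ in $\mathbf{R}^3$, a region used only for the computation.

```latex
% Illustrative assumption: f(x) = x/|x| on the shell A = { a < |x| < b } in R^3.
% Df(x) has singular values 1/|x|, 1/|x|, 0.
\begin{align*}
  J_2 f(x) &= \frac{1}{|x|^2}, \qquad |Df(x)|^2 = \frac{2}{|x|^2}
     \quad\Longrightarrow\quad J_2 f = \tfrac12\,|Df|^2 , \\
  \int_A J_2 f \, d\mathcal{L}^3
     &= \int_a^b \frac{1}{\rho^2}\, 4\pi\rho^2 \, d\rho = 4\pi\,(b-a) , \\
  \int_{w \in S^2} \mathbf{M}\big(\langle \mathbf{E}^3 \llcorner A, f, w \rangle\big)\, d\mathcal{H}^2 w
     &= \int_{S^2} (b-a)\, d\mathcal{H}^2 = 4\pi\,(b-a) .
\end{align*}
```

Each slice is the radial segment $\{\rho w : a < \rho < b\}$ of length $b - a$, and dividing by $(m+1)\alpha(m+1) = 4\pi$ gives a normalized Jacobian energy equal to the common mass $b - a$ of the slices.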
A.6 Kronecker indices of integral currents. Whenever $S \in \mathbf{I}_m(M, M)$ and $T \in \mathbf{I}_n(M, M)$ with
$$\emptyset = \mathrm{spt}\,\partial S \cap \mathrm{spt}\,T = \mathrm{spt}\,S \cap \mathrm{spt}\,\partial T,$$
there is naturally defined the Kronecker index of $S$ and $T$ in $M$, denoted
$$k(S, T) = k(S, T; M) \in \mathbf{Z},$$
which is a direct extension of the definitions in [F1 4.3.20]. For 'sufficiently regular' such currents
$$S = t(E_1, \theta_1, \xi_1) \quad\text{and}\quad T = t(E_2, \theta_2, \xi_2)$$
in 'general position', we can write
$$k(S, T) = \sum_{x \in E_1 \cap E_2} \theta_1(x)\,\theta_2(x)\,\mathrm{sign}\big(\xi_1(x) \wedge \xi_2(x) \bullet \varsigma(x)\big).$$
Among the important facts about the Kronecker index is its ability to characterize real homology classes. We have the following.
PROPOSITION. Suppose $T_1, T_2 \in \mathbf{I}_n(M, B)$ with $\partial T_1 = \partial T_2$ and $k(S, T_1) = k(S, T_2)$ for each $S \in \mathbf{I}_m(M, \emptyset)$ for which both Kronecker indices are defined. Then there is $Q \in \mathbf{R}_{n+1}(M, M)$ such that $\partial Q = T_1 - T_2$.
Proof. In view of [F1 4.4.1] it is sufficient to verify the assertion in the context of Lipschitz singular chains of algebraic topology. Moreover it is sufficient to check that an $n$ cycle $T$ in $M$ is a boundary in case its general position intersections with $m$ cycles $S$ in $M$ all have Kronecker index zero. This is well known.
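The general-position formula for the Kronecker index is a signed, weighted count of crossings. A minimal Python sketch follows, assuming $M = \mathbf{R}^{m+n}$ with its standard orientation, so that the sign of $\xi_1 \wedge \xi_2 \bullet \varsigma$ is the sign of a determinant of tangent frames. The data layout (a list of crossings, each with two densities and two frames) and the function names are hypothetical choices made for this illustration.

```python
def det(rows):
    """Determinant by Laplace expansion along the first row (small sizes only)."""
    n = len(rows)
    if n == 1:
        return rows[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in rows[1:]]
        total += (-1) ** j * rows[0][j] * det(minor)
    return total


def sign(value):
    """Return -1, 0 or +1 according to the sign of value."""
    return (value > 0) - (value < 0)


def kronecker_index(crossings):
    """Sum theta1 * theta2 * sign(xi1 ^ xi2 . e_1 ^ ... ^ e_{m+n}) over crossings.

    Each crossing is (theta1, theta2, frame1, frame2), where frame1 lists m
    vectors spanning the tangent plane of S (in oriented order) and frame2
    lists n vectors for T, all written in the standard basis of R^(m+n).
    """
    total = 0
    for theta1, theta2, frame1, frame2 in crossings:
        rows = [list(v) for v in frame1] + [list(v) for v in frame2]
        total += theta1 * theta2 * sign(det(rows))
    return total
```

For example, a horizontal plane in $\mathbf{R}^3$ oriented by $e_1 \wedge e_2$ meets an upward line of density 2 with index $+2$ and a downward line of density 1 with index $-1$.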
A.7 Degrees of mappings of currents. Suppose $f \in \mathcal{F}_0$ and
$$S = t(E, \theta, \xi) \in \mathbf{I}_m(M \sim C_f, \emptyset).$$
Then the $m$ current $f_{\#}S$ in $S^m$ is naturally defined in accordance with [F1 4.1.14, 4.1.15] with $\partial f_{\#}S = 0$ since $\partial S = 0$. We then infer from [F1 4.1.31] the existence of an integer $d(f, S)$ such that
$$f_{\#}S = t(S^m, d(f, S), \sigma).$$
We call $d(f, S)$ the degree of $f$ on $S$. If $f$ and $S$ are 'sufficiently regular' then, for $\mathcal{H}^m$ almost every $w \in S^m$,
$$d(f, S) = \sum_{x \in E \cap f^{-1}\{w\}} \theta(x)\,\mathrm{sign}\big(\langle \xi(x), \Lambda_m Df(x)\rangle \bullet \sigma(w)\big).$$
Basic properties of degrees are the following.
PROPOSITION. (1) The degree $d(f, S)$ depends only on the real homology class of $S$ in $M \sim B$. More precisely, if $f \in \mathcal{F}_0$, and $S_1, S_2 \in \mathbf{I}_m(M \sim (C_f \cup B), \emptyset)$, and $Q \in \mathbf{R}_{m+1}(M \sim B, M \sim B)$ with $\partial Q = S_1 - S_2$, then
$$d(f, S_1) = d(f, S_2).$$
(2) The degree $d(f, S)$ depends only on the s-homotopy class of $f$. More precisely, if $f, g \in \mathcal{F}_0$ are s-homotopic and $S \in \mathbf{I}_m(M \sim (C_f \cup C_g \cup B), \emptyset)$, then $d(f, S) = d(g, S)$.
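In the lowest dimensional case $m = 1$ the degree of a map on a closed curve is a winding number, which can be computed by accumulating signed angle increments. The Python sketch below does this for a map from a parametrized circle to $S^1 \subset \mathbf{R}^2$. The function name `circle_degree` and the sampling approach are choices made here for illustration; the result is reliable only when consecutive samples are less than a half turn apart.

```python
import math


def circle_degree(f, samples=2000):
    """Estimate the degree of f: [0, 2*pi] -> S^1 (closed loop) by summing angles.

    f(t) must return a pair (cos-like, sin-like) on the unit circle and
    satisfy f(0) == f(2*pi) up to rounding.
    """
    total = 0.0
    prev = f(0.0)
    for i in range(1, samples + 1):
        cur = f(2.0 * math.pi * i / samples)
        cross = prev[0] * cur[1] - prev[1] * cur[0]
        dot = prev[0] * cur[0] + prev[1] * cur[1]
        total += math.atan2(cross, dot)  # signed angle from prev to cur
        prev = cur
    return round(total / (2.0 * math.pi))
```

A map winding $\nu$ times, such as $t \mapsto (\cos \nu t, \sin \nu t)$, has degree $\nu$, and composing with a reflection of the circle reverses the sign.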
A.8 An example showing relations between integral current slices and boundaries, Kronecker indices, and mapping degrees. Suppose, as illustrated in Figure 2, the following.
(a) $M = \mathbf{R}^{m+n}$ with its usual orthonormal basis, and
$$U = \mathbf{U}^{m+1}(0,1) \times \mathbf{U}^{n-1}(0,1)$$
is an open set, and
$$\Delta = \{0\} \times \mathbf{U}^{n-1}(0,1)$$
is an $n-1$ disk with orientation function $\xi: \Delta \to \{e_{m+2} \wedge \cdots \wedge e_{m+n}\}$.
(b) $K$ and $z_1, \ldots, z_K$ are positive integers and $\epsilon_1, \ldots, \epsilon_K \in \{-1, +1\}$.
(c) For each $k$ the vectors
$$p(k), \eta_1(k), \ldots, \eta_m(k) \in S^m \times \{0\} \subset \mathbf{R}^{m+1} \times \mathbf{R}^{n-1}$$
are an orthonormal family such that
$$\eta_1(k) \wedge \cdots \wedge \eta_m(k) \wedge p(k) = e_1 \wedge \cdots \wedge e_{m+1}$$
and also $p(1), \ldots, p(K)$ are distinct.
(d) For each $k$ we let $\Pi_k$ denote the $n$ plane spanned by $p(k)$ and $\{0\} \times \mathbf{R}^{n-1}$ and define the $n$ half disk
$$\Delta_k = \Pi_k \cap U \cap \{x : x \bullet p(k) < 0\}$$
with orientation function $\xi_k: \Delta_k \to \{\epsilon_k\,p(k) \wedge e_{m+2} \wedge \cdots \wedge e_{m+n}\}$.
(e) $0 < \delta \ll r < s \ll 1$ are very small numbers and
$$N = U \cap \{x : \mathrm{dist}(x, \Delta) < r\}$$
and
$$N_k = (U \sim N) \cap \{x : \mathrm{dist}(x, \Delta_k) < 2\delta\}$$
for each $k$; we assume that $\delta$ is small enough so that the sets $N_1, \ldots, N_K$ are positive distances apart.
(f) We denote by $\Sigma$ the small $m$ sphere
$$\Sigma = \partial\mathbf{B}^{m+1}(0, s) \times \{0\}$$
with the standard continuous orientation function $\tau: \Sigma \to \Lambda_m\mathbf{R}^{m+n}$ determined by requiring
$$x \wedge \tau(x) = s \cdot e_1 \wedge \cdots \wedge e_{m+1}$$
[Figure 2. Relations between integral current slices and boundaries, Kronecker indices, and mapping degrees are illustrated by example in Appendix A.8. Labels in the figure: the definition of $f_{\delta,r}$ in $N$ of radius $r$ depends on whether or not $\partial T$ is zero in $U$; $f_{\delta,r}$ maps to the south pole $q$ outside $N$ and $\cup_k N_k$; each $m$ dimensional section normal to $\Delta_2$ in $N_2$ is of radius $\delta$ and maps to $S^m$ by $f_{\delta,r}$ to cover $\epsilon_2 z_2$ times in a nearly conformal way.]
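for each $x$ in $\Sigma$; it follows that
$$\tau(-s \cdot p(k)) = (-1)^{m+1}\,\eta_1(k) \wedge \cdots \wedge \eta_m(k)$$
for each $k$. Here $\cdot$ denotes scalar multiplication of a vector. The $m$ sphere $\Sigma$ 'links' the $n-1$ disk $\Delta$ in $U$ while 'puncturing' each $\Delta_k$ at the point $-s \cdot p(k)$. We then set
$$T = \sum_{k=1}^{K} t(\Delta_k, z_k, \xi_k) \quad\text{and}\quad S = t(\Sigma, 1, \tau)$$
and estimate
(1) The boundary of $T$ inside $U$ is given by
$$\partial T \llcorner U = \Big(\sum_{k=1}^{K} \epsilon_k z_k\Big)\,t(\Delta, 1, \xi)$$
[F1 4.1.8] so that $\partial T \llcorner U = 0$ if and only if $\sum_{k=1}^{K} \epsilon_k z_k = 0$.
(2) The Kronecker index of $S$ and $T$ is given by
$$\begin{aligned}
k(S, T) &= \sum_{k=1}^{K} z_k\,\tau(-s \cdot p(k)) \wedge \xi_k(-s \cdot p(k)) \bullet e_1 \wedge \cdots \wedge e_{m+n} \\
&= \sum_{k=1}^{K} z_k\,(-1)^{m+1}\,\eta_1(k) \wedge \cdots \wedge \eta_m(k) \wedge \epsilon_k\,p(k) \wedge e_{m+2} \wedge \cdots \wedge e_{m+n} \bullet e_1 \wedge \cdots \wedge e_{m+n} \\
&= (-1)^{m+1}\sum_{k=1}^{K} \epsilon_k z_k
\end{aligned}$$
so that $k(S, T) = 0$ if and only if $\partial T \llcorner U = 0$.
We now assume $r < s$ and will construct a mapping $g: U \sim N \to S^m$. We first set $g(x) = q$ (the south pole) if $x$ lies outside both $N$ and all the $N_k$'s. Each point in each $N_k$ can be written uniquely in the form
$$x + y_1\eta_1(k) + \cdots + y_m\eta_m(k)$$
where $x$ is the unique closest point in $\Delta_k$ and $y \in \mathbf{B}^m(0, 2\delta)$; for each such point we set
$$g(x + y_1\eta_1(k) + \cdots + y_m\eta_m(k)) = \gamma_{\delta, z_k}(\epsilon_k y_1, y_2, \ldots, y_m).$$
Since $r < s < 1$ our function $g$ is defined on $\Sigma$ and there is a well defined mapping degree $d(g, S)$ (with the obvious meaning). Since each $\gamma_{\delta, z_k}$ is orientation preserving (and $\delta$ is very small) the orientation of $g$ on $\Sigma$ near $-s \cdot p(k)$ is determined by $\epsilon_k$ and by the inner product
$$\eta_1(k) \wedge \cdots \wedge \eta_m(k) \bullet \tau(-s \cdot p(k)),$$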
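and we compute
(3) The degree of $g$ on $S$ is given by
$$d(g, S) = \sum_{k=1}^{K} z_k\,\epsilon_k\,\eta_1(k) \wedge \cdots \wedge \eta_m(k) \bullet \tau(-s \cdot p(k)) = (-1)^{m+1}\sum_{k=1}^{K} \epsilon_k z_k$$
so that $d(g, S) = 0$ if and only if $\partial T \llcorner U = 0$.
The extension of $g$ to a mapping $f = f_{\delta,r}$ on all of $U$ depends on which of two cases occurs.
Case 1. If $d(g, S) = 0$ we infer from Hurewicz's theorem the existence of a Lipschitz mapping $h: \mathbf{B}^{m+1}(0, r) \to S^m$ such that
$$h(w) = \begin{cases} g(w, 0) & \text{if } |w| = r, \\ q & \text{if } |w| < r/2. \end{cases}$$
We then define our mapping $f: U \to S^m$ by setting
$$f(x) = \begin{cases} g(x) & \text{if } x \notin N, \\ h(x_1, \ldots, x_{m+1}) & \text{if } x \in N. \end{cases}$$
Case 2. If $d(g, S) \ne 0$ we define a discontinuous mapping $h: \mathbf{B}^{m+1}(0, r) \to S^m$ by setting
$$h(w) = g\Big(\frac{r\,w}{|w|},\ 0\Big)$$
for each $w \ne 0$ and, as above, define $f: U \to S^m$ by setting
$$f(x) = \begin{cases} g(x) & \text{if } x \notin N, \\ h(x_1, \ldots, x_{m+1}) & \text{if } x \in N. \end{cases}$$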
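With the obvious interpretation of $\mathcal{E}_1$, $\mathcal{E}_2$, and $\mathcal{E}_3$ for functions on $U$, each of these energies of mappings $f_{\delta,r}$ nearly equals the mass of $T$ when $\delta$ and $r$ are small (and reasonable choices are made for $h$ in Case 1). More precisely, we have
$$\lim_{\delta, r \downarrow 0} \mathcal{E}_1(f_{\delta,r}) = \lim_{\delta, r \downarrow 0} \mathcal{E}_2(f_{\delta,r}) = \lim_{\delta, r \downarrow 0} \mathcal{E}_3(f_{\delta,r}) = \mathbf{M}(T) = \sum_{k=1}^{K} z_k\,\mathcal{H}^n(\Delta_k).$$
It is also straightforward to check that for $\mathcal{H}^m$ almost every $w \in S^m$ the slice
$$T_w = \langle \mathbf{E}^{m+n} \llcorner U, f_{\delta,r}, w\rangle$$
exists with
$$\partial T_w \llcorner U = \partial T \llcorner U,$$
and also if a sequence of $\delta$'s and $r$'s converging to 0 is fixed then, for $\mathcal{H}^m$ almost every $w$ in $S^m$,
$$\lim_{\delta, r \downarrow 0} \langle \mathbf{E}^{m+n} \llcorner U, f_{\delta,r}, w\rangle = T.$$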
REFERENCES
[A1] F. Almgren, Deformations and multiple-valued functions, Geometric Measure Theory and the Calculus of Variations, Proc. Symposia in Pure Math. 44 (1986), 29-130.
[A2] F. Almgren, Q valued functions minimizing Dirichlet's integral and the regularity of area minimizing rectifiable currents up to codimension two, preprint.
[BCL] H. Brezis, J-M. Coron, E. Lieb, Harmonic maps with defects, Comm. Math. Physics, 1987; see also C. R. Acad. Sc. Paris 303 (1986), 207-210.
[F1] H. Federer, Geometric Measure Theory, Springer-Verlag, 1969, XIV + 676 pp.
[F2] H. Federer, Real flat chains, cochains and variational problems, Indiana U. Math. J. 24 (1974), 351-407.
[HKL] R. Hardt, D. Kinderlehrer, and M. Luskin, Remarks about the mathematical theory of liquid crystals, Institute for Mathematics and its Applications, preprint, 1988.
[MF] F. Morgan, Area-minimising currents bounded by higher multiples of curves, Rend. Circ. Matem. Palermo, (II) 33 (1984), 37-46.
[PH] H. Parks, Explicit determination of area minimizing hypersurfaces, II, Mem. Amer. Math. Soc. 60, March 1986, iv + 90 pp.
[SU] R. Schoen and K. Uhlenbeck, A regularity theory for harmonic maps, J. Diff. Geom. 17 (1982), 307-335.
[W1] B. White, The least area bounded by multiples of a curve, Proc. Amer. Math. Soc. 90 (1984), 230-232.
[W2] B. White, Homotopy classes in Sobolev spaces and the existence of energy minimizing maps, preprint.
[YL] L. C. Young, Some extremal questions for simplicial complexes V. The relative area of a Klein bottle, Rend. Circ. Matem. Palermo, (II), 12 (1963), 257-274.
With F. Almgren in Symposia Mathematica, vol. XXX, 103-118 (1989)
COUNTING SINGULARITIES IN LIQUID CRYSTALS
FREDERICK J. ALMGREN JR. - ELLIOTT H. LIEB
Abstract. Energy minimizing harmonic maps from the ball to the sphere arise in the study of liquid crystal geometries and in the classical nonlinear sigma model. We linearly dominate the number of points of discontinuity of such a map by the energy of its boundary value function. Our bound is optimal (modulo the best constant) and is the first bound of its kind. We also show that the locations and numbers of singular points of minimizing maps is often counterintuitive; in particular, boundary symmetries need not be respected.
1. INTRODUCTION

This note is an introduction to and summary of discoveries we have made about the singular behaviour of

    A mathematical model of some liquid crystal geometries
    Dirichlet energy minimizing harmonic maps from regions in $\mathbf{R}^3$ to $S^2$
    Energy minimizing configurations of a classical nonlinear sigma model ($\mathbf{R}^3 \to S^2$).

These phenomena are different facets of a common mathematical analysis set forth in detail in our paper [AL]. There we study vector fields $\varphi$ of unit length defined in a reasonable region $\Omega$ in $\mathbf{R}^3$. In coordinates we can thus write for each $x = (x^1, x^2, x^3)$ in $\Omega$,

$$\varphi(x) = (\varphi^1(x), \varphi^2(x), \varphi^3(x)) \quad\text{with}\quad \sum_{i=1}^{3} \varphi^i(x)^2 = 1. \tag{1}$$

Since our target $S^2$ is 2-dimensional we could, in principle, describe $\varphi$ using two functions instead of our three constrained functions. It is easier, however, to work with three functions and a constraint.
The $\varphi$'s important for us have distribution first derivatives which are square summable. (Caution: the space of such $\varphi$'s satisfying (1) is not the completion of any space of smooth mappings $\Omega \to S^2$.) The gradients of such $\varphi$'s are defined for almost every $x$ with norms represented by the formula

$$|\nabla\varphi(x)|^2 = \sum_{i=1}^{3} \sum_{\alpha=1}^{3} \left( \frac{\partial \varphi^i}{\partial x^\alpha}(x) \right)^2$$

which gives the value of Dirichlet's integrand at $x$. The integral of this integrand is Dirichlet's energy integral of $\varphi$,

$$\mathcal{E}(\varphi) = \int_\Omega |\nabla\varphi|^2\, dV, \quad\text{with } dV = dx^1\, dx^2\, dx^3.$$

Critical points of this energy integral $\mathcal{E}$ are by definition harmonic functions and satisfy the associated Euler-Lagrange partial differential equations

$$-\Delta\varphi^i(x) = \varphi^i(x)\, |\nabla\varphi(x)|^2 \qquad (i = 1, 2, 3).$$
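As a quick sanity check of these Euler-Lagrange equations, the sketch below tests them numerically on one explicit unit vector field, the map $x \mapsto x/|x|$ (the choice of this field, the sample point, the step sizes and all function names are assumptions made for illustration only). It approximates the gradient and Laplacian by central differences and compares $-\Delta\varphi^i$ with $\varphi^i|\nabla\varphi|^2$.

```python
import math

def hedgehog(x):
    """The unit vector field x -> x/|x| on R^3 minus the origin (illustrative choice)."""
    r = math.sqrt(sum(c * c for c in x))
    return [c / r for c in x]

def shifted(x, a, h):
    """Copy of the point x with coordinate a moved by h."""
    y = list(x)
    y[a] += h
    return y

def grad_sq(f, x, h=1e-4):
    """|grad f(x)|^2: sum over components i and directions a of (d f^i / d x^a)^2."""
    total = 0.0
    for a in range(3):
        fp, fm = f(shifted(x, a, h)), f(shifted(x, a, -h))
        total += sum(((p - m) / (2 * h)) ** 2 for p, m in zip(fp, fm))
    return total

def laplacian(f, x, h=1e-3):
    """Componentwise Laplacian of f at x by second central differences."""
    fx = f(x)
    out = [0.0, 0.0, 0.0]
    for a in range(3):
        fp, fm = f(shifted(x, a, h)), f(shifted(x, a, -h))
        for i in range(3):
            out[i] += (fp[i] - 2 * fx[i] + fm[i]) / (h * h)
    return out

def euler_lagrange_residual(f, x):
    """max over i of | -Laplacian f^i(x) - f^i(x) |grad f(x)|^2 |."""
    lap, g2, fx = laplacian(f, x), grad_sq(f, x), f(x)
    return max(abs(-lap[i] - fx[i] * g2) for i in range(3))

point = [0.3, -0.5, 0.8]                      # arbitrary point away from the origin
print("|grad|^2 :", grad_sq(hedgehog, point))  # analytically 2/|x|^2 for this field
print("residual :", euler_lagrange_residual(hedgehog, point))
```

The residual is small but not zero because of the finite-difference error; the check says nothing about whether a field is a minimizer, only that it is a critical point at the sampled location.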
These equations state that a critical $\varphi$ has vanishing Laplacian in directions in which it is unconstrained. Such an energy functional and associated partial differential equations appear in the physics literature under the rubric of the nonlinear sigma model. Somewhat more generally, reasonable maps $\varphi : M \to N$ between Riemannian manifolds $M$ and $N$ (often submanifolds of Euclidean vector spaces) have a Dirichlet's energy integral

$$\mathcal{E}_{MN}(\varphi) = \int_M |D\varphi|^2\, dV_M$$

of which ours is a special case. Alternatively, one can write (summing over repeated indices)

$$\mathcal{E}_{MN}(\varphi) = \int_M g_{ij}(\varphi(x))\, G^{\alpha\beta}(x)\, \frac{\partial\varphi^i}{\partial x^\alpha}(x)\, \frac{\partial\varphi^j}{\partial x^\beta}(x)\, dV_M x$$

where $g$ is the metric on $N$, $G$ is the metric on $M$ (with inverse $G^{\alpha\beta}$) and $dV_M x = (\det G(x))^{1/2}\, dx$. Extremal mappings for such energies are also called harmonic mappings. Such mappings often are not continuous and there is an extensive mathematical theory about them.
The $\varphi$'s mapping $\Omega$ to $S^2$ which are important for us also have well defined boundary functions $\psi : \partial\Omega \to S^2$ having boundary energy

$$\partial\mathcal{E}(\psi) = \int_{\partial\Omega} |\nabla_T \psi|^2\, dA$$

which is finite; here $\nabla_T\psi$ is the tangential gradient of $\psi$ and $dA$ is surface area measure. Associated with such a $\psi$ is the number

$$\mathcal{E}(\psi) = \inf\{\mathcal{E}(\varphi) : \varphi \text{ has boundary value function } \psi\}.$$

We call $\varphi$ an energy minimizing map for boundary value function $\psi$ if and only if $\mathcal{E}(\varphi) = \mathcal{E}(\psi)$.

If $\Omega$ is any reasonable bounded domain and $\psi$ is any boundary value function of finite energy then there will always be at least one minimizer $\varphi$ having $\psi$ as boundary values (a compactness argument). Sometimes, however, there can be more than one minimizer. This is one of the fascinations of this simple nonlinear problem; if the target $S^2$ were replaced by $\mathbf{R}^3$ (i.e. our constraint were removed) then the Euler-Lagrange partial differential equations would be the (unconstrained) linear partial differential equations of Laplace, $\Delta\varphi^i = 0$ $(i = 1, 2, 3)$, for which uniqueness is well known.

If our domain $\Omega$ is all of $\mathbf{R}^3$ there is no boundary value function $\psi$, of course. We then say that $\varphi : \mathbf{R}^3 \to S^2$ is a minimizer provided $\varphi$ cannot be modified on a compact set $K$ to decrease energy in a larger bounded open set containing $K$.
Liquid crystals

The connection of our energy minimizing $\varphi$'s with liquid crystals requires explanation. We imagine that $\Omega$ is a container containing a liquid crystal. At points $x$ in $\Omega$ the liquid determines a directrix $n(x)$ lying in real projective space $\mathbf{RP}^2$. Since $\mathbf{RP}^2$ is obtained from $S^2$ by identifying antipodal points, this means intuitively that $n(x)$ is a unit vector like our $\varphi(x)$ except that its head is indistinguishable from its tail. For the liquid crystals with which we are concerned, the energy of $n$ is defined analogously to our $\mathcal{E}$, e.g. zero energy corresponds to parallel alignment. Like our minimizing $\varphi$'s (as we shall see), any minimizing $n$ will be continuous except at isolated points. This means, in particular, that any minimizing $n$ can locally be lifted to become a minimizing $\varphi$ having the same energy; this lifting is global in case $\Omega$ is simply connected (see [BCL], p. 686 for details). Thus, for simply connected $\Omega$'s, our original problem is equivalent to the liquid crystal problem. In any case, whether or not $\Omega$ is simply connected, our estimates
on the number of singular points hold for these liquid crystal minimizers. Line singularities do not occur in our model because they would have infinite Dirichlet energy. They do occur in nature, but to model them one, effectively, has to fatten the line and treat it separately (much as in the liquid helium problem). A further complication for liquid crystals is that there are other, more appropriate, integrands which are quadratic in $\nabla\varphi$ and respect rotational symmetry. The general nematic liquid crystal integrand, for example, is of the form

$$K_1 (\operatorname{div}\varphi)^2 + K_2 (\varphi \cdot \operatorname{curl}\varphi)^2 + K_3 (\varphi \wedge \operatorname{curl}\varphi)^2.$$

Our Dirichlet's energy integrand corresponds (except for a fixed boundary term) to setting $K_1 = K_2 = K_3 = 1$ (see [BCL], p. 653). Our methods give information about such liquid crystal geometries (by a compactness argument) only when $K_1$, $K_2$ and $K_3$ are nearly equal.

2. BASIC FACTS ABOUT MINIMIZERS

(A) Existence and regularity of minimizers

As we mentioned above, whenever we have a reasonable domain $\Omega$ and boundary function $\psi$ of finite energy, there will always exist a minimizer $\varphi$ having boundary values $\psi$. Such a result is included among the general analysis of Dirichlet's integral minimizing mappings between manifolds by R. Schoen and K. Uhlenbeck in their basic papers [SU1], [SU2]. They further showed that a minimizing $\varphi$ in our context is a real analytic mapping except at isolated points of discontinuity (which are our singularities). Finally, they concluded that a minimizing $\varphi$ assumes its boundary values smoothly when both $\partial\Omega$ and $\psi$ are comparably smooth.

(B) Monotonicity of energy and tangential approximations
One of the basic technical properties of energy minimizing mappings is usually called monotonicity. Whenever $\varphi$ is a minimizer in $\Omega$, $y \in \Omega$, and $0 < r < s < R$ so that the ball $B_R(y)$ also lies within $\Omega$, then

$$\frac{1}{r} \int_{B_r(y)} |\nabla\varphi|^2\, dV \le \frac{1}{s} \int_{B_s(y)} |\nabla\varphi|^2\, dV.$$

For a proof, see [SU1]. (The absence of a corresponding monotonicity estimate is the main reason our analysis of liquid crystals is restricted to the $K_1 = K_2 = K_3$ case.) The monotonicity estimate leads fairly directly to the existence of certain tangential approximations to $\varphi$ at each interior $y$. A major and deep development occurred in a paper of L. Simon [S] which for our problem guarantees the existence
of a unique tangential approximating mapping. At regular points this approximating mapping is constant. For a singular point $y$ of $\varphi$ in $\Omega$, Simon's result gives a unique harmonic mapping $f : S^2 \to S^2$ such that

$$\varphi(y + t\omega) \to f(\omega) \quad\text{as}\quad t \to 0^+$$

uniformly for all $\omega$'s in $S^2$ (see [AL]), i.e.

$$\varphi(x) \approx f\!\left( \frac{x - y}{|x - y|} \right)$$

for $x$'s near $y$. The correspondence here is in several strong senses (see [AL]). In general, if $f : S^2 \to S^2$ and $F : \mathbf{R}^3 \to S^2$ is defined by setting

$$F(x) = f\!\left( \frac{x}{|x|} \right)$$

for each $x \ne 0$ then $f$ is harmonic if and only if $F$ is.

EXAMPLE. $f\!\left( \frac{x}{|x|} \right) = \frac{x}{|x|}$, i.e. $f$ is the identity; see Figure 1.
(C) Harmonic mappings between spheres and mapping degrees

Any continuous mapping $S^2 \to S^2$ has a well defined topological degree measuring the number of times the first sphere covers the second, taking into account the orientations. Since the boundary functions $\psi$ under consideration map $S^2$ to $S^2$ and have finite energy, they also have a well defined degree given by the Jacobian integral

$$\deg(\psi) = \frac{1}{4\pi} \int J(\psi)\, dA;$$

here $J(\psi)$ is the Jacobian (determinant) function of $\psi$ whose sign is positive or negative at a point depending on whether $D\psi$ preserves or reverses orientations at that point. For continuous $\psi$'s of finite energy these two notions of degree coincide.

All possible harmonic mappings from $S^2$ to $S^2$ have been classified for some time. In complex coordinates (resulting from stereographic projection of the $S^2$'s onto $\mathbf{C}$) they are all of the form

$$f(z) = \frac{P(z)}{Q(z)} \quad\text{or}\quad f(z) = \frac{P(\bar z)}{Q(\bar z)}$$
corresponding to various complex polynomial functions $P$ and $Q$ which are relatively prime. The degree of these $f$'s can be checked to be

$$\deg(f) = \begin{cases} \max(\deg(P), \deg(Q)) & \text{first case;} \\ -\max(\deg(P), \deg(Q)) & \text{second case.} \end{cases}$$

For these harmonic maps $f : S^2 \to S^2$ we also set $F(x) = f(x/|x|)$ as above and compute for each $0 < R < \infty$ that

$$\int_{|x| < R} |\nabla F|^2\, dV = 8\pi R\, |\deg(f)|,$$

i.e. the energy does not depend on $P$ and $Q$ except via the degree.
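The energy formula can be checked numerically in a simple special case. The sketch below is an illustration under stated assumptions, not the computation of the paper: it takes $f(z) = z^d$ in the stereographic coordinate, uses the standard fact that a holomorphic map $g$ into the round sphere has energy density $8|g'|^2/(1+|g|^2)^2$ with respect to $dx\,dy$ (and that the two dimensional Dirichlet integral is conformally invariant), and integrates radially by the midpoint rule. The function names and the quadrature size are hypothetical.

```python
import math

def sphere_energy_of_power_map(d, n=20000):
    """Approximate the Dirichlet energy over S^2 of f(z) = z**d.

    In the stereographic coordinate z the density for a holomorphic g is
    8 |g'|^2 / (1 + |g|^2)^2; for g = z**d it depends only on r = |z|.
    """
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n                  # midpoint rule on (0, 1)
        r = t / (1.0 - t)                  # maps (0, 1) onto (0, infinity)
        dr_dt = 1.0 / (1.0 - t) ** 2
        density = 8.0 * d * d * r ** (2 * d - 2) / (1.0 + r ** (2 * d)) ** 2
        total += density * 2.0 * math.pi * r * dr_dt / n
    return total

def ball_energy(d, R):
    """Energy of F(x) = f(x/|x|) on the ball |x| < R.

    |grad F|^2 is homogeneous of degree -2 in |x|, so the radial integral of
    r^2 * r^(-2) contributes a factor R times the energy of f over S^2.
    """
    return R * sphere_energy_of_power_map(d)

for d in (1, 2, 3):
    print(d, sphere_energy_of_power_map(d) / (8.0 * math.pi))  # close to d
```

Only the maps $z^d$ are covered here; for a general rational map the density is no longer radial and a two dimensional quadrature would be needed.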
(D) Tangential approximations to minimizers

Suppose $y \in \Omega$ is a singular point of a minimizer $\varphi$ and the tangential approximation is of the form $F(x) = f(x/|x|)$ corresponding to one of the harmonic $f$'s given in (C) above. By the degree of the singular point $y$ we mean the mapping degree of the associated $f$. Which of the possible $f$'s actually occur? This question was answered by H. Brezis, J-M. Coron, and E. Lieb in their paper [BCL]. The only $f$'s that occur are rotations $R$ and reflections of the $f$ in the above example, i.e.

$$f(\omega) = \pm R(\omega), \quad (\omega \in S^2) \quad\text{with}\quad \deg(f) = \pm 1; \tag{2}$$

see Figure 1. This class does not even include all harmonic maps of degree $\pm 1$. The proof proceeds by a construction of comparison functions. If $|\deg(f)| > 1$ then the energy of $F$ can be decreased by splitting the singularity at the origin into two nearby singularities of lower degree. If $|\deg(f)| = 1$ and $f \ne \pm R$ then the energy of $F$ can be decreased by moving the singular point slightly.

The paper [BCL] also answered a question that in some sense is complementary to the minimization question we have been studying here. Suppose $y_1, \ldots, y_n$ are fixed points in $\Omega$ and $d_1, \ldots, d_n$ are fixed degrees associated to these points (not necessarily $\pm 1$). What is the infimum of energies $\mathcal{E}(\varphi)$ among all $\varphi$'s which are continuous except at the $y_i$'s and map small spheres around each $y_i$ with degree $d_i$? The boundary function $\psi$ is not fixed. This infimum is not achieved in general. The answer is shown in Figure 2. Think of each singularity as a source or sink of flux and draw lines to carry the flux between singularities, or between a singularity and the boundary. Then
Fig. 1. Here are shown representations of unit vector fields $F(x) = \frac{x}{|x|}$ and $G(x) = R\!\left( \frac{x}{|x|} \right)$ in which $R$ is a counterclockwise rotation through 45°. Such arrays minimize Dirichlet's integral energy and are also observed as stable liquid crystal geometries [K].
Fig. 2. A region $\Omega$ is pictured here containing three prescribed singular points whose degrees (+3, -3, +1) are also prescribed. The least energy of unit vector fields having this singular behavior is the least total mass of oriented line segments connecting these singular points (as currents) either to each other or to the boundary. Such a least length array is illustrated.
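The least length array of Figure 2 can be illustrated by a small brute-force computation. The sketch below is a minimal illustration under assumptions that are not in the text: the domain is taken to be the unit ball (so the distance from a point $p$ to the boundary is $1 - |p|$ and straight segments stay inside the domain), each singularity of degree $d$ is split into $|d|$ unit sources or sinks, and every unit is joined either to a unit of opposite sign or to the boundary. The point coordinates and all names are hypothetical. Following [BCL], the least energy is then $8\pi$ times the least total length.

```python
import math
from functools import lru_cache

def minimal_connection_length(points, degrees, dist_to_boundary):
    """Least total length of segments joining unit sources to unit sinks
    or to the boundary (brute force over all matchings; small inputs only)."""
    pos = [i for i, d in enumerate(degrees) for _ in range(max(d, 0))]
    neg = [i for i, d in enumerate(degrees) for _ in range(max(-d, 0))]

    @lru_cache(maxsize=None)
    def best(k, used):
        # k indexes the positive units; `used` is a bitmask of matched negative units.
        if k == len(pos):
            # Every unmatched negative unit must be connected to the boundary.
            return sum(dist_to_boundary(points[neg[j]])
                       for j in range(len(neg)) if not (used >> j) & 1)
        # Option 1: send this positive unit to the boundary.
        cost = dist_to_boundary(points[pos[k]]) + best(k + 1, used)
        # Option 2: match it with any still unmatched negative unit.
        for j in range(len(neg)):
            if not (used >> j) & 1:
                seg = math.dist(points[pos[k]], points[neg[j]])
                cost = min(cost, seg + best(k + 1, used | (1 << j)))
        return cost

    return best(0, 0)

def unit_ball_distance(p):
    """Distance from p to the boundary of the unit ball (assumed domain)."""
    return 1.0 - math.dist(p, (0.0, 0.0, 0.0))

# Degrees (+3, -3, +1) as in Figure 2; the positions are made up for illustration.
pts = [(0.2, 0.0, 0.0), (-0.2, 0.0, 0.0), (0.0, 0.0, 0.9)]
length = minimal_connection_length(pts, [3, -3, 1], unit_ball_distance)
print("least length:", length, " least energy:", 8.0 * math.pi * length)
```

With these positions the three units of the +3 point pair off with the three units of the -3 point, and the +1 point near the boundary connects to the boundary, which is the qualitative picture the figure describes.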
$$\inf \mathcal{E}(\varphi) = 8\pi \min\Big\{ \sum \text{lengths of lines} \Big\}$$

where the minimum is over all ways of constructing the lines. A different proof of this result was later given by F. Almgren, W. Browder, and E. Lieb [ABL] using H. Federer's co-area formula in the context of currents. This is like quark confinement: a plus and minus quark have an energy proportional to their separation.

From this result with specified singularities one is tempted to surmise that, in our original minimization problem, potential singularities would tend to annihilate each other (if of opposite degrees) or move to $\partial\Omega$. The number of singularities that will occur will be only that required by topology, i.e.

$$\sum_{\text{singularities}} \deg(\text{singularity}) = \deg(\psi) = \frac{1}{4\pi} \int_{\partial\Omega} J(\psi)\, dA.$$

This surmise is very wrong, as we shall see later in Example 3, and misled us for a long time. Arbitrarily many singularities (of mixed signs) can occur, even if the Jacobian $J(\psi)$ vanishes identically.

(E) Boundary regularity and hot spots
Our main estimates require an extension of the boundary regularity results indicated above in (A). These theorems take several pages merely to state precisely, but the essence of the matter is the following. Assume that $\partial\Omega$ is smooth and take a small patch $P \subset \partial\Omega$ which is roughly a 2-dimensional disk of radius $R$. One consequence of the boundary regularity theory mentioned in (A) is the following. There is a fixed $\varepsilon > 0$, independent of $R$, with the property that whenever the boundary function $\psi$ satisfies

$$\int_P |\nabla_T \psi|^2\, dA < \varepsilon$$

then every minimizer $\varphi$ is free of singularities in the region

$$K = \left\{ x : x \in \Omega,\ \operatorname{dist}(x, P) > \tfrac{1}{2} R\varepsilon,\ \operatorname{dist}(x, P_{1/2}) < 2R\varepsilon \right\};$$

here $P_{1/2}$ is the concentric disk of radius $\tfrac{1}{2}R$. Note that $\varepsilon$ is dimensionless. Our hot spot boundary regularity theorem (proved in [AL]) asserts the existence of a fixed number $0 < \delta \ll \varepsilon$ such that whenever $P' \subset P$ is a smaller subpatch of radius $\delta R$ and

$$\int_{P \setminus P'} |\nabla_T \psi|^2\, dA < \varepsilon$$

then $\varphi$ is also free of singularities in the region $K$ above. In other words arbitrarily large boundary energy in a very small disk $P'$ cannot by itself induce singularities far away.
3. COUNTING SINGULARITIES

The principal question motivating our work in [AL] is this: How many singular points $N(\psi)$ is it possible for a minimizing $\varphi$ to have? The following possibilities seem plausible at the outset:

$$N(\psi) \le C\, \mathcal{E}(\psi) \qquad \text{FALSE;}$$
$$N(\psi) \le C \int_{\partial\Omega} |J(\psi)|\, dA \qquad \text{FALSE;}$$
$$N(\psi) \le C\, \partial\mathcal{E}(\psi) \qquad \text{is "The Linear Law";}$$

here $C$ is a constant, possibly depending on $\Omega$. The first possibility is false by counterexample - see below. The second possibility was suggested by the work in [BCL] and misled us for some time (had it been true it would have led to a beautiful geometric theory). In fact it is quite false as Example 1 below shows; in particular, $N(\psi)$ can be large while $J(\psi)$ vanishes identically.

Our main result, The Linear Law, is optimal (modulo the value of $C = C_\Omega$, of which we have no knowledge since our proof is by contradiction based on compactness arguments). It is, to our knowledge, the first bound of its kind.
The following example given by R. Hardt and F. H. Lin in [HL1] shows that $N(\psi)$ can indeed be proportional to $\partial\mathcal{E}(\psi)$. Choose $N$ well separated small disks in $\partial\Omega$. Our $\psi$ is constructed to wrap each disk $D$ around the target sphere once (essentially by the inverse function to stereographic projection while preserving or reversing orientation as one chooses); each $\partial D$ is mapped to the north pole. The complement of these disks in $\partial\Omega$ is mapped by $\psi$ also to the north pole. Then $\partial\mathcal{E}(\psi) \approx CN$; the constant $C$ is independent of the size of the disks since surface energy is scale invariant. Clearly the orientations of $\psi$ on the disks can be arranged so that the total mapping degree of $\psi$ is either zero or one. It is not hard to prove directly that any minimizing $\varphi$ having $\psi$ as boundary value function must have at least one singularity close to each tiny disk - otherwise $\mathcal{E}(\varphi)$ would be too large. Thus

$$N(\psi) \ge N \approx C^{-1}\, \partial\mathcal{E}(\psi).$$

Our first main new result (proved independently by Hardt and Lin in [HL2]) is that singularities cannot be very close if they are well inside $\Omega$.

THEOREM 1. There is a universal constant $C$ (independent of $\Omega$ and $\psi$) such that whenever $y$ and $z$ in $\Omega$ are singular points of a minimizer $\varphi$ then

$$\operatorname{dist}(y, z) \ge C \operatorname{dist}(y, \partial\Omega).$$

The idea of the proof is the following. Fix $y$ and suppose the contrary. Then there will be a sequence of minimizing $\varphi^{(j)}$ with singular points at $z^{(j)}$ and at
Fig. 3. Pictured here are the "cones of influence" in $\Omega$ of three singular points. The presence of singular points 1, 2, 3 implies the presence of boundary energy in disks $P$, $P'$, $P''$ in $\partial\Omega$. The problem is that these disks are not disjoint so that the total boundary energy is not a simple sum. Nesting of such cones induces a Cayley tree graph in which a combinatorial analysis overcomes this difficulty.
$y$ such that $z^{(j)} \to y$ as $j \to \infty$. A compactness argument (contradicting the negation) and monotonicity (B) shows that the energy of $\varphi^{(j)}$ in small balls of radius $R$ about $y$ is uniformly greater than $8\pi R$. The limit of a subsequence of the minimizers $\varphi^{(j)}$ is a minimizer which thus can have at worst a singularity of degree $\pm 1$ at $y$ (by equation (2) above). The tangential approximation theorem implies that the energy of the limit $\varphi$ must be very close to $8\pi R$ for small $R$'s. This leads to a contradiction because of the continuity of Dirichlet's integral when minimizers converge.

A consequence of Theorem 1 together with equation (2) above is the following.

THEOREM 2. (Complete classification of energy minimizing maps from $\mathbf{R}^3$ to $S^2$.) Suppose $\varphi : \mathbf{R}^3 \to S^2$ is a minimizer. Then, either $\varphi$ is a constant mapping or

$$\varphi(x) = \pm R\!\left( \frac{x - y}{|x - y|} \right)$$

for some $y$ and $R$.

Theorem 1 says that if there are many singularities they have to pile up near $\partial\Omega$. This leads to a difficult geometric-combinatorial problem on different scales proportional to $\delta^k$, where $\delta$ is given in (E) above and $k = 1, 2, 3, \ldots$ We attempt to illustrate this in Figure 3. Referring to the $\varepsilon$ and $\delta$ of (E) consider the points 1, 2, and 3 in $\Omega$ at distances $R\varepsilon$, $R\varepsilon\delta$, and $R\varepsilon\delta$ above a boundary patch $P$ of radius $R$ and two boundary patches $P'$ and $P''$ of radii $R\delta$ inside $P$. The hot spot boundary
regularity theorem gives us the following lower bounds for the energy of $\psi$ in $P$ if we consider the various possibilities of having singularities at positions 1, 2, or 3:

    Positions occupied                        Local boundary energy
    (1 alone) or (2 alone) or (3 alone)       $\varepsilon$
    (1 and 2) or (1 and 3) or (2 and 3)       $2\varepsilon$
    (1 and 2 and 3)                           $2\varepsilon$

The source of all our difficulties is that we cannot infer an energy $3\varepsilon$ if there are singularities at all three points.

If $S^{(k)}$ denotes the strip $\{x : x \in \Omega,\ \operatorname{dist}(x, \partial\Omega) < \varepsilon\delta^k\}$, we can effectively decompose each $S^{(k)}$ into cones of height $\varepsilon\delta^k$ and base radius $\delta^k$. We then have a Cayley tree whose vertices represent these cones (i.e. a vertex of order $k + 1$ is connected to a vertex of order $k$ in the tree if the smaller cone is inside the larger one). A vertex is occupied if its cone has a singularity near the apex; otherwise it is unoccupied. Each occupied vertex gets an energy $\varepsilon$ if and only if no more than one higher order vertex to which it is pathwise connected is occupied. The actual details of decomposing each $S^{(k)}$ into cones so that due account is taken of overlaps (and all the other problems that will occur to the reader) involves a complicated covering and counting lemma. The final result is The Linear Law for $N(\psi)$ in terms of $\partial\mathcal{E}(\psi)$, as stated at the beginning of this section.
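The counting rule on the tree can be made concrete with a toy computation. The sketch below is only an illustration of the rule as stated above, under the assumption that "higher order vertex to which it is pathwise connected" means a descendant in the tree; the data layout (a dictionary from each vertex to its higher order neighbours) and the function name are hypothetical, and none of the covering lemma is modelled. Applied to the three-cone configuration of Figure 3, with vertex 1 the large cone and vertices 2 and 3 the two smaller cones inside it, it reproduces the table of lower bounds.

```python
def boundary_energy_bound(children, occupied, eps=1.0):
    """Sum eps over occupied vertices having at most one occupied descendant.

    children: dict mapping a vertex to the list of next-order vertices inside it.
    occupied: set of vertices whose cone has a singularity near its apex.
    """
    def occupied_descendants(v):
        count = 0
        for c in children.get(v, []):
            count += (1 if c in occupied else 0) + occupied_descendants(c)
        return count

    return eps * sum(1 for v in occupied if occupied_descendants(v) <= 1)

# Vertex 1 is the cone over P; vertices 2 and 3 are the cones over P' and P''.
tree = {1: [2, 3]}
for config in ({1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}):
    print(sorted(config), boundary_energy_bound(tree, config))
```

In the fully occupied configuration vertex 1 has two occupied descendants and therefore contributes nothing, which is exactly why the bound is $2\varepsilon$ rather than $3\varepsilon$.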
4. THREE EXAMPLES OF COUNTERINTUITIVE BEHAVIOR

EXAMPLE 1. Zero Mapping Area. It is easy to prove for any $\Omega$ that if $\psi$ takes values only in some closed hemisphere of $S^2$ then $\varphi$ has no singularities. We, however, are able to construct a single curve $\Gamma$ in $S^2$ which is a slight perturbation of the equator and, for each $N$, a smooth boundary value function $\psi_N : \partial\Omega \to S^2$ having its image equal to $\Gamma$ such that any minimizer $\varphi_N$ having boundary values $\psi_N$ must have at least $N$ singular points. In the example of [AL], $\Omega$ is taken to be a ball, but the details of $\Omega$ are not important. The Jacobian $J(\psi_N)$ of each $\psi_N$ vanishes identically since its image is one dimensional.

The idea behind the construction appears in the following preliminary problem. Consider reasonable mappings $\varphi : D^2 \to S^2$ from the unit disk $D^2$ in the plane having two dimensional Dirichlet's integral denoted by $\mathcal{E}_2(\varphi)$. Suppose $\Gamma \subset S^2$ is a smooth embedding of a circle parametrized by a map $P : \partial D^2 \to \Gamma$. The functions $\varphi$ from $D^2$ to $S^2$ having boundary values $P$ can be separated into two homological classes: the + class, in which, heuristically, $\varphi$ "covers the top of $S^2$ one more time than it covers the bottom" and the - class, in which $\varphi$ "covers the bottom one more time than it covers the top"; see Figure 4.
Fig. 4. Illustrated here is one of two homologically distinct classes of mappings $\varphi : D^2 \to S^2$ corresponding to a given boundary parametrization $P : \partial D^2 \to \Gamma$ (the curve $\Gamma$ is a perturbation of the equator). A "+ function" is one which "covers the northern hemisphere". For some $\Gamma$'s, the homology type preferred by a least energy mapping can change if the parametrization $P$ is changed. This phenomenon leads ultimately to construction of least energy mappings from the ball to the sphere having many interior singularities but for which the boundary mapping of the sphere to the sphere has zero mapping area (its entire image lies within the curve $\Gamma$).
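Consider the two numbers

$$\mathcal{E}^{\pm}(P) = \inf\{\mathcal{E}_2(\varphi) : \varphi = P \text{ on } \partial D^2 \text{ and } \varphi \in \pm \text{ class}\}.$$

In general $\mathcal{E}^{+}(P)$ will not be the same as $\mathcal{E}^{-}(P)$. We construct a single $\Gamma$ having two different (homotopic) parametrizations $P^+$ and $P^-$ such that

$$\mathcal{E}^{+}(P^+) < \mathcal{E}^{-}(P^+) - \varepsilon \quad\text{and}\quad \mathcal{E}^{-}(P^-) < \mathcal{E}^{+}(P^-) - \varepsilon$$

for some $\varepsilon > 0$. In other words if the parametrization of $\Gamma$ changes from $P^+$ to $P^-$ any absolute minimizer $\varphi$ changes from lying in the + class to lying in the - class.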
The next step is to let $\Omega$ be a very long solid tube $T$ of radius 1 and length $N(L + 1)$. (Actually, $T$ is bent into a torus so that we can ignore the two ends.) As boundary function $\psi$ we alternately paste $P^-$ and $P^+$ on sections of length $L$ (i.e. each cross-sectional disk has $P^-$ or $P^+$ on its boundary). In the transitional regions of length 1 we smoothly interpolate between $P^-$ and $P^+$ (which can be done since they are homotopic). In the transition regions $\psi$ continues to take values only in $\Gamma$. See Figure 5. If $L$ is large enough (depending only on $\varepsilon$), it is believable (and we prove it) that $\varphi$ must be mostly a - function on the $P^-$ disks and it must be mostly a + function
Fig. 5. Illustrated here is a boundary value function $\psi : \partial\Omega \to S^2$ for a long tube domain $\Omega$. The image of $\psi$ is a smooth curve $\Gamma$ in $S^2$. On cross-sectional circles of $\partial\Omega$ the boundary values alternate between intervals of $P^+$ mappings and intervals of $P^-$ separated by transition intervals. Least energy maps $\varphi : \Omega \to S^2$ with such boundary values map most cross-sections in $P^+$ regions to cover the northern hemisphere and map most cross-sections in $P^-$ regions to cover the southern hemisphere. The minimizer $\varphi$ therefore has at least one singular point near each transition region.
on the $P^+$ disks, for otherwise $\mathcal{E}(\varphi)$ would be unnecessarily large. But when $\varphi$ switches from being a - function to being a + function $\varphi$ must have a singularity for topological reasons. Thus, $\varphi$ will have at least $N$ singularities altogether.

The drawback to this example is that the domain $T$ depends on $N$. To achieve the same result for a fixed domain $\Omega$ = unit ball, we first cut the surface $\partial T$ longitudinally (i.e. perpendicular to the disks) and flatten it (key estimates here come from the conformal equivalence of the disk and the upper half plane and the fact that Dirichlet's integral in two dimensions is invariant under conformal reparametrizations of domains). This yields a strip of width $2\pi$ and length $N(L + 1)$. We also rotate $P^+$ if necessary so that $P^+$ and $P^-$ have the same value $\gamma \in \Gamma$ along the cut. Next we shrink the strip to width $(2\pi)^2 / N(L + 1)$ and length $2\pi$. Finally we paste this strip (which is very narrow since $N$ is large) along the equator of $\Omega$ and let $\psi : \partial\Omega \to S^2$ be the old $\psi$ in the strip and let $\psi(x) = \gamma$ for $x \in \partial\Omega$ but $x \notin$ the strip. A somewhat nerve wracking argument shows, as expected, that any minimizer $\varphi : \Omega \to S^2$ must have at least $N$ singularities close to the equator of $\Omega$.

EXAMPLE 2. Symmetry Breaking. When $\varphi$ takes values in $\mathbf{R}^3$ instead of $S^2$, any geometric symmetry of $\Omega$ and $\psi$ is inherited by the minimizing $\varphi$. The reason is simply that minimizers are unique in the linear case ($\Delta\varphi = 0$). When, as in our case, $\varphi$ takes values in $S^2$, the symmetry of $\Omega$ and $\psi$ can be broken by $\varphi$; obviously there must then be several minimizers.

Let $\Omega$ be the unit ball in $\mathbf{R}^3$ and let $\psi : \partial\Omega \to S^2$ be the distortion of the identity map illustrated in Figure 6. In small caps $N$ (resp. $S$) on $\partial\Omega$, $\psi$ covers
Fig. 6. Here our domain $\Omega$ is the unit ball so that $\partial\Omega$ is the unit sphere. Pictured schematically is a special boundary value function $\psi : \partial\Omega \to S^2$ having a mirror image symmetry through the equatorial plane. A small cap $N$ around the north pole maps to cover the entire northern hemisphere of $S^2$ while a small cap $S$ around the south pole covers the entire southern hemisphere. The sphere less these two caps maps entirely to the equator. Longitude is preserved in each of these regions. No minimizing $\varphi : \Omega \to S^2$ having boundary values $\psi$ can possess such a symmetry since the (necessarily odd) number of singular points must be contained within one of the regions $\nu$ and $\sigma$ near the poles.
the northern (resp. southern) hemisphere of $S^2$. The two maps are mirror images of each other. On the rest of $\partial\Omega$ between $N$ and $S$, $\psi$ takes values in the equator of $S^2$ in the obvious way, i.e. $\psi(x, y, z) = (x^2 + y^2)^{-1/2} (x, y, 0)$.

THEOREM 3. Any minimizer $\varphi$ can have singularities only in small shaded regions in $\Omega$, labelled $\nu$ and $\sigma$, near the caps $N$ and $S$.

Since $\deg(\psi) = 1$, this result implies that $\varphi$ does not inherit the mirror image symmetry through the equatorial disk possessed by $\psi$. (Our function $\varphi$ necessarily
has an odd number of singularities, and if $\varphi$ were symmetric, it would necessarily have one on the equatorial disk in $\Omega$.) The proof of Theorem 3 has two parts. First we show that when $N$ and $S$ are small $\varphi$ has no singularities in a concentric ball $\Omega'$ of radius $1 - \varepsilon$ for some small $\varepsilon$. This is done by a variational (or comparison) argument. Second, we show that there are no singularities in $\{x : 1 > |x| > 1 - \varepsilon \text{ and } \operatorname{dist}(x, \sigma \cup \nu) > \varepsilon\}$ by using the boundary regularity (E).

EXAMPLE 3. Boiling Water. The [BCL] result mentioned in (D) above suggests that + and - singularities tend to annihilate each other. On the other hand, the hot spot boundary regularity mentioned in (E) above suggests that behavior at different length scales (as measured by the distance to $\partial\Omega$) is independent so that + and - singularities could coexist provided their distances to $\partial\Omega$ were very different. There would appear to be a conflict here and one of our results is that of the two points of view just mentioned the second one is correct. We have proved the following.

THEOREM 4. Let $\Omega$ be the unit ball and let $p_1, \ldots, p_M$ be any distinct points in $\partial\Omega$. Also let $N_1, \ldots, N_M$ be any positive integers and for each $i = 1, \ldots, M$ let $A_i$ be any sequence of length $N_i$ consisting of $+1$'s and $-1$'s. Finally, let $\varepsilon > 0$. Then there is a smooth $\psi : \partial\Omega \to S^2$ such that

(i) $\partial\mathcal{E}(\psi) < \varepsilon + 8\pi \sum_{i=1}^{M} N_i$.

(ii) The minimizer $\varphi$ is unique.

(iii) For each $i = 1, \ldots, M$ there are at least $N_i$ singularities stacked nearly vertically above $p_i$ (like bubbles in a pan of water that is about to boil), and these have the specified sequence of degrees given by $A_i$.
REFERENCES

[ABL] F. ALMGREN, W. BROWDER and E. LIEB: Co-area, liquid crystals and minimal surfaces. In: Partial Differential Equations, ed. S. S. Chern, Springer Lecture Notes in Math., 1306, 1-12 (1988).
[AL] F. ALMGREN and E. LIEB: Singularities of energy minimizing maps from the ball to the sphere: examples, counterexamples and bounds. Ann. of Math., 128, 483-530 (1988). See also: Singularities of energy minimizing maps from the ball to the sphere, Bull. Amer. Math. Soc., 17, 304-306 (1987).
[BCL] H. BREZIS, J-M. CORON and E. LIEB: Harmonic maps with defects. Commun. Math. Phys. 107, 649-705 (1986).
[HL1] R. HARDT and F. H. LIN: A remark on $H^1$ mappings. Manuscripta Math., 56, 1-10 (1986).
[HL2] R. HARDT and F. H. LIN: Stability of singularities of minimizing harmonic maps. J. Diff. Geom., 29, 113-123 (1989).
[K] M. KLEMAN: Points, lignes, parois dans les fluides anisotropes et les solides cristallins. Les Editions de Physique (Orsay), I, 36-37.
[S] L. SIMON: Asymptotics for a class of nonlinear evolution equations with applications to geometric problems. Ann. of Math. 118, 525-571 (1983).
[SU1] R. SCHOEN and K. UHLENBECK: A regularity theory for harmonic maps. J. Diff. Geom., 17, 307-335 (1982).
[SU2] R. SCHOEN and K. UHLENBECK: Boundary regularity and the Dirichlet problem of harmonic maps. J. Diff. Geom., 18, 253-268 (1983).
With M. Loss in Math. Res. Lett. 1, 701-715 (1994)
Mathematical Research Letters 1, 701-715 (1994)
SYMMETRY OF THE GINZBURG LANDAU MINIMIZER IN A DISC

ELLIOTT H. LIEB AND MICHAEL LOSS

ABSTRACT. The Ginzburg-Landau energy minimization problem for a vector field on a two dimensional disc is analyzed. This is the simplest nontrivial example of a vector field minimization problem and the goal is to show that the energy minimizer has the full geometric symmetry of the problem. The standard methods that are useful for similar problems involving real valued functions cannot be applied to this situation. Our main result is that the minimizer in the class of symmetric fields is stable, i.e., the eigenvalues of the second variation operator are all nonnegative.

©1994 by the authors. Reproduction of this article, in its entirety, by any means is permitted for non-commercial purposes. Received October 5, 1994. Work of E. Lieb partially supported by NSF grant PHY 90-19433 A03. Work of M. Loss partially supported by NSF grant DMS 92-07703.

1. Introduction

There are many energy minimization problems having a geometric symmetry and for which one can show that the energy minimizer has the same symmetry as the problem itself. Typically this is done by using a rearrangement inequality of some sort. However, and this is the important point, rearrangement inequalities work (if they work at all) only when the variable is a function and not something more complicated like a vector field.

There are several important problems in which the variable is one or more vector or tensor fields and for which the minimizer is believed to be symmetric. Examples include the full multi-field Ginzburg-Landau problem for a superconductor in a magnetic field, the 't Hooft-Polyakov monopole and the Skyrme model (see [LE2] for a review). They are all unresolved.

In this paper we analyze the simplest possible nontrivial example of a vector field energy minimization problem: the Ginzburg-Landau problem for a complex scalar field in a disc. It has exercised many authors (see, e.g., [JT], [BBH] and references therein) but no one has been able to show that the obvious symmetric vector field minimizes the energy (except in the weak coupling regime where convexity holds). In fact, it has not even been shown that the symmetric solution is stable under perturbations, and it is the purpose of this paper to prove just that. We do so by using a mixture of rearrangement inequalities on different components of the vector field and, while our methods are highly specialized to this problem, we believe that it is one of the few examples in which light can be shed on the symmetry of an energy minimizing vector field.

As an illustration of the problem in which the variable, $\psi$, is a function, one could mention the following: Let $B_n$ denote the closed unit ball centered at $0 \in \mathbf{R}^n$ and let $\psi$ denote a real valued function on $B_n$ that vanishes on $\partial B_n$, the boundary of $B_n$, and whose gradient is square integrable. Then we set

$$\mathcal{F}(\psi) = \int_{B_n} |(\nabla\psi)(x)|^2\, dx + \int_{B_n} \left(1 - \psi(x)^2\right)^2 dx \tag{1.1}$$
an d seek to minimize .F(V)). It is well known that there is a minimizer
and that it is spherically symmetric, i.e., ?P(x) = i/i(y) if Ixl = IyJ. The minimizer thus retains the symmetry of the problem. Indeed, more is true: t(i is symmetric decreasing, i.e., p(x) > )(y) if Ixi < lyl. While there are other methods to prove the symmetry, one of the simplest is to do so by using rearrangement inequalities to show that is symmetric decreasing. The first step in this process is to observe that replacing t' by ICI does not change IO>/il2 and hence does not change the energy .F(i'). The second step is to replace ItPj by the equimeasurable function o' which is defined to be the symmetric decreasing rearrangement of ItGI. Certainly ii' satisfies the boundary conditions. The equimeasurability of iG' and ItPI guarantees
that Pi - ,)2)2 = f [l - 0'212. The important inequality concerns the kinetic energy, or Dirichlet integral. It is (1.2)
IvIp112
Bn
>-
f
Ivp'12. n
This shows that among the energy minimizers there is at least one that is symmetric decreasing.
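The two rearrangement steps can be illustrated on a grid. The following is a minimal sketch, not the authors' argument: it assumes a one-dimensional discrete analogue in which a sequence with zero boundary values stands in for $\psi$, the sum of squared differences stands in for the Dirichlet integral, and the helper names (`symmetric_decreasing`, `dirichlet_energy`, `potential_energy`) are hypothetical.

```python
def symmetric_decreasing(values):
    """Discrete symmetric decreasing rearrangement of |values| (odd length).

    The largest entry goes to the centre; the remaining entries are placed
    alternately to the right and to the left in decreasing order.
    """
    n = len(values)
    assert n % 2 == 1, "odd length keeps the centre well defined"
    centre = n // 2
    out = [0.0] * n
    ordered = sorted((abs(x) for x in values), reverse=True)
    for rank, v in enumerate(ordered):
        step = (rank + 1) // 2
        pos = centre + step if rank % 2 == 1 else centre - step
        out[pos] = v
    return out


def dirichlet_energy(u):
    """Sum of squared differences, with zero padding as the boundary condition."""
    padded = [0.0] + list(u) + [0.0]
    return sum((b - a) ** 2 for a, b in zip(padded, padded[1:]))


def potential_energy(u):
    """Discrete analogue of the integral of (1 - psi^2)^2."""
    return sum((1.0 - x * x) ** 2 for x in u)


psi = [0, 2, 0, 3, 1, 0, 0]
psi_star = symmetric_decreasing(psi)
print(psi_star)
print(dirichlet_energy(psi), dirichlet_energy(psi_star))
print(potential_energy(psi), potential_energy(psi_star))
```

For this sample the potential term is unchanged (the two sequences hold the same values) while the discrete Dirichlet sum drops from 22 to 10, mirroring (1.2).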
We now turn to the Ginzburg-Landau problem in the disc $D = B_2$ in $\mathbb{R}^2$, which looks deceptively similar to the above problem. For one thing the variable is now a real vector field $\psi(x) = (f(x), g(x))$ instead of a single function. It is customary to introduce the complex valued function $\psi(x) = f(x) + i g(x)$. The energy functional is
(1.3)  $\mathcal{E}(\psi) = \int_D \{(\nabla f(x))^2 + (\nabla g(x))^2 + J(f(x)^2 + g(x)^2)\}\,dx = \int_D \{|\nabla\psi|^2 + J(|\psi|^2)\}.$
Symmetry of the Ginzburg-Landau Minimizer in a Disc
Usually, $J : \mathbb{R}_+ \to \mathbb{R}_+$ is taken to be the function $J(t) = \lambda(1 - t)^2$ with $\lambda > 0$. For our purposes we can generalize this to $J$ satisfying certain conditions, which we assume henceforth:
(i) $J(0) = \lambda > 0$, $J(1) = 0$, $J(t) \geq 0$ if $t > 1$,
(ii) $J(t)$ is monotone decreasing and convex on the interval $[0, 1]$,
(iii) $J$ is twice differentiable on $[0, 1]$.
The gradients of $\psi$ are assumed to be square integrable and the condition on $\psi$ on the boundary of $D$ is
(1.4)  $\psi(x) = x = (x_1, x_2) = (\cos\theta, \sin\theta).$
We denote the class of $H^1(D)$ functions satisfying (1.4) by $\mathcal{C}$. The problem is to minimize $\mathcal{E}(\psi)$ subject to $\psi \in \mathcal{C}$. For this problem it is a standard fact that a minimizer exists and satisfies the Euler-Lagrange pair of equations
(1.5)  $-\Delta\psi + J'(\psi^2)\psi = 0$
with $\psi^2 = f^2 + g^2$. The obvious conjecture about a minimizer $\psi$ is that it is a "hedgehog", i.e., for some nonnegative function $f$ defined on $[0, 1]$ with $f(1) = 1$,
(1.6)  $\psi(x) = f(r)(\cos\theta, \sin\theta)$
where $r := \sqrt{x_1^2 + x_2^2}$. There is always a function $\psi_0$ that minimizes the energy in the class of vector fields of type (1.6), and it satisfies (1.5). The problem is to show that this $\psi_0$ is a global minimizer. In terms of $f(r)$, (1.5) reads
(1.7)  $-f'' - \frac{1}{r} f' + \frac{1}{r^2} f + J'(f^2) f = 0$
with $f(0) = 0$ and $f(1) = 1$. The solution to this problem is unique [HH].
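For a hedgehog (1.6) the energy (1.3) reduces to a one-dimensional integral, $2\pi \int_0^1 [f'^2 + f^2/r^2 + J(f^2)]\, r\, dr$, because $|\nabla\psi|^2 = f'^2 + f^2/r^2$ for such a field. The sketch below evaluates this radial form by the midpoint rule for the trial profile $f(r) = r$ with $J(t) = \lambda(1-t)^2$. The function name `radial_energy` and the choice $\lambda = 3$ are illustrative assumptions, and $f(r) = r$ is only a trial profile: it solves (1.7) when $J = 0$ but is not claimed to be the minimizer otherwise.

```python
import math


def radial_energy(f, df, J, n=20000):
    """Midpoint-rule value of 2*pi * int_0^1 [f'^2 + f^2/r^2 + J(f^2)] r dr."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h  # midpoint of the i-th cell, never exactly zero
        total += (df(r) ** 2 + (f(r) / r) ** 2 + J(f(r) ** 2)) * r * h
    return 2.0 * math.pi * total


lam = 3.0  # illustrative coupling constant
energy = radial_energy(lambda r: r, lambda r: 1.0, lambda t: lam * (1.0 - t) ** 2)
closed_form = 2.0 * math.pi * (1.0 + lam / 6.0)  # exact value for f(r) = r
print(energy, closed_form)
```

For $f(r) = r$ the integral can be done by hand, giving $2\pi(1 + \lambda/6)$, which the quadrature should reproduce closely.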
It is not hard to see that $f$ is monotone increasing, but this fact is not needed in this paper. Although we cannot prove the full hedgehog conjecture, we are able to verify that the hedgehog is stable, that is to say that all the eigenvalues of the self-adjoint second variation operator $H$, defined by the quadratic form
(1.8)  $\frac{d^2}{d\varepsilon^2}\,\mathcal{E}(\psi_0 + \varepsilon v)\Big|_{\varepsilon = 0} = (v, Hv),$
are nonnegative. Specifically $H$ is given by
(1.9)  $Hv = -\Delta v + J'(\psi_0^2)v + 2J''(\psi_0^2)(\psi_0, v)\psi_0$
for vector fields $v$ that vanish on $\partial D$. Here $(a, b)$ is the inner product on $\mathbb{R}^2$. We believe that all the eigenvalues of $H$ are strictly positive but we cannot show this. If they are, then we can reach the following conclusion: For small $\lambda$ the hedgehog is certainly the global minimizer because $\psi \mapsto \mathcal{E}(\psi)$ is strictly convex and hence the global minimizer is unique. If the hedgehog ceases to be the minimizer for large $\lambda$ then the non-hedgehog minimizer cannot be close to the hedgehog. In other words, a simple bifurcation away from the hedgehog cannot occur.
II. Statements of theorems and lemmas

The following three theorems will be proved in the next section in the order 2, 3, 1. Theorem 1 is our main result. Theorem 3 will require three lemmas which we list here. Lemmas 1 and 2 on rearrangements are well known. The proof of Theorem 2 uses some simple facts about convexity. This theorem holds for the analogous Ginzburg-Landau problem in $\mathbb{R}^n$ for any $n$, not just for $n = 2$. Theorem 1 is a corollary of Theorems 2 and 3.
Theorem 1 (Weak stability of the symmetric minimizer). The eigenvalues of $H$ in (1.8, 1.9) (with Dirichlet boundary conditions) are all nonnegative. The complex eigenfunctions of $H$ can all be chosen to have the following form
(2.1)  $v(r, \theta) = e^{im\theta} \begin{pmatrix} a(r)e^{i\theta} + b(r)e^{-i\theta} \\ -i a(r)e^{i\theta} + i b(r)e^{-i\theta} \end{pmatrix}$
for suitable real functions $a = a_m$ and $b = b_m$ and with $m = 0, \pm 1, \pm 2, \ldots$. Clearly, $\bar v$, the complex conjugate, is also an eigenvector with the same eigenvalue as $v$. The lowest eigenvalue of $H$ belongs either to $m = 0$ or to $m = \pm 1$.

Remark: Both cases, $m = 0$ or $m = 1$, can occur, depending on $J$. When $J = 0$, $m = 1$ is optimal with $a(r) = 0$. The lowest eigenfunction of $-\Delta$ is well known to be nodeless. When $J$ is very large the best choice is $m = 0$ with $b(r) \approx -a(r)$ because $a = -b$ makes $(\psi_0, v)$ vanish.
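Two claims about the form (2.1) are easy to check numerically at a fixed radius: that $a = -b$ with $m = 0$ makes $v$ orthogonal to the hedgehog direction $(\cos\theta, \sin\theta)$, and that the real part for $m = 1$ is $(a\cos 2\theta + b,\ a\sin 2\theta)$. The sketch below is illustrative only; the helper names `v_mode` and `radial_component` and the sample amplitudes are assumptions, not part of the paper.

```python
import cmath
import math


def v_mode(m, a, b, theta):
    """Evaluate (2.1) at angle theta for real amplitudes a = a(r), b = b(r)."""
    phase = cmath.exp(1j * m * theta)
    up = cmath.exp(1j * theta)
    down = cmath.exp(-1j * theta)
    return (phase * (a * up + b * down),
            phase * (-1j * a * up + 1j * b * down))


def radial_component(v, theta):
    """Inner product of v with the hedgehog direction (cos theta, sin theta)."""
    return v[0] * math.cos(theta) + v[1] * math.sin(theta)


theta = 0.9
# m = 0 with a = -b: the component along psi_0 should vanish
print(abs(radial_component(v_mode(0, 0.8, -0.8, theta), theta)))
# m = 1: compare the real part with (a cos 2theta + b, a sin 2theta)
v1 = v_mode(1, 0.5, 0.2, theta)
print(v1[0].real, 0.5 * math.cos(2 * theta) + 0.2)
print(v1[1].real, 0.5 * math.sin(2 * theta))
```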
Theorem 2 (Partial convexity of the energy functional $\mathcal{E}(\psi)$). Suppose $\psi = (\psi_1, \ldots, \psi_n)$ is a real vector field in $H^1(B_n)$ that satisfies $\psi(x) = x$ on the boundary of $B_n$. Suppose that $\psi(x)^2 = \sum_{i=1}^n \psi_i(x)^2 \leq 1$ for all $x$ and suppose that each component $\psi_i$ satisfies
(2.2)  $\int_{S^{n-1}} \psi_i(r\omega)\,d\omega = 0$
for all $r$. Define the vector field $\tilde\psi(x)$ by
(2.3)  $\tilde\psi(r\omega) = h(r)\omega, \qquad \omega \in S^{n-1},$
where $h$ is the square root of the spherical average of $|\psi|^2$, i.e.,
(2.4)  $h(r) = \Bigl[\frac{1}{|S^{n-1}|} \int_{S^{n-1}} \psi(r\omega)^2\,d\omega\Bigr]^{1/2}$
and $|S^{n-1}| = \int_{S^{n-1}} d\omega$. Then (with $\mathcal{E}(\psi)$ given by the obvious generalization to $B_n$ of (1.3)),
(2.5)  $\mathcal{E}(\tilde\psi) \leq \mathcal{E}(\psi).$
If we assume that $h(r) > 0$ for all $r > 0$ then equality occurs in (2.5) only if $\psi = \tilde\psi$.

Theorem 3 (Rearrangements of special vector fields). Suppose that $\psi$ is a vector field in $\mathcal{C}$ and suppose that there exists some fixed vector $\omega_0 \in S^1$ such that
(2.6)  $\psi(t\omega_0) = h(t)\omega_0$
for all $t \in [-1, 1]$. Then there is a vector field $\tilde\psi \in \mathcal{C}$ satisfying (2.6) and, additionally,
(2.7)  (i) $\tilde\psi(x) = -\tilde\psi(-x)$ for all $x \in D$,
(2.8)  (ii) $\mathcal{E}(\tilde\psi) \leq \mathcal{E}(\psi)$.
Remark: The following might help to clarify the relation between Theorems 2 and 3. Write a minimizing $\psi \in \mathcal{C}$ in complex form as
(2.9)  $\psi(r, \theta) = \sum_{k=-\infty}^{\infty} c_k(r) e^{ik\theta}$
with $c_k(1) = 0$ if $k \neq 1$ and $c_1(1) = 1$. If $c_0(r) \equiv 0$ then Theorem 2 applies and we learn that the hedgehog is the minimizer, i.e., $c_k(r) \equiv 0$ for $k \neq 1$. Next suppose that we take a $\psi$ in the form (2.9) in which at most two of the $c_k$'s are not identically zero, say $c_1$ and $c_m$ with $m \neq 1$. Then we claim that we can choose the two $c$'s to be real functions without raising the energy. Having done this, Theorems 2 and 3 apply and we again learn that the energy minimizing choice in this restricted category has $c_m \equiv 0$ for $m \neq 1$. The proof of this assertion is the following. We write $c_j(r) = \rho_j(r) \exp[i\alpha_j(r)]$ with $\rho_j \geq 0$ and $\alpha_j$ real. Then
$|\psi(r, \theta)|^2 = \rho_1(r)^2 + \rho_m(r)^2 + 2\rho_1(r)\rho_m(r) \cos[(m - 1)\theta + \alpha_m(r) - \alpha_1(r)],$
and we observe two things: If we replace $\alpha_1$ and $\alpha_m$ by zero then
(i) the gradient term in $\mathcal{E}$ can only decrease;
(ii) the $J$ term does not change because, by a trivial shift of $\theta$, the $\theta$ integral does not depend on $\alpha_m(r) - \alpha_1(r)$. (The convexity of $J$ plays no role here.)

The lemmas about symmetric decreasing rearrangements that we shall need are the following. The first was basically proved by Chiti [CG] and then by Crandall-Tartar [CT]. For some generalizations see [AL], 2.2 and 2.3.
Lemma 1. Let $f$ and $g$ be nonnegative functions on $\mathbb{R}^n$ and let $J : \mathbb{R} \to \mathbb{R}_+$ be a convex function with $J(0) = 0$. Then
(2.10)  $\int_{\mathbb{R}^n} J(f^*(x) - g^*(x))\,dx \leq \int_{\mathbb{R}^n} J(f(x) - g(x))\,dx$
where $f^*$ and $g^*$ are the symmetric decreasing rearrangements of $f$ and $g$.

Lemma 2 (Rearrangements and gradient norms). For $u \in H_0^1([-a, a])$ define $u^* = |u|^*$. Then $u^* \in H_0^1([-a, a])$ and
(2.11)  $\int_{-a}^{a} \Bigl(\frac{du^*}{dx}\Bigr)^2 dx \leq \int_{-a}^{a} \Bigl(\frac{du}{dx}\Bigr)^2 dx.$

Lemma 3 (Cutting argument). Let $\psi = (f, g) \in \mathcal{C}$ and assume in addition that $g(x_1, 0) = 0$ for $x_1 \in [-1, 1]$. Then there exists $\tilde\psi = (\tilde f, \tilde g) \in \mathcal{C}$ such that for all $x = (x_1, x_2)$ in $D$
(i) $\tilde g(x_1, x_2) \geq x_2$ for $x_2 > 0$ and $\tilde g(x_1, x_2) \leq x_2$ for $x_2 < 0$,
(ii) $|\tilde\psi(x)| \leq 1$ for all $x \in D$ and hence $\tilde f(x_1, x_2)^2 \leq 1 - x_2^2$,
(iii) $\mathcal{E}(\tilde\psi) \leq \mathcal{E}(\psi)$.
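The phase-independence claim (ii) in the Remark following Theorem 3 can be checked numerically for a two-mode field $c_1 e^{i\theta} + c_m e^{im\theta}$. This is a minimal sketch under stated assumptions: $J(t) = \lambda(1-t)^2$ with $\lambda = 1$, an equispaced angular grid, and hypothetical helper names. For this polynomial $J$ the angular mean has the closed form $(1 - \rho_1^2 - \rho_m^2)^2 + 2\rho_1^2\rho_m^2$, which contains no phases.

```python
import cmath
import math


def J(t, lam=1.0):
    """Model potential J(t) = lam * (1 - t)^2 (illustrative choice)."""
    return lam * (1.0 - t) ** 2


def angular_mean_of_J(rho1, rhom, alpha1, alpham, m, n_theta=64):
    """Mean over theta of J(|psi|^2) for psi = c_1 e^{i theta} + c_m e^{i m theta}."""
    total = 0.0
    for k in range(n_theta):
        theta = 2.0 * math.pi * k / n_theta
        psi = (rho1 * cmath.exp(1j * (theta + alpha1))
               + rhom * cmath.exp(1j * (m * theta + alpham)))
        total += J(abs(psi) ** 2)
    return total / n_theta


with_phases = angular_mean_of_J(0.7, 0.2, 0.4, 1.3, m=3)
without_phases = angular_mean_of_J(0.7, 0.2, 0.0, 0.0, m=3)
print(with_phases, without_phases)
```

The two printed values should agree to rounding error, since shifting $\theta$ absorbs the phase difference $\alpha_m - \alpha_1$.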
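III. Proofs

3A. Proof of Theorem 2. Since $\sum_{i=1}^n \psi_i(x)^2 \leq 1$, and since $t \mapsto J(t)$ is convex we have, by Jensen's inequality, that
(3A.1)  $\frac{1}{|S^{n-1}|} \int_{S^{n-1}} J(|\psi(r\omega)|^2)\,d\omega \geq J(h(r)^2)$
and hence
(3A.2)  $\int_{B_n} J(|\psi(x)|^2)\,dx \geq \int_{B_n} J(h(r)^2)\,dx = \int_{B_n} J(|\tilde\psi(x)|^2)\,dx.$
To estimate the kinetic energy we expand each component, $\psi_j$, into normalized spherical harmonics, $Y_l^m$, with coefficients $c^j_{lm}(r)$:
(3A.3)  $\psi_j(r\omega) = \sum_{l=1}^{\infty} \sum_m c^j_{lm}(r) Y_l^m(\omega).$
Here $l$ denotes the irreducible representation of $SO(n)$, while $m$ is a multi-index that labels the rows. The reason $l = 0$ is absent is that $\int_{S^{n-1}} \psi_j(r\omega)\,d\omega = 0$ for every $j$. Note that $h(r)^2 = \sum_{j=1}^n \sum_{l=1}^{\infty} \sum_m c^j_{lm}(r)^2 / |S^{n-1}|$. It is well known that
(3A.4)  $\int_{B_n} |\nabla\psi_j|^2 = \sum_{l=1}^{\infty} \sum_m \int_0^1 \Bigl[ \Bigl(\frac{dc^j_{lm}}{dr}\Bigr)^2 + \frac{l(l + n - 2)}{r^2} (c^j_{lm})^2 \Bigr] r^{n-1}\,dr.$
Since $h\,\frac{dh}{dr} = \sum_{j=1}^n \sum_{l=1}^{\infty} \sum_m c^j_{lm} \frac{dc^j_{lm}}{dr} / |S^{n-1}|$ we get, by Schwarz's inequality, that
(3A.5)  $\Bigl(\frac{dh}{dr}\Bigr)^2 \leq \sum_{j=1}^n \sum_{l=1}^{\infty} \sum_m \Bigl(\frac{dc^j_{lm}}{dr}\Bigr)^2 / |S^{n-1}|.$
Obviously,
(3A.6)  $\sum_{l=1}^{\infty} \sum_m \int_0^1 l(l + n - 2)(c^j_{lm})^2 r^{n-3}\,dr \geq \sum_{l=1}^{\infty} \sum_m \int_0^1 (n - 1)(c^j_{lm})^2 r^{n-3}\,dr$
with equality only if $c^j_{lm} = 0$ for all $l \geq 2$. In that case we can write
(3A.7)  $\psi_j(x) = r^{-1} n^{1/2} \sum_{k=1}^n d_{jk}(r)\, x_k$
and $h(r)^2 = \sum_{j,k=1}^n d_{jk}(r)^2$. In general, by summing over $j$, we find that
(3A.8)  $\int_{B_n} |\nabla\psi|^2 \geq |S^{n-1}| \int_0^1 \Bigl\{ (dh(r)/dr)^2 + \frac{n - 1}{r^2} h(r)^2 \Bigr\} r^{n-1}\,dr = \int_{B_n} |\nabla\tilde\psi|^2$
with equality only if (3A.7) is satisfied.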
In short, (2.5) has been proved and we know that equality requires (3A.7). Our final task is to show that equality in (2.5) also requires $d_{jk}(r) = h(r)\delta_{k,j}/\sqrt{n}$ when $h(r) > 0$ for all $r > 0$.

Inequality (3A.5) was obtained by using Schwarz's inequality. In order to have equality we must have that
(3A.9)  $\frac{d}{dr} d_{jk}(r) = \lambda(r) d_{jk}(r)$
for some function $\lambda(r)$ not depending on $j$ and $k$. By multiplying (3A.9) on both sides by $d_{jk}(r)$ and summing over $k$ and $j$ we have $h(r)h'(r) = \lambda(r)h(r)^2$. Since $h(r) > 0$ for all $r > 0$ we have that
(3A.10)  $\lambda(r) = h'(r)/h(r).$
This function is integrable away from the origin and hence (3A.9) yields
(3A.11)  $d_{jk}(r) = \mu(r) d_{jk}(1)$
with $\mu(r) = \exp\{-\int_r^1 \lambda(s)\,ds\}$. By assumption, $d_{jk}(1) = n^{-1/2}\delta_{j,k}$, and this yields the desired conclusion with $\mu(r) = h(r)$.

3B. Proof of Theorem 3. Without loss of generality, we can assume $\omega_0 = (1, 0)$. Our hypothesis is that if $\psi(x) = (f(x), g(x))$ then $g(x_1, 0) = 0$ for $-1 \leq x_1 \leq 1$. By Lemma 3 (cutting argument) we can assume two important facts about our $f$ and $g$:
(i) $f(x_1, x_2)^2 \leq 1 - x_2^2$;
(ii) $g(x_1, x_2) \geq x_2$ if $x_2 > 0$ and $g(x_1, x_2) \leq x_2$ if $x_2 \leq 0$.
We can also assume that $f^2 + g^2 \leq 1$. The first step is to define $\tilde g$ in the following way. For each fixed $x_2$ we replace the function $x_1 \mapsto g(x_1, x_2)$ by its (one-dimensional) symmetric decreasing rearrangement $g^*(x_1, x_2)$ if $x_2 > 0$ and we replace it by $-|g(x_1, x_2)|^*$ if $x_2 < 0$. By (ii) above, $\tilde g$ satisfies the correct boundary condition (i.e., $\tilde g(x_1, x_2) = x_2$ on $\partial D$), and $\tilde g$ also satisfies (ii) above.

The next step, the definition of $\tilde f(x_1, x_2)$, is a bit more complicated. First, let $f^\#$ be the symmetric increasing rearrangement of $|f|$. [Another way to say this is that $1 - f^\# = (1 - |f|)^*$.] Once again, the rearrangement is done on each line $x_2$ = constant. We note that $f(x_1, x_2)$ is continuous in $x_1$ for a.e. $x_2$ (because it is an $H^1(\mathbb{R})$ function for a.e. $x_2$) and has antisymmetric boundary values at $x_1 = \pm(1 - x_2^2)^{1/2}$. Therefore $f^\#$ is a continuous function of $x_1$ (indeed, it is an $H^1(\mathbb{R})$ function by Lemma 2) and $f^\#(0, x_2) = 0$. By (i) above, $f^\# \leq (1 - x_2^2)^{1/2}$. Moreover, $f^\# = (1 - x_2^2)^{1/2}$ on $\partial D$ since $x_1 \mapsto |f(x_1, x_2)|$ is continuous and $|f|$ satisfies the same boundary condition. Now define
(3B.1)  $\tilde f(x_1, x_2) = f^\#(x_1, x_2)$ if $x_1 \geq 0$, and $\tilde f(x_1, x_2) = -f^\#(x_1, x_2)$ if $x_1 < 0$,
which satisfies the correct conditions on $\partial D$. We also note that $|\partial\tilde f/\partial x_1| = |\partial f^\#/\partial x_1|$ and $|\partial\tilde f/\partial x_2| = |\partial f^\#/\partial x_2|$.

Our task is to show that these rearrangements decrease both terms in the functional $\mathcal{E}$. We turn to the gradient norms first. By Lemma 2 we have that $\int_D (\partial g/\partial x_1)^2$ does not increase and the same is true for $\int_D (\partial f/\partial x_1)^2$. We next show that $\int_D (\partial g/\partial x_2)^2$ does not increase either. (The argument for $f$ is essentially identical.) There are several ways to prove this, and one way is the following. An easy approximation argument shows that
(3B.2)  $\int_D (\partial g/\partial x_2)^2 = \lim_{\delta \to 0} \delta^{-2} \int_D [g(x_1, x_2 + \delta) - g(x_1, x_2)]^2\,dx_1 dx_2.$
(Here, $g(x_1, x_2)$ has to be extended to be $x_2$ outside $D$.) The result we want, namely that replacement of $g$ by $\tilde g$ does not increase the two sides of (3B.2), follows from a trivial modification of Lemma 1. To summarize, the vector field $\tilde\psi$ is in $\mathcal{C}$ and its gradient norms are not bigger than those of $\psi$.
The penultimate step is to prove that $|\tilde\psi(x)| \leq 1$ for all $x \in D$. Let $K(t) = [\max(0, t)]^2$. Then
(3B.3)  $\int_D K(g^2 - (1 - f^2)) = 0$
since $|\psi|^2 = f^2 + g^2 \leq 1$. By Lemma 1, however,
(3B.4)  $\int_D K(g^2 - (1 - f^2)) \geq \int_D K(\tilde g^2 - (1 - \tilde f^2))$
since $(g^2)^* = \tilde g^2$ for each line $x_2$ = constant, and, similarly, $(1 - f^2)^* = (1 - \tilde f^2)$ since $f^2 \leq 1$ and $\tilde f^2 \leq 1$. If $\tilde g^2 + \tilde f^2 > 1$ on a set of positive measure, the right side of (3B.4) would be positive, but this is precluded by (3B.3).

Finally, we turn to the $J$ term in $\mathcal{E}$. We can define $L(t) = J(1 - t)$ for $0 \leq t \leq 1$. (The definition of $L(t)$ for $t < 0$ or $t > 1$ is not needed since $0 \leq t \leq 1$ in our application.) Then, by Lemma 1 and the same reasoning as for $K$ above,
$\int_D L((1 - f^2) - g^2) \geq \int_D L((1 - \tilde f^2) - \tilde g^2),$
which is the same as $\int J(|\psi|^2) \geq \int J(|\tilde\psi|^2)$.

Thus far we have constructed a $\tilde\psi$ with $\mathcal{E}(\tilde\psi) \leq \mathcal{E}(\psi)$ and with $\tilde f(x_1, x_2) = -\tilde f(-x_1, x_2)$ and $\tilde g(x_1, x_2) = \tilde g(-x_1, x_2)$. The final step is to use this $\tilde\psi$ to construct a vector field that is odd under $x \mapsto -x$ and whose energy does not exceed $\mathcal{E}(\psi)$. Let $D_+$ denote the upper hemidisc $\{(x_1, x_2) : x_2 > 0\} \cap D$ and $D_-$ the lower hemidisc. Let $(f_\pm, g_\pm)$ denote $\tilde\psi$ restricted to $D_+$ and $D_-$. Consider the following two vector fields:
$\psi_1 = (f_+(x_1, x_2), g_+(x_1, x_2))$ in $D_+$, and $\psi_1 = (f_+(x_1, -x_2), -g_+(x_1, -x_2))$ in $D_-$;
$\psi_2 = (f_-(x_1, -x_2), -g_-(x_1, -x_2))$ in $D_+$, and $\psi_2 = (f_-(x_1, x_2), g_-(x_1, x_2))$ in $D_-$.
Clearly $\psi_{1,2}(x) = -\psi_{1,2}(-x)$. Also, $\psi_1$ and $\psi_2$ are in $\mathcal{C}$ because $\tilde g(x_1, 0) = 0$. Moreover, $\mathcal{E}(\psi_1) + \mathcal{E}(\psi_2) = 2\mathcal{E}(\tilde\psi)$. Therefore, $\psi_1$ or $\psi_2$ is a vector field satisfying the conclusion of Theorem 3. □
3C. Proof of Theorem 1. The basic fact, which we shall prove later, is that the real eigenfunctions of $H$ can be chosen to have at least one of the following symmetry properties for all $x \in D$:
(3C.1)  (a) $v(x) = -v(-x)$,  (b) $v(x) = Pv(P^{-1}x)$,
where $P = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ is the reflection about the $x_1$-axis in $\mathbb{R}^2$. This is not to say that every eigenfunction has one of these properties, but we do assert that each eigenvalue of $H$ has at least one eigenfunction of type (a) or (b). Since we are interested only in the eigenvalues of $H$, we may assume (a) or (b).

Now consider $\psi_\varepsilon := \psi_0 + \varepsilon v \in \mathcal{C}$, with $v$ as in (3C.1). In case (a), $\psi_\varepsilon(x) = -\psi_\varepsilon(-x)$ for all $\varepsilon$. In case (b), $\psi_\varepsilon$ satisfies hypothesis (2.6) of Theorem 3, with $\omega_0 = (1, 0)$. By Theorem 2, in case (a) there is a $\tilde\psi_\varepsilon$ with $\mathcal{E}(\tilde\psi_\varepsilon) \leq \mathcal{E}(\psi_\varepsilon)$ and $\tilde\psi_\varepsilon$ is a hedgehog (1.6). Thus,
(3C.2)  $\mathcal{E}(\psi_\varepsilon) \geq \mathcal{E}(\tilde\psi_\varepsilon) \geq \mathcal{E}(\psi_0)$
since $\psi_0$ is the energy minimizer among all hedgehogs. In case (b) we first have to use Theorem 3 to obtain an intermediate $\tilde\psi_\varepsilon$ that satisfies the hypothesis of Theorem 2. Again, (3C.2) holds. Since
$\mathcal{E}(\psi_\varepsilon) = \mathcal{E}(\psi_0) + \varepsilon^2 \gamma \int_D v^2 + o(\varepsilon^2),$
where $\gamma$ is the eigenvalue of $H$ belonging to $v$, we see from (3C.2) that $\gamma \geq 0$.

There are two ways to derive the symmetry (3C.1) and we shall give both. The first is a fairly general argument and the second involves a detailed study of the eigenfunctions leading to (2.1).
General argument: Let $P$ denote reflection about some axis through the origin. For any eigenfunction, $w$, its reflection, $(P^*w)(x) := Pw(P^{-1}x)$, is also an eigenfunction with the same eigenvalue. If $v(x) = w(x) + (P^*w)(x)$ is not identically zero for some $P$ then $v$ is an eigenfunction satisfying (3C.1)(b). If $v$ vanishes identically for all reflections $P$ we claim that $w$ must be of type (3C.1)(a). To see this recall that any rotation $R$ is the product of two reflections and hence $Rw(R^{-1}x) = w(x)$, i.e., $w$ is rotationally symmetric. It is easy to see that $w$ must then satisfy $w(x) = k(r)(-x_2, x_1)$ for some function $k$, and hence $w$ satisfies (3C.1)(a).

Details of eigenfunctions: Let $R_\alpha$ be the rotation through the angle $\alpha$ and let $U_\alpha$ be its representation given by
(3C.3)  $(U_\alpha v)(x) = R_\alpha v(R_{-\alpha}x).$
$U_\alpha$ is a strongly continuous one-parameter subgroup of the unitary group of $L^2(D; \mathbb{C}^2)$ and it commutes with $H$. Its infinitesimal generator is
(3C.4)  $L = i\frac{\partial}{\partial\theta} + i \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$
By standard arguments we can choose the eigenfunctions of $H$ to be eigenfunctions of $L$. By solving $Lv = \nu v$ we find that $v(\theta)$ must be of the form (2.1). $\nu$ must be an integer since $v(0) = v(2\pi)$. Furthermore, a glance at the eigenvalue equation reveals that $a$ and $b$ can be taken to be real.

Next we verify that the lowest eigenvalue belongs to $m = 0$ or $m = \pm 1$. Suppose, on the contrary, that the lowest one belongs to $M > 1$ ($m$ and $-m$ are the same by complex conjugation). Then define a comparison function by $\tilde v := e^{-i\theta}v$. Obviously, the $J$-term is unchanged. The only term that changes in the gradient norm is the replacement of $I := \int_D |\partial v/\partial\theta|^2 r^{-2}$ by $\tilde I := \int_D |\partial(e^{-i\theta}v)/\partial\theta|^2 r^{-2}$. One easily computes that
$I = 2 \int_D \{(M + 1)^2 a(r)^2 + (M - 1)^2 b(r)^2\} r^{-2}$
and hence $\tilde I < I$ when $M > 1$.

Now consider (2.1) with $m = 0$. This vector $v$ satisfies $v(x) = -v(-x)$ (because changing $x$ to $-x$ amounts to changing $\theta$ to $\theta + \pi$). Thus, all $m = 0$ eigenfunctions satisfy (3C.1)(a). Indeed all even-$m$ eigenfunctions have this property. When $m = 1$ we claim that (3C.1)(b) holds, thus completing our proof. Take the real part of (2.1), which is $v(r, \theta) = (a(r)\cos 2\theta + b(r),\ a(r)\sin 2\theta)$; this satisfies (3C.1)(b) with $P$ being reflection about the $x_1$-axis.
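The angular computations in this section can be checked pointwise at a fixed radius. The sketch below assumes the generator has the form $L = i\,\partial/\partial\theta + i\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ and that $I$ is built from $|\partial v/\partial\theta|^2$; both are readings of damaged formulas, and the helper names are hypothetical. It verifies that a field of the form (2.1) satisfies $Lv = -mv$, that $|\partial v/\partial\theta|^2 = 2[(m+1)^2a^2 + (m-1)^2b^2]$, and that lowering $m$ from 2 to 1 lowers this density.

```python
import cmath


def v_and_dv(m, a, b, theta):
    """Return v from (2.1) and its theta-derivative at a fixed radius."""
    ea = cmath.exp(1j * (m + 1) * theta)  # carries a(r), direction (1, -i)
    eb = cmath.exp(1j * (m - 1) * theta)  # carries b(r), direction (1, +i)
    v = (a * ea + b * eb, -1j * a * ea + 1j * b * eb)
    dv = (1j * (m + 1) * a * ea + 1j * (m - 1) * b * eb,
          (m + 1) * a * ea - (m - 1) * b * eb)
    return v, dv


def apply_L(m, a, b, theta):
    """Apply L = i d/dtheta + i [[0, 1], [-1, 0]] to the field (2.1)."""
    v, dv = v_and_dv(m, a, b, theta)
    # the matrix [[0, 1], [-1, 0]] maps (v0, v1) to (v1, -v0)
    return (1j * dv[0] + 1j * v[1], 1j * dv[1] - 1j * v[0])


def angular_density(m, a, b, theta):
    """|dv/dtheta|^2, the integrand of I up to the factor r^-2."""
    _, dv = v_and_dv(m, a, b, theta)
    return abs(dv[0]) ** 2 + abs(dv[1]) ** 2


a, b, theta = 0.3, -1.1, 0.7
v, _ = v_and_dv(2, a, b, theta)
Lv = apply_L(2, a, b, theta)
print(Lv[0] / v[0], Lv[1] / v[1])  # both ratios should be close to -2
print(angular_density(2, a, b, theta), angular_density(1, a, b, theta))
```

Since $e^{-i\theta}v$ is again of the form (2.1) with $m$ replaced by $m - 1$, comparing the densities for $m = M$ and $m = M - 1$ is the pointwise version of comparing $I$ and $\tilde I$.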
3D. Remarks on Lemma 1. The Chiti, Crandall-Tartar theorem requires the convex function $J$ to be even. It is usually stated as $J : \mathbb{R}_+ \to \mathbb{R}_+$, $J(0) = 0$ and with (2.10) replaced by
(2.10a)  $\int_{\mathbb{R}^n} J(|f^* - g^*|) \leq \int_{\mathbb{R}^n} J(|f - g|).$
It is a simple matter to derive (2.10) from (2.10a) but we are unaware of (2.10) in the literature. Evidently, it suffices to prove (2.10) for the extremal convex functions, i.e., $J$ of the type $J(t) = A(t - a)_+$ or $J(t) = A(t + a)_-$ with $t_\pm := \max\{\pm t, 0\}$ and $a, A \geq 0$. Consider $A(t - a)_+$. By replacing $f$ by $(f - a)_+$ we may as well take $a = 0$ and $A = 2$. Then $2t_+ = |t| + t$. We can assume $f - g \in L^1(\mathbb{R}^n)$, for otherwise the right side of (2.10) is $+\infty$. It is easy to see that $f^* - g^* \in L^1(\mathbb{R}^n)$ as well. Then, since $2t_+ = |t| + t$, (2.10) reads
$\int \{|f - g| - |f^* - g^*|\} \geq \int \{f^* - g^* + g - f\}.$
Indeed, the left side is nonnegative by (2.10a) and the right side is zero since $f, f^*$ and $g, g^*$ are equimeasurable.
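The reduction above can be seen in a finite toy model. The sketch below assumes a discrete analogue in which sorting a finite list in decreasing order plays the role of the rearrangement (a one-sided decreasing rearrangement rather than the symmetric one of Lemma 1), and it uses the extremal function $J(t) = 2t_+ = |t| + t$. The helper names are hypothetical.

```python
def decreasing_rearrangement(values):
    """Discrete stand-in for the rearrangement: sort in decreasing order."""
    return sorted(values, reverse=True)


def total_cost(J, f, g):
    """Discrete analogue of the integral of J(f - g)."""
    return sum(J(x - y) for x, y in zip(f, g))


def twice_positive_part(t):
    """The extremal convex function 2 * t_+ written as |t| + t."""
    return abs(t) + t


f = [3, 1, 2]
g = [1, 2, 0]
f_star = decreasing_rearrangement(f)
g_star = decreasing_rearrangement(g)

# analogue of (2.10): the rearranged cost does not exceed the original cost
print(total_cost(twice_positive_part, f_star, g_star),
      total_cost(twice_positive_part, f, g))

# the two sides of the final displayed inequality
left = total_cost(abs, f, g) - total_cost(abs, f_star, g_star)
right = sum(f_star) - sum(g_star) + sum(g) - sum(f)
print(left, right)
```

In this example the cost drops from 8 to 6, the left side of the last inequality is 2, and the right side is 0 because sorting preserves sums.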
3E. Remarks on Lemma 2. (2.11) also holds for functions on $\mathbb{R}^n$, but we need only the $\mathbb{R}^1$ version. For some historical remarks about this inequality see [AL] 1.1, 2.6 and 2.7. There are many proofs of it but the simplest (in our view) is in [LE1], Lemma 5. The method of [LE1], Lemma 5 also proves a generalization that would suffice for our needs in the proof of Theorem 3 above. The generalization is that if $f : \mathbb{R}^{n+m} \to \mathbb{R}_+$ and if $f^* : \mathbb{R}^{n+m} \to \mathbb{R}_+$ is the symmetric decreasing rearrangement with respect to the first $n$ variables only, then
$\int_{\mathbb{R}^{n+m}} |\nabla f^*|^2 \leq \int_{\mathbb{R}^{n+m}} |\nabla f|^2.$

Proof of Lemma 3.
Step 1. $\psi \mapsto T_1\psi$, where $T_1\psi = \psi$ if $|\psi| \leq 1$ and $T_1\psi = \psi/|\psi|$ if $|\psi| > 1$ [with $|\psi| = (f^2 + g^2)^{1/2}$]. An easy exercise shows that $\|\nabla T_1\psi\|_2 \leq \|\nabla\psi\|_2$.
Furthermore, $\int J(|\psi|^2) \geq \int J(|T_1\psi|^2)$ because $J(t) \geq 0$ when $t \geq 1$ while $J(1) = 0$. Therefore, without loss of generality we can assume that $|\psi(x)| \leq 1$ for all $x$.

Step 2. $\psi \mapsto T_2\psi := (f, h)$ with $h(x_1, x_2) = \max\{x_2, g(x_1, x_2)\}$ if $x_2 \geq 0$ and $h(x_1, x_2) = \min\{x_2, g(x_1, x_2)\}$ if $x_2 < 0$. Obviously $|T_2\psi(x)| \geq |\psi(x)|$ for all $x$. The condition $g(x_1, 0) = 0$ guarantees that $T_2\psi \in \mathcal{C}$.

Step 3. $\psi \mapsto T_3\psi := T_1T_2\psi$. If we write $T_1T_2\psi =: \tilde\psi = (\tilde f, \tilde g)$ we can easily verify the following for all $x$ (using $|\psi| \leq 1$):
(a) $1 \geq |T_3\psi(x)| \geq |\psi(x)|$,
(b) $|\tilde f(x)| \leq |f(x)|$ and $\operatorname{sgn}\tilde f(x) = \operatorname{sgn} f(x)$,
(c) $|\tilde g(x)| \geq |g(x)|$ and $\operatorname{sgn}\tilde g(x) = \operatorname{sgn} x_2$,
(d) $\tilde g(x)^2 - g(x)^2 \geq \frac{1 - x_2^2}{1 + x_2^2}\,[x_2^2 - g(x)^2]_+$.
(a) is obvious because $T_2$ does not decrease $|\psi|$ and $T_1$ only cuts off $|T_2\psi|$ at 1; but $|\psi| \leq 1$ everywhere. (b) is also obvious because $T_2$ leaves $f$ invariant and $T_1$ can only decrease $|f|$. (c) follows from the facts that $T_2$ increases $|g|$, the map $t \mapsto t/(f^2 + t)$ is monotone increasing for $t \geq 0$, and $g^2 \leq g^2/(f^2 + g^2)$ since $f^2 + g^2 \leq 1$. Indeed, (d) gives a more quantitative estimate. To prove (d) we recall that $T_2\psi =: (f, h)$. If $f^2 + h^2 \leq 1$ and $x_2^2 > g^2$ then $|\tilde g| = |h| = |x_2|$ and (d) is certainly true. If $f^2 + h^2 > 1$ and $x_2^2 > g^2$ then
$\tilde g^2 - g^2 = \frac{x_2^2}{f^2 + x_2^2} - g^2 \geq \frac{x_2^2}{1 - g^2 + x_2^2} - g^2 = \frac{1 - g^2}{1 - g^2 + x_2^2}\,[x_2^2 - g^2] \geq \frac{1 - x_2^2}{1 + x_2^2}\,[x_2^2 - g^2].$
We claim that $\mathcal{E}(T_3\psi) \leq \mathcal{E}(\psi)$. As far as the gradient term is concerned, $T_2$ replaces $g$ by the harmonic function $x_2$ on the set where $|g| < |x_2|$. This certainly lowers the gradient term. The $J$ term does not increase by property (a) above, since $J(t)$ is decreasing for $0 \leq t \leq 1$. Now we iterate $T_3$ and denote $(f_n, g_n) = \psi_n := T_3^n\psi$. By (b) and (c) $f_n$ and $g_n$ are bounded monotone sequences and converge pointwise to limit functions $\tilde f$ and $\tilde g$. Since $\mathcal{E}(\psi)$ is weakly lower semicontinuous we have that $\mathcal{E}(\tilde\psi) \leq \mathcal{E}(\psi)$ where $\tilde\psi = (\tilde f, \tilde g)$. It is clear that $\tilde\psi$ satisfies the correct boundary conditions and hence is in $\mathcal{C}$. The only thing left to check is that $\tilde g(x)^2 \geq x_2^2$. If we define $a_n(x) = [x_2^2 - g_n(x)^2]_+$, property (d) can be rewritten as $a_{n+1}(x) \leq a_n(x)\,(2x_2^2/(1 + x_2^2))$, which shows that $a_n(x)$ converges to zero pointwise for all $x \in D$. □
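The iteration of $T_3$ acts pointwise, so its contraction property can be watched at a single point. The following is a minimal sketch under stated assumptions: the maps are applied to one value $(f, g)$ at a point with second coordinate $x_2$, the starting value satisfies $f^2 + g^2 \leq 1$ with $g$ of the same sign as $x_2$, and the function names and sample numbers are illustrative.

```python
import math


def T1(f, g):
    """Radial cutoff: project onto the unit disc when |psi| > 1."""
    norm = math.hypot(f, g)
    return (f, g) if norm <= 1.0 else (f / norm, g / norm)


def T2(f, g, x2):
    """Replace g by x2 wherever g lies on the wrong side of x2."""
    h = max(x2, g) if x2 >= 0 else min(x2, g)
    return f, h


def T3(f, g, x2):
    """One step of the iteration: T3 = T1 composed with T2."""
    return T1(*T2(f, g, x2))


def deficit(g, x2):
    """a_n = [x2^2 - g^2]_+, the quantity that must tend to zero."""
    return max(x2 * x2 - g * g, 0.0)


x2 = 0.6
f, g = 0.9, 0.2                       # f^2 + g^2 = 0.85 <= 1, sgn g = sgn x2
rate = 2.0 * x2 * x2 / (1.0 + x2 * x2)  # contraction factor from property (d)
for n in range(6):
    print(n, deficit(g, x2))
    f, g = T3(f, g, x2)
print("rate bound per step:", rate)
```

Each printed deficit should be at most `rate` times the previous one, so the sequence decays geometrically whenever $|x_2| < 1$.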
Acknowledgements

We thank Laszlo Erdos for many valuable discussions.

References

[AL] F.J. Almgren, Jr. and E.H. Lieb, Symmetric decreasing rearrangement is sometimes continuous, J. Amer. Math. Soc. 2 (1989), 683-773.
[BBH] F. Bethuel, H. Brezis and F. Hélein, Ginzburg-Landau Vortices, Birkhäuser, 1994.
[CG] G. Chiti, Rearrangements of functions and convergence in Orlicz spaces, Appl. Anal. 9 (1979), 23-27.
[CT] M.G. Crandall and L. Tartar, Some relations between nonexpansive and order preserving mappings, Proc. Amer. Math. Soc. 78 (1980), 358-390.
[HH] R.M. Hervé and M. Hervé, Etude qualitative des solutions réelles de l'équation différentielle ... (to appear).
[JT] A. Jaffe and C. Taubes, Vortices and Monopoles, Birkhäuser, 1980.
[LE1] E.H. Lieb, Existence and uniqueness of the minimizing solution of Choquard's non-linear equation, Stud. Appl. Math. 57 (1977), 93-105.
[LE2] E.H. Lieb, Remarks on the Skyrme Model, Proc. Amer. Math. Soc., Symposia in Pure Math. 54 (1993), 379-384, (Proceedings of Summer Research Institute on Differential Geometry at UCLA, July 8-28, 1990).

DEPARTMENT OF MATHEMATICS, PRINCETON UNIVERSITY, P.O. Box 708, PRINCETON, NJ 08544-0708
E-mail address: lieb@math.princeton.edu
SCHOOL OF MATHEMATICS, GEORGIA INSTITUTE OF TECHNOLOGY, ATLANTA, GA 30332-0160
E-mail address: loss@math.gatech.edu
Publications of Elliott H. Lieb
1. Second Order Radiative Corrections to the Magnetic Moment of a Bound Electron, Phil. Mag. Vol. 46, 311-316 (1955). 2. A Non-Perturbation Method for Non-Linear Field Theories, Proc. Roy. Soc. 241A, 339-363 (1957). 3. (with K. Yamazaki) Ground State Energy and Effective Mass of the Polaron, Phys. Rev. 111, 728-733 (1958). 4. (with H. Koppe) Mathematical Analysis of a Simple Model Related to the Stripping Reaction, Phys. Rev. 116, 367-371 (1959). 5. Hard Sphere Bose Gas - An Exact Momentum Space Formulation, Proc. U.S. Nat. Acad. Sci. 46, 1000-1002 (1960). 6. Operator Formalism in Statistical Mechanics, J. Math. Phys. 2, 341-343 (1961). 7. (with D.C. Mattis) Exact Wave Functions in Superconductivity, J. Math. Phys. 2, 602-609 (1961). 8. (with T.D. Schultz and D.C. Mattis) Two Soluble Models of an Antiferromagnetic Chain, Annals of Phys. (N.Y.) 16,407-466 (1961). t 9. (with D.C. Mattis) Theory of Ferromagnetism and the Ordering of Electronic Energy Levels, Phys. Rev. 125, 164-172 (1962). t 10. (with D.C. Mattis) Ordering Energy Levels of Interacting Spin Systems, J. Math. Phys. 3, 749-751 (1962). 11. New Method in the Theory of Imperfect Gases and Liquids, J. Math. Phys. 4, 671-678 (1963). 12. (with W. Liniger) Exact Analysis of an Interacting Bose Gas. I. The General Solution and the Ground State, Phys. Rev. 130, 1605-1616 (1963). 13. Exact Analysis of an Interacting Bose Gas. H. The Excitation Spectrum, Phys. Rev. 130, 1616-1624 (1963). 14. Simplified Approach to the Ground State Energy of an Imperfect Bose Gas, Phys. Rev. 130, 2518-2528 (1963). 15. (with A. Sakakura) Simplified Approach to the Ground State Energy of an Imperfect Bose Gase. II. The Charged Bose Gas at High Density, Phys. Rev. 133, A899-A906 (1964). 16. (with W. Liniger) Simplified Approach to the Ground State Energy of an Imperfect Bose Gas. III. Application to the One-Dimensional Model, Phys. Rev. 134, A312-A315 (1964). 17. (with T.D. Schultz and D.C. 
Mattis) Two-Dimensional Ising Model as a Soluble Problem of Many Fermions, Rev. Mod. Phys. 36, 856-871 (1964).
t means the paper appears in this volume.
18. The Bose Fluid, Lectures in Theoretical Physics, Vol. VIIC, (Boulder summer school), University of Colorado Press, 175-224 (1965). 19. (with D.C. Mattis) Exact Solution of a Many-Fermion System and its Associated Boson Field, J. Math. Phys. 6, 304-312 (1965).
20. (with S.Y. Larsen, J.E. Kilpatrick and H.F. Jordan) Suppression at High Temperature of Effects Due to Statistics in the Second Virial Coefficient of a Real Gas, Phys. Rev. 140, A 129-A 130 (1965).
21. (with D.C. Mattis) Book Mathematical Physics in One Dimension, Academic Press, New York (1966). t 22. Proofs of Some Conjectures on Permanents, J. of Math. and Mech. 16, 127-139 (1966). 23. Quantum Mechanical Extension of the Lebowitz-Penrose Theorem on the van der Waals Theory, J. Math. Phys. 7, 1016-1024 (1966). 24. (with D.C. Mattis) Theory of Paramagnetic Impurities in Semiconductors, J. Math. Phys. 7, 2045-2052 (1966). 25. (with T. Burke and J.L. Lebowitz) Phase Transition in a Model Quantum System: Quantum Corrections to the Location of the Critical Point, Phys. Rev. 149, 118-122 (1966). 26. Some Comments on the One-Dimensional Many-Body Problem, unpublished Proceedings of Eastern Theoretical Physics Conference, New York (1966). 27. Calculation of Exchange Second Virial Coefficient of a Hard Sphere Gas by Path Integrals, J. Math. Phys. 8,43-52 (1967). 28. (with Z. Rieder and J.L. Lebowitz) Properties of a Harmonic Crystal in a Stationary Nonequilibrium State, J. Math. Phys. 8, 1073-1078 (1967). 29. Exact Solution of the Problem of the Entropy of Two-Dimensional Ice, Phys. Rev. Lett. 18, 692-694 (1967). 30. Exact Solution of the F Model of an Antiferroelectric, Phys. Rev. Lett. 18, 1046-1048(1967). 31. Exact Solution of the Two-Dimensional Slater KDP Model of a Ferroelectric, Phys. Rev. Lett. 19, 108-110 (1967). 32. The Residual Entropy of Square Ice, Phys. Rev. 162, 162-172 (1967). 33. Ice, Ferro- and Antiferroelectrics, in Methods and Problems in Theoretical Physics, in honour of R.E. Peierls, Proceedings of the 1967 Birmingham conference, North-Holland, 21-28 (1970). 34. Exactly Soluble Models, in Mathematical Methods in Solid State and Superfluid Theory, Proceedings of the 1967 Scottish Universities' Summer School of Physics, Oliver and Boyd, Edinburgh 286-306 (1969). 35. The Solution of the Dimer Problems by the Transfer Matrix Method, J. Math. Phys. 8, 2339-2341 (1967). 36. (with M. Flicker) Delta Function Fermi Gas with Two Spin Deviates, Phys. Rev. 
161, 179-188 (1967). t 37. Concavity Properties and a Generating Function for Stirling Numbers, J. Combinatorial Theory 5, 203-206 (1968). 38. A Theorem on Pfaffians, J. Combinatorial Theory 5, 313-319 (1968).
39. (with F.Y. Wu) Absence of Mott Transition in an Exact Solution of the Short-Range One-Band Model in One Dimension, Phys. Rev. Lett. 20, 1445-1448(1968). 40. Two Dimensional Ferroelectric Models, J. Phys. Soc. (Japan) 26 (supplement), 94-95 (1969). 41. (with W.A. Beyer) Clusters on a Thin Quadratic Lattice, Studies in Appl. Math. 48, 77-90 (1969). 42. (with C.J. Thompson) Phase Transition in Zero Dimensions: A Remark on the Spherical Model, J. Math. Phys. 10, 1403-1406 (1969). 43. (with J.L. Lebowitz) The Existence of Thermodynamics for Real Matter with Coulomb Forces, Phys. Rev. Lett. 22, 631-634 (1969). 44. Two Dimensional Ice and Ferroelectric Models, in Lectures in Theoretical Physics, XI D, (Boulder summer school) Gordon and Breach, 3 29-354 (1969).
45. Survey of the One Dimensional Many Body Problem and Two Dimensional Ferroelectric Models, in Contemporary Physics: Trieste Symposium 1968, International Atomic Energy Agency, Vienna, vol. 1, 163-176
t
(1969). 46. Models, in Phase Transitions, Proceedings of the 14th Solvay Chemistry Conference, May 1969, Interscience, 45-56 (1971). 47. (with H. Araki) Entropy Inequalities, Commun. Math. Phys. 18, 160-170 (1970). 48. (with O.J. Heilmann) Violation of the Non-Crossing Rule: The Hubbard Hamiltonian for Benzene, Trans. N.Y. Acad. Sci. 33, 116-149 (1970). Also in Annals N.Y. Acad. Sci. 172, 583-617 (1971). (Awarded the 1970 Boris Pregel award for research in chemical physics.) 49. (with O.J. Heilmann) Monomers and Dimers, Phys. Rev. Lett. 24, 14121414 (1970).
50. Book Review of "Statistical Mechanics" by David Ruelle, Bull. Amer. Math. Soc. 76, 683-687 (1970). 51. (with J.L. Lebowitz) Thermodynamic Limit for Coulomb Systems, in Systemes a un Nombre Infini de Degres de Liberte, Colloques Internationaux de Centre National de la Recherche Scientifique 181, 155-162 (1970).
52. (with D.B. Abraham, T. Oguchi and T. Yamamoto) On the Anomalous Specific Heat of Sodium Trihydrogen Selenite, Progr. Theor. Phys. (Kyoto) 44, 1114-1115 (1970). 53. (with D.B. Abraham) Anomalous Specific Heat of Sodium Trihydrogen
Selenite - An Associated Combinatorial Problem, J. Chem. Phys. 54, 1446-1450(1971). 54. (with O.J. Heilmann, D. Kleitman and S. Sherman) Some Positive Definite Functions on Sets and Their Application to the Ising Model, Discrete Math. 1, 19-27 (1971). 55. (with Th. Niemeijer and G. Vertogen) Models in Statistical Mechanics, in
Statistical Mechanics and Quantum Field Theory, Proceedings of 1970
Ecole d'Ete de Physique Theorique (Les Houches), Gordon and Breach, 281-326 (1971). 56. (with H.N.V. Temperley) Relations between the `Percolation' and'Colouring' Problem and Other Graph-Theoretical Problems Associated with Regular Planar Lattices: Some Exact Results for the 'Percolation' Problem, Proc. Roy. Soc. A322, 251-280 (1971). 57. (with M. de Llano) Some Exact Results in the Hartree-Fock Theory of a Many-Fermion System at High Densities, Phys. Letts. 37B, 47-49 (1971). 58. (with J.L. Lebowitz) The Constitution of Matter: Existence of Thermodynamics for Systems Composed of Electrons and Nuclei, Adv. in Math. 9, 316-398 (1972). 59. (with F.Y. Wu) Two Dimensional Ferroelectric Models, in Phase Transitions and Critical Phenomena, C. Domb and M. Green eds., vol. 1, Academic Press 331-490 (1972). 60. (with D. Ruelle) A Property of Zeros of the Partition Function for Ising Spin Systems, J. Math. Phys. 13, 781-784 (1972). 61. (with O.J. Heilmann) Theory of Monomer-Dimer Systems, Commun. Math. Phys. 25, 190-232 (1972). Errata 27, 166 (1972). 62. (with M.L. Glasser and D.B. Abraham) Analytic Properties of the Free Energy for the "Ice" Models, J. Math. Phys. 13, 887-900 (1972). 63. (with D.W. Robinson) The Finite Group Velocity of Quantum Spin Systems, Commun. Math. Phys. 28, 251-257 (1972). 64. (with J.L. Lebowitz) Phase Transition in a Continuum Classical System with Finite Interactions, Phys. Lett. 39A, 98-100 (1972). 65. (with J.L. Lebowitz) Lectures on the Thermodynamic Limit for Coulomb Systems, in Statistical Mechanics and Mathematical Problems, Battelle 1971 Recontres, Springer Lecture Notes in Physics 20, 136-161 (1973). 66. (with J.L. Lebowitz) Lectures on the Thermodynamic Limit for Coulomb Systems, in Lectures in Theoretical Physics XIV B, (Boulder summer school), Colorado Associated University Press, 423-460 (1973). t 67. Convex Trace Functions and the Wigner-Yanase-Dyson Conjecture, Adv. in Math. 11, 267-288 (1973). t 68. (with M.B. 
Ruskai) A Fundamental Property of Quantum Mechanical Entropy, Phys. Rev. Lett. 30, 434-436 (1973). t 69. (with M.B. Ruskai) Proof of the Strong Subadditivity of Quantum-Mechanical Entropy, J. Math. Phys. 14, 1938-1941 (1973). 70. (with K. Hepp) On the Superradiant Phase Transition for Molecules in a Quantized Radiation Field: The Dicke Maser Model, Annals of Phys. (N.Y.) 76, 360-404 (1973). 71. (with K. Hepp) Phase Transition in Reservoir Driven Open Systems with Applications to Lasers and Superconductors, Helv. Phys. Acta 46, 573-602 (1973). 72. (with K. Hepp) The Equilibrium Statistical Mechanics of Matter Interacting with the Quantized Radiation Field, Phys. Rev. A8, 2517-2525 (1973). 73. (with K. Hepp) Constructive Macroscopic Quantum Electrodynamics, in Constructive Quantum Field Theory, Proceedings of the 1973 Erice Summer School, G. Velo and A. Wightman, eds., Springer Lecture Notes in Physics 25, 298-316 (1973). t 74. The Classical Limit of Quantum Spin Systems, Commun. Math. Phys. 31, 327-340 (1973). 75. (with B. Simon) Thomas-Fermi Theory Revisited, Phys. Rev. Lett. 31, 681-683 (1973). t 76. (with M.B. Ruskai) Some Operator Inequalities of the Schwarz Type, Adv. in Math. 12, 269-273 (1974). 77. Exactly Soluble Models in Statistical Mechanics, lecture given at the 1973 I.U.P.A.P. van der Waals Centennial Conference on Statistical Mechanics, Physica 73, 226-236 (1974). 78. (with B. Simon) On Solutions to the Hartree-Fock Problem for Atoms and Molecules, J. Chem. Physics 61, 735-736 (1974). 79. Thomas-Fermi and Hartree-Fock Theory, lecture at 1974 International Congress of Mathematicians, Vancouver. Proceedings, Vol. 2, 383-386 (1975).
t 80. Some Convexity and Subadditivity Properties of Entropy, Bull. Amer. Math. Soc. 81, 1-13 (1975). t 81. (with H.J. Brascamp and J.M. Luttinger) A General Rearrangement Inequality for Multiple Integrals, Jour. Funct. Anal. 17, 227-237 (1975). t 82. (with H.J. Brascamp) Some Inequalities for Gaussian Measures and the Long-Range Order of the One-Dimensional Plasma, lecture at Conference on Functional Integration, Cumberland Lodge, England. Functional Integration and its Applications, A.M. Arthurs ed., Clarendon Press, 1-14 (1975). 83. (with K. Hepp) The Laser: A Reversible Quantum Dynamical System with Irreversible Classical Macroscopic Motion, in Dynamical Systems, Battelle 1974 Rencontres, Springer Lecture Notes in Physics 38, 178-208 (1975). Also appears in Melting, Localization and Chaos, Proc. 9th Midwest Solid State Theory Symposium, 1981, R. Kalia and P. Vashishta eds., North-Holland, 153-177 (1982). 84. (with P. Hertel and W. Thirring) Lower Bound to the Energy of Complex Atoms, J. Chem. Phys. 62, 3355-3356 (1975).
85. (with W. Thirring) Bound for the Kinetic Energy of Fermions which Proves the Stability of Matter, Phys. Rev. Lett. 35, 687-689 (1975). Errata 35, 1116 (1975). 86. (with H.J. Brascamp and J.L. Lebowitz) The Statistical Mechanics of Anharmonic Lattices, in the proceedings of the 40th session of the International Statistics Institute, Warsaw, 9, 1-11 (1975). t 87. (with H.J. Brascamp) Best Constants in Young's Inequality, Its Converse and Its Generalization to More Than Three Functions, Adv. in Math. 20, 151-172 (1976). t 88. (with H.J. Brascamp) On Extensions of the Brunn-Minkowski and Prekopa-Leindler Theorems, Including Inequalities for Log Concave Functions and with an Application to the Diffusion Equation, J. Funct. Anal. 22, 366-389 (1976).
89. (with J.F. Barnes and H.J. Brascamp) Lower Bounds for the Ground State Energy of the Schroedinger Equation Using the Sharp Form of Young's Inequality, in Studies in Mathematical Physics, Lieb, Simon, Wightman eds., Princeton Press, 83-90 (1976). t 90. Inequalities for Some Operator and Matrix Functions, Adv. in Math. 20, 174-178 (1976). 91. (with H. Narnhofer) The Thermodynamic Limit for Jellium, J. Stat. Phys. 12, 291-310 (1975). Errata J. Stat. Phys. 14, 465 (1976). 92. The Stability of Matter, Rev. Mod. Phys. 48, 553-569 (1976). 93. Bounds on the Eigenvalues of the Laplace and Schroedinger Operators, Bull. Amer. Math. Soc. 82, 751-753 (1976). 94. (with F.J. Dyson and B. Simon) Phase Transitions in the Quantum Heisenberg Model, Phys. Rev. Lett. 37, 120-123 (1976). (See no. 104.) t 95. (with W. Thirring) Inequalities for the Moments of the Eigenvalues of the Schrodinger Hamiltonian and Their Relation to Sobolev Inequalities, in Studies in Mathematical Physics, E. Lieb, B. Simon, A. Wightman eds., Princeton University Press, 269-303 (1976). 96. (with B. Simon and A. Wightman) Book Studies in Mathematical Physics: Essays in Honor of Valentine Bargmann, Princeton University Press (1976). 97. (with B. Simon) Thomas-Fermi Theory of Atoms, Molecules and Solids, Adv. in Math. 23, 22-116 (1977). 98. (with O. Lanford and J. Lebowitz) Time Evolution of Infinite Anharmonic Oscillators, J. Stat. Phys. 16, 453-461 (1977).
99. The Stability of Matter, Proceedings of the Conference on the Fiftieth Anniversary of the Schroedinger equation, Acta Physica Austriaca Suppl. XVII, 181-207 (1977). t 100. Existence and Uniqueness of the Minimizing Solution of Choquard's Nonlinear Equation, Studies in Appl. Math. 57, 93-105 (1977). 101. (with J. Frohlich) Existence of Phase Transitions for Anisotropic Heisenberg Models, Phys. Rev. Lett. 38, 440-442 (1977). 102. (with B. Simon) The Hartree-Fock Theory for Coulomb Systems, Commun. Math. Phys. 53, 185-194 (1977). 103. (with W. Thirring) A Lower Bound for Level Spacings, Annals of Phys. (N.Y.) 103, 88-96 (1977). 104. (with F. Dyson and B. Simon) Phase Transitions in Quantum Spin Systems with Isotropic and Non-Isotropic Interactions, J. Stat. Phys. 18, 335-383 (1978). 105. Many Particle Coulomb Systems, lectures given at the 1976 session on statistical mechanics of the International Mathematics Summer Center (C.I.M.E.). In Statistical Mechanics, C.I.M.E. I Ciclo 1976, G. Gallavotti, ed., Liguori Editore, Naples, 101-166 (1978). 106. (with R. Benguria) Many-Body Atomic Potentials in Thomas-Fermi Theory, Annals of Phys. (N.Y.) 110, 34-45 (1978). 107. (with R. Benguria) The Positivity of the Pressure in Thomas-Fermi Theory, Commun. Math. Phys. 63, 193-218 (1978). Errata 71, 94 (1980).
108. (with M. de Llano) Solitons and the Delta Function Fermion Gas in Hartree-Fock Theory, J. Math. Phys. 19, 860-868 (1978). 109. (with J. Frohlich) Phase Transitions in Anisotropic Lattice Spin Systems, Commun. Math. Phys. 60, 233-267 (1978). 110. (with J. Frohlich, R. Israel and B. Simon) Phase Transitions and Reflection Positivity. I. General Theory and Long Range Lattice Models, Commun. Math. Phys. 62, 1-34 (1978). (See no. 124.) t 111. (with M. Aizenman and E.B. Davies) Positive Linear Maps Which are Order Bounded on C* Subalgebras, Adv. in Math. 28, 84-86 (1978). t 112. (with M. Aizenman) On Semi-Classical Bounds for Eigenvalues of Schrodinger Operators, Phys. Lett. 66A, 427-429 (1978). 113. New Proofs of Long Range Order, in Proceedings of the International Conference on Mathematical Problems in Theoretical Physics (June 1977), Springer Lecture Notes in Physics, 80, 59-67 (1978). t 114. Proof of an Entropy Conjecture of Wehrl, Commun. Math. Phys. 62, 35-41 (1978). 115. (with B. Simon) Monotonicity of the Electronic Contribution to the Born-Oppenheimer Energy, J. Phys. B. 11, L537-L542 (1978). 116. (with O. Heilmann) Lattice Models for Liquid Crystals, J. Stat. Phys. 20, 679-693 (1979). 117. (with H. Brezis) Long Range Atomic Potentials in Thomas-Fermi Theory, Commun. Math. Phys. 65, 231-246 (1979). 118. The N^{5/3} Law for Bosons, Phys. Lett. 70A, 71-73 (1979). 119. A Lower Bound for Coulomb Energies, Phys. Lett. 70A, 444-446 (1979).
120. Why Matter is Stable, Kagaku 49, 301-307 and 385-388 (1979). (In Japanese). 121. The Number of Bound States of One-Body Schrodinger Operators and the Weyl Problem, Symposium of the Research Inst. of Math. Sci., Kyoto University, (1979). 122. Some Open Problems About Coulomb Systems, in Proceedings of the Lausanne 1979 Conference of the International Association of Mathematical Physics, Springer Lecture Notes in Physics, 116, 91-102 (1980). t 123. The Number of Bound States of One-Body Schrodinger Operators and the Weyl Problem, Proceedings of the Amer. Math. Soc. Symposia in Pure Math., 36, 241-252 (1980). 124. (with J. Frohlich, R. Israel and B. Simon) Phase Transitions and Reflection Positivity. II. Lattice Systems with Short-Range and Coulomb Interactions. J. Stat. Phys. 22, 297-347 (1980). (See no. 110.) 125. Why Matter is Stable, Chinese Jour. Phys. 17, 49-62 (1980). (English version of no. 120). t 126. A Refinement of Simon's Correlation Inequality, Commun. Math. Phys. 77, 127-135 (1980). 127. (with B. Simon) Pointwise Bounds on Eigenfunctions and Wave Packets in N-Body Quantum Systems. VI. Asymptotics in the Two-Cluster Region, Adv. in Appl. Math. 1, 324-343 (1980).
128. The Uncertainty Principle, article in Encyclopedia of Physics, R. Lerner and G. Trigg eds., Addison Wesley, 1078-1079 (1981). t 129. (with S. Oxford) An Improved Lower Bound on the Indirect Coulomb Energy, Int. J. Quant. Chem. 19, 427-439 (1981). 130. (with R. Benguria and H. Brezis) The Thomas-Fermi-von Weizsacker Theory of Atoms and Molecules, Commun. Math. Phys. 79, 167-180 (1981). 131. (with M. Aizenman) The Third Law of Thermodynamics and the Degeneracy of the Ground State for Lattice Systems, J. Stat. Phys. 24, 279-297 (1981). 132. (with J. Bricmont, J. Fontaine, J. Lebowitz and T. Spencer) Lattice Systems with a Continuous Symmetry III. Low Temperature Asymptotic Expansion for the Plane Rotator Model, Commun. Math. Phys. 78, 545-566 (1981). 133. (with A. Sokal) A General Lee-Yang Theorem for One-Component and
Multi-component Ferromagnets, Commun. Math. Phys. 80, 153-179 (1981). 134. Variational Principle for Many-Fermion Systems, Phys. Rev. Lett. 46, 457-459 (1981). Errata 47, 69 (1981). 135. Thomas-Fermi and Related Theories of Atoms and Molecules, in Rigorous Atomic and Molecular Physics, G. Velo and A. Wightman, eds., Plenum Press 213-308 (1981). 136. Thomas-Fermi and Related Theories of Atoms and Molecules, Rev. Mod. Phys. 53, 603-641 (1981). Errata 54, 311 (1982). (Revised version of no. 135.)
137. Statistical Theories of Large Atoms and Molecules, in Proceedings of the 1981 Oaxtepec conference on Recent Progress in Many-Body Theories, Springer Lecture Notes in Physics, 142, 336-343 (1982). 138. Statistical Theories of Large Atoms and Molecules, Comments Atomic and Mol. Phys. 11, 147-155 (1982). 139. Analysis of the Thomas-Fermi-von Weizsacker Equation for an Infinite Atom without Electron Repulsion, Commun. Math. Phys. 85, 15-25 (1982). 140. (with D.A. Liberman) Numerical Calculation of the Thomas-Fermi-von Weizsacker Function for an Infinite Atom without Electron Repulsion, Los Alamos National Laboratory Report, LA-9186-MS (1982). 141. Monotonicity of the Molecular Electronic Energy in the Nuclear Coordinates, J. Phys. B.: At. Mol. Phys. 15, L63-L66 (1982). 142. Comment on "Approach to Equilibrium of a Boltzmann Equation Solution", Phys. Rev. Lett. 48, 1057 (1982). 143. Density Functionals for Coulomb Systems, in Physics as Natural Philosophy: Essays in honor of Laszlo Tisza on his 75th Birthday, A. Shimony and H. Feshbach eds., M.I.T. Press, 111-149 (1982). t 144. An L^p Bound for the Riesz and Bessel Potentials of Orthonormal Functions, J. Funct. Anal. 51, 159-165 (1983). t 145. (with H. Brezis) A Relation Between Pointwise Convergence of Functions and Convergence of Functionals, Proc. Amer. Math. Soc. 88, 486-490 (1983).
146. (with R. Benguria) A Proof of the Stability of Highly Negative Ions in the Absence of the Pauli Principle, Phys. Rev. Lett. 50, 1771-1774 (1983). t 147. Sharp Constants in the Hardy-Littlewood-Sobolev and Related Inequalities, Annals of Math. 118, 349-374 (1983). t 148. Density Functionals for Coulomb Systems (a revised version of no. 143), Int. Jour. Quant. Chem. 24, 243-277 (1983). An expanded version appears in Density Functional Methods in Physics, R. Dreizler and J. da Providencia eds., Plenum Nato ASI Series 123, 31-80 (1985). 149. The Significance of the Schrodinger Equation for Atoms, Molecules and Stars, lecture given at the Schrodinger Symposium, Dublin Institute of Advanced Studies, October 1983, unpublished Proceedings. 150. (with I. Daubechies) One Electron Relativistic Molecules with Coulomb Interaction, Commun. Math. Phys. 90, 497-510 (1983). 151. (with I. Daubechies) Relativistic Molecules with Coulomb Interaction, in Differential Equations, Proc. of the Conference held at the University of Alabama in Birmingham, 1983, I. Knowles and R. Lewis eds., Math. Studies Series, 92, 143-148 North-Holland (1984). 152. Some Vector Field Equations, in Differential Equations, Proc. of the Conference held at the University of Alabama in Birmingham, 1983, I. Knowles and R. Lewis eds., Math. Studies Series 92, 403-412 North-Holland (1984). t 153. On the Lowest Eigenvalue of the Laplacian for the Intersection of Two Domains, Inventiones Math. 74, 441-448 (1983). 154. (with J. Chayes and L. Chayes) The Inverse Problem in Classical Statistical Mechanics, Commun. Math. Phys. 93, 57-121 (1984). t 155. On Characteristic Exponents in Turbulence, Commun. Math. Phys. 92, 473-480 (1984). 156. Atomic and Molecular Negative Ions, Phys. Rev. Lett. 52, 315-317 (1984). 157. Bound on the Maximum Negative Ionization of Atoms and Molecules, Phys. Rev. 29A, 3018-3028 (1984). 158. (with W. Thirring) Gravitational Collapse in Quantum Mechanics with Relativistic Kinetic Energy, Annals of Phys. (N.Y.) 155, 494-512 (1984).
159. (with I.M. Sigal, B. Simon and W. Thirring) Asymptotic Neutrality of Large-Z Ions, Phys. Rev. Lett. 52, 994-996 (1984). (See no. 185.)
160. (with R. Benguria) The Most Negative Ion in the Thomas-Fermi-von Weizsacker Theory of Atoms and Molecules, J. Phys. B: At. Mol. Phys. 18, 1045-1059 (1985). t 161. (with H. Brezis) Minimum Action Solutions of Some Vector Field Equations, Commun. Math. Phys. 96, 97-113 (1984). t 162. (with H. Brezis) Sobolev Inequalities with Remainder Terms, J. Funct. Anal. 62, 73-86 (1985). t 163. Baryon Mass Inequalities in Quark Models, Phys. Rev. Lett. 54, 1987-1990 (1985).
164. (with J. Frohlich and M. Loss) Stability of Coulomb Systems with Magnetic Fields I. The One-Electron Atom, Commun. Math. Phys. 104, 251-270 (1986).
t
165. (with M. Loss) Stability of Coulomb Systems with Magnetic Fields II. The Many-Electron Atom and the One-Electron Molecule, Commun. Math. Phys. 104, 271-282 (1986). 166. (with W. Thirring) Universal Nature of van der Waals Forces for Coulomb Systems, Phys. Rev. A 34, 40-46 (1986). 167. Some Ginzburg-Landau Type Vector-Field Equations, in Nonlinear systems of Partial Differential Equations in Applied Mathematics, B. Nicolaenko, D. Holm and J. Hyman eds., Amer. Math. Soc. Lectures in Appl. Math. 23, Part 2, 105-107 (1986). 168. (with I. Affleck) A Proof of Part of Haldane's Conjecture on Spin Chains, Lett. Math. Phys. 12, 57-69 (1986). 169. (with H. Brezis and J-M. Coron) Estimations d'Energie pour des Applications de R^3 a Valeurs dans S^2, C.R. Acad. Sci. Paris 303 Ser. 1, 207-210 (1986). 170. (with H. Brezis and J-M. Coron) Harmonic Maps with Defects, Commun. Math. Phys. 107, 649-705 (1986). 171. Some Fundamental Properties of the Ground States of Atoms and Molecules, in Fundamental Aspects of Quantum Theory, V. Gorini and A.
t
Frigerio eds., Nato ASI Series B, Vol. 144, 209-214, Plenum Press (1986). 172. (with T. Kennedy) A Model for Crystallization: A Variation on the Hubbard Model, in Statistical Mechanics and Field Theory: Mathematical Aspects, Springer Lecture Notes in Physics 257, 1-9 (1986). 173. (with T. Kennedy) An Itinerant Electron Model with Crystalline or Magnetic Long Range Order, Physica 138A, 320-358 (1986). 174. (with T. Kennedy) A Model for Crystallization: A Variation on the Hubbard Model, Physica 140A, 240-250 (1986) (Proceedings of IUPAP Statphys 16, Boston). 175. (with T. Kennedy) Proof of the Peierls Instability in One Dimension, Phys. Rev. Lett. 59, 1309-1312 (1987). 176. (with I. Affleck, T. Kennedy and H. Tasaki) Rigorous Results on Valence-Bond Ground States in Antiferromagnets, Phys. Rev. Lett. 59, 799-802 (1987). 177. (with H.-T. Yau) The Chandrasekhar Theory of Stellar Collapse as the Limit of Quantum Mechanics, Commun. Math. Phys. 112, 147-174 (1987). 178. (with H.-T. Yau) A Rigorous Examination of the Chandrasekhar Theory of Stellar Collapse, Astrophys. Jour. 323, 140-144 (1987). 179. (with F. Almgren) Singularities of Energy Minimizing Maps from the Ball to the Sphere, Bull. Amer. Math. Soc. 17, 304-306 (1987). (See no. 190.) 180. Bounds on Schrodinger Operators and Generalized Sobolev Type Inequalities, Proceedings of the International Conference on Inequalities, University of Birmingham, England, 1987, Marcel Dekker Lecture Notes in Pure and Appl. Math., W.N. Everitt ed., volume 129, pages 123-133 (1991). 181. (with I. Affleck, T. Kennedy and H. Tasaki) Valence Bond Ground States in Isotropic Quantum Antiferromagnets, Commun. Math. Phys. 115, 477-528 (1988).
182. (with T. Kennedy and H. Tasaki) A Two Dimensional Isotropic Quantum Antiferromagnet with Unique Disordered Ground State, J. Stat. Phys. 53, 383-416 (1988). 183. (with T. Kennedy and S. Shastry) Existence of Neel Order in Some Spin 1/2 Heisenberg Antiferromagnets, J. Stat. Phys. 53, 1019-1030 (1988). 184. (with T. Kennedy and S. Shastry) The X Y Model has Long-Range Order for all Spins and all Dimensions Greater than One, Phys. Rev. Lett. 61, 2582-2584 (1988). 185. (with I.M. Sigal, B. Simon and W. Thirring) Approximate Neutrality of Large-Z Ions, Commun. Math. Phys. 116, 635-644 (1988). (See no. 159.) 186. (with H.-T. Yau) The Stability and Instability of Relativistic Matter, Commun. Math. Phys. 118, 177-213 (1988). 187. (with H.-T. Yau) Many-Body Stability Implies a Bound on the Fine Structure Constant, Phys. Rev. Lett. 61, 1695-1697 (1988). 188. (with J. Conlon and H.-T. Yau) The N^{7/5} Law for Charged Bosons, Commun. Math. Phys. 116, 417-448 (1988). t 189. (with F. Almgren and W. Browder) Co-area, Liquid Crystals, and Minimal Surfaces, in Partial Differential Equations, S.S. Chern ed., Springer Lecture Notes in Math. 1306, 1-22 (1988). 190. (with F. Almgren) Singularities of Energy Minimizing Maps from the Ball to the Sphere: Examples, Counterexamples and Bounds, Ann. of Math. 128, 483-530 (1988). t 191. (with F. Almgren) Counting Singularities in Liquid Crystals, in IXth International Congress on Mathematical Physics, B. Simon, A. Truman, I.M. Davies eds., Hilger, 396-409 (1989). This also appears in: Symposia Mathematica, vol. XXX, Ist. Naz. Alta Matem. Francesco Severi Roma, 103-118, Academic Press (1989); Variational Methods, H. Berestycki, J-M. Coron, I. Ekeland eds., Birkhauser, 17-36 (1990); How many singularities can there be in an energy minimizing map from the ball to the sphere?, in Ideas and Methods in Mathematical Analysis, Stochastics, and Applications, S. Albeverio, J.E. Fenstad, H. Holden, T. Lindstrom eds., Cambridge Univ. Press, vol. 1, 394-408 (1992). t 192. (with F. Almgren) Symmetric Decreasing Rearrangement can be Discontinuous, Bull. Amer. Math. Soc. 20, 177-180 (1989). t 193. (with F. Almgren) Symmetric Decreasing Rearrangement is Sometimes Continuous, Jour. Amer. Math. Soc. 2, 683-773 (1989). A summary of this work (using 'rectifiable currents') appears as The (Non)continuity of Symmetric Decreasing Rearrangement in Symposia Mathematica, vol. XXX, Ist. Naz. Alta Matem. Francesco Severi Roma, 89-102, Academic Press (1989) and in Variational Methods, H. Berestycki, J-M. Coron, I. Ekeland eds., Birkhauser, 3-16 (1990). t 194. Two Theorems on the Hubbard Model, Phys. Rev. Lett. 62, 1201-1204 (1989). Errata 62, 1927 (1989). 195. (with J. Conlon and H.-T. Yau) The Coulomb gas at Low Temperature and Low Density, Commun. Math. Phys. 125, 153-180 (1989).
t
196. Gaussian Kernels have only Gaussian Maximizers, Invent. Math. 102, 179-208 (1990).
t 197. Kinetic Energy Bounds and their Application to the Stability of Matter, in Schrodinger Operators, Proceedings Sonderborg Denmark 1988, H. Holden and A. Jensen eds., Springer Lecture Notes in Physics 345, 371-382 (1989). Expanded version of no. 180. 198. The Stability of Matter: From Atoms to Stars, 1989 Gibbs Lecture, Bull. Amer. Math. Soc. 22, 1-49 (1990). 199. Integral Bounds for Radar Ambiguity Functions and Wigner Distributions, J. Math. Phys. 31, 594-599 (1990). 200. On the Spectral Radius of the Product of Matrix Exponentials, Linear Alg. and Appl. 141, 271-273 (1990). 201. (with M. Aizenman) Magnetic Properties of Some Itinerant-Electron Systems at T > 0, Phys. Rev. Lett. 65, 1470-1473 (1990). 202. (with H. Siedentop) Convexity and Concavity of Eigenvalue Sums, J. Stat. Phys. 63, 811-816 (1991). 203. (with J.P. Solovej) Quantum Coherent Operators: A Generalization of Coherent States, Lett. Math. Phys. 22, 145-154 (1991). 204. The Flux-Phase Problem on Planar Lattices, Helv. Phys. Acta 65, 247-255 (1992). Proceedings of the conference "Physics in Two Dimensions", Neuchatel, August 1991.
205. Atome in starken Magnetfeldern, Physikalische Blatter 48, 549-552 (1992). Translation by H. Siedentop of the Max-Planck medal lecture (1 April 1992) "Atoms in strong magnetic fields". 206. Absence of Ferromagnetism for One-Dimensional Itinerant Electrons, in Probabilistic Methods in Mathematical Physics, Proceedings of the International Workshop Siena, May 1991, F. Guerra, M. Loffredo and C. Marchioro eds., World Scientific pp. 290-294 (1992). A shorter version appears in Rigorous Results in Quantum Dynamics, J. Dittrich and P. Exner eds., World Scientific, pp. 243-245 (1991). 207. (with J.P. Solovej and J. Yngvason) Heavy Atoms in the Magnetic Field of a Neutron Star, Phys. Rev. Lett. 69, 749-752 (1992). 208. (with J.P. Solovej) Atoms in the Magnetic Field of a Neutron Star, in Differential Equations with Applications to Mathematical Physics, W.F. Ames, J.V. Herod and E.M. Harrell II eds., Academic Press, pages 221-237 (1993). Also in Spectral Theory and Scattering Theory and Applications, K. Yajima, ed., Advanced Studies in Pure Math. 23, 259-274, Math. Soc. of Japan, Kinokuniya (1994). This is a summary of nos. 215, 216. Earlier summaries also appear in: (a) Methodes Semi-Classiques, Colloque international (Nantes 1991), Asterisque 210, 237-246 (1991); (b) Some New Trends on Fluid Dynamics and Theoretical Physics, C.C. Lin and N. Hu eds., 149-157, Peking University Press (1993); (c) Proceedings of the International Symposium on Advanced Topics of Quantum Physics, Shanxi, J.Q. Lang, M.L. Wang, S.N. Qiao and D.C. Su eds., 5-13, Science Press, Beijing (1993).
209. (with M. Loss and R. McCann) Uniform Density Theorem for the Hubbard Model, J. Math. Phys. 34, 891-898 (1993). 210. Remarks on the Skyrme Model, in Proceedings of the Amer. Math. Soc. Symposia in Pure Math. 54, part 2, 379-384 (1993). (Proceedings of Summer Research Institute on Differential Geometry at UCLA, July 8-28, 1990.)
t 211. (with E. Carlen) Optimal Hypercontractivity for Fermi Fields and Related Noncommutative Integration Inequalities, Commun. Math. Phys. 155, 27-46 (1993). 212. (with E. Carlen) Optimal Two-Uniform Convexity and Fermion Hypercontractivity, in Quantum and Non-Commutative Analysis, Proceedings of June, 1992 Kyoto Conference, H. Araki et al. eds., Kluwer (1993), pp. 93-111. (Condensed version of no. 211.) 213. (with M. Loss) Fluxes, Laplacians and Kasteleyn's Theorem, Duke Math. Journal 71, 337-363 (1993). 214. (with V. Bach, R. Lewis and H. Siedentop) On the Number of Bound States of a Bosonic N-Particle Coulomb System, Zeits. f. Math. 214, 441-460 (1993). 215. (with J.P. Solovej and J. Yngvason) Asymptotics of Heavy Atoms in High Magnetic Fields: I. Lowest Landau Band Region, Commun. Pure Appl. Math. 47, 513-591 (1994). 216. (with J.P. Solovej and J. Yngvason) Asymptotics of Heavy Atoms in High Magnetic Fields: II. Semiclassical Regions, Commun. Math. Phys. 161, 77-124 (1994). 217. (with V. Bach, M. Loss and J.P. Solovej) There are No Unfilled Shells in Unrestricted Hartree-Fock Theory, Phys. Rev. Lett. 72, 2981-2983 (1994). t 218. (with K. Ball and E. Carlen) Sharp Uniform Convexity and Smoothness Inequalities for Trace Norms, Invent. Math. 115, 463-482 (1994). t 219. Coherent States as a Tool for Obtaining Rigorous Bounds, Proceedings of the Symposium on Coherent States, past, present and future, Oak Ridge, D.H. Feng, J. Klauder and M.R. Strayer eds., World Scientific (1994), pages 267-278. 220. The Hubbard model - Some Rigorous Results and Open Problems, in Proceedings of 1993 conference in honor of G.F. Dell'Antonio, Advances in Dynamical Systems and Quantum Physics, S. Albeverio et al. eds., pp. 173-193, World Scientific (1995). A revised version appears in Proceedings of 1993 NATO ASI The Hubbard Model, D. Baeriswyl et al. eds., pp. 1-19, Plenum Press (1995).
A further revision appears in Proceedings of the XIth International Congress of Mathematical Physics, Paris, 1994, D. Iagolnitzer ed., pp. 392-412, International Press (1995). 221. (with V. Bach and J.P. Solovej) Generalized Hartree-Fock Theory of the Hubbard Model, J. Stat. Phys. 76, 3-90 (1994). 222. The Flux Phase of the Half-Filled Band, Phys. Rev. Lett. 73, 2158-2161 (1994). t 223. (with M. Loss) Symmetry of the Ginzburg-Landau Minimizer in a Disc, Math. Res. Lett. 1, 701-715 (1994).
224. (with J.P. Solovej and J. Yngvason) Quantum Dots, in Proceedings of the Conference on Partial Differential Equations and Mathematical Physics, University of Alabama, Birmingham, 1994, I. Knowles, ed., International Press (1995), pages 157-172. 225. (with J.P. Solovej and J. Yngvason) Ground States of Large Quantum Dots in Magnetic Fields, Phys. Rev. B 51, 10646-10665 (1995). 226. (with J. Freericks) The Ground State of a General Electron-Phonon Hamiltonian is a Spin Singlet, Phys. Rev. B 51, 2812-2821 (1995). 227. (with B. Nachtergaele) The Stability of the Peierls Instability for Ring Shaped Molecules, Phys. Rev. B 51, 4777-4791 (1995). 228. (with B. Nachtergaele) Dimerization in Ring-Shaped Molecules: The Stability of the Peierls Instability in Proceedings of the XIth International Congress of Mathematical Physics, Paris, 1994, D. Iagolnitzer ed., pp. 423-431, International Press (1995). 229. (with B. Nachtergaele) Bond Alternation in Ring-Shaped Molecules: The Stability of the Peierls Instability. In Proceedings of the conference The
Chemical Bond, Copenhagen 1994, Int. J. Quant. Chem. 58, 699-706 (1996). 230. Fluxes and Dimers in the Hubbard Model, in Proceedings of the International Congress of Mathematicians, Zurich, 1994, S.D. Chatterji ed., vol. 2, pp. 1279-1280, Birkhauser (1995). 231. (with M. Loss and J.P. Solovej) Stability of Matter in Magnetic Fields, Phys. Rev. Lett. 75, 985-989 (1995). 232. (with O.J. Heilmann) Electron Density near the Nucleus of a large Atom, Phys. Rev. A 52, 3628-3643 (1995). 233. (with A. Iantchenko and H. Siedentop) Proof of a Conjecture about Atomic and Molecular Cores Related to Scott's Correction, J. reine u. ang. Math. 472, 177-195 (1996). 234. (with L. Thomas) Exact Ground State Energy of the Strong-Coupling Polaron, Commun. Math. Phys. 183, 511-519 (1997). Errata 188, 499-500 (1997). t 235. (with L. Caffarelli and D. Jerison) On the Case of Equality in the Brunn-Minkowski Inequality for Capacity, Adv. in Math. 117, 193-207 (1996). 236. (with M. Loss and H. Siedentop) Stability of Relativistic Matter via Thomas-Fermi Theory, Helv. Phys. Acta 69, 974-984 (1996). 237. Some of the Early History of Exactly Soluble Models, in Proceedings of the 1996 Northeastern University conference on Exactly Soluble Models, Int. Jour. Mod. Phys. B 11, 3-10 (1997). 238. (with H. Siedentop and J.P. Solovej) Stability and Instability of Relativistic Electrons in Magnetic Fields, J. Stat. Phys. 89, 37-59 (1997). 239. (with H. Siedentop and J.P. Solovej) Stability of Relativistic Matter with Magnetic Fields, Phys. Rev. Lett. 79, 1785-1788 (1997). 240. Stability of Matter in Magnetic Fields, in Proceedings of the Conference on Unconventional Quantum Liquids, Evora, Portugal, 1996 Zeits. f. Phys. B 933, 271-274 (1997).
241. Birmingham in the Good Old Days, in Proceedings of the Conference on Unconventional Quantum Liquids, Evora, Portugal, 1996 Zeits. f. Phys. B 933, 125-126 (1997). 242. (with M. Loss) book Analysis, American Mathematical Society (1997). 243. Doing Math with Fred, in In Memoriam Frederick J. Almgren Jr., 1933-1997, Experimental Math. 6, 2-3 (1997). 244. (with J.P. Solovej and J. Yngvason) Asymptotics of Natural and Artificial Atoms in Strong Magnetic Fields, in The Stability of Matter: From Atoms to Stars, Selecta of E. H. Lieb, W. Thirring ed., second edition, Springer Verlag, pp. 145-167 (1997). This is a summary of nos. 207, 208, 215, 216, 224, 225. 245. Stability and Instability of Relativistic Electrons in Classical Electromagnetic Fields, in Proceedings of Conference on Partial Differential Equations and Mathematical Physics, Georgia Inst. of Tech., March, 1997, Amer. Math. Soc. Contemporary Math. series, E. Carlen, E. Harrell, M. Loss eds., 217, 99-108 (1998). 246. (with J. Yngvason) Ground State Energy of the Low Density Bose Gas, Phys. Rev. Lett. 80, 2504-2507 (1998). arXiv math-ph/9712138, mp-arc 97-631. 247. (with J. Yngvason) A guide to Entropy and the Second Law of Thermodynamics, Notices of the Amer. Math. Soc. 45, 571-581 (1998). arXiv math-ph/9805005, mp-arc 98-339. http://www.ams.org/notices/199805/lieb.pdf. See no. 266. This paper received the American Mathematical Society 2002 Levi Conant prize for "the best expository paper published in either the Notices of the AMS or the Bulletin of the AMS in the preceding five years". t 248. (with D. Hundertmark and L.E. Thomas) A Sharp Bound for an Eigenvalue Moment of the One-Dimensional Schroedinger Operator, Adv. Theor. Math. Phys. 2, 719-731 (1998). arXiv math-ph/9806012, mp-arc 98-753. t 249. (with E. Carlen) A Minkowski Type Trace Inequality and Strong Subadditivity of Quantum Entropy, in Amer. Math. Soc. Transl. (2), 189, 59-69 (1999). 250. (with J.
Yngvason) The Physics and Mathematics of the Second Law of Thermodynamics, Physics Reports 310, 1-96 (1999). arXiv cond-mat/9708200, mp-arc 97-457. 251. Some Problems in Statistical Mechanics that I would like to see Solved, 1998 IUPAP Boltzmann prize lecture, Physica A 263, 491-499 (1999). 252. (with P. Schupp) Ground State Properties of a Fully Frustrated Quantum Spin System, Phys. Rev. Lett. 83, 5362-5365 (1999). arXiv math-ph/9908019, mp-arc 99-304. 253. (with P. Schupp) Singlets and Reflection Symmetric Spin Systems, Physica A 279, 378-385 (2000). arXiv math-ph/9910037, mp-arc 99-404. 254. (with R. Seiringer and J. Yngvason) Bosons in a Trap: A Rigorous Derivation of the Gross-Pitaevskii Energy Functional, Phys. Rev. A 61, 043602-1 - 043602-13 (2000). arXiv math-ph/9908027, mp-arc 99-312. 255. (with J. Yngvason) The Ground State Energy of a Dilute Bose Gas, in Differential Equations and Mathematical Physics, University of Alabama,
Birmingham, 1999, R. Weikard and G. Weinstein, eds., 295-306, Internat. Press (2000). arXiv math-ph/9910033, mp-arc 99-401. 256. (with M. Loss) Self-Energy of Electrons in Non-perturbative QED, in Differential Equations and Mathematical Physics, University of Alabama, Birmingham, 1999, R. Weikard and G. Weinstein, eds. 279-293, Amer. Math. Soc./Internat. Press (2000). arXiv math-ph/9908020, mp-arc 99-305. 257. (with R. Seiringer and J. Yngvason) The Ground State Energy and Density of Interacting Bosons in a Trap, in Quantum Theory and Symmetries, Goslar, 1999, H.-D. Doebner, V.K. Dobrev, J.-D. Hennig and W. Luecke, eds., pp. 101-110, World Scientific (2000). arXiv math-ph/9911026, mp-arc 99-439. 258. (with J. Yngvason) The Ground State Energy of a Dilute Two-dimensional Bose Gas, J. Stat. Phys. 103, 509-526 (2001). arXiv math-ph/0002014, mp-arc 00-63.
259. (with J. Yngvason) A Fresh Look at Entropy and the Second Law of Thermodynamics, Physics Today 53, 32-37 (April 2000). arXiv math-ph/0003028, mp-arc 00-123. See also 53, 11-14, 106 (October 2000).
260. Lieb-Thirring Inequalities, in Encyclopaedia of Mathematics, Supplement vol. 2, pp. 311-313, Kluwer (2000). arXiv math-ph/0003039, mp-arc 00-132.
261. Thomas-Fermi Theory, in Encyclopaedia of Mathematics, Supplement vol. 2, pp. 455-457, Kluwer (2000). arXiv math-ph/0003040, mp-arc 00-131.
262. (with H. Siedentop) Renormalization of the Regularized Relativistic Electron-Positron Field, Commun. Math. Phys. 213, 673-684 (2000). arXiv math-ph/0003001, mp-arc 00-98.
263. (with R. Seiringer and J. Yngvason) A Rigorous Derivation of the Gross-Pitaevskii Energy Functional for a Two-dimensional Bose Gas, Commun. Math. Phys. 224, 17-31 (2001). arXiv cond-mat/0005026, mp-arc 00-203.
264. (with M. Griesemer and M. Loss) Ground States in Non-relativistic Quantum Electrodynamics, Invent. Math. 145, 557-595 (2001). arXiv math-ph/0007014, mp-arc 00-313.
265. (with J.P. Solovej) Ground State Energy of the One-Component Charged Bose Gas, Commun. Math. Phys. 217, 127-163 (2001). Errata 225, 219-221 (2002). arXiv cond-mat/0007425, mp-arc 00-303.
266. (with J. Yngvason) The Mathematics of the Second Law of Thermodynamics, in Visions in Mathematics, Towards 2000, N. Alon, J. Bourgain, A. Connes, M. Gromov and V. Milman, eds., GAFA 2000, no. 1, Birkhauser, pp. 334-358 (2000). See no. 247. mp-arc 00-332.
267. The Bose Gas: A Subtle Many-Body Problem, in Proceedings of the XIII International Congress on Mathematical Physics, London, A. Fokas, et al. eds., International Press, pp. 91-111, 2001. arXiv math-ph/0009009, mp-arc 00-351.
268. (with J. Freericks and D. Ueltschi) Segregation in the Falicov-Kimball Model, Commun. Math. Phys. 227, 243-279 (2002). arXiv math-ph/0107003, mp-arc 01-243.
269. (with G.K. Pedersen) Convex Multivariable Trace Functions, Reviews in Math. Phys. 14, 1-18 (2002). arXiv math.OA/0107062.
270. (with J. Freericks and D. Ueltschi) Phase Separation due to Quantum Mechanical Correlations, Phys. Rev. Lett. 88, #106401 (2002). arXiv cond-mat/0110251.
271. (with M. Loss) Stability of a Model of Relativistic Quantum Electrodynamics, Commun. Math. Phys. 228, 561-588 (2002). arXiv math-ph/0109002, mp-arc 01-315.
272. (with M. Loss) A Bound on Binding Energies and Mass Renormalization in Models of Quantum Electrodynamics, J. Stat. Phys. 108, 1057-1069 (2002). arXiv math-ph/0110027.
273. (with R. Seiringer) Proof of Bose-Einstein Condensation for Dilute Trapped Gases, Phys. Rev. Lett. 88, #170409 (2002). arXiv math-ph/0112032, mp-arc 02-115.
274. (with M. Loss) Stability of Matter in Relativistic Quantum Mechanics, in Mathematical Results in Quantum Mechanics, Proceedings of QMath8, Taxco, Amer. Math. Soc. Contemporary Mathematics series, pp. 225-238, 2002.
275. (with J. Yngvason) The Mathematical Structure of the Second Law of Thermodynamics, in Contemporary Developments in Mathematics 2001, International Press (in press). arXiv math-ph/0204007.
276. (with R. Seiringer, J.P. Solovej and J. Yngvason) The Ground State of the Bose Gas, in Contemporary Developments in Mathematics 2001, International Press (in press). arXiv math-ph/0204027, mp-arc 02-183.
277. (with R. Seiringer and J. Yngvason) Poincare Inequalities in Punctured Domains, Annals of Math (in press). arXiv math.FA/0205088.
278. (with R. Seiringer and J. Yngvason) Superfluidity in Dilute Trapped Bose Gases, Phys. Rev. B 66, #134529 (2002). arXiv cond-mat/0205570, mp-arc 02-339.
279. (with F.Y. Wu) The one-dimensional Hubbard model: A reminiscence, Physica A (in press). arXiv cond-mat/0207529.
280. (with E. Eisenberg) Polarization of interacting bosons with spin, Phys. Rev. Lett. 89, #220403 (2002). arXiv cond-mat/0207042, mp-arc 02-446.
281. The Stability of Matter and Quantum Electrodynamics, Proceedings of the Heisenberg symposium, Munich, Dec. 2001, Springer (in press).