Advanced Courses in Mathematics CRM Barcelona
Vladimir Temlyakov
Sparse Approximation with Bases
Advanced Courses in Mathematics CRM Barcelona Centre de Recerca Matemàtica Managing Editor: Carles Casacuberta
More information about this series at http://www.springer.com/series/5038
Vladimir Temlyakov
Sparse Approximation with Bases Editor for this volume: Sergey Tikhonov, ICREA and CRM, Barcelona
Vladimir Temlyakov Department of Mathematics University of South Carolina Columbia, SC, USA
ISSN 2297-0304 ISSN 2297-0312 (electronic) Advanced Courses in Mathematics - CRM Barcelona ISBN 978-3-0348-0889-7 ISBN 978-3-0348-0890-3 (eBook) DOI 10.1007/978-3-0348-0890-3 Library of Congress Control Number: 2015935236 Mathematics Subject Classification (2010): Primary: 41A65, 41A25, 42A10; Secondary: 46B20 Springer Basel Heidelberg New York Dordrecht London © Springer Basel 2015 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper Springer Basel AG is part of Springer Science+Business Media (www.birkhauser-science.com)
Contents

Preface

1 Introduction
  1.1 General setting of approximation problems
  1.2 Existence and uniqueness of best approximation
  1.3 Schauder bases in Banach spaces
  1.4 Unconditional bases

2 Lebesgue-type Inequalities for Greedy Approximation with Respect to Some Classical Bases
  2.1 Introduction
  2.2 The trigonometric system
  2.3 Wavelet bases
  2.4 Greedy bases
  2.5 Some examples
    2.5.1 Unconditionality does not imply democracy
    2.5.2 Democracy does not imply unconditionality
    2.5.3 Superdemocracy does not imply unconditionality
    2.5.4 A quasi-greedy basis is not necessarily an unconditional basis
  2.6 Further results
    2.6.1 Direct and inverse theorems
    2.6.2 Greedy approximation in L1 and L∞
  2.7 Some inequalities for the tensor product of greedy bases

3 Quasi-greedy Bases and Lebesgue-type Inequalities
  3.1 Introduction
  3.2 Properties of quasi-greedy bases
  3.3 Construction of quasi-greedy bases
  3.4 Uniformly bounded quasi-greedy systems
  3.5 Lebesgue-type inequalities for quasi-greedy bases
  3.6 Lebesgue-type inequalities for uniformly bounded quasi-greedy bases
  3.7 Lebesgue-type inequalities for uniformly bounded orthonormal quasi-greedy bases

4 Almost Greedy Bases and Duality
  4.1 Introduction
  4.2 Greedy conditions for bases
  4.3 Democratic and conservative bases
  4.4 Bidemocratic bases
  4.5 Duality of almost greedy bases

5 Greedy Approximation with Respect to the Trigonometric System
  5.1 Introduction
  5.2 Convergence. Conditions on Fourier coefficients
    5.2.1 Introduction
    5.2.2 Sufficient conditions in terms of Fourier coefficients. Proof of Theorem 5.2.1
    5.2.3 Sufficient conditions in terms of the decreasing rearrangement of Fourier coefficients. Proof of Theorem 5.2.2
    5.2.4 Convergence in the uniform norm. Proof of Theorems 5.2.3–5.2.5
  5.3 Convergence. Conditions on greedy approximants
    5.3.1 Introduction
    5.3.2 Some inequalities
    5.3.3 Sufficient conditions in the case p ∈ (2, ∞)
    5.3.4 Necessary conditions in the case p ∈ (2, ∞)
    5.3.5 Necessary and sufficient conditions in the case p = ∞
  5.4 An application of WCGA
    5.4.1 Convergence
    5.4.2 Rate of approximation
    5.4.3 Constructive approximation of function classes
  5.5 Constructive nonlinear trigonometric m-term approximation

6 Greedy Approximation with Respect to Dictionaries
  6.1 Introduction
  6.2 The Weak Chebyshev Greedy Algorithm
  6.3 Relaxation. Co-convex approximation
  6.4 Free relaxation
  6.5 Fixed relaxation
  6.6 Relaxation. X-greedy algorithms
  6.7 Greedy expansions
    6.7.1 Introduction
    6.7.2 Convergence of the Dual-Based Expansion
    6.7.3 A modification of the Weak Dual Greedy Algorithm
    6.7.4 Convergence of WDGA

7 Appendix
  7.1 Lp-spaces and some inequalities
    7.1.1 Modulus of continuity
    7.1.2 Some inequalities
  7.2 Duality in Lp spaces
  7.3 Fourier series of functions in Lp
  7.4 Trigonometric polynomials
  7.5 Bernstein–Nikol’skii Inequalities. The Marcinkiewicz Theorem

Bibliography
Preface
The last decade has seen great progress in the study of nonlinear approximation, which was motivated by numerous applications. Nonlinear approximation is important in applications because of its concise representations and increased computational efficiency. Two types of nonlinear approximation are frequently employed in applications. Adaptive methods are used in PDE solvers, while m-term approximation, considered here, is used in image/signal/data processing, as well as in the design of neural networks. Another name for m-term approximation is sparse approximation. Sparse approximations (representations) of functions are not only a powerful analytic tool, but they are utilized in many application areas, such as image/signal processing and numerical computation. The fundamental question of nonlinear approximation is how to devise good constructive methods (algorithms). This problem has two levels of nonlinearity. The first level of nonlinearity is m-term approximation with respect to bases. In this problem one can use the unique function expansion with respect to a given basis to build an approximant. Nonlinearity enters by looking for m-term approximants with terms (i.e., basis elements in approximant) allowed to depend on the given function. Since the elements of the basis used in the m-term approximation are allowed to depend on the function being approximated, this type of approximation is very efficient. This idea is utilized in the method termed the Thresholding Greedy Algorithm, discussed in detail in this book. We focus on the following fundamental question. Which bases are suitable for the use of the Thresholding Greedy Algorithm (TGA)? By answering this question we introduce several new concepts of bases of a Banach space X: greedy bases, quasi-greedy bases, almost greedy bases. 
The greedy bases are the best for application of TGA for sparse approximation: for any f ∈ X, TGA provides after m iterations an approximation with an error of the same order as the best m-term approximation of f. If a basis Ψ is quasi-greedy, then it merely guarantees that for any f ∈ X TGA provides approximants that converge to f, but does not guarantee the rate of convergence. This gives rise to the following question: Can we bound the error of TGA’s m-th approximation by the best m-term approximation with an extra multiplier, say, C(m)? If yes, what is the best C(m) for a given basis? The above questions lead to the concept of Lebesgue-type inequalities. We discuss them in detail.
On the second level of nonlinearity, we replace a basis by a more general system which is not necessarily minimal (for example, redundant system, dictionary). This setting is much more complicated than the first one (bases case); however, there is a solid justification of the importance of redundant systems in both theoretical questions and practical applications. We only give here an introduction to this important area of research and refer the reader to the book [82] for further results. Recent results have established that greedy-type algorithms are suitable methods of nonlinear approximation in both m-term approximation with respect to bases and m-term approximation with respect to redundant systems. In this book we study properties of specific methods of approximation that belong to a family of greedy approximation methods (greedy algorithms). These methods allow us to build sparse representations in an economical way. We use a concept of sparsity to specify the form of an approximant. It is now well understood that we need to study nonlinear sparse approximations (representations) in order to significantly increase our ability to process (compress, denoise, etc.) large data sets. The book is organized as follows. In Chapter 1 we discuss a general setting of approximation problems and give some classical results on existence and uniqueness of a best approximant in the linear setting. We also give some existence results for best m-term approximation. In this chapter we present some classical results on Schauder bases and on unconditional bases. These results are often used in the book. In Chapter 2 we present first results on greedy-type bases following a historical path. We begin with Lebesgue-type inequalities for the trigonometric system and for the Haar basis. In particular, we prove that the Haar basis is a greedy basis for the Lp spaces when p ∈ (1, ∞). Then we prove that a basis is greedy if and only if it is unconditional and democratic. 
We discuss some further properties of greedy-type bases. In Chapter 3 we focus on studying quasi-greedy bases. Quasi-greedy bases are close to unconditional bases: for unconditional bases every permutation of an expansion series of f converges to f while for quasi-greedy bases decreasing (in the sense of magnitudes of coefficients) permutations converge. However, it turns out that some classical spaces do not have unconditional bases, but have quasi-greedy bases. Also, some classical spaces do not have uniformly bounded unconditional bases, but have uniformly bounded quasi-greedy bases. This emphasizes that quasi-greedy bases deserve detailed study. We conduct such a study in Chapter 3. It is known that the idea of duality works very well in many areas of mathematics. In Chapter 4 we look at some greedy-type bases from the duality point of view. We establish there some theorems which imply that under some conditions the basic sequence dual (biorthogonal) to an almost-greedy (greedy) basis is also almost-greedy (greedy). Chapter 5 is an elaboration of the results on the trigonometric system from Chapter 2. Chapter 5 is probably the most technically involved chapter of the book. It is not a surprise for those who work with the trigonometric system. It turns out that the trigonometric system is a very good testing field for different greedy-type algorithms. We discuss this in
detail in Chapter 5. In Chapters 2–5 we mostly study the Thresholding Greedy Algorithm. It is very simple and convenient in applications. However, our study in Chapters 2 and 5 shows that it is not a good algorithm for the trigonometric system. It turns out that greedy algorithms designed for general dictionaries work well for the trigonometric system. We discuss applications of these algorithms to the trigonometric system in Chapter 5 and give an introduction to the theory of these algorithms in Chapter 6. In Chapters 1–6 we heavily use some classical results from analysis. For the reader’s convenience, we collect them in Chapter 7 (Appendix). The theory of greedy approximations and expansions has great potential for pedagogical applications. Greedy approximation is a very fresh area of research where a graduate student may begin his/her independent research at an early stage. Also, greedy approximation, although a theoretical subject, has many connections to applied and computational mathematics. This feature is very attractive for many graduate students. On top of this, greedy approximation is developing into a beautiful mathematical theory with deep connections to functional analysis, harmonic analysis, and geometry. The book is addressed to researchers working in numerical mathematics, harmonic analysis, and functional analysis. It quickly takes the reader from classical results to the frontier of the unknown, but is written at the level of a graduate course and does not require a broad background for understanding. The book could be used for designing different graduate courses.
Acknowledgement

This book is based on the series of advanced course lectures that the author gave at the Centre de Recerca Matemàtica in Barcelona, Spain, in November 2011. I am grateful to Joaquim Bruna and Sergei Tikhonov for organizing those lectures.
Chapter 1 Introduction
1.1 General setting of approximation problems

We will always consider approximation problems in a Banach space. We briefly recall the definition of a Banach space. Let $X$ be a linear (vector) space. We say that a nonnegative function $\|x\|$ defined for all $x \in X$ is a norm if it satisfies the following axioms:

(i) $\|x\| = 0 \iff x = 0$;
(ii) $\|x + y\| \le \|x\| + \|y\|$;
(iii) $\|\alpha x\| = |\alpha|\,\|x\|$.

A linear space equipped with a norm is called a normed linear space. We say that a normed linear space is complete if any Cauchy sequence in $X$ converges to an element of $X$.

Definition 1.1.1. A complete normed linear space is a Banach space.

Let us list some classical examples of Banach spaces.

B.1. The real line $\mathbb{R}$ with the norm $\|x\| := |x|$.

B.2. The space $\mathbb{R}^n$ with $x := (x_1, \dots, x_n)$,
\[ \|x\| := \|x\|_{\ell_2^n} := \Bigl(\sum_{i=1}^n |x_i|^2\Bigr)^{1/2}. \]
This space is usually denoted by $\ell_2^n$.

B.3. The space $\mathbb{R}^n$ with
\[ \|x\| := \|x\|_{\ell_p^n} := \Bigl(\sum_{i=1}^n |x_i|^p\Bigr)^{1/p}, \qquad 1 \le p < \infty. \]
This space is usually denoted by $\ell_p^n$.
© Springer Basel 2015 V. Temlyakov, Sparse Approximation with Bases, Advanced Courses in Mathematics - CRM Barcelona, DOI 10.1007/978-3-0348-0890-3_1
B.4. The space $\mathbb{R}^n$ with
\[ \|x\| := \|x\|_{\ell_\infty^n} := \max_i |x_i|. \]
This space is usually denoted by $\ell_\infty^n$.

B.5. The space of Lebesgue-measurable square-integrable functions on $[a, b]$ with the norm
\[ \|f\| := \|f\|_2 := \|f\|_{L_2} := \Bigl(\frac{1}{b-a}\int_a^b |f(x)|^2\,dx\Bigr)^{1/2}. \]
This space is usually denoted by $L_2(a, b)$.

B.6. The space of Lebesgue-measurable $p$-integrable functions on $[a, b]$ with the norm
\[ \|f\| := \|f\|_p := \|f\|_{L_p} := \Bigl(\frac{1}{b-a}\int_a^b |f(x)|^p\,dx\Bigr)^{1/p}, \qquad 1 \le p < \infty. \]
This space is usually denoted by $L_p(a, b)$.

B.7. The space of continuous functions on $[a, b]$ with the norm
\[ \|f\| := \|f\|_\infty := \|f\|_C := \sup_x |f(x)|. \]
This space is usually denoted by $C(a, b)$.

Let us proceed to the approximation problem setting. Let a set $A \subset X$ be given. For any $x \in X$ denote
\[ d(x, A) := d(x, A)_X := \inf_{a \in A} \|x - a\|, \]
the distance from $x$ to $A$, or, in other words, the best approximation of $x$ by elements from $A$ in the norm of $X$. For a set $F$ denote
\[ d(F, A) := d(F, A)_X := \sup_{x \in F} d(x, A)_X. \]

Let us give some examples of pairs $X$ and $A$.

A.1. Let $X = \ell_2^n$ and $A$ be a linear subspace $X_m$ of dimension $m$. Denote by $B_2^n$ the unit ball (Euclidean unit ball) of $\ell_2^n$. Then it is easy to see that for any $m < n$ we have $d(B_2^n, X_m) = 1$.

A.2. Let $X = \ell_2^n$ and $A$ be the set $N_m$ of all vectors with at most $m$ nonzero coordinates. Then it is easy to see that for any $m < n$ we have $d(B_2^n, N_m) = (1 - m/n)^{1/2}$.

A.3. Let $X = L_p(a, b)$, $1 \le p < \infty$, or $X = C(a, b)$ and $A$ be the set $P_n$ of algebraic polynomials of degree at most $n$. The very first results in Approximation Theory were obtained for this approximation. We formulate now a classical result of Weierstrass.
Theorem 1.1.2. For any $f \in C(a, b)$, we have
\[ d(f, P_n)_C \to 0 \quad \text{as} \quad n \to \infty. \]

We mention one more standard notation for the best approximation in this case: $e_n(f)_X := d(f, P_n)_X$. We note that $\dim P_n = n + 1$.

A.4. Let $X = L_p(0, 2\pi)$, $1 \le p < \infty$, or $X = C(0, 2\pi)$ be the set of $2\pi$-periodic functions. Consider $A$ to be the set $\mathcal{T}_n$ of complex trigonometric polynomials of order $n$ or the set $\mathcal{RT}_n$ of real trigonometric polynomials of order $n$:
\[ \mathcal{T}_n := \Bigl\{ t : t = \sum_{|k| \le n} c_k e^{ikx} \Bigr\}, \]
\[ \mathcal{RT}_n := \Bigl\{ t : t = a_0/2 + \sum_{k=1}^n (a_k \cos kx + b_k \sin kx) \Bigr\}. \]
This is another classical example in Approximation Theory. We will mention here one classical result of Jackson. For a Banach space $X$ of periodic functions we define the modulus of continuity of $f \in X$ by
\[ \omega(f, \delta)_X := \sup_{|y| \le \delta} \|f(\cdot + y) - f(\cdot)\|_X. \]

Theorem 1.1.3. Let $X = L_p(0, 2\pi)$, $1 \le p < \infty$, or $X = C(0, 2\pi)$ be the set of $2\pi$-periodic functions. Then for any $f \in X$ we have
\[ d(f, \mathcal{T}_n)_X \le C\,\omega(f, 1/n)_X \]
with an absolute constant $C$.

We mention one more standard notation for the best approximation in this case: $E_n(f)_X := d(f, \mathcal{T}_n)_X$. We note that $\dim \mathcal{T}_n = 2n + 1$.

A.5. Let $X = L_p(a, b)$, $1 \le p < \infty$, or $X = C(a, b)$ and $A$ be the set $R_n$ of rational functions:
\[ R_n := \{ r : r = p/q, \ p, q \in P_n \}. \]

A.6. Let $X = L_p(0, 2\pi)$, $1 \le p < \infty$, or $X = C(0, 2\pi)$ be the set of $2\pi$-periodic functions. Consider $A$ to be the set $\Sigma_m$ of all complex trigonometric polynomials or the set $\Sigma_m(\mathbb{R})$ of all real trigonometric polynomials which have at most $m$ nonzero coefficients:
\[ \Sigma_m := \Bigl\{ t : t = \sum_{k \in \Lambda} c_k e^{ikx}, \ \#\Lambda \le m \Bigr\}, \]
\[ \Sigma_m(\mathbb{R}) := \Bigl\{ t : t = \sum_{k \in \Lambda_1} a_k \cos kx + \sum_{k \in \Lambda_2} b_k \sin kx, \ \#\Lambda_1 + \#\Lambda_2 \le m \Bigr\}. \]
We will also use the following notation in this case: $\sigma_m(f, \mathcal{T})_X := d(f, \Sigma_m)_X$.

A.7. Let $X = L_p(0, 1)$, $1 \le p \le \infty$ and $A$ be the set $S_0$ of piecewise constant functions with breakpoints at $j/n$, $j = 1, \dots, n - 1$. Then $S_0$ is a linear subspace with $\dim S_0 = n$.

A.8. Let $X = L_p(0, 1)$, $1 \le p \le \infty$ and $A$ be the set $\Sigma S_m$ of piecewise constant functions with at most $m - 1$ breakpoints at $(0, 1)$.

A.9. Let $X = L_2([0, 1]^2)$ be the Hilbert space of functions of two variables $x_1, x_2$ that are Lebesgue measurable and square integrable on the square $[0, 1]^2 := [0, 1] \times [0, 1]$. Consider an approximation problem with $A = \Pi(m)$ the set of linear combinations of $m$ functions of the form $u(x_1)v(x_2)$, $\|u\|_2 = \|v\|_2 = 1$:
\[ \Pi(m) := \Bigl\{ b : b = \sum_{i=1}^m c_i u_i(x_1) v_i(x_2), \ \|u_i\|_2 = \|v_i\|_2 = 1, \ i = 1, \dots, m \Bigr\}. \]
This problem is closely connected with properties of the integral operator
\[ J_f(g) := \int_0^1 f(x, y) g(y)\,dy \]
with kernel $f(x, y)$. E. Schmidt [63] gave an expansion (known as the Schmidt expansion)
\[ f(x, y) = \sum_{j=1}^\infty s_j(J_f)\,\phi_j(x)\,\psi_j(y), \]
where $\{s_j(J_f)\}$ is the nonincreasing sequence of singular numbers of $J_f$, i.e., $s_j(J_f) := \lambda_j(J_f^* J_f)^{1/2}$; here for an operator $T$, $\{\lambda_j(T)\}$ denotes the sequence of eigenvalues of $T$, and $J_f^*$ is the adjoint operator to $J_f$. The two sequences $\{\phi_j(x)\}$ and $\{\psi_j(y)\}$ form orthonormal sequences of eigenfunctions of the operators $J_f J_f^*$ and $J_f^* J_f$ respectively. He also proved that
\[ \Bigl\| f(x, y) - \sum_{j=1}^m s_j(J_f)\,\phi_j(x)\,\psi_j(y) \Bigr\|_{L_2} = \inf_{u_j, v_j \in L_2,\ j = 1, \dots, m} \Bigl\| f(x, y) - \sum_{j=1}^m u_j(x) v_j(y) \Bigr\|_{L_2}. \]
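A discrete analogue of Schmidt's result can be checked numerically. The following Python sketch is an illustration only and is not part of the original text: it assumes that the kernel is sampled on a uniform grid (the particular kernel $e^{-|x-y|}$, the grid size `n`, and the names `best_rank_m`, `F`, `A`, `B` are hypothetical choices), so that the integral operator becomes a matrix, the Schmidt expansion becomes the singular value decomposition, and the $L_2$ norm becomes the Frobenius norm up to scaling.

```python
import numpy as np

def best_rank_m(F, m):
    """Truncated SVD: discrete counterpart of the m-term Schmidt expansion."""
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    # Scale the first m left singular vectors by the singular values.
    return (U[:, :m] * s[:m]) @ Vt[:m, :], s

# Hypothetical kernel f(x, y) = exp(-|x - y|) sampled at cell midpoints of [0, 1]^2.
n, m = 64, 3
x = (np.arange(n) + 0.5) / n
F = np.exp(-np.abs(x[:, None] - x[None, :]))

F_m, s = best_rank_m(F, m)

# Error of the truncated expansion (Frobenius norm) and the l2 norm
# of the discarded singular values; the two should coincide.
err = np.linalg.norm(F - F_m)
tail = np.sqrt(np.sum(s[m:] ** 2))

# A competitor: m random left factors with optimal (least-squares) right factors.
rng = np.random.default_rng(0)
A = rng.standard_normal((n, m))
B, *_ = np.linalg.lstsq(A, F, rcond=None)
err_random = np.linalg.norm(F - A @ B)

print(err, tail, err_random)
```

In this discretized setting the truncated expansion is never worse than the competitor of the same number of terms, which mirrors the infimum identity above.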
The above-listed approximation problems can be placed into two different groups. In problems A.1, A.3, A.4, and A.7, the approximation set $A$ is a linear subspace of finite dimension. We will refer to approximation problems of this kind as linear approximation. In problems A.2, A.5, A.6, A.8, and A.9, the approximation set $A$ is not a linear subspace. We will refer to approximation problems of this kind as nonlinear approximation. Let us remark that within nonlinear approximation problems the problem A.9 is very different from problems A.2 and A.6. In the problems A.2 and A.6 an approximant is built as an $m$-term linear combination of elements which come from a fixed basis (minimal system): the basis $(1, 0, \dots, 0), \dots, (0, \dots, 0, 1)$ in A.2 and the trigonometric system $\mathcal{T}$ in A.6. In the problem A.9 an approximant is also an $m$-term linear combination, although elements of this linear combination come from the redundant system $\Pi_2$, the set of all functions of the form $u(x_1)v(x_2)$, $\|u\|_2 = \|v\|_2 = 1$.
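Problem A.2 is the simplest instance of $m$-term approximation with respect to a basis, and it can be explored directly. The sketch below is an illustration under stated assumptions rather than anything taken from the text: the function name `best_m_term` and the values of `n` and `m` are hypothetical. It keeps the $m$ largest coordinates in absolute value, which gives a best approximant from $N_m$ in $\ell_2^n$, and evaluates the error on the unit vector with equal coordinates.

```python
import numpy as np

def best_m_term(x, m):
    """Keep the m largest-magnitude coordinates of x and zero out the rest."""
    x = np.asarray(x, dtype=float)
    approx = np.zeros_like(x)
    if m > 0:
        keep = np.argsort(-np.abs(x), kind="stable")[:m]
        approx[keep] = x[keep]
    return approx

n, m = 10, 4

# Unit vector with equal coordinates: the extremal element for d(B_2^n, N_m).
flat = np.ones(n) / np.sqrt(n)
err = np.linalg.norm(flat - best_m_term(flat, m))

print(err, (1 - m / n) ** 0.5)
```

The printed values agree with the formula $d(B_2^n, N_m) = (1 - m/n)^{1/2}$ from A.2. Note that the kept coordinates depend on the vector being approximated, which is exactly where the nonlinearity enters.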
1.2 Existence and uniqueness of best approximation

We begin with one very general and simple existence result on best approximants in the linear approximation problem.

Theorem 1.2.1. Let $X$ be a normed linear space and $A$ be a finite-dimensional subspace. Then for any $x \in X$ there exists $a(x) \in A$ such that $d(x, A) = \|x - a(x)\|$.

Proof. Let $x_1, \dots, x_n$ form a basis for $A$: $x_1, \dots, x_n$ are linearly independent and $A = \operatorname{span}(x_1, \dots, x_n)$. Consider
\[ f(c) := f(c_1, \dots, c_n) := \Bigl\| \sum_{i=1}^n c_i x_i \Bigr\|. \]
It is clear that $f(c)$ is a continuous nonnegative function such that $f(c) = 0$ if and only if $c = (0, \dots, 0)$. Restricting $f(c)$ to the unit sphere $\|c\|_2 = 1$ we get
\[ C_1(x_1, \dots, x_n)\|c\|_2 \le f(c) \le C_2(x_1, \dots, x_n)\|c\|_2 \tag{1.2.1} \]
with two positive constants $C_1$, $C_2$. The inequality (1.2.1) implies that for a given $x \in X$ we have
\[ d(x, A) := \inf_c \Bigl\| x - \sum_{i=1}^n c_i x_i \Bigr\| = \inf_{c : \|c\|_2 \le N} \Bigl\| x - \sum_{i=1}^n c_i x_i \Bigr\| \tag{1.2.2} \]
with appropriate $N$. Indeed for $\|c\|_2 > N$ we have
\[ \Bigl\| x - \sum_{i=1}^n c_i x_i \Bigr\| \ge C_1\|c\|_2 - \|x\| > C_1 N - \|x\| > \|x\| \]
for $N > 2\|x\|/C_1$. It remains to note that $\bigl\| x - \sum_{i=1}^n c_i x_i \bigr\|$ is a continuous function of $c$ and apply the theorem asserting that a continuous function on a compact set attains its minimum.

As a corollary to this general theorem we infer the existence of a best approximant in problems A.1, A.3, A.4, and A.7. There is no analog of Theorem 1.2.1 in nonlinear approximation and we should treat nonlinear approximation problems case by case. To illustrate some techniques in this direction, we prove here existence theorems for the cases A.6 and A.8.

Theorem 1.2.2. For any $f \in L_p(0, 1)$, $1 \le p < \infty$, there exists $g \in \Sigma S_m$ such that $d(f, \Sigma S_m)_p = \|f - g\|_p$.

Proof. Fix the breakpoints $0 = y_0 \le y_1 \le \dots \le y_{m-1} \le y_m = 1$, denote $y := (y_0, \dots, y_m)$ and let $S_0(y)$ be the set of piecewise constant functions with breakpoints $y_1, \dots, y_{m-1}$. Denote also
\[ e_m^y(f)_p := \inf_{a \in S_0(y)} \|f - a\|_p. \]
From the definition of $d(f, \Sigma S_m)_p$ we get that there exists a sequence $y^i$ such that
\[ e_m^{y^i}(f)_p \to d(f, \Sigma S_m)_p \]
when $i \to \infty$. Considering a subsequence of $\{y^i\}$ if necessary, we can assume that $y^i \to y^*$ for some $y^* \in \mathbb{R}^{m+1}$. Now we consider only those indices $j$ for which $y_{j-1}^* \ne y_j^*$. Denote the corresponding set of indices by $\Lambda$. Take a positive number $\varepsilon$ satisfying
\[ \varepsilon < \min_{j \in \Lambda} (y_j^* - y_{j-1}^*)/3 \]
and consider $i$ such that
\[ \|y^* - y^i\|_\infty < \varepsilon. \tag{1.2.3} \]
By Theorem 1.2.1, for each $y^i$ there exists
\[ g(f, y^i, c^i) := \sum_{j=1}^m c_j^i \chi_{[y_{j-1}^i, y_j^i]}, \]
where $\chi_E$ denotes the characteristic function of a set $E$, with the property that
\[ \|f - g(f, y^i, c^i)\|_p = e_m^{y^i}(f)_p. \]
For $i$ satisfying (1.2.3) and $j \in \Lambda$ we have $|c_j^i| \le C(f, \varepsilon)$, which allows us to assume (passing to a subsequence if necessary) the convergence
\[ \lim_{i \to \infty} c_j^i = c_j, \qquad j \in \Lambda. \]
Consider
\[ g(f, c) := \sum_{j \in \Lambda} c_j \chi_{[y_{j-1}^*, y_j^*]}. \]
Denote $U_\varepsilon(y) := \bigcup_j (y_j - \varepsilon, y_j + \varepsilon)$ and $G_\varepsilon := [0, 1] \setminus U_\varepsilon(y^*)$. Then we have
\[ \int_{G_\varepsilon} |f - g(f, c)|^p = \lim_{i \to \infty} \int_{G_\varepsilon} |f - g(f, y^i, c^i)|^p \le d(f, \Sigma S_m)_p^p. \]
Letting $\varepsilon \to 0$ we finish the proof.
We proceed now to the trigonometric case. We give the proof in the general case of $d$ variables because this generality does not bring any complication.

Theorem 1.2.3. Let $1 \le p \le \infty$. For any $f \in L_p(\mathbb{T}^d)$ (with $L_\infty(\mathbb{T}^d) := C(\mathbb{T}^d)$) and any $m \in \mathbb{N}$ there exists a trigonometric polynomial $t_m$ of the form
\[ t_m(x) = \sum_{n=1}^m c_n e^{i(k^n, x)} \tag{1.2.4} \]
such that $\sigma_m(f, \mathcal{T}^d)_p = \|f - t_m\|_p$.

Proof. We prove this theorem by induction. Let us use the abbreviated notation $\sigma_m(f)_p := \sigma_m(f, \mathcal{T}^d)_p$.

First step. Let $m = 1$. We assume $\sigma_1(f)_p < \|f\|_p$ because in the case $\sigma_1(f)_p = \|f\|_p$ the proof is trivial: we take $t_1 = 0$. We now prove that polynomials of the form $c e^{i(k, x)}$ with big $|k|$ cannot provide approximation with error close to $\sigma_1(f)_p$. This will allow us to restrict the search for an optimal approximant $c_1 e^{i(k^1, x)}$ to a finite number of $k^1$, which will in turn imply the existence. We introduce a parameter $N \in \mathbb{N}$ which will be specified later on and consider the following polynomials:
\[ K_N(u) := \sum_{|k| < N} \Bigl(1 - \frac{|k|}{N}\Bigr) e^{iku}, \qquad u \in \mathbb{T}, \tag{1.2.5} \]
and
\[ K_N(x) := \prod_{j=1}^d K_N(x_j), \qquad x = (x_1, \dots, x_d) \in \mathbb{T}^d. \]
The functions $K_N$ are the Fejér kernels. These polynomials have the following property (see [86, Chapter 3] and Section 7.4 of the Appendix here):
\[ \|K_N\|_1 = 1, \qquad N = 1, 2, \dots. \tag{1.2.6} \]
Consider the operator
\[ (K_N(g))(x) = (2\pi)^{-d} \int_{\mathbb{T}^d} K_N(x - y) g(y)\,dy. \tag{1.2.7} \]
Denote
\[ e_N := \|g - K_N(g)\|_p. \tag{1.2.8} \]
It is known that for any $g \in L_p(\mathbb{T}^d)$ we have $e_N \to 0$ as $N \to \infty$. For fixed $N$ take any $k \in \mathbb{Z}^d$ such that $\|k\|_\infty \ge N$. Consider $g(x) = f(x) - c e^{i(k, x)}$ with some $c$. Using (1.2.5) and (1.2.6) we get, on the one hand, that
\[ \|K_N(f)\|_p = \|K_N(g)\|_p \le \|g\|_p. \tag{1.2.9} \]
On the other hand,
\[ \|K_N(f)\|_p \ge \|f\|_p - \|f - K_N(f)\|_p \ge \|f\|_p - e_N. \tag{1.2.10} \]
Combining (1.2.9) and (1.2.10) we obtain, for all $k$ with $\|k\|_\infty \ge N$ and for any $c$,
\[ \bigl\| f(x) - c e^{i(k, x)} \bigr\|_p \ge \|f\|_p - e_N. \tag{1.2.11} \]
Making $N$ big enough we get
\[ \|f\|_p - e_N \ge \bigl(\|f\|_p + \sigma_1(f)_p\bigr)/2. \tag{1.2.12} \]
The relations (1.2.11) and (1.2.12) imply that
\[ \sigma_1(f)_p = \inf_{c,\ \|k\|_\infty < N} \bigl\| f(x) - c e^{i(k, x)} \bigr\|_p, \]
which by Theorem 1.2.1 completes the proof for $m = 1$.

General step. Assume that Theorem 1.2.3 has already been proved for $m - 1$. We prove it for $m$. If $\sigma_m(f)_p = \sigma_{m-1}(f)_p$, we are done by the induction assumption. Let $\sigma_m(f)_p < \sigma_{m-1}(f)_p$. The idea of the proof in the general step is similar to that in the first step. Take any $k^1, \dots, k^m$. Assume $\|k^j\|_\infty \le \|k^m\|_\infty$, $j = 1, \dots, m - 1$, and $\|k^m\|_\infty > N$. We prove that a polynomial with frequencies $k^1, \dots, k^m$ does not provide a good approximation. Take any numbers $c_1, \dots, c_m$ and consider
\[ f_{m-1}(x) := f(x) - \sum_{j=1}^{m-1} c_j e^{i(k^j, x)}, \qquad g(x) := f_{m-1}(x) - c_m e^{i(k^m, x)}. \]
Then replacing $f$ by $f_{m-1}$ we obtain in the same way as above the estimate
\[ \Bigl\| f(x) - \sum_{j=1}^m c_j e^{i(k^j, x)} \Bigr\|_p \ge \sigma_{m-1}(f)_p - e_N. \tag{1.2.13} \]
We remark here that the analog to (1.2.10) reads
\[ \|K_N(f_{m-1})\|_p \ge \sigma_{m-1}(K_N(f))_p \ge \sigma_{m-1}(f)_p - \|f - K_N(f)\|_p \ge \sigma_{m-1}(f)_p - e_N. \]
Making $N$ big enough we derive from (1.2.13) that
\[ \sigma_m(f)_p = \inf_{k^j,\ j = 1, \dots, m}\ \inf_{c_j,\ j = 1, \dots, m} \Bigl\| f(x) - \sum_{j=1}^m c_j e^{i(k^j, x)} \Bigr\|_p, \]
where the infimum is taken over the $k^j$ satisfying the restriction $\|k^j\|_\infty \le N$ for all $j = 1, \dots, m$. In order to complete the proof of Theorem 1.2.3, it remains to remark that, by Theorem 1.2.1, the inside infimum can always be replaced by minimum and the outside infimum is taken over a finite set. This completes the proof.

We briefly discuss the problem of uniqueness of a best approximant. We begin by remarking that in the $m$-term nonlinear approximation we can hardly expect uniqueness. Indeed, consider problem A.6 on best $m$-term trigonometric approximation in the particular case $X = L_2(0, 2\pi)$. Take
\[ f(x) = \sum_{k=1}^n e^{ikx}. \]
Clearly, $\sigma_1(f)_2 = (n - 1)^{1/2}$ and each $e^{ikx}$, $k = 1, \dots, n$ may serve as a best approximant. In the linear approximation there is a general theorem on uniqueness.

Definition 1.2.4. We say that a normed linear space $X$ is strictly normed if its norm satisfies the following property: for any two nonzero elements $x, y \in X$,
\[ \|x + y\| = \|x\| + \|y\| \implies y = \alpha x. \]

Theorem 1.2.5. Let $X$ be a strictly normed linear space. Then for any finite-dimensional linear subspace $A$ and any $x \in X$ the best approximant to $x$ from $A$ is unique.

Proof. Take $x \ne 0$. By Theorem 1.2.1, there is a best approximant $a := a(x) \in A$. Assume there is one more best approximant $a' \in A$. Then we have
\[ \|x - a\| = \|x - a'\| = d(x, A). \]
In any normed space $X$ finite-dimensional subspaces are closed sets. Therefore, if $d(x, A) = 0$, then $x \in A$ and the best approximant $a = x$ is unique. Suppose $d(x, A) \ne 0$. Consider $y := (a + a')/2$. We have $y \in A$ and hence
\[ d(x, A) \le \|x - y\| \le \|x - a\|/2 + \|x - a'\|/2 \le d(x, A). \]
Thus
\[ \|x - y\| = \|x - a\|/2 + \|x - a'\|/2. \]
Since $X$ is strictly normed, $x - a = \alpha(x - a')$ with some $\alpha \ne 0$. The fact that $x \notin A$ implies $\alpha = 1$, which completes the proof.

Any Hilbert space and $L_p$ spaces with $1 < p < \infty$ are examples of strictly normed spaces. On the contrary, the spaces $L_p$ with $p = 1$ or $p = \infty$ are not strictly normed.
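The role of strict normedness in Theorem 1.2.5 can be seen in a two-dimensional toy case. The sketch below is only an illustration and rests on assumptions not made in the text: it uses $\mathbb{R}^2$ with the $\ell_2$ and $\ell_\infty$ norms as finite-dimensional stand-ins for $L_2$ and $L_\infty$, the subspace $A = \operatorname{span}\{(1, 0)\}$, the point $x = (0, 1)$, and a grid search over the coefficient $t$ (the names `ts` and `minimizers` are hypothetical).

```python
import numpy as np

x = np.array([0.0, 1.0])
e1 = np.array([1.0, 0.0])            # A = span{e1}
ts = np.linspace(-2.0, 2.0, 401)     # candidate coefficients; approximant is t * e1

def minimizers(p):
    """Grid values of t at which ||x - t*e1||_p is (numerically) minimal."""
    errs = np.array([np.linalg.norm(x - t * e1, ord=p) for t in ts])
    return ts[np.isclose(errs, errs.min())]

mins_l2 = minimizers(2)        # strictly normed: a single minimizer near t = 0
mins_linf = minimizers(np.inf) # not strictly normed: every t in [-1, 1] is optimal

print(len(mins_l2), len(mins_linf), mins_linf.min(), mins_linf.max())
```

In the Euclidean norm the error is $(1 + t^2)^{1/2}$, minimized only at $t = 0$, while in the maximum norm the error is $\max(|t|, 1)$, so the whole segment $|t| \le 1$ consists of best approximants.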
1.3 Schauder bases in Banach spaces

Schauder bases in Banach spaces can be used to associate to an element $x \in X$ a sequence of numbers, the coefficients of $x$ with respect to a basis. This helps in studying properties of a Banach space $X$.

Definition 1.3.1. A sequence $\Psi := \{\psi_n\}_{n=1}^\infty$ in a Banach space $X$ is called a Schauder basis of $X$ (basis of $X$) if for any $x \in X$ there exists a unique sequence $\{c_n\}_{n=1}^\infty := \{c_n(x)\}_{n=1}^\infty := \{c_n(x, \Psi)\}_{n=1}^\infty$ such that
\[ x = \sum_{n=1}^\infty c_n(x)\psi_n. \]
Denote
\[ S_0(x) := 0, \qquad S_m(x) := S_m(x, \Psi) := \sum_{n=1}^m c_n(x)\psi_n. \]
For a fixed basis $\Psi$ consider the following quantity:
\[ |||x||| := \sup_m \|S_m(x, \Psi)\|. \]
It is clear that for any $x \in X$ we have
\[ \|x\| \le |||x||| < \infty. \tag{1.3.1} \]
It is easy to see that $|||\cdot|||$ provides a norm on the linear space $X$. Denote this new normed linear space by $X^s$.

Lemma 1.3.2. The space $X^s$ is a Banach space.

Proof. We only need to prove that $X^s$ is complete. Let $\{y^n\}$ be a Cauchy sequence in $X^s$. Then by (1.3.1) $\{y^n\}$ is a Cauchy sequence in $X$ and therefore there exists $y \in X$ such that $y^n \to y$. We will prove that $y^n \to y$ also in $X^s$, that is, for any $\varepsilon > 0$ there exists $N$ such that for any $n > N$ we have, for all $m$,
\[ \|S_m(y^n - y)\| < \varepsilon. \tag{1.3.2} \]
Since $\{y^n\}$ is a Cauchy sequence in $X^s$, there is $N_1$ such that for any $m_1$, $m_2$ we have for all $n, l > N_1$ that
\[ \|(S_{m_2} - S_{m_1})(y^n - y^l)\| < \varepsilon. \tag{1.3.3} \]
In the particular case $m_1 = i - 1$, $m_2 = i$, this implies that $\{c_i(y^n)\}$ is a Cauchy sequence and therefore
\[ c_i(y^n) \to b_i, \qquad i = 1, 2, \dots, \tag{1.3.4} \]
for some $b_i$. In the case $m_1 = 0$, $m_2 = m$, we get from (1.3.3) that, for any $n, l > N_1$,
\[ \|S_m(y^l) - y^l\| \le \|S_m(y^l - y^n)\| + \|S_m(y^n) - y^n\| + \|y^n - y^l\| \le \|S_m(y^n) - y^n\| + 2\varepsilon. \tag{1.3.5} \]
Next, using that, for each fixed $m$,
\[ S_m(y^l) \to \sum_{i=1}^m b_i\psi_i \quad \text{as } l \to \infty, \]
we get from (1.3.5) that
\[ \Big\| \sum_{i=1}^m b_i\psi_i - y \Big\| \le \|S_m(y^n) - y^n\| + 2\varepsilon, \]
which easily implies that
\[ \sum_{i=1}^m b_i\psi_i \to y \quad \text{as } m \to \infty. \]
By the uniqueness of the basis expansion, $b_i = c_i(y)$ for all $i$. Let, for a given $\varepsilon > 0$, $M$ be a number such that for any $m \ge M$
\[ \|S_m(y) - y\| < \varepsilon \]
and, for $n := N_1 + 1$,
\[ \|S_m(y^n) - y^n\| < \varepsilon. \]
Then $\|S_m(y^n - y)\| \le 3\varepsilon$ and by (1.3.5), for all $l > N_1$ and all $m \ge M$,
\[ \|S_m(y^l - y)\| \le 5\varepsilon. \tag{1.3.6} \]
The relation (1.3.4) provides (1.3.6) for $m < M$ and sufficiently large $l$. This completes the proof of (1.3.2) and Lemma 1.3.2.
Theorem 1.3.3. Let $X$ be a Banach space with a Schauder basis $\Psi$. Then the maps $S_m : X \to X$ are bounded linear operators and
\[ \sup_m \|S_m\| < \infty. \]
Proof. The proof is based on the following fundamental theorem of Banach.
Theorem 1.3.4. Let $X$, $Y$ be Banach spaces and $T$ be a bounded linear one-to-one operator from $Y$ onto $X$. Then the inverse operator $T^{-1}$ is a bounded linear operator from $X$ to $Y$.
We specify $X = X$, $Y = X^s$, and $T$ to be the identity mapping. It follows from (1.3.1) that $T$ is a bounded operator from $Y$ to $X$. Thus, by Theorem 1.3.4, $T^{-1}$ is also bounded. This means that there exists a constant $C$ such that for any $x \in X$ we have $|||x||| \le C\|x\|$, and hence $\|S_m(x)\| \le C\|x\|$ for all $m$. This completes the proof of Theorem 1.3.3.
The operators $\{S_m\}_{m=1}^\infty$ are called the natural projections associated to a basis $\Psi$. The number $\sup_m \|S_m\|$ is called the basis constant of the basis $\Psi$. A basis whose basis constant is one is called a monotone basis. It is clear that an orthonormal basis in a Hilbert space is a monotone basis. Every Schauder basis $\Psi$ is monotone with respect to the norm $|||x||| := \sup_m \|S_m(x, \Psi)\|$ which was already used above. Indeed, we have
\[ |||S_m(x)||| = \sup_n \|S_n(S_m(x))\| = \sup_{1 \le n \le m} \|S_n(x)\| \le |||x|||. \]
The above remark means that for any Schauder basis $\Psi$ of $X$ we can renorm $X$ (take $X^s$) to make the basis $\Psi$ monotone for a new norm that is equivalent to the original norm.
Theorem 1.3.5. Let $\{x_n\}_{n=1}^\infty$ be a sequence of elements in a Banach space $X$. Then $\{x_n\}_{n=1}^\infty$ is a Schauder basis of $X$ if and only if the following three conditions hold:
(a) $x_n \ne 0$ for all $n$.
(b) There is a constant $K$ such that for every choice of scalars $\{a_i\}_{i=1}^\infty$ and integers $n < m$ we have
\[ \Big\| \sum_{i=1}^n a_i x_i \Big\| \le K \Big\| \sum_{i=1}^m a_i x_i \Big\|. \]
(c) The closed linear span of $\{x_n\}_{n=1}^\infty$ coincides with $X$.
Proof. Necessity of (a) and (c) follows directly from the definition of a Schauder basis. Theorem 1.3.3 implies necessity of (b). Let us prove that (a), (b), and (c) imply that $\{x_n\}_{n=1}^\infty$ is a Schauder basis. Take any $x \in X$ and using (c) find
\[ y^l := \sum_{i \in \Lambda_l} a_i^l x_i, \qquad \#\Lambda_l < \infty, \]
such that $\|x - y^l\| \le 2^{-l}$.
Introduce the new sequence
\[ z^1 := y^1, \qquad z^j := y^j - y^{j-1}, \quad j = 2, 3, \dots. \]
Then
\[ x = \sum_{j=1}^\infty z^j, \qquad z^j = \sum_{i \in G_j} b_i^j x_i, \qquad \#G_j < \infty, \tag{1.3.7} \]
and, by (b),
\[ \|b_i^j x_i\| \le 6K 2^{-j}, \qquad i \in G_j. \tag{1.3.8} \]
Denote
\[ a_i := \sum_{j=1}^\infty b_i^j. \]
Consider the partial sums $P_m$ of the following series:
\[ \sum_{i=1}^\infty a_i x_i, \qquad P_m := \sum_{i=1}^m a_i x_i. \]
It is not difficult to see that (1.3.7) and (1.3.8) imply that $P_m \to x$. Indeed, denote
\[ S_m(y^n) := \sum_{i \in \Lambda_n,\ i \le m} a_i^n x_i. \]
Then we have, for any fixed $m$,
\[ S_m(y^l) \to P_m \quad \text{as } l \to \infty, \tag{1.3.9} \]
and, for any $l > n$,
\[ \|y^l - S_m(y^l)\| \le \|y^n - S_m(y^n)\| + \|y^n - y^l\| + \|S_m(y^n) - S_m(y^l)\| \le \|y^n - S_m(y^n)\| + 3(K + 1)2^{-n}. \]
For a fixed $n$ choose $m$ such that $S_m(y^n) = y^n$. Then for any $l > n$
\[ \|x - S_m(y^l)\| \le \big(3(K + 1) + 1\big) 2^{-n}. \]
Together with (1.3.9) this proves the existence of the basis expansion for any $x \in X$. The uniqueness of that expansion follows from (a) and (b).
For a basis $\Psi$ denote by
\[ \sigma_m(x, \Psi) := \sigma_m(x, \Psi)_X := \inf_{\Lambda,\ |\Lambda| \le m}\ \inf_{c_i,\ i \in \Lambda} \Big\| x - \sum_{i \in \Lambda} c_i\psi_i \Big\| \]
the best $m$-term approximation of $x$ relative to $\Psi$. In a way similar to the proof of Theorem 1.2.3 we can prove the following existence result.
Theorem 1.3.6. Let $\Psi$ be a monotone basis of $X$. Then for any $x \in X$ and any $m \in \mathbb{N}$ there exist $\Lambda_m$, $|\Lambda_m| \le m$, and $\{c_i^*,\ i \in \Lambda_m\}$ such that
\[ \Big\| x - \sum_{i \in \Lambda_m} c_i^*\psi_i \Big\| = \sigma_m(x, \Psi). \]
The simplest example of a Schauder basis is the standard basis $B := \{e_n\}_{n=1}^\infty$, $e_n := (0, \dots, 0, 1, 0, \dots)$ with 1 in the $n$-th place, of $\ell_p$, $1 \le p < \infty$. An important example of a Schauder basis in $L_p(0, 1)$, $1 \le p < \infty$, is the Haar system. Denote by $H := \{H_k\}_{k=1}^\infty$ the Haar basis on $[0, 1)$ normalized in $L_2(0, 1)$: $H_1 = 1$ on $[0, 1)$ and for $k = 2^n + l$, $n = 0, 1, \dots$, $l = 1, 2, \dots, 2^n$,
\[ H_k = \begin{cases} 2^{n/2}, & x \in [(2l - 2)2^{-n-1}, (2l - 1)2^{-n-1}), \\ -2^{n/2}, & x \in [(2l - 1)2^{-n-1}, 2l\,2^{-n-1}), \\ 0, & \text{otherwise.} \end{cases} \]
We denote by $H_p := \{H_{k,p}\}_{k=1}^\infty$ the Haar basis $H$ renormalized in $L_p(0, 1)$.
Theorem 1.3.7. The Haar basis is a monotone basis of $L_p(0, 1)$ for each $1 \le p < \infty$.
Proof. We check that conditions (a), (b), and (c) from Theorem 1.3.5 are satisfied. Since the linear span of the Haar basis contains all characteristic functions of intervals of the form $[(l - 1)2^{-n}, l2^{-n})$, it is clear that (c) holds. We only have to verify that (b) holds with $K = 1$. We use the following simple inequality for $p \ge 1$ and any $x \in \mathbb{R}$:
\[ |1 + x|^p + |1 - x|^p \ge 2. \tag{1.3.10} \]
Let
\[ f = \sum_{k=1}^{N-1} a_k H_k. \]
It is clear that it is sufficient to prove that for any $a_N$ we have
\[ \|f + a_N H_N\|_p \ge \|f\|_p. \tag{1.3.11} \]
Let $I$ be the interval of support of $H_N$. It is clear that the summand $a_N H_N$ changes $f$ only on $I$ and $f$ is a constant (say $b$) on $I$. This observation and inequality (1.3.10) imply that
\[ \int_I |b + a_N H_N|^p \, dx = \frac{|I|}{2}\Big( \big|b + a_N\|H_N\|_\infty\big|^p + \big|b - a_N\|H_N\|_\infty\big|^p \Big) \ge |b|^p |I| = \int_I |f|^p \, dx, \]
which proves (1.3.11).
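The monotonicity in Theorem 1.3.7 can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a discretization of $[0, 1)$ into $2^J$ equal cells (on which the first $2^J$ Haar functions are piecewise constant, so the $L_p$ norms are computed exactly up to rounding), and the helper names `haar`, `lp_norm`, and `partial_sum_norms` are hypothetical.

```python
import random

def haar(k, J):
    """Values of the L2-normalized Haar function H_k on the 2**J dyadic cells of [0, 1)."""
    size = 2 ** J
    if k == 1:
        return [1.0] * size
    n = (k - 1).bit_length() - 1      # k = 2**n + l with 1 <= l <= 2**n
    l = k - 2 ** n
    half = size >> (n + 1)            # number of cells in each half of the support
    start = (2 * l - 2) * half
    amp = 2.0 ** (n / 2)
    values = [0.0] * size
    for i in range(start, start + half):
        values[i] = amp
    for i in range(start + half, start + 2 * half):
        values[i] = -amp
    return values

def lp_norm(values, p):
    """L_p(0,1) norm of a function that is constant on each of len(values) equal cells."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)

def partial_sum_norms(coeffs, J, p):
    """Norms ||S_m f||_p, m = 1..len(coeffs), for f = sum_k coeffs[k-1] * H_k."""
    total = [0.0] * (2 ** J)
    norms = []
    for k, a in enumerate(coeffs, start=1):
        h = haar(k, J)
        total = [s + a * v for s, v in zip(total, h)]
        norms.append(lp_norm(total, p))
    return norms

random.seed(0)
J = 5
coeffs = [random.uniform(-1.0, 1.0) for _ in range(2 ** J)]
monotone = {}
for p in (1.0, 1.5, 3.0):
    norms = partial_sum_norms(coeffs, J, p)
    # condition (b) with K = 1: each added Haar term cannot decrease the norm
    monotone[p] = all(b >= a - 1e-9 for a, b in zip(norms, norms[1:]))
print(monotone)
```

The small tolerance only absorbs floating-point rounding; the inequality itself is the one proved in (1.3.11).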
Theorem 1.3.8. Let $\Psi$ be a normalized ($\|\psi_k\| = 1$, $k = 1, 2, \dots$) Schauder basis of $X$ with the additional property that $\psi_k$ converge weakly to 0. Then for any $x \in X$ and any $m \in \mathbb{N}$ there exist $\Lambda_m$, $|\Lambda_m| \le m$, and $\{c_i^*,\ i \in \Lambda_m\}$ such that
\[ \Big\| x - \sum_{i \in \Lambda_m} c_i^*\psi_i \Big\| = \sigma_m(x, \Psi). \]
Proof. In order to sketch the idea of the proof, let us consider first the case $m = 1$. Let
\[ \|x - c_{k_n}\psi_{k_n}\| \to \sigma_1(x, \Psi), \qquad n \to \infty. \tag{1.3.12} \]
If
\[ \liminf_{n \to \infty} k_n < \infty, \]
then there exist $k$ and a sequence $\{a_n\}$ such that
\[ \|x - a_n\psi_k\| \to \sigma_1(x, \Psi), \qquad n \to \infty. \tag{1.3.13} \]
Since $\Psi$ is a Schauder basis, (1.3.13) implies that the sequence $\{a_n\}$ is bounded. Choosing a convergent subsequence of $\{a_n\}$ we obtain an $a$ such that
\[ \|x - a\psi_k\| = \sigma_1(x, \Psi), \]
which proves the existence in this case. Assume now that
\[ \lim_{n \to \infty} k_n = \infty. \]
Let $F_x$ be a norming (peak) functional for $x$: $F_x(x) = \|x\|$, $\|F_x\| = 1$. Then
\[ \|x - c_{k_n}\psi_{k_n}\| \ge F_x(x - c_{k_n}\psi_{k_n}) = \|x\| - c_{k_n}F_x(\psi_{k_n}). \tag{1.3.14} \]
Relation (1.3.12) implies boundedness of $\{c_{k_n}\}$ and therefore, due to the weak convergence to 0 of $\{\psi_k\}$, we get from (1.3.14) and (1.3.12) that
\[ \sigma_1(x, \Psi) = \|x\|. \]
Thus we can take 0 as a best approximant. Let us consider now the general case of $m$-term approximation. Let
\[ x^n := \sum_{j=1}^m c^n_{k_j^n}\psi_{k_j^n}, \qquad k_1^n < k_2^n < \dots < k_m^n, \]
be such that
\[ \|x - x^n\| \to \sigma_m(x, \Psi). \]
Then we have
\[ |c^n_{k_j^n}| \le M \tag{1.3.15} \]
for all $n$, $j$, for some constant $M$. Assume that
\[ \liminf_{n \to \infty} k_j^n < \infty, \quad j = 1, \dots, l \le m \ \text{(possibly for none)}, \]
\[ \lim_{n \to \infty} k_j^n = \infty, \quad j = l + 1, \dots, m \ \text{(possibly for none)}. \]
Then similarly to the case $m = 1$ we find $\Lambda$, $|\Lambda| \le l$, and a subsequence $\{n_s\}_{s=1}^\infty$ such that
\[ \sum_{k \in \Lambda} c_k^{n_s}\psi_k \to \sum_{k \in \Lambda} c_k\psi_k =: y. \tag{1.3.16} \]
Consider the norming functional $F_{x-y}$. We have from (1.3.15), (1.3.16), and the weak convergence of $\{\psi_k\}$ to 0 that
\[ F_{x-y}(x^{n_s} - y) \to 0 \quad \text{as } s \to \infty. \]
Thus
\[ \|x - y\| = F_{x-y}(x - y) = F_{x-y}(x - x^{n_s} + x^{n_s} - y) \le \|x - x^{n_s}\| + |F_{x-y}(x^{n_s} - y)| \to \sigma_m(x, \Psi) \]
as $s \to \infty$. This implies that $\|x - y\| = \sigma_m(x, \Psi)$, which completes the proof of Theorem 1.3.8.
1.4 Unconditional bases

In many applications we need more properties of a system than the property of being a Schauder basis. One convenient property of a basis is unconditionality, which means (roughly speaking) that the norm depends only on the absolute values of the coefficients of the basis expansion. Before proceeding to unconditional bases we discuss unconditional convergence.
Theorem 1.4.1. Let $\{x_n\}_{n=1}^\infty$ be a sequence in a Banach space $X$. Then the following conditions are equivalent:
(i) The series $\sum_{n=1}^\infty x_{\rho(n)}$ converges for every permutation $\rho$ of the positive integers.
(ii) The series $\sum_{l=1}^\infty x_{n_l}$ converges for any subsequence $\{n_l\}$.
(iii) The series $\sum_{n=1}^\infty \theta_n x_n$ converges for any choice of signs $\theta_n = \pm 1$.
(iv) For every $\varepsilon > 0$ there exists an integer $N$ so that $\big\| \sum_{n \in \Lambda} x_n \big\| < \varepsilon$ for every finite set of indices $\Lambda$ satisfying $\min\{n \in \Lambda\} > N$.
Proof. The equivalence of (ii) and (iii) is obvious. Assume (iv) holds. Then the partial sums of the series in (i) (and also in (ii)) form a Cauchy sequence and thus (iv) implies both (i) and (ii). We now prove that (ii) implies (iv). Equivalently, we will prove that the negation of (iv) implies the negation of (ii). Assume that (iv) does not hold. Then there exist an $\varepsilon_0 > 0$ and finite sets $\Lambda_j$, $j = 1, 2, \dots$, with the following properties: denoting by $p_j$ and $q_j$, respectively, the smallest and the largest numbers from $\Lambda_j$, we have
\[ \Big\| \sum_{n \in \Lambda_j} x_n \Big\| \ge \varepsilon_0, \qquad q_j < p_{j+1}, \qquad j = 1, 2, \dots. \]
It is clear that the union of $\Lambda_j$ over all $j$ forms a subsequence for which (ii) does not hold. Hence (ii) implies (iv). We will prove in a similar way that (i) implies (iv). Let $\rho$ be a permutation that is identical on $(q_j, p_{j+1})$ and maps $[p_j, q_j]$ onto itself and, in addition,
\[ \rho^{-1}(\Lambda_j) = \big[p_j, p_j + |\Lambda_j|\big), \qquad j = 1, 2, \dots. \]
Then for this permutation (i) does not hold and therefore (i) implies (iv).
Definition 1.4.2. A series $\sum_{n=1}^\infty x_n$ which satisfies one (and, by Theorem 1.4.1, all) of conditions (i)–(iv) is called unconditionally convergent.
Using condition (iv) of Theorem 1.4.1 it is easy to verify that if $\sum_{n=1}^\infty x_n$ converges unconditionally, then the sum $\sum_{n=1}^\infty x_{\rho(n)}$ does not depend on the permutation $\rho$. Let us prove the following simple lemma.
Lemma 1.4.3. Let $a_1 \ge a_2 \ge \dots \ge a_N \ge 0$. Then for all elements $x_1, \dots, x_N$ we have
\[ \Big\| \sum_{n=1}^N a_n x_n \Big\| \le a_1 \max_{1 \le k \le N} \Big\| \sum_{n=1}^k x_n \Big\|. \]
Proof. This lemma is an analog of Abel's inequality (7.1.16). We have
\[ \sum_{n=1}^N a_n x_n = a_N \sum_{n=1}^N x_n + (a_{N-1} - a_N) \sum_{n=1}^{N-1} x_n + \dots + (a_1 - a_2)x_1, \]
which easily implies the inequality of the lemma.
Using Lemma 1.4.3 one can derive from Theorem 1.4.1 that if $\sum_{n=1}^\infty x_n$ converges unconditionally, then for every bounded sequence of scalars $\{a_n\}_{n=1}^\infty$ the series $\sum_{n=1}^\infty a_n x_n$ converges and the map $T : \ell_\infty \to X$ defined by
\[ T(a_1, a_2, \dots) := \sum_{n=1}^\infty a_n x_n \]
is a bounded linear operator.
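The inequality of Lemma 1.4.3 can be tested on random data. This is only an illustrative sketch under stated assumptions: the elements $x_n$ are taken in $\mathbb{R}^2$ with the Euclidean norm (the lemma itself holds in any normed space), and the helper name `lemma_sides` is hypothetical.

```python
import math
import random

def norm(v):
    # Euclidean norm in R^2, used here as a concrete example of a norm
    return math.hypot(v[0], v[1])

def lemma_sides(a, xs):
    """Return (lhs, rhs) of Lemma 1.4.3 for nonincreasing weights a >= 0 and vectors xs in R^2."""
    weighted = [0.0, 0.0]     # running value of sum a_n x_n
    partial = [0.0, 0.0]      # running value of sum x_n
    max_partial = 0.0         # max over k of || x_1 + ... + x_k ||
    for an, x in zip(a, xs):
        weighted = [weighted[0] + an * x[0], weighted[1] + an * x[1]]
        partial = [partial[0] + x[0], partial[1] + x[1]]
        max_partial = max(max_partial, norm(partial))
    return norm(weighted), a[0] * max_partial

random.seed(1)
results = []
for _ in range(200):
    n = random.randint(1, 8)
    a = sorted((random.uniform(0.0, 5.0) for _ in range(n)), reverse=True)
    xs = [(random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)) for _ in range(n)]
    lhs, rhs = lemma_sides(a, xs)
    results.append(lhs <= rhs + 1e-9)   # tolerance absorbs rounding only
print(all(results))
```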
Definition 1.4.4. A basis $\Psi$ of a Banach space $X$ is said to be unconditional if for every $x \in X$ its expansion $\sum_{n=1}^\infty a_n\psi_n$ in the basis $\Psi$ converges unconditionally.
The following result is a corollary of Theorem 1.4.1.
Theorem 1.4.5. A basis $\Psi$ is unconditional if and only if any of the following conditions holds.
(i) For every permutation $\rho$ the sequence $\{\psi_{\rho(n)}\}_{n=1}^\infty$ is a basis of $X$.
(ii) Convergence of $\sum_{n=1}^\infty a_n\psi_n$ implies convergence of $\sum_{n \in \Lambda} a_n\psi_n$ for every subset $\Lambda$ of integers.
(iii) Convergence of $\sum_{n=1}^\infty a_n\psi_n$ implies convergence of $\sum_{n=1}^\infty b_n\psi_n$ whenever $|b_n| \le |a_n|$ for all $n$.
It follows from (ii) and the closed graph theorem that if $\Psi$ is an unconditional basis and $\Lambda$ is a subset of integers, then there is a bounded linear projection $P_\Lambda$, defined by
\[ P_\Lambda(x) := \sum_{n \in \Lambda} c_n(x)\psi_n. \]
These projections are called the natural projections associated to the basis $\Psi$. In the particular case $\Lambda = [1, N]$ we have $P_{[1,N]} = S_N$. Similarly, for every choice of signs $\theta := \{\theta_n\}_{n=1}^\infty$ we have a bounded linear operator $M_\theta$ defined, for $x = \sum_{n=1}^\infty a_n\psi_n$, by
\[ M_\theta(x) := \sum_{n=1}^\infty \theta_n a_n\psi_n. \]
The uniform boundedness principle implies that
\[ \sup_\Lambda \|P_\Lambda\| < \infty, \qquad \sup_\theta \|M_\theta\| < \infty. \]
The number $\sup_\theta \|M_\theta\|$ is called the unconditional constant of $\Psi$.
Theorem 1.4.6. Let $\Psi$ be an unconditional basis with unconditional constant $K$. Then for every choice of scalars $\{a_n\}_{n=1}^\infty$ such that $\sum_{n=1}^\infty a_n\psi_n$ converges, and every choice of multipliers $\{\lambda_n\}_{n=1}^\infty$, we have
\[ \Big\| \sum_{n=1}^\infty \lambda_n a_n\psi_n \Big\| \le 2K \sup_n |\lambda_n| \, \Big\| \sum_{n=1}^\infty a_n\psi_n \Big\| \]
(in the real case we can take $K$ instead of $2K$).
Proof. Denote
\[ x := \sum_{n=1}^\infty \lambda_n a_n\psi_n \]
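and take a norming functional $x^* \in X^*$ for $x$. Then $\|x^*\| = 1$ and
\[ \|x\| = x^*(x) = \sum_{n=1}^\infty \lambda_n a_n x^*(\psi_n). \tag{1.4.1} \]
Defining $\theta_n = 1$ if $a_n x^*(\psi_n) \ge 0$ and $\theta_n = -1$ if $a_n x^*(\psi_n) < 0$, (1.4.1) yields
\[ \|x\| \le \sum_{n=1}^\infty |\lambda_n| \, |a_n x^*(\psi_n)| \le \sup_n |\lambda_n| \sum_{n=1}^\infty \theta_n a_n x^*(\psi_n) \le \sup_n |\lambda_n| \, \|x^*\| \, \Big\| M_\theta\Big( \sum_{n=1}^\infty a_n\psi_n \Big) \Big\| \le \sup_n |\lambda_n| \, K \Big\| \sum_{n=1}^\infty a_n\psi_n \Big\|. \]
In the case of complex scalars we consider separately the real and imaginary parts of $\sum_{n=1}^\infty \lambda_n a_n x^*(\psi_n)$.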
Chapter 2 Lebesgue-type Inequalities for Greedy Approximation with Respect to Some Classical Bases
2.1 Introduction

Let a Banach space $X$, with a basis $\Psi = \{\psi_k\}_{k=1}^\infty$, be given. We assume that $\|\psi_k\| \ge C > 0$, $k = 1, 2, \dots$, and consider the following theoretical greedy algorithm. For a given element $f \in X$, consider the expansion
\[ f = \sum_{k=1}^\infty c_k(f, \Psi)\psi_k. \tag{2.1.1} \]
For an element $f \in X$ we say that a permutation $\rho$ of the positive integers is decreasing if
\[ |c_{k_1}(f, \Psi)| \ge |c_{k_2}(f, \Psi)| \ge \cdots, \tag{2.1.2} \]
where $\rho(j) = k_j$, $j = 1, 2, \dots$, and write $\rho \in D(f)$. If the inequalities are strict in (2.1.2), then $D(f)$ consists of only one permutation. We define the $m$-th greedy approximant of $f$ with respect to the basis $\Psi$ and corresponding to a permutation $\rho \in D(f)$ by the formula
\[ G_m(f) := G_m(f, \Psi) := G_m(f, \Psi, \rho) := \sum_{j=1}^m c_{k_j}(f, \Psi)\psi_{k_j}. \]
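The definition of $G_m$ translates directly into code when an element is represented by a finite coefficient vector. The sketch below is an illustration only: it assumes the standard basis of a finite-dimensional $\ell_2$ space (an orthonormal basis of a Hilbert space), breaks ties in (2.1.2) by the smaller index, and compares the greedy error with a brute-force search over all index sets. The function names are hypothetical.

```python
import itertools
import math

def greedy_indices(coeffs, m):
    """Indices of the m largest coefficients in absolute value (ties: smaller index first)."""
    order = sorted(range(len(coeffs)), key=lambda k: (-abs(coeffs[k]), k))
    return order[:m]

def greedy_approximant(coeffs, m):
    """Coefficient vector of G_m(f): keep the m largest coefficients, zero out the rest."""
    keep = set(greedy_indices(coeffs, m))
    return [c if k in keep else 0.0 for k, c in enumerate(coeffs)]

def l2_error(coeffs, approx):
    """l_2 distance between two coefficient vectors."""
    return math.sqrt(sum((c - a) ** 2 for c, a in zip(coeffs, approx)))

def best_m_term_error_l2(coeffs, m):
    """Brute-force best m-term error in l_2 for the standard basis: try every index set of size m."""
    best = float("inf")
    for subset in itertools.combinations(range(len(coeffs)), m):
        kept = set(subset)
        # for a fixed index set, the optimal coefficients in l_2 are the original ones
        err = math.sqrt(sum(c * c for k, c in enumerate(coeffs) if k not in kept))
        best = min(best, err)
    return best

coeffs = [0.5, -3.0, 1.25, 0.1, 2.0, -0.75]
m = 3
g = greedy_approximant(coeffs, m)
print(greedy_indices(coeffs, m), g)
print(l2_error(coeffs, g), best_m_term_error_l2(coeffs, m))
```

In this orthonormal setting the two printed errors agree, since keeping the largest coefficients minimizes the $\ell_2$ norm of the discarded ones; for a general basis in a Banach space no such equality is implied by this sketch.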
We note that there is another natural greedy-type algorithm, based on ordering $\|c_k(f, \Psi)\psi_k\|$ instead of ordering absolute values of coefficients. In this case we do not need the restriction $\|\psi_k\| \ge C > 0$, $k = 1, 2, \dots$. Let $\Lambda_m(f)$ be a set of indices such that
\[ \min_{k \in \Lambda_m(f)} \|c_k(f, \Psi)\psi_k\| \ge \max_{k \notin \Lambda_m(f)} \|c_k(f, \Psi)\psi_k\|. \]
© Springer Basel 2015 V. Temlyakov, Sparse Approximation with Bases, Advanced Courses in Mathematics - CRM Barcelona, DOI 10.1007/978-3-0348-0890-3_2
We define $G_m^X(f, \Psi)$ by the formula
\[ G_m^X(f, \Psi) := S_{\Lambda_m(f)}(f, \Psi), \qquad \text{where} \quad S_E(f) := S_E(f, \Psi) := \sum_{k \in E} c_k(f, \Psi)\psi_k. \]
It is clear that for a normalized basis ($\|\psi_k\| = 1$, $k = 1, 2, \dots$) the above two greedy algorithms coincide. It is also clear that the above greedy algorithm $G_m^X(\cdot, \Psi)$ can be considered as a greedy algorithm $G_m(\cdot, \Psi')$, with $\Psi' := \{\psi_k/\|\psi_k\|\}_{k=1}^\infty$ being a normalized version of $\Psi$. Thus, we will concentrate on studying the algorithm $G_m(\cdot, \Psi)$. In the above definition of $G_m(\cdot, \Psi)$ we impose an extra condition on a basis $\Psi$: $\inf_k \|\psi_k\| > 0$. This restriction allows us to define $G_m(f, \Psi)$ for all $f \in X$.
The above algorithm $G_m(\cdot, \Psi)$ is a simple algorithm which describes a theoretical scheme for $m$-term approximation of an element $f$. We call this algorithm the Thresholding Greedy Algorithm (TGA) or simply the Greedy Algorithm (GA). In order to understand the efficiency of this algorithm, we compare its accuracy with the best-possible accuracy when an approximant is a linear combination of $m$ terms from $\Psi$. We define the best $m$-term approximation with respect to $\Psi$ as
\[ \sigma_m(f) := \sigma_m(f, \Psi)_X := \inf_{c_k, \Lambda} \Big\| f - \sum_{k \in \Lambda} c_k\psi_k \Big\|_X, \]
where the infimum is taken over the coefficients $c_k$ and the sets of indices $\Lambda$ with cardinality $|\Lambda| = m$. The best we can achieve with the algorithm $G_m$ is
\[ \|f - G_m(f, \Psi, \rho)\|_X = \sigma_m(f, \Psi)_X, \]
or the slightly weaker
\[ \|f - G_m(f, \Psi, \rho)\|_X \le G\sigma_m(f, \Psi)_X, \tag{2.1.3} \]
for all elements $f \in X$, and with a constant $G = C(X, \Psi)$ independent of $f$ and $m$. It is clear that when $X = H$ is a Hilbert space and $B$ is an orthonormal basis we have
\[ \|f - G_m(f, B, \rho)\|_H = \sigma_m(f, B)_H. \]
The following concept of a greedy basis has been introduced in [43].
Definition 2.1.1. We call a basis $\Psi$ a greedy basis if for every $f \in X$ there exists a permutation $\rho \in D(f)$ such that
\[ \|f - G_m(f, \Psi, \rho)\|_X \le C\sigma_m(f, \Psi)_X \]
with a constant $C$ independent of $f$ and $m$.
Lebesgue [51] proved the following inequality: for any $2\pi$-periodic continuous function $f$ we have
\[ \|f - S_n(f)\|_\infty \le \Big( 4 + \frac{4}{\pi^2} \ln n \Big) E_n(f)_\infty, \tag{2.1.4} \]
where $S_n(f)$ is the $n$-th partial sum of the Fourier series of $f$ and $E_n(f)_\infty$ is the error of the best approximation of $f$ by trigonometric polynomials of order $n$ in the uniform norm $\|\cdot\|_\infty$. The inequality (2.1.4) relates the error of a particular method ($S_n$) of approximation by trigonometric polynomials of order $n$ to the best-possible error $E_n(f)_\infty$ of approximation by trigonometric polynomials of order $n$. By a Lebesgue-type inequality we mean an inequality that provides an upper estimate for the error of a particular method of approximation of $f$ by elements of a special form, say form $A$, in terms of the best-possible approximation of $f$ by elements of the form $A$. In the case of approximation with respect to bases (or minimal systems), Lebesgue-type inequalities are known both in linear and in nonlinear settings (see the surveys [44], [76] and [81]). By Definition 2.1.1, greedy bases are those for which we have ideal (up to a multiplicative constant) Lebesgue inequalities for greedy approximation.
In this chapter we give first results on Lebesgue-type inequalities. In Section 2.2 we obtain Lebesgue-type inequalities for greedy approximation with respect to the trigonometric system. In Section 2.3 we study Lebesgue-type inequalities for greedy approximation with respect to the Haar basis and prove that the Haar basis is a greedy basis for $L_p$, $1 < p < \infty$. In Section 2.4 we prove a characterization theorem for greedy bases in Banach spaces. Sections 2.5 and 2.6 contain a further discussion of properties of greedy bases in Banach spaces.
In Section 2.2 we consider the case $X = L_p(\mathbb{T}^d)$, $1 \le p \le \infty$, where $\Psi = \mathcal{T}^d := \{e^{i(k,x)}\}_{k \in \mathbb{Z}^d}$ is the trigonometric system. We give a remark on approximation of one special function by trigonometric polynomials that shows an advantage of nonlinear approximation over linear approximation. Let us denote, for $f \in L_p(\mathbb{T})$,
\[ E_n(f, \mathcal{T})_p := \inf_{c_k,\ |k| \le n} \Big\| f(x) - \sum_{|k| \le n} c_k e^{ikx} \Big\|_p. \]
Ch. de la Vallée Poussin (1908) and S. N. Bernstein (1912) proved that
\[ E_n\big(|\sin x|, \mathcal{T}\big)_\infty \asymp n^{-1}. \]
R. S. Ismagilov [34] (1974) proved that
\[ \sigma_n\big(|\sin x|, \mathcal{T}\big)_\infty \le C_\varepsilon n^{-6/5+\varepsilon} \]
with arbitrary $\varepsilon > 0$. Later V. E. Maiorov [56] (1986) proved that
\[ \sigma_n\big(|\sin x|, \mathcal{T}\big)_\infty \asymp n^{-3/2}. \]
These results showed an advantage of nonlinear approximation over linear approximation for typical individual functions. Now, when we know that the efficiency of the best $m$-term approximation is good, the following important problem arises: construct an algorithm which realizes a good $m$-term approximation. It is clear from
the definition of $\sigma_m(f, \mathcal{T}^d)_p$ that a good algorithm should be nonlinear. In Section 2.2 we focus on the efficiency of the Thresholding Greedy Algorithm. We prove the following theorem [68].
Theorem 2.1.2. For each $f \in L_p(\mathbb{T}^d)$ we have
\[ \|f - G_m(f, \mathcal{T}^d)\|_p \le \big(1 + 3m^{h(p)}\big)\sigma_m(f, \mathcal{T}^d)_p, \qquad 1 \le p \le \infty, \]
where $h(p) := |1/2 - 1/p|$.
In Section 2.3 we discuss another important class of bases, wavelet-type bases. We discuss in detail the simplest representative of wavelet bases, the Haar basis. Denote by $H := \{H_k\}_{k=1}^\infty$ the Haar basis on $[0, 1)$ normalized in $L_2(0, 1)$: $H_1 = 1$ on $[0, 1)$, and for $k = 2^n + l$, $n = 0, 1, \dots$, $l = 1, 2, \dots, 2^n$,
\[ H_k = \begin{cases} 2^{n/2}, & x \in [(2l - 2)2^{-n-1}, (2l - 1)2^{-n-1}), \\ -2^{n/2}, & x \in [(2l - 1)2^{-n-1}, 2l\,2^{-n-1}), \\ 0, & \text{otherwise.} \end{cases} \]
We denote by $H_p := \{H_{k,p}\}_{k=1}^\infty$ the Haar basis $H$ renormalized in $L_p(0, 1)$. We will use the following definition of the $L_p$-equivalence of bases. We say that $\Psi = \{\psi_k\}_{k=1}^\infty$ is $L_p$-equivalent to $\Phi = \{\phi_k\}_{k=1}^\infty$ if for any finite set $\Lambda$ and any coefficients $c_k$, $k \in \Lambda$, we have
\[ C_1(p, \Psi, \Phi) \Big\| \sum_{k \in \Lambda} c_k\phi_k \Big\|_p \le \Big\| \sum_{k \in \Lambda} c_k\psi_k \Big\|_p \le C_2(p, \Psi, \Phi) \Big\| \sum_{k \in \Lambda} c_k\phi_k \Big\|_p \tag{2.1.5} \]
with two positive constants $C_1(p, \Psi, \Phi)$, $C_2(p, \Psi, \Phi)$ which may depend on $p$, $\Psi$, and $\Phi$. For sufficient conditions on $\Psi$ for it to be $L_p$-equivalent to $H$, see [25] and [13]. We prove the following theorem in Section 2.3 (see [69]).
Theorem 2.1.3. Let $1 < p < \infty$ and let the basis $\Psi$ be $L_p$-equivalent to the Haar basis $H_p$. Then for any $f \in L_p(0, 1)$ and any $\rho \in D(f)$,
\[ \|f - G_m(f, \Psi, \rho)\|_p \le C(p, \Psi)\sigma_m(f, \Psi)_p, \tag{2.1.6} \]
with a constant $C(p, \Psi)$ independent of $f$, $\rho$, and $m$.
In Section 2.4 we consider the general setting of greedy approximation in Banach spaces. We will concentrate on studying bases which satisfy (2.1.3) for all individual functions. Theorem 2.1.3 shows that each basis $\Psi$ which is $L_p$-equivalent to the univariate Haar basis $H_p$ is a greedy basis for $L_p(0, 1)$, $1 < p < \infty$. We note that in the case of a Hilbert space every orthonormal basis is a greedy basis. We now give a definition of a democratic basis (see [43]) that is needed in our characterization theorem.
Definition 2.1.4. We say that a basis $\Psi = \{\psi_k\}_{k=1}^\infty$ is a democratic basis for $X$ if there exists a constant $D := D(X, \Psi)$ such that, for any two finite sets of indices $P$ and $Q$ with the same cardinality $|P| = |Q|$,
\[ \Big\| \sum_{k \in P} \psi_k \Big\| \le D \Big\| \sum_{k \in Q} \psi_k \Big\|. \]
In [43] we proved the following theorem.
Theorem 2.1.5. A basis is greedy if and only if it is unconditional and democratic.
Section 2.5 contains a discussion (see [43]) of notions close to the notion of a greedy basis. In Section 2.6 we give some results on direct and inverse theorems for $m$-term approximation with respect to bases. The technique developed in Sections 2.3 and 2.4 provides a simple and straightforward way to get the equivalence relation between appropriate Lorentz space norms of the sequences of coefficients and best $m$-term approximations. We also discuss one interesting generalization of $m$-term approximation (restricted approximation) from [8].
2.2 The trigonometric system

Here we prove Theorem 2.1.2 from this chapter's Introduction. We restate it here for convenience.
Theorem 2.2.1. For each $f \in L_p(\mathbb{T}^d)$ we have
\[ \|f - G_m(f, \mathcal{T}^d)\|_p \le \big(1 + 3m^{h(p)}\big)\sigma_m(f, \mathcal{T}^d)_p, \qquad 1 \le p \le \infty, \]
where $h(p) := |1/2 - 1/p|$.
Proof. We treat separately the two cases $1 \le p \le 2$ and $2 \le p \le \infty$. But first we prove one auxiliary statement that holds for all $1 \le p \le \infty$. We use the notation
\[ \hat f(k) := (2\pi)^{-d} \int_{\mathbb{T}^d} f(x)e^{-i(k,x)} \, dx. \]
Lemma 2.2.2. Let $\Lambda \subset \mathbb{Z}^d$ be a finite subset of cardinality $|\Lambda| = m$. Then, for the operator $S_\Lambda$ defined on $L_1(\mathbb{T}^d)$ by
\[ S_\Lambda(f) := \sum_{k \in \Lambda} \hat f(k)e^{i(k,x)}, \]
we have, for all $1 \le p \le \infty$,
\[ \|S_\Lambda(f)\|_p \le m^{h(p)}\|f\|_p. \tag{2.2.1} \]
Proof. For a given linear operator $A$, denote by $\|A\|_{a \to b}$ the norm of $A$ as an operator from $L_a(\mathbb{T}^d)$ to $L_b(\mathbb{T}^d)$. Then it is obvious that
\[ \|S_\Lambda\|_{2 \to 2} = 1. \tag{2.2.2} \]
Consider
\[ D_\Lambda(x) := \sum_{k \in \Lambda} e^{i(k,x)}; \tag{2.2.3} \]
then
\[ S_\Lambda(f) = f * D_\Lambda := (2\pi)^{-d} \int_{\mathbb{T}^d} f(x - y)D_\Lambda(y) \, dy, \]
and for $p = 1$ or $p = \infty$ we have
\[ \|S_\Lambda\|_{p \to p} \le \|D_\Lambda\|_1 \le \|D_\Lambda\|_2 = m^{1/2}. \tag{2.2.4} \]
The relations (2.2.2) and (2.2.4) and the Riesz–Thorin theorem (see Appendix, Theorem 7.3.2) imply (2.2.1).
We now return to the proof of Theorem 2.2.1.
Case 1: $2 \le p \le \infty$. Take any function $f \in L_p(\mathbb{T}^d)$. Let $t_m$ be a trigonometric polynomial which realizes the best $m$-term approximation to $f$ in $L_p(\mathbb{T}^d)$. For the existence of $t_m$, see Theorem 1.2.3 from Chapter 1. Denote by $\Lambda$ the set of frequencies of $t_m$, i.e., $\Lambda := \{k : \hat t_m(k) \ne 0\}$. Then $|\Lambda| \le m$. Next, denote by $\Lambda'$ the set of frequencies of $G_m(f) := G_m(f, \mathcal{T}^d)$. Then $|\Lambda'| = m$. Let us use the representation
\[ f - G_m(f) = f - S_{\Lambda'}(f) = f - S_\Lambda(f) + S_\Lambda(f) - S_{\Lambda'}(f). \]
From this representation we derive
\[ \|f - G_m(f)\|_p \le \|f - S_\Lambda(f)\|_p + \|S_\Lambda(f) - S_{\Lambda'}(f)\|_p. \tag{2.2.5} \]
We use Lemma 2.2.2 to estimate the first term in the right-hand side of (2.2.5):
\[ \|f - S_\Lambda(f)\|_p = \|f - t_m - S_\Lambda(f - t_m)\|_p \le \big(1 + m^{h(p)}\big)\sigma_m(f, \mathcal{T}^d)_p. \tag{2.2.6} \]
In estimating the second term in (2.2.5) we use the well-known inequality $\|f\|_2 \le \|f\|_p$ for $2 \le p \le \infty$ (see inequality (7.1.4) in the Appendix) and the following lemma.
Lemma 2.2.3. Let $\Lambda \subset \mathbb{Z}^d$ be a finite subset of cardinality $|\Lambda| = n$. Then, for $2 \le p \le \infty$, we have
\[ \|S_\Lambda(f)\|_p \le n^{h(p)}\|f\|_2. \tag{2.2.7} \]
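Proof. For $p = \infty$ we have
\[ \|S_\Lambda(f)\|_\infty \le \sum_{k \in \Lambda} |\hat f(k)| \le n^{1/2} \Big( \sum_{k \in \Lambda} |\hat f(k)|^2 \Big)^{1/2} \le n^{1/2}\|f\|_2. \tag{2.2.8} \]
For $2 < p < \infty$ we use (2.2.8) and the well-known inequality
\[ \|g\|_p \le \|g\|_2^{2/p} \|g\|_\infty^{1-2/p}. \]
We continue estimating $\|S_\Lambda(f) - S_{\Lambda'}(f)\|_p$. Using Lemma 2.2.3 we get
\[ \|S_\Lambda(f) - S_{\Lambda'}(f)\|_p = \|S_{\Lambda \setminus \Lambda'}(f) - S_{\Lambda' \setminus \Lambda}(f)\|_p \le \|S_{\Lambda \setminus \Lambda'}(f)\|_p + \|S_{\Lambda' \setminus \Lambda}(f)\|_p \le m^{h(p)}\big( \|S_{\Lambda \setminus \Lambda'}(f)\|_2 + \|S_{\Lambda' \setminus \Lambda}(f)\|_2 \big). \tag{2.2.9} \]
The definition of $\Lambda'$ and the relations $|\Lambda'| = m$, $|\Lambda| \le m$ imply that
\[ \|S_{\Lambda \setminus \Lambda'}(f)\|_2 \le \|S_{\Lambda' \setminus \Lambda}(f)\|_2. \tag{2.2.10} \]
Finally, we have
\[ \|S_{\Lambda' \setminus \Lambda}(f)\|_2 \le \|f - S_\Lambda(f)\|_2 \le \|f - t_m\|_2 \le \|f - t_m\|_p = \sigma_m(f, \mathcal{T}^d)_p. \tag{2.2.11} \]
Combining the relations (2.2.9)–(2.2.11) we get
\[ \|S_\Lambda(f) - S_{\Lambda'}(f)\|_p \le 2m^{h(p)}\sigma_m(f, \mathcal{T}^d)_p. \tag{2.2.12} \]
The relations (2.2.5), (2.2.6), and (2.2.12) yield
\[ \|f - G_m(f)\|_p \le \big(1 + 3m^{h(p)}\big)\sigma_m(f, \mathcal{T}^d)_p. \]
This completes the proof of Theorem 2.2.1 in the case $2 \le p \le \infty$.
Case 2: $1 \le p \le 2$. We keep the notation of Case 1. We start again with the inequality (2.2.5). Next, the inequality (2.2.6) holds also for $1 \le p \le 2$ because it is based on Lemma 2.2.2, which covers the whole range $1 \le p \le \infty$ of the parameter $p$. Thus, it remains to estimate $\|S_\Lambda(f) - S_{\Lambda'}(f)\|_p$. Using the inequality $\|f\|_p \le \|f\|_2$ we get
\[ \|S_\Lambda(f) - S_{\Lambda'}(f)\|_p = \|S_{\Lambda \setminus \Lambda'}(f) - S_{\Lambda' \setminus \Lambda}(f)\|_p \le \|S_{\Lambda \setminus \Lambda'}(f)\|_p + \|S_{\Lambda' \setminus \Lambda}(f)\|_p \le \|S_{\Lambda \setminus \Lambda'}(f)\|_2 + \|S_{\Lambda' \setminus \Lambda}(f)\|_2. \tag{2.2.13} \]
In order to estimate $\|S_{\Lambda' \setminus \Lambda}(f)\|_2$ we use the part of the Hausdorff–Young theorem (see Appendix, Theorem 7.3.1) which states that
\[ \big\| (\hat f(k))_{k \in \mathbb{Z}^d} \big\|_{\ell_{p'}} \le \|f\|_p, \qquad 1 \le p \le 2, \qquad p' := \frac{p}{p - 1}. \]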
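We have
\[ \|S_{\Lambda' \setminus \Lambda}(f)\|_2 = \big\| (\hat f(k))_{k \in \Lambda' \setminus \Lambda} \big\|_{\ell_2} \le |\Lambda' \setminus \Lambda|^{1/p - 1/2} \big\| (\hat f(k))_{k \in \Lambda' \setminus \Lambda} \big\|_{\ell_{p'}} \le m^{h(p)} \big\| (\hat f(k) - \hat t_m(k))_{k \in \mathbb{Z}^d} \big\|_{\ell_{p'}} \le m^{h(p)}\|f - t_m\|_p = m^{h(p)}\sigma_m(f, \mathcal{T}^d)_p. \tag{2.2.14} \]
Combining (2.2.5), (2.2.6), (2.2.10), (2.2.13), and (2.2.14) we get
\[ \|f - G_m(f)\|_p \le \big(1 + 3m^{h(p)}\big)\sigma_m(f, \mathcal{T}^d)_p, \]
which completes the proof of Theorem 2.2.1.
Remark 2.2.4. Lemma 2.2.2 implies, for all $1 \le p \le \infty$, that
\[ \|G_m(f)\|_p \le m^{h(p)}\|f\|_p. \tag{2.2.15} \]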
Remark 2.2.5. There is a positive absolute constant $C$ such that for each $m$ and $1 \le p \le \infty$, there exists a function $f \ne 0$ with the property that
\[ \|G_m(f)\|_p \ge Cm^{h(p)}\|f\|_p. \]
Proof. We consider separately the two cases $1 \le p \le 2$ and $2 \le p \le \infty$. We start with the case $2 \le p \le \infty$. We use the Rudin–Shapiro polynomials (see Appendix, Section 7.4)
\[ R_m(x) = \sum_{|k| \le m} \varepsilon_k e^{ikx}, \qquad \varepsilon_k = \pm 1, \quad x \in \mathbb{T}, \tag{2.2.16} \]
which satisfy the estimate
\[ \|R_m\|_\infty \le Cm^{1/2} \tag{2.2.17} \]
with an absolute constant $C$. Denote, for $s = \pm 1$,
\[ \Lambda_s := \{k : \hat R_m(k) = s\}. \]
The estimate (2.2.17) implies that
\[ \big| |\Lambda_1| - |\Lambda_{-1}| \big| = |R_m(0)| \le Cm^{1/2}. \tag{2.2.18} \]
Let $s = \pm 1$ be such that $|\Lambda_s| > |\Lambda_{-s}|$. Then take a small positive parameter $\delta$ and consider the function
\[ f^R_{m,\delta} := R_m + s\delta D_m, \tag{2.2.19} \]
where
\[ D_m(x) := \sum_{|k| \le m} e^{ikx}, \qquad x \in \mathbb{T}, \]
is the Dirichlet kernel. Then, since $|\hat f^R_{m,\delta}(k)| = 1 + \delta$ for $k \in \Lambda_s$ and $|\hat f^R_{m,\delta}(k)| = 1 - \delta$ for $k \in \Lambda_{-s}$ and $|\Lambda_s| \ge m$, the frequencies of $G_m(f^R_{m,\delta})$ will be in $\Lambda_s$ and
\[ \|G_m(f^R_{m,\delta})\|_\infty \ge |G_m(f^R_{m,\delta})(0)| = (1 + \delta)m. \tag{2.2.20} \]
Next,
\[ \|f^R_{m,\delta}\|_p \le \|R_m\|_p + \delta\|D_m\|_p \le \|R_m\|_\infty + \delta(2m + 1)^{1-1/p} \le Cm^{1/2} + \delta(2m + 1)^{1-1/p} \le C_1 m^{1/2} \tag{2.2.21} \]
for $\delta \le m^{1/p-1/2}$. By the Nikol'skii inequality for trigonometric polynomials (see Appendix, Theorem 7.5.4), the relation (2.2.20) implies
\[ \|G_m(f^R_{m,\delta})\|_p \ge C_2 m^{-1/p}\|G_m(f^R_{m,\delta})\|_\infty \ge C_2 m^{1-1/p}. \tag{2.2.22} \]
By comparing (2.2.21) and (2.2.22), we obtain the required estimate in the case $2 \le p \le \infty$.
We proceed to the case $1 \le p \le 2$. We keep the notation of the previous case and introduce the de la Vallée Poussin kernels (see Appendix, Section 7.4)
\[ V_m(x) := \frac{1}{m} \sum_{l=m}^{2m-1} D_l(x), \qquad x \in \mathbb{T}, \quad m = 1, 2, \dots. \tag{2.2.23} \]
It is known (see Appendix, (7.4.14)) that
\[ \|V_m\|_p \le Cm^{1-1/p}, \qquad m = 1, 2, \dots, \quad 1 \le p \le \infty. \tag{2.2.24} \]
Consider the function
\[ f^V_{m,\delta} := V_m + s\delta R_m, \qquad 0 < \delta \le m^{1/2-1/p}. \tag{2.2.25} \]
The set $\Lambda$ of frequencies of $G_m(f^V_{m,\delta})$ has the properties $|\Lambda| = m$, $\Lambda \subset \Lambda_s$, and
\[ G_m(f^V_{m,\delta}) = (1 + \delta)D_\Lambda, \tag{2.2.26} \]
where $D_\Lambda$ is defined by (2.2.3). A lower bound for $\|D_\Lambda\|_p$ follows from the relations
\[ m = |\langle D_\Lambda, R_m \rangle| \le \|D_\Lambda\|_p \|R_m\|_{p'} \le Cm^{1/2}\|D_\Lambda\|_p. \]
We have
\[ \|D_\Lambda\|_p \ge C_3 m^{1/2}. \tag{2.2.27} \]
Now (2.2.17), (2.2.24), (2.2.26) and (2.2.27) imply the required inequality in the case $1 \le p \le 2$. The proof of Remark 2.2.5 is complete.
Remark 2.2.6. The trivial inequality $\sigma_m(f, \mathcal{T}^d)_p \le \|f\|_p$ and Remark 2.2.5 show that the factor $m^{h(p)}$ in Theorem 2.2.1 is sharp in the sense of order.
Remark 2.2.7. Using Remark 2.2.5 it is easy to construct, for each $p \ne 2$, a function $f \in L_p(\mathbb{T})$ such that the sequence $\{\|G_m(f)\|_p\}_{m=1}^\infty$ is not bounded.
Proposition 2.2.8. For each $2 < p \le \infty$ there exists $f \in L_p(\mathbb{T}^d)$ such that
\[ |\hat f(k)| = O\big( |k|^{-d(1-1/p)} \big), \qquad |k| := \max_{1 \le j \le d} |k_j|, \tag{2.2.28} \]
and the sequence $\{G_m(f)\}$ diverges in $L_p$.
Proof. We will use the construction from the proof of Remark 2.2.5. Define, for $x = (x_1, \dots, x_d)$,
\[ f^R_{m,\delta}(x) := \prod_{j=1}^d f^R_{m,\delta}(x_j)e^{i(2m+1)x_j} \]
and
\[ f := \sum_{l=1}^\infty 2^{-d(1-1/p)l} f^R_{2^l,\delta_l}(x), \qquad 0 < \delta_l < 2^{-dl-3}. \]
The relation (2.2.28) is obviously satisfied. Moreover, (2.2.21) implies that
\[ \|f - S_{Q(2^n)}(f)\|_\infty = O\big( 2^{-d(1/2-1/p)n} \big), \qquad Q(N) := \{k : |k| \le N^{1/d}\}. \tag{2.2.29} \]
However, (2.2.22) implies that $\{G_m(f)\}$ diverges in $L_p$.
Theorem 2.2.9. There exists a continuous function $f$ such that $G_m(f, \mathcal{T})$ does not converge to $f$ in $L_p$ for any $p > 2$.
Theorem 2.2.10. There exists a function $f$ that belongs to any $L_p$, $p < 2$, such that $G_m(f, \mathcal{T})$ does not converge to $f$ in measure.
The proof of both theorems is based on the two examples (one for $p > 2$ and the other for $p < 2$) constructed in the proof of Remark 2.2.5.
Proof. We now prove Theorems 2.2.9 and 2.2.10. For Theorem 2.2.9, it is sufficient to consider the following function in the univariate case:
\[ f := \sum_{l=1}^\infty 2^{-l/2} l^{-2} f^R_{2^l,\delta_l}(x), \qquad 0 < \delta_l < 2^{-l-3}. \]
For Theorem 2.2.10 we use the Rudin–Shapiro polynomials
\[ R_N(x) = \sum_{k=0}^{N-1} \varepsilon_k e^{ikx}, \qquad \varepsilon_k = \pm 1, \quad x \in \mathbb{T}, \]
which satisfy the inequality
\[ \|R_N\|_\infty \le CN^{1/2} \tag{2.2.30} \]
with an absolute constant $C$. Denote, for $s = \pm 1$,
\[ \Lambda_s := \Lambda_s(N) := \{k : \hat R_N(k) = s\}. \]
Denote also
\[ D_\Lambda(x) := \sum_{k \in \Lambda} e^{ikx}. \]
Then $R_N = D_{\Lambda_{+1}} - D_{\Lambda_{-1}}$. The inequality (2.2.30) implies that
\[ \|R_N\|_1 \ge C_1 N^{1/2}. \]
Using this inequality we prove that there exist two positive constants, $c_1$ and $c_2$, such that for one of $s = \pm 1$ we have
\[ \operatorname{measure}\{x : |D_{\Lambda_s(N)}(x)| \ge c_1 N^{1/2}\} \ge c_2. \tag{2.2.31} \]
We define a function $f$ from Theorem 2.2.10 by
\[ f := \sum_{v=1}^\infty 2^{-v/2} e^{i2^v x} \big( D_{\Lambda_s(2^v)} + s2^{-v}R_{2^v} \big). \]
Then for appropriately chosen $m_1$ and $m_2$ we get
\[ G_{m_1}(f, \mathcal{T}) - G_{m_2}(f, \mathcal{T}) = 2^{-v/2} e^{i2^v x} \big(1 + 2^{-v}\big) D_{\Lambda_s(2^v)} \]
and, by (2.2.31),
\[ \operatorname{measure}\{x : |G_{m_1}(f) - G_{m_2}(f)| \ge c_1\} \ge c_2, \]
which shows that $\{G_m(f, \mathcal{T})\}$ does not converge in measure. Further, for any $1 < p < 2$ we have
\[ \big\| D_{\Lambda_s(2^v)} + s2^{-v}R_{2^v} \big\|_p \le C2^{v(1-1/p)}, \]
which implies that $f \in L_p$.
We also mention two interesting results on convergence almost everywhere. T. W. Körner, answering a question raised by Carleson and Coifman, constructed in [48] a function from $L_2$ and then in [49] a continuous function such that $\{G_m(f, \mathcal{T})\}$ diverges almost everywhere. T. Tao [65] proved that for the Haar system we have convergence: the sequence $\{G_m(f, H_p)\}$ converges almost everywhere to $f$ for any $f \in L_p$, $1 < p < \infty$.
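The bound (2.2.30) for Rudin–Shapiro polynomials of the form $\sum_{k=0}^{N-1} \varepsilon_k e^{ikx}$ can be observed numerically. The sketch below is an assumption-laden illustration: it uses the classical recursive construction $P_{j+1} = (P_j, Q_j)$, $Q_{j+1} = (P_j, -Q_j)$ of the sign sequence for $N = 2^n$, which need not coincide with the construction in the book's Appendix, Section 7.4 (not reproduced here). For this classical sequence the identity $|P_n|^2 + |Q_n|^2 = 2^{n+1}$ on the unit circle gives the sup-norm bound $\sqrt{2N}$. The function names are hypothetical.

```python
import cmath
import math

def rudin_shapiro_signs(n):
    """First 2**n signs of the classical Rudin-Shapiro sequence (recursive concatenation)."""
    p, q = [1], [1]
    for _ in range(n):
        # P_{j+1} = P_j followed by Q_j, Q_{j+1} = P_j followed by -Q_j
        p, q = p + q, p + [-c for c in q]
    return p

def sup_norm_on_grid(signs, grid_size):
    """Maximum of |sum_k signs[k] * e^{ikx}| over grid_size equispaced points of [0, 2*pi)."""
    best = 0.0
    for j in range(grid_size):
        x = 2 * math.pi * j / grid_size
        value = sum(s * cmath.exp(1j * k * x) for k, s in enumerate(signs))
        best = max(best, abs(value))
    return best

n = 6
N = 2 ** n
signs = rudin_shapiro_signs(n)
sup_rs = sup_norm_on_grid(signs, 8 * N)
sup_dirichlet = sup_norm_on_grid([1] * N, 8 * N)   # all signs +1, for comparison
print(sup_rs / math.sqrt(N), sup_dirichlet / math.sqrt(N))
```

The first ratio stays between 1 and $\sqrt{2}$, whereas for the polynomial with all coefficients equal to 1 the ratio equals $\sqrt{N}$; this gap between $N^{1/2}$ and $N$ is what the examples in the proof of Remark 2.2.5 exploit.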
We now make some remarks about possible generalizations of Theorem 2.2.1. Reviewing the proof of Theorem 2.2.1 one verifies that all arguments hold true for any orthonormal system $\{\phi_j\}_{j=1}^\infty$ of uniformly bounded functions, $\|\phi_j\|_\infty \le M$, $j = 1, 2, \dots$. The only difference is that instead of the Hausdorff–Young theorem we shall use the Riesz theorem, and the constants in Lemmas 2.2.2 and 2.2.3 will depend on $M$. Let us formulate the corresponding analog of Theorem 2.2.1. Let $\Phi := \{\phi_j\}_{j=1}^\infty$ be an orthonormal system in $L_2(\mathbb{T}^d)$ such that $\|\phi_j\|_\infty \le M$, $j = 1, 2, \dots$.
Theorem 2.2.11. For any orthonormal system $\Phi = \{\phi_j\}_{j=1}^\infty$ of uniformly bounded functions, $\|\phi_j\|_\infty \le M$, there exists a constant $C(M)$ such that
\[ \|f - G_m(f, \Phi)\|_p \le C(M)m^{h(p)}\sigma_m(f, \Phi)_p, \qquad 1 \le p \le \infty, \]
where $h(p) := |1/2 - 1/p|$.
2.3 Wavelet bases In this section it will be convenient to index elements of bases by dyadic intervals: ψ1 =: ψ[0,1] and
ψ2n +l =: ψI , I = (l − 1)2−n , l2−n . We begin by proving Theorem 2.3.1 (see [69]) and note that Theorem 2.1.3 from the Introduction follows from Theorem 2.3.1 by a simple renormalization argument. Theorem 2.3.1. Let 1 < p < ∞ and let the basis Ψ := {ψI }I be Lp -equivalent to H. Then for any f ∈ Lp we have f − Gpm (f, Ψ) ≤ C(p)σm (f, Ψ)p . p Proof. Let us take a parameter 0 < t ≤ 1 and consider the following greedy-type algorithm Gp,t with respect to the Haar system. For the Haar basis H we define 1 cI (f ) := f, HI = f (x)HI (x) dx. 0
Denote by $\Lambda_m(t)$ any set of $m$ dyadic intervals such that
\[ \min_{I\in\Lambda_m(t)} \|c_I(f)H_I\|_p \ \ge\ t \max_{J\notin\Lambda_m(t)} \|c_J(f)H_J\|_p, \tag{2.3.1} \]
and define
\[ G_m^{p,t}(f) := G_m^{p,t}(f,\mathcal H) := \sum_{I\in\Lambda_m(t)} c_I(f)H_I. \tag{2.3.2} \]
For a given function $f\in L_p$ we define
\[ g(f) := \sum_I c_I(f,\Psi)H_I. \tag{2.3.3} \]
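The selection rule (2.3.1) can be illustrated with a minimal Python sketch. It assumes $L_2$-normalized Haar functions, so that $\|c_IH_I\|_p = |c_I|\,|I|^{1/p-1/2}$ (the identity used later in the proof of Lemma 2.3.3). The function names and the sample data are hypothetical and serve only as an illustration.

```python
def haar_term_norm(coeff, length, p):
    # ||c_I H_I||_p = |c_I| * |I|**(1/p - 1/2) for an L_2-normalized Haar function.
    return abs(coeff) * length ** (1.0 / p - 0.5)


def greedy_index_set(coeffs, lengths, m, p):
    # Indices of the m largest values of ||c_I H_I||_p: a set Lambda_m(t) with t = 1.
    norms = [haar_term_norm(c, l, p) for c, l in zip(coeffs, lengths)]
    order = sorted(range(len(norms)), key=lambda i: norms[i], reverse=True)
    return sorted(order[:m])


def satisfies_weak_condition(coeffs, lengths, chosen, t, p):
    # Condition (2.3.1): min over the chosen set >= t * max over its complement.
    norms = [haar_term_norm(c, l, p) for c, l in zip(coeffs, lengths)]
    inside = [norms[i] for i in chosen]
    outside = [norms[i] for i in range(len(norms)) if i not in chosen]
    if not inside or not outside:
        return True
    return min(inside) >= t * max(outside)


# Hypothetical data: three Haar coefficients on intervals of length 1, 1/4, 1/16.
coeffs = [1.0, 0.5, 0.6]
lengths = [1.0, 0.25, 0.0625]
print(greedy_index_set(coeffs, lengths, 2, p=4))
print(greedy_index_set(coeffs, lengths, 2, p=1))
```

With these data the selected set depends on $p$: for $p=4$ the terms on short intervals are weighted up, for $p=1$ they are weighted down. A set that fails (2.3.1) with $t=1$ may still satisfy it with a smaller $t$, which is exactly the flexibility that the parameter $t$ provides.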
It is clear that $g(f)\in L_p$ and
\[ \sigma_m\bigl(g(f),\mathcal H\bigr)_p \le C_1(p)^{-1}\sigma_m(f,\Psi)_p. \tag{2.3.4} \]
Here and later on we use the brief notation $C_i(p) := C_i(p,\Psi,\mathcal H)$, $i=1,2$, for the constants from (2.1.5). Let
\[ G_m^p(f,\Psi) = \sum_{I\in\Lambda_m} c_I(f,\Psi)\psi_I. \]
Next, for any two intervals $I\in\Lambda_m$, $J\notin\Lambda_m$, by the definition of $\Lambda_m$ we have
\[ \|c_I(f,\Psi)\psi_I\|_p \ge \|c_J(f,\Psi)\psi_J\|_p, \]
whence, using (2.1.5),
\[ \|c_I(g(f))H_I\|_p = \|c_I(f,\Psi)H_I\|_p \ge C_2(p)^{-1}\|c_I(f,\Psi)\psi_I\|_p \ge C_2(p)^{-1}\|c_J(f,\Psi)\psi_J\|_p \ge C_1(p)C_2(p)^{-1}\|c_J(g(f))H_J\|_p. \tag{2.3.5} \]
This inequality implies that for any $m$ we can find a set $\Lambda_m(t)$, where $t = C_1(p)C_2(p)^{-1}$, such that $\Lambda_m(t) = \Lambda_m$ and, therefore,
\[ \|f - G_m^p(f,\Psi)\|_p \le C_2(p)\,\|g(f) - G_m^{p,t}(g(f))\|_p. \tag{2.3.6} \]
The relations (2.3.4) and (2.3.6) show that Theorem 2.3.1 follows from Theorem 2.3.2.

Theorem 2.3.2. Let $1<p<\infty$ and $0<t\le 1$. Then for any $g\in L_p$ we have
\[ \|g - G_m^{p,t}(g,\mathcal H)\|_p \le C(p,t)\,\sigma_m(g,\mathcal H)_p. \]
Proof. The Littlewood–Paley Theorem for the Haar system (see for instance [41]) gives, for $1<p<\infty$,
\[ C_3(p)\Bigl\|\Bigl(\sum_I |c_I(g)H_I|^2\Bigr)^{1/2}\Bigr\|_p \le \|g\|_p \le C_4(p)\Bigl\|\Bigl(\sum_I |c_I(g)H_I|^2\Bigr)^{1/2}\Bigr\|_p. \tag{2.3.7} \]
We first formulate two simple corollaries of (2.3.7):
\[ \|g\|_p \le C_5(p)\Bigl(\sum_I \|c_I(g)H_I\|_p^p\Bigr)^{1/p}, \qquad 1<p\le 2, \tag{2.3.8} \]
\[ \|g\|_p \le C_6(p)\Bigl(\sum_I \|c_I(g)H_I\|_p^2\Bigr)^{1/2}, \qquad 2\le p<\infty. \tag{2.3.9} \]
The proof of analogs of these inequalities for the trigonometric system can be found, for instance, in [67], p. 37. The same proof gives (2.3.8) and (2.3.9). The dual inequalities to (2.3.8) and (2.3.9) are
\[ \|g\|_p \ge C_7(p)\Bigl(\sum_I \|c_I(g)H_I\|_p^2\Bigr)^{1/2}, \qquad 1<p\le 2, \tag{2.3.10} \]
\[ \|g\|_p \ge C_8(p)\Bigl(\sum_I \|c_I(g)H_I\|_p^p\Bigr)^{1/p}, \qquad 2\le p<\infty. \tag{2.3.11} \]
We proceed to the proof of Theorem 2.3.2. Let $T_m$ be an $m$-term Haar polynomial of best $m$-term approximation to $g$ in $L_p$ (for existence, see [2], [23] and also Theorems 1.3.6 and 1.3.7 from Chapter 1):
\[ T_m = \sum_{I\in\Lambda} a_I H_I, \qquad |\Lambda| = m. \]
For any finite set $Q$ of dyadic intervals we denote by $S_Q$ the projector
\[ S_Q(f) := \sum_{I\in Q} c_I(f)H_I. \]
From (2.3.7) we get
\[ \|g - S_\Lambda(g)\|_p = \|g - T_m - S_\Lambda(g - T_m)\|_p \le \|\mathrm{Id} - S_\Lambda\|_{p\to p}\,\sigma_m(g,\mathcal H)_p \le C_4(p)C_3(p)^{-1}\sigma_m(g,\mathcal H)_p, \tag{2.3.12} \]
where $\mathrm{Id}$ denotes the identity operator. Further, we have $G_m^{p,t}(g) = S_{\Lambda_m(t)}(g)$, and
\[ \|g - G_m^{p,t}(g)\|_p \le \|g - S_\Lambda(g)\|_p + \|S_\Lambda(g) - S_{\Lambda_m(t)}(g)\|_p. \tag{2.3.13} \]
The first term in the right-hand side of (2.3.13) was estimated in (2.3.12). We now estimate the second term. We represent it in the form
\[ S_\Lambda(g) - S_{\Lambda_m(t)}(g) = S_{\Lambda\setminus\Lambda_m(t)}(g) - S_{\Lambda_m(t)\setminus\Lambda}(g) \]
and remark that, similarly to (2.3.12), we get
\[ \|S_{\Lambda_m(t)\setminus\Lambda}(g)\|_p \le C_9(p)\,\sigma_m(g,\mathcal H)_p. \tag{2.3.14} \]
The key point of the proof of Theorem 2.3.2 is the estimate
\[ \|S_{\Lambda\setminus\Lambda_m(t)}(g)\|_p \le C(p,t)\,\|S_{\Lambda_m(t)\setminus\Lambda}(g)\|_p, \tag{2.3.15} \]
which will be derived from the following two lemmas.
Lemma 2.3.3. Consider
\[ f = \sum_{I\in Q} c_I H_I, \qquad |Q| = N. \]
Let $1\le p<\infty$. Assume
\[ \|c_I H_I\|_p \le 1, \qquad I\in Q. \tag{2.3.16} \]
Then $\|f\|_p \le C_{10}(p)N^{1/p}$.

Lemma 2.3.4. Consider
\[ f = \sum_{I\in Q} c_I H_I, \qquad |Q| = N. \]
Let $1<p\le\infty$. Assume
\[ \|c_I H_I\|_p \ge 1, \qquad I\in Q. \]
Then $\|f\|_p \ge C_{11}(p)N^{1/p}$.

Proof. First we prove Lemma 2.3.3. We note that in the case $1<p\le 2$ its statement follows from (2.3.8). We will give a proof of this lemma for all $1\le p<\infty$. We have
\[ \|c_I H_I\|_p = |c_I|\,|I|^{1/p-1/2}. \]
The assumption (2.3.16) implies that $|c_I| \le |I|^{1/2-1/p}$. Next, we have
\[ \|f\|_p \le \Bigl\|\sum_{I\in Q} |c_I H_I|\Bigr\|_p \le \Bigl\|\sum_{I\in Q} |I|^{-1/p}\chi_I(x)\Bigr\|_p, \tag{2.3.17} \]
where $\chi_I(x)$ is the characteristic function of the interval $I$: $\chi_I(x) = 1$ for $x\in I$ and $\chi_I(x) = 0$ for $x\notin I$. In order to proceed further we need a lemma.

Lemma 2.3.5. Let $n_1<n_2<\dots<n_s$ be integers and let $E_j\subset[0,1]$ be measurable sets, $j=1,\dots,s$. Then for any $0<q<\infty$ we have
\[ \int_0^1 \Bigl(\sum_{j=1}^s 2^{n_j/q}\chi_{E_j}(x)\Bigr)^q dx \le C_{12}(q)\sum_{j=1}^s 2^{n_j}|E_j|. \]
Proof. Let us denote
\[ F(x) := \sum_{j=1}^s 2^{n_j/q}\chi_{E_j}(x) \]
and estimate $F(x)$ on the sets
\[ E_l^- := E_l \setminus \bigcup_{k=l+1}^s E_k, \quad l=1,\dots,s-1; \qquad E_s^- := E_s. \]
We have, for $x\in E_l^-$,
\[ F(x) \le \sum_{j=1}^l 2^{n_j/q} \le C(q)\,2^{n_l/q}. \]
Therefore,
\[ \int_0^1 F(x)^q\,dx \le C(q)^q\sum_{l=1}^s 2^{n_l}|E_l^-| \le C(q)^q\sum_{l=1}^s 2^{n_l}|E_l|, \]
as claimed.
We return to the proof of Lemma 2.3.3. Denote by $n_1<n_2<\dots<n_s$ all integers such that there is $I\in Q$ with $|I| = 2^{-n_j}$. Introduce the sets
\[ E_j := \bigcup_{I\in Q;\ |I| = 2^{-n_j}} I. \]
Then the number $N$ of elements in $Q$ can be written in the form
\[ N = \sum_{j=1}^s |E_j|\,2^{n_j}. \tag{2.3.18} \]
Using these notations the right-hand side of (2.3.17) can be rewritten as
\[ Y := \Bigl(\int_0^1 \Bigl(\sum_{j=1}^s 2^{n_j/p}\chi_{E_j}(x)\Bigr)^p dx\Bigr)^{1/p}. \]
Applying Lemma 2.3.5 with $q=p$ we get
\[ \|f\|_p \le Y \le C_{13}(p)\Bigl(\sum_{j=1}^s |E_j|\,2^{n_j}\Bigr)^{1/p} = C_{13}(p)N^{1/p}. \]
In the last step we used (2.3.18). This completes the proof.
Proof. We next prove Lemma 2.3.4. We derive Lemma 2.3.4 from Lemma 2.3.3. Define
\[ u := \sum_{I\in Q} \bar c_I\,|c_I|^{-1}|I|^{1/p-1/2}H_I, \]
where the bar means complex conjugation. Then for $p' = \frac{p}{p-1}$ we have
\[ \bigl\|\bar c_I\,|c_I|^{-1}|I|^{1/p-1/2}H_I\bigr\|_{p'} = 1 \]
and, by Lemma 2.3.3,
\[ \|u\|_{p'} \le C_{10}(p')N^{1/p'}. \tag{2.3.19} \]
Consider $\langle f,u\rangle$. We have, on one hand,
\[ \langle f,u\rangle = \sum_{I\in Q} |c_I|\,|I|^{1/p-1/2} = \sum_{I\in Q}\|c_I H_I\|_p \ge N, \tag{2.3.20} \]
while on the other hand
\[ \langle f,u\rangle \le \|f\|_p\,\|u\|_{p'}. \tag{2.3.21} \]
Combining (2.3.19)–(2.3.21) we obtain the statement of Lemma 2.3.4.
We now complete the proof of Theorem 2.3.2. It remains to prove inequality (2.3.15). Denote
\[ A := \max_{I\in\Lambda\setminus\Lambda_m(t)} \|c_I(g)H_I\|_p \quad\text{and}\quad B := \min_{I\in\Lambda_m(t)\setminus\Lambda} \|c_I(g)H_I\|_p. \]
Then by the definition of $\Lambda_m(t)$ we have
\[ B \ge tA. \tag{2.3.22} \]
Further, using Lemma 2.3.3 we get
\[ \|S_{\Lambda\setminus\Lambda_m(t)}(g)\|_p \le A\,C_{10}(p)\,|\Lambda\setminus\Lambda_m(t)|^{1/p} \le t^{-1}B\,C_{10}(p)\,|\Lambda\setminus\Lambda_m(t)|^{1/p}. \tag{2.3.23} \]
Using Lemma 2.3.4 we get
\[ \|S_{\Lambda_m(t)\setminus\Lambda}(g)\|_p \ge B\,C_{11}(p)\,|\Lambda_m(t)\setminus\Lambda|^{1/p}. \tag{2.3.24} \]
Taking into account that $|\Lambda_m(t)\setminus\Lambda| = |\Lambda\setminus\Lambda_m(t)|$, we infer from (2.3.23) and (2.3.24) the inequality (2.3.15). The proof of Theorem 2.3.2 is complete.

We now discuss the multivariate analog of Theorem 2.3.1. There are several natural generalizations of the Haar system to the $d$-dimensional case. We describe first the one for which the statement of Theorem 2.3.1 and its proof coincide with
the one-dimensional version. First of all we include in the system the constant function $H_{[0,1]^d}(x) = 1$, $x\in[0,1)^d$. Next we define $2^d-1$ functions with support $[0,1)^d$. Take any combination of intervals $Q_1,\dots,Q_d$, where $Q_i = [0,1]$ or $Q_i = [0,1)$ with at least one $Q_j = [0,1)$, and define, for $Q = Q_1\times\dots\times Q_d$, $x = (x_1,\dots,x_d)$,
\[ H_Q(x) := \prod_{i=1}^d H_{Q_i}(x_i). \]
We shall also denote these functions by $H^k_{[0,1)^d}(x)$, $k=1,\dots,2^d-1$. We define the basis of Haar functions with supports on dyadic cubes of the form
\[ J = [(j_1-1)2^{-n},\, j_12^{-n})\times\dots\times[(j_d-1)2^{-n},\, j_d2^{-n}), \qquad j_i = 1,\dots,2^n,\quad n=0,1,\dots. \tag{2.3.25} \]
For each dyadic cube of the form (2.3.25) we define $2^d-1$ basis functions by
\[ H_J^k(x) := 2^{nd/2}H^k_{[0,1)^d}\bigl(2^n(x - (j_1-1,\dots,j_d-1)2^{-n})\bigr), \qquad k=1,\dots,2^d-1. \]
We can also use another enumeration of these functions. Let $H^k_{[0,1)^d}(x) = H_Q(x)$ with
\[ Q = Q_1\times\dots\times Q_d, \qquad Q_i = [0,1),\ i\in E, \qquad Q_i = [0,1],\ i\in\{1,\dots,d\}\setminus E, \qquad E\ne\emptyset. \]
Consider a dyadic interval $I$ of the form
\[ I = I_1\times\dots\times I_d, \qquad I_i = [(j_i-1)2^{-n},\, j_i2^{-n}),\ i\in E, \qquad I_i = [(j_i-1)2^{-n},\, j_i2^{-n}],\ i\in\{1,\dots,d\}\setminus E, \qquad E\ne\emptyset, \tag{2.3.26} \]
and define $H_I(x) := H_J^k(x)$. Taking as the set of dyadic intervals $D$ the set of all dyadic cubes of the form (2.3.26) amended by the cube $[0,1]^d$ and denoting by $\mathcal H$ the corresponding basis $\{H_I\}_{I\in D}$, we get the multivariate Haar system.

Remark 2.3.6. Theorem 2.3.1 holds for the multivariate Haar system $\mathcal H$ with the constant $C(p)$ allowed to depend also on $d$.

In this section we studied approximation in $L_p([0,1])$ and made a remark about approximation in $L_p([0,1]^d)$. We can treat in the same way approximation in $L_p(\mathbb R^d)$.

Remark 2.3.7. Theorem 2.3.1 holds for approximation in $L_p(\mathbb R^d)$.

Results on approximation of function classes using the multivariate greedy algorithm $G_m^p(\cdot,\Psi)$ can be found in [12]. Let us discuss now another multivariate Haar basis $\mathcal H^d := \mathcal H\times\dots\times\mathcal H$, which is obtained from the univariate one by tensor product.
Theorem 2.3.8. Let $1<p<\infty$. Then for any $f\in L_p([0,1]^d)$,
\[ \|f - G_m^p(f,\mathcal H^d)\|_p \le C(p,d)\bigl(\log(m+1)\bigr)^{(d-1)|1/2-1/p|}\sigma_m(f,\mathcal H^d)_p. \]
This theorem was conjectured in [70] and was proved there in the particular case $d=2$, $4/3\le p\le 4$. The general case was proved in [85].

Theorem 2.3.9. For any $1<p<\infty$ there exists $C(p,d)>0$ such that for any $m$ there is $f_m\in L_p$, $f_m\ne 0$, with the property that
\[ \|f_m - G_m^p(f_m,\mathcal H^d)\|_p \ge C(p,d)\bigl(\log(m+1)\bigr)^{(d-1)|1/2-1/p|}\sigma_m(f_m,\mathcal H^d)_p. \]
This theorem was proved by R. Hochmuth. We will give a proof of it from [70]. For a set $\Lambda$ of indices we define
\[ g_{\Lambda,p} := \sum_{I\in\Lambda} |I|^{1/2-1/p}H_I. \]
For each $n\in\mathbb N$ we define two sets $A$ and $B$ of dyadic intervals $I$ as follows:
\[ A := \{I : |I| = 2^{-n}\}; \qquad B := \{I : I\notin A;\ I\cap I' = \emptyset \text{ for all } I'\in B,\ I'\ne I\}, \quad |B| = |A|. \]
Let $2\le p<\infty$ be given. Denote $m = \#A$ and consider $f = g_{A,p} + 2g_{B,p}$. Then on one hand $G_m^p(f,\mathcal H^d) = 2g_{B,p}$ and
\[ \|f - G_m^p(f,\mathcal H^d)\|_p = \|g_{A,p}\|_p \asymp m^{1/p}(\log m)^{(1/2-1/p)(d-1)}. \tag{2.3.27} \]
On the other hand,
\[ \sigma_m(f,\mathcal H^d)_p \le \|2g_{B,p}\|_p \ll m^{1/p}. \tag{2.3.28} \]
The relations (2.3.27) and (2.3.28) imply the required lower estimate in the case $2\le p<\infty$. The remaining case $1<p\le 2$ can be handled in the same way considering the function $f = 2g_{A,p} + g_{B,p}$.
2.4 Greedy bases

We begin by proving Theorem 2.1.5 from the Introduction.

Theorem 2.4.1. A basis is greedy if and only if it is unconditional and democratic.

We will use the following well-known result about unconditional bases (see [52, p. 19] and Theorem 1.4.6 from Chapter 1).

Theorem 2.4.2. Let $\Psi$ be an unconditional basis for $X$. Then for every choice of bounded scalars $\{\lambda_k\}_{k=1}^\infty$ we have
\[ \Bigl\|\sum_{k=1}^\infty \lambda_k a_k\psi_k\Bigr\| \le 2K\sup_k|\lambda_k|\,\Bigl\|\sum_{k=1}^\infty a_k\psi_k\Bigr\| \]
(in the case of a real Banach space $X$ we can take $K$ instead of $2K$).

First, we prove that an unconditional and democratic basis is a greedy basis. Take any $\varepsilon>0$ and find
\[ p_m(f) := \sum_{k\in P} b_k\psi_k \]
such that $|P| = m$ and
\[ \|f - p_m(f)\| \le \sigma_m(f,\Psi) + \varepsilon. \tag{2.4.1} \]
For any finite set of indices $\Lambda$ we denote by $S_\Lambda$ the projector
\[ S_\Lambda(f) := \sum_{k\in\Lambda} c_k(f)\psi_k. \]
The assumption that $\Psi$ is an unconditional basis implies that
\[ \|f - S_P(f)\| \le 2K\bigl(\sigma_m(f,\Psi) + \varepsilon\bigr). \tag{2.4.2} \]
Let $\rho\in D(f)$ and
\[ G_m(f,\Psi,\rho) = \sum_{k\in Q} c_k(f)\psi_k = S_Q(f). \]
Then
\[ \|f - G_m(f,\Psi,\rho)\| \le \|f - S_P(f)\| + \|S_P(f) - S_Q(f)\|. \tag{2.4.3} \]
The first term in the right-hand side of (2.4.3) was estimated in (2.4.2). We now estimate the second term. We have
\[ S_P(f) - S_Q(f) = S_{P\setminus Q}(f) - S_{Q\setminus P}(f). \tag{2.4.4} \]
As in (2.4.2), we have
\[ \|S_{Q\setminus P}(f)\| \le 2K\bigl(\sigma_m(f,\Psi) + \varepsilon\bigr). \tag{2.4.5} \]
We now estimate $\|S_{P\setminus Q}(f)\|$. By the definition of the greedy algorithm $G_m$,
\[ A := \max_{k\in P\setminus Q}|c_k(f)| \le \min_{k\in Q\setminus P}|c_k(f)| =: B. \tag{2.4.6} \]
By Theorem 2.4.2, we have
\[ \|S_{P\setminus Q}(f)\| \le 2KA\,\Bigl\|\sum_{k\in P\setminus Q}\psi_k\Bigr\| \tag{2.4.7} \]
and
\[ \|S_{Q\setminus P}(f)\| \ge (2K)^{-1}B\,\Bigl\|\sum_{k\in Q\setminus P}\psi_k\Bigr\|. \tag{2.4.8} \]
By the assumption that $\Psi$ is democratic we get
\[ \Bigl\|\sum_{k\in P\setminus Q}\psi_k\Bigr\| \le D\,\Bigl\|\sum_{k\in Q\setminus P}\psi_k\Bigr\|. \tag{2.4.9} \]
Combining (2.4.7)–(2.4.9) we obtain
\[ \|S_{P\setminus Q}(f)\| \le 4DK^2\,\|S_{Q\setminus P}(f)\|. \tag{2.4.10} \]
Now using (2.4.5) and (2.4.10) we derive from (2.4.4) and (2.4.3) that
\[ \|f - G_m(f,\Psi,\rho)\| \le (8DK^3 + 4K)\bigl(\sigma_m(f,\Psi) + \varepsilon\bigr) \]
and, therefore, the inequality
\[ \|f - G_m(f,\Psi,\rho)\| \le (8DK^3 + 4K)\,\sigma_m(f,\Psi) \]
holds.

Second, we prove the inverse part of the theorem: every greedy basis is unconditional and democratic. Assume that a basis $\Psi$ satisfies (2.1.3) for all $f\in X$. We begin with the unconditionality. We shall prove that for each function $f\in X$ and any finite set $\Lambda$ we have
\[ \|S_\Lambda(f)\| \le (G+1)\|f\|, \tag{2.4.11} \]
where $G$ is from (2.1.3). It is well known that (2.4.11) implies that $\Psi$ is an unconditional basis. Take a number $N$ such that
\[ N > \max_k |c_k(f)| \]
and consider the new function
\[ g := f - S_\Lambda(f) + N\sum_{k\in\Lambda}\psi_k. \]
Then we obviously have
\[ \sigma_m(g) \le \|f\|, \qquad m := |\Lambda|, \tag{2.4.12} \]
and
\[ G_m(g) := G_m(g,\Psi,\rho) = N\sum_{k\in\Lambda}\psi_k. \tag{2.4.13} \]
Thus by our assumption that $\Psi$ is a greedy basis we get
\[ \|f - S_\Lambda(f)\| = \|g - G_m(g)\| \le G\,\sigma_m(g) \le G\,\|f\|. \]
This implies (2.4.11).

We now proceed to proving that $\Psi$ is democratic. Let two finite sets $P$ and $Q$ with $|P| = |Q| = m$ be given. Take a third one $Y$ such that $|Y| = m$ and $Y\cap P = \emptyset$, $Y\cap Q = \emptyset$. For a finite set $\Lambda$, denote
\[ \psi_\Lambda := \sum_{k\in\Lambda}\psi_k. \]
Fix any $\varepsilon>0$ and consider the function
\[ f := (1+\varepsilon)\psi_Q + \psi_Y. \]
Then
\[ \sigma_m(f) \le (1+\varepsilon)\|\psi_Q\| \quad\text{and}\quad \|f - G_m(f)\| = \|\psi_Y\|. \]
Therefore, by the assumption that $\Psi$ is greedy we get
\[ \|\psi_Y\| \le G(1+\varepsilon)\|\psi_Q\|. \tag{2.4.14} \]
Similarly we obtain
\[ \|\psi_P\| \le G(1+\varepsilon)\|\psi_Y\|. \tag{2.4.15} \]
Combining the above two inequalities and taking into account that $\varepsilon$ is arbitrarily small we obtain
\[ \|\psi_P\| \le G^2\|\psi_Q\|. \]
This completes the proof of Theorem 2.4.1.
2.5 Some examples We present here some examples from [43].
2.5.1 Unconditionality does not imply democracy

This follows from properties of the multivariate Haar system $\mathcal H^2 = \mathcal H\times\mathcal H$ defined as the tensor product of the univariate Haar systems $\mathcal H$ (see Theorem 2.3.9).

2.5.2 Democracy does not imply unconditionality

Let $X$ be the set of all real sequences $x = (x_1,x_2,\dots)$ such that
\[ \|x\|_X := \sup_{N\in\mathbb N}\Bigl|\sum_{n=1}^N x_n\Bigr| \]
is finite. Clearly, $X$ equipped with the norm $\|\cdot\|_X$ is a Banach space. Let $\psi_k\in X$, $k=1,2,\dots$, be defined as
\[ (\psi_k)_n = \begin{cases} 1, & n = k,\\ 0, & n\ne k.\end{cases} \]
Denote by $X_0$ the subspace of $X$ spanned by the elements $\psi_k$. It is easy to see that $\{\psi_k\}$ is a democratic basis in $X_0$. However, it is not an unconditional basis since
\[ \Bigl\|\sum_{k=1}^m \psi_k\Bigr\|_X = m, \]
but
\[ \Bigl\|\sum_{k=1}^m (-1)^k\psi_k\Bigr\|_X = 1. \]
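The two norms in this example are easy to check numerically. The following is a minimal Python sketch for finitely supported sequences; the function name `summing_norm` and the choice $m=8$ are illustrative assumptions, not notation from the text.

```python
def summing_norm(x):
    # ||x||_X = sup_N |x_1 + ... + x_N| for a finitely supported sequence x.
    best, partial = 0.0, 0.0
    for value in x:
        partial += value
        best = max(best, abs(partial))
    return best


m = 8
same_sign = [1.0] * m                                    # coefficients of sum psi_k
alternating = [(-1.0) ** k for k in range(1, m + 1)]     # coefficients of sum (-1)^k psi_k

print(summing_norm(same_sign), summing_norm(alternating))
```

Changing the signs of the coefficients reduces the norm from $m$ to $1$, so no bound of the form required for unconditionality can hold uniformly in $m$.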
2.5.3 Superdemocracy does not imply unconditionality

It is clear that an unconditional and democratic basis $\Psi$ satisfies the inequality
\[ \Bigl\|\sum_{k\in P}\theta_k\psi_k\Bigr\| \le D_S\,\Bigl\|\sum_{k\in Q}\epsilon_k\psi_k\Bigr\| \tag{2.5.1} \]
for any two finite sets $P$ and $Q$ with $|P| = |Q|$ and any choices of signs $\theta_k = \pm1$, $k\in P$, and $\epsilon_k = \pm1$, $k\in Q$.

Definition 2.5.1. We say that a basis $\Psi$ is a superdemocratic basis if it satisfies (2.5.1).

Theorem 2.4.1 implies that every greedy basis is superdemocratic. Now we will construct an example of a superdemocratic basis which is not an unconditional basis and, therefore, by Theorem 2.4.1, is not a greedy basis. Let $X$ be the set of all real sequences $x = (x_1,x_2,\dots)\in\ell_2$ such that
\[ \|x\|_+ := \sup_{N\in\mathbb N}\Bigl|\sum_{n=1}^N x_n/\sqrt n\Bigr| \]
is finite. Clearly, $X$ equipped with the norm
\[ \|\cdot\| := \max\bigl(\|\cdot\|_{\ell_2}, \|\cdot\|_+\bigr) \]
is a Banach space. Let $\psi_k\in X$, $k=1,2,\dots$, be defined as
\[ (\psi_k)_n = \begin{cases} 1, & n = k,\\ 0, & n\ne k.\end{cases} \]
Denote by $X_0$ the subspace of $X$ spanned by the elements $\psi_k$. It is easy to see that $\Psi = \{\psi_k\}$ is a democratic basis in $X_0$. Moreover, it is superdemocratic: for all $k_1,\dots,k_m$ and for any choice of signs,
\[ \sqrt m \le \Bigl\|\sum_{j=1}^m \pm\psi_{k_j}\Bigr\| < 2\sqrt m. \tag{2.5.2} \]
Indeed, we have
\[ \Bigl\|\sum_{j=1}^m \pm\psi_{k_j}\Bigr\|_{\ell_2} = \sqrt m, \qquad \Bigl\|\sum_{j=1}^m \pm\psi_{k_j}\Bigr\|_+ \le \sum_{j=1}^m 1/\sqrt j < 2\sqrt m, \]
and (2.5.2) follows. However, $\Psi$ is not an unconditional basis, since for $m\ge 2$,
\[ \Bigl\|\sum_{k=1}^m \psi_k/\sqrt k\Bigr\| \ge \sum_{k=1}^m 1/k \asymp \log m, \]
but
\[ \Bigl\|\sum_{k=1}^m (-1)^k\psi_k/\sqrt k\Bigr\| \asymp (\log m)^{1/2}. \]
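The estimates of this subsection can be checked numerically. The sketch below, in Python, evaluates the norm $\max(\|\cdot\|_{\ell_2},\|\cdot\|_+)$ on finitely supported sequences; the function names and the values $m=100$ and $m=1000$ are illustrative assumptions.

```python
import math


def l2_norm(x):
    return math.sqrt(sum(v * v for v in x))


def plus_norm(x):
    # ||x||_+ = sup_N |sum_{n<=N} x_n / sqrt(n)| for a finitely supported sequence x.
    best, partial = 0.0, 0.0
    for n, v in enumerate(x, start=1):
        partial += v / math.sqrt(n)
        best = max(best, abs(partial))
    return best


def norm(x):
    return max(l2_norm(x), plus_norm(x))


# Superdemocracy bound (2.5.2) for psi_1 + ... + psi_100 with all signs +1.
ones = [1.0] * 100

# Failure of unconditionality: same coefficients up to sign, very different norms.
m = 1000
harmonic = sum(1.0 / k for k in range(1, m + 1))
same_sign = [1.0 / math.sqrt(k) for k in range(1, m + 1)]
alternating = [(-1.0) ** k / math.sqrt(k) for k in range(1, m + 1)]

print(norm(ones), norm(same_sign), norm(alternating))
```

For the all-positive coefficients the norm equals the harmonic sum $\sum_{k\le m}1/k$, while for the alternating signs the $\|\cdot\|_+$ part stays bounded by $1$ and the norm is the $\ell_2$ part, the square root of the same harmonic sum.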
2.5.4 A quasi-greedy basis is not necessarily an unconditional basis

It follows from the definition of a greedy basis (see (2.1.3)) that the inequality
\[ \|G_m(f,\Psi,\rho)\| \le (G+1)\|f\| \tag{2.5.3} \]
holds for all $m$ and all $f\in X$, with some $\rho\in D(f)$.

Definition 2.5.2. We say that a basis $\Psi$ is quasi-greedy if there exists a constant $C_Q$ such that for any $f\in X$ and any finite set of indices $\Lambda$ having the property
\[ \min_{k\in\Lambda}\|c_k(f)\psi_k\| \ge \max_{k\notin\Lambda}\|c_k(f)\psi_k\|, \tag{2.5.4} \]
we have
\[ \|S_\Lambda(f,\Psi)\| = \Bigl\|\sum_{k\in\Lambda} c_k(f)\psi_k\Bigr\| \le C_Q\|f\|. \tag{2.5.5} \]
It is clear that for elements $f$ with a unique decreasing rearrangement of coefficients ($\#D(f) = 1$) the inequalities (2.5.3) and (2.5.5) are equivalent. By slightly modifying the coefficients and using a continuity argument, we get that (2.5.3) and (2.5.5) are equivalent for all $f\in X$. We discuss quasi-greedy bases in detail in Chapters 3 and 4. We shall prove now that the basis $\Psi$ constructed in the previous subsection is quasi-greedy. Combining this with the result from Subsection 2.5.3 that $\Psi$ is not unconditional, we get the claim of this subsection.

Assume $\|f\| = 1$. Then by the definition of $\|\cdot\|$,
\[ \sum_{k=1}^\infty |c_k(f)|^2 \le 1, \tag{2.5.6} \]
and, for any $M$,
\[ \Bigl|\sum_{k=1}^M c_k(f)k^{-1/2}\Bigr| \le 1. \tag{2.5.7} \]
It is clear that for any $\Lambda$ we have
\[ \|S_\Lambda(f,\Psi)\|_{\ell_2} \le \|f\|_{\ell_2} \le 1. \tag{2.5.8} \]
We now estimate $\|S_\Lambda(f,\Psi)\|_+$. Let $\Lambda$ be any set satisfying (2.5.4). Denote
\[ \alpha := \min_{k\in\Lambda}|c_k(f)|. \]
If $\alpha = 0$ we get $S_\Lambda(f,\Psi) = f$ and (2.5.5) holds. Let $\alpha>0$. Denote, for any $N$,
\[ \Lambda^+(N) := \{k\in\Lambda : k>N\}, \qquad \Lambda^-(N) := \{k\in\Lambda : k\le N\}. \]
We have, for any $N$, by Hölder's inequality with exponents $3/2$ and $3$,
\[ \sum_{k\in\Lambda^+(N)} |c_k(f)|k^{-1/2} \le \Bigl(\sum_{k\in\Lambda^+(N)} |c_k(f)|^{3/2}\Bigr)^{2/3}\Bigl(\sum_{k>N} k^{-3/2}\Bigr)^{1/3} \ll N^{-1/6}\Bigl(\sum_{k\in\Lambda^+(N)} |c_k(f)|^{3/2}\,|c_k(f)/\alpha|^{1/2}\Bigr)^{2/3} \ll (\alpha^2N)^{-1/6}. \tag{2.5.9} \]
Choose $N_\alpha := [\alpha^{-2}] + 1$. Then for any $M\le N_\alpha$ we have by (2.5.7) that
\[ \Bigl|\sum_{k\in\Lambda^-(M)} c_k(f)k^{-1/2}\Bigr| \le \Bigl|\sum_{k=1}^M c_k(f)k^{-1/2}\Bigr| + \Bigl|\sum_{k\notin\Lambda^-(M),\ k\le M} c_k(f)k^{-1/2}\Bigr| \le 1 + \alpha\sum_{k=1}^M k^{-1/2} \le 1 + 2\alpha M^{1/2} \ll 1. \tag{2.5.10} \]
For $M>N_\alpha$ we get, using (2.5.9) and (2.5.10),
\[ \Bigl|\sum_{k\in\Lambda^-(M)} c_k(f)k^{-1/2}\Bigr| \le \Bigl|\sum_{k\in\Lambda^-(N_\alpha)} c_k(f)k^{-1/2}\Bigr| + \sum_{k\in\Lambda^+(N_\alpha)} |c_k(f)|k^{-1/2} \ll 1. \]
Thus
\[ \|S_\Lambda(f,\Psi)\|_+ \le C, \]
which completes the proof. The above example and Theorem 2.4.1 show that a quasi-greedy basis is not necessarily greedy. Further results on quasi-greedy bases can be found in [85] and in Chapters 3 and 4.
2.6 Further results

2.6.1 Direct and inverse theorems

Theorem 2.3.1 points out the importance of bases $L_p$-equivalent to the Haar basis. We now discuss necessary and sufficient conditions for $f$ to have a prescribed decay of $\{\sigma_m(f,\Psi)_p\}$ under the assumption that $\Psi$ is $L_p$-equivalent to the Haar basis $\mathcal H_p$ for $1<p<\infty$. We will express these conditions in terms of the coefficients $\{c_n(f)\}$ of the expansion
\[ f = \sum_{n=1}^\infty c_n(f)\psi_n. \]
We present results from [70]. The following lemma from [69] (see Lemmas 2.3.3 and 2.3.4) plays the key role here.

Lemma 2.6.1. Let a basis $\Psi$ be $L_p$-equivalent to $\mathcal H_p$, $1<p<\infty$. Then for any finite $\Lambda$ and $a\le|c_n|\le b$, $n\in\Lambda$, we have
\[ C_1(p,\Psi)\,a\,|\Lambda|^{1/p} \le \Bigl\|\sum_{n\in\Lambda} c_n\psi_n\Bigr\|_p \le C_2(p,\Psi)\,b\,|\Lambda|^{1/p}. \]
We formulate a general statement and then consider several important particular examples of rate of decrease of $\{\sigma_m(f,\Psi)_p\}$. We begin by introducing some notations. For a sequence $E = \{\varepsilon_k\}_{k=0}^\infty$ of positive numbers monotonically decreasing to zero (we write $E\in MDP$), we define inductively a sequence $\{N_s\}_{s=0}^\infty$ of nonnegative integers by
\[ N_0 = 0, \qquad N_s \text{ is the smallest integer satisfying } \varepsilon_{N_s} < 2^{-s}, \qquad n_s := \max(N_{s+1} - N_s,\, 1). \tag{2.6.1} \]
We are going to consider the following examples of sequences.
Example 2.6.2. Take $\varepsilon_0 = 1$ and $\varepsilon_k = k^{-r}$, $r>0$, $k=1,2,\dots$. Then
\[ N_s \asymp 2^{s/r} \quad\text{and}\quad n_s \asymp 2^{s/r}. \]
Example 2.6.3. Fix $0<b<1$ and take $\varepsilon_k = 2^{-k^b}$, $k=0,1,2,\dots$. Then
\[ N_s = s^{1/b} + O(1) \quad\text{and}\quad n_s \asymp s^{1/b-1}. \]
Let $f\in L_p$. Rearrange the sequence $\|c_n(f)\psi_n\|_p$ in decreasing order
\[ \|c_{n_1}(f)\psi_{n_1}\|_p \ge \|c_{n_2}(f)\psi_{n_2}\|_p \ge \cdots \]
and denote
\[ a_k(f,p) := \|c_{n_k}(f)\psi_{n_k}\|_p. \]
We now give some inequalities for $a_k(f,p)$ and $\sigma_m(f,\Psi)_p$. We use the brief notation $\sigma_m(f)_p := \sigma_m(f,\Psi)_p$ and $\sigma_0(f)_p := \|f\|_p$.

Lemma 2.6.4. For any two positive integers $N<M$ we have
\[ a_M(f,p) \le C(p,\Psi)\,\sigma_N(f)_p\,(M-N)^{-1/p}. \]
Proof. By Theorem 2.3.1, for all $m$,
\[ \|f - G_m^p(f,\Psi)\|_p \le C(p,\Psi)\,\sigma_m(f)_p. \]
From here and the definition of $G_m^p$ we get
\[ J := \Bigl\|\sum_{k=N+1}^M c_{n_k}(f)\psi_{n_k}\Bigr\|_p \le C(p,\Psi)\bigl(\sigma_N(f)_p + \sigma_M(f)_p\bigr). \tag{2.6.2} \]
Next we have, for $k\in(N,M]$,
\[ \|c_{n_k}(f)\psi_{n_k}\|_p \ge \|c_{n_M}(f)\psi_{n_M}\|_p = a_M(f,p) \]
and by Lemma 2.6.1 we get from here
\[ a_M(f,p)\,(M-N)^{1/p} \le C(p,\Psi)\,J. \tag{2.6.3} \]
Relations (2.6.2) and (2.6.3) imply the conclusion of Lemma 2.6.4.

Lemma 2.6.5. For any sequence $m_0<m_1<m_2<\cdots$ of nonnegative integers we have
\[ \sigma_{m_s}(f)_p \le C(p,\Psi)\sum_{l=s}^\infty a_{m_l}(f,p)\,(m_{l+1}-m_l)^{1/p}. \]
Proof. We have
\[ \sigma_{m_s}(f)_p \le \Bigl\|\sum_{k>m_s} c_{n_k}(f)\psi_{n_k}\Bigr\|_p \le \sum_{l=s}^\infty\Bigl\|\sum_{k\in(m_l,m_{l+1}]} c_{n_k}(f)\psi_{n_k}\Bigr\|_p. \]
Using Lemma 2.6.1, this yields
\[ \sigma_{m_s}(f)_p \le C(p,\Psi)\sum_{l=s}^\infty a_{m_l}(f,p)\,(m_{l+1}-m_l)^{1/p}, \]
which proves the lemma.
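Theorem 2.6.6. Assume that a given sequence $E\in MDP$ satisfies the conditions
\[ \varepsilon_{N_s} \ge C_1 2^{-s}, \qquad n_{s+1} \le C_2 n_s, \qquad s=0,1,2,\dots. \]
Then we have the equivalence
\[ \sigma_n(f)_p \ll \varepsilon_n \quad\Longleftrightarrow\quad a_{N_s}(f,p) \ll 2^{-s}n_s^{-1/p}. \]
Proof. We prove first $\Rightarrow$. If $N_{s+1}>N_s$ then we use Lemma 2.6.4 with $M = N_{s+1}$ and $N = N_s$:
\[ a_{N_{s+1}}(f,p) \le C(p,\Psi)\,\sigma_{N_s}(f)_p\,n_s^{-1/p} \le C(p,\Psi)\,2^{-s-1}\bigl(n_{s+1}/C_2\bigr)^{-1/p}, \]
which implies the statement of Theorem 2.6.6 in this case. Let $N_{s+1} = N_s = \dots = N_{s-j} > N_{s-j-1}$. The assumption $\varepsilon_{N_s}\ge C_1 2^{-s}$ combined with the definition of $N_s$ (namely, $\varepsilon_{N_s}<2^{-s}$) implies that $j\le C_3$. Then from the above case we get
\[ a_{N_{s-j}}(f,p) \ll 2^{-s+j}\,n_{s-j}^{-1/p} \]
and, therefore,
\[ a_{N_{s+1}}(f,p) \ll 2^{-s-1}\,n_{s+1}^{-1/p}. \]
The implication $\Rightarrow$ has been proved. We now prove the inverse statement $\Leftarrow$. Using Lemma 2.6.5 we get
\[ \sigma_{N_s}(f)_p \ll \sum_{l=s}^\infty a_{N_l}(f,p)\,(N_{l+1}-N_l)^{1/p} \ll \sum_{l=s}^\infty 2^{-l} \ll 2^{-s} \ll \varepsilon_{N_s} \]
and, for $n\in[N_s,N_{s+1})$,
\[ \sigma_n(f)_p \le \sigma_{N_s}(f)_p \ll \varepsilon_{N_s} \ll 2^{-s} \ll \varepsilon_{N_{s+1}} \le \varepsilon_n. \]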
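Corollary 2.6.7. Theorem 2.6.6 applied to Examples 2.6.2 and 2.6.3 gives the following relations:
\[ \sigma_m(f)_p \ll (m+1)^{-r} \quad\Longleftrightarrow\quad a_n(f,p) \ll n^{-r-1/p}, \tag{2.6.4} \]
\[ \sigma_m(f)_p \ll 2^{-m^b} \quad\Longleftrightarrow\quad a_n(f,p) \ll 2^{-n^b}n^{(1-1/b)/p}. \tag{2.6.5} \]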
Remark 2.6.8. Making use of Lemmas 2.6.4 and 2.6.5 we can prove a version of Corollary 2.6.7 with the sign $\ll$ replaced by $\asymp$.

Theorem 2.6.6 and Corollary 2.6.7 are in the spirit of the classical Jackson–Bernstein direct and inverse theorems in linear approximation theory, where conditions on the corresponding sequences of approximating characteristics are imposed in the form
\[ E_n(f)_p \ll \varepsilon_n \quad\text{or}\quad \bigl\|\{E_n(f)_p/\varepsilon_n\}\bigr\|_{\ell_\infty} < \infty. \tag{2.6.6} \]
It is well known (see [11]) that in studying many questions of approximation theory it is convenient to consider, along with restriction (2.6.6), the following generalization:
\[ \bigl\|\{E_n(f)_p/\varepsilon_n\}\bigr\|_{\ell_q} < \infty. \tag{2.6.7} \]
Lemmas 2.6.4 and 2.6.5 are also useful in considering this more general case. For instance, in the particular case of Example 2.6.2 one gets the following statement.

Theorem 2.6.9. Let $1<p<\infty$ and $0<q<\infty$. Then for any positive $r$ we have the equivalence relation
\[ \sum_m \sigma_m(f)_p^q\,m^{rq-1} < \infty \quad\Longleftrightarrow\quad \sum_n a_n(f,p)^q\,n^{rq-1+q/p} < \infty. \]
Proof. Using Lemma 2.6.4 with $M = 2^{s+1}$ and $N = 2^s$ we get
\[ \sum_s a_{2^{s+1}}(f,p)^q\,2^{sq(r+1/p)} \le C(p)\sum_s \sigma_{2^s}(f)_p^q\,2^{srq}, \]
which proves the implication $\Rightarrow$ in the theorem.
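In the proof of the implication $\Leftarrow$ we need the following well-known lemma.

Lemma 2.6.10. Let $a>0$, $0<q<\infty$. Then for any sequence $\{x_k\}_{k=1}^\infty$ of nonnegative numbers we have
\[ \sum_{n=1}^\infty 2^{anq}\Bigl(\sum_{s=n}^\infty x_s2^{-as}\Bigr)^q \le C(a)\sum_{n=1}^\infty x_n^q. \]
Proof. In the case $q\le 1$, using the inequality $(\sum_k y_k)^q \le \sum_k y_k^q$ for nonnegative numbers $y_k$, we obtain
\[ \sum_{n=1}^\infty 2^{anq}\Bigl(\sum_{s=n}^\infty x_s2^{-as}\Bigr)^q \le \sum_{n=1}^\infty 2^{anq}\sum_{s=n}^\infty \bigl(x_s2^{-as}\bigr)^q = \sum_{s=1}^\infty \bigl(x_s2^{-as}\bigr)^q\sum_{n=1}^s 2^{anq} \le C(a)\sum_{s=1}^\infty x_s^q. \]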
In the case $q>1$, using the Hölder inequality with $q' := q/(q-1)$, we get
\[ \Bigl(\sum_{s=n}^\infty x_s2^{-as/2}\,2^{-as/2}\Bigr)^q \le \Bigl(\sum_{s=n}^\infty \bigl(x_s2^{-as/2}\bigr)^q\Bigr)\Bigl(\sum_{s=n}^\infty 2^{-asq'/2}\Bigr)^{q/q'} \le C(a)\,2^{-anq/2}\sum_{s=n}^\infty \bigl(x_s2^{-as/2}\bigr)^q. \]
Using this inequality and changing the order of summation, we obtain, similarly to the above case of $0<q\le 1$,
\[ \sum_{n=1}^\infty 2^{anq}\Bigl(\sum_{s=n}^\infty x_s2^{-as}\Bigr)^q \le C(a)\sum_{n=1}^\infty 2^{anq/2}\sum_{s=n}^\infty \bigl(x_s2^{-as/2}\bigr)^q \le C(a)\sum_{s=1}^\infty x_s^q. \]
This completes the proof of Lemma 2.6.10.

Using Lemma 2.6.5 with $m_0 = 0$ and $m_s = 2^s$ for $s=1,2,\dots$ we get
\[ \sum_s \sigma_{2^s}(f)_p^q\,2^{srq} \le C(p)\sum_s 2^{rqs}\Bigl(\sum_{l\ge s} a_{2^l}(f,p)\,2^{l/p}\Bigr)^q \le C(p)\sum_s 2^{rqs}\Bigl(\sum_{l\ge s} a_{2^l}(f,p)\,2^{l(r+1/p)}\,2^{-lr}\Bigr)^q. \]
Next, by Lemma 2.6.10, we continue the above chain of inequalities:
\[ \le C(p,r)\sum_l a_{2^l}(f,p)^q\,2^{l(r+1/p)q}. \]
This completes the proof of Theorem 2.6.9.

Remark 2.6.11. The condition
\[ \sum_n a_n(f,p)^q\,n^{rq-1+q/p} < \infty \]
with $q = \beta := (r+1/p)^{-1}$ takes a very simple form:
\[ \sum_n a_n(f,p)^\beta = \sum_n \|c_n(f)\psi_n\|_p^\beta < \infty. \tag{2.6.8} \]
In the case $\Psi = \mathcal H_p$, the condition (2.6.8) is equivalent to imposing that $f$ be in the Besov space $B_\beta^r(L_\beta)$.
Corollary 2.6.12. Theorem 2.6.9 implies the relation
\[ \sum_m \sigma_m(f,\mathcal H)_p^\beta\,m^{r\beta-1} < \infty \quad\Longleftrightarrow\quad f\in B_\beta^r(L_\beta), \]
where $\beta := (r+1/p)^{-1}$.

A statement similar to Corollary 2.6.12 for free knot spline approximation was proved in [60]. Corollary 2.6.12 and further results in this direction can be found in [12] and [14]. We want to remark here that conditions in terms of $a_n(f,p)$ are convenient in applications. For instance, the relation (2.6.4) can be rewritten using the idea of thresholding. For a given $f\in L_p$, denote
\[ T(\varepsilon) := \#\{a_k(f,p) : a_k(f,p)\ge\varepsilon\}. \]
Then (2.6.4) is equivalent to
\[ \sigma_m(f)_p \ll (m+1)^{-r} \quad\Longleftrightarrow\quad T(\varepsilon) \ll \varepsilon^{-(r+1/p)^{-1}}. \]
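The thresholding count $T(\varepsilon)$ is simple to compute. The following Python sketch uses a hypothetical model sequence $a_k = k^{-(r+1/p)}$ with $r=1$, $p=2$ (chosen only for illustration, so that the right-hand condition in (2.6.4) holds with constant $1$) and compares $T(\varepsilon)$ with $\varepsilon^{-(r+1/p)^{-1}}$.

```python
def threshold_count(a, eps):
    # T(eps) = number of terms a_k that are at least eps.
    return sum(1 for value in a if value >= eps)


r, p = 1.0, 2.0                    # illustrative parameters (assumption)
decay = r + 1.0 / p                # a_k = k**(-(r + 1/p)), the rate in (2.6.4)
a = [k ** (-decay) for k in range(1, 10001)]

eps = 0.01
count = threshold_count(a, eps)
bound = eps ** (-1.0 / decay)      # eps**(-(r + 1/p)**(-1))
print(count, bound)
```

For this model sequence $a_k\ge\varepsilon$ exactly when $k\le\varepsilon^{-1/(r+1/p)}$, so the count is the integer part of the bound.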
For further results in this direction, see [8], [11], [59]. An interesting generalization of $m$-term approximation was considered in [8]. Let $\Psi = \{\psi_I\}_I$ be a basis indexed by dyadic intervals. Take an $\alpha$ and assign to each index set $\Lambda$ the measure
\[ \Phi_\alpha(\Lambda) := \sum_{I\in\Lambda}|I|^\alpha. \]
In the case $\alpha = 0$ we get $\Phi_0(\Lambda) = |\Lambda|$. An analog of best $m$-term approximation is the following:
\[ \inf_{\Lambda:\ \Phi_\alpha(\Lambda)\le m}\ \inf_{c_I,\ I\in\Lambda}\Bigl\|f - \sum_{I\in\Lambda} c_I\psi_I\Bigr\|_p. \]
A detailed study of this type of approximation (restricted approximation) can be found in [8]. It is proved in [8] that the technique developed for $m$-term approximation can be generalized for restricted approximation.
2.6.2 Greedy approximation in $L_1$ and $L_\infty$

In this subsection we consider approximation with respect to the multivariate Haar system $\mathcal H^d$. It turns out that the efficiency of the greedy algorithms $G^p$, $p = 1,\infty$, drops down dramatically compared to the case $1<p<\infty$.

Theorem 2.6.13. Let $p=1$ or $p=\infty$. Then we have, for each $f\in L_p$,
\[ \|f - G_m^p(f,\mathcal H^d)\|_p \le (3m+1)\,\sigma_m(f,\mathcal H^d)_p. \]
The extra factor $3m+1$ cannot be replaced by a factor $c(m)$ with $c(m)/m\to 0$ as $m\to\infty$.
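This particular result indicates that there are problems with greedy approximation in $L_1$ and in $C$ with regard to the Haar basis. We note that, as proved in [59], the extra factor $3m+1$ is the best-possible extra factor in Theorem 2.6.13.

Proof. We first prove the upper estimates. Let
\[ t_\Lambda = \sum_{I\in\Lambda} c_I H_I \]
be a best $m$-term approximant to a given $f\in L_p$, $p=1$ or $p=\infty$ (for existence, see [23]). Denote by $\Lambda_m$ a set of $m$ dyadic intervals $I$ for which $\|f_IH_I\|_p$ takes the biggest values, where
\[ f_I := \int_{[0,1]^d} f(x)H_I(x)\,dx. \]
We need to estimate
\[ \delta := \|f - G_m^p(f,\mathcal H^d)\|_p = \Bigl\|\sum_{I\notin\Lambda_m} f_IH_I\Bigr\|_p. \]
We have
\[ \delta = \Bigl\|\sum_{I\notin\Lambda} f_IH_I - \sum_{I\in\Lambda_m\setminus\Lambda} f_IH_I + \sum_{I\in\Lambda\setminus\Lambda_m} f_IH_I\Bigr\|_p \le \Bigl\|\sum_{I\notin\Lambda} f_IH_I\Bigr\|_p + \Bigl\|\sum_{I\in\Lambda_m\setminus\Lambda} f_IH_I\Bigr\|_p + \Bigl\|\sum_{I\in\Lambda\setminus\Lambda_m} f_IH_I\Bigr\|_p =: \delta_1 + \delta_2 + \delta_3. \tag{2.6.9} \]
Let $p'$ denote the dual to $p$ ($p' = \infty$ if $p=1$ and $p'=1$ if $p=\infty$). Then,
\[ |(f - t_\Lambda)_I| \le \sigma_m(f,\mathcal H^d)_p\,\|H_I\|_{p'}, \tag{2.6.10} \]
and for $I\notin\Lambda$ we get
\[ |f_I| \le \sigma_m(f,\mathcal H^d)_p\,\|H_I\|_{p'}. \tag{2.6.11} \]
Next, by the definition of $\Lambda_m$ and by (2.6.11),
\[ \max_{I\in\Lambda\setminus\Lambda_m}\|f_IH_I\|_p \le \min_{J\in\Lambda_m\setminus\Lambda}\|f_JH_J\|_p \le \sigma_m(f,\mathcal H^d)_p. \]
Therefore, for $\delta_i$, $i=2,3$, we get
\[ \delta_i \le \#(\Lambda\setminus\Lambda_m)\,\sigma_m(f,\mathcal H^d)_p, \qquad i=2,3. \tag{2.6.12} \]
It remains to estimate $\delta_1$. By (2.6.10),
\[ \delta_1 \le \|f - t_\Lambda\|_p + \Bigl\|\sum_{I\in\Lambda}(f - t_\Lambda)_IH_I\Bigr\|_p \le \sigma_m(f,\mathcal H^d)_p + \#\Lambda\,\sigma_m(f,\mathcal H^d)_p. \tag{2.6.13} \]
Combining (2.6.9), (2.6.12), and (2.6.13), we obtain
\[ \delta \le (3m+1)\,\sigma_m(f,\mathcal H^d)_p. \]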
We now prove the lower bounds. We consider the two cases $p=1$ and $p=\infty$ separately. In both cases we construct an example for $d=1$.

Case 1: $p=1$. Let $m$ be given. Consider two functions $f_1$ and $f_2$. Denote $I_k := [0,2^{-k})$ and define
\[ f_1 := \sum_{k=1}^m |I_k|^{-1/2}H_{I_k}. \]
It is easy to check that
\[ f_1 = \begin{cases} 2^{m+1}-2, & x\in[0,2^{-m-1}),\\ -2, & x\in[2^{-m-1},1/2).\end{cases} \]
Let $A$ be any set of $m$ disjoint dyadic intervals $J$ such that $J\cap[0,1/2) = \emptyset$. Denote
\[ f_2 := \sum_{J\in A}|J|^{-1/2}H_J. \]
Consider the $m$-term approximation in $L_1$ of the function $f = 2f_1 + f_2$. We have
\[ \sigma_m(f,\mathcal H)_1 \le \|2f_1\|_1 \le 4 \tag{2.6.14} \]
and
\[ \|f - G_m^1(f,\mathcal H)\|_1 = \|f_2\|_1 = m. \tag{2.6.15} \]
Case 2: $p=\infty$. We use functions similar to those from the previous case. Define
\[ g_1 := \sum_{k=1}^m |I_k|^{1/2}H_{I_k} \quad\text{and}\quad g_2 := \sum_{J\in A}|J|^{1/2}H_J. \]
Consider the function $g = g_1 + 2g_2$. Then
\[ \sigma_m(g,\mathcal H)_\infty \le \|2g_2\|_\infty = 2 \tag{2.6.16} \]
and
\[ \|g - G_m^\infty(g,\mathcal H)\|_\infty = \|g_1\|_\infty = m. \tag{2.6.17} \]
The relations (2.6.14), (2.6.15) and (2.6.16), (2.6.17) imply the lower estimates in Theorem 2.6.13.
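The $L_1$ example of Case 1 can be verified numerically. The Python sketch below samples the functions on a dyadic grid, assuming $L_2$-normalized Haar functions ($H_I = |I|^{-1/2}$ on the left half of $I$ and $-|I|^{-1/2}$ on the right half); the helper names, the value $m=4$, and the particular choice of the set $A$ (equal-length intervals filling part of $[1/2,1)$) are illustrative assumptions.

```python
def scaled_haar(start, length, exponent, level):
    # Samples of |I|**exponent * H_I on a grid of 2**level cells, I = [start, start + length).
    n = 2 ** level
    amplitude = length ** (exponent - 0.5)
    first = round(start * n)
    half = round(length * n) // 2
    values = [0.0] * n
    for i in range(first, first + half):
        values[i] = amplitude
    for i in range(first + half, first + 2 * half):
        values[i] = -amplitude
    return values


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def l1_norm(values):
    # Integral of |f| over [0, 1) for a function constant on each grid cell.
    return sum(abs(v) for v in values) / len(values)


m = 4
q = (m - 1).bit_length()            # smallest q with 2**q >= m
length_J = 2.0 ** -(q + 1)          # m disjoint dyadic intervals of this length fit in [1/2, 1)
level = max(m + 1, q + 2)           # grid fine enough to resolve every Haar function used

f1 = [0.0] * 2 ** level
for k in range(1, m + 1):
    f1 = add(f1, scaled_haar(0.0, 2.0 ** -k, -0.5, level))

f2 = [0.0] * 2 ** level
for j in range(m):
    f2 = add(f2, scaled_haar(0.5 + j * length_J, length_J, -0.5, level))

# Each term of 2*f1 has L_1 norm 2 and each term of f2 has L_1 norm 1,
# so the greedy algorithm keeps the m terms of 2*f1 and leaves f2 as the residual.
print(2 * l1_norm(f1), l1_norm(f2))
```

For $m=4$ the sketch gives $\|2f_1\|_1 = 3.75 \le 4$ and $\|f_2\|_1 = 4 = m$, in agreement with (2.6.14) and (2.6.15).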
2.7 Some inequalities for the tensor product of greedy bases

Our main goal in this section is to study multivariate bases. There are two standard ways to build a multivariate Haar basis. We discussed these ways in Section 2.3 of this chapter. One way is based on the idea of multiresolution analysis. In this way we obtain a multivariate Haar basis consisting of functions whose supports are dyadic cubes. The theory of greedy approximation in this case is parallel to the univariate case. In this section we use the tensor product of the univariate bases as a way of building a multivariate basis. We define the multivariate Haar basis $\mathcal H_p^d$ as the tensor product of the univariate Haar bases: $\mathcal H_p^d := \mathcal H_p\times\dots\times\mathcal H_p$; $H_{\mathbf n,p}(x) := H_{n_1,p}(x_1)\cdots H_{n_d,p}(x_d)$, $x = (x_1,\dots,x_d)$, $\mathbf n = (n_1,\dots,n_d)$. The supports of the functions $H_{\mathbf n,p}$ are arbitrary dyadic parallelepipeds (intervals). It is known from [70], [85], and [39] that the function
\[ \mu(m,\mathcal H_p^d) := \sup_{k\le m}\Bigl(\sup_{\Lambda:|\Lambda|=k}\Bigl\|\sum_{\mathbf n\in\Lambda}H_{\mathbf n,p}\Bigr\|_p \Big/ \inf_{\Lambda:|\Lambda|=k}\Bigl\|\sum_{\mathbf n\in\Lambda}H_{\mathbf n,p}\Bigr\|_p\Bigr) \]
plays a very important role in estimates of the $m$-term greedy approximation in terms of the best $m$-term approximation. For instance (see [70]),
\[ \|f - G_m(f,\mathcal H_p^d)\|_p \le C(p,d)\,\mu(m,\mathcal H_p^d)\,\sigma_m(f,\mathcal H_p^d)_p, \qquad 1<p<\infty. \tag{2.7.1} \]
The following theorem gives, in particular, upper estimates for $\mu(m,\mathcal H_p^d)$.

Theorem 2.7.1. Let $1<p<\infty$. Then for any $\Lambda$, $|\Lambda| = m$, we have, for $2\le p<\infty$,
\[ C^1_{p,d}\,m^{1/p}\min_{\mathbf n\in\Lambda}|c_{\mathbf n}| \le \Bigl\|\sum_{\mathbf n\in\Lambda}c_{\mathbf n}H_{\mathbf n,p}\Bigr\|_p \le C^2_{p,d}\,m^{1/p}(\log m)^{h(p,d)}\max_{\mathbf n\in\Lambda}|c_{\mathbf n}|, \]
and for $1<p\le 2$,
\[ C^3_{p,d}\,m^{1/p}(\log m)^{-h(p,d)}\min_{\mathbf n\in\Lambda}|c_{\mathbf n}| \le \Bigl\|\sum_{\mathbf n\in\Lambda}c_{\mathbf n}H_{\mathbf n,p}\Bigr\|_p \le C^4_{p,d}\,m^{1/p}\max_{\mathbf n\in\Lambda}|c_{\mathbf n}|, \]
where $h(p,d) := (d-1)|1/2-1/p|$.

Theorem 2.7.1 for $d=1$, $1<p<\infty$ was proved in [69]. In the case $d=2$, $4/3\le p\le 4$, it was proved in [70]. Theorem 2.7.1 in the general case was proved in [85]. It is known ([74]) that the extra log factors in Theorem 2.7.1 are sharp. In this section we generalize Theorem 2.7.1 to the case of a basis that is the tensor product of greedy bases. We present here results from [42]. We first give the relevant definitions and notations, in a general setting. Let $\Psi$ be a normalized basis for $L_p([0,1))$. For the space $L_p([0,1)^d)$ we define $\Psi^d := \Psi\times\dots\times\Psi$ ($d$ times); $\psi_{\mathbf n}(x) := \psi_{n_1}(x_1)\cdots\psi_{n_d}(x_d)$, $x = (x_1,\dots,x_d)$, $\mathbf n = (n_1,\dots,n_d)$. In this section we establish the following result using a scheme of proof similar to that from [85].
Theorem 2.7.2. Let $1<p<\infty$ and let $\Psi$ be a greedy basis for $L_p([0,1))$. Then for any $\Lambda$, $|\Lambda| = m$, we have, for $2\le p<\infty$,
\[ C^5_{p,d}\,m^{1/p}\min_{\mathbf n\in\Lambda}|c_{\mathbf n}| \le \Bigl\|\sum_{\mathbf n\in\Lambda}c_{\mathbf n}\psi_{\mathbf n}\Bigr\|_p \le C^6_{p,d}\,m^{1/p}(\log m)^{h(p,d)}\max_{\mathbf n\in\Lambda}|c_{\mathbf n}|, \]
and for $1<p\le 2$,
\[ C^7_{p,d}\,m^{1/p}(\log m)^{-h(p,d)}\min_{\mathbf n\in\Lambda}|c_{\mathbf n}| \le \Bigl\|\sum_{\mathbf n\in\Lambda}c_{\mathbf n}\psi_{\mathbf n}\Bigr\|_p \le C^8_{p,d}\,m^{1/p}\max_{\mathbf n\in\Lambda}|c_{\mathbf n}|, \]
where $h(p,d) := (d-1)|1/2-1/p|$.

The inequality (2.7.1) was extended in [85] to a normalized unconditional basis $\Psi$ for $X$ instead of $\mathcal H_p^d$ for $L_p([0,1)^d)$. Therefore, as a corollary of Theorem 2.7.2 we obtain the following inequality for a greedy basis $\Psi$ (for $L_p([0,1))$):
\[ \|f - G_m(f,\Psi^d)\|_p \le C(\Psi,d,p)\,(\log m)^{h(p,d)}\,\sigma_m(f,\Psi^d)_p, \qquad 1<p<\infty. \tag{2.7.2} \]
In this section we prove a generalization of Theorem 2.7.1 to the case of $H_{\mathbf n,q}$ instead of $H_{\mathbf n,p}$. It will be convenient to enumerate the Haar system by dyadic intervals. We set $h_{[0,1]} := H_{1,\infty}$; $h_{[(l-1)2^{-n},\,l2^{-n})} := H_{2^n+l,\infty}$, $l=1,\dots,2^n$, $n=0,1,\dots$; $h_I(x) := h_{I_1}(x_1)\cdots h_{I_d}(x_d)$, $I = I_1\times\dots\times I_d$.

Theorem 2.7.3. Let $1<p<\infty$. Then for any $a>0$ and any $\Lambda$, $|\Lambda| = m$, we have, for $2\le p<\infty$,
\[ \sum_{I\in\Lambda}\bigl\||I|^{-a}h_I\bigr\|_p^p \ll \Bigl\|\sum_{I\in\Lambda}|I|^{-a}h_I\Bigr\|_p^p \ll (\log m)^{(1/2-1/p)p(d-1)}\sum_{I\in\Lambda}\bigl\||I|^{-a}h_I\bigr\|_p^p, \tag{2.7.3} \]
and for $1<p\le 2$,
\[ (\log m)^{(1/2-1/p)p(d-1)}\sum_{I\in\Lambda}\bigl\||I|^{-a}h_I\bigr\|_p^p \ll \Bigl\|\sum_{I\in\Lambda}|I|^{-a}h_I\Bigr\|_p^p \ll \sum_{I\in\Lambda}\bigl\||I|^{-a}h_I\bigr\|_p^p. \tag{2.7.4} \]
Here, the sign $\ll$ means that the corresponding inequality holds with an extra factor that does not depend on $m$ and $\Lambda$. We note that Theorem 2.7.3 in the case $a = 1/p$ coincides with Theorem 2.7.1. Theorem 2.7.3 in the case $d=1$ was proved in [8].

Proof of Theorem 2.7.2. The proof is carried out by induction. We first prove some inequalities in the univariate case. We need some known results. There is a result in functional analysis [37], [52] that says that for any unconditional basis $B = (b_k)$
56
Chapter 2. Lebesgue-type Inequalities for Greedy Approximation
of $L_p([0,1)^d)$, normalized so that $\|b_k\|_p = 1$, there is a subsequence $k_j$, $j = 1, 2, \dots$, such that $(b_{k_j})$ satisfies
\[ \Bigl\|\sum_{j=1}^{\infty}\alpha_{k_j}b_{k_j}\Bigr\|_p^p \asymp \sum_{j=1}^{\infty}|\alpha_{k_j}|^p. \]
It follows that for any democratic and unconditional basis $B$ for $L_p([0,1)^d)$ we have
\[ \Bigl\|\sum_{k\in\Lambda}b_k\Bigr\|_p \asymp |\Lambda|^{1/p}, \]
with the constants of equivalence depending at most on $B$ and $p$. For an unconditional, democratic basis $B$ in $L_p$, the above results combine to show that
\[ C_1\min_{k\in\Lambda}|a_k|\,|\Lambda|^{1/p} \le \Bigl\|\sum_{k\in\Lambda}a_kb_k\Bigr\|_p \le C_2\max_{k\in\Lambda}|a_k|\,|\Lambda|^{1/p} \tag{2.7.5} \]
for any finite set Λ, with C1 , C2 > 0 independent of Λ and {ak }. This proves Theorem 2.7.2 for d = 1, 1 < p < ∞. We will often use the following known lemma (see [52, p. 73]). Lemma 2.7.4. For any finite collection {fs } of functions in Lp , 1 ≤ p ≤ ∞,
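\[ \Bigl(\sum_s\|f_s\|_p^{p_l}\Bigr)^{1/p_l} \le \Bigl\|\Bigl(\sum_s|f_s|^2\Bigr)^{1/2}\Bigr\|_p \le \Bigl(\sum_s\|f_s\|_p^{p_u}\Bigr)^{1/p_u} \tag{2.7.6} \]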
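with $p_l := \max(2,p)$ and $p_u := \min(2,p)$. We note that, by Theorem 2.4.1, a greedy basis $\Psi$ is unconditional. It is known that the tensor product of unconditional bases for $L_p([0,1))$, $1<p<\infty$, is an unconditional basis for $L_p([0,1)^d)$. Therefore for any $1<p<\infty$ and any $\{a_n\}$ we have
\[ C_1(p,d)\Bigl\|\Bigl(\sum_n|a_n\psi_n|^2\Bigr)^{1/2}\Bigr\|_p \le \Bigl\|\sum_n a_n\psi_n\Bigr\|_p \le C_2(p,d)\Bigl\|\Bigl(\sum_n|a_n\psi_n|^2\Bigr)^{1/2}\Bigr\|_p, \tag{2.7.7} \]
and also for any set of disjoint $\Lambda_j$ we have
\[ C_3(p,d)\Bigl\|\Bigl(\sum_j\Bigl|\sum_{n\in\Lambda_j}a_n\psi_n\Bigr|^2\Bigr)^{1/2}\Bigr\|_p \le \Bigl\|\sum_j\sum_{n\in\Lambda_j}a_n\psi_n\Bigr\|_p \le C_4(p,d)\Bigl\|\Bigl(\sum_j\Bigl|\sum_{n\in\Lambda_j}a_n\psi_n\Bigr|^2\Bigr)^{1/2}\Bigr\|_p. \tag{2.7.8} \]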
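Lemma 2.7.5. Let $2\le p<\infty$ and let $\Psi$ be a greedy basis for $L_p([0,1))$. Then for any finite $\Lambda$, $|\Lambda| = m$, and any coefficients $\{a_k\}$ we have
\[ \Bigl(\sum_{k\in\Lambda}|a_k|^p\Bigr)^{1/p} \lesssim \Bigl\|\sum_{k\in\Lambda}a_k\psi_k\Bigr\|_p \lesssim (\log m)^{1/2-1/p}\Bigl(\sum_{k\in\Lambda}|a_k|^p\Bigr)^{1/p}. \]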
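Proof. The lower estimate follows from (2.7.7) and Lemma 2.7.4. We now prove the upper estimate. Let $|a_{k_1}|\ge|a_{k_2}|\ge\cdots$, $k_j\in\Lambda$, $j = 1, 2, \dots, m$. For notational convenience we set $a_{k_j} = 0$ for $j>m$. Denoting
\[ f_s := \sum_{j=2^s}^{2^{s+1}-1}a_{k_j}\psi_{k_j} \tag{2.7.9} \]
we obtain, for $n$ such that $2^n\le m<2^{n+1}$,
\[ f := \sum_{k\in\Lambda}a_k\psi_k = \sum_{s=0}^{n}f_s. \tag{2.7.10} \]
By (2.7.8) and Lemma 2.7.4,
\[ \|f\|_p \lesssim \Bigl(\sum_{s=0}^{n}\|f_s\|_p^2\Bigr)^{1/2}. \]
Next, by (2.7.5),
\[ \|f_s\|_p \lesssim |a_{k_{2^s}}|\,2^{s/p}. \]
Thus
\[ \|f\|_p \lesssim \Bigl(\sum_{s=0}^{n}|a_{k_{2^s}}|^2\,2^{2s/p}\Bigr)^{1/2}. \]
Using Hölder's inequality with parameter $p/2$, we continue by
\[ \le \Bigl(\sum_{s=0}^{n}|a_{k_{2^s}}|^p\,2^{s}\Bigr)^{1/p}\Bigl(\sum_{s=0}^{n}1\Bigr)^{(1-2/p)/2} \lesssim (\log m)^{1/2-1/p}\Bigl(\sum_{k\in\Lambda}|a_k|^p\Bigr)^{1/p}. \]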
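Lemma 2.7.6. Let $1<p\le 2$ and let $\Psi$ be a greedy basis for $L_p([0,1))$. Then for any finite $\Lambda$, $|\Lambda| = m$, and any coefficients $\{a_k\}$ we have
\[ (\log m)^{1/2-1/p}\Bigl(\sum_{k\in\Lambda}|a_k|^p\Bigr)^{1/p} \lesssim \Bigl\|\sum_{k\in\Lambda}a_k\psi_k\Bigr\|_p \lesssim \Bigl(\sum_{k\in\Lambda}|a_k|^p\Bigr)^{1/p}. \]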
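Proof. The upper estimate follows from (2.7.7) and Lemma 2.7.4. We proceed to the lower estimate. Using the notations (2.7.9) and (2.7.10), by (2.7.8), (2.7.6), and (2.7.5) we obtain
\[ \|f\|_p \gtrsim \Bigl(\sum_{s=0}^{n}\|f_s\|_p^2\Bigr)^{1/2} \gtrsim \Bigl(\sum_{s=0}^{n}|a_{k_{2^{s+1}}}|^2\,2^{2s/p}\Bigr)^{1/2}. \]
Next, by Hölder's inequality with parameter $2/p$,
\[ \sum_{s=0}^{n}|a_{k_{2^{s+1}}}|^p\,2^{s} \le \Bigl(\sum_{s=0}^{n}|a_{k_{2^{s+1}}}|^2\,2^{2s/p}\Bigr)^{p/2}(n+1)^{1-p/2}. \]
Therefore,
\[ \|f\|_p \gtrsim \Bigl(\sum_{s=0}^{n}|a_{k_{2^{s+1}}}|^p\,2^{s}\Bigr)^{1/p}n^{1/2-1/p} \gtrsim (\log m)^{1/2-1/p}\Bigl(\sum_{k\in\Lambda}|a_k|^p\Bigr)^{1/p}. \]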
We proceed to the proof of Theorem 2.7.2. We obtain the lower estimate for $2\le p<\infty$ and the upper estimate for $1<p\le 2$ from (2.7.7) and Lemma 2.7.4. It remains to prove Theorem 2.7.2 in the following cases: $2\le p<\infty$, the upper estimate, and $1<p\le 2$, the lower estimate. We mentioned above that the assumption that $\Psi$ is a greedy basis for $L_p([0,1))$ implies that $\Psi^d$ is an unconditional basis for $L_p([0,1)^d)$. Therefore, it is sufficient to prove Theorem 2.7.2 in the particular case when $c_n = 1$, $n\in\Lambda$. We first prove the upper estimate for $2\le p<\infty$. Let
\[ \Lambda_d := \{n_d : \exists k\in\Lambda \text{ with } k_d = n_d\}, \qquad \Lambda(n_d) := \{(k_1,\dots,k_{d-1}) : (k_1,\dots,k_{d-1},n_d)\in\Lambda\}. \]
Then, by Lemma 2.7.5,
\[ \Bigl\|\sum_{n_d\in\Lambda_d}\psi_{n_d}(x_d)\sum_{(n_1,\dots,n_{d-1})\in\Lambda(n_d)}\psi_{n_1}(x_1)\cdots\psi_{n_{d-1}}(x_{d-1})\Bigr\|_p^p \lesssim (\log m)^{(1/2-1/p)p}\sum_{n_d\in\Lambda_d}\Bigl\|\sum_{(n_1,\dots,n_{d-1})\in\Lambda(n_d)}\psi_{n_1}(x_1)\cdots\psi_{n_{d-1}}(x_{d-1})\Bigr\|_p^p. \]
We continue by the induction assumption:
\[ \lesssim (\log m)^{(1/2-1/p)p}\sum_{n_d\in\Lambda_d}|\Lambda(n_d)|\,(\log m)^{(1/2-1/p)p(d-2)} = m\,(\log m)^{(1/2-1/p)(d-1)p}. \]
We proceed to the lower estimate in the case $1<p\le 2$. By Lemma 2.7.6,
\[ \Bigl\|\sum_{n_d\in\Lambda_d}\psi_{n_d}(x_d)\sum_{(n_1,\dots,n_{d-1})\in\Lambda(n_d)}\psi_{n_1}(x_1)\cdots\psi_{n_{d-1}}(x_{d-1})\Bigr\|_p^p \gtrsim (\log m)^{(1/2-1/p)p}\sum_{n_d\in\Lambda_d}\Bigl\|\sum_{(n_1,\dots,n_{d-1})\in\Lambda(n_d)}\psi_{n_1}(x_1)\cdots\psi_{n_{d-1}}(x_{d-1})\Bigr\|_p^p. \]
We continue by the induction assumption:
\[ \gtrsim (\log m)^{(1/2-1/p)p}\sum_{n_d\in\Lambda_d}|\Lambda(n_d)|\,(\log m)^{(1/2-1/p)p(d-2)} = m\,(\log m)^{(1/2-1/p)(d-1)p}. \]
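Proof of Theorem 2.7.3. The lower estimate in the case $2\le p<\infty$ and the upper estimate in the case $1<p\le 2$ follow from (2.7.7) and Lemma 2.7.4. We first note that the lower estimate in the case $1<p\le 2$ follows from the upper estimate in the case $2\le p<\infty$ by a duality argument. Indeed, assume that (2.7.3) has been proved. Let $q\in(1,2]$. Denote $p := q/(q-1)\in[2,\infty)$. We have
\[ \sum_{I\in\Lambda}\bigl\||I|^{-a}h_I\bigr\|_q^q = \sum_{I\in\Lambda}|I|^{-aq+1} = \Bigl\langle\sum_{I\in\Lambda}|I|^{-a}h_I,\ \sum_{I\in\Lambda}|I|^{-a(q-1)}h_I\Bigr\rangle \le \Bigl\|\sum_{I\in\Lambda}|I|^{-a}h_I\Bigr\|_q\Bigl\|\sum_{I\in\Lambda}|I|^{-a(q-1)}h_I\Bigr\|_p. \]
Using (2.7.3) we continue:
\[ \lesssim (\log m)^{(1/2-1/p)(d-1)}\Bigl\|\sum_{I\in\Lambda}|I|^{-a}h_I\Bigr\|_q\Bigl(\sum_{I\in\Lambda}\bigl\||I|^{-a(q-1)}h_I\bigr\|_p^p\Bigr)^{1/p} = (\log m)^{(1/2-1/p)(d-1)}\Bigl\|\sum_{I\in\Lambda}|I|^{-a}h_I\Bigr\|_q\Bigl(\sum_{I\in\Lambda}|I|^{-aq+1}\Bigr)^{1/p}. \]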
This implies the lower estimate in (2.7.4). It remains to prove the upper estimate in (2.7.3). We proceed by induction. First, consider the univariate case. We have
\[ \sum_{I}\bigl\||I|^{-a}h_I\bigr\|_p^p = \sum_{I}|I|^{-ap+1} \]
and, by (2.7.7),
\[ \Bigl\|\sum_{I}|I|^{-a}h_I\Bigr\|_p^p \asymp \int_0^1\Bigl(\sum_{I}\bigl(|I|^{-a}h_I\bigr)^2\Bigr)^{p/2} = \int_0^1\Bigl(\sum_{j=1}^{s}2^{2an_j}\chi_{E_j}\Bigr)^{p/2} \]
with some $n_1<n_2<\cdots<n_s$ and $E_j\subset[0,1]$, $j = 1,\dots,s$. By an analog of Lemma 2.3.5 that follows from its proof, we continue:
\[ \asymp \sum_{j=1}^{s}2^{2n_ja(p/2)}|E_j| = \sum_{j=1}^{s}2^{n_jap}|E_j| = \sum_{I}|I|^{-ap+1}. \]
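We proceed to the multivariate case. Let
\[ \Lambda_d := \{I_d : \exists J\in\Lambda \text{ with } J_d = I_d\}, \qquad \Lambda(I_d) := \{(J_1,\dots,J_{d-1}) : (J_1,\dots,J_{d-1},I_d)\in\Lambda\}. \]
Using the fact that the univariate Haar basis is a greedy basis for $L_p([0,1))$, $1<p<\infty$, we obtain, by Lemma 2.7.5,
\[ \Bigl\|\sum_{I_d\in\Lambda_d}|I_d|^{-a}h_{I_d}(x_d)\sum_{(J_1,\dots,J_{d-1})\in\Lambda(I_d)}|J_1|^{-a}h_{J_1}(x_1)\cdots|J_{d-1}|^{-a}h_{J_{d-1}}(x_{d-1})\Bigr\|_p^p \]
\[ \lesssim (\log m)^{(1/2-1/p)p}\sum_{I_d\in\Lambda_d}\bigl\||I_d|^{-a}h_{I_d}(x_d)\bigr\|_p^p \times \Bigl\|\sum_{(J_1,\dots,J_{d-1})\in\Lambda(I_d)}|J_1|^{-a}h_{J_1}(x_1)\cdots|J_{d-1}|^{-a}h_{J_{d-1}}(x_{d-1})\Bigr\|_p^p. \]
By the induction assumption, we continue:
\[ \lesssim (\log m)^{(1/2-1/p)p(d-1)}\sum_{I_d\in\Lambda_d}\bigl\||I_d|^{-a}h_{I_d}(x_d)\bigr\|_p^p \times \sum_{(J_1,\dots,J_{d-1})\in\Lambda(I_d)}\bigl\||J_1|^{-a}h_{J_1}(x_1)\bigr\|_p^p\cdots\bigl\||J_{d-1}|^{-a}h_{J_{d-1}}(x_{d-1})\bigr\|_p^p \]
\[ = (\log m)^{(1/2-1/p)p(d-1)}\sum_{I\in\Lambda}\bigl\||I|^{-a}h_I\bigr\|_p^p. \]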
Chapter 3 Quasi-greedy Bases and Lebesgue-type Inequalities
3.1 Introduction
Our primary interest in this chapter is in approximation in $L_p$ with respect to quasi-greedy bases. The presentation of this chapter is based on the recent paper [21]. Let $X$ be an infinite-dimensional separable Banach space with a norm $\|\cdot\| := \|\cdot\|_X$ and let $\Psi := \{\psi_k\}_{k=1}^{\infty}$ be a semi-normalized basis for $X$, that is, $0<c_0\le\|\psi_k\|\le C_0$, $k\in\mathbb{N}$. All bases considered in this chapter are assumed to be semi-normalized. By Definition 2.1.1, greedy bases are those for which we have ideal (up to a multiplicative constant) Lebesgue inequalities for greedy approximation. In this chapter we focus on a wider class of bases than greedy ones: quasi-greedy bases. The concept of quasi-greedy basis was introduced in [43].
Definition 3.1.1. A basis $\Psi$ is called quasi-greedy if there exists some constant $C$ such that
\[ \sup_m\|G_m(f,\Psi)\| \le C\|f\|. \]
Subsequently, Wojtaszczyk [85] proved that these are precisely the bases for which the TGA merely converges, i.e.,
\[ \lim_{n\to\infty}G_n(f) = f. \]
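On the level of coefficients, the greedy approximant $G_m(f,\Psi)$ keeps the $m$ coefficients of largest absolute value in the expansion of $f$ and discards the rest. The following minimal sketch works with a finite coefficient list only; the function names and the tie-breaking rule (smaller index first) are assumptions made for this illustration and are not fixed by the text.

```python
def greedy_indices(coeffs, m):
    """Indices of the m largest |c_k|; ties are broken by the smaller index."""
    order = sorted(range(len(coeffs)), key=lambda k: (-abs(coeffs[k]), k))
    return sorted(order[:m])

def greedy_approximant(coeffs, m):
    """Coefficients of G_m(f): keep the m largest in modulus, zero the others."""
    keep = set(greedy_indices(coeffs, m))
    return [c if k in keep else 0.0 for k, c in enumerate(coeffs)]

if __name__ == "__main__":
    c = [0.5, -3.0, 2.0, 0.1]
    print(greedy_indices(c, 2))      # indices of the two largest coefficients
    print(greedy_approximant(c, 2))  # thresholded coefficient list
```

Quasi-greediness is a statement about the norms of the elements obtained this way: they must stay bounded by $C\|f\|$ uniformly in $m$, which the coefficient selection alone does not guarantee for a conditional basis.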
The main result of [84] is the following Lebesgue-type inequality for greedy approximation with respect to a quasi-greedy basis in the $L_p$ spaces. We prove it in Section 3.5 (see Theorem 3.5.5).
Theorem 3.1.2. Let $1<p<\infty$, $p\ne 2$, and let $\Psi$ be a quasi-greedy basis of the $L_p$ space. Then for each $f\in L_p$ we have
\[ \|f - G_m(f,\Psi)\|_{L_p} \le C(p,\Psi)\,m^{|1/2-1/p|}\,\sigma_m(f,\Psi)_{L_p}. \tag{3.1.1} \]
© Springer Basel 2015 V. Temlyakov, Sparse Approximation with Bases, Advanced Courses in Mathematics - CRM Barcelona, DOI 10.1007/978-3-0348-0890-3_3
Theorem 3.1.2 does not cover the case $p = 2$. It is mentioned in [85] that in the case $p = 2$ one has the following inequality:
\[ \|f - G_m(f,\Psi)\|_{L_2} \le C\log m\;\sigma_m(f,\Psi)_{L_2}. \]
We do not know if the above inequality is sharp in the sense that the extra factor $\log m$ cannot be replaced by a slower growing factor. The reader can find further discussion of this problem in [83]. We note that inequality (3.1.1) is known (see [85]) in the case of unconditional bases $\Psi$. It is proved in [68] (see also Chapter 2, Section 2.2) that (3.1.1) holds for the trigonometric system $\Psi = \{e^{ikx}\}$ for all $1\le p\le\infty$. It was noticed in [68] (see also Theorem 2.2.11 from Chapter 2) that (3.1.1) holds for any uniformly bounded orthonormal basis of $L_2$. Thus, it was known that for bases satisfying conditions very different in nature (uniformly bounded orthonormal bases of $L_2$ or quasi-greedy bases of $L_p$) similar Lebesgue-type inequalities (3.1.1) hold for greedy approximation. In this chapter we continue to study Lebesgue-type inequalities for greedy approximation. We try to bridge between the two conditions above: uniformly bounded orthonormal basis of $L_2$ and quasi-greedy basis of $L_p$. We consider uniformly bounded quasi-greedy bases of $L_q$ and study Lebesgue-type inequalities in $L_p$, $q\le p$. It turns out that even the question of existence of such bases is nontrivial. For instance, it is known (see [27]) that there are no uniformly bounded unconditional bases in $L_p$, $p\ne 2$. Quasi-greedy bases are close to unconditional bases. However, surprisingly, it turns out that uniformly bounded quasi-greedy bases exist in all $L_q$ with $1<q<\infty$. We discuss this issue in Section 3.3, where we present a construction of uniformly bounded quasi-greedy bases. In particular, we prove the following result there.
Theorem 3.1.3. There exists a uniformly bounded orthonormal quasi-greedy basis $\Psi = \{\psi_j\}_{j=1}^{\infty}$ in $L_p$, $1<p<\infty$, that consists of trigonometric polynomials.
We note that existence of uniformly bounded orthonormal quasi-greedy bases was proved in [57]. The construction in [57] is a variation on a construction in [50].
The same type of construction was used in [85]. Our construction in Section 3.3 is a somewhat more general version of the known construction. It is based on the trigonometric system, which allows us to build bases of interest consisting of trigonometric polynomials. This is important when we consider the Hardy spaces $H_p(D)$ of analytic functions. The construction in [57] is based on the Walsh system. It is known from [17] that the space $C[0,1]$ does not have quasi-greedy bases, while the space $L_1[0,1]$ has. In Section 3.4 we prove, in particular, that $L_1[0,1]$ does not have a uniformly bounded quasi-greedy basis. In Section 3.6 we prove Lebesgue-type inequalities for greedy approximation in $L_p$, $2\le p\le\infty$, under different assumptions on a basis $\Psi$. In that section we assume that $\Psi$ is a uniformly bounded basis. In addition, we assume that $\Psi$ is a basis of a certain type (quasi-greedy basis, Riesz basis) in one of the spaces $L_2$, $L_q$, $1<q<2$, or $L_q$, $2<q<\infty$. Here is a typical result from Section 3.6 (see Theorem 3.6.4). We will often use the notation $h(p) := \frac{1}{2}-\frac{1}{p}$.
Theorem 3.1.4. Assume that $\Psi$ is a uniformly bounded quasi-greedy basis of $L_2$. Then for any $m$-term polynomial
\[ t_m = \sum_{k\in P}b_k\psi_k, \qquad |P| = m, \]
we have, for $2\le p\le\infty$,
\[ \|f - G_m(f,\Psi)\|_p \le \|f - t_m\|_p + Cm^{h(p)}\ln(m+1)\,\|f - t_m\|_2. \]
In Section 3.7 we continue to prove Lebesgue-type inequalities for greedy approximation in $L_p$ under different assumptions on a basis $\Psi$. In that section we assume that $\Psi$ is a semi-normalized quasi-greedy basis for a pair of spaces: $L_q$, $1<q<\infty$, and $L_p$, $q\le p$. It turns out that this assumption results in a dramatic improvement of the corresponding Lebesgue-type inequalities. This is demonstrated by the following result (see Theorem 3.7.1).
Theorem 3.1.5. Assume that $\Psi$ is a semi-normalized quasi-greedy basis for both $L_q$ and $L_p$ with $1<q\le 2\le p<\infty$. Then for any $m$-term polynomial
\[ t_m = \sum_{k\in P}b_k\psi_k, \qquad |P| = m, \]
we have
\[ \|f - G_m(f,\Psi)\|_p \le \|f - t_m\|_p + C(p,q)\ln(m+1)\,\|f - t_m\|_q. \]
We now formulate some of the Lebesgue-type inequalities presented in this chapter. We already mentioned above (see Theorem 3.1.2) that the Lebesgue-type inequalities in $L_p$, $1<p<\infty$, under the assumption that $\Psi$ is a quasi-greedy basis of $L_p$ were obtained in [84]. In Section 3.6 we prove that if $\Psi$ is both quasi-greedy and democratic (see Definition 2.1.4 from Chapter 2) then, for any $f\in X$,
\[ \|f - G_m(f,\Psi)\|_X \le C\ln(m+1)\,\sigma_m(f,\Psi)_X. \tag{3.1.2} \]
We note that it is proved in [18] that bases which are simultaneously quasi-greedy and democratic are exactly almost greedy bases. We discuss almost greedy bases in detail in Chapter 4. As a corollary of (3.1.2) we obtain the Lebesgue-type inequality for a uniformly bounded quasi-greedy basis of $L_p$, $1<p<\infty$:
\[ \|f - G_m(f,\Psi)\|_p \le C(p)\ln(m+1)\,\sigma_m(f,\Psi)_p. \tag{3.1.3} \]
Comparing (3.1.3) with (3.1.1) we see that the extra assumption of uniform boundedness of the basis improves the Lebesgue-type inequalities dramatically.
In Section 3.7, making our assumptions on the basis even stronger, we improve (3.1.3) to the following inequality:
\[ \|f - G_m(f,\Psi)\|_p \le C(p)\bigl(\ln(m+1)\bigr)^{1/2}\sigma_m(f,\Psi)_p, \tag{3.1.4} \]
under the assumption that $\Psi$ is a uniformly bounded orthonormal quasi-greedy basis of $L_p$, $2\le p<\infty$. In Section 3.6 we impose assumptions on the basis in the $L_q$ space and obtain inequalities in the $L_p$ space:
\[ \|f - G_m(f,\Psi)\|_p \le C(p,q)\,m^{(1-q/p)/2}\ln(m+1)\,\sigma_m(f,\Psi)_p, \tag{3.1.5} \]
under the assumption that $\Psi$ is a uniformly bounded quasi-greedy basis of $L_q$, $1<q<\infty$, and $2\le p<\infty$, $p\ge q$. We note that in the case $p = q$ the inequality (3.1.5) turns into (3.1.3). We begin a systematic presentation with Section 3.2, where we list some properties of quasi-greedy bases that are used in this chapter.
3.2 Properties of quasi-greedy bases
Quasi-greedy bases. The definition of a quasi-greedy basis is given in the Introduction (see Definition 3.1.1). We give here an equivalent definition (see [82, p. 34]). For a set of indices $\Lambda$ we define the corresponding partial sum as
\[ S_\Lambda(f) := S_\Lambda(f,\Psi) := \sum_{k\in\Lambda}c_k(f)\psi_k. \]
Definition 3.2.1. We say that a basis $\Psi$ is quasi-greedy if there exists a constant $C_Q$ such that, for any $f\in X$ and any finite set of indices $\Lambda$ having the property
\[ \min_{k\in\Lambda}|c_k(f)| \ge \max_{k\notin\Lambda}|c_k(f)|, \]
we have
\[ \|S_\Lambda(f,\Psi)\| \le C_Q\|f\|. \]
We note that Definition 3.2.1 coincides with Definition 2.5.2 from Chapter 2 in the case of a normalized basis. It is proved in [44] that these definitions are equivalent in the case of semi-normalized bases. First, we present some known useful properties of quasi-greedy bases. For a given element $f\in X$ we consider the expansion
\[ f = \sum_{k=1}^{\infty}c_k(f)\psi_k. \]
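Let the sequence $k_j$, $j = 1, 2, \dots$, of positive integers be such that $|c_{k_1}(f)|\ge|c_{k_2}(f)|\ge\cdots$. We will use the notation
\[ a_j(f) := |c_{k_j}(f)| \]
for the decreasing rearrangement of the coefficients of $f$. It will be convenient to define the quasi-greedy constant $K$ to be the least constant such that
\[ \|G_m(f)\| \le K\|f\| \quad\text{and}\quad \|f - G_m(f)\| \le K\|f\|, \qquad f\in X. \]
Lemma 3.2.2. Suppose $\Psi$ is a quasi-greedy basis with quasi-greedy constant $K$. Then, for any real numbers $c_j$ and any finite set of indices $P$, we have
\[ (4K^2)^{-1}\min_{j\in P}|c_j|\,\Bigl\|\sum_{j\in P}\psi_j\Bigr\| \le \Bigl\|\sum_{j\in P}c_j\psi_j\Bigr\| \le 2K\max_{j\in P}|c_j|\,\Bigl\|\sum_{j\in P}\psi_j\Bigr\|. \]
The above Lemma 3.2.2 is a corollary of the following two lemmas.
Lemma 3.2.3. Suppose $\Psi = \{\psi_n\}_{n\in\mathbb{N}}$ has quasi-greedy constant $K$. Suppose $A$ is a finite subset of $\mathbb{N}$. Then, for every choice of signs $\varepsilon_j = \pm1$,
\[ \frac{1}{2K}\Bigl\|\sum_{j\in A}\psi_j\Bigr\| \le \Bigl\|\sum_{j\in A}\varepsilon_j\psi_j\Bigr\| \le 2K\Bigl\|\sum_{j\in A}\psi_j\Bigr\|, \tag{3.2.1} \]
and hence, for any real numbers $(b_j)_{j\in A}$,
\[ \Bigl\|\sum_{j\in A}b_j\psi_j\Bigr\| \le 2K\max_{j\in A}|b_j|\,\Bigl\|\sum_{j\in A}\psi_j\Bigr\|. \tag{3.2.2} \]
Proof. First note that if $B\subset A$ and $\varepsilon>0$, then
\[ \Bigl\|\sum_{j\in B}(1+\varepsilon)\psi_j\Bigr\| \le K\Bigl\|\sum_{j\in B}(1+\varepsilon)\psi_j + \sum_{j\in A\setminus B}\psi_j\Bigr\|. \]
Letting $\varepsilon\to0$, we obtain $\bigl\|\sum_{j\in B}\psi_j\bigr\| \le K\bigl\|\sum_{j\in A}\psi_j\bigr\|$, and hence for any choice of signs $\varepsilon_j = \pm1$, we have
\[ \Bigl\|\sum_{j\in A}\varepsilon_j\psi_j\Bigr\| \le 2K\Bigl\|\sum_{j\in A}\psi_j\Bigr\|. \]
This gives the right-hand inequality in (3.2.1); the left-hand inequality is similar. By convexity, (3.2.2) follows immediately.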
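The convexity step can be made explicit: a vector $b$ with $\max_j|b_j|\le1$ is a convex combination of sign vectors, so the norm of $\sum_j b_j\psi_j$ is at most the largest norm of $\sum_j\varepsilon_j\psi_j$. The sketch below produces one such decomposition with at most $n+1$ sign vectors; the function name and the threshold construction are choices made here for illustration, not taken from the text.

```python
def sign_decomposition(b):
    """Write b in [-1, 1]^n as sum_i w_i * eps^(i) with w_i > 0, sum w_i = 1,
    and every eps^(i) a vector of signs +1/-1.

    Idea: put t_j = (1 + b_j) / 2 in [0, 1]. For u in (0, 1) let
    eps_j(u) = +1 if u < t_j else -1. Averaging eps_j(u) over u gives
    t_j - (1 - t_j) = b_j, and eps(u) is constant between consecutive cuts.
    """
    t = [(1.0 + x) / 2.0 for x in b]
    cuts = sorted(set([0.0, 1.0] + t))
    pieces = []
    for lo, hi in zip(cuts, cuts[1:]):
        # on the interval (lo, hi): u < t_j exactly when t_j >= hi
        signs = [1 if tj >= hi else -1 for tj in t]
        pieces.append((hi - lo, signs))
    return pieces

if __name__ == "__main__":
    for weight, signs in sign_decomposition([0.5, -1.0, 0.0, 1.0]):
        print(weight, signs)
```

For general $b$, apply the decomposition to $b/\max_j|b_j|$ and use the triangle inequality together with the right-hand side of (3.2.1).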
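Lemma 3.2.4. Suppose $\Psi = \{\psi_n\}_{n\in\mathbb{N}}$ has quasi-greedy constant $K$. Then, for any $f\in X$ and any $n\in\mathbb{N}$,
\[ a_n(f)\,\Bigl\|\sum_{j=1}^{n}\psi_{k_j}\Bigr\| \le 4K^2\|f\|, \tag{3.2.3} \]
and hence, if $A$ is any subset of $\mathbb{N}$ and $(b_j)_{j\in A}$ are real numbers,
\[ \min_{j\in A}|b_j|\,\Bigl\|\sum_{j\in A}\psi_j\Bigr\| \le 4K^2\Bigl\|\sum_{j\in A}b_j\psi_j\Bigr\|. \tag{3.2.4} \]
Proof. Let $\varepsilon_j := \operatorname{sign}c_{k_j}(f)$. Using Abel's transform, we write
\[ \sum_{j=1}^{n}\varepsilon_j\psi_{k_j} = \sum_{j=1}^{n}\frac{1}{a_j(f)}c_{k_j}(f)\psi_{k_j} = \frac{1}{a_1(f)}\sum_{j=1}^{n}c_{k_j}(f)\psi_{k_j} + \Bigl(\frac{1}{a_2(f)}-\frac{1}{a_1(f)}\Bigr)\sum_{j=2}^{n}c_{k_j}(f)\psi_{k_j} + \cdots + \Bigl(\frac{1}{a_n(f)}-\frac{1}{a_{n-1}(f)}\Bigr)c_{k_n}(f)\psi_{k_n}. \tag{3.2.5} \]
Next we have, for any $l\in[1,n]$,
\[ \Bigl\|\sum_{j=l}^{n}c_{k_j}(f)\psi_{k_j}\Bigr\| \le 2K\|f\|. \tag{3.2.6} \]
Inequalities (3.2.5), (3.2.6) and Lemma 3.2.3 imply (3.2.3). Inequality (3.2.4) is a direct consequence of (3.2.3).
Lemma 3.2.5. Let $\Psi$ be a quasi-greedy basis. Then, for any two finite sets of indices $A\subseteq B$ and coefficients $0<t\le|c_j|\le1$, $j\in B$, we have
\[ \Bigl\|\sum_{j\in A}c_j\psi_j\Bigr\| \le C(X,\Psi,t)\Bigl\|\sum_{j\in B}c_j\psi_j\Bigr\|. \]
Proof. Using Lemma 3.2.2, we get
\[ \Bigl\|\sum_{j\in A}c_j\psi_j\Bigr\| \le 2K\Bigl\|\sum_{j\in A}\psi_j\Bigr\| \le (2K)^2\Bigl\|\sum_{j\in B}\psi_j\Bigr\| \le (2K)^4t^{-1}\Bigl\|\sum_{j\in B}c_j\psi_j\Bigr\|, \]
as claimed.
We next present a result from [17].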
Lemma 3.2.6. Let $\Psi$ be a quasi-greedy basis of $X$. Then for any finite set of indices $\Lambda$, $|\Lambda| = m$, we have, for all $f\in X$,
\[ \|S_\Lambda(f,\Psi)\| \le C\ln(m+1)\,\|f\|. \]
Proof. Without loss of generality, assume that $f$ is normalized in such a way that guarantees that $a_1(f)\le1$, and take $m\ge2$. Consider, for an integer $s\ge0$,
\[ \tau_s := \{k : 2^{-s}\le|c_k(f)|<2^{1-s}\}. \]
Denote
\[ \Lambda_s := \Lambda\cap\tau_s, \qquad \Lambda' := \Lambda\setminus\bigcup_{s\le\log_2m}\Lambda_s. \]
The semi-normalization property of the basis $\Psi$ implies
\[ \|S_{\Lambda'}(f)\| \le \frac{2}{m}|\Lambda'|\,C_0 \le 2C_0. \]
For $s\le\log_2m$ we have
\[ S_{\Lambda_s}(f) = S_{\Lambda_s}\bigl(S_{\tau_s}(f)\bigr). \]
By Lemma 3.2.5,
\[ \|S_{\Lambda_s}(f)\| \le C\|S_{\tau_s}(f)\|. \]
Our assumption that $\Psi$ is a quasi-greedy basis implies that, for all $s$, $\|S_{\tau_s}(f)\|\le C\|f\|$. Thus, for $s\le\log_2m$,
\[ \|S_{\Lambda_s}(f)\| \le C\|f\|, \]
and so
\[ \|S_\Lambda(f)\| \le C\ln(m+1)\,\|f\|. \]
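The bookkeeping in this proof (about $\log_2 m$ dyadic coefficient levels plus a remainder of small coefficients) can be sketched as follows. The function name and the dictionary representation of the coefficients are assumptions for the illustration; the coefficients are taken to be normalized so that $\max_k|c_k|\le1$.

```python
import math

def split_by_dyadic_levels(coeffs, Lambda):
    """Split Lambda into Lambda_s = Lambda ∩ tau_s for 0 <= s <= floor(log2 m),
    with tau_s = {k : 2^-s <= |c_k| < 2^(1-s)}, plus the remainder Lambda'.

    coeffs: dict index -> coefficient with max |c_k| <= 1.
    Lambda: finite collection of indices, m = |Lambda| >= 2.
    """
    Lambda = list(Lambda)
    m = len(Lambda)
    s_max = m.bit_length() - 1                 # floor(log2 m)
    levels = {s: [] for s in range(s_max + 1)}
    remainder = []
    for k in Lambda:
        c = abs(coeffs[k])
        if c == 0.0:
            remainder.append(k)
            continue
        _, e = math.frexp(c)                   # c = mant * 2^e, mant in [0.5, 1)
        s = 1 - e                              # hence 2^-s <= c < 2^(1-s)
        if s <= s_max:
            levels[s].append(k)
        else:
            remainder.append(k)                # here |c_k| < 2^-s_max < 2/m
    return levels, remainder

if __name__ == "__main__":
    c = {0: 1.0, 1: 0.6, 2: 0.3, 3: 0.01, 4: 0.5}
    print(split_by_dyadic_levels(c, [0, 1, 2, 3, 4]))
```

Each of the at most $\lfloor\log_2 m\rfloor+1$ levels contributes at most $C\|f\|$ by Lemma 3.2.5, and the remainder contributes at most $2C_0$, which is where the factor $\ln(m+1)$ comes from.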
The following Lemma 3.2.7 from [21] (see also [28]) answers Question 2 from [33]. Let
\[ f = \sum_{k=1}^{\infty}c_k(f)\psi_k. \]
We define the following expansional best $m$-term approximation of $f$:
\[ \tilde\sigma_m(f) := \tilde\sigma_m(f,\Psi) := \inf_{\Lambda,\,|\Lambda|=m}\Bigl\|f - \sum_{k\in\Lambda}c_k(f)\psi_k\Bigr\|. \]
It is clear that
\[ \sigma_m(f,\Psi) \le \tilde\sigma_m(f,\Psi). \]
It is also clear that for an unconditional basis $\Psi$ we have
\[ \tilde\sigma_m(f,\Psi) \le C(X,\Psi)\,\sigma_m(f,\Psi). \]
Lemma 3.2.7. Let $\Psi$ be a quasi-greedy basis of $X$. Then, for all $f\in X$,
\[ \tilde\sigma_m(f) \le C\ln(m+1)\,\sigma_m(f). \]
Proof. For a given $\varepsilon>0$, let $p_m$ be an $m$-term polynomial
\[ p_m := \sum_{k\in P}b_k\psi_k, \qquad |P| = m, \]
such that
\[ \|f - p_m\| \le \sigma_m(f)+\varepsilon. \]
Then by Lemma 3.2.6 we obtain
\[ \tilde\sigma_m(f) \le \|f - S_P(f)\| = \|f - p_m - S_P(f - p_m)\| \le C\ln(m+1)\bigl(\sigma_m(f)+\varepsilon\bigr). \]
This completes the proof.
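The middle equality in the last display of this proof uses only the linearity of $S_P$ and the fact that $p_m$ is supported on $P$, so that $S_P(p_m) = p_m$. Written out as a short worked step (the constant $C$ is the one from Lemma 3.2.6):

```latex
\begin{align*}
f - S_P(f) &= (f - p_m) - \bigl(S_P(f) - S_P(p_m)\bigr) = (f - p_m) - S_P(f - p_m),\\
\|f - S_P(f)\| &\le \|f - p_m\| + \|S_P(f - p_m)\|
  \le \bigl(1 + C\ln(m+1)\bigr)\,\|f - p_m\|.
\end{align*}
```

Since $m\ge1$, the factor $1 + C\ln(m+1)$ is bounded by a constant multiple of $\ln(m+1)$, which gives the form stated in the lemma.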
We prove one more estimate for $\tilde\sigma_n(f,\Psi)$ in terms of $\sigma_m(f,\Psi)$ for a quasi-greedy basis $\Psi$. For a basis $\Psi$ we define the fundamental function
\[ \varphi(m) := \sup_{|A|\le m}\Bigl\|\sum_{k\in A}\psi_k\Bigr\|. \]
We also need the following function:
\[ \phi(m) := \inf_{|A|=m}\Bigl\|\sum_{k\in A}\psi_k\Bigr\|. \]
The functions $\varphi(m)$, $\phi(m)$, and
\[ \varphi^s(m) := \sup_{|A|=m}\Bigl\|\sum_{k\in A}\psi_k\Bigr\| \]
play an important role in studying approximation properties of the greedy algorithm $G_m$ with respect to $\Psi$. We have already made some comments on this in Section 2.7 of Chapter 2. Further results and a discussion can be found in [39]. The following inequality was obtained in [18].
Lemma 3.2.8. Let $\Psi$ be a quasi-greedy basis. Then for any $m$ and $r$ there exists a set $E$ with $|E|\le m+r$, such that
\[ \|f - S_E(f)\| \le C\Bigl(1+\frac{\varphi(m)}{\phi(r+1)}\Bigr)\sigma_m(f). \]
Proof. If $\sigma_m(f) = 0$, then $f = \sum_{k\in A}c_k(f)\psi_k$, $|A|\le m$ and, therefore, $S_A(f) = f$. Let $\sigma_m(f)\ne0$ and let $A$ be a set, $|A| = m$, such that
\[ \|f - p_m(f)\| \le 2\sigma_m(f), \qquad p_m(f) = \sum_{k\in A}b_k\psi_k. \tag{3.2.7} \]
Denote $g := f - p_m(f)$. Let $B$, $|B| = r$, be such that
\[ G_r(g) = \sum_{k\in B}c_k(g)\psi_k. \]
Consider
\[ f - S_{A\cup B}(f) = g - S_{A\cup B}(g) = g - S_B(g) - S_{A\setminus B}(g). \tag{3.2.8} \]
By the assumption that $\Psi$ is quasi-greedy and by the definition of $B$,
\[ \|g - S_B(g)\| \le C_1\|g\| \le 2C_1\sigma_m(f). \tag{3.2.9} \]
Let us estimate $\|S_{A\setminus B}(g)\|$. By Lemma 3.2.4 we get
\[ \max_{k\in A\setminus B}|c_k(g)| \le 4K^2\bigl(\phi(r+1)\bigr)^{-1}\|g\|. \]
Next, by Lemma 3.2.3,
\[ \|S_{A\setminus B}(g)\| \le (2K)^3\varphi(m)\,\phi(r+1)^{-1}\|g\|. \tag{3.2.10} \]
Combining (3.2.9) and (3.2.10) we derive from (3.2.8), for $E := A\cup B$,
\[ \|f - S_E(f)\| \le C\Bigl(1+\frac{\varphi(m)}{\phi(r+1)}\Bigr)\sigma_m(f). \]
Thus Lemma 3.2.8 is proved.
It was noticed in [28] that a modification of the above proof of Lemma 3.2.8 gives the following interesting inequality.
Theorem 3.2.9. Let $\Psi$ be a quasi-greedy basis. Then for any $m$ and $r$ we have
\[ \|f - G_{m+r}(f)\| \le C\Bigl(1+\frac{\varphi(m)}{\phi(r)}\Bigr)\sigma_m(f). \]
Proof. Let $Q$ be such that
\[ G_{m+r}(f) = \sum_{k\in Q}c_k(f)\psi_k. \]
We use notations and inequalities from the proof of Lemma 3.2.8. We have
\[ \|f - G_{m+r}(f)\| = \|f - S_Q(f)\| \le \|f - S_E(f)\| + \|S_E(f) - S_Q(f)\|. \]
The term $\|f - S_E(f)\|$ was estimated above in Lemma 3.2.8. For the remaining term we have
\[ \|S_E(f) - S_Q(f)\| \le \|S_{E\setminus Q}(f)\| + \|S_{Q\setminus E}(f)\|. \]
Since $\Psi$ is quasi-greedy, we obtain
\[ \|S_{Q\setminus E}(f)\| = \bigl\|G_{|Q\setminus E|}\bigl(f - S_E(f)\bigr)\bigr\| \le K\|f - S_E(f)\|. \]
Again, we use Lemma 3.2.8 to estimate $\|f - S_E(f)\|$. Finally, using the fact that $B\setminus A\subset Q$ we obtain $E\setminus Q = A\setminus Q$ and estimate $\|S_{A\setminus Q}(f)\|$. We have, by Lemma 3.2.4,
\[ \max_{k\in A\setminus Q}|c_k(f)| \le \min_{k\in Q}|c_k(f)| \le \min_{k\in Q\setminus A}|c_k(f)| = \min_{k\in Q\setminus A}|c_k(g)| \le a_r(g) \le 4K^2\phi(r)^{-1}\|g\|. \]
Therefore, by Lemma 3.2.3,
\[ \|S_{E\setminus Q}(f)\| = \|S_{A\setminus Q}(f)\| \le (2K)^3\varphi(m)\,\phi(r)^{-1}\|g\|. \]
This completes the proof of Theorem 3.2.9.
We now proceed to a result about quasi-greedy bases in $L_p$ spaces. We use the brief notation $\|\cdot\|_p := \|\cdot\|_{L_p}$. The following theorem is from [83]. We note that Theorem 3.2.10 was proved in [85] in the case $p = 2$.
Theorem 3.2.10. Let $\Psi = \{\psi_k\}_{k=1}^{\infty}$ be a quasi-greedy basis in $L_p$, where $1<p<\infty$. Then for each $f\in X$ we have
\[ C_1(p)\sup_n n^{1/p}a_n(f) \le \|f\|_p \le C_2(p)\sum_{n=1}^{\infty}n^{-1/2}a_n(f), \qquad 2\le p<\infty, \]
\[ C_3(p)\sup_n n^{1/2}a_n(f) \le \|f\|_p \le C_4(p)\sum_{n=1}^{\infty}n^{1/p-1}a_n(f), \qquad 1<p\le2. \]
Proof. Denote $\mathcal{N}_s := \{n : a_n(f)\ge2^{-s}\}$ and $N_s := |\mathcal{N}_s|$. The proofs in both cases $1<p\le2$ and $2\le p<\infty$ are similar. We give a proof only in the case $2\le p<\infty$. First, we prove the upper bound for $\|f\|_p$. As in the proof of Lemma 3.2.6, we can assume, with no loss of generality, that $f$ is normalized in such a way that guarantees that $a_1(f)<1$. In this case we have
\[ \|f\|_p \le \sum_{s=1}^{\infty}\Bigl\|\sum_{n\in\mathcal{N}_s\setminus\mathcal{N}_{s-1}}c_{k_n}(f)\psi_{k_n}\Bigr\|_p. \]
Using Lemma 3.2.2, we get
\[ \|f\|_p \le 4K\sum_{s=1}^{\infty}2^{-s}\Bigl\|\sum_{n\in\mathcal{N}_s\setminus\mathcal{N}_{s-1}}\psi_{k_n}\Bigr\|_p. \tag{3.2.11} \]
The $L_p$ space has type 2 for $2\le p<\infty$ (see the definition of type and a relevant discussion below, at the end of this section after Definition 3.2.19). Therefore,
\[ \Bigl\|\sum_{n\in\mathcal{N}_s\setminus\mathcal{N}_{s-1}}\psi_{k_n}\Bigr\|_p \le C(p)N_s^{1/2}, \tag{3.2.12} \]
and
\[ \|f\|_p \le C(p)\sum_{s=1}^{\infty}2^{-s}N_s^{1/2} \le C(p)\sum_{s=1}^{\infty}2^{-s}\sum_{n=1}^{N_s}n^{-1/2} \le C(p)\sum_{n=1}^{\infty}n^{-1/2}a_n(f) \le C_2(p)\sum_{n=1}^{\infty}n^{-1/2}a_n(f). \]
Second, we prove the lower bound for $\|f\|_p$. The $L_p$ space with $2\le p<\infty$ is of cotype $p$. Therefore,
\[ \Bigl\|\sum_{l=1}^{n}\psi_{k_l}\Bigr\|_p \ge C(p)n^{1/p}. \tag{3.2.13} \]
Now Lemma 3.2.4 yields the required lower bound.
Remark 3.2.11. Theorem 3.2.10 was proved in [83] under the assumption that $\Psi$ is a normalized basis. That proof works for a semi-normalized basis as well.
Remark 3.2.12. The proof of Theorem 3.2.10 in [83] gives the following inequalities. Let $\Psi = \{\psi_k\}_{k=1}^{\infty}$ be a quasi-greedy basis of $X$. If for any set of indices $A$ of cardinality $m$ we have $\bigl\|\sum_{k\in A}\psi_k\bigr\|_X \le Cm^{1/2}$, then for each $f\in X$,
\[ \|f\|_X \le C_1\sum_{n=1}^{\infty}n^{-1/2}a_n(f). \tag{3.2.14} \]
If for any set of indices $A$ of cardinality $m$ we have $\bigl\|\sum_{k\in A}\psi_k\bigr\|_X \ge cm^{1/2}$, then for each $f\in X$,
\[ \|f\|_X \ge c_1\sup_n n^{1/2}a_n(f). \]
A general version of (3.2.14) was obtained in [33]. Define, as above, the fundamental function $\varphi(m) := \varphi(m,\Psi,X)$ of a basis $\Psi$ in $X$ as
\[ \varphi(m,\Psi,X) := \sup_{|A|\le m}\Bigl\|\sum_{k\in A}\psi_k\Bigr\|. \]
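Lemma 3.2.13. Let $\Psi$ be a quasi-greedy basis of $X$. Then for each $f\in X$ we have
\[ \|f\| \le C\sum_{n=1}^{\infty}a_n(f)\,\varphi(n)\,\frac{1}{n}. \]
Proof. It is known (see [18, p. 581]) that $\varphi(n)/n$ is monotone decreasing. Therefore, by Lemma 3.2.2 we obtain
\[ \|f\| \le \sum_{s=1}^{\infty}\Bigl\|\sum_{n=2^{s-1}}^{2^s-1}a_n(f)\psi_{k_n}\Bigr\| \le C\sum_{s=1}^{\infty}a_{2^{s-1}}(f)\,\varphi(2^{s-1}) \le C\sum_{n=1}^{\infty}a_n(f)\,\varphi(n)\,\frac{1}{n}. \]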
Uniformly bounded quasi-greedy bases. It is clear that any orthonormal basis of a Hilbert space $H$ is an unconditional basis and, therefore, a quasi-greedy basis of $H$. For example, the trigonometric basis is a uniformly bounded orthonormal basis of $L_2$. Even the question of existence of a uniformly bounded quasi-greedy basis in $L_p$, $p\ne2$, is nontrivial. It is known (see [27]) that there are no uniformly bounded unconditional bases in $L_p$, $p\ne2$. As we already mentioned in the Introduction, there are uniformly bounded quasi-greedy bases in $L_p$, $1<p<\infty$. We build such bases in Section 3.3. We now present some properties of these bases. We begin by proving an analog of Lemma 2.2 from [68] (see Lemma 2.2.3 from Chapter 2). Lemma 2.2 from [68] was proved for the trigonometric system. We will prove its analog for a uniformly bounded Riesz basis of $L_2$.
Lemma 3.2.14. Assume that $\Psi$ is a uniformly bounded Riesz basis of $L_2$. Then for any set $\Lambda$ of indices we have, for $2\le p\le\infty$,
\[ \|S_\Lambda(f)\|_p \le C|\Lambda|^{h(p)}\|f\|_2. \]
Proof. Let
\[ f = \sum_{k=1}^{\infty}c_k(f)\psi_k \]
and $m := |\Lambda|$. Our assumptions on $\Psi$ imply that
\[ \|S_\Lambda(f)\|_2 \le C\|f\|_2 \]
and
\[ \|S_\Lambda(f)\|_\infty \le \sum_{k\in\Lambda}|c_k(f)|\,\|\psi_k\|_\infty \le Cm^{1/2}\Bigl(\sum_{k\in\Lambda}|c_k(f)|^2\Bigr)^{1/2} \le Cm^{1/2}\|f\|_2. \]
Using the inequality
\[ \|g\|_p \le \|g\|_2^{2/p}\,\|g\|_\infty^{1-2/p}, \qquad 2\le p\le\infty, \tag{3.2.15} \]
we obtain the required bound from the above inequalities.
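For completeness, here is how (3.2.15) and the final bound come about for $2\le p<\infty$ (the case $p=\infty$ is the $L_\infty$ estimate itself). This is a worked step under the assumption that the underlying measure space is the one on which $L_2$ and $L_\infty$ are taken in the lemma, with $m := |\Lambda|$:

```latex
\begin{align*}
\|g\|_p^p &= \int |g|^{p-2}\,|g|^2 \le \|g\|_\infty^{p-2}\,\|g\|_2^2
  \quad\Longrightarrow\quad
  \|g\|_p \le \|g\|_2^{2/p}\,\|g\|_\infty^{1-2/p},\\
\|S_\Lambda(f)\|_p &\le \bigl(C\|f\|_2\bigr)^{2/p}\bigl(Cm^{1/2}\|f\|_2\bigr)^{1-2/p}
  = C\,m^{\frac12\left(1-\frac2p\right)}\|f\|_2 = C\,m^{1/2-1/p}\,\|f\|_2 .
\end{align*}
```

The exponent $1/2-1/p$ is exactly the one appearing in the statement of Lemma 3.2.14 for $p\ge2$.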
We now prove an analog of Lemma 3.2.14 for uniformly bounded quasi-greedy bases.
Lemma 3.2.15. Assume that $\Psi$ is a uniformly bounded quasi-greedy basis of $L_2$. Then for any set $\Lambda$ of indices we have, for $2<p\le\infty$,
\[ \|S_\Lambda(f)\|_p \le C|\Lambda|^{h(p)}\|S_\Lambda(f)\|_2. \tag{3.2.16} \]
Moreover,
\[ \|S_\Lambda(f)\|_2 \le C\ln(|\Lambda|+1)\,\|f\|_2. \tag{3.2.17} \]
Proof. First, we prove (3.2.17). We note that (3.2.17) follows from Lemma 3.2.6, which does not require the basis to be uniformly bounded. We give another proof here that does not require uniform boundedness of the basis either. Using the notation $m := |\Lambda|$ we obtain, by Theorem 3.2.10,
\[ \|S_\Lambda(f)\|_2 \le C_2(2)\sum_{n=1}^{m}n^{-1/2}a_n\bigl(S_\Lambda(f)\bigr) \le C\sum_{n=1}^{m}n^{-1/2}a_n(f) \le C\sum_{n=1}^{m}n^{-1/2}C_3(2)^{-1}\|f\|_2\,n^{-1/2} \le C\ln(m+1)\,\|f\|_2. \]
This proves (3.2.17). Second, we prove (3.2.16). We have
\[ \|S_\Lambda(f)\|_\infty \le \sum_{k\in\Lambda}|c_k(f)|\,\|\psi_k\|_\infty \le C\sum_{n=1}^{m}a_n\bigl(S_\Lambda(f)\bigr) \le C\sum_{n=1}^{m}n^{-1/2}\|S_\Lambda(f)\|_2 \le Cm^{1/2}\|S_\Lambda(f)\|_2. \]
The above inequality combined with (3.2.15) gives (3.2.16).
In the following lemma we replace the assumption of being quasi-greedy in $L_2$ by the corresponding assumption in $L_q$, $1<q<\infty$.
Lemma 3.2.16. Assume that $\Psi$ is a uniformly bounded quasi-greedy basis of $L_q$, $1<q<\infty$. Then for any set $\Lambda$ of indices we have, for $q<p\le\infty$,
\[ \|S_\Lambda(f)\|_p \le C|\Lambda|^{(1-q/p)/2}\|S_\Lambda(f)\|_q. \tag{3.2.18} \]
We also have
\[ \|S_\Lambda(f)\|_q \le C\ln(|\Lambda|+1)\,\|f\|_q. \tag{3.2.19} \]
Proof. Inequality (3.2.19) follows from Lemma 3.2.6. We prove (3.2.18). With $m := |\Lambda|$ we have
\[ \|S_\Lambda(f)\|_\infty \le \sum_{k\in\Lambda}|c_k(f)|\,\|\psi_k\|_\infty \le C\sum_{n=1}^{m}a_n\bigl(S_\Lambda(f)\bigr). \]
By Proposition 3.2.22 (see below), we continue:
\[ \le C\sum_{n=1}^{m}n^{-1/2}\|S_\Lambda(f)\|_q \le Cm^{1/2}\|S_\Lambda(f)\|_q. \]
The above inequality, combined with
\[ \|g\|_p \le \|g\|_q^{q/p}\,\|g\|_\infty^{1-q/p}, \qquad q\le p\le\infty, \tag{3.2.20} \]
gives (3.2.18). We note that in the case $1<q\le2$ we could use Theorem 3.2.10 instead of Proposition 3.2.22.
Uniformly bounded orthonormal quasi-greedy bases. We prove in Section 3.3 that there exist uniformly bounded orthonormal quasi-greedy bases in $L_p$, $1<p<\infty$. We also prove in Section 3.3 that if $\Psi$ is a uniformly bounded orthonormal quasi-greedy basis in $L_p$, $2\le p<\infty$, then $\Psi$ is a quasi-greedy basis of $L_{p'}$, where $p' := p/(p-1)$ is the dual exponent. Thus there are uniformly bounded bases which are quasi-greedy bases of two spaces $L_p$ and $L_{p'}$, $2<p<\infty$. We now present some results in this direction. We prove an analog of Lemma 3.2.15.
Lemma 3.2.17. Assume that $\Psi$ is a semi-normalized quasi-greedy basis for both $L_q$ and $L_p$ with $1<q\le2$, $2\le p<\infty$. Then for any set $\Lambda$ of indices we have
\[ \|S_\Lambda(f)\|_p \le C\ln(|\Lambda|+1)\,\|f\|_q. \tag{3.2.21} \]
Proof. Using the notation $m := |\Lambda|$ we obtain, by Theorem 3.2.10,
\[ \|S_\Lambda(f)\|_p \le C_2(p)\sum_{n=1}^{m}n^{-1/2}a_n\bigl(S_\Lambda(f)\bigr) \le C(p)\sum_{n=1}^{m}n^{-1/2}a_n(f) \le C(p)\sum_{n=1}^{m}n^{-1/2}C_3(q)^{-1}\|f\|_q\,n^{-1/2} \le C(p,q)\ln(m+1)\,\|f\|_q. \]
This proves (3.2.21).
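The logarithmic factor in (3.2.21), and in the similar estimates of this section, is the harmonic sum $\sum_{n=1}^{m}n^{-1/2}\cdot n^{-1/2} = \sum_{n=1}^{m}n^{-1}$. A quick numerical sanity check of the comparison with $\ln(m+1)$ is sketched below; the helper name `harmonic` and the explicit constant $1/\ln 2$ are choices made for this illustration only.

```python
import math

def harmonic(m):
    """H_m = sum_{n=1}^{m} 1/n, the sum behind the factor ln(m + 1)."""
    return sum(1.0 / n for n in range(1, m + 1))

if __name__ == "__main__":
    for m in (1, 2, 10, 100, 1000):
        # compare H_m with the classical bound 1 + ln m and with ln(m+1)/ln 2
        print(m, harmonic(m), 1 + math.log(m), math.log(m + 1) / math.log(2))
```

In the range checked, $H_m\le\ln(m+1)/\ln2$, with equality at $m=1$, so the constant in front of $\ln(m+1)$ can be taken independent of $m$.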
Lemma 3.2.18. Assume that $\Psi$ is a uniformly bounded orthonormal quasi-greedy basis of $L_p$, $2<p<\infty$. Then for any set $\Lambda$ of indices we have
\[ \|S_\Lambda(f)\|_2 \le C\bigl(\ln(|\Lambda|+1)\bigr)^{1/2}\|f\|_p \tag{3.2.22} \]
and
\[ \|S_\Lambda(f)\|_p \le C\bigl(\ln(|\Lambda|+1)\bigr)^{1/2}\|f\|_2. \tag{3.2.23} \]
Proof. Let $|\Lambda| = m$. By Theorem 3.2.20 (see below) and Theorem 3.2.10 we have, in the case of (3.2.22),
\[ \|S_\Lambda(f)\|_2 \le \Bigl(\sum_{n=1}^{m}a_n(f)^2\Bigr)^{1/2} \le C\Bigl(\sum_{n=1}^{m}n^{-1}\|f\|_p^2\Bigr)^{1/2} \le C\bigl(\ln(m+1)\bigr)^{1/2}\|f\|_p. \]
In the case of (3.2.23) we obtain, by Theorem 3.2.10,
\[ \|S_\Lambda(f)\|_p \le C\sum_{n=1}^{m}n^{-1/2}a_n(f) \le C\bigl(\ln(m+1)\bigr)^{1/2}\Bigl(\sum_{n=1}^{m}a_n(f)^2\Bigr)^{1/2} \le C\bigl(\ln(m+1)\bigr)^{1/2}\|f\|_2. \]
Let us discuss uniformly bounded orthonormal quasi-greedy bases in more detail. Existence of such bases is guaranteed by Theorems 3.3.5 and 3.2.20. We first recall the definition of bases called unconditional for constant coefficients; cf. [85].
Definition 3.2.19. A basis $\Psi$ is called unconditional for constant coefficients (UCC) if there exist constants $C_1$ and $C_2$ such that for each finite subset $A\subset\mathbb{N}$ and for each choice of signs $\varepsilon_i = \pm1$ we have
\[ C_1\Bigl\|\sum_{i\in A}\psi_i\Bigr\| \le \Bigl\|\sum_{i\in A}\varepsilon_i\psi_i\Bigr\| \le C_2\Bigl\|\sum_{i\in A}\psi_i\Bigr\|. \]
It is known ([85]; see also Lemma 3.2.3) that quasi-greedy bases are UCC bases. To formulate our results we need some of the basic concepts of Banach space theory from [52]. First, let us recall the definition of type and cotype. Let $\{\varepsilon_i\}$ be a sequence of independent Rademacher variables. We say that a Banach space $X$ has type $p$ if there exists a universal constant $C_3$ such that, for $f_k\in X$,
\[ \Bigl(\operatorname{Ave}_{\varepsilon_k=\pm1}\Bigl\|\sum_{k=1}^{n}\varepsilon_kf_k\Bigr\|^p\Bigr)^{1/p} \le C_3\Bigl(\sum_{k=1}^{n}\|f_k\|^p\Bigr)^{1/p}, \]
and $X$ is of cotype $q$ if there exists a universal constant $C_4$ such that, for $f_k\in X$,
\[ \Bigl(\operatorname{Ave}_{\varepsilon_k=\pm1}\Bigl\|\sum_{k=1}^{n}\varepsilon_kf_k\Bigr\|^q\Bigr)^{1/q} \ge C_4\Bigl(\sum_{k=1}^{n}\|f_k\|^q\Bigr)^{1/q}. \]
It is known that $L_p$, $2\le p<\infty$, has type 2 and cotype $p$. Consider a uniformly bounded orthonormal quasi-greedy basis $\Psi = \{\psi_j\}_{j=1}^{\infty}$ in $L_p$, $2<p<\infty$.
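Then we obtain from its orthonormality and property UCC that for any set $A$ of indices of cardinality $m$ we have
\[ m^{1/2} = \Bigl\|\sum_{k\in A}\psi_k\Bigr\|_2 \le \Bigl\|\sum_{k\in A}\psi_k\Bigr\|_p \asymp \Bigl(\operatorname{Ave}_{\varepsilon_k=\pm1}\Bigl\|\sum_{k\in A}\varepsilon_k\psi_k\Bigr\|_p^p\Bigr)^{1/p} \le C(p)\Bigl(\sum_{k\in A}\|\psi_k\|_p^2\Bigr)^{1/2} \asymp \Bigl\|\Bigl(\sum_{k\in A}|\psi_k|^2\Bigr)^{1/2}\Bigr\|_p \asymp m^{1/2}. \tag{3.2.24} \]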
Relations (3.2.24) show that for a uniformly bounded orthonormal quasi-greedy basis $\Psi = \{\psi_j\}_{j=1}^{\infty}$ in $L_p$, $2<p<\infty$, we have $\varphi(m,\Psi,L_p)\asymp m^{1/2}$. In particular, this implies that $\Psi$ is democratic. We consider along with the basis $\Psi$ in $L_p$ its dual basis $\Psi^*$ in $L_{p'}$. By the orthonormality of $\Psi$, $\Psi^* = \Psi$. Properties of dual bases to quasi-greedy and almost greedy bases are discussed in detail in [18] (see also Chapter 4). In particular, by Proposition 4.4.4 and Theorem 4.5.4 from Chapter 4, the relation $\varphi(m,\Psi,L_p)\asymp m^{1/2}$ implies that $\Psi$ is also a quasi-greedy basis of $L_{p'}$. We formulate this conclusion as a theorem.
Theorem 3.2.20. Let $\Psi$ be a uniformly bounded orthonormal quasi-greedy basis $\Psi = \{\psi_j\}_{j=1}^{\infty}$ in $L_p$, $2<p<\infty$. Then $\Psi$ is a quasi-greedy basis of $L_{p'}$.
Proposition 3.2.21. Let $\Psi$ be a uniformly bounded quasi-greedy basis $\Psi = \{\psi_j\}_{j=1}^{\infty}$ in $L_q$, $1<q<\infty$. Then $\Psi$ is democratic with fundamental function $\varphi(m,\Psi,L_q)\asymp m^{1/2}$.
Proof. The proofs in the two cases $1<q\le2$ and $2\le q<\infty$ are similar. We give here only a proof for $1<q\le2$. Using the UCC property of quasi-greedy bases and using the fact that $L_q$, $1<q\le2$, is of cotype 2, we obtain, as in (3.2.24),
\[ \Bigl\|\sum_{k\in A}\psi_k\Bigr\|_q \asymp \Bigl(\operatorname{Ave}_{\varepsilon_k=\pm1}\Bigl\|\sum_{k\in A}\varepsilon_k\psi_k\Bigr\|_q^2\Bigr)^{1/2} \ge Cm^{1/2}. \]
Also
\[ \Bigl\|\sum_{k\in A}\psi_k\Bigr\|_q \asymp \Bigl(\operatorname{Ave}_{\varepsilon_k=\pm1}\Bigl\|\sum_{k\in A}\varepsilon_k\psi_k\Bigr\|_q^2\Bigr)^{1/2} \le \Bigl(\operatorname{Ave}_{\varepsilon_k=\pm1}\Bigl\|\sum_{k\in A}\varepsilon_k\psi_k\Bigr\|_2^2\Bigr)^{1/2} \le Cm^{1/2}. \]
Combination of Proposition 3.2.21 and Remark 3.2.12 gives the following inequalities, which we will often use.
Proposition 3.2.22. Let $\Psi$ be a uniformly bounded quasi-greedy basis $\Psi = \{\psi_j\}_{j=1}^{\infty}$ in $L_q$, $1<q<\infty$. Then, for $f\in L_q$,
\[ c_1(q)\sup_n n^{1/2}a_n(f) \le \|f\|_q \le C_1(q)\sum_{n=1}^{\infty}n^{-1/2}a_n(f). \tag{3.2.25} \]
This proposition implies the following analog of Lemma 3.2.17.
Lemma 3.2.23. Assume that $\Psi$ is a uniformly bounded quasi-greedy basis $\Psi = \{\psi_j\}_{j=1}^{\infty}$ in $L_q$ and $L_p$, $1<q,p<\infty$. Then for any set $\Lambda$ of indices we have
\[ \|S_\Lambda(f)\|_p \le C\ln(|\Lambda|+1)\,\|f\|_q. \tag{3.2.26} \]
Proof. Let $|\Lambda| = m$. By Proposition 3.2.22,
\[ \|S_\Lambda(f)\|_p \le C_1(p)\sum_{n=1}^{m}n^{-1/2}a_n(f) \le c_1(q)^{-1}C_1(p)\sum_{n=1}^{m}n^{-1}\|f\|_q \le C\ln(|\Lambda|+1)\,\|f\|_q. \]
3.3 Construction of quasi-greedy bases
In this section we describe a general scheme of construction of a quasi-greedy basis out of a given basis with special properties. This scheme is similar to the one used by Wojtaszczyk in [85]. Both schemes are based on Olevskii-type matrices (see [58]).
Assumptions. Let $X$ be a separable Banach space and $\Phi = \{\varphi_j\}_{j=1}^{\infty}$ be a semi-normalized basis of $X$, $0<c_0\le\|\varphi_j\|\le C_0$. We assume that $\Phi$ is a Besselian basis of $X$: for any
\[ f = \sum_{j=1}^{\infty}c_j(f)\varphi_j \tag{3.3.1} \]
we have
\[ \Bigl(\sum_{j=1}^{\infty}|c_j(f)|^2\Bigr)^{1/2} \le C_1\|f\|. \tag{3.3.2} \]
Assume that $\Phi$ can be split into two systems, $F = \{f_s\}_{s=1}^{\infty}$, $f_s = \varphi_{m(s)}$, and $E = \{e_j\}_{j=1}^{\infty}$, $e_j = \varphi_{n(j)}$, with increasing sequences $\{m(s)\}$ and $\{n(j)\}$, in such a way that $E$ has the following special property. For any sequence $\{c_j\}$,
\[ \Bigl\|\sum_{j=1}^{\infty}c_je_j\Bigr\| \le C_2\Bigl(\sum_{j=1}^{\infty}|c_j|^2\Bigr)^{1/2}. \tag{3.3.3} \]
In our construction of quasi-greedy bases we will use special matrices. Let a collection of matrices $\mathcal{A} = \{A(n)\}_{n=1}^\infty$, where $A(n)$ is of size $n \times n$, satisfy the following properties.

M1. The singular numbers of matrices $A(n)$ and their inverses $A(n)^{-1}$ are uniformly bounded:
\[ s_j(A(n)) \le C_3, \qquad s_j(A(n)^{-1}) \le C_3. \tag{3.3.4} \]
M2. The elements of the first column of $A(n) = [a_{ij}(n)]$ obey the estimates
\[ |a_{i1}(n)| \le C_4 n^{-1/2}. \tag{3.3.5} \]
Construction. Let $\{n_k\}_{k=0}^\infty$, $n_0 = 0$, be an increasing sequence of integers such that
\[ n_{k+1} \ge n_k^2. \tag{3.3.6} \]
For a fixed natural number $k$ we pick the basis elements
\[ g_1^k := f_k, \qquad g_i^k := e_{S_{k-1}+i-2}, \quad i = 2, \dots, n_k, \tag{3.3.7} \]
where $\{S_j\}$ is defined recursively as
\[ S_j = S_{j-1} + n_j - 1, \quad j = 1, 2, \dots, \qquad S_0 = 0. \]
We build a new system of elements $\{\psi_i^k\}_{i=1}^{n_k}$ using the matrix $A(n_k)$ as follows:
\[ \big(\psi_1^k, \dots, \psi_{n_k}^k\big)^T = A(n_k)\big(g_1^k, \dots, g_{n_k}^k\big)^T. \tag{3.3.8} \]
In other words, for $i \in [1, n_k]$ we have
\[ \psi_i^k = \sum_{j=1}^{n_k} a_{ij}(n_k)g_j^k. \]
We define and study the new system $\Psi = \{\psi_i^k\}_{i=1,k=1}^{n_k,\infty} = \{\psi_{j(k,i)}\}$ ordered lexicographically: $j(k', i') > j(k, i)$ if either $k' > k$ or $k' = k$ and $i' > i$.

Properties of $\Psi$. We begin with a property of an auxiliary system $G := \{g_i^k\}_{i=1,k=1}^{n_k,\infty} = \{g_{j(k,i)}\}$ ordered lexicographically: $j(k', i') > j(k, i)$ if either $k' > k$ or $k' = k$ and $i' > i$.
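To make conditions M1 and M2 concrete, the following Python sketch builds one family of matrices that satisfies both. It uses Householder reflections, which is an assumption made here for illustration only; the text works with Olevskii-type matrices (see [58]), and the function names below are hypothetical. The reflection $H = I - 2vv^T/\|v\|_2^2$ with $v = e_1 - u$, $u = n^{-1/2}(1, \dots, 1)^T$, is orthogonal (so all singular numbers equal 1) and has $u$ as its first column (so M2 holds with $C_4 = 1$).

```python
import math


def householder_matrix(n):
    """Orthogonal n x n matrix whose first column is (n^{-1/2}, ..., n^{-1/2})."""
    if n == 1:
        return [[1.0]]
    u = [1.0 / math.sqrt(n)] * n                      # desired first column
    v = [(1.0 if i == 0 else 0.0) - u[i] for i in range(n)]  # v = e_1 - u
    vv = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vv
             for j in range(n)] for i in range(n)]


def orthogonality_defect(a):
    """Largest entry of |A A^T - I|; zero (up to rounding) for orthogonal A."""
    n = len(a)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(a[i][t] * a[j][t] for t in range(n))
            worst = max(worst, abs(dot - (1.0 if i == j else 0.0)))
    return worst


def transform_block(a, g):
    """Rows psi_i = sum_j a[i][j] * g[j], with each g[j] given as a list of numbers."""
    n, dim = len(a), len(g[0])
    return [[sum(a[i][j] * g[j][t] for j in range(n)) for t in range(dim)]
            for i in range(n)]


if __name__ == "__main__":
    for n in (2, 4, 16):
        a = householder_matrix(n)
        first_col = max(abs(a[i][0]) for i in range(n))
        print(n, first_col * math.sqrt(n), orthogonality_defect(a))
```

Here `transform_block` mirrors (3.3.8) for a single block, with the elements $g_j^k$ represented by finite coordinate vectors.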
Proposition 3.3.1. The system G is a Besselian basis of X. Proof. It follows from the definition of G that the expansion of f with respect to G will be a rearrangement of the expansion of f with respect to Φ. Therefore, we only need to prove that G is a basis. Then the Besselian property of G follows from the Besselian property of Φ.
Let $f$ have the expansion (3.3.1) with respect to $\Phi$. Consider the series
\[ \sum_{k=1}^{\infty}\sum_{i=1}^{n_k} c_i^kg_i^k, \]
where $c_i^k = c_j(f)$ if $g_i^k = \varphi_j$. A partial sum of this series has the form
\[ \sum_{k=1}^{N} c_1^kg_1^k + \sum_{k=1}^{N}\sum_{i\in P_k} c_i^kg_i^k, \qquad P_k \subseteq [2, n_k]. \tag{3.3.9} \]
We note that in the above representation $P_k = [2, n_k]$ for all $k$ except maybe $k = N$. By our choice of $g_i^k$, we have that $g_i^k \in E$ for all $k$ and $i > 1$. Therefore, for the second sum in (3.3.9) we use (3.3.3) and obtain the bound
\[ \Big\|\sum_{k=1}^{N}\sum_{i\in P_k} c_i^kg_i^k\Big\| \le C_2\Big(\sum_{k=1}^{N}\sum_{i\in P_k} |c_i^k|^2\Big)^{1/2}. \tag{3.3.10} \]
Using the Besselian property of the basis $\Phi$ (3.3.2) we get
\[ \Big(\sum_{k=1}^{N}\sum_{i\in P_k} |c_i^k|^2\Big)^{1/2} \le C_1\|f\|. \]
Let $g_1^N = f_N = \varphi_{m(N)}$. Then for the first sum in (3.3.9) we obtain
\[ \sum_{k=1}^{N} c_1^kg_1^k = \sum_{j=1}^{m(N)} c_j(f)\varphi_j - \sum_{k=1}^{K}\sum_{i\in Q_k} c_i^kg_i^k, \qquad Q_k \subseteq [2, n_k]. \]
The assumption that $\Phi$ is a basis implies that
\[ \Big\|\sum_{j=1}^{m(N)} c_j(f)\varphi_j\Big\| \le C\|f\|. \]
Similarly to the above estimation of the second sum in (3.3.9), we get
\[ \Big\|\sum_{k=1}^{K}\sum_{i\in Q_k} c_i^kg_i^k\Big\| \le C\|f\|. \]
This completes the proof of Proposition 3.3.1. Proposition 3.3.2. The system Ψ is a Besselian basis of X.
Proof. Denote
\[ X_k := \operatorname{span}\big(\psi_1^k, \dots, \psi_{n_k}^k\big) = \operatorname{span}\big(g_1^k, \dots, g_{n_k}^k\big). \]
Let $g \in X_k$. Then
\[ g = \sum_{i=1}^{n_k} v_ig_i^k, \qquad g = \sum_{i=1}^{n_k} u_i\psi_i^k. \]
Using the definition of $\psi_i^k$ in terms of $g_j^k$ we obtain
\[ \sum_{i=1}^{n_k} u_i\psi_i^k = \sum_{i=1}^{n_k} u_i\Big(\sum_{j=1}^{n_k} a_{ij}g_j^k\Big). \]
Therefore,
\[ v_j = \sum_{i=1}^{n_k} a_{ij}u_i, \]
or $v = A(n_k)^Tu$, where $u = (u_1, \dots, u_{n_k})^T$, $v = (v_1, \dots, v_{n_k})^T$. The property M1 of the matrix $A(n_k)$ implies that $\|u\|_2 \le C_3\|v\|_2$. This and Proposition 3.3.1 imply that $\Psi$ is a Besselian system.

It remains to prove that $\Psi$ is a basis of $X$. It is clear that the use of Proposition 3.3.1 allows us to limit our proof to only one subspace $X_k$. In this case, by (3.3.3) and M2 we have, for $n \in [1, n_k]$,
\[ \Big\|\sum_{i=1}^{n} u_i\psi_i^k\Big\| \le \sum_{i=1}^{n} |u_i|C_4n_k^{-1/2}\|g_1^k\| + \Big\|\sum_{i=1}^{n} u_i\sum_{j=2}^{n_k} a_{ij}(n_k)g_j^k\Big\| \le C\|u\|_2 + C\Big(\sum_{j=2}^{n_k}\Big|\sum_{i=1}^{n} a_{ij}(n_k)u_i\Big|^2\Big)^{1/2}. \]
Using our assumption M1 we obtain
\[ \Big(\sum_{j=2}^{n_k}\Big|\sum_{i=1}^{n} a_{ij}(n_k)u_i\Big|^2\Big)^{1/2} \le C_3\|u\|_2. \]
Therefore, applying Proposition 3.3.1 we get
\[ \Big\|\sum_{i=1}^{n} u_i\psi_i^k\Big\| \le C\|u\|_2 \le C\|v\|_2 \le C\|f\|. \]
This completes the proof of Proposition 3.3.2.
Theorem 3.3.3. The basis $\Psi$ is a quasi-greedy basis of $X$.

Proof. Let $f \in X$ have the representation
\[ f = \sum_{k=1}^{\infty}\sum_{i=1}^{n_k} b_i^k\psi_i^k \]
with respect to $\Psi$. Suppose that the $m$-th greedy approximant is given by
\[ G_m(f, \Psi) = \sum_{k\in J}\sum_{i\in I_k} b_i^k\psi_i^k, \qquad I_k \subseteq [1, n_k]. \tag{3.3.11} \]
We will prove that
\[ \|G_m(f, \Psi)\| \le C\|f\|. \tag{3.3.12} \]
It is clear that it suffices to prove (3.3.12) for normalized $f$, $\|f\| = 1$. At the first step we consider the following modification of the sum from (3.3.11):
\[ \Sigma_1 := \sum_{k\in J}\sum_{i\in I_k} b_i^k\big(\psi_i^k - a_{i1}(n_k)f_k\big). \]
It follows from the definition of $\psi_i^k$ that
\[ \Sigma_1 = \sum_{k\in J}\sum_{i\in I_k} b_i^k\sum_{j=2}^{n_k} a_{ij}(n_k)g_j^k = \sum_{k\in J}\sum_{j=2}^{n_k}\Big(\sum_{i\in I_k} b_i^ka_{ij}(n_k)\Big)g_j^k. \]
By (3.3.3) we get
\[ \|\Sigma_1\| \le C\Big(\sum_{k\in J}\sum_{j=2}^{n_k}\Big|\sum_{i\in I_k} b_i^ka_{ij}(n_k)\Big|^2\Big)^{1/2}. \tag{3.3.13} \]
Using property M1 and Proposition 3.3.2 we obtain from (3.3.13) that
\[ \|\Sigma_1\| \le C\Big(\sum_{k\in J}\sum_{i\in I_k} |b_i^k|^2\Big)^{1/2} \le C. \]
At the second step we consider
\[ \Sigma_2 := G_m(f, \Psi) - \Sigma_1 = \sum_{k\in J}\sum_{i\in I_k} b_i^ka_{i1}f_k. \]
We split each of $I_k$ into three disjoint subsets:
\[ I_k^1 := \{i \in I_k : |b_i^k| \le n_k^{-1}\}, \qquad I_k^2 := \{i \in I_k : |b_i^k| \ge n_k^{-1/2}\}, \qquad I_k^3 := \{i \in I_k : n_k^{-1} < |b_i^k| < n_k^{-1/2}\}. \]
Denote
\[ \Sigma_2^s := \sum_{k\in J}\sum_{i\in I_k^s} b_i^ka_{i1}f_k, \qquad s = 1, 2, 3. \]
For $\Sigma_2^1$ we have
\[ \Sigma_2^1 := \sum_{k\in J}\sum_{i\in I_k^1} b_i^ka_{i1}f_k = \sum_{k\in J} f_k\sum_{i\in I_k^1} b_i^ka_{i1}. \]
It follows from the definition of $I_k^1$ and from property M2 that
\[ \Big|\sum_{i\in I_k^1} b_i^ka_{i1}\Big| \le C_4n_k^{-1/2}. \]
Therefore,
\[ \|\Sigma_2^1\| \le C\sum_{k\in J} n_k^{-1/2} \le C. \]
We proceed to estimate $\Sigma_2^2$. We have
\[ |I_k^2|n_k^{-1} \le \sum_{i=1}^{n_k} |b_i^k|^2 \]
and
\[ \|\Sigma_2^2\| \le C_4C_0\sum_{k\in J} n_k^{-1/2}|I_k^2|^{1/2}\Big(\sum_{i=1}^{n_k} |b_i^k|^2\Big)^{1/2} \le C_4C_0\sum_{k\in J}\sum_{i=1}^{n_k} |b_i^k|^2 \le C. \]
Next we turn to $\Sigma_2^3$. We note that the bound on $\Sigma_1$ combined with Proposition 3.3.2 imply that, for any $N$,
\[ \Big\|\sum_{k=1}^{N}\sum_{i=1}^{n_k} b_i^kf_ka_{i1}(n_k)\Big\| \le C. \tag{3.3.14} \]
Denote
\[ K := \max\{k \in J : I_k^3 \ne \emptyset\}. \]
This means that there is a $b_i^K$, $i \in I_K^3$, such that $|b_i^K| < n_K^{-1/2}$. The fact that $K \in J$, our assumption that $n_{k+1} \ge n_k^2$, and the definition of greedy approximant imply that for all $k \in [1, K]$ we have that either $I_k^3$ is empty or $k \in J$. Thus
\[ \Sigma_2^3 = \sum_{i\in I_K^3} b_i^Kf_Ka_{i1}(n_K) + \sum_{k=1}^{K-1}\sum_{i=1}^{n_k} b_i^kf_ka_{i1}(n_k) - \sigma_1 - \sigma_2, \tag{3.3.15} \]
where $\sigma_1$ has the form of $\Sigma_2^1$ and $\sigma_2$ has the form of $\Sigma_2^2$. Therefore, it is sufficient to bound only the first term in the right-hand side of (3.3.15). We have
\[ \Big\|\sum_{i\in I_K^3} b_i^Kf_Ka_{i1}(n_K)\Big\| \le Cn_K^{-1/2}n_K^{1/2}\Big(\sum_{i=1}^{n_K} |b_i^K|^2\Big)^{1/2} \le C. \]
This completes the proof of Theorem 3.3.3.
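The bound $\sum_{k\in J} n_k^{-1/2} \le C$ used for $\Sigma_2^1$ relies on the fast growth (3.3.6) of the block sizes. The following Python sketch generates one admissible sequence and evaluates the offsets $S_j$ and the sum of $n_k^{-1/2}$. The starting value $n_1 = 2$ and the rule $n_{k+1} = \max(n_k^2, n_k + 1)$ are assumptions chosen for illustration; any increasing sequence with $n_{k+1} \ge n_k^2$ would do.

```python
import math


def block_sizes(n1, count):
    """First `count` block sizes n_1 < n_2 < ... with n_{k+1} >= n_k**2."""
    sizes = [n1]
    while len(sizes) < count:
        prev = sizes[-1]
        sizes.append(max(prev * prev, prev + 1))  # second term keeps it increasing
    return sizes


def offsets(sizes):
    """S_0 = 0 and S_j = S_{j-1} + n_j - 1, as in the recursion after (3.3.7)."""
    s = [0]
    for n in sizes:
        s.append(s[-1] + n - 1)
    return s


def inverse_sqrt_sum(sizes):
    """Partial sum of n_k^{-1/2} over the given block sizes."""
    return sum(1.0 / math.sqrt(n) for n in sizes)


if __name__ == "__main__":
    sizes = block_sizes(2, 5)
    print(sizes)
    print(offsets(sizes))
    print(inverse_sqrt_sum(sizes))
```

With $n_1 = 2$ the sizes are $2, 4, 16, 256, \dots$, so the terms $n_k^{-1/2}$ decay faster than a geometric sequence and the partial sums stay bounded.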
Extra assumptions. First of all we note that if $H$ is a Hilbert space and $\Phi \subset H$ is an orthonormal basis in $H$, then $G$ also is an orthonormal basis in $H$. Second, if the matrices $A(n)$ are orthogonal then $\Psi$ is an orthonormal basis of $H$. Next, assume that $Y$ is a subspace of $X$ with a stronger norm: $\|f\|_X \le \|f\|_Y$. Assume that the basis $\Phi$ is from $Y$ and $\|\varphi_j\|_Y \le B$, $j = 1, 2, \dots$. We also impose an extra assumption on matrices.

M3. For all $n$,
\[ \sum_{j=1}^{n} |a_{ij}(n)| \le C_5, \qquad i = 1, \dots, n. \tag{3.3.16} \]
Under condition M3 we easily derive from the definition of $\Psi$ that $\|\psi_i^k\|_Y \le C_5B$.

Examples. Let $X = L_p(0, 2\pi)$, $2 < p < \infty$, $Y = L_\infty(0, 2\pi)$. Consider $\Phi = \mathcal{T}$ to be the trigonometric system $\{e^{ikx}\}$. Define $E := \{e^{i2^jx}\}_{j=1}^\infty$. It is well known that (3.3.3) holds for this system. By the Riesz theorem, $\mathcal{T}$ is a basis of $L_p$, $1 < p < \infty$. Trivially, $\mathcal{T}$ has the Besselian property in $L_p$, $2 < p < \infty$. Thus applying the above construction we obtain the following theorem.

Theorem 3.3.4. There exists a uniformly bounded quasi-greedy basis $\Psi = \{\psi_j\}_{j=1}^\infty$ in $L_p$, $2 < p < \infty$, that consists of trigonometric polynomials.

Moreover, as pointed out above, if the matrices $A(n)$ are orthogonal, then $\Psi$ is an orthonormal basis of $H$. Thus, we have the following variant of Theorem 3.3.4.

Theorem 3.3.5. There exists a uniformly bounded orthonormal quasi-greedy basis $\Psi = \{\psi_j\}_{j=1}^\infty$ in $L_p$, $2 < p < \infty$.
3.4 Uniformly bounded quasi-greedy systems

Combining Theorem 3.2.20 and Theorem 3.3.5 with the preceding discussion gives the following result.

Theorem 3.4.1. There exists a uniformly bounded orthonormal system $\Psi = \{\psi_j\}_{j=1}^\infty$ consisting of trigonometric polynomials which is a quasi-greedy basis for $L_p[0, 1]$ for all $1 < p < \infty$.

The main result of this section is that there is no analogue of Theorem 3.4.1 for $L_1[0, 1]$. It is known that $L_1[0, 1]$ has a quasi-greedy basis [17, Theorem 7.1] and, by a theorem of Szarek [64], that $L_1[0, 1]$ does not admit any uniformly integrable Schauder basis. On the other hand, the trigonometric system is a uniformly bounded Markushevich basis. Therefore, it is natural to ask whether $L_1[0, 1]$ admits a uniformly bounded (or uniformly integrable) quasi-greedy Markushevich basis $\Psi$. We answer this question negatively.

First, we recall the relevant definitions. Let $X$ be a separable Banach space. Let $\Psi = \{\psi_j\}_{j=1}^\infty \subset X$ be a fundamental and semi-normalized system, i.e., there exist positive constants $a$ and $b$ such that
\[ a \le \|\psi_j\| \le b \quad (j \ge 1), \tag{3.4.1} \]
with a biorthogonal sequence $\{\psi_j^*\}_{j=1}^\infty \subset X^*$. Then $\Psi$ is said to be a Markushevich basis if the mapping $f \mapsto \{\psi_j^*(f)\}_{j=1}^\infty$ $(f \in X)$ is one-one. In other words, each $f \in X$ is uniquely determined by its coefficient sequence $\{\psi_j^*(f)\}_{j=1}^\infty$. We say that $\Psi$ is quasi-greedy if there exists a constant $C$ such that
\[ \|G_m(f, \Psi)\| \le C\|f\| \quad (m \ge 1,\ f \in X). \tag{3.4.2} \]
Wojtaszczyk [85] proved that (3.4.2) is equivalent to the norm convergence of $\{G_m(f)\}$ to $f$ for all $f \in X$. It follows easily from (3.4.1) and (3.4.2) that $\{\psi_j^*\}_{j=1}^\infty$ is semi-normalized in $X^*$. Indeed, for $f \in X$ we have $|\psi_j^*(f)| \le |a_1(f)| \le (1/a)\|G_1(f)\| \le (C/a)\|f\|$, and hence $\|\psi_j^*\| \le C/a$. On the other hand, since $\psi_j^*(\psi_j) = 1$, we also have $\|\psi_j^*\| \ge 1/\|\psi_j\| \ge 1/b$.

The following result was proved for quasi-greedy bases (actually for the larger class of thresholding-bounded bases) in [17, Lemma 8.2]. The proof easily carries over to quasi-greedy Markushevich bases (cf. also the proof of Lemma 3.2.6 above).

Proposition 3.4.2. Suppose that $\Psi$ is a semi-normalized quasi-greedy Markushevich basis for $X$. There exists a constant $C$ such that, for all finite sets $\Lambda \subset \mathbb{N}$ with $|\Lambda| = N \ge 2$,
\[ \max_{\pm}\Big\|\sum_{n\in\Lambda} \pm\psi_n^*(f)\psi_n\Big\| \le C\ln N\,\|f\| \quad (f \in X). \]
In particular,
\[ \|S_\Lambda(f)\| \le C\ln N\,\|f\| \quad (f \in X). \]
Recall that a bounded operator $T : X \to Y$, where $X$ and $Y$ are Banach spaces, is absolutely summing if there exists a constant $C$ such that, for all $n \ge 1$ and for all finite sequences $\{f_j\}_{j=1}^n \subset X$, we have
\[ \sum_{j=1}^{n} \|T(f_j)\| \le C\max_{\pm}\Big\|\sum_{j=1}^{n} \pm f_j\Big\|. \]
The smallest such constant is denoted $\pi_1(T)$. A Banach space $X$ is called a GT (Grothendieck Theorem) space [61] if every bounded operator $T : X \to \ell_2$ is absolutely summing. Thus $X$ is a GT space if and only if there exists a constant $B$ such that $\pi_1(T) \le B\|T\|$ for all bounded $T : X \to \ell_2$. Grothendieck [31] proved that $L_1(\mu)$ spaces are GT spaces. The proof of the following result is based on the methods used in [17, Section 8].

Theorem 3.4.3. Suppose that $X$ is a GT space. Let $\Psi$ be a semi-normalized quasi-greedy Markushevich basis for $X$. Then $\Psi$ is democratic and its fundamental function satisfies $\varphi(n) \asymp n$.

Proof. For $1 \le p \le \infty$, recall that a Markushevich basis $\Psi$ is said to be $p$-Besselian if there exists a constant $C_p$ such that
\[ \Big(\sum_{n=1}^{\infty} |\psi_n^*(f)|^p\Big)^{1/p} \le C_p\|f\| \quad (f \in X), \]
with the obvious modification for $p = \infty$. Since $\Psi$ is quasi-greedy, we have $C_\infty = \sup_{n\ge1}\|\psi_n^*\| < \infty$, so $\Psi$ is $\infty$-Besselian. We will derive Theorem 3.4.3 from the following Theorem 3.4.4.

Theorem 3.4.4. Suppose that $\Psi$ is a semi-normalized quasi-greedy Markushevich basis for a GT space $X$. Then $\Psi$ is $r$-Besselian for all $r > 1$.

We need the following key lemma.

Lemma 3.4.5. Suppose that $\Psi$ is a semi-normalized quasi-greedy Markushevich basis for a GT space $X$. If $\Psi$ is $p$-Besselian for some $2 \le p \le \infty$, then $\Psi$ is $r$-Besselian for all $r$ satisfying $1/r < 1/p + 1/2$.

Proof. We shall give the proof for the case $2 < p < \infty$, as the case $p = \infty$ requires only minor changes. Let $1/s = 1/p + 1/2$. Suppose that $\Lambda \subset \mathbb{N}$, with $|\Lambda| = N$, and that $(\eta_n)_{n\in\Lambda}$ is any fixed choice of signs. Choose $f \in X$ with $\|f\| = 1$ such that
\[ \sum_{n\in\Lambda} \eta_n\psi_n^*(f) \ge \frac12\Big\|\sum_{n\in\Lambda} \eta_n\psi_n^*\Big\|. \]
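Next consider $T : X \to \ell_2(\Lambda)$, defined as follows:
\[ T(g) = \big(\psi_n^*(g)|\psi_n^*(f)|^{s-1}\big)_{n\in\Lambda} \quad (g \in X). \]
Then, applying Hölder's inequality and using the fact that $\Psi$ is $p$-Besselian, we obtain
\[ \|T(g)\| = \Big(\sum_{n\in\Lambda} |\psi_n^*(f)|^{2s-2}|\psi_n^*(g)|^2\Big)^{1/2} \le \Big(\sum_{n\in\Lambda} |\psi_n^*(f)|^s\Big)^{1-1/s}\Big(\sum_{n\in\Lambda} |\psi_n^*(g)|^p\Big)^{1/p} \le C_p\Big(\sum_{n\in\Lambda} |\psi_n^*(f)|^s\Big)^{1-1/s}\|g\|. \]
Hence, $\|T\| \le C_p\big(\sum_{n\in\Lambda} |\psi_n^*(f)|^s\big)^{1-1/s}$. Since $X$ is a GT space, we have
\[ \sum_{n\in\Lambda} |\psi_n^*(f)|^s = \sum_{n\in\Lambda} \|\psi_n^*(f)T(\psi_n)\| \le B\|T\|\sup_{\varepsilon_n=\pm1}\Big\|\sum_{n\in\Lambda} \varepsilon_n\psi_n^*(f)\psi_n\Big\| \le BC_p\Big(\sum_{n\in\Lambda} |\psi_n^*(f)|^s\Big)^{1-1/s}\sup_{\varepsilon_n=\pm1}\Big\|\sum_{n\in\Lambda} \varepsilon_n\psi_n^*(f)\psi_n\Big\|. \]
Thus,
\[ \Big(\sum_{n\in\Lambda} |\psi_n^*(f)|^s\Big)^{1/s} \le BC_p\sup_{\varepsilon_n=\pm1}\Big\|\sum_{n\in\Lambda} \varepsilon_n\psi_n^*(f)\psi_n\Big\|. \]
Since $|\Lambda| = N$, Proposition 3.4.2 yields
\[ \sup_{\varepsilon_n=\pm1}\Big\|\sum_{n\in\Lambda} \varepsilon_n\psi_n^*(f)\psi_n\Big\| \le C'(\ln N), \]
where $C'$ is independent of $N$. Hence
\[ \Big(\sum_{n\in\Lambda} |\psi_n^*(f)|^s\Big)^{1/s} \le BC'C_p(\ln N). \]
Thus,
\[ \Big\|\sum_{n\in\Lambda} \eta_n\psi_n^*\Big\| \le 2\sum_{n\in\Lambda} \eta_n\psi_n^*(f) \le 2\Big(\sum_{n\in\Lambda} |\psi_n^*(f)|^s\Big)^{1/s}N^{1-1/s} \le 2BC'C_p(\ln N)N^{1-1/s}. \]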
3.4. Uniformly bounded quasi-greedy systems
Now suppose that g ∈ X with g = 1. For a > 0, let Λ(a) = {n ∈ N : |ψn∗ (g)| ≥ a} and N (a) = |Λ(a)|. Then, for some choice of signs (ηn ), we have aN (a) ≤ ηn ψn∗ (g) n∈Λ(a)
∗ ≤ ηn ψn n∈Λ(a)
≤ BC Cp (ln N (a))N (a)1−1/s . Thus, for some constant C , we have N (a) ≤ C a−t provided t satisfies 1 1 1 < < . r t s Note that
sup |ψn∗ (g)| ≤ sup ψn∗ ∞ = C∞ .
n≥1
n≥1
Hence, ∞ ∞ ∗ r ψ (g) ≤ N (2−n C∞ )(21−n C∞ )r n n=1
n=0
≤ 2r C
∞ −n
r−t 2 C∞ < ∞, n=0
and so Ψ is r-Besselian.
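Lemma 3.4.5 can be iterated, since each application enlarges the range of admissible exponents. The following Python sketch tracks only this bookkeeping: it records the supremum of $1/r$ over the exponents $r$ known to be admissible after each application, starting from $p = \infty$. The function names are illustrative, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def next_threshold(inv_p_sup):
    """One application of the lemma.

    inv_p_sup is the supremum of 1/p over exponents p for which the system is
    already known to be p-Besselian. The lemma requires 2 <= p <= infinity, so
    1/p is capped at 1/2. The return value t means: r-Besselian for 1/r < t.
    """
    return min(inv_p_sup, Fraction(1, 2)) + Fraction(1, 2)


def thresholds(steps):
    """Thresholds after 0, 1, ..., steps applications, starting from p = infinity."""
    t = Fraction(0)  # 1/p = 0 corresponds to p = infinity
    out = [t]
    for _ in range(steps):
        t = next_threshold(t)
        out.append(t)
    return out


if __name__ == "__main__":
    for j, t in enumerate(thresholds(3)):
        print(j, t)
```

The thresholds are $0, 1/2, 1, 1$: one application gives all $r > 2$, a second application (with $p > 2$ arbitrarily close to 2) gives all $r > 1$, and further applications give nothing new.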
Applying the lemma twice, starting with $p = \infty$, it follows that $\Psi$ is $r$-Besselian for all $r > 1$. This proves Theorem 3.4.4.

In particular, $\Psi$ is 2-Besselian with constant $C_2 < \infty$. Hence, for every finite $\Lambda \subset \mathbb{N}$, the mapping $T : X \to \ell_2(\Lambda)$ given by $f \mapsto (\psi_n^*(f))_{n\in\Lambda}$ satisfies $\|T\| \le C_2$. Since $X$ is a GT space, the absolutely summing norm of $T$ satisfies $\pi_1(T) \le BC_2$. Thus,
\[ |\Lambda| = \sum_{n\in\Lambda} \|T(\psi_n)\|_2 \le BC_2\max_{\pm}\Big\|\sum_{n\in\Lambda} \pm\psi_n\Big\|. \]
Since $\Psi$ is quasi-greedy, and hence unconditional for constant coefficients, it follows that $\varphi(n) \asymp n$.

The following Proposition 3.4.6 is a stronger version of Proposition 3.4.2 under the extra assumption that $X$ is a GT space.
Proposition 3.4.6. Suppose that $\Psi$ is a semi-normalized quasi-greedy Markushevich basis for a GT space $X$. There exists a constant $C$ such that, for all finite sets $\Lambda \subset \mathbb{N}$ with $|\Lambda| = N \ge 2$, we have
\[ \sum_{n\in\Lambda} |\psi_n^*(g)| \le C\ln N\,\|g\| \quad (g \in X). \]
Proof. Let $\xi_n = \pm1$ be such that
\[ \sum_{n\in\Lambda} |\psi_n^*(g)| = \sum_{n\in\Lambda} \xi_n\psi_n^*(g). \]
Then
\[ \sum_{n\in\Lambda} |\psi_n^*(g)| \le \Big\|\sum_{n\in\Lambda} \xi_n\psi_n^*\Big\|\,\|g\|. \tag{3.4.3} \]
We now estimate $\big\|\sum_{n\in\Lambda} \xi_n\psi_n^*\big\|$. Let as above $f$ be such that $\|f\| = 1$ and
\[ \sum_{n\in\Lambda} \xi_n\psi_n^*(f) \ge \frac12\Big\|\sum_{n\in\Lambda} \xi_n\psi_n^*\Big\|. \tag{3.4.4} \]
Consider the operator $T : X \to \ell_2(\Lambda)$ given by
\[ T(\varphi) := \big(\psi_n^*(\varphi)\big)_{n\in\Lambda}. \]
By Theorem 3.4.4, $\Psi$ is 2-Besselian and therefore
\[ \|T(\varphi)\| = \Big(\sum_{n\in\Lambda} |\psi_n^*(\varphi)|^2\Big)^{1/2} \le C_2\|\varphi\|. \]
Using the assumption that $X$ is a GT space, we obtain
\[ \sum_{n\in\Lambda} |\psi_n^*(f)| = \sum_{n\in\Lambda} \|\psi_n^*(f)T(\psi_n)\| \le BC_2\sup_{\varepsilon_n=\pm1}\Big\|\sum_{n\in\Lambda} \varepsilon_n\psi_n^*(f)\psi_n\Big\| \le C\ln N. \tag{3.4.5} \]
We used Proposition 3.4.2 at the last inequality. Combining (3.4.3)–(3.4.5) we complete the proof.

We note that the following result, which is stronger than Proposition 3.4.6, follows from Lemma 3.2.4 and Theorem 3.4.3.

Proposition 3.4.7. Suppose that $\Psi$ is a semi-normalized quasi-greedy Markushevich basis for a GT space $X$. There exists a constant $C$ such that for all $g \in X$ we have
\[ a_n(g) \le Cn^{-1}\|g\|. \]
Recall that a system $\{f_j\} \subset L_1[0, 1]$ is uniformly integrable if, given $\varepsilon > 0$, there exists $\delta > 0$ such that if $\lambda(A) < \delta$, where $\lambda$ denotes the Lebesgue measure, then $\int_A |f_j|\,d\lambda < \varepsilon$ for all $j \ge 1$. Clearly, uniformly bounded systems are uniformly integrable.

Theorem 3.4.8. Let $\Psi$ be a semi-normalized quasi-greedy Markushevich basis for $L_1[0, 1]$. Then no subsequence of $\Psi$ is uniformly integrable. Hence, every subsequence of $\Psi$ contains a further subsequence equivalent to the unit vector basis of $\ell_1$.

Proof. Let $\{f_j\} \subset L_1[0, 1]$ be any uniformly integrable system. Given $\varepsilon > 0$, choose $M > 0$ such that $\|f_j\chi_{\{|f_j|>M\}}\|_1 < \varepsilon$ for all $j$. Then
\[ \mathrm{Ave}_{\pm}\Big\|\sum_{j=1}^{n} \pm f_j\Big\|_1 \le n\varepsilon + \mathrm{Ave}_{\pm}\Big\|\sum_{j=1}^{n} \pm f_j\chi_{\{|f_j|\le M\}}\Big\|_2 \le n\varepsilon + M\sqrt{n}. \]
Hence, $\mathrm{Ave}_{\pm}\big\|\sum_{j=1}^{n} \pm f_j\big\|_1 = o(n)$. Since $L_1[0, 1]$ is a GT space, Theorem 3.4.3 implies that $\{f_j\}$ is not a subsequence of any quasi-greedy Markushevich basis. Finally, it is well known that semi-normalized sequences in $L_1[0, 1]$ are either uniformly integrable, or contain a subsequence equivalent to the unit vector basis of $\ell_1$.

Remark 3.4.9. Complemented subspaces of $L_1$ spaces are GT spaces. Hence the previous theorem extends to quasi-greedy Markushevich bases of complemented (infinite-dimensional) subspaces of $L_1[0, 1]$. A related result of Popov [62] asserts that complemented subspaces of $L_1[0, 1]$ do not admit any uniformly integrable Schauder basis.

Next we consider the Hardy spaces $H_p(\mathbb{D})$ $(1 \le p < \infty)$ of analytic functions on the disk $\mathbb{D} := \{z \in \mathbb{C} : |z| < 1\}$, equipped with the norm
\[ \|f\|_p = \sup_{0<r<1}\Big(\frac{1}{2\pi}\int_0^{2\pi} |f(re^{i\theta})|^p\,d\theta\Big)^{1/p}. \]
Using the system $\{z^n\}_{n=0}^\infty$ instead of $\mathcal{T}$ in the proof of Theorem 3.3.4 yields the following result.

Theorem 3.4.10. There exists an orthonormal system of uniformly bounded analytic polynomials which is a quasi-greedy basis for $H_p(\mathbb{D})$ for $1 < p < \infty$.

Using some deep results from Banach space theory we can extend the latter result also to the case $p = 1$ (see [21]).

Theorem 3.4.11. $H_1(\mathbb{D})$ admits a semi-normalized uniformly bounded quasi-greedy basis of analytic polynomials.
3.5 Lebesgue-type inequalities for quasi-greedy bases

As above, we will use the notation $a_n(f) := |c_{k_n}(f)|$ for the decreasing rearrangement of the coefficients of $f$. For a set of indices $\Lambda$, we define the corresponding partial sum as
\[ S_\Lambda(f) := \sum_{k\in\Lambda} c_k(f)\psi_k. \]
The presentation of this section is based on the paper [84]. In this section we often use the following assumption: There exists an increasing function $v(m) := v(m, \Psi)$ such that for any two sets of indices $A$ and $B$ with $|A| = |B| = m$, we have
\[ \Big\|\sum_{k\in A} \psi_k\Big\| \le v(m)\Big\|\sum_{k\in B} \psi_k\Big\|. \tag{3.5.1} \]
We begin with a theorem for a Banach space $X$. Later on we will specify this theorem for $L_p$ spaces.

Theorem 3.5.1. Let $\Psi$ be a quasi-greedy basis of $X$ satisfying assumption (3.5.1) with the following property: For any set of indices $\Lambda$,
\[ \|S_\Lambda(f)\| \le w(|\Lambda|)\|f\|, \]
where $w(m)$ is a non-decreasing function. Then, for each $f \in X$,
\[ \|f - G_m(f)\| \le \big(1 + 2w(m) + (2K)^3v(m)w(m)\big)\sigma_m(f). \]
Proof. Let, for a given $\varepsilon > 0$, a polynomial
\[ p_m(f) = \sum_{k\in P} b_k\psi_k, \qquad |P| = m, \]
satisfy the inequality
\[ \|f - p_m(f)\| \le \sigma_m(f) + \varepsilon. \tag{3.5.2} \]
Denote by $Q$ the set of indices picked by the greedy algorithm after $m$ iterations:
\[ G_m(f) = \sum_{k\in Q} c_k(f)\psi_k. \]
We use the representation
\[ f - G_m(f) = f - S_Q(f) = f - S_P(f) + S_P(f) - S_Q(f). \tag{3.5.3} \]
First, we bound
\[ \|f - S_P(f)\| = \|f - p_m(f) - S_P(f - p_m(f))\| \le \big(1 + w(m)\big)\|f - p_m(f)\|. \tag{3.5.4} \]
Second, we write
\[ \|S_P(f) - S_Q(f)\| = \|S_{P\setminus Q}(f) - S_{Q\setminus P}(f)\| \le \|S_{P\setminus Q}(f)\| + \|S_{Q\setminus P}(f)\|. \tag{3.5.5} \]
We begin by estimating the second term in the right-hand side of (3.5.5):
\[ \|S_{Q\setminus P}(f)\| = \|S_{Q\setminus P}(f - p_m(f))\| \le w(m)\|f - p_m(f)\|. \tag{3.5.6} \]
For the first term we have, by Lemma 3.2.2,
\[ \|S_{P\setminus Q}(f)\| \le 2K\max_{k\in P\setminus Q} |c_k(f)|\,\Big\|\sum_{k\in P\setminus Q} \psi_k\Big\| \le 2K\min_{k\in Q\setminus P} |c_k(f)|\,v(m)\Big\|\sum_{k\in Q\setminus P} \psi_k\Big\| \le (2K)^3v(m)\|S_{Q\setminus P}(f)\|. \tag{3.5.7} \]
Combining (3.5.2)–(3.5.7) we obtain
\[ \|f - G_m(f)\| \le \big(1 + 2w(m) + (2K)^3v(m)w(m)\big)\sigma_m(f). \]
We prove a Lebesgue-type inequality in terms of expansional best $m$-term approximation of $f$ with regard to $\Psi$ (see Section 3.2).

Theorem 3.5.2. Let $\Psi$ be a quasi-greedy basis of $X$ satisfying assumption (3.5.1). Then, for each $f \in X$,
\[ \|f - G_m(f)\| \le C(\Psi, X)v(m)\tilde\sigma_m(f). \]
Proof. Let, for a given $\varepsilon > 0$, a set of indices $B$ be such that $|B| = m$ and
\[ \|f - S_B(f)\| \le \tilde\sigma_m(f) + \varepsilon. \tag{3.5.8} \]
Let, as above,
\[ G_m(f) = \sum_{k\in Q} c_k(f)\psi_k. \]
Then
\[ \|f - G_m(f)\| \le \|f - S_B(f)\| + \|S_{B\setminus Q}(f)\| + \|S_{Q\setminus B}(f)\|. \tag{3.5.9} \]
Our assumption that $\Psi$ is quasi-greedy gives
\[ \|S_{Q\setminus B}(f)\| = \|S_{Q\setminus B}(f - S_B(f))\| = \|G_{|Q\setminus B|}(f - S_B(f))\| \le K\|f - S_B(f)\|. \tag{3.5.10} \]
Combining (3.5.8)–(3.5.10) and using (3.5.7) we obtain
\[ \|f - G_m(f)\| \le \big(1 + K + 8K^4v(m)\big)\tilde\sigma_m(f). \]
We now proceed to a discussion of quasi-greedy bases in $L_p$ spaces. We use the brief notation $\|\cdot\|_p := \|\cdot\|_{L_p}$. The following theorem is a corollary of the above Theorem 3.2.10.

Theorem 3.5.3. Let $\Psi$ be a quasi-greedy basis of the $L_p$ space, where $1 < p < 2$ or $2 < p < \infty$. Then, for any set of indices $\Lambda$,
\[ \|S_\Lambda(f)\|_p \le C(p)|\Lambda|^{h(p)}\|f\|_p, \qquad h(p) := |1/p - 1/2|. \]
Proof. Let $m := |\Lambda|$. Using Theorem 3.2.10, we get, for $1 < p < 2$,
\[ \begin{aligned} \|S_\Lambda(f)\|_p &\le C_4(p)\sum_{n=1}^{m} n^{1/p-1}a_n(S_\Lambda(f)) = C_4(p)\sum_{n=1}^{m} n^{1/p-3/2}n^{1/2}a_n(S_\Lambda(f)) \\ &\le C_4(p)\sum_{n=1}^{m} n^{1/p-3/2}n^{1/2}a_n(f) \le C_5(p)m^{1/p-1/2}\sup_n n^{1/2}a_n(f) \\ &\le C_5(p)C_3(p)^{-1}m^{1/p-1/2}\|f\|_p. \end{aligned} \]
Using again Theorem 3.2.10, we obtain, for $2 < p < \infty$,
\[ \begin{aligned} \|S_\Lambda(f)\|_p &\le C_2(p)\sum_{n=1}^{m} n^{-1/2}a_n(S_\Lambda(f)) = C_2(p)\sum_{n=1}^{m} n^{-1/2-1/p}n^{1/p}a_n(S_\Lambda(f)) \\ &\le C_2(p)\sum_{n=1}^{m} n^{-1/2-1/p}n^{1/p}a_n(f) \le C_6(p)m^{1/2-1/p}\sup_n n^{1/p}a_n(f) \\ &\le C_6(p)C_1(p)^{-1}m^{1/2-1/p}\|f\|_p. \end{aligned} \]
It is pointed out in [83] that Theorem 3.2.10 implies the following inequality for a quasi-greedy basis $\Psi$ of $L_p$:
\[ v(m, \Psi) \le C(p)m^{h(p)}, \qquad 1 < p < \infty. \tag{3.5.11} \]
Using inequality (3.5.11) in Theorem 3.5.2 we obtain the following result.

Theorem 3.5.4. Let $\Psi$ be a quasi-greedy basis of $L_p$, $1 < p < \infty$. Then, for each $f \in L_p$,
\[ \|f - G_m(f)\|_p \le C(\Psi, p)m^{h(p)}\tilde\sigma_m(f)_p, \qquad h(p) := |1/2 - 1/p|. \]
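To get a feeling for the size of the extra factor $m^{h(p)}$ in Theorem 3.5.4, the following Python sketch evaluates $h(p)$ and $m^{h(p)}$ for a few values. The constant $C(\Psi, p)$ is not modeled, and the helper names are illustrative only.

```python
def h(p):
    """Exponent h(p) = |1/2 - 1/p| for 1 < p < infinity."""
    if p <= 1:
        raise ValueError("p must satisfy 1 < p < infinity")
    return abs(0.5 - 1.0 / p)


def conjugate(p):
    """Dual exponent p' = p / (p - 1)."""
    return p / (p - 1.0)


def growth_factor(m, p):
    """The factor m**h(p) in front of the best m-term error (constant omitted)."""
    return m ** h(p)


if __name__ == "__main__":
    for p in (1.5, 2.0, 4.0, 10.0):
        print(p, h(p), h(conjugate(p)), growth_factor(1000, p))
```

The exponent vanishes at $p = 2$ and is symmetric under $p \mapsto p'$, because $1/p' = 1 - 1/p$; it never exceeds $1/2$.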
Theorem 3.5.5. Let $1 < p < \infty$, $p \ne 2$, and let $\Psi$ be a quasi-greedy basis of $L_p$. Then for each $f \in L_p$ we have
\[ \|f - G_m(f, \Psi)\|_{L_p} \le C(p, \Psi)m^{|1/2-1/p|}\sigma_m(f, \Psi)_{L_p}. \tag{3.5.12} \]
Proof. The first part of the proof goes along the lines of the proof of Theorem 3.5.1. We use the notation from that proof. By Theorem 3.5.3,
\[ w(m) \le C(p)m^{h(p)}. \tag{3.5.13} \]
Thus (3.5.4) gives
\[ \|f - S_P(f)\|_p \le \big(1 + C(p)m^{h(p)}\big)\|f - p_m(f)\|_p. \tag{3.5.14} \]
Next, using Theorem 3.2.10 and our assumption that $\Psi$ is a quasi-greedy basis of $L_p$ we obtain, for $1 < p < 2$,
\[ \begin{aligned} \|S_{Q\setminus P}(f)\|_p = \|S_{Q\setminus P}(f - p_m(f))\|_p &\le C_4(p)\sum_{n=1}^{m} n^{1/p-1}a_n\big(S_{Q\setminus P}(f - p_m(f))\big) \\ &\le C_4(p)\sum_{n=1}^{m} n^{1/p-1}a_n(f - p_m(f)) \\ &= C_4(p)\sum_{n=1}^{m} n^{1/p-3/2}n^{1/2}a_n(f - p_m(f)) \\ &\le C_7(p)m^{1/p-1/2}\sup_n n^{1/2}a_n(f - p_m(f)) \\ &\le C_8(p)m^{1/p-1/2}\|f - p_m(f)\|_p. \end{aligned} \tag{3.5.15} \]
In the same way we treat the case $2 < p < \infty$:
\[ \begin{aligned} \|S_{Q\setminus P}(f)\|_p = \|S_{Q\setminus P}(f - p_m(f))\|_p &\le C_2(p)\sum_{n=1}^{m} n^{-1/2}a_n\big(S_{Q\setminus P}(f - p_m(f))\big) \\ &\le C_2(p)\sum_{n=1}^{m} n^{-1/2}a_n(f - p_m(f)) \\ &= C_2(p)\sum_{n=1}^{m} n^{-1/2-1/p}n^{1/p}a_n(f - p_m(f)) \\ &\le C_9(p)m^{1/2-1/p}\sup_n n^{1/p}a_n(f - p_m(f)) \\ &\le C_{10}(p)m^{1/2-1/p}\|f - p_m(f)\|_p. \end{aligned} \tag{3.5.16} \]
For $S_{P\setminus Q}(f)$ we have, for $1 < p < 2$,
\[ \begin{aligned} \|S_{P\setminus Q}(f)\|_p &\le C_4(p)\sum_{n=1}^{m} n^{1/p-1}a_n(S_{P\setminus Q}(f)) \\ &\le C_4(p)\sum_{n=1}^{m} n^{1/p-1}a_n(S_{Q\setminus P}(f)) \\ &= C_4(p)\sum_{n=1}^{m} n^{1/p-1}a_n\big(S_{Q\setminus P}(f - p_m(f))\big), \end{aligned} \]
which has been estimated in (3.5.15):
\[ \le C_8(p)m^{1/p-1/2}\|f - p_m(f)\|_p. \tag{3.5.17} \]
In the same way we obtain a bound in the case $2 < p < \infty$:
\[ \begin{aligned} \|S_{P\setminus Q}(f)\|_p &\le C_2(p)\sum_{n=1}^{m} n^{-1/2}a_n(S_{P\setminus Q}(f)) \\ &\le C_2(p)\sum_{n=1}^{m} n^{-1/2}a_n(S_{Q\setminus P}(f)) \\ &= C_2(p)\sum_{n=1}^{m} n^{-1/2}a_n\big(S_{Q\setminus P}(f - p_m(f))\big), \end{aligned} \]
which has been estimated in (3.5.16):
\[ \le C_{10}(p)m^{1/2-1/p}\|f - p_m(f)\|_p. \tag{3.5.18} \]
Combining (3.5.14)–(3.5.18) we complete the proof of Theorem 3.5.5.
3.6 Lebesgue-type inequalities for uniformly bounded quasi-greedy bases

Our main interest in this section is to prove Lebesgue-type inequalities for greedy approximation in $L_p$, $2 \le p \le \infty$, under different assumptions on a basis $\Psi$. In this section we assume that $\Psi$ is a uniformly bounded basis. In addition we assume that $\Psi$ is a basis of a certain type (quasi-greedy basis, Riesz basis) in one of the spaces $L_2$, $L_q$, $1 < q < 2$, or $L_q$, $2 < q < \infty$. We will often use the following lemma.

Lemma 3.6.1. Suppose that $X \subset Y$ are two Banach spaces such that $\|\cdot\|_Y \le \|\cdot\|_X$. Assume that a basis $\Psi$ of $X$ satisfies the following property: For any set of indices $\Lambda$,
\[ \|S_\Lambda(f)\|_X \le w(|\Lambda|)\|f\|_Y. \]
Then for each $f \in X$ and any $m$-term polynomial
\[ p_m = \sum_{k\in P} b_k\psi_k, \qquad |P| = m, \]
we have
\[ \|f - S_P(f)\|_X \le \|f - p_m\|_X + w(m)\|f - p_m\|_Y. \]
Proof. It is a simple one-line proof. We have
\[ \|f - S_P(f)\|_X = \|f - p_m - S_P(f - p_m)\|_X \le \|f - p_m\|_X + w(m)\|f - p_m\|_Y. \]
We now proceed to a systematic presentation of our results.

Theorem 3.6.2. Assume that $\Psi$ is a uniformly bounded Riesz basis of $L_2$. Then for any $m$-term polynomial
\[ t_m = \sum_{k\in P} b_k\psi_k, \qquad |P| = m, \]
we have, for $2 \le p \le \infty$,
\[ \|f - G_m(f, \Psi)\|_p \le \|f - t_m\|_p + Cm^{h(p)}\|f - t_m\|_2. \]
Corollary 3.6.3. Assume that $\Psi$ is a uniformly bounded Riesz basis of $L_2$. Then we have, for $2 \le p \le \infty$,
\[ \|f - G_m(f, \Psi)\|_p \le Cm^{h(p)}\sigma_m(f, \Psi)_p. \]
Proof. Denote by $Q$ the set of indices picked by the greedy algorithm after $m$ iterations,
\[ G_m(f) := G_m(f, \Psi) = \sum_{k\in Q} c_k(f)\psi_k. \]
We use the representation
\[ f - G_m(f) = f - S_Q(f) = f - S_P(f) + S_P(f) - S_Q(f). \tag{3.6.1} \]
First, we bound $\|f - S_P(f)\|_p$. By Lemma 3.6.1 and Lemma 3.2.14,
\[ \|f - S_P(f)\|_p \le \|f - t_m\|_p + Cm^{h(p)}\|f - t_m\|_2. \tag{3.6.2} \]
Second, we write
\[ \|S_P(f) - S_Q(f)\|_p = \|S_{P\setminus Q}(f) - S_{Q\setminus P}(f)\|_p \le \|S_{P\setminus Q}(f)\|_p + \|S_{Q\setminus P}(f)\|_p. \tag{3.6.3} \]
Using Lemma 3.2.14 we obtain
\[ \|S_P(f) - S_Q(f)\|_p \le Cm^{h(p)}\big(\|S_{P\setminus Q}(f)\|_2 + \|S_{Q\setminus P}(f)\|_2\big). \tag{3.6.4} \]
The definition of $Q$ implies that
\[ \|S_{P\setminus Q}(f)\|_2 \le C\Big(\sum_{k\in P\setminus Q} |c_k(f)|^2\Big)^{1/2} \le C\Big(\sum_{k\in Q\setminus P} |c_k(f)|^2\Big)^{1/2} \le C\|S_{Q\setminus P}(f)\|_2. \tag{3.6.5} \]
Next,
\[ \|S_{Q\setminus P}(f)\|_2 = \|S_{Q\setminus P}(f - t_m)\|_2 \le C\|f - t_m\|_2. \tag{3.6.6} \]
Combining (3.6.1)–(3.6.6) we complete the proof of Theorem 3.6.2.
We now impose a slightly weaker assumption on a basis $\Psi$ than the one in Theorem 3.6.2.

Theorem 3.6.4. Assume that $\Psi$ is a uniformly bounded quasi-greedy basis of $L_2$. Then for any $m$-term polynomial
\[ t_m = \sum_{k\in P} b_k\psi_k, \qquad |P| = m, \]
we have, for $2 \le p \le \infty$,
\[ \|f - G_m(f, \Psi)\|_p \le \|f - t_m\|_p + Cm^{h(p)}\ln(m+1)\|f - t_m\|_2. \]
Corollary 3.6.5. Assume that $\Psi$ is a uniformly bounded quasi-greedy basis of $L_2$. Then, for $2 \le p \le \infty$,
\[ \|f - G_m(f, \Psi)\|_p \le Cm^{h(p)}\ln(m+1)\sigma_m(f, \Psi)_p. \]
Proof. This proof goes along the lines of the proof of Theorem 3.6.2. However, the details are different because we need to use properties of quasi-greedy bases instead of properties of Riesz bases. We use notations from the proof of Theorem 3.6.2 and the representation (3.6.1). By Lemma 3.6.1 and Lemma 3.2.15 we get, for $\|f - S_P(f)\|_p$,
\[ \|f - S_P(f)\|_p \le \|f - t_m\|_p + Cm^{h(p)}\ln(m+1)\|f - t_m\|_2. \tag{3.6.7} \]
Using Lemma 3.2.15 we obtain from (3.6.3)
\[ \|S_P(f) - S_Q(f)\|_p \le Cm^{h(p)}\big(\|S_{P\setminus Q}(f)\|_2 + \|S_{Q\setminus P}(f)\|_2\big). \tag{3.6.8} \]
Next, by Theorem 3.2.10,
\[ \begin{aligned} \|S_{Q\setminus P}(f)\|_2 = \|S_{Q\setminus P}(f - t_m)\|_2 &\le C_2(2)\sum_{n=1}^{m} n^{-1/2}a_n\big(S_{Q\setminus P}(f - t_m)\big) \\ &\le C\sum_{n=1}^{m} n^{-1/2}a_n(f - t_m) = C\sum_{n=1}^{m} n^{-1}n^{1/2}a_n(f - t_m) \\ &\le C\ln(m+1)\sup_n n^{1/2}a_n(f - t_m) \le C\ln(m+1)\|f - t_m\|_2. \end{aligned} \tag{3.6.9} \]
For $S_{P\setminus Q}(f)$ we have
\[ \begin{aligned} \|S_{P\setminus Q}(f)\|_2 &\le C_2(2)\sum_{n=1}^{m} n^{-1/2}a_n(S_{P\setminus Q}(f)) \\ &\le C_2(2)\sum_{n=1}^{m} n^{-1/2}a_n(S_{Q\setminus P}(f)) \\ &= C_2(2)\sum_{n=1}^{m} n^{-1/2}a_n\big(S_{Q\setminus P}(f - t_m)\big), \end{aligned} \]
which has been estimated in (3.6.9):
\[ \le C\ln(m+1)\|f - t_m\|_2. \tag{3.6.10} \]
Combining (3.6.7)–(3.6.10) we complete the proof of Theorem 3.6.4.
Theorem 3.6.6. Assume that $\Psi$ is a democratic quasi-greedy basis of $X$. Then, for any $f \in X$,
\[ \|f - G_m(f, \Psi)\|_X \le C\ln(m+1)\sigma_m(f, \Psi)_X. \]
Corollary 3.6.7. Assume that $\Psi$ is a uniformly bounded quasi-greedy basis of $L_p$, $1 < p < \infty$. Then
\[ \|f - G_m(f, \Psi)\|_p \le C(p)\ln(m+1)\sigma_m(f, \Psi)_p. \]
Proof. It is known (see [18]) that a democratic and quasi-greedy basis is an almost greedy basis. Therefore, the inequality
\[ \|f - G_m(f, \Psi)\|_X \le C\tilde\sigma_m(f, \Psi)_X \]
holds for any $f \in X$. It remains to apply Lemma 3.2.7 to complete the proof of Theorem 3.6.6. Now Corollary 3.6.7 follows by using Proposition 3.2.21.
The following theorem is a generalization of Theorem 3.6.4.

Theorem 3.6.8. Assume that $\Psi$ is a uniformly bounded quasi-greedy basis of $L_q$, $1 < q < \infty$. Then for any $m$-term polynomial
\[ t_m = \sum_{k\in P} b_k\psi_k, \qquad |P| = m, \]
we have, for $q \le p \le \infty$,
\[ \|f - G_m(f, \Psi)\|_p \le \|f - t_m\|_p + C(p, q)m^{(1-q/p)/2}\ln(m+1)\|f - t_m\|_q. \]
Corollary 3.6.9. Assume that $\Psi$ is a uniformly bounded quasi-greedy basis of $L_q$, $1 < q < \infty$. Then, for $q \le p \le \infty$,
\[ \|f - G_m(f, \Psi)\|_p \le C(p, q)m^{(1-q/p)/2}\ln(m+1)\sigma_m(f, \Psi)_p. \]
Proof. This proof goes along the lines of the proof of Theorem 3.6.4 and uses its notations and the representation (3.6.1). By Lemma 3.6.1 and Lemma 3.2.16 we get, for $\|f - S_P(f)\|_p$,
\[ \|f - S_P(f)\|_p \le \|f - t_m\|_p + C(p, q)m^{(1-q/p)/2}\ln(m+1)\|f - t_m\|_q. \tag{3.6.11} \]
Using Lemma 3.2.16 we obtain from (3.6.3)
\[ \|S_P(f) - S_Q(f)\|_p \le Cm^{(1-q/p)/2}\big(\|S_{P\setminus Q}(f)\|_q + \|S_{Q\setminus P}(f)\|_q\big). \tag{3.6.12} \]
By Lemma 3.2.6,
\[ \|S_{Q\setminus P}(f)\|_q = \|S_{Q\setminus P}(f - t_m)\|_q \le C\ln(m+1)\|f - t_m\|_q. \]
We give another proof of this bound because it will be used in estimating $\|S_{P\setminus Q}(f)\|_q$. By Proposition 3.2.22,
\[ \begin{aligned} \|S_{Q\setminus P}(f)\|_q = \|S_{Q\setminus P}(f - t_m)\|_q &\le C(q)\sum_{n=1}^{m} n^{-1/2}a_n\big(S_{Q\setminus P}(f - t_m)\big) \\ &\le C(q)\sum_{n=1}^{m} n^{-1/2}a_n(f - t_m) = C(q)\sum_{n=1}^{m} n^{-1}n^{1/2}a_n(f - t_m) \\ &\le C(q)\ln(m+1)\sup_n n^{1/2}a_n(f - t_m) \le C(q)\ln(m+1)\|f - t_m\|_q. \end{aligned} \tag{3.6.13} \]
For $S_{P\setminus Q}(f)$ we have
\[ \begin{aligned} \|S_{P\setminus Q}(f)\|_q &\le C(q)\sum_{n=1}^{m} n^{-1/2}a_n(S_{P\setminus Q}(f)) \\ &\le C(q)\sum_{n=1}^{m} n^{-1/2}a_n(S_{Q\setminus P}(f)) \\ &= C(q)\sum_{n=1}^{m} n^{-1/2}a_n\big(S_{Q\setminus P}(f - t_m)\big), \end{aligned} \]
which has been estimated in (3.6.13):
\[ \le C(q)\ln(m+1)\|f - t_m\|_q. \tag{3.6.14} \]
Combining (3.6.11)–(3.6.14) we complete the proof of Theorem 3.6.8.
3.7 Lebesgue-type inequalities for uniformly bounded orthonormal quasi-greedy bases

In this section we continue to prove Lebesgue-type inequalities for greedy approximation in $L_p$ under different assumptions on a basis $\Psi$, namely, that $\Psi$ is a quasi-greedy basis for a pair of spaces: $L_q$, $1 < q < \infty$, and $L_p$, $q \le p$.

Theorem 3.7.1. Assume that $\Psi$ is a semi-normalized quasi-greedy basis for both $L_q$ and $L_p$ with $1 < q \le 2 \le p < \infty$. Then for any $m$-term polynomial
\[ t_m = \sum_{k\in P} b_k\psi_k, \qquad |P| = m, \]
we have
\[ \|f - G_m(f, \Psi)\|_p \le \|f - t_m\|_p + C(p, q)\ln(m+1)\|f - t_m\|_q. \]
Corollary 3.7.2. Assume that $\Psi$ is a semi-normalized quasi-greedy basis for both $L_q$ and $L_p$ with $1 < q \le 2 \le p < \infty$. Then
\[ \|f - G_m(f, \Psi)\|_p \le C(p, q)\ln(m+1)\sigma_m(f, \Psi)_p. \]
Proof. This proof goes along the lines of the proof of Theorem 3.6.4 and uses its notations and the representation (3.6.1). By Lemma 3.6.1 and Lemma 3.2.17 we get, for $\|f - S_P(f)\|_p$,
\[ \|f - S_P(f)\|_p \le \|f - t_m\|_p + C(p, q)\ln(m+1)\|f - t_m\|_q. \tag{3.7.1} \]
We obtain from (3.6.3)
\[ \|S_P(f) - S_Q(f)\|_p \le \|S_{P\setminus Q}(f)\|_p + \|S_{Q\setminus P}(f)\|_p. \tag{3.7.2} \]
Next we have, by Theorem 3.2.10,
\[ \begin{aligned} \|S_{Q\setminus P}(f)\|_p = \|S_{Q\setminus P}(f - t_m)\|_p &\le C_2(p)\sum_{n=1}^{m} n^{-1/2}a_n\big(S_{Q\setminus P}(f - t_m)\big) \\ &\le C(p)\sum_{n=1}^{m} n^{-1/2}a_n(f - t_m) = C(p)\sum_{n=1}^{m} n^{-1}n^{1/2}a_n(f - t_m) \\ &\le C(p)\ln(m+1)\sup_n n^{1/2}a_n(f - t_m) \le C(p, q)\ln(m+1)\|f - t_m\|_q. \end{aligned} \tag{3.7.3} \]
For $S_{P\setminus Q}(f)$ we have, by Theorem 3.2.10,
\[ \begin{aligned} \|S_{P\setminus Q}(f)\|_p &\le C_2(p)\sum_{n=1}^{m} n^{-1/2}a_n(S_{P\setminus Q}(f)) \\ &\le C_2(p)\sum_{n=1}^{m} n^{-1/2}a_n(S_{Q\setminus P}(f)) \\ &= C_2(p)\sum_{n=1}^{m} n^{-1/2}a_n\big(S_{Q\setminus P}(f - t_m)\big), \end{aligned} \]
which has been estimated in (3.7.3):
\[ \le C(p, q)\ln(m+1)\|f - t_m\|_q. \tag{3.7.4} \]
Combining (3.7.1)–(3.7.4) we complete the proof of Theorem 3.7.1.
Remark 3.7.3. The statement of Corollary 3.7.2 holds even if we drop the assumption that $\Psi$ is a quasi-greedy basis of $L_q$.

Proof. The assumption that $\Psi$ is semi-normalized for both $L_q$ and $L_p$, $q \le 2 \le p$, implies that it is semi-normalized in $L_2$. Then as in Proposition 3.2.21 we can prove that $\Psi$ is democratic with $\varphi(m) \asymp m^{1/2}$. It remains to apply Theorem 3.6.6.

Theorem 3.7.4. Assume that $\Psi$ is a uniformly bounded orthonormal quasi-greedy basis for $L_p$, $2 \le p < \infty$. Then for any $m$-term polynomial
\[ t_m = \sum_{k\in P} b_k\psi_k, \qquad |P| = m, \]
we have
\[ \|f - G_m(f, \Psi)\|_p \le \|f - t_m\|_p + C(p)\ln(m+1)\|f - t_m\|_{p'}, \tag{3.7.5} \]
\[ \|f - G_m(f, \Psi)\|_p \le \|f - t_m\|_p + C(p)\big(\ln(m+1)\big)^{1/2}\|f - t_m\|_2. \tag{3.7.6} \]
Corollary 3.7.5. Assume that $\Psi$ is a uniformly bounded orthonormal quasi-greedy basis for $L_p$, $2 \le p < \infty$. Then
\[ \|f - G_m(f, \Psi)\|_p \le C(p)\big(\ln(m+1)\big)^{1/2}\sigma_m(f, \Psi)_p. \]
Proof. By Theorem 3.2.20, $\Psi$ is a quasi-greedy basis of $L_{p'}$. Thus, (3.7.5) follows from Theorem 3.7.1 with $q = p'$. Now let us prove (3.7.6). As in the proof of Theorem 3.7.1 we obtain, by Lemma 3.6.1 and Lemma 3.2.18,
\[ \|f - S_P(f)\|_p \le \|f - t_m\|_p + C(p)\big(\ln(m+1)\big)^{1/2}\|f - t_m\|_2. \tag{3.7.7} \]
By Theorem 3.2.10,
\[ \begin{aligned} \|S_{Q\setminus P}(f)\|_p = \|S_{Q\setminus P}(f - t_m)\|_p &\le C(p)\sum_{n=1}^{m} n^{-1/2}a_n(f - t_m) \\ &\le C(p)\Big(\sum_{n=1}^{m} n^{-1}\Big)^{1/2}\Big(\sum_{n=1}^{m} a_n(f - t_m)^2\Big)^{1/2} \le C(p)\big(\ln(m+1)\big)^{1/2}\|f - t_m\|_2. \end{aligned} \tag{3.7.8} \]
As in the proof of Theorem 3.7.1,
\[ \|S_{P\setminus Q}(f)\|_p \le C(p)\sum_{n=1}^{m} n^{-1/2}a_n(f - t_m) \]
and, by the intermediate step in (3.7.8),
\[ \le C(p)\big(\ln(m+1)\big)^{1/2}\|f - t_m\|_2. \]
It remains to use representation (3.6.1) and inequality (3.7.2).
If $\Psi$ is assumed to be uniformly bounded, then the Lebesgue-type inequality of Theorem 3.7.1 holds whenever $q \le p$.

Theorem 3.7.6. Assume that $\Psi$ is a uniformly bounded quasi-greedy basis for both $L_q$ and $L_p$ with $1 < q \le p < \infty$. Then for any $m$-term polynomial
\[ t_m = \sum_{k\in P} b_k\psi_k, \qquad |P| = m, \]
we have
\[ \|f - G_m(f, \Psi)\|_p \le \|f - t_m\|_p + C(p, q)\ln(m+1)\|f - t_m\|_q. \]
Proof. As in the proof of Theorem 3.7.1 we obtain, by Lemma 3.6.1 and Lemma 3.2.23,
\[ \|f - S_P(f)\|_p \le \|f - t_m\|_p + C(p, q)\ln(m+1)\|f - t_m\|_q. \tag{3.7.9} \]
By Proposition 3.2.22,
\[ \begin{aligned} \|S_{Q\setminus P}(f)\|_p = \|S_{Q\setminus P}(f - t_m)\|_p &\le C(p, q)\sum_{n=1}^{m} n^{-1/2}a_n(f - t_m) \\ &\le C(p, q)\sum_{n=1}^{m} n^{-1}\|f - t_m\|_q \le C(p, q)\ln(m+1)\|f - t_m\|_q. \end{aligned} \tag{3.7.10} \]
As in the proof of Theorem 3.7.1 we get
\[ \|S_{P\setminus Q}(f)\|_p \le C(p, q)\sum_{n=1}^{m} n^{-1/2}a_n(f - t_m) \]
and, by the intermediate step in (3.7.10),
\[ \le C(p, q)\ln(m+1)\|f - t_m\|_q. \]
It remains to use representation (3.6.1) and inequality (3.7.2).
Chapter 4 Almost Greedy Bases and Duality
4.1 Introduction

Let $X$ be a Banach space with a semi-normalized basis $\Psi = \{\psi_n\}_{n=1}^{\infty}$. An approximation algorithm $\{F_n\}_{n=1}^{\infty}$ is a sequence of maps $F_n : X \to X$ such that, for each $f \in X$, $F_n(f)$ is a linear combination of at most $n$ of the basis elements $\{\psi_j\}$. The most natural algorithm is the linear algorithm $\{S_n\}_{n=1}^{\infty}$, given by the partial sum operators. Recently, the Thresholding Greedy Algorithm (TGA) $\{G_m\}_{m=1}^{\infty}$, in which $G_m(f)$ is obtained by taking the largest $m$ coefficients (precise definitions are given in Chapter 2 and in Section 4.2 below), was studied in detail. TGA provides a theoretical model for the thresholding procedure that is used in image compression and other applications. The presentation of this chapter is based on [18].

In Chapter 2 we defined the basis $\Psi = \{\psi_n\}_{n=1}^{\infty}$ to be greedy if TGA is optimal in the sense that $G_m(f)$ is essentially the best $m$-term approximation to $f$ using the basis elements, i.e., there exists a constant $C$ such that for all $f \in X$ and $m \in \mathbb{N}$ we have
$$\|f - G_m(f)\| \le C \inf\Bigl\{\Bigl\|f - \sum_{j\in A}\alpha_j\psi_j\Bigr\| : |A| = m,\ \alpha_j \in \mathbb{R},\ j \in A\Bigr\}. \tag{4.1.1}$$
It was shown in Chapter 2 that greedy bases can be simply characterized as unconditional bases with the additional property of being democratic, i.e., for some $D > 0$ we have
$$\Bigl\|\sum_{j\in A}\psi_j\Bigr\| \le D\,\Bigl\|\sum_{j\in B}\psi_j\Bigr\| \quad\text{whenever } |A| = |B|.$$
We also defined a basis to be quasi-greedy if there exists a constant $C$ such that $\|G_m(f)\| \le C\|f\|$ for all $f \in X$ and $m \in \mathbb{N}$. Subsequently, Wojtaszczyk [85] proved that these are precisely the bases for which TGA merely converges, i.e., $\lim_{m\to\infty} G_m(f) = f$ for $f \in X$.

In this chapter we introduce two natural intermediate conditions. Let us denote the biorthogonal sequence by $\Psi^* := \{\psi_n^*\}_{n=1}^{\infty}$. We say $\Psi$ is almost greedy
© Springer Basel 2015 V. Temlyakov, Sparse Approximation with Bases, Advanced Courses in Mathematics - CRM Barcelona, DOI 10.1007/978-3-0348-0890-3_4
if there is a constant $C$ such that
$$\|f - G_m(f)\| \le C \inf\Bigl\{\Bigl\|f - \sum_{j\in A}\psi_j^*(f)\psi_j\Bigr\| : |A| = m\Bigr\}, \qquad f \in X,\ m \in \mathbb{N}. \tag{4.1.2}$$
Comparison with (4.1.1) shows that this is formally a weaker condition; in fact, Wojtaszczyk's examples of conditional quasi-greedy bases of $\ell_2$ [85] are almost greedy but not greedy. Also, the basis constructed in Subsection 2.5.3 of Chapter 2 is an almost greedy basis (this follows from Theorem 4.3.2 below), but not a greedy basis.

We give two characterizations of almost greedy bases in Theorem 4.3.2. First, a basis is almost greedy if and only if it is quasi-greedy and democratic. Second, if $\lambda > 1$, then $\Psi$ is almost greedy if and only if there exists a constant $C$ such that, for all $f \in X$ and $m \in \mathbb{N}$, we have
$$\|f - G_{[\lambda m]}(f)\| \le C \inf\Bigl\{\Bigl\|f - \sum_{j\in A}\alpha_j\psi_j\Bigr\| : |A| = m,\ \alpha_j \in \mathbb{R},\ j \in A\Bigr\}. \tag{4.1.3}$$
Equation (4.1.3) is a very natural weakening of (4.1.1).

We also introduce partially greedy bases. These are bases such that, for some $C$,
$$\|f - G_m(f)\| \le C\,\Bigl\|\sum_{k=m+1}^{\infty}\psi_k^*(f)\psi_k\Bigr\|, \qquad f \in X,\ m \in \mathbb{N}. \tag{4.1.4}$$
We give a characterization in Theorem 4.3.3.

Next we study duality of these conditions. In Theorem 4.5.1 we show that if $\Psi$ is a greedy basis of a Banach space $X$ with non-trivial Rademacher type, then $\Psi^*$ is a greedy basis of $X^*$. However, examples at the end of this chapter (see also [18]) show that if $X$ has trivial type, then $\Psi^*$ need not be a greedy basic sequence.

Theorem 4.5.4 concerns duality for almost greedy sequences. It is proved that $\Psi$ and $\Psi^*$ are both almost greedy if and only if they are both partially greedy. It is also proved that if $\Psi$ is almost greedy, then $\Psi^*$ is almost greedy if and only if $\Psi$ is bidemocratic, i.e., for some $C$ we have
$$\Bigl\|\sum_{j\in A}\psi_j\Bigr\|\,\Bigl\|\sum_{j\in A}\psi_j^*\Bigr\| \le Cn, \qquad |A| = n,\ n \in \mathbb{N}.$$
Using this result we extend Theorem 4.5.1 by showing that if $X$ has non-trivial type and $\Psi$ is almost greedy, then $\Psi^*$ is an almost greedy basic sequence.

We use standard Banach space notation throughout (see, e.g., [52]). For clarity, however, we recall here the notation that is used most heavily. Let $X$ be a Banach space. The dual space of $X$, denoted $X^*$, is the Banach space of all continuous linear functionals $F$ equipped with the norm
$$\|F\| = \sup\bigl\{|F(f)| : \|f\| = 1\bigr\}.$$
The closed linear span of a set $A \subseteq X$ (resp., a sequence $(f_n)$) is denoted $[A]$ (resp. $[f_n]$). A basis for $X$ is a sequence of elements $\Psi$ such that every $f \in X$ has a unique expansion as a norm-convergent series
$$f = \sum_{n=1}^{\infty}\psi_n^*(f)\psi_n.$$
Here $\Psi^*$ is the sequence of biorthogonal functionals in $X^*$, defined by $\psi_n^*(\psi_m) = \delta_{n,m}$. As above, the basis is said to be unconditional if the series expansion converges unconditionally for every $f \in X$. It is said to be monotone if
$$\Bigl\|\sum_{k=1}^{n}\psi_k^*(f)\psi_k\Bigr\| \le \|f\|, \qquad f \in X,\ n \ge 1.$$
Finally, more specialized notions from Banach space theory will be introduced as needed.
4.2 Greedy conditions for bases

Let $\Psi$ be a semi-normalized basis of a Banach space $X$ (i.e., $1/C \le \|\psi_n\| \le C$ for some $C$), and let $\Psi^*$ be the biorthogonal sequence in $X^*$. Let us denote by $S_m$ the partial-sum operators
$$S_m(f) = \sum_{j=1}^{m}\psi_j^*(f)\psi_j.$$
We also define the remainder operators $R_m = I - S_m$. For any $f \in X$ we define the greedy ordering for $f$ as the map $\rho : \mathbb{N} \to \mathbb{N}$ such that $\rho(\mathbb{N}) \supset \{j : \psi_j^*(f) \ne 0\}$ and such that if $j < k$ then either $|\psi_{\rho(j)}^*(f)| > |\psi_{\rho(k)}^*(f)|$ or $|\psi_{\rho(j)}^*(f)| = |\psi_{\rho(k)}^*(f)|$ and $\rho(j) < \rho(k)$. The $m$-th greedy approximation is given by
$$G_m(f) = \sum_{j=1}^{m}\psi_{\rho(j)}^*(f)\psi_{\rho(j)}.$$
We also introduce the $m$-th greedy remainder $H_m(f) := f - G_m(f)$. The basis $\Psi$ is quasi-greedy if $G_m(f) \to f$ for all $f \in X$. This is equivalent (see [85]) to the condition that for some constant $C$ we have
$$\sup_m \|G_m(f)\| \le C\|f\|, \qquad f \in X. \tag{4.2.1}$$
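The greedy ordering and the operators $G_m$, $H_m$ defined above can be illustrated on a finitely supported coefficient sequence. This is a minimal sketch under the assumption that $f$ is represented by the list of its coefficients $\psi_j^*(f)$ (0-based indices); the function names are hypothetical. Ties in modulus are broken by the smaller index, as in the definition of $\rho$.

```python
def greedy_ordering(coeffs):
    """Return all indices sorted by decreasing |coefficient|, ties by smaller index.

    Indices with zero coefficient come last, so the result is a permutation
    whose range contains every index with a nonzero coefficient.
    """
    return sorted(range(len(coeffs)), key=lambda j: (-abs(coeffs[j]), j))

def greedy_approximation(coeffs, m):
    """Coefficient list of G_m(f): keep the first m terms of the greedy ordering."""
    keep = set(greedy_ordering(coeffs)[:m])
    return [c if j in keep else 0.0 for j, c in enumerate(coeffs)]

def greedy_remainder(coeffs, m):
    """Coefficient list of H_m(f) = f - G_m(f)."""
    g = greedy_approximation(coeffs, m)
    return [c - x for c, x in zip(coeffs, g)]
```

For instance, for the coefficients `[0.5, -2.0, 2.0, 0.0, 1.0]` the two coefficients of modulus 2 are selected first, the one with the smaller index before the other.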
As above it will be convenient to define the quasi-greedy constant $K$ to be the least constant such that
$$\|G_m(f)\| \le K\|f\| \quad\text{and}\quad \|H_m(f)\| \le K\|f\|, \qquad f \in X.$$
If $\Psi$ is any basis we denote
$$\sigma_m(f) = \inf\Bigl\{\Bigl\|f - \sum_{j\in A}\alpha_j\psi_j\Bigr\| : |A| = m,\ \alpha_j \in \mathbb{R}\Bigr\}.$$
A basis $\Psi$ is greedy [43] (see Definition 2.1.1 from Chapter 2) if there is a constant $C$ such that for any $f \in X$ and $m \in \mathbb{N}$ we have
$$\|H_m(f)\| \le C\sigma_m(f). \tag{4.2.2}$$
It is natural to introduce two slightly weaker forms of greediness. For any basis $\Psi$ let, as above,
$$\tilde\sigma_m(f) := \inf\Bigl\{\Bigl\|f - \sum_{k\in A}\psi_k^*(f)\psi_k\Bigr\| : |A| \le m\Bigr\}.$$
Note that
$$\sigma_m(f) \le \tilde\sigma_m(f) \le \|R_m(f)\| \longrightarrow 0 \quad\text{as } m \longrightarrow \infty.$$
Let us say that a basis $\Psi$ is almost greedy if there is a constant $C$ so that
$$\|H_m(f)\| \le C\tilde\sigma_m(f), \qquad f \in X. \tag{4.2.3}$$
We will say that a basis $\Psi$ is partially greedy if there is a constant $C$ so that for any $f \in X$, $m \in \mathbb{N}$,
$$\|H_m(f)\| \le C\|R_m(f)\|. \tag{4.2.4}$$
It is clear that for any basis we have the following implications:

greedy $\Longrightarrow$ almost greedy $\Longrightarrow$ partially greedy $\Longrightarrow$ quasi-greedy.

We conclude this section by considering direct and inverse theorems for approximation with respect to almost greedy bases. We define as above the fundamental function $\varphi(n)$ of a basis $\Psi$ by
$$\varphi(n) := \sup_{|A|\le n}\Bigl\|\sum_{k\in A}\psi_k\Bigr\|.$$
For $f \in X$ with greedy ordering $\rho$, let
$$a_k(f) := |\psi_{\rho(k)}^*(f)|.$$
The following theorem was proved in [70] (see Theorem 2.6.9 from Chapter 2).
Theorem 4.2.1. Let $1 < p < \infty$ and let $\Psi$ be a greedy basis with $\varphi(n) \asymp n^{1/p}$. Then, for any $0 < r < \infty$ and $0 < q < \infty$, we have the following equivalence:
$$\sum_n \sigma_n(f)^q\, n^{rq-1} < \infty \iff \sum_n a_n(f)^q\, n^{rq-1+q/p} < \infty.$$
We generalize this theorem as follows (cf. [30, Theorem 5.1]).

Theorem 4.2.2. Let $1 < p < \infty$ and let $\Psi$ be a democratic quasi-greedy basis with $\varphi(n) \asymp n^{1/p}$. Then, for any $0 < r < \infty$ and $0 < q < \infty$, we have the following equivalence:
$$\sum_n \|H_n(f)\|^q\, n^{rq-1} < \infty \iff \sum_n a_n(f)^q\, n^{rq-1+q/p} < \infty.$$
The proof of this theorem is similar to the proof of Theorem 4.2.1 and is based on the following lemmas, which are analogous to the corresponding lemmas from [70].

Lemma 4.2.3. Let $\Psi$ be a democratic quasi-greedy basis with $\varphi(n) \asymp n^{1/p}$. Then there exists a constant $C$ such that, for any two positive integers $N < M$ and any $f \in X$,
$$a_M(f) \le C\,\|H_N(f)\|\,(M - N)^{-1/p}.$$
Proof. This follows from Lemma 3.2.4 of Chapter 3.

Lemma 4.2.4. Let $\Psi$ be a democratic quasi-greedy basis with $\varphi(n) \asymp n^{1/p}$. Then there exists a constant $C$ such that, for any sequence $m_0 < m_1 < \cdots$ of non-negative integers,
$$\|H_{m_s}(f)\| \le C\sum_{l=s}^{\infty} a_{m_l}(f)\,(m_{l+1} - m_l)^{1/p}.$$
Proof. This lemma follows from Lemma 3.2.3 of Chapter 3.

By Theorem 4.3.2 below we get that a democratic quasi-greedy basis is almost greedy and also has the following property (setting $\lambda = 2$ in (3) of Theorem 4.3.2):
$$\sigma_{2n}(f) \le \|H_{2n}(f)\| \le C\sigma_n(f).$$
This inequality implies that
$$\sum_n \|H_n(f)\|^q\, n^{rq-1} < \infty \iff \sum_n \sigma_n(f)^q\, n^{rq-1} < \infty.$$
Therefore Theorem 4.2.1 holds with the assumption that $\Psi$ is greedy replaced by the assumption that $\Psi$ is almost greedy, which yields Theorem 4.2.2.
4.3 Democratic and conservative bases

We recall (see Definition 2.1.4 of Chapter 2) that a basis $\Psi$ in a Banach space $X$ is called democratic if there is a constant $D$ such that
$$\Bigl\|\sum_{k\in A}\psi_k\Bigr\| \le D\,\Bigl\|\sum_{k\in B}\psi_k\Bigr\| \tag{4.3.1}$$
if $|A| = |B|$. This concept was introduced in [43]. In [18] we defined a democratic basis as one satisfying (4.3.1) if $|A| \le |B|$. It follows from Lemma 3.2.2 of Chapter 3 that for quasi-greedy bases the above two definitions are equivalent. Recall that the fundamental function $\varphi(n)$ of $\Psi$ is defined by
$$\varphi(n) := \sup_{|A|\le n}\Bigl\|\sum_{k\in A}\psi_k\Bigr\|.$$
The dual fundamental function is given by
$$\varphi^*(n) := \sup_{|A|\le n}\Bigl\|\sum_{k\in A}\psi_k^*\Bigr\|.$$
Note that $\varphi$ (and $\varphi^*$) is subadditive (i.e., $\varphi(m+n) \le \varphi(m) + \varphi(n)$) and increasing. One can also verify that $\varphi(n)/n$ (and $\varphi^*(n)/n$) is decreasing since for any set $A$ with $|A| = n$ we have
$$\sum_{k\in A}\psi_k = \frac{1}{n-1}\sum_{k\in A}\ \sum_{j\in A,\, j\ne k}\psi_j.$$
By convexity, for any set $A$ and any scalars $\{a_j : j \in A\}$ we have
$$\Bigl\|\sum_{j\in A}a_j\psi_j\Bigr\| \le \max_{j\in A}|a_j|\ \max_{\pm}\Bigl\|\sum_{j\in A}\pm\psi_j\Bigr\|.$$
Hence
$$\Bigl\|\sum_{j\in A}a_j\psi_j\Bigr\| \le 2\varphi(|A|)\max_{j\in A}|a_j|. \tag{4.3.2}$$
It is clear that $\Psi$ is democratic with constant $D$ in (4.3.1) if and only if
$$D^{-1}\varphi(|A|) \le \Bigl\|\sum_{k\in A}\psi_k\Bigr\| \le \varphi(|A|), \qquad |A| < \infty. \tag{4.3.3}$$
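The definitions above can be checked by brute force in a model case. The sketch below assumes the unit vector basis of a finite-dimensional $\ell_p$ space (an illustrative choice, not taken from the text), for which $\varphi(n) = n^{1/p}$; the function names are hypothetical. It evaluates the supremum in the definition of $\varphi$ directly over all index sets of size at most $n$.

```python
import itertools

def lp_norm(x, p):
    """The l_p norm of a finite coefficient vector, 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def fundamental_function(n, p, dim):
    """sup over index sets A with |A| <= n of || sum_{k in A} e_k ||_p.

    Brute force over all subsets of range(dim); only meant for small dim.
    """
    best = 0.0
    for size in range(1, n + 1):
        for A in itertools.combinations(range(dim), size):
            indicator = [1.0 if k in A else 0.0 for k in range(dim)]
            best = max(best, lp_norm(indicator, p))
    return best
```

With this one can confirm numerically, for small `dim`, that $\varphi$ is increasing and subadditive, that $\varphi(n)/n$ is decreasing, and that a vector supported on three coordinates satisfies the bound (4.3.2).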
Lemma 4.3.1. Let $\Psi$ be a democratic quasi-greedy basis. Let $K$ be the quasi-greedy constant and $D$ the democratic constant. Then, for $f \in X$, if $\rho$ is the greedy ordering, we have
$$|\psi_{\rho(m)}^*(f)| \le \frac{4K^2D}{\varphi(m)}\,\|f\| \tag{4.3.4}$$
and
$$\sup_{k\in\mathbb{N}}\bigl|\psi_k^*\bigl(H_m(f)\bigr)\bigr| \le \frac{4K^2D}{\varphi(m+1)}\,\|f\|. \tag{4.3.5}$$
Proof. This follows directly from (4.3.3) and Lemma 3.2.4 from Chapter 3.
Next we compare almost greedy bases with greedy bases. It follows from assertion (3) of the theorem below that in an almost greedy basis the convergence of TGA is 'almost' optimal. It follows from assertion (2) of the theorem below and [85] that any conditional quasi-greedy basis of a Hilbert space is actually almost greedy. See also [20] for a conditional almost greedy basis of $\ell_1$.

Theorem 4.3.2. Suppose $\Psi$ is a basis of a Banach space. The following are equivalent:
(1) $\Psi$ is almost greedy.
(2) $\Psi$ is quasi-greedy and democratic.
(3) For any $\lambda > 1$ there is a constant $C = C_\lambda$ such that $\|H_{[\lambda m]}(f)\| \le C_\lambda\sigma_m(f)$.

Proof. We start by showing (1) implies (2). It is immediate that $\Psi$ is quasi-greedy. Now suppose $|A| \le |B|$. Let $\delta > 0$ and define
$$f = \sum_{j\in A}\psi_j + \sum_{j\in B\setminus A}(1+\delta)\psi_j.$$
Then if $r = |B\setminus A|$ we have $H_r(f) = \sum_{j\in A}\psi_j$. However,
$$\tilde\sigma_r(f) \le \Bigl\|\sum_{j\in B}\psi_j^*(f)\psi_j\Bigr\| \le \Bigl\|\sum_{j\in B}\psi_j\Bigr\| + \delta\,\Bigl\|\sum_{j\in B\setminus A}\psi_j\Bigr\|.$$
Letting $\delta \to 0$, it follows from (4.2.3) that $\Psi$ is democratic.

Next we show that (2) implies (1), so that (1) and (2) are equivalent. Suppose $f \in X$ and $m \in \mathbb{N}$. Let
$$G_m(f) = \sum_{j\in A}\psi_j^*(f)\psi_j,$$
where $|A| = m$. Suppose $|B| = r \le m$. Then
$$H_m(f) = f - \sum_{j\in B}\psi_j^*(f)\psi_j + \sum_{j\in B\setminus A}\psi_j^*(f)\psi_j - \sum_{j\in A\setminus B}\psi_j^*(f)\psi_j.$$
Then $|B\setminus A| \le s := |A\setminus B|$. Thus by Lemma 3.2.2 we get
$$\Bigl\|\sum_{j\in B\setminus A}\psi_j^*(f)\psi_j\Bigr\| \le 2K\max_{j\in B\setminus A}|\psi_j^*(f)|\,\varphi(s);$$
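by (4.3.4), we continue:
$$\le 2K\min_{j\in A\setminus B}|\psi_j^*(f)|\,\varphi(s) \le 8K^3D\,\Bigl\|\sum_{j\in A\setminus B}\psi_j^*(f)\psi_j\Bigr\|$$
$$= 8K^3D\,\Bigl\|G_s\Bigl(f - \sum_{j\in B}\psi_j^*(f)\psi_j\Bigr)\Bigr\| \le 8K^4D\,\Bigl\|f - \sum_{j\in B}\psi_j^*(f)\psi_j\Bigr\|.$$
We also have
$$\sum_{j\in A\setminus B}\psi_j^*(f)\psi_j = G_s\Bigl(f - \sum_{j\in B}\psi_j^*(f)\psi_j\Bigr).$$
Thus it follows that
$$\|H_m(f)\| \le \bigl(8K^4D + K + 1\bigr)\Bigl\|f - \sum_{j\in B}\psi_j^*(f)\psi_j\Bigr\|$$
and so, optimizing over $B$ with $|B| \le m$,
$$\|H_m(f)\| \le \bigl(8K^4D + K + 1\bigr)\tilde\sigma_m(f).$$
Let us prove that (2) implies (3) for every $\lambda > 1$. This follows directly from Lemma 3.2.8, the fact that the basis is democratic, and the equivalence of (2) and (1) proved above.

It remains to show that (3) (for some fixed $\lambda > 1$) implies (2). That $\Psi$ is quasi-greedy is immediate. Note that if $|D| = [\lambda m]$, then
$$\Bigl\|\sum_{j\in D}\psi_j\Bigr\| \le \varphi([\lambda m]) \le \lambda\varphi(m).$$
So to prove that $\Psi$ is democratic it is enough to show that
$$\Bigl\|\sum_{j\in D}\psi_j\Bigr\| \ge \varphi(m)/C_\lambda.$$
Suppose $|A| \le m$. For any set $B$ of cardinality $[\lambda m]$ disjoint from $A$ we have (by a similar argument as in the case that (1) implies (2))
$$\Bigl\|\sum_{j\in A}\psi_j\Bigr\| \le C_\lambda\,\sigma_m\Bigl(\sum_{j\in A\cup B}\psi_j\Bigr) \le C_\lambda\,\Bigl\|\sum_{j\in D}\psi_j\Bigr\|$$
whenever $D \subset A\cup B$ with $|D| \ge [\lambda m]$. Thus, maximizing over all $A$ with $|A| \le m$,
$$\inf_{|D|=[\lambda m]}\Bigl\|\sum_{j\in D}\psi_j\Bigr\| \ge \varphi(m)/C_\lambda$$
and so $\Psi$ is democratic.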
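If $A$, $B$ are subsets of $\mathbb{N}$ we use the notation $A < B$ to mean that $m \in A$ and $n \in B$ implies $m < n$. We write $n < A$ for $\{n\} < A$. We call a basis $\Psi$ conservative if there is a constant $\Gamma$ such that
$$\Bigl\|\sum_{k\in A}\psi_k\Bigr\| \le \Gamma\,\Bigl\|\sum_{k\in B}\psi_k\Bigr\| \quad\text{if } |A| \le |B| \text{ and } A < B. \tag{4.3.6}$$
The analog of Theorem 4.3.2 is:

Theorem 4.3.3. A basis $\Psi$ is partially greedy if and only if it is quasi-greedy and conservative.

Proof. Clearly, a partially greedy basis is also quasi-greedy. Suppose $\Psi$ is partially greedy (with constant $C$ in (4.2.4)) and $A < B$ with $|A| = |B| = m$. Let $r = \max A$. Let $D = [1, r]\setminus A$ and then for $\delta > 0$ let
$$f = \sum_{k\in A}\psi_k + (1+\delta)\sum_{k\in D\cup B}\psi_k.$$
Then
$$H_r(f) = \sum_{k\in A}\psi_k \quad\text{and}\quad R_r(f) = (1+\delta)\sum_{k\in B}\psi_k,$$
so that letting $\delta \to 0$ gives (4.3.6) with $\Gamma = C$.

Conversely, suppose that $\Psi$ is quasi-greedy with constant $K$ and conservative with constant $\Gamma$. Let $f \in X$ and $m \in \mathbb{N}$. Let $\rho$ be the greedy ordering for $f$. Set $D = \{\rho(j) : j \le m,\ \rho(j) \le m\}$, $B = \{\rho(j) : j \le m,\ \rho(j) > m\}$, and $A = [1, m]\setminus D$. Then $|A| = |B| = r$, say, and $A < B$. Now
$$\Bigl\|\sum_{k\in B}\psi_k^*(f)\psi_k\Bigr\| = \bigl\|G_r\bigl(R_m(f)\bigr)\bigr\| \le K\|R_m(f)\|.$$
Using Lemma 3.2.2 from Chapter 3 we obtain
$$\Bigl\|\sum_{k\in A}\psi_k^*(f)\psi_k\Bigr\| \le 2K\max_{k\in A}|\psi_k^*(f)|\,\Bigl\|\sum_{k\in A}\psi_k\Bigr\| \le 2K\Gamma\min_{k\in B}|\psi_k^*(f)|\,\Bigl\|\sum_{k\in B}\psi_k\Bigr\|$$
$$\le 8K^3\Gamma\,\Bigl\|\sum_{k\in B}\psi_k^*(f)\psi_k\Bigr\| \le 8K^4\Gamma\,\|R_m(f)\|.$$
Combining gives us
$$\|H_m(f)\| \le \|R_m(f)\| + \Bigl\|\sum_{k\in A}\psi_k^*(f)\psi_k\Bigr\| + \Bigl\|\sum_{k\in B}\psi_k^*(f)\psi_k\Bigr\| \le \bigl(8K^4\Gamma + K + 1\bigr)\|R_m(f)\|.$$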
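The test element used in the first half of the proof above can be examined on a small example. This is a sketch at the level of coefficient sequences only; the helper names and the particular sets are hypothetical. Indices are 1-based as in the text, and the greedy selection breaks ties by the smaller index.

```python
def build_test_element(A, B, delta, dim):
    """Coefficients of f = sum_{k in A} psi_k + (1 + delta) sum_{k in D or B} psi_k,

    where r = max A and D = {1, ..., r} minus A. Returns (coeffs, r);
    coeffs[k - 1] is the coefficient of psi_k.
    """
    r = max(A)
    D = set(range(1, r + 1)) - set(A)
    coeffs = [0.0] * dim
    for k in A:
        coeffs[k - 1] = 1.0
    for k in D | set(B):
        coeffs[k - 1] = 1.0 + delta
    return coeffs, r

def greedy_remainder_support(coeffs, r):
    """1-based indices carrying the nonzero coefficients of H_r(f)."""
    order = sorted(range(len(coeffs)), key=lambda j: (-abs(coeffs[j]), j))
    chosen = set(order[:r])
    return {j + 1 for j, c in enumerate(coeffs) if c != 0 and j not in chosen}

def tail_support(coeffs, r):
    """1-based indices carrying the nonzero coefficients of R_r(f) = f - S_r(f)."""
    return {j + 1 for j, c in enumerate(coeffs) if c != 0 and j + 1 > r}
```

With $A = \{2, 4\}$ and $B = \{5, 7\}$ one gets $r = 4$ and $D = \{1, 3\}$; the greedy remainder $H_4(f)$ is supported exactly on $A$ and the tail $R_4(f)$ exactly on $B$, as the proof requires.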
4.4 Bidemocratic bases

Suppose $\Psi$ is a democratic basis. We shall say that $\Psi$ has the upper regularity property (URP) if there exists an integer $r > 2$ so that
$$\varphi(rn) \le \tfrac{1}{2}\,r\,\varphi(n), \qquad n \in \mathbb{N}. \tag{4.4.1}$$
This of course implies that $\varphi(r^k n) \le 2^{-k}r^k\varphi(n)$ and is therefore easily equivalent to the existence of $0 < \beta < 1$ and a constant $C$ so that, if $m > n$,
$$\varphi(m) \le C\Bigl(\frac{m}{n}\Bigr)^{\beta}\varphi(n). \tag{4.4.2}$$
We say $\Psi$ has the lower regularity property (LRP) if there exists $r > 1$ so that
$$\varphi(rn) \ge 2\varphi(n), \qquad n \in \mathbb{N}. \tag{4.4.3}$$
This is similarly equivalent to the existence of $0 < \alpha < 1$ and $c > 0$ so that, if $m > n$,
$$\varphi(m) \ge c\Bigl(\frac{m}{n}\Bigr)^{\alpha}\varphi(n). \tag{4.4.4}$$
Let us recall (see Section 3.2 of Chapter 3) that a Banach space $X$ has (Rademacher) type $1 < p \le 2$ if there is a constant $C$ such that
$$\Bigl(\operatorname{Ave}_{\epsilon_j=\pm1}\Bigl\|\sum_{j=1}^{n}\epsilon_j x_j\Bigr\|^p\Bigr)^{1/p} \le C\Bigl(\sum_{j=1}^{n}\|x_j\|^p\Bigr)^{1/p}, \qquad x_1,\dots,x_n \in X,\ n \in \mathbb{N}.$$
The least such constant $C$ is called the type $p$-constant $T_p(X)$. We say that $X$ has non-trivial (resp. trivial) type if $X$ has (resp. does not have) type $p$ for some (resp. any) $p > 1$. Recall also that $X$ has (Rademacher) cotype $2 \le q < \infty$ if there exists a constant $C$ such that
$$\Bigl(\sum_{j=1}^{n}\|x_j\|^q\Bigr)^{1/q} \le C\Bigl(\operatorname{Ave}_{\epsilon_j=\pm1}\Bigl\|\sum_{j=1}^{n}\epsilon_j x_j\Bigr\|^q\Bigr)^{1/q}, \qquad x_1,\dots,x_n \in X,\ n \in \mathbb{N}.$$
The least such constant $C$ is called the cotype $q$-constant $C_q(X)$. We say that $X$ has non-trivial (resp. trivial) cotype if $X$ has (resp. does not have) cotype $q$ for some (resp. any) $q < \infty$.

Proposition 4.4.1.
(1) If $\Psi$ is an almost greedy basis of a Banach space with non-trivial cotype then $\Psi$ has (LRP).
(2) If $\Psi$ is an almost greedy basis of a Banach space with non-trivial type then $\Psi$ has (LRP) and (URP).
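Proof. (1) Suppose $K$ is the quasi-greedy constant of $\Psi$ and $D$ is the democratic constant. Suppose $X$ has cotype $q < \infty$ with constant $C_q(X)$. Let $B_1, \dots, B_m$ be disjoint sets with $|B_k| = n$ and let $A = \bigcup_{k=1}^{m} B_k$. Using Lemma 3.2.3 and (4.3.3), we obtain
$$m^{1/q}\varphi(n) \le D\Bigl(\sum_{k=1}^{m}\Bigl\|\sum_{j\in B_k}\psi_j\Bigr\|^q\Bigr)^{1/q} \le 2KD\Bigl(\sum_{k=1}^{m}\operatorname{Ave}_{\epsilon_j=\pm1}\Bigl\|\sum_{j\in B_k}\epsilon_j\psi_j\Bigr\|^q\Bigr)^{1/q}$$
$$\le 2KD\,C_q(X)\Bigl(\operatorname{Ave}_{\epsilon_j=\pm1}\Bigl\|\sum_{j\in A}\epsilon_j\psi_j\Bigr\|^q\Bigr)^{1/q} \le 4K^2D\,C_q(X)\,\varphi(mn).$$
It is clear that this implies (4.4.4) for some suitable constant $c > 0$ and $\alpha = 1/q$.

(2) Since non-trivial type implies non-trivial cotype we obtain (LRP) immediately. The proof of (URP) (with $\beta = 1/p$ when $X$ has type $p$) is very similar. Using the same notation and assuming $X$ has type $p > 1$ with constant $T_p(X)$, we have
$$\varphi(mn) \le 2KD\Bigl(\operatorname{Ave}_{\epsilon_j=\pm1}\Bigl\|\sum_{j\in A}\epsilon_j\psi_j\Bigr\|^p\Bigr)^{1/p} \le 2KD\,T_p(X)\Bigl(\sum_{k=1}^{m}\operatorname{Ave}_{\epsilon_j=\pm1}\Bigl\|\sum_{j\in B_k}\epsilon_j\psi_j\Bigr\|^p\Bigr)^{1/p}$$
$$\le 4K^2D\,T_p(X)\,m^{1/p}\varphi(n).$$
This implies (4.4.2) for suitable constants.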
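We now say that a basis $\Psi$ is bidemocratic if there is a constant $D$ so that
$$\varphi(n)\varphi^*(n) \le Dn. \tag{4.4.5}$$
Proposition 4.4.2. If $\Psi$ is bidemocratic (with constant $D$), then $\Psi$ and $\Psi^*$ are both democratic (with constant $D$) and are both unconditional for constant coefficients.

Proof. If $A$ is any finite set we have
$$D^{-1}\varphi(|A|)\varphi^*(|A|) \le |A| \le \Bigl\|\sum_{j\in A}\psi_j^*\Bigr\|\,\Bigl\|\sum_{j\in A}\psi_j\Bigr\| \le \varphi^*(|A|)\,\Bigl\|\sum_{j\in A}\psi_j\Bigr\|.$$
Hence,
$$D^{-1}\varphi(|A|) \le \Bigl\|\sum_{j\in A}\psi_j\Bigr\|,$$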
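and so $\Psi$ is democratic with constant $D$. Let $(\epsilon_j)_{j\in A}$ be any choice of signs $\pm1$. Then
$$D^{-1}\varphi(|A|)\varphi^*(|A|) \le |A| \le \Bigl\|\sum_{j\in A}\epsilon_j\psi_j^*\Bigr\|\,\Bigl\|\sum_{j\in A}\epsilon_j\psi_j\Bigr\| \le 2\varphi^*(|A|)\,\Bigl\|\sum_{j\in A}\epsilon_j\psi_j\Bigr\|.$$
Hence
$$\frac{1}{2D}\,\varphi(|A|) \le \Bigl\|\sum_{j\in A}\epsilon_j\psi_j\Bigr\| \le 2\varphi(|A|).$$
Therefore $\Psi$ is unconditional for constant coefficients. Similar calculations work for $(\psi_j^*)$ and yield the proposition.

Proposition 4.4.3. A basis $\Psi$ is bidemocratic if and only if there is a constant $C$ so that, for any finite set $A \subset \mathbb{N}$,
$$\Bigl\|\sum_{k\in A}\psi_k\Bigr\|\,\Bigl\|\sum_{k\in A}\psi_k^*\Bigr\| \le C|A|. \tag{4.4.6}$$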
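Proof. One direction is trivial. Assume (4.4.6) holds with $C \ge 1$. Let $n \in \mathbb{N}$. By passing to an equivalent norm on $X$, if necessary, we may assume that both $\Psi$ and $\Psi^*$ are monotone. There exist $A, B \subset \mathbb{N}$ with $|A| \le n$, $|B| \le n$ and
$$\Bigl\|\sum_{j\in A}\psi_j\Bigr\| \ge \tfrac12\varphi(n), \qquad \Bigl\|\sum_{j\in B}\psi_j^*\Bigr\| \ge \tfrac12\varphi^*(n).$$
By the monotonicity of $\Psi$ and $\Psi^*$, we may assume that $|A| = |B| = n$. Let $D = A\cup B$, $E = D\setminus A$. If $\bigl\|\sum_{j\in D}\psi_j\bigr\| \ge \frac{1}{8C}\varphi(n)$ and $\bigl\|\sum_{j\in D}\psi_j^*\bigr\| \ge \frac{1}{8C}\varphi^*(n)$ we obtain immediately that
$$\varphi(n)\varphi^*(n) \le 2^6C^3|D| \le 2^7C^3n.$$
Consider when one of these inequalities fails; we need only treat the case $\bigl\|\sum_{j\in D}\psi_j\bigr\| < \frac{1}{8C}\varphi(n)$. Then
$$\Bigl\|\sum_{j\in E}\psi_j\Bigr\| \ge \Bigl\|\sum_{j\in A}\psi_j\Bigr\| - \Bigl\|\sum_{j\in D}\psi_j\Bigr\| > \frac{\varphi(n)}{2} - \frac{\varphi(n)}{8C} > \frac{\varphi(n)}{4}$$
and thus, as $|E| \le n$, (4.4.6) gives
$$\Bigl\|\sum_{j\in E}\psi_j^*\Bigr\| \le 4Cn\,\varphi(n)^{-1}.$$
We also have from (4.4.6) that
$$\Bigl\|\sum_{j\in A}\psi_j^*\Bigr\| \le 2Cn\,\varphi(n)^{-1}.$$
Hence,
$$\Bigl\|\sum_{j\in D}\psi_j^*\Bigr\| \le 6Cn\,\varphi(n)^{-1},$$
and so
$$n \le |D| \le \Bigl\|\sum_{j\in D}\psi_j^*\Bigr\|\,\Bigl\|\sum_{j\in D}\psi_j\Bigr\| \le \frac{6Cn}{\varphi(n)}\cdot\frac{\varphi(n)}{8C} = \frac{3n}{4},$$
which is a contradiction.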
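Proposition 4.4.4. If $\Psi$ is a democratic quasi-greedy basis with (URP), then $\Psi$ is bidemocratic.

Proof. We assume that (4.4.2) holds and that $\Psi$ is quasi-greedy with constant $K$ and democratic with constant $D$. Suppose $A$ is a finite subset of $\mathbb{N}$. Pick $f \in X$ such that $\|f\| = 1$ and $\bigl|\sum_{j\in A}\psi_j^*(f)\bigr| > \frac12\bigl\|\sum_{j\in A}\psi_j^*\bigr\|$. Let $\rho$ be the greedy ordering for $f$. Then, by (4.3.5), if $|A| = n$,
$$\varphi(n)\,\Bigl\|\sum_{j\in A}\psi_j^*\Bigr\| \le 2\varphi(n)\,\Bigl|\sum_{j\in A}\psi_j^*(f)\Bigr| \le 2\varphi(n)\sum_{k=1}^{n}|\psi_{\rho(k)}^*(f)|$$
$$\le 8K^2D\sum_{k=1}^{n}\frac{\varphi(n)}{\varphi(k)} \le 8K^2DC\,n^{\beta}\sum_{k=1}^{n}k^{-\beta} \le C_1n$$
for a suitable constant $C_1$. This implies $\varphi(n)\varphi^*(n) \le C_1n$.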
Corollary 4.4.5. Let $\Psi$ be a quasi-greedy basis for a Hilbert space. Then $\Psi$ is bidemocratic.

Proof. Wojtaszczyk [85] proved that $\Psi$ is democratic and that $\varphi(n) \asymp \sqrt{n}$. So the result follows from Proposition 4.4.4.

Corollary 4.4.6. Let $\Psi = \{\psi_j\}_{j=1}^{\infty}$ be a uniformly bounded quasi-greedy basis in $L_q$, $1 < q < \infty$. Then $\Psi$ is bidemocratic.

Proof. This follows directly from Proposition 3.2.21 of Chapter 3 and Proposition 4.4.4.

Remark 4.4.7. Proposition 4.4.4 fails for bases that are not quasi-greedy. To see this, let $(e_n^p)$ be the unit vector basis of $\ell_p$. We define a normalized basis $(f_n)$ of $\ell_2 \oplus_2 \ell_p$ as follows:
$$f_{2n-1} = \frac{1}{\sqrt2}\bigl(e_n^2 + e_n^p\bigr), \qquad f_{2n} = \frac12\,e_n^2 + \frac{\sqrt3}{2}\,e_n^p.$$
Suppose that $1 < p < 2$. It is easy to check that $(f_n)$ and $(f_n^*)$ are both democratic and unconditional for constant coefficients, that $\varphi(n) \asymp n^{1/p}$, and that $\varphi^*(n) \asymp \sqrt{n}$. So both $(f_n)$ and $(f_n^*)$ have (URP), but $(f_n)$ is not bidemocratic.
4.5 Duality of almost greedy bases

Theorem 4.5.1. Let $\Psi$ be a greedy basis with (URP). Then $\Psi^*$ is a greedy basic sequence. In particular, if $\Psi$ is a greedy basis of a Banach space $X$ with non-trivial type, then $\Psi^*$ is a greedy basis of $X^*$.

Proof. Since $\Psi^*$ is automatically unconditional, this follows from Proposition 4.4.4, Proposition 4.4.2, and Theorem 2.4.1 from Chapter 2. The second part follows from Proposition 4.4.1; note that any space with non-trivial type and an unconditional basis is reflexive by James' theorem [35].

Remark 4.5.2. The Haar system is a greedy basis of $H_1$. However, Oswald [59] proved that the Haar system is not a greedy basic sequence in $BMO$ (i.e., $H_1^*$). This provides a natural illustration of the fact that the assumption of non-trivial type in Theorem 4.5.1 cannot be eliminated.

Corollary 4.5.3. For $1 < p < \infty$ the Banach space $L_p[0,1]$ has a greedy basis that is not equivalent to a rearranged subsequence of the Haar system.

Proof. For $p > 2$ Wojtaszczyk [85] constructed such a basis with $\varphi(n) \asymp n^{1/p}$, hence with (URP). The case $p < 2$ follows by duality using Theorem 4.5.1.

Theorem 4.5.4. Let $\Psi$ be a quasi-greedy basis of a Banach space $X$. Then the following are equivalent:
(1) $\Psi$ is bidemocratic.
(2) $\Psi$ and $\Psi^*$ are both almost greedy.
(3) $\Psi$ and $\Psi^*$ are both partially greedy.

Proof. We first prove (1) implies (2). Let $D$ denote the bidemocratic constant. By Theorem 4.3.2 and Proposition 4.4.2, we only need to show that $\Psi^*$ is quasi-greedy. Let us denote by $G_m^*$ and $H_m^*$ the greedy operator and greedy remainder operators associated to the dual basic sequence $\Psi^*$. Let $f^* \in X^*$ and $f \in X$. First note that if $|A| = m$, then
$$\sum_{j\in A}|f^*(\psi_j)| \le \|f^*\|\sup_{\epsilon_j=\pm1}\Bigl\|\sum_{j\in A}\epsilon_j\psi_j\Bigr\| \le 2\varphi(m)\|f^*\|.$$
Hence,
$$\sup_{j\in\mathbb{N}}\bigl|H_m^*f^*(\psi_j)\bigr| \le 2\,\frac{\varphi(m+1)}{m+1}\,\|f^*\|. \tag{4.5.1}$$
On the other hand, (4.3.5) implies that
$$\sup_{j\in\mathbb{N}}\bigl|\psi_j^*\bigl(H_m(f)\bigr)\bigr| \le \frac{4K^2D}{\varphi(m+1)}\,\|f\|. \tag{4.5.2}$$
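Suppose $G_m(f) = \sum_{j\in A}\psi_j^*(f)\psi_j$ and $G_m^*(f^*) = \sum_{j\in B}f^*(\psi_j)\psi_j^*$, where $|A| = |B| = m$. Then
$$\bigl|H_m^*f^*\bigl(G_m(f)\bigr)\bigr| = \Bigl|\sum_{j\in A\setminus B}f^*(\psi_j)\psi_j^*(f)\Bigr| \le \Bigl\|\sum_{j\in A\setminus B}f^*(\psi_j)\psi_j^*\Bigr\|\,\|f\|$$
$$\le 4\,\frac{\varphi(m+1)\varphi^*(m)}{m+1}\,\|f\|\,\|f^*\| \quad\text{(by (4.3.2) and (4.5.1))}$$
$$\le 4D\,\|f\|\,\|f^*\|.$$
Also,
$$\bigl|G_m^*f^*\bigl(H_m(f)\bigr)\bigr| = \Bigl|f^*\Bigl(\sum_{j\in B\setminus A}\psi_j^*(f)\psi_j\Bigr)\Bigr| \le \|f^*\|\,\frac{4K^2D\,\|f\|}{\varphi(m+1)}\,\bigl(2\varphi(m)\bigr) \quad\text{(by (4.5.2))}$$
$$\le 8K^2D\,\|f\|\,\|f^*\|.$$
Now
$$G_m^*f^*(f) = f^*\bigl(G_m(f)\bigr) - H_m^*f^*\bigl(G_m(f)\bigr) + G_m^*(f^*)\bigl(H_m(f)\bigr).$$
Hence,
$$\bigl|G_m^*f^*(f)\bigr| \le \bigl(K + 4D + 8K^2D\bigr)\|f\|\,\|f^*\|$$
so that
$$\|G_m^*f^*\| \le \bigl(K + 4D + 8K^2D\bigr)\|f^*\|.$$
This implies that $\Psi^*$ is a quasi-greedy basic sequence, and proves that (1) implies (2).

Clearly (2) implies (3), so it remains to prove that (3) implies (1). By Theorem 4.3.3, (3) implies that both $\Psi$ and $\Psi^*$ are quasi-greedy and conservative. Let us assume that $K$ is a quasi-greedy constant for both $\Psi$ and $\Psi^*$, and that $\Gamma$ is a conservative constant for both $\Psi$ and $\Psi^*$.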
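Suppose $A$ is any finite subset of $\mathbb{N}$. For $f \in [\psi_j]_{j\notin A}$, let $y = \sum_{j\in A}\psi_j + f$. First suppose that $|\psi_j^*(f)| \ne 1$ for all $j$. Then
$$\Bigl\|\sum_{j\in A}\psi_j\Bigr\| \le \Bigl\|\sum_{|\psi_j^*(y)|\le1}\psi_j^*(y)\psi_j\Bigr\| + \Bigl\|\sum_{|\psi_j^*(y)|<1}\psi_j^*(y)\psi_j\Bigr\| \le 2K\|y\|.$$
By continuity, $\bigl\|\sum_{j\in A}\psi_j\bigr\| \le 2K\|y\|$ for all $f \in [\psi_j]_{j\notin A}$. Thus, by Nikol'skii's duality theorem (see, e.g., [47], or Theorem 7.2.3 from the Appendix), there exists $f^* \in [\psi_j^*]_{j\in A}$ with $\|f^*\| = 1$ and
$$f^*\Bigl(\sum_{j\in A}\psi_j\Bigr) \ge \frac{1}{2K}\,\Bigl\|\sum_{j\in A}\psi_j\Bigr\|. \tag{4.5.3}$$
Now suppose $m \in \mathbb{N}$. Choose $A_0$, $B_0$ with $|A_0|, |B_0| \le m$ and
$$\Bigl\|\sum_{j\in A_0}\psi_j\Bigr\| \ge \tfrac12\varphi(m), \qquad \Bigl\|\sum_{j\in B_0}\psi_j^*\Bigr\| \ge \tfrac12\varphi^*(m).$$
Let $A$ be any subset of $\mathbb{N}$ with $|A| = 2m$ and $A > \max(A_0, B_0)$. Note that if $D \subset A$ and $|D| \ge m$, then since $\Psi$ and $\Psi^*$ are conservative with constant $\Gamma$,
$$\Bigl\|\sum_{j\in D}\psi_j\Bigr\| \ge \frac{1}{2\Gamma}\,\varphi(m), \qquad \Bigl\|\sum_{j\in D}\psi_j^*\Bigr\| \ge \frac{1}{2\Gamma}\,\varphi^*(m). \tag{4.5.4}$$
Let us choose $u^* \in [\psi_j^*]_{j\in A}$ such that $\sum_{j\in A}|u^*(\psi_j)|^2$ is minimized subject to $\|u^*\| \le 1$ and
$$\sum_{j\in A}u^*(\psi_j) \ge \frac{\varphi(m)}{4\Gamma K}. \tag{4.5.5}$$
This is possible by (4.5.3) and (4.5.4). Now let $G_m^*(u^*) = \sum_{j\in B}u^*(\psi_j)\psi_j^*$, where $B \subset A$ and $|B| = m$. Let $D = A\setminus B$. We observe that by Lemma 3.2.4 we have
$$\min_{j\in B}|u^*(\psi_j)|\,\Bigl\|\sum_{j\in B}\psi_j^*\Bigr\| \le 4K^2$$
and hence, by (4.5.4),
$$\min_{j\in B}|u^*(\psi_j)| \le \frac{8K^2\Gamma}{\varphi^*(m)}.$$
We then use again (4.5.3) to find $v^* \in [\psi_j^*]_{j\in D}$ with $\|v^*\| = 1$ and
$$\sum_{j\in D}v^*(\psi_j) \ge \frac{\varphi(m)}{4\Gamma K}. \tag{4.5.6}$$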
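It follows from the minimality assumption on $u^*$ that
$$\sum_{j\in A}\bigl((1-t)u^*(\psi_j) + t\,v^*(\psi_j)\bigr)^2 \ge \sum_{j\in A}\bigl(u^*(\psi_j)\bigr)^2$$
for $0 \le t \le 1$ and so, using Lemma 3.2.4 and (4.5.6),
$$\sum_{j\in A}u^*(\psi_j)^2 \le \sum_{j\in A}u^*(\psi_j)\,v^*(\psi_j) \le \min_{j\in B}|u^*(\psi_j)|\sum_{j\in D}|v^*(\psi_j)|$$
$$\le \frac{8K^2\Gamma}{\varphi^*(m)}\max_{\epsilon_j=\pm1}\Bigl\|\sum_{j\in D}\epsilon_j\psi_j\Bigr\| \le \frac{16K^2\Gamma\,\varphi(m)}{\varphi^*(m)}.$$
Thus, from (4.5.5),
$$\varphi(m)^2 \le 2^4\Gamma^2K^2\Bigl(\sum_{j\in A}u^*(\psi_j)\Bigr)^2 \le 2^4\Gamma^2K^2\,m\sum_{j\in A}u^*(\psi_j)^2 \le \frac{2^8\Gamma^3K^4\,m\,\varphi(m)}{\varphi^*(m)},$$
which gives the estimate $\varphi(m)\varphi^*(m) \le 2^8\Gamma^3K^4m$, so that $\Psi$ is bidemocratic.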
Corollary 4.5.5. Let $X$ be a Banach space with non-trivial type. If $\Psi$ is an almost greedy basis of $X$, then $\Psi^*$ is an almost greedy basic sequence in $X^*$.

Proof. This follows directly from Theorem 4.5.4 and Proposition 4.4.4.
Chapter 5 Greedy Approximation with Respect to the Trigonometric System
5.1 Introduction

The trigonometric system is a classical system that inspired the creation of wonderful deep theories and proofs of a myriad of beautiful difficult theorems. In this chapter we present some results on greedy approximation with respect to the trigonometric system. We have already discussed some results in Chapter 2. In particular, we proved in Chapter 2 that the trigonometric system is not a quasi-greedy basis for $L_p$, $p \ne 2$. This means that the mere fact that $f \in L_p$ does not guarantee convergence of greedy approximants $\{G_m(f,\mathcal{T})\}$ in the case $p \ne 2$. Convergence is a fundamental property of an approximation method. In Sections 5.2 and 5.3 we study convergence of $\{G_m(f,\mathcal{T})\}$ in the $L_p$-norm. In many cases we find necessary and sufficient conditions on $f$ to guarantee that
$$\|f - G_m(f,\mathcal{T})\|_p \longrightarrow 0 \quad\text{as } m \longrightarrow \infty.$$
We note that some of these results were unexpected; for instance, necessary and sufficient conditions for convergence in the uniform norm (see Theorems 5.2.3 and 5.2.4). Also, the study of convergence of $\{G_m(f,\mathcal{T})\}$ required new techniques, in particular new types of inequalities (see Subsection 5.3.2 below). A detailed discussion of convergence results is given in the introductory subsections 5.2.1 and 5.3.1 of Sections 5.2 and 5.3, respectively.

The Thresholding Greedy Algorithm that provides approximants $\{G_m(f,\mathcal{T})\}$ is a very simple nonlinear approximation method. However, the fact that it may diverge in $L_p$ for some $f \in L_p$, $p \ne 2$, motivates us to consider other, possibly more complicated, methods of construction of $m$-term trigonometric approximations. There is a well-developed theory of greedy algorithms with respect to an arbitrary dictionary (see [82]). It turns out that general greedy algorithms work well for the trigonometric system. We now discuss this important phenomenon in detail.
© Springer Basel 2015 V. Temlyakov, Sparse Approximation with Bases, Advanced Courses in Mathematics - CRM Barcelona, DOI 10.1007/978-3-0348-0890-3_5
Let us consider nonlinear approximation with respect to the trigonometric system $\mathcal{T}^d := \mathcal{T}\times\cdots\times\mathcal{T}$ ($d$ times). The existence of best $m$-term trigonometric approximation was proved in [2] (see also [68] and Theorem 1.2.3 from Chapter 1). The method $G_m(f) := G_m(f,\mathcal{T}^d)$ has an obvious advantage over the traditional approximation by trigonometric polynomials in the case of approximation of functions of several variables. In this case ($d > 1$) there is no natural order of the trigonometric system and the use of $G_m$ allows us to avoid the problem of finding natural subspaces of trigonometric polynomials for approximation purposes. We proved in [68] (see Theorem 2.2.1 and Remarks 2.2.4 and 2.2.5 from Chapter 2) the following results.

Theorem 5.1.1. For each $f \in L_p(\mathbb{T}^d)$,
$$\|f - G_m(f)\|_p \le \bigl(1 + 3m^{h(p)}\bigr)\sigma_m(f)_p, \qquad 1 \le p \le \infty,$$
where $h(p) := |1/2 - 1/p|$.

Remark 5.1.2. For all $1 \le p \le \infty$,
$$\|G_m(f)\|_p \le m^{h(p)}\|f\|_p.$$
Remark 5.1.3. There is a positive absolute constant $C$ such that for each $m$ and $1 \le p \le \infty$ there exists a function $f \ne 0$ for which
$$\|G_m(f)\|_p \ge Cm^{h(p)}\|f\|_p. \tag{5.1.1}$$
The above Remark 5.1.3 shows that the trigonometric system is not a quasi-greedy basis for $L_p$, $p \ne 2$. This leads to a natural attempt to consider some other algorithms that may have some advantages over TGA in the case of $\mathcal{T}$. In this chapter, along with the study of convergence of TGA, we discuss the performance of other, more general, greedy algorithms; for instance, the Weak Chebyshev Greedy Algorithm (WCGA) with respect to $\mathcal{T}$ (see below).

Let us compare the rates of approximation of TGA and WCGA. Let $\mathcal{RT}$ denote the real trigonometric system $1/2, \sin x, \cos x, \dots$. We need to switch to this system from the complex trigonometric system because the algorithm WCGA is defined for the real Banach space. We note that the system $\mathcal{RT}$ is not normalized in $L_p$, but semi-normalized: $C_1 \le \|t\|_p \le C_2$ for any $t \in \mathcal{RT}$, with absolute constants $C_1$, $C_2$, $1 \le p \le \infty$. This is sufficient for the application of the general methods developed in Chapter 6. For a function $f$ with absolutely convergent Fourier series
$$f(x) = a_0/2 + \sum_{k=1}^{\infty}\bigl(a_k\cos kx + b_k\sin kx\bigr),$$
denote
$$\|f\|_A := |a_0| + \sum_{k=1}^{\infty}\bigl(|a_k| + |b_k|\bigr).$$
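The quantities introduced above are easy to evaluate for a trigonometric polynomial. The sketch below uses hypothetical function names and assumes that $f$ is given by finitely many Fourier coefficients $a_0$, $(a_k)$, $(b_k)$. It computes the exponent $h(p)$, the factor $1 + 3m^{h(p)}$ from Theorem 5.1.1, and the norm $\|f\|_A$.

```python
def h(p):
    """h(p) = |1/2 - 1/p|, reading 1/p as 0 when p is infinity."""
    inv_p = 0.0 if p == float("inf") else 1.0 / p
    return abs(0.5 - inv_p)

def lebesgue_factor(m, p):
    """The factor 1 + 3 m^{h(p)} appearing in Theorem 5.1.1."""
    return 1.0 + 3.0 * m ** h(p)

def a_norm(a0, a, b):
    """|a0| + sum_k (|a_k| + |b_k|) for finitely many coefficients."""
    return abs(a0) + sum(abs(x) for x in a) + sum(abs(x) for x in b)
```

For $p = 2$ the factor equals 4 for every $m$, while for $p = 1$ or $p = \infty$ it grows like $m^{1/2}$, in line with Remark 5.1.3.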
Define the class
$$A_1 := A_1(\mathcal{RT}) := \{f : \|f\|_A \le 1\}.$$
For a sequence $\tau := \{t_k\}$ with $t_k = t$, $k = 1, 2, \dots$, we replace $\tau$ by $t$ in the notation. Theorem 5.1.6 and (5.1.5) from below imply the following result.

Theorem 5.1.4. Let $0 < t \le 1$. For $f \in A_1$ we have, for WCGA,
$$\|f - G_m^{c,t}(f,\mathcal{RT})\|_p \le C(p,t)\,m^{-1/2}, \qquad 2 \le p < \infty. \tag{5.1.2}$$
This estimate and Theorem 5.1.1 imply that for $f \in A_1$ we have
$$\|f - G_m(f,\mathcal{RT})\|_p \le C(p,t)\,m^{-1/p}, \qquad 2 \le p < \infty, \tag{5.1.3}$$
which is weaker than (5.1.2). It is proved in [19] (see Section 5.4 of this chapter) that (5.1.3) cannot be improved. Thus WCGA works better than TGA for the class $A_1$.

We note that the restriction $p < \infty$ in (5.1.2) is important. We gave a lower estimate for the best $m$-term approximation in $L_\infty$ in [76].

Proposition 5.1.5. For a given $m$, define
$$f := \sum_{k=0}^{2m}\cos 3^k x.$$
Then
$$\sigma_m(f,\mathcal{RT})_\infty \ge m/8.$$
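The function in Proposition 5.1.5 is a lacunary sum of $2m+1$ cosines with unit coefficients, so $\|f\|_A = 2m + 1$ and $f(0) = 2m + 1$. The following minimal sketch, with hypothetical names, evaluates $f$ pointwise; it does not compute $\sigma_m(f,\mathcal{RT})_\infty$, it only makes the object of the proposition concrete.

```python
import math

def lacunary_sum(x, m):
    """Evaluate f(x) = sum_{k=0}^{2m} cos(3^k x)."""
    return sum(math.cos(3 ** k * x) for k in range(2 * m + 1))

def lower_bound(m):
    """The lower bound m / 8 stated in Proposition 5.1.5."""
    return m / 8.0
```

Any $m$-term approximant from $\mathcal{RT}$ leaves at least $m + 1$ of the $2m + 1$ frequencies $3^k$ untouched; the proposition asserts that the uniform error is then at least $m/8$.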
We now discuss in more detail applications of general greedy algorithms for trigonometric approximation. We begin with nonlinear m-term approximation and greedy algorithms with respect to a general system (dictionary). We concentrate here on a discussion of m-term approximation with respect to redundant dictionaries in Banach spaces. A detailed discussion is offered in Chapter 6. We discuss only one example of an algorithm from the family of greedy algorithms. The reader can find a further discussion of greedy approximation in Banach spaces in the surveys [76], [81] and the book [82]. The presentation here is based on the paper [73], which in turn is a combination of ideas and methods developed for Banach spaces in a fundamental paper [22] with the approach used in [72] in the case of Hilbert spaces. The papers [22] and [72] (see also the book [82]) contain detailed historical remarks. Two greedy-type approximation methods —the Weak Chebyshev Greedy Algorithm (WCGA) and the Weak Relaxed Greedy Algorithm (WRGA)— have been introduced and studied in [73]. These methods (WCGA and WRGA) are very general approximation methods that work well in an arbitrary uniformly smooth Banach space X for any dictionary D (see below). Surprisingly, it turned out that these general approximation methods are also very good for specific dictionaries. It has been observed in [19] (see Section 5.4 of this chapter) that WCGA provides constructive methods in m-term trigonometric approximation in Lp , p ∈ [2, ∞),
which realizes the optimal rate of m-term approximation for different function classes. In [75] WCGA and WRGA have been used in constructing deterministic cubature formulas for a wide variety of function classes, with error estimates similar to those for the Monte Carlo method. It looks like WCGA and WRGA can be considered as a constructive deterministic alternative to (substitute for) some powerful probabilistic methods. This observation encouraged us to continue a thorough study of WCGA and WRGA. In this section we discuss in detail only WCGA. In [73] we developed the theory of the Weak Chebyshev Greedy Algorithm in a general setting: X is an arbitrary uniformly smooth Banach space and D is any dictionary. We keep the term greedy algorithm in the name of this approximation method for two reasons. First, this term has been used in previous papers and has become a standard name for procedures like WCGA. For further discussion of the terminology, see [76, Remark 1.1, p. 38]. Second, clearly, in the above general setting the term algorithm cannot be confused with the same term used in a more restricted sense, say, in computer science. We note that in the case of finite-dimensional X and finite D the above methods are algorithms in a strict sense. In [77] we used WCGA to build a constructive method for m-term trigonometric approximation in the uniform norm (see Section 5.5 of this chapter). It is known that the case of approximating by m-term trigonometric polynomials in the uniform norm is the most difficult. We note that in the case of Lp -norms with p < ∞ the corresponding constructive method has been provided in [19] (see Section 5.4 of this chapter). In [77] we also studied a slight modification of an incremental type algorithm from [22]. We applied that algorithm for constructing deterministic sets of points with small Lp discrepancy and also with small symmetrized Lp discrepancy. We now proceed to a presentation of the above-mentioned results. 
Let $X$ be a Banach space with norm $\|\cdot\|$. We say that a set $D$ of elements (functions) from $X$ is a dictionary if each $g \in D$ has norm less than or equal to one ($\|g\| \le 1$) and $\overline{\operatorname{span}}\, D = X$. A dictionary $D$ is called symmetric if $g \in D$ implies $-g \in D$. Denote by $D^{\pm} := \{\pm g,\ g \in D\}$ a symmetrized version of $D$. We note that in [73] we required in the definition of a dictionary normalization of its elements ($\|g\| = 1$). However, it is pointed out in [77] that it is easy to check that the arguments from [73] work under the assumption $\|g\| \le 1$ instead of $\|g\| = 1$. In applications it is more convenient to have the assumption $\|g\| \le 1$ than normalization of a dictionary. For an element $f \in X$ we denote by $F_f$ a norming (peak) functional for $f$:
$$\|F_f\| = 1, \qquad F_f(f) = \|f\|.$$
The existence of such a functional is guaranteed by the Hahn–Banach theorem. Let $\tau := \{t_k\}_{k=1}^{\infty}$ be a given sequence of nonnegative numbers $t_k \le 1$, $k = 1, 2, \dots$.
5.1. Introduction
We define (see [73]) the Weak Chebyshev Greedy Algorithm (WCGA), which is a generalization for Banach spaces of the Weak Orthogonal Greedy Algorithm defined and studied in [72] (see also [16] for the Orthogonal Greedy Algorithm).
Weak Chebyshev Greedy Algorithm (WCGA). We define $f_0^c := f_0^{c,\tau} := f$. Then for each $m \ge 1$ we inductively define:
1) $\varphi_m^c := \varphi_m^{c,\tau} \in D$ is any element satisfying
$$|F_{f_{m-1}^c}(\varphi_m^c)| \ge t_m \sup_{g \in D} |F_{f_{m-1}^c}(g)|.$$
2) Define $\Phi_m := \Phi_m^{\tau} := \operatorname{span}\{\varphi_j^c\}_{j=1}^m$, and define $G_m^c := G_m^{c,\tau}$ to be the best approximant to $f$ from $\Phi_m$.
3) Denote $f_m^c := f_m^{c,\tau} := f - G_m^c$.
The term "weak" in this definition means that at step 1) we do not shoot for the optimal element of the dictionary which realizes the corresponding supremum, but are satisfied with a weaker property than being optimal. The obvious reason for this is that we do not know in general that the optimal element exists. Another practical reason is that the weaker the assumption, the easier it is to satisfy and, therefore, the easier it is to realize in practice. We consider here approximation in uniformly smooth Banach spaces. For a Banach space $X$ we define the modulus of smoothness by
$$\rho(u) := \sup_{\|x\| = \|y\| = 1} \Big( \tfrac{1}{2}\big(\|x + uy\| + \|x - uy\|\big) - 1 \Big).$$
A uniformly smooth Banach space is one for which
$$\lim_{u \to 0} \rho(u)/u = 0.$$
It is easy to see that for any Banach space $X$ its modulus of smoothness $\rho(u)$ is an even convex function satisfying the inequalities
$$\max(0, u - 1) \le \rho(u) \le u, \qquad u \in (0, \infty). \tag{5.1.4}$$
It is well known (see for instance [22, Lemma B.1]) that in the case $X = L_p$, $1 \le p < \infty$, we have
$$\rho(u) \le \begin{cases} u^p/p, & \text{if } 1 \le p \le 2, \\ (p-1)u^2/2, & \text{if } 2 \le p < \infty. \end{cases} \tag{5.1.5}$$
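The three steps of WCGA can be illustrated in the simplest uniformly smooth setting. The following is only a sketch under stated assumptions, not the general Banach-space algorithm: it takes $X = \mathbb{R}^n$ with the Euclidean norm, where the norming functional of a residual $r$ is $F_r(g) = \langle r, g\rangle / \|r\|$ and the best approximant from $\Phi_m$ is the orthogonal projection, so that WCGA reduces to the Weak Orthogonal Greedy Algorithm. The function name, the constant weakness parameter `t`, and the rule "take the first admissible element" are illustrative choices.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weak_orthogonal_greedy(f, dictionary, t, steps):
    """Euclidean sketch of WCGA with constant weakness t in (0, 1]."""
    residual = list(f)
    basis = []    # orthonormal basis of Phi_m = span of the chosen elements
    chosen = []   # indices of the chosen dictionary elements
    norms = [math.sqrt(dot(residual, residual))]
    for _ in range(steps):
        if norms[-1] < 1e-12:
            break
        # |F_r(g)| is proportional to |<r, g>| in a Hilbert space.
        scores = [abs(dot(residual, g)) for g in dictionary]
        threshold = t * max(scores)
        # Step 1): any element passing the weak test is admissible;
        # this sketch simply takes the first one.
        idx = next(i for i, s in enumerate(scores) if s >= threshold)
        chosen.append(idx)
        # Step 2): extend the orthonormal basis (Gram-Schmidt) and project f.
        v = list(dictionary[idx])
        for e in basis:
            c = dot(v, e)
            v = [a - c * b for a, b in zip(v, e)]
        nv = math.sqrt(dot(v, v))
        if nv > 1e-12:
            basis.append([a / nv for a in v])
        approx = [0.0] * len(f)
        for e in basis:
            c = dot(f, e)
            approx = [a + c * b for a, b in zip(approx, e)]
        # Step 3): new residual f_m = f - G_m.
        residual = [a - b for a, b in zip(f, approx)]
        norms.append(math.sqrt(dot(residual, residual)))
    return chosen, norms

# Toy data (hypothetical): the standard basis of R^3 plus one extra unit vector.
f = [3.0, 1.0, 2.0]
dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
              [2 ** -0.5, 2 ** -0.5, 0.0]]
chosen, norms = weak_orthogonal_greedy(f, dictionary, t=0.5, steps=3)
```

With `t = 0.5` the second step accepts the element of index 1 although index 2 has the larger score, which is exactly the freedom that the weak selection rule allows; the residual norms still do not increase.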
It is also known (see [52, p. 63]) that for any $X$ with $\dim X = \infty$,
$$\rho(u) \ge (1 + u^2)^{1/2} - 1$$
and, for every $X$ with $\dim X \ge 2$,
$$\rho(u) \ge Cu^2, \qquad C > 0.$$
This limits the power type moduli of smoothness of nontrivial Banach spaces to the case $1 \le q \le 2$. Denote by $A_1(D)$ the closure of the convex hull of $D$. The following theorem from [73] gives the rate of convergence of WCGA for $f$ in $A_1(D)$ (see Theorem 6.2.6 from Chapter 6).

Theorem 5.1.6. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Then for $t \in (0, 1]$ we have, for any $f \in A_1(D^{\pm})$,
$$\|f - G_m^{c,t}(f, D)\| \le C(q, \gamma)(1 + m t^p)^{-1/p}, \qquad p := \frac{q}{q-1},$$
with a constant $C(q, \gamma)$ which may depend only on $q$ and $\gamma$.

Theorem 6.2.4 guarantees that for any $f \in L_p$, $p \in (1, \infty)$, we have
$$\|f - G_m^{c,t}(f, \mathcal{RT})\|_p \to 0 \quad \text{as } m \to \infty.$$
Thus, for any $f \in L_p$, $p \in (1, \infty)$, WCGA provides a convergent sequence $\{G_m^{c,t}(f, \mathcal{RT})\}$ of approximants. In [77] we demonstrated the power of WCGA in classical areas of harmonic analysis. The problem concerns the trigonometric m-term approximation in the uniform norm. Let $\mathcal{RT}(N)$ be the subspace of real trigonometric polynomials of order $N$. The first result that indicated an advantage of m-term approximation over approximation by trigonometric polynomials of order $m$ is due to Ismagilov [34]:
$$\sigma_m(|\sin x|, \mathcal{T})_\infty \le C_\varepsilon m^{-6/5+\varepsilon} \quad \text{for any } \varepsilon > 0. \tag{5.1.6}$$
Maiorov [56] improved this estimate to
$$\sigma_m(|\sin x|, \mathcal{T})_\infty \asymp m^{-3/2}. \tag{5.1.7}$$
Both R. S. Ismagilov [34] and V. E. Maiorov [56] used constructive methods to get their estimates (5.1.6) and (5.1.7). Maiorov [56] applied a number-theoretical method based on Gaussian sums. The key point of that technique can be formulated in terms of best m-term approximation of trigonometric polynomials. Using the Gaussian sums one can prove (constructively) the estimate
$$\sigma_m(t, \mathcal{RT})_\infty \le C N^{3/2} m^{-1} \|t\|_1, \qquad t \in \mathcal{RT}(N). \tag{5.1.8}$$
5.2. Convergence. Conditions on Fourier coefficients
Denote, as above,
$$\Big\|a_0/2 + \sum_{k=1}^{N} (a_k \cos kx + b_k \sin kx)\Big\|_A := |a_0| + \sum_{k=1}^{N} (|a_k| + |b_k|).$$
We note that by the simple inequality
$$\|t\|_A \le (2N + 1)\|t\|_1, \qquad t \in \mathcal{RT}(N),$$
the estimate (5.1.8) follows from the estimate
$$\sigma_m(t, \mathcal{RT})_\infty \le C (N^{1/2}/m) \|t\|_A, \qquad t \in \mathcal{RT}(N). \tag{5.1.9}$$
Thus (5.1.9) is stronger than (5.1.8). The following estimate was proved in [15]:
$$\sigma_m(t, \mathcal{RT})_\infty \le C m^{-1/2} \big(\ln(1 + N/m)\big)^{1/2} \|t\|_A, \qquad t \in \mathcal{RT}(N). \tag{5.1.10}$$
In a way (5.1.10) is much stronger than (5.1.9) and (5.1.8). The proof of (5.1.10) from [15] is not constructive; it uses a nonconstructive theorem of Gluskin [29]. In [77] we gave a constructive proof of (5.1.10) (see Section 5.5), in which the key ingredient is WCGA. In [19] we already pointed out that WCGA provides a constructive proof of the following estimate (see Section 5.4):
$$\sigma_m(f, \mathcal{T})_p \le C(p) m^{-1/2} \|f\|_A, \qquad p \in [2, \infty). \tag{5.1.11}$$
The known proofs (before [19]) of (5.1.11) were nonconstructive (see the discussion in [19, Section 5]). We formulate here a result from [77] (see Theorem 5.5.2 below).

Theorem 5.1.7. There exists a constructive method $A(N, m)$ which provides for any $t \in \mathcal{RT}(N)$ an m-term trigonometric polynomial $A(N, m)(t)$ with the following approximation property:
$$\|t - A(N, m)(t)\|_\infty \le C m^{-1/2} \big(\ln(1 + N/m)\big)^{1/2} \|t\|_A$$
with an absolute constant $C$.
5.2 Convergence. Conditions on Fourier coefficients

The presentation of this section is based on [45].

5.2.1 Introduction

Here we study the following natural nonlinear method of summation of trigonometric Fourier series. Consider a periodic function $f \in L_p(\mathbb{T}^d)$, $1 \le p \le \infty$ ($L_\infty(\mathbb{T}^d) = C(\mathbb{T}^d)$), defined on the d-dimensional torus $\mathbb{T}^d$. Let $m \in \mathbb{N}$ and $t \in (0, 1]$ be given and let $\Lambda_m$ be a set of $k \in \mathbb{Z}^d$ with the properties
$$\min_{k \in \Lambda_m} |\hat f(k)| \ge t \max_{k \notin \Lambda_m} |\hat f(k)|, \qquad |\Lambda_m| = m, \tag{5.2.1}$$
where
$$\hat f(k) := (2\pi)^{-d} \int_{\mathbb{T}^d} f(x) e^{-i(k,x)}\, dx$$
is the k-th Fourier coefficient of $f$. We define
$$G_m^t(f) := G_m^t(f, \mathcal{T}) := S_{\Lambda_m}(f) := \sum_{k \in \Lambda_m} \hat f(k) e^{i(k,x)}$$
and call it an m-th weak greedy approximant of $f$ with respect to the trigonometric system $\mathcal{T} := \{e^{i(k,x)}\}_{k \in \mathbb{Z}^d}$. We write $G_m(f) = G_m^1(f)$ and call it an m-th greedy approximant. Clearly, an m-th weak greedy approximant and even an m-th greedy approximant may not be unique. In this chapter we do not impose any extra restrictions on $\Lambda_m$ in addition to (5.2.1). Thus theorems formulated below hold for any choice of $\Lambda_m$ satisfying (5.2.1) or, in other words, for any realization $G_m^t(f)$ of the weak greedy approximation.

There has recently been much interest (see the surveys [11], [81] and the book [82]) in approximation of functions by m-term approximants with regard to a basis (or minimal system). We will discuss in detail only results concerning the trigonometric system. T. W. Körner, answering a question raised by Carleson and Coifman, constructed in [48] a function from $L_2(\mathbb{T})$, and then in [49] a continuous function, such that $\{G_m(f, \mathcal{T})\}$ diverges almost everywhere. It has been proved in [68] for $p \ne 2$ and in [9] for $p < 2$ that there exists an $f \in L_p(\mathbb{T})$ such that $\{G_m(f, \mathcal{T})\}$ does not converge in $L_p$. It was remarked in [76] that the method from [68] gives a little more:
1) There exists a continuous function $f$ such that $\{G_m(f, \mathcal{T})\}$ does not converge in $L_p(\mathbb{T})$ for any $p > 2$ (see Theorem 2.2.9).
2) There exists a function $f$ that belongs to any $L_p(\mathbb{T})$, $p < 2$, such that $\{G_m(f, \mathcal{T})\}$ does not converge in measure (see Theorem 2.2.10).
Thus the above negative results show that the condition $f \in L_p(\mathbb{T}^d)$, $p \ne 2$, does not guarantee convergence of $\{G_m(f, \mathcal{T})\}$ in the $L_p$-norm. In this chapter we find an additional (to $f \in L_p$) condition on $f$ to guarantee that $\|f - G_m(f, \mathcal{T})\|_p \to 0$ as $m \to \infty$. In Subsection 5.2.2 we prove the following theorem.

Theorem 5.2.1. Let $f \in L_p(\mathbb{T}^d)$, $2 < p \le \infty$, and let $q > p' := p/(p-1)$. Assume that $f$ satisfies the condition
$$\sum_{|k| > n} |\hat f(k)|^q = o\big(n^{d(1 - q/p')}\big),$$
where $|k| := \max_{1 \le j \le d} |k_j|$. Then
$$\lim_{m \to \infty} \|f - G_m^t(f, \mathcal{T}^d)\|_p = 0.$$
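The selection rule (5.2.1) is easy to state on a finite table of coefficients. The sketch below is an illustration only, restricted to $d = 1$ and to a hypothetical dictionary `coeffs` mapping frequencies to Fourier coefficients; the helper names are not from the text. It checks whether a candidate set satisfies (5.2.1) for a given $t$, builds one realization of $\Lambda_m$ for $t = 1$, and evaluates the corresponding approximant.

```python
import cmath

def satisfies_weak_condition(coeffs, lam, t):
    """Check (5.2.1): min over lam of |c_k| >= t * max outside lam of |c_k|."""
    inside = [abs(coeffs[k]) for k in lam]
    outside = [abs(c) for k, c in coeffs.items() if k not in lam]
    if not inside or not outside:
        return True
    return min(inside) >= t * max(outside)

def greedy_set(coeffs, m):
    """One realization of Lambda_m for t = 1: the m largest coefficients."""
    order = sorted(coeffs, key=lambda k: -abs(coeffs[k]))
    return set(order[:m])

def approximant(coeffs, lam):
    """Return x -> sum over k in lam of c_k * exp(i k x)."""
    return lambda x: sum(coeffs[k] * cmath.exp(1j * k * x) for k in lam)

# Hypothetical coefficient table, for illustration only.
coeffs = {-2: 0.5, -1: 1.0, 0: 0.2, 1: 0.9, 2: 0.45}
lam_greedy = greedy_set(coeffs, 2)   # picks the frequencies -1 and 1
lam_weak = {1, -2}                   # admissible for t = 0.5, not for t = 0.6
value_at_zero = approximant(coeffs, lam_weak)(0.0)
```

The set `lam_weak` shows the non-uniqueness mentioned above: for $t < 1$ several different sets of frequencies may satisfy (5.2.1) for the same $m$.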
For $f \in L_1(\mathbb{T}^d)$ let $\{\hat f(k(l))\}_{l=1}^{\infty}$ denote the decreasing rearrangement of $\{\hat f(k)\}_{k \in \mathbb{Z}^d}$, i.e.,
$$|\hat f(k(1))| \ge |\hat f(k(2))| \ge \cdots. \tag{5.2.2}$$
Denote $a_n(f) := |\hat f(k(n))|$. In Subsection 5.2.3 we prove the following theorem.

Theorem 5.2.2. Let $2 < p < \infty$ and let a decreasing sequence $\{A_n\}_{n=1}^{\infty}$ satisfy the condition
$$A_n = o\big(n^{1/p - 1}\big) \quad \text{as } n \to \infty. \tag{5.2.3}$$
Then for any $f \in L_p(\mathbb{T}^d)$ with the property $a_n(f) \le A_n$, $n = 1, 2, \dots$, we have
$$\lim_{m \to \infty} \|f - G_m^t(f, \mathcal{T}^d)\|_p = 0. \tag{5.2.4}$$

We also prove in Subsection 5.2.3 that for any decreasing sequence $\{A_n\}$ satisfying
$$\limsup_{n \to \infty} A_n n^{1 - 1/p} > 0$$
there exists a function $f \in L_p$ such that $a_n(f) \le A_n$, $n = 1, 2, \dots$, and the sequence of greedy approximants $\{G_m(f)\}$ diverges in $L_p$. In Subsection 5.2.4 we prove a necessary and sufficient condition on the majorant $\{A_n\}$ to guarantee (under the assumption that $f$ is continuous) uniform convergence of greedy approximants to a function $f$.

Theorem 5.2.3. Let a decreasing sequence $\{A_n\}_{n=1}^{\infty}$ satisfy the condition $(A_\infty)$:
$$\sum_{M < n \le e^M} A_n = o(1) \quad \text{as } M \to \infty. \tag{5.2.5}$$
Then for any $f \in C(\mathbb{T})$ such that $a_n(f) \le A_n$, $n = 1, 2, \dots$, we have
$$\lim_{m \to \infty} \|f - G_m^t(f, \mathcal{T})\|_\infty = 0.$$

The condition $(A_\infty)$ is very close to the convergence of the series $\sum_n A_n$; if it holds, then
$$\sum_{n=1}^{N} A_n = o\big(\log^*(N)\big) \quad \text{as } N \to \infty, \tag{5.2.6}$$
where the function $\log^*(u)$ is defined to be bounded for $u \le 0$ and to satisfy $\log^*(u) = \log^*(\log u) + 1$ for $u > 0$. The function $\log^*(u)$ grows slower than any iterated logarithmic function. The condition $(A_\infty)$ in Theorem 5.2.3 is sharp.

Theorem 5.2.4. Assume that the decreasing sequence $\{A_n\}_{n=1}^{\infty}$ does not satisfy the condition $(A_\infty)$. Then there exists a function $f \in C(\mathbb{T})$ such that $a_n(f) \le A_n$, $n = 1, 2, \dots$, but
$$\limsup_{m \to \infty} \|f - G_m(f, \mathcal{T})\|_\infty > 0$$
for some realization $G_m(f, \mathcal{T})$.
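The recursive definition of $\log^*$ above can be unrolled into a loop. The following sketch assumes the value $0$ for $u \le 0$ (the text only requires boundedness there), so the choice of `base_value` is an assumption made for illustration.

```python
import math

def log_star(u, base_value=0.0):
    """Iterate u -> log(u) until the argument is <= 0 and count the steps.

    Implements log*(u) = log*(log u) + 1 for u > 0, with the assumed
    convention log*(u) = base_value for u <= 0.
    """
    count = 0
    while u > 0:
        u = math.log(u)
        count += 1
    return base_value + count

values = [log_star(u) for u in (0.0, 1.0, 2.0, 3.0, 20.0)]
```

Even for astronomically large arguments the returned value stays in single digits, which illustrates why (5.2.6) is only slightly weaker than summability of $\{A_n\}$.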
Theorems 5.2.3 and 5.2.4 are proved in Subsection 5.2.4. There we also prove the following result.

Theorem 5.2.5. Assume that the decreasing sequence $\{A_n\}_{n=1}^{\infty}$ is not summable. Then there exists a continuous function $f$ such that $a_n(f) \le A_n$ for all $n$ and the partial Fourier sums of $f$ diverge at some point.

We note (see Subsection 5.2.2) that sufficient conditions for convergence of greedy approximants in Theorem 5.2.1 for $p = \infty$ also imply the convergence of partial Fourier sums. Theorems 5.2.3 and 5.2.5 demonstrate that the conditions for convergence of greedy approximants in terms of decreasing rearrangement of Fourier coefficients of continuous functions are weaker than the ones for convergence of partial Fourier sums.
5.2.2 Sufficient conditions in terms of Fourier coefficients. Proof of Theorem 5.2.1

Let us begin this subsection with some historical remarks. The question of the rate of approximation of functions in certain smoothness classes by greedy approximants was discussed in [68]. In particular, the following function class was considered. For $0 < r < \infty$ and $0 < q \le \infty$, let $F$ denote the class of all functions $f \in L_1(\mathbb{T}^d)$ such that
$$|f|_F := \Big\| \big\{ |k|^r |\hat f(k)| \big\}_{k \in \mathbb{Z}^d} \Big\|_{l_q} \le 1, \qquad |\hat f(0)| \le 1.$$
Here we use the notation $|k| := \max\{|k_1|, \dots, |k_d|\}$. The following error estimates have been proved in [68] for
$$G_m(F)_p := \sup_{f \in F} \|f - G_m(f)\|_p.$$

Theorem 5.2.6. For any $0 < q < \infty$ and $r > d(1 - 1/q)_+$ we have
$$G_m(F)_p \asymp m^{-r/d - 1/q + 1/2}, \qquad 1 \le p \le 2, \tag{5.2.7}$$
$$G_m(F)_p \asymp m^{-r/d - 1/q + 1 - 1/p}, \qquad 2 \le p \le \infty. \tag{5.2.8}$$

It has also been noticed in [68] that the method used in the proof of Theorem 5.2.6 allows us to prove order estimates similar to (5.2.7) and (5.2.8) for somewhat wider classes than $F$. We define these classes now. It is easy to verify that for $f \in F$ we have, for each $l \ge 1$,
$$\Big( \sum_{2^{l-1} \le |k| < 2^l} |\hat f(k)|^q \Big)^{1/q} \le 2^{-r(l-1)}, \qquad |\hat f(0)| \le 1. \tag{5.2.9}$$
We use the relation (5.2.9) as a definition of a new class $DF$ ($D$ indicates here that restrictions are imposed on the dyadic blocks). Here is a remark from [68].
Remark 5.2.7. The relations (5.2.7) and (5.2.8) are valid when the class $F$ is replaced by $DF$.

For $r > 0$, $0 < q < \infty$, denote by $Fo_q^r$ the space of functions $f \in L_1(\mathbb{T}^d)$ satisfying the condition
$$\sum_{|k| > n} |\hat f(k)|^q = o\big(n^{-rq}\big). \tag{5.2.10}$$
We will now prove Theorem 5.2.1 from the Introduction to this section.

Theorem 5.2.8. Let $2 < p \le \infty$, $q > p' = p/(p-1)$. Assume $f \in L_p(\mathbb{T}^d) \cap Fo_q^r$, with $r = d(1/p' - 1/q)$. Then for any $0 < t \le 1$ we have
$$\|f - G_m^t(f)\|_p \to 0 \quad \text{as } m \to \infty.$$

Proof. First we note that (5.2.10) is equivalent to
$$\sum_{k \in U(l)} |\hat f(k)|^q = o\big(2^{-rlq}\big), \qquad l = 1, 2, \dots, \tag{5.2.11}$$
where $U(l) := \{k \in \mathbb{Z}^d : 2^{l-1} \le |k| < 2^l\}$. It has been proved in [68] (see relation (5.2.23)) that the estimates
$$\sum_{k \in U(l)} |\hat f(k)|^q \le 2^{-rlq}, \qquad l = 1, 2, \dots,$$
imply
$$a_m(f) = O\big(m^{-r/d - 1/q}\big).$$
In the same way one can prove that (5.2.11) implies that
$$a_m(f) = o\big(m^{-r/d - 1/q}\big). \tag{5.2.12}$$
Since $r = d(1/p' - 1/q)$, we get from (5.2.12) that $a_m(f) = o(m^{-1/p'})$. In the case $2 < p < \infty$ we can finish the proof of Theorem 5.2.8 by applying Theorem 5.2.2 from the Introduction. However, we choose to give an independent proof for the following two reasons: it is simpler than the proof of Theorem 5.2.2 (see Subsection 5.2.3), and it also covers the case $p = \infty$, where Theorem 5.2.2 does not hold (see Subsection 5.2.4). Let $G_m^t(f) = S_{\Lambda_m}(f)$, with $\Lambda_m$ satisfying (5.2.1). Consider first the case $2 < p < \infty$ and estimate $\|S_m^d(f) - S_{\Lambda_m}(f)\|_p$, where
$$S_m^d(f) := \sum_{k \in Q(m)} \hat f(k) e^{i(k,x)}, \qquad Q(m) := \{k : |k| \le m^{1/d}\}.$$
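Then we have
$$S_m^d(f) - S_{\Lambda_m}(f) = \sum_{k \in Q(m) \setminus \Lambda_m} \hat f(k) e^{i(k,x)} - \sum_{k \in \Lambda_m \setminus Q(m)} \hat f(k) e^{i(k,x)} =: \Sigma_1 - \Sigma_2. \tag{5.2.13}$$
From the definition of $\Lambda_m$ we get
$$a_{m+1}(f) \le \max_{k \notin \Lambda_m} |\hat f(k)| \le t^{-1} \min_{k \in \Lambda_m} |\hat f(k)| \le t^{-1} a_m(f). \tag{5.2.14}$$
Thus, by the Hausdorff–Young theorem (see Appendix, Theorem 7.3.1),
$$\|\Sigma_1\|_p \le \Big( \sum_{k \in Q(m) \setminus \Lambda_m} |\hat f(k)|^{p'} \Big)^{1/p'} = O\big(a_m(f) m^{1/p'}\big) = o(1).$$
Using the Hausdorff–Young theorem again and using the Hölder inequality with parameter $q/p'$ we get
$$\|\Sigma_2\|_p \le \Big( \sum_{k \in \Lambda_m \setminus Q(m)} |\hat f(k)|^{p'} \Big)^{1/p'} \le \Big( \sum_{k \in \Lambda_m \setminus Q(m)} |\hat f(k)|^{q} \Big)^{1/q} m^{1/p' - 1/q} \le \Big( \sum_{k \notin Q(m)} |\hat f(k)|^{q} \Big)^{1/q} m^{1/p' - 1/q} = o(1). \tag{5.2.15}$$
It remains to remark that $\|f - S_m^d(f)\|_p \to 0$ as $m \to \infty$.

Let us now consider the case $p = \infty$. We remark that the relation (5.2.11) with $r = d(1 - 1/q)$ and the Hölder inequality imply
$$\sum_{n \le |k| < 2n} |\hat f(k)| = o(1). \tag{5.2.16}$$
First, observe that the cubic Fourier sums $S_n(f)$ converge uniformly to $f$ as $n \to \infty$. Indeed, let us consider the de la Vallée Poussin sums
$$V_n(f) = \sum_{|k| \le 2n} \prod_{j=1}^{d} \min\Big(1, \frac{2n - |k_j|}{n}\Big) \hat f(k) e^{i(k,x)}.$$
It is known (see [1]) that, for any $f \in C(\mathbb{T}^d)$,
$$\|V_n(f) - f\|_\infty = o(1) \qquad (n \to \infty). \tag{5.2.17}$$
Further,
$$\|S_n(f) - V_n(f)\|_\infty \le \sum_{k \in \mathbb{Z}^d,\ n < |k| \le 2n} |\hat f(k)|,$$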
and, by (5.2.16),
$$\|S_n(f) - V_n(f)\|_\infty = o(1) \qquad (n \to \infty). \tag{5.2.18}$$
The relations (5.2.17) and (5.2.18) imply
$$\|S_n(f) - f\|_\infty = o(1) \qquad (n \to \infty). \tag{5.2.19}$$
Thus, we established the uniform convergence of $S_n(f)$ to $f$. The rest of the proof is similar to the above case $2 < p < \infty$, with the only difference that instead of the Hausdorff–Young theorem we use the inequality
$$\|f\|_\infty \le \sum_{k} |\hat f(k)|.$$
Theorem 5.2.8 is proved.
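The comparison of $S_n(f)$ with the de la Vallée Poussin sum $V_n(f)$ in the proof above rests on the fact that their Fourier multipliers differ only on frequencies with $n < |k| \le 2n$. A small sketch for $d = 1$ makes this visible; the helper names are hypothetical, and the multiplier is extended by $0$ for $|k| > 2n$, which matches the range of summation in the definition of $V_n$.

```python
def vp_multiplier(k, n):
    """Multiplier of V_n at frequency k: min(1, (2n - |k|)/n), zero beyond 2n."""
    if abs(k) > 2 * n:
        return 0.0
    return min(1.0, (2 * n - abs(k)) / n)

def partial_sum_multiplier(k, n):
    """Multiplier of the partial sum S_n: 1 for |k| <= n, else 0."""
    return 1.0 if abs(k) <= n else 0.0

def differing_frequencies(n):
    """Frequencies where V_n and S_n use different multipliers."""
    return {k for k in range(-3 * n, 3 * n + 1)
            if vp_multiplier(k, n) != partial_sum_multiplier(k, n)}

diff = differing_frequencies(4)
```

Since every multiplier lies in $[0, 1]$, the difference $S_n(f) - V_n(f)$ is bounded in the uniform norm by the sum of $|\hat f(k)|$ over exactly these frequencies.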
Let us next discuss the possibility of improving the assumption that $f \in L_p(\mathbb{T}^d) \cap Fo_q^r$, $r = d(1/p' - 1/q)$, in Theorem 5.2.8.

Proposition 5.2.9. For each $2 < p \le \infty$ there exists $f \in L_p(\mathbb{T}^d)$ such that
$$|\hat f(k)| = O\big(|k|^{-d(1 - 1/p)}\big) \tag{5.2.20}$$
(and, therefore, $f \in DF$, $r = d(1/p' - 1/q)$) and the sequence $\{G_m(f)\}$ diverges in $L_p$.

Proof. We will use a construction from [68]. We use the Rudin–Shapiro polynomials (see Appendix, Section 7.4):
$$R_N(x) = \sum_{|k| \le N} \varepsilon_k e^{ikx}, \qquad \varepsilon_k = \pm 1, \quad x \in \mathbb{T}, \tag{5.2.21}$$
which satisfy the estimate
$$\|R_N\|_\infty \le C N^{1/2} \tag{5.2.22}$$
for an absolute constant $C$. For $s = \pm 1$, denote
$$\Lambda_s := \{k : \hat R_m(k) = s\}.$$
The estimate (5.2.22) implies that
$$\big| |\Lambda_1| - |\Lambda_{-1}| \big| = |R_m(0)| \le C m^{1/2}. \tag{5.2.23}$$
Let $s = \pm 1$ be such that $|\Lambda_s| > |\Lambda_{-s}|$. Take a small positive parameter $\delta$ and consider the function
$$f_{m,\delta} := R_m + s\delta D_m, \tag{5.2.24}$$
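where
$$D_m(x) := \sum_{|k| \le m} e^{ikx}, \qquad x \in \mathbb{T},$$
is the Dirichlet kernel. Then since $|\hat f_{m,\delta}(k)| = 1 + \delta$ for $k \in \Lambda_s$ and $|\hat f_{m,\delta}(k)| = 1 - \delta$ for $k \in \Lambda_{-s}$ and $|\Lambda_s| \ge m$, the frequencies of $G_m(f_{m,\delta})$ will be in $\Lambda_s$ and
$$\|G_m(f_{m,\delta})\|_\infty \ge |G_m(f_{m,\delta})(0)| = (1 + \delta)m. \tag{5.2.25}$$
Next,
$$\|f_{m,\delta}\|_p \le \|R_m\|_p + \delta \|D_m\|_p \le \|R_m\|_\infty + \delta \|D_m\|_2^{2/p} \|D_m\|_\infty^{1 - 2/p} \le C m^{1/2} + \delta (2m + 1)^{1 - 1/p} \le C_1 m^{1/2} \tag{5.2.26}$$
for $\delta \le m^{1/p - 1/2}$. By the Nikol'skii inequality for trigonometric polynomials (see Appendix, Theorem 7.5.4), (5.2.25) implies that
$$\|G_m(f_{m,\delta})\|_p \ge C_2 m^{-1/p} \|G_m(f_{m,\delta})\|_\infty \ge C_2 m^{1 - 1/p}. \tag{5.2.27}$$
Define now
$$f_{m,\delta}^d(x) := \prod_{j=1}^{d} f_{m,\delta}(x_j) e^{i(4m)x_j}$$
and
$$f := \sum_{l=1}^{\infty} 2^{-d(1 - 1/p)l} f_{2^l, \delta_l}^d(x), \qquad 0 < \delta_l < 2^{-dl - 3}.$$
The relation (5.2.20) is obviously satisfied. Moreover, (5.2.26) yields
$$\|f - V_{2^n}(f)\|_\infty = O\big(2^{-d(1/2 - 1/p)n}\big). \tag{5.2.28}$$
However, (5.2.27) implies that $\{G_m(f)\}$ diverges in $L_p$.

Let us make some more comments. For a given set $\Lambda$ denote
$$E_\Lambda(f)_p := \inf_{c_k,\, k \in \Lambda} \Big\| f - \sum_{k \in \Lambda} c_k e^{i(k,x)} \Big\|_p.$$

Remark 5.2.10. Theorem 5.2.8 implies that if $f \in L_p$, $2 < p \le \infty$, and
$$E_{Q(n)}(f)_2 = o\big(n^{-(1/2 - 1/p)}\big) \tag{5.2.29}$$
then $G_m^t(f) \to f$ in $L_p$. Indeed, (5.2.29) is equivalent to $f \in Fo_2^r$ with $r = d(1/2 - 1/p)$.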
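The counting step behind (5.2.25) in the proof of Proposition 5.2.9 uses only the signs $\varepsilon_k$, not the size of $\|R_m\|_\infty$. The sketch below illustrates it on a concrete sign pattern. As an assumption made purely for illustration, the signs are taken from the classical one-sided Rudin–Shapiro sequence (parity of the number of overlapping blocks "11" in the binary expansion), reindexed to $|k| \le m$; the Appendix construction referred to in the text may differ.

```python
def rudin_shapiro_sign(n):
    """Classical Rudin-Shapiro sign: parity of overlapping '11' blocks in binary n."""
    return -1 if bin(n & (n >> 1)).count("1") % 2 else 1

def build_f(m, delta):
    """Coefficients of f_{m,delta} = R_m + s*delta*D_m for |k| <= m (assumed indexing)."""
    eps = {k: rudin_shapiro_sign(k + m) for k in range(-m, m + 1)}
    plus = sum(1 for e in eps.values() if e == 1)
    s = 1 if plus > (2 * m + 1) - plus else -1   # sign of the larger class Lambda_s
    coeffs = {k: eps[k] + s * delta for k in eps}
    return coeffs, eps, s

def greedy_value_at_zero(coeffs, m):
    """Frequencies of G_m and the value G_m(0), i.e. the sum of chosen coefficients."""
    order = sorted(coeffs, key=lambda k: -abs(coeffs[k]))
    chosen = order[:m]
    return chosen, sum(coeffs[k] for k in chosen)

m, delta = 8, 0.01
coeffs, eps, s = build_f(m, delta)
chosen, value = greedy_value_at_zero(coeffs, m)
```

All `m` chosen frequencies lie in $\Lambda_s$, so the coefficients add up with the same sign and $|G_m(f_{m,\delta})(0)| = (1 + \delta)m$, whereas $\|f_{m,\delta}\|_\infty$ is only of order $m^{1/2}$ by (5.2.26).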
Remark 5.2.11. The proof of Proposition 5.2.9 (see (5.2.28)) implies that there is $f \in L_p(\mathbb{T}^d)$ such that
$$E_{Q(n)}(f)_\infty = O\big(n^{1/p - 1/2}\big)$$
and $\{G_m(f)\}$ diverges in $L_p$, $2 < p \le \infty$.

Remark 5.2.12. There exists a continuous function $f$, satisfying (5.2.16), such that $\{G_m(f)\}$ diverges in the uniform norm.

Proof. We construct an example in the univariate case. Define
$$f := \sum_{k \ge 2} b_k \quad \text{with} \quad b_k := s_k^{-1/2} \sum_{l=1}^{s_k} 2^{-s_k} f_{2^{s_k}, \delta_{s_k}} e^{i 4^{s_k + l} x},$$
where $\{s_k\}$ is an increasing sequence such that all frequencies of $b_{k+1}$ lie to the right of frequencies of $b_k$. Then by (5.2.26) we get
$$\|b_k\|_\infty \le C_1 s_k^{1/2} 2^{-s_k/2}$$
and, therefore, $f \in C(\mathbb{T})$. The relation (5.2.16) is also satisfied. It is clear that
$$\max_m \|G_m(b_k)\|_\infty \ge s_k^{1/2}.$$
This implies the divergence of $\{G_m(f)\}$.
Remark 5.2.13. The construction in the proof of Proposition 5.2.9 can be used for proving that the convergence set for greedy approximation $\{G_m\}_{m=1}^{\infty}$ is not linear in $L_p$, $2 < p \le \infty$.

Proof. Consider
$$g_{m,\delta} := \sum_{|k| \le m} \hat R_m(k) \big(1 - \delta |k|/m\big) e^{ikx}$$
and $h_{m,\delta} := f_{m,\delta} - g_{m,\delta}$, where $f_{m,\delta}$ is defined by (5.2.24). Similarly to the definition of $f$ ($d = 1$),
$$f := \sum_{l=1}^{\infty} 2^{-(1 - 1/p)l} f_{2^l, \delta_l} e^{i 2^{l+2} x}, \qquad 0 < \delta_l < 2^{-l-3},$$
we define
$$g := \sum_{l=1}^{\infty} 2^{-(1 - 1/p)l} g_{2^l, \delta_l} e^{i 2^{l+2} x}, \qquad h := \sum_{l=1}^{\infty} 2^{-(1 - 1/p)l} h_{2^l, \delta_l} e^{i 2^{l+2} x}, \qquad 0 < \delta_l < 2^{-l-3}.$$
Thus $f = g + h$. It has been proved in Proposition 5.2.9 that the sequence $\{G_m(f)\}_{m=1}^{\infty}$ diverges in $L_p$. However, it is easy to check that the sequences $\{G_m(g)\}_{m=1}^{\infty}$ and $\{G_m(h)\}_{m=1}^{\infty}$ converge uniformly. Indeed, $h$ has an absolutely convergent Fourier series and $G_m(g) = S_N(g)$ with some $N$ (the greedy ordering for $g$ coincides with the natural ordering). Then the uniform convergence of $\{G_m(g)\}_{m=1}^{\infty}$ follows from (5.2.28).

Remark 5.2.7 can also be obtained from some general inequalities for $\|f - G_m(f)\|_p$. We recall the definition of the best m-term approximation:
$$\sigma_m(f)_p := \inf_{k_j \in \mathbb{Z}^d,\, c_j} \Big\| f - \sum_{j=1}^{m} c_j e^{i(k_j, x)} \Big\|_p.$$
It was proved in [68] (see Theorem 2.2.1 from Chapter 2) that for any $f \in L_p(\mathbb{T}^d)$ one has
$$\|f - G_m(f)\|_p \le \big(1 + 3m^{h(p)}\big) \sigma_m(f)_p, \qquad 1 \le p \le \infty,$$
where $h(p) := |1/2 - 1/p|$. Similarly to the above inequality one can prove the following relation.

Theorem 5.2.14. For each $f \in L_p(\mathbb{T}^d)$ and any $0 < t \le 1$ we have
$$\|f - G_m^t(f)\|_p \le \big(1 + (2 + 1/t) m^{h(p)}\big) \sigma_m(f)_p, \qquad 1 \le p \le \infty,$$
where $h(p) := |1/2 - 1/p|$.

Proof. This proof repeats the proof of Theorem 2.2.1, which corresponds to the case $t = 1$, with one minor change. Let
$$G_m^t(f) = \sum_{k \in \Lambda'(t)} \hat f(k) e^{i(k,x)}, \qquad |\Lambda'(t)| = m, \qquad \Lambda' := \Lambda'(1).$$
Then the change needed in the proof from Chapter 2 ($t = 1$) to adjust it for $t < 1$ is the following. Instead of the obvious relation (see (2.2.10) from Chapter 2), valid for any $\Lambda$ with $|\Lambda| = m$,
$$\|S_{\Lambda \setminus \Lambda'}(f)\|_2 \le \|S_{\Lambda' \setminus \Lambda}(f)\|_2,$$
we use the inequality ($\Lambda$ is arbitrary with $|\Lambda| = m$)
$$\|S_{\Lambda \setminus \Lambda'(t)}(f)\|_2 \le t^{-1} \|S_{\Lambda'(t) \setminus \Lambda}(f)\|_2, \tag{5.2.30}$$
which follows easily from the definition of $\Lambda'(t)$.

We will prove one more inequality.
Proposition 5.2.15. Let $2 \le p \le \infty$. Then for any $f \in L_p(\mathbb{T}^d)$ and any $Q$, $|Q| \le m$, we have
$$\|f - G_m^t(f)\|_p \le \|f - S_Q(f)\|_p + (3 + 1/t)(2m)^{h(p)} E_Q(f)_2.$$

Proof. Let, as above,
$$G_m^t(f) = \sum_{k \in \Lambda'(t)} \hat f(k) e^{i(k,x)}.$$
Then
$$\|f - G_m^t(f)\|_p \le \|f - S_Q(f)\|_p + \|S_Q(f) - S_{\Lambda'(t)}(f)\|_p \tag{5.2.31}$$
and, by Lemma 2.2.3,
$$\|S_Q(f) - S_{\Lambda'(t)}(f)\|_p \le (2m)^{h(p)} \|S_Q(f) - S_{\Lambda'(t)}(f)\|_2. \tag{5.2.32}$$
Next,
$$\|S_Q(f) - S_{\Lambda'(t)}(f)\|_2 \le \|f - S_Q(f)\|_2 + \|f - S_{\Lambda'(t)}(f)\|_2. \tag{5.2.33}$$
Using (5.2.30) with $\Lambda = \Lambda'$ we get
$$\|S_{\Lambda'(t)}(f) - S_{\Lambda'}(f)\|_2^2 = \|S_{\Lambda'(t) \setminus \Lambda'}(f)\|_2^2 + \|S_{\Lambda' \setminus \Lambda'(t)}(f)\|_2^2 \le (1 + t^{-2}) \|S_{\Lambda'(t) \setminus \Lambda'}(f)\|_2^2 \le (1 + t^{-2}) \sigma_m(f)_2^2.$$
Therefore,
$$\|f - S_{\Lambda'(t)}(f)\|_2 \le \|f - S_{\Lambda'}(f)\|_2 + \|S_{\Lambda'(t)}(f) - S_{\Lambda'}(f)\|_2 \le (2 + 1/t) \sigma_m(f)_2 \le (2 + 1/t) E_Q(f)_2. \tag{5.2.34}$$
Combining (5.2.31)–(5.2.34) we complete the proof of Proposition 5.2.15.
5.2.3 Sufficient conditions in terms of the decreasing rearrangement of Fourier coefficients. Proof of Theorem 5.2.2

Let us begin with the proof of Theorem 5.2.2. We reformulate it here for convenience.

Theorem 5.2.16. Let $2 < p < \infty$ and let the decreasing sequence $\{A_n\}_{n=1}^{\infty}$ satisfy the condition
$$A_n = o\big(n^{1/p - 1}\big) \quad \text{as } n \to \infty. \tag{5.2.35}$$
Then for any $f \in L_p(\mathbb{T}^d)$ such that $a_n(f) \le A_n$, $n = 1, 2, \dots$,
$$\lim_{m \to \infty} \|f - G_m^t(f, \mathcal{T})\|_p = 0. \tag{5.2.36}$$

Proof. By the Riesz theorem (see [41, Chapter 4, Section 3]) we have, for any $f \in L_p(\mathbb{T}^d)$, $1 < p < \infty$,
$$\|f - S_N(f)\|_p \to 0 \quad \text{as } N \to \infty. \tag{5.2.37}$$
We will consider first the case $t = 1$. Let us estimate $\|S_m^d(f) - G_m(f)\|_p$, where $S_m^d(f)$ are defined in the proof of Theorem 5.2.8. Denote $\Sigma_1 := S_m^d(f - G_m(f))$ and $\Sigma_2 := (\mathrm{Id} - S_m^d)(G_m(f))$. Then we have
$$S_m^d(f) - G_m(f) = S_m^d(f) - S_m^d(G_m(f)) - (\mathrm{Id} - S_m^d)(G_m(f)) = \Sigma_1 - \Sigma_2.$$
For the first sum we get, by the Paley theorem (see [86, Chapter 12, Section 5]),
$$\|\Sigma_1\|_p \le C(p, d) \Big( \sum_{n=1}^{(2m^{1/d} + 1)^d} a_m(f)^p n^{p-2} \Big)^{1/p} = O\big(a_m(f) m^{1 - 1/p}\big) = o(1). \tag{5.2.38}$$
We now proceed to the second sum $\Sigma_2$. We first prove a general inequality.

Proposition 5.2.17. Let $2 \le p < \infty$ and $u \in L_p$, $\|u\|_p \ne 0$. Then for any $v \in L_p$ we have
$$\|u\|_p \le \|u + v\|_p + \big(\|u\|_{2p-2}/\|u\|_p\big)^{p-1} \|v\|_2.$$

Proof. Denote
$$F := \|u\|_p^{1-p}\, \bar u\, |u|^{p-2}.$$
Then
$$\|F\|_{p'} = 1 \quad \text{and} \quad \langle F, u \rangle = \|u\|_p.$$
Therefore,
$$\|u\|_p = \langle F, u \rangle = \langle F, u + v \rangle - \langle F, v \rangle \le \|u + v\|_p + \|F\|_2 \|v\|_2.$$
It remains to observe that
$$\|F\|_2 = \big(\|u\|_{2p-2}/\|u\|_p\big)^{p-1}.$$

Lemma 5.2.18. Let $2 < p < \infty$. For $f \in L_p(\mathbb{T}^d)$ assume that $a_n(f) = o(n^{1/p - 1})$. Then
$$\|(\mathrm{Id} - S_m^d)(G_m(f))\|_p = o(1).$$
Proof. We use Proposition 5.2.17 with
$$u := (\mathrm{Id} - S_m^d)(G_m(f)); \qquad v := f - S_m^d(f) - u.$$
Then
$$\|v\|_2 \le \|f - G_m(f)\|_2 \le \Big( \sum_{n > m} a_n(f)^2 \Big)^{1/2} = o\big(m^{1/p - 1/2}\big). \tag{5.2.39}$$
By the Paley theorem,
$$\|u\|_{2p-2}^{p-1} = O\Big( \Big( \sum_{n=1}^{m} a_n(f)^{2p-2} n^{2p-4} \Big)^{1/2} \Big) = o\big(m^{1/2 - 1/p}\big). \tag{5.2.40}$$
Combining (5.2.39) and (5.2.40) and taking into account that $\|u + v\|_p = \|f - S_m^d(f)\|_p = o(1)$ we get, by Proposition 5.2.17, that $\|u\|_p = o(1)$. Lemma 5.2.18 is now proved.

The required estimate $\|\Sigma_2\|_p = o(1)$ follows from Lemma 5.2.18. This together with (5.2.38) completes the proof of Theorem 5.2.16 in the case $t = 1$. The general case $0 < t \le 1$ follows from the case $t = 1$ and Lemma 5.2.19 below.

Lemma 5.2.19. Let $2 \le p < \infty$, $t \in (0, 1]$, and $f \in L_p(\mathbb{T}^d)$ be such that $a_n(f) = o(n^{1/p - 1})$. Then
$$\|G_m(f) - G_m^t(f)\|_p \to 0 \quad \text{as } m \to \infty.$$

Proof. Let $G_m(f) = S_\Lambda(f)$ and $G_m^t(f) = S_{\Lambda(t)}(f)$. Then
$$g_m := G_m(f) - G_m^t(f) = \sum_{k \in \Lambda \setminus \Lambda(t)} \hat f(k) e^{i(k,x)} - \sum_{k \in \Lambda(t) \setminus \Lambda} \hat f(k) e^{i(k,x)}.$$
It is clear that
$$|\hat f(k)| \le a_m(f), \qquad k \in \Lambda(t) \setminus \Lambda.$$
The relation (5.2.14) implies
$$|\hat f(k)| \le t^{-1} a_m(f), \qquad k \in \Lambda \setminus \Lambda(t).$$
Thus, for the Fourier coefficients of the function $g_m$ we have $|\hat g_m(k)| \le t^{-1} a_m(f)$. Taking into account that $g_m$ has at most $2m$ terms we get from the Paley theorem that
$$\|g_m\|_p = O\big(a_m(f) m^{1 - 1/p}\big) = o(1).$$
This proves the lemma.
Let us note that by the Hausdorff–Young theorem the condition
$$\sum_{n=1}^{\infty} A_n^{p'} < \infty, \qquad 2 < p < \infty,$$
which is stronger than (5.2.35), implies that for any $f$ such that $a_n(f) \le A_n$ its Fourier series converges in $L_p$ unconditionally.

Proposition 5.2.20. Suppose that the decreasing sequence $\{A_n\}_{n=1}^{\infty}$ does not satisfy the condition (5.2.35) of Theorem 5.2.16, i.e.,
$$\limsup_{n \to \infty} A_n n^{1 - 1/p} > 0.$$
Then there is a continuous function $f \in C(\mathbb{T})$ such that $a_n(f) \le A_n$, $n = 1, 2, \dots$, but $\{G_m(f)\}$ diverges in the $L_p$-norm, $2 < p < \infty$.

Proof. We will use functions constructed in the proof of Proposition 5.2.9. Let the number $c > 0$ and the sequence $\{n_k\}$ be such that
$$A_{n_k} \ge c\, n_k^{1/p - 1}, \qquad n_k \ge 4 n_{k-1}, \qquad n_1 \ge 4.$$
Define $m_k := [n_k/4]$ and
$$f := c \sum_{k=1}^{\infty} n_k^{1/p - 1} f_{m_k, \delta_k} e^{i n_k x},$$
where $f_{m,\delta}$ are defined by (5.2.24). Then $f$ is a continuous function satisfying the property $a_n(f) \le A_n$. The divergence of $\{G_m(f)\}$ follows from (5.2.27).
5.2.4 Convergence in the uniform norm. Proof of Theorems 5.2.3–5.2.5

We begin with Theorem 5.2.3. We reformulate it here for convenience.

Theorem 5.2.21. Let the decreasing sequence $\{A_n\}_{n=1}^{\infty}$ satisfy the condition $(A_\infty)$:
$$\sum_{M < n \le e^M} A_n = o(1) \quad \text{as } M \to \infty. \tag{5.2.41}$$
Then for any $f \in C(\mathbb{T})$ with the property $a_n(f) \le A_n$, $n = 1, 2, \dots$, we have
$$\lim_{m \to \infty} \|f - S_{\Lambda_m}(f)\|_\infty = 0, \tag{5.2.42}$$
where $\Lambda_m$ is an arbitrary subset of $\mathbb{Z}$ satisfying
$$|\Lambda_m| = m, \tag{5.2.43}$$
$$\min_{k \in \Lambda_m} |\hat f(k)| \ge t \max_{k \notin \Lambda_m} |\hat f(k)|. \tag{5.2.44}$$
Proof. Denote, as above,
$$G_m(f) = \sum_{n=1}^{m} \hat f(k(n)) e^{i k(n) x}.$$
Note that if $k \ne k(n)$ for all $n \le m$, then $|\hat f(k)| \le a_m(f)$. Also, by (5.2.44), if $k \notin \Lambda_m$, then $|\hat f(k)| \le a_m(f)/t$. Therefore,
$$\|S_{\Lambda_m}(f) - G_m(f)\|_\infty \le m a_m(f) + m a_m(f)/t. \tag{5.2.45}$$
It is clear that (5.2.41) implies $A_n = o(n^{-1})$, and therefore,
$$a_m(f) m = o(1). \tag{5.2.46}$$
Relations (5.2.45) and (5.2.46) give
$$\|S_{\Lambda_m}(f) - G_m(f)\|_\infty = o(1). \tag{5.2.47}$$
Let us estimate $\|V_m(f) - G_m(f)\|_\infty$, where $V_m(f)$ is the de la Vallée Poussin sum
$$V_m(f) = \sum_{|k| \le 2m} \min\Big(1, \frac{2m - |k|}{m}\Big) \hat f(k) e^{ikx}.$$
We have $V_m(f) - G_m(f) = \Sigma_1 - \Sigma_2$, where
$$\Sigma_1 = V_m\big(f - G_m(f)\big), \qquad \Sigma_2 = (\mathrm{Id} - V_m)\big(G_m(f)\big).$$
For the first sum we get
$$\|\Sigma_1\|_\infty \le \sum_{n=1}^{4m-1} a_m(f) \le 4m a_m(f).$$
Therefore, by (5.2.46), $\|\Sigma_1\|_\infty = o(1)$. We proceed to the second sum $\Sigma_2$. Let us consider
$$f - V_m(f) - \Sigma_2 = \sum_{m < n \le e^{e^m}} \lambda_n \hat f(k(n)) e^{i k(n) x} + g =: \Sigma_4 + g, \tag{5.2.48}$$
where $0 \le \lambda_n \le 1$. Using (5.2.17) and the assumption $(A_\infty)$ we get from (5.2.48)
$$\|\Sigma_2 + g\|_\infty \le \|f - V_m(f)\|_\infty + \|\Sigma_4\|_\infty = o(1). \tag{5.2.49}$$
Next we have
$$\|g\|_2 \le \Big( \sum_{n > e^{e^m}} a_n(f)^2 \Big)^{1/2} = o\big(e^{-e^m/2}\big). \tag{5.2.50}$$
We need the following lemma that we will prove a little later.
Lemma 5.2.22. Let the function $f$, $\|f\|_\infty = 1$, have the form
$$f = \sum_{k \in \Lambda} \hat f(k) e^{ikx}, \qquad |\Lambda| \le m.$$
Then for any function $g$ such that $\|g\|_2 \le \tfrac{1}{4}(4\pi m)^{-m/2}$ it holds that
$$\|f + g\|_\infty \ge 1/4.$$

This lemma and (5.2.49) imply that $\|\Sigma_2\|_\infty = o(1)$. Together with (5.2.47) this completes the proof of Theorem 5.2.21.

Proof. We now prove Lemma 5.2.22. Denote by $\|u\|$ the distance from the real number $u$ to the closest integer. Denote, for a fixed $j \in \mathbb{N}$,
$$F_j = \big\{x \in \mathbb{T} : \forall k \in \Lambda,\ \|jkx/(2\pi)\| < 1/(4\pi m)\big\}, \qquad F = F_1.$$
Well-known estimates for simultaneous diophantine approximation (see [7, p. 13]) give
$$\mathbb{T} = \bigcup_{j \le J} F_j, \qquad J = (4\pi m)^m.$$
Note that $\mu F_j = \mu F$ for all $j$. Therefore,
$$1 \le \sum_{j \le J} \mu F_j \le J \mu F,$$
or
$$\mu F \ge (4\pi m)^{-m}. \tag{5.2.51}$$
Let $|f(x_0)| = \|f\|_\infty = 1$, $E = \{x_0 + y : y \in F\}$. For $x = x_0 + y \in E$ and $k \in \Lambda$ we have
$$|e^{ikx} - e^{ikx_0}| \le 2\pi \|ky/(2\pi)\| < 1/(2m).$$
Therefore,
$$|f(x) - f(x_0)| \le \sum_{k \in \Lambda} |\hat f(k)|\, |e^{ikx} - e^{ikx_0}| \le \sum_{k \in \Lambda} (1/2m) \le 1/2.$$
Thus, $|f(x)| \ge 1/2$ for $x \in E$. Suppose that
$$\|f + g\|_\infty < 1/4. \tag{5.2.52}$$
Then $|g(x)| > 1/4$ for $x \in E$, and, by (5.2.51),
$$\|g\|_2^2 \ge \int_E |g(x)|^2\, d\mu > \Big(\frac{1}{4}\Big)^2 (4\pi m)^{-m}.$$
This contradicts the condition of the lemma. Hence (5.2.52) is not true and the proof is complete.
Remark 5.2.23. Actually, in the proof of Lemma 5.2.22 we have shown the following. If $|\Lambda| \le m$,
$$f = \sum_{k \in \Lambda} \hat f(k) e^{ikx},$$
$G \subset \mathbb{T}$, $\mu G > 1 - (4\pi m)^{-m}$, then
$$\|f\|_\infty \le 2 \sup_{x \in G} |f(x)|.$$
S. Konyagin and F. Nazarov (see [45]) have proved that the last inequality holds under the assumption that $\mu G > 1 - c^m$ for a small constant $c$. This can be used to weaken the assumption on $\|g\|_2$ in Lemma 5.2.22. However, it does not affect Theorem 5.2.21.

We proceed to the proof of Theorem 5.2.4 from the Introduction. The core part of the proof is the following lemma.

Lemma 5.2.24. Fix $\Delta > 0$, $\delta > 0$. Let positive integers $m \to \infty$ and $M \to \infty$ be such that
$$\log M = o(m). \tag{5.2.53}$$
Let $m_1 = m$, $m_3 = m + M$, $m_1 < m_2 < m_3$. Let the decreasing sequence $\{A_n\}_{n=1}^{\infty}$ satisfy the conditions
$$A_n \le \Delta/n, \tag{5.2.54}$$
$$\sum_{n = m_1 + 1}^{m_2} A_n = \sum_{n = m_2 + 1}^{m_3} A_n = 1, \tag{5.2.55}$$
$$A_{2m} > \delta A_m. \tag{5.2.56}$$
Then for sufficiently large $m$ there exists a trigonometric polynomial
$$T(x) = T_m(x) = \sum_{k=1}^{M} \hat T(k) e^{ikx}$$
such that
$$a_k(T) \le A_{m+k} \qquad (1 \le k \le M), \tag{5.2.57}$$
$$\|T\|_\infty \to 0 \qquad (m \to \infty), \tag{5.2.58}$$
$$\max_n |G_n(T, \mathcal{T})(0)| > 0.01. \tag{5.2.59}$$

Proof. Take independent random variables $\eta_k$ ($1 \le k \le M$) so that each $\eta_k$ is equal to any $n$, $m_1 < n \le m_3$, with probability $1/(10M)$, and is equal to $m_1$ with probability 0.9. The polynomial $T$ is defined as
$$T(x) = \sum_{k=1}^{M} \sigma_{\eta_k} A_{\eta_k} e^{ikx},$$
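where $\sigma_{m_1} = 0$, $\sigma_n = 1$ for $m_1 < n \le m_2$, $\sigma_n = -1$ for $m_2 < n \le m_3$. We prove that $T$ satisfies conditions (5.2.57)–(5.2.59) with a large probability. Probability, expectation, and variance will be denoted by $\mathsf{P}$, $\mathsf{E}$, and $\mathsf{V}$, respectively. We will estimate the probabilities of the following events:
$$E_1 : \exists\, l \ge 1 : \big|\{k : m_1 < \eta_k \le m_1 + l\}\big| > l,$$
$$E_2 : \|T\|_\infty > 3\big(A_m \log(2\pi M^2)\big)^{1/2},$$
$$E_3 : \sum_{k :\, m_1 < \eta_k \le m_2} A_{\eta_k} \le 0.05.$$
Note that nonfulfillment of the events $E_1$, $E_2$, $E_3$ implies (5.2.57), (5.2.58), (5.2.59), respectively. In the case of $E_2$ and (5.2.58) we use (5.2.53) and (5.2.54) to prove that $A_m \log(2\pi M^2) = o(1)$. Consider the event
$$E_{1,l} : \big|\{k : m_1 < \eta_k \le m_1 + l\}\big| > l.$$
We have
$$\mathsf{P}(E_1) \le \sum_{l} \mathsf{P}(E_{1,l}). \tag{5.2.60}$$
Furthermore,
$$\mathsf{P}(E_{1,l}) = \sum_{j = l+1}^{M} \binom{M}{j} \Big(\frac{l}{10M}\Big)^{j} \Big(1 - \frac{l}{10M}\Big)^{M-j} \le \sum_{j = l+1}^{M} \binom{M}{j} \Big(\frac{l}{10M}\Big)^{j}. \tag{5.2.61}$$
For any $j > l$ we have
$$\Big(\frac{l}{10M}\Big)^{j} < 10^{-l} \Big(\frac{l}{M}\Big)^{j}, \qquad 1 \le e^{l} \Big(1 - \frac{l}{M}\Big)^{M-l} \le e^{l} \Big(1 - \frac{l}{M}\Big)^{M-j}.$$
Therefore,
$$\sum_{j = l+1}^{M} \binom{M}{j} \Big(\frac{l}{10M}\Big)^{j} \le (e/10)^{l} \sum_{j = l+1}^{M} \binom{M}{j} \Big(\frac{l}{M}\Big)^{j} \Big(1 - \frac{l}{M}\Big)^{M-j} \le (e/10)^{l}.$$
By (5.2.60) and (5.2.61) we get
$$\mathsf{P}(E_1) \le \sum_{l} (e/10)^{l} < 1/2. \tag{5.2.62}$$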
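The bound (5.2.62) can be checked numerically for a fixed $M$. The sketch below is only a sanity check under the model stated in the proof: the number of indices $k$ with $m_1 < \eta_k \le m_1 + l$ is binomial with $M$ trials and success probability $l/(10M)$. The function names and the sample value $M = 40$ are illustrative choices.

```python
import math

def tail_probability(M, l):
    """P(E_{1,l}): probability that a Binomial(M, l/(10M)) variable exceeds l."""
    p = l / (10 * M)
    return sum(math.comb(M, j) * p ** j * (1 - p) ** (M - j)
               for j in range(l + 1, M + 1))

def union_bound(M):
    """Right-hand side of (5.2.60): sum of P(E_{1,l}) over l = 1, ..., M."""
    return sum(tail_probability(M, l) for l in range(1, M + 1))

M = 40
ratio = math.e / 10
termwise_ok = all(tail_probability(M, l) <= ratio ** l for l in range(1, M + 1))
geometric_sum = ratio / (1 - ratio)   # sum of (e/10)^l over all l >= 1
total = union_bound(M)
```

For this $M$ the actual value of `total` is far below the geometric bound, so the constant $1/2$ in (5.2.62) leaves ample room for the estimates of $\mathsf{P}(E_2)$ and $\mathsf{P}(E_3)$.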
To estimate $\mathsf{P}(E_2)$, we use the following theorem ([38, pp. 68, 79]).

Theorem 5.2.25. Let $E$ be a measurable space with measure $\mu$, and $\mu(E) < \infty$. Let $B$ be a linear space of measurable bounded functions on $E$, closed under complex conjugation, and suppose that there exists $\rho > 0$ with the following property: if $f \in B$ and $f$ is real, then there exists a measurable set $I = I(f) \subset E$ such that $\mu(I) > \mu(E)/\rho$ and $|f(t)| \ge \tfrac{1}{2}\|f\|_\infty$ for $t \in I$. Consider a random finite sum $P = \sum \xi_k f_k$, where
$$\mathsf{E}(\xi_k) = 0, \qquad \mathsf{E}(\xi_k^2) = b_k^2, \qquad |\xi_k| \le 1.$$
Moreover, suppose that $\|f_k\|_\infty = 1$ and $r = \sum b_k^2 > \log \rho$. Then
$$\mathsf{P}\big\{\|P\|_\infty \ge 6 (r \log \rho)^{1/2}\big\} \le \frac{4}{\rho}.$$

We apply Theorem 5.2.25 for $E = \mathbb{T}$, $B = \{\sum_{k=1}^{M} c_k e^{ikx}\}$, $f_k = e^{ikx}$, $\xi_k = \sigma_{\eta_k} A_{\eta_k}/A_m$. Note that
$$P(x) = T(x)/A_m. \tag{5.2.63}$$
One can guarantee the existence of the required set $I(f)$ by taking
$$\rho = 2\pi M^2 \tag{5.2.64}$$
([38, p. 49]). Furthermore, for $k = 1, \dots, M$ we have $\mathsf{E}\xi_k = 0$, and, by (5.2.56),
$$b_k^2 = \mathsf{E}\xi_k^2 = \frac{1}{10 M A_m^2} \sum_{n = m_1 + 1}^{m_3} A_n^2 \ge \frac{m \delta^2}{10M}.$$
Therefore, $r > m\delta^2/10$, and, by (5.2.53) and (5.2.64), for sufficiently large $m$ the condition $r > \log \rho$ holds. On the other hand,
$$\sum_{n = m_1 + 1}^{m_3} A_n^2 \le A_m \sum_{n = m_1 + 1}^{m_3} A_n = 2A_m, \tag{5.2.65}$$
$b_k^2 \le \frac{1}{5 M A_m}$, and $r \le \frac{1}{5 A_m}$. Thus, by (5.2.63),
$$\mathsf{P}\big\{\|P\|_\infty \ge 6 (r \log \rho)^{1/2}\big\} \ge \mathsf{P}\big\{\|P\|_\infty \ge 3 \big(\log(2\pi M^2)/A_m\big)^{1/2}\big\} = \mathsf{P}\big\{\|T\|_\infty \ge 3 \big(A_m \log(2\pi M^2)\big)^{1/2}\big\},$$
and Theorem 5.2.25 gives
$$\mathsf{P}(E_2) \le \frac{4}{\rho} \le M^{-2}. \tag{5.2.66}$$
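To estimate $P(E_3)$, we define the random variables $\nu_1, \dots, \nu_M$ as $\nu_k = A_{\eta_k}$ for $m_1 < \eta_k \le m_2$, and $\nu_k = 0$ otherwise. The event $E_3$ can be rewritten as
$$E_3 :\ \sum_{k=1}^{M} \nu_k \le 0.05.$$
We have
$$E(\nu_k) = \frac{1}{10M}$$
and, by (5.2.65),
$$V(\nu_k) \le E(\nu_k^2) \le \frac{A_m}{5M}.$$
Hence,
$$E\Bigl(\sum_{k=1}^{M} \nu_k\Bigr) = 0.1, \qquad V\Bigl(\sum_{k=1}^{M} \nu_k\Bigr) \le \frac{A_m}{5},$$
and, by Chebyshev's inequality,
$$P(E_3) \le \frac{V\bigl(\sum_{k=1}^{M} \nu_k\bigr)}{\bigl(E\bigl(\sum_{k=1}^{M} \nu_k\bigr) - 0.05\bigr)^2} \le 80 A_m. \tag{5.2.67}$$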
So, by (5.2.62), (5.2.66), and (5.2.67), $P(E_1) + P(E_2) + P(E_3) < 1$, and there exists a choice of a polynomial $T$ for which none of the events $E_1$, $E_2$, $E_3$ holds. This completes the proof of Lemma 5.2.24.

Theorem 5.2.26. Assume that the decreasing sequence $\{A_n\}_{n=1}^{\infty}$ does not satisfy the condition $(A_\infty)$. Then there exists a function $f \in C(\mathbb{T})$ with the property $a_n(f) \le A_n$, $n = 1, 2, \dots$, and such that
$$\limsup_{m\to\infty} \|f - G_m(f, \mathcal{T})\|_\infty > 0.$$

Proof. Without loss of generality we may suppose that
$$\limsup_{u\to\infty} \sum_{u < n \le e^{u}} A_n > 8, \tag{5.2.68}$$
where $u \in \mathbb{R}$. We may also assume that, for sufficiently large $n$,
$$A_n \le 10/n. \tag{5.2.69}$$
Indeed, if (5.2.69) fails for infinitely many values of $n$, we replace all $A_n$ by $A'_n = \min(A_n, 10/n)$. If for a large $m$ we have $A_m > 10/m$, then
$$\sum_{\log m < n \le m} A'_n \ge \sum_{\log m < n \le m} 10/m > 9,$$
and (5.2.68) holds for $\{A'_n\}$. Now, observe that $F(F(u)) > e^{u}$ for sufficiently large $u$ and $F(u) = e^{\sqrt{u}}$. Therefore,
$$\sum_{u < n \le e^{u}} A_n \le \sum_{u < n \le F(u)} A_n + \sum_{F(u) < n \le F(F(u))} A_n,$$
and (5.2.68) implies that
$$\limsup_{u\to\infty} \sum_{u < n \le e^{\sqrt{u}}} A_n > 4. \tag{5.2.70}$$
We will prove now that there exists an arbitrarily large integer $m$ such that
$$\sum_{m < n \le e^{\sqrt{m}}} A_n > 3 \tag{5.2.71}$$
and
$$A_{2m} \ge A_m/100. \tag{5.2.72}$$
Indeed, by (5.2.70), we can take a large $u$ with
$$\sum_{u < n \le e^{\sqrt{u}}} A_n > 4.$$
Let $m_0 = [u]$. If $A_{m_0} \ge 1/(2m_0)$, then the number $m = [m_0/2]$ satisfies (5.2.71) and (5.2.72) (we use (5.2.69) with $n = m$). If $A_{m_0} < 1/(2m_0)$, we define the sequence $m_j = 2^{j} m_0$. We take $m$ as the minimal $m = m_j$ satisfying (5.2.72). To show the existence of such an $m$ and to prove (5.2.71), we note that
$$\sum_{m_0 < n \le m_j} A_n < 1$$
whenever $A_{m_1} < A_{m_0}/100, \dots, A_{m_j} < A_{m_{j-1}}/100$. Hence, the number $m$ does exist and, moreover,
$$\sum_{m < n \le e^{\sqrt{u}}} A_n > 3,$$
which clearly implies (5.2.71).
We take now any large $m = m_1$ satisfying (5.2.71) and (5.2.72), and define
$$m_2 = \min\Bigl\{m : \sum_{m_1 < n \le m} A_n \ge 1\Bigr\}, \qquad m_3 = \min\Bigl\{m : \sum_{m_2 < n \le m} A_n \ge \sum_{m_1 < n \le m_2} A_n\Bigr\}.$$
We have
$$\sum_{m_1 < n \le m_3} A_n < 2 + 2a_{m_2} + a_{m_3} < 3.$$
This inequality combined with (5.2.71) shows that $m_3 < e^{\sqrt{m_1}}$. We apply now Lemma 5.2.24 to the sequence $\{A'_n\}$, where
$$A'_n = \Bigl(\sum_{m_1 < k \le m_2} A_k\Bigr)^{-1} A_n \quad (n \le m_2), \qquad A'_n = \Bigl(\sum_{m_2 < k \le m_3} A_k\Bigr)^{-1} A_n \quad (n > m_2).$$
Thus we get a polynomial $T = T_m$ satisfying (5.2.57)–(5.2.59). Setting
$$f(x) = \sum_{m} T_m(x) e^{i n_m x},$$
where the sum is taken over a sparse sequence of $m$'s with $n_m$ chosen to make the sets of frequencies of $T_m(x) e^{i n_m x}$ disjoint, we complete the proof of Theorem 5.2.26.

Theorem 5.2.27. Assume that the decreasing sequence $\{A_n\}_{n=1}^{\infty}$ is not summable. Then there exists a continuous function $f$ with the property $a_n(f) \le A_n$ and such that its partial Fourier sums diverge at some point.

Theorem 5.2.27 is a simple corollary of the following lemma.

Lemma 5.2.28. Let the decreasing sequence $\{A_n\}_{n=1}^{\infty}$ be not summable. Then for any $l \in \mathbb{N}$ and $m_0 \in \mathbb{N}$ there exist a trigonometric polynomial $T(x) = T_l(x)$ and numbers $m \ge m_0$, $N \in \mathbb{N}$ such that
$$a_k(T) \le A_{m+k} \quad (k \ge 1), \tag{5.2.73}$$
$$\|T\|_\infty \longrightarrow 0 \quad (l \longrightarrow \infty), \tag{5.2.74}$$
$$|S_N(T, 0)| \longrightarrow \infty \quad (l \longrightarrow \infty). \tag{5.2.75}$$

Proof. By the conditions on $\{A_n\}$ we have, for any $l \in \mathbb{N}$,
$$\sum_{n} A_{2ln} = \infty.$$
Therefore, for any $l > 1$ we can find $m_1 > m_0$ and $m_2 > m_1$ such that
$$(\log l)^{-1/2} \le \sum_{m_1 < n \le m_2} A_{2ln} \le 2 (\log l)^{-1/2}. \tag{5.2.76}$$
We associate with any $n$ with $m_1 < n \le m_2$ a trigonometric polynomial
$$T_n(x) = A_{2ln}\, e^{i k_n x} \sum_{j=1}^{l} \frac{\sin(Kjx)}{j},$$
where the numbers $K$ and $k_n$ and the positive integer $N$ satisfy the conditions
$$k_n = N - n, \qquad K > m_2, \qquad N > lK.$$
We set
$$T = \sum_{m_1 < n \le m_2} T_n.$$
Let us prove (5.2.73) with $m = 2m_1$. We observe that by the choice of the numbers $k_n$ and $K$ the spectra of the polynomials $T_n$ are disjoint, that is, for any $j$ there exists at most one $n$ such that $\hat{T}_n(j) \ne 0$. We have
$$T_n(x) = \sum_{1 \le |j| \le l} \hat{T}_n(k_n + Kj)\, e^{i(k_n + Kj)x}, \qquad \bigl|\hat{T}_n(k_n + Kj)\bigr| = A_{2ln}/(2|j|).$$
Therefore, we can write the following inequalities:
$$\bigl|\hat{T}_n(k_n + Kj)\bigr| \le A_{2ln} \le A_{2ln-j} \quad (1 \le j \le l),$$
$$\bigl|\hat{T}_n(k_n - Kj)\bigr| \le A_{2ln} \le A_{2ln-n-j} \quad (1 \le j \le l),$$
and note that for $n > m_1$, $1 \le j \le l$, the numbers $2ln - j$, $2ln - n - j$ are all greater than $2m_1$ and pairwise distinct. This proves (5.2.73) with $m = 2m_1$.

We will check (5.2.74) and (5.2.75) now. Using the well-known estimate
$$\Bigl\|\sum_{j=1}^{l} \frac{\sin(ju)}{j}\Bigr\|_\infty \le C$$
([86, p. 61]), we get
$$\|T\|_\infty \le C \sum_{m_1 < n \le m_2} A_{2ln},$$
and, by (5.2.76),
$$\|T\|_\infty \le 2C (\log l)^{-1/2}.$$
Let us estimate $S_N(T, 0)$. We have
$$S_N(T_n, 0) = \frac{i}{2} A_{2ln} \sum_{j=1}^{l} \frac{1}{j}.$$
Hence,
$$|S_N(T, 0)| = \frac{1}{2} \sum_{m_1 < n \le m_2} A_{2ln} \sum_{j=1}^{l} \frac{1}{j},$$
and (5.2.75) follows from (5.2.76). The proof is complete.
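The proof of Lemma 5.2.28 rests on the contrast between the sums $\sum_{j \le l} \sin(ju)/j$, which are bounded uniformly in $l$ and $u$, and the harmonic sums $\sum_{j \le l} 1/j$, which grow like $\log l$. The sketch below illustrates this contrast numerically; the grid size and the sample values of $l$ are assumptions made for illustration, and a grid maximum only approximates the true supremum from below.

```python
import math

def sine_sum_sup(l: int, grid: int = 2000) -> float:
    """Approximate sup over u in [0, pi] of |sum_{j=1}^l sin(j u)/j| on a uniform grid."""
    best = 0.0
    for i in range(grid + 1):
        u = math.pi * i / grid
        s = sum(math.sin(j * u) / j for j in range(1, l + 1))
        best = max(best, abs(s))
    return best

def harmonic(l: int) -> float:
    """Harmonic sum 1 + 1/2 + ... + 1/l, which appears in S_N(T_n, 0)."""
    return sum(1.0 / j for j in range(1, l + 1))

for l in (4, 16, 64):
    # the first quantity stays bounded, the second keeps growing
    print(l, round(sine_sum_sup(l), 3), round(harmonic(l), 3))
```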
5.3 Convergence. Conditions on greedy approximants

5.3.1 Introduction

In this section we concentrate on imposing extra conditions in the following form. We assume that for some sequence $\{M(m)\}$, $M(m) > m$, we have
$$\|G_{M(m)}(f) - G_m(f)\|_p \longrightarrow 0 \quad \text{as } m \longrightarrow \infty. \tag{5.3.1}$$
This extra assumption on $f$ is in the style of A. S. Belov [5]. He studied convergence of Fourier series in $L_p$ with $p = 1, \infty$ and imposed extra conditions on $f$ in the form $\|S_{2n}(f) - S_n(f)\|_p = o(1)$. In the case when $p$ is an even number or $p = \infty$, we find necessary and sufficient conditions on the growth of the sequence $\{M(m)\}$ to provide convergence $\|f - G_m(f)\|_p \to 0$ as $m \to \infty$. The presentation of this section is based on the paper [46]. We prove the following theorem in Subsection 5.3.3 (see Theorem 5.3.17).

Theorem 5.3.1. Let $p = 2q$, $q \in \mathbb{N}$, be an even integer, and $\delta > 0$. Assume that $f \in L_p(\mathbb{T})$ and there exists a sequence of positive integers $M(m) > m^{1+\delta}$ such that
$$\|G_{M(m)}(f) - G_m(f)\|_p \longrightarrow 0 \quad \text{as } m \longrightarrow \infty.$$
Then
$$\|f - G_m(f)\|_p \longrightarrow 0 \quad \text{as } m \longrightarrow \infty.$$

In Subsection 5.3.4 we prove that the condition $M(m) > m^{1+\delta}$ cannot be replaced by a condition $M(m) > m^{1+o(1)}$. The following theorem is a direct corollary of Theorem 5.3.21.

Theorem 5.3.2. For any $p \in (2, \infty)$ there exists a function $f \in L_p(\mathbb{T})$ with a sequence $\{G_m(f)\}$ of greedy approximations that diverges in $L_p(\mathbb{T})$ and has the following property. For any sequence $\{M(m)\}$ such that $m \le M(m) \le m^{1+o(1)}$,
$$\|G_{M(m)}(f) - G_m(f)\|_p \longrightarrow 0 \quad (m \longrightarrow \infty).$$
In Subsection 5.3.5 we discuss the case $p = \infty$. We prove there necessary and sufficient conditions for convergence of greedy approximations in the uniform norm. For a mapping $\alpha : W \to W$ we denote by $\alpha_k$ its $k$-fold iteration, $\alpha_k := \alpha \circ \alpha_{k-1}$.

Theorem 5.3.3. Let $\alpha : \mathbb{N} \to \mathbb{N}$ be strictly increasing. Then the following conditions are equivalent:
(a) For some $k \in \mathbb{N}$ and for any sufficiently large $m \in \mathbb{N}$ we have $\alpha_k(m) > e^{m}$.
(b) If $f \in C(\mathbb{T})$ and
$$\|G_{\alpha(m)}(f) - G_m(f)\|_\infty \longrightarrow 0 \quad (m \longrightarrow \infty),$$
then
$$\|f - G_m(f)\|_\infty \longrightarrow 0 \quad (m \longrightarrow \infty).$$

The proof of the necessary condition is based on the above Theorem 5.2.4. In the proof of the sufficient condition we use the following special inequality (see Theorem 5.3.9 in Subsection 5.3.2). By $\Sigma_m(\mathcal{T})$ we denote the set of all trigonometric polynomials with at most $m$ nonzero coefficients.

Theorem 5.3.4. For any $h \in \Sigma_m(\mathcal{T})$ and any $g \in L_\infty$ one has
$$\|h + g\|_\infty \ge K^{-2} \|h\|_\infty - e^{C(K)m} \bigl\|\{\hat{g}(k)\}\bigr\|_{\ell_\infty}, \qquad K > 1. \tag{5.3.2}$$

We note that in the proof of the above inequality we use a deep result on the uniform approximation property of the space $C(X)$ (see [24]). Subsection 5.3.2 contains some other inequalities in the style of (5.3.2).

Greedy approximations are close to thresholding approximations (thresholding greedy approximations). Thresholding approximations are defined as follows:
$$T_\varepsilon(f) := S_{\Lambda(\varepsilon)}(f) := \sum_{k : |\hat{f}(k)| \ge \varepsilon} \hat{f}(k) e^{i(k,x)}, \qquad \varepsilon > 0.$$
Clearly, for any $\varepsilon > 0$ there exists an $m(\varepsilon)$ such that $T_\varepsilon(f) = G_{m(\varepsilon)}(f)$. Therefore, convergence of $\{G_m(f)\}$ as $m \to \infty$ implies convergence of $\{T_\varepsilon(f)\}$ as $\varepsilon \to 0$. In Subsections 5.3.3–5.3.5 we obtain results on the convergence of $\{T_\varepsilon(f)\}$, $\varepsilon \to 0$, that are similar to the above-mentioned results on convergence of $\{G_m(f)\}$. We use the same notations in both cases $d = 1$ and $d > 1$. We point out that in Subsections 5.3.2 and 5.3.3 we consider the general case $d \ge 1$ and in Subsections 5.3.4 and 5.3.5 we confine ourselves to the case $d = 1$. The reason is that we prove necessary conditions in Subsection 5.3.4 and in a part of Subsection 5.3.5, where, clearly, we consider the case $d = 1$ without loss of generality. We note that sufficient conditions in Theorems 5.3.28 and 5.3.29 also hold for $d > 1$ (the proof is the same with natural modifications).
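The relation between thresholding and greedy approximants described above can be illustrated on a finite table of Fourier coefficients. The following is a minimal sketch for $d = 1$; the helper names, the dictionary representation, and the sample coefficients are hypothetical and chosen only for illustration.

```python
def greedy_support(coeffs: dict, m: int) -> set:
    """Frequencies of the m largest coefficients in absolute value.
    Ties are broken by frequency; a greedy approximant is not unique when ties occur."""
    ranked = sorted(coeffs, key=lambda k: (-abs(coeffs[k]), k))
    return set(ranked[:m])

def threshold_support(coeffs: dict, eps: float) -> set:
    """The set Lambda(eps) of frequencies k with |f^(k)| >= eps."""
    return {k for k, c in coeffs.items() if abs(c) >= eps}

# hypothetical coefficient table: frequency -> Fourier coefficient
coeffs = {-3: 0.5, -1: -2.0, 0: 1.0, 2: 0.25j, 5: -0.125}

eps = 0.3
lam = threshold_support(coeffs, eps)
m_eps = len(lam)  # this plays the role of m(eps) in the text

# the thresholding approximant coincides with the greedy approximant of order m(eps)
assert greedy_support(coeffs, m_eps) == lam
```

Every coefficient outside $\Lambda(\varepsilon)$ is strictly smaller in modulus than every coefficient inside it, so the equality holds regardless of how ties are broken.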
5.3.2 Some inequalities

In this subsection we prove some inequalities that will be used in Subsection 5.3.3. The general style of these inequalities is the following. A function that has a sparse representation with respect to the trigonometric system cannot be approximated in $L_p$ by functions with small Fourier coefficients. We begin our discussion with some concepts that are useful in proving such inequalities. The following new characteristic of a Banach space $L_p$ plays an important role in such inequalities. We introduce some more notations. Let $\Lambda$ be a finite subset of $\mathbb{Z}^d$. By $|\Lambda|$ we denote its cardinality and by $\mathcal{T}(\Lambda)$ the span of $\{e^{i(k,x)}\}_{k \in \Lambda}$. It is clear that
$$\Sigma_m(\mathcal{T}) = \bigcup_{\Lambda : |\Lambda| \le m} \mathcal{T}(\Lambda).$$
For $f \in L_p$, $F \in L_{p'}$, $1 \le p \le \infty$, $p' = p/(p-1)$, we write
$$\langle F, f \rangle := \int_{\mathbb{T}^d} F f \, d\mu, \qquad d\mu := (2\pi)^{-d} dx.$$

Definition 5.3.5. Let $\Lambda$ be a finite subset of $\mathbb{Z}^d$ and $1 \le p \le \infty$. We call a set $\Lambda' := \Lambda'(p, \gamma)$, $\gamma \in (0, 1]$, a $(p, \gamma)$-dual to $\Lambda$ if for any $f \in \mathcal{T}(\Lambda)$ there exists $F \in \mathcal{T}(\Lambda')$ such that $\|F\|_{p'} = 1$ and $\langle F, f \rangle \ge \gamma \|f\|_p$.

Denote by $D(\Lambda, p, \gamma)$ the set of all $(p, \gamma)$-dual sets $\Lambda'$. The following function will play an important role:
$$v(m, p, \gamma) := \sup_{\Lambda : |\Lambda| = m} \ \inf_{\Lambda' \in D(\Lambda, p, \gamma)} |\Lambda'|.$$
We note that in the particular case $p = 2q$, $q \in \mathbb{N}$, we have
$$v(m, p, 1) \le m^{p-1}. \tag{5.3.3}$$
This follows immediately from the form of the norming functional $F$ for $f \in L_p$:
$$F = f^{q-1} (\bar{f})^{q} \|f\|_p^{1-p}. \tag{5.3.4}$$
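The counting behind (5.3.3) can be made concrete. By (5.3.4), for $p = 2q$ the norming functional is a product of $q - 1$ copies of $f$ and $q$ copies of $\bar f$, so its spectrum lies in the set of sums of $q - 1$ frequencies from $\Lambda$ minus sums of $q$ frequencies from $\Lambda$. The sketch below, for $d = 1$, enumerates this set; the function name and the sample set $\Lambda$ are illustrative assumptions.

```python
from itertools import product

def dual_spectrum(freqs, q: int) -> set:
    """Frequencies that can occur in f^(q-1) * conj(f)^q when spec(f) lies in freqs (d = 1)."""
    freqs = list(freqs)
    spectrum = set()
    for plus in product(freqs, repeat=q - 1):      # frequencies contributed by f^(q-1)
        for minus in product(freqs, repeat=q):     # frequencies contributed by conj(f)^q
            spectrum.add(sum(plus) - sum(minus))
    return spectrum

Lambda = [0, 1, 10, 100]   # hypothetical frequency set, m = 4
q = 2                      # p = 2q = 4
size = len(dual_spectrum(Lambda, q))

# there are at most m^(2q-1) = m^(p-1) index tuples, hence at most that many frequencies
assert size <= len(Lambda) ** (2 * q - 1)
```

The enumeration visits $m^{2q-1}$ tuples, which is exactly the bound $m^{p-1}$; coincidences among the sums can only make the dual set smaller.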
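We will use the quantity $v(m, p, \gamma)$ in greedy approximation. We first prove a lemma.

Lemma 5.3.6. Let $2 \le p \le \infty$. For any $h \in \Sigma_m(\mathcal{T})$ and any $g \in L_p$,
$$\|h + g\|_p \ge \gamma \|h\|_p - v(m, p, \gamma)^{1-1/p} \bigl\|\{\hat{g}(k)\}\bigr\|_{\ell_\infty}.$$

Proof. Let $h \in \mathcal{T}(\Lambda)$ with $|\Lambda| = m$ and let $\Lambda' \in D(\Lambda, p, \gamma)$. Then using Definition 5.3.5 we find $F(h, \gamma) \in \mathcal{T}(\Lambda')$ such that
$$\|F(h, \gamma)\|_{p'} = 1 \quad \text{and} \quad \langle F(h, \gamma), h \rangle \ge \gamma \|h\|_p.$$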
We have
$$\langle F(h, \gamma), h \rangle = \langle F(h, \gamma), h + g \rangle - \langle F(h, \gamma), g \rangle \le \|h + g\|_p + \bigl|\langle F(h, \gamma), g \rangle\bigr|.$$
Next,
$$\bigl|\langle F(h, \gamma), g \rangle\bigr| \le \bigl\|\{\hat{F}(h, \gamma)(k)\}\bigr\|_{\ell_1} \bigl\|\{\hat{g}(k)\}\bigr\|_{\ell_\infty}.$$
Using $F(h, \gamma) \in \mathcal{T}(\Lambda')$ and the Hausdorff–Young theorem (see Appendix, Theorem 7.3.1), we obtain
$$\bigl\|\{\hat{F}(h, \gamma)(k)\}\bigr\|_{\ell_1} \le |\Lambda'|^{1-1/p} \bigl\|\{\hat{F}(h, \gamma)(k)\}\bigr\|_{\ell_p} \le |\Lambda'|^{1-1/p} \|F(h, \gamma)\|_{p'} = |\Lambda'|^{1-1/p}.$$
Thus it only remains to combine the above inequalities and use the definition of $v(m, p, \gamma)$.

Definition 5.3.7. Let $X$ be a finite-dimensional subspace of $L_p$, $1 \le p \le \infty$. We call a subspace $Y \subset L_{p'}$ a $(p, \gamma)$-dual to $X$, $\gamma \in (0, 1]$, if for any $f \in X$ there exists $F \in Y$ such that $\|F\|_{p'} = 1$ and $\langle F, f \rangle \ge \gamma \|f\|_p$.

Similarly to the above, we denote by $D(X, p, \gamma)$ the set of all $(p, \gamma)$-dual subspaces $Y$. Consider the function
$$w(m, p, \gamma) := \sup_{X : \dim X = m} \ \inf_{Y \in D(X, p, \gamma)} \dim Y.$$
We begin our discussion with the particular case $p = 2q$, $q \in \mathbb{N}$. Let $X$ be given and let $e_1, \dots, e_m$ form a basis of $X$. Using the Hölder inequality for $n$ functions $f_1, \dots, f_n \in L_n$,
$$\int |f_1 \cdots f_n| \, d\mu \le \|f_1\|_n \cdots \|f_n\|_n,$$
with $f_i = |e_j|^{p'}$, $n = p - 1$, we get that any function of the form
$$\prod_{i=1}^{m} |e_i|^{k_i}, \qquad k_i \in \mathbb{N}, \qquad \sum_{i=1}^{m} k_i = p - 1,$$
belongs to $L_{p'}$. It now follows from (5.3.4) that
$$w(m, p, 1) \le m^{p-1}, \qquad p = 2q, \quad q \in \mathbb{N}. \tag{5.3.5}$$
There is a general theory of the uniform approximation property (UAP) that provides estimates for $w(m, p, \gamma)$. We begin with some definitions from this theory. For a given subspace $X$ of $L_p$, $\dim X = m$, and a constant $K > 1$, let $k_p(X, K)$ be the smallest $k$ such that there is an operator $I_X : L_p \to L_p$ with $I_X(f) = f$ for $f \in X$, $\|I_X\|_{L_p \to L_p} \le K$, and $\operatorname{rank} I_X \le k$. Denote
$$k_p(m, K) := \sup_{X : \dim X = m} k_p(X, K).$$
Let us discuss how $k_p(m, K)$ can be used in estimating $w(m, p, \gamma)$. Consider the dual $I_X^*$ to the operator $I_X$. Then $\|I_X^*\|_{L_{p'} \to L_{p'}} \le K$ and $\operatorname{rank} I_X^* \le k_p(m, K)$. Let $f \in X$, $\dim X = m$, and let $F_f$ be the norming functional for $f$. Define
$$F := I_X^* F_f / \|I_X^* F_f\|_{p'}.$$
Then, for $f \in X$,
$$\langle f, I_X^*(F_f) \rangle = \langle I_X(f), F_f \rangle = \langle f, F_f \rangle = \|f\|_p$$
and
$$\|I_X^*(F_f)\|_{p'} \le K$$
imply
$$\langle f, F \rangle \ge K^{-1} \|f\|_p.$$
Therefore
$$w(m, p, K^{-1}) \le k_p(m, K). \tag{5.3.6}$$
We note that the behavior of the functions $w(m, p, \gamma)$ and $k_p(m, K)$ may be very different. J. Bourgain [6] proved that for any $p \in (1, \infty)$, $p \ne 2$, the function $k_p(m, K)$ grows faster than any polynomial in $m$. The estimate (5.3.5) shows that in the particular case $p = 2q$, $q \in \mathbb{N}$, the growth of $w(m, p, \gamma)$ is at most polynomial. This means that we cannot expect to obtain accurate estimates for $w(m, p, K^{-1})$ using the inequality (5.3.6). We give one more application of the UAP in the style of Lemma 5.3.6.

Lemma 5.3.8. Let $2 \le p \le \infty$. For any $h \in \Sigma_m(\mathcal{T})$ and any $g \in L_p$,
$$\|h + g\|_p \ge K^{-1} \|h\|_p - k_p(m, K)^{1/2} \|g\|_2, \tag{5.3.7}$$
$$\|h + g\|_p \ge K^{-2} \|h\|_p - k_p(m, K) \bigl\|\{\hat{g}(k)\}\bigr\|_{\ell_\infty}. \tag{5.3.8}$$
Proof. Let $h \in \mathcal{T}(\Lambda)$, $|\Lambda| = m$. Take $X = \mathcal{T}(\Lambda)$ and consider the operator $I_X$ provided by the UAP. Let $\psi_1, \dots, \psi_M$ form an orthonormal basis for the range $Y$ of $I_X$. Then $M \le k_p(m, K)$. Let
$$I_X(e^{i(k,x)}) = \sum_{j=1}^{M} c_j^k \psi_j.$$
Then the property $\|I_X\|_{L_p \to L_p} \le K$ implies that
$$\Bigl(\sum_{j=1}^{M} |c_j^k|^2\Bigr)^{1/2} = \|I_X(e^{i(k,x)})\|_2 \le \|I_X(e^{i(k,x)})\|_p \le K.$$
Consider along with $I_X$ the new operator
$$A := (2\pi)^{-d} \int_{\mathbb{T}^d} T_t I_X T_{-t} \, dt,$$
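where $T_t$ is the shift operator, $T_t(f) = f(\cdot + t)$. Then
$$A(e^{i(k,x)}) = \sum_{j=1}^{M} c_j^k (2\pi)^{-d} \int_{\mathbb{T}^d} e^{-i(k,t)} \psi_j(x + t) \, dt = \Bigl(\sum_{j=1}^{M} c_j^k \hat{\psi}_j(k)\Bigr) e^{i(k,x)}.$$
Denote
$$\lambda_k := \sum_{j=1}^{M} c_j^k \hat{\psi}_j(k).$$
We have
$$\sum_{k} |\lambda_k|^2 \le \sum_{k} \Bigl(\sum_{j=1}^{M} |c_j^k|^2\Bigr) \Bigl(\sum_{j=1}^{M} |\hat{\psi}_j(k)|^2\Bigr) \le K^2 M. \tag{5.3.9}$$
Also $\lambda_k = 1$ for $k \in \Lambda$. For the operator $A$ we have
$$\|A\|_{L_p \to L_p} \le K \quad \text{and} \quad \|A\|_{L_2 \to L_\infty} \le K M^{1/2}.$$
Therefore,
$$\|A(h + g)\|_p \le K \|h + g\|_p$$
and
$$\|A(h + g)\|_p \ge \|h\|_p - K M^{1/2} \|g\|_2.$$
This proves inequality (5.3.7). Consider the operator $B := A^2$. Then
$$B(h) = h, \quad h \in \mathcal{T}(\Lambda); \qquad B(e^{i(k,x)}) = \lambda_k^2 e^{i(k,x)}, \quad k \in \mathbb{Z}^d; \qquad \|B\|_{L_p \to L_p} \le K^2,$$
and, by (5.3.9),
$$\|B(f)\|_\infty \le \sum_{k} |\lambda_k|^2 \bigl\|\{\hat{f}(k)\}\bigr\|_{\ell_\infty} \le K^2 M \bigl\|\{\hat{f}(k)\}\bigr\|_{\ell_\infty}.$$
Now, on the one hand,
$$\|B(h + g)\|_p \le K^2 \|h + g\|_p,$$
and on the other hand,
$$\|B(h + g)\|_p = \|h + B(g)\|_p \ge \|h\|_p - K^2 M \bigl\|\{\hat{g}(k)\}\bigr\|_{\ell_\infty}.$$
This proves inequality (5.3.8).

Theorem 5.3.9. For any $h \in \Sigma_m(\mathcal{T})$ and any $g \in L_\infty$ one has
$$\|h + g\|_\infty \ge K^{-1} \|h\|_\infty - e^{C(K)m/2} \|g\|_2;$$
$$\|h + g\|_\infty \ge K^{-2} \|h\|_\infty - e^{C(K)m} \bigl\|\{\hat{g}(k)\}\bigr\|_{\ell_\infty}.$$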
Proof. This theorem is a direct corollary of Lemma 5.3.8 and the following known estimate (see [24]): $k_\infty(m, K) \le e^{C(K)m}$.

As we already mentioned, $k_p(m, K)$ increases faster than any polynomial. We will improve inequality (5.3.7) in the case $p < \infty$ by using other arguments.

Lemma 5.3.10. Let $2 \le p < \infty$. For any $h \in \Sigma_m(\mathcal{T})$ and any $g \in L_p$,
$$\|h + g\|_p^p \ge \|h\|_p^p - p\, m^{(p-2)/4} \|h\|_p^{p-1} \|g\|_2.$$

Proof. Since the function $f(x) = |x|^p$ is convex, we have $f(x - y) \ge f(x) - y f'(x)$. Therefore,
$$|h + g|^p \ge |h|^p - p |h|^{p-1} |g|. \tag{5.3.10}$$
Taking the integral of (5.3.10) over $\mathbb{T}^d$ with respect to the measure $\mu$ with $d\mu := (2\pi)^{-d} dx$ we get
$$\int_{\mathbb{T}^d} |h + g|^p \, d\mu \ge \int_{\mathbb{T}^d} |h|^p \, d\mu - p \int_{\mathbb{T}^d} |h|^{p-1} |g| \, d\mu. \tag{5.3.11}$$
Next, by Cauchy's inequality,
$$\int_{\mathbb{T}^d} |h|^{p-1} |g| \, d\mu \le \Bigl(\int_{\mathbb{T}^d} |h|^{2p-2} \, d\mu\Bigr)^{1/2} \Bigl(\int_{\mathbb{T}^d} |g|^2 \, d\mu\Bigr)^{1/2} \le \|g\|_2 \Bigl(\int_{\mathbb{T}^d} |h|^p \|h\|_\infty^{p-2} \, d\mu\Bigr)^{1/2} = \|g\|_2 \|h\|_p^{p/2} \|h\|_\infty^{(p-2)/2}. \tag{5.3.12}$$
Using Cauchy's inequality again, we obtain
$$\|h\|_\infty \le m^{1/2} \|h\|_2 \le m^{1/2} \|h\|_p. \tag{5.3.13}$$
Combining (5.3.11)–(5.3.13) we complete the proof of Lemma 5.3.10.
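Inequality (5.3.13) is easy to test on a concrete sparse polynomial. The sketch below, for $d = 1$, computes a grid approximation of $\|h\|_\infty$, the $L_2$ norm via Parseval's identity, and the $L_4$ norm via a uniform Riemann sum with respect to $dx/(2\pi)$. The helper names, the grid size, and the sample polynomial are assumptions made for illustration; the grid maximum is only a lower approximation of the true supremum.

```python
import cmath
import math

def evaluate(coeffs: dict, x: float) -> complex:
    """Value of h(x) = sum_k c_k e^{ikx} for a coefficient table {k: c_k}."""
    return sum(c * cmath.exp(1j * k * x) for k, c in coeffs.items())

def norms(coeffs: dict, p: float, grid: int = 256):
    """Return (grid sup-norm, L_2 norm via Parseval, L_p norm via a uniform Riemann sum)
    with respect to the normalized measure dx / (2 pi)."""
    values = [abs(evaluate(coeffs, 2 * math.pi * i / grid)) for i in range(grid)]
    sup = max(values)
    l2 = math.sqrt(sum(abs(c) ** 2 for c in coeffs.values()))
    lp = (sum(v ** p for v in values) / grid) ** (1 / p)
    return sup, l2, lp

h = {1: 1.0, 4: -0.5, 9: 0.75j, 16: 0.25}   # hypothetical 4-term polynomial
m = len(h)
sup, l2, l4 = norms(h, p=4)

assert sup <= math.sqrt(m) * l2 + 1e-12   # first inequality in (5.3.13)
assert l2 <= l4 + 1e-12                   # second inequality in (5.3.13) with p = 4
```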
For comparison we mention two inequalities from Section 5.2 in the style of the inequalities in Lemmas 5.3.6–5.3.10 (see Proposition 5.2.17 and Lemma 5.2.22).

Lemma 5.3.11. Let $2 \le p < \infty$ and $h \in L_p$, $\|h\|_p \ne 0$. Then for any $g \in L_p$ we have
$$\|h\|_p \le \|h + g\|_p + \bigl(\|h\|_{2p-2} / \|h\|_p\bigr)^{p-1} \|g\|_2.$$

Lemma 5.3.12. Let $h \in \Sigma_m(\mathcal{T})$, $\|h\|_\infty = 1$. Then for any function $g$ such that $\|g\|_2 \le \frac14 (4\pi m)^{-m/2}$ we have
$$\|h + g\|_\infty \ge 1/4.$$

We proceed to estimating $v(m, p, \gamma)$ for $p \in [2, \infty)$. In the special case of even $p$ we have, by (5.3.3),
$$v(m, p, 1) \le m^{p-1}.$$
Lemma 5.3.13. Let $2 \le p < \infty$. Denote $\alpha := p/2 - [p/2]$. Then
$$v(m, p, \gamma) \le m^{c(\alpha, \gamma) m^{1/2} + p - 1}.$$

Proof. In the case where $p$ is an even number the statement follows from (5.3.3). We will assume that $p$ is not even. Let $\Lambda \subset \mathbb{Z}^d$, $|\Lambda| = m$, be given. Take any nonzero $h \in \mathcal{T}(\Lambda)$ and assume for convenience that $\|h\|_p = 1$. We will construct a $\gamma$-norming functional $F(h, \gamma)$ ($\langle F, h \rangle \ge \gamma \|h\|_p$). We use the formula for the norming functional of $h$, namely
$$F = \bar{h} |h|^{p-2} \|h\|_p^{1-p} = \bar{h} (|h|^2)^{p/2-1} = \bar{h} (|h|^2)^{[p/2]-1} (|h|^2)^{\alpha}.$$
By (5.3.13), we have $\|h\|_\infty \le m^{1/2}$. The idea is to replace $(|h|^2)^{\alpha}$ by an algebraic polynomial in $|h|^2$. We approximate the function $x^{\alpha}$ on the interval $[0, m]$. We use Telyakovskii's result in [66]: there exists an algebraic polynomial $P_n$ of degree $n$ such that
$$|y^{\alpha} - P_n(y)| \le C_1(\alpha) \bigl(y^{1/2}/n\bigr)^{\alpha}, \qquad y \in [0, 1]. \tag{5.3.14}$$
Substituting $y = x/m$ into (5.3.14) we get
$$|x^{\alpha} - m^{\alpha} P_n(x/m)| \le C_1(\alpha) x^{\alpha/2} m^{\alpha/2} n^{-\alpha}.$$
We take $\theta = \frac{1-\gamma}{1+\gamma} \in (0, 1)$ and choose $n(m) \le C_2(\alpha, \gamma) m^{1/2}$ with $C_2(\alpha, \gamma)$ big enough to have
$$C_1(\alpha) x^{\alpha/2} m^{\alpha/2} n^{-\alpha} \le \theta x^{\alpha/2}.$$
Denote
$$F_m := m^{\alpha} P_{n(m)}\bigl(|h|^2/m\bigr) \bar{h} (|h|^2)^{[p/2]-1}.$$
Then (with $x = |h|^2$)
$$|F - F_m| \le \theta |h|^{2[p/2]-1+\alpha}.$$
Therefore,
$$\|F - F_m\|_{p'} \le \theta \bigl\| |h|^{2[p/2]-1+\alpha} \bigr\|_{p'}. \tag{5.3.15}$$
Using $2[p/2] = p - 2\alpha$, we get
$$\bigl\| |h|^{p-1-\alpha} \bigr\|_{p'} = \|h\|_{p'(p-\alpha-1)}^{p-\alpha-1} \le \|h\|_p^{p-\alpha-1} = 1. \tag{5.3.16}$$
Combining (5.3.15) and (5.3.16), we get
$$\|F - F_m\|_{p'} \le \theta.$$
This implies that $\|F_m\|_{p'} \le 1 + \theta$ and
$$\langle F_m, h \rangle = \langle F, h \rangle + \langle F_m - F, h \rangle \ge \|h\|_p - \theta \|h\|_p = (1 - \theta) \|h\|_p.$$
Thus $F(h, \gamma) := F_m / \|F_m\|_{p'}$ is a $\gamma$-norming functional for $h$. It remains to note that the dimension of a subspace $\mathcal{T}(\Lambda')$ containing all
$$P_{n(m)}\bigl(|h|^2/m\bigr) \bar{h} (|h|^2)^{[p/2]-1}$$
when $h$ runs over $\mathcal{T}(\Lambda)$ does not exceed $m^{c(\alpha, \gamma) m^{1/2} + p - 1}$.
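5.3.3 Sufficient conditions in the case $p \in (2, \infty)$

We will prove now several statements which give sufficient conditions for convergence of greedy approximation in $L_p$, $2 < p < \infty$.

Theorem 5.3.14. Let $p = 2q$, $q \in \mathbb{N}$, be an even integer. For $f \in L_p(\mathbb{T}^d)$ assume that two sequences $\Lambda_m$ and $Y_m$ of sets of frequencies satisfy the following conditions:
$$|\Lambda_m| \le m^{a}, \qquad a > 0, \tag{5.3.17}$$
$$\sup_{k \notin Y_m} |\hat{f}(k)| = o\bigl(m^{a(1-p)}\bigr), \tag{5.3.18}$$
$$\|S_{\Lambda_m}(f) - S_{Y_m}(f)\|_p \longrightarrow 0 \quad \text{as } m \longrightarrow \infty.$$
Then
$$\|S_{\Lambda_m}(f) - f\|_p \longrightarrow 0 \quad \text{as } m \longrightarrow \infty.$$

Proof. We use the Riesz theorem ([41, Chapter 4, Section 3]) that for all $1 < p < \infty$ we have the convergence $\|f - S_N(f)\|_p \to 0$ as $N \to \infty$, where, as above,
$$S_N(f) := S_N^d(f) := \sum_{k \in Q(N)} \hat{f}(k) e^{i(k,x)}, \qquad Q(N) := \bigl\{k : \max_j |k_j| \le N^{1/d}\bigr\}.$$
Let
$$\varepsilon_m := \sup_{k \notin Y_m} |\hat{f}(k)|, \qquad N = [m^{ap}].$$
We estimate
$$\|S_N(f) - S_{Y_m}(f)\|_p \le \Bigl\|\sum_{k : |k| \le N;\, k \notin Y_m} \hat{f}(k) e^{i(k,x)}\Bigr\|_p + \Bigl\|\sum_{k : |k| > N;\, k \in Y_m} \hat{f}(k) e^{i(k,x)}\Bigr\|_p =: \|\Sigma_1\|_p + \|\Sigma_2\|_p. \tag{5.3.19}$$
By the Paley theorem ([86, Chapter 12, Section 5]),
$$\|\Sigma_1\|_p = O\bigl(\varepsilon_m N^{1-1/p}\bigr) = o(1).$$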
For the second sum we have
$$\Sigma_2 = f - S_N(f) - g \quad \text{with} \quad g := \sum_{k : |k| > N;\, k \notin Y_m} \hat{f}(k) e^{i(k,x)}. \tag{5.3.20}$$
Let us rewrite
$$\Sigma_2 = (\mathrm{Id} - S_N) S_{Y_m}(f) = (\mathrm{Id} - S_N) S_{\Lambda_m}(f) + (\mathrm{Id} - S_N)\bigl(S_{Y_m}(f) - S_{\Lambda_m}(f)\bigr) =: h_1 + h_2. \tag{5.3.21}$$
By the theorem's assumption and the Riesz theorem, $\|h_2\|_p = o(1)$ and, therefore, we get from (5.3.20) and (5.3.21) that $\|h_1 + g\|_p = o(1)$. We note that $h_1$ is a polynomial with at most $m^{a}$ terms and $g$ is a function with small Fourier coefficients. We have the following lemma for this situation.

Lemma 5.3.15. Let $p = 2q$, $q \in \mathbb{N}$, be an even integer. Assume that $h$ is an $m^{a}$-term trigonometric polynomial and $g$ is such that $|\hat{g}(k)| \le \varepsilon$ for all $k$. Then
$$\|h\|_p \le \|h + g\|_p + \varepsilon m^{a(p-1)}.$$

Proof. This follows from Lemma 5.3.6 and the estimate (5.3.3).

Applying Lemma 5.3.15 we obtain for $h_1$ that $\|h_1\|_p = o(1)$ and, therefore, $\|\Sigma_2\|_p = o(1)$. This in turn implies (see (5.3.19)) that
$$\|S_N(f) - S_{Y_m}(f)\|_p = o(1).$$
Thus we conclude that $\|f - S_{\Lambda_m}(f)\|_p \to 0$ as $m \to \infty$. The proof of Theorem 5.3.14 is complete.

We now formulate a straightforward corollary of Theorem 5.3.14. Let us note first that convergence of $\{G_m(f)\}$ in $L_p$ is equivalent to
$$\|G_m(f) - G_n(f)\|_p \longrightarrow 0 \quad \text{as } m, n \longrightarrow \infty.$$

Corollary 5.3.16. Let $p = 2q$, $q \in \mathbb{N}$, be an even integer. For $f \in L_p(\mathbb{T}^d)$ assume that there exists a sequence $\{\varepsilon_m\}$, $\varepsilon_m = o(m^{1-p})$, such that
$$\|G_m(f) - T_{\varepsilon_m}(f)\|_p = o(1).$$
Then
$$\|G_m(f) - f\|_p \longrightarrow 0 \quad \text{as } m \longrightarrow \infty.$$

We now present some results in the direction of weakening the assumption $\varepsilon_m = o(m^{1-p})$ in Corollary 5.3.16.
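Theorem 5.3.17. Let $p = 2q$, $q \in \mathbb{N}$, be an even integer, and $\delta > 0$. Assume that $f \in L_p(\mathbb{T}^d)$ and there exists a sequence of positive integers $M(m) > m^{1+\delta}$ such that
$$\|G_m(f) - G_{M(m)}(f)\|_p \longrightarrow 0 \quad \text{as } m \longrightarrow \infty. \tag{5.3.22}$$
Then
$$\|G_m(f) - f\|_p \longrightarrow 0 \quad \text{as } m \longrightarrow \infty.$$

Proof. Let $m_0 := m$, $m_j := M(m_{j-1})$ for $j \in \mathbb{N}$. We have $m_j > m^{(1+\delta)^{j}}$. Fix $j_0 > \log(2p)/\log(1+\delta)$. Let $M_0(m) := m_{j_0}$. Then $M_0(m) > m^{2p}$. Also, by (5.3.22),
$$\|G_m(f) - G_{M_0(m)}(f)\|_p \longrightarrow 0 \quad \text{as } m \longrightarrow \infty.$$
Let $\Lambda_m$ and $Y_m$ be defined from $G_m(f) = S_{\Lambda_m}(f)$ and $G_{M_0(m)}(f) = S_{Y_m}(f)$. Using that
$$a_{M_0(m)}(f) = O\bigl(M_0(m)^{-1/2}\bigr) = O(m^{-p}) = o(m^{1-p}),$$
we complete the proof of Theorem 5.3.17 by Theorem 5.3.14.

Theorem 5.3.18. Let $p = 2q$, $q \in \mathbb{N}$, be an even integer, and $\delta > 0$. Assume that $f \in L_p(\mathbb{T}^d)$ and for any $\varepsilon > 0$ there is an $\eta(\varepsilon) < \varepsilon^{1+\delta}$ such that
$$\|T_\varepsilon(f) - T_{\eta(\varepsilon)}(f)\|_p \longrightarrow 0 \quad \text{as } \varepsilon \longrightarrow 0. \tag{5.3.23}$$
Then
$$\|T_\varepsilon(f) - f\|_p \longrightarrow 0 \quad \text{as } \varepsilon \longrightarrow 0.$$

To prove this theorem we need the following simple result.

Lemma 5.3.19. Let $p \ge 2$ and $\delta > 0$. For any $f \in L_p(\mathbb{T}^d)$ there is an $\varepsilon_{f,p} > 0$ with the following property. For any $\varepsilon \in (0, \varepsilon_{f,p})$ there exists an $m(\varepsilon)$ such that $\varepsilon^{-p/(p-1)+\delta} < m(\varepsilon) < \varepsilon^{-2}$ and
$$\|G_{m(\varepsilon)}(f) - T_\varepsilon(f)\|_p \longrightarrow 0 \quad \text{as } \varepsilon \longrightarrow 0.$$

Proof. We have $G_{m_1(\varepsilon)}(f) = S_{\Lambda(\varepsilon)}(f)$ for $m_1(\varepsilon) = |\Lambda(\varepsilon)|$. Moreover, the condition $f \in L_2(\mathbb{T}^d)$ implies $m_1(\varepsilon) = o(\varepsilon^{-2})$. If $m_1(\varepsilon) > \varepsilon^{-p'+\delta}$, where $p' = p/(p-1)$, then we put $m(\varepsilon) = m_1(\varepsilon)$. Suppose that $m_1(\varepsilon) \le \varepsilon^{-p'+\delta}$. Let $m_2(\varepsilon) = [\varepsilon^{-p'+\delta}]$, $m(\varepsilon) = m_1(\varepsilon) + m_2(\varepsilon)$. By the Hausdorff–Young theorem,
$$\|G_{m(\varepsilon)}(f) - G_{m_1(\varepsilon)}(f)\|_p \le \varepsilon\, m_2(\varepsilon)^{1/p'} \longrightarrow 0 \quad \text{as } \varepsilon \longrightarrow 0,$$
and, moreover, $\varepsilon^{-p/(p-1)+\delta} < m(\varepsilon) < \varepsilon^{-2}$ for small $\varepsilon$. This proves the lemma.

Proof. We now prove Theorem 5.3.18. By Lemma 5.3.19 we find $m(\varepsilon)$ such that $\varepsilon^{-p'+\delta} < m(\varepsilon) < \varepsilon^{-2}$ and
$$\|G_{m(\varepsilon)}(f) - T_\varepsilon(f)\|_p \longrightarrow 0 \quad \text{as } \varepsilon \longrightarrow 0.$$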
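Proceeding as in the proof of Theorem 5.3.17, for any $\varepsilon > 0$ we get an $\eta(\varepsilon) < \varepsilon^{2p} < m(\varepsilon)^{-p}$ such that
$$\|T_\varepsilon(f) - T_{\eta(\varepsilon)}(f)\|_p \longrightarrow 0 \quad \text{as } \varepsilon \longrightarrow 0. \tag{5.3.24}$$
We now apply Theorem 5.3.14 with $\Lambda_{m(\varepsilon)}$ and $Y_{m(\varepsilon)}$ defined from
$$G_{m(\varepsilon)}(f) = S_{\Lambda_{m(\varepsilon)}}(f); \qquad T_{\eta(\varepsilon)}(f) = S_{Y_{m(\varepsilon)}}(f).$$
The proof of Theorem 5.3.18 is complete.

Theorem 5.3.20. Let $p = 2q$, $q \in \mathbb{N}$, be an even integer, and $\delta > 0$. Assume that $f \in L_p(\mathbb{T}^d)$ and for any positive integer $m$ there exists an $\varepsilon(m) < m^{1/p-1-\delta}$ such that
$$\|G_m(f) - T_{\varepsilon(m)}(f)\|_p \longrightarrow 0 \quad \text{as } m \longrightarrow \infty.$$
Then
$$\|G_m(f) - f\|_p \longrightarrow 0 \quad \text{as } m \longrightarrow \infty.$$

Proof. It is clear that it suffices to prove the theorem for small $\delta$. Thus let $0 < \delta < p' - 1/p'$. Applying Lemma 5.3.19 with $\varepsilon = \varepsilon(m)$ we get the existence of $M(m) > m^{1+\delta'}$ with some $\delta' > 0$ such that
$$\|G_{M(m)}(f) - G_m(f)\|_p \longrightarrow 0 \quad \text{as } m \longrightarrow \infty.$$
It remains to use Theorem 5.3.17.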
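5.3.4 Necessary conditions in the case $p \in (2, \infty)$

Theorem 5.3.21. For any $p > 2$ there exists a function $f \in L_p(\mathbb{T})$ such that
(1) if two sequences $\{\Lambda_j\}$ and $\{Y_j\}$ of sets of frequencies satisfy the conditions
$$\sup_{k \notin \Lambda_j} |\hat{f}(k)| \le \varepsilon_j := \inf_{k \in \Lambda_j} |\hat{f}(k)|, \qquad \sup_{k \notin Y_j} |\hat{f}(k)| \le \delta_j := \inf_{k \in Y_j} |\hat{f}(k)|,$$
$\Lambda_j \subset Y_j$, and either
$$|Y_j| = |\Lambda_j|^{1+o(1)} \quad (j \longrightarrow \infty)$$
or
$$\delta_j = \varepsilon_j^{1+o(1)} \quad (j \longrightarrow \infty),$$
then
$$\|S_{\Lambda_j}(f) - S_{Y_j}(f)\|_p \longrightarrow 0 \quad (j \longrightarrow \infty);$$
(2) $\displaystyle \liminf_{\varepsilon \to 0} \Bigl\|f - \sum_{k : |\hat{f}(k)| \ge \varepsilon} \hat{f}(k) e^{ikx}\Bigr\|_p > 0.$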
Let $M$ be a sufficiently large positive integer and let $\eta_k$, $1 \le k \le M$, be independent random variables such that each $\eta_k$ takes the value $n$, $1 \le n \le M$, with probability $1/M$. We will use the following probabilistic inequality.

Lemma 5.3.22. There is a constant $C_1 = C_1(p)$ such that for every function $g : \{1, \dots, M\} \to \mathbb{R}$ with $\sum_{n=1}^{M} g(n) = 0$, and all independent random variables $\xi_k = g(\eta_k)$ and complex numbers $z_1, \dots, z_M$ with $|z_k| \le 1$ ($k = 1, \dots, M$), we have
$$E\Bigl|\sum_{k=1}^{M} \xi_k z_k\Bigr|^p \le C_1 M^{p/2} \bigl(E(\xi_1^2)\bigr)^{p/2}.$$

Proof. First assume that the numbers $z_1, \dots, z_M$ are real. Observe that $E(\xi_k) = 0$ for $k = 1, \dots, M$. By Rosenthal's inequality,
$$E\Bigl|\sum_{k=1}^{M} \xi_k z_k\Bigr|^p \le C(p) \Bigl(\sum_{k=1}^{M} |z_k|^p E(|\xi_1|^p) + \Bigl(\sum_{k=1}^{M} z_k^2 E(\xi_1^2)\Bigr)^{p/2}\Bigr) \le C(p) \Bigl(M E(|\xi_1|^p) + M^{p/2} \bigl(E(\xi_1^2)\bigr)^{p/2}\Bigr). \tag{5.3.25}$$
Furthermore,
$$E(|\xi_1|^p) = \frac{1}{M} \sum_{n=1}^{M} |g(n)|^p \le \frac{1}{M} \Bigl(\sum_{n=1}^{M} g(n)^2\Bigr)^{p/2} = M^{p/2-1} \bigl(E(\xi_1^2)\bigr)^{p/2}.$$
After substitution of the last inequality into (5.3.25) we get
$$E\Bigl|\sum_{k=1}^{M} \xi_k z_k\Bigr|^p \le 2C(p) M^{p/2} \bigl(E(\xi_1^2)\bigr)^{p/2}.$$
Finally, if the numbers $z_1, \dots, z_M$ are complex then
$$E\Bigl|\sum_{k=1}^{M} \xi_k z_k\Bigr|^p \le 2^p E\Bigl|\sum_{k=1}^{M} \xi_k \operatorname{Re} z_k\Bigr|^p + 2^p E\Bigl|\sum_{k=1}^{M} \xi_k \operatorname{Im} z_k\Bigr|^p \le 2^{p+2} C(p) M^{p/2} \bigl(E(\xi_1^2)\bigr)^{p/2},$$
and the lemma is proved.

We will need some properties of random trigonometric polynomials.

Lemma 5.3.23. Let $b = (b_1, \dots, b_M)$ be real numbers such that $\sum_{k=1}^{M} b_k = 0$. Then
$$E\Bigl\|\sum_{k=1}^{M} b_{\eta_k} e^{ikx}\Bigr\|_p^p \le C(p) \|b\|_2^p.$$
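Proof. We use Lemma 5.3.22 with $g(n) = b_n$, $z_n = e^{inx}$, $n = 1, \dots, M$. It shows that for each $x$,
$$E\Bigl|\sum_{k=1}^{M} b_{\eta_k} e^{ikx}\Bigr|^p \le C_1(p) M^{p/2} \bigl(E(\xi_1^2)\bigr)^{p/2}.$$
Therefore,
$$E\Bigl\|\sum_{k=1}^{M} b_{\eta_k} e^{ikx}\Bigr\|_p^p = E\Bigl\|\Bigl|\sum_{k=1}^{M} b_{\eta_k} e^{ikx}\Bigr|^p\Bigr\|_1 \le C_1(p) M^{p/2} \bigl(E(\xi_1^2)\bigr)^{p/2},$$
and
$$E(\xi_1^2) = \frac{1}{M} \sum_{n=1}^{M} b_n^2 = \|b\|_2^2 / M.$$
This completes the proof.

For a given $a = (a_1, \dots, a_M)$, consider the random polynomials
$$t_I^a(x) := \sum_{\eta_k \in I} a_{\eta_k} e^{ikx} - s_I D_M(x)/M,$$
where $I \subseteq [1, M]$ is an interval and
$$s_I := \sum_{n \in I} a_n; \qquad D_M(x) := \sum_{k=1}^{M} e^{ikx}.$$
Below we use the notation $\log$ for the logarithm with base 2.

Lemma 5.3.24. We have for any $A > 0$, $M \ge 8$,
$$P\Bigl\{\max_{I \subseteq [1, M]} \|t_I^a\|_p \le A^{1/p}\, 3 \log M\, \|a\|_2\Bigr\} \ge 1 - C_2(p) A^{-1} \log M.$$

Proof. First, by Lemma 5.3.23 with $b_n = a_n \chi_I(n) - s_I/M$, $n = 1, \dots, M$, we obtain
$$E\|t_I^a\|_p^p \le C(p) \Bigl(\sum_{n=1}^{M} b_n^2\Bigr)^{p/2}.$$
Next,
$$\sum_{n=1}^{M} b_n^2 \le 2 \sum_{n=1}^{M} \bigl((a_n \chi_I(n))^2 + (s_I/M)^2\bigr) = 2 \Bigl(\sum_{n \in I} a_n^2 + M \Bigl(\sum_{n \in I} a_n\Bigr)^2 M^{-2}\Bigr) \le 4 \sum_{n \in I} a_n^2,$$
and so,
$$E\|t_I^a\|_p^p \le 4C(p) \Bigl(\sum_{n \in I} a_n^2\Bigr)^{p/2}.$$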
Denote $I(j, l) := (2^{j} l, 2^{j}(l+1)] \cap [1, M]$, $j = 0, \dots, J$, $l = 0, 1, \dots$, with $J := [\log M] + 1$. Then, for any $j \in [0, J]$,
$$E \sum_{l=0}^{\infty} \|t_{I(j,l)}^a\|_p^p \le 4C(p) \sum_{l=0}^{\infty} \Bigl(\sum_{n \in I(j,l)} a_n^2\Bigr)^{p/2} \le 4C(p) \|a\|_2^p.$$
Now use Markov's inequality: for any nonnegative random variable $X$ and $t > 0$,
$$P\{X \ge t\} \le E(X)/t.$$
Thus we get, for each $j \in [0, J]$,
$$P\Bigl\{\sum_{l=0}^{\infty} \|t_{I(j,l)}^a\|_p^p \ge A \|a\|_2^p\Bigr\} \le 4C(p)/A.$$
Since every interval $I \subseteq [1, M]$ with integer endpoints can be represented as a union of at most $2J + 1$ disjoint dyadic intervals $I(j, l)$, we obtain
$$P\Bigl\{\max_{I \subseteq [1, M]} \|t_I^a\|_p \le A^{1/p} (2 \log M + 3) \|a\|_2\Bigr\} \ge 1 - 4C(p)(\log M + 2)/A.$$
Hence Lemma 5.3.24 is proved.

Lemma 5.3.25. Let $a_1 > a_2 > \dots > a_M \ge 0$. Then, for each $n \in [1, M]$,
$$P\Bigl\{\Bigl|\bigl|\{k : a_{\eta_k} \ge a_n\}\bigr| - n\Bigr| \ge M^{1/2} \log M\Bigr\} \le 2e^{-C(\log M)^2}.$$
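The dyadic decomposition used in the proof of Lemma 5.3.24 above can be carried out explicitly. The following sketch splits an interval $(n_1, n_2]$ with integer endpoints, $0 \le n_1 < n_2 \le M$, into disjoint intervals of the form $(2^{j} l, 2^{j}(l+1)]$ by a greedy rule and checks the count $2J + 1$ for one value of $M$. The function name, the greedy rule, and the choice $M = 64$ are assumptions made for illustration; the text does not specify how the decomposition is constructed.

```python
def dyadic_pieces(n1: int, n2: int):
    """Split (n1, n2] into disjoint dyadic intervals (2^j * l, 2^j * (l + 1)].
    Returns a list of pairs (j, l)."""
    pieces = []
    a = n1
    while a < n2:
        j = 0
        # grow the block while it stays aligned at a and inside (n1, n2]
        while a % (2 ** (j + 1)) == 0 and a + 2 ** (j + 1) <= n2:
            j += 1
        pieces.append((j, a >> j))
        a += 2 ** j
    return pieces

M = 64
J = M.bit_length()          # equals [log2 M] + 1 for M >= 1
worst = max(len(dyadic_pieces(n1, n2))
            for n1 in range(M) for n2 in range(n1 + 1, M + 1))
assert worst <= 2 * J + 1
```

In this greedy rule the block lengths first increase and then decrease, so each length $2^{j}$ is used at most twice, which is consistent with the bound $2J + 1$.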
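Proof. We use the probabilistic Bernstein inequality. If $\xi$ is a random variable (a real-valued function on a probability space $Z$), then denote
$$\sigma^2(\xi) := E\bigl(\xi - E(\xi)\bigr)^2.$$
The probabilistic Bernstein inequality states: if $|\xi - E(\xi)| \le B$ a.e. then, for any $\varepsilon > 0$,
$$P_{z \in Z^m}\Bigl\{\Bigl|\frac{1}{m} \sum_{i=1}^{m} \xi(z_i) - E(\xi)\Bigr| \ge \varepsilon\Bigr\} \le 2 \exp\Bigl(-\frac{m \varepsilon^2}{2(\sigma^2(\xi) + B\varepsilon/3)}\Bigr).$$
We define a random variable $\beta$ as follows:
$$\beta(k) = 1 \quad \text{if } a_{\eta_k} \ge a_n, \qquad \beta(k) = 0 \quad \text{otherwise}.$$
Then
$$P\{\beta(k) = 1\} = P\{\eta_k \in [1, n]\} = n/M.$$
Also
$$E(\beta) = n/M, \qquad \sigma^2(\beta) = (1 - n/M)\, n/M \le 1/4,$$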
and
$$\bigl|\{k : a_{\eta_k} \ge a_n\}\bigr| = \sum_{k=1}^{M} \beta(k).$$
Applying the Bernstein inequality for $\beta$ with $m = M$ and $\varepsilon = M^{-1/2} \log M$ we obtain Lemma 5.3.25.

It will be convenient to use the following direct corollary of Lemma 5.3.25.

Lemma 5.3.26. Let $a_1 > a_2 > \dots > a_M \ge 0$. Then
$$P\Bigl\{\max_{1 \le n \le M} \Bigl|\bigl|\{k : a_{\eta_k} \ge a_n\}\bigr| - n\Bigr| \ge M^{1/2} \log M\Bigr\} \le 2M e^{-C(\log M)^2}.$$

We will now consider some specific polynomials that will be used as building blocks of a counterexample. For a given $p \in (2, \infty)$ we take $\gamma \in (\max(3/4, 2/p), 1)$. For $M \in \mathbb{N}$ we denote $m_1 := m_1(M) := [M^{\gamma}] + 1$. Let $m_2 := m_2(M)$ be such that
$$\sum_{n=1}^{m_2 - 1} (n + m_1)^{-1} < \frac{1}{2} \sum_{n=1}^{M} (n + m_1)^{-1} \le \sum_{n=1}^{m_2} (n + m_1)^{-1}. \tag{5.3.26}$$
We define $a_n := a_n(M) := (n + m_1)^{-1}$ for $1 \le n \le m_2$, and $a_n := a_n(M) := -(n + m_1)^{-1}$ for $m_2 < n \le M$. We consider the random trigonometric polynomials
$$P_M(x) := \sum_{k=1}^{M} a_{\eta_k} e^{ikx}.$$
We also need some polynomials associated with $P_M$. For arbitrary integers $n_1$ and $n_2$ with $0 \le n_1 < n_2 \le M$, we define $I := (n_1, n_2]$ and
$$S_I := S_{n_1, n_2} := \sum_{n = n_1 + 1}^{n_2} a_n.$$
We consider the following function $g : \{1, \dots, M\} \to \mathbb{R}$:
$$g(n) = \begin{cases} a_n - S_I/M, & n \in I, \\ -S_I/M, & \text{otherwise}, \end{cases}$$
together with the random variables $\xi_k = g(\eta_k)$, $1 \le k \le M$, and the random trigonometric polynomial
$$t_I^a(x) = \sum_{k=1}^{M} \xi_k e^{ikx}.$$
It is easy to see that
$$P_I(x) := \sum_{\eta_k \in I} a_{\eta_k} e^{ikx} = t_I^a(x) + S_I D_M(x)/M. \tag{5.3.27}$$
We need the following well-known lemma (see relation (7.4.9) in the Appendix).
Lemma 5.3.27. Let
$$D_M(x) = \sum_{k=1}^{M} e^{ikx}.$$
Then
$$C_2 M^{1-1/p} \le \|D_M\|_p \le C_3 M^{1-1/p}, \qquad p \in (1, \infty),$$
for some positive $C_2 = C_2(p)$ and $C_3 = C_3(p)$.

Applying Lemma 5.3.24 with $A = (\log M)^2$ we obtain
$$P\Bigl\{\max_{I \subseteq [1, M]} \|t_I^a\|_p \le 3 (\log M)^2 m_1^{-1/2}\Bigr\} \ge 1 - C_2(p)/\log M. \tag{5.3.28}$$
By Lemma 5.3.26,
$$P\Bigl\{\max_{1 \le n \le M} \Bigl|\bigl|\{k : |\hat{P}_M(k)| \ge (m_1 + n)^{-1}\}\bigr| - n\Bigr| \ge M^{1/2} \log M\Bigr\} \le 2M e^{-C(\log M)^2}. \tag{5.3.29}$$
Therefore, for $M \ge M_0(p)$ there exists a realization $a_{\eta_1}, \dots, a_{\eta_M}$ such that for the polynomial $P_M$ we have: for any $I \subseteq [1, M]$,
$$\|t_I^a\|_p \le 3 (\log M)^2 M^{-\gamma/2}, \tag{5.3.30}$$
and for any $n \in [1, M]$,
$$\Bigl|\bigl|\{k : |\hat{P}_M(k)| \ge (m_1 + n)^{-1}\}\bigr| - n\Bigr| \le M^{1/2} \log M. \tag{5.3.31}$$
We will use polynomials satisfying (5.3.30) and (5.3.31). We also need some other properties of these polynomials. We begin with two simple properties:
$$\|P_M\|_p \le 3 (\log M)^2 M^{-\gamma/2} + C(p) M^{-1/p-\gamma} \tag{5.3.32}$$
and, for $I = (n_1, n_2]$,
$$\|P_I\|_p \le 3 (\log M)^2 M^{-\gamma/2} + C M^{-1/p} \bigl(\ln(m_1 + n_2) - \ln(m_1 + n_1)\bigr). \tag{5.3.33}$$
The estimate (5.3.32) follows from (5.3.27) with $I = [1, M]$, (5.3.30), Lemma 5.3.27, and (5.3.26). The estimate (5.3.33) follows from (5.3.27), (5.3.30), Lemma 5.3.27, and the inequality
$$|S_I| \le \sum_{n \in I} (n + m_1)^{-1} \le C \bigl(\ln(m_1 + n_2) - \ln(m_1 + n_1)\bigr).$$
Let $\varepsilon_0 := (m_1 + m_2)^{-1}$. Then
$$T_{\varepsilon_0}(P_M) = \sum_{\eta_k \in [1, m_2]} a_{\eta_k} e^{ikx} = P_{[1, m_2]}.$$
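Using (5.3.27), Lemma 5.3.27, and (5.3.30) we obtain
$$\|T_{\varepsilon_0}(P_M)\|_p \ge C_1 S_{[1, m_2]} M^{-1/p} - 3 (\log M)^2 M^{-\gamma/2} \ge C_2 M^{-1/p} \ln M \tag{5.3.34}$$
provided $M \ge M_1(p, \gamma)$.

We now estimate $\|T_\delta(P_M) - T_\varepsilon(P_M)\|_p$ from above for arbitrary $\varepsilon > \delta > 0$. It is clear that it is sufficient to consider the case $a_1 \ge \varepsilon > \delta \ge |a_M|$. We define the numbers $1 \le n_1 \le n_2 \le M$ as follows:
$$|a_{n_1}| \ge \varepsilon > |a_{n_1 + 1}|, \qquad |a_{n_2}| \ge \delta > |a_{n_2 + 1}|$$
(we set $a_{M+1} := 0$). Let $I = (n_1, n_2]$. Then
$$T_\delta(P_M) - T_\varepsilon(P_M) = P_I.$$
By (5.3.33), we get
$$\|T_\delta(P_M) - T_\varepsilon(P_M)\|_p \le 3 (\log M)^2 M^{-\gamma/2} + C M^{-1/p} (\ln \varepsilon - \ln \delta). \tag{5.3.35}$$
We note that the condition $\delta \ge \varepsilon^{1+\alpha}$ implies that
$$\|T_\delta(P_M) - T_\varepsilon(P_M)\|_p \le 3 (\log M)^2 M^{-\gamma/2} + C \alpha M^{-1/p} \log M. \tag{5.3.36}$$
We now set $\varepsilon_n := |a_n|$ and estimate $\|G_n(P_M) - T_{\varepsilon_n}(P_M)\|_p$. We have $T_{\varepsilon_n}(P_M) = P_{[1, n]}$. Let
$$G_n(P_M) = \sum_{k \in \Lambda_n} \hat{P}_M(k) e^{ikx}, \qquad |\Lambda_n| = n,$$
and let $I_n$ be such that
$$T_{\varepsilon_n}(P_M) = \sum_{k \in I_n} \hat{P}_M(k) e^{ikx}.$$
It is clear that either $\Lambda_n \subseteq I_n$ or $I_n \subseteq \Lambda_n$. Hence, for $Z_n := (\Lambda_n \setminus I_n) \cup (I_n \setminus \Lambda_n)$ we get $|Z_n| \le \bigl||\Lambda_n| - |I_n|\bigr|$. By property (5.3.31),
$$|Z_n| \le M^{1/2} \log M,$$
and
$$\|G_n(P_M) - T_{\varepsilon_n}(P_M)\|_p \le C \bigl(M^{1/2} \log M\bigr)^{1-1/p} M^{-\gamma}. \tag{5.3.37}$$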
We take two numbers $1 \le n < m \le M$ and estimate $\|G_m(P_M) - G_n(P_M)\|_p$. By (5.3.37) we have
$$\|G_m(P_M) - G_n(P_M)\|_p \le 2C \bigl(M^{1/2} \log M\bigr)^{1-1/p} M^{-\gamma} + \|T_{\varepsilon_m}(P_M) - T_{\varepsilon_n}(P_M)\|_p. \tag{5.3.38}$$
Using (5.3.35) we continue:
$$\le 2C \bigl(M^{1/2} \log M\bigr)^{1-1/p} M^{-\gamma} + 3 (\log M)^2 M^{-\gamma/2} + C_1 M^{-1/p} \bigl(\ln(m + m_1) - \ln(n + m_1)\bigr). \tag{5.3.39}$$

Proof. We now prove Theorem 5.3.21. We define two sequences of natural numbers. Let $M_1$ be big enough to guarantee that there are polynomials $P_M$, $M \ge M_1$, satisfying (5.3.30)–(5.3.39). For $\nu \ge 1$ we define $M_{\nu+1} = 4M_\nu^2$. We put $N_1 = 0$ and for $\nu \ge 1$ we set $N_{\nu+1} = N_\nu + M_\nu$. Let
$$f(x) := \sum_{\nu=1}^{\infty} M_\nu^{1/p} (\log M_\nu)^{-1} e^{i N_\nu x} P_{M_\nu}(x). \tag{5.3.40}$$
It follows from (5.3.32) and the inequality $\gamma > 2/p$ that the series (5.3.40) converges in the $L_p$-norm. It follows from (5.3.34) that statement (2) of Theorem 5.3.21 holds. We now proceed to the proof of part (1) of Theorem 5.3.21. Let $\Lambda := \Lambda_j$, $Y := Y_j$, $\varepsilon := \varepsilon_j$, $\delta := \delta_j$ be from Theorem 5.3.21. We assume that $j$ is big enough to guarantee that $|Y| \le |\Lambda|^2$ and $\delta \ge \varepsilon^2$. Denote
$$U_\nu := \bigcup_{\mu=1}^{\nu} (N_\mu, N_\mu + M_\mu].$$
We note that
$$\min_{k \in (N_\nu, N_\nu + M_\nu]} |\hat{f}(k)| > \max_{k \in (N_{\nu+1}, N_{\nu+1} + M_{\nu+1}]} |\hat{f}(k)|.$$
Let $\nu$ be such that $U_{\nu-1} \subset \Lambda \subseteq U_\nu$. We will prove that $Y \subseteq U_{\nu+1}$. Indeed, if we assume that $U_{\nu+1} \subset Y$, then
$$|Y| \ge M_{\nu+1} \ge 4M_\nu^2; \qquad |\Lambda| \le \sum_{\mu=1}^{\nu} M_\mu < 2M_\nu,$$
which contradicts the fact that $|Y| \le |\Lambda|^2$. Also, $U_{\nu+1} \subset Y$ implies that
$$\delta \le M_{\nu+2}^{-\gamma + 1/p} (\log M_{\nu+2})^{-1} \tag{5.3.41}$$
and $\Lambda \subseteq U_\nu$ implies that
$$\varepsilon \ge M_\nu^{1/p} (\log M_\nu)^{-1} (2M_\nu)^{-1}. \tag{5.3.42}$$
The relations (5.3.41) and (5.3.42) for big $\nu$ contradict our assumption that $\delta \ge \varepsilon^2$. Thus we have $Y \subseteq U_{\nu+1}$. There are two cases: $Y \subseteq U_\nu$ or $U_\nu \subset Y$. The proofs for these are similar. Let us begin with the first one: $Y \subseteq U_\nu$. In this case,
$$S_Y(f) - S_\Lambda(f) = M_\nu^{1/p} (\log M_\nu)^{-1} e^{i N_\nu x} \bigl(S_{Y'}(P_{M_\nu}) - S_{\Lambda'}(P_{M_\nu})\bigr),$$
where $\Lambda' := \{k - N_\nu,\ k \in \Lambda\}$, $Y' := \{k - N_\nu,\ k \in Y\}$. By (5.3.36),
$$\|S_Y(f) - S_\Lambda(f)\|_p = o(1) \tag{5.3.43}$$
if $\delta = \varepsilon^{1+o(1)}$. By (5.3.38)–(5.3.39) we also obtain (5.3.43) if $|Y| = |\Lambda|^{1+o(1)}$. This completes the proof of (1) from Theorem 5.3.21 in the first case. We now proceed to the second case: $U_\nu \subset Y \subseteq U_{\nu+1}$. This case reduces to the first one by rewriting
$$S_Y(f) - S_\Lambda(f) = S_Y(f) - S_{U_\nu}(f) + S_{U_\nu}(f) - S_\Lambda(f).$$
The proof of Theorem 5.3.21 is complete.
5.3.5 Necessary and sufficient conditions in the case p = ∞

If $W$ is a set and $f : W \to W$ is a map, then by $f_k$ ($k \in \mathbb{N}$) we denote the $k$-fold iteration of $f$.

Theorem 5.3.28. Let $\alpha : \mathbb{N} \to \mathbb{N}$ be strictly increasing. Then the following conditions are equivalent:
(a) For some $k \in \mathbb{N}$ and for any sufficiently large $m \in \mathbb{N}$ we have $\alpha_k(m) > e^m$.
(b) If $f \in C(\mathbb{T})$ and
$$\|G_{\alpha(m)}(f) - G_m(f)\|_\infty \to 0 \quad (m \to \infty), \tag{5.3.44}$$
then
$$\|f - G_m(f)\|_\infty \to 0 \quad (m \to \infty). \tag{5.3.45}$$
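Condition (a) is a statement about iterates of the map α, so it may help to see it on two concrete maps. The following is a minimal sketch, not part of the source: the helper names `iterate` and `satisfies_a` and the two sample maps are assumptions made for illustration, and a check on a finite range of m is only a spot check, not a proof of (a).

```python
import math

def iterate(alpha, k, m):
    """Return the k-fold iterate alpha_k(m) = alpha(alpha(...alpha(m)...))."""
    for _ in range(k):
        m = alpha(m)
    return m

def satisfies_a(alpha, k, m_values):
    """Spot check of alpha_k(m) > e^m on a finite sample of m.

    Logarithms are compared so that very large integers are never converted to floats.
    """
    return all(math.log(iterate(alpha, k, m)) > m for m in m_values)

def doubling(m):
    return 2 * m          # alpha(m) = 2m, so alpha_k(m) = 2^k * m

def power_of_two(m):
    return 2 ** m         # alpha(m) = 2^m

sample = range(5, 13)
print(satisfies_a(doubling, 3, sample))       # linear growth stays below e^m
print(satisfies_a(power_of_two, 1, sample))   # 2^m is still below e^m
print(satisfies_a(power_of_two, 2, sample))   # 2^(2^m) exceeds e^m on the sample
```

On this sample the map m ↦ 2^m passes the check with k = 2 but not with k = 1, which is why the condition allows an arbitrary fixed number of iterations.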
Proof. We first prove that (a) implies (b). Denote $\gamma = \alpha_{2k}$. Then
$$\gamma(m) > e^{e^m} \quad (m \ge m_0). \tag{5.3.46}$$
Let $f \in C(\mathbb{T})$ and let (5.3.44) hold. Then
$$\|G_{\gamma(m)}(f) - G_m(f)\|_\infty \to 0 \quad (m \to \infty). \tag{5.3.47}$$
Let us estimate $\|V_m(f) - G_m(f)\|_\infty$, where $V_m(f)$ is the de la Vallée Poussin sum
$$V_m(f) = \sum_{|k|\le 2m} \min\Bigl(1, \frac{2m - |k|}{m}\Bigr)\hat f(k)e^{ikx}.$$
For $m \ge m_0$ we denote
$$h_1 := G_m(f) - V_m(f), \quad h_2 := G_{\gamma(m)}(f) - G_m(f), \quad h_3 := G_{\gamma(m)}(f), \quad h_4 := f - G_{\gamma(m)}(f).$$
It will be convenient to use the notation
$$\|\hat f\|_\infty := \|\{\hat f(k)\}\|_{\ell_\infty} := \sup_k |\hat f(k)|.$$
We have either $\inf_k |\hat h_3(k)| = 0$ or
$$\inf_{\hat h_3(k)\ne 0} |\hat h_3(k)| \le \|h_3\|_2(\gamma(m))^{-1/2} \le \|f\|_2 e^{-e^m/2}, \tag{5.3.48}$$
and hence
$$\|\hat h_4\|_\infty \le \|f\|_2 e^{-e^m/2}. \tag{5.3.49}$$
By Theorem 5.3.9 with $K = 2$, we get
$$\|h_1 + h_4\|_\infty \ge \|h_1\|_\infty/4 - e^{Cm}\|\hat h_4\|_\infty.$$
By (5.3.49),
$$\|h_1 + h_4\|_\infty \ge \|h_1\|_\infty/4 - o(1) \quad (m \to \infty).$$
Therefore, using (5.3.47) we have, for $m \to \infty$,
$$\|h_1\|_\infty \le 4\|h_1 + h_4\|_\infty + o(1) = 4\|f - V_m(f) - h_2\|_\infty + o(1) = o(1).$$
We have used above the well-known fact that $\|f - V_m(f)\|_\infty \to 0$ as $m \to \infty$ (see [86, Chapter 3, Section 13]). Using it again we complete the proof of the first implication: (a) implies (b).

Next we show that (b) implies (a). Suppose the function $\alpha$ does not satisfy (a). We claim that (b) does not hold. If $\alpha$ is the identity on $\mathbb{N}$, then the claim trivially follows from the existence of a continuous function with divergent greedy approximations. Otherwise, there is an $m_0 \in \mathbb{N}$ such that $\alpha(m_0) \ne m_0$. Since $\alpha$ is strictly increasing, we have $\alpha(m_0) > m_0$ and, moreover, $\alpha(m) > m$ for $m \ge m_0$. Let $m_j = \alpha_j(m_0) = \alpha(m_{j-1})$ for $j \in \mathbb{N}$. Then the sequence $\{m_j\}$ is strictly increasing. Moreover, the sequence $\{m_{j+1} - m_j\}$ is nondecreasing. By our supposition, for any $k \in \mathbb{N}$ there is an $m > m_0$ such that $\alpha_{k+1}(m) < e^m$. Let $m_{j-1} < m \le m_j$.
Then $\alpha_{k+1}(m) > m_{j+k}$ and thus $m_{j+k} < e^{m_j}$. Therefore, there is an unbounded nondecreasing function $\tau : \mathbb{N} \to \mathbb{N}$ such that for infinitely many $j \in \mathbb{N}$ we have
$$m_j < e^{m_{j-\tau(j)}}, \qquad \tau(j) < j. \tag{5.3.50}$$
Define the sequence $\{A_n\}$ by $A_n = 1$ for $n \le m_1$ and $A_n = (\tau(j))^{-1}(m_{j+1} - m_j)^{-1}$ for $m_j < n \le m_{j+1}$. Clearly, $\{A_n\}$ is nonincreasing. Then we have
$$\sum_{n=m_{j-\tau(j)}+1}^{m_j} A_n = \sum_{i=j-\tau(j)}^{j-1}\ \sum_{n=m_i+1}^{m_{i+1}} A_n = \sum_{i=j-\tau(j)}^{j-1} \tau(i)^{-1} \ge \sum_{i=j-\tau(j)}^{j-1} \tau(j)^{-1} = 1.$$
If, moreover, $j$ satisfies (5.3.50), then for $M = m_{j-\tau(j)}$ we get
$$\sum_{M < n \le e^M} A_n \ge 1.$$
We now use Theorem 5.2.4: there is a function $f \in C(\mathbb{T})$ such that $a_n(f) \le A_n$ and (5.3.45) fails. We take $m > m_1$ and let $m_j < m \le m_{j+1}$. Then
$$\|G_{\alpha(m)}(f) - G_m(f)\|_\infty \le \sum_{n=m+1}^{\alpha(m)} a_n(f) \le \sum_{n=m_j+1}^{m_{j+2}} A_n = \tau(j)^{-1} + \tau(j+1)^{-1} = o(1) \quad (m \to \infty),$$
which completes the proof of the theorem.

Theorem 5.3.29. Let $\beta : (0, +\infty) \to \mathbb{R}$ be a nondecreasing function such that
$$\limsup_{\varepsilon\to 0+} \beta(\varepsilon)/\varepsilon < 1. \tag{5.3.51}$$
Then the following conditions are equivalent:
(a) For some $k \in \mathbb{N}$ and for any sufficiently large $u > 0$ we have $\beta_k(1/u) < e^{-u}$.
(b) If $f \in C(\mathbb{T})$ and
$$\|T_{\beta(\varepsilon)}(f) - T_\varepsilon(f)\|_\infty \to 0 \quad (\varepsilon \to 0), \tag{5.3.52}$$
then
$$\|f - T_\varepsilon(f)\|_\infty \to 0 \quad (\varepsilon \to 0). \tag{5.3.53}$$
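Proof. We first prove that (a) implies (b). Denote $\gamma = \beta_{2k}$. Then
$$\gamma(1/u) < e^{-e^u} \quad (u \ge u_0). \tag{5.3.54}$$
Let $f \in C(\mathbb{T})$ satisfy (5.3.52). Then
$$\|T_{\gamma(\varepsilon)}(f) - T_\varepsilon(f)\|_\infty \to 0 \quad (\varepsilon \to 0). \tag{5.3.55}$$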
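For sufficiently small $\varepsilon > 0$ we denote $m(\varepsilon) := [1/\varepsilon]$ and
$$h_1 := T_\varepsilon(f) - V_{m(\varepsilon)}(f), \quad h_2 := T_{\gamma(\varepsilon)}(f) - T_\varepsilon(f), \quad h_3 := T_{\gamma(\varepsilon)}(f), \quad h_4 := f - T_{\gamma(\varepsilon)}(f).$$
We have
$$\bigl|\{k : \hat h_1(k) \ne 0\}\bigr| \le \bigl|\{k : \widehat{T_\varepsilon(f)}(k) \ne 0\}\bigr| + 4m(\varepsilon) \le \|f\|_2^2/\varepsilon^2 + 4m(\varepsilon).$$
The rest of the proof of the implication (a) ⇒ (b) repeats the proof for the same implication in Theorem 5.3.28.

Next we show that (b) implies (a). We assume that a function $\beta$ does not satisfy (a), and we shall show that (b) does not hold. By the assumption (5.3.51), there are numbers $\theta < 1$ and $\varepsilon_0 > 0$ such that
$$\beta(\varepsilon) \le \theta\varepsilon \quad (0 < \varepsilon \le \varepsilon_0).$$
For $j \in \mathbb{N}$, denote $\varepsilon_j = \beta_j(\varepsilon_0) = \beta(\varepsilon_{j-1})$. We have
$$\varepsilon_j \le \theta\varepsilon_{j-1}. \tag{5.3.56}$$
By our assumption, for any $k \in \mathbb{N}$ there is $\varepsilon < \varepsilon_0$ such that $\beta_{k+1}(\varepsilon) \ge e^{-1/\varepsilon}$. Let $\varepsilon_{j-1} \ge \varepsilon > \varepsilon_j$. Then $\beta_{k+1}(\varepsilon) \le \varepsilon_{j+k}$, and thus $\varepsilon_{j+k} > e^{-1/\varepsilon_j}$. Therefore, there is an unbounded nondecreasing function $\tau : \mathbb{N} \to \mathbb{N}$ such that for infinitely many $j \in \mathbb{N}$ we have
$$\varepsilon_j > e^{-1/\varepsilon_{j-\tau(j)}}. \tag{5.3.57}$$
Also, we can assume that $\tau(j) \le j$ for all $j$. Let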
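$$m_j := \Bigl[\frac{1}{\varepsilon_j\tau(j)}\Bigr], \qquad M_j := \sum_{i=1}^{j} m_i. \tag{5.3.58}$$
We set $M_0 := 0$. Let us estimate $M_j$ from above and from below. We have
$$M_j \le \sum_{i=1}^{j} \frac{1}{\varepsilon_i},$$
and, by (5.3.56),
$$M_j \le \frac{1}{(1-\theta)\varepsilon_j}. \tag{5.3.59}$$
Also, (5.3.56) and the divergence of $\tau(j)$ to $\infty$ as $j \to \infty$ imply that
$$M_j = o\bigl(\varepsilon_j^{-1}\bigr) \quad (j \to \infty). \tag{5.3.60}$$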
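The bounds (5.3.59) and (5.3.60) can be tried out on a model case. The sketch below is illustrative only: it assumes β(ε) = θε with θ = 1/2, a starting value ε_0 = 1/4, the choice τ(j) = bit length of j, and it reads the brackets in (5.3.58) as the integer part. None of these choices comes from the source. Exact rational arithmetic is used so that the inequalities are checked without rounding.

```python
from fractions import Fraction
from math import floor

THETA = Fraction(1, 2)   # assumed: beta(eps) = THETA * eps, so eps_j = THETA**j * EPS0
EPS0 = Fraction(1, 4)    # assumed starting value eps_0

def tau(j):
    """Illustrative unbounded nondecreasing function with tau(j) <= j."""
    return j.bit_length()    # 1, 2, 2, 3, 3, 3, 3, 4, ...

def build(J):
    """Return (eps, m, M) as lists indexed 0..J; m[0] and M[0] are 0."""
    eps, m, M = [EPS0], [0], [0]
    for j in range(1, J + 1):
        eps.append(THETA * eps[-1])                 # eps_j = beta(eps_{j-1})
        m.append(floor(1 / (eps[j] * tau(j))))      # m_j = [1 / (eps_j * tau(j))]
        M.append(M[-1] + m[j])                      # M_j = m_1 + ... + m_j
    return eps, m, M

eps, m, M = build(30)
for j in (5, 10, 20, 30):
    # First ratio: (5.3.59) says it is at most 1. Second: M_j * eps_j from (5.3.60).
    print(j, float(M[j] * (1 - THETA) * eps[j]), float(M[j] * eps[j]))
```

In this model the second printed quantity decreases only slowly, roughly like 1/τ(j), which reflects how mild the decay in (5.3.60) can be.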
By (5.3.56), for sufficiently large $j$ we have $\varepsilon_j < j^{-2}/4$, and, taking into account (5.3.58), we get
$$m_j \ge \frac{1}{2\varepsilon_j\tau(j)} \tag{5.3.61}$$
and also
$$M_j \ge m_j \ge (\varepsilon_j)^{-1/2}. \tag{5.3.62}$$
Now define the sequence $\{A_n\}$ by $A_n = \varepsilon_j$ for $M_{j-1} < n \le M_j$. If $j - \tau(j)$ is large enough (observe that this is true if $j$ itself is large and (5.3.57) holds), then, by (5.3.61), we have
$$\sum_{n=M_{j-\tau(j)}+1}^{M_j} A_n = \sum_{i=j-\tau(j)}^{j-1}\ \sum_{n=M_i+1}^{M_{i+1}} A_n = \sum_{i=j-\tau(j)}^{j-1} m_{i+1}\varepsilon_{i+1} \ge \sum_{i=j-\tau(j)}^{j-1} (2\tau(i+1))^{-1} \ge \sum_{i=j-\tau(j)}^{j-1} (2\tau(j))^{-1} = \frac{1}{2}. \tag{5.3.63}$$
We now assume that (5.3.57) holds and denote $\varepsilon := \varepsilon_{j-\tau(j)}$. Using (5.3.57), (5.3.59), and (5.3.62), we have
$$M_j < \frac{e^{1/\varepsilon}}{1-\theta}, \qquad M_{j-\tau(j)} \ge \varepsilon^{-1/2}.$$
Therefore, if $j$ is large enough (and, hence, $\varepsilon$ is small), we have
$$M_j < \exp\bigl(\exp(M_{j-\tau(j)})\bigr).$$
We now take $M$ equal to one of the numbers $M_{j-\tau(j)}$, $[\exp(M_{j-\tau(j)})]$. Then by (5.3.63) we get the inequality
$$\sum_{M < n \le e^M} A_n \ge 1/4.$$
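Similarly to the proof of Theorem 5.3.28 we now use Theorem 5.2.4: there is a function $f \in C(\mathbb{T})$ such that $a_n(f) \le A_n$ and (5.3.45) fails. We take $\varepsilon$ sufficiently small and estimate $\|T_{\beta(\varepsilon)}(f) - T_\varepsilon(f)\|_\infty$. Let $\varepsilon_{j-1} > \varepsilon \ge \varepsilon_j$. We have
$$\|T_{\beta(\varepsilon)}(f) - T_\varepsilon(f)\|_\infty \le \sum_{\beta(\varepsilon)\le|\hat f(k)|<\varepsilon} |\hat f(k)| \le \sum_{\varepsilon_{j+1}\le|\hat f(k)|<\varepsilon_{j-1}} |\hat f(k)| \le \Sigma_1 + \Sigma_2, \tag{5.3.64}$$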
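where
$$\Sigma_1 = \sum_{n>M_{j-1},\ \varepsilon_{j+1}\le a_n(f)<\varepsilon_{j-1}} a_n(f)$$
and
$$\Sigma_2 = \sum_{n\le M_{j-1},\ \varepsilon_{j+1}\le a_n(f)<\varepsilon_{j-1}} a_n(f).$$
We observe that, in the case $n > M_{j+1}$, $a_n(f) \le A_n < \varepsilon_{j+1}$. Hence,
$$\Sigma_1 \le \sum_{M_{j-1}<n\le M_{j+1}} a_n(f) \le \sum_{M_{j-1}<n\le M_{j+1}} A_n = m_j\varepsilon_j + m_{j+1}\varepsilon_{j+1} \le \tau(j)^{-1} + \tau(j+1)^{-1} \to 0 \quad (j \to \infty). \tag{5.3.65}$$
Further, by (5.3.60),
$$\Sigma_2 < \sum_{n\le M_{j-1}} \varepsilon_{j-1} \le M_{j-1}\varepsilon_{j-1} \to 0 \quad (j \to \infty). \tag{5.3.66}$$
Thus, by (5.3.64)–(5.3.66),
$$\lim_{\varepsilon\to 0} \|T_{\beta(\varepsilon)}(f) - T_\varepsilon(f)\|_\infty = 0, \tag{5.3.67}$$
and (5.3.52) holds. Moreover, (5.3.67) clearly implies that
$$\lim_{\delta\to 0} \sum_{|\hat f(k)|=\delta} |\hat f(k)| = 0,$$
and thus for $f$ the convergence of greedy approximations and the convergence of thresholding approximations are equivalent. But we know that (5.3.45) fails. Therefore, (5.3.53) does not hold either. Theorem 5.3.29 is proved.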
5.4 An application of WCGA

This section is based on the paper [19]. We consider a particular case: the Banach space $X = L_p(\mathbb{T}^d)$ and the dictionary $\mathcal{D} = \mathcal{RT}$, the real trigonometric system $\{1/2, \sin x, \cos x, \dots\}$, together with its $d$-dimensional version $\mathcal{RT}^d = \mathcal{RT} \times \cdots \times \mathcal{RT}$. It is more convenient to consider the real $L_p(\mathbb{T}^d)$ and the real trigonometric system because the Weak Chebyshev Greedy Algorithm is defined and studied in a real
Banach space. Note that the system $\mathcal{RT}$ is not normalized in $L_p$, but only seminormalized: $C_1 \le \|f\|_p \le C_2$ for any $f \in \mathcal{RT}$, with absolute constants $C_1$, $C_2$, $1 \le p \le \infty$. This is sufficient for the application of the general methods developed in [73]. We will compare the performance of WCGA with the performance of the Thresholding Greedy Algorithm (TGA). It is proved in [68] that in the case of the complex trigonometric system $\mathcal{T}^d = \{e^{i(k,x)}\}$, for any $f \in L_p(\mathbb{T}^d)$ we have that
$$\|f - G_m(f, \mathcal{T}^d)\|_p \le Cm^{|1/2-1/p|}\sigma_m(f, \mathcal{T}^d)_p, \qquad 1 \le p \le \infty, \tag{5.4.1}$$
with an absolute constant $C$. The same proof works for $\mathcal{RT}^d$ and gives
$$\|f - G_m(f, \mathcal{RT}^d)\|_p \le Cm^{|1/2-1/p|}\sigma_m(f, \mathcal{RT}^d)_p, \qquad 1 \le p \le \infty. \tag{5.4.2}$$
5.4.1 Convergence

It is shown in [68] that $G_m(\cdot, \mathcal{T})$ may fail to converge in $L_p$ when $p \ne 2$. The same is true for $G_m(\cdot, \mathcal{RT})$. The convergence of WCGA for $1 < p < \infty$ follows from general results (see [73]). The divergence of $G_m(\cdot, \mathcal{T})$ and $G_m(\cdot, \mathcal{RT})$ can be fixed by adding the "Chebyshev step" in these algorithms. We describe this in the general case of $G_m(\cdot, \Psi)$. At the step $m$, instead of taking the partial sum
$$G_m(f, \Psi) = \sum_{j=1}^{m} c_{k_j}\psi_{k_j},$$
we take the best approximation $B_m(f, \Psi, X)$ to $f$ from $\operatorname{span}\{\psi_{k_1}, \dots, \psi_{k_m}\}$. It is easy to see that in this case we have
$$\|f - B_m(f, \Psi, X)\|_X \to 0$$
as $m \to \infty$. Thus, in the sense of convergence, the Weak Chebyshev Greedy Algorithm and the Chebyshev Thresholding Greedy Algorithm (CTGA) defined above are both effective in $L_p$, $1 < p < \infty$.
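The partial sum $G_m(f, \Psi)$ simply keeps the m largest coefficients. As a small illustration, and only a sketch, the code below applies this to a real trigonometric polynomial stored as a dictionary of coefficients. The names `tga`, `evaluate`, `l2_norm` and the sample polynomial are assumptions for the example; frequencies are taken to be at least 1, and the norm is the one normalized by 1/π. In $L_2$ the thresholding step and the Chebyshev step coincide, so the example only shows the selection rule and checks the residual norm against Parseval's identity.

```python
import math

def tga(coeffs, m):
    """Thresholding greedy approximant: keep the m largest coefficients in absolute value.

    `coeffs` maps labels such as ("cos", 3) or ("sin", 1) to real coefficients.
    """
    ranked = sorted(coeffs, key=lambda key: abs(coeffs[key]), reverse=True)
    return {key: coeffs[key] for key in ranked[:m]}

def evaluate(coeffs, x):
    """Evaluate the trigonometric polynomial with the given coefficients at x."""
    return sum(c * (math.cos(k * x) if kind == "cos" else math.sin(k * x))
               for (kind, k), c in coeffs.items())

def l2_norm(coeffs, n_grid=2048):
    """((1/pi) * integral over [0, 2pi) of f^2)^(1/2), computed by a uniform Riemann sum.

    The sum is exact (up to rounding) for polynomials of degree below n_grid / 2.
    """
    step = 2 * math.pi / n_grid
    total = sum(evaluate(coeffs, i * step) ** 2 for i in range(n_grid))
    return math.sqrt(total * step / math.pi)

coeffs = {("cos", 1): 0.5, ("sin", 2): -0.3, ("cos", 5): 0.15, ("sin", 7): 0.05}
kept = tga(coeffs, 2)
residual = {key: c for key, c in coeffs.items() if key not in kept}
print(sorted(kept))
print(l2_norm(residual), math.sqrt(sum(c * c for c in residual.values())))
```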
5.4.2 Rate of approximation

We will compare the rates of approximation of TGA, CTGA, and WCGA for the class
$$A_1 := A_1(\mathcal{RT}) := \Bigl\{f : \sum_{k=0}^{\infty} \bigl(|a_k(f)| + |b_k(f)|\bigr) \le 1\Bigr\},$$
where $a_k$, $b_k$ are the corresponding Fourier coefficients. From the general results on convergence rate of the Weak Chebyshev Greedy Algorithm (see Theorem 6.2.6 from Chapter 6) we get the following lemma.
Lemma 5.4.1. For $f \in A_1$ we have
$$\|f_m^{c,t}\|_p \le C(p,t)m^{-1/2}, \qquad 2 \le p < \infty. \tag{5.4.3}$$
The estimate (5.4.2) implies, for $f \in A_1$ and $2 < p < \infty$, that
$$\|f - B_m(f, \mathcal{RT}, L_p)\|_p \le \|f - G_m(f, \mathcal{RT})\|_p \le C(p,t)m^{-1/p}, \tag{5.4.4}$$
which is weaker than (5.4.3). Let us give an example showing that (5.4.4) cannot be improved in the sense of order. Consider, for a given $m$,
$$f := (2m)^{-1}\sum_{k=1}^{2m}\Bigl(1 - \frac{k}{4m}\Bigr)\cos kx \in A_1.$$
Then $B_m(f, \mathcal{RT}, L_p)$ is the best approximation to $f$ from $\operatorname{span}\{\cos x, \dots, \cos mx\}$ in $L_p$. It is not difficult to see that
$$\|f - B_m(f, \mathcal{RT}, L_p)\|_p \ge C(p)m^{-1/p}. \tag{5.4.5}$$
This proves that (5.4.4) cannot be improved in the sense of the order of $m$.

Now we will show that the constant $C(p,t)$ in (5.4.4) can be replaced by 1. We denote $p' := p/(p-1)$ and use the Hausdorff–Young theorem (see Appendix, Theorem 7.3.1): for $g \in L_{p'}$, $2 \le p < \infty$,
$$\Bigl(\sum_{k=0}^{\infty} |a_k(g)|^p + |b_k(g)|^p\Bigr)^{1/p} \le \|g\|_{p'}. \tag{5.4.6}$$
Then we have, for any $f \in A_1$,
$$\|f - G_m(f, \mathcal{RT})\|_p = \sup_{g,\ \|g\|_{p'}\le 1} \langle f - G_m(f, \mathcal{RT}), g\rangle = \sup_{g,\ \|g\|_{p'}\le 1}\Bigl(\sum_{k\notin\Lambda_c} a_k(f)a_k(g) + \sum_{l\notin\Lambda_s} b_l(f)b_l(g)\Bigr), \tag{5.4.7}$$
where $\Lambda_c$ and $\Lambda_s$ are such that (here $\#$ denotes cardinality)
$$G_m(f, \mathcal{RT}) = \sum_{k\in\Lambda_c} a_k(f)\cos kx + \sum_{l\in\Lambda_s} b_l(f)\sin lx, \qquad \#\Lambda_c + \#\Lambda_s = m.$$
Using the Hölder inequality we continue (5.4.7):
$$\le \sup_{\|g\|_{p'}\le 1}\Bigl(\sum_{k\notin\Lambda_c} |a_k(f)|^{p'} + \sum_{l\notin\Lambda_s} |b_l(f)|^{p'}\Bigr)^{1/p'}\Bigl(\sum_{k\notin\Lambda_c} |a_k(g)|^{p} + \sum_{l\notin\Lambda_s} |b_l(g)|^{p}\Bigr)^{1/p}.$$
By (5.4.6) we have
$$\le \Bigl(\sum_{k\notin\Lambda_c} |a_k(f)|^{p'} + \sum_{l\notin\Lambda_s} |b_l(f)|^{p'}\Bigr)^{1/p'}.$$
Using the definitions of $A_1$ and $\Lambda_c$, $\Lambda_s$, we finish the estimate:
$$\le m^{1/p'-1} = m^{-1/p}.$$
Thus we have proved that for $f \in A_1$
$$\|f - G_m(f, \mathcal{RT})\|_p \le m^{-1/p}, \qquad 2 \le p < \infty. \tag{5.4.8}$$
The relations (5.4.3) and (5.4.5) show that WCGA gives a better rate of approximation in $L_p$ ($2 \le p < \infty$) than CTGA.
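The last step of the estimate uses only a fact about sequences: if the absolute values of the coefficients sum to at most 1, then after removing the m largest ones, every remaining coefficient is at most 1/m in absolute value, so the $\ell_{p'}$ norm of the remainder is at most $m^{1/p'-1} = m^{-1/p}$. The following sketch checks this on random data. The helper names and the random test vectors are assumptions made for illustration only.

```python
import random

def tail_norm(coeffs, m, p):
    """l_{p'} norm (p' = p/(p-1)) of what remains after removing the m largest |coefficients|."""
    p_conj = p / (p - 1)
    tail = sorted((abs(c) for c in coeffs), reverse=True)[m:]
    return sum(c ** p_conj for c in tail) ** (1 / p_conj)

def random_unit_l1(n, rng):
    """A random coefficient vector whose absolute values sum to 1."""
    raw = [rng.uniform(-1, 1) for _ in range(n)]
    total = sum(abs(c) for c in raw)
    return [c / total for c in raw]

rng = random.Random(0)
for p in (2, 3, 6):
    for m in (1, 5, 20):
        coeffs = random_unit_l1(200, rng)
        # The first number should not exceed the second, m^(-1/p).
        print(p, m, tail_norm(coeffs, m, p), m ** (-1 / p))
```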
5.4.3 Constructive approximation of function classes

In this subsection we discuss the question of efficiency of the algorithms TGA and WCGA for some function classes. As above, we confine ourselves to the case of $m$-term approximation with respect to the trigonometric system. For a function class $F$, we denote
$$\sigma_m(F, \mathcal{T}^d)_p := \sup_{f\in F}\sigma_m(f, \mathcal{T}^d)_p.$$
The inequality (5.4.1) implies that for any function class $F$ we have
$$\sup_{f\in F}\|f - G_m(f, \mathcal{T}^d)\|_p \le Cm^{|1/2-1/p|}\sigma_m(F, \mathcal{T}^d)_p, \qquad 1 \le p \le \infty.$$
We would like to understand for what function classes $F$ the constructive algorithms TGA and WCGA provide an optimal order of approximation, the order of $\sigma_m(F, \mathcal{T}^d)_p$. The major point of the discussion that follows is that in the case of standard smoothness classes and the $L_p$ spaces with $2 \le p \le \infty$, WCGA provides a constructive way of $m$-term approximation as efficient as the best $m$-term approximation. We will need some known results on $\sigma_m(F, \mathcal{T}^d)_p$ in the discussion. For the reader's convenience these results are formulated as theorems (see Theorems 5.4.2 and 5.4.3).

In the paper [15] two types of function classes were studied from the point of view of best $m$-term trigonometric approximation. We begin with the first class. For $0 < \alpha < \infty$ and $0 < q \le \infty$, let $F_q^\alpha$ denote the class of the functions in $L_1(\mathbb{T}^d)$ such that
$$|f|_{F_q^\alpha} := \Bigl(\sum_{k\in\mathbb{Z}^d} \max(1, |k_1|, \dots, |k_d|)^{\alpha q}\,|\hat f(k)|^q\Bigr)^{1/q} \le 1.$$
The following theorem was proved in [15].
Theorem 5.4.2. If $\alpha > 0$ and $\lambda := \alpha/d + 1/q - 1/2$, then for all $1 \le p \le \infty$ and all $0 < q \le \infty$ we have
$$C_1 m^{-\lambda} \le \sigma_m(F_q^\alpha, \mathcal{T}^d)_p \le C_2 m^{-\lambda}, \qquad \alpha > d(1 - 1/q)_+,$$
with $C_1, C_2 > 0$ constants depending only on $d$, $\alpha$, $q$.

The second class is defined as follows. Let $\alpha > 0$, $0 < \tau, s \le \infty$, and let $B_s^\alpha(L_\tau)$ denote the class of functions for which there exist trigonometric polynomials $T_n$ of coordinate degree $2^n$ with the properties
$$f = \sum_{n=0}^{\infty} T_n, \qquad \bigl\|\{2^{n\alpha}\|T_n\|_\tau\}_{n=0}^{\infty}\bigr\|_{\ell_s(\mathbb{Z})} \le 1.$$
The following theorem concerning these classes was proved in [15].

Theorem 5.4.3. Let $1 \le p \le \infty$, $0 < \tau, s \le \infty$, and define
$$\alpha(p, \tau) := \begin{cases} d(1/\tau - 1/p)_+, & 0 < \tau, s \le \infty \text{ and } 1 \le p \le \tau \le \infty,\\ \max(d/\tau, d/2), & \text{otherwise.}\end{cases}$$
Then for $\alpha > \alpha(p, \tau)$ we have
$$C_1 m^{-\mu} \le \sigma_m(B_s^\alpha(L_\tau), \mathcal{T}^d)_p \le C_2 m^{-\mu},$$
where $\mu := \alpha/d - (1/\tau - \max(1/p, 1/2))_+$ and $C_1$, $C_2$ depend only on $\alpha$, $p$, $\tau$, and $d$.

Remark 5.4.4. Theorems 5.4.2 and 5.4.3 hold with $\mathcal{T}^d$ replaced by $\mathcal{RT}^d$.

It was proved in [73] that in the case $1 \le p \le 2$ the rate of best $m$-term approximation in Theorem 5.4.2 can be realized by $G_m(\cdot, \mathcal{T}^d)$ for TGA, that is, by a constructive method (the same is true for $\mathcal{RT}^d$). In the same case $1 \le p \le 2$ the rate of $\sigma_m(B_s^\alpha(L_\tau), \mathcal{T}^d)_p$ can be realized by approximating by trigonometric polynomials of degree $m^{1/d}$ in each variable. Thus, in the case $1 \le p \le 2$ there exist constructive methods which provide the optimal rate in Theorems 5.4.2 and 5.4.3. However, for the case $2 < p \le \infty$, which is the most interesting case in Theorems 5.4.2 and 5.4.3 from the point of view of upper estimates (they do not depend on $p$ in this case), there was no constructive proof of the upper estimates. The existing methods in this case are based either on a probabilistic approach (Yu. Makovoz, $2 < p < \infty$), which does not cover the most interesting case $p = \infty$, or on the geometry of finite-dimensional Banach spaces ([15], $2 < p \le \infty$), which covers the case $p = \infty$. Both approaches contain a nonconstructive step. In [15] this step is hidden in the following inequality (see [15]):
$$\sigma_m\bigl(A_1(\mathcal{T}_n^d), \mathcal{T}^d\bigr)_\infty \le Cm^{-1/2}\Bigl(1 + \ln_+\frac{n^d}{m}\Bigr)^{1/2}, \tag{5.4.9}$$
where $\mathcal{T}_n^d$ denotes the subsystem of the trigonometric system $\mathcal{T}^d$ which forms a basis for the space of trigonometric polynomials of coordinate degree $n$. The inequality (5.4.9) was proved in [15] with the help of the following theorem of Gluskin [29].

Theorem 5.4.5. There exist absolute constants $C_1$ and $0 < \delta < 1$ such that for any finite collection $V$ of $M$ vectors from the unit Euclidean ball $B_2^N$ of $\mathbb{R}^N$ there is a vector $z \in \mathbb{R}^N$ with $|z_i| = 0, 1$, $i = 1, \dots, N$, $\|z\|_{\ell_1^N} \ge \delta N$, and
$$\max_{v\in V} |\langle v, z\rangle| \le C_1\Bigl(1 + \ln_+\frac{M}{N}\Bigr)^{1/2}.$$
The major purpose of this section is to note that in the case $2 < p < \infty$ the Weak Chebyshev Greedy Algorithm provides a constructive way to get an analog of (5.4.9). This follows immediately from Lemma 5.4.1: for $f \in A_1(\mathcal{RT}_n^d)$ we have
$$\|f_m^{c,t}\|_p \le C(p,t)m^{-1/2}, \qquad 2 \le p < \infty. \tag{5.4.10}$$
Thus the only nonconstructive step in the proof of upper estimates in Theorems 5.4.2 and 5.4.3 can be made constructive for $p < \infty$.
5.5 Constructive nonlinear trigonometric m-term approximation

This section is based on the paper [77]. We describe the approximation method in detail in the univariate case. Consider the real $L_p(\mathbb{T})$ space with
$$\|f\|_p := \Bigl(\frac{1}{\pi}\int_{\mathbb{T}} |f(x)|^p\,dx\Bigr)^{1/p}, \quad 1 \le p < \infty; \qquad \|f\|_\infty := \max_{x\in\mathbb{T}} |f(x)|, \quad f \text{ continuous}.$$
Let $1 \le p < \infty$. Denote by $\mathcal{T}_p$ the real trigonometric system normalized in $L_p$,
$$2^{-1/p},\ c_p\sin x,\ c_p\cos x,\ \dots,$$
where
$$c_p = \Bigl(\frac{1}{\pi}\int_{\mathbb{T}} |\sin x|^p\,dx\Bigr)^{-1/p}.$$
It is clear that $C^1 \le c_p \le C^2$ with two absolute constants $C^1$ and $C^2$. Let $\mathcal{T}(N)$ denote the set of trigonometric polynomials of order $N$.
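The normalizing constant $c_p$ is easy to evaluate numerically, which gives a feeling for the claim that it stays between two absolute constants. The sketch below uses a midpoint rule on $[0, 2\pi]$; the function name `c_p` and the grid size are choices made for this illustration. For $p = 2$ the integral equals $\pi$, so $c_2 = 1$, and for $p = 1$ it equals 4, so $c_1 = \pi/4$.

```python
import math

def c_p(p, n_grid=20000):
    """c_p = ((1/pi) * integral_0^{2pi} |sin x|^p dx)^(-1/p), via the midpoint rule."""
    step = 2 * math.pi / n_grid
    integral = sum(abs(math.sin((i + 0.5) * step)) ** p for i in range(n_grid)) * step
    return (integral / math.pi) ** (-1 / p)

for p in (1, 2, 4, 8, 16, 64):
    print(p, c_p(p))
```

On these values the constant stays within a narrow band around 1, starting at $\pi/4$ for $p = 1$.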
We discuss first a simpler construction, based on the particular case of $p = 4$, in order to illustrate the idea of the construction. For a trigonometric polynomial
$$t(x) = a_0/2 + \sum_{k=1}^{N} (a_k\cos kx + b_k\sin kx),$$
denote
$$\|t\|_A := |a_0| + \sum_{k=1}^{N} (|a_k| + |b_k|).$$
Then, by Theorems 6.2.6, 6.3.3 and 6.4.4 from Chapter 6, each of the algorithms WCGA, WRGA, and WGAFR with $\tau = \{1/2\}$, $q = 2$ provides a constructive way of approximation in the $L_4$-norm: for any $t \in \mathcal{T}(N)$ we get an $m$-term trigonometric polynomial $G_m(t) \in \mathcal{T}(N)$ such that
$$\|t - G_m(t)\|_4 \le C_1 m^{-1/2}\|t\|_A, \tag{5.5.1}$$
with an absolute constant $C_1$. By Nikol'skii's inequality (see Theorem 7.5.4 of the Appendix), this implies that
$$\|t - G_m(t)\|_\infty \le C_2 N^{1/4} m^{-1/2}\|t\|_A \tag{5.5.2}$$
with an absolute constant $C_2$. We will build our constructive approximation operators $A_k(N, m)$ inductively from level $k = 1$ up to arbitrary level $k$. We begin with the level $k = 1$. We set, for $t \in \mathcal{T}(N)$,
$$A_1(N, m)(t) := G_m(t).$$
Then (5.5.2) implies that, for $m \le N$,
$$\|t - A_1(N, m)(t)\|_\infty \le C_2 N^{1/4} m^{-1/2}\|t\|_A \le A_1 N^{1/4}(N/m)^{1/2} m^{-1/2}\|t\|_A. \tag{5.5.3}$$
We continue the construction inductively. Suppose that we have built operators $A_k(N, m)$ such that, for any $t \in \mathcal{T}(N)$,
$$\|t - A_k(N, m)(t)\|_\infty \le A_k N^{2^{-k-1}}(N/m)^{1/2} m^{-1/2}\|t\|_A. \tag{5.5.4}$$
We will build operators $A_{k+1}(N, m)$ and will control the constant $A_{k+1}$. We will carry out the construction for even numbers $m$.

Step 1. Let $t \in \mathcal{T}(N)$. We approximate $t$ using (5.5.1):
$$\|t - G_{m/2}(t)\|_4 \le C_1 m^{-1/2}\|t\|_A.$$
Denote
$$h := \bigl(t - G_{m/2}(t)\bigr)/\|t - G_{m/2}(t)\|_4.$$
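Step 2. Take a positive number $D$ and decompose
$$h = h_D + h^D; \qquad h_D(x) := \begin{cases} h(x), & \text{if } |h(x)| \le D,\\ 0, & \text{otherwise.}\end{cases}$$
We need the following simple well-known result.

Lemma 5.5.1. Assume $p \in [2, \infty)$ and $\|f\|_p = 1$. Then
$$\|f_D\|_\infty \le D \quad\text{and}\quad \|f^D\|_2 \le D^{1-p/2}.$$
By Lemma 5.5.1 with $p = 4$,
$$\|h_D\|_\infty \le D \quad\text{and}\quad \|h^D\|_2 \le D^{-1}.$$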
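The truncation in Step 2 and the two bounds of Lemma 5.5.1 can be tried on sampled data. The sketch below is a discrete analogue only: it replaces the integral norm by the mean over grid samples (an assumption of this example, which changes nothing in the argument since the lemma holds for any measure), and the peaked test function, the helper names, and the value of D are illustrative choices.

```python
import math

def lp_norm(values, p):
    """Discrete analogue of the L_p norm: (mean of |v|^p)^(1/p) over the samples."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1 / p)

def truncate(values, D):
    """Split h = h_D + h^D: h_D keeps samples with |h| <= D, h^D keeps the rest."""
    low = [v if abs(v) <= D else 0.0 for v in values]
    high = [v - w for v, w in zip(values, low)]
    return low, high

p, D = 4, 1.5
grid = [2 * math.pi * i / 1000 for i in range(1000)]
raw = [1 / (1.1 - math.cos(x)) for x in grid]     # a sharply peaked test function
scale = lp_norm(raw, p)
h = [v / scale for v in raw]                       # discrete L_p norm of h is now 1
h_low, h_high = truncate(h, D)
print(max(abs(v) for v in h_low) <= D)             # sup-norm bound for h_D
print(lp_norm(h_high, 2) <= D ** (1 - p / 2))      # L_2 bound for h^D
```

The second bound follows from $|h|^2 \le D^{2-p}|h|^p$ on the set where $|h| > D$, which is the whole content of the lemma.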
We would like to work with trigonometric polynomials instead of $h_D$ and $h^D$. Let $V_N$ be the de la Vallée Poussin operator. Consider $V_N(h_D)$ and $V_N(h^D)$. We have $h = V_N(h) = V_N(h_D) + V_N(h^D)$ and
$$\|V_N(h_D)\|_\infty \le 3D, \qquad \|V_N(h^D)\|_2 \le D^{-1}, \qquad \|V_N(h^D)\|_A \le 2N^{1/2}D^{-1}.$$
Step 3. We approximate $V_N(h^D) \in \mathcal{T}(2N)$ using operators from level $k$. By (5.5.4) we have
$$\bigl\|V_N(h^D) - A_k(2N, m/2)\bigl(V_N(h^D)\bigr)\bigr\|_\infty \le A_k(2N)^{2^{-k-1}}\,3(N/m)^{1/2}m^{-1/2}\|V_N(h^D)\|_A.$$
For $t \in \mathcal{T}(N)$ define
$$A_{k+1}(N, m, D)(t) := G_{m/2}(t) + \|t - G_{m/2}(t)\|_4\,A_k(2N, m/2)\bigl(V_N(h^D)\bigr).$$
Since $h \in \mathcal{T}(N)$, we get
$$\begin{aligned} t - A_{k+1}(N, m, D)(t) &= h\|t - G_{m/2}(t)\|_4 - \|t - G_{m/2}(t)\|_4\,A_k(2N, m/2)\bigl(V_N(h^D)\bigr)\\ &= \|t - G_{m/2}(t)\|_4\bigl(h - A_k(2N, m/2)(V_N(h^D))\bigr)\\ &= \|t - G_{m/2}(t)\|_4\bigl(V_N(h_D) + V_N(h^D) - A_k(2N, m/2)(V_N(h^D))\bigr).\end{aligned}$$
Therefore,
$$\begin{aligned}\|t - A_{k+1}(N, m, D)(t)\|_\infty &\le \|t - G_{m/2}(t)\|_4\bigl(3D + A_k(2N)^{2^{-k-1}}\,6(N/m)^{1/2}m^{-1/2}N^{1/2}D^{-1}\bigr)\\ &\le \bigl(3D + A_k(2N)^{2^{-k-1}}\,6(N/m)D^{-1}\bigr)C_1 m^{-1/2}\|t\|_A.\end{aligned} \tag{5.5.5}$$
Step 4. Choose
$$D = D(N, m, k) := \bigl(2A_k(2N)^{2^{-k-1}}(N/m)\bigr)^{1/2}.$$
By (5.5.5),
$$\|t - A_{k+1}(N, m, D)(t)\|_\infty \le A'_{k+1}N^{2^{-k-2}}(N/m)^{1/2}m^{-1/2}\|t\|_A, \tag{5.5.6}$$
with
$$A'_{k+1} := 6C_1 2^{1/2} 2^{2^{-k-2}} A_k^{1/2} \le C_3 A_k^{1/2}. \tag{5.5.7}$$
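A recursion of the type in (5.5.7), where the next constant is bounded by a fixed multiple of the square root of the previous one, cannot grow without bound. The sketch below iterates the model recursion $a_{k+1} = c\,a_k^{1/2}$, whose fixed point is $c^2$. The function name, the starting value and the value of $c$ are arbitrary illustrative choices, not constants from the source.

```python
def iterate_constants(a1, c, steps):
    """Iterate a_{k+1} = c * a_k**0.5 and return [a_1, ..., a_{steps+1}]."""
    seq = [a1]
    for _ in range(steps):
        seq.append(c * seq[-1] ** 0.5)
    return seq

seq = iterate_constants(a1=1000.0, c=20.0, steps=60)
print(seq[:4])                  # the first few constants
print(max(seq), seq[-1])        # the largest value and the value the sequence settles at
```

Starting above the fixed point, the sequence decreases towards $c^2$; starting below, it increases towards $c^2$. In both cases it never exceeds $\max(a_1, c^2)$.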
We recall that we have proved (5.5.6) with the constant $A'_{k+1}$ from (5.5.7) under the assumption that $m$ is an even number. We complete the construction by setting
$$A_{k+1}(N, m) := A_{k+1}\bigl(N, 2[m/2], D(N, 2[m/2], k)\bigr), \qquad m \ge 2.$$
Clearly, (5.5.6) implies
$$\|t - A_{k+1}(N, m)(t)\|_\infty \le A_{k+1}N^{2^{-k-2}}(N/m)^{1/2}m^{-1/2}\|t\|_A \tag{5.5.8}$$
for all $m$ with $A_{k+1} = 2A'_{k+1}$. The relation (5.5.7) combined with $A_1 = C_2$ (see (5.5.3)) implies that $A_k \le C_4$ for all $k$. Let $N$ be given. Choose $k$ satisfying $2^{k+1} \ge \ln N$. Then (5.5.4) gives for any $t \in \mathcal{T}(N)$ the estimate
$$\|t - A_k(N, m)(t)\|_\infty \le C_5\bigl(N^{1/2}/m\bigr)\|t\|_A \tag{5.5.9}$$
for any $m$.

We now proceed to a more elaborate construction that gives the following estimate.

Theorem 5.5.2. There exists a constructive method $A(N, m)$ that provides, for any $t \in \mathcal{T}(N)$, an $m$-term trigonometric polynomial $A(N, m)(t)$ with the following approximation property:
$$\|t - A(N, m)(t)\|_\infty \le Cm^{-1/2}\bigl(\ln(1 + N/m)\bigr)^{1/2}\|t\|_A \tag{5.5.10}$$
with an absolute constant $C$.

Proof. We will construct an analog of the sequence of operators $\{A_k(N, m)\}$ constructed above. The new feature here is that we will approximate $t$ in the $L_p$-norm, $p \in [4, \infty)$, instead of the $L_4$-norm, and will optimize over $p$. Let $N$ and $m$ be given and let $t \in \mathcal{T}(N)$. We use either WCGA, WRGA or WGAFR with $\tau = \{1/2\}$, $q = 2$, $\mathcal{D}_N = \mathcal{T}_p \cap \mathcal{T}(N)$ to approximate $t$ by an $m$-term trigonometric polynomial in the $L_p$-norm, $p \in [4, \infty)$. By Theorems 6.2.6, 6.3.3 or 6.4.4 with $X = \mathcal{T}(N)_p$, where $\mathcal{T}(N)_p$ denotes $\mathcal{T}(N)$ equipped with the $L_p$-norm, we get
$$\|t - G_m^p(t)\|_p \le C_6 C(2, \gamma)m^{-1/2}\|t\|_A. \tag{5.5.11}$$
Let us estimate the constant $C(2, \gamma)$. By 5.1.5, $\gamma = (p - 1)/2$. Thus, by Remark 6.2.11 or Remark 6.4.5 from Chapter 6, we get
$$C_6 C(2, \gamma) \le C_7 p^{1/2}. \tag{5.5.12}$$
We define the level $k = 1$ algorithms $A_p^1(N, m)$ by
$$A_p^1(N, m)(t) = G_m^p(t), \qquad t \in \mathcal{T}(N). \tag{5.5.13}$$
We note that, by construction, $A_p^1(N, m)(t) \in \mathcal{T}(N)$. By Nikol'skii's inequality we infer from (5.5.11)–(5.5.13) that
$$\|t - A_p^1(N, m)(t)\|_\infty \le C_8 p^{1/2}N^{1/p}m^{-1/2}\|t\|_A \le C_8 p^{1/2}N^{1/4}(N/m)^{\frac{1}{p-2}}m^{-1/2}\|t\|_A, \qquad m \le N. \tag{5.5.14}$$
We note here that taking $p_N := \ln N$ we get from the first inequality in (5.5.14)
$$\|t - A_{p_N}^1(N, m)(t)\|_\infty \le C(\ln N)^{1/2}m^{-1/2}\|t\|_A \tag{5.5.15}$$
with an absolute constant $C$. Thus the rest of the proof will be devoted to replacing $\ln N$ by $\ln(1 + N/m)$ in (5.5.15). As in the case $p = 4$ we continue the construction by induction. Suppose we have built operators $A_p^k(N, m)$ such that for any $t \in \mathcal{T}(N)$, $p \in [4, \infty)$,
$$\|t - A_p^k(N, m)(t)\|_\infty \le A_k^p N^{2^{-k-1}}(N/m)^{\frac{1}{p-2}}m^{-1/2}\|t\|_A. \tag{5.5.16}$$
We will make steps similar to those from above.

Step 1. Let $t \in \mathcal{T}(N)$ and let $m$ be an even number. We approximate $t$ using (5.5.11), (5.5.12):
$$\|t - G_{m/2}^p(t)\|_p \le C_9 p^{1/2}m^{-1/2}\|t\|_A. \tag{5.5.17}$$
Denote
$$h[p] := \bigl(t - G_{m/2}^p(t)\bigr)/\|t - G_{m/2}^p(t)\|_p.$$
Step 2. Take a positive number $D$ and decompose $h[p] = h_D[p] + h^D[p]$. By Lemma 5.5.1 we get
$$\|h_D[p]\|_\infty \le D \quad\text{and}\quad \|h^D[p]\|_2 \le D^{1-p/2}$$
and, therefore,
$$\|V_N(h_D[p])\|_\infty \le 3D; \qquad \|V_N(h^D[p])\|_2 \le D^{1-p/2}; \qquad \|V_N(h^D[p])\|_A \le 2N^{1/2}D^{1-p/2}.$$
Step 3. We approximate $V_N(h^D[p]) \in \mathcal{T}(2N)$ using operators from level $k$. By (5.5.16) we have
$$\bigl\|V_N(h^D[p]) - A_p^k(2N, m/2)\bigl(V_N(h^D[p])\bigr)\bigr\|_\infty \le A_k^p(2N)^{2^{-k-1}}(4N/m)^{\frac{1}{p-2}}2^{1/2}m^{-1/2}\|V_N(h^D[p])\|_A.$$
For $t \in \mathcal{T}(N)$, define
$$A_p^{k+1}(N, m, D)(t) := G_{m/2}^p(t) + \|t - G_{m/2}^p(t)\|_p\,A_p^k(2N, m/2)\bigl(V_N(h^D[p])\bigr).$$
Similarly to the case $p = 4$ (see (5.5.5)), we get
$$\|t - A_p^{k+1}(N, m, D)(t)\|_\infty \le \|t - G_{m/2}^p(t)\|_p\bigl(3D + A_k^p(2N)^{2^{-k-1}}\,6(N/m)^{\frac{p}{2(p-2)}}D^{1-p/2}\bigr). \tag{5.5.18}$$
Step 4. Choose
$$D_p = D_p(N, m, k) := \bigl(2A_k^p(2N)^{2^{-k-1}}(N/m)^{\frac{p}{2(p-2)}}\bigr)^{2/p}.$$
By (5.5.17) we obtain from (5.5.18), for even $m$,
$$\|t - A_p^{k+1}(N, m, D_p)(t)\|_\infty \le A_{k+1}^{p,1}N^{2^{-k-2}}(N/m)^{\frac{1}{p-2}}m^{-1/2}\|t\|_A, \tag{5.5.19}$$
with
$$A_{k+1}^{p,1} \le C_{10}p^{1/2}(A_k^p)^{2/p}, \qquad A_1^p \le C_{11}p^{1/2}. \tag{5.5.20}$$
We note that (5.5.20) implies
$$A_k^p \le C_{12}p^{\frac{p}{2(p-2)}}. \tag{5.5.21}$$
We set
$$A_p^{k+1}(N, m) := A_p^{k+1}\bigl(N, 2[m/2], D_p(N, 2[m/2], k)\bigr)$$
and obtain (5.5.16) with $k$ replaced by $k + 1$ and a constant $A_{k+1}^p = 2A_{k+1}^{p,1}$. Let $N$ and $m$ be given. First we choose $k$ satisfying $2^{k+1} \ge \ln N$. Next we choose $p = 2 + \ln(1 + N/m)$. Then (5.5.16) and (5.5.21) give for any $t \in \mathcal{T}(N)$ the estimate
$$\|t - A_p^k(N, m)(t)\|_\infty \le C_{13}m^{-1/2}\bigl(\ln(1 + N/m)\bigr)^{1/2}\|t\|_A \tag{5.5.22}$$
for any $m$. This completes the proof of Theorem 5.5.2.
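The final choice of parameters works because two factors in (5.5.16) become bounded by the number $e$: the factor $N^{2^{-k-1}}$ once $2^{k+1} \ge \ln N$, and the factor $(N/m)^{1/(p-2)}$ once $p = 2 + \ln(1 + N/m)$. The sketch below only evaluates these two expressions for a few values. The helper names are assumptions of this example, and the code says nothing about the constants $A_k^p$.

```python
import math

def choose_level(N):
    """Smallest k >= 1 with 2^(k+1) >= ln N."""
    k = 1
    while 2 ** (k + 1) < math.log(N):
        k += 1
    return k

def choose_p(N, m):
    """The exponent used at the end of the proof: p = 2 + ln(1 + N/m)."""
    return 2 + math.log(1 + N / m)

def level_factor(N):
    """N^(2^(-k-1)) for the level k returned by choose_level."""
    return N ** (2.0 ** (-choose_level(N) - 1))

def ratio_factor(N, m):
    """(N/m)^(1/(p-2)) for p = choose_p(N, m)."""
    return (N / m) ** (1 / (choose_p(N, m) - 2))

for N, m in ((10**3, 10), (10**6, 10), (10**6, 10**5), (10**12, 7)):
    print(N, m, choose_level(N), round(choose_p(N, m), 3),
          level_factor(N), ratio_factor(N, m))
```

With both factors under control, what remains in (5.5.16) is the constant from (5.5.21), whose growth in $p$ is of order $p^{1/2}$ for large $p$, times $m^{-1/2}\|t\|_A$.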
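The same technique can also be used in the multivariate case. Let $L_p(\mathbb{T}^d)$ be the real Banach space with
$$\|f\|_p := \Bigl(\frac{1}{\pi^d}\int_{\mathbb{T}^d} |f(x)|^p\,dx\Bigr)^{1/p}, \quad 1 \le p < \infty; \qquad \|f\|_\infty := \max_{x\in\mathbb{T}^d} |f(x)|, \quad f \text{ continuous}.$$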
Denote by $\mathcal{T}^d := \mathcal{T} \times \cdots \times \mathcal{T}$ ($d$ times) the real multivariate trigonometric system. Let $\mathbf{N} = (N_1, \dots, N_d)$. Denote by $\mathcal{T}(\mathbf{N})$ the space of trigonometric polynomials of degree $N_j$ in the variable $x_j$, $j = 1, \dots, d$. Let $v(\mathbf{N})$ be the dimension of $\mathcal{T}(\mathbf{N})$. We formulate a generalization of Theorem 5.5.2 for the $d$-dimensional case and note that the proof repeats the proof of Theorem 5.5.2.

Theorem 5.5.3. There exists a constructive method $A(\mathbf{N}, m)$ that provides, for any $t \in \mathcal{T}(\mathbf{N})$, an $m$-term trigonometric polynomial $A(\mathbf{N}, m)(t)$ with the following approximation property:
$$\|t - A(\mathbf{N}, m)(t)\|_\infty \le C(d)m^{-1/2}\bigl(\ln(1 + v(\mathbf{N})/m)\bigr)^{1/2}\|t\|_A, \tag{5.5.23}$$
with a constant $C(d)$ that may depend on $d$.

This theorem can be applied to studying $m$-term trigonometric approximation of function classes. In the same way as in Section 5.4, one can use Theorem 5.5.3 instead of (5.4.10) to make the proofs of Theorems 5.4.2 and 5.4.3 constructive in the case $p = \infty$. Therefore, we now have constructive proofs of Theorems 5.4.2 and 5.4.3 in all cases. It is interesting to compare this situation with the situation of finding a constructive proof for Kolmogorov's widths of the above function classes. We will make a comment only on the classes $B_s^\alpha(L_\tau)$ in the case $\tau = 2$, $p = \infty$. We recall the definition of the Kolmogorov width:
$$d_m(F, X) := \inf_{\varphi_1,\dots,\varphi_m}\ \sup_{f\in F}\ \inf_{c_1,\dots,c_m}\Bigl\|f - \sum_{j=1}^{m} c_j\varphi_j\Bigr\|_X.$$
By a result of Kashin [40],
$$d_m\bigl(B_s^\alpha(L_2), L_\infty\bigr) \asymp m^{-\alpha/d}, \qquad \alpha > d/2. \tag{5.5.24}$$
The estimate (5.5.24) is only an existence theorem and it is an interesting open problem to find a constructive proof (construct $\varphi_1, \dots, \varphi_m$) of (5.5.24).

One can check that the proof of Theorem 5.5.2 works in the following more general situation. Let $\Phi := \{\phi_j\}_{j=1}^{\infty}$ be a uniformly bounded orthonormal system defined on a bounded domain. Denote $\Phi(N) := \operatorname{span}\{\phi_1, \dots, \phi_N\}$ and assume that the system $\Phi$ admits a sequence of the de la Vallée Poussin operators:

(VP) There exist two positive constants $K_1$ and $K_2$ such that for any $N$ there is an operator $V_N^\Phi$ with the properties
$$V_N^\Phi(\phi_j) = \lambda_{N,j}\phi_j, \quad \lambda_{N,j} = 1 \text{ for } j \in [1, N], \quad \lambda_{N,j} = 0 \text{ for } j > K_1 N, \qquad \|V_N^\Phi\|_{L_p\to L_p} \le K_2 \text{ for } 1 \le p \le \infty \text{ and all } N. \tag{5.5.25}$$
For a system $\Phi$ having the (VP) property we can easily derive from (5.5.25) and the uniform boundedness of $\Phi$ that
$$\|V_N^\Phi\|_{L_2\to L_\infty} \le CN^{1/2}.$$
By interpolation theory of operators we get from here and from (5.5.25) with $p = \infty$ that
$$\|V_N^\Phi\|_{L_p\to L_\infty} \le CN^{1/p}, \qquad p \in (2, \infty).$$
The last inequality implies the Nikol'skii inequality
$$\|\phi\|_\infty \le CN^{1/p}\|\phi\|_p, \qquad \phi \in \Phi(N), \quad p \in (2, \infty).$$
Thus $\Phi$ has all properties needed in the proof of Theorem 5.5.2. Therefore, we have the following generalization of Theorem 5.5.2. Denote
$$\Bigl\|\sum_{j=1}^{N} c_j\phi_j\Bigr\|_A := \sum_{j=1}^{N} |c_j|.$$
Theorem 5.5.4. Let $\Phi := \{\phi_j\}_{j=1}^{\infty}$ be a uniformly bounded orthonormal system defined on a bounded domain. Assume $\Phi$ has the (VP) property. Then there exists a constructive method $A(\Phi, N, m)$ that provides, for any $\phi \in \Phi(N)$, an $m$-term $\Phi$-polynomial $A(\Phi, N, m)(\phi)$ with the following approximation property:
$$\|\phi - A(\Phi, N, m)(\phi)\|_\infty \le Cm^{-1/2}\bigl(\ln(1 + N/m)\bigr)^{1/2}\|\phi\|_A,$$
with a constant $C$ which may depend on $\Phi$.

We note that the decomposition technique used in the proof of Theorem 5.5.2 is a standard tool in the interpolation of operators. The idea of combining the decomposition technique with an inductive way of constructing approximations is also known in approximation theory. For instance, it was used in [10].
Chapter 6 Greedy Approximation with Respect to Dictionaries

6.1 Introduction

In this chapter we consider greedy algorithms with respect to general systems in Banach spaces. We already pointed out in Chapter 5 that greedy algorithms designed for general systems turn out to be good for the trigonometric system. We give here an introduction to the theory of greedy approximation with respect to redundant systems. We present this theory in general Banach spaces, although our main applications in Chapter 5 are in the $L_p$ spaces, $1 < p < \infty$. We also make some remarks about greedy algorithms in Hilbert spaces which help to motivate our interest in the algorithms considered here. The reader can find the theory of greedy algorithms with respect to redundant systems in Hilbert spaces in Chapter 2 of [82].

Let $X$ be a Banach space with norm $\|\cdot\|$. We say that a set $\mathcal{D}$ of elements (functions) from $X$ is a dictionary if the norm of each $g \in \mathcal{D}$ is bounded by one ($\|g\| \le 1$) and the closure of $\operatorname{span}\mathcal{D}$ is $X$. A dictionary $\mathcal{D}$ is symmetric if
$$g \in \mathcal{D} \quad\text{implies}\quad -g \in \mathcal{D}.$$
In this chapter we mostly consider symmetric dictionaries. We denote the closure (in $X$) of the convex hull of $\mathcal{D}$ by $A_1(\mathcal{D})$. We introduce a new norm, associated with a dictionary $\mathcal{D}$, in the dual space $X^*$ by the formula
$$\|F\|_{\mathcal{D}} := \sup_{g\in\mathcal{D}} F(g), \qquad F \in X^*.$$
In this chapter we will study greedy algorithms with respect to $\mathcal{D}$. For a nonzero element $f \in X$ we let $F_f$ denote a norming (peak) functional for $f$:
$$\|F_f\| = 1, \qquad F_f(f) = \|f\|.$$
The existence of such a functional is guaranteed by the Hahn–Banach theorem.
© Springer Basel 2015 V. Temlyakov, Sparse Approximation with Bases, Advanced Courses in Mathematics - CRM Barcelona, DOI 10.1007/978-3-0348-0890-3_6
We begin with a generalization of the Pure Greedy Algorithm (PGA). In the case of a Hilbert space $H$ with an inner product $\langle\cdot,\cdot\rangle$, PGA is defined as follows.

Pure Greedy Algorithm (PGA)
Set $f_0 := f$. Then for each $m \ge 1$ we give the following inductive definition.
(1) $\varphi_m \in \mathcal{D}$ is any element satisfying (we assume existence)
$$\langle f_{m-1}, \varphi_m\rangle = \sup_{g\in\mathcal{D}} \langle f_{m-1}, g\rangle.$$
(2) $f_m := f_{m-1} - \langle f_{m-1}, \varphi_m\rangle\varphi_m$.
(3) $G_m(f, \mathcal{D}) := \sum_{j=1}^{m} \langle f_{j-1}, \varphi_j\rangle\varphi_j$.
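The three steps above can be sketched for a finite dictionary in $\mathbb{R}^n$ with the Euclidean inner product. This is an illustration under stated assumptions rather than a general implementation: the dictionary is a finite list of unit vectors, it is made symmetric by maximizing $|\langle f_{m-1}, g\rangle|$ (which is the same as maximizing over $g$ and $-g$), and the function names are chosen for this example.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def pure_greedy(f, dictionary, steps):
    """Run `steps` iterations of PGA and return (G_m, f_m).

    `dictionary` is a finite list of vectors of norm at most one. Taking the
    largest |<f_{m-1}, g>| amounts to searching the symmetrized dictionary.
    """
    residual = list(f)
    approximant = [0.0] * len(f)
    for _ in range(steps):
        # Step (1): greedy choice of the dictionary element.
        best = max(dictionary, key=lambda g: abs(dot(residual, g)))
        coeff = dot(residual, best)   # equals <f_{m-1}, phi_m> phi_m written via `best`
        # Steps (2) and (3): update the residual and the approximant.
        residual = [r - coeff * b for r, b in zip(residual, best)]
        approximant = [a + coeff * b for a, b in zip(approximant, best)]
    return approximant, residual

s = 2 ** -0.5
dictionary = [[1.0, 0.0], [0.0, 1.0], [s, s]]        # a redundant system in R^2
for m in range(4):
    g_m, f_m = pure_greedy([3.0, 1.0], dictionary, m)
    print(m, g_m, dot(f_m, f_m) ** 0.5)              # the residual norm does not increase
```

For unit-norm elements each step removes the projection of the residual onto the chosen element, so $\|f_m\|^2 = \|f_{m-1}\|^2 - \langle f_{m-1}, \varphi_m\rangle^2$.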
The greedy step (1) of PGA can be interpreted in two ways. First, at the $m$-th iteration of the algorithm we look for an element $\varphi_m \in \mathcal{D}$ and a number $\lambda_m$ satisfying
$$\|f_{m-1} - \lambda_m\varphi_m\|_H = \inf_{g\in\mathcal{D},\,\lambda} \|f_{m-1} - \lambda g\|_H. \tag{6.1.1}$$
Second, we look for an element $\varphi_m \in \mathcal{D}$ such that
$$\langle f_{m-1}, \varphi_m\rangle = \sup_{g\in\mathcal{D}} \langle f_{m-1}, g\rangle. \tag{6.1.2}$$
In a Hilbert space both versions (6.1.1) and (6.1.2) result in the same PGA. In a general Banach space the corresponding versions of (6.1.1) and (6.1.2) lead to different greedy algorithms. The Banach space version of (6.1.1) is straightforward: instead of the Hilbert norm $\|\cdot\|_H$ in (6.1.1) we use the Banach norm $\|\cdot\|_X$. This results in the following greedy algorithm (see [76]).

X-Greedy Algorithm (XGA)
Set $f_0 := f$, $G_0 := 0$. Then, for each $m \ge 1$, we give the following inductive definition.
(1) $\varphi_m \in \mathcal{D}$, $\lambda_m \in \mathbb{R}$ are such that (we assume existence)
$$\|f_{m-1} - \lambda_m\varphi_m\|_X = \inf_{g\in\mathcal{D},\,\lambda} \|f_{m-1} - \lambda g\|_X. \tag{6.1.3}$$
(2) Denote
$$f_m := f_{m-1} - \lambda_m\varphi_m, \qquad G_m := G_{m-1} + \lambda_m\varphi_m.$$
The second version of PGA in a Banach space is based on the concept of a norming (peak) functional. We note that in a Hilbert space a norming functional Ff acts as Ff (g) = f /f , g . Therefore, (6.1.2) can be rewritten in terms of the norming functional Ffm−1 as Ffm−1 (ϕm ) = sup Ffm−1 (g).
(6.1.4)
g∈D
This observation leads to the class of dual greedy algorithms. We next define the Weak Dual Greedy Algorithm with weakness τ (WDGA(τ ); see [19] and [76]), which is a generalization to the case of Banach spaces of the Weak Greedy Algorithm defined for Hilbert spaces.
Weak Dual Greedy Algorithm (WDGA(τ )) Let τ := {tm }∞ m=1 , tm ∈ [0, 1], be a weakness sequence. Set f0 := f . Then, for each m ≥ 1, we give the following inductive definition. (1) ϕm ∈ D is any element satisfying Ffm−1 (ϕm ) ≥ tm Ffm−1 D . (2) Define am as
(6.1.5)
fm−1 − am ϕm = min fm−1 − aϕm . a∈R
(3) Let $f_m := f_{m-1} - a_m\varphi_m$.

Let us make a remark that justifies the idea of the dual greedy algorithms in terms of real analysis. We consider here approximation in uniformly smooth Banach spaces. For a Banach space $X$ we define the modulus of smoothness by
\[ \rho(u) := \sup_{\|x\|=\|y\|=1} \Bigl( \tfrac12\bigl(\|x+uy\| + \|x-uy\|\bigr) - 1 \Bigr). \]
A uniformly smooth Banach space is one with the property
\[ \lim_{u\to 0} \rho(u)/u = 0. \]
It is easy to see that for any Banach space $X$ its modulus of smoothness $\rho(u)$ is an even convex function satisfying the inequalities
\[ \max(0, u-1) \le \rho(u) \le u, \qquad u \in (0,\infty). \]
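As a concrete illustration of the definition, the sketch below approximates $\rho(u)$ for the Euclidean plane by sampling the angle between the unit vectors $x$ and $y$. It relies on the standard fact that in a Hilbert space $\rho(u) = (1+u^2)^{1/2} - 1$; the function name `rho_euclid_sampled` and the sampling resolution are assumptions made for this example only.

```python
import math

def rho_euclid_sampled(u, samples=360):
    """Approximate rho(u) = sup over unit x, y of (|x+uy| + |x-uy|)/2 - 1
    in the Euclidean plane, where only the angle between x and y matters."""
    best = 0.0
    for k in range(samples + 1):
        c = math.cos(math.pi * k / samples)              # <x, y> for unit vectors
        plus = math.sqrt(max(0.0, 1.0 + 2.0 * u * c + u * u))    # |x + u y|
        minus = math.sqrt(max(0.0, 1.0 - 2.0 * u * c + u * u))   # |x - u y|
        best = max(best, 0.5 * (plus + minus) - 1.0)
    return best
```

The supremum is attained for orthogonal $x$ and $y$, and the sampled value can be compared both with the closed form and with the general bounds $\max(0, u-1) \le \rho(u) \le u$.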
We note that from the definition of the modulus of smoothness we get the following inequality.
Lemma 6.1.1. Let $x \ne 0$. Then
\[ 0 \le \|x+uy\| - \|x\| - uF_x(y) \le 2\|x\|\,\rho\bigl(u\|y\|/\|x\|\bigr). \tag{6.1.6} \]
Proof. We have
\[ \|x+uy\| \ge F_x(x+uy) = \|x\| + uF_x(y). \]
This proves the left inequality. Next, from the definition of the modulus of smoothness it follows that
\[ \|x+uy\| + \|x-uy\| \le 2\|x\|\bigl(1 + \rho(u\|y\|/\|x\|)\bigr). \tag{6.1.7} \]
Also,
\[ \|x-uy\| \ge F_x(x-uy) = \|x\| - uF_x(y). \tag{6.1.8} \]
Combining (6.1.7) and (6.1.8), we obtain
\[ \|x+uy\| \le \|x\| + uF_x(y) + 2\|x\|\,\rho\bigl(u\|y\|/\|x\|\bigr). \]
This proves the second inequality.
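Inequality (6.1.6) can be checked numerically in $\ell_p^n$, where the norming functional has the explicit form $F_x(y) = \|x\|_p^{1-p}\sum_i \operatorname{sign}(x_i)|x_i|^{p-1}y_i$. The sketch below is an illustration under that assumption; the names `norming_functional` and `lemma_gap` are hypothetical.

```python
import math

def norm_p(x, p):
    """l_p norm of a finite real vector."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def norming_functional(x, p):
    """Return F_x for x != 0 in l_p^n, 1 < p < infinity, as a function y -> F_x(y)."""
    nx = norm_p(x, p)
    coeffs = [math.copysign(abs(v) ** (p - 1), v) / nx ** (p - 1) for v in x]
    return lambda y: sum(c * w for c, w in zip(coeffs, y))

def lemma_gap(x, y, u, p):
    """The middle quantity ||x + u y|| - ||x|| - u F_x(y) of (6.1.6)."""
    F = norming_functional(x, p)
    shifted = [a + u * b for a, b in zip(x, y)]
    return norm_p(shifted, p) - norm_p(x, p) - u * F(y)
```

For $p = 2$ the right-hand side of (6.1.6) can be evaluated with the Hilbert space modulus $\rho(v) = (1+v^2)^{1/2} - 1$, so both inequalities can be tested; for other $p$ the left inequality alone is easy to test.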
Proposition 6.1.2. Let $X$ be a uniformly smooth Banach space. Then for any $x \ne 0$ and $y$ we have
\[ F_x(y) = \Bigl(\frac{d}{du}\|x+uy\|\Bigr)(0) = \lim_{u\to 0} \bigl(\|x+uy\| - \|x\|\bigr)/u. \tag{6.1.9} \]
Proof. The equality (6.1.9) follows from (6.1.6) and the property that, for a uniformly smooth Banach space, $\lim_{u\to 0}\rho(u)/u = 0$.

Proposition 6.1.2 shows that in WDGA we are looking for an element $\varphi_m \in \mathcal{D}$ that provides a big derivative of the quantity $\|f_{m-1} + ug\|$. Thus, we have two classes of greedy algorithms in Banach spaces. The first one is based on a greedy step of the form (6.1.3). We call this class the class of X-greedy algorithms. The second one is based on a greedy step of the form (6.1.5). We call this class the class of dual greedy algorithms. A very important feature of the dual greedy algorithms is that they can be modified into a weak form. The term “weak” in the definition of WDGA means that at the greedy step (6.1.5) we do not aim for the optimal element of the dictionary which realizes the corresponding supremum, but are satisfied with a weaker property than being optimal. The obvious reason for this is that we do not know, in general, that the optimal element exists. Another, practical reason is that the weaker the assumption, the easier it is to satisfy and, therefore, the easier it is to realize in practice.

The greedy algorithms defined above (XGA, WDGA) are the generalizations of PGA and WGA, studied in Chapter 2 of [82], to the case of Banach spaces. The results of Chapter 2 of [82] show that PGA is not the most efficient greedy algorithm for the approximation of elements of $A_1(\mathcal{D})$. It was mentioned in Chapter 2
of [82] (see [55] for the proof) that there exist a dictionary $\mathcal{D}$, a positive constant $C$, and an element $f \in A_1(\mathcal{D})$ such that, for PGA,
\[ \|f_m\| \ge Cm^{-0.27}. \tag{6.1.10} \]
For better lower bounds, see [54]. We note that even before the lower estimate (6.1.10) was proved, researchers began looking for other greedy algorithms providing a good rate of approximation of functions from $A_1(\mathcal{D})$. Two different ideas have been used in this endeavour. The first idea was that of relaxation: see [36], [3], [16] and [72]. The corresponding algorithms (for example, WRGA, studied in Chapter 2 of [82]) were designed for approximation of functions from $A_1(\mathcal{D})$. These algorithms do not provide an expansion into a series, but they have other good features. It was established (see Theorem 2.21 on page 94 of [82]) for WRGA with $\tau = \{1\}$ in a Hilbert space that, for $f \in A_1(\mathcal{D})$,
\[ \|f_m\| \le Cm^{-1/2}. \]
Also, for WRGA we always have $G_m \in A_1(\mathcal{D})$. The latter property clearly limits the applicability of WRGA to $A_1(\mathcal{D})$.

The second idea was to build the best approximant from $\operatorname{span}(\varphi_1, \dots, \varphi_m)$ instead of using only one element $\varphi_m$ for an update of the approximant. This idea was realized in the Weak Orthogonal Greedy Algorithm (see below) in the case of a Hilbert space and in the Weak Chebyshev Greedy Algorithm (WCGA) (see [73]) in the case of a Banach space. Implementation of both ideas resulted in the construction of algorithms (WRGA and WCGA) that are good for approximation of functions from $A_1(\mathcal{D})$. We present results on WCGA in Section 6.2 and results on WRGA in Section 6.3.

WCGA has the following advantage over WRGA. As we show in Section 6.2, WCGA (under some assumptions on the weakness sequence $\tau$) converges for each $f \in X$ in any uniformly smooth Banach space. WRGA is simpler than WCGA in the sense of computational complexity. However, WRGA has limited applicability. It converges only for elements in the closure of the convex hull of a dictionary. In Sections 6.4 and 6.5 we study algorithms that combine good features of both algorithms. In the construction of such algorithms we use different forms of relaxation.
The Weak Greedy Algorithm with Free Relaxation (WGAFR, [80]), studied in Section 6.4, is the most powerful of the versions considered here. We prove convergence of WGAFR in Theorem 6.4.3. This theorem is the same as the corresponding convergence result for WCGA (see Theorem 6.2.4). The results on the rate of convergence for WGAFR and WCGA are also the same (see Theorem 6.4.4 and Theorem 6.2.13). Thus, WGAFR performs in the same way as WCGA from the point of view of convergence and rate of convergence, and outperforms WCGA in terms of computational complexity. In WGAFR we are optimizing over two parameters w and λ at each iteration of the algorithm. In other words, we are looking for the best approximation from
a 2-dimensional linear subspace at each iteration. In the other version of the weak relaxed greedy algorithm (see the GAWR), considered in Section 6.5, we approximate from a one-dimensional linear subspace at each iteration of the algorithm. This makes the computational complexity of these algorithms very close to that of PGA. The analysis of the GAWR version turns out to be more complicated than that of WGAFR. Also, the results obtained for GAWR are not as general as in the case of WGAFR. For instance, we present results on the GAWR only in the case $\tau = \{t\}$, when the weakness parameter $t$ is the same for all iterations.

The XGA and WDGA have a good feature that distinguishes them from all relaxed greedy algorithms, and also from WCGA. For an element $f \in X$ they provide an expansion into a series,
\[ f \sim \sum_{j=1}^\infty c_j(f) g_j(f), \qquad g_j(f) \in \mathcal{D}, \quad c_j(f) > 0, \quad j = 1, 2, \dots, \tag{6.1.11} \]
such that
\[ G_m = \sum_{j=1}^m c_j(f) g_j(f), \qquad f_m = f - G_m. \]
In Section 6.7 we discuss other greedy algorithms that provide the expansion (6.1.11). All the algorithms studied in Sections 6.2–6.5 and 6.7 belong to the class of dual greedy algorithms. Results obtained in Sections 6.2–6.5 and 6.7 confirm that dual greedy algorithms provide powerful methods of nonlinear approximation. In Section 6.6 we present some results on the X-greedy algorithms. These results are similar to those for the dual greedy algorithms.

The algorithms studied in Sections 6.2–6.7 are very general approximation methods that work well in an arbitrary uniformly smooth Banach space $X$ for any dictionary $\mathcal{D}$. Results of Chapter 5 show that these general approximation methods work well for such a complicated system as the trigonometric system. As a typical example of a uniformly smooth Banach space we will use $L_p$, $1 < p < \infty$. It is well known (see, for instance, [22, Lemma B.1]) that in the case $X = L_p$, $1 \le p < \infty$, we have
\[ \rho(u) \le u^p/p \ \text{ if } 1 \le p \le 2 \quad\text{and}\quad \rho(u) \le (p-1)u^2/2 \ \text{ if } 2 \le p < \infty. \tag{6.1.12} \]
It is also known (see [52, p. 63]) that, for any $X$ with $\dim X = \infty$, one has
\[ \rho(u) \ge (1+u^2)^{1/2} - 1, \]
and for every $X$ with $\dim X \ge 2$,
\[ \rho(u) \ge Cu^2, \qquad C > 0. \]
This restricts the power type modulus of smoothness of nontrivial Banach spaces to the case $u^q$, $1 \le q \le 2$.
6.2 The Weak Chebyshev Greedy Algorithm

Let $\tau := \{t_k\}_{k=1}^\infty$ be a given weakness sequence of nonnegative numbers $t_k \le 1$, $k = 1, 2, \dots$. We define first the Weak Chebyshev Greedy Algorithm (WCGA) (see [73]), which is a generalization for Banach spaces of the following Weak Orthogonal Greedy Algorithm defined for a Hilbert space.

Weak Orthogonal Greedy Algorithm (WOGA)
Set $f_0^o := f$. Then for each $m \ge 1$ we give the following inductive definition.
(1) $\varphi_m^o \in \mathcal{D}$ is any element satisfying
\[ \langle f_{m-1}^o, \varphi_m^o \rangle \ge t_m \sup_{g\in\mathcal{D}} \langle f_{m-1}^o, g \rangle. \]
(2) $G_m^o(f, \mathcal{D}) := P_{H_m}(f)$, where $H_m := \operatorname{span}(\varphi_1^o, \dots, \varphi_m^o)$ and $P_Y$ denotes the operator of orthogonal projection onto $Y$.
(3) $f_m^o := f - G_m^o(f, \mathcal{D})$.

Weak Chebyshev Greedy Algorithm (WCGA)
Set $f_0^c := f_0^{c,\tau} := f$. Then for each $m \ge 1$ we give the following inductive definition.
(1) $\varphi_m^c := \varphi_m^{c,\tau} \in \mathcal{D}$ is any element satisfying
\[ F_{f_{m-1}^c}(\varphi_m^c) \ge t_m \|F_{f_{m-1}^c}\|_{\mathcal{D}}. \]
(2) Define $\Phi_m := \Phi_m^\tau := \operatorname{span}\{\varphi_j^c\}_{j=1}^m$, and define $G_m^c := G_m^{c,\tau}$ to be the best approximant to $f$ from $\Phi_m$.
(3) Let $f_m^c := f_m^{c,\tau} := f - G_m^c$.

Remark 6.2.1. It follows from the definition of WCGA that the sequence $\{\|f_m^c\|\}$ is non-increasing.
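In a Hilbert space the best approximant from $\Phi_m$ is the orthogonal projection, so WCGA reduces to WOGA. The following sketch implements WOGA in $\mathbb{R}^n$ with the Euclidean inner product and a finite symmetric dictionary; taking the first element that meets the weak criterion is only one admissible choice, and the names `woga` and `weakness` are illustrative assumptions.

```python
import math

def dot(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))

def woga(f, dictionary, weakness, steps):
    """Weak Orthogonal Greedy Algorithm in R^n; weakness(m) returns t_m in (0, 1]."""
    residual = list(f)
    basis = []                       # orthonormal basis of H_m = span(phi_1, ..., phi_m)
    chosen, norms = [], [math.sqrt(dot(f, f))]
    for m in range(1, steps + 1):
        sup = max(dot(residual, g) for g in dictionary)
        if sup <= 1e-12:             # residual is (numerically) zero
            break
        # Step (1): any element within the factor t_m of the supremum is admissible.
        index = next(i for i, g in enumerate(dictionary)
                     if dot(residual, g) >= weakness(m) * sup)
        chosen.append(index)
        # Step (2): Gram-Schmidt, then project the residual off the new direction.
        v = list(dictionary[index])
        for e in basis:
            c = dot(v, e)
            v = [a - c * b for a, b in zip(v, e)]
        nv = math.sqrt(dot(v, v))
        if nv > 1e-12:
            e = [a / nv for a in v]
            basis.append(e)
            c = dot(residual, e)
            residual = [a - c * b for a, b in zip(residual, e)]
        # Step (3): record the norm of f_m = f - P_{H_m}(f).
        norms.append(math.sqrt(dot(residual, residual)))
    return chosen, residual, norms
```

The returned residual is orthogonal to every selected element, which is the Hilbert space form of the property $F_{f-f_L}(\phi) = 0$ for $\phi \in L$ used later in this section.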
We proceed to a theorem on convergence of WCGA. In its formulation we need a special sequence, defined for a given modulus of smoothness $\rho(u)$ and a given $\tau = \{t_k\}_{k=1}^\infty$.

Definition 6.2.2. Let $\rho(u)$ be an even convex function on $(-\infty, \infty)$ with the property that $\rho(2) \ge 1$ and
\[ \lim_{u\to 0} \rho(u)/u = 0. \]
For any $\tau = \{t_k\}_{k=1}^\infty$, $0 < t_k \le 1$, and $0 < \theta \le 1/2$ we define $\xi_m := \xi_m(\rho, \tau, \theta)$ as a number $u$ satisfying the equation
\[ \rho(u) = \theta t_m u. \tag{6.2.1} \]
Remark 6.2.3. The assumptions on $\rho(u)$ imply that the function
\[ s(u) := \rho(u)/u, \quad u \ne 0, \qquad s(0) = 0, \]
is a continuous increasing function on $[0, \infty)$ with $s(2) \ge 1/2$. Thus (6.2.1) has a unique solution $\xi_m = s^{-1}(\theta t_m)$ such that $0 < \xi_m \le 2$.

The following theorem from [73] gives a sufficient condition for convergence of WCGA.

Theorem 6.2.4. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$. Assume that the sequence $\tau := \{t_k\}_{k=1}^\infty$ satisfies the following condition: for any $\theta > 0$, we have
\[ \sum_{m=1}^\infty t_m \xi_m(\rho, \tau, \theta) = \infty. \]
Then for any $f \in X$ we have
\[ \lim_{m\to\infty} \|f_m^{c,\tau}\| = 0. \]
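Corollary 6.2.5. Let the Banach space $X$ have modulus of smoothness $\rho(u)$ of power type $1 < q \le 2$, that is, $\rho(u) \le \gamma u^q$. Assume that
\[ \sum_{m=1}^\infty t_m^p = \infty, \qquad p = \frac{q}{q-1}. \tag{6.2.2} \]
Then WCGA converges for any $f \in X$.

Proof. Denote $\rho^q(u) := \gamma u^q$. Then $\rho(u)/u \le \rho^q(u)/u$, and therefore for any $\theta > 0$ we have
\[ \xi_m(\rho, \tau, \theta) \ge \xi_m(\rho^q, \tau, \theta). \]
For $\rho^q$ we get from the definition of $\xi_m$ that
\[ \xi_m(\rho^q, \tau, \theta) = (\theta t_m/\gamma)^{1/(q-1)}. \]
Since $1 + 1/(q-1) = p$, (6.2.2) implies that
\[ \sum_{m=1}^\infty t_m \xi_m(\rho, \tau, \theta) \ge \sum_{m=1}^\infty t_m \xi_m(\rho^q, \tau, \theta) = (\theta/\gamma)^{1/(q-1)} \sum_{m=1}^\infty t_m^p = \infty. \]
It remains to apply Theorem 6.2.4.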
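The closed form for $\xi_m$ in the power-type case can be compared with a direct numerical solution of (6.2.1). The sketch below is only an illustration: `xi_power` evaluates $(\theta t_m/\gamma)^{1/(q-1)}$ and `xi_bisect` solves $\rho(u) = \theta t_m u$ by bisection on $s(u) = \rho(u)/u$, assuming $s$ is increasing and $s(\mathtt{hi}) \ge \theta t_m$. Both function names are hypothetical.

```python
def xi_power(theta, t, gamma, q):
    """Closed form: the positive solution u of gamma * u**q = theta * t * u."""
    return (theta * t / gamma) ** (1.0 / (q - 1.0))

def xi_bisect(rho, theta, t, hi=2.0, iters=200):
    """Solve rho(u) = theta * t * u on (0, hi] by bisection on s(u) = rho(u) / u."""
    lo = 0.0
    target = theta * t
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rho(mid) / mid < target:
            lo = mid          # s(mid) too small: the solution lies to the right
        else:
            hi = mid          # s(mid) large enough: the solution lies to the left
    return 0.5 * (lo + hi)
```

With $\gamma = 1/2$, $q = 3/2$, $\theta = 1/4$ and $t_m = 0.8$, both routines give $\xi_m = 0.16$, and $t_m \xi_m = (\theta/\gamma)^{1/(q-1)} t_m^p$ with $p = 3$, in line with the computation in the proof.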
The following theorem from [73] gives the rate of convergence of WCGA for $f$ in $A_1(\mathcal{D})$.

Theorem 6.2.6. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Then, for a sequence $\tau := \{t_k\}_{k=1}^\infty$, $t_k \le 1$, $k = 1, 2, \dots$, we have for any $f \in A_1(\mathcal{D})$ that
\[ \|f_m^{c,\tau}\| \le C(q, \gamma)\Bigl(1 + \sum_{k=1}^m t_k^p\Bigr)^{-1/p}, \qquad p := \frac{q}{q-1}, \]
with a constant $C(q, \gamma)$ that may depend only on $q$ and $\gamma$.

We will use the following two simple and well-known lemmas in the proof of the above two theorems.

Lemma 6.2.7. Let $X$ be a uniformly smooth Banach space and $L$ be a finite-dimensional subspace of $X$. For any $f \in X \setminus L$, let $f_L$ denote the best approximant of $f$ from $L$. Then
\[ F_{f-f_L}(\phi) = 0 \]
for any $\phi \in L$.

Proof. Let us assume the contrary: there is a $\phi \in L$ such that $\|\phi\| = 1$ and
\[ F_{f-f_L}(\phi) = \beta > 0. \]
For any $\lambda$ we have from the definition of $\rho(u)$ that
\[ \|f - f_L - \lambda\phi\| + \|f - f_L + \lambda\phi\| \le 2\|f - f_L\|\Bigl(1 + \rho\Bigl(\frac{\lambda}{\|f - f_L\|}\Bigr)\Bigr). \tag{6.2.3} \]
Next,
\[ \|f - f_L + \lambda\phi\| \ge F_{f-f_L}(f - f_L + \lambda\phi) = \|f - f_L\| + \lambda\beta. \tag{6.2.4} \]
Combining (6.2.3) and (6.2.4) we get
\[ \|f - f_L - \lambda\phi\| \le \|f - f_L\|\Bigl(1 - \frac{\lambda\beta}{\|f - f_L\|} + 2\rho\Bigl(\frac{\lambda}{\|f - f_L\|}\Bigr)\Bigr). \tag{6.2.5} \]
Since $\rho(u) = o(u)$, we find $\lambda' > 0$ such that
\[ 1 - \frac{\lambda'\beta}{\|f - f_L\|} + 2\rho\Bigl(\frac{\lambda'}{\|f - f_L\|}\Bigr) < 1. \]
Then (6.2.5) gives
\[ \|f - f_L - \lambda'\phi\| < \|f - f_L\|, \]
which contradicts the assumption that $f_L \in L$ is the best approximant of $f$.
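Lemma 6.2.8. For any bounded linear functional $F$ and any dictionary $\mathcal{D}$, we have
\[ \|F\|_{\mathcal{D}} := \sup_{g\in\mathcal{D}} F(g) = \sup_{f\in A_1(\mathcal{D})} F(f). \]
Proof. The inequality
\[ \sup_{g\in\mathcal{D}} F(g) \le \sup_{f\in A_1(\mathcal{D})} F(f) \]
is obvious. We prove the opposite inequality. Take any $f \in A_1(\mathcal{D})$. Then for any $\varepsilon > 0$ there exist $g_1, \dots, g_N \in \mathcal{D}$ and numbers $a_1, \dots, a_N$ such that $a_i > 0$, $a_1 + \dots + a_N = 1$ and
\[ \Bigl\| f - \sum_{i=1}^N a_i g_i \Bigr\| \le \varepsilon. \]
Thus
\[ F(f) \le \|F\|\varepsilon + F\Bigl(\sum_{i=1}^N a_i g_i\Bigr) \le \varepsilon\|F\| + \sup_{g\in\mathcal{D}} F(g), \]
which proves Lemma 6.2.8.

We will also need one more lemma from [73].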
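Lemma 6.2.9. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$. Take a number $\varepsilon \ge 0$ and two elements $f$, $f^\varepsilon$ from $X$ such that
\[ \|f - f^\varepsilon\| \le \varepsilon, \qquad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}), \]
with some number $A(\varepsilon) > 0$. Then
\[ \|f_m^{c,\tau}\| \le \|f_{m-1}^{c,\tau}\| \inf_{\lambda\ge 0}\Bigl(1 - \lambda t_m A(\varepsilon)^{-1}\Bigl(1 - \frac{\varepsilon}{\|f_{m-1}^{c,\tau}\|}\Bigr) + 2\rho\Bigl(\frac{\lambda}{\|f_{m-1}^{c,\tau}\|}\Bigr)\Bigr) \]
for $m = 1, 2, \dots$.

Proof. We have, for any $\lambda$,
\[ \|f_{m-1}^c - \lambda\varphi_m^c\| + \|f_{m-1}^c + \lambda\varphi_m^c\| \le 2\|f_{m-1}^c\|\Bigl(1 + \rho\Bigl(\frac{\lambda}{\|f_{m-1}^c\|}\Bigr)\Bigr) \tag{6.2.6} \]
and by (1) from the definition of WCGA and Lemma 6.2.8 we get
\[ F_{f_{m-1}^c}(\varphi_m^c) \ge t_m \sup_{g\in\mathcal{D}} F_{f_{m-1}^c}(g) = t_m \sup_{\phi\in A_1(\mathcal{D})} F_{f_{m-1}^c}(\phi) \ge t_m A(\varepsilon)^{-1} F_{f_{m-1}^c}(f^\varepsilon). \]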
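By Lemma 6.2.7 we obtain
\[ F_{f_{m-1}^c}(f^\varepsilon) = F_{f_{m-1}^c}(f + f^\varepsilon - f) \ge F_{f_{m-1}^c}(f) - \varepsilon = F_{f_{m-1}^c}(f_{m-1}^c) - \varepsilon = \|f_{m-1}^c\| - \varepsilon. \]
Thus, as in (6.2.5), we infer from (6.2.6) that
\[ \|f_m^c\| \le \inf_{\lambda\ge 0} \|f_{m-1}^c - \lambda\varphi_m^c\| \le \|f_{m-1}^c\| \inf_{\lambda\ge 0}\Bigl(1 - \lambda t_m A(\varepsilon)^{-1}\Bigl(1 - \frac{\varepsilon}{\|f_{m-1}^c\|}\Bigr) + 2\rho\Bigl(\frac{\lambda}{\|f_{m-1}^c\|}\Bigr)\Bigr), \tag{6.2.7} \]
as claimed.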
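Proof. We now prove Theorem 6.2.4. The definition of WCGA implies that $\{\|f_m^c\|\}$ is a non-increasing sequence. Therefore,
\[ \lim_{m\to\infty} \|f_m^c\| = \alpha. \]
We prove by contradiction that $\alpha = 0$. Assume that $\alpha > 0$. Then for any $m$ we have
\[ \|f_m^c\| \ge \alpha. \]
We set $\varepsilon = \alpha/2$ and find $f^\varepsilon$ such that
\[ \|f - f^\varepsilon\| \le \varepsilon \quad\text{and}\quad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}), \]
with some $A(\varepsilon)$. Then, by Lemma 6.2.9,
\[ \|f_m^c\| \le \|f_{m-1}^c\| \inf_{\lambda}\bigl(1 - \lambda t_m A(\varepsilon)^{-1}/2 + 2\rho(\lambda/\alpha)\bigr). \]
Let us specify
\[ \theta := \frac{\alpha}{8A(\varepsilon)} \]
and take $\lambda = \alpha\xi_m(\rho, \tau, \theta)$. Then we obtain
\[ \|f_m^c\| \le \|f_{m-1}^c\|\bigl(1 - 2\theta t_m \xi_m\bigr). \]
The assumption
\[ \sum_{m=1}^\infty t_m \xi_m = \infty \]
implies that
\[ \|f_m^c\| \to 0 \quad\text{as } m \to \infty. \]
We got a contradiction, which proves the theorem.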
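Proof. We proceed to the proof of Theorem 6.2.6. By Lemma 6.2.9 with $\varepsilon = 0$ and $A(\varepsilon) = 1$ we have for $f \in A_1(\mathcal{D})$ that
\[ \|f_m^c\| \le \|f_{m-1}^c\| \inf_{\lambda\ge 0}\Bigl(1 - \lambda t_m + 2\gamma\Bigl(\frac{\lambda}{\|f_{m-1}^c\|}\Bigr)^q\Bigr). \tag{6.2.8} \]
Choose $\lambda$ from the equation
\[ \frac12 \lambda t_m = 2\gamma\Bigl(\frac{\lambda}{\|f_{m-1}^c\|}\Bigr)^q, \]
which implies that
\[ \lambda = \|f_{m-1}^c\|^{q/(q-1)} (4\gamma)^{-1/(q-1)} t_m^{1/(q-1)}. \]
Let
\[ A_q := 2(4\gamma)^{1/(q-1)}. \]
Using the notation $p := q/(q-1)$ we infer from (6.2.8) that
\[ \|f_m^c\| \le \|f_{m-1}^c\|\Bigl(1 - \frac12\lambda t_m\Bigr) = \|f_{m-1}^c\|\bigl(1 - t_m^p \|f_{m-1}^c\|^p/A_q\bigr). \]
Raising both sides of this inequality to the power $p$ and taking into account the inequality $x^r \le x$ for $r \ge 1$, $0 \le x \le 1$, we obtain
\[ \|f_m^c\|^p \le \|f_{m-1}^c\|^p\bigl(1 - t_m^p \|f_{m-1}^c\|^p/A_q\bigr). \]
By an analog of Lemma 2.16 from Chapter 2 of [82] (see [72, Lemma 3.1]), using the estimate $\|f\|^p \le 1 < A_q$ we get
\[ \|f_m^c\|^p \le A_q\Bigl(1 + \sum_{n=1}^m t_n^p\Bigr)^{-1}, \]
whence
\[ \|f_m^c\| \le C(q, \gamma)\Bigl(1 + \sum_{n=1}^m t_n^p\Bigr)^{-1/p}. \]
Theorem 6.2.6 is now proved.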
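The last step of the proof rests on a recursion of the form $x_m \le x_{m-1}(1 - w_m x_{m-1}/A)$ with $x_m = \|f_m^c\|^p$ and $w_m = t_m^p$. The sketch below iterates the equality case of this recursion and compares it with the bound $A(1 + w_1 + \dots + w_m)^{-1}$. It is a numerical illustration only; the names `worst_case_powers` and `bound` are not from the text.

```python
def worst_case_powers(weights, A, x0=1.0):
    """Iterate x_m = x_{m-1} * (1 - w_m * x_{m-1} / A), the equality case of the
    recursion for x_m = ||f_m||**p, where w_m stands for t_m**p."""
    xs = [x0]
    for w in weights:
        xs.append(xs[-1] * (1.0 - w * xs[-1] / A))
    return xs

def bound(weights, A):
    """Right-hand sides A * (1 + w_1 + ... + w_m)**(-1) for m = 0, 1, ..."""
    out, total = [A], 0.0
    for w in weights:
        total += w
        out.append(A / (1.0 + total))
    return out
```

The comparison reflects the elementary observation that the recursion gives $1/x_m \ge 1/x_{m-1} + w_m/A$, so $1/x_m \ge (1 + w_1 + \dots + w_m)/A$ whenever $x_0 \le A$.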
Remark 6.2.10. Theorem 6.2.6 holds for a slightly modified version of WCGA, WCGA(1), for which at step (1) we require that
\[ F_{f_{m-1}^{c(1)}}\bigl(\varphi_m^{c(1)}\bigr) \ge t_m \|f_{m-1}^{c(1)}\|. \tag{6.2.9} \]
This statement follows from the fact that, in the proof of Theorem 6.2.6, the relation
\[ F_{f_{m-1}^c}(\varphi_m^c) \ge t_m \sup_{g\in\mathcal{D}} F_{f_{m-1}^c}(g) \]
was used only to get (6.2.9).

Remark 6.2.11. It follows from the above proof of Theorem 6.2.6 that $C(q, \gamma) \le C\gamma^{1/q}$. In particular, in the case $X = L_p$ the inequality (6.1.12) implies that $C(q, \gamma) \le Cp^{1/2}$ for $2 \le p < \infty$.

Proposition 6.2.12. The condition (6.2.2) in Corollary 6.2.5 is sharp.

Proof. Let $1 < q \le 2$. Consider $X = \ell_q$. It is known ([52, p. 67]) that $\ell_q$, $1 < q \le 2$, is a uniformly smooth Banach space with modulus of smoothness $\rho(u)$ of power type $q$. Denote $p := q/(q-1)$ and take any sequence $\{t_k\}_{k=1}^\infty$, $0 < t_k \le 1$, such that
\[ \sum_{k=1}^\infty t_k^p < \infty. \tag{6.2.10} \]
Choose $\mathcal{D}$ as the standard basis $\{e_j\}_{j=1}^\infty$, $e_j := (0, \dots, 0, 1, 0, \dots)$, for $\ell_q$. Consider the following realization of WCGA for
\[ f := \bigl(1, t_1^{1/(q-1)}, t_2^{1/(q-1)}, \dots\bigr). \]
First of all, (6.2.10) guarantees that $f \in \ell_q$. Next, it is well known that $F_f$ can be identified as
\[ F_f = (1, t_1, t_2, \dots)\Big/\Bigl(1 + \sum_{k=1}^\infty t_k^p\Bigr)^{1/p} \in \ell_p. \]
At the first step of WCGA we pick $\varphi_1 = e_2$ and get
\[ f_1^c = \bigl(1, 0, t_2^{1/(q-1)}, \dots\bigr). \]
We continue with $f$ replaced by $f_1$ and so on. After $m$ steps we get
\[ f_m^c = \bigl(1, 0, \dots, 0, t_{m+1}^{1/(q-1)}, \dots\bigr). \]
It is clear that for all $m$ we have $\|f_m^c\|_{\ell_q} \ge 1$.
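The realization of WCGA used in this proof can be replayed on a finite truncation of $f$. The sketch below assumes a truncation to $N + 1$ coordinates and uses the unnormalized norming functional $\operatorname{sign}(x_i)|x_i|^{q-1}$, which is enough because the weak criterion is homogeneous in $F$. The name `sharpness_example` is hypothetical.

```python
def norm_q(x, q):
    """l_q norm of a finite real vector."""
    return sum(abs(v) ** q for v in x) ** (1.0 / q)

def sharpness_example(t, q):
    """Replay the proof on f = (1, t_1^{1/(q-1)}, ..., t_N^{1/(q-1)}) in l_q^{N+1}."""
    f = [1.0] + [tk ** (1.0 / (q - 1.0)) for tk in t]
    residual, norms, admissible = list(f), [], []
    for m in range(1, len(t) + 1):
        # Unnormalized norming functional of the residual (all entries are >= 0 here).
        functional = [abs(v) ** (q - 1.0) for v in residual]
        sup_over_basis = max(functional)
        # Coordinate m holds t_m^{1/(q-1)}, so the functional there equals t_m:
        # picking e_{m+1} satisfies the weak criterion with weakness t_m.
        admissible.append(functional[m] >= t[m - 1] * sup_over_basis - 1e-12)
        # Best approximation from the span of the chosen basis vectors
        # zeroes exactly the chosen coordinates.
        residual[m] = 0.0
        norms.append(norm_q(residual, q))
    return norms, admissible
```

Every step is admissible for the weak criterion, yet the first coordinate of the residual is never touched, so the residual norm stays at least $1$.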
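The following variant of Theorem 6.2.6 (see [80]) follows from Lemma 6.2.9.

Theorem 6.2.13. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Take a number $\varepsilon \ge 0$ and two elements $f$, $f^\varepsilon$ from $X$ such that
\[ \|f - f^\varepsilon\| \le \varepsilon, \qquad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}), \]
with some number $A(\varepsilon) > 0$. Then we have, for $p := q/(q-1)$,
\[ \|f_m^{c,\tau}\| \le \max\Bigl(2\varepsilon,\ C(q, \gamma)(A(\varepsilon) + \varepsilon)\Bigl(1 + \sum_{k=1}^m t_k^p\Bigr)^{-1/p}\Bigr). \tag{6.2.11} \]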
6.3 Relaxation. Co-convex approximation

In this section we study a generalization for Banach spaces of the relaxed greedy algorithms considered in Chapter 2 of [82]. We present results from [73]. Let $\tau := \{t_k\}_{k=1}^\infty$ be a given weakness sequence of numbers $t_k \in [0, 1]$, $k = 1, 2, \dots$.
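Weak Relaxed Greedy Algorithm (WRGA)
Set $f_0^r := f_0^{r,\tau} := f$ and $G_0^r := G_0^{r,\tau} := 0$. Then for each $m \ge 1$ we give the following inductive definition.
(1) $\varphi_m^r := \varphi_m^{r,\tau} \in \mathcal{D}$ is any element satisfying
\[ F_{f_{m-1}^r}\bigl(\varphi_m^r - G_{m-1}^r\bigr) \ge t_m \sup_{g\in\mathcal{D}} F_{f_{m-1}^r}\bigl(g - G_{m-1}^r\bigr). \]
(2) Find $0 \le \lambda_m \le 1$ such that
\[ \bigl\|f - \bigl((1-\lambda_m)G_{m-1}^r + \lambda_m\varphi_m^r\bigr)\bigr\| = \inf_{0\le\lambda\le 1} \bigl\|f - \bigl((1-\lambda)G_{m-1}^r + \lambda\varphi_m^r\bigr)\bigr\| \]
and define
\[ G_m^r := G_m^{r,\tau} := (1-\lambda_m)G_{m-1}^r + \lambda_m\varphi_m^r. \]
(3) Let $f_m^r := f_m^{r,\tau} := f - G_m^r$.

Remark 6.3.1. It follows from the definition of WRGA that $\{\|f_m^r\|\}$ is a non-increasing sequence.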
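The three steps of WRGA can be sketched in $\mathbb{R}^n$ with the Euclidean norm, where $F_{f_{m-1}}(h)$ is proportional to $\langle f_{m-1}, h\rangle$ and the minimization over $\lambda \in [0, 1]$ has a closed form. This is an illustration under those assumptions, with a constant weakness parameter `t`; the function name `wrga` is hypothetical, and $f$ is assumed to lie in $A_1(\mathcal{D})$.

```python
import math

def dot(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))

def wrga(f, dictionary, t, steps):
    """Weak Relaxed Greedy Algorithm in R^n (Euclidean norm, constant weakness t)."""
    approx = [0.0] * len(f)
    residual = list(f)
    norms = [math.sqrt(dot(f, f))]
    for _ in range(steps):
        # Step (1): scores are proportional to F_{f_{m-1}}(g - G_{m-1}).
        scores = [dot(residual, [a - b for a, b in zip(g, approx)]) for g in dictionary]
        sup = max(scores)
        if sup <= 0.0:               # cannot happen in exact arithmetic unless f_{m-1} = 0
            break
        phi = next(g for g, s in zip(dictionary, scores) if s >= t * sup)
        # Step (2): minimize ||f_{m-1} - lam * d|| over [0, 1] with d = phi - G_{m-1};
        # the unconstrained optimum is clipped to the interval.
        d = [a - b for a, b in zip(phi, approx)]
        dd = dot(d, d)
        lam = 0.0 if dd == 0.0 else min(1.0, max(0.0, dot(residual, d) / dd))
        approx = [a + lam * b for a, b in zip(approx, d)]
        # Step (3): new residual f_m = f - G_m.
        residual = [a - b for a, b in zip(f, approx)]
        norms.append(math.sqrt(dot(residual, residual)))
    return approx, norms
```

Since every update is a convex combination of $G_{m-1}$ and a dictionary element, the approximant stays in the convex hull of $\mathcal{D} \cup \{0\}$; for the dictionary $\{\pm e_j\}$ this means its $\ell_1$ norm never exceeds $1$.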
We call WRGA relaxed because at the $m$-th iteration of the algorithm we use a linear combination (convex combination) of the previous approximant $G_{m-1}^r$ and a new element $\varphi_m^r$. The relaxation parameter $\lambda_m$ in WRGA is chosen at the $m$-th iteration depending on $f$. We prove here the analogs of Theorems 6.2.4 and 6.2.6 for the Weak Relaxed Greedy Algorithm.

Theorem 6.3.2. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$. Assume that a sequence $\tau := \{t_k\}_{k=1}^\infty$ satisfies the following condition: for any $\theta > 0$,
\[ \sum_{m=1}^\infty t_m \xi_m(\rho, \tau, \theta) = \infty. \]
Then for any $f \in A_1(\mathcal{D})$ we have
\[ \lim_{m\to\infty} \|f_m^{r,\tau}\| = 0. \]
Theorem 6.3.3. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Then for a sequence $\tau := \{t_k\}_{k=1}^\infty$, $t_k \le 1$, $k = 1, 2, \dots$, we have, for any $f \in A_1(\mathcal{D})$,
\[ \|f_m^{r,\tau}\| \le C_1(q, \gamma)\Bigl(1 + \sum_{k=1}^m t_k^p\Bigr)^{-1/p}, \qquad p := \frac{q}{q-1}, \]
with a constant $C_1(q, \gamma)$ which may depend only on $q$ and $\gamma$.

Proof. We prove both Theorems 6.3.2 and 6.3.3. This proof is similar to that of Theorems 6.2.4 and 6.2.6. Instead of Lemma 6.2.9 we use the following one.

Lemma 6.3.4. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$. Then for any $f \in A_1(\mathcal{D})$,
\[ \|f_m^{r,\tau}\| \le \|f_{m-1}^{r,\tau}\| \inf_{0\le\lambda\le 1}\Bigl(1 - \lambda t_m + 2\rho\Bigl(\frac{2\lambda}{\|f_{m-1}^{r,\tau}\|}\Bigr)\Bigr), \qquad m = 1, 2, \dots. \]
Proof. We have
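\[ f_m^r := f - \bigl((1-\lambda_m)G_{m-1}^r + \lambda_m\varphi_m^r\bigr) = f_{m-1}^r - \lambda_m\bigl(\varphi_m^r - G_{m-1}^r\bigr) \]
and
\[ \|f_m^r\| = \inf_{0\le\lambda\le 1} \bigl\|f_{m-1}^r - \lambda(\varphi_m^r - G_{m-1}^r)\bigr\|. \]
As for (6.2.6), we have, for any $\lambda$,
\[ \bigl\|f_{m-1}^r - \lambda(\varphi_m^r - G_{m-1}^r)\bigr\| + \bigl\|f_{m-1}^r + \lambda(\varphi_m^r - G_{m-1}^r)\bigr\| \le 2\|f_{m-1}^r\|\Bigl(1 + \rho\Bigl(\frac{\lambda\|\varphi_m^r - G_{m-1}^r\|}{\|f_{m-1}^r\|}\Bigr)\Bigr). \tag{6.3.1} \]
Next we obtain, for $\lambda \ge 0$,
\[ \bigl\|f_{m-1}^r + \lambda(\varphi_m^r - G_{m-1}^r)\bigr\| \ge F_{f_{m-1}^r}\bigl(f_{m-1}^r + \lambda(\varphi_m^r - G_{m-1}^r)\bigr) = \|f_{m-1}^r\| + \lambda F_{f_{m-1}^r}\bigl(\varphi_m^r - G_{m-1}^r\bigr) \ge \|f_{m-1}^r\| + \lambda t_m \sup_{g\in\mathcal{D}} F_{f_{m-1}^r}\bigl(g - G_{m-1}^r\bigr). \]
Using Lemma 6.2.8 we continue:
\[ = \|f_{m-1}^r\| + \lambda t_m \sup_{\phi\in A_1(\mathcal{D})} F_{f_{m-1}^r}\bigl(\phi - G_{m-1}^r\bigr) \ge \|f_{m-1}^r\| + \lambda t_m \|f_{m-1}^r\|. \]
Using the trivial estimate $\|\varphi_m^r - G_{m-1}^r\| \le 2$, (6.3.1) yields
\[ \bigl\|f_{m-1}^r - \lambda(\varphi_m^r - G_{m-1}^r)\bigr\| \le \|f_{m-1}^r\|\Bigl(1 - \lambda t_m + 2\rho\Bigl(\frac{2\lambda}{\|f_{m-1}^r\|}\Bigr)\Bigr), \tag{6.3.2} \]
which proves Lemma 6.3.4.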
The remaining part of the proof uses the inequality (6.3.2) in the same way relation (6.2.7) was used in the proof of Theorems 6.2.4 and 6.2.6. The only additional difficulty here is that we are optimizing over 0 ≤ λ ≤ 1. However, it is
easy to check that the corresponding $\lambda$ chosen in a similar way always satisfies the restriction $0 \le \lambda \le 1$. In the proof of Theorem 6.3.2 we choose $\theta = \alpha/8$ and $\lambda = \alpha\xi_m(\rho, \tau, \theta)/2$, and in the proof of Theorem 6.3.3 we choose $\lambda$ from the equation
\[ \frac12 \lambda t_m = 2\gamma(2\lambda)^q \|f_{m-1}^r\|^{-q}. \]
Remark 6.3.5. Theorems 6.3.2 and 6.3.3 hold for a slightly modified version of WRGA, WRGA(1), for which at step (1) we require
\[ F_{f_{m-1}^{r(1)}}\bigl(\varphi_m^{r(1)} - G_{m-1}^{r(1)}\bigr) \ge t_m \|f_{m-1}^{r(1)}\|. \tag{6.3.3} \]
This follows from the observation that in the proof of Lemma 6.3.4 we used the inequality from step (1) of WRGA only to derive (6.3.3). It is clear from Lemma 6.2.8 that in the case of approximation of f ∈ A1 (D), the requirement (6.3.3) is weaker and easier to check than (1) of WRGA.
6.4 Free relaxation

Both of the above algorithms, WCGA and WRGA, use the functional $F_{f_{m-1}}$ in a search for the $m$-th element $\varphi_m$ from the dictionary to be used in the approximation. The construction of the approximant in WRGA is different from the construction in WCGA. In WCGA we build the approximant $G_m^c$ so as to maximally use the approximation power of the elements $\varphi_1, \dots, \varphi_m$. WRGA, by its definition, is designed for approximation of functions from $A_1(\mathcal{D})$. In building the approximant in WRGA we keep the property $G_m^r \in A_1(\mathcal{D})$. As we mentioned in Section 6.3, the relaxation parameter $\lambda_m$ in WRGA is chosen at the $m$-th iteration depending on $f$. The following modification of the above idea of relaxation in greedy approximation will be studied in this section (see [80]).
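Weak Greedy Algorithm with Free Relaxation (WGAFR)
Let $\tau := \{t_m\}_{m=1}^\infty$, $t_m \in [0, 1]$, be a weakness sequence. We set $f_0 := f$ and $G_0 := 0$. Then for each $m \ge 1$ we give the following inductive definition.
(1) $\varphi_m \in \mathcal{D}$ is any element satisfying
\[ F_{f_{m-1}}(\varphi_m) \ge t_m \|F_{f_{m-1}}\|_{\mathcal{D}}. \]
(2) Find $w_m$ and $\lambda_m$ such that
\[ \bigl\|f - \bigl((1-w_m)G_{m-1} + \lambda_m\varphi_m\bigr)\bigr\| = \inf_{\lambda, w} \bigl\|f - \bigl((1-w)G_{m-1} + \lambda\varphi_m\bigr)\bigr\| \]
and define
\[ G_m := (1-w_m)G_{m-1} + \lambda_m\varphi_m. \]
(3) Let $f_m := f - G_m$.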
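In a Hilbert space, step (2) of WGAFR is a least-squares problem over the span of $G_{m-1}$ and $\varphi_m$, which can be solved through its $2 \times 2$ normal equations. The following sketch works in $\mathbb{R}^n$ with the Euclidean norm, a finite symmetric dictionary and a constant weakness parameter `t`; these choices and the name `wgafr` are assumptions made for illustration.

```python
import math

def dot(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))

def wgafr(f, dictionary, t, steps):
    """Weak Greedy Algorithm with Free Relaxation in R^n (Euclidean norm)."""
    approx = [0.0] * len(f)
    residual = list(f)
    norms = [math.sqrt(dot(f, f))]
    for _ in range(steps):
        # Step (1): weak greedy selection with F_{f_{m-1}}(g) ~ <f_{m-1}, g>.
        scores = [dot(residual, g) for g in dictionary]
        sup = max(scores)
        if sup <= 1e-14:
            break
        phi = next(g for g, s in zip(dictionary, scores) if s >= t * sup)
        # Step (2): least squares f ~ a*G_{m-1} + lam*phi, where a = 1 - w.
        gg, gp, pp = dot(approx, approx), dot(approx, phi), dot(phi, phi)
        fg, fp = dot(f, approx), dot(f, phi)
        det = gg * pp - gp * gp
        if det > 1e-12 * max(1.0, gg * pp):
            a = (fg * pp - fp * gp) / det
            lam = (fp * gg - fg * gp) / det
        else:                        # G_{m-1} is zero or parallel to phi: use phi alone
            a, lam = 0.0, fp / pp
        approx = [a * x + lam * y for x, y in zip(approx, phi)]
        # Step (3): new residual.
        residual = [x - y for x, y in zip(f, approx)]
        norms.append(math.sqrt(dot(residual, residual)))
    return approx, norms
```

The choice $w = 0$, $\lambda = 0$ reproduces $G_{m-1}$, so the residual norms cannot increase from one step to the next.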
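We begin with an analogue of Lemma 6.2.9.

Lemma 6.4.1. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$. Take a number $\varepsilon \ge 0$ and two elements $f$, $f^\varepsilon$ from $X$ such that
\[ \|f - f^\varepsilon\| \le \varepsilon, \qquad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}), \]
with some number $A(\varepsilon) \ge \varepsilon$. Then we have, for WGAFR,
\[ \|f_m\| \le \|f_{m-1}\| \inf_{\lambda\ge 0}\Bigl(1 - \lambda t_m A(\varepsilon)^{-1}\Bigl(1 - \frac{\varepsilon}{\|f_{m-1}\|}\Bigr) + 2\rho\Bigl(\frac{5\lambda}{\|f_{m-1}\|}\Bigr)\Bigr) \]
for $m = 1, 2, \dots$.

Proof. By the definition of $f_m$,
\[ \|f_m\| \le \inf_{\lambda\ge 0,\, w} \|f_{m-1} + wG_{m-1} - \lambda\varphi_m\|. \]
Arguing as in the proof of Lemma 6.2.9, we use the inequality
\[ \|f_{m-1} + wG_{m-1} - \lambda\varphi_m\| + \|f_{m-1} - wG_{m-1} + \lambda\varphi_m\| \le 2\|f_{m-1}\|\bigl(1 + \rho\bigl(\|wG_{m-1} - \lambda\varphi_m\|/\|f_{m-1}\|\bigr)\bigr) \tag{6.4.1} \]
and estimate, for $\lambda \ge 0$,
\[ \|f_{m-1} - wG_{m-1} + \lambda\varphi_m\| \ge F_{f_{m-1}}\bigl(f_{m-1} - wG_{m-1} + \lambda\varphi_m\bigr) \ge \|f_{m-1}\| - F_{f_{m-1}}(wG_{m-1}) + \lambda t_m \sup_{g\in\mathcal{D}} F_{f_{m-1}}(g). \]
By Lemma 6.2.8, we continue:
\[ = \|f_{m-1}\| - F_{f_{m-1}}(wG_{m-1}) + \lambda t_m \sup_{\phi\in A_1(\mathcal{D})} F_{f_{m-1}}(\phi) \]
\[ \ge \|f_{m-1}\| - F_{f_{m-1}}(wG_{m-1}) + \lambda t_m A(\varepsilon)^{-1} F_{f_{m-1}}(f^\varepsilon) \]
\[ \ge \|f_{m-1}\| - F_{f_{m-1}}(wG_{m-1}) + \lambda t_m A(\varepsilon)^{-1}\bigl(F_{f_{m-1}}(f) - \varepsilon\bigr). \]
We set $w^* := \lambda t_m A(\varepsilon)^{-1}$ and, since $f = f_{m-1} + G_{m-1}$ and $F_{f_{m-1}}(f_{m-1}) = \|f_{m-1}\|$, obtain
\[ \|f_{m-1} - w^*G_{m-1} + \lambda\varphi_m\| \ge \|f_{m-1}\| + \lambda t_m A(\varepsilon)^{-1}\bigl(\|f_{m-1}\| - \varepsilon\bigr). \tag{6.4.2} \]
Combining (6.4.1) and (6.4.2) we get
\[ \|f_m\| \le \|f_{m-1}\| \inf_{\lambda\ge 0}\Bigl(1 - \lambda t_m A(\varepsilon)^{-1}\bigl(1 - \varepsilon/\|f_{m-1}\|\bigr) + 2\rho\bigl(\|w^*G_{m-1} - \lambda\varphi_m\|/\|f_{m-1}\|\bigr)\Bigr). \]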
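We now estimate
\[ \|w^*G_{m-1} - \lambda\varphi_m\| \le w^*\|G_{m-1}\| + \lambda. \]
Next,
\[ \|G_{m-1}\| = \|f - f_{m-1}\| \le 2\|f\| \le 2\bigl(\|f^\varepsilon\| + \varepsilon\bigr) \le 2\bigl(A(\varepsilon) + \varepsilon\bigr). \]
Thus, under the assumption $A(\varepsilon) \ge \varepsilon$ we get
\[ w^*\|G_{m-1}\| \le 2\lambda t_m (A(\varepsilon) + \varepsilon)/A(\varepsilon) \le 4\lambda. \]
Finally,
\[ \|w^*G_{m-1} - \lambda\varphi_m\| \le 5\lambda. \]
This completes the proof of Lemma 6.4.1.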
Remark 6.4.2. It follows from the definition of WGAFR that $\{\|f_m\|\}$ is a non-increasing sequence.

We now prove a convergence theorem for an arbitrary uniformly smooth Banach space. The modulus of smoothness $\rho(u)$ of a uniformly smooth Banach space is an even convex function such that $\rho(0) = 0$ and $\lim_{u\to 0}\rho(u)/u = 0$. The function $s(u) := \rho(u)/u$, $s(0) := 0$, associated with $\rho(u)$ is a continuous increasing function on $[0, \infty)$. Therefore, the inverse function $s^{-1}(\cdot)$ is well defined.

Theorem 6.4.3. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$. Assume that the sequence $\tau := \{t_k\}_{k=1}^\infty$ satisfies the following condition: for any $\theta > 0$ we have
\[ \sum_{m=1}^\infty t_m s^{-1}(\theta t_m) = \infty. \tag{6.4.3} \]
Then for any $f \in X$ we have, for WGAFR,
\[ \lim_{m\to\infty} \|f_m\| = 0. \]
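Proof. By Remark 6.4.2, $\{\|f_m\|\}$ is a non-increasing sequence. Therefore,
\[ \lim_{m\to\infty} \|f_m\| = \beta. \]
We prove that $\beta = 0$ by contradiction. Assume that $\beta > 0$. Then for any $m$ we have
\[ \|f_m\| \ge \beta. \]
We set $\varepsilon = \beta/2$ and find $f^\varepsilon$ such that
\[ \|f - f^\varepsilon\| \le \varepsilon \quad\text{and}\quad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}), \]
with some $A(\varepsilon) \ge \varepsilon$. Then, by Lemma 6.4.1,
\[ \|f_m\| \le \|f_{m-1}\| \inf_{\lambda\ge 0}\bigl(1 - \lambda t_m A(\varepsilon)^{-1}/2 + 2\rho(5\lambda/\beta)\bigr). \]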
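Let us specify $\theta := \beta/(40A(\varepsilon))$ and take $\lambda = \beta s^{-1}(\theta t_m)/5$. Then we obtain
\[ \|f_m\| \le \|f_{m-1}\|\bigl(1 - 2\theta t_m s^{-1}(\theta t_m)\bigr). \]
The assumption
\[ \sum_{m=1}^\infty t_m s^{-1}(\theta t_m) = \infty \]
implies that
\[ \|f_m\| \to 0 \quad\text{as } m \to \infty. \]
We reached a contradiction, which proves the theorem.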
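Theorem 6.4.4. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Take a number $\varepsilon \ge 0$ and two elements $f$, $f^\varepsilon$ from $X$ such that
\[ \|f - f^\varepsilon\| \le \varepsilon, \qquad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}), \]
with some number $A(\varepsilon) > 0$. Then we have, for WGAFR,
\[ \|f_m\| \le \max\Bigl(2\varepsilon,\ C(q, \gamma)\bigl(A(\varepsilon) + \varepsilon\bigr)\Bigl(1 + \sum_{k=1}^m t_k^p\Bigr)^{-1/p}\Bigr), \qquad p := q/(q-1). \]
Proof. It is clear that it suffices to consider the case $A(\varepsilon) \ge \varepsilon$. Otherwise,
\[ \|f_m\| \le \|f\| \le \|f^\varepsilon\| + \varepsilon \le 2\varepsilon. \]
Also, assume $\|f_m\| > 2\varepsilon$ (otherwise Theorem 6.4.4 holds trivially). Then by Remark 6.4.2 we have for all $k = 0, 1, \dots, m$ that $\|f_k\| > 2\varepsilon$. By Lemma 6.4.1,
\[ \|f_k\| \le \|f_{k-1}\| \inf_{\lambda\ge 0}\Bigl(1 - \lambda t_k A(\varepsilon)^{-1}/2 + 2\gamma\Bigl(\frac{5\lambda}{\|f_{k-1}\|}\Bigr)^q\Bigr). \tag{6.4.4} \]
Choose $\lambda$ from the equation
\[ \frac{\lambda t_k}{4A(\varepsilon)} = 2\gamma\Bigl(\frac{5\lambda}{\|f_{k-1}\|}\Bigr)^q, \]
which implies that
\[ \lambda = \|f_{k-1}\|^{q/(q-1)}\, 5^{-q/(q-1)} \bigl(8\gamma A(\varepsilon)\bigr)^{-1/(q-1)} t_k^{1/(q-1)}. \]
Define
\[ A_q := 4(8\gamma)^{1/(q-1)} 5^{q/(q-1)}. \]
Using the notation $p := q/(q-1)$ we infer from (6.4.4) that
\[ \|f_k\| \le \|f_{k-1}\|\Bigl(1 - \frac14\frac{\lambda t_k}{A(\varepsilon)}\Bigr) = \|f_{k-1}\|\Bigl(1 - \frac{t_k^p \|f_{k-1}\|^p}{A_q A(\varepsilon)^p}\Bigr). \]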
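Raising both sides of this inequality to the power $p$ and taking into account the inequality $x^r \le x$ for $r \ge 1$, $0 \le x \le 1$, we obtain
\[ \|f_k\|^p \le \|f_{k-1}\|^p\Bigl(1 - \frac{t_k^p \|f_{k-1}\|^p}{A_q A(\varepsilon)^p}\Bigr). \]
By an analog of Lemma 2.16 of Chapter 2 of [82] (see [72, Lemma 3.1]), using the estimates $\|f\| \le A(\varepsilon) + \varepsilon$ and $A_q > 1$, we get
\[ \|f_m\|^p \le A_q\bigl(A(\varepsilon) + \varepsilon\bigr)^p\Bigl(1 + \sum_{k=1}^m t_k^p\Bigr)^{-1}, \]
whence
\[ \|f_m\| \le C(q, \gamma)\bigl(A(\varepsilon) + \varepsilon\bigr)\Bigl(1 + \sum_{k=1}^m t_k^p\Bigr)^{-1/p}. \]
Theorem 6.4.4 is proved.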
Remark 6.4.5. It follows from the above proof of Theorem 6.4.4 that $C(q, \gamma) \le C\gamma^{1/q}$. In particular, in the case $X = L_p$ the inequality (6.1.12) implies that $C(q, \gamma) \le Cp^{1/2}$ for $2 \le p < \infty$.
6.5 Fixed relaxation

In this section we consider a relaxed greedy algorithm with relaxation prescribed in advance. Let a sequence $r := \{r_k\}_{k=1}^\infty$, $r_k \in [0, 1)$, of relaxation parameters be given. Then at each iteration of our new algorithm we build the $m$-th approximant in the form $G_m = (1 - r_m)G_{m-1} + \lambda\varphi_m$. With an approximant of this form we are not limited to approximation of functions from $A_1(\mathcal{D})$, as in WRGA. In this section we study the Greedy Algorithm with Weakness parameter $t$ and Relaxation $r$ (GAWR($t, r$)). In addition to the acronym GAWR($t, r$) we will use the abbreviation GAWR for the name of this algorithm. We give a general definition of the algorithm in the case of a weakness sequence $\tau$. We present in this section results from [80].
GAWR($\tau, r$)
Let $\tau := \{t_m\}_{m=1}^\infty$, $t_m \in (0, 1]$, be a weakness sequence and let $r := \{r_m\}_{m=1}^\infty$, $r_m \in [0, 1)$, be a relaxation sequence. We define $f_0 := f$ and $G_0 := 0$. Then for each $m \ge 1$ we give the following inductive definition.
(1) $\varphi_m \in \mathcal{D}$ is any element satisfying
\[ F_{f_{m-1}}(\varphi_m) \ge t_m \|F_{f_{m-1}}\|_{\mathcal{D}}. \]
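(2) Find $\lambda_m \ge 0$ such that
\[ \bigl\|f - \bigl((1-r_m)G_{m-1} + \lambda_m\varphi_m\bigr)\bigr\| = \inf_{\lambda\ge 0} \bigl\|f - \bigl((1-r_m)G_{m-1} + \lambda\varphi_m\bigr)\bigr\| \]
and define
\[ G_m := (1-r_m)G_{m-1} + \lambda_m\varphi_m. \]
(3) Let $f_m := f - G_m$.

In the case $\tau = \{t\}$ we write $t$ instead of $\tau$ in the notation. We note that in the case $r_k = 0$, $k = 1, 2, \dots$, when there is no relaxation, GAWR($\tau, 0$) coincides with the Weak Dual Greedy Algorithm. We now proceed to GAWR. We begin with an analog of Lemma 6.2.9.

Lemma 6.5.1. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$. Take a number $\varepsilon \ge 0$ and two elements $f$, $f^\varepsilon$ from $X$ such that
\[ \|f - f^\varepsilon\| \le \varepsilon, \qquad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}), \]
with some number $A(\varepsilon) > 0$. Then we have, for GAWR($t, r$),
\[ \|f_m\| \le \|f_{m-1}\|\Bigl(1 - r_m\bigl(1 - \varepsilon/\|f_{m-1}\|\bigr) + 2\rho\Bigl(\frac{r_m(\|f\| + A(\varepsilon)/t)}{(1-r_m)\|f_{m-1}\|}\Bigr)\Bigr), \qquad m = 1, 2, \dots. \]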
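The GAWR($t, r$) iteration defined above can be sketched in $\mathbb{R}^n$ with the Euclidean norm, where the minimization over $\lambda \ge 0$ has a closed form. The relaxation sequence $r_m = 2/(m+2)$ used below is the one that appears in the rate result later in this section; the symmetric finite dictionary and the name `gawr` are assumptions made for the example.

```python
import math

def dot(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))

def gawr(f, dictionary, t, steps):
    """GAWR(t, r) in R^n (Euclidean norm) with r_m = 2 / (m + 2)."""
    approx = [0.0] * len(f)
    residual = list(f)
    norms = [math.sqrt(dot(f, f))]
    for m in range(1, steps + 1):
        # Step (1): weak greedy selection with F_{f_{m-1}}(g) ~ <f_{m-1}, g>.
        scores = [dot(residual, g) for g in dictionary]
        sup = max(scores)
        phi = next(g for g, s in zip(dictionary, scores) if s >= t * sup)
        # Step (2): shrink the previous approximant, then fit lam >= 0 along phi.
        r = 2.0 / (m + 2.0)
        shrunk = [(1.0 - r) * a for a in approx]
        target = [a - b for a, b in zip(f, shrunk)]
        lam = max(0.0, dot(target, phi) / dot(phi, phi))
        approx = [a + lam * b for a, b in zip(shrunk, phi)]
        # Step (3): new residual.
        residual = [a - b for a, b in zip(f, approx)]
        norms.append(math.sqrt(dot(residual, residual)))
    return approx, norms
```

Unlike WCGA, WRGA and WGAFR, the residual norms here need not be monotone. What the definition does give directly, by taking $\lambda = 0$, is $\|f_m\| \le (1 - r_m)\|f_{m-1}\| + r_m\|f\|$, and hence $\|f_m\| \le \|f\|$ for all $m$.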
Theorem 6.5.2. Let a sequence $r$ satisfy the conditions
\[ \sum_{k=1}^\infty r_k = \infty, \qquad r_k \to 0 \ \text{ as } k \to \infty. \]
Then GAWR($t, r$) converges in any uniformly smooth Banach space for each $f \in X$ and for all dictionaries $\mathcal{D}$.

Proof. We prove this theorem in two steps.

Step I. First, we prove that $\liminf_{m\to\infty} \|f_m\| = 0$. Assume the contrary. Then there exist $K$ and $\beta > 0$ such that we have for all $k \ge K$ that $\|f_k\| \ge \beta$. By Lemma 6.5.1, for $m > K$,
\[ \|f_m\| \le \|f_{m-1}\|\Bigl(1 - r_m\Bigl(1 - \frac{\varepsilon}{\beta}\Bigr) + 2\rho\Bigl(\frac{r_m(\|f\| + A(\varepsilon)/t)}{(1-r_m)\beta}\Bigr)\Bigr). \]
We choose $\varepsilon := \beta/2$. Using the assumption that $X$ is uniformly smooth and the assumption that $r_k \to 0$ as $k \to \infty$, we find $N \ge K$ such that for $m \ge N$ we have
\[ 2\rho\Bigl(\frac{r_m(\|f\| + A(\varepsilon)/t)}{(1-r_m)\beta}\Bigr) \le r_m/4. \]
Then, for $m > N$,
\[ \|f_m\| \le \|f_{m-1}\|(1 - r_m/4). \]
The assumption $\sum_{m=1}^\infty r_m = \infty$ implies that $\|f_m\| \to 0$ as $m \to \infty$. This contradiction to the assumption $\beta > 0$ completes the proof of Step I.

Step II. Secondly, we prove that $\lim_{m\to\infty} \|f_m\| = 0$. Using the assumption that $r_k \to 0$ as $k \to \infty$, we find $N_1$ such that for $k \ge N_1$ we have $r_k \le 1/2$. For such $k$ we obtain from Lemma 6.5.1 that
\[ \|f_k\| - \varepsilon \le (1 - r_k)\bigl(\|f_{k-1}\| - \varepsilon\bigr) + 2\|f_{k-1}\|\,\rho\Bigl(\frac{Br_k}{\|f_{k-1}\|}\Bigr), \tag{6.5.1} \]
with $B := 2(\|f\| + A(\varepsilon)/t)$. Denote $a_k := \|f_{k-1}\| - \varepsilon$. We note that from the definition of $f_k$ it follows that
\[ a_{k+1} \le a_k + r_k\|f\|. \tag{6.5.2} \]
Using the fact that the function $\rho(u)/u$ is monotone increasing on $[0, \infty)$, we obtain from (6.5.1), for $a_k > 0$,
\[ a_{k+1} \le a_k\Bigl(1 - r_k + 2\frac{\|f_{k-1}\|}{a_k}\rho\Bigl(\frac{Br_k}{\|f_{k-1}\|}\Bigr)\Bigr) \le a_k\Bigl(1 - r_k + 2\rho\Bigl(\frac{Br_k}{a_k}\Bigr)\Bigr). \tag{6.5.3} \]
We now introduce an auxiliary sequence $\{b_k\}$ of positive numbers that is defined by the equation
\[ 2\rho(Br_k/b_k) = r_k. \]
The property $\rho(u)/u \to 0$ as $u \to 0$ implies $b_k \to 0$ as $k \to \infty$. Inequality (6.5.3) guarantees that for $k \ge N_1$ such that $a_k \ge b_k$ we have $a_{k+1} \le a_k$. Let
\[ U := \{k : k \ge N_1,\ a_k \ge b_k\}. \]
If the set $U$ is finite then we get
\[ \limsup_{k\to\infty} a_k \le \lim_{k\to\infty} b_k = 0. \]
This implies
\[ \limsup_{m\to\infty} \|f_m\| \le \varepsilon. \]
Consider the case when $U$ is infinite. We note that part I of the proof implies that there is a subsequence $\{k_j\}$ such that $a_{k_j} \le 0$, $j = 1, 2, \dots$. This means that
\[ U = \bigcup_{j=1}^\infty [l_j, n_j], \]
with the property $n_{j-1} < l_j - 1$. For $k \notin U$, $k \ge N_1$, we have
\[ a_k < b_k. \tag{6.5.4} \]
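For $k \in [l_j, n_j]$, (6.5.2) and the monotonicity property of $a_k$ imply that
\[ a_k \le a_{l_j} \le a_{l_j - 1} + r_{l_j - 1}\|f\| \le b_{l_j - 1} + r_{l_j - 1}\|f\|. \tag{6.5.5} \]
By (6.5.4) and (6.5.5),
\[ \limsup_{k\to\infty} a_k \le 0 \quad\Longrightarrow\quad \limsup_{m\to\infty} \|f_m\| \le \varepsilon. \]
Taking into account that $\varepsilon > 0$ is arbitrary, we complete the proof.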
We now proceed to results on the rate of approximation. We will need the following technical lemma (see [71], [80] and Chapter 2 of [82]).

Lemma 6.5.3. Let a sequence $\{a_n\}_{n=1}^\infty$ have the following property. For given positive numbers $\alpha < \gamma \le 1$, $A > a_1$, we have, for all $n \ge 2$,
\[ a_n \le a_{n-1} + A(n-1)^{-\alpha}. \tag{6.5.6} \]
If for some $\nu \ge 2$,
\[ a_\nu \ge A\nu^{-\alpha}, \]
then
\[ a_{\nu+1} \le a_\nu(1 - \gamma/\nu). \tag{6.5.7} \]
Then there exists a constant $C(\alpha, \gamma)$ such that for all $n = 1, 2, \dots$ we have
\[ a_n \le C(\alpha, \gamma)An^{-\alpha}. \]
Theorem 6.5.4. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Let $r := \{2/(k+2)\}_{k=1}^\infty$. Consider the GAWR($t, r$). For a pair of functions $f$, $f^\varepsilon$ satisfying
\[ \|f - f^\varepsilon\| \le \varepsilon, \qquad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}), \]
we have
\[ \|f_m\| \le \varepsilon + C(q, \gamma)\bigl(\|f\| + A(\varepsilon)/t\bigr)m^{-1+1/q}. \]
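Proof. By Lemma 6.5.1 we obtain
\[ \|f_k\| - \varepsilon \le (1 - r_k)\bigl(\|f_{k-1}\| - \varepsilon\bigr) + C\gamma\|f_{k-1}\|\Bigl(\frac{r_k(\|f\| + A(\varepsilon)/t)}{\|f_{k-1}\|}\Bigr)^q. \tag{6.5.8} \]
Consider, as in the proof of Theorem 6.5.2, the sequence $a_n := \|f_{n-1}\| - \varepsilon$. We plan to apply Lemma 6.5.3 to the sequence $\{a_n\}$. We set $\alpha := 1 - 1/q \le 1/2$. The parameters $\gamma \in (\alpha, 1]$ and $A$ will be chosen later. We note that
\[ \|f_m\| \le \|f_{m-1}\| + r_m\|f\|. \]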
Therefore, the condition (6.5.6) of Lemma 6.5.3 is satisfied with $A \ge 2\|f\|$. Let $a_k \ge Ak^{-\alpha}$. Then, by (6.5.8),
\[ a_{k+1} \le a_k\Bigl(1 - r_k + C\gamma\bigl(r_k(\|f\| + A(\varepsilon)/t)/a_k\bigr)^q\Bigr) \le a_k\Bigl(1 - \frac{2}{k+2} + \frac{C\gamma(\|f\| + A(\varepsilon)/t)^q\, 2^q k^{\alpha q}}{A^q (k+2)^q}\Bigr). \]
Setting $A := \max\bigl(2\|f\|,\ 2(2C\gamma)^{1/q}(\|f\| + A(\varepsilon)/t)\bigr)$, we obtain
\[ a_{k+1} \le a_k\Bigl(1 - \frac{3}{2(k+2)}\Bigr). \]
Thus condition (6.5.7) of Lemma 6.5.3 is satisfied with $\gamma = 3/4$. Applying Lemma 6.5.3 we obtain
\[ \|f_m\| \le \varepsilon + C(q, \gamma)\bigl(\|f\| + A(\varepsilon)/t\bigr)m^{-1+1/q}. \]
We conclude the study of GAWR by the following remark. The algorithms GAWR and WGAFR are both dual-type greedy algorithms. The first steps are similar for both algorithms: we use the norming functional $F_{f_{m-1}}$ in the search for an element $\varphi_m$. WGAFR provides more freedom than GAWR does in choosing good coefficients $w_m$ and $\lambda_m$. This results in more flexibility in choosing the weakness sequence $\tau = \{t_m\}$. For instance, condition (6.4.3) of Theorem 6.4.3 is satisfied if $\tau = \{t\}$, $t \in (0, 1]$, for any uniformly smooth Banach space. In the case $\rho(u) \le \gamma u^q$, $1 < q \le 2$, condition (6.4.3) is satisfied if
\[ \sum_{m=1}^\infty t_m^p = \infty, \qquad p := q/(q-1). \]
We proceed to one more thresholding-type algorithm (see [77]). Keeping in mind possible applications of this algorithm, we do not assume that its dictionary $\mathcal{D}$ is symmetric (that is, that $g \in \mathcal{D}$ implies $-g \in \mathcal{D}$). To indicate this, we use the notation $\mathcal{D}^+$ for such a dictionary. We do not assume that elements of the dictionary $\mathcal{D}^+$ are normalized ($\|g\| = 1$ if $g \in \mathcal{D}^+$), only that $\|g\| \le 1$ if $g \in \mathcal{D}^+$. By $A_1(\mathcal{D}^+)$ we denote the closure of the convex hull of $\mathcal{D}^+$. Let $\varepsilon = \{\varepsilon_n\}_{n=1}^{\infty}$, $\varepsilon_n > 0$, $n = 1, 2, \dots$.
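Incremental Algorithm with schedule $\varepsilon$ (IA($\varepsilon$))
Let $f \in A_1(\mathcal{D}^+)$. Set $f_0^{i,\varepsilon} := f$ and $G_0^{i,\varepsilon} := 0$. Then for each $m \ge 1$ we give the following inductive definition.
(1) $\varphi_m^{i,\varepsilon} \in \mathcal{D}^+$ is any element satisfying
\[ F_{f_{m-1}^{i,\varepsilon}}\bigl(\varphi_m^{i,\varepsilon} - f\bigr) \ge -\varepsilon_m. \]
(2) Define
\[ G_m^{i,\varepsilon} := (1 - 1/m)G_{m-1}^{i,\varepsilon} + \varphi_m^{i,\varepsilon}/m. \]
(3) Let
\[ f_m^{i,\varepsilon} := f - G_m^{i,\varepsilon}. \]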
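The three steps of IA($\varepsilon$) can be sketched in code. The following is only an illustration under simplifying assumptions that are not part of the text: the space is Euclidean $\mathbb{R}^n$ (a Hilbert space, where the norming functional is $F_f(g) = \langle f, g\rangle/\|f\|$), the dictionary is a finite list, and step (1) is realized by taking the maximizer of $\langle f_{m-1}, g\rangle$, which satisfies the threshold condition for every schedule. The function name `incremental_algorithm` and the example data are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def incremental_algorithm(f, dictionary, steps):
    """Run IA in Euclidean R^n and return [||f_0||, ||f_1||, ...]."""
    G = [0.0] * len(f)
    residual = list(f)
    norms = [norm(residual)]
    for m in range(1, steps + 1):
        if norms[-1] == 0.0:
            break  # f is already reproduced exactly
        # Step (1): the maximizer of <f_{m-1}, g> over D+ satisfies
        # F_{f_{m-1}}(phi_m - f) >= 0 >= -eps_m for every schedule.
        phi = max(dictionary, key=lambda g: dot(residual, g))
        # Step (2): G_m = (1 - 1/m) G_{m-1} + phi_m / m.
        G = [(1.0 - 1.0 / m) * Gi + pi / m for Gi, pi in zip(G, phi)]
        # Step (3): f_m = f - G_m.
        residual = [fi - Gi for fi, Gi in zip(f, G)]
        norms.append(norm(residual))
    return norms

# Hypothetical example: D+ is the standard basis of R^3 (not symmetric),
# and f is a convex combination of its elements.
basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
f = (0.5, 0.3, 0.2)
ia_norms = incremental_algorithm(f, basis, 200)
```

In this Euclidean setting, expanding $\|f_m\|^2$ and using $\|\varphi_m - f\| \le 2$ gives $\|f_m\| \le 2m^{-1/2}$, which is the $m^{-1/p}$ rate with $p = 2$.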
We note that, as in Lemma 6.2.8, for any bounded linear functional $F$ and any $\mathcal{D}^+$ we have
\[ \sup_{g \in \mathcal{D}^+} F(g) = \sup_{f \in A_1(\mathcal{D}^+)} F(f). \]
Therefore, for any $F$ and any $f \in A_1(\mathcal{D}^+)$,
\[ \sup_{g \in \mathcal{D}^+} F(g) \ge F(f). \]
This guarantees the existence of $\varphi_m^{i,\varepsilon}$.

Theorem 6.5.5. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Define
\[ \varepsilon_n := K_1\gamma^{1/q}n^{-1/p}, \qquad p = \frac{q}{q-1}, \qquad n = 1, 2, \dots. \]
Then for any $f \in A_1(\mathcal{D}^+)$ we have
\[ \|f_m^{i,\varepsilon}\| \le C(K_1)\gamma^{1/q}m^{-1/p}, \qquad m = 1, 2, \dots. \]

Proof. We will use the abbreviated notation $f_m := f_m^{i,\varepsilon}$, $\varphi_m := \varphi_m^{i,\varepsilon}$, $G_m := G_m^{i,\varepsilon}$. Writing
\[ f_m = f_{m-1} - (\varphi_m - G_{m-1})/m \]
we immediately obtain the trivial estimate
\[ \|f_m\| \le \|f_{m-1}\| + 2/m. \tag{6.5.9} \]
Represent
\[ f_m = (1 - 1/m)f_{m-1} - (\varphi_m - f)/m = (1 - 1/m)\bigl(f_{m-1} - (\varphi_m - f)/(m-1)\bigr). \tag{6.5.10} \]
We obtain
\[ \|f_{m-1} - (\varphi_m - f)/(m-1)\| \le \|f_{m-1}\|\bigl(1 + 2\rho\bigl(2((m-1)\|f_{m-1}\|)^{-1}\bigr)\bigr) + \varepsilon_m(m-1)^{-1}, \tag{6.5.11} \]
in a similar way to (6.2.5). Using the definition of $\varepsilon_m$ and the assumption that $\rho(u) \le \gamma u^q$, we make the following observation. There exists a constant $C(K_1)$ such that if
\[ \|f_{m-1}\| \ge C(K_1)\gamma^{1/q}(m-1)^{-1/p} \tag{6.5.12} \]
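then
\[ 2\rho\bigl(2((m-1)\|f_{m-1}\|)^{-1}\bigr) + \varepsilon_m\bigl((m-1)\|f_{m-1}\|\bigr)^{-1} \le 1/(4m), \tag{6.5.13} \]
and therefore, by (6.5.10) and (6.5.11),
\[ \|f_m\| \le \bigl(1 - 3/(4m)\bigr)\|f_{m-1}\|. \tag{6.5.14} \]
Taking into account (6.5.9), we apply Lemma 6.5.3 to the sequence $a_n = \|f_n\|$, $n = 1, 2, \dots$, with $\alpha = 1/p$ and with $3/4$ in the role of the lemma's parameter $\gamma$, and complete the proof of Theorem 6.5.5.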
6.6 Relaxation. X-greedy algorithms

In Sections 6.2–6.5 we studied dual greedy algorithms. In this section we define some generalizations of the X-Greedy Algorithm using the idea of relaxation. We begin with an analog of WGAFR.

X-Greedy Algorithm with Free Relaxation (XGAFR)
Set $f_0 := f$ and $G_0 := 0$. Then for each $m \ge 1$ we give the following inductive definition.
(1) $\varphi_m \in \mathcal{D}$ and $\lambda_m \ge 0$, $w_m$ are such that (we assume existence)
\[ \bigl\|f - \bigl((1 - w_m)G_{m-1} + \lambda_m\varphi_m\bigr)\bigr\| = \inf_{g \in \mathcal{D},\,\lambda \ge 0,\,w}\bigl\|f - \bigl((1 - w)G_{m-1} + \lambda g\bigr)\bigr\| \]
and $G_m := (1 - w_m)G_{m-1} + \lambda_m\varphi_m$.
(2) Let $f_m := f - G_m$.

Using this definition, we obtain that, for any $t \in (0, 1]$,
\[ \|f_m\| \le \inf_{\lambda \ge 0,\,w}\bigl\|f - \bigl((1 - w)G_{m-1} + \lambda\varphi_m^t\bigr)\bigr\|, \]
where $\varphi_m^t \in \mathcal{D}$ is an element satisfying
\[ F_{f_{m-1}}(\varphi_m^t) \ge t\|F_{f_{m-1}}\|_{\mathcal{D}}. \]
Setting $t = 1$ we obtain a version of Lemma 6.4.1 for XGAFR.

Lemma 6.6.1. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$. Take a number $\varepsilon \ge 0$ and two elements $f$, $f^\varepsilon$ from $X$ such that
\[ \|f - f^\varepsilon\| \le \varepsilon, \qquad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}), \]
with some number $A(\varepsilon) \ge \varepsilon$. Then, for XGAFR,
\[ \|f_m\| \le \|f_{m-1}\|\inf_{\lambda \ge 0}\left(1 - \lambda A(\varepsilon)^{-1}\left(1 - \frac{\varepsilon}{\|f_{m-1}\|}\right) + 2\rho\left(\frac{5\lambda}{\|f_{m-1}\|}\right)\right), \]
for $m = 1, 2, \dots$.

Theorems 6.4.3 and 6.4.4 were derived from Lemma 6.4.1. In the same way we derive from Lemma 6.6.1 the following analogs for XGAFR.

Theorem 6.6.2. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$. Then for any $f \in X$ we have, for XGAFR,
\[ \lim_{m \to \infty}\|f_m\| = 0. \]

Theorem 6.6.3. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Take a number $\varepsilon \ge 0$ and two elements $f$, $f^\varepsilon$ from $X$ such that
\[ \|f - f^\varepsilon\| \le \varepsilon, \qquad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}) \]
with some number $A(\varepsilon) > 0$. Then, for XGAFR,
\[ \|f_m\| \le \max\bigl(2\varepsilon,\ C(q, \gamma)(A(\varepsilon) + \varepsilon)(1 + m)^{-1/p}\bigr), \qquad p := q/(q-1). \]
We now proceed to an analogue of the GAWR.
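X-Greedy Algorithm with Relaxation $r$ (XGAR($r$))
Given a relaxation sequence $r := \{r_m\}_{m=1}^{\infty}$, $r_m \in [0, 1)$, we set $f_0 := f$ and $G_0 := 0$. Then for each $m \ge 1$ we give the following inductive definition.
(1) $\varphi_m \in \mathcal{D}$ and $\lambda_m \ge 0$ are such that (we assume existence)
\[ \bigl\|f - \bigl((1 - r_m)G_{m-1} + \lambda_m\varphi_m\bigr)\bigr\| = \inf_{g \in \mathcal{D},\,\lambda \ge 0}\bigl\|f - \bigl((1 - r_m)G_{m-1} + \lambda g\bigr)\bigr\| \]
and $G_m := (1 - r_m)G_{m-1} + \lambda_m\varphi_m$.
(2) Let $f_m := f - G_m$.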
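As an illustration of the definition of XGAR($r$), here is a minimal sketch under assumptions that go beyond the text: the space is Euclidean $\mathbb{R}^n$, the dictionary is a finite symmetric list of unit vectors (so the infimum is attained and can be computed exactly), and the relaxation sequence is $r_m = 2/(m+2)$. The names `xgar` and `symmetric_atoms` are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def xgar(f, dictionary, steps):
    """XGAR(r) with r_m = 2/(m+2) in Euclidean R^n for unit-norm atoms."""
    G = [0.0] * len(f)
    norms = [math.sqrt(dot(f, f))]
    for m in range(1, steps + 1):
        r = 2.0 / (m + 2)
        # h = f - (1 - r_m) G_{m-1} is the target of the one-atom fit.
        h = [fi - (1.0 - r) * Gi for fi, Gi in zip(f, G)]
        # For unit atoms, ||h - lam*g||^2 = ||h||^2 - 2*lam*<h,g> + lam^2,
        # so the best atom maximizes <h,g> and the best lam is max(0, <h,g>).
        phi = max(dictionary, key=lambda g: dot(h, g))
        lam = max(0.0, dot(h, phi))
        G = [(1.0 - r) * Gi + lam * pi for Gi, pi in zip(G, phi)]
        residual = [fi - Gi for fi, Gi in zip(f, G)]
        norms.append(math.sqrt(dot(residual, residual)))
    return norms

# Hypothetical example: the symmetrized standard basis of R^3.
atoms = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
symmetric_atoms = atoms + [tuple(-x for x in g) for g in atoms]
xgar_norms = xgar((0.5, 0.3, 0.2), symmetric_atoms, 200)
```

In a general Banach space the minimization in step (1) has no closed form; the explicit choice of $\lambda_m$ above relies on the inner product.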
We note that in the case $r_k = 0$, $k = 1, 2, \dots$, when there is no relaxation, XGAR(0) coincides with the X-Greedy Algorithm. Practically nothing is known about convergence and rate of convergence of the X-Greedy Algorithm. However, relaxation helps to prove convergence results for XGAR($r$). Here are analogs of the corresponding results for GAWR.

Lemma 6.6.4. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$. Take a number $\varepsilon \ge 0$ and two elements $f$, $f^\varepsilon$ from $X$ such that
\[ \|f - f^\varepsilon\| \le \varepsilon, \qquad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}) \]
with some number $A(\varepsilon) > 0$. Then, for XGAR($r$),
\[ \|f_m\| \le \|f_{m-1}\|\left(1 - r_m\left(1 - \frac{\varepsilon}{\|f_{m-1}\|}\right) + 2\rho\left(\frac{r_m(\|f\| + A(\varepsilon))}{(1 - r_m)\|f_{m-1}\|}\right)\right) \]
for $m = 1, 2, \dots$.

Theorem 6.6.5. Let the sequence $r := \{r_k\}_{k=1}^{\infty}$, $r_k \in [0, 1)$, satisfy the conditions
\[ \sum_{k=1}^{\infty} r_k = \infty, \qquad r_k \to 0 \quad\text{as}\quad k \to \infty. \]
Then XGAR($r$) converges in any uniformly smooth Banach space for each $f \in X$ and for all dictionaries $\mathcal{D}$.

Theorem 6.6.6. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Let $r := \{2/(k+2)\}_{k=1}^{\infty}$. Consider XGAR($r$). For a pair of functions $f$, $f^\varepsilon$ satisfying
\[ \|f - f^\varepsilon\| \le \varepsilon, \qquad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}), \]
we have
\[ \|f_m\| \le \varepsilon + C(q, \gamma)\bigl(\|f\| + A(\varepsilon)\bigr)m^{-1+1/q}. \]
6.7 Greedy expansions

6.7.1 Introduction

From the definition of a dictionary it follows that any element $f \in X$ can be approximated arbitrarily well by finite linear combinations of elements in the dictionary. The primary goal of this section is to study representations of an element $f \in X$ by a series
\[ f \sim \sum_{j=1}^{\infty} c_j(f)g_j(f), \qquad g_j(f) \in \mathcal{D}, \quad c_j(f) > 0, \quad j = 1, 2, \dots. \tag{6.7.1} \]
In building the representation (6.7.1) we need to construct two sequences: $\{g_j(f)\}_{j=1}^{\infty}$ and $\{c_j(f)\}_{j=1}^{\infty}$. In this section the construction of $\{g_j(f)\}_{j=1}^{\infty}$ will be based on ideas used in greedy-type nonlinear approximation (greedy-type algorithms). This justifies the use of the term greedy expansion for (6.7.1) considered in the section. The construction of $\{g_j(f)\}_{j=1}^{\infty}$ is, clearly, the most important and difficult part in building the representation (6.7.1). On the basis of the contemporary theory of nonlinear approximation with respect to redundant dictionaries, we may conclude that the method of using a norming functional in greedy steps of an algorithm is the most productive in approximation in Banach spaces. This method was utilized in the Weak Chebyshev Greedy Algorithm and in the Weak Dual Greedy Algorithm. We use this same method in new algorithms considered in this section. A new qualitative result of this section demonstrates that we have a lot of flexibility in constructing a sequence of coefficients $\{c_j(f)\}_{j=1}^{\infty}$. Denote
\[ r_{\mathcal{D}}(f) := \sup_{F_f}\|F_f\|_{\mathcal{D}} := \sup_{F_f}\sup_{g \in \mathcal{D}} F_f(g). \]
We note that, in general, a norming functional $F_f$ is not unique. This is why in the definition of $r_{\mathcal{D}}(f)$ we take $\sup_{F_f}$ over all norming functionals of $f$. It is known that in the case of uniformly smooth Banach spaces (our primary object here) the norming functional $F_f$ is unique. In such a case we do not need $\sup_{F_f}$ in the definition of $r_{\mathcal{D}}(f)$; instead, we have $r_{\mathcal{D}}(f) = \|F_f\|_{\mathcal{D}}$. We begin with a description of a general scheme that provides an expansion for a given element $f$. Later, specifying this general scheme, we will obtain different methods of expansion.
Dual-Based Expansion (DBE)
Let $t \in (0, 1]$ and $f \ne 0$. Denote $f_0 := f$. Assume $\{f_j\}_{j=0}^{m-1} \subset X$, $\{\varphi_j\}_{j=1}^{m-1} \subset \mathcal{D}$ and that a set of coefficients $\{c_j\}_{j=1}^{m-1}$ of expansion have already been constructed. If $f_{m-1} = 0$, then we stop (set $c_j = 0$, $j = m, m+1, \dots$ in the expansion) and get $f = \sum_{j=1}^{m-1} c_j\varphi_j$. If $f_{m-1} \ne 0$, then we conduct the following two steps.
(1) Choose $\varphi_m \in \mathcal{D}$ such that
\[ \sup_{F_{f_{m-1}}} F_{f_{m-1}}(\varphi_m) \ge t\,r_{\mathcal{D}}(f_{m-1}). \]
(2) Define
\[ f_m := f_{m-1} - c_m\varphi_m, \]
where $c_m > 0$ is a coefficient either prescribed in advance or chosen from a concrete approximation procedure.

We call the series
\[ f \sim \sum_{j=1}^{\infty} c_j\varphi_j \tag{6.7.2} \]
the dual-based expansion (DBE) of $f$ with coefficients $c_j(f) := c_j$, $j = 1, 2, \dots$, with respect to $\mathcal{D}$.
Denote
\[ S_m(f, \mathcal{D}) := \sum_{j=1}^{m} c_j\varphi_j. \]
Then it is clear that $f_m = f - S_m(f, \mathcal{D})$. We prove some convergence results for DBE in Subsections 6.7.2 and 6.7.3. In Subsection 6.7.3 we consider a variant of the Dual-Based Expansion with coefficients chosen by a certain simple rule. The rule depends on two numerical parameters, $t \in (0, 1]$ (the weakness parameter from the definition of DBE) and $b \in (0, 1)$ (the tuning parameter of the approximation method). The rule also depends on a majorant $\mu$ of the modulus of smoothness of the Banach space $X$.

Dual Greedy Algorithm with parameters $(t, b, \mu)$ (DGA$(t, b, \mu)$)
Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$, and let $\mu(u)$ be a continuous majorant of $\rho(u)$: $\rho(u) \le \mu(u)$, $u \in [0, \infty)$. For parameters $t \in (0, 1]$, $b \in (0, 1]$ we define sequences $\{f_m\}_{m=0}^{\infty}$, $\{\varphi_m\}_{m=1}^{\infty}$, $\{c_m\}_{m=1}^{\infty}$ inductively. Let $f_0 := f$. If $f_{m-1} = 0$ for some $m \ge 1$, then we set $f_j = 0$ for $j \ge m$ and stop. If $f_{m-1} \ne 0$, then we conduct the following three steps.
(1) Take any $\varphi_m \in \mathcal{D}$ such that
\[ F_{f_{m-1}}(\varphi_m) \ge t\,r_{\mathcal{D}}(f_{m-1}). \tag{6.7.3} \]
(2) Choose $c_m > 0$ from the equation
\[ \|f_{m-1}\|\,\mu\bigl(c_m/\|f_{m-1}\|\bigr) = \frac{tb}{2}\,c_m r_{\mathcal{D}}(f_{m-1}). \tag{6.7.4} \]
(3) Define
\[ f_m := f_{m-1} - c_m\varphi_m. \tag{6.7.5} \]

In Subsection 6.7.3 we prove the following convergence result.

Theorem 6.7.1. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$, and let $\mu(u)$ be a continuous majorant of $\rho(u)$ with the property that $\mu(u)/u \downarrow 0$ as $u \to +0$. Then, for any $t \in (0, 1]$ and $b \in (0, 1)$, DGA$(t, b, \mu)$ converges for each dictionary $\mathcal{D}$ and all $f \in X$.

The following result from Subsection 6.7.3 gives the rate of convergence.

Theorem 6.7.2. Assume $X$ has a modulus of smoothness $\rho(u) \le \gamma u^q$, $q \in (1, 2]$, and $b \in (0, 1)$. Denote $\mu(u) = \gamma u^q$. Then, for any dictionary $\mathcal{D}$ and any $f \in A_1(\mathcal{D})$, the rate of convergence of DGA$(t, b, \mu)$ is given by
\[ \|f_m\| \le C(t, b, \gamma, q)\,m^{-\frac{t(1-b)}{p(1+t(1-b))}}, \qquad p := \frac{q}{q-1}. \]
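6.7.2 Convergence of the Dual-Based Expansion

We begin with the following lemma.

Lemma 6.7.3. Let $f \in X$. Assume that the coefficients $\{c_j\}_{j=1}^{\infty}$ of the expansion
\[ f \sim \sum_{j=1}^{\infty} c_j\varphi_j, \qquad f_m := f - \sum_{j=1}^{m} c_j\varphi_j, \]
are non-negative and satisfy the following two conditions:
\[ \sum_{j=1}^{\infty} c_j r_{\mathcal{D}}(f_{j-1}) < \infty, \tag{6.7.6} \]
\[ \sum_{j=1}^{\infty} c_j = \infty. \tag{6.7.7} \]
Then
\[ \liminf_{m \to \infty}\|f_m\| = 0. \tag{6.7.8} \]

Proof. The proof of this lemma is similar to the proof of Lemma 1 from [26]. Denote $s_n := \sum_{j=1}^{n} c_j$. Then (6.7.7) implies (see [4, p. 904]) that
\[ \sum_{n=1}^{\infty} c_n/s_n = \infty. \tag{6.7.9} \]
Using (6.7.6) we get
\[ \sum_{n=1}^{\infty} s_n r_{\mathcal{D}}(f_{n-1})\,c_n/s_n = \sum_{n=1}^{\infty} c_n r_{\mathcal{D}}(f_{n-1}) < \infty. \]
Thus, by (6.7.9),
\[ \liminf_{n \to \infty} s_n r_{\mathcal{D}}(f_{n-1}) = 0 \]
and also (since $s_{n-1} \le s_n$)
\[ \liminf_{n \to \infty} s_n r_{\mathcal{D}}(f_n) = 0. \]
Let
\[ \lim_{k \to \infty} s_{n_k} r_{\mathcal{D}}(f_{n_k}) = 0. \tag{6.7.10} \]
Consider $\{F_{f_{n_k}}\}$. The unit sphere in the dual $X^*$ is weakly$^*$ compact (see [32, p. 45]). Let $\{F_i\}_{i=1}^{\infty}$, $F_i := F_{f_{n_{k_i}}}$, be a $w^*$-convergent subsequence. Denote
\[ F := w^*\text{-}\lim_{i \to \infty} F_i. \]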
We will complete the proof of Lemma 6.7.3 by contradiction. We assume that (6.7.8) does not hold, that is, there exist $\alpha > 0$ and $N \in \mathbb{N}$ such that
\[ \|f_m\| \ge \alpha, \qquad m \ge N. \tag{6.7.11} \]
Then from (6.7.11) it follows that $F \ne 0$. Indeed, we have
\[ F(f) = \lim_{i \to \infty} F_i(f) \tag{6.7.12} \]
and
\[ F_i(f) = F_i\Bigl(f_{n_{k_i}} + \sum_{j=1}^{n_{k_i}} c_j\varphi_j\Bigr) = \|f_{n_{k_i}}\| + \sum_{j=1}^{n_{k_i}} c_j F_i(\varphi_j) \ge \alpha - s_{n_{k_i}} r_{\mathcal{D}}(f_{n_{k_i}}) \tag{6.7.13} \]
for large $i$. The relations (6.7.12), (6.7.13) and (6.7.10) imply that $F(f) \ge \alpha$, and hence $F \ne 0$. This implies that there exists $g \in \mathcal{D}$ for which $F(g) > 0$. However,
\[ F(g) = \lim_{i \to \infty} F_i(g) \le \lim_{i \to \infty} r_{\mathcal{D}}(f_{n_{k_i}}) = 0. \]
We have reached a contradiction, which completes the proof of Lemma 6.7.3.
In the paper [79] we pushed to the extreme the flexibility in the choice of the coefficients $c_j(f)$ in (6.7.1). We made these coefficients independent of the element $f \in X$. Surprisingly, for properly chosen coefficients we obtained results for the corresponding dual greedy expansion similar to the above Theorems 6.7.1 and 6.7.2. Even more surprisingly, we obtained similar results for the corresponding X-greedy expansions. We proceed to the formulation of these results. Let $C := \{c_m\}_{m=1}^{\infty}$ be a fixed sequence of positive numbers. We restrict ourselves to positive numbers because of the symmetry of the dictionary $\mathcal{D}$.
X-Greedy Algorithm with coefficients $C$ (XGA($C$))
Set $f_0 := f$, $G_0 := 0$. Then for each $m \ge 1$ we give the following inductive definition.
(1) $\varphi_m \in \mathcal{D}$ is such that (assuming existence)
\[ \|f_{m-1} - c_m\varphi_m\|_X = \inf_{g \in \mathcal{D}}\|f_{m-1} - c_m g\|_X. \]
(2) Let
\[ f_m := f_{m-1} - c_m\varphi_m, \qquad G_m := G_{m-1} + c_m\varphi_m. \]

Dual Greedy Algorithm with weakness $\tau$ and coefficients $C$ (DGA$(\tau, C)$)
Let $\tau := \{t_m\}_{m=1}^{\infty}$, $t_m \in [0, 1]$, be a weakness sequence. Set $f_0 := f$, $G_0 := 0$. Then for each $m \ge 1$ we give the following inductive definition.
(1) $\varphi_m \in \mathcal{D}$ is any element satisfying
\[ F_{f_{m-1}}(\varphi_m) \ge t_m\|F_{f_{m-1}}\|_{\mathcal{D}}. \]
(2) Let
\[ f_m := f_{m-1} - c_m\varphi_m, \qquad G_m := G_{m-1} + c_m\varphi_m. \]

In the case $\tau = \{t\}$, $t \in (0, 1]$, we write $t$ instead of $\tau$ in the notation. The first result on convergence properties of DGA$(t, C)$ was obtained in [78]. We prove it here.

Theorem 6.7.4. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$. Assume that $C = \{c_j\}_{j=1}^{\infty}$ is such that $c_j \ge 0$, $j = 1, 2, \dots$,
\[ \sum_{j=1}^{\infty} c_j = \infty, \]
and, for any $y > 0$,
\[ \sum_{j=1}^{\infty} \rho(yc_j) < \infty. \tag{6.7.14} \]
Then for DGA$(t, C)$ we have
\[ \liminf_{m \to \infty}\|f_m\| = 0. \tag{6.7.15} \]

Proof. The proof is by contradiction. Assume (6.7.15) does not hold. Then there exist $\alpha > 0$ and $N \in \mathbb{N}$ such that, for all $m \ge N$, $\|f_m\| \ge \alpha > 0$. From the definition of the modulus of smoothness we have
\[ \|f_{n-1} - c_n\varphi_n\| + \|f_{n-1} + c_n\varphi_n\| \le 2\|f_{n-1}\|\bigl(1 + \rho(c_n/\|f_{n-1}\|)\bigr). \tag{6.7.16} \]
Using the definition of $\varphi_n$,
\[ F_{f_{n-1}}(\varphi_n) \ge t\,r_{\mathcal{D}}(f_{n-1}), \tag{6.7.17} \]
we get
\[ \|f_{n-1} + c_n\varphi_n\| \ge F_{f_{n-1}}(f_{n-1} + c_n\varphi_n) = \|f_{n-1}\| + c_n F_{f_{n-1}}(\varphi_n) \ge \|f_{n-1}\| + c_n t\,r_{\mathcal{D}}(f_{n-1}). \tag{6.7.18} \]
Combining (6.7.16) and (6.7.18), we obtain
\[ \|f_n\| = \|f_{n-1} - c_n\varphi_n\| \le \|f_{n-1}\|\bigl(1 + 2\rho(c_n/\|f_{n-1}\|)\bigr) - c_n t\,r_{\mathcal{D}}(f_{n-1}). \tag{6.7.19} \]
We note that, by Remark 6.2.3,
\[ \|f_{n-1}\|\rho(c_n/\|f_{n-1}\|) \le \alpha\rho(c_n/\alpha), \qquad n > N. \]
Therefore, by the assumption (6.7.14),
\[ \sum_{n=1}^{\infty}\|f_{n-1}\|\rho(c_n/\|f_{n-1}\|) < \infty. \tag{6.7.20} \]
This and (6.7.19) imply that
\[ \sum_{n=1}^{\infty} c_n r_{\mathcal{D}}(f_{n-1}) \le t^{-1}\Bigl(\|f\| + 2\sum_{n=1}^{\infty}\|f_{n-1}\|\rho(c_n/\|f_{n-1}\|)\Bigr) < \infty. \]
It remains to apply Lemma 6.7.3 to complete the proof.

In [79] we proved an analogue of Theorem 6.7.4 for XGA($C$) and improved upon the convergence in Theorem 6.7.4 in the case of uniformly smooth Banach spaces with power-type modulus of smoothness. Under an extra assumption on $C$, we replaced $\liminf$ by $\lim$. Here is the corresponding result from [79].

Theorem 6.7.5. Let $C \in \ell_q \setminus \ell_1$ be a monotone sequence. Then DGA$(t, C)$ and XGA($C$) converge for each dictionary and all $f \in X$ in any uniformly smooth Banach space $X$ with modulus of smoothness $\rho(u) \le \gamma u^q$, $q \in (1, 2]$.

In [79] we also addressed the question of the rate of approximation for $f \in A_1(\mathcal{D})$. We proved the following theorem.

Theorem 6.7.6. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $q \in (1, 2]$. We set $s := (1 + 1/q)/2$ and $C_s := \{k^{-s}\}_{k=1}^{\infty}$. Then DGA$(t, C_s)$ and XGA($C_s$) (for this algorithm, $t = 1$) converge for $f \in A_1(\mathcal{D})$ with the following rate: for any $r \in (0, t(1-s))$,
\[ \|f_m\| \le C(r, t, q, \gamma)m^{-r}. \]

In the case $t = 1$, Theorem 6.7.6 provides the rate of convergence $m^{-r}$ for $f \in A_1(\mathcal{D})$ with $r$ arbitrarily close to $(1 - 1/q)/2$. Theorem 6.7.2 provides a similar rate of convergence. It would be interesting to know if the rate $m^{-(1-1/q)/2}$ is the best that can be achieved in greedy expansions (for each $\mathcal{D}$, any $f \in A_1(\mathcal{D})$, and any $X$ with $\rho(u) \le \gamma u^q$, $q \in (1, 2]$). We note that there are greedy approximation methods that provide an error bound of the order $m^{1/q-1}$ for $f \in A_1(\mathcal{D})$ (see the surveys [76], [81] and the book [82]). However, these approximation methods do not provide an expansion.
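To make the idea of coefficients prescribed in advance concrete, the following sketch runs DGA$(1, C_s)$ with $c_k = k^{-s}$. It is an illustration under assumptions not made in the text: Euclidean $\mathbb{R}^n$ (so $q = 2$ and $s = 3/4$), a finite symmetric dictionary of unit vectors, and $t = 1$. In this Hilbert setting the atom maximizing $\langle f_{m-1}, g\rangle$ also minimizes $\|f_{m-1} - c_m g\|$, so the same loop realizes XGA($C_s$). The function name and the example data are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def greedy_expansion_fixed_coeffs(f, dictionary, s, steps):
    """DGA(1, C_s) with c_k = k^{-s} in Euclidean R^n for unit-norm atoms."""
    residual = list(f)
    coeffs, chosen = [], []
    norms = [math.sqrt(dot(residual, residual))]
    for m in range(1, steps + 1):
        if norms[-1] == 0.0:
            break
        c = m ** (-s)  # prescribed in advance, independent of f
        # t = 1: phi_m maximizes F_{f_{m-1}}(g), i.e. <f_{m-1}, g>.
        phi = max(dictionary, key=lambda g: dot(residual, g))
        residual = [ri - c * pi for ri, pi in zip(residual, phi)]
        coeffs.append(c)
        chosen.append(phi)
        norms.append(math.sqrt(dot(residual, residual)))
    return coeffs, chosen, norms

# Hypothetical example: the symmetrized standard basis of R^3, q = 2, s = 3/4.
atoms = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
symmetric_atoms = atoms + [tuple(-x for x in g) for g in atoms]
f = (0.5, 0.3, 0.2)
coeffs, chosen, fixed_norms = greedy_expansion_fixed_coeffs(
    f, symmetric_atoms, 0.75, 2000)
```

The returned pairs `(coeffs[j], chosen[j])` are the terms $c_j\varphi_j$ of the expansion, and `fixed_norms[m]` is the norm of the remainder $f - \sum_{j \le m} c_j\varphi_j$.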
6.7.3 A modification of the Weak Dual Greedy Algorithm

We begin this subsection with a proof of Theorem 6.7.1. Here we give a definition of DGA$(\tau, b, \mu)$, $\tau = \{t_k\}_{k=1}^{\infty}$, $t_k \in (0, 1]$, that coincides with the definition of DGA$(t, b, \mu)$ from Subsection 6.7.1 in the case $\tau = \{t\}$.

Dual Greedy Algorithm with parameters $(\tau, b, \mu)$ (DGA$(\tau, b, \mu)$)
Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$ and let $\mu(u)$ be a continuous majorant of $\rho(u)$: $\rho(u) \le \mu(u)$, $u \in [0, \infty)$. For a sequence $\tau = \{t_k\}_{k=1}^{\infty}$, $t_k \in (0, 1]$, and a parameter $b \in (0, 1]$, we define sequences $\{f_m\}_{m=0}^{\infty}$, $\{\varphi_m\}_{m=1}^{\infty}$, $\{c_m\}_{m=1}^{\infty}$ inductively. Let $f_0 := f$. If $f_{m-1} = 0$ for some $m \ge 1$, then we set $f_j = 0$ for $j \ge m$ and stop. If $f_{m-1} \ne 0$, then we conduct the following three steps.
(1) Take any $\varphi_m \in \mathcal{D}$ such that
\[ F_{f_{m-1}}(\varphi_m) \ge t_m r_{\mathcal{D}}(f_{m-1}). \tag{6.7.21} \]
(2) Choose $c_m > 0$ from the equation
\[ \|f_{m-1}\|\,\mu\bigl(c_m/\|f_{m-1}\|\bigr) = \frac{t_m b}{2}\,c_m r_{\mathcal{D}}(f_{m-1}). \tag{6.7.22} \]
(3) Define
\[ f_m := f_{m-1} - c_m\varphi_m. \tag{6.7.23} \]

Proof. We now prove Theorem 6.7.1. In this case, $\tau = \{t\}$, $t \in (0, 1]$. By (6.7.19), we have
\[ \|f_m\| = \|f_{m-1} - c_m\varphi_m\| \le \|f_{m-1}\|\bigl(1 + 2\rho(c_m/\|f_{m-1}\|)\bigr) - c_m t\,r_{\mathcal{D}}(f_{m-1}). \tag{6.7.24} \]
Using the choice of $c_m$ we find that
\[ \|f_m\| \le \|f_{m-1}\| - t(1-b)c_m r_{\mathcal{D}}(f_{m-1}). \tag{6.7.25} \]
In particular, (6.7.25) implies that $\{\|f_m\|\}$ is a monotone decreasing sequence and
\[ t(1-b)c_m r_{\mathcal{D}}(f_{m-1}) \le \|f_{m-1}\| - \|f_m\|. \]
Thus
\[ \sum_{m=1}^{\infty} c_m r_{\mathcal{D}}(f_{m-1}) < \infty. \tag{6.7.26} \]
We have the following two cases:
\[ \text{(I)}\ \ \sum_{m=1}^{\infty} c_m = \infty, \qquad \text{(II)}\ \ \sum_{m=1}^{\infty} c_m < \infty. \]
In case (I), Lemma 6.7.3 and the monotonicity of $\{\|f_m\|\}$ show that
\[ \liminf_{m \to \infty}\|f_m\| = 0 \quad\Longrightarrow\quad \lim_{m \to \infty}\|f_m\| = 0. \]
It remains to consider case (II). We prove convergence in this case by contradiction. Assume
\[ \lim_{m \to \infty}\|f_m\| = \alpha > 0. \tag{6.7.27} \]
By (II), $f_m \to f_\infty \ne 0$ as $m \to \infty$. Thanks to the uniform smoothness of $X$,
\[ \lim_{m \to \infty}\|F_{f_m} - F_{f_\infty}\| = 0. \]
We have $F_{f_\infty} \ne 0$, and therefore there is a $g \in \mathcal{D}$ such that $F_{f_\infty}(g) > 0$. However,
\[ F_{f_\infty}(g) = \lim_{m \to \infty} F_{f_m}(g) \le \lim_{m \to \infty} r_{\mathcal{D}}(f_m) = 0. \tag{6.7.28} \]
Indeed, by (6.7.22) and (6.7.27) we get
\[ r_{\mathcal{D}}(f_{m-1}) \le \alpha c_m^{-1}\mu(c_m/\alpha)\,\frac{2}{tb} \longrightarrow 0 \]
as $m \to \infty$. Theorem 6.7.1 is proved.

Remark 6.7.7. It is clear from the above proof that Theorem 6.7.1 holds for an algorithm obtained from DGA$(\tau, b, \mu)$ by replacing (6.7.22) by
\[ \|f_{m-1}\|\,\mu\bigl(c_m/\|f_{m-1}\|\bigr) = \frac{b}{2}\,c_m F_{f_{m-1}}(\varphi_m). \tag{6.7.29} \]
Also, the parameter $b$ in (6.7.22) and (6.7.29) can be replaced by varying parameters $b_m \in (a, b) \subset (0, 1)$.

We proceed to study the rate of convergence of DGA$(\tau, b, \mu)$ in uniformly smooth Banach spaces with the power-type majorant of modulus of smoothness: $\rho(u) \le \mu(u) = \gamma u^q$, $1 < q \le 2$. We now prove a statement more general than Theorem 6.7.2.

Theorem 6.7.8. Let $\tau := \{t_k\}_{k=1}^{\infty}$ be a nonincreasing sequence such that $1 \ge t_1 \ge t_2 \ge \dots > 0$, and $b \in (0, 1)$. Assume $X$ has a modulus of smoothness $\rho(u) \le \gamma u^q$, $q \in (1, 2]$. Denote $\mu(u) = \gamma u^q$. Then, for any dictionary $\mathcal{D}$ and any $f \in A_1(\mathcal{D})$, the rate of convergence of DGA$(\tau, b, \mu)$ is given by
\[ \|f_m\| \le C(b, \gamma, q)\Bigl(1 + \sum_{k=1}^{m} t_k^p\Bigr)^{-\frac{t_m(1-b)}{p(1+t_m(1-b))}}, \qquad p := \frac{q}{q-1}. \]
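Proof. As in (6.7.25), we get
\[ \|f_m\| \le \|f_{m-1}\| - t_m(1-b)c_m r_{\mathcal{D}}(f_{m-1}). \tag{6.7.30} \]
Thus we need to estimate $c_m r_{\mathcal{D}}(f_{m-1})$ from below. It is clear that
\[ \|f_{m-1}\|_{A_1(\mathcal{D})} = \Bigl\|f - \sum_{j=1}^{m-1} c_j\varphi_j\Bigr\|_{A_1(\mathcal{D})} \le \|f\|_{A_1(\mathcal{D})} + \sum_{j=1}^{m-1} c_j. \tag{6.7.31} \]
Denote $b_n := 1 + \sum_{j=1}^{n} c_j$. Then by (6.7.31) we get
\[ \|f_{m-1}\|_{A_1(\mathcal{D})} \le b_{m-1}. \]
Next, by Lemma 6.2.8,
\[ r_{\mathcal{D}}(f_{m-1}) = \sup_{g \in \mathcal{D}} F_{f_{m-1}}(g) = \sup_{\varphi \in A_1(\mathcal{D})} F_{f_{m-1}}(\varphi) \ge \|f_{m-1}\|_{A_1(\mathcal{D})}^{-1}F_{f_{m-1}}(f_{m-1}) \ge \|f_{m-1}\|/b_{m-1}. \tag{6.7.32} \]
Substituting (6.7.32) into (6.7.30), we get
\[ \|f_m\| \le \|f_{m-1}\|\bigl(1 - t_m(1-b)c_m/b_{m-1}\bigr). \tag{6.7.33} \]
From the definition of $b_m$ we find that
\[ b_m = b_{m-1} + c_m = b_{m-1}\bigl(1 + c_m/b_{m-1}\bigr). \]
Using the inequality
\[ (1 + x)^\alpha \le 1 + \alpha x, \qquad 0 \le \alpha \le 1, \quad x \ge 0, \]
we obtain
\[ b_m^{t_m(1-b)} \le b_{m-1}^{t_m(1-b)}\bigl(1 + t_m(1-b)c_m/b_{m-1}\bigr). \tag{6.7.34} \]
Multiplying (6.7.33) and (6.7.34), and using that $t_m \le t_{m-1}$, we get
\[ \|f_m\|\,b_m^{t_m(1-b)} \le \|f_{m-1}\|\,b_{m-1}^{t_{m-1}(1-b)} \le \|f\| \le 1. \tag{6.7.35} \]
The function $\mu(u)/u = \gamma u^{q-1}$ is increasing on $[0, \infty)$. Therefore the $c_m$ from (6.7.22) is greater than or equal to $c_m'$ from (see (6.7.32))
\[ \gamma\|f_{m-1}\|\bigl(c_m'/\|f_{m-1}\|\bigr)^q = \frac{t_m b}{2}\,c_m'\,\|f_{m-1}\|/b_{m-1}, \tag{6.7.36} \]
\[ c_m' = \left(\frac{t_m b}{2\gamma}\right)^{\frac{1}{q-1}}\frac{\|f_{m-1}\|^{\frac{q}{q-1}}}{b_{m-1}^{\frac{1}{q-1}}}. \tag{6.7.37} \]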
Using the notations
\[ p := \frac{q}{q-1}, \qquad A^{-1} := (1-b)\left(\frac{b}{2\gamma}\right)^{\frac{1}{q-1}} \le 1/2, \]
we obtain from (6.7.33) and (6.7.37) the inequality
\[ \|f_m\| \le \|f_{m-1}\|\left(1 - \frac{t_m^p\|f_{m-1}\|^p}{A\,b_{m-1}^p}\right). \tag{6.7.38} \]
Since $b_m \ge b_{m-1}$, this implies that
\[ \bigl(\|f_m\|/b_m\bigr)^p \le \bigl(\|f_{m-1}\|/b_{m-1}\bigr)^p\Bigl(1 - A^{-1}t_m^p\bigl(\|f_{m-1}\|/b_{m-1}\bigr)^p\Bigr). \tag{6.7.39} \]
Taking into account that $\|f\| \le 1 < A$, we obtain from (6.7.39) by an analog of Lemma 2.16 from Chapter 2 of [82] (see [72, Lemma 3.1]) that
\[ \bigl(\|f_m\|/b_m\bigr)^p \le A\Bigl(1 + \sum_{k=1}^{m} t_k^p\Bigr)^{-1}. \tag{6.7.40} \]
Combining (6.7.35) and (6.7.40), we get
\[ \|f_m\| \le C(b, \gamma, q)\Bigl(1 + \sum_{k=1}^{m} t_k^p\Bigr)^{-\frac{t_m(1-b)}{p(1+t_m(1-b))}}, \qquad p := \frac{q}{q-1}. \]
This completes the proof of Theorem 6.7.8.

In the case $\tau = \{t\}$, $t \in (0, 1]$, we deduce Theorem 6.7.2 from Theorem 6.7.8.

Remark 6.7.9. Theorem 6.7.8 holds in fact for an algorithm obtained from DGA$(\tau, b, \mu)$ by replacing (6.7.22) by (6.7.29).

It follows from the proof of Theorem 6.7.8 that it holds for a modification of DGA$(\tau, b, \mu)$ where we replace the quantity $r_{\mathcal{D}}(f_{m-1})$ in the definition by its lower estimate (see (6.7.32)) $\|f_{m-1}\|/b_{m-1}$, with $b_{m-1} := 1 + \sum_{j=1}^{m-1} c_j$. Clearly, this modification is more ready for practical implementation than DGA$(\tau, b, \mu)$. We formulate the above remark as a separate result.

Modified Dual Greedy Algorithm $(\tau, b, \mu)$ (MDGA$(\tau, b, \mu)$)
Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$ and let $\mu(u)$ be a continuous majorant of $\rho(u)$: $\rho(u) \le \mu(u)$, $u \in [0, \infty)$. For a sequence $\tau = \{t_k\}_{k=1}^{\infty}$, $t_k \in (0, 1]$, and a parameter $b \in (0, 1)$, we define, for $f \in A_1(\mathcal{D})$, sequences $\{f_m\}_{m=0}^{\infty}$, $\{\varphi_m\}_{m=1}^{\infty}$, $\{c_m\}_{m=1}^{\infty}$ inductively. Let $f_0 := f$. If $f_{m-1} = 0$ for some $m \ge 1$, then we set $f_j = 0$ for $j \ge m$ and stop. If $f_{m-1} \ne 0$, then we conduct the following three steps.
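(1) Take any $\varphi_m \in \mathcal{D}$ such that
\[ F_{f_{m-1}}(\varphi_m) \ge t_m\|f_{m-1}\|\Bigl(1 + \sum_{j=1}^{m-1} c_j\Bigr)^{-1}. \]
(2) Choose $c_m > 0$ from the equation
\[ \mu\bigl(c_m/\|f_{m-1}\|\bigr) = \frac{t_m b}{2}\,c_m\Bigl(1 + \sum_{j=1}^{m-1} c_j\Bigr)^{-1}. \]
(3) Define
\[ f_m := f_{m-1} - c_m\varphi_m. \]

Theorem 6.7.10. Let $\tau := \{t_k\}_{k=1}^{\infty}$ be a nonincreasing sequence such that $1 \ge t_1 \ge t_2 \ge \dots > 0$, and $b \in (0, 1)$. Assume $X$ has a modulus of smoothness $\rho(u) \le \gamma u^q$, $q \in (1, 2]$. Denote $\mu(u) = \gamma u^q$. Then, for any dictionary $\mathcal{D}$ and any $f \in A_1(\mathcal{D})$, the rate of convergence of MDGA$(\tau, b, \mu)$ is given by
\[ \|f_m\| \le C(b, \gamma, q)\Bigl(1 + \sum_{k=1}^{m} t_k^p\Bigr)^{-\frac{t_m(1-b)}{p(1+t_m(1-b))}}, \qquad p := \frac{q}{q-1}. \]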
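For the power-type majorant $\mu(u) = \gamma u^q$ of Theorem 6.7.10, the equation in step (2) of MDGA can be solved explicitly. The short derivation below is a worked illustration added for convenience; it uses the notation $b_{m-1} := 1 + \sum_{j=1}^{m-1} c_j$ of Remark 6.7.9 and reproduces the quantity appearing in (6.7.37).

```latex
\[
\gamma\left(\frac{c_m}{\|f_{m-1}\|}\right)^{q} = \frac{t_m b}{2}\,\frac{c_m}{b_{m-1}}
\quad\Longleftrightarrow\quad
c_m^{\,q-1} = \frac{t_m b}{2\gamma}\,\frac{\|f_{m-1}\|^{q}}{b_{m-1}}
\quad\Longleftrightarrow\quad
c_m = \left(\frac{t_m b}{2\gamma}\right)^{\frac{1}{q-1}}
\frac{\|f_{m-1}\|^{\frac{q}{q-1}}}{b_{m-1}^{\frac{1}{q-1}}}.
\]
```

Thus each coefficient depends only on $\|f_{m-1}\|$ and on the previously chosen coefficients, which is what makes the modification convenient to implement.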
Let us discuss an application of Theorem 6.7.2 in the case of a Hilbert space. It is well known and easy to check that, for a Hilbert space $H$,
\[ \rho(u) \le (1 + u^2)^{1/2} - 1 \le u^2/2. \]
Therefore, by Theorem 6.7.2 with $\mu(u) = u^2/2$, DGA$(t, b, \mu)$ provides the following error estimate:
\[ \|f_m\| \le C(t, b)\,m^{-\frac{t(1-b)}{2(1+t(1-b))}} \quad\text{for}\quad f \in A_1(\mathcal{D}). \tag{6.7.41} \]
The estimate (6.7.41) with $t = 1$ gives
\[ \|f_m\| \le C(b)\,m^{-\frac{1-b}{2(2-b)}} \quad\text{for}\quad f \in A_1(\mathcal{D}). \tag{6.7.42} \]
The exponent $\frac{1-b}{2(2-b)}$ in this estimate tends to $1/4$ when $b$ tends to $0$. Comparing (6.7.42) with the upper estimate for PGA (see Section 3 of Chapter 2 of [82]), we observe that DGA$(1, b, u^2/2)$ with small $b$ has a better upper estimate for the rate of convergence than the known estimates for PGA. We note also that inequality (2.40) on page 96 from Chapter 2 of [82] indicates that the exponent in the power rate of decay of error for PGA is less than 0.1898.

Let us figure out how DGA$(1, b, u^2/2)$ works in a Hilbert space. Consider its $m$-th step. Let $\varphi_m \in \mathcal{D}$ be from (6.7.3). Then it is clear that $\varphi_m$ maximizes $\langle f_{m-1}, g\rangle$ over the dictionary $\mathcal{D}$, and
\[ \langle f_{m-1}, \varphi_m\rangle = \|f_{m-1}\|\,r_{\mathcal{D}}(f_{m-1}). \]
PGA would use $\varphi_m$ with the coefficient $\langle f_{m-1}, \varphi_m\rangle$ at this step. DGA$(1, b, u^2/2)$ uses the same $\varphi_m$ and only a fraction of $\langle f_{m-1}, \varphi_m\rangle$:
\[ c_m = b\,\|f_{m-1}\|\,r_{\mathcal{D}}(f_{m-1}). \tag{6.7.43} \]
Thus the choice $b = 1$ in (6.7.43) corresponds to PGA. However, it is clear from the above considerations that our technique, designed for general Banach spaces, does not work in the case $b = 1$. The above discussion brings us the following surprising observation: the use of a small fraction ($c_m = b\langle f_{m-1}, \varphi_m\rangle$) of an optimal coefficient results in an improvement of the upper estimate for the rate of convergence.
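The Hilbert-space step (6.7.43) is simple enough to sketch in code. The example below is an illustration only, under assumptions not made in the text: Euclidean $\mathbb{R}^2$, a finite symmetric dictionary of unit vectors, and $t = 1$. The function name `dga_hilbert` and the data are hypothetical. It runs the same loop with $b = 0.25$ and with $b = 1$ (the PGA step) and makes no claim about which is faster on a particular example; the theory above compares upper estimates only.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dga_hilbert(f, dictionary, b, steps):
    """DGA(1, b, u^2/2) in Euclidean R^n: c_m = b * <f_{m-1}, phi_m>."""
    residual = list(f)
    norms = [math.sqrt(dot(residual, residual))]
    for _ in range(steps):
        # phi_m maximizes <f_{m-1}, g> over the dictionary.
        phi = max(dictionary, key=lambda g: dot(residual, g))
        c = b * dot(residual, phi)  # (6.7.43); b = 1 gives the PGA step
        residual = [ri - c * pi for ri, pi in zip(residual, phi)]
        norms.append(math.sqrt(dot(residual, residual)))
    return norms

# Hypothetical symmetric dictionary of unit vectors in R^2.
h = 1.0 / math.sqrt(2.0)
atoms = [(1.0, 0.0), (0.0, 1.0), (h, h)]
atoms += [tuple(-x for x in g) for g in atoms]
f = (0.4, 0.1)
dga_norms = dga_hilbert(f, atoms, b=0.25, steps=50)
pga_norms = dga_hilbert(f, atoms, b=1.0, steps=50)
```

For unit atoms, $\|f_m\|^2 = \|f_{m-1}\|^2 - b(2-b)\langle f_{m-1}, \varphi_m\rangle^2$, so the residual norms are non-increasing for every $b \in (0, 1]$.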
6.7.4 Convergence of WDGA

We now study convergence of the Weak Dual Greedy Algorithm (WDGA) defined in the Introduction of this chapter. We present results from [26]. We will prove the convergence result under an extra assumption on the underlying Banach space $X$.

Definition 6.7.11 (Property $\Gamma$). A uniformly smooth Banach space has property $\Gamma$ if there is a constant $\beta > 0$ such that, for any $x, y \in X$ satisfying $F_x(y) = 0$, we have
\[ \|x + y\| \ge \|x\| + \beta F_{x+y}(y). \]
Property $\Gamma$ in the above form was introduced in [26]. This condition (formulated somewhat differently) was considered previously in the context of greedy approximation in [53].

Theorem 6.7.12. Let $X$ be a uniformly smooth Banach space with property $\Gamma$. Then WDGA($\tau$) with $\tau = \{t\}$, $t \in (0, 1]$, converges for each dictionary and all $f \in X$.

Proof. Let $\{f_m\}_{m=0}^{\infty}$ be a sequence generated by WDGA($t$). Then
\[ f_{m-1} = f_m + a_m\varphi_m, \qquad F_{f_m}(\varphi_m) = 0. \tag{6.7.44} \]
We use property $\Gamma$ with $x := f_m$ and $y := a_m\varphi_m$ and obtain
\[ \|f_{m-1}\| \ge \|f_m\| + \beta a_m F_{f_{m-1}}(\varphi_m). \tag{6.7.45} \]
This inequality and the monotonicity of the sequence $\{\|f_m\|\}$ imply that
\[ \sum_{m=1}^{\infty} a_m F_{f_{m-1}}(\varphi_m) < \infty \quad\Longrightarrow\quad \sum_{m=1}^{\infty} a_m r_{\mathcal{D}}(f_{m-1}) < \infty. \tag{6.7.46} \]
As in the proof of Theorem 6.7.1, we consider separately two cases:
\[ \text{(I)}\ \ \sum_{m=1}^{\infty} a_m = \infty, \qquad \text{(II)}\ \ \sum_{m=1}^{\infty} a_m < \infty. \]
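In case (I), by (6.7.46) and Lemma 6.7.3 we obtain
\[ \liminf_{m \to \infty}\|f_m\| = 0 \quad\Longrightarrow\quad \lim_{m \to \infty}\|f_m\| = 0. \]
In case (II) we argue by contradiction. Assume that
\[ \lim_{m \to \infty}\|f_m\| = \alpha > 0. \]
Then, by (II), we have $f_m \to f_\infty \ne 0$ as $m \to \infty$. By the uniform smoothness of $X$,
\[ \lim_{m \to \infty}\|F_{f_m} - F_{f_\infty}\| = 0, \qquad \lim_{m \to \infty}\|F_{f_m} - F_{f_{m-1}}\| = 0. \tag{6.7.47} \]
In particular, (6.7.44) and (6.7.47) imply that
\[ \lim_{m \to \infty} F_{f_{m-1}}(\varphi_m) = 0 \quad\Longrightarrow\quad \lim_{m \to \infty} r_{\mathcal{D}}(f_m) = 0. \tag{6.7.48} \]
We have $F_{f_\infty} \ne 0$, and therefore there is a $g \in \mathcal{D}$ such that $F_{f_\infty}(g) > 0$. However, by (6.7.47) and (6.7.48),
\[ F_{f_\infty}(g) = \lim_{m \to \infty} F_{f_m}(g) \le \lim_{m \to \infty} r_{\mathcal{D}}(f_m) = 0. \]
The obtained contradiction completes the proof.

We now give a direct proof in case (I) that does not use Lemma 6.7.3. By property $\Gamma$ we get
\[ \|f_m\| \le \|f_{m-1}\| - \beta a_m F_{f_{m-1}}(\varphi_m) \le \|f_{m-1}\| - t\beta a_m\|F_{f_{m-1}}\|_{\mathcal{D}}. \tag{6.7.49} \]
Let $\varepsilon > 0$, $A(\varepsilon) > 0$, and $f^\varepsilon$ be such that
\[ \|f - f^\varepsilon\| \le \varepsilon, \qquad f^\varepsilon/A(\varepsilon) \in A_1(\mathcal{D}). \]
Then
\[ \|f_{m-1}\| = F_{f_{m-1}}(f_{m-1}) = F_{f_{m-1}}(f - f^\varepsilon + f^\varepsilon - G_{m-1}) \le \varepsilon + \|F_{f_{m-1}}\|_{\mathcal{D}}(A(\varepsilon) + b_m), \]
where $b_m := \sum_{k=1}^{m-1} a_k$. Therefore,
\[ \|F_{f_{m-1}}\|_{\mathcal{D}} \ge (\|f_{m-1}\| - \varepsilon)/(A(\varepsilon) + b_m). \tag{6.7.50} \]
We complete the proof by obtaining a contradiction. If $\lim_{m \to \infty}\|f_m\| = \alpha > 0$ and $\varepsilon := \alpha/2$, then (6.7.49) and (6.7.50) imply that
\[ \|f_m\| \le \|f_{m-1}\|\left(1 - \frac{t\beta a_m}{2(A(\varepsilon) + b_m)}\right). \]
Assumption (I) implies that
\[ \sum_{m=1}^{\infty}\frac{a_m}{A(\varepsilon) + b_m} = \infty \quad\Longrightarrow\quad \|f_m\| \longrightarrow 0. \]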
We now turn to the $L_p$ spaces. The following results (Proposition 6.7.13 and Theorem 6.7.14) are from [26].

Proposition 6.7.13. The $L_p$ space with $1 < p < \infty$ has property $\Gamma$.

Proof. Let $p \in (1, \infty)$. Consider the function
\[ \phi_p(u) := \frac{u|1 + u|^{p-2}(1 + u) - u}{|1 + u|^p - pu - 1}, \quad u \ne 0, \qquad \phi_p(0) := 2/p. \]
We note that $|1 + u|^p - pu - 1 > 0$ for $u \ne 0$. Indeed, it is sufficient to check the inequality for $u \ge -1/p$. In this case, $|1 + u|^p = (1 + u)^p > 1 + pu$, $u \ne 0$. It is easy to check that
\[ \lim_{u \to 0}\phi_p(u) = 2/p. \]
Thus, $\phi_p(u)$ is continuous on $(-\infty, \infty)$. This and
\[ \lim_{u \to -\infty}\phi_p(u) = \lim_{u \to \infty}\phi_p(u) = 1 \]
imply that $\phi_p(u) \le C_p$. We now proceed to property $\Gamma$. For any two real functions $x(s)$, $y(s)$, the inequality $\phi_p(u) \le C_p$ implies
\[ |x(s) + y(s)|^{p-2}(x(s) + y(s))y(s) - |x(s)|^{p-2}x(s)y(s) \le C_p\bigl(|x(s) + y(s)|^p - p|x(s)|^{p-2}x(s)y(s) - |x(s)|^p\bigr). \tag{6.7.51} \]
Suppose that $F_x(y) = 0$. This means that
\[ \int |x(s)|^{p-2}x(s)y(s)\,ds = 0. \tag{6.7.52} \]
Integrating inequality (6.7.51) and taking into account (6.7.52), we get
\[ \|x + y\|^{p-1}F_{x+y}(y) \le C_p\bigl(\|x + y\|^p - \|x\|^p\bigr). \tag{6.7.53} \]
Next,
\[ \|x\| = F_x(x) = F_x(x + y) \le \|x + y\|. \]
Therefore, (6.7.53) implies
\[ F_{x+y}(y) \le pC_p\bigl(\|x + y\| - \|x\|\bigr). \tag{6.7.54} \]
It remains to note that (6.7.54) is equivalent to property $\Gamma$ with $\beta = (pC_p)^{-1}$.

Combining Theorem 6.7.12 with Proposition 6.7.13 we obtain the following result.

Theorem 6.7.14. Let $p \in (1, \infty)$. Then WDGA($\tau$) with $\tau = \{t\}$, $t \in (0, 1]$, converges for each dictionary and all $f \in L_p$.
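The behaviour of the auxiliary function $\phi_p$ from the proof of Proposition 6.7.13 can be inspected numerically. The following sketch is an illustration only: the function name `phi_p`, the grid and the chosen values of $p$ are assumptions of the example, and a maximum over a finite grid is of course only a lower bound for the constant $C_p$.

```python
import math

def phi_p(u, p):
    """phi_p(u) from the proof of Proposition 6.7.13, for 1 < p < infinity."""
    if u == 0.0:
        return 2.0 / p
    # u |1+u|^{p-2} (1+u) written as u * sign(1+u) * |1+u|^{p-1}
    # to avoid a negative power of zero at u = -1 when p < 2.
    numerator = u * math.copysign(abs(1.0 + u) ** (p - 1.0), 1.0 + u) - u
    denominator = abs(1.0 + u) ** p - p * u - 1.0
    return numerator / denominator

# Sample phi_p on a grid in [-20, 20] for a few exponents.
grid = [k / 10.0 for k in range(-200, 201)]
sup_on_grid = {p: max(phi_p(u, p) for u in grid) for p in (1.5, 2.0, 3.0)}
```

For $p = 2$ the numerator and the denominator both equal $u^2$, so $\phi_2 \equiv 1$; this corresponds to the Hilbert-space case.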
Chapter 7 Appendix

This chapter contains well-known results in analysis. For the sake of completeness, some of these are proved.

7.1 $L_p$-spaces and some inequalities

7.1.1 Modulus of continuity

Let $f(x)$, $x = (x_1, \dots, x_d)$, be a measurable, almost everywhere finite function which is $2\pi$-periodic in each variable. In the case $d = 1$ we shall write $f \in L_p$ for $1 \le p < \infty$ if
\[ \|f\|_p := \left((2\pi)^{-1}\int_{-\pi}^{\pi}|f(x)|^p\,dx\right)^{1/p} < \infty, \]
where the integral is the Lebesgue integral. In the case $d > 1$, $\mathbf{p} = (p_1, \dots, p_d)$, $1 \le p_j < \infty$, $j = 1, \dots, d$, $f \in L_{\mathbf{p}}$ means that
\[ \|f\|_{\mathbf{p}} := \left((2\pi)^{-1}\int_{-\pi}^{\pi}\cdots\left((2\pi)^{-1}\int_{-\pi}^{\pi}\left((2\pi)^{-1}\int_{-\pi}^{\pi}|f(x)|^{p_1}\,dx_1\right)^{p_2/p_1}\cdots\,dx_{d-1}\right)^{p_d/p_{d-1}}dx_d\right)^{1/p_d} < \infty. \]
In the case $p = \infty$ it will be convenient to assume that $L_\infty$ is the space of continuous functions and
\[ \|f\|_\infty = \sup_x |f(x)|. \]
For $f \in L_{\mathbf{p}}$ we define the modulus of continuity in $L_{\mathbf{p}}$ as
\[ \omega(f, \delta)_{\mathbf{p}} := \sup_{|y| \le \delta}\|f(\cdot + y) - f(\cdot)\|_{\mathbf{p}}. \]
© Springer Basel 2015 V. Temlyakov, Sparse Approximation with Bases, Advanced Courses in Mathematics - CRM Barcelona, DOI 10.1007/978-3-0348-0890-3_7
Theorem 7.1.1. Let $1 \le \mathbf{p} < \infty$ (a vector inequality means the corresponding inequality in each coordinate) or $p = \infty$. Then $\omega(f, \delta)_{\mathbf{p}} \to 0$ for $\delta \to 0$.

Proof. In the case $p = \infty$ the conclusion of the theorem follows from the uniform continuity of a function which is continuous on a compact set. Let $1 \le \mathbf{p} < \infty$. We first prove an auxiliary statement.

Lemma 7.1.2. Let $N$ be a natural number and
\[ f^N(x) = \begin{cases} f(x), & \text{if } |f(x)| > N, \\ 0, & \text{otherwise.} \end{cases} \]
Then for $f \in L_{\mathbf{p}}$, $1 \le \mathbf{p} < \infty$,
\[ \lim_{N \to \infty}\|f^N\|_{\mathbf{p}} = 0. \]

Proof. For $d = 1$ the conclusion of Lemma 7.1.2 follows from the definition of the Lebesgue integral. In the general case we proceed by induction. Let the lemma be valid for $d - 1$ and $x^d = (x_1, \dots, x_{d-1})$, $\mathbf{p}^d = (p_1, \dots, p_{d-1})$; from the inclusion $f \in L_{\mathbf{p}}$ it follows that $\|f(\cdot, x_d)\|_{\mathbf{p}^d} =: \varphi(x_d) \in L_{p_d}$. Consequently, for almost all $x_d$, $f(x^d, x_d)$ belongs to $L_{\mathbf{p}^d}$ and, by the induction hypothesis,
\[ \lim_{N \to \infty}\|f^N(\cdot, x_d)\|_{\mathbf{p}^d} = 0. \]
Further, $\|f^N(\cdot, x_d)\|_{\mathbf{p}^d} \le \varphi(x_d) \in L_{p_d}$. Thus, applying the Lebesgue theorem about passing to the limit under the integral sign, we obtain the conclusion of the lemma for dimension $d$. Hence the lemma is proved.

Corollary 7.1.3. Let $f \in L_{\mathbf{p}}$, $1 \le \mathbf{p} < \infty$. Then
\[ \lim_{t \to 0}\sup_{|E| \le t}\|f(x)\chi_E(x)\|_{\mathbf{p}} = 0, \]
where $\chi_E$ is the characteristic function of a measurable set $E$ and $|E|$ denotes the measure of $E$.

We shall now conclude the proof of Theorem 7.1.1. We use Lusin's theorem asserting that, for any $\varepsilon > 0$ and any measurable, almost everywhere finite $f(x)$, there is a continuous $g(x)$ such that $\operatorname{measure}\{x : f(x) \ne g(x)\} < \varepsilon$. The conclusion of the theorem now follows from Corollary 7.1.3.
7.1.2 Some inequalities

For $1 \le p \le \infty$ we shall denote by $p'$ the dual exponent, that is, the number (or $\infty$) such that $1/p + 1/p' = 1$. For a vector $1 \le \mathbf{p} \le \infty$ we denote $\mathbf{p}' = (p_1', \dots, p_d')$ and $1/\mathbf{p} = (1/p_1, \dots, 1/p_d)$. For the sake of brevity we shall write $\int f\,d\mu$ instead of $(2\pi)^{-d}\int_{\mathbb{T}^d} f(x)\,dx$, where $\mathbb{T}^d = [-\pi, \pi]^d$ and $\mu$ denotes the normalized Lebesgue measure on $\mathbb{T}^d$. In the case $\mathbf{p} = p\mathbf{1}$, that is, when all coordinates of $\mathbf{p}$ are equal to $p$, we shall write the scalar $p$ instead of the vector $\mathbf{p}$.

The Hölder inequality. Let $1 \le p \le \infty$, $f_1 \in L_p$, $f_2 \in L_{p'}$. Then $f_1 f_2 \in L_1$ and
\[ \int |f_1 f_2|\,d\mu \le \|f_1\|_p\|f_2\|_{p'}. \tag{7.1.1} \]

Proof. The inequality (7.1.1) for $p = 1$ and $p = \infty$ is evident. Let $1 < p < \infty$. We consider the function $y = x^{p-1}$ defined on $[0, a]$ and the inverse function $x = y^{1/(p-1)}$ defined on $[0, b]$. Then, calculating the areas of the figures $[0, a] \times [0, b]$, $G_1 = \{(x, y) : 0 \le x \le a,\ 0 \le y \le x^{p-1}\}$, $G_2 = \{(x, y) : 0 \le y \le b,\ 0 \le x \le y^{1/(p-1)}\}$, we get
\[ ab \le |G_1| + |G_2| = a^p/p + b^{p'}/p'. \tag{7.1.2} \]
Substituting $a = |f_1|/\|f_1\|_p$ and $b = |f_2|/\|f_2\|_{p'}$ in (7.1.2) and integrating, we get (7.1.1).

Vector Hölder inequality. As a consequence of the inequality (7.1.1) we obtain the Hölder inequality for a vector $1 \le \mathbf{p} \le \infty$:
\[ \int |f_1 f_2|\,d\mu \le \|f_1\|_{\mathbf{p}}\|f_2\|_{\mathbf{p}'}. \]

The Hölder inequality for several functions. Let $1 \le p_i \le \infty$, $i = 1, \dots, m$, $1/p_1 + \dots + 1/p_m = 1$, $f_i \in L_{p_i}$, $i = 1, \dots, m$. Then $f_1 \cdots f_m \in L_1$ and
\[ \int |f_1 \cdots f_m|\,d\mu \le \|f_1\|_{p_1} \cdots \|f_m\|_{p_m}. \tag{7.1.3} \]
The proof will be carried out by induction. For $m = 2$ one has the Hölder inequality. Suppose that (7.1.3) has been proved for $m - 1$. We assume without loss of generality that $p_m > 1$. Applying the Hölder inequality for $g_1 = f_1 \cdots f_{m-1}$ and $g_2 = f_m$ with exponents $p_m'$ and $p_m$ we get
\[ \int |f_1 \cdots f_m|\,d\mu \le \|f_1 \cdots f_{m-1}\|_{p_m'}\|f_m\|_{p_m}. \]
Chapter 7. Appendix
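We denote $q_i = p_i/p_m'$, $i = 1, \dots, m - 1$. Then $1/q_1 + \dots + 1/q_{m-1} = 1$. Using the induction hypothesis we get
$$\|f_1 \cdots f_{m-1}\|_{p_m'} \le \Big(\prod_{i=1}^{m-1} \big\| |f_i|^{p_m'} \big\|_{q_i}\Big)^{1/p_m'} = \prod_{i=1}^{m-1} \|f_i\|_{p_i},$$
which implies (7.1.3).

Monotonicity of $L_p$ norms. If $1 \le q \le p \le \infty$, then
$$\|f\|_q \le \|f\|_p \tag{7.1.4}$$
and, for vectors $\mathbf{1} \le \mathbf{q} \le \mathbf{p} \le \boldsymbol{\infty}$,
$$\|f\|_{\mathbf{q}} \le \|f\|_{\mathbf{p}}. \tag{7.1.5}$$
Proof. Clearly, it suffices to prove (7.1.4). We set $a = p/q$ and apply the Hölder inequality with exponents $a$ and $a'$ to the functions $f_1 = |f|^q$ and $f_2 = 1$. Then
$$\|f\|_q \le \big\| |f|^q \big\|_a^{1/q} = \|f\|_p.$$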
The different norms inequality. Let $1 \le a < p < b \le \infty$, $\theta = (1/p - 1/b)(1/a - 1/b)^{-1}$. Then
$$\|f\|_p \le \|f\|_a^{\theta} \|f\|_b^{1-\theta}. \tag{7.1.6}$$
Proof. In the case $b = \infty$,
$$\|f\|_p = \Big(\int |f|^{p-a} |f|^a\,d\mu\Big)^{1/p} \le \|f\|_a^{a/p} \|f\|_\infty^{1 - a/p}.$$
Let $b < \infty$. We set $1/q = p\theta/a$. Then $1/q' = p(1 - \theta)/b$ and $p = a/q + b/q'$. Applying the Hölder inequality with exponents $q$ and $q'$ to the functions $f_1 = |f|^{a/q}$ and $f_2 = |f|^{b/q'}$ we get
$$\|f\|_p^p \le \Big(\int |f|^a\,d\mu\Big)^{1/q} \Big(\int |f|^b\,d\mu\Big)^{1/q'} = \|f\|_a^{p\theta} \|f\|_b^{p(1-\theta)},$$
which yields (7.1.6).

The Hölder inequality for sums. From inequality (7.1.1) we easily obtain the Hölder inequality for sums:
$$\sum_{i=1}^N |a_i b_i| \le \Big(\sum_{i=1}^N |a_i|^p\Big)^{1/p} \Big(\sum_{i=1}^N |b_i|^{p'}\Big)^{1/p'}, \qquad 1 \le p \le \infty.$$
We remark that in this inequality one can take $N = \infty$.
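As a numerical illustration of the Hölder inequality for sums (not part of the original text), the following Python sketch evaluates both sides for random finite sequences. The helper names `lp_norm`, `dual_exponent` and `holder_gap` are hypothetical and introduced only for this example.

```python
import random

def lp_norm(v, p):
    """l_p norm of a finite sequence; p = float('inf') gives the sup norm."""
    if p == float("inf"):
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def dual_exponent(p):
    """Dual exponent p' defined by 1/p + 1/p' = 1."""
    if p == 1:
        return float("inf")
    if p == float("inf"):
        return 1.0
    return p / (p - 1.0)

def holder_gap(a, b, p):
    """Right-hand side minus left-hand side of the Hoelder inequality for sums."""
    lhs = sum(abs(x * y) for x, y in zip(a, b))
    rhs = lp_norm(a, p) * lp_norm(b, dual_exponent(p))
    return rhs - lhs

rng = random.Random(0)
a = [rng.uniform(-1, 1) for _ in range(50)]
b = [rng.uniform(-1, 1) for _ in range(50)]
# The gap should be nonnegative for every admissible exponent.
gaps = [holder_gap(a, b, p) for p in (1, 1.5, 2, 3, float("inf"))]
```

Taking $b_i = |a_i|^{p-1}$ makes the gap vanish up to rounding, which reflects the equality case of (7.1.2).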
The Minkowski inequality. Let $1 \le p \le \infty$, $f_i \in L_p$, $i = 1, \dots, m$. Then
$$\Big\| \sum_{i=1}^m f_i \Big\|_p \le \sum_{i=1}^m \|f_i\|_p. \tag{7.1.7}$$
Proof. Clearly, it suffices to prove (7.1.7) for $m = 2$. For $p = 1$ and $p = \infty$ (7.1.7) is evident. Let $1 < p < \infty$. Using the Hölder inequality for sums it is easy to verify that $S = f_1 + f_2 \in L_p$. Furthermore,
$$\int |S|^p\,d\mu \le \int |S|^{p-1} |f_1|\,d\mu + \int |S|^{p-1} |f_2|\,d\mu.$$
Applying the Hölder inequality with exponents $p'$ and $p$ we get
$$\|S\|_p^p \le \|S\|_p^{p/p'} \big(\|f_1\|_p + \|f_2\|_p\big),$$
which implies (7.1.7).

In the case of a vector $\mathbf{1} \le \mathbf{p} \le \boldsymbol{\infty}$, the inequality
$$\Big\| \sum_{i=1}^m f_i \Big\|_{\mathbf{p}} \le \sum_{i=1}^m \|f_i\|_{\mathbf{p}} \tag{7.1.8}$$
follows from (7.1.7).
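The generalized Minkowski inequality. It is possible to deduce the generalized Minkowski inequality from the Minkowski inequality. If $\mathbf{1} \le \mathbf{p} \le \boldsymbol{\infty}$, then
$$\Big\| \int \varphi(\cdot, y)\,d\mu(y) \Big\|_{\mathbf{p}} \le \int \|\varphi(\cdot, y)\|_{\mathbf{p}}\,d\mu(y). \tag{7.1.9}$$

The vector norms inequality. If $1 \le q \le p \le \infty$, then
$$\Big( \int \Big( \int |f(x, y)|^q\,d\mu(y) \Big)^{p/q} d\mu(x) \Big)^{1/p} \le \Big( \int \Big( \int |f(x, y)|^p\,d\mu(x) \Big)^{q/p} d\mu(y) \Big)^{1/q}. \tag{7.1.10}$$
Proof. The inequality (7.1.10) follows from (7.1.9) by choosing $\varphi = |f|^q$ and $\mathbf{p} = (p/q, \dots, p/q)$.

The Young inequality.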
Let $p$, $q$ and $a$ be real numbers satisfying the conditions
$$1 \le p \le q \le \infty, \qquad 1 - 1/p + 1/q = 1/a. \tag{7.1.11}$$
Let $f \in L_p$ and $K \in L_a$ be $2\pi$-periodic functions of a single variable. Consider the convolution
$$J(x) = (2\pi)^{-1} \int_{-\pi}^{\pi} K(x - y) f(y)\,dy = K * f.$$
Then
$$\|J\|_q \le \|K\|_a \|f\|_p. \tag{7.1.12}$$
Proof. In the case $q = \infty$ the inequality (7.1.12) follows from the Hölder inequality. Let $q < \infty$. We first consider the case $1 < p < q$, $a < q$. Let us represent the function $|Kf|$ in the form
$$|Kf| = \big( |K|^a |f|^p \big)^{1/q} |K|^{1 - a/q} |f|^{1 - p/q}. \tag{7.1.13}$$
We apply the Hölder inequality for three functions with exponents $p_1 = q$, $p_2 = (1/a - 1/q)^{-1}$, $p_3 = (1/p - 1/q)^{-1}$. This yields
$$|J(x)| \le \Big( \int |K(x - y)|^a |f(y)|^p\,d\mu(y) \Big)^{1/q} \|K\|_a^{1 - a/q} \|f\|_p^{1 - p/q}. \tag{7.1.14}$$
Raising both sides of (7.1.14) to the power $q$ and integrating we obtain the inequality (7.1.12). It remains to consider the case where either $a = q$, or $p = q$. If $p = q$, then $a = 1$. We have
$$J(x) = (2\pi)^{-1} \int_{-\pi}^{\pi} K(u) f(x - u)\,du.$$
Applying the generalized Minkowski inequality we get
$$\|J\|_p \le \|f\|_p \int |K(u)|\,d\mu(u) = \|f\|_p \|K\|_1.$$
Let, at last, $a = q$ and $p = 1$. Clearly, in this case the required inequality is obtained in the same way as above.

The Young inequality for vectors $\mathbf{p}$, $\mathbf{q}$, $\mathbf{a}$. Let $\mathbf{1} \le \mathbf{p} \le \mathbf{q} \le \boldsymbol{\infty}$, $\mathbf{1} - 1/\mathbf{p} + 1/\mathbf{q} = 1/\mathbf{a}$, and
$$J(x) = \int K(x - y) f(y)\,d\mu(y) = K * f.$$
Then
$$\|J\|_{\mathbf{q}} \le \|K\|_{\mathbf{a}} \|f\|_{\mathbf{p}}. \tag{7.1.15}$$
Proof. The inequality (7.1.15) can be obtained by sequential application of the inequality (7.1.12) with the help of the following analog of the generalized Minkowski inequality (x ∈ Td , y ∈ Td ): ϕ(·, y) dμ(y) ≤ · · · ϕ(·, y) dμ(y1 ) dμ(y2 ) . . . dμ(yd ) . q
q1
q2
qd
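The scalar Young inequality (7.1.12) can be explored numerically in a discrete model. The sketch below is an assumption-laden analog, not the setting of the text: the circle is replaced by the cyclic group $\mathbb{Z}_N$ with the normalized counting measure, for which the same inequality holds. The names `norm`, `convolve` and `young_exponent` are hypothetical.

```python
import random

def norm(v, p):
    """L_p norm with respect to the normalized counting measure on Z_N (finite p)."""
    return (sum(abs(x) ** p for x in v) / len(v)) ** (1.0 / p)

def convolve(K, f):
    """Cyclic convolution J(x) = N^{-1} * sum_y K(x - y) f(y)."""
    n = len(f)
    return [sum(K[(x - y) % n] * f[y] for y in range(n)) / n for x in range(n)]

def young_exponent(p, q):
    """Exponent a determined by 1 - 1/p + 1/q = 1/a, for finite p <= q."""
    return 1.0 / (1.0 - 1.0 / p + 1.0 / q)

rng = random.Random(3)
N = 16
K = [rng.uniform(-1, 1) for _ in range(N)]
f = [rng.uniform(-1, 1) for _ in range(N)]
J = convolve(K, f)

# Each entry is (||J||_q, ||K||_a * ||f||_p); the first should not exceed the second.
pairs = [(1.0, 1.0), (1.0, 2.0), (2.0, 2.0), (1.5, 3.0), (2.0, 4.0)]
checks = [(norm(J, q), norm(K, young_exponent(p, q)) * norm(f, p)) for p, q in pairs]
```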
The Abel inequality. For nonnegative and nonincreasing $v_1, \dots, v_n$ we have
$$\Big| \sum_{i=1}^n u_i v_i \Big| \le v_1 \max_k \Big| \sum_{i=1}^k u_i \Big|. \tag{7.1.16}$$
This inequality easily follows from the formula
$$\sum_{i=1}^n u_i v_i = \sum_{\nu=1}^{n-1} (v_\nu - v_{\nu+1}) \sum_{i=1}^{\nu} u_i + v_n \sum_{i=1}^n u_i, \tag{7.1.17}$$
which is called the Abel transformation.

It is well known that the space of continuous functions that are $2\pi$-periodic in each variable, equipped with the uniform norm $\|\cdot\|_\infty$, is a Banach space. It will be convenient to denote it by $L_\infty$. If $1 \le p < \infty$, then $\|\cdot\|_p$ is a norm if we do not distinguish equivalent functions, that is, measurable functions which differ only on a set of measure zero. This follows from the Minkowski inequality. The space $L_p$, $1 \le p < \infty$, is a Banach space. Indeed, let $\{f_n\}_{n=1}^\infty$ be a Cauchy sequence in $L_p$. We find a subsequence $\{n_k\}_{k=1}^\infty$ such that $\|f_{n_{k+1}} - f_{n_k}\|_p \le 2^{-k}$. Then by the Levi theorem we find that the series
$$f_{n_1} + \sum_{k=1}^\infty (f_{n_{k+1}} - f_{n_k})$$
converges to $f$ almost everywhere, that is, $\{f_{n_k}\}$ converges to $f$ almost everywhere. Furthermore, applying the Fatou theorem to the sequences $\{|f_{n_k}|^p\}_{k=1}^\infty$ and $\{|f_{n_k} - f_{n_m}|^p\}_{k=m+1}^\infty$ we find that $f \in L_p$ and $\|f - f_{n_m}\|_p \to 0$ for $m \to \infty$. From here it easily follows that the Cauchy sequence $\{f_n\}_{n=1}^\infty$ converges to $f$ in $L_p$. As was mentioned above, functions in $L_p$ are defined up to equivalence. We shall assume that we deal with a continuous function $f$ if it is equivalent to a continuous function.

Along with the spaces $L_p$ we shall use the spaces $\ell_p$, $1 \le p \le \infty$, of sequences $z = \{z_k\}_{k=1}^\infty$ equipped with the norm
$$\|z\|_p := \|z\|_{\ell_p} := \Big( \sum_{k=1}^\infty |z_k|^p \Big)^{1/p}, \quad 1 \le p < \infty, \qquad \|z\|_\infty := \|z\|_{\ell_\infty} := \sup_k |z_k|.$$
The spaces $\ell_p$ are Banach spaces.
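The Abel transformation (7.1.17) and the Abel inequality (7.1.16) from this section can be checked on random data. The following Python sketch is only an illustration; the function name `abel_rhs` and the test data are assumptions made for this example.

```python
import random

def abel_rhs(u, v):
    """Right-hand side of (7.1.17): sum_{nu<n} (v_nu - v_{nu+1}) U_nu + v_n U_n,
    where U_k = u_1 + ... + u_k."""
    n = len(u)
    partial = []            # partial[k - 1] holds U_k
    running = 0.0
    for x in u:
        running += x
        partial.append(running)
    total = sum((v[k] - v[k + 1]) * partial[k] for k in range(n - 1))
    return total + v[n - 1] * partial[n - 1]

rng = random.Random(1)
u = [rng.uniform(-1, 1) for _ in range(20)]
# v must be nonnegative and nonincreasing for the Abel inequality.
v = sorted((rng.uniform(0, 1) for _ in range(20)), reverse=True)

lhs = sum(x * y for x, y in zip(u, v))
rhs = abel_rhs(u, v)
max_partial = max(abs(sum(u[:k])) for k in range(1, len(u) + 1))
```

Here `lhs` and `rhs` agree up to rounding, and `abs(lhs)` does not exceed `v[0] * max_partial`.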
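7.2 Duality in $L_p$ spaces

Let $f \in L_p$, $g \in L_{p'}$. We denote
$$\langle f, g \rangle := (2\pi)^{-d} \int_{\mathbb{T}^d} f(x) \overline{g(x)}\,dx = \int f \bar{g}\,d\mu,$$
where $\bar{z}$ is the complex conjugate of the number $z$.

Theorem 7.2.1. If $1 \le p \le \infty$ and $f \in L_p$ then
$$\|f\|_p = \sup_{g \in L_{p'},\ \|g\|_{p'} \le 1} |\langle f, g \rangle|.$$
Proof. The estimate $|\langle f, g \rangle| \le \|f\|_p$ for $g$ such that $\|g\|_{p'} \le 1$ follows from the Hölder inequality. For $1 < p < \infty$ we set, considering $\|f\|_p > 0$,
$$g = |f|^{p-1} (\operatorname{sign} f) / \|f\|_p^{p-1},$$
where
$$\operatorname{sign} z = \begin{cases} z/|z|, & z \ne 0, \\ 0, & z = 0. \end{cases}$$
Then
$$\|g\|_{p'} = 1, \qquad \langle f, g \rangle = \|f\|_p,$$
which implies the conclusion of the theorem.

Let $p = \infty$. As we have agreed, $L_\infty$ is the space of continuous functions. Consequently, there is a point $x^0 \in \mathbb{T}^d$ such that $\|f\|_\infty = |f(x^0)|$. We assume $\varphi_\varepsilon(x)$, $0 < \varepsilon \le 1$, to be $2\pi$-periodic in each variable and such that
$$\varphi_\varepsilon(x) = \begin{cases} (2\pi/\varepsilon)^d & \text{for } |x_j - x_j^0| \le \varepsilon/2,\ j = 1, \dots, d, \\ 0 & \text{otherwise}. \end{cases}$$
Then $\|\varphi_\varepsilon\|_1 = 1$ and for $g_\varepsilon = \varphi_\varepsilon \operatorname{sign} f(x^0)$, $\|g_\varepsilon\|_1 \le 1$, we have
$$|f(x^0)| = \lim_{\varepsilon \to 0} \langle f, g_\varepsilon \rangle,$$
which proves the conclusion of the theorem in this case.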
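Let $p = 1$. We set $g = \operatorname{sign} f$. By the Lusin theorem for an arbitrary $\varepsilon > 0$ we find a continuous $g_\varepsilon$ such that $|g_\varepsilon| \le 1$ and $|E_\varepsilon| \le \varepsilon$, where $E_\varepsilon = \{x : g_\varepsilon(x) \ne g(x)\}$. Then
$$\|f\|_1 = \langle f, g \rangle = \int_{\mathbb{T}^d \setminus E_\varepsilon} f \bar{g}_\varepsilon\,d\mu + \int_{E_\varepsilon} f \bar{g}\,d\mu,$$
whence
$$\Big| \int f \bar{g}_\varepsilon\,d\mu \Big| \ge \|f\|_1 - \Big| \int_{E_\varepsilon} f \bar{g}\,d\mu - \int_{E_\varepsilon} f \bar{g}_\varepsilon\,d\mu \Big|. \tag{7.2.1}$$
The conclusion of the theorem for $p = 1$ follows from (7.2.1) in view of Corollary 7.1.3.

Remark 7.2.2. A statement analogous to Theorem 7.2.1 is valid for the spaces $\ell_p$:
$$\|z\|_{\ell_p} = \sup_{\|w\|_{\ell_{p'}} \le 1} |\langle z, w \rangle|, \qquad 1 \le p \le \infty.$$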
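For finite sequences the extremal element used in the proof of Theorem 7.2.1 can be written down explicitly, which gives a quick check of Remark 7.2.2 in the case $1 < p < \infty$. The Python sketch below is illustrative only; the names `lp_norm`, `norming_element` and `pairing`, and the sample vector `z`, are assumptions of this example.

```python
def lp_norm(z, p):
    """l_p norm of a finite (possibly complex) sequence, 1 <= p < infinity."""
    return sum(abs(x) ** p for x in z) ** (1.0 / p)

def norming_element(z, p):
    """w = |z|^{p-1} sign(z) / ||z||_p^{p-1}, assuming 1 < p < infinity and z != 0."""
    nz = lp_norm(z, p)
    return [abs(x) ** (p - 1) * (x / abs(x) if x != 0 else 0.0) / nz ** (p - 1)
            for x in z]

def pairing(z, w):
    """<z, w> = sum_k z_k * conj(w_k)."""
    return sum(x * complex(y).conjugate() for x, y in zip(z, w))

p = 3.0
p_dual = p / (p - 1.0)                 # dual exponent p'
z = [3 + 4j, -1.0, 0.0, 2j, 0.5]
w = norming_element(z, p)
value = pairing(z, w)                  # should equal ||z||_p up to rounding
```

The vector `w` has unit norm in $\ell_{p'}$ and attains the supremum in the duality formula.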
Let $F$ be a linear normed space (real or complex) and $F^*$ be the conjugate (dual) space to $F$, that is, elements of $F^*$ are linear functionals $\varphi$ defined on $F$, with the norm
$$\|\varphi\| = \sup_{f \in F;\ \|f\| \le 1} |\varphi(f)|.$$
Let $\Phi = \{\varphi_k\}_{k=1}^n$ be a set of functionals from $F^*$. Denote
$$F_\Phi = \{f \in F : \varphi_k(f) = 0,\ k = 1, \dots, n\}.$$

Theorem 7.2.3 (Nikol'skii duality theorem). Let $\Phi = \{\varphi_k\}_{k=1}^n$ be a fixed system of functionals from $F^*$. Then, for any $\varphi \in F^*$,
$$\inf_{\{c_k\}_{k=1}^n} \Big\| \varphi - \sum_{k=1}^n c_k \varphi_k \Big\| = \sup_{f \in F_\Phi;\ \|f\| \le 1} |\varphi(f)|. \tag{7.2.2}$$
Proof. Let us denote the left-hand side of (7.2.2) by $a$ and the right-hand side of (7.2.2) by $b$. From the relation
$$|\varphi(f)| = \Big| \Big( \varphi - \sum_{k=1}^n c_k \varphi_k \Big)(f) \Big| \le \Big\| \varphi - \sum_{k=1}^n c_k \varphi_k \Big\|,$$
which is valid for any $f \in F_\Phi$, $\|f\| \le 1$, it follows that $b \le a$. We prove the inverse inequality. Clearly, we can assume that the system of functionals $\varphi_1, \dots, \varphi_n$ is linearly independent.
Lemma 7.2.4. Let $\varphi_1, \dots, \varphi_n \in F^*$ be linearly independent. There exists a set of elements $f_1, \dots, f_n \in F$ which is biorthogonal to $\varphi_1, \dots, \varphi_n$, that is, $\varphi_i(f_j) = 0$ for $1 \le i \ne j \le n$ and $\varphi_i(f_i) = 1$, $i = 1, \dots, n$.

Proof. The proof will be carried out by induction. The case $n = 1$ is evident. Let us assume that a biorthogonal system can be constructed if the number of functionals is less than $n$. Clearly, it suffices to prove the existence of $f_1 \in F$ such that
$$\varphi_1(f_1) = 1, \qquad \varphi_k(f_1) = 0, \quad k = 2, \dots, n.$$
Let $\Phi_1 = \{\varphi_k\}_{k=2}^n$ and let $\{g_k\}_{k=2}^n$ be a biorthogonal system to $\Phi_1$. It is sufficient to prove the existence of $f_1 \in F_{\Phi_1}$ such that $\varphi_1(f_1) \ne 0$. Let us assume the contrary, that is, for any $f \in F_{\Phi_1}$ we have $\varphi_1(f) = 0$. We shall show that this contradicts the linear independence of the functionals $\varphi_1, \dots, \varphi_n$. If $f \in F$, then
$$f - \sum_{k=2}^n \varphi_k(f) g_k \in F_{\Phi_1} \quad \text{and} \quad \varphi_1\Big( f - \sum_{k=2}^n \varphi_k(f) g_k \Big) = 0,$$
which implies
$$\varphi_1(f) = \sum_{k=2}^n \varphi_1(g_k) \varphi_k(f).$$
Consequently,
$$\varphi_1 = \sum_{k=2}^n \varphi_1(g_k) \varphi_k,$$
which is in contradiction with the linear independence of $\varphi_1, \dots, \varphi_n$. The lemma is proved.
We continue the proof of the theorem. Let $\varphi \in F^*$. Along with $\varphi$ we consider the restriction $\varphi_\Phi$ of $\varphi$ to the subspace $F_\Phi$, that is, the bounded linear functional $\varphi_\Phi$ defined on $F_\Phi$ such that $\varphi_\Phi(f) = \varphi(f)$ for all $f \in F_\Phi$. Any functional
$$\psi = \varphi - \sum_{k=1}^n c_k \varphi_k \tag{7.2.3}$$
is an extension of $\varphi_\Phi$ to $F$. We prove that each extension of the functional $\varphi_\Phi$ from $F_\Phi$ to $F$ has the form (7.2.3). We use Lemma 7.2.4. Let the system $f_1, \dots, f_n$ be biorthogonal to $\Phi$; then, for any $f \in F$,
$$f - \sum_{k=1}^n \varphi_k(f) f_k \in F_\Phi.$$
Consequently, for any extension $\psi$ of the functional $\varphi_\Phi$ we have
$$\psi\Big( f - \sum_{k=1}^n \varphi_k(f) f_k \Big) = \varphi\Big( f - \sum_{k=1}^n \varphi_k(f) f_k \Big),$$
whence
$$\psi(f) = \varphi(f) + \sum_{k=1}^n \big( \psi(f_k) - \varphi(f_k) \big) \varphi_k(f).$$
Thus, the representation (7.2.3) is valid for $\psi$. Let $\psi$ be an extension of the functional $\varphi_\Phi$ such that $\|\psi\| = \|\varphi_\Phi\|$. The existence of such an extension follows from the Hahn–Banach theorem. Then
$$\|\psi\| = \Big\| \varphi - \sum_{k=1}^n c_k \varphi_k \Big\| = \|\varphi_\Phi\| = \sup_{f \in F_\Phi;\ \|f\| \le 1} |\varphi(f)|,$$
that is, $a \le b$, which concludes the proof of the theorem.
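Theorem 7.2.5. If $\varphi, \varphi_1, \dots, \varphi_n \in L_p$, $1 \le p \le \infty$, then
$$\inf_{c_k,\ k = 1, \dots, n} \Big\| \varphi - \sum_{k=1}^n c_k \varphi_k \Big\|_p = \sup_{\|g\|_{p'} \le 1;\ \langle g, \varphi_k \rangle = 0,\ k = 1, \dots, n} |\langle g, \varphi \rangle|.$$
Proof. This follows from Theorems 7.2.1 and 7.2.3. Indeed, let us regard a function $\varphi \in L_p$ as a functional $\varphi$ acting on $L_{p'}$ by the formula $\varphi(f) = \langle f, \varphi \rangle$. Then, by Theorem 7.2.1, we have $\|\varphi\| = \|\varphi(\cdot)\|_p$. Hence it only remains to apply Theorem 7.2.3.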
7.3 Fourier series of functions in $L_p$

For a function $f \in L_1$ we define its Fourier coefficients by
$$\hat{f}(k) = (2\pi)^{-d} \int_{\mathbb{T}^d} f(x) e^{-i(k,x)}\,dx = \langle f, e^{i(k,x)} \rangle. \tag{7.3.1}$$
Recall the well-known Parseval identity: for any $f \in L_2$,
$$\|f\|_2 = \Big( \sum_k |\hat{f}(k)|^2 \Big)^{1/2}, \tag{7.3.2}$$
and the Riesz–Fischer theorem: if $\sum_k |c_k|^2 < \infty$, then
$$f(x) = \sum_k c_k e^{i(k,x)} \in L_2 \quad \text{and} \quad \hat{f}(k) = c_k.$$
In the space $L_p$, $1 < p < \infty$, the following statement holds.

Theorem 7.3.1 (Hausdorff–Young theorem). If $1 < p \le 2$, then for any $f \in L_p$
$$\Big( \sum_k |\hat{f}(k)|^{p'} \Big)^{1/p'} \le \|f\|_p. \tag{7.3.3}$$
If the sequence $\{c_k\}$ is such that $\sum_k |c_k|^p < \infty$, then there exists a function $f \in L_{p'}$ for which $\hat{f}(k) = c_k$ and
$$\|f\|_{p'} \le \Big( \sum_k |\hat{f}(k)|^p \Big)^{1/p}. \tag{7.3.4}$$
We derive this theorem from the following interpolation theorem, which is a special case of the general Riesz–Thorin theorem. Denote the norm of an operator $T$ acting from a Banach space $E$ to a Banach space $F$ by
$$\|T\|_{E \to F} = \sup_{\|f\|_E \le 1} \|Tf\|_F.$$
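Theorem 7.3.2 (Riesz–Thorin theorem). Let $E_q$ be either $L_q$ or $\ell_q$ and $F_p$ be either $L_p$ or $\ell_p$ and, for $1 \le q_i, p_i \le \infty$,
$$\|T\|_{E_{q_i} \to F_{p_i}} \le M_i, \qquad i = 1, 2.$$
Then, for all $0 < \theta < 1$,
$$\|T\|_{E_q \to F_p} \le M_1^{\theta} M_2^{1-\theta},$$
where
$$1/q = \theta/q_1 + (1 - \theta)/q_2, \qquad 1/p = \theta/p_1 + (1 - \theta)/p_2.$$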
Proof of Theorem 7.3.1. We first prove the relation (7.3.3). Let us consider the operator $T$ that maps each function $f \in L_1$ to the sequence $\{\hat{f}(k)\}$ of its Fourier coefficients. Then, by (7.3.2), for $f \in L_2$ we have
$$\|Tf\|_{\ell_2} = \|f\|_2, \tag{7.3.5}$$
and obviously, for $f \in L_1$,
$$\|Tf\|_{\ell_\infty} \le \|f\|_1. \tag{7.3.6}$$
If $1/p = \theta/1 + (1 - \theta)/2$, then $1/p' = \theta/\infty + (1 - \theta)/2$ and the relation (7.3.3) follows from (7.3.5), (7.3.6), and Theorem 7.3.2 with
$$E_q = L_q, \quad q_1 = 1, \quad q_2 = 2; \qquad F_p = \ell_p, \quad p_1 = \infty, \quad p_2 = 2.$$
We prove the relation (7.3.4). Clearly, by the completeness of the space $L_{p'}$, it is sufficient to prove (7.3.4) in the case when only a finite number of $c_k$ are nonzero. Let this be the case and
$$f = \sum_k c_k e^{i(k,x)}.$$
By Theorem 7.2.1,
$$\|f\|_{p'} = \sup_{\|g\|_p \le 1} |\langle f, g \rangle| = \sup_{\|g\|_p \le 1} \Big| \sum_k c_k \overline{\hat{g}(k)} \Big|. \tag{7.3.7}$$
Applying the Hölder inequality and the relation (7.3.3) we see that (7.3.7) is
$$\le \sup_{\|g\|_p \le 1} \Big( \sum_k |\hat{g}(k)|^{p'} \Big)^{1/p'} \Big( \sum_k |c_k|^p \Big)^{1/p} \le \Big( \sum_k |c_k|^p \Big)^{1/p}.$$
The relation (7.3.4) is proved.
Let $[y]$ be the integral part of the real number $y$, that is, the largest integer $[y]$ such that $[y] \le y$. For a vector $s = (s_1, \dots, s_d)$ with nonnegative integer coordinates we define the set $\rho(s)$ of vectors $k$ with integer coordinates as follows:
$$\rho(s) = \{k = (k_1, \dots, k_d) : [2^{s_j - 1}] \le |k_j| < 2^{s_j},\ j = 1, \dots, d\}.$$
For $f \in L_1$, we denote
$$\delta_s(f, x) = \sum_{k \in \rho(s)} \hat{f}(k) e^{i(k,x)}.$$

Theorem 7.3.3 (Littlewood–Paley theorem). Let $1 < p < \infty$. There exist positive numbers $C_1(d, p)$ and $C_2(d, p)$, which depend on $d$ and $p$, such that, for each function $f \in L_p$,
$$C_1(d, p) \|f\|_p \le \Big\| \Big( \sum_s |\delta_s(f, x)|^2 \Big)^{1/2} \Big\|_p \le C_2(d, p) \|f\|_p.$$

Corollary 7.3.4. Let $G$ be a finite set of vectors $s$ and let the operator $S_G$ map a function $f \in L_p$, $p > 1$, to the function
$$S_G(f) = \sum_{s \in G} \delta_s(f).$$
Then
$$\|S_G\|_{L_p \to L_p} \le C(d, p), \qquad 1 < p < \infty.$$
For the sake of brevity we shall write $\|T\|_{L_q \to L_p} = \|T\|_{q \to p}$.

Corollary 7.3.5. Let $p^* = \min(p, 2)$; then for $f \in L_p$ we have
$$\|f\|_p \le C(d, p) \Big( \sum_s \|\delta_s(f, x)\|_p^{p^*} \Big)^{1/p^*}, \qquad 1 < p < \infty.$$
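Proof. Let $2 \le p < \infty$; then, by Theorem 7.3.3,
$$\|f\|_p \le C(d, p) \Big\| \Big( \sum_s |\delta_s(f, x)|^2 \Big)^{1/2} \Big\|_p = C(d, p) \Big\| \sum_s |\delta_s(f, x)|^2 \Big\|_{p/2}^{1/2} \le C(d, p) \Big( \sum_s \big\| |\delta_s(f, x)|^2 \big\|_{p/2} \Big)^{1/2} = C(d, p) \Big( \sum_s \|\delta_s(f, x)\|_p^2 \Big)^{1/2}.$$
Let $1 < p \le 2$. Using the inequality $|a + b|^k \le |a|^k + |b|^k$, which is valid for $0 \le k \le 1$, from Theorem 7.3.3 we find, by means of Fatou's theorem,
$$\|f\|_p^p \le C(d, p) (2\pi)^{-d} \int_{\mathbb{T}^d} \Big( \sum_s |\delta_s(f, x)|^2 \Big)^{p/2} dx \le C(d, p) \sum_s \|\delta_s(f, x)\|_p^p.$$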
Theorem 7.3.6 (Marcinkiewicz multiplier theorem). Suppose that $\lambda_0, \lambda_1, \dots$ are Marcinkiewicz multipliers, that is, they satisfy the conditions
$$|\lambda_n| \le M, \quad n = 0, \pm 1, \dots, \qquad \sum_{l = \pm 2^\nu}^{\pm(2^{\nu+1} - 1)} |\lambda_l - \lambda_{l+1}| \le M, \quad \nu = 0, 1, \dots,$$
where $M > 0$. Then the operator $\Lambda$, which maps a function $f$ into the function
$$\sum_k \lambda_k \hat{f}(k) e^{ikx},$$
is bounded as an operator from $L_p$ to $L_p$ for $1 < p < \infty$.

Theorem 7.3.7 (Hardy–Littlewood inequality). Let $1 < q < p < \infty$, $\mu = 1 - 1/q + 1/p$,
$$\|f\|_{L_q(\mathbb{R})} = \Big( \int_{-\infty}^{\infty} |f|^q\,dx \Big)^{1/q} < \infty,$$
and
$$J(x) = \int_{-\infty}^{\infty} f(y) |x - y|^{-\mu}\,dy.$$
Then the inequality $\|J\|_{L_p(\mathbb{R})} \le C(q, p) \|f\|_{L_q(\mathbb{R})}$ holds.

Corollary 7.3.8. Let $1 < q < p < \infty$, $\beta = 1/q - 1/p$. Then the operator $A_\beta$ which maps a function $f \in L_q$ into the function
$$\sum_k \hat{f}(k) \Big( \prod_{j=1}^d \max(1, |k_j|) \Big)^{-\beta} e^{i(k,x)}$$
is bounded as an operator from $L_q$ to $L_p$.
7.4 Trigonometric polynomials

Functions of the form
$$t(x) = \sum_{|k| \le n} c_k e^{ikx} = a_0/2 + \sum_{k=1}^n (a_k \cos kx + b_k \sin kx), \tag{7.4.1}$$
where $c_k$, $a_k$, $b_k$ are complex numbers, will be called trigonometric polynomials of order $n$. We shall denote the set of such polynomials by $T(n)$, and by $RT(n)$ the subset of $T(n)$ of real polynomials. We first consider a number of concrete polynomials which play an important role in approximation theory.

The Dirichlet kernel. The Dirichlet kernel of order $n$ is defined as
$$D_n(x) = \sum_{|k| \le n} e^{ikx} = e^{-inx} \big( e^{i(2n+1)x} - 1 \big) (e^{ix} - 1)^{-1} = \sin\big( (n + 1/2)x \big) / \sin(x/2).$$
The Dirichlet kernel is an even trigonometric polynomial, with the majorant
$$|D_n(x)| \le \min\big( 2n + 1, \pi/|x| \big), \qquad |x| \le \pi. \tag{7.4.2}$$
The estimate
$$\|D_n\|_1 \le C \ln n, \qquad n = 2, 3, \dots,$$
follows from (7.4.2). We mention the well-known relation
$$\|D_n\|_1 = \frac{4}{\pi^2} \ln n + R_n, \qquad |R_n| \le 3, \quad n = 1, 2, 3, \dots.$$
For any trigonometric polynomial $t \in T(n)$ we have $t * D_n = t$. We denote
$$x^l = 2\pi l/(2n + 1), \qquad l = 0, 1, \dots, 2n.$$
Clearly, the points $x^l$ with $l = 1, \dots, 2n$ are the zeros of the Dirichlet kernel $D_n$ on $[0, 2\pi]$. For any $|k| \le n$ we have
$$\sum_{l=0}^{2n} e^{ikx^l} D_n(x - x^l) = \sum_{|m| \le n} e^{imx} \sum_{l=0}^{2n} e^{i(k-m)x^l} = e^{ikx} (2n + 1). \tag{7.4.3}$$
Consequently, for any $t \in T(n)$,
$$t(x) = (2n + 1)^{-1} \sum_{l=0}^{2n} t(x^l) D_n(x - x^l). \tag{7.4.4}$$
Furthermore, it is easy to see that for any $u, v \in T(n)$ we have
$$\langle u, v \rangle = (2\pi)^{-1} \int_{-\pi}^{\pi} u(x) \overline{v(x)}\,dx = (2n + 1)^{-1} \sum_{l=0}^{2n} u(x^l) \overline{v(x^l)} \tag{7.4.5}$$
and, for any $t \in T(n)$,
$$\|t\|_2^2 = (2n + 1)^{-1} \sum_{l=0}^{2n} |t(x^l)|^2. \tag{7.4.6}$$
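The closed form of the Dirichlet kernel, the reproduction formula (7.4.4) and the discrete Parseval relation (7.4.6) lend themselves to a direct numerical check. The Python sketch below does this for one random polynomial; the order `n = 5`, the evaluation point `x0` and all function names are assumptions chosen for illustration.

```python
import cmath
import math
import random

def dirichlet(n, x):
    """D_n(x) = sum_{|k|<=n} e^{ikx}, computed from the definition (real-valued)."""
    return 1.0 + 2.0 * sum(math.cos(k * x) for k in range(1, n + 1))

def trig_poly(coeffs, n, x):
    """t(x) = sum_{|k|<=n} c_k e^{ikx}, where coeffs[k + n] holds c_k."""
    return sum(coeffs[k + n] * cmath.exp(1j * k * x) for k in range(-n, n + 1))

n = 5
rng = random.Random(2)
coeffs = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2 * n + 1)]
nodes = [2 * math.pi * l / (2 * n + 1) for l in range(2 * n + 1)]   # the points x^l
samples = [trig_poly(coeffs, n, xl) for xl in nodes]

def reconstruct(x):
    """Right-hand side of (7.4.4), built from the samples t(x^l) only."""
    return sum(s * dirichlet(n, x - xl) for s, xl in zip(samples, nodes)) / (2 * n + 1)

x0 = 0.7
closed_form = math.sin((n + 0.5) * x0) / math.sin(x0 / 2)
discrete_norm_sq = sum(abs(s) ** 2 for s in samples) / (2 * n + 1)  # right side of (7.4.6)
parseval_norm_sq = sum(abs(c) ** 2 for c in coeffs)                 # ||t||_2^2 via Parseval
```

The values `reconstruct(x0)` and `trig_poly(coeffs, n, x0)` agree up to rounding, as do the two expressions for $\|t\|_2^2$.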
For $1 < q \le \infty$, the estimate
$$\|D_n\|_q \le C(q) n^{1 - 1/q} \tag{7.4.7}$$
follows from (7.4.2). Applying the Hölder inequality to estimate $\|D_n\|_2^2$, we get
$$2n + 1 = \|D_n\|_2^2 \le \|D_n\|_q \|D_n\|_{q'}. \tag{7.4.8}$$
The relations (7.4.7) and (7.4.8) imply that for $1 < q < \infty$
$$\|D_n\|_q \asymp n^{1 - 1/q}. \tag{7.4.9}$$
The relation (7.4.9) for $q = \infty$ is obvious. We denote by $S_n$ the operator of taking the partial sum of order $n$. Then for $f \in L_1$ we have $S_n(f) = f * D_n$.

Theorem 7.4.1. The operator $S_n$ does not change polynomials from $T(n)$, and for $p = 1$ or $\infty$
$$\|S_n\|_{p \to p} \le C \ln n, \qquad n = 2, 3, \dots,$$
while for $1 < p < \infty$ and all $n$
$$\|S_n\|_{p \to p} \le C(p).$$
This theorem follows from (7.4.3) and the Marcinkiewicz multiplier theorem (see Theorem 7.3.6 above).

For $t \in T(n)$,
$$t(x) = a_0/2 + \sum_{k=1}^n (a_k \cos kx + b_k \sin kx),$$
we call the polynomial $\tilde{t} \in T(n)$,
$$\tilde{t}(x) = \sum_{k=1}^n (a_k \sin kx - b_k \cos kx),$$
the conjugate polynomial to $t$.

Corollary 7.4.2. For $1 < p < \infty$ and all $n$ we have
$$\|\tilde{t}\|_p \le C(p) \|t\|_p.$$
Proof. Let $t \in T(n)$. Clearly, it suffices to consider the case of odd $n$. Let this be the case and set $m = (n + 1)/2$, $l = (n - 1)/2$. Then it is not difficult to see that
$$\tilde{t} = t * \tilde{D}_n, \quad \text{where} \quad \tilde{D}_n(x) = 2 \sum_{k=1}^n \sin kx.$$
Representing $\tilde{D}_n(x)$ in the form
$$\tilde{D}_n(x) = \frac{1}{i} \Big( \sum_{k=1}^n e^{ikx} - \sum_{k=-n}^{-1} e^{ikx} \Big) = \frac{1}{i} \big( e^{imx} D_l(x) - e^{-imx} D_l(x) \big),$$
we get the corollary.
We call a trigonometric conjugation operator the operator which maps a function $f(x)$ to the function
$$\sum_k (\operatorname{sign} k) \hat{f}(k) e^{ikx}.$$
The Marcinkiewicz multiplier theorem implies that this operator is bounded as an operator from $L_p$ to $L_p$ for $1 < p < \infty$. For $f \in L_p$, $p > 1$, we shall denote by $\tilde{f}$ the conjugate function.

The Fejér kernel. The Fejér kernel of order $n - 1$ is defined as
$$K_{n-1}(x) = n^{-1} \sum_{m=0}^{n-1} D_m(x) = \sum_{|m| \le n} \big( 1 - |m|/n \big) e^{imx} = \frac{\big( \sin(nx/2) \big)^2}{n \big( \sin(x/2) \big)^2}.$$
The Fejér kernel is an even nonnegative trigonometric polynomial in $T(n - 1)$, with the majorant
$$K_{n-1}(x) \le \min\big( n, \pi^2/(n x^2) \big), \qquad |x| \le \pi. \tag{7.4.10}$$
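The three expressions for the Fejér kernel and the majorant (7.4.10) can be compared numerically. The following Python sketch is a minimal illustration; the function names and the choice `n = 8` are assumptions of this example.

```python
import math

def dirichlet(m, x):
    """D_m(x) = sum_{|k|<=m} e^{ikx}."""
    return 1.0 + 2.0 * sum(math.cos(k * x) for k in range(1, m + 1))

def fejer_mean(n, x):
    """K_{n-1}(x) as the arithmetic mean of D_0, ..., D_{n-1}."""
    return sum(dirichlet(m, x) for m in range(n)) / n

def fejer_coeffs(n, x):
    """K_{n-1}(x) from its Fourier coefficients 1 - |m|/n."""
    return 1.0 + 2.0 * sum((1 - m / n) * math.cos(m * x) for m in range(1, n))

def fejer_closed(n, x):
    """K_{n-1}(x) = sin^2(nx/2) / (n sin^2(x/2)), valid for x not a multiple of 2*pi."""
    return math.sin(n * x / 2) ** 2 / (n * math.sin(x / 2) ** 2)

def majorant(n, x):
    """Right-hand side of (7.4.10)."""
    return min(n, math.pi ** 2 / (n * x * x))

n = 8
```

On a grid of points in $(0, \pi]$ the three evaluations coincide up to rounding and stay between $0$ and the majorant.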
From the obvious relations
$$\|K_{n-1}\|_1 = 1, \qquad \|K_{n-1}\|_\infty = n$$
and the inequality (see (7.1.6) above)
$$\|f\|_q \le \|f\|_1^{1/q} \|f\|_\infty^{1 - 1/q},$$
we get, in the same way as above,
$$C n^{1 - 1/q} \le \|K_{n-1}\|_q \le n^{1 - 1/q}, \qquad 1 \le q \le \infty. \tag{7.4.11}$$

De la Vallée Poussin kernels. The de la Vallée Poussin kernels are defined as
$$V_{m,n}(x) = (n - m)^{-1} \sum_{l=m}^{n-1} D_l(x), \qquad n > m.$$
It is convenient to represent these kernels in terms of the Fejér kernels:
$$V_{m,n}(x) = (n - m)^{-1} \big( n K_{n-1}(x) - m K_{m-1}(x) \big) = (\cos mx - \cos nx) \big( 2(n - m) (\sin(x/2))^2 \big)^{-1}.$$
The de la Vallée Poussin kernels $V_{m,n}$ are even trigonometric polynomials of order $n - 1$, with the majorant
$$|V_{m,n}(x)| \le C \min\big( n, 1/|x|, 1/((n - m) x^2) \big), \qquad |x| \le \pi. \tag{7.4.12}$$
This implies the estimate
$$\|V_{m,n}\|_1 \le C \ln\big( 1 + n/(n - m) \big).$$
We shall often use the de la Vallée Poussin kernel with $n = 2m$ and denote it by $V_m(x) = V_{m,2m}(x)$,
m ≥ 1,
V0 (x) = 4.
Then for $m \ge 1$ we have $V_m = 2K_{2m-1} - K_{m-1}$, so due to the properties of $K_n$
$$\|V_m\|_1 \le 3. \tag{7.4.13}$$
In addition, $\|V_m\|_\infty \le 3m$. Consequently, in the same way as above we get
$$\|V_m\|_q \asymp m^{1 - 1/q}, \qquad 1 \le q \le \infty. \tag{7.4.14}$$
We denote
$$x(l) = \pi l/(2m), \qquad l = 1, \dots, 4m.$$
Then as in (7.4.4) for each $t \in T(m)$ we have
$$t(x) = (4m)^{-1} \sum_{l=1}^{4m} t\big( x(l) \big) V_m\big( x - x(l) \big). \tag{7.4.15}$$
The operator $V_m$ defined on $L_1$ by the formula $V_m(f) = f * V_m$ will be called the de la Vallée Poussin operator. The following theorem is a corollary of the definition of the kernels $V_m$ and the bound (7.4.13).

Theorem 7.4.3. The operator $V_m$ does not change polynomials from $T(m)$ and for $1 \le p \le \infty$ we have
$$\|V_m\|_{p \to p} \le 3, \qquad m = 1, 2, \dots.$$
In addition we note two properties of the de la Vallée Poussin kernels:

1. The relation (7.4.12) with $n = 2m$ implies the inequality
$$|V_m(x)| \le C \min\big( m, 1/(m x^2) \big), \qquad |x| \le \pi.$$
It is easy to derive from this inequality the following property.

2. For $h$ satisfying the condition $C_1 \le mh \le C_2$ we have
$$\sum_{0 \le l \le 2\pi/h} |V_m(x - lh)| \le Cm.$$
We remark that the property 2 is valid also for the Fejér kernel $K_m$.

The Jackson kernel. The Jackson kernel is defined as
$$J_n^a(x) = \gamma_{a,n}^{-1} \Big( \frac{\sin(nx/2)}{\sin(x/2)} \Big)^{2a}, \qquad a \in \mathbb{N},$$
where $\gamma_{a,n}$ is selected so that
$$\|J_n^a\|_1 = 1. \tag{7.4.16}$$
Let us estimate $\gamma_{a,n}$ from below. We have
$$\gamma_{a,n} = (2\pi)^{-1} \int_{-\pi}^{\pi} \Big( \frac{\sin(nx/2)}{\sin(x/2)} \Big)^{2a} dx \ge \pi^{-1} \int_0^{\pi/n} \Big( \frac{nx/\pi}{x/2} \Big)^{2a} dx \ge C n^{2a - 1}. \tag{7.4.17}$$
The Jackson kernel is an even nonnegative trigonometric polynomial of order $a(n - 1)$. It follows from (7.4.17) that
$$J_n^a(x) \le C \min\big( n, n^{1 - 2a} x^{-2a} \big), \qquad |x| \le \pi. \tag{7.4.18}$$
This implies that, for $0 \le r < 2a - 1$,
$$\int_0^{\pi} J_n^a(x) x^r\,dx \le C(r) n^{-r}. \tag{7.4.19}$$
Rudin–Shapiro polynomials. We define recursively pairs of trigonometric polynomials $P_j(x)$ and $Q_j(x)$ of order $2^j - 1$:
$$P_0 = Q_0 = 1, \qquad P_{j+1}(x) = P_j(x) + e^{i2^j x} Q_j(x), \qquad Q_{j+1}(x) = P_j(x) - e^{i2^j x} Q_j(x).$$
Then at each point $x$ we have
$$|P_{j+1}|^2 + |Q_{j+1}|^2 = \big( P_j + e^{i2^j x} Q_j \big) \big( \bar{P}_j + e^{-i2^j x} \bar{Q}_j \big) + \big( P_j - e^{i2^j x} Q_j \big) \big( \bar{P}_j - e^{-i2^j x} \bar{Q}_j \big) = 2 \big( |P_j|^2 + |Q_j|^2 \big).$$
Consequently, for all $x$,
$$|P_j(x)|^2 + |Q_j(x)|^2 = 2^{j+1}.$$
Thus, for example,
$$\|P_n\|_\infty \le 2^{(n+1)/2}. \tag{7.4.20}$$
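The recursion for the Rudin–Shapiro pairs translates directly into an operation on coefficient lists: multiplication by $e^{i2^j x}$ shifts the coefficients of $Q_j$ by $2^j$ places, so $P_{j+1}$ and $Q_{j+1}$ are obtained by concatenation. The Python sketch below uses this observation to check the identity $|P_j|^2 + |Q_j|^2 = 2^{j+1}$ and the bound (7.4.20); the function names and the choice `j = 5` are assumptions of this example.

```python
import cmath

def rudin_shapiro(j):
    """Coefficient lists (eps_0, ..., eps_{2^j - 1}) of P_j and Q_j."""
    p, q = [1], [1]
    for _ in range(j):
        # P_{j+1} = P_j + e^{i 2^j x} Q_j appends the coefficients of Q_j;
        # Q_{j+1} = P_j - e^{i 2^j x} Q_j appends them with the opposite sign.
        p, q = p + q, p + [-c for c in q]
    return p, q

def evaluate(coeffs, x):
    """Value of sum_k coeffs[k] e^{ikx}."""
    return sum(c * cmath.exp(1j * k * x) for k, c in enumerate(coeffs))

j = 5
P, Q = rudin_shapiro(j)
points = [0.05 * k for k in range(126)]     # sample points in [0, 2*pi)
energy = [abs(evaluate(P, x)) ** 2 + abs(evaluate(Q, x)) ** 2 for x in points]
sup_P = max(abs(evaluate(P, x)) for x in points)
```

Every entry of `energy` equals $2^{j+1} = 64$ up to rounding, and `sup_P` stays below $2^{(j+1)/2} = 8$.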
It is clear from the definition of the polynomials $P_n$ that
$$P_n(x) = \sum_{k=0}^{2^n - 1} \varepsilon_k e^{ikx}, \qquad \varepsilon_k = \pm 1.$$
Let $N$ be a natural number and
$$N = \sum_{j=1}^m 2^{n_j}, \qquad n_1 > n_2 > \dots > n_m \ge 0,$$
its binary representation. We set
$$R_N'(x) = P_{n_1}(x) + \sum_{j=2}^m P_{n_j}(x) e^{i(2^{n_1} + \dots + 2^{n_{j-1}})x}, \qquad R_N(x) = R_N'(x) + R_N'(-x) - 1.$$
Then $R_N(x)$ has the form
$$R_N(x) = \sum_{|k| \le N} \varepsilon_k e^{ikx}, \qquad \varepsilon_k = \pm 1,$$
and this polynomial obeys the estimate
$$\|R_N\|_\infty \le C N^{1/2}. \tag{7.4.21}$$
7.5 Bernstein–Nikol'skii Inequalities. The Marcinkiewicz Theorem

The Bernstein–Nikol'skii inequalities are inequalities connecting the $L_p$-norm of a derivative of some polynomial with the $L_q$-norm, $1 \le q \le p \le \infty$, of this polynomial. We shall obtain here inequalities for a derivative which is slightly more general than the Weyl fractional derivative. We first make some auxiliary considerations. For a sequence $\{a_\nu\}_{\nu=0}^\infty$ we denote
$$\Delta a_\nu = a_\nu - a_{\nu+1}, \qquad \Delta^2 a_\nu = \Delta(\Delta a_\nu) = a_\nu - 2a_{\nu+1} + a_{\nu+2}.$$

Theorem 7.5.1. We have
$$\pi^{-1} \int_{-\pi}^{\pi} \Big| a_0/2 + \sum_{\nu=1}^n a_\nu \cos \nu x \Big|\,dx \le \sum_{\nu=0}^n (\nu + 1) |\Delta^2 a_\nu|.$$
Proof. Applying twice the Abel transformation (7.1.17) we have ($a_\nu = 0$ for $\nu > n$)
$$t(x) = a_0 + \sum_{\nu=1}^n a_\nu 2\cos \nu x = \sum_{\nu=0}^n D_\nu(x) \Delta a_\nu = \sum_{\nu=0}^n \Big( \sum_{\mu=0}^{\nu} D_\mu(x) \Big) \Delta^2 a_\nu = \sum_{\nu=0}^n (\nu + 1) K_\nu(x) \Delta^2 a_\nu. \tag{7.5.1}$$
From (7.5.1), using $\|K_\nu\|_1 = 1$, we find
$$\|t\|_1 \le \sum_{\nu=0}^n (\nu + 1) |\Delta^2 a_\nu|,$$
as claimed.
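The identity (7.5.1) and the resulting bound of Theorem 7.5.1 can be verified on a small example. In the Python sketch below the coefficient list `a`, the quadrature size and all function names are assumptions made for illustration; the $L_1$ norm is only approximated by a Riemann sum.

```python
import math

def fejer(nu, x):
    """K_nu(x) = sum_{|m|<=nu} (1 - |m|/(nu + 1)) e^{imx}."""
    return 1.0 + 2.0 * sum((1 - m / (nu + 1)) * math.cos(m * x) for m in range(1, nu + 1))

def second_difference(a, nu):
    """Delta^2 a_nu = a_nu - 2 a_{nu+1} + a_{nu+2}, with a_nu = 0 for nu > n."""
    def get(i):
        return a[i] if i < len(a) else 0.0
    return get(nu) - 2 * get(nu + 1) + get(nu + 2)

a = [1.0, 0.8, 0.5, 0.1, -0.3]      # a_0, ..., a_n with n = 4
n = len(a) - 1

def t_direct(x):
    """t(x) = a_0 + sum_{nu=1}^n a_nu * 2 cos(nu x)."""
    return a[0] + 2.0 * sum(a[nu] * math.cos(nu * x) for nu in range(1, n + 1))

def t_fejer(x):
    """Right-hand side of (7.5.1)."""
    return sum((nu + 1) * fejer(nu, x) * second_difference(a, nu) for nu in range(n + 1))

bound = sum((nu + 1) * abs(second_difference(a, nu)) for nu in range(n + 1))
M = 2000                             # number of quadrature points (approximation only)
l1_norm = sum(abs(t_direct(2 * math.pi * i / M)) for i in range(M)) / M
```

For this sample the two evaluations of $t$ agree up to rounding and the approximate value of $\|t\|_1$ is well below `bound`.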
We first prove the Bernstein inequalities. Let us consider the following special trigonometric polynomials. Let $s$ be a nonnegative integer. We define
$$A_0(x) = 1, \qquad A_1(x) = V_1(x) - 1, \qquad A_s(x) = V_{2^{s-1}}(x) - V_{2^{s-2}}(x), \quad s \ge 2,$$
where $V_m$ are the de la Vallée Poussin kernels. Then $A_s \in T(2^s)$ and, by (7.4.13),
$$\|A_s\|_1 \le 6. \tag{7.5.2}$$
Let $r \ge 0$ and $\alpha$ be real numbers. We consider the polynomials
$$V_n^r(x, \alpha) = 1 + 2 \sum_{k=1}^n k^r \cos(kx + \alpha\pi/2) + 2 \sum_{k=n+1}^{2n-1} k^r \big( 1 - (k - n)/n \big) \cos(kx + \alpha\pi/2).$$
Let us prove that, for all $r > 0$ and $\alpha$,
$$\|V_n^r(x, \alpha)\|_1 \le C(r) n^r, \qquad n = 1, 2, \dots. \tag{7.5.3}$$
Since, for an arbitrary $\alpha$,
$$V_n^r(x, \alpha) - 1 = \big( V_n^r(x, 0) - 1 \big) \cos(\alpha\pi/2) + \big( V_n^r(x, 1) - 1 \big) \sin(\alpha\pi/2),$$
it suffices to prove (7.5.3) for $\alpha = 0, 1$. We first consider the case $\alpha = 0$. Let $v_k$ be the Fourier cosine coefficients of the function $V_n^r(x, 0)$. Then, by Theorem 7.5.1,
$$\|V_n^r(x, 0)\|_1 \le \sum_{k=0}^{2n-1} (k + 1) |\Delta^2 v_k|. \tag{7.5.4}$$
It is easy to see that, for $1 \le k \le n - 2$,
$$|\Delta^2 v_k| \le C(r) k^{r-2}. \tag{7.5.5}$$
Using the relation
$$\Delta^2(a_k b_k) = (\Delta^2 a_k) b_k + 2 (\Delta a_{k+1}) (\Delta b_k) + a_{k+2} (\Delta^2 b_k)$$
with $a_k = k^r$ and $b_k = 1 - (k - n)/n$, we see that (7.5.5) holds for $n \le k \le 2n - 3$ too. For the remaining values of $k \ne 0$ we have
$$|\Delta^2 v_k| \le |\Delta v_k| + |\Delta v_{k+1}| \le C(r) n^{r-1}. \tag{7.5.6}$$
From the inequality $|\Delta^2 v_0| \le C(r)$ and (7.5.4)–(7.5.6) we get (7.5.3) for $r > 0$ and $\alpha = 0$. Let $\alpha = 1$ and let $\tilde{A}_s(x)$ denote the polynomial which is the trigonometric conjugate to $A_s(x)$, which means that in the expression for $A_s(x)$ the functions $\cos kx$ are replaced by $\sin kx$. We claim that
$$\|\tilde{A}_s\|_1 \le C. \tag{7.5.7}$$
Clearly, it suffices to consider $s \ge 3$. It is not difficult to see that the equality
$$\tilde{A}_s(x) = 2 \operatorname{Im}\Big( A_s(x) * \Big[ \big( 4K_{2^{s-1} - 1}(x) - 3K_{2^{s-1} - 2^{s-3} - 1}(x) \big) e^{i(2^{s-1} + 2^{s-3})x} \Big] \Big)$$
holds. From this equality, by virtue of the Young inequality with $p = q = a = 1$ (see (7.1.12)) and properties of the functions $K_n$ and $A_s$, we get (7.5.7). Further, for $n = 2^m$ we have
$$V_n^r(x, 1) - 1 = \big( V_{2n}^r(x, 0) - 1 \big) * V_n^0(x, 1) = -\sum_{s=1}^{m+1} V_{2n}^r(x, 0) * \tilde{A}_s(x) = -\sum_{s=1}^{m+1} V_{2^s}^r(x, 0) * \tilde{A}_s(x). \tag{7.5.8}$$
From (7.5.8), using the Young inequality (7.1.12), (7.5.7), and the relation (7.5.3), which has been proved for $\alpha = 0$, we get
$$\|V_n^r(x, \alpha)\|_1 \le C(r) \sum_{s=0}^{m+1} 2^{rs} \le C(r) n^r. \tag{7.5.9}$$
Let now $2^{m-1} \le n < 2^m$; then
$$V_n^r(x, 1) = V_{2^{m+1}}^r(x, 1) * V_n(x),$$
which by (7.5.9) and the Young inequality gives the required estimate for all $n$. The relation (7.5.3) is proved.

We define the operator $D_\alpha^r$, $r \ge 0$, $\alpha \in \mathbb{R}$, on the set of trigonometric polynomials as follows: for $f \in T(n)$,
$$D_\alpha^r f = f^{(r)}(x, \alpha) := f(x) * V_n^r(x, \alpha). \tag{7.5.10}$$
We call $f^{(r)}(x, \alpha)$ the $(r, \alpha)$-derivative. It is clear that for $f(x)$ such that $\hat{f}(0) = 0$ we have, for natural numbers $r$,
$$D_r^r f = \frac{d^r}{dx^r} f.$$
The operator $D_\alpha^r$ is defined in such a way that it has an inverse operator on each $T(n)$. This property distinguishes $D_\alpha^r$ from the differential operator, and is convenient for our analysis. On the other hand, it is clear that
$$\frac{d^r}{dx^r} f = D_r^r f - \hat{f}(0).$$

Theorem 7.5.2. For any $t \in T(n)$ we have ($r > 0$, $\alpha \in \mathbb{R}$, $1 \le p \le \infty$)
$$\|t^{(r)}(x, \alpha)\|_p \le C(r) n^r \|t\|_p, \qquad n = 1, 2, \dots.$$
Proof. By the definition (7.5.10), $t^{(r)}(x, \alpha) = t(x) * V_n^r(x, \alpha)$.
Consequently, by virtue of the Young inequality (see (7.1.12) above) with $p = q$, $a = 1$, for all $1 \le p \le \infty$ and $r$ we have
$$\|t^{(r)}(x, \alpha)\|_p \le \|t\|_p \|V_n^r(x, \alpha)\|_1.$$
To conclude the proof it remains to use the inequality (7.5.3).

Let us discuss the case $r = 0$, which is excluded from Theorem 7.5.2. If $r = 0$ and $\alpha$ is an even integer, we have $|t^{(0)}(x, \alpha)| = |t(x)|$ and, consequently,
$$\|t^{(0)}(x, \alpha)\|_p = \|t\|_p, \qquad 1 \le p \le \infty. \tag{7.5.11}$$
To investigate the general case it suffices to study the trigonometric conjugate operator. Theorem 7.4.1 and its corollary show that for all $\alpha$ and $1 < p < \infty$ the inequality
$$\|t^{(0)}(x, \alpha)\|_p \le C(p) \|t\|_p$$
holds. It remains to consider the case $p = 1, \infty$. It is sufficient to take $\alpha = 1$. We have, for $t \in T(n)$,
$$t^{(0)}(x, 1) = \hat{t}(0) - \tilde{t}(x) = \hat{t}(0) - t(x) * \tilde{D}_{2n+1}(x).$$
Furthermore,
$$\tilde{D}_{2n+1}(x) = 2 \sum_{k=1}^{2n+1} \sin kx = 2 \operatorname{Im}\big( D_n(x) e^{i(n+1)x} \big),$$
and consequently
$$\|\tilde{D}_{2n+1}\|_1 \le C \ln(n + 2).$$
Thus, for $t \in T(n)$,
$$\|t^{(0)}(x, 1)\|_p \le C \ln(n + 2) \|t\|_p, \qquad p = 1, \infty. \tag{7.5.12}$$
The relation (7.5.11) with $\alpha = 0$ and (7.5.12) imply for all $\alpha$ the inequality
$$\|t^{(0)}(x, \alpha)\|_p \le C \ln(n + 2) \|t\|_p, \qquad p = 1, \infty. \tag{7.5.13}$$

Remark 7.5.3. We have the relation
$$\sup_{t \in T(n)} \|t^{(0)}(x, 1)\|_p / \|t\|_p \asymp \ln(n + 2), \qquad p = 1, \infty.$$
The upper estimate follows from (7.5.12). Let us prove the lower estimate. We first consider the case $p = \infty$. Let $f(x) = (\pi - x)/2$, $0 < x < 2\pi$, be a $2\pi$-periodic function; then
$$f(x) = \sum_{k=1}^\infty (\sin kx)/k.$$
Let $m = [n/2]$. Then $t(x) = f(x) * V_m(x)$ has the following properties: $t \in T(n)$,
$$\|t\|_\infty \le 3\pi/2, \qquad t^{(0)}(0, 1) \ge \sum_{k=1}^m 1/k \ge C \ln(m + 2), \tag{7.5.14}$$
which imply the required lower estimate in the case $p = \infty$. Let $p = 1$ and $m = [n/2]$. Then the function $V_m \in T(n)$ has the following properties:
$$\|V_m\|_1 \le 3, \tag{7.5.15}$$
$$\|V_m^{(0)}(x, 1)\|_1 \ge C \ln(m + 2). \tag{7.5.16}$$
Let us prove (7.5.16). For $t$ from the above consideration for $p = \infty$ we have
$$\sigma = \big| \langle V_m^{(0)}(x, 1), t \rangle \big| \le \|V_m^{(0)}(x, 1)\|_1 \|t\|_\infty \tag{7.5.17}$$
and
$$\sigma \ge \sum_{k=1}^m 1/k \ge C \ln(m + 2). \tag{7.5.18}$$
From the relations (7.5.14), (7.5.17), and (7.5.18) we obtain (7.5.16). Then (7.5.15) and (7.5.16) give the required lower estimate for $p = 1$.

Let us now prove the Nikol'skii inequality.

Theorem 7.5.4. For any $t \in T(n)$, $n > 0$,
$$\|t\|_p \le C n^{1/q - 1/p} \|t\|_q, \qquad 1 \le q < p \le \infty.$$
Proof. Let first $p = \infty$. Then
$$t = t * V_n$$
and, by the Hölder inequality,
$$\|t\|_\infty \le \|t\|_q \|V_n\|_{q'},$$
which by (7.4.14) implies
$$\|t\|_\infty \le C \|t\|_q n^{1/q}. \tag{7.5.19}$$
Further, let $q < p < \infty$. Then by (7.1.6) we get
$$\|t\|_p \le \|t\|_q^{q/p} \|t\|_\infty^{1 - q/p}. \tag{7.5.20}$$
The conclusion of the theorem follows from (7.5.19) and (7.5.20).

The following statement is a direct corollary of Theorems 7.5.2 and 7.5.4.
Corollary 7.5.5 (Bernstein–Nikol'skii inequalities). For $t \in T(n)$ and any $r > 0$, $\alpha$, $1 \le q \le p \le \infty$,
$$\|t^{(r)}(x, \alpha)\|_p \le C(r) n^{r + 1/q - 1/p} \|t\|_q, \qquad n = 1, 2, \dots.$$
The set $T(n)$ of trigonometric polynomials is a linear space of dimension $2n + 1$. Each polynomial $t \in T(n)$ is uniquely defined by its Fourier coefficients $\{\hat{t}(k)\}_{|k| \le n}$ and, by the Parseval equality, we have
$$\|t\|_2^2 = \sum_{|k| \le n} |\hat{t}(k)|^2, \tag{7.5.21}$$
which means that $T(n)$, regarded as a subspace of $L_2$, is isomorphic to $\ell_2^{2n+1}$. The relation (7.4.6) shows that a similar isomorphism can be set up in another way, namely by mapping a polynomial $t \in T(n)$ to the vector $m(t) = \{t(x^l)\}_{l=0}^{2n}$ of its values at the points
$$x^l = 2\pi l/(2n + 1), \qquad l = 0, \dots, 2n.$$
The relation (7.4.6) gives $\|t\|_2 = (2n + 1)^{-1/2} \|m(t)\|_{\ell_2}$. The following statement is the Marcinkiewicz theorem.

Theorem 7.5.6. Let $1 < p < \infty$; then for $t \in T(n)$, $n > 0$,
$$C_1(p) \|t\|_p \le n^{-1/p} \|m(t)\|_{\ell_p} \le C_2(p) \|t\|_p.$$
Proof. We first prove a lemma.

Lemma 7.5.7. If $1 \le p \le \infty$, then ($n > 0$)
$$\Big\| \sum_{l=0}^{2n} a_l V_n(x - x^l) \Big\|_p \le C n^{1 - 1/p} \|a\|_{\ell_p}, \qquad a = (a_0, \dots, a_{2n}).$$
Proof. Let $V$ be the operator on $\ell_p^{2n+1}$ defined as
$$V(a) = \sum_{l=0}^{2n} a_l V_n(x - x^l).$$
It is obvious that (see (7.4.13))
$$\|V\|_{\ell_1^{2n+1} \to L_1} \le 3. \tag{7.5.22}$$
Using the estimate (see (7.4.12))
$$|V_n(x)| \le C \min\big( n, (n x^2)^{-1} \big)$$
it is not hard to prove that
$$\|V\|_{\ell_\infty^{2n+1} \to L_\infty} \le Cn. \tag{7.5.23}$$
From the relations (7.5.22) and (7.5.23), using the Riesz–Thorin theorem (see Theorem 7.3.2 above), we find that
$$\|V\|_{\ell_p^{2n+1} \to L_p} \le C n^{1 - 1/p},$$
which implies the conclusion of the lemma.
We continue the proof of the theorem. Let $S_n$ be the operator of taking the partial Fourier sum of order $n$. Using Theorem 7.4.1 we derive from Lemma 7.5.7 the upper estimate:
$$t(x) = (2n + 1)^{-1} \sum_{l=0}^{2n} t(x^l) D_n(x - x^l) = S_n\Big( (2n + 1)^{-1} \sum_{l=0}^{2n} t(x^l) V_n(x - x^l) \Big),$$
whence
$$\|t\|_p \le C(p) n^{-1/p} \|m(t)\|_{\ell_p}.$$
Now let us prove the lower estimate for $1 \le p < \infty$. With numbers $\varepsilon_l$, $|\varepsilon_l| \le 1$, chosen so that $t(x^l) \varepsilon_l = |t(x^l)|$, we have
$$\|m(t)\|_{\ell_p}^p = \sum_{l=0}^{2n} |t(x^l)|^p = \sum_{l=0}^{2n} t(x^l) \varepsilon_l |t(x^l)|^{p-1} = (2\pi)^{-1} \int_0^{2\pi} t(x) \sum_{l=0}^{2n} \varepsilon_l |t(x^l)|^{p-1} V_n(x - x^l)\,dx \le \|t\|_p \Big\| \sum_{l=0}^{2n} \varepsilon_l |t(x^l)|^{p-1} V_n(x - x^l) \Big\|_{p'}.$$
Using Lemma 7.5.7 we see that the last term is
$$\le C \|t\|_p n^{1/p} \|m(t)\|_{\ell_p}^{p-1},$$
which implies the required lower estimate. The theorem is proved.

We prove a statement which is analogous to Theorem 7.5.6 but, in contrast to it, includes the cases $p = 1$ and $p = \infty$.

Theorem 7.5.8. Let $x(l) = \pi l/(2n)$, $l = 1, \dots, 4n$, and $M(t) = \big( t(x(1)), \dots, t(x(4n)) \big)$. Then, for arbitrary $t \in T(n)$, $n > 0$, $1 \le p \le \infty$,
$$C_1 \|t\|_p \le n^{-1/p} \|M(t)\|_{\ell_p} \le C_2 \|t\|_p.$$
Proof. Analogously to Lemma 7.5.7 one can prove

Lemma 7.5.9. If $1 \le p \le \infty$, then ($n > 0$)
$$\Big\| \sum_{l=1}^{4n} a_l V_n\big( x - x(l) \big) \Big\|_p \le C n^{1 - 1/p} \|a\|_{\ell_p}.$$
Lemma 7.5.9 with $a = M(t)$ and the relation (7.4.15) imply the estimate
$$\|t\|_p \le C n^{-1/p} \|M(t)\|_{\ell_p}.$$
The lower estimate for $1 \le p < \infty$ can be proved in the same way as above for $m(t)$, replacing $x^l$ by $x(l)$. The lower estimate for $p = \infty$ is obvious.
Bibliography [1] S. P. Baiborodov, Approximation of functions of several variables by de la Vall´ee Poussin rectangular sums, Math. Notes 29 (1981), 362–372. [2] B. M. Baishanski, Approximation by polynomials of given length, Illinois J. Math. 27 (1983), 449–458. [3] A. R. Barron, Universal approximation bounds for superposition of n sigmoidal functions, IEEE Transactions on Information Theory 39 (1993), 930– 945. [4] N. K. Bary, Trigonometric Series, Nauka, Moscow (in Russian), 1961; English transl. by Pergamon Press, Oxford, 1964. [5] A. S. Belov, On some estimates of the trigonometric polynomials in arbitrary norms, Abstracts of the 11th Saratov Winter School, 2002, 16–17. [6] J. Bourgain, A remark on the behavior of Lp -multipliers and the range of operators acting on Lp -spaces, Israel J. Math. 79 (1992), 193–206. [7] J. W. S. Cassels, An Introduction to Diophantine Approximation, Cambridge Tracts in Mathematics and Mathematical Physics, Cambridge, 1957. [8] A. Cohen, R. A. DeVore, and R. Hochmuth, Restricted nonlinear approximation, Constructive Approx. 16 (2000), 85–113. [9] A. C´ ordoba and P. Fern´andez, Convergence and divergence of decreasing rearranged Fourier series, SIAM, I. Math. Anal. 29 (1998), 1129–1139. [10] Dai Feng, Approximation of real smooth functions on the unit sphere S d−1 , Ph.D. Thesis, Beijing Normal University, 2002, 1–147. [11] R. A. DeVore, Nonlinear approximation, Acta Numerica 7 (1998), 51–150. [12] R. A. DeVore, B. Jawerth, and V. Popov, Compression of wavelet decompositions, Amer. J. Math. 114 (1992), 737–785. [13] R. A. DeVore, S. V. Konyagin, and V. N. Temlyakov, Hyperbolic wavelet approximation, Constr. Approx. 14 (1998), 1–26. [14] R. A. DeVore and V. A. Popov, Interpolation spaces and non-linear approximation, in: Lecture Notes in Mathematics 1302 (1988), 191–205 . [15] R. A. DeVore and V. N. Temlyakov, Nonlinear approximation by trigonometric sums, J. Fourier Anal. Appl. 2 (1995), 29–48.
[16] R. A. DeVore and V. N. Temlyakov, Some remarks on Greedy Algorithms, Adv. Comput. Math. 5 (1996), 173–187.
[17] S. J. Dilworth, N. J. Kalton, and D. Kutzarova, On the existence of almost greedy bases in Banach spaces, Studia Math. 158 (2003), 67–101.
[18] S. J. Dilworth, N. J. Kalton, D. Kutzarova, and V. N. Temlyakov, The thresholding greedy algorithm, greedy bases, and duality, Constr. Approx. 19 (2003), 575–597.
[19] S. J. Dilworth, D. Kutzarova, and V. N. Temlyakov, Convergence of some greedy algorithms in Banach spaces, J. Fourier Anal. Appl. 8 (2002), 489–505.
[20] S. J. Dilworth and D. Mitra, A conditional quasi-greedy basis of $\ell_1$, Studia Math. 144 (2001), no. 1, 95–100.
[21] S. J. Dilworth, M. Soto-Bajo, and V. N. Temlyakov, Quasi-greedy bases and Lebesgue-type inequalities, Studia Math. 211 (2012), 41–69.
[22] M. Donahue, L. Gurvits, C. Darken, and E. Sontag, Rate of convex approximation in non-Hilbert spaces, Constr. Approx. 13 (1997), 187–220.
[23] V. V. Dubinin, Greedy Algorithms and Applications, Ph.D. Thesis, University of South Carolina, 1997.
[24] T. Figiel, W. B. Johnson, and G. Schechtman, Factorization of natural embeddings of $\ell_p^n$ into $L_r$. I, Studia Math. 89 (1988), 79–103.
[25] M. Frazier and B. Jawerth, A discrete transform and decomposition of distribution spaces, J. Funct. Anal. 93 (1990), 34–170.
[26] M. Ganichev and N. J. Kalton, Convergence of the Weak Dual Greedy Algorithm in $L_p$-spaces, J. Approx. Theory 124 (2003), 89–95.
[27] V. F. Gaposhkin, On unconditional bases in $L_p$-spaces, Uspekhi Mat. Nauk 13 (1958), 179–184.
[28] G. Garrigós, E. Hernández, and T. Oikhberg, Lebesgue-type inequalities for quasi-greedy bases, Constr. Approx. 38 (2013), 447–470.
[29] E. D. Gluskin, Extremal properties of orthogonal parallelepipeds and their application to the geometry of Banach spaces, Math. USSR Sbornik 64 (1989), 85–96.
[30] R. Gribonval and M. Nielsen, Some remarks on non-linear approximation with Schauder bases, East J. Approx. 7 (2001), 267–285.
[31] A. Grothendieck, Résumé de la théorie métrique des produits tensoriels topologiques, Bol. Soc. Mat. São Paulo 8 (1953/1956), 1–79.
[32] P. Habala, P. Hájek, and V. Zizler, Introduction to Banach Spaces [I], Matfyzpress, Univ. Karlovy, 1996.
[33] E. Hernández, Lebesgue-type inequalities for quasi-greedy bases, arXiv:1111.0460, 2011.
[34] R. S. Ismagilov, Widths of sets in normed linear spaces and the approximation of functions by trigonometric polynomials, Uspekhi Mat. Nauk 29 (1974), 161–178; English transl. in Russian Math. Surveys 29 (1974).
[35] R. C. James, Bases and reflexivity of Banach spaces, Ann. of Math. 52 (1950), 518–527.
[36] L. Jones, A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training, Ann. Statist. 20 (1992), 608–613.
[37] M. I. Kadec and A. Pełczyński, Bases, lacunary sequences, and complemented subspaces in the spaces $L_p$, Studia Math. 21 (1962), 161–176.
[38] J.-P. Kahane, Some Random Series of Functions, Cambridge University Press, Cambridge, 1985.
[39] A. Kamont and V. N. Temlyakov, Greedy approximation and the multivariate Haar system, Studia Math. 161 (2004), 199–223.
[40] B. S. Kashin, Widths of certain finite-dimensional sets and classes of smooth functions, Izv. Akad. Nauk SSSR, Ser. Mat. 41 (1977), 334–351.
[41] B. S. Kashin and A. A. Saakyan, Orthogonal Series, Amer. Math. Soc., Providence, R.I., 1989.
[42] G. Kerkyacharian, D. Picard, and V. N. Temlyakov, Some inequalities for the tensor product of greedy bases and weight-greedy bases, East J. Approx. 12 (2006), 103–118.
[43] S. V. Konyagin and V. N. Temlyakov, A remark on greedy approximation in Banach spaces, East J. Approx. 5 (1999), 365–379.
[44] S. V. Konyagin and V. N. Temlyakov, Greedy approximation with regard to bases and general minimal systems, Serdica Math. J. 28 (2002), 305–328.
[45] S. V. Konyagin and V. N. Temlyakov, Convergence of greedy approximation II. The trigonometric system, Studia Math. 159(2) (2003), 161–184.
[46] S. V. Konyagin and V. N. Temlyakov, Convergence of greedy approximation for the trigonometric system, Anal. Math. 31 (2005), 85–115.
[47] N. P. Korneichuk, Extremal Problems of Approximation Theory, Nauka, Moscow, 1976.
[48] T. W. Körner, Divergence of decreasing rearranged Fourier series, Ann. of Math. 144 (1996), 167–180.
[49] T. W. Körner, Decreasing rearranged Fourier series, J. Fourier Anal. Appl. 5 (1999), 1–19.
[50] S. Kostyukovsky and A. Olevskii, Note on decreasing rearrangement of Fourier series, J. Appl. Anal. 3 (1997), 137–142.
[51] H. Lebesgue, Sur les intégrales singulières, Ann. Fac. Sci. Univ. Toulouse (3) 1 (1909), 25–117.
[52] J. Lindenstrauss and L. Tzafriri, Classical Banach Spaces, Springer-Verlag, Berlin, 1977.
[53] E. D. Livshitz, Convergence of greedy algorithms in Banach spaces, Math. Notes 73 (2003), 342–368.
[54] E. D. Livshitz, On lower estimates of rate of convergence of greedy algorithms, Izv. RAN, Ser. Mat. 73 (2009), 125–144.
[55] E. D. Livshitz and V. N. Temlyakov, Two lower estimates in Greedy Approximation, Constr. Approx. 19 (2003), 509–523.
[56] V. E. Maiorov, Trigonometric diameters of the Sobolev classes $W_p^r$ in the space $L_q$, Math. Notes 40 (1986), 590–597.
[57] M. Nielsen, An example of an almost greedy uniformly bounded orthonormal basis for $L_p(0, 1)$, J. Approx. Theory 149 (2007), 188–192.
[58] A. M. Olevskii, Fourier Series with Respect to General Orthonormal Systems, Springer-Verlag, Berlin, 1975.
[59] P. Oswald, Greedy algorithms and best m-term approximation with respect to biorthogonal systems, J. Fourier Anal. Appl. 7 (2001), 325–341.
[60] P. Petrushev, Direct and converse theorems for spline and rational approximation and Besov spaces, in: Lecture Notes in Mathematics 1302 (1988), 363–377.
[61] G. Pisier, Factorization of Linear Operators and Geometry of Banach Spaces, CBMS 60, Amer. Math. Soc., Providence, R.I., 1986.
[62] M. M. Popov, A property of convex basic sequences in $L_1$, Methods Funct. Anal. Topology 11 (2005), 409–416.
[63] E. Schmidt, Zur Theorie der linearen und nichtlinearen Integralgleichungen I, Math. Ann. 63 (1906), 433–476.
[64] S. J. Szarek, Bases and biorthogonal systems in the spaces $C$ and $L_1$, Ark. Mat. 17 (1979), 255–271.
[65] T. Tao, On the almost-everywhere convergence of wavelet summation methods, ACHA 3 (1996), 384–387.
[66] S. A. Telyakovskii, Two theorems on the approximation of functions by algebraic polynomials, Mat. Sbornik 70 (1966), 252–265.
[67] V. N. Temlyakov, Approximation of functions with bounded mixed derivative, Proc. Steklov Institute, 1989, Issue 1.
[68] V. N. Temlyakov, Greedy algorithm and m-term trigonometric approximation, Constr. Approx. 14 (1998), 569–587.
[69] V. N. Temlyakov, The best m-term approximation and Greedy Algorithms, Adv. Comput. Math. 8 (1998), 249–265.
[70] V. N. Temlyakov, Nonlinear m-term approximation with regard to the multivariate Haar system, East J. Approx. 4 (1998), 87–106.
[71] V. N. Temlyakov, Greedy algorithms and m-term approximation with regard to redundant dictionaries, J. Approx. Theory 98 (1999), 117–145.
[72] V. N. Temlyakov, Weak greedy algorithms, Adv. Comput. Math. 12 (2000), 213–227.
[73] V. N. Temlyakov, Greedy algorithms in Banach spaces, Adv. Comput. Math. 14 (2001), 277–292.
[74] V. N. Temlyakov, Nonlinear approximation with regard to bases, in: Approximation Theory X, Vanderbilt University Press, Nashville, TN, 2002, 373–402.
[75] V. N. Temlyakov, Cubature formulas and related questions, J. Complexity 19 (2003), 352–391.
[76] V. N. Temlyakov, Nonlinear method of approximation, Found. Comput. Math. 3 (2003), 33–107.
[77] V. N. Temlyakov, Greedy type algorithms in Banach spaces and applications, Constr. Approx. 21 (2005), 257–292.
[78] V. N. Temlyakov, Greedy expansions in Banach spaces, Adv. Comput. Math. 26 (2007), 431–449.
[79] V. N. Temlyakov, Greedy algorithms with prescribed coefficients, J. Fourier Anal. Appl. 13 (2007), 71–86.
[80] V. N. Temlyakov, Relaxation in greedy approximation, Constr. Approx. 28 (2008), 1–25.
[81] V. N. Temlyakov, Greedy approximation, Acta Numerica 17 (2008), 235–409.
[82] V. N. Temlyakov, Greedy Approximation, Cambridge University Press, 2011.
[83] V. N. Temlyakov, M. Yang, and P. Ye, Greedy approximation with regard to non-greedy bases, Adv. Comput. Math. 34 (2011), 319–337.
[84] V. N. Temlyakov, M. Yang, and P. Ye, Lebesgue-type inequalities for greedy approximation with respect to quasi-greedy bases, East J. Approx. 17 (2011), 127–138.
[85] P. Wojtaszczyk, Greedy algorithm for general biorthogonal systems, J. Approx. Theory 107 (2000), 293–314.
[86] A. Zygmund, Trigonometric Series, Cambridge University Press, Cambridge, 1959.