Prime Numbers 101
H. Maier W.P. Schleich
Prime Numbers 101 A Primer on Number Theory for Pedestrains
Helmut Maier Ab...
39 downloads
912 Views
5MB Size
Report
This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form
Prime Numbers 101
H. Maier W.P. Schleich
Prime Numbers 101 A Primer on Number Theory for Pedestrains
Helmut Maier Abteilung f¨ur Zahlentheorie und Wahrscheinlichkeitstheorie Universit¨at Ulm
Wolfgang P. Schleich Abteilung f¨ur Quantenphysik Universit¨at Ulm
Note on the cover The cover page expresses to close connection between the prime numbers and the nontrivial zeros of the Riemann ζ-function. Here we mark the primes of the first two hundred natural numbers in red. The Riemann ζ-function is depicted in its absolute value for complex arguments. The zeros along the critical line and the pole at s = 1 are clearly visible as dips and a spike, respectively.
Preface Traditionally, introductory courses at American universities carry the number 101, “oneo-one”. This number indicates that the lectures are on an elementary level and should be readily accessible to a beginner student. Following this tradition and intention we have chosen the title Prime Numbers 101 for this booklet. Indeed, it is geared towards the non-specialist who is interested in getting an understanding of number theory and, in particular, learning the fascinating connections between the Riemann ζ-function and the distribution of primes. On purpose we have focused on a few selected topics, hence, these notes cannot and should not replace full-fledged treatises on this topic. However, we like to think that Prime Numbers 101 awakens the reader’s interest and opens the door to this rich field. Moreover, we certainly enjoy the play on words hidden in the title, since 101 is a prime number. With the same sense of humor the community of mathematicians did not celebrate the centenary of the prime number theorem but rather its 101st birthday. What is the need for an introductory book on this topic oriented towards computer scientists, electrical engineers and physicists? To paraphrase Georges Clemenceau who once had said “War! It is too serious a matter to leave to the military.” we claim that the results of number theory and, in particular on prime numbers, are too serious to leave to the mathematicians. During the last decade this originally rather abstract field of mathematics has had enormous impact on disciplines such as computing, communication and physics. Three examples testify in support of this statement in a striking way: The security of codes needed, for example, in bank transfers or in communication exchange relies on the fact that the problem of factoring a large integer into two primes is an exponentially difficult problem. Therefore no classical computer, no matter how fast it would be, could solve this problem in a reasonable time. However, the Shor algorithm demonstrated that a quantum computer could perform this task in a polynomial time. Likewise, the proof of this revolutionary algorithm requires a good deal of number theory. In the same spirit the newly emerging field of quantum information requires a certain familiarity in at least two of the three fields computing, electrical engineering or physics. Moreover, the field of quantum chaos or in the words of Sir Michael Berry, “quantum chaology” relies heavily on number theory. Indeed, there is a strong connection between the statistics of the energy eigenvalues of the Schr¨odinger equation corresponding to a billiard and that of the zeros of the Riemann ζ-function. These three examples clearly demonstrate the need for an interdisciplinary approach. Over the last three years we have already been collaborating in the Network Quantum
v
Information Highway A8 sponsored by the Landesstiftung Baden-W¨ urttemberg and the Ministerium f¨ ur Wissenschaft, Forschung und Kunst, Baden-W¨ urttemberg on questions of number theory and quantum mechanics, in particular Gauß sums. Moreover, in most stimulating discussions with Gerhard Paulus (Texas A&M University) on this topic, the idea for a crash course on prime numbers was born. We both agreed that the mathematician Helmut Maier, working in the area of number theory, would give a series of lectures. The physicist Wolfgang Peter Schleich would take notes, work out the details and prepare a first draft of Prime Numbers 101. Then we would agree on a final version. The guiding principle would not be mathematical rigor but transparency with the goal to bring out most clearly the connections between the different topics. Prime Numbers 101 is the result of this algorithm. To some degree the selection of the topics for Prime Numbers 101 is arbitrary. Hopefully our choice conveys the flavor of number theory while still keeping the audience of non-experts. We, therefore, concentrate on a few chestnuts of this field, only. Figure 0.1 displays the flow of topics discussed in Prime Numbers 101. Sieve of Eratosthenes ↓ M¨obius function ↓ multiplicative convolution ↓ Dirichlet series ↓ Riemann ζ-function ↓ functional equation ← modular forms ↓ zeros of ζ ↓ prime number theorem Figure 0.1.: Flow chart of themes in this booklet At the very heart of our discussions rests the proof of the prime number theorem. We motivate this central result of number theory by introducing in Chapter 1 the sieve of Eratosthenes. In Chapter 2 we then use the inclusion-exclusion principle of combinatorics to derive an expression for the number of primes surviving the sieve. In this formula we meet for the first time the M¨obius Function µ which plays a powerful role in number theory and which leads us directly to the Riemann ζ-function. Indeed, we arrive in Chapter 3 at this function in a rather natural and straight forward way by transforming the multiplicative convolution of two arithmetic functions with the help of generating functions to a simple multiplication. The generating function of a multiplicative convolution is a Dirichlet series. For the special cases of the unity function, that is
vi
1(n) = 1 for all n ∈ N and the M¨obius function µ, the corresponding Dirichlet series are ζ and ζ −1 respectively. We also obtain the logarithmic derivative and the Euler product representation of ζ. Both expressions contain all prime numbers and make in this way the first contact between ζ and primes. We dedicate Chapter 4 to the extension of the Riemann ζ-function to the whole complex plane. Indeed, the Dirichlet series of ζ is only convergent for the arguments s with Re s > 1. However, with the help of the integral representation of ζ as the Mellin transform of the theta series we can extend the domain to Re s > 0. Moreover, the functional equation for the products of ζ and another function finally allows us to consider ζ on the complete complex plane. Finally we discuss the zeros of ζ and summarize the Riemann hypothesis. In Chapter 5 we concentrate on the proof of the prime number theorem. The crucial tool in our approach is the representation of the prime distribution as an integral in complex space with the integrand containing the logarithmic derivative of ζ. According to the Cauchy integral theorem the leading contributions originate from the zeros of ζ. In this way we establish the connection between π and the zeros of ζ. Chapter 6 addresses the relation between the Riemann ζ-function and Gauß sums which are the essential ingredients in the Talbot effect of wave optics. This connection is particularly interesting since it is possible to factor numbers using the Talbot effect. Therefore there might even exist a realization of ζ using the Talbot effect. We conclude by summarizing in Chapter 7 our insights gained during this stroll through prime numbers. Needless to say our goal in Prime Numbers 101 is not to summarize and present the latest results obtained in the area. Its sole purpose is to teach a field usually not contained in the standard curriculum of a computer scientist, an electrical engineer or a physicist. In order to make the material more appealing to this clientele we have avoided the standard routine of definition, lemma and theorem so familiar in mathematics. Moreover, we have spent a great effort in motivating the individual results. We try not to subscribe to the motto of the Bellman in The Hunting of the Snark of Lewis Carroll: “Just the place for a Snark! I have said it thrice: What I tell you three times is true.” Although we keep proofs to a minimum we attempt to explain and motivate the individual statements. In the same spirit we have added to each chapter a section with problems. They summarize important aspects of the material presented in the individual chapters. Moreover, they add more information and allow the reader to test his/her understanding of the matter. In order to keep the book self-contained we have included the solutions to these problems. Our notation is adjusted to the non-mathematician market, for example, we denote the natural logarithm by ln rather than log. Moreover, we write the integrator dx before the integrand rather than after it. Finally, we freely interchange summation and integration
vii
without proof. However, we do mention this step explicitly every time we apply it in a calculation. We gratefully acknowledge the support of the Landesstiftung Baden-W¨ urttemberg, the Ministerium f¨ ur Wissenschaft, Forschung und Kunst Baden-W¨ urttemberg within the framework of the Quantum Information Highway A8 as well as the Promotionskolleg Mathematische Analyse von Evolution, Information und Komplexit¨at. In particular, we express our sincere thanks to Prof. Dr. Wolfgang Arendt for his continuous encouragement which made this project possible. Last not least we thank Barbara Casel, Wilma Fiebelkorn and Dipl. Phys. Oliver Crasser for their hard work and unique dedication in transforming our notes and sketches into Prime Numbers 101. Many, many thanks! Of course, all errors or misprints are our fault and cannot be blamed on them. This first draft of Prime Numbers 101 was completed by Wolfgang P. Schleich during the summer vacation in Texas, where Kathy kindly provided the space necessary to work continuously on this project. We hope that the community will welcome Prime Numbers 101 and find it beneficial. May everyone have as much fun with this booklet as we had preparing it.
Ulm, September 2005 Helmut Maier Wolfgang P. Schleich
viii
Contents
1. Where Are the Primes? 1.1. Ante . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2. Sieve of Eratosthenes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3. Prime Function π . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2. How to Count Primes 2.1. Sieve of Eratosthenes as a Set Problem . . . . . . 2.2. Number of Elements in M0 . . . . . . . . . . . . . √ 2.3. A Compact Formula for π(x) − π( x) . . . . . . 2.4. A Second Glimpse of the Prime Number Theorem 3. M¨ obius Function µ 3.1. Multiplicative Arithmetic Functions . . . . 3.2. Convolution of Two Arithmetic Functions 3.3. M¨obius Inversion Formula . . . . . . . . . 3.4. Generating Functions and Dirichlet Series 3.5. Riemann ζ-Function . . . . . . . . . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
1 2 4 6
. . . .
13 13 16 18 20
. . . . .
25 26 28 31 31 32
4. Analytical Properties of ζ 41 4.1. Extension of ζ to Whole Complex Plane . . . . . . . . . . . . . . . . . . 42 4.2. Zeros of ζ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 4.3. Mellin Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 5. Prime Number Theorem 57 5.1. New Representation of Prime Function . . . . . . . . . . . . . . . . . . . 57 5.2. New Representation of Chebyshev function . . . . . . . . . . . . . . . . . 59 5.3. Explicit Asymptotic Expression for π . . . . . . . . . . . . . . . . . . . . 62 6. Gauß Sums 65 6.1. Theta Functions in Various Forms . . . . . . . . . . . . . . . . . . . . . . 66 6.2. Emergence of Gauß Sums . . . . . . . . . . . . . . . . . . . . . . . . . . 67 7. In a Nutshell
69
A. Riemann’s Original Paper
73
ix
B. Inclusion-Exclusion Principle 85 B.1. Two Different but Equivalent Forms . . . . . . . . . . . . . . . . . . . . . 85 B.2. Sieve of Brun . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 C. Improved Mertens’ formula
89
D. General Convolutions 91 D.1. Convolution for Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 D.2. Continuous Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 D.3. Multiplicative Convolution . . . . . . . . . . . . . . . . . . . . . . . . . . 93 E. Functional Equations 95 E.1. Poisson Summation Formula . . . . . . . . . . . . . . . . . . . . . . . . . 95 E.2. Functional Equation for ϑ . . . . . . . . . . . . . . . . . . . . . . . . . . 96 F. Evaluation of Gauß Sums 97 F.1. Gauß Sum G(1, q) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 F.2. Gauß Sum G(a, q) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 F.3. Congruences and Quadratic Residues . . . . . . . . . . . . . . . . . . . . 102 G. Further Reading
103
H. Solutions to the Problems
107
x
1. Where Are the Primes? Natural numbers play a very special role in mathematics. This fact stands out most clearly in the quote by Hilbert1 : “The natural numbers are the work of God, the rest is due to man.” However, an even more elitar role is played by a subset of God’s numbers, namely by the primes. A prime is defined as a positive integer whose only divisors are 1 or the number itself. Already Euclid2 about 300 BC showed that there is an infinite amount of primes contained in the sea of natural numbers. But how to find them? 3 A possible search algorithm for primes is the sieve of Eratosthenes , which seeks to √ find all primes n ≤ x. It starts from all primes smaller than x and eliminates from all integers smaller or equal to x the integer multiples of these primes. The remaining integers are the primes. What is the number π(x) of primes smaller or equal to x? Gauß4 was the first to give an answer by conjecturing the prime number theorem lim
x→∞
π(x) x ln x
= 1.
(1.1)
Throughout these notes we use the natural logarithm ln, that it is the logarithm to the base e, that is to the Euler5 number, rather than log. However, a rigorous proof of this important conjecture had to wait till 1896 when Hadamard6 and de la Vall´ee-Poisson7 independently from each other used techniques of complex analysis to verify Eq. (1.1). The main goal of the present notes is to rederive this result. Our starting point is the Riemann8 ζ-function. We show that its non-trivial zeros determine π. For this purpose we have to study the analytical properties of this function in great detail. We start on our journey into the mysterious world of prime numbers by providing in this chapter a concise summary of the essential ingredients of number theory. We then 1
David Hilbert (1862 – 1943), German mathematician, Professor in G¨ottingen Euclid (c. 350 – 300 BC), Greek mathematician in Alexandria 3 Eratosthenes (c. 276 – 194 BC), Greek mathematician and astronomer in Alexandria 4 Carl Friedrich Gauß (1777 – 1855), German mathematician and astronomer, Professor and director of the observatory of G¨ ottingen 5 Leonhard Euler (1707 – 1783), Swiss-born mathematician and physicist, who worked mainly in St. Petersburg and in Berlin 6 Jacques Salomon Hadamard (1865 – 1963), French mathematician 7 Charles de la Vall´ee-Poisson (1866 – 1962), Belgian mathematician 8 Georg Friedrich Bernhard Riemann (1826 – 1866), German mathematician, Professor in G¨ottingen 2
1
discuss the sieve of Eratosthenes and present a crucial connection between π and another important number theoretic function. This relation is vital in the proof of the prime number theorem Eq. (1.1). Here we meet for the first time the logarithmic integral. However, we can not identify it as the leading contribution.
1.1. Ante In a poker game before the deal one has to build the pot. This stake is sometimes referred to as “ante”. In the same spirit we have to put up our stake before we can play our game of prime numbers, that is we have to summarize several elementary concepts of number theory. We dedicate this section to a review of some essential ingredients of this field. For the sake of brevity we do not present proofs for every statement but only for a few ones.
1.1.1. Definition of Primes and Famous Conjectures At the center of our interest are a sub-set of the positive natural numbers n = 1, 2, 3, . . . namely primes. A prime p is a positive integer other than 1 whose only positive divisor is 1 and p. The first few primes are 2, 3, 5, 7, 11, . . . We emphasize that 1 is not considered a prime otherwise some theorems such as the Fundamental Theorem of Arithmetic would be unnecessarily complicated. Moreover, primes cannot be even integers, since in this case 2 would be a divisor. An exception is of course 2 itself. Not only is it the only even prime it is also the first prime. There exist many famous conjectures about prime numbers. For example in 1742 Goldbach9 claimed that every even number larger than 2 is the sum of two primes. Indeed, we can verify this statement for small numbers such as 4 = 2 + 2, 6 = 3 + 3, 8 = 5 + 3, 10 = 7 + 3 = 5 + 5 . . . However, it is not clear yet if the Goldbach conjecture is correct for every even number. He also conjectured that every odd number is the sum of 3 primes. For this claim there exists a proof. Another unsolved problem of primes is the Twin Prime problem: Is there an infinite number of primes p such that p + 2 is also a prime? Examples of twin primes are 3 and 5, 5 and 7, 11 and 13, . . . Also for this question no general answer exists. We finally turn to the Mersenne10 primes which are of the form 2p − 1 with p being a prime. The integers 3 = 22 − 1, 7 = 23 − 1, 31 = 25 − 1 are examples of Mersenne primes. However, not all such numbers are primes. Indeed, 267 − 1 or 2257 − 1 are not primes. It is not clear whether there are infinitely many of them. In November 2001 the Canadian student Michael Cameron used his PC to prove the primality of 213,466,917 − 1, a number with over 4 million digits. 9
Christian Goldbach (1690 – 1764), Prussian-born number theorist and analyst, Professor of Mathematics at Russian Imperial Academy 10 Marin Mersenne (1588 –1648), French monk, theologian, philosopher and number theorist
2
1.1.2. Fundamental Theorem of Arithmetic An integer n is either a prime or a non-prime, that is a composite number. A composite number n = k · m is the product of two integers where k 6= m 6= 1 6= n. There exists a notation in mathematics for the fact that an integer m is a divisor of a number n = k · m. The abbreviation m|n or k|n denote the fact that m divides n or k divides n. Non-prime numbers such as 4, 6, 8, 12, 15, . . . enjoy such divisors. Furthermore the examples 4 = 2 · 2 = 22 , 6 = 2 · 3, 8 = 2 · 2 · 2 = 23 , 12 = 4 · 3 = 22 · 3, 15 = 3 · 5, . . . suggest that we can represent every non-prime number by products of powers of primes. This conjecture is indeed correct as summarized by the Fundamental Theorem of Arithmetic: Every natural number n can be represented as a finite product n = pα1 1 · · · · · pαr r of positive integer powers α1 , . . . , αr of primes p1 , . . . , pr . This decomposition is unique. Throughout these notes we shall make heavy use of this theorem. It is the crucial ingredient in many derivations. However, here we do not provide a proof for it but refer to the literature. Moreover, we emphasize that although specific examples such as the ones above illustrate the statement in a striking way, a rigorous proof of the Fundamental Theorem of Arithmetic is non-trivial.
1.1.3. There Are Infinitely Many Primes Already Euclid showed that there exists an infinite amount of primes. This property lim π(x) = ∞
x→∞
of the prime function π is our first result of these notes. Since the proof of this statement is rather elegant and simple we present it here. Euclid assumes that there is a finite number r of primes p1 , . . . , pr . In this case we can multiply them all together and add 1 which yields an integer m ≡ p1 · · · · · pr + 1 which cannot be divided by any of the primes p1 , . . . , pr . Hence, by assuming that there is only a finite number of primes we have violated the Fundamental Theorem of Arithmetic. Indeed, the integer m does not have a prime number decomposition. The only possible way out of this dilemma is to conclude that there are infinitely many primes.
3
1.2. Sieve of Eratosthenes According to Euclid there exist infinitely many prime numbers. But how to find them? Eratosthenes answered this question by suggesting a rather ingenious algorithm which finds all primes between 2 and a given integer x. In the present section we introduce this method and illustrate it using a specific example.
1.2.1. Algorithm √ x The approach of Eratosthenes relies on the assumption that all primes smaller than √ are known already. The choice of x is rather arbitrary. Its only√justification is to start from a domain for which all the√primes are known and obviously x ≪ x. In case not √ all primes are known for 2 ≤ n ≤ x, we might try for example the domain 2 ≤ x ≤ 4 x. The algorithm uses the fact that a prime p has only 1 or p itself as a divisor, whereas a non-prime integer is a product n = pα1 1 · · · · · pαr r of integer powers α1√ , . . . , αr of primes p1 , . . . , pr . Hence, when we consider an integer n from the domain x ≤ n ≤ x to be investigated n is either a prime or it can be represented as a product αj −1 α1 αr n = pj · p1 · · · · · pj · · · · · pr = pj · d
of a prime pj and√integer d. Hence, when we take integer multiples d of all known primes p with 2 ≤ p ≤ x such that d · p ≤ x we find all non-prime numbers for n ≤ x. The remaining numbers must therefore be the primes. Thus the algorithm of Eratosthenes works like a sieve. It eliminates all the non-prime numbers and only retains the primes.
1.2.2. Example In Table 1.1 we illustrate the sieve of Eratosthenes for x = 50. We start √ by listing all positive integers from 1 to x = 50. The prime numbers p smaller than 50 = 7.071 are p = 2, 3, 5 and 7. The integer 1 is not a prime by definition and n = 4 = 2 · 2 and n = 6 = 2 · 3 are not primes either. Next we eliminate all integer multiples of the primes 2, 3, 5 and 7 from this list. The remaining integers are the desired prime numbers. At this stage we note that some integers are crossed out only once, but many of them several times. In order to bring this fact out most clearly we have used in Table 1.1 different signs to eliminate the multiples of the different primes.
1.2.3. Different Classes of Integers At this stage it is worthwhile to pause for a minute and understand why composite integers are discarded from the list, that is sieved out. Of course the deeper reason
4
1 11 . / 21 31
41
2 . 12 22 32 / . 42
3 13 23 . 33 43
4
/ 14 . 24 34 44
5 . —– 15
—– 25 / —– 35 . —– 45
. 6
16 26 . 36 46
7 17 . 27
37
47
8 . 18 / 28 38 . 48
. 10 9 —– 19 —– 20 . 29 —– 30 . 40 39 —– / 50 49 —–
Table 1.1.: Sieve of Eratosthenes. The goal is to determine all primes smaller than x √ = 50. We start from the primes 2, 3, 5, and 7 (underlined) smaller than 50 = 7.071 and do not consider 1 to be a prime. The integers 4 = 2 · 2 and 6 = 2 · 3 are not-prime and are eliminated from the onset as indicated by the circle. We now cross out all integer multiples of 2 (vertical line), 3 (diagonal from right to left), 5 (horizontal line) and 7 (diagonal from left to right) which√yields the 11 primes 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, and 47 between x and x. is the representation of an integer n as a product of powers of primes as predicted by the Fundamental Theorem of Arithmetic. However, it is instructive to illustrate this representation using specific examples. Moreover, this discussion not only demonstrates why some integers are crossed out more than once but it makes us more familiar with the use of the elementary building blocks of integers, that is with the primes. The sieve of Eratosthenes retains primes. We note, however, that squares of primes such as n = 9 = 32 or n = 25 = 52 , or higher powers of primes such as n = 16 = 24 are not primes anymore. In Table 1.1 these integers are crossed only once. This particular class of integers will reappear when we calculate in Sec. 3.5.3 the logarithmic derivative of the Riemann ζ-function. A slightly different class are integers such as 12 = 2 · 2 · 3 = 22 · 3 = 3 · 22 or 48 = 2 · 2 · 2 · 2 · 3 = 24 · 3 which consist of powers of a prime and another integer. Composite numbers of the form n = d · pα = pα · d = d · pα−1 p
are crossed out twice in Table 1.1 since we can interpret them either as a multiple integer of d or of p. The most frequent case is when the n consists of products of primes only and no powers appear. The integer 14 = 2 · 7 = 7 · 2 is an example for this class. In the sieve of Eratosthenes n = 14 is crossed out as a multiple of 2 but also as a multiple of 7. This situation occurs rather often since multiplication of two integers is multiplicative. Moreover, there can be an even or odd number of distinct primes, for example 10 = 2 · 5 or 30 = 2 · 3 · 5.
5
This classification of integers eliminated by the sieve of Eratosthenes is important when we in Chapter 2 derive estimates for the numbers of surviving primes using the inclusion-exclusion principle. Moreover, it manifests itself in the definition of the M¨obius11 function µ discussed in great detail in Chapter 3.
1.3. Prime Function π In the preceding section we have introduced the sieve of Eratosthenes and have illustrated its workings for the example of x = 50. In particular we have found that there exist 4 + 11 = 15 primes which are smaller than x = 50. Our goal for the remaining part of these notes is to derive an analytical formula for the number of primes smaller or equal to x. We have denoted this function in the introduction by π = π(x) and now briefly discuss its behavior. In particular we draw attention to the asymptotic formulae conjectured by Gauß. In order to lay the ground work for the derivation of these formulae in the following chapters we also establish a connection between π and another important function of number theory.
1.3.1. Asymptotic Behavior We define π = π(x) as the number of elements of the set of primes p ≤ x denoted by absolute values, that is π(x) ≡ |{p ≤ x}| ≡
number of primes ≤ x
An alternative but equivalent representation of π is as a sum of ones over all primes smaller than x, that is X π(x) ≡ (1.2) p≤x
There exists a nice representation of the prime function π as a Stieltjes12 integral. In complete analogy to the Riemann integral the Stieltjes integral consists of an integrand and an integrator. In the Riemann integral the integrator is dx. However, in the Stieltjes integral the integrator is a function. In the case of the prime function π the integrator is X ϕ(y) ≡ ln p. p≤y
Whenever the integrator moves over a prime p it jumps by ln p. We find the number of primes by dividing out these jumps but still counting the number of jumps. This idea 11
August Ferdinand M¨ obius (1790 – 1868), German geometer, topologist, Professor of astronomy in Leipzig 12 Thomas Jan Stieltjes (1856 – 1894), Dutch-born naturalized French analyst and number theorist
6
600000
15 500000 400000
12.5
300000
200000
100000
10 6
2·10
6
4·10
6
6·10
6
7
8·10
1·10
7.5 15
10
5 5
2.5 10
10
20
20
30
30
40
40
50
50
Figure 1.1.: Prime function π(x) for x ≤ 50. In the inset we show the same function on a much larger scale namely for x ≤ 107 and compare its asymptotic behavior to x/ ln x (dashed line) and to the logarithmic integral (dotted line). leads us immediately to the Stieltjes integral representation π(x) =
Zx
dϕ(y)
1 ln y
(1.3)
2
of π. Here the integration starts slightly to the left of 2 as to guarantee that 2 is the first prime which is counted. From Table 1.1 we have already found the value π(50) = 15. However, we can also use this table to find π for all the intermediate values of x as depicted in Fig. 1.1. Here is important to realize that p = 1 is not considered a prime number which implies π(1) = 0. The first prime number is p = 2. Hence, π(x) vanishes for x < 2 and then shortly before p = 2 jumps to 1. We note that for increasing x there are long periods without primes. In the inset of Fig. 1.1 we show π(x) over a much larger range, that is for x < 107 . On this scale the staircase behavior which is so dominant in the main part of the figure has disappeared. Moreover, we recognize that the curve increases almost linearly. However, this is not quite correct. Gauß proposed the asymptotic behavior π(x) ∼
x ln x
(1.4)
shown in the inset of Fig. 1.1 by a dashed line. However, this expression is not quite
7
correct. For this reason Legendre13 who independently from Gauß had also suggested Eq. (1.4) and gotten into a quarrel over priorities with him had proposed π(x) =
x . ln x − 1.08366
However, a much better fit of the exact curve again conjectured by Gauß is the asymptotic distribution Zx 1 π(x) ∼ Li(x) ≡ dy (1.5) ln y 2
in terms of the logarithmic integral Li. In the inset of Fig. 1.1 we show this function by a dotted line. Since it is almost impossible to see a difference between the exact and the approximation Eq. (1.5) we amplify the curves in the neighborhood of a ≤ x < c. This picture clearly shows that the logarithmic integral is always above the exact curve, at least in the interval depicted in this example. However, it has been shown by S. Skewes in 1955 that the function can fall below the exact curve. This happens at least once before 3 1010
1010
.
This is an incredibely large number. Later this value was improved to 1.65 · 101165 which is still a very large number. To be precise there are an infinite amount of crossings between the two curves.
1.3.2. A First Glimpse of the Prime Number Theorem Closely related to the prime function π is the function X ϕ(x) ≡ ln p
(1.6)
p≤x
which is the sum of the natural logarithms of all primes p smaller than x. The function ϕ(x) and ln x provide a lower bound for π, that is ϕ(x) ≤ π(x) ln x
(1.7)
Our proof of this inequality starts from the definition Eq. (1.6) of ϕ and uses the fact that the logarithm is a monotonously increasing function. As result we have ln p ≤ ln x for p ≤ x which yields ϕ(x) =
X p≤x
13
8
ln p ≤
X p≤x
ln x = ln x
X p≤x
Adrien Marie Legendre (1752 – 1833), French mathematician
= ln x π(x).
We can even go one step further and derive an exact relation connecting π and ϕ. For this purpose we start from the definition Eq. (1.3) of π in terms of the Stieltjes integral and integrate by parts. We first obtain the integrator times the integrand. Then we subtract the integral consisting of the product of the integrator and the derivative of 1/ ln y, which yields x Zx ϕ(y) 1 + dy , (1.8) π(x) = ϕ(y) ln y y (ln y)2 y=2
or
ϕ(x) + π(x) = ln x
Zx 2
2−ε
dy
ϕ(y) . y (ln y)2
(1.9)
Here we have made use of the fact that ϕ(x) ≡ 0 for x < 2, which abolished the lower boundary term in Eq. (1.8). the derivation we refer to Appendix ??. Since the integral is always positive we recover immediately the inequality Eq. (1.7). We get a first glimpse of the prime number theorem and, in particular, see the emergence of the logarithmic integral when we cast Eq. (??) in the form 2 ϕ(x) − x π(x) = Li(x) + + + ln 2 ln x
Zx 2
dy
ϕ(y) − y . y (ln y)2
(1.10)
Here we have integrated the definition Eq. (1.5) of the logarithmic integral x Zx Zx 1 y 1 x 2 + dy Li(x) = + dy 2 = 2 − ln y 2 ln x ln 2 (ln y) (ln y) 2
2
by parts. We emphasize that Eq. (1.10) is exact: Moreover, the first term is already the celebrated asymptotics predicted by the prime number theorem. Unfortunately at this early stage we cannot identify this term as the leading contribution. This analysis will have to wait till Chapter 5 where we show that for large values of x the function ϕ(x) displays the behavior √ ϕ(x) ≈ x − 4 xT (x) and T (x) is a sum of oscillatory terms. From the resulting expression √ Zx x T (y) T (x) − 4 dy √ π(x) = Li(x) − 4 ln x y (ln y)2 2
for the prime function we can identify the logarithmic integral as the determiner of the overall behavior of π. The second contribution, in particular, T (x) is a staircase-like
9
function reproducing the steps in π due to occurrence of primes. Moreover, in Chapter 5 we relate the emergence of T to the non-trivial zeros of the Riemann ζ-function. In contrast, the logarithmic integral results from the pole in ζ. These results outline the program for the next chapters.
Problems 1.1 Prime number decomposition of factorial Verify the prime number decomposition Y n! = pe p p≤x
of n! where
x x x x + 2 +··· = +O ep ≡ p p p p2
denotes the exponent of the prime p and n ≡ [x] is the largest integer not exceeding x. 1.2 Stirling formula Express ln n! in terms of a Stieltjes integral and verify the Stirling formula 1 ln n − n + O(1). ln n! = n − 2 1.3 Another glimpse of the prime number theorem Use the result of Problem 1.1 together with the Stirling formula of Problem 1.2 to verify the result X ln p = ln x + O(1). p p≤x 1.4 Sum over the inverse of primes Use the Stieltjes integral to verify the relation X1 p≤x
p
= ln ln x + a + O
1 ln x
where a is a constant. 1.5 Mertens’ formula Take advantage of the estimate of Problem 1.4 to verify the relation Y b 1 = + O(1) 1− p ln x p≤x where b is a constant.
10
1.6 Derivation of Eq. (1.9) with Riemann integral Derive the exact relation Eq. (1.9) connecting π and ϕ using the Riemann integral. 1.7 Proof of Chebyshev bound Make use of Problem 1.3 to verify the Chebyshev bound b1
x x < π(x) < b2 ln x ln x
on π where b1 ≡ ln(2)/4 and b2 ≡ 30 ln 2.
11
12
2. How to Count Primes In the preceding chapter we have presented the sieve of Eratosthenes and introduced the prime function π. In particular, we have focused on its asymptotic behavior without proving the conjectures Eqs. (1.4) and (1.5) of Gauß. In the present section we use the inclusion-exclusion principle of combinatorics to obtain an expression for the number of primes retained by the sieve of Eratosthenes, that is for number of √ primes between ≡ π(x) − π( x) √ x and x
in terms of the M¨obius function µ. This formula allows us a second glimpse of the prime number theorem. Indeed, it yields the asymptotics Eq. (1.4). Unfortunately, this calculation also predicts a remainder which grows exponentially. We start by first reformulating the sieve of Eratosthenes in the language of set theory using the inclusion-exclusion principle. This approach casts the resulting expression √ for π(x) − π( x) into a form convenient for the M¨obius function. We conclude by evaluating the emerging expressions and arrive at the asymptotics Eq. (1.4) of Gauß. Alas the remainder grows rapidly.
2.1. Sieve of Eratosthenes as a Set Problem In Table 1.1 we have already illustrated the algorithm of Eratosthenes to locate the primes in a set of integers by crossing out the non-primes. In order to make the connection with combinatorics we now repeat in Table 2.1 this procedure emphasizing more the aspects of set theory. We therefore do not cross out numbers but rather classify them according to different sets. This figure is also instrumental in deriving the final expression for π in terms of the M¨obius function. On the way to this result to interchange the summations. Table 2.1 makes it clear why this approach is justified. Although we constantly refer to the specific example of x = 50 shown in Table 2.1 for guidance our treatment is valid for any x. We start with the set M ≡ {1 < m ≤ x}
of all positive integers m smaller than or equal to x depicted in the second row of Table 2.1. Here we have already excluded 1 since by definition it is not √ considered a prime. We recall that p = 2, 3, 5 and 7 are the primes smaller than 50 = 7.071 . . . and find all their integer multiples smaller or equal to x = 50. These members form the sets E2 , E3 , E5 and E7 shown in Table 2.1 by the next rows.
13
In general, we start from the primes p1 ,p2 , . . . ,pr and define r-subsets √ Epj ≡ m ≤ x : pj |m and pj ≤ x √ of M which consist of integers m ≤ x, where the prime pj ≤ [ x] is a divisor of m. Hence,√the set Epj contains the numbers m ≤ x which are integer multiples of the prime pj ≤ [ x] and, in particular, pj itself. √ Here we have introduced the mathematical symbol [ ]. Indeed, it might √ happen that x is not an integer. In this case √ we take the largest integer not exceeding√ x and denote x]. For example, when x = 50 we find 50 = 7.071 . . . this quantity by the symbol [ √ 50 = 7. and thus Moreover, we emphasize that there is no need to exclude the number 1 in the definition of Epj since these sets only contain multiples of primes. Therefore, 1 cannot be among them. The primes found by the sieve constitute the set M0 of numbers which are not elements of any of the subsets Epj as shown in Table 2.1 for the example of x√= 50. Moreover, the number |M0| of elements of M0 is the number of primes between x and x, that is √ π(x) − π( x) = |M0 |. √ Here, the problem of finding the number π(x) − π( x) of primes has been reduced to finding the number of elements of the set |M0 |.
14
√ π( 50) = 4
2 3
5
+1 2 3
4 5
6 7 8
4
6
Primes
7 11
M = E1 E2
2
E3
-1
3
E5
9 10 11
8
6
10 9
5
13
19 18 19 20 21 22
12
18
14
12
16 15
10
E7
20
18
23
29
23 24 25 26 27 28
29 30 31 32 33 34
22
24
21
15
7
P
17
12 13 14 15 16 17
24
20
14
26
28
31
30
27
37
32
30
25
34
36
33
38
36
30
21
41
35 36 37 38 39
28
47
π(50) = 15
46 47 48 49 50
49 +49
40
46
25
42
39
35
43
40 41 42 43 44 45 44
42
48
45
40
50
48
16
45
35
50
42
10
49
7
p |Ep |
58
E2 ∩ E3 = E6
6
12
E2 ∩ E5 = E10
18
24
10
E2 ∩ E7 = E14
30
20
+1
42
30
14
E3 ∩ E5 = E15
36
50
pj ,pk ,pl
P
3
42
2
35
1 22 +22
30
-1
1
E2 ∩ E3 ∩ E7 = E42 P
3 45
21
E2 ∩ E3 ∩ E5 = E30
5
42 30
E3 ∩ E7 = E21
42
1
|Epj ∩ Epk ∩ Epl |
Epj ∩...Epl (−1)
k
-58
8
40
28 15
E5 ∩ E7 = E35 P pj ,pk |Epj ∩ Epk |
48
2 0 0
0 0
0 0 0
0
0
1
0
1
0
0
0
1
0
1
0
0
0
1
0
0
0
0
0
1
0
1
0
0
0
0
0
1
0
0
0
1
0
1
0
0
0
1
0
0
15
√ Table 2.1: Number of primes π(50) − π( 50) = 11 surviving the sieve of Eratosthenes and evaluated with the help of the inclusion-exclusion principle in two different but completely equivalent ways. In the horizontalvertical approach we first find the number of members in each subset (count members in row) and then add the weighted sums (column). In the vertical-horizontal method we first add the weights ±1 for a number being a member of the subset (columns) and then add these terms (row). These methods correspond to Eqs. (2.4) and (2.5), respectively.
0
11
-2 11
2.2. Number of Elements in M0 We now determine the number |M0 | of elements in M0 using the inclusion-exclusion principle of combinatorics. In order to focus on the main goal of this booklet, namely to provide an introduction into number theory we do not derive but rather motivate this technique using the example of Table 2.1. For a more detailed discussion and a proof of this principle in terms of sets we refer to Appendix B.
2.2.1. Motivation of Expression for |M0 |
A first guess P for |M0 | is to take the total number |M| of integers 1 < m ≤ x and subtract the number p |Ep | of elements in all r subsets Ep , that is X ! |Ep |. (2.1) |M0 | = |M| − p
√ Here the sum includes all primes 2 ≤ p ≤ [ x]. In order to simplify the notation we here and in the remainder of this chapter suppress these limits of the sum. The exclamation mark on top of the equal sign indicates that this relation is only an “ansatz” and we are not sure yet if it is correct. Most likely it is not! Indeed, this expression only provides a lower bound for |M0|. Table 2.1 shows that many integers are simultaneously members in two or even three sets Ep . For example m = 14 = 2 · 7 is a member of E2 as well as E7 . Moreover, m = 42 = 2 · 3 · 7 is a member of E2 , E3 as well as E7 . We correct this overcounting by adding to Eq. (2.1) the number of all integers that are simultaneously in two subsets, which yields the estimate X X ! |M0 | = |M| − |Ep | + |Epj ∩ Epk |, (2.2) p
pj ,pk
where Epj ∩ Epk denotes the set of elements common to both Epj and Epk . Unfortunately, even this expression is not correct and only provides an upper bound for |M0 |. The origin of the problem lies in integers such as m = 30 or m = 42 which are members of three subsets Ep . In the first sum of Eq. (2.2) m = 30 is counted three times as apparent from Table 2.1. Similarly in the second sum it is also counted three times, since it appears in E2 ∩ E3 , E2 ∩ E5 and E3 ∩ E5 . As a consequence Eq. (2.2) does not subtract m = 30 at all from |M|. Therefore, an improved formula must take into account numbers m that are members in more than two classes Ep . This number has to be negative as illustrated in Table 2.1 by the example of m = 30, that is X X X ! |M0 | = |M| − |Ep | + |Epj ∩ Epk | − |Epj ∩ Epk ∩ Epl |. (2.3) p
pj ,pk
pj ,pk ,pl
Not even this result is correct in general since there could be numbers m which are members of 4 subsets, such as E2 , E3 , E5 and E7 . In the case of x = 50 this is not
16
true since the smallest number 2 · 3 · 5 · 7 = 210 contained in this set is already larger than x = 50. Nevertheless, in the general we have to take this possibility into account. Moreover, many more subsets are possible. Thus the alternating sum Eq. (2.3) suggests the general expression X X X ! |M0 | = |M| − |Ep | + |Epj ∩ Epk | + . . . (−1)m |Epj ∩ Epk ∩ · · · ∩ Epm | + . . . , p
pj ,pk
that is, |M0 | = |M| +
pj ,pk ,...,pm
r X
(−1)m
m=1
X
pj ,pk ,...,pm
|Epj ∩ Epk ∩ · · · ∩ Epm |.
(2.4)
According to Eq. (2.4) we obtain |M0 | by first finding the numbers of elements in all subsets Epj , in the cross section Epj ∩ Epk between all pairs Epj and Epk , in the cross sections Epj ∩ Epk ∩ Epm between three subsets Epj , Epk and Epm and so forth. We then add up these numbers with a given weight. Indeed, elements of a single subset carry the weight −1. The elements of a cross section between two sets have weight +1 whereas in the case of three sets we have again −1 and so forth. The resulting number when added to the number |M| of elements of M yields the desired number |M0 | of primes.
2.2.2. Illustration for x = 50 Before we analyze the rather complicated expression Eq. (2.4) in more detail we illustrate it for the example of x = 50 summarized in Table 2.1. This analysis is also extremely illuminating for the next step of our derivation of |M0| since it suggests a re-summation of Eq. (2.4). In Table 2.1 we start from the primes p = 2, 3, 5 and 7 and show in the successive rows the members of all possible subsets Ep and cross sections Epj ∩ Epk and Epj ∩ Epk ∩ Epm . For this specific example of M containing integers smaller or equal to x = 50 there exist only cross sections constructed out of at most three subsets Ep . Indeed, the smallest number corresponding to 4 subsets is 2 · 3 · 5 · 7 = 210 and lies outside of our domain x ≤ 50 of interest. The right column displays the numbers of elements in the individual sets. The corresponding weight is indicated in the left column giving rise to the sum +49−58+22−2 = 11 of alternating terms shown in the column furthest to the right. Hence, we arrive indeed at 11 primes found by the sieve of the Eratosthenes in complete agreement with the first row of Table 2.1. Moreover, this picture offers another strategy to obtain the number 11. So far we have first counted the numbers of elements in a given subset, that is we have counted in each row from left to right. Then we have attached the weights +1 or −1 to these numbers depending on the number of subsets involved being even or odd. In the last step we have performed the vertical sum of alternating terms. However, we can also start by first counting vertically, that is we ask how many times a given integer appears in one of these subsets. Again the weight depends on the oddeven nature of the number of subsets involved. For example, the number 18 appears in
17
M, E2 and E3 , as well as E2 ∩ E3 giving rise to√a total weight +1 − 1 − 1 + 1 = 0 as shown in the bottom row. Primes larger than [ x] are only members of M but not of any of these subsets. Hence, they carry the weight +1. When we add up all the values along the row we naturally arrive at the number 11 of primes, in complete agreement with the horizontal summation technique. In summary, we can first add horizontally, that is count up all members in a row and then perform the alternating vertical sum. However, a completely equivalent approach is to first calculate the alternating vertical sum and then complete the horizontal sum. The later approach brings us to the M¨obius function as we now show in the next section.
2.2.3. A Different Representation of |M0 |
The special example of Table 2.1 shows a path towards a new representation of |M0|. It is based on performing the “vertical summations”, this is the summation over (−1)m first. This idea leads to the expression |M0 | = |M| +
r XX
X
(−1)k
m∈M k=1 pj , . . . , pk m ∈ (Epj ∩ · · · ∩ Epk )
for the number |M0| of elements in M0 . Here we take every number m of M and ask in how many subsets Epj ∩ · · · ∩ Epk the integer m can be found. The odd-even nature of the number k of primes involved determines the weight ±1. Then we sum over all sets in which m can be found. We can also include the term |M| in the sum over k by defining p0 ≡ 1 as well as recognizing that E1 ≡ M which yields |M0| =
r XX
X
(−1)k .
(2.5)
m∈M k=0 pj , . . . , pk m ∈ (Epj ∩ · · · ∩ Epk )
The expressions Eqs. (2.4) and (2.5) follow immediately from the corresonding formulas Eqs. (B.5) and (B.1) for a general set problem.
√ 2.3. A Compact Formula for π(x) − π( x) Equation (2.5) reduces significantly when we recall some elementary properties of the subsets Epj and their cross sections Epj ∩ · · · ∩ Epk . Indeed, Ep includes all integer multiples q of the prime p that are smaller than x, that is all integers m = q · p. Similarly, the cross section Epj ∩ Epk contains all integer multiples of pj · pk , that are smaller or equal to x. This set contains the same elements as Epj ·pk and thus Epj ∩ Epk = Epj ·pk
18
In the same spirit we recognize the relation Epj ∩ Epk ∩ Epm = Epj ·pk ·pm . Table 2.1 illustrates these formulae rather clearly. We now define the product d ≡ p1 · p2 · · · · · pk
(2.6)
of k distinct primes p1 , . . . , pk . In this case the statement m ∈ Ep1 ∩ Ep2 ∩ · · · ∩ Epk = Ep1 ·p2 ·····pk = Ed ,
that m is a member of the cross section of the sets Ep1 , Ep2 , . . . , Epk implies that d = p1 · p2 · · · · · pk is a divisor of m, that is d|m. Indeed, if m is a member of this cross section of subsets it must be an integer multiple of d and thus d is a divisor of m. Moreover, when we multiply all r primes p1 , p2 , . . . , pr together we find P (z) ≡ p1 · p2 · · · · · pr =
r Y j=1
pj ≡
Y
p.
p≤z
In the last √ step, we have brought out the fact that the largest prime is smaller or equal to z ≡ [ x]. We are now in the position to cast Eq. (2.5) into a more compact form. The sums over all possible primes and their products can be replaced by a sum over d, since d as defined in Eq. (2.6) is a divisor of P (z). However, the term k = 0 in Eq. (2.5) which brings in all elements of M is a slight complication. This term translates into d = 1 which is not included in the definition Eq. (2.6) of d consisting of primes only. We recall that by definition 1 is not a prime. Therefore, we have to include d = 1 as well and attribute the weight +1 to this term. For this purpose we recall the definition if n = 1 1 r (−1) if n is the product p1 . . . pr of r distinct primes µ(n) ≡ (2.7) 0 if p2 |n where p is a prime of the M¨obius function. We note that µ vanishes if n contains powers of primes. However, no such situation can occur in our case because by definition d consists only of a product of distinct primes. Although this property is not relevant in the present situation it is extremely important when we concentrate in the next chapter on the connection between µ and the Riemann ζ-function. P Throughout this booklet the symbol d|n f (d) denotes the sum of the values of the function f taken at all the divisors d of n. For example when n = 6 = 1 · 2 · 3 we find X f (d) = f (1) + f (2) + f (3) + f (6) d|6
19
With the help of the definition Eq. (2.7) of the M¨obius function µ we can now present the final expression X X µ(d) (2.8) |M0| = π(x) − π(z) = 1
for the number |M0 | of elements in the √ set M0 . At the same time this number is the number of primes between x and z = [ x]. We conclude this section by illustrating Eq. (2.8) using again Table 2.1. Since we use the four primes 2, 3, 5 and 7 the product P (7) = 2 · 3 · 5 · 7 = 210 leads to the divisors d = 2, 3, 5, 7, 6, 10, 14, 15, 21, 35, 30 and 42 as indicated in the left column of Table 2.1. In addition, we have the case d = 1. The corresponding values of µ(d) are shown in the second column to the left.
2.4. A Second Glimpse of the Prime Number Theorem We have concluded Chapter 1 by providing “an idea for an idea” on how to obtain the Gauß conjecture of the prime number theorem. Unfortunately, at this stage we could not really identify the leading contribution since we did not have enough knowledge about the function ϕ. In the present section we initiate another attempt to verify the Gauß conjecture Eq. (1.1), which is based on the expression Eq. (2.8) for the sieve of Eratosthenes. Unfortunately, we again fall short of a proof, since the remainder in our estimate grows exponentially.
2.4.1. Analytical Expression for π(x) − π(z) We start from Eq. (2.8) in the form
π(x) − π(z) =
X X
µ(d)
n≤x d|P (z) d|n
where P (z) ≡
Y
p
p≤z
is the product of r distinct primes p1 , p2 . . . , pr not exceeding z and d is a divisor of n as well as P (z). √ In the previous sections we have always chosen z ≡ [ x]. However, in the present discussion we leave this parameter open and specify it at the very end of the calculation. Performing the Sum over all Integers with Divisors We first perform the sum over n by recognizing that only integers n contribute to the sum for which d is a divisor of n, that is n = m · d.
20
How many such integers n ≤ x can occur? Obviously from the identity x≡m·d we find m=
hxi
π(x) − π(z) =
X
such integers which results in
d
µ(d)
d|P (z)
hxi d
.
When we define the function x d d which depending on d takes values between ±1 we arrive at κ(d) ≡
hxi
−
π(x) − π(z) = xL(z) + R(z)
(2.9)
(2.10)
where we have introduced the main contribution X µ(d) L(z) ≡ d d|P (z)
and the remainder R(z) ≡
X
µ(d)κ(d).
d|P (z)
We now derive an explicit expression for L and provide an estimate for R. Main Contribution We first prove the identity X µ(d) Y 1 = 1− L(z) = d p p≤z d|P (z)
where the product is over all primes p1 , . . . pr smaller or equal to z. For this purpose we start from the right hand side of this equation and multiply out the individual products Y 1 1 1 1 1− = 1− · 1− ... 1 − p p p pr 1 2 p≤z which yields Y p≤z
1 1− p
= 1− −
1 1 1 1 1 1 − −···− + + +···+ p1 p2 pr p1 p2 p1 p3 p1 pr
1 1 1 − − . . . (−1)r . p1 p2 p3 p1 p2 p4 p1 p2 . . . pr
21
Consequently, we find an alternating sum of terms each of which is the inverse of products of primes. In fact, these are all the divisors d of P (z). Moreover, we note that for a single prime the term is negative, whereas for the product of two primes the corresponding term is positive, etc. This feature is contained in the definition Eq. (2.7) of the M¨obius function which leads us to the relation X µ(d) Y 1 = . 1− p d p≤z d|P (z)
We can even go a step further and evaluate the product. Already in Problem 1.5 we have derived the formula Y b 1 = + O(1) 1− p ln x p≤x where b is a constant. In Appendix C we identify this constant as b = e−γ where γ denotes the Euler-Mascheroni constant " n−1 # X1 γ ≡ 0.5772157 ≡ lim − ln n . (2.11) n→∞ k k=1 As a result we arrive at L(z) =
e−γ . ln z
(2.12)
Analysis of Remainder We now turn to the remainder R and recall that the M¨obius function µ cannot assume values larger than unity, that is |µ| ≤ 1. Similarly the definition Eq. (2.9) of κ implies |κ| ≤ 1 which leads to the estimate |R| ≤
X
d|P (z)
|µ(d)| · |κ(d)| ≤
X
.
d|P (z)
Hence, the upper bound of |R| is the number of divisors of P (Z). For this reason we now address the question: How many divisors are contained in a product of r distinct primes. We can easily verify the answer 2r by complete induction. Indeed, for r = 1, that is, a single prime p1 , we have d = 1 and d = p1 . When we consider the product P (z) = (p1 · . . . pr ) · pr+1 of r + 1 distinct primes each of the 2r divisors dj of (p1 . . . pr ) can be multiplied by pr+1 , to produce 2r additional divisors giving rise to a total of 2r+1 divisors. Hence, we arrive at |R| ≤ 2r = er ln 2 < ezln2 , (2.13) where in the last step we have overestimated the number r of primes by the parameter z. According to this bound the remainder can grow exponentially with z.
22
Summary When we now combine the results Eqs. (2.12) and (2.13) we find from Eq. (2.10) the asymptotic expression x γ e + O ez ln z (2.14) π(x) − π(z) = ln x for the numbers of primes surviving the sieve.
2.4.2. Application Next we apply the above estimate of π to the example of √ x z≡
used in the previous illustrations of the sieve of Eratosthenes. In this case we arrive at √ √ x + O e x ln 2 π(x) − π( x) = b ln x
where b ≡ 2eγ . The first term does provide the asymptotic behavior Eq. (1.1) predicted by the prime number theorem. However, we cannot trust this prediction since the remainder increases exponentially. This dramatic growth of the remainder can be avoided by an appropriate choice of z. When instead we choose z ≡ [c ln x] we obtain the asymptotic formula π(x) − π(c ln x) =
x x eγ + O(ec ln 2 ln x ) ≈ eγ + O(xν ) ln c + ln(ln x) ln(ln x)
where ν ≡ c ln 2. In this example the remainder only grows like a polynomial. But unfortunately the estimate for the growth of π is overestimated since ln(ln x) < ln x. These examples illustrate in a striking way how difficult it is to obtain a handle on the prime number theorem.
Problems 2.1 Sieve of Brun
23
24
3. M¨ obius Function µ In the preceding section we have shown that the M¨obius function √ µ is useful in obtaining a closed form expression for the number of primes between x and x. However, apart from this important role there are also many other interesting applications of µ. The M¨obius function belongs to the class of arithmetic functions which are functions whose domain is some set of integers. In particular, µ is a multiplicative arithmetic function and satisfies the functional equation µ(m · n) = µ(m) · µ(n). This formula is essential in the derivation of the Euler product representation of the Riemann ζ-function. However, this is not the only connection between µ and ζ. Indeed, the emergence of ζ itself is closely linked to µ. In order to bring this fact out most clearly, we consider the convolution between µ and the arithmetic function 1(n) = 1 which maps every integer n to unity. The function resulting from this convolution is a Kronecker1 delta and the corresponding equation constitutes the M¨obius inversion formula. One way to simplify a convolution is to introduce generating functions. The Riemann ζ-function is the generating function of the function 1. Moreover, due to the M¨obius inversion formula we find that the inverse ζ(s)−1 of the value of ζ at the argument s is the generating function of µ. This relation is crucial in the proof presented in Chapter 4 that ζ does not contain any zeros for Re s > 1. The generating functions are of the form X D(s) ≡ d(n) n−s n
and are called Dirichlet2 series. Here ζ and ζ −1 emerge for d(n) ≡ 1 and d(n) ≡ µ(n), respectively. We start this chapter by first showing that the M¨obius function is a multiplicative arithmetic function. Then we turn to the concept of a convolution of two arithmetic functions which reduces with the help of generating functions to a multiplication. This approach leads us straight to the Riemann ζ-function and its inverse. Yet another application of the convolution theorem yields a closed form expression for the logarithmic derivative of ζ. This formula is the essential ingredient in establishing in Chapter 5 the connection between the prime distribution and the zeros of ζ. 1
Leopold Kronecker (1823 – 1891), German mathematician and logician, Professor in Berlin, “God made the natural numbers; all else is the work of man.” 2 Peter Gustav Lejeune Dirichlet (1805 – 1859), French-born German mathematician, Professor in Berlin
25
We conclude by returning to the beginning of this chapter where we have shown that the M¨obius function is multiplicative. Indeed, we make use of this feature to derive the Euler product representation of the Riemann ζ-function.
3.1. Multiplicative Arithmetic Functions We call an arithmetic function f (n) multiplicative if f (m · n) = f (m) · f (n)
(3.1)
for m and n integers and their greatest common denominator is unity. In the present section we first define and briefly discuss the M¨obius function µ. Then we show that µ satisfies indeed the functional equation Eq. (3.1).
3.1.1. Definition of µ We start by defining the M¨obius function if n = 1 1 0 if p2 |n for some prime p µ(n) ≡ r (−1) if n = p1 · p2 · · · · · pr where pi denote distinct primes
(3.2)
Table 3.1 shows the values of µ for the first 10 integers n.
Table 3.1.: Values of the M¨obius function µ for n ≤ 10 n 1 2 µ 1 −1
3 −1
4=2·2 5 0 −1
6=2·3 7 +1 −1
8 = 23 0
9 = 32 0
10 = 2 · 5 +1
At this point, it is clear why it is useful not to consider 1 to be a prime. The decomposition of n would not be uniquely defined. We could always multiply the product of primes by 1 without changing n. Nevertheless, it would change r and hence (−1)r . According to Problem 3.1 there exists the alternative representation X µ(n) ≡ e2πim/n (3.3) (m,n)=1
of the M¨obius function µ. Here the summation extends over integers m which do not share a divisor with n except 1. We illustrate this formula using a few examples such as n = 2, 4 and 6. Indeed, we find X e2πim/2 = eiπ = −1 = µ(2), (m,2)=1
together with
X
(m,4)=1
26
e2πim/4 = eiπ/2 + ei3π/2 = i − i = 0 = µ(4)
and X
e2πim/6 = e2iπ/6 + e2iπ5/6 = eiπ/3 + eiπ5/3 =
(m,6)=1
1 1 + = 1 = µ(6). 2 2
Equation (3.3) implies that the M¨obius function µ(n) is the sum of the n-th root z of unity to the power m, with z n = 1 but z m 6= 1 for 1 ≤ m < n. Moreover, we show in Problem 3.1 that the roots z are the zeros of the polynomial Y n (xd − 1)µ( d ) . φn (x) = (3.4) d|n
3.1.2. M¨ obius Function is Multiplicative We now show that the M¨obius function µ defined by Eq. (3.2) is a multiplicative function and satisfies the functional equation Eq. (3.1). According to the Fundamental Theorem of Arithmetic we can represent m and n by the products m = pα1 1 · · · · · pαr r and
n = q1β1 · · · · · qrβs
of primes p1 < · · · < pr and q1 < · · · < qr where the powers α1 , . . . , αr and β1 , . . . , βs are positive integers. Here we assume pi 6= qj since the greatest common denominator between m and n is unity. If at least one of the exponents αj is larger than unity we find from the second case of the definition Eq. (3.2) of µ the results µ(m) = 0 and µ(m · n) = 0 confirming µ(m · n) = µ(m) · µ(n). We are therefore left with the case α1 = α2 = · · · = αr = 1 and β1 = β2 = · · · = βs = 1. From the third case of the definition Eq. (3.2) of µ we obtain µ(m) = µ(p1 · · · · · pr ) = (−1)r µ(n) = µ(q1 · · · · · qs ) = (−1)s and thus µ(m) · µ(n) = (−1)r+s . On the other hand we derive µ(m · n) = µ(p1 · · · · · pr · q1 · · · · · qs ) = (−1)r+s .
27
Thus we have proven the multiplicative relation µ(m · n) = µ(m) · µ(n).
(3.5)
We conclude by noting that for m = 1 we arrive at µ(1 · n) = µ(n) = 1 · µ(n) = µ(1) · µ(n) which confirms one more time the multiplicative nature of µ.
3.2. Convolution of Two Arithmetic Functions A convolution is an operation which maps two arithmetic functions onto another one. There exist convolutions of addition as well as multiplication. In the present section we focus on the multiplicative one. For a brief discussion of the additive convolution we refer to Appendix D. We first define the operation of a multiplicative convolution. Then we illustrate it by applying it to the M¨obius function µ and an arbitrary function f . In this way, we become more familiar not only with the concept of the convolution and the µ function but also with the way of thinking in number theory. Moreover, this illustrative example of a convolution is a stepping stone towards the M¨obius inversion formula discussed in the next section. Finally we consider the case of a convolution between µ and f (n) ≡ 1(n) ≡ 1 for all n ∈ N. The resulting object is the neutral element of the convolution operation.
3.2.1. Definition We start from two arithmetic functions f = f (n) and g = g(n) where n is an integer and introduce the new function X h(n) ≡ (f ∗ g) (n) ≡ f (d1 ) · g(d2) (3.6) d1 ·d2 =n
whose value at n is the sum of the products of f and g evaluated at two integers d1 and d2 such that d1 · d2 = n. Here we denote the operation of a convolution of two functions by a star between them. The order of the two function f and g participating in the convolution is not relevant, provided f (d1 ) · g(d2) = g(d2 ) · f (d1 ), that is the values of f and g are members of a commutative set. In this case we have f ∗ g = g ∗ f.
(3.7)
Throughout this booklet we concentrate on arithmetic functions whose values are complex numbers. Hence, this condition is indeed satisfied. This convolution Eq. (3.6) takes a slightly different form when we express it in terms of a sum of divisors, that is n X n X (f ∗ g) (n) ≡ f (d)g = f g(d). (3.8) d d d|n
28
d|n
Both expressions in Eq. (3.8) are equivalent. In the first representation we have eliminated with the help of d1 · d2 = n the integer d2 = n/d1 in favor of d1 ≡ d. In the second formula we have eliminated d1 = n/d2 in favor of d2 ≡ d
3.2.2. Application to µ and f We now illustrate the concept of a convolution for the specific example of the M¨obius function µ and an arbitrary arithmetic function f . In particular, we verify the relation ( X n f (1) h for n = 1 i µ(d) = P f . (3.9) n n µ(d) for n = m · pα d d|m f d − f d·p d|n
This formula will turn out to be extremly helpful when we consider in the next section the case of f (n) ≡ 1(n) = 1. It leads us to the neutral element ε of the convolution. Furthermore for f (n) ≡ ln n it allows us to find in Sec. 3.5.3 the expansion coefficients Λ(n) of the logarithmic derivative of the Riemann ζ-function. Obviously Eq. (3.9) is correct for n = 1 when the only divisor of 1 is d = 1 and thus X 1 f µ(d) = f (1)µ(1) = f (1). d d|1
Here we have made use of the definition Eq. (3.2) of the M¨obius function. We now turn to the general case of an integer n > 1. According to the Fundamental Theorem of Arithmetic we can represent n as the product α
r−1 · pα ≡ m · pα n = pα1 1 · pα2 2 . . . pr−1
of r powers α1 , α2 , . . . , αr−1, α of distinct primes p1 , p2 , . . . , pr−1 , p. In the last step we αr−1 into one integer have combined the product of the first r − 1 integers pα1 1 , pα2 2 , . . . , pr−1 m and p is not a divisor of m. The divisors of n consist of all divisors d of m together with the products of all divisors d of m times p, p2 , . . . , pα , that is d, d · p, d · p2 , . . . , d · pα . From the definition Eq. (3.2) of the M¨obius function we find µ(d · p2 ) = · · · = µ(d · pα ) = 0 for α 6= 1. Hence, only the terms µ(d) and µ(d · p) contribute to the sum X n X n X n f µ(d) = f µ(d) + f µ(d · p). d d d·p d|n
d|m
d|m
When we recall from Eq. (3.5) that the M¨obius function is a multiplicative function with µ(d · p) = µ(d) · µ(p) = −µ(d) we find X n X n n f µ(d) = f −f µ(d), d d d·p d|n
d|m
in complete agreement with Eq. (3.9).
29
3.2.3. Neutral Element of Convolution We are now ready to consider an example of a convolution even more specialized than the one in the preceding section. Indeed, we choose f (n) ≡ 1(n) ≡ 1 that is a function whose value is unity irrespectively of the argument n. We denote the resulting function by X n X ε(n) ≡ (µ ∗ 1) (n) = (1 ∗ µ) (n) = 1 µ(d) = µ(d). d d|n
d|n
When we substitute f (n) ≡ 1(n) ≡ 1 into Eq. (3.9) we find for n 6= 1 the relation X X X n n −1 µ(d) = 0 · µ(d) = 0 µ(d) = 1 d d·p d|n
d|m
d|m
which with f (n = 1) = 1 yields ε(n) = δn,1 ≡
1 for n = 1 0 for n > 1.
(3.10)
Here we have recalled the definition δn,j ≡
1 for n = j 0 for n 6= j
of the Kronecker delta. It is interesting to note that ε≡µ∗1=1∗µ
(3.11)
is the neutral element of the multiplicative convolution Eq. (3.5) such that f ∗ ε = f.
(3.12)
In order to verify this statement we evaluate X n X n (f ∗ ε) (n) = f ε(d) = f δd,1 = f (n). d d d|n
d|n
Here we have made use of Eq. (3.10). The Kronecker delta has reduced the sum over d to the single term d = 1 giving rise to f (n). We conclude by stating without proof that the set of all arithmetic functions constitutes a commutative ring with respect to the convolution with the neutral element ε. In Eq. (3.7) we have already remarked about the commutative nature of the convolution. However, here we do not elaborate on the ring structure of the operation.
30
3.3. M¨ obius Inversion Formula We are now in the position to prove the equivalence of two representations of two arithmetic function f and g. The M¨obius inversion formula states that if f and g satisfy one of the two equations X f (n) = g(d) (3.13) d|n
and
g(n) =
X n f µ(d) d
(3.14)
d|n
for each n, then they satisfy both conditions. We start our proof by assuming the representation Eq. (3.13) of f as the convolution of g with the unity function, that is f =g∗1=1∗g and evaluate µ ∗ f = µ ∗ (1 ∗ g) = (µ ∗ 1) ∗ g = ε ∗ g = g.
Here we have made use of the associative law of the convolution as well as of the definition Eq. (3.11) and the property Eq. (3.12) of the neutral element ε. Next we start from Eq. (3.14), that is from g = µ∗f and evaluate again using Eqs. (3.11) and (3.12) the convolution 1 ∗ g = 1 ∗ (µ ∗ f ) = (1 ∗ µ) ∗ f = ε ∗ f = f, which is indeed the representation Eq. (3.13).
3.4. Generating Functions and Dirichlet Series The convolution of two arithmetic functions as defined in Eq. (3.6) is a rather complicated process which is very different from their multiplication. Nevertheless, the application of generating functions transforms this operation into a simple multiplication. Indeed, we claim that if f and g have the generating functions F and G then the convolution h ≡ f ∗ g has the generating function H = F · G , that is h=f ∗g
⇔
H = F ·G.
This idea is going to lead us straight to Dirichlet series and, in particular, to the Riemann ζ-function. We now address the question of how to turn the convolution n X h(n) ≡ (f ∗ g)(n) = f (d)g d d|n
31
between f and g into a product. For this purpose we use the arithmetic functions f and g to define two functions ∞ X F (s) ≡ f (m)m−s m=1
and
G (s) ≡
∞ X
g(n)n−s
n=1
of the complex variable s ≡ σ + it. Sums of this type are called Dirichlet series. They are the generating functions of f and g. Indeed, the product H (s) ≡ F (s) · G (s) =
∞ X
m,n=1
f (m)g(n)(m · n)−s =
∞ X
h(k)k −s
(3.15)
k=1
of the two Dirichlet series is another Dirichlet series where the expansion coefficient X ∞ X X k k = f g(n) (3.16) f (m)g(n) = f (m)g h(k) ≡ m n m·n=k
n|k
m|k
is the multiplicative convolution of f and g and involves a sum over all divisors of k.
3.5. Riemann ζ-Function The generating function, that is the Dirichlet series corresponding to the function 1(n) ≡ 1 is the Riemann ζ-function ζ(s) ≡
∞ X
n−s =
n=1
1 1 1 + + + ... 1s 2s 3s
(3.17)
defined for a complex valued variable s ≡ σ + it.
(3.18)
It is interesting to note that the value ζ(s) of this function depends on all positive integers n. Before we discuss the properties of ζ in more detail we study the convergence of the Dirichlet series Eq. (3.17). We then consider the generating functions for µ and for the logarithmic derivative of ζ. The former provides us with a Dirichlet series for ζ −1 . The latter makes the first contact with the prime numbers. Indeed, when we use Eq. (3.9) to find the expansion coefficients of the Dirichlet series we are confronted with logarithms of prime numbers. No other numbers but primes enter this formula. Our second contact is the Euler product formula which allows us to represent ζ as an infinite product of factors containing all primes.
32
3.5.1. Convergence We now address the question for which values of the argument s we obtain convergence of the Dirichlet series, Eq. (3.17). Here we only provide heuristic arguments. For a rigorous analysis we refer to Problem 3.2. P In order to gain some insight into the behavior of the sum n n−s we first recall the relation n−s = e−s ln n = e−σ ln n · e−it ln n = n−σ e−it ln n for s = σ + it which yields the representation ζ(s = σ + it) ≡
∞ X e−it ln n
nσ
n=1
=
∞ X cos(t ln n)
nσ
n=1
−i
∞ X sin(t ln n) n=1
nσ
of the Riemann ζ-function. We estimate |ζ| using the triangle inequality and find ∞ ∞ −it ln n X X e 1 = ≡ S(σ) |ζ(s)| ≤ nσ nσ
(3.19)
n=1
n=1
which implies convergence of the Dirichlet series Eq. (3.17) of ζ provided the sum S is convergent. Since the harmonic series ∞ X 1 ζ(1) ≡ n n=1
marks the border between absolute convergence and divergence we expect absolute convergence for ζ only for σ > 1. Moreover, since the harmonic series is divergent we expect a single pole at s = 1. Moreover, we also recall that loosely speaking a harmonic series with oscillatory terms is convergent. Indeed, this situation occurs for the Riemann ζ-function ζ(s = 1 + it) =
∞ X cos(t ln n) n=1
n
−i
∞ X sin(t ln n) n=1
n
along the line s = 1 + it for t 6= 0.
3.5.2. Stieltjes Representation of ζ −1 We now consider the Dirichlet series M(s) ≡
∞ X
µ(n)n−s
(3.20)
n=1
and show that it represents ζ −1.
33
For this purpose we recall the definition Eq. (3.11) of the neutral element ε=1∗µ of the convolution and the fact that the generating functions ζ and M corresponding to 1 and µ, respectively multiply giving rise to the generating function E≡
∞ X
ε(n)n−s =
n=1
X
δn,1 n−s = 1
n=1
of ε. Then we arrive at E =1=ζ ·M which corresponds to M(s) = ζ −1(s) =
∞ X
µ(n)n−s .
(3.21)
n=1
For a slightly different derivation we refer to Problem 3.4. This formula for the inverse of ζ was derived for the first time by the Dutch mathematician Stieltjes. It is quite a remarkable result. Whereas ζ is determined by a sum containing the inverses n−s of all integers n to the power s, the inverse of ζ follows from a similar sum involving the same terms n−s . However, the main difference is that in the sum for ζ −1 some terms are missing and some have changed their sign from positive to negative. Indeed, due to the property µ(n) = 0 of the M¨obius function for p2 |n the integers n which contain squares, cubes etc. such as n = 4, 8, 9, 16 or n = 12, . . . do not contribute to the sum. Moreover, due to the relation µ(n = p1 · · · · · pr ) = (−1)r the sum for ζ −1 is alterning, whereas in ζ each term contributing to the Dirichlet sum is positive. This general statement stand out most clearly in a direct comparison between the Dirichlet series ζ(s) = 1 +
1 1 1 1 1 1 1 1 1 + s + s + s + s + s + s + s + s + ... s 2 3 4 5 6 7 8 9 10
of ζ and the corresponding one 1 1 1 =1− s − s ζ(s) 2 3
−
1 1 1 + s− s s 5 6 7
+
1 + ... 10s
representing ζ −1 . Here we have made use of Table 3.1. We note the missing terms at n = 4, 8, 9 as well as the sign changes at integers which are single primes.
34
3.5.3. Logarithmic Derivative of ζ We now present an application of the product formula Eqs. (3.15) and (3.16) for generating functions. Indeed, for the analysis of the distribution of primes in Chapter 5 it is useful to express the logarithmic derivative of ζ as a Dirichlet series. We can obtain this quantity by differentiating the Dirichlet sum Eq. (3.17) of ζ, multiplying it by the Stieltjes representation Eq. (3.21) of ζ −1 and making use of Eqs. (3.15) and (3.16). With the help of the relation n−s = e−s ln n we find by differentiation of the definition ζ(s) =
∞ X
e−s ln k
k=1
of the Riemann ζ-function the expression ∞ X d ′ k −s ln k. ζ(s) ≡ ζ (s) = − ds k=1
With the Stieltjes representation ζ(s)−1 =
∞ X
µ(m)m−s
m=1
of the inverse of ζ, Eq. (3.21), we arrive at the product ∞ ∞ ∞ ′ X X X X ζ n µ(d) n−s =− k −s ln k µ(m)m−s = − ln ζ d m=1 n=1 k=1 d|n
of two Dirichlet series which is again a Dirichlet series
∞ X ζ ′(s) d ln ζ(s) = =− Λ(n)n−s . ds ζ(s) n=1
(3.22)
According to Eq. (3.9) the expansion coefficients X n µ(d) Λ(n) ≡ ln d d|n
for n = m · pα take the form X n X X n Λ(n) ≡ ln − ln µ(d) = ln p · µ(d) = ln p µ(d) = ln p · δm,1 , d d·p d|m
d|m
d|m
(3.23)
35
that is Λ(n) =
ln p for n = pα 0 otherwise.
(3.24)
In the last step we have made use of the explicit form Eq. (3.10) of ε = µ ∗ 1. The Dirichlet series Eq. (3.22) of the logarithmic derivative of the Riemann ζ-function with its expansion coefficients Eq. (3.24) is quite remarkable. Whereas ζ itself does not display the slightest connection to prime numbers we now recognize that its logarithmic derivative is a Dirichlet series whose expansion coefficients are the natural logarithms of primes and of primes, only. Moreover, the arithmetic function Λ(n) which carries the name von Mangoldt3 function is non-vanishing only if the integer n allows the representation of a power of a prime p. However, the power α does not even enter into the definition of Λ. This surprising feature is due to the fact that the terms ln(n/d) in the definition Eq. (3.23) of Λ cancel. We are now also able to make contact with Chapter 1 where we have defined in Eq. (1.6) the function X ϕ(x) ≡ ln p p≤x
which appears in Eq. (1.10) providing a relation for the prime function π. Indeed, the Chebyshev4 function X ψ(x) ≡ Λ(n) n≤x
built out of Λ(n) and √ ϕ(x) must be closely related. In Chapter 5 we shall show that up to terms of the order x we have the identity √ ϕ(x) = ψ(x) + O( x). This relation is one of the crucial ingredients of our derivation of the prime function π(x).
3.5.4. Euler Product We have started this Chapter by presenting a proof that the M¨obius function is a multiplicative arithmetic function. In the present section we take advantage of this fact and derive the Euler product representation Y ζ(s) = (1 − p−s )−1 p
of the Riemann ζ-function. 3
Hans von Mangoldt (1854 – 1925), German mathematician, Professor in Hannover, Aachen and Danzig 4 Pafnuti Lvovich Chebyshev (or Chebychev, Chebysev, Chebycheff, Tchebychev, etc.)(1821 – 1894), Russian mathematician
36
For this purpose we first cast the Dirichlet series F (s) ≡
∞ X
f (n)n−s
n=1
of a multiplicative arithmetic function f into the product Y YX f (pnp )p−np s F (s) = f (1) + f (p)p−s + f (p2 )p−2s + . . . = p
p
(3.25)
np
of sums. Here the infinite product extends over all primes p and the summation includes all integers np ≥ 0. Moreover, for each prime p there exists a summation index np . Then we consider the special example of f (n) ≡ µ(n). For a completely different derivation of the Euler representation which does not make use of the multiplicative nature of f we refer to Problem 3.5. From a Dirichlet Sum to a Product In order to prove Eq. (3.25) we start from its right hand side, that is from the product Q X f (pnp )p−np s = p np
=
2 −2s 2 −2s . . . f (1) + f (pj )p−s + . . . · f (1) + f (pj+1)p−s j + f (pj )pj j+1 + f (pj+1 )pj+1 + . . .
of sums and multiply it out. This procedure yields the relation X YX f (pnp )p−np s = Ωr , p
(3.26)
r
np
that is a sum whose terms Ωr ≡
f (pα1 1 )f (pα2 2 ) . . . f (pαr r )
1s p−α 1
·
2s p−α 2
rs . . . p−α r
=
r Y j=1
α f (pj j )
r Y
−αj s
pj
j=1
reduce due to the multiplicative property r Y
α f (pj j )
r Y
=f
j=1
α pj j
j=1
of f and the familiar law r Y
−αj s
pj
=
j=1
of powers
Ωr = f
r Y
pj j
·
r Y
α
j=1
r Y j=1
α pj j
!
j=1
!
!−s
α pj j
!−s
.
(3.27)
37
Obviously the product
r Y j=1
α
pj j = pα1 1 · pα2 2 . . . pαr r = n
of powers of primes is an integer n. Moreover, according to the Fundamental Theorem of Arithmetic this decomposition of any integer is unique and we find from Eq. (3.27) the expression Ωn = f (n)n−s which when substituted into Eq. (3.26) completes the proof of Eq. (3.25). Application to the M¨ obius Function We now apply Eq. (3.25) to the special example of f (m) = µ(m) for which we have already derived the Stieltjes representation ζ
−1
(s) =
∞ X
µ(n)n−s
n=1
for the inverse of the Riemann ζ-function. When we recall that µ(pα ) = 0 for 1 < α, the sum in the product in Eq. (3.25) reduces to two terms only, that is Y ζ −1 (s) = (1 − p−s ) p
where we have made use of the fact that µ(1) = 1 and µ(p) = −1. Obviously we have now achieved a representation Y ζ(s) = (1 − p−s )−1
(3.28)
p
of the Riemann ζ-function as an infinite product which apart from the argument contains only primes. Equation (3.28) is the second clear indication that the Riemann ζ-function is intimately connected to the distribution of prime numbers.
Problems 3.1 Alternative representation of M¨ obius function Verify the representation Eq. (3.3) of the M¨obius function together with the fact that the zeros satisfy the polynomial Eq. (3.4). 3.2 Convergence of Dirichlet series for ζ Show that the ratio test fails in the case of the Dirichlet series Eq. (3.17) for ζ. Then compare the sum to the geometric series and prove that the Dirichlet series for ζ is absolutely convergent for Re s ≡ σ > 1.
38
3.3 Inequality for ζ Derive the inequality
1 1 ≤ ζ(s) ≤ +1 s−1 s−1 for real valued arguments s > 1. 3.4 Alternative derivation of Stieltjes representation Use the convolution theorem Eq. (3.15) to derive the Stieltjes representation Eq. (3.21) of ζ −1 .
3.5 Euler representation of ζ from geometric series Derive the Euler product representation Eq. (3.28) using the geometric series. 3.6 Logarithmic derivative of ζ Derive the logarithmic derivative Eqs. (3.22) and (3.24) of ζ using (i) the M¨obius inversion formula Eqs. (3.13) and (3.14) (ii) the Euler product representation Eq. (3.28).
39
40
4. Analytical Properties of ζ In the preceding chapter we have introduced the concept of a generating function in order to transform the problem of a convolution into a multiplication. The generating function of 1(n) = 1 is the Riemann ζ-function. Its Dirichlet series ζ(s) ≡
∞ X
n−s
(4.1)
n=1
is absolutely convergent for Re s ≡ σ > 1 only. However, this limitation of the argument can be overcome. Here the functional equation ξ(s) = ξ(1 − s) (4.2) for the product
−s/2
s
ζ(s) (4.3) 2 containing the gamma function as well as the Riemann ζ-function plays an important role. Equation (4.2) does not imply that ζ satisfies a functional equation since according to the definition Eq. (4.3) of ξ it is the product of three functions with ζ only one of them. Nevertheless, the functional equation Eq. (4.2) allows us to extend the Riemann ζ-function to the domain σ < 0, that is the left half of the complex plane. Unfortunately, we still have to fill the gap of the critical strip, that is the domain 0 < σ < 1. We can complete this task by representing ζ as an integral transform of a function w which is closely related to the theta functions. Indeed, this formula holds true for σ > 0. We devote a large portion of this chapter to the extension of ζ to the whole complex plane. The functional equation Eq. (4.2) not only allows us to lift the restrictions on Re s ≡ σ but also opens up a new avenue to discuss the zeros of ζ which is another major topic in this chapter. Unfortunately, here we can only present results without detailed proofs since this field is still an extremely active area of research. In particular, the Riemann hypothesis which states that all non-trivial zeros lie on the critical line σ = 1/2 has not been verified yet. The integral transform emerging in the derivation of the functional equation for ξ is the Mellin1 transform. It is closely related to the Laplace transform and allows us to express any Dirichlet series as an integral. This feature will be crucial when we make in Chapter 5 the connection between the prime function π and the zeros of ζ. Since the Mellin transform is not frequently used we briefly summarize its most important properties. In particular, we emphasize its close connection to Dirichlet series. ξ(s) = π
1
Γ
Robert Hjalmar Mellin (1854 – 1933), Finnish analyst and physicist
41
4.1. Extension of ζ to Whole Complex Plane So far the domain of the Riemann ζ-function has been confined to arguments s of ζ with Re s ≡ σ > 1. In the present section we extend ζ to the whole complex plane. Here we proceed in two steps. We first derive an integral representation of ζ which is valid for σ > 0. Needless to say, this new definition agrees with the one in the original domain. This extension of ζ is made possible by the fact that the integral representation of the gamma function is valid for σ > 0. We then derive the functional equation Eq. (4.2) for the function ξ. Here we start from the representation of ζ as the integral transform of a series w and use the functional equation of w derived in Appendix E. The functional equation for ξ allows us to define ζ even in the domain σ < 0.
4.1.1. General Idea We start from the integral representation Z∞ Γ(s) ≡ du e−u us−1
(4.4)
0
of the gamma function. Since the main goal of the present section is to free the definition of ζ from the restriction σ > 1, it is worthwhile to pause here for a moment and recall that the expression Eq. (4.4) for Γ is valid for Re s ≡ σ > 0. Indeed, we can convince ourselves immediately of this fact when we integrate the term us−1 by parts which yields ∞ Z∞ 1 e−u s 1 1 du e−u us = lim us + Γ(s + 1). Γ(s) = u + s s s u→0 s 0 0
Hence, only in the limit of Re s ≡ σ > 0 do we find a vanishing boundary term and the familiar functional equation 1 Γ(s) = Γ(s + 1). (4.5) s We can extend the definition of Γ to the whole complex space by taking advantage of the functional relation Eq. (4.5) working our way iteratively to the left in steps of unity. For example, for s ≡ −|σ| + it with 0 < |σ| < 1 we can define Γ(−|σ| + it) ≡
1 Γ(1 − |σ| + it), −|σ| + it
since the gamma function on the right hand side is defined by the integral Eq. (4.4). The method of extending ζ is very much in the same spirit. Starting from the integral representation of Γ we find an integral representation of ζ, that leads us already beyond the original border of σ > 1 into the critical strip defined by 0 < σ < 1. A functional equation then opens up the west of complex space, that is the domain σ < 0.
42
4.1.2. Extension into the Critical Strip The new integration variable u ≡ πn2 x together with du = πn2 dx casts the integral representation Eq. (4.4) of the gamma function into Z∞ 2 dx e−πn x xs−1 . Γ(s) = π n s 2s
0
After the substitution s → s/2 we arrive at π
−s/2
Γ
s 2
−s
n
Z∞ 2 = dx e−πn x xs/2−1 . 0
We sum both sides over n and obtain after interchanging summation and integration the relation Z∞ ∞ s s X (4.6) ζ(s) = dx w(x)xs/2−1 . n−s ≡ π −s/2 Γ ξ(s) ≡ π −s/2 Γ 2 n=1 2 0
Here we have made use of the definition Eq. (4.1) of the Riemann ζ-function and have introduced the new function " ∞ # ∞ X X 1 1 2 2 w(x) ≡ e−πn x ≡ [ϑ(x) − 1] ≡ e−πn x − 1 . (4.7) 2 2 n=1 n=−∞ In the last step we have also defined the theta series ϑ. Here the summation runs from minus infinity to plus infinity rather than from unity to plus infinity. Since the summation index n appears quadratically this extension of the summation is without complications. Moreover, the term n = 0 is canceled by the term unity. The function ϑ is a special case of the theta function ϑ3 (τ, q) ≡
∞ X
2
q n e2inτ
n=−∞
where |q| < 1. Indeed, with q ≡ exp[−πz] and τ = 0 we find ϑ3 (τ = 0, q = e−πz ) ≡ ϑ(z) =
∞ X
2
e−πn z .
n=−∞
It is for this reason that we refer to ϑ as theta function, although strictly speaking theta functions are a much larger class of functions as illustrated for example in Whittacker and Watson A Course of Modern Analysis. According to Eq. (4.6) we can express ζ as the product s s w˜ (4.8) ζ(s) = λ 2 2 43
t
c(s)ζ(1 + s)
0
λ
s 2
w˜
s s
1
P
n
n−s σ
Figure 4.1.: Representations of the Riemann ζ-function in the various domains of complex space s ≡ σ + it. For σ > 1 (downwards stripes) the Dirichlet series Eq. (4.1) is uniformly convergent. In the critical strip, that is for 0 < σ < 1, as well as for σ > 1 (darkened zone) we can use the integral representation Eq. (4.8). Moreover, the functional equation Eq. (4.2) for ξ allows us to connect the domains σ > 1 and σ < 0 leading to the definition Eq. (4.14) of ζ for σ < 0 (upwards stripes). consisting of λ(s) ≡
πs Γ(s)
(4.9)
and the integral transform Z∞ w(s) ˜ ≡ dx w(x)xs−1
(4.10)
0
of the function w. Since the expression for w˜ is valid for Re s ≡ σ > 0 we have already extended the domain of ζ to the west. We now cover the critical strip, that is 0 < σ < 1, as well as the original area σ > 1 as indicated in Fig. 4.1.
4.1.3. Functional Equation for ξ We can even abolish the border at σ = 0 when we take the gamma function as the guiding example in our quest for the west: We need to derive the functional equation of ξ. It is interesting to note that this equation appears already in Riemann’s seminal paper. Moreover, we follow essentially Riemann’s derivation.
44
For this purpose we first decompose the integral Z∞ ξ(s) = dx w(x)xs/2−1 0
in Eq. (4.6) into two domains with one extending from zero to unity and the other from unity to infinity, that is ξ(s) =
Z1 0
Z∞ dx w(x)xs/2−1 + dx w(x)xs/2−1 . 1
In the first integral we make use of the functional equation 1 1 1 1 √ −1 + w(x) = √ w x 2 x x derived in Appendix E and find ξ(s) =
Z1
s/2−3/2
dx w(x)x
0
Z∞ 1 s/2−3/2 s/2−1 x −x + dx w(x)xs/2−1 . + 2 1
When we define the new integration variable u ≡ x−1 together with du = −x−2 dx = −u2 dx we arrive at Z∞ s+1 Z∞ s+1 s+2 1 ξ(s) = du w(u)u− 2 + u− 2 − u− 2 + dx w(x)xs/2−1 , 2 1
or
1
Z∞ h i s−2 − s+1 2 2 ξ(s) = dx x +x w(x) + I˜ 1
where we have introduced the integral 1 I˜ ≡ 2
∞ ∞ Z∞ 1 s 2 − s 2 − s−1 − −1 − s+1 = dx x 2 − x 2 − x 2 + x 2 2 s−1 s 1 1 1
.
It is at this point that the condition σ > 1 enters. Indeed, only under this assumption do the contributions from the upper limit vanish. The contributions from the lower limit yield Z∞ h i s+1 s−2 1 1 ξ(s) = dx x− 2 + x 2 w(x) + − . (4.11) s−1 s 1
45
When we substitute s ≡ 1 − s′ into this expression we find the symmetry relation Z∞ h i ′ ′ 1 1 − s 2+1 ′ − 2−s 2 +x w(x) − ′ + ξ(1 − s ) = dx x = ξ(s′ ), ′ s 1−s 1
that is, we have indeed verified the functional equation ξ(s) = ξ(1 − s).
We conclude this section by noting that for s ≡ 21 −s′ we arrive at the symmetry relation 1 1 ′ ′ ξ (4.12) −s =ξ +s 2 2 for the function ξ with respect to s = 12 .
4.1.4. Extension to Left Half-Plane We are now in the position to extend the Riemann ζ-function into the domain σ < 0. For this purpose we start from the functional equation ξ(s) = ξ(1 − s) for the function ξ(s) ≡ π −s/2 Γ
s
ζ(s). 2 Although ζ is only defined for Re s ≡ σ > 1 the gamma function Γ(s) is defined for any value s except at negative integers including zero, that is for s = 0, −1, −2, . . . . These poles do not emerge from the integral transform Eq. (4.4) which is only valid for σ > 0 but from the iteration of the functional equation Eq. (4.5) and the Gauß representation n!ns−1 n→∞ s · (s + 1) . . . (s + n − 1)
Γ(s) ≡ lim
(4.13)
of the gamma function. The single poles of Γ also shown in Fig. 4.2 are of great importance in the discussion of the zeros of ζ as we shall see in the next section. We now make the substitution s → −s in the functional equation which yields ξ(−s) = ξ(1 + s) and translates into π or
s/2
s 1+s −(1+s)/2 Γ ζ(1 + s), Γ − ζ(−s) = π 2 2
ζ(−s) ≡ c(s)ζ(1 + s) ≡ π
−(s+1/2) Γ
Γ
1+s 2 ζ(1 − 2s
+ s).
(4.14)
Hence, we can interpret this equation as the definition of ζ for Re s ≡ σ < 0 as shown in Fig. 4.2. Indeed, the right hand side of Eq. (4.14) is defined in this case since Re (1+s) = 1 + |σ| > 1.
46
t
5
|Γ(s)|
0 −4
0
4
σ
Figure 4.2.: Gamma function of the complex argument s ≡ σ + it represented here as |Γ(s)|. We recognize the simple poles at s = 0, −1, −2, . . .
4.2. Zeros of ζ In the present section we discuss the zeros of the Riemann ζ-function. Their location in the complex space is of great importance in establishing the asymptotic behavior of the prime distribution π as discussed in Chapter 5. We distinguish two classes of zeros: The ones at s = −2, −4, −6, . . . and the ones in the critical strip defined by 0 < σ < 1. Unfortunately, their location is still not completely known. According to the Riemann hypothesis they are all along the critical line, that is along s = 21 + it. However, no exact proof for this statement exists. Only these zeros are directly related to the primes. For this reason the zeros along the negative real axis are called the trivial zeros. We now discuss both classes in more detail.
4.2.1. Trivial Zeros The original definition Eq. (4.1) of ζ as a Dirichlet series is valid in the domain Re s ≡ σ > 1. We now show that this part of the complex plane is free of zeros. This features of ζ follows immediately from the Stieltjes representation ∞
X 1 = µ(n)n−s ζ(s) n=1
(4.15)
of the inverse of ζ. Indeed, this sum is convergent for σ > 1 and constitutes a holomorphic function. Hence, it cannot become singular in this domain. Since this function represents ζ −1 , the Riemann ζ-function itself must be free of zeros in the shaded domain of Fig. 4.3.
47
We now consider the mirror image of this domain, that is the region σ < 0 of the complex plane. Here the Riemann ζ-function is defined with the help of the functional equation Eq. (4.2) for ξ by Eq. (4.14), that is 1+s −(s+1/2) Γ 2 ζ(−s) ≡ c(s)ζ(1 + s) ≡ π ζ(1 + s). Γ − 2s
From the Gauß representation Eq. (4.13) of Γ we recall that the gamma function has simple poles at s = 0, −1, −2, . . . . This feature makes ζ vanish at the locations of the poles unless the numerator in the above definition of ζ(s) becomes infinite as well. Such a situation occurs for s = 0 since ζ(s) has a single pole at s = 1. Hence, the singularity in the denominator at s = 0 due to Γ is compensated by the singularity of ζ at s = 1 and ζ is regular at s = 0. However, for all other values of s the poles of Γ lead to zeros of ζ. Due to the factor 2 in the argument of Γ the corresponding zeros ̺k of ζ read ̺k = −2k with k = 1, 2, . . . as indicated in Fig. 4.3 by the crosses along the negative real axis. In Chapter 5 we show that these zeros are unrelated to the distribution of primes. Therefore, they carry the name trivial zeros.
4.2.2. Non-trivial Zeros and Riemann Hypothesis Since the domain σ > 1 is free of zeros and the region σ < 0 only has the zeros provided by the poles of Γ all non-trivial zeros ̺j = σj + itj of ζ with ζ(̺j ) = 0, must be located in the critical strip defined by 0 ≤ σj ≤ 1. Moreover, it is also known that on the line σ = 1 parallel to the imaginary axis through the pole s = 1 there are no zeros, that is ζ(s = 1 + it) 6= 0. The functional equation for ξ immediately implies ζ(s = it) 6= 0. Hence, the critical strip does not include the borders. Symmetries From the functional relation ξ
1 +s 2
=ξ
1 −s 2
of ξ we find that the zeros in the critical strip must lie symmetrically with respect to the critical line σ = 1/2. If ̺j = σj + itj with 0 < σj < 1/2 is a zero of ζ, then also ̺j ≡ 1 − σj + itj is a zero. There is one more symmetry in the zeros. They are symmetrically located with respect to the real axis. Indeed, for every complex valued function f = f (z) which is real along the real axis z0∗ is also a zero provided z0 is a zero. Here z0∗ denotes the complex conjugate of z0 this familiar result of complex analysis is discussed in Problem 4.14.
48
t s 10 −6
−4
−2
0 −10
1 2
1
σ
Figure 4.3.: Distribution of zeros of the Riemann ζ-function in the complex plane s ≡ σ + it. There are no zeros in the domain for σ > 1 (downwards stripes) since the Dirichlet series Eq. (4.15) of ζ −1 does not contain any singularities. However, there are zeros on the negative real axis at s = −2, −4, −6, . . . resulting from the poles of the Γ function emerging in the representation, Eq. (4.14) of ζ. There is no zero at s = 0 since the pole of Γ at s = 0 is compensated by the pole of ζ at s = 1. In the critical strip defined by 0 < σ < 1 the zeros are located symmetrically with respect to σ = 1/2, that is if ̺j ≡ σj + itj with 0 ≤ σj ≤ 1/2 is a zero, then ̺j = 1 − σj + itj is also a zero. This property follows from the symmetry relation Eq. (4.2) of ξ. The Riemann hypothesis states that all complex zeros have real part σ = 1/2 and lie on the critical line indicated here by a dashed curve. They are symmetrically located with respect to the real axis and the imaginary parts tj of the first zeros are ±14.13 . . . , ±21.02 . . . , ±25.01 . . .
49
Riemann Hypothesis In his celebrated paper of 1859 Riemann states without proof that the non-trivial zeros of ζ lie on the critical line, that is they are of the form 1 ̺j = + itj . 2 The first zeros along the critical line have the imaginary parts tj = ±14.134725 . . . , ±21.022 . . . , ±25.010856 . . . Riemann in his paper conjectured that the number N of zeros with 0 ≤ σ ≤ 1 and with |t| ≤ T is given by the expression T T N(T ) ∼ ln − 1 + O(ln T ). (4.16) 2π 2π
This formula was finally proven thirty years later by von Mangoldt. For an elementary proof we refer to Problem 4.15. Moreover, the density of zeros of ζ is ln(T /2π) and approaches zero as T → ∞. Only after Riemann’s death it was found that he calculated the first zeros numerically. However, he did not explain how he had obtained his results. This technique was reconstructed by Siegel2 in 1932 from unpublished notebooks of Riemann. Since this time mathematicians have tried to prove the Riemann hypothesis. A major contribution was made by Harald Bohr3 the brother of the physicist Niels Bohr and Edmund Landau4 . They showed that the majority of zeros lies in a narrow strip along the critical line. In 1914 Hardy5 showed that indeed an infinity of zeros of ζ actually lie on the critical line. Moreover, Hardy and Littlewood6 in 1920 proved that the number of zeros on σ ≡ 1/2 for which 0 < t < T is at least of O(T ) as T → ∞. We conclude by presenting in Fig. 4.4 the absolute value of |ζ(s)| of the Riemann ζ-function in the neighborhood of the critical strip. The pole at s = 1 is clearly visible. Moreover, the first non-trivial zeros along the critical line as well as the trivial zeros along the negative real axis can also be seen.
4.3. Mellin Transform In Sec. 4.1 we have expressed the Riemann ζ-function as the product Eq. (4.8) of a function λ(s) and the integral transform w(s) ˜ of the function w = w(s) defined in Eq. (4.7). This integral transform carries the name Mellin transform and is defined for a function f = f (x) by Z∞ f˜(s) ≡ dx f (x)xs−1 . (4.17) 0
2
Carl Ludwig Siegel (1896 – 1981), German mathematician, Professor in G¨ottingen Harald August Bohr (1887 – 1951), Danish mathematician and soccer player, Professor in Copenhagen 4 Edmund Georg Hermann Landau (1877 – 1938), German mathematican, Professor in G¨ottimgen 5 Godfrey Harold Hardy (1877 – 1947), British mathematician, Professor in Cambridge 6 John Edensor Littlewood (1885 – 1977, British mathematician, Professor in Cambridge 3
50
t
−20
20
5
|ζ(s)| 0 −8 0
σ 8
s= −20 5
|ζ(s)| 0
1 2
+ it 20
log |ζ(s)|
20
t
−8
σ
0 8
0
Figure 4.4.: Riemann ζ-function of complex argument s ≡ σ + it represented as |ζ(s)| in the neighborhood of the critical strip. We recognize the pole at s = 1. The cut along the critical line s = 1/2+it shows the first non-trivial zeros (right) of ζ whereas the cut along the real axis (left) displays the trivial zeros at s = −2k. Since the values of |ζ| are rather small in this region of complex space we use a logarithmic scale.
51
When we compare this expression to the integral representation Eq. (4.4) of the Γfunction we note that Γ is the Mellin transform of the decaying exponential function, that is f (x) = e−x . Moreover, we recognize that the representation of a Dirichlet series in terms of the Mellin transform not only holds true for the special case of the Riemann ζ-function but for any absolutely converging Dirichlet series. In the following section we verify this statement. In addition we show that we can also express finite sums in terms of an inverse Laplace transform, which is closely related to the inverse Mellin transform.
4.3.1. Dirichlet Series as a Mellin Transform The representation Eq. (4.8) of ζ in terms of a product of λ and the Mellin transform of w suggests that a similar relation must hold true for any Dirichlet series ∞ X
D(s) ≡
d(n)n−s .
(4.18)
n=1
Indeed, we can express D by the product D(s) = λ
s 2
w˜D
π s/2 ≡ 2 Γ 2s
s
Z∞ dx wD (x)xs/2−1
(4.19)
0
of λ(s) and the Mellin transform w˜D (s) of the function wD (x) ≡
∞ X
2
d(n)e−πn x .
n=1
In order to prove this claim we substitute this expression for wD into the right hand side of Eq. (4.19), interchange summation and integration. With the help of the substitution u ≡ πn2 x we find the formula ∞
π s/2 X d(n) Γ 2s n=1
Z∞ ∞ X 2 dx e−πn x xs/2−1 = d(n)n−s = D(s), n=1
0
which is indeed Eq. (4.18).
4.3.2. Finite Sums and Inverse Mellin Transform Now we go a step further and represent D not just as the product of two functions with one being the Mellin transform but as the inverse Mellin transform of a single function. In particular, we show that we can express the function X ∆(x) ≡ d(n) (4.20) n≤x
52
constructed from the Dirichlet series D(s) ≡
∞ X
d(n)n−s
n=1
in terms of the inverse Laplace transform 1 ∆(x) ≡ 2πi
c+i∞ Z
ds D(s)
xs , s
(4.21)
c−i∞
where the path is to the right of all poles of the integrand. The representation Eq. (4.21) of ∆ is central to the connection between the prime function π and the zeros of the Riemann ζ-function, discussed in detail in Chapter 5. In order to establish Eq. (4.21) we first show that the Dirichlet series D can be represented by the product Z∞ D(s) = s du ∆(eu )e−su ≡ I
(4.22)
0
of the argument s of D and the Laplace transform of ∆(eu ). We then invert the Laplace transform to arrive at Eq. (4.21). We start by first extending the summation in Eq. (4.20) to infinity. For this purpose we introduce the Heaviside step function 1 for x ≥ 0 Θ(x) ≡ 0 for x < 0 which yields ∆(x) ≡
∞ X n=1
d(n)Θ(x − n).
When we substitute this formula into the Laplace transform Eq. (4.22) and interchange summation and integration we find I=
∞ X n=1
Z∞ Z∞ ∞ X u −su d(n) du Θ(e − n)se = d(n) dx Θ(x − n)sx−(s+1) n=1
0
1
where we have introduced in the last step the integration variable x ≡ eu with dx = eu du. Since Θ(x − n) = 0 for x < n the integration starts at x = n and implifies to I=
∞ X n=1
Z∞ ∞ X ∞ −(s+1) d(n) dx s x = d(n) −x−s x=n , n
n=1
53
that is I=
∞ X
d(n)n−s = D(s).
n=1
We are now in the position to invert Eq. (4.22), that is D(s) = s
Z∞ du ∆(eu )e−su 0
which yields 1 ∆(eu ) = 2πi
c+i∞ Z
du D(s)
esu s
c−i∞
and reduces to Eq. (4.21) when we set x ≡ eu .
Problems 4.1 Special values of Γ-function Verify the special values Γ(1) = 1
and
Γ(1/2) = π 1/2
of the gamma function and show that for s ≈ 0 we find 1 Γ(s) ≈ . s 4.2 Gauß representation of Γ-function Prove the Gauß representation Eq. (4.13) of the Γ-function. 4.3 Weierstraß representation of Γ Start from the Gauß representation of Γ derived in Problem 4.2 and obtain the Weierstraß7 representation ∞ Y s −s/k 1 γs 1+ e =s·e Γ(s) k k=1 for 1/Γ(s).
4.4 Product formula for sine function Verify the product formula ∞ x 2 Y 1− sin x = x kπ k=1 for the sine function. 7
Karl Theodor Wilhelm Weierstraß (1815 – 1897), German mathematician, Professor in Berlin
54
4.5 Functional equations of Γ Use the Weierstraß representation of Γ derived in Problem 4.3 together with the product formula of the sine function discussed in Problem 4.4 to show the functional equations Γ(s)Γ(1 − s) = and
π sin(πs)
1 1 π )Γ Γ s+ −s = . 2 2 cos(πs)
4.6 Doubling formula of Γ Verify the doubling formula 1 22s−1 Γ(2s) = √ Γ(s)Γ s + 2 π of the Γ-function. 4.7 Logarithmic derivative of Γ-function Start from the Gauß representation Eq. (4.13) and derive the logarithmic derivative ∞ X Γ′ (s) 1 1 d ln Γ(s) = = −γ − + s ds Γ(s) s j(s + j) j=1
of Γ where γ denotes the Euler-Mascheroni constant defined in Eq. (2.11). In particular, verify d = −γ. ln Γ(s) ds s=1 4.8 Integral representation of ζ Derive the integral representation
1 ζ(s) = Γ(s)
Z∞ xs−1 dx x e −1 0
of the ζ-function. 4.9 Functional equations of ζ Use the functional equations together with the duplication formula of the gamma function derived in Problems 4.5 and 4.6 to cast the functional equation Eq. (4.2) for ξ into the relations πs ζ(s) = π −1 (2π)s sin Γ (1 − s) ζ(1 − s) 2 and (2π)s ζ(1 − s) ζ(s) = 2Γ(s) cos πs 2 for the Riemann ζ-function.
55
4.10 Expansion of ζ around s = 1 Show that in the neighborhood of s = 1 the Riemann ζ-function allows the representation 1 . s−1
ζ(s) ≈ γ +
4.11 Special values of ζ Use the functional equation of ζ derived in Problem 4.9 together with the expansion of ζ discussed in Problem 4.10 and the results of Problem 4.1 to prove ζ(0) = −
1 2
1 ζ ′(0) = − ln(2π). 2
and
4.12 Laurent series of ζ around s = 1 Derive the Laurent expansion ζ(s) =
1 + γ + γ1 (s − 1) + γ2 (s − 1)2 + . . . s−1
where γ denotes the Euler-Mascheroni constant and " n # X (ln k)j 1 γj ≡ lim − (ln n)j+1 . n→∞ k j + 1 k=1 4.13 Sum rules Verify the relations
∞ X µ(n)
n
n=1
and
∞ X Λ(n) − 1 n=1
n
=0
= −2γ
for the M¨obius function µ and the von Mangoldt function Λ using the results of Problem 4.11. 4.14 Zeros of complex functions Show that for every complex valued function f = f (z) which is real along the real axis z0∗ is also a zero provided z0 is a zero. 4.15 Number of zeros of complex functions Show that for every complex valued function f = f (z) the number of zeros is determined by the line integral Z 1 f ′ (z) = var Im ln f (z). dz 2πi f (z) C
Apply this result to verify the prediction by Riemann Eq. (4.16) concerning the number N(T ) of zeros with imaginary part smaller than T .
56
5. Prime Number Theorem In Chapter 3 we have caught a first glimpse of the unique relation between the distribution of prime numbers and the Riemann ζ-function: The logarithmic derivative of ζ as well as the Euler product of ζ contain solely prime numbers. In the present chapter we finally tie the knot between these two fields: We show that the distribution of non-trivial zeros together with the pole of the Riemann ζ-function determines the distribution π of primes. In this way we prove the prime number theorem. The preceding chapters have laid the foundations for establishing the connection between π and the zeros of ζ. Three central tools help us to accomplish this task: (i) The expression Eq. (1.10) for π in terms of the function ϕ, logarithms and integrals over them, (ii) the representation Eqs. (3.22) and (3.24) of the logarithmic derivative of ζ as a Dirichlet series, and (iii) the representation Eq. (4.21) of a series as an inverse Laplace transform.
5.1. New Representation of Prime Function According to Eq. (1.10) the prime function π is the sum of several functions: The first one is the logarithmic integral followed by a constant. The third contribution is the ratio of a sum ϕ of logarithms of primes, a linear function and a logarithm. Finally, the integral contains again ϕ, logarithms and a linear function. In the present section we show that we can replace ϕ by a new function ψ formed out of the expansion coefficients Λ(n) of the logarithmic derivative of the Riemann ζ-function. In this derivation we take advantage of the Chebyshev bound, an inequality on π which can be verified with elementary tools. In this way we find a new representation of π in terms of properties of the Riemann ζ-function. The relation Zx ϕ(x) − x ϕ(y) − y 2 + + dy (5.1) π(x) = Li(x) + ln 2 ln x y (ln y)2 2
brings together π(x) ≡
X
(5.2)
p≤x
counting the numbers of primes smaller or equal to x and the function X ϕ(x) ≡ ln p
(5.3)
p≤x
consisting of the sum of logarithms of the primes smaller or equal to x.
57
We now make the connection between ϕ and the Riemann ζ-function. For this purpose we first recall from Chapter 3 that the logarithmic derivative ∞
ζ ′ (s) X d = Λ(n)n−s − ln ζ(s) = − ds ζ(s) n=1 of ζ is a Dirichlet series whose coefficients ln p for n = pm Λ(n) ≡ 0 otherwise
(5.4)
(5.5)
also involve the logarithms of primes. We can then use the coefficients Λ(n) to build the Chebyshev function X ψ(x) ≡ Λ(n)
(5.6)
p≤x
which is closely related to the function ϕ, defined in Eq. (5.3). Indeed, the sum giving rise to ψ contains all integers n which are powers of a prime. Since the first power, that is m = 1 is also allowed we have all primes p leading to Λ(n) = Λ(p1 ) = ln p. However, there are more terms since all the integers n which are squares, cubes, . . . of primes are included which provides us with additional terms Λ(n = pm ) = ln p. Hence, we arrive at the decomposition X X X ln p ≡ ϕ(x) + δψ(x) ln p = ϕ(x) + ψ(x) = ln p + p≤x
p ≤ x1/m m 6= 1
p1/m ≤ x m 6= 1
where we have used the definition Eq. (5.3) of ϕ and have taken advantage of the fact that powers as well as the logarithm are monotonously growing functions. Next we need to find an estimate for the remaining sum. In fact, here we do not deal with a single sum but several ones, since various m values are allowed. We first consider the sum for a fixed value of m 6= 1 and find X X X δψ ≡ ln p < ln x1/m = ln x1/m , p≤x1/m
that is
p≤x1/m
X
p≤x1/m
ln p < ln x1/m π(x1/m ).
p≤x1/m
Here we have recalled the definition Eq. (5.2) of π. At this stage we take advantage of the Chebyshev bound b1
x x < π(x) < b2 ln x ln x
on π where b1 ≡ ln(2)/4 and b2 ≡ 30 ln 2 discussed in Problem 1.7.
58
(5.7)
Since this inequality involves also the combination x/ ln x it is reminiscent of the prime number theorem. However, it is much weaker since it does not describe the asymptotic behavior of π but only puts a lower and an upper bound on this function. With the help of the Chebyshev bound we finally arrive at the estimate X
1/m
ln p < b2 ln(x
p≤x1/m
x1/m ) = b2 x1/m = O(x1/m ). 1/m ln(x )
Since x1/m < x1/2 for 1 < m the leading contribution in δψ is of the order x1/2 which yields ψ(x) = ϕ(x) + O(x1/2 ), that is up to terms of the order x1/2 the two functions ψ and ϕ are identical. Hence, we can use ψ rather than ϕ in Eq. (5.1) to derive an expression for π, that is ψ(x) − x + π(x) = Li(x) + ln x
Zx
dy
2
ψ(y) − y . y (ln y)2
(5.8)
Here we have neglected the constant contribution 2/ ln 2, since we only consider the limit of large values of x.
5.2. New Representation of Chebyshev function So far the function ψ has been defined as a finite sum over the expansion coefficients Λ(n), Eq. (5.5) of the logarithmic derivative of ζ. In the present section we derive a different representation of ψ which is well suited to investigate its asymptotic limit for large values of x. Indeed, the dominant contributions grow linearly and with the square root of x. Moreover, we can trace these contributions back to the pole and the non-trivial zeros of the Riemann ζ-function, respectively.
5.2.1. Outline of the Approach We start from the representation of ψ as the inverse Laplace transform 1 ψ(x) = 2πi
c+i∞ Z
c−i∞
ζ ′ (s) xs ds − ζ(s) s
(5.9)
of the Dirichlet series Eq. (5.4) discussed in Chapter 4. Here the path of integration is such that all singularities are to its left as shown in Fig. 5.1. We use the Cauchy1 integral theorem I f (z) 1 dz = f (a) (5.10) 2πi z−a 1
Augustin-Louis Cauchy (1789 – 1857), French mathematician
59
t
−6
−4
−2
0
1
σ
Figure 5.1.: Closed path of integration in the complex plane s ≡ σ + it for the integral representation Eq. (5.9) of the function ψ. The path is to the right of s = 1 + it and runs parallel to this line. It closes in infinity circumventing the pole s = 1 and the trivial as well as the non-trivial zeros of the Riemann ζ-function. valid for an analytic function f to evaluate the integral Eq. (5.9). Here the closed path of integration circumvents the simple pole at z = a in the counterclockwise direction. This approach provides us with an asymptotic expansion of ψ. The application of the Cauchy integral theorem amounts to finding all poles of the integral Eq. (5.9). They are given by (i) s = 0, (ii) the zeros ̺j of ζ and (iii) the pole of ζ at s = 1 and are indicated in Fig. 5.1 by crosses. We might wonder how a pole of ζ can make a contribution to the integral since ζ is not vanishing at s = 1 but is infinite. In order to answer this question we approximate ζ in the neighborhood of s = 1 by ζ(s) ≈
b s−1
where b is a constant. With ζ ′ = −b(s − 1)−2 we find the expression 1 ζ ′ (s) =− . ζ(s) s−1
(5.11)
Indeed, the pole in ζ manifests itself as pole of ζ ′/ζ in the integral representation of ψ.
60
With the help of the Cauchy integral theorem we find the representation X ψ(x) = I0 (x) + Ij (x) + Ip (x)
(5.12)
j
of ψ where the integral 1 Ij (x) ≡ 2πi
I
Cj
ζ ′ (s) xs ds − ζ(s) s
is the contribution due to a circular path Cj around the pole sj with s0 = 0, sj = ̺j and sp = 1.
5.2.2. Discussion of Individual Contributions We now discuss all contributions and then show that the dominant ones result from the pole and the non-trivial zeros of ζ. Contribution from s0 = 0 We start our analysis by first considering the integral I0 resulting from the vanishing of the denominator. In this case the Cauchy integral theorem yields ′ s I 1 ζ (s) x ζ ′(0) I0 (x) ≡ ds − =− , (5.13) 2πi ζ(s) s ζ(0) that is the logarithmic derivative of ζ at the origin. In particular, this contribution is independent of the variable x of the function ψ(x). Contributions from zeros of ζ We now address the contributions from the zeros ̺j of ζ where ζ(̺j ) = 0. For this purpose we first note that a Taylor expansion ζ(s) = ζ(̺j ) + ζ ′ (̺j )(s − ̺j ) + · · · = ζ ′(̺j )(s − ̺j ) of ζ around ̺j yields 1 Ij (x) = 2πi
I
ds −
s ζ ′ (s) x x̺j . = − ζ ′ (̺j )(s − ̺j ) s ̺j
In the last step we have again made use of the Cauchy integral theorem. We now distinguish between the trivial zeros ̺k along the negative real axis with ̺k = −2k and k = 1, 2, . . . and the non-trivial zeros ̺l . According to the Riemann hypothesis they all are along the critical line with ̺l = 1/2 ± itl . Hence, the integrals Ij corresponding to the trivial zeros take the form Ik =
1 1 2k x2k
(5.14)
61
whereas the ones resulting from the non-trivial zeros read √ e±itl ln x x1/2±itl = −2 x . Il = − 1/2 ± itl 1 ± 2itl
(5.15)
Contribution from Pole s = 1 We conclude by returning to the pole contribution I 1 xs Ij (x) = ds 2πi s(s − 1) where we have made use of the expression Eq. (5.11) for the logarithmic derivative of ζ. The Cauchy integral theorem yields immediately Ip (x) = x.
(5.16)
5.2.3. Asymptotic Behavior We are now in the position to collect all terms. When we substitute the explicit expressions Eqs. (5.13), (5.15), and (5.16) for I0 , Ij , and Ip into the sum Eq. (5.12) of residues we arrive at the new representation ∞ ∞ 1 X x−2k √ X eitk ln x e−itk ln x ζ ′ (0) +x+ − x2 + ψ(x) = − ζ(0) 2 k 1 + 2itk 1 − 2itk k=1
k=1
of the function ψ. In the limit of large √ values of x the dominant contributions are the term linear in x and the one containing x. They result from the pole at s = 1 and the non-trivial zeros of ζ, respectively. The third term originating from the trivial zeros of ζ approaches zero as x increases. Since the contribution from s = 0 is of the order unity we find asymptotically √ (5.17) ψ(x) = x − 4 xT (x) + O(1) where we have introduced the abbreviation T (x) ≡
∞ X cos(tk ln x) + 2tk sin(tk ln x) k=1
1 + 4t2k
.
(5.18)
5.3. Explicit Asymptotic Expression for π In the preceding section we have derived the asymptotic expression Eq. (5.17) for ψ(x). We now substitute this result into the formula Eq. (5.8) for the prime function. In this way we derive the celebrated prime number theorem.
62
0.1
0.05
305
310
315
320
325
330
335
340
-0.05
-0.1
Figure 5.2.: text
73
72
71
70
69
310
320
330
340
Figure 5.3.: Prime function Eq. (5.19) as the sum of a smooth function provided by the logarithmic integral and the steps contained in the function T (x) and defined in Eq. (5.18).
63
Indeed, with the help of Eq. (5.17) we find from Eq. (5.8) the asymptotic expression √
x π(x) = Li(x) − 4 T (x) − 4 ln x
Zx 2
dy √
T (y) y (ln y)2
for the prime function. When we neglect the integral we obtain the final formula √ x T (x) π(x) = Li(x) − 4 ln x
(5.19)
for the asymptotic distribution of the primes. In Fig. 5.2 we display the function T (x). It is in the shape of steps and gives precisely the distribution of primes. When we recall that tk denotes the imagining parts of the nontrivial zeros of the Riemann ζ-function we recognize that now we have finally connected the distribution of primes with the distribution of zeros.
64
6. Gauß Sums The physicist Victor Weißkopf is credited with the phrase: “It is well-known to those who know it well.” In this spirit we emphasize the well known fact that under appropriate conditions there can be a huge difference between an integral and a discrete summation. Notwithstanding that, for example, the Riemann integral is defined by a sum the two can display dramatically different behavior. A striking example illustrating this point is the familiar Cornu1 spiral defined by the Fresnel2 integral
E(x) ≡
Zx
y2 dy exp iπ 2
0
and its discrete version EG (N; τ ) ≡
N X n=1
exp iπn2 τ
commonly referred to as curlicue. In Fig. 6.1 we display the two curves as a function of x or of N for fixed τ . We note that EG has a much richer structure which sensitively depends on the choice of τ . Sums of the form EG we call Gauß sums. In the present chapter we discuss them in more detail and in particular, make contact with the Riemann ζ-function. 1 2
Alfred Cornu (1841 – 1902), French physicist Augustin Jean Fresnel (1788 – 1827), French physicist and engineer
10
0.6 8
0.4 0.2 -0.75
-0.5
-0.25
6
0.25
0.5
0.75
4
-0.2 -0.4
2
-0.6 -14
-12
-10
-8
-6
-4
-2
Figure 6.1.: textetextext
65
6.1. Theta Functions in Various Forms In Sec. 4.1 we have shown that the Riemann ζ-function is proportional to the Mellin transform of the function ∞ X 2 w(z) ≡ e−πn z (6.1) n=1
where we now have introduced the complex variable z ≡ x + iy. Obviously the sum in Eq. (6.1) is convergent provided Re z ≡ x > 0. Moreover, in the same section we have defined the theta function ∞ X
ϑ(z) ≡
e−πn
2z
= 1 + 2w(z)
(6.2)
n=−∞
in which the summation extends symmetrically from minus infinity to plus infinity. However, there is one more version of the theta function namely θ(z) ≡
∞ X
e2πin
2z
= ϑ(−2iz),
(6.3)
n=−∞
where the sum is convergent provided Im z ≡ y > 0. In contrast to w and ϑ this function is periodic with the period unity. Indeed, we find immediately ∞ ∞ X X 2 2 2 θ(z + 1) = e2πin z e2πin = e2πin z = θ(z) n=−∞
n=−∞
since 2
e2πin = 1. From the functional equation 1 ϑ(x) = √ ϑ x
1 x
for ϑ discussed in Appendix E together with Eq. (6.3) we find immediately the functional equation r i i θ(z) = (6.4) ϑ 2z 2z for θ. In the present chapter we concentrate on the properties of θ.
66
6.2. Emergence of Gauß Sums In the preceding section we have shown that θ is periodic with a period 1. Moreover, it satisfies the functional equation Eq. (6.4). For this reason θ is a special example of a modular function as explained in Appendix ??. However, there are many more symmetry relations contained in θ. In the present section we focus on only one of them and investigate the behavior of Θ in the neighborhood of a rational number a/q with a and q being mutual primes. This task leads us straight to the Gauß sums. We now analyze the function ∞ X 2a a 2 e2πin q e−2πn y θ z = + iy = q n=−∞
(6.5)
in the limit of 0 < y → 0. Obviously in this case the sum does not converge anymore. Nevertheless, we can derive the asymptotic behavior of θ. We first note that the function 2a γn ≡ exp 2πin q is periodic with the period q, since 2 2 a γn+q = exp 2πi(n + 2qn + q ) = γn e4πina e2πiaq = γn . q
(6.6)
Hence, it makes sense to take advantage of this periodicity in the sum of Eq. (6.5). Indeed, we first sum over all terms that have the same phase and then sum over the different phases. This procedure corresponds to applying the summation formula ∞ X
fn =
q−1 ∞ X X
fkq+r
r=0 k=−∞
n=−∞
to Eq. (6.5) which then takes the form θ
a + iy q
=
q−1 X ∞ X
2
γkq+r e−2π(kq+r) y .
r=0 k=−∞
The periodicity condition Eq. (6.6) finally yields θ
a + iy q
=
q−1 X r=0
γr
∞ X
2
e−2π(kq+r) y .
k=−∞
67
We now perform the limit 0 < y → 0 by replacing the summation over k by integration, that is Z∞ ∞ X 1 2 −2π(kq+r)2 y dk e−2π(kq+r) y = √ . e ≈ q 2y k=−∞ −∞
Indeed, for y → 0 this sum is just the definition of the above Riemann integral. Consequently, the desired limit of θ takes the form G(a, q) a √ + iy = lim lim θ 0
q−1 X
2a
e2πin q .
r=0
In Appendix B we evaluate the Gauß sum explicitly and find a G(1, q) G(a, q) = q where the Legendre symbol for odd primes q reads +1 for x2 = a( a 0 if p|a ≡ q −1 otherwise and
mod q)
1 −iπ 2q √ G(1, q) = (1 + i) 1 + e q. 2 We conclude this section by noting that the asymptotic behavior of Θ has also been investigated for other arguments. For example Hardy and Littlewood have focused on quadratic irrational numbers and have used continued fractions to analyze the asymptotics. Jurkat and Fiedler have even derived a distribution function. Moreover, Jens Marklof has used methods from the theory of ergodicity to show a relationship between θ and quantum chaos.
68
7. In a Nutshell Our journey through the sea of prime numbers has taken a long way. The time has come to summarize the main results. For a concise overview we refer to Fig. 7.1 where we depict in a question-answer (Q & A) dialogue the highlights of our trip.. An analysis of the sieve of Eratosthenes in terms of the inclusion-exclusion √ principle has given us an analytical expression for the number of primes between x and x. It was in this connection that the M¨obius function µ has emerged for the first time. However, we have only recognized its true importance when we considered multiplicative convolutions between two arithmetic functions. Here the instrument of the generating function which transforms the convolution into a multiplication leads us straight to the definition of the Riemann ζ-function: The Dirichlet series corresponding to the generating functions of the unity function and the M¨obius function are ζ and ζ −1, respectively. Two striking clues of a strong connection between the Riemann ζ-function and the distribution of primes offer themselves: Although ζ as defined for example by the Dirichlet series does not have any obvious connection to primes, its logarithmic derivative is a Dirichlet series which involves as expansion coefficients all positive primes, and no other integers. Moreover, the Euler product represents ζ as an infinite product of factors consisting of primes only. “Go west, young man”, is the famous phrase made popular by Horace Greenley (1811 – 1872) in an editorial of the New York Tribune in 1855. We have followed his advice and have extended the Riemann ζ-function to the west. In the form of the Dirichlet series ζ is restricted to arguments s with Re s ≡ σ > 1. However, the representation of ζ in form of the Mellin transform of the theta series w has opened the critical domain for us. Moreover, the functional equation for ξ relates the values of ζ in the west, that is for σ < 0 to the ones in the east, that is for σ > 0. In this way we have extended the Riemann ζ-function to the whole complex plane. The Stieltjes representation of ζ −1 as the generating function of the M¨obius function is a direct consequence of the M¨obius inversion formula. Its importance cannot be overestimated. Since the corresponding Dirichlet series represents a holomorphic function ζ −1 must be free of any singularities in the domain where the sum converges, that is for σ > 1. Consequently, ζ cannot have any zeros in this region. However, ζ does have zeros in the complex plane. The representation of ζ with the help of the functional equation for ξ brings out most clearly that the only zeros for σ < 0, that is the trivial zeros are located at the negative even integers. Therefore, all nontrivial zeros must lie in the critical strip 0 < σ < 1. Moreover, according to the Riemann hypothesis all non-trivial zeros must be on the critical line, that is along s = 1/2 + it. So far this conjuncture has only been confirmed by numerical work. No rigorous proof
69
Q
&
A
How to find primes
sieve of Eratosthenes
count primes
prime function π(x) inclusion-exclusion principle
prove M¨obius inversion formula simplify convolution
neutral element ε = µ ∗ 1 of convolution generating functions
additive
power series
multiplicative
Dirichlet series
define Riemann ζ-function
Dirichlet series with 1(n) = 1
represent ζ −1
Dirichlet series with µ(n) (Stieltjes representation)
express logarithmic derivative of ζ
Dirichlet series with Λ(n)
represent ζ in primes
Euler product
to extend ζ into critical strip
Mellin transform
left half plane
functional equation
locate zeros of ζ no zeros for Re s ≥ 1
trivial zeros −2, −4, . . .
Stieltjes representation poles of Γ-function
non-trivial zeros 1/2 ± itj
Riemann hypothesis
overall behavior
logarithmic integral
staircase
Fourier series T (x)
express asymptotics of π
connect π with ζ overall behavior
pole of ζ
staircase
non-trivial zeros of ζ
connect ζ to Gauß sum
w at rational numbers
Figure 7.1.: Main themes of the booklet formulated as a Question and Answer (Q & A) session.
70
exists. The first million zeros has been calculated and they are in agreement with the Riemann hypothesis. This information is crucial in obtaining an approximate but analytical expression for the prime function π. Indeed, we can approximate π by an integral in the complex plane with the integrand containing the logarithmic derivative of ζ. Hence, π is determined by the pole of ζ together with the non-trivial zeros. The pole contribution provides us with the overall behavior of π as predicted by the prime number theorem. The contributions from the non-trivial zeros yield the steps in π. Theta functions play an important role in the life of the Riemann ζ-function. For example, its extension into the critical strip originates from the Mellin transform of a theta function. For a particular choice of the arguments this theta function reduces to a Gauß sum, that is a finite sum where the summation index enters quadratically. This function is central to the Talbot effect of wave optics. Prime numbers have many mysterious properties. Some have been uncovered and understood in great detail. Some, such as the Riemann hypothesis, has been around for a long time. Everyone believes them to be correct even if no rigorous proof exists. However, a new field for primes opens up on the border between number theory and applications. We look forward to this era.
71
72
A. Riemann’s Original Paper The original manuscript by Riemann on the number of primes below a given number was reported to the K¨oniglich Preußische Akademie der Wissenschaften zu Berlin by Kummer1 on November 3, 1859. It was based on a letter to the academy dated October 19, 1859. Since the paper is difficult to locate we reproduced it here. It is based on the version printed in Riemann collected works. At the end it also contains remarks about the paper made by the editors.
1
Ernst Eduard Kummer (1810 – 1893), German mathematician, Professor in Breslau and Berlin
73
74
75
76
77
78
79
80
81
82
83
84
B. Inclusion-Exclusion Principle In this appendix we present the inclusion-exclusion principle in two different but completely equivalent forms. Moreover, we extend these results to weighted sums. We conclude by briefly applying a modified version of the inclusion-exclusion principle to the sieve of Brun1 .
B.1. Two Different but Equivalent Forms We start from a set M containing a number |M| of elements together with r subsets E1 , . . . , Er of M. The set M0 consists of all elements of M which are not included in any of the subsets E1 , . . . Er . The task is to find the number |M0| of elements of M0 . A frequently used example to illustrate this problem is to invoke a large group of students. Some of them speak in addition to their mother tongue a second language. The situation is complicated even further by the fact that some students speak several foreign languages. The task is to determine the number of students who do not speak any foreign language. The inclusion-exclusion principle predicts this number in two alternative representations. We start our discussion with the form |M0 | ≡
X
m∈M0
r X X = 1 + k=1 m∈M
r X
(−1) .
j1 , . . . jk m ∈ (Ej1 ∩ . . . Ejk )
k
(B.1)
We now verify this relation by assumingthat anelement m ∈ M appears in q(m) q(m) subsets Ej1 ∩ . . . Ejk out of the q subsets E1 , . . . Eq(m) . Since we can form k q(m) q(m) subsets Ejk ∩ Ejl , = q(m) subsets Ej , subsets E1 . . . Eq(m) , that is 2 1 etc. the expression in square brackets in Eq. (B.1) reduces to 1+
r X k=1
1
r X
k
(−1) = 1 +
j1 , . . . jk m ∈ (Ej1 ∩ . . . Ejk )
q(m)
X k=1
q(m) k
k
(−1) = (1 − 1)
q(m)
=
1 for q(m) = 0 0 otherwise. (B.2)
Viggo Brun (1885 – 1978), Norwegian mathematician
85
Here we have made use of the binomial theorem to perform the summation over k. Hence, the inner sum in Eq. (B.1) only provides a contribution to the exterior sum over all elements m of M if q(m) = 0, that is, if m is not a member of any of the subsets E1 , . . . Eq , that is if m ∈ M0 . As a result the sum over the elements of M reduces to one over M0 , only. We can slightly simplify the formula Eq. (B.1) when we include the term +1 in the sum over k which yields r X X |M0| = k=0 m∈M
r X
(−1) .
j1 , . . . jk m ∈ (Ej1 ∩ . . . Ejk )
k
(B.3)
Here we have defined the term k = 0 by unity, that is r X
≡1
(B.4)
j1 , . . . , j0 m ∈ (Ej1 ∩ . . . Ej0 )
The representation of |M0| in Eqs. (B.1) and (B.3) relies on adding up weighted numbers of combinations Ej1 ∩ . . . Ejk that each number m of M is in. An alternative representation is based on the fact that we can also sum the weighted numbers of elements in the individual combinations Ej1 , ∩ . . . Ejk , that is |M0| = |M| +
r X
(−1)
k=1
k
r X
j1 ,...jk
|(Ej1 ∩ . . . Ejk | =
r X k=0
(−1)
k
r X
.
(B.5)
j1 , . . . jk m ∈ (Ej1 ∩ . . . Ejk )
Here we have used again the definition Eq. (B.4) of the k = 0 term and made use of the identity X |M′| = . m′ ∈M′
In this form the equivalence of Eqs. (B.3) and (B.5) stands out most clearly: They differ only in the order of the summation. In the representation Eq. (B.3) of |M0| we first sum (−1)k , whereas in Eq.(B.5) we address this sum last. We conclude by presenting the two forms of the inclusion-exclusion principle for the case of a weighted sum X
m∈M0
f (m) =
X
m∈M
f (m)
r X k=0
r X
(−1)k =
j1 , . . . jk m ∈ (Ej1 ∩ . . . Ejk )
r X k=0
For f (m) = 1 this formula reduces to Eqs. (B.3) and (B.5).
86
(−1)k
X
f (m).
j1 , . . . jk m ∈ (Ej1 ∩ . . . Ejk )
B.2. Sieve of Brun In the preceding section we have verified Eq. (B.1) by performing the inner sum using the binomial theorem. For this purpose we had to assume that the element m ∈ M is also an element of q(m) subsets E1 , . . . Eq(m) . Unfortunately, in general we do not know this function q(m) unless we calculate it explicitly as exemplified in Table 2.1. It is therefore of interest to analyze the effect of not knowing q(m). For example, we can set q(m) ≡ q0 for all elements of M. Needless to say, this assumption cannot reproduce the correct result. However, it will at least provide us with an estimate.
87
88
C. Improved Mertens’ formula
89
90
D. General Convolutions We start from a set of functions whose arguments form a semi group with a connection denoted by “◦”. This operation can be an addition or a multiplication. For two functions f ≡ f (x) and g ≡ g(x) from this set the generalized convolution is then defined by X
h(x) ≡ (f ∗ g)(x) ≡
x1 ◦x2 =x
f (x1 ) · g(x2 )
(D.1)
where the sum is over the products f (x1 ) · g(x2 ) and the arguments x1 and x2 are connected to the argument x of the convolution by x1 ◦ x2 = x. On purpose we have used the variables x, x1 and x2 reminiscent of continuous variables. Although we are mainly concentrating on arithmetic functions whose domain is given by the integers, we also briefly discuss an example of a convolution with continuous variables. Here the sum in Eq. (D.1) transforms into an integral. In this appendix we illustrate the concept of a generalized convolution using the examples of an additive and a multiplicative convolution. Moreover, we present the corresponding generating functions.
D.1. Convolution for Addition We start our discussion with the case of the connection in the semigroup being the addition, that is “◦”≡“+”. This choice results in the convolution X X X h(n) ≡ (f ∗ g)(n) ≡ f (k) · g(m) = f (k) · g(n − k) = f (n − m)g(m). (D.2) k+m=n
m
k
For example, this operation arises when we try to answer the question of how many possibilities h(Z) exist to change Ze into 1e and 2e pieces. The obvious quick response to this task is to take the sum over all possible combinations of integers k and m such that k + 2m = Z which yields the expression X h(Z) = . (D.3) k+2m=Z
We can identify this formula as the convolution of addition when we identify in Eq. (D.2) n = Z and define the functions f (k) ≡ 1
for all k ∈ N
91
and g(m) ≡
1 for m even 0 otherwise.
For a given Z the number of possibilities as defined in Eq. (D.3) is then given by the sum X h(Z) = (f ∗ g)(Z) = f (k) · g(m) k+m=Z
which is indeed of the form Eq. (D.2). The generating functions for the additive convolution are power series X X Fa (z) ≡ f (k)z k and Ga (z) ≡ g(m)z m . m
k
The product Ha ≡ Fa · Ga is again a power series X X X X Ha (z) ≡ f (k)z k · g(m)z m = f (k)g(m)z k+m ≡ h(n)z n m
k
(D.4)
k,m
(D.5)
n
with the additive convolution h(n) ≡
X
f (k)g(m).
k+m=n
The underlying concept of the generating function Eq. (D.4) reducing the additive convolution Eq. (D.2) to a multiplication, Eq. (D.5), is the homomorphism Φa (n) ≡ z n with Φa (k + m) = z k+m = z k · z m = Φa (k) · Φa (m).
(D.6)
A homomorphism is a mapping from one algebraic structure to another under which the structural properties of its domain are preserved in its range in the sense that if ◦ is the operation on the domain and • is the operation in the range then Φa (k ◦ m) = Φa (k) • Φa (m) It is due to this property that the generating function for an additive convolution must be a power series. In the last section of this appendix we shall see that for the case of a multiplicative convolution, the generating function is a Dirichlet series.
D.2. Continuous Variables Before we turn to the multiplicative convolution we briefly discuss the extension of the additive convolution Eq. (D.2) to continuous variables. Here we consider functions
92
f = f (x1 ) and g = g(x2 ) and the convolution h(x) ≡ (f ∗ g)(x) =
Z∞
Z∞
dx1 f (x1 )g(x2 ) =
−∞
−∞ x1 + x2 = x
dx1 f (x1 )g(x − x1 )
(D.7)
following from Eq. (D.1)is completely analogous to the discrete case Eq. (D.2). One application of this formula arises in probability theory where the functions f and g represent probability densities for the independent stochastic variables x1 and x2 . The function h is then the probability density of the variable x1 + x2 = x. The generating function χf (k) ≡
Z∞
−∞
x dx f (x) eik =
Z∞
dx f (x)eikx
−∞
which also carries the name characteristic function is the continuous version of the power series Eq. (D.4). Here we have expressed the exponential exp(ikx) as the xth power of exp(ik) in order to bring out most clearly the close resemblance of the two expressions. Obviously χf is the Fourier transform of f . Hence, it is not surprising that we can reduce the convolution Eq. (D.7) to a multiplication. Indeed, we find χh (k) ≡ χf (k)χg (k) =
Z∞
−∞
dx1
Z∞
ik(x1 +x2 )
dx2 f (x1 )g(x2 )e
−∞
=
Z∞
dx h(x)eikx
−∞
where in the last step we have introduced the integration variable x ≡ x1 + x2 together with Z∞ h(x) ≡ dx1 f (x1 )g(x − x1 ). −∞
We emphasize that the crucial element of our derivation is the property eikx1 · eikx2 = eik(x1 +x2 ) of the exponential function, which sits at the heart of the homomorphism Φa .
D.3. Multiplicative Convolution We finally discuss the case of a multiplicative convolution n X n X X h(n) ≡ (f ∗ g)(n) ≡ f (k)g(m) = f (k)g = f g(m) k m m k·m=n k where the operation ◦ is a multiplication.
93
The corresponding homomorphism is Φm (n) ≡ n−s since only then do we find Φm (m ◦ n) = Φm (m · n) = (m · n)−s = m−s · n−s = Φm (m) · Φm (n).
(D.8)
As a consequence the generating function X f (n)n−s F (s) ≡ n
of the multiplicative convolution is a Dirichlet series. It is due to the property Eq. (D.8) of the homomorphism Φm that the product of two Dirichlet series is again a Dirichlet series.
94
E. Functional Equations In this appendix we derive the functional equation 1 1 1 1 √ −1 + w(x) = √ w x x 2 x of 1 w(x) ≡ 2
"
∞ X
−πn2 x
e
n=−∞
#
−1 ≡
(E.1)
1 [ϑ(x) − 1] . 2
Here we make use of the Poisson summation formula ∞ X
f (n) =
n=−∞
∞ X
f¯(l)
(E.2)
l=−∞
for the coefficients f (n) and their Fourier transform ¯ ≡ f(l)
Z∞
dn f (n)e−2πiln .
−∞
Due to its importance and for the sake of completeness we first briefly review the Poisson summation formula and then turn to the proof of the functional equation Eq. (E.1).
E.1. Poisson Summation Formula For the derivation of Eq. (E.2) we recall the Fourier representation X
n=−∞
∞ X
e−2πiln
(E.3)
1 if a ≤ x ≤ b 0 otherwise
(E.4)
δ(x − n) =
l=−∞
of a comb of Dirac delta functions δ defined by Zb a
dx δ(x) =
and located at all integers n. Here we do not rederive Eq. (E.3) but merely motivate it. Indeed, when x = n is an integer each term in the sum Eq. (E.3) is unity since l · n = s ∈ Z and thus e−2πiln = 1.
95
Consequently the sum diverges, in complete agreement with the behavior of the delta function at x = n. Moreover, for non-integer values of x the oscillations cancel each other giving rise to a vanishing contribution in agreement with the prediction of the delta function which assumes nonzero values only at integers. We now make use of the Fourier representation Eq. (E.3) of the delta comb to derive the Poisson summation formula Eq. (E.2). For this purpose we start from the righthand side of Eq. (E.2) and interchange integration and summation which yields Z∞ ∞ ∞ Z∞ ∞ X X X −2πilx ¯ dx f (x)e = dx f (x) e−2πilx f (l) = l=−∞
l=−∞−∞
−∞
l=−∞
and reduces with the help of Eq. (E.3) to ∞ X
f¯(l) =
l=−∞
Z∞
−∞
dx f (x)
∞ X
n=−∞
δ(x − n) =
Z∞ ∞ X
n=−∞−∞
dx f (x)δ(x − n) =
∞ X
f (n).
n=−∞
In the last step we have used the integral property Eq. (E.4) of the Dirac delta function.
E.2. Functional Equation for ϑ We are now in the position to derive the functional equation Eq. (E.1) by applying the Poisson summation formula to the function ∞ X 2 ϑ(x) ≡ e−πn x (E.5) n=−∞
which yields
ϑ(x) =
∞ Z∞ X
2
dn e−πn x e−2πiln .
l=−∞−∞
With the help of the Gauß integral r 2 Z∞ π β −αx2 +βx dx e = exp α 4α −∞
we arrive at
∞ 1 X −πl2 1 1 1 x = √ ϑ ϑ(x) = √ e x l=−∞ x x
where we have recalled the definition Eq. (E.5) of ϑ. The connection w = (ϑ − 1)/2 between w and ϑ finally yields 1 1 1 1 1 1 1 1 √ ϑ √ −1 −1 = √ w −1 + w(x) = 2 x x 2 x x2 x which is the desired functional equation Eq. (E.1).
96
(E.6)
F. Evaluation of Gauß Sums In this appendix we derive an explicit expression for the Gauß sum G(a, q) ≡
q−1 X
e2πin
2a q
(F.1)
n=0
where a and q are mutual primes. We first consider the case of a = 1 and then generalize this result to a 6= 1.Throughout this appendix we follow a derivation provided for the first time by Dirichlet. In these derivations we will meet a substantial portion of number theory. Here we do not use the language of this branch such as congruences and quadratic residues but express everything in lay man’s terms. However, in the last section of this Appendix we return to this topic and introduce the reader to these ideas.
F.1. Gauß Sum G(1, q) In order to simplify the notation we use in the remainder of this appendix the abbreviation e(z) ≡ e2πiz (F.2) and the Gauß sum takes the form
q−1 2 X n e G(1, q) = . q n=0
(F.3)
Moreover, we introduce the function q−1 X (n + x)2 e . G(x) ≡ q n=0
(F.4)
where the argument x is limited to the closed interval between zero and unity, that is 0 ≤ x ≤ 1. We first note that G(x = 1) = G(x = 0) since 2 X X q−1 q q−1 2 q−1 2 X X m m m (n + 1)2 e e e e G(x = 1) = = = +1 = = G(0). q q q q n=0 m=1 m=1 m=0 97
Here we have made use of the identity e(q) = e2πiq e(0) = 1, following from the definition Eq. (F.2) of e(z). We now extend G beyond the interval 0 ≤ x ≤ 1 with the period of unity. The so-extended function G is identical to G(x) between any two neighboring integers. This feature allows us to expand G into a Fourier series ∞ X
G≡
m=−∞
with the coefficient
Z1
Gm ≡
Gm e(−mx)
(F.5)
dx G(x)e(mx).
(F.6)
0
Hence, we find from a comparison of the definitions Eqs. (F.3), (F.4) and (F.5) of G(1, q), G(x) and G together with the explicit formula Eq. (F.6) for Gm the relation G(1, q) = G(x = 0) =
∞ X
m=−∞
Gm .
(F.7)
When we substitute G(x) as defined in Eq. (F.4) into the expression Eq. (F.6) for the Fourier coefficient Gm we find 1
Gm = or
q−1 Z X
1
Gm =
q−1 Z X
dx e
n=0 0
dx e
n=0 0
(n + x)2 + mqx q
(n + x + 12 mq)2 q
,
1 2 e −mn − qm . 4
When we recall from the definition Eq. (F.2) of e(z) the relation e(−m · n) = exp [−2πim · n] = 1
(F.8)
and substitute τ ≡ n + x + 21 qm we arrive at q−1 1 2 X Gm = e − qm 4 n=0
n+1+ 12 qm
Z
n+ 12 qm
dτ e
1
2
τ q
2 q+Z2 qm τ 1 2 dτ e = e − qm 4 q 1 qm 2
where in the last step we have performed the sum over n by adding up the contributions in the integral.
98
We are now ready to evaluate the Gauß sum G(1, q) with the help of Eq. (F.7) which leads us to 1
q+Z2 qm 2 τ 1 2 , dτ e G(1, q) = Gm = e − qm 4 q m=−∞ m=−∞ ∞ X
∞ X
1 qm 2
or G(1, q) =
∞ X
n=−∞
e −qn2
q(n+1) Z
dτ e
qn
2
τ q
∞ X
1 2 + e − q(2n + 1) 4 n=−∞
q(n+1+ 12 )
Z
q(n+ 12 )
dτ e
τ2 q
where we have decomposed the sum over m into even and odd terms. Since e(−qn2 ) = 1 and
1 1 2 e − q(4n + 4n + 1) = e − q = e−iπq/2 4 4 are independent of n we find −iπq/2
G(1, q) = 1 + e
√
q
Z∞
−∞
du e u2 .
(F.9)
Here we have performed the substitution u ≡ τ 2 /q. The remaining Fresnel integral follows by comparing the result of Eq. (F.3) for q = 1 and Eq. (F.9) which leads to G(1, 1) = 1 = (1 − i)
Z∞
−∞
or
Z∞
−∞
du e u2 =
du e u2 ,
1+i 1 = . 1−i 2
When we substitute this expression back into Eq. (F.9) we arrive at the final result q 1 √ G(1, q) = (1 + e−iπ 2 )(1 + i) q. 2
We conclude by noting that G(1, q) is periodic in q with a period 4. Indeed, we find √ (1 + i) q for q = 4m √ q for q = 4m + 1 G(1, q) = (F.10) 0 for q = 4m + 2 √ i q for q = 4m + 3. 99
F.2. Gauß Sum G(a, q) Here we distinguish the two cases that (i) there exists an integer x such that (a − x2 )/q is an integer, or (ii) no such x exists. In both cases we can express the Gauß sum G(a, q) by G(1, q). However, the derivations in the cases are very different. We start our analysis with the first case. Since a − x2 = r · q where r is an integer we find X X q−1 q−1 q−1 X a 2 x · m)2 x · m)2 2 e G(a, q) = e r·m + e = m = q q q m=0 m=0 m=0 where in the last step we have made use of Eq. (F.8). The remaining sum is again G(1, q). However, the individual terms are in a different order. In order to recognize this fact we consider the example a = 14 and q = 5 which yields x = 3 since 14 − 32 = 14 − 9 = 5. For this example we find 4 X (3m)2 9 36 81 144 G(14, 5) = e = e(0) + e +e +e +e 5 5 5 5 5 m=0 which we must compare to 2 4 X 1 4 9 16 m = e(0) + e +e +e +e . G(1, 5) = e 5 5 5 5 5 m=0 Indeed, when we recognize that 9 1·5+4 4 4 e =e =e 1+ =e 5 5 5 5 and e as well as e and
e
we find
81 5
36 5
144 5
=e
=e
=e
13 · 5 + 16 5
1 =e 7+ 5
16 16 = e 13 + =e 5 5
7·5+1 5
27 · 5 + 9 5
1 =e 5
9 9 = e 27 + =e 5 5
1 4 9 16 G(14, 5) = e(0) + e +e +e +e = G(1, 5). 5 5 5 5 100
Indeed, G(14, 5) is identical to G(1, 5). The only difference is the order of the terms. This example also illlustrates how to proceed for a general pair (a, q). We now address the case when no such x exists. An example is the pair a = 3 and q = 5. Indeed, the equation 3 − x3 = r · 5 cannot be solved for integers x and r. In this case we consider the combination q−1 2 q−1 X X m a 2 e e S ≡ G(1, q) + G(a, q) = + (F.11) m . q q m=0 m=0 We again gain insight into the behaviour of the sums by considering a special example, such as a = 3 and q = 5. Here we find 4 9 16 3 12 27 48 1 +e +e +e +e +e +e +e S = e(0) + e 5 5 5 5 5 5 5 5 or 1 4 4 1 3 2 2 3 S = e(0) + e +e +e +e +e +e +e +e . 5 5 5 5 5 5 5 5 When we combine the terms this sum reduces to a geometric sum 4 m 4 m X X 1 =2 e . S=2 e 5 5 m=0 m=0
With the help of relation
r X
xm =
m=0
we arrive at
S=2
1 − xr 1−x
1 − e(1) = 0. 1 − e 51
When we substitute this result into the definition Eq. (F.11) of S we find G(3, 5) = −G(1, 5). When we introduce the Legendre symbol +1 if there is an x with q|(a − x2 ) a −1 if there is no such x ≡ q 0 if q|a
(F.12)
we can combine both cases and find
a G(1, q). G(a, q) = q We conclude by noting that ????????????
101
F.3. Congruences and Quadratic Residues We now express the main results and the essential ingredients of the derivations presented in the preceeding sections in the language of number theory. For this purpose we introduce the concept of congruences and quadratic residues. A frequently used abbreviation for the ratio (a − b)/c of the difference a − b between two numbers a and b and c 6= 0 being an integer, that is c being a divisor of a − b as expressed by c|(a − b) is a ≡ b (mod c) which reads “a is congruent to b modulo c”. For example, 8 ≡ 2(mod 2) since (8 − 2)/2 = 3 is an integer. However, 8 ≡ / 3(mod 2) since (8 − 3)/2 = 2.5 is not an integer. When we use this definition of a congruence we can express the different cases q = 4m, q = 4m + 1, q = 4m + 2 and q = 4m + 3 in Eq. (F.10) distinguishing the individual formulae for G(1, q) by q ≡ 0(mod 4), q ≡ 1(mod 4), q ≡ 2(mod 4), and q ≡ 3(mod 4), respectively. Moreover, we note that a congruence also appears in the definition Eq. (F.12) of the Legendre symbol. Indeed, the formula q|(a − x2 ) translates into x2 ≡ a(modq). Here we ask for an integer solution x of this equation for a given pair (a, q) where q is a prime and the greatest common denominator between them is 1. If q is not a divisor of a and x is a solution of the above equation then a is a quadratic residue module q. In the derivation of G(a, q) we have considered two examples to illustrate the equivalence of the corresponding sums with G(1, q). In the first case we had chosen a = 14 and q = 5. Indeed, q is a prime and the greatest common denominator between 14 and 5 is 1. Moreover, we find x = 3 since 14 − 32 = 5 and thus 14 is a quadratic residue modulo 5. On the other hand a = 3 is not a quadratic residue modulo 5, since the equation 3 − x2 = r · 5 does not have integer solutions for x or r.
102
G. Further Reading Historical Papers and Riemann hypothesis The original paper by Riemann making the connection between π and ζ is ¨ B. Riemann, Uber die Anzahl der Primzahlen unter einer gegebenen Gr¨oße, Monatsberichte der Berliner Akademie, 671-680 (1859). This article can also be found in H. Weber, Eds., Bernhard Riemanns gesammelte mathematische Werke und wissenschaftlicher Nachlaß, 136-144 (1876). ¨ H. von Mangoldt, Zu Riemann’s Abhandlung “Uber die Anzahl der Primzahlen zu einer gegebenen Gr¨oße” , Crelle’s J. 114, 255-305 (1895) H. Bohr and E. Landau, Comptes rendus, 12 janvier 1914 The results of Hardy and Littlewood appeared in G.-H. Hardy, Sur les z´eros de la fonction ζ(s) de Riemann, Comptes Rendus 158, 1012-1014 (1914) G.-H. Hardy and Littlewood, The Zeros of the Riemann Zeta-Function on the Critical Line, Math. Zeit., 10, 283-317 (1921); a short summary can be found in Proc. London Math. Soc. 19, 16 (1920) ¨ C.L. Siegel, Uber Riemann’s Nachlass zur analytischen Zahlentheorie, Quell. Stud. Gesch. Mat. Astr. Physik 2, 45-80 (1932) A classic on prime numbers is E. Landau, Handbuch der Lehre von der Verteilung der Primzahlen (Chelsea Publishing Company, New York, 1974)
Properties of Riemann ζ-function A. Ivi´ c, The Riemann Zeta-Function, Theory and Applications (Dover Publications, Mineola, 2003) E.T. Whittaker and G.N. Watson, A Course of Modern Analysis (Cambridge University Press, Cambridge, 1984) For the numerical evaluation of the zeros of ζ we refer to A.M. Odlyzko, On the distribution of spacings between zeros of the zeta function, Math. Comp. 48 273-308 (1987) A.M. Odlyzko, The 1022 -nd zero of the Riemann zeta function, Contemp. Math. 290
103
139-144 (2001)
Popular Expositions There exist many elementary books summarizing the history and importance of the Riemann hypothesis, see for example D. Rockmore, Stalking the Riemann hypothesis (Pantheon Books, New York, 2005) K. Sabbagh, The Riemann hypothesis—The greatest unsolved problems in mathematics (Farrar, Straus and Giroux, New York, 2002) M. du Sautoy, The music of primes (Harper Collins, New York, 2003)
Number Theory For an introduction to number theory, see for example G.E. Andrews, Number theory (Dover, New York, 1994) O. Ore, Number theory and its history (Dover, New York, 1988) For the connection between number theory and cryptography see for example Neal Koblitz, A Course in Number Theory and Cryptography (Springer Verlag, Heidelberg, 1994) An excellent source of quick information on mathematics is E.J. Borowski and J. M. Borwein, The Harper Collins dictionary of mathematics (Harper Collins, New York, 1991)
Special Functions For comprehensive summaries of the properties of special functions see E. Jahnke and F. Emde, Tables of Functions with Formulae and Curves (Dover, New York, 1945) I.S. Gradshteyn and I.M. Ryzhik, Table of Integrals, Series and Products (Academic Press, New York, 1980) M. Abramowitz and I.A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables, National Bureau of Standards, Applied Mathematics Series, Vol. 55 (US Government Printing Office, Washington, 1970) There even exists now an updated electronic version of this famous book which includes plots of the Riemann ζ-function and the prime function. The Digital Library of Mathematical Functions (DLMF) is a web-based collection of formulas (http://dlmf.nist.gov), cross-linked and with live graphics, that can be magnified and rotated. It will be operational in the year 2006. Moreover, Mathematica and Maple have also implemented these functions. For more information see for example M.V. Berry, Why are special functions special?, Physics Today 54, 11-12 (2001)
104
In the computer age with Mathematica and Maple it is still impressive to see the three dimensional models of special functions made out of paper and plaster a hundred years ago and displayed in the foyer of the mathematical institute at the University of G¨ottingen.
Quantum Information The original paper on the Shor algorithm can be found in P.W. Shor Proc. of 35th Annual Symposium on the Foundations of Computer Science, (IEEE Computer Society, Los Alamitos, 1994), S. 124 (short version); also in SIAM J. Sci. Statist. Comput. 26 1484 (1997) or alternatively quant–ph/9508027v2 (1996); For a concise exposition see for example A. Ekert and R. Josza, Rev. Mod. Phys. 68 733 (1996). There exist many introductions into the field of quantum information. See for example H.-K. Lo, T. Spiller, and S. Popescu Introduction to Quantum Computation and Information (World Scientific Publishing, Singapore, 1998); J. Gruska, Quantum Computing(McGraw Hill, London, 1999); D. Bouwmeester, A. Ekert, and A. Zeilinger, The Physics of Quantum Information (Springer, Berlin, 2000); M.A. Nielsen and I.L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, Cambridge, 2000); G. Alber, T. Beth, M. Horodecki, P. Horodecki, R. Horodecki, M. R¨ otteler, H. Weinfurter, R. Werner, and A. Zeilinger, Quantum Information: An Introduction to Basic Theoretical Concepts and Experiments (Springer, Berlin, 2001).
Quantum Chaos and Riemann ζ-Function J.P. Keating, The Riemann zeta function and quantum chaology, In G. Casati, I. Guarneri, and U. Smilanski (eds.), Quantum Chaos, 145-185 (North Holland, Amsterdam, 1993) F. Steiner, Quantum chaos, in Schlaglichter der Forschung. Zum 75. Jahrestag der Universti¨at Hamburg 1994. (Hrsg.: R. Ansorge), 543-564 (Dietrich Reimer Verlag, Berlin, Hamburg, 1994) M.V. Berry and J.P. Keating, The Riemann zeros and eigenvalue asymptotics, SIAM Rev. 41 236-266 (1999)
Talbot Effect and Gauß Sums The phenomenon of the Talbot effect is discribed in H.F. Talbot, Facts relating to optical science, Phil. Mag. 9 401-407 (1836) B. Rohwedder, Atom optical elements based on near field grating sequences, Fortschr.
105
Phys. 47 883-911 (1999) M.V. Berry and S. Klein, Integer, fractional and fractal Talbot effects, J. Mod. Optics 43 2139-2164 (1996) For the relation between the Talbot effect and Gauß sums M.V. Berry, I. Marzoli, and W.P. Schleich Curlicues are introduced in M.V. Berry and J. Goldberg, Renormalization of curlicues, Nonlinearity 1 1-26 (1988)
Properties of Gauß Sums H. Fiedler, W. Jurkat, and O. K¨ orner, Asymptotic expansions of finite theta series, Acta Arithm. 32 129-146 (1977) J. Marklof, Limit Theorem for Theta Sums with Applications in Quantum Mechanics (Shaker Verlag, Aachen, 1997) S. Lang, Algebraic number theory (Addison-Wesley Press, New York, 1970) D. Hannay and M.V. Berry, Quantization of linear maps on a torus - Fresnel diffraction by a periodic grating, Physica 1D, 267-290 (1980)
Gauß Sums and Factorization J.F. Clauser and J.P. Dowling, Factoring integers with Young’s N-slit interferometer Phys. Rev. A 53 4587-4590 (1996) W.G. Harter, Quantum-fractal revival structure in CN quadratic spectra: Base-N quantum computer registers Phys. Rev. A 64 012312 (2001) H. Mack, M. Bienert, F. Haug, M. Freyberger and W.P. Schleich Wave Packets can Factorize Numbers Physica Status Solidi (b) 233, No. 3, 408-415 (2002) H. Mack, M. Bienert, F. Haug, F.S. Straub, M. Freyberger and W.P. Schleich Wave Packet Dynamics and Factorization of Numbers in: “Experimental Quantum Computation and Information”, Eds.: F. De Martini and C. Monroe (Elsevier, Amsterdam, 2002)
106
H. Solutions to the Problems 1.1 Prime number decomposition of factorial The number n! = 1 · 2 · 3 . . . n is a natural number which according to the Fundamental Theorem of Arithmetic can be represented as a product of powers αj of primes pr , that is n! = pα1 1 · pα2 2 . . . pαr r . We now need to obtain an estimate for the number r of primes involved as well as the individual powers αj . For this purpose we consider the example of n = 100 and ask the question how many times the factor 5 appears. From the equation 100! = 1· 2· 3· 4· 5· 6 . . . 10 . . . 15 . . . 25 . . . 50 . . . 75 . . . 100 = 1· 2· 5· 6 . . . 2 · 5 . . . 3 · 5 . . . 5 · 5 . . . 2 · 5 · 5 . . . 3 · 5 · 5 . . . 4 · 5 · 5 we find that the factor 5 appears first as integer multiples of 5. Indeed, we deal with [100/5] = 20 factors of 5. Moreover, the square of 5, that is 25 and its multiple integers appear. In this case with have [100/52 ] = [100/25] = 4. Hence, we find the total power e5 of the factor 5 in 100! is 100 100 e5 = + = 20 + 4 = 24. 5 52 Here we have used the square symbol although it was not necessary. However, the example of the factor 3 brings out the necessity. Indeed, 3 appears [100/3] = 33 times. The factors 9 = 32 , 27 = 33 and 81 = 34 appear [100/32] = 11, [100/33] = 4 and [100/34 ] = 1, respectively leading to a total power 100 100 100 100 + + + = 33 + 11 + 4 + 1 = 49 e3 = 3 32 33 34 of the prime 3. Hence, the result for an arbitrary prime p is obviously x x x x ep ≡ + 2 +··· = +O . p p p p2
107
1.2 Stirling formula When we make use of the functional relation ln n(a · b) = ln a + ln b we can express the resulting sum ! Y X ln n! = ln(1 · 2 · 3 . . . n) = ln m = ln m m≤n
m≤n
in terms of the Stieltjes Integral, that is ln n! =
X
ln m =
m≤n
Zn
d[x] ln x.
1
For increasing arguments x the integrator [x] is an increasing step function which is constant between two integers but jumps discontinuously at every integer. This behaviour of the integrator leads together with the integrand to a sum of logarithms evaluated at consecutive integers. Since it would be more convenient to perform a Riemann integral we add and subtract the corresponding Riemann integral ln n! =
Zn
dx ln x −
Zn
d(x − [x]) ln x ≡
dx ln x −
Zn
dΨ(x) ln x
1
1
1
1
Zn
leading to the saw tooth integrator Ψ(x) ≡ x − [x] −
1 2
displayed in Fig. H.1. Here we have introduced the constant shift 1/2. As a consequence the values of Ψ are between −1/2 and 1/2. Moreover, Ψ is a periodic function with perod 1 and the area underneath Ψ over one period vanishes, that is Zn+1 dx Ψ(x) = 0. n
We recall the integral relation Zn 1
dx ln x = (x ln x − x)|n1 = n ln n − n + 1
and apply integration by parts to the integral Zn 1
108
dΨ(x) ln x = Ψ(x) ln x|n1 −
Zn 1
dx Ψ(x)
1 x
Figure H.1.: Saw tooth integrator Ψ(x) ≡ x − [x] − 1/2. which yields ln n! = n ln n − n + 1 − Ψ(n) ln n −
Zn
1 dx Ψ(x) , x
1
or
Zn 1 1 ln n − n + 1 − dx Ψ(x) . ln n! = n − 2 x 1
Here we have made use of the fact that Ψ(n) = −1/2. We now need to obtain an estimate for the remaining integral. For this purpose we use integration by parts one more time and find n Zn Zn Zx Zx 1 1 1 dx Ψ(x) = dy Ψ(y) + dx dy Ψ(y) 2 , x x x 1
or
Zn
1
1 1 dx Ψ(x) = x n
1
Zn
dy Ψ(y) +
1
1
1
1
Zn
Zx
dx
1
dy Ψ(y)
1 . x2
(H.1)
1
Due to the saw tooth nature of Ψ the integral of Ψ is bounded. Hence, the right hand side of Eq. (H.1) is of the order unity, that is Zn
dx Ψ(x)
1 = O(1) x
1
109
leading to the asymptotic behavior 1 ln n! = n − ln n − n + O(1). 2
1.3 Another glimpse of the prime number theorem We start by taking the logarithm of the prime number decompositon Y n! = pe p p≤x
of n! with the estimate
x ep = + O p
x p2
for the exponent ep derived in Problem 1.1 which yields X ln p X X x x +O ln p = x + O(x). ln n! = ep ln p = p p2 p p≤x p≤x p≤x From the Stirling formula ln n! = n ln n + O(n) we thus find by comparing the left and right sides of the equation x ln x + O(x) = x
X ln p p
p≤x
the result
X ln p p≤x
p
+ O(x),
= ln x + O(1).
1.4 Sum over the inverse of primes We start from the identity S≡
X1 p≤x
p
=
Zx
dη(t)
1 ln t
3/2
representing the sum S in terms of the Stieltjes integral with the integrator η(t) ≡
X ln p p≤t
p
.
The lower boundary 3/2 is in some sense arbitrary and could lie anywhere between unity and two. We do not choose unity since this will lead to singularities due to the logarithm in the denominator.
110
Again we perform integration by parts, which yields x Zx 1 1 + dt η(t) , S = η(t) ln t 3/2 t(ln t)2 3/2
or
x
Z X ln p 1 1 S= + dt η(t) . p ln x t(ln t)2 p≤x 3/2
Here the contribution from the lower boundary at t = 3/2 vanishes since there are no primes for x < 2. From Problem 1.3 we recall the relation X ln p p
p≤x
= ln x + O(1)
which yields S =1+O
1 ln x
+
Zx
dt η(t)
1 . t(ln t)2
3/2
In order to obtain an estimate for the remaining integral we add and subtract the integral Zx Zx 1 3 ln t = dt = ln ln x − ln ln dt 2 t(ln t) t ln t 2 3/2
3/2
which yields S = ln ln x + a + O where
1 ln x
Z∞ Z∞ 1 1 3 + dt [η(t) − ln(t)] − dt [η(t) − ln(t)] . a ≡ 1 − ln ln 2 2 t(ln t) t(ln t)2 x
3/2
1.5 Mertens’ formula When we insert the identity 1 = e1/p · e−1/p into the product Y Y 1 1 −1/p 1/p 1− 1− = e ·e = p p p≤x p≤x
Y
p≤x
−1/p
e
!
Y
p≤x
1/p
e
1 1− p
111
we find with the help of Problem 1.4 for the factor in parenthesis # " Y X1 1 −1/p = exp − ln ln x − a + O , e = exp − p ln x p≤x p≤x that is Y
e−1/p =
p≤x
Hence, we arrive at
1 −a e + O(1). ln x
Y b 1 = + O(1) 1− p ln x p≤x
where
−a
b≡e
Y
1/p
e
p≤x
1 . 1− p
1.6 Derivation of Eq. (1.9) with Riemann integral We now verify the relation ϕ(x) + π(x) = ln x
Zx
dy
2
ϕ(y) ϕ(x) + I(x) 2 ≡ ln x y (ln y)
(H.2)
connecting the prime function π(x) ≡ and the function ϕ(x) ≡
X p≤x
X
=
p≤x
ln p =
X
Θ(x − p)
(H.3)
X
ln p Θ(x − p)
(H.4)
p
p
summing the natural logarithms of the primes smaller or equal to x. Here Θ denotes the Heaviside step function 1 for 0 ≤ x Θ(x) ≡ 0 for x < 0. For this purpose we substitute the expression Eq. (H.4) for ϕ into the integral I on the right hand side of Eq. (H.2) and find interchanging integration and summation I(x) =
X p
ln p
Zx 2
Θ(y − p) X dy = ln p y (ln y)2 p
Zx p
dy
1 Θ(x − p). y (ln y)2
(H.5)
In the last step we have applied the step function Θ to the integral. However, here we have to be a little careful, since Θ(y − p) not only replaces the lower limit 2 of the
112
integral by p but also brings back a step function of argument x − p. Indeed, this fact is due to the property of Θ which allows only contributions provided p < y. Since y ≤ x we find again p ≤ x and thus I(x) =
X
ln p
p≤x
Zx
dy
p
1 . y (ln y)2
Integration by parts of the remaining integral yields Zx p
that is
x Zx 1 1 1 ln y + 2 dy ln y dy , 2 = 2 (ln y) y=p y (ln y) (ln y)3 y p
Zx p
1 1 1 dy − +2 2 = ln x ln p y (ln y)
and thus −
Zx
dy
p
Zx
dy
p
1 y (ln y)2
1 1 1 − . 2 = ln x ln p y (ln y)
When we substitute this result into Eq. (H.5) we arrive at I(x) = −
X
ln p
p≤x
X ϕ(x) 1 + =− + π(x), ln x p≤x ln x
in agreement with Eq. (H.2).
2.1 Sieve of Brun 3.1 Alternative representation of M¨ obius function 3.2 Convergence of Dirichlet series for ζ The most popular test of convergence is the ratio test which states that a sum ∞ X
S≡
an
n=0
is convergent if
an+1 < 1. lim n→∞ an Unfortunately when applied to the Dirichlet series ζ(s) ≡
∞ X n=1
−s
n
=
∞ X eit ln n n=1
nσ
113
this test is inclusive since the limit 1 n+1 = lim exp −σ ln 1 + =1 lim exp −σ ln n→∞ n→∞ n n of the ratio is unity. Here we have used the triangular inequality ∞ X 1 |ζ(s)| ≤ ≡ S(σ). nσ n=1
We now consider a different convergence test, such as a comparison with the geometric series ∞ X 1 xn = 1−x n=0 which is convergent for |x| < 1. For this purpose we decompose S into a sum S≡ of sub-sums Sn ≡
∞ X
Sn
n=1
n −1 2X
m=2n−1
1 mσ
consisting of 2n − 1 − 2n−1 + 1 = 2n (1 − 21 ) = 2n−1 terms. We estimate these sub sums Sn by replacing each term by the first one, that is Sn <
n −1 2X
m=2n−1
1 (2n−1 )σ
=2
n−1
1 2n−1
σ
=
1 2n−1
σ−1
=
1 2σ−1
n−1
and thus find the upper bound n−1 X m ∞ ∞ X 1 1 = S< 2σ−1 2σ−1 m=0 n=1 of S to be determined by a geometric sum. Since the latter is convergent provided 2−(σ−1) ≡ exp [−(σ − 1) ln 2] < 1 we find σ > 1.
3.3 Inequality for ζ In order to derive the inequality 1 1 ≤ ζ(s) ≤ +1 s−1 s−1 114
(H.6)
x−s
1
1
3
2
4
5 x
Figure H.2.: Bounds on ζ(s) for a real valued argument s using the areas underneath two step functions represented by solid and dashed lines approximating the continuous function f (x) = x−s . The sum of the areas f (n) · 1 = n−s of rectangles defined by the height f (n) and the width 1 with 1 ≤ n, that P is n=1 n−s is an upper bound for the integral of f . Likewise the sum of rectangles f (n) · 1 with 2 ≤ n is a lower bound. we approximate according to Fig. H.2 the darkened area underneath the curve f (x) = x−s , that is the integral Z∞ I = dx x−s = − 1
∞ 1 1 −(s−1) = x s−1 s−1 1
by appropriately constructed step functions familiar from the definition of the Riemann integral. Indeed, the area underneath the step function depicted in Fig. H.2 by a solid line and constructed from rectangles of height f (n) with 1 ≤ n provides us with an upper bound on I. Hence, the sum ∞ X n=1
f (n) · 1 =
∞ X
n−s = ζ(s)
n=1
of the rectangles is larger than or equal to the area I underneath f which translates into the inequality 1 ≤ ζ(s). s−1 Moreover, the step function displayed by dashed lines and built out of rectangles whose upper right corners are defined by f (n) provides us with a lower bound on I. In this
115
case the sum
∞ X n=2
f (n) · 1 =
∞ X
f (n) =
n=2
∞ X n=1
is smaller than or equal to I, that is
ζ(s) − 1 ≤ or
n−s − 1 = ζ(s) − 1
1 , s−1
1 + 1. s−1 When we combine both bounds we arrive at the inequality Eq. (H.6). ζ(s) ≤
3.4 Alternative derivation of Stieltjes representation When we multiply the Dirichlet series Eqs. (3.17) and (3.20) representing ζ and M, respectively we arrive at ζ(s) · M(s) =
∞ X
m,n=1
1(m) · µ(n)(m · n)
−s
=
∞ X
h(k)k −s
k=1
where due to Eqs. (3.10) and (3.16) the coefficient X k X h(k) ≡ 1 µ(d) = µ(d) = δk,1 d d|k
d|k
reduces to a Kronecker delta. Hence, we find ζ(s) · M(s) = or
∞ X
δk,1 k −s = 1−s = 1
k=1
M(s) = ζ −1(s) =
∞ X
µ(n)n−s .
n=1
3.5 Euler representation of ζ from geometric series In order to derive the Euler representation Y (1 − p−s )−1 ζ(s) = p
of the Riemann ζ-function using the geometric series ∞
X 1 = xn 1 − x n=0 116
(H.7)
valid for |x| < 1 we start from the right hand side of Eq. (H.7) and note that |p−s | ≡ p−σ < 1 since p ≤ 2 and 1 < σ ≡ Re s. Therefore, we can expand the ratio ∞ X 1 p−np s = 1 − p−s n =0 p
into a power series. When we repeat this procedure for every factor in the product of Eq. (H.7) we find with the help of Eq. (3.25) and f = 1 the Dirichlet series ∞ YX
p−np s =
p np =0
∞ X
n−s = ζ(s)
n=0
of the Riemann ζ-function.
3.6 Logarithmic derivative of ζ (i) We start from the definition Λ(n) ≡
ln p for n = pm 0 otherwise
of the von Mangoldt function and evaluate the sum X S≡ Λ(d) d|n
of divisors d of n. Due to the definition of Λ the only contribution to the sum arises when n is of the form pm . In this case the divisors are 1, p, p2 . . . pm . Hence, the sum S takes the form S = Λ(1) + Λ(p) + Λ(p2 ) + · · · + Λ(pm ) = m · ln p = ln pm = ln n, that is ln n =
X
Λ(d).
(H.8)
d|n
Here we have used the property Λ(1) = 0. According to the M¨obius inversion formula, Eq. (H.8) is equivalent to X d Λ(n) = ln µ(d) n d|n
and with the help of the definition of Λ we arrive at X d ln p for n = pm ln µ(d) = . 0 otherwise n d|n
117
(ii) The Euler product representation ζ(s) =
Y p
(1 − p−s )−1
of ζ yields X
ln ζ(s) = −
p
ln(1 − p−s )
which after differentiation leads to X X ln p d d −s 1 ln ζ(s) = p = − p−s . −s ds −s ds 1 − p 1 − p p p Here we have made use of the relation d −s d p = e−s ln p = − ln p e−s ln p = − ln p p−s . ds ds When we expand the expression (1 − p−s )−1 using the geometric series we arrive at ln ζ(s) = −
∞ X X k=0
ln p p
−(k+1)s
p
=−
∞ X X k=0
ln p p−ks .
(H.9)
p
We can replace the two sums by a single sum over the integers n ≡ pk introducing the arithmetic function ln p for n = pm Λ(n) = 0 otherwise which casts Eq. (H.9) into the Dirichlet series ∞
X d ln ζ(s) = Λ(n)n−s . ds n=1 Indeed, when we substitute the definition of Λ into this series we find Eq. (H.9).
4.1 Special values of Γ-function From the integral representation Z∞ Γ(s) = du e−u us−1 0
we derive immediately
Z∞ ∞ Γ(1) = du e−u = −e−u 0 = 1 0
118
and
Z∞ Z∞ 1 2 Γ = du e−u u−1/2 = 2 dx e−x = π 1/2 . 2 0
0
Here we have introduced the integration variable u ≡ x2 with du = 2xdx and used the expression Eq. (E.6) for the Gauß integral. Moreover, from the functional equation Eq. (4.5) we find for s ≈ 0 with Γ(1) = 1 1 1 1 Γ(s) = Γ(1 + s) ≈ Γ(1) = . s s s
4.2 Gauß representation of Γ-function We start by iterating the functional equation Eq. (4.5) of the Γ-function in the form 1 Γ(s) = Γ(s + 1) s n times which yields Γ(s) =
1 Γ(s + n). s · (s + 1) · (s + 2) . . . (s + n − 1)
When we now consider the limit of n → ∞ we can apply the method of steepest descent to the rest term Z∞ Z∞ −x s+n−1 Γ(s + n) = dx e x = dx xs−1 e−(x−n ln x) . 0
0
The point xs of steepest descent follows from the condition 0=
d n (x − n ln x) = 1 − , dx x
that is for xs = n. For |s| ≪ n the function xs−1 is slowly varying compared to the exponential. Therefore, it is sufficient to evaluate it at the point xs = n of steepest descent. Since the term is independent of the integration variable we can take it out of the integral and arrive at Z∞ s−1 dx e−x xn = ns−1 Γ(n + 1) = ns−1 n!. Γ(s + n) ≈ n 0
In the limit of n → ∞ this relation is exact. Hence, we arrive at the desired Gauß representation n!ns−1 Γ(s) = lim n→∞ s · (s + 1) · (s + 2) . . . (s + n − 1) of the Γ-function.
119
4.3 Weierstraß representation of Γ In order to prove the Weierstraß representation ∞ Y s −s/k 1 γs 1+ e =s·e Γ(s) k k=1
(H.10)
of Γ we start from the Gauß representation, Eq. (4.13) in the form s(1 + s)(1 + 2s ) . . . (1 + 1 = lim Γ(s) n→∞ n!ns−1 that is
When we note the relation
s )(n n−1
− 1)!
,
∞ Y s −s 1 1+ n . = lim Γ(s) n→∞ k=1 k
n−s = e−s ln n and insert the term 1 = e−s/k es/k in the product we find n−1 Y s −s/k s/k −s ln n 1 1+ e e e = s lim n→∞ Γ(s) k k=1
which reduces with the help of the formula n−1 Y
"
es/k = exp s
k=1
n−1 X 1 k=1
k
#
to the result " n−1 !# n−1 X1 Y s −s/k 1 e exp s = s lim 1+ − ln n . n→∞ Γ(s) k k k=1
k=1
Now we are in the position to perform the limit n → ∞ which with the definition Eq. (2.11) of the Euler-Mascheroni number γ yields the Weierstraß representation Eq. (H.10).
4.4 Product formula for sine function 4.5 Functional equations of Γ In order to verify the relation Γ(s)Γ(1 − s) =
120
π sin(πs)
(H.11)
we start from the Weierstraß representation of Γ in the form 1 1 es/1 es/2 . . . Γ(s) = e−γs s s (1 + 1 )(1 + 2s ) . . . and find with the expression 1 γs 1 e−s/1 e−s/2 . . . Γ(−s) = − e s s (1 − 1 )(1 − 2s ) . . . for the product 1 Γ(s)Γ(−s) = −s2
1−
s 2 1
1−
s 2 2
...
−1
1 =− 2 s
(∞ Y k=1
1−
) s 2 −1 k
.
With the help of the product representation ∞ x 2 Y 1− sin x = x kπ k=1 of the sine function derived in Problem 4.4 we arrive at 1 π Γ(s)Γ(−s) = −s sin(πs) and the functional relation (−s)Γ(−s) = Γ(1 − s)
of the gamma function yields the desired result Eq. (H.11). When we substitute the argument s + 21 into the functional equation Eq. (H.11) we obtain the expression 1 π 1 Γ −s = Γ s+ 2 2 cos(πs) where we have used the relation sin(α + π/2) = cos α.
4.6 Doubling Formula of Γ 4.7 Logarithmic derivative of Γ-function When we take the logarithm of the Gauß representation Eq. (4.13) we find " # n−1 X ln Γ(s) = lim ln n! + (s − 1) ln n − ln(s + k) n→∞
k=0
which after differentiation yields # " " # n−1 n−1 n−1 X X d 1 1 X 1 1 1 = lim ln n − . ln Γ(s) = lim ln n − − + − n→∞ n→∞ ds s + k k s k s + k k=1 k=0 k=1 (H.12)
121
Here we have added and subtracted an extra sum which allows us to identify the EulerMascheroni constant, Eq. (2.11) and guarantees that the last sum converges. Indeed, we arrive at ∞ X d 1 1 ln Γ(s) = −γ − + s . ds s k(s + k) s=1
or
In particular, we find from Eq. (H.12) # " # " n n−1 X X 1 1 d = lim ln n − , = lim ln n − ln Γ(s) n→∞ n→∞ ds 1+j k s=1 j=0 k=1 # " n−1 ′ X Γ (s) d 1 1 = = −γ. ln Γ(s) − = lim ln n − n→∞ ds Γ(s) k n s=1 s=1 k=1
When we recall from Problem 4.1, the value Γ(1) = 1 we arrive at Γ′ (1) = −γ.
4.8 Integral representation of ζ We start from the integral representation Z∞ Γ(s) = dx e−x xs−1 0
of the gamma function and substitute x ≡ nu which yields Z∞ Γ(s) = ns du e−nu us−1, 0
or −s
n
1 = Γ(s)
Z∞ du e−nu us−1 . 0
When we sum both sides over n and recall the Dirichlet series of ζ we arrive at ∞ X n=1
−s
n
1 = ζ(s) = Γ(s)
Z∞ X ∞ n du e−u us−1 = n=1
0
that is 1 ζ(s) = Γ(s)
Z∞ du 0
e−u 1 us−1 = −u 1−e Γ(s)
Here we have made use of the geometric series.
122
1 Γ(s)
Z∞ du 0
1 − 1 us−1, 1 − e−u
Z∞ us−1 du u . e −1 0
4.9 Functional equations for ζ We start from the functional equation ξ(s) = ξ(1 − s) of ξ(s) ≡ π ζ(s) = π
−(1−s)/2
Γ
which yields π or
−s/2
Γ
s 2
s
−s/2
2
Γ
ζ(s)
1−s 2
ζ(1 − s)
1−s Γ 2 ζ(s) = π −1/2 π s ζ(1 − s). Γ 2s With the duplication formula 1 22x−1 Γ (2x) = √ Γ (x) Γ x + 2 π
(H.13)
(H.14)
of the gamma function discussed in Problem 4.6 we find ζ(s) = 2(2π)s
Γ(−s) ζ(1 − s) Γ Γ − 2s s 2
which with the functional relation s s 1 2π Γ Γ − = − 2 2 s sin πs 2
derived in Problem 4.5 yields
ζ(s) = π −1 (2π)s sin
πs
Γ (1 − s) ζ(1 − s). 2 Here we have used the functional equation (−s)Γ(−s) = Γ(1 − s). We replace in the expression Eq. (H.13) for ζ the formula 1 s π Γ = − s 1 2 2 Γ 2 + 2 cos πs 2 with the help of Problem 4.5 which yields π ζ(s) = π −1/2 π s 1 s Γ 2+2 Γ
1 s 2
cos
πs 2
ζ(1 − s).
The duplication formula Eq. (H.14) of Γ in the form √ s 1 s Γ = 2 π2−s Γ(s) + Γ 2 2 2
yields
ζ(s) =
(2π)s 2Γ(s) cos
πs 2
ζ(1 − s). 123
4.10 Expansion of ζ around s = 0 When we start from the expression Eq. (4.11) for ξ and use the definition Eq. (4.6) of ξ the formula ∞ i h s/2 Z s−2 s+1 1 π 1 2 + x 2 w(x) + ζ(s) = dx x − s − 1 s Γ 2s 1
brings out most clearly the pole at s = 1. We now consider the limit s → 1 and recall from Problem 4.1 the result Γ(1/2) = π 1/2 which immediately yields ζ(s) = a0 +
1 s−1
where we have introduced the abbreviation Z∞ a0 ≡ dx x−1 + x−1/2 w(x) − 1. 1
In order to evaluate the constant a0 we use the functional relation 1 1 1 1 √ −1 w(x) = √ w + x 2 x x of w derived in Appendix E for the first term in the integral and find Z∞ Z∞ Z∞ 1 1 1 1 −3/2 −1/2 a0 = dx x w + dx x w(x) + dx −1 − x 2 x−3/2 x 1
1
1
which after the substitution y ≡ x−1 , that is dy = x−2 dx = −y 2 dx reduces to Z∞ ∞ n a0 = dy y −1/2 w(y) − x−1/2 1 − lim (ln x) 1 − 1, n→∞
0
that is, with the help of the definition Eq. (4.7) of w to n−1 Z∞ X 2 dy y −1/2 e−πk y − ln n . a0 = lim n→∞
k=1 0
After the substitution u ≡ πk 2 y we arrive at " n−1 # X1 a0 = lim − ln n = γ. n→∞ k k=1
Here we have used the Gauß integral Eq. (E.4) and the definition Eq. (2.11) of the Euler-Mascheroni constant.
124
4.11 Special values of ζ We start from the functional equation ζ(s) = π −1 (2π)s sin
πs 2
Γ(1 − s)ζ(1 − s)
derived in Problem 4.9 and consider the limit s → 0 by recalling from Problems 4.1 and 4.7 the expansion Γ(1 − s) = Γ(1) − Γ′ (1)s + · · · = 1 + γs + . . . and from Problem 4.10 the Laurent expansion Γ(1 − s) = γ − which together with sin yields or
πs 2
≈
1 + ... s π s 2
1 1 ζ(s) ≈ − (2π)s (1 + γs)(1 − γs) = − es ln(2π) 1 + O(s2) 2 2 1 ζ(s) ≈ − [1 + s ln(2π)] . 2
Thus we arrive at ζ(0) = − and
1 2
1 ζ ′(0) = − ln(2π). 2
4.12 Laurent series of ζ around s = 1 4.13 Sum rules In order to verify the relation
∞ X µ(n) n=1
we first recall the behavior
n
=0
1 (H.15) s−1 of ζ in the neighborhood of s = 1 and then consider the limit s → 1 in the representation ζ(s) ≈ γ + ∞
X µ(n) 1 = . ζ(s) n=1 ns 125
We therefore arrive at ∞ X µ(n) n=1
ns
=
s−1 1 1 ≈ ≈ (s − 1) [1 − γ(s − 1)] ≈ s − 1 1 = ζ(s) 1 + γ(s − 1) γ + s−1
which for s → 1 leads to the desired relation. We can verify the formula ∞ X Λ(n) − 1 n
n=1
= −2γ
by investigating the limit s → 1 of
∞ ∞ ∞ X X X ζ ′ (s) Λ(n) − 1 −s −s − − ζ(s) = Λ(n)n − n = ζ(s) ns n=1 n=1 n=1
which yields with the expansion Eq. (H.15) the expression ∞ X 1 1 1 1 Λ(n) − 1 ≈ ≈ [1 − γ(s − 1) − 1] − γ = −2γ. 1 − γ + s 2 n (s − 1) γ + s−1 s−1 s−1 n=1
4.14 Zeros of complex functions We start from the Taylor expansion f (z) ≡
X
fn z n
n
of a holomorphic function f . For f to be real an the real axis we have to require the reality relation fn∗ = fn . As a consequence we find by taking the complex conjugate of the definition X fn z0n = 0 f (z0 ) = n
of the zero z0 the relation
0 = f ∗ (z0 ) =
X n
fn∗ (z0n )∗ =
X
fn (z0∗ )n ,
n
confirming that z0∗ is also a zero of f . From the definitions Eqs. (4.1), (4.8) and (4.14) of ζ in terms of the Dirichlet series, the integral transform, or the functional equation, respectively, we note that ζ is real along the real axis. Hence, the non-trivial zeros must indeed be symmetric with respect to the x-axis.
4.15 Number of zeros of complex functions 5.1 Proof of Chebyshev bound
126
Mathematical Index Greek Letters Γ = Γ(s) gamma function, 42 γ Euler-Mascheroni constant, 22 h i γn ≡ exp 2πin2 aq , 67 ∆ = ∆(x), 52
δ(x) Dirac delta function, 95 δij Kronecker delta, 30 ε = ε(n) neutral element of convolution, 29 ζ = ζ(s) Riemann ζ-function, 32 Θ Heaviside step function, 53 q θ = 2zi ϑ 2zi , 66 ϑ = ϑ(x) theta function, 43 κ(d) = xd − xd , 21
Λ = Λ(n) von Mangoldt function, 36
λ = λ(s) =
πs , Γ(s)
44
ϕ = ϕ(x) sum over logarithms, 8 ψ = ψ(x) Chebyshev function, 36
Miscellaneous Notations (m, n) = 1 m is prime to n or mm and n are coprime, 26 (m, n) ≡ k k is highest common divisor of m and n, 26 1 = 1(n) function which maps all integers to unity, 25 a Legendre symbol, 101 q a ≡ b (mod c), (a − b)|c integer, 102 C c = c(s), 46
D D = D(s) Dirichlet series, 25
µ = µ(x) M¨obius function, 19 ξ(s) = π −s/2 Γ 2s ζ(s), 41
d|n divisor of n, 3
̺k , ̺j trivial and non-trivial zeros of ζ, 48
Epj ∩ Epk denotes the set of elements common to both Epj and Epk , 16
π = π(x) prime function, 6
E
127
E generating function of ε, 34 e(z) ≡ e2πiz , 97 Epj set of multiples of prime pj , 14
N N(T ) number of zeros of ζ on critical line with imaginary part smaller T , 50
P
F F = F (s) Dirichlet series, 31
P (z) product of primes smaller or equal to z, 19
f˜ Mellin transform of f , 50 f (n), g(n), and h(n) arithmetic functions, 26 f ∗ g convolution, 28
R R(z) remainder in sieve, 21 S
G
G = G (s) Dirichlet series, 31
H
d|n
sum over divisor d of n, 19
s = σ + it complex argument, 32
T = T (x) saw tooth function, 62 W
L d|P (z)
P
T
H = H (s) Dirichlet series, 31
P
sum taken over all m which are prime to n, 26
(m,n)=1
G(a, b) Gauß sum, 68
L(z) =
P
µ(d) , d
21
Li = Li(x) logarithmic integral, 8
w = w(x), 43 wD = wD (x), 52
X M
[x] largest integer not exceeding x, 14
M set of integers, 13 |M| number of elements in M, 14 M0 set of primes surviving sieve, 14
128
Z z ∗ complex conjugate of z, 48
Index ante, 2 arithmetic functions, 25 multiplicative, 26 Bellman, vii binomial theorem, 86 Bohr, Harald August, 50 Carroll, Lewis, vii Cauchy integral theorem, 59 Cauchy, Augustin-Louis, 59 Chebyshev bound, 11, 57, 58 proof, 11, 126 function, 36, 58, 59 asymptotic behavior, 62 contribution from s0 , 61 contribution from pole s = 1, 62 contribution from zeros of ζ, 61 definition, 59 inverse Laplace transform, 59 new representation, 59 Chebyshev, Pafnuti Lvovich, 36 Clemenceau, George, v commutative set, 28 complex functions number of zeros, 56, 126 zeros, 56, 126 composite number, 3 congruences, 102 convolution continuous variables, 92 characteristic function, 93 Fourier transform, 93 generating function, 93 definition, 28
for addition, 91 generalized form, 91 holomorphism, 92 multiplicative convolution, 93 homomorphism, 94 neutral element, 30 of two arithmetic functions, 28 Cornu spiral, 65 Cornu, Alfred, 65 cryptography, 104 curlicue, 65 Dirac-δ function comb of, 95 definition, 95 Fourier representation, 96 Dirichlet series, 31 as a Mellin transform, 52 of ζ, 32 of ζ −1 , 33 Dirichlet series of ζ ′/ζ, 35 Dirichlet,Peter Gustav Lejeune, 25 divisor, 3 Eratosthenes, 1 sieve, see Sieve of Eratosthenes Euclid, 1 Euler product representation, see Riemann ζ-function Euler, Leonhard, 1 Euler-Mascheroni constant, 22 famous conjectures, 2 Fourier series, see Gauß sums Fourier transform, see convolution Fresnel integral, 65, 99
129
Fresnel, Augustin Jean, 65 functional equation for ξ, 44 for Γ, 42, 55, 120 for ϑ, 96 for ζ, 55, 123 for w, 95 fundamental theorem of arithmetic, 3 Γ-function, 42, 46 doubling formula, 55, 121 extended definition, 42 functional equation, 42, 55, 120 Gauß representation, 46, 54, 119 integral representation, 42 logarithmic derivative, 55, 121 simple poles, 46 special values, 54, 118 Weierstraß representation, 54, 120 Gauß integral, 96 Gauß sums, 65, 67, 68, 97, 105 G(1, q), 97 G(a, q), 100 calculation with geometric sum, 101 factorization, 106 Fourier series, 98 Gauß, Karl Friedrich, 1 generating functions, 31 for µ, 34 for 1, 34 for convolution of addition, 92 of multiplication, 31, 94 geometric sum, 101 God, 1 Goldbach, Christian, 2 conjectures, 2 Greenley, Horace, 69 Hadamard, Jaques Salomon, 1 Hardy, Godfrey Harold, 50 harmonic series, see Riemann ζ-function Heaviside step function, 53, 112 Hilbert, David, 1
130
holomorphic function, 47 homomorphism, see convolution inclusion-exclusion principle, 15, 16, 85 Kronecker, Leopold, 25 Kummer, Ernst Eduard, 73 Landau, Edmund Georg Hermann, 50 Laplace transform inverse Laplace transform, see Mellin transform Legendre symbol, 68, 101 Legendre, Adrien Marie, 8 Littlewood, John Edensor, 50 logarithmic integral, 8 Mangoldt, Hans von, 36 Mellin transform, 50 definition, 50 Γ-function as, 52 inverse, 52 inverse Laplace transform, 53 Laplace transform, 53 Mellin, Robert Hjalmar, 41 Mersenne, Marin, 2 M¨obius function alternative representation, 26, 38, 113 definition, 19, 26 multiplicative nature, 27 M¨obius inversion formula, 31 M¨obius, August Ferdinand, 6 natural numbers, 1, 2 number theory, 104 Poisson summation formula, 95 prime function, 6, 112 analytical asymptotic behavior, 62 asymptotic behavior, 6 Gauß conjecture, 7, 8 Legendre conjecture, 8 definition, 6 exact relation between π and ϕ, 9 expression for π, 59 final formula, 64
lower bound, 8 new representation, 57 Stieltjes integral, 6 Stieltjes representation, 6 integrator, 6 prime number theorem, 1, 57 primes definition, 2 infinite amount, 3 Mersenne, 2 quadratic residues, 102 quantum chaology, v quantum computer, v quantum information, v, 105 Riemann ζ-function, 32 bounds on, 39, 114 convergence, 33 convergence of Dirichlet series, 33, 38, 113 ratio test, 113 Dirichlet series, 32 Euler representation, 36 arithmetic functions, 37 from geometric series, 39, 116 expansion around s = 0, 124 expansion around s = 1, 56 extension into critical strip, 43 whole complex plane, 42 functional equation, 55 functional equations, 123 harmonic series, 33 integral representation, 55, 122 Laurent series, 56, 125 logarithmic derivative, 35, 36, 39, 117 original paper, 103 pole, 33, 50, 60 special values, 56, 125 Stieltjes representation, 33 alternative derivation, 39, 116 sum rules, 56, 125 zeros, 47
holomorphic function, 47 non-trivial zeros, 48, 50 number of zeros, 50 trivial zeros, 47, 48, 50 Riemann hypothesis, 41, 48, 50 Riemann, Georg Friedrich Bernhard, 1 Shor, Peter, v Siegel, Carl Ludwig, 50 Sieve of Brun, 23, 87, 113 Sieve of Eratosthenes, 4, 5, 15 compact formula, 20 set problem, 13 sine function, 54, 120 Snark, vii special functions, 104 staircase-like function, 10, 63 Stieltjes integral, 6 Stieltjes, Thomas Jan, 6 Stirling formula, 10, 108 Talbot effect, vii, 105 θ-functions, 66 definition, 66 functional equation, 66 twin prime problem, 2 Vall´ee-Poisson, Charles de la, 1 Weierstraß, Karl Theodor Wilhelm, 54 ξ-function functional equation, 44 symmetry relation, 46
131