Analytic Number Theory - Iwaniec H.,Kowalski E.

American Mathematical Society Colloquium Publications Volume 53

Analytic Number Theory
Henryk Iwaniec
Emmanuel Kowalski


American Mathematical Society Providence, Rhode Island

Editorial Board: Susan J. Friedlander, Chair; Yuri Manin; Peter Sarnak

2000 Mathematics Subject Classification. Primary 11Fxx, 11Lxx, 11Mxx, 11Nxx, 11T23, 11T24, 11R42.

For additional information and updates on this book, visit www.ams.org/bookpages/coll-53

Library of Congress Cataloging-in-Publication Data
Iwaniec, Henryk.
Analytic number theory / Henryk Iwaniec, Emmanuel Kowalski.
p. cm. (Colloquium publications, ISSN 0065-9258 ; v. 53)
Includes bibliographical references and index.
ISBN 0-8218-3633-1 (acid-free paper)
1. Number theory. I. Kowalski, Emmanuel, 1969- II. Title. III. Colloquium publications (American Mathematical Society) ; v. 53.
QA241.I85 2004
512.7'3-dc22

2004045081

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy a chapter for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for such permission should be addressed to the Acquisitions Department, American Mathematical Society, 201 Charles Street, Providence, Rhode Island 02904-2294, USA. Requests can also be made by e-mail to reprint-permission
© 2004 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America.


The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability. Visit the AMS home page at http://www.ams.org/
10 9 8 7 6 5 4 3 2 1

09 08 07 06 05 04

Contents

Preface ... xi
Introduction ... 1

Chapter 1. Arithmetic Functions ... 9
§1.1. Notation and definitions ... 9
§1.2. Generating series ... 10
§1.3. Dirichlet convolution ... 12
§1.4. Examples ... 13
§1.5. Arithmetic functions on average ... 19
§1.6. Sums of multiplicative functions ... 23
§1.7. Distribution of additive functions ... 28

Chapter 2. Elementary Theory of Prime Numbers ... 31
§2.1. The Prime Number Theorem ... 31
§2.2. Tchebyshev method ... 32
§2.3. Primes in arithmetic progressions ... 34
§2.4. Reflections on elementary proofs of the Prime Number Theorem ... 38

Chapter 3. Characters ... 43
§3.1. Introduction ... 43
§3.2. Dirichlet characters ... 44
§3.3. Primitive characters ... 45
§3.4. Gauss sums ... 47
§3.5. Real characters ... 49
§3.6. The quartic residue symbol ... 53
§3.7. The Jacobi-Dirichlet and the Jacobi-Kubota symbols ... 55
§3.8. Hecke characters ... 56

Chapter 4. Summation Formulas ... 65
§4.1. Introduction ... 65
§4.2. The Euler-Maclaurin formula ... 66
§4.3. The Poisson summation formula ... 69
§4.4. Summation formulas for the ball ... 71
§4.5. Summation formulas for the hyperbola ... 74
§4.6. Functional equations of Dirichlet L-functions ... 84
§4.A. Appendix: Fourier integrals and series ... 86

Chapter 5. Classical Analytic Theory of L-functions ... 93
§5.1. Definitions and preliminaries ... 93
§5.2. Approximations to L-functions ... 97
§5.3. Counting zeros of L-functions ... 101
§5.4. The zero-free region ... 105
§5.5. Explicit formula ... 108
§5.6. The prime number theorem ... 110
§5.7. The Grand Riemann Hypothesis ... 113
§5.8. Simple consequences of GRH ... 117
§5.9. The Riemann zeta function and Dirichlet L-functions ... 119
§5.10. L-functions of number fields ... 125
§5.11. Classical automorphic L-functions ... 131
§5.12. General automorphic L-functions ... 136
§5.13. Artin L-functions ... 141
§5.14. L-functions of varieties ... 145
§5.A. Appendix: complex analysis ... 149

Chapter 6. Elementary Sieve Methods ... 153
§6.1. Sieve problems ... 153
§6.2. Exclusion-inclusion scheme ... 154
§6.3. Estimations of V⁺(z), V⁻(z) ... 157
§6.4. Fundamental Lemma of sieve theory ... 158
§6.5. The Λ²-Sieve ... 160
§6.6. Estimate for the main term of the Λ²-sieve ... 164
§6.7. Estimates for the remainder term in the Λ²-sieve ... 165
§6.8. Selected applications of Λ²-sieve ... 166

Chapter 7. Bilinear Forms and the Large Sieve ... 169
§7.1. General principles of estimating double sums ... 169
§7.2. Bilinear forms with exponentials ... 171
§7.3. Introduction to the large sieve ... 174
§7.4. Additive large sieve inequalities ... 175
§7.5. Multiplicative large sieve inequality ... 179
§7.4. Applications of the large sieve to sieving problems ... 180
§7.6. Panorama of the large sieve inequalities ... 183
§7.7. Large sieve inequalities for cusp forms ... 186
§7.8. Orthogonality of elliptic curves ... 192
§7.9. Power moments of L-functions ... 194

Chapter 8. Exponential Sums ... 197
§8.1. Introduction ... 197
§8.2. Weyl's method ... 198
§8.3. Van der Corput method ... 204
§8.4. Discussion of exponent pairs ... 213
§8.5. Vinogradov's method ... 216

Chapter 9. The Dirichlet Polynomials ... 229
§9.1. Introduction ... 229
§9.2. The integral mean-value estimates ... 230
§9.3. The discrete mean-value estimates ... 232
§9.4. Large values of Dirichlet polynomials ... 235
§9.5. Dirichlet polynomials with characters ... 238
§9.6. The reflection method ... 243
§9.7. Large values of D(s, χ) ... 246

Chapter 10. Zero Density Estimates ... 249
§10.1. Introduction ... 249
§10.2. Zero-detecting polynomials ... 250
§10.3. Breaking the zero-density conjecture ... 254
§10.4. Grand zero-density theorem ... 256
§10.5. The gaps between primes ... 264

Chapter 11. Sums over Finite Fields ... 269
§11.1. Introduction ... 269
§11.2. Finite fields ... 269
§11.3. Exponential sums ... 272
§11.4. The Hasse-Davenport relation ... 274
§11.5. The zeta function for Kloosterman sums ... 278
§11.6. Stepanov's method for hyperelliptic curves ... 281
§11.7. Proof of Weil's bound for Kloosterman sums ... 287
§11.8. The Riemann Hypothesis for elliptic curves over finite fields ... 290
§11.9. Geometry of elliptic curves ... 291
§11.10. The local zeta function of elliptic curves ... 297
§11.11. Survey of further results: a cohomological primer ... 300
§11.12. Comments ... 313

Chapter 12. Character Sums ... 317
§12.1. Introduction ... 317
§12.2. Completing methods ... 318
§12.3. Complete character sums ... 319
§12.4. Short character sums ... 324
§12.5. Very short character sums to highly composite modulus ... 330
§12.6. Characters to powerful modulus ... 335

Chapter 13. Sums over Primes ... 337
§13.1. General principles ... 337
§13.2. A variant of Vinogradov's method ... 340
§13.3. Linnik's identity ... 342
§13.4. Vaughan's identity ... 344
§13.5. Exponential sums over primes ... 345
§13.6. Back to the sieve ... 348

Chapter 14. Holomorphic Modular Forms ... 353
§14.1. Quotients of the upper half-plane and modular forms ... 353
§14.2. Eisenstein and Poincaré series ... 357
§14.3. Theta functions ... 361
§14.4. Modular forms associated to elliptic curves ... 363
§14.5. Hecke L-functions ... 368
§14.6. Hecke operators and automorphic L-functions ... 370
§14.7. Primitive forms and special basis ... 372
§14.8. Twisting modular forms ... 376
§14.9. Estimates for the Fourier coefficients of cusp forms ... 378
§14.10. Averages of Fourier coefficients ... 380

Chapter 15. Spectral Theory of Automorphic Forms ... 383
§15.1. Motivation and geometric preliminaries ... 383
§15.2. The laplacian on ℍ ... 385
§15.3. Automorphic functions and forms ... 386
§15.4. The continuous spectrum ... 387
§15.5. The discrete spectrum ... 389
§15.6. Spectral decomposition and automorphic kernels ... 391
§15.7. The Selberg trace formula ... 393
§15.8. Hyperbolic lattice point problems ... 398
§15.9. Distribution of length of closed geodesics and class numbers ... 401

Chapter 16. Sums of Kloosterman Sums ... 403
§16.1. Introduction ... 403
§16.2. Fourier expansion of Poincaré series ... 404
§16.3. The projection of Poincaré series on Maass forms ... 406
§16.4. Kuznetsov's formulas ... 406
§16.5. Estimates for the Fourier coefficients ... 413
§16.6. Estimates for sums of Kloosterman sums ... 415

Chapter 17. Primes in Arithmetic Progressions ... 419
§17.1. Introduction ... 419
§17.2. Bilinear forms in arithmetic progressions ... 421
§17.3. Proof of the Bombieri-Vinogradov Theorem ... 423
§17.4. Proof of the Barban-Davenport-Halberstam Theorem ... 424

Chapter 18. The Least Prime in an Arithmetic Progression ... 427
§18.1. Introduction ... 427
§18.2. The log-free zero-density theorem ... 429
§18.3. The exceptional zero repulsion ... 434
§18.4. Proof of Linnik's Theorem ... 439

Chapter 19. The Goldbach Problem ... 443
§19.1. Introduction ... 443
§19.2. Incomplete Λ-functions ... 445
§19.3. A ternary additive problem with Λ♭ ... 446
§19.4. Proof of Vinogradov's three primes theorem ... 447

Chapter 20. The Circle Method ... 449
§20.1. The partition number ... 449
§20.2. Diophantine equations ... 456
§20.3. The circle method after Kloosterman ... 467
§20.4. Representations by quadratic forms ... 472
§20.5. Another decomposition of the delta-symbol ... 481

Chapter 21. Equidistribution ... 487
§21.1. Weyl's criterion ... 487
§21.2. Selected equidistribution results ... 488
§21.3. Roots of quadratic congruences ... 494
§21.4. Linear and bilinear forms in quadratic roots ... 496
§21.5. A Poincaré series for quadratic roots ... 498
§21.6. Estimation of the Poincaré series ... 501

Chapter 22. Imaginary Quadratic Fields ... 503
§22.1. Binary quadratic forms ... 503
§22.2. The class group ... 508
§22.3. The class group L-functions ... 511
§22.4. The class number problems ... 517
§22.5. Splitting primes in ℚ(√D) ... 520
§22.6. Estimations for derivatives L^(k)(1, χ_D) ... 523

Chapter 23. Effective Bounds for the Class Number ... 529
§23.1. Landau's plot of automorphic L-functions ... 529
§23.2. A partition of Λ^(g)(1/2) ... 531
§23.3. Estimation of S_3 and S_2 ... 533
§23.4. Evaluation of S_1 ... 534
§23.5. An asymptotic formula for Λ^(g)(1/2) ... 536
§23.6. A lower bound for the class number ... 538
§23.7. Concluding notes ... 540
§23.A. The Gross-Zagier L-function vanishes to order 3 ... 541

Chapter 24. The Critical Zeros of the Riemann Zeta Function ... 547
§24.1. A lower bound for N_0(T) ... 547
§24.2. A positive proportion of critical zeros ... 550

Chapter 25. The Spacing of the Zeros of the Riemann Zeta-Function ... 563
§25.1. Introduction ... 563
§25.2. The pair correlation of zeros ... 564
§25.3. The n-level correlation function for consecutive spacing ... 570
§25.4. Low-lying zeros of L-functions ... 572

Chapter 26. Central Values of L-functions ... 577
§26.1. Introduction ... 577
§26.2. Principle of the proof of Theorem 26.2 ... 580
§26.3. Formulas for the first and the second moment ... 582
§26.4. Optimizing the mollifier ... 589
§26.5. Proof of Theorem 26.2 ... 595

Bibliography ... 599
Index ... 611

PREFACE

This book shows the scope of analytic number theory both in classical and modern directions. There are no division lines; in fact our intent is to demonstrate, particularly for newcomers, the fascinating countless interrelations. Of course, our picture of analytic number theory is by no means complete, but we tried to frame the material into a portrait of a reasonable size, yet providing a self-contained presentation.

We were writing this book in a period of time during and after teaching courses and working with graduate students at Rutgers University, Bordeaux University and Courant Institute. We thank these institutions for providing conditions for both of us to work together. We shared ideas on what this book should be about with many of our colleagues, who gave us critical suggestions. Among them we would like to mention Etienne Fouvry, John Friedlander, Philippe Michel and Peter Sarnak. During a long process of typing and preparation of this book for publication, we received stimulating encouragement and technical advice from Sergei Gelfand; for all of his help we express our gratitude. Carol Hamer helped to polish some of our English phrases while her little boys tried to destroy the TeX files without success. We thank them all for the output.

Henryk Iwaniec
Emmanuel Kowalski
15 December, 2003


INTRODUCTION

Analytic Number Theory distinguishes itself by the variety of tools it employs to establish results, many of which belong to the main streams of arithmetic. It is not part of analysis nor of any particular discipline of mathematics; however, it does indeed interact with various fields. Therefore everybody seems to view the subject differently. This vast diversity of concepts of analytic number theory is its great attraction.

Our desire in this book is to exhibit the wealth and prospects of the theory, its charming theorems and powerful techniques. However, it is not our primary objective to give proofs of the strongest results, although in many cases we come quite close to the best possible ones. Rather we favor a reasonable balance between clarity, completeness and generality. The book was conceived with graduate students in mind, so the reader will often find that our emphasis is on reasoning throughout the arguments. Of course our presentation is subjective, and in retrospect may lose its meaning. Certainly we do not always follow the original lines of discovery, but occasionally we do draw brief historical perspectives.

Leonhard Euler must get credit for the first use of analytical arguments for the purpose of studying properties of integers, specifically by constructing generating power series. Euler's proof of the infinity of prime numbers makes use of the divergence of the zeta function and the corresponding product over primes, which is named after him. This was the beginning of analytic number theory. Next came P. G. L. Dirichlet, whose creation of the theory of L-functions for characters, resulting in the proof of the infinity of primes in arithmetic progressions, makes him the true father of analytic number theory. From these early days to modern times the distribution of prime numbers constitutes the core of the subject. This will be apparent in the course of our book.

The first two chapters cover questions of primes up to the elementary methods of P. Tchebychev. Chapter 3 provides definitions and basic properties of Dirichlet characters and the Gauss sums. Characters on ideals of imaginary quadratic fields are also introduced, not only because they play a supporting role in subsequent chapters but to show a bit of analytic number theory beyond the traditional domain of the rational integers as well; there will be other examples throughout the book, for instance, elliptic curves.

Poisson summation for number theory is what a car is for people in modern communities: it transports things to other places and it takes you back home when applied next time; one cannot live without it. Chapter 4 presents a classical account of this basic technology. Many readers do realize now, others will figure out later, that we are already talking about ideas of modular forms. But we continue our considerations along traditional lines (both classical and more recent ones) before the concept of modularity takes the leading position.


The celebrated memoir of B. Riemann on the zeta function is embedded in the context of abstract L-functions in Chapter 5. It is not our style to consider things in terms more general than necessary, so defining a class of L-functions which suits minimum requirements of our forthcoming applications was not without difficulty and hesitation. In this way we could convey to dedicated researchers that generalizations are not always straightforward. For instance, to establish the zero-free region for L-functions of degree > 1 one cannot rely on the same principles as for the Dirichlet L-functions. The key ingredient is the Rankin-Selberg convolution. On the other hand, the problem of exceptional zero is resolved for many automorphic L-functions of degree > 1 (not without clever constructions) while it remains open for the L-functions with real characters. Furthermore in Chapter 5 a message is sent that a better life exists in the world of automorphic forms than in the zoo of degree one L-functions.

Analytic number theory does not mean non-elementary. The first author recalls that his first serious encounter with analytic number theory started by reading the lovely book of A.O. Gelfond and Yu.V. Linnik, "Elementary Methods of Analytic Number Theory". When an ambitious beginner starts from there her/his love of the subject is sealed forever. Try it yourself! One is instantly captured by sieve methods. In this book we do not have space to do justice to this marvelous idea; nevertheless Chapter 6 should suffice for basic applications.

Next comes the "Large Sieve", which is not a sieve but a name for other things. Yes, it did originate from a short paper by Linnik on a sieve problem, but it took time to recognize the true nature of these ideas. In Chapter 7 we reveal our viewpoint and the crucial attributes (spectral completeness, orthogonality), then we demonstrate the amazing power of the large sieve on selected old and new problems.
Other features of the large sieve are scrutinized showing the good and the bad sides. For example, the approach using the duality principle is fruitful for harmonics of degree one (characters) while producing poor results for harmonics of larger degree (like for example the eigenvalues of Hecke operators). The controversy over the proper place of the large sieve is academic. Simply speaking, the large sieve inequalities are parts of bilinear forms theory.

Estimates for exponential sums are the first tools which deeply penetrate the problems of analytic number theory beyond natural structures. These cannot be grasped by harmonic analysis alone. See what clever use people made of the property that a shifted interval is another interval, that adding an integer to a set of integers yields again a set of integers (sorry, primes are not preserved!). We challenge algebraists to find a structural explanation of the power of such arguments! They should read Chapter 8 to find what H. Weyl built out of these observations. Van der Corput and Vinogradov are also the main figures from the early stages of that discipline. A lot of work and talking went into our presentation of Vinogradov's method, because it is not quite correctly explained in numerous publications. At some point Vinogradov departs from the Weyl differencing process and treats multi-dimensional exponential sums as bilinear forms (this is the way we think of it anyway).

The next two chapters show more recent technology which was developed to replace the unproven Riemann hypothesis in applications to the distribution of prime numbers. We are talking about estimates for the number of zeros of L-functions in vertical strips which are positively distanced from the critical line. Hopefully in the future one will say we were wasting time on studying the empty set. Great ideas are camouflaged there in arguments of enormous complexity, so this might not be enjoyable for everyone at first. However if you think the Riemann hypothesis is not provable in your lifetime, please read and admire these unconditional substitutes. Special mention goes to Hugh Montgomery, Martin Huxley and Matti Jutila for the most original contributions.

Although we are primarily interested in rational integers one can learn and benefit a lot from arithmetic of other fields. Not only from the number fields or p-adic fields, but indirectly from the fields of finite characteristic as well. Particularly fruitful are the methods of exponential sums over the finite fields. In Chapter 11 we prove (among other things) the Riemann hypothesis for special curves which yields the celebrated estimate of Weil for Kloosterman sums. The Kloosterman sums have been employed to solve various problems of analytic number theory from the beginning of their creation in 1926. We also mention briefly the state of the knowledge of exponential and character sums over algebraic varieties. Applications of these are harder to make, yet there is a handful of examples in the literature. It was a painful decision to exclude all but the simplest from presentation in this book. Otherwise, to do full justice to these highly sophisticated ideas we would have to choose the most complicated application, for which we have no room. It suffices to say that a preparation of a given problem of analytic number theory to an estimate for character sum over varieties can be the state-of-the-art in its own right, never mind that the final argument is powered by the outside forces of algebraic geometry.

Dirichlet characters are already discussed in Chapter 3 and we return to them in Chapter 12 to treat very short character sums. Again one must be inventive to break limits of natural structures. Burgess's theorem is a fine example. Sums over primes are treated in the next chapter.
When Vinogradov succeeded in estimating sums over primes of additive characters, which he needed for a solution to the ternary Goldbach problem in conjunction with the circle method, it was a shocking result. Before him the Grand Riemann Hypothesis could do the job, but keep in mind that the Riemann hypothesis is still not established. The original ideas of Vinogradov were borrowed from combinatorial sieve and were rather complicated. Recently developed identities offer much simpler treatments of more general sums over primes. As they share the same fundamental principle (reducing the sum to bilinear forms) the results are pretty much the same, so the choice of the method is a matter of taste and technical convenience. To capture the key elements in Chapter 13 we develop more than one identity.

A popular criterion for analytic number theory is that complex variable analysis is being used. Perhaps it is better to say harmonic analysis, since the action of the latter is more profound. For a long time, analytic number theory flourished exclusively from abelian harmonic analysis, that is to say from the Fourier transform in ℝ^n. There is still a great potential in this classical analysis to be explored. However much stronger fertilizers began to act on the soil of analytic number theory in recent times. These are automorphic functions. Of course, modular forms have been driving algebraic aspects of number theory much longer, but in a limited scope (confined to holomorphic forms). New resources of automorphic theory are found in the spectral analysis, the foundation of which was laid by H. Maass and A. Selberg at the turn of the 1940's (real-analytic cusp forms, Eisenstein series, trace formula). In simple terms a non-abelian harmonic analysis found its role in analytic number theory. Truly effective expansion of spectral methods into analytic number theory began about twenty-five years ago, changing the face of either subject irrevocably.


This book barely addresses the fascinating issues of the new direction throughout Chapters 14, 15, and 16. Our featured application is to estimation for sums of Kloosterman sums. This is a good choice (if no more can be accommodated), because the reader can appreciate the new tools by comparing with the earlier results derived in Chapter 11 by algebraic considerations. Another application of the spectral theory of automorphic forms to an arithmetical question is presented in Chapter 21; that is to the equidistribution of roots of quadratic congruences of prime moduli. The spectral theory continues to grow extensively, so it would be premature to wrap it up here or in any other book. For further reading we recommend [I3], [Sa3].

Although the spectral methods of automorphic forms predominate current research in analytic number theory, the traditional problems continue getting our attention with respectful intensity throughout the remaining chapters. Great treasures of the subject mustn't be buried in the past. First of all a newcomer should learn the stories of primes in arithmetic progressions to large moduli. In Chapter 17 she/he will find how E. Bombieri and A.I. Vinogradov bypassed the Riemann hypothesis to establish (by the large sieve and other means) unconditional results with applications as good as one can get from the RH itself. Of course our arguments are not identical with the original ones (of 1965) since we take advantage of later simplification, in great measure due to P.X. Gallagher.

Chapter 18 goes further back to 1944 when Linnik gave an extraordinary bound for the least prime in an arithmetic progression. For a long time this bound was considered as the most difficult theorem in analytic number theory. And yes, it is still hard by today's standards, and one can still learn a lot from the technology applied!
Faced with the obstacle of the exceptional zero, Linnik brings the repulsion effect (he calls it Deuring-Heilbronn phenomenon) to a new level; amazingly enough he turns the problem to his advantage! This is a fascinating development in the history of analytic number theory which we recommend one should master for a better understanding of the status of the exceptional zero today.

Once upon a time the famous Goldbach problem was worth a million dollar prize award. For applications the problem (representations of even integers by the sums of two primes) has no great merit, but as an intellectual challenge one would be proud to crack it. Probably something new about prime numbers would be revealed then. Read Chapter 19 to improve your chances.

Chapter 20 is serious. Here analytic methods storm the domain of diophantine equations, which from ancient Greeks was exclusively a business of arithmetic. Started by Hardy and Ramanujan, continued by Hardy and Littlewood, and developed substantially further by Kloosterman, the circle method uses orthogonality of additive characters to detect equations, not only to solve algebraic equations but a large class of additive problems over special integers as well. The toughest are the binary additive problems. They are not completely solved by the Kloosterman method, but at least we get a very reliable picture of what the true asymptotic for the number of solutions should be. Kloosterman sums which we covered in the preceding chapters are instrumental in the circle methods. After classical ideas we propose a more direct variant which in principle should produce the same results, however without employing Kloosterman sums. One should read Chapter 20 with an open mind, separate technical (still attractive) elements from conceptual devices to see clearly the connections with modular forms. Certainly Kloosterman and Rademacher were aware of these intrinsic connections, while they are overlooked by some specialists in the circle method.

Equidistribution problems for sequences of special integers, lattice points in various domains, solutions to diophantine equations, etc., constitute a heavy industry within analytic number theory. We regret there is no space to run this industry in full capacity in the book. The book of M.N. Huxley [Hu4] treats only the lattice point problems, however quite deeply. In Chapter 21 we are dealing with the problem of distribution of roots of a quadratic equation reduced modulo prime. As the prime modulus tends to infinity we show that the roots are uniformly distributed. The arguments include almost everything that we developed in the book so far, thus showing that the industry is robust.

Because of failure of the unique factorization of algebraic integers, the arithmetic of number fields is not as easy as for rational numbers and sometimes perplexing. The complexity is measured by the order of the ideal class group. Naturally the case of imaginary quadratic fields received the first and the most attention because units do not interfere. We do know that the class number grows to infinity (so there is only a finite number of imaginary quadratic fields with a fixed class number), but the serious issue is to estimate the class number effectively. Chapter 22 describes the problem thoroughly and prepares the ground for the advances in Chapter 23. The effective lower bound for the class number (due to D. Goldfeld) may not appear strong for demanding researchers, yet it is deep with respect to results taken from other sources. First of all it uses the Gross-Zagier formula for L-functions of elliptic curves at the central point. We do provide a substantial overview of the involved arguments from elliptic curves, although these are more geometric than analytic.
The analytic arguments themselves are quite delicate. Actually they came first, the L-functions of elliptic curves being supplementary. Indeed we worked out an effective lower bound for the class number which depends on the order of vanishing of general L-functions of degree two, suspecting that the requirements are satisfied by quite a few of them.

In Chapter 24 we prove a very classical result of Selberg that a positive proportion of zeros of the Riemann zeta function lies on the critical line. This is a good place to learn about the mollification technique (a kind of smoothing), which is used in many works today and will reappear in Chapter 26.

Assuming the Riemann hypothesis, H.L. Montgomery revealed in 1974 that the distribution of zeros of ζ(s) follows the behavior of eigenvalues of certain "ensembles" of unitary matrices. More recently physicists joined the team of workers in number theory, creating a new excitement and hope for finding a path to a proof of the Riemann hypothesis. This is the main objective of the so-called random matrix theory, one of the most popular subjects and driving forces of current analytic number theory. It offers reliable models for predicting the behavior of arithmetical quantities which for a long time were shrouded in mystery. The consistency of the random matrix theory with the harmony of integers still seems quite surprising. Whatever the future of this enterprise will be, due to the current cooperation, analysis is closer to arithmetic than ever before. A subject of such magnitude cannot be fully presented in a short space. Therefore in Chapter 25 we stick to the original theme of the correlation of zeros of ζ(s) and its variations on the zeros of families of automorphic L-functions which are near the central point. We leave it for the reader to judge whether the ideas of random matrix theory are realistic for launching an attack on the Riemann Hypothesis.

In recent investigations the central values of L-functions appear in a variety of formulas with vanishing or non-vanishing assumptions. Take for example [IS2] where an effective lower bound for the class number of imaginary quadratic fields is derived essentially from the non-vanishing of central values of families of L-functions, contrary to the vanishing requirements in the previous investigations. Another example is the formula of T. Watson [Wa] by means of which the quantum-ergodicity conjecture (that is the equidistribution of Maass cusp forms) is reduced to a subconvexity bound for certain L-functions of degree four. We consider in detail one non-vanishing statement which has applications to arithmetic geometry.

We hope this book will show the picture of analytic number theory in plenty of colors. However we must say that a lot of significant topics are left out. Missing are the dispersion method, the amplification method (see [M2]), some analytic techniques from diophantine approximations and transcendence. Moreover probability arguments are barely exposed, and we didn't touch ergodic theory either, whose impact on number theory has been felt strongly in the last years.

We also try to show some details of the powerful theories which are developing as the most useful new tools for analytic number theory, in particular the theory of higher-degree automorphic forms and their L-functions, and algebraic geometry; young researchers in particular should be encouraged to develop expertise in these subjects. It is certain that spectacular applications have only begun and more will be open to those who understand both sides.
Dually, arithmetic geometry and algebraic number theory also give and promise a wealth of new questions, or new aspects of old ones, where the skills and techniques of analytic number theory will be tested to the utmost. Hopefully they will bring rich rewards to those who will try to come to these open fields... We barely mention some questions related to elliptic curves but we believe that there is much more to discover. The deep conjectures of Lang and Trotter [LT] are already quite popular, and a few other challenging problems may be found in [Kol]. The exercises inside each section serve a dual purpose: some are to improve the reader's skill, the others serve as additional information about the subject. Historical remarks are brief, to give some orientation in the development of the matter, rather than to credit exhaustively the inventors. The only advice we offer to new researchers is read! read! read! many papers with complete proofs. Knowing a result in analytic number theory is only the first step to liking it; more important and rewarding is to understand the arguments of its proof. Our viewpoint is that making mathematics should not be rated like breaking sport records. Sometimes the strongest result is boring while a slightly weaker one generates great pleasure. Formal prerequisites for much of the book are rather slight, not going beyond differential calculus, complex analysis and integration, especially Fourier series and integrals. It is more important for the reader to have or acquire a good understanding of how to manipulate inequalities and not simple identities.


In later chapters automorphic forms become important. We have included two survey chapters, yet we expect that many readers will have already some knowledge of this important topic, or will study it independently. In some sections (for instance Sections 5.13 and 5.14), which are intended as convenient references for certain facts and results which are hard to locate in proper form in the literature, we assume that the reader has some familiarity with other topics, such as representations of groups and algebraic geometry. Sections of this book were written over a period of time, therefore readers will notice a slight change of style and repetition. We think that a small redundancy is helpful for reading long arguments. Occasionally the same object is introduced again in a different chapter in local terminology which should be more familiar in a particular context. We believe this flexibility is justified for comfort, even at the expense of losing uniqueness. Our notations are mostly standard. But since inequalities with unspecified constants are the lifeblood of analytic number theory, and since there are sometimes controversies on this subject, we spell out the meaning of the various comparison symbols $O()$, $o()$, $\asymp$, $\sim$ or $\ll$. Most important, we use Landau's $f = O(g)$ and Vinogradov's $f \ll g$ as synonyms; thus $f(x) \ll g(x)$ for $x \in X$ (where $X$ must be specified either explicitly or implicitly) means that $|f(x)| \le C g(x)$ for all $x \in X$ and some constant $C \ge 0$. Any value of $C$ for which this holds is called an implied constant. Since a constant is most often a function looking for a variable, the "implied constant" will sometimes depend on other parameters, which we explicitly mention at the most important points (but sometimes it is clear from context). If there is no other dependency, we speak of "absolute constants". This usage means that our $O()$, for instance, is not the same as that of Landau or Bourbaki.
We use $f \asymp g$ to mean that both relations $f \ll g$ and $g \ll f$ hold, of course with possibly different implied constants. However $f = o(g)$ for $x \to x_0$ means that for any $\varepsilon > 0$ there exists some (unspecified) neighborhood $U_\varepsilon$ of $x_0$ such that $|f(x)| \le \varepsilon g(x)$ for $x \in U_\varepsilon$. Then $f \sim g$ means $f = g + o(g)$. Those are the same as in Landau or Bourbaki. Among the few notations which may be unfamiliar to a beginner, we mention that $p^k \| m$, where $p$ is prime and $k$ an integer, means that $p^k$ divides $m$ exactly (i.e. $p^{k+1}$ does not divide $m$). The integral part $[x]$ is the integer $n$ such that $n \le x$

CHAPTER 1

ARITHMETIC FUNCTIONS

1.1. Notation and definitions.

Throughout this book $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ denote the sets of integers, rationals, real and complex numbers respectively. The addition of complex numbers makes $\mathbb{C}$ a group, and $\mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R} \subset \mathbb{C}$ is the natural inclusion of subgroups. The Lebesgue measure $dx$ on the additive group $\mathbb{R}$ is the invariant measure (with respect to additive translations). The subsets $\mathbb{Q}^* \subset \mathbb{R}^* \subset \mathbb{C}^*$ of the non-zero numbers are closed under multiplication; they are considered as multiplicative groups. In $\mathbb{R}^*$ we distinguish the subgroup of positive real numbers $\mathbb{R}^+$ which is endowed with the invariant measure $x^{-1}dx$ (with respect to multiplicative translations). The positive integers are called natural numbers. The set of natural numbers $\mathbb{N} = \{1, 2, 3, \ldots\}$ contains the set of primes $\mathbb{P} = \{2, 3, 5, 7, \ldots\}$. These are fundamental elements of arithmetic since every natural number factors uniquely (up to a permutation) into powers of distinct primes. The distribution of primes is one of the basic problems of analytic number theory. Prime numbers are often denoted by $p$. First of all, to cultivate analytic number theory one must acquire a considerable skill for operating with arithmetic functions. We begin with a few elementary considerations. A complex valued function $f$ defined on $\mathbb{N}$ is called an arithmetic function. In some contexts an arithmetic function $f : \mathbb{N} \to \mathbb{C}$ is better seen as the sequence $\mathcal{A} = (a_n)$ with $a_n = f(n)$. Very often in such a context $\mathcal{A}$ is supported on numbers of primary interest, and $a_n$ is just a multiplicity, or some sort of weight, which is introduced when counting such numbers. A different job is assigned to certain functions $f : \mathbb{N} \to \mathbb{C}$ (such as Dirichlet characters or Hecke eigenvalues) which we call "arithmetic harmonics." These play an instrumental role in analyzing the primary sequence $\mathcal{A} = (a_n)$. Essentially the arithmetic harmonics are employed with the primary sequence $\mathcal{A} = (a_n)$ to produce a family of twisted sequences $\mathcal{A}_f = (a_n f(n))$.
The twisted sequences with appropriate harmonics are capable of selecting a special subsequence of $\mathcal{A}$ which is our target (think of Dirichlet characters being employed to detect primes in arithmetic progressions). Thanks to the additive and multiplicative structures of integers, two important classes of arithmetic functions are distinguished. A function $f : \mathbb{N} \to \mathbb{C}$ is an additive function if it satisfies

$$f(mn) = f(m) + f(n) \tag{1.1}$$

for $m, n$ relatively prime. If this property holds for all $m, n$, then $f$ is said to be completely additive. For example $f(n) = \log n$ is completely additive. Similarly, a function $f : \mathbb{N} \to \mathbb{C}$ is multiplicative if it satisfies

$$f(mn) = f(m)f(n) \tag{1.2}$$

for $m, n$ relatively prime, and $f$ is completely multiplicative if (1.2) holds for all $m, n$. For example $f(n) = n^{-s}$ with $s \in \mathbb{C}$ is completely multiplicative. Obviously, the additive and the multiplicative functions are determined by their values at prime powers. We have $f(1) = 0$ if $f$ is additive, $f(1) = 1$ if $f$ is multiplicative not identically zero.

1.2. Generating series.

To an arithmetic function $f$ we shall attach the two most natural infinite series

$$E_f(z) = \sum_{n \ge 1} f(n) z^n, \tag{1.3}$$

$$D_f(s) = \sum_{n \ge 1} f(n) n^{-s}, \tag{1.4}$$

where $z, s$ are complex variables (in the case of the power series $E_f(z)$ we may also include $n = 0$ if $f(0)$ is defined). These are called the generating series, or functions, of $f$. Other kinds of generating functions can be attractive as well to capture distinct properties of $f$. For many arithmetic functions the corresponding generating series converges absolutely in a small domain, and it is an interesting question how far the series can be analytically continued. If the generating series extends beyond the range of absolute convergence, then this property usually manifests some group structure in the coefficients $f(n)$ ("random" series cannot be continued). For example the analytic continuation of Artin $L$-functions is intimately related to the reciprocity laws in number fields (in the abelian case at any rate), the analytic continuation of Hasse-Weil $L$-functions is a major step towards the modularity of the corresponding elliptic curves (over $\mathbb{Q}$), and so on. The generating power series were introduced by L. Euler (1707-1783) for studying special additive problems (when doing so he was the first to mix analysis with arithmetic). Let $E_f(z)$ and $E_g(z)$ be the series for $f$ and $g$. Then the product

$$E_f(z) E_g(z) \tag{1.5}$$

yields the power series for the function $h$ given by (the additive convolution)

$$h(n) = \sum_{\ell + m = n} f(\ell) g(m). \tag{1.6}$$

Applying Cauchy's theorem one expresses the coefficient $h(n)$ by the contour integral

$$h(n) = \frac{1}{2\pi i} \int_{|z| = r} E_f(z) E_g(z) z^{-n-1}\,dz. \tag{1.7}$$

Hence, given reasonable analytic properties of $E_f(z)E_g(z)$ one can deduce a good estimate for $h(n)$, or even an asymptotic formula as $n$ tends to $\infty$. Of course, Euler did not know Cauchy's theorem, so his ideas were limited to the power series for which one has a secondary expression, giving an exact formula for $f(n)$ rather than an approximation (still not a tautology). We give three cute examples. First is the identity

$$\sum_{n=0}^{\infty} z^n = \prod_{m=0}^{\infty} (1 + z^{2^m}) = (1 - z)^{-1}$$

which is nothing but an analytic statement of the uniqueness of binary expansion of natural numbers. Here is another identity:

$$\sum_{n=0}^{\infty} p(n) z^n = \prod_{m=1}^{\infty} (1 - z^m)^{-1}$$

where $p(n)$ is the partition function. Less obvious is the following formula of Jacobi:

(1.8)

= LZn2 =II (1- zm)(1- zm+1/2)2.

About ninety years ago the integral representation (1.7) for the additive equation (1.6) gave birth to the circle method. These ideas will be developed in Chapter 20. By changing the variable $z$ into

$$e(z) = e^{2\pi i z} \tag{1.9}$$

the power series (1.3) is seen as a Fourier series, which is a common practice in modern analytic number theory. The generating series $D_f(s)$ is called the Dirichlet series after being used in Dirichlet's fundamental works on primes in arithmetic progressions, though the special case

$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s} \tag{1.10}$$

was already considered by Euler. The Dirichlet series is particularly attractive for multiplicative functions (but not limited to this case). For, if $f$ is multiplicative, then

$$D_f(s) = \prod_p \big(1 + f(p)p^{-s} + f(p^2)p^{-2s} + \cdots\big) \tag{1.11}$$

provided the involved series in prime powers and the product over primes (called the Euler product) converge absolutely. In particular, we have

$$\zeta(s) = \prod_p (1 - p^{-s})^{-1} \tag{1.12}$$

if $\operatorname{Re}(s) > 1$. This representation of $\zeta(s)$ as an Euler product is an expression in analytic language of the unique factorization of natural numbers into distinct prime powers.
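As a numerical illustration of (1.10) and (1.12), the following Python sketch compares a partial sum of the Dirichlet series with a truncated Euler product at $s = 2$. The function names and the truncation points are illustrative choices, not part of the text, and both truncations only approximate $\zeta(2) = \pi^2/6$.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes p <= limit (limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Cross out the multiples p*p, p*p + p, ... of the prime p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]


def zeta_partial_sum(s, terms):
    """Partial sum of the Dirichlet series (1.10) over n <= terms."""
    return sum(n ** (-s) for n in range(1, terms + 1))


def zeta_euler_product(s, prime_limit):
    """Euler product (1.12) truncated to the primes p <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product /= 1.0 - p ** (-s)
    return product


series_value = zeta_partial_sum(2.0, 100_000)
product_value = zeta_euler_product(2.0, 100_000)
exact_value = math.pi ** 2 / 6
```

Both truncations stay slightly below the exact value, since every omitted term of the series and every omitted factor of the product is positive, respectively larger than one.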



1.3. Dirichlet convolution.

Assuming that the Dirichlet series $D_f(s)$, $D_g(s)$ converge absolutely it follows that the product $D_f(s)D_g(s)$ is also given by a Dirichlet series, namely we have

$$D_f(s) D_g(s) = \sum_{n \ge 1} h(n) n^{-s}$$

with the coefficients

$$h(n) = \sum_{d \mid n} f(d)\, g\Big(\frac{n}{d}\Big). \tag{1.13}$$

The arithmetic function $h$ defined by (1.13) (never mind the convergence of generating series) is called the Dirichlet (or multiplicative) convolution of $f$ and $g$, and it is denoted by $f * g$. The set of all arithmetic functions with the usual addition $+$ and the operation $*$ is a commutative ring. The function $\delta : \mathbb{N} \to \mathbb{C}$ whose Dirichlet series is $D_\delta(s) = 1$ is the unit element of this ring, i.e.,

$$\delta(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases} \tag{1.14}$$

The function $\zeta(s)$ defined in $\operatorname{Re}(s) > 1$ by (1.10) is called the Riemann zeta function (after his seminal memoir [Rie]); its Dirichlet coefficients make the constant function $1(n) = 1$ for all $n \ge 1$. By virtue of (1.12) the inverse of $\zeta(s)$ has also a Dirichlet series expansion, namely

$$\frac{1}{\zeta(s)} = \prod_p (1 - p^{-s}) = \sum_{m=1}^{\infty} \mu(m) m^{-s}, \tag{1.15}$$

say, with coefficients

$$\mu(m) = \begin{cases} (-1)^r & \text{if } m = p_1 \cdots p_r \text{ with } p_1, \ldots, p_r \text{ distinct,} \\ 0 & \text{otherwise.} \end{cases} \tag{1.16}$$

The function $\mu(m)$ was introduced in 1832 by A. F. Möbius, and it bears his name ever since. Moreover, by the Euler product (1.12) the logarithm of $\zeta(s)$ has the Dirichlet series expansion

$$\log\zeta(s) = \sum_p \sum_{\ell=1}^{\infty} \ell^{-1} p^{-\ell s}. \tag{1.17}$$

Recall that the Dirichlet convolution $*$ corresponds to the multiplication of the generating Dirichlet series, therefore the identity $\zeta(s) \cdot \zeta^{-1}(s) = 1$ reads as

$$\delta(m) = \sum_{d \mid m} \mu(d) = \begin{cases} 1 & \text{if } m = 1, \\ 0 & \text{if } m > 1. \end{cases} \tag{1.18}$$

Using this formula one obtains the following

MÖBIUS INVERSION. For any $f, g : \mathbb{N} \to \mathbb{C}$ the following two relations are equivalent:

$$g(n) = \sum_{d \mid n} f(d), \tag{1.19}$$

$$f(n) = \sum_{d \mid n} \mu(d)\, g\Big(\frac{n}{d}\Big). \tag{1.20}$$
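The convolution (1.13), the identity (1.18) and the inversion formulas (1.19), (1.20) can be checked numerically on an initial segment of the natural numbers. The sketch below is only an illustration: the names `dirichlet_convolution` and `mobius_table`, the cutoff `LIMIT` and the test function $n \mapsto n^2 + 1$ are assumptions made for this example.

```python
def dirichlet_convolution(f, g, limit):
    """Return h = f * g as a list with h[n] = sum_{d | n} f[d] * g[n // d].

    f and g are lists indexed by 0..limit; index 0 is unused.
    """
    h = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for m in range(1, limit // d + 1):
            h[d * m] += f[d] * g[m]
    return h


def mobius_table(limit):
    """Table of mu(n) for 1 <= n <= limit, built from (1.18).

    For n > 1 the relation sum_{d | n} mu(d) = 0 gives
    mu(n) = -sum of mu(d) over the proper divisors d of n.
    """
    mu = [0] * (limit + 1)
    mu[1] = 1
    for d in range(1, limit + 1):
        # mu[d] is final here: all proper divisors of d were processed earlier.
        for multiple in range(2 * d, limit + 1, d):
            mu[multiple] -= mu[d]
    return mu


LIMIT = 200
one = [0] + [1] * LIMIT                              # the constant function 1(n)
mu = mobius_table(LIMIT)
delta = dirichlet_convolution(mu, one, LIMIT)        # should be the unit (1.14)

f = [0] + [n * n + 1 for n in range(1, LIMIT + 1)]   # arbitrary test function
g = dirichlet_convolution(f, one, LIMIT)             # relation (1.19)
f_recovered = dirichlet_convolution(mu, g, LIMIT)    # relation (1.20)
```

The double loop over pairs $(d, m)$ with $dm \le$ `LIMIT` takes about `LIMIT` $\cdot \log$ `LIMIT` steps, which is the same divisor-switching count that reappears in the estimate for $\sum_{n \le x} \tau(n)$ in Section 1.5.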

REMARK. To be accurate with the history, the inversion formulas in the above form were stated only in 1857 by R. Dedekind. The original version stated by Möbius was somewhat different; it amounts to saying that for any real variable functions $F, G : [1, x] \to \mathbb{C}$ the following two relations are equivalent:

$$G(x) = \sum_{n \le x} F(x/n), \tag{1.21}$$

$$F(x) = \sum_{m \le x} \mu(m) G(x/m). \tag{1.22}$$

The Möbius function is multiplicative. If $f, g$ are multiplicative, then so are $f \cdot g$ and $f * g$. If $g$ is multiplicative, then

$$\sum_{d \mid n} \mu(d) g(d) = \prod_{p \mid n} (1 - g(p)). \tag{1.23}$$

This product admits a probabilistic interpretation. Viewing $g(p)$ as a probability of some independent events which may occur at $p$ one can think of (1.23) as the probability that none of the events associated with prime divisors of $n$ occur.

1.4. Examples.

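Now we give a large sample of arithmetic functions which one encounters in analytic number theory. Many other functions will be introduced in due course. The divisor function $\tau(n)$ is the number of positive divisors of $n$, so we have

$$\zeta^2(s) = \sum_{n=1}^{\infty} \tau(n) n^{-s}. \tag{1.24}$$

More generally $\tau_k(n)$ denotes the number of representations of $n$ as the product of $k$ natural numbers, so its Dirichlet series is $\zeta^k(s)$. Explicitly

$$\tau_k(n) = \binom{a_1 + k - 1}{k - 1} \cdots \binom{a_r + k - 1}{k - 1} \quad \text{if } n = p_1^{a_1} \cdots p_r^{a_r}. \tag{1.25}$$

For any $\nu \in \mathbb{C}$ we define $\sigma_\nu(n)$ by

$$\zeta(s)\zeta(s - \nu) = \sum_{n=1}^{\infty} \sigma_\nu(n) n^{-s}, \tag{1.26}$$

so it is the sum of powers of divisors

$$\sigma_\nu(n) = \sum_{d \mid n} d^{\nu}. \tag{1.27}$$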
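The explicit formula (1.25) can be compared with a direct count of ordered factorizations. The following sketch assumes nothing beyond the definitions above; the helper names `factorize`, `tau_k_formula` and `tau_k_count` are hypothetical and chosen for this illustration only.

```python
from math import comb


def factorize(n):
    """Trial division: return {prime: exponent} for n >= 1."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def tau_k_formula(n, k):
    """Right-hand side of (1.25): product of binomial(a + k - 1, k - 1)."""
    result = 1
    for exponent in factorize(n).values():
        result *= comb(exponent + k - 1, k - 1)
    return result


def tau_k_count(n, k):
    """Number of ordered k-tuples of natural numbers with product n (k >= 1)."""
    if k == 1:
        return 1
    # Choose the first factor d | n, then factor n // d into k - 1 parts.
    return sum(tau_k_count(n // d, k - 1) for d in range(1, n + 1) if n % d == 0)
```

For instance `tau_k_formula(12, 2)` is the ordinary divisor function $\tau(12) = 6$, and the recursion in `tau_k_count` is just the relation $\tau_k = 1 * \tau_{k-1}$.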



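We have the Ramanujan formula

$$\zeta(s)\zeta(s - \alpha)\zeta(s - \beta)\zeta(s - \alpha - \beta)\zeta^{-1}(2s - \alpha - \beta) = \sum_{n=1}^{\infty} \sigma_\alpha(n)\sigma_\beta(n) n^{-s}. \tag{1.28}$$

In particular,

$$\zeta^4(s)\zeta^{-1}(2s) = \sum_{n=1}^{\infty} \tau^2(n) n^{-s}. \tag{1.29}$$

One can recognize the above Ramanujan series as a special case of the Rankin-Selberg convolution $L$-functions attached to modular forms (see Section 5.1). Dividing (1.29) by $\zeta(s)$ one gets

$$\zeta^3(s)\zeta^{-1}(2s) = \sum_{n=1}^{\infty} \tau(n^2) n^{-s}, \tag{1.30}$$

hence multiplying by $\zeta(2s)$

$$\zeta^3(s) = \sum_{n=1}^{\infty} \Big(\sum_{d^2 m = n} \tau(m^2)\Big) n^{-s}.$$

This is the symmetric square $L$-function associated with $\zeta^2(s)$. Dividing again by $\zeta(s)$ one gets

$$\zeta^2(s)\zeta^{-1}(2s) = \sum_{n=1}^{\infty} 2^{\omega(n)} n^{-s} \tag{1.31}$$

where $\omega(n)$ denotes the number of distinct prime factors of $n$. Next we get

$$\zeta(s)\zeta^{-1}(2s) = \sum_{n=1}^{\infty} |\mu(n)| n^{-s}. \tag{1.32}$$

Note that $|\mu(n)| = \mu^2(n)$ is the characteristic function of squarefree numbers. By (1.32) we have

$$\mu^2(n) = \sum_{d^2 \mid n} \mu(d). \tag{1.33}$$

Inverting (1.32) we obtain another Dirichlet series,

$$\zeta^{-1}(s)\zeta(2s) = \sum_{n=1}^{\infty} \lambda(n) n^{-s}$$

with coefficients

$$\lambda(n) = (-1)^{\Omega(n)} \tag{1.34}$$

where $\Omega(n)$ denotes the total number of prime factors of $n$ (counted with multiplicity). Thus $\lambda(n)$, which is called the Liouville function, is completely multiplicative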



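and it agrees with $\mu(n)$ on squarefree numbers. The Euler function $\varphi(n)$ has the generating series

$$\frac{\zeta(s - 1)}{\zeta(s)} = \sum_{n=1}^{\infty} \varphi(n) n^{-s}, \tag{1.35}$$

so it is also given by

$$\varphi(n) = n \prod_{p \mid n} \Big(1 - \frac{1}{p}\Big) = n \sum_{d \mid n} \frac{\mu(d)}{d}. \tag{1.36}$$

One may think of

$$-\zeta'(s) = \sum_{n=1}^{\infty} (\log n) n^{-s},$$

which gives us the logarithm function

$$L(n) = \log n. \tag{1.37}$$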

Since $L(n)$ is additive, it follows that $L \cdot (f * g) = (L \cdot f) * g + f * (L \cdot g)$. This says that the multiplication by $L$ is a derivation in the Dirichlet ring of arithmetic functions. Next the function $\Lambda(n)$ can be defined as coefficients in the Dirichlet series for the logarithmic derivative $-(\log\zeta(s))' = -\zeta'(s)\zeta(s)^{-1}$, so

$$-\frac{\zeta'}{\zeta}(s) = \sum_{n=1}^{\infty} \Lambda(n) n^{-s}. \tag{1.38}$$

By the Euler product (1.12) one computes that

$$\Lambda(n) = \begin{cases} \log p & \text{if } n = p^{\alpha},\ \alpha \ge 1, \\ 0 & \text{otherwise.} \end{cases} \tag{1.39}$$

Thus $\Lambda(n)$ is supported on prime powers. One can read (1.38) as $\Lambda = \mu * L$. More explicitly,

$$\Lambda(n) = \sum_{d \mid n} \mu(d) \log\frac{n}{d} = -\sum_{d \mid n} \mu(d) \log d. \tag{1.40}$$

By Möbius inversion we obtain $L = 1 * \Lambda$, or equivalently,

$$\log n = \sum_{d \mid n} \Lambda(d). \tag{1.41}$$
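The relations (1.39), (1.40) and (1.41) lend themselves to a quick floating-point check. The sketch below uses naive trial division throughout; the function names are illustrative assumptions, and the comparison is only up to rounding error.

```python
import math


def divisors(n):
    """All positive divisors of n (naive scan, adequate for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius(n):
    """Moebius function mu(n) as in (1.16), by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # p^2 divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result


def von_mangoldt(n):
    """Lambda(n) as in (1.39): log p if n = p^a with a >= 1, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)            # n itself is prime


def von_mangoldt_via_mobius(n):
    """Right-hand side of (1.40): -sum_{d | n} mu(d) log d."""
    return -sum(mobius(d) * math.log(d) for d in divisors(n))


def log_via_von_mangoldt(n):
    """Right-hand side of (1.41): sum_{d | n} Lambda(d)."""
    return sum(von_mangoldt(d) for d in divisors(n))
```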

The function A(n) was known and effectively used by P. Tchebyshev, nevertheless it is called the von Mangoldt function. Here is another identity which is useful in elementary prime number theory: (1.42)



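The von Mangoldt function $\Lambda$ has been generalized to the higher von Mangoldt function of any degree $k \ge 0$ by $\Lambda_k = \mu * L^k$, i.e.,

$$\Lambda_k(n) = \sum_{d \mid n} \mu(d) \Big(\log\frac{n}{d}\Big)^k \tag{1.43}$$

(notice that $\Lambda_0 = \delta$). This satisfies the recurrence

$$\Lambda_{k+1} = \Lambda_k \cdot L + \Lambda * \Lambda_k \tag{1.44}$$

from which it follows by induction that $\Lambda_k(n)$ is supported on numbers having at most $k$ distinct prime factors, i.e., $\Lambda_k(n) = 0$ if $\omega(n) > k$. Moreover, we have

$$0 \le \Lambda_k(n) \le (\log n)^k, \tag{1.45}$$

the lower bound following by induction from (1.44) and the upper bound following from the formula $L^k = 1 * \Lambda_k$, or explicitly,

$$(\log n)^k = \sum_{d \mid n} \Lambda_k(d), \tag{1.46}$$

which comes from (1.43) by Möbius inversion. Of course, $\Lambda_k$ is not multiplicative, but we have the following handy formula (see (1.44))

$$\Lambda_k(mn) = \sum_{0 \le j \le k} \binom{k}{j} \Lambda_j(m) \Lambda_{k-j}(n) \quad \text{if } (m, n) = 1.$$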

EXERCISE 1. For any integer $k \ge 0$ and a real number $x > 0$, define

$$\Lambda_k(n, x) = \sum_{d \mid n} \mu(d) \Big(\log\frac{x}{d}\Big)^k. \tag{1.47}$$

Note that $\Lambda_k(n, x)$ depends only on the squarefree kernel of $n$. Prove that

$$\Lambda_k(n, x) = \sum_{0 \le j \le k} \binom{k}{j} \Lambda_j(n) \Big(\log\frac{x}{n}\Big)^{k-j}.$$

Then, using Aj ( Li-iAk, derive that for n ( x

(1.48)

Therefore $\Lambda_k(n, x)$ is supported on integers $n$ having at most $k$ distinct prime divisors, and it satisfies $0 \le \Lambda_k(n, x) \le (\log x)^k$ if $n \le x$. Next show that if $(m, n) = 1$, then

$$\Lambda_k(mn, x) = \sum_{0 \le j \le k} \binom{k}{j} \Lambda_j(n) \Lambda_{k-j}\Big(m, \frac{x}{n}\Big).$$

Hence derive that for $n = p_1^{a_1} \cdots p_r^{a_r}$ with $p_1, \ldots, p_r$ distinct primes

(1.49)


One can associate a Möbius and von Mangoldt function to any multiplicative function. If $D_f(s)$ is the generating Dirichlet series for $f$ which has Euler product, then $\mu_f$ and $\Lambda_f$ are defined as coefficients in

$$\frac{1}{D_f(s)} = \sum_{n=1}^{\infty} \mu_f(n) n^{-s}, \qquad -\frac{D_f'(s)}{D_f(s)} = \sum_{n=1}^{\infty} \Lambda_f(n) n^{-s}. \tag{1.50}$$

Hence $f * \Lambda_f = f \cdot L$ and $\Lambda_f = \mu_f * (f \cdot L)$. Clearly $\Lambda_f$ is supported on prime powers. If $f$ is completely multiplicative, then $\mu_f(n) = \mu(n)f(n)$ and $\Lambda_f(n) = \Lambda(n)f(n)$. There is a variety of truly interesting arithmetic functions. The theory of modular forms is a basic source for multiplicative functions. As a simple model we pick up the function $r(n)$ which is the number of solutions to $a^2 + b^2 = n$ in integers $a, b$. The generating Dirichlet series for $r(n)$ is equal to $4\zeta_K(s)$, where

$$\zeta_K(s) = \sum_{\mathfrak{a}} (N\mathfrak{a})^{-s}$$

is the zeta function of the imaginary quadratic field $K = \mathbb{Q}(\sqrt{-1})$ while the factor 4 is the number of units in $K$. Here $\mathfrak{a}$ runs over non-zero integral ideals of $K$. All these are principal ideals generated by the Gaussian integers $\alpha = a + bi \in \mathbb{Z}[\sqrt{-1}]$, $\alpha \ne 0$, and if $\mathfrak{a} = (\alpha)$, then $N\mathfrak{a} = a^2 + b^2$. Hence, indeed

$$4\zeta_K(s) = \sum_{n \ge 1} r(n) n^{-s}.$$

Clearly $\frac{1}{4} r(n)$ is multiplicative. The prime numbers which are represented as the sum of two squares are characterized in a beautiful theorem of Fermat. It is easy to see that no prime $p \equiv -1 \pmod 4$ can be so written and Fermat proved that all other primes can be (uniquely up to the sign and order of $a, b$). In modern language this theorem of Fermat is just the factorization law in the Gaussian domain $\mathbb{Z}[\sqrt{-1}]$; it asserts that $p = 2$ is the square of a prime ideal, $p \equiv 1 \pmod 4$ splits into a product of two distinct (complex conjugate) prime ideals, $p \equiv -1 \pmod 4$ remains prime. Using the factorization law one can show by verification of local factors that

$$\zeta_K(s) = \zeta(s) L(s, \chi_4)$$

where $\chi_4$ is the non-trivial character to modulus 4 (we have $\chi_4(n) = \sin\frac{\pi n}{2}$) and $L(s, \chi_4)$ is the associated Dirichlet $L$-function

$$L(s, \chi_4) = \sum_{n=1}^{\infty} \chi_4(n) n^{-s}.$$

Hence

$$r(n) = 4 \sum_{d \mid n} \chi_4(d). \tag{1.51}$$
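Formula (1.51) is easy to test against a direct count of lattice points on the circle $a^2 + b^2 = n$. The sketch below is a small illustration only; `chi4`, `r_brute_force` and `r_formula` are names invented for this example.

```python
import math


def chi4(n):
    """Non-trivial character mod 4: values 0, 1, 0, -1 for n = 0, 1, 2, 3 (mod 4)."""
    return (0, 1, 0, -1)[n % 4]


def r_brute_force(n):
    """Number of integer pairs (a, b) with a^2 + b^2 = n, for n >= 1."""
    count = 0
    bound = math.isqrt(n)
    for a in range(-bound, bound + 1):
        rest = n - a * a
        b = math.isqrt(rest)
        if b * b == rest:
            # b = 0 gives one solution, otherwise b and -b both count.
            count += 1 if b == 0 else 2
    return count


def r_formula(n):
    """Right-hand side of (1.51): 4 * sum_{d | n} chi4(d)."""
    return 4 * sum(chi4(d) for d in range(1, n + 1) if n % d == 0)
```

For example $r(25) = 12$, coming from $(\pm 5, 0)$, $(0, \pm 5)$, $(\pm 3, \pm 4)$ and $(\pm 4, \pm 3)$, while the divisors $1, 5, 25$ of $25$ each contribute $\chi_4(d) = 1$.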



Besides the Dirichlet series there is great interest in studying the Fourier series for $r(n)$ due to the additive nature of the equation $a^2 + b^2 = n$. We have

$$\sum_{n=0}^{\infty} r(n) e(nz) = \theta^2(z)$$

(recall (1.9) that $e(z) = e^{2\pi i z}$), where $\theta(z)$ is the theta function

$$\theta(z) = \sum_{n \in \mathbb{Z}} e(n^2 z). \tag{1.52}$$

This series converges absolutely for $\operatorname{Im}(z) > 0$. More generally, letting $r_k(n)$ be the number of representations of $n$ as the sum of $k$ squares we have

$$\sum_{n=0}^{\infty} r_k(n) e(nz) = \theta^k(z).$$

On one hand, the theta function $\theta(z)$ is recognized in analytic number theory for its ability to select squares. On the other hand, $\theta(z)$ is usable due to the following transformation rule:

$$\theta\Big(\frac{az + b}{cz + d}\Big) = \nu(c, d)(cz + d)^{1/2}\theta(z) \tag{1.53}$$

which holds for any $z$ with $\operatorname{Im}(z) > 0$ and any integers $a, b, c, d$ with $ad - bc = 1$, $c \equiv 0 \pmod 4$, where $\nu(c, d)$ depends only on $c, d$. We have $\nu^2(c, d) = \chi_4(d)$, so $\nu(c, d)$ takes only four values $\pm 1, \pm i$ (see (3.42) for the exact description). In other words, $\theta(z)$ is a modular form of weight $\frac{1}{2}$ with multiplier $\nu(c, d)$ while $\theta^2(z)$ is a modular form of weight one and character $\chi_4$ on $\Gamma_0(4)$. In addition to the modular relations (1.53) the theta function satisfies the following involution equation:

$$\theta\Big(-\frac{1}{4z}\Big) = (-2iz)^{1/2}\theta(z). \tag{1.54}$$

Note that $\theta(z)$ does not vanish by virtue of the Jacobi product (1.8). More generally one associates a theta function with any positive definite quadratic form in several variables and corresponding harmonic polynomials (see Section 14.3). Yet other types of arithmetic functions are given by exponential sums over $\mathbb{Z}/m\mathbb{Z}$ or $(\mathbb{Z}/m\mathbb{Z})^*$. The most important of these for analytic number theory are Gauss sums, Ramanujan sums and Kloosterman sums. We mention here only the quadratic Gauss sum (for general Gauss sums, see Section 3.4)

$$G(m) = \sum_{x \,(\mathrm{mod}\ m)} e\Big(\frac{x^2}{m}\Big).$$

They were evaluated exactly by Gauss as a consequence of the Quadratic Reciprocity Law (see Theorem 3.3):

$$G(m) = \frac{1 + i^{-m}}{1 + i^{-1}} \sqrt{m}. \tag{1.55}$$
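The evaluation (1.55) can be confirmed numerically for small moduli by summing the exponentials directly. This is only a floating-point sanity check under the normalization $e(z) = e^{2\pi i z}$ of (1.9); the function names are assumptions of the sketch.

```python
import cmath
import math


def e(x):
    """The additive character e(x) = exp(2 pi i x) of (1.9)."""
    return cmath.exp(2j * math.pi * x)


def gauss_sum(m):
    """Quadratic Gauss sum G(m) = sum over x mod m of e(x^2 / m)."""
    # Reducing x^2 modulo m first keeps the argument of e small.
    return sum(e((x * x % m) / m) for x in range(m))


def gauss_closed_form(m):
    """Right-hand side of (1.55): (1 + i^{-m}) / (1 + i^{-1}) * sqrt(m)."""
    i_to_minus_m = (1, -1j, -1, 1j)[m % 4]   # i^{-1} = -i, i^{-2} = -1, i^{-3} = i
    return (1 + i_to_minus_m) / (1 - 1j) * math.sqrt(m)
```

In particular the closed form gives $G(m) = 0$ for $m \equiv 2 \pmod 4$, $\sqrt{m}$ for $m \equiv 1$, $i\sqrt{m}$ for $m \equiv 3$ and $(1 + i)\sqrt{m}$ for $m \equiv 0 \pmod 4$.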



The Kloosterman sums $S(a, b; c)$ encompass Ramanujan sums. They are defined by

$$S(a, b; c) = \sum_{x \,(\mathrm{mod}\ c)}^{*} e\Big(\frac{ax + b\bar{x}}{c}\Big) \tag{1.56}$$

for integers $a, b$ and $c \ge 1$. The symbol $\sum^*$ restricts the sum to $x$ coprime with $c$ and $\bar{x}$ denotes the multiplicative inverse of $x$ modulo $c$, i.e. $x\bar{x} \equiv 1 \pmod c$. Note that $S(a, b; c)$ is real, and has the following symmetries:

$$S(a, b; c) = S(b, a; c), \tag{1.57}$$

$$S(aa', b; c) = S(a, ba'; c) \quad \text{if } (a', c) = 1. \tag{1.58}$$

It also has multiplicative properties (by the Chinese Remainder Theorem)

$$S(a, b; cd) = S(a\bar{c}, b\bar{c}; d)\, S(a\bar{d}, b\bar{d}; c) \quad \text{if } (c, d) = 1. \tag{1.59}$$

Kloosterman sums will make many appearances in this book. They are crucial to modern analytic number theory through links with spectral theory of automorphic forms. Though no "simpler" formula exists (comparable to (1.55) for quadratic Gauss sums), a sharp bound for individual Kloosterman sums is known, due to A. Weil:

$$|S(a, b; c)| \le \tau(c)\,(a, b, c)^{1/2} c^{1/2}. \tag{1.60}$$
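For small moduli the definition (1.56) can be evaluated directly, which gives a numerical check that $S(a, b; c)$ is real, that the symmetry (1.57) holds, and that the values respect the bound (1.60). The sketch is a brute-force illustration only; the function names are hypothetical, and the built-in `pow(x, -1, c)` for the modular inverse requires Python 3.8 or later.

```python
import cmath
import math


def kloosterman(a, b, c):
    """S(a, b; c) from (1.56), summed over x mod c with gcd(x, c) = 1."""
    total = 0j
    for x in range(1, c + 1):
        if math.gcd(x, c) == 1:
            # Modular inverse of x; for c = 1 every residue is 0.
            x_inv = pow(x, -1, c) if c > 1 else 0
            total += cmath.exp(2j * math.pi * ((a * x + b * x_inv) % c) / c)
    return total


def divisor_count(n):
    """tau(n), the number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def weil_bound(a, b, c):
    """Right-hand side of (1.60): tau(c) * gcd(a, b, c)^(1/2) * c^(1/2)."""
    g = math.gcd(math.gcd(a, b), c)
    return divisor_count(c) * math.sqrt(g) * math.sqrt(c)
```

For $b = 0$ the same function returns the Ramanujan sum, for example `kloosterman(1, 0, 5)` is $\mu(5) = -1$ up to rounding.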

We will prove this in Chapter 11 (see Corollary 11.12) and use it many times. When $b = 0$ (or $a = 0$, see (1.57)), the Kloosterman sum $S(n, 0; m)$ is called a Ramanujan sum. It is also sometimes denoted $c_m(n)$. For these there is an explicit formula, see (3.2) and (3.3).

1.5. Arithmetic functions on average.

In the course of analytic number theory the averages of arithmetic functions of various kinds appear all over the place. The first task is to establish estimates for finite sums of type

$$M_f(x) = \sum_{n \le x} f(n). \tag{1.61}$$

In this section we show a few elementary ideas. The first common tool is summation by parts, which is the formula

$$M_{fg}(x) = \sum_{n \le x} f(n)g(n) = M_f(x)g(x) - \int_1^x M_f(t)g'(t)\,dt$$

and its obvious variants. This formula often allows us to evaluate $M_{fg}(x)$ if $M_f$ is known and $g$ is smooth and does not oscillate. If $f$ is defined on $[1, x]$, monotonic and continuous, then $M_f(x)$ is well approximated by a corresponding integral, precisely

$$M_f(x) = \int_1^x f(y)\,dy + O(|f(x)| + |f(1)|). \tag{1.62}$$



For example, we get

$$\sum_{n \le x} (\log n)^k = xP_k(\log x) + O((\log x)^k) \tag{1.63}$$

where $P_k(X)$ is a polynomial of degree $k$

$$P_k(X) = \sum_{0 \le \ell \le k} (-1)^{k - \ell} \frac{k!}{\ell!} X^{\ell},$$

and (1.64)

If $f(y)$ is continuously decreasing to zero, one can do better than (1.62). Indeed, putting

$$[x] = \max\{\ell \in \mathbb{Z} : \ell \le x\} \tag{1.65}$$

(the integral part of $x$) and

$$\{x\} = x - [x] \tag{1.66}$$

(the fractional part of $x$) the average of $f$ is given by the Stieltjes integral

$$M_f(x) = \int_{1^-}^{x} f(y)\,d[y] = \int_{1^-}^{x} f(y)\,dy - \int_{1^-}^{x} f(y)\,d\{y\}.$$

Hence

$$M_f(x) = \int_1^x f(y)\,dy + \gamma_f + O(f(x)) \tag{1.67}$$

where the constant $\gamma_f$ is given by

$$\gamma_f = -\int_{1^-}^{\infty} f(y)\,d\{y\} = f(1) + \int_1^{\infty} \{y\}\,df(y).$$

For example, we get (1.68) where $\gamma_k$ is called the Stieltjes constant. For $k = 0$ we get the Euler constant

$$\gamma = 1 - \int_1^{\infty} \{y\} y^{-2}\,dy = 0.577215\ldots \tag{1.69}$$
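For $f(y) = y^{-1}$ the formula (1.67) reads $\sum_{n \le x} n^{-1} = \log x + \gamma + O(x^{-1})$, which can be observed numerically. The sketch below is only an illustration; the value of `EULER_GAMMA` is the standard decimal expansion of the Euler constant, hard-coded as an assumption rather than computed.

```python
import math

EULER_GAMMA = 0.5772156649015329   # reference value of the constant in (1.69)


def harmonic_minus_log(x):
    """sum_{n <= x} 1/n - log x for an integer x >= 1."""
    return sum(1.0 / n for n in range(1, x + 1)) - math.log(x)


approximation = harmonic_minus_log(10 ** 5)
```

The difference `harmonic_minus_log(x) - EULER_GAMMA` is about $1/(2x)$, in agreement with the error term $O(f(x))$ in (1.67).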

Similarly one can evaluate averages of smooth functions in several variables by repeated applications of the integral approximation method. In this way one can derive Gauss' formula for the lattice points in a circle

$$\sum_{n \le x} r(n) = \pi x + O(\sqrt{x}). \tag{1.70}$$



Actually, Gauss' argument was geometrical in nature; he derived (1.70) by packing the circle with unit squares (which amounts to the same thing as the integration). In the same way one obtains (1.71) where $\rho_k = \pi^{k/2}/\Gamma(\frac{k}{2} + 1)$ is the volume of the $k$-dimensional unit ball.

Most interesting arithmetic functions $f$ are not defined on all real numbers in which case it makes no sense to approximate $M_f(x)$ by the integral (1.67). Yet, if $f$ is given by a convolution of two functions one of which is smooth on real numbers and the other vanishes rapidly on natural numbers, it is still possible to evaluate $M_f(x)$ by means of integrals and convergent series. Specifically, if $f = g * h$, where $h$ is monotonic and bounded, we arrange $M_f(x)$ as

$$M_f(x) = \sum_{m \le x} g(m) M_h\Big(\frac{x}{m}\Big),$$

and, after replacing Mh(Y) by

Mt(x)

= {

iy

L

h(t)dt + 0(1), we get

g~)h(~)dy+O(L

m~x

jg(m)l).

m~x

Estimating the partial sum over y < m ( x trivially we get (1.72) where (1.73)

F(y) =

g~) h(~).

L m~y

If we assume that the series L g(m)m- 1 converges absolutely, then this formula is a slight modification of a theorem of A. Wintner. As an example we apply (1. 72) for f(rn) =
"'
~---;;:--

n~x

x

= (( 2 ) + O(logx).

For the divisor function $\tau(n)$ the sum $M_\tau(x)$ has a geometric interpretation, namely it represents the number of lattice points $(m, n) \in \mathbb{N} \times \mathbb{N}$ under the hyperbola $mn \le x$. In this case our general formula (1.72) yields

$$M_\tau(x) = \sum_{n \le x} \tau(n) = \sum_{n \le x} \Big[\frac{x}{n}\Big] = x\log x + O(x)$$

showing that the average value of $\tau(n)$ is $\log n$. In order to improve the error term in the divisor problem Dirichlet applied a simple, yet powerful trick with switching divisors. First by symmetry of the equation $m_1 m_2 = n$ we figure that

$$M_\tau(x) = 2\sum_{m \le \sqrt{x}} \Big[\frac{x}{m}\Big] - [\sqrt{x}]^2.$$



Here we have fewer terms of summation than in the straightforward arrangement. This trick is called the hyperbola method. Now dropping the integral part symbols we make an error $O(\sqrt{x})$, and it follows by (1.69) that

$$\sum_{n \le x} \tau(n) = x\log x + (2\gamma - 1)x + O(\sqrt{x}). \tag{1.75}$$
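The hyperbola method is also a practical algorithm: it computes $\sum_{n \le x} \tau(n)$ with about $\sqrt{x}$ integer divisions instead of $x$. The following sketch compares the two arrangements and the main term of (1.75); the function names and the hard-coded value of the Euler constant are assumptions made for this illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329   # reference value of the Euler constant


def divisor_sum_naive(x):
    """sum_{n <= x} tau(n) written as sum_{m <= x} [x / m] (x terms)."""
    return sum(x // m for m in range(1, x + 1))


def divisor_sum_hyperbola(x):
    """Dirichlet's arrangement: 2 * sum_{m <= sqrt(x)} [x / m] - [sqrt(x)]^2."""
    root = math.isqrt(x)
    return 2 * sum(x // m for m in range(1, root + 1)) - root * root


def main_term(x):
    """Main term x log x + (2 gamma - 1) x of (1.75)."""
    return x * math.log(x) + (2 * EULER_GAMMA - 1) * x
```

For $x = 10^6$ the exact sum and the main term differ by far less than $\sqrt{x} = 1000$, consistent with the error term in (1.75).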

EXERCISE 2. Prove by Dirichlet's method that

$$\sum_{n \le x} \tau_k(n) = xP_k(\log x) + O(x^{1 - \frac{1}{k}})$$

where $P_k$ is a polynomial of degree $k - 1$ ($xP_k(\log x)$ is the residue of $\zeta(s)^k x^s s^{-1}$ at $s = 1$).

Dirichlet's hyperbola method works nicely for the lattice points in a ball of dimension $k \ge 4$. Lagrange proved that every natural number can be represented as the sum of four squares, i.e., $r_4(n) > 0$, and Jacobi established the exact formula for the number of representations

$$r_4(n) = 8(2 + (-1)^n) \sum_{d \mid n,\ d\ \mathrm{odd}} d.$$

Hence we derive

$$\sum_{n \le x} r_4(n) = 2x^2 \sum_{m=1}^{\infty} (2 + (-1)^m) m^{-2} + O(x\log x) = 3\zeta(2)x^2 + O(x\log x) = \tfrac{1}{2}(\pi x)^2 + O(x\log x).$$
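Jacobi's formula and the resulting average can be checked by counting representations directly. The sketch below builds the table of $r_4(n)$ by additive convolution, as in (1.6), starting from the counts of single squares; the names `additive_convolution`, `four_square_counts`, `r4_jacobi` and the cutoff `LIMIT` are illustrative assumptions.

```python
import math


def additive_convolution(f, g, limit):
    """h[n] = sum_{l + m = n} f[l] * g[m] for 0 <= n <= limit, as in (1.6)."""
    h = [0] * (limit + 1)
    for l in range(limit + 1):
        if f[l]:
            for m in range(limit + 1 - l):
                h[l + m] += f[l] * g[m]
    return h


def four_square_counts(limit):
    """Table of r_4(n), 0 <= n <= limit, counting signs and order."""
    r1 = [0] * (limit + 1)
    root = math.isqrt(limit)
    for a in range(-root, root + 1):
        r1[a * a] += 1                     # r1[0] = 1, r1[k^2] = 2 for k >= 1
    r2 = additive_convolution(r1, r1, limit)
    return additive_convolution(r2, r2, limit)


def r4_jacobi(n):
    """Jacobi's formula: 8 * (2 + (-1)^n) * (sum of the odd divisors of n)."""
    odd_divisor_sum = sum(d for d in range(1, n + 1, 2) if n % d == 0)
    return 8 * (2 + (-1) ** n) * odd_divisor_sum


LIMIT = 200
r4 = four_square_counts(LIMIT)
total = sum(r4[1:])                        # sum of r_4(n) over 1 <= n <= LIMIT
```

With `LIMIT = 200` the total is close to $\frac{1}{2}(\pi \cdot 200)^2 \approx 197392$, the volume of the four-dimensional ball of radius $\sqrt{200}$.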

This result extends easily for any $k \ge 4$ (write $r_k$ as the additive convolution of $r_4$ and $r_{k-4}$, apply the above result for $r_4$ and execute the summation over the remaining $k - 4$ squares by integration)

$$\sum_{n \le x} r_k(n) = \frac{(\pi x)^{k/2}}{\Gamma(\frac{k}{2} + 1)} + O(x^{\frac{k}{2} - 1}\log x). \tag{1.76}$$

Notice that this improves the formula (1.71) which was obtained by the method of packing with a unit square. The exponent $\frac{k}{2} - 1$ in (1.76) is best possible because the individual terms of summation can be as large as the error term (apart from $\log x$), indeed for $k = 4$ we have $r_4(n) \ge 8n$ if $n$ is odd by the Jacobi formula. The only cases of the lattice point problem for a ball which are not yet solved (i.e., the best possible error terms are not yet established) are for the circle ($k = 2$) and the sphere ($k = 3$). We shall address these fascinating problems on other occasions.


EXERCISE 3. Prove by the hyperbola method that

$$\sum_{n \le x} \tau(n^2 + 1) = \frac{3}{\pi} x\log x + O(x).$$

1.6. Sums of multiplicative functions.

Throughout $f$ is a multiplicative function. Since $f$ is determined by its values at prime powers it is possible to estimate $M_f(x)$ in terms of the local sums

$$\sigma_p(f) = \sum_{\nu=0}^{\infty} f(p^{\nu}) p^{-\nu}. \tag{1.77}$$

First we give simple upper bounds when $f$ is non-negative. The following estimates need no explanation

(1.78)

One can do better if $f$ is non-decreasing on prime powers. In this case $h = \mu * f$ is non-negative, indeed $h(p^{\nu}) = f(p^{\nu}) - f(p^{\nu - 1}) \ge 0$ if $\nu \ge 1$. Writing $f = 1 * h$ we get

$$M_f(x) = \sum_{m \le x} h(m)\Big[\frac{x}{m}\Big] \le x\sum_{m \le x} h(m)m^{-1} \le x\prod_{p \le x} \sigma_p(h) = x\prod_{p \le x} \sigma_p(f)\Big(1 - \frac{1}{p}\Big). \tag{1.79}$$

Here is a crude but useful application of (1.79) for $f(m) = \tau_k(m)^{\ell}$. We have $f(p) = k^{\ell}$ and

$$\sigma_p(f) \le \Big(1 + \frac{1}{p}\Big)^{k^{\ell}} \Big(1 + \frac{1}{p^2}\Big)^{c}$$

where $c = c(k, \ell)$ is a positive constant. We have also the elementary bound

$$\prod_{p \le x} \Big(1 + \frac{1}{p}\Big) \ll \log x$$

for any $x \ge 2$ (see (2.15)). Hence (1.79) yields

$$\sum_{n \le x} \tau_k(n)^{\ell} \ll x(\log x)^{k^{\ell} - 1} \tag{1.80}$$

where the implied constant depends on $k, \ell$. This is a crude bound, but of the correct order of magnitude. Notice that it implies that

$$\tau_k(n) \ll n^{\varepsilon} \tag{1.81}$$

for any $\varepsilon > 0$, the implied constant depending on $\varepsilon$ and $k$. We shall be using these bounds for the divisor function $\tau_k(n)$ often without mention.



If one takes the average of $f(n)$ over squarefree numbers, or what amounts to the same thing one assumes $f$ is supported on squarefree numbers, then the above arguments yield

$$M_f(x) \le x\prod_{p \le x} \Big(1 + \frac{f(p) - 1}{p}\Big) \tag{1.82}$$

provided $f(p) \ge 1$ for all $p$. We may require $f(p) \ge 1$ to hold true for primes only in a certain set. Then the result is (1.83)

Here is another elementary approach. We only assume that $f$ is non-negative and completely sub-multiplicative, i.e., $f(mn) \le f(m)f(n)$ for all $m, n \ge 1$. Moreover, suppose that $f(p)$ is bounded on average,

$$\sum_{m \le x} f(m)\Lambda(m) \le cx \tag{1.84}$$

for any $x \ge 2$, where $c \ge 1$ is a constant (see the Tchebyshev bound (2.12)). Then we get

$$\sum_{n \le x} f(n)\log n = \sum_{nm \le x} f(nm)\Lambda(m) \le cx\sum_{n \le x} f(n)n^{-1}.$$

Hence by partial summation (1.85) where

d="Le~)t n(x

To illustrate this result we take the multiplicative function $f(n) = b(n)$ which is the characteristic function of numbers representable as a sum of two squares. We have $f(p) = 0$ if $p \equiv -1 \pmod 4$ and $f(p) = 1$ otherwise. In this case (see (2.30))

$$\prod_{p \le x} \Big(1 + \frac{f(p)}{p}\Big) \ll (\log x)^{1/2}$$

for any $x \ge 2$. Hence one deduces by (1.85) that

$$B(x) = |\{n \le x : n = a^2 + b^2\}| \ll x(\log x)^{-1/2}. \tag{1.86}$$

Here every number which admits a representation as the sum of two squares is counted once. For this reason the upper bound (1.86) is slightly smaller than the area of the circle $a^2 + b^2 \le x$ (see (1.70)). Using analytic methods E. Landau [Lal] established the asymptotic formula

$$B(x) = \sum_{n \le x} b(n) \sim Cx(\log x)^{-1/2}, \tag{1.87}$$


as $x \to \infty$, where $C$ is a positive constant. Landau's formula also follows by an elementary method with $C$ given by (1.102). Very differently, still elementarily, the formula (1.87) is proved in [18] by applying a half-dimensional sieve. The sieve method has the advantage of producing results of great uniformity. For example, one gets

$$B(x + y) - B(x) < (1 + \varepsilon)B(y)$$

for any $\varepsilon > 0$ provided $y$ is sufficiently large in terms of $\varepsilon$ only, where $x$ is any positive number. If $x$ is very large compared to $y$, then analytic arguments produce rather poor results for $B(x + y) - B(x)$. Next we present an elementary method which yields an asymptotic formula for

$M_f(x)$ for a large class of multiplicative functions. This requires some regularity in the distribution of $f$ at prime powers. Letting $\Lambda_f$ be the von Mangoldt function associated with $f$ (see (1.50)) we assume that

$$\sum_{n \le x} \Lambda_f(n) = \kappa\log x + O(1) \tag{1.88}$$

where $\kappa$ is a constant. This condition means $f(p)$ is about $\kappa p^{-1}$ on average whereas it was $\kappa$ previously. For many multiplicative functions in practice (1.88) can be established by elementary methods (this condition is an analogue of the Mertens formula (2.14), and it is weaker than the Prime Number Theorem). With some applications in mind we allow $f(p)$ to be slightly negative. Precisely the method works if (1.88) holds with $\kappa > -\frac{1}{2}$. In addition to this we need a crude estimate (1.89)

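The method begins with evaluation of the logarithmically smoothed sum

$$\sum_{n \leq x} f(n)\log n = M_f(x)\log x - \int_1^x M_f(y)y^{-1}\,dy. \tag{1.90}$$

Since $f \cdot L = f * \Lambda_f$ we derive by (1.88) that

$$\sum_{n \leq x} f(n)\log n = \sum_{d \leq x} f(d) \sum_{m \leq x/d} \Lambda_f(m) = \sum_{d \leq x} f(d)\Big(\kappa \log\frac{x}{d} + O(1)\Big).$$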

Estimating the sum of error terms by means of (1.89) we get

$$(\kappa + 1)\sum_{n \leq x} f(n)\log n = \kappa M_f(x)\log x + O((\log x)^{|\kappa|}).$$

Inserting this into (1.90) we show that the difference

$$\Delta(x) = M_f(x)\log x - (\kappa+1)\int_2^x M_f(y)y^{-1}\,dy \tag{1.91}$$

is relatively small, namely

$$\Delta(x) \ll (\log x)^{|\kappa|}. \tag{1.92}$$



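Next we divide (1.91) by $x(\log x)^{\kappa+2}$ and integrate getting

$$\int_2^x \Delta(y)y^{-1}(\log y)^{-\kappa-2}\,dy = \int_2^x M_f(y)y^{-1}(\log y)^{-\kappa-1}\,dy - (\kappa+1)\int_2^x y^{-1}(\log y)^{-\kappa-2}\Big(\int_2^y M_f(u)u^{-1}\,du\Big)dy.$$

Changing the order of integration in the multiple integral we get

$$\int_2^x M_f(y)y^{-1}(\log y)^{-\kappa-1}\,dy + \int_2^x \big[(\log x)^{-\kappa-1} - (\log u)^{-\kappa-1}\big]M_f(u)u^{-1}\,du.$$

The first integral cancels out with the second part of the last one, so we are left with

$$\int_2^x \Delta(y)y^{-1}(\log y)^{-\kappa-2}\,dy = (\log x)^{-\kappa-1}\int_2^x M_f(u)u^{-1}\,du.$$

Combining this with (1.91) we arrive at the following identity:

$$M_f(x) = (\log x)^{-1}\Delta(x) - (\log x)^{\kappa}\int_2^x \Delta(y)\,d(\log y)^{-\kappa-1}.$$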

Since the above integral converges absolutely, we can extend its range to infinity getting the identity

$$M_f(x) = (\log x)^{\kappa}\{c_f + r_f(x)\} \tag{1.93}$$

where $c_f$ is a constant given by the definite integral

$$c_f = -\int_2^\infty \Delta(y)\,d(\log y)^{-\kappa-1} \tag{1.94}$$

and $r_f(x)$ is a function given by

$$r_f(x) = \int_x^\infty (\Delta(y) - \Delta(x))\,d(\log y)^{-\kappa-1}. \tag{1.95}$$

This tends to zero as $x \to \infty$, indeed by (1.92)

$$r_f(x) \ll (\log x)^{|\kappa|-\kappa-1}. \tag{1.96}$$

However, the constant (1.94) does not look natural. Now having established the asymptotic formula (1.93) with the error term (1.96) we shall use these results to derive another expression for $c_f$ by an appeal to the zeta function for $f$,

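$$D_f(s) = \sum_{n=1}^{\infty} f(n)n^{-s}.$$

The series converges absolutely for $s > 0$ by virtue of (1.89). We compute by partial summation that

$$D_f(s) = \int_1^\infty y^{-s}\,dM_f(y) = -\int_1^\infty M_f(y)\,dy^{-s} = -\int_0^\infty M_f(e^t)\,de^{-st}$$

$$= -\int_0^\infty \{c_f + O(t^{|\kappa|-\kappa-1})\}t^{\kappa}\,de^{-st} = \{c_f + O(s^{\kappa+1-|\kappa|})\}s^{-\kappa}\Gamma(\kappa+1)$$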

as $s \to 0+$. Comparing this with the $\kappa$-th power of the Riemann zeta function we get

$$\zeta(s+1)^{-\kappa}D_f(s) \sim c_f\Gamma(\kappa+1).$$

On the other hand, we have the Euler product

$$\zeta(s+1)^{-\kappa}D_f(s) = \prod_p (1 - p^{-s-1})^{\kappa}(1 + f(p)p^{-s} + f(p^2)p^{-2s} + \cdots)$$

which converges absolutely for $s > 0$ and it has a limit as $s \to 0$ by virtue of (1.88). Hence the constant $c_f$ is equal to

$$c_f = \frac{1}{\Gamma(\kappa+1)}\prod_p \Big(1 - \frac1p\Big)^{\kappa}(1 + f(p) + f(p^2) + \cdots). \tag{1.97}$$

We have established the following formula (if $\kappa \geq 0$, this is due to E. Wirsing [Wi1]).

THEOREM 1.1. Suppose $f$ is a multiplicative function which satisfies (1.88) and (1.89) with $\kappa > -\frac12$. Then

$$\sum_{n \leq x} f(n) = c_f(\log x)^{\kappa} + O((\log x)^{|\kappa|-1}) \tag{1.98}$$

where $c_f$ is a constant given by (1.97).
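As a sanity check of Theorem 1.1, one may take $f(n) = \tau(n)/n$, for which $D_f(s) = \zeta(s+1)^2$, so that $\kappa = 2$ and the Euler product for the constant collapses to $c_f = \frac12$. This choice of test function is ours, not the text's. The sketch below, with hypothetical helper names, computes $\sum_{n \leq x}\tau(n)/n - \frac12(\log x)^2$ divided by $\log x$, which the theorem predicts to be bounded.

```python
import math

def divisor_counts(x):
    """Return a list tau with tau[n] = number of divisors of n, for n <= x."""
    tau = [0] * (x + 1)
    for d in range(1, x + 1):
        for multiple in range(d, x + 1, d):
            tau[multiple] += 1
    return tau

def normalized_remainder(x):
    """(sum_{n<=x} tau(n)/n - (1/2) log^2 x) / log x, expected to stay bounded."""
    tau = divisor_counts(x)
    total = sum(tau[n] / n for n in range(1, x + 1))
    main_term = 0.5 * math.log(x) ** 2   # c_f (log x)^kappa with c_f = 1/2, kappa = 2
    return (total - main_term) / math.log(x)
```

The remainder is of exact order $\log x$ here, which matches the error term $O((\log x)^{|\kappa|-1})$ in (1.98).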

We illustrate the applicability of this formula by solving a simple sieve problem. Let $P$ be a set of primes. Our question is how many numbers $n \leq x$ have no prime divisors in $P$? Let $f$ be the completely multiplicative function with

$$f(p) = \begin{cases} 1 & \text{if } p \in P,\\ 0 & \text{otherwise.}\end{cases}$$

Then

$$g(n) = \prod_{p \mid n}(1 - f(p))$$

is the characteristic function of numbers in question. Therefore we are asking to evaluate

$$S(P, x) = \sum_{n \leq x} g(n).$$

Opening the convolution $g = \mu f * 1$ we obtain

$$S(P, x) = \sum_{d \leq x}\mu(d)f(d)\Big[\frac{x}{d}\Big] = x\sum_{d \leq x}\frac{\mu(d)f(d)}{d} + O(M_f(x)).$$

To proceed further we are going to assume that the proportion of primes in $P$ is less than the proportion of those not in $P$. More precisely we need the condition

$$\sum_{p \leq x} f(p)\frac{\log p}{p} = \kappa\log x + O(1) \tag{1.99}$$

with a constant $\kappa < \frac12$. Then we derive $M_f(x) \ll x(\log x)^{\kappa-1}$ by (1.85) and

$$\sum_{d \leq x}\frac{\mu(d)f(d)}{d} = c(P)(\log x)^{-\kappa} + O((\log x)^{\kappa-1})$$

by (1.98), where $c(P)$ is a constant depending on the set $P$,

$$c(P) = \frac{1}{\Gamma(1-\kappa)}\prod_p \Big(1 - \frac{f(p)}{p}\Big)\Big(1 - \frac1p\Big)^{-\kappa}. \tag{1.100}$$

Gathering the above results we conclude

COROLLARY 1.2. Let $P$ be a set of primes of density $\kappa < \frac12$ (that is (1.99) holds). Then the number of integers $1 \leq n \leq x$ having no prime divisors in $P$ satisfies

$$S(P, x) = c(P)x(\log x)^{-\kappa}\{1 + O((\log x)^{2\kappa-1})\} \tag{1.101}$$

where $c(P)$ is the constant given by (1.100).

EXERCISE 4. Apply (1.98) to derive Landau's formula (1.87) with the constant

$$C = \frac{1}{\sqrt2}\prod_{p \equiv -1 (\bmod 4)}\Big(1 - \frac{1}{p^2}\Big)^{-\frac12}. \tag{1.102}$$
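The product (1.102) converges quickly, so its value is easy to approximate. The following Python sketch truncates the product at a chosen limit; the helper names and the truncation point are assumptions of the illustration.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def landau_constant(limit):
    """Truncation of (1.102): (1/sqrt 2) * prod over p = 3 (mod 4), p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        if p % 4 == 3:
            product *= (1.0 - 1.0 / (p * p)) ** -0.5
    return product / math.sqrt(2.0)
```

With the product taken over primes up to $10^5$ the result agrees with the known value $C = 0.7642\ldots$ to several decimal places.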

1.7. Distribution of additive functions.

Throughout $f$ is an additive function, so $f$ is determined by its values at prime powers, precisely

$$f(n) = \sum_{p^\alpha \| n} f(p^\alpha).$$

Hence the average of $f$ is given by

$$M_f(x) = \sum_{p^\alpha \leq x} f(p^\alpha)\Big(\Big[\frac{x}{p^\alpha}\Big] - \Big[\frac{x}{p^{\alpha+1}}\Big]\Big).$$

Dropping the integral part symbols we arrive at

$$M_f(x) = xE(x) + O(x^{\frac12}D(x)), \tag{1.103}$$

where

$$E(x) = \sum_{p^\alpha \leq x} f(p^\alpha)p^{-\alpha}(1 - p^{-1}) \tag{1.104}$$

and

$$D^2(x) = \sum_{p^\alpha \leq x} |f(p^\alpha)|^2 p^{-\alpha}. \tag{1.105}$$

Actually the error term in (1.103) is $\sum |f(p^\alpha)|$ which we estimated by $x^{\frac12}D(x)$ by applying Cauchy's inequality and that $p^\alpha \leq x$. The result reveals that $E(x)$ is approximately equal to the mean value of $f$. Similarly we evaluate the average of $f^2$. We have

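$$M_{f^2}(x) = \sum_{p^\alpha}\sum_{q^\beta} f(p^\alpha)f(q^\beta)\sum_{\substack{n \leq x\\ p^\alpha \| n,\ q^\beta \| n}} 1$$

where $p, q$ run over primes. If $p \neq q$, the inner sum is equal to

$$\Big[\frac{x}{p^\alpha q^\beta}\Big] - \Big[\frac{x}{p^{\alpha+1}q^\beta}\Big] - \Big[\frac{x}{p^\alpha q^{\beta+1}}\Big] + \Big[\frac{x}{p^{\alpha+1}q^{\beta+1}}\Big] = \frac{x}{p^\alpha q^\beta}\Big(1 - \frac1p\Big)\Big(1 - \frac1q\Big) + O(1),$$

and for $p = q$ we must have $\alpha = \beta$ getting

$$\Big[\frac{x}{p^\alpha}\Big] - \Big[\frac{x}{p^{\alpha+1}}\Big] = \frac{x}{p^\alpha}\Big(1 - \frac1p\Big) + O(1).$$

Hence we derive that

$$M_{f^2}(x) = x\mathop{\sum\sum}_{\substack{p^\alpha, q^\beta \leq x\\ p \neq q}} f(p^\alpha)f(q^\beta)p^{-\alpha}q^{-\beta}(1 - p^{-1})(1 - q^{-1}) + x\sum_{p^\alpha \leq x} f^2(p^\alpha)p^{-\alpha}(1 - p^{-1}) + O\Big(\mathop{\sum\sum}_{p^\alpha, q^\beta \leq x} |f(p^\alpha)f(q^\beta)|\Big).$$

Here we drop the condition $p \neq q$ and estimate the additional terms with $p = q$ using $|f(p^\alpha)f(p^\beta)| \leq |f(p^\alpha)|^2 + |f(p^\beta)|^2$. We arrive at

$$M_{f^2}(x) = xE^2(x) + O(xD^2(x)). \tag{1.106}$$

This result says that $E^2(x)$ is approximately equal to the mean value of $f^2$.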

Both approximations (1.103) and (1.106) suggest that the individual values $f(n)$ for $n \leq x$ should be reasonably well approximated by $E(x)$. Indeed this is true in the following sense:

THEOREM 1.3. For any additive function $f : \mathbb{N} \to \mathbb{C}$ we have

$$\sum_{n \leq x} |f(n) - E(x)|^2 \leq c\,xD^2(x) \tag{1.107}$$

where $c$ is an absolute constant.

PROOF. We can assume that $f$ is real since the real and the imaginary parts of $f$ are additive and they can be treated separately. We can also assume that $x$ is a positive integer. First we evaluate the variance

$$V(x) = \sum_{n \leq x}(f(n) - x^{-1}M_f(x))^2 = M_{f^2}(x) - x^{-1}M_f^2(x).$$

By (1.103) and (1.106) we get

$$V(x) \ll x^{\frac12}|E(x)|D(x) + xD^2(x) \ll xD^2(x).$$

Then replacing $x^{-1}M_f(x)$ by $E(x)$ we make an admissible error, and (1.107) is established. $\Box$

Theorem 1.3 is due to P. Turán [Tul] and J. Kubilius [Kub1]. The expected value $E(x)$, if one prefers, can be reduced to

$$A(x) = \sum_{p \leq x} f(p)p^{-1} \tag{1.108}$$


at the expense of the constant $c$ in (1.107). Our arguments above yield $c = 4$ if $x$ is sufficiently large, however, J. Kubilius [Kub2] showed (after having received suggestions from H. L. Montgomery) that (1.107) holds with $c = c(x) \to \frac32$ as $x \to \infty$. He and others also showed that the $\frac32$ cannot be replaced by a smaller constant (cf. A. Hildebrand [Hil]).

As an example we choose the additive function $\omega(n)$ which counts the number of distinct prime divisors of $n$. Therefore $\omega(p^\alpha) = 1$, $E(x) = \log\log x + O(1)$ and $D^2(x) = \log\log x + O(1)$ by (2.15), so the Turán-Kubilius theorem gives

$$\sum_{n \leq x}(\omega(n) - \log\log x)^2 \ll x\log\log x. \tag{1.109}$$
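The estimate (1.109) is easy to test by direct computation. The Python sketch below tabulates $\omega(n)$ with a sieve and returns the second moment about $\log\log x$ divided by $x\log\log x$; the function names are ours and serve only as an illustration.

```python
import math

def omega_values(x):
    """Return a list omega with omega[n] = number of distinct primes dividing n."""
    omega = [0] * (x + 1)
    for p in range(2, x + 1):
        if omega[p] == 0:  # no smaller prime divides p, hence p is prime
            for multiple in range(p, x + 1, p):
                omega[multiple] += 1
    return omega

def second_moment_ratio(x):
    """sum_{n<=x} (omega(n) - loglog x)^2 divided by x loglog x."""
    omega = omega_values(x)
    center = math.log(math.log(x))
    second_moment = sum((omega[n] - center) ** 2 for n in range(1, x + 1))
    return second_moment / (x * center)
```

For moderate $x$ the ratio is well below 1, consistent with a bounded implied constant in (1.109).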

Hence it follows that $\omega(n) \sim \log\log n$ for almost all $n$ (a theorem of Hardy and Ramanujan). Precisely we have

COROLLARY 1.4. For any $x \geq 3$ and $z \geq 1$,

$$|\{n \leq x : |\omega(n) - \log\log x| > z(\log\log x)^{\frac12}\}| \ll z^{-2}x \tag{1.110}$$

where the implied constant is absolute.

Furthermore one can ask if a given additive function $f$ can be renormalized by $A(x)$ and some $B(x)$ so that the frequencies

$$\nu(x; z) = \frac1x\Big|\Big\{n \leq x : \frac{f(n) - A(x)}{B(x)} \leq z\Big\}\Big| \tag{1.111}$$

have a limit for any $z \in \mathbb{R}$ as $x$ tends to $\infty$. This question is addressed in the probabilistic theory of numbers. We shall only give a glimpse of central results. The very first one was established by P. Erdős and M. Kac [EK].

THEOREM 1.5. Let $f(n)$ be an additive function with $-1 \leq f(p^\alpha) = f(p) \leq 1$ such that

$$B^2(x) = \sum_{p \leq x} f^2(p)p^{-1} \to \infty \tag{1.112}$$

as $x \to \infty$. Then $\nu(x; z) \to G(z)$ for any $z \in \mathbb{R}$, where $G(z)$ is the Gaussian distribution function

$$G(z) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-t^2/2}\,dt. \tag{1.113}$$

In particular, it follows from the Erdős-Kac theorem that for any real $z$

$$|\{n \leq x : \omega(n) - \log\log x \leq z(\log\log x)^{\frac12}\}| \sim G(z)x$$

as $x$ tends to $\infty$. Many other arithmetic functions (not necessarily additive) follow the Gaussian limiting distribution law. For example, H. Halberstam [Hal] proved that the number of prime divisors of shifted primes satisfies

$$|\{p \leq x : \omega(p+1) - \log\log x \leq z(\log\log x)^{\frac12}\}| \sim G(z)\pi(x) \tag{1.114}$$

for any fixed $z \in \mathbb{R}$ as $x \to \infty$.

CHAPTER 2

ELEMENTARY THEORY OF PRIME NUMBERS

2.1. The Prime Number Theorem.

It was observed nearly 200 years ago by Legendre and Gauss (independently) that the density of primes $p \leq x$ is $(\log x)^{-1}$, precisely they postulated the following

PRIME NUMBER THEOREM. Let $\pi(x)$ denote the number of primes $p \leq x$. As $x \to \infty$, we have

$$\pi(x) \sim \frac{x}{\log x}. \tag{2.1}$$
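The asymptotic (2.1) can be compared with actual prime counts. The following Python sketch, whose function names are ours, counts primes with the sieve of Eratosthenes and returns the ratio $\pi(x)\log x / x$, which the theorem asserts tends to 1.

```python
import math

def prime_count(x):
    """Return pi(x), the number of primes p <= x, using a sieve."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)

def pnt_ratio(x):
    """Return pi(x) * log(x) / x, which tends to 1 by the Prime Number Theorem."""
    return prime_count(x) * math.log(x) / x
```

The convergence is slow: at $x = 10^6$ the ratio still exceeds 1 by roughly eight percent, which is one reason a sharper approximation to $\pi(x)$ is of interest.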

Gauss observed that an even better approximation to $\pi(x)$ is given by the singular integral

$$\mathrm{Li}(x) = \int_0^x \frac{dy}{\log y} = \int_1^x \Big(1 - \frac1y\Big)\frac{dy}{\log y} + \log\log x + \gamma. \tag{2.2}$$

For $x > 1$ this satisfies the asymptotic expansion

$$\mathrm{Li}(x) = \frac{x}{\log x}\Big\{\sum_{0 \leq \ell < m} \ell!\,(\log x)^{-\ell} + O((\log x)^{-m})\Big\}. \tag{2.3}$$
Some people use $\mathrm{li}(x) = \mathrm{Li}(x) - \mathrm{Li}(2)$ in place of $\mathrm{Li}(x)$ for $x \geq 2$. Later, Tchebyshev realized it is simpler to count primes $p$ with the weight $\log p$, so he investigated the sum

$$\theta(x) = \sum_{p \leq x}\log p$$

rather than $\pi(x)$. It is still more convenient to evaluate the average of the von Mangoldt function

$$\psi(x) = \sum_{n \leq x}\Lambda(n). \tag{2.4}$$

Note that (2.1) is equivalent to each of the asymptotic formulas

$$\theta(x) \sim x, \qquad \psi(x) \sim x \tag{2.5}$$

by partial summation and trivial estimation of the contribution of $n = p^\alpha \leq x$ with $\alpha \geq 2$.

Although $\Lambda$ is given by the convolution (1.40) the formula (1.72) falls short of yielding (2.5) because the error term exceeds the main term. Applying Dirichlet's trick of switching divisors (the hyperbola method) one can reduce the number of error terms, but it does not complete the work. This only leads to another problem of estimating sums of the Möbius function. Indeed, one can show by the hyperbola method that the Prime Number Theorem is equivalent to the estimate

$$M(x) = \sum_{m \leq x}\mu(m) = o(x), \quad \text{as } x \to \infty. \tag{2.6}$$

To derive (2.5) from (2.6) we consider

$$\Delta(x) = \sum_{n \leq x}(\log n - \tau(n) + 2\gamma).$$

Subtracting (1.75) from (1.63) for $k = 1$ we get

$$\Delta(x) \ll \sqrt{x}. \tag{2.7}$$

Now applying (1.42) we get

$$\psi(x) - [x] + 2\gamma = \mathop{\sum\sum}_{dk \leq x}\mu(d)(\log k - \tau(k) + 2\gamma)$$

$$= \sum_{k \leq K}(\log k - \tau(k) + 2\gamma)M\Big(\frac{x}{k}\Big) + \sum_{d \leq xK^{-1}}\mu(d)\Big(\Delta\Big(\frac{x}{d}\Big) - \Delta(K)\Big) \tag{2.8}$$

for any $1 \leq K \leq x$. The last sum is $O(xK^{-\frac12})$ by virtue of (2.7) while for any fixed $K$ the preceding sum is $o(x)$ by the hypothesis (2.6). Since $K$ can be arbitrarily large, this shows that (2.6) implies (2.5). To establish the converse we use the identity

$$\sum_{n \leq x}\mu(n)\log\frac{x}{n} = M(x)\log x + \mathop{\sum\sum}_{mn \leq x}\mu(m)\Lambda(n)$$

which follows from

$$\Big(\frac{1}{\zeta}\Big)' = -\frac{\zeta'}{\zeta}\cdot\frac{1}{\zeta}.$$

The left side is trivially $O(x)$ (see (1.64) for $k = 1$), so

$$M(x)\log x = -\sum_{m \leq x}\mu(m)\psi\Big(\frac{x}{m}\Big) + O(x).$$

Applying (2.18) we arrive at

$$M(x)\log x = \sum_{m \leq x}\mu(m)\Big(\frac{x}{m} - \psi\Big(\frac{x}{m}\Big)\Big) + O(x).$$

Now using the hypothesis (2.5) this equation implies (2.6).

2.2. Tchebyshev method.

The first significant attempt for proving the Prime Number Theorem was made by Tchebyshev in 1848-1852. Retrospectively one may say that his approach is based essentially on the product formula

meaning that the archimedean valuation equals the product of p-adic valuations. This obvious formula is particularly fruitful for the factorial numbers N = 1· 2 · · · n.



Like Tchebyshev we present the arguments in a more friendly analytic format. First we have

$$L(x) = \sum_{n \leq x}\log n = x\log x - x + O(\log x). \tag{2.9}$$

On the other hand, using (1.41) we get

$$L(x) = \sum_{d \leq x}\Lambda(d)\Big[\frac{x}{d}\Big] = \sum_{m \leq x}\psi\Big(\frac{x}{m}\Big). \tag{2.10}$$

Hence evaluating $L(x) - 2L(\frac{x}{2})$ in two ways one gets

$$\psi(x) - \psi\Big(\frac{x}{2}\Big) + \psi\Big(\frac{x}{3}\Big) - \psi\Big(\frac{x}{4}\Big) + \cdots = x\log 2 + O(\log x). \tag{2.11}$$

Using the monotonicity of $\psi(y)$ one derives from (2.11) the first Tchebyshev estimates

$$x\log 2 + O(\log x) < \psi(x) < x\log 4 + O(\log x). \tag{2.12}$$

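Tchebyshev improved these estimates several times using more involved linear combinations of $L(\frac{x}{n})$. For example let us consider

$$\psi_1(x) = L(x) - L\Big(\frac{x}{2}\Big) - L\Big(\frac{x}{3}\Big) - L\Big(\frac{x}{5}\Big) + L\Big(\frac{x}{30}\Big).$$

By (2.9)

$$\psi_1(x) = ax + O(\log x)$$

where $a = \frac12\log 2 + \frac13\log 3 + \frac15\log 5 - \frac{1}{30}\log 30 = 0.9212\ldots$ On the other hand,

$$\psi_1(x) = \sum_{n \leq x}\Lambda(n)f\Big(\frac{x}{n}\Big)$$

where

$$f(x) = [x] - \Big[\frac{x}{2}\Big] - \Big[\frac{x}{3}\Big] - \Big[\frac{x}{5}\Big] + \Big[\frac{x}{30}\Big].$$

Since $1 - \frac12 - \frac13 - \frac15 + \frac{1}{30} = 0$, the function $f(x)$ is periodic of period 30. One can check that $f(x)$ takes only two values 0, 1 and $f(x) = 1$ if $1 \leq x < 6$. Therefore

$$\psi_1(x) \leq \psi(x) \leq \psi_1(x) + \psi\Big(\frac{x}{6}\Big).$$

Hence one derives

$$ax + O(\log x) < \psi(x) < \tfrac65 ax + O(\log x). \tag{2.13}$$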
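The claims about $f$ and the bounds (2.13) can be verified mechanically. In the Python sketch below, all names are ours; $f$ is evaluated only at integers, which suffices because each floor function involved changes value only at integer arguments, so $f$ is constant on every interval $[n, n+1)$.

```python
import math

# The constant a = (1/2) log 2 + (1/3) log 3 + (1/5) log 5 - (1/30) log 30
A = math.log(2) / 2 + math.log(3) / 3 + math.log(5) / 5 - math.log(30) / 30

def f(n):
    """Tchebyshev's combination [x] - [x/2] - [x/3] - [x/5] + [x/30] at an integer x = n."""
    return n - n // 2 - n // 3 - n // 5 + n // 30

def psi(x):
    """Return psi(x) = sum of log p over prime powers p^k <= x."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    total = 0.0
    for p in range(2, x + 1):
        if sieve[p]:
            for multiple in range(p * p, x + 1, p):
                sieve[multiple] = 0
            power = p
            while power <= x:      # one contribution of log p per power p^k <= x
                total += math.log(p)
                power *= p
    return total
```

One finds that $f$ takes only the values 0 and 1 over a full period, and that $\psi(x)/x$ lies between $a$ and $\frac65 a$ for the sample values of $x$ tried.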

Since the difference between the upper and the lower bounds in (2.13) is quite small, one can deduce from (2.13) that any dyadic interval $[x, 2x]$ contains primes provided $x$ is sufficiently large (actually Tchebyshev proved this for all $x \geq 1$, which was postulated by Bertrand). Tchebyshev failed to prove the Prime Number Theorem, however, he succeeded in showing that if $\lim_{x \to \infty}\psi(x)/x$ exists, then this limit is equal to one (this follows instantly from (2.11) because $1 - \frac12 + \frac13 - \frac14 + \cdots = \log 2$).

Interesting results were derived from Tchebyshev's works by Mertens. First approximating $[x/d]$ in (2.10) by $x/d$ one gets

$$\sum_{n \leq x}\frac{\Lambda(n)}{n} = \log x + O(1). \tag{2.14}$$

Next applying partial summation one derives from (2.14) that

$$\sum_{p \leq x}\frac1p = \log\log x + \beta + O\Big(\frac{1}{\log x}\Big) \tag{2.15}$$

where $\beta$ is a constant. Finally using $\log(1 - p^{-1}) = -p^{-1} + O(p^{-2})$, we have the Mertens formula

$$\prod_{p \leq x}\Big(1 - \frac1p\Big) = \frac{e^{-\gamma}}{\log x}\Big\{1 + O\Big(\frac{1}{\log x}\Big)\Big\}. \tag{2.16}$$
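The Mertens formula (2.16) is pleasant to check numerically. The sketch below, with names of our choosing and the Euler constant entered as a literal, returns $e^{\gamma}\log x\prod_{p \leq x}(1 - 1/p)$, which should be close to 1.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, entered as a literal

def mertens_ratio(x):
    """Return e^gamma * log(x) * prod_{p<=x} (1 - 1/p), expected to be near 1."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    product = 1.0
    for p in range(2, x + 1):
        if sieve[p]:
            product *= 1.0 - 1.0 / p
            for multiple in range(p * p, x + 1, p):
                sieve[multiple] = 0
    return math.exp(EULER_GAMMA) * math.log(x) * product
```

Already at $x = 10^5$ the ratio differs from 1 by far less than one percent.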

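Mertens showed that the exponent $\gamma$ in (2.16) coincides with the Euler constant (1.69). Similarly, using $\delta = 1 * \mu$ in place of $L = 1 * \Lambda$, one derives a handful of results for the Möbius function. First for any $x \geq 1$ one gets

$$\sum_{m \leq x}\mu(m)\Big[\frac{x}{m}\Big] = \sum_{m \leq x}M\Big(\frac{x}{m}\Big) = 1 \tag{2.17}$$

in place of (2.10). Hence approximating $[x/m]$ by $x/m$ one deduces that

$$\Big|\sum_{m \leq x}\frac{\mu(m)}{m}\Big| \leq 1. \tag{2.18}$$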
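Both the identity obtained by summing $\delta = 1 * \mu$ over $n \leq x$, namely $\sum_{m \leq x}\mu(m)[x/m] = 1$, and the bound (2.18) can be confirmed for small $x$. The Python sketch below uses helper names of our own.

```python
def mobius_up_to(n):
    """Return a list mu with mu[m] = Moebius function of m, for 0 <= m <= n."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = bytearray([1]) * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for multiple in range(p, n + 1, p):
                if multiple > p:
                    is_prime[multiple] = 0
                mu[multiple] = -mu[multiple]   # one sign flip per prime factor
            for multiple in range(p * p, n + 1, p * p):
                mu[multiple] = 0               # not squarefree
    return mu

def floor_sum(x, mu):
    """sum_{m<=x} mu(m) * floor(x/m), which equals 1 for every integer x >= 1."""
    return sum(mu[m] * (x // m) for m in range(1, x + 1))

def max_reciprocal_sum(n, mu):
    """max over 1 <= x <= n of |sum_{m<=x} mu(m)/m|."""
    partial, largest = 0.0, 0.0
    for m in range(1, n + 1):
        partial += mu[m] / m
        largest = max(largest, abs(partial))
    return largest
```

The maximum in (2.18) is attained at $x = 1$, where the sum equals 1.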

2.3. Primes in arithmetic progressions.

Every natural number $n \equiv -1 \pmod 3$ has a prime divisor $p \equiv -1 \pmod 3$. Hence there are infinitely many primes $p \equiv -1 \pmod 3$. In the same way one shows that for $q = 4, 6$ there are infinitely many primes $p \equiv -1 \pmod q$.

Let $q$ be any integer $> 1$. Consider the cyclotomic polynomial of degree $\varphi(q)$:

$$\Phi_q(x) = \prod_{\substack{a (\bmod q)\\ (a,q)=1}}\Big(x - e\Big(\frac{a}{q}\Big)\Big) \in \mathbb{Z}[x].$$

One can show (quite easily by using elementary properties of the multiplicative group $(\mathbb{Z}/q\mathbb{Z})^*$) that every prime divisor $p$ of $\Phi_q(n)$ coprime with $q$ satisfies $p \equiv 1 \pmod q$. If $q$ is prime this property is very simple. Indeed in this case we have

$$\Phi_q(x) = \frac{x^q - 1}{x - 1} = 1 + x + \cdots + x^{q-1}.$$

Hence if $p$ divides $\Phi_q(x)$, then $x^q \equiv 1 \pmod p$, and comparing with the little Fermat theorem $x^{p-1} \equiv 1 \pmod p$ we deduce that $q$ divides $p - 1$. Now arguing like Euclid we can deduce that there are infinitely many primes $p \equiv 1 \pmod q$.
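The property of cyclotomic values for prime $q$ can be tested by factoring $1 + n + \cdots + n^{q-1}$ for small $n$. The following Python sketch, with helper names of our own, checks that every prime divisor other than $q$ itself is congruent to 1 modulo $q$.

```python
def prime_factors(n):
    """Return the set of prime divisors of n >= 1 by trial division."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def cyclotomic_value(q, n):
    """Value 1 + n + ... + n^(q-1) of the q-th cyclotomic polynomial, q prime."""
    return sum(n ** k for k in range(q))

def divisors_are_one_mod_q(q, n_max):
    """Check p = 1 (mod q) for all primes p != q dividing Phi_q(n), 2 <= n <= n_max."""
    for n in range(2, n_max + 1):
        for p in prime_factors(cyclotomic_value(q, n)):
            if p != q and p % q != 1:
                return False
    return True
```

For instance $\Phi_3(4) = 21 = 3 \cdot 7$, where 7 is congruent to 1 modulo 3 and the remaining factor is $q = 3$ itself.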

Naturally every class $a \pmod q$ with $(a, q) = 1$ should have infinitely many primes. It was for the proof of this theorem that L. Dirichlet (1837) invented multiplicative characters $\chi \pmod q$ (see Chapter 3). To every such character he associated the Dirichlet series

$$L(s, \chi) = \sum_{n=1}^{\infty}\chi(n)n^{-s}. \tag{2.19}$$

As in the case of the zeta function the Dirichlet $L$-function has the Euler product

$$L(s, \chi) = \prod_p (1 - \chi(p)p^{-s})^{-1} \tag{2.20}$$

which converges absolutely for $s > 1$. Hence $L(s, \chi)$ does not vanish if $s > 1$. Taking logarithms we get

$$\log L(s, \chi) = \sum_p \chi(p)p^{-s} + O(1). \tag{2.21}$$

Hence by the orthogonality of characters (see Section 3.2)

$$\sum_{p \equiv a (\bmod q)} p^{-s} = \frac{1}{\varphi(q)}\sum_{\chi (\bmod q)}\bar\chi(a)\log L(s, \chi) + O(1), \tag{2.22}$$

where the error term $O(1)$ is bounded in terms of $s$ uniformly for $s > 1$. For the principal character we can relate $L(s, \chi_0)$ to the zeta function

$$L(s, \chi_0) = \zeta(s)\prod_{p \mid q}(1 - p^{-s}).$$

Since

$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \int_1^\infty x^{-s}\,dx + O(1) = \frac{1}{s-1} + O(1),$$

we infer that the principal character contributes to (2.22)

$$\frac{1}{\varphi(q)}\log L(s, \chi_0) = \frac{1}{\varphi(q)}\log\frac{1}{s-1} + O(1). \tag{2.23}$$
Next we consider $\chi \neq \chi_0$. Since $\chi$ is periodic of period $q$ and the sums over any complete set of residue classes vanish, it follows that the character sum over any segment is bounded, precisely

$$\Big|\sum_{x < n \leq y}\chi(n)\Big| < q.$$

Therefore the series (2.19) converges for $s > 0$ and $|L(s, \chi)| < q$, by partial summation. If we know that

$$L(1, \chi) \neq 0, \tag{2.24}$$

then $\log L(s, \chi)$ has a finite limit as $s \to 1+$. Hence we can deduce that for $s > 1$,

$$\sum_{p \equiv a (\bmod q)} p^{-s} = \frac{1}{\varphi(q)}\log\frac{1}{s-1} + O(1). \tag{2.25}$$
In this way Dirichlet proved there are infinitely many primes in any primitive residue class. Actually (2.25) shows that prime numbers are equidistributed (in a certain sense) among the primitive residue classes $a \pmod q$. It remains to show the non-vanishing of $L(1, \chi)$ for every $\chi \neq \chi_0$, which is the heart of the matter. We do not follow exactly the original lines of Dirichlet but rather mix his ideas with more familiar transformations of Tchebyshev. We begin with

$$\sum_{n \leq x}\chi(n)n^{-1}\log n = \sum_{n \leq x}\chi(n)n^{-1}\sum_{d \mid n}\Lambda(d) = \sum_{d \leq x}\chi(d)\Lambda(d)d^{-1}\sum_{m \leq x/d}\chi(m)m^{-1}$$

$$= \sum_{d \leq x}\chi(d)\Lambda(d)d^{-1}\big(L(1, \chi) + O(d/x)\big) = L(1, \chi)\sum_{d \leq x}\chi(d)\Lambda(d)d^{-1} + O(1)$$

where the error term $O(1)$ is bounded in terms of $x$ (but not of $q$). The left side is also bounded (by Abel's criterion) because the sum of $\chi(n)$ over any segment is bounded. Therefore we proved that

$$\sum_{d \leq x}\chi(d)\frac{\Lambda(d)}{d} \ll 1 \tag{2.26}$$

provided $L(1, \chi) \neq 0$. Similarly we obtain

$$\log x + \sum_{n \leq x}\chi(n)\frac{\Lambda(n)}{n} = \sum_{n \leq x}\frac{\chi(n)}{n}\sum_{d \mid n}\mu(d)\log\frac{x}{d} = \sum_{d \leq x}\frac{\mu(d)\chi(d)}{d}\Big(\log\frac{x}{d}\Big)\sum_{m \leq x/d}\frac{\chi(m)}{m}$$

$$= L(1, \chi)\sum_{d \leq x}\mu(d)\frac{\chi(d)}{d}\log\frac{x}{d} + O(1).$$

Hence, if $L(1, \chi) = 0$, we deduce that

$$\sum_{n \leq x}\chi(n)\frac{\Lambda(n)}{n} = -\log x + O(1). \tag{2.27}$$

In view of the Mertens formula (2.14) the above conditional formula can be written as

$$\sum_{p \leq x}(1 + \chi(p))\frac{\log p}{p} \ll 1.$$

This shows that $\chi(p) = -1$ for almost all primes. In other words, if $L(1, \chi)$ vanishes, then the character $\chi(n)$ on squarefree numbers mimics the Möbius function $\mu(n)$, which is not likely to be true (both functions are multiplicative but $\mu(n)$ is not periodic). In any case, we have shown rigorously that

$$\sum_{n \leq x}\chi(n)\frac{\Lambda(n)}{n} = \delta_\chi\log x + O(1) \tag{2.28}$$

where $\delta_\chi = -1, 0$ according to $L(1, \chi) = 0$ or $L(1, \chi) \neq 0$ if $\chi \neq \chi_0$, and $\delta_\chi = 1$ if $\chi = \chi_0$. Summing (2.28) over all characters we get

$$\sum_{\substack{n \leq x\\ n \equiv 1 (\bmod q)}}\frac{\Lambda(n)}{n} = \frac{1}{\varphi(q)}\Big(\sum_{\chi (\bmod q)}\delta_\chi\Big)\log x + O(1).$$

Hence $\sum_\chi \delta_\chi \geq 0$. This shows that there is at most one character $\chi \neq \chi_0$ with $L(1, \chi) = 0$. If such an exceptional character exists, it must be real because $L(1, \chi) = 0$ implies $L(1, \bar\chi) = 0$ (the complex conjugation is a continuous automorphism of $\mathbb{C}$).

The non-vanishing of $L(1, \chi)$ for a real character $\chi \neq \chi_0$ requires a different argument. To this end we consider

$$T(x) = \sum_{n \leq x}\tau(n, \chi)n^{-\frac12}$$

where

$$\tau(n, \chi) = \sum_{d \mid n}\chi(d). \tag{2.29}$$

Note that

$$\tau(n, \chi) = \prod_{p^\alpha \| n}(1 + \chi(p) + \cdots + \chi(p^\alpha)) \geq 0$$

for all $n$, and $\tau(m^2, \chi) \geq 1$. Therefore

$$T(x) \geq \sum_{m \leq \sqrt{x}} m^{-1} > \tfrac12\log x.$$

On the other hand, opening the convolution (2.29) we evaluate $T(x)$ in the same fashion as Dirichlet dealt with the divisor problem (1.75) (the hyperbola method). We obtain

$$T(x) = \mathop{\sum\sum}_{mn \leq x}\chi(m)(mn)^{-\frac12} = \sum_{m \leq \sqrt{x}}\chi(m)m^{-\frac12}\Big\{2\Big(\frac{x}{m}\Big)^{\frac12} + c + O\Big(\Big(\frac{m}{x}\Big)^{\frac12}\Big)\Big\} + O\Big(\sum_{n \leq \sqrt{x}} n^{-\frac12}x^{-\frac14}\Big)$$

$$= 2L(1, \chi)x^{\frac12} + O(1).$$

Comparing both estimates we obtain $\log x < 4L(1, \chi)x^{\frac12} + O(1)$, whence $L(1, \chi) \neq 0$ by letting $x$ tend to $\infty$.

THEOREM 2.1 (DIRICHLET). For every non-principal character $\chi \pmod q$ we have $L(1, \chi) \neq 0$.
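For a real character the non-vanishing can be observed directly by summing the series. The Python sketch below is an illustration under stated assumptions: it takes $\chi$ to be the Legendre symbol modulo an odd prime $q$, computed by Euler's criterion, and approximates $L(1, \chi)$ by a partial sum over complete periods. The function names and the truncation length are ours.

```python
def legendre_character(q):
    """Return chi(n) = (n/q) for an odd prime q, via Euler's criterion."""
    def chi(n):
        r = pow(n % q, (q - 1) // 2, q)   # r is 0, 1, or q - 1
        return -1 if r == q - 1 else r
    return chi

def l_value_at_one(q, periods=50000):
    """Partial sum of L(1, chi) over `periods` complete periods of chi mod q."""
    chi = legendre_character(q)
    values = [chi(n) for n in range(q)]   # chi on one full period
    total = 0.0
    for n in range(1, periods * q + 1):
        total += values[n % q] / n
    return total
```

The values obtained for $q = 3, 5, 7$ are positive and agree with the classical closed forms $\pi/(3\sqrt3)$, $2\log\frac{1+\sqrt5}{2}/\sqrt5$ and $\pi/\sqrt7$ to the accuracy of the truncation.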

By Theorem 2.1 we complete the proof of (2.25). Moreover, we proved that (2.26) holds for any $\chi \neq \chi_0$. Summing over characters we deduce

THEOREM 2.2. For any $a$ with $(a, q) = 1$ we have

$$\sum_{\substack{n \leq x\\ n \equiv a (\bmod q)}}\frac{\Lambda(n)}{n} = \frac{1}{\varphi(q)}\log x + O(1) \tag{2.30}$$

where the error term $O(1)$ depends only on $q$.

REMARK. The original proof of Dirichlet that $L(1, \chi) \neq 0$ for a real character $\chi$ is a simple consequence of his Class Number Formula: let $K = \mathbb{Q}(\sqrt{D})$ be a quadratic field with $D \equiv 0, 1 \pmod 4$ so $D$ is the discriminant of $K$. There is an associated real primitive character $\chi$ such that

$$\zeta_K(s) = \sum_{\mathfrak{a}}(N\mathfrak{a})^{-s} = \zeta(s)L(s, \chi)$$

and every primitive real character modulo $q \neq 1$ arises in this way from $K = \mathbb{Q}(\sqrt{\chi(-1)q})$. Dirichlet showed that

$$L(1, \chi) = \begin{cases}\dfrac{2\pi h}{w\sqrt{q}} & \text{if } \chi(-1) = -1,\\[2ex] \dfrac{2h\log\varepsilon}{\sqrt{q}} & \text{if } \chi(-1) = 1,\end{cases} \tag{2.31}$$

where $h$ is the class number of $K$, $w$ is the number of units in the ring of integers of $K$ if $\chi(-1) = -1$, and $\varepsilon$ is the fundamental unit of $K$ if $\chi(-1) = 1$. Hence $L(1, \chi) \gg q^{-\frac12}$. (See (22.59) for odd characters.) In Chapters 5, 22, 23, we develop lower bounds for $L(1, \chi)$ using techniques of increasing sophistication. See also Section 15.9 for some results about class numbers of real quadratic fields.

2.4. Reflections on elementary proofs of the prime number theorem.

The Prime Number Theorem was proved in 1896 independently by Hadamard and de la Vallée Poussin by analytic methods. The first elementary proofs were found about fifty years later by Erdős and Selberg. At the heart of their proofs lies the formula (due to Selberg)

$$\sum_{p \leq x}(\log p)^2 + \mathop{\sum\sum}_{pq \leq x}(\log p)(\log q) = 2x\log x + O(x). \tag{2.32}$$
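Selberg's formula (2.32) can be examined numerically. The sketch below, with names of our own, evaluates the left side by writing the double sum over ordered pairs $(p, q)$ as $\sum_{p \leq x}\log p \cdot \theta(x/p)$, and returns the difference from $2x\log x$ divided by $x$, which the formula predicts to be bounded.

```python
import math

def selberg_remainder(x):
    """(sum_p log^2 p + sum_{pq<=x} log p log q - 2 x log x) / x for integer x."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    # theta[n] = sum of log p over primes p <= n
    theta = [0.0] * (x + 1)
    running = 0.0
    for n in range(2, x + 1):
        if sieve[n]:
            running += math.log(n)
        theta[n] = running
    primes = [p for p in range(2, x + 1) if sieve[p]]
    squares = sum(math.log(p) ** 2 for p in primes)
    # ordered pairs (p, q) with pq <= x: q ranges over primes up to floor(x/p)
    pairs = sum(math.log(p) * theta[x // p] for p in primes)
    return (squares + pairs - 2 * x * math.log(x)) / x
```

The returned quantity stays of moderate size as $x$ grows, in line with the $O(x)$ error term.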

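To prove this we begin by

$$\sum_{n \leq x}\log^2 n = x(\log^2 x - 2\log x + 2) + O(\log^2 x) = x(\log x)\Big(\sum_{k \leq x} k^{-1}\Big) - \sum_{k \leq x}\big[\gamma + (\gamma + 2)\log k\big] + O(\log^2 x)$$

where $\gamma$ is the Euler constant. Hence opening the convolution $\Lambda_2 = \mu * L^2$ we derive

$$\sum_{n \leq x}\Lambda_2(n) = \mathop{\sum\sum}_{mn \leq x}\mu(m)\log^2 n$$

$$= x\sum_{n \leq x}\frac1n\sum_{m \mid n}\mu(m)\log\frac{x}{m} - \sum_{l \leq x}\sum_{m \mid l}\mu(m)\Big[\gamma + (\gamma + 2)\log\frac{l}{m}\Big] + O(x)$$

$$= \sum_{n \leq x}\frac{x}{n}(\delta(n)\log x + \Lambda(n)) - \sum_{n \leq x}(\gamma\delta(n) + (\gamma + 2)\Lambda(n)) + O(x)$$

$$= x\log x + x\log x + O(x) = 2x\log x + O(x)$$

by Mertens' formula (2.14) and Tchebyshev's upper bound (2.12) (recall (1.18)).

EXERCISE. Derive from (2.32) by induction on $k \geq 2$ (use the recurrence (1.47)) that

$$\sum_{n \leq x}\Lambda_k(n) = kx(\log x)^{k-1}\Big\{1 + O\Big(\frac{1}{\log x}\Big)\Big\}. \tag{2.33}$$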

Using Selberg's formula (2.32) Postnikov and Romanov [PR] derived the elegant inequality

$$|M(x)|\log x < \sum_{n \leq x}\Big|M\Big(\frac{x}{n}\Big)\Big| + O(x\log\log 3x). \tag{2.34}$$

Compare this with the second part of (2.17). Now the Prime Number Theorem can be derived in the form $\psi(x) \sim x$ from (2.32), or in the form $M(x) = o(x)$ from (2.34) by elementary, however lengthy, tauberian type arguments. E. Bombieri [Bol] and E. Wirsing [Wi2] refined the Erdős-Selberg method to show

$$\psi(x) = x + O(x(\log x)^{-A}) \tag{2.35}$$

for any $A > 0$, the implied constant depending on $A$. Along the same lines H. Diamond and J. Steinig [DS] succeeded in showing a bound (2.36) which holds for all $x \geq e^{e^{100}}$. The arguments of Erdős-Selberg are truly elementary (no complex numbers are harmed in the course of the proof), nevertheless we express our mixed feelings, because the earlier analytic methods based on contour integration inside the zero-free region of the Riemann zeta function are considerably simpler and the results are superior (see Sections 5.4 and 5.6). In 1899 de la Vallée Poussin proved quite easily that

$$\psi(x) = x + O(x\exp(-c\sqrt{\log x})) \tag{2.37}$$

where $c > 0$ and the implied constant are absolute. The subsequent improvements of the PNT came from sharper upper bounds for $\zeta(s)$ near the line $\mathrm{Re}(s) = 1$,

particularly by Vinogradov's estimate for exponential sums (see Chapter 8). However, we would like to say that Vinogradov's method of exponential sums should not be regarded as a new technique for non-vanishing of the zeta function. In fact, all the known methods of converting the upper bounds into the lower bounds, from which to draw a zero-free region, remain in principle the same one as originated by Hadamard and de la Vallée Poussin. Yet, some variations are more sensitive to the input than the others, giving closer connections between estimates and zeros. For example the Borel-Carathéodory theorem in complex function theory does the job very well. The best error term in (2.37) known today is due to Korobov [Kor] and Vinogradov [VI] (see Corollary 8.31).

The asymptotic formula $\psi(x) \sim x$ is equivalent to the non-vanishing of $\zeta(s)$ on the line $\mathrm{Re}(s) = 1$, and the latter can be established relatively easily without appealing to analytic properties of $\zeta(s)$. Here is how one can arrange the analytic ideas for a proof of the Prime Number Theorem in the form (2.35) which a broad-minded reader may still find elementary. We avoid complex variable analysis, although for clarity we do not hesitate to employ $\zeta(s)$ at complex numbers $s = \sigma + it$, but only with $\sigma > 1$ where the Dirichlet series and the Euler product converge absolutely. We begin by considering the function $G(s) = (-1)^k(1/\zeta(s))^{(k)}$. This is given by the series

$$G(s) = \sum_m \frac{\mu(m)}{m^s}(\log m)^k. \tag{2.38}$$

From here we go to the finite sum

$$F(x) = \sum_{m \leq x}\mu(m)(\log m)^k\log\frac{x}{m} \tag{2.39}$$

where the factor $\log(x/m)$ serves to smooth the sum at the cut-off point. This particular cutting is produced by means of the integral formula

$$\log^+ y = \frac{1}{2\pi i}\int_{(\sigma)} y^s s^{-2}\,ds$$

where $\sigma > 1$, giving

$$F(x) = \frac{1}{2\pi i}\int_{(\sigma)} x^s G(s)s^{-2}\,ds. \tag{2.40}$$

For the purpose of estimating $G(s)$ we consider $\zeta^*(s) = (s - 1)\zeta(s)$. Note that

$$(1/\zeta(s))^{(k)} = (s - 1)(1/\zeta^*(s))^{(k)} + k(1/\zeta^*(s))^{(k-1)}. \tag{2.41}$$

Next we appeal to the formula from differential calculus (2.42) where $a_1, a_2, \ldots$ run over non-negative integers. This formula for $f(s) = \zeta^*(s)$ reduces the problem to upper bounds for the derivatives of $\zeta^*(s)$ and a lower bound

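for $|\zeta^*(s)|$. First we derive by partial summation that for $\mathrm{Re}(s) > 1$,

$$(-1)^\ell\zeta^{(\ell)}(s) = \sum_{n \leq X} n^{-s}(\log n)^\ell + \int_X^\infty x^{-s}(\log x)^\ell\,dx + O\Big(\frac{|s|}{X}(\log X)^{\ell}\Big)$$

$$= \int_1^\infty x^{-s}(\log x)^\ell\,dx + O\Big(\Big(1 + \frac{|s|}{X}\Big)(\log X)^{\ell+1}\Big).$$

Taking $X = 2|s|$ we get

$$(-1)^\ell\zeta^{(\ell)}(s) = \ell!\,(s - 1)^{-\ell-1} + O((\log 2|s|)^{\ell+1}). \tag{2.43}$$

Hence

$$\zeta^{*(\ell)}(s) \ll |s|(\log 2|s|)^{\ell+1}. \tag{2.44}$$

Next we infer from the Euler product by positivity that

$$1 \leq \prod_p\big(1 + (1 + p^{it} + p^{-it})^2 p^{-\sigma}\big) = \prod_p\big(1 + (3 + 2p^{it} + 2p^{-it} + p^{2it} + p^{-2it})p^{-\sigma}\big) \asymp \zeta^3(\sigma)|\zeta(\sigma + it)|^4|\zeta(\sigma + 2it)|^2$$

where the implied constant is absolute (a similar formula is also present in the arguments of Hadamard and de la Vallée Poussin, see Theorem 5.26). Hence and by (2.43) we get the lower bound

$$|\zeta^*(s)| \gg (\sigma - 1)^{\frac34}|s|(\log 2|s|)^{-\frac12}. \tag{2.45}$$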

From (2.44) and (2.45) we derive using (2.42) that

$$(1/\zeta^*(s))^{(k)} \ll (\sigma - 1)^{-\frac34(k+1)}|s|^{-1}(\log 2|s|)^{\eta}$$

where $\eta$ and the implied constant depend only on $k$ (in fact, $\eta = (5k+1)/2$ comes out by the above arguments, but it does not matter for application). Hence by (2.41)

$$G(s) \ll (\sigma - 1)^{-\frac34(k+1)}(\log 2|s|)^{\eta}.$$

Inserting this bound to (2.40) and integrating trivially we get

$$F(x) \ll x^{\sigma}(\sigma - 1)^{-\frac34(k+1)}.$$

Finally taking $\sigma = 1 + (\log x)^{-1}$ we conclude the following estimate for the sum (2.39)

$$F(x) \ll x(\log x)^{\frac34(k+1)}. \tag{2.46}$$

Note that this is non-trivial if $k > 3$. Now it is a completely elementary exercise to derive the PNT from (2.46). First we do the sum

$$H(x) = \sum_{m \leq x}\mu(m)(\log m)^k \tag{2.47}$$

which is obtained by differencing $F(x)$ as follows

$$F(x + y) - F(x) = H(x)\log\frac{x + y}{x} + \sum_{x < m \leq x+y}\mu(m)(\log m)^k\log\frac{x + y}{m}.$$

Hence

$$H(x) \ll y(\log x)^k + y^{-1}x^2(\log x)^{\frac34(k+1)} = 2x(\log x)^{k-A}$$

by choosing $y = x(\log x)^{-A}$ with $A = (k - 3)/8$. Finally we get by partial summation

$$\sum_{m \leq x}\mu(m) \ll x(\log x)^{-A} \tag{2.48}$$

where the implied constant depends only on $k$. Since $k$ is arbitrary so is $A$. This proves the PNT for the Möbius function. Hence the PNT for the von Mangoldt function follows by the formula (2.8).

CHAPTER 3

CHARACTERS

3.1. Introduction.

The characters on residue classes, both additive and multiplicative, play instrumental parts in analytic number theory. Other characters such as the Hecke characters of a number field, or the characters of a finite field are also indispensable in the modern theory. We shall introduce these in due course. However, to avoid redundancy we start here by giving basic definitions in a general context of a finite abelian group. (Characters for non-abelian groups occur also, but more naturally as Galois groups and are best seen from the automorphic perspective in analytic number theory.)

Let $G$ be a finite abelian group. A homomorphism $\chi : G \to \mathbb{C}^*$ is called a character of $G$. Therefore (in the multiplicative notation) $\chi$ has the properties

$$\chi(xy) = \chi(x)\chi(y) \quad \text{for all } x, y \in G,$$

$\chi(1) = 1$ and $\chi(x)^m = 1$, where $m = |G|$ is the order of $G$, therefore $\chi(x)$ is a root of unity. The characters of $G$ form a group $\hat{G}$ with multiplication given by

$$(\chi_1\chi_2)(x) = \chi_1(x)\chi_2(x) \quad \text{for all } x \in G.$$

$\hat{G}$ is called the dual group, its identity element is the trivial character

$$\chi_0(x) = 1 \quad \text{for all } x \in G.$$

Throughout $\bar\chi$ denotes the complex conjugate, hence also the inverse. If $G$ is cyclic of order $m$ and $g$ is a generator of $G$, then every character of $G$ is of type

$$\chi(x) = e\Big(\frac{ay}{m}\Big) \quad \text{if } x = g^y$$

for some fixed residue class $a \pmod m$. These are distinct characters, therefore $\hat{G}$ is also cyclic of order $m$, so $\hat{G}$ is isomorphic to $G$. The isomorphism $\hat{G} \simeq G$ can be established for any finite abelian group by writing $G$ as the direct product of cyclic groups. There is a canonical isomorphism between $G$ and the dual of $\hat{G}$ given by $x \mapsto \hat{x}$ where

$$\hat{x}(\chi) = \chi(x) \quad \text{for all } \chi \in \hat{G}.$$

The raison d'être of these characters is found in the following orthogonality relations

$$\sum_{x \in G}\chi(x) = \begin{cases}|G| & \text{if } \chi = \chi_0,\\ 0 & \text{if } \chi \neq \chi_0,\end{cases}$$

$$\sum_{\chi \in \hat{G}}\chi(x) = \begin{cases}|G| & \text{if } x = 1,\\ 0 & \text{if } x \neq 1,\end{cases}$$

which allows us to detect the group identity element. Suppose $d$ divides the order of $G$. An element $g \in G$ is said to have exponent $d$ if $g^d$ is the identity. The order of $g$ is its smallest exponent. The characters of exponent $d$ are exactly those which are trivial on the subgroup

$$G^d = \{x^d : x \in G\},$$

therefore they may be viewed as characters of the factor group $G/G^d$. The orthogonality relations become

$$\sum_{\chi^d = \chi_0}\chi(y) = \begin{cases}|G/G^d| & \text{if } y \in G^d,\\ 0 & \text{if } y \notin G^d.\end{cases}$$

This will allow us to detect the $d$-th powers of $G$.

3.2. Dirichlet characters.

First we consider the characters on the additive group $\mathbb{Z}/m\mathbb{Z}$ of residue classes modulo $m$. They are given by

$$\psi_a(n) = e\Big(\frac{an}{m}\Big)$$

where $e(z) = e^{2\pi iz}$. This formula makes an additive character a function on $\mathbb{Z}$, periodic of period $m$. The orthogonality property becomes

$$\sum_{a (\bmod m)} e\Big(\frac{an}{m}\Big) = \begin{cases} m & \text{if } n \equiv 0 \pmod m,\\ 0 & \text{otherwise.}\end{cases}$$

We shall call $\psi_a(n) = e(an/m)$ primitive if $(a, m) = 1$. Adding all the additive primitive characters we obtain the Ramanujan sum

$$S(n, 0; m) = c_m(n) = \mathop{{\sum}^{*}}_{a (\bmod m)} e\Big(\frac{an}{m}\Big) \tag{3.1}$$

(see (1.56)). Recall that throughout this book $\sum^*$ restricts the summation to "primitive" elements, in the above case among additive characters modulo $m$. One shows by Möbius inversion that

$$c_m(n) = \sum_{d \mid (m,n)} d\,\mu\Big(\frac{m}{d}\Big). \tag{3.2}$$

Hence

$$c_m(n) = \mu\Big(\frac{m}{(m,n)}\Big)\frac{\varphi(m)}{\varphi(m/(m,n))}. \tag{3.3}$$

In particular,

$$c_m(n) = \mu(m) \quad \text{if } (m, n) = 1. \tag{3.4}$$

In general,

$$|c_m(n)| \leq (m, n). \tag{3.5}$$
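The three expressions (3.1), (3.2), (3.3) for the Ramanujan sum can be compared directly. The Python sketch below uses helper names of our own; the definition (3.1) is evaluated in floating point and rounded, which is safe for the small moduli used here.

```python
import cmath
from math import gcd

def factorize(n):
    """Return the prime factorization of n >= 1 as a dict {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def mobius(n):
    factors = factorize(n)
    if any(e > 1 for e in factors.values()):
        return 0
    return -1 if len(factors) % 2 else 1

def phi(n):
    result = n
    for p in factorize(n):
        result -= result // p
    return result

def ramanujan_by_definition(m, n):
    """(3.1): sum of e(an/m) over a mod m with (a, m) = 1, rounded to an integer."""
    total = sum(cmath.exp(2j * cmath.pi * a * n / m)
                for a in range(1, m + 1) if gcd(a, m) == 1)
    return round(total.real)

def ramanujan_by_divisors(m, n):
    """(3.2): sum over d | (m, n) of d * mu(m/d)."""
    g = gcd(m, n)
    return sum(d * mobius(m // d) for d in range(1, g + 1) if g % d == 0)

def ramanujan_closed_form(m, n):
    """(3.3): mu(m/(m,n)) * phi(m) / phi(m/(m,n))."""
    g = gcd(m, n)
    return mobius(m // g) * phi(m) // phi(m // g)
```

All three agree on the tested range, and the bound (3.5) can be read off from the closed form.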

Next we consider characters on the multiplicative group of residue classes $a \pmod m$ with $(a, m) = 1$,

$$\chi : (\mathbb{Z}/m\mathbb{Z})^* \to \mathbb{C}^*.$$

As with the additive characters, we wish to consider $\chi$ as a function on $\mathbb{Z}$; we do so by setting $\chi(n) = 0$ whenever $n$ is not prime to $m$. This makes $\chi$ a Dirichlet character, a function on $\mathbb{Z}$, periodic modulo $m$ and completely multiplicative. The corresponding extension of the trivial character $\chi_0 \pmod m$ is called the principal character to modulus $m$. The group $(\mathbb{Z}/m\mathbb{Z})^*$ and its dual have $\varphi(m)$ elements, so the orthogonality relations read

$$\sum_{a (\bmod m)}\chi(a) = \begin{cases}\varphi(m) & \text{if } \chi = \chi_0,\\ 0 & \text{otherwise,}\end{cases}$$

$$\sum_{\chi (\bmod m)}\chi(a) = \begin{cases}\varphi(m) & \text{if } a \equiv 1 \pmod m,\\ 0 & \text{otherwise.}\end{cases}$$

If $m = m_1m_2$ with $(m_1, m_2) = 1$, then $(\mathbb{Z}/m\mathbb{Z})^* \simeq (\mathbb{Z}/m_1\mathbb{Z})^* \times (\mathbb{Z}/m_2\mathbb{Z})^*$, so every multiplicative character $\chi \pmod m$ is a product $\chi_1\chi_2$ of multiplicative characters $\chi_1 \pmod{m_1}$ and $\chi_2 \pmod{m_2}$.

The Dirichlet series associated to Dirichlet characters are of paramount importance in analytic number theory. They are called Dirichlet $L$-functions and denoted $L(s, \chi)$, or $L(\chi, s)$ depending on emphasis:

$$L(s, \chi) = \sum_n \chi(n)n^{-s} = \prod_p (1 - \chi(p)p^{-s})^{-1}; \tag{3.6}$$

the series and the Euler product are absolutely convergent for $\mathrm{Re}(s) > 1$. As is well known, these functions can be analytically continued to $\mathbb{C}$. See Section 4.6 for a proof. Many other analytic properties, applications and generalizations of Dirichlet $L$-functions will be considered throughout the book. Note that by total multiplicativity, (1.15), (1.38) give

$$-\frac{L'}{L}(s, \chi) = \sum_{n \geq 1}\Lambda(n)\chi(n)n^{-s}$$

for Re (s) > 1. 3.3. Primitive characters. Associated to each character x, in addition to its modulus m, is a natural number m*, its conductor. The conductor is the smallest divisor of m such that X may be written as X = XoX~ where Xo is the principal character to modulus m and x* is a character to modulus m*. For some characters the conductor is equal to the modulus. Such characters are called primitive. In the above factorization x* is

3. CHARACTERS

46

a primitive character uniquely determined by X· We shall say that X is induced by x* or that x* induces X· We have if (a,m) = 1.

x(a) = x*(a)

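The number of primitive characters to modulus $m$ is given by

$$\varphi^*(m) = m \prod_{p \,\|\, m} \Big(1 - \frac{2}{p}\Big) \prod_{p^2 \mid m} \Big(1 - \frac{1}{p}\Big)^2. \tag{3.7}$$

The proof is a simple exercise in Möbius inversion. Indeed, we have $\varphi = 1 * \varphi^*$, whence $\varphi^* = \mu * \varphi$, i.e.,

$$\varphi^*(m) = \sum_{d \mid m} \mu(d)\varphi\Big(\frac{m}{d}\Big)$$

giving (3.7) by multiplicativity. In the same way using the orthogonality of characters we infer a more general result

$$\sideset{}{^*}\sum_{\chi \,(\mathrm{mod}\ m)} \chi(a) = \sum_{d \mid (a-1,\,m)} \varphi(d)\mu\Big(\frac{m}{d}\Big) \quad \text{if } (a,m) = 1, \tag{3.8}$$

where the star restricts the summation to primitive characters.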
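The count (3.7) can be checked numerically against the Möbius sum $\varphi^* = \mu * \varphi$. The following is a minimal sketch; the helper names `phi`, `mobius`, `primitive_count` and `primitive_count_product` are illustrative and do not come from the text.

```python
from math import gcd

def phi(n):
    """Euler's function by direct count (fine for small n)."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:          # p^2 divides the original n
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def primitive_count(m):
    """phi*(m) as the convolution (mu * phi)(m)."""
    return sum(mobius(d) * phi(m // d) for d in range(1, m + 1) if m % d == 0)

def primitive_count_product(m):
    """phi*(m) from the product (3.7), in exact integer arithmetic:
    p || m contributes p - 2, and p^e || m with e >= 2 contributes p^(e-2) (p-1)^2."""
    result, p = 1, 2
    while m > 1:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        if e == 1:
            result *= p - 2
        elif e >= 2:
            result *= p ** (e - 2) * (p - 1) ** 2
        p += 1
    return result
```

Both functions agree for small moduli, and the value is zero exactly when $m \equiv 2 \pmod 4$, in line with the remark following (3.8).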

Incidentally, the formula (3.7) shows that the primitive characters $\chi \pmod m$ exist precisely when $m \not\equiv 2 \pmod 4$. The primitive characters are pleasant to deal with. For example, we have a convenient formula (which is not valid for $\chi$ not primitive)

$$\sum_{c \,(\mathrm{mod}\ m)} \chi(ac + b) = \begin{cases} m\chi(b) & \text{if } m \mid a, \\ 0 & \text{if } m \nmid a. \end{cases} \tag{3.9}$$

Indeed, letting $S$ be the above sum we obtain $\chi(1 + m_1 x)S = S$ for any $x$, where $m_1 = m(a,m)/(a^2,m)$. If $S \neq 0$, this yields $\chi(1 + m_1 x) = 1$ for any $x$, which implies that $\chi$ is periodic of period $m_1$. Since $\chi$ is primitive, we must have $m_1 = m$, i.e., $m \mid a$. Thus $S = 0$ if $m \nmid a$, and (3.9) is obvious otherwise.

The Dirichlet characters of exponent two are just the real characters (real-valued). In a number of ways they play a special role; in particular they are fundamental in the theory of quadratic forms. Below we give a complete list of real primitive characters. If $m = p$ is an odd prime, then there exists exactly one real primitive character of conductor $p$, namely the quadratic residue symbol (the Legendre symbol)

$$\chi_p(n) = \Big(\frac{n}{p}\Big).$$

This can be defined by saying that $1 + \chi_p(n)$ is the number of solutions $x \pmod p$ to $x^2 \equiv n \pmod p$.

For $m = 4$ we have exactly one primitive character defined by

$$\chi_4(n) = (-1)^{\frac{n-1}{2}} \quad \text{if } 2 \nmid n.$$

If $m = 8$, we have two primitive characters

$$\chi_8(n) = (-1)^{\frac{1}{8}(n-1)(n+1)}, \qquad \chi_4\chi_8(n) = (-1)^{\frac{1}{8}(n-1)(n+5)} \quad \text{if } 2 \nmid n.$$

If $m$ is a prime power, there are no real primitive characters of conductor $m$ other than $\chi_4$, $\chi_8$, $\chi_4\chi_8$ and $\chi_p$. Every real primitive character $\chi \pmod m$ is obtained as the product of the above ones. Therefore the conductor of a real primitive character is a number of type $1$, $k$, $4k$, $8k$ where $k$ is a positive odd squarefree integer.

3.4. Gauss sums.

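Let us come back for a moment to a general finite abelian group $G$. The characters of $G$ form a complete orthogonal system so that any function $f : G \to \mathbb{C}$ has the Fourier expansion

$$f(g) = \sum_{\psi \in \hat{G}} \langle f, \psi \rangle\, \psi(g)$$

with coefficients

$$\langle f, \psi \rangle = \frac{1}{|G|} \sum_{g \in G} f(g)\bar{\psi}(g).$$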

EXERCISE

1

(DUE TO DEDEKIND).

Prove that

In the case of functions on residue classes, and on a finite field, both additive and multiplicative characters are present and it is often necessary to transform the Fourier expansion in one of these systems to the other. In doing so we encounter Gauss sums $\langle \chi, \psi \rangle$ for any pair of multiplicative and additive characters $\chi$, $\psi$ respectively. We shall present the Gauss sums for finite fields in due time. Now we consider Gauss sums associated with characters on residue classes, say modulo $m$. For any multiplicative character $\chi \pmod m$ we put

$$\tau(\chi) = \sum_{b \,(\mathrm{mod}\ m)} \chi(b)\, e\Big(\frac{b}{m}\Big). \tag{3.10}$$

Multiplying (3.10) by $\bar{\chi}(a)$ and summing over $\chi$ we derive by orthogonality

$$e\Big(\frac{a}{m}\Big) = \frac{1}{\varphi(m)} \sum_{\chi \,(\mathrm{mod}\ m)} \bar{\chi}(a)\tau(\chi) \quad \text{if } (a,m) = 1. \tag{3.11}$$

This is the Fourier expansion of additive characters in terms of the multiplicative ones. Similarly

$$\chi(a)\tau(\bar{\chi}) = \sum_{b \,(\mathrm{mod}\ m)} \bar{\chi}(b)\, e\Big(\frac{ab}{m}\Big) \quad \text{if } (a,m) = 1, \tag{3.12}$$

which gives the Fourier expansion of $\chi$ in terms of additive characters provided $\tau(\bar{\chi}) \neq 0$. Note that the condition $(a,m) = 1$ in (3.12) can be dropped if $\chi$ is primitive because both sides vanish if $(a,m) \neq 1$ (apply (3.9)).

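LEMMA 3.1. Suppose the character $\chi$ modulo $m$ is induced by the primitive character $\chi^*$ modulo $m^*$. Then

$$\tau(\chi) = \mu\Big(\frac{m}{m^*}\Big)\chi^*\Big(\frac{m}{m^*}\Big)\tau(\chi^*). \tag{3.13}$$

If $\chi \pmod m$ is primitive, then

$$|\tau(\chi)| = \sqrt{m}. \tag{3.14}$$

PROOF. We have

$$\tau(\chi) = \sideset{}{^*}\sum_{a \,(\mathrm{mod}\ m)} \chi^*(a)\, e\Big(\frac{a}{m}\Big) = \sum_{d \mid m} \mu(d)\chi^*(d) \sum_{a \,(\mathrm{mod}\ m/d)} \chi^*(a)\, e\Big(\frac{ad}{m}\Big).$$

The innermost sum vanishes unless $d = m/m^*$, giving (3.13). To prove (3.14) we write

$$|\tau(\chi)|^2 = \mathop{\sum\sum}_{a,\,b \,(\mathrm{mod}\ m)} \chi(a)\bar{\chi}(b)\, e\Big(\frac{a-b}{m}\Big) = \sum_{a \,(\mathrm{mod}\ m)} \chi(a) \sideset{}{^*}\sum_{b \,(\mathrm{mod}\ m)} e\Big(\frac{(a-1)b}{m}\Big).$$

The inner sum is the Ramanujan sum (which is also the Gauss sum for the principal character). Inserting (3.2) we get

$$|\tau(\chi)|^2 = \sum_{d \mid m} d\,\mu\Big(\frac{m}{d}\Big) \sum_{\substack{a \,(\mathrm{mod}\ m) \\ a \equiv 1 \,(\mathrm{mod}\ d)}} \chi(a).$$

If $\chi$ is primitive of conductor $m$, then the last sum vanishes unless $d = m$ (apply (3.9)), giving (3.14). $\square$

One can see by Lemma 3.1 that $\tau(\chi)$ does not vanish exactly when $m/m^*$ is squarefree and prime to $m^*$. In this case (3.12) becomes a true formula for $\chi(a)$. We now consider the more general sum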

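$$\tau(\chi, \psi_a) = \sum_{b \,(\mathrm{mod}\ m)} \chi(b)\psi_a(b)$$

where $\psi_a(b) = e(ab/m)$ is an additive character.

LEMMA 3.2. Let $\chi$ modulo $m$ be a non-principal character induced by the primitive character $\chi^*$ modulo $m^*$. Let $a \geq 1$. We have

$$\tau(\chi, \psi_a) = \tau(\chi^*) \sum_{d \mid (a,\, m/m^*)} d\, \bar{\chi}^*\Big(\frac{a}{d}\Big) \chi^*\Big(\frac{m}{dm^*}\Big) \mu\Big(\frac{m}{dm^*}\Big),$$

in particular, $\tau(\chi, \psi_a) = \bar{\chi}(a)\tau(\chi)$ if $(a,m) = 1$, i.e., if $\psi_a$ is a primitive additive character (by (3.13), only $d = 1$ contributes).

Note that $\overline{\tau(\chi)} = \chi(-1)\tau(\bar{\chi})$, hence if $\chi \pmod m$ is primitive, then (3.14) can be written as

$$\tau(\chi)\tau(\bar{\chi}) = \chi(-1)m. \tag{3.15}$$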

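For any characters $\chi_1 \pmod{m_1}$ and $\chi_2 \pmod{m_2}$ with $(m_1, m_2) = 1$ we have

$$\tau(\chi_1\chi_2) = \chi_1(m_2)\chi_2(m_1)\tau(\chi_1)\tau(\chi_2). \tag{3.16}$$

This multiplication rule fails if the moduli are not relatively prime. For characters to the same modulus the correction factor is the Jacobi sum

$$J(\chi_1, \chi_2) = \sum_{a \,(\mathrm{mod}\ m)} \chi_1(a)\chi_2(1 - a). \tag{3.17}$$

Precisely, if $\chi_1$, $\chi_2$ are two characters modulo $m$ such that $\chi_1\chi_2$ is primitive, then

$$\tau(\chi_1)\tau(\chi_2) = J(\chi_1, \chi_2)\tau(\chi_1\chi_2). \tag{3.18}$$

Hence if all $\chi_1$, $\chi_2$, $\chi_1\chi_2$ are primitive of modulus $m$, then

$$|J(\chi_1, \chi_2)| = \sqrt{m}. \tag{3.19}$$

If $\chi \pmod m$ is primitive, then

$$J(\chi, \bar{\chi}) = \chi(-1)\mu(m). \tag{3.20}$$

3.5. Real characters.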

For a primitive character $\chi \pmod m$ we know that $|\tau(\chi)| = \sqrt{m}$; however, the determination of the argument of $\tau(\chi)$ is a difficult problem. Using Deligne's estimate for multiple Kloosterman sums one can show that $\tau(\chi)/\sqrt{m}$ are asymptotically equidistributed on the unit circle as $\chi \pmod m$ ranges over all primitive characters and $m$ tends to infinity over primes (see Chapter 21). In the case of real characters $\tau(\chi)$ were evaluated completely by Gauss. It is easy to see that $\tau(\chi_4) = 2i$, $\tau(\chi_8) = 2\sqrt{2}$ and $\tau(\chi_4\chi_8) = 2\sqrt{2}\,i$.

THEOREM 3.3 (GAUSS). For $m$ odd squarefree, the real primitive character $\chi_m$ of conductor $m$ satisfies

$$\tau(\chi_m) = \varepsilon_m \sqrt{m} \tag{3.21}$$

where

$$\varepsilon_m = \begin{cases} 1 & \text{if } m \equiv 1 \pmod 4, \\ i & \text{if } m \equiv -1 \pmod 4. \end{cases} \tag{3.22}$$
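Theorem 3.3 is easy to test numerically. The sketch below assumes that $\chi_m$ is realized as the Jacobi symbol $n \mapsto (n/m)$ for odd squarefree $m$; the function names `jacobi`, `gauss_sum` and `epsilon` are illustrative only, and the comparison is in floating point.

```python
import cmath
import math

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n >= 1, by the standard reciprocity algorithm."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:           # pull out factors of 2 using (2/n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                 # reciprocity step
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def gauss_sum(m):
    """tau(chi_m) = sum over b mod m of (b/m) e(b/m)."""
    return sum(jacobi(b, m) * cmath.exp(2j * math.pi * b / m) for b in range(m))

def epsilon(m):
    """epsilon_m from (3.22) for odd m."""
    return 1 if m % 4 == 1 else 1j

# Compare tau(chi_m) with epsilon_m * sqrt(m) for a few odd squarefree m.
for m in (3, 5, 7, 15, 21, 35, 105):
    print(m, abs(gauss_sum(m) - epsilon(m) * math.sqrt(m)))
```

The printed differences should be at the level of rounding error.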

We shall give an analytic proof of (3.21). But first we evaluate other Gauss sums of type

$$G(m) = \sum_{n \,(\mathrm{mod}\ m)} e\Big(\frac{n^2}{m}\Big) \tag{3.23}$$

for any integer $m \geq 1$. If $m$ is even we derive by shifting $n$ to $n + \frac{m}{2}$ that $G(m) = i^m G(m)$, because $\frac{1}{m}\big(n + \frac{m}{2}\big)^2 \equiv \frac{n^2}{m} + \frac{m}{4} \pmod 1$. Hence

$$G(m) = 0 \quad \text{if } m \equiv 2 \pmod 4. \tag{3.24}$$

Next we show that

$$G(m^3) = mG(m) \quad \text{if } m \not\equiv 2 \pmod 4. \tag{3.25}$$

This follows by splitting the summation in $G(m^3)$ into $n = a + bm^2 \pmod{m^3}$ if $2 \nmid m$, or $n = a + b\frac{m^2}{2} \pmod{m^3}$ if $4 \mid m$. Now we are ready to prove

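THEOREM 3.4 (DIRICHLET). For any $m \in \mathbb{N}$,

$$G(m) = \frac{1 + i^{-m}}{1 + i^{-1}} \sqrt{m}. \tag{3.26}$$

PROOF. We have

$$G(m) = 2 \sum_{0 < n < m/2} e\Big(\frac{n^2}{m}\Big) + O(1).$$

In the segment $\frac{m}{4} < n < \frac{m}{2}$ we change $n$ into $[\frac{m}{2}] - n = \frac{m}{2} - \{\frac{m}{2}\} - n$. Since

$$\frac{1}{m}\Big(\Big[\frac{m}{2}\Big] - n\Big)^2 = \frac{1}{m}\Big(n + \Big\{\frac{m}{2}\Big\}\Big)^2 - \frac{m}{4} + \Big[\frac{m}{2}\Big] - n \equiv \frac{1}{m}\Big(n + \Big\{\frac{m}{2}\Big\}\Big)^2 - \frac{m}{4} \pmod 1,$$

we obtain

$$G(m) = 2 \sum_{0 < n < m/4} e\Big(\frac{n^2}{m}\Big) + 2i^{-m} \sum_{0 < n < m/4} e\Big(\frac{(n + \{m/2\})^2}{m}\Big) + O(1).$$

Here, and as before, the error term $O(1)$ accounts for the terms which are not covered in the displayed summation (at most two of them). By Lemma 8.8 each of the two short sums above is approximated (up to a bounded error term) by the integral

$$\int_0^{m/4} e\Big(\frac{x^2}{m}\Big)\,dx = \int_0^{\infty} e\Big(\frac{x^2}{m}\Big)\,dx + O(1) = \frac{1+i}{4}\sqrt{m} + O(1).$$

Hence adding up these results we get

$$G(m) = \frac{1 + i^{-m}}{1 + i^{-1}} \sqrt{m} + O(1).$$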
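Before the error term is removed analytically, one can observe numerically that it is in fact absent. The following sketch compares $G(m)$ from (3.23) with the closed form in (3.26); the function names are illustrative, and $i^{-m}$ is computed as $(-i)^m$.

```python
import cmath
import math

def G(m):
    """Quadratic Gauss sum (3.23); n*n is reduced mod m to keep the phase accurate."""
    return sum(cmath.exp(2j * math.pi * ((n * n) % m) / m) for n in range(m))

def dirichlet_closed_form(m):
    """Right side of (3.26): (1 + i^(-m)) / (1 + i^(-1)) * sqrt(m)."""
    return (1 + (-1j) ** m) / (1 - 1j) * math.sqrt(m)

for m in range(1, 13):
    print(m, abs(G(m) - dirichlet_closed_form(m)))
```

In particular the values with $m \equiv 2 \pmod 4$ vanish on both sides, as in (3.24).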

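It remains to show that the last error term $O(1)$ vanishes. This follows by iterated application of (3.25) if $m \not\equiv 2 \pmod 4$; otherwise (3.26) is obvious because both sides vanish. $\square$

We generalize $G(m)$ by setting

$$G\Big(\frac{a}{m}\Big) = \sum_{n \,(\mathrm{mod}\ m)} e\Big(\frac{an^2}{m}\Big) \tag{3.27}$$

for any $a$ with $(a,m) = 1$. In particular, $G(\frac{1}{m}) = G(m)$. If $m = m_1m_2$ with $(m_1, m_2) = 1$, we obtain

$$G\Big(\frac{a}{m_1m_2}\Big) = G\Big(\frac{am_2}{m_1}\Big) G\Big(\frac{am_1}{m_2}\Big) \tag{3.28}$$

by writing $n = n_1m_2 + n_2m_1$ with $n_1$ and $n_2$ ranging modulo $m_1$ and $m_2$ respectively. This formula reduces the problem of evaluating the generalized Gauss sum $G(a/m)$ to that of a prime power modulus. If $m = p$ is an odd prime we can write

$$G\Big(\frac{a}{p}\Big) = \sum_{b \,(\mathrm{mod}\ p)} \Big(1 + \Big(\frac{b}{p}\Big)\Big) e\Big(\frac{ab}{p}\Big)$$

since $1 + (\frac{b}{p})$ is the number of solutions to $n^2 \equiv b \pmod p$. Therefore

$$G\Big(\frac{a}{p}\Big) = \Big(\frac{a}{p}\Big) G\Big(\frac{1}{p}\Big) = \varepsilon_p \Big(\frac{a}{p}\Big) \sqrt{p} \tag{3.29}$$

by changing the variable and using (3.26). Next for distinct odd primes $p$, $q$ we derive by (3.26), (3.28), (3.29) that

$$\Big(\frac{p}{q}\Big)\Big(\frac{q}{p}\Big) \varepsilon_p \varepsilon_q = \varepsilon_{pq}.$$

Note also that

$$\varepsilon_p \varepsilon_q = \varepsilon_{pq} (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}. \tag{3.30}$$

Combining these formulas we deduce

THEOREM 3.5 (QUADRATIC RECIPROCITY LAW). For any odd primes $p \neq q$ we have

$$\Big(\frac{p}{q}\Big)\Big(\frac{q}{p}\Big) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}. \tag{3.31}$$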
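The reciprocity law can be confirmed for small primes by computing Legendre symbols independently, here through Euler's criterion $(\frac{a}{p}) \equiv a^{(p-1)/2} \pmod p$. This is only an illustrative check (Euler's criterion is a standard fact, not the method of the text), and the helper names are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r   # r is 0, 1 or p - 1

def odd_primes_below(n):
    """Odd primes < n by trial division (adequate for a small check)."""
    return [p for p in range(3, n, 2)
            if all(p % d for d in range(3, int(p ** 0.5) + 1, 2))]

def reciprocity_holds(p, q):
    """Check (p/q)(q/p) = (-1)^(((p-1)/2)((q-1)/2)) for distinct odd primes."""
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) * legendre(q, p) == sign

primes = odd_primes_below(60)
print(all(reciprocity_holds(p, q) for p in primes for q in primes if p != q))
```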

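Since $G(-1/p) = \overline{G(1/p)}$ it follows from (3.29) that for odd $p$

$$\Big(\frac{-1}{p}\Big) = (-1)^{\frac{p-1}{2}}. \tag{3.32}$$

EXERCISE 2. Prove that for odd $p$,

$$\Big(\frac{2}{p}\Big) = (-1)^{\frac{p^2-1}{8}}. \tag{3.33}$$

Now (3.21) follows by $\tau(\chi_p) = G(p) = \varepsilon_p\sqrt{p}$ if $m = p$ is an odd prime, and in general by the multiplicativity on using (3.16), (3.30) and (3.31). Note we have proved that

$$\sum_{x \,(\mathrm{mod}\ m)} \Big(\frac{x}{m}\Big) e\Big(\frac{x}{m}\Big) = \sum_{x \,(\mathrm{mod}\ m)} e\Big(\frac{x^2}{m}\Big)$$

if $m$ is odd squarefree. If $m$ is odd but not squarefree, the left side vanishes while the right side does not. For convenience we extend the Legendre symbol $(\frac{a}{p})$ to $p = 2$ by setting

$$\Big(\frac{a}{2}\Big) = \begin{cases} 1 & \text{if } 2 \nmid a, \\ 0 & \text{if } 2 \mid a. \end{cases} \tag{3.34}$$

Then for any $b > 0$ we define

$$\Big(\frac{a}{b}\Big) = \prod_{p^\nu \| b} \Big(\frac{a}{p}\Big)^{\nu}. \tag{3.35}$$

If $b$ is odd, this symbol $(\frac{a}{b})$ was introduced by Jacobi.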
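The definitions (3.34) and (3.35) translate directly into code. The sketch below evaluates the symbol for $b > 0$ by factoring $b$ with trial division and using Euler's criterion at odd primes; both choices are assumptions made for illustration, and the function names are not from the text.

```python
def symbol_at_prime(a, p):
    """(a/p) for a prime p: convention (3.34) at p = 2, Euler's criterion otherwise."""
    if p == 2:
        return 1 if a % 2 else 0
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def symbol(a, b):
    """(a/b) for b > 0 following (3.35): product of (a/p)^nu over p^nu || b."""
    if b <= 0:
        raise ValueError("this sketch only covers b > 0")
    result, p = 1, 2
    while b > 1:
        if b % p == 0:               # p is prime here: smaller factors are gone
            s = symbol_at_prime(a, p)
            while b % p == 0:
                b //= p
                result *= s
        p += 1
    return result

# (2/9) = (2/3)^2 = 1 although 2 is not a square modulo 9:
print(symbol(2, 9), symbol(2, 15), symbol(5, 21), symbol(3, 4), symbol(2, 4))
```

The first value illustrates that, for composite $b$, the symbol being $1$ does not by itself mean that $a$ is a square modulo $b$.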

EXERCISE 3. Derive from (3.31) that for any odd integers $a$, $b$ relatively prime

$$\Big(\frac{a}{|b|}\Big)\Big(\frac{b}{|a|}\Big) = (-1)^{\frac{a-1}{2} \cdot \frac{b-1}{2}} (a,b)_\infty. \tag{3.36}$$

Here $(x,y)_\infty$ is the Hilbert symbol defined for $xy \neq 0$ by

$$(x,y)_\infty = \begin{cases} -1 & \text{if } x < 0 \text{ and } y < 0, \\ 1 & \text{otherwise.} \end{cases} \tag{3.37}$$

Notice that 2(x, y)oo = 1 +sign x +sign y- sign xy. EXERCISE 4. Prove that for any a,m with (2a, m) = 1, (3.38)

Finally we extend the symbol $(\frac{a}{b})$ to all integers $a$, $b$ except for $a = b = 0$. If $ab \neq 0$, we set

$$\Big(\frac{a}{b}\Big) = \Big(\frac{a}{|b|}\Big)(a,b)_\infty. \tag{3.39}$$

Then we set

(3.40)

(~)=G)= (~1) =-(~1) = 1,

and

(3.41)

G) =0

if (a, b)

f=

l.

For b f= 0 this symbol was introduced by Shimura [Shl] in the context of modular forms of half-integral weight. He showed that (1.53) holds with

(3.42)

v(c,d)

=Ed(~)·

EXERCISE 5. Check the consistency of the above settings. Prove that for any $b \geq 1$ the map $a \mapsto (\frac{a}{b})$ is a Dirichlet character modulo $b$. Prove that for any $a \neq 0$ the map $b \mapsto (\frac{a}{b})$ is a Dirichlet character of conductor $a^* \mid 4a$. The above extension of $(\frac{a}{b})$ will be called the Jacobi symbol.

Any integer $\Delta \neq 0$ with $\Delta \equiv 0, 1 \pmod 4$ is called a discriminant. If $\Delta = 1$ or $\Delta$ is the discriminant of a quadratic field, then $\Delta$ is said to be fundamental. Any discriminant can be written uniquely as $\Delta = e^2D$ with $D$ fundamental and $e \geq 1$. A prime discriminant is a fundamental discriminant having exactly one prime factor, thus it is a number of type $-4$, $-8$, $8$ and $\chi_4(p)p$ with $p > 2$. Every fundamental discriminant factors into prime discriminants. To any discriminant $\Delta$ one associates the Kronecker symbol defined for any $c \neq 0$ by means of the Jacobi symbol as follows

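$$\Big(\frac{\Delta}{c}\Big)_K = \Big(\frac{\Delta}{2}\Big)_K^{\nu} \Big(\frac{\Delta}{b}\Big) \tag{3.43}$$

where $c = 2^\nu b$ with $b$ odd. Note that $(\frac{\Delta}{2})_K = (\frac{2}{\Delta}) = \chi_8(\Delta)$, explicitly

$$\Big(\frac{\Delta}{2}\Big)_K = \begin{cases} 1 & \text{if } \Delta \equiv 1 \pmod 8, \\ -1 & \text{if } \Delta \equiv 5 \pmod 8, \\ 0 & \text{if } \Delta \equiv 0 \pmod 4. \end{cases} \tag{3.44}$$

We also extend the definition of the Kronecker symbol for $c = 0$ by setting

$$\Big(\frac{\Delta}{0}\Big)_K = \begin{cases} 1 & \text{if } \Delta = 1, \\ 0 & \text{otherwise.} \end{cases} \tag{3.45}$$

Therefore $(\frac{\Delta}{c})_K$ is defined for all integers $c$ and $\Delta \neq 0$, $\Delta \equiv 0, 1 \pmod 4$.

We shall drop the subscript $K$ in the Kronecker symbol notation and explain in any relevant case that we are dealing with the Kronecker symbol. Remember the Kronecker symbol is not defined for $\Delta$'s other than discriminants. When the symbol $(\frac{a}{b})$ appears without comments it stands for the Jacobi symbol.

EXERCISE 6. Prove that for a fundamental discriminant $\Delta$ the Kronecker symbol $(\frac{\Delta}{c})$ is a primitive character of conductor $|\Delta|$.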

3.6. The quartic residue symbol. Next we construct certain characters of order four. These are associated with $\mathbb{Q}(i)$, the imaginary quadratic field of discriminant $D = -4$. The ring of integers of $\mathbb{Q}(i)$ is $\mathbb{Z}[i] = \{z = a + bi : a, b \in \mathbb{Z}\}$ and the group of units of $\mathbb{Z}[i]$ is $\{1, i, i^2, i^3\}$. The irreducible elements of $\mathbb{Z}[i]$ are (up to the units): $1 + i$ with $N(1+i) = 2$, the rational primes $q \equiv -1 \pmod 4$ with $Nq = q^2$, and the complex numbers $\pi = a + bi$ with

$$N\pi = \pi\bar{\pi} = a^2 + b^2 = p \equiv 1 \pmod 4. \tag{3.46}$$

Note that $\pi$ and $\bar{\pi}$ are coprime; these are called Gaussian primes. Every element of $\mathbb{Z}[i]$ different from zero factors uniquely (up to the units and permutations) into powers of irreducible elements. For every Gaussian prime $\pi$ we define a map (the quartic residue symbol)

$$\Big(\frac{\cdot}{\pi}\Big) : \mathbb{Z}[i] \to \{0, \pm 1, \pm i\} \tag{3.47}$$

which satisfies

$$\Big(\frac{\alpha}{\pi}\Big) \equiv \alpha^{\frac{p-1}{4}} \pmod \pi. \tag{3.48}$$

Note that $(\frac{\alpha}{\pi}) = 0$ if $\pi \mid \alpha$. If $\pi \nmid \alpha$, then $\alpha^{p-1} \equiv 1 \pmod \pi$ because the residue class ring $\mathbb{Z}[i]/\pi\mathbb{Z}[i]$ is a finite field with $p = N\pi$ elements. This property (the little theorem of Fermat) implies the existence and the uniqueness of solutions to (3.48) with $(\frac{\alpha}{\pi}) = i^m$ for some $0 \leq m < 4$. In particular, we have

$$\Big(\frac{i}{\pi}\Big) = i^{\frac{p-1}{4}}, \quad \text{if } p = \pi\bar{\pi} \equiv 1 \pmod 4. \tag{3.49}$$

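EXERCISE 7. Prove the following properties of the quartic residue symbol:

$$\Big(\frac{\alpha\beta}{\pi}\Big) = \Big(\frac{\alpha}{\pi}\Big)\Big(\frac{\beta}{\pi}\Big), \tag{3.50}$$

$$\alpha \equiv \beta \pmod \pi \ \Longrightarrow\ \Big(\frac{\alpha}{\pi}\Big) = \Big(\frac{\beta}{\pi}\Big), \tag{3.51}$$

$$\Big(\frac{\alpha}{\tilde{\pi}}\Big) = \Big(\frac{\alpha}{\pi}\Big) \quad \text{if } \tilde{\pi} = \pi i^m, \tag{3.52}$$

$$\Big(\frac{\bar{\alpha}}{\bar{\pi}}\Big) = \overline{\Big(\frac{\alpha}{\pi}\Big)}, \tag{3.53}$$

$$\Big(\frac{\alpha}{\pi}\Big) = 1 \text{ if and only if } z^4 \equiv \alpha \pmod \pi \text{ has a solution in } \mathbb{Z}[i]^*. \tag{3.54}$$

If $\gamma = \pi_1 \cdots \pi_r$ is the product of Gaussian primes (not necessarily distinct), then we set the symbol

$$\Big(\frac{\alpha}{\gamma}\Big) = \Big(\frac{\alpha}{\pi_1}\Big) \cdots \Big(\frac{\alpha}{\pi_r}\Big). \tag{3.55}$$

Clearly the properties (3.50)-(3.53) hold true with $\pi$ replaced by $\gamma$. In particular, (3.53) gives

$$\Big(\frac{a}{p}\Big) = 1 \quad \text{if } (a,p) = 1.$$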

Note that $(1+i)^2 = 2i$. We say that $\alpha \in \mathbb{Z}[i]$ is odd if $(1+i) \nmid \alpha$ and primary if $\alpha \equiv 1 \pmod{2(1+i)}$. Every odd number in $\mathbb{Z}[i]$ is associated with exactly one primary element, i.e., $\varepsilon\alpha \equiv 1 \pmod{2(1+i)}$ for exactly one unit $\varepsilon = i^m$. A primary element can be written as the product of primary irreducibles uniquely up to permutations. We have the following

THEOREM 3.6 (THE LAW OF QUARTIC RECIPROCITY). If $\pi_1$, $\pi_2$ are distinct primary Gaussian primes, then

$$\Big(\frac{\pi_1}{\pi_2}\Big) = (-1)^{\frac{p_1-1}{4} \cdot \frac{p_2-1}{4}} \Big(\frac{\pi_2}{\pi_1}\Big) \tag{3.56}$$

where $p_1 = N\pi_1$ and $p_2 = N\pi_2$. If $\pi = a + bi$ is primary, then

$$\Big(\frac{i}{\pi}\Big) = i^{(1-a)/2}, \qquad \Big(\frac{1+i}{\pi}\Big) = i^{(a-1-b-b^2)/4}, \qquad \Big(\frac{2}{\pi}\Big) = i^{-b/2}.$$
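The symbol can be computed from (3.48) by working in the residue field. The sketch below assumes the identification $\mathbb{Z}[i]/\pi\mathbb{Z}[i] \simeq \mathbb{Z}/p\mathbb{Z}$ given by $i \mapsto r$ with $r \equiv -ab^{-1} \pmod p$ (because $a + bi \equiv 0 \pmod \pi$); the representation of Gaussian integers as pairs and the function names are illustrative choices, not notation of the text. It then compares the result with the three supplementary formulas for primary $\pi$.

```python
def quartic_exponent(alpha, pi):
    """Return m in {0,1,2,3} with (alpha/pi) = i^m, or None if pi divides alpha.

    alpha = (x, y) stands for x + y*i; pi = (a, b) stands for a Gaussian prime
    a + b*i with a^2 + b^2 = p prime, p = 1 (mod 4).
    """
    x, y = alpha
    a, b = pi
    p = a * a + b * b
    r = (-a * pow(b % p, -1, p)) % p     # image of i in Z[i]/(pi) = Z/pZ
    t = (x + y * r) % p                  # image of alpha
    if t == 0:
        return None
    v = pow(t, (p - 1) // 4, p)          # alpha^((p-1)/4) mod pi, as in (3.48)
    for m in range(4):
        if pow(r, m, p) == v:            # match against the image of i^m
            return m

def supplementary_exponents(pi):
    """Exponents predicted for (i/pi), ((1+i)/pi), (2/pi) when pi = a + b*i is primary."""
    a, b = pi
    return ((1 - a) // 2 % 4, (a - 1 - b - b * b) // 4 % 4, (-b) // 2 % 4)

for pi in [(3, 2), (-1, 2), (1, 4)]:     # primary primes of norm 13, 5, 17
    computed = tuple(quartic_exponent(al, pi) for al in [(0, 1), (1, 1), (2, 0)])
    print(pi, computed, supplementary_exponents(pi))
```

Requires Python 3.8 or later for the modular inverse `pow(x, -1, p)`.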

The quartic residue symbol (;) is a multiplicative character of the finite field IFP ::::- Z[i]/7rZ[i]. The Gauss sum of this character is (3.57) where a runs over the rational residue classes modulo p and a is the representative of a in Z[i] modulo 1r. If 1r is primary, then (3.58) We shall be interested in the quartic residue symbol at rational integers (3.59)

$$\chi_\pi(n) = \Big(\frac{n}{\pi}\Big) \quad \text{for } n \in \mathbb{Z}.$$

This is a Dirichlet character of conductor $p = \pi\bar{\pi}$ and of order four. Since $(\pi, \bar{\pi}) = 1$, we have $\chi_\pi^2(n) \equiv n^{\frac{p-1}{2}} \pmod p$. Therefore

$$\chi_\pi^2(n) = \Big(\frac{n}{p}\Big) \tag{3.60}$$

is the quadratic residue symbol. Clearly $\chi_\pi(n)$ and $\chi_{\bar{\pi}}(n)$ are distinct Dirichlet characters but their squares yield the same quadratic character. More generally, if $q = p_1 \cdots p_r$ is the product of distinct primes $p_j \equiv 1 \pmod 4$, then there are $2^r$ distinct Dirichlet characters $\chi \pmod q$ satisfying $\chi^2 = \chi_q$, namely $\chi = \chi_{\pi_1} \cdots \chi_{\pi_r} = \chi_\gamma$, say, for any $\gamma = \pi_1 \cdots \pi_r$ with $\gamma\bar{\gamma} = q$.

3.7. The Jacobi-Dirichlet and the Jacobi-Kubota symbols. Let $q$ be the product of primes $\equiv 1 \pmod 4$ (not necessarily distinct primes). The Jacobi symbol $\chi_q(n) = (\frac{n}{q})$ can be extended to the Gaussian domain $\mathbb{Z}[i]$ in several ways corresponding to representations of $q$ as the sum of two squares

$$q = u^2 + v^2 \quad \text{with } (u,v) = 1. \tag{3.61}$$

By requiring w = u + iv to be primary we distinguish u from v and we fix the sign of u. Note that u = 1(mod 2) and v = u -1(mod 4). A Gaussian integer w = u+iv is said to be primitive if ( u, v) = 1, thus an odd w is primitive if ( w, w) = 1. For any primary primitive w we define the symbol (;;;) : Z[i] -+ {0, 1, -1} by (3.62) where on the right side is the Jacobi symbol. More explicitly we have (3.63)

(;) =

c1· ~ vs),

if z = r +is

where q = ww. If q is prime= 1(mod 4), we get two different symbols (,;;) and ( ~) which were considered by Dirichlet [Dir]. Throughout we call (:!;;) the JacobiDirichlet symbol whenever w is primary and primitive. We could introduce the Jacobi-Dirichlet symbols using roots of quadratic congruences: there is one-to-one correspondence between the roots of

w2

(3.64)

+1=

O(mod q)

and the flj.ctorizations q = ww with w = u (3.65)

w

+ iv

primary which is given by

=-uv(mod q).

Here 'IT denotes the multiplicative inverse of u modulo q. We obtain

(;;;;z) __ (r +qws)

(3.66) by ( ~)

=(

({iJ = (D

(3.68)

=

= r +is

¥) = ( ~) = ( ~I ) = 1. Hence it is clear that

(3.67) For z

if z

ifr E Z.

i we have i ) ( -w

=

.1?.::..!

1.

2

•


Clearly (t;) is periodic in z of period q, and it is multiplicative as well since (r1

+ ws1)(r2 + ws2)

=r1r2 ~ s1s2 + w(r1s2 + r2s1) (mod q).

EXERCISE 8. Derive from the quadratic reciprocity law (3.31) the following reciprocity law for the Jacobi-Dirichlet symbol

(3.69)

for any $z$ and $w$ which are primary and primitive. For Gaussian integers $z = r + is \equiv 1 \pmod 2$, we define

$$[z] = i^{\frac{r-1}{2}} \Big(\frac{s}{|r|}\Big) \tag{3.70}$$

where $(\frac{s}{|r|})$ stands for the Jacobi symbol. Note that $[z]$ vanishes if $z$ is not primitive. We are interested in the multiplicative structure of $[z]$ within the Gaussian ring $\mathbb{Z}[i]$ rather than with respect to the co-ordinates $(r,s) \in \mathbb{Z} \times \mathbb{Z}$. For this reason we refer to $[z]$ as the Jacobi-Kubota symbol. Indeed, this symbol is relevant to the Kubota homomorphism $SL(2,\mathbb{Z}) \to \{\pm 1\}$ which has much to do with metaplectic modular forms. Of course, $[z]$ is not multiplicative in the strict sense, yet it is nearly so, up to a factor which is the Jacobi-Dirichlet symbol.

EXERCISE 9. Prove that if $w$ is primary primitive and $z \equiv 1 \pmod 2$, then

$$[wz] = \varepsilon [w][z] \Big(\frac{z}{w}\Big) \tag{3.71}$$

with $\varepsilon = \pm 1$ depending only on the quadrants to which $z$, $w$ and $wz$ belong (see the details in [FI1]). Both symbols of Jacobi-Dirichlet and Jacobi-Kubota are utilized in the proof of the asymptotic formula for primes $p = X^2 + Y^4$. They play a role through the estimation for general bilinear forms in $[wz]$ (see Proposition 21.4 of [FI1]).

3.8. Hecke characters. E. Hecke [Hec1] generalized the Dirichlet characters to any number field. His characters are multiplicative functions on ideals. We restrict our considerations to imaginary quadratic fields. Every such field is obtained by an extension of rationals by $\sqrt{d}$, where $d$ is a negative, squarefree integer. Denote $K = \mathbb{Q}(\sqrt{d})$. The ring of integers of $K$ is a free $\mathbb{Z}$-module $\mathcal{O} = \mathbb{Z} + \omega\mathbb{Z}$, where

$$\omega = \begin{cases} \sqrt{d} & \text{if } d \equiv 2, 3 \pmod 4, \\ \frac{1}{2}(1 + \sqrt{d}) & \text{if } d \equiv 1 \pmod 4. \end{cases}$$

The discriminant of $K = \mathbb{Q}(\sqrt{d})$ is

$$D = \begin{cases} 4d & \text{if } d \equiv 2, 3 \pmod 4, \\ d & \text{if } d \equiv 1 \pmod 4. \end{cases}$$

Note that in any case $K = \mathbb{Q}(\sqrt{D})$ and $\mathcal{O} = \mathbb{Z} + \frac{1}{2}(D + \sqrt{D})\mathbb{Z}$. The extension $K/\mathbb{Q}$ has two automorphisms, the identity and the complex conjugation. The group of units $U \subset \mathcal{O}$ is finite, cyclic, generated by the root of unity $\zeta_w = e^{2\pi i/w}$ of order $w = 4, 6, 2$, precisely

$$U = \{\pm 1, \pm i\} \ \text{ if } d = -1, \qquad U = \{\pm 1, \pm\omega, \pm\omega^2\} \ \text{ if } d = -3, \qquad U = \{\pm 1\} \ \text{ if } d < -3. \tag{3.72}$$

The Kronecker symbol $\chi_D(n) = (\frac{D}{n})$ is a real primitive character of conductor $-D$; we call it the field character. Notice that $\chi_D(-1) = -1$ (see (3.39)). Every ideal $\mathfrak{a} \neq 0$ in $\mathcal{O}$ is a free $\mathbb{Z}$-module, and the ring of residue classes $\mathcal{O}/\mathfrak{a}$ is finite, the number of its elements being the norm of $\mathfrak{a}$,

$$N\mathfrak{a} = |\{\alpha \,(\mathrm{mod}\ \mathfrak{a}) : \alpha \in \mathcal{O}\}|. \tag{3.73}$$

The norm is multiplicative. If $\mathfrak{a} = (\alpha)$ is a principal ideal, then $N\mathfrak{a} = N\alpha = |\alpha|^2$. Every ideal $\mathfrak{a} \neq 0$ factors uniquely into powers of prime ideals. Every prime ideal of $K$ appears as a factor of a rational prime. The law of factorization of rational primes into prime ideals in $K$ asserts the following:
- if $\chi_D(p) = 0$, then $p$ ramifies: $(p) = \mathfrak{p}^2$ with $N\mathfrak{p} = p$,
- if $\chi_D(p) = 1$, then $p$ splits: $(p) = \mathfrak{p}\bar{\mathfrak{p}}$ with $\mathfrak{p} \neq \bar{\mathfrak{p}}$ and $N\mathfrak{p} = p$,
- if $\chi_D(p) = -1$, then $p$ is inert: $(p) = \mathfrak{p}$ with $N\mathfrak{p} = p^2$.

An integral ideal $\mathfrak{a} \neq 0$ which has no rational integral divisors other than $\pm 1$ is said to be primitive. Every primitive ideal can be written uniquely as the $\mathbb{Z}$-module

$$\mathfrak{a} = a\mathbb{Z} + \tfrac{1}{2}(b + \sqrt{D})\mathbb{Z} = [a, \tfrac{1}{2}(b + \sqrt{D})] \tag{3.74}$$

with $a = N\mathfrak{a}$ and $b$ the integer determined by $-a < b \leq a$, $b^2 \equiv D \pmod{4a}$. Note that we have $b^2 - 4ac = D$ with $(a,b,c) = 1$. This is clear from the following computation:

$$(a) = (\mathfrak{a}\bar{\mathfrak{a}}) = [a, \tfrac{1}{2}(b + \sqrt{D})][a, \tfrac{1}{2}(b - \sqrt{D})] = [a^2, \tfrac{a}{2}(b + \sqrt{D}), \tfrac{a}{2}(b - \sqrt{D}), ac] \subset [a^2, ab, ac] = a[a,b,c] \subset (a).$$

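For a primitive ideal $\mathfrak{a}$ we set the point in the upper half-plane

$$z_{\mathfrak{a}} = \frac{b + \sqrt{D}}{2a} \in \mathbb{H}. \tag{3.75}$$

The inverse ideal $\mathfrak{a}^{-1}$ is the free $\mathbb{Z}$-module generated by $1$ and $\bar{z}_{\mathfrak{a}}$,

$$\mathfrak{a}^{-1} = \mathbb{Z} + \bar{z}_{\mathfrak{a}}\mathbb{Z}. \tag{3.76}$$

Let $\mathfrak{d}$ denote the different of $K$, so by definition $\mathfrak{d}^{-1}$ is the fractional ideal

$$\mathfrak{d}^{-1} = \{\alpha \in K : \mathrm{Tr}(\alpha\mathcal{O}) \subset \mathbb{Z}\}.$$

In our case the different of the quadratic field $K = \mathbb{Q}(\sqrt{D})$ is the principal ideal $\mathfrak{d} = (\sqrt{D})$. Note that $\bar{\mathfrak{d}} = \mathfrak{d}$ and $N\mathfrak{d} = |D|$.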

Let $I$ be the group of all non-zero fractional ideals

$$I = \{\mathfrak{a}_1\mathfrak{a}_2^{-1} : \mathfrak{a}_1, \mathfrak{a}_2 \subset \mathcal{O},\ \mathfrak{a}_1\mathfrak{a}_2 \neq 0\},$$

and $P \subset I$ the subgroup of principal ideals $(\alpha) = \alpha\mathcal{O}$ with $\alpha \in K^*$. The factor group $H = I/P$ is called the class group of $K$. It is a finite abelian group of order $h = |H| = [I : P]$ which is called the class number of $K$ (also of the discriminant $D$). Every class $\mathcal{A}$ is represented by a unique primitive ideal $\mathfrak{a}$ with $z_{\mathfrak{a}}$ in the standard fundamental domain (such an ideal is called reduced) $F = F^- \cup F^+$, where

$$F^- = \{z = x + iy \in \mathbb{H} : -\tfrac{1}{2} < x < 0 \text{ and } |z| > 1\}, \qquad F^+ = \{z = x + iy \in \mathbb{H} : 0 \leq x \leq \tfrac{1}{2} \text{ and } |z| \geq 1\}. \tag{3.77}$$
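Since $z_{\mathfrak{a}} = (b + \sqrt{D})/2a$ has real part $b/2a$ and $|z_{\mathfrak{a}}|^2 = (b^2 - D)/4a^2 = c/a$ with $b^2 - 4ac = D$, membership in $F$ becomes a finite set of inequalities on $(a,b,c)$, so reduced ideals can be enumerated. The sketch below does this by brute force; it assumes $D < 0$ is a discriminant, the function name is illustrative, and the search bound $3a^2 \leq |D|$ follows from $|b| \leq a \leq c$.

```python
from math import gcd

def reduced_triples(D):
    """All (a, b, c) with b^2 - 4ac = D < 0, gcd(a, b, c) = 1 and
    -a < b <= a <= c, with b >= 0 whenever a = c (i.e. z_a lies in F)."""
    triples = []
    a = 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):
            num = b * b - D
            if num % (4 * a):
                continue
            c = num // (4 * a)
            if c < a or (c == a and b < 0):
                continue                     # z_a would fall outside F
            if gcd(gcd(a, b), c) == 1:
                triples.append((a, b, c))
        a += 1
    return triples

for D in (-3, -4, -15, -20, -23, -47, -163):
    print(D, len(reduced_triples(D)), reduced_triples(D))
```

The number of triples returned is the class number $h(D)$; for example $D = -23$ gives the three triples $(1,1,6)$, $(2,-1,3)$, $(2,1,3)$.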

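Therefore the class number $h = h(D)$ is equal to the number of primitive reduced ideals. For a primitive ideal written as $\mathfrak{a} = [a, \frac{1}{2}(b + \sqrt{D})]$ with $b^2 - 4ac = D$ we have $|z_{\mathfrak{a}}|^2 = \frac{c}{a}$, whence $\mathfrak{a}$ is reduced if and only if either $-a < b \leq a < c$ or $0 \leq b \leq a = c$. Conversely any integral solution to $b^2 - 4ac = D$ satisfying the above inequalities yields a primitive reduced ideal defined by (3.74).

Now we go to characters. First we consider Dirichlet characters on $\mathcal{O}$. Let $\mathfrak{m}$ be a non-zero integral ideal in $\mathcal{O}$. The residue classes $\alpha \pmod{\mathfrak{m}}$, $\alpha \in \mathcal{O}$, $(\alpha, \mathfrak{m}) = 1$ form a multiplicative group $(\mathcal{O}/\mathfrak{m})^*$. Any homomorphism

$$\chi : (\mathcal{O}/\mathfrak{m})^* \to \mathbb{C}^*$$

is called a Dirichlet character modulo $\mathfrak{m}$. Naturally we extend $\chi$ to $\mathcal{O}$ by setting $\chi(\alpha) = 0$ if $(\alpha, \mathfrak{m}) \neq 1$. If $\chi$ is a character to modulus $\mathfrak{m}$ then (by natural restrictions) it can be also viewed as a character to any modulus $\mathfrak{n} \subset \mathfrak{m}$ (recall that this inclusion of ideals means $\mathfrak{m}$ divides $\mathfrak{n}$). For any $\chi \pmod{\mathfrak{m}}$ and any $\beta \in (\mathfrak{d}\mathfrak{m})^{-1}$ we define the Gauss sum

$$\tau_\chi(\beta) = \sum_{\alpha \,(\mathrm{mod}\ \mathfrak{m})} \chi(\alpha)\, e(\mathrm{Tr}(\alpha\beta)). \tag{3.78}$$

This is well defined, because if $\alpha_1 \equiv \alpha_2 \pmod{\mathfrak{m}}$ then $\beta(\alpha_1 - \alpha_2) \in (\mathfrak{d}\mathfrak{m})^{-1}\mathfrak{m} = \mathfrak{d}^{-1}$ and $\mathrm{Tr}(\beta(\alpha_1 - \alpha_2)) \in \mathbb{Z}$. The same argument shows that $\tau_\chi(\beta)$ depends only on $\beta \pmod{\mathfrak{d}^{-1}}$, i.e., $\tau_\chi(\gamma) = \tau_\chi(\beta)$ if $\gamma - \beta \in \mathfrak{d}^{-1}$. Clearly

$$\tau_\chi(\alpha\beta) = \bar{\chi}(\alpha)\tau_\chi(\beta) \quad \text{if } (\alpha, \mathfrak{m}) = 1.$$

The conductor of $\chi \pmod{\mathfrak{m}}$ is the largest ideal $\mathfrak{f} \supset \mathfrak{m}$ (that is the smallest divisor $\mathfrak{f}$ of $\mathfrak{m}$) such that $\chi$ factors through $(\mathcal{O}/\mathfrak{f})^*$. If $\mathfrak{f} = \mathfrak{m}$ then the character is said to be primitive.

PROPOSITION 3.7. If $\chi \pmod{\mathfrak{m}}$ is primitive, then

$$|\tau_\chi(\beta)| = \begin{cases} (N\mathfrak{m})^{1/2} & \text{if } (\beta\mathfrak{d}\mathfrak{m}, \mathfrak{m}) = 1, \\ 0 & \text{otherwise.} \end{cases} \tag{3.79}$$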

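After having introduced characters of $(\mathcal{O}/\mathfrak{m})^*$, we look for (unitary) characters of $\mathbb{C}^*$. By definition these are continuous homomorphisms of $\mathbb{C}^*$ into the unit circle $S^1$. Any such character $\chi_\infty : \mathbb{C}^* \to S^1$ is given by

$$\chi_\infty(\alpha) = \Big(\frac{\alpha}{|\alpha|}\Big)^{\ell}$$

for some $\ell \in \mathbb{Z}$. We call $\ell$ the frequency of $\chi_\infty$. Next we turn to characters on ideal classes. Recall we have already introduced the following groups:
- $I$ the multiplicative group of fractional ideals $\neq 0$.
- $P$ the subgroup of principal ideals.
- $H = I/P$ the ideal class group.

Now for any integral ideal $\mathfrak{m} \subset \mathcal{O}$, $\mathfrak{m} \neq 0$, we set the subgroups

$$I_{\mathfrak{m}} = \{\mathfrak{a} \in I \mid (\mathfrak{a}, \mathfrak{m}) = 1\} \subset I, \qquad P_{\mathfrak{m}} = \{(\alpha) \in P \mid (\alpha, \mathfrak{m}) = 1\} \subset P.$$

Here $(\mathfrak{a}, \mathfrak{m}) = 1$ means $\mathfrak{a} = \mathfrak{b}\mathfrak{c}^{-1}$ for some $\mathfrak{b}, \mathfrak{c} \subset \mathcal{O}$ with $(\mathfrak{b}\mathfrak{c}, \mathfrak{m}) = 1$.

EXERCISE 10. Show that $(\alpha, \mathfrak{m}) = 1$ means $\alpha = \beta\gamma^{-1}$ for some $\beta, \gamma \in \mathcal{O}$, $(\beta\gamma, \mathfrak{m}) = 1$.

The factor groups $I_{\mathfrak{m}}/P_{\mathfrak{m}}$ are isomorphic to $I/P = H$. Any homomorphism of $H$ into $S^1$ is called a class group character. There are exactly $h = |H| = |\hat{H}|$ class group characters. Finally we introduce the Hecke characters ("Grossencharakteren"). A Hecke character modulo $\mathfrak{m}$ is a continuous homomorphism $\psi : I_{\mathfrak{m}} \to S^1$ for which there exist two characters

$$\chi : (\mathcal{O}/\mathfrak{m})^* \to S^1, \qquad \chi_\infty : \mathbb{C}^* \to S^1$$

such that

$$\psi((\alpha)) = \chi(\alpha)\chi_\infty(\alpha)$$

for every $\alpha \in \mathcal{O}$, $(\alpha, \mathfrak{m}) = 1$. Note that $\chi \pmod{\mathfrak{m}}$ and $\chi_\infty$ are determined uniquely by $\psi$. Indeed we have

$$\chi_\infty(\alpha) = \psi((\alpha)) \quad \text{if } \alpha \equiv 1 \pmod{\mathfrak{m}}.$$

Moreover, the group $\{\alpha \in K^* \mid \alpha \equiv 1 \pmod{\mathfrak{m}}\}$ is dense in $\mathbb{C}^*$ so the above formula defines $\chi_\infty$ uniquely on $\mathbb{C}^*$ by continuity. After having determined $\chi_\infty$ we find $\chi$ by

$$\chi(\alpha) = \psi((\alpha))/\chi_\infty(\alpha).$$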

Not every pair of characters $\chi \pmod{\mathfrak{m}}$ and $\chi_\infty$ comes from a Hecke character $\psi \pmod{\mathfrak{m}}$. Indeed, if $\varepsilon \in U$, then we have $\chi(\varepsilon)\chi_\infty(\varepsilon) = \psi((\varepsilon)) = 1$. One can show that the condition

$$\chi(\varepsilon)\chi_\infty(\varepsilon) = 1 \quad \text{for all } \varepsilon \in U \tag{3.80}$$

is sufficient for the pair of $\chi \pmod{\mathfrak{m}}$ and $\chi_\infty$ to result from a Hecke character $\psi \pmod{\mathfrak{m}}$. However, a pair of characters $\chi \pmod{\mathfrak{m}}$ and $\chi_\infty$ satisfying the above "units consistency condition" determines $\psi \pmod{\mathfrak{m}}$ only up to a class group character, i.e., it induces exactly $h$ Hecke characters.

In our case the group $U$ is cyclic of order $w = 4$, $6$ or $2$ generated by the root of unity $\zeta = e^{2\pi i/w}$. Therefore the condition (3.80) reduces to that for $\varepsilon = \zeta$, which becomes

$$\chi(\zeta)\zeta^{\ell} = 1. \tag{3.81}$$

In other words, this is a condition for $\ell \pmod w$. If $D < -4$, we have $\zeta = -1$, so the units consistency condition becomes

$$\ell \equiv \tfrac{1}{2}(\chi(-1) - 1) \pmod 2. \tag{3.82}$$

The conductor of a Hecke character $\psi \pmod{\mathfrak{m}}$ is the smallest divisor $\mathfrak{f}$ of $\mathfrak{m}$ such that $\psi$ is the restriction of a Hecke character to modulus $\mathfrak{f}$. This is the same as the conductor of $\chi \pmod{\mathfrak{m}}$. If $\mathfrak{f} = \mathfrak{m}$, then $\psi$ is said to be primitive. Therefore $\psi \pmod{\mathfrak{m}}$ is primitive if and only if the corresponding $\chi \pmod{\mathfrak{m}}$ is primitive. The Hecke characters of modulus $\mathfrak{m} = (1)$ and of the frequency $\ell = 0$ correspond exactly to the class group characters. Hence there can be many primitive characters of conductor one. Let $\psi$ be a primitive Hecke character of conductor $\mathfrak{m}$ and of frequency $\ell$. We consider $\psi$ as a multiplicative function on all non-zero integral ideals by setting $\psi(\mathfrak{a}) = 0$ if $(\mathfrak{a}, \mathfrak{m}) \neq 1$. To $\psi$ we associate the Hecke $L$-function

$$L(s, \psi) = \sum_{0 \neq \mathfrak{a} \subset \mathcal{O}} \psi(\mathfrak{a})(N\mathfrak{a})^{-s}.$$

This series converges absolutely for $\mathrm{Re}(s) > 1$ and has an Euler product over prime ideals

$$L(s, \psi) = \prod_{\mathfrak{p}} \big(1 - \psi(\mathfrak{p})(N\mathfrak{p})^{-s}\big)^{-1}.$$

We also introduce the local factor at the infinite place

(3.83)

THEOREM 3.8 (HECKE). Let $\psi \pmod{\mathfrak{m}}$ be primitive. Then the complete product $\Lambda(s, \psi) = L_\infty(s, \psi)L(s, \psi)$ has analytic continuation to the whole complex $s$-plane except for a simple pole at $s = 1$ when $\psi = \psi_0$ is the trivial character. In this case $\mathfrak{m} = (1)$, $\ell = 0$ and $L(s, \psi_0) = \zeta_K(s) = \zeta(s)L(s, \chi_D)$ is just the Dedekind zeta function of $K$, so

$$\mathop{\mathrm{res}}_{s=1} \Lambda(s, \psi_0) = hw^{-1}.$$

Moreover, $\Lambda(s, \psi)$ satisfies the functional equation

$$\Lambda(s, \psi) = W(\psi)\Lambda(1 - s, \bar{\psi}) \tag{3.84}$$

with the root number $W(\psi)$ given by the corresponding Gauss sum

(3.85)

The Gauss sum $\tau(\psi)$ associated with $\psi \pmod{\mathfrak{m}}$ is defined by

(3.86)

where $\gamma \in \mathcal{O}$ and $\mathfrak{c} \subset \mathcal{O}$ are such that $(\mathfrak{c}, \mathfrak{m}) = 1$, $\mathfrak{c}\mathfrak{d}\mathfrak{m} = (\gamma)$. Note that the definition of $\tau(\psi)$ does not depend on the choice of $\gamma$ and $\mathfrak{c}$ due to the units consistency condition (3.80).

EXERCISE 11. Show that for a primitive Hecke character $\psi \pmod{\mathfrak{m}}$ whose Dirichlet component is $\chi \pmod{\mathfrak{m}}$ we have

(3.87)

for some $(\alpha, \mathfrak{m}) = 1$ and $\beta \in (\mathfrak{d}\mathfrak{m})^{-1}$.

EXERCISE 12. Prove that for a primitive Hecke character $\psi \pmod{\mathfrak{m}}$,

(3.88)

EXERCISE 13. If $\psi$ modulo $\mathfrak{m}$ and $\xi$ modulo $\mathfrak{n}$ are primitive Hecke characters with $(\mathfrak{m}, \mathfrak{n}) = 1$, then

REMARK. If $\psi$ is a primitive Hecke character of conductor $\mathfrak{m} = (1)$ and frequency $\ell$, then $\tau(\psi) = i^{\ell}$. In this case the functional equation (3.84) has the root number $W(\psi) = 1$. Since $\bar{\psi}(\mathfrak{a}) = \psi(\bar{\mathfrak{a}})$ we find that $L(s, \psi) = L(s, \bar{\psi})$ by changing $\mathfrak{a}$ into $\bar{\mathfrak{a}}$ in the Dirichlet series.

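If $\psi \pmod{\mathfrak{m}}$ has frequency $\ell$, then its complex conjugate $\bar{\psi} \pmod{\mathfrak{m}}$ has frequency $-\ell$. Therefore, without loss of generality, we can assume that $\psi \pmod{\mathfrak{m}}$ has frequency $\ell \geq 0$. Then the function $f : \mathbb{H} \to \mathbb{C}$ given by

$$f(z) = \sum_{\mathfrak{a}} \psi(\mathfrak{a})(N\mathfrak{a})^{\ell/2}\, e(zN\mathfrak{a}) \tag{3.89}$$

is an automorphic form of weight $k = \ell + 1$ on $\Gamma_0(N)$ of level $N = |D|N\mathfrak{m}$ and character $\chi \pmod N$ defined by

$$\chi(n) = \chi_D(n)\psi((n)) \quad \text{if } n \in \mathbb{Z}. \tag{3.90}$$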

If $\ell > 0$, then $f$ is a cusp form, which is primitive if the character $\psi \pmod{\mathfrak{m}}$ is primitive. In particular, if $\mathfrak{m} = (1)$, then $N = |D|$ and $\chi = \chi_D$.

REMARKS. We say a few words about the proofs of the above results. First the Fourier series (3.89) splits into a finite sum over ideal classes. In each class we get an infinite series over equivalent ideals which is a theta function for a positive definite binary quadratic form twisted by a suitable harmonic polynomial (see Section 14.3). Applying the Poisson summation formula one shows that these individual theta functions are automorphic forms of the same type (cf. Chapter 10 of [14]). Hence $f$, being a linear combination of theta functions, is also an automorphic form of the same type. Another application of Poisson summation shows that the theta functions satisfy a Jacobi type involution formula. Summing over the classes these formulas yield

zkf(-z ) = iT(?j;)!(~) .JN VFiffi z .JN (at this point it is crucial that 1j;(mod m) is primitive), where T(?j;) is the Gauss sum and N = IDINm. From this, one derives the functional equation (3.84) by taking the Mellin transform with respect to z = iy. See also Chapter 14.

We end this section by giving specific examples of Hecke characters on imaginary quadratic fields.

EXAMPLE 1. For $K = \mathbb{Q}(i)$ the discriminant is $D = -4$, the ring of integers is $\mathcal{O} = \mathbb{Z}[i]$, the group of units is $U = \{\pm 1, \pm i\}$ and the class number is $h = 1$, thus every ideal is principal. Fix an integer $\ell$. Put $\mathfrak{m} = (1)$ or $\mathfrak{m} = (2(1+i))$ according to $4 \mid \ell$ or $4 \nmid \ell$. Define $\xi : I_{\mathfrak{m}} \to \mathbb{C}^*$ by

$$\xi(\mathfrak{a}) = \Big(\frac{\alpha}{|\alpha|}\Big)^{\ell} \tag{3.91}$$

where $\alpha$ is the unique primary element which generates $\mathfrak{a}$, with $\alpha \equiv 1 \pmod{2(1+i)}$. Then $\xi$ is a primitive character of frequency $\ell$ and conductor $\mathfrak{m}$.

EXAMPLE 2. Let $K = \mathbb{Q}(i)$ be as above. Take a positive integer $q \equiv 0 \pmod 4$ and $\chi : (\mathbb{Z}[i]/q\mathbb{Z}[i])^* \to \mathbb{C}^*$ a character on the multiplicative group of primitive residue classes modulo $q$ in $\mathbb{Z}[i]$. Define $\xi : I_q \to \mathbb{C}^*$ by

$$\xi(\mathfrak{a}) = \chi(\alpha)\Big(\frac{\alpha}{|\alpha|}\Big)^{\ell}$$

where $\alpha$ is the unique primary element which generates $\mathfrak{a}$. Then $\xi$ is a Hecke character (not necessarily primitive) of frequency $\ell$ and modulus $q$.

EXAMPLE 3. Let $K = \mathbb{Q}(\sqrt{D})$ have discriminant $D < -3$, $D \equiv 1 \pmod 4$. Then the ring of integers is

$$\mathcal{O} = \{\tfrac{1}{2}(m + n\sqrt{D}) : m, n \in \mathbb{Z},\ m \equiv n \pmod 2\}, \tag{3.92}$$

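the group of units is $U = \{\pm 1\}$ while the class number $h = h(D)$ can be arbitrarily large. Since the principal ideal $(\sqrt{D})$ has norm $|D|$, the ring inclusion $\mathbb{Z} \subset \mathcal{O}$ defines an isomorphism

$$\mathcal{O}/(\sqrt{D}) \to \mathbb{Z}/|D|\mathbb{Z}, \qquad \frac{m + n\sqrt{D}}{2} \mapsto \bar{2}m \pmod{|D|}, \tag{3.93}$$

where $\bar{2}$ denotes the inverse of $2$ modulo $|D|$. Composing it with the Jacobi symbol we get a quadratic character $\varepsilon : \mathcal{O} \to \{0, \pm 1\}$, explicitly

$$\varepsilon\Big(\frac{m + n\sqrt{D}}{2}\Big) = \Big(\frac{2m}{|D|}\Big). \tag{3.94}$$

Note that $\varepsilon$ is odd, i.e., $\varepsilon(-\alpha) = -\varepsilon(\alpha)$ because $|D| \equiv -1 \pmod 4$. Given $\ell$ odd, we have $h$ Hecke characters $\xi : I \to \mathbb{C}$ such that on principal ideals $\mathfrak{a} = (\alpha)$,

$$\xi(\mathfrak{a}) = \varepsilon(\alpha)\Big(\frac{\alpha}{|\alpha|}\Big)^{\ell}. \tag{3.95}$$

These are primitive characters of frequency $\ell$ and conductor $\mathfrak{m} = (\sqrt{D})$.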

EXAMPLE 4. Let $K = \mathbb{Q}(\sqrt{D})$ be any imaginary quadratic field with $w$ units. Given $\ell \equiv 0 \pmod w$, we have $h = h(D)$ Hecke characters $\xi : I \to \mathbb{C}^*$ such that $\xi(\mathfrak{a}) = (\alpha/|\alpha|)^{\ell}$ on principal ideals $\mathfrak{a} = (\alpha)$. These are primitive characters of frequency $\ell$ and conductor $\mathfrak{m} = (1)$. Among these there is a special character of frequency $\ell = hw$ defined by $\xi(\mathfrak{a}) = (\alpha/|\alpha|)^{w}$ for any $\mathfrak{a} \in I$, where $(\alpha) = \mathfrak{a}^h$.

EXAMPLE 5. Let $K = \mathbb{Q}(\sqrt{D})$ be any imaginary quadratic field of discriminant $D$, and $\chi \pmod q$ a primitive Dirichlet character of conductor $q$ with $(q, D) = 1$. Define the homomorphism $\chi \circ N : I_q \to \mathbb{C}^*$ by

$$(\chi \circ N)(\mathfrak{a}) = \chi(N\mathfrak{a}). \tag{3.96}$$

Then $\xi = \chi \circ N$ is a Hecke character of frequency zero which is primitive of conductor $\mathfrak{m} = (q)$. In this case

(3.97)

where $\tau(\chi)$ is the Gauss sum for the Dirichlet character $\chi \pmod q$.

CHAPTER 4

SUMMATION FORMULAS

4.1. Introduction. Series of arithmetic functions often undergo some kind of involutary transformations. These are not derived by re-grouping and changing the order of terms; deeper changes are made by applying Fourier analysis. For example, the equation

$$\sum_{m \in \mathbb{Z}} e^{-\pi m^2/y} = \sqrt{y} \sum_{n \in \mathbb{Z}} e^{-\pi n^2 y} \tag{4.1}$$

cannot be verified simply by a combinatorial argument. A rich source of relations of this kind lies in automorphic theory. For instance, for a classical cusp form $f(z)$ of weight $k$ on the modular group $\Gamma = SL_2(\mathbb{Z})$ whose Fourier series is

$$f(z) = \sum_{n=1}^{\infty} \lambda(n) n^{\frac{k-1}{2}} e(nz) \tag{4.2}$$

the automorphy equation $f(z) = z^{-k}f(-1/z)$ yields (by integration along the vertical line $z = iy$) the formula

$$\sum_{m=1}^{\infty} \lambda(m)g(m) = \sum_{n=1}^{\infty} \lambda(n)h(n) \tag{4.3}$$

where $g(x)$ is any smooth, compactly supported function on $\mathbb{R}^+$ and

(4.4) See more general results in (4.71) and (4.72). Modular transformations are typical but there are plenty of other important cases. For example the formulas (15.30) (the Selberg trace formula) or (16.34) (the Kuznetsov formula for sums of Kloosterman sums) do not come from automorphy equations. Naturally an equation connecting one series of an arithmetic function (weighted by a test function of certain class) with another is called a summation formula. In this chapter we develop a few such formulas and give standard applications.


4.2. The Euler-Maclaurin formula.

We begin by considering a function defined on a segment of real numbers. Suppose $a, b \in \mathbb{Z}$ and $f$ is continuous in $[a, b]$. Then the sum of $f(n)$ can be written as the Stieltjes integral

$$\sum_{a < n \le b} f(n) = \int_a^b f(x)\,d[x] \tag{4.5}$$

where $[x]$ denotes the integral part of $x$ (see (1.65)). Putting

$$\psi(x) = x - [x] - \tfrac{1}{2} \tag{4.6}$$

we derive by partial integration the following Euler-Maclaurin formula.

LEMMA 4.1. For $f$ of class $C^1$ on $[a, b]$ with $a < b$, $a, b \in \mathbb{Z}$ we have

$$\sum_{a < n \le b} f(n) = \int_a^b \big(f(x) + \psi(x) f'(x)\big)\,dx + \tfrac{1}{2}\big(f(b) - f(a)\big). \tag{4.7}$$
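Lemma 4.1 can be tested numerically on simple functions. The sketch below is an illustration under stated assumptions: the helper names (`simpson`, `sum_side`, `integral_side`) are hypothetical, and the integral is approximated by a composite Simpson rule on each unit interval, where $\psi(x) = x - j - \tfrac12$ for $j < x < j+1$.

```python
def simpson(func, lo, hi, steps=200):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0


def sum_side(f, a, b):
    """Left side of (4.7): sum of f(n) over integers a < n <= b."""
    return sum(f(n) for n in range(a + 1, b + 1))


def integral_side(f, df, a, b):
    """Right side of (4.7), integrating one unit interval at a time."""
    total = 0.0
    for j in range(a, b):
        # On the open interval (j, j + 1) we have [x] = j, so psi(x) = x - j - 1/2.
        total += simpson(lambda x, j=j: f(x) + (x - j - 0.5) * df(x), j, j + 1)
    return total + 0.5 * (f(b) - f(a))


if __name__ == "__main__":
    f = lambda x: 1.0 / x
    df = lambda x: -1.0 / (x * x)
    print(sum_side(f, 1, 10), integral_side(f, df, 1, 10))
```

Splitting at the integers avoids integrating across the jumps of $\psi$, which would otherwise spoil the accuracy of the quadrature.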

This formula produces good results when $f'(x)$ is relatively small. If the higher order derivatives of $f$ exist and are very small, it can be profitable to continue integrating by parts. To this end we need the Bernoulli polynomials $B_k(X)$. We define $B_k(X) \in \mathbb{Q}[X]$ by the following recurrence conditions:

$$B_0(X) = 1, \tag{4.8}$$

$$B_k'(X) = k B_{k-1}(X), \tag{4.9}$$

$$\int_0^1 B_k(x)\,dx = 0, \tag{4.10}$$

if $k \ge 1$. The condition (4.9) defines $B_k(X)$ in terms of $B_{k-1}(X)$ up to a constant while the last condition (4.10) (the orthogonality of $B_k(X)$ to constants) determines the constant. The monomial $X^k$ satisfies the first two conditions but it fails the third one. We find that $B_k(X) = X^k - \frac{k}{2} X^{k-1} + \cdots$. In particular, we have $B_1(X) = X - \frac{1}{2}$ and $B_2(X) = X^2 - X + \frac{1}{6}$. The generating power series for the Bernoulli polynomials $B_k(X)$ is

$$F(t, X) = \sum_{k=0}^{\infty} B_k(X) \frac{t^k}{k!} = \frac{t e^{tX}}{e^t - 1}. \tag{4.11}$$

The last equality follows by noticing that

$$\frac{\partial}{\partial X} F(t, X) = \sum_{k=1}^{\infty} B_{k-1}(X) \frac{t^k}{(k-1)!} = t F(t, X).$$
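The recurrence (4.8)-(4.10) translates directly into exact arithmetic. The following sketch is illustrative only: the function name `bernoulli_polynomials` and the representation of a polynomial as a list of coefficients (constant term first) are assumptions of this example.

```python
from fractions import Fraction


def bernoulli_polynomials(kmax):
    """Return [B_0, ..., B_kmax], each as a list of Fraction coefficients
    [c_0, c_1, ...] meaning c_0 + c_1 X + c_2 X^2 + ..."""
    polys = [[Fraction(1)]]  # (4.8): B_0(X) = 1
    for k in range(1, kmax + 1):
        prev = polys[-1]
        # (4.9): B_k' = k B_{k-1}, so integrate k * B_{k-1} term by term.
        poly = [Fraction(0)] + [k * c / (j + 1) for j, c in enumerate(prev)]
        # (4.10): choose the constant term so that the integral over [0, 1] vanishes.
        mean = sum(c / (j + 1) for j, c in enumerate(poly))
        poly[0] = -mean
        polys.append(poly)
    return polys


if __name__ == "__main__":
    for k, poly in enumerate(bernoulli_polynomials(4)):
        print(k, poly)
```

Using `Fraction` keeps every coefficient exact, so the constant terms can be compared directly with the Bernoulli numbers.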


EXERCISE 1. Prove that

$$B_k(1 - X) = (-1)^k B_k(X),$$

$$B_k(X + 1) - B_k(X) = k X^{k-1},$$

$$\sum_{0 \le a < q} B_k\Big(X + \frac{a}{q}\Big) = q^{1-k} B_k(qX).$$

Next we define the Bernoulli numbers as constant terms of the Bernoulli polynomials

$$B_k = B_k(0). \tag{4.12}$$

The generating power series for Bernoulli numbers is

$$F(t) = \sum_{k=0}^{\infty} B_k \frac{t^k}{k!} = \frac{t}{e^t - 1}. \tag{4.13}$$

Since $F(-t) = t + F(t)$, it follows that $B_k = 0$ if $k > 1$, $k$ odd.

EXERCISE 2. Prove that

(4.14)

REMARK. The Bernoulli polynomials and numbers have been generalized with respect to a character $\chi \pmod{q}$ as follows:

The constant term of $B_{k,\chi}(X)$ is the generalized Bernoulli number, namely

$$B_{k,\chi} = B_{k,\chi}(0) = q^{k-1} \sum_{0 \le a < q} \chi(a) B_k\Big(\frac{a}{q}\Big).$$

The generalized Bernoulli numbers occur in values of $L$-functions at special points. They truly belong to the theory of $p$-adic $L$-functions (due to Leopoldt, see e.g. [W]).

For any $k \ge 0$ we put

$$\psi_k(x) = B_k(\{x\}) \tag{4.15}$$

where $\{x\} = x - [x]$ is the fractional part of $x$. These functions are periodic of period one, and so is the generating function

$$\sum_{k=0}^{\infty} \psi_k(x) \frac{t^k}{k!} = \frac{t e^{t\{x\}}}{e^t - 1}.$$

Therefore they can be represented by Fourier series. The $n$-th Fourier coefficient of the generating function is

$$\int_0^1 \Big(\sum_{k=0}^{\infty} \psi_k(x) \frac{t^k}{k!}\Big) e(-nx)\,dx = \frac{t}{e^t - 1} \int_0^1 e^{(t - 2\pi i n)x}\,dx = \frac{t}{t - 2\pi i n} = -\sum_{k=1}^{\infty} \Big(\frac{t}{2\pi i n}\Big)^k,$$

the last equality being valid for $n \ne 0$. Hence the $n$-th Fourier coefficient of $\psi_k(x)$ is $-k!(2\pi i n)^{-k}$, giving the expansion

$$\psi_k(x) = -k! \sum_{n \ne 0} (2\pi i n)^{-k} e(nx) \tag{4.16}$$

(there is no term $n = 0$ by (4.10)). This series converges absolutely if $k \ge 2$. For $k = 1$ we arrange the terms symmetrically getting the series

$$\psi(x) = -\sum_{n=1}^{\infty} (\pi n)^{-1} \sin(2\pi n x) \tag{4.17}$$

which converges pointwise and boundedly except for $x \in \mathbb{Z}$.

EXERCISE 3. Prove that for any $N \ge 1$ and $x \in \mathbb{R}$

$$\psi(x) = -\sum_{n \le N} (\pi n)^{-1} \sin(2\pi n x) + O\big((1 + \|x\| N)^{-1}\big) \tag{4.18}$$

where $\|x\|$ is the distance of $x$ to the nearest integer

$$\|x\| = \min\{|x - m| : m \in \mathbb{Z}\}. \tag{4.19}$$

REMARK. Evaluating (4.16) at $x = 0$ we get

$$\zeta(2m) = -\frac{(2\pi i)^{2m}}{2\,(2m)!} B_{2m}$$

for $m \ge 1$. This turns out to be also true for $m = 0$ by the functional equation (4.75). Moreover, by the functional equation (4.75) it follows that for any $m \ge 1$,

$$\zeta(1 - 2m) = -\frac{1}{2m} B_{2m}.$$

Furthermore, for a primitive character $\chi$ of conductor $q > 1$ one has

$$L(m, \chi) = \frac{-1}{2\,m!} \Big(\frac{-2\pi i}{q}\Big)^m \tau(\chi) B_{m,\bar\chi}, \qquad L(1 - m, \chi) = \frac{-1}{m} B_{m,\chi}$$

for any $m \ge 1$ with $m \equiv \frac{1}{2}(1 - \chi(-1)) \pmod 2$ (see e.g. [IR], [W]).
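The evaluation of $\zeta(2m)$ in terms of $B_{2m}$ is easy to test. The sketch below is illustrative: the names `bernoulli_numbers` and `zeta_even` are hypothetical, and the Bernoulli numbers are computed from the standard recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ for $m \ge 1$, which is assumed here rather than taken from the text.

```python
import math
from fractions import Fraction


def bernoulli_numbers(kmax):
    """Return [B_0, ..., B_kmax] as exact fractions (with B_1 = -1/2)."""
    numbers = [Fraction(1)]
    for m in range(1, kmax + 1):
        # Standard recurrence: sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1.
        partial = sum(math.comb(m + 1, j) * numbers[j] for j in range(m))
        numbers.append(-partial / (m + 1))
    return numbers


def zeta_even(m):
    """zeta(2m) = -(2 pi i)^(2m) B_{2m} / (2 (2m)!) for m >= 1."""
    b = bernoulli_numbers(2 * m)[2 * m]
    # -(2 pi i)^(2m) equals (-1)^(m+1) (2 pi)^(2m).
    sign = (-1) ** (m + 1)
    return sign * (2.0 * math.pi) ** (2 * m) * float(b) / (2 * math.factorial(2 * m))


if __name__ == "__main__":
    print(zeta_even(1), math.pi ** 2 / 6)
    print(zeta_even(2), math.pi ** 4 / 90)
```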

Integrating by parts $k - 1$ times in (4.7) and using (4.9) one derives the Euler-Maclaurin formula of order $k$.

THEOREM 4.2. Suppose $f \in C^k([a, b])$ with $a < b$, $a, b \in \mathbb{Z}$. Then

$$\sum_{a < n \le b} f(n) = \int_a^b \Big(f(x) - \frac{(-1)^k}{k!} \psi_k(x) f^{(k)}(x)\Big)\,dx + \sum_{\ell=1}^{k} \frac{(-1)^{\ell}}{\ell!} \big(f^{(\ell-1)}(b) - f^{(\ell-1)}(a)\big) B_{\ell}. \tag{4.20}$$

Very often the first order Euler-Maclaurin formula (4.7) is as good as (4.20). Inserting (4.18) into (4.7) and integrating by parts we get

COROLLARY 4.3. For $f \in C^1([a, b])$ with $a < b$, $a, b \in \mathbb{Z}$ we have

$$\sideset{}{'}\sum_{a \le n \le b} f(n) = \sum_{|n| \le N} \int_a^b f(x) e(nx)\,dx + O\Big(\int_a^b \frac{|f'(x)|}{1 + \|x\| N}\,dx\Big) \tag{4.21}$$

where $N$ is any positive integer and the implied constant is absolute. Here and hereafter the dash indicates that the terms at the end-points of summation are taken with half values.

4.3. The Poisson summation formula. This section is based on classical Fourier analysis as described in the Appendix. Recall that for any function $f \in L^1(\mathbb{R})$ its Fourier transform is defined by

$$\hat f(y) = \int_{\mathbb{R}} f(x) e(-xy)\,dx. \tag{4.22}$$

THEOREM 4.4. Suppose that both $f$, $\hat f$ are in $L^1(\mathbb{R})$ and have bounded variation. Then

$$\sum_{m \in \mathbb{Z}} f(m) = \sum_{n \in \mathbb{Z}} \hat f(n) \tag{4.23}$$

where both series converge absolutely.

PROOF. Consider the function

$$F(x) = \sum_{m \in \mathbb{Z}} f(x + m)$$

which is periodic of period one. This has the absolutely convergent Fourier series expansion

$$F(x) = \sum_{n \in \mathbb{Z}} c_F(n) e(nx)$$

with coefficients given by

$$c_F(n) = \int_0^1 F(t) e(-nt)\,dt = \int_{-\infty}^{\infty} f(t) e(-nt)\,dt = \hat f(n).$$

Taking $F(0)$ we get the Poisson summation formula (4.23). $\square$
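Theorem 4.4 can be checked numerically on a shifted and dilated Gaussian. The sketch below is an illustration under stated assumptions: it takes $f(x) = e^{-\pi (vx+u)^2}$, whose Fourier transform is $v^{-1} e^{-\pi y^2/v^2} e(uy/v)$ (a standard computation assumed here), and the function names are hypothetical.

```python
import cmath
import math


def gaussian(x):
    """The self-dual Gaussian exp(-pi x^2)."""
    return math.exp(-math.pi * x * x)


def poisson_lhs(v, u, terms=50):
    """Sum over |m| <= terms of f(m) with f(x) = gaussian(v x + u)."""
    return sum(gaussian(v * m + u) for m in range(-terms, terms + 1))


def poisson_rhs(v, u, terms=50):
    """Sum over |n| <= terms of the Fourier transform of f at n,
    namely gaussian(n / v) * e(u n / v) / v."""
    total = sum(
        gaussian(n / v) * cmath.exp(2j * math.pi * u * n / v)
        for n in range(-terms, terms + 1)
    )
    return total / v


if __name__ == "__main__":
    for v, u in [(1.0, 0.0), (0.7, 0.3), (2.5, -1.2)]:
        print(v, u, poisson_lhs(v, u), poisson_rhs(v, u))
```

The right side is computed as a complex number; its imaginary part should vanish up to rounding because the terms for $n$ and $-n$ are complex conjugates.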

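EXERCISE 4. Derive (4.23) from (4.21).

Changing $f(x)$ into $f(vx + u)$ with $v \in \mathbb{R}^+$ and $u \in \mathbb{R}$ the formula (4.23) generalizes to

$$\sum_{m \in \mathbb{Z}} f(vm + u) = \frac{1}{v} \sum_{n \in \mathbb{Z}} \hat f\Big(\frac{n}{v}\Big) e\Big(\frac{un}{v}\Big). \tag{4.24}$$

By the Fourier inversion $\hat{\hat f}(x) = f(-x)$ this can also be written as

$$\sum_{n \in \mathbb{Z}} f\Big(\frac{n}{v}\Big) e\Big(\frac{un}{v}\Big) = v \sum_{m \in \mathbb{Z}} \hat f(vm - u). \tag{4.25}$$

EXERCISE 5. Using (4.24) prove that for a primitive character $\chi \pmod{q}$,

$$\sum_{m \in \mathbb{Z}} f(m) \chi(m) = \frac{\tau(\chi)}{q} \sum_{n \in \mathbb{Z}} \hat f\Big(\frac{n}{q}\Big) \bar\chi(n) \tag{4.26}$$

where $\tau(\chi)$ is the Gauss sum.

EXAMPLES. From the Fourier pairs (4.83), (4.84), (4.85) we obtain the following equations:

$$\sum_{|n| \le y} \Big(1 - \frac{|n|}{y}\Big) e(nx) = y \sum_{m} \Big(\frac{\sin \pi y (m + x)}{\pi y (m + x)}\Big)^2, \tag{4.27}$$

(4.28)

(4.29)

for any $x \in \mathbb{R}$ and $y \in \mathbb{R}^+$. The third equation generalizes (4.1). The second equation has a geometric series on its left side, so by a direct summation

$$\sum_{n} e(nx) e^{-2\pi |n| y} = \frac{\sinh 2\pi y}{\cosh 2\pi y - \cos 2\pi x}.$$

The first sum can also be executed by using geometric series; it yields

$$\sum_{|n| \le y} \Big(1 - \frac{|n|}{y}\Big) e(nx) = \frac{1}{y} \Big(\frac{\sin \pi x [y]}{\sin \pi x}\Big)^2 + \frac{\{y\}}{y} \frac{\sin \pi x (2[y] + 1)}{\sin \pi x}.$$

If $y$ is a positive integer this reduces to $y^{-1} (\sin \pi x y / \sin \pi x)^2$.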
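The closed form for the weighted exponential sum above can be compared with a direct evaluation. This is only a sketch: the function names `fejer_sum` and `fejer_closed` are hypothetical, and $x$ is assumed not to be an integer so that $\sin \pi x \ne 0$.

```python
import cmath
import math


def fejer_sum(x, y):
    """Direct evaluation of the sum over |n| <= y of (1 - |n|/y) e(n x)."""
    top = math.floor(y)
    total = sum(
        (1.0 - abs(n) / y) * cmath.exp(2j * math.pi * n * x)
        for n in range(-top, top + 1)
    )
    return total.real  # the imaginary parts of the terms n and -n cancel


def fejer_closed(x, y):
    """Closed form in terms of [y] and {y}; requires x not an integer."""
    whole = math.floor(y)
    frac = y - whole
    s = math.sin(math.pi * x)
    square = (math.sin(math.pi * x * whole) / s) ** 2 / y
    dirichlet = frac * math.sin(math.pi * x * (2 * whole + 1)) / (y * s)
    return square + dirichlet


if __name__ == "__main__":
    for x, y in [(0.23, 7.4), (0.61, 5.0), (0.05, 12.9)]:
        print(x, y, fejer_sum(x, y), fejer_closed(x, y))
```

The first term of the closed form is a Fejér kernel and the second is a Dirichlet kernel weighted by the fractional part of $y$.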

The same method (averaging over integral translations) works in several variables giving

THEOREM 4.5. Suppose $f$ is in the Schwartz class $\mathcal{S}(\mathbb{R}^{\ell})$. Then

$$\sum_{m \in \mathbb{Z}^{\ell}} f(m) = \sum_{n \in \mathbb{Z}^{\ell}} \hat f(n). \tag{4.30}$$

The formulas (4.24) and (4.25) extend to several variables in a similar fashion. The assumption that $f$ is a Schwartz function can be weakened considerably. In two dimensions the Poisson formula (4.30) is just the trace formula for the Laplace operator and an invariant integral operator on the torus $\mathbb{R}^2/\mathbb{Z}^2$, see [14].

4.4. Summation formulas for the ball. If $f$ is radial, then so is $\hat f$ (see Lemma 4.17). In this case the Poisson formula (4.30) becomes a summation formula for the arithmetic function

$$r_{\ell}(m) = |\{m_1, \dots, m_{\ell} \in \mathbb{Z} : m_1^2 + \cdots + m_{\ell}^2 = m\}|.$$

Precisely, if $g(x)$ is smooth and compactly supported on $\mathbb{R}^+$, then

(4.31)

where $h(y)$ is the Hankel transform of $g(x)$ given by (4.103). Throughout we assume that $\ell = 2k$ is an integer $> 1$, so $k$ is a half integer.

REMARKS. The condition that g is compactly supported can be considerably weakened. It suffices that both g and h are of type (a, !3) with a < ~ + ~ < !3 (see (4.104)). To this end notice that r2k(m)

$\ll m^{k-1+\varepsilon}$ for $m \ge 1$, the implied constant depending on $\varepsilon$ and $k$ (if $k > 2$, then this holds without $\varepsilon$).

Although there is no contribution from $m = 0$ on the left side of (4.31) there is still a contribution from $n = 0$ on the right side of (4.31) because $h(y)$ is not compactly supported in $\mathbb{R}^+$ even if $g(x)$ is. In practice this contribution yields a main term, so it is desirable to single out the zero-th term. We have $r_{2k}(0) = 1$ and

$$h(y) \sim \frac{\pi^k}{\Gamma(k)} \int_0^{\infty} g(x) (xy)^{\frac{k-1}{2}}\,dx$$

as $y \to 0$ by the asymptotic $J_{\nu}(x) \sim x^{\nu} / 2^{\nu} \Gamma(\nu + 1)$. Hence (4.31) can be written as

THEOREM 4.6 (SUMMATION FORMULA FOR A BALL). Suppose $g$ is smooth and compactly supported on $\mathbb{R}^+$. Then

$$\sum_{m=1}^{\infty} r_{2k}(m) g(m) m^{\frac{1}{2} - \frac{k}{2}} = \frac{\pi^k}{\Gamma(k)} M(g) + \sum_{n=1}^{\infty} r_{2k}(n) h(n) n^{\frac{1}{2} - \frac{k}{2}}, \tag{4.32}$$

where $M(g)$ is the Mellin transform of $g$ at $s = \frac{k+1}{2}$, i.e.,

$$M(g) = \int_0^{\infty} g(x) x^{\frac{k-1}{2}}\,dx, \tag{4.33}$$

and $h(y)$ is the Hankel type transform of $g$

$$h(y) = \pi \int_0^{\infty} g(x) J_{k-1}(2\pi\sqrt{xy})\,dx. \tag{4.34}$$

COROLLARY 4.7 (SUMMATION FORMULA FOR A CIRCLE). Suppose $g$ is smooth and compactly supported on $\mathbb{R}^+$. Then

$$\sum_{m=1}^{\infty} r(m) g(m) = \pi \int_0^{\infty} g(x)\,dx + \sum_{m=1}^{\infty} r(m) h(m) \tag{4.35}$$

where

$$h(y) = \pi \int_0^{\infty} g(x) J_0(2\pi\sqrt{xy})\,dx, \tag{4.36}$$

and both series converge absolutely.

Summation formulas for the circle were established in one form or another by Hardy-Landau [HaLa] and Voronoi [Vor]. There are plenty of expressions for the Bessel function in the kernel of (4.36). Here we select a few particularly useful integral representations:

$$\pi J_0(z) = \int_0^{\pi} \cos(z \sin\theta)\,d\theta = 2 \int_1^{\infty} (t^2 - 1)^{-\frac{1}{2}} \sin(zt)\,dt = 2 \int_0^{\infty} \sin(z \operatorname{ch} t)\,dt = 2 \int_0^{\infty} \sin\Big(\frac{zw}{2}\Big) \cos\Big(\frac{z}{2w}\Big) \frac{dw}{w}.$$

Moreover, we have the following asymptotic expansion (see (23.451.1) of [GR])

$$\pi J_0(z) = \Big(\frac{2\pi}{z}\Big)^{\frac{1}{2}} \Big\{ \cos\Big(z - \frac{\pi}{4}\Big) + \frac{1}{8z} \sin\Big(z - \frac{\pi}{4}\Big) + O\Big(\frac{1}{z^2}\Big) \Big\} \tag{4.37}$$

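valid for $z > 0$. Inserting this into (4.36) we get

$$h(y) = \int_0^{\infty} (xy)^{-\frac{1}{4}} g(x) \cos\Big(2\pi\sqrt{xy} - \frac{\pi}{4}\Big)dx + \frac{1}{16\pi} \int_0^{\infty} (xy)^{-\frac{3}{4}} g(x) \sin\Big(2\pi\sqrt{xy} - \frac{\pi}{4}\Big)dx + O\Big(\int_0^{\infty} (xy)^{-\frac{5}{4}} |g(x)|\,dx\Big).$$

Integrating by parts in the second integral we obtain

$$h(y) = \int_0^{\infty} (xy)^{-\frac{1}{4}} g(x) \cos\Big(2\pi\sqrt{xy} - \frac{\pi}{4}\Big)dx + O(R(y)) \tag{4.38}$$

where

$$R(y) = \int_0^{\infty} (xy)^{-\frac{5}{4}} \big(|g(x)| + x|g'(x)|\big)\,dx. \tag{4.39}$$

Next integrating by parts in the first integral we obtain

$$h(y) = -\frac{1}{\pi y} \int_0^{\infty} (xy)^{\frac{1}{4}} g'(x) \sin\Big(2\pi\sqrt{xy} - \frac{\pi}{4}\Big)dx + O(R(y)). \tag{4.40}$$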

In the case of the sphere ($\ell = 3$, $k = \frac{3}{2}$) we encounter the Bessel function of order $\frac{1}{2}$, which is an elementary function, namely $J_{\frac{1}{2}}(z) = (2/\pi z)^{\frac{1}{2}} \sin z$, so (4.34) simplifies to

$$h(y) = \int_0^{\infty} (xy)^{-\frac{1}{4}} g(x) \sin(2\pi\sqrt{xy})\,dx.$$

Setting $G(x) = g(x) x^{-\frac{1}{4}}$ and $H(y) = h(y) y^{\frac{1}{4}}$ the formula (4.32) with $k = \frac{3}{2}$ becomes

COROLLARY 4.8 (SUMMATION FORMULA FOR A SPHERE). Suppose $G$ is smooth and compactly supported on $\mathbb{R}^+$. Then

$$\sum_{m=1}^{\infty} r_3(m) G(m) = 2\pi \int_0^{\infty} x^{\frac{1}{2}} G(x)\,dx + \sum_{n=1}^{\infty} r_3(n) n^{-\frac{1}{2}} H(n) \tag{4.41}$$

where

$$H(y) = \int_0^{\infty} G(x) \sin(2\pi\sqrt{xy})\,dx, \tag{4.42}$$

and both series converge absolutely.

To illustrate the results we apply the summation formula (4.35) to improve the error term in the Gauss circle problem (see (1.70)).

COROLLARY 4.9. We have

$$\sum_{m \le X} r(m) = \pi X + O(X^{\frac{1}{3}}). \tag{4.43}$$

PROOF. We shall establish an upper bound for the sum in question; the lower bound can be derived by applying similar arguments. In what follows it is crucial that $r(n)$ is nonnegative, so we can smooth out the sum by enlarging its range. To this end we choose the test function $g(x) \ge 0$ supported on $0 \le x \le X + Y$, in which segment $g(x) = \min\{x, 1, (X + Y - x) Y^{-1}\}$. This is not smooth, nevertheless (4.35) holds true. Here $Y$ is a parameter at our disposal to be chosen later to minimize the resulting upper bound. We assume that $1 \le Y \le X^{\frac{1}{2}}$. By (4.35) we obtain

$$\sum_{m \le X} r(m) \le \sum_{m} r(m) g(m) = \pi \Big(X + \frac{Y - 1}{2}\Big) + \sum_{n} r(n) h(n),$$

where $h(n)$ is the integral transform of $g$ given by (4.36). Using (4.40) and (4.39) one shows that

(4.44)

where $Z = X Y^{-2}$. Hence

$$\sum_{n=1}^{\infty} r(n) h(n) = \sum\sum h(a^2 + b^2) \ll (XZ)^{\frac{1}{4}} = (X/Y)^{\frac{1}{2}}.$$

Therefore

$$\sum_{m \le X} r(m) \le \pi X + O\big(Y + X^{\frac{1}{2}} Y^{-\frac{1}{2}}\big).$$

Choosing $Y = X^{\frac{1}{3}}$ we balance the error terms to $O(X^{\frac{1}{3}})$. By a similar argument we establish a lower bound with the same main and error terms. This completes the proof of (4.43). $\square$
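The size of the error in (4.43) can be explored by direct lattice point counting. The following sketch is illustrative only: the function name `circle_count` is hypothetical, and the sum is assumed to run over $1 \le m \le X$, so the origin is excluded.

```python
import math


def circle_count(X):
    """Sum of r(m) over 1 <= m <= X, i.e. the number of integer pairs (a, b),
    not both zero, with a^2 + b^2 <= X."""
    radius = math.isqrt(X)
    total = 0
    for a in range(-radius, radius + 1):
        # For fixed a, b ranges over |b| <= floor(sqrt(X - a^2)).
        total += 2 * math.isqrt(X - a * a) + 1
    return total - 1  # remove the origin, which corresponds to m = 0


if __name__ == "__main__":
    for X in (10, 100, 1000, 10 ** 4, 10 ** 5):
        error = circle_count(X) - math.pi * X
        print(X, circle_count(X), error, X ** (1.0 / 3.0))
```

Printing the error next to $X^{1/3}$ gives a rough empirical picture of the bound in Corollary 4.9; it does not, of course, prove anything about the implied constant.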



EXERCISE

6. Prove that

L

r(m) 1 - -

(4.45)

m:(x

7r 1·(n) 1!' , +~ L ~h(27rvfnX) = -x + O(x-:i). ( m) = -x 2 2 X

1

7!'1!

Note the error term is very small due to smoothing. Similarly to (4.43) one can derive from (4.41) the following approximate formula:

$$\sum_{m \le X} r_3(m) = \frac{4\pi}{3} X^{\frac{3}{2}} + O(X^{\frac{3}{4}}). \tag{4.46}$$

The strongest error term here was obtained by Chamizo-Iwaniec [ChI] and Heath-Brown [HB5]. Interestingly enough in either paper character sums were brought into play independently; they obtained the exponents $\frac{29}{44}$ and $\frac{21}{32}$ in place of $\frac{3}{4}$ respectively.

4.5. Summation formulas for the hyperbola.

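First we establish summation formulas for the divisor function $\tau(n)$. They are essentially due to Voronoi [Vor] except for our arrangements of the integral transforms.

THEOREM 4.10. Suppose $g(x)$ is smooth and compactly supported on $\mathbb{R}^+$. Let $ad \equiv 1 \pmod{c}$. Then

$$\sum_{m=1}^{\infty} \tau(m) g(m) \cos\Big(\frac{2\pi a m}{c}\Big) = \frac{1}{c} \int_0^{\infty} \Big(\log\frac{x}{c^2} + 2\gamma\Big) g(x)\,dx + \frac{1}{c} \sum_{n=1}^{\infty} \tau(n) p\Big(\frac{n}{c^2}\Big) \cos\Big(\frac{2\pi d n}{c}\Big), \tag{4.47}$$

$$\sum_{m=1}^{\infty} \tau(m) g(m) \sin\Big(\frac{2\pi a m}{c}\Big) = \frac{1}{c} \sum_{n=1}^{\infty} \tau(n) q\Big(\frac{n}{c^2}\Big) \sin\Big(\frac{2\pi d n}{c}\Big), \tag{4.48}$$

where $p(y)$ and $q(y)$ are the integral transforms

$$p(y) = \int_0^{\infty} C(2\pi\sqrt{xy}) g(x)\,dx, \qquad q(y) = \int_0^{\infty} S(2\pi\sqrt{xy}) g(x)\,dx,$$

with kernels

$$C(z) = 4 \int_0^{\infty} \cos(zw) \cos\Big(\frac{z}{w}\Big) \frac{dw}{w}, \qquad S(z) = 4 \int_0^{\infty} \sin(zw) \sin\Big(\frac{z}{w}\Big) \frac{dw}{w}.$$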

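These kernels are also given by (see (3.864.1) and (3.864.2) of [GR])

$$C(z) = 4 K_0(2z) - 2\pi Y_0(2z), \qquad S(z) = 4 K_0(2z) + 2\pi Y_0(2z);$$

therefore both results can also be written in the following single formula:

$$\sum_{m=1}^{\infty} \tau(m) e\Big(\frac{am}{c}\Big) g(m) = \frac{1}{c} \int_0^{\infty} (\log x + 2\gamma - 2\log c) g(x)\,dx - \frac{2\pi}{c} \sum_{n=1}^{\infty} \tau(n) e\Big(\frac{-dn}{c}\Big) \int_0^{\infty} Y_0\Big(\frac{4\pi}{c}\sqrt{nx}\Big) g(x)\,dx + \frac{4}{c} \sum_{n=1}^{\infty} \tau(n) e\Big(\frac{dn}{c}\Big) \int_0^{\infty} K_0\Big(\frac{4\pi}{c}\sqrt{nx}\Big) g(x)\,dx. \tag{4.49}$$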

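We shall derive (4.49) from a still more general summation formula for the tempered divisor function

$$\tau_g(m) = \sum_{m_1 m_2 = m} g(m_1, m_2). \tag{4.50}$$

Here $g(x, y)$ is a function of class $C^1$, compactly supported on $\mathbb{R}^2$. By the way, the arithmetic function $\tau_g(m)$ is a prototype of Fourier coefficients of automorphic forms in the space of continuous spectrum.

PROPOSITION 4.11. Suppose $g$ is a compactly supported function of class $C^1$ on $\mathbb{R}^2$. Let $ad \equiv 1 \pmod{c}$. Then

$$\sum_{m \in \mathbb{Z}} \tau_g(m) e\Big(\frac{am}{c}\Big) = \sum_{n \in \mathbb{Z}} \tau_h(n) e\Big(-\frac{dn}{c}\Big) \tag{4.51}$$

where $h$ is given by the Fourier transform of $g$, namely

$$h(x, y) = \frac{1}{c}\,\hat g\Big(\frac{x}{c}, \frac{y}{c}\Big). \tag{4.52}$$

For $n = 0$ we have

$$\tau_h(0) = \iint_{\mathbb{R}^2} \Big(\frac{1}{c} + \Big\{\frac{x}{c}\Big\} \frac{\partial}{\partial x} + \Big\{\frac{y}{c}\Big\} \frac{\partial}{\partial y}\Big) g(x, y)\,dx\,dy \tag{4.53}$$

$$= -\iint_{\mathbb{R}^2} \Big(\frac{1}{c} + \Big[\frac{x}{c}\Big] \frac{\partial}{\partial x} + \Big[\frac{y}{c}\Big] \frac{\partial}{\partial y}\Big) g(x, y)\,dx\,dy. \tag{4.54}$$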

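PROOF. First we open $\tau_g(m)$ on the left side of (4.51) as in (4.50) and we split the summation into residue classes $(m_1, m_2) \equiv (u_1, u_2) \pmod{c}$. Then in each class we apply the Poisson summation formula (4.24) getting

$$\sum_{m} \tau_g(m) e\Big(\frac{am}{c}\Big) = \mathop{\sum\sum}_{u_1, u_2 \,(\mathrm{mod}\ c)} e\Big(\frac{a u_1 u_2}{c}\Big) \sum_{v_1} \sum_{v_2} g(u_1 + c v_1, u_2 + c v_2)$$

$$= c^{-2} \sum_{n_1} \sum_{n_2} \mathop{\sum\sum}_{u_1, u_2 \,(\mathrm{mod}\ c)} e_c(a u_1 u_2 + n_1 u_1 + n_2 u_2)\, \hat g\Big(\frac{n_1}{c}, \frac{n_2}{c}\Big).$$

The sum over $u_2$ vanishes unless $a u_1 \equiv -n_2 \pmod{c}$, that is $u_1 \equiv -d n_2 \pmod{c}$, in which case it equals $c$; hence the complete sum over $u_1, u_2$ equals $c\, e_c(-d n_1 n_2)$, and this gives (4.51).

The term $n = 0$ is special since there are infinitely many divisors of zero. We compute $\tau_h(0)$ by reversing the above analysis (of course, this analysis could be avoided altogether, nevertheless, for the economy of presentation sometimes a repeated application of an involution is justified). First we arrange the summation over $n_1, n_2$ as follows:

$$\tau_h(0) = \frac{1}{c} \sum_{n} \hat g\Big(0, \frac{n}{c}\Big) + \frac{1}{c} \sum_{n} \hat g\Big(\frac{n}{c}, 0\Big) - \frac{1}{c}\,\hat g(0, 0).$$

Then we apply Poisson's summation and the Euler-Maclaurin formula (4.7) getting

$$\frac{1}{c} \sum_{n} \hat g\Big(0, \frac{n}{c}\Big) = \int \Big(\sum_{m} g(x, cm)\Big)dx = \iint \Big(\frac{1}{c} + \Big\{\frac{y}{c}\Big\} \frac{\partial}{\partial y}\Big) g(x, y)\,dx\,dy.$$

Similarly we execute the summation of $\hat g(\frac{n}{c}, 0)$. Gathering the results we obtain (4.53) and (4.54). $\square$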

PROOF OF (4.49). One would like to apply (4.51) directly for $g(x, y) = g(xy)$, but this is not possible because such a function does not have compact support in $\mathbb{R}^2$ even if $g(t)$ does have compact support in $\mathbb{R}$. For this reason we attach to $g(m_1 m_2)$ redundant factors $\eta(m_1)$, $\eta(m_2)$ to localize the real variables $x, y$ in compact segments of $\mathbb{R}^+$. Precisely we set $g(x, y) = \eta(x)\eta(y) g(xy)$ where $\eta(t) = \min\{t/\varepsilon, 1\}$ with $0 < \varepsilon < 1$ so that $\tau_g(m) = \tau(m) g(m)$. Note that $\eta'(t)$ has a discontinuity at $t = \varepsilon$, but this is allowed for application of Proposition 4.11 as can be seen from the proof by inspection (we could choose $\eta(t)$ smooth, however, the involved computations would have been less explicit). For the above choice we have by (4.53)

$$\tau_h(0) = \iint \Big(\frac{1}{c} + \Big\{\frac{x}{c}\Big\} \frac{\partial}{\partial x} + \Big\{\frac{y}{c}\Big\} \frac{\partial}{\partial y}\Big) \eta(x)\eta(y) g(xy)\,dx\,dy = \frac{1}{c} \int g(y) L(y)\,dy,$$

where

L(y)

=

(y) + 2c {X} 0 TJ(X) (y)) ~ ax -x-TJ ~ dx.

TJ(X) ---;;-TJ ;: I (

Since y is bounded from below and above, and c is small, we find by examining the support of the involved functions the following representation:


The last integral ranges over x > c- 1 y, so it is bounded trivially by O(Ecjy). Then we have 00

1 0

(Y)

dx= 2 TJ(X)TJ- X X

Joroo ~ {X} -;: TJ(X)TJ (~y) 7dx C

1.,fij TJ(x)-=2+logy-2logE, dx 0

X

= 1+

}, ~ -;:

roo {X} TJ (~y) 7dx C

00

=1+

1
1=

( y ) dx {x}TJ2X =1+ CX

cjc

dx +0(-). E:C {x} 2 X y

The part of the last integral with ~ < x < 1 equals log ~ and the remaining part equals 1-1, where 1 is the Euler constant (see (1.69)). From these evaluations we get

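$$L(y) = \log y + 2\gamma - 2\log c + O\Big(\frac{\varepsilon c}{y}\Big).$$

Hence

$$\tau_h(0) = \frac{1}{c} \int g(y) (\log y + 2\gamma - 2\log c)\,dy + O(\varepsilon). \tag{4.55}$$

As $\varepsilon$ tends to zero this gives the leading term on the right side of (4.49). Now we compute $\tau_h(n)$ for $n \ne 0$. Since the factorization $n_1 n_2 = n$ permits the sign change of both $n_1, n_2$ we can replace $\hat g$ in (4.52) by the cosine-Fourier transform, i.e., we may take

$$h(u, v) = \frac{2}{c} \iint \eta(x)\eta(y) g(xy) \cos\Big(\frac{2\pi}{c}(ux + vy)\Big)dx\,dy.$$

Changing variables we write this as

$$h(u, v) = \frac{2}{c} \int g(y) \Big(\int \eta(x)\,\eta\Big(\frac{y}{x}\Big) \cos\Big(\frac{2\pi}{c}\Big(ux + \frac{vy}{x}\Big)\Big) \frac{dx}{x}\Big)dy.$$

If we omit $\eta(x)\eta(y/x)$, then the pure cosine integral equals

$$-\pi Y_0\Big(\frac{4\pi}{c}\sqrt{uvy}\Big) \ \text{ if } uv > 0, \qquad 2 K_0\Big(\frac{4\pi}{c}\sqrt{|uv| y}\Big) \ \text{ if } uv < 0$$

(see (3.871.2) and (3.871.4) of [GR], respectively, or our Appendix). This yields exactly the remaining terms on the right side of (4.49). Now we only need to show that the distortion introduced by $\eta(x)\eta(y/x)$ is negligible. We denote the difference by $h_0(u, v) = h_1(u, v) + h_1(v, u)$, where

$$h_1(u, v) = -\frac{2}{c} \int_0^{\varepsilon} \Big(1 - \frac{x}{\varepsilon}\Big) \int_0^{\infty} g(xy) \cos\Big(\frac{2\pi}{c}(ux + vy)\Big)dy\,dx.$$

Integrating by parts twice with respect to $y$ and then applying Fubini's theorem we get

$$h_1(u, v) = \frac{c}{2\pi^2 v^2} \int_0^{\infty} \int_0^{\varepsilon} \Big(1 - \frac{x}{\varepsilon}\Big) x^2 g''(xy) \cos\Big(\frac{2\pi}{c}(ux + vy)\Big)dx\,dy.$$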


Now integrating by parts twice with respect to $x$ we estimate the inner integral by $O(\varepsilon u^{-2})$. The inner integral is also estimated trivially by

$$\int_0^{\infty} x^2 |g''(xy)|\,dx \ll y^{-3}.$$

Combining both estimates we arrive at

$$h_1(u, v) \ll v^{-2} \int_0^{\infty} \min(\varepsilon u^{-2}, y^{-3})\,dy = \tfrac{3}{2}\, \varepsilon^{\frac{2}{3}} |u|^{-\frac{4}{3}} v^{-2}.$$

Next, adding the corresponding bound for $h_1(v, u)$ we obtain $h_0(u, v) \ll \varepsilon^{\frac{2}{3}} |uv|^{-\frac{4}{3}}$. Therefore the total distortion of the right side of (4.51), which was caused by introduction of the localizing factors $\eta(x)\eta(y)$ to the left side, is bounded by $O(\varepsilon^{2/3})$ since the series $\sum \tau(n) n^{-4/3}$ converges. Letting $\varepsilon$ tend to zero the distortion vanishes and we complete the proof of (4.49), hence also of Theorem 4.10. $\square$

The summation formula (4.51) for the tempered divisor function (4.50) was established in [DI]. The special case (4.49) is less flexible, but it is quite effective when applicable (a different, rather indirect, proof of (4.49) was given by M. Jutila [Ju1]). We proceed to compose some variations on (4.49). First, we get a summation formula for certain series of Kloosterman sums: if $(c, q) = 1$, then

$$\sum_{m \ge 1} \tau(m) S(h, m; c) g(m) = \frac{2}{c} S(h, 0; c) \int_0^{\infty} \Big(\log\frac{\sqrt{x}}{c} + \gamma\Big) g(x)\,dx - \frac{2\pi}{c} \sum_{n \ge 1} \tau(n) S(h - n, 0; c) \int_0^{\infty} Y_0\Big(\frac{4\pi}{c}\sqrt{nx}\Big) g(x)\,dx + \frac{4}{c} \sum_{n \ge 1} \tau(n) S(h + n, 0; c) \int_0^{\infty} K_0\Big(\frac{4\pi}{c}\sqrt{nx}\Big) g(x)\,dx \tag{4.56}$$

for any integer $h$. To prove this, we open the Kloosterman sum (1.56)

$$S(h, m; c) = \sideset{}{^*}\sum_{a \,(\mathrm{mod}\ c)} e\Big(\frac{h\bar a + ma}{c}\Big),$$

then for each $a$, we apply (4.49) and then exchange the order of summation again. Note that the Kloosterman sums on the left side collapse to Ramanujan sums on the right side. This feature is significant in applications, see e.g. [DFI2], [KM1], [KMV1].

Next, playing with additive characters we derive from (4.49) a summation formula for the divisor function in an arithmetic progression.

COROLLARY 4.12. Let $(q, r) = 1$, and $g$ be smooth, compactly supported on $\mathbb{R}^+$. Then (4.57)

L

1

00

r(m)g(m) =
m;or(mod q)

+

(logx

+ 2')'- 2TJ(q))g(x)dx

0

~ L ~ LT(n) { S(r,n; c)T+(~) + S(r, -n;c)T- (~)}

1

cq

n



where $\eta(q)$ is the additive function

$$\eta(q) = \sum_{p \mid q} \frac{\log p}{p - 1}, \tag{4.58}$$

$S(r, n; c)$ is the Kloosterman sum, and $T^+(y)$, $T^-(y)$ are the integral transforms

$$T^+(y) = -2\pi \int_0^{\infty} Y_0(4\pi\sqrt{xy}) g(x)\,dx, \tag{4.59}$$

$$T^-(y) = 4 \int_0^{\infty} K_0(4\pi\sqrt{xy}) g(x)\,dx. \tag{4.60}$$

REMARKS. Since Y0 (z) has the asymptotic expansion (4.37) but with cos and sin interchanged (see (23.451.2) of [GR]), we get the approximate formula

r+(y) =

2

r=(xy)tg'(x)cos(4KJXY-l!._)dx+O(R(y)) 4 by the same argument which led us to (4.40), where R(y) is given by (4.39). Simi-

"Ylo

larly one derives

r-(y)

21

=-

1ry

00

o

4 1 (xy)"g'(x)e"vxYdx

+ O(R(y)).

EXERCISE 7. Using the Weil bound for Kloosterman sums (11.16) and the test function $g(x)$ as in the proof of (4.43), derive from (4.57) that for $(q, r) = 1$ and $X \ge 2$,

(4.61)

L

m(X m:=r (mod q)

where the implied constant is absolute. In particular,

$$\sum_{m \le X} \tau(m) = X(\log X + 2\gamma - 1) + O(X^{\frac{1}{3}} \log X). \tag{4.62}$$
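The main term in (4.62) is easy to compare with exact divisor sums. The sketch below is illustrative: the names `divisor_sum` and `divisor_main_term` are hypothetical, and the numerical value of Euler's constant is hard-coded as an assumption of the example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant gamma, hard-coded


def divisor_sum(X):
    """Sum of tau(m) over 1 <= m <= X, counted as pairs (d, e) with d * e <= X."""
    return sum(X // d for d in range(1, X + 1))


def divisor_main_term(X):
    """Main term X (log X + 2 gamma - 1) from (4.62)."""
    return X * (math.log(X) + 2.0 * EULER_GAMMA - 1.0)


if __name__ == "__main__":
    for X in (10, 100, 1000, 10 ** 4, 10 ** 5):
        error = divisor_sum(X) - divisor_main_term(X)
        print(X, divisor_sum(X), error, X ** (1.0 / 3.0) * math.log(X))
```

The identity $\sum_{m \le X} \tau(m) = \sum_{d \le X} [X/d]$ used in `divisor_sum` is the usual lattice point count under the hyperbola.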

This improves the error term in the Dirichlet divisor problem (1.75).

EXERCISE 8. Prove that for any primitive character $\chi \pmod{q}$ with $q > 1$, and any smooth function $g(x)$ compactly supported on $\mathbb{R}^+$ we have

$$\sum_{m} \tau(m) \chi(m) g(m) = \tau^2(\chi)\, q^{-2} \sum_{n} \tau(n) \bar\chi(n) h(n q^{-2}) \tag{4.63}$$

where $\tau(\chi)$ is the Gauss sum, and $h(y)$ is the integral transform

$$h(y) = \int_0^{\infty} K(2\pi\sqrt{xy}) g(x)\,dx \tag{4.64}$$

with kernel

$$K(z) = 4\chi(-1) K_0(2z) - 2\pi Y_0(2z). \tag{4.65}$$

[Hint: Expand $\chi$ into additive characters (see (3.12)).]

Next we derive a summation formula for

$$\tau_{\nu}(n, \chi) = \sum_{n_1 n_2 = n} \chi(n_1) \Big(\frac{n_1}{n_2}\Big)^{\nu} \tag{4.66}$$

where $\chi \pmod{q}$ is a primitive character of conductor $q > 1$ and $\nu$ is a fixed complex number. These are eigenvalues of the Hecke operators $T_n$ in the space of Eisenstein series for $\Gamma_0(q)$ and character $\chi$. In principle the method that we used for $\tau(n)$ applies for $\tau_{\nu}(n, \chi)$, though some arithmetical arguments need to be revised and a few modifications in integral transforms are required. However, the analysis of convergence is the same, so here we do not repeat it. We treat only the two extremal cases $q \mid c$ and $(q, c) = 1$. See also [KMV1] for other similar formulas.

THEOREM 4.13. Let $ad \equiv 1 \pmod{c}$ and $\chi$ be a primitive character of conductor $q > 1$ with $q \mid c$. Then for any smooth, compactly supported function $g$ on $\mathbb{R}^+$ we have

r= g(x)xvdx

(am)

00

x(d) (q)2v LTv(m,x)e---;;- g(m)=-c~ T(X)L(1+2v,x)J

0

I

(4.67)

where $\tau(\chi)$ is the Gauss sum, $L(1 + 2\nu, \bar\chi)$ is the Dirichlet $L$-function, and

J~(z) = ~(hv(z)1-2v(z)), ifx(-1) = Slll7[V J2~(z)

Jri

= --(hv(z) + 1-2v(z)), ifx(-1) = cos 1[1/ K~(z) = 4cos(7rv)K2v(z),

1, -1,

= 1, Kiv(z) = 4isin(7rv)K2v(z), ifx(-1) = -1. ifx(-1)

REMARK. For $\nu = 0$ we have

$$\tau(n, \chi) = \sum_{d \mid n} \chi(d). \tag{4.68}$$

If $\chi$ is odd (the case of modular forms of weight one), our formula reduces to

f I

r= g(x)dx

T(m, x)e(am)g(m) = x(d) T(X)L(1, X) c c Jo

r= g(x)J (4"~..;nidx. )

.x(d) ~ ( -~ dn) Jo +2Jrt-c-~T(n,x)e

0


PROOF. On the left side of (4.67) we split the summation into residue classes $(m_1, m_2) \equiv (u_1, u_2) \pmod{c}$ and for each class we apply the Poisson formula (4.24) getting

where $I(n_1, n_2)$ is the Fourier integral

$$I(n_1, n_2) = \int_0^{\infty} \int_0^{\infty} \Big(\frac{x}{y}\Big)^{\nu} g(xy)\, e_c(x n_1 + y n_2)\,dx\,dy.$$

The sum over $u_2 \pmod{c}$ vanishes unless $a u_1 \equiv n_2 \pmod{c}$ in which case it equals $c$. Therefore we showed that the left side of (4.67) is equal to

Note that $\chi(a) = \bar\chi(d)$. From $n_1 n_2 = 0$ we get

The last integral is equal to $2\Gamma(-2\nu)\cos\pi\nu$ or $-2i\,\Gamma(-2\nu)\sin\pi\nu$ according to $\chi(-1) = 1$ or $\chi(-1) = -1$ by (4.108) and (4.109) respectively. Here we transform $L(-2\nu, \chi)$ into $L(1 + 2\nu, \bar\chi)$ by the functional equation

L( -2v, x) = T(X)

Vir

(i) 7T

2

L(1 + 2v,

v ·

x){ f(~ +

v)/r( -v),

-zf(1+v)/f(~-v),

ifx(-1)=1, ifx(-1)=-1,

see (4.73). In the case of even character we encounter 2f(!2 + v) f(-2v)cos?Tv = 2- 2 vy'7r f(-v) and in the case of odd character we encounter -

-2f(1 1

+ v) f( -2v) sin 7TIJ

f(2 +v)

= 2- 2 vy'7r.

Hence we conclude that in both cases the terms with $n_1 n_2 = 0$ contribute the same amount which is exactly the first term on the right side of (4.67). From $n_1 n_2 \ne 0$ we get


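Changing the variables $(x, y) \to (u\sqrt{v n_2/n_1},\ u^{-1}\sqrt{v n_1/n_2})$ we get

$$I(\pm n_1, \pm n_2) = \Big(\frac{n_2}{n_1}\Big)^{\nu} \int_0^{\infty} g(v) \Big(\int_0^{\infty} e_c\big(\sqrt{v n_1 n_2}\,(\pm u \pm u^{-1})\big) u^{2\nu - 1}\,du\Big)dv.$$

Hence the remaining terms on the right side of (4.67) emerge by applying (4.112)-(4.115). $\square$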

REMARK. In the proof of Theorem 4.13 we assumed that $-\nu$ is a small positive number, but the result extends to all complex $\nu$ by analytic continuation.

THEOREM 4.14. Let $ad \equiv 1 \pmod{c}$ and $\chi \pmod{q}$ be a primitive character of conductor $q > 1$ with $(c, q) = 1$. Then for any smooth, compactly supported function $g$ on $\mathbb{R}^+$ we have

+ x(c) T(X)qv- 1 c

where qq

=1

f

Lv(n, )()e (aqn) c

1

1= 0

4 g(x)Ki;,( 1rvfnX)dx cy'q

(mod c), and J~, K! are the same as in Theorem 4.13.

REMARK. For v

0 and X odd our formula reduces to

=

(4.70)

~ L., T(m, x)e (am) ---;;-- g(m)

=

x(c) -c-L(1, X)

1=

g(x)dx

0

1

X( c) T(X) ~ - 27r--L., T(n, )()e ( -aqn) - 1oc g(x)Jo (47rvfnX) - - dx. c q r cy'q 0

1

PROOF. On the left side of (4.69) we split the summation into classes $m_1 \equiv u_1 \pmod{cq}$, $m_2 \equiv u_2 \pmod{c}$ and for each class we apply the Poisson formula (4.24) getting

$$\frac{1}{c^2 q} \sum_{n_1} \sum_{n_2} \sum_{u_1 \,(\mathrm{mod}\ cq)} \sum_{u_2 \,(\mathrm{mod}\ c)} \chi(u_1)\, e_c\Big(a u_1 u_2 - \frac{n_1}{q} u_1 - n_2 u_2\Big) I_{\nu}\Big(\frac{n_1}{q}, n_2\Big)$$

where $I_{\nu}(z_1, z_2)$ is the same integral as in the proof of Theorem 4.13. Note that

The sum over $u_2 \pmod{c}$ vanishes unless $a u_1 \equiv n_2 \pmod{c}$ in which case it equals $c$. Putting $u_1 = d n_2 q\bar q + ct$, where $t$ ranges modulo $q$ we obtain

LL u1

uz

=

ce(- ;n1n2)x(-cn1)T(X).


Hence the left side of ( 4.69) is equal to

This looks very similar to the sum we have already dealt with in the proof of Theorem 4.13. Therefore (4.69) is derived from (4.67) by making the substitution (x, v, d) --t -v, rlq) in appropriate places. 0

ex,

Theorems 4.6, 4.10, 4.13 and 4.14 are special cases of summation formulas for the Fourier coefficients of modular forms. Actually, these results confirm in themselves the modularity of the relevant forms. As a matter of fact we were able to establish such modular relations directly by applying the two-dimensional Poisson summation only because in each case our arithmetic function appears in the Fourier coefficients of a $GL_2$ form which is lifted from $GL_1$ forms, i.e., the corresponding $L$-function factors into two Dirichlet $L$-functions. For genuine cusp form coefficients one cannot apply the ordinary Poisson summation formula. If one knows that a Fourier series

$$f(z) = \sum_{n} \lambda_f(n) n^{\frac{k-1}{2}} e(nz)$$

is a modular form, say $f \in S_k(q, \chi)$ (see Chapter 14 for notation and survey of holomorphic modular forms), then the corresponding summation formula for the coefficients $\lambda_f(n)$ would be just another expression for the equation

$$f\Big(\frac{az + b}{cz + d}\Big) = \chi(d)(cz + d)^k f(z).$$

EXERCISE 9. Suppose $f$ is a cusp form of weight $k \ge 1$, level $q \ge 1$ and character $\chi$ modulo $q$. Let $c \ge 1$, $c \equiv 0 \pmod{q}$ and $ad \equiv 1 \pmod{c}$. Prove that the Fourier coefficients of $f$ satisfy

(4.71)

for any smooth, compactly supported function $g$ on $\mathbb{R}^+$, where

$$h(y) = 2\pi i^k \int_0^{\infty} g(x) J_{k-1}\Big(\frac{4\pi}{c}\sqrt{xy}\Big)dx.$$

[Hint.] Apply the modular equation for z = -~ + -!y and integrate the resulting Fourier series on both sides with respect toy against a suitable test function G(y). If f is primitive (see Section 14.7), so that it satisfies the Fricke involution equation

!(~:) =ry1 q~zk](z) with

ITJJI = 1 (Proposition 14.14), then we also have 00

(4.72)

L AJ(m)g(m) = rJtq- 112 L:\J(n)h(n) 1



where $h$ is the same Hankel-type integral transform of $g$ as above with $c = \sqrt{q}$ (apply (4.71) for

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 0 & -1/\sqrt{q} \\ \sqrt{q} & 0 \end{pmatrix}$$

and change $\chi(d)$ into $\eta_f$ to get (4.72)).

In modern analytic number theory there is a great demand for summation formulas for arithmetic functions of type $r(m) = \alpha(m)\beta(m + h)$ where $\alpha(m)$, $\beta(n)$ are essentially coefficients of modular forms and $h$ is a fixed integer (but uniformity in $h$ is crucial). Problems of this kind are often encountered when evaluating power-moments of families of $L$-functions on the critical line. See, for instance, [DFI1], [Sa1], [M1].

4.6. Functional equations of Dirichlet L-functions. A summation formula for arithmetic functions often has its realization as a functional equation for the corresponding generating Dirichlet series. Conversely, a functional equation connecting two Dirichlet series leads (by way of taking the inverse Mellin transform) to a summation formula for the coefficient sequences. In this section we establish this correspondence in full detail for the Dirichlet $L$-functions (3.6) using the original method of Riemann. These ideas were developed further by E. Hecke [Hec1] for $L$-functions of number fields and for automorphic $L$-functions, and later put in the general adelic setting in Tate's Thesis [Ta1]. Furthermore the connection between functional equations for twisted $L$-functions and modularity via summation formulas is the essence of the so-called converse theorems in the theory of automorphic forms in general. There are delicate issues of compatibility which we do not discuss here; see for example the Weil converse theorem for $GL_2$ forms in Theorem 14.21, or [14].

Let $q \ge 1$ and let $\chi$ be a primitive character modulo $q$. We let $\delta(\chi) = 0$ if $q \ne 1$ and $\delta(\chi) = 1$ if $q = 1$, in which case $\chi = 1$ and $L(s, \chi) = \zeta(s)$. Also we put $\kappa = \frac{1}{2}(1 - \chi(-1))$, so $\kappa = 0$ if $\chi$ is even and $\kappa = 1$ if $\chi$ is odd.

THEOREM 4.15. With the above notation, $L(s, \chi)$ extends to a meromorphic function on $\mathbb{C}$ which is entire if $\chi \ne 1$ and otherwise admits a unique simple pole with residue 1 at $s = 1$. The completed $L$-function

$$\Lambda(s, \chi) = \Big(\frac{q}{\pi}\Big)^{s/2} \Gamma\Big(\frac{s + \kappa}{2}\Big) L(s, \chi)$$

is entire if $q \ne 1$ and has simple poles with residue 1 at $s = 0$ and $s = 1$ otherwise. Moreover, it satisfies the functional equation

$$\Lambda(s, \chi) = \varepsilon(\chi) \Lambda(1 - s, \bar\chi) \tag{4.73}$$

where

$$\varepsilon(\chi) = i^{-\kappa}\, \frac{\tau(\chi)}{\sqrt{q}}. \tag{4.74}$$

Recall that the Gauss sum $\tau(\chi)$ is of modulus $\sqrt{q}$ (see (3.10), (3.14)) for $\chi$ primitive, so that the "root number" $\varepsilon(\chi)$ is of modulus 1. Theorem 3.3 of Gauss shows that $\varepsilon(\chi) = 1$ for any real primitive character $\chi$.
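The statements about the modulus of $\tau(\chi)$ and the root number of a real character can be observed numerically for Legendre symbols. The sketch below is illustrative only: it assumes the normalization $\tau(\chi) = \sum_{a \,(\mathrm{mod}\ q)} \chi(a) e(a/q)$ for the Gauss sum, takes $q$ to be an odd prime, and uses hypothetical function names.

```python
import cmath
import math


def legendre(a, q):
    """Legendre symbol (a / q) for an odd prime q, via Euler's criterion."""
    a %= q
    if a == 0:
        return 0
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1


def gauss_sum(q):
    """tau(chi) = sum over a mod q of chi(a) e(a / q), chi the Legendre symbol mod q."""
    return sum(legendre(a, q) * cmath.exp(2j * math.pi * a / q) for a in range(1, q))


def root_number(q):
    """epsilon(chi) = i^(-kappa) tau(chi) / sqrt(q) with kappa = (1 - chi(-1)) / 2."""
    kappa = (1 - legendre(-1, q)) // 2
    return (1j) ** (-kappa) * gauss_sum(q) / math.sqrt(q)


if __name__ == "__main__":
    for q in (3, 5, 7, 11, 13, 17, 19):
        print(q, abs(gauss_sum(q)) ** 2, root_number(q))
```

For each prime the squared modulus printed should be close to $q$ and the root number close to 1, in line with the theorem of Gauss quoted above.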

For the Riemann zeta function, the functional equation is therefore

$$\Lambda(s) = \pi^{-s/2}\, \Gamma(s/2)\, \zeta(s) = \Lambda(1 - s). \tag{4.75}$$

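PROOF. We show the arguments for $q \ne 1$; the case of $\zeta(s)$ is only slightly different because of the pole at $s = 1$. Let $\theta(y, \chi)$ be the theta function

$$\theta(y, \chi) = \sum_{n \in \mathbb{Z}} \chi(n) n^{\kappa} e^{-\pi n^2 y / q}$$

for $y > 0$. By splitting into progressions modulo $q$, we have

$$\theta(y, \chi) = \sum_{a \,(\mathrm{mod}\ q)} \chi(a)\, \theta(y; q, a),$$

where

$$\theta(y; q, a) = \sum_{n \equiv a \,(\mathrm{mod}\ q)} n^{\kappa} e^{-\pi n^2 y / q}.$$

We apply the Poisson formula (see (4.24)) to this sum. The function $e^{-\pi x^2}$ is its own Fourier transform (4.29). Differentiating the corresponding Fourier integral, it follows that the Fourier transform of $f_y(x) = x^{\kappa} e^{-\pi x^2 y}$ is given by

$$\hat f_y(t) = i^{-\kappa}\, t^{\kappa}\, y^{-\kappa - \frac{1}{2}}\, e^{-\pi t^2 / y}.$$

By (4.24) we obtain

$$\theta(y; q, a) = i^{-\kappa}\, y^{-\kappa - \frac{1}{2}}\, q^{-\frac{1}{2}} \sum_{b \,(\mathrm{mod}\ q)} e\Big(\frac{ab}{q}\Big) \theta(y^{-1}; q, b)$$

after splitting again in progressions. Using the formula (3.12) for a primitive character it follows that

$$\theta(y, \chi) = \varepsilon(\chi)\, y^{-\kappa - \frac{1}{2}}\, \theta(y^{-1}, \bar\chi). \tag{4.76}$$

Now we appeal to the Mellin transform formula (see (4.107))

$$\Big(\frac{q}{\pi}\Big)^{(s + \kappa)/2} \Gamma\Big(\frac{s + \kappa}{2}\Big) n^{-s} = \int_0^{+\infty} n^{\kappa} e^{-\pi n^2 y / q}\, y^{(s + \kappa)/2}\, \frac{dy}{y}$$

to derive

$$\Big(\frac{q}{\pi}\Big)^{\kappa/2} \Lambda(s, \chi) = \frac{1}{2} \int_0^{\infty} \theta(y, \chi)\, y^{(s + \kappa)/2}\, \frac{dy}{y}$$

for $\mathrm{Re}(s) > 1$. Split the integral at $y = 1$ and apply (4.76) for $0 < y < 1$ getting

$$\Big(\frac{q}{\pi}\Big)^{\kappa/2} \Lambda(s, \chi) = \frac{1}{2} \int_1^{+\infty} \Big\{ \varepsilon(\chi)\, \theta(y, \bar\chi)\, y^{(1 - s + \kappa)/2} + \theta(y, \chi)\, y^{(s + \kappa)/2} \Big\} \frac{dy}{y} \tag{4.77}$$

from which both the analytic continuation of $\Lambda(s, \chi)$ (hence that of $L(s, \chi)$) and the functional equation (4.73) follow since $\theta(y, \chi)$ decays exponentially fast at $\infty$. $\square$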


REMARK. This proof should be compared with that of Theorem 14.7, the functional equation of Hecke $L$-functions.

Quite often in applications a formula for the sum of an arithmetic function $\lambda(n)$ with a sharp cut $n \le x$ is more convenient than that with smooth ending. In this situation it is better to work with an approximate formula having a good error term rather than with an exact infinite series which is only conditionally convergent. Here we state a classical result for a twisted divisor function which can be derived from the functional equation for the Dirichlet $L$-series (see [FI2]).

THEOREM 4.16. Let $\chi_1, \dots, \chi_d$ be primitive characters to moduli $q_1, \dots, q_d$ respectively. Put

For any $1 \le y \le x$ we have

Here q = q1 · · · qd, u = d- 3 - 2(,.; 1 +···+,.;d) and w is the root number of the £-function L(s) = L(s, x!) · · · L(s, Xd) (w is a product of the normalized Gauss sums}. The main term R(x) is equal to the residue of s- 1 x 8 L(s) at s = 1, so R(x) = xP(!ogx), where P(x) is a polynomial of degree (d. The implied constant depends only on d and f. . 1

Estimating the dual sum trivially and choosing $y = q^{\frac{1}{d+1}} x^{\frac{d-1}{d+1}}$ we get

$$\sum_{1 \le n \le x} \lambda(n) = R(x) + O\big(q^{\frac{1}{d+1}} x^{\frac{d-1}{d+1}+\varepsilon}\big)$$

where the implied constant depends only on $d$ and $\varepsilon$. This, of course, is not the best error term. Using exponential sums methods one can improve the result slightly. The best error term with respect to $x$ which one can hope for is $O(x^{\frac12 - \frac{1}{2d} + \varepsilon})$. Indeed, the first term of the dual sum is about as large. Note that the Exponent Pair Hypothesis (see Chapter 8) does yield the best estimate, while the Riemann Hypothesis does not!
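As an illustration of the shape $R(x) = xP(\log x)$ of the main term, consider the simplest case $d = 2$, $q = 1$ (both characters trivial), where $\lambda(n) = \tau(n)$ is the divisor function and the residue of $s^{-1}x^s\zeta(s)^2$ at $s = 1$ is $x(\log x + 2\gamma - 1)$. The sketch below is an assumption-laden example rather than anything taken from the text: it computes the divisor sum by Dirichlet's hyperbola method and compares it with this main term; the helper names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def divisor_sum(x):
    """Sum of tau(n) over 1 <= n <= x, by the hyperbola method (x a positive integer)."""
    r = math.isqrt(x)
    return 2 * sum(x // n for n in range(1, r + 1)) - r * r


def main_term(x):
    """R(x) = x (log x + 2*gamma - 1): the residue of x^s zeta(s)^2 / s at s = 1."""
    return x * (math.log(x) + 2 * EULER_GAMMA - 1)


x = 10**6
error = divisor_sum(x) - main_term(x)
print(divisor_sum(x), round(main_term(x), 2), round(error, 2))
```

For $d = 2$ the exponent $\frac{d-1}{d+1}$ equals $\frac13$, so the observed error should be far smaller than $\sqrt{x}$.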

4.A. Appendix: Fourier integrals and series. Let $L^1(\mathbb{R})$ denote the space of Lebesgue integrable functions on $\mathbb{R}$. The Fourier transform of $f \in L^1(\mathbb{R})$ is defined by

$$\mathcal{F}f(y) = \hat f(y) = \int_{\mathbb{R}} f(x) e(-xy)\, dx. \tag{4.78}$$

Though $f(x)$ is not necessarily continuous, its Fourier transform $\hat f(y)$ is uniformly continuous and $\hat f(y) \to 0$ as $|y| \to \infty$ (the Riemann-Lebesgue lemma). The convolution of two functions $f, g \in L^1(\mathbb{R})$ is defined by

$$(f * g)(x) = \int_{\mathbb{R}} f(x-y) g(y)\, dy \tag{4.79}$$

for almost all $x$, and it belongs to $L^1(\mathbb{R})$; indeed the $L^1$-norm satisfies $\|f * g\|_1 \le \|f\|_1 \|g\|_1$. The convolution is a smoothing operator; precisely, if $g$ has the first $j$ derivatives in $L^1(\mathbb{R})$, then so does $f * g$ for any $f$ in $L^1(\mathbb{R})$.

The Fourier transform $\hat f(y)$ determines its original $f(x)$. Precisely, if $f \in L^1(\mathbb{R})$, then

$$\int_{-Y}^{Y} \Big(1 - \frac{|y|}{Y}\Big) \hat f(y) e(xy)\, dy \to f(x)$$

as $Y \to \infty$ in the $L^1$-norm. If both $f, \hat f$ are in $L^1(\mathbb{R})$, then the above integral converges uniformly in $x$ giving

$$f(x) = \int_{\mathbb{R}} \hat f(y) e(xy)\, dy. \tag{4.80}$$

In other words, $\hat{\hat f}(x) = f(-x)$. This is the inversion formula for the Fourier transform. The Fourier transform of $\varphi(y) = \max\{1 - \frac{|y|}{Y}, 0\}$ is called the Fejér kernel

$$\hat\varphi(x) = \int_{-Y}^{Y} \Big(1 - \frac{|y|}{Y}\Big) e(xy)\, dy = Y \Big(\frac{\sin \pi x Y}{\pi x Y}\Big)^2. \tag{4.81}$$

We have $\widehat{f * g} = \hat f \cdot \hat g$, in particular,

$$f(x) = \begin{cases} 1 & \text{if } |x| < 1, \\ \frac12 & \text{if } |x| = 1, \\ 0 & \text{if } |x| > 1, \end{cases} \qquad \hat f(y) = \frac{\sin 2\pi y}{\pi y}, \tag{4.82}$$

$$f(x) = \max\{1 - |x|, 0\}, \qquad \hat f(y) = \Big(\frac{\sin \pi y}{\pi y}\Big)^2, \tag{4.83}$$

$$f(x) = e^{-2\pi |x|}, \qquad \hat f(y) = \pi^{-1}(1 + y^2)^{-1}, \tag{4.84}$$

$$f(x) = e^{-\pi x^2}, \qquad \hat f(y) = e^{-\pi y^2}, \tag{4.85}$$

$$f(x) = \frac{1}{\operatorname{ch} \pi x}, \qquad \hat f(y) = \frac{1}{\operatorname{ch} \pi y}. \tag{4.86}$$

There is a corresponding Fourier analysis of functions on the circle $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. These are regarded as functions on $\mathbb{R}$ which are periodic of period one. The Fourier coefficient of a function $f \in L^1(\mathbb{T})$ is defined by

$$c_n(f) = \int_{\mathbb{T}} f(x) e(-nx)\, dx. \tag{4.87}$$

They form the Fourier series for $f$

$$f(x) \sim \sum_{n \in \mathbb{Z}} c_n(f) e(nx) \tag{4.88}$$

which is nothing but a formal expression as long as the convergence is not established. The Riemann-Lebesgue lemma asserts that $c_n(f) \to 0$ as $|n| \to \infty$, and, of course, this is not sufficient for the convergence of (4.88). Nevertheless, the Fourier coefficients $c_n(f)$ determine $f$; precisely, if $c_n(f) = 0$ for all $n$, then $f = 0$ almost everywhere. The convolution

$$(f * g)(x) = \int_{\mathbb{T}} f(x-y) g(y)\, dy \tag{4.89}$$

is defined almost everywhere, and has its Fourier coefficients equal to the product

$$c_n(f * g) = c_n(f)\, c_n(g).$$

In particular, convolving $f \in L^1(\mathbb{T})$ with a trigonometric polynomial

$$P_N(x) = \sum_{|n| \le N} c_n(P) e(nx)$$

gives a trigonometric polynomial

$$(f * P_N)(x) = \sum_{|n| \le N} c_n(f)\, c_n(P_N)\, e(nx).$$

This trick solves many summability problems. The most popular is the Fejér kernel

$$F_N(x) = \sum_{|n| \le N} \Big(1 - \frac{|n|}{N}\Big) e(nx) = \frac{1}{N} \Big(\frac{\sin \pi N x}{\sin \pi x}\Big)^2. \tag{4.90}$$

This produces the Fejér partial sums

$$\sum_{|n| \le N} \Big(1 - \frac{|n|}{N}\Big) c_n(f) e(nx) \tag{4.91}$$

which do converge to $f(x)$ in the $L^1$-norm on $\mathbb{T}$. The exact partial sums of the Fourier series for $f$ can be tailored similarly by convolving $f$ against the Dirichlet kernel

$$D_N(x) = \sum_{|n| \le N} e(nx) = \frac{\sin \pi (2N+1) x}{\sin \pi x} \tag{4.92}$$

for any positive integer $N$. Concerning pointwise convergence for symmetric partial sums we have

$$\sum_{|n| \le N} c_n(f) e(nx) \to \frac12 \big(f(x+0) + f(x-0)\big) \tag{4.93}$$

for all $x \in \mathbb{T}$ provided $f \in L^1(\mathbb{T})$ has bounded variation. The convergence is uniform on closed intervals of continuity of $f$.
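The closed forms in (4.90) and (4.92) are easy to confirm numerically. The sketch below is an illustration only (the function names are not from the text): it sums the exponentials directly and compares with the trigonometric quotients, using $e(x) = e^{2\pi i x}$.

```python
import cmath
import math


def e(x):
    """The additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def fejer_sum(N, x):
    """Left side of (4.90): sum over |n| <= N of (1 - |n|/N) e(nx)."""
    return sum((1 - abs(n) / N) * e(n * x) for n in range(-N, N + 1)).real


def fejer_closed(N, x):
    """Right side of (4.90): (1/N) (sin(pi N x) / sin(pi x))^2."""
    return (math.sin(math.pi * N * x) / math.sin(math.pi * x)) ** 2 / N


def dirichlet_sum(N, x):
    """Left side of (4.92): sum over |n| <= N of e(nx)."""
    return sum(e(n * x) for n in range(-N, N + 1)).real


def dirichlet_closed(N, x):
    """Right side of (4.92): sin(pi (2N+1) x) / sin(pi x)."""
    return math.sin(math.pi * (2 * N + 1) * x) / math.sin(math.pi * x)


N, x = 7, 0.137
print(fejer_sum(N, x), fejer_closed(N, x))
print(dirichlet_sum(N, x), dirichlet_closed(N, x))
```

The Fejér kernel is nonnegative while the Dirichlet kernel changes sign, which is the reason the Fejér sums (4.91) behave better than the exact partial sums.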


In many ways the problem of representing a periodic function by its Fourier series simplifies if $f$ is square integrable because $L^2(\mathbb{T})$ is a Hilbert space, the inner product being $\langle f, g \rangle = \int_{\mathbb{T}} f(x) \bar g(x)\, dx$, and the additive characters $e_n(x) = e(nx)$ form a complete orthonormal system. Note that $\|f\|_1 \le \|f\|_2$ by the Cauchy-Schwarz inequality, so $L^2(\mathbb{T}) \subset L^1(\mathbb{T})$. For any $f, g$ in $L^2(\mathbb{T})$ we have the Parseval formula

$$\langle f, g \rangle = \sum_n c_n(f)\, \overline{c_n(g)}. \tag{4.94}$$

Hence the Fourier coefficients are square summable. The symmetric partial sums of the Fourier series converge to the function in the $L^2$-norm.

Next we consider functions in several variables $x = (x_1, \dots, x_k) \in \mathbb{R}^k$. Say $f : \mathbb{R}^k \to \mathbb{C}$ is a Schwartz function if

$$\frac{\partial^{a_1 + \cdots + a_k}}{\partial x_1^{a_1} \cdots \partial x_k^{a_k}} f(x) \ll (1 + |x|)^{-A} \tag{4.95}$$

for any $a = (a_1, \dots, a_k)$ and $A \ge 0$, where $|x|^2 = x_1^2 + \cdots + x_k^2$. Let $\mathcal{S}(\mathbb{R}^k)$ denote the class of Schwartz functions. For any $f \in \mathcal{S}(\mathbb{R}^k)$ we set the Fourier transform

$$\hat f(y) = \int_{\mathbb{R}^k} f(x) e(-x \cdot y)\, dx \tag{4.96}$$

where $x \cdot y = x_1 y_1 + \cdots + x_k y_k$ is the scalar product in $\mathbb{R}^k$. The Fourier transform maps the space $\mathcal{S}(\mathbb{R}^k)$ into itself (check (4.95) for $\hat f(y)$ by partial integration), and it has the following properties: $\hat{\hat f}(x) = f(-x)$, $\langle \hat f, \hat g \rangle = \langle f, g \rangle$, explicitly,

$$f(x) = \int_{\mathbb{R}^k} \hat f(y) e(x \cdot y)\, dy, \tag{4.97}$$

$$\int_{\mathbb{R}^k} \hat f(x) \bar{\hat g}(x)\, dx = \int_{\mathbb{R}^k} f(y) \bar g(y)\, dy. \tag{4.98}$$

In other words, the Fourier transform is an isometry on $\mathcal{S}(\mathbb{R}^k)$. The convolution

$$(f * g)(x) = \int_{\mathbb{R}^k} f(x-y) g(y)\, dy \tag{4.99}$$

is turned into the product $\mathcal{F}(f * g) = \hat f \cdot \hat g$.
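In one variable the Fourier pairs (4.84) and (4.86) can be confirmed by direct quadrature. The following is a rough sketch under stated assumptions: it treats only even real functions (so that $\hat f(y) = 2\int_0^\infty f(x)\cos(2\pi xy)\,dx$), truncates the integral to a finite window, and uses the trapezoid rule; the window and step size are ad hoc choices, not values from the text.

```python
import math


def fourier_transform_even(f, y, half_width=12.0, step=1e-3):
    """Trapezoid approximation of the Fourier transform of an even real function f at y."""
    n = int(round(half_width / step))
    total = 0.5 * f(0.0)
    for k in range(1, n):
        x = k * step
        total += f(x) * math.cos(2 * math.pi * x * y)
    x_end = n * step
    total += 0.5 * f(x_end) * math.cos(2 * math.pi * x_end * y)
    return 2 * step * total


def two_sided_exponential(x):
    """f(x) = exp(-2 pi |x|), the function in (4.84)."""
    return math.exp(-2 * math.pi * abs(x))


def sech(x):
    """f(x) = 1 / ch(pi x), the self-dual function in (4.86)."""
    return 1.0 / math.cosh(math.pi * x)


y = 0.7
print(fourier_transform_even(two_sided_exponential, y), 1 / (math.pi * (1 + y * y)))
print(fourier_transform_even(sech, y), sech(y))
```

The first comparison is accurate only to about six decimal places because $e^{-2\pi|x|}$ has a corner at the origin; the second is accurate essentially to machine precision because $1/\operatorname{ch}\pi x$ is analytic.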

One may think of $\mathbb{R}^k$ as a homogeneous space acted on by the group of translations $G = \mathbb{R}^k$. In addition to the translations the orthogonal group $SO(\mathbb{R}^k)$ acts on $\mathbb{R}^k$ by rotations making $\mathbb{R}^k$ the euclidean space. The riemannian metric on $\mathbb{R}^k$ (of curvature zero) is given by the differential $ds^2 = (dx_1)^2 + \cdots + (dx_k)^2$, and the corresponding Laplace operator is

$$D = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_k^2}. \tag{4.100}$$

The exponential functions … $\mathbb{R}^k$), therefore the Fourier transform of a radial function is also radial. Precisely, employing the Fourier inversion (4.97) one shows

LEMMA 4.17. Let $k \ge 2$. Put $\nu = \frac{k}{2} - 1$. Suppose

(4.101)

where $g$ is a smooth compactly supported function on $\mathbb{R}^+$. Then

(4.102)

with

$$h(y) = \pi \int_0^{\infty} J_\nu(2\pi\sqrt{xy})\, g(x)\, dx \tag{4.103}$$

where $J_\nu(x)$ is the Bessel function of order $\nu$.

If $P$ is a homogeneous polynomial in $k$ variables with complex coefficients such that $DP = 0$ (it is called a spherical harmonic), then we have the Fourier pair

$$F(x) = P(x) e^{-\pi |x|^2}, \qquad \hat F(y) = i^{-d} P(y) e^{-\pi |y|^2},$$

where $d = \deg P$. This self-duality (due to Hecke) actually characterizes spherical harmonics. In principle it says that the multiplication by spherical harmonics has little effect on the Fourier transform. For example, Lemma 4.17 generalizes to the following equation (due to Bochner)

where $h$ is given by (4.103) with the Bessel function of order increased from $\nu$ to $\nu + d$ (recall that $\nu = \frac{k}{2} - 1$). These facts appear naturally in the context of unitary representations.

We end this survey of classical Fourier analysis by comments on the Mellin transform on $\mathbb{R}^+$. Say $f : \mathbb{R}^+ \to \mathbb{C}$ is of type $(\alpha, \beta)$ if

(4.104)

For such $f$ the Mellin transform

$$M(f)(s) = \int_0^{\infty} f(y)\, y^{s-1}\, dy \tag{4.105}$$

is defined for complex variables in the vertical strip $\alpha < \mathrm{Re}\, s < \beta$. Clearly $M(f)(s)$ is holomorphic in this strip. Since the Mellin transform on $\mathbb{R}^+$ is merely a version of the Fourier transform on $\mathbb{R}$ derived by the change of variables $y = e^x$, $s = it$, the results for the latter translate appropriately for the former. For example, if $f$ is continuous of type $(\alpha, \beta)$ and has bounded variation, then the Mellin inversion formula holds:

$$f(y) = \frac{1}{2\pi i} \int_{(\sigma)} M(f)(s)\, y^{-s}\, ds \tag{4.106}$$

where $(\sigma)$ indicates that the integration is on the vertical line $\mathrm{Re}(s) = \sigma$.


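Here is a selection of Mellin transforms from Chapter 3 of [GR] (formulas (381.4), (761.9), (761.4), (411.3), (523.3), (222.2), (293.3), (871.4), (871.3), (871.2), (871.1) respectively)

$$\int_0^{\infty} e^{-y} y^{s-1}\, dy = \Gamma(s), \qquad \sigma > 0, \tag{4.107}$$

$$\int_0^{\infty} (\cos y)\, y^{s-1}\, dy = \Gamma(s) \cos\frac{\pi s}{2}, \qquad 0 < \sigma < 1, \tag{4.108}$$

$$\int_0^{\infty} (\sin y)\, y^{s-1}\, dy = \Gamma(s) \sin\frac{\pi s}{2}, \qquad -1 < \sigma < 1, \tag{4.109}$$

$$\int_0^{\infty} (1+y)^{-1} y^{s-1}\, dy = \frac{\pi}{\sin \pi s}, \qquad 0 < \sigma < 1, \tag{4.110}$$

$$\int_0^{\infty} \log(1+y)\, y^{s-1}\, dy = \frac{\pi}{s \sin \pi s}, \qquad -1 < \sigma < 0. \tag{4.111}$$

For $x > 0$ we have

$$\int_0^{\infty} \cos\Big(\frac{x}{2}\Big(y - \frac{1}{y}\Big)\Big)\, y^{s-1}\, dy = 2 K_s(x) \cos\frac{\pi s}{2}, \tag{4.112}$$

$$\int_0^{\infty} \sin\Big(\frac{x}{2}\Big(y - \frac{1}{y}\Big)\Big)\, y^{s-1}\, dy = 2 K_s(x) \sin\frac{\pi s}{2}, \tag{4.113}$$

$$\int_0^{\infty} \cos\Big(\frac{x}{2}\Big(y + \frac{1}{y}\Big)\Big)\, y^{s-1}\, dy = -\pi J_s(x) \sin\frac{\pi s}{2} - \pi Y_s(x) \cos\frac{\pi s}{2}, \tag{4.114}$$

$$\int_0^{\infty} \sin\Big(\frac{x}{2}\Big(y + \frac{1}{y}\Big)\Big)\, y^{s-1}\, dy = \pi J_s(x) \cos\frac{\pi s}{2} - \pi Y_s(x) \sin\frac{\pi s}{2}. \tag{4.115}$$

In the last four formulas $-1 < \sigma < 1$ and $J_s$, $K_s$, $Y_s$ are the standard Bessel functions. In the last two formulas we also have

$$J_s(x) \sin\frac{\pi s}{2} + Y_s(x) \cos\frac{\pi s}{2} = \frac{J_s(x) - J_{-s}(x)}{2 \sin\frac{\pi s}{2}}, \tag{4.116}$$

and

$$J_s(x) \cos\frac{\pi s}{2} - Y_s(x) \sin\frac{\pi s}{2} = \frac{J_s(x) + J_{-s}(x)}{2 \cos\frac{\pi s}{2}}. \tag{4.117}$$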
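The remark that the Mellin transform is a Fourier transform after the substitution $y = e^x$ also gives a convenient way to test entries of this table numerically. The sketch below, an illustration with ad hoc truncation and step parameters, checks (4.110): after substituting $y = e^x$ the integral becomes $\int_{\mathbb{R}} e^{sx}(1+e^x)^{-1}\,dx$, whose integrand decays exponentially in both directions when $0 < \sigma < 1$, so a plain Riemann sum on a uniform grid converges very quickly.

```python
import cmath
import math


def mellin_of_one_over_one_plus_y(s, half_width=150.0, step=0.05):
    """Approximate int_0^infty (1+y)^(-1) y^(s-1) dy for 0 < Re(s) < 1 via y = exp(x)."""
    n = int(round(half_width / step))
    total = 0j
    for k in range(-n, n + 1):
        x = k * step
        total += cmath.exp(s * x) / (1.0 + math.exp(x))
    return total * step


def closed_form(s):
    """Right side of (4.110): pi / sin(pi s)."""
    return math.pi / cmath.sin(math.pi * s)


for s in (0.5, 0.3 + 2j):
    print(s, mellin_of_one_over_one_plus_y(s), closed_form(s))
```

The same substitution can be used for (4.107); the oscillatory entries (4.108), (4.109) and (4.112) to (4.115) converge only conditionally and would need a more careful treatment than this sketch provides.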

CHAPTER 5

CLASSICAL ANALYTIC THEORY OF L-FUNCTIONS

We will now discuss a very classical part of analytic number theory. Indeed, most of the results discussed in this chapter can be traced back in some form to Riemann's Memoir on the zeta function. They were subsequently extended to Dirichlet characters, with two essentially new phenomena appearing. One is that uniformity of estimates with respect to the conductor is vital. The other, not unrelated, is the stubborn appearance of the so-called exceptional zeros (or Landau-Siegel zeros) for quadratic characters. Ruling out the existence of such zeros remains one of the main problems of number theory.

Nowadays more refined L-functions occur frequently (see Chapters 14 and 15 for a survey of automorphic forms, one of the sources for L-functions), but it is well known that the classical analytic results remain valid in great generality. Finding explicit statements in the literature, uniform in all parameters, is however sometimes difficult. For this reason we prove all main results in a general abstract context. Then, starting from Section 5.9, we survey the most important classes of L-functions that arise in practice:

(1) Dirichlet L-functions; see Section 5.9.
(2) Dedekind zeta functions and L-functions of Hecke Grössencharakteren of number fields; see Section 5.10.
(3) The automorphic L-functions of classical cusp forms (that is to say of holomorphic and Maass forms on GL(2)); see Section 5.11.
(4) General automorphic L-functions on GL(m), $m \ge 1$; see Section 5.12.
(5) The L-functions of algebraic varieties and Galois representations (although most of those are only conjectured to fit the framework discussed in this chapter); see Sections 5.13 and 5.14.

In each of these cases we spell out the concrete meaning of the abstract results, and give a few additional results.

5.1. Definitions and preliminaries.

In order to treat all the L-functions that interest us in one stroke, we introduce some abstract definitions, although we are not trying to create axioms. The alternative is to introduce automorphic L-functions at the outset (see Section 5.12), but some translation would still be required for the most classical cases (and Rankin-Selberg convolutions would require a separate treatment). We try to balance familiarity and ease of use with sufficient generality. The reader unfamiliar with L-functions, or acquainted only with Dirichlet L-functions, should read the introductory paragraphs of Section 5.9 first, then read what follows with this important case in mind. Basic ideas are already present in the context of Dirichlet L-functions. We then encourage him or her to get acquainted with automorphic L-functions (see [BG] for instance). The analogy with Dirichlet or Hecke characters on ideals in a number field does not really reveal the beauty and the depth of the genuine modular forms. This is especially true when Rankin-Selberg L-functions occur, which are instrumental for establishing zero-free regions.

We denote L-functions by $L(f, s)$, $L(g, s)$, although the symbols $f$, $g$ carry no specific meaning until Section 5.9. They are used merely for the suggestion that L-functions usually arise as attached to some interesting arithmetic object. We say that $L(f, s)$ is an L-function if we have the following data and conditions:

(1) A Dirichlet series with Euler product of degree $d \ge 1$,

$$L(f, s) = \sum_{n \ge 1} \lambda_f(n) n^{-s} = \prod_p (1 - \alpha_1(p) p^{-s})^{-1} \cdots (1 - \alpha_d(p) p^{-s})^{-1} \tag{5.1}$$

with $\lambda_f(1) = 1$, $\lambda_f(n) \in \mathbb{C}$, $\alpha_i(p) \in \mathbb{C}$. The series and Euler products must be absolutely convergent for $\mathrm{Re}(s) > 1$. The $\alpha_i(p)$, $1 \le i \le d$, are called the local roots or local parameters of $L(f, s)$ at $p$, and they satisfy

$$|\alpha_i(p)| < p \quad \text{for all } p. \tag{5.2}$$

(2) A gamma factor

$$\gamma(f, s) = \pi^{-ds/2} \prod_{j=1}^{d} \Gamma\Big(\frac{s + \kappa_j}{2}\Big) \tag{5.3}$$

where the numbers $\kappa_j \in \mathbb{C}$ are called the local parameters of $L(f, s)$ at infinity. We assume these numbers are either real or come in conjugate pairs. Moreover, $\mathrm{Re}(\kappa_j) > -1$. This last condition tells us that $\gamma(f, s)$ has no zero in $\mathbb{C}$ and no pole for $\mathrm{Re}(s) \ge 1$.

(3) An integer $q(f) \ge 1$, called the conductor of $L(f, s)$, such that $\alpha_i(p) \ne 0$ for $p \nmid q(f)$ and $1 \le i \le d$. A prime $p \nmid q(f)$ is said to be unramified.

From these, the so-called complete L-function

$$\Lambda(f, s) = q(f)^{s/2} \gamma(f, s) L(f, s) \tag{5.4}$$

is defined. Clearly, it is holomorphic in the half-plane $\mathrm{Re}(s) > 1$, yet it must admit analytic continuation to a meromorphic function for $s \in \mathbb{C}$ of order 1 (see the Appendix), with at most poles at $s = 0$ and $s = 1$. Moreover, it must satisfy the functional equation

$$\Lambda(f, s) = \varepsilon(f) \Lambda(\bar f, 1 - s), \tag{5.5}$$

where $\bar f$ is an object associated with $f$ (the dual of $f$) for which $\lambda_{\bar f}(n) = \overline{\lambda_f(n)}$, $\gamma(\bar f, s) = \gamma(f, s)$, $q(\bar f) = q(f)$ and $\varepsilon(f)$ is a complex number of absolute value 1, called the "root number" of $L(f, s)$.

We let $r(f)$ denote the order of the pole (if positive) or zero (if negative) of $\Lambda(f, s)$ at $s = 0$ and $s = 1$, equal because of (5.5). Since $\gamma(f, 1) \ne 0, \infty$, $r(f)$ is also the order of the pole or zero of $L(f, s)$ at $s = 1$. Later we shall show in all concrete cases that $L(f, s)$ does not vanish at $s = 1$, so $r(f) \ge 0$.

We seek uniform estimates for various analytic quantities related to $L(f, s)$. For an individual L-function, the only parameter is $s \in \mathbb{C}$, but when $L(f, s)$ varies, we also have to deal with the degree, conductor, local parameters, and each of these or a combination may be under investigation. It turns out that most results for $L(f, s)$ are expressed conveniently in terms of the analytic conductor. First put

$$\mathfrak{q}_\infty(s) = \prod_{j=1}^{d} (|s + \kappa_j| + 3). \tag{5.6}$$

Multiplying this by $q(f)$ we get the analytic conductor

$$\mathfrak{q}(f, s) = q(f)\, \mathfrak{q}_\infty(s) = q(f) \prod_{j=1}^{d} (|s + \kappa_j| + 3). \tag{5.7}$$

We also denote

$$\mathfrak{q}(f) = \mathfrak{q}(f, 0) = q(f) \prod_{j=1}^{d} (|\kappa_j| + 3).$$

Note that $\mathfrak{q}(f) \ge 3^d q(f)$, and

$$\mathfrak{q}(f, s) \le \mathfrak{q}(f) (|s| + 3)^d,$$

so estimates could be performed in terms of this last quantity without compromising much of strength.

From the definition, we see that if $L(f, s)$ and $L(g, s)$ are L-functions, then $L(f, s) L(g, s)$ is one with conductor $q(f) q(g)$ and analytic conductor $\mathfrak{q}(f, s) \mathfrak{q}(g, s)$, gamma factor $\gamma(f, s) \gamma(g, s)$, root number $\varepsilon(f) \varepsilon(g)$. Moreover, $L(\bar f, s)$ is also an L-function by construction, with same degree, conductor, gamma factor and with $\varepsilon(\bar f) = \overline{\varepsilon(f)}$. Also if $L(f, s)$ is entire, then for any fixed $t \in \mathbb{R}$, the shifted L-function $L(g, s) = L(f, s + it) L(\bar f, s - it)$ is another L-function, with gamma factor $\gamma(g, s) = \gamma(f, s + it) \gamma(f, s - it)$, conductor $q(g) = q(f)^2$, root number $\varepsilon(g) = 1$. The analytic conductor is $\mathfrak{q}(g, s) = \mathfrak{q}(f, s + it) \mathfrak{q}(f, s - it)$. The condition that $L(f, s)$ be entire is required because we are not allowed to shift the poles. We take the product of two L-functions shifted by $it$ and $-it$ to maintain the local parameters coming in complex pairs. Of course, these are only cosmetic arrangements.

In the sequel we often simplify the above notation by not displaying the dependence on $f$, $s$; we write $q = q(f)$, $\mathfrak{q}_\infty = \mathfrak{q}_\infty(s)$, $\mathfrak{q} = \mathfrak{q}(f, s)$, $r = r(f)$.
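The definitions (5.6) and (5.7) translate directly into a few lines of code. The sketch below is only an illustration: the function name and the sample data ($q = 11$ with local parameters $\kappa = \frac12, \frac32$ at infinity, of the shape one expects for a weight 2 cusp form in the normalization of this chapter) are assumptions chosen for the example, not values taken from the text. It also checks the inequality $\mathfrak{q}(f,s) \le \mathfrak{q}(f)(|s|+3)^d$ on a few sample points.

```python
def analytic_conductor(q, kappas, s):
    """q(f, s) = q(f) * prod_j (|s + kappa_j| + 3), as in (5.7)."""
    result = float(q)
    for kappa in kappas:
        result *= abs(s + kappa) + 3
    return result


# Hypothetical sample data: conductor 11, degree 2, local parameters 1/2 and 3/2 at infinity.
q, kappas = 11, [0.5, 1.5]
d = len(kappas)

for s in (0.5, 0.5 + 10j, 0.5 + 1000j):
    lhs = analytic_conductor(q, kappas, s)
    rhs = analytic_conductor(q, kappas, 0) * (abs(s) + 3) ** d
    print(s, lhs, rhs, lhs <= rhs)
```

For fixed $f$ the analytic conductor grows like $|t|^d$ on a vertical line, which is why varying $t$ is counted as varying in a family.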

If $f = \bar f$, then $L(f, s)$ is said to be self-dual. This means that the Dirichlet series of the L-function has real coefficients. For a self-dual L-function, the root number is real, hence $\varepsilon(f) = \pm 1$. It is then called the sign of the functional equation. The following observation is originally due to Shimura; despite its simplicity, it is significant in a number of applications (see Section 23.6 and Chapter 26 for instance):

PROPOSITION 5.1. Let $L(f, s)$ be self-dual with $\varepsilon(f) = -1$. Then $L(f, \frac12) = 0$.

PROOF. By definition we have $\gamma(f, \frac12) \ne 0$, and the functional equation applied to the central point $s = \frac12$ yields $L(f, \frac12) = -L(f, \frac12)$, hence $L(f, \frac12) = 0$. $\square$

The local roots $\alpha_i(p)$ are well defined up to permutation of the indices. If for any $i$ we have $|\alpha_i(p)| = 1$ for all $p \nmid q(f)$ and $|\alpha_i(p)| \le 1$ otherwise, then $L(f, s)$ is said to satisfy the Ramanujan-Petersson conjecture. This implies, in particular, that $|\lambda_f(n)| \le \tau_d(n)$. Similarly, if for any $j$ we have $\mathrm{Re}(\kappa_j) \ge 0$, then $L(f, s)$ is said to satisfy the (generalized) Selberg Conjecture, or the Ramanujan-Petersson conjecture at the infinite place. In this case $\gamma(f, s)$ has no poles for $\mathrm{Re}(s) > 0$.

Because $\Lambda(f, s) = q(f)^{s/2} \gamma(f, s) L(f, s)$ is assumed to be holomorphic, except possibly at $s = 0$ and $s = 1$, it follows that poles of $\gamma(f, s)$ at $s \ne 0$ are zeros of $L(f, s)$. These are called the "trivial zeros". They are located exactly at the points $-2m - \kappa_j \ne 0$, $1 \le j \le d$, where $m \ge 0$ is an integer. For example, the trivial zeros of $\zeta(s)$ are at $s = -2, -4, -6, \dots$ while $\zeta(0) = -\frac12$. The other zeros of $L(f, s)$ are called the non-trivial zeros. They are located in the critical strip $0 \le \mathrm{Re}(s) \le 1$. Presumably some trivial zeros may lie in the critical strip as well; however, one conjectures they are never in the interior.

As mentioned in the introduction, we will discuss examples starting in Section 5.9. It is important to note that many interesting L-series exist beyond our context, either because of lack of strength in our fingers to prove the required assumptions (for instance Artin L-functions for non-trivial irreducible characters are not known to be entire in general, and general L-functions of varieties are not even known to be meromorphic, see Sections 5.13 and 5.14), or because some of the conditions fail. For instance, in the second class come Dirichlet L-functions of non-primitive characters (they have Euler product but the functional equation has extra factors), L-functions of general modular forms (see Theorem 14.7 for instance) which have no Euler products and a functional equation relating $L(f, s)$ with some function not directly related to $L(f, s)$. Still interesting are partial zeta functions of ideal classes of number fields (see (22.55) for imaginary quadratic fields) which have functional equations (relating an ideal class $\mathfrak{a}$ with $\mathfrak{d}\mathfrak{a}^{-1}$, where $\mathfrak{d}$ is the different), but no Euler product if the class number is $> 1$.
However, these partial zeta functions can be reconstructed as linear combinations of the full L-functions for the class group characters.

The modern analytic theory of L-functions is much influenced by two concepts: that of a family of L-functions, and that of (Langlands) functoriality. The former lacks a formal definition but is a strong guiding principle (see [M2]). In this book we will indicate when examples of families appear. Note that in this context, varying the imaginary part $t$ of $s$ should count as varying in a family. The main parameters in a family of L-functions are encoded in the analytic conductor.

Functoriality, on the other hand, reflects the principle that sometimes new L-functions are built out of old ones by making some operation (simple looking, yet quite subtle) on their coefficients $\lambda_f(n)$, such as raising to some power ($\lambda_f(n^k)$ or $\lambda_f(n)^k$). A proper formalization requires quite major notation and setup, and the naive ideas must usually be changed; moreover, much of the global setting is still conjectural. We will do here with a simple-minded definition of Rankin-Selberg type L-functions, and mention elsewhere the symmetric square and other symmetric power L-functions (see Section 5.12 and the discussion of the Sato-Tate conjecture in Chapter 21).

Let $L(f, s)$ and $L(g, s)$ be L-functions of degrees $d$ and $e$, with local components at infinity $\kappa_i$ and $\nu_j$, and local roots $(\alpha_i(p))$ and $(\beta_j(p))$ respectively. For $p \nmid q(f) q(g)$, let

$$L_p(f \otimes g, s) = \prod_{i,j} (1 - \alpha_i(p) \beta_j(p) p^{-s})^{-1}. \tag{5.9}$$

We say that $f$ and $g$ have a Rankin-Selberg convolution if there exists an L-function $L(f \otimes g, s)$ of degree $de$ such that

$$L(f \otimes g, s) = \prod_{p \nmid q(f) q(g)} L_p(f \otimes g, s) \prod_{p \mid q(f) q(g)} H_p(p^{-s})$$

where

$$H_p(p^{-s}) = \prod_{j=1}^{de} (1 - \gamma_j(p) p^{-s})^{-1} \quad \text{with } |\gamma_j(p)| < p. \tag{5.10}$$

The gamma factor must be written as

$$\gamma(f \otimes g, s) = \pi^{-des/2} \prod_{i,j} \Gamma\Big(\frac{s + \mu_{i,j}}{2}\Big)$$

with $\mathrm{Re}(\mu_{i,j}) \le \mathrm{Re}(\kappa_i + \nu_j)$ and $|\mu_{i,j}| \le |\kappa_i| + |\nu_j|$, where $\kappa_i$ and $\nu_j$ are the local components at infinity of $f$ and $g$. In addition, the conductor of $f \otimes g$ must divide $q(f)^e q(g)^d$ and $L(f \otimes g, s)$ must have a pole at $s = 1$ if $g = \bar f$. In fact, unless $L(f, s)$ or $L(g, s)$ factor, we will see that in concrete cases $L(f \otimes g, s)$ is entire if $g \ne \bar f$.

We call $L(f \otimes g, s)$ the Rankin-Selberg L-function or convolution of $f$ and $g$, or the Rankin-Selberg square if $g = f$. Note that if $L(f \otimes f, s)$ or $L(f \otimes \bar f, s)$ exists, then the estimates on the local roots and on $\kappa_j$ can be improved to $|\alpha_i(p)| < \sqrt{p}$ and $\mathrm{Re}(\kappa_j) > -\frac12$. The conditions also imply the inequality for the analytic conductor

(5.11)

(see (5.8)). Although it is by no means obvious that $L(f \otimes g, s)$ should exist, it is the case when $L(f, s)$ and $L(g, s)$ are automorphic L-functions, as discussed in Section 5.12. This covers most cases of importance in analytic number theory (including the L-functions of number fields). Notice that the dual L-function is $L(\bar f \otimes \bar g, s)$.

5.2. Approximations to L-functions.

We begin with a crude result which will be used extensively, and improved later.

LEMMA 5.2. Any L-function $L(f, s)$ is polynomially bounded in vertical strips $s = \sigma + it$ with $a \le \sigma \le b$, $|t| \ge 1$.

PROOF. The L-function is bounded in the half-plane $\sigma \ge 1 + \varepsilon$ by the absolute convergence of the Dirichlet series. Hence we deduce a polynomial bound for $L(f, s)$ in the half-plane $\sigma < -\varepsilon$ by the functional equation and the Stirling formula (5.114). Then the polynomial bound in the remaining strip follows by the Phragmén-Lindelöf principle (see Theorem 5.53). $\square$

We now state an exact formula, known as the "approximate functional equation", which gives analytically convenient expressions for $L(f, s)$ in the critical strip where the series does not converge absolutely.

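THEOREM 5.3. Let $L(f, s)$ be an L-function. Let $G(u)$ be any function which is holomorphic and bounded in the strip $-4 < \mathrm{Re}(u) < 4$, even, and normalized by $G(0) = 1$. Let $X > 0$. Then for $s$ in the strip $0 \le \sigma \le 1$ we have

$$L(f, s) = \sum_{n} \frac{\lambda_f(n)}{n^s} V_s\Big(\frac{n}{X\sqrt{q}}\Big) + \varepsilon(f, s) \sum_{n} \frac{\bar\lambda_f(n)}{n^{1-s}} V_{1-s}\Big(\frac{nX}{\sqrt{q}}\Big) + R \tag{5.12}$$

where $V_s(y)$ is a smooth function defined by

$$V_s(y) = \frac{1}{2\pi i} \int_{(3)} y^{-u} G(u) \frac{\gamma(f, s+u)}{\gamma(f, s)} \frac{du}{u} \tag{5.13}$$

and

$$\varepsilon(f, s) = \varepsilon(f)\, q(f)^{\frac12 - s}\, \frac{\gamma(f, 1-s)}{\gamma(f, s)}. \tag{5.14}$$

The last term $R = 0$ if $\Lambda(f, s)$ is entire, otherwise

$$R = \Big(\mathop{\mathrm{res}}_{u = 1-s} + \mathop{\mathrm{res}}_{u = -s}\Big) \frac{\Lambda(f, s+u) G(u) X^u}{q^{s/2} \gamma(f, s)\, u}.$$

PROOF. Consider the integral

$$I(X, f, s) = \frac{1}{2\pi i} \int_{(3)} X^u \Lambda(f, s+u) G(u) \frac{du}{u}.$$

This integral exists because $\Lambda(f, s)$ decays rapidly as $|t| \to +\infty$ for fixed $\sigma$ by Stirling's formula (see (5.113)). For the same reason one can move the integration to $\mathrm{Re}(u) = -3$ by Lemma 5.2. Applying the functional equation there yields

$$\Lambda(f, s) = I(X, f, s) + \varepsilon(f) I(X^{-1}, \bar f, 1-s) + R\, q^{s/2} \gamma(f, s) \tag{5.15}$$

where $\Lambda(f, s)$ comes from the simple pole of $u^{-1} G(u)$ at $u = 0$ and the last term $R$ comes from possible residues at $u = 1-s$ and $u = -s$ if $\Lambda(f, s)$ is not entire. By expanding into absolutely convergent Dirichlet series we have

$$I(X, f, s) = q^{s/2} \sum_{n \ge 1} \lambda_f(n) n^{-s} \frac{1}{2\pi i} \int_{(3)} \gamma(f, s+u) \Big(\frac{X\sqrt{q}}{n}\Big)^u G(u) \frac{du}{u} = q^{s/2} \gamma(f, s) \sum_{n \ge 1} \frac{\lambda_f(n)}{n^s} V_s\Big(\frac{n}{X\sqrt{q}}\Big).$$

We do the same for $I(X^{-1}, \bar f, 1-s)$ and combine them with (5.15). Dividing both sides by $q^{s/2} \gamma(f, s)$ yields (5.12). $\square$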


EXERCISE 1. Let $L(f, s)$ be an L-function which is entire. For a smooth, compactly supported function $F$ on $\mathbb{R}^+$ with Mellin transform $\hat F$, prove that

$$\sum_{n \ge 1} \lambda_f(n) F(n) = \frac{\varepsilon(f)}{\sqrt{q}} \sum_{n \ge 1} \bar\lambda_f(n) H\Big(\frac{n}{q}\Big) \tag{5.16}$$

where

$$H(y) = \frac{1}{2\pi i} \int_{(3)} \hat F(1-s) \frac{\gamma(f, s)}{\gamma(f, 1-s)} y^{-s}\, ds.$$

(More explicit formulas, in the case of classical automorphic forms, are described in Section 4.5; in particular, see the formula (4.71) in which the test function goes through the Hankel transform.) Show that if $L(f, s) = \zeta(s)$ is the Riemann zeta function, (5.16) holds with an additional term

$$R = \mathop{\mathrm{res}}_{s=1} \zeta(s) \hat F(s) = \int_0^{+\infty} F(y)\, dy$$

on the right side. Prove that in this case

$$H(y) = 2 \int_0^{\infty} F(x) \cos(2\pi x y)\, dx,$$

and (5.16) is the Poisson summation formula.

EXERCISE 2. Let $L(f, s)$ be the L-function associated with a holomorphic primitive cusp form $f$ of weight $k = 2$ and level $q$ (think of the Hasse-Weil zeta function of an elliptic curve). Derive from (5.12) the following formula for the central value

$$L(f, \tfrac12) = \sum_{n} \frac{\lambda_f(n)}{\sqrt{n}} \exp\Big(-\frac{2\pi n}{X\sqrt{q}}\Big) + \varepsilon(f) \sum_{n} \frac{\lambda_f(n)}{\sqrt{n}} \exp\Big(-\frac{2\pi n X}{\sqrt{q}}\Big),$$

where $X$ is any positive number. [Hint: take $G(u) = 1$.] Differentiating with respect to $X$ this yields the summation formula

$$\sum_{n} \lambda_f(n) \sqrt{n}\, \exp\Big(-\frac{2\pi n}{X\sqrt{q}}\Big) = X^2 \varepsilon(f) \sum_{n} \lambda_f(n) \sqrt{n}\, \exp\Big(-\frac{2\pi n X}{\sqrt{q}}\Big).$$

Taking $X = 1$ we get for a self-dual L-function with root number $\varepsilon(f) = 1$,

$$L(f, \tfrac12) = 2 \sum_{n} \frac{\lambda_f(n)}{\sqrt{n}} \exp\Big(-\frac{2\pi n}{\sqrt{q}}\Big).$$
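The central value formula of Exercise 2 converges so fast that it can be evaluated by hand or in a few lines of code. The sketch below is an illustration under several assumptions that are not in the text: it takes the weight 2 newform of level 11 given by the eta product $q\prod_{n \ge 1}(1-q^n)^2(1-q^{11n})^2$, uses the normalization $\lambda_f(n) = a(n)/\sqrt{n}$, takes the root number to be $+1$, and compares with the tabulated value $L(E,1) \approx 0.2538418609$ for the elliptic curve of conductor 11. The function names are hypothetical.

```python
import math


def eta_product_coefficients(level, count):
    """Return a list A with A[n] = a(n) for q * prod (1 - q^n)^2 (1 - q^(level*n))^2, 1 <= n <= count."""
    series = [0] * count  # series[k] is the coefficient of q^k in the infinite product
    series[0] = 1
    for step in (1, level):
        for n in range(step, count, step):
            for _ in range(2):  # each factor (1 - q^n) occurs squared
                for k in range(count - 1, n - 1, -1):
                    series[k] -= series[k - n]
    return [0] + series  # shift by one for the leading factor q


def central_value(a, conductor, root_number=1, X=1.0):
    """Evaluate the two rapidly convergent sums of Exercise 2 at the point X."""
    sqrt_q = math.sqrt(conductor)
    total = 0.0
    for n in range(1, len(a)):
        lam = a[n] / math.sqrt(n)  # assumed normalization lambda_f(n) = a(n) / sqrt(n)
        first = math.exp(-2 * math.pi * n / (X * sqrt_q))
        second = math.exp(-2 * math.pi * n * X / sqrt_q)
        total += lam / math.sqrt(n) * (first + root_number * second)
    return total


a = eta_product_coefficients(11, 80)
print(a[1:12])
print(central_value(a, 11), central_value(a, 11, X=1.3))
```

The two printed central values agree, which reflects the fact that the formula holds for every $X > 0$; about ten terms of the sum already give nine correct decimal places.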

For suitable test functions $G(u)$, both sums in (5.12) are effectively limited to the terms with $n \ll \sqrt{\mathfrak{q}(f, s)}$. We shall see this clearly for a particular choice of $G(u)$,

$$G(u) = \Big(\cos\frac{\pi u}{4A}\Big)^{-4dA}$$

where $A$ is a positive integer.

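PROPOSITION 5.4. Suppose $\mathrm{Re}(s + \kappa_j) \ge 3\alpha > 0$ for $1 \le j \le d$. Then the derivatives of $V_s(y)$ satisfy

$$y^a V_s^{(a)}(y) = \delta_a + O\Big(\Big(\frac{y}{\sqrt{\mathfrak{q}_\infty}}\Big)^{\alpha}\Big), \tag{5.17}$$

$$y^a V_s^{(a)}(y) \ll \Big(1 + \frac{y}{\sqrt{\mathfrak{q}_\infty}}\Big)^{-A} \tag{5.18}$$

where $\delta_0 = 1$, $\delta_a = 0$ if $a > 0$ and the implied constants depend only on $\alpha$, $a$, $A$ and $d$. Recall that $\mathfrak{q}_\infty = \mathfrak{q}_\infty(s)$ is given by (5.6).

PROOF. For $s$ and $u$ with $\mathrm{Re}(s) = \sigma > 0$ and $\mathrm{Re}(u) = \beta > -\sigma$ we derive by Stirling's formula (5.112)

$$\frac{\Gamma(s+u)}{\Gamma(s)} \ll (|s| + 3)^{\beta} \exp\Big(\frac{\pi}{2}|u|\Big)$$

where the implied constant depends only on $\sigma$ and $\beta$. Hence

$$\frac{\gamma(f, s+u)}{\gamma(f, s)} \ll \mathfrak{q}_\infty^{\beta/2} \exp\Big(\frac{\pi d}{2}|u|\Big).$$

We have

$$y^a V_s^{(a)}(y) = \frac{1}{2\pi i} \int_{(3)} y^{-u} G(u)\, (-u)(-u-1)\cdots(-u-a+1)\, \frac{\gamma(f, s+u)}{\gamma(f, s)} \frac{du}{u}.$$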

Moving the integration to the line $\mathrm{Re}(u) = \beta = -\alpha$ we derive (5.17) while moving to the line $\mathrm{Re}(u) = A$ we derive the bound $O((\sqrt{\mathfrak{q}_\infty}/y)^A)$. This combined with (5.17) yields (5.18). $\square$

The formula (5.12) expresses $L(f, s)$ in the critical strip, essentially, as two partial sums of the Dirichlet series $L(f, s)$ and the dual one, each of length $\sqrt{\mathfrak{q}(f, s)}$. Hence one can easily derive estimates in the critical strip. We give here the individual estimate, and the uniform version for L-functions satisfying the Ramanujan-Petersson conjecture. See Section 5.12 for stronger results in the case of automorphic L-functions. Because $L(f, s)$ can have a pole or zero at $s = 1$ of order $r$, it is appropriate to kill this singularity before estimating. Therefore we shall often estimate $P_r(s) L(f, s)$ rather than $L(f, s)$, where

$$P_r(s) = \Big(\frac{s-1}{s+1}\Big)^r. \tag{5.19}$$

EXERCISE 3. Prove that for $s$ with $0 \le \sigma \le 1$ we have

$$P_r(s) L(f, s) \ll \mathfrak{q}(f, s)^{(1-\sigma)/2 + \varepsilon} \tag{5.20}$$

for any $\varepsilon > 0$, with the implied constant depending on $\varepsilon$ and $f$. If $L(f, s)$ satisfies the Ramanujan-Petersson conjecture, then the implied constant depends only on $\varepsilon$ and the degree of the L-function. [Hint: Use the absolute convergence of $L(f, s)$ in $\sigma > 1$ (or the bound $|\lambda_f(n)| \le \tau_d(n)$, respectively), the functional equation, Stirling's estimate for the gamma function and the Phragmén-Lindelöf convexity principle.]

In particular, (5.20) yields the following bound on the critical line

$$L(f, s) \ll \mathfrak{q}(f, s)^{\frac14 + \varepsilon} \tag{5.21}$$

which is known as the convexity bound. Using the approximate functional equation rather than the convexity principle, and the Ramanujan-Petersson conjecture, one can derive a slightly better estimate

$$L(f, s) \ll \mathfrak{q}(f, s)^{\frac14} (\log \mathfrak{q}(f, s))^{d-1} \tag{5.22}$$

where the implied constant depends only on the degree $d$ of $L(f, s)$.

It is conjectured that $L(f, s) \ll \mathfrak{q}(f, s)^{\varepsilon}$ for $\mathrm{Re}(s) = \frac12$ with any $\varepsilon > 0$, the implied constant depending only on $\varepsilon$. This is the so-called Lindelöf Hypothesis, which is one of the consequences of the Grand Riemann Hypothesis (see Corollary 5.20). However, in many applications the major step is to improve on the convexity bound for relevant L-functions, in the sense of reducing the exponent by a positive number, however small that number may be. Such results go back to H. Weyl for the Riemann zeta function, namely

$$\zeta(s) \ll |s|^{\frac16 + \varepsilon}$$

(see (8.22)). For Dirichlet characters, one derives from Burgess's estimate for short character sums that

$$L(\chi, s) \ll |s|\, q^{\frac{3}{16} + \varepsilon}$$

(see Theorem 12.9), which is worse than (5.21) in $t$-aspect, but much better in $q$-aspect: the latter is often the most important (e.g. for applications to algebraic number theory). In view of the proof of (5.22) by the formula (5.12) it is clear that any subconvexity estimate amounts to proving that there is considerable cancellation when summing the oscillating values $\lambda_f(n) n^{-\frac12 - it}$, and this proves to be an arithmetical problem. Currently, it is known that a subconvexity bound holds in $t$-aspect and $q$-aspect separately (but not yet for both jointly) for any L-function attached to classical modular forms (see the surveys [M2], [IS1], and the papers [DFI2], [DFI3], [KMV2], [PSa], [M1], among others).

5.3. Counting zeros of L-functions.

One of the deepest subjects of the theory of L-functions is the distribution of the zeros of $L(f, s)$. We discuss first what can be done using basic properties of holomorphic functions through the method of Hadamard and de la Vallée Poussin.

LEMMA 5.5. Let $L(f, s)$ be an L-function. All zeros $\rho$ of $\Lambda(f, s)$ are in the critical strip $0 \le \sigma \le 1$. For any $\varepsilon > 0$, we have

$$\sum_{\rho \ne 0, 1} |\rho|^{-1-\varepsilon} < +\infty.$$

PROOF. Since the Euler product expansion of $L(f, s)$ is absolutely convergent and $\gamma(f, s)$ does not vanish for $\mathrm{Re}(s) > 1$, there are no zeros of $\Lambda(f, s)$ in this region. By the functional equation, the same holds for $\mathrm{Re}(s) < 0$. The second statement is a general result of complex analysis for entire functions of order 1. $\square$

Notice that if $\rho = \beta + i\gamma$ is a zero of $\Lambda(f, s)$, then $\bar\rho$ is a zero of the dual $\Lambda(\bar f, s)$. Hence, by the functional equation, the point reflected in the critical line $\rho^* = 1 - \bar\rho = 1 - \beta + i\gamma$ is also a zero of $\Lambda(f, s)$ (with corresponding multiplicity). Of course, there is no symmetry among the trivial zeros of $L(f, s)$ which come from the poles of $\Gamma(\frac{s + \kappa_j}{2})$.

THEOREM 5.6. Let $L(f, s)$ be an L-function. There exist constants $a = a(f)$ and $b = b(f)$ such that

$$(s(1-s))^r \Lambda(f, s) = e^{a + bs} \prod_{\rho \ne 0, 1} \Big(1 - \frac{s}{\rho}\Big) e^{s/\rho} \tag{5.23}$$

where $\rho$ ranges over all zeros of $\Lambda(f, s)$ different from 0, 1. Hence

$$-\frac{L'}{L}(f, s) = \frac12 \log q + \frac{\gamma'}{\gamma}(f, s) - b + \frac{r}{s} + \frac{r}{s-1} - \sum_{\rho \ne 0, 1} \Big(\frac{1}{s - \rho} + \frac{1}{\rho}\Big), \tag{5.24}$$

both expressions being uniformly absolutely convergent in compact subsets which have no zeros or poles (recall that $r$ is the order of the pole or zero of $L(f, s)$ at $s = 1$).

PROOF. The expansion (5.23) is simply the application of the Hadamard factorization theorem of entire functions of finite order, and (5.24) follows by taking the logarithmic derivative. $\square$

We denote

$$-\frac{L'}{L}(f, s) = \sum_{n \ge 1} \Lambda_f(n) n^{-s} \tag{5.25}$$

the expansion of the logarithmic derivative of an L-function in Dirichlet series supported on prime powers. In terms of the local roots $\alpha_j(p)$ of the Euler product (5.1) we have

$$\Lambda_f(p^k) = \sum_{j=1}^{d} \alpha_j(p)^k \log p. \tag{5.26}$$

For Dirichlet characters, $\Lambda_\chi(n) = \chi(n) \Lambda(n)$, and in general $\Lambda_f(p) = \lambda_f(p) \log p$ for $p$ prime. Note that $\Lambda_{\bar f}(n) = \overline{\Lambda_f(n)}$.

PROPOSITION 5.7. Let $L(f, s)$ be an L-function of degree $d \ge 1$ and let $\rho$ denote the zeros of $\Lambda(f, s)$ different from 0, 1.

(1) The number of zeros $\rho = \beta + i\gamma$ such that $|\gamma - T| \le 1$, say $m(T, f)$, satisfies

$$m(T, f) \ll \log \mathfrak{q}(f, iT) \tag{5.27}$$

with an absolute implied constant.

(2) For any $s$ in the strip $-\frac12 \le \sigma \le 2$ we have

$$\frac{L'}{L}(f, s) + \frac{r}{s} + \frac{r}{s-1} - \sum_{|s + \kappa_j| < 1} \frac{1}{s + \kappa_j} - \sum_{|s - \rho| < 1} \frac{1}{s - \rho} \ll \log \mathfrak{q}(f, s), \tag{5.28}$$

with an absolute implied constant.

(3) The constant $b(f)$ in (5.23) satisfies

$$\mathrm{Re}(b(f)) = -\sum_{\rho} \mathrm{Re}(\rho^{-1}). \tag{5.29}$$

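PROOF. First we prove (5.29). Taking the logarithmic derivative of the functional equation $(s(1-s))^r \Lambda(f, s) = \varepsilon(f)(s(1-s))^r \Lambda(\bar f, 1-s)$, we get from (5.23)

$$ 2\operatorname{Re}(b(f)) = b(f) + b(\bar f) = -\sum_{\rho \neq 0,1} \Big(\frac{1}{s-\rho} + \frac{1}{1-s-\bar\rho} + \frac{1}{\rho} + \frac{1}{\bar\rho}\Big). \tag{5.30} $$

We have $(s-\rho)^{-1} + (1-s-\bar\rho)^{-1} \ll |\rho|^{-2}$ for $s \neq \rho$, the implied constant depending on $s$, and similarly $\rho^{-1} + \bar\rho^{-1} \ll |\rho|^{-2}$, so the series

$$ \sum_{\rho \neq 0,1} \Big(\frac{1}{s-\rho} + \frac{1}{1-s-\bar\rho}\Big) \quad\text{and}\quad \sum_{\rho \neq 0,1} \Big(\frac{1}{\rho} + \frac{1}{\bar\rho}\Big) $$

are absolutely convergent by Lemma 5.5. So we can separate them in (5.30) and the first one vanishes (the terms cancel out since $\rho$ and $1 - \bar\rho$ are zeros of $\Lambda(f, s)$), giving (5.29).

Let $T \geq 2$ and $s = 3 + iT$. By (5.26) we have $|\Lambda_f(n)| \leq d\, n \log n$. Hence

$$ \Big|\frac{L'}{L}(f,s)\Big| \leq d\,|\zeta'(2)| \ll \log q(f). \tag{5.31} $$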

By Stirling's formula (5.116) we have $\frac12 \log q + \frac{\gamma'}{\gamma}(f, s) \ll \log q(f, s)$. Observe also that for any zero $\rho = \beta + i\gamma$ we have

$$ \frac{2}{9 + (T-\gamma)^2} < \operatorname{Re}\Big(\frac{1}{s-\rho}\Big) < \frac{3}{4 + (T-\gamma)^2}. $$

Hence we can take the real part in (5.24) and rearrange the resulting absolutely convergent series to derive using (5.29) that

$$ \sum_{\rho} \frac{1}{1 + (T-\gamma)^2} \ll \log q(f, iT). \tag{5.32} $$

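This implies (5.27). To derive (5.28), we write $s = \sigma + it$ and

$$ -\frac{L'}{L}(f,s) = -\frac{L'}{L}(f,s) + \frac{L'}{L}(f, 3+it) + O(\log q(f,s)) $$

by (5.31), and we get by (5.24) and (5.116) again

$$ -\frac{L'}{L}(f,s) = \frac{\gamma'}{\gamma}(f,s) + \frac{r}{s} + \frac{r}{s-1} - \sum_{\rho} \Big(\frac{1}{s-\rho} - \frac{1}{3+it-\rho}\Big) + O(\log q(f,s)). $$

In the series, keep the zeros with $|s - \rho| < 1$ and estimate the remainder by $\log q(f, s)$ using

$$ \Big|\frac{1}{s-\rho} - \frac{1}{3+it-\rho}\Big| \leq \frac{3}{1 + (t-\gamma)^2} $$

and (5.32). Moreover we have (see (5.116))

$$ \frac{\gamma'}{\gamma}(f,s) = -\frac{d}{2}\log\pi - \sum_{j} \frac{1}{s+\kappa_j} + \frac12 \sum_{j} \frac{\Gamma'}{\Gamma}\Big(1 + \frac{s+\kappa_j}{2}\Big) = -\sum_{|s+\kappa_j|<1} \frac{1}{s+\kappa_j} + O(\log q_\infty(s)). $$

Thus (5.28) follows. □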

THEOREM 5.8. Let $L(f, s)$ be an L-function of degree $d$. Let $N(T, f)$ be the number of zeros $\rho = \beta + i\gamma$ of $L(f, s)$ such that $0 \leq \beta \leq 1$ and $|\gamma| \leq T$. We have

$$ N(T,f) = \frac{T}{\pi} \log\frac{q T^d}{(2\pi e)^d} + O(\log q(f, iT)) \tag{5.33} $$

for $T \geq 1$ with an absolute implied constant.
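As a numerical sanity check of (5.33), the following minimal sketch evaluates the main term for the Riemann zeta function, taking $d = 1$ and $q = 1$. The function name `zero_count_main_term` is ours and purely illustrative. It is a classical fact that $\zeta(s)$ has 29 zeros with $0 < \gamma \leq 100$, hence 58 with $|\gamma| \leq 100$, and the main term should agree with this count up to the $O(\log q(f, iT))$ error.

```python
import math

def zero_count_main_term(T, q=1.0, d=1):
    """Main term (T / pi) * log(q * T^d / (2 * pi * e)^d) of formula (5.33)."""
    return (T / math.pi) * math.log(q * T**d / (2 * math.pi * math.e) ** d)

# Riemann zeta function: degree d = 1, conductor q = 1.
# The count N(T, f) includes zeros with both signs of gamma.
print(zero_count_main_term(100.0))
```

The difference between the printed value and 58 is of the size allowed by the error term.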

PROOF. Let $N'(T, f)$ be the number of zeros $\rho$ of $\Lambda(f, s)$ with $0 \leq \beta \leq 1$ and $0 < \gamma \leq T$. We have

$$ N(T, f) = N'(T, f) + N'(T, \bar f) + O(\log q(f)) \tag{5.34} $$

where the error term accounts for possible real and trivial zeros of $L(f, s)$ in $0 \leq \sigma < 1$. By (5.27) we can assume that $\Lambda(f, s)$ does not vanish on $\operatorname{Im}(s) = T$ by adding to $T$ an arbitrarily small number if necessary, modifying $N(T, f)$ by a quantity $\ll \log q(f, iT)$. Choosing small $\delta > 0$ such that $\Lambda(f, s) \neq 0$ in $-\delta \leq \operatorname{Im}(s) < 0$, we have $N'(T, f) = I(T) + O(\log q(f, iT))$ with

$$ I(T) = \frac{1}{2\pi i} \int_C \frac{\Lambda'}{\Lambda}(f,s)\,ds $$

where $C$ is the rectangle with vertices $3 - i\delta$, $3 + iT$, $-2 + iT$ and $-2 - i\delta$. We cut this rectangle symmetrically at the points $\frac12 - i\delta$ and $\frac12 + iT$. By the functional equation, the integral on the left part of the contour equals that on the right part. Therefore $I(T)$ equals twice the integral over the right part of $C$, which we are going to estimate by observing the variation of arguments of each factor in

$$ \Lambda(f, s) = q^{s/2}\gamma(f, s)L(f, s) = \pi^{-ds/2} q^{s/2} \prod_{j=1}^{d} \Gamma\Big(\frac{s+\kappa_j}{2}\Big) L(f, s). $$

The variation of the argument of $\pi^{-ds/2} q^{s/2}$ gives a contribution to the integral over the right part of $C$ (that is, to half of $I(T)$) equal to

$$ \frac{T}{4\pi} \log\frac{q}{\pi^d} + O(1). \tag{5.35} $$

By the Stirling formula (5.113), the argument of $\Gamma(\sigma + it)$ is $t\log t - t + O(1)$ if $t \geq 1$, so the corresponding contribution of the gamma factors is

$$ \frac{1}{2\pi}\Big(\frac{dT}{2}\log\frac{T}{2} - \frac{dT}{2}\Big) + O(\log q(f)). \tag{5.36} $$

For $L(f, s)$ on the vertical segment from $3 - i\delta$ to $3 + iT$, we estimate

$$ |\log L(f, s)| = \Big|\sum_{n \geq 2} \frac{\Lambda_f(n)}{\log n}\, n^{-s}\Big| \ll \log q(f) $$

as in (5.31), so the integral on this segment is $\ll \log q(f)$. Then on the remaining horizontal parts from $\frac12 - i\delta$ to $3 - i\delta$ and $3 + iT$ to $\frac12 + iT$, we appeal to (5.28) and (5.27) to get the estimate $O(\log q(f, iT))$. By (5.34), (5.35) and (5.36) applied to $L(f, s)$ and $L(\bar f, s)$, we get (5.33). □

REMARKS. The main term of (5.33) comes out from our computation of the variation in the argument of $q^{s/2}\gamma(f, s)$ along certain (right) segments, the contribution from the L-function itself being small (well, only from the relevant segments). This illustrates how the L-function is intrinsically connected with the companion at the infinite place.

5.4. The zero-free region.

Analytic applications of L-functions make use of complex integration in a region free of zeros of $L(f, s)$ for various $f$. The results are stronger if the zero-free region spreads deeper into the critical strip. The basic method to cut one is still that of Hadamard and de la Vallée Poussin. It is remarkable that this method also reappears in Deligne's second proof of the Riemann Hypothesis for varieties over finite fields [De2] (the basic trigonometric inequality (5.69) is misprinted there). We present the traditional arguments, but in somewhat more elegant fashion, starting with a useful general lemma of Goldfeld, Hoffstein and Lieman [GHL].

LEMMA 5.9. Let $L(f, s)$ be an L-function of degree $d$ with $\operatorname{Re}(\Lambda_f(n)) \geq 0$ for $(n, q(f)) = 1$. Suppose that at ramified primes $|\alpha_j(p)| \leq p/2$. Then $L(f, 1) \neq 0$. Let $r \geq 0$ be the order of the pole of $L(f, s)$ at $s = 1$. There exists an absolute constant $c > 0$ such that $L(f, s)$ has at most $r$ real zeros in the interval

$$ s \geq 1 - \frac{c}{d(r+1)\log q(f)}. $$

PROOF. Let $\beta_j$ be zeros of $L(f, s)$ in the segment $\frac12 \leq \beta_j \leq 1$. By (5.28) for $s = \sigma > 1$ we get

$$ \sum_{j} \frac{1}{\sigma - \beta_j} < \frac{r}{\sigma - 1} + \operatorname{Re}\frac{L'}{L}(f, \sigma) + O(\log q(f)) \tag{5.37} $$

on ignoring zeros other than the $\beta_j$'s, by positivity since

$$ \operatorname{Re}\frac{1}{\sigma - \rho} = \frac{\sigma - \beta}{|\sigma - \rho|^2} \geq 0 $$

for any zero $\rho = \beta + i\gamma$ in (5.28). Since $\operatorname{Re}(\Lambda_f(n)) \geq 0$ for $(n, q(f)) = 1$ we have

$$ \operatorname{Re}\frac{L_{nr}'}{L_{nr}}(f, \sigma) \leq 0 \tag{5.38} $$

where $L_{nr}(f, s)$ is the Euler product of $L(f, s)$ restricted to unramified primes. We estimate the contribution of ramified primes by

$$ \Big|\sum_{p \mid q(f)} \sum_{1 \leq j \leq d} \frac{\alpha_j(p)p^{-\sigma}\log p}{1 - \alpha_j(p)p^{-\sigma}}\Big| \leq d \sum_{p \mid q(f)} \log p \leq d\log q(f). $$

Hence by (5.37)

$$ \sum_{j} \frac{1}{\sigma - \beta_j} < \frac{r}{\sigma - 1} + O(d\log q(f)) $$

where $r$ is the order of the pole or zero of $L(f, s)$ at $s = 1$. The above inequality shows that $\beta_j = 1$ is not possible (in this case $r < 0$) so we conclude that $L(f, 1) \neq 0$, in other words $r \geq 0$. Suppose $\beta_j > 1 - c(d(r+1)\log q(f))^{-1}$ for $1 \leq j \leq n$. We choose $\sigma = 1 + 2c(d\log q(f))^{-1}$ and get

$$ \frac{n\, d\log q(f)}{2c + c/(r+1)} < \Big(\frac{r}{2c} + O(1)\Big) d\log q(f) $$

which implies $n < r + r/2(r+1) + O(c)$ and $n \leq r$ if $c$ is small enough. □

THEOREM 5.10. Let $L(f, s)$ be an L-function of degree $d$ such that the Rankin-Selberg convolutions $L(f \otimes f, s)$ and $L(f \otimes \bar f, s)$ exist, and the latter has a simple pole at $s = 1$ while the former is entire if $f \neq \bar f$. Suppose that at the ramified primes $|\alpha_j(p)|^2 \leq p/2$. There exists an absolute constant $c > 0$ such that $L(f, s)$ has no zeros in the region

$$ \sigma \geq 1 - \frac{c}{d^4\log(q(f)(|t| + 3))} \tag{5.39} $$

except possibly for one simple real zero $\beta_f < 1$, in which case $f$ is self-dual.

PROOF. For $t \in \mathbb{R}$, let $L(g, s)$ be the L-function

$$ L(g, s) = \zeta(s)\, L(f, s+it)^2\, L(\bar f, s-it)^2\, L(f \otimes f, s+2it)\, L(\bar f \otimes \bar f, s-2it)\, L(f \otimes \bar f, s)^2 $$

of degree $(1 + 2d)^2$.

Its analytic conductor satisfies

(5.40)

by (5.11). The product is arranged so that its Dirichlet series coefficients are real, non-negative. It is sufficient for us to check this at unramified places. For an unramified prime $p$, the local Euler factor at $p$ for $L(g, s)$ is of the form (5.1) with "roots" 1 with multiplicity one, $\alpha_j p^{it}$ and $\bar\alpha_j p^{-it}$ with multiplicity two, $\alpha_j\bar\alpha_k$ with multiplicity two, $\alpha_j\alpha_k p^{2it}$ and $\bar\alpha_j\bar\alpha_k p^{-2it}$ with multiplicity one, where $\alpha_j$, $\alpha_k$ run over the roots (5.1) for $L(f, s)$. Therefore for any $k \geq 1$, the sum of the $k$-th powers of these roots is

$$ \Big|1 + \sum_{j} \alpha_j^k p^{kit} + \sum_{j} \bar\alpha_j^k p^{-kit}\Big|^2 \geq 0 $$

so that $\Lambda_g(n) \geq 0$ for any $n$ coprime with $q(f)$.
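The positivity argument above can be checked numerically. The sketch below builds the list of local roots of $L(g, s)$ exactly as described in the proof, sums their $k$-th powers, and compares the result with the closed form $|1 + A + \bar A|^2$, where $A = \sum_j \alpha_j^k p^{kit}$. The helper names (`local_roots_of_g`, `power_sum`, `closed_form`) and the sample roots are our own illustrative choices, not part of the text.

```python
import cmath
import math

def local_roots_of_g(alphas, p, t):
    """Local roots of L(g, s) at an unramified prime p, with multiplicity."""
    u = cmath.exp(1j * t * math.log(p))          # u = p^{it}, |u| = 1
    roots = [1 + 0j]                             # the root 1 coming from zeta(s)
    for a in alphas:
        roots += [a * u] * 2                     # alpha_j p^{it}, multiplicity two
        roots += [a.conjugate() / u] * 2         # conj(alpha_j) p^{-it}, multiplicity two
    for a in alphas:
        for b in alphas:
            roots += [a * b.conjugate()] * 2     # alpha_j conj(alpha_k), multiplicity two
            roots.append(a * b * u * u)          # alpha_j alpha_k p^{2it}
            roots.append((a * b).conjugate() / (u * u))  # conjugate root with p^{-2it}
    return roots

def power_sum(roots, k):
    """Sum of the k-th powers of the given roots."""
    return sum(r ** k for r in roots)

def closed_form(alphas, p, t, k):
    """The square |1 + A + conj(A)|^2 with A = sum_j alpha_j^k p^{kit}."""
    A = sum(a ** k for a in alphas) * cmath.exp(1j * k * t * math.log(p))
    return abs(1 + A + A.conjugate()) ** 2

# Sample data (hypothetical): degree d = 2, roots on the unit circle.
sample = [cmath.exp(0.7j), cmath.exp(-2.1j)]
roots = local_roots_of_g(sample, 5, 3.3)
print(len(roots), power_sum(roots, 1).real, closed_form(sample, 5, 3.3, 1))
```

The number of roots equals $(1 + 2d)^2$, in agreement with the stated degree of $L(g, s)$.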


Let $\rho = \beta + i\gamma$ be a zero of $L(f, s)$ with $\beta \geq \frac12$ and $\gamma \neq 0$. Then we take $t = \gamma$ so $L(g, s)$ has a pole at $s = 1$ of order $\leq 3$ whereas $\beta$ is a zero of $L(g, s)$ of order $\geq 4$. Hence by Lemma 5.9 we have

$$ \beta < 1 - \frac{c}{d^2\log q(g)} < 1 - \frac{c'}{d^4\log(q(f)(|\gamma| + 3))} \tag{5.41} $$

for some absolute constants $c > 0$ and $c' > 0$.

Now consider real zeros of $L(f, s)$. For this purpose we take the product $L(g, s)$ with $t = 0$. Any real zero of $L(f, s)$ is a zero of $L(g, s)$ of multiplicity at least 4. On the other hand, the point $s = 1$ is a pole of $L(g, s)$ of order 3 if $f \neq \bar f$ and of order 5 if $f = \bar f$. Hence by Lemma 5.9 we conclude the proof, except that we must prove that the possible exceptional zero of a self-dual $L(f, s)$ satisfies $\beta_f < 1$. Consider instead

$$ L(h, s) = \zeta(s)\, L(f, s)^2\, L(f \otimes f, s). $$

Assuming $L(f, 1) = 0$, it follows that $L(h, s)$ is entire since the double zero compensates the poles of $\zeta(s)$ and $L(f \otimes f, s)$. It also has non-negative coefficients, precisely for an unramified prime $p$ and $k \geq 1$ we have (5.42)

so the series (5.43)

has non-negative coefficients. Let $\sigma_0 \leq 1$ be the largest real zero of $L(h, s)$, so the first singularity of $\log L(h, s)$ when $s$ decreases. Note $\sigma_0$ exists since $\zeta(-2) = 0$. By Landau's Lemma (see Lemma 5.56) the series (5.43) converges for any real $\sigma > \sigma_0$, so by (5.42) we have $\log L(h, \sigma) \geq 0$ and $|L(h, \sigma)| \geq 1$. Letting $\sigma \to \sigma_0$, we get a contradiction. □

REMARK 1. If $L(f, s)$ is self-dual one may refine the real zero issue of Theorem 5.10 slightly, arguing with $L(h, s)$ more efficiently. Indeed (5.42) shows that $\Lambda_h(n) \geq 0$ for all $n$ with $(n, q(f)) = 1$. Let $l$ denote the order of the pole of $L(f \otimes f, s)$ at $s = 1$. Then $L(h, s)$ has a pole of order $l + 1$ at $s = 1$ while any real zero of $L(f, s)$ is a zero of $L(h, s)$ of order $\geq 2$. Hence there are at most $\frac{l+1}{2}$ real zeros of $L(f, s)$ in the region (5.39).

REMARK 2. Note that if the Rankin-Selberg L-functions exist, but $L(f \otimes \bar f, s)$ has a pole of order $> 1$ at $s = 1$, it is often the case that $L(f, s)$ is a product of other L-functions. It is then more efficient to apply Theorem 5.10 to those than to $L(f, s)$ itself. For instance, consider the Dedekind zeta function of a quadratic field $\zeta_K(s) = \zeta(s)L(s, \chi)$. Then the Rankin-Selberg L-function is $\zeta(s)^2 L(s, \chi)^2$ which has a double pole at $s = 1$.



EXERCISE 4. Let $L(f, s)$ and $L(g, s)$ be two L-functions of degrees $d$ and $e$ respectively such that $g \neq f$ and $g \neq \bar f$. Assume that Rankin-Selberg L-functions $L(f \otimes g, s)$, $L(f \otimes \bar g, s)$, and hence their duals, exist, and that the local roots at ramified primes satisfy $|\alpha_j(p)|^2 < p/2$. Show that there exists an absolute constant $c > 0$ such that $L(f \otimes g, s)$ has no zero in the region $\sigma \geq$

1

c

- (d + e)41og(q(f)q(9)(/t/

What can you prove iff= 9 or L(9,s) = L(f®9,s

f

+ 3)) ·

= g? [Hint: Use the auxiliary function

+ ~)L(f®!J,s)L(]®9,s)L(/®g,s-

~)

and apply Lemma 5.9.] In general nothing stronger is known than Theorem 5.10. For certain important L-functions, one can do better, individually, and also by considering families. Examples, including the most important one of Dirichlet characters, are discussed in Sections 5.7 and 5.12.

5.5. Explicit formula.

A summation formula somewhat analogous to (5.12) can be derived by integrating the Dirichlet series for the logarithmic derivative of an L-function. For historical reasons, this type of formula is usually called an "explicit formula", emphasizing the link it expresses between sums over the zeros of an L-function and sums of coefficients at prime powers.

THEOREM 5.11. Let $\varphi : \,]0, +\infty[\, \to \mathbb{C}$ be a $C^\infty$ function with compact support, and

$$ \hat\varphi(s) = \int_0^{+\infty} \varphi(x)x^{s-1}\,dx $$

be its Mellin transform. Put $\psi(x) = x^{-1}\varphi(x^{-1})$, which means $\hat\psi(s) = \hat\varphi(1-s)$. Then we have

$$ \sum_{n} \big(\Lambda_f(n)\varphi(n) + \Lambda_{\bar f}(n)\psi(n)\big) = \varphi(1)\log q(f) + r\int_0^{+\infty} \varphi(x)\,dx + \frac{1}{2\pi i}\int_{(1/2)} \Big(\frac{\gamma'}{\gamma}(f, s) + \frac{\gamma'}{\gamma}(\bar f, 1-s)\Big)\hat\varphi(s)\,ds - \sum_{\rho} \hat\varphi(\rho) \tag{5.44} $$

where $\rho$ runs over the zeros of $L(f, s)$ in the strip $0 \leq \sigma \leq 1$ (the non-trivial and the trivial ones) with relevant multiplicity.

PROOF. Start with the first term on the left side of (5.44) using Mellin inversion

$$ \sum_{n} \Lambda_f(n)\varphi(n) = \frac{1}{2\pi i}\int_{(3/2)} -\frac{L'}{L}(f, s)\hat\varphi(s)\,ds. $$
Let $c$ be a small positive number so that $L(f, s)$ has no zeros in $-c \leq \operatorname{Re}(s) < 0$. Move the line of integration to $\operatorname{Re}(s) = -c$, which is legitimate by (5.27). A simple pole is picked up at $s = 1$ with residue $r\int\varphi(x)\,dx$, and at the zeros $s = \rho$ with residues $-m\hat\varphi(\rho)$, where $m$ is the multiplicity of $\rho$.

The functional equation implies

$$ -\frac{L'}{L}(f, s) = \log q(f) + \frac{\gamma'}{\gamma}(f, s) + \frac{\gamma'}{\gamma}(\bar f, 1-s) + \frac{L'}{L}(\bar f, 1-s) $$

for $\sigma = -c$. Integrating over $\sigma = -c$, the factor $\log q(f)$ gives the first term on the right side of (5.44) by Mellin inversion. Moreover, by absolute convergence

$$ \frac{1}{2\pi i}\int_{(-c)} \frac{L'}{L}(\bar f, 1-s)\hat\varphi(s)\,ds = -\sum_{n \geq 1} \Lambda_{\bar f}(n)\, \frac{1}{2\pi i}\int_{(-c)} \hat\varphi(s)n^{s-1}\,ds = -\sum_{n \geq 1} \Lambda_{\bar f}(n)\psi(n). $$

And finally one can move the line of integration for the gamma factors to $\sigma = \frac12$ getting the integral of gamma factors in (5.44) without residues (the poles of $\frac{\gamma'}{\gamma}(f, s)$ cancel out with those of $\frac{\gamma'}{\gamma}(\bar f, 1-s)$). □

REMARK. Because of the appearance of the conductor $q(f)$ on the right side of (5.44), this formula turns out to provide a good analytic tool for its study. See Theorem 5.32 below for an example. However, it is quite useful for other things. First of all the explicit formula is seen as expressing a sum over primes by the sum over the zeros and vice versa. Certainly there are other explicit formulas connecting primes with zeros.

EXERCISE 5. Let

where $\rho$ runs over the non-trivial zeros of $\zeta(s)$. [Hint: Start as above, but move the line of integration far to the left instead of using the functional equation; the first term comes from the pole at $s = 1$ and the trivial zeros of $\zeta(s)$ at $s = -2k$, $k \geq 1$ an integer.] Sometimes it is more convenient to work with the explicit formula in a Fourier form rather than the Mellin form (5.44). This can be derived from (5.44) by simple substitutions
L(AJ(n)+AJ(n))g(~n) n

=g(O)logq(f)+rh(

i7r)

4

where $\rho = \frac12 + i\gamma$ runs over the zeros of $L(f, s)$ in the strip $0 \leq \sigma \leq 1$ (the non-trivial and the trivial ones) with relevant multiplicity.



5.6. The prime number theorem.

By the Prime Number Theorem for an L-function $L(f, s)$, we mean first the asymptotic behavior of the sum

$$ \psi(f, x) = \sum_{n \leq x} \Lambda_f(n) \tag{5.46} $$

which is essentially the sum of $\lambda_f(p)\log p$ over primes. When $L(f, s) = \zeta(s)$, this amounts indeed to counting primes $p \leq x$. Much of the arithmetic interest lies in the dependency of the error term in asymptotic approximations on extra parameters, notably on the conductor when $f$ varies in a family. Therefore we are seeking results which are explicit in every possible parameter, although not the strongest ones. The strength of results depends on the depth of the zero-free region for $L(f, s)$. The best we can hope for today is that $L(f, s)$ has no zeros in the region (5.39) with at most one exception of a simple zero $\beta_f$ in case $f$ is self-dual. By our definition (only for the purpose of this section) the exceptional zero is in the segment

$$ 1 - \frac{c}{d^4\log(3q(f))} \leq \beta_f < 1 \tag{5.47} $$

where $c$ is the absolute positive constant from (5.39). This definition may cause some discomfort for new enthusiasts of analytic number theory, because it is loose with respect to the constant $c$. However, practice shows that one benefits from such a flexible concept. After all, $c$ can be improved by new tools in the future, not to mention that we don't believe in the existence of the exceptional zero of any L-function.

In full generality we do not yet have a strong bound for the individual coefficients of $L(f, s)$. But a crude bound for $\Lambda_f(n)$ on average will suffice. Specifically, we postulate the following:

$$ \sum_{n \leq x} |\Lambda_f(n)|^2 \ll x\, d^2\log^2(xq(f)) \tag{5.48} $$

for $x \geq 1$, where the implied constant is absolute.

EXERCISE 6. Derive from (5.48) that

$$ \psi(f, x) = \sum_{p \leq x} \lambda_f(p)\log p + O\big(\sqrt{x}\, d^2\log^2(xq(f))\big) \tag{5.49} $$

where the implied constant is absolute.

Recall that the postulated zero-free region (5.39) was established in Theorem 5.10 under the assumption that the Rankin-Selberg convolution $L(f \otimes \bar f, s)$ exists and has a simple pole at $s = 1$. We shall see that the same assumption is also sufficient for the second postulate (5.48). To this end we apply Proposition 5.7 for $L(f \otimes \bar f, s)$ rather than for $L(f, s)$. Combining (5.28) and (5.27) (see also (5.11)) we get

$$ -\frac{L'}{L}(f \otimes \bar f, \sigma) \ll \frac{d^2}{\sigma - 1}\log q(f) $$

for $1 < \sigma \leq 2$, where the implied constant is absolute. This with $\sigma = 1 + (\log 3x)^{-1}$ yields (5.48) by the non-negativity of coefficients.


From (5.48) we derive using Cauchy's inequality that

$$ \sum_{x < n \leq x+y} |\Lambda_f(n)| \ll d\sqrt{xy}\,\log(xq(f)) \tag{5.50} $$

for $x \geq y \geq 1$ where the implied constant is absolute. Now we are ready to state and to prove the main Prime Number Theorem.

THEOREM 5.13. Let $L(f, s)$ be an L-function for which (5.39) is a zero-free region with at most one exceptional real zero $\beta_f$ in the segment (5.47), where $c$ is a positive absolute constant. Suppose (5.48) holds with an absolute implied constant. Then we have

$$ \psi(f, x) = rx - \frac{x^{\beta_f}}{\beta_f} + O\Big(x\exp\Big(\frac{-c\,d^{-4}\log x}{\sqrt{\log x} + 3\log q(f)}\Big)(d\log xq(f))^4\Big), \tag{5.51} $$

for $x \geq 1$, the implied constant being absolute. By convention the term $-x^{\beta_f}/\beta_f$ is not in (5.51) if the exceptional zero does not exist.
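For orientation, take $L(f, s) = \zeta(s)$, so that $r = 1$, there is no exceptional zero, and $\psi(f, x)$ is the classical Chebyshev function $\psi(x) = \sum_{n \leq x} \Lambda(n)$; then (5.51) says that $\psi(x)$ equals $x$ up to a smaller error term. The sketch below computes $\psi(x)$ directly with a sieve so the size of $\psi(x) - x$ can be inspected. The function names are ours, and the code only illustrates the statement; it plays no role in the proof.

```python
import math

def von_mangoldt_table(x):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= x."""
    lam = [0.0] * (x + 1)
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            # Cross out the proper multiples of the prime p.
            for m in range(2 * p, x + 1, p):
                is_prime[m] = False
            # Lambda(p^k) = log p for every prime power p^k <= x.
            pk = p
            while pk <= x:
                lam[pk] = math.log(p)
                pk *= p
    return lam

def chebyshev_psi(x):
    """psi(x) = sum of Lambda(n) over n <= x, for an integer x >= 0."""
    return sum(von_mangoldt_table(x))

for x in (10, 100, 1000, 10000):
    print(x, chebyshev_psi(x), chebyshev_psi(x) - x)
```

Since $\psi(x) = \log \operatorname{lcm}(1, \dots, x)$, the value at $x = 10$ is $\log 2520$, which gives a quick check of the sieve.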

REMARKS. The approximate formula (5.51) has meaning when the error term is smaller than the main term, and this is the case for X)

lc-

1 4

d log(dlogq)

where $q = q(f)$ is the conductor, $d$ is the degree and $c$ is the relevant absolute positive constant. One can simplify the error term by compromising slightly the strength of the result. For example, we have

$$ \psi(f, x) = rx - \frac{x^{\beta_f}}{\beta_f} + O\Big(\sqrt{q(f)}\; x\exp\Big(-\frac{c}{2d^4}\sqrt{\log x}\Big)\Big) \tag{5.52} $$

where the implied constant is absolute. Indeed this holds true, because when the error term here is smaller than that in (5.51) the result is trivial. The conditions of Theorem 5.13 are satisfied if $L(f \otimes \bar f, s)$ exists and has a simple pole at $s = 1$, but we do not restrict (5.51) to this case only. Indeed we shall establish the zero-free region (5.39) and we shall see the crude bound (5.48) directly without appealing to the Rankin-Selberg convolution L-function in many classical cases.

PROOF OF THEOREM 5.13. First we smooth things out by writing

$$ \psi(f, x) = \sum_{n} \Lambda_f(n)\phi(n) + O\big(d\sqrt{xy}\,\log(xq(f))\big) $$

where $\phi(z)$ is a function supported on $[0, x+y]$, such that $\phi(z) = 1$ if $1 \leq z \leq x$ and $|\phi(z)| \leq 1$ elsewhere. The parameter $y$ will be chosen later subject to $1 \leq y \leq x$. For example, we take

$$ \phi(z) = \min\Big(\frac{z}{y},\, 1,\, 1 + \frac{x - z}{y}\Big) $$

for $0 \leq z \leq x + y$ and $\phi(z) = 0$ if $z > x + y$. The Mellin transform of $\phi(z)$ satisfies

$$ \hat\phi(s) = \int_0^{x+y} \phi(z)z^{s-1}\,dz \ll \frac{x^\sigma}{|s|}\min\Big(1, \frac{x}{|s|y}\Big) $$

for $s = \sigma + it$, $\frac12 \leq \sigma \leq 2$.



Now we can start with

$$ \sum_{n} \Lambda_f(n)\phi(n) = \frac{1}{2\pi i}\int_{(2)} -\frac{L'}{L}(f, s)\hat\phi(s)\,ds, $$

the integral converging absolutely. We shall be operating in the region

$$ \{s = \sigma + it \mid \sigma \geq 1 - c_1/d^4\log(q(f)(|t| + 3))\} $$

where $c_1 = 2c/3$, or $c_1 = c/3$, depending on whether the exceptional zero is in the right half of the segment (5.47), or elsewhere (with possibility of non-existence). Let $Z$ denote the boundary of this region. The point is that all the zeros of $L(f, s)$ are distanced from $Z$ by at least $c/6d^4\log(q(f)(|t| + 3))$. Hence it follows by (5.27) and (5.28) that for $s \in Z$,

,4

Moving the integration from the line $\operatorname{Re}(s) = 2$ to $Z$ we get by Cauchy's theorem

$$ \sum_{n} \Lambda_f(n)\phi(n) = r\hat\phi(1) - \hat\phi(\beta_f) + \frac{1}{2\pi i}\int_{Z} -\frac{L'}{L}(f, s)\hat\phi(s)\,ds. $$

Here the presence of the exceptional zero term depends on whether we passed the point $s = \beta_f$ or not. For $s = 1$, or $s = \beta_f$ we have

$$ \hat\phi(s) = \int_0^{x} z^{s-1}\,dz + O(y) = \frac{x^s}{s} + O(y) $$

while the contour integral over $Z$ is estimated by

$$ d^4\int_{Z} \frac{x^\sigma}{|s|}\min\Big(1, \frac{x}{|s|y}\Big)\log^2(q(f)(|t| + 3))\,|ds| \ll d^4 x^{\sigma(T)}\log^3(q(f)T) $$

where $T = x/y$ and $\sigma(T) = 1 - c_1/d^4\log(q(f)T)$. Gathering the above results we arrive at

$$ \psi(f, x) = rx - \frac{x^{\beta_f}}{\beta_f} + O\big(d^4(xT^{-1/2} + x^{\sigma(T)})\log^3(xq(f))\big) $$

where $T$ is at our disposal subject to $1 \leq T \leq x$. We choose $T = \exp(\frac13\sqrt{\log x})$ getting (5.51). We still have to address the question of the exceptional zero term. An explanation about its contribution to the obtained formula is needed if the exceptional zero does exist and it lies in the left half of the segment (5.47), but in this case the term $-x^{\beta_f}/\beta_f$ is consumed by the existing error term, so it can be ignored. This completes the proof of Theorem 5.13. □

If one is willing to employ more zeros of $L(f, s)$, then one can derive along the above lines stronger approximations to $\psi(f, x)$.

EXERCISE 7. Assume that $L(f, s)$ satisfies the Ramanujan-Petersson conjecture. Derive (using Perron's formula (5.111)) the following approximate expansion

$$ \psi(f, x) = rx - \sum_{|\gamma| \leq T} \frac{x^\rho - 1}{\rho} + O\Big(\frac{x}{T}(\log x)\log(x^d q(f))\Big) \tag{5.53} $$


where $\rho = \beta + i\gamma$ runs over the zeros of $L(f, s)$ in the critical strip of height up to $T$, with any $1 \leq T \leq x$, and the implied constant is absolute.

EXERCISE 8. Assume that $L(f, s)$ satisfies the Ramanujan-Petersson conjecture and Theorem 5.13 holds for $f$. Prove that $r \leq d$.

5.7. The Grand Riemann Hypothesis.

The Grand Riemann Hypothesis (GRH for short) refers to the following conjectural statement about zeros of L-functions:

GRAND RIEMANN HYPOTHESIS. Let $L(f, s)$ be an L-function. Then all zeros of $L(f, s)$ in the critical strip $0 < \operatorname{Re}(s) < 1$ are on the critical line $\operatorname{Re}(s) = \frac12$.

REMARKS. Needless to say we believe that every L-function (subject to our definition in Section 5.1) satisfies the Grand Riemann Hypothesis. Yet, proving this even for one L-function would be an achievement on a historical scale for human beings. Note that an L-function may have zeros on the line $\operatorname{Re}(s) = 0$ (certainly some trivial zeros, probably not the genuine ones), but as we already proved not on $\operatorname{Re}(s) = 1$. The GRH also ensures that the trivial zeros in the open critical strip are all on the critical line as well; whether they exist is not clear, most likely they do not.

It is very important to consider only L-functions with Euler products. Indeed it is known that Dirichlet series without an Euler product (still civilized ones) have many zeros even in the half-plane of absolute convergence. Arithmetic geometers should not be concerned, because their L-functions originate from an Euler product of some kind. Of course, this easy start makes it harder to investigate L-functions further in analytic aspects. For example, it was only recently that analytic continuation of the Hasse-Weil L-functions of elliptic curves was established by way of modularity. One should not underestimate the analytic continuation of an Euler product beyond the region of absolute convergence, for one often implements deep arithmetic resources in one way or another.

As envisioned by Riemann, this venture into the critical strip illuminates prime numbers and relevant arithmetic objects built on them by revealing their dual companions, the zeros of L-functions. Mysterious as they are today, these zeros (most likely complex transcendental numbers) will eventually be cracked for further analysis. However, for the time being, as far as the GRH goes, we are only sure they are fine tuned to yield surprisingly strong estimates. Yes, the uniformity of estimates in terms of involved parameters which can be derived from the GRH is astonishing. On a negative note this superb uniformity could be a reminder of how slim the prospects are to prove the GRH in our lifetime. No tools of current analytic number theory are capable of treating the questions for which the GRH provides easy answers.

In this section we present a few traditional analytic consequences of the GRH, paying attention to the uniformity in terms of the conductor. First we state a few equivalent facts which help to grasp the analytic meaning of the Riemann hypothesis.

PROPOSITION 5.14. Let $\frac12 \leq \alpha < 1$. The following statements are equivalent for an L-function:

(1) There are neither zeros nor poles of $(s-1)^r L(f, s)$ in $\sigma > \alpha$, where $r$ is a non-negative integer.

(2) The inverse, the logarithmic derivative and the logarithm of $(s-1)^r L(f, s)$, the latter normalized so that $\log L(f, s) \to 0$ as $\sigma \to +\infty$, are holomorphic in $\sigma > \alpha$.



(3) Let $\mu_f(n)$ denote the coefficients in the Dirichlet series expansion for the inverse $L(f, s)^{-1}$. Then

$$ \sum_{n \leq x} \mu_f(n) \ll x^{\alpha + \varepsilon}, \tag{5.54} $$

the implied constant depending on $f$ and $\varepsilon > 0$.

(4) Let $r \geq 0$ be the order of the pole of $L(f, s)$ at $s = 1$. Then

$$ \psi(f, x) = rx + O(x^{\alpha + \varepsilon}), \tag{5.55} $$

the implied constant depending on $f$ and $\varepsilon > 0$.

The Riemann hypothesis asserts that the above statements are true with $\alpha = \frac12$. What is interesting (perhaps somewhat surprising for a beginner) is that the estimate (5.55) is self-improving by passage through zeros. Indeed, first (5.55) implies that the zeros are on the line $\operatorname{Re}(s) = \frac12$, whence one derives by the expansion (5.53) with $T = x$ using the crude bound (5.26) the following

THEOREM 5.15. Assuming the Riemann Hypothesis for $L(f, s)$ and the Ramanujan-Petersson Conjecture for $L(f, s)$ we have

$$ \psi(f, x) = rx + O\big(x^{1/2}(\log x)\log(x^d q(f))\big), \tag{5.56} $$

for $x \geq 1$, the implied constant being absolute.

The Riemann hypothesis shows that $L(f, s)$, the inverse $L(f, s)^{-1}$, the logarithmic derivative $\frac{L'}{L}(f, s)$ and the logarithm $\log L(f, s)$ can be well approximated by extremely short partial sums of the corresponding Dirichlet series uniformly in $\operatorname{Re} s = \sigma \geq \alpha > \frac12$. Clearly this is not possible for $s$ on the critical line, where the zeros are. We begin by considering

$$ -\frac{L'}{L}(f, s) = \sum_{n} \frac{\Lambda_f(n)}{n^s}. $$

Let $\phi(y)$ be a continuous function on $[0, \infty[$ whose Mellin transform $\hat\phi(w)$ satisfies

w(w

(5.57)

+ 1).j,(w) «

1, for w such that - ~ ( Re (w) ( ~-

Suppose $\hat\phi(w)$ has a simple pole at $w = 0$ with residue 1 (normalization). For example, $\phi(y) = e^{-y}$ with $\hat\phi(w) = \Gamma(w)$, or $\phi(y) = \max(1 - y, 0)$ with $\hat\phi(w) = w^{-1}(w+1)^{-1}$.

PROPOSITION

(5.58)

-

. 5.16. For~

~ (!, s) =

L

< Re (s) (

~ we have

A~~n) ¢(i)

n

-r.j,(1-s)Xl-s+L¢(p-s)XP-•+o( logq(J,s)) P (2u- 1)VX for any X)! 1, where the implied constant depends only on that in (5.57).



PROOF. By now it should be a standard argument for a reader to derive the formula (5.58) by contour integration with the error term being equal exactly -1. 27rl

J

L' --L(J,s+w)rj>(w)Xwdw.

(-~)

Indeed this is obtained by starting with the above integral on the line Re (w) = = - ~. The simple poles at w = 0, 1 - s, p - s produce the corresponding terms in (5.58). Now on the line Re (w) = -~ we have Re (s + w) =a-~' and we deduce by (5.26) and (5.27) that ~ and moving it to the line Re (w)

L'

-L(J' s + w)

«

(2a -1)- 1 1ogq(J, s + w).

Integrating this against the bound from (5.57), we get the error term in (5.58). □

The sum over the zeros in (5.58) is estimated directly (without cancellation) by means of (5.26), (5.27) and (5.57) giving

$$ -\frac{L'}{L}(f, s) = \sum_{n} \frac{\Lambda_f(n)}{n^s}\phi\Big(\frac{n}{X}\Big) - r\hat\phi(1-s)X^{1-s} + O\Big(\frac{\log q(f, s)}{2\sigma - 1}X^{\frac12 - \sigma}\Big) \tag{5.59} $$

for any $X \geq 1$, where the implied constant depends only on that in (5.57). Next we are going to estimate the sum of $\Lambda_f(n)$. Essentially any crude bound, but independent of the conductor, like $|\Lambda_f(n)| \leq n$, would suffice for our purpose. Nevertheless, since we already assumed the GRH, there is no point in ignoring the Ramanujan-Petersson conjecture for the local roots, which yields $|\Lambda_f(n)| \leq d\Lambda(n)$. From this we get

$$ \sum_{n} \frac{\Lambda_f(n)}{n^s}\phi\Big(\frac{n}{X}\Big) \ll dX^{1-\sigma} + d\log X. $$

Moreover, the polar term in (5.59) is $-r\hat\phi(1-s)X^{1-s} = r(s-1)^{-1} + O(rX^{1-\sigma} + r\log X)$. Hence (5.59) simplifies to

$$ -\frac{L'}{L}(f, s) = \frac{r}{s-1} + O\Big(\frac{\log q(f, s)}{2\sigma - 1}X^{\frac12 - \sigma} + dX^{1-\sigma} + d\log X\Big). $$

Finally choosing $X = \log^2 q(f, s)$ we obtain

THEOREM 5.17. Assume the Riemann Hypothesis and the Ramanujan-Petersson conjecture for $L(f, s)$. We have

$$ -\frac{L'}{L}(f, s) = \frac{r}{s-1} + O\Big(\frac{d}{2\sigma - 1}(\log q(f, s))^{2 - 2\sigma} + d\log\log q(f, s)\Big) $$

for any $s$ with $\frac12 < \sigma \leq \frac54$, the implied constant being absolute.



COROLLARY 5.18. For $\sigma \geq \frac12 + (\log\log\log q)/(\log\log q)$ we have

$$ -\frac{L'}{L}(f, s) = \frac{r}{s-1} + O\Big(\frac{\log q}{\log\log q}\Big) $$

where $q = q(f, s)$ is the analytic conductor and the implied constant is absolute.
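To see where the threshold for $\sigma$ comes from, note that at $\sigma = \frac12 + (\log\log\log q)/(\log\log q)$ one has $(\log q)^{2-2\sigma} = \log q/(\log\log q)^2$, so the first error term of Theorem 5.17 (leaving the factor $d$ aside) equals $\log q/(2\log\log q \cdot \log\log\log q)$. The sketch below evaluates this term and compares it with $\log q/\log\log q$. It takes $\log q$ as input to avoid overflow, assumes $\log\log\log q \geq \frac12$, and uses function names of our own choosing.

```python
import math

def first_error_term(log_q):
    """(log q)^(2 - 2*sigma) / (2*sigma - 1) at sigma = 1/2 + L3 / L2."""
    L2 = math.log(log_q)        # log log q
    L3 = math.log(L2)           # log log log q
    sigma = 0.5 + L3 / L2
    return log_q ** (2 - 2 * sigma) / (2 * sigma - 1)

def corollary_bound(log_q):
    """log q / log log q, the size of the error term in Corollary 5.18."""
    return log_q / math.log(log_q)

for log_q in (10.0, 1e3, 1e6):
    print(log_q, first_error_term(log_q), corollary_bound(log_q))
```

In each case the first quantity stays below the second, consistent with the corollary.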


To get an estimate for the logarithm of the L-function we integrate the logarithmic derivative along the horizontal line

$$ \log L(f, s) = \log L(f, \tfrac54 + it) - \int_{\sigma}^{5/4} \frac{L'}{L}(f, \alpha + it)\,d\alpha. $$

Applying (5.59) we deduce the following estimates

THEOREM 5.19. Assume the Riemann Hypothesis and the Ramanujan-Petersson conjecture for $L(f, s)$. We have

$$ \log(p_r(s)L(f, s)) \ll \frac{d(\log q(f, s))^{2 - 2\sigma}}{(2\sigma - 1)\log\log q(f, s)} + d\log\log q(f, s) $$

for any $s$ with $\frac12 < \sigma \leq \frac54$, the implied constant being absolute (recall $p_r(s) = (s-1)^r(s+1)^{-r}$).

As immediate consequences of Theorem 5.19 we have lower and upper bounds for $L(f, s)$ in the half-plane $\sigma \geq \frac12 + \varepsilon$:

$$ q(f, s)^{-\varepsilon} \ll p_r(s)L(f, s) \ll q(f, s)^{\varepsilon} $$

with implied constants depending only on $\varepsilon$. The upper bound can be extended up to the critical line by the convexity principle. In particular, we obtain

COROLLARY 5.20. For $s$ with $\operatorname{Re}(s) = \frac12$ and any $\varepsilon > 0$ we have

$$ L(f, s) \ll q(f, s)^{\varepsilon} \tag{5.60} $$

where the implied constant depends only on $\varepsilon$.

The last estimate received considerable attention (it is known as the Lindelöf Hypothesis), because it is sufficient for solving many fundamental problems of arithmetical nature due to its uniformity in the conductor. We just proved that it follows from the Riemann hypothesis. One may think that the Lindelöf hypothesis should be easier to establish, but there is a popular opinion that the Riemann hypothesis must be solved first, the point being that the Riemann hypothesis is linked to more natural mathematical structures. Clearly the Lindelöf conjecture implies the following estimate for sums of coefficients of the L-function (5.61)

L

AJ(n) = xPj(!ogx)

+ o(A (xq(f))")

where Pj(X) is the polynomial of degree r- 1 given by PJ(!og x) =res L(f, s)xs-r. s=l

Some unconditional results, weaker yet useful, are given in Chapter 4 (see Corollary 4.9 for $\zeta(s)L(s, \chi_4)$ and Exercise 7 of Chapter 4 for $\zeta(s)^2$ for instance).


5.8. Simple consequences of GRH.

We illustrate some of the simple arithmetic consequences of GRH, besides the Lindelöf Hypothesis, which has important applications of its own. The first problem is to estimate the order of vanishing of an L-function at a given point on the critical line, in terms of the conductor of $f$. We consider the case of the central point $s = 1/2$. Note that (5.27) implies

$$ \operatorname*{ord}_{s=1/2} L(f, s) \leq m(1, f) \ll \log q(f) $$

unconditionally, which we can improve slightly. PROPOSITION 5.21. Let L(f, s) be an entire £-function satisfying GRH. We have ord L(J, s)

(5.62)

s=l/2

«

log q(J)

log(~ log q (f))

with an absolute implied constant (recall that $\log q(f) > d$).

PROOF. We apply the explicit formula (5.45) to a test function $g(y)$ supported on $[-2Y, 2Y]$ with $0 \leq g(y) \leq 2$ and $g(y) \geq 1$ if $|y| \leq Y$ such that its Fourier transform $h(t)$ is non-negative on $\mathbb{R}$. By positivity we can drop all the zeros $\rho = \frac12 + i\gamma \neq \frac12$ in (5.45) getting

$$ m\,h(0) \leq -2\sum_{n} \operatorname{Re}(\Lambda_f(n))\,n^{-1/2}g(\log n) + O(\log q(f)), $$

where m is the multiplicity of the zero p = ~- On the left side we have h(O) ::=: Y. Since we have IAJ(n)l :( dA(n)n by (5.2), the sum over prime powers non the right side is bounded by

2d

L

1

y

A(n)n2 «de ,

where the implied constant is absolute. Choosing 2Y =log(~ log q(f)) we get

m« with an absolute implied constant.

logq(f) log( j log q (f))

-...,.,:c:...:.:.::..c_ _

0

Proposition 5.21 is optimal although the estimate of the sum over primes is quite crude, which might suggest some improvement is possible. The issue is that this sum is extremely short with the choice of Y, in fact n :( e2Y :( ~ log q(f) and it is conceivable that At(n) does not change sign in this range. If we take L(J, s)k for k -> +oo instead of L(J, s), we see that (5.62) cannot hold without the factor However, in a number of cases, it is possible to do better; see Proposition 5.34 and Exercise 13 below.

l

The problem of detecting sign changes of AJ(n) for very small n is related to another classical application of the Grand Riemann Hypothesis: how many coefficients (in terms of the conductor q(J)) of L(f, s) at primes are required to distinguish L(J, s) among £-functions of a certain class?



PROPOSITION 5.22. Let $L(f,s)$ and $L(g,s)$ be two $L$-functions with the same degree $d$ and gamma factor. Assume that $L(f\otimes\bar f,s)$ and $L(f\otimes\bar g,s)$ exist and the latter is entire, that GRH holds for both and that the local roots at ramified primes for $L(f\otimes\bar f,s)$ and $L(f\otimes\bar g,s)$ are of modulus $\leq 1$. There exists a prime $p \leq C(d\log q(f)q(g))^2$ unramified for $f$ and $g$ such that the local roots of $L(f,s)$ and $L(g,s)$ at $p$ are different. Here $C$ is some absolute positive constant.

PROOF. The point is that by assumption the $L$-function $L(f\otimes\bar f,s)$ has a pole at $s=1$ whereas we assume that $L(f\otimes\bar g,s)$ is entire. Assume the local roots of $f$ and $g$ coincide for all primes $p\leq 2X$ which are unramified for $f$ and $g$. Then $\Lambda_{f\otimes\bar g}(n) = \Lambda_{f\otimes\bar f}(n)$ for $n\leq 2X$ such that $(n,q(f)q(g))=1$. By the explicit formula (5.44) applied to a function of the type $\varphi(n/X)$ where $\varphi\geq 0$ is smooth, compactly supported on $[1,2]$, and $\varphi\neq 0$, we have

$$\sum_n \Lambda_{f\otimes\bar g}(n)\varphi(n/X) \ll \sqrt{X}\log q(f\otimes\bar g),$$

$$\sum_n \Lambda_{f\otimes\bar f}(n)\varphi(n/X) = \hat\varphi(0)X + O\bigl(\sqrt{X}\log q(f\otimes\bar f)\bigr),$$

where $\hat\varphi(0) = \int\varphi(t)\,dt > 0$. These estimates are smooth versions of the prime number theorem for the Rankin-Selberg convolution $L$-functions. We used GRH to estimate the sum over zeros. On the other hand, the two left sides coincide by assumption up to the contribution from ramified primes. This contribution is small, precisely it is

$$\ll \sum_{\substack{n\leq 2X\\ (n,q(f)q(g))\neq 1}} \bigl(|\Lambda_{f\otimes\bar f}(n)| + |\Lambda_{f\otimes\bar g}(n)|\bigr) \ll d^2(\log q(f)q(g))\log X.$$

Therefore $\hat\varphi(0)X \ll \sqrt{X}\log q(f\otimes\bar f)q(f\otimes\bar g) + d^2(\log q(f)q(g))\log X$, or $X \ll (d\log q(f)q(g))^2$ where the implied constant is absolute (recall that the definition of the Rankin-Selberg convolution implies the bound $q(f\otimes g)\leq (q(f)q(g))^d$). □

REMARK. The best general unconditional result is much weaker. Suppose $L(f,s)$ has degree $d$ with roots $|\alpha_i(p)| < p^{1/4}$ and that it admits $L(f\otimes\bar f,s)$. Then for every $\varepsilon>0$ there exists $C>0$, depending on $\varepsilon$ and on $f$, with the following property: If $L(g,s)$ has the same gamma factor as $L(f,s)$ (so the same degree), if $L(f\otimes g,s)$ is entire and if $\lambda_g(p) = \lambda_f(p)$ for all $p\leq Cq(g)^{d/2+\varepsilon}$, then $g=f$. To prove this, notice that if $X\geq 1$ is the largest integer for which $\lambda_f(p)=\lambda_g(p)$ for $p\leq X$, then by multiplicativity we have $\lambda_f(n)=\lambda_g(n)$ for $n\mid N(X)$ where $N(X)$ is the product of primes $\leq X$. For $h=f$, or $h=g$, one can factor

$$L(f\otimes h,s) = L^{\flat}(f\otimes h,s)H(f,h,s)$$

where $L^{\flat}(f\otimes h,s)$ is the Dirichlet series restricted to squarefree integers and $H(f,h,s)$ converges absolutely for $\sigma>\tfrac12$. In particular, $L^{\flat}(f\otimes h,s)$ is entire if $h=g$ and has a pole at $s=1$ for $h=f$, and satisfies the same convexity bound as the Rankin-Selberg $L$-function. We get by complex integration on the line $\sigma=\tfrac12+\delta$, with $\delta>0$ small enough,

$$\sum_{n\geq 1}{}^{\flat}\,\lambda_{f\otimes h}(n)\varphi(n/X) = \delta(f,h)XP(\log X) + O\bigl(X^{\frac12+\varepsilon}q(f\otimes g)^{\frac14+\varepsilon}\bigr)$$

where $r$ is the order of the pole of $L(f\otimes\bar f,s)$ at $s=1$ and $P\neq 0$ is the polynomial of degree $r-1$ given by

$$P(\log X) = \operatorname*{res}_{s=1} L^{\flat}(f\otimes h,s)X^{s-1}.$$

Comparing the formula for $h=f$ and that for $h=g$ yields the result.

5.9. The Riemann zeta function and Dirichlet L-functions.

The Riemann zeta function $\zeta(s)$ is a self-dual $L$-function, with conductor 1, gamma factor $\gamma(s) = \pi^{-s/2}\Gamma(s/2)$ and root number 1. Its Rankin-Selberg square is also $\zeta(s)$. More generally, let $\chi$ be a primitive Dirichlet character modulo $q$. The Dirichlet $L$-function $L(s,\chi)$ is an $L$-function of degree 1 with conductor $q$, gamma factor

$$\gamma(s) = \pi^{-s/2}\Gamma\Bigl(\frac{s+\delta}{2}\Bigr),$$

where $\delta=0$ if $\chi(-1)=1$ and $\delta=1$ if $\chi(-1)=-1$, and its root number is $\varepsilon(\chi) = i^{-\delta}\tau(\chi)/\sqrt{q}$, where

$$\tau(\chi) = \sum_{x \,(\mathrm{mod}\, q)} \chi(x)e\Bigl(\frac{x}{q}\Bigr)$$

is the Gauss sum associated to $\chi$ (see Section 3.4). If $\chi$ is non-trivial, then $L(s,\chi)$ is entire, otherwise it has a simple pole at $s=1$ with residue 1. Any $L(s,\chi)$ satisfies trivially the Ramanujan-Petersson conjecture. The analytic conductor is given by $q(\chi,s) = q(|it+\delta|+3) \asymp q(|t|+3)$ and $q(\chi) = (2+\delta)q \asymp q$; see Theorem 4.15.

A Dirichlet $L$-function $L(s,\chi)$ is self-dual if and only if $\chi$ is quadratic. In this case $\varepsilon(\chi)=1$ by Theorem 3.3. Any two Dirichlet characters $\chi_1$ and $\chi_2$ admit a Rankin-Selberg convolution given simply by $L(\chi_1\otimes\chi_2,s) = L(\chi_3,s)$, where $\chi_3$ is the primitive character that induces the product $\chi_1\chi_2$. We have $\chi_3=\chi_1\chi_2$ if, for instance, the conductors $q_i$ of $\chi_i$ are coprime, but not in general. In particular, the Rankin-Selberg square of $L(s,\chi)$ is $\zeta(s)$ and the finite product (5.10) is

Fixing a modulus $q$ and considering all primitive characters modulo $q$ gives an example of a "family" of $L$-functions. Fixing $X\geq 2$ and considering all primitive quadratic characters of conductor $q\leq X$ gives another family. By (5.22) we deduce the convexity bound for Dirichlet $L$-functions.

THEOREM 5.23. Let $\chi$ modulo $q$ be a primitive Dirichlet character. We have

(5.63)
$$L(s,\chi) \ll (q|s|)^{1/4}$$

for $\operatorname{Re} s = \tfrac12$, the implied constant being absolute.
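The root number of a quadratic character described above can be checked numerically. The following is a minimal sketch, assuming the normalisation $\varepsilon(\chi) = i^{-\delta}\tau(\chi)/\sqrt{q}$ and taking $\chi$ to be the Legendre symbol modulo an odd prime; the function names are illustrative and not taken from the text.

```python
import cmath
import math


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def gauss_sum(p):
    """tau(chi) = sum over x mod p of chi(x) e(x/p), chi the Legendre symbol."""
    return sum(legendre(x, p) * cmath.exp(2j * math.pi * x / p) for x in range(p))


def root_number(p):
    """epsilon(chi) = i^(-delta) tau(chi) / sqrt(p), delta = 0 or 1 by parity of chi."""
    delta = 0 if legendre(-1, p) == 1 else 1
    return (-1j) ** delta * gauss_sum(p) / math.sqrt(p)


for p in (5, 7, 11, 13):
    print(p, abs(gauss_sum(p)), root_number(p))
```

For each prime tried, the modulus of the Gauss sum should agree with $\sqrt{p}$ and the root number with 1, up to floating-point error.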

The following exercise provides very simple estimates which are quite handy anyway.

EXERCISE 9. Let $\chi$ modulo $q$ be a non-trivial Dirichlet character. Show that for $0\leq\sigma\leq 1$ we have

(5.64)

for $k\geq 1$, the implied constant depending only on $k$.

[Hint: Use the bound $\bigl|\sum_{n\leq x}\chi(n)\bigr| \leq \min(x,q)$.]
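The bound in the hint can be tested directly on small examples. The sketch below assumes $\chi$ is the Legendre symbol modulo an odd prime $p$ (one convenient non-trivial character) and checks $|\sum_{n\le x}\chi(n)| \le \min(x,p)$ for every $x$ up to $3p$; the helper names are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def check_trivial_bound(p):
    """Return True if |sum_{n<=x} chi(n)| <= min(x, p) for all 1 <= x <= 3p."""
    total = 0
    for x in range(1, 3 * p + 1):
        total += legendre(x, p)
        if abs(total) > min(x, p):
            return False
    return True


print(all(check_trivial_bound(p) for p in (3, 5, 7, 11, 13, 101)))
```

The check succeeds for two reasons: each term has modulus at most 1, and the sum over any complete period of a non-trivial character vanishes.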

Theorem 5.8 becomes:

THEOREM 5.24. Let $\chi$ modulo $q$ be a primitive Dirichlet character, $N(T,\chi)$ the number of zeros $\rho=\beta+i\gamma$ of $L(s,\chi)$ in the critical strip $0\leq\beta\leq 1$ with $|\gamma|\leq T$. We have

$$N(T,\chi) = \frac{T}{\pi}\log\frac{qT}{2\pi e} + O(\log q(T+3)),$$

with an absolute implied constant.

In addition to the general explicit formula of Theorem 5.11, it is sometimes more practical to work with the following approximate formula for

$$\psi(x,\chi) = \sum_{n\leq x}\Lambda(n)\chi(n).$$

PROPOSITION 5.25. For any character we have

(5.65)
$$\psi(x,\chi) = \delta_\chi x - \sum_{\substack{L(\rho,\chi)=0\\ |\operatorname{Im}(\rho)|\leq T}} \frac{x^\rho-1}{\rho} + O\Bigl(\frac{x}{T}(\log xq)^2\Bigr)$$

where $\delta_\chi=1$ if $\chi=\chi_0$ and $\delta_\chi=0$ otherwise, $1\leq T\leq x$ and the implied constant is absolute.

PROOF. This is a special case of (5.53) for primitive characters. However, (5.65) also holds as it is for $\chi$ not primitive. Indeed, suppose $\chi^*\,(\mathrm{mod}\ q^*)$ with $q^*\mid q$ is the primitive character which induces $\chi\,(\mathrm{mod}\ q)$. Then

$$|\psi(x,\chi)-\psi(x,\chi^*)| \leq \sum_{\substack{p^\nu\leq x\\ p\mid q}}\log p \ll (\log xq)^2.$$

Moreover, the zeros of $L(s,\chi)$ agree with those of $L(s,\chi^*)$ except for the ones coming from the product

$$\prod_{p\mid q,\ p\nmid q^*}(1-p^{-s})$$

which are $\rho = 2\pi i\ell/\log p$. Hence these zeros contribute at most

$$\sum_{p\mid q}\ \sum_{2\pi|\ell|/\log p\,\leq T}\Bigl|\frac{x^\rho-1}{\rho}\Bigr| \ll (\log xq)^2.$$

The above corrections are absorbed by the error term in (5.65), completing the proof. □

By (5.65) we derive a strong approximation to

$$\psi(x;q,a) = \sum_{\substack{n\leq x\\ n\equiv a\,(\mathrm{mod}\ q)}}\Lambda(n)$$

in terms of zeros of $L$-functions. First by the orthogonality of characters we have

$$\psi(x;q,a) = \frac{1}{\varphi(q)}\sum_{\chi\,(\mathrm{mod}\ q)}\bar\chi(a)\psi(x,\chi).$$

Applying (5.65) we obtain

(5.66)
$$\psi(x;q,a) = \frac{x}{\varphi(q)} - \frac{1}{\varphi(q)}\sum_{\chi\,(\mathrm{mod}\ q)}\bar\chi(a)\sum_{\substack{L(\rho,\chi)=0\\ |\operatorname{Im}(\rho)|\leq T}}\frac{x^\rho-1}{\rho} + O\Bigl(\frac{x}{T}(\log xq)^2\Bigr)$$

for $q\geq 1$, $(a,q)=1$ and $1\leq T\leq x$, with an absolute implied constant.

Theorem 5.10 gives the zero-free region for Dirichlet $L$-functions, except for the possible exceptional zero $\beta_\chi$ for a real primitive character $\chi$. Because of its importance we include here the slightly different traditional proof.

THEOREM 5.26. There exists an absolute constant $c>0$ such that for any primitive Dirichlet character $\chi$ modulo $q$, $L(s,\chi)$ has at most one zero in the region

(5.67)
$$\sigma \geq 1 - \frac{c}{\log q(|t|+3)}.$$

The exceptional zero may occur only if $\chi$ is real, and it is then a simple real zero, say $\beta_\chi$, with

(5.68)
$$1 - \frac{c}{\log 3q} \leq \beta_\chi < 1.$$
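The orthogonality relation used to pass from $\psi(x,\chi)$ to $\psi(x;q,a)$ can be illustrated numerically. This is a minimal sketch restricted to an odd prime modulus $q$ (an assumption made only to keep the construction of characters short): characters are built from a primitive root, and the average of $\bar\chi(a)\chi(n)$ over all $\chi$ is compared with the indicator of $n\equiv a \pmod q$. All names are illustrative.

```python
import cmath
import math


def primitive_root(q):
    """Smallest primitive root modulo an odd prime q (brute force)."""
    for g in range(2, q):
        if len({pow(g, k, q) for k in range(1, q)}) == q - 1:
            return g
    raise ValueError("no primitive root found; q must be an odd prime")


def characters(q):
    """All q - 1 Dirichlet characters modulo an odd prime q, as functions."""
    g = primitive_root(q)
    log = {pow(g, k, q): k for k in range(q - 1)}  # discrete logarithm table

    def make(j):
        def chi(n):
            if n % q == 0:
                return 0j
            return cmath.exp(2j * math.pi * j * log[n % q] / (q - 1))
        return chi

    return [make(j) for j in range(q - 1)]


def indicator_via_characters(q, a, n):
    """(1/phi(q)) * sum over chi of conj(chi(a)) * chi(n), with phi(q) = q - 1."""
    return sum(chi(a).conjugate() * chi(n) for chi in characters(q)) / (q - 1)


print([round(indicator_via_characters(7, 3, n).real) for n in range(1, 15)])
```

The printed list should be 1 exactly at the positions $n\equiv 3 \pmod 7$ and 0 elsewhere; weighting by $\Lambda(n)$ and summing over $n\le x$ gives the decomposition of $\psi(x;q,a)$.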

PROOF. Assume $\chi$ is non-trivial (the case of $\zeta(s)$ is similar, but it needs a change of vocabulary because our definition of $L$-functions does not include $\zeta(s+it)$ with $t\neq 0$ for the reason that it has a pole at $s=1-it\neq 1$). Then consider the $L$-function of degree 8

$$L(f,s) = \zeta(s)^3L(s+it,\chi)^4L(s+2it,\chi_2)$$

for some $t\in\mathbb{R}$ where $\chi_2$ is the primitive character modulo $q_2\mid q$ which induces $\chi^2$. By the trigonometric inequality of de la Vallée Poussin

(5.69)
$$3|z| + 4\operatorname{Re}(z) + \operatorname{Re}(z^2) = |z|(3+4\cos\theta+\cos 2\theta) = 2|z|(1+\cos\theta)^2 \geq 0$$

for $z=|z|\exp(i\theta)$, it follows that $\operatorname{Re}(\Lambda_f(n))\geq 0$ for all $n\geq 1$ such that $(n,q)=1$. In fact, a simple computation using the definition of $\chi_2$ shows that this remains valid for all $n\geq 1$.

Let $\rho=\beta+i\gamma$ be a zero of $L(s,\chi)$ with $\beta\geq\tfrac12$. If $\chi$ is not real, or if $\gamma\neq 0$, then taking $t=\gamma$ it follows that $L(f,s)$ has a pole of order 3 at $s=1$ while $\beta$ is a real zero of order $\geq 4$ of $L(f,s)$. Hence by Lemma 5.9 we have

$$\beta < 1 - \frac{c_2}{\log q(f)} < 1 - \frac{c}{\log(q(|\gamma|+3))}$$

with an absolute positive constant $c$. If $\chi^2=1$ and $\rho=\beta$ is real we take $t=0$, then $L(f,s)$ has a pole of order 4 at $s=1$. By Lemma 5.9 again, $L(f,s)$ can have at most four real zeros with multiplicity satisfying the above inequality, which means that $\beta$ must be a simple zero of $L(s,\chi)$. □

REMARK. One can also recover that $\beta<1$ by Theorem 2.1.

As an application of Theorem 5.13, one derives the following prime number theorem.

THEOREM 5.27. Let $\chi$ modulo $q$ be a primitive Dirichlet character. We have

(5.70)
$$\sum_{n\leq x}\chi(n)\Lambda(n) = \delta_\chi x - \frac{x^{\beta_\chi}}{\beta_\chi} + O\Bigl(x\exp\Bigl(\frac{-c\log x}{\sqrt{\log x}+\log q}\Bigr)(\log q)^4\Bigr)$$

for any $x\geq 1$, with some absolute effective constant $c>0$, the implied constant being absolute. For any $q\geq 1$ and $(a,q)=1$ we have

(5.71)
$$\sum_{\substack{n\leq x\\ n\equiv a\,(\mathrm{mod}\ q)}}\Lambda(n) = \frac{x}{\varphi(q)} - \frac{\chi(a)x^{\beta_\chi}}{\varphi(q)\beta_\chi} + O\Bigl(x\exp\Bigl(\frac{-c\log x}{\sqrt{\log x}+\log q}\Bigr)(\log q)^4\Bigr)$$

for $x\geq 1$, where $\chi\,(\mathrm{mod}\ q)$ is the possible exceptional real character modulo $q$ and $\beta_\chi$ the corresponding exceptional zero.

Have in mind that there could be one or two real primitive characters modulo $q$; for example, if $q=8r$ with $r$ odd squarefree we have two such characters $\chi_8\chi_r$ and $\chi_4\chi_8\chi_r$. Nevertheless, at most one of these may give an exceptional zero. In fact, the exceptional characters are very rare, and the exceptional zero itself can be somewhat controlled.

THEOREM 5.28. (1) (Landau) Let $\chi_1$ and $\chi_2$ be distinct real primitive characters modulo $q_1$ and $q_2$. Assume $L(s,\chi_1)$ and $L(s,\chi_2)$ have real zeros $\beta_1$ and $\beta_2$ respectively. There exists an absolute constant $c>0$ such that

(5.72)

(2) (Siegel) For any primitive real Dirichlet character $\chi$ modulo $q$, we have

(5.73)
$$\beta_\chi \leq 1 - \frac{c(\varepsilon)}{q^{\varepsilon}}$$

for any $\varepsilon>0$, with a constant $c(\varepsilon)>0$ depending only on $\varepsilon$. If $\varepsilon<\tfrac12$, this constant is ineffective, meaning it is not numerically computable.

PROOF. (1) Consider the $L$-function of degree 4

(5.74)
$$L(f,s) = \zeta(s)L(s,\chi_1)L(s,\chi_2)L(s,\chi_1\chi_2)$$

which has conductor at most $(q_1q_2)^2$. Since $\chi_1$, $\chi_2$ are non-trivial and distinct, $L(f,s)$ has a pole of order 1 at $s=1$. Moreover,

$$-\frac{L'}{L}(f,s) = \sum_{n\geq 1}(1+\chi_1(n))(1+\chi_2(n))\Lambda(n)n^{-s}$$

has non-negative coefficients. By Lemma 5.9, there exists $c>0$ such that $L(f,s)$ has at most one real zero with $\beta\geq 1-c/\log q(f)$. Since $q(f)\leq 9(q_1q_2)^2$, this gives (5.72).
(2) For the proof of (5.73), we follow the nice argument of Goldfeld [Gol]. Let $\chi_1$, $\chi_2$ be (for the moment) arbitrary primitive real characters modulo $q_1$ and $q_2$ and consider again the $L$-function (5.74). We have $\lambda_f(n)\geq 0$. Let $\eta(x)$ be a function on $[0,+\infty[$ such that $0\leq\eta\leq 1$, $\eta(x)=1$ for $0\leq x\leq 1$ and $\eta(x)=0$ for $x\geq 2$. Consider the sums

$$Z(\beta,x) = \sum_{n\geq 1}\frac{\lambda_f(n)}{n^{\beta}}\eta\Bigl(\frac{n}{x}\Bigr) \geq 1 \quad (\text{since } \lambda_f(1)=1)$$

for $\tfrac34\leq\beta\leq 1$ and $x\geq 1$. By contour integration we have

$$Z(\beta,x) = \frac{1}{2\pi i}\int_{(2)}L(f,s+\beta)x^s\hat\eta(s)\,ds = L(f,\beta) + L(1,\chi_1)L(1,\chi_2)L(1,\chi_1\chi_2)\hat\eta(1-\beta)x^{1-\beta} + \frac{1}{2\pi i}\int_{(\frac12-\beta)}L(f,s+\beta)\hat\eta(s)x^s\,ds.$$

By (5.63), the last integral is $\ll q_1q_2x^{\frac12-\beta}$. Thus we have

$$L(f,\beta) + L(1,\chi_1)L(1,\chi_2)L(1,\chi_1\chi_2)\hat\eta(1-\beta)x^{1-\beta} \geq 1 + O(q_1q_2x^{\frac12-\beta}).$$

Now assume that $\beta_1\geq\tfrac34$ is a zero of $L(s,\chi_1)$. Then it follows that

$$L(1,\chi_1)L(1,\chi_2)L(1,\chi_1\chi_2)\hat\eta(1-\beta_1)x^{1-\beta_1} \geq 1 + O(q_1q_2x^{\frac12-\beta_1}).$$

We take $x=c(q_1q_2)^4$ with $c$ sufficiently large, so the term on the left is $\geq\tfrac12$. From (5.64) we have $L(1,\chi_1)\ll\log q_1$ and $L(1,\chi_1\chi_2)\ll\log q_1q_2$. Moreover, $\hat\eta(1-\beta_1)\ll(1-\beta_1)^{-1}$ since $\hat\eta(s)$ has a simple pole at $s=0$ with residue 1. Gathering these estimates we get our basic lower bound

(5.75)
$$L(1,\chi_2) \gg (1-\beta_1)(q_1q_2)^{-4(1-\beta_1)}(\log q_1q_2)^{-2}.$$

We will use (5.75) to prove (5.73), using the hypothetical zero $\beta_1$ of $L(s,\chi_1)$ as a tool to control other characters. Let $0<\varepsilon\leq\tfrac14$. If all $L(s,\chi)$ for $\chi$ real modulo $q$ do not vanish on the real segment $s\geq 1-\varepsilon$, then (5.73) is obvious (just take $c(\varepsilon)=\varepsilon$). Suppose otherwise that for some $\chi_1$ modulo $q_1$, the $L$-function $L(s,\chi_1)$ has a zero $s=\beta_1\geq 1-\varepsilon$. Then for any other real primitive character $\chi_2$ modulo $q_2$, we can write (5.75) as

$$L(1,\chi_2) \gg q_2^{-4\varepsilon}(\log q_2)^{-2}$$

where the implied constant depends only on $\varepsilon$, since $q_1$, $\chi_1$ and $\beta_1$ are fixed once $\varepsilon$ is chosen. On the other hand, if $\beta_2$ is a real zero of $L(s,\chi_2)$ we have by the mean-value theorem

$$L(1,\chi_2) = (1-\beta_2)L'(\sigma,\chi_2) \ll (1-\beta_2)(\log q_2)^2$$

for some $\sigma$ with $\beta_2\leq\sigma\leq 1$ by (5.64). Combining those two bounds gives $1-\beta_2\gg q_2^{-4\varepsilon}(\log q_2)^{-4}$, hence (5.73) after re-defining $\varepsilon$. □

REMARK. Essentially the bound (5.73) was first proved for a fixed value of $\varepsilon$ by Landau [La2] (also not effective), and with any $\varepsilon>0$ shortly afterwards by Siegel [Sie1]. The ineffectiveness of the Siegel bound (5.73) is apparent in applications. For instance, it is not possible to use it to solve the class number one problem for imaginary quadratic fields. Yes, by Siegel's lower bound $L(1,\chi)\gg\Delta^{-\varepsilon}$ and by the Dirichlet Class Number Formula (2.31), or (22.59), we get

(5.76)
$$h(\mathbb{Q}(\sqrt{-\Delta})) = \frac{w\sqrt{\Delta}}{2\pi}L(1,\chi) \gg \Delta^{\frac12-\varepsilon}$$

for $-\Delta$ a negative fundamental discriminant, $\chi$ being the corresponding Kronecker symbol. Hence $h$ tends to infinity as does $|\Delta|$. But when computing all $\Delta$ with $h(\mathbb{Q}(\sqrt{-\Delta}))=1$ it is not possible to know when to stop. In Chapters 22 and 23, we use much more sophisticated ideas of Goldfeld involving $L$-functions of elliptic curves to prove an effective bound, although much weaker (see Theorem 23.2).
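The difficulty of "knowing when to stop" can be made concrete with a small search. The sketch below computes class numbers by counting reduced primitive positive definite binary quadratic forms, a standard method that is not the one used in the text, and lists the negative fundamental discriminants of class number one in a finite range. The function names and the search bound 2000 are assumptions made for illustration.

```python
import math


def class_number(D):
    """Count reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    h = 0
    b = D % 2                       # b has the same parity as D
    while 3 * b * b <= -D:          # reduced forms satisfy 3 b^2 <= |D|
        ac = (b * b - D) // 4       # product a * c
        a = max(b, 1)
        while a * a <= ac:          # reduced forms satisfy |b| <= a <= c
            if ac % a == 0:
                c = ac // a
                if math.gcd(math.gcd(a, b), c) == 1:
                    # (a, b, c) and (a, -b, c) coincide on the boundary cases
                    h += 1 if (b == 0 or a == b or a == c) else 2
            a += 1
        b += 2
    return h


def is_fundamental(D):
    """True if D is a fundamental discriminant."""
    def squarefree(n):
        n, p = abs(n), 2
        while p * p <= n:
            if n % (p * p) == 0:
                return False
            p += 1
        return True

    if D % 4 == 1:
        return squarefree(D)
    if D % 4 == 0:
        m = D // 4
        return m % 4 in (2, 3) and squarefree(m)
    return False


class_number_one = [D for D in range(-3, -2001, -1)
                    if is_fundamental(D) and class_number(D) == 1]
print(class_number_one)
```

A finite search of this kind can only suggest that the list is complete; an effective lower bound for $h$ is what turns the search into a proof.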

Siegel's estimate implies the Siegel-Walfisz theorem about primes in arithmetic progressions.

COROLLARY 5.29. Let $q\geq 1$ and $A>0$. We have

(5.77)
$$\pi(x;q,a) = \frac{\mathrm{Li}(x)}{\varphi(q)} + O\Bigl(\frac{x}{(\log x)^A}\Bigr),$$

(5.78)
$$\psi(x;q,a) = \frac{x}{\varphi(q)} + O\Bigl(\frac{x}{(\log x)^A}\Bigr),$$

for any $x\geq 2$. Moreover, for any primitive Dirichlet character $\chi$ modulo $q>2$, we have

(5.79)
$$\sum_{p\leq x}\chi(p) \ll \sqrt{q}\,x(\log x)^{-A},$$

(5.80)
$$\sum_{n\leq x}\chi(n)\mu(n) \ll \sqrt{q}\,x(\log x)^{-A}$$

for any $A>0$. The implied constants depend only on $A$, but are ineffective.
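The equidistribution of primes among reduced residue classes expressed by (5.77) is easy to observe experimentally. The following sketch counts primes up to $x$ in each class coprime to $q$ using a plain sieve; the choices $x = 10^5$ and $q = 7$ are arbitrary illustrations and the function names are hypothetical.

```python
from math import gcd


def primes_up_to(x):
    """Sieve of Eratosthenes returning all primes <= x (x >= 1)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, x + 1, p)))
    return [n for n in range(x + 1) if sieve[n]]


def counts_by_residue(x, q):
    """Number of primes <= x in each reduced residue class modulo q."""
    counts = {a: 0 for a in range(1, q) if gcd(a, q) == 1}
    for p in primes_up_to(x):
        if p % q in counts:
            counts[p % q] += 1
    return counts


print(counts_by_residue(100000, 7))
```

Each of the six counts should be close to one sixth of the total number of primes up to $10^5$.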

PROOF. If $q\leq(\log x)^{A+1}$ one gets (5.78) from the prime number formula (5.71) by employing the bound (5.73). For larger $q$ there is nothing to prove, because the formula (5.78) is trivial. Then (5.77) follows from (5.78) by partial summation. Similarly one derives (5.79) from (5.52) by employing the bound (5.73).

The case of (5.80) is a bit harder. One could follow the proof of (5.51) using $L^{-1}$ instead of $L'/L$, but there are serious complications which occur near multiple zeros or clusters of zeros (so far we cannot rule out this scenario). Therefore we are going to derive (5.80) directly from (5.79). First we write

$$\sum_{n\leq x}\chi(n)\mu(n) = 1 - \sum_{m}\chi(m)\mu(m)\sum_{p_m<p\leq x/m}\chi(p)$$

where $p_m$ stands for the largest prime divisor of $m$. Let $r\geq 0$ be the number of prime divisors of $m$. Then the summation conditions imply $m^{1/r}\leq p_m<x/m$, so $m<x^{r/(r+1)}$. Hence, using (5.79) the inner sum over $p$ satisfies

$$\sum_{p_m<p\leq x/m}\chi(p) \ll \frac{\sqrt{q}\,x}{m}\Bigl(\log\frac{x}{m}\Bigr)^{-A} \leq \frac{\sqrt{q}\,x}{m}\Bigl(\frac{r+1}{\log x}\Bigr)^{A} \ll \frac{\tau(m)\sqrt{q}\,x}{m(\log x)^A},$$

because $(r+1)^A\ll 2^r=\tau(m)$. Finally, summing over $m$ trivially we obtain (5.80) with extra factor $(\log x)^2$. This completes the proof by resetting $A$. □

For Dirichlet characters, the problem discussed in Proposition 5.22 becomes that of finding the smallest prime $p$ with $\chi(p)\neq 1$ for a non-trivial character $\chi$ modulo $q>2$, say $p_{\min}(\chi)$. One derives directly from various estimates for short character sums the following bounds:

using respectively the estimates of Pólya-Vinogradov, Burgess, the Lindelöf hypothesis and the Riemann hypothesis. Actually we only get $(\log q)^4$ by (5.56), but with a little extra effort one can squeeze from the GRH the bound $(\log q)^2$. One can also do better unconditionally using Burgess's estimate enhanced by combinatorial arguments (see e.g. [Hl]). Linnik proved (by the large sieve inequality) that the smallest quadratic non-residue for a primitive real character modulo $q\leq N$, $q$ prime, is $\leq N^{\varepsilon}$ with at most finitely many exceptions, the number of exceptions depending only on $\varepsilon>0$. See Section 7.4 for the proof and an analogue for elliptic curves.

The Riemann zeta function is further discussed in Chapters 10, 24, 25, and Dirichlet characters in Chapters 17, 18.

5.10. L-functions of number fields.

Let $K/\mathbb{Q}$ be a number field of degree $d$. The Dedekind zeta function

$$\zeta_K(s) = \sum_{\mathfrak{a}\neq 0}(N\mathfrak{a})^{-s} = \prod_{\mathfrak{p}}(1-(N\mathfrak{p})^{-s})^{-1}$$

defines a self-dual $L$-function of degree $d=[K:\mathbb{Q}]$ with conductor $D=|d_K|$, the absolute value of the discriminant of $K$, and root number $+1$. The gamma factor is

(5.81)
$$\gamma(s) = \pi^{-ds/2}\,\Gamma\Bigl(\frac{s}{2}\Bigr)^{r_1+r_2}\Gamma\Bigl(\frac{s+1}{2}\Bigr)^{r_2}$$

where $r_1$ is the number of real embeddings of $K$ and $r_2$ the number of pairs of complex embeddings so that $d=r_1+2r_2$. The zeta function has a simple pole at $s=1$ with residue

$$\operatorname*{res}_{s=1}\zeta_K(s) = \frac{2^{r_1}(2\pi)^{r_2}hR}{w\sqrt{D}},$$

where $h$ is the class number of $K$, $R$ the regulator and $w$ the number of roots of unity. For a proof, see e.g. [La]. The Ramanujan-Petersson conjecture is obviously satisfied for $\zeta_K(s)$. The analytic conductor satisfies $q(\zeta_K) = 2^{r_1+r_2}3^{r_2}D \leq 3^dD$. There exists an absolute constant $c>1$ such that $D>c^{d-1}$ (see Exercise 10 below) so we have $\log q(\zeta_K)\asymp\log D$.
REMARK. Dedekind zeta functions factor as products of Artin $L$-functions (see Section 5.13). Let $E/\mathbb{Q}$ be the Galois closure of $K$, $G=\mathrm{Gal}(E/\mathbb{Q})$ and $H=\mathrm{Gal}(E/K)$. Let $r_{G/H}$ be the permutation representation of $G$ on the coset space $G/H$. This is the representation induced by the trivial representation of $H$, hence by the invariance of Artin $L$-functions under induction we have $L(r_{G/H},s)=\zeta_K(s)$. One can decompose $r_{G/H}$ in terms of irreducible representations of $G$, $r_{G/H}=\bigoplus_\rho n_\rho\rho$, getting the factorization

(5.82)
$$\zeta_K(s) = \prod_\rho L(s,\rho)^{n_\rho}.$$

In case of a failure of the Artin conjecture (a possibility we cannot yet rule out) this is not a factorization in terms of $L$-functions in the sense of Section 5.1. If $K/\mathbb{Q}$ is abelian, then $E=K$, $H=1$ and $r_{G/H}=\bigoplus\chi$ where $\chi$ runs over a group of Dirichlet characters so that

$$\zeta_K(s) = \prod_\chi L(s,\chi)$$

is a product of $L$-functions for distinct primitive Dirichlet characters of conductors whose product is $D=|d_K|$ and exactly one character is trivial.

We deduce from the above and the results of the previous sections:

THEOREM 5.30. Let $K/\mathbb{Q}$ be a number field of degree $d$. We have

for $0\leq\sigma\leq 1$, the implied constant depending only on $d$ and $\varepsilon$.

THEOREM 5.31. Let $K/\mathbb{Q}$ be a number field of degree $d$. Let $N_K(T)$ be the number of zeros $\rho=\beta+i\gamma$ of $\zeta_K(s)$ in the critical strip $0\leq\beta\leq 1$ with $|\gamma|\leq T$. We have

$$N_K(T) = \frac{T}{\pi}\log\frac{DT^d}{(2\pi e)^d} + O(\log DT^d)$$

for $T\geq 2$, the implied constant being absolute.

One can use the analytic properties of $L$-functions to deduce information on the discriminant of number fields. For instance, the explicit formula yields quite good lower bounds for $D=|d_K|$. This powerful method was implemented by Odlyzko [Od], Poitou [Poi], Serre [Se7] and others. Here is a simple result.

THEOREM 5.32. Assume GRH for the Dedekind zeta functions of number fields. Then we have

(5.83)
$$\liminf\ \frac{1}{d}\Bigl\{\log D + (d-r_2)\frac{\Gamma'}{\Gamma}\bigl(\tfrac14\bigr) + r_2\frac{\Gamma'}{\Gamma}\bigl(\tfrac34\bigr)\Bigr\} \geq \log\pi,$$

as $d=[K:\mathbb{Q}]=r_1+2r_2\to+\infty$.

5. CLASSICAL ANALYTIC THEORY OF L-FUNCTIONS

PROOF.

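Apply the explicit formula (5.44) to $\zeta_K(s)$ with

$$\varphi(x) = x^{-\frac12}e^{-a(\log x)^2}$$

for some $a>0$. Although this test function is not of compact support, it decays very fast at infinity, so (5.44) holds. The Mellin transform is

$$\hat\varphi(\tfrac12+it) = \sqrt{\frac{\pi}{a}}\,e^{-\frac{t^2}{4a}}$$

for $t\in\mathbb{R}$. Note that this is positive (compare with Proposition 5.55) and that $\varphi(x)=x^{-1}\varphi(x^{-1})$. Since $\zeta_K(s)$ is self-dual we get

$$\sum_n\Lambda_K(n)\varphi(n) + \frac12\sum_\rho\hat\varphi(\rho) = \frac12\log D + \frac12\int_0^{+\infty}\varphi(x)\,dx + \frac{1}{2\pi i}\int_{(\frac12)}\frac{\gamma'}{\gamma}(s)\hat\varphi(s)\,ds.$$

By positivity, we derive

(5.84)
$$\frac12\log D + \frac12\int_0^{+\infty}\varphi(x)\,dx + \frac{1}{2\pi i}\int_{(\frac12)}\frac{\gamma'}{\gamma}(s)\hat\varphi(s)\,ds \geq 0.$$

By (5.81) and the Fourier inversion formula we have

(5.85)
$$\frac{1}{2\pi i}\int_{(\frac12)}\frac{\gamma'}{\gamma}(s)\hat\varphi(s)\,ds = -\frac{d}{2}\log\pi + \frac{d-r_2}{2}\cdot\frac{1}{2\pi}\int_{\mathbb{R}}\frac{\Gamma'}{\Gamma}\Bigl(\frac14+\frac{it}{2}\Bigr)\hat\varphi(\tfrac12+it)\,dt + \frac{r_2}{2}\cdot\frac{1}{2\pi}\int_{\mathbb{R}}\frac{\Gamma'}{\Gamma}\Bigl(\frac34+\frac{it}{2}\Bigr)\hat\varphi(\tfrac12+it)\,dt.$$

When $a\to 0$, the function $(2\pi)^{-1}\hat\varphi(\tfrac12+it)$ converges to the Dirac delta at 0, hence

$$\frac{1}{2\pi}\int_{\mathbb{R}}\frac{\Gamma'}{\Gamma}\Bigl(\frac14+\frac{it}{2}\Bigr)\hat\varphi(\tfrac12+it)\,dt \to \frac{\Gamma'}{\Gamma}\bigl(\tfrac14\bigr),\qquad \frac{1}{2\pi}\int_{\mathbb{R}}\frac{\Gamma'}{\Gamma}\Bigl(\frac34+\frac{it}{2}\Bigr)\hat\varphi(\tfrac12+it)\,dt \to \frac{\Gamma'}{\Gamma}\bigl(\tfrac34\bigr).$$

Therefore, dividing (5.84) by $d$, then letting $d\to+\infty$, and then letting $a\to 0$, we derive (5.83). □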
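To see what (5.83) says in concrete terms, one can evaluate the resulting asymptotic lower bound for the root discriminant $D^{1/d}$. The sketch below assumes the reading of (5.83) as $\liminf \frac1d\{\log D + (d-r_2)\frac{\Gamma'}{\Gamma}(\frac14) + r_2\frac{\Gamma'}{\Gamma}(\frac34)\} \ge \log\pi$ and uses the classical closed forms $\frac{\Gamma'}{\Gamma}(\frac14) = -\gamma - \frac{\pi}{2} - 3\log 2$ and $\frac{\Gamma'}{\Gamma}(\frac34) = -\gamma + \frac{\pi}{2} - 3\log 2$, with $\gamma$ Euler's constant. The function name and the parameter `real_fraction` (the limiting ratio $r_1/d$) are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def digamma_quarter():
    """Gamma'/Gamma at 1/4 (closed form)."""
    return -EULER_GAMMA - math.pi / 2 - 3 * math.log(2)


def digamma_three_quarters():
    """Gamma'/Gamma at 3/4 (closed form)."""
    return -EULER_GAMMA + math.pi / 2 - 3 * math.log(2)


def root_discriminant_bound(real_fraction):
    """Asymptotic GRH lower bound for D^(1/d) when r1/d tends to real_fraction."""
    r2_over_d = (1 - real_fraction) / 2
    log_bound = (math.log(math.pi)
                 - (1 - r2_over_d) * digamma_quarter()
                 - r2_over_d * digamma_three_quarters())
    return math.exp(log_bound)


print(root_discriminant_bound(1.0))  # totally real fields
print(root_discriminant_bound(0.0))  # totally complex fields
```

Under these assumptions the two extreme cases reduce to $8\pi e^{\gamma+\pi/2}$ for totally real fields and $8\pi e^{\gamma}$ for totally complex fields.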

Similar arguments can be applied unconditionally with a slightly weaker result using test functions as constructed in Proposition 5.55.

EXERCISE 10. Show that there exists an absolute constant $c>1$ such that $D=|d_K|>c^d$ for any number field $K/\mathbb{Q}$ of degree $d\geq 2$, and, in particular, $D\geq 3$ if $K\neq\mathbb{Q}$ so $K$ has at least one ramified prime. [Hint: Proceed as in Theorem 5.51 below.]

One cannot apply directly Theorem 5.10 because the Rankin-Selberg square of $\zeta_K(s)$ involves convolutions of Artin $L$-functions of $K$ which are not known to be entire, although they are conjectured to be. However, one can adapt the traditional arguments (as in the proof of Theorem 5.26) using $\zeta_K(s)^3\zeta_K(s+it)^4\zeta_K(s+2it)$ and derive the zero-free region, then the prime number theorem for $K$. We phrase it in terms of

$$\Lambda_K(\mathfrak{a}) = \begin{cases}\log N\mathfrak{p} & \text{if } \mathfrak{a}=\mathfrak{p}^k \text{ for some } k\geq 1,\\ 0 & \text{otherwise.}\end{cases}$$

Thus these are coefficients in the Dirichlet series

$$-\frac{\zeta_K'}{\zeta_K}(s) = \sum_{\mathfrak{a}}\Lambda_K(\mathfrak{a})(N\mathfrak{a})^{-s}.$$

THEOREM 5.33. Let $K/\mathbb{Q}$ be a number field of degree $d$. There exists an absolute constant $c>0$ such that $\zeta_K(s)$ has no zero in the region

$$\sigma > 1 - \frac{c}{d^2\log D(|t|+3)^d}$$

except possibly a simple real zero $\beta<1$. We have

for $x\geq 2$, where $c>0$ is an absolute constant, and the term $-x^\beta/\beta$ should be removed if there is no exceptional zero.

Be aware that the zero-free region of Theorem 5.33 is not necessarily the best accessible. Indeed by (5.82), one can factorize $\zeta_K(s)$ and if one knows that the factors are $L$-functions, then zero-free regions for them produce one for $\zeta_K$ which can be much better because of a smaller conductor. For instance, if $K=\mathbb{Q}(\mu_p)$ is the field of $p$-th roots of unity, we have

$$\zeta_K(s) = \zeta(s)\prod_{\chi\neq 1}L(s,\chi)$$

where the product is over non-trivial Dirichlet characters modulo $p$. Thus by Theorem 5.26 and Theorem 5.28 there exists an absolute constant $c>0$ such that $\zeta_K(s)$ has at most one simple real zero $\beta$ in the region

$$\sigma > 1 - \frac{c}{\log p(|t|+3)}$$

whereas $d=p-1$ and $D=|d_K|=p^{p-2}$ so Theorem 5.33 is much weaker.

Confirming that the zeros of real Dirichlet characters are the most difficult to deal with, Stark [St3] has shown that if $K/\mathbb{Q}$ does not contain a quadratic subfield, then there is no exceptional zero of $\zeta_K(s)$.

For Dedekind zeta functions, Proposition 5.21 can be improved.

PROPOSITION 5.34. Let $K/\mathbb{Q}$ be a number field. Assume GRH holds for $\zeta_K(s)$. We have

$$\operatorname*{ord}_{s=\frac12}\zeta_K(s) \ll \frac{\log 3D}{\log\log 3D}$$

where $D=|d_K|$ and the implied constant is absolute.

PROOF. Let $m$ be the order of vanishing at $\tfrac12$. We argue as in the proof of (5.62). In applying the explicit formula (5.45) an additional term arises from the simple pole of $\zeta_K(s)$ at $s=1$, hence we get

$$mh(0) + \sum_{\rho\neq\frac12}h\Bigl(\frac{\gamma}{2\pi}\Bigr) = \int_{-\infty}^{+\infty}g(y)e^{y/2}\,dy - 2\sum_{\mathfrak{a}}\frac{\Lambda_K(\mathfrak{a})}{\sqrt{N\mathfrak{a}}}\,g(\log N\mathfrak{a}) + O(\log 3D).$$

Both sums over the zeros and the prime ideals can be dropped by positivity. Since $h(0)\geq Y$ and the integral is $O(e^Y)$ we infer that $mY\ll e^Y+\log 3D$. Taking $Y=\log\log 3D$ yields the result. □

Generalizing Dirichlet characters to number fields are Hecke characters (Grössencharakters). Those are defined thoroughly in Section 3.8 for imaginary quadratic fields, but here we use the basic terminology in the general case (for characters of weight 0). See e.g. [La] for details. Let $K/\mathbb{Q}$ be a number field and $\xi$ a primitive Hecke Grössencharakter modulo $(\mathfrak{m},\Omega)$ where $\mathfrak{m}$ is a non-zero integral ideal in $K$ and $\Omega$ is a set of real infinite places where $\xi$ is ramified. The Hecke $L$-function is defined by

$$L(\xi,s) = \sum_{\mathfrak{a}}\xi(\mathfrak{a})(N\mathfrak{a})^{-s} = \prod_{\mathfrak{p}}(1-\xi(\mathfrak{p})(N\mathfrak{p})^{-s})^{-1}$$

for $\operatorname{Re}(s)>1$. Hecke showed that $L(\xi,s)$ is an $L$-function of degree $d=[K:\mathbb{Q}]$ which is entire if $\xi\neq 1$ and which coincides with $\zeta_K(s)$ for $\xi=1$. The conductor is $\Delta=|d_K|N_{K/\mathbb{Q}}\mathfrak{m}$, where $N_{K/\mathbb{Q}}$ is the norm, and the gamma factor is given by

$$\gamma(\xi,s) = \pi^{-ds/2}\,\Gamma\Bigl(\frac{s}{2}\Bigr)^{r_1+r_2-|\Omega|}\Gamma\Bigl(\frac{s+1}{2}\Bigr)^{r_2+|\Omega|},$$

so $q(\xi)\leq 4^d|d_K|N_{K/\mathbb{Q}}\mathfrak{m}=4^d\Delta$. The root number can be explicitly written as a normalized Gauss sum for $\xi$. One gets the following generalization of Theorem 5.26.

THEOREM 5.35. Let $K/\mathbb{Q}$ be a number field, $\xi$ a Hecke Grössencharakter modulo $(\mathfrak{m},\Omega)$. There exists an absolute effective constant $c>0$ such that $L(\xi,s)$ has at most a simple real zero in the region

$$\sigma > 1 - \frac{c}{d\log\Delta(|t|+3)}.$$

The exceptional zero can occur only for a real character and it is $<1$.

REMARK. This section highlights the fact that $L$-functions of various types are naturally defined over any number field $K$, using Dirichlet series over non-zero integral ideals, and Euler products over prime ideals. Thus there are Artin $L$-functions over $K$, automorphic $L$-functions over $K$, $L$-functions of varieties over $K$, whereas in a sense of analytic number theory we like to see these as $L$-functions over $\mathbb{Q}$. This view is possible because any $L$-function of degree $d$ over $K$, where $[K:\mathbb{Q}]=f$, is an $L$-function over $\mathbb{Q}$, as defined in Section 5.1, of degree $df$. For instance, $\zeta_K(s)$ is of degree 1 over $K$. Thus it is easy for the reader to see what results there are for $L$-functions over number fields, with uniformity in the number field. Sometimes, however, the proof has to be adapted, as for Theorem 5.33, because the Rankin-Selberg $L$-functions have a different form depending on whether the $L$-functions are seen over $K$ or over $\mathbb{Q}$. The reader should have no difficulty supplying new proofs if necessary. It is worth pointing out that for all known $L$-functions, the conductor of an $L$-function of degree $d$ over $K$ takes the form $q=|d_K|^dN_{K/\mathbb{Q}}\mathfrak{f}$ for some non-zero integral ideal $\mathfrak{f}$ of $K$. Thus the conductor and the analytic conductor contain the dependency in the discriminant of the base field $K$.

Analytic number theory flourished beautifully in the last two centuries in the garden of $L$-functions over the rationals. The extension to number fields is fruitful as well, but it needs to be more explored. To encourage the newcomers we end this section with a few cute applications of the theory over the Gaussian domain $\mathbb{Z}[i]$. There are four units in $\mathbb{Z}[i]$, namely $1,-1,i,-i$. Every ideal of $\mathbb{Z}[i]$ is principal. Consider the "angle" characters

$$\xi_k(\mathfrak{a}) = \Bigl(\frac{\alpha}{|\alpha|}\Bigr)^k = e^{ik\arg\alpha}$$

if $\mathfrak{a}=(\alpha)\neq 0$, for any $k\equiv 0\ (\mathrm{mod}\ 4)$. This is a primitive Hecke character of conductor $(1)$. The associated $L$-function

$$L(s,\xi_k) = \sum_{\mathfrak{a}}\xi_k(\mathfrak{a})(N\mathfrak{a})^{-s}$$

has conductor $D=4$ and it satisfies the self-dual functional equation

$$\Lambda(s,\xi_k) = \pi^{-s}\Gamma\bigl(s+\tfrac{|k|}{2}\bigr)L(s,\xi_k) = \Lambda(1-s,\xi_k).$$

All characters are complex except the trivial character for $k=0$ giving the Dedekind zeta function $L(s,\xi_0)=\zeta(s)L(s,\chi_4)$. Therefore there is no exceptional zero, and the Prime Number Theorem (5.52) gives

$$\sum_{|\pi|\leq x}\Bigl(\frac{\pi}{|\pi|}\Bigr)^k = 2\delta_k\,\mathrm{Li}(x) + O\bigl(|k|\,x\,e^{-c\sqrt{\log x}}\bigr)$$

where $c>0$ and the implied constant are absolute. As an exercise we propose to derive from the above formula the following equidistribution property of Gaussian primes in sectors.

THEOREM 5.36. Let $0<\beta-\alpha\leq\tfrac{\pi}{2}$ and $x\geq 2$. Then

$$|\{\pi\in\mathbb{Z}[i] : |\pi|\leq x,\ \alpha<\arg\pi\leq\beta\}| = \dots$$

where $c>0$ and the implied constant are absolute.

[Hint: Approximate the characteristic function of $[\alpha,\beta]$ modulo $2\pi$ by a smooth function, periodic of period $2\pi$, and symmetric with respect to $x\to\pm x\pm\pi$. Expand this function into Fourier series and apply for $x=\arg\pi$. By the symmetry the nonzero Fourier terms are on frequencies $k\equiv 0\ (\mathrm{mod}\ 4)$. At each frequency apply the Prime Number Theorem.]

See Section 11.8 for an application of this result to show that the Weil bound for the number of points on a curve over a finite field is optimal in horizontal sense.
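The equidistribution of Gaussian primes in sectors can be observed numerically. The sketch below counts Gaussian primes $a+bi$ with $a,b>0$ and norm $a^2+b^2$ up to a bound, split into equal angular sectors of the open first quadrant. It relies on the standard fact that such a number is a Gaussian prime exactly when its norm is a rational prime. The function name, the norm bound $10^5$ and the choice of 4 sectors are illustrative assumptions.

```python
import math


def gaussian_prime_sector_counts(norm_bound, sectors):
    """Count Gaussian primes a+bi (a, b > 0, a^2+b^2 <= norm_bound) per sector."""
    # Sieve of Eratosthenes on the possible norms.
    sieve = bytearray([1]) * (norm_bound + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(norm_bound) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, norm_bound + 1, p)))

    counts = [0] * sectors
    for a in range(1, math.isqrt(norm_bound) + 1):
        for b in range(1, math.isqrt(norm_bound - a * a) + 1):
            if sieve[a * a + b * b]:
                angle = math.atan2(b, a)  # lies strictly between 0 and pi/2
                index = min(int(angle / (math.pi / 2) * sectors), sectors - 1)
                counts[index] += 1
    return counts


print(gaussian_prime_sector_counts(100000, 4))
```

The four counts should be nearly equal, in line with the equidistribution statement; finer sectors or larger norm bounds can be tried in the same way.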

EXERCISE 11. Playing with $\arg\pi$ and $|\pi|$ derive the equidistribution of Gaussian primes in regular domains of $\mathbb{C}$.

EXERCISE 12. Prove that there are infinitely many pairs of primes $p,q$ such that $pq=a^2+b^2$ with $0<b<(3\log a)^2$.

We recommend W. Duke [Du4] for a powerful treatment of a variety of problems by means of Hecke characters.

5.11. Classical automorphic L-functions.

By classical automorphic forms we mean the forms on $GL(2)$, either holomorphic or eigenfunctions of the Laplace operator (the Maass forms). Let $f$ be a primitive holomorphic cusp form, of weight $k\geq 1$, level $q$, with nebentypus $\chi$ (see Chapter 14 for definitions). Let

$$f(z) = \sum_{n\geq 1}\lambda_f(n)n^{(k-1)/2}e(nz)$$

be its normalized Fourier expansion at the cusp $\infty$. Then

$$L(f,s) = \sum_{n\geq 1}\lambda_f(n)n^{-s} = \prod_p(1-\lambda_f(p)p^{-s}+\chi(p)p^{-2s})^{-1}$$

is an $L$-function of degree 2 with conductor $q$ and gamma factor given by

$$\gamma(f,s) = \pi^{-s}\,\Gamma\Bigl(\frac{s+\frac{k-1}{2}}{2}\Bigr)\Gamma\Bigl(\frac{s+\frac{k+1}{2}}{2}\Bigr) = c_k(2\pi)^{-s}\Gamma\Bigl(s+\frac{k-1}{2}\Bigr)$$

with $c_k=2^{(3-k)/2}\sqrt{\pi}$ by the Legendre duplication formula. Therefore the analytic conductor is

$$q(f,s) = q\bigl(|s+\tfrac{k-1}{2}|+3\bigr)\bigl(|s+\tfrac{k+1}{2}|+3\bigr) \leq q(|s|+|k|+3)^2,$$
$$q(f) = q\bigl(\tfrac{k-1}{2}+3\bigr)\bigl(\tfrac{k+1}{2}+3\bigr) \asymp qk^2.$$

The root number is $i^k\bar\eta(f)$ where $\eta(f)$ satisfies $Wf=\eta(f)\bar f$. It is known that $L(f,s)$ satisfies the Ramanujan-Petersson conjecture by work of Deligne [Del] for $k\geq 2$ and Deligne-Serre [DeSe] for $k=1$. The dual form $\bar f$ satisfies

$$\lambda_{\bar f}(n) = \bar\chi(n)\lambda_f(n), \quad\text{if } (n,q)=1.$$

Have in mind that $f$ can be self-dual for non-trivial nebentypus. For example, if $\xi$ is a class group character of an imaginary quadratic field $K=\mathbb{Q}(\sqrt{D})$, then the form $f$ with coefficients given by

$$\lambda_f(n) = \sum_{N\mathfrak{a}=n}\xi(\mathfrak{a})$$

is self-dual of weight $k=1$, of level $q=|D|$ and nebentypus $\chi=\chi_D$, the Kronecker symbol. For these facts, see Section 14.7 and Proposition 14.13, and the references there.

Similarly, let $\varphi$ be a primitive Maass cusp form of level $q$ and nebentypus $\chi$, with Laplace eigenvalue $\lambda=\tfrac14+r^2$, where $r\in\mathbb{R}$ or $ir\in[0,\tfrac12[$. Writing its Fourier expansion at infinity in the form

$$\varphi(z) = \sqrt{y}\sum_{n\neq 0}\rho(n)K_{ir}(2\pi|n|y)e(nx),$$

we associate with $\varphi$ the $L$-function

$$L(\varphi,s) = \sum_{n\geq 1}\rho(n)n^{-s} = \prod_p(1-\rho(p)p^{-s}+\chi(p)p^{-2s})^{-1}$$

of conductor $q$, with gamma factor

$$\gamma(\varphi,s) = \pi^{-s}\,\Gamma\Bigl(\frac{s+\delta+ir}{2}\Bigr)\Gamma\Bigl(\frac{s+\delta-ir}{2}\Bigr)$$

where $\delta=0$ if $\varphi$ is even and $\delta=1$ if $\varphi$ is odd. The analytic conductor satisfies

$$q(\varphi,s) \leq q(|s|+|r|+3)^2.$$

We have $\bar\rho(n)=\bar\chi(n)\rho(n)$ if $(n,q)=1$. See Chapter 15 for the definition of Maass forms. Although $L$-functions and Hecke operators for Maass forms are not fully presented in this book, the theory is very similar to that of holomorphic forms; see for instance [BG], [Bu] and [DFI3].

In contrast with holomorphic forms, it is not known that the $L$-functions of primitive Maass cusp forms satisfy the Ramanujan-Petersson conjecture, although there is no doubt because the conjecture fits correctly in the general Langlands functoriality program. The current best known estimate is

(5.87)
$$|\alpha_p|,\ |\beta_p| \leq p^{7/64},$$

where $1-\rho(p)p^{-s}+\chi(p)p^{-2s}=(1-\alpha_pp^{-s})(1-\beta_pp^{-s})$, and correspondingly $|\operatorname{Re} ir|\leq\tfrac{7}{64}$, hence

(5.88)
$$\lambda = \tfrac14+r^2 \geq \frac{975}{4096} = 0.238\ldots$$

This is due to Kim and Sarnak [KiS]. For earlier results see e.g. [S2], [Se8], [DuI2], [BDHI], [LRS].

The original theory of Rankin [Ra3] and Selberg [S5], extended by W. Li [L], and its adelic version [JPS] show that any two cusp forms, holomorphic or nonholomorphic, admit a Rankin-Selberg convolution. The convolution $L(f\otimes g,s)$ has a simple pole at $s=1$ if $g=\bar f$ and is entire otherwise. Assume $f$ of weight $k$ and $g$ of weight $\ell\geq k$ (by symmetry). The gamma factor of $f\otimes g$ is given by

$$\gamma(f\otimes g,s) = \pi^{-2s}\,\Gamma\Bigl(\frac{s+\frac{\ell-k}{2}}{2}\Bigr)\Gamma\Bigl(\frac{s+\frac{\ell+k}{2}}{2}\Bigr)\Gamma\Bigl(\frac{s+\frac{\ell-k}{2}+1}{2}\Bigr)\Gamma\Bigl(\frac{s+\frac{\ell+k}{2}-1}{2}\Bigr).$$

Let now f have eigenvalue ~ + r 2 , and parity 8 = 0 or 1, and g have eigenvalue ~ + u 2 and parity TJ = 0 or 1. Then

) (! /®g,S=7r

-2sr(s +i(r + u) +v)r(s +i(r- u) +v) 2 2 r( s- i(r; u) +v)r( s- i(r; u) +v)

where v = 0, 1 according to whether 8 = TJ or not. Finally assume f is holomorphic of weight k and g is a 1Iaass form with eigenvalue ~ + r 2 and parity fi. Then

I (! ®

g, s ) =

7r

_ 2 sr(s+ir+¥)r(s+ir+~) 2 2 r (s -

ir

+ ¥) r ( s 2

ir +

~). ·

2

These factors can be derived by pleasant computations from the facts conveniently gathered in the Appendix to [RS]. Although the local factors of L(J ® g, s) at primes p I q(J ®g) cannot be expressed.in a simple way from the Heeke eigenvalues >..t(P), >. 9 (p) or pt(p), p9 (p), this is still the case if p divides exactly [q(J), q(g)]. In this case one can show that the factor over the ramified places in (5.10), or rather 1/Hp(P-s), is equal to

In other words, the local factor at all primes is given by (5.9). \Ve emphasize again that this is not true in all cases. Note that since pI q(J)q(g), one or both of n 1(p), f3t(P) (resp. n 9 (p), {39 (p)) is zero. · In terms of Dirichlet series, this implies that

L(J ® g, s) = L(2s, XtX 9 )

L >-t(n)>. (n)n-s 9

n)l

if [q(f), q(g)] is squarefree, where f and g can be either holomorphic or Maass forms and XJ, x9 are the respective nebentypus. Note that L(s, XtX 9 ) is the Dirichlet series of the character XtXg even if it is non-primitive, so, for instance, if Xt = x 9 = 1 then L(2s,xtx 9 ) = (q(f)q(g)(2s) is the Riemann zeta function with the Euler factors at p I q(J)q(g) removed. In the general case, the local factors at ramified primes can be described using the local Langlands conjecture for GL(2), but this is not very explicit. See Section 5.12 for the relation between L(f ® J, s) and the symmetric square £-function. REMARK. (1) The convolution of a Maass form and an holomorphic cusp form of weight 4 is important in the Phillips-Sarnak theory of spectral deformation [PS]. (2) One can also consider convolutions of GL(1) with GL(2). Those are the twists defined in Section 14.8, and the theory is rather elementary. Note, however, that the twisted modular form

n

(in case of holomorphic $f$) of level $qm^2$, where $\psi$ is modulo $m$, may not be primitive, although it is always an eigenfunction of the unramified Hecke operators. In this case the Rankin-Selberg $L$-function $L(f \otimes \psi, s)$ is the $L$-function of the primitive form with conductor dividing $qm^2$ which has the same Hecke eigenvalues as $f \otimes \psi$ at almost all primes (see Exercise 5 of Section 14.7). As an example, let $f$ be a theta function for a class group character (see Section 14.3) for an imaginary quadratic field $K = \mathbb{Q}(\sqrt{D})$, and $\chi_D$ the Kronecker symbol for this field, so that $f \in S_1(D, \chi_D)$. Then we have $L(f \otimes \chi_D, s) = L(f, s)$. More generally, one can express the dual $L$-function $L(\bar f, s)$ of a modular form $f$ having nebentypus $\chi$ as a twist

$$L(\bar f, s) = L(f \otimes \bar\chi, s) \tag{5.89}$$

(see (14.48)). It is quite instructive to fix a modular form and consider its family of twists $f \otimes \chi$ by characters of conductor $q$, or by real characters of conductor $\le Q$.

For ease of reading we restate some of the general results from previous sections now in the specific context of classical automorphic $L$-functions.

THEOREM 5.37. Let $f$ be a holomorphic primitive cusp form of level $q$ and weight $k \ge 1$ as above. We have

$$L(f, s) \ll (\sqrt{q}\,(|s| + k))^{1 - \sigma + \varepsilon} \tag{5.90}$$

for $\frac12 \le \sigma \le 1$. Let $\varphi$ be a primitive Maass cusp form of level $q$ with the Laplace eigenvalue $\lambda = \frac14 + r^2$. We have

$$L(\varphi, s) \ll (\sqrt{q}\,(|s| + |r|))^{1 - \sigma + \varepsilon} \tag{5.91}$$

for $\frac12 \le \sigma \le 1$. In both cases, the implied constant depends only on $\varepsilon$.

PROOF. The inequality (5.90) is merely the re-formulation of (5.20) with the claimed uniformity, because $L(f, s)$ satisfies the Ramanujan-Petersson conjecture. To prove (5.91), one uses instead the estimate

$$\sum_{n \le x} |\rho(n)|^2 \ll x (x + q + |r|)^{\varepsilon} \tag{5.92}$$

proved by Iwaniec [17], hence again the implied constant in (5.91) depends only on $\varepsilon$, not on $\varphi$. $\Box$

Next (5.33) gives

THEOREM 5.38. Let $f$ be a holomorphic primitive cusp form of level $q$ and weight $k \ge 1$, $N(T, f)$ the number of zeros $\rho = \beta + i\gamma$ of $L(f, s)$ in the critical strip $0 \le \beta \le 1$ with $|\gamma| \le T$. We have

$$N(T, f) = \frac{T}{\pi} \log \frac{qT^2}{(2\pi e)^2} + O(\log q(|T| + k)),$$

for $T \ge 2$ with an absolute implied constant. Let


for $T \ge 2$ with an absolute implied constant.

Now we re-formulate our zero-free region theorems. The first result for modular forms beyond the classical context of Dirichlet characters (and their extensions to number fields) is due to C. Moreno [Mor1]. We can apply Theorem 5.10 for both holomorphic and non-holomorphic forms, getting the following zero-free region:

THEOREM 5.39. Let $f$ be a holomorphic primitive cusp form of level $q$ and weight $k \ge 1$. There exists an absolute constant $c > 0$ such that $L(f, s)$ has no zero in the region

$$\sigma \ge 1 - \frac{c}{\log q(|t| + k + 3)}$$

except possibly one simple real zero $\beta < 1$, in which case $f$ is self-dual. Let $\varphi$ be a primitive Maass cusp form of level $q$ with the Laplace eigenvalue $\lambda = \frac14 + r^2$. There exists an absolute constant $c > 0$ such that $L(\varphi, s)$ has no zero in the region

$$\sigma \ge 1 - \frac{c}{\log q(|t| + |r| + 3)},$$

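except possibly one simple real zero $\beta < 1$, in which case $f$ is self-dual.

Finally we state the prime number theorem for classical cusp forms.

THEOREM 5.40. Let $f$ be a holomorphic primitive cusp form of level $q$ and weight $k \ge 1$. We have

$$\sum_{p \le x} \lambda_f(p) \log p = -\frac{x^{\beta}}{\beta} + O\big(\sqrt{q}\,x \exp(-c\sqrt{\log x})\big)$$

for $x \ge 2$, where $c > 0$ is some absolute constant and $\beta$ is the exceptional zero of $L(f, s)$ if it exists and this term is omitted otherwise. Let $\varphi$ be a primitive Maass cusp form of level $q$ with the Laplace eigenvalue $\lambda = \frac14 + r^2$. We have

$$\sum_{p \le x} \lambda_\varphi(p) \log p = -\frac{x^{\beta}}{\beta} + O\big(\sqrt{q}\,x \exp(-c\sqrt{\log x})\big)$$

for $x \ge 2$, where $c > 0$ is some absolute constant and $\beta$ is the exceptional zero of $L(\varphi, s)$ if it exists and this term is omitted otherwise.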
$$L(J_0(q), s) = \prod_{f \in S_2(q)^*} L(f, s). \tag{5.93}$$

This is an £-function of degree d = IS2 (q)*l with q(f) = (12q)1 52 (qrJ, hence d ~ ~ and logq(f) ~ logq. Therefore it follows from Proposition 5.21 and GRH that

the order of vanishing of $L(J_0(q), s)$ at $s = \frac12$ is $\ll q(\log q)(\log\log q)^{-1}$. However, applying the Petersson formula (14.60) one can easily show (still using GRH) that

$$\operatorname{ord}_{s=1/2} L(J_0(q), s) \ll q. \tag{5.94}$$

In Chapter 26, we show that this bound is sharp. Actually the bound (5.94) is established unconditionally in [KM2] (without recourse to the Grand Riemann Hypothesis). See Section 5.13 for links with Hasse-Weil zeta functions of abelian varieties.

5.12. General automorphic L-functions. In this section we talk briefly of $L$-functions associated with cusp forms of higher rank, because they are about to make a permanent habitat in analytic number theory in years to come. Let $f$ be a cusp form on $GL(m)/\mathbb{Q}$ with $m \ge 1$. Then Godement-Jacquet [GJ] have defined an $L$-function $L(f, s)$ of degree $m$ associated to $f$. For $m = 1$, $f$ corresponds uniquely to a primitive Dirichlet character $\chi$ modulo $q$ and $L(f, s) = L(s, \chi)$. For $m = 2$, $f$ corresponds either to a primitive cusp form, or to a primitive Maass cusp form. Hence the automorphic $L$-functions $L(f, s)$ generalize both Section 5.9 and Section 5.11. Since $f$ is cuspidal, $L(f, s)$ is entire except for $L(f, s) = \zeta(s)$ with $m = 1$. In addition $L$-functions of noncuspidal automorphic representations are also defined, but they factor as products of cuspidal ones. We recommend Cogdell's Chapter 9 of [BG] as a survey of $L$-functions on $GL(m)$.

It is expected that any automorphic $L$-function satisfies the Ramanujan-Petersson conjecture. Jacquet and Shalika proved that the local components satisfy $|\alpha_i(p)| < \sqrt{p}$ and $\operatorname{Re}(\kappa_j) > -\frac12$. This has been improved by Luo, Rudnick and Sarnak [LRS], namely $|\alpha_i(p)| < p^{c}$ and $\operatorname{Re}(\kappa_j) > -c$ with

$$c = \frac12 - \frac{1}{m^2 + 1}. \tag{5.95}$$
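As a small numerical illustration of (5.95), the following Python sketch tabulates the exponent $c = \frac12 - \frac{1}{m^2+1}$ for small $m$ as an exact fraction. The function name `lrs_exponent` is ours and purely illustrative; the code encodes nothing beyond the formula itself.

```python
from fractions import Fraction


def lrs_exponent(m: int) -> Fraction:
    """Exponent c = 1/2 - 1/(m^2 + 1) from (5.95), as an exact fraction."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return Fraction(1, 2) - Fraction(1, m * m + 1)


if __name__ == "__main__":
    # The exponent increases towards 1/2 (the Jacquet-Shalika range) as m grows.
    for m in range(1, 6):
        c = lrs_exponent(m)
        print(f"m = {m}: c = {c} (about {float(c):.4f})")
```

For $m = 1$ the exponent is $0$, in agreement with the fact that Dirichlet characters satisfy the Ramanujan-Petersson bound trivially.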

The absolute convergence of $L(f, s)$ for $\operatorname{Re}(s) > 1$, due to Jacquet-Shalika and Shahidi [JS], follows from the existence of Rankin-Selberg $L$-functions (see the proof of (5.48)). A useful feature of the association of $L$-functions with automorphic theory is that the root number $\varepsilon(f)$ is expressed as a product of local root numbers

$$\varepsilon(f) = \prod_p \varepsilon_p(f), \quad \text{with } \varepsilon_p(f) = 1 \text{ if } p \text{ is unramified.}$$

In fact, $\varepsilon_p(f)$ depends on the choice of a non-trivial additive character $\psi_p$ of $\mathbb{Q}_p$, but the product is independent of such choices.

For any two cusp forms $f$ and $g$ on $GL(d)$ and $GL(e)$, the $L$-function $L(f \otimes g, s)$ exists in our sense by the far-reaching generalization of the classical Rankin-Selberg method due to Jacquet, Piatetski-Shapiro and Shalika [JPS]. Moeglin and Waldspurger [MW] proved that $L(f \otimes g, s)$ is entire if $g \ne \bar f$, and has a simple pole at $s = 1$ if $g = \bar f$. The bound $q(f \otimes g) \mid q(f)^e q(g)^d$ is due to Bushnell and Henniart [BH].

According to conjectures belonging to the Langlands Program, the "most general" $L$-function should in fact be a product of functions belonging to the class of $L$-functions of cusp forms on some $GL(m)/\mathbb{Q}$. Other parts of the Langlands

conjectures imply that the Ramanujan-Petersson conjecture should hold for any automorphic $L$-function. Moreover, the conjectures imply the existence (and determination, in principle) of an equidistribution law for the coefficients $\lambda_f(p)$ (see the discussion of the Sato-Tate Conjecture in Chapter 21).

The convexity bound for automorphic $L$-functions can be made uniform.

THEOREM 5.41. Let $L(f, s)$ be an automorphic $L$-function of a cusp form of degree $d$ other than $\zeta(s)$. Then we have

$$L(f, s) \ll q(f, s)^{a + \varepsilon} \tag{5.96}$$

for $\sigma \ge \frac12$, where $a = \max(\frac12(1 - \sigma), 0)$, the implied constant depending only on $\varepsilon$ and $d$.

PROOF. This follows from (5.12) with $X = 1$, Proposition 5.4 and the estimate

$$\sum_{n \le x} |\lambda_f(n)|^2 \ll x (x q(f))^{\varepsilon}$$

for $x \ge 1$ with an implied constant depending only on $\varepsilon$ and $d$. This is the analogue of (5.92) proved by Molteni [Mol]. $\Box$

The most commonly used $L$-functions of degree four in analytic number theory are the Rankin-Selberg convolutions $L(f \otimes g, s)$ of two classical cusp forms (described in Section 5.11). Close relatives to this are the symmetric square and adjoint square $L$-functions of classical cusp forms. Let $f$ have nebentypus $\chi$. The symmetric square of $f$ ($f$ is either holomorphic or real-analytic) is defined by

$$L(\mathrm{Sym}^2 f, s) = L(f \otimes f, s)\, L(s, \chi)^{-1} \tag{5.97}$$

whereas the adjoint square is defined by

$$L(\mathrm{Ad}^2 f, s) = L(f \otimes \bar f, s)\, \zeta(s)^{-1}. \tag{5.98}$$

Thus both are $L$-functions of degree three, with conductors $q(\mathrm{Sym}^2 f) = q(f \otimes f) q(\chi)^{-1} \mid q(f)^2$ and $q(\mathrm{Ad}^2 f) = q(f \otimes \bar f) \mid q(f)^2$. If $f$ has real coefficients, then the symmetric and adjoint squares coincide. One shows that the root number for the adjoint square is $\varepsilon(\mathrm{Ad}^2 f) = \varepsilon(f \otimes \bar f) = 1$. Both have gamma factor equal to $\gamma(f \otimes \bar f, s)$ divided by the gamma factor $\pi^{-s/2}\Gamma(s/2)$ of $\zeta(s)$, hence

$$\gamma(\mathrm{Ad}^2 f, s) = \gamma(\mathrm{Sym}^2 f, s) = \pi^{-3s/2}\,\Gamma\Big(\frac{s+1}{2}\Big)\Gamma\Big(\frac{s+k-1}{2}\Big)\Gamma\Big(\frac{s+k}{2}\Big) \tag{5.99}$$

if $f$ is holomorphic of weight $k$ and

$$\gamma(\mathrm{Ad}^2 f, s) = \gamma(\mathrm{Sym}^2 f, s) = \pi^{-3s/2}\,\Gamma\Big(\frac{s}{2}\Big)\Gamma\Big(\frac{s}{2} + ir\Big)\Gamma\Big(\frac{s}{2} - ir\Big) \tag{5.100}$$

if $f$ is a real-analytic Maass form with eigenvalue $\lambda = \frac14 + r^2$ (note the parity does not matter). The local Euler factors for $L(\mathrm{Sym}^2 f, s)$ and $L(\mathrm{Ad}^2 f, s)$ are given by

$$(1 - \alpha(p)^2 p^{-s})^{-1}(1 - \chi(p) p^{-s})^{-1}(1 - \beta(p)^2 p^{-s})^{-1},$$

$$\Big(1 - \frac{\alpha(p)}{\beta(p)} p^{-s}\Big)^{-1}(1 - p^{-s})^{-1}\Big(1 - \frac{\beta(p)}{\alpha(p)} p^{-s}\Big)^{-1}$$

respectively if $p \nmid q(f)$, where $\alpha(p)$ and $\beta(p)$ are the local roots at $p$ for $L(f, s)$. Note that from the twist formula (5.89) or the relation $\alpha(p)\beta(p) = \chi(p)$ for $p$ unramified,

one can also relate the adjoint and symmetric squares as twists (convolution of $GL(3)$ by $GL(1)$):

If $p$ divides exactly $q(f)$, then this formula is still valid. In particular, for $q(f)$ squarefree a simple computation shows that

$$L(\mathrm{Sym}^2 f, s) = L(2s, \chi^2) \sum_{n \ge 1} \lambda_f(n^2) n^{-s},$$

2

L(Ad f,s)

=

L(2s,:e) Lx(n)>. 1 (n 2 )n-". n)l
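The Dirichlet series expression for $L(\mathrm{Sym}^2 f, s)$ can be checked numerically at a single unramified prime. The sketch below is only an illustration under stated assumptions: it uses the standard relation $\lambda_f(p^k) = \sum_{i+j=k} \alpha(p)^i \beta(p)^j$, writes $x = p^{-s}$, and picks arbitrary sample values for the local roots (the constants `ALPHA`, `BETA`, `X` and all function names are hypothetical, not taken from the text). It compares the Euler factor $(1-\alpha^2 x)^{-1}(1-\chi(p) x)^{-1}(1-\beta^2 x)^{-1}$ with $(1-\chi(p)^2x^2)^{-1}\sum_k \lambda_f(p^{2k})x^k$.

```python
import cmath


def hecke_power(alpha, beta, k):
    """lambda_f(p^k) = sum of alpha^i * beta^(k-i) over 0 <= i <= k."""
    return sum(alpha**i * beta ** (k - i) for i in range(k + 1))


def sym2_euler_factor(alpha, beta, x):
    """(1 - alpha^2 x)^-1 (1 - chi(p) x)^-1 (1 - beta^2 x)^-1 with chi(p) = alpha*beta."""
    chi = alpha * beta
    return 1 / ((1 - alpha**2 * x) * (1 - chi * x) * (1 - beta**2 * x))


def sym2_dirichlet_side(alpha, beta, x, terms=120):
    """Local factor of L(2s, chi^2) times the truncated sum of lambda_f(p^(2k)) x^k."""
    chi = alpha * beta
    partial = sum(hecke_power(alpha, beta, 2 * k) * x**k for k in range(terms))
    return partial / (1 - chi**2 * x**2)


# Illustrative data (assumptions, not from the text): unitary roots, x = p^(-s).
ALPHA = cmath.exp(0.7j)
BETA = cmath.exp(-0.2j)
X = 0.3

if __name__ == "__main__":
    lhs = sym2_euler_factor(ALPHA, BETA, X)
    rhs = sym2_dirichlet_side(ALPHA, BETA, X)
    print("difference of the two sides:", abs(lhs - rhs))
```

The agreement reflects the identity $\sum_k \lambda_f(p^{2k})x^k = (1 + \chi(p)x)\,(1-\alpha^2x)^{-1}(1-\beta^2x)^{-1}$, which follows from taking the even part of the generating series of $\lambda_f(p^k)$.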

At the point $s = 1$ the adjoint square $L$-function takes a special value

$$L(\mathrm{Ad}^2 f, 1) = \operatorname{res}_{s=1} L(f \otimes \bar f, s) = \frac{w \langle f, f \rangle}{\mathrm{Vol}(\Gamma_0(q)\backslash\mathbb{H})} \tag{5.101}$$

where $w = (4\pi)^k/\Gamma(k)$ or $w = \cosh(\pi r)/2\pi$ depending on the weight or eigenvalue of $f$ respectively, while $\langle f, f \rangle$ is the square of the Petersson norm of $f$ (see (14.11) for holomorphic cusp forms; otherwise it is the standard $L^2$-norm in the Hilbert space of automorphic functions on $\Gamma_0(q)\backslash\mathbb{H}$; see Chapter 15 and also (26.47)). This formula shows, in particular, that $L(\mathrm{Ad}^2 f, 1) > 0$, and it is well exploited in the spectral analysis of cusp forms on $GL(2)$ (see for example [Roy]).

Theorem 5.10 gives a zero-free region for automorphic $L$-functions of any degree.

THEOREM 5.42. Let $L(f, s)$ be an automorphic $L$-function of a cusp form of degree $d$. There exists an absolute constant $c > 0$ such that $L(f, s)$ has no zero in the region (5.102) except possibly one simple real zero $\beta_f < 1$. For this exceptional zero to exist, it is necessary that $f$ be self-dual.

PROOF. By the properties of Rankin-Selberg convolutions of cusp forms recalled above, all assumptions of Theorem 5.10 are valid, which gives the result. $\Box$

REMARK. One can show that for an automorphic $L$-function $L(f, s)$, the auxiliary $L$-function $L(g, s)$ used in the proof of Theorem 5.10 has the property that $\Lambda_g(n) \ge 0$ for all $n$, not only unramified ones. The proof of this depends on the local Langlands conjecture for $GL(d)$ over $p$-adic fields, proved by Harris and Taylor [HT]. Roughly speaking, for any prime $p$, there exist continuous representations

$$\rho_p : W(\mathbb{Q}_p) \to GL(d, \mathbb{C}),$$

where $W$ is the Weil-Deligne group of $\mathbb{Q}_p$, a variant of the Galois group, such that the $p$-factor of $L(f, s)$ is given by

$$L_p(f, s) = \det(1 - \rho_p(\mathrm{Fr}_p) p^{-s})^{-1}$$

and those of the Rankin-Selberg convolutions used in defining $g$ are

$$L_p(f \otimes \bar f, s) = \det(1 - (\rho_p \otimes \tilde\rho_p)(\mathrm{Fr}_p) p^{-s})^{-1},$$

$$L_p(f \otimes f, s) = \det(1 - (\rho_p \otimes \rho_p)(\mathrm{Fr}_p) p^{-s})^{-1},$$

$$L_p(\bar f \otimes \bar f, s) = \det(1 - (\tilde\rho_p \otimes \tilde\rho_p)(\mathrm{Fr}_p) p^{-s})^{-1},$$

where $\tilde\rho_p$ is the contragredient representation of $\rho_p$, and $\mathrm{Fr}_p$ is the geometric Frobenius at $p$ (see Tate [Tat] for more details); more precisely, $\mathrm{Fr}_p$ is meant to act on the invariants under inertia for all the above representations. This shows that, with the same convention, the $p$-factor of $L(g, s)$ is

where $\rho$ is the representation

But we have quite generally

$$\operatorname{Tr}(\rho \otimes \tilde\rho)(g) = |\operatorname{Tr}\rho(g)|^2$$

for any $g$ and $\rho$, and this gives $\Lambda_g(p^k) \ge 0$ for all $k \ge 1$. This kind of very deep arithmetical argument depends crucially on the automorphic nature of $L(f, s)$ and is completely beyond reach when using only the bare data of the Dirichlet series or Euler product $L(f, s)$.

Statements like Proposition 5.22 for automorphic $L$-functions are usually called "multiplicity one theorems". The most basic result, called the strong multiplicity one principle, is due to Jacquet and Shalika [JS]:

PROPOSITION 5.43. Let $L(f, s)$ and $L(g, s)$ be two automorphic $L$-functions of cusp forms of $GL(m)$. If the local components of $f$ and $g$ coincide at all but finitely many primes, then $f = g$.

PROOF. Suppose the local components coincide for $p \notin S$, where $S$ is finite (and contains the ramified primes for $f$ or $g$). Since the local factors for the Rankin-Selberg convolution $L(f \otimes \bar g, s)$ have no poles at $s = 1$, the order of the pole of $L(f \otimes \bar g, s)$ is the same as that for the product over primes $p \notin S$. But

$$\prod_{p \notin S} L_p(f \otimes \bar g, s) = \prod_{p \notin S} L_p(f \otimes \bar f, s)$$

by assumption. Therefore $L(f \otimes \bar g, s)$ has a pole at $s = 1$, which implies $f = g$. $\Box$

For classical modular forms, this is Theorem 14.12. If the local components satisfy $|\alpha_i(p)| < p^{1/4}$, then the remark after Proposition 5.22 gives an explicit version. Such a bound is known in full generality only for $n = 2$, but using the logarithmic derivative of $L(f \otimes g, s)$ instead of the squarefree part of the convolution, Moreno [Mor3] gave a general explicit bound on the number of primes required.

Theorem 5.42 can be applied in particular to the symmetric and adjoint square $L$-functions. One can also apply it to the Rankin-Selberg $L$-functions $L(f \otimes g, s)$ for any two classical modular forms since Ramakrishnan [Ra] has shown that those $L$-functions of degree 4 are automorphic. Exercise 4 shows that one can in fact prove non-vanishing results for Rankin-Selberg $L$-functions without modularity.

THEOREM 5.44. (1) Let $f$ and $g$ be classical primitive modular forms, either holomorphic or real-analytic. There exists an absolute constant $c > 0$ such that $L(f \otimes g, s)$ has no zero in the region

$$\sigma \ge 1 - \frac{c}{\log q(f \otimes g)(|t| + 3)}$$

except possibly one simple real zero $\beta < 1$. For this exceptional zero to exist, it is necessary that $f \otimes g$ be self-dual.

(2) Let $f$ be a classical primitive modular form, either holomorphic or real-analytic. There exists an absolute constant $c > 0$ such that $L(\mathrm{Sym}^2 f, s)$ and $L(\mathrm{Ad}^2 f, s)$ have no zero in the region

$$\sigma \ge 1 - \frac{c}{\log q(f)(|t| + 3)},$$

except possibly a simple real zero $\beta < 1$. If $f$ is non-dihedral, then the exceptional zero does not exist.

PROOF. As mentioned, the first point follows directly from the fact that $L(f \otimes g, s)$ is an automorphic $L$-function. For the second, $L(\mathrm{Sym}^2 f, s)$ is automorphic [GJ], so we need only prove that there is no exceptional zero if $f$ is non-dihedral, which is due to Goldfeld, Hoffstein and Lieman [GHL] (since $f$ is self-dual if there is an exceptional zero, $L(\mathrm{Ad}^2 f, s) = L(\mathrm{Sym}^2 f, s)$). To this end consider the $L$-function

$$L(g, s) = \zeta(s)\, L(\mathrm{Sym}^2 f, s)^2\, L(\mathrm{Sym}^2 f \otimes \mathrm{Sym}^2 f, s) = \zeta(s)\, L(\mathrm{Sym}^2 f, s)^3\, L(\mathrm{Sym}^2\,\mathrm{Sym}^2 f, s).$$

The last $L$-function is a special case of the symmetric square of a cusp form on $GL(3)$ and has been shown by Bump and Ginzburg to have a simple pole at $s = 1$ if $f$ is non-dihedral (this also follows by the first formula since $\mathrm{Sym}^2 f$ is a cusp form on $GL(3)$). Hence $L(g, s)$ has a pole of order 2 at $s = 1$, whereas any real zero of $L(\mathrm{Sym}^2 f, s)$ is a zero of $L(g, s)$ of order $\ge 3$. By easy local computations one checks that $\Lambda_g(n) \ge 0$ for $(n, q(g)) = 1$, hence the result follows from Lemma 5.9. $\Box$

REMARK. The formula (5.101) also proves that $L(\mathrm{Ad}^2 f, 1) \ne 0$.

The following is a useful corollary (compare with the lower bounds for class numbers).

COROLLARY 5.45. Let $f$ be a classical primitive modular form, holomorphic or real-analytic. We have

$$\|f\|^2 \gg \mathrm{Vol}(\Gamma_0(q)\backslash\mathbb{H})\, q^{-\varepsilon} \tag{5.103}$$

for any $\varepsilon > 0$. The implied constant depends only on $\varepsilon$ and the weight $k$ or the eigenvalue $\lambda$, whichever is relevant. If $f$ is not dihedral, for instance, if $f$ has trivial nebentypus and squarefree conductor $q$, then

$$\|f\|^2 \gg \frac{\mathrm{Vol}(\Gamma_0(q)\backslash\mathbb{H})}{\log 2q} \tag{5.104}$$

where the implied constant is effective and depends only on the weight $k$ or the eigenvalue $\lambda$, whichever is relevant.


PROOF. The first part is proved by Hoffstein and Lockhart [HL], and follows from (5.101) and the bound $\beta \le 1 - c(\varepsilon) q^{-\varepsilon}$ for the possible exceptional zero of $L(\mathrm{Ad}^2 f, s)$, which implies a lower bound for $L(\mathrm{Ad}^2 f, 1)$. The proof is similar to that of Siegel's bound (5.73), and the implied constant is non-effective. If $f$ is not dihedral, then there is no exceptional zero and one derives (5.104) by the same method from the zero-free region of Theorem 5.44. Finally, one can easily see that a dihedral form with trivial nebentypus has a non-squarefree level. $\Box$

Generally speaking the problem of exceptional zeros for the $L$-functions of cusp forms on $GL(m)$ with $m \ge 2$ turns out to be accessible, the hardest and most stubborn case being $m = 1$ for real Dirichlet characters. The above result of Goldfeld, Hoffstein and Lieman has been extended to the following theorem of Banks [Ba]:

PROPOSITION 5.46. If $L(f, s)$ is the $L$-function of any cusp form on $GL(3)$, then the exceptional zero does not exist for $L(f, s)$.

5.13. Artin L-functions. In this section we survey, mostly without proofs, some of the definitions and properties of Artin $L$-functions. In contrast with the previous sections, the fact that those are indeed $L$-functions as defined in Section 5.1 remains open. However, there is great confidence and evidence that the relevant conjectures are correct, and together with the analytic theory developed in this chapter, and possibly GRH, this provides heuristic evidence for results that sometimes are accessible by other methods. See e.g. [CF] for more about Artin $L$-functions. To avoid confusion in terminology, we will sometimes refer to the $L$-functions defined in Section 5.1 as analytic $L$-functions.

Artin $L$-functions are associated to a continuous finite dimensional Galois representation

$$\rho : \mathrm{Gal}(\bar{\mathbb{Q}}/K) \to GL(d, \mathbb{C})$$

where $K/\mathbb{Q}$ is a number field. Define the Artin $L$-function of $\rho$ by the following Euler product over prime ideals of $K$:

$$L(\rho, s) = \prod_{\mathfrak{p}} \det(1 - \rho(\mathrm{Fr}_{\mathfrak{p}}) N\mathfrak{p}^{-s})^{-1},$$

where $\mathrm{Fr}_{\mathfrak{p}}$ is the Frobenius conjugacy class at $\mathfrak{p}$, acting on the invariants under the inertia group at $\mathfrak{p}$. Thus $L(\rho, s)$ is an Euler product of degree $\deg(\rho)$ over $K$, or $\deg(\rho)[K:\mathbb{Q}]$ over $\mathbb{Q}$.

Artin $L$-functions enjoy a formalism parallel to that of representations. For instance, the dual $L$-function is also the $L$-function of the contragredient representation $\tilde\rho$, and the Rankin-Selberg $L$-function of $\rho_1$ and $\rho_2$ is the $L$-function $L(\rho_1 \otimes \rho_2, s)$ of the tensor product $\rho_1 \otimes \rho_2$ (thus the notation is compatible). Moreover, $L(\rho_1 \oplus \rho_2, s) = L(\rho_1, s) L(\rho_2, s)$, and if $\rho$ is induced from a representation $\rho'$ of a finite index subgroup corresponding to a finite extension $K'/K$, then

$$L(\rho, s) = L(\rho', s). \tag{5.105}$$

In particular, taking $K = \mathbb{Q}$ and $K' = K$ any number field, it follows that any Artin $L$-function of degree $d$ can be seen as one defined over $\mathbb{Q}$ of degree $\deg(\rho)[K:\mathbb{Q}]$. For the trivial representation of $\mathrm{Gal}(\bar{\mathbb{Q}}/K)$, we get $L(1, s) = \zeta_K(s)$. More generally if $\deg(\rho) = 1$, class-field theory shows that to $\rho$ is associated a unique Hecke character $\xi$ such that $L(\rho, s) = L(\xi, s)$. So it is an analytic $L$-function in this case (see Section 5.10).

It is conjectured that for any $\rho$, $L(\rho, s)$ is an analytic $L$-function. Since $L(\rho_1 \oplus \rho_2, s) = L(\rho_1, s) L(\rho_2, s)$, one can restrict attention to irreducible representations $\rho$, and one expects that $L(\rho, s)$ is in fact automorphic. It is known that $L(\rho, s)$ has meromorphic continuation to $\mathbb{C}$ and all of the conditions set in Section 5.1 are satisfied, except that some of these $L$-functions may have poles in the critical strip (because we still do not know if $L(\rho, s)$ is holomorphic in the strip $0 < \operatorname{Re}(s) < 1$).

The gamma factor can be described as a product over infinite places $v$ of $K$ of local gamma factors $\gamma_v(\rho, s)$. Let $\sigma_v$ be the Frobenius conjugacy class, which is of order 2 if $v$ is a real place that extends to two complex places of $L$, and is $= 1$ otherwise. Then

$$\gamma_v(\rho, s) = \begin{cases} \pi^{-ds}\,\Gamma\big(\frac{s}{2}\big)^{d}\,\Gamma\big(\frac{s+1}{2}\big)^{d} & \text{if } v \text{ is a complex place}, \\ \pi^{-ds/2}\,\Gamma\big(\frac{s}{2}\big)^{d^+}\,\Gamma\big(\frac{s+1}{2}\big)^{d^-} & \text{if } v \text{ is a real place}, \end{cases}$$

where $d = \deg(\rho)$ is the dimension of $\rho$, and $d^+$, $d^-$ are the multiplicities, respectively, of the eigenvalue $+1$, $-1$ for $\rho(\sigma_v)$.

The conductor $q(\rho)$ is of the form

$$q(\rho) = |d_K|^{\deg(\rho)}\, N_{K/\mathbb{Q}}\,\mathfrak{f}(\rho)$$

for some integral ideal $\mathfrak{f}(\rho)$ of $K$, which is called the Artin conductor, defined using ramification groups (see Serre [Se4]). The root number $\varepsilon(\rho)$ can be in a sense defined from the functional equation, but there is also an a priori definition due to Dwork, Langlands and Deligne as a product of local factors. Thus the analytic conductor satisfies

$$q(\rho, s) \le q(\rho)(|s| + 4)^{[K:\mathbb{Q}]\deg(\rho)} = \big(|d_K|(|s| + 4)^{[K:\mathbb{Q}]}\big)^{\deg(\rho)} N\mathfrak{f}(\rho).$$

From the Brauer induction theorem (see e.g. [Se5]), we can write

$$\operatorname{Tr}\rho = \sum_i n_i \operatorname{Tr}\pi_i$$

where $n_i \in \mathbb{Z}$ and $\pi_i$ is a representation induced by an abelian character $\xi_i$ of a finite (in fact, cyclic) extension $K_i/\mathbb{Q}$. By the invariance under induction (5.105) we have

$$L(\rho, s) = \prod_i L(\pi_i, s)^{n_i} = \prod_i L(\xi_i, s)^{n_i} \tag{5.106}$$

and the case of Hecke characters (over number fields) implies that $L(\rho, s)$ is meromorphic with poles only in the critical strip, and satisfies a functional equation relating $s$ and $1 - s$. That this functional equation is the right one involving the gamma factor above and the Artin conductor follows from the formalism of both, especially the behavior under direct sums and induction. To show that $L(\rho, s)$ does not have extra poles, we need to prove the

ARTIN CONJECTURE. Let $\rho$ be an irreducible, non-trivial, Galois representation of a number field $K/\mathbb{Q}$. Then $L(\rho, s)$ is entire.

Even the case $d = 2$ and $K = \mathbb{Q}$ is still not completely settled. More precisely, if an irreducible representation $\rho : \mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q}) \to GL(2, \mathbb{C})$ is given, then its image in $PGL(2, \mathbb{C})$ can be classified as either a dihedral group, the alternating group $A_4$,

the symmetric group $S_4$ or the alternating group $A_5$. In all cases, except $A_5$, it is known that $L(\rho, s)$ is holomorphic (even automorphic), and there are partial results in the $A_5$ case. The dihedral case is classical, the $A_4$ case is due to Langlands, the $S_4$ case to Tunnell. The latter is crucial in Wiles's proof of Fermat's Great Theorem.

In a certain sense, the Artin conjecture is true on average: for instance, if $K/\mathbb{Q}$ is a fixed finite Galois extension, then

$$\zeta_K(s) = \zeta(s) \prod_{\rho \ne 1} L(\rho, s)^{\deg(\rho)} \tag{5.107}$$

where $\rho$ runs over the non-trivial irreducible representations of $\mathrm{Gal}(K/\mathbb{Q})$ (see (5.82)). Since both $\zeta(s)$ and $\zeta_K(s)$ are known to have only a simple pole at $s = 1$, the possible poles of $L(\rho, s)$ must be compensated by zeros for other representations. The following is also useful.

COROLLARY 5.47. Let $\rho$ be a non-trivial irreducible Galois representation of $K/\mathbb{Q}$. Then $L(\rho, s)$ has neither poles nor zeros on the line $\operatorname{Re}(s) = 1$.

PROOF. This follows by (5.106) because the $L$-functions of non-trivial Hecke Grossencharakters are entire and do not vanish on the line $\operatorname{Re}(s) = 1$ (Theorem 5.3~). $\Box$

Assuming the Artin conjecture, one can deduce quite interesting arithmetic corollaries using the properties of analytic $L$-functions.

CHEBOTAREV DENSITY THEOREM. Let $L/K$ be a finite Galois extension of degree $d$ of number fields. Let $C \subset \mathrm{Gal}(L/K)$ be a union of conjugacy classes and

$$\psi(x, C) = \sum_{\substack{N\mathfrak{p} \le x \\ \mathrm{Fr}_{\mathfrak{p}} \in C}} \log N\mathfrak{p}.$$

We have as $x \to +\infty$,

$$\psi(x, C) \sim \frac{|C|}{|\mathrm{Gal}(L/K)|}\, x. \tag{5.108}$$

If GRH holds for Artin $L$-functions, then for $x \ge 2$,

$$\psi(x, C) = \frac{|C|}{|\mathrm{Gal}(L/K)|}\, x + O\Big(\sqrt{x}\,(\log x) \sum_{\rho} |c_\rho| \log\big(x^{\deg(\rho)[K:\mathbb{Q}]} q(\rho)\big)\Big) \tag{5.109}$$

where $\rho$ runs over the irreducible representations of $\mathrm{Gal}(L/K)$, and

$$c_\rho = \frac{1}{|\mathrm{Gal}(L/K)|} \sum_{x \in C} \overline{\operatorname{Tr}\rho(x)}. \tag{5.110}$$

The implied constant is absolute.

PROOF. Although the proof below depends on the Artin conjecture, (5.108) is known unconditionally and (5.109) does not need the Artin conjecture (see e.g. [Se6]).

144

5.

CLASSICAL ANALYTIC THEORY OF L-FUNCTIONS

Let $G = \mathrm{Gal}(L/K)$. Functions of the form $\operatorname{Tr}\rho$, for $\rho$ irreducible, form an orthonormal basis of the vector space of conjugacy-invariant functions on the finite group $G$ with the scalar product

$$\langle f, g \rangle = \frac{1}{|G|} \sum_{x \in G} f(x)\overline{g(x)}.$$

Hence the characteristic function $\delta_C$ of $C \subset G$ is given by

$$\delta_C(x) = \sum_{\rho} c_\rho \operatorname{Tr}\rho(x)$$

where $c_\rho$ is as in (5.110). For the trivial representation, we have $c_1 = |C|/|G|$, and thus (5.108) follows from the Artin Conjecture and the Prime Number Theorem 5.13 for $L(\rho, s)$. Similarly (5.109) comes from (5.56) and (5.110). $\Box$

The estimate (5.109) can be made explicit in various ways. Sometimes one can compute $c_\rho$ explicitly. Otherwise, a simple general bound on GRH is given for instance by

$$\psi(x, C) = \frac{|C|}{|\mathrm{Gal}(L/K)|}\, x + O\Big(\sqrt{x}\,(\log x)\big([K:\mathbb{Q}]\,|C|^{1/2} \log x + \log|d_L|\big)\Big).$$

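To see this, notice that by orthogonality we have

$$\sum_{\rho} |c_\rho|^2 = \|\delta_C\|^2 = \frac{|C|}{|\mathrm{Gal}(L/K)|},$$

and it is well known that

$$\sum_{\rho} \deg(\rho)^2 = |\mathrm{Gal}(L/K)|.$$

Hence by Cauchy's inequality we derive

$$\sum_{\rho} |c_\rho| \log\big(x^{\deg(\rho)[K:\mathbb{Q}]}\big) \le [K:\mathbb{Q}](\log x)\Big(\frac{|C|}{|\mathrm{Gal}(L/K)|}\Big)^{1/2} |\mathrm{Gal}(L/K)|^{1/2} = [K:\mathbb{Q}](\log x)\,|C|^{1/2}.$$

On the other hand, since $|\operatorname{Tr}(\rho(x))| \le \deg(\rho)$, we have $|c_\rho| \le \deg(\rho)$, hence

$$\sum_{\rho} |c_\rho| \log q(\rho) \le \log \prod_{\rho} q(\rho)^{\deg(\rho)} = \log|d_L|$$

by comparison of the conductors in the identity of $L$-functions

$$\zeta_L(s) = \prod_{\rho} L(\rho, s)^{\deg(\rho)}.$$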
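The orthogonality relations used in this argument can be checked by hand on a small group. The following Python sketch does this for $G = S_3$, whose character table is hard-coded; it is only an illustration (the dictionary layout and the function names `coefficients`, `delta_from_characters`, `size_of` are ours). It computes the coefficients $c_\rho$ of (5.110) for a union of classes $C$, verifies that $\sum_\rho c_\rho \operatorname{Tr}\rho$ is the characteristic function of $C$, that $\sum_\rho |c_\rho|^2 = |C|/|G|$, and that $\sum_\rho |c_\rho|\deg(\rho) \le |C|^{1/2}$ as given by Cauchy's inequality.

```python
from fractions import Fraction
from math import sqrt

# Character table of S_3, hard-coded: class name -> character value.
CLASS_SIZES = {"identity": 1, "transposition": 3, "three_cycle": 2}
CHARACTERS = {
    "trivial": {"identity": 1, "transposition": 1, "three_cycle": 1},
    "sign": {"identity": 1, "transposition": -1, "three_cycle": 1},
    "standard": {"identity": 2, "transposition": 0, "three_cycle": -1},
}
GROUP_ORDER = sum(CLASS_SIZES.values())  # |S_3| = 6


def coefficients(classes):
    """c_rho as in (5.110) for C the union of the named conjugacy classes.

    The characters of S_3 are real, so complex conjugation plays no role here.
    """
    return {
        name: Fraction(sum(CLASS_SIZES[k] * chi[k] for k in classes), GROUP_ORDER)
        for name, chi in CHARACTERS.items()
    }


def delta_from_characters(classes, cls):
    """Evaluate sum_rho c_rho * Tr rho(x) for x in the conjugacy class `cls`."""
    c = coefficients(classes)
    return sum(c[name] * CHARACTERS[name][cls] for name in CHARACTERS)


def size_of(classes):
    """|C| for C the union of the named conjugacy classes."""
    return sum(CLASS_SIZES[k] for k in classes)


if __name__ == "__main__":
    for classes in (["transposition"], ["three_cycle"], ["identity", "three_cycle"]):
        c = coefficients(classes)
        norm_sq = sum(abs(v) ** 2 for v in c.values())
        weighted = sum(abs(c[n]) * CHARACTERS[n]["identity"] for n in CHARACTERS)
        print(classes, {k: str(v) for k, v in c.items()},
              "sum |c|^2 =", norm_sq,
              "sum |c| deg =", weighted,
              "sqrt|C| =", round(sqrt(size_of(classes)), 4))
```

For the class of transpositions, for example, only the trivial and sign characters contribute, which matches the fact that a prime has Frobenius a transposition exactly when it is detected by the quadratic subfield together with the splitting type.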

For other estimates and arrangements of the error term on GRH, and also for unconditional estimates with explicit error terms, see [Se6] and [LO]. Here is a sample application of the Chebotarev Density Theorem.

PROPOSITION 5.48. Let $n \ge 2$ and $N \ge 1$. Let $D_N$ be the set of monic polynomials $f \in \mathbb{Z}[X]$ of degree $n$ whose coefficients have absolute value $\le N$. Then we have

$$\frac{1}{xN^n} \sum_{f \in D_N} \ \sum_{\substack{p \le x \\ f \text{ has a root mod } p}} \log p = c_n + o(1)$$

as $x \to \infty$ and $N \to \infty$, where $c_n = 1 - \frac{1}{2!} + \cdots + (-1)^{n-1}\frac{1}{n!}$ (note that $c_n \to 1 - \frac{1}{e}$ as $n \to \infty$).

PROOF. We have $|D_N| = N^n$. Gallagher ([Ga4], see Exercise 3 of Chapter 7) has shown that the set $E_N$ of elements $f \in D_N$ such that either $f$ is reducible or the Galois group of the equation $f(x) = 0$ is not the full symmetric group $S_n$ satisfies

$$|E_N| \ll N^{n - \frac12} \log N.$$

By trivial estimate, elements in $E_N$ do not contribute to the limit above. Let $f \notin E_N$. Choosing a root $\alpha$ of $f$ in $\mathbb{C}$, let $L = \mathbb{Q}(\alpha)$ and $K$ the Galois closure of $L$. We have $\mathrm{Gal}(K/\mathbb{Q}) = S_n$ and $H = \mathrm{Gal}(K/L) \simeq S_{n-1}$. By Galois theory, we see that

$$\{p \mid f \text{ has a root modulo } p\} = \Big\{p \mid \mathrm{Fr}_p \in \bigcup_{\sigma \in S_n} \sigma^{-1} H \sigma\Big\} = \{p \mid \mathrm{Fr}_p \in \widetilde{H}\}$$

where $\widetilde{H}$ is the set of elements in $S_n$ fixing at least one element. Precisely, this holds up to finitely many primes at which the localizations of $\mathbb{Z}[X]/(f)$ and the ring of integers of $L$ differ. By the Chebotarev Density Theorem we have

$$\sum_{\substack{p \le x \\ f \text{ has a root mod } p}} \log p = \frac{|\widetilde{H}|}{n!}\, x + o(x).$$

By inclusion-exclusion, it follows that $|\widetilde{H}|/n! = c_n$, hence the result. $\Box$
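The last step of the proof, the identity between $c_n$ and the proportion of permutations in $S_n$ with at least one fixed point, is easy to confirm by brute force for small $n$. The Python sketch below is only an illustration (the function names are ours): it compares the alternating sum defining $c_n$ with a direct enumeration of $S_n$, and shows how quickly $c_n$ approaches $1 - 1/e$.

```python
from fractions import Fraction
from itertools import permutations
from math import e, factorial


def c_formula(n):
    """c_n = 1 - 1/2! + ... + (-1)^(n-1)/n! as an exact fraction."""
    return sum(Fraction((-1) ** (k - 1), factorial(k)) for k in range(1, n + 1))


def c_by_enumeration(n):
    """Proportion of permutations of {0, ..., n-1} with at least one fixed point."""
    with_fixed_point = sum(
        1
        for perm in permutations(range(n))
        if any(perm[i] == i for i in range(n))
    )
    return Fraction(with_fixed_point, factorial(n))


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, c_formula(n), c_by_enumeration(n), float(c_formula(n)))
    print("1 - 1/e =", 1 - 1 / e)
```

The enumeration is exponential in $n$ and is meant only as a sanity check; the closed formula is what the proposition uses.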

See e.g. [Se6] for other applications of the Chebotarev Density Theorem.

5.14. L-functions of varieties. The situation with $L$-functions of algebraic varieties over $\mathbb{Q}$ is even less understood than that for Galois representations. Let us call them "arithmetic geometry" $L$-functions, or just "geometric" $L$-functions. We will not discuss things in general, but mention the important cases of algebraic curves and abelian varieties. The only satisfactory theory exists for the intersection of these two cases, namely the abelian varieties of dimension one (that is, the elliptic curves).

Let $E/\mathbb{Q}$ be an elliptic curve. Modularity of $E$, proved by Wiles, Taylor-Wiles and Breuil-Conrad-Diamond-Taylor [W], [TW], [BCDT], means that the Hasse-Weil zeta function of $E$, normalized to have critical strip $0 \le \operatorname{Re}(s) \le 1$, corresponds to a modular form of weight 2 and level equal to the conductor of $E$, hence is an $L$-function according to Section 5.11. See Section 14.4 for a more detailed statement, as well as a description of the root number in most cases.

Conjecturally, the "modularity" extends to the Hasse-Weil zeta functions of arbitrary smooth projective curves over number fields. Let $C/\mathbb{Q}$ be such a curve. Then for all primes $p \nmid N$, where $N \ge 1$ is some integer, the curve can be reduced

modulo $p$ to a smooth projective curve $C_p/\mathbb{F}_p$. As described in Exercise 2 of Chapter 11, the local zeta function is a rational function of the form

$$Z(C_p, T) = \frac{P_p(T)}{(1 - T)(1 - pT)}$$

for some polynomial $P_p$ with $P_p(0) = 1$, of degree $2g$ where $g \ge 0$ is the genus of $C$. Write

$$P_p(T) = \prod_{1 \le j \le 2g} (1 - \alpha_j(p) p^{1/2} T).$$

Then the Riemann Hypothesis for curves over finite fields (see Chapter 11) shows that $|\alpha_j(p)| = 1$. The Hasse-Weil zeta function of $C$ outside $N$, analytically normalized, is defined by the Euler product

$$L_N(C, s) = \prod_{p \nmid N} (1 - \alpha_1(p) p^{-s})^{-1} \cdots (1 - \alpha_{2g}(p) p^{-s})^{-1}.$$

It is conjectured that after multiplying by appropriately defined Euler factors at $p \mid N$, the complete $L$-function is entire; it is necessarily self-dual and the gamma factor should be

$$\gamma(C, s) = \pi^{-gs}\,\Gamma\Big(\frac{s}{2} + \frac14\Big)^{g}\,\Gamma\Big(\frac{s}{2} + \frac34\Big)^{g} = c_g (2\pi)^{-gs}\,\Gamma\Big(s + \frac12\Big)^{g}$$

for some constant $c_g > 0$ (see (5.86)). Note that the Riemann Hypothesis for the curves $C_p$ over finite fields means that this $L$-function satisfies the Ramanujan-Petersson conjecture (up to the ramified places, but it is known that those still satisfy the required condition). In general for any $g > 1$, this is not yet known except for very special cases.

REMARK. In arithmetic geometry it is more practical to leave the local roots $\alpha_j(p)$ as they occur naturally without normalization, i.e. to write

$$P_p(T) = \prod_{1 \le j \le 2g} (1 - \beta_j(p) T)$$

with $|\beta_j(p)| = \sqrt{p}$. Thus

$$L(C, s) = \prod_{p \nmid N} P_p(p^{-s})^{-1}.$$
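The two expressions for the gamma factor $\gamma(C, s)$ above are related by the Legendre duplication formula $\Gamma(z)\Gamma(z + \frac12) = 2^{1-2z}\sqrt{\pi}\,\Gamma(2z)$ with $z = \frac{s}{2} + \frac14$. The Python sketch below checks this numerically for real $s$. It assumes the explicit value $c_g = (2\pi)^{g/2}$, which is what the duplication formula gives; the text itself only asserts that some constant $c_g > 0$ exists, and the function names are ours.

```python
from math import gamma, isclose, pi


def gamma_factor_product_form(s, g):
    """pi^(-g s) * Gamma(s/2 + 1/4)^g * Gamma(s/2 + 3/4)^g, for real s > -1/2."""
    return pi ** (-g * s) * (gamma(s / 2 + 0.25) * gamma(s / 2 + 0.75)) ** g


def gamma_factor_single_form(s, g):
    """c_g * (2 pi)^(-g s) * Gamma(s + 1/2)^g with the assumed c_g = (2 pi)^(g/2)."""
    c_g = (2 * pi) ** (g / 2)
    return c_g * (2 * pi) ** (-g * s) * gamma(s + 0.5) ** g


if __name__ == "__main__":
    for g in (1, 2, 3):
        for s in (0.5, 1.0, 2.3):
            left = gamma_factor_product_form(s, g)
            right = gamma_factor_single_form(s, g)
            print(g, s, left, right, isclose(left, right, rel_tol=1e-10))
```

Only real arguments are tested because `math.gamma` does not accept complex input; the identity itself holds for all complex $s$ by analytic continuation.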

In this notation the critical line is $\operatorname{Re}(s) = 1$ and the functional equation relates the values at $s$ and $2 - s$. The same remark applies to abelian varieties.

Abelian varieties are higher-dimensional generalizations of elliptic curves, seen as algebraic groups. An abelian variety $A/\mathbb{Q}$ is a smooth projective variety with a given rational point $0 \in A(\mathbb{Q})$ and addition and inverse morphisms $p : A \times A \to A$, $i : A \to A$ which are maps of algebraic varieties over $\mathbb{Q}$ and satisfy the axioms of a commutative group (with $i$ giving the opposite). If the dimension of $A$ is one,


this corresponds exactly to elliptic curves $E/\mathbb{Q}$ with the "usual" chord-and-tangent description of the group law (see [Sil] and Section 11.8).

There exists an integer $N \ge 1$ such that for $p \nmid N$, the reduced variety $A_p/\mathbb{F}_p$ is smooth. Weil showed that the local zeta function is a rational function of the form

$$Z(A_p, T) = \frac{P_{p,1}(T) \cdots P_{p,2g-1}(T)}{P_{p,0}(T) \cdots P_{p,2g}(T)}$$

where $P_{p,j}$ are polynomials. Weil proved that $P_{p,1}$ is of degree $2g$ and of the form

$$P_{p,1}(T) = \prod_{1 \le i \le 2g} (1 - \alpha_i(p) p^{1/2} T)$$

with $|\alpha_i(p)| = 1$, which is the local Riemann Hypothesis. In addition one can arrange the $\alpha_i(p)$ such that $\alpha_i(p)\alpha_{i+g}(p) = 1$ for $1 \le i \le g$. The first polynomial $P_{p,1}$ determines $P_{p,j}$ as the $j$-th exterior power

$$P_{p,j}(T) = \prod_{1 \le i_1 < \cdots < i_j \le 2g} (1 - \alpha_{i_1}(p) \cdots \alpha_{i_j}(p)\, p^{j/2}\, T).$$

In particular, $P_{p,0} = 1 - T$, $P_{p,2g} = 1 - p^g T$. The global zeta function of $A/\mathbb{Q}$ outside $N$ is

$$L_N(A, s) = \prod_{p \nmid N} (1 - \alpha_1(p) p^{-s})^{-1} \cdots (1 - \alpha_{2g}(p) p^{-s})^{-1}.$$
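The exterior power description of the polynomials $P_{p,j}$ can be made concrete with a short computation. The sketch below is an illustration under stated assumptions: the local roots are arbitrary sample points on the unit circle arranged so that $\alpha_i(p)\alpha_{i+g}(p) = 1$, and the names `local_roots`, `P`, `ANGLES`, `PRIME` are ours. It evaluates $P_{p,j}(T)$ as the product over $j$-element subsets of the roots and confirms the two special cases $P_{p,0} = 1 - T$ and $P_{p,2g} = 1 - p^g T$.

```python
import cmath
from itertools import combinations


def product(values):
    """Product of an iterable of (possibly complex) numbers; the empty product is 1."""
    result = 1
    for v in values:
        result *= v
    return result


def local_roots(angles):
    """Return alpha_1, ..., alpha_2g on the unit circle with alpha_i * alpha_(i+g) = 1."""
    first_half = [cmath.exp(1j * t) for t in angles]
    return first_half + [1 / a for a in first_half]


def P(j, roots, p, T):
    """P_{p,j}(T): product over j-element subsets of (1 - (product of roots) * p^(j/2) * T)."""
    return product(
        1 - product(subset) * p ** (j / 2) * T
        for subset in combinations(roots, j)
    )


# Illustrative data (assumptions, not from the text): g = 2, p = 5.
ANGLES = [0.4, 1.9]
PRIME = 5

if __name__ == "__main__":
    roots = local_roots(ANGLES)
    g = len(ANGLES)
    T = 0.01
    print("P_0(T) - (1 - T)       =", abs(P(0, roots, PRIME, T) - (1 - T)))
    print("P_2g(T) - (1 - p^g T)  =", abs(P(2 * g, roots, PRIME, T) - (1 - PRIME**g * T)))
    print("P_1(T) =", P(1, roots, PRIME, T))
```

Since the sample roots come in inverse (hence complex conjugate) pairs, the value of $P_{p,1}(T)$ at real $T$ is real up to rounding, as it must be for a polynomial with real coefficients.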

Again it is conjectured that after multiplying by appropriately defined Euler factors at $p \mid N$ the complete product $L(A, s)$ is an $L$-function, which is entire and self-dual and has gamma factor

$$\gamma(A, s) = \pi^{-gs}\,\Gamma\Big(\frac{s}{2} + \frac14\Big)^{g}\,\Gamma\Big(\frac{s}{2} + \frac34\Big)^{g} = c_g (2\pi)^{-gs}\,\Gamma\Big(s + \frac12\Big)^{g}$$

for some constant $c_g > 0$ (see (5.86)). It satisfies the Ramanujan-Petersson conjecture, since the correctly defined ramified factors also satisfy the required condition. For $g = 1$, this conjecture is true by the modularity of elliptic curves over $\mathbb{Q}$. For any $g > 1$, except for some cases involving so-called CM varieties, the analytic continuation and functional equation remain open problems. The conductor $q(A)$ (which divides $N$) of this $L$-function is predicted as part of the conjecture and is a close relative of the Artin conductor for Artin $L$-functions.

Note the resemblance between the shape and definition of the $L$-functions of curves and abelian varieties. It is not a coincidence. The theory of the Jacobian variety of curves associates to every smooth projective algebraic curve $C$ of genus $g$ over a field $k$ an abelian variety $J(C)/k$ of dimension $g$, called its Jacobian, in such a way that for a curve $C/\mathbb{Q}$ we have $L(C, s) = L(J(C), s)$. For instance, the $L$-function $L(J_0(q), s)$ of (5.93) is the $L$-function of the Jacobian $J_0(q)$ of the modular curve $X_0(q) = \Gamma_0(q)\backslash\mathbb{H}$, by work of Eichler and Shimura.

148

5.


REMARK. An alternative definition of the local factors of $L(A, s)$ (or of $L(C, s)$ for a curve) is given in terms of the $\ell$-adic Tate module of $A$: let

$$T_\ell(A) = \bigl(\varprojlim_{n} A[\ell^n]\bigr) \otimes \mathbb{Q}_\ell,$$

then the Galois group of $\mathbb{Q}$ acts on $T_\ell(A)$ and for any prime $p \ne \ell$ where $A$ has good reduction, we have

$$P_{p,1}(T) = \det(1 - T\,\mathrm{Fr}_p \mid T_\ell(A))^{-1},$$

denoting by $\mathrm{Fr}_p$ a Frobenius conjugacy class at $p$. Dually, $P_{p,1}$ is given in terms of the étale cohomology group $H^1(A, \mathbb{Q}_\ell)$ (see Section 11.11) by

$$P_{p,1}(T) = \det(1 - T\sigma_p \mid T_\ell(A))^{-1}$$

where $\sigma_p$ is the inverse of $\mathrm{Fr}_p$, because $T_\ell(A)$ is the dual of $H^1(A, \mathbb{Q}_\ell)$.

The Mordell-Weil theorem states that if $A/\mathbb{Q}$ is an abelian variety, then the group of rational points $A(\mathbb{Q})$ is finitely generated. The computation of its rank is one of the great problems of arithmetic geometry. We have the famous

BIRCH AND SWINNERTON-DYER CONJECTURE. Let $A/\mathbb{Q}$ be an abelian variety. Then

$$\operatorname{rank} A(\mathbb{Q}) = \operatorname*{ord}_{s = 1/2} L(A, s).$$

Of course this conjecture assumes that $L(A, s)$ is an L-function, so $L(A, \tfrac12)$ is defined. If the Birch and Swinnerton-Dyer conjecture holds, one can get information on the rank of abelian varieties using analytic methods. Often this suggests very interesting problems. For instance, by Proposition 5.21 we derive:

COROLLARY 5.49. Let $A/\mathbb{Q}$ be an abelian variety of dimension $g \ge 1$ with conductor $q$. Assume that its Hasse-Weil zeta function is an L-function and that the Birch and Swinnerton-Dyer conjecture holds for $A(\mathbb{Q})$. Then we have

$$\operatorname{rank} A(\mathbb{Q}) \ll \frac{\log q}{\log\bigl(\frac{3}{g}\log q\bigr)}$$

with an absolute implied constant. Note that $q > 3^g$ by Theorem 5.51 below.

This can be turned over to give a lower bound for the conductor of an abelian variety with a given rank, for instance for elliptic curves.

COROLLARY 5.50. Assume that the Birch and Swinnerton-Dyer conjecture holds for elliptic curves over $\mathbb{Q}$. Then if $E/\mathbb{Q}$ is an elliptic curve of rank $r \ge 1$ and conductor $q$, we have $q \gg r^{cr}$ where $c > 0$ is some absolute constant and the implied constant is also absolute.

It would be very interesting to prove any unconditional estimate comparable to those two corollaries using algebraic methods. In Chapter 26 we study the order of vanishing of the zeta function (often called the analytic rank) for the abelian varieties $J_0(q)/\mathbb{Q}$, obtaining quite precise results.

As for the discriminants of number fields in Section 5.10, one can use L-functions to investigate, conditionally, the conductor of elliptic curves, or of abelian varieties over $\mathbb{Q}$.


THEOREM 5.51. Let $A/\mathbb{Q}$ be an abelian variety over $\mathbb{Q}$ of dimension $g \ge 1$ such that $L(A, s)$ is an analytic L-function of conductor $q = q(A)$. Then we have $q \ge e^{1.2g}$, and in particular $q > 3$.

See [Mes] for even stronger bounds. The bound $q > 1$ is known unconditionally, by the celebrated theorem of Fontaine [Fon] according to which there is no abelian variety over $\mathbb{Q}$ everywhere unramified. For a variety of dimension $g = 1$, this means there is no elliptic curve defined over $\mathbb{Q}$ which is unramified everywhere, i.e. its discriminant is divisible by at least one prime. In this case it can also be proved without appealing to L-functions or modularity by showing that the discriminant of the Weierstrass equation (14.18) (with integral coefficients) cannot be equal to $\pm 1$ (see [Sil], ex. 8.15). In the higher dimensional case, no such explicit reformulation is accessible, and Fontaine's unconditional proof is much more involved.

PROOF OF THEOREM 5.51. Apply (5.24) to $L(A, s)$ at $s = \sigma > 1$, then take the real part. Because $L(A, s)$ is self-dual, the constant $b$ disappears by using (5.29). Moreover, by the positivity

$$\operatorname{Re} \frac{1}{s - \rho} = \frac{\sigma - \beta}{|s - \rho|^2} \ge 0,$$

we derive the inequality

$$\frac{1}{2}\log q - g \log 2\pi + g\,\frac{\Gamma'}{\Gamma}\bigl(\sigma + \tfrac12\bigr) + \frac{L'}{L}(A, \sigma) \ge 0.$$

Next by the Ramanujan-Petersson bound we have

Hence we get

$$\frac{1}{2g}\log q \ge \log 2\pi - \frac{\Gamma'}{\Gamma}\bigl(\sigma + \tfrac12\bigr) - \Bigl|\frac{\zeta'}{\zeta}(\sigma)\Bigr|$$

for any $\sigma > 1$. Taking $\sigma = 2.3$, one checks that the right side is $0.62\ldots > 0.6$, hence $q \ge e^{1.2g}$ and $q > 3$. □

REMARK. Since $e^{2.4} > 11$, and it is known (see (14.41)) that the elliptic curve with smallest conductor is

with conductor 11, the argument implies that 11 is in fact the smallest conductor of an abelian variety over $\mathbb{Q}$, assuming their zeta functions are L-functions.
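The numerical claim at the end of the proof of Theorem 5.51 can be checked with a short script. The sketch below is illustrative only: it assumes that the right-hand side is $\log 2\pi - \frac{\Gamma'}{\Gamma}(\sigma + \frac12) - |\frac{\zeta'}{\zeta}(\sigma)|$ (the display is damaged in this copy, and this is the reading that reproduces the stated value), and the helper names `digamma`, `zeta`, `zeta_log_derivative` and `conductor_bound_rhs` are hypothetical, written here without any external library.

```python
import math


def digamma(x: float) -> float:
    """Gamma'/Gamma(x) for real x > 0.

    Shift x upward with psi(x) = psi(x + 1) - 1/x, then apply the
    asymptotic expansion of psi at a large argument.
    """
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + math.log(x) - 0.5 / x - series


def zeta(s: float, n_terms: int = 2000) -> float:
    """Riemann zeta for real s > 1: partial sum plus an Euler-Maclaurin tail."""
    N = n_terms
    head = sum(k ** (-s) for k in range(1, N))
    tail = N ** (1 - s) / (s - 1) + 0.5 * N ** (-s) + s * N ** (-s - 1) / 12
    return head + tail


def zeta_log_derivative(s: float, h: float = 1e-5) -> float:
    """zeta'/zeta(s) by a central difference of log zeta (negative for s > 1)."""
    return (math.log(zeta(s + h)) - math.log(zeta(s - h))) / (2 * h)


def conductor_bound_rhs(sigma: float) -> float:
    """Assumed right-hand side: log(2 pi) - psi(sigma + 1/2) - |zeta'/zeta(sigma)|."""
    return (math.log(2 * math.pi)
            - digamma(sigma + 0.5)
            - abs(zeta_log_derivative(sigma)))


value = conductor_bound_rhs(2.3)
print(f"right side at sigma = 2.3: {value:.4f}")
print(f"e^1.2 = {math.exp(1.2):.4f}, e^2.4 = {math.exp(2.4):.4f}")
```

Under this reading the value at $\sigma = 2.3$ lies between 0.62 and 0.63, in agreement with the text, and the two exponentials confirm $e^{1.2} > 3$ and $e^{2.4} > 11$.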

5.A. Appendix: complex analysis. In this section we present in concise form the concepts and results of complex analysis that are used in this chapter. We omit most proofs but refer to [Tl], [Ru] or [Ahi] (among many references).

A.1. Functions of finite order.

DEFINITION A.1. Let $f : \mathbb{C} \to \mathbb{C}$ be an entire function. Then $f$ is said to be of finite order if there exists $\beta > 0$ such that

$$|f(s)| \ll \exp(|s|^{\beta})$$

for $s \in \mathbb{C}$. If one can take any $\beta > 1$, $f$ is said to be of order $\le 1$, and if no $\beta < 1$ is possible, then $f$ is said to be of order 1. Let $f$ be a meromorphic function on $\mathbb{C}$. Then $f$ is of order $\le 1$ if there exist entire functions $g$ and $h$ of order $\le 1$ such that $f = g/h$.

THEOREM 5.52. Let $f$ be an entire function of order 1. (1) We have

f(s) = sr

II (1- ~-)esjp

p uniformly on all compact subsets ofiC, where r is the order of the zero off at s = 0 and p runs over zeros off different from 0. (2) For any E > 0, the series p,.CO

A.2. The Phragmén-Lindelöf principle for a strip. The Phragmén-Lindelöf "principle" is a generic term for theorems which extend the maximum modulus principle to various infinite regions of the complex plane. Only the case of strips $a \le \sigma \le b$ interests us.

THEOREM 5.53. Let $f$ be a function holomorphic on an open neighborhood of a strip $a \le \sigma \le b$, for some real numbers $a < b$, such that $|f(s)| \ll \exp(|s|^{A})$ for some $A \ge 0$ and $a \le \sigma \le b$.

(1) Assume that $|f(s)| \le M$ for all $s$ on the boundary of the strip, i.e. for $\sigma = a$ or $\sigma = b$. Then we have $|f(s)| \le M$ for all $s$ in the strip.

(2) Assume that

$$|f(a + it)| \le M_a(1 + |t|)^{\alpha}, \qquad |f(b + it)| \le M_b(1 + |t|)^{\beta} \qquad \text{for } t \in \mathbb{R}.$$

Then

$$|f(\sigma + it)| \le M_a^{\ell(\sigma)} M_b^{1 - \ell(\sigma)} (1 + |t|)^{\alpha\ell(\sigma) + \beta(1 - \ell(\sigma))}$$

for all $s$ in the strip, where $\ell$ is the linear function such that $\ell(a) = 1$, $\ell(b) = 0$.

A.3. The Perron formula. Let $h(x)$ be the function given by

$$h(x) = \begin{cases} 1 & \text{if } x > 1, \\ \tfrac12 & \text{if } x = 1, \\ 0 & \text{if } x < 1. \end{cases}$$

PROPOSITION 5.54. We have

$$\frac{1}{2\pi i} \int_{c - iT}^{c + iT} x^{s}\, \frac{ds}{s} = h(x) + O\Bigl(\frac{x^{c}}{T|\log x|}\Bigr) \tag{5.111}$$

for any $x > 0$, $x \ne 1$, $T > 0$ and $0 < c \le 2$, with an absolute implied constant. If $x = 1$, the factor $|\log x|$ is omitted.

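A.4. The Stirling formula. The Stirling asymptotic formula

$$\Gamma(s) = \Bigl(\frac{2\pi}{s}\Bigr)^{1/2} \Bigl(\frac{s}{e}\Bigr)^{s} \Bigl(1 + O\Bigl(\frac{1}{|s|}\Bigr)\Bigr) \tag{5.112}$$

is valid in the angle $|\arg s| \le \pi - \varepsilon$ with the implied constant depending on $\varepsilon$. Hence for $s = \sigma + it$, $t \ne 0$, $\sigma$ fixed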

(5.113) Let d

!(s)=

ITrC~"1) j=J

be a gamma factor as in (5.3) with Re(r;1 ) > -1. It has no poles in Re(s) ~ 1. Recall the corresponding conductor at infinity is d

q=(s)

=IT (It+ Kjl + 3). j=l

Let -~ ,;:; Re (s) :( 2. From (5.112) we deduce (5.114) where k =ReI.: Kj· Hence we have (5.115) ~Ioreover,

(5.116)

using the recurrence r(s ,,

-(s)I

+ 1) =

"' ~

sr(s) we derive 1

-Is+"; I
«

logq 00 (s).

The implied constants in (5.114) and (5.116) are absolute.

A.5. Existence of test functions. For applications of the explicit formula, especially in the absence of the GRH, it is useful to know that there exist test functions with certain positivity properties.

PROPOSITION 5.55. There exist positive, non-zero, $C^{\infty}$ functions $\eta$ on $[0, +\infty[$ with compact support in $[e^{-1}, e]$ such that $\eta(1) = 1$, $\hat\eta(0) > 0$, $\eta(x) = \eta(x^{-1})$ for all $x > 0$ and moreover either $\hat\eta(it) \ge 0$ for $t \in \mathbb{R}$ or $\operatorname{Re}(\hat\eta(s)) \ge 0$ for all $s \in \mathbb{C}$ with $|\sigma| \le 1$.

SKETCH OF PROOF. Define $f(y) = \eta(e^{y})$, which is an even, compactly supported, $C^{\infty}$ function on $\mathbb{R}$, and

$$\hat\eta(s) = \int_{\mathbb{R}} f(y) e^{sy}\, dy$$

for all $s \in \mathbb{C}$, hence

$$\hat\eta(it) = \int_{\mathbb{R}} f(y) e^{ity}\, dy,$$

$$\operatorname{Re}(\hat\eta(s)) = \int_{\mathbb{R}} f(y) e^{\sigma y} \cos(ty)\, dy = \int_{\mathbb{R}} f(y) \cosh(\sigma y) \cos(ty)\, dy.$$

By the first formula, to have $\hat\eta(it) \ge 0$ it suffices to choose $f \ge 0$ such that its Fourier transform is positive. The Fourier pair (4.83), suitably smoothed, does the job. From the second formula and the maximum modulus principle for harmonic functions, the inequality $\operatorname{Re}(\hat\eta(s)) \ge 0$ will hold for $|\sigma| \le 1$ if and only if the Fourier transform of the even function $g(y) = f(y)\cosh(y)$ is non-negative. Conversely, if a function $g$ (even, smooth and compactly supported) with this property is given, a suitable test function $\eta$ is easily obtained by reversing this procedure. Now if $g_0$ is any smooth compactly supported positive function on $\mathbb{R}$, then the convolution square $g = g_0 * g_0$ will work, since $\hat g = \hat g_0^{2}$. The support of $\eta$ and its value at 1 can be easily adjusted by homogeneity. □

Functions of this type were constructed by Poitou and others for the purpose of obtaining lower bounds for the discriminant of number fields [Poi]. Note in a way they are simply better behaved versions of the basic function $f(s) = (\sigma - s)^{-1}$ for $\sigma > 1$ (see (5.28) and the proof of Theorem 5.51). See also [PP] for another construction of a test function with useful properties in applications.

A.6. Landau's lemma.

LEMMA 5.56. Let $\lambda_n \ge 0$ for all $n \ge 1$. Suppose the series

$$D(s) = \sum_{n \ge 1} \lambda_n n^{-s}$$

converges absolutely for all $s$ with $\operatorname{Re}(s) > \sigma_0$, but not for any $s$ with $\operatorname{Re}(s) < \sigma_0$; in other words, $\sigma_0$ is the abscissa of absolute convergence of $D(s)$. Then $s = \sigma_0$ is a singularity of $D(s)$, i.e. $D(s)$ cannot be continued analytically to a neighborhood of $\sigma_0$.


CHAPTER 6

ELEMENTARY SIEVE METHODS

The sieve methods originated from works of Viggo Brun on the Goldbach and Twin Prime Conjectures [Br1], [Br2]. Modern sieve methods apply to a great variety of problems, and when combined with techniques of analytic number theory they are very advanced indeed (cf. [Fll]). In this chapter we provide only basic results from sieve theory. First we establish by refining the combinatorial arguments of Brun the so-called Fundamental Lemma. Then we develop an upper bound sieve of Selberg and illustrate its power by a few popular applications. More complete treatments of sieve methods can be found in the books of Halberstam and Richert [HaRi] and Greaves [Gr].

6.1. Sieve problems.

Let $\mathcal{A} = (a_n)$ be a sequence of non-negative numbers. The ultimate question is how often these numbers are supported on primes. For example, if $\mathcal{A}$ is the characteristic function of the shifted primes

$$a_n = \begin{cases} 1 & \text{if } n = p + 2, \\ 0 & \text{otherwise,} \end{cases} \tag{6.1}$$

then we are asking how often is $p + 2$ prime? In this case, the heuristic arguments described in Section 13.1 yield the following asymptotic formula

$$\pi_2(x) = |\{p \le x;\ p + 2 \text{ is prime}\}| \sim C x (\log x)^{-2} \tag{6.2}$$

where $C$ is a positive constant given by

$$C = 2 \prod_{p > 2} \Bigl(1 - \frac{1}{(p - 1)^2}\Bigr) = 1.32032\ldots \tag{6.3}$$
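As a rough numerical illustration of (6.2) and (6.3), the following sketch approximates the constant $C$ by a truncated product and counts twin primes directly. It is not taken from the text; the function names and the truncation limit are assumptions chosen for the example, and the heuristic (6.2) is of course only compared, not proved.

```python
import math


def primes_up_to(n: int) -> list:
    """Sieve of Eratosthenes: all primes <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]


def twin_prime_constant(limit: int = 10 ** 5) -> float:
    """Truncation of C = 2 * prod_{p > 2} (1 - (p - 1)^(-2)) to primes p <= limit."""
    c = 2.0
    for p in primes_up_to(limit):
        if p > 2:
            c *= 1.0 - 1.0 / (p - 1) ** 2
    return c


def twin_prime_count(x: int) -> int:
    """pi_2(x): number of primes p <= x such that p + 2 is also prime."""
    primes = primes_up_to(x + 2)
    prime_set = set(primes)
    return sum(1 for p in primes if p <= x and p + 2 in prime_set)


C = twin_prime_constant()
x = 10 ** 5
print(f"C ~ {C:.5f}")
print(f"pi_2({x}) = {twin_prime_count(x)}, C x / (log x)^2 = {C * x / math.log(x) ** 2:.1f}")
```

The truncated product agrees with $1.32032\ldots$ to several decimals. The count and the main term $Cx(\log x)^{-2}$ have the same order of magnitude at this height, which is all the asymptotic (6.2) can be expected to show for such small $x$.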

Hence there "are" infinitely many twin primes, but we do not have a rigorous proof of this old conjecture by any method. The sieve methods alone (in their classical format) are incapable of selecting primes; however, they yield satisfactory results for slightly modified problems. Let $\mathcal{P}$ be a set of primes, $z \ge 2$ and

$$P(z) = \prod_{\substack{p < z \\ p \in \mathcal{P}}} p. \tag{6.4}$$

Now we seek estimates for the sifted sum

$$S(x, z) = \sum_{\substack{n \le x \\ (n, P(z)) = 1}} a_n. \tag{6.5}$$


When $\mathcal{P}$ is the set of all primes and $z = \sqrt{x}$, then $S(x, \sqrt{x})$ agrees with

$$S(x) = \sum_{p \le x} a_p \tag{6.6}$$

up to a few terms $a_n$ with $n < \sqrt{x}$. The sieve methods produce upper and lower bounds for $S(x, z)$ of correct order of magnitude as far as $z \le x^{\alpha}$, where $\alpha > 0$ is a small positive constant. Note that if $n \le x$ has no prime divisors less than $x^{\alpha}$, then $n$ has at most $[1/\alpha]$ prime divisors, so one can say $n$ is almost prime.

The remarkable universality of sieve is that it applies to quite general sequences $\mathcal{A} = (a_n)$. All we need are asymptotics for the partial sums

$$A_d(x) = \sum_{\substack{n \le x \\ n \equiv 0 \ (\mathrm{mod}\ d)}} a_n \tag{6.7}$$

with $d \mid P(z)$, $d < y$. We assume that

$$A_d(x) = g(d)X + r_d(x) \tag{6.8}$$

where $g(d)X$ is an expected main term and $r_d(x)$ is an error term which we think of as being relatively small if $d < y$. In the main term $X$ is approximately equal to

$$A(x) = \sum_{n \le x} a_n \tag{6.9}$$

so $g(d)$ stands for the density of the masses $a_n$ attached to $n \equiv 0 \pmod{d}$. Thinking of the divisibility by distinct primes as independent events, it is not too surprising that we assume that $g(d)$ is a multiplicative function such that

$$0 \le g(p) < 1 \quad \text{if } p \in \mathcal{P}. \tag{6.10}$$

Of course, the approximation of type (6.8) is not unique; however, to get good results in practice there is not much room to choose $g(d)$ and $X$ from. For some prime moduli $q$ one may have a good approximation (6.8) with $g(q) = 1$, which means that almost all mass comes from the $a_n$ with $n \equiv 0 \pmod{q}$. In this case the chance of finding terms $a_n$ supported on primes is slim; that is why we exclude such prime moduli from the set $\mathcal{P}$. Since we use (6.8) only for $d \mid P(z)$, we are free to modify the multiplicative function $g(d)$ arbitrarily at $d$ prime to $P(z)$ and indeed for notational simplicity we set

$$g(p) = 0 \quad \text{if } p \notin \mathcal{P}. \tag{6.11}$$

6.2. Exclusion-inclusion scheme.

Detecting the condition $(n, P(z)) = 1$ by means of the Möbius inversion formula

$$\delta(m) = \sum_{d \mid m} \mu(d) = \begin{cases} 1 & \text{if } m = 1, \\ 0 & \text{if } m \ne 1, \end{cases} \tag{6.12}$$

the sum (6.5) unfolds into

$$S(x, z) = \sum_{n \le x} \Bigl( \sum_{\substack{d \mid n \\ d \mid P(z)}} \mu(d) \Bigr) a_n.$$

Changing the order of summation we obtain the Legendre formula

$$S(x, z) = \sum_{d \mid P(z)} \mu(d) A_d(x). \tag{6.13}$$

Next introducing (6.8) we get

$$S(x, z) = X V(z) + R(x, z) \tag{6.14}$$

where

$$V(z) = \prod_{p \mid P(z)} (1 - g(p)) \tag{6.15}$$

and

$$R(x, z) = \sum_{d \mid P(z)} \mu(d) r_d(x). \tag{6.16}$$
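The Legendre formula (6.13) is easy to test in the simplest case $a_n = 1$, where $A_d(x) = \lfloor x/d \rfloor$, $X = x$ and $g(d) = 1/d$. The sketch below is only an illustration with hypothetical function names; it compares the exclusion-inclusion sum with a direct count and shows why the remainder becomes a problem: it may be as large as the number $2^{\pi(z)}$ of divisors of $P(z)$.

```python
from itertools import combinations
from math import gcd, prod


def primes_below(z: int) -> list:
    """Primes p < z by trial division (adequate for small z)."""
    return [p for p in range(2, z)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def legendre_sifted_count(x: int, z: int) -> int:
    """S(x, z) for a_n = 1 via (6.13): sum over d | P(z) of mu(d) * floor(x / d)."""
    ps = primes_below(z)
    total = 0
    for r in range(len(ps) + 1):
        for combo in combinations(ps, r):
            d = prod(combo)               # squarefree divisor of P(z), mu(d) = (-1)^r
            total += (-1) ** r * (x // d)
    return total


def direct_sifted_count(x: int, z: int) -> int:
    """Count n <= x having no prime divisor below z."""
    P = prod(primes_below(z))
    return sum(1 for n in range(1, x + 1) if gcd(n, P) == 1)


x, z = 1000, 12
ps = primes_below(z)
main_term = x * prod(1 - 1 / p for p in ps)      # X V(z) with g(p) = 1/p
S = legendre_sifted_count(x, z)
print(S, direct_sifted_count(x, z), round(main_term, 2), 2 ** len(ps))
```

Here each $r_d(x) = \lfloor x/d \rfloor - x/d$ lies in $(-1, 0]$, so $|R(x, z)| \le 2^{\pi(z)}$; this trivial bound is already useless once $z$ exceeds a small power of $\log x$.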

On probabilistic grounds one may expect that the product $V(z)$ represents the true density of the mass coming from $a_n$ with $n \le x$ having no prime divisors $p < z$ in $\mathcal{P}$. However, this is almost never correct except for $z$ small relative to $x$ in the logarithmic scale. In other words, the divisibility by distinct primes are not completely independent events when these primes are relatively large. Technically speaking the problem is that, if $z$ is large, the remainder sum $R(x, z)$ has many terms, so it exceeds the expected main term $XV(z)$.

In order to reduce the number of terms in the Legendre formula (6.13), and consequently in the remainder term (6.16), Brun truncated the Möbius function $\mu(d)$ to two sets, say $\mathcal{D}^{+}$ and $\mathcal{D}^{-}$, and considered the incomplete convolutions

$$\delta^{+}(n) = \sum_{\substack{d \mid n \\ d \in \mathcal{D}^{+}}} \mu(d), \tag{6.17}$$

$$\delta^{-}(n) = \sum_{\substack{d \mid n \\ d \in \mathcal{D}^{-}}} \mu(d) \tag{6.18}$$

in place of $\delta(n)$. He lost the exact property (6.12) but maintained the inequalities

$$\delta^{-}(n) \le \delta(n) \le \delta^{+}(n) \tag{6.19}$$

for all $n \ge 1$. Hence one obtains the lower and upper bounds

$$X V^{-}(z) + R^{-}(x, z) \le S(x, z) \le X V^{+}(z) + R^{+}(x, z) \tag{6.20}$$

where

$$V^{\pm}(z) = \sum_{d \mid P(z)} \lambda_d^{\pm} g(d), \tag{6.21}$$

$$R^{\pm}(x, z) = \sum_{d \mid P(z)} \lambda_d^{\pm} r_d(x) \tag{6.22}$$

and $\lambda_d^{+}$, $\lambda_d^{-}$ denote the Möbius function on the sets $\mathcal{D}^{+}$, $\mathcal{D}^{-}$ respectively. Assuming that

$$\mathcal{D}^{+}, \mathcal{D}^{-} \subset [1, y) \tag{6.23}$$

we can estimate the remainder sums $R^{\pm}(x, z)$ by

$$R(x, y) = \sum_{\substack{d < y \\ d \mid P(z)}} |r_d(x)|. \tag{6.24}$$
Sometimes we can exploit the intrinsic structure of the sieve weights $\lambda_d^{\pm}$ to get estimates for $R^{\pm}(x, z)$ which are better than that for $R(x, y)$. This is due to a possible cancellation of the error terms $r_d(x)$ in $R^{\pm}(x, z)$, but we do not venture into this sophisticated area in this book (see e.g. [19], [110]).

For notational simplicity we are writing squarefree numbers as the product of distinct primes enumerated in decreasing order

$$d = p_1 \cdots p_r \quad \text{with} \quad p_1 > \cdots > p_r. \tag{6.25}$$

Put

$$\mathcal{D}^{+} = \{ d = p_1 \cdots p_r;\ p_m < y_m \text{ for all } m \text{ odd} \}, \tag{6.26}$$

$$\mathcal{D}^{-} = \{ d = p_1 \cdots p_r;\ p_m < y_m \text{ for all } m \text{ even} \} \tag{6.27}$$

where $y_m$ are suitable parameters. By convention both sets $\mathcal{D}^{+}$ and $\mathcal{D}^{-}$ contain $d = 1$. We start from the recurrence formula

$$V(z) = 1 - \sum_{p < z} g(p) V(p). \tag{6.28}$$

Iterating this we derive by the well-known exclusion-inclusion procedure the following identities:

$$V(z) = V^{+}(z) - \sum_{n \text{ odd}} V_n(z), \tag{6.29}$$

$$V(z) = V^{-}(z) + \sum_{n \text{ even}} V_n(z) \tag{6.30}$$

where

$$V_n(z) = \sum_{\substack{y_n \le p_n < \cdots < p_1 < z \\ p_m < y_m,\ m < n,\ m \equiv n \ (\mathrm{mod}\ 2)}} g(p_1 \cdots p_n) V(p_n). \tag{6.31}$$
Dropping the non-negative terms $V_n(z)$ in (6.29) and (6.30) it follows that

$$V^{-}(z) \le V(z) \le V^{+}(z). \tag{6.32}$$

Now we can easily verify the lower and the upper bound sieve conditions (6.19); indeed (6.32) becomes (6.19) on taking $P(z) = n$ and $g(d) = 1$.

From now on the combinatorial sieves $\Lambda^{+} = (\lambda_d^{+})$ and $\Lambda^{-} = (\lambda_d^{-})$ depend on the truncation parameters $y_m$ alone. Brun tried various parameters; his best choice was $y_m = y^{\alpha\beta^{-m}}$ for suitable constants $0 < \alpha < 1 < \beta$. We choose the parameters $y_m$ depending on the primes $p_1, \ldots, p_m$ which occurred prior to the $m$-th step of the iteration, precisely

$$y_m = \Bigl(\frac{y}{p_1 \cdots p_m}\Bigr)^{1/\beta} \tag{6.33}$$

where $\beta$ is a fixed number, $\beta > 1$. Clearly every $d \in \mathcal{D}^{+} \cup \mathcal{D}^{-}$ satisfies $d < y$ except possibly for $d = p \in \mathcal{D}^{-}$. However, assuming $z \le y$, in this case we also have $d = p < z \le y$. Our choice (6.33) is inspired by heuristic arguments exploiting the concept of "sieving limit"; however, as we are not going for optimal results, we are free to choose any $\beta > 1$.

6.3. Estimations of $V^{+}(z)$, $V^{-}(z)$.

We need an upper bound for $V^{+}(z)$ and a lower bound for $V^{-}(z)$. By virtue of (6.29) and (6.30) in both cases the problems reduce to getting upper bounds for $V_n(z)$. Since $V(p_n)$ in (6.31) are non-negative some of the summation conditions can be relaxed by positivity. We get

Yn~Pn<···pt
Pt···Pm-tP~ n
where the conditions hold for all 1 ( m imply that

< n regardless of parity. These conditions

P1···Pm
if

m
by induction on m. In particular, for m = n - 1 this yields the following lower bound for the least prime divisors of d = p 1 · · • Pn in Vn(z), 1

1

Pn ) (Y/PI · · · Pn-d e+> ) YM 1 say, where Zn =

8

1

z( {; )"

(~-1)n-1

"'73

1 (~-1)n

)

yiJ "'"i3"

)

Zn,

with

Having established the lower bound Pn ) Zn we drop the other conditions and estimate as follows

< V(zn) ( '""' 9 ( n! L.....p

"'

Zn

~p
l)n <"' V(zn) (log V(zn))n n! V(z) ·

To proceed further we make the following assumption about the multiplicative function $g(d)$: for any $w$ with $w < z$, we have

$$\prod_{w \le p < z} (1 - g(p))^{-1} \le K \Bigl(\frac{\log z}{\log w}\Bigr)^{\kappa} \tag{6.34}$$

where $\kappa > 0$ and $K > 1$ do not depend on $w$.

REMARKS. The condition (6.34) is relatively easy to verify in practice. We call the exponent $\kappa$ the sieve dimension. Notice that $\kappa$ is not uniquely defined; any larger number is also qualified. The constant $K$ can be disturbingly large if the densities $g(p)$ are near one for many small $p$. Nevertheless the results can be refined in this respect. Indeed, in practice one can establish the following estimate

$$\prod_{w \le p < z} (1 - g(p))^{-1} \le \Bigl(\frac{\log z}{\log w}\Bigr)^{\kappa'} \Bigl(1 + \frac{K'}{\log w}\Bigr) \tag{6.35}$$

for any $w$ with $2 \le w < z$, where $\kappa'$ and $K'$ are positive numbers independent of $w$. This estimate implies (6.34) with $\kappa = 1 + \kappa'$ and

$$K = 1 + \frac{K'}{\log z}. \tag{6.36}$$

Therefore $K$ can be reduced to a number near 1 at the expense of increasing the dimension $\kappa$ by 1.

By (6.34) we derive using the inequality $1 + x \le e^{x}$ that

$$V(z_n) \le K V(z) \Bigl(\frac{\beta}{\beta - 1}\Bigr)^{\kappa n} < K V(z) e^{n/b}$$

where $\beta = \kappa b + 1$. Hence

Inserting $n! \ge e(n/e)^{n}$ we obtain (6.37) where $a = b^{-1} e^{1 + b^{-1}}$. We choose $b$ large so that $a < 1$. The condition $p_n \ge y_n$ in (6.31) implies $p_1^{n + \beta} \ge y$, and this shows that the sum (6.31) is void if $n \le s - \beta$. Therefore (6.37) yields (6.38). We choose $\beta = 9\kappa + 1$, getting $b = 9$ and $a < e^{-1}$. Hence we conclude

THEOREM 6.1. Let $\Lambda^{+} = (\lambda_d^{+})$, $\Lambda^{-} = (\lambda_d^{-})$ be the combinatorial sieves of level $y > 1$ and the parameter $\beta = 9\kappa + 1$. Then for any multiplicative function $g(d)$ satisfying (6.10), (6.11) and (6.34) and any $z \le y^{1/\beta}$ we have

$$V^{+}(z) < (1 + e^{\beta - s} K^{10}) V(z), \tag{6.39}$$

$$V^{-}(z) > (1 - e^{\beta - s} K^{10}) V(z) \tag{6.40}$$

where $s = \log y / \log z$.

COROLLARY 6.2. Let the conditions be as above. Then

$$S(x, z) < (1 + e^{\beta - s} K^{10}) V(z) X + R(x, z^{s}), \tag{6.41}$$

$$S(x, z) > (1 - e^{\beta - s} K^{10}) V(z) X - R(x, z^{s}) \tag{6.42}$$

where $\beta = 9\kappa + 1$, $s$ is any number $\ge \beta$, and $R(x, y)$ denotes the remainder sum (6.24).
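The combinatorial construction behind Theorem 6.1 can be made concrete. The following sketch builds the truncated sets $\mathcal{D}^{\pm}$ of (6.26), (6.27) with the parameters (6.33), and checks the sieve inequalities (6.19) and (6.32) by brute force for $g(d) = 1/d$. The function names and the sample values $z = 20$, $y = 100$, $\beta = 2$ are assumptions made for the illustration; they are not the choice $\beta = 9\kappa + 1$ of the theorem.

```python
from fractions import Fraction
from itertools import combinations
from math import gcd, prod


def primes_below(z):
    """Primes p < z by trial division (adequate for small z)."""
    return [p for p in range(2, z)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def in_truncated_set(primes_desc, y, beta, odd):
    """Test d = p_1 ... p_r (p_1 > ... > p_r) for membership in D^+ (odd=True)
    or D^- (odd=False): p_m < y_m for all m of the given parity, where
    y_m = (y / (p_1 ... p_m))^(1/beta), i.e. p_1 ... p_m * p_m^beta < y."""
    partial = 1
    for m, p in enumerate(primes_desc, start=1):
        partial *= p
        if (m % 2 == 1) == odd and not partial * p ** beta < y:
            return False
    return True


def sieve_weights(z, y, beta):
    """Return (lambda^+, lambda^-) as dicts d -> mu(d) over d | P(z) in D^+, D^-."""
    plus, minus = {}, {}
    ps = primes_below(z)
    for r in range(len(ps) + 1):
        for combo in combinations(ps, r):
            desc = sorted(combo, reverse=True)
            d, mu = prod(desc), (-1) ** r
            if in_truncated_set(desc, y, beta, odd=True):
                plus[d] = mu
            if in_truncated_set(desc, y, beta, odd=False):
                minus[d] = mu
    return plus, minus


def delta(weights, n):
    """Incomplete convolution: sum of the weights over d | n."""
    return sum(w for d, w in weights.items() if n % d == 0)


def V(weights):
    """V^{+-}(z) of (6.21) for the density g(d) = 1/d, as an exact fraction."""
    return sum(Fraction(w, d) for d, w in weights.items())


z, y, beta = 20, 100, 2.0
lam_plus, lam_minus = sieve_weights(z, y, beta)
V_exact = prod(Fraction(p - 1, p) for p in primes_below(z))
print(len(lam_plus), len(lam_minus), float(V(lam_minus)), float(V_exact), float(V(lam_plus)))
```

With these parameters only a small fraction of the $2^{8}$ divisors of $P(20)$ carry a non-zero weight, every one of them is below the level $y$, and $V^{-}(z) \le V(z) \le V^{+}(z)$ holds as (6.32) predicts.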

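6.4. Fundamental lemma of sieve theory.

Combining the inequalities (6.39), (6.40) with (6.32) we conclude the following

FUNDAMENTAL LEMMA 6.3. Let $\kappa > 0$ and $y > 1$. There exist two sets of real numbers $\Lambda^{+} = (\lambda_d^{+})$ and $\Lambda^{-} = (\lambda_d^{-})$ depending only on $\kappa$ and $y$ with the following properties:

$$\lambda_1^{\pm} = 1, \tag{6.43}$$

$$|\lambda_d^{\pm}| \le 1 \quad \text{if } 1 < d < y, \tag{6.44}$$

$$\lambda_d^{\pm} = 0 \quad \text{if } d \ge y, \tag{6.45}$$

and for any integer $n > 1$,

$$\sum_{d \mid n} \lambda_d^{-} \le 0 \le \sum_{d \mid n} \lambda_d^{+}. \tag{6.46}$$

Moreover, for any multiplicative function $g(d)$ with $0 \le g(p) < 1$ and satisfying the dimension condition

$$\prod_{w \le p < z} (1 - g(p))^{-1} \le \Bigl(\frac{\log z}{\log w}\Bigr)^{\kappa} \Bigl(1 + \frac{K}{\log w}\Bigr) \tag{6.47}$$

for all $2 \le w < z \le y$ we have

$$\sum_{d \mid P(z)} \lambda_d^{\pm} g(d) = \Bigl(1 + O\Bigl(e^{-s}\Bigl(1 + \frac{K}{\log z}\Bigr)^{10}\Bigr)\Bigr) \prod_{p < z} (1 - g(p)) \tag{6.48}$$

where $P(z)$ denotes the product of all primes $p < z$ and $s = \log y / \log z$, the implied constants depending only on $\kappa$.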

REMARK. The Fundamental Lemma plays an assisting role in sieve theory; it usually serves for preliminary elimination of the elements of a sequence which have small prime divisors, that is primes $p < z = y^{1/s}$ with $s \to +\infty$. In such applications the results are asymptotically precise, so one loses nothing while gaining a practical assumption that the sequence in question is supported on almost primes. Quite often one only needs an upper bound of the right order of magnitude, that is

$$\sum_{d \mid P(z)} \lambda_d^{+} g(d) \ll K^{10} \prod_{p < z} (1 - g(p)). \tag{6.49}$$

This holds under the condition (6.34) (which is weaker than (6.47)) with the implied constant depending only on $\kappa$.

Although the sequence $\mathcal{A} = (a_n)$ accompanied us when constructing the sieves $\Lambda^{+} = (\lambda_d^{+})$ and $\Lambda^{-} = (\lambda_d^{-})$, one should realize that $\Lambda^{+}$, $\Lambda^{-}$ have actually nothing in common with $\mathcal{A}$. One can legally apply the particular sieves $\Lambda^{+}$, $\Lambda^{-}$ to any sequence of non-negative numbers; the only risk is that the results may not be ideal if the parameter $\kappa$ (the so-called dimension) and $y$ (the level) of $\Lambda^{+}$, $\Lambda^{-}$ do not match perfectly the characteristics of $\mathcal{A}$.

Of course one does not need to push $s$ to infinity. The lower bound (6.42) is already interesting if $s > \beta$. Assuming that the remainder term $R(x, y)$ is negligible for a suitable level $y$, one infers from (6.42) that (6.50) $n \le x$

(n,P(z))=l


where $z = y^{1/(\beta + \varepsilon)}$ for any $\varepsilon > 0$ provided $y$ is sufficiently large in terms of $\varepsilon$. Suppose we can control $R(x, y)$ for $y = x^{\alpha - \varepsilon}$ with $0 < \alpha \le 1$. Then (6.50) holds for $z = x^{(\alpha - \varepsilon)/(\beta + \varepsilon)}$, while the condition $(n, P(z)) = 1$ together with $n \le x$ imply that $n$ has at most $\beta/\alpha$ prime divisors. In general to establish an upper bound for $S(x, z)$ is much easier, because $S(x, z)$ is decreasing in $z$ so one can reduce $z$ at will.

EXERCISE 1. Derive from (6.41) the upper bound (6.51)

EXERCISE 2. Show that the sequence (6.1) satisfies the sieve conditions for dimension $\kappa = 1$ and level $\alpha = 1/2$ (see Theorem 17.1). Then derive from (6.42) that (6.52) where $\pi_2(x, z)$ denotes the number of primes $p \le x$ for which $p + 2$ has no prime divisors $< z$. Therefore there are infinitely many primes $p$ such that $p + 2$ has at most twenty prime divisors.

Recent developments of sieve methods in this direction show (among other things) that

$$|\{ p \le x;\ p + 2 = p_1 \text{ or } p_1 p_2 \text{ with } p_1, p_2 > x^{3/11} \}| \gg x (\log x)^{-2}$$

for all sufficiently large $x$. Hence $p + 2$ has at most two prime divisors infinitely often. This last result is due to J.R. Chen [Ch]. We still do not know which case, $p + 2 = p_1$ or $p + 2 = p_1 p_2$, occurs infinitely often. Probably both!

6.5. The $\Lambda^2$-sieve.

A powerful and elegant method of an upper bound sieve came from Atle Selberg [Sl]. Selberg's method yields results of great generality; it is simpler than combinatorial sieves at the start, though equally complex in its advanced forms. We modify the notation of the previous section slightly to display clearly the economy of conditions which are required for performing Selberg's sieve.

Recall that an upper bound sieve of level $D$ is a sequence of real numbers $\lambda_d$ for $d < D$ with $\lambda_1 = 1$ such that

$$\sum_{d \mid m} \lambda_d \ge 0 \quad \text{for all } m > 1. \tag{6.53}$$

Hereafter the superscript $+$ is omitted for notational simplicity. This positivity condition was quite difficult to get a hold on in combinatorial sieve. Selberg made it very easy by choosing $\lambda_d$ such that

$$\sum_{d \mid m} \lambda_d = \Bigl(\sum_{d \mid m} \rho_d\Bigr)^{2} \tag{6.54}$$

where $\{\rho_d\}$ is another sequence of real numbers with

$$\rho_1 = 1. \tag{6.55}$$

Since the squares are non-negative, such a choice guarantees (6.53) no matter what $\rho_d$ are!

Selberg's choice amounts to

$$\lambda_d = \sum_{[d_1, d_2] = d} \rho_{d_1} \rho_{d_2}. \tag{6.56}$$

In order to control the level we assume $\rho_d$ are supported on integers $< \sqrt{D}$,

$$\rho_d = 0 \quad \text{if } d \ge \sqrt{D}. \tag{6.57}$$

Hence the resulting sieve $\{\lambda_d\}$ has level of support $D$. Following Selberg we call it the $\Lambda^2$-sieve of level $D$.

Throughout, $\mathcal{A} = (a_n)$ is a sequence of non-negative numbers, $P$ is a squarefree number and for any $d \mid P$ we set

$$|\mathcal{A}_d| = \sum_{n \equiv 0 \ (\mathrm{mod}\ d)} a_n = g(d) X + r_d(\mathcal{A}). \tag{6.58}$$

Here as before, $g(d)$ is a multiplicative function satisfying (6.10). Applying the $\Lambda^2$-sieve to the sequence $\mathcal{A} = (a_n)$ in the sifting range $P$ we get

$$S(\mathcal{A}, P) = \sum_{(n, P) = 1} a_n \le \sum_{n} a_n \Bigl(\sum_{d \mid (n, P)} \rho_d\Bigr)^{2} = \mathop{\sum\sum}_{d_1, d_2 \mid P} \rho_{d_1} \rho_{d_2} |\mathcal{A}_{[d_1, d_2]}| = X G + R(\mathcal{A}, P)$$

where

$$G = \mathop{\sum\sum}_{d_1, d_2 \mid P} g([d_1, d_2]) \rho_{d_1} \rho_{d_2} \tag{6.59}$$

and

$$R(\mathcal{A}, P) = \mathop{\sum\sum}_{d_1, d_2 \mid P} \rho_{d_1} \rho_{d_2} r_{[d_1, d_2]}(\mathcal{A}). \tag{6.60}$$

The task before us is to make this general inequality optimal. Forgetting for a moment about the remainder term $R(\mathcal{A}, P, \Lambda^2)$ we wish to minimize $G$ with respect to the unknown numbers $\rho_d$ subject to (6.55) and (6.57). The ensuing numbers will satisfy

$$|\rho_d| \le 1, \tag{6.61}$$

hence the remainder term is automatically under control.

The expression (6.59) is a quadratic form in $\rho_d$. In order to find the minimum of $G$ it helps to diagonalize. In the presentation below it goes without saying that $\rho_d$ is supported on, and the relevant variables of summation run over, the divisors of $P$ (thus over squarefree numbers). Furthermore we can assume that

$$0 < g(p) < 1 \ \text{ if } p \mid P, \qquad g(p) = 0 \ \text{ if } p \nmid P. \tag{6.62}$$

Let $h(d)$ be the multiplicative function defined by

$$h(p) = \frac{g(p)}{1 - g(p)}. \tag{6.63}$$

We obtain

$$\begin{aligned} G &= \sum_{abc \mid P} g(abc) \rho_{ac} \rho_{bc} = \sum_{c} g(c) \mathop{\sum\sum}_{(a, b) = 1} g(a) g(b) \rho_{ac} \rho_{bc} \\ &= \sum_{c} g(c) \sum_{d} \mu(d) g(d)^{2} \Bigl(\sum_{m} g(m) \rho_{cdm}\Bigr)^{2} = \sum_{d \mid P} h(d)^{-1} \Bigl(\sum_{m \equiv 0 \ (\mathrm{mod}\ d)} g(m) \rho_m\Bigr)^{2}. \end{aligned}$$

Hence by the linear change of variables

$$\xi_d = \mu(d) \sum_{m \equiv 0 \ (\mathrm{mod}\ d)} g(m) \rho_m \tag{6.64}$$

we obtain the diagonal form

$$G = \sum_{d \mid P} h(d)^{-1} \xi_d^{2}. \tag{6.65}$$

We still have to reinterpret the condition (6.55) in terms of the new variables $\xi_d$. To this end we use the Möbius inversion to convert (6.64) into

$$\rho_\ell = \frac{\mu(\ell)}{g(\ell)} \sum_{d \equiv 0 \ (\mathrm{mod}\ \ell)} \xi_d. \tag{6.66}$$

In particular, for $\ell = 1$ this gives the linear equation

$$\sum_{d \mid P} \xi_d = 1. \tag{6.67}$$

Moreover, one observes by (6.64) and (6.66) that the support conditions (6.57) for $\rho_d$ are equivalent to these for $\xi_d$,

$$\xi_d = 0 \quad \text{if } d \ge \sqrt{D}. \tag{6.68}$$

Now our target is to minimize (6.65) on the hyperplane (6.67). Applying Cauchy's inequality to (6.67) we derive $GH \ge 1$ where

$$H = \sum_{\substack{d < \sqrt{D} \\ d \mid P}} h(d) \tag{6.69}$$

so $G$ cannot be smaller than $H^{-1}$. The equality

$$GH = 1 \tag{6.70}$$

holds for

$$\xi_d = h(d) H^{-1} \quad \text{if } d < \sqrt{D}. \tag{6.71}$$

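Note that

$$H \le \sum_{d \mid P} h(d) = \prod_{p \mid P} (1 + h(p)) = \prod_{p \mid P} (1 - g(p))^{-1}, \tag{6.72}$$

i.e.,

$$\frac{1}{H} \ge \prod_{p \mid P} (1 - g(p)). \tag{6.73}$$

Next we compute $\rho_\ell$ by inserting (6.71) into (6.66) getting

$$\mu(\ell) g(\ell) \rho_\ell H = \sum_{\substack{m < \sqrt{D} \\ m \equiv 0 \ (\mathrm{mod}\ \ell)}} h(m),$$

that is

$$\rho_\ell = \mu(\ell) \frac{h(\ell)}{g(\ell) H} \sum_{\substack{d < \sqrt{D}/\ell \\ (d, \ell) = 1}} h(d). \tag{6.74}$$

Now we show (6.61). To this end we group the terms in (6.69) according to the greatest common divisor of $d$ and $\ell$ getting

$$H = \sum_{k \mid \ell} \sum_{\substack{d < \sqrt{D} \\ (d, \ell) = k}} h(d) = \sum_{k \mid \ell} h(k) \sum_{\substack{m < \sqrt{D}/k \\ (m, \ell) = 1}} h(m) \ge \Bigl(\sum_{k \mid \ell} h(k)\Bigr) \sum_{\substack{m < \sqrt{D}/\ell \\ (m, \ell) = 1}} h(m) = \mu(\ell) \rho_\ell H$$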
and this proves (6.61) (this neat estimate is due to J.H. van Lint and H.-E. Richert [LR]). From this one gets directly by (6.56)

$$|\lambda_d| \le \tau_3(d). \tag{6.75}$$

From the above results we conclude the following

THEOREM 6.4. Let $\mathcal{A} = (a_n)$ be a finite sequence of non-negative numbers and $P$ be a finite product of distinct primes. For every $d \mid P$ we write

$$|\mathcal{A}_d| = \sum_{n \equiv 0 \ (\mathrm{mod}\ d)} a_n = g(d) X + r_d(\mathcal{A}) \tag{6.76}$$

where $X > 0$ and $g(d)$ is a multiplicative function with $0 < g(p) < 1$ for $p \mid P$. Let $h(d)$ be the multiplicative function given by $h(p) = g(p)(1 - g(p))^{-1}$ and

$$H = \sum_{\substack{d < \sqrt{D} \\ d \mid P}} h(d) \tag{6.77}$$

for some $D > 1$. Then we have

$$S(\mathcal{A}, P) = \sum_{(n, P) = 1} a_n \le X H^{-1} + R(\mathcal{A}, P) \tag{6.78}$$

where

$$R(\mathcal{A}, P) = \sum_{d \mid P} \lambda_d r_d(\mathcal{A}) \tag{6.79}$$

with $\lambda_d$ given by (6.56) and (6.74).
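The optimization leading to Theorem 6.4 can be verified exactly in a small case. The sketch below takes $g(d) = 1/d$ (so $h(p) = 1/(p - 1)$), computes $\rho_\ell$ from (6.74) and $\lambda_d$ from (6.56) with rational arithmetic, and checks $\rho_1 = 1$, the bound (6.61), the identity $GH = 1$ of (6.70), the positivity (6.53) and the bound (6.75). The function names and the sample data $P = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13$, $D = 200$ are assumptions for the illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import gcd, prod


def selberg_sieve(primes, D):
    """Return (H, rho, lam) for the density g(d) = 1/d on divisors of P = prod(primes).

    rho follows (6.74), lam follows (6.56); divisors are kept as tuples of primes.
    """
    divisors = [c for r in range(len(primes) + 1) for c in combinations(primes, r)]
    small = [c for c in divisors if prod(c) ** 2 < D]      # d < sqrt(D)

    def g(c):
        return Fraction(1, prod(c))

    def h(c):
        return Fraction(1, prod(p - 1 for p in c))

    H = sum(h(c) for c in small)
    rho = {}
    for c in small:
        ell = prod(c)
        inner = sum(h(m) for m in small
                    if (prod(m) * ell) ** 2 < D and gcd(prod(m), ell) == 1)
        rho[ell] = (-1) ** len(c) * h(c) / (g(c) * H) * inner
    lam = {}
    for d1, r1 in rho.items():
        for d2, r2 in rho.items():
            lcm = d1 * d2 // gcd(d1, d2)
            lam[lcm] = lam.get(lcm, 0) + r1 * r2
    return H, rho, lam


def quadratic_form_G(rho):
    """G of (6.59) for g(d) = 1/d."""
    return sum(Fraction(1, d1 * d2 // gcd(d1, d2)) * r1 * r2
               for d1, r1 in rho.items() for d2, r2 in rho.items())


def omega(d, primes):
    """Number of the given primes dividing d."""
    return sum(1 for p in primes if d % p == 0)


primes = [2, 3, 5, 7, 11, 13]
H, rho, lam = selberg_sieve(primes, 200)
print("H =", H, " G*H =", quadratic_form_G(rho) * H)
print("rho:", {d: str(r) for d, r in sorted(rho.items())})
```

Because the arithmetic is exact, $GH = 1$ holds as an identity of fractions rather than up to rounding, which makes the extremal property of (6.71) visible directly.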

Using (6.75) one estimates the remainder term crudely by

$$|R(\mathcal{A}, P)| \le \sum_{\substack{d < D \\ d \mid P}} \tau_3(d) |r_d(\mathcal{A})|. \tag{6.80}$$

The $\Lambda^2$-sieve is very general indeed. Its upper bound does not require any regularity in the distribution of the density function $g(p)$ over primes, therefore the dimension of a sifting problem does not play explicitly a role. If $g(p) = \omega(p) p^{-1}$, where $\omega(p)$ is the number of residue classes (mod $p$) which one wants to exclude, then $h(p) = \omega(p)(p - \omega(p))^{-1}$ is the ratio of the numbers of rejected to admitted classes, and $H$ incorporates these ratios. Of course, the larger $\omega(p)$ is, the smaller the upper bound (6.78). However, a few local deviations of $\omega(p)$ from its average have insignificant global effect. To the contrary our estimates derived by the combinatorial sieve are quite sensitive in this respect because the hypothesis (6.35) is assumed to hold for all segments of primes $w \le p < z$, no matter how short. The $\Lambda^2$-sieve yields fantastic results for sifting problems when the number of residue classes to be excluded is large; it can compete with the large sieve method of Linnik (see Section 7.4).

REMARKS. Note that the numbers $\rho_d$ depend on the multiplicative function $g(d)$ so (indirectly) on the sifting sequence $\mathcal{A}$. However, assuming some regularity of $g(d)$ (as in the $\kappa$-dimensional sieve, see (6.34)) one can establish fairly good approximation (for small $d$ at any rate)

$$\rho_d = \mu(d) \Bigl(\frac{\log \sqrt{D}/d}{\log \sqrt{D}}\Bigr)^{\kappa} \Bigl\{ 1 + O\Bigl(\frac{1}{\log \sqrt{D}/d}\Bigr) \Bigr\}. \tag{6.81}$$

6.6. Estimate for the main term of the $\Lambda^2$-sieve.

To bring the Selberg upper bound (6.78) to a practical form we need a clear lower bound for

$$H = H(D) = \sum_{\substack{d < \sqrt{D} \\ d \mid P}} h(d).$$

$$H(D) = \sum_{d < \sqrt{D}}^{\flat} h(d)$$

where the superscript $\flat$ restricts the summation to squarefree numbers.

We can estimate $H(D)$ strongly, elementarily and nicely for $g(d) = d^{-1}$ (this occurs when one is sifting numbers in an interval). In this case $h(d) = \varphi(d)^{-1}$, so

$$H(D) = \sum_{d < \sqrt{D}}^{\flat} \frac{1}{d} \prod_{p \mid d} \Bigl(1 + \frac{1}{p} + \frac{1}{p^{2}} + \cdots\Bigr) \ge \sum_{m < \sqrt{D}} m^{-1} > \log \sqrt{D}.$$

If $g(d)$ agrees with $d^{-1}$ only for $(d, q) = 1$ (as for example in the case of sifting an arithmetic progression), then this lower bound holds with the following modification:

$$H(D) > \prod_{p \mid q} \Bigl(1 - \frac{1}{p}\Bigr) (\log \sqrt{D}). \tag{6.82}$$
The above example is rather special. Now suppose $g(p)p$ is $\kappa$ on average, say

$$\sum_{p \le x} g(p) \log p = \kappa \log x + O(1) \tag{6.83}$$

for all $x \ge 2$ where $\kappa$ is a positive number. Hence $g(p) \log p \ll 1$. Suppose also that

(6.84)

Since $h(p) = g(p) + O(g(p)^{2})$, it follows that (6.83) holds for $h$ as well (with a different implied constant). Applying Theorem 1.1 for the multiplicative function $h$ in place of $f$ we get

where $p$

(the required conditions (1.89) and (1.90) are satisfied by (6.83), (6.84)) and the implied constant depends on that in (6.83). If $D$ is large in terms of this constant we can invert this approximation getting

(6.85)

where

$$H_g = \prod_{p} (1 - g(p)) \Bigl(1 - \frac{1}{p}\Bigr)^{-\kappa}. \tag{6.86}$$

6.7. Estimates for the remainder term in the $\Lambda^2$-sieve.

Once again we consider a class of sifting problems in which the individual error terms satisfy

$$|r_d(\mathcal{A})| \le g(d) d. \tag{6.87}$$

Naturally with this property one makes the condition

$$g(d) d \ge 1 \quad \text{if } d \mid P. \tag{6.88}$$

This implies $g([d_1, d_2])[d_1, d_2] \le g(d_1) g(d_2) d_1 d_2$, therefore

$$|R(\mathcal{A}, P, \Lambda^2)| \le \Bigl(\sum_{d < \sqrt{D}} |\rho_d| g(d) d\Bigr)^{2} \le \Bigl(H^{-1} \sum_{m < \sqrt{D}} h(m) \sigma(m)\Bigr)^{2} \tag{6.89}$$

by (6.60) and (6.72), where $\sigma(m)$ denotes the sum of divisors of $m$. Next assume that the density function $g(p)$ satisfies the following crude bound

$$\sum_{y \le p \le x} g(p) \log p \ll \log(2x/y). \tag{6.90}$$

Hence we infer the same property for $h(p)\sigma(p)p^{-1}$. Then we apply (1.85) getting

$$\sum_{m < \sqrt{D}} h(m) \sigma(m) \ll \frac{\sqrt{D}}{\log D} \sum_{m < \sqrt{D}} h(m) \sigma(m) m^{-1}.$$

Here we have

Hence we conclude that

$$R(\mathcal{A}, P) \ll D (\log D)^{-2}. \tag{6.91}$$

Combining with (6.78) we get

THEOREM 6.5. Suppose the conditions of Theorem 6.4 hold. Moreover, assume (6.87), (6.88) and (6.90). Then we have

$$S(\mathcal{A}, P) \le \frac{X}{H} + O\Bigl(\frac{D}{\log^{2} D}\Bigr) \tag{6.92}$$

where $H = H(D)$ is given by (6.77), $D > 1$ is arbitrary, and the implied constant depends only on that in (6.90).

6.8. Selected applications of A2 -sieve.

Consider the sequence A = (an) which is the characteTistic function of an arithmetic progression in a short interval (6.93)

n

== a(mod q),

:t

<

1l (

:~·

+Y

where (a, q) = 1 and 1 ( q < y. Let P consist of primes p ( Vfj with p f q. Then (6.94)

7r(.r + y;q,a) -1r(x;q.a) ( S(A, P)

+ Vfjq- 1 .

On the other hand, the conditions ofTheorem6.5 are satisfied with X= yq- 1 and 1 g(d) = d- if (d,q) = 1. By (6.82) we have H(D) >~log D. Combining (6.92) _q with (6.94) we get 1r(x+y;q,a)-1r(x;q,a)<

Choosing D = yq- 1 we conclude

JY) · q

2y D+O ( - D- + ()l 2

og

log D

6.


THEOREM 6.6. For (a, q) (6.95)

167

= 1 and 1 !( q < y we ha-ve

1r(x+y;q,a)-1r(.r;q,a)<

<.p

y ) ()I2y (/)+0 ( q og y q qiog 2 (yjq)

where the implied constant is absolute. This is called the Brun-Titchmarsh inequality. Using the large sieve method H. L. Montgomery and R. C. Vaughan [MV1] have shown that the error term in (6.95) can be deleted.
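The Brun-Titchmarsh inequality, in the form without error term, is easy to test numerically. The following Python sketch counts the primes $p \equiv a \pmod q$ with $x < p \le x + y$ and compares the count with $2y/(\varphi(q)\log(y/q))$. The function names and the sample values of $x$, $y$, $q$ are our own illustrative choices; this is a sanity check under those assumptions, not a proof.

```python
import math


def primes_up_to(limit: int) -> list[int]:
    """Sieve of Eratosthenes: all primes <= limit (limit >= 2 assumed)."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            # Cross out the multiples i*i, i*i + i, ... of i.
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [n for n in range(limit + 1) if is_prime[n]]


def euler_phi(q: int) -> int:
    """Euler's totient by brute force (adequate for small q)."""
    return sum(1 for a in range(1, q + 1) if math.gcd(a, q) == 1)


def primes_in_progression(x: int, y: int, q: int, a: int, primes: list[int]) -> int:
    """pi(x + y; q, a) - pi(x; q, a): primes p = a (mod q) with x < p <= x + y."""
    return sum(1 for p in primes if x < p <= x + y and (p - a) % q == 0)


def brun_titchmarsh_bound(y: int, q: int) -> float:
    """Main term 2y / (phi(q) log(y/q)); requires 1 <= q < y."""
    return 2 * y / (euler_phi(q) * math.log(y / q))


if __name__ == "__main__":
    x, y = 10_000, 2_000  # illustrative short interval (x, x + y]
    primes = primes_up_to(x + y)
    for q in (1, 3, 7, 12):
        for a in range(1, q + 1):
            if math.gcd(a, q) == 1:
                count = primes_in_progression(x, y, q, a, primes)
                print(q, a, count, round(brun_titchmarsh_bound(y, q), 1))
```

In each printed row the count should stay below the bound; the interest of the inequality lies in its uniformity in $x$, which a single experiment of course cannot show.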

Next we take $\mathcal{A} = (a_n)$ the characteristic function of the polynomial values

$n = (m - a_1) \cdots (m - a_k)$

with $1 \le m \le x$, where all $a_j$ are distinct. In this case $g(p) = \nu(p)p^{-1}$ where $\nu(p)$ is the number of roots modulo $p$. If $p$ is sufficiently large, $\nu(p) = k$ so we have a $k$-dimensional sieve problem. By (6.85) and (6.92) we deduce

THEOREM 6.7. Let $a = (a_1, \dots, a_k)$ be distinct integers which do not cover all residue classes to any prime modulus. Then the number of integers $1 \le m \le x$ for which $m - a_1, \dots, m - a_k$ are all primes satisfies

(6.96)

where

(6.97)  $B = \prod_p \Big(1 - \frac{\nu(p)}{p}\Big)\Big(1 - \frac{1}{p}\Big)^{-k}.$

REMARKS. The upper bound (6.96) is larger by factor $2^k k!$ than the conjectured asymptotic $\pi(x;a) \sim Bx(\log x)^{-k}$.
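To get a feeling for the constant $B$ in (6.97), one can evaluate a truncated version of the product. The Python sketch below does this for an arbitrary tuple of shifts; the function names and the truncation point are our own assumptions. For the twin prime tuple $a = (0, 2)$ the truncated product should be close to $1.32032$, which is twice the classical twin prime constant, while a tuple covering all residue classes modulo some prime gives $B = 0$.

```python
import math


def primes_up_to(limit: int) -> list[int]:
    """Sieve of Eratosthenes: all primes <= limit (limit >= 2 assumed)."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [n for n in range(limit + 1) if is_prime[n]]


def singular_series(shifts: list[int], prime_limit: int) -> float:
    """Truncation of B = prod_p (1 - nu(p)/p) (1 - 1/p)^(-k) to primes p <= prime_limit.

    Here nu(p) is the number of roots of (m - a_1)...(m - a_k) modulo p,
    i.e. the number of distinct residues among the shifts a_j.
    """
    k = len(shifts)
    product = 1.0
    for p in primes_up_to(prime_limit):
        nu = len({a % p for a in shifts})
        product *= (1 - nu / p) * (1 - 1 / p) ** (-k)
    return product


if __name__ == "__main__":
    print(singular_series([0, 2], 10**5))     # twin primes: m and m - 2 both prime
    print(singular_series([0, 2, 6], 10**5))  # a prime triple pattern
    print(singular_series([0, 1], 100))       # covers all classes mod 2, so B = 0
```

The product converges because for large $p$ the local factor is $1 + O(p^{-2})$; the truncation at $10^5$ is only a convenient choice.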

CHAPTER 7

BILINEAR FORMS AND THE LARGE SIEVE

7.1. General principles of estimating double sums.

Certainly the most versatile instruments of analytic number theory are double sums

(7.1)  $\Psi(\alpha,\beta) = \sum_m \sum_n \alpha_m \beta_n \varphi(m,n).$

Here $\alpha = (\alpha_m)$, $\beta = (\beta_n)$ are finite vectors of complex numbers and

(7.2)  $\Phi = (\varphi(m,n))$

is a finite matrix with complex entries. In matrix notation $\Psi(\alpha,\beta) = \alpha\Phi\beta^t$. Given $\Phi$ the goal is to give a sharp estimate for $\Psi(\alpha,\beta)$ which is valid for arbitrary vectors $\alpha$, $\beta$. Yes, it is essential to assume nothing about the numbers $\alpha_m$, $\beta_n$, because in practice they are quite complex objects. Very often an attempt to utilize specific structure of these vectors brings back a mirror image of the original situation. Hence a question, what is the benefit from arranging the original sums into bilinear forms? The key feature is that the sequences $\alpha$, $\beta$ do not see each other! Therefore a cancellation in $\Psi(\alpha,\beta)$ is possible, subject only to properties of the matrix coefficients $\varphi(m,n)$. Indeed we have

(7.3)

where $\Delta = \Delta(\Phi)$ is the norm of the corresponding linear operator and

(7.4)

A more transparent way of estimating $\Psi(\alpha,\beta)$ goes by Cauchy's inequality. To this end choose first $m$ to be in the outer sum and $n$ in the inner sum getting

(7.5)  $|\Psi(\alpha,\beta)|^2 \le \|\alpha\|^2 \sum_m \Big|\sum_n \beta_n \varphi(m,n)\Big|^2.$

Note that the unknown coefficients $\alpha_m$ in the outer sum are gone. At this point one could be more flexible by introducing new coefficients $f(m)$ in place of $\alpha_m$ to smooth the next step and to optimize the final bound. Smoothing is one among many optional devices which are put into practice with remarkable effects in analytic number theory. This is not just a technical device, it can be viewed as a kind of spectral completion. Next squaring out the inner sum over $n$ followed by interchanging the order of summation we get

(7.6)  $\sum_m \Big|\sum_n \beta_n \varphi(m,n)\Big|^2 = \sum_{n_1}\sum_{n_2} \beta_{n_1}\bar\beta_{n_2} \sum_m \varphi(m,n_1)\bar\varphi(m,n_2).$

For $n_1 = n_2$, the so-called diagonal terms, one cannot get cancellation in the inner sum

(7.7)  $\sum_m |\varphi(m,n_1)|^2$

so the saving factor comes solely from the fact that the diagonal is smaller than the full square! However, for most of $n_1$, $n_2$ off the diagonal one may find a considerable cancellation in the sum

(7.8)  $\sum_m \varphi(m,n_1)\bar\varphi(m,n_2)$

because of independent variations in the sign of $\varphi$. At this point the sums (7.8) can be either estimated directly finishing the job, or they can be transformed to another sum (so to speak a dual sum) by a kind of spectral formula. In the latter scenario even a trivial estimation of the dual sum often produces a non-trivial bound for the original sum. Moreover, one can also exploit the extra summation over $n_1$, $n_2$.

Our point in these remarks is to say that the general estimate (7.3) obstructs the visibility for improvements. We advise to carry out the arguments in practice always by Cauchy's inequality because it allows you to control your position. Before launching Cauchy's inequality to the double sum $\Psi(\alpha,\beta)$, make sure to choose the outer and the inner sums variables conveniently as the results will be different. This option of interchanging $m$, $n$ in $\Psi(\alpha,\beta)$ is the essence of the following duality principle.

DUALITY PRINCIPLE. Suppose for any complex numbers $\beta_n$,

(7.9)  $\sum_m \Big|\sum_n \beta_n \varphi(m,n)\Big|^2 \le \Delta\|\beta\|^2.$

Then for any complex numbers $\alpha_m$

(7.10)  $\sum_n \Big|\sum_m \alpha_m \varphi(m,n)\Big|^2 \le \Delta\|\alpha\|^2,$

where $\Delta$ is the same in both inequalities.

PROOF. The left side of (7.10) is equal to (7.1) with

(7.11)  $\beta_n = \overline{\sum_m \alpha_m \varphi(m,n)}.$
By (7.5) and (7.9) we get $|\Psi(\alpha,\beta)|^2 \le \Delta\|\alpha\|^2\|\beta\|^2$. But $\|\beta\|^2 = \Psi(\alpha,\beta)$ so (7.10) is derived from (7.9). □

Because of the application of Cauchy's inequality, it is inevitable that much attention concentrates on estimating sums of type (7.9) or (7.10). These equivalent statements for general coefficients $\alpha_m$, $\beta_n$ are called Large Sieve Inequalities for historical reasons although they do not resemble a sieve of any kind. It is better to regard (7.9), (7.10) as a near-orthogonality property of $\Phi$.

We end this introduction to general bilinear forms with a few comments about obvious cases for which the ideas described definitely cannot work. Suppose $\varphi(m,n)$ factors, say $\varphi(m,n) = \chi(m)\psi(n)$, or $\varphi(m,n)$ is a short linear combination of such products. In other words, the variables $m$, $n$ of $\varphi(m,n)$ are separated at no, or low, cost. Then we may choose the vectors $\alpha = (\alpha_m)$, $\beta = (\beta_n)$ with $\alpha_m = \bar\chi(m)$, $\beta_n = \bar\psi(n)$. For this biased choice of vectors the bilinear form is $\Psi(\alpha,\beta) = \|\alpha\|^2\|\beta\|^2$ showing no cancellation of terms.

The principles of estimating bilinear forms in analytic number theory have been articulated very slowly. Perhaps this is because it is not the simplicity of the ideas, but rather the additional devices introduced by researchers in particular cases which make this technology so effective today. First applications dealt with bilinear forms in which $\varphi(m,n)$ depended only on the product $mn$, say

(7.12)  $\varphi(m,n) = g(mn),$

where $g$ is an arithmetic function in one variable. Of course, in this case $g$ cannot be multiplicative. I. M. Vinogradov should be mentioned for the first impressive results. Using an exclusion-inclusion procedure (a sieve method) he managed to express exponential sums over primes by double sums and estimated the latter by elementary means (cancellation in geometric series). Then utilizing the result in the circle method Vinogradov gave a stunning solution of the ternary Goldbach problem. His ideas are still alive today (see Chapters 13 and 19); however, there are more instances which stimulated the whole business of bilinear forms. Some of these will be explicitly described in forthcoming chapters, and many are embedded in arguments without recognition. Linnik [Li2] has elaborated the idea of estimating double sums into the dispersion method, which is missing in this book.

The notation chosen for this introduction serves as a model. Obviously we do not need to consider $\varphi(m,n)$ as a function in two integral variables. Actually in some works it is nicer to index the two variables by more appropriate parameters (rational fractions, characters, eigenvalues, etc.).

In this chapter we give a selection from the great variety of bilinear forms and large sieve type inequalities. A few basic ones will be given detailed proofs, a few more will be discussed in a broad context, and many will be just stated with references to original publications.

7.2. Bilinear forms with exponentials.

The bilinear forms of type (7.1) with

(7.13)  $\varphi(m,n) = e(x_m y_n)$

appear quite often in classical analytic number theory. Here $x_m$, $y_n$ are real numbers and $e(z) = \exp(2\pi i z)$ as usual.

LEMMA 7.1. For any complex numbers $\alpha_m$ and real numbers $x_m$ we have

(7.14)  $\int_{-Y}^{Y} \Big|\sum_m \alpha_m e(x_m y)\Big|^2 dy \le 5Y \mathop{\sum\sum}_{2Y|x_{m_1} - x_{m_2}| < 1} |\alpha_{m_1}\alpha_{m_2}|.$


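PROOF. Let $g(y)$ be a non-negative function on $\mathbb{R}$ with $g(y) \ge 1$ if $|y| \le \frac12$ and $\operatorname{supp}(\hat g) \subset [-1, 1]$. Then the left side of (7.14) is bounded by

$\int g\Big(\frac{y}{2Y}\Big) \Big|\sum_m \alpha_m e(x_m y)\Big|^2 dy = 2Y \sum_{m_1,m_2} \alpha_{m_1}\bar\alpha_{m_2}\,\hat g\big(2Y(x_{m_1} - x_{m_2})\big) \le 2Y\hat g(0) \mathop{\sum\sum}_{2Y|x_{m_1} - x_{m_2}| < 1} |\alpha_{m_1}\alpha_{m_2}|.$

The Fourier pair $g(y) = \big(\frac{\sin \pi y}{2y}\big)^2$, $\hat g(v) = \frac{\pi^2}{4}\max(1 - |v|, 0)$ satisfies the above conditions, giving the result. □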
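The two numerical facts behind the constant 5 in (7.14) are that $g(y) = (\sin \pi y / 2y)^2$ is at least 1 on $|y| \le \frac12$ and that $2\hat g(0) = \pi^2/2 < 5$. The Python sketch below checks these on a grid and then tests (7.14) itself on random data, using the closed form $\int_{-Y}^{Y} e(dy)\,dy = \sin(2\pi Y d)/(\pi d)$ for the left side. All function names and the random sample are our own choices for illustration.

```python
import math
import random


def g(y: float) -> float:
    """Test function g(y) = (sin(pi y) / (2 y))^2, with g(0) = (pi/2)^2 by continuity."""
    if y == 0.0:
        return (math.pi / 2) ** 2
    return (math.sin(math.pi * y) / (2 * y)) ** 2


def lhs_7_14(alpha: list[complex], xs: list[float], Y: float) -> float:
    """Exact value of the integral over [-Y, Y] of |sum_m alpha_m e(x_m y)|^2."""
    total = 0.0
    for a1, x1 in zip(alpha, xs):
        for a2, x2 in zip(alpha, xs):
            d = x1 - x2
            # Integral of e(d*y) over [-Y, Y]: sin(2 pi Y d) / (pi d), or 2Y if d = 0.
            kernel = 2 * Y if d == 0 else math.sin(2 * math.pi * Y * d) / (math.pi * d)
            total += (a1 * a2.conjugate()).real * kernel
    return total


def rhs_7_14(alpha: list[complex], xs: list[float], Y: float) -> float:
    """5Y times the sum of |alpha_{m1} alpha_{m2}| over pairs with 2Y|x_{m1} - x_{m2}| < 1."""
    near = sum(abs(a1 * a2)
               for a1, x1 in zip(alpha, xs)
               for a2, x2 in zip(alpha, xs)
               if 2 * Y * abs(x1 - x2) < 1)
    return 5 * Y * near


if __name__ == "__main__":
    random.seed(0)
    xs = [random.uniform(-3, 3) for _ in range(30)]
    alpha = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in xs]
    for Y in (0.1, 1.0, 5.0):
        print(Y, lhs_7_14(alpha, xs, Y), rhs_7_14(alpha, xs, Y))
```

For small $Y$ almost all pairs are "near" and the bound is close to trivial; for large $Y$ only the diagonal survives on the right side, which is the regime the lemma is designed for.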

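Our basic inequality is

THEOREM 7.2. Let $x_m$, $y_n$ be real numbers with $|x_m| \le X$, $|y_n| \le Y$. Then for any complex numbers $\alpha_m$, $\beta_n$

(7.15)  $\Big|\sum_m\sum_n \alpha_m\beta_n e(x_m y_n)\Big| \le 5(XY+1)^{\frac12} \Big(\mathop{\sum\sum}_{|x_{m_1} - x_{m_2}|Y < 1} |\alpha_{m_1}\alpha_{m_2}|\Big)^{\frac12} \Big(\mathop{\sum\sum}_{|y_{n_1} - y_{n_2}|X < 1} |\beta_{n_1}\beta_{n_2}|\Big)^{\frac12}.$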

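PROOF. Put $\varepsilon = (4X)^{-1}$ and

$\gamma_m = \frac{\pi x_m \alpha_m}{\sin(2\pi\varepsilon x_m)}.$

Using

$e(xy) = \frac{\pi x}{\sin 2\pi\varepsilon x} \int_{y-\varepsilon}^{y+\varepsilon} e(xt)\,dt$

we find that the left side of (7.15) is bounded by

$\int_{-Y-\varepsilon}^{Y+\varepsilon} \Big(\sum_{|y_n - y| \le \varepsilon} |\beta_n|\Big) \Big|\sum_m \gamma_m e(x_m y)\Big|\,dy.$

Hence by Cauchy's inequality and Lemma 7.1 we find that the square of the left side of (7.15) is bounded by

$\Big(\int_{-Y-\varepsilon}^{Y+\varepsilon} \Big(\sum_{|y_n - y| \le \varepsilon} |\beta_n|\Big)^2 dy\Big) \Big(\int_{-Y-\varepsilon}^{Y+\varepsilon} \Big|\sum_m \gamma_m e(x_m y)\Big|^2 dy\Big)$

$\le 2\varepsilon \Big(\mathop{\sum\sum}_{|y_{n_1} - y_{n_2}| \le 2\varepsilon} |\beta_{n_1}\beta_{n_2}|\Big)\, 5(Y+\varepsilon)(\pi X)^2 \Big(\mathop{\sum\sum}_{|x_{m_1} - x_{m_2}|Y < 1} |\alpha_{m_1}\alpha_{m_2}|\Big).$

Here $10\varepsilon(Y+\varepsilon)(\pi X)^2 = \frac{5\pi^2}{2}\big(XY + \frac14\big) < 25(XY+1)$ completing the proof. □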
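The inequality (7.15) can be compared with the bilinear form directly on random data. The Python sketch below evaluates both sides, taking $X = \max|x_m|$ and $Y = \max|y_n|$; the helper names and the sample sizes are our own assumptions, and the experiment only illustrates the statement.

```python
import cmath
import math
import random


def e(t: float) -> complex:
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * math.pi * t)


def bilinear_form(alpha, beta, xs, ys) -> complex:
    """Psi(alpha, beta) = sum_m sum_n alpha_m beta_n e(x_m y_n)."""
    return sum(a * b * e(x * y) for a, x in zip(alpha, xs) for b, y in zip(beta, ys))


def near_diagonal_sum(coeffs, points, scale: float) -> float:
    """Sum of |c_i c_j| over pairs (diagonal included) with |p_i - p_j| * scale < 1."""
    return sum(abs(c1 * c2)
               for c1, p1 in zip(coeffs, points)
               for c2, p2 in zip(coeffs, points)
               if abs(p1 - p2) * scale < 1)


def bound_7_15(alpha, beta, xs, ys) -> float:
    """Right side of (7.15) with X = max|x_m| and Y = max|y_n|."""
    X = max(abs(x) for x in xs)
    Y = max(abs(y) for y in ys)
    return (5 * math.sqrt(X * Y + 1)
            * math.sqrt(near_diagonal_sum(alpha, xs, Y))
            * math.sqrt(near_diagonal_sum(beta, ys, X)))


if __name__ == "__main__":
    random.seed(0)
    xs = [random.uniform(-20, 20) for _ in range(40)]
    ys = [random.uniform(-20, 20) for _ in range(40)]
    alpha = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in xs]
    beta = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in ys]
    print(abs(bilinear_form(alpha, beta, xs, ys)), bound_7_15(alpha, beta, xs, ys))
```

When the points are well-spaced the two near-diagonal sums reduce to $\|\alpha\|^2$ and $\|\beta\|^2$, and the bound has the shape $(XY+1)^{1/2}\|\alpha\|\|\beta\|$.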

If the points $x_m$ and $y_n$ are well-spaced, so that only the diagonal terms appear on the right side of (7.15), we get

COROLLARY 7.3. Suppose $|x_m| \le X$, $|y_n| \le Y$, and for $m_1 \ne m_2$, $n_1 \ne n_2$ we have $|x_{m_1} - x_{m_2}| \ge A$, $|y_{n_1} - y_{n_2}| \ge B$. Then

$\Big|\sum_m\sum_n \alpha_m\beta_n e(x_m y_n)\Big| \le 5(1 + XY)^{\frac12}\Big(1 + \frac{1}{AY}\Big)^{\frac12}\Big(1 + \frac{1}{BX}\Big)^{\frac12}\|\alpha\|\|\beta\|.$

PROOF. Apply (7.15) with $\max(X, B^{-1})$ and $\max(Y, A^{-1})$ in place of $X$ and $Y$ respectively. □

The spacing problem can be easily solved for points given by smooth functions. For example we derive by Corollary 7.3 the following

COROLLARY 7.4. Let $f(m)$, $g(n)$ be real functions on $[M, 2M]$, $[N, 2N]$ respectively, such that $f \ll F$, $g \ll G$ and $|f'| \gg FM^{-1}$, $|g'| \gg GN^{-1}$. Then for any complex numbers $\alpha_m$, $\beta_n$ we have

$\sum_m\sum_n \alpha_m\beta_n e(f(m)g(n)) \ll (FG)^{-\frac12}(FG + M)^{\frac12}(FG + N)^{\frac12}\|\alpha\|\|\beta\|.$

PROOF. We have $|f(m_1) - f(m_2)| \gg |m_1 - m_2|FM^{-1}$ and $|g(n_1) - g(n_2)| \gg |n_1 - n_2|GN^{-1}$ hence the result follows from Corollary 7.3 for $A = FM^{-1}$, $B = GN^{-1}$, $X = F$ and $Y = G$. □

Corollary 7.3 yields also interesting results for exponential sums with monomials, because the rational points are well-spaced.

LEMMA 7.5. Let $\alpha\beta \ne 0$, $\Delta > 0$, $K \ge 1$, $L \ge 1$. The number of integral quadruples $(k, k', \ell, \ell')$ with $K \le k, k' \le 2K$, $L \le \ell, \ell' \le 2L$ such that

(7.16)

is bounded by $O(KL(\Delta KL + \log 2KL))$ where the implied constant depends only on $\alpha$, $\beta$.

EXERCISE 1. Prove Lemma 7.5.

From Lemma 7.5 we derive

COROLLARY 7.6. Let $\alpha\beta\gamma\delta \ne 0$, $X > 0$, $K, L, M, N \ge 1$. Let $a_{k,\ell}$ and $b_{m,n}$ be complex numbers with $|a_{k,\ell}| \le 1$ and $|b_{m,n}| \le 1$ for $K \le k \le 2K$, $L \le \ell \le 2L$, $M \le m \le 2M$, $N \le n \le 2N$. Then

(7.17)

k"'f.l3m
£

m

n

where the implied constant depends only on $\alpha$, $\beta$, $\gamma$, $\delta$.

7.3. Introduction to the large sieve.

The large sieve type inequalities are ubiquitous in modern analytic number theory. The name is somewhat misleading however, as sieves do not enter the general picture: it was derived from Linnik's original idea (see [Li1]), which was subsequently transformed beyond recognition into general $L^2$-type estimates, from which sieve results can be derived in some cases.

First we explain in general terms the underlying philosophy, which is expected to have much larger applicability. Let $\mathcal{X}$ be a finite set of "harmonics", well-suited maybe to solve analytically some interesting equation. To each $x \in \mathcal{X}$, we assume that a sequence $(x(n))$ is associated, so to speak the "Fourier coefficients". The large sieve problem for $\mathcal{X}$ is to find $C = C(\mathcal{X}, N) \ge 0$ such that the following "large sieve inequality" holds for the $L^2$-mean of a linear form in the $x(n)$:

(7.18)  $\sum_{x \in \mathcal{X}} \Big|\sum_{n \le N} a_n x(n)\Big|^2 \le C(\mathcal{X}, N)\|a\|^2$

for any complex numbers $a_n$, where $\|a\|^2 = \sum |a_n|^2$. By Cauchy's inequality one gets (7.18) with $C(\mathcal{X}, N) = N|\mathcal{X}|$, but this is not useful in applications. The idea is that the "harmonics" are orthonormal, or nearly so, hence if one expands the square

$\sum_{x \in \mathcal{X}} \Big|\sum_{n \le N} a_n x(n)\Big|^2 = \sum_{n_1, n_2} a_{n_1}\bar a_{n_2} \sum_{x \in \mathcal{X}} x(n_1)\bar x(n_2),$

the diagonal terms corresponding to $n_1 = n_2$ should be dominating, contributing roughly $|\mathcal{X}|\|a\|^2$ because $x(n)$ should be of size 1 on average. In any case it is hard to imagine that $C(\mathcal{X}, N)$ would be much smaller than the number of elements in $\mathcal{X}$. Moreover, because of the duality principle for bilinear forms, which means that (7.18) is equivalent with

(7.19)  $\sum_{n \le N} \Big|\sum_{x \in \mathcal{X}} b_x x(n)\Big|^2 \le C(\mathcal{X}, N) \sum_{x \in \mathcal{X}} |b_x|^2$

with diagonal contribution $N\sum |b_x|^2$, it is not possible to make $C(\mathcal{X}, N)$ smaller than the length of the vector $(a_n)$. Therefore in view of these diagonal limitations the best estimate which one can hope for $C(\mathcal{X}, N)$ is roughly

(7.20)  $C(\mathcal{X}, N) \asymp |\mathcal{X}| + N.$
This holds indeed in a number of crucial cases. A sharp large sieve inequality (7.20) says that the linear forms

of length N ( lXI with Iani ( 1 are, on average over x E X, of size N112. This is best possible, and although in many cases it is expected that the individual sums are of this size, very few cases are known. Especially interesting is the case an= J.L(n)

7.


175

and x(n) = AJ(n) are coefficients of some £-function (see Chapter 5). Then the individual bound n(N

(for any E > 0) is equivalent with the Riemann Hypothesis for L(f, s) (see (5.52)). Hence, a sharp large sieve type inequality for a family of £-functions is as powerful on average as the Grand Riemann Hypothesis would give.

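7.4. Additive large sieve inequalities.

In the additive large sieve inequality the harmonics considered are additive characters of $\mathbb{Z}$ with $x(n) = e(\alpha n)$ for some $\alpha \in \mathbb{R}$ (of course only $\alpha$ mod 1 matters). So the linear forms to be estimated are just the trigonometric polynomials

(7.21)  $S(\alpha) = \sum_n a_n e(\alpha n)$

where the complex coefficients are supported in $M < n \le M + N$. Note that if the points $\alpha_r \in \mathbb{R}/\mathbb{Z}$ are $\delta$-spaced, that is

$\|\alpha_r - \alpha_s\| \ge \delta$  if $r \ne s$,

for some $\delta > 0$ (where $\|x\|$ is the distance to the nearest integer), then the number of distinct points $\alpha_r$, i.e. of harmonics, is $\le 1 + \delta^{-1}$.

THEOREM 7.7. For any set of $\delta$-spaced points $\alpha_r \in \mathbb{R}/\mathbb{Z}$ and any complex numbers $a_n$ with $M < n \le M + N$, where $N$ is a positive integer, we have

(7.22)  $\sum_r \Big|\sum_{M < n \le M+N} a_n e(\alpha_r n)\Big|^2 \le (\delta^{-1} + N - 1)\|a\|^2.$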
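This estimate is best possible; it was proved in the form stated independently by Selberg [S3] and Montgomery and Vaughan [MV2], and is of the strength of (7.20). We give the proof of Montgomery and Vaughan which is based on the following generalization of the Hilbert inequality.

LEMMA 7.8. Suppose $\lambda_r$ are distinct real numbers with $|\lambda_r - \lambda_s| \ge \delta$ if $r \ne s$. Then for any complex numbers $z_r$ we have

(7.23)  $\Big|\mathop{\sum\sum}_{r \ne s} \frac{z_r \bar z_s}{\lambda_r - \lambda_s}\Big| \le \frac{\pi}{\delta} \sum_r |z_r|^2.$

PROOF. By Cauchy's inequality, it suffices to show that

(7.24)  $\sum_r \Big|\sum_{s \ne r} \frac{z_s}{\lambda_r - \lambda_s}\Big|^2 \le \frac{\pi^2}{\delta^2} \sum_s |z_s|^2.$

Squaring out we arrange the left side as follows

$L = \sum_{s,t} z_s \bar z_t \sum_{r \ne s,t} \frac{1}{(\lambda_r - \lambda_s)(\lambda_r - \lambda_t)}$

$\phantom{L} = \sum_s |z_s|^2 \sum_{r \ne s} \frac{1}{(\lambda_r - \lambda_s)^2} + \sum_{s \ne t} \frac{z_s \bar z_t}{\lambda_s - \lambda_t} \sum_{r \ne s,t} \Big(\frac{1}{\lambda_r - \lambda_s} - \frac{1}{\lambda_r - \lambda_t}\Big).$
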
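For the last sum we have

$\sum_{r \ne s,t} \Big(\frac{1}{\lambda_r - \lambda_s} - \frac{1}{\lambda_r - \lambda_t}\Big) = \sum_{r \ne s} \frac{1}{\lambda_r - \lambda_s} - \sum_{r \ne t} \frac{1}{\lambda_r - \lambda_t} + \frac{2}{\lambda_s - \lambda_t},$

hence

$L = \sum_s |z_s|^2 \sum_{r \ne s} \frac{1}{(\lambda_r - \lambda_s)^2} + 2\sum_{s \ne t} \frac{z_s \bar z_t}{(\lambda_s - \lambda_t)^2} + \sum_{s \ne t} \frac{z_s \bar z_t}{\lambda_s - \lambda_t} \Big\{\sum_{r \ne s} \frac{1}{\lambda_r - \lambda_s} - \sum_{r \ne t} \frac{1}{\lambda_r - \lambda_t}\Big\}.$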

Note now that the estimate (7.24) amounts to finding the norm of the matrix $(\mu_{r,s})$ with $\mu_{r,s} = (\lambda_r - \lambda_s)^{-1}$ if $r \ne s$ and $\mu_{r,r} = 0$, thus we may assume that the vector $(z_r)$ is extremal. Since the matrix is skew-hermitian, the extremal vector is an eigenvector, i.e.

$\sum_{r \ne s} \frac{z_r}{\lambda_r - \lambda_s} = \nu z_s$

for some purely imaginary $\nu \in \mathbb{C}$. This shows that the last two sums in $L$ cancel out. Therefore for the extremal vector we have

$L = \sum_s |z_s|^2 \sum_{r \ne s} \frac{1}{(\lambda_r - \lambda_s)^2} + 2\sum_{s \ne t} \frac{z_s \bar z_t}{(\lambda_s - \lambda_t)^2}.$

Since we have $|\lambda_r - \lambda_s| \ge \delta|r - s|$ the innermost sum is bounded by $2\delta^{-2}\zeta(2) = \pi^2/3\delta^2$. This gives (7.24) completing the proof of (7.23). □
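The generalized Hilbert inequality of Montgomery and Vaughan bounds $\big|\sum\sum_{r \ne s} z_r \bar z_s/(\lambda_r - \lambda_s)\big|$ by $\pi\delta^{-1}\sum_r |z_r|^2$ when the $\lambda_r$ are $\delta$-spaced. A direct numerical experiment is easy to set up. In the Python sketch below the function names and the way the sample points are generated are our own assumptions.

```python
import math
import random


def hilbert_form(z: list[complex], lam: list[float]) -> complex:
    """Sum over r != s of z_r * conj(z_s) / (lam_r - lam_s)."""
    total = 0j
    for r in range(len(z)):
        for s in range(len(z)):
            if r != s:
                total += z[r] * z[s].conjugate() / (lam[r] - lam[s])
    return total


def min_gap(lam: list[float]) -> float:
    """Smallest distance delta between two distinct points lam_r."""
    ordered = sorted(lam)
    return min(b - a for a, b in zip(ordered, ordered[1:]))


def hilbert_bound(z: list[complex], lam: list[float]) -> float:
    """Right side (pi / delta) * sum |z_r|^2 with delta the minimal gap."""
    return math.pi / min_gap(lam) * sum(abs(w) ** 2 for w in z)


if __name__ == "__main__":
    random.seed(0)
    # Sample points with gaps between 0.5 and 1.5 (an arbitrary illustrative choice).
    lam, current = [], 0.0
    for _ in range(60):
        current += random.uniform(0.5, 1.5)
        lam.append(current)
    z = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in lam]
    print(abs(hilbert_form(z, lam)), hilbert_bound(z, lam))
```

Because the matrix $(\lambda_r - \lambda_s)^{-1}$ is skew-hermitian, the computed form is purely imaginary up to rounding, in line with the eigenvector argument of the proof.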

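COROLLARY 7.9. For any set of $\delta$-spaced points $\alpha_r \in \mathbb{R}/\mathbb{Z}$ and any complex numbers $z_r$ we have

(7.25)  $\Big|\mathop{\sum\sum}_{r \ne s} \frac{z_r \bar z_s}{\sin \pi(\alpha_r - \alpha_s)}\Big| \le \delta^{-1} \sum_r |z_r|^2.$

PROOF. We apply (7.23) to the doubly-indexed set of numbers $z_{m,r} = (-1)^m z_r$ and $\lambda_{m,r} = m + \alpha_r$ with $1 \le m \le K$, getting

$\Big|\mathop{\sum\sum}_{(r,m) \ne (s,n)} \frac{(-1)^{m-n} z_r \bar z_s}{m - n + \alpha_r - \alpha_s}\Big| \le \pi\delta^{-1}K \sum_r |z_r|^2.$

Here we can replace the summation condition $(r, m) \ne (s, n)$ by $r \ne s$ because for $r = s$ the remaining terms cancel out pairwise ($(m, n)$ against $(n, m)$). If we put $k = m - n$ and divide by $K$ we derive

$\Big|\mathop{\sum\sum}_{r \ne s} z_r \bar z_s \sum_{|k| < K} \Big(1 - \frac{|k|}{K}\Big) \frac{(-1)^k}{k + \alpha_r - \alpha_s}\Big| \le \pi\delta^{-1} \sum_r |z_r|^2.$

This yields (7.25) by letting $K \to +\infty$ because for $\alpha \notin \mathbb{Z}$

$\sum_{k \in \mathbb{Z}} \frac{(-1)^k}{k + \alpha} = \frac{\pi}{\sin \pi\alpha}$

where the series is summed symmetrically. □

COROLLARY 7.10. For any real $x$ we have

(7.26)  $\Big|\mathop{\sum\sum}_{r \ne s} z_r \bar z_s \frac{\sin 2\pi x(\alpha_r - \alpha_s)}{\sin \pi(\alpha_r - \alpha_s)}\Big| \le \delta^{-1} \sum_r |z_r|^2.$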

PROOF. This follows by applying Corollary 7.9 twice with $z_r$ twisted by $e(x\alpha_r)$ and $e(-x\alpha_r)$. □

Now we are ready to prove (7.22), first with $\delta^{-1} + N$ instead of $\delta^{-1} + N - 1$. By duality, it is equivalent to show that

(7.27)  $\sum_{M < n \le M+N} \Big|\sum_r z_r e(n\alpha_r)\Big|^2 \le (N + \delta^{-1})\|z\|^2$

for any complex numbers $z_r$. Squaring out, we see that the diagonal term $r = s$ contributes $N\|z\|^2$, while the off-diagonal terms yield

$\mathop{\sum\sum}_{r \ne s} z_r \bar z_s\, e\big(K(\alpha_r - \alpha_s)\big) \frac{\sin \pi N(\alpha_r - \alpha_s)}{\sin \pi(\alpha_r - \alpha_s)}$

with $K = M + \frac12(N + 1)$. This quantity is smaller than the left side of (7.26), hence is $\le \delta^{-1}\|z\|^2$. Adding both contributions gives (7.27).

To save the $-1$, we apply Theorem 7.7 (in the form just proved) to the points $(\alpha_r + k)K^{-1}$ with $1 \le k \le K$, for the trigonometric polynomial $T(\alpha) = S(\alpha K)$. We obtain

$K \sum_r |S(\alpha_r)|^2 = \sum_k \sum_r \Big|T\Big(\frac{\alpha_r + k}{K}\Big)\Big|^2 \le (\delta^{-1}K + NK - K + 1)\|a\|^2$

because the points $(\alpha_r + k)K^{-1}$ are spaced by $\delta K^{-1}$, and the sum $T(\alpha)$ ranges over $m = nK$ with $(MK + K - 1) < m \le (MK + K - 1) + (NK - K + 1)$. Hence dividing by $K$ and letting $K$ tend to infinity we obtain (7.22) as stated (this trick is due to Paul Cohen).

For many applications, a weaker estimate with the factor $\delta^{-1} + N - 1$ replaced by $C(\delta^{-1} + N)$ with an absolute constant $C$ is sufficient. This kind of large sieve inequality was first stated by H. Davenport and H. Halberstam [DH1]. We are going to give a quick proof of one such estimate. By duality we need to show that

$\sum_{|n| \le N} \Big|\sum_r \gamma_r e(\alpha_r n)\Big|^2 \le C(\delta^{-1} + N)\|\gamma\|^2$

for any complex numbers $\gamma_r$. Let $f(x)$ be a non-negative function with $f(x) \ge 1$ if $|x| \le 1$. The left side above is estimated by the same expression with extra smoothing factor $f(n/N)$, and the latter equals

$\sum_r \sum_s \gamma_r \bar\gamma_s\, \sigma(r, s)$

where

$\sigma(r, s) = \sum_n f\Big(\frac{n}{N}\Big) e((\alpha_r - \alpha_s)n).$

By Poisson's formula

$\sigma(r, s) = N \sum_h \hat f((\alpha_r - \alpha_s + h)N).$

We choose $f(x)$ such that its Fourier transform satisfies $\hat f(y) \ll (1 + y^2)^{-1}$ giving $\sigma(r, s) \ll N(1 + \|\alpha_r - \alpha_s\|^2 N^2)^{-1}$. Finally applying $2|\gamma_r \gamma_s| \le |\gamma_r|^2 + |\gamma_s|^2$ and

$\sum_s |\sigma(r, s)| \ll N \sum_{k=0}^{\infty} (1 + (\delta k N)^2)^{-1} \ll \delta^{-1} + N$

we complete the proof. The above arguments would yield the best possible constant $C = 1$ if the test function $f(x)$ was chosen appropriately. Selberg did just that, his optimal function being quite complicated.

Now specialize the large sieve inequality (7.22) by taking for the points $\alpha_r$ the rationals $a/q$ with $1 \le q \le Q$ and $(a, q) = 1$. These points are spaced by $\delta = Q^{-2}$, indeed if $a/q \ne a'/q'$, then

$\|a/q - a'/q'\| = \|(aq' - a'q)/qq'\| \ge (qq')^{-1} \ge Q^{-2}.$

Therefore Theorem 7.7 yields

THEOREM 7.11. For any complex numbers $a_n$ with $M < n \le M + N$, where $N$ is a positive integer, we have

(7.28)  $\sum_{q \le Q}\ \sum^{*}_{a \pmod q} \Big|\sum_{M < n \le M+N} a_n e\Big(\frac{an}{q}\Big)\Big|^2 \le (Q^2 + N - 1)\|a\|^2.$
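The large sieve inequality over the reduced fractions $a/q$ with $q \le Q$, with constant $Q^2 + N - 1$, lends itself to a direct numerical check. The Python sketch below computes the left side for a given coefficient vector and compares it with $(Q^2 + N - 1)\|a\|^2$. The function names and the sample parameters are our own choices for illustration.

```python
import cmath
import math
import random


def large_sieve_left_side(coeffs: list[complex], M: int, Q: int) -> float:
    """Sum over q <= Q and reduced b mod q of |sum_{M < n <= M+N} a_n e(bn/q)|^2.

    coeffs[i] plays the role of a_n with n = M + 1 + i, so N = len(coeffs).
    """
    total = 0.0
    for q in range(1, Q + 1):
        for b in range(1, q + 1):
            if math.gcd(b, q) != 1:
                continue
            # Reduce b*n modulo q before dividing, to keep the phase accurate.
            s = sum(c * cmath.exp(2j * math.pi * ((b * (M + 1 + i)) % q) / q)
                    for i, c in enumerate(coeffs))
            total += abs(s) ** 2
    return total


def large_sieve_right_side(coeffs: list[complex], Q: int) -> float:
    """(Q^2 + N - 1) * ||a||^2."""
    N = len(coeffs)
    return (Q * Q + N - 1) * sum(abs(c) ** 2 for c in coeffs)


if __name__ == "__main__":
    random.seed(0)
    N, M, Q = 60, 1000, 10
    coeffs = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]
    print(large_sieve_left_side(coeffs, M, Q), large_sieve_right_side(coeffs, Q))
```

For random coefficients the left side is typically close to the number of fractions times $\|a\|^2$, about $(3/\pi^2)Q^2\|a\|^2$, which reflects the diagonal contribution discussed in Section 7.3.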
Notice that if $\mathcal{A} = (a_n)$ is supported on an arithmetic progression $n \equiv \ell \pmod k$, and $(k, q) = 1$, we can change variables to derive

COROLLARY 7.12. For any complex numbers $a_n$ with $M < n \le M + N$ we have

(7.29)  $\sum_{\substack{q \le Q \\ (k,q) = 1}}\ \sum^{*}_{a \pmod q} \Big|\sum_{n \equiv \ell \pmod k} a_n e\Big(\frac{an}{q}\Big)\Big|^2 \le (Q^2 + k^{-1}N) \sum_{n \equiv \ell \pmod k} |a_n|^2.$

EXERCISE 2. Prove the following large sieve inequality in several variables: let $d \ge 1$ and $\delta > 0$ and let $\alpha_r = (\alpha_{r,1}, \dots, \alpha_{r,d})$ be $\delta$-spaced points in $\mathbb{R}^d/\mathbb{Z}^d$, i.e. if $r \ne s$, we have $\max_i \|\alpha_{r,i} - \alpha_{s,i}\| \ge \delta$. Then

(7.30)  $\sum_r \Big|\sum_n a_n e(n \cdot \alpha_r)\Big|^2 \ll (\delta^{-d} + N^d)\|a\|^2$

where $x \cdot y$ is the standard scalar product in $\mathbb{R}^d$, and $a_n$ are arbitrary complex numbers for $n = (n_1, \dots, n_d)$ with $1 \le n_i \le N$, the implied constant depending only on the dimension $d$.

7.5. Multiplicative large sieve inequality.

Instead of additive characters, we now take as harmonics the Dirichlet characters. If one considers $\mathcal{X} = \{\chi \mid \chi \text{ is a character modulo } q\}$, then by expanding the square and applying the orthogonality relation one finds immediately

$\sum_{\chi \pmod q} \Big|\sum_{n \le N} a_n \chi(n)\Big|^2 \le (q + N)\|a\|^2$

which is comparable to (7.20). Much more significant, however, is the case where the harmonics are all primitive Dirichlet characters to modulus $q \le Q$. Here we have the basic result of Bombieri and Davenport [BD] (their original result is slightly weaker):

THEOREM 7.13. For any complex numbers $a_n$ with $M < n \le M + N$, where $N$ is a positive integer, we have

(7.31)  $\sum_{q \le Q} \frac{q}{\varphi(q)}\ \sum^{*}_{\chi \pmod q} \Big|\sum_{M < n \le M+N} a_n \chi(n)\Big|^2 \le (Q^2 + N - 1)\|a\|^2.$

This is again as strong as (7.20). Note that without the restriction to primitive $\chi$, this inequality would be false by a large factor (take $a_n = 1$ for all $n$, then the trivial characters modulo $q \le Q$ contribute already about $N^2 Q$). An interesting feature of (7.31) is that the bound depends only on the length of the interval, but not on its position. In this aspect the large sieve inequality for Dirichlet characters has some advantage over the GRH.

PROOF. We can in fact give a slightly more precise estimate, which is sometimes useful in applications. The primitive multiplicative characters $\chi \pmod q$ can be expanded into additive characters by means of Gauss sums (see (3.12))

$\tau(\bar\chi)\chi(n) = \sum_{a \pmod q} \bar\chi(a) e\Big(\frac{an}{q}\Big)$

for any $n$. If $\chi$ is not primitive but induced by a primitive $\chi_s$ modulo $s$ with $q = rs$, then if $(r, s) = 1$, we have

$\sum_{a \pmod q} \bar\chi(a) e\Big(\frac{an}{q}\Big) = \sum^{*}_{b \pmod r}\ \sum^{*}_{c \pmod s} \bar\chi(bs + cr)\, e\Big(\frac{bn}{r} + \frac{cn}{s}\Big) = \bar\chi_s(r)\tau(\bar\chi_s)\chi_s(n)c_r(n)$

where $c_r(n)$ is a Ramanujan sum (3.1). We derive

$\sum_n a_n \chi_s(n) c_r(n) = \frac{\chi_s(r)}{\tau(\bar\chi_s)} \sum_{a \pmod q} \bar\chi(a) S(a/q)$

for any complex numbers $a_n$, where $S(\alpha)$ is the trigonometric polynomial (7.21).

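By orthogonality of characters we then have

$\sum_{\substack{rs \le Q \\ (r,s) = 1}} \frac{s}{\varphi(rs)}\ \sum^{*}_{\chi \pmod s} \Big|\sum_n a_n \chi(n) c_r(n)\Big|^2 \le \sum_{q \le Q}\ \sum^{*}_{a \pmod q} \Big|S\Big(\frac{a}{q}\Big)\Big|^2.$

Applying Theorem 7.11, we conclude that

(7.32)  $\sum_{\substack{rs \le Q \\ (r,s) = 1}} \frac{s}{\varphi(rs)}\ \sum^{*}_{\chi \pmod s} \Big|\sum_{M < n \le M+N} a_n \chi(n) c_r(n)\Big|^2 \le (Q^2 + N - 1)\|a\|^2.$

This inequality contains (7.31) by positivity (ignore all terms with $r \ne 1$). □

REMARK. Bombieri and Davenport did succeed in getting interesting applications of (7.32) by exploiting the extra summation over $r$.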
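The first step of the proof, the expansion of a primitive character into additive characters by a Gauss sum, can be verified by machine for a prime modulus, where every non-trivial character is primitive. The Python sketch below builds the characters modulo an odd prime $q$ from a primitive root and checks both $\tau(\bar\chi)\chi(n) = \sum_a \bar\chi(a)e(an/q)$ and $|\tau(\chi)|^2 = q$. The construction through a brute-force primitive root and all function names are our own assumptions, chosen for brevity rather than efficiency.

```python
import cmath
import math


def e(t: float) -> complex:
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * math.pi * t)


def primitive_root(q: int) -> int:
    """Smallest primitive root modulo an odd prime q, by brute force."""
    for g in range(2, q):
        if len({pow(g, k, q) for k in range(q - 1)}) == q - 1:
            return g
    raise ValueError("q must be an odd prime")


def characters_mod_prime(q: int) -> list[list[complex]]:
    """All q - 1 Dirichlet characters mod q as value tables chi[a], 0 <= a < q.

    Index 0 is the trivial character; the others are primitive.
    """
    g = primitive_root(q)
    index = {pow(g, k, q): k for k in range(q - 1)}  # discrete logarithm table
    table = []
    for j in range(q - 1):
        chi = [0j] * q
        for a in range(1, q):
            chi[a] = e(((j * index[a]) % (q - 1)) / (q - 1))
        table.append(chi)
    return table


def gauss_sum(chi: list[complex], q: int) -> complex:
    """tau(chi) = sum over a mod q of chi(a) e(a/q)."""
    return sum(chi[a] * e(a / q) for a in range(q))


def expansion_error(chi: list[complex], q: int) -> float:
    """Largest deviation in tau(conj chi) chi(n) = sum_a conj(chi(a)) e(an/q) over n mod q."""
    conj = [v.conjugate() for v in chi]
    tau = gauss_sum(conj, q)
    worst = 0.0
    for n in range(q):
        additive = sum(conj[a] * e(((a * n) % q) / q) for a in range(q))
        worst = max(worst, abs(tau * chi[n] - additive))
    return worst


if __name__ == "__main__":
    q = 11
    for chi in characters_mod_prime(q)[1:]:
        print(round(abs(gauss_sum(chi, q)) ** 2, 9), expansion_error(chi, q))
```

For composite moduli the imprimitive characters behave differently, which is exactly where the Ramanujan sums $c_r(n)$ enter the argument above.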

7.4. Applications of the large-sieve to sieving problems.

In this section we explain how the large sieve inequality for additive characters implies a sieve result in the "ordinary" sense (see also Chapter 6 for background on sieves), and give Linnik's first application of such a result for estimation of the least quadratic non-residue.

Consider a finite set $\mathcal{M}$ of integers and a finite set $\mathcal{P}$ of prime numbers. For each $p \in \mathcal{P}$, let $\Omega_p \subset \mathbb{Z}/p\mathbb{Z}$ be a set of residue classes "to sieve out". The data $(\mathcal{M}, \mathcal{P}, \Omega)$ defines a sieving problem; the corresponding sifted set is

(7.33)  $S(\mathcal{M}, \mathcal{P}, \Omega) = \{m \in \mathcal{M};\ m \pmod p \notin \Omega_p \text{ for all } p \in \mathcal{P}\}.$

The goal is to estimate $S = |S(\mathcal{M}, \mathcal{P}, \Omega)|$, the cardinality of $S(\mathcal{M}, \mathcal{P}, \Omega)$. More generally we consider, for an arbitrary sequence of complex numbers $a = (a_n)$, the sum

(7.34)  $Z = \sum_{n \in S(\mathcal{M}, \mathcal{P}, \Omega)} a_n$

and we seek a bound for $Z$ in terms of the $L^2$-norm of $a$. Naturally in this context the name "large sieve" is appropriate, because in contrast with other methods (like combinatorial sieves, see Chapter 6), we are looking for strong estimates even if $|\Omega_p|$ is rather large when $p$ is large. We derive now from Theorem 7.11 a result somewhat stronger than Linnik's original version (see [Rot], [Mo1], [Hu1], [Bo2]).

THEOREM 7.14. Suppose $\mathcal{M}$ is contained in an interval of length $N \ge 1$, and assume $\Omega_p \ne \mathbb{Z}/p\mathbb{Z}$ for every $p \in \mathcal{P}$, i.e., $\omega(p) = |\Omega_p| < p$. Then we have

(7.35)  $|Z|^2 \le \frac{N + Q^2}{H}\|a\|^2$

for any $Q \ge 1$, where

(7.36)  $H = \sum^{\flat}_{q \le Q} h(q)$

and $h(q)$ is the multiplicative function supported on squarefree integers with prime divisors in $\mathcal{P}$ such that

(7.37)  $h(p) = \frac{\omega(p)}{p - \omega(p)}.$

In particular, taking for $a_n$ the characteristic function of $S(\mathcal{M}, \mathcal{P}, \Omega)$, we have

(7.38)  $S \le \frac{N + Q^2}{H}.$

Let $S(\alpha)$ be the trigonometric polynomial (7.21), so $S(0) = Z$. First we establish the following inequality for individual moduli.

LEMMA 7.15. For any positive squarefree number $q$ we have

(7.39)  $h(q)|S(0)|^2 \le \sum^{*}_{a \pmod q} \Big|S\Big(\frac{a}{q}\Big)\Big|^2.$

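PROOF. Let $X(q, \nu)$, for $\nu \in \mathbb{Z}/q\mathbb{Z}$, denote

$X(q, \nu) = \sum_{n \equiv \nu \pmod q} a_n.$

Using additive characters we have

$X(q, \nu) = \frac{1}{q} \sum_{a \pmod q} e\Big(-\frac{a\nu}{q}\Big) S\Big(\frac{a}{q}\Big),$

so by orthogonality (Plancherel formula) it follows that

$q \sum_{\nu \pmod q} |X(q, \nu)|^2 = \sum_{a \pmod q} \Big|S\Big(\frac{a}{q}\Big)\Big|^2.$

If $q = p$ is prime, we have $X(p, \nu) = 0$ for $\nu \in \Omega_p$, so by Cauchy's inequality we get

$|Z|^2 = |S(0)|^2 = \Big|\sum_{\nu \pmod p} X(p, \nu)\Big|^2 \le (p - \omega(p)) \sum_\nu |X(p, \nu)|^2 = \Big(1 - \frac{\omega(p)}{p}\Big) \sum_{a \pmod p} \Big|S\Big(\frac{a}{p}\Big)\Big|^2$

which yields (7.39) by subtracting the term for $a \equiv 0 \pmod p$. Now if $q = q_1 q_2$ with $(q_1, q_2) = 1$, we have

$\sum^{*}_{a \pmod q} \Big|S\Big(\frac{a}{q}\Big)\Big|^2 = \sum^{*}_{a_1 \pmod{q_1}}\ \sum^{*}_{a_2 \pmod{q_2}} \Big|S\Big(\frac{a_1}{q_1} + \frac{a_2}{q_2}\Big)\Big|^2.$

Hence if (7.39) holds for $q_1$ and $q_2$, we derive by successive applications that

$\sum^{*}_{a \pmod q} \Big|S\Big(\frac{a}{q}\Big)\Big|^2 \ge h(q_2) \sum^{*}_{a_1 \pmod{q_1}} \Big|S\Big(\frac{a_1}{q_1}\Big)\Big|^2 \ge h(q_1)h(q_2)|S(0)|^2,$

so the proof of (7.39) is finished by induction on the number of prime factors of $q$. □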
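The inequality of the lemma, $h(q)|S(0)|^2 \le \sum^{*}_{a \pmod q} |S(a/q)|^2$ with $h(p) = \omega(p)/(p - \omega(p))$, and the resulting sieve bound $S \le (N + Q^2)/H$ can both be tested on a small example. In the Python sketch below we sieve $\{1, \dots, N\}$ by the quadratic non-residues modulo $3, 5, 7$ and take $a_n = 1$ on the sifted set; this choice of $\Omega_p$, the function names and the parameters are illustrative assumptions only.

```python
import cmath
import math
from itertools import combinations


def nonresidues(p: int) -> set[int]:
    """Quadratic non-residues modulo an odd prime p (our sample choice of Omega_p)."""
    squares = {(x * x) % p for x in range(1, p)}
    return {v for v in range(1, p) if v not in squares}


def sifted_set(N: int, omega: dict[int, set[int]]) -> list[int]:
    """Integers 1 <= n <= N with n mod p outside Omega_p for every sieving prime p."""
    return [n for n in range(1, N + 1)
            if all(n % p not in bad for p, bad in omega.items())]


def S(support: list[int], a: int, q: int) -> complex:
    """S(a/q) = sum over n in support of e(an/q), i.e. a_n = 1 on the sifted set."""
    return sum(cmath.exp(2j * math.pi * ((a * n) % q) / q) for n in support)


def h(primes: tuple[int, ...], omega: dict[int, set[int]]) -> float:
    """h(q) for the squarefree q with the given prime factors: product of w/(p - w)."""
    value = 1.0
    for p in primes:
        w = len(omega[p])
        value *= w / (p - w)
    return value


def squarefree_moduli(omega: dict[int, set[int]]):
    """All squarefree q built from the sieving primes, as pairs (q, prime factors)."""
    plist = sorted(omega)
    for k in range(len(plist) + 1):
        for subset in combinations(plist, k):
            yield math.prod(subset), subset


def lemma_sides(N: int, omega: dict[int, set[int]]) -> list[tuple[int, float, float]]:
    """For each q: (q, h(q)|S(0)|^2, sum over reduced a mod q of |S(a/q)|^2)."""
    support = sifted_set(N, omega)
    z2 = float(len(support)) ** 2
    rows = []
    for q, subset in squarefree_moduli(omega):
        rhs = sum(abs(S(support, a, q)) ** 2
                  for a in range(1, q + 1) if math.gcd(a, q) == 1)
        rows.append((q, h(subset, omega) * z2, rhs))
    return rows


def sieve_bound(N: int, Q: int, omega: dict[int, set[int]]) -> float:
    """(N + Q^2)/H with H the sum of h(q) over squarefree q <= Q."""
    H = sum(h(subset, omega) for q, subset in squarefree_moduli(omega) if q <= Q)
    return (N + Q * Q) / H


if __name__ == "__main__":
    omega = {p: nonresidues(p) for p in (3, 5, 7)}
    for row in lemma_sides(200, omega):
        print(row)
    print(len(sifted_set(200, omega)), sieve_bound(200, 15, omega))
```

With only three sieving primes the bound is far from sharp; the strength of the method shows when many primes each remove a positive proportion of classes.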

PROOF OF THEOREM 7.14. Sum (7.39) over $q \le Q$ and then apply Theorem 7.11. □

Linnik [Li1] established an inequality somewhat weaker than (7.38), yet powerful enough to give a spectacular application to the problem of the least quadratic non-residue modulo a prime $p$, i.e., the smallest positive integer $q(p)$ such that $(q(p)/p) = -1$. Note that $q(p)$ is prime. It is conjectured that $q(p) \ll_\varepsilon p^\varepsilon$ whereas the best known estimate is

$q(p) \ll_\varepsilon p^{\frac{1}{4\sqrt{e}} + \varepsilon}.$

From the Grand Riemann Hypothesis for Dirichlet $L$-functions one derives that $q(p) \ll (\log p)^2$ with an absolute implied constant.

THEOREM 7.16 (LINNIK). Let $\varepsilon > 0$. The number of primes $p \le N$ such that $q(p) > N^\varepsilon$ is bounded by a constant depending only on $\varepsilon$.

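PROOF. Let $X_\varepsilon$ be the number of primes $p \le \sqrt{N}$ with $q(p) > N^\varepsilon$, which we want to show is bounded in terms of $\varepsilon$. To this end consider the sieving problem with

$\mathcal{M} = \{1, 2, \dots, N\},$

$\mathcal{P} = \{p \le \sqrt{N} \mid (\tfrac{n}{p}) = 1 \text{ for all } n \le N^\varepsilon\},$

$\Omega_p = \{\nu \pmod p \mid (\tfrac{\nu}{p}) = -1\}.$

Note that $\omega(p) = \frac12(p - 1)$ and $h(p) = (p - 1)/(p + 1)$. This is indeed a "large sieve" because $h(p) \ge \frac13$. The sifted set contains the set, say $\mathcal{Z}_\varepsilon$, of all $n \le N$ which have no prime divisors larger than $N^\varepsilon$. Therefore by (7.38) with $Q = \sqrt{N}$ we get $Z_\varepsilon = |\mathcal{Z}_\varepsilon| \le 2NH^{-1}$. Combining this inequality with

$\frac13 X_\varepsilon \le \sum_{\substack{p \le \sqrt{N} \\ q(p) > N^\varepsilon}} h(p) \le H$

we get $X_\varepsilon Z_\varepsilon \le 6N$. It remains to estimate $Z_\varepsilon$. It is known that $Z_\varepsilon \sim \delta(\varepsilon)N$ for some $\delta(\varepsilon) > 0$ as $N \to \infty$ (see e.g. [Bo2], pp. 8-9), so the result follows. Alternatively, a sufficient lower bound for $Z_\varepsilon$ follows by counting in the set $\mathcal{Z}_\varepsilon$ numbers of special type $n = mp_1 \cdots p_k \le N$ with $N^{\varepsilon - \varepsilon^2} < p_j < N^\varepsilon$ for $1 \le j \le k = \varepsilon^{-1}$. We get

$Z_\varepsilon \ge \sum_{p_1, \dots, p_k} \Big[\frac{N}{p_1 \cdots p_k}\Big] \gg N.$

Hence $X_\varepsilon \ll 1$. Finally changing $N$ to $N^2$ and $\varepsilon$ to $\varepsilon/2$ we conclude Linnik's theorem exactly as stated. □

See Proposition 7.30 for an analogue for elliptic curves using the large sieve type inequality for symmetric square $L$-functions.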
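The least quadratic non-residue $q(p)$ is simple to compute for small primes, and a table gives some feeling for how rarely it is large. The Python sketch below uses Euler's criterion $(n/p) \equiv n^{(p-1)/2} \pmod p$; the function names and the range of primes examined are our own choices, and the comparison with $(\log p)^2$ is only a numerical illustration of the bound expected under the Grand Riemann Hypothesis.

```python
import math


def is_prime(n: int) -> bool:
    """Primality by trial division (adequate for small n)."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def least_quadratic_nonresidue(p: int) -> int:
    """Smallest n >= 2 with Legendre symbol (n/p) = -1, for an odd prime p."""
    for n in range(2, p):
        # Euler's criterion: n is a non-residue iff n^((p-1)/2) = -1 (mod p).
        if pow(n, (p - 1) // 2, p) == p - 1:
            return n
    raise ValueError("p must be an odd prime")


def record_holders(limit: int) -> list[tuple[int, int]]:
    """Primes p <= limit at which q(p) exceeds its value at all smaller primes."""
    records, best = [], 0
    for p in range(3, limit + 1):
        if is_prime(p):
            value = least_quadratic_nonresidue(p)
            if value > best:
                best = value
                records.append((p, value))
    return records


if __name__ == "__main__":
    for p, value in record_holders(20_000):
        print(p, value, round(math.log(p) ** 2, 1))
```

In every row the value $q(p)$ is itself a prime, as noted in the text, since the Legendre symbol is multiplicative.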


EXERCISE 3. (1) Consider the following sieve in dimension $d$: for any prime $p \in \mathcal{P}$, let $\Omega(p) \subset (\mathbb{Z}/p\mathbb{Z})^d$ be a subset with $\omega(p) = |\Omega(p)|$, and let

$S(\mathcal{M}, \mathcal{P}, \Omega) = \{m = (m_1, \dots, m_d) \in \mathcal{M} \mid \text{for } p \in \mathcal{P},\ m \pmod p \notin \Omega_p\}.$

Show that

$S = |S(\mathcal{M}, \mathcal{P}, \Omega)| \ll (N^d + Q^{2d})H^{-1}$

where $H$ is given by (7.36) and the implied constant depends on $d$ [Hint: Use (7.30)].

(2) Fix $n \ge 1$. Let $D_N$ be the set of monic polynomials $f \in \mathbb{Z}[X]$ of degree $n$ with coefficients in absolute value $\le N$. For $r = (r_1, \dots, r_k)$ with $r_1 + 2r_2 + \cdots = n$ let $C_r \subset D_N$ be the subset of polynomials $f$ for which $f \pmod p$ does not factor as a product of $r_1$ linear factors, $r_2$ quadratic factors, ..., for any prime $p$. Show that

$|C_r| \ll N^{n - \frac12}\log N,$

the implied constant depending only on $n$.

(3) Deduce that the set $C \subset D_N$ of polynomials $f$ for which the Galois group of the splitting field of $f$ is not $S_n$ satisfies

$|C| \ll N^{n - \frac12}\log N$

the implied constant depending only on $n$. [Hint: Use the fact that if $f$ has Galois group $H$ which is a proper subgroup of $S_n$, the union of the conjugates of $H$ is distinct from $S_n$, hence there is a "splitting type" $r$ as in (2) such that $f \in C_r$.] This is due to Gallagher [Ga4].

Another interesting application of two-dimensional large sieve is the proof by Duke [Du5] that almost all elliptic curves $E/\mathbb{Q}$ have their $p$-torsion field with Galois group $GL(2, \mathbb{Z}/p\mathbb{Z})$ for all primes $p$.

7.6. Panorama of the large sieve inequalities.

There is a great variety of estimates of the large sieve type which are used these days in analytic number theory. In this section we selected a few important ones (without proofs) to make an impression about the scope of the matter.

Some applications require large sieve inequalities in which the set of harmonics consists of characters twisted by exponentials of different kind. The first case of such a hybrid large sieve is due to P. X. Gallagher [Ga1].

THEOREM 7.17. Let $N, Q, T \ge 1$. For any complex numbers $a_n$ we have

$\sum_{q \le Q}\ \sum^{*}_{\chi \pmod q} \int_{-T}^{T} \Big|\sum_{n \le N} a_n \chi(n) n^{it}\Big|^2 dt \ll (Q^2 T + N)\|a\|^2.$

A discrete version of this result, among further generalizations, can be found in [Mo2]. Of course, twisting the multiplicative characters by additive ones does not produce stronger results, but if the exponential component is not linear, then it is likely to be orthogonal to the characters and one can receive extra saving (in Theorem 7.17 the exponential component is $e(\frac{t}{2\pi}\log n)$). Here is another example of some importance for applications (see [Dui3]).
THEOREM 7.18. Let $\nu(x)$ be a real, smooth function on $\mathbb{R}^+$ such that $0 < x|\nu'(x)| < 1$ and $|\nu'(x)|^2 < |\nu''(x)|$. Then for $X \ge Q \ge 1$ we have

anx(n)e(xv~)w «

L :L* IL

(N

<

+ Q~X~ logX)itall 2 -

q,;,Q X (mod q) n,;,N

where the implied constant is absolute.

If the linear forms $\sum a_n \chi(n)$ are lacunary (many coefficients $a_n$ vanish), then the power of the large sieve reduces significantly. There is a great demand for results which would be sensitive in this respect (cf. P. Elliott [El]). O. Ramaré worked out many interesting estimates of such kind (unpublished).

PROBLEM 7.19. Prove that for any complex numbers $a_n$ supported on primes

$\sum_{q \le Q}\ \sum^{*}_{\chi \pmod q} \Big|\sum_{n \le N} a_n \chi(n)\Big|^2 \ll (Q^2 + N(\log N)^{-1})\|a\|^2$

where the implied constant is absolute. A more general problem is to fix $f \in \mathbb{Z}[X]$ and establish good large sieve inequalities for linear forms

One should realize how fortunate is the case of large sieve for coefficients of full density, because the duality principle works efficiently. For sequences of coefficients $a_n$ which are very sparse, the duality arguments are poor and very little can be done directly while predictions for the best possible estimates are risky.

In the same vein it is a challenging problem to establish the best large sieve inequality for harmonics which are primitive characters of a fixed order. Here we have a powerful result of D. R. Heath-Brown [HB1] for quadratic characters (see also the earlier result of M. Jutila [Ju2]).

THEOREM 7.20. For any complex numbers $a_n$ we have

$\sum^{\flat}_{m \le M} \Big|\sum^{\flat}_{n \le N} a_n \Big(\frac{n}{m}\Big)\Big|^2 \ll (MN)^\varepsilon (M + N)\|a\|^2.$

Here the superscript $\flat$ restricts the summation to squarefree numbers, $(\frac{n}{m})$ stands for the Jacobi symbol, $\varepsilon$ is any positive number and the implied constant depends only on $\varepsilon$.

REMARKS. The fact that we can view $(\frac{n}{m})$ as either a character in $n$ of modulus $m$, or a character in $m$ of modulus $4n$ (by way of the quadratic reciprocity law) is the key feature of the proof. Note that the problem is self-dual, i.e. interchanging $m$ and $n$ does not matter. For interesting applications of Theorem 7.20, see [HB1], [Sou], [IM], [PP].

In connection with Kloosterman sums one meets characters $(\frac{f(n)}{m})$ where $f$ is a quadratic polynomial. The corresponding large sieve was considered in [Il] with the following result

THEOREM 7.21. For any complex numbers an we have

n2-4)/2 « (MN)<(M23+ M3N)jjajj 2 2 L IL an (----:;;;:---

m~AJ

n~N

where e is any positive number and the implied constant depends only on

E.

EXERCISE 4. Let $S(m, n; c)$ denote the classical Kloosterman sum. For any complex numbers $\alpha_m$, $\beta_n$ we have

$\sum_{c \le C} \Big|\sum_{m \le M} \sum_{n \le N} \alpha_m \beta_n S(m, n; c)\Big| \le (C^2 + M + N)\|\alpha\|\|\beta\|.$

EXERCISE 5. For any complex numbers

L 2:* I LL

Om,

f3n we have

2

2

Omf3ne(a:n)r :( (Q + MN)IIoll 11(311

2.

q'(Q a (mod q) m'(M,n~N

(mn,q)=1

The estimates in the above exercises can be easily derived from the large sieve inequalities of Theorems 7.11 and 7.13 respectively. Much harder is the following estimate for bilinear forms with Kloosterman fractions due to W. Duke, J. Friedlander and H. Iwaniec [DFI4].

THEOREM 7.22. Let $a$ be a positive integer. For any complex numbers $\alpha_m$, $\beta_n$ we have

$\mathop{\sum\sum}_{\substack{m \le M,\ n \le N \\ (m,n) = 1}} \alpha_m \beta_n e\Big(a\frac{\bar m}{n}\Big) \ll (MN)^\varepsilon \Big(\frac{1}{M} + \frac{1}{N}\Big)^{\frac{1}{58}} (a + MN)^{\frac12}\|\alpha\|\|\beta\|$

where the implied constant depends only on $\varepsilon$.

REMARKS. Without the factor $(M^{-1} + N^{-1})^{1/58}$, this bound would be just trivial. It is often crucial for applications to have a saving factor, less important how large it is. The proof of Theorem 7.22 uses the amplification method; the principle of this is explained at the end of Chapter 26.

Bilinear forms in complex domains are also desired for applications. We select from [FI1] two results for the Jacobi-Dirichlet symbol $(\frac{z}{w})$ and for the Jacobi-Kubota symbol $[wz]$ (for definitions, see Section 3.7).

THEOREM 7.23. Let $\alpha_w$, $\beta_z$ be any complex numbers with $|\alpha_w|\le 1$, $|\beta_z|\le 1$ for w, z in the discs $|w|^2\le M$, $|z|^2\le N$. Then

$$\sum_{w}^{*}\sum_{z}\alpha_w\beta_z\Big(\frac{z}{w}\Big) \ll (M+N)^{\frac{1}{12}}(MN)^{\frac{11}{12}+\varepsilon}$$

where the $*$ restricts the summation to the primary primitive numbers w, $\varepsilon$ is any positive number and the implied constant depends only on $\varepsilon$.

Using the twisted multiplicativity (3.71) of the Jacobi-Kubota symbol one can derive from Theorem 7.23 the following estimate (subject to some minor conditions)

$$\sum_{w}^{*}\sum_{z}^{*}\alpha_w\beta_z[wz] \ll (M+N)^{\frac{1}{12}}(MN)^{\frac{11}{12}+\varepsilon}.$$


With similar results but in a different fashion one can treat bilinear forms in $\psi_f(mn)$ where $\psi_f(\ell)$ are the normalized Fourier coefficients of a fixed cusp form $f\in S_k(\Gamma_0(N),\vartheta)$ of weight k = half an odd integer, and $\vartheta$ is the compatible theta multiplier (see (1.53) and (3.42)). These coefficients are not multiplicative so a cancellation in general bilinear forms is possible. The following estimate was proved in [Duii],

L L

0

cxmf3n'I/JJ(mn)

«

1

(MN)"(M2

1

+ M4N)Iicxlillf311

m:(AI n:(N

where $\alpha_m$, $\beta_n$ are any complex numbers and $\varepsilon$ is any positive number, the implied constant depending only on $\varepsilon$ and the cusp form f. Probably this general estimate holds true with $M^{1/2}+M^{1/4}N$ replaced by $M+N$. Keep in mind that the bound $\psi_f(\ell)\ll\ell^{\varepsilon}$ is expected to be true for $\ell$ squarefree but it is not yet proved; it is essentially equivalent with the Lindelöf Hypothesis for a suitable automorphic L-function twisted by the real character $\chi_\ell$ in the $\ell$-aspect (via Shimura correspondence and Waldspurger formula).

7.7. Large-sieve inequalities for cusp forms.

The Fourier coefficients of cusp forms, or better the eigenvalues of Hecke operators in the space of cusp forms, are analogues of Dirichlet characters. As tools they are powerful due to the orthogonality in the sense of the large sieve inequalities. There are two essential aspects of the matter, the spectral and the level aspects. Moreover, some hybrid aspects also appear in practice.

Let $(u_j)$ be an orthonormal basis of Maass cusp forms for $\Gamma=\Gamma_0(q)$ with $q\ge 1$ (see Theorem 15.5) and let $\lambda_j=s_j(1-s_j)$ with $s_j=\frac12+it_j$ be the corresponding eigenvalues of the Laplace operator. Let

$$u_j(z)=\sqrt{y}\sum_{n\ne 0}\rho_j(n)K_{it_j}(2\pi|n|y)e(nx)$$

be the Fourier expansion of $u_j$ at the cusp $\infty$ (see Lemma 15.1). Consider the normalized coefficients

(7.40)
$$\nu_j(n)=\Big(\frac{|n|q}{\cosh\pi t_j}\Big)^{\frac12}\rho_j(n),\qquad\text{for } n\ne 0.$$

These are expected to be essentially bounded on average with respect to n, q, $t_j$. Using the Kuznetsov formula (Theorem 16.3) one can show

THEOREM 7.24. Let $q\ge 1$, $T\ge 1$ and $N\ge 1$. For any complex numbers $a_n$ we have

(7.41)
$$\sum_{t_j\le T}\Big|\sum_{n\le N}a_n\nu_j(n)\Big|^2\ll(qT^2+N\log N)\|a\|^2$$

where the implied constant is absolute.

Many results of this type are established in [DI], but with a slightly weaker term $N^{1+\varepsilon}$ in place of $N\log N$ (probably N would suffice). Note that $qT^2$ is approximately the number of cusp forms $u_j(z)$ for $\Gamma_0(q)$ with $0<t_j\le T$ by Weyl's Law.


Still more orthogonality is expected from the primitive cusp forms. The following estimate is an open problem.

PROBLEM 7.25. Let $Q\ge 1$, $T\ge 1$ and $N\ge 1$. Prove that for any complex numbers $a_n$

(7.42)
$$\sum_{q\le Q}\ \sum_{t_j\le T}^{*}\Big|\sum_{n\le N}a_n\nu_j(n)\Big|^2\ll(Q^2T^2+N)\|a\|^2$$

where the implied constant is absolute and $\sum^{*}$ restricts $u_j$ to the primitive cusp forms, so $T_nu_j=\lambda_j(n)u_j$ and $\nu_j(n)=\lambda_j(n)\nu_j(1)$ for all $n\ge 1$.

Similar inequalities hold true for holomorphic cusp forms. In this case more precise results can be derived from the Petersson formula (Proposition 14.5). Let $\mathcal F$ be an orthonormal basis of $S_k(q)$. Let

$$f(z)=\sum_{n\ge 1}a_f(n)e(nz)$$

be the Fourier expansion of f at $\infty$. We consider the normalized Fourier coefficients

(7.43)
$$\psi_f(n)=\Big(\frac{q\,\Gamma(k-1)}{(4\pi n)^{k-1}}\Big)^{\frac12}a_f(n).$$

Recall that if $k\ge 2$ we have

(7.44)
$$|\mathcal F|=\dim S_k(q)\asymp kq\prod_{p\mid q}(1+p^{-1}).$$

THEOREM 7.26. Let $\mathcal F$ be any orthonormal basis of $S_k(q)$ with $k>2$. Then for any complex numbers $a_n$ we have

(7.45)
$$\sum_{f\in\mathcal F}\Big|\sum_{n\le N}a_n\psi_f(n)\Big|^2\ll(q+N)\|a\|^2$$

where the implied constant is absolute.

PROOF. By (7.43) and Proposition 14.5, the Petersson formula implies that the left side of (7.45), say L(q), equals

$$q\|a\|^2+2\pi q\,i^{-k}\sum_{c\equiv 0\ (\mathrm{mod}\ q)}c^{-1}\mathop{\sum\sum}_{m,n\le N}\overline{a}_ma_nS(m,n;c)J_{k-1}\Big(\frac{4\pi\sqrt{mn}}{c}\Big).$$

Opening the Kloosterman sum we derive by orthogonality of additive characters and by Cauchy's inequality that

$$\Big|\mathop{\sum\sum}_{m,n\le N}\overline{a}_ma_nS(m,n;c)\Big|\le(c+N)\|a\|^2.$$
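The step "opening the Kloosterman sum" can be followed numerically. In the sketch below (our own helper names, random test data with a fixed seed) the bilinear form is written as a sum over reduced classes d mod c of products of the additive transforms $A(d)=\sum_n a_ne(nd/c)$; orthogonality of additive characters gives $\sum_{d\ (\mathrm{mod}\ c)}|A(d)|^2=c\sum_r|\sum_{n\equiv r}a_n|^2$, and Cauchy's inequality then yields the bound $(c+N)\|a\|^2$.

```python
import cmath
import math
import random


def e(x: float) -> complex:
    return cmath.exp(2j * math.pi * x)


def additive_transform(a, c: int) -> list:
    """A[d] = sum_{n <= N} a_n e(n d / c) for d = 0, ..., c - 1."""
    return [sum(a_n * e(n * d / c) for n, a_n in enumerate(a, start=1))
            for d in range(c)]


def kloosterman_form(a, c: int) -> complex:
    """sum_{m, n <= N} conj(a_m) a_n S(m, n; c), computed by opening S(m, n; c)."""
    A = additive_transform(a, c)
    total = 0j
    for d in range(1, c + 1):
        if math.gcd(d, c) == 1:
            d_inv = pow(d, -1, c) if c > 1 else 0
            # sum_m conj(a_m) e(m d / c) = conj(A[-d]),  sum_n a_n e(n d^-1 / c) = A[d^-1]
            total += A[-d % c].conjugate() * A[d_inv % c]
    return total


def class_sums_energy(a, c: int) -> float:
    """c * sum over residues r mod c of |sum_{n = r (mod c)} a_n|^2."""
    sums = [0j] * c
    for n, a_n in enumerate(a, start=1):
        sums[n % c] += a_n
    return c * sum(abs(s) ** 2 for s in sums)


rng = random.Random(1)                       # fixed seed, arbitrary test data
a = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(12)]
norm_sq = sum(abs(x) ** 2 for x in a)
for c in (1, 5, 12, 15):
    print(c, abs(kloosterman_form(a, c)), (c + len(a)) * norm_sq)
```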

Next (to separate the variables) we derive via the power series expansion


7 .. BILINEAR FORMS AND THE LARGE SIEVE

that

where

h-J(2x) =

L

xk-1+2t

f!r(k

f)O

The condition x = 21r.jmnc- 1 q ) 27r N, giving L(q) :( qllall

2

+ 27rq

:(

+

f):( x

27rNC 1

L

1

,

if x :( 1.

1 holds for any c _ O(mod q) if

:(

c-

2

C~f (c + N)llall 2 ,

c:O (mod q)

hence (7.45) follows in this case. To remove the condition $q\ge 2\pi N$ we observe that $S_k(q)\subset S_k(pq)$ and the index is $[\Gamma_0(q):\Gamma_0(pq)]\le p+1$. Using this embedding with appropriate renormalization we find that $L(q)\le(1+p^{-1})L(pq)$. If $q\le 2\pi N$, we can choose p with $2\pi N\le pq\le 4\pi N$ and apply the result for L(pq) getting (7.45). □

With a slight modification the above arguments work also for $S_k(q,\chi)$ with any character $\chi$ (mod q) and for k = 2. However, the case k = 1 is very different because the space $S_1(q,\chi)$ is small (probably $\dim S_1(q,\chi)\ll\sqrt{q}\log q$ if $\chi$ is a real character). Technically speaking this intrinsic distinction manifests itself by the lack of convergence of the series of Kloosterman sums in the Petersson formula. What lies behind the scene is the huge spectrum of Maass cusp forms near the bottom, of which only a small part (the holomorphic forms at the bottom) is detected by the Petersson formula (the lack of spectral completeness causes the problem!). In this scenario one can proceed by the duality principle (no spectral completeness is required then), but a new obstacle emerges with the degree of the automorphic harmonics (GL(2) × GL(2) is essentially GL(4); to the contrary for Dirichlet characters GL(1) × GL(1) is still GL(1)) which amplifies the conductor. Nevertheless some non-trivial orthogonality can be established for sufficiently long vectors relative to the level. For example W. Duke succeeded in proving along such lines the following large sieve type inequality (not perfect but quite useful).

THEOREM 7.27. Let $\chi$ be an odd primitive quadratic character modulo q. Let $\mathcal F$ be an orthonormal basis of $S_1(q,\chi)$ and $\psi_g(n)$ be the corresponding normalized coefficients. Then for any complex numbers $a_n$ we have

(7.46)
$$\sum_{g\in\mathcal F}\Big|\sum_{n\le N}a_n\psi_g(n)\Big|^2\ll(q+N)\|a\|^2$$

where the implied constant is absolute.

Duke [Du1] gave an important application of (7.46) for estimating the dimension of $S_1(q,\chi)$. Specifically he proved

$$\dim S_1(q,\chi)\ll q^{11/12+\varepsilon}$$

for q prime and $\chi$ a real primitive character modulo q, the first improvement over the bound $\dim S_1(q,\chi)\ll q/\log q$ which comes from the trace formula.


Using the existence of Rankin-Selberg L-functions for arbitrary degree automorphic forms, Duke and Kowalski [DK] have established by the duality principle a very general type of large sieve inequality which has applications especially to the study of automorphic L-functions close to the line Re(s) = 1. We mention the special case of symmetric square coefficients (see Section 5.12) and give a sketch of proof to illustrate the wealth of ingredients.

THEOREM 7.28. For q squarefree, let $S_2(q)^*$ be the basis of primitive forms of level q and weight 2. Let $\lambda_f(n)$ be the Hecke eigenvalues for $f\in S_2(q)^*$. For any complex numbers $a_n$ we have

(7.47)
$$\sum_{q\le Q}\ \sum_{f\in S_2(q)^*}\Big|\sum_{n\le N}a_n\lambda_f(n^2)\Big|^2\ll\big(N(\log N)^{15}+N^{\frac12+\varepsilon}Q^{\frac72}\big)\|a\|^2$$

if $N\ge Q\ge 1$, with an implied constant depending only on $\varepsilon$.

SKETCH OF PROOF. One uses duality and one can attach a smooth test function to the sum over n, reducing the problem to estimating

$$H(f,g)=\sum_{n}\lambda_f(n^2)\lambda_g(n^2)\varphi\Big(\frac nN\Big)$$

where $f,g\in S_2(q)^*$, for some fixed smooth function $\varphi$ with compact support such that $\varphi(x)=1$ for $0\le x\le 1$. (The Hecke eigenvalues $\lambda_f(n)$ for $f\in S_2(q)^*$ are real.) By Mellin inversion we have

$$H(f,g)=\frac{1}{2\pi i}\int_{(2)}G(s)\hat\varphi(s)N^s\,ds$$

where

$$G(s)=\sum_{n\ge 1}\lambda_f(n^2)\lambda_g(n^2)n^{-s}.$$

Using the fact that $\lambda_f(n^2)$ are, up to small perturbation, coefficients of the symmetric square L-function (see Section 5.12) and the description of Rankin-Selberg L-functions of cusp forms on GL(3), it follows that

$$G(s)=G_1(s)L(\mathrm{Sym}^2f\otimes\mathrm{Sym}^2g,s)$$

where $G_1(s)$ is given by an Euler product absolutely convergent for Re(s) > 1/2. We have Deligne's bound $|\lambda_f(n)|\le\tau(n)$ (see (14.54)), which implies

$$G_1(s)\ll(\sigma-\tfrac12)^{-A}$$

for Re(s) = $\sigma>\frac12$, for some absolute constant A > 0, with an absolute implied constant. It is known (by properties of Rankin-Selberg convolutions) that $L(\mathrm{Sym}^2f\otimes\mathrm{Sym}^2g,s)$ is entire except if $\mathrm{Sym}^2f=\mathrm{Sym}^2g$, in which case there is a simple pole at s = 1. Moreover, in the Appendix to [DK], Ramakrishnan shows that this exception is equivalent with f = g in the case of squarefree levels (in the general


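case, a quadratic twist can occur). So moving the contour to the line $\sigma=\frac12+\delta$ for $\delta>0$, we get

$$H(f,g)=\frac{1}{2\pi i}\int_{(\frac12+\delta)}G_1(s)L(\mathrm{Sym}^2f\otimes\mathrm{Sym}^2g,s)\hat\varphi(s)N^s\,ds$$

for $f\ne g$. For $\sigma=\frac12+\delta$, we use the convexity bound (see Exercise 3 of Chapter 5)

$$L(\mathrm{Sym}^2f\otimes\mathrm{Sym}^2g,\sigma+it)\ll q(\mathrm{Sym}^2f\otimes\mathrm{Sym}^2g,\sigma+it)^{\frac14-\frac{\delta}{2}+\varepsilon}$$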

for this L-function of degree 9, where $q(\cdot,s)$ is the analytic conductor defined in (5.7). We have $q(\mathrm{Sym}^2f\otimes\mathrm{Sym}^2g,\sigma+it)\ll Q^6(|t|+1)^9$ by the bound (5.11) of Bushnell and Henniart for the conductor of a Rankin-Selberg convolution, with an absolute implied constant. By the rapid decay of $\hat\varphi(s)$ the integral converges for any $\delta>0$. We take $\delta=(\log N)^{-1}$ so

$$H(f,g)\ll N^{\frac12+\varepsilon}Q^{\frac32}$$

for $N\ge Q$ and $f\ne g$. If f = g, by Deligne's bound we have $|\lambda_f(n^2)\lambda_g(n^2)|\le\tau(n^2)^2\le\tau(n)^4$ hence by direct estimation

$$H(f,f)\ll N(\log N)^{15}.$$

Since there are $\ll Q^2$ forms on the left side of (7.47), it follows that for $N\ge Q$ we have (7.47). □

In this connection we suggest the following problem as a first case of a higher-degree large sieve inequality:

PROBLEM 7.29. Prove that

$$\sum_{f\in H_k(q)}\Big|\sum_{n\le N}a_n\lambda_f(n^2)\Big|^2\ll(qN)^{\varepsilon}(q+N)\|a\|^2$$

where $H_k(q)$ is a Hecke basis of eigenforms on $S_k(q)$, with q squarefree, and the implied constant depends only on $\varepsilon$ and k.

We can deduce from Theorem 7.28 the analogue of Linnik's result (Theorem 7.16) for elliptic curves.

PROPOSITION 7.30. Let A > 0 and Q > 2 be fixed. For $E/\mathbb Q$ a semistable elliptic curve of conductor $\le Q$, let M(E) be the number of semistable elliptic curves $F/\mathbb Q$ with conductor $\le Q$ such that $a_F(p)=a_E(p)$ for all $p\le(\log Q)^A$. Then we have

$$M(E)\ll Q^{9/A}$$

where the implied constant depends only on A.

It is conjectured that the number of semistable elliptic curves of conductor $\le Q$ is about $Q^{5/6}$; the lower bound is easy by constructing explicit families, but the best known upper bound is $Q^{1+\varepsilon}$ for any $\varepsilon>0$ (see [DK]). Therefore our result is non-trivial for any A > 11.
PROOF. For a given weight 2 primitive form f of conductor $q\le Q$, q squarefree, let $M'(f)$ be the number of primitive modular forms g of weight 2 with squarefree conductor $\le Q$ such that $\lambda_f(p)=\lambda_g(p)$ for $p\le(\log Q)^A$. By modularity of (semistable) elliptic curves over $\mathbb Q$, we have $M(E)\le M'(f_E)$ where $f_E$ is the weight 2 form associated to E. Note that $\lambda_f(n)=\lambda_g(n)$ if all prime divisors $p\mid n$ are $\le(\log Q)^A$.

As in Linnik's proof, the idea is to find coefficients $a_n$ supported on prime numbers that make the linear form

$$L_f=\sum_{n\le N}a_n\lambda_f(n)$$

large, hence $L_g=L_f$ is large and repeated for all g counted in $M'(f)$. On the other hand, the large sieve estimate (7.47) will show that this cannot happen very often, producing a bound for $M'(f)$.

Because the coefficients $\lambda_f(p)$ can be quite small (in contrast with Dirichlet characters), $a_n=\lambda_f(n)$ may not work. Hence one is tempted to use Proposition 14.22, but N there would be too small (limited to $N\le(\log Q)^A$). Instead, notice that by the Prime Number Theorem and the formula $\lambda_f(p)^2-\lambda_f(p^2)=1$ for (p, q(f)) = 1, one of the sets $T_1$ or $T_2$ given by

$$T_i=\{p\le(\log Q)^A\ :\ |\lambda_f(p^i)|\ge\tfrac12\ \text{and}\ (p,q(f))=1\}$$

satisfies $|T_i|\gg(\log Q)^A(A\log\log Q)^{-1}$ with an absolute implied constant. Assume it is $T_2$ (the case of $T_1$ is similar, but simpler). We have $|\lambda_f(n)|\ge 2^{-\omega(n)}$ if n is squarefree with all its prime factors $p\in T_2$. Fix $m\ge 1$ (to be chosen later). Let $a_n=\lambda_f(n)$ for such n with $\omega(n)=m$, and $a_n=0$ otherwise. It follows that

$$L_f=\sum_{n\le N}|a_n|^2=\sum_{n\le N}|\lambda_f(n)|^2\ge 2^{-2m}U$$

where U is the number of values of $n\le N$ satisfying the desired condition. By Theorem 7.28, we have

$$M'(f)L_f^2\ll N(\log N)^{15}L_f$$

for any $N\ge Q^8$. Hence for $N=Q^8$ we get

$$M'(f)\ll Q^8(\log Q)^{15}2^{2m}U^{-1}.$$

Choosing $m=[8(\log Q)(A\log\log Q)^{-1}]$ we get $2^{2m}\ll Q^{\varepsilon}$ and

$$U\ge\binom{|T_2|}{m}\gg Q^{8(1-A^{-1})-\varepsilon}$$

showing that $M'(f)\ll Q^{9/A}$ for $\varepsilon$ small enough. □


7.8. Orthogonality of elliptic curves.

A straightforward approach to the large sieve inequalities requires near-orthogonality and almost completeness of the outer harmonics. However, some spaces contain smaller subspaces which preserve completeness with respect to their own structure. Then proving a large sieve type inequality for the subspace becomes a harder yet feasible problem of independent interest. A clear example is offered by the family of elliptic curves

(7.48)
$$y^2=x^3+ax+b$$

with $a,b\in\mathbb Z$, $1\le a\le A$, $1\le b\le B$. The corresponding modular forms $f_{ab}$ are of weight 2 and level $q\le Q=16(4A^3+27B^2)$. We have about AB such curves while the total number of all primitive cusp forms of weight two and level $q\le Q$ is about $Q^2$. Therefore we are dealing with a very small subset indeed. In this section we present a large sieve type inequality for the harmonics associated with the family of elliptic curves.

If m is squarefree the Hecke eigenvalue for the cusp form associated to (7.48) is given by the character sum

(7.49)
$$\lambda_{ab}(m)=\mu(m)\sum_{x\ (\mathrm{mod}\ m)}\Big(\frac{x^3+ax+b}{m}\Big)$$

(see Section 14.4). However, it is more interesting to consider the sum over the reduced classes x (mod m), (x, m) = 1. Put

(7.50)
$$\lambda^*_{ab}(m)=\mu(m)\sum^{*}_{x\ (\mathrm{mod}\ m)}\Big(\frac{x^3+ax+b}{m}\Big).$$

Note that

$$\lambda_{ab}(m)=\sum_{d\mid m}\mu(d)\Big(\frac{b}{d}\Big)\lambda^*_{ab}\Big(\frac{m}{d}\Big).$$
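The relation between the full character sum (7.49) and the sum over reduced classes can be checked by brute force for small odd squarefree m. The sketch below is illustrative only: the function names are ours, and it assumes that the reduced sum carries the same factor $\mu(m)$ as (7.49), which is the normalization under which the relation holds.

```python
from math import gcd, sqrt


def jacobi(n: int, m: int) -> int:
    """Jacobi symbol (n/m) for odd m >= 1."""
    n %= m
    sign = 1
    while n:
        while n % 2 == 0:
            n //= 2
            if m % 8 in (3, 5):
                sign = -sign
        n, m = m, n
        if n % 4 == 3 and m % 4 == 3:
            sign = -sign
        n %= m
    return sign if m == 1 else 0


def mobius(m: int) -> int:
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result


def lam(a: int, b: int, m: int) -> int:
    """lambda_ab(m) as in (7.49), for odd squarefree m."""
    return mobius(m) * sum(jacobi(x**3 + a * x + b, m) for x in range(m))


def lam_star(a: int, b: int, m: int) -> int:
    """Same sum over reduced classes x mod m (factor mu(m) assumed, see lead-in)."""
    return mobius(m) * sum(jacobi(x**3 + a * x + b, m)
                           for x in range(m) if gcd(x, m) == 1)


def lam_from_star(a: int, b: int, m: int) -> int:
    """Right side of the relation: sum over d | m of mu(d) (b/d) lam_star(m/d)."""
    return sum(mobius(d) * jacobi(b, d) * lam_star(a, b, m // d)
               for d in range(1, m + 1) if m % d == 0)


for m in (3, 15, 105):
    print(m, lam(1, 1, m), lam_from_star(1, 1, m))
```

For a prime p the relation reduces to separating the term x = 0, which contributes $(\frac{b}{p})$ to the full sum.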

THEOREM 7.31. For any complex numbers $\alpha_a$, $\beta_b$ we have

(7.51)
$$\sum_{1\le m\le M}\Big|\mathop{\sum\sum}_{\substack{1\le a\le A\\ 1\le b\le B}}\alpha_a\beta_b\lambda^*_{ab}(m)\Big|\ll\|\alpha\|\|\beta\|(M+\sqrt{A})(M+\sqrt{B})M^{\varepsilon}$$

where $\varepsilon$ is any positive number and the implied constant depends only on $\varepsilon$.

PROOF. We can assume that m is odd squarefree. By Gauss sums (see Theorem 3.2)

>.:b(m) = IL(m)

t;;.

ym

Em =Jlm)(

L z

(mod m)

(-=-) m

em(z(x 3 +ax+ b))

L* x

(mod m)

(~)em(z 2 x 3 + zb).

L*

Vm x(mod m)

z(mod m)

Hence the inner sum in (7.51) equals

!L(m)~

L* x

(mod m)

(Laaem(axJ)(Lf3b

L* z

(mod m)

em(z 2 x 3 +zb))



By Cauchy's inequality the left side of (7.51) is bounded by $(\mathcal A\mathcal B)^{1/2}$ where

$$\mathcal A=\sum_{m}\ \sum^{*}_{x\ (\mathrm{mod}\ m)}\Big|\sum_{a}\alpha_ae_m(ax)\Big|^2\le\|\alpha\|^2(A+M^2)$$

and

B=

L~ m

L*

1Lf3b

x (mod m)

:::: '"""'T3(m) ~~ m m

x

I" '"""' ~f3b

(mod m}

:( LT3(m)T(m)

em(2 x +zb)l

z (mod m}

b

'"""' ~

2

2 3

L

L*

2

~

z

2 3 1 em(zx+zb)

(mod m}

2

2

l/3b em(zbf<< II/311 (B + 11! )11!".

z(mod m)

This completes the proof of (7.51).

0

REMARKS. From the Hasse bound $\lambda^*_{ab}(m)\ll m^{1/2}\tau(m)$ (see Theorem 11.25) it follows that the left side of (7.51) is $\ll\|\alpha\|\|\beta\|(AB)^{1/2}M^{3/2}\log 2M$, which we regard as a trivial bound, because no cancellation is used. In applications one often wishes to have a saving slightly more than $M^{1/2}$. Our theorem guarantees this saving relative to the trivial bound if AB is larger than $M^2$. Choosing $A=X^{1/3}$ and $B=X^{1/2}$ we have the elliptic curves with discriminant $\Delta_{ab}=-16(4a^3+27b^2)\ll X$ and we obtain the desired saving if $M\ll X^{5/12}$. This is quite a large range for M, but not yet completely satisfactory. Indeed, for applications to L-functions one needs to pass the barrier $M=X^{1/2}$. The following conjecture would provide the passage.

CONJECTURE 7.32. For any complex numbers $\gamma_m$ supported on squarefree numbers we have

$$\sum_{1\le a\le A}\Big|\sum_{m\le M}\gamma_m\lambda_{ab}(m)\Big|^2\ll(A+M)M\sum_{m\le M}|\gamma_m\tau(m)|^2$$

where the implied constant may depend on b slightly.

When b is a square we have elliptic curves of positive rank. Similar estimates are expected to be true for special families of elliptic curves of a given rank.

EXERCISE 6*. Let m be a positive, odd, squarefree number. Prove that for any complex numbers $\alpha_a$, $\beta_b$,

$$\Big|\mathop{\sum\sum}_{\substack{A\le a\le 2A\\ B\le b\le 2B}}\alpha_a\beta_b\lambda_{ab}(m)\Big|\le\tau_4(m)(A+m)^{\frac12}(B+m)^{\frac12}\|\alpha\|\|\beta\|.$$

7.9. Power-moments of L-functions.

We have remarked that a sharp large sieve type inequality (7.18), (7.20) for coefficients of a family of L-functions has strength comparable to the Grand Riemann Hypothesis on average. So it is not surprising that the large sieve can also be employed to derive estimates for averages of L-functions which are as strong as the Lindelöf Hypothesis would give (see Corollary 5.20). There is a vast literature of relevant results. For classical L-functions we recommend the book by A. Ivic [Iv]. In this section we only sample the industry by showing a few representative results.

1:

THEOREM 7.33. ForT) 2 we have (7.52)

I((~ +itWdt « TlogT.

THEOREM 7.34. We have

(7.53)
$$\sum_{q\le Q}\ \sum^{*}_{\chi\ (\mathrm{mod}\ q)}|L(\tfrac12+it,\chi)|^8\ll Q^2(t^2+1)(\log Q(|t|+2))^{17}$$

for any $t\in\mathbb R$, where the implied constant is absolute.

THEOREM 7.35. Let $k\ge 2$ and $\mathcal F$ be a Hecke orthogonal basis of $S_k(q)$. We have

(7.54)
$$\sum_{f\in\mathcal F}|L(f,\tfrac12+it)|^4\ll q(t^2+1)(\log q(|t|+2))^{17}$$

for any $t\in\mathbb R$, where the implied constant depends only on k.

REMARKS. In all cases, slightly more precise estimates are known (in some cases asymptotic formulas). See [T2] for the Riemann zeta function, and [KMV1] for cusp forms. Also (7.53) and (7.54) are only comparable to the Lindelöf Hypothesis in the q-aspect. In the t-aspect they are as good as the convexity bound; extra averaging over t would improve the situation.

In Chapter 26, as a by-product of a deeper study of special values of L-functions, we prove asymptotic formulas for the first and second moments of derivatives $L'(f,\frac12)$ with respect to the odd weight 2 cusp forms (see (26.35) and (26.36)). In fact, the modern methods used in the analytic study of special values of L-functions (and their zeros) rely heavily on various types of averaging, some of which are quite sophisticated indeed (they require orthogonality of a capacity beyond the diagonal, so to speak). See the introduction to Chapter 26 for some references, to which can be added [CI1] (among many others). See also Section 9.2.

SKETCH OF PROOF OF THEOREMS 7.33, 7.34, 7.35. The arguments are quite familiar so we only indicate the essential steps. The method starts with suitable short approximations of L-functions by Dirichlet polynomials (see Chapter 5), and then applies a large sieve inequality to the resulting linear forms. Without compromising estimates one can take a moment of order 2k as long as the length of the approximation for the k-th power is bounded by the total number of "harmonics". For instance, in Theorem 7.34, there are about $Q^2$ characters involved, and for each one the approximation to $L(\frac12+it,\chi)$ is of length about $(q(|t|+1))^{1/2}$, so we can apply Theorem 7.13 for the mean square of $L(\frac12+it,\chi)^4$, and get a bound for the eighth power-moment which is nearly best possible.


We sketch Theorem 7.34 this way, leaving the other two theorems as exercises (use Theorem 7.17 with Q = 1 and Theorem 9.1 in the first case, or Theorem 7.26 in the last case). Let $\chi$ be primitive modulo q, $2\le q\le Q$. We have

$$L(s,\chi)^4=\sum_{n\ge 1}\chi(n)\tau_4(n)n^{-s}$$

for Re(s) > 1. By Theorem 5.3 for $L(s,\chi)^4$ we see that $L(\frac12+it,\chi)^4$ is equal to

$$\sum_{n\ge 1}\frac{\chi(n)\tau_4(n)}{n^{1/2+it}}V\Big(\frac{n}{q^2}\Big)+\varepsilon_\chi^4q^{-4it}\frac{\gamma(\frac12-it)}{\gamma(\frac12+it)}\sum_{m\ge 1}\frac{\overline{\chi}(m)\tau_4(m)}{m^{1/2-it}}W\Big(\frac{m}{q^2}\Big)$$

where $V(y)=V_s(y)$ and $W(y)=W_s(y)$ are given by (5.13) and (5.14) with the gamma factor $\gamma(s)=\pi^{-2s}\Gamma(\frac12(s+\kappa))^4$ for $\kappa=\frac12(1-\chi(-1))$ and some suitable choice of auxiliary function G(u). The functions V(y) and W(y) decay rapidly for $y\gg t^2+1$ (see Proposition 5.4) so practically we are left with sums of length $N\asymp q^2(t^2+1)$. By Theorem 7.13 we get

$$\sum_{q\le Q}\ \sum^{*}_{\chi\ (\mathrm{mod}\ q)}\Big|\sum_{n\le N}\frac{\chi(n)\tau_4(n)}{n^{1/2+it}}\Big|^2\ll(Q^2+N)(\log N)^{16}.$$

Using this estimate essentially for $N\asymp Q^2(t^2+1)$ one derives (7.53). □

CHAPTER 8

EXPONENTIAL SUMS

8.1. Introduction.

As usual we denote the additive character on $\mathbb R$ by $e(x)=e^{2\pi ix}$. Exponential sums which we consider in this chapter are of type

(8.1)
$$S_f(N)=\sum_{1\le n\le N}e(f(n))$$

where f is a smooth, real valued function on the interval [1, N] which is called the amplitude function and N is a positive integer called the length of the sum. We could begin the summation at any point and consider

(8.2)
$$S_f(M,N)=\sum_{M<n\le M+N}e(f(n))$$

where M is any integer and f is a smooth function on [M + 1, M + N]. Of course, the latter can be transformed to the former by shifting the argument,

$$S_f(M,N)=\sum_{1\le n\le N}e(f(n+M)).$$

We have the trivial bound $|S_f(M,N)|\le N$ and our goal will be to improve on this bound as much as we can. However, in many cases even a slight improvement of the trivial bound is sufficient for applications.

Exponential sums $S_f(N)$ for suitable f are encountered in various areas of analytic number theory. For example, they can be used to estimate the Riemann zeta function $\zeta(s)$ on vertical lines. Indeed by the simple approximation

(8.3)
$$\zeta(s)=\sum_{1\le n\le N}n^{-s}+\frac{N^{1-s}}{s-1}+O(N^{-\sigma})$$

for $s=\sigma+it$ with $\sigma\ge\frac12$ and $1\le t\le N$ the problem reduces to estimating sums

g

which are of type (8.2) with f(x) = log x. Another example is the problem of counting lattice points inside a planar domain. This reduces to estimating integral points under a curve y = g(x), where g is a smooth, positive decreasing function on [1, N]. The number of points in question equals

$$\sum_{1\le n\le N}[g(n)]=\sum_{1\le n\le N}g(n)-\sum_{1\le n\le N}\psi(g(n))-\frac{N}{2}.$$
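The decomposition of the lattice point count into a main sum, a sawtooth sum and the term N/2 is exact, which a few lines of Python confirm. The sketch uses exact rational arithmetic and, as an illustrative choice of ours, the hyperbola $g(x)=N/x$ from the Dirichlet divisor problem; the function names are not from the text, and $\psi(x)=x-[x]-\frac12$ is the sawtooth function.

```python
from fractions import Fraction
from math import floor


def psi(x: Fraction) -> Fraction:
    """Sawtooth function psi(x) = x - [x] - 1/2."""
    return x - floor(x) - Fraction(1, 2)


def count_under_curve(g, N: int) -> int:
    """Number of lattice points (n, k) with 1 <= n <= N and 1 <= k <= g(n)."""
    return sum(floor(g(n)) for n in range(1, N + 1))


def count_via_sawtooth(g, N: int) -> Fraction:
    """Same count written as sum g(n) - sum psi(g(n)) - N/2."""
    main = sum(g(n) for n in range(1, N + 1))
    saw = sum(psi(g(n)) for n in range(1, N + 1))
    return main - saw - Fraction(N, 2)


N = 100
g = lambda x: Fraction(N, x)          # hyperbola xy = N, chosen for illustration
print(count_under_curve(g, N), count_via_sawtooth(g, N))
```

All the difficulty of the lattice point problem sits in the sawtooth sum, which is where exponential sums enter.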

On the right side the first sum is well approximated by an integral (see (1.67)) while the second sum is equal to

by the truncated Fourier series (4.18) for $\psi(x)$. Hence the problem reduces to estimation of sums (8.1) with f(x) = hg(x). In particular, in the Gauss circle problem (lattice points in the circle $x^2+y^2=R^2$) we encounter sums $S_f(R)$ with $f(x)=h\sqrt{R^2-x^2}$, and in the Dirichlet divisor problem (lattice points under the hyperbola xy = N) we encounter sums $S_f(N)$ with $f(x)=hNx^{-1}$.

In this chapter we present classical methods for estimating general exponential sums $S_f(M,N)$ due to H. Weyl, J. G. van der Corput and I. M. Vinogradov. Our results do not depend on particular properties of f but only on estimates for derivatives. It is always understood that the functions involved are of class $C^k$, where k is the largest order of derivation appearing in the statements.

Note that if f is well approximated by g, then $S_f(M,N)$ can be estimated by $S_g(M,N')$ for some $N'\le N$. Precisely, if f = g + h, then by partial summation

(8.4)
$$S_f(M,N)=e(h(M+N))S_g(M,N)-\int_M^{M+N}S_g(M,x-M)\,de(h(x)).$$

Hence we get

(8.5)
$$|S_f(M,N)|\le c_h|S_g(M,N')|$$

where

$$c_h=1+2\pi\int_M^{M+N}|h'(x)|\,dx.$$

If h is monotonic and bounded (so $c_h$ is bounded), then the original sum $S_f$ and the modified sum $S_g$ are practically equal. Sometimes in applications we find that a small alteration of f is necessary to meet the conditions of our results, in which case (8.5) can be used.

8.2. Weyl's method.

The first applications of exponential sums in number theory were given in 1916 by H. Weyl [W1] to the problem of equidistribution of sequences modulo one. In the second paper of 1921 Weyl [W2] develops a general method which is particularly good for $S_f(N)$ where f is a polynomial

$$f(x)=\alpha x^k+\beta x^{k-1}+\cdots\in\mathbb R[x]$$

with $\alpha>0$. In this case we call $S_f(N)$ a Weyl sum of degree k. The Weyl sum for a linear polynomial $f(x)=\alpha x$ is a sum over a geometric progression

$$S_f(N)=\sum_{1\le n\le N}e(\alpha n)=\frac{\sin\pi\alpha N}{\sin\pi\alpha}\,e\Big(\frac{\alpha}{2}(N+1)\Big).$$
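The closed form for the linear Weyl sum is easy to confirm numerically, together with the bound $\min(N,(2\|\alpha\|)^{-1})$ that follows from $|\sin\pi\alpha|\ge 2\|\alpha\|$. This is a small sketch with our own function names and arbitrarily chosen sample values of $\alpha$ and N.

```python
import cmath
import math


def e(x: float) -> complex:
    return cmath.exp(2j * math.pi * x)


def dist_to_integer(x: float) -> float:
    """||x||, the distance from x to the nearest integer."""
    return abs(x - round(x))


def linear_weyl_sum(alpha: float, N: int) -> complex:
    return sum(e(alpha * n) for n in range(1, N + 1))


def closed_form(alpha: float, N: int) -> complex:
    """sin(pi alpha N) / sin(pi alpha) * e(alpha (N + 1) / 2), for alpha not an integer."""
    return (math.sin(math.pi * alpha * N) / math.sin(math.pi * alpha)
            * e(alpha * (N + 1) / 2))


for alpha in (0.1, 0.37, math.sqrt(2) - 1):
    for N in (5, 50, 200):
        S = linear_weyl_sum(alpha, N)
        bound = min(N, 1 / (2 * dist_to_integer(alpha)))
        print(f"alpha={alpha:.4f} N={N:3d} |S|={abs(S):.4f} bound={bound:.4f}")
```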


Since $|\sin\pi\alpha|\ge 2\|\alpha\|$, where $\|\alpha\|$ denotes the distance of $\alpha$ to the nearest integer, we obtain for $\alpha\notin\mathbb Z$,

(8.6)
$$|S_f(N)|\le\min\Big(N,\frac{1}{2\|\alpha\|}\Big).$$

The Weyl sum for a quadratic polynomial $f(x)=\alpha x^2+\beta x$ is also called a Gauss sum. In the special case where a, b are integers with (2a, N) = 1 we have the exact formula (3.38) from which it follows by completing the square that

(8.7)

Though there is no simple expression for a general Gauss sum

$$S_f(N)=\sum_{1\le n\le N}e(\alpha n^2+\beta n),$$

we can estimate this quite well. First we arrange $|S_f(N)|^2$ as follows

$$|S_f(N)|^2=\sum_{|\ell|<N}e(\alpha\ell^2+\beta\ell)\sum_{1\le n,\ n+\ell\le N}e(2\alpha\ell n).$$

Then applying (8.6) we get

(8.8)
$$|S_f(N)|^2\le N+\sum_{1\le\ell<N}\min(2N,\|2\alpha\ell\|^{-1}).$$

Hence one can deduce that

(8.9)
$$|S_f(N)|\le 2\sqrt{\alpha}\,N+\frac{1}{\sqrt{\alpha}}\log\frac{1}{\alpha}$$

if $0<\alpha\le\frac12$, which restriction can always be arranged. However, a more precise estimate depends on the diophantine nature of the leading coefficient $\alpha$. By Dirichlet's approximation theorem there exists a rational approximation to $2\alpha$ of type

(8.10)
$$\Big|2\alpha-\frac{a}{q}\Big|\le\frac{1}{2Nq}$$

with (a, q) = 1 and $1\le q\le 2N$. Hence $\|2\alpha\ell\|\ge\frac12\|a\ell/q\|$ for any $1\le\ell<N$, $\ell\not\equiv 0\ (\mathrm{mod}\ q)$ and

$$\sum_{\substack{1\le\ell<N\\ \ell\not\equiv 0\ (\mathrm{mod}\ q)}}\|2\alpha\ell\|^{-1}\le 2\Big(\frac{N}{q}+1\Big)\sum_{\substack{\ell\ (\mathrm{mod}\ q)\\ \ell\not\equiv 0\ (\mathrm{mod}\ q)}}\|\ell/q\|^{-1}=2(N+q)\Big(\sum_{1\le\ell\le\frac{q}{2}}\ell^{-1}+\sum_{1\le\ell<\frac{q}{2}}\ell^{-1}\Big)\le 4(N+q)\log q.$$

Then estimating the partial sum of (8.8) over $1\le\ell<N$, $\ell\equiv 0\ (\mathrm{mod}\ q)$ trivially by $2N^2q^{-1}$ we arrive at

$$|S_f(N)|^2\le N+2N^2q^{-1}+4(N+q)\log q.$$

This together with the trivial bound $|S_f(N)|\le N$ implies


THEOREM 8.1. If $f(x)=\alpha x^2+\beta x$ with $2\alpha$ satisfying (8.10), then

(8.11)
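The two steps leading to Theorem 8.1, the differenced bound (8.8) and the estimate in terms of the Dirichlet denominator q, can be followed numerically. The sketch below is an illustration with our own function names; the sample values of $\alpha$, $\beta$ and N are arbitrary, and the denominator q of (8.10) is found by direct search, taking the smallest admissible q.

```python
import cmath
import math


def e(x: float) -> complex:
    return cmath.exp(2j * math.pi * x)


def dist_to_integer(x: float) -> float:
    return abs(x - round(x))


def quadratic_weyl_sum(alpha: float, beta: float, N: int) -> complex:
    return sum(e(alpha * n * n + beta * n) for n in range(1, N + 1))


def rhs_8_8(alpha: float, N: int) -> float:
    """N + sum over 1 <= l < N of min(2N, ||2 alpha l||^-1)."""
    total = float(N)
    for l in range(1, N):
        d = dist_to_integer(2 * alpha * l)
        total += 2 * N if 2 * N * d <= 1 else 1 / d
    return total


def dirichlet_q(theta: float, N: int) -> int:
    """Smallest q <= 2N with ||theta q|| <= 1/(2N); it exists by Dirichlet's theorem."""
    for q in range(1, 2 * N + 1):
        if dist_to_integer(theta * q) <= 1 / (2 * N):
            return q
    raise ArithmeticError("no admissible denominator found")


def rhs_with_q(alpha: float, N: int) -> float:
    """N + 2 N^2 / q + 4 (N + q) log q with q attached to 2 alpha as in (8.10)."""
    q = dirichlet_q(2 * alpha, N)
    return N + 2 * N * N / q + 4 * (N + q) * math.log(q)


N = 150
for alpha in (math.sqrt(2) - 1, 1 / 3, 0.0123):
    S = quadratic_weyl_sum(alpha, 0.3, N)
    print(f"{abs(S) ** 2:10.2f} <= {rhs_8_8(alpha, N):10.2f} <= {rhs_with_q(alpha, N):10.2f}")
```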

Theorem 8.1 is essentially best possible. Slightly better results hold true for almost all coefficients $\alpha$, $\beta$ (mod 1) with respect to the Lebesgue measure. Indeed we have the following estimates for the second, the fourth and the sixth power mean values

(8.12)
$$\int_0^1\Big|\sum_{1\le n\le N}e(\alpha n^2+\beta n)\Big|^2d\alpha=N,$$

(8.13)
$$\int_0^1\Big|\sum_{1\le n\le N}e(\alpha n^2+\beta n)\Big|^4d\alpha\ll(N\log 2N)^2,$$

(8.14)
$$\int_0^1\!\!\int_0^1\Big|\sum_{1\le n\le N}e(\alpha n^2+\beta n)\Big|^6d\alpha\,d\beta\ll(N\log 2N)^3.$$

PROOF. The first formula is the Parseval identity. The second integral is bounded by the number of solutions to $n_1^2+n_2^2=n_3^2+n_4^2$ in positive integers $n_1,n_2,n_3,n_4\le N$, and the latter is bounded by

$$\sum_{\ell\le 2N^2}r^2(\ell)\ll(N\log 2N)^2.$$

The third integral is equal to the number of solutions to the system

$$\begin{cases}n_1+n_2+n_3=n_4+n_5+n_6,\\ n_1^2+n_2^2+n_3^2=n_4^2+n_5^2+n_6^2\end{cases}$$

in positive integers $n_1,n_2,n_3,n_4,n_5,n_6\le N$. Putting $k_\nu=n_\nu-n_{\nu+3}$ and $\ell_\nu=n_\nu+n_{\nu+3}$ for $1\le\nu\le 3$ we get the system $k_1+k_2+k_3=0$ and $k_1\ell_1+k_2\ell_2+k_3\ell_3=0$. This reduces to one equation $k_1(\ell_1-\ell_3)+k_2(\ell_2-\ell_3)=0$ and $k_3$ is determined uniquely by $k_1,k_2$. If $k_1(\ell_1-\ell_3)=0$, then $k_2(\ell_2-\ell_3)=0$, therefore there are at most $16N^3$ such solutions. It remains to count the solutions with $k_1(\ell_1-\ell_3)\ne 0$. Given m with $1\le m\le 4N^2$ and $\ell_3$ with $1<\ell_3\le 2N$ there are at most $\tau^2(m)$ solutions to $m=k_1(\ell_1-\ell_3)=-k_2(\ell_2-\ell_3)$ in $k_1,k_2,\ell_1,\ell_2$. Thus the total number of solutions in question does not exceed

$$4N\sum_{1\le m\le 4N^2}\tau^2(m)\ll(N\log 2N)^3.$$

This completes the proof of (8.14). □
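The counting problems behind the fourth and sixth power mean values can be explored by brute force for small N. The following sketch (function names ours) counts the solutions by grouping pairs, respectively triples, according to the values they produce; the printed ratios against $(N\log 2N)^2$ and $(N\log 2N)^3$ only indicate orders of magnitude, since the estimates carry unspecified constants.

```python
from collections import Counter
from math import log


def fourth_moment_count(N: int) -> int:
    """Number of (n1, n2, n3, n4) in [1, N]^4 with n1^2 + n2^2 = n3^2 + n4^2."""
    r = Counter(n1 * n1 + n2 * n2
                for n1 in range(1, N + 1) for n2 in range(1, N + 1))
    return sum(v * v for v in r.values())


def sixth_moment_count(N: int) -> int:
    """Number of solutions in [1, N]^6 of the linear and quadratic system."""
    r = Counter((n1 + n2 + n3, n1 * n1 + n2 * n2 + n3 * n3)
                for n1 in range(1, N + 1)
                for n2 in range(1, N + 1)
                for n3 in range(1, N + 1))
    return sum(v * v for v in r.values())


for N in (5, 10, 20):
    c4, c6 = fourth_moment_count(N), sixth_moment_count(N)
    scale = N * log(2 * N)
    print(N, c4, round(c4 / scale ** 2, 3), c6, round(c6 / scale ** 3, 3))
```

The diagonal solutions, where the second tuple is a permutation of the first, already contribute $2N^2-N$ to the first count and $6N^3-9N^2+4N$ to the second.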

Now we estimate Weyl's sum $S_f(N)$ for a polynomial of any degree $k\ge 2$. We begin by

$$|S_f(N)|^2=\mathop{\sum\sum}_{0<m,n\le N}e(f(m)-f(n))=\sum_{|\ell|<N}\ \sum_{0<n,\ n+\ell\le N}e(f(\ell+n)-f(n)).$$
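The rearrangement of $|S_f(N)|^2$ into a sum over differences $\ell=m-n$ is an exact identity, as the following sketch confirms for a sample cubic. The polynomial and the function names are illustrative choices of ours.

```python
import cmath
import math


def e(x: float) -> complex:
    return cmath.exp(2j * math.pi * x)


def weyl_sum(f, N: int) -> complex:
    return sum(e(f(n)) for n in range(1, N + 1))


def differenced_sum(f, N: int) -> complex:
    """Sum over |l| < N and 0 < n, n + l <= N of e(f(n + l) - f(n))."""
    total = 0j
    for l in range(-N + 1, N):
        lo, hi = max(1, 1 - l), min(N, N - l)
        total += sum(e(f(n + l) - f(n)) for n in range(lo, hi + 1))
    return total


f = lambda x: math.sqrt(2) * x ** 3 + 0.3 * x ** 2 + 0.7 * x   # illustrative cubic
N = 40
S = weyl_sum(f, N)
D = differenced_sum(f, N)
print(abs(S) ** 2, D)
```

For each fixed $\ell\ne 0$ the inner amplitude $f(\ell+n)-f(n)$ is a polynomial in n of degree one less, which is what makes the process iterable.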

If f is a polynomial of degree k, then $g(x)=f(\ell+x)-f(x)$ is of degree k - 1 (for any $\ell\ne 0$). By repeated application of the above "differencing process" we ultimately arrive at a Weyl sum of degree one for which we can use the non-trivial bound (8.6). Precisely we derive by induction

PROPOSITION 8.2. If $f(x)=\alpha x^k+\cdots$ with $k\ge 1$, then

$$|S_f(N)|\le 2N\Big\{N^{-k}\sum_{-N<\ell_1<N}\cdots\sum_{-N<\ell_{k-1}<N}\min(N,\|\alpha k!\ell_1\cdots\ell_{k-1}\|^{-1})\Big\}^{2^{1-k}}.$$
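The bound of Proposition 8.2 can be evaluated directly for small degrees. In the sketch below (our own function names, arbitrary sample polynomials) the multiple sum over $\ell_1,\dots,\ell_{k-1}$ is computed by brute force, so it is practical only for small k and N.

```python
import cmath
import math
from itertools import product


def e(x: float) -> complex:
    return cmath.exp(2j * math.pi * x)


def dist_to_integer(x: float) -> float:
    return abs(x - round(x))


def weyl_sum(f, N: int) -> complex:
    return sum(e(f(n)) for n in range(1, N + 1))


def prop_8_2_bound(alpha: float, k: int, N: int) -> float:
    """2N { N^-k sum over |l_i| < N of min(N, ||alpha k! l_1...l_{k-1}||^-1) }^(2^(1-k))."""
    total = 0.0
    for ls in product(range(-N + 1, N), repeat=k - 1):
        d = dist_to_integer(alpha * math.factorial(k) * math.prod(ls))
        total += N if d * N <= 1 else 1 / d     # min(N, 1/d) without dividing by zero
    return 2 * N * (total / N ** k) ** (2.0 ** (1 - k))


alpha, N = math.sqrt(3) - 1, 20
for k in (1, 2, 3):
    f = lambda x, k=k: alpha * x ** k + 0.4 * x ** (k - 1)   # leading coefficient alpha
    print(k, abs(weyl_sum(f, N)), prop_8_2_bound(alpha, k, N))
```

For k = 1 the product over an empty set of variables equals 1, so the bound reduces to $2\min(N,\|\alpha\|^{-1})$.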
PROOF. For k = 1 this bound is interpreted as that in (8.6). Moreover, we have already proved this result for k = 2 in (8.8). Suppose the result is true for $k\ge 2$. Let $f(x)=\alpha x^{k+1}+\cdots$, then $g(x)=f(\ell+x)-f(x)=\alpha(k+1)\ell x^k+\cdots$, therefore

$$|S_f(N)|^2\le\sum_{|\ell|<N}\Big|\sum_{0<n,\ n+\ell\le N}e(\alpha(k+1)\ell n^k+\cdots)\Big|\le 2N\sum_{|\ell|<N}\Big\{N^{-k}\sum_{-N<\ell_1<N}\cdots\sum_{-N<\ell_{k-1}<N}\min(N,\|\alpha(k+1)!\ell_1\cdots\ell_{k-1}\ell\|^{-1})\Big\}^{2^{1-k}}$$

by the induction hypothesis. Hence by Hölder's inequality we get

$$|S_f(N)|^2\le 4N^2\Big\{N^{-k-1}\sum_{-N<\ell_1<N}\cdots\sum_{-N<\ell_k<N}\min(N,\|\alpha(k+1)!\ell_1\cdots\ell_k\|^{-1})\Big\}^{2^{1-k}}.$$

This yields the result for k + 1. □

Next we use Proposition 8.2 to estimate sums $S_f(M,N)$ for f which can be well approximated by polynomials. Let $k\ge 2$. Suppose

(8.15)
$$\frac{x^k}{k!}|f^{(k)}(x)|\le F$$

in $M\le x\le M+N$. Then we have f(x + M) = p(x) + r(x), where

$$p(x)=f(M)+xf'(M)+\cdots+\frac{x^{k-1}}{(k-1)!}f^{(k-1)}(M)$$

and the derivative of r(x) satisfies

$$r'(x)=\frac{x^{k-1}}{(k-1)!}f^{(k)}(y)$$

for some $M\le y\le M+N$. Therefore by (8.5) we obtain

$$|S_f(M,N)|\le(1+2\pi FN^kM^{-k})|S_p(N')|$$

for some $N'\le N$. This estimate is reasonably good for short sums $S_f(M,N)$, i.e., if N is much smaller than M, because we can make the factor $1+2\pi FN^kM^{-k}$ bounded by choosing k which is not very large. Applying Proposition 8.2 for the Weyl sum $S_p(N')$ we get


COROLLARY 8.3. Let f(x) be a smooth function in the interval [M, M + N] in which it satisfies (8.15). Then

$$|S_f(M,N)|\le 2N(1+2\pi FN^kM^{-k})V^{2^{2-k}}$$

where

$$V=N^{1-k}\sum_{-N<\ell_1<N}\cdots\sum_{-N<\ell_{k-2}<N}\min(N,\|f^{(k-1)}(M)\ell_1\cdots\ell_{k-2}\|^{-1}).$$
We shall use the above results to estimate $S_f(M,M')$ for any M' with $1\le M'\le M$. To this end we assume that (8.15) holds in the whole interval $M\le x\le 2M$. We choose $1\le N\le M$ and split $S_f(M,M')$ into shorter sums of length N getting

ISt(M, Jl./')1 ~

L

+ jN, N)l + 2N

ISt(M

O~j
where J = [111 / N]. Hence by Corollary 8.3 and Holder's inequality we obtain

IS1 (M,M')I ~2N(1+27rFNkM-k)

v:t-• +2N

I: O~j
~2M(1+27rFNkl\I-k)W 22 -k +2N

where Vj are the corresponding sums associated with the points 111 is the mean-value of these sums

W=} L

+ jM

and W

vi

O~j
L

= J-1N1-k

Gathering terms according to the product £1 •.. trivially we arrive at

W~k2kN- +2kN 1 -k 1

L l~r
CrF

1

L

fk-2 =

r and for r = 0 estimating

miu(N,I\r/(k- 1 )(1\f+jN)\1-l)

O~j
where R = Nk- 2 and Cr denotes the number of representations r = £1 ... fk_ 1 with 1 ~ £1, ... , fk-1 < N. For each r we consider separately the inner sum

where Yi = r J(k- )(11J + jN). Note that IYil ~ rk!M 1 -k F = Y, say. We want the points Yi to be well spaced, and for this reason we require a lower bound for the derivative off of order k throughout the whole segment [111,2111). Precisely from now on we assume that 1

(8.16)


with A ~ 1 for AI ,:; x ,:; 2/o.!. Then by the mean-value theorem we deduce that IY1'- Y1l ~ t.lj'- il where t. = rk!(2M)-kNFA- 1 Now it is clear that U,:;

1

L L

J

min(N, IY1- ul-

1

)

lui,;YO,;j
2 ,:; Y

+

1

1

(N

+

L O
-~),:;

(2Y + r

1

A

N

)(N + 2t.- 1 log311!).

J

This yields

FN W,:; 4k'(4log3M)k(Mk

+N+ M +

All!k-l FNk-1 ).

Finally

Nk

S 1 (M,M')

FN

A

N

« ( 1 + F Jo.!k) (Mk + N + M +

AJI!k-l 2 2 -k FNk- 1 ) Mlog3M

where the implied constant is absolute. This holds for any N with $1\le N\le M$. Suppose $M^k\ge F\ge 1$, so we may take $N=[MF^{-1/k}]$ giving

THEOREM 8.4. Let $k\ge 2$. Suppose f satisfies (8.16) in the segment [M, 2M]. Then for $1\le M'\le M$ we have

(8.17)

where the implied constant is absolute.

REMARK. At the end of the proof of (8.17) we assumed that $M^k\ge F\ge 1$, however this condition is not necessary because otherwise the result is trivial.

COROLLARY 8.5. Suppose f(x) satisfies (8.16) for k = 2 and k = 3 in the segment [M, 2M]. Then for $1\le M'\le M\le F$ we have

(8.18)
$$S_f(M,M')\ll AF^{\frac16}M^{\frac12}\log 3M$$

where the implied constant is absolute.

PROOF. This follows by taking the minimum of the two bounds (8.17) for k = 2 and k = 3. □

Note that (8.18) is trivial if $M\le F^{1/3}$. However, for any M which is comparable with F in the logarithmic scale Theorem 8.4 yields a non-trivial bound by choosing k appropriately. For example, choosing $k=[2\log F/\log M]$ we derive

COROLLARY 8.6. Suppose f(x) is smooth in [M, 2M] and all its derivatives satisfy

(8.19)

with some $A\ge 1$. Then for $1\le M'\le M\le F$ we have

(8.20)

where $\gamma=\log F/\log M$ and the implied constant is absolute.

8. EXPONENTIAL SUMS

204

As applications of the above results we derive estimates for the Riemann zeta function on the lines $\sigma = \tfrac12$ and $\sigma = 1$. For $t \ge 3$ we have

(8.21)    $\zeta(\sigma + it) = \sum_{n \le t} n^{-\sigma - it} + O(t^{-\sigma}).$

Hence we derive by (8.18) the following subconvexity bound

(8.22)    $\zeta(\tfrac12 + it) \ll t^{1/6}(\log t)^2.$

Moreover, we derive by (8.20) the following approximation:

(8.23)    $\zeta(1 + it) = \sum_{n \le y} n^{-1-it} + O(e^{-\sqrt{\log t}})$

with $y = t^{3/\log\log t}$, which yields by trivial estimation

(8.24)    $\zeta(1 + it) \ll \frac{\log t}{\log\log t}.$

8.3. Van der Corput method. This method emerged from two papers of 1921 and 1922 by J. G. van der Corput [Cor1], [Cor2]. Briefly speaking, the novelty consists in replacing the sum by integrals and estimating the latter. This is executed by an application of the Poisson summation formula followed by an asymptotic evaluation of the resulting Fourier integrals (by using various techniques such as stationary phase). One arrives at an approximate equation for two sums of different length; this transformation is called the B-process. The amplitude functions in both sums may change shape but not size, and they have comparable derivatives. Next these amplitude functions are reduced by a differencing process which is a refinement of that used in Weyl's method; this routine is called the A-process. The final estimate for the exponential sum $S_f(M, N)$ is obtained by repeated application of the two processes.

Now we proceed to the first step of the van der Corput method, which is a truncated version of the Poisson formula

(8.25)

PROPOSITION 8.7. Let $f(x)$ be a real function with $f''(x) > 0$ on the interval $[a, b]$. We then have

(8.26)    $\sum_{a<n<b} e(f(n)) = \sum_{\alpha-\varepsilon<m<\beta+\varepsilon} \int_a^b e(f(x) - mx)\,dx + O(\varepsilon^{-1} + \log(\beta - \alpha + 2))$

where $\alpha, \beta, \varepsilon$ are any numbers with $\alpha \le f'(a) \le f'(b) \le \beta$ and $0 < \varepsilon \le 1$, the implied constant being absolute.

PROOF. Let $\mathcal{M}$ stand for the interval $[\alpha - \varepsilon, \beta + \varepsilon]$. First we give the trivial bound

(8.27)    $\int_a^b \Bigl|\sum_{m\in\mathcal{M}} e(-mx)\Bigr|\,dx \le 2(b-a+1)\int_0^{1/2}\Bigl|\sum_{m\in\mathcal{M}} e(-mx)\Bigr|\,dx \le 2(b-a+1)\int_0^1 \min(\beta-\alpha+2, x^{-1})\,dx \ll (b-a+1)\log(\beta-\alpha+2).$

This proves (8.26) if $0 < b - a < 2$. Therefore from now on we assume that $b - a \ge 2$. We apply the Poisson formula to the tempered sum

$\sum_{a<n<b} g(n)e(f(n)) = \sum_m \int_a^b g(x)e(f(x) - mx)\,dx,$

where $g(x)$ is a function which planes the edges, precisely $g(x) = \min(x - a, 1, b - x)$. This function is introduced temporarily for convergence of the sum of the Fourier integrals and will be removed later. For $m \notin \mathcal{M}$ we write by partial integration

$2\pi i\int_a^b g(x)e(f(x)-mx)\,dx = -\int_a^b \Bigl(\frac{g(x)}{f'(x)-m}\Bigr)' e(f(x)-mx)\,dx$

$= \Bigl(\int_{b-1}^b - \int_a^{a+1}\Bigr)\frac{e(f(x)-mx)}{f'(x)-m}\,dx + \int_a^b \frac{g(x)f''(x)}{(f'(x)-m)^2}\,e(f(x)-mx)\,dx$

$= T_b(m) - T_a(m) + T_{ab}(m),$

say. For $T_{ab}(m)$ we have

For $T_a(m)$ we have

Integrating by parts once more we obtain another bound

$|T_a(m)| \le (f'(a+1)-m)^{-2} + (f'(a)-m)^{-2} + \int_a^{a+1}\bigl|d(f'(x)-m)^{-2}\bigr|$

$= 2\max\{(f'(a+1)-m)^{-2}, (f'(a)-m)^{-2}\} \le 2(\alpha-m)^{-2} + 2(\beta-m)^{-2}.$

The same bounds hold for $T_b(m)$. Combining these bounds we derive

$|T_a(m)| + |T_b(m)| + |T_{ab}(m)| \ll \frac{\beta-\alpha}{(\beta-m)(\alpha-m)}$

for any $m \notin \mathcal{M}$. Hence we conclude that

$\sum_{a<n<b} g(n)e(f(n)) = \sum_{m\in\mathcal{M}}\int_a^b g(x)e(f(x)-mx)\,dx + O(\varepsilon^{-1} + \log(\beta-\alpha+2)).$

It remains to remove the weights $g(n)$ and $g(x)$. The correction on the left side is $O(1)$ and on the right side it is $O(\log(\beta - \alpha + 2))$ by (8.27) for $b = a + 1$. This completes the proof of (8.26). □

EXERCISE 1. Derive (8.26) from the Euler-Maclaurin formula (4.7) using (4.18) or directly from (4.21).

It is easy to generalize (8.26) for a weighted sum as follows:

(8.28)    $\sum_{a<n<b} g(n)e(f(n)) = \sum_{\alpha-\varepsilon<m<\beta+\varepsilon}\int_a^b g(x)e(f(x)-mx)\,dx + O\bigl(G(\varepsilon^{-1} + \log(\beta-\alpha+2))\bigr)$

where $g$ is any smooth function on $[a, b]$ and

(8.29)    $G = |g(b)| + \int_a^b |g'(y)|\,dy.$

For the proof put

$\rho(y) = \sum_{a<n<y} e(f(n)) - \sum_{\alpha-\varepsilon<m<\beta+\varepsilon}\int_a^y e(f(x)-mx)\,dx.$

Thus $\rho(y) \ll \varepsilon^{-1} + \log(\beta - \alpha + 2)$ by virtue of (8.26). Now applying partial summation we obtain (8.28) with the error term

$\int_a^b g(y)\,d\rho(y) = \rho(b)g(b) - \int_a^b \rho(y)g'(y)\,dy$

which is bounded by $G(\varepsilon^{-1} + \log(\beta - \alpha + 2))$ as claimed. In a special case (8.28) simplifies to

LEMMA 8.8. Let $f(x)$ be a real function with $|f'(x)| \le 1 - \varepsilon$ and $f''(x) \ne 0$ on $[a, b]$. We then have

(8.30)    $\sum_{a<n<b} g(n)e(f(n)) = \int_a^b g(x)e(f(x))\,dx + O(G\varepsilon^{-1})$

where $G$ is given by (8.29) and the implied constant is absolute.

Our next task is to evaluate the exponential integrals in (8.26). We begin by proving simple estimates for

(8.31)    $I_f(a, b) = \int_a^b e(f(x))\,dx$

where $f(x)$ is a real, smooth function on $[a, b]$. Notice that the trivial bound $|I_f(a, b)| \le b - a$ cannot be improved substantially if $f^{(k)} \ll (b - a)^{-k}$ for all $k \ge 1$, because such a function is more or less constant. However, if one of the derivatives is somewhat larger, then our result will be non-trivial.

LEMMA 8.9. Suppose $f'(x)f''(x) \ne 0$ on $[a, b]$. Then

(8.32)

PROOF. We can assume without loss of generality that $f''(x) > 0$. Then by partial integration we get

$2\pi i\,I_f(a, b) = \frac{e(f(b))}{f'(b)} - \frac{e(f(a))}{f'(a)} - \int_a^b e(f(x))\,d\Bigl(\frac{1}{f'(x)}\Bigr).$

Hence

$2\pi|I_f(a, b)| \le \frac{1}{|f'(b)|} + \frac{1}{|f'(a)|} + \frac{1}{f'(a)} - \frac{1}{f'(b)}$

which is better than what is claimed. □

LEMMA 8.10. Suppose that for some $k \ge 1$ we have

(8.33)    $|f^{(k)}(x)| \ge \Lambda$

for any $x$ on $[a, b]$ with $\Lambda > 0$. Then

(8.34)    $|I_f(a, b)| \le k2^k\Lambda^{-1/k}.$

PROOF. We apply induction with respect to $k$. For $k = 1$ the result follows from (8.32). Suppose that $f^{(k+1)} \ge \Lambda > 0$ on the whole interval $[a, b]$. Thus $f^{(k)}$ is increasing, so it may have at most one zero in $[a, b]$, say $c$ if it exists. If $f^{(k)} \ne 0$ on $[a, b]$, we still define $c$ by putting $c = a$ or $c = b$ according to whether $f^{(k)} > 0$ or $f^{(k)} < 0$. Let $\delta$ be a positive number to be chosen later. We put $a_1 = \max(a, c - \delta)$ and $b_1 = \min(c + \delta, b)$ and we split

$I_f(a, b) = I_f(a, a_1) + I_f(a_1, b_1) + I_f(b_1, b).$

For the integral over the middle segment we use the trivial bound

$|I_f(a_1, b_1)| \le b_1 - a_1 \le 2\delta.$

To estimate the integral over the left segment $[a, a_1]$ we may assume that $a_1 > a$, that is $a_1 = c - \delta > a$, or else we get nothing. Now we verify that for $a \le x \le a_1$,

$-f^{(k)}(x) = \int_x^c f^{(k+1)}(y)\,dy - f^{(k)}(c) \ge \int_x^c f^{(k+1)}(y)\,dy \ge (c - x)\Lambda \ge \delta\Lambda > 0.$

Hence by the induction hypothesis

$|I_f(a, a_1)| \le k2^k(\delta\Lambda)^{-1/k}.$

The same bound holds true for the integral over the right segment $[b_1, b]$. Adding these three bounds we get

$|I_f(a, b)| \le 2\delta + 2k2^k(\delta\Lambda)^{-1/k}.$

This is true for any $\delta > 0$. On putting $\delta = \Lambda^{-1/(k+1)}$ we complete the proof of (8.34) for $k + 1$. □
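The second-derivative case of this estimate can be checked numerically. The following is only an illustrative sketch, not part of the original argument: it assumes the model phase $f(x) = \Lambda x^2/2$ (so that $f'' = \Lambda$), approximates $I_f(a, b)$ by Simpson's rule, and compares the result with $8\Lambda^{-1/2}$, the value of $k2^k\Lambda^{-1/k}$ at $k = 2$. The function names and the choice of quadrature are assumptions made for the example.

```python
import cmath
import math


def e(x):
    """e(x) = exp(2*pi*i*x), the notation used throughout the chapter."""
    return cmath.exp(2j * math.pi * x)


def exp_integral(f, a, b, steps=100_000):
    """Composite Simpson approximation of I_f(a, b) = int_a^b e(f(x)) dx."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / steps
    total = e(f(a)) + e(f(b))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * e(f(a + i * h))
    return total * h / 3


def second_derivative_bound(lam):
    """The value k * 2**k * lam**(-1/k) for k = 2."""
    return 8 / math.sqrt(lam)


if __name__ == "__main__":
    for lam in (0.5, 2.0, 10.0):
        # Model phase with constant second derivative lam (an assumption).
        f = lambda x, lam=lam: 0.5 * lam * x * x
        print(lam, abs(exp_integral(f, -3.0, 5.0)), second_derivative_bound(lam))
```

In this model case the integral is a Fresnel integral, so its size is of order $\Lambda^{-1/2}$ however long the interval is, which is the behaviour the lemma predicts.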

Combining Lemmas 8.8 and 8.9 we obtain

COROLLARY 8.11. Let $f(x)$ be a real function with $\theta \le |f'(x)| \le 1 - \theta$ and $f''(x) \ne 0$ on $[a, b]$. Then

(8.35)    $\sum_{a<n<b} g(n)e(f(n)) \ll G\theta^{-1}$

where $G$ is given by (8.29) and the implied constant is absolute.

EXERCISE 2. Prove that for any real numbers $f_n$ satisfying $\theta \le f_n - f_{n-1} \le f_{n+1} - f_n \le 1 - \theta$ we have

(8.36)    $\Bigl|\sum_n e(f_n)\Bigr| \le \cot\frac{\pi\theta}{2}.$

Combining Proposition 8.7 with Lemma 8.10 for $k = 2$ we derive

COROLLARY 8.12. Let $b - a \ge 1$. Let $f(x)$ be a real function on $[a, b]$ with $f''(x) \ge \Lambda > 0$ on $[a, b]$. Then

(8.37)    $\sum_{a<n<b} e(f(n)) \ll (f'(b) - f'(a) + 1)\Lambda^{-1/2}.$

PROOF. By Lemma 8.10 the integrals in (8.26) are bounded by $8\Lambda^{-1/2}$ and the number of these integrals is less than $f'(b) - f'(a) + 1$ (take $\alpha = f'(a)$, $\beta = f'(b)$ and $\varepsilon = \tfrac12$), hence (8.37) follows. □

Note that $f'(b) - f'(a) = (b - a)f''(y)$ for some $y$ with $a \le y \le b$, thus (8.37) implies

COROLLARY 8.13. Let $b - a \ge 1$. Let $f(x)$ be a real function on $[a, b]$ such that $\Lambda \le f''(x) \le \eta\Lambda$ with $\Lambda > 0$, $\eta \ge 1$. Then

(8.38)    $\sum_{a<n<b} e(f(n)) \ll \eta\Lambda^{1/2}(b - a) + \Lambda^{-1/2}.$
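A quick numerical illustration of the second-derivative bound, offered as a sketch under stated assumptions rather than as part of the text: take $f(x) = x^2/N$ on $[N, 2N]$, so that $f'' = 2/N$ is constant and one may use $\Lambda = 2/N$, $\eta = 1$. The sum is then an incomplete quadratic Gauss sum, of size about $\sqrt{N}$, and the quantity $\eta\Lambda^{1/2}(b-a) + \Lambda^{-1/2}$ is of the same order. The helper names below are hypothetical, and no implied constant is included.

```python
import cmath
import math


def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def quadratic_sum(N):
    """S = sum over N < n < 2N of e(n^2 / N)."""
    # Reduce n*n modulo N in integer arithmetic so the phase stays accurate.
    return sum(e((n * n % N) / N) for n in range(N + 1, 2 * N))


def second_derivative_shape(N):
    """eta * Lambda^(1/2) * (b - a) + Lambda^(-1/2) with a = N, b = 2N,
    Lambda = 2/N and eta = 1 (the implied constant is omitted)."""
    lam = 2.0 / N
    return math.sqrt(lam) * N + 1 / math.sqrt(lam)


if __name__ == "__main__":
    for N in (101, 1000, 4096):
        print(N, abs(quadratic_sum(N)), second_derivative_shape(N))
```

For this particular phase the trivial bound would be about $N$, so the printed values show the square-root saving that the corollary describes.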

Now we are going to establish an approximate formula for the exponential integrals (8.31) which is more precise than the estimate (8.34). We begin by a special case

LEMMA 8.14. Let $h(x)$ be a real function on $[0, X]$ such that $h(0) = 1$,

(8.39)    $h(x) \gg 1, \quad (xh(x))' \gg 1,$

(8.40)    $h'(x) \ll X^{-1}, \quad h''(x) \ll X^{-2}.$

Then for $\alpha > 0$ we have

(8.41)    $\int_0^X e(\alpha x^2h(x))\,dx = \frac{e(\frac18)}{\sqrt{8\alpha}} + O\Bigl(\frac{1}{\alpha X}\Bigr)$

where the constant implied in $O$ depends only on the constant implied in $\gg$ of (8.39) and that in $\ll$ of (8.40).

PROOF. For notational simplicity we set $h(x) = g^2(x)$. Clearly $g(x)$ satisfies all the conditions of $h(x)$. We change the variable $x^2g^2(x) = t$ to obtain

$\int_0^X e(\alpha x^2g^2(x))\,dx = \int_0^T e(\alpha t)\,dt^{1/2} - \int_0^T e(\alpha t)f(x)\,dt$

where $T = (Xg(X))^2$ and

$f(x) = \frac{(xg(x))' - 1}{2xg(x)(xg(x))'}.$

Using the conditions (8.39), (8.40) for $g(x)$ and the Taylor expansions

$g(x) = 1 + xg'(0) + O(x^2X^{-2}), \quad g'(x) = g'(0) + O(xX^{-2})$

we deduce that $f(x) \ll X^{-1}$ and $f'(x) \ll X^{-2}$. Hence we obtain by partial integration

$2\pi i\alpha\int_0^T e(\alpha t)f(x)\,dt = \int_0^T f(x)\,de(\alpha t) = e(\alpha T)f(X) - f(0) - \int_0^T e(\alpha t)f'(x)\,dx(t) \ll X^{-1} + \int_0^X |f'(x)|\,dx \ll X^{-1}$

because $x'(t)$ is positive. This bound is absorbed by the error term in (8.41). Next we extend the first integral to

$\int_0^\infty e(\alpha t)\,dt^{1/2} = \frac{e(\frac18)}{\sqrt{8\alpha}},$

and we estimate the excess by partial integration as follows:

$2\pi i\alpha\int_T^\infty e(\alpha t)\,dt^{1/2} = -\tfrac12T^{-1/2}e(\alpha T) - \tfrac12\int_T^\infty e(\alpha t)\,dt^{-1/2} \ll T^{-1/2} = (Xg(X))^{-1} \ll X^{-1}.$

This bound is also absorbed by the error term in (8.41) completing the proof. □

COROLLARY 8.15. Let $f(x)$ be a real function on $[a, b]$ such that

(8.42)    $f''(x) \ge \Lambda,$

(8.43)    $|f^{(3)}(x)| \le \Lambda X^{-1}, \quad |f^{(4)}(x)| \le \Lambda X^{-2}$

for some $\Lambda > 0$ and $X > 0$. Suppose $f'(c) = 0$ at some point $c$ in $(a, b)$. Then we have

(8.44)

PROOF. We write

$I_f(a, b) = \int_a^b e(f(x))\,dx = \int_0^{b-c} e(f(c + x))\,dx + \int_0^{c-a} e(f(c - x))\,dx.$

By Taylor's expansion we have f(c + x) = f(c) and h(x) is a function which satisfies

+ ax 2 h(x),

x , 2x h(x)) 1- X, (xh(x)) ?: 1- X-

3

3

say, where a= !f"(c)

x2 x 12 2

and $h'(x) \ll X^{-1}$, $h''(x) \ll X^{-2}$. These estimates follow from (8.42), (8.43). Hence the conditions of Lemma 8.14 are satisfied in the interval $0 \le x \le Y$ with $Y = \min(b - c, X)$ getting

$\int_0^Y e(f(c + x))\,dx = e\bigl(f(c) + \tfrac18\bigr)(4f''(c))^{-1/2} + O\Bigl(\frac{1}{\Lambda Y}\Bigr).$

For the remaining part of the integral we apply Lemma 8.9 getting

$\int_Y^{b-c} e(f(c + x))\,dx \ll |f'(c + Y)|^{-1} + |f'(b)|^{-1} \ll (\Lambda Y)^{-1}.$

From both estimates we obtain

Similarly we evaluate the integral over the segment $[a, c]$ completing the proof of (8.44). □

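Now we are ready to show the main result of the van der Corput method

THEOREM 8.16. Let $f(x)$ be a real function on $[a, b]$ whose derivatives satisfy the following conditions: $\Lambda \le f'' \le \eta\Lambda$, $|f^{(3)}| \le \eta\Lambda(b - a)^{-1}$, $|f^{(4)}| \le \eta\Lambda(b - a)^{-2}$ for some $\Lambda > 0$ and $\eta \ge 1$. Then we have

(8.45)    $\sum_{a<n<b} e(f(n)) = \sum_{\alpha<m<\beta} f''(x_m)^{-1/2}\,e\bigl(f(x_m) - mx_m + \tfrac18\bigr) + R_f(a, b)$

where $\alpha = f'(a)$, $\beta = f'(b)$ and $x_m$ is the unique solution to $f'(x) = m$. Here $R_f(a, b)$ is considered as an error term; it satisfies

(8.46)    $R_f(a, b) \ll \Lambda^{-1/2} + \eta^2\log(\beta - \alpha + 1).$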
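PROOF. We apply (8.26) with $\varepsilon = \tfrac12$, $\alpha = f'(a)$ and $\beta = f'(b)$. For $m$ with $\alpha + \varepsilon < m < \beta - \varepsilon$ (note this range is void if $\beta - \alpha \le 1$) we evaluate the involved exponential integrals by (8.44) getting

$\int_a^b e(f(x) - mx)\,dx = e\bigl(f(x_m) - mx_m + \tfrac18\bigr)f''(x_m)^{-1/2} + O\Bigl(\frac{\eta}{\Lambda}\Bigl(\frac{1}{b - x_m} + \frac{1}{x_m - a}\Bigr)\Bigr).$

Here we have $\eta\Lambda(b - x_m) \ge f'(b) - f'(x_m) = \beta - m$ and $\eta\Lambda(x_m - a) \ge f'(x_m) - f'(a) = m - \alpha$, therefore the error terms contribute at most

$\eta^2\sum_{\alpha+\varepsilon<m<\beta-\varepsilon}\Bigl(\frac{1}{\beta - m} + \frac{1}{m - \alpha}\Bigr) \ll \eta^2\log(\beta - \alpha + 1).$

For the two missing integrals with $\alpha - \varepsilon < m \le \alpha + \varepsilon$ and $\beta - \varepsilon \le m < \beta + \varepsilon$ we use the bound $8\Lambda^{-1/2}$ which follows from (8.34) with $k = 2$, completing the proof of (8.45). □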

REMARKS. In general the bound (8.46) cannot be significantly improved because it is almost as small as the individual terms in both sums (8.45).

EXERCISE 3. Using partial summation derive from (8.45) the following formula for a weighted sum

(8.47)

L

g(n)e(f(n)) = o
where $g$ is any smooth function on $[a, b]$ and $G$ is given by (8.29).

EXAMPLE. Let $X > 0$, $N > 0$ and $\alpha > 1$, $\nu > 1$. We then have

(8.48) 1

2.: (;) 2 e(~C~r)

1

=

N
2.: (~) 2 e(~- ~cr;t) (3

M
I

+ O(N-21og(N + 2) + M-2!og(M + 2)) where ~

+ ~ = 1, J.l/3 = v"'

and lv! N

= X,

the implied constant depending only on $\alpha$, $\nu$.

The approximate formula (8.47) (which constitutes the B-process of van der Corput) transforms an exponential sum of amplitude $f(x)$ to an exponential sum of amplitude $h(y)$ related by

(8.49)    $h(y) = f(x) - xy, \quad y = f'(x).$

Thus $f$ and $h$ have essentially the same size. However, the length of the sums changes from $b - a$ to $\beta - \alpha = f'(b) - f'(a)$. Note that the operation (8.47) is involutory, so it would be useless to apply it two times in a row. It is recommended to apply the B-process in the direction which reduces the length of the exponential sum. Then on the shorter side one can estimate trivially (getting for example (8.24)), but it is not necessary to terminate in this way. A new possibility is offered by a differencing process which reduces the amplitude function, but does not change the length of summation. This is called the A-process of Weyl-van der Corput. We begin by the following general inequality

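LEMMA 8.17. For any complex numbers $z_n$ we have

(8.50)    $\Bigl|\sum_{a<n<b} z_n\Bigr|^2 \le \Bigl(1 + \frac{b-a}{Q}\Bigr)\sum_{|q|<Q}\Bigl(1 - \frac{|q|}{Q}\Bigr)\sum_{a<n,\,n+q<b} z_{n+q}\bar z_n$

where $Q$ is any positive integer.

PROOF. Setting $z_n = 0$ if $n \notin (a, b)$ we obtain

$S = \sum_{a<n<b} z_n = \sum_n z_n = \sum_n z_{n+q}$

for any $q \in \mathbb{Z}$. Sum up this over $q$ with $0 \le q < Q$ getting

$QS = \sum_{a-Q+1<n<b}\ \sum_{0\le q<Q} z_{n+q}.$

Hence by Cauchy's inequality

$Q^2|S|^2 \le (b - a + Q)\sum_n\Bigl|\sum_{0\le q<Q} z_{n+q}\Bigr|^2 = (b - a + Q)\sum_n\sum_{0\le q_1,q_2<Q} z_{n+q_1}\bar z_{n+q_2} = (b - a + Q)\sum_{|q|<Q}\nu(q)\sum_n z_{n+q}\bar z_n$

where $\nu(q)$ is the number of integers $0 \le q_1, q_2 < Q$ with $q_1 - q_2 = q$, i.e., $\nu(q) = Q - |q|$. This completes the proof of (8.50). □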
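Since (8.50) is an exact inequality, it can be tested directly on random data. The sketch below is an illustration only: it assumes integer endpoints $a < b$, stores $z_{a+1}, \dots, z_{b-1}$ in a list, and evaluates both sides; the function name `weyl_vdc_sides` is made up for the example.

```python
import cmath
import math
import random


def weyl_vdc_sides(z, Q):
    """Return (lhs, rhs) of (8.50), where z = [z_{a+1}, ..., z_{b-1}]
    lists the terms with a < n < b and Q is a positive integer."""
    length = len(z)            # number of integers in (a, b)
    b_minus_a = length + 1     # integer endpoints are assumed
    lhs = abs(sum(z)) ** 2
    rhs = 0.0
    for q in range(-Q + 1, Q):
        # Inner sum over n with both n and n + q inside (a, b).
        inner = sum(z[i + q] * z[i].conjugate()
                    for i in range(max(0, -q), min(length, length - q)))
        # The terms for q and -q are complex conjugates, so only the
        # real parts survive in the total.
        rhs += (1 - abs(q) / Q) * inner.real
    return lhs, (1 + b_minus_a / Q) * rhs


if __name__ == "__main__":
    random.seed(0)
    z = [cmath.exp(2j * math.pi * random.random()) for _ in range(200)]
    for Q in (1, 5, 40):
        print(Q, weyl_vdc_sides(z, Q))
```

For $Q = 1$ the right side reduces to the Cauchy-Schwarz bound, while larger $Q$ brings in the shifted correlations $\sum z_{n+q}\bar z_n$ that the differencing process exploits.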

Taking $z_n = e(f(n))$ we obtain the differencing process

PROPOSITION 8.18. Let $f(x)$ be a real function in $(a, b)$ and $Q$ a positive integer. We have

(8.51)    $\Bigl|\sum_{a<n<b} e(f(n))\Bigr|^2 \le \Bigl(1 + \frac{b-a}{Q}\Bigr)\sum_{|q|<Q}\Bigl(1 - \frac{|q|}{Q}\Bigr)\sum_{a(q)<n<b(q)} e(f(n+q) - f(n))$

where $a(q) = \max(a, a - q)$ and $b(q) = \min(b, b - q)$.

REMARKS. Note that $q = 0$ is always there, so the resulting bound for the original sum $\sum e(f(n))$ can never be better than $(b - a)Q^{-1/2}$, i.e., one cannot save more than $Q^{1/2}$ from the trivial bound $b - a$. In practice $Q$ is chosen to be much smaller than $b - a$, so the new amplitude function $h(x) = f(x + q) - f(x)$ has reduced derivatives, while $a(q)$ and $b(q)$ do not change much. Here the flexibility in choosing $Q$ is another innovation introduced by van der Corput to the original differencing method of Weyl.

ExERCISE 4. Prove that for 1 ~ Q ~ (b- a) 2 we have

(8.52)

L e(f(n))«(b-a)Q-~+(b-a)!l L e(h(n))l a
1

2

where h(x) = f(x + q) - f(x- q) for some 0 < q < Q and the implied constant is absolute [Hint: Use only even integers for Weyl shifts.]


By successive application of Proposition 8.18 and Corollary 8.13 we derive

COROLLARY 8.19 (VAN DER CORPUT). Let $b - a \ge 1$. Let $f(x)$ be a real function on $(a, b)$ such that $\Lambda \le f^{(3)}(x) \le \eta\Lambda$ where $\Lambda > 0$ and $\eta \ge 1$. Then

(8.53)    $\sum_{a<n<b} e(f(n)) \ll \eta^{1/2}\Lambda^{1/6}(b - a) + \Lambda^{-1/6}(b - a)^{1/2}$

where the implied constant is absolute.

PROOF. For $0 < q < Q$ and $a < x < b - q$ we have $h''(x) = f''(x + q) - f''(x) = qf^{(3)}(y)$ for some $y \in (a, b)$. Hence $\Lambda q \le h''(x) \le \eta\Lambda q$ and by (8.38)

$\sum_{a<n<b-q} e(f(n + q) - f(n)) \ll \eta(\Lambda q)^{1/2}(b - a) + (\Lambda q)^{-1/2}.$

A similar result is true for the sums with negative $q$. For $q = 0$ we have the trivial bound $b - a + 1$. Collecting these results we deduce by (8.51) that

$\Bigl|\sum_{a<n<b} e(f(n))\Bigr|^2 \ll \Bigl(1 + \frac{b-a}{Q}\Bigr)\Bigl\{b - a + \sum_{0<q<Q}\bigl(\eta\Lambda^{1/2}q^{1/2}(b - a) + (\Lambda q)^{-1/2}\bigr)\Bigr\}$

$\ll \Bigl(1 + \frac{b-a}{Q}\Bigr)\bigl(b - a + \eta\Lambda^{1/2}Q^{3/2}(b - a) + \Lambda^{-1/2}Q^{1/2}\bigr)$

for any positive integer $Q$. However, the above estimate remains valid for any positive real number $Q$ for obvious reasons. Taking $Q = \Lambda^{-1/3}$ we obtain (8.53) provided $\Lambda \ge (b - a)^{-3}$, and otherwise the result is trivial. □

The Weyl-van der Corput inequality (8.51) can be used repeatedly to reduce further the amplitude function until (8.38) becomes useful. This leads to the following generalization of (8.38) and (8.53).

THEOREM 8.20 (VAN DER CORPUT). Let $b - a \ge 1$. Let $f(x)$ be a real function on $(a, b)$ and $k \ge 2$ such that $\Lambda \le f^{(k)}(x) \le \eta\Lambda$, where $\Lambda > 0$ and $\eta \ge 1$. Then

(8.54)    $\sum_{a<n<b} e(f(n)) \ll \eta^{2^{2-k}}\Lambda^{\kappa}(b - a) + \Lambda^{-\kappa}(b - a)^{1-2^{2-k}}$

where $\kappa = (2^k - 2)^{-1}$ and the implied constant is absolute.

For large $k$ the bounds (8.17) of Weyl and (8.54) of van der Corput are comparable.

8.4. Discussion of exponent pairs. Throughout $S_f(N)$ denotes an exponential sum of type

$S_f(N) = \sum_{N<n<N'} e(f(n))$

with $N \le N' \le 2N$ and $f$ is a smooth function on $[N, 2N]$. We have established numerous estimates for $S_f(N)$ which depend essentially on the length $N$ and on the size of the amplitude function $f$. In this section we describe these results in a unified form and discuss possible extensions. For the sake of simplicity we drop some minor, yet necessary conditions, so our assertions are not strictly correct in this section. Rigorous assertions can be found in the book by S. W. Graham and G. Kolesnik [GK] as well as in the original papers by van der Corput [Cor1], [Cor2], E. Phillips [Ph] and R. A. Rankin [Ra1].

Suppose $f$ behaves like a monomial; precisely, we assume that the derivatives satisfy

(8.55)

for $N \le x \le 2N$ and any $j \ge 0$, where the implied constant depends on $j$. In particular, $|f(x)| \asymp F$ and $|f'(x)| \asymp FN^{-1}$. If $N$ is larger than $F$ by a sufficiently large factor, then Corollary 8.11 yields

This bound is the best possible, because in this case the sum $S_f(N)$ is well approximated by the corresponding integral (see Lemma 8.8). In what follows we stay out of this special case by assuming that

(8.56)

Inspired by many examples we postulate the existence of universal numbers

(8.57)    $0 \le p, q \le \tfrac12$

having the property that

(8.58)

for any exponential sum of length $N$ and the frequency function $f$ satisfying (8.55) and (8.56). Here $\varepsilon$ is any positive number and the implied constant depends only on $\varepsilon$ and on the sequence of constants implied in (8.55). We call $(p, q)$ an exponent pair (notice that our definition of $q$ differs by $\tfrac12$ from that in the literature [GK]). Clearly if $(p, q)$ is an exponent pair, then any pair of larger numbers form an exponent pair. The trivial estimate for $S_f(N)$ corresponds to the exponent pair

(8.59)    $(p, q) = (0, \tfrac12).$

The following statement is the most optimistic for this aspect of the theory of exponential sums, and is widely believed to be correct in its essentials (minor assumptions on $f$ might be required to avoid pathologies):

EXPONENT PAIR HYPOTHESIS. The pair $(p, q) = (0, 0)$ is an exponent pair. In other words, an exponential sum of type $S_f(N)$ with $f$, $N$ satisfying (8.55) and (8.56) should satisfy

(8.60)

This bound, if true for reasonable functions, would lead to solutions of many problems in analytic number theory. For example, it implies the Lindelöf Hypothesis $\zeta(\tfrac12 + it) \ll t^{\varepsilon}$ if $t \ge 1$.

Moreover, it would solve the Gauss circle problem and the Dirichlet divisor problem; namely it implies

$\sum_{n\le x} r(n) = \pi x + O(x^{1/4+\varepsilon}),$

$\sum_{n\le x} \tau(n) = x\log x + (2\gamma - 1)x + O(x^{1/4+\varepsilon}),$

where the exponents are best possible. Actually for these problems one only needs to know that (p, q) = (0, ~) is an exponent pair. However, our current knowledge is very poor. Corollary 8.13 tells us that

(8.61)    $(p, q) = (\tfrac12, 0)$

is an exponent pair. Clearly a linear convex combination of exponent pairs is again an exponent pair. Therefore (8.59) and (8.61) imply that $(p, q) = (\tfrac14, \tfrac14)$ is an exponent pair; however, we have already proved a stronger result (8.18) which can now be interpreted as saying that

(8.62)    $(p, q) = (\tfrac16, \tfrac16)$

is an exponent pair. Theorem 8.4 yields the exponent pairs

(8.63)    $(p, q) = \Bigl(\frac{4}{k2^k}, \frac12 - \frac{4(k-1)}{k2^k}\Bigr)$

and Theorem 8.20 yields

(8.64)    $(p, q) = \Bigl(\frac{1}{2^k - 2}, \frac12 - \frac{k-1}{2^k - 2}\Bigr)$

for any integer $k \ge 2$. Notice that as $k$ tends to infinity in both sequences, $p = p(k)$ tends to zero slightly faster than $q = q(k)$ tends to $\tfrac12$. This feature makes it possible to claim non-trivial bounds for the exponential sums $S_f(N)$ whose length $N$ is comparable with the amplitude $F$ in the logarithmic scale. In the next section we shall produce by Vinogradov's method a sequence of exponent pairs $(p, q)$ in which $p$ tends to zero much faster than $q$ tends to $\tfrac12$. So far we do not have an exponent pair with $p = 0$ and $q < \tfrac12$; any such example would mark a major breakthrough in the theory of exponential sums.

A continuum of exponent pairs can be created by alternating the A and B processes. By the Weyl-van der Corput differencing inequality (8.51) one derives

A-PROCESS. If $(p, q)$ is an exponent pair, so is

(8.65)    $A(p, q) = \Bigl(\frac{p}{2p + 2}, \frac{q + \frac12}{2p + 2}\Bigr).$

By the van der Corput approximate functional equation (8.45) one derives

B-PROCESS. If $(p, q)$ is an exponent pair, so is

(8.66)    $B(p, q) = (q, p).$
Here are some examples of exponent pairs generated from the trivial pair

For the purpose of estimating the Riemann zeta function on the critical line one needs an exponent pair $(p, q)$ with $p + q$ as small as possible since

(8.67)    $\zeta(\tfrac12 + it) \ll t^{\frac12(p+q)+\varepsilon}$, if $t \ge 1$.

In 1955 Rankin [Ra1] computed the minimum of $\theta = \tfrac12(p + q)$ over all exponent pairs generated by the A and B processes, namely $\theta_{\min} = 0.16451067\ldots$. In other applications of exponent pairs one seeks the minimum of a fractional linear function $\theta(p, q)$, that is, a quotient of two affine functions of $p$ and $q$. Having these applications in mind, in 1985 Graham gave a truly elegant algorithm for the solution to this general problem; see [GK]. Today there are known exponent pairs which cannot be obtained by the van der Corput method. For example, Huxley and Watt [HW] showed that

(8.68)    $(p, q) = (\tfrac{9}{56}, \tfrac{9}{56})$

is an exponent pair by elaborating the work of Bombieri and Iwaniec [BI]. Further improvements along these lines are given in [Hu4] and a very nice exposition is contained in [GK]. The Weyl shift (from $n$ to $n + q$) was introduced for the purpose of reducing the amplitude function (from $f(n)$ to $h(n) = f(n + q) - f(n)$). However, this shift also offers an additional variable of summation. One can take advantage of this by an appeal to the theory of two-dimensional exponent pairs, but the improvements are not impressive (in the context of the Riemann zeta function the results are still weaker than what follows from the Huxley-Watt exponent pair (8.68)).

8.5. Vinogradov's method. In the late thirties I. M. Vinogradov [Vi2], [Vi3] invented a new method for estimating exponential sums of Weyl's type which is very strong for sums of large amplitude (relative to the length). He spent a considerable part of his life massaging the results, so we have today several elegant versions due to him and his followers. Let $f(x)$ be a real smooth function in $[N, 2N]$. Our aim is to estimate exponential sums

(8.69)    $S_f(a, b) = \sum_{a<n<b} e(f(n))$

for any $a, b$ with $N \le a < b \le 2N$. We shall approach the problem in several distinct steps.

STEP I (small shift). We begin, as in Weyl's method, by shifting the summation

$S_f(a, b) = \sum_{a-q<n<b-q} e(f(n + q)).$
S 1(a,b) = a-q
We choose $q$ to be relatively small so we can approximate $f(n + q)$ by a polynomial in $q$ of reasonably small degree. Precisely we apply the Taylor expansion $f(n + q) = F_n(q) + R_n(q)$, where $F_n(q)$ is the polynomial

$F_n(q) = \sum_{0\le j\le k} \alpha_j(n)q^j$

with coefficients $\alpha_j(n) = f^{(j)}(n)/j!$ and $R_n(q)$ satisfies the bound

$|R_n(q)| \le q^{k+1}f_{(k+1)}/(k+1)!$ with $f_{(k+1)} = \max|f^{(k+1)}(x)|$.

Inserting this approximation we get

$S_f(a, b) = \sum_{a<n<b} e(F_n(q)) + \theta\bigl(2q + q^{k+1}f_{(k+1)}N\bigr)$

where $|\theta| \le 1$. Here the first error term $2q$ represents the trivial bound for exponential sums over two non-overlapping intervals and the second error term comes from the inequality $|e(R_n(q)) - 1| \le 2\pi|R_n(q)| \le q^{k+1}f_{(k+1)}$. We shall require all the implied constants in the forthcoming estimates to be absolute. This is important, even with respect to $k$, because for some applications (first of all for bounding $\zeta(1 + it)$) one uses $k$ depending on the length of summation and the size of the amplitude function.

At this point Vinogradov's approach diverges from that of Weyl. The first new idea is that now we are going to average over $q$ in a special subset, say $\mathcal{Q} \subset \{1, 2, \dots, Q\}$, rather than in the full set. We obtain

(8.70)    $S_f(a, b) = \sum_{a<n<b} |\mathcal{Q}|^{-1}\sum_{q\in\mathcal{Q}} e(F_n(q)) + O\bigl(Q + Q^{k+1}f_{(k+1)}N\bigr)$

where $|\mathcal{Q}|$ denotes the number of points in $\mathcal{Q}$. Though $F_n(q)$ is a polynomial in $q$ the innermost sum

$S_n = \sum_{q\in\mathcal{Q}} e(F_n(q))$

cannot be considered as Weyl's sum of degree $k$, because of restrictions for $\mathcal{Q}$ (to be specified soon). Moreover, we are not going to apply the idea of differencing by way of squaring the sum $S_f(a, b)$, but rather we proceed directly to estimation of each inner sum $S_n$ separately for $a < n < b$. In other words, we no longer need the variable $n$ until the last step, and even there, it is not essential. Since the variable $n$ plays a minor role in what follows we simplify the notation by considering the exponential sum

$S = \sum_{q\in\mathcal{Q}} e(F(q))$

for arbitrary polynomial with real coefficients

$F(q) = \sum_{0\le j\le k} \alpha_jq^j.$
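The Taylor step of Step I can be checked numerically on a concrete amplitude. The sketch below is an illustration under stated assumptions, not the text's own computation: it takes $f(x) = T\log x$ (an amplitude of the kind relevant for the zeta function), for which $f^{(j)}(x) = (-1)^{j-1}(j-1)!\,T x^{-j}$, and compares the remainder $f(n+q) - F_n(q)$ with $q^{k+1}\max|f^{(k+1)}|/(k+1)!$, the maximum being taken over $x \ge N$. The function names are hypothetical.

```python
import math


def remainder(T, n, q, k):
    """R_n(q) = f(n + q) - F_n(q) for f(x) = T*log(x), where F_n is the
    Taylor polynomial of degree k in q.  Since alpha_j(n) = f^(j)(n)/j!
    equals T*(-1)^(j-1)/(j*n^j), the remainder is
    T*(log(1 + q/n) - sum_{j<=k} (-1)^(j-1) (q/n)^j / j)."""
    x = q / n
    partial = sum((-1) ** (j - 1) * x ** j / j for j in range(1, k + 1))
    return T * (math.log1p(x) - partial)


def remainder_bound(T, N, q, k):
    """q^(k+1) * max|f^(k+1)| / (k+1)! with the maximum over x >= N;
    for f = T*log this is T * q^(k+1) / ((k+1) * N^(k+1))."""
    return T * q ** (k + 1) / ((k + 1) * N ** (k + 1))


if __name__ == "__main__":
    T, N, k = 1.0e6, 1000, 3
    for n in (1000, 1500, 2000):
        for q in (1, 5, 20):
            print(n, q, abs(remainder(T, n, q, k)), remainder_bound(T, N, q, k))
```

The printed values show how quickly the remainder grows with $q$, which is why the shift has to be kept small relative to $N$ for the polynomial approximation to be useful.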

Obviously $S$ depends on the fractional part of the coefficients $\alpha_j$. For this reason we shall exploit only the coefficients with $|\alpha_j| < 1$, because these can be easily controlled. On the other hand, the very small coefficients, specifically $|\alpha_j| < Q^{-j}$, do not make a variation in the argument of $e(F(q))$. Therefore only the middle terms of our polynomial $F(q)$ play a role, that is those with $Q^{-j} < |\alpha_j| < 1$; the other terms are wasted. In the final arguments we shall appeal to the original coefficients $\alpha_j = \alpha_j(n) = f^{(j)}(n)/j!$ whose particular shape is not important; we only need to estimate their size.

STEP II (creation of bilinear forms). We choose

$\mathcal{Q} = \{xy : 1 \le x, y \le P\}$

where the numbers $q = xy$ are counted with multiplicity of such representations, thus $|\mathcal{Q}| = Q = P^2$ and the exponential sum $S$ becomes a bilinear form

(8.71)    $S = \sum_{1\le x\le P}\ \sum_{1\le y\le P} e(F(xy)).$

It is important that the variables $x, y$ run independently over two sets (of some cardinality), yet it is of no significance what the points are. For example, we could restrict $x, y$ to special numbers or consider a general bilinear form

$B(X, Y) = \sum_{\mathbf{x}\in X}\ \sum_{\mathbf{y}\in Y} a(\mathbf{x})b(\mathbf{y})e(\langle\mathbf{x}, \mathbf{y}\rangle)$

with any complex coefficients $a(\mathbf{x})$, $b(\mathbf{y})$, where $X$, $Y$ are the sets of points of type $\mathbf{x} = [1, x, \dots, x^k]$, $\mathbf{y} = [1, y, \dots, y^k]$ with $x, y \in \mathbb{Z}$, $1 \le x \le X$, $1 \le y \le Y$ and $\langle\mathbf{x}, \mathbf{y}\rangle = F(xy)$. If $X$, $Y$ are sufficiently large and well-spaced sets then a non-trivial bound for $B(X, Y)$ follows easily by Fourier technique (see Chapter 10). However, this is not the case under consideration here; our sets are quite sparse.

STEP III (expanding the variables). Our next goal is to create more points of summation and make many variables with no large gaps. To this end we raise the bilinear form (8.71) to a high power, apply Hölder's inequality and rearrange

the resulting multiple sum as follows:

$|S|^{\ell} \le P^{\ell-1}\sum_x\Bigl|\sum_y e(F(xy))\Bigr|^{\ell} = P^{\ell-1}\sum_{\lambda_1,\dots,\lambda_k} \nu(\lambda_1, \dots, \lambda_k)\sum_x \zeta_x\,e(\alpha_1\lambda_1x + \dots + \alpha_k\lambda_kx^k).$

Here $\zeta_x$ are complex numbers with $|\zeta_x| = 1$ and $\nu(\lambda_1, \dots, \lambda_k)$ is the number of solutions to the system of equations

(8.72)    $y_1^h + \dots + y_\ell^h = \lambda_h, \quad 1 \le h \le k,$

in integers $1 \le y_j \le P$ for $j = 1, 2, \dots, \ell$. Thus $\ell \le \lambda_h \le \ell P^h$, or else there are no solutions. Since $\ell$ will be much larger than $k$ (approximately $\ell \asymp k^2$) we expect that most vectors $(\lambda_1, \dots, \lambda_k)$ have a good number of representations by the system (8.72). This number $\nu(\lambda_1, \dots, \lambda_k)$ may vary only slightly. In order to smooth the outer summation we raise $|S|^{\ell}$ again to power $2\ell$, getting

$|S|^{2\ell^2} \le P^{2\ell(\ell-1)}\,I_{\ell,k}(P)^{2\ell-2}\,J_{\ell,k}(P)\,Z_{\ell,k}(P)$

where

$I_{\ell,k}(P) = \sum_{\lambda_1,\dots,\lambda_k} \nu(\lambda_1, \dots, \lambda_k),$

$J_{\ell,k}(P) = \sum_{\lambda_1,\dots,\lambda_k} \nu^2(\lambda_1, \dots, \lambda_k),$

$Z_{\ell,k}(P) = \sum_{\lambda_1,\dots,\lambda_k}\Bigl|\sum_x \zeta_x\,e(\alpha_1\lambda_1x + \dots + \alpha_k\lambda_kx^k)\Bigr|^{2\ell}.$

For the average of $\nu(\lambda_1, \dots, \lambda_k)$ we have the exact value

$I_{\ell,k}(P) = P^{\ell},$

but the estimation of the average of $\nu^2(\lambda_1, \dots, \lambda_k)$ is a much harder problem. Note that $J_{\ell,k}(P)$ is the number of solutions to the system of homogeneous equations

(8.73)    $y_1^h + \dots + y_\ell^h - y_{\ell+1}^h - \dots - y_{2\ell}^h = 0, \quad 1 \le h \le k,$

in integers $1 \le y_j \le P$ for $j = 1, 2, \dots, 2\ell$. We have

(8.74)    $J_{\ell,k}(P) = \int_{(\mathbb{R}/\mathbb{Z})^k}\Bigl|\sum_{1\le x\le P} e(\alpha_1x + \dots + \alpha_kx^k)\Bigr|^{2\ell}\,d\alpha_1\cdots d\alpha_k.$

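More generally, if the right-hand side of the system (8.73) is replaced by constants $a_1, \dots, a_k$ then the number of solutions $J_{\ell,k}(P; a_1, \dots, a_k)$ to the system of inhomogeneous equations is given by the integral

$J_{\ell,k}(P; a_1, \dots, a_k) = \int_{(\mathbb{R}/\mathbb{Z})^k}\Bigl|\sum_{1\le x\le P} e(\alpha_1x + \dots + \alpha_kx^k)\Bigr|^{2\ell} e(-\alpha_1a_1 - \dots - \alpha_ka_k)\,d\alpha_1\cdots d\alpha_k.$

Hence it is clear that

(8.75)    $J_{\ell,k}(P; a_1, \dots, a_k) \le J_{\ell,k}(P).$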
STEP IV (tuning down to linear polynomials). Estimation of Je,k(P) lies in the heart of Vinogradov's method (see the next step), however, the vital saving comes from Ze,k(P) as it is the only sum left with non-trivial characters. We have

Ze,k(P) :(

L x1 , ...

where ak = ak(xl, ... 'X2e) = arrive at

Ze,k(P) :( Je,k(P)

L

I

,xzr

)q , ...

L;i (x~ -

L a1 , ...

+ · · · + ak.\kak)l

x~H), so iaki :( ePk. Hence by (8.75) we

L

I

,ak

e(a1.\ 1a!

,Ak

)q , ...

e(nJAWJ

+ · · · + ak.\kak)l.

,Ak

Here a 1 , ... , ak, .\ 1 , ... , Ak run independently over all integers with iahi :( eph and I.A.hl :( fPh. Note that the multiple sum factors so we write the result as

where

ll=

II

D(ah,ePh),

l(;h(;k D(a,X) =

x- 2

L IL lmi(;X

e(amn)l·

lni(;X

Now, as in vVeyl's method, the source of cancellation is in the exponential sums for linear polynomials. This is reached very differently here by Vinogradov's technique of bilinear forms than before by Weyl's differencing process. But above all, the new process is much faster. Indeed, Weyl's differencing requires applications of Cauchy's inequality k -1 times (each application reduced the degree by one), which amounts to raising the original sum to power 2k-l, whereas Vinogradov's method gets the job done with the power 2f2 ;:o;: k 4 . Another new feature is that we produced linear polynomials out of every monomial in F, not just one from the highest degree; consequently the saving factors appear about k times. Gathering the above estimates we obtain (8.76)

8.

EXPONENTIAL

SUMS

221

STEP V (estimation of Je,k(P)). Let us give some thought about the inequality (8.76). HereS is the original sum S(n1, ... ,nk) = ~e(n 1 q+ .. ·+nkqk) qEQ

while Je,k is the L2t-norrn of

P(n 1 , ... ,nk)= ~ e(n1x+···+nkxk) 1:s;;x:s;;e

with respect to the coefficients n 1 , ... , nk over the torus (!R./Z)k. At first glance we may think that the problem of estimating Je,k(P) is not much different than that of the original sum S. Yes, this would be an even harder problem if we dealt with individual sums. But, due to complete integration over the coefficients, we are really having a diophantine problem (counting solutions to the system of homogeneous polynomial equations (8.73)). By a heuristic argument one is lead to the bound (8.77)

(the linear equation in (8.73) fixes one variable, the quadratic equation fixes two additional variables, etc.), but we cannot prove this. Vinogradov succeeded in proving a slightly weaker bound. THEOREM 8.21. Let k, m be positive integers and €) k(k + m). We then have (8.78)

for any

p)

kk(l-l/k)-"'.

We begin by proving

LEMMA 8.22. Let $p$ be a prime number, $p > k$, and $\lambda_1, \dots, \lambda_k$ be integers. Then the number $T = T_p(\lambda_1, \dots, \lambda_k)$ of solutions to the system of congruences

(8.79)    $x_1^m + \dots + x_k^m \equiv \lambda_m \pmod{p^m}, \quad 1 \le m \le k,$

in $x_1, \dots, x_k \pmod{p^k}$ such that $x_i \not\equiv x_j \pmod p$ for $i \ne j$ is bounded by

(8.80)    $T \le k!\,p^{k(k-1)/2}.$

PROOF. We write each $x_r$ uniquely in the form

$x_r = x_{r1} + x_{r2}p + \dots + x_{rk}p^{k-1}$

with $0 \le x_{ri} < p$ for $1 \le r \le k$ and $1 \le i \le k$. The first coordinates $x_{r1}$ satisfy the system (8.79) (mod $p$). From the theory of symmetric polynomials it follows that the $j$-th symmetric functions in $x_{11}, x_{21}, \dots, x_{k1}$, say $s_j$, are determined by $\lambda_1, \dots, \lambda_k$ and that $x_{r1}$ are roots of $x^k - s_1x^{k-1} + \dots + (-1)^ks_k$. There are at most $k$ roots, so there are at most $k!$ solutions $(x_{11}, x_{21}, \dots, x_{k1})$.

Now suppose $(x_{11}, x_{21}, \dots, x_{k1}), \dots, (x_{1h}, x_{2h}, \dots, x_{kh})$ with $1 \le h < k$ are given. We shall find conditions for the coordinates of order $h + 1$. Letting

$x_{r1} + x_{r2}p + \dots + x_{rh}p^{h-1} = y_{rh}$

for $1 \le r \le k$, we have

$x_r \equiv y_{rh} + x_{r,h+1}p^h \pmod{p^{h+1}}.$

Consider the system (8.79) (mod $p^{h+1}$) reduced by omitting the first $h$ congruences, i.e.,

$\sum_{r=1}^k (y_{rh} + x_{r,h+1}p^h)^m \equiv \lambda_m \pmod{p^{h+1}}$

for $h < m \le k$. Hence

(8.81)    $\sum_{r=1}^k x_{r,h+1}\,y_{rh}^{m-1} \equiv \mu_m \pmod p, \quad h < m \le k,$

where $\mu_m$ are some numbers determined by the coordinates $x_{rj}$ of order $1 \le j \le h$ and by the constants $\lambda_m$ with $h < m \le k$. The numbers $y_{rh}$, $1 \le r \le k$, are different modulo $p$ because so are $x_r$. In particular, at most one $y_{rh}$ vanishes modulo $p$, say $y_{1h}$, if any. If we choose $x_{1,h+1}, \dots, x_{h,h+1}$ arbitrarily the remaining ones $x_{h+1,h+1}, \dots, x_{k,h+1}$ are determined modulo $p$ uniquely from the linear system (8.81) because its determinant (Vandermonde) is non-zero. Therefore, the number of coordinates of order $h + 1$ does not exceed $p^h$. We conclude that $T \le k!\,p\,p^2 \cdots p^{k-1} = k!\,p^{k(k-1)/2}$. □
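For very small $p$ and $k$ the count in this lemma can be obtained by exhaustive search. The sketch below is only an illustration (the function names are hypothetical): it assumes the system in the form $x_1^m + \dots + x_k^m \equiv \lambda_m \pmod{p^m}$ for $1 \le m \le k$, with the $x_r$ taken modulo $p^k$ and pairwise distinct modulo $p$, and compares the largest count over all right-hand sides with $k!\,p^{k(k-1)/2}$.

```python
from collections import Counter
from itertools import product
from math import factorial


def max_T(p, k):
    """Largest number of solutions over all (lambda_1, ..., lambda_k),
    found by brute force over all k-tuples modulo p^k."""
    modulus = p ** k
    counts = Counter()
    for xs in product(range(modulus), repeat=k):
        if len({x % p for x in xs}) < k:
            continue  # the x_r must be pairwise distinct modulo p
        # The m-th power sum is only recorded modulo p^m.
        key = tuple(sum(x ** m for x in xs) % p ** m for m in range(1, k + 1))
        counts[key] += 1
    return max(counts.values())


def lemma_bound(p, k):
    """k! * p^(k(k-1)/2)."""
    return factorial(k) * p ** (k * (k - 1) // 2)


if __name__ == "__main__":
    for p, k in ((3, 2), (5, 2), (7, 2)):
        print(p, k, max_T(p, k), lemma_bound(p, k))
```

The search space has $p^{k^2}$ points, so this is practical only for tiny parameters; the case $k = 3$, $p = 5$ already involves about two million tuples.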

COROLLARY 8.23. Let $p$ be a prime number, $p > k$, and let $U_{k,p}(P)$ be the number of solutions to the system of congruences

$$\sum_{1 \le h \le k} (x_h^m - x_{h+k}^m) \equiv 0 \pmod{p^m} \tag{8.82}$$

for $1 \le m \le k$, in $X < x_h \le X + P$, $1 \le h \le 2k$, such that $x_{h_1} \not\equiv x_{h_2} \pmod{p}$ for $1 \le h_1 \ne h_2 \le k$. We have (8.83)

Our next aim is to establish a recurrence inequality for $J_{\ell,k}(P)$. We assume that $\ell > k \ge 2$ and $P \le pq$, where $p$ is a prime number $> k$ to be chosen later and $q$ is an integer. Clearly, we have $J_{\ell,k}(P) \le J_{\ell,k}(pq)$. For estimating $J_{\ell,k}(pq)$ we divide the solutions of the system (8.73) into two classes. We say that $(y_1, \dots, y_{2\ell})$ is of the first class if both sequences $(y_1, \dots, y_\ell)$ and $(y_{\ell+1}, \dots, y_{2\ell})$ have at least $k$ different numbers modulo $p$. The remaining solutions are of the second class.

Let us estimate the number $J_1$, say, of the first class solutions. Since $k$ numbers can be put in $\ell$ places in $\ell(\ell-1)\cdots(\ell-k+1)$ ways we obtain $J_1 \le \ell^{2k} J_{11}$, where $J_{11}$ stands for the number of solutions $(y_1, \dots, y_{2\ell})$ such that all $(y_1, \dots, y_k)$ are different modulo $p$ and all $(y_{\ell+1}, \dots, y_{\ell+k})$ are different modulo $p$. Letting

2..=

S(et) =

e(Fa(Y)),

I~y~P

2..=

2..=

u(mod p)

l~y~P

S(n)=

e(Fa(y)) =

2..=

Su(n),

u(modp)

y=u(mod p)

say. By Holder's inequality we obtain

:( p2£-2k-1

2..=

Jn(u),

O
say, where ~· means that the summation is restricted to the residue classes ... , uk modulo p that are all different, and

ub

This is equal to the number of solutions $(y_1, \dots, y_{2\ell})$ of the system (8.73) such that all $y_1, \dots, y_k$ are different modulo $p$, all $y_{\ell+1}, \dots, y_{\ell+k}$ are different modulo $p$ and all the remaining ones are congruent to $u$ modulo $p$. For these we set $y_h = u + p v_h$ with $0 \le v_h < q$ if $k < h \le \ell$ or $k + \ell < h \le 2\ell$. We obtain

$$\sum_{0 < h \le k} (y_h^m - y_{h+\ell}^m) + \sum_{k < h \le \ell} \big((u + p v_h)^m - (u + p v_{h+\ell})^m\big) = 0$$

for $1 \le m \le k$. Now we apply an obvious fact that if $(y_1, \dots, y_{2\ell})$ satisfies the homogeneous system (8.73), then $(y_1 - u, \dots, y_{2\ell} - u)$ also does. Therefore we obtain

$$\sum_{0 < h \le k} (x_h^m - x_{h+\ell}^m) + p^m \sum_{k < h \le \ell} (v_h^m - v_{h+\ell}^m) = 0$$

for $1 \le m \le k$, where $x_h = y_h - u$ for $0 < h \le k$ and $\ell < h \le \ell + k$. In particular, we observe that $x_h$ satisfy the system of congruences (8.79). Given a solution to (8.79) we see that $v_h$ satisfy the inhomogeneous system of equations

$$\sum_{k < h \le \ell} (v_h^m - v_{h+\ell}^m) = \lambda_m, \qquad 1 \le m \le k,$$

with some constant $\lambda_m$ independent of $v_h$. Thus the number of the $v_h$'s is bounded by $J_{\ell-k,k}(q)$; see (8.75). Combining with (8.83) we conclude that

and (8.84)


Now we estimate $J_2$, the number of solutions of the second class of (8.73). Among the numbers $(y_1, \dots, y_\ell)$ there are at most $k-1$ different residue classes modulo $p$, or among the numbers $(y_{\ell+1}, \dots, y_{2\ell})$ there are at most $k-1$ different residue classes modulo $p$. Therefore

where $U$ is the sequence of vectors $u = (u_1, \dots, u_{2\ell}) \pmod{p}$ with the relevant property, so $|U| \le 2k^{\ell} p^{\ell+k-1}$. By the inequality

$$(x_1 \cdots x_n)^{1/n} \le \frac{1}{n}(x_1 + \cdots + x_n)$$

we obtain

2f

(2£)- 1 ~]z(uh), say, where

Hence (8.85) By (8.84) and (8.85) we conclude that Je,k(P) :::;;

[e2kk!p2f-2k- "

1

"i'> pk(Pk + P)k + 2 kepe+k-tq2k J Je-k,!.:(q).

Finally assuming that P ) kk we may take a prime number p with 2 + pl/k :::;; ·P:::;; 2P 1 1k and the integer q = [p- 1 ?] + 1 getting ' LEMMA

8.24. Let $k \ge 2$, $\ell \ge \frac{1}{2}k(k+3) - 1$ and $P \ge k^k$. We then have

(8.86)

Now Theorem 8.21 follows easily from Lemma 8.14 by induction in $m$. Combining (8.76) with (8.78) for $\ell = k(k+m)$ we obtain (8.87) where $m$ is any positive integer and $P \ge k^{k(1-1/k)^{-m}}$.

STEP VI (estimation of .6.). It remains to estimate .6.. We have X 2 D(a:,X):::;;

:::;;

1 min(x,-- ) 1 lmi,;;X 21 a:mll

~

~

~

lri,;;oX+~ lnm-rl<~

:::;; 2(a:X

+ 1)(X +

1 min(x, I ) 2 a:m- rl

~ 0
min(2x,

__!__)). QU


The last sum is bounded by an integral which equals $\alpha^{-1} \log eX$, therefore

$$D(\alpha, X) \le 2X^{-2}(\alpha X + 1)(X + \alpha^{-1} \log eX) \le 4(\alpha + \alpha^{-1} X^{-2}) \log 3X$$

for $\alpha > 0$ and $X \ge 3$. We also have the trivial bound $D(\alpha, X) \le 4$. From both estimates we conclude that for any $I \subset \{1, 2, \dots, k\}$,

IT(IaJI + laJI-lp- 2J)

ll (

(8.88)

(4klog3€P)k.

jEI

STEP VII (estimation of exponential sums). We now assume that J(~·) is a real smooth function on [N, 2NJ such that for all j ) 1 (8.89) on [N, 2NJ, where F) N) 2 and a) 1. Hence for aj(n) = J(jl(n)/j' we get 3

3

a-J (2N)-JF ( laJ(n)l ( aj N-JF and

ll

(IT (FN-J + F- 1 NJ p- 2J)ak

4

(8k log 3P)k.

jEI

We assume that P 4

(

N ( Ft and k) 2logFjlogN. We choose

I

=

. log F } . log F - < J ( _-,:=.--c-=cc { J .. logN log(NjP)

getting 2

2

ll ( IT(FN-J)(2kek)k (1og3N)k ( N-P(J-l)(2kek)k (1og3N)k jEI

where

J=

III )

log F log( N / P)

log F

- -log N

1)

(log F) (log P) + 1. 2 (log N)2 .

Finally (8.90)

ll ( exp(-(logF) 2(logP) 2 (2logN)- 3 )ak (klog3N)k. 4

Now setting P = N1, k = [4logFjlogN] and m = 8k we infer from (8.70), (8.87) and (8.90) the following THEOREM 8.25. Let f(x) be a smooth function on [N, 2N] which satisfies (8.89} with F ) N 4 and a ) 1. We then have (8.91)

where the constant implied in $\ll$ is absolute.

APPLICATIONS: We conclude the presentation of Vinogradov's method by drawing from (8.91) a few basic applications.


COROLLARY 8.26. For $t \ge N \ge 2$ we have

$$\sum_{1 \le n \le N} n^{it} \ll N \exp\big(-\beta (\log N)^3 (\log t)^{-2}\big), \tag{8.92}$$

where $\beta$ and the constant implied in $\ll$ are absolute.
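The sum in (8.92) is easy to evaluate numerically for moderate $N$, which gives a feel for the cancellation the corollary quantifies. The sketch below is illustrative only: the names `power_sum` and `saving_factor` are hypothetical, and since the source does not give a numerical value of $\beta$, it is left as a parameter.

```python
import cmath
import math


def power_sum(N: int, t: float) -> complex:
    """Return sum_{1 <= n <= N} n^{it}, computed term by term."""
    return sum(cmath.exp(1j * t * math.log(n)) for n in range(1, N + 1))


def saving_factor(N: int, t: float, beta: float) -> float:
    """exp(-beta (log N)^3 (log t)^(-2)), the factor on the right of (8.92).

    beta is an assumed input; the text only says it is an absolute constant.
    """
    return math.exp(-beta * math.log(N) ** 3 / math.log(t) ** 2)


if __name__ == "__main__":
    N = 2000
    for t in (2.0e3, 2.0e4, 2.0e5):
        # Compare |sum| / N with the trivial value 1.
        print(t, abs(power_sum(N, t)) / N)
```

The ratio printed is at most 1 by the triangle inequality; (8.92) asserts that it is bounded by a constant multiple of the saving factor.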

THEOREM 8.27. There exists an absolute constant $a > 0$ such that for $s = \sigma + it$ with $t \ge 2$ and $\frac{1}{2} \le \sigma \le 1$,

$$\zeta(s) \ll t^{a(1-\sigma)^{3/2}} (\log t)^{2/3}. \tag{8.93}$$

The implied constant is absolute.

PROOF. By (8.21) using partial summation and (8.92) we have

$$\zeta(s) = \sum_{1 \le n \le t} n^{-s} + O(1) \ll \int_1^t x^{-\sigma} \exp\big(-\beta (\log x)^3 (\log t)^{-2}\big)\,dx$$

$$= (\log t) \int_0^1 t^{(1-\sigma)u - \beta u^3}\,du \le (\log t) \int_0^1 t^{f(v) - \beta |u-v|^3}\,du$$

where $f(u) = (1-\sigma)u - \beta u^3$ and $v = \sqrt{(1-\sigma)/(3\beta)}$, so $f(v) = 2(1-\sigma)^{3/2}/(3\sqrt{3\beta})$. This gives (8.93) with $a = 2/(3\sqrt{3\beta})$. □

COROLLARY 8.28 (VINOGRADOV-KOROBOV, 1957). There exist absolute constants $\gamma > 0$, $\delta > 0$ such that

$$|\zeta(\sigma + it)| \le \gamma (\log t)^{2/3} \tag{8.94}$$

in the region

$$t \ge 2, \qquad \sigma \ge 1 - \delta (\log t)^{-2/3}. \tag{8.95}$$
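The optimization step in the proof of Theorem 8.27 (maximizing $f(u) = (1-\sigma)u - \beta u^3$ and bounding $f(u)$ by $f(v) - \beta|u-v|^3$) can be sanity-checked numerically. This is a minimal sketch with hypothetical function names; the values of $\sigma$ and $\beta$ used are arbitrary test inputs, not constants from the text.

```python
import math


def f(u: float, sigma: float, beta: float) -> float:
    """The exponent f(u) = (1 - sigma) u - beta u^3 from the proof."""
    return (1 - sigma) * u - beta * u ** 3


def stationary_point(sigma: float, beta: float) -> float:
    """v = sqrt((1 - sigma) / (3 beta)), where f'(v) = 0."""
    return math.sqrt((1 - sigma) / (3 * beta))


def closed_form_max(sigma: float, beta: float) -> float:
    """f(v) = 2 (1 - sigma)^(3/2) / (3 sqrt(3 beta))."""
    return 2 * (1 - sigma) ** 1.5 / (3 * math.sqrt(3 * beta))


def grid_max(sigma: float, beta: float, steps: int = 20000,
             u_max: float = 5.0) -> float:
    """Maximum of f over a uniform grid on [0, u_max]."""
    return max(f(i * u_max / steps, sigma, beta) for i in range(steps + 1))


if __name__ == "__main__":
    sigma, beta = 0.9, 0.05
    print(stationary_point(sigma, beta), closed_form_max(sigma, beta),
          grid_max(sigma, beta))
```

The grid maximum agrees with the closed form up to the grid resolution, and the cubic majorant can be tested pointwise on the same grid.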

There are several interesting devices by means of which one can transform an upper bound for $\zeta(s)$ into a zero-free region and derive estimates for $1/\zeta(s)$, $\zeta'(s)/\zeta(s)$ in this region. The original arguments of Hadamard and de la Vallée Poussin (presented in Chapter 5) produce only a zero-free region of the type $\sigma \ge 1 - c(\log t)^{-1}$, regardless of how good the estimate for $\zeta(s)$ that one has. However combining these arguments with a method of Landau one can establish deeper relationships (see Theorem 3.10 and Theorem 3.11 of [T2]).

LEMMA. Let $\varphi(t)$, $\psi(t)$ be positive increasing functions for $t \ge 0$ such that $\varphi(t)\psi(t) = o(\exp(\varphi(t)))$ as $t \to +\infty$. If

$$\zeta(s) \ll \exp(\varphi(t)) \qquad \text{for } \sigma \ge 1 - \psi(t)^{-1},$$

then

$$\zeta(s) \ne 0 \qquad \text{for } \sigma \ge 1 - c\,\psi(2t+3)^{-1} \varphi(2t+3)^{-1}.$$

Moreover in this region

$$1/\zeta(s),\ \zeta'(s)/\zeta(s) \ll \varphi(2t+3)\,\psi(2t+3).$$

Hence using Corollary 8.28 we obtain

THEOREM 8.29. We have $\zeta(s) \ne 0$ and

$$\frac{1}{\zeta(s)} \ll (\log t)^{2/3} (\log\log t)^{1/3}, \qquad \frac{\zeta'(s)}{\zeta(s)} \ll (\log t)^{2/3} (\log\log t)^{1/3}$$

for $\sigma \ge 1 - c(\log t)^{-2/3} (\log\log t)^{-1/3}$, $t \ge 3$, where $c > 0$ is an absolute constant.

Now the standard contour integral argument (see Chapter 5) yields the following strongest known Prime Number Theorem.

COROLLARY 8.30. We have

$$\psi(x) = x + O\big(x \exp(-c (\log x)^{3/5} (\log\log x)^{-1/5})\big)$$

for $x \ge 3$, where $c > 0$ is some absolute constant.

Let $\tau_k(n)$ stand for the number of ways of factoring $n$ into $k$ positive integers, so the Dirichlet generating series for $\tau_k(n)$ is $\zeta^k(s)$.

THEOREM 8.31. For any $\varepsilon > 0$ and $x \ge 2$ we have

$$D_k(x) = \sum_{n \le x} \tau_k(n) = x P_k(\log x) + O(x^{\theta_k + \varepsilon}), \tag{8.96}$$

where $P_k$ is a polynomial of degree $k-1$, $\theta_k = 1 - \gamma k^{-2/3}$, $\gamma$ is a positive absolute constant and the constant implied in $O$ depends on $\varepsilon$ and $k$ only.

PROOF. By Perron's formula (5.111) we obtain s 1 jl+<+iT . (k(s)=-_ds 27rl 1+<-.Y S

Dk(x) = -

+ O(xi+ 2
where T is any number with 2 !( T !( x. Move the integration to Re (s) = u, ~ < cr < 1, getting by (8.93)

The residue is equal to xPk(log x) and the error term becomes O(x 8 k+ 2") by setting 213 T = xkand cr = 1- (2ad)- 2 . 0 Recall that for small k one gets stronger results using van der Corput method. It is conjectured that the best exponent in (8.96) is Ok = for any k.


CHAPTER 9

THE DIRICHLET POLYNOMIALS

9.1. Introduction.

A Dirichlet polynomial is a finite Dirichlet series

$$D(s) = \sum_{1 \le n \le N} a_n n^{-s} \tag{9.1}$$

with complex coefficients $a_n$. This is a special case of sums of type $\sum a_n e^{-\lambda(n)s}$ where $\lambda(n)$ are distinct real numbers called "frequencies." Our special case $\lambda(n) = \log n$ is distinguished by the following features:

- $\lambda(n)$ is smooth and slowly increasing,
- $\lambda(n)$ is additive.

The derivative of $D(s)$ is also a Dirichlet polynomial of length $N$,

$$D'(s) = \sum_{1 \le n \le N} a_n' n^{-s},$$

its coefficients $a_n' = -a_n \log n$ are changed only slightly. The additive property of our frequencies implies that the product of two Dirichlet polynomials is a Dirichlet polynomial, namely we have

$$\Big(\sum_{n \le N} a_n n^{-s}\Big)\Big(\sum_{m \le M} b_m m^{-s}\Big) = \sum_{\ell \le L} c_\ell\, \ell^{-s}$$

with $L = MN$ and the coefficients of the product are

$$c_\ell = \sum_{mn = \ell} a_n b_m. \tag{9.2}$$
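The product rule for Dirichlet polynomials is a Dirichlet convolution of the coefficient sequences. A minimal sketch, assuming coefficients are stored as plain Python lists with index $n-1$ holding $a_n$ (the names `dirichlet_product` and `evaluate` are hypothetical):

```python
def dirichlet_product(a, b):
    """Coefficients c_l = sum over mn = l of a_n * b_m, with l up to len(a) * len(b)."""
    N, M = len(a), len(b)
    c = [0] * (N * M)
    for n in range(1, N + 1):
        for m in range(1, M + 1):
            c[n * m - 1] += a[n - 1] * b[m - 1]
    return c


def evaluate(coeffs, s):
    """Value of the Dirichlet polynomial sum_n coeffs[n-1] * n^(-s)."""
    return sum(c * n ** (-s) for n, c in enumerate(coeffs, start=1))


if __name__ == "__main__":
    a, b = [1, 1, 1], [1, 1]
    c = dirichlet_product(a, b)
    s = 0.5 + 2j
    # Both sides of the product identity agree up to rounding.
    print(c, evaluate(a, s) * evaluate(b, s), evaluate(c, s))
```

Many of the $c_\ell$ vanish when $N$ and $M$ are small, which reflects that only integers of the form $mn$ with $n \le N$, $m \le M$ occur.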

The main objective of the theory of Dirichlet polynomials is to estimate $D(s)$ at special points. Since the set of special points is not constructible in practice (think of the zeros $\rho = \beta + i\gamma$ of $\zeta(s)$ with $\beta > \frac{1}{2}$) it is necessary to deal with arbitrary points. These can be selected by combinatorial arguments into well-spaced sets. Moreover, the coefficients $a_n$ are quite complicated in applications, therefore one has to treat them as arbitrary complex numbers. Though in such generality it is impossible to give a non-trivial bound for $D(s)$ at any fixed $s$, it is true that $|D(s)|$ cannot take large values for many well-spaced points. Hence our fundamental question will be how often $|D(s)|$ is larger than the mean value over the relevant set of points.

If $s = \sigma + it$ has fixed $\sigma$, then by changing the coefficients $a_n$ into $a_n n^{-\sigma}$ we can assume without loss of generality that all the points in question are on the imaginary line $s = it$. After establishing results for points on the imaginary line we shall extend them for sets of well-spaced points in vertical strips. Actually we can also modify the Dirichlet polynomial $D(s)$ by twisting its terms by various functions $f_s(n)$ depending on $s$ which change only slightly in $n$ (have small derivatives in $n$), that is to say, the theory extends to sums of type (9.3)

Such a twist can be handled using any standard Fourier analysis (separation of variables in $f_s(n)$, see Proposition 9.11).

9.2. The integral mean-value estimates.

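By Cauchy's inequality we get

$$\Big|\sum_{1 \le n \le N} a_n n^{it}\Big|^2 \le GN \tag{9.4}$$

where

$$G = \sum_{1 \le n \le N} |a_n|^2. \tag{9.5}$$

Of course, this bound is best possible, however, one can do better on average.

THEOREM 9.1. For any complex numbers $a_n$ we have

$$\int_0^T \Big|\sum_{1 \le n \le N} a_n n^{it}\Big|^2 dt = (T + O(N))\,G \tag{9.6}$$

where the implied constant is absolute.

PROOF. Let $f(t)$ be the following continuous piecewise linear function

$$f(t) = \begin{cases} 0 & \text{if } t \le -N, \\ 1 + t/N & \text{if } -N < t \le 0, \\ 1 & \text{if } 0 < t \le T, \\ 1 - (t-T)/N & \text{if } T < t \le T + N, \\ 0 & \text{if } t > T + N. \end{cases}$$

Then our integral is majorized by

$$\int f(t) \Big|\sum_{1 \le n \le N} a_n n^{it}\Big|^2 dt = \sum_m \sum_n a_m \overline{a_n}\, F\Big(\frac{m}{n}\Big)$$

with

$$F(x) = \int f(t)\, x^{it}\, dt.$$

We have $F(1) = T + N$ and $F(x) \ll N^{-1} (\log x)^{-2}$ if $x \ne 1$. Hence for $m \ne n$ we have

$$F\Big(\frac{m}{n}\Big) \ll N^{-1} \Big(\log \frac{m}{n}\Big)^{-2} \ll N^{-1} \Big(\frac{m+n}{m-n}\Big)^2 \ll \frac{N}{(m-n)^2}.$$

Therefore our integral is bounded from above by

$$(T + N)\,G + O\Big(N \sum_{1 \le m \ne n \le N} |a_m a_n| (m-n)^{-2}\Big) = (T + O(N))\,G$$

by using the inequality $2|a_m a_n| \le |a_m|^2 + |a_n|^2$. Similarly we derive the same lower bound. Combining both estimates we complete the proof of (9.6). □

This result should be compared with the discussion of bilinear forms and large sieve in Chapter 7. Theorem 9.1 implies that $G^{1/2}$ is the mean-value of $|D(it)|$ in the following asymptotic sense

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T |D(it)|^2\, dt = G. \tag{9.7}$$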
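The mean-value statement of Theorem 9.1 can be observed numerically, because the integral of $|\sum a_n n^{it}|^2$ over $[0, T]$ has a closed form: the diagonal contributes $TG$, and each pair $m \ne n$ contributes $a_m \overline{a_n}\,(e^{i\omega T} - 1)/(i\omega)$ with $\omega = \log(m/n)$. The sketch below evaluates this formula; the function name `mean_square_integral` is hypothetical.

```python
import cmath
import math


def mean_square_integral(a, T: float) -> float:
    """Exact value of the integral over [0, T] of |sum_n a_n n^{it}|^2 dt.

    a[n-1] holds the coefficient a_n.
    """
    N = len(a)
    total = T * sum(abs(x) ** 2 for x in a)  # diagonal terms m = n
    for m in range(1, N + 1):
        for n in range(1, N + 1):
            if m == n:
                continue
            w = math.log(m / n)
            # Integral of (m/n)^{it} over [0, T].
            osc = (cmath.exp(1j * w * T) - 1) / (1j * w)
            total += a[m - 1] * complex(a[n - 1]).conjugate() * osc
    return total.real


if __name__ == "__main__":
    a = [1.0] * 20
    G = sum(abs(x) ** 2 for x in a)
    for T in (1.0e2, 1.0e3, 1.0e4):
        # The ratio tends to 1 as T grows with N fixed, as in (9.7).
        print(T, mean_square_integral(a, T) / (T * G))
```

With $N$ fixed the off-diagonal part stays bounded as $T$ grows, so the printed ratios approach 1.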

Note that the mean-value of the product of Dirichlet polynomials is essentially bounded by the product of mean-values. Precisely we derive for the coefficients (9.2) the following inequalities:

$$\sum_\ell |c_\ell|^2 \le \mathop{\sum\sum}_{nm = n_1 m_1} |a_n b_m a_{n_1} b_{m_1}| \le \Big(\sum_n |a_n|^2 \tau(n)\Big)\Big(\sum_m |b_m|^2 \tau(m)\Big).$$

Hence using the bound $\tau(n) \ll n^{\varepsilon}$ we obtain

$$\sum_\ell |c_\ell|^2 \ll (MN)^{\varepsilon} \Big(\sum_n |a_n|^2\Big)\Big(\sum_m |b_m|^2\Big). \tag{9.8}$$

For the product of $k$ polynomials we have

$$\Big(\sum_{n \le N_1} a_n^{(1)} n^{-s}\Big) \cdots \Big(\sum_{n \le N_k} a_n^{(k)} n^{-s}\Big) = \sum_{n \le N} a_n n^{-s}$$

with $N = N_1 \cdots N_k$ and the coefficients are

$$a_n = \mathop{\sum \cdots \sum}_{n_1 \cdots n_k = n} a_{n_1}^{(1)} \cdots a_{n_k}^{(k)}.$$

The mean-value of these coefficients satisfies

$$\sum_n |a_n|^2 \le \sum \cdots \sum |a_{n_1}^{(1)} \cdots a_{n_k}^{(k)}|^2\, \tau_k(n_1 \cdots n_k) \le G_k^{(1)} \cdots G_k^{(k)}$$

where for a given sequence $A = (a_n)$ we put

$$G_k = \sum_n |a_n|^2 \tau_k(n). \tag{9.9}$$

This follows by the inequality $\tau_k(n_1 \cdots n_k) \le \tau_k(n_1) \cdots \tau_k(n_k)$. Since $\tau_k(n)$ is multiplicative, it suffices to check this inequality for $n_i = p^{a_i}$. In this case we have $\tau_k(p^a) = (1 + a)(1 + a/2) \cdots (1 + a/(k-1))$, so it suffices to check that

$$1 + (a_1 + \cdots + a_k)/\ell \le (1 + a_1/\ell) \cdots (1 + a_k/\ell) \qquad \text{for } 1 \le \ell < k,$$

which is obvious.

Note that $G_k \ll G N^{\varepsilon}$ by $\tau_k(n) \ll n^{\varepsilon}$, where the implied constant depends on $\varepsilon$ and $k$; however, we shall often in practice have a better estimate $G_k \ll G(\log 2N)^K$ with some $K \ge k$.

The approximate formula (9.6) is the best possible of its kind in general. It is best used for polynomials of length $N$ which is near $T$ in the logarithmic scale. If the polynomial $D(s)$ is quite short we can better apply (9.6) to a suitable power $D(s)^k$ getting

COROLLARY 9.2. For any integer $k \ge 1$ we have (9.10) where $G_k$ is given by (9.9) and the implied constant is absolute.

Still we may not be able to match $T$ and $N^k$ for $k$ an integer, so (9.10) is not perfect in some ranges. In this connection H. Montgomery [Mo2] made the following

CONJECTURE $M_k$. Suppose $|a_n| \le 1$. Then for any real $k$ with $1 \le k \le 2$ we have (9.11) where the implied constant depends only on $\varepsilon$.
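The submultiplicativity $\tau_k(n_1 \cdots n_k) \le \tau_k(n_1) \cdots \tau_k(n_k)$ and the prime-power formula $\tau_k(p^a) = (1+a)(1+a/2)\cdots(1+a/(k-1))$ used in the discussion of $G_k$ are easy to test on a table of values. A minimal sketch (the function name `tau_k_table` is hypothetical), building $\tau_k$ by repeated Dirichlet convolution with the constant function 1:

```python
from math import comb


def tau_k_table(k: int, limit: int):
    """List tau with tau[n] = tau_k(n) for 1 <= n <= limit (index 0 unused)."""
    tau = [0] + [1] * limit  # tau_1(n) = 1
    for _ in range(k - 1):
        nxt = [0] * (limit + 1)
        # tau_{j+1}(n) = sum of tau_j(d) over the divisors d of n.
        for d in range(1, limit + 1):
            for multiple in range(d, limit + 1, d):
                nxt[multiple] += tau[d]
        tau = nxt
    return tau


if __name__ == "__main__":
    k, limit = 3, 200
    tau = tau_k_table(k, limit)
    # Prime powers: the product (1+a)(1+a/2)...(1+a/(k-1)) equals C(a+k-1, k-1).
    print([tau[2 ** a] for a in range(6)],
          [comb(a + k - 1, k - 1) for a in range(6)])
    # Two-factor case of the submultiplicativity; the k-factor case follows by induction.
    print(all(tau[m * n] <= tau[m] * tau[n]
              for m in range(1, limit + 1)
              for n in range(1, limit // m + 1)))
```

The binomial form of the prime-power values is a standard identity, stated here only to make the comparison concrete.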

It is not difficult to prove (9.11) for some special polynomials, for example, when $a_n = 1$ for all $n$. However, the bound (9.10) for general complex numbers $a_n$, with $N^{k+\varepsilon}$ replaced by $G^k N^{\varepsilon}$, was shown by J. Bourgain [Bou] to be false if $k$ is not an integer. His counter-example is given by a Dirichlet polynomial supported in a short interval. But such cases are not common in applications, so the Conjecture $M_k$, if true, is still important. It yields, among other things, the density conjecture for zeros of the Riemann zeta function (see Chapter 10).

9.3. The discrete mean-value estimates.

Next we establish discrete versions of Theorem 9.1 and some of their consequences.

DEFINITION. A set $\mathcal{T}$ of real numbers $t_r$ is said to be well-spaced if $|t_{r_1} - t_{r_2}| \ge 1$ for any $r_1 \ne r_2$.

LEMMA 9.3 (GALLAGHER). Let $\mathcal{T}$ be a set of points $t_r$ with $\frac{1}{2} \le t_r \le T - \frac{1}{2}$ which is well-spaced. Let $F(t)$ be a smooth function on $[0, T]$. Then we have

$$\sum_{t_r \in \mathcal{T}} |F(t_r)|^2 \le \int_0^T \big(|F(t)|^2 + |F(t) F'(t)|\big)\, dt. \tag{9.12}$$


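PROOF. For a smooth function $f(x)$ on $[0, 1]$ we get the identity

$$f(x) = \int_0^1 f(t)\,dt + \int_0^x t f'(t)\,dt + \int_x^1 (t-1) f'(t)\,dt$$

by partial integration. Hence

$$|f(\tfrac{1}{2})| \le \int_0^1 \big(|f(t)| + \tfrac{1}{2}|f'(t)|\big)\,dt.$$

Applying this for $f^2(t)$ we infer

$$|f(\tfrac{1}{2})|^2 \le \int_0^1 \big(|f(t)|^2 + |f(t) f'(t)|\big)\,dt.$$

This gives

$$|F(t_r)|^2 \le \int_{t_r - 1/2}^{t_r + 1/2} \big(|F(t)|^2 + |F(t) F'(t)|\big)\,dt,$$

and summing over $t_r$ in $\mathcal{T}$ we deduce (9.12) because the intervals $(t_r - \frac{1}{2}, t_r + \frac{1}{2})$ do not overlap. □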

Notice that by the Cauchy-Schwarz inequality we have

Applying (9.12) together with (9.13) for $F(t) = D(it)$ one gets

THEOREM 9.4. Let $\mathcal{T}$ be a set of well-spaced points in the segment $[0, T]$ with $T \ge 1$, and let $a_n$ be any complex numbers. Then we have

$$\sum_{t_r \in \mathcal{T}} \Big|\sum_{1 \le n \le N} a_n n^{it_r}\Big|^2 \ll (T + N)\,G \log 2N \tag{9.14}$$

where $G$ is given by (9.5) and the implied constant is absolute.

COROLLARY 9.5. Let the conditions be as in Theorem 9.4. For any integer $k \ge 1$ we have (9.15) where $G_k$ is given by (9.9) and the implied constant is absolute.

REMARKS. Gallagher's lemma yields a transition from an individual value D(it) to an integral. It employs the derivative D'(it) which is not acceptable in some applications. However, we can make a transition without differentiation of D(s). To this end take a smooth function f(x) ;, 0 supported on [~, 2N] with f(x) = 1 on [1, N] such that xa f(a) (x) « 1 for 0 ( a ( 2. By Mellin inversion 1 f(x) = 27r

where F(t)=

joo F(t)xitdt -oo

j f(x)x't- dx«(1+)ti)1

2

1og2N



by partial integration. We multiply an by f(n), which is a redundant factor if 1 ( n ( N, and apply the above integral representation of f(n) getting 1 D(s) = 27f

Hence (9.16)

D(itr)

«

(log2N)

j=-=

1:

D(s

+ it)F(t)dt.

ID(it)i(1 +it- trl)- 2 dt

where the implied constant is absolute. The same arguments apply to $D(s)^k$ with $k$ a positive integer giving (9.17) where the implied constant is absolute. Hence the estimates (9.14) and (9.15) follow from (9.6) and (9.10) respectively.

The bound (9.14) has the deficiency of not depending on the cardinality of the set $\mathcal{T}$; it is relatively weaker for smaller sets. Montgomery [Mon] succeeded in improving (9.14) if $R = |\mathcal{T}|$ is smaller than $T^{1/2}$ (note that any well-spaced set $\mathcal{T} \subset [0, T]$ has no more than $T + 1$ points). The idea is based on the duality principle for bilinear forms: Let $X = (x_{mn})$ be a complex matrix. Then the following statements about $X$ and a number $D$ are equivalent:

(A) For any complex numbers $a_n$,

$$\sum_m \Big|\sum_n a_n x_{mn}\Big|^2 \le D \sum_n |a_n|^2.$$

(B) For any complex numbers $b_m$,

$$\sum_n \Big|\sum_m b_m x_{mn}\Big|^2 \le D \sum_m |b_m|^2.$$

For the proof, see Chapter 7.

THEOREM 9.6 (MONTGOMERY). Let the conditions be as in Theorem 9.4. Then we have

$$\sum_{t_r \in \mathcal{T}} \Big|\sum_{1 \le n \le N} a_n n^{it_r}\Big|^2 \ll G\,(N + R T^{1/2}) \log 2T \tag{9.18}$$

where the implied constant is absolute.

PROOF. By the duality principle the problem is equivalent to showing that for any complex numbers $c_r$ one has

$$\sum_{1 \le n \le N} \Big|\sum_{t_r \in \mathcal{T}} c_r n^{it_r}\Big|^2 \ll \Big(\sum_{t_r \in \mathcal{T}} |c_r|^2\Big)(N + R T^{1/2}) \log 2T. \tag{9.19}$$

Squaring out and changing the order of summation on the left side of (9.19) we get

$$\sum_{r_1} \sum_{r_2} c_{r_1} \overline{c_{r_2}}\, Z(t_{r_1} - t_{r_2})$$

where

$$Z(t) = \sum_{1 \le n \le N} n^{it}.$$

By (8.20), (8.26) and (8.34) we derive that

$$Z(t) \ll N|t|^{-1} + |t|^{1/2} \log |t|, \qquad \text{if } |t| \ge 1. \tag{9.20}$$

This yields (9.18). □

By the Lindelöf Hypothesis $\zeta(\frac{1}{2} + it) \ll |t|^{\varepsilon}$ for $|t| \ge 1$, one should expect a better bound for $Z(t)$ than (9.20), namely that

$$Z(t) \ll N|t|^{-1} + N^{1/2} |t|^{\varepsilon}, \qquad \text{for } |t| \ge 1. \tag{9.21}$$

Hence we get

$$\sum_{t_r \in \mathcal{T}} \Big|\sum_{1 \le n \le N} a_n n^{it_r}\Big|^2 \ll G\,(N + R N^{1/2})\, T^{\varepsilon} \tag{9.22}$$

in place of (9.18). But an even stronger bound is predicted by Montgomery in the following

CONJECTURE M. Suppose $|a_n| \le 1$. Let $\mathcal{T}$ be a set of $R$ well-spaced points $t_r$ in the segment $[0, T]$ with $T \ge 1$. Then

$$\sum_{t_r \in \mathcal{T}} \Big|\sum_{1 \le n \le N} a_n n^{it_r}\Big|^2 \ll N(N + R)\, T^{\varepsilon}. \tag{9.23}$$

EXERCISE. Show that Conjecture M implies Conjecture $M_k$ for any real $1 \le k \le 2$ (with extra factor $T^{\varepsilon}$).

9.4. Large values of Dirichlet polynomials.

The discrete mean-value estimates offer some answers to the question how often a Dirichlet polynomial assumes large values at a given set of well-spaced points. Thus (9.14) yields

THEOREM 9.7. Let $D(s)$ be a Dirichlet polynomial of length $N \ge 1$ and let $\mathcal{T}$ be a set of well-spaced points $t_r$ in the segment $[0, T]$ with $T \ge 1$ such that (9.24) for some $V > 0$. Then the cardinality of $\mathcal{T}$, say $R = |\mathcal{T}|$, satisfies

$$R \ll (T + N)\, G V^{-2} \log 2N \tag{9.25}$$

where the implied constant is absolute.

Notice that the bound (9.25) is non-trivial if $V \gg G^{1/2} \log 2N$. For $V$ much larger than this Montgomery got a better result by applying (9.18).
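The counting problem behind Theorem 9.7 can be explored on small examples. The sketch below takes the integer points $0, 1, \dots, T$ as the well-spaced set and assumes that condition (9.24) is the requirement $|D(it_r)| \ge V$; that reading, the function names, and the sample coefficients are assumptions made for illustration. Since the implied constant in (9.25) is not specified, only the scale $(T+N)GV^{-2}\log 2N$ is computed.

```python
import cmath
import math


def dirichlet_value(a, t: float) -> complex:
    """D(it) in the normalization sum_n a_n n^{it}, with a[n-1] = a_n."""
    return sum(c * cmath.exp(1j * t * math.log(n))
               for n, c in enumerate(a, start=1))


def count_large_values(a, T: int, V: float) -> int:
    """Number of integers t in [0, T] (a well-spaced set) with |D(it)| >= V."""
    return sum(1 for t in range(T + 1) if abs(dirichlet_value(a, t)) >= V)


def theorem_9_7_scale(a, T: int, V: float) -> float:
    """(T + N) G V^(-2) log 2N, the right side of (9.25) without its constant."""
    N = len(a)
    G = sum(abs(c) ** 2 for c in a)
    return (T + N) * G * math.log(2 * N) / V ** 2


if __name__ == "__main__":
    a = [1.0] * 10
    T = 200
    for V in (3.0, 5.0, 8.0):
        print(V, count_large_values(a, T, V), theorem_9_7_scale(a, T, V))
```

Raising $V$ shrinks both the count and the scale, which is the qualitative content of (9.25).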


THEOREM 9.8. Let the conditions be as in Theorem 9.7 with

$$V \ge G^{1/2} T^{1/4} \log 2T. \tag{9.26}$$

Then

$$R \ll G N V^{-2} \log 2T \tag{9.27}$$

where the implied constant is absolute.

PROOF. We can assume that $T$ is larger than $N$ by a large constant factor, because otherwise (9.27) follows by (9.25). Inserting (9.24) into (9.18) we get

$$R V^2 \ll G\,(N + R T^{1/2}) \log 2T.$$

This implies (9.27) under the restriction (9.26). □

The restriction (9.26) can be removed at a cost of introducing an extra term in the bound (9.27).

COROLLARY 9.9 (HUXLEY). Let the conditions be as in Theorem 9.7. Then (9.28) where the implied constant is absolute.

PROOF. Huxley's idea (the subdivision method) is a simple trick of dividing the set $\mathcal{T}$ to meet the condition (9.26). First we can assume that

$$V \ge G^{1/2} \log 2T \tag{9.29}$$

because otherwise the assertion (9.28) is trivial. Then we have $V \ge G^{1/2} T_0^{1/4} \log 2T_0$ with $T_0 = \min\{T,\ G^{-2} V^4 (\log 2T)^{-4}\}$. The condition (9.29) ensures us that $1 \le T_0 \le T$. Let $\mathcal{T}_\ell$ be the subset of points of $\mathcal{T}$ in the interval $[\ell T_0, (\ell+1) T_0]$ with $0 \le \ell \le T T_0^{-1}$. For each subset $\mathcal{T}_\ell$ Theorem 9.8 is applicable giving $R_\ell = |\mathcal{T}_\ell| \ll G N V^{-2} \log 2T$. Hence

$$R = \sum_\ell R_\ell \ll T T_0^{-1}\, G N V^{-2} \log 2T$$

which yields (9.28). □

Applying the above results for the polynomial $D(it)^k$ with $k$ a positive integer one gets

$$R \ll (T + N^k)(G_k V^{-2})^k \log 2N^k, \tag{9.30}$$

$$R \ll \big\{(G_k N V^{-2})^k + T\,(G_k^3 N V^{-6})^k\big\} (\log 2T)^6 \tag{9.31}$$

where the implied constant is absolute (see (9.25) and (9.28) respectively). In 1975, M. Jutila came up with several innovations to improve the above estimates in some ranges. His best result (which we state here without proof, see [Ju3]) is


THEOREM 9.10 (JUTILA). For any positive integer k we have (9.32) for any

R« f

GN

(G N) - t ~+ G NT (G1f2N) 3

4 k

{\12+ \(2

> 0, the implied constant depending only on

f

T N2k}(NT)" and k.

REMARKS. Notice that if (9.33)

V)

a! N~ (NT)"

then Jutila's bound (9.32) becomes essentially Huxley's bound (9.31) by letting $k$ be sufficiently large in terms of $\varepsilon$. The best (conditional) estimates for the number of large values of a Dirichlet polynomial with bounded coefficients come from Montgomery's conjectures. By Conjecture $M_k$ one derives (9.34) for any real $k \ge 1$ and $\varepsilon > 0$, the implied constant depending on $\varepsilon$ and $k$. In applications (9.34) is best used for $k$ with $N^k = T$. By Conjecture M one derives (9.35)

if

Actually the above estimates for $R$ are equivalent to the Montgomery conjectures.

All the discrete mean-value estimates with respect to a well-spaced set $\mathcal{T} = \{t_1, \dots, t_R\} \subset [0, T]$ can be generalized slightly to sums of type

$$\sum_{1 \le n \le N} a_n f_r(n)\, n^{it_r} \tag{9.36}$$

where $f_r(n)$ is a nice function which does not vary in $n$ too much.

PROPOSITION 9.11. Suppose that each $f_r(x)$ satisfies (9.37) for $1 \le x \le N$ and $0 \le a \le 2$. Suppose that

$$\Big|\sum_{1 \le n \le N} a_n f_r(n)\, n^{it_r}\Big| \ge V \tag{9.38}$$

for $1 \le r \le R$. Then the bounds (9.30) and (9.31) hold true with $G_k$ replaced by $G_k (\log 2N)^2$. In particular, $f_r(n) = n^{-\sigma_r}$ with $0 \le \sigma_r \le 1$ qualifies.

PROOF. We may assume that fr(x) is supported in [~, 2N] where it satisfies (9.37). Then we write fr(n)

= _!_ 27l"

Joo fr(it)n-i dt -oo 1

where fr(s) is the Mellin transform of fr(x), fr(s) =

1=

fr(x)xs-ldx.


This satisfies fr(it) we get

L

«

(ltl


+ 1)- 2 log2N

anfr(n)nit,

«

(log2N)

l~n~N

«(log 2N)

by partial integration two times. Hence

i:

l

jD(itr

~ it)i(ltl + 1)- 2 dt

L ID(itr(m))i(lml + 1)-

2

where tr(m) = tr ~tis the point at which jD(itr ~it)! is maximal with m,;; t,::; m + 1. Notice that itr(m)l ,::; T + lml + 1. By the hypothesis (9.38) it follows that ID(itr(m))l

»

(lml

+ 1)iV(log2N)- 1

for some m E Z. Let Tm be the set of such points tr(m) and Rm its cardinality, so we haveR,::; Lm Rom. The set Tm can be divided into two well-spaced subsets. Applying (9.30) and (9.31) for each of these subsets we obtain corresponding bounds for R"'' and summing over m we get the asserted bounds for R. 0 9.5. Dirichlet polynomials with characters. A large part of the theory of Dirichlet polynomials (9.1) extends to sums of type (9.39)

L

D(s,x) =

anx(n)n-s

l~n~N

where $\chi$ is some sort of arithmetic harmonic, for example, $\chi(n)$ can be a Dirichlet character, or the Fourier coefficient of a modular form (see Chapter 14). However, in the latter example one is faced with technical difficulties caused by the lack of complete multiplicativity (the generating $L$-function has Euler product of degree two) while the orthogonality property is only approximate. In this section we deal with the Dirichlet characters to various moduli.

Let $k \ge 1$ and $Q \ge 1$. We consider the set $\mathcal{H}(k, Q)$ of characters $\chi \pmod{kq}$ such that $\chi = \xi\psi$ where $\xi$ is any character to modulus $k$ and $\psi$ is any primitive character to modulus $q$ with $1 \le q \le Q$, $(q, k) = 1$. Throughout we assume that $T \ge 3$, $N \ge 1$, and we denote

$$\mathcal{L} = \log HN. \tag{9.40}$$

First we establish the following extension of Theorem 9.1.

THEOREM 9.12. For any complex numbers $a_n$ we have (9.41)


l

I


239

REMARKs. Our estimate (9.41) is not as strong as it could possibly be, namely the logarithmic factor C3 could be removed by taking a traditional approach based on the large sieve inequality for additive characters. The transition from multiplicative to additive characters being performed by Gauss sums (see Chapter 3), this approach has the extra advantage of working well for sequences (an) supported in short segments AI < n ( !vi+ N. Nevertheless, we have chosen a direct method simply to show new ideas. Here the quantity H = kQ 2 T is about the number of harmonics x(n)nit being employed (think oft as a discrete variable ranging over a well-spaced set of points with /t/ ( T). PROOF. For the proof of (9.41) we first assume that (an) is supported in a dyadic segment X ( n ( 2X with X ): 1. By the duality principle the problem reduces to the estimation of

(9.42) for any complex numbers cx(t). We enlarge the outer summation by introducing a smoothing function f(n) ): 0 with f(n) ): 1 for X ( n ( 2X. Then squaring out we see that the dual form (9.42) is bounded by (9.43) where

B(t,x) = Lf(n)x(n)nit

(9.44)

Here x = X 1:\:2 is a character of modulus I!= k[q 1 , q2] ( kQ 2 and By Poisson's summation

/tl = lt1- t2l (

where G is the Gauss sum

cG) =

L

x(a)e(aeh)

a( mod t)

and F is the Fourier transform of f(x)xii,

F(y) =

1=

f(x)xite(xy)dx.

Have in mind that F(y) is also a function oft. We choose f(x) such that (9.45)

ltl (yX)~) F(y) «X exp ( TT , if y > 0.

For this, there are many good choices, for example, (9.46)

X)

5 -x -f(x) = exp ( 2 X X

T.


240

satisfies all the requirements. For the proof of (9.45), we move the path of integration in the Fourier integral of f(x)xit from JR+ to eiOJR+ with 0 < 8 < ~ getting

IF(y)l (X

lor= exp(25 +Bit I- (x + x- 1 ) cosO- 21rxyX sinO )dx.

Since x- 1 cos 8 + 21rxyX sin 8 ;:? (1ryX sin 28) ~, this yields

IF(y)l (X expa

+Bit I -

(7ryX sin 28)~) (cose)- 1 .

We choose 8 = r- 1 getting (9.45). By (9.45) we derive that (by the trivial bound IG(h/f)l ( £)

where (9.48)

8x =

1 if x is principal, and 8x = 0 otherwise. Note that x = XIX2 is principal if and only if x1 = x2 (this is the place where our hypothesis that the characters 7/J(mod q) are primitive is essential). Inserting (9.47) and (9.48) into (9.43) we estimate (9.42) by

{X+H 2 exp(-(X/H)!)}

L xEH(k,Q)

lTicx(tWdt. O

Since this holds for any complex numbers cx(t), it follows by duality that for any complex numbers an, (9.49)

{T L lolL xEH(k,Q)

O

2

2

anx(n)nitldt«{X+H exp(-(XjH)!)}

X
L

lanl 2 •

X
This result is as good as (9.41) provided X;:? H(logH) 2 , but it is very poor for smaller X. Note that we established (9.49) only for sums over n in dyadic segments. To derive a result for short sums we shall increase X artificially, while preserving the dyadic shape of the range of the character sum (this idea is due to E. Bombieri). We begin with the inequality

L P
logp;:?

L

logp -log kq;;, ~p

P
p/kq

which holds for P ;:? 9log kq. Hence the left side of (9.49) is bounded by

241


9.

Here we used the multiplicativity of our harmonics x(n)nit to replace n by m = pn. Now m ranges over the dyadic segment pX < m ( 2pX, so applying (9.49) with X replaced by pX we get a new estimate {PX

L

+ H 2 exp( -(PX/ H)~)}

lanl 2

X
where P is any number ;:? 9log kQ. We choose P that

L 1TI L

xOt(k,Q)

O

=

(9

+ H x- 1 ) (log H) 2

anx(n)nifdt«(X+H)(1ogH)

X
2

L

showing

ianl 2 .

X
Finally we replace the dyadic segment X < n ( 2X by the whole interval! ( n ( N by subdividing the latter and applying Cauchy-Schwarz inequality. This produces an extra factor log2N, and it completes the proof of Theorem 9.12. D To every X E 'H(k,Q) we associate several points (possibly none) (9.50)

Let S(k,Q,T) be the set of all such points counted with relevant. multiplicity. We say that S(k,Q,T) is well-spaced if for any sr,(x!),sr,(X2) in S(k,Q,T) with (r1,x!) =F (r2,X2) we have either Xr =F X2, or Xr = X2 =X and ltr,(x)-tr,(X)I;:? 1. Clearly, if S(k, Q, T) is well-spaced, then its cardinality satisfies (9.51)

R

= IS(k,Q,T)I ( 3kQ 2T= 3H.

Nowhere in the proof of Theorem 9.12 did we use essentially the continuous measure in t, therefore the same estimates hold true if one replaced the integration by a summation over any set of well-spaced points tr(x). Then one can change nitr(x) into n-sr(x) at the cost of an extra factor (log 2N) 2 in (9.41) by the arguments from the proof of Proposition 9.11. This way Theorem 9.12 becomes THEOREM 9.13. LetS= S(k, Q, T) be a well-spaced set of points {9.50). For any complex numbers an we have (9.52)

L IL

anx(n)n-sr(xf

«

G(N

+ H)£.5

sr(X)ES l~n~N


Theorem 9.13 is an extension of Theorem 9.4. Next we establish an extension of Theorem 9.6, namely THEOREM 9.14. Let S = S(k, Q, T) be a well-spaced set of points (9.50) of cardinality R. For any complex numbers an we have (9.53)

L IL

anx(n)n-sr(xf

sr(X)ES l~n~N


«

G(N

+ RH~)£. 4


242

PRoOF. This goes by modifications of our arguments from the proof of Theorem 9.6 and Theorem 9.12. First we assume that
(9.54)

D

= Lf(n)IL2:>x(tr)x(n)nitrl2 X

r

for any complex numbers cx(tr), where f(n) is any non-negative function on JR.+ such that f(n) ;:? 1 if 1 ( n ( N (in what follows we assume N ;:? 2, otherwise (9.53) is trivial). The dual form satisfies (9.55)

D (

L L L :Licx (trJCx 1

2

(tr,)B(tr 1

-

tr 2 , XL\:2)1

where B(t, x) is given by (9.44). At this point we are going to refine the estimation (9.47). We write (9.56)

1 B(t, x) = - . f L(s- it, x)}(s)ds 2 7r~ ;(2)

where L(s, x) is the Dirichlet £-function and ](s) is the Mellin transform off }(s) =

1=

f(x)xs-Jdx.

Moving the integration to the imaginary line we derive (9.57)

cp( £)

1

r

B(t, x) = bx-e-f(1 +it)+ 0(£ 2 (ltl

1

+ 1) 2 (log Ne(ltl + 1)) 2 ).

Here the leading term comes from the pole of L(s- it, x) at s = 1 +it if x is principal (of course, it agrees with that in (9.47)) and the error term is obtained from two estimates (9.58)

L(s, x)
+ 2)

(see Exercise 9 in Section 5.9) and (9.59)

}(s)

«

e-lsilogN

on Re s = 0. To see the latter take (9.60)

so }(s) = 5r(s)(N 8 -1). Hence we also get }(1 +it)« Ne-1 1 1. Now (9.57) shows that the dual form satisfies X

r

and this proves (9.53) with the logarithmic factor [} in place of £ 4 , but only if all the points sr(X) are imaginary. Then we change nit into n-

243

sum B(x, t). However, this estimation is not the best possible. Assuming the Lindeliif hypothesis

L(s, x)

(9.61)

«

(flsW

for Res = ~ and x(mod f) one deduces (by moving the integration to the critical line rather than to the imaginary one) that (9.62) This yields (9.63)

L\L

sr(x)ES l,;;n.;;N

in place of (9.53). The results (9.53) and (9.63) are due to H. Montgomery [Mo2].

9.6. The reflection method.

The Lindelöf hypothesis may be an open problem for a long time. Nevertheless, M. N. Huxley [Hu2] has succeeded in establishing unconditionally a result which is almost as good as (9.63) in some important ranges. First he did it for polynomials $D(s)$ in Corollary 9.9 by the subdivision trick. However, this trick does not make sense for polynomials $D(s, \chi)$ twisted by characters. For these polynomials Huxley invented a completely different brilliant trick which he calls the reflection method for reasons soon to be clear. Using Huxley's ideas we are going to prove

THEOREM 9.15. Let the conditions of Theorem 9.14 hold true. Then (9.64) where the implied constant is absolute.

For the proof we can assume that the coefficients $a_n$ are supported in a dyadic segment $N \le n \le 2N$, and as before, we can also assume that all the points $s_r(\chi)$ are on the imaginary line. The latter simplification costs a loss of factor $(\log 2N)^2$, so we must show (9.64) with $\mathcal{L}^4$ in place of $\mathcal{L}^6$. The problem reduces to the estimation of the dual form $D$. In the proof of Theorem 9.14 we have estimated $D$ for any complex coefficients $c_\chi(t_r)$, however, it is sufficient to deal with (9.65) Precisely, letting $C$ be the left side of (9.64), we have (9.66) (see the derivation of the duality principle in Chapter 7). Now we are going to take advantage of these special coefficients (9.65). By (9.55) we are again faced with the character sum $B(\chi, t)$. We shall not improve the error term in (9.57) but rather replace it by another character sum; in other words, we use a kind of summation



formula (see Chapter 4, in particular (4.26)). We derive such a formula from a functional equation for the corresponding $L$-function. Without loss of generality, we can assume that all the characters in $C$ have the same parity (by dividing $C$ into two sums if necessary), so the character $\chi = \chi_1\bar\chi_2$ in $B(\chi,t)$ is always even. If $\chi \pmod{\ell}$ is primitive, then we have the functional equation (4.73)

(9.67)  $L(s,\chi) = \varepsilon_\chi\,\gamma(s)\,\ell^{\frac12-s}\,L(1-s,\bar\chi)$

where $|\varepsilon_\chi| = 1$ and

(9.68)  $\gamma(s) = \pi^{s-\frac12}\,\Gamma\Bigl(\frac{1-s}{2}\Bigr)\Gamma\Bigl(\frac{s}{2}\Bigr)^{-1}.$

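Suppose $\chi \pmod{\ell}$ is induced by the primitive character $\chi^* \pmod{\ell^*}$ with $\ell^* \mid \ell$. Then

$L(s,\chi) = L(s,\chi^*)\prod_{p\mid\ell}\bigl(1-\chi^*(p)p^{-s}\bigr).$

Hence the functional equation (9.67) for $L(s,\chi^*)$ yields the following functional equation for $L(s,\chi)$:

(9.69)  $L(s,\chi) = \varepsilon_\chi\,\gamma(s)\,P(s)\,L(1-s,\bar\chi)$

where $\varepsilon_\chi = \varepsilon_{\chi^*}$ and

(9.70)  $P(s) = (\ell^*)^{\frac12-s}\prod_{p\mid\ell}\bigl(1-\chi^*(p)p^{-s}\bigr)\bigl(1-\bar\chi^*(p)p^{s-1}\bigr)^{-1}.$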

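Notice that $\gamma(s)$, $P(s)$ are holomorphic in $\operatorname{Re} s < 1$, and that $|P(s)| = 1$ on $\operatorname{Re} s = \frac12$, whence by the convexity bound or the Phragmén-Lindelöf principle

(9.71)  $|P(s)| \le \ell^{\frac12-\sigma}$  if  $\sigma = \operatorname{Re}(s) \le \frac12.$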
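The two properties of $P(s)$ just stated can be checked numerically in a small case. The following Python sketch is only an illustration and not part of the argument: it assumes the even quadratic character $\chi^*$ modulo $\ell^* = 5$ induced to the modulus $\ell = 15$ (so the only prime $p \mid \ell$ with $\chi^*(p) \ne 0$ is $p = 3$, where $\chi^*(3) = -1$), and the helper name `local_factor_P` is hypothetical.

```python
def local_factor_P(s, conductor, chi_star_at_primes):
    """Evaluate P(s) as in (9.70).

    conductor          -- the conductor l* of the primitive character chi*
    chi_star_at_primes -- dict {p: chi*(p)} over the primes p dividing l
                          (chi*(p) = 0 when p divides l*)
    """
    value = conductor ** (0.5 - s)
    for p, chi_p in chi_star_at_primes.items():
        numerator = 1 - chi_p * p ** (-s)
        denominator = 1 - complex(chi_p).conjugate() * p ** (s - 1)
        value *= numerator / denominator
    return value


# Illustrative assumption: chi* is the Legendre symbol mod 5, induced to l = 15.
CONDUCTOR, MODULUS = 5, 15
CHI_STAR = {3: -1, 5: 0}  # 3 is a quadratic non-residue mod 5

# |P(s)| = 1 on the critical line Re s = 1/2.
for t in (-7.5, 0.0, 2.0, 31.4):
    assert abs(abs(local_factor_P(0.5 + 1j * t, CONDUCTOR, CHI_STAR)) - 1) < 1e-12

# |P(s)| <= l^(1/2 - sigma) to the left of the critical line, as in (9.71).
for sigma in (0.25, 0.0, -1.0, -2.5):
    for t in (-10.0, -1.0, 0.0, 0.7, 12.0):
        s = complex(sigma, t)
        bound = MODULUS ** (0.5 - sigma)
        assert abs(local_factor_P(s, CONDUCTOR, CHI_STAR)) <= bound * (1 + 1e-12)
```

The check is consistent with the maximum-principle argument: $P(s)\ell^{s-\frac12}$ is bounded in the half-plane $\operatorname{Re} s \le \frac12$ and has modulus at most one on its boundary.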

Moreover, $L(s,\chi)$ is holomorphic on $\mathbb{C}$ except for a simple pole at $s = 1$ with residue $\varphi(\ell)/\ell$ if $\chi \pmod{\ell}$ is principal. Therefore by moving the integration in (9.56) to the line $\operatorname{Re} s = -\varepsilon$ and applying the functional equation (9.69) we obtain

(9.72)  $B(t,\chi) = \delta_\chi\frac{\varphi(\ell)}{\ell}\hat f(1+it) + \varepsilon_\chi B^*(t,\chi)$

where

(9.73)  $B^*(t,\chi) = \frac{1}{2\pi i}\int_{(-\varepsilon)}\gamma(s-it)P(s-it)L(1-s+it,\bar\chi)\hat f(s)\,ds.$

Expanding $L(1-s+it,\bar\chi)$ into the Dirichlet series and integrating termwise we get

(9.74)  $B^*(t,\chi) = \sum_{m=1}^{\infty} g(m)\bar\chi(m)m^{-\frac12-it}$

where

(9.75)  $g(y) = \frac{1}{2\pi i}\int_{(\frac12)}\gamma(s-it)P(s-it)\hat f(s)\,y^{s-\frac12}\,ds.$

Notice that we moved the integration from $\operatorname{Re} s = -\varepsilon$ in (9.73) back to $\operatorname{Re} s = \frac12$ in (9.75), as we can because $\gamma(s)$ and $P(s)$ have no poles in $\operatorname{Re} s < 1$.


The formula (9.72) is the one we were talking about, but it is not yet ready for applications because the dual function $g(y)$ is not evaluated explicitly enough. To see its properties we take again the majorant function $f(x)$ given by (9.46) with $X = N$, so its Mellin transform is

(9.76)  $\hat f(s) = \int_0^\infty \exp\Bigl(\frac52 - \frac{x}{N} - \frac{N}{x}\Bigr)x^{s-1}\,dx = 2e^{5/2}K_s(2)N^s$

where $K_s(y)$ stands for the Bessel-Macdonald function. Hence $\hat f(s)$ has exponential decay on any vertical line $s = \sigma + it$. Using analytic properties of $\hat f(s)$, $\gamma(s)$ and $P(s)$ one shows, by moving the integration in (9.75) sufficiently far to the left (as in (10.64)), that (9.77)

$g(y) \ll N^{\frac12}\exp\Bigl(-\frac15\Bigl(\frac{yN}{\ell(|t|+1)}\Bigr)^{\frac13}\Bigr)$

where the implied constant is absolute (compare this estimate with (9.45)). Hence $g(m)$ is very small if $mN \ge \ell(|t|+1)(\log N\ell(|t|+1))^3$.

In our case $|t| \le T$ and $\ell \le kQ^2$, so the terms of $B^*(t,\chi)$ with $mN \ge H\mathcal{L}^3$ contribute very little; precisely, we get by (9.77)

(9.78)  $B^*(t,\chi) = \sum_{m\le M} g(m)\bar\chi(m)m^{-\frac12-it} + O(H^{-1})$

where (9.79) and the implied constant is absolute. To make (9.78) ready for applications it is desirable to separate $m$ from $\chi$ and $t$ involved in $g(m)$. We accomplish this goal quickly by introducing (9.75) (9.80)

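$B^*(t,\chi) = \frac{1}{2\pi i}\int_{(\frac12)}\gamma(s)P(s)\hat f(s+it)\Bigl(\sum_{m\le M}\bar\chi(m)m^{s-1}\Bigr)ds + O(H^{-1}).$

Since $|K_s(2)| \le 8|\Gamma(s)| \ll e^{-|u|}$ for $s = \frac12 + iu$, we get

(9.81)  $B^*(t,\chi) \ll N^{\frac12}\int_{-\infty}^{\infty}\Bigl|\sum_{m\le M}\chi(m)m^{-\frac12+i(t+u)}\Bigr|e^{-|u|}\,du + H^{-1}$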

where the implied constant is absolute. This result is more precise than (9.57). Indeed, estimating trivially by (9.81) one gets

(9.82)  $B^*(t,\chi) \ll H^{\frac12}\mathcal{L}^{\frac32}$

which recovers (9.57). But one can do better by putting the sum

(9.83)  $\sum_{m\le M}\chi_1\bar\chi_2(m)\,m^{-\frac12+i(t_{r_1}-t_{r_2}+u)}$

in the original polynomial

(9.84)  $\sum_{N<n\le 2N}a_n\chi_1(n)\,n^{it_{r_1}}$


which appears as a coefficient in (9.55) (here and hereafter we use the abbreviated notation $t_r = t_r(\chi)$). By (9.55), (9.72) and (9.81) we derive

D

«

N L L L ID(itrpX)D(itr 2 ,X)Iexp(-ltr,- tr2 1)

+ N!

I:

+ H-! L

LL L XI

X2

L L

L

l

ID(itr 2,X2)P(itr,Xl;itr,X2;u)je-iuidu

L

I

ID(itr,, Xl)D(iir 2, X2JI

where $P(it_{r_1},\chi_1;it_{r_2},\chi_2;u)$ denotes the product of the two sums (9.83) and (9.84). For fixed $\chi_2$, $t_{r_2}$, $u$ this product is a new polynomial of type (9.39). The length of $P(it_{r_1},\chi_1) = P(it_{r_1},\chi_1;it_{r_2},\chi_2;u)$, being $2MN = 2H\mathcal{L}^3$, is perfect for an application of Theorem 9.13. We obtain

Xl

I

rz

r1

LLIP(itr,,x!)l (

1

(RLLiP(itrpX!W)~ « (RGH)~£ 6 .

r1

X1

r1

1

l1

After this bound, which does not depend on $\chi_2$, $t_{r_2}$, $u$, we estimate

Integrating in $u$, we find that the contribution of these polynomials to $D$ is at most $O(R(CGHN)^{\frac12}\mathcal{L}^5)$. The polynomials from the diagonal $\chi_1 = \chi_2 = \chi$ contribute

$N\sum_{\chi}\sum_{r_1}\sum_{r_2}|D(it_{r_1},\chi)D(it_{r_2},\chi)|\exp(-|t_{r_1}-t_{r_2}|) \ll NC,$

and the remaining ones contribute

Adding up these contributions we obtain $D \ll NC + R(CGHN)^{\frac12}\mathcal{L}^6$. Hence we derive by (9.66) that $C \ll GN + GR^{\frac23}(HN)^{\frac13}\mathcal{L}^4$. This completes the proof of Theorem 9.15.

9.7. Large values of $D(s,\chi)$. The notation from the previous section holds here; in particular, we recall that $H = kQ^2T$ and $\mathcal{L} = \log HN$. In this section we deduce a few bounds for the number of large values of the polynomials $D(s,\chi)$. First, as in Theorem 9.7, we deduce from (9.52) the following

THEOREM 9.16. Suppose for any $s_r(\chi)$ in a well-spaced set $S(k,Q,T)$ we have

(9.85)  $|D(s_r(\chi),\chi)| \ge V.$

Then the cardinality of this set, say $R = |S(k,Q,T)|$, satisfies

(9.86)  $R \ll (H+N)GV^{-2}\mathcal{L}^5$


If $V$ is relatively large, then (9.53) yields a better bound (for details see the proof of Theorem 9.8), namely the following

THEOREM 9.17. Let the conditions be as in Theorem 9.16 with

(9.87)

Then

(9.88)

where the implied constant is absolute.

The estimate (9.88) is ideal for applications; unfortunately, it is established only for very large $V$ relative to the "conductor" $H$. One can remove the condition (9.87) at the cost of introducing an extra term in the bound (9.88). To this end we apply (9.64) and get the following result.

THEOREM 9.18. Let the conditions be as in Theorem 9.16. Then


The last estimate (9.89) is the celebrated result of Huxley [Hux]. It is surprising to see that in the special case of the Riemann zeta function, his subdivision argument yields a bound (9.28) which agrees with (9.89) obtained by very different arguments (reflection method).

The assumption that the ordinates $t_r(\chi)$ of the points $s_r(\chi) = \sigma_r(\chi) + it_r(\chi)$ are well-spaced can be relaxed; it suffices that they do not cluster too much. Suppose that for any $s_r(\chi) \in S(k,Q,T)$ we have

(9.90)  $|\{r_1;\ s_{r_1}(\chi) \in S(k,Q,T),\ |t_{r_1}-t_r| \le 1\}| \le L.$

Then clearly all Theorems 9.13 to 9.18 are true for the set $S(k,Q,T)$ with the relevant upper bounds multiplied by $L$ (to see this divide the set $S(k,Q,T)$ into at most $2L$ subsets of well-spaced points and apply the results for each of these subsets separately). In particular, we know that the number of zeros $\rho = \beta + i\gamma$ of $L(s,\chi)$ for a given character $\chi \pmod{kq}$ with $|\gamma - t| \le 1$ is bounded by $O(\log kq(|t|+3))$ (see Chapter 5). Therefore, by virtue of the above remark, all Theorems 9.13 to 9.18 hold for any subset $S(k,Q,T)$ of zeros of

(9.91)  $\prod_{\substack{q\le Q\\ (q,k)=1}}\ \prod_{\substack{\psi \pmod q\\ \psi\ \text{primitive}}}\ \prod_{\xi \pmod k} L(s,\psi\xi)$

in a rectangle $\sigma \ge \alpha$, $|t| \le T$ (counted with multiplicity) with just one extra factor $\mathcal{L}$, and $G$ replaced by

(9.92)  $G(\alpha) = \sum_{n\le N}|a_n|^2n^{-2\alpha}$

in the relevant upper bounds. In the next chapter we apply these theorems for selected points $s_r(\chi)$ which are zeros of (9.91).

CHAPTER 10

ZERO-DENSITY ESTIMATES

10.1. Introduction. In the absence of a proof of the Grand Riemann Hypothesis, it is natural to ask how many zeros of a given $L$-function can lie off the critical line. Since we know there are no zeros on the line $\operatorname{Re}(s) = 1$, a few such zeros would not normally influence applications, but a large number of these, especially when they are near this line, may substantially distort the results (an example being the distribution of primes in short intervals, see Section 10.5). Therefore, we should ask how many zeros can possibly be in a given region, and our estimates should reveal that the further the region is from the critical line, the smaller the number of zeros it contains. A quantitative statement of this kind is called a zero-density theorem. The regions we need most in practice are the rectangles

(10.1)  $R(\alpha,T) = \{s = \sigma + it;\ \sigma \ge \alpha,\ |t| \le T\}$

for $\frac12 \le \alpha \le 1$ and $T \ge 3$. Let $N(\alpha,T)$ denote the number of zeros of the Riemann zeta function in $R(\alpha,T)$ (counted with corresponding multiplicity), that is we count the zeros $\rho = \beta + i\gamma$ with (10.2)

In this notation the Riemann hypothesis asserts that $N(\alpha,T) = 0$ for any $\alpha > \frac12$ and $T \ge 3$, while the best known zero-free region (due to Vinogradov and Korobov) says that

(10.3)  $N(\alpha,T) = 0$  if  $\alpha > 1 - c(\log T)^{-2/3}(\log\log T)^{-1/3}$

where $c$ is an absolute positive constant (see Chapter 8). Recall that the number of all zeros $\rho = \beta + i\gamma$ with $|\gamma| \le T$ satisfies (see Chapter 5)

(10.4)  $N(T) = \frac{T}{\pi}\log\frac{T}{2\pi e} + O(\log T).$
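As a sanity check of the size of the main term in (10.4), one can compare it with an actual count. The Python sketch below is illustrative only; the count used (29 zeros with $0 < \gamma \le 100$, hence 58 with $|\gamma| \le 100$) is a standard tabulated fact quoted from outside this text, and the function name is hypothetical.

```python
from math import e, log, pi


def zero_count_main_term(T):
    """Main term (T / pi) * log(T / (2 * pi * e)) of (10.4)."""
    return (T / pi) * log(T / (2 * pi * e))


# Outside fact (not from this text): 29 zeros have 0 < gamma <= 100,
# so 58 zeros have |gamma| <= 100 by the symmetry gamma -> -gamma.
KNOWN_COUNT_AT_100 = 58

main = zero_count_main_term(100.0)
print(f"main term at T = 100: {main:.2f}; tabulated count: {KNOWN_COUNT_AT_100}")

# The discrepancy is well within the O(log T) allowed by (10.4).
assert abs(main - KNOWN_COUNT_AT_100) < log(100.0)
```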

By analogy to the convexity principle concerning the order of $\zeta(s)$ on vertical lines one may expect the following bound for $N(\alpha,T)$ to be true.

DENSITY CONJECTURE. For $\frac12 \le \alpha \le 1$ and $T \ge 3$ we have

(10.5)  $N(\alpha,T) \ll T^{2(1-\alpha)}\log T$

where the implied constant is absolute.

Of course, the convexity principle for zeros in rectangles does not hold for general analytic functions; nevertheless, the density conjecture for $\zeta(s)$ has a reliable


backup: the Riemann Hypothesis. This conjecture is particularly attractive because it has a chance of being proved by the technology which is available today, while it replaces the Riemann hypothesis in various applications to the distribution of primes. A great deal of effort has been made to establish estimates of type

(10.6)  $N(\alpha,T) \ll T^{c(\alpha)(1-\alpha)}(\log T)^A$

with $c(\alpha)$ as small as possible (the logarithmic factor is of secondary concern here, although it will be important to eliminate it in Chapter 18). The first zero-density results were proved by Bohr and Landau [BL] in 1914. Their idea was basic in many works over three decades, notably in the works by F. Carlson [Car] and A. E. Ingham [Ing]. Two new devices were introduced by P. Turán in 1949, and in collaboration with G. Halász in 1958. Great refinements were made by H. Montgomery in 1969. The other significant contributors to this fascinating theory are M. N. Huxley, M. Jutila and D. R. Heath-Brown.

In this book we shall try to give a glimpse of the current state of knowledge. Our results will not be as strong as the best known ones, though nearly so. Moreover, we treat in considerable detail the zeros of Dirichlet $L$-functions. Let $N(\alpha,T,\chi)$ be the number of zeros $\rho_\chi = \beta_\chi + i\gamma_\chi$ of $L(s,\chi)$ with $\beta_\chi \ge \alpha$ and $|\gamma_\chi| \le T$ (counted with multiplicity). The ultimate goal would be to prove the following

GRAND DENSITY CONJECTURE.

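Let $k \ge 1$, $Q \ge 1$, $T \ge 3$ and $\frac12 \le \alpha \le 1$. Then

(10.7)  $\sum_{\substack{q\le Q\\ (q,k)=1}}\ \sum_{\substack{\psi \pmod q\\ \psi\ \text{primitive}}}\ \sum_{\xi \pmod k} N(\alpha,T,\xi\psi) \ll H^{2(1-\alpha)}(\log H)^A$

where $H = kQ^2T$ and $A$ is an absolute constant, the implied constant being also absolute.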

Our main result, the Grand Density Theorem 10.4, will give (10.7) in the range $\frac56 < \alpha \le 1$. Moreover, it yields (10.7) in the whole segment $\frac12 \le \alpha \le 1$, but with the constant $c = 12/5$ in place of 2.

10.2. Zero-detecting polynomials.

First we outline the old and new strategies. Given a holomorphic function $L(s)$ in a region $D \subset \mathbb{C}$ there is a variety of tools for counting its zeros inside $D$. A typical one is the integral formula of Jensen

(10.8)  $\int_0^R \frac{n(r)}{r}\,dr = \frac{1}{2\pi i}\int_{|s|=R}\log\Bigl|\frac{L(s)}{L(0)}\Bigr|\frac{ds}{s}$

where $n(r)$ denotes the number of zeros of $L(s)$ in the disc $|s| \le r$ (assuming $L(s) \ne 0$ for $s = 0$ and for $s$ on $|s| = R$). There are similar formulas for other domains. Note that one needs both upper and lower bounds for $|L(s)|$ on the boundary to deduce an estimate for the number of zeros inside the region. Of the two, the lower bound is the harder to find. In view of these requirements any integral formula is practical only for a region sufficiently wide so that its boundary is distanced from the area in which the zeros are concentrated (otherwise one cannot


claim a reasonable lower bound for $|L(s)|$ on $\partial D$). But this is not the case for the narrow rectangles $R(\alpha,T)$.

To avoid the above barriers one gets the following idea: multiply $L(s)$ by another holomorphic function, say $M(s)$, which is suspected to be large when $L(s)$ is small, and count the zeros of the product $N(s) = L(s)M(s)$ rather than of $L(s)$ alone. Of course, the extra zeros of $M(s)$ will occur, but hopefully not so many as to change the order of magnitude of the original number (a positive proportion of extra zeros is acceptable). In reality the idea works only for counting "fictitious" zeros because $N(s)$ does vanish at the true zeros of $L(s)$. Thus the points at which we want the mollifier $M(s)$ to be large are only hypothetical. Having no possibility to inspect the distribution of these hypothetical points one may try $M(s)$ which mimics $1/L(s)$ at almost all points, so $N(s)$ is expected to be close to one almost everywhere in a given region. It would be tautological to take just $1/L(s)$ for the mollifier $M(s)$, or naive to deform $1/L(s)$ smoothly to get a useful holomorphic function. But suppose $1/L(s)$ is given by an absolutely convergent series in a certain region. Then the chances are that its partial sums approximate $1/L(s)$ well, uniformly in a wider region which does not contain the true zeros of $L(s)$. Therefore, a finite but sufficiently long partial sum of this kind offers a good candidate for $M(s)$.

The above strategy serves well for counting fictitious zeros of Dirichlet series with multiplicative coefficients. First, to illustrate the ideas, we give a quick treatment of $L(s) = \zeta(s)$. In this case

$\frac{1}{\zeta(s)} = \prod_p\Bigl(1-\frac{1}{p^s}\Bigr) = \sum_m\frac{\mu(m)}{m^s},$

but only if $\operatorname{Re} s > 1$. Nevertheless, one expects that the partial sum

(10.9)  $M(s) = \sum_{m\le M}\frac{\mu(m)}{m^s}$

approximates $1/\zeta(s)$ if $\operatorname{Re} s = \sigma > \frac12$. Indeed, the Riemann hypothesis ensures that

(10.10)  $\frac{1}{\zeta(s)} = M(s) + O\bigl((|s|M)^{\varepsilon}M^{\frac12-\sigma}\bigr).$

REMARKS. The partial sum $M(s)$ was first used in the context of zero-density estimates by Carlson [Car], while Bohr and Landau [BL] were working earlier with the partial product

(10.11)  $P(s) = \prod_{p\le P}\Bigl(1-\frac{1}{p^s}\Bigr).$

Either device produces non-trivial results, yet $M(s)$ is more efficient than $P(s)$.

It helps to keep hold on the size of the points, so we restrict our analysis to the region

(10.12)  $s = \sigma + it,\quad \sigma \ge \alpha,\quad T < |t| \le 2T,$



and we assume that $T$ is large. In this region

(10.13)  $\zeta(s) = \sum_{n\le T}n^{-s} + O(T^{-\alpha})$

where the implied constant is absolute. Hence

(10.14)  $\zeta(s)M(s) = \sum_{n\le TM}a_nn^{-s} + O(T^{-\alpha}M^{1-\alpha}\log 2M)$

where the coefficients are

(10.15)  $a_n = \sum_{\substack{dm=n\\ m\le M,\ d\le T}}\mu(m),$

and the error term is obtained by the trivial estimation

(10.16)  $M(s) \ll M^{1-\alpha}\log 2M.$

Assuming $1 \le M \le T$ we have $a_1 = 1$ and $a_n = 0$ if $1 < n \le M$ or $n > MT$. We cover the interval $M < n \le MT$ by dyadic segments $N < n \le 2N$ with $N = 2^{\ell}M$, $0 \le \ell < L = [\log T/\log 2]$. For each of these segments denote

(10.17)  $D_\ell(s) = \sum_{N<n\le 2N}a_nn^{-s}.$

Thus we have

(10.18)  $\zeta(s)M(s) = 1 + \sum_{0\le\ell<L}D_\ell(s) + E(s)$

where the error term satisfies $|E(s)| \le \frac12$ for $s$ in the region (10.12), provided $M^{1-\alpha} \le T^{\alpha}(\log T)^{-2}$. An important feature of the identity (10.18) is the absence of terms with $1 < n \le M$ on the right side. The left side vanishes at a zero of $\zeta(s)$, say at $s = \rho$, so we must have

(10.19)  $\Bigl|\sum_{0\le\ell<L}D_\ell(\rho)\Bigr| \ge \frac12.$

Hence

(10.20)  $|D_\ell(\rho)| \ge (2L)^{-1}$

for some $0 \le \ell < L$. This lower bound is rather large by comparison to the mean-value of $|D_\ell(s)|^2$ on the line $\operatorname{Re} s = \alpha$. Indeed we have

(10.21)  $G_\ell(\alpha) = \sum_{N<n\le 2N}|a_n|^2n^{-2\alpha} \ll N^{1-2\alpha}(\log 2N)^3.$
For this reason we call $D_\ell(s)$ zero-detecting polynomials. What we showed is that every zero of $\zeta(s)$ in the region (10.12) is detected by at least one of a few fixed polynomials by way of producing an unusually large value.
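The structure of the coefficients in (10.15), which is what makes the identity (10.18) start with the isolated term 1, can be seen in a small computation. The following Python sketch is an illustration under toy parameters ($M = 8$, $T = 40$ are arbitrary choices, and all function names are hypothetical); it verifies $a_1 = 1$, the vanishing of $a_n$ for $1 < n \le M$, and the bound $|a_n| \le \tau(n)$.

```python
def mobius_sieve(limit):
    """Return the list mu[0..limit] of Moebius values, computed by a sieve."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for multiple in range(p, limit + 1, p):
            if multiple > p:
                is_prime[multiple] = False
            mu[multiple] = -mu[multiple]       # one more prime factor
        for multiple in range(p * p, limit + 1, p * p):
            mu[multiple] = 0                   # not squarefree
    return mu


def mollified_coefficients(M, T):
    """a[n] = sum of mu(m) over n = d*m with m <= M and d <= T, as in (10.15)."""
    mu = mobius_sieve(M)
    a = [0] * (M * T + 1)
    for m in range(1, M + 1):
        for d in range(1, T + 1):
            a[d * m] += mu[m]
    return a


def divisor_count(n):
    """The divisor function tau(n), by trial division (fine for small n)."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


M, T = 8, 40  # toy sizes with 1 <= M <= T
a = mollified_coefficients(M, T)

assert a[1] == 1
assert all(a[n] == 0 for n in range(2, M + 1))          # no terms with 1 < n <= M
assert all(abs(a[n]) <= divisor_count(n) for n in range(1, M * T + 1))

# The surviving coefficients live in M < n <= MT; list the first few of them.
surviving = [n for n in range(M + 1, M * T + 1) if a[n] != 0]
print(surviving[:10])
```

For $1 < n \le M$ every divisor of $n$ is admissible, so $a_n$ is the full sum of $\mu$ over the divisors of $n$, which vanishes; this is the cancellation the assertions confirm.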


REMARKS. Of course, the lower bound (10.20) may not hold at points other than the zeros of $\zeta(s)$. In fact, one should expect that

(10.22)  $D_\ell(s) \ll N^{\frac12-\sigma+\varepsilon}$

for any $s$ in the region (10.12). Comparing (10.22) with (10.20) one could get a contradiction which proves the Riemann Hypothesis. Well, this argument is nothing but wishful thinking because one needs the Riemann Hypothesis to establish (10.22). These remarks reveal that the concept of a zero-detecting polynomial is somewhat superficial; it will lose its meaning when the GRH is fully established. In reality Dirichlet polynomials do not assume large values very often.

Now the theory of Dirichlet polynomials as developed in Chapter 9 tells us that (10.20) cannot happen too often, which in particular produces an interesting estimate for the number of zeros of the zeta function. Precisely, let $R_\ell$ be the number of zeros of $\zeta(s)$ in (10.12) which are detected by the polynomial $D_\ell(s)$. Then Theorem 9.7 (or Theorem 9.16) together with the closing remarks of Chapter 9 yield (10.23) by (10.20) and (10.21). Since $M \le N < MT$, this gives (10.24) for any $0 \le \ell < L$.

Choosing $M = T^{2\alpha-1}$ we get

(10.25) Adding up the $R_\ell$ we get an upper bound for all zeros of $\zeta(s)$ in (10.12) (some of these may be detected by several polynomials $D_\ell(s)$), then we extend the result to the zeros in (10.2), and conclude

THEOREM 10.1. For $\frac12 \le \alpha \le 1$ and $T > 2$ we have (10.26)

This estimate (apart from the logarithmic factor) was obtained by F. Carlson [Car] in 1920 (see also E. Landau [La3] of 1921). Twenty years later A. E. Ingham [Ing] proved

(10.27)  $N(\alpha,T) \ll T^{3(1-\alpha)/(2-\alpha)}(\log T)^5.$

Either (10.26) or (10.27) implies that the Riemann hypothesis is "almost true" in a statistical sense, namely that almost all zeros of $\zeta(s)$ lie arbitrarily close to the critical line. It follows by (10.27) that the number of zeros $\rho = \beta + i\gamma$ with $|\gamma| \le T$ and

(10.28)  $\beta > \frac12 + 3A\,\frac{\log\log T}{\log T}$

is bounded by $O(T(\log T)^{5-4A})$ for $A > 1$.

Concerning the zeros near the critical line, in 1942 A. Selberg [84] proved that

(10.29)  $N(\alpha,T) \ll (\alpha-\tfrac12)^{-1}T,$

where the implied constant is absolute. Hence, given a function $\Phi(T) \to \infty$ as $T \to \infty$, it follows that almost all zeros of the Riemann zeta function lie in the region

(10.30)  $\bigl|\beta-\tfrac12\bigr| < \frac{\Phi(T)}{\log T}.$

10.3. Breaking the zero-density conjecture. It is amazing how long the estimate (10.27) of Ingham has resisted improvements. After sixty years it is still the best known result (apart from the logarithms) in the range ~
For the zeros of $\zeta(s)$ near the line $\operatorname{Re} s = 1$ one is also helped by having shorter partial sums which approximate $\zeta(s)$. In this section we use a suitable short polynomial to establish a zero-density estimate which is even better than the zero-density conjecture for $\alpha > \frac56$.

THEOREM 10.2 (HUXLEY). For any $\alpha > \frac56$ and $T \ge 2$ we have (10.31)

$N(\alpha,T) \ll T^{\frac{3(1-\alpha)}{3\alpha-1}}(\log T)^A$

where $A = 300(\alpha-\frac56)^{-2}$ and the implied constant depends only on $\alpha$.

REMARKS. Actually Huxley proved (10.31) for any $\alpha \ge \frac34$ with $A$ and the implied constant being absolute. We could refine our arguments to get (10.31) in wider rectangles and with absolute constants, but it would obscure the presentation. Besides proving (10.31) our mission is also to introduce neatly some technical novelties due to M. Jutila [Ju4]. He realized that it was sufficient to take the mollifier $M(s)$ quite short. Then, to achieve the best results, he makes an adjustment by raising the zero-detecting polynomials to appropriate high powers. This idea of Jutila is somewhat similar to the work of D. R. Heath-Brown on his identity for primes (13.37).

By Weyl's bound (8.18) (the van der Corput bound (8.53) would also do the job) we derive

(10.32)  $\zeta(s) = \sum_{n\le X}n^{-s} + O\bigl(X^{\frac12-\sigma}T^{\frac16}\log T\bigr)$

for $s = \sigma + it$ with $\frac12 \le \sigma \le 1$, $T \le |t| \le 2T$, where $1 \le X \le T$ and the implied constant is absolute. Hence

(10.33)  $\zeta(s)M(s) = 1 + \sum_{0\le\ell<L}D_\ell(s) + E(s)$

where $1 \le M \le X$, $|a_n| \le \tau(n)$ (see (10.15) with $T$ replaced by $X$) and

(10.34)

We require

(10.35)  $|E(s)| \le \frac12$

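for $s$ in the rectangle (10.12), and this holds if (10.36)

because $T$ is assumed to be large. Now the situation is the same as in Section 10.2 except that we employ fewer zero-detecting polynomials (10.17); their length $N = 2^{\ell}M$ is restricted by $M \le N < MX$ (i.e., $0 \le \ell < L = [\log X/\log 2]$). Recall that $R_\ell$ denotes the number of zeros of $\zeta(s)$ in (10.12) satisfying (10.20). We shall estimate $R_\ell$ by an appeal to (9.31) with a suitable $k \ge 2$ rather than (9.30) which was used with $k = 1$ for deriving (10.23). In other words, we apply (9.28) for the polynomial $D_\ell(s)^k$. We choose $k \ge 2$ depending on $N$ such that $P = N^k$ falls into the segment

(10.37)  $Z < P \le (MX)^2 + Z^{\frac32},$

where $Z$ is a given number to be specified later subject to $Z \ge MX$. To see that (10.37) is possible take $k \ge 2$ such that $Z^{1/k} < N \le Z^{1/(k-1)}$, and in case $k = 2$ use also $N \le MX$. The mean-value of $|D_\ell(s)|^{2k}$ on the line $\operatorname{Re} s = \alpha$ is bounded by (10.38)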
where Z is a given number to be specified later subject to Z ~ M X. To see that (10.37) is possible take k ~ 2 such that Z 1 1k < M :( zl/(k- 11, and in case k = 2 use also N :( 1\fX. The mean-value of 1De(s)l 2k on the line Res= a is bounded by (10.38)

GZ =

(

L

lanl2n-2"Tk(n))

k

«

pl-2"(logP)4k(k+l)

N
where the implied constant depends only on $k$. Moreover, the lower bound $V = (2L)^{-1}$ of $|D_\ell(s)|$ at the zero is raised to the power $V^k = (2L)^{-k} \gg (\log T)^{-k}$. Therefore (9.31) yields

(10.39)  $R_\ell \ll \bigl(P^{2(1-\alpha)} + TP^{4-6\alpha}\bigr)(\log T)^{A_k}$

where $A_k = 6(k+1)(12k+1)$ and the implied constant depends on $k$. Here it would be perfect to choose $P = T^{1/2(2\alpha-1)}$, producing the best bound $T^{(1-\alpha)/(2\alpha-1)}(\log T)^{A_k}$, but it is not always possible because $P$ is not precisely localized (since $k$ had to be an integer!). However, we have reasonably good control of $P$ given by the interval (10.37) where $Z$ is at our disposal. Inserting the upper bound for $P$ into the first term and the lower bound for $P$ into the second term on the right side of (10.39) we get

(10.40)  $R_\ell \ll \bigl((MX)^{4(1-\alpha)} + Z^{3(1-\alpha)} + TZ^{4-6\alpha}\bigr)(\log T)^{A_k}.$

Now it is clear that the best choice is $Z = T^{1/(3\alpha-1)}$. We also take $MX = Z^{\frac34}(\log T)^9$ which yields

(10.41)  $R_\ell \ll T^{\frac{3(1-\alpha)}{3\alpha-1}}(\log T)^{A_k+6}.$



We still have some freedom to choose $M$ and $X$ subject to the condition (10.36). For the value $MX = T^{3/4(3\alpha-1)}(\log T)^9$ this condition is satisfied with

(10.42)  $M = T^{(6\alpha-5)/12(3\alpha-1)},$

(10.43)  $X = T^{(7-3\alpha)/6(3\alpha-1)}(\log T)^9.$
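The bookkeeping of exponents behind the choices of $Z$, $M$ and $X$ can be verified with exact rational arithmetic. The Python sketch below is illustrative and ignores all logarithmic factors; the function name is hypothetical, and the last identity assumes that the remainder in (10.32) has size $X^{\frac12-\alpha}T^{\frac16}$ and the mollifier has size $M^{1-\alpha}$ (so that their product should be of size $T^0$), which is an assumption about the shape of the illegible condition (10.36).

```python
from fractions import Fraction


def huxley_parameters(alpha):
    """Exponents of T (logarithms ignored) in Z, M and X for a given alpha > 5/6."""
    alpha = Fraction(alpha)
    denominator = 3 * alpha - 1
    z = 1 / denominator                            # Z = T^z
    m = (6 * alpha - 5) / (12 * denominator)       # M = T^m, see (10.42)
    x = (7 - 3 * alpha) / (6 * denominator)        # X = T^x, see (10.43)
    return z, m, x


for alpha in (Fraction(7, 8), Fraction(9, 10), Fraction(19, 20)):
    z, m, x = huxley_parameters(alpha)

    # MX = Z^(3/4) up to logarithms.
    assert m + x == Fraction(3, 4) * z

    # Z balances the last two terms: Z^(3(1-alpha)) = T * Z^(4-6*alpha).
    assert 3 * (1 - alpha) * z == 1 + (4 - 6 * alpha) * z

    # Assumed shape of the error term: M^(1-alpha) * X^(1/2-alpha) * T^(1/6) = T^0.
    assert (1 - alpha) * m + (Fraction(1, 2) - alpha) * x + Fraction(1, 6) == 0

    # The mollifier is a genuine short polynomial: 1 < M < X.
    assert 0 < m < x

    # Largest power k needed: K = 1 + [log Z / log M].
    K = 1 + int(z / m)
    assert K <= 1 + 2 / (alpha - Fraction(5, 6))
    print(f"alpha = {alpha}: Z = T^{z}, M = T^{m}, X = T^{x}, K = {K}")
```

The exponent of $M$ is positive exactly when $\alpha > \frac56$, which is where the restriction in Theorem 10.2 comes from in this arrangement.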

Finally, summing (10.41) we get (10.31) with $A = A_K + 8 = 6(K+1)(12K+1) + 8$, where $K$ is the maximal $k$ which is needed to put $P = N^k$ into (10.37) for $M \le N < MX$, i.e., $K = 1 + [\log Z/\log M] \le 1 + 2(\alpha-\frac56)^{-1}$. Hence $A \le 300(\alpha-\frac56)^{-2}$.

Having sufficiently strong estimates for exponential sums one can make a good approximation to $\zeta(s)$ by partial sums which are very short, from which one can derive relatively sharp zero-density estimates. Thus, using the Vinogradov estimate (8.92) it was shown by Halász and Turán [HaTh] that for any $\alpha$ near one

(10.44)  $N(\alpha,T) \ll T^{(1-\alpha)^{3/2}|\log(1-\alpha)|^3}.$

EXERCISE 1. Prove along the above lines that the Lindelöf Hypothesis implies

(10.45)  $N(\alpha,T) \ll T^{2(1-\alpha)+\varepsilon}$

for $\frac12 \le \alpha \le 1$ and any $\varepsilon > 0$, the implied constant depending only on $\varepsilon$.

EXERCISE 2. Assuming the Lindelöf Hypothesis and the Montgomery Conjecture (see (9.23)), prove along the above lines that

(10.46)

N(n,T)

if n > ~ for any

E

«

T'

> 0, the implied constant depending on n and

E.

10.4. Grand zero-density theorem. Now we are going to exploit the ideas of the previous section and the results of Chapter 9 for polynomials with characters in their full force. We consider characters $\chi \pmod{kq}$ with $(k,q) = 1$ such that $\chi = \xi\psi$ with $\xi$ any character to modulus $k$ and $\psi$ any primitive character of conductor $q$. Here $k$ is given while $q$ varies over the segment $1 \le q \le Q$. Moreover, we keep the points $s = \sigma + it$ in the rectangle $\sigma \ge \alpha$, $|t| \le T$, where $T \ge 3$. Let $N(\alpha,k,Q,T)$ denote the total number of zeros of all $L(s,\chi)$ counted with multiplicity subject to the above restrictions, i.e.,

$N(\alpha,k,Q,T) = \sum_{\substack{q\le Q\\ (q,k)=1}}\ \sum_{\substack{\psi \pmod q\\ \psi\ \text{primitive}}}\ \sum_{\xi \pmod k} N(\alpha,T,\xi\psi).$

Put $D = kQT$, $H = kQ^2T$ and $\mathcal{L} = 2\log D$. By the classical bound

$N(\alpha,T,\chi) \ll T\log(kqT)$

(see Theorem 5.24) it follows that $N(\alpha,k,Q,T) \ll H\mathcal{L}$, which we shall refer to as the trivial bound. In this section we shall establish various non-trivial bounds for $N(\alpha,k,Q,T)$ which are generalizations rather than improvements of the results by Ingham and Huxley (compare (10.72) with (10.27) and (10.31)).



From the previous section it is clear that if one has shorter Dirichlet polynomials approximating $L(s,\chi)$, then a stronger zero-density theorem can be deduced. The best unconditional approximation is given by the approximate functional equation, which, roughly speaking, represents $L(s,\chi)$ as a sum of two polynomials of length $X$ and $Y$ respectively, with $XY$ being the conductor. Classical formulas of this kind (see e.g. [Lav]) are quite messy, so we rather develop here a different expression for $L(s,\chi)$ of the same nature, but more friendly for the applications in mind. See also Theorem 5.3 for a general version. The quality of an approximate functional equation depends on the choice of a particular cut-off function. We have already experimented with such approximate equations when studying the reflection method in Section 9.6. Our present setting is close to that particular one. One may start from any smooth function $f$ on $\mathbb{R}^+$ such that

(10.47)  $f(0) = 1$  and  $\lim_{x\to\infty}f(x) = 0$

with $f'(x)$ having rapid decay as $x$ tends to $0$ or $\infty$. Then its Mellin transform

(10.48)  $\hat f(s) = \int_0^\infty f(x)x^{s-1}\,dx$

has a simple pole at $s = 0$ with residue 1 and $s\hat f(s)$ is an entire function. Indeed we have

(10.49)  $s\hat f(s) = -\int_0^\infty f'(x)x^s\,dx$

which shows what we claimed. Normally we would not recommend working with any specific cut-off function, but we do so here for clarity to keep track of uniformity in several variables side by side. A nice choice is

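(10.50)  $f(x) = \kappa\int_x^\infty\exp\Bigl(-y-\frac1y\Bigr)\frac{dy}{y}$

where $\kappa$ is the normalizing constant such that $f(0) = 1$. We shall see from computations below that $\kappa^{-1} = 2K_0(2) = 2\sum_{k=1}^{\infty}\Gamma'(k)\Gamma^{-3}(k) = 0.2277\ldots$. This function satisfies

(10.51)  $0 < f(x) < \kappa e^{-x}$

because $e^{-1/y} < y$, and

(10.52)  $0 < 1 - f(x) < \kappa e^{-1/x}$

because

(10.53)  $f(x) + f\Bigl(\frac1x\Bigr) = 1.$

By (10.49) for $f$ given by (10.50) we get

(10.54)  $s\hat f(s) = \kappa\int_0^\infty\exp\Bigl(-x-\frac1x\Bigr)x^{s-1}\,dx = 2\kappa K_s(2)$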


where $K_s(z)$ is the Bessel-Macdonald function. In particular, by the power series expansion of $K_s(z)$ at $z = 2$ we get

(10.55)  $s\hat f(s) = \frac{\pi\kappa}{\sin(\pi s)}\sum_{k=1}^{\infty}\frac{1}{\Gamma(k)}\Bigl(\frac{1}{\Gamma(k-s)}-\frac{1}{\Gamma(k+s)}\Bigr).$

Hence we see that $\hat f(s)$ is odd,

(10.56)  $\hat f(s) = -\hat f(-s).$
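The numerical claims about this cut-off function are easy to test. The Python sketch below is an illustration, not part of the text's argument: it uses the substitution $y = e^u$, which turns the weight $\exp(-y - 1/y)\,dy/y$ into $\exp(-2\cosh u)\,du$, and a plain midpoint rule; the truncation point, step count and function names are arbitrary choices made here. The series uses $\Gamma'(k+1)/\Gamma(k+1)^3 = (H_k - \gamma)/(k!)^2$ with $H_k$ the harmonic numbers.

```python
from math import cosh, exp, factorial, log

EULER_GAMMA = 0.5772156649015329
CUTOFF = 6.0  # exp(-2 cosh 6) is far below double precision


def integrate(lower, upper, steps=20000):
    """Midpoint rule for the integral of exp(-2 cosh u) over [lower, upper]."""
    h = (upper - lower) / steps
    return h * sum(exp(-2.0 * cosh(lower + (i + 0.5) * h)) for i in range(steps))


def kappa_inverse_series(terms=25):
    """2 * sum_{k>=0} (H_k - gamma) / (k!)^2, the series for 2 K_0(2)."""
    total, harmonic = 0.0, 0.0
    for k in range(terms):
        if k > 0:
            harmonic += 1.0 / k
        total += (harmonic - EULER_GAMMA) / factorial(k) ** 2
    return 2.0 * total


KAPPA_INVERSE = integrate(-CUTOFF, CUTOFF)  # = integral of exp(-y - 1/y) dy / y


def f(x):
    """The cut-off function (10.50), for x not too close to 0 or infinity."""
    return integrate(log(x), CUTOFF) / KAPPA_INVERSE


# The quadrature and the series agree, and both start with 0.2277...
assert abs(KAPPA_INVERSE - kappa_inverse_series()) < 1e-8
assert abs(KAPPA_INVERSE - 0.22779) < 1e-4

# The symmetry (10.53) and the upper bound in (10.51).
for x in (0.3, 1.0, 2.5):
    assert abs(f(x) + f(1.0 / x) - 1.0) < 1e-6
    assert 0.0 < f(x) < exp(-x) / KAPPA_INVERSE

print(f"1/kappa = {KAPPA_INVERSE:.7f}")
```

In the variable $u$ the symmetry (10.53) is simply the evenness of $\cosh u$, which is also why $K_s(2)$ is even in $s$ and $\hat f(s)$ is odd.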

Moreover, we get the uniform bound for $s = \sigma + it \in \mathbb{C}$, (10.57) where the implied constant is absolute.

Now we are ready to derive the approximate formula for $L(s,\chi)$ in question. We assume that $s = \sigma + it$ with $\frac12 < \sigma < 1$ and $\chi$ is a character of modulus $\ell \ge 1$. To refresh your memory, we recommend reading again the arguments from (9.67) to (9.75) in the previous chapter. We begin by evaluating the sum

(10.58)  $B(s,\chi) = \sum_{n}\chi(n)n^{-s}f\Bigl(\frac nX\Bigr).$

By contour integration and the functional equation (9.69) we get

$B(s,\chi) = \frac{1}{2\pi i}\int_{(1)}L(s+u,\chi)X^u\hat f(u)\,du$

$\qquad\quad = \delta_\chi\frac{\varphi(\ell)}{\ell}X^{1-s}\hat f(1-s) + L(s,\chi) + \frac{1}{2\pi i}\int_{(-1)}L(s+u,\chi)X^u\hat f(u)\,du.$

Then by the functional equation (9.69) and by (10.56) we find that the integral on the line $\operatorname{Re} u = -1$ is equal to $-\varepsilon_\chi B^*(s,\chi)$ where

$B^*(s,\chi) = \frac{1}{2\pi i}\int_{(1)}\gamma(s-u)P(s-u)L(1-s+u,\bar\chi)X^{-u}\hat f(u)\,du.$

Here we expand $L(1-s+u,\bar\chi)$ into Dirichlet series and integrate each term, getting

(10.59)  $B^*(s,\chi) = \sum_m\bar\chi(m)m^{s-1}g(mX)$

where

(10.60)  $g(y) = \frac{1}{2\pi i}\int_{(1)}\gamma(s-u)P(s-u)y^{-u}\hat f(u)\,du.$

Gathering the above results we obtain the desired expression

(10.61)  $L(s,\chi) = B(s,\chi) + \varepsilon_\chi B^*(s,\chi) - \delta_\chi\frac{\varphi(\ell)}{\ell}X^{1-s}\hat f(1-s)$

where $B(s,\chi)$ and $B^*(s,\chi)$ are the reduced series given by (10.58) and (10.59) with arbitrary $X > 0$.


It is clear that $B(s,\chi)$ runs essentially over $n \ll X$, and we show that $B^*(s,\chi)$ runs essentially over $m \ll Y$, where $Y$ is somewhat larger than $\ell|s|X^{-1}$. To this end recall the estimates

(10.62)  $|P(s)| \le \ell^{\frac12-\operatorname{Re} s},$

(10.63)  $\gamma(s) \ll |s|^{\frac12-\operatorname{Re} s}$

which hold uniformly in the half-plane Res :( ~, and the estimate (10.57) which holds for all s. Hence, moving the integration (10.60) sufficiently far to the right, say to the vertical line

(10.64) one derives

f\s\ ( --1 ( - y ) g(y) « -exp 5

(10.65)

f\s\

y

~)

where the implied constant is absolute. Hence $g(mX)$ is very small if $mX$ is somewhat larger than $\ell|s|$. Precisely, we get by (10.65)

(10.66)  $B^*(s,\chi) = \sum_{m\le Y}\bar\chi(m)m^{s-1}g(mX) + O\Bigl(\frac{1}{XY}\Bigr)$

provided

(10.67)

Next we write (10.60) as

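(10.68)  $g(y) = \frac{1}{2\pi i}\int_{(\eta)}\gamma(s-2\sigma+u)P(s-2\sigma+u)\,y^{u-2\sigma}\hat f(2\sigma-u)\,du$

by changing $u$ into $2\sigma - u$ and moving the integration to the line $\operatorname{Re} u = \eta$, where $1 < \eta < 2\sigma$. Then we insert (10.68) into (10.66) getting

(10.69)  $B^*(s,\chi) = \frac{1}{2\pi i}\int_{(\eta)}\Bigl(\sum_{m\le Y}\bar\chi(m)m^{-\bar s+u-1}\Bigr)W(u)\,du + O\Bigl(\frac{1}{XY}\Bigr)$

where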

W(u) = r(s- 2a + u)P(s- 2a + u)X"- 2 " }(2a- u). By (10.63), (10.62), (10.57) we get for u = '1/ + iv, W(u)

We assume X 2

;;,

«

((\s\

«

(2a- ry)- 1 (~~) ~+"-

+ jvi)£)~+"-'7X1l- 2 "(2a- ry)- 1 e-~lvl 11

x 1 -1le-lvl.

£js\ and '1/ = n +~'getting for a;;, n,

(10.70) Hence (10.69) yields

(10.71) B*(s,x)

«

(n-

~)- 1 x~-Q

1 IL x(m)m-•+<>-~+ivle-lvldv + -XY 00

-oomq

~

..


Before going further let us state what we have proved.

LEMMA 10.3. Let $\chi$ be a character modulo $\ell$ and $s$ a complex number with $\operatorname{Re} s \ge \alpha > \frac12$. Choose $X \ge (\ell|s|)^{\frac12}$ and $Y$ satisfying (10.67). Then we have (10.61) where $B(s,\chi)$ is given by (10.58) with $f$ as in (10.50) and $B^*(s,\chi)$ satisfies (10.71) with the implied constant being absolute.

Now we proceed to the proof of our main result, which is the

GRAND DENSITY THEOREM 10.4. Let ~ 0. Then we have (10.72) where

(10.73)  $c(\alpha) = \min\Bigl(\frac{3}{2-\alpha},\ \frac{3}{3\alpha-1}\Bigr).$

Here $D = kQT$, $H = kQ^2T$ and $\mathcal{L} = 2\log D$. The exponent $A$ in the power of the logarithm and the implied constant depend only on $\alpha$ and $\varepsilon$.
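To see how the exponent $c(\alpha)$ of (10.73) compares with the value 2 of the density conjecture, one can tabulate it exactly. The Python sketch below is only an illustration; the grid of values of $\alpha$ is an arbitrary choice.

```python
from fractions import Fraction


def c(alpha):
    """The exponent c(alpha) = min(3/(2 - alpha), 3/(3 alpha - 1)) of (10.73)."""
    alpha = Fraction(alpha)
    return min(3 / (2 - alpha), 3 / (3 * alpha - 1))


# Sample points 1/2, 1/2 + 1/200, ..., 1 (an arbitrary grid).
grid = [Fraction(1, 2) + Fraction(j, 200) for j in range(101)]

# The two expressions cross at alpha = 3/4, where c attains its maximum 12/5.
worst = max(grid, key=c)
assert worst == Fraction(3, 4)
assert c(worst) == Fraction(12, 5)

# The exponent drops to the conjectured value 2 at alpha = 5/6 and stays below it.
assert c(Fraction(5, 6)) == 2
assert all(c(alpha) <= 2 for alpha in grid if alpha >= Fraction(5, 6))
assert all(c(alpha) > 2 for alpha in grid if Fraction(1, 2) < alpha < Fraction(5, 6))

for alpha in (Fraction(1, 2), Fraction(3, 4), Fraction(5, 6), Fraction(9, 10), Fraction(1)):
    print(f"alpha = {alpha}: c(alpha) = {c(alpha)}")
```

The first expression in the minimum is the exponent of Ingham type and governs $\alpha \le \frac34$; the second is of Huxley type and governs $\alpha \ge \frac34$.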

REMARK. One could remove $\varepsilon$ and assert that both $A$ and the implied constant are absolute; however, such a refinement is technically very complicated to make along our lines (try the original arrangements by H. L. Montgomery [Mo2]). Note that (10.72) is essentially as strong as (10.7) in the range ~ ~, very few $L$-functions of primitive characters modulo $q \le Q$ have a zero in the rectangle (10.1).

In what follows $\chi$ runs over characters of modulus $\ell = kq$ with $q \le Q$ and $s = \sigma + it$ is a complex variable restricted to the rectangle $\alpha \le \sigma \le 1$, $|t| \le T$. Furthermore, if $\chi$ is principal we assume that $|t| \ge \mathcal{L}^2$, because otherwise the result is trivial. For the proof of Theorem 10.4 we use (10.61) with (10.74) The residual term in (10.61) is small

by virtue of (10.57), and the series (10.58) can be reduced to $n \le Y$ up to a correction term $O(D^{-1})$ by virtue of (10.51). We obtain (10.75)

L(s, X)=

L x(n)n-s f(~) n.(Y

Multiply this by the polynomial

(10.76)  $M(s,\chi) = \sum_{m\le M}\mu(m)\chi(m)m^{-s}$



where 1 ( M ( D~ getting L(s,x)M(s,x)

(10.77)

L

=

anx(n)n-s

n(.AJY

where the coefficients are given by (10.78)

an=

2.: ~t(mJJ(~ ).

dtn=n d()-,m(.A!

(10.79)

an(v)

=

~t(mJ(tr-~+iv,

2.:

drn=n d(.Y,m(.AI

so

JanJ (

T(n) and

Jan(v)J (

T(n). For n (AI we have more precise estimates

drn=n

mjn

The first yields a 1 = 1 + O(D- 2 ) and an gives

IL

an(v)x(n)n-sl (

y~-a

n(.AJ

«

L

D- 2 if 1 < n ( M, while the latter

T(n)n-~ « y!-<>11!! log2111.

n(.A!

We want this to be bounded by $\mathcal{L}^{-2}$, which is the case by assuming (10.80)

By the above estimates we reduce ( 10. 77) to L(s,x)M(s,x)

(10.81)

= 1+

L

anx(n)n-s

We want to treat the first sum and the integral of the second sum in one way. To this end we combine the sum and the integral into one integral with respect to the measure (10.82)

where $dv$ is the Lebesgue measure on $\mathbb{R}$ and $\delta(v)$ is the point measure at $v = 0$, the factor $\frac12$ being introduced for the normalization

(10.83)  $\int_{-\infty}^{\infty}d\mu = 1.$



In this notation we write (10.81) (10.84)

L(s, x)M(s, x)- 1

ru;

one inequality

«£[YO I

L

an(v)x(n)n-•ldfL(v)

+C

1

-oo Al
where for $v = 0$ we redefine $a_n(0)$ to be $a_n$. From now on we only need to know that $a_n(v)$ does not depend on $s$, $\chi$ and that $|a_n(v)| \le \tau(n)$. For convenience we also set $a_n(v) = 0$ if $n \le M$ or $n > MY$. Recall that (10.84) holds for $s$ in the rectangle $R(\alpha,T)$ and a character $\chi$ modulo $kq$ with $q \le Q$, and in case $\chi$ is principal we require $|t| \ge \mathcal{L}^2$, where $\mathcal{L} = 2\log D$ and $D = kQT$. In particular, if $s = \rho$ is a zero of $L(s,\chi)$ in this region, we get from (10.84)

$\int_{-\infty}^{\infty}\Bigl|\sum_{M<n\le MY}a_n(v)\chi(n)n^{-\rho}\Bigr|\,d\mu(v) \gg \mathcal{L}^{-1}$
provided $D$ is sufficiently large (which condition can be assumed, or else the goal is trivial). As in Section 10.3 we break the summation into dyadic segments $N < n \le 2N$ with $N = 2^{\ell}M$, $0 \le \ell < L = [\log Y/\log 2]$. Denote

$$D_\ell(s,\chi) = \int_{-\infty}^{\infty}\Big|\sum_{N<n\le 2N} a_n(v)\chi(n)n^{-s}\Big|\,d\mu(v). \tag{10.85}$$
For every zero $\rho$ being counted there exists $\ell$ such that (10.86) Let $R_\ell$ denote the number of zeros satisfying (10.86). Then the total number of zeros is

$$R \le \sum_{\ell} R_\ell \le L\max_{\ell} R_\ell.$$

Raising $D_\ell(s,\chi)$ to a suitable power $2k$ with $k \ge 2$ (depending on $N$) we obtain

with $P = N^k$ which falls into the segment

$$Z \le P \le (MY)^2 + Z^{3/2} \tag{10.87}$$

where Z is any fixed number to be chosen later, subject to MY::;:; Z::;:; H. At this point we require M ;? D
L

lbn(v)l2n-2o::;:; pl-2o£A

P
where A depends only on k. Adding up these inequalities for the points (p, x) satisfying (10.86) we obtain

r

I



Now we are ready to use two results from Chapter 9, namely Theorem 9.13 and Theorem 9.15 (see also the closing remarks in Chapter 9 concerning the spacing of points p). Applying Theorem 9.13 we get

(10.88) and by Theorem 9.15

the latter giving

(10.89) (here H = kQ 2 T, and A is a constant depending on k which may be different in each appearance). A direct comparison of (10.88) and (10.89) shows that one is stronger than the other if~
+E,;;; a<~· Re

«

Then introducing P from (10.87) into (10.88) one gets

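$$\big((MY)^{4(1-\alpha)} + Z^{3(1-\alpha)} + HZ^{1-2\alpha}\big)\mathcal{L}^A.$$

Clearly the best choice is $Z = H^{1/(2-\alpha)}$. We also assume $MY \le Z^{3/4}$, getting

$$R_\ell \ll H^{\frac{3(1-\alpha)}{2-\alpha}}\mathcal{L}^A. \tag{10.90}$$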

It remains to verify whether the restriction MY ,;;; Z~ does not contradict the conditions previously imposed. For Y = D~£ 2 , and since D,;;; H, this restriction is verified with M = H 4~~-~~~ C 2 • Note that M satisfies (10.80), M;? D'/ 4 and obviously MY,;;; Z. The exponent A and the implied constant in (10.90) depend only on E. Since E is arbitrary, Theorem 10.4 is established in the range~
Re

«

((.111Y)4(1-a)

Now the best choice is Z = H

Re For M

+ z3(1-a) + Hz4-6")£A. giving

((D.llf2)2(1-a)

+ H3;~~~~ )£A.

= D'/ 4 this establishes Theorem 10.4 in the range

EXERCISE

(10.91)

«

11( 3 "-l)

3. Show that for ~ < a ,;;; 1 and

N(a, k, Q, T)

«

(D(3+<)(1-n)

E

~ ,;;; a ,;;; 1.

> 0,

+ Hb(a)(l-a))£A,

where

(10.92)

4 8 -). 5- 2a 5a- 2

b(a) = min(-- , -

[Hint: Raise the polynomial De(s, x) to a power 2k with k ;? 3 so that Z,;;; P,;;; (MY)3 + z4/3.J



EXERCISE 4. Show that for ~

N(a, k, Q, T)

(10.93)

«

< a:

,;;; 1 and E

(D(H<)(!-a)

> 0,

+ Ha(a)(l-a)).CA

where 5 -) . a(a) =min ( - -5 - , - 3 -a 7a- 3

(10.94)

[Hint: Raise the polynomial De(s,x) to a power 2k with k;? 4 so that Z,;;; P,;;; + zs/4 .]

(11IY)4

REMARKS. For $\alpha$ near 1, the estimates (10.91) and (10.93) are stronger than (10.72) in terms of $Q^2$, but not in terms of $kT$. In particular, if $kT \le Q^{\varepsilon}$, then (10.93) yields

N(a, k, Q, T)

« Q4 (!-a)+c

f1 ,; ;

in the range a ,;;; 1, which is essentially the density conjecture. By using his Theorem 9.10, M. Jutila [Jut] established the density conjecture for zeros of ((s) in the same range. Every exponent a(a),b(a),c(a) takes its maximum at a=~, being a(~) = ~,b(~) = lf, c(~) = For a= 1 we get a(1) = ~, b(a) = c(1) = ~, and for a= ~ all three exponents take value 2.

¥

£,

10.5. The gaps between primes. This section is written for the purpose of giving an example of how the zero-density theorems can be applied. We have chosen a few questions about primes in short intervals, because they were inspirational for the development of density theorems in the first place, and for other statistical studies of zeros of $\zeta(s)$. By no means do we attempt to show the best results from the vast literature. By the Prime Number Theorem,

$$\psi(x) = \sum_{n \le x}\Lambda(n) = x + E(x),$$

where $E(x)$ is a suitable error term, it follows directly that

$$\psi(x+y) - \psi(x) \sim y \tag{10.95}$$

as $x \to \infty$, provided $y = y(x)$ is somewhat larger than $E(x)$. Hence there is a prime number in the short interval $(x, x+y]$ for all sufficiently large $x$. Without improving the existing error term in the Prime Number Theorem, G. Hoheisel [Ho] succeeded, nevertheless, in showing in 1930 that (10.95) holds for $y = x^{\theta}$ with some absolute constant $\theta < 1$. This is an impressive achievement given that an error term as good as $E(x) = O(x^{\theta})$ seems to be far beyond the current technology. Recall that the latter bound translates into the non-vanishing of $\zeta(s)$ in $\mathrm{Re}\, s > \theta$. However, Hoheisel required only two results which were available in his time. The first is a zero-free region of $\zeta(\sigma + it)$ of the type

$$|t| \le T, \qquad \sigma \ge 1 - B\,\frac{\log\log T}{\log T} \tag{10.96}$$

for some constant $B > 0$ and all $T$ sufficiently large. The second ingredient is the zero-density estimate of type (10.97)

$$N(\alpha, T) \ll T^{c(1-\alpha)}(\log T)^A$$

for all $\frac12 \le \alpha \le 1$ with some constants $c \ge 2$ and $A \ge 1$. From these results one derives

THEOREM 10.5. Put $\theta = 1 - (c + (A+1)/B)^{-1}$. Then

$$\psi(x+y) - \psi(x) = y + O\Big(\frac{y}{\log x}\Big) \tag{10.98}$$

for all $y$ with $x^{\theta}(\log x)^3 \le y \le x$.

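PROOF. One uses the approximate "explicit formula" (see Section 5.9)

$$\psi(x) = x - \sum_{|\gamma| \le T}\frac{x^{\rho}}{\rho} + O\Big(\frac{x}{T}(\log x)^2\Big)$$

with $T = x^{1-\theta}$. Hence

$$\frac{\psi(x+y) - \psi(x)}{y} - 1 = -\sum_{|\gamma| \le T}\frac{(x+y)^{\rho} - x^{\rho}}{\rho y} + O\Big(\frac{1}{\log x}\Big).$$

Here the sum over the zeros is bounded by

$$\sum_{|\gamma| \le T} x^{\beta - 1} \le -2\int_{1/2}^{1} x^{\alpha-1}\,dN(\alpha,T) \le 2x^{-1/2}N(\tfrac12, T) + 2(\log x)\int_{1/2}^{1} x^{\alpha-1}N(\alpha,T)\,d\alpha$$

$$\ll x^{-1/2}T\log T + (\log x)(\log T)^A\int_{1/2}^{1-\eta}(T^c/x)^{1-\alpha}\,d\alpha \ll x^{-1/2}T\log T + (T^c/x)^{\eta}(\log T)^A$$

where $\eta = B(\log\log T)/\log T$. Since

$$(T^c/x)^{\eta} = (\log T)^{(c - \frac{1}{1-\theta})B} = (\log T)^{-A-1},$$

the sum over zeros is $O(1/\log x)$, completing the proof of Theorem 10.5. □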
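The dependence of the Hoheisel exponent on the constants $c$, $A$, $B$ can be tabulated with exact rational arithmetic. The following is only a sketch: the function names are hypothetical, and the piecewise form of $c(\alpha)$ (namely $3/(2-\alpha)$ up to $\alpha = \frac34$ and $3/(3\alpha-1)$ beyond) is an assumption read off from the exponents appearing in the proof of Theorem 10.4.

```python
from fractions import Fraction


def c_exponent(alpha):
    """Assumed density exponent c(alpha) for 1/2 <= alpha <= 1 (piecewise form is an assumption)."""
    alpha = Fraction(alpha)
    if alpha <= Fraction(3, 4):
        return 3 / (2 - alpha)
    return 3 / (3 * alpha - 1)


def hoheisel_theta(c, A, B):
    """theta = 1 - (c + (A + 1)/B)^(-1), the exponent of Theorem 10.5."""
    c, A, B = Fraction(c), Fraction(A), Fraction(B)
    return 1 - 1 / (c + (A + 1) / B)


def limiting_theta(c):
    """Limit of theta as B grows without bound, i.e. 1 - 1/c."""
    return 1 - 1 / Fraction(c)


# Largest value of c(alpha) on a grid of step 1/100 covering [1/2, 1].
grid = [Fraction(k, 100) for k in range(50, 101)]
c_max = max(c_exponent(a) for a in grid)
print(c_max, limiting_theta(c_max), limiting_theta(2))
```

Exact fractions are used so that the limiting exponent is not blurred by floating-point rounding; the $+\varepsilon$ in the text accounts for taking $B$ large but finite.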

Hoheisel used the zero-free region (10.96) with a positive but very small constant $B$ due to J. E. Littlewood [Lit], and the zero-density estimate (10.97) with exponent $c = 4$ due to F. Carlson [Car] (see (10.26)). After Vinogradov's widening of the region (10.96) (see Corollary 8.28) one can take $B$ arbitrarily large, so $\theta = 1 - c^{-1} + \varepsilon$ satisfies the conditions of Theorem 10.5. From now on the Hoheisel exponent $\theta$ depends only on $c$ in the zero-density estimate. Note that one needs (10.97) to hold throughout the segment $\frac12 \le \alpha \le 1$ with the same $c$, so an improvement in a subrange does not help. In other words, all one gets from (10.6) is $c = \max c(\alpha)$. By the Grand Density Theorem 10.4 for $\zeta(s)$ one gets (10.97) with $c = 12/5$ (the maximum of $c(\alpha)$ is attained at $\alpha = \frac34$), which yields (10.98) for $x^{\theta} \le y \le x$ with $\theta = \frac{7}{12} + \varepsilon$. This is the best result of its kind obtained so far (in 1972 by M. N. Huxley [Hu3]). The density conjecture $c = 2$ yields $\theta = \frac12 + \varepsilon$. Various combinations of the above analytic arguments with sieve methods produced estimates

$$y \ll \psi(x+y) - \psi(x) \ll y \tag{10.99}$$


in place of the asymptotic formula (10.95), however, for shorter intervals. For example, R. Baker and G. Harman [BH] got (10.99) for $y = x^{\theta}$ with $\theta = 0.534$. From (10.99) with $y = x^{\theta}$ it follows that the difference between consecutive primes satisfies

$$d_n = p_{n+1} - p_n \ll p_n^{\theta}. \tag{10.100}$$

Therefore we have (10.100) with $\theta = 0.534$ as a consequence of the Baker-Harman work. Recall that the Riemann hypothesis yields $d_n \ll p_n^{1/2}\log p_n$, but even on the Pair Correlation Conjecture (see Chapter 25), the best that has been achieved is (due to D. Goldston and D. R. Heath-Brown [GHB])

$$d_n \ll (p_n\log p_n)^{1/2}. \tag{10.101}$$

Creating a probabilistic model for primes in 1937, H. Cramér [Cra] was led to a conjecture that (10.102)

In the other direction R. Rankin [Ra2] showed by constructing special composite numbers that

$$d_n \ge (e^{\gamma} - \varepsilon)(\log p_n)(\log\log p_n)(\log\log\log\log p_n)(\log\log\log p_n)^{-2} \tag{10.103}$$

infinitely often, where $\gamma$ is the Euler constant. Paul Erdős offered a prize of $10,000 to anyone who can replace $e^{\gamma}$ in (10.103) by a function increasing to infinity (the largest prize ever offered by Erdős for a solution to a mathematical problem).

It is conjectured that the normalized gaps between consecutive primes $d_n^* = (p_{n+1} - p_n)(\log p_n)^{-1}$ have a Poisson distribution, that is, for any $t > 0$,

$$\lim_{x\to\infty}\frac{1}{x}\,\big|\{n \le x;\ d_n^* \le t\}\big| = 1 - e^{-t}. \tag{10.104}$$

Many estimates for $\psi(x+y) - \psi(x)$ were established on average with respect to $x$.
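The conjectured distribution (10.104) can be explored numerically. The sketch below uses hypothetical helper names, with primes taken up to a fixed bound instead of the first $x$ primes. It computes the empirical proportion of normalized gaps $d_n^* \le t$, to be compared by eye with $1 - e^{-t}$. It is an experiment only and proves nothing about the limit.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit (limit >= 2)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting from i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [n for n in range(limit + 1) if sieve[n]]


def gap_distribution(limit, t):
    """Proportion of consecutive primes p < p' <= limit with (p' - p)/log p <= t."""
    ps = primes_up_to(limit)
    gaps = [(q - p) / math.log(p) for p, q in zip(ps, ps[1:])]
    return sum(g <= t for g in gaps) / len(gaps)


for t in (0.5, 1.0, 2.0):
    print(t, gap_distribution(10**5, t), 1 - math.exp(-t))
```

At such small heights the agreement is only rough, since $\log p_n$ varies appreciably over the range and the convergence in (10.104) is expected to be slow.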

EXERCISE 5. Assuming (10.97) with $c \ge 2$ prove that (10.105) for $X^{\theta} \le y \le X$ with $\theta = 1 - 2c^{-1} + \varepsilon$, for any $\varepsilon > 0$ and any $A > 0$, the implied constant depending only on $\varepsilon$ and $A$. In particular, by taking $c = 12/5$ deduce that (10.95) holds true with $y = x^{\theta}$ for almost all $x$, where $\theta > \frac16$ is a fixed number.

We close this section by pointing out a formal similarity between the problems of primes in short intervals and primes in arithmetic progressions to large moduli.


EXERCISE 6. Assume the following estimates for the zeros of Dirichlet $L$-functions with characters $\chi \pmod q$:

(1) $L(s,\chi) \ne 0$ in the region $s = \sigma + it$ with $|t| \le T$ and

$$\sigma \ge 1 - B\,\frac{\log\log qT}{\log qT} \tag{10.106}$$

where $B$ is a positive constant, provided $qT$ is sufficiently large,

(2) The number $N(\alpha, q, T)$ of zeros of all $L(s,\chi)$ with $\chi \pmod q$ in the rectangle $|t| \le T$, $\sigma \ge \alpha$ satisfies

$$N(\alpha, q, T) \ll (qT)^{c(1-\alpha)}(\log qT)^A \tag{10.107}$$

for $\frac12 \le \alpha \le 1$, $T \ge 3$, where $c \ge 2$ and $A \ge 1$ are suitable constants.

Prove that (10.108) uniformly for $x \ge q^{\theta}(\log q)^3$ with $\theta = c + (A+1)/B$. See Chapter 17 for unconditional results.

CHAPTER 11

SUMS OVER FINITE FIELDS

11.1. Introduction. In this chapter we consider a special type of exponential and character sums, sometimes called "complete sums", which can be seen as sums over the elements of a finite field. Although the methods of Chapter 8 can still be applied to the study of such sums, disregarding this special feature, the deepest understanding and the strongest results are obtained when the finite field aspect is taken into account and the powerful techniques of algebraic geometry are brought to bear. We have already encountered in the previous chapters some examples of exponential sums which can be interpreted as sums over finite fields, for example, the quadratic Gauss sums

$$G_a(p) = \sum_{x \bmod p}\Big(\frac{x}{p}\Big)e\Big(\frac{ax}{p}\Big)$$

or the Kloosterman sums (1.56)

$$S(a,b;p) = \sum^{*}_{x \bmod p} e\Big(\frac{ax + b\bar{x}}{p}\Big).$$

In this chapter we will study these sums in particular. The culminating point of our presentation is the elementary method of Stepanov which we apply for proving Weil's bound for Kloosterman sums

$$|S(a,b;p)| \le 2\sqrt{p}$$

and Hasse's bound for the number of points of an elliptic curve over a finite field. Then we survey briefly, without proofs, the powerful formalism of $\ell$-adic cohomology developed by Grothendieck, Deligne, Katz, Laumon and others, hoping to convey a flavor of the tools involved and to give the reader enough knowledge to make at least a preliminary analysis of any exponential sum he or she may encounter in analytic number theory.

11.2. Finite fields. We first recall briefly some facts about finite fields, and establish the notation used in this chapter. For every prime $p$, the finite ring $\mathbb{Z}/p\mathbb{Z}$ of residue classes modulo $p$ is a field, which we denote $\mathbb{F}_p$. The Galois theory of $\mathbb{F}_p$ is very easy to describe: for any $n \ge 1$, there exists a unique (up to isomorphism) field extension of $\mathbb{F}_p$ of degree $n$, written $\mathbb{F}_{p^n}$. Conversely, any finite field $\mathbb{F}$ with $q$ elements is isomorphic (but not canonically) to a unique field $\mathbb{F}_{p^d}$, so $q = p^d$, and $\mathbb{F}$ admits also a unique finite extension of degree $n$ for any $n \ge 1$, namely $\mathbb{F}_{p^{dn}}$. Let now $\mathbb{F} = \mathbb{F}_q$ be a finite field with $q = p^d$ elements. In most of the chapter, $p$ is fixed and we change notation slightly, denoting by $\bar{\mathbb{F}}$ an algebraic closure of $\mathbb{F}$ and by $\mathbb{F}_n \subset \bar{\mathbb{F}}$ the unique extension of degree $n$ of $\mathbb{F}$ for $n \ge 1$. The context will always indicate clearly that the cardinality of $\mathbb{F}_n$ is $q^n$ and not $n$. The extension $\mathbb{F}_n/\mathbb{F}$ is a Galois extension, with Galois group $G_n$ canonically isomorphic to $\mathbb{Z}/n\mathbb{Z}$, the isomorphism being the map $\mathbb{Z}/n\mathbb{Z} \to G_n$ defined by $1 \mapsto \sigma$, where $\sigma$ is the Frobenius automorphism of $\mathbb{F}_n$ given by $\sigma(x) = x^q$. Let $\bar{\mathbb{F}}$ be a given algebraic closure of $\mathbb{F}$, so by the above,

$$\bar{\mathbb{F}} = \bigcup_{n \ge 1}\mathbb{F}_n.$$

By Galois theory, for any $x \in \bar{\mathbb{F}}$, we have $x \in \mathbb{F} \iff \sigma(x) = x$, that is, if and only if $x^q = x$, and more generally

$$x \in \mathbb{F}_n \iff \sigma^n(x) = x \iff x^{q^n} = x. \tag{11.1}$$

From this we can deduce that $\mathbb{F}_n$ is the splitting field of the polynomial $X^{q^n} - X \in \mathbb{F}[X]$. More precisely, one can state the following result of Gauss:

LEMMA 11.1. For any integer $n \ge 1$, we have

$$\prod_{d \mid n}\ \prod_{\deg(P) = d} P = X^{q^n} - X \tag{11.2}$$

where the product ranges over all irreducible monic polynomials $P$ of degree $d$ dividing $n$.

PROOF. This is an immediate consequence of the description of finite fields: the roots of the polynomial on the right side (in an algebraic closure) are exactly the elements $x \in \mathbb{F}_n$ with multiplicity one and, conversely, every such $x$ has a minimal polynomial which must occur, exactly once, among the polynomials $P$ on the left side. □
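Comparing degrees in (11.2) gives $\sum_{d \mid n} d\,I_q(d) = q^n$, where $I_q(d)$ denotes the number of monic irreducible polynomials of degree $d$ over $\mathbb{F}_q$ (the notation $I_q$ is introduced here only for illustration). A minimal brute-force check for $q = 2$, with polynomials encoded as bitmasks and hypothetical function names, might look as follows.

```python
def gf2_mul(a, b):
    """Multiply two polynomials over F_2, each encoded as an integer bitmask."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition of coefficients in F_2 is XOR
        a <<= 1
        b >>= 1
    return result


def irreducible_counts(max_degree):
    """Count irreducible polynomials over F_2 of each degree 1..max_degree by sieving products."""
    limit = 1 << (max_degree + 1)  # bitmasks of polynomials of degree <= max_degree
    reducible = set()
    for a in range(2, limit):  # a, b run over polynomials of degree >= 1
        for b in range(a, limit):
            if a.bit_length() + b.bit_length() - 2 > max_degree:
                break  # deg(a) + deg(b) too large, and it only grows with b
            reducible.add(gf2_mul(a, b))
    counts = {d: 0 for d in range(1, max_degree + 1)}
    for h in range(2, limit):
        if h not in reducible:
            counts[h.bit_length() - 1] += 1
    return counts


def degree_identity_holds(n, counts, q=2):
    """Check sum_{d | n} d * I_q(d) == q^n, the degree comparison in (11.2)."""
    return sum(d * counts[d] for d in range(1, n + 1) if n % d == 0) == q ** n


counts = irreducible_counts(6)
print(counts, all(degree_identity_holds(n, counts) for n in range(1, 7)))
```

Over $\mathbb{F}_2$ every non-zero polynomial is monic, which is why no normalization of leading coefficients is needed in this sketch.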

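Associated to the extension $\mathbb{F}_n/\mathbb{F}$ are the trace map and the norm map. Because of the above description of the Galois group of $\mathbb{F}_n/\mathbb{F}$, the trace map $\mathrm{Tr} = \mathrm{Tr}_{\mathbb{F}_n/\mathbb{F}} : \mathbb{F}_n \to \mathbb{F}$ is given by

$$\mathrm{Tr}(x) = \sum_{0 \le i \le n-1}\sigma^i(x) = \sum_{0 \le i \le n-1} x^{q^i} \tag{11.3}$$

while the norm map $N = N_{\mathbb{F}_n/\mathbb{F}} : \mathbb{F}_n^* \to \mathbb{F}^*$ is similarly

$$N(x) = \prod_{0 \le i \le n-1}\sigma^i(x) = \prod_{0 \le i \le n-1} x^{q^i}. \tag{11.4}$$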

The equations $\mathrm{Tr}(x) = y$ and $N(x) = y$, for a fixed $y \in \mathbb{F}$, are very important. Because the extension $\mathbb{F}_n/\mathbb{F}$ is separable, the equation $\mathrm{Tr}(x) = y$ always has a solution. If $x_0$ is a given solution, then all solutions are in one-to-one correspondence with solutions of $\mathrm{Tr}(a) = 0$, by $x = x_0 + a$. Moreover, any solution of $\mathrm{Tr}(a) = 0$ is of the form $a = \sigma(b) - b = b^q - b$ for some $b \in \mathbb{F}_n$, unique up to addition of an element in $\mathbb{F}$. Similarly, for any $y \in \mathbb{F}^*$, the equation $N(x) = y$ has a solution, and if $x_0$ is a given solution, the set of solutions is in one-to-one correspondence with solutions of $N(a) = 1$, which by Hilbert's Theorem 90 (or by direct proof) are all given by $a = \sigma(b)b^{-1} = b^{q-1}$ for some $b \in \mathbb{F}_n^*$, unique up to multiplication by an element in $\mathbb{F}^*$.

As the additive group of $\mathbb{F}$ is finite, the general theory of characters of a finite abelian group (see Chapter 3) can be applied. Characters of $\mathbb{F}$ are called additive characters, and they are all of the form $x \mapsto \psi(ax)$ for some $a \in \mathbb{F}$, where $\psi$ is some fixed non-trivial additive character. For instance, let $\mathrm{Tr} : \mathbb{F} \to \mathbb{Z}/p\mathbb{Z}$ be the trace map to the base-field, then

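$$\psi(x) = e(\mathrm{Tr}(x)/p) \tag{11.5}$$

is a non-trivial additive character of $\mathbb{F}$. For a given additive character $\psi$ and $a \in \mathbb{F}$, we denote by $\psi_a$ the character $x \mapsto \psi(ax)$. Applying the general theory of characters of finite abelian groups, we get the orthogonality relations

$$\sum_{\psi}\psi(x) = \begin{cases} q & \text{if } x = 0,\\ 0 & \text{otherwise}\end{cases}$$

(which is used to "solve" the equation $x = 0$ in $\mathbb{F}$) and

$$\sum_{x \in \mathbb{F}}\psi(x) = \begin{cases} q & \text{if } \psi = 1 \text{ is the trivial character},\\ 0 & \text{otherwise.}\end{cases}$$

The description of characters of the multiplicative group $\mathbb{F}^*$ (also called multiplicative characters of $\mathbb{F}$) is not so explicit. The group structure of $\mathbb{F}^*$ is well-known (dating back to Gauss): it is a cyclic group of order $q - 1$. Generators of $\mathbb{F}^*$ are called primitive roots, and there are $\varphi(q-1)$ of them. Fixing one, say $z$, one gets an isomorphism $\log : \mathbb{F}^* \to \mathbb{Z}/(q-1)\mathbb{Z}$, $x \mapsto n$ such that $z^n = x$, and all multiplicative characters of $\mathbb{F}$ are expressed as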

$$\chi(x) = e\Big(\frac{a\log(x)}{q-1}\Big)$$

for some $a \in \mathbb{Z}/(q-1)\mathbb{Z}$, but such a description is usually of no use in analytic number theory. As examples of multiplicative characters, suppose $\mathbb{F} = \mathbb{Z}/p\mathbb{Z}$ and $p \ne 2$. Then the Legendre symbol

$$x \mapsto \Big(\frac{x}{p}\Big)$$

is a non-trivial quadratic character. In general, if $\delta \mid (q-1)$, there is a cyclic group of order $\delta$ consisting of characters $\chi$ of $\mathbb{F}^*$ of order $\delta$. The orthogonality relations become

$$\sum_{\chi}\chi(x) = \begin{cases} q-1 & \text{if } x = 1,\\ 0 & \text{otherwise,}\end{cases}$$



(the sum over all multiplicative characters), and

$$\sum_{x \in \mathbb{F}^*}\chi(x) = \begin{cases} q-1 & \text{if } \chi = 1 \text{ is the trivial character},\\ 0 & \text{otherwise.}\end{cases}$$

It is usual to extend multiplicative characters to $\mathbb{F}$ by defining $\chi(0) = 0$ if $\chi \ne 1$, and $\chi(0) = 1$ if $\chi = 1$. Notice then that for any $\delta \mid q - 1$ the formula

$$\sum_{\chi^{\delta} = 1}\chi(x) = \big|\{y \in \mathbb{F} \mid y^{\delta} = x\}\big| \tag{11.6}$$

(also a particular case of the orthogonality relations for the group $\mathbb{F}^*/(\mathbb{F}^*)^{\delta}$, as described in Chapter 3) is true for all $x \in \mathbb{F}$.

11.3. Exponential sums. Let $\mathbb{F} = \mathbb{F}_q$ be a finite field with $q = p^m$ elements, $p$ a prime. Exponential sums over $\mathbb{F}$ can be of various kinds. For the simplest case, consider a polynomial $P \in \mathbb{F}[X]$ and an additive character $\psi$, and define the sum

$$S(P) = \sum_{x \in \mathbb{F}}\psi(P(x)).$$

Slightly more generally, take a non-zero rational function $f = P/Q \in \mathbb{F}(X)$ and consider

$$S(f) = \sum_{\substack{x \in \mathbb{F}\\ Q(x) \ne 0}}\psi(f(x));$$

for instance, taking $q = p$ and $f(x) = ax + bx^{-1}$, we have $S(f) = S(a,b;p)$. Multiplicative characters can also be used, getting sums of the type

$$S_{\chi}(f) = \sum^{*}_{x \in \mathbb{F}}\chi(f(x))$$

(where the star in $\sum^{*}$ means here and henceforth that the summation extends to all $x$ which are not poles of $f$). For $q = p$, $\chi = \big(\frac{\cdot}{p}\big)$ (the Legendre symbol) and $f(x) \in \mathbb{Z}[X]$ a cubic polynomial without multiple roots modulo $p$, we see that $-S_{\chi}(f)$ is the $p$-th coefficient $a_p$ of the Hasse-Weil zeta function of the elliptic curve with equation $y^2 = f(x)$ (see Section 14.4). Still more generally, one can mix additive and multiplicative characters, and define sums such as

$$S_{\chi}(f,g) = \sum^{*}_{x \in \mathbb{F}}\chi(f(x))\psi(g(x)), \tag{11.7}$$

an example of which is the Salié sum $T(a,b;p)$ defined by

$$T(a,b;p) = \sum^{*}_{x \bmod p}\Big(\frac{x}{p}\Big)e\Big(\frac{ax + b\bar{x}}{p}\Big)$$

which occurs in the Fourier expansion of half-integral weight modular forms; see [16] for instance. In contrast with the seemingly simpler Kloosterman sums $S(a,b;p)$, the Salié sums $T(a,b;p)$ can be explicitly computed (see Lemma 12.4, and Corollary 21.9 for the uniform distribution of the "angles" of the Salié sums).



To end this list, we mention that all these definitions can again be generalized to sums in more than one variable, and that the summation variables can be restricted to the rational points of an algebraic variety defined over $\mathbb{F}$: some examples will appear in the survey sections of this chapter. The exponential sums which directly arise in analytic number theory are sums over the prime field $\mathbb{Z}/p\mathbb{Z}$. However, the deeper understanding naturally requires considering sums over the extension fields $\mathbb{F}_{p^n}$. Indeed, the very reason for the success of algebraic methods lies in the fact that an exponential sum over $\mathbb{F}_p$ doesn't really come alone, but has natural "companions" over all the extension fields $\mathbb{F}_{p^n}$, and it is really the whole family which is investigated and which is the natural object of study. Those companion sums are easily defined: take the most general sum $S = S_{\chi}(f,g)$ we have introduced, then for $n \ge 1$ let

$$S_n = \sum^{*}_{x \in \mathbb{F}_n}\chi\big(N_{\mathbb{F}_n/\mathbb{F}}(f(x))\big)\psi\big(\mathrm{Tr}_{\mathbb{F}_n/\mathbb{F}}(g(x))\big) \tag{11.8}$$

where we use the multiplicative character $\chi \circ N$ and the additive character $\psi \circ \mathrm{Tr}$ of $\mathbb{F}_n$. All the sums $S_n$ are incorporated into a single object, the zeta function of the exponential sum, which is the formal power series $Z = Z_{\chi}(f,g) \in \mathbb{C}[[T]]$ defined by the formula

$$Z = \exp\Big(\sum_{n \ge 1}\frac{S_n}{n}T^n\Big).$$

Justification for the introduction of the zeta function comes from the following rationality theorem, conjectured by Weil, and proved by Dwork.

THEOREM 11.2 (DWORK). The zeta function $Z$ is the power series expansion of a rational function; more precisely, there exist coprime polynomials $P$ and $Q$ in $\mathbb{C}[T]$, with $P(0) = Q(0) = 1$, such that $Z = P/Q$.

As a corollary, denote by $(\alpha_i)$ (resp. $(\beta_j)$) the inverses of the roots (with multiplicity) of $P$ (resp. $Q$), so

$$P = \prod_i (1 - \alpha_i T), \qquad Q = \prod_j (1 - \beta_j T).$$

Then using the power-series expansion

$$\log\frac{1}{1-T} = \sum_{n \ge 1}\frac{T^n}{n}$$

we find that the formula $Z = P/Q$ is equivalent to the formula

$$S_n = \sum_j \beta_j^n - \sum_i \alpha_i^n$$

for any $n \ge 1$, which shows how the various sums $S_n$ are related. In particular, note that they satisfy a linear recurrence relation of order $d$ equal to the number $\deg P + \deg Q$ of roots $\alpha_i$, $\beta_j$.


COROLLARY 11.3. We have for any $n \ge 1$ the upper bound

In particular, (11.9)

A common abuse of language is to speak of the O
$$S = \sum^{*}_{x \in \mathbb{F}}\chi(f(x))\psi(g(x)),$$

we also introduce for $a \in \mathbb{F}$,

$$S_a = \sum^{*}_{x \in \mathbb{F}}\chi(f(x))\psi_a(g(x)) = \sum^{*}_{x \in \mathbb{F}}\chi(f(x))\psi(ag(x)).$$

Estimates on average over a for the first few power moments of Sa are often easily derived by elementary means, and they can be of great use in estimating S, even in addition to the methods of algebraic geometry. See the proof of Weil's bound for Kloosterman sums in Section 11.7 and the examples in Section 11.11. More general types of families have been (and still are) extensively studied by Katz; see for instance [Kl].

11.4. The Hasse-Davenport relation. We consider general Gauss sums over a finite field. Let $\mathbb{F} = \mathbb{F}_q$ be a finite field with $q = p^m$ elements, and let $\psi$ be an additive character and $\chi$ a multiplicative character of $\mathbb{F}$. The Gauss sum $G(\chi,\psi)$ is

$$G(\chi,\psi) = \sum_{x \in \mathbb{F}}\chi(x)\psi(x) \tag{11.10}$$

(recall that $\chi$ is extended to $\mathbb{F}$ by $\chi(0) = 1$, $0$ according to whether $\chi$ is trivial or not). When $\chi$ is the Legendre symbol, one recovers quadratic Gauss sums. The associated sums over the extension fields are

$$G_n(\chi,\psi) = \sum_{x \in \mathbb{F}_n}\chi\big(N_{\mathbb{F}_n/\mathbb{F}}(x)\big)\psi\big(\mathrm{Tr}_{\mathbb{F}_n/\mathbb{F}}(x)\big)$$

and the zeta function is

$$Z(\chi,\psi) = \exp\Big(\sum_{n \ge 1}\frac{G_n(\chi,\psi)}{n}T^n\Big). \tag{11.11}$$


In this case, Dwork's Theorem was proved by Hasse and Davenport and is known as the Hasse-Davenport Relation.

THEOREM 11.4 (HASSE-DAVENPORT). Assume $\chi$ and $\psi$ are non-trivial. Then we have for any $n \ge 1$,

$$-G_n(\chi,\psi) = (-G(\chi,\psi))^n$$

or equivalently the zeta function is a linear polynomial

$$Z(\chi,\psi) = 1 + G(\chi,\psi)T.$$

Hence the only "root" for the Gauss sum is $G(\chi,\psi)$ itself. This can be estimated elementarily, as was done for the Gauss sums considered in Chapter 3.

PROPOSITION 11.5. We have

$$|G(\chi,\psi)| = \sqrt{q}$$

if neither $\chi$ nor $\psi$ is trivial, while

$$|G(1,\psi)| = \begin{cases} 0 & \text{if } \psi \text{ non-trivial},\\ q & \text{if } \psi = 1,\end{cases} \qquad |G(\chi,1)| = \begin{cases} 0 & \text{if } \chi \text{ non-trivial},\\ q & \text{if } \chi = 1.\end{cases}$$

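PROOF. The last two statements are immediate, so assume neither $\chi$ nor $\psi$ is trivial. We have

$$|G(\chi,\psi)|^2 = \sum_{x,y \in \mathbb{F}^*}\chi(x)\bar{\chi}(y)\psi(x)\psi(-y) = \sum_{z \in \mathbb{F}^*}\chi(z)\sum_{y \in \mathbb{F}^*}\psi((z-1)y) \quad (\text{on writing } z = xy^{-1})$$

$$= q \quad (\text{by orthogonality, applied twice}).$$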
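Both Proposition 11.5 and the case $n = 2$ of Theorem 11.4 can be checked numerically in a small example. The sketch below makes several assumptions that are not in the text: $q = p$ with $p \equiv 3 \pmod 4$, so that $\mathbb{F}_{p^2} = \mathbb{F}_p(i)$ with $i^2 = -1$, hence $N(a + bi) = a^2 + b^2$ and $\mathrm{Tr}(a + bi) = 2a$; $\chi$ is the Legendre symbol and $\psi(x) = e(x/p)$. All function names are hypothetical.

```python
import cmath


def e(z):
    """e(z) = exp(2*pi*i*z)."""
    return cmath.exp(2j * cmath.pi * z)


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, with value 0 when p divides a."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def gauss_sum(p):
    """G(chi, psi) over F_p for chi the Legendre symbol and psi(x) = e(x/p)."""
    return sum(legendre(x, p) * e(x / p) for x in range(p))


def gauss_sum_quadratic_extension(p):
    """Companion sum G_2 over F_{p^2} = F_p(i); only valid for p = 3 (mod 4)."""
    if p % 4 != 3:
        raise ValueError("this model of F_{p^2} needs p = 3 (mod 4)")
    total = 0
    for a in range(p):
        for b in range(p):
            norm = (a * a + b * b) % p  # N(a + bi) = a^2 + b^2
            trace = (2 * a) % p         # Tr(a + bi) = 2a
            total += legendre(norm, p) * e(trace / p)
    return total


for p in (3, 7, 11):
    G, G2 = gauss_sum(p), gauss_sum_quadratic_extension(p)
    print(p, abs(G) ** 2, abs(-G2 - (-G) ** 2))
```

The second printed quantity measures the defect in $-G_2 = (-G)^2$ and should vanish up to floating-point error.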

□

We now turn to the proof of the Hasse-Davenport Relation. We consider the field $F = \mathbb{F}(X)$ of rational functions on $\mathbb{F}$ and the ring $R = \mathbb{F}[X]$ of polynomials. Recall that $R$ is a principal ideal domain. For $h \in R$ of degree $d \ge 0$, we define the norm $N(h) = q^d$. The zeta function of $F$ is the Dirichlet series (analogous to the Riemann zeta function)

$$\zeta_F(s) = \sum_{\substack{h \in R\\ h \text{ monic}}} N(h)^{-s}.$$

REMARK. This could also be written as a sum over the non-zero ideals $\mathfrak{a}$ in $R$,

$$\zeta_F(s) = \sum_{\mathfrak{a}} N(\mathfrak{a})^{-s}$$

where $N(\mathfrak{a}) = |R/\mathfrak{a}| = N(h)$ for any polynomial $h$ such that $\mathfrak{a} = (h)$. But we will work with polynomials to emphasize the elementary spirit here.



The Dirichlet series $\zeta_F(s)$ converges absolutely for $\mathrm{Re}(s) > 1$. Indeed, putting

$$n(d) = \big|\{h \in R \mid \deg(h) = d,\ h \text{ is monic}\}\big| = q^d$$

we obtain immediately

$$\zeta_F(s) = \sum_{d \ge 0} n(d)q^{-ds} = \sum_{d \ge 0} q^{d(1-s)} = (1 - q^{1-s})^{-1}.$$

On the other hand, unique factorization into irreducible polynomials yields an expression of $\zeta_F$ as an Euler product

$$\zeta_F(s) = \prod_{\substack{P \in R\\ P \text{ monic irreducible}}}\big(1 - N(P)^{-s}\big)^{-1}$$

which is convergent for $\mathrm{Re}(s) > 1$. The first step in the proof of the Hasse-Davenport Relation consists of writing the zeta function of Gauss sums as an $L$-function for the field $F$. Let $H \subset F^*$ be the subgroup of rational functions which are quotients of monic polynomials, and $G \subset H$ a subgroup with the property

Then if $a : G \to \mathbb{C}^*$ is a character of the group $G$, it can be extended to a totally multiplicative function of the set of monic polynomials $h \in R$ by putting $a(h) = 0$ if $h \notin G$. The corresponding $L$-function is defined analogously to the classical $L$-functions by the Dirichlet series

$$L(s,a) = \sum_{\substack{h \in R\\ h \text{ monic}}} a(h)N(h)^{-s} = \prod_{P}\big(1 - a(P)N(P)^{-s}\big)^{-1}$$

for $\mathrm{Re}(s) > 1$. For dealing with Gauss sums we consider the subgroup $G \subset H$ of rational functions $f$ defined and non-vanishing at $0$. Define a character $\lambda$ on $G$ by

$$\lambda(h) = \chi(a_d)\psi(a_1)$$

for $h = X^d - a_1X^{d-1} + \cdots + (-1)^d a_d \in R$. Clearly $\lambda$ is multiplicative on monic polynomials, and extends to a character of $G$. In this case we get the following:

LEMMA 11.6. We have $L(s,\lambda) = 1 + G(\chi,\psi)q^{-s}$.

PROOF. We arrange the Dirichlet series for $L(s,\lambda)$ according to the degree of $h$:

$$L(s,\lambda) = \sum_{d \ge 0}\Big(\sum_{\deg(h) = d}\lambda(h)\Big)q^{-ds}$$

and evaluate each term in turn. For $d = 0$, the only monic polynomial occurring is $h = 1$, and $\lambda(1) = 1$. For $d = 1$, we have $h = X - a$ so that

$$\sum_{\deg(h) = 1}\lambda(h) = \sum_{a \in \mathbb{F}}\lambda(X - a) = \sum_{a \in \mathbb{F}}\chi(a)\psi(a) = G(\chi,\psi).$$

For any $d \ge 2$ we have

$$\sum_{\deg(h) = d}\lambda(h) = q^{d-2}\sum_{a_1, a_d \in \mathbb{F}}\chi(a_d)\psi(a_1) = 0$$

by orthogonality, because at least one of the characters $\chi$, $\psi$ is non-trivial. □

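On the other hand, appealing to the Euler product we will prove:

LEMMA 11.7. We have $L(s,\lambda) = Z(q^{-s})$, where $Z = Z(\chi,\psi)$ is the zeta function (11.11) associated with Gauss sums.

Theorem 11.4 follows from Lemmas 11.6 and 11.7.

PROOF OF LEMMA 11.7. Taking the logarithmic derivative of the Euler product, we get

$$-\frac{1}{\log q}\frac{L'(s,\lambda)}{L(s,\lambda)} = \sum_{P}\deg(P)\sum_{r \ge 1}\lambda(P)^r q^{-r\deg(P)s} = \sum_{n \ge 1}\Big(\sum_{rd = n} d\sum_{\deg(P) = d}\lambda(P)^r\Big)q^{-ns}$$

while, on the other hand,

$$-\frac{1}{\log q}\frac{d}{ds}\log Z(q^{-s}) = \sum_{n \ge 1} G_n(\chi,\psi)q^{-ns}.$$

It therefore suffices to prove the formula

$$\sum_{\substack{P\\ d = \deg(P) \mid n}} d\,\lambda(P)^{n/d} = G_n(\chi,\psi) \tag{11.12}$$

for $n \ge 1$, the equality of the logarithmic derivatives being sufficient to imply Lemma 11.7 since both sides are Dirichlet series with leading coefficient 1. To prove (11.12), let $P$ be one of the irreducible polynomials appearing on the left side, of degree $d \mid n$. Its roots, say $x_1, \ldots, x_d$, are in $\mathbb{F}_n$. Fix one root $x = x_i$ and write

$$P = X^d - a_1X^{d-1} + \cdots + (-1)^d a_d.$$

We get

$$\mathrm{Tr}_{\mathbb{F}_n/\mathbb{F}}(x) = \frac{n}{d}\,\mathrm{Tr}_{\mathbb{F}_d/\mathbb{F}}(x) = \frac{n}{d}\,a_1, \qquad N_{\mathbb{F}_n/\mathbb{F}}(x) = N_{\mathbb{F}_d/\mathbb{F}}(x)^{n/d} = a_d^{n/d},$$

hence

$$\lambda(P)^{n/d} = \chi(a_d)^{n/d}\psi(a_1)^{n/d} = \chi(N(x))\psi(\mathrm{Tr}(x)).$$

Summing over all roots of $P$ we derive

$$d\,\lambda(P)^{n/d} = \sum_{i=1}^{d}\chi(N(x_i))\psi(\mathrm{Tr}(x_i)),$$

and summing over all $P$ with $\deg(P) \mid n$, we get (11.12) by Lemma 11.1 since every element in $\mathbb{F}_n$ will appear exactly once as one of the roots $x_i$ for some $P$. □

11.5. The zeta function for Kloosterman sums.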

Next we consider Kloosterman sums. Let $\mathbb{F}$ be a finite field with $q = p^m$ elements and this time consider additive characters $\psi$ and $\varphi$. We define the Kloosterman sum associated to $\psi$ and $\varphi$ by

$$S(\psi,\varphi) = -\sum_{x \in \mathbb{F}^*}\psi(x)\varphi(x^{-1}) \tag{11.13}$$

(the minus factor is only for cosmetic reasons). When $q = p$ is prime, and $\psi(x) = e(ax/p)$, $\varphi(x) = e(bx/p)$, we have therefore $S(\psi,\varphi) = -S(a,b;p)$.

The companion sums over the extension fields $\mathbb{F}_n$ are

$$S_n(\psi,\varphi) = -\sum_{x \in \mathbb{F}_n^*}\psi(\mathrm{Tr}(x))\varphi(\mathrm{Tr}(x^{-1}))$$

and the Kloosterman zeta function is

$$Z = Z(\psi,\varphi) = \exp\Big(\sum_{n \ge 1}\frac{S_n(\psi,\varphi)}{n}T^n\Big).$$

We will prove Dwork's Theorem in this case, which is due to Carlitz.

THEOREM 11.8. Assume that $\psi$ and $\varphi$ are non-trivial. Then

$$Z(\psi,\varphi) = \frac{1}{1 - S(\psi,\varphi)T + qT^2}.$$

The proof is very similar to that of Theorem 11.4. We put $R = \mathbb{F}[X]$, $F = \mathbb{F}(X)$ as before, and consider the same group $G \subset F^*$ of quotients of monic polynomials defined and non-vanishing at $0$. We define a character $\eta : G \to \mathbb{C}^*$ by putting

$$\eta(h) = \psi(a_1)\varphi(a_{d-1}/a_d)$$

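for a monic polynomial $h \in G$, where we write (compare the previous section)

$$h = X^d + a_1X^{d-1} + \cdots + a_{d-1}X + a_d$$

(with $a_d \ne 0$ since $h \in G$). The following computation verifies that $\eta$ is indeed a character of $G$: let $h' = X^e + b_1X^{e-1} + \cdots + b_{e-1}X + b_e$ with $b_e \ne 0$, then

$$hh' = X^{d+e} + (a_1 + b_1)X^{d+e-1} + \cdots + (a_{d-1}b_e + a_db_{e-1})X + a_db_e$$

and

$$\eta(hh') = \psi(a_1 + b_1)\varphi\Big(\frac{a_{d-1}b_e + a_db_{e-1}}{a_db_e}\Big) = \psi(a_1)\psi(b_1)\varphi\Big(\frac{a_{d-1}}{a_d}\Big)\varphi\Big(\frac{b_{e-1}}{b_e}\Big) = \eta(h)\eta(h').$$

Recall that we extend $\eta$ to all $h \in R$ by putting $\eta(h) = 0$ for $h \notin G$.

LEMMA 11.9. For $\psi$ and $\varphi$ non-trivial, the $L$-function associated to $\eta$ is given by

$$L(s,\eta) = \sum_{h}\eta(h)N(h)^{-s} = 1 - S(\psi,\varphi)q^{-s} + q^{1-2s}.$$

PROOF. By arranging terms according to the degree of $h$, we write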

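$$L(s,\eta) = \sum_{d \ge 0}\Big(\sum_{\deg(h) = d}\eta(h)\Big)q^{-ds}$$

and evaluate the inner sums. For $d = 0$, we have only $h = 1$ and $\eta(1) = 1$. For $d = 1$, we have $h = X + a$ with $a \ne 0$, hence

$$\sum_{\deg(h) = 1}\eta(h) = \sum_{a \in \mathbb{F}^*}\eta(X + a) = \sum_{a \in \mathbb{F}^*}\psi(a)\varphi(a^{-1}) = -S(\psi,\varphi).$$

For $d = 2$, we get

$$\sum_{\deg(h) = 2}\eta(h) = \sum_{a_1 \in \mathbb{F},\ a_2 \in \mathbb{F}^*}\psi(a_1)\varphi(a_1/a_2) = q - 1 + \Big(\sum_{a \in \mathbb{F}^*}\psi(a)\Big)\Big(\sum_{b \in \mathbb{F}^*}\varphi(b)\Big) = q$$

by applying twice the orthogonality of characters, since neither $\psi$ nor $\varphi$ is trivial. For $d \ge 3$, we get

$$\sum_{\deg(h) = d}\eta(h) = q^{d-3}\sum_{a_d \in \mathbb{F}^*}\ \sum_{a_1, a_{d-1} \in \mathbb{F}}\psi(a_1)\varphi(a_{d-1}/a_d) = 0$$

since there is free summation over $a_1 \in \mathbb{F}$. □

LEMMA 11.10. For $\psi$ and $\varphi$ non-trivial, we have

$$Z(q^{-s})^{-1} = L(s,\eta) = 1 - S(\psi,\varphi)q^{-s} + q^{1-2s}.$$

This lemma completes the proof of Theorem 11.8.

PROOF.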

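The $L$-function has an Euler product

$$L(s,\eta) = \prod_{P}\big(1 - \eta(P)N(P)^{-s}\big)^{-1}.$$

Taking the logarithmic derivative we get

$$-\frac{1}{\log q}\frac{L'(s,\eta)}{L(s,\eta)} = \sum_{P}\deg(P)\sum_{r \ge 1}\eta(P)^r q^{-r\deg(P)s} = \sum_{n \ge 1}\Big(\sum_{rd = n} d\sum_{\deg(P) = d}\eta(P)^r\Big)q^{-ns}$$

and as before it suffices to prove the formula

$$\sum_{d = \deg(P) \mid n} d\,\eta(P)^{n/d} = -S_n(\psi,\varphi) \tag{11.14}$$

for $n \ge 1$. Let

$$P = X^d + a_1X^{d-1} + \cdots + a_{d-1}X + a_d$$

be one of the irreducible polynomials on the left side of (11.14), of degree $d \mid n$, and $x_1, \ldots, x_d$ its roots, which lie in $\mathbb{F}_d$. We have for each $i$,

$$\mathrm{Tr}(x_i) = \frac{n}{d}\,\mathrm{Tr}_{\mathbb{F}_d/\mathbb{F}}(x_i) = -\frac{n}{d}\,a_1 \quad\text{and}\quad \mathrm{Tr}(x_i^{-1}) = \frac{n}{d}\,\mathrm{Tr}_{\mathbb{F}_d/\mathbb{F}}(x_i^{-1}) = -\frac{n}{d}\,\frac{a_{d-1}}{a_d}$$

(since $a_d^{-1}X^dP(X^{-1}) = X^d + \frac{a_{d-1}}{a_d}X^{d-1} + \cdots + a_d^{-1}$). Hence

$$\eta(P)^{n/d} = \psi\Big(\frac{n}{d}\,a_1\Big)\varphi\Big(\frac{n}{d}\,\frac{a_{d-1}}{a_d}\Big) = \psi(-\mathrm{Tr}(x_i))\varphi(-\mathrm{Tr}(x_i^{-1}))$$

and summing over the roots $x_i$, then over the polynomials $P$ of degree $d \mid n$, we obtain (11.14) by Gauss's Lemma again. □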

Theorem 11.8 allows us to factor the Kloosterman zeta function

$$Z(\psi,\varphi) = (1 - \alpha T)^{-1}(1 - \beta T)^{-1}$$

where $\alpha$ and $\beta$ are complex numbers, and of course $\alpha + \beta = S(\psi,\varphi)$, $\alpha\beta = q$. In sharp contrast to the case of Gauss sums, however, the roots $\alpha$ and $\beta$ cannot be explicitly computed.

THEOREM 11.11 (WEIL). Assume that $\psi$ and $\varphi$ are non-trivial and $p \ne 2$. Then the roots $\alpha$ and $\beta$ for the Kloosterman sum $S(\psi,\varphi)$ satisfy $|\alpha| = |\beta| = \sqrt{q}$, and therefore we have

$$|S(\psi,\varphi)| \le 2\sqrt{q}. \tag{11.15}$$

We will prove Theorem 11.11 in the next two sections.

COROLLARY 11.12. Let $a$, $b$, $c$ be integers, $c$ positive. We have (11.16)

PROOF. By the twisted multiplicativity (1.59) for Kloosterman sums, it suffices to consider $c = p^{\nu}$ with $p$ prime and $\nu \ge 1$. If $p \mid nm$, we have Ramanujan sums for which the result is easy (see (3.2), (3.3)). Otherwise, the case $\nu = 1$ follows from Theorem 11.11 for $p \ge 3$, and for $p = 2$ one checks immediately that the Kloosterman sums modulo 2 satisfy Theorem 11.11: we have $S(1,1;2) = 1$, and the associated zeta function is therefore governed by the polynomial $1 + T + 2T^2$, with roots $(-1 \pm i\sqrt{7})/4$ of modulus $1/\sqrt{2}$. The case $p \nmid nm$ and $\nu \ge 2$ can be dealt with elementarily; see Exercise 1 of Chapter 12. □
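Weil's bound (11.15) is easy to test numerically for prime moduli. The following minimal sketch (the function name is hypothetical, and Python 3.8 or later is assumed for modular inverses via `pow(x, -1, p)`) evaluates the classical sum $S(a,b;p)$ directly from its definition.

```python
import cmath
import math


def kloosterman(a, b, p):
    """Classical Kloosterman sum S(a, b; p) = sum over x in (Z/pZ)^* of e((a*x + b*x^{-1})/p)."""
    total = 0
    for x in range(1, p):
        x_inv = pow(x, -1, p)  # inverse of x modulo the prime p
        total += cmath.exp(2j * cmath.pi * ((a * x + b * x_inv) % p) / p)
    return total


for p in (3, 5, 7, 11, 13):
    worst = max(abs(kloosterman(a, b, p)) for a in range(1, p) for b in range(1, p))
    print(p, worst, 2 * math.sqrt(p))
```

The sums are real, because the substitution $x \mapsto -x$ conjugates each term, so the imaginary part of the computed value is only rounding noise.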



EXERCISE 1. Consider a general Kloosterman-Salie sum

S(x.; 'lj!,
L x(x)'!j!(x)
and its associated companions Sn and zeta function Z, where 'ljJ and
'!j!(x) =

e(Tr~ax)),

e(Tr~x)).

11.6. Stepanov's method for hyperelliptic curves. We will prove Theorem 11.11 by deducing it from the Riemann Hypothesis for certain algebraic curves over finite fields. However, we use Stepanov's elementary method (see [Ste], [Sch], [Bo3]) instead of Weil's arguments. Let $\mathbb{F}$ be a finite field with $q$ elements, of characteristic $p$. We will only consider algebraic curves $C_f$ over $\mathbb{F}$ given by equations of the type

$$C_f : y^2 = f(x) \tag{11.17}$$

for some polynomial $f \in \mathbb{F}[X]$ of degree $m \ge 3$. We assume moreover the following condition

$$\text{The polynomial } Y^2 - f(X) \in \mathbb{F}[X,Y] \text{ is absolutely irreducible} \tag{11.18}$$

(i.e. it is irreducible over the algebraic closure of $\mathbb{F}$). This is a minimal regularity assumption on the curve $C_f$. It is easily seen to be equivalent to the condition that $f$ is not a square in $\bar{\mathbb{F}}[X]$, and we will use it in this form.

REMARK. Stepanov's method has been refined by Schmidt [Sch] and Bombieri [Bo3] and is capable of handling the general case of the Riemann Hypothesis for curves; the case of curves with equation of the type $y^d = f(x)$ is not much harder than the one treated here. We limit ourselves to the curves $C_f$ for simplicity, and because it suffices for the application to Kloosterman sums and elliptic curves. Note that curves of the type $y^2 = f(x)$ are instances of so-called hyperelliptic curves, which are quite naturally distinguished among algebraic curves (but not all hyperelliptic curves are of this form; see for instance elliptic curves in characteristics 2 and 3).

The problem we consider is that of estimating the number $|C_f(\mathbb{F})|$ of $\mathbb{F}$-rational points of $C_f$, i.e., the number $N$ of solutions $(x,y) \in \mathbb{F}^2$ to the equation $y^2 = f(x)$. We are especially interested in this question when $q$ is large (typically, as with exponential sums, the polynomial $f \in \mathbb{F}[X]$ is fixed, and we consider the $\mathbb{F}_n$-rational points for all $n \ge 1$), although we will obtain completely explicit inequalities.

THEOREM 11.13. Assume that $f \in \mathbb{F}[X]$ satisfies (11.18), and $m = \deg(f) \ge 3$. If $q > 4m^2$, then $N = |C_f(\mathbb{F})|$ satisfies

$$|N - q| < 8m\sqrt{q}.$$
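For a prime field the count $N$ in Theorem 11.13 can be obtained by brute force, which gives a quick sanity check of the inequality. In the sketch below the function name, the sample cubic $f = X^3 + X + 1$ and the prime $p = 1009$ are illustrative choices, not taken from the text; the prime is large enough for the bound $8m\sqrt{q}$ to be smaller than the trivial bound $q$.

```python
import math


def count_points(coeffs, p):
    """Number of (x, y) in F_p^2 with y^2 = f(x); coeffs lists f's coefficients, highest degree first."""
    # square_roots[v] = number of y in F_p with y^2 = v
    square_roots = {}
    for y in range(p):
        v = y * y % p
        square_roots[v] = square_roots.get(v, 0) + 1
    total = 0
    for x in range(p):
        value = 0
        for c in coeffs:  # Horner evaluation of f(x) modulo p
            value = (value * x + c) % p
        total += square_roots.get(value, 0)
    return total


p, f, m = 1009, [1, 0, 1, 1], 3  # f = X^3 + X + 1, an odd-degree cubic, hence not a square
N = count_points(f, p)
print(N, abs(N - p), 8 * m * math.sqrt(p))
```

The printed deviation $|N - q|$ can be compared with the bound of the theorem; for elliptic curves Hasse's bound mentioned in the introduction is considerably sharper.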

lL

282


Clearly we can assume that $p > 2$, as otherwise the map $y \mapsto y^2$ is an automorphism of $\mathbb{F}$ and $N = q$. Stepanov's idea, which was inspired by results of Thue [Thu] in diophantine approximation, is to construct an auxiliary polynomial of degree $r$, say, having zeros of high multiplicity (at least $\ell$, say) at the $x$-coordinates of points of $C_f(\mathbb{F})$. Hence one gets easily the inequality $N \le 2r\ell^{-1}$, the factor two being the highest possible multiplicity of a given $x$-coordinate among points in $C_f(\mathbb{F})$. This inequality turns out to be so strong that it gives the upper bound of the theorem (certainly a surprising fact!). A trick then deduces the lower bound from this.

We first distinguish among the points $(x, y)$ those with $y = 0$. Let $N_0$ be the number of distinct zeros of $f$ in $\mathbb{F}$, which is also the number of points $(x, 0) \in C_f(\mathbb{F})$. If $(x, y)$ is a point of $C_f$ with $y \ne 0$, it follows that $f(x)$ is a non-zero square in $\mathbb{F}$, which is true if and only if $g(x) = 1$ where

$$g = f^c \quad \text{with} \quad c = \tfrac{1}{2}(q - 1).$$

Conversely, given $x \in \mathbb{F}$ with $g(x) = 1$, there are exactly two elements $y \in \mathbb{F}^*$ with $y^2 = f(x)$. Hence, writing

$$N_1 = |\{x \in \mathbb{F} \mid g(x) = 1\}| \tag{11.19}$$

it follows that

$$N = N_0 + 2N_1. \tag{11.20}$$
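The decomposition $N = N_0 + 2N_1$ rests on Euler's criterion: for $v \ne 0$, $v^c = 1$ exactly when $v$ is a square. A small sketch, assuming $q = p$ is an odd prime and using hypothetical helper names, compares this count with a brute-force enumeration:

```python
def evaluate(coeffs, x, p):
    """Evaluate a polynomial (coefficients from degree 0 up) at x modulo p."""
    value = 0
    for c in reversed(coeffs):
        value = (value * x + c) % p
    return value

def split_count(coeffs, p):
    """Return (N_0, N_1): zeros of f in F_p, and points x with g(x) = f(x)^c = 1."""
    c = (p - 1) // 2
    n0 = n1 = 0
    for x in range(p):
        v = evaluate(coeffs, x, p)
        if v == 0:
            n0 += 1
        elif pow(v, c, p) == 1:     # Euler's criterion: v is a non-zero square
            n1 += 1
    return n0, n1

def brute_force(coeffs, p):
    """Directly count the pairs (x, y) with y^2 = f(x) in F_p."""
    return sum(1 for x in range(p) for y in range(p)
               if (y * y - evaluate(coeffs, x, p)) % p == 0)

if __name__ == "__main__":
    f = [0, -1, 0, 1]               # f = X^3 - X (illustrative choice)
    for p in (5, 13, 37):
        n0, n1 = split_count(f, p)
        print(p, n0, n1, n0 + 2 * n1, brute_force(f, p))
```

The last two printed columns agree, which is precisely (11.20).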

We will estimate $N_1$ by following the strategy sketched above, but in order to handle the lower bound later, we generalize slightly and consider for any $a \in \mathbb{F}$ the set

$$S_a = \{x \in \mathbb{F} \mid f(x) = 0 \text{ or } g(x) = a\}. \tag{11.21}$$

To produce polynomials vanishing to a large order, we wish to use derivatives to characterize when this occurs. In characteristic 0, a polynomial $P$ has a zero of order $\ell$ at $x_0$ if and only if all the derivatives $P^{(i)}$ with $0 \le i < \ell$ vanish at $x_0$. In characteristic $p > 0$, however, this is no longer true if $\ell > p$, as the example of the polynomial $P = X^p$ shows, since $P^{(k)} = 0$ for all $k \ge 1$, in particular, $P^{(p)}(0) = 0$. A satisfactory solution follows by considering other differential operators.

DEFINITION. Let $K$ be any field. For any $k \ge 0$, the $k$-th Hasse derivative is the linear operator $E^k : K[X] \to K[X]$ defined by

$$E^k X^n = \binom{n}{k} X^{n-k}$$

for all $n \ge 0$, and extended to $K[X]$ by linearity. We also write $E = E^1$ (but beware that $E^k \ne E \circ E \circ \cdots \circ E$).

REMARK. From the binomial expansion

$$X^n = (X - a + a)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} (X - a)^k,$$

and by linearity, we see that the value of $E^k P$ at a point $a \in K$, for $P \in K[X]$, is simply the coefficient of $(X - a)^k$ in the Taylor expansion of $P$ around $a$. This


explains the properties of the Hasse derivatives, but we cannot take this as a definition, because the values of a polynomial over a finite field do not characterize the polynomial. Note that for $K$ of characteristic $p > 0$, we get $E X^p = E^2 X^p = \cdots = E^{p-1} X^p = 0$, but $E^p X^p = 1 \ne 0$ and we see that the Hasse derivatives detect the zero of $X^p$ of order exactly $p$ at 0. This is a general fact, as Lemma 11.16 will show.

LEMMA 11.14. The Hasse derivatives satisfy

$$E^k(fg) = \sum_{j=0}^{k} (E^j f)(E^{k-j} g)$$

for all $f, g \in K[X]$, and more generally,

$$E^k(f_1 \cdots f_r) = \sum_{j_1 + \cdots + j_r = k} (E^{j_1} f_1) \cdots (E^{j_r} f_r) \tag{11.22}$$

for $f_1, \ldots, f_r \in K[X]$.

PROOF. It suffices to consider $f = X^m$, $g = X^n$, and the first formula follows from the identity

$$\binom{m+n}{k} = \sum_{j=0}^{k} \binom{m}{j} \binom{n}{k-j}$$

which is obvious from the combinatorial interpretation of the binomial coefficients. Then the second formula follows by induction. □

COROLLARY 11.15. (1) For all $k, r \ge 0$, and all $a \in K$, we have

$$E^k (X - a)^r = \binom{r}{k} (X - a)^{r-k}.$$

(2) For all $k, r \ge 0$ with $k \le r$, and all $f, g \in K[X]$, we have

$$E^k(f g^r) = h g^{r-k}$$

for some polynomial $h$ such that

$$\deg(h) \le \deg(f) + k\deg(g) - k.$$

PROOF. For (1), we apply (11.22) to $f_1 = \cdots = f_r = X - a$, getting

$$E^k (X - a)^r = \sum_{j_1 + \cdots + j_r = k} E^{j_1}(X - a) \cdots E^{j_r}(X - a)$$

and only terms with all $j_i \in \{0, 1\}$, $1 \le i \le r$, give non-zero contributions since $E^j(X - a) = 0$ for $j \ge 2$, from the definition. Hence (1) follows. For (2), we observe that if $k \le r$, we have $j_i = 0$ for at least $r - k$ indices $i$ in (11.22) applied to the product $f \cdot g \cdots g$, which gives (2). □

LEMMA 11.16. Let $f \in K[X]$ and $a \in K$. Suppose that $(E^k f)(a) = 0$ for all $k < \ell$. Then $f$ is divisible by $(X - a)^\ell$.

PROOF. Let

$$f = \sum_{0 \le i \le d} a_i (X - a)^i$$

be the Taylor expansion of $f$ around $a$. By (1) of Corollary 11.15, we obtain

$$E^k f = \sum_{k \le i \le d} a_i \binom{i}{k} (X - a)^{i-k}$$

and evaluating at $a$ we get $a_k = 0$ for all $k < \ell$, hence $f$ is divisible by $(X - a)^\ell$ as claimed. □
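To make the Hasse derivatives concrete, here is a short sketch over a prime field $\mathbb{F}_p$. It is an illustration under stated assumptions only: polynomials are coefficient lists (lowest degree first, entries reduced modulo $p$), and all function names are hypothetical. It checks that $E^k X^p = 0$ for $0 < k < p$ while $E^p X^p = 1$, verifies the product formula of Lemma 11.14 on an example, and recovers an order of vanishing larger than $p$ as in Lemma 11.16.

```python
from math import comb

def trim(poly):
    """Drop trailing zero coefficients (coefficients are listed from degree 0 up)."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def hasse(poly, k, p):
    """k-th Hasse derivative over F_p: X^n is sent to binom(n, k) X^(n-k)."""
    return trim(comb(n, k) * a % p for n, a in enumerate(poly))[k:]

def mul(f, g, p):
    """Product of two polynomials over F_p."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return trim(out)

def add(f, g, p):
    """Sum of two polynomials over F_p."""
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return trim((a + b) % p for a, b in zip(f, g))

def leibniz_rhs(f, g, k, p):
    """Right side of Lemma 11.14: sum over j of (E^j f)(E^(k-j) g)."""
    total = []
    for j in range(k + 1):
        total = add(total, mul(hasse(f, j, p), hasse(g, k - j, p), p), p)
    return total

def evaluate(poly, x, p):
    """Value of the polynomial at x modulo p (Horner's rule)."""
    value = 0
    for c in reversed(poly):
        value = (value * x + c) % p
    return value

def order_of_vanishing(poly, a, p, bound):
    """Smallest k < bound with (E^k poly)(a) != 0, or bound if there is none."""
    for k in range(bound):
        if evaluate(hasse(poly, k, p), a, p) != 0:
            return k
    return bound

if __name__ == "__main__":
    p = 5
    x_to_p = [0] * p + [1]                       # the polynomial X^p
    print([hasse(x_to_p, k, p) for k in range(1, p + 1)])
    power = [1]
    for _ in range(7):                           # (X - 2)^7 over F_5
        power = mul(power, [(-2) % p, 1], p)
    print(order_of_vanishing(power, 2, p, 10))
```

Ordinary derivatives would not see the order 7 in the last example, since every derivative of order at least $p = 5$ vanishes identically.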

We need another technical lemma.

LEMMA 11.17. Let $K = \mathbb{F}$ be a finite field of characteristic $p$ with $q$ elements, and let $r = h(X, X^q) \in \mathbb{F}[X]$, where $h \in \mathbb{F}[X, Y]$. Then

$$E^k r = (E_X^k h)(X, X^q)$$

for all $k < q$, where on the right side $E_X^k h$ denotes the Hasse derivative of $h$ performed with respect to $X$.

PROOF. It suffices to consider $h = X^n Y^m$, so we must prove that $E^k X^{n+mq} = (E^k X^n) X^{mq}$. From Lemma 11.14, we get

$$E^k X^{n+mq} = \sum_{j=0}^{k} E^{k-j} X^n \, E^j X^{mq}$$

so it suffices to show that $E^j X^{mq} = 0$ for $0 < j < q$ to prove the lemma. But

$$\binom{mq}{j} = \frac{mq}{j} \binom{mq - 1}{j - 1} = 0$$

in characteristic $p$, and the result follows. □
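On monomials, Lemma 11.17 amounts to the congruence $\binom{n+mq}{k} \equiv \binom{n}{k} \pmod{p}$ for $k < q$. The following sketch (a numerical illustration with a hypothetical function name and arbitrary search ranges, not part of the proof) tests this congruence and shows that the restriction $k < q$ cannot be dropped.

```python
from math import comb

def lemma_holds_on_monomials(p, e, n_max=20, m_max=4):
    """Check binom(n + m q, k) = binom(n, k) modulo p for q = p^e and all k < q."""
    q = p ** e
    for n in range(n_max):
        for m in range(m_max):
            for k in range(q):
                if (comb(n + m * q, k) - comb(n, k)) % p != 0:
                    return False
    return True

if __name__ == "__main__":
    for p, e in ((2, 3), (3, 2), (5, 1)):
        print(p, e, lemma_holds_on_monomials(p, e))
    # For k = q the statement fails: E^q X^q = 1, whereas E^q applied to the constant 1 is 0.
    q = 9
    print((comb(q, q) - comb(0, q)) % 3)
```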

We come to the heart of Stepanov's method, the construction of the auxiliary polynomial.

PROPOSITION 11.18. Assume that $q > 8m$, and let $\ell$ be an integer satisfying $m < \ell \le q/8$. Then there exists a non-zero polynomial $r \in \mathbb{F}[X]$ of degree

$$\deg(r) < c\ell + 2m\ell(\ell - 1) + mq$$

which has a zero of order at least $\ell$ at all points $x \in S_a$ (recall $c = \tfrac{1}{2}(q - 1)$).

We will look, by the method of indeterminate coefficients, for a polynomial $r$ of the special form

$$r = f^\ell \sum_{0 \le j < J} (r_j + s_j g) X^{jq} \tag{11.23}$$

for some polynomials $r_j, s_j \in \mathbb{F}[X]$, to be constructed, each of which has degree bounded by $c - m$. Hence such a polynomial $r$ has degree bounded by

$$\deg(r) \le \ell m + c - m + cm + Jq \le (J + m)q. \tag{11.24}$$

The next lemma is crucial to ensure that $r \ne 0$. This is where we need the assumption (11.18).


LEMMA 11.19. We have $r = 0 \in \mathbb{F}[X]$ if and only if $r_j = s_j = 0 \in \mathbb{F}[X]$ for all $j$.

PROOF. We can assume (by a shift $X \mapsto X + a$ if necessary) that $f(0) \ne 0$. Suppose that $r = 0$ but not all $r_j, s_j$ are zero; let $k$ be the smallest index for which one of $r_k, s_k$ is non-zero. Dividing by $f^\ell X^{kq}$, we get from (11.23) the identity

$$\sum_{k \le j < J} (r_j + s_j g) X^{(j-k)q} = 0$$

which we write in the form $h_0 + h_1 g = 0$, where

$$h_0 = \sum_{k \le j < J} r_j X^{(j-k)q}, \qquad h_1 = \sum_{k \le j < J} s_j X^{(j-k)q}.$$

We square this equation (recall that $g^2 = f^{q-1}$), then multiply both sides by $f$, getting

$$h_0^2 f = h_1^2 f^q.$$

Since $f \in \mathbb{F}[X]$, we have

$$f(X)^q = f(X^q) \equiv f(0) \pmod{X^q}$$

hence

$$r_k^2 f \equiv s_k^2 f(0) \pmod{X^q}.$$

However, the degrees of the polynomials in this congruence are bounded by $2\deg(r_k) + m \le 2(c - m) + m < q$, and $2\deg(s_k) \le 2(c - m) < q$, respectively. So there must be equality $r_k^2 f = s_k^2 f(0)$, which contradicts the assumption (11.18) that $f$ is not a square in $\bar{\mathbb{F}}[X]$. □

We now evaluate the Hasse derivatives of $r$.

LEMMA 11.20. Let $k \le \ell$. Then there exist polynomials $r_j^{(k)}$, $s_j^{(k)}$, each one of degree $\le c - m + k(m - 1)$, such that

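$$E^k r = f^{\ell - k} \sum_{0 \le j < J} (r_j^{(k)} + s_j^{(k)} g) X^{jq}.$$

PROOF. We can write $r = h(X, X^q)$ where $h \in \mathbb{F}[X, Y]$ is the polynomial

$$h = f^\ell \sum_{0 \le j < J} (r_j + s_j f^c) Y^j.$$

Hence by Lemma 11.17, we have

$$E^k r = (E_X^k h)(X, X^q) = \sum_{0 \le j < J} \big(E^k(f^\ell r_j) + E^k(f^{\ell + c} s_j)\big) X^{jq}.$$

By (2) of Corollary 11.15, we have $E^k(f^\ell r_j) = f^{\ell - k} r_j^{(k)}$ and $E^k(f^{\ell + c} s_j) = f^{\ell - k + c} s_j^{(k)}$ with

$$\deg(r_j^{(k)}) \le \deg(r_j) + k\deg(f) - k \le c - m + k(m - 1)$$

and $\deg(s_j^{(k)}) \le c - m + k(m - 1)$. This is the desired result. □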

Recall that we wish $r$ to have zeros of order $\ge \ell$ at points in $S_a$ (see (11.21)). If $f(x) = 0$, clearly this is the case. So let $x \in S_a$, with $f(x) \ne 0$. Applying Lemma 11.20, we evaluate $E^k r$ at a point $x \in S_a$, using $g(x) = a$, and most importantly $x^q = x$:

$$E^k r(x) = f(x)^{\ell - k} \sigma^{(k)}(x)$$

where $\sigma^{(k)} \in \mathbb{F}[X]$ is the polynomial

$$\sigma^{(k)} = \sum_{0 \le j < J} (r_j^{(k)} + a s_j^{(k)}) X^j.$$
O(,j
We can now prove Proposition 11.18: if $\sigma^{(k)} = 0$ for all $k < \ell$, Lemma 11.16 shows that $r$ has a zero of order $\ge \ell$ at all points in $S_a$. The system of equations

$$\sigma^{(k)} = 0, \quad \text{for all } k < \ell \tag{11.25}$$

is a homogeneous system of linear equations, the unknowns being the coefficients of the polynomials $r_j$, $s_j$, the equations corresponding to the coefficients of the $\sigma^{(k)}$. We observe that $\deg(\sigma^{(k)}) < c - m + k(m - 1) + J$, so the number of equations does not exceed

$$B = \ell(c - m + J) + \tfrac{1}{2}\ell(\ell - 1)(m - 1)$$

while, on the other hand, the number of coefficients of the $r_j$ and $s_j$ is at least $A = 2(c - m)J$. By choosing $J$ large enough, we can make $A > B$. Then the system (11.25) has a non-trivial solution, and by Lemma 11.19 this produces $r \ne 0$ such that $r$ has zeros of order $\ge \ell$ at all points $x \in S_a$. Taking

$$J = \frac{\ell}{q}\big(c + 2m(\ell - 1)\big)$$

one can check that $A > B$ (recall that $2c = q - 1$ and $8\ell \le q$). The degree of $r$ is bounded by (11.24), which gives Proposition 11.18.

We now prove Stepanov's Theorem 11.13. First, let $a$ be arbitrary, and apply Proposition 11.18. Since the auxiliary polynomial $r$ is non-zero and vanishes to order $\ge \ell$ at points in $S_a$, we have

$$\ell |S_a| \le \deg(r) \le c\ell + 2m\ell(\ell - 1) + mq$$

so $|S_a| \le c + 2m(\ell - 1) + mq\ell^{-1}$. We choose $\ell = 1 + [\sqrt{q}/2]$, which gives the bound

$$|S_a| < c + 4m\sqrt{q}. \tag{11.26}$$

To prove Theorem 11.13, take first $a = 1$, getting

$$N_0 + N_1 = |S_1| < \frac{q}{2} + 4m\sqrt{q}$$

hence the upper bound

$$N = N_0 + 2N_1 \le 2(N_0 + N_1) < q + 8m\sqrt{q}. \tag{11.27}$$

To get a lower bound, by the factorization $X^q - X = X(X^c - 1)(X^c + 1)$ we have

$$f(x)(g(x) - 1)(g(x) + 1) = 0$$

for all $x \in \mathbb{F}$, hence $N_0 + N_1 + N_2 = q$ where $N_2 = |\{x \in \mathbb{F} \mid g(x) = -1\}|$. By (11.26) applied to $S_{-1}$, we have

$$N_0 + N_2 = |S_{-1}| < \frac{q}{2} + 4m\sqrt{q}$$

hence

$$N_1 = q - N_0 - N_2 > \frac{q}{2} - 4m\sqrt{q},$$

and finally,

$$N = N_0 + 2N_1 \ge 2N_1 > q - 8m\sqrt{q}. \tag{11.28}$$

Clearly (11.27) and (11.28) prove Theorem 11.13.

11.7. Proof of Weil's bound for Kloosterman sums.

Let $\mathbb{F}$ be a finite field with $q$ elements, of characteristic $p \ne 2$. Let $\psi$ be any fixed non-trivial additive character of $\mathbb{F}$. For any additive character $\varphi$ there exists a unique $a \in \mathbb{F}$ such that $\varphi = \psi_a$, hence any Kloosterman sum $S(\psi, \varphi)$ is of the form

$$S(\psi_a, \psi_b) = \sum_{x \in \mathbb{F}^*} \psi(ax + bx^{-1})$$

for some $a, b \in \mathbb{F}$. We consider $a$ and $b$ as fixed and write $g = aX + bX^{-1}$. We will prove Weil's bound (11.15) by relating the average of the Kloosterman sums $S(\psi_a, \psi_b)$ over $\psi$ to the number of points on a hyperelliptic curve, where the contribution of the trivial character $\psi_0 = 1$ will be the main term.

LEMMA 11.21. For any $n \ge 1$ and any $x \in \mathbb{F}_n$, we have

$$|\{y \in \mathbb{F}_n \mid y^q - y = x\}| = \sum_{\psi} \psi(\mathrm{Tr}(x)) \tag{11.29}$$

where the sum ranges over all additive characters of $\mathbb{F}$ and $\mathrm{Tr}$ is the trace $\mathbb{F}_n \to \mathbb{F}$.

PROOF. If $\mathrm{Tr}(x) = 0$, then the equation $y^q - y = x$ has $q$ solutions exactly, as recalled in Section 11.2, and in this case we have $\psi(\mathrm{Tr}(x)) = 1$ for all $\psi$, hence the right side of (11.29) is also equal to $q$. On the other hand, if $\mathrm{Tr}(x) \ne 0$, the equation $y^q - y = x$ has no solution, and the character sum is zero by orthogonality. □
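Lemma 11.21 can be checked by brute force in the smallest non-trivial case. The sketch below is an illustration only, under the following assumptions: $q = p$ is an odd prime, $n = 2$, the field $\mathbb{F}_{p^2}$ is modelled as $\mathbb{F}_p[t]/(t^2 - d)$ with $d$ a quadratic non-residue, and the additive characters of $\mathbb{F}_p$ are $u \mapsto e^{2\pi i h u / p}$ for $0 \le h < p$. All names are hypothetical.

```python
import cmath

def make_field(p):
    """Model F_{p^2} as F_p[t]/(t^2 - d) for a non-residue d; return (d, mul)."""
    d = next(v for v in range(2, p) if pow(v, (p - 1) // 2, p) == p - 1)
    def mul(s, t):
        # (s0 + s1 t)(t0 + t1 t) with t^2 = d
        return ((s[0] * t[0] + d * s[1] * t[1]) % p, (s[0] * t[1] + s[1] * t[0]) % p)
    return d, mul

def power(x, n, mul):
    """x^n in F_{p^2} by repeated squaring."""
    result = (1, 0)
    while n:
        if n & 1:
            result = mul(result, x)
        x = mul(x, x)
        n >>= 1
    return result

def lemma_holds(p):
    """Compare |{y : y^p - y = x}| with the sum over psi of psi(Tr x), for every x in F_{p^2}."""
    _, mul = make_field(p)
    elements = [(u, v) for u in range(p) for v in range(p)]
    solutions = {x: 0 for x in elements}
    for y in elements:
        yp = power(y, p, mul)
        solutions[((yp[0] - y[0]) % p, (yp[1] - y[1]) % p)] += 1
    for x in elements:
        xp = power(x, p, mul)
        if (x[1] + xp[1]) % p != 0:      # Tr(x) = x + x^p must lie in F_p
            return False
        trace = (x[0] + xp[0]) % p
        char_sum = sum(cmath.exp(2j * cmath.pi * h * trace / p) for h in range(p))
        if abs(char_sum - solutions[x]) > 1e-6:
            return False
    return True

if __name__ == "__main__":
    for p in (3, 5, 7):
        print(p, lemma_holds(p))
```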

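From this lemma we deduce that

$$\sum_{\psi} \sum_{x \in \mathbb{F}_n^*} \psi(\mathrm{Tr}(g(x))) = |\{(x, y) \in \mathbb{F}_n^* \times \mathbb{F}_n \mid y^q - y = g(x)\}| = N_n, \text{ say,} \tag{11.30}$$

for $n \ge 1$. If $\psi = \psi_0$, the trivial character, we have

$$\sum_{x \in \mathbb{F}_n^*} \psi_0(\mathrm{Tr}(g(x))) = q^n - 1.$$

For $\psi \ne \psi_0$, let $\alpha_\psi$, $\beta_\psi$ be the "roots" of the Kloosterman sum $S(\psi_a, \psi_b)$, so by Theorem 11.8 we have $\alpha_\psi \beta_\psi = q$ and

$$\sum_{x \in \mathbb{F}_n^*} \psi(\mathrm{Tr}(g(x))) = -(\alpha_\psi^n + \beta_\psi^n)$$

for all $n \ge 1$. We can therefore write

$$N_n = q^n - 1 - \sum_{\psi \ne \psi_0} (\alpha_\psi^n + \beta_\psi^n).$$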



The equation $y^q - y = g(x)$ does not obviously describe a curve, since $g$ is not a polynomial, but multiplying by $x$ it is equivalent with

$$C_{a,b} : ax^2 - (y^q - y)x + b = 0$$

(note that $x = 0$ is not possible since $b \ne 0$). Because $p \ne 2$, the number of solutions is equal to the number of solutions of the discriminant equation of this quadratic equation

$$D_{a,b} : (y^q - y)^2 - 4ab = v^2,$$

i.e., $N_n = |D_{a,b}(\mathbb{F}_n)|$. This is of the form (11.17) with $\deg(f) = 2q$, and because $4ab \ne 0$ it satisfies (11.18). Hence by Theorem 11.13 we have

$$|N_n - q^n| < 16 q^{1 + n/2}$$

if $n$ is large enough, so that $q^n > 16q^2$. By (11.30) we get a sharp estimate for the roots $\alpha_\psi$, $\beta_\psi$, on average

$$\frac{1}{q} \Big| \sum_{\psi \ne \psi_0} (\alpha_\psi^n + \beta_\psi^n) \Big| \le 16 q^{n/2} \tag{11.31}$$

for $n$ large enough. The following simple lemma shows that the individual roots must be of modulus $\le \sqrt{q}$:

LEMMA 11.22. Let $\omega_1, \ldots, \omega_r$ be complex numbers, $A$, $B$ positive real numbers and assume that

$$\Big| \sum_{j=1}^{r} \omega_j^n \Big| \le A B^n$$

holds for all integers $n$ large enough. Then $|\omega_j| \le B$ for all $j$.

PROOF. One can do this by hand (using Dirichlet's box principle), but a nice trick gives the result immediately: consider the complex power series

$$f(z) = \sum_{n \ge 1} \Big( \sum_{j=1}^{r} \omega_j^n \Big) z^n = \sum_{j=1}^{r} \frac{\omega_j z}{1 - \omega_j z}.$$

The hypothesis implies that $f$ converges absolutely in the disc $|z| < B^{-1}$, hence $f$ is analytic in this region. In particular, it has no poles there, which means that we must have $|\omega_j|^{-1} \ge B^{-1}$ for all $j$. □

,;q, we deduce the upper bounds

laV>I ( ,fii, if3V>I ( ,fii for all 1/J fo 1/Jo. Since aV>f3V> lf3V>I = ,fii, and so Theorem 11.11 is proved.

=

q, we have in fact

iaV>I

=
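Since a Kloosterman sum over $\mathbb{F}$ is $-(\alpha_\psi + \beta_\psi)$, the equality $|\alpha_\psi| = |\beta_\psi| = \sqrt{q}$ gives $|S| \le 2\sqrt{q}$. A quick numerical sketch of this consequence, assuming $q = p$ is an odd prime and the standard additive character $\psi(u) = e^{2\pi i u/p}$ (the function name is hypothetical):

```python
import cmath
from math import pi, sqrt

def kloosterman(a, b, p):
    """S = sum over x in F_p^* of exp(2 pi i (a x + b x^{-1}) / p), for a prime p."""
    total = 0
    for x in range(1, p):
        x_inv = pow(x, p - 2, p)    # inverse of x modulo p, by Fermat's little theorem
        total += cmath.exp(2j * pi * ((a * x + b * x_inv) % p) / p)
    return total

if __name__ == "__main__":
    for p in (7, 31, 101):
        largest = max(abs(kloosterman(1, b, p)) for b in range(1, p))
        print(p, largest, 2 * sqrt(p))
```

The sums are real (the terms for $x$ and $-x$ are complex conjugates), and the largest absolute value stays below $2\sqrt{p}$.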

REMARKS. (1) We see here twice how crucial the introduction of the companion sums $K_n$ is: first because the curve $D_{a,b}$ has very high degree, so Stepanov's bound $|N - q| < 8m\sqrt{q}$ is trivial when applied to $\mathbb{F}$ itself, and secondly because only by the consideration of all extension fields can we determine the exact order of magnitude of the roots, and obtain Weil's bound $|S(\psi, \varphi)| \le 2\sqrt{q}$.

(2) The constant 2 is optimal in Weil's bound for fixed $a$, $b$ and $q$. Indeed we have

This is a non-zero rational function with poles on the circle $|z| = 1/\sqrt{q}$, hence this is its radius of convergence. Therefore

This means that for any $\varepsilon > 0$ there exist infinitely many $n$ such that

It is conjectured (this follows from the Sato-Tate Conjecture for the angles of Kloosterman sums described in Chapter 21) that the Weil bound is also optimal when $a$, $b$ are fixed, $n = 1$, and $q = p \to +\infty$. However, this remains very much open. See the remark at the end of the introduction to Section 11.8 for the case of elliptic curves.

(3) Using the extension of Stepanov's method to curves of the type $y^d = f(x)$ and an analysis of the corresponding zeta function, one can prove the following estimate for complete character sums:

THEOREM 11.23. Let $\mathbb{F}$ be a finite field with $q$ elements and let $\chi$ be a non-trivial multiplicative character of $\mathbb{F}^*$ of order $d > 1$. Suppose $f \in \mathbb{F}[X]$ has $m$ distinct roots and $f$ is not a $d$-th power. Then for $n \ge 1$ we have

$$\Big| \sum_{x \in \mathbb{F}_n} \chi(N(f(x))) \Big| \le (m - 1) q^{n/2}.$$

This is Theorem 2C', p. 43, of [Sch]. In particular, we get the following corollary which will be used in proving the Burgess bound for short character sums (Theorem 12.6).

COROLLARY 11.24. Let $\chi \pmod{p}$ be a non-principal multiplicative character. If one of the classes $b_v \pmod{p}$, $v = 1, \ldots, 2r$, is different from the remaining ones then

$$\Big| \sum_{x \,(\mathrm{mod}\, p)} \chi((x + b_1) \cdots (x + b_r)) \, \bar{\chi}((x + b_{r+1}) \cdots (x + b_{2r})) \Big| \le 2r p^{1/2}.$$

PROOF. Observe that

$$\chi((x + b_1) \cdots (x + b_r)) \, \bar{\chi}((x + b_{r+1}) \cdots (x + b_{2r})) = \chi(f(x))$$

with

$$f(x) = \prod_{1 \le j \le r} (x + b_j) \prod_{r+1 \le j \le 2r} (x + b_j)^{p-2}.$$

From the assumption, one of the $b_v$ is a root of $f$ of order either 1 or $p - 2$, which is coprime with the order $d \mid (p - 1)$ of $\chi$, so we can apply Theorem 11.23. □
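The bound of Theorem 11.23 is easy to observe in the simplest situation. The sketch below is an illustration under explicit assumptions: $n = 1$, $q = p$ an odd prime, $\chi$ the quadratic character (Legendre symbol), and $f = X(X-1)(X-2)(X-3)(X-4)$, which has $m = 5$ distinct roots for $p > 5$ and is not a square. The function names are hypothetical.

```python
from math import sqrt

def legendre(v, p):
    """Quadratic character modulo the odd prime p, with value 0 at 0."""
    v %= p
    if v == 0:
        return 0
    return 1 if pow(v, (p - 1) // 2, p) == 1 else -1

def character_sum(roots, p):
    """Sum over x mod p of chi(f(x)), where f is the product of (X - r) over the given roots."""
    total = 0
    for x in range(p):
        value = 1
        for r in roots:
            value = value * (x - r) % p
        total += legendre(value, p)
    return total

if __name__ == "__main__":
    roots = [0, 1, 2, 3, 4]         # m = 5 distinct roots when p > 5
    for p in (11, 101, 1009):
        print(p, character_sum(roots, p), (len(roots) - 1) * sqrt(p))
```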



11.8. The Riemann Hypothesis for elliptic curves over finite fields.

A particularly important case of the Riemann Hypothesis is that of elliptic curves. Historically this was first established by Hasse using global methods. In the notation of Section 11.6, this means that we consider curves $C_f$ with $\deg f = 3$, so the equation is of the form

$$C : y^2 = x^3 + a_2 x^2 + a_4 x + a_6 \tag{11.32}$$

(the numbering reflects the traditional notation for elliptic curves, see Section 14.4). In contrast with that section, we emphasize that we are considering the affine curve, without the point at infinity. The cubic polynomial $f(x) = x^3 + a_2 x^2 + a_4 x + a_6$ cannot be a square, so this curve satisfies the assumption (11.18). Moreover, we assume that $f$ does not have a double root; this means that the curve $C$ is smooth (see Section 11.9), and it is a necessary condition for what follows. In this case, Theorem 11.13 implies that for $q > 36$ the number $N = |C(\mathbb{F})|$ satisfies $|N - q| < 24\sqrt{q}$. In Section 11.10 we will prove, as before for Kloosterman sums, the rationality and the functional equation of the corresponding zeta function, from which we will deduce:

THEOREM 11.25. Let $C$ be an elliptic curve over $\mathbb{F}$ given by

$$C : y^2 = x^3 + a_2 x^2 + a_4 x + a_6$$

with $a_i \in \mathbb{F}$. Then for all $n \ge 1$ we have

$$\big|\, |C(\mathbb{F}_n)| - q^n \,\big| \le 2 q^{n/2}. \tag{11.33}$$

REMARKS. Theorem 11.25 is optimal. Indeed letting $n \to +\infty$, this follows as for Kloosterman sums from Lemma 11.22. However, it is also true in the horizontal sense as the following example shows: let $E/\mathbb{Q}$ be the elliptic curve with equation

$$E : y^2 = x^3 - x$$

which has complex multiplication by $\mathbb{Z}[i]$. As before we consider the affine points, not the projective ones. The discriminant of $E$ is 64 so $E$ can be reduced modulo $p$ to an elliptic curve over $\mathbb{Z}/p\mathbb{Z}$ for any odd prime $p$. One shows (for instance by relating $E$ to the curve $y^2 = x^4 + 4$ by changing $(x, y) \mapsto (yx^{-1}, 2x - y^2 x^{-2})$ for $(x, y) \ne (0, 0)$, see e.g. [14] or [IR]) that $|E(\mathbb{Z}/p\mathbb{Z})| = p$ if $p \equiv 3 \pmod 4$ and $|E(\mathbb{Z}/p\mathbb{Z})| = p - 2a_p$ if $p \equiv 1 \pmod 4$, where

$$p = a_p^2 + b_p^2$$

with $\pi = a_p + i b_p \equiv 1 \pmod{2(1 + i)}$ (this congruence determines $\pi$ up to conjugation). For any $\varepsilon > 0$, Theorem 5.36 (generalized slightly to add the congruence condition) shows that there exist infinitely many Gaussian primes $\pi \equiv 1 \pmod{2(1 + i)}$ such that $|\arg \pi| < \varepsilon$, hence

$$\big|\, p - |E(\mathbb{Z}/p\mathbb{Z})| \,\big| = 2|a_p| \ge 2(1 - \varepsilon^2)\sqrt{p}$$

for infinitely many $p$.
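The point counts for $E : y^2 = x^3 - x$ quoted in this remark can be observed directly. The sketch below is only a numerical illustration with hypothetical function names: it counts affine points modulo small primes and, for $p \equiv 1 \pmod 4$, compares $|p - |E(\mathbb{Z}/p\mathbb{Z})||$ with $2|a_p|$, where $|a_p|$ is found as the odd number in the representation $p = a^2 + b^2$ (the sign of $a_p$, fixed by the congruence on $\pi$, is not computed).

```python
from math import isqrt

def affine_count(p):
    """Number of (x, y) modulo the odd prime p with y^2 = x^3 - x."""
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    return sum(square_roots[(x * x * x - x) % p] for x in range(p))

def odd_square_part(p):
    """Return the odd a > 0 with p = a^2 + b^2, for a prime p = 1 mod 4."""
    for a in range(1, isqrt(p) + 1, 2):
        b = isqrt(p - a * a)
        if a * a + b * b == p:
            return a
    raise ValueError("no representation p = a^2 + b^2 with a odd")

if __name__ == "__main__":
    for p in (3, 7, 11, 19, 23):                 # p = 3 mod 4: expect exactly p points
        print(p, affine_count(p))
    for p in (5, 13, 17, 29, 37, 41):            # p = 1 mod 4: expect |p - N| = 2 |a_p|
        print(p, affine_count(p), 2 * odd_square_part(p))
```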



Theorem 11.25 will be proved in Section 11.10 after some geometric and algebraic preliminaries. This goes a bit further away from the heart of analytic number theory, yet we include full details because the Hasse bound is also important as being the simplest case of the very important Deligne bound for Fourier coefficients of modular forms. The reader will also certainly appreciate the elegance and beauty of the geometry involved.

11.9. Geometry of elliptic curves.

In explaining the special geometric features of elliptic curves, we may as well consider a more general case. So let $k$ be an arbitrary field, $\bar{k}$ an algebraic closure, and let $C$ be the curve given by the equation

$$C : y^2 = f(x) = x^3 + a_2 x^2 + a_4 x + a_6, \quad a_i \in k,$$

identified with the set of solutions $(x, y) \in \bar{k}^2$. We assume as before that $f$ does not have a double root. If $k'/k$ is any extension, we let $C(k')$ be the set of solutions in $(k')^2$. The geometry of the elliptic curve becomes much clearer if we work with the projective version of the curve $C$, namely the curve $E$ in the projective plane given, in homogeneous coordinates $(x : y : z)$, by the equation

$$E : y^2 z = x^3 + a_2 x^2 z + a_4 x z^2 + a_6 z^3.$$

Putting $z = 1$ gives back $C$; on the other hand, "at infinity", we are only adding one point: taking $z = 0$ yields $x = 0$, and all elements $(0 : y : 0)$ (with $y \ne 0$) correspond to a single point $\infty = (0 : 1 : 0)$ in the projective plane. Notice that this point $\infty$ is rational over the base field $k$, so that for all extensions $k'/k$ we have

$$E(k') = C(k') \cup \{\infty\}.$$

The main property of the curve $E$ that we will use is the beautiful fact that its points form an abelian group, with identity element $\infty$. Throughout, $p$ denotes points on $E$, not the characteristic of the field $k$. The group law (denoted by $+$) is described by the geometric condition that for any three (distinct) points $p_1$, $p_2$ and $p_3$ in $E$, we have $p_1 + p_2 + p_3 = 0$ if and only if the three points are collinear (in the projective plane), and the opposite of a point $(x : y : z)$ is the point $(x : -y : z)$ (symmetry with respect to the $x$-axis). This way one can construct the sum of any two distinct points, by computing the equation of the line joining them, and taking the opposite (in the sense above) of the third intersection point with the curve. That there are exactly three intersection points follows immediately from the fact that the polynomial $f$ is of degree 3. In addition, to compute the double $p + p$ of a point $p$, the same construction is done with the tangent line at $p$; the condition that $f$ has no double root ensures that this tangent line always exists. Also, because $f \in k[X]$, it follows easily that the $k'$-rational points $E(k')$, for any extension $k'/k$, form a subgroup of $E(\bar{k})$. We do not prove those facts here; completely elementary proofs, by computing explicitly the coordinates of the sum $p_1 + p_2$ of two points according to the recipe above and checking the abelian group axioms (associativity is the only difficulty), are fairly straightforward (see for instance [IR], ch. 18, 19).
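The chord-and-tangent recipe translates into a few lines of code. The following is a sketch under explicit assumptions, not the authors' presentation: the base field is $\mathbb{F}_p$ with $p$ an odd prime, the point $\infty$ is represented by `None`, and the slope formulas are the usual ones obtained by intersecting a line with $y^2 = x^3 + a_2x^2 + a_4x + a_6$; the names are hypothetical.

```python
from math import sqrt

def make_curve(a2, a4, a6, p):
    """Group law on y^2 = x^3 + a2 x^2 + a4 x + a6 over F_p (p odd prime, f squarefree)."""
    def f(x):
        return (x ** 3 + a2 * x * x + a4 * x + a6) % p

    def neg(P):
        # Opposite of (x, y) is (x, -y); the point at infinity is its own opposite.
        return None if P is None else (P[0], -P[1] % p)

    def add(P, Q):
        if P is None:
            return Q
        if Q is None:
            return P
        x1, y1 = P
        x2, y2 = Q
        if x1 == x2 and (y1 + y2) % p == 0:
            return None                                   # vertical line: P + Q = infinity
        if P == Q:
            slope = (3 * x1 * x1 + 2 * a2 * x1 + a4) * pow(2 * y1, p - 2, p) % p   # tangent
        else:
            slope = (y2 - y1) * pow(x2 - x1, p - 2, p) % p                         # chord
        x3 = (slope * slope - a2 - x1 - x2) % p           # third root of the cubic in x
        y3 = (slope * (x1 - x3) - y1) % p                 # opposite of the third intersection
        return (x3, y3)

    points = [None] + [(x, y) for x in range(p) for y in range(p) if y * y % p == f(x)]
    return add, neg, points

if __name__ == "__main__":
    add, neg, points = make_curve(0, 1, 1, 23)            # y^2 = x^3 + x + 1 over F_23
    print(len(points), abs(len(points) - 24) <= 2 * sqrt(23))
    P, Q, R = points[1], points[5], points[9]
    print(add(add(P, Q), R) == add(P, add(Q, R)))
```

Running through all triples of points confirms associativity for this small example, which is the one axiom that is not evident from the geometric description.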



We now introduce some further geometric objects related to $E$ or, more generally, to any smooth, projective, algebraic curve. Thus consider again a more general case: let $k$ be an algebraically closed field, and let $E$ be a plane algebraic curve over $k$, i.e. given by an equation

$$f(x, y, z) = 0$$

for some homogeneous $f \in k[X, Y, Z]$. We identify $E$ with the set of points in the projective plane. We assume that $E$ is smooth, which means here that for any point $p = (x : y : z)$ of $E$, not all partial derivatives $\partial f/\partial X(p)$, $\partial f/\partial Y(p)$, $\partial f/\partial Z(p)$ are zero. In this case the line with equation

$$\frac{\partial f}{\partial X}(p)(X - x) + \frac{\partial f}{\partial Y}(p)(Y - y) + \frac{\partial f}{\partial Z}(p)(Z - z) = 0$$

is well-defined and is the tangent line to $E$ at $p$. For elliptic curves $y^2 = f(x)$, this smoothness condition is equivalent to the fact that the polynomial $f$ has no double roots, by a simple calculation. Let $C$ be the affine curve corresponding to $E$ given by

$$C : f(x, y, 1) = 0$$

in $k^2$. Let $g(X, Y) = f(X, Y, 1) \in k[X, Y]$. We define $k[C] = k[X, Y]/(g)$. Elements of $k[C]$ can be interpreted as functions on $C$. We assume that $(g)$ is a prime ideal (this is easily checked in the case of elliptic curves) so that $k[C]$ is an integral domain, and we let $k(C)$ or $k(E)$ be its quotient field, called the function field of $C$ or of $E$. It is a finite extension of the field $k(X)$ of rational functions over $k$ (for elliptic curves $y^2 = f(x)$, it is a quadratic extension $k(X)(\sqrt{f})$). We interpret elements of the function field as rational functions on $E$, so given a point $p \in E$ and an element

k(E)x

~ Z,

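which gives the order of the zero (if $\ge 0$) or pole (if $< 0$) of a rational function at $p$. As a discrete valuation, it satisfies

$$\mathrm{ord}_p(c) = 0 \text{ for } c \in k^*, \qquad \mathrm{ord}_p(\varphi\psi) = \mathrm{ord}_p(\varphi) + \mathrm{ord}_p(\psi), \qquad \mathrm{ord}_p(\varphi + \psi) \ge \min(\mathrm{ord}_p(\varphi), \mathrm{ord}_p(\psi)).$$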


ideal. Let $\pi$ be a generator; then $\mathfrak{m}_p^d$ is generated by $\pi^d$ for any $d \ge 1$. The order $\mathrm{ord}_p$ can be defined for

and extended to a homomorphism $k(E)^* \to \mathbb{Z}$. The properties above are then quite easy to check. For elliptic curves $y^2 = f(x)$, one can easily see that if $p \ne \infty$, and $p = (x, y)$ with $y \ne 0$, it is possible to take $\pi = X - x$. For $p = \infty$, one can take $\pi = X/Y$, and one finds that $\mathrm{ord}_\infty(X) = -2$, $\mathrm{ord}_\infty(Y) = -3$. Every non-zero element

and the second gives the degree of a divisor: Div(E)--. Z, deg { [p] >->1. As suggested by the notation, divisors of the type (-> rp defined by
L(D)={O}U{
If D = n 1[pi]+ ... +nk[pk]-mi[q 1]- ... -mj[qj] with ni, mi ~ 0, then

# 0,



(1) Poles of order at most $n_i$ at $p_i$, $1 \le i \le k$;
(2) Zeros of order at least $m_i$ at $q_i$, $1 \le i \le j$.

It follows immediately that if $D_2 \le D_1$, then $L(D_2) \subset L(D_1)$. Also, if $D_1 \sim D_2$, then writing $D_1 = D_2 + (\psi)$, the map $\varphi \mapsto \psi\varphi$ induces an isomorphism $L(D_1) \to L(D_2)$; in particular, $\ell(D_1) = \ell(D_2)$ only depends on the linear equivalence class of the divisor. The following interpretation is also clear: there is a bijection

$$\mathbb{P}(L(D)) \to \{\text{Effective divisors linearly equivalent to } D\}, \qquad \varphi \mapsto (\varphi) + D \tag{11.34}$$

between the projective space $\mathbb{P}(L(D))$ of $L(D)$ and the effective divisors linearly equivalent to $D$ (by definition of $L(D)$, the map has image in the set of effective divisors). This also requires the important fact that

$$L(0) = k, \qquad \ell(0) = 1. \tag{11.35}$$

In other words, an everywhere defined rational function on E is constant: this is obvious for elliptic curves, since regularity on C forces such a
0. deg(D) > 0 or D ~ 0.

Then either

PRooF. If £(D) > 0, there is a non-zero element

THEOREM 11.27. Let $E$ be an elliptic curve over an algebraically closed field $k$. For any divisor $D$ on $E$, $\ell(D)$ is finite and we have the formula

$$\ell(D) - \ell(-D) = \deg D. \tag{11.36}$$

Equivalently, by Lemma 11.26, we can compute $\ell(D)$ for any $D$ by:
1. If $\deg(D) \ge 0$, and $D \not\sim 0$, then $\ell(D) = \deg(D)$.
2. If $D \sim 0$, then $\ell(D) = 1$.
3. If $\deg(D) < 0$, then $\ell(D) = 0$.

This is the Riemann-Roch theorem specialized for elliptic curves. See the remark below for the general case. The simple proof for the Riemann-Roch theorem hinges on the remarkable interaction of the group structure on $E$ with the divisor group. Indeed, consider the map $\sigma : \mathrm{Div}(E) \to E$ defined by $\sigma(n_1[p_1] + \cdots + n_k[p_k]) = n_1 p_1 + \cdots + n_k p_k$, the $+$ on the right side corresponding to the group law on $E$.

PROPOSITION 11.28. Let $D$ be a divisor on $E$. Then we have $D \sim [\sigma(D)] + (\deg(D) - 1)[\infty]$.

PROOF. In essence this "is" the group law itself: by induction, we need only consider D = [p] + [q] and D = [p] - [q] for some points p and q. If p or q is the origin oo, the result is obvious. So consider first the case of D = [p] + [q] with p, q =/= oo and p =/= q. Then the equation aX + bY + c = 0 of the line joining p and q defines an element

(
~

0 we get

D = [p] + [q] ~ (
-[p] + [oo] ~ [-p]- [oo]

and hence $D \sim [\sigma(D)] + [\infty]$, as desired. If $p = q$, the equation of the tangent line gives $2[p] + [-2p] - 3[\infty] \sim 0$, and the result again follows. Finally if $D = [p] - [q]$, use (11.37) to reduce to the previous case: $[p] - [q] = [p] + (-[q] + [\infty]) - [\infty] \sim [p] + [-q] - 2[\infty] \sim [p - q] - [\infty]$. □

PROOF OF THE RIEMANN-ROCH THEOREM. Notice that (11.36) for $D$ or $-D$ are equivalent. By Proposition 11.28, we have $\ell(D) = \ell([\sigma(D)] + (\deg D - 1)[\infty])$, $\ell(-D) = \ell([-\sigma(D)] + (1 - \deg D)[\infty])$. One of the two divisors on the right is effective, so we can assume that $D$ is effective and of the form $D = [p] + n[\infty]$ with $p \in E$ and $n \ge 0$. If $p = \infty$, then $D = m[\infty]$ with $m \ge 1$. We must prove $\ell(D) = m$. Any element

I n :s; m}l

= m.

Now we consider $D = [p] + n[\infty]$ with $n \ge 0$ and $p \ne \infty$. If $n = 0$, we can use the automorphism $q \mapsto q - p$ which sends $p$ to $\infty$ to get an isomorphism $L([p]) \simeq L([\infty])$ which implies $\ell([p]) = 1$.



Ifn;, 1, we have D) n[oo] hence L(n[oo]) C L(D). Thus n :(£(D). Moreover, because only a simple pole is allowed at p, we have £(D) :( n + 1: if 'lrp is a function with a simple zero at p, we have a k-linear map

L(D)/ L(n[oo]) ___, k { 'P >--> (7rp
(X- x) = [p] + [-p] - 2[oo], (Y + y) = [-p] + [p'] + [p"]- 3[oo] for some p' and p". Let

I (cp)+D>O}

and we let $\ell_k(D) = \dim_k L_k(D)$. It is clear that $\ell_k(D) \le \ell(D)$.

THEOREM 11.29. For any $k$-rational divisor $D$, we have

$$\ell_k(D) = \ell(D).$$

In particular the Riemann-Roch formula holds with $\ell_k(D)$ instead of $\ell(D)$.

PROOF. This is a special case of the following theorem, which is a formulation of Hilbert's Theorem 90 for $GL(n)$: let $k$ be a field, $\bar{k}$ an algebraic closure, $V$ a $\bar{k}$-vector space with an action of $G_k$. Then there is a basis of $V$ made of elements which are $G_k$-fixed (equivalently, let $V_k = V^{G_k}$, then $V = V_k \otimes \bar{k}$, or $\dim_k V_k = \dim_{\bar{k}} V$). For a proof, see for instance [Sil], Lemma II.5.8.1. □

REMARK. This theory adapts in the following way to more general algebraic curves (see for instance [Ha], IV): if $E$ is smooth and projective over $k$, one can define a certain divisor class $K$ (called the canonical class, and related to differentials on $E$). It has degree $\deg(K) = 2g - 2$ for some integer $g \ge 0$, called the genus of $E$, and the Riemann-Roch Theorem takes the form

$$\ell(D) - \ell(K - D) = \deg(D) + 1 - g. \tag{11.38}$$


Elliptic curves correspond to $g = 1$; in this case the canonical class is trivial, and this reduces to Theorem 11.27. The case $g = 0$ corresponds to the projective line, and is also very easy. The proof of (11.38) is much more involved than the one for elliptic curves, since there is no group law on the curve which would help.

11.10. The local zeta function of elliptic curves.

Let $C$ be an elliptic curve over $\mathbb{F}$ given by (11.32). It is more convenient to use here the corresponding projective curve $E$, as described in Section 11.9. The zeta function of $E$ is defined as the formal power series

$$Z(E) = \exp\Big( \sum_{n \ge 1} \frac{|E(\mathbb{F}_n)|}{n} T^n \Big). \tag{11.39}$$

We first relate $Z(E)$ to points on the curve, by giving its Euler product (compare Lemma 11.7). To do this we introduce some terminology, which comes from the language of schemes.

DEFINITION. Let $E$ be an elliptic curve over a finite field $\mathbb{F}$ with $q$ elements. A closed point of $E$ is the Galois orbit of a point $x_0 \in E(\bar{\mathbb{F}})$. The degree $\deg(x)$ of a closed point $x$ is the cardinality (necessarily finite) of the orbit and its norm is $Nx = q^{\deg(x)}$. The set of closed points of $E$ is denoted $|E|$.

This notion is analogous to that of an irreducible polynomial in $\mathbb{F}[X]$ used for the zeta functions of Gauss sums and Kloosterman sums. To every closed point $x \in |E|$ is associated an $\mathbb{F}$-rational divisor which is simply the formal sum of all the elements in the orbit. The degree of this divisor is the degree of $x$. Moreover, it is easy to see that the group of $\mathbb{F}$-rational divisors is the free abelian group generated by the divisors associated to closed points.

LEMMA 11.30. We have the Euler product expansion

$$Z(E) = \prod_{x \in |E|} (1 - T^{\deg(x)})^{-1}, \tag{11.40}$$

where the product is over all closed points of $E$.

PROOF. This is very close to Lemmas 11.7 and 11.9. First by decomposing the points in $E(\mathbb{F}_n)$ in Galois orbits we obtain

$$|E(\mathbb{F}_n)| = \sum_{d \mid n} d \sum_{\substack{x \in |E| \\ \deg(x) = d}} 1,$$

which is the analogue of Lemma 11.1. Then we have

$$T \frac{Z'(E)}{Z(E)} = \sum_{n \ge 1} |E(\mathbb{F}_n)| T^n$$

and, on the other hand, this operator applied to the right side of (11.40) yields

$$\sum_{x \in |E|} \deg(x) \sum_{n \ge 1} T^{n \deg(x)} = \sum_{n \ge 1} T^n \Big( \sum_{d \mid n} d \sum_{\substack{x \in |E| \\ \deg(x) = d}} 1 \Big)$$

hence the result. □


Using the Riemann-Roch Theorem, we now prove the rationality and functional equation of the zeta function.

THEOREM 11.31. The zeta function $Z(E)$ of an elliptic curve is a rational function. More precisely, it is of the form

$$Z(E) = \frac{1 - aT + qT^2}{(1 - T)(1 - qT)} \tag{11.41}$$

where $a \in \mathbb{Z}$ is defined by the relations $|E(\mathbb{F})| = q + 1 - a$ or, in terms of $C$, $|C(\mathbb{F})| = q - a$. The zeta function satisfies the functional equation $Z(E, (qT)^{-1}) = Z(E, T)$.

LEMMA 11.32. Let $d \ge 0$ and let $h_d(C)$ be the set of linear equivalence classes of $\mathbb{F}$-rational divisors of degree $d$. Then $h_d(C)$ is finite and $|h_d(C)| = |h_0(C)| \le |E(\mathbb{F})|$.

PROOF. For any rational divisor $D$, we have by Proposition 11.28 the equivalence $D \sim [\sigma(D)] + (\deg D - 1)[\infty]$, thus the linear equivalence class of $D$ only depends on $\sigma(D)$. If $D$ is $\mathbb{F}$-rational, it follows that $\sigma(D) \in E(\mathbb{F})$, and therefore the inequality $|h_d(C)| \le |E(\mathbb{F})|$ holds. Moreover, it is clear that the map $D \mapsto D + d[\infty]$ with inverse $D \mapsto D - d[\infty]$ induces a bijection between $h_d(C)$ and $h_0(C)$. □

PROOF OF THEOREM 11.31. Since $\mathbb{F}$-rational divisors on $E$ are simply combinations with integer coefficients of divisors associated to closed points, the Euler product (11.40) gives the formal power series expression

$$Z(E) = \sum_{D \ge 0} T^{\deg(D)}$$

where the sum is over all effective $\mathbb{F}$-rational divisors on $E$. Split the sum according to the degree $d$ of $D$; for $d = 0$, the only effective divisor is $D = 0$ so

$$Z(E) = 1 + \sum_{d \ge 1} T^d \sum_{\substack{D \ge 0 \\ \deg(D) = d}} 1.$$

For each $d$, split further the sum over divisors of degree $d$ in linear equivalence classes. By Lemma 11.32, there are $h_0(C)$ equivalence classes for each $d$. For a given class (that of $D$ say), the contribution is the number of effective ($\mathbb{F}$-rational) divisors linearly equivalent to $D$. By (11.34) and Theorem 11.29, this is equal to

$$|\mathbb{P}(L_{\mathbb{F}}(D))| = \frac{q^{\ell_{\mathbb{F}}(D)} - 1}{q - 1} = \frac{q^{\ell(D)} - 1}{q - 1}.$$

Since $d \ge 1$, the Riemann-Roch theorem implies $\ell(D) = \deg(D) = d$, so the computation of $Z(E)$ is now straightforward

$$Z(E) = 1 + \sum_{d \ge 1} T^d \sum_{\substack{D \ge 0 \\ \deg(D) = d}} 1 = 1 + h_0(C) \sum_{d \ge 1} \frac{q^d - 1}{q - 1} T^d$$

$$= 1 + \frac{h_0(C)}{q - 1} \Big( \frac{qT}{1 - qT} - \frac{T}{1 - T} \Big) = 1 + \frac{h_0(C) T}{(1 - T)(1 - qT)} = \frac{1 - bT + qT^2}{(1 - T)(1 - qT)} \tag{11.42}$$


where $b$ is defined by $h_0(C) = q + 1 - b$. This proves the rationality, and gives the precise form, except that we need to prove that $a = b$, where $|E(\mathbb{F})| = q + 1 - a$, or equivalently $h_0(C) = |E(\mathbb{F})|$ (actually in Lemma 11.32, we have already shown $|h_0(C)| \le |E(\mathbb{F})|$, but we do not need it any more). To obtain this equality, start from the original definition (11.39) of $Z(E)$, and compare with (11.42): the latter is seen to imply that $|E(\mathbb{F})| = q + 1 - b = h_0(C)$. Finally the functional equation of $Z(E)$ is a formal consequence of (11.42). □

It is worth recording separately one of the last steps of the proof.

PROPOSITION 11.33. Let $E$ be an elliptic curve over a finite field $\mathbb{F}$, let $D$ be a divisor on $E$. Then $D$ is principal if and only if $\deg(D) = 0$ and $\sigma(D) = 0 \in E$. More precisely, the map $j : D \mapsto \sigma(D)$ is an isomorphism between the group of divisor classes of degree 0 and $E(\mathbb{F})$.

PROOF. A divisor $D$ is $\mathbb{F}_n$-rational for some $n \ge 1$; looking at $E$ over $\mathbb{F}_n$, it suffices to prove the isomorphism between classes of $\mathbb{F}$-rational divisors of degree 0 and $\mathbb{F}$-rational points. But $j$ is a surjective ($j([x] - [\infty]) = x$) map between finite sets with the same cardinality ($|h_0(C)| = |E(\mathbb{F})|$). □
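The rational form (11.41) means that the single number $a = q + 1 - |E(\mathbb{F})|$ determines $|E(\mathbb{F}_n)|$ for every $n$: writing $1 - aT + qT^2 = (1 - \alpha T)(1 - \beta T)$, one has $|E(\mathbb{F}_n)| = q^n + 1 - (\alpha^n + \beta^n)$. The sketch below tests this prediction for $n = 2$. It is an illustration under explicit assumptions: $q = p$ is an odd prime, $\mathbb{F}_{p^2}$ is modelled as $\mathbb{F}_p[t]/(t^2 - d)$ with $d$ a non-residue, the curve is the illustrative example $y^2 = x^3 + x + 1$, and all names are hypothetical.

```python
def count_over_fp(a2, a4, a6, p):
    """|E(F_p)| for y^2 = x^3 + a2 x^2 + a4 x + a6, including the point at infinity."""
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    return 1 + sum(square_roots[(x ** 3 + a2 * x * x + a4 * x + a6) % p] for x in range(p))

def count_over_fp2(a2, a4, a6, p):
    """|E(F_{p^2})| by brute force, with F_{p^2} = F_p[t]/(t^2 - d), d a non-residue."""
    d = next(v for v in range(2, p) if pow(v, (p - 1) // 2, p) == p - 1)
    def mul(s, t):
        return ((s[0] * t[0] + d * s[1] * t[1]) % p, (s[0] * t[1] + s[1] * t[0]) % p)
    elements = [(u, v) for u in range(p) for v in range(p)]
    square_roots = {}
    for y in elements:
        sq = mul(y, y)
        square_roots[sq] = square_roots.get(sq, 0) + 1
    total = 1                                   # the point at infinity
    for x in elements:
        x2 = mul(x, x)
        x3 = mul(x2, x)
        fx = ((x3[0] + a2 * x2[0] + a4 * x[0] + a6) % p,
              (x3[1] + a2 * x2[1] + a4 * x[1]) % p)
        total += square_roots.get(fx, 0)
    return total

def predicted_count(a, q, n):
    """q^n + 1 - (alpha^n + beta^n), using s_0 = 2, s_1 = a, s_n = a s_{n-1} - q s_{n-2}."""
    s_prev, s_cur = 2, a
    for _ in range(n - 1):
        s_prev, s_cur = s_cur, a * s_cur - q * s_prev
    return q ** n + 1 - s_cur

if __name__ == "__main__":
    for p in (5, 7, 11):
        a = p + 1 - count_over_fp(0, 1, 1, p)
        print(p, a, count_over_fp2(0, 1, 1, p), predicted_count(a, p, 2))
```

The power sums $\alpha^n + \beta^n$ are computed by a linear recurrence, so no complex arithmetic is needed.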

This is a special case of the so-called Abel-Jacobi Theorem, for an elliptic curve over a finite field. It actually holds over any field, and a generalization to all (smooth projective) curves is the content of the theory of jacobian varieties associated to curves.

To conclude the proof of Theorem 11.25, we proceed as in the case of Kloosterman sums: from (11.41), we derive

$$|E(\mathbb{F}_n)| - (q^n + 1) = -(\alpha^n + \beta^n)$$

where $1 - aT + qT^2 = (1 - \alpha T)(1 - \beta T)$. Then Lemma 11.22, applied with the input from Stepanov's Theorem 11.13, shows that $|\alpha| \leq \sqrt{q}$, $|\beta| \leq \sqrt{q}$, and since $\alpha\beta = q$, this concludes the proof.

EXERCISE 2. Assuming the general Riemann-Roch formula (11.38), prove that for a smooth projective algebraic curve $E$ of genus $g$ over a finite field $\mathbb{F}$ with $q$ elements, the zeta function

$$Z(E) = \exp\Bigl(\sum_{n \geq 1} \frac{|E(\mathbb{F}_n)|}{n} T^n\Bigr)$$

is a rational function of the form

$$Z(E) = \frac{P(T)}{(1 - T)(1 - qT)}$$

for some polynomial $P$ with integral coefficients and degree $2g$. [Hint: The question will arise whether there exist $\mathbb{F}$-rational divisor classes on $E$ of degree 1 (which is obvious for elliptic curves since the point $\infty$ is $\mathbb{F}$-rational). The image of the degree map is $\delta\mathbb{Z}$ for some $\delta \mid (2g - 2)$ (the degree of the canonical class). Using this fact, find a preliminary form of the zeta function and analyze the poles to show that actually $\delta = 1$ (see [Mor2], 3.3).]
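The conclusion of Theorem 11.25 for the base field can be checked numerically. The following is only an illustrative sketch: it counts points by brute force on the sample curve $y^2 = x^3 + x + 1$ (an assumption chosen for the example, with good reduction at the primes listed), and the helper names `count_points` and `hasse_defect` are hypothetical.

```python
import math

def count_points(a, b, p):
    """Number of points of y^2 = x^3 + a*x + b over F_p, including the point at infinity."""
    # sq_count[v] = number of y in F_p with y^2 = v
    sq_count = [0] * p
    for y in range(p):
        sq_count[y * y % p] += 1
    affine = sum(sq_count[(x * x * x + a * x + b) % p] for x in range(p))
    return affine + 1

def hasse_defect(a, b, p):
    """Return a = p + 1 - |E(F_p)|, which equals alpha + beta."""
    return p + 1 - count_points(a, b, p)

# The discriminant of y^2 = x^3 + x + 1 is -16 * 31, so p = 2 and p = 31 are excluded.
for p in [5, 7, 11, 13, 101]:
    a_p = hasse_defect(1, 1, p)
    print(p, a_p, abs(a_p) <= 2 * math.sqrt(p))
```

Since $a = \alpha + \beta$ with $|\alpha| = |\beta| = \sqrt{q}$, the printed flag should be `True` for every prime of good reduction.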


11.11. Survey of further results: a cohomological primer.

The methods of Stepanov are very useful and, in certain circumstances, they provide the best tools available today, especially when the genus of the curve is large compared to the cardinality of the finite field (see for instance the proof by Heath-Brown of non-trivial estimates for Heilbronn sums [HB2]). However, the deepest understanding of exponential sums over finite fields and the greatest impact on classical problems of analytic number theory come from the sophisticated concepts of algebraic number theory, especially the $\ell$-adic cohomology theory as developed by Grothendieck and his collaborators, which give a very powerful and flexible framework for working with very general exponential sums. The proof of the Riemann Hypothesis for varieties by Deligne [De1], and even more his far-reaching generalization [De2], are the basis for the extensive work of Katz, Laumon and others. It is beyond the scope of this book to discuss this theory in great detail. Let us direct the interested reader to the survey articles [Lau], [K2]. Study of the foundational basis of the $\ell$-adic theory can be started in [De3] and continued together with applications in the books of Katz, for instance [K3], [K4].

We will limit this section to a short introduction of the basic vocabulary and we will state a few of the most fundamental results in this language. We then include examples to show that such knowledge can already be very useful even when one is not familiar with the details and background of algebraic geometry.

In Sections 11.4 and 11.5, we have shown that Gauss sums and Kloosterman sums can be related to analogues of Dirichlet characters over finite fields. The $\ell$-adic cohomological formalism which we now discuss can be thought of as relating exponential sums, dually, to objects which are Galois-theoretic in nature.

The exponential sums $S_n$ defined by (11.8) can be interpreted as sums over the algebraic curve $U_{f,g}$ consisting of $\mathbb{A}^1$ minus the poles of the rational functions $f$ and $g$. More generally, one wishes to consider exponential sums not only over curves but over more general varieties. We will use some basic vocabulary of algebraic geometry to describe such situations, but will illustrate them in the simpler case of curves. Already the case of (11.8) and $U_{f,g}$ is quite interesting.

Let $\mathbb{F}$ be a finite field and $U/\mathbb{F}$ be a smooth algebraic variety of dimension $d \geq 0$ (technically, we assume as part of the smoothness assumption that $U$ is geometrically connected, and as part of being a variety that $U$ is quasi-projective). The simplest examples in dimension 1 are $U_{f,g}$, or smooth projective curves. In dimension $d > 1$, the most important examples are the affine $d$-space $\mathbb{A}^d$, with set of points $\mathbb{A}^d(\mathbb{F}) = \mathbb{F}^d$, and the projective $d$-space. The exponential sums over $U$ will be of the type

$$S_n = \sum_{x \in U(\mathbb{F}_n)} \chi(N(f(x)))\,\psi(\operatorname{Tr}(g(x))) \tag{11.43}$$

where $f$ and $g$ are $\mathbb{F}$-rational functions defined on $U$.
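To make (11.43) concrete, here is a minimal sketch for $n = 1$ and $U \subset \mathbb{A}^1$ over $\mathbb{F} = \mathbb{Z}/p\mathbb{Z}$. It assumes that $\chi$ is the quadratic character, that $\psi(t) = e(t/p)$, and that $f$ and $g$ are polynomials (so there are no poles to remove, only the zeros of $f$); the function names are hypothetical.

```python
import cmath
import math

def legendre(a, p):
    """Quadratic character modulo an odd prime p (value 0 at 0)."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def mixed_sum(f, g, p):
    """S_1 = sum over x in F_p with f(x) != 0 of chi(f(x)) * psi(g(x))."""
    total = 0j
    for x in range(p):
        fx = f(x) % p
        if fx == 0:
            continue  # x is not a point of U
        total += legendre(fx, p) * cmath.exp(2j * math.pi * (g(x) % p) / p)
    return total

p = 101
gauss = mixed_sum(lambda x: x, lambda x: x, p)  # the quadratic Gauss sum
print(abs(gauss), math.sqrt(p))
```

With $f(x) = g(x) = x$ the sum is the quadratic Gauss sum, whose modulus is exactly $\sqrt{p}$, so the two printed numbers should agree up to rounding.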

To $U/\mathbb{F}$ is associated the so-called arithmetic étale fundamental group $\pi_1(U)$ which "classifies" étale coverings $V \to U$ of $U$, and is the analogue both of the Galois group of a field and of the "ordinary" topological fundamental group. A morphism $f : V \to U$ of algebraic varieties is étale if it is flat and unramified; if $U$ is a curve, this means $V$ is a curve, $f$ is non-constant and unramified. For the simpler purposes of exponential sums, the fundamental group can be considered somewhat as a black box in what follows, but one should keep in mind that the elements $\gamma$ in $\pi_1(U)$ act as automorphisms of any étale covering $\pi : V \to U$ (i.e. $\pi(\gamma x) = \pi(x)$ for any $\gamma \in \pi_1(U)$ and $x \in V$), and that it is a functor: any map $U \to V$ between varieties induces a continuous group homomorphism $\pi_1(U) \to \pi_1(V)$. (One should fix a base-point in defining $\pi_1(U)$, but a more or less canonical choice exists, the so-called "generic point" of the scheme $U$.)

EXAMPLES. (1) Let $U$ be a single point $\{x\}$ defined over $\mathbb{F}$. Then $\pi_1(U)$ is the Galois group $\operatorname{Gal}(\bar{\mathbb{F}}/\mathbb{F})$.

(2) Let $U/\mathbb{F}$ be a smooth curve, not necessarily projective. There is an associated smooth projective curve $C/\mathbb{F}$ such that $U \subset C$ with complement a finite set $T$ of points. If $U = \mathbb{G}_m$ for instance, then $C = \mathbb{P}^1$ is the projective line, and $T = \{0, \infty\}$. The fundamental group can be described concretely as follows: let $K = \mathbb{F}(U) = \mathbb{F}(C)$ be the function field of $U$, i.e. the field of rational functions on $U$ or $C$ (if $C = \mathbb{P}^1$, then $K = \mathbb{F}(t)$ is the usual field of rational fractions). We have the Galois group $G_K = \operatorname{Gal}(\bar K/K)$ of $K$. For every closed point $x$ of $C$, there is the corresponding discrete valuation $\operatorname{ord}_x$ of $K$. This extends to the separable closure $\bar K$ of $K$, and gives rise to a decomposition group $D_x < G_K$ and an inertia group $I_x < D_x$ as in classical algebraic number theory, with the property that $D_x/I_x \simeq \operatorname{Gal}(\bar{\mathbb{F}}_q/\mathbb{F}_q)$, where $\mathbb{F}_q$ is the residue field of $x$, a finite field with $q = Nx$ elements. Then $\pi_1(U)$ "is" the quotient of $G_K$ by the smallest closed normal subgroup containing all inertia groups $I_x$ for $x$ a closed point of $U$.

Fix a prime number $\ell \neq p$. The objects used to interpret exponential sums over $U$ are the so-called $\ell$-adic sheaves on $U$. In the simpler cases, those will be "lisse", in which case there is a simpler alternate Galois-theoretic description which we take as definition.

DEFINITION. Let $U/\mathbb{F}$ be a smooth variety over a finite field. A lisse $\ell$-adic sheaf on $U$ is a continuous representation $\rho : \pi_1(U) \to GL(V)$ where $V$ is a finite dimensional $\bar{\mathbb{Q}}_\ell$-vector space. Continuity refers to the profinite topology on $\pi_1(U)$ and the $\ell$-adic topology on $V$.

Note the similarity with the definition of Galois representations of number fields (see Section 5.13). Because of the original definition of a sheaf, one usually denotes $\ell$-adic sheaves by curly letters $\mathcal{F}$, $\mathcal{G}$, etc. Notice that one can obviously speak of direct sums, tensor products, symmetric powers, etc., of lisse $\ell$-adic sheaves by performing the corresponding operations on the representations. Also one can speak of irreducible sheaves, etc.

An important $\ell$-adic sheaf, denoted $\bar{\mathbb{Q}}_\ell(1)$, is obtained by considering the natural action of $\pi_1(U)$ on $\ell$-power roots of unity, which arises from the étale coverings where one simply extends the base field from $\mathbb{F}$ to its extension by roots of unity. This action is given by a certain character $\chi_\ell : \pi_1(U) \to \bar{\mathbb{Q}}_\ell^{\times}$. Using this sheaf, one defines Tate twists: if $\mathcal{F}$ is a lisse $\ell$-adic sheaf and $i \in \mathbb{Z}$, then one denotes by $\mathcal{F}(i)$ ($\mathcal{F}$ twisted $i$ times) the sheaf which corresponds to the action $\rho'$ of $\pi_1(U)$ on the same vector space but with

$$\rho'(\gamma) = \chi_\ell(\gamma)^i \rho(\gamma);$$

in other words, $\mathcal{F}(1) = \mathcal{F} \otimes \bar{\mathbb{Q}}_\ell(1)$ for instance.

Exponential sums arise by looking at the action of the Frobenius elements at points of $U$. Let $x$ be a closed point of $U$, which can be seen as a Galois orbit of points in $U(\bar{\mathbb{F}})$. The fundamental group of the "point" $x$ is the Galois group $D_x$ of the residue field of $U$ at $x$, which is isomorphic to $\mathbb{F}_n$ where $n$ is the degree of $x$. By functoriality there is a map $D_x \to \pi_1(U)$. We have $D_x \simeq \operatorname{Gal}(\bar{\mathbb{F}}_n/\mathbb{F}_n)$ and the latter is generated (topologically) by the Frobenius morphism $\sigma$, so taking the image we get in $\pi_1(U)$ a well-defined conjugacy class, called the arithmetic Frobenius conjugacy class at $x$. In particular, for any $\ell$-adic sheaf one can speak of the trace $\operatorname{Tr} \rho(\sigma_x)$ without ambiguity. However, it turns out that it is the inverse $F$ of $\sigma$ (the so-called geometric Frobenius) which appears naturally in the cohomological description of exponential sums. We denote by $F_x$ the corresponding conjugacy class; it is called simply the Frobenius conjugacy class at $x$ (omitting the adjective geometric).

THEOREM 11.34. Let $U/\mathbb{F}$ be a smooth variety, let $f \neq 0$ and $g$ be $\mathbb{F}$-rational functions on $U$, let $\psi$ be an additive character and $\chi$ a multiplicative character of $\mathbb{F}$. Let $S_n = S_n(U, f, g, \chi, \psi)$ be the associated exponential sums over $U(\mathbb{F}_n)$ as in (11.43). Then there exists a lisse $\ell$-adic sheaf $\mathcal{F}$ on $U$ of degree 1 with the property that for all $n \geq 1$ we have

$$S_n = \sum_{x \in U(\mathbb{F}_n)} \operatorname{Tr}(F_x \mid \mathcal{F}) \tag{11.44}$$

where we denote $\operatorname{Tr}(g \mid \mathcal{F}) = \operatorname{Tr}(\rho(g) \mid V)$, $\mathcal{F}$ corresponding to the representation $\rho : \pi_1(U) \to GL(V)$.

To compare with the characters used to describe Gauss sums and Kloosterman sums, one should think of the latter as analogues of Dirichlet characters or Hecke characters, whereas the $\ell$-adic sheaves given by this theorem are analogues of Galois characters. The correspondence between the two concepts is an instance of reciprocity or class-field theory.

We sketch the construction of $\mathcal{F}$ in the case where $\chi = 1$ and $g$ is a non-zero rational function on $U = \mathbb{A}^1 - \{\text{poles of } g\}$, over $\mathbb{F}$, which makes it clear that this is very closely related to the argument in Section 11.7. Consider the curve

$$C : \; y^q - y = g(x) \tag{11.45}$$

and notice that there is a surjective map $\pi : (x, y) \mapsto x$ from $C$ to $U$. For any $a \in \bar{\mathbb{F}}$, the equation $y^q - y - a = 0$ is separable, hence it has $q$ distinct roots in $\bar{\mathbb{F}}$. In fact the additive group of $\mathbb{F}$ acts on the roots by translation: if $y$ is a root and $z \in \mathbb{F}$, then $(y + z)^q - (y + z) = y^q - y = a$. Moreover, $\pi : C \to U$ is an étale covering (we have just seen it is everywhere unramified and surjective). In other words, $\pi$ is an étale Galois covering with Galois group isomorphic to the additive group $\mathbb{F}$ (coverings given by such equations are called Artin-Schreier coverings). The fundamental group $\pi_1(U)$ acts on $C$ by automorphisms of the covering, which means as translations by elements of $\mathbb{F}$ as above. This defines a surjective map $\varphi : \pi_1(U) \to \mathbb{F}$. Consider the representation $\rho$ of $\pi_1(U)$ induced from the trivial representation of $\pi_1(C)$ to $\pi_1(U)$, which can be described as the space

$$V = \{f : \pi_1(U) \to \bar{\mathbb{Q}}_\ell \mid f(\tau\gamma) = f(\gamma) \text{ for any } \tau \in \pi_1(C)\}$$

(where $\tau \in \pi_1(C)$ is seen through the map $\pi_1(C) \to \pi_1(U)$ coming from $\pi$), on which $\pi_1(U)$ acts by translation on the right

$$\rho(\gamma) f(\tau) = f(\tau\gamma).$$

The elements $f \in V$ depend only on $\pi_1(C) \backslash \pi_1(U) \simeq \mathbb{F}$ (i.e. on the automorphisms of the covering $C \to U$), which implies that $V \simeq \bar{\mathbb{Q}}_\ell^{\,q}$ is an $\ell$-adic sheaf on $U$ of degree $q$. The representation space $V$ can be decomposed over the additive characters $\psi$ of $\mathbb{F}$,

$$V = \bigoplus_{\psi} \mathcal{L}_\psi$$

where $\mathcal{L}_\psi$ is the $\psi$-eigencomponent of $V$, namely

$$\mathcal{L}_\psi = \{f \in V \mid \rho(\gamma) f = \psi(\varphi(\gamma)) f \text{ for all } \gamma \in \pi_1(U)\}.$$

It is easy to see that each $\mathcal{L}_\psi$ is an $\ell$-adic sheaf on $U$, and because $\rho$ is induced from the trivial representation, each $\mathcal{L}_\psi$ is of degree 1.

Then for every additive character $\psi$, the $\ell$-adic sheaf on $U$ corresponding to $\mathcal{L}_{\bar\psi}$ is the sheaf satisfying (11.44) for the exponential sums $S_n(U, g, \psi)$. Indeed, if $x \in U(\mathbb{F}_n)$, and $y$ satisfies $y^q - y = g(x)$, then the Frobenius of $x$ acts on $y$ by $y^{q^n} = y + \operatorname{Tr}_{\mathbb{F}_n/\mathbb{F}}(g(x))$ since

$$y^{q^n} - y = y^{q^n} - y^{q^{n-1}} + y^{q^{n-1}} - \cdots + y^q - y = (y^q - y)^{q^{n-1}} + \cdots + y^q - y = \operatorname{Tr}(y^q - y) = \operatorname{Tr} g(x).$$

Hence $\varphi(\sigma_x) = \operatorname{Tr} g(x)$ and by definition of $\mathcal{L}_\psi$ it follows that $\sigma_x$ acts on $\mathcal{L}_\psi$ by multiplication by $\psi(\operatorname{Tr} g(x))$, hence $F_x = \sigma_x^{-1}$ acts by $\bar\psi(\operatorname{Tr} g(x))$, which gives (11.44). In particular, note that taking the trace for $\bar{\mathbb{Q}}_\ell$ on $C$ we derive

$$|C(\mathbb{F}_n)| = \sum_{\psi} S_n(U, g, \psi),$$

as in (11.30).

EXERCISE 3. (1) Let $S_n$ be the character sum (11.8) with $g = 0$ for some multiplicative character $\chi$ of $\mathbb{F}^*$ and some non-zero rational function $f \in \mathbb{F}(x)$, on the variety $U = \mathbb{A}^1 - \{\text{zeros and poles of } f\}$. Describe as above the construction of the sheaf $\mathcal{L}$ satisfying (11.44) in this case. [Hint: Use the cover $y^d = f(x)$, where $d$ is the order of the multiplicative character $\chi$.]

(2) Let $S_n$ be as in (11.8), $U \subset \mathbb{A}^1$ the complement of the zeros and poles of $f$ and the poles of $g$. If $\mathcal{L}_\psi$ is the sheaf satisfying (11.44) for $f = 1$ and $\mathcal{L}_\chi$ is the sheaf satisfying (11.44) for $g = 0$, show that $\mathcal{L} = \mathcal{L}_\psi \otimes \mathcal{L}_\chi$ satisfies (11.44) for $S_n$.

EXAMPLES. (1) Even the case $\rho = 1$ is interesting when dealing with a general variety $U$. This "trivial" $\ell$-adic sheaf is denoted $\bar{\mathbb{Q}}_\ell$, and one has $S_n = |U(\mathbb{F}_n)|$.

(2) For the sheaf $\bar{\mathbb{Q}}_\ell(1)$, notice that $\sigma_x$ acts by $\xi \mapsto \xi^q$ for any root of unity $\xi$ if $Nx = q$. Therefore $F_x$ acts by $\xi \mapsto \xi^{1/q}$ and in particular the only eigenvalue of $F_x$ is $q^{-1}$.
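The point count obtained from the Artin-Schreier covering in the construction above can be tested in the simplest case $n = 1$, $\mathbb{F} = \mathbb{Z}/p\mathbb{Z}$. The sketch below assumes a polynomial $g$ (here $g(x) = x^3 - x$, chosen only for illustration) and parametrizes the additive characters as $\psi_a(t) = e(at/p)$; the function names are hypothetical.

```python
import cmath
import math

def count_artin_schreier(g, p):
    """Count pairs (x, y) in F_p^2 with y^p - y = g(x)."""
    return sum(1 for x in range(p) for y in range(p)
               if (pow(y, p, p) - y - g(x)) % p == 0)

def additive_sum(g, a, p):
    """S_1(U, g, psi_a) with psi_a(t) = exp(2*pi*i*a*t/p)."""
    return sum(cmath.exp(2j * math.pi * ((a * g(x)) % p) / p) for x in range(p))

p = 13
g = lambda x: x**3 - x
lhs = count_artin_schreier(g, p)
rhs = sum(additive_sum(g, a, p) for a in range(p))
print(lhs, rhs)
```

Over the base field every $y$ satisfies $y^p = y$, so both sides reduce to $p$ times the number of zeros of $g$; the identity becomes less trivial over the extensions $\mathbb{F}_n$, where the trace enters.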



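Now in addition to $U/\mathbb{F}$ we consider its "extension of scalars" $\bar U/\bar{\mathbb{F}}$ over the algebraic closure of $\mathbb{F}$. There is a corresponding geometric fundamental group $\pi_1(\bar U)$, which sits in an exact sequence

$$1 \to \pi_1(\bar U) \to \pi_1(U) \to \operatorname{Gal}(\bar{\mathbb{F}}/\mathbb{F}) \to 1. \tag{11.46}$$

To every $\ell$-adic sheaf $\mathcal{F}$ on $U$ are associated the $\ell$-adic cohomology groups with compact support of $\bar U$ with coefficients in $\mathcal{F}$. Those are finite-dimensional $\bar{\mathbb{Q}}_\ell$-vector spaces, denoted $H^i_c(\bar U, \mathcal{F})$ for $i \geq 0$. The key point is that the Galois group of $\mathbb{F}$ acts naturally on $H^i_c(\bar U, \mathcal{F})$, and in particular, so do the Frobenius $\sigma$ and its inverse $F$, the geometric Frobenius. The key to the cohomological interpretation of exponential sums is the

GROTHENDIECK-LEFSCHETZ TRACE FORMULA. Let $U/\mathbb{F}$ be a smooth variety of dimension $d \geq 0$, $\mathcal{F}$ an $\ell$-adic sheaf on $U$. We have $H^i_c(\bar U, \mathcal{F}) = 0$ if $i > 2d$ and for any $n \geq 1$

$$\sum_{x \in U(\mathbb{F}_n)} \operatorname{Tr}(F_x \mid \mathcal{F}) = \operatorname{Tr}(F^n \mid H^0_c(\bar U, \mathcal{F})) - \operatorname{Tr}(F^n \mid H^1_c(\bar U, \mathcal{F})) + \cdots + \operatorname{Tr}(F^n \mid H^{2d}_c(\bar U, \mathcal{F})). \tag{11.47}$$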

Therefore to evaluate the exponential sums (11.43) using the associated sheaf, we need to know the traces, or equivalently the eigenvalues, of $F$ (equivalently, of $\sigma = F^{-1}$) acting on $H^i_c$ for $0 \leq i \leq 2d$. It turns out that in most cases $H^0_c$ and $H^{2d}_c$ are easy to compute:

PROPOSITION 11.35. Let $\mathcal{F}$ be a lisse $\ell$-adic sheaf on a smooth variety $U/\mathbb{F}$, corresponding to the representation $\rho$ of $\pi_1(U)$ on the $\bar{\mathbb{Q}}_\ell$-vector space $V$. We have

$$H^0_c(\bar U, \mathcal{F}) = \begin{cases} V^{\pi_1(\bar U)} & \text{if } U \text{ is projective,} \\ 0 & \text{if } U \text{ is not projective,} \end{cases} \tag{11.48}$$

and

$$H^{2d}_c(\bar U, \mathcal{F}) = V_{\pi_1(\bar U)}(-d) \tag{11.49}$$

where $V^G$ denotes the space of vectors invariant under the action of a group $G$ on an abelian group $V$, and $V_G$ denotes the space of co-invariants, the largest quotient of $V$ on which $G$ acts trivially. In both cases, the isomorphisms are canonical isomorphisms of vector spaces with an action of the Galois group of $\mathbb{F}$.

Since $V$ is a representation of $\pi_1(U)$, the exact sequence (11.46) shows that $V^{\pi_1(\bar U)}$ and $V_{\pi_1(\bar U)}$ are acted on by $\operatorname{Gal}(\bar{\mathbb{F}}/\mathbb{F})$, "through" the given representation $\rho$.

This proposition shows that for a curve $U/\mathbb{F}$, the only "difficult" cohomology group is $H^1_c(\bar U, \mathcal{F})$.

EXAMPLE. Let $U = E/\mathbb{F}_p$ be an elliptic curve, $\mathcal{F} = \bar{\mathbb{Q}}_\ell$ the trivial sheaf. By the proposition one has

(1) $H^0_c(\bar E, \bar{\mathbb{Q}}_\ell) = \bar{\mathbb{Q}}_\ell$, with trivial action of $F$ (since $\bar{\mathbb{Q}}_\ell$ is the trivial sheaf).

(2) $H^2_c(\bar E, \bar{\mathbb{Q}}_\ell) = \bar{\mathbb{Q}}_\ell(-1)$, so by definition of the twist, $F$ acts by multiplication by $p$ (on roots of unity, i.e. on $\bar{\mathbb{Q}}_\ell(1)$, $\sigma$ acts by $\xi \mapsto \xi^p$, hence $F$ by multiplication by $p^{-1}$).


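The Lefschetz trace formula (11.47) gives

$$|E(\mathbb{F}_{p^n})| = 1 - \operatorname{Tr}(F^n \mid H^1_c(\bar E, \bar{\mathbb{Q}}_\ell)) + p^n$$

(compare Theorem 11.31). More generally, one derives the rationality of the zeta function directly from the Trace Formula.

COROLLARY 11.36. Let $U$, $S_n$ and $\mathcal{F}$ be as in Theorem 11.34. For $0 \leq i \leq 2d$, let $b_i = \dim H^i_c(\bar U, \mathcal{F})$ and

$$P_i(T) = \det(1 - FT \mid H^i_c(\bar U, \mathcal{F})) = \prod_{j=1}^{b_i} (1 - \alpha_{i,j} T).$$

We have

$$\exp\Bigl(\sum_{n \geq 1} \frac{S_n}{n} T^n\Bigr) = \prod_{0 \leq i \leq 2d} P_i(T)^{(-1)^{i+1}}$$

and for $n \geq 1$,

$$S_n = \sum_{0 \leq i \leq 2d} (-1)^i \sum_{j=1}^{b_i} \alpha_{i,j}^n. \tag{11.50}$$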
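For an elliptic curve and the trivial sheaf, (11.50) reads $|E(\mathbb{F}_{p^n})| = 1 - (\alpha^n + \beta^n) + p^n$ with $\alpha\beta = p$, so the count over $\mathbb{F}_{p^2}$ is predicted by the count over $\mathbb{F}_p$. The sketch below checks this by brute force; the sample curve $y^2 = x^3 + x + 1$, the prime $p = 7$, the non-residue 3 used to build $\mathbb{F}_{49}$, and all function names are assumptions made for the illustration.

```python
def count_fp(a, b, p):
    """Points of y^2 = x^3 + a*x + b over F_p, including the point at infinity."""
    sq = [0] * p
    for y in range(p):
        sq[y * y % p] += 1
    return 1 + sum(sq[(x**3 + a * x + b) % p] for x in range(p))

def count_fp2(a, b, p, nonres):
    """Points over F_{p^2} = F_p[t]/(t^2 - nonres); elements are pairs (u, v) = u + v*t."""
    def mul(z, w):
        return ((z[0] * w[0] + nonres * z[1] * w[1]) % p,
                (z[0] * w[1] + z[1] * w[0]) % p)
    elems = [(u, v) for u in range(p) for v in range(p)]
    sq = {}  # sq[s] = number of y with y^2 = s
    for y in elems:
        s = mul(y, y)
        sq[s] = sq.get(s, 0) + 1
    total = 1  # point at infinity
    for x in elems:
        x3 = mul(mul(x, x), x)
        rhs = ((x3[0] + a * x[0] + b) % p, (x3[1] + a * x[1]) % p)
        total += sq.get(rhs, 0)
    return total

p, a, b = 7, 1, 1
n1 = count_fp(a, b, p)
trace1 = p + 1 - n1              # alpha + beta
trace2 = trace1**2 - 2 * p       # alpha^2 + beta^2, using alpha * beta = p
predicted = p * p + 1 - trace2
print(n1, predicted, count_fp2(a, b, p, nonres=3))
```

The last two printed numbers should coincide; the same recursion on power sums predicts the counts over all further extensions from the single number $|E(\mathbb{F}_p)|$.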

Theorems 11.4, 11.8 and the result of Exercise 1 are all special cases of this corollary, together with suitable computations of cohomology groups. The numbers $b_i$ are called the $\ell$-adic Betti numbers for $\mathcal{F}$.

Of much greater importance, however, is Deligne's vast generalization of the Riemann Hypothesis [De2]. One starts with the following "local" definition:

DEFINITION. Let $w \in \mathbb{Z}$ be an integer. A lisse $\ell$-adic sheaf $\mathcal{F}$ on $U/\mathbb{F}$ is said to be pure of weight $w$ if for any closed point $x$ of $U$, all eigenvalues of $F_x$ acting on the $\bar{\mathbb{Q}}_\ell$-vector space $V$ associated to $\mathcal{F}$ are algebraic integers all conjugates of which have the same absolute value equal to $q^{w/2}$ where $q = Nx$ is the cardinality of the residue field.

For instance, the trivial sheaf $\bar{\mathbb{Q}}_\ell$ is pure of weight 0 (all eigenvalues 1). For any $i \in \mathbb{Z}$, $\bar{\mathbb{Q}}_\ell(i)$ is pure of weight $-2i$, and if $\mathcal{F}$ is pure of weight $w$, then $\mathcal{F}(i)$ is pure of weight $w - 2i$. For any exponential sum (11.43), the associated sheaf $\mathcal{F}$ is pure of weight 0 because the only eigenvalue at $x$ is the root of unity $\chi(N f(x))\psi(\operatorname{Tr} g(x))$.

THEOREM 11.37 (DELIGNE). Let $U/\mathbb{F}$ be a smooth variety and $\mathcal{F}$ a lisse $\ell$-adic sheaf on $U$, pure of weight $w$. Let $i \geq 0$ and let $\xi$ be any eigenvalue of the geometric Frobenius $F$ acting on $H^i_c(\bar U, \mathcal{F})$. Then $\xi$ is an algebraic integer, and if $\alpha \in \mathbb{C}$ is a conjugate of $\xi$, we have

$$|\alpha| \leq q^{(w+i)/2}. \tag{11.51}$$

The conclusion is also phrased as saying that $H^i_c(\bar U, \mathcal{F})$ is mixed of weights $\leq i + w$. If there is equality in (11.51), then $H^i_c(\bar U, \mathcal{F})$ is said to be pure (of weight $w + i$). In certain cases, one can apply duality theorems (for instance Poincaré duality) to deduce further that (11.51) is an equality.



REMARK. Although Deligne's proof is a monumental achievement of very deep algebraic geometry, it is an interesting fact that a crucial use is made of a generalization of the method of Hadamard and de la Vallée Poussin for proving non-vanishing of $L$-functions on the line $\operatorname{Re}(s) = 1$ (see Section 5.4). Similarly, in Deligne's first proof [De1], the ideas of the classical Rankin-Selberg method for modular forms are essential (specifically, Deligne acknowledges the influence of [Ra3]).

EXAMPLE. Let $C/\mathbb{F}$ be a smooth connected projective curve (for instance, an elliptic curve). By Proposition 11.35 as in the previous example, we have easily: (1) $H^0_c(\bar C, \bar{\mathbb{Q}}_\ell) = \bar{\mathbb{Q}}_\ell$ with trivial action of $F$; (2) $H^2_c(\bar C, \bar{\mathbb{Q}}_\ell) = \bar{\mathbb{Q}}_\ell(-1)$. [...]

It follows that if $\alpha$ is one of (the complex conjugates of) the eigenvalues of $F$ on $H^1_c(\bar C, \bar{\mathbb{Q}}_\ell)$, then $p/\alpha$ is one also. Hence from Theorem 11.37, since $\bar{\mathbb{Q}}_\ell$ is pure of weight 0, one deduces that $|\alpha| = \sqrt{p}$. Thus

$$|C(\mathbb{F}_n)| = p^n + 1 - \sum_{i=1}^{2g} \alpha_i^n$$

where the $\alpha_i \in \bar{\mathbb{Q}}$ are the eigenvalues of $F$ on $H^1_c$. Estimating trivially now, we get

$$\bigl|\,|C(\mathbb{F}_n)| - (p^n + 1)\bigr| \leq 2g\,p^{n/2},$$

recovering the Riemann Hypothesis, and in particular, Theorem 11.25 for the case $g = 1$.

In the case of exponential sums (11.43), the sheaf $\mathcal{F}$ is pure of weight 0, hence denoting

$$d(\mathcal{F}) = \max\{i \mid H^i_c(\bar U, \mathcal{F}) \neq 0\},$$

we derive directly from (11.50) and (11.51) the bound

$$|S_n| \leq \sum_{0 \leq i \leq d(\mathcal{F})} b_i\, q^{ni/2} \tag{11.52}$$

for $n \geq 1$ and, in particular,

$$|S_n| \leq \Bigl(\sum_{0 \leq i \leq d(\mathcal{F})} b_i\Bigr) q^{n\,d(\mathcal{F})/2}. \tag{11.53}$$

As in the case of Kloosterman sums, the exponent $d(\mathcal{F})/2$ is best possible in this inequality. The bound $d(\mathcal{F}) \leq 2d$ gives a trivial estimate (because $U$ is smooth of dimension $d$, it has about $q^{nd}$ points, as proved by the Riemann Hypothesis for the trivial sheaf $\bar{\mathbb{Q}}_\ell$). Any improvement of this trivial bound is equivalent with $H^{2d}_c(\bar U, \mathcal{F}) = 0$, and the square root cancellation often expected from heuristic reasonings is equivalent with $H^i_c(\bar U, \mathcal{F}) = 0$ for $i > d$. Although not always true, this turns out to hold "generically", as the analytic intuition suggests (see for instance Theorem 11.43 below).


For the exponential sums (11.43) we have $d(\mathcal{F}) < 2d$, unless $\mathcal{F}$ is the trivial sheaf $\bar{\mathbb{Q}}_\ell$, so there is always a non-trivial bound. This follows from (11.49), since $\mathcal{F}$ is of degree 1 so the space of co-invariants is either the whole space (meaning the representation is trivial) or 0. However, this small gain is usually insufficient in applications.

Another surprising consequence of Deligne's result and the discreteness of integers is the following "self-improving" statement:

COROLLARY 11.38. Let $S_n$ be an exponential sum as in (11.43) and $\mathcal{F}$ the associated sheaf. Suppose $w \geq 0$ is an integer such that

$$|S_n| \ll q^{n(w/2 + \delta)}$$

for some $\delta \in [0, \tfrac{1}{2}[$ and $n \geq 1$. Then we have $d(\mathcal{F}) \leq w$, hence

$$|S_n| \ll q^{nw/2}.$$

A second issue in applying the estimates (11.52) or (11.53) in the context of applications to analytic number theory is that we usually have $\mathbb{F} = \mathbb{Z}/p\mathbb{Z}$, with the prime number $p$ varying. In this case, whereas the variety $U$ can be defined over $\mathbb{Q}$ (or $\mathbb{Z}$) so that the sum is, for all $p$, over the $\mathbb{F}_p$-points of the reduction $U_p$ of $U$ modulo $p$, the sheaves $\mathcal{F}_p$ genuinely depend on $p$ (see the equation (11.45)), i.e. there is no theory of sheaves over $U/\mathbb{Z}$ giving each $\mathcal{F}_p$ by "reduction modulo $p$". (Katz has asked a number of times for such a theory of "exponential sums over $\mathbb{Z}$"; see e.g. [K2], but it remains elusive.) Thus, the Betti numbers

$$b_i(p) = \dim H^i_c(\bar U_p, \mathcal{F}_p)$$

of the cohomology groups can depend on $p$, and the applicability of the results above would be ruined, even with the Riemann Hypothesis, if these dimensions were not bounded in a reasonable way in terms of $p$. They are in fact bounded. The first general result in this direction is due to Bombieri [Bo4] for additive character sums (11.43) where $f = 1$, and was generalized by Adolphson and Sperber [AS1], [AS2] for general sums (their methods are $p$-adic, based on Dwork's original ideas). In general, those results bound the Euler characteristic

$$\chi_c(\mathcal{F}) = \sum_{i=0}^{2d} (-1)^i \dim H^i_c(\bar U, \mathcal{F}) = \sum_{0 \leq i \leq 2d} (-1)^i b_i$$

of a sheaf $\mathcal{F}$ on $U/\mathbb{F}$, but further arguments of Katz [K5] show how to deduce bounds for

$$\sigma_c(\mathcal{F}) = \sum_{i=0}^{2d} \dim H^i_c(\bar U, \mathcal{F}) = \sum_{0 \leq i \leq 2d} b_i$$

(hence for $b_i \leq \sigma_c(\mathcal{F})$) from those for $\chi_c(\mathcal{F})$.

THEOREM 11.39. Let $U/\mathbb{Q}$ be a smooth variety over $\mathbb{Q}$, $f$ and $g$ functions on $U$ with $f$ invertible. Let $\ell$ be a prime number and for all $p \neq \ell$ such that the reduction $U_p$ of $U$ modulo $p$ is smooth, let $\chi$ and $\psi$ be any multiplicative and additive characters of $\mathbb{F}_p$. Let $\mathcal{F}_p$ be an $\ell$-adic sheaf on $U_p$ such that

$$\sum_{x \in U_p(\mathbb{F}_{p^n})} \chi(N f(x))\,\psi(\operatorname{Tr} g(x)) = \sum_{x \in U_p(\mathbb{F}_{p^n})} \operatorname{Tr}(F_x \mid \mathcal{F}_p)$$

for $n \geq 1$. We have $\sigma_c(\mathcal{F}_p) \leq C$ where $C$ is a constant depending only on $U$, $f$ and $g$.

A simple explicit bound is given in [AS3] if $f(x) = 1$, so only the additive characters occur, and $g$ is a Laurent polynomial on $U = (\mathbb{Q} - \{0\})^d$. The sums over $\mathbb{Z}/p\mathbb{Z}$ in question are therefore sums in $d$ variables of the type

$$S_{f,p} = \sum_{x_1, \ldots, x_d \in \mathbb{F}_p^{*}} \psi(f(x_1, \ldots, x_d)) \tag{11.54}$$

where $f \in \mathbb{Q}[x_1, x_1^{-1}, \ldots, x_d, x_d^{-1}]$ is a non-zero Laurent polynomial. Writing

$$f = \sum_{j \in J} a_j x^j$$

for some (finite) set $J \subset \mathbb{Z}^d$, the Newton polyhedron $W(f)$ of $f$ is defined to be the convex hull in $\mathbb{R}^d$ of $J \cup \{0\}$.

PROPOSITION 11.40. With the above assumptions, denoting by $\mathcal{F}_{f,p}$ the associated sheaf for the sums $S_{f,p}$, we have

$$|\chi_c(\mathcal{F}_{f,p})| \leq d!\,\operatorname{Vol}(W(f)), \qquad \sigma_c(\mathcal{F}_{f,p}) \leq 10^d\, d!\,\operatorname{Vol}(W(f))$$

for any $p$ not dividing the denominator of any coefficient of $f$, where $\operatorname{Vol}(W(f))$ is the volume of the Newton polyhedron in the subspace spanned by $W(f)$ in $\mathbb{R}^d$, with respect to Lebesgue measure.

Note that by using exclusion-inclusion and detecting polynomial equations by means of multiplicative characters, one can use combinations of sums of the type (11.54) to describe much more general ones. Also, in many cases, one can show that all the odd (or all the even) cohomology groups vanish, in which case $|\chi_c(\mathcal{F})| = \sigma_c(\mathcal{F})$. See also Theorems 11 and 12 of [K5] for explicit estimates in quite general cases.

We now give examples of computations using these fundamental results. For exponential sums arising in analytic number theory, one often needs nothing more, if one uses skillfully some other simple tricks such as averaging over extra parameters to analyze the weight of the roots.

EXAMPLE 1. The Kloosterman sums $S(a, b; p)$ for $ab \neq 0$ can be treated using Proposition 11.40 with $d = 1$ and $f(x) = ax + bx^{-1}$. Then $W(f)$ is the interval $[-1, 1]$. By Proposition 11.35, we have $H^0_c = H^2_c = 0$ in this case since $U = \mathbb{P}^1 - \{0, \infty\}$ is not projective, so $\sigma_c = -\chi_c$. By Theorem 11.37, $H^1_c$ is mixed of weight $\leq 1$. Hence we recover the Weil bound:

$$|S(a, b; p)| \leq b_1 \sqrt{p} \leq 2\sqrt{p}.$$

(Of course, in fact we have $b_1 = 2$ and the last inequality is an equality.)

EXAMPLE 2. The previous example generalizes to the multiple Kloosterman sums defined by

$$K_r(a, p) = \sum_{x_1 \cdots x_r = a} e\Bigl(\frac{x_1 + \cdots + x_r}{p}\Bigr) \tag{11.55}$$

for $r \geq 2$ and $a \neq 0$, so $K_2(a, p) = S(a, 1; p)$ (see [Bo4], [De1]). Without appealing to the $L$-function, one can nevertheless get some information by averaging over $a$. We get

$$\sum_{a \neq 0} |K_r(a, q)|^2 = q^r - q^{r-1} - \cdots - q - 1. \tag{11.56}$$

Hence $|K_r(a, q)| \leq q^{r/2}$. To improve this elementary bound we appeal to the following lemma.

LEMMA 11.41. Given a finite set of distinct angles $\theta_i$ modulo $2\pi$ and complex numbers $a_i$ we have

$$\sum_{n \leq N} \Bigl|\sum_i a_i e(n\theta_i)\Bigr|^2 = N\|a\|^2 + O(1)$$

where the implied constant does not depend on $N$. Hence

$$\limsup_{n \to +\infty} \Bigl|\sum_i a_i e(n\theta_i)\Bigr| \geq \|a\|. \tag{11.57}$$

PROOF. We have

$$\sum_{n \leq N} \Bigl|\sum_i a_i e(n\theta_i)\Bigr|^2 = N \sum_i |a_i|^2 + \sum_{i \neq j} a_i \bar a_j \sum_{n \leq N} e(n(\theta_i - \theta_j)).$$

The inner sum is bounded by a constant independent of $N$, so the first result follows and (11.57) is an obvious consequence. $\square$

From (11.56) and (11.57) it follows that among the $K_r(a, p)$, there is at most one root of weight $r$, say for $K_r(a_0, p)$, and all other roots are of weight $\leq r - 1$. Notice that $K_r(a_0, p) \in \mathbb{Q}(\mu_p)$, the cyclotomic field of $p$-th roots of unity. Using the Galois action on $\mathbb{Q}(\mu_p)$, the conjugates of $K_r(a_0, p)$ are $K_r(a_0\nu^r, p)$ for $\nu \in \mathbb{F}_p^*$. By the Riemann Hypothesis, this means that the conjugate of the root $\xi$ of weight $r$ is still a root of weight $r$ for $K_r(a_0\nu^r, p)$. Hence $\nu^r = 1$ for all $\nu \in \mathbb{F}_p^*$, which is only possible if $p - 1 \mid r$. In particular all roots are of weight $\leq r - 1$ if $p > r + 1$. One therefore gets by Proposition 11.40

$$K_r(a, q) \ll q^{(r-1)/2} \tag{11.58}$$

where the implied constant depends only on $r$. In the case of Kloosterman sums the Newton polyhedron is the simplex with vertices $(1, 0, \ldots, 0), \ldots, (0, \ldots, 0, 1), (-1, \ldots, -1)$ whose volume is $r/(r-1)!$. Moreover, it is known that the zeta function is a polynomial so $\chi_c = -\sigma_c$ and we get the precise estimate

$$|K_r(a, q)| \leq r\,q^{(r-1)/2}.$$

This was first proved by Deligne [De3], without any assumption on $p$ and $r$.
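For $r = 2$ both the Weil bound of Example 1 and the averaging formula (11.56), which reads $\sum_{a \neq 0} |K_2(a, p)|^2 = p^2 - p - 1$, are easy to confirm by direct computation. The following is a minimal sketch with a hypothetical helper `kloosterman`, taking $e(t) = \exp(2\pi i t)$.

```python
import cmath
import math

def kloosterman(a, b, p):
    """S(a, b; p) = sum over x in F_p^* of e((a*x + b*x^{-1}) / p)."""
    total = 0j
    for x in range(1, p):
        x_inv = pow(x, p - 2, p)  # inverse of x modulo p by Fermat's little theorem
        total += cmath.exp(2j * math.pi * ((a * x + b * x_inv) % p) / p)
    return total

p = 101
values = [kloosterman(a, 1, p) for a in range(1, p)]
print(max(abs(v) for v in values), 2 * math.sqrt(p))   # Weil bound
print(sum(abs(v) ** 2 for v in values), p * p - p - 1)  # formula (11.56) with r = 2
```

The second line of output illustrates why averaging alone only yields $|K_2(a, p)| \leq p$: the mean square is of size $p$, but this does not exclude a single large term without the weight argument that follows (11.57).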



EXAMPLE 3. Here is another higher-dimensional example. In [CI1], the following exponential sum over finite fields appears:

$$W(\chi, \psi; p) = \sum_{x, y \bmod p} \chi(xy(x + 1)(y + 1))\,\psi(xy - 1) \tag{11.59}$$

where $p$ is prime, $\chi$ is a non-trivial quadratic character modulo $p$ and $\psi$ is any multiplicative character modulo $p$.

THEOREM 11.42 (CONREY-IWANIEC). There exists an absolute positive constant $C$ such that

$$|W(\chi, \psi; p)| \leq Cp \tag{11.60}$$

for all $p$ and all $\psi$ as above.

The first step of the proof is to apply Theorem 11.34 to say there exists an $\ell$-adic sheaf $\mathcal{F}$ on the algebraic surface

$$U = \{(x, y) \mid xy(x + 1)(y + 1) \neq 0\},$$

pure of weight 0 for which we have, for $q = p^n$, $n \geq 1$, the formula

$$W(\chi, \psi; q) = \sum_{(x, y) \in U(\mathbb{F}_q)} \chi(N(xy(x + 1)(y + 1)))\,\psi(N(xy - 1)) = \sum_{(x, y) \in U(\mathbb{F}_q)} \operatorname{Tr}(F_{(x,y)} \mid \mathcal{F}).$$

The Lefschetz trace formula (11.47) takes the form

$$W(\chi, \psi; q) = \sum_{i=0}^{4} (-1)^i \operatorname{Tr}(F^n \mid H^i_c(\bar U, \mathcal{F})).$$

By Theorem 11.37, each $H^i_c(\bar U, \mathcal{F})$ is mixed of weights $\leq i$. Let $(\alpha_\nu, i_\nu, w_\nu)$ be the family of eigenvalues of $F$ acting on the whole cohomology (with multiplicities), together with their index and weight. We have $|\alpha_\nu| = p^{w_\nu/2}$ and

$$W(\chi, \psi; q) = \sum_{\nu} (-1)^{i_\nu} \alpha_\nu^n.$$

By Theorem 11.39 in this case, the total number of roots $\alpha_\nu$ is bounded by a constant independent of $p$. Thus, we gain on the trivial bound $W \ll p^2$ if $w_\nu \leq 3$ (instead of 4), and the statement of the theorem is that $w_\nu \leq 2$.

The second step is to show that there is at most one root of weight $\geq 3$ (actually, it must be $= 3$) and, if it exists, then $\psi = \chi$ is the non-trivial quadratic character. This will be derived from the following average formula:

$$A = \frac{1}{q - 1} \sum_{\psi} |W(\chi, \psi; q)|^2 = q^2 - 2q - 2. \tag{11.61}$$



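To prove this formula, open the square and sum over $\psi$ first getting by the orthogonality of characters

$$A = \sum_{u_1 v_1 = u_2 v_2} \chi(u_1 v_1 (u_1 + 1)(v_1 + 1))\,\chi(u_2 v_2 (u_2 + 1)(v_2 + 1))$$

(where we shorten the notation from $\chi \circ N$ to $\chi$). Next the summation over $u_1$ is performed giving

$$A = \sum_{v_1 \neq 0} \chi(v_1 + 1) \sum_{u_2 \neq 0} \chi(u_2 + 1)\chi(u_2) B(v_1, u_2), \qquad B(v_1, u_2) = \sum_{u_1 \neq 0} \chi((u_1 + 1)(u_1 v_1 + u_2)).$$

Then the sum over $u_2$ is performed giving

$$\sum_{u_2 \neq 0} \chi(u_2 + 1)\chi(u_2) B(v_1, u_2) = q\chi(v_1 + 1) + \chi(v_1) + 1,$$

and finally the sum over $v_1$ gives

$$A = \sum_{v_1 \neq 0} \chi(v_1 + 1)\bigl(q\chi(v_1 + 1) + \chi(v_1) + 1\bigr) = q(q - 2) - 2.$$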

From (11.61), using Lemma 11.41, it is clear that for all $\psi$ and $\nu$ we have $w_\nu \leq 3$ and moreover, $w_\nu \leq 2$ except for at most one root, for one character $\psi$. If this case occurs for $\psi$, it happens for $\bar\psi$ too, so the only possibility is $\psi$ being a real character. Since

$$W(\chi, 1; q) = \Bigl(\sum_{u} \chi(u(u + 1))\Bigr)^2 = 1,$$

we must have $\psi = \chi$.

The last step is to treat the case $\psi = \chi$ separately. Precisely, one can show that

$$|W(\chi, \chi; q)| \leq 4q$$

for any $p$ and $q = p^n$. This is done in a purely elementary manner without appealing to the Riemann Hypothesis, and we refer to [CI1] for the details. In fact, W. Duke showed that $W(\chi, \chi; p) = 2\operatorname{Re}(J^2(\chi, \xi))$, where $J(\chi, \xi)$ is the Jacobi sum and $\xi$ is a quartic character modulo $p \equiv 1 \pmod 4$. Also $W(\chi, \chi; p)$ is the $p$-th Fourier coefficient of the modular form $\eta(4z)^6$ of weight 3 and level 12.
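The size of the average in (11.61) can be examined by brute force for small primes. The sketch below is only illustrative: it takes $q = p$, parametrizes the multiplicative characters through a primitive root, and uses the convention $\psi(0) = 0$ for every $\psi$, including the trivial character. Lower-order terms are sensitive to such conventions, so only the main term $p^2$, which is all the weight argument uses, is compared. The function names are hypothetical.

```python
import cmath
import math

def primitive_root(p):
    """Smallest primitive root modulo an odd prime p (brute force)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found")

def average_square(p):
    """(1/(p-1)) * sum over all multiplicative characters psi mod p of |W(chi, psi; p)|^2."""
    g = primitive_root(p)
    index = {pow(g, k, p): k for k in range(p - 1)}  # discrete logarithm table
    def chi(t):
        t %= p
        if t == 0:
            return 0
        return 1 if index[t] % 2 == 0 else -1        # quadratic character
    # c[u] = sum of chi(xy(x+1)(y+1)) over the pairs (x, y) with xy - 1 = u
    c = [0] * p
    for x in range(p):
        for y in range(p):
            c[(x * y - 1) % p] += chi(x * y * (x + 1) * (y + 1))
    total = 0.0
    for k in range(p - 1):  # psi_k(g^j) = exp(2*pi*i*j*k/(p-1)) and psi_k(0) = 0
        w = sum(c[u] * cmath.exp(2j * math.pi * k * index[u] / (p - 1))
                for u in range(1, p))
        total += abs(w) ** 2
    return total / (p - 1)

for p in [5, 7, 11, 13]:
    print(p, average_square(p), p * p)
```

Since the average is about $p^2$ while there are boundedly many roots, Lemma 11.41 leaves room for at most one root of weight 3 among all characters, as claimed in the second step.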

When $U$ is not a curve, numerous geometric subtleties can be involved in dealing with the non-trivial cohomology groups $H^i_c$ with $i \neq 0, 2d$. Here are two general bounds, among many: the first one is due to Deligne [De1], and the second is the recent version in [FK] of a general "stratification" theorem of Katz and Laumon.

THEOREM 11.43. (1) Let $f \in \mathbb{Z}[X_1, \ldots, X_m]$ be a non-zero polynomial of degree $d$ such that the hypersurface $H_f$ in $\mathbb{P}^{m-1}$ defined by the equation

$$f_d(x_1, \ldots, x_m) = 0,$$

where $f_d$ is the homogeneous component of degree $d$ of $f$, is non-singular. For any $p \nmid d$ such that the reduction of $H_f$ modulo $p$ is smooth, any non-trivial additive character $\psi$ modulo $p$ and any $n \geq 1$ we have

$$\Bigl|\sum_{x_1, \ldots, x_m \in \mathbb{F}_{p^n}} \psi(\operatorname{Tr}(f(x_1, \ldots, x_m)))\Bigr| \leq (d - 1)^m p^{nm/2}. \tag{11.62}$$

(2) Let $d \geq 1$, $n \geq 1$ be integers, $V$ a locally closed subscheme of $\mathbb{A}^n_{\mathbb{Z}}$ of dimension $\dim V(\mathbb{C}) \leq d$ and $f \in \mathbb{Z}[X_1, \ldots, X_n]$ a polynomial. Then there exists $C = C(n, d, V, f)$ and closed subschemes $X_j \subset \mathbb{A}^n_{\mathbb{Z}}$ of relative dimension $\leq n - j$ such that

$$X_n \subset \cdots \subset X_2 \subset X_1 \subset \mathbb{A}^n_{\mathbb{Z}}$$

and for any rational function $g$ non-zero on $V$, any $h \in (\mathbb{Z}/p\mathbb{Z})^n - X_j(\mathbb{Z}/p\mathbb{Z})$, any prime $p$, any non-trivial additive character $\psi$ and multiplicative character $\chi$ modulo $p$, we have

$$\Bigl|\sum_{x \in V(\mathbb{Z}/p\mathbb{Z})} \chi(g(x))\,\psi(f(x) + h_1 x_1 + \cdots + h_n x_n)\Bigr| \leq C p^{\frac{d}{2} + \frac{j-1}{2}}.$$
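Part (1) can be illustrated numerically for $n = 1$. The sketch below assumes the sample polynomial $f(x, y) = x^3 + y^3 + xy$ (so $d = 3$, $m = 2$, and the leading form $x^3 + y^3$ has distinct roots in $\mathbb{P}^1$ modulo every prime $p \neq 3$) and the character $\psi(t) = e(t/p)$; the function name is hypothetical.

```python
import cmath
import math

def exp_sum_2d(f, p):
    """Sum over (x, y) in F_p^2 of exp(2*pi*i*f(x, y)/p)."""
    return sum(cmath.exp(2j * math.pi * (f(x, y) % p) / p)
               for x in range(p) for y in range(p))

f = lambda x, y: x**3 + y**3 + x * y  # degree d = 3 in m = 2 variables
for p in [5, 7, 11, 13]:
    s = exp_sum_2d(f, p)
    print(p, abs(s), (3 - 1) ** 2 * p)  # compare with (d - 1)^m * p^(m/2) = 4p
```

The trivial bound for this sum is $p^2$, so the bound $4p$ of (11.62) is the full square root cancellation in two variables.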

Note that in (1), subject to a geometric condition on $H_f$, we obtain square root cancellation in the exponential sum. In (2) the assumptions are much less stringent, but the conclusion is weaker: we have a family of exponential sums parameterized by $h \in \mathbb{A}^n$, and roughly speaking we have square root cancellation for "generic" sums (for $h$ outside an exceptional subvariety $X_1$ of codimension $\geq 1$ in $\mathbb{A}^n$), while worse and worse bounds can occur only on smaller and smaller subvarieties. We will give an application of (2) in Chapter 21.

The $\ell$-adic theory and formalism are much more developed than what we have surveyed here. It can also deal with sums over singular varieties, but the necessary algebraic notions become rather formidable, and we concede being unable to discuss the perverse sheaves that arise in more advanced situations. We wish to emphasize, however, that this theory is also particularly well-suited to the study of families of exponential sums. The parameters defining those, say $S_x$, are most naturally themselves points on some algebraic variety $X/\mathbb{F}$. In favorable circumstances there exists a lisse $\ell$-adic sheaf $\mathcal{F}$ on $X$ (corresponding to an action of $\pi_1(X)$ on $V$) such that

$$S_{x,n} = \operatorname{Tr}(F_x^n \mid V) \tag{11.63}$$

for every value of the parameter $x \in X$ and any $n \geq 1$. For example a non-zero polynomial $f$ of degree $\leq d$ over $\mathbb{F}$ can be described as an $\mathbb{F}$-rational point of the affine parameter space $X = \mathbb{A}^{d+1} - \{0\}$ by

$$f = \sum_{i=0}^{d} a_i X^i \mapsto (a_0, \ldots, a_d).$$

There is an $\ell$-adic sheaf $\mathcal{F}$ on $X$ such that

$$\sum_{x \in \mathbb{F}_n} \psi(f(x)) = \operatorname{Tr}(F_f^n \mid V).$$

Purity of a sheaf satisfying (11.63) depends on a first application of the Riemann Hypothesis. If it holds, the application of the $\ell$-adic theory typically results in equidistribution statements (following from [De2]) for the arguments of the exponential sums. This equidistribution is in some space of conjugacy classes of the "monodromy group" of the situation. We refer for instance to [KS] for a very lucid introduction to these profound aspects.

11.12. Comments.

In this closing section we give some impressions about how the exponential and character sums over finite fields interact with analytic number theory. There are more subtle issues between the two subjects than just applications of results concerning the first one for solving problems of the latter. We could be quite specific by covering completely a few representative examples, but it would be long and not transparent enough. Rather we decided to discuss principles, ideas and tricks in general terms and guide the reader to particular publications.


First of all some exponential sums appear when one uses Fourier analysis to get a hold on the sequence under investigation. There are no finite fields in the background, so the resulting sums are not immediately related to objects of algebraic geometry. However, one can complete these sums (by another use of Fourier analysis) and then factor them into sums of prime power moduli. Usually one can evaluate these local sums explicitly, or give strong estimates by elementary or ad hoc methods, except when the modulus is prime. But in the prime case one may naturally consider the sum as being over a finite field. This scheme allows us to appeal to the powerful results from algebraic geometry. However, the drawback of finishing by estimates for every complete sum individually is that one cannot exploit a possible cancellation from extra variables offered by the varying moduli (finite fields of different characteristics do not interact in algebraic geometry). Sometimes a kind of reciprocity formula can help turn the modulus into a variable (see for example [Ill] or [M3]). Another scenario is that the sums over the modulus appear in the spectral resolution of a differential operator, in which case the spectral theory produces estimates far stronger than those derived by algebraic geometry. For example, this is the case of sums of Kloosterman sums; see Chapter 16. One can also think this way about the real character sums with cubic polynomials; they are coefficients of a cusp form associated with an elliptic curve, so the modularity gives extra cancellation in summation over the modulus.

As a rule the exponential/character sum of a given modulus which comes out of analytic number theory is incomplete. This itself is not a problem because various completing techniques are available as mentioned above. Completing is a natural step to take, but is it useful or wasteful? At this point one should realize that a bound for a complete sum holds essentially the same for the original incomplete sum. This means that the result is relatively weaker for a shorter sum. Still it is non-trivial when the length of the original sum is larger than the square root of the modulus. Very short sums cannot be treated this way. We do not have an absolute recommendation when to complete or not a given incomplete sum. Our experience suggests executing the Fourier method as long as the resulting summation over the frequencies is shorter than the range of the original sum: at least one can feel that one is progressing. But sometimes it is worth acting otherwise, accepting a step backwards in this respect while opening a position for a stronger second move.



For example, imagine that the amplitude in the exponential sum is not a rational function, but nevertheless becomes one after an application of the stationary phase method to the relevant Fourier transform. In this case the losses from the range of summation can be recovered with extra savings by applications of algebraic arguments (see Section 8 of [Cil], where this game is played in several variables simultaneously). Whatever the arguments which lead to complete sums may be, in the final step one cannot beat the square root saving factor. Therefore to receive a non-trivial result one must first produce somehow a sum with a number of terms larger than the square root of the modulus. There are several ways to get started, depending on the shape of the exponential/character sum. First, one can try to apply a Weyl shift with the effect of squaring the number of terms. Similarly one may just square the whole sum, or raise it to a higher power to produce even more points. Note that shifting the variable and squaring the sum are not the same things; the first requires some additive features of the variable while the latter nothing at all. These operations seem to be quite superficial at first glance (we are taking essentially replicas of the original sum), yet with ingenuity one can rearrange the points so the summation goes in a skewed direction, the consecutive terms repel violently and randomly producing a considerable cancellation. This is easy to say, much harder to execute. In fact one needs many other devices, such as gluing several variables with small multiplicity in order to arrive at a single variable over a range larger than the square root of the modulus. One also must smooth out this composite variable before applying algebraic arguments. Usually an application of Cauchy's inequality and enlarging the outer summation (due to positivity) does the job. A powerful example how this works is given by Burgess [Burl]. 
In this paper a short character sum is estimated by an appeal to the Riemann Hypothesis for algebraic curves. An interesting point is that after all the tricks one comes to a complete character sum for a curve of a large genus, although the original sum is over a line segment. Different arguments for building one extra large variable are applied in [FI4]; consequently the final complete sum comes in three variables, or equivalently in four variables over a hypersurface. Here the Deligne theory applies (see the Appendix by Birch and Bombieri), although the related variety is singular. One should not be surprised or afraid of that singularity, because, after all, the process of creating more points of summation at the start is quite superficial. In this game one must be experienced when mixing the points to be sure that it is quite random. Another interesting case of creating and estimating exponential sums in three variables is given by N. Pitt [PJ].

Applications of the Riemann Hypothesis for curves over finite fields are by now customary. Much less successful is the use of genuinely higher-dimensional varieties. There are reasons for the difficulties involved. First of all when more variables appear, stronger restrictions are imposed on them which are harder to resolve (a kind of uncertainty principle). Just imagine having an abundance of points to work with, but which are not free because of some side conditions. For example how would you cope with a requirement that the determinants of a family of elliptic curves match the conductors?


Of course, there are also direct applications of the Riemann Hypothesis for varieties to traditional problems of solvability of diophantine equations by means of the circle method (see examples in Chapter 20). If the number of variables is sufficiently large, one needs nothing to manipulate, except for completing the sum by the standard Fourier method. Some ingenuity, however, is required to apply the circle method (a variant of Kloosterman) to treat diophantine equations with a relatively small number of variables, an excellent example being the work of Heath-Brown on cubic forms [HB6].

It is possible in some circumstances to beat the bound for exponential sums which is derived by the Riemann Hypothesis. This is because the angles of the roots of the $L$-function themselves vary so that additional cancellation may occur. Deligne and Katz have established such occurrences for families parameterized by points on curves or varieties. In other words, in their cases one is actually considering exponential sums in more variables. However, the cancellation of roots can also occur for families parameterized by points over small irregular sets. More important for analytic number theory is that these sets can be quite general: no structure of a subvariety is needed, but instead a kind of bilinear form structure would suffice. In practice it is not clear how to work with the roots, so one returns to the corresponding exponential sums where manipulations with the parameters (grouping, gluing, etc.) can be performed properly according to the shape of the involved rational function which is seen with the naked eye. In this process one must not destroy the complete variables since in our mind the corresponding summations are already executed in terms of the roots.
Therefore when applying Cauchy's inequality to smooth the one variable composed out of the parameters, we put all the complete variables to the inner summation, say $n$ of them, together with some remaining parameters which were not used in the composition. These inner parameters are critical for enlarging the diagonal. After squaring out we get a complete exponential sum in $2n + 1$ variables which depends on the inner parameters. Except for a few configurations of those, the complete exponential sum satisfies the best possible bound derived from the Riemann Hypothesis, thus saving the factor of square root of the modulus for each variable. Since the number of variables is larger than twice the original, we win the game. The above recipe is somewhat oversimplified, yet it reveals the source of extra saving. One can see how it works in the particular case of [Cll]. Speaking of [Cll] we would like to add that here the exponential/character sums in several variables emerge after applications of harmonic analysis with respect to the hyperbolic Laplace operator rather than by the traditional Fourier analysis.

The profound theory of Deligne and other geometers is being used in analytic number theory with spectacular effects, yet more ideas need to be invented to fully exploit its potential. Perhaps one should go beyond borrowing estimates and penetrate deeply inside the theory. This is a great subject for future research. P. Michel [M3] made the first significant steps (see also [FK] and [KSl]).

CHAPTER 12

CHARACTER SUMS

12.1. Introduction.

In analytic number theory one often encounters sums of type

\[ S = \sum_{x \in V} F(x) \tag{12.1} \]

where $V \subset \mathbb{Z}^n$ is a finite set and $F : V \to \mathbb{C}$ is a periodic function of period $q$. Because $V$ does not match the periodicity, $S$ is called an incomplete sum. Suppose

\[ F(x) = \chi(f(x))\psi(g(x)) \tag{12.2} \]

where $\chi$, $\psi$ are multiplicative and additive characters to modulus $q$, and $f$, $g$ are rational functions with integer coefficients. Precisely, we assume that

\[ f = f_1/f_0, \qquad g = g_1/g_0 \tag{12.3} \]

where $f_0, f_1, g_0, g_1 \in \mathbb{Z}[x]$ and

\[ (f_0(x)g_0(x), q) = 1 \quad \text{if } x \in V. \tag{12.4} \]

Having fixed these polynomials (which are not unique) we define

\[ \chi(f(x)) = \chi(f_1(x)\overline{f_0}(x)), \tag{12.5} \]
\[ \psi(g(x)) = \psi(g_1(x)\overline{g_0}(x)) \tag{12.6} \]

where, as usual, $\overline{a}$ denotes the multiplicative inverse of $a$ modulo $q$. Then the resulting sum

\[ S = \sideset{}{^*}\sum_{x \in V} \chi(f(x))\psi(g(x)) \tag{12.7} \]

is called an incomplete character sum. Here, and hereafter, the star restricts the summation to the points of $V$ satisfying (12.4). The residue classes $x$ (mod $q$) which satisfy (12.4) will be called admissible. An important case, but by no means the only one, is $V$ being a box. When the box has size exactly $q$ we get

\[ S(\chi, \psi) = \sideset{}{^*}\sum_{x \,(\mathrm{mod}\ q)} \chi(f(x))\psi(g(x)) \tag{12.8} \]

which is called a complete character sum. In this chapter we give basic techniques of estimating incomplete character sums in one variable over an interval. As an exercise for the reader, we suggest generalizing some of the forthcoming results to several variables.
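To make the definitions (12.2) to (12.8) concrete, here is a minimal numerical sketch in Python. It is an illustration only: the prime modulus, the choice of $\chi$ as the Legendre symbol, and the rational functions $f(x) = x^2 + 1$, $g(x) = 1/x$ are assumptions made for the example, and the helper names are hypothetical.

```python
import cmath


def e(t):
    """The additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def F(x, p):
    """F(x) = chi(f(x)) psi(g(x)) with the illustrative choices
    f(x) = x^2 + 1 (so f_0 = 1) and g(x) = 1/x (so g_0(x) = x).
    Returns None when x is not admissible, i.e. (g_0(x), p) != 1."""
    if x % p == 0:
        return None
    xbar = pow(x, -1, p)  # multiplicative inverse of x modulo p (Python 3.8+)
    return legendre(x * x + 1, p) * e(xbar / p)


def character_sum(points, p):
    """Starred sum of F over the admissible points only, as in (12.7)."""
    values = (F(x, p) for x in points)
    return sum(v for v in values if v is not None)


def complete_sum(p):
    """The complete sum (12.8): V is a full set of residues modulo p."""
    return character_sum(range(p), p)


def incomplete_sum(p, M, N):
    """An incomplete sum over the interval M < n <= M + N."""
    return character_sum(range(M + 1, M + N + 1), p)
```

Because $F$ has period $p$, summing over any $p$ consecutive integers reproduces `complete_sum(p)`, whereas `incomplete_sum(p, M, N)` with $N < p$ depends on the position of the interval.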


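12.2. Completing methods.

The most common treatment of an incomplete sum

\[ S(M; N) = \sideset{}{^*}\sum_{M < n \leq M+N} F(n) \tag{12.9} \]

goes by expanding it into complete sums (this is called the completing technique),

\[ S\Bigl(\frac{a}{q}\Bigr) = \sideset{}{^*}\sum_{x \,(\mathrm{mod}\ q)} F(x)\, e\Bigl(-\frac{ax}{q}\Bigr) \tag{12.10} \]

and treating the latter by various arithmetic means. To do so we split the summation into the admissible residue classes $n \equiv x$ (mod $q$), and detect these classes by the orthogonality of additive characters getting

\[ S(M; N) = \frac{1}{q} \sum_{a \,(\mathrm{mod}\ q)} \lambda\Bigl(\frac{a}{q}\Bigr) S\Bigl(\frac{a}{q}\Bigr) \tag{12.11} \]

where

\[ \lambda\Bigl(\frac{a}{q}\Bigr) = \sum_{M < n \leq M+N} e\Bigl(\frac{an}{q}\Bigr). \tag{12.12} \]

Usually the main contribution to $S(M; N)$ comes from $a = 0$ in (12.11) for which $\lambda(0) = N$ (we assume that $M$ and $N \geq 1$ are integers). For $0 < |a| \leq q/2$, we have $|\lambda(a/q)| \leq q|a|^{-1}$. Hence

LEMMA 12.1. Let $F(x)$ be a complex-valued function defined on the residue classes $x$ (mod $q$) which are admissible. Then the corresponding sums satisfy

\[ \Bigl| S(M; N) - \frac{N}{q} S(0) \Bigr| \leq \sum_{0 < |a| \leq q/2} |a|^{-1} \Bigl| S\Bigl(\frac{a}{q}\Bigr) \Bigr|. \tag{12.13} \]

Suppose that the complete sums satisfy

\[ \Bigl| S\Bigl(\frac{a}{q}\Bigr) \Bigr| \leq c(\theta)\,(a + b, q)^{1/2} q^{\theta} \tag{12.14} \]

for some $b \in \mathbb{Z}$ and $\theta > \frac{1}{2}$. This bound is often true with $\theta = \frac{1}{2} + \varepsilon$ for any $\varepsilon > 0$. Then (12.13) becomes

\[ \Bigl| S(M; N) - \frac{N}{q} S(0) \Bigr| \leq c(\theta)\,\ell(b, q)\, q^{\theta} \tag{12.15} \]

where

\[ \ell(b, q) = \sum_{0 < |a| \leq q/2} |a|^{-1} (a + b, q)^{1/2}. \tag{12.16} \]

Since $\ell(b, q)$ is usually small (for example, we have $\ell(0, q) \leq 2\tau(q)\log q$), the inequality (12.15) shows that the incomplete sum $S(M; N)$ satisfies essentially the same bound as do the corresponding complete sums $S(a/q)$, up to the main term $\frac{N}{q}S(0)$.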
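The completing identity (12.11) and the inequality (12.13) are easy to test numerically. The sketch below is an illustration under stated assumptions, not part of the argument: $F$ is taken to be the Legendre symbol modulo a small prime (so that $F$ vanishes on the single non-admissible class), and all function names are hypothetical.

```python
import cmath


def e(t):
    """The additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def complete_sum(F, q, a):
    """S(a/q) = sum over x mod q of F(x) e(-ax/q), as in (12.10)."""
    return sum(F(x) * e(-a * x / q) for x in range(q))


def lam(a, q, M, N):
    """lambda(a/q) = sum over M < n <= M+N of e(an/q), as in (12.12)."""
    return sum(e(a * n / q) for n in range(M + 1, M + N + 1))


def incomplete_direct(F, q, M, N):
    """S(M;N) summed term by term, as in (12.9)."""
    return sum(F(n % q) for n in range(M + 1, M + N + 1))


def incomplete_completed(F, q, M, N):
    """S(M;N) recovered from the complete sums through (12.11)."""
    return sum(lam(a, q, M, N) * complete_sum(F, q, a) for a in range(q)) / q


def lemma_12_1_sides(F, q, M, N):
    """Return (left side, right side) of (12.13); q is assumed odd here."""
    lhs = abs(incomplete_direct(F, q, M, N) - N / q * complete_sum(F, q, 0))
    half = q // 2
    rhs = sum(abs(complete_sum(F, q, a)) / abs(a)
              for a in range(-half, half + 1) if a != 0)
    return lhs, rhs
```

For instance, with `F = lambda x: legendre(x, 13)` the two ways of computing $S(3; 7)$ agree up to rounding, and the left side of (12.13) stays below the right side.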

"


Clearly the above method of completing the sum of a periodic function $F(x)$ works for more general sums of type

\[ S = \sum_{n} F(n)G(n) \tag{12.17} \]

where $G(x)$ is a nice function of analytic character which decays to zero rapidly as $|x| \to \infty$. Now we present another method. Suppose $G(x)$ is of Schwartz class on $\mathbb{R}$. Then one can apply the Poisson summation formula, giving

\[ S = \frac{1}{q} \sum_{a \in \mathbb{Z}} S\Bigl(\frac{a}{q}\Bigr) \hat{G}\Bigl(\frac{a}{q}\Bigr) \tag{12.18} \]

where $\hat{G}(y)$ is the Fourier transform of $G(x)$. In principle both methods are equivalent; however, the latter may be preferable in case $G(x)$ propagates some vibrations. If the vibrations are regular, $\hat{G}(a/q)$ can be evaluated asymptotically by the stationary phase method (see for example Corollary 8.15), giving an asymptotic expansion for $S$ in terms of $S(a/q)$. On the other hand, if $G(x)$ is wild, one should have some reservation for completing $S$ in terms of additive characters. One should try to use "harmonics" which are more adequate for the particular case. The Fourier coefficients of cusp forms, holomorphic or non-holomorphic, may serve well as the relevant harmonics in many cases.

12.3. Complete character sums.

Let $\chi_q$ and $\psi_q$ be multiplicative and additive characters respectively to the modulus $q$. The complete character sum

\[ S(\chi_q, \psi_q) = \sideset{}{^*}\sum_{x \,(\mathrm{mod}\ q)} \chi_q(f(x))\psi_q(g(x)) \]

is multiplicative with respect to $q$. Indeed, let $q = rs$ with $(r, s) = 1$. Then $\chi_q$ factors uniquely into $\chi_r\chi_s$ where $\chi_r$, $\chi_s$ are multiplicative characters to moduli $r$, $s$ respectively. The additive character $\psi_q$ is given by

\[ \psi_q(x) = e\Bigl(\frac{ax}{q}\Bigr) \tag{12.19} \]

for some $a \in \mathbb{Z}$. By the "reciprocity" formula

\[ \frac{\overline{s}}{r} + \frac{\overline{r}}{s} \equiv \frac{1}{q} \pmod 1 \tag{12.20} \]

where $s\overline{s} \equiv 1$ (mod $r$) and $r\overline{r} \equiv 1$ (mod $s$), the additive character factors into $\psi_q = \psi_r\psi_s$. Hence it follows that

\[ S(\chi_q, \psi_q) = S(\chi_r, \psi_r)\, S(\chi_s, \psi_s). \tag{12.21} \]

Therefore the problem of evaluating a complete character sum of modulus $q$ reduces to that of prime power moduli. The complete sum $S(\chi, \psi)$ of modulus $q = p^{\beta}$ shouldn't be confused with the character sum over the finite field $\mathbb{F}_q$ which was considered in Chapter 11, except for $q = p$ in which case $S(\chi, \psi)$ is indeed one of such sums.

The case of prime modulus belongs to the theory of $L$-functions for curves over finite fields, as described in Chapter 11. The rationality of the relevant $L$-function together with the Riemann Hypothesis (both proved by A. Weil [Wel] in this case) yield algebraic numbers $g_1, \ldots, g_r$ with $|g_\nu| = p,\ p^{1/2},\ 1$ such that

\[ S(\chi, \psi) = g_1 + \cdots + g_r. \tag{12.22} \]

The number $r$ is bounded independently of the characteristic $p$. Moreover, assuming some non-singularity conditions for the rational functions $f$, $g$ with respect to the characters $\chi$, $\psi$, there are no roots $g_\nu$ with $|g_\nu| = p$, so (12.22) yields

\[ |S(\chi, \psi)| \leq r p^{1/2}. \tag{12.23} \]

See Chapter 11 and in particular Section 11.11 for a more complete discussion of this situation.

There is no need for algebraic geometry for the complete character sums $S(\chi, \psi)$ to modulus $q = p^{\beta}$ with $\beta \geq 2$, because in this case elementary arguments are available.

LEMMA 12.2. Let $q = p^{2\alpha}$ with $\alpha \geq 1$. Then we have

\[ S(\chi, \psi) = p^{\alpha} \sideset{}{^*}\sum_{\substack{y \,(\mathrm{mod}\ p^{\alpha}) \\ h(y) \equiv 0 \,(\mathrm{mod}\ p^{\alpha})}} \chi(f(y))\psi(g(y)) \tag{12.24} \]

where $h(y)$ is the rational function given by

\[ h(y) = a g'(y) + b \frac{f'}{f}(y) \tag{12.25} \]

with the integers $a$, $b$ depending on the characters $\psi$, $\chi$ which are determined by (12.19) and (12.27) below.

REMARK. Since the characters $\chi$, $\psi$ have modulus $p^{2\alpha}$, it is required for correctness of the summation (12.24) to know that $\chi(f(y))\psi(g(y))$ does not depend on the choice of the representative of $y$ (mod $p^{\alpha}$) on the curve $h(y) \equiv 0$ (mod $p^{\alpha}$). This property follows in the course of the proof.

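PROOF. Write $x = y + zp^{\alpha}$, where $y$ and $z$ run independently over any fixed systems of residue classes modulo $p^{\alpha}$, and $y$ is restricted by the condition $p \nmid f_0(y)g_0(y)$. We have

\[ f(x) \equiv f(y) + f'(y) z p^{\alpha} \pmod{p^{2\alpha}}. \tag{12.26} \]

Indeed, the congruence (12.26) is easily seen for monomials $x^n$ by the binomial formula

\[ (y + zp^{\alpha})^n = y^n + n y^{n-1} z p^{\alpha} + \cdots. \]

Then it extends to arbitrary polynomials $f_1(x)$, $f_0(x)$ with integral coefficients by linearity. Moreover, if $f_0(y) \not\equiv 0$ (mod $p$), then (12.26) is verified for $f_1(x)/f_0(x)$ as follows:

\[ f_1(x)/f_0(x) \equiv (f_1(y) + f_1'(y) z p^{\alpha})\,\overline{f_0}(y)\,(1 - \overline{f_0}(y) f_0'(y) z p^{\alpha}) \equiv f_1(y)\overline{f_0}(y) + \bigl(f_1'(y)\overline{f_0}(y) - \overline{f_0}^{\,2}(y) f_0'(y) f_1(y)\bigr) z p^{\alpha}. \]

By (12.26) we get

\[ \chi(f(x)) = \chi(f(y))\,\chi\Bigl(1 + \frac{f'}{f}(y) z p^{\alpha}\Bigr). \]

Clearly $\chi(1 + zp^{\alpha})$ is an additive character to modulus $p^{\alpha}$, so there exists an integer $b$ (uniquely determined modulo $p^{\alpha}$) such that

\[ \chi(1 + zp^{\alpha}) = e\Bigl(\frac{bz}{p^{\alpha}}\Bigr). \tag{12.27} \]

Hence

\[ \chi(f(x)) = \chi(f(y))\, e\Bigl(b \frac{f'}{f}(y) z p^{-\alpha}\Bigr). \tag{12.28} \]

Similarly we get (12.26) for the rational function $g(x)$, whence

\[ \psi(g(x)) = \psi(g(y))\, e(a g'(y) z p^{-\alpha}). \tag{12.29} \]

Multiplying (12.28), (12.29) and summing over the residue classes $y$, $z$ modulo $p^{\alpha}$ we get

\[ S(\chi, \psi) = \sideset{}{^*}\sum_{y \,(\mathrm{mod}\ p^{\alpha})} \chi(f(y))\psi(g(y)) \sum_{z \,(\mathrm{mod}\ p^{\alpha})} e(h(y) z p^{-\alpha}). \]

The inner sum vanishes unless $h(y) \equiv 0$ (mod $p^{\alpha}$), in which case it equals $p^{\alpha}$. This completes the proof of (12.24). □

LEMMA 12.3. Let $q = p^{2\alpha+1}$ with $\alpha \geq 1$. Then we have

\[ S(\chi, \psi) = p^{\alpha} \sideset{}{^*}\sum_{\substack{y \,(\mathrm{mod}\ p^{\alpha}) \\ h(y) \equiv 0 \,(\mathrm{mod}\ p^{\alpha})}} \chi(f(y))\psi(g(y))\, G_p(y) \tag{12.30} \]

where $G_p(y)$ is the Gauss sum

\[ G_p(y) = \sum_{z \,(\mathrm{mod}\ p)} e_p\bigl(d(y) z^2 + h(y) p^{-\alpha} z\bigr). \tag{12.31} \]

Here $h(y)$ is the rational function (12.25) but with $b$ given by (12.35) below, and

\[ d(y) = \frac{a}{2} g''(y) + \frac{b}{2} \frac{f''}{f}(y) + (p - 1)\frac{b}{2} \Bigl(\frac{f'}{f}(y)\Bigr)^{2}. \tag{12.32} \]

REMARK. As $z$ runs mod $p$, so does $z^2/2p$, therefore the Gauss sum (12.31) is correctly defined.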

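PROOF. We write $x = y + zp^{\alpha}$ with $y$ running modulo $p^{\alpha}$ subject to $p \nmid f_0(y)g_0(y)$, and $z$ running modulo $p^{\alpha+1}$. As before we argue that

\[ f(x) \equiv f(y) + f'(y) z p^{\alpha} + \tfrac{1}{2} f''(y) z^2 p^{2\alpha} \pmod{p^{2\alpha+1}}. \tag{12.33} \]

Note that the rational function $\frac{1}{2}f''(y)$ has integral coefficients. This is clear for the monomial $y^n$ because $n(n-1) \equiv 0$ (mod 2), then for any integral polynomial by linearity, and finally for a rational function $f(y) = f_1(y)/f_0(y)$ by the identity

\[ \tfrac{1}{2} f'' = \frac{\tfrac{1}{2} f_1'' f_0^2 - f_1' f_0' f_0 - \tfrac{1}{2} f_1 f_0'' f_0 + f_1 (f_0')^2}{f_0^3}. \]

By (12.33) we get

\[ \chi(f(x)) = \chi(f(y))\,\chi\Bigl(1 + \Bigl(\frac{f'}{f}(y) z + \frac{1}{2}\frac{f''}{f}(y) z^2 p^{\alpha}\Bigr) p^{\alpha}\Bigr). \tag{12.34} \]

Consider the function

\[ \xi(1 + zp^{\alpha}) = e\Bigl(\frac{z}{p^{\alpha+1}} + (p - 1)\frac{z^2}{2p}\Bigr). \]

This is a character of the subgroup of residue classes $x$ (mod $p^{2\alpha+1}$) with $x \equiv 1$ (mod $p^{\alpha}$). Indeed

\[ \xi\bigl((1 + zp^{\alpha})(1 + wp^{\alpha})\bigr) = \xi\bigl(1 + (z + w + zwp^{\alpha})p^{\alpha}\bigr) = e\Bigl(\frac{z + w}{p^{\alpha+1}} + (p - 1)\frac{z^2 + w^2}{2p}\Bigr) = \xi(1 + zp^{\alpha})\,\xi(1 + wp^{\alpha}). \]

Since the subgroup has order $p^{\alpha+1}$ and $\xi^b$ are all different for distinct $b$ modulo $p^{\alpha+1}$, it follows that there exists an integer $b$ (uniquely determined modulo $p^{\alpha+1}$) such that

\[ \chi(1 + zp^{\alpha}) = e\Bigl(\frac{bz}{p^{\alpha+1}} + (p - 1)\frac{bz^2}{2p}\Bigr). \tag{12.35} \]

Using (12.34), (12.35) and

\[ \psi(g(x)) = \psi(g(y))\, e\Bigl(\frac{a g'(y)}{p^{\alpha+1}} z + \frac{a g''(y)}{2p} z^2\Bigr) \tag{12.36} \]

we derive that

\[ S(\chi, \psi) = \sideset{}{^*}\sum_{y \,(\mathrm{mod}\ p^{\alpha})} \chi(f(y))\psi(g(y)) \sum_{z \,(\mathrm{mod}\ p^{\alpha+1})} e\Bigl(\frac{h(y)}{p^{\alpha+1}} z + \frac{d(y)}{p} z^2\Bigr). \]

Here the innermost sum vanishes unless $h(y) \equiv 0$ (mod $p^{\alpha}$), in which case it equals $p^{\alpha}G_p(y)$. This completes the proof of (12.30). □

The Gauss sums $G_p(y)$ were computed in Chapter 3. If $p \nmid 2d(y)$, we have

\[ G_p(y) = e_p\bigl(-\overline{4d(y)}\,(h(y)p^{-\alpha})^2\bigr) \Bigl(\frac{d(y)}{p}\Bigr) \varepsilon_p\, p^{1/2}. \tag{12.37} \]

The formulas (12.24) and (12.30) represent truly the final stage of computation. The only terms which are not determined on the right side are the roots of the congruence $h(y) \equiv 0$ (mod $p^{\alpha}$). However, in practice there are not many roots (essentially a bounded number) so one gets the estimate $|S(\chi, \psi)| \leq cq^{1/2}$ with $c$ depending mildly on the coefficients of the rational functions $f$, $g$.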
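As a sanity check of Lemma 12.2, one can compare both sides of (12.24) in the simplest case: $\chi$ trivial (so $b = 0$), $\psi(x) = e(x/q)$ (so $a = 1$) and $g(x) = mx + n\overline{x}$, which is the Kloosterman sum to modulus $q = p^{2\alpha}$. Then $h(y) = g'(y) = m - n\overline{y}^{2}$. The sketch below is only an illustration under these assumptions; the function names are hypothetical.

```python
import cmath
from math import gcd


def e(t):
    """The additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def kloosterman(m, n, q):
    """Direct evaluation: sum over x mod q, (x, q) = 1, of e((mx + n*xbar)/q)."""
    total = 0
    for x in range(q):
        if gcd(x, q) == 1:
            xbar = pow(x, -1, q)  # inverse modulo q (Python 3.8+)
            total += e(((m * x + n * xbar) % q) / q)
    return total


def kloosterman_via_lemma_12_2(m, n, p, alpha):
    """Right side of (12.24) for q = p^(2*alpha), chi trivial, psi(x) = e(x/q),
    g(x) = m*x + n/x, so that h(y) = m - n*ybar^2."""
    q = p ** (2 * alpha)
    p_alpha = p ** alpha
    total = 0
    for y in range(p_alpha):
        if y % p == 0:
            continue  # y is not admissible
        ybar = pow(y, -1, q)
        if (m - n * ybar * ybar) % p_alpha == 0:  # h(y) = 0 (mod p^alpha)
            total += e(((m * y + n * ybar) % q) / q)
    return p_alpha * total
```

For example, with $m = n = 1$ and $q = 9$ both functions return $6\cos(4\pi/9)$ up to rounding, coming from the two roots $y \equiv \pm 1$ (mod 3) of $h$.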

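EXERCISE 1. Using Lemma 12.2 and Lemma 12.3 together with (12.37) evaluate the Kloosterman sum

\[ S(m, n; q) = \sideset{}{^*}\sum_{x \,(\mathrm{mod}\ q)} e\Bigl(\frac{mx + n\overline{x}}{q}\Bigr) \tag{12.38} \]

for $q = p^{\beta}$ with $\beta \geq 2$. Suppose $p \nmid 2mn$. Show that $S(m, n; q)$ vanishes, unless $\bigl(\frac{m}{p}\bigr) = \bigl(\frac{n}{p}\bigr)$, in which case

\[ S(m, n; q) = 2\Bigl(\frac{\ell}{q}\Bigr) q^{1/2}\, \mathrm{Re}\; \varepsilon_q\, e\Bigl(\frac{2\ell}{q}\Bigr) \tag{12.39} \]

where $\ell^2 \equiv mn$ (mod $q$).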

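The Kloosterman sums to modulus $q = p^{\beta}$ with $\beta \geq 2$ were computed first by H. Salié [Sal]. He also computed the so-called Salié sums

\[ T(m, n; q) = \sum_{x \,(\mathrm{mod}\ q)} \Bigl(\frac{x}{q}\Bigr) e\Bigl(\frac{mx + n\overline{x}}{q}\Bigr) \tag{12.40} \]

where $\bigl(\frac{x}{q}\bigr)$ is the Jacobi-Legendre symbol, including the case of prime modulus. One can do it for composite moduli as well.

LEMMA 12.4. Suppose $(q, 2n) = 1$. Then

\[ T(m, n; q) = \varepsilon_q\, q^{1/2} \Bigl(\frac{n}{q}\Bigr) \sum_{v^2 \equiv mn \,(\mathrm{mod}\ q)} e\Bigl(\frac{2v}{q}\Bigr). \tag{12.41} \]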
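The evaluation (12.41) can be compared with a direct summation of (12.40). The following sketch is illustrative only; it assumes the usual normalization $\varepsilon_q = 1$ for $q \equiv 1$ (mod 4) and $\varepsilon_q = i$ for $q \equiv 3$ (mod 4), and the function names are hypothetical.

```python
import cmath
from math import gcd, sqrt


def e(t):
    """The additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, by the standard reciprocity algorithm."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def salie_direct(m, n, q):
    """T(m, n; q) summed term by term, as in (12.40)."""
    total = 0
    for x in range(q):
        if gcd(x, q) == 1:
            xbar = pow(x, -1, q)  # inverse modulo q (Python 3.8+)
            total += jacobi(x, q) * e(((m * x + n * xbar) % q) / q)
    return total


def salie_formula(m, n, q):
    """Right side of (12.41); requires q odd and (q, n) = 1."""
    eps = 1 if q % 4 == 1 else 1j
    roots = [v for v in range(q) if (v * v - m * n) % q == 0]
    return eps * sqrt(q) * jacobi(n, q) * sum(e(2 * v / q) for v in roots)
```

When $mn$ is not a square modulo $q$ the list of roots is empty and both functions return zero (up to rounding in the direct sum).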

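PROOF. Consider the function $F(u) = T(m, nu^2; q)$ defined for $u$ (mod $q$). The Fourier transform of $F(u)$ is

\[ \hat{F}(v) = \sum_{u \,(\mathrm{mod}\ q)} F(u)\, e\Bigl(-\frac{uv}{q}\Bigr) = \varepsilon_q\, q^{1/2} \Bigl(\frac{n}{q}\Bigr) \sideset{}{^*}\sum_{x \,(\mathrm{mod}\ q)} e\Bigl(\frac{mx - \overline{4n}\,x v^2}{q}\Bigr) \]

by the formula (3.21) for quadratic Gauss sums. Notice that the Jacobi-Legendre symbol canceled out, therefore the last sum is the Ramanujan sum

\[ \sum_{d \mid (4mn - v^2,\, q)} d\,\mu(q/d). \]

Hence by Fourier inversion

\[ F(u) = \frac{1}{q} \sum_{v \,(\mathrm{mod}\ q)} \hat{F}(v)\, e\Bigl(\frac{uv}{q}\Bigr) = \varepsilon_q\, q^{-1/2} \Bigl(\frac{n}{q}\Bigr) \sum_{d \mid q} d\,\mu\Bigl(\frac{q}{d}\Bigr) \sum_{\substack{v \,(\mathrm{mod}\ q) \\ v^2 \equiv 4mn \,(\mathrm{mod}\ d)}} e\Bigl(\frac{uv}{q}\Bigr). \]

If $(u, q) = 1$, this simplifies to

\[ F(u) = \varepsilon_q\, q^{1/2} \Bigl(\frac{n}{q}\Bigr) \sum_{v^2 \equiv mn \,(\mathrm{mod}\ q)} e\Bigl(\frac{2uv}{q}\Bigr). \]

In particular, for $u = 1$ we get (12.41). □

Suppose $(q, 2mn) = 1$. Then $T(m, n; q)$ vanishes unless there exists $\ell$ with

\[ \ell^2 \equiv mn \pmod q. \tag{12.42} \]

Given $\ell$, all the solutions to $v^2 \equiv mn$ (mod $q$) can be written explicitly as $v \equiv (r\overline{r} - s\overline{s})\ell$, where $r$, $s$ run over the factorizations $rs = q$ with $(r, s) = 1$. Hence the formula (12.41) can be written more explicitly as follows:

\[ T(m, n; q) = \varepsilon_q\, q^{1/2} \Bigl(\frac{n}{q}\Bigr) \sum_{\substack{rs = q \\ (r, s) = 1}} e\Bigl(\frac{2(r\overline{r} - s\overline{s})\ell}{q}\Bigr). \tag{12.43} \]

The Kloosterman sums to prime modulus cannot be computed in elementary terms. This shouldn't be surprising in view of the results (and conjecture) concerning the distribution of angles of Kloosterman sums (see Section 21.2, in particular Theorem 21.7 and the Sato-Tate Conjecture).

12.4. Short character sums.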

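In this section we are dealing with character sums over a short interval

\[ S_\chi(N) = \sum_{M < n \leq M+N} \chi(n), \tag{12.44} \]

where $\chi$ is a non-principal Dirichlet character of modulus $q$. By Lemma 12.1 we obtain

\[ |S_\chi(N)| \leq 2 \sum_{0 < a \leq q/2} a^{-1} |g_\chi(a)| \tag{12.45} \]

where

\[ g_\chi(a) = \sum_{x \,(\mathrm{mod}\ q)} \chi(x)\, e\Bigl(\frac{ax}{q}\Bigr) \tag{12.46} \]

are the classical Gauss sums. Let $\chi^*$ (mod $q^*$) be the primitive character which induces $\chi$. We have

\[ g_\chi(a) = g_{\chi^*}(1) \sum_{d \mid (a,\, q/q^*)} d\, \overline{\chi}^*(a/d)\, \mu(q/dq^*) \tag{12.47} \]

by Lemma 3.2. Hence

\[ |g_\chi(a)| \leq \sigma((a, q/q^*)) \sqrt{q^*}. \tag{12.48} \]

Inserting this bound into (12.45) we derive

\[ |S_\chi(N)| \leq 3\tau(q/q^*) \sqrt{q^*} \log q. \tag{12.49} \]

Since $\tau(m) \leq 2\sqrt{m}$, this yields

THEOREM 12.5. For any non-principal character $\chi$ (mod $q$) we have

\[ \Bigl| \sum_{M < n \leq M+N} \chi(n) \Bigr| \leq 6\sqrt{q} \log q. \tag{12.50} \]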
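Theorem 12.5 is easy to probe by brute force for small moduli. The sketch below is an illustration only: it takes $\chi$ to be the Legendre symbol modulo a prime $p$ (an assumption made for the example) and uses the fact that, $\chi$ being periodic with zero sum over a period, every interval sum is a difference of two partial sums $\sum_{n \leq k} \chi(n)$ with $0 \leq k < p$. The function names are hypothetical.

```python
from math import log, sqrt


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def max_interval_sum(p):
    """Maximum over all M, N of |sum_{M < n <= M+N} chi(n)| for chi = (./p).

    The partial sums P(k) = sum_{n <= k} chi(n) are periodic in k, so the
    maximum equals max P(k) - min P(k) over 0 <= k < p."""
    partial = 0
    highest = lowest = 0  # P(0) = 0
    for n in range(1, p):
        partial += legendre(n, p)
        highest = max(highest, partial)
        lowest = min(lowest, partial)
    return highest - lowest


def polya_vinogradov_bound(q):
    """The right side of (12.50)."""
    return 6 * sqrt(q) * log(q)
```

For moduli of this size the bound (12.50) is far from sharp (it exceeds the trivial bound $N \leq q$ for small $q$), so the check mainly confirms consistency rather than strength.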
This inequality was proved in 1918 independently by G. Pólya [Pol] and I. M. Vinogradov [V4] except for the constant 6. By the Grand Riemann Hypothesis for Dirichlet $L$-functions one can derive a slightly stronger estimate

\[ S_\chi(N) \ll \sqrt{q} \log\log 3q, \]

however, nothing better than the Pólya-Vinogradov inequality is known in general except for the constant factor. For recent improvements of this constant see A. Hildebrand [Hi2].

Next we give another derivation of the Pólya-Vinogradov inequality without completing $S_\chi(N)$. We have

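\[ S_\chi(N) = \sum_{n} \chi(n) f(n) \]

where $f(x) = \min(x - M,\, 1,\, M + N + 1 - x)$ if $M \leq x \leq M + N + 1$, and $f(x) = 0$ otherwise. This redundant cut-off function will help to separate the variables in the forthcoming expressions. For any integer $a$ we have

\[ S_\chi(N) = \sum_{n} \chi(n + a) f(n + a). \]

Hence

\[ S_\chi(N) = \frac{1}{A} \sum_{0 < a \leq A} \sum_{n} \chi(n + a) f(n + a). \]

Now separate the variables in $f(n + a)$ by the Fourier transform

\[ f(n + a) = \int_{-\infty}^{\infty} \hat{f}(t)\, e((n + a)t)\, dt \]

getting

\[ |S_\chi(N)| \leq \int_{-\infty}^{\infty} |\hat{f}(t)|\, B(t)\, dt \]

where $B(t)$ is the sum in two independent variables

\[ B(t) = \frac{1}{A} \sum_{0 < a \leq A} \Bigl| \sum_{n} \chi(n + a)\, e(nt) \Bigr|. \]

By Cauchy's inequality

\[ B(t)^2 \leq \Bigl(\frac{1}{A} + \frac{1}{q}\Bigr) \sum_{x \,(\mathrm{mod}\ q)} \Bigl| \sum_{n} \chi(n + x)\, e(nt) \Bigr|^2 \leq \Bigl(\frac{1}{A} + \frac{1}{q}\Bigr) \sum_{n_1} \sum_{n_2} \Bigl| \sum_{x \,(\mathrm{mod}\ q)} \chi(n_1 + x)\overline{\chi}(n_2 + x) \Bigr|. \]

Suppose $\chi$ (mod $q$) is primitive. Then

\[ \sum_{x \,(\mathrm{mod}\ q)} \chi(n_1 + x)\overline{\chi}(n_2 + x) = S(0, n_1 - n_2; q) \tag{12.51} \]

is the Ramanujan sum. Hence we derive that

\[ B(t)^2 \leq \Bigl(\frac{1}{A} + \frac{1}{q}\Bigr) \Bigl(\frac{A + N}{q} + 1\Bigr) (A + N) R(q) \]

where

\[ R(q) = \sum_{y \,(\mathrm{mod}\ q)} |S(0, y; q)|. \tag{12.52} \]

Assuming that $N \leq \frac{1}{2}q$ (as we can by virtue of periodicity), and choosing $A = N$, the above bound simplifies to $6R(q)$. Hence

\[ |S_\chi(N)| \leq (6R(q))^{1/2} \int_{-\infty}^{\infty} |\hat{f}(t)|\, dt. \]

By $|\hat{f}(t)| = (\pi t)^{-2} |\sin(\pi t) \sin(\pi t N)|$ we infer that the integral is bounded by $3 \log 2N$, and conclude that $|S_\chi(N)| \leq 8R(q)^{1/2} \log q$. Note that $R(q) = 2^{\omega(q)}\varphi(q)$, so that

\[ |S_\chi(N)| \leq 8 \bigl(2^{\omega(q)} \varphi(q)\bigr)^{1/2} \log q. \tag{12.53} \]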

For $q$ prime this estimate gives the Pólya-Vinogradov inequality while for composite $q$ it is only slightly weaker.

REMARK. The factor $\log q$ emerged in the process of separation of variables. Since this factor cannot be entirely ignored, we make a point that in general the losses from separation of variables are not exclusively of technical nature, but occur inevitably for arithmetic reasons as well. This comment sounds even stronger in the context of bilinear forms, where factorization by superficial separation of variables at no cost would be fatal (think of Corollary 7.3).

The Pólya-Vinogradov inequality (12.50) is trivial for character sums $S_\chi(N)$ of length $N \ll q^{1/2}$, while the true bound is expected to be

\[ S_\chi(N) \ll N^{1/2} q^{\varepsilon} \tag{12.54} \]

which is non-trivial if $N \gg q^{3\varepsilon}$. In a series of papers D. Burgess [Bur1], [Bur2] established several results for relatively short character sums, one of them being

THEOREM 12.6 (D. BURGESS). Let $\chi$ be a primitive character of conductor $q > 1$. Then

\[ S_\chi(N) \ll N^{1 - \frac{1}{r}}\, q^{\frac{r+1}{4r^2} + \varepsilon} \tag{12.55} \]

for $r = 2, 3$, and for any $r \geq 1$ if $q$ is cubefree, the implied constant depending only on $\varepsilon$ and $r$.

The hypothesis that q is cubefree can be removed at the cost of weakening the result. Let q = k£, where £ is the maximal cubefree factor. By factoring the character x(mod q) accordingly and splitting the summation into classes modulo k one derives from Theorem 12.6, (12.56) (if£= 1, then k = q and (12.56) follows by the periodicity of x). Burgess's bound (12.55) for q cubefree is non-trivial if N taking r sufficiently large, one derives (12.57) for any 0

< 8 :(

~, the implied constant depending only on 8.

» q:l:+f.:+o,

and by

r


We shall give a proof of a slightly stronger estimate

\[ |S_\chi(N)| \leq c N^{1 - \frac{1}{r}}\, p^{\frac{r+1}{4r^2}} (\log p)^{\frac{1}{r}} \tag{12.58} \]

but only for characters $\chi$ (mod $p$) to prime modulus. Here $c$ is an absolute constant ($c = 30$ is fine). Our arguments go by induction with respect to $N$. First notice that the assertion (12.58) is either trivial or it follows from (12.50), unless

\[ c^r p^{\frac{1}{4} + \frac{1}{4r}} \log p < N < p^{\frac{1}{2} + \frac{1}{4r}} \log p \tag{12.59} \]

which condition we henceforth assume. Applying a shift $n \mapsto n + h$ with $1 \leq h \leq H < N$ we obtain

\[ S_\chi(N) = \sum_{M < n \leq M+N} \chi(n + h) + 2\theta E(H) \tag{12.60} \]

where $|\theta| \leq 1$ and

\[ E(H) = cH^{1 - \frac{1}{r}}\, p^{\frac{r+1}{4r^2}} (\log p)^{\frac{1}{r}} \tag{12.61} \]

by the induction hypothesis (12.58) for the two character sums of length $h$ which do not overlap with the original segment. Let $H = AB$, where $A$, $B$ are positive integers. We use the shifts of type

\[ h = ab \quad \text{with } 1 \leq a \leq A,\ 1 \leq b \leq B. \tag{12.62} \]

Averaging (12.60) over $a$, $b$ we get

\[ S_\chi(N) = \frac{1}{H} \sum_{1 \leq a \leq A} \sum_{1 \leq b \leq B} \sum_{M < n \leq M+N} \chi(n + ab) + 2\theta E(H) \]

where $|\theta| \leq 1$. By $\chi(n + ab) = \chi(a)\chi(\overline{a}n + b)$, where $\overline{a}$ is the multiplicative inverse of $a$ modulo $p$ (here we employ two properties of characters, the periodicity and the multiplicativity) we deduce that

\[ |S_\chi(N)| \leq H^{-1} V + 2E(H) \tag{12.63} \]

where

\[ V = \sum_{x \,(\mathrm{mod}\ p)} \nu(x) \Bigl| \sum_{1 \leq b \leq B} \chi(x + b) \Bigr| \]

and $\nu(x)$ is the number of representations of $x$ as $\overline{a}n$ (mod $p$) with $1 \leq a \leq A$ and $M < n \leq M + N$. We shall estimate $V$ without using the induction hypothesis, so the implied constant will be independent of $c$. As $a$ and $n$ vary over short segments relative to $p$, many residue classes $x$ (mod $p$) are not represented by $\overline{a}n$; in other words, $\nu(x)$ is often zero. Moreover, $\nu(x)$ is essentially bounded, but it is impossible to analyze its random changes of size. Therefore we relax $\nu(x)$ in $V$ by applying the Hölder inequality

\[ V \leq V_1^{1 - \frac{1}{r}}\, V_2^{\frac{1}{2r}}\, W^{\frac{1}{2r}}, \]

where

\[ V_1 = \sum_{x \,(\mathrm{mod}\ p)} \nu(x), \qquad V_2 = \sum_{x \,(\mathrm{mod}\ p)} \nu^2(x), \qquad W = \sum_{x \,(\mathrm{mod}\ p)} \Bigl| \sum_{1 \leq b \leq B} \chi(x + b) \Bigr|^{2r}. \]

We could have restricted the outer summation in $W$ to the residue classes $x$ (mod $p$) for which $\nu(x) \neq 0$, but we are not able to take advantage of such a condition. The extension to the complete sum is massive, nevertheless it is nominal relative to the length of the character sum raised to power $2r$. This explains our choice of Hölder's exponents.

Clearly, we have $V_1 = AN$. We shall show that $V_2$ is also of this magnitude (essentially).

LEMMA 12.7. We have

\[ V_2 \leq 8AN(ANp^{-1} + \log 3A). \tag{12.64} \]

PROOF. $V_2$ is the number of quadruples $(a_1, a_2, n_1, n_2)$ with $1 \leq a_1, a_2 \leq A$ and $M < n_1, n_2 \leq M + N$ such that $a_1n_2 \equiv a_2n_1$ (mod $p$). Fix $a_1$, $a_2$ and put $a_1n_2 - a_2n_1 = kp$. We have

\[ \Bigl| k - (a_1 - a_2)\frac{M}{p} \Bigr| \leq \frac{AN}{p} \]

and $(a_1, a_2) \mid k$. Given $a_1$, $a_2$ and $k$ as above we find that the number of pairs $(n_1, n_2)$ satisfying the equation $a_1n_2 - a_2n_1 = kp$ is bounded by $2N(a_1, a_2)/\max(a_1, a_2)$. Therefore

\[ V_2 \leq 2N \sum_{1 \leq a_1, a_2 \leq A} \Bigl( \frac{2AN}{p\,(a_1, a_2)} + 1 \Bigr) \frac{(a_1, a_2)}{\max(a_1, a_2)}. \]

Hence (12.64) follows after simple estimates. □
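The quantities $\nu(x)$, $V_1$ and $V_2$ are simple enough to tabulate for small parameters, which gives a direct check of $V_1 = AN$ and of (12.64). The sketch below is illustrative only; the parameter values are arbitrary and the function names are hypothetical. It assumes $A < p$ so that every $a$ is invertible modulo $p$.

```python
from math import log


def nu_table(p, A, M, N):
    """nu[x] = number of pairs (a, n) with 1 <= a <= A, M < n <= M + N
    and x = abar * n (mod p), where abar is the inverse of a modulo p."""
    nu = [0] * p
    for a in range(1, A + 1):
        abar = pow(a, -1, p)  # requires A < p; Python 3.8+
        for n in range(M + 1, M + N + 1):
            nu[(abar * n) % p] += 1
    return nu


def v1_v2(p, A, M, N):
    """Return (V_1, V_2) = (sum of nu(x), sum of nu(x)^2)."""
    nu = nu_table(p, A, M, N)
    return sum(nu), sum(v * v for v in nu)


def lemma_12_7_bound(p, A, N):
    """The right side of (12.64)."""
    return 8 * A * N * (A * N / p + log(3 * A))
```

In small experiments of this kind $V_2$ stays within a modest factor of $AN$, in line with the remark that $V_2$ is essentially of the same magnitude as $V_1$.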

LEMMA 12.8. We have

\[ W \leq (2rB)^r p + 2rB^{2r} p^{1/2}. \tag{12.65} \]

This estimate for $W$ lies at the heart of Burgess' method. Assuming (12.65) we complete the proof of (12.58) as follows. We choose $B = [rp^{\frac{1}{2r}}]$ giving

\[ W \leq (2r)^{2r} p^{3/2}. \]

Then we choose $A = [N/9rp^{\frac{1}{2r}}]$. Note that $A \geq 1$ by the left side of (12.59), and $AN \leq N^2/9rp^{\frac{1}{2r}} \leq p(\log p)^2$ by the right side of (12.59). Therefore Lemma 12.7 gives $V_2 \leq AN(4\log p)^2$. Recalling that $V_1 = AN$ we derive

\[ V \leq 2r(AN)^{1 - \frac{1}{2r}} \bigl(4p^{3/4} \log p\bigr)^{\frac{1}{r}} \leq N^{2 - \frac{1}{r}} \bigl(p^{\frac{r+1}{4r}} \log p\bigr)^{\frac{1}{r}}. \]
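The moment $W$ can also be computed directly for small primes and compared with the bound of Lemma 12.8 in the form $W \leq (2rB)^r p + 2rB^{2r}p^{1/2}$, which is what the counting argument in its proof yields (at most $(2rB)^r$ exceptional tuples contributing at most $p$ each, and at most $2rp^{1/2}$ from each of the remaining ones). The sketch below is an illustration only: $\chi$ is assumed to be the Legendre symbol and the function names are hypothetical.

```python
from math import sqrt


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def moment_W(p, B, r):
    """W = sum over x mod p of |sum_{1 <= b <= B} chi(x + b)|^(2r)
    for chi the Legendre symbol modulo p."""
    chi = [legendre(x, p) for x in range(p)]
    total = 0
    for x in range(p):
        short_sum = sum(chi[(x + b) % p] for b in range(1, B + 1))
        total += abs(short_sum) ** (2 * r)
    return total


def lemma_12_8_bound(p, B, r):
    """(2rB)^r p + 2r B^(2r) p^(1/2)."""
    return (2 * r * B) ** r * p + 2 * r * B ** (2 * r) * sqrt(p)


def burgess_B(p, r):
    """The choice B = [r p^(1/(2r))] made in the text."""
    return int(r * p ** (1 / (2 * r)))
```

With `B = burgess_B(p, r)` the two terms of the bound are of comparable size, which is the reason for this choice; for such $B$ one can also compare `moment_W(p, B, r)` with $(2r)^{2r}p^{3/2}$.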

Next note that $H = AB \leq N/9$ and $H \geq N/10$. Therefore we derive by (12.61) and (12.63) that

\[ |S_\chi(N)| \leq \bigl(10 + \tfrac{2}{3}c\bigr) N^{1 - \frac{1}{r}}\, p^{\frac{r+1}{4r^2}} (\log p)^{\frac{1}{r}}. \]

Taking $c = 30$ we obtain (12.58).

It remains to prove Lemma 12.8, for which we can assume $B < p$. As in Burgess' original proof we appeal to the Riemann Hypothesis for curves over finite fields. By Corollary 11.24, for any $\chi$ modulo $p$ which is non-trivial, we have

\[ \Bigl| \sum_{x \,(\mathrm{mod}\ p)} \chi\bigl((x + b_1) \cdots (x + b_r)\bigr)\, \overline{\chi}\bigl((x + b_{r+1}) \cdots (x + b_{2r})\bigr) \Bigr| \leq 2rp^{1/2} \tag{12.66} \]

if one of the classes $b_\nu$ (mod $p$), $\nu = 1, \ldots, 2r$, is different from any other one. We have

\[ W = \sum_{1 \leq b_1 \leq B} \cdots \sum_{1 \leq b_{2r} \leq B}\ \sum_{x \,(\mathrm{mod}\ p)} \chi\bigl((x + b_1) \cdots (x + b_r)\bigr)\, \overline{\chi}\bigl((x + b_{r+1}) \cdots (x + b_{2r})\bigr). \]

The complete sum satisfies (12.66) except possibly for the $(b_1, \ldots, b_{2r})$ which can be arranged in $r$ equal pairs. The number of such exceptions is bounded by $(2rB)^r$. Hence (12.65) follows. □

REMARK. Choosing the involved parameters in the above proof more economically one can get (12.58) with the factor $c(\log p)^{\frac{1}{r}}$ replaced by $c'(\log p)^{\frac{1}{2r}}$ for some absolute constant $c' \geq 1$ (for other comments see [Fri]).

EXERCISE 2. Prove along the above lines the estimate (12.55) with $r = 2$ for any modulus $q > 1$.

The Burgess bound (12.55) with $r = 2$ implies a subconvexity bound for Dirichlet $L$-functions in conductor aspect (compare Section 5.9).

THEOREM 12.9. Let $\chi$ be a primitive Dirichlet character modulo $q \geq 2$, and $s = \frac{1}{2} + it$. We have

\[ L(s, \chi) \ll |s|\, q^{\frac{3}{16} + \varepsilon} \tag{12.67} \]

for any $\varepsilon > 0$, the implied constant depending only on $\varepsilon$.
PROOF. By Theorem 5.3 we have

where Vs(y) is a rapidly decaying function given by (5.13). Precisely it satisfies

Vs(y)

<%::

y ( 1 +G)

)-1

and

V'( s

)«~(1+.1!._)y is\ . 1

Y

Hence

' lsi ( x )-1 (x-•v.(xj ..,Iii) ) « x3/2 1 + M


By (12.55) with r = 2 we have ! 3 S(x) = " L ' x(n) ~ min(q,x2qlii+•). n(x

Then by partial summation we get

and the result follows by applying the relevant estimates.

□

12.5. Very short character sums to truly composite modulus.

It is natural that Burgess's method fails to produce a non-trivial estimate for the character sum $S_\chi(N)$ of length $N \ll q^{1/4}$, where $q$ is the conductor of $\chi$. On the other hand, we do have effective treatments of exponential sums of Weyl's type (see Chapter 8) which are very short relative to the size of the amplitude function. Here we apply the differencing process to reduce the amplitude function several times until it is small enough (however unbounded) so that a non-trivial estimate for the resulting sum in the final step is available from special sources. In this respect the case of the character sum $S_\chi(N)$ is quite different because one normally cannot reduce the conductor by shifting the argument; absolutely not, if the conductor is a prime number. There is, however, a possibility to imitate the Weyl-van der Corput differencing ideas for character sums of moduli which factor into relatively small numbers; we say that such a modulus is truly composite. An important estimate for character sums of special composite moduli was given by S. W. Graham and J. Ringrose [GRi] (see also [H-B] and [FI3]). In this section we present our version of the magnificent Graham-Ringrose result. We shall proceed by induction with respect to the number of factors (not necessarily primes) in the modulus of the character. For this reason, in order to meet the induction hypothesis, we treat more general sums than necessary for the final result.

Let $\xi \pmod k$ be a multiplicative character. We consider

$$S_\xi(N) = \sum_{M < n \le M+N} \xi(f(n)) \tag{12.68}$$

where $f(x)$ is a rational function with all zeros and poles being integers. We write

$$f(x) = \prod_{\nu=1}^{m} (x - a_\nu)^{d_\nu} \tag{12.69}$$

where $d_\nu$ are non-zero integers and $a_\nu$ are any integers. We do not assume that $a_1, \ldots, a_m$ are distinct, so the representation (12.69) does not define the exponents $d_1, \ldots, d_m$ uniquely. Recall that for a given representation $f = f_1/f_0$ with $f_1, f_0 \in \mathbb{Z}[x]$ we defined $\xi(f(x)) = \xi(f_1(x))\,\bar\xi(f_0(x))$, therefore the sum (12.68) is actually restricted by

$$((n - a_1) \cdots (n - a_m),\, k) = 1. \tag{12.70}$$


THEOREM 12.10. Let $\xi = \chi_1 \cdots \chi_{r-1}\chi$, where $\chi_\ell \pmod{q_\ell}$ for $1 \le \ell < r$ are any characters and $\chi \pmod q$ is a primitive character of conductor $q > 1$, $q$ squarefree. Let $f$ be given by (12.69) with

(12.71)

where $h$ is the order of $\chi$. Put $N_0 = \max\{q_1, \ldots, q_{r-1}, q^{1/4}\}\, q^{5/4}$, and

$$\Delta = \prod\prod_{\nu_1 \ne \nu_2} (a_{\nu_1} - a_{\nu_2}). \tag{12.72}$$

Then for any $N \ge N_0$ we have

(12.73)

The power of proof stems from estimation of the complete character sum

$$S(\chi) = \sum_{x \,(\mathrm{mod}\ q)} \chi(f(x)). \tag{12.74}$$

If $q = p$ is prime we are dealing with a sum over the field $\mathbb{F}_p$. In this case we derive from [Sch] the following

PROPOSITION 12.11. Let $\chi \pmod p$ be a non-principal character of order $h$ which satisfies (12.71). Then

$$|S(\chi)| \le (m-1)\,(\Delta, p)^{1/2}\, p^{1/2}. \tag{12.75}$$

PROOF. First note that $p \ne 2$ because there is only the trivial character of modulus two. For $m = 1$ we have $f(x) = (x - a_1)^{d_1}$ with $(d_1, h) = 1$ and

$$S(\chi) = \sum_{x \,(\mathrm{mod}\ p)} \chi^{d_1}(x) = 0,$$

so (12.75) holds (in this case $\Delta = 1$). Let $m \ge 2$. If $\Delta \equiv 0 \pmod p$, then (12.75) is trivial. Now suppose all the numbers $a_1, \ldots, a_m$ are distinct modulo $p$. If $f$ is a polynomial (i.e., all the exponents $d_1, \ldots, d_m$ are positive), then the polynomial $F(X, Y) = Y^h - f(X)$ is absolutely irreducible (see Lemma 2C of [Sch]) and our assertion follows from Theorem 11.23. If $f(x) = f_1(x)/f_0(x)$ with $f_1(x), f_0(x) \in \mathbb{Z}[x]$ having all distinct zeros, then we consider the polynomial $g(x) = f_1(x) f_0(x)^{p-2}$ in place of the rational function $f(x)$. Note that $\chi(f(x)) = \chi(g(x))$ and $(p-2, h) = 1$, because $h \mid (p-1)$. Therefore the condition (12.71) holds for $g(x)$, so (12.75) is established. □

Proposition 12.11 extends by multiplicativity as follows.

COROLLARY 12.12. Let $\chi \pmod q$ be a primitive character of squarefree conductor $q$ and order $h$ which satisfies (12.71). Then

(12.76)

where $\omega(q)$ is the number of prime divisors of $q$.


PROOF. If $q = p_1 \cdots p_t$, then $\chi = \chi_1 \cdots \chi_t$, where $\chi_1, \ldots, \chi_t$ are primitive characters of order $h_1, \ldots, h_t$ respectively which are divisors of $h$. Therefore the condition (12.71) is satisfied for each character $\chi_s \pmod{p_s}$. Multiplying (12.75) for $\chi_1, \ldots, \chi_t$ we get (12.76). □
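The bound (12.75) can be checked numerically in the simplest case. The sketch below is only an illustration under stated assumptions: it takes $\chi$ to be the Legendre symbol modulo an odd prime $p$ (so $h = 2$), takes $f$ to be a product of distinct linear factors with all exponents $d_\nu = 1$, and compares the complete sum with $(m-1)\sqrt{p}$. The function names and the sample values are hypothetical, chosen for the example.

```python
from math import sqrt


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def complete_sum(roots, p):
    """S(chi) = sum over x mod p of chi(f(x)) with f(x) = prod (x - a), a in roots.

    Assumption for this sketch: chi is the quadratic character mod p and
    every exponent d_nu equals 1.
    """
    total = 0
    for x in range(p):
        value = 1
        for a in roots:
            value = value * (x - a) % p
        total += legendre(value, p)
    return total


def weil_bound(roots, p):
    """Right-hand side of (12.75) when the roots are distinct mod p."""
    return (len(roots) - 1) * sqrt(p)


if __name__ == "__main__":
    p = 101
    for roots in ([5], [0, 1], [0, 1, 2], [0, 1, 2, 3, 4]):
        print(roots, complete_sum(roots, p), weil_bound(roots, p))
```

For $m = 1$ the sum vanishes, as in the first step of the proof of Proposition 12.11, and for $m = 2$ it is the classical value $-1$ of a quadratic character sum over a quadratic polynomial with non-zero discriminant.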

Now we are ready to prove that for $N \ge N_0$,

(12.77)

by induction with respect to $r$. The factors $c_r(m) \ge 1$ will be determined in the induction process; they depend on $q$, but it is not necessary to display this because $q$ will not change throughout the process. Moreover, we do not change the set of exponents $\{d_1, \ldots, d_m\}$ in the representation (12.69), although every induction step will double the number of zeros and poles of the resulting rational functions.

For $r = 1$ we have $\xi = \chi$ and

$$S_\chi(N) = \sum_{M < n \le M+N} \chi(f(n)) = \frac{N}{q}\,S(\chi) + \theta q$$

where $S(\chi)$ is the complete sum (12.74) and $|\theta| \le 1$. Hence we obtain by (12.76) (because $q^{3/2} \le N$)

(12.78) Now let r ) 2. We apply the shift n getting SdN)

>-->

n

+ hq 1

and average over 1 :( h :( H

=if

where ~ = X2 · · · Xr-!X and IBI :( 1. Here the error term is obtained by trivial estimation of the short character sums which do not overlap with the original sum. Hence ISdN)I :( T + 2Hq 1 , where T =

~

2.:: I :L

~(f(n + hql))l.

M
By Cauchy's inequality

T

2

:(;

~(f(n+hlql)jj(n+h2q1))1.

:L* I :L

l(h 1 ,h 2 (H Af
Here the inner sum is of type (12.68) for the reduced character $\tilde\xi$ and the quotient function $\tilde f(x) = f(x + h_1 q_1)/f(x + h_2 q_1)$. This has $2m$ zeros and poles. For $f(x)$ written as (12.69) we introduce the polynomial

$$\Delta(x) = \prod\prod_{\nu_1 \ne \nu_2} (x + a_{\nu_1} - a_{\nu_2}), \qquad \deg \Delta(x) = m(m-1),$$

so $\Delta = \Delta(0)$. Then the corresponding polynomial for $\tilde f(x)$ is

$$\tilde\Delta(x) = \Delta^2(x)\,\Delta^2(x+y)\,(x+y)^{2m}$$


where $y = (h_1 - h_2)q_1$. Hence $\tilde\Delta = \tilde\Delta(0) = \Delta^2 \Delta^2(y)\, y^{2m}$ and $(\tilde\Delta\, q_2 \cdots q_{r-1}, q) \le (\Delta\, q_1 \cdots q_{r-1}, q)\,(y\Delta(y), q/(q, q_1))$. By the induction hypothesis we get

\:L ~(J(n))\ :( Cr-1(2m)g(hl -

h2)((!:::,.q1 "· Qr-l, q)q-l )

21 -r

N

n

where g(h)

= (hq1!:::,.(hq1), qj(q, q!)) 21 H12

"'* L

g(h1- h2)

=

-r.

We choose H = q and estimate as follows

1 q

" L'

(hq 1/:::,.(hq 1), qj(q, q1)) 21-r

h(mod q)

l(h 1 ,h 2 (H

:( ~ :L q y(rnod

(y!:::,.(h),q)21-r q)

1

:( (q :L

21-r

(y!:::,.(y),q))

y(mod q)

For the inner sum we have

$$\frac{1}{q}\sum_{y \,(\mathrm{mod}\ q)} (y\Delta(y), q) \le \sum_{d \mid q} \big|\{y \,(\mathrm{mod}\ d);\ y\Delta(y) \equiv 0 \,(\mathrm{mod}\ d)\}\big| \le \sum_{d \mid q} (1 + \deg\Delta)^{\omega(d)} = (2 + \deg\Delta)^{\omega(q)}.$$

We have $\deg\Delta = m(m-1)$, so $2 + \deg\Delta \le 2m^2$ and $(2 + \deg\Delta)^{\omega(q)} \le \tau(q)\tau_m^2(q)$. Gathering the above estimates we obtain

Since $q_1 q^{5/4} \le N$, this yields (12.77) for $r \ge 2$, provided

$$c_r(m) \ge c_{r-1}^{1/2}(2m)\,\big(\tau(q)\tau_m^2(q)\big)^{2^{-r}} + 2.$$

By (12.78) we get (12.77) for $r = 1$, provided $c_1(m) \ge \tau_m(q)$. One can check that the numbers

$$c_r(m) = 4\big(\tau(q)^{r^2}\,\tau_m(q)^{2^r}\big)^{2^{-r}}$$

satisfy the above (recurrence) inequality (use $\tau_{2m}(q) = \tau(q)\tau_m(q)$) giving (12.73).

Our main interest in Theorem 12.10 is for $f(x) = x$ (so $m = 1$, $\Delta = 1$) in which case it yields

THEOREM 12.13. Let $\chi_\ell \pmod{q_\ell}$ for $1 \le \ell < r$ be any characters and let $\chi \pmod q$ be a primitive character of conductor $q > 1$, $q$ squarefree with $(q, q_1 \cdots q_{r-1}) = 1$. Then for $N \ge N_0 = \max\{q_1, \ldots, q_{r-1}, q^{1/4}\}\, q^{5/4}$ we have

(12.79)


COROLLARY 12.14. Let $\chi$ be a primitive character of conductor $q > 1$, $q$ squarefree. Suppose all prime factors of $q$ are $\le N^{1/9}$. Then

$$\Big|\sum_{M < n \le M+N} \chi(n)\Big| \le 4N\,\tau(q)^{r/2^r}\, q^{-1/(r2^r)} \tag{12.80}$$

where $r$ is any integer with $N^r \ge q^3$.

PROOF. Let $p$ be the largest prime factor of $q$. We can write $q = q_1 \cdots q_r$ with $q_\ell \le p\,q^{1/r}$ for $1 \le \ell \le r$ (some of the factors can be one). We have $q_\ell \le N^{4/9}$. Moreover, one of the factors, say $q_r$, satisfies $q_r \ge q^{1/r}$. We can also require that $\omega(q_r) \le \omega(q)/r$. To meet these conditions define $q_r$ as the smallest product of the largest prime factors of $q$ such that $q_r \ge q^{1/r}$. Then define $q_\ell$ successively for $1 \le \ell < r$ as the largest product of the largest prime factors of $q/q_r \cdots q_{\ell+1}$ such that $q_\ell \le p\,q^{1/r}$ (if $q = q_r \cdots q_{\ell+1}$, then put $q_\ell = \cdots = q_1 = 1$). Clearly, every prime factor of $q$ is used once in these $r$ steps, so $q = q_1 \cdots q_r$. Accordingly $\chi = \chi_1 \cdots \chi_r$, where $\chi_\ell$ are characters modulo $q_\ell$ and $\chi_r$ is primitive. Therefore (12.80) follows from (12.79). □

The divisor function in the estimates (12.79), (12.80) is a nuisance. In general we have $\tau(q) = (a_1 + 1) \cdots (a_n + 1) \le 2^{a_1 + \cdots + a_n} = 2^{\Omega(q)}$ if $q = p_1^{a_1} \cdots p_n^{a_n}$, where $\Omega(q)$ is the number of prime factors of $q$ counted with multiplicity. This can be quite large. We estimate $\Omega(q)$ as follows,

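$$\Omega(q) = \sum_{d \mid q} \frac{\Lambda(d)}{\log d} \le \sum_{d \mid q} \frac{\Lambda(d)}{\log z} + \sum_{d \le z}\Big(\frac{\Lambda(d)}{\log d} - \frac{\Lambda(d)}{\log z}\Big) \le \frac{\log q}{\log z} + \frac{cz}{\log^2 z}$$

where $z > 1$ and $c$ is an absolute constant. Choosing $z = \log q$ for $q \ge 3$ we get

$$\Omega(q) \le \frac{\log q}{\log\log q}\Big(1 + \frac{c}{\log\log q}\Big). \tag{12.81}$$

Hence

$$\tau(q) \le 2^{(1 + c/\log\log q)\,\log q/\log\log q}. \tag{12.82}$$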
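The elementary inequality $\tau(q) \le 2^{\Omega(q)}$ used above is easy to test by machine. The following sketch factors $q$ by trial division and compares the two sides; the helper names are illustrative assumptions, and the code makes no attempt at the sharper bounds (12.81) and (12.82).

```python
def factorize(q):
    """Prime factorization of q >= 1 as a dict {prime: exponent} (trial division)."""
    factors = {}
    p = 2
    while p * p <= q:
        while q % p == 0:
            factors[p] = factors.get(p, 0) + 1
            q //= p
        p += 1
    if q > 1:
        factors[q] = factors.get(q, 0) + 1
    return factors


def tau(q):
    """Number of divisors: product of (a_i + 1) over the prime powers of q."""
    result = 1
    for a in factorize(q).values():
        result *= a + 1
    return result


def big_omega(q):
    """Omega(q): number of prime factors counted with multiplicity."""
    return sum(factorize(q).values())


if __name__ == "__main__":
    for q in (30, 360, 2 ** 10, 2 * 3 * 5 * 7 * 11 * 13):
        print(q, tau(q), 2 ** big_omega(q))
```

Equality holds exactly when $q$ is squarefree, which is the case relevant to (12.79) and (12.80).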

EXERCISE 3. Prove that $\tau(q) \le q^{1/\log\log 3q}$ for all $q \ge 1$.

Using this crude estimate we derive from (12.80)

COROLLARY 12.15. Let $\chi$ be a primitive character of conductor $q \ge 3$, $q$ squarefree. Let $p$ be the largest prime factor of $q$. Then, for

$$N \ge q^{\varepsilon(q)} + p^{9} \quad\text{with } \varepsilon(q) = 4(\log\log q)^{-1/2} \tag{12.83}$$

we have

$$\sum_{M < n \le M+N} \chi(n) \ll N \exp\big(-\sqrt{\log q}\big). \tag{12.84}$$

EXERCISE 4. Let the conditions be as in Corollary 12.15. Show that

$$L(1, \chi) \le \varepsilon(q)\log q + 9\log p + b \tag{12.85}$$

where $b$ is an absolute constant.

12.6. Characters to powerful modulus.

The truly composite moduli (note that the numbers having no large prime divisors, the so-called "smooth numbers", or "entiers friables" in French, are truly composite) appear in practice not as often as the prime moduli. Another extreme are the moduli which are high powers of a fixed prime. These are examples of powerful numbers. We say that $q$ is powerful if its kernel

$$k = \prod_{p \mid q} p \tag{12.86}$$

is small relative to $q$ in the logarithmic scale. Let $\chi$ be a primitive character of conductor $q$ which is a powerful number. In this case a large part of character values can be described as an exponential of a polynomial, so the character sum $S_\chi(N)$ can be reduced to Weyl sums. To this end consider the following polynomial:

$$L(x) = x - \frac{x^2}{2} + \cdots \pm \frac{x^d}{d}. \tag{12.87}$$

One can show that (see [Ga2], p. 192) (12.88)

L(x + y + xy) = L(x)

+ L(y) +

L*

c(a, b)xayb

d
with some rational coefficients c(a, b) such that c(a, b) E Z[a, b]. Suppose 4 f q (for a technical reason). Let d be sufficiently large so that q2 lkd. Let D be the product of all m :( d,(m,q) = 1. Clearly DL(kx) E Z[x], and it follows from (12.88) that

$$DL(kx + ky + k^2xy) \equiv DL(kx) + DL(ky) \pmod q.$$

Hence the function $\xi(1 + kx) = e(DL(kx)/q)$ is a character of the multiplicative group of residue classes mod $q$ congruent to 1 mod $k$. It is easy to see that the order of $\xi$ is $q/k$, which is the order of the group. Therefore there exists an integer $B$ such that

$$\chi(1 + kx) = e(BDL(kx)/q). \tag{12.89}$$

Since $q$ is the conductor of $\chi$ it follows that $B$ is prime to $q/k$. On the other hand, $B$ is determined only mod $q/k$ so we can assume that $B$ is prime to $q$. By (12.89) we arrange the character sum (12.44) as follows:

$$S_\chi(N) = \sum_{0 < a \le k} \chi(a) \sum_{n} \chi(1 + k\bar{a}n) = \sum_{0 < a \le k} \chi(a) \sum_{n} e(F(\bar{a}n)/q)$$

where $n$ runs over the interval $(M - a)k^{-1} < n \le (M + N - a)k^{-1}$, $a\bar{a} \equiv 1 \pmod q$, and $F(x) = BDL(kx)$ is a polynomial of degree $d$ with integer coefficients. The last sum is of Weyl's type which can be estimated by Vinogradov's method. This has been done in detail in [12], getting

THEOREM 12.16. Let $\chi$ be a primitive character of conductor $q$, $2 \nmid q$. Let $k$ be the product of prime factors of $q$. Then, for $k^{100} < N < N' \le 2N$ we have

$$\Big|\sum_{N < n \le N'} \chi(n)\Big| \le \gamma N^{1-\delta} \tag{12.90}$$

where

$$\gamma = \exp\big(200\,r\log^2(1200r)\big), \qquad \delta = (1800r)^{-2}(\log 3600r)^{-1}, \qquad r = \frac{\log 3q}{\log N}.$$

REMARKS. The first result of this type for $q = p^e$ was established by A. G. Postnikov [Pos]. Our generalization of the Postnikov formula (12.89) for the powerful modulus is based on the formula (12.88) which is due to P. X. Gallagher [Ga2]. There are applications of the results to bounding the $L(s, \chi)$, to widening the zero-free region, and to estimating the least prime $p \equiv a \pmod q$ (see [Ga2] and [12]).
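The congruence $DL(kx + ky + k^2xy) \equiv DL(kx) + DL(ky) \pmod q$, which makes $1 + kx \mapsto e(DL(kx)/q)$ multiplicative, can be verified directly for a small powerful modulus. The sketch below assumes the sample values $q = 3^4$, $k = 3$, $d = 8$ (so that $q^2 \mid k^d$) and uses exact rational arithmetic; all function and variable names are hypothetical and serve only this illustration.

```python
from fractions import Fraction
from math import gcd


def scaled_truncated_log(x, k, d, D):
    """Return the integer D * L(k*x), where L(t) = t - t^2/2 + ... +- t^d/d."""
    total = Fraction(0)
    for j in range(1, d + 1):
        total += Fraction((-1) ** (j - 1) * D * (k * x) ** j, j)
    # D*L(kx) takes integer values, so the exact rational sum has denominator 1.
    assert total.denominator == 1
    return int(total)


# Sample parameters (assumptions for this sketch): q = 3^4, kernel k = 3, q^2 | k^d.
q, k, d = 81, 3, 8

# D is the product of all m <= d that are coprime to q.
D = 1
for m in range(1, d + 1):
    if gcd(m, q) == 1:
        D *= m


def additive(x, y):
    """Check D*L(kx + ky + k^2 xy) == D*L(kx) + D*L(ky) modulo q."""
    lhs = scaled_truncated_log(x + y + k * x * y, k, d, D)
    rhs = scaled_truncated_log(x, k, d, D) + scaled_truncated_log(y, k, d, D)
    return (lhs - rhs) % q == 0


if __name__ == "__main__":
    print(D, all(additive(x, y) for x in range(q // k) for y in range(q // k)))
```

Since $(1 + kx)(1 + ky) = 1 + k(x + y + kxy)$, the check confirms that the exponential of $DL(kx)/q$ respects multiplication on the classes congruent to 1 modulo $k$ for this choice of parameters.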

CHAPTER 13

SUMS OVER PRIMES

13.1. General principles.

Given $f : \mathbb{N} \to \mathbb{C}$ we are interested in estimation of the sum

$$V = \sum_{p} f(p) \tag{13.1}$$

where $p$ runs over prime numbers, or of the closely related sum

$$S = \sum_{n} \Lambda(n) f(n) \tag{13.2}$$

where $\Lambda(n)$ is the von Mangoldt function. Note that $S$ is obtained from $V$ for $f(n)$ replaced by $f(n)\log n$ up to a small contribution of terms $n = p^\nu$ with $\nu \ge 2$. Of course, the problems of estimating $S$ and $V$ are equivalent by partial summation, but it is often simpler in practice to deal with $S$. If $f$ is a nice, smooth function with compact support on $\mathbb{R}^+$ one can express $S$ by the zeros of $\zeta(s)$ by integrating $-\hat f(s)\zeta'(s)/\zeta(s)$, where $\hat f$ is the Mellin transform of $f$. One gets

$$S = \hat f(1) - \sum_{n > 0} \hat f(-2n) - \sum_{\rho} \hat f(\rho). \tag{13.3}$$

However, even assuming the Riemann Hypothesis, this expression (the so-called explicit formula, see Exercise 4 in Chapter 5) is not always useful. For example, if $f : \mathbb{R}^+ \to \mathbb{C}$ is an oscillatory function, such as

$$f(x) = e(\alpha x)g(x) \tag{13.4}$$
$$f(x) = e(\alpha\sqrt{x})g(x) \tag{13.5}$$

with $\alpha \in \mathbb{R}^*$ and $g(x)$ a bump function, then the Mellin transform $\hat f(s)$ is spread over a large range of zeros so that the expansion (13.3) fails to produce good results. As with any summation type formula (see Chapter 4) the decision to apply it or not can be reasonably guided according to whether the original sum has more terms over primes than the resulting sum over the zeros. Recently, equipped with powerful computers, researchers do not hesitate to download from the internet (1) a list of numerical values of the zeros, and are not afraid of using them to test sums over primes indirectly via the explicit formula. Primes and zeros are no longer different beasts; it is a matter of choice to work with one or the other set.

(1) See Odlyzko's site: http://www.dtc.umn.edu/~odlyzko/zeta_tables/

Many interesting arithmetic functions $f : \mathbb{N} \to \mathbb{C}$ are not even defined on the real numbers, in which case the explicit formula (13.3) is not applicable (true, one can take a smooth extension of $f$ to $\mathbb{R}^+$, but it does not work efficiently if the set of values $f(n)$ for $n \in \mathbb{N}$ is erratic). In such a case we prefer to use a sequence notation $A = (a_n)$ in place of the function $f(n) = a_n$. Since $A$ is infinite, we consider the finite sums

$$S(A, x) = \sum_{n \le x} \Lambda(n) a_n \tag{13.6}$$

in place of (13.2) with the aim of estimating these for all sufficiently large $x$. It is not difficult to predict the asymptotic behavior of $S(A, x)$ by appealing to a randomness of the Möbius function:

MÖBIUS RANDOMNESS LAW. The Möbius function $\mu(m)$ changes sign randomly so that for any "reasonable" sequence of complex numbers $A = (a_m)$ the twisted sum

$$M(A, x) = \sum_{m \le x} \mu(m) a_m \tag{13.7}$$

is relatively small due to the cancellation of its terms.
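To see the cancellation in (13.7) on a concrete case, one can tabulate $\mu$ with a sieve and evaluate the twisted sum for a chosen sequence. The sketch below is an illustration only: the function names are hypothetical, the sequence is passed as a Python callable, and the constant sequence $a_m = 1$ (giving the Mertens function) is just one sample choice.

```python
def mobius_sieve(x):
    """Return a list mu[0..x] of Moebius values computed by a linear sieve."""
    mu = [1] * (x + 1)
    mu[0] = 0
    is_composite = [False] * (x + 1)
    primes = []
    for n in range(2, x + 1):
        if not is_composite[n]:
            primes.append(n)
            mu[n] = -1
        for p in primes:
            if n * p > x:
                break
            is_composite[n * p] = True
            if n % p == 0:
                mu[n * p] = 0      # p^2 divides n*p
                break
            mu[n * p] = -mu[n]     # one more distinct prime factor
    return mu


def twisted_sum(a, x):
    """M(A, x) = sum over m <= x of mu(m) * a(m), for a callable a."""
    mu = mobius_sieve(x)
    return sum(mu[m] * a(m) for m in range(1, x + 1))


if __name__ == "__main__":
    for x in (10, 100, 1000, 10000):
        print(x, twisted_sum(lambda m: 1, x))
```

With $a_m = 1$ the sum stays tiny compared with the number of non-zero terms, which is the behavior the randomness law predicts for unbiased sequences.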

Of course a reasonable sequence means chosen with no bias. In practice it is often the case, except when playing with sieve weights which do conspire with the Möbius function. To apply this law to prime numbers we write $\Lambda = \mu * L$, or

$$\Lambda(n) = \sum_{d \mid n} \mu(d)\log\frac{n}{d} = -\sum_{d \mid n} \mu(d)\log d, \tag{13.8}$$

and insert the latter into (13.6) getting

$$S(A, x) = -\sum_{d} \mu(d)(\log d) A_d(x), \tag{13.9}$$

where

$$A_d(x) = \sum_{\substack{n \le x \\ n \equiv 0 \,(\mathrm{mod}\ d)}} a_n. \tag{13.10}$$
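The identity $\Lambda(n) = -\sum_{d \mid n}\mu(d)\log d$ behind (13.9) can be confirmed numerically. The sketch below computes both sides by trial division for small $n$; the function names are illustrative assumptions and floating-point logarithms are compared with a small tolerance.

```python
from math import isclose, log


def mobius(n):
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # square factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)                 # n itself is prime


def lambda_via_mobius(n):
    """Right-hand side of (13.8): minus the sum of mu(d) log d over d | n."""
    return -sum(mobius(d) * log(d) for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    for n in (1, 8, 12, 49, 97):
        print(n, von_mangoldt(n), lambda_via_mobius(n))
```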

The quantities $A_d(x)$, so also $A_d(x)\log d$, are "reasonable" if the original arithmetic terms $a_n$ are themselves "reasonable". Therefore, according to the principle, we believe that the contribution of large $d$'s in (13.9) is negligible, so $S(A, x)$ is well approximated by

$$S(A, D, x) = -\sum_{d \le D} \mu(d)(\log d) A_d(x) \tag{13.11}$$

for some $D$ which is relatively small. Now for $d \le D$ the quantities $A_d(x)$ may satisfy (due to large averaging) a simple, yet strong approximate formula

$$A_d(x) = g(d)A(x) + r_d(x) \tag{13.12}$$

where $g(d)$ is a nice multiplicative function, $A(x) = A_1(x)$, and $r_d(x)$ is a small error term. Ignoring the error term we arrive at $S(A, D, x) \sim H(D)A(x)$, where

$$H(D) = -\sum_{d \le D} \mu(d)(\log d)g(d).$$

Usually $g$ is regularly distributed over primes so there is a limit of $H(D)$ as $D$ tends to $\infty$, say $H(D) \sim H(\infty) = H$ with

$$H = -\sum_{d} \mu(d)(\log d)g(d) = \prod_{p}\big(1 - g(p)\big)\Big(1 - \frac{1}{p}\Big)^{-1}. \tag{13.13}$$

Therefore, the above transformations lead us to the following asymptotic formula

$$S(A, x) \sim H A(x), \quad\text{as } x \to \infty. \tag{13.14}$$
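As an illustration of how (13.13) and (13.14) are used, here is a worked example for a sequence not treated in the text; it is a heuristic sketch only, and the resulting asymptotic is the classical conjecture of Hardy and Littlewood, not a theorem. Take $a_n = \Lambda(n+2)$, so that $S(A,x)$ counts twin prime pairs with logarithmic weights. The prime number theorem in arithmetic progressions suggests $A_d(x) \approx x/\varphi(d)$ for odd $d$ and a negligible $A_d(x)$ for even $d$, which corresponds to the choice $g(p) = 1/(p-1)$ for $p > 2$ and $g(2) = 0$:

```latex
\begin{align*}
A(x) &= \sum_{n \le x} \Lambda(n+2) \sim x, \\
H &= \prod_{p} \bigl(1 - g(p)\bigr)\Bigl(1 - \frac{1}{p}\Bigr)^{-1}
   = \Bigl(1 - \frac{1}{2}\Bigr)^{-1} \prod_{p > 2} \Bigl(1 - \frac{1}{p-1}\Bigr)\Bigl(1 - \frac{1}{p}\Bigr)^{-1} \\
  &= 2 \prod_{p > 2} \frac{p(p-2)}{(p-1)^2}
   = 2 \prod_{p > 2} \Bigl(1 - \frac{1}{(p-1)^2}\Bigr), \\
\sum_{n \le x} \Lambda(n)\Lambda(n+2) &\sim 2 \prod_{p > 2} \Bigl(1 - \frac{1}{(p-1)^2}\Bigr)\, x
   \qquad \text{(heuristically, by (13.14)).}
\end{align*}
```

The constant obtained this way is twice the twin prime constant, in agreement with the usual form of the conjecture.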

Of course, proving this rigorously is a different matter. What needs to be said, however, is that the heuristic formula (13.14) has never failed to hold in every natural case in which $S(A, x)$ is evaluated rigorously by special means.

In 1937, I. M. Vinogradov (see [V5], [V6]) gave a very good estimate for the exponential sum

$$V(\alpha; x) = \sum_{p \le x} e(\alpha p) \tag{13.15}$$

which was needed for solving the Goldbach problem with three primes (see Chapter 19). In the absence of the Riemann Hypothesis, this was a surprising accomplishment. His method applies to the sum $S(A, x)$ for a great variety of oscillating sequences $A = (a_n)$ which are not multiplicative, such as $a_n = e(\alpha n)$, $a_n = e(\alpha\sqrt{n})$, $a_n = \chi(n+1)$, where $\chi$ is a multiplicative non-principal character. Here is how one can explain the principles of Vinogradov's method in abstract settings. One starts by repeated applications of the exclusion-inclusion procedure to arrange $S(A, x)$ into sums of two types

$$S_1 = \sum_{d \le D}\ \sum_{dn \le x} \alpha_d\, a_{dn}, \tag{13.16}$$

$$S_2 = \mathop{\sum\sum}_{\substack{mn \le x \\ m > M,\ n > N}} \beta_m \gamma_n\, a_{mn} \tag{13.17}$$

where $\alpha_d, \beta_m, \gamma_n$ are independent real numbers (essentially bounded) and $D, M, N$ are suitable parameters (neither small nor large relative to $x$). The sum of type $S_1$ can be written as

$$S_1 = \sum_{d \le D} \alpha_d A_d(x) \tag{13.18}$$

where $A_d(x)$ is given by (13.10). Here the terms $a_n$ of $A_d(x)$ are cleared of complicated coefficients except for the condition $n \equiv 0 \pmod d$ which is easy to handle for small $d$. Assuming some regularity in the variation of the argument of $a_n$ one shows that $A_d(x)$ is small for every $d \le D$ due to the cancellation of terms.

The sum of type $S_2$ is considered as a bilinear form in the coefficients $\beta_m, \gamma_n$ (see Chapter 7 for more general considerations). For estimation it is convenient to split $S_2$ into bilinear forms over dyadic segments

$$B(M, N) = \mathop{\sum\sum}_{\substack{mn \le x \\ M < m \le 2M,\ N < n \le 2N}} \beta_m \gamma_n\, a_{mn}. \tag{13.19}$$

Applying Cauchy's inequality one removes the coefficients $\beta_m$ as follows:

$$|B(M, N)|^2 \le \Big(\sum_{m} |\beta_m|^2\Big) \sum_{n_1}\sum_{n_2} \big|\gamma_{n_1}\gamma_{n_2}\, C(n_1, n_2)\big| \tag{13.20}$$

where

$$C(n_1, n_2) = \sum_{m} a_{mn_1}\bar{a}_{mn_2}. \tag{13.21}$$

If $n_1 = n_2$ we can do nothing better than the trivial bound. However, since $N$ is quite large the number of such cases (the diagonal terms) is relatively small. On the other hand, for a majority of $n_1 \ne n_2$ the sum $C(n_1, n_2)$ is small because of cancellation of terms (the variation of the argument of $a_{mn_1}$ is not completely annihilated by that of $\bar{a}_{mn_2}$). In this way one arrives at a non-trivial estimate for $B(M, N)$ and also for the sums of type $S_2$. Combining the results for sums of type $S_1$ and $S_2$ one deduces a bound for $S(A, x)$ which is often remarkably strong. In the forthcoming sections we provide details of the Vinogradov method in particular contexts. We shall also present some other methods.

13.2. A variant of Vinogradov's method.

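In this section we develop a formula for sums over primes using some modifications of the original ideas of Vinogradov. The arguments are of combinatorial nature; they have much in common with sieve methods. For $1 < n \le x$ we put $\beta_n(x) = (1 + \omega(n, \sqrt{x}))^{-1}$, where $\omega(n, \sqrt{x})$ is the number of primes $p \mid n$ with $p \le \sqrt{x}$. If $n$ is squarefree, we have

$$\sum_{\substack{mp = n \\ p \le \sqrt{x}}} \beta_m(x) = \begin{cases} 1 & \text{if } n \text{ has a prime factor} \le \sqrt{x}, \\ 0 & \text{if } n \text{ is prime with } \sqrt{x} < n \le x. \end{cases}$$

This nice formula is due to O. Ramaré. Hence for any sequence of complex numbers $A = (a_n)$ we have

$$\sum_{\sqrt{x} < p \le x} a_p = \sum_{1 < n \le x}{}^{\flat}\, a_n - \sum_{p \le \sqrt{x}}\ \sum_{\substack{mp \le x \\ (m,p)=1}}{}^{\flat}\, \beta_m(x)\, a_{mp}$$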

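where $\sum^{\flat}$ restricts the summation to squarefree numbers. Let $P(z)$ be the product of all primes $p \le z$ with $2 \le z \le \sqrt{x}$. We apply the above identity to the subsequence of numbers $a_n$ with $(n, P(z)) = 1$ getting

$$\sum_{\sqrt{x} < p \le x} a_p = \sum_{\substack{1 < n \le x \\ (n,P(z))=1}}{}^{\flat}\, a_n - \sum_{z < p \le \sqrt{x}}\ \sum_{\substack{mp \le x,\ (m,p)=1 \\ (m,P(z))=1}}{}^{\flat}\, \beta_m(x)\, a_{mp}.$$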
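The combinatorial identity attributed to Ramaré says that for squarefree $n \le x$ the sum of $\beta_{n/p}(x) = (1 + \omega(n/p, \sqrt{x}))^{-1}$ over the primes $p \mid n$ with $p \le \sqrt{x}$ equals 1 if such a prime exists and 0 otherwise. The sketch below checks this statement with exact fractions; the helper names and the sample value of $x$ are assumptions made for the illustration.

```python
from fractions import Fraction


def prime_factors(n):
    """Distinct prime factors of n in increasing order (trial division)."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def is_squarefree(n):
    return all(n % (p * p) for p in prime_factors(n))


def beta(n, x):
    """beta_n(x) = 1 / (1 + number of primes p | n with p <= sqrt(x))."""
    small = sum(1 for p in prime_factors(n) if p * p <= x)
    return Fraction(1, 1 + small)


def ramare_sum(n, x):
    """Sum of beta_{n/p}(x) over the primes p | n with p <= sqrt(x)."""
    return sum((beta(n // p, x) for p in prime_factors(n) if p * p <= x), Fraction(0))


if __name__ == "__main__":
    x = 400
    for n in (30, 46, 7, 23, 397):
        print(n, ramare_sum(n, x))
```

If $n$ has $k \ge 1$ prime factors up to $\sqrt{x}$, each of the $k$ terms equals $1/k$, which is why the weights $\beta_m(x)$ produce exactly 1.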

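Now we remove the restriction to squarefree numbers $n$, and the condition $p \nmid m$. The correction quantity does not exceed

$$\sum_{\substack{n \le x \\ \exists\, p > z,\ p^2 \mid n}} |a_n| \le \Big(\sum_{n \le x} |a_n|^2\Big)^{1/2}\Big(\sum_{p > z} x p^{-2}\Big)^{1/2} \le \|A(x)\|\, x^{1/2} z^{-1/2}$$

where $\|A(x)\|$ is defined by

$$\|A(x)\| = \Big(\sum_{n \le x} |a_n|^2\Big)^{1/2}. \tag{13.22}$$

To this we add the terms $a_1$ and $a_p$ with $p \le \sqrt{x}$ and estimate their contribution by

$$\sum_{n \le \sqrt{x}} |a_n| \le \|A(x)\|\, x^{1/4}.$$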

Next we remove the condition $(n, P(z)) = 1$ by the exact sieve (the Legendre formula)

$$\sum_{\substack{n \le x \\ (n,P(z))=1}} a_n = \sum_{d \mid P(z)} \mu(d) A_d(x).$$
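For the constant sequence $a_n = 1$ the Legendre formula reads $\#\{n \le x : (n, P(z)) = 1\} = \sum_{d \mid P(z)} \mu(d)\lfloor x/d \rfloor$, and both sides can be computed directly. The following sketch does so by brute force; the function names are hypothetical and the enumeration of all divisors of $P(z)$ is only practical for small $z$.

```python
from itertools import combinations
from math import gcd, isqrt, prod


def primes_up_to(z):
    """All primes p <= z, by trial division (adequate for small z)."""
    return [p for p in range(2, z + 1) if all(p % r for r in range(2, isqrt(p) + 1))]


def sifted_count(x, z):
    """Direct count of n <= x coprime to P(z), the product of the primes p <= z."""
    P = prod(primes_up_to(z))
    return sum(1 for n in range(1, x + 1) if gcd(n, P) == 1)


def legendre_count(x, z):
    """Sum of mu(d) * floor(x/d) over the squarefree divisors d of P(z)."""
    primes = primes_up_to(z)
    total = 0
    for r in range(len(primes) + 1):
        for subset in combinations(primes, r):
            d = prod(subset)               # prod(()) == 1 gives the term d = 1
            total += (-1) ** r * (x // d)  # mu(d) = (-1)^r for r distinct primes
    return total


if __name__ == "__main__":
    for x, z in ((100, 7), (1000, 13), (5000, 17)):
        print(x, z, sifted_count(x, z), legendre_count(x, z))
```

The number of terms on the right doubles with every prime, which is the practical reason the text keeps only the divisors $d \le \sqrt{x}$ and estimates the rest.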

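We keep the terms $A_d(x)$ with $d \le \sqrt{x}$ and estimate the remaining ones. First by Cauchy's inequality we get

$$\sum_{\substack{d \mid P(z) \\ d > \sqrt{x}}} |A_d(x)| \le \mathop{\sum\sum}_{\substack{dn \le x \\ d \mid P(z),\ d > \sqrt{x}}} |a_{dn}| \le \|A(x)\|\Big(\mathop{\sum\sum}_{\substack{dn \le x \\ d \mid P(z),\ d > \sqrt{x}}} \tau(dn)\Big)^{1/2} \le \|A(x)\|\,(x\log 3x)^{1/2}\Big(\sum_{\substack{d \mid P(z) \\ d > \sqrt{x}}} \tau(d)d^{-1}\Big)^{1/2}.$$

To the last sum we apply Rankin's trick as follows

$$\sum_{\substack{d \mid P(z) \\ d > \sqrt{x}}} \tau(d)d^{-1} \le x^{-\varepsilon}\sum_{d \mid P(z)} \tau(d)d^{2\varepsilon - 1} \le x^{-\varepsilon}\prod_{p \le z}\big(1 + 2p^{2\varepsilon-1}\big) \le x^{-\varepsilon}\prod_{p}\big(1 + p^{-1-\varepsilon}\big)^{2z^{3\varepsilon}}$$

$$\le x^{-\varepsilon}\zeta(1+\varepsilon)^{2z^{3\varepsilon}} \le x^{-\varepsilon}\Big(1 + \frac{1}{\varepsilon}\Big)^{2z^{3\varepsilon}} = \exp\Big(-\frac{\log x}{\log z}\Big)(1 + \log z)^{2e^3}$$

by choosing $\varepsilon = 1/\log z$. Collecting the above estimates we get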

PROPOSITION 13.1. Let $2 \le z \le \sqrt{x}$. For any sequence of complex numbers $A = (a_n)$ we have

$$\sum_{p \le x} a_p = \sum_{\substack{d \mid P(z) \\ d \le \sqrt{x}}} \mu(d)A_d(x) - \sum_{(m,P(z))=1}{}^{\flat}\, \beta_m(x) \sum_{\substack{mp \le x \\ z < p \le \sqrt{x}}} a_{mp} + c(x,z)\sqrt{x}\,\|A(x)\| \tag{13.23}$$

where $P(z)$ is the product of all primes $p \le z$, $\sum^{\flat}$ restricts the summation to squarefree numbers, $A_d(x)$ is given by (13.10), $\|A(x)\|$ is given by (13.22), $|\beta_m(x)| \le 1$ and $c(x,z)$ (which also depends on $A$) satisfies

$$|c(x,z)| \le \frac{2}{\sqrt{z}} + \exp\Big(-\frac{\log x}{2\log z}\Big)(\log 3x)^{21}. \tag{13.24}$$

REMARKS. More precisely we have (13.23) with $\beta_m(x) = (1 + \omega(m, \sqrt{x}))^{-1}$, where $\omega(m, \sqrt{x})$ is the number of primes $p \mid m$, $p \le \sqrt{x}$, but we didn't specify this to emphasize that one doesn't need anything explicit about these coefficients. The first sum on the right side of (13.23) is of type $S_1$ with $D = \sqrt{x}$ (see (13.16) and (13.18)). One can also split this into two sums $S_1 + S_2$ according to $d \le z$ and $z < d \le \sqrt{x}$. Then $S_1$ is of type (13.16) with $D = z$ while $S_2$ is of type (13.17) with $M = z$ and $N = \sqrt{x}$. The double sum on the right side of (13.23) is also of type (13.17) with $M = z$ and $N = \sqrt{x}$. The estimation (13.24) is fine for

$$z = \exp\big(\sqrt{\log x}\big). \tag{13.25}$$

For this choice of $z$ we have

$$|c(x,z)| \le 3\exp\big(-\tfrac{1}{2}\sqrt{\log x}\big)(\log 3x)^{21}. \tag{13.26}$$

If $A = (a_n)$ is an oscillatory sequence (so the main term in (13.12) is expected to be small), then without great loss one can simplify (13.23) by writing the inequality

$$\Big|\sum_{p \le x} a_p\Big| \le \sum_{d \le \sqrt{x}} |A_d(x)| + \sum_{m}\Big|\sum_{\substack{mp \le x \\ z < p \le \sqrt{x}}} a_{mp}\Big| + c(x,z)\sqrt{x}\,\|A(x)\|. \tag{13.27}$$

We shall return to the above constructions in Section 13.6.

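13.3. Linnik's identity.

In 1960 Yu. V. Linnik [Li2] gave a marvelous expression for

$$\Lambda'(n) = \frac{\Lambda(n)}{\log n} = \begin{cases} 1/a & \text{if } n = p^a,\ a > 0, \\ 0 & \text{otherwise} \end{cases} \tag{13.28}$$

in terms of the strict divisor functions

$$\tau_k'(n) = \big|\{n_1, \ldots, n_k \ge 2;\ n_1 \cdots n_k = n\}\big|. \tag{13.29}$$

To this end he computes the logarithm of

$$\zeta(s) = \sum_{n} n^{-s} = \prod_{p}\big(1 - p^{-s}\big)^{-1}$$

in two ways using the power series expansion

$$\log(1 - x) = -\sum_{k=1}^{\infty} \frac{x^k}{k}.$$

First by the Euler product for $\zeta(s)$ one gets

$$\log\zeta(s) = \sum_{n \ge 2} \Lambda'(n) n^{-s}. \tag{13.30}$$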

On the other hand, by the Dirichlet series for $\zeta(s)$,

$$\log\zeta(s) = -\sum_{k \ge 1} \frac{(-1)^k}{k}\big(\zeta(s) - 1\big)^k = -\sum_{k \ge 1} \frac{(-1)^k}{k}\sum_{n} \tau_k'(n) n^{-s}. \tag{13.31}$$

Comparing the coefficients in both expansions Linnik gets the identity

$$\Lambda'(n) = -\sum_{k} \frac{(-1)^k}{k}\,\tau_k'(n). \tag{13.32}$$

Note that $\tau_k'(n) = 0$ if $2^k > n$, so (13.32) is a finite sum over $1 \le k \le \log n/\log 2$. In applications Linnik uses (13.32) for numbers $n$ having no small prime factors, say for $(n, P(z)) = 1$ with $z \ge 2$. Then (13.32) runs over $k \le \log n/\log z$. This yields the following

PROPOSITION 13.2. Let $2 \le z \le \sqrt{x}$. For any sequence of complex numbers $A = (a_n)$ we have

$$\sum_{\substack{p^\nu \le x \\ p > z}} \frac{a_{p^\nu}}{\nu} = -\sum_{k \le K} \frac{(-1)^k}{k} \sum_{\substack{n \le x \\ (n,P(z))=1}} a_n \tau_k'(n) \tag{13.33}$$

where $K = \log x/\log z$ and $\tau_k'(n)$ is the strict divisor function (13.29).
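Linnik's identity $\Lambda(n)/\log n = -\sum_k \frac{(-1)^k}{k}\tau_k'(n)$ is an exact statement about rational numbers, so it lends itself to a direct check. The sketch below counts ordered factorizations into factors $\ge 2$ recursively and compares with $1/a$ for prime powers $p^a$; the function names are illustrative assumptions.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def strict_tau(k, n):
    """Number of ordered factorizations n = n_1 * ... * n_k with every n_i >= 2."""
    if k == 0:
        return 1 if n == 1 else 0
    return sum(strict_tau(k - 1, n // d) for d in range(2, n + 1) if n % d == 0)


def linnik_rhs(n):
    """Minus the sum over k of (-1)^k / k * tau'_k(n); the terms vanish once 2^k > n."""
    total = Fraction(0)
    k = 1
    while 2 ** k <= n:
        total -= Fraction((-1) ** k * strict_tau(k, n), k)
        k += 1
    return total


def lambda_over_log(n):
    """Lambda(n)/log n: equals 1/a if n = p^a with a >= 1, and 0 otherwise."""
    for p in range(2, n + 1):
        if n % p == 0:            # p is the smallest prime factor of n
            a = 0
            while n % p == 0:
                n //= p
                a += 1
            return Fraction(1, a) if n == 1 else Fraction(0)
    return Fraction(0)


if __name__ == "__main__":
    for n in (4, 6, 8, 30, 64, 97):
        print(n, linnik_rhs(n), lambda_over_log(n))
```

For instance, $n = 8$ has $\tau_1' = 1$, $\tau_2' = 2$, $\tau_3' = 1$, giving $1 - 1 + \tfrac13 = \tfrac13$.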

If one prefers the exact divisor functions $\tau_\ell(n)$ one can replace the strict ones in the Linnik formula by

$$\tau_k'(n) = \sum_{0 \le \ell \le k} (-1)^{k-\ell}\binom{k}{\ell}\tau_\ell(n). \tag{13.34}$$

Moreover, the condition $(n, P(z)) = 1$ can be relaxed at small cost by applying the exact sieve in the same way as in the Vinogradov formula. On the right-hand side of (13.33) the first term ($k = 1$) is a sum of type $S_1$, precisely

$$S_1 = \sum_{\substack{1 < n \le x \\ (n,P(z))=1}} a_n, \tag{13.35}$$

while any other sum

$$S_k = \sum_{\substack{n \le x \\ (n,P(z))=1}} a_n \tau_k'(n) \tag{13.36}$$

for $k \ge 2$ can be regarded as a bilinear form (a sum of type $S_2$). However, Linnik gave a special treatment for $S_k$ for a few small $k$ (in the solution of the Hardy-Littlewood equation $p + x^2 + y^2 = N$), and only for larger $k$ he arranged $S_k$ into a suitable bilinear form. By extending the class of special forms involving the divisor functions of degree $k > 1$ one gains more flexibility in the remaining parts of the identity (13.33) from which to arrange bilinear forms. Of course, not every form $S_k$ can be treated by special means. Usually one treats $S_1$ by classical Fourier analysis, and one can often apply to $S_2$ the spectral theory of automorphic forms. One may hope that $S_k$ with $k \ge 3$ can be treated by Fourier analysis in $GL_k$, but so far this has not been successful.

In 1981 D. R. Heath-Brown [HB3] developed a formula for $\Lambda(n)$ which is reminiscent of that of Linnik. Let

$$M(s) = \sum_{m \le z} \mu(m) m^{-s}.$$

Then

$$\zeta(s)M(s) = 1 + \sum_{n > z}\Big(\sum_{\substack{m \mid n \\ m \le z}} \mu(m)\Big) n^{-s}.$$

Consider the identity

$$\frac{\zeta'}{\zeta}(s)\big(1 - \zeta(s)M(s)\big)^K = \frac{\zeta'}{\zeta}(s) + \sum_{1 \le k \le K} (-1)^k\binom{K}{k}\zeta'(s)\zeta^{k-1}(s)M^k(s).$$

Comparing the Dirichlet coefficients on both sides we obtain

PROPOSITION 13.3. Let $K \ge 1$, $z \ge 1$. Then for any $n < 2z^K$ we have

$$\Lambda(n) = -\sum_{1 \le k \le K} (-1)^k\binom{K}{k}\sum_{\substack{m_1 \cdots m_k n_1 \cdots n_k = n \\ m_1, \ldots, m_k \le z}} \mu(m_1)\cdots\mu(m_k)\log n_k. \tag{13.37}$$

This identity of Heath-Brown has a few advantages over that of Linnik. Most of all, the number of terms $K$ can be chosen smaller by regulating with the parameter $z$ more efficiently.

EXERCISE 1. Show that for $n \le z^K$ we have

$$\mu(n) = -\sum_{1 \le k \le K} (-1)^k\binom{K}{k}\sum\cdots\sum_{\substack{m_1 \cdots m_k n_1 \cdots n_{k-1} = n \\ m_1, \ldots, m_k \le z}} \mu(m_1)\cdots\mu(m_k). \tag{13.38}$$

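13.4. Vaughan's identity.

The most popular formula for $\Lambda(n)$ is due to R. C. Vaughan [Va]. We derive it from

$$\Lambda(n) = \sum_{b \mid n} \mu(b)\log\frac{n}{b}.$$

Here we keep the terms with $b \le y$ and transform the remaining sum as follows

$$\sum_{\substack{b \mid n \\ b > y}} \mu(b)\log\frac{n}{b} = \mathop{\sum\sum}_{\substack{bc \mid n \\ b > y}} \mu(b)\Lambda(c).$$

Next we keep the terms with $c > z$ and transform the remaining sum as follows

$$\mathop{\sum\sum}_{\substack{bc \mid n \\ b > y,\ c \le z}} \mu(b)\Lambda(c) = \mathop{\sum\sum}_{\substack{bc \mid n \\ c \le z}} \mu(b)\Lambda(c) - \mathop{\sum\sum}_{\substack{bc \mid n \\ b \le y,\ c \le z}} \mu(b)\Lambda(c).$$

Here the complete sum over all $b$ dividing $n/c$ vanishes unless $c = n$, which is not possible if $n > z$. Adding up the above expressions one obtains

PROPOSITION 13.4. Let $y, z \ge 1$. Then for any $n > z$ we have

$$\Lambda(n) = \sum_{\substack{b \mid n \\ b \le y}} \mu(b)\log\frac{n}{b} - \mathop{\sum\sum}_{\substack{bc \mid n \\ b \le y,\ c \le z}} \mu(b)\Lambda(c) + \mathop{\sum\sum}_{\substack{bc \mid n \\ b > y,\ c > z}} \mu(b)\Lambda(c). \tag{13.39}$$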
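Vaughan's identity (13.39) can be verified term by term for small $n$. The sketch below evaluates the three sums on the right by enumerating divisors and compares the result with $\Lambda(n)$ for every $n > z$ in a range; the function names and the sample cut-offs $y$, $z$ are assumptions made for this illustration.

```python
from math import isclose, log


def mobius(n):
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def vaughan_rhs(n, y, z):
    """Right-hand side of (13.39); the condition bc | n means b | n and c | n/b."""
    first = sum(mobius(b) * log(n / b) for b in divisors(n) if b <= y)
    second = 0.0
    third = 0.0
    for b in divisors(n):
        for c in divisors(n // b):
            term = mobius(b) * von_mangoldt(c)
            if b <= y and c <= z:
                second += term
            elif b > y and c > z:
                third += term
    return first - second + third


if __name__ == "__main__":
    y, z = 5, 7
    for n in (8, 9, 30, 64, 101):
        print(n, von_mangoldt(n), vaughan_rhs(n, y, z))
```

The identity holds for any cut-offs $y, z \ge 1$ as long as $n > z$; the choice of $y$ and $z$ only matters when the three sums are estimated, as in Section 13.5.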

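Similarly we derive a formula for the Möbius function.

PROPOSITION 13.5. Let $y, z \ge 1$. Then for any $m > \max(y, z)$ we have

$$\mu(m) = -\mathop{\sum\sum}_{\substack{bc \mid m \\ b \le y,\ c \le z}} \mu(b)\mu(c) + \mathop{\sum\sum}_{\substack{bc \mid m \\ b > y,\ c > z}} \mu(b)\mu(c). \tag{13.40}$$

PROOF. We start from

$$\mu(m) = \mathop{\sum\sum}_{bc \mid m} \mu(b)\mu(c)$$

and split the summation into four ranges according to $b \le y$, $b > y$, $c \le z$, $c > z$. Then we transform the two middle sums into

$$\mathop{\sum\sum}_{\substack{bc \mid m \\ b \le y,\ c > z}} \mu(b)\mu(c) = -\mathop{\sum\sum}_{\substack{bc \mid m \\ b \le y,\ c \le z}} \mu(b)\mu(c), \qquad \mathop{\sum\sum}_{\substack{bc \mid m \\ b > y,\ c \le z}} \mu(b)\mu(c) = -\mathop{\sum\sum}_{\substack{bc \mid m \\ b \le y,\ c \le z}} \mu(b)\mu(c).$$

Adding up the above expressions one obtains (13.40). □

Vaughan's identity (13.39) is not as flexible for arranging bilinear forms as those of Linnik or Heath-Brown, but it proved to be sufficient for basic applications and it is the simplest of all (ready to use without re-grouping its terms).

EXERCISE. Derive a formula à la Vaughan for $\Lambda_k(n)$.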

13.5. Exponential sums over primes.

As an example of numerous applications of the methods described in the previous sections we prove Vinogradov's estimate for the exponential sum (13.15). Assuming

$$\Big|\alpha - \frac{a}{q}\Big| \le \frac{1}{q^2} \quad\text{with } (a, q) = 1, \tag{13.41}$$

Vinogradov [V6] showed that

$$V(\alpha; x) \ll q^{1/2}x^{1/2} + q^{-1/2}x + x\exp\big(-\tfrac{1}{2}\sqrt{\log x}\big). \tag{13.42}$$

Using Vaughan's identity (13.39) we derive practically the same bound for

$$S(\alpha; x) = \sum_{n \le x} e(\alpha n)\Lambda(n). \tag{13.43}$$


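THEOREM 13.6. Suppose $\alpha$ satisfies (13.41). Then for $x \ge 2$ we have

$$S(\alpha; x) \ll \big(q^{1/2}x^{1/2} + q^{-1/2}x + x^{4/5}\big)(\log x)^3 \tag{13.44}$$

where the implied constant is absolute.

We begin the proof of (13.44) by estimating special exponential sums. We have

$$\Big|\sum_{1 \le n \le N} e(\alpha n)\Big| \le \min\Big(N, \frac{1}{2\|\alpha\|}\Big).$$

Hence for any numbers $x(m) > 0$ we get

$$\sum_{|m| \le M}\Big|\sum_{\substack{1 \le n \le N \\ n \le x(m)}} e(\alpha mn)\Big| \le \sum_{|m| \le M} \min\Big(N, x(m), \frac{1}{2\|\alpha m\|}\Big).$$

As $m$ varies over a segment of length $q/2$, the points $\|\alpha m\|$ are all distinct and spaced by $1/2q$ at least. By this observation we derive

$$\sum_{|m| \le M} \min\Big(N, \frac{1}{2\|\alpha m\|}\Big) \le (1 + 4Mq^{-1})\Big(N + \sum_{1 \le \ell \le q} q\ell^{-1}\Big) \ll (M + N + MNq^{-1} + q)\log 2q.$$

Similarly for $x(m) = x/m$ we derive (consider separately the range $1 \le m \le q/2$)

$$\sum_{1 \le m \le M} \min\Big(\frac{x}{m}, \frac{1}{2\|\alpha m\|}\Big) \ll (M + xq^{-1} + q)\log 2qx.$$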

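From these estimates we get

LEMMA 13.7. For any numbers $x(m) > 0$ we have

$$\sum_{|m| \le M}\Big|\sum_{\substack{1 \le n \le N \\ n \le x(m)}} e(\alpha mn)\Big| \ll (M + N + MNq^{-1} + q)\log 2q, \tag{13.45}$$

$$\sum_{1 \le m \le M}\Big|\sum_{mn \le x} e(\alpha mn)\Big| \ll (M + xq^{-1} + q)\log 2qx. \tag{13.46}$$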
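The estimates of this section rest on the standard bound $|\sum_{1 \le n \le N} e(\alpha n)| \le \min(N, 1/(2\|\alpha\|))$ for a geometric sum, where $\|\alpha\|$ is the distance from $\alpha$ to the nearest integer. The sketch below evaluates both sides in floating point for a few sample values; the function names and the chosen values of $\alpha$ and $N$ are assumptions for this illustration, and a small tolerance absorbs rounding errors.

```python
import cmath
from math import pi


def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * pi * t)


def dist_to_nearest_integer(t):
    """||t||, the distance from t to the nearest integer."""
    return abs(t - round(t))


def geometric_sum(alpha, N):
    """Sum of e(alpha * n) over 1 <= n <= N."""
    return sum(e(alpha * n) for n in range(1, N + 1))


def geometric_bound(alpha, N):
    """min(N, 1 / (2 ||alpha||)), read as N when alpha is an integer."""
    norm = dist_to_nearest_integer(alpha)
    return N if norm == 0 else min(N, 1 / (2 * norm))


if __name__ == "__main__":
    for alpha in (0.1, 0.25, 1 / 3, 0.49, 2 ** 0.5):
        for N in (1, 10, 100):
            print(alpha, N, abs(geometric_sum(alpha, N)), geometric_bound(alpha, N))
```

The bound follows from $|\sin \pi\alpha| \ge 2\|\alpha\|$, and it is the reason the sums over $m$ in (13.45) and (13.46) are controlled by how the points $\|\alpha m\|$ are spaced.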

Next we derive a bound for general bilinear forms

$$B(x; N) = \sum_{N < n \le 2N}\Big|\sum_{mn \le x} \gamma_m e(\alpha mn)\Big| \tag{13.47}$$

where $\gamma_m$ are complex numbers with $|\gamma_m| \le 1$. Applying Cauchy's inequality and (13.45) we get

$$B^2(x; N) \le 2N \mathop{\sum\sum}_{1 \le m_1 \le m_2 \le x/N}\Big|\sum_{\substack{N < n \le 2N \\ n \le x/m_2}} e(\alpha(m_1 - m_2)n)\Big| \ll \Big(\frac{x}{N} + N + \frac{x}{q} + q\Big) x\log 2q.$$

From this bound we deduce

LEMMA 13.8. For any complex numbers $\alpha_m$, $\beta_n$ with $|\alpha_m| \le 1$, $|\beta_n| \le 1$ we have

(13.48)

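PROOF OF THEOREM 13.6. By the identity (13.39) we have

$$S(\alpha; x) = \mathop{\sum\sum}_{\substack{\ell m \le x \\ m \le M}} \mu(m)(\log \ell)\, e(\alpha\ell m) - \mathop{\sum\sum\sum}_{\substack{\ell mn \le x \\ m \le M,\ n \le N}} \mu(m)\Lambda(n)\, e(\alpha\ell mn) + \mathop{\sum\sum\sum}_{\substack{\ell mn \le x \\ m > M,\ n > N}} \mu(m)\Lambda(n)\, e(\alpha\ell mn) + O(N).$$

We choose $M = N = x^{2/5}$. To the first and the second sums we apply (13.46) getting $O((x^{4/5} + xq^{-1} + q)(\log x)^2)$. In the last sum we consider $\ell n = k$ as one variable with coefficient

$$c(k) = \sum_{\substack{n \mid k \\ n > N}} \Lambda(n) \le \log k.$$

Then we apply (13.48) getting $O((x^{3/5} + xq^{-1} + q)^{1/2}x^{1/2}(\log x)^3)$. Adding up these estimates we get (13.44). □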

By the same arguments, but using the identity (13.40) in place of (13.39) (also apply Cauchy's inequality along the way to remove the corresponding coefficients $c(k)$ which are bounded by $\tau(k)$, not by $\log k$ as above), one shows

THEOREM 13.9. Suppose $\alpha$ satisfies (13.41). Then for $x \ge 2$ we have

$$\sum_{m \le x} \mu(m)e(\alpha m) \ll \big(q^{1/2}x^{1/2} + q^{-1/2}x + x^{4/5}\big)^{1/2}x^{1/2}(\log x)^4. \tag{13.49}$$

If $\alpha$ is a rational number with small denominator, then the sum (13.49) can be well estimated by an appeal to the zero-free region of $L$-functions. Indeed, for any Dirichlet character $\chi \pmod k$ we have (see (5.80))

(13.50)

for any $x \ge 2$ and $A \ge 0$, the implied constant depending only on $A$. Hence we derive by means of Gauss sums that

(13.51)

Now let $\alpha$ be any real number. Given $Q \ge 1$ there exists a rational number $a/q$ with $(a, q) = 1$, $1 \le q \le Q$ such that

(13.52)

Hence we derive by partial summation that

for some $1 \le y \le x$. Using (13.51) one gets

$$\sum_{m \le x} \mu(m)e(\alpha m) \ll \Big(q + \frac{x}{Q}\Big)x(\log x)^{-5A} \tag{13.53}$$

for any $x \ge 2$ and $A \ge 0$, the implied constant depending only on $A$. We apply (13.53) if $q \le xQ^{-1}$ and (13.49) if $xQ^{-1} < q \le Q$ getting

$$\sum_{m \le x} \mu(m)e(\alpha m) \ll Q^{-1}x^2(\log x)^{-5A} + Q^{1/4}x^{3/4}(\log x)^4 + x^{9/10}(\log x)^4.$$

Finally, choosing $Q = x(\log x)^{-4A}$, we conclude the following

THEOREM 13.10. For any real number $\alpha$ and $x \ge 2$ we have

$$\sum_{m \le x} \mu(m)e(\alpha m) \ll x(\log x)^{-A} \tag{13.54}$$

for any $A \ge 0$, where the implied constant depends only on $A$.

This estimate was first proved in 1937 by H. Davenport [Da1] by applying Vinogradov's method. The complete uniformity in $\alpha$ of this bound is its pleasant feature. We shall exploit (13.54) in our treatment of the Goldbach problems in Chapter 19.

EXERCISE 2. Using the identity (13.39) prove that

$$\sum_{n \le x} e(\alpha\sqrt{n})\Lambda(n) \ll x^{5/6}(\log x)^4 \tag{13.55}$$

for any real number $\alpha \ne 0$, where the implied constant depends on $\alpha$.

13.6. Back to the sieve.

We elaborate further the ideas of Section 13.2 to draw results which can be useful in more delicate situations than that in the last section. The goal is to produce the bound

$$\Big|\sum_{p \le x} a_p\Big| \le \varepsilon\pi(x) \tag{13.56}$$

for any $\varepsilon > 0$ and $x \ge x_0(\varepsilon)$ subject to hypotheses on sums of type (13.18) and (13.19) which can be verified by existing technology. One such technology will be the spectral theory of automorphic forms which we present in Chapter 21 for the problem of equidistribution of quadratic roots to prime moduli. Recall that $P(z)$ denotes the product of all primes $p < z$. Given $x$ we consider the sums

$$Q(A, z) = \sum_{\substack{n \le x \\ (n,P(z))=1}} a_n \tag{13.57}$$

as a function of z for the sequence A = (an) and its subsequences Ad = (amd)· (See also Chapter 6 for an elementary introduction to sieve methods). Suppose that lanl ~ r(n). Let xk < z ~ x~. Then our sum over primes is approximated by Q(A, z) as follows:

+0

= Q(A, z)

Co:x

log

G~:gxz) + (lo:x)2).

One can reduce the "sieving level" $z$ by applying the Buchstab identity, namely for any sequence $A$ and any $w < z$ we have

$$Q(A, z) = Q(A, w) - \sum_{w \le p < z} Q(A_p, p). \tag{13.58}$$

Applying this identity twice we get

$$Q(A, z) = Q(A, w) - \sum_{w \le p < z} Q(A_p, w) + \mathop{\sum\sum}_{w \le q < p < z} Q(A_{pq}, q). \tag{13.59}$$
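The Buchstab identity (13.58) classifies the integers coprime to $P(w)$ but not to $P(z)$ by their least prime factor $p \in [w, z)$. For the constant sequence $a_n = 1$ it becomes a statement about counts, which the sketch below verifies by brute force. Here $P(z)$ is the product of the primes $p < z$, as in this section; the function names and sample parameters are assumptions for the illustration.

```python
from math import gcd, isqrt, prod


def primes_below(z):
    """All primes p < z, by trial division (adequate for small z)."""
    return [p for p in range(2, z) if all(p % r for r in range(2, isqrt(p) + 1))]


def Q(x, d, z):
    """Q(A_d, z) for a_n = 1: the number of m <= x/d with (m, P(z)) = 1."""
    P = prod(primes_below(z))      # empty product is 1, so nothing is sifted for z <= 2
    return sum(1 for m in range(1, x // d + 1) if gcd(m, P) == 1)


def buchstab_rhs(x, w, z):
    """Q(A, w) minus the sum over primes w <= p < z of Q(A_p, p)."""
    return Q(x, 1, w) - sum(Q(x, p, p) for p in primes_below(z) if p >= w)


if __name__ == "__main__":
    for x, w, z in ((1000, 3, 11), (1000, 2, 20), (5000, 5, 30)):
        print(x, w, z, Q(x, 1, z), buchstab_rhs(x, w, z))
```

Iterating the same decomposition on each $Q(A_p, p)$ gives (13.59), where the sifting level in the middle term has been lowered to $w$.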
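Opening $Q(A, w)$ and $Q(A_p, w)$ by the exact sieve of Legendre we get

$$Q(A, w) - \sum_{w \le p < z} Q(A_p, w) = \sum_{d}{}'\ \mu(d)A_d(x)$$

where the dash restricts the summation to the divisors $d$ of $P(z)$ which have at most one prime factor $p \ge w$. We fix $y \ge z$ and estimate the partial sum over $d > y$ by applying Rankin's trick as follows:

$$\Big|\sum_{d > y}{}'\ \mu(d)A_d(x)\Big| \le \sum_{d > y}{}'\ \sum_{md \le x} \tau(md) \ll x(\log x)\sum_{d > y}{}'\ \tau(d)d^{-1}$$

and

$$\sum_{d > y}{}'\ \tau(d)d^{-1} \le y^{-\varepsilon}\Big(1 + \sum_{w \le p < z} \tau(p)p^{\varepsilon-1}\Big)\prod_{p < w}\big(1 + \tau(p)p^{\varepsilon-1}\big).$$

Choosing $\varepsilon = (\log w)^{-1}$ we get

$$\sum_{d > y}{}'\ \tau(d)d^{-1} \ll \Big(\frac{z}{y}\Big)^{1/\log w}\prod_{p < w}\Big(1 + \frac{1}{p}\Big)^{2e+1} \ll \Big(\frac{z}{y}\Big)^{1/\log w}(\log z)^7.$$

Hence

$$\sum_{d > y}{}'\ \mu(d)A_d(x) \ll \Big(\frac{z}{y}\Big)^{1/\log w}x(\log x)^8.$$

The remaining sum over $d \le y$ is estimated by

$$R(x) = \sum_{d \le y} |A_d(x)|. \tag{13.60}$$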


Now we proceed to the double sum of Q(Apq, q) over primes p, q with w !( q < p < z. Let x~ < v !( x~. We estimate the partial sum over q ;? v trivially as follows:

~ ~

Q(Apq,q)

-1Xogx

«

v~q~p
~ ~

q

-I + ( z )2

1

v~q~x3

x (logx) --log logx 3logv

«

1ogz

x +(logx)2'

It remains to evaluate the sum of $Q(A_{pq}, q)$ over primes $p, q$ with $w \leq q < v$ and $q < p < z$. We write this in the form

$$\mathop{\sum\sum}_{\substack{mq \leq x \\ w \leq q < v \\ q < p_m}} \gamma_m a_{mq} \tag{13.61}$$

where $p_m$ is the least prime factor of $m$ and $\gamma_m$ is the number of prime divisors of $m$ which are $< z$, so $\gamma_m \leq \omega(m)$. We want to make it a bilinear form. This requires us to relax the relation $q < p_m$. To this end we use the following lemma, which is convenient for separation of variables:

LEMMA 13.11. There exists a function $h(t)$ depending only on $z$ such that

$$\int_{-\infty}^{\infty} |h(t)| \, dt < \log 6z \tag{13.62}$$

and for all integers $1 \leq a, b \leq z$,

$$\int_{-\infty}^{\infty} h(t) \Big( \frac{a}{b} \Big)^{it} dt = \begin{cases} 1 & \text{if } a \leq b, \\ 0 & \text{if } a > b. \end{cases} \tag{13.63}$$

PROOF. Put $g(u) = \min\{uz, 1, 1 + (1 - u)z\}$ for $0 \leq u \leq 1 + z^{-1}$ and $g(u) = 0$ otherwise. Then for positive integers $a, b \leq z$ we have $g(a/b) = 1$ if $a \leq b$, and $g(a/b) = 0$ if $a > b$. We also have

$$g\Big( \frac{a}{b} \Big) = \frac{1}{2\pi i} \int_{(0)} f(s) \Big( \frac{a}{b} \Big)^{-s} ds$$

where $f(s)$ is the Mellin transform of $g(u)$, i.e.,

$$f(s) = \int_0^{\infty} g(u) u^{s-1} \, du = \frac{(z+1)^{s+1} - z^{s+1} - 1}{s(s+1) z^s}.$$
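The two facts just used, the cutoff property of $g$ on fractions $a/b$ and the closed form of its Mellin transform, can be sanity-checked by machine. This is a minimal sketch and not part of the proof: the function names are illustrative, $z$ is taken to be a positive integer for convenience, and the Mellin integral is only tested at real $s > 0$ with a plain midpoint rule.

```python
from fractions import Fraction


def g(u, z):
    """Trapezoid cutoff min{uz, 1, 1 + (1 - u)z} on [0, 1 + 1/z], zero elsewhere."""
    if u < 0 or u > 1 + Fraction(1, z):
        return 0
    return min(u * z, 1, 1 + (1 - u) * z)


def check_indicator(z):
    """True if g(a/b) is 1 for a <= b and 0 for a > b, over all 1 <= a, b <= z."""
    for a in range(1, z + 1):
        for b in range(1, z + 1):
            expected = 1 if a <= b else 0
            if g(Fraction(a, b), z) != expected:  # exact rational arithmetic
                return False
    return True


def mellin_closed_form(s, z):
    """((z+1)^(s+1) - z^(s+1) - 1) / (s (s+1) z^s)."""
    return ((z + 1) ** (s + 1) - z ** (s + 1) - 1) / (s * (s + 1) * z ** s)


def mellin_numeric(s, z, steps=4000):
    """Midpoint-rule value of the integral of g(u) u^(s-1) over [0, 1 + 1/z], real s > 0."""
    breakpoints = [0.0, 1.0 / z, 1.0, 1.0 + 1.0 / z]  # g is linear on each piece
    total = 0.0
    for lo, hi in zip(breakpoints, breakpoints[1:]):
        h = (hi - lo) / steps
        for i in range(steps):
            u = lo + (i + 0.5) * h
            total += g(u, z) * u ** (s - 1) * h
    return total
```

Exact fractions are used in `check_indicator` because floating-point products such as `(1/z)*z` can fall just below 1 and would spoil the comparison at the corner points.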

For $s = it$ with $t > 0$ we have three estimates $|f(s)| \leq \min\{\log 6z, 2t^{-1}, 4zt^{-2}\}$, from which it follows that the function $h(t) = \frac{1}{2\pi} f(-it)$ has the asserted properties. $\Box$

Applying (13.63) to (13.61) and changing the notation $q$ to $p$ we obtain the integral of bilinear forms

$$\int_{-\infty}^{\infty} \mathop{\sum\sum}_{\substack{mp \leq x \\ w \leq p < v \\ w \leq p_m}} \gamma_m a_{mp} \Big( \frac{p_m}{p} \Big)^{it} h(t) \, dt$$

which we estimate by $B(x)(\log x)^2$ with

$$B(x) = \sum_{(m, P(w)) = 1} \Big| \sum_{\substack{mp \leq x \\ w \leq p < v}} a_{mp} p^{it} \Big| \tag{13.64}$$
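for some real number $t$ (use $\omega(m)$

$$\Big| \sum_{p \leq x} a_p \Big| \leq R(x) + B(x)(\log x)^2 + E(x) \tag{13.65}$$

where

$$E(x) \ll \frac{x}{\log x} \log \frac{(\log x)^2}{6 (\log v)(\log z)} + \frac{x}{(\log x)^2} + \Big( \frac{z}{y} \Big)^{1/\log w} x (\log x)^8 \tag{13.66}$$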

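and the implied constant is absolute. Let $2 \leq \Delta \leq x^{1/12}$. We choose $z = \Delta^{-2} x^{1/2}$, $y = \Delta^{-1} x^{1/2}$, $v = \Delta^{-1} x^{1/3}$, and $w = \Delta^{1/(10 \log\log x)}$, getting $E(x) \ll x (\log x)^{-2} \log \Delta$. Hence we conclude (take $\Delta = x^{\varepsilon}$)

THEOREM 13.12. Let $A = (a_n)$ be a sequence of complex numbers with $|a_n| \leq \tau(n)$. Let $x \geq e^{12}$ and $(\log x)^{-1} \leq \varepsilon \leq \frac{1}{12}$. Suppose that

$$\sum_{d \leq y} \Big| \sum_{dm \leq x} a_{dm} \Big| \leq \frac{x}{(\log x)^2}, \tag{13.67}$$

$$\sum_{(m, P(w)) = 1} \Big| \sum_{\substack{pm \leq x \\ w \leq p < v}} p^{it} a_{pm} \Big| \leq \frac{x}{(\log x)^4} \tag{13.68}$$

for any $t \in \mathbb{R}$, where the parameters $y, w, v$ are given in terms of $x$ and $\varepsilon$ by

$$y = x^{1/2 - \varepsilon}, \quad v = x^{1/3 - \varepsilon}, \quad w = x^{\varepsilon / (10 \log\log x)} \tag{13.69}$$

and $P(w)$ is the product of primes $< w$. Then

$$\sum_{p \leq x} a_p \ll \frac{\varepsilon x}{\log x}. \tag{13.70}$$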


REMARKS. The condition (13.68) is regarded as an estimate for general bilinear forms. The factor $p^{it}$ (which arises from our separation of variables) contaminates the coefficients, but it has no effect in practice. One could remove this factor; however, a precise statement would involve another parameter, which would compromise the clarity. Some of the restrictions in the bilinear form (13.68) can also be modified. For example, one can remove the coprimality condition in the outer sum, but our experience shows that it is convenient to have it (note that $(m, P(w)) = 1$ implies $(m, p) = 1$). A concept of "prime producing exponents" was discussed in [DFI5]; Theorem 13.12 asserts that the pair $(\frac{1}{2}, \frac{1}{3})$ is prime producing. The presence of $\varepsilon$ in the parameters (13.69) is very important; without it the requirement (13.68) would be impossible to verify for most interesting sequences $A = (a_n)$. A well balanced choice is $\varepsilon = \varepsilon(x) = (\log\log x)^{-1}$, giving a bound for the sum over primes which shows only a little cancellation. Nevertheless it is a satisfactory result, particularly in applications to the equidistribution problems of degree two objects related to the $GL_2$ theory (see Chapter 21).

CHAPTER 14

HOLOMORPHIC MODULAR FORMS

14.1. Quotients of the upper half-plane and modular forms.

Holomorphic modular forms were historically discovered and studied for purposes of complex analysis and algebraic geometry, in particular in connection with elliptic functions. Then they were introduced in algebraic number theory, particularly for the development of class field theory, Galois representations and the arithmetic of elliptic curves. More recently modular forms have entered the territory of analytic number theory. Besides providing new tools, they are also a source of deep problems with numerous connections to arithmetic geometry, notably through the theory of $L$-functions. See also Chapter 15 for a survey of the theory of non-holomorphic forms, as well as Chapter 16 for applications to Kloosterman sums and Chapter 26 for some results about $L$-functions. There are many fine books on various aspects of this vast subject, for instance [Bu], [BG], [Mi], [Sh3], [14], [Se1].

We start with $\mathbb{H}$, the Poincaré upper half-plane

$$\mathbb{H} = \{ z \in \mathbb{C} \mid y = \mathrm{Im}(z) > 0 \} \tag{14.1}$$

which is an open subset of $\mathbb{C}$ (in the next chapter, it will be seen as a Riemannian manifold of constant negative curvature with the Poincaré metric). There is an action of the group $G = SL(2, \mathbb{R})$ on $\mathbb{H}$ by linear fractional transformations

$$gz = \frac{az + b}{cz + d} \quad \text{for } g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \tag{14.2}$$

because

$$\mathrm{Im}(gz) = \frac{\mathrm{Im}(z)}{|cz + d|^2} \tag{14.3}$$

which is positive if $\mathrm{Im}(z) > 0$. The center of $G$, namely $\{\pm 1\}$, acts trivially, and one shows that the group of holomorphic automorphisms of $\mathbb{H}$ is the quotient $PSL(2, \mathbb{R}) = SL(2, \mathbb{R}) / \{\pm 1\}$. Note that this action is also isometric for the hyperbolic Poincaré metric; see Chapter 15. The action of this large group gives $\mathbb{H}$ a large amount of symmetry. A useful manifestation of this is the following property: define first the action of $G$ on the extended upper half-plane $\mathbb{H}^* = \mathbb{H} \cup \mathbb{R} \cup \{\infty\}$ by the same formula (14.2), in addition to the rules $g\infty = a/c$ if $c \neq 0$ and $g\infty = \infty$ if $c = 0$; then for any three distinct points $v$, $w$ and $z$ in $\mathbb{R} \cup \{\infty\}$, there exists $g \in G$ such that

$$gv = 0, \quad gw = 1, \quad gz = \infty.$$

Associated to a Riemannian metric is a natural invariant measure $d\mu$, which in the case of $\mathbb{H}$ is the hyperbolic measure

$$d\mu(z) = y^{-2} \, dx \, dy \quad \text{if } z = x + iy \tag{14.4}$$

which is invariant by $G$, i.e. for any $f$ integrable on $\mathbb{H}$ we have

$$\int_{\mathbb{H}} f(z) \, d\mu(z) = \int_{\mathbb{H}} f(gz) \, d\mu(z).$$

Arithmetic comes when we consider discrete subgroups of $G$ acting on $\mathbb{H}$ and the corresponding quotient spaces. One should keep in mind the analogy of $\mathbb{Z} \subset \mathbb{R}$ (or $\mathbb{Z}[i] \subset \mathbb{C}$). This is a very rich theory; there are many "different" discrete subgroups of $G$, and to classify them is almost the same as classifying all Riemann surfaces. For arithmetic applications, the most important examples are the congruence subgroups, particularly those which contain one of the so-called Hecke congruence subgroups $\Gamma_0(q)$ defined by

$$\Gamma_0(q) = \Big\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z}) \;\Big|\; c \equiv 0 \ (q) \Big\}.$$

The groups $\Gamma_0(q)$ have finite index in $SL(2, \mathbb{Z})$, precisely

$$[\Gamma_0(1) : \Gamma_0(q)] = q \prod_{p \mid q} (1 + p^{-1}).$$
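The index formula is easy to evaluate and to cross-check for small $q$. The sketch below compares it with the number of points of the projective line over $\mathbb{Z}/q\mathbb{Z}$; that the cosets of $\Gamma_0(q)$ in $SL(2,\mathbb{Z})$ correspond to these points (via the bottom row $(c : d)$) is a standard fact that is assumed here rather than taken from the text, and the function names are illustrative.

```python
from math import gcd


def prime_divisors(n):
    """Distinct prime divisors of n >= 1, in increasing order."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def index_gamma0(q):
    """q * prod_{p | q} (1 + 1/p), computed in exact integer arithmetic."""
    result = q
    for p in prime_divisors(q):
        result = result // p * (p + 1)
    return result


def projective_line_size(q):
    """Count pairs (c, d) mod q with gcd(c, d, q) = 1, up to multiplication by units."""
    pairs = sum(1 for c in range(q) for d in range(q) if gcd(gcd(c, d), q) == 1)
    units = sum(1 for u in range(1, q + 1) if gcd(u, q) == 1)
    return pairs // units
```

For instance `index_gamma0(11)` evaluates the product $11 \cdot (1 + 1/11)$, and the brute-force count gives the same value.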

For $q$ varying one obtains very interesting quotients $\Gamma_0(q) \backslash \mathbb{H}$ (their genus increases). One prefers to deal with compact Riemann surfaces; however, quotients of $\mathbb{H}$ are not always compact. It is even possible to have $\mathrm{Vol}(\Gamma \backslash \mathbb{H}) = +\infty$, where the volume refers to the natural measure induced from (14.4). For example, the group of translations by integers

has the vertical strip $0 < \mathrm{Re}(z) \leq 1$ "representing" the quotient $\Gamma_\infty \backslash \mathbb{H}$, and it has infinite area. More generally, a fundamental domain for the action of a discrete subgroup $\Gamma \subset G$ on $\mathbb{H}$ is a nice subset $F \subset \mathbb{H}$ which "almost" contains one point per orbit. Precisely, $F$ must satisfy the following conditions:
(1) $F$ is an open subset in $\mathbb{H}$.
(2) The closure $\bar{F}$ of $F$ in $\mathbb{C}$ intersects every orbit of $\Gamma$, i.e. for all $z \in \mathbb{H}$ there exists $\gamma \in \Gamma$ with $\gamma z \in \bar{F}$.
(3) No two points in $F$ are $\Gamma$-equivalent.
A fundamental domain always exists, but it is certainly not unique. One can even choose $F$ to be a polygon with (hyperbolic) geodesic sides. For the congruence subgroups $\Gamma_0(q)$, one can first consider the usual fundamental domain for $SL(2, \mathbb{Z})$,

$$F_1 = \{ z \in \mathbb{H} \mid |\mathrm{Re}(z)| < \tfrac{1}{2}, \ |z| > 1 \}$$

and then take the union of its images under coset representatives in $SL(2, \mathbb{Z})$:

$$\bigcup_{\gamma \in \Gamma_0(q) \backslash \Gamma_0(1)} \gamma F_1$$

is a fundamental domain for $\Gamma_0(q)$. (It is not necessarily connected; for other choices, see [15], Section 2.2.) On a picture of $F_1$, the non-compactness appears because $F_1$ touches the boundary $\mathbb{R} \cup \{\infty\}$ at infinity, so $(i, 2i, 3i, \ldots)$ is a sequence without a converging subsequence. However, by the Gauss-Bonnet formula or a simple direct computation, the volume of $SL(2, \mathbb{Z}) \backslash \mathbb{H}$ (which is the volume of any fundamental domain) remains finite, namely $\mathrm{Vol}(SL(2, \mathbb{Z}) \backslash \mathbb{H}) = \frac{\pi}{3}$, and the same is true for $\Gamma_0(q)$ with

$$\mathrm{Vol}(\Gamma_0(q) \backslash \mathbb{H}) = \frac{\pi}{3} [\Gamma_0(1) : \Gamma_0(q)].$$

For $\Gamma$ such that $\mathrm{Vol}(\Gamma \backslash \mathbb{H}) < +\infty$, the points at which $F$ touches the boundary are called cusps of $\Gamma$ (this intuitive description will be made more precise in Section 15.7 when we discuss the classification of elements in $G$), and they are finite in number (because to each cusp one can attach non-overlapping small neighborhoods in $\mathbb{H}$, which have fixed positive volume). From the above, $\infty$ is the only cusp for $SL(2, \mathbb{Z})$. So for $\Gamma_0(q)$, they are easily classified by considering the action of $\Gamma_0(q) \backslash \Gamma_0(1)$ on $\infty$. One finds that every cusp has a unique representative $\frac{a}{c}$ with $(a, c) = 1$, $c \geq 1$, $c \mid q$, and $a$ is modulo $(c, \frac{q}{c})$. Their number is therefore

$$h = \sum_{cd = q} \varphi((c, d)).$$

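This is equal to $\tau(q)$ if $q$ is squarefree. For any cusp $\mathfrak{a}$, its stability group $\Gamma_{\mathfrak{a}} = \{ \gamma \in \Gamma \mid \gamma \mathfrak{a} = \mathfrak{a} \}$ is infinite cyclic (up to $\pm 1$), generated by $\gamma_{\mathfrak{a}}$ say, and there is a $\sigma_{\mathfrak{a}} \in G$, called a scaling matrix, such that $\sigma_{\mathfrak{a}} \infty = \mathfrak{a}$ and

$$\sigma_{\mathfrak{a}}^{-1} \gamma_{\mathfrak{a}} \sigma_{\mathfrak{a}} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. \tag{14.5}$$

Using $\sigma_{\mathfrak{a}}$, most computations and properties related to the various cusps of $\Gamma$ can be studied for the cusp at infinity, with stability group generated by the translation $z \mapsto z + 1$, for the conjugate group $\sigma_{\mathfrak{a}}^{-1} \Gamma \sigma_{\mathfrak{a}}$ (it is an advantage of the Poincaré model over other models of the hyperbolic plane, for instance the unit disc, that it has such a distinguished cusp).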
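The cusp count $h = \sum_{cd = q} \varphi((c, d))$ for $\Gamma_0(q)$ stated above, and its specialisation to $\tau(q)$ for squarefree $q$, can be tabulated directly. This is a small illustrative sketch; the function names are not from the text and the totient is computed by brute force, which is adequate only for small $q$.

```python
from math import gcd


def euler_phi(n):
    """Euler's totient by direct count (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def num_cusps(q):
    """h = sum over factorizations q = c*d of phi(gcd(c, d))."""
    return sum(euler_phi(gcd(c, q // c)) for c in range(1, q + 1) if q % c == 0)


def num_divisors(q):
    """The divisor function tau(q)."""
    return sum(1 for c in range(1, q + 1) if q % c == 0)


def is_squarefree(q):
    """True if no square d*d with d >= 2 divides q."""
    return all(q % (d * d) != 0 for d in range(2, int(q ** 0.5) + 1))
```

For a prime level such as $q = 11$ this gives two cusps, while for $q = 9$ the term $\varphi((3, 3)) = 2$ makes the count exceed $\tau(9) = 3$.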

We can now define modular forms as holomorphic functions on $\mathbb{H}$ transforming in a simple way under the action of a discrete subgroup. Let $k \geq 1$ be an integer. Then we have an action (the "weight $k$ action") of $G = SL(2, \mathbb{R})$ on functions $f : \mathbb{H} \to \mathbb{C}$ by

$$(f |_k \, g)(z) = j(g, z)^{-k} f(gz), \quad \text{where } j(g, z) = cz + d \text{ for } g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \tag{14.6}$$

Let $q \geq 1$ be an integer, and $\chi$ a Dirichlet character modulo $q$ (not necessarily primitive). Clearly $\chi$ induces a character of $\Gamma_0(q)$ by $\chi(g) = \chi(d)$ for $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$.


A modular form of weight $k$, level $q$, and nebentypus (or character) $\chi$ is a holomorphic function $f$ on $\mathbb{H}$ which satisfies

$$f |_k \, \gamma = \chi(\gamma) f, \quad \text{for all } \gamma \in \Gamma_0(q) \tag{14.7}$$

and is holomorphic at all cusps of $\Gamma_0(q)$. The holomorphy at cusps is explained as follows. First, for a cusp $\mathfrak{a}$ of $\Gamma_0(q)$, the function $f_{\mathfrak{a}} = f |_k \, \sigma_{\mathfrak{a}}$ is seen by modularity to be periodic of period one, $f_{\mathfrak{a}}(z + 1) = f_{\mathfrak{a}}(z)$. This implies now that $f_{\mathfrak{a}}$ is a function of the parameter $q = e(z)$, namely $f_{\mathfrak{a}}(z) = g_{\mathfrak{a}}(q)$, where $g_{\mathfrak{a}}$ is holomorphic in a punctured disc $\{ z \in \mathbb{C} \mid 0 < |z| < r \}$. We then say that $f$ is meromorphic at $\mathfrak{a}$ if $g_{\mathfrak{a}}$ is meromorphic at 0. Therefore $f_{\mathfrak{a}}$ has a Laurent series expansion

$$f_{\mathfrak{a}}(z) = \sum_{n \geq n_{\mathfrak{a}}} \hat{f}_{\mathfrak{a}}(n) e(nz)$$

for some integer $n_{\mathfrak{a}}$, and $\hat{f}_{\mathfrak{a}}(n_{\mathfrak{a}}) \neq 0$. We say that $f$ is holomorphic at $\mathfrak{a}$ if it is meromorphic and $n_{\mathfrak{a}} \geq 0$. If, moreover, we have $n_{\mathfrak{a}} > 0$, we say that $f$ vanishes at $\mathfrak{a}$. If $f$ vanishes at all cusps, it is called a cusp form.

EXERCISE 1. Show from the definition that a modular form $f$ is a cusp form if and only if the $\Gamma$-invariant function

$$g(z) = y^{k/2} |f(z)| \tag{14.8}$$

is bounded on $\mathbb{H}$ (this criterion is very convenient in practice).

EXERCISE 2. Show that if we take the same definition but $k = 0$, the corresponding space of modular forms is reduced to 0.

Clearly modular forms of fixed level, weight and character form a vector space, which is denoted $M_k(q, \chi)$, and the cusp forms a subspace denoted $S_k(q, \chi)$. Taking $\gamma = -1$, we see that $f$ is identically zero unless $\chi(-1) = (-1)^k$, so we assume that $\chi$ satisfies this consistency condition. The first fundamental result is that $M_k(q, \chi)$ is a finite dimensional vector space. This can be proved in general using the Riemann-Roch theorem for compact Riemann surfaces. In the case of $\Gamma_0(q)$, a simpler proof is possible. For $k \geq 2$, the dimension of $M_k(q, \chi)$ can be computed exactly (see e.g. [Sh3]), and then estimated very precisely (the case of forms of weight one is a tantalizing problem of deep arithmetical significance because of links with Artin $L$-functions, see [Se2], [Du1]). One shows for instance that

$$\dim M_k(q, \chi) = \frac{k - 1}{12} \nu_q + O(\sqrt{qk})$$

where $\nu_q = [\Gamma_0(1) : \Gamma_0(q)]$.

For any modular form $f$ we denote by $a_f(n)$ its Fourier coefficients at $\infty$, so that

$$f(z) = \sum_{n \geq 0} a_f(n) e(nz). \tag{14.9}$$

Those coefficients are very important objects (they are arithmetic harmonics), and their order of magnitude, in particular, is the subject of many inquiries. For cusp forms, the criterion that $y^{k/2} |f(z)|$ is bounded on $\mathbb{H}$ implies quickly the trivial bound

$$a_f(n) \ll n^{k/2} \tag{14.10}$$

which is good enough to start with (see below (14.54) for the much deeper Deligne bound). One also defines the Petersson inner product on $M_k(q, \chi)$ by

$$\langle f, g \rangle = \int_{\Gamma_0(q) \backslash \mathbb{H}} f(z) \overline{g(z)} \, y^k \, d\mu(z). \tag{14.11}$$

This is well-defined because the integrand is $\Gamma_0(q)$-invariant, as a simple computation reveals (compare (14.8)), and the integral is finite as soon as one of $f$ or $g$ is a cusp form. In particular, $S_k(q, \chi)$ is a (finite dimensional) Hilbert space with this inner product, and this analytic fact is very important in what follows (see the definition of "newforms" in Section 14.7). Notice also that it is possible to multiply modular forms of the same level: if $f \in M_k(q, \chi)$ and $g \in M_{k'}(q, \chi')$, we have $fg \in M_{k + k'}(q, \chi \chi')$. In particular, the direct sum of all $M_k(q)$, $k \geq 0$, is a graded $\mathbb{C}$-algebra. This important algebraic fact will not be of much use in this book, however. It should be emphasized that in proving that a function is modular, the modularity formula (14.7) need only be proved for $\gamma$ in a family of generators of $\Gamma_0(q)$. Such groups are always finitely generated, so this is a finite set of conditions on $f$. For instance $SL(2, \mathbb{Z})$ is generated by the two elements

$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

acting as the translation $z \mapsto z + 1$ and as the inversion $z \mapsto -z^{-1}$ respectively, and the two conditions for $f$ to be modular become

$$f(z) = f(z + 1) \quad \text{and} \quad f\Big( \frac{-1}{z} \Big) = z^k f(z).$$

We now give examples of modular forms and in particular of cusp forms.

14.2. Eisenstein and Poincaré series.

The basic examples of modular forms are the Eisenstein and the Poincaré series. We only define those series associated to the cusp $\infty$ for simplicity, but similar constructions can be done at every cusp. Let $k > 2$ (to ensure absolute convergence) and put

$$E_k(z) = \sum_{\gamma \in \Gamma_\infty \backslash \Gamma_0(q)} \bar{\chi}(\gamma) j(\gamma, z)^{-k}.$$

Clearly, $E_k$ is in $M_k(q, \chi)$, as it is constructed by the averaging technique; holomorphy at the cusps is checked by explicit computation of the Fourier expansion, which also shows that $E_k$ is not a cusp form. It is called an Eisenstein series. More generally let $m \geq 0$. The $m$-th Poincaré series is obtained by the averaging technique applied to $e(mz)$:

$$P_m(z) = \sum_{\gamma \in \Gamma_\infty \backslash \Gamma_0(q)} \bar{\chi}(\gamma) j(\gamma, z)^{-k} e(m \gamma z)$$

for $k \geq 2$ (the convergence for $k = 2$ is not absolute but can be seen for $m \geq 1$ from the Fourier expansion (14.12) below, using the Weil bound for Kloosterman sums (1.60)). We do not consider $m < 0$ as the series diverges then. The important fact is that for $m \geq 1$, we get cusp forms.

PROPOSITION 14.1. (1) If $m = 0$, then $P_0 = E_k$.
(2) If $m \geq 1$, then $P_m$ is a cusp form in $S_k(q, \chi)$.

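We sketch part of the proof, namely the following lemma.

LEMMA 14.2. The Poincaré series $P_m$ has Fourier expansion

$$P_m(z) = \delta(m, 0) + \sum_{n \geq 1} p(m, n) e(nz) \tag{14.12}$$

with coefficients

$$p(0, n) = \Big( \frac{2\pi}{i} \Big)^k \frac{n^{k-1}}{\Gamma(k)} \sum_{\substack{c > 0 \\ c \equiv 0 \ (\mathrm{mod}\ q)}} c^{-k} S_\chi(0, n; c)$$

if $m = 0$, and

$$p(m, n) = \Big( \frac{n}{m} \Big)^{\frac{k-1}{2}} \Big( \delta(m, n) + 2\pi i^{-k} \sum_{\substack{c > 0 \\ c \equiv 0 \ (\mathrm{mod}\ q)}} c^{-1} S_\chi(m, n; c) J_{k-1}\Big( \frac{4\pi \sqrt{mn}}{c} \Big) \Big)$$

if $m \geq 1$. Here $S_\chi(m, n; c)$ is the Kloosterman sum with character $\chi$,

$$S_\chi(m, n; c) = \sideset{}{^*}\sum_{d \ (\mathrm{mod}\ c)} \bar{\chi}(d) \, e\Big( \frac{m \bar{d} + n d}{c} \Big) \tag{14.13}$$

and $J_{k-1}$ is the Bessel function of order $k - 1$.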
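For readers who want to experiment with the Kloosterman sums appearing in (14.13), here is a minimal numerical sketch. It assumes the trivial character (so the factor $\bar{\chi}(d)$ is dropped), uses floating-point exponentials, and the function name is illustrative. The bound $|S(m, n; p)| \leq 2\sqrt{p}$ for a prime $p$ not dividing $mn$ is Weil's theorem, to which the text refers as (1.60).

```python
import cmath
from math import gcd, pi


def kloosterman(m, n, c):
    """S(m, n; c) for the trivial character: sum of e((m*dbar + n*d)/c) over units d mod c."""
    total = 0j
    for d in range(1, c + 1):
        if gcd(d, c) == 1:
            dbar = pow(d, -1, c)  # inverse of d modulo c (requires Python 3.8+)
            total += cmath.exp(2j * pi * ((m * dbar + n * d) % c) / c)
    return total
```

For the trivial character the sum is real (pair $d$ with $-d$) and symmetric in $m$ and $n$ (replace $d$ by $\bar{d}$), which gives two cheap consistency checks besides the Weil bound.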

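PROOF. We first make a convenient parametrization of the cosets $\Gamma_\infty \backslash \Gamma_0(q)$, namely using the action of an element of $\Gamma_\infty$ on the right and on the left, we see that $\Gamma_0(q)$ is the disjoint union

$$\Gamma_0(q) = \Gamma_\infty \cup \bigcup_{\substack{c > 0 \\ c \equiv 0 \ (\mathrm{mod}\ q)}} \ \bigcup_{\substack{d \ (\mathrm{mod}\ c) \\ (c, d) = 1}} \Gamma_\infty \begin{pmatrix} a & b \\ c & d \end{pmatrix} \Gamma_\infty \tag{14.14}$$

(where for given $c$ and $d$ with $(c, d) = 1$, $a$ and $b$ are any two integers such that $ad - bc = 1$). Thus we have correspondingly

$$P_m(z) = e(mz) + \sum_{\substack{c > 0 \\ c \equiv 0 \ (\mathrm{mod}\ q)}} \ \sideset{}{^*}\sum_{d \ (\mathrm{mod}\ c)} \bar{\chi}(d) I(c, d; z)$$

where

$$I(c, d; z) = \sum_{n \in \mathbb{Z}} \big( c(z + n) + d \big)^{-k} e\Big( \frac{am}{c} - \frac{m}{c (c(z + n) + d)} \Big).$$

Next, by the Poisson summation formula

$$I(c, d; z) = \sum_{n \in \mathbb{Z}} \int_{\mathbb{R}} \big( c(z + v) + d \big)^{-k} e\Big( \frac{am}{c} - \frac{m}{c (c(z + v) + d)} - nv \Big) dv$$

$$= \sum_{n \in \mathbb{Z}} e\Big( \frac{am + nd}{c} \Big) \Big\{ \int_{-\infty + iy}^{+\infty + iy} (cv)^{-k} e\Big( \frac{-m}{c^2 v} - nv \Big) dv \Big\} e(nz).$$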

Here we observe that the inner integral vanishes for $n \leq 0$ (move the line of integration upwards; the case $n = 0$ is the most important, so the reader is encouraged to check carefully). Then putting together the different terms, and recognizing the Bessel function in the integral (see [GR, 8.315.1, 8.412.2]), we obtain the lemma. $\Box$

EXERCISE 3. Prove the decomposition (14.14).

Notice that there are infinitely many Poincaré series $P_m$, but they all live in the finite dimensional space $S_k(q, \chi)$. This means that there must be many linear relations between the $P_m$'s. On the other hand, such abundance tends to suggest that the Poincaré series should span the whole space. This is indeed the case.

LEMMA 14.3. Let $f \in M_k(q, \chi)$ be a modular form with expansion (14.9). Then for any $m \geq 1$,

$$\langle f, P_m \rangle = \frac{\Gamma(k - 1)}{(4\pi m)^{k-1}} a_f(m).$$

COROLLARY 14.4. The Poincaré series with $m \neq 0$ span $S_k(q, \chi)$.

Indeed, any cusp form $f$ orthogonal to every $P_m$ must have all its Fourier coefficients at $\infty$ equal to zero, and hence must be zero. Because the space spanned by Poincaré series is closed (this uses crucially the finite-dimensionality of the space) the result follows.

PROOF OF LEMMA 14.3. By definition of $P_m$ we have

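$$\langle f, P_m \rangle = \int_F f(z) \, \overline{\Big\{ \sum_{\gamma \in \Gamma_\infty \backslash \Gamma_0(q)} \bar{\chi}(\gamma) j(\gamma, z)^{-k} e(m \gamma z) \Big\}} \, y^k \, d\mu(z)$$

$$= \sum_{\gamma \in \Gamma_\infty \backslash \Gamma_0(q)} \int_F \mathrm{Im}(\gamma z)^k f(\gamma z) \overline{e(m \gamma z)} \, d\mu(z)$$

(by modularity of $f$ and $\mathrm{Im}(\gamma z) = |j(\gamma, z)|^{-2} \, \mathrm{Im}(z)$)

$$= \int_0^1 \int_0^{+\infty} y^{k-2} f(z) e(-m \bar{z}) \, dx \, dy$$

$$= \sum_{n \geq 0} a_f(n) \int_0^{+\infty} y^{k-2} e^{-2\pi (n + m) y} \, dy \int_0^1 e((n - m) x) \, dx$$

$$= \frac{\Gamma(k - 1)}{(4\pi m)^{k-1}} a_f(m). \qquad \Box$$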

Taken together, the two previous lemmas imply a formula which shows how the sequences of Fourier coefficients of a basis of $S_k(q, \chi)$ are almost orthogonal, thereby giving another decomposition in arithmetic harmonics of the diagonal symbol $\delta(m, n)$. Let $\mathcal{F}$ be any orthonormal basis of $S_k(q, \chi)$ (distinguished bases will be described in Section 14.7). Since $P_m$ is in $S_k(q, \chi)$, we can expand it as a linear combination of the $f \in \mathcal{F}$, namely we have

$$P_m = \sum_{f \in \mathcal{F}} \langle f, P_m \rangle f$$

so that by taking the $n$-th Fourier coefficient of both sides (on the right with Lemma 14.3, on the left with Lemma 14.2) we obtain the Petersson formula

PROPOSITION 14.5 (PETERSSON). For any $n \geq 1$ and $m \geq 1$,

$$\frac{\Gamma(k - 1)}{(4\pi \sqrt{mn})^{k-1}} \sum_{f \in \mathcal{F}} \overline{a_f(n)} \, a_f(m) = \delta(m, n) + 2\pi i^{-k} \sum_{\substack{c > 0 \\ c \equiv 0 \ (\mathrm{mod}\ q)}} c^{-1} S_\chi(m, n; c) J_{k-1}\Big( \frac{4\pi \sqrt{mn}}{c} \Big). \tag{14.15}$$

EXAMPLE. The "first" cusp form occurs for level 1 and weight 12; it is the famous Ramanujan $\Delta$ function, which has the elegant product expansion

$$\Delta(z) = q \prod_{n \geq 1} (1 - q^n)^{24}, \quad \text{with } q = e(z)$$

(see e.g. [Se1] for one of many direct proofs of the modularity). The existence of a non-zero $f \in S_{12}(1)$ can be predicted by considering the function $f = E_4^3 - E_6^2$, where $E_4$ and $E_6$ are the Eisenstein series of level 1 and weight 4 and 6. These have the Fourier expansions (at the only cusp $\infty$)

$$E_4(z) = 1 + 240 \sum_{n \geq 1} \sigma_3(n) e(nz)$$

$$E_6(z) = 1 - 504 \sum_{n \geq 1} \sigma_5(n) e(nz)$$

so that a direct computation of the small order terms gives

$$E_4^3 - E_6^2 = 1728 q + O(q^2)$$

whence $f$ is non-zero and is a cusp form. Since $\dim S_{12}(1) = 1$, the equality $f = 1728 \Delta$ must follow once we know that $\Delta$ is indeed modular. The coefficients in the expansion of $\Delta$ are denoted $\tau(n)$ (not to be mistaken with the divisor function). These are integers having many fascinating properties, for example,

$$\tau(n) \equiv \sum_{d \mid n} d^{11} \mod 691.$$
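The identities in this example lend themselves to a direct check with truncated power series. The following sketch (helper names are illustrative) expands the product for $\Delta$, forms $E_4^3 - E_6^2$ from the stated expansions, and tests both $E_4^3 - E_6^2 = 1728 \Delta$ and the congruence modulo 691 on the first few coefficients; it verifies finitely many terms only and proves nothing.

```python
def sigma(k, n):
    """Sum of the k-th powers of the divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)


def multiply(a, b, size):
    """Product of two power series (coefficient lists), truncated to `size` terms."""
    c = [0] * size
    for i, ai in enumerate(a[:size]):
        if ai:
            for j, bj in enumerate(b[:size - i]):
                c[i + j] += ai * bj
    return c


def delta_coefficients(size):
    """tau(0), ..., tau(size-1) from q * prod (1 - q^n)^24, with tau(0) = 0."""
    series = [1] + [0] * (size - 1)
    for n in range(1, size):
        factor = [0] * size
        factor[0], factor[n] = 1, -1        # the series 1 - q^n
        for _ in range(24):
            series = multiply(series, factor, size)
    return [0] + series[:size - 1]          # multiply by q


def e4_cubed_minus_e6_squared(size):
    """Coefficients of E_4^3 - E_6^2 up to q^(size-1)."""
    e4 = [1] + [240 * sigma(3, n) for n in range(1, size)]
    e6 = [1] + [-504 * sigma(5, n) for n in range(1, size)]
    cube = multiply(multiply(e4, e4, size), e4, size)
    square = multiply(e6, e6, size)
    return [x - y for x, y in zip(cube, square)]
```

With `tau = delta_coefficients(12)` the list begins 0, 1, -24, 252, -1472, 4830, the familiar first values of Ramanujan's function.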

This kind of congruence belongs properly to the theory of Galois representations (see e.g. [Se3]). Evidently, $\Delta$ must also be proportional to the non-zero Poincaré series in $S_{12}(1)$, but it does not arise naturally in this way. Also, the construction of $f$ exploits the fact that holomorphic modular forms can be multiplied. Actually, it is not very difficult to show that the graded ring $\bigoplus_{k \geq 0} M_k(1)$ is isomorphic to the polynomial ring $\mathbb{C}[E_4, E_6]$, and $E_4$ and $E_6$ are algebraically independent.

14.3. Theta functions.

The Eisenstein and the Poincaré series are constructed by averaging over the group cosets. Theta functions provide another important class of examples of modular forms which come out in quite a different way. The most basic theta series is Jacobi's $\theta$ function, and equation (1.53) is indeed a modularity property of weight $\frac{1}{2}$. However, we have not defined half-integral weight forms (see for instance [Sh1], [14] for this theory), so we square $\theta$ and obtain a weight 1 form

$$\theta^2(z) = \sum_{n \geq 0} r(n) e(nz).$$
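The coefficients $r(n)$ of $\theta^2$ can be produced by literally squaring a truncated theta series. The sketch below assumes the normalisation $\theta(z) = \sum_{m \in \mathbb{Z}} e(m^2 z)$, so that $r(n)$ counts representations $n = a^2 + b^2$ with $a, b \in \mathbb{Z}$; it then compares the result with the classical divisor-sum formula $r(n) = 4 \sum_{d \mid n} \chi_4(d)$, which is quoted here as a known fact and is not stated in this passage. Function names are illustrative.

```python
from math import isqrt


def theta_squared(size):
    """Coefficients r(0), ..., r(size-1) of theta(z)^2, theta(z) = sum over m of e(m^2 z)."""
    theta = [0] * size
    bound = isqrt(size - 1)
    for m in range(-bound, bound + 1):
        theta[m * m] += 1
    r = [0] * size
    for i in range(size):
        if theta[i]:
            for j in range(size - i):
                r[i + j] += theta[i] * theta[j]
    return r


def chi4(d):
    """The non-trivial Dirichlet character modulo 4."""
    return (0, 1, 0, -1)[d % 4]


def r_from_divisors(n):
    """4 * sum of chi4(d) over the divisors d of n, for n >= 1."""
    return 4 * sum(chi4(d) for d in range(1, n + 1) if n % d == 0)
```

For example $r(25) = 12$, coming from $(\pm 5, 0)$, $(0, \pm 5)$, $(\pm 3, \pm 4)$ and $(\pm 4, \pm 3)$.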

More generally we consider first theta series attached to binary quadratic forms. Definite forms are related to holomorphic theta series, while indefinite ones would give non-holomorphic forms, which were first constructed by Maass [Ma]. Recall the notation of Section 3.8: K =

$$\theta_A(z) = \frac{1}{w} + \sum_{\mathfrak{a} \in A} e(z N\mathfrak{a}) \tag{14.16}$$

where $\mathfrak{a}$ runs over the non-zero integral ideals belonging to the class $A$. As shown in Section 22.2, there is a one-to-one correspondence between ideal classes and classes of definite binary quadratic forms of discriminant $D$. Let $\Phi_A = [a, b, c]$ be the quadratic form associated to $A$. If $\mathfrak{b} \in A$, we have $\mathfrak{b} = (\alpha) \mathfrak{a}$ for some $\alpha \in \mathfrak{a}^{-1}$, $\alpha = m + nz$. Then $N\mathfrak{b} = \Phi_A(m, n)$, so taking the units into account we obtain the formula

$$\theta_A(z) = \frac{1}{w} \sum_{m, n \in \mathbb{Z}} e(z \Phi_A(m, n)). \tag{14.17}$$

Poisson summation (in two variables) shows again that $\theta_A$ is modular, precisely

$$\theta_A \in M_1(|D|, \chi_D),$$

where $\chi_D$ is the Kronecker symbol (the Dirichlet character associated to the field $K$). It also follows that

From (14.16) or (14.17) we see that $\theta_A$ is not a cusp form. For a class group character $\chi \in \hat{\mathcal{H}}$ we define the theta series

$$f_\chi(z) = \sum_{\mathfrak{a}} \chi(\mathfrak{a}) e(z N\mathfrak{a})$$

where the sum is over all integral ideals of $\mathcal{O}$ this time. Splitting into classes we derive

$$f_\chi(z) = \sum_{A \in \mathcal{H}} \chi(A) \theta_A(z)$$


for $\chi \neq 1$, so $f_\chi \in M_1(|D|, \chi_D)$ also. Computing the Fourier expansion at other cusps, one proves that $f_\chi$ is a cusp form if $\chi^2 \neq 1$. Recall that the characters of order 2 of $\mathcal{H}$ are called genus characters. They were determined (essentially) by Gauss, who showed that the subgroup of genus characters is isomorphic to $(\mathbb{Z}/2\mathbb{Z})^{t-1}$, where $t = \omega(|D|)$ is the number of distinct prime divisors of $|D|$. For $\chi$ not a genus character, we have $f_\chi \in S_1(|D|, \chi_D)$, and it also satisfies

$$f_\chi\Big( \frac{-1}{|D| z} \Big) = -i \sqrt{|D|} \, z \, f_\chi(z).$$

All class group characters are particular cases of Hecke characters of $K$ (see Section 3.8), so they have an $L$-function (over $K$) attached to them

and by Mellin inversion the completed $L$-function

$$\Lambda_K(s, \chi) = \Big( \frac{\sqrt{|D|}}{2\pi} \Big)^s \Gamma(s) L_K(s, \chi)$$

is expressed by the formula

A few other representations of $\Lambda_K(s, \chi)$ are given in Section 22.3. From the above integral representation we derive, as for the Riemann zeta function or the Dirichlet $L$-function, that $L_K(s, \chi)$ has meromorphic continuation to the whole complex plane and is entire if $\chi \neq 1$. Moreover, we have the functional equation

Much more general theta series are obtained by considering any integral positive definite quadratic form, and associated harmonic polynomials in $n$ variables, instead of only binary quadratic forms. However, there is no interpretation in terms of ideal classes of number fields in the general case (for 4 variables, of course, there are links with quaternion algebras). Let $A = (a_{ij})$ be a real, symmetric, positive definite matrix of rank $r \equiv 0 \pmod 2$, and

$$A[x] = \sum_{i, j} a_{ij} x_i x_j$$

the corresponding quadratic form (the theory of forms in an odd number of variables is somewhat harder because of a complicated multiplier system for the corresponding modular forms). A homogeneous polynomial $P \in \mathbb{C}[X]$ is called a harmonic polynomial if $\Delta_A P = 0$, where

$$\Delta_A = \sum_{i, j} a^{ij} \frac{\partial^2}{\partial x_i \partial x_j}$$

is the Laplace operator associated with the matrix $A^{-1} = (a^{ij})$. With this data we define the theta series

$$\theta(z) = \sum_{m \in \mathbb{Z}^r} P(m) \, e\Big( \frac{z}{2} A[m] \Big).$$


Suppose $A$ and $NA^{-1}$ have integral coefficients, and even diagonal, for some suitable integer $N \geq 1$. Then $\theta(z) \in M_k(N, \chi_D)$ where $k = \frac{r}{2} + \deg P$ and $\chi_D(d) = (\frac{D}{d})$ is the Kronecker symbol with $D = (-1)^{r/2} |A|$. Moreover, if $\deg P > 0$, then $\theta(z) = \theta(z; A, P)$ is a cusp form which satisfies

B(z; A, P) = i!i IAI-L-ke( -~; A-1,

P*)

where $P^*(x) = P(A^{-1} x)$ is a harmonic polynomial associated with $A^{-1}$ (see also Section 20.4 and, for instance, [14], Chapters 9 and 10 for complete proofs).

14.4. Modular forms associated to elliptic curves.

Cusp forms associated to elliptic curves have both arithmetic and analytic flavors. We begin by introducing the basic invariants of elliptic curves over an arbitrary field (see [Sil] and also Section 11.9). Any elliptic curve $E$, defined over a field $k$, has a plane model given by a Weierstrass equation

$$E : y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6 \tag{14.18}$$

with coefficients $a_i \in k$. Associated with the $a$-coefficients is the set of coefficients $b_2, b_4, b_6, b_8$ given by $b_2 = a_1^2 + 4a_2$, $b_4 = a_1 a_3 + 2a_4$, $b_6 = a_3^2 + 4a_6$, $b_8 = a_1^2 a_6 - a_1 a_3 a_4 + 4 a_2 a_6 + a_2 a_3^2 - a_4^2$. The $b$-coefficients are related by $4 b_8 = b_2 b_6 - b_4^2$. Having these quantities we express the discriminant of (14.18) by

$$\Delta = -b_2^2 b_8 - 8 b_4^3 - 27 b_6^2 + 9 b_2 b_4 b_6. \tag{14.19}$$

The cubic equation (14.18) defines an elliptic curve over $k$ if and only if it is non-singular, which means that its discriminant $\Delta$ does not vanish. To simplify equations we associate with (14.18) another set of coefficients $c_4, c_6$ given by $c_4 = b_2^2 - 24 b_4$, $c_6 = -b_2^3 + 36 b_2 b_4 - 216 b_6$. In these terms we have

$$1728 \Delta = c_4^3 - c_6^2. \tag{14.20}$$

REMARKS. The first set of $b$-coefficients arises from completing the square. Assuming the characteristic of $k$ is not 2, we can change the coordinates $(x, y)$ in (14.18) into $(x, y + \frac{a_1}{2} x + \frac{a_3}{2})$ getting

$$E : y^2 = x^3 + \frac{b_2}{4} x^2 + \frac{b_4}{2} x + \frac{b_6}{4}. \tag{14.21}$$

The second set of $c$-coefficients arises from completing the cube. If further the characteristic is not 3, we can change the coordinates $(x, y)$ in (14.21) into $(x - \frac{b_2}{12}, y)$ getting

$$E : y^2 = x^3 - \frac{c_4}{48} x - \frac{c_6}{864} = x^3 + ax + b, \tag{14.22}$$

say. This has the discriminant $\Delta = 12^{-3} (c_4^3 - c_6^2) = -16 (4a^3 + 27 b^2)$. In this equation we change further $(x, y)$ into $(x, \frac{y}{2})$ and multiply through by 4 getting

$$E : y^2 = 4x^3 + Ax + B \tag{14.23}$$

with $A = -\frac{c_4}{12} = 4a$ and $B = -\frac{c_6}{216} = 4b$. Now the discriminant of (14.23) is simply given by

$$\Delta = -A^3 - 27 B^2 \tag{14.24}$$

(notice that $1728 = 12^3$).
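The chain of invariants $a_i \to b_i \to c_4, c_6, \Delta$ is mechanical and easy to get wrong by hand, so a short script is a convenient check. This is a sketch only: the function name and the dictionary layout are illustrative, and exact rational arithmetic via `Fraction` is used so that the relations $4 b_8 = b_2 b_6 - b_4^2$ and $1728 \Delta = c_4^3 - c_6^2$ can be tested as identities.

```python
from fractions import Fraction


def invariants(a1, a2, a3, a4, a6):
    """b-, c-coefficients and discriminant of y^2 + a1 xy + a3 y = x^3 + a2 x^2 + a4 x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = a1 * a3 + 2 * a4
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 - a1 * a3 * a4 + 4 * a2 * a6 + a2 * a3 * a3 - a4 * a4
    disc = -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    c4 = b2 * b2 - 24 * b4
    c6 = -b2 ** 3 + 36 * b2 * b4 - 216 * b6
    return {"b2": b2, "b4": b4, "b6": b6, "b8": b8, "c4": c4, "c6": c6, "disc": disc}


def short_form_discriminant(a, b):
    """Discriminant -16(4a^3 + 27b^2) of y^2 = x^3 + ax + b."""
    return -16 * (4 * a ** 3 + 27 * b ** 2)
```

As a usage example, `invariants(0, -1, 1, 0, 0)` describes the curve $y^2 + y = x^3 - x^2$ and returns discriminant $-11$ and $c_4 = 16$.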

We return to the general equation (14.18) over any field $k$. We define the so-called $j$-invariant by

$$j = \frac{c_4^3}{\Delta}; \tag{14.25}$$

this makes sense because we assume $\Delta \neq 0$. The $j$-invariant plays an important role in the classification of elliptic curves. First we single out two special cases.

CASE $j = 0$. This is equivalent to $c_4 = 0$, therefore if $E$ is given by (14.23), then $A = 0$ and $\Delta = -27 B^2$.

CASE $j = 1728$. This is equivalent to $c_6 = 0$, therefore if $E$ is given by (14.23), then $B = 0$ and $\Delta = -A^3$.

EXAMPLE 1. Suppose $j \neq 0, 1728$. Then the equation

$$E : y^2 + xy = x^3 - \frac{36}{j - 1728} x - \frac{1}{j - 1728} \tag{14.26}$$

gives an elliptic curve over any field $k$ which contains $j$; its $j$-invariant is equal to $j$, the $c$-coefficients are $c_4 = j (j - 1728)^{-1}$, $c_6 = -j (j - 1728)^{-1}$ and the discriminant is $\Delta = j^2 (j - 1728)^{-3}$. Also the cubic equation

$$E' : y^2 + jxy = x^3 - \frac{j(j - 1)}{4} x^2 - \frac{36 j^2}{j - 1728} x - \frac{j^3}{j - 1728} \tag{14.27}$$

(which is the quadratic twist of (14.26) by $j$) gives an elliptic curve over the field $k$; its invariants are $c_4 = j^3 (j - 1728)^{-1}$, $c_6 = -j^4 (j - 1728)^{-1}$ and $\Delta = j^8 (j - 1728)^{-3}$.

EXAMPLE 2. Suppose $j \neq 0, 1728$ and $\mathrm{char}\, k \neq 2, 3$. Then the cubic equation

$$E : y^2 = 4x^3 - \frac{27 j}{j - 1728} x - \frac{27 j}{j - 1728} \tag{14.28}$$

with $j \in k$ gives an elliptic curve over $k$ whose $j$-invariant is equal to $j$ and the discriminant is $\Delta = 2^6 \cdot 3^{12} \cdot j^2 (j - 1728)^{-3}$.

From now on we consider elliptic curves E defined over

(1) Multiplicative reduction (or semistable) if $p \mid \Delta$ and $p \nmid c_4$ (this means $E / \mathbb{F}_p$ has a node).
(2) Additive reduction (or unstable) if $p \mid \Delta$ and $p \mid c_4$ (this means $E / \mathbb{F}_p$ has a cusp).
The multiplicative reduction is said to be split or non-split if the slopes of the tangent lines at the node are rational or not, respectively. For every $p$ we define the conductor exponent $f_p$ as follows:
(1) $f_p = 0$ if $E$ has good reduction at $p$.
(2) $f_p = 1$ if $E$ has multiplicative reduction at $p$.
(3) $f_p = 2$ if $E$ has additive reduction at $p \neq 2, 3$.
When $E$ has additive reduction at $p = 2$ or $p = 3$, the definition of $f_p$ is quite involved (see [Sil]), but we always have $f_p \geq 2$, and $f_p - 2$ is a certain "measure of wild ramification". In these cases one can show that $f_2 \leq 8$ and $f_3 \leq 5$. Having the exponents $f_p$ for all $p$, the conductor of $E / \mathbb{Q}$ is set to be

$$N = \prod_p p^{f_p}. \tag{14.30}$$

Note that $N$ is squarefree if and only if $E$ has no additive reduction at all places. In this case $E$ is called semistable. Hasse defined the $L$-function of $E$ as an Euler product of local factors defined by looking at the reduction $E_p$ of $E$ modulo $p$ for all primes $p$, which is a curve over the finite field $\mathbb{F}_p$. The local zeta function is defined by the formal power series

$$Z_p(E) = \exp\Big( \sum_{n \geq 1} |E(\mathbb{F}_{p^n})| \frac{X^n}{n} \Big)$$

where $|E(\mathbb{F}_{p^n})|$ is the number of points of $E_p$ in a finite field of cardinality $p^n$ (with the point at infinity included). Let $a(p)$ be the integer defined by $|E(\mathbb{F}_p)| = p + 1 - a(p)$. Notice that if $E$ is given by a simpler Weierstrass equation

$$y^2 = x^3 + ax + b$$

(to which one can reduce for $p \neq 2$ or 3), then $a(p)$ can be expressed by the quadratic character sum

$$a(p) = -\sum_{x \bmod p} \Big( \frac{x^3 + ax + b}{p} \Big).$$

For $p \mid \Delta$, the coefficient $a(p)$ is given by

$$a(p) = \begin{cases} 0 & \text{if } E \text{ has additive reduction}, \\ 1 & \text{if } E \text{ has split multiplicative reduction}, \\ -1 & \text{if } E \text{ has non-split multiplicative reduction}. \end{cases}$$

Hasse proved that $Z_p(E)$ is a rational function for every $p$, and for $p \nmid \Delta$ (the reduced curve is then smooth)

$$Z_p(E) = \frac{1 - a(p) X + p X^2}{(1 - X)(1 - pX)}. \tag{14.31}$$


Moreover, Hasse proved the Riemann Hypothesis for $E_p$ in the form of the inequality

$$|a(p)| \leq 2 \sqrt{p} \tag{14.32}$$

which implies that both roots of the numerator of $Z_p$ have modulus equal to $1 / \sqrt{p}$. See Section 11.8 for elementary proofs of (14.31) and (14.32). The Hasse-Weil zeta function of $E$, already mentioned in Section 5.14, is defined by the Euler product

$$L(E, s) = \prod_{p \mid \Delta} (1 - a(p) p^{-s})^{-1} \prod_{p \nmid \Delta} (1 - a(p) p^{-s} + p^{1 - 2s})^{-1}. \tag{14.33}$$

Define $a(n)$ for all $n \geq 1$ by expanding (14.33) into the Dirichlet series

$$L(E, s) = \sum_{n = 1}^{\infty} a(n) n^{-s}. \tag{14.34}$$
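The local coefficients $a(p)$ entering (14.33) are easy to compute for small primes. The sketch below, with illustrative function names, evaluates $a(p)$ for a short Weierstrass equation $y^2 = x^3 + ax + b$ in two ways, by the quadratic character sum and by a naive point count, and the Hasse inequality (14.32) can then be tested at primes of good reduction. Both routines are brute force and meant for small odd $p$ only.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def a_p_character_sum(a, b, p):
    """a(p) = - sum over x mod p of the Legendre symbol of x^3 + a x + b."""
    return -sum(legendre(x ** 3 + a * x + b, p) for x in range(p))


def a_p_point_count(a, b, p):
    """a(p) = p + 1 - #E(F_p), counting affine solutions plus the point at infinity."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y - x ** 3 - a * x - b) % p == 0
    )
    return p + 1 - (affine + 1)
```

For $y^2 = x^3 + x + 1$ over $\mathbb{F}_5$ there are eight affine points and one point at infinity, so both functions return $a(5) = -3$.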

This $L$-function is holomorphic in the region $\mathrm{Re}(s) > \frac{3}{2}$ by the Riemann hypothesis (14.32). Hasse further conjectured that the $L$-function thus defined from local data could be extended to an entire function satisfying the functional equation $\Lambda(E, s) = w \Lambda(E, 2 - s)$ where $w = \pm 1$ and

$$\Lambda(E, s) = \Big( \frac{\sqrt{N}}{2\pi} \Big)^s \Gamma(s) L(E, s),$$

$N$ being the conductor of $E$. This functional equation is the same as that satisfied by the Hecke $L$-functions of cusp forms of weight 2 and level $N$ with trivial character (see Section 14.7). Around 1950, Shimura and Taniyama made the conjecture that for any $E / \mathbb{Q}$, there is a cusp form whose $L$-function is equal to $L(E, s)$. Later Weil refined the conjecture by postulating that the level of this form should equal exactly the conductor of $E$. This refined conjecture can be tested computationally since the space of cusp forms of a given level, weight and character has finite dimension. This deep modularity conjecture was proved in 1995 by Wiles [W], Wiles-Taylor [TW] for semistable curves (with Fermat's Great Theorem as a consequence, as had been shown before by Ribet), and is now known for all elliptic curves over $\mathbb{Q}$, the final work being due to Breuil, Conrad, Diamond and Taylor [BDCT].

THEOREM 14.6. Let $E / \mathbb{Q}$ be any elliptic curve of conductor $N$. There exists a primitive cusp form

$$f(z) = \sum_{n = 1}^{\infty} \lambda(n) n^{1/2} e(nz) \in S_2(\Gamma_0(N)) \tag{14.35}$$

such that $a(n) = \lambda(n) n^{1/2}$, that is

$$L(E, s + \tfrac{1}{2}) = L(f, s) = \sum_{n = 1}^{\infty} \lambda(n) n^{-s}. \tag{14.36}$$

Since the cusp form $f$ associated to $E$ is primitive of level $N$ it is an eigenfunction of the Fricke involution $(Wf)(z) = N^{-1} z^{-2} f(-1/Nz)$, i.e.,

$$Wf = -wf, \quad \text{with } w = \pm 1. \tag{14.37}$$

Hence the functional equation of the completed $L$-function is

$$\Lambda(f, s) = w \Lambda(f, 1 - s). \tag{14.38}$$

The sign $w = \pm 1$ in this equation is also called the root number of $E$. It is not difficult to compute $w$ by computer for any concrete curve, yet there is no simple formula for $w$ in general. In the important special case of semistable curves, that is when $N$ is squarefree, we obtain by the theory of Hecke operators

$$w = -\mu(N) a(N) = -\mu(N) \lambda(N) N^{1/2} = -(-1)^m \tag{14.39}$$

where $m$ is the number of places with split multiplicative reduction. The celebrated Birch and Swinnerton-Dyer Conjecture asserts that $L(E, s)$ (resp. $L(f, s)$) vanishes at $s = 1$ (resp. $s = \frac{1}{2}$) to order exactly equal to the rank of the group $E(\mathbb{Q})$ of rational points on $E$. Though $r = \mathrm{rank}\, E(\mathbb{Q})$ is not easy to compute, its parity is conjectured to be determined by

$$w = (-1)^r, \tag{14.40}$$

and this has been proved by Nekovar [Ne] under the assumption that the TateShafarevitch group of E is finite (which is known if the order of vanishing of L(E, s) at s = 1 is ( 1). Proofs of the modularity of special elliptic curves existed before, notably for the curves with complex multiplication, due to Deuring (for a self-contained proof in the case of the "congruent number" curves, see for instance [14], Chapter 8). An early example (due to Shimura) of a modular curve which is not a CM-curve is the following elliptic curve E : y2 + y = x3 - x2 whose j-invariant is j = -4096/11 and discriminant is/::;.= -11. It follows from the properties of the conductor that the level of the weight 2 cusp form f corresponding to this particular E must be 11. But it is easy to show that S 2 ( 11) is a !-dimensional space, and actually it is not too hard to construct an element in this space from the knowledge ofthe Ramanujan /::;. function, namely

f(z) = (!:;.(z)!:;.(11z))1/12 = q

IT (1- qn)2(1- qlln)2 n)l

(q = e(z) as before). In other words, the modularity of E is equivalent with the identity (14.41)

No elementary proof of this seems to be known. One proceeds in two steps. The first one consists of showing that $Y_0(11) = \Gamma_0(11)\backslash\mathbb{H}$ (which, suitably compactified, is a compact Riemann surface) is an algebraic curve isomorphic to the (affine) elliptic curve $E$; this is done by constructing meromorphic functions $g_1$ and $g_2$ on $X_0(11)$ (as quotients of modular forms of the same weight) which satisfy the equation of $E$, and goes back to Fricke and Klein. The second step, which is arithmetic and much less elementary, is to deduce from this the required identity of L-functions. This last argument is an example of the so-called Eichler-Shimura theory.
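The agreement between the coefficients of the product and the point counts $a(p) = p + 1 - |E(\mathbb{F}_p)|$ can at least be tested numerically. The following Python sketch (the function names are ours and purely illustrative) expands $q\prod(1-q^n)^2(1-q^{11n})^2$ and compares its prime coefficients with brute-force point counts on $y^2 + y = x^3 - x^2$ for a few primes $p \ne 11$, assuming the usual convention that the point at infinity is counted. It also evaluates (14.39) for $N = 11$, using $\mu(11) = -1$.

```python
def level11_eta_product(num_terms):
    """Return [a(0), ..., a(num_terms-1)] for q*prod_{n>=1}(1-q^n)^2 (1-q^(11n))^2."""
    series = [0] * num_terms
    series[0] = 1

    def times_one_minus_q_power(e):
        # Multiply the truncated series in place by (1 - q^e).
        for i in range(num_terms - 1, e - 1, -1):
            series[i] -= series[i - e]

    for n in range(1, num_terms):
        for _ in range(2):
            times_one_minus_q_power(n)
            times_one_minus_q_power(11 * n)
    # The leading factor q shifts every exponent up by one.
    return [0] + series[:-1]


def frobenius_trace(p):
    """a(p) = p + 1 - #E(F_p) for E: y^2 + y = x^3 - x^2 (infinity counted)."""
    affine_points = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - x ** 3 + x * x) % p == 0
    )
    return p + 1 - (affine_points + 1)


coefficients = level11_eta_product(40)
good_primes = [2, 3, 5, 7, 13, 17, 19, 23, 29, 31, 37]
matches = all(coefficients[p] == frobenius_trace(p) for p in good_primes)

# Root number from (14.39): w = -mu(11) * a(11), with mu(11) = -1.
root_number = -(-1) * coefficients[11]
```

Such a finite check is of course no proof of modularity; it only illustrates what the identity asserts coefficient by coefficient.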

Another interesting example is the Gross-Zagier curve given by the twisted Weierstrass equation $-139y^2 = x^3 + 10x^2 - 20x + 8$. This curve has conductor $N = 37 \cdot 139^2$ and the sign of the functional equation is $\varepsilon = -1$. Moreover, the rank of the group of rational points on this curve is 3, while one can show using the Gross-Zagier formula that the Hasse-Weil L-function vanishes at $s = 1$ to order exactly 3 (see Appendix to Chapter 23 for a sketch). Hence this example confirms the conjecture of Birch and Swinnerton-Dyer. In combination with an earlier work of Goldfeld it yields a non-trivial effective lower bound for the class number of imaginary quadratic fields, as will be explained in Chapter 23.

14.5. Hecke L-functions.

Hecke showed how to produce Dirichlet series out of modular forms satisfying many of the properties of Dirichlet L-functions. Consider $f \in M_k(q, \chi)$, so $f$ has the Fourier expansion at $\infty$

$$f(z) = \sum_{n \ge 0} a_f(n) e(nz).$$

One defines its Hecke L-function to be the Dirichlet generating series for the Fourier coefficients

$$L(f, s) = \sum_{n \ge 1} a_f(n) n^{-s}. \tag{14.42}$$

This L-function and the completed L-function

$$\Lambda(f, s) = \Big(\frac{\sqrt{q}}{2\pi}\Big)^{s}\,\Gamma(s)\,L(f, s) \tag{14.43}$$

are holomorphic in the region $\mathrm{Re}(s) > 1 + \tfrac{k}{2}$ by the trivial bound (14.10). We also introduce an operator $W$ (sometimes called the Fricke involution) on $M_k(q, \chi)$ by

$$Wf(z) = q^{-k/2} z^{-k} f(-1/qz), \tag{14.44}$$

which, because the element

$$H_q = \begin{pmatrix} 0 & -1 \\ q & 0 \end{pmatrix}$$

normalizes $\Gamma_0(q)$, i.e. $H_q\Gamma_0(q) = \Gamma_0(q)H_q$, induces maps $W : M_k(q, \chi) \to M_k(q, \bar\chi)$ and $W : S_k(q, \chi) \to S_k(q, \bar\chi)$.

THEOREM 14.7 (HECKE). With notation as above, the L-function of $f$ has a meromorphic continuation to the whole complex plane and the completed L-function satisfies the functional equation

$$\Lambda(f, s) = i^k \Lambda(Wf, k - s). \tag{14.45}$$

Moreover, $L(f, s)$ is entire if $f$ is a cusp form, and otherwise it has only a simple pole at $s = k$.

PROOF. The proof is very much like Riemann's (second) proof of the functional equation of $\zeta(s)$. By the definition of the gamma function we have for all $n \ge 1$,

$$\Big(\frac{\sqrt{q}}{2\pi}\Big)^{s}\,\Gamma(s)\,n^{-s} = \int_0^{+\infty} e^{-2\pi n y/\sqrt{q}}\, y^{s}\, \frac{dy}{y},$$

so in the region of absolute convergence we have the representation

$$\Lambda(f, s) = \int_0^{+\infty} \Big(f\Big(\frac{iy}{\sqrt{q}}\Big) - a_0(f)\Big) y^{s}\, \frac{dy}{y}.$$

Here we split the integral into the part from 1 to $+\infty$ and the part from 0 to 1, and we transform the latter as follows:

$$\int_0^{1} \Big(f\Big(\frac{iy}{\sqrt{q}}\Big) - a_0(f)\Big) y^{s}\, \frac{dy}{y} = \int_1^{+\infty} \Big(f\Big(\frac{i}{y\sqrt{q}}\Big) - a_0(f)\Big) y^{-s}\, \frac{dy}{y} = i^k \int_1^{+\infty} \Big(Wf\Big(\frac{iy}{\sqrt{q}}\Big) - a_0(f)\Big) y^{k-s}\, \frac{dy}{y}$$

by applying (14.44). Adding both parts we obtain the integral representation

$$\Lambda(f, s) = \int_1^{+\infty} \Big(f\Big(\frac{iy}{\sqrt{q}}\Big) - a_0(f)\Big) y^{s}\, \frac{dy}{y} + i^k \int_1^{+\infty} \Big(Wf\Big(\frac{iy}{\sqrt{q}}\Big) - a_0(f)\Big) y^{k-s}\, \frac{dy}{y}.$$

Since $f - a_0(f)$ decays exponentially fast at infinity, the meromorphic continuation follows, and the functional equation is then clear. □

If $f = E_k$ is an Eisenstein series, one can see from the Fourier expansion that $L(E_k, s)$ is a product of two Dirichlet L-functions, so we do not get any really new L-functions. But when $f$ is a cusp form, the associated Hecke L-function is a genuine $GL(2)$ object.

REMARK. Looking back at the theta series in Section 14.3, one can observe that it provides an equality

where on the left side is the L-function attached to a class group character of the quadratic extension $K$, which has an Euler product

of degree 1 over prime ideals in $\mathcal{O}$, whereas on the right side we have the Hecke L-function of the modular form $f_\chi \in M_1(|D|, \chi_D)$, which has an Euler product of degree 2 over rational primes. This is an early sign of the existence of an instance of Langlands functoriality. Similarly, L-functions defined over number fields are expected to be identical with particular higher-degree L-functions over $\mathbb{Q}$. Another case of this phenomenon appears in the theory of elliptic curves, where the Hasse-Weil zeta function of an elliptic curve $E/\mathbb{Q}$ with complex multiplication was also shown by Deuring to coincide with the L-function of a Hecke character of weight 2 of an imaginary quadratic field.
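The substitution step in the proof of Theorem 14.7 can be checked by hand. The short computation below is a sketch under the assumption that $W$ is normalized by $Wf(z) = q^{-k/2} z^{-k} f(-1/qz)$, which is the normalization consistent with the weight 2 formula $(Wf)(z) = N^{-1}z^{-2}f(-1/Nz)$ used for elliptic curves.

```latex
% Assumption: Wf(z) = q^{-k/2} z^{-k} f(-1/(qz)).
\begin{align*}
z = \frac{iy}{\sqrt q}
  &\;\Longrightarrow\; -\frac{1}{qz} = \frac{i}{y\sqrt q},
  \qquad z^{-k} = i^{-k}\, y^{-k}\, q^{k/2},\\
Wf\Big(\frac{iy}{\sqrt q}\Big)
  &= q^{-k/2}\, i^{-k} y^{-k} q^{k/2}\, f\Big(\frac{i}{y\sqrt q}\Big)
   = i^{-k}\, y^{-k}\, f\Big(\frac{i}{y\sqrt q}\Big),\\
f\Big(\frac{i}{y\sqrt q}\Big)
  &= i^{k}\, y^{k}\, Wf\Big(\frac{iy}{\sqrt q}\Big).
\end{align*}
```

For a cusp form, where $a_0(f) = 0$, inserting the last line into $\int_1^{+\infty} f(i/y\sqrt{q})\, y^{-s}\, dy/y$ produces exactly the factor $i^k y^{k-s}$ appearing in the proof.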



14.6. Hecke operators and automorphic L-functions.

We now sketch the theory of Hecke operators and the Atkin-Lehner theory of primitive forms, which establishes a link between Dirichlet characters and automorphic forms, showing clearly how the latter generalize the former. The Hecke operators are certain linear operators acting on spaces of modular forms, whose existence and properties can be seen to be intimately related to the arithmeticity of the subgroups $\Gamma_0(q)$ of $SL(2, \mathbb{R})$, although this will not be apparent in this survey (see [Mar]). Fix integers $k \ge 1$, $q \ge 1$ and a character $\chi$ modulo $q$ satisfying the consistency property $\chi(-1) = (-1)^k$. The operator $T(n)$ is defined by the formula

$$T(n)f(z) = \frac{1}{n} \sum_{ad = n} \chi(a)\, a^k \sum_{0 \le b < d} f\Big(\frac{az + b}{d}\Big) \tag{14.46}$$

for $n \ge 1$ an integer (notice that $T(n)$ also depends on $\chi$ and so implicitly on $q$; this can be sometimes significant). The following proposition is fundamental.

PROPOSITION 14.8. For any $n \ge 1$, $T(n)$ acts on modular forms and on cusp forms: in other words, it induces linear maps

$$T(n) : M_k(q, \chi) \to M_k(q, \chi), \qquad T(n) : S_k(q, \chi) \to S_k(q, \chi).$$

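The proof of this proposition requires identifying the formula (14.46) as another instance of an averaging operator, this time over the (finite) orbit space $\Delta_n = \Gamma_0(q)\backslash\Gamma_n$, where $\Gamma_n$ is the set of integral matrices of order 2 and determinant $n$, on which $SL(2, \mathbb{Z})$ acts naturally. From the definition, we compute the action of $T(n)$ on the Fourier expansion (at $\infty$) of a modular form (or even of any 1-periodic function), namely

$$\begin{aligned}
T(n)f(z) &= \frac{1}{n} \sum_{ad = n} \chi(a)\, a^k \sum_{0 \le b < d} \sum_{m} a_f(m)\, e\Big(m\,\frac{az + b}{d}\Big) \\
&= \frac{1}{n} \sum_{m} a_f(m) \sum_{ad = n} \chi(a)\, a^k\, e\Big(\frac{maz}{d}\Big) \sum_{0 \le b < d} e\Big(\frac{mb}{d}\Big) \\
&= \sum_{m} \Big( \sum_{\substack{ad = n \\ a \mid m}} \chi(a)\, a^{k-1}\, a_f\Big(\frac{dm}{a}\Big) \Big) e(mz).
\end{aligned}$$

Therefore

$$T(n)f(z) = \sum_{m} \Big( \sum_{d \mid (n, m)} \chi(d)\, d^{k-1}\, a_f\Big(\frac{mn}{d^2}\Big) \Big) e(mz). \tag{14.47}$$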

This formula could also be taken as a definition of $T(n)$, but the modularity of $T(n)f$ would be harder to prove. However, since the Fourier expansion determines a modular form, it can be used to give quick proofs of many properties of the Hecke operators.
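Formula (14.47) is easy to turn into a computation on truncated coefficient lists. The sketch below is only an illustration (the helper names `delta_coefficients` and `hecke_on_coefficients` are ours): it applies (14.47) with $k = 12$ and trivial character of level 1 to the coefficients of $\Delta = q\prod(1-q^n)^{24}$, and compares $T(2)\Delta$ and $T(3)\Delta$ with the corresponding multiples of $\Delta$.

```python
from math import gcd


def delta_coefficients(num_terms):
    """Return [0, tau(1), tau(2), ...] from Delta = q * prod_{n>=1} (1 - q^n)^24."""
    series = [0] * num_terms
    series[0] = 1
    for n in range(1, num_terms):
        for _ in range(24):
            # Multiply the truncated series in place by (1 - q^n).
            for i in range(num_terms - 1, n - 1, -1):
                series[i] -= series[i - n]
    return [0] + series[:-1]


def hecke_on_coefficients(a, n, k, chi, num_out):
    """Coefficients b(0), ..., b(num_out-1) of T(n)f computed from (14.47)."""
    if n * (num_out - 1) >= len(a):
        raise ValueError("need a(m) for all m <= n * (num_out - 1)")
    b = []
    for m in range(num_out):
        g = gcd(n, m)  # divisors of (n, m); note gcd(n, 0) = n
        b.append(sum(
            chi(d) * d ** (k - 1) * a[m * n // (d * d)]
            for d in range(1, g + 1)
            if g % d == 0
        ))
    return b


tau = delta_coefficients(61)
trivial = lambda d: 1  # trivial character of level 1
t2_delta = hecke_on_coefficients(tau, 2, 12, trivial, 20)
t3_delta = hecke_on_coefficients(tau, 3, 12, trivial, 20)
```

The input list has to be about $n$ times longer than the output, because the coefficient of $e(mz)$ in $T(n)f$ involves $a_f(mn)$.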


PROPOSITION 14.9. The Hecke operators commute. More precisely, for any $n \ge 1$ and $m \ge 1$ we have

$$T(m)T(n) = \sum_{d \mid (n, m)} \chi(d)\, d^{k-1}\, T\Big(\frac{mn}{d^2}\Big)$$

or equivalently

$$T(mn) = \sum_{d \mid (n, m)} \mu(d)\, \chi(d)\, d^{k-1}\, T\Big(\frac{m}{d}\Big) T\Big(\frac{n}{d}\Big).$$

In particular, the $T(n)$ are multiplicative: $T(mn) = T(m)T(n)$ if $m$ and $n$ are coprime. Therefore any $T(n)$ is a product of Hecke operators $T(p^\nu)$ for some primes $p$ and integers $\nu$. Those operators obey a second order linear recurrence relation (showing that they are determined by $T(p)$):

$$T(p^{\nu+1}) = T(p)T(p^{\nu}) - \chi(p)\, p^{k-1}\, T(p^{\nu-1}),$$

which can be summarized by the identity of generating functions

$$\sum_{\nu \ge 0} T(p^{\nu}) X^{\nu} = (1 - T(p)X + \chi(p)\, p^{k-1} X^2)^{-1},$$

so the Dirichlet generating series of all the $T(n)$ has an Euler product (of degree 2)

$$\sum_{n \ge 1} T(n)\, n^{-s} = \prod_{p} (1 - T(p)\, p^{-s} + \chi(p)\, p^{k-1-2s})^{-1}.$$

For $p$ prime, we have the formula

$$T(p)f(z) = \chi(p)\, p^{k-1} f(pz) + \frac{1}{p} \sum_{0 \le b < p} f\Big(\frac{z + b}{p}\Big).$$

If $p$ divides $q$ (the level), this simplifies to

$$T(p)f(z) = \frac{1}{p} \sum_{0 \le b < p} f\Big(\frac{z + b}{p}\Big).$$

The operators $T(p)$ for $p$ not dividing $q$ turn out to have very different properties from those with $p$ dividing $q$; the latter are often referred to as the "bad" Hecke operators, and the corresponding $p$ as the bad or ramified primes. The main difference between those two kinds of operators, from the analytic point of view, lies in their relation with the inner product. Indeed, the next lemma fails for $(n, q) > 1$.

LEMMA 14.10. Let $n$ be coprime with $q$. Then the operator $T(n)$ acting on the space of cusp forms $S_k(q, \chi)$ is normal with respect to the Petersson inner product; more precisely, its adjoint is

$$T(n)^* = \bar\chi(n)\, T(n). \tag{14.48}$$

The lemma means that for $f$ and $g$ two cusp forms we have

$$\langle T(n)f, g \rangle = \chi(n) \langle f, T(n)g \rangle;$$

note that indeed this formula cannot hold for $n$ not coprime with $q$, i.e. for $n$ with $\chi(n) = 0$, because it would imply that $T(n) = 0$, which is certainly not always true.



The proof of this lemma (due to Hecke) is quite involved (see [14], pages 104-106 for instance). From the normality of the $T(n)$ with $(n, q) = 1$ and the fact that they commute, we deduce by standard linear algebra the following.

PROPOSITION 14.11. There is an orthonormal basis of the space $S_k(q, \chi)$ of cusp forms which consists of eigenfunctions of all the Hecke operators $T(n)$ for $(n, q) = 1$.

Any non-zero modular form $f \in M_k(q, \chi)$ for which there exist complex numbers $\lambda(n)$ such that

$$T(n)f = \lambda(n)f \quad \text{for all } n \text{ coprime with } q$$

is called a Hecke form. If $f$ is a cuspidal Hecke form, the adjointness formula (14.48) shows that $\lambda(n) = \chi(n)\bar\lambda(n)$ for $(n, q) = 1$. What does it mean for the Fourier coefficients $a_f(n)$ of $f$ that $f$ is a Hecke form? From (14.47) applied to $f$ we obtain, by taking the first Fourier coefficient of both sides, that

$$\lambda(n)\, a_f(1) = a_f(n) \quad \text{for all } n \text{ with } (n, q) = 1, \tag{14.49}$$

so if $a_f(1) \ne 0$, the Hecke eigenvalues and the Fourier coefficients are the same up to a constant factor! This already proves that the Ramanujan $\tau$ function in Section 14.2 is multiplicative, as conjectured by Ramanujan himself (and proved by Mordell). Indeed, since $S_{12}(1)$ is one-dimensional and spanned by the $\Delta$ function, perforce $\Delta$ is an eigenfunction of all Hecke operators (there are no ramified primes here), and since $\tau(1) = 1$, the eigenvalue is $\tau(n)$ and the multiplicativity follows from Proposition 14.9.
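The two consequences of Proposition 14.9 for the Ramanujan function, namely $\tau(mn) = \tau(m)\tau(n)$ for coprime $m, n$ and $\tau(p^{\nu+1}) = \tau(p)\tau(p^{\nu}) - p^{11}\tau(p^{\nu-1})$, can be observed directly on the first hundred values. This is a small numerical sketch (the helper name `ramanujan_tau` is ours), not a proof.

```python
from math import gcd


def ramanujan_tau(num_terms):
    """Return [0, tau(1), ..., tau(num_terms-1)] from q * prod (1 - q^n)^24."""
    series = [0] * num_terms
    series[0] = 1
    for n in range(1, num_terms):
        for _ in range(24):
            # In-place multiplication of the truncated series by (1 - q^n).
            for i in range(num_terms - 1, n - 1, -1):
                series[i] -= series[i - n]
    return [0] + series[:-1]


tau = ramanujan_tau(101)

# Multiplicativity on coprime arguments (all products m*n stay <= 100).
coprime_ok = all(
    tau[m * n] == tau[m] * tau[n]
    for m in range(1, 11)
    for n in range(1, 11)
    if gcd(m, n) == 1
)

# Second order recurrence at prime powers, weight k = 12.
recurrence_ok = all(
    tau[p ** (v + 1)] == tau[p] * tau[p ** v] - p ** 11 * tau[p ** (v - 1)]
    for p in (2, 3, 5, 7)
    for v in range(1, 6)
    if p ** (v + 1) <= 100
)
```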

14.7. Primitive forms and special basis.

Proposition 14.11 is not exactly what we wish to be true. It would be more interesting to have a basis of eigenfunctions of all the Hecke operators without exceptions. Then (14.49) holds for all $n$, and so $a_f(1)$ must be non-zero or otherwise $f = 0$. After normalization, the Fourier coefficients will be multiplicative, and more importantly the Hecke L-function of $f$ will have an Euler product

$$L(f, s) = \prod_{p} (1 - a_f(p)\, p^{-s} + \chi(p)\, p^{k-1-2s})^{-1}.$$

Such a wonderful world does not exist, however. Indeed, let $\chi^*$ modulo $q^*$ be the primitive character which induces $\chi$. Take any $q'$ and $d$ with $q^* \mid q'$, $dq' \mid q$ and let $\chi'$ be the character modulo $q'$ induced by $\chi^*$. Now if $f \in S_k(q', \chi')$, the function $f_{|d}(z) = f(dz)$ is in $S_k(q, \chi)$, as follows from the matrix identity

$$\begin{pmatrix} d & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} \alpha & \beta d \\ \gamma/d & \delta \end{pmatrix} \begin{pmatrix} d & 0 \\ 0 & 1 \end{pmatrix}.$$

It is also easy to show that the Hecke operators $T(n)$ with $(n, q) = 1$ commute with the operation $f \mapsto f_{|d}$, so $f$ being a Hecke form implies that $f_{|d}$ is one. However, since

$$f_{|d}(z) = \sum_{n \ge 1} a_f(n)\, e(dnz),$$

the first Fourier coefficient of $f_{|d}$ vanishes if $d > 1$.

The theory of primitive forms (or "newforms") remedies this defect by considering that cusp forms such as $f_{|d}$ in $S_k(q, \chi)$ are not really of level $q$ but come from lower levels, and that a better theory of Hecke operators should hold when such "oldforms" are put aside. This was developed by Atkin and Lehner [AL]. Therefore, let $S_k^{\flat}(q, \chi)$ be the subspace of $S_k(q, \chi)$ spanned by all cusp forms of the type $f_{|d}$ where $f \in S_k(q', \chi')$ with $q' < q$ and $dq' \mid q$. Then let $S_k^{*}(q, \chi)$ be the orthogonal complement of $S_k^{\flat}(q, \chi)$ with respect to the Petersson inner product, so

$$S_k(q, \chi) = S_k^{\flat}(q, \chi) \oplus S_k^{*}(q, \chi)$$

(note that if $\chi$ is primitive, we have $S_k^{*}(q, \chi) = S_k(q, \chi)$, and that if $\chi$ is trivial, $q$ is prime and $k \le 10$ or $k = 14$, we also have $S_k^{*}(q) = S_k(q)$, because $S_k(1) = 0$ for those $k$). Since $(T(n)f)_{|d} = T(n)(f_{|d})$ for $(n, q) = 1$, it is immediate that $T(n)$ acts on $S_k^{\flat}(q, \chi)$ for $(n, q) = 1$. Since those Hecke operators are normal, they act also on the orthogonal complement $S_k^{*}(q, \chi)$. In particular, both spaces still have a basis of Hecke forms. By definition, a Hecke form $f$ which is in $S_k^{*}(q, \chi)$ is called a primitive form (of weight $k$, level $q$, and character $\chi$). The main result is the following:

THEOREM 14.12 (MULTIPLICITY ONE PRINCIPLE). Given a sequence $(\lambda(n))$ of complex numbers, the subspace of $S_k^{*}(q, \chi)$ spanned by Hecke forms with eigenvalues $\lambda(n)$ for all $(n, q) = 1$ is at most one-dimensional. In other words, such a Hecke form $f$, if it exists, is unique up to multiplication by a scalar.

EXERCISE 4. Prove that the multiplicity-one principle is equivalent to the assertion that any primitive form $f \in S_k^{*}(q, \chi)$ has non-zero first Fourier coefficient.

Suppose now that $T$ is a linear operator on the space $S_k^{*}(q, \chi)$ which commutes with all $T(n)$ for $(n, q) = 1$. Then it follows that $T$ acts on the common eigenspaces of those $T(n)$. By the multiplicity-one principle, those are one-dimensional, each spanned by a primitive form $f$, and therefore $f$ is also an eigenfunction of $T$. In particular, applying this property to the "bad" $T(n)$, i.e. for $(n, q) \ne 1$, we obtain the desired result.

PROPOSITION 14.13. Let $f \in S_k^{*}(q, \chi)$ be a primitive form. Then $f$ is an eigenfunction of all Hecke operators,

$$T(n)f = \lambda_f(n) f \quad \text{for all } n \ge 1,$$

and for all $n \ge 1$, $a_f(n) = \lambda_f(n)\, a_f(1)$. Hence the first Fourier coefficient $a_f(1)$ is non-zero.

It is natural to normalize a primitive form, so that it becomes a distinguished basis of its common eigenspace; then the basis of $S_k^{*}(q, \chi)$ of normalized primitive forms is also unique. Two natural normalizations exist: we will say that $f$ is Hecke-normalized if its first Fourier coefficient is $a_f(1) = 1$, so that the Fourier coefficients are the same as the Hecke eigenvalues, and that $f$ is Petersson-normalized if $\|f\| = 1$ (such normalization is often more useful in the context of the Petersson formula). If $f$ is Hecke-normalized, then of course $\|f\|^{-1} f$ is Petersson-normalized. The norm $\|f\|$ is related to the value of the adjoint square L-function of $f$ at 1, and plays a role in some important applications. See Section 5.12, and in particular Corollary 5.45 for the comparison between the two normalizations.

When no other indication is present, "primitive" will always mean "Hecke-normalized primitive". In any case, once a normalization is chosen, the basis of primitive forms of $S_k^{*}(q, \chi)$ is unique. In particular, it is possible to speak of the set of primitive forms without ambiguity.

The Hecke L-function of a (Hecke-normalized) primitive form $f$, from the previous arguments, has an Euler product

$$L(f, s) = \prod_{p} (1 - \lambda_f(p)\, p^{-s} + \chi(p)\, p^{k-1-2s})^{-1}.$$

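This is equivalent with the formula

$$\lambda_f(m)\lambda_f(n) = \sum_{d \mid (m, n)} \chi(d)\, d^{k-1}\, \lambda_f\Big(\frac{mn}{d^2}\Big). \tag{14.50}$$

This can be also written by Möbius inversion as

$$\lambda_f(mn) = \sum_{d \mid (m, n)} \mu(d)\, \chi(d)\, d^{k-1}\, \lambda_f\Big(\frac{m}{d}\Big) \lambda_f\Big(\frac{n}{d}\Big). \tag{14.51}$$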
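The role of the factor $\chi(d)$ at ramified primes is visible on the weight 2 form of level 11, $q\prod(1-q^n)^2(1-q^{11n})^2$, which spans $S_2(11)$. The sketch below (illustrative helper names, trivial character modulo 11 so that $\chi(d) = 0$ when $11 \mid d$) checks both multiplication formulas for the eigenvalues of a primitive form on all pairs $m, n \le 12$, including $m = n = 11$.

```python
from math import gcd


def level11_coefficients(num_terms):
    """Coefficients of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2, indices 0..num_terms-1."""
    series = [0] * num_terms
    series[0] = 1
    for n in range(1, num_terms):
        for e in (n, n, 11 * n, 11 * n):
            # In-place multiplication of the truncated series by (1 - q^e).
            for i in range(num_terms - 1, e - 1, -1):
                series[i] -= series[i - e]
    return [0] + series[:-1]


def chi0(d):
    """Trivial character modulo 11."""
    return 0 if d % 11 == 0 else 1


def mobius(d):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:
                return 0
            result = -result
        p += 1
    return -result if d > 1 else result


def hecke_relations_hold(a, k, chi, bound):
    """Check the product formula and its Moebius-inverted form for m, n <= bound."""
    for m in range(1, bound + 1):
        for n in range(1, bound + 1):
            g = gcd(m, n)
            divisors = [d for d in range(1, g + 1) if g % d == 0]
            product_side = sum(
                chi(d) * d ** (k - 1) * a[m * n // (d * d)] for d in divisors
            )
            inverted_side = sum(
                mobius(d) * chi(d) * d ** (k - 1) * a[m // d] * a[n // d]
                for d in divisors
            )
            if a[m] * a[n] != product_side or a[m * n] != inverted_side:
                return False
    return True


a = level11_coefficients(150)
relations_ok = hecke_relations_hold(a, 2, chi0, 12)
```

At the bad prime the relation degenerates to $\lambda_f(11)^2 = \lambda_f(121)$, since the term with $d = 11$ is killed by $\chi(11) = 0$.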

Moreover, the functional equation of Theorem 14.7 is also simpler because the primitivity of $f$ will show that $Wf$ is expressed in terms of $f$. The linear operator $W$ is an isometry, and it is easy to show that $W$ commutes with the good Hecke operators: $W T^{\chi}(n) = \chi(n)\, T^{\bar\chi}(n)\, W$, for $(n, q) = 1$ (where the superscript indicates for which character the $T(n)$ is defined). To come back from $S_k(q, \bar\chi)$ to the original space $S_k(q, \chi)$, we use another operator, the $K$-operator defined by

$$Kf(z) = \overline{f(-\bar z)}.$$

Note that $K$ acts on Fourier expansions by conjugating the coefficients,

$$Kf(z) = \sum_{n \ge 0} \overline{a_f(n)}\, e(nz) = \bar f(z), \text{ say.}$$

One must be careful because $K$ is not linear, but only conjugate-linear. The composite operator $\overline{W} = KW$ maps $S_k(q, \chi)$ to itself and satisfies the commutation relation

$$T(n)\overline{W} = \chi(n)\, \overline{W}\, T(n), \quad \text{for } (n, q) = 1.$$

Let $f$ be a primitive form. Then for $(n, q) = 1$ we have

$$T(n)\overline{W}f = \chi(n)\overline{W}T(n)f = \chi(n)\overline{W}(\lambda_f(n)f) = \chi(n)\bar\lambda_f(n)\overline{W}(f) = \lambda_f(n)\overline{W}(f)$$

by the formula for the adjoint of $T(n)$. By the multiplicity one principle, it follows that $\overline{W}f$ must be a multiple of $f$, hence


PROPOSITION 14.14. If $f$ is a primitive form, then there exists $\eta \in \mathbb{C}$, with $|\eta| = 1$, such that $\overline{W}f = \eta f$, where $\overline{W} = KW$.

That $|\eta| = 1$ follows from the fact that $\overline{W}$ is an involution, $\overline{W}^2 = \mathrm{Id}$. The eigenvalue $\eta$ is another very intricate invariant of $f$. It is determined by $f$, but only in special cases do we have a formula for its value in terms of more accessible data.

PROPOSITION 14.15. Let $\chi$ be a primitive character of conductor $q$. Then $\eta$ is given by

PROPOSITION 14.16. Let $q$ be squarefree, and $\chi$ trivial. Then $\eta$ is given by $\eta = \mu(q)\lambda_f(q)\, q^{1 - k/2}$.

From Proposition 14.14 and Theorem 14.7 we deduce the main result about automorphic L-functions of primitive forms.

THEOREM 14.17. Let $f$ be a primitive form. Then the Hecke L-function of $f$ has an Euler product expansion

$$L(f, s) = \prod_{p} (1 - \lambda_f(p)\, p^{-s} + \chi(p)\, p^{k-1-2s})^{-1}$$

and has analytic continuation to an entire function. The completed L-function

$$\Lambda(f, s) = \Big(\frac{\sqrt{q}}{2\pi}\Big)^{s}\,\Gamma(s)\,L(f, s)$$

satisfies the functional equation

$$\Lambda(f, s) = i^k\, \bar\eta\, \Lambda(\bar f, k - s).$$

Notice that this theorem is in perfect analogy with the corresponding result for primitive Dirichlet characters. Even deeper analogies are revealed by the insight of the Langlands program, which uses representation theory to unify the whole theory of L-functions (cf. [BG]). We conclude this section by stating a non-trivial generalization of the multiplicity-one principle which is very useful in applications.

THEOREM 14.18 (THE STRONG MULTIPLICITY ONE PRINCIPLE). Given a sequence $(\lambda(n))$ of complex numbers and a fixed positive integer $M$, there exists at most one primitive form $f \in S_k^{*}(q, \chi)$ such that

$$\lambda_f(n) = \lambda(n), \quad \text{for all } n \text{ prime to } M.$$

The main point is that $M$ has nothing to do with the level $q$. In effect, this shows that a primitive form is known when one knows its eigenvalues for all primes, except finitely many. In fact one can show using Rankin-Selberg L-functions that finitely many primes suffice (depending on the weight and level); see Proposition 5.22 and the remark following.



EXERCISE 5. Show that if $f \in S_k(q, \chi)$ is a Hecke form, then there exists a unique primitive form $g \in S_k(q', \chi')$ with $q' \mid q$ such that $\chi'(n) = \chi(n)$ and $\lambda_g(n) = \lambda_f(n)$ for all $n$ prime to $q$.

14.8. Twisting modular forms.

We have already mentioned that we intend to exploit the eigenvalues $\lambda_f(n)$ of primitive modular forms as arithmetic harmonics; however, before they can be used effectively for such a purpose, it is necessary to get information on the eigenvalues themselves. For this, following our strategy, it is natural to probe their behavior when twisted by other harmonics. The case of primitive Dirichlet characters is basic and the results are nice.

PROPOSITION 14.19. Let $f \in M_k(q, \chi)$ be a modular form with Fourier coefficients $a_f(n)$, not necessarily a Hecke form, $q^*$ the conductor of the Dirichlet character $\chi$ and let $\psi$ be a primitive Dirichlet character modulo $r$. Let $f \otimes \psi$ be the function on $\mathbb{H}$ given by the Fourier expansion

$$(f \otimes \psi)(z) = \sum_{n \ge 0} \psi(n)\, a_f(n)\, e(nz).$$

Then $f \otimes \psi$ is also a modular form, more precisely $f \otimes \psi \in M_k(N, \chi\psi^2)$, where the level $N$ is the least common multiple of $q$, $q^* r$ and $r^2$. If $f$ is a cusp form, then so is $f \otimes \psi$.

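PROOF. Since $\psi$ is primitive, we have for any $n$ the expression for $\psi(n)$ in terms of additive characters

$$\tau(\bar\psi)\psi(n) = \sum_{u \ (\mathrm{mod}\ r)} \bar\psi(u)\, e\Big(\frac{un}{r}\Big)$$

where $\tau(\bar\psi) \ne 0$ is the Gauss sum, whence

$$(f \otimes \psi)(z) = \tau(\bar\psi)^{-1} \sum_{u \ (\mathrm{mod}\ r)} \bar\psi(u)\, f\Big(z + \frac{u}{r}\Big). \tag{14.52}$$

It is now a matter of writing, for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $\Gamma_0(N)$, the matrix identity in $SL(2, \mathbb{R})$,

$$\begin{pmatrix} 1 & u/r \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a + \frac{cu}{r} & b - \frac{bcdu}{r} - \frac{cd^2u^2}{r^2} \\ c & d - \frac{cd^2u}{r} \end{pmatrix} \begin{pmatrix} 1 & d^2u/r \\ 0 & 1 \end{pmatrix}$$

to obtain

$$(f \otimes \psi)(\gamma z) = \chi(d)\,\psi(d)^2\, (cz + d)^k\, (f \otimes \psi)(z),$$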

which shows the modularity. As for $f \otimes \psi$ being a cusp form when $f$ is, the criterion of Exercise 1 applies immediately from (14.52). □

By multiplicativity of $\psi$ and the formula (14.47), we see that the operation of twisting by $\psi$ takes Hecke forms to Hecke forms. On the other hand, it might not take a primitive form to a primitive form, because the level $N$ as described in Proposition 14.19 might not be optimal. This happens, for instance, when $f = f_\chi$ is one of the forms of weight 1 associated to an ideal-class character $\chi$ in Section 14.3. Indeed, the Fourier coefficients of $f_\chi$ are supported on integers which are norms of ideals of the field $K = \mathbb{Q}(\sqrt{D})$, so twisting by the field character $\psi = \chi_D$ has no effect at all, since it is identically 1 on norms of ideals. By Exercise 5, however, there is a unique primitive form $g$, with level dividing $N$, which is such that $\psi(n)\lambda_f(n) = \lambda_g(n)$ for all $n$ coprime with $q$.

In the special case when $q$ and $r$ are coprime, however, there is no problem and $f \otimes \psi$ is primitive whenever $f$ is (as might indeed be expected). In this case the functional equation of the twisted L-function takes the following form.

PROPOSITION 14.20. Let $f \in S_k^{*}(q, \chi)$ be a primitive form and $\psi$ be a primitive Dirichlet character modulo $r$ with $(r, q) = 1$. Then $f \otimes \psi$ is primitive of level $N = qr^2$ and the Hecke L-function of $f \otimes \psi$ is entire and polynomially bounded in vertical strips. Moreover, the completed L-function (14.43) satisfies the functional equation

$$\Lambda(f \otimes \psi, s) = i^k\, w\, \bar\eta_f\, \Lambda(\bar f \otimes \bar\psi, k - s)$$

where $\eta_f$ is the eigenvalue of $f$ for the operator $\overline{W} = KW$ of Proposition 14.14, and the root number $w$ depends only on $\chi$ and $\psi$, namely

$$w = \chi(r)\,\psi(q)\, \frac{\tau(\psi)^2}{r}.$$
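Both ingredients used above, the expansion of a primitive character in additive characters from the proof of Proposition 14.19 and the fact that $\tau(\psi)^2/r$ has modulus 1, can be checked numerically on a small example. The sketch below assumes, purely for illustration, the character modulo 5 determined by $\psi(2) = i$ (it is primitive because the modulus is prime and the character is non-trivial).

```python
import cmath

R = 5
# Character modulo 5 with psi(2) = i; 2 generates (Z/5Z)^*: 2, 4, 3, 1.
PSI_VALUES = {0: 0, 1: 1, 2: 1j, 4: -1, 3: -1j}


def psi(n):
    return complex(PSI_VALUES[n % R])


def psi_bar(n):
    return psi(n).conjugate()


def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def gauss_sum(character):
    return sum(character(u) * e(u / R) for u in range(R))


def additive_side(n):
    """Right-hand side: sum over u mod r of psi_bar(u) * e(u*n/r)."""
    return sum(psi_bar(u) * e(u * n / R) for u in range(R))


# tau(psi_bar) * psi(n) should equal additive_side(n) for every integer n,
# including multiples of r, where both sides vanish.
max_error = max(
    abs(gauss_sum(psi_bar) * psi(n) - additive_side(n)) for n in range(-10, 11)
)

# |tau(psi)|^2 = r, so tau(psi)^2 / r lies on the unit circle.
root_number_modulus = abs(gauss_sum(psi) ** 2 / R)
```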

The fact that there is no pole for any of the twists of a cusp form shows that the signs of the Fourier coefficients are changing randomly and independently of those of any Dirichlet character. The lack of periodicity of the Fourier coefficients is a welcome feature in analytic number theory. See [Du12], [Du13], [Dul4] for some applications of twisting as a replacement of positivity. Twists are important also as a simple case of Rankin-Selberg convolution $GL(2) \otimes GL(1)$ (see Chapter 5), and they occur in so-called converse theorems which characterize modular forms by means of analytic properties of L-functions. The first converse theorem using twists is due to Weil [We3]. Deeper variants are important in proving instances of Langlands functoriality, see the survey by Cogdell in [BG]. We quote Weil's result:

THEOREM 14.21. Let

$$L_1(s) = \sum_{n \ge 1} a(n)\, n^{-s}, \qquad L_2(s) = \sum_{n \ge 1} b(n)\, n^{-s}$$

be two Dirichlet series absolutely convergent for $\mathrm{Re}(s) > C$ for some $C > 0$. Assume that there exist integers $k \ge 1$, $q \ge 1$ and $M > 0$ such that for any $\psi$ primitive modulo $m$, with $(m, Mq) = 1$, the Dirichlet series

$$L(f \otimes \psi, s) = \sum_{n \ge 1} a(n)\psi(n)\, n^{-s}, \qquad L(g \otimes \psi, s) = \sum_{n \ge 1} b(n)\psi(n)\, n^{-s}$$

admit analytic continuation to entire functions bounded in vertical strips such that the functions

$$\Lambda(f \otimes \psi, s) = (2\pi)^{-s}\,\Gamma(s)\,L(f \otimes \psi, s), \qquad \Lambda(g \otimes \psi, s) = (2\pi)^{-s}\,\Gamma(s)\,L(g \otimes \psi, s)$$

are entire and satisfy the functional equation

with $w_\psi = i^k \chi(m)\psi(q)\tau(\psi)^2 m^{-1}$.

Then there exists a cusp form $f \in S_k(q, \chi)$ for some $\chi$ modulo $q$ such that $L(f, s) = L_1(s)$ and $L(Wf, s) = L_2(s)$.

14.9. Estim

The question of the size of the Fourier coefficients (or Hecke eigenvalues) of cusp forms is one of the classical problems in the study of automorphic forms for themselves. Consider $f \in S_k(q, \chi)$. By the criterion for cusp forms of Exercise 1 and the Parseval formula we obtain, from the Fourier expansion of $f$,

$$\sum_{n \ge 1} |a_f(n)|^2 e^{-4\pi n y} = \int_0^1 |f(z)|^2\, dx \ll y^{-k}$$

for any $y > 0$, so that for any $N \ge 1$,

$$\sum_{n \le N} |a_f(n)|^2 \ll y^{-k} e^{4\pi N y},$$

and by choosing $y = N^{-1}$ we derive a bound for the second moment of the Fourier coefficients

$$\sum_{n \le N} |a_f(n)|^2 \ll N^k. \tag{14.53}$$

Hence the trivial bound (14.10), namely $a_f(n) \ll n^{k/2}$, follows by positivity. This last step is rather crude and it is therefore expected that the resulting bound for the individual coefficient $a_f(n)$ can be improved. On the other hand, the bound obtained on average turns out to have already the correct order of magnitude, as shown by the properties of the Rankin-Selberg L-functions. Thus, it shows that $a_f(n)$, on average, is of order $n^{(k-1)/2}$. The statement that $a_f(n) \ll n^{(k-1)/2 + \varepsilon}$ for any $\varepsilon > 0$ holds individually is known as the Ramanujan-Petersson Conjecture. For primitive cusp forms, this was proved by P. Deligne [Del] for $k \ge 2$ as a consequence of the Riemann Hypothesis for varieties over finite fields (the Weil conjectures), in the very sharp form

$$|a_f(n)| \le \tau(n)\, n^{(k-1)/2} \tag{14.54}$$

for any $n \ge 1$, where $\tau$ is the divisor function. The corresponding formula for $k = 1$ holds and is due to Deligne and Serre [DeSe]. This is an extremely powerful result, in particular because it is completely uniform. If $f$ is not primitive, by writing it as a linear combination of Hecke forms, we derive from (14.54)

$$a_f(n) \ll \tau(n)\, n^{(k-1)/2},$$

but the implied constant now depends on $f$ quite badly.
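For the primitive form $\Delta$ of weight 12 and level 1, Deligne's bound reads $|\tau_\Delta(n)| \le d(n)\, n^{11/2}$, where we write $\tau_\Delta$ for the Ramanujan function and $d$ for the divisor function to avoid the clash of notation. The sketch below (helper names are ours) verifies the squared inequality in exact integer arithmetic for $n \le 200$; it illustrates the statement and says nothing about larger $n$.

```python
def ramanujan_tau(num_terms):
    """Return [0, tau(1), ..., tau(num_terms-1)] from q * prod (1 - q^n)^24."""
    series = [0] * num_terms
    series[0] = 1
    for n in range(1, num_terms):
        for _ in range(24):
            # In-place multiplication of the truncated series by (1 - q^n).
            for i in range(num_terms - 1, n - 1, -1):
                series[i] -= series[i - n]
    return [0] + series[:-1]


def divisor_count(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0)


LIMIT = 200
tau = ramanujan_tau(LIMIT + 1)

# Squaring both sides of |tau(n)| <= d(n) * n^(11/2) keeps everything integral.
deligne_ok = all(
    tau[n] ** 2 <= divisor_count(n) ** 2 * n ** 11 for n in range(1, LIMIT + 1)
)

# Largest observed value of tau(n)^2 / (d(n)^2 n^11), as a float for inspection.
largest_ratio = max(
    tau[n] ** 2 / (divisor_count(n) ** 2 * n ** 11) for n in range(1, LIMIT + 1)
)
```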


It is often convenient to normalize the Fourier coefficients so as to extract the expected order of magnitude, writing

$$f(z) = \sum_{n \ge 1} a_f(n)\, n^{(k-1)/2}\, e(nz) \tag{14.55}$$

instead of (14.9), and similarly for primitive forms with $\lambda_f(n)$. This has the effect of putting the critical line of the L-function at $\mathrm{Re}(s) = \tfrac{1}{2}$, and the functional equation then relates $\Lambda(f, s)$ and $\Lambda(\bar f, 1 - s)$. This normalization is most convenient in analytic number theory and will be in effect from now on, except where explicitly stated. It is also the normalization used in Chapter 5, see Section 5.11 for the necessary adjustments. In this context, Deligne's bound takes the form $|\lambda_f(n)| \le \tau(n)$ for $f$ primitive, and the Hecke L-function becomes

$$L(f, s) = \sum_{n \ge 1} \lambda_f(n)\, n^{-s} = \prod_{p} (1 - \lambda_f(p)\, p^{-s} + \chi(p)\, p^{-2s})^{-1}.$$

In recent applications (for instance, of the "amplification method"), the need for lower bounds on the Fourier coefficients has also arisen. Since an individual lower bound is impossible, as there exist forms $f$ (primitive) and $n$ (coprime with the level) with $\lambda_f(n) = 0$, this can only be expected to hold on average. Indeed, from the Rankin-Selberg method we know that the upper bound (14.53) can be sharpened to the asymptotic (remember that we now normalize $a_f(n)$ as in (14.55))

$$\sum_{n \le N} |a_f(n)|^2 = c_f N + O(N^{3/5}), \tag{14.56}$$

with $c_f > 0$ and the implied constant depending on $f$. In particular, for a fixed $f$, the Fourier coefficients are not too small too often. However, the dependence of the constant on $f$ in this formula is not explicit enough, and so the uniformity which is often required for applications is not present. In this context a very useful observation is that if $f$ is a primitive form, then for every prime $p$

$$\lambda_f(p)^2 - \lambda_f(p^2) = \chi(p).$$

From this simple formula we see that for $p$ not dividing the level $q$ it is not possible for $\lambda_f(p)$ and $\lambda_f(p^2)$ to be simultaneously very small! This is very useful because it is completely uniform in all parameters involved. Here is an example of the use of this formula.
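The relation between $\lambda_f(p)$ and $\lambda_f(p^2)$ referred to here can be derived in two lines from the multiplication formula (14.50). The following is a sketch under the assumption that the eigenvalues are normalized as in (14.55), so that the factor $d^{k-1}$ in (14.50) is absorbed and $\lambda_f(1) = 1$.

```latex
% Assumption: normalized eigenvalues, so (14.50) reads
%   \lambda_f(m)\lambda_f(n) = \sum_{d | (m,n)} \chi(d) \lambda_f(mn/d^2).
\begin{align*}
\lambda_f(p)^2
  &= \sum_{d \mid p} \chi(d)\, \lambda_f\!\left(\frac{p^2}{d^2}\right)
   = \lambda_f(p^2) + \chi(p), \\
p \nmid q \;\Longrightarrow\; 1 = |\chi(p)|
  &\le |\lambda_f(p)|^2 + |\lambda_f(p^2)|, \\
\text{hence}\quad
\max\bigl(|\lambda_f(p)|,\ |\lambda_f(p^2)|\bigr) &\ge \tfrac{1}{2}.
\end{align*}
```

The last line holds because if both absolute values were smaller than $\tfrac12$, the right side of the second line would be smaller than $\tfrac14 + \tfrac12 < 1$.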

PROPOSITION 14.22. Let $f \in S_k^{*}(q, \chi)$ be a primitive form, $\lambda_f(n)$ its Hecke eigenvalues. There exists a sequence of complex numbers $c(n)$ such that

$$\sum_{n \le N} c(n)\lambda_f(n) \asymp \sqrt{N}(\log N)^{-1}, \tag{14.57}$$

$$\sum_{n \le N} |c(n)|^2 \asymp \sqrt{N}(\log N)^{-1} \tag{14.58}$$

if $N \gg (\log 2q)^2$, where the implied constants are absolute.

PROOF. It is enough to take $c(n) = \bar\chi(p)$ if $n = p^2 \le N$, $c(n) = -\bar\chi(p)\lambda_f(p)$ if $n = p \le \sqrt{N}$, and $c(n) = 0$ otherwise. Then the left side of (14.57) is equal to

by Tchebyshev's estimate for $\pi(\sqrt{N})$ and $\omega(q) \ll (\log 2q)(\log\log 3q)^{-1}$. On the other hand, $c(n)$ satisfies (14.58) by Deligne's bound (14.54). □

REMARK. From the proof one sees that the numbers $c(n)$ can be chosen to have support on primes or squares of primes and that $|c(n)| \le 2$. These extra properties are not important but could be convenient for technical reasons.

14.10. Averages of Fourier coefficients.

The Petersson formula (Proposition 14.5) is very useful in applications; it permits averaging over natural families of automorphic forms, which are so to speak "complete". In doing so, it shows that the Fourier coefficients of automorphic forms behave as "harmonics": they are nearly orthogonal (see also Section 7.3 for the underlying philosophy). The same is true of the Kuznetsov formula (see Chapter 16). It is especially important to deal with coefficients of primitive forms, which benefit also from multiplicative properties (among other things). The issue arises that in general there is no basis of $S_k(q, \chi)$ of primitive forms. However, taking a basis of Hecke forms (see Proposition 14.11), say $H_k(q, \chi)$, and denoting

$$f(z) = \sum_{n \ge 1} \lambda_f(n)\, n^{(k-1)/2}\, e(nz)$$

their Fourier expansion, Proposition 14.5 yields

COROLLARY 14.23. Let $q \ge 1$, $k \ge 2$. Let $\chi$ be a character modulo $q$ and $H_k(q, \chi)$ any Hecke basis of $S_k(q, \chi)$. For any $m, n \ge 1$, we have

$$\sum^{h}_{f \in H_k(q, \chi)} \bar\lambda_f(n)\lambda_f(m) = \delta(m, n) + 2\pi i^{-k} \sum_{\substack{c > 0 \\ c \equiv 0 \ (\mathrm{mod}\ q)}} c^{-1} S_\chi(m, n; c)\, J_{k-1}\Big(\frac{4\pi\sqrt{mn}}{c}\Big) \tag{14.59}$$

where the superscript $h$ means that $f$ has been spectrally normalized to be of Petersson norm 1. Without normalization this means the terms of summation are appropriately weighted, namely

(14.60)

It is important to know how good the approximate orthogonality can really be, that is, to estimate the series of Kloosterman sums. A simple but effective treatment yields the following corollary (see Theorem 16.7 for a more refined estimate).


14,

381

CoROLLARY 14.24. With notation as above, we have for any m, n;;:, 1, (14.61) 1

1

l7(q)

(

(mn)4))

+0 ( T3((m,n))(m,n,q)2(mn)4qVklog 1+ yqf where the implied constant is absolute. PRooF. Use the estimate Jk_ 1 (x) Kloosterman sums (Corollary 1L12)

«

min(1,x/k) and the Wei! bound for

(14.62) 1

1

1

For c = qr, say, this yields the bound (m, n, q) 2 (m, n, r)2 (qr) 2 T(q)T(r). Hence the sum of Kloosterman sums in (14.59) is estimated by l T(q)'"""' T(r) l . ( .,fiiin) (m,n,q)2- L.., r;:. (m,n,r)2mm 1, -k- . yfcJ r yr qr

Hence the result follows by the elementary estimate 7

L ~min( 1, ~) «

Vxlog(1

+ Vx),

r

which holds for any positive X.

D
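The Weil bound used in this proof can be observed on small moduli. The sketch below treats only the untwisted sums $S(m, n; c)$ (trivial character, an assumption made to keep the example short), computes them by brute force, and compares with $(m, n, c)^{1/2} c^{1/2} \tau(c)$. It relies on `pow(x, -1, c)` for modular inverses, which requires Python 3.8 or later.

```python
import cmath
from math import gcd, sqrt


def kloosterman(m, n, c):
    """S(m, n; c): sum over x mod c with (x, c) = 1 of e((m*x + n*x^{-1}) / c)."""
    if c == 1:
        return 1 + 0j
    total = 0j
    for x in range(1, c):
        if gcd(x, c) == 1:
            x_inv = pow(x, -1, c)
            # Reduce the numerator mod c to keep the floating point phase accurate.
            total += cmath.exp(2j * cmath.pi * ((m * x + n * x_inv) % c) / c)
    return total


def divisor_count(c):
    return sum(1 for d in range(1, c + 1) if c % d == 0)


def weil_bound(m, n, c):
    return sqrt(gcd(gcd(m, n), c)) * sqrt(c) * divisor_count(c)


worst_ratio = max(
    abs(kloosterman(m, n, c)) / weil_bound(m, n, c)
    for c in range(1, 41)
    for m in range(1, 6)
    for n in range(1, 6)
)
```

In this range the ratio never exceeds 1, with equality only in the degenerate case $c = 1$.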

If the space of old forms vanishes, then the set of primitive forms $S_k(q, \chi)^*$ is a distinguished basis of $S_k(q, \chi)$. In particular, this is so if $\chi$ is primitive. However, we will discuss further the special case where $q$ is prime, $\chi$ is trivial and $k < 12$, because this will be used in Chapter 26 for studying non-vanishing of L-functions. In that case $S_k^{*}(q) = S_k(q)$ because $S_k(1) = 0$ (the first cusp form for $SL(2, \mathbb{Z})$ is of weight 12). If $f \in S_k(q)^*$ is a primitive form it has real eigenvalues, and since $q$ is squarefree the sign $\varepsilon_f = \pm 1$ of the functional equation is given by $\varepsilon_f = -\lambda_f(q)\, q^{1/2}$ (Proposition 14.16; note we have changed normalization of Fourier coefficients slightly). It is often desirable to be able to average over forms with a fixed sign, say $\varepsilon = \pm 1$. This can be done by inserting a factor $(1 + \varepsilon\varepsilon_f)$, and applying the summation formula twice:

$$2 \sum^{h}_{\varepsilon_f = \varepsilon} \lambda_f(m)\lambda_f(n) = \sum^{h}_{f} (1 + \varepsilon\varepsilon_f)\lambda_f(m)\lambda_f(n) = \sum^{h}_{f} \lambda_f(m)\lambda_f(n) - \varepsilon q^{1/2} \sum^{h}_{f} \lambda_f(q)\lambda_f(m)\lambda_f(n).$$

Because $f$ is primitive and $q$ is the level, $\lambda_f(q)\lambda_f(n) = \lambda_f(qn)$ for all $n$. We deduce

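PROPOSITION 14.25. Let $q$ be prime, $1 < k < 12$, $\varepsilon = \pm 1$. For any $m, n \ge 1$, we have

$$2 \sum^{h}_{\varepsilon_f = \varepsilon} \lambda_f(m)\lambda_f(n) = \delta(m, n) - \varepsilon q^{1/2}\, \delta(m, nq) + 2\pi i^{-k} \sum_{\substack{c > 0 \\ c \equiv 0 \ (\mathrm{mod}\ q)}} c^{-1} \Big( S(m, n; c)\, J_{k-1}\Big(\frac{4\pi\sqrt{mn}}{c}\Big) - \varepsilon q^{1/2}\, S(m, nq; c)\, J_{k-1}\Big(\frac{4\pi\sqrt{mnq}}{c}\Big) \Big). \tag{14.63}$$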

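Here also we can estimate some or all Kloosterman sums, getting an approximation to the diagonal symbol in terms of the Hecke eigenvalues of primitive cusp forms.

COROLLARY 14.26. Let $q$ be prime, $1 < k < 12$ and $\varepsilon = \pm 1$. For any $m, n \geq 1$ with $(m,q) = 1$ we have

(14.64)
$$2\sum^{h}_{\varepsilon_f=\varepsilon}\lambda_f(m)\lambda_f(n) = \delta(m,n) + \frac{2\pi\varepsilon i^{-k}}{\sqrt{q}}\sum_{(r,q)=1}\frac{1}{r}\,S(m\bar{q},n;r)\,J_{k-1}\Big(\frac{4\pi}{r}\sqrt{\frac{mn}{q}}\Big) + O\Big(\tau_3((m,n))\,\frac{(mn)^{1/4}}{q}\log\Big(1+\frac{(mn)^{1/4}}{\sqrt{q}}\Big)\Big)$$

and

(14.65)
$$2\sum^{h}_{\varepsilon_f=\varepsilon}\lambda_f(m)\lambda_f(n) = \delta(m,n) + O\Big(\frac{\tau_3((m,n))}{\sqrt{q}}\Big(\frac{mn}{q}\Big)^{1/4}\log\Big(1+\Big(\frac{mn}{q}\Big)^{1/4}\Big)\Big).$$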

The implied constant in both estimates is absolute.

PROOF. Since $q$ does not divide $m$ we have $\delta(m,qn) = 0$. Moreover, the first term involving Kloosterman sums on the right side of (14.63) is already estimated in Corollary 14.24. For the second, we write $c = qr$ and estimate the contribution of the terms with $q \mid r$ by the same method, getting a bound which is of smaller order of magnitude. When $(r,q) = 1$, the Kloosterman sum factorizes:

$$S(m,nq;qr) = S(m\bar{q},n;r)\,S(0,m;q) = -S(m\bar{q},n;r).$$

This gives (14.64). The remaining sum of Kloosterman sums in (14.64) can also be estimated directly by using Weil's bound, and the result is given in the error term of (14.65) (this is worse than the error term in (14.64) in terms of $q$). □

The error term of (14.65) is larger than the error term in (14.64) and will not be sufficient (just) for applications in Chapter 26. Better estimates will be derived from (14.64) by exploiting a cancellation in sums of Kloosterman sums over the variables $m$ and $n$.
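The factorization used in the proof is an instance of the twisted multiplicativity of Kloosterman sums, combined with the value $-1$ of the Ramanujan sum $S(0,m;q)$ for a prime $q \nmid m$. A small numerical sketch follows (the helper names `kloosterman` and `factorization_gap` are hypothetical, and $\bar{q}$ denotes the inverse of $q$ modulo $r$):

```python
import cmath
from math import gcd

def kloosterman(m, n, c):
    """Kloosterman sum S(m, n; c), computed directly from the definition."""
    if c == 1:
        return 1.0
    total = 0.0
    for x in range(1, c):
        if gcd(x, c) == 1:
            xbar = pow(x, -1, c)  # inverse of x modulo c (Python 3.8+)
            total += cmath.exp(2j * cmath.pi * ((m * x + n * xbar) % c) / c).real
    return total

def factorization_gap(m, n, q, r):
    """Return S(m, nq; qr) + S(m*qbar, n; r), which should vanish
    when q is prime, q does not divide m, and (r, q) = 1."""
    qbar = pow(q, -1, r) if r > 1 else 0
    return kloosterman(m, n * q, q * r) + kloosterman(m * qbar, n, r)

if __name__ == "__main__":
    print(factorization_gap(3, 2, 7, 10))  # expected to be ~0 up to rounding
```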


CHAPTER 15

SPECTRAL THEORY OF AUTOMORPHIC FORMS

15.1. Motivation and geometric preliminaries.

Abelian harmonic analysis, as developed in the previous chapters (see especially Chapter 4), gives powerful means of transforming or estimating summations over integers, or lattice points in various geometrical domains. However, in analytic number theory, it is desired to have such tools for sums over other subsets of integers which are not defined by analytic conditions alone. Common among these are sums over $(a, b, c, d) \in \mathbb{Z}^4$ restricted by the determinant equation

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc = h$$

where $h$ is fixed. For example, one encounters the determinant equation when investigating the power moments (with amplification) of $L$-functions on the critical line to handle off-diagonal contributions.

Although the classical Fourier analysis applies to the determinant equation quite well (especially when enhanced by estimates for Kloosterman sums), the spectral theory on the hyperbolic plane produces much stronger results. The key point is that for the analysis of the determinant equation the automorphic forms are more natural harmonics than the exponential functions or Dirichlet characters. They enter the modern analytic number theory through many other doors. For example, the old problems concerning the class group of quadratic fields (Lagrange, Gauss, Dirichlet) cannot be properly understood without speaking of automorphic theory. Another incentive for new analytic tools is to study automorphic $L$-functions. References for this chapter are for instance [Bu], [15]. The basic theory was constructed by Maass [Ma] and Selberg [86].

As in the previous chapter we consider first the group $G = SL(2,\mathbb{R})$. Just as $\mathbb{Z}$ is a discrete subgroup of $\mathbb{R}$, from which the theory of periodic functions and Fourier series arises, $\Gamma = \Gamma_0(1) = SL(2,\mathbb{Z})$ is a discrete subgroup of $G$. Studying the determinant equation with $h = 1$ by harmonic analysis means taking a kernel (smooth, compactly supported) $k : G \to \mathbb{C}$ and finding a summation formula à la Poisson for the function $\mathcal{K} : G \to \mathbb{C}$ given by

$$\mathcal{K}(g) = \sum_{\gamma\in\Gamma} k(\gamma g)$$

which is $\Gamma$-periodic: $\mathcal{K}(\gamma g) = \mathcal{K}(g)$ for $\gamma \in \Gamma$.


However, a useful simplification is possible by performing first an ordinary Fourier expansion with respect to the compact abelian subgroup

$$K = \Big\{ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \ :\ \theta \in [0, 2\pi] \Big\} \subset SL(2,\mathbb{R})$$

(which is isomorphic to the circle $\mathbb{R}/2\pi\mathbb{Z}$). Any function $k$ can be expanded as a series

$$k(g) = \sum_{m\in\mathbb{Z}} k_m(g)$$

where $k_m$ satisfies a simple transformation rule $k_m(gr) = e(m\theta)k_m(g)$ for all $r = r_\theta \in K$. Therefore, instead of functions on $G$, we can look at functions on the quotient $G/K$. This quotient is diffeomorphic to the Poincaré upper half-plane by

(15.1)
$$G/K \to \mathbb{H}, \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto \frac{ai+b}{ci+d}.$$

Instead of doing this reduction explicitly, we work on $\mathbb{H}$ and have $G$ act on it. Recall from Chapter 14 that the Poincaré upper half-plane is $\mathbb{H} = \{z \in \mathbb{C} \mid \mathrm{Im}(z) > 0\}$. We will see it now as a model of the hyperbolic plane, a complete Riemannian manifold of dimension 2 with constant negative curvature $-1$ when equipped with the Poincaré metric

(15.2)
$$ds^2 = y^{-2}(dx^2 + dy^2).$$

The geometry of $\mathbb{H}$ with this metric is accessible because $\mathbb{H}$ has a large isometry group. Indeed, the action (14.2) of $G = SL(2,\mathbb{R})$ on $\mathbb{H}$ by linear fractional transformations

(15.3)
$$gz = \frac{az+b}{cz+d}, \qquad \text{for } g = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$

is an action by isometries: this follows directly from the formula above and from

(15.4)
$$d(gz) = (cz+d)^{-2}\,dz.$$

Hence $G \subset \mathrm{Isom}(\mathbb{H})$, the isometry group of $\mathbb{H}$, with the center of $G$ acting trivially. Actually, $PSL(2,\mathbb{R}) = SL(2,\mathbb{R})/\{\pm 1\}$ is the full group of orientation-preserving isometries of $\mathbb{H}$, and $\mathrm{Isom}(\mathbb{H})$ is generated by $G$ and the reflection $\sigma : z \mapsto -\bar{z}$.

Using this large isometry group, many geometric properties of $\mathbb{H}$ can be established fairly quickly:

(1) Angles in $\mathbb{H}$ are the same as euclidean angles (simply because the metric is conformal to the euclidean metric $|dz|^2$).

(2) The hyperbolic geodesics in $\mathbb{H}$ are the half-circles (in the euclidean sense) orthogonal to the "boundary" $\mathbb{R} \cup \{\infty\}$, and the vertical lines $\mathrm{Re}(z) = $ constant. In particular, such a circle or line is transformed into another by $g \in G$.

(3) A geodesic circle of center $w$, $\{z \in \mathbb{H} \mid d(z,w) = r\}$, is also a euclidean circle (with different center and radius).

(4) The distance (in the hyperbolic metric) is explicitly given by

$$d(z,w) = \log\frac{|z-\bar{w}| + |z-w|}{|z-\bar{w}| - |z-w|},$$

but it is actually often more convenient to work with the function $u(z,w)$ given by

(15.5)
$$u(z,w) = u(d(z,w)) = \frac{|z-w|^2}{4\,\mathrm{Im}(z)\,\mathrm{Im}(w)}$$

where $u : \mathbb{R}^+ \to \mathbb{R}^+$ is the function of the distance

(15.6)
$$u(d) = \tfrac{1}{2}(\cosh d - 1) = \Big(\sinh\frac{d}{2}\Big)^2.$$
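The relations between the distance formula, (15.5) and (15.6), together with the invariance under $G$, are easy to test numerically. The sketch below is only illustrative; the names `mobius`, `u` and `dist` are hypothetical helpers, and points of $\mathbb{H}$ are represented by Python complex numbers with positive imaginary part.

```python
import math

def mobius(g, z):
    """Action (15.3) of g = ((a, b), (c, d)) in SL(2, R) on a point z of H."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

def u(z, w):
    """The function (15.5): |z - w|^2 / (4 Im z Im w)."""
    return abs(z - w) ** 2 / (4.0 * z.imag * w.imag)

def dist(z, w):
    """Hyperbolic distance, from the explicit logarithmic formula."""
    far = abs(z - w.conjugate())
    near = abs(z - w)
    return math.log((far + near) / (far - near))

if __name__ == "__main__":
    z, w = 1 + 2j, -0.5 + 0.7j
    g = ((2.0, 1.0), (3.0, 2.0))  # determinant 1
    d = dist(z, w)
    print(u(z, w), math.sinh(d / 2) ** 2)        # the two sides of (15.6)
    print(d, dist(mobius(g, z), mobius(g, w)))   # invariance under g
```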

REMARKS. (1) One has to be careful that the topology of $\mathbb{H}$ induced by the Riemannian metric is different from its (euclidean) topology as a subset of $\mathbb{C}$. For instance, $\mathbb{H}$ is complete with the Riemannian metric, but of course not with the euclidean one.

(2) Notice that the compact subgroup $K$ is the stabilizer of $i \in \mathbb{H}$, hence the isomorphism $G/K \simeq \mathbb{H}$ described above is simply the map $g \mapsto gi$ on $G$.

Recall that the invariant measure associated to the Poincaré metric is

(15.7)
$$d\mu(z) = y^{-2}\,dx\,dy$$

(see (14.4)). When we come to harmonic analysis on quotients of $\mathbb{H}$ by discrete subgroups, the quotients $\Gamma_0(q)\backslash\mathbb{H}$ by congruence subgroups in particular (see Section 14.1), many new phenomena occur in comparison with the abelian case of $\mathbb{R}$ and its own discrete subgroups. First, any discrete subgroup of $\mathbb{R}$ is necessarily of the form $a\mathbb{Z}$ with $a \in \mathbb{R}$, and they are therefore basically "all the same". This similarity allows us, for instance, to deduce quickly a Poisson summation formula for an arithmetic progression from the one for $\mathbb{Z}$ (see Section 4.3). On the other hand, there are very subtle questions concerning harmonic analysis on $\Gamma_0(q)\backslash\mathbb{H}$ (similar to their complex-analytic differences mentioned in Chapter 14).

15.2. The laplacian on $\mathbb{H}$.

Harmonic analysis on $\mathbb{H}$ is associated with the hyperbolic Laplace operator, which is given by

(15.8)
$$\Delta = -y^2\Big(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\Big).$$
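As a sanity check of (15.8), one can verify by finite differences that $(\mathrm{Im}\, z)^s$ satisfies $\Delta f = s(1-s)f$, and that the same holds after composing with an element of $G$. This is only a numerical sketch: the names `hyp_laplacian`, `power` and `shifted`, the step size `h`, and the chosen values of `S` and `G` are illustrative assumptions.

```python
def mobius(g, z):
    """Linear fractional action of g = ((a, b), (c, d)) on z."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

def hyp_laplacian(f, z, h=1e-3):
    """Central-difference approximation of -y^2 (f_xx + f_yy) at z = x + iy."""
    x, y = z.real, z.imag
    f0 = f(z)
    fxx = (f(complex(x + h, y)) - 2 * f0 + f(complex(x - h, y))) / h ** 2
    fyy = (f(complex(x, y + h)) - 2 * f0 + f(complex(x, y - h))) / h ** 2
    return -y * y * (fxx + fyy)

S = 0.5 + 2j                      # illustrative spectral parameter
G = ((2.0, 1.0), (3.0, 2.0))      # illustrative element of SL(2, R)

def power(z):
    """The function (Im z)^s."""
    return z.imag ** S

def shifted(z):
    """The function (Im gz)^s, i.e. the previous one composed with G."""
    return mobius(G, z).imag ** S

if __name__ == "__main__":
    z0 = 0.3 + 1.5j
    lam = S * (1 - S)
    for f in (power, shifted):
        print(hyp_laplacian(f, z0), lam * f(z0))
```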

This is a $G$-invariant differential operator, i.e. for any function $f$ which is twice differentiable we have $\Delta(f \mid g) = (\Delta f) \mid g$. Here and hereafter the action on the right of $G$ on functions is of course given by

$$(f \mid g)(z) = f(gz).$$

There are many ways to analyze $\Delta$ on $\mathbb{H}$, depending on which coordinates are chosen. Using rectangular coordinates $z = x + iy$ and working by separation of variables, one considers functions $f$ on $\mathbb{H}$ which are periodic of period one in the variable $x$; then $f$ has a Fourier expansion

(15.9)
$$f(z) = \sum_{n\in\mathbb{Z}} f_n(y)e(nx)$$

with coefficients

(15.10)
$$f_n(y) = \int_0^1 f(x+iy)e(-nx)\,dx.$$

Applying the Laplace operator one gets

$$\Delta f(z) = y^2\sum_{n\in\mathbb{Z}}\big(4\pi^2n^2f_n(y) - f_n''(y)\big)e(nx).$$

Of special concern are the eigenfunctions of $\Delta$, the functions $f$ such that $\Delta f = \lambda f$ for some $\lambda \in \mathbb{C}$. It is best to write the eigenvalue $\lambda$ in terms of a parameter $s$ such that $\lambda = s(1-s)$. Because $\Delta$ is an elliptic differential operator, any such $f$ is automatically real-analytic on $\mathbb{H}$. Continuing with a function $f$ periodic in $x$, we see that it is an eigenfunction with eigenvalue $\lambda$ if and only if each $f_n$ is a solution to the ordinary differential equation

$$f'' + \Big(\frac{\lambda}{y^2} - 4\pi^2n^2\Big)f = 0.$$
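For $n \neq 0$ the decaying solution of this equation is $2y^{1/2}K_{s-1/2}(2\pi|n|y)$, and this can be checked numerically. The sketch below is only illustrative: it evaluates the $K$-Bessel function through the standard integral representation $K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,dt$ with a plain trapezoidal rule, and the names `bessel_k`, `whittaker_coeff` and `ode_residual`, as well as the step sizes, are assumptions made for the example.

```python
import cmath
import math

def bessel_k(nu, x, step=0.01):
    """K_nu(x) for x > 0 via the integral of exp(-x cosh t) cosh(nu t) over t >= 0."""
    t_max = math.acosh(1.0 + 40.0 / x)   # beyond this the integrand is negligible
    n_steps = int(t_max / step) + 1
    total = 0.5 * math.exp(-x)           # t = 0 endpoint with trapezoid weight 1/2
    for k in range(1, n_steps + 1):
        t = k * step
        total += math.exp(-x * math.cosh(t)) * cmath.cosh(nu * t)
    return step * total

def whittaker_coeff(s, n, y):
    """The function 2 y^(1/2) K_{s-1/2}(2 pi |n| y)."""
    return 2.0 * math.sqrt(y) * bessel_k(s - 0.5, 2.0 * math.pi * abs(n) * y)

def ode_residual(s, n, y, h=1e-3):
    """f'' + (lambda / y^2 - 4 pi^2 n^2) f, with f'' from central differences."""
    f = lambda t: whittaker_coeff(s, n, t)
    second = (f(y + h) - 2.0 * f(y) + f(y - h)) / h ** 2
    lam = s * (1 - s)
    return second + (lam / y ** 2 - 4.0 * math.pi ** 2 * n ** 2) * f(y)

if __name__ == "__main__":
    s, n, y = 0.5 + 3j, 2, 0.8
    print(abs(ode_residual(s, n, y)), abs(whittaker_coeff(s, n, y)))
```

The residual should be small compared with $4\pi^2 n^2 |f(y)|$; its size is governed by the finite-difference step rather than by the quadrature.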

For $n = 0$ all solutions are linear combinations of $y^s$ and $y^{1-s}$ (replace $y^{1-s}$ by $y^{1/2}\log y$ if $s = 1/2$). Then for $|n| \geq 1$, two linearly independent solutions to the equation are given by standard Bessel functions

$$f_n(y) = 2y^{1/2}K_{s-1/2}(2\pi|n|y)$$

and

$$f_n(y) = 2y^{1/2}I_{s-1/2}(2\pi|n|y)$$

which are distinguished by their asymptotic behavior as $y \to +\infty$. This leads to

LEMMA 15.1. Let $f : \mathbb{H} \to \mathbb{C}$ be an eigenfunction of $\Delta$ which is 1-periodic in $x$, with eigenvalue $\lambda = s(1-s)$, and suppose $f$ satisfies the growth condition

$$f(z) = o(e^{2\pi y}), \qquad \text{as } y \to +\infty.$$

Then there exist complex numbers $a$, $b$, and $a_n$ for $n \neq 0$, such that

$$f(z) = ay^s + by^{1-s} + y^{1/2}\sum_{n\neq 0} a_nK_{s-1/2}(2\pi|n|y)e(nx)$$

(if $s = 1/2$, replace $y^{1-s}$ by $y^{1/2}\log y$).

In particular, this explicit Fourier expansion implies that $f$ is in fact of at most polynomial growth; precisely, we have

$$f(z) \ll y^{\sigma} + y^{1-\sigma}$$

for $y \to +\infty$, where $\sigma = \mathrm{Re}(s)$ as usual.

15.3. Automorphic functions and forms.

We now consider a discrete subgroup $\Gamma$ of $G$ for which $\mathrm{Vol}(\Gamma\backslash\mathbb{H}) < +\infty$ (such subgroups are called Fuchsian groups of the first kind). Of particular importance are the Hecke groups $\Gamma_0(q)$ for $q \geq 1$. We define different spaces of $\Gamma$-periodic functions on $\mathbb{H}$:

$$\mathcal{A}(\Gamma\backslash\mathbb{H}) = \{f : \mathbb{H} \to \mathbb{C} \mid f \mid \gamma = f \text{ for all } \gamma \in \Gamma\}$$

(the space of automorphic functions),

$$L^2(\Gamma\backslash\mathbb{H}) = \Big\{f \in \mathcal{A}(\Gamma\backslash\mathbb{H}) \ \Big|\ \|f\|^2 = \int_F |f(z)|^2\,d\mu(z) < +\infty\Big\}$$

(the space of square-integrable automorphic functions),

$$\mathcal{A}_s(\Gamma\backslash\mathbb{H}) = \{f \in \mathcal{A}(\Gamma\backslash\mathbb{H}) \mid \Delta f = s(1-s)f\}$$

(the space of automorphic forms, or Maass forms, with eigenvalue $\lambda = s(1-s)$).

Let $\mathfrak{a}$ be any cusp of $\Gamma$ and $\sigma_{\mathfrak{a}}$ a scaling matrix as in (14.5). For any $f \in L^2(\Gamma\backslash\mathbb{H})$ the function $f_{\mathfrak{a}} = f \mid \sigma_{\mathfrak{a}}$ is thus 1-periodic in $x$, and has a Fourier expansion as in (15.9). We define further

$$L_0^2(\Gamma\backslash\mathbb{H}) = \{f \in L^2(\Gamma\backslash\mathbb{H}) \mid f_{\mathfrak{a},0} = 0 \text{ for any cusp } \mathfrak{a}\}$$

(the space of cuspidal automorphic functions), and the intersection $\mathcal{A}_s(\Gamma\backslash\mathbb{H}) \cap L_0^2(\Gamma\backslash\mathbb{H})$ (the space of cusp forms with eigenvalue $s(1-s)$).

Note, in particular, that any automorphic form in $L^2(\Gamma\backslash\mathbb{H})$ (or with moderate growth) admits a Fourier expansion at the cusp $\infty$ as in Lemma 15.1.

Our first goal will be to describe how to decompose elements of $L^2(\Gamma\backslash\mathbb{H})$ in "spectral" terms, similar to the Fourier decomposition (series or integrals) on $\mathbb{R}/\mathbb{Z}$ or $\mathbb{R}$. This will be done in terms of the eigenvalues of the laplacian acting on automorphic functions. We define the domain $D$ of $\Delta$ to be

$$D = \{f \in L^2(\Gamma\backslash\mathbb{H}) \mid f \text{ and } \Delta f \text{ are of class } C^\infty \text{ and bounded}\},$$

thereby turning $\Delta$ into an unbounded linear operator on $L^2(\Gamma\backslash\mathbb{H})$, defined on the dense domain $D$. Using Stokes's formula, it is shown that $\Delta$ is symmetric and positive, and by general functional analysis, it therefore has a self-adjoint extension to all of $L^2(\Gamma\backslash\mathbb{H})$. If the quotient $\Gamma\backslash\mathbb{H}$ were compact, then the general Hilbert space theory would imply that $\Delta$ has pure point spectrum, and that $L^2(\Gamma\backslash\mathbb{H})$ is spanned by eigenfunctions of $\Delta$, as is the case with $\mathbb{R}/\mathbb{Z}$. Due to the lack of compactness, things are more complicated because a continuous spectrum exists but, on the other hand, the presence of the cusps gives rise to Fourier expansions, which are powerful tools.

15.4. The continuous spectrum.

We start the spectral decomposition by constructing a space of "obvious" automorphic functions, namely those constructed by the familiar averaging technique. More precisely, for a test function $\psi : \mathbb{R}^+ \to \mathbb{C}$ which is smooth with compact support and a cusp $\mathfrak{a}$, we define the incomplete Eisenstein series

(15.11)
$$E_{\mathfrak{a}}(z,\psi) = \sum_{\gamma\in\Gamma_{\mathfrak{a}}\backslash\Gamma}\psi(\mathrm{Im}\,\sigma_{\mathfrak{a}}^{-1}\gamma z)$$

which is obviously in $L^2(\Gamma\backslash\mathbb{H})$ since $\psi$ has compact support. Of course, for the same reason, $E_{\mathfrak{a}}(\cdot,\psi)$ can never be an eigenfunction of $\Delta$. However, using Mellin inversion, we find that

(15.12)
$$E_{\mathfrak{a}}(z,\psi) = \frac{1}{2\pi i}\int_{(\sigma)} E_{\mathfrak{a}}(z,s)\hat{\psi}(s)\,ds$$

where $\sigma > 1$, $\hat{\psi}$ is the Mellin transform of $\psi$ and $E_{\mathfrak{a}}(\cdot,s)$ is the Eisenstein series

$$E_{\mathfrak{a}}(z,s) = \sum_{\gamma\in\Gamma_{\mathfrak{a}}\backslash\Gamma}(\mathrm{Im}\,\sigma_{\mathfrak{a}}^{-1}\gamma z)^s,$$

the series being absolutely convergent, uniformly on compact sets, for $\mathrm{Re}(s) > 1$. Because $\Delta(\mathrm{Im}\, z)^s = s(1-s)(\mathrm{Im}\, z)^s$, the Eisenstein series $E_{\mathfrak{a}}(z,s)$ is an "eigenfunction" of $\Delta$, but unfortunately it is not square integrable so it cannot be used directly for the spectral decomposition.

We define a new space $\mathcal{E}(\Gamma\backslash\mathbb{H})$ to be the closure in $L^2(\Gamma\backslash\mathbb{H})$ of the space spanned by the incomplete Eisenstein series. Clearly, $\Delta$ acts on $\mathcal{E}(\Gamma\backslash\mathbb{H})$. The solution to the spectral decomposition in this subspace lies in the meromorphic continuation of the Eisenstein series (with respect to the $s$-variable) to the whole complex plane, which was proved by Selberg. In the case of $\Gamma_0(q)$, the analytic continuation can be seen quite easily by computing explicitly its Fourier expansion at the cusp $\infty$.

Let $\mathfrak{a}$, $\mathfrak{b}$ be two (not necessarily distinct) cusps. Then $E_{\mathfrak{a}}(\cdot,s)$ has a Fourier expansion at the cusp $\mathfrak{b}$, which we write

$$E_{\mathfrak{a}}(\sigma_{\mathfrak{b}}z,s) = \delta_{\mathfrak{a}\mathfrak{b}}\,y^s + \varphi_{\mathfrak{a}\mathfrak{b}}(s)\,y^{1-s} + y^{1/2}\sum_{n\neq 0}\varphi_{\mathfrak{a}\mathfrak{b}}(n,s)K_{s-1/2}(2\pi|n|y)e(nx).$$

The square matrix of constant terms $\Phi(s) = (\varphi_{\mathfrak{a}\mathfrak{b}}(s))_{\mathfrak{a},\mathfrak{b}}$ is called the scattering matrix of $\Gamma$. In the case of $\Gamma = SL(2,\mathbb{Z})$, the Fourier coefficients are (we drop the subscripts $\mathfrak{a}$, $\mathfrak{b}$ since there is only one cusp)

$$\varphi(s) = \sqrt{\pi}\,\frac{\Gamma(s-1/2)}{\Gamma(s)}\,\frac{\zeta(2s-1)}{\zeta(2s)},$$

$$\varphi(n,s) = \pi^s\,\Gamma(s)^{-1}\zeta(2s)^{-1}|n|^{-1/2}\sum_{ab=|n|}\Big(\frac{a}{b}\Big)^{s-1/2},$$

and the meromorphic continuation can indeed be seen from that of the Riemann zeta function $\zeta(s)$. In this case the full Fourier expansion of $E(z,s)$ is

(15.13)
$$\theta(s)E(z,s) = \theta(s)y^s + \theta(1-s)y^{1-s} + 4\sqrt{y}\sum_{n\geq 1}\tau_{s-1/2}(n)K_{s-1/2}(2\pi ny)\cos(2\pi nx),$$

where $\theta(s) = \pi^{-s}\Gamma(s)\zeta(2s)$ and $\tau_\nu(n) = \sum_{ab=n}(a/b)^\nu$. For $\Gamma_0(q)$ in general, similar expressions hold in terms of Dirichlet $L$-functions.

Let $\mathrm{Re}(s) \geq 1/2$. It is easily seen that in this region $E(z,s)$ has only a simple pole at $s = 1$ with constant residue $V^{-1}$, where $V = \mathrm{Vol}(\Gamma\backslash\mathbb{H})$. For general fuchsian groups of the first kind, there might be other poles in the half-plane $\mathrm{Re}(s) \geq 1/2$, although only a finite number, all in the interval $1/2 < s \leq 1$, none on the line $\mathrm{Re}(s) = 1/2$. All residues are in $L^2(\Gamma\backslash\mathbb{H})$ and one denotes by $\mathcal{R}(\Gamma\backslash\mathbb{H})$ the subspace that they span; this is called the space of residual spectrum.


For $\Gamma_0(q)$, one further sees that $s = 1$ is the only pole of $E(z,s)$, with constant residue $V_q^{-1}$, where

$$V_q = \mathrm{Vol}(\Gamma_0(q)\backslash\mathbb{H}) = \frac{\pi}{3}\,q\prod_{p\mid q}(1 + p^{-1}).$$
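The volume formula is straightforward to evaluate: the factor $q\prod_{p\mid q}(1+p^{-1})$ is the index of $\Gamma_0(q)$ in $SL(2,\mathbb{Z})$, and $\pi/3$ is the volume for $q = 1$. A minimal sketch, with hypothetical helper names `prime_divisors`, `index_gamma0` and `volume_gamma0`:

```python
import math

def prime_divisors(q):
    """Distinct prime divisors of q, by trial division."""
    primes, p = [], 2
    while p * p <= q:
        if q % p == 0:
            primes.append(p)
            while q % p == 0:
                q //= p
        p += 1
    if q > 1:
        primes.append(q)
    return primes

def index_gamma0(q):
    """The integer q * prod_{p | q} (1 + 1/p)."""
    index = q
    for p in prime_divisors(q):
        index = index // p * (p + 1)
    return index

def volume_gamma0(q):
    """V_q = (pi / 3) * q * prod_{p | q} (1 + 1/p)."""
    return math.pi / 3.0 * index_gamma0(q)

if __name__ == "__main__":
    for q in (1, 2, 11, 12):
        print(q, index_gamma0(q), volume_gamma0(q))
```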

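Using the self-adjointness of $\Delta$, one can prove the functional equation of the Eisenstein series

$$E_{\mathfrak{a}}(z,s) = \sum_{\mathfrak{b}}\varphi_{\mathfrak{a}\mathfrak{b}}(s)E_{\mathfrak{b}}(z,1-s).$$

15.2. Let $f \in \mathcal{E}(\Gamma_0(q)\backslash\mathbb{H})$. Then $f$ has the expansion

(15.14)
$$f(z) = \frac{1}{V_q}\int_F f(z)\,d\mu(z) + \sum_{\mathfrak{a}}\frac{1}{4\pi}\int_{\mathbb{R}}\langle f, E_{\mathfrak{a}}(\cdot,\tfrac12+it)\rangle E_{\mathfrak{a}}(z,\tfrac12+it)\,dt$$

where $V_q = \mathrm{Vol}(\Gamma_0(q)\backslash\mathbb{H})$. This formula is valid in the $L^2$ sense, and the integrals are absolutely convergent pointwise and uniformly on compact sets if $f \in D$.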

The main point in the proof is that, while $E_{\mathfrak{a}}(\cdot,s)$ is not square-integrable (because of the term $\delta_{\mathfrak{a}\mathfrak{b}}y^s + \varphi_{\mathfrak{a}\mathfrak{b}}(s)y^{1-s}$ in the Fourier expansion at the cusp $\mathfrak{b}$), it is almost so when $\mathrm{Re}(s) = 1/2$. Indeed, one can show that $E_{\mathfrak{a}}(\sigma_{\mathfrak{b}}z,s) \ll \sqrt{y}$ when $y \to \infty$, and that the extra integration in $t$ in (15.14) saves a factor $\log y$, which is sufficient to turn the right-hand side into an $L^2$ function.

On the other hand, one might be tempted to say that (15.12) on the line $\mathrm{Re}(s) = 1/2$ gives the spectral decomposition, but it is not true because the coefficient $\hat{\psi}(s)$ is not the inner product $\langle E_{\mathfrak{a}}(\cdot,\psi), E_{\mathfrak{a}}(\cdot,s)\rangle$ as it should be (since $\langle E_{\mathfrak{a}}(\cdot,s), E_{\mathfrak{a}}(\cdot,s)\rangle$ diverges, multiplying by the Eisenstein series and integrating is not permitted in (15.12)). It is necessary to appeal to the functional equation to get the proper formula which (despite what (15.12) might suggest) shows that it is necessary to use the Eisenstein series at all cusps $\mathfrak{b}$ to get the expression of the incomplete Eisenstein series at the cusp $\mathfrak{a}$ (see [15], 22.1-22.3).

In the case of a general subgroup $\Gamma$, the Eisenstein series remain in many respects rather mysterious objects, but they are comparatively less so for $\Gamma_0(q)$, where the explicit Fourier expansion yields crucial estimates in a straightforward way, with the consequence that, for the applications in this book, the continuous spectrum will be easily manageable.

15.5. The discrete spectrum.

We now turn our attention towards the complement of the space $\mathcal{E}(\Gamma\backslash\mathbb{H})$.

LEMMA 15.3. The orthogonal complement of $\mathcal{E}(\Gamma\backslash\mathbb{H})$ is the space $L_0^2(\Gamma\backslash\mathbb{H})$ of cuspidal automorphic functions.

This follows by a simple computation, which we write down completely to give a feeling of the whole theory: we simply compute the scalar product $\langle f, E_{\mathfrak{a}}(\cdot,\psi)\rangle$ of an $L^2$-automorphic function against an incomplete Eisenstein series

$$\langle f, E_{\mathfrak{a}}(\cdot,\psi)\rangle = \int_F f(z)\overline{E_{\mathfrak{a}}(z,\psi)}\,d\mu(z) = \int_F f(z)\sum_{\gamma\in\Gamma_{\mathfrak{a}}\backslash\Gamma}\bar{\psi}(\mathrm{Im}\,\sigma_{\mathfrak{a}}^{-1}\gamma z)\,d\mu(z) = \sum_{\gamma\in\Gamma_{\mathfrak{a}}\backslash\Gamma}\int_F f(z)\bar{\psi}(\mathrm{Im}\,\sigma_{\mathfrak{a}}^{-1}\gamma z)\,d\mu(z)$$

which by automorphy of $f$ transforms to

$$= \int_0^1\int_0^{+\infty} f(\sigma_{\mathfrak{a}}z)\bar{\psi}(y)\,y^{-2}\,dy\,dx$$

since the sets $\sigma_{\mathfrak{a}}^{-1}\gamma F$ are disjoint and make a partition of the strip $0 < \mathrm{Re}(z) < 1$ (see (14.5) or think of $\mathfrak{a} = \infty$). This we can now write as

(15.15)
$$\langle f, E_{\mathfrak{a}}(\cdot,\psi)\rangle = \int_0^{+\infty} f_{\mathfrak{a},0}(y)\bar{\psi}(y)\,y^{-2}\,dy$$

where $f_{\mathfrak{a},0}(y)$ is the constant term in the Fourier expansion of $f$ at $\mathfrak{a}$, and the lemma is now clear.

Again, $\Delta$ acts on the space $L_0^2(\Gamma\backslash\mathbb{H})$. But now one can show that the resolvent of the laplacian is a compact operator on $L_0^2(\Gamma\backslash\mathbb{H})$. From this it readily follows that $\Delta$ has pure point spectrum on $L_0^2(\Gamma\backslash\mathbb{H})$ and thus this space is spanned by eigenfunctions of $\Delta$, which are called Maass cusp forms. Because $\Delta$ is self-adjoint, the eigenvalues of $\Delta$ must also be non-negative, and indeed they are positive since the eigenvalue 0 corresponds to bounded harmonic functions on $\mathbb{H}$, which are necessarily constant. In terms of $s$ this means that either $1/2 < s \leq 1$, or $\mathrm{Re}(s) = 1/2$. There can be only finitely many eigenvalues $\lambda = s(1-s)$ with $1/2 < s \leq 1$, and they are called exceptional eigenvalues for $\Gamma$. Although it is fairly easy to construct fuchsian groups having arbitrarily many exceptional eigenvalues, Selberg conjectured that they do not exist for congruence (arithmetic) groups.

SELBERG'S EIGENVALUE CONJECTURE. For any $q \geq 1$, we have

$$\lambda_1 \geq \frac{1}{4}$$

where $\lambda_1$ is the smallest eigenvalue of $\Delta$ acting on $L_0^2(\Gamma_0(q)\backslash\mathbb{H})$.

Further, Selberg proved

THEOREM 15.4. For any $q \geq 1$, we have $\lambda_1 \geq \frac{3}{16}$.

This has been improved, as mentioned in Section 5.11, the best result to date being $\lambda_1 \geq 975/4096$ by Kim and Sarnak [KiS].

As will appear clearly in Sections 15.8 and 15.9, exceptional eigenvalues are somewhat analogous to exceptional zeros of Dirichlet $L$-functions (see (5.51)), and too many of them can compromise analytic estimates. However, Theorem 15.4 is analogous to a zero-free strip and is thus much stronger than what is known in this case. The more fruitful analogy is with the Ramanujan-Petersson conjecture, and in representation theoretic terms there is a common formulation.

For general $\Gamma$, cusp forms are shrouded in mystery; indeed, it is by no means clear that they must exist (see [PS]), except in certain cases where this is ensured by symmetry (for instance, if the reflection $z \mapsto -\bar{z}$ descends to the quotient, odd functions must come from odd cusp forms since all Eisenstein series are even). For congruence subgroups, however, it was shown by Selberg using the trace formula that not only do cusp forms exist, but they are in a certain sense "dominating" over the Eisenstein series. Their presence is measured by the counting function

which is shown to satisfy the Weyl Law

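for $T \geq 2$, where the constant is absolute. Hence there are many eigenvalues, on average about $qT$ between $T$ and $T + 1$, but the proof is indirect and does not produce any of the corresponding cusp forms!

15.6. Spectral decomposition and automorphic kernels.

From the two previous sections we can deduce the full spectral decomposition theorem for the automorphic laplacian on $\Gamma_0(q)\backslash\mathbb{H}$.

THEOREM 15.5. Let $u_0, u_1, u_2, \ldots$ be an orthonormal basis of the residual and cuspidal spaces, i.e. $u_0 = \text{constant} = \mathrm{Vol}(\Gamma_0(q)\backslash\mathbb{H})^{-1/2} \in \mathcal{R}(\Gamma_0(q)\backslash\mathbb{H})$ with eigenvalue $\lambda_0 = 0$ and $u_j \in \mathcal{A}_{s_j}(\Gamma_0(q)\backslash\mathbb{H})$ a cusp form with eigenvalue $\lambda_j = s_j(1-s_j)$ for $j = 1, 2, \ldots$. Then any $f \in L^2(\Gamma_0(q)\backslash\mathbb{H})$ has the spectral decomposition

(15.16)
$$f(z) = \sum_{j\geq 0}\langle f, u_j\rangle u_j(z) + \sum_{\mathfrak{a}}\frac{1}{4\pi}\int_{\mathbb{R}}\langle f, E_{\mathfrak{a}}(\cdot,\tfrac12+it)\rangle E_{\mathfrak{a}}(z,\tfrac12+it)\,dt$$

valid in the $L^2$ sense, and also converging absolutely and uniformly on compact sets if $f \in D$. Moreover, the Parseval formula holds

(15.17)
$$\|f\|^2 = \sum_{j\geq 0}|\langle f, u_j\rangle|^2 + \sum_{\mathfrak{a}}\frac{1}{4\pi}\int_{\mathbb{R}}|\langle f, E_{\mathfrak{a}}(\cdot,\tfrac12+it)\rangle|^2\,dt.$$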

We will now apply Theorem 15.5 to get a kind of Poisson summation formula for quotients of the hyperbolic plane. For this we consider a kernel function in two variables (see footnote 1) $k : \mathbb{H} \times \mathbb{H} \to \mathbb{C}$, say continuous and compactly supported, which is a "point-pair invariant", i.e. $k(gz,gw) = k(z,w)$ for all $g \in G$. This means that $k(z,w)$ depends only on the distance $d(z,w)$. We express this property by saying that there exists another function in one variable $k : \mathbb{R}^+ \to \mathbb{C}$ such that $k(z,w) = k(u(z,w))$ where $u$ is the function in (15.5).

Footnote 1. The second variable is useful because there is no privileged basepoint.



Associated to $k$ is an invariant integral operator $L$ acting on functions on $\mathbb{H}$ by

$$Lf(z) = \int_{\mathbb{H}} f(w)k(z,w)\,d\mu(w)$$

and, from the assumption that $k$ depends only on the distance, it follows that $L$ is $G$-invariant. In particular, $L$ maps automorphic functions to automorphic functions, and indeed for $f \in \mathcal{A}(\Gamma_0(q)\backslash\mathbb{H})$ which is bounded we have another formula

$$Lf(z) = \int_F f(w)K(z,w)\,d\mu(w)$$

where the kernel $K$ is defined by

(15.18)
$$K(z,w) = \sum_{\gamma\in\Gamma_0(q)/\{\pm 1\}} k(z,\gamma w).$$

The new kernel $K$ is automorphic in both variables: indeed, in $w$ by the averaging construction, and in $z$ because $k$ is symmetric. Applying the spectral decomposition in the $z$-variable gives the expression

$$K(z,w) = \sum_{j\geq 0}\langle K(\cdot,w), u_j\rangle u_j(z) + \sum_{\mathfrak{a}}\frac{1}{4\pi}\int_{\mathbb{R}}\langle K(\cdot,w), E_{\mathfrak{a}}(\cdot,\tfrac12+it)\rangle E_{\mathfrak{a}}(z,\tfrac12+it)\,dt$$

where the coefficients $\langle K(\cdot,w), u_j\rangle$ (respectively $\langle K(\cdot,w), E_{\mathfrak{a}}(\cdot,\tfrac12+it)\rangle$) are again automorphic and can be also expanded similarly. However, the next lemma shows that they are simply proportional to the cusp forms $u_j$ (respectively the Eisenstein series).

LEMMA 15.6. Let $k : \mathbb{R}^+ \to \mathbb{C}$ be smooth and compactly supported, $k(z,w) = k(u(z,w))$ the associated kernel and $L$ the integral operator as above. If $f : \mathbb{H} \to \mathbb{C}$ (not necessarily automorphic) is any eigenfunction of the laplacian with eigenvalue $\lambda = s(1-s)$, $s = 1/2 + it$, $t \in \mathbb{C}$, then $f$ is also an eigenfunction of the integral operator $L$, with an eigenvalue which depends on $\lambda$ and $L$, but not on $f$. More precisely, we have

$$Lf = \Lambda f$$

where $\Lambda = h(t)$, the transform $k \mapsto h$ being obtained by the following steps:

$$q(v) = \int_v^{+\infty} k(u)(u-v)^{-1/2}\,du,$$

$$g(r) = 2q\Big(\Big(\sinh\frac{r}{2}\Big)^2\Big),$$

$$h(t) = \int_{\mathbb{R}} g(r)e^{irt}\,dr.$$
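The last two steps of the transform are elementary and can be illustrated numerically. The sketch below makes an arbitrary choice of test function, $g(r) = e^{-r^2/4}/(2\sqrt{\pi})$, whose Fourier transform in the normalization above is $h(t) = e^{-t^2}$; it also uses the fact that the substitution $v = (\sinh(r/2))^2$ gives $\sqrt{v+1} + \sqrt{v} = e^{r/2}$ for $r \geq 0$, so that $q$ can be recovered from $g$. The names `g`, `q_from_g` and `h_from_g` are hypothetical, and the first step (the Abel-type integral from $k$ to $q$) is not treated here.

```python
import math

def g(r):
    """Illustrative even test function g(r) = exp(-r^2 / 4) / (2 sqrt(pi))."""
    return math.exp(-r * r / 4.0) / (2.0 * math.sqrt(math.pi))

def q_from_g(v):
    """Solve g(r) = 2 q(sinh(r/2)^2) for q, using r = 2 log(sqrt(v + 1) + sqrt(v))."""
    return 0.5 * g(2.0 * math.log(math.sqrt(v + 1.0) + math.sqrt(v)))

def h_from_g(t, r_max=40.0, step=0.01):
    """h(t) = integral of g(r) exp(i r t) dr, by the trapezoidal rule.

    Since g is even, the integral equals 2 * integral over r >= 0 of g(r) cos(r t).
    """
    n_steps = int(r_max / step)
    total = 0.5 * g(0.0)
    for j in range(1, n_steps + 1):
        r = j * step
        total += g(r) * math.cos(r * t)
    return 2.0 * step * total

if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0):
        print(t, h_from_g(t), math.exp(-t * t))
```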



For a proof, see e.g. [15, Th. 1.16]. The transform above is called the Harish-Chandra/Selberg transform for $SL(2,\mathbb{R})$. It can be inverted by

$$g(r) = \frac{1}{2\pi}\int_{\mathbb{R}} h(t)e^{-irt}\,dt,$$

$$q(v) = \frac{1}{2}\,g\big(2\log(\sqrt{v+1} + \sqrt{v})\big),$$

$$k(u) = -\frac{1}{\pi}\int_u^{+\infty}(v-u)^{-1/2}\,dq(v),$$

sufficient conditions on $h$ being

(15.19)
$$\begin{cases} h(t) = h(-t), \\ h \text{ is holomorphic in } |\mathrm{Im}(t)| \leq \tfrac12 + \delta, \\ h(t) \ll (|t|+1)^{-2-\delta}, \end{cases}$$

for some $\delta > 0$. These conditions are in fact sufficient for the analysis of the automorphic kernel $K(z,w)$, i.e. the restriction that $k$ has compact support can be relaxed. We deduce from this the spectral decomposition which was our goal.

THEOREM 15.7. Let $h$ be a function satisfying the conditions (15.19), $k$ the Harish-Chandra/Selberg transform of $h$ and $K$ the automorphic kernel on $\Gamma_0(q)$ associated to $k$. Then the following expansion holds

$$K(z,w) = \sum_{\gamma\in\Gamma_0(q)/\{\pm 1\}} k(u(z,\gamma w)) = \sum_{j\geq 0} h(t_j)u_j(z)\overline{u_j(w)} + \sum_{\mathfrak{a}}\frac{1}{4\pi}\int_{\mathbb{R}} h(t)E_{\mathfrak{a}}(z,\tfrac12+it)\overline{E_{\mathfrak{a}}(w,\tfrac12+it)}\,dt$$

where the convergence is absolute and uniform on compact sets.

It is important to have some control on the growth of the eigenvalues and eigenfunctions in this formula. The following is often sufficient for first estimations.

PROPOSITION 15.8. Let $T \geq 1$ and $z \in \mathbb{H}$. We have

$$\sum_{|t_j|<T}|u_j(z)|^2 + \sum_{\mathfrak{a}}\frac{1}{4\pi}\int_{-T}^{T}|E_{\mathfrak{a}}(z,\tfrac12+it)|^2\,dt \ll T^2 + Ty(z)$$

where the implied constant depends on $\Gamma$ alone and

$$y(z) = \max_{\mathfrak{a}}\max_{\gamma\in\Gamma}\mathrm{Im}(\sigma_{\mathfrak{a}}^{-1}\gamma z).$$

For the proof, see [15], Section 7.2.

15.7. The Selberg trace formula.

In Section 15.6, we have obtained the spectral expansion of an automorphic kernel, and in doing so, we have seen that it is useful to consider the invariant integral operator $L$ associated to it, namely

$$Lf(z) = \int_F f(w)K(z,w)\,d\mu(w)$$



where $K$ is given by (15.18). Another formula, which is also a generalization of the Poisson summation formula, arises when one computes the trace of $L$, using the spectral expansion of $K$ on the one hand, and its definition on the other hand: this is the celebrated Selberg trace formula.

Of course, the correct definition of the trace of an operator on an infinite dimensional space is already a problem, but here we take a pragmatic approach and define the trace of $L$ (or of $K$) to be the integral of the kernel on the diagonal (which is well-defined because our kernels are always at least continuous):

(15.20)
$$\mathrm{Tr}\, K = \int_F K(z,z)\,d\mu(z).$$

If there were no Eisenstein series in Theorem 15.7, the spectral expansion would yield immediately

$$\mathrm{Tr}\, K = \sum_{j\geq 0} h(t_j)$$

since $(u_j)$ is an orthonormal basis, and we would obtain the pre-trace formula

$$\sum_{j} h(t_j) = \sum_{\gamma\in\Gamma}\int_F k(u(\gamma z,z))\,d\mu(z).$$

The right-hand side is still not very explicit. To go further, Selberg partitioned the sum over $\gamma$ in such a way that the partial sums are computable. This is done by dividing the group into its conjugacy classes (that those are non-trivial is a feature of the non-commutativity of $\Gamma$). For one conjugacy class $C$, say of $\gamma \in \Gamma$, we have

$$C = \{\tau^{-1}\gamma\tau \mid \tau \in \Gamma\}$$

which is in bijection with $Z(\gamma)\backslash\Gamma$, where $Z(\gamma)$ is the centralizer of $\gamma$ in $\Gamma$,

$$Z(\gamma) = \{\tau \in \Gamma \mid \gamma\tau = \tau\gamma\}$$

so the contribution of the class $C$ to the sum is equal to

$$\sum_{\tau\in Z(\gamma)\backslash\Gamma}\int_F k(u(\tau^{-1}\gamma\tau z,z))\,d\mu(z) = \sum_{\tau\in Z(\gamma)\backslash\Gamma}\int_{\tau F} k(u(\gamma z,z))\,d\mu(z)$$

which, by invariance, gives the contribution

$$\mathrm{Tr}_C\, K = \int_{Z(\gamma)\backslash\mathbb{H}} k(u(\gamma z,z))\,d\mu(z)$$

to the trace of $K$. The advantage we gain in performing this summation over $\gamma \in C$ is that the above expression for $\mathrm{Tr}_C\, K$ now depends on the conjugacy class of $\gamma$ in $SL(2,\mathbb{R})$, not in $\Gamma$. Hence in further computations we can replace $\gamma$ by its most standard representative $\tau^{-1}\gamma\tau$ for suitable $\tau$ in $G$. Such classes have geometric meaning, and the fact that $\gamma$ is an element of $\Gamma$ will translate this meaning into information about geometric data of the quotient space $\Gamma\backslash\mathbb{H}$.

We first describe the classification of the conjugacy classes in $G = SL(2,\mathbb{R})$ using the geometric action of $G$ on $\mathbb{H}$, namely by considering the fixed points of the isometry of $\mathbb{H}$ induced by an element $g \in G$. This gives information about conjugacy classes, because if $z$ is fixed by $g$, then $\tau z$ is fixed by the conjugate $\tau^{-1}g\tau$ of $g$. Because the group is very symmetric, the fixed points will be enough to classify $g$ up to conjugacy.


Let $g \in G$, $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $g \neq \pm 1$, acting non-trivially on $\mathbb{H} \cup \mathbb{R} \cup \{\infty\}$. The equation $gz = z$ is a quadratic equation $cz^2 + (d-a)z - b = 0$, with discriminant $\Delta = (d-a)^2 + 4bc = (\mathrm{Tr}\, g)^2 - 4$, and there are three cases.

• First case: $|\mathrm{Tr}\, g| > 2$ ($g$ is called hyperbolic). Then $\Delta > 0$, and there are two distinct fixed points in $\mathbb{R}$ (the case $c = 0$, $a \neq 1$ also belongs to this case, but one of the fixed points is $\infty$). There exists $\tau \in G$ which sends one of the fixed points to 0 and the other to $\infty$; then by the observation above the conjugate $\tau^{-1}g\tau$ fixes 0 and $\infty$, and clearly (changing $\tau$ into $\tau^{-1}$ if necessary), we find that $g$ is conjugate to a unique element of the form

(15.21)
$$a(p) = \pm\begin{pmatrix} p^{1/2} & 0 \\ 0 & p^{-1/2} \end{pmatrix}$$

for some $p > 1$, which acts geometrically as a dilation $z \mapsto pz$. We call $p$ the norm of $g$ and write $Ng = p$; the norm classifies exactly a hyperbolic conjugacy class. Notice that $p$ can be recovered from $g$ by solving the equation $z + z^{-1} = |\mathrm{Tr}\, g|$ and taking the square of the solution which is larger than 1.

Associated to a hyperbolic motion is the geodesic $\ell_g$ in $\mathbb{H}$ which is the half-circle perpendicular to $\mathbb{R}$ at the two fixed points, or in the case $c = 0$, the vertical line from the real fixed point to $\infty$. This geodesic is preserved by $g$ (this is obvious for $a(p)$, and follows by conjugation in the general case). Note that the norm of $g$ can be recovered by $\log Ng = d(z,gz)$ for $z \in \ell_g$ ($g$ acts as a translation on the geodesic).

Conversely, to every geodesic $\ell$ in $\mathbb{H}$, one can associate the isotropy subgroup $\Gamma_\ell$ in $G$, containing elements which preserve $\ell$. This is generated by $\pm 1$, an element of order 2 which interchanges the end-points of $\ell$ in $\mathbb{R} \cup \{\infty\}$, and the subgroup of those elements fixing the end-points (which is isomorphic to the multiplicative group of positive real numbers, and so to $\mathbb{R}$ by the logarithm map). All these facts follow by conjugation to the case when the geodesic is the imaginary axis, and $\Gamma_\ell$ is generated by the involution $z \mapsto -z^{-1}$ and the group of elements $a(p)$ above.

Let again $g$ be hyperbolic. The centralizer $Z(g)$ of $g$ in $G$ is a subgroup of $\Gamma_{\ell_g}$; one checks that the involution is not in it, but the other subgroup is, so $Z(g)$ is isomorphic to $\mathbb{R}$ (times $\pm 1$, of course).

• Second case: $|\mathrm{Tr}\, g| = 2$ ($g$ is called a parabolic motion). Then $\Delta = 0$, so $g$ has a unique fixed point $\mathfrak{a}$, $\mathfrak{a} \in \mathbb{R}$ if $c \neq 0$, and $\mathfrak{a} = \infty$ if $c = 0$. As before, some $\tau$ sends $\mathfrak{a}$ to $\infty$ and one sees this way that $g$ is conjugate to a unique element

(15.22)
$$n(x) = \pm\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$$

for some $x \in \mathbb{R}$, which acts as a horizontal translation $z \mapsto z + x$. The parameter $x$ classifies the parabolic conjugacy classes. The group $N = \{n(x) \mid x \in \mathbb{R}\}$ is the isotropy group of $\infty$ in $G$, and it is the centralizer in $G$ of any of its elements. By conjugation, the centralizer of a parabolic element is thus isomorphic to $\mathbb{R}$.

• Third case: $|\mathrm{Tr}\, g| < 2$ ($g$ is called an elliptic motion). Then $\Delta < 0$, so the quadratic equation has two conjugate complex roots, one of which, say $z_g$, is in $\mathbb{H}$. Moving $z_g$ to $i$ by an element $\tau \in G$, a short computation reveals that $g$ is conjugate to a unique element

(15.23)
$$r(\theta) = \pm\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$$

with $\theta \in \mathbb{R}/2\pi\mathbb{Z}$, which acts by hyperbolic rotation of angle $\theta$ around $i$. The group $K = \{r(\theta) \mid \theta \in \mathbb{R}/2\pi\mathbb{Z}\}$, isomorphic to $\mathbb{R}/2\pi\mathbb{Z}$, is the centralizer of any of its (non-trivial) elements, and the centralizer of any elliptic element is isomorphic to $K$.

Now consider a discrete subgroup $\Gamma$ of $G$ and an element $\gamma \in \Gamma$. Let $Z_\Gamma(\gamma)$ be the centralizer of $\gamma$ in $\Gamma$; it is of course a subgroup of the centralizer $Z(\gamma)$ of $\gamma$ in $G$, and it is a discrete subgroup. When $\gamma$ is hyperbolic or parabolic and $Z(\gamma)$ is isomorphic to $\mathbb{R}$, this implies that $Z_\Gamma(\gamma)$ is infinite cyclic, and when $\gamma$ is elliptic, this implies that $Z_\Gamma(\gamma)$ is finite cyclic. In all cases, we say that $\gamma$ (or its conjugacy class in $\Gamma$) is primitive when $\gamma$ is itself a generator of the centralizer $Z_\Gamma(\gamma)$. It is clear that any element $\gamma$ is of the form $\pm\gamma_0^m$ for some integer $m \geq 0$ and a unique primitive element $\gamma_0$. By conjugating the generator $\gamma_0$ to one of the standard forms described above, we see that the action of $Z_\Gamma(\gamma)$ on $\mathbb{H}$ has the following simple fundamental domain, depending on the type of $\gamma$:

(1) If $\gamma$ is hyperbolic, $\gamma_0$ is conjugate to (15.21) for $p = N\gamma_0$, acting by $z \mapsto pz$, and a fundamental domain is the horizontal strip

$$Z_\Gamma(\gamma)\backslash\mathbb{H} = \{z = x + iy \in \mathbb{H} \mid 1 < y < p\}.$$

(2) If $\gamma$ is parabolic, $\gamma_0$ is conjugate to (15.22) for some $x_0 \neq 0$, acting by $z \mapsto z + x_0$, and a fundamental domain is the vertical strip

$$Z_\Gamma(\gamma)\backslash\mathbb{H} = \{z = x + iy \mid 0 < x < |x_0|\}.$$

(3) If $\gamma$ is elliptic, $\gamma_0$ is conjugate to (15.23) for some $\theta$. Let $w$ be the fixed point of $\gamma_0$ (the center of the rotation); a fundamental domain for $Z_\Gamma(\gamma)$ is obtained by taking the domain between two geodesics going through $w$ with an angle equal to $\theta$ (by discreteness, notice that $\theta$ must be of the form $2\pi\ell^{-1}$ for some integer $\ell \geq 1$).

In the special case of $\Gamma = SL(2, \mathbb{Z})$, the parabolic and elliptic conjugacy classes are very easy to describe. First, the fixed point of a parabolic element must be in $\mathbb{Q} \cup \{\infty\}$. But all such points are $\Gamma$-equivalent, by Bézout's theorem, and since $n_0 = n(1)$ clearly generates the isotropy group of $\infty$ in $SL(2, \mathbb{Z})$, we find that its conjugacy class is the only parabolic conjugacy class in $SL(2, \mathbb{Z})$. The centralizer $Z_\Gamma(n_0)$ is $\{n(x) \mid x \in \mathbb{Z}\}$, and a fundamental domain for $Z_\Gamma(n_0)$ is the vertical strip $\{z = x + iy \mid 0 < x < 1\}$. By simple computations, one finds also that $SL(2, \mathbb{Z})$ contains only two primitive conjugacy classes of elliptic elements, with representatives

$$\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}$$

of order 2 (as an isometry, but 4 as a matrix), fixing $i$, and of order 3 (as an isometry, but 6 as a matrix), fixing $j$, respectively.

EXERCISE 1. Find the primitive parabolic conjugacy classes in $\Gamma_0(q)$, and show that they correspond to the set of cusps as described in Section 14.1.
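The classification by trace and the statements about $SL(2,\mathbb{Z})$ can be checked by direct computation. The following is only an illustrative sketch: the helper names (`matmul`, `classify`, `matrix_order`, `cusp_matrix`) are hypothetical, and the second elliptic representative is assumed to be the matrix with rows (0, -1) and (1, 1), which fixes the root of $z^2 + z + 1 = 0$ lying in the upper half-plane.

```python
from math import gcd

def matmul(A, B):
    """Product of two 2x2 integer matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def classify(g):
    """Type of a non-central element of SL(2, Z), read off from |Tr g|."""
    (a, b), (c, d) = g
    assert a*d - b*c == 1, "determinant must be 1"
    t = abs(a + d)
    if t > 2:
        return "hyperbolic"
    if t == 2:
        return "parabolic"
    return "elliptic"

def matrix_order(g, limit=24):
    """Smallest n >= 1 with g^n = identity, or None if no n <= limit works."""
    identity = ((1, 0), (0, 1))
    power = identity
    for n in range(1, limit + 1):
        power = matmul(power, g)
        if power == identity:
            return n
    return None

def cusp_matrix(a, c):
    """Some matrix in SL(2, Z) sending infinity to a/c, for coprime a, c.

    Uses the extended Euclidean algorithm (Bezout): a*x0 + c*y0 = 1.
    """
    assert gcd(a, c) == 1
    old_r, r, old_x, x, old_y, y = a, c, 1, 0, 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q*r
        old_x, x = x, old_x - q*x
        old_y, y = y, old_y - q*y
    # here a*old_x + c*old_y = old_r = +1 or -1; normalise the sign
    x0, y0 = old_x*old_r, old_y*old_r
    return ((a, -y0), (c, x0))

if __name__ == "__main__":
    S = ((0, -1), (1, 0))   # fixes i
    U = ((0, -1), (1, 1))   # assumed representative of the order-3 class
    T = ((1, 1), (0, 1))    # n(1)
    for name, g in (("S", S), ("U", U), ("T", T)):
        print(name, classify(g), matrix_order(g))
    print(cusp_matrix(3, 7))
```

The orders printed for `S` and `U` are orders as matrices; as isometries they are halved because $-1$ acts trivially. The function `cusp_matrix` makes the appeal to Bézout's theorem concrete: its first column is $(a, c)$, so it maps $\infty$ to $a/c$.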

We come back to the notation of the introduction to this section, with $\Gamma$ a discrete subgroup such that the quotient $\Gamma\backslash\mathbb{H}$ has finite volume, and has cusps (finitely many). The lack of compactness creates some analytic difficulties, because it turns out that the kernel is not integrable on the diagonal, so the trace, as we have defined it, is infinite. This problem is solved by considering truncated fundamental domains $F(Y)$ which tend to $F$ for $Y \to +\infty$, in the sense that $\int_{F(Y)} f(z)\,d\mu(z) \to \int_F f(z)\,d\mu(z)$ for any integrable $f$. The kernel $K$ restricted to $F(Y)$ is now integrable on the diagonal, and one can compute the truncated trace

$$\operatorname{Tr}^Y K = \int_{F(Y)} K(z, z)\,d\mu(z)$$

using the spectral expansion for the restriction on the one hand, and the decomposition into conjugacy classes on the other hand. On the spectral side, this gives for $Y \to +\infty$ an expression of the form $\operatorname{Tr}^Y K = A \log Y + T_1 + o(1)$, and on the geometric side another expression $\operatorname{Tr}^Y K = A \log Y + T_2 + o(1)$, for the same $A > 0$. Hence we have the tautology $A = A$, and the non-trivial formula $T_1 = T_2$, which is the trace formula. A small analytical miracle occurs when doing the exact computations: the complicated Harish-Chandra/Selberg transform disappears from the picture, and all that is left is the function $h$ on one side, and its Fourier transform $\hat{h}$ on the other side.

We now give indications on the shape of the various parts, and state the complete formula in the special case $\Gamma = SL(2, \mathbb{Z})$. On the spectral side, the cusp forms and residual spectrum $u_j$ contribute, as in the case of a compact quotient,

$$\sum_j h(t_j) \tag{15.24}$$

and the Eisenstein series are shown to contribute

$$\frac{h(0)}{4}\operatorname{Tr}\Phi(\tfrac12) - \frac{1}{4\pi}\int_{-\infty}^{+\infty} h(t)\,\frac{\varphi'}{\varphi}(\tfrac12 + it)\,dt \tag{15.25}$$

where $\Phi(s)$ is the scattering matrix formed by the first Fourier coefficients of the Eisenstein series and $\varphi(s)$ is its determinant (see Section 15.4).

The geometric computations are done independently for each type of (primitive) conjugacy classes. The class of the identity motion has a contribution given by

$$\frac{\operatorname{Vol}(\Gamma\backslash\mathbb{H})}{4\pi}\int_{-\infty}^{+\infty} t\,h(t)\tanh(\pi t)\,dt. \tag{15.26}$$

A hyperbolic conjugacy class $P$ with associated primitive class $P_0$ contributes

$$\frac{1}{2\pi}\,\frac{\log NP_0}{NP^{1/2} - NP^{-1/2}}\;\hat{h}\Big(\frac{1}{2\pi}\log NP\Big). \tag{15.27}$$

A primitive elliptic class R of order m (as an isometry) contributes

(15.28)

rrfl) " ( 2m sinL m O
-!1+=

e-27rt£(m

1-e- 2" 1

h(t)dt.

-CX)

Finally, all primitive parabolic conjugacy classes contribute the same amount, which is

$$\frac{h(0)}{4} - \frac{1}{2\pi}\Big\{\hat{h}(0)\log 2 + \int_{-\infty}^{+\infty}\psi(1 + it)h(t)\,dt\Big\} \tag{15.29}$$

where $\psi(s) = \frac{\Gamma'}{\Gamma}(s)$ is the logarithmic derivative of the Gamma function. For $SL(2, \mathbb{Z})$, after some rearranging, and taking into account the description of the cusps and of the scattering matrix, we obtain

THEOREM 15.9. Let $h$ be a function satisfying the conditions (15.19), $\hat{h}$ its Fourier transform. Let
(15.30)

L:h(tj)

1 +-

47r

j

1 ' lR

-

<~

+ it)h(t)dt =

rh(t)(_!_tanh(7rt)_l_log2- __!_1/;(1 12 21r 21r

I~~.

+ __!__ 2"

L p

l~g N Po 1 NP'i-NP-2

+it))dt

h(__!__ log N P) 2

"

hyperbolic

15.8. Hyperbolic lattice point problems.

One of the most direct applications of Theorem 15.7, similar in spirit to the examples given for the Poisson summation formula and the Voronoi formula in Chapter 4, is to the study of hyperbolic lattice point problems. Let $D \subset \mathbb{H}$ be an open subset. The lattice point problem for $D$ is the problem of counting the number of translates of a given $z_0 \in \mathbb{H}$ by elements of a discrete subgroup $\Gamma \subset G$ which are in $D$. The classical problems of Gauss and Dirichlet already mentioned in Chapter 4 are exactly analogous, taking the action of the lattice $\mathbb{Z}^2$ on $\mathbb{R}^2$ by translation and $D$ the points in a circle or under a hyperbola. As in these cases, analytic number theory can have its say only if one considers a sequence of subsets getting larger, and sufficiently regular. We take a geodesic ball of radius tending to infinity, in the more convenient form $B(w, R) = \{z \in \mathbb{H} \mid 4u(w, z) + 2 \le R\}$ and look for good asymptotics for the counting function

$$N(R) = |\{\gamma \in \Gamma \mid \gamma w \in B(w, R)\}|.$$

As in the euclidean case, the expected value of $N(R)$ is the area of the geodesic ball, but whereas Gauss's classical argument (packing with unit squares) establishes the corresponding result easily with a fairly good error term $R^{1/2}$ in the euclidean case, this simple argument is very inefficient in the context of hyperbolic geometry. Indeed, the error term is proportional to the length of the boundary of the disc, which is small compared with the area in euclidean geometry, but is of the same order of magnitude as the area in the hyperbolic case. This fact follows from the isoperimetric inequality: if the boundary $\partial D$ of $D$ is smooth, $A = \operatorname{Vol}(D)$ and $L = \operatorname{length}(\partial D)$, then we have

$$4\pi A + A^2 \le L^2,$$

hence $L \ge A$.
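As a check, one can evaluate both sides of the isoperimetric inequality for a geodesic disc of hyperbolic radius $r$. The area and circumference formulas used below are the standard ones for curvature $-1$; they are quoted as known facts and are not taken from the text above.

```latex
A = 4\pi\sinh^2(r/2) = 2\pi(\cosh r - 1), \qquad L = 2\pi\sinh r,
\\[4pt]
4\pi A + A^2 = 4\pi^2(\cosh r - 1)\bigl(2 + (\cosh r - 1)\bigr)
             = 4\pi^2(\cosh^2 r - 1) = 4\pi^2\sinh^2 r = L^2,
\\[4pt]
\frac{A}{L} = \frac{\cosh r - 1}{\sinh r} = \tanh(r/2) \longrightarrow 1 \quad (r \to +\infty).
```

So geodesic discs give equality, and for large radius the boundary length is asymptotically as large as the area, which is why the packing argument loses everything in the hyperbolic setting.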

One can at least prove elementary upper bounds which are quite useful (they are used in the proof of Proposition 15.8 and in Chapter 22).

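PROPOSITION 15.10. Let $\Gamma$ be a discrete subgroup of $SL(2, \mathbb{R})$ such that the volume $\operatorname{Vol}(\Gamma\backslash\mathbb{H})$ is finite and let $\mathfrak{a}$ be a cusp of $\Gamma$.

(1) For any $z \in \mathbb{H}$ we have

$$|\{\gamma \in \Gamma_{\mathfrak{a}}\backslash\Gamma \mid \operatorname{Im}\sigma_{\mathfrak{a}}^{-1}\gamma z > Y\}| < 1 + \frac{10}{c_{\mathfrak{a}}Y}.$$

(2) For any $z, w \in \mathbb{H}$, and $\delta > 0$, we have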

lh E o-; 1 fo- I u(rz,w) < li}l < ~(Im(w) +c; 1 ) + c ~ m+t) +1 w 0

0

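with an absolute implied constant. Here we let

$$c_{\mathfrak{a}} = \min\Big\{c > 0 \;\Big|\; \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \sigma_{\mathfrak{a}}^{-1}\Gamma\sigma_{\mathfrak{a}} \text{ for some } a, b, d\Big\}.$$

See [I5], Lemma 2.11 for the proof. Applying Theorem 15.7 for suitable kernels, one can get an asymptotic formula.

THEOREM 15.11. Let $\Gamma$ be a discrete subgroup of $SL(2, \mathbb{R})$ with $\operatorname{Vol}(\Gamma\backslash\mathbb{H}) < +\infty$. Let $w \in \mathbb{H}$. We have

$$N(R) = \frac{2\pi R}{\operatorname{Vol}(\Gamma\backslash\mathbb{H})} + 2\sqrt{\pi}\sum_{1/2 < s_j < 1}\frac{\Gamma(s_j - \frac12)}{\Gamma(s_j + 1)}\,|u_j(w)|^2 R^{s_j} + O(R^{2/3})$$

with an implied constant depending on $\Gamma$ and $w$ only.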
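The main term can be anticipated by an area computation. What follows is a heuristic sketch, not part of the proof. It assumes the standard relation $\cosh d(z, w) = 1 + 2u(z, w)$ between $u$ and the hyperbolic distance, the usual area formula for a hyperbolic disc, and that $-1 \in \Gamma$ (as for $\Gamma_0(q)$), so that every orbit point is counted twice.

```latex
u(w,z) = \sinh^2\!\bigl(\tfrac12 d(w,z)\bigr), \qquad
\operatorname{Vol}\{z : d(w,z) \le r\} = 4\pi\sinh^2(r/2),
\\[4pt]
\operatorname{Vol} B(w,R) = 4\pi\cdot\frac{R-2}{4} = \pi(R-2),
\\[4pt]
N(R) \approx \frac{2\operatorname{Vol} B(w,R)}{\operatorname{Vol}(\Gamma\backslash\mathbb{H})}
     = \frac{2\pi R}{\operatorname{Vol}(\Gamma\backslash\mathbb{H})} + O(1).
```

For the modular group, whose fundamental domain has area $\pi/3$, this heuristic predicts a main term $6R$.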

SKETCH OF PROOF. We consider only $\Gamma = \Gamma_0(q)$ as before. If $k(u) \ge 0$ is any function with $k(u) \ge 1$ for $u \le (R - 2)/4$, and if the corresponding automorphic kernel $K(z, w)$ satisfies the assumptions of Theorem 15.7, we have by positivity

$$N(R) \le \sum_{\gamma \in \Gamma} k(u(\gamma w, w)) = 2K(w, w). \tag{15.31}$$

By Theorem 15.7, we have

$$K(w, w) = \sum_j h(t_j)|u_j(w)|^2 + \sum_{\mathfrak{a}}\frac{1}{4\pi}\int_{-\infty}^{+\infty} h(t)\,|E_{\mathfrak{a}}(w, \tfrac12 + it)|^2\,dt \tag{15.32}$$

with notation as in the theorem. We take $k(u) = 1$ for $u \le (R - 2)/4$ and $k(u) = 0$ for $u \ge (R + S - 2)/4$, and $k$ linear between. We estimate $h(t)$ using Lemma 15.6, for a suitable function $f$ on $\mathbb{H}$ which is an eigenfunction of $\Delta$ with $\lambda = \frac14 + t^2$. For example, $f(z) = y^s$ yields

$$h(t) = \int_{\mathbb{H}} k(i, z)\,y^s\,d\mu(z).$$

One finds first that

$$h(t) = \sqrt{\pi}\,\frac{\Gamma(s - \frac12)}{\Gamma(s + 1)}\,R^s + O(R^{1/2} + S)$$

for $\frac12 < s \le 1$, where the implied constant depends on $s$. The constant eigenvalue for $j = 0$ yields the main term $\pi R\,(\operatorname{Vol}(\Gamma\backslash\mathbb{H}))^{-1}$, and the exceptional eigenvalues (if any) with $0 < \lambda_j < \frac14$ yield the second term, up to the error term $O(R^{1/2} + S)$.

Let $T = RS^{-1}$. For $\operatorname{Re}(s) = \frac12$, integrating by parts yields

$$h(t) \ll |s|^{-5/2}R^{1/2}\bigl(\min(|s|, T) + \log R\bigr),$$

with an absolute implied constant. Hence we estimate the Eisenstein series contribution to (15.32) at a given cusp a by

1

h(t)IE.(w,! + itWdt

«

1R

1 2

+ R 1

IE(w 1 +it)IZ 2 ((1 + t) +log R) " ( ' 2 )'0 dt -VT 1+t

v'T

J 1

R 112

lti>v'T

«

IE " (w ' 2 + it)l dt 5 1

(T +log R)

(1

2

+ t)

(RT)l/2

by integration by parts using Proposition 15.8. The same bound holds for the contribution of the non-exceptional discrete spectrum by partial summation from Proposition 15.8. Using these estimates with $S = R^{2/3}$, (15.31) and (15.32) give

$$N(R) \le \frac{2\pi R}{\operatorname{Vol}(\Gamma\backslash\mathbb{H})} + 2\sqrt{\pi}\sum_{1/2 < s_j < 1}\frac{\Gamma(s_j - \frac12)}{\Gamma(s_j + 1)}\,|u_j(w)|^2 R^{s_j} + O(R^{2/3}).$$

Similarly with $k(u) = 1$ for $u \le (R - S - 2)/4$ and $k(u) = 0$ for $u \ge (R - 2)/4$, we get the lower bound and Theorem 15.11 is proved. □

REMARK. Note that for $\Gamma_0(q)$, the lower bound for $\lambda_1$ given by (5.88) bounds $s_1$ from above, so the contribution of exceptional eigenvalues is smaller than the error term.

COROLLARY 15.12. (1) Let $N_1(X)$ be the number of integer solutions to the equation $ad - bc = 1$ with $a^2 + b^2 + c^2 + d^2 \le X$. We have

$$N_1(X) = 6X + O(X^{2/3}).$$

(2) Let $r(n)$ be the number of representations of $n$ as a sum of two squares. We have

$$\sum_{n \le X} r(n)r(n + 1) = 8X + O(X^{2/3}).$$

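PROOF. Note first that for $w = i$, we have $4u(\gamma i, i) + 2 = a^2 + b^2 + c^2 + d^2$ for any $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. So (1) follows immediately from Theorem 15.11 for $\Gamma = SL(2, \mathbb{Z})$ and $w = i$, since $\operatorname{Vol}(SL(2, \mathbb{Z})\backslash\mathbb{H}) = \pi/3$ and $\lambda_1 = 91.14\ldots > \frac14$ for the modular group. For (2), consider the group

$$\Gamma = \Big\{\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z}) \;\Big|\; a \equiv d \ (\mathrm{mod}\ 2),\ c \equiv b \ (\mathrm{mod}\ 2)\Big\} = \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}^{-1}\Gamma_0(2)\begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}.$$

We have $a + d = 2k$, $a - d = 2\ell$, $b + c = 2m$, $b - c = 2n$, with $ad - bc = k^2 - \ell^2 - m^2 + n^2 = 1$ and $a^2 + b^2 + c^2 + d^2 = 2(k^2 + \ell^2 + m^2 + n^2)$, so Theorem 15.11 gives

$$\sum_{m \le X} r(m)r(m + 1) = N(4X + 2) = 8X + O(X^{2/3}).$$

□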
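The first count in Corollary 15.12 is easy to test numerically. The sketch below is a brute-force illustration only; the function name `n1` is hypothetical, and the loop solves for $d$ from the determinant condition instead of enumerating all four entries.

```python
from math import isqrt

def n1(X):
    """Count integer (a, b, c, d) with ad - bc = 1 and a^2+b^2+c^2+d^2 <= X."""
    r = isqrt(X)
    count = 0
    for a in range(-r, r + 1):
        for b in range(-r, r + 1):
            for c in range(-r, r + 1):
                s = a*a + b*b + c*c
                if s > X:
                    continue
                if a == 0:
                    # determinant condition reads -bc = 1; d is then free
                    if b*c == -1:
                        count += 2*isqrt(X - s) + 1
                elif (1 + b*c) % a == 0:
                    # d is forced: d = (1 + bc) / a
                    d = (1 + b*c) // a
                    if s + d*d <= X:
                        count += 1
    return count

if __name__ == "__main__":
    for X in (10, 100, 400, 900):
        total = n1(X)
        print(X, total, total / (6*X))
```

The printed ratio `n1(X) / (6*X)` can be compared with 1 as `X` grows. Note that $a^2 + b^2 + c^2 + d^2 \ge 2|ad - bc| = 2$ for any solution, so the count is zero for $X < 2$.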


15.9. Distribution of length of closed geodesics and class numbers.

The simplest application of the Selberg Trace Formula is to the distribution of the lengths of closed geodesics on the quotient space $X = \Gamma\backslash\mathbb{H}$, also called the length spectrum. In fact, closed geodesics are in one-to-one correspondence with the hyperbolic conjugacy classes of $\Gamma$. Indeed, if $g \in \Gamma$ is a hyperbolic element, then $g$ maps the geodesic $\ell$ (half-circle) joining its two fixed points to itself. In particular, for any $z \in \ell$, the geodesic segment from $z$ to $gz$ is mapped in $\Gamma\backslash\mathbb{H}$ to a closed geodesic. Taking the model situation where $g = a(p)$ with $p > 1$ as in (15.21) and $z = i$, observe that the length of this closed geodesic is $d(i, g(i)) = \log p = \log Ng$. Conjugates of $g$ in $\Gamma$, whose fixed points are the $\Gamma$-translates of those of $g$, give rise to the same closed geodesic, and conversely any closed geodesic is obtained in this way.

THEOREM 15.13. Let $\Gamma$ be a discrete subgroup of $SL(2, \mathbb{R})$ with $\operatorname{Vol}(\Gamma\backslash\mathbb{H}) < +\infty$. We have

$$\sum_{NP \le X}\log NP = X + \sum_{1/2 < s_j < 1} s_j^{-1}X^{s_j} + O(X^{3/4})$$

where the sum is over primitive hyperbolic conjugacy classes and the implied constant depends on $\Gamma$.

REMARKS. (1) For $\Gamma$ with compact quotient, this was proved by Huber [Hub].

(2) The exponent can be improved, see e.g. Luo, Rudnick and Sarnak [LRS], where $3/4$ is replaced by $7/10 + \varepsilon$ for any $\varepsilon > 0$. Although the analogy with the Prime Number Theorem suggests that it should be $X^{1/2 + \varepsilon}$, and the analogue of the Riemann Hypothesis holds, this is yet unknown. The main difficulty is that there are many more eigenvalues than zeros of the zeta function. Also, as in Theorem 15.11, if $\Gamma = \Gamma_0(q)$, Selberg's inequality $\lambda_1 \ge \frac{3}{16}$ means that the exceptional eigenvalues have smaller contribution than the error term.

SKETCH OF PROOF. The argument is quite similar in principle to that of the previous theorem. We choose $h$ such that the Fourier transform $\hat{h}$ localizes the sum over hyperbolic conjugacy classes to those with $NP \le X + Y$ for some parameter $Y \ge 1$, say $\hat{h}(t) = 2\cosh(\pi t)\,q(2\pi t)$ where $q$ is a smooth function, even, supported on $|t| \le \log(X + Y)$ and with $0 \le q \le 1$ and $q = 1$ on $[-\log X, \log X]$. The contribution of the discrete and residual spectrum to the spectral trace is

$$\sum_j h(t_j) = X + \sum_{1/2 < s_j < 1} s_j^{-1}X^{s_j} + O(Y + X^{1/2}T)$$

with $T = XY^{-1}$, and the same estimate holds for Eisenstein series (to see this, estimate $h$ by Fourier inversion on $\frac12 < s \le 1$ and on $\operatorname{Re}(s) = \frac12$ separately). The identity, elliptic and parabolic motions have contributions $\ll X^{1/2}T$ by simple estimations, so one gets

$$\sum_P q(\log NP)(\log NP) = X + \sum_{1/2 < s_j < 1} s_j^{-1}X^{s_j} + O(Y + X^{1/2}T).$$

Subtracting this applied at $X + Y$ and $X$, by positivity, we see that

$$\sum_{X < NP \le X + Y}(\log NP) \ll Y + X^{1/2}T,$$

hence the result by choosing $Y = X^{3/4}$. □

In [Sa2], P. Sarnak deduces the following corollary.

THEOREM 15.14. For any integer $d \ge 1$ such that $d \equiv 0, 1 \ (\mathrm{mod}\ 4)$, let $h(d)$ be the class number of $\mathbb{Q}(\sqrt{d})$ and $\varepsilon_d$ be the fundamental unit. We have

$$\sum_{\varepsilon_d \le X} h(d) = \operatorname{Li}(X^2) + O(X^{3/2}(\log X)^2)$$

for $X \ge 2$, with an absolute implied constant.

PROOF. The proof depends on the following map between indefinite quadratic forms and closed geodesics. Let $Q(x, y) = ax^2 + bxy + cy^2$ be a primitive integral indefinite quadratic form of discriminant $d > 0$, i.e. $(a, b, c) = 1$ and $d = b^2 - 4ac > 0$. By linear change of variable, $SL(2, \mathbb{Z})$ acts on such forms and the number of equivalence classes is $h(d)$. The equation $Q(\theta, 1) = 0$ has two real roots, $\theta_1 = (-b + \sqrt{d})/(2a)$ and $\theta_2 = (-b - \sqrt{d})/(2a)$, to which is associated the geodesic in $\mathbb{H}$ joining $\theta_1$ and $\theta_2$. The stabilizer of $Q$ under the action of $SL(2, \mathbb{Z})$ is equal to the stabilizer of $\theta_1$ (or $\theta_2$), and a generator is given explicitly by

$$g(Q) = \begin{pmatrix} (t_0 - bu_0)/2 & -cu_0 \\ au_0 & (t_0 + bu_0)/2 \end{pmatrix}$$

where $\varepsilon_d = (t_0 + u_0\sqrt{d})/2$ (in other words, $(t_0, u_0)$ is the fundamental solution to the equation $t^2 - du^2 = 4$). The norm of this hyperbolic element is then $\varepsilon_d^2$. The map $Q \mapsto g(Q)$ sends primitive integral quadratic forms to hyperbolic conjugacy classes of $SL(2, \mathbb{Z})$. We claim it induces a bijection between classes of forms and primitive hyperbolic conjugacy classes. To see this, let $g$ be any primitive hyperbolic element of $SL(2, \mathbb{Z})$. Let $\theta_1$ and $\theta_2$ be the fixed points of $g$ and $K = \mathbb{Q}(\theta_1, \theta_2)$. Clearly $K$ is a real quadratic field, say $K = \mathbb{Q}(\sqrt{d})$ for a unique $d > 0$, $d \equiv 0, 1 \ (\mathrm{mod}\ 4)$. Then the stabilizer of $\theta_1$ is generated by $g$ because $g$ is primitive. The form $Q(x, y) = (x - \theta_1 y)(x - \theta_2 y)$ is a primitive quadratic form with $g(Q) = g$. This implies surjectivity of the map and injectivity is easy to check. Hence the norms of conjugacy classes of primitive hyperbolic elements are $\varepsilon_d^2$ with multiplicity $h(d)$. In this case Theorem 15.13 becomes

$$\sum_{\varepsilon_d \le \sqrt{X}} 2h(d)\log\varepsilon_d = X + O(X^{3/4}), \tag{15.33}$$

hence the result after summation by parts. □
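The explicit generator $g(Q)$ can be computed and tested on small discriminants. The code below is a minimal sketch under stated assumptions: the function names are hypothetical, the fundamental solution of $t^2 - du^2 = 4$ is found by naive search (adequate only for small $d$), and the action on forms is taken to be $Q(x, y) \mapsto Q(\alpha x + \beta y, \gamma x + \delta y)$.

```python
from math import isqrt, sqrt, log

def fundamental_solution(d, u_max=10**6):
    """Smallest (t, u) with t, u >= 1 and t^2 - d*u^2 = 4 (naive search)."""
    if d <= 0 or isqrt(d)**2 == d:
        raise ValueError("d must be a positive non-square discriminant")
    for u in range(1, u_max + 1):
        n = d*u*u + 4
        t = isqrt(n)
        if t*t == n:
            return t, u
    raise ValueError("no solution found below u_max")

def automorph(a, b, c):
    """The matrix g(Q) attached to Q = a x^2 + b x y + c y^2."""
    d = b*b - 4*a*c
    t, u = fundamental_solution(d)
    # t and b*u have the same parity, so both divisions are exact
    return (((t - b*u)//2, -c*u), (a*u, (t + b*u)//2))

def act(g, form):
    """Coefficients of Q(alpha x + beta y, gamma x + delta y)."""
    (al, be), (ga, de) = g
    a, b, c = form
    return (a*al*al + b*al*ga + c*ga*ga,
            2*a*al*be + b*(al*de + be*ga) + 2*c*ga*de,
            a*be*be + b*be*de + c*de*de)

def geodesic_length(g):
    """log N(g) for hyperbolic g, where N(g) is the squared larger eigenvalue."""
    (a, b), (c, d) = g
    t = abs(a + d)
    eps = (t + sqrt(t*t - 4)) / 2
    return 2*log(eps)

if __name__ == "__main__":
    for form in ((1, 1, -1), (1, 0, -2), (1, 1, -3)):
        g = automorph(*form)
        print(form, g, act(g, form) == form, geodesic_length(g))
```

For each form the script reports whether $g(Q)$ fixes $Q$ and the length $\log Ng = 2\log\varepsilon_d$ of the associated closed geodesic.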

REMARK. Siegel [Sie2] has shown that

$$\sum_{d \le X} h(d)\log\varepsilon_d = \frac{\pi^2 X^{3/2}}{18\,\zeta(3)} + O(X\log X) \tag{15.34}$$

for $X \ge 2$. Notice the difference between the ordering of the discriminants in (15.33) and (15.34). An asymptotic formula for $h(d)$ over $d \le X$ is not known.

EXERCISE 2. Prove (15.34) using the Dirichlet Class Number Formula (2.31).

CHAPTER 16

SUMS OF KLOOSTERMAN SUMS

16.1. Introduction.

Individual Kloosterman sums $S(m, n; c)$ are quite well examined by algebraic methods (see Chapter 11). The result is Weil's bound

$$|S(m, n; c)| \le (m, n, c)^{1/2}c^{1/2}\tau(c) \tag{16.1}$$

which is the best possible. This and earlier estimates have been used in analytic number theory indirectly in many ingenious ways (see for instance [Li2], [At], [I10], [Wi3]). Yu. V. Linnik [Li3] pointed out that for certain applications one actually needs estimates for sums of type

$$S_{mn}(X) = \sum_{\substack{c \le X \\ c \equiv 0 \ (\mathrm{mod}\ q)}} c^{-1}S(m, n; c). \tag{16.2}$$

The upper bound (16.1) implies

$$S_{mn}(X) \ll \tau((m, n)q)\,(m, n, q)^{1/2}X^{1/2}\log X \tag{16.3}$$

for $X \ge 2$ where the implied constant is absolute. However, Linnik suggested that a smaller bound should be true, if $X$ is sufficiently large, because of cancellation of Kloosterman sums, which is due to a regular change of sign. Linnik conjectured that

$$S_{mn}(X) \ll X^{\varepsilon}, \tag{16.4}$$

where the implied constant depends on $m$, $n$ and $\varepsilon$, and he also provided some heuristics to support this bound. In his investigations Linnik was driven by an analogy of the zeta function

$$Z_{mn}(s) = \sum_{c \equiv 0 \ (\mathrm{mod}\ q)} c^{-2s}S(m, n; c) \tag{16.5}$$

to the hypothesis of Hasse on the L-functions of elliptic curves (to be precise Linnik spoke only for $q = 1$ and used a different notation). This analogy turns out to be close only in the analytic aspects of the series (16.5) (like analytic continuation, but not the Euler product). By independent analysis A. Selberg [S7] revealed that $Z_{mn}(s)$ admits a spectral decomposition with respect to automorphic forms (for the group $\Gamma_0(q)$). Selberg laid the foundation for the spectral theory of Kloosterman sums yet he did not deliver the necessary technical tools to cultivate this fertile soil.
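Weil's bound (16.1) and the sums of type (16.2) can be explored numerically for small moduli. The following is a rough sketch for experimentation, not an efficient implementation: the function names are hypothetical, the definition $S(m, n; c) = \sum_{d \bmod c,\ (d, c) = 1} e((md + n\bar{d})/c)$ is the standard one (assumed here, since it is given in Chapter 11 rather than in this section), and only the real part is accumulated because the terms for $d$ and $-d$ are complex conjugates.

```python
from math import gcd, cos, pi, sqrt

def kloosterman(m, n, c):
    """S(m, n; c) as a real number, by direct summation over d mod c."""
    if c == 1:
        return 1.0
    total = 0.0
    for d in range(1, c):
        if gcd(d, c) == 1:
            dbar = pow(d, -1, c)  # inverse of d modulo c (Python 3.8+)
            total += cos(2*pi*((m*d + n*dbar) % c) / c)
    return total

def tau(c):
    """Number of positive divisors of c."""
    return sum(1 for k in range(1, c + 1) if c % k == 0)

def weil_bound(m, n, c):
    """Right-hand side of (16.1)."""
    return sqrt(gcd(gcd(m, n), c)) * sqrt(c) * tau(c)

def linnik_sum(m, n, X, q=1):
    """The sum (16.2): S(m, n; c)/c over c <= X with c divisible by q."""
    return sum(kloosterman(m, n, c) / c for c in range(q, X + 1, q))

if __name__ == "__main__":
    for c in (3, 5, 7, 12, 30):
        print(c, kloosterman(1, 1, c), weil_bound(1, 1, c))
    for X in (50, 100, 200):
        print(X, linnik_sum(1, 1, X), sqrt(X))
```

Comparing `linnik_sum(1, 1, X)` with `sqrt(X)` gives a feeling for the cancellation that Linnik's conjecture (16.4) predicts beyond what (16.3) guarantees.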


In this chapter we sketch proofs of central results of this profound theory. There are solid presentations of selected topics in the articles [I3], [I13], [DI] and in the books [I5], [Sa3], [Mot1]; however, a conclusive exposition is yet to be written.

16.2. Fourier expansion of Poincaré series.

There are various methods to develop the spectral decompositions of sums of Kloosterman sums. Arguably the best structural treatments appeal directly to the Green function (see [I5]), or the resolvent operator (see [GS]). Nevertheless, we follow the traditional method which begins by dual treatments of the Poincaré series (for the group $\Gamma = \Gamma_0(q)$)

$$P_m(z) = \sum_{\gamma \in \Gamma_\infty\backslash\Gamma} F(m\gamma z), \qquad m = 1, 2, 3, \ldots \tag{16.6}$$

Here $F(z)$ is a function on $\mathbb{H}$ which is periodic in $x$ of period one and it satisfies $F(z) \ll y^{\sigma}$ with $\sigma > 1$, so the series (16.6) converges absolutely ($\sigma = 1$ will be attained by continuity). Our treatments of $P_m(z)$ are parallel to those already given for the holomorphic forms in Section 14.2, and our goal is the Kuznetsov formula (16.34) which is an analog of the Petersson formula (14.15). Extra care is needed to justify infinite summations over the spectrum of the Laplace operator. To this end one needs some crude estimates for the Fourier coefficients of cusp forms in the spectral parameters. We shall prove much stronger estimates than necessary in Section 16.5, but only after the final Kuznetsov formulas are established. However, one could derive the required auxiliary estimates from the early forms of Kuznetsov's formulas. We encourage the reader to do so and to organize our arguments into a correct logical order (we compromised the order, but not the completeness, for reducing the length of presentation). Applying the double coset decomposition (14.14) and then the Poisson summation formula as in the proof of Lemma 14.2 we arrive at the Fourier series for $P_m(z)$,

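$$P_m(z) = F(mz) + \sum_{q \mid c}\ \sum_{ad \equiv 1 \ (\mathrm{mod}\ c)}\ \sum_n e(nz)\int_{\operatorname{Im}\xi = y} F\Big(m\begin{pmatrix} a & * \\ c & d \end{pmatrix}\xi\Big)e(-n\xi)\,d\xi.$$

Now assume that the generating function $F(z)$ satisfies the rule $F(z + t) = e(t)F(z)$ for $t \in \mathbb{R}$ so by the action of $\Gamma$ on $\mathbb{H}$ we have

$$F\Big(m\begin{pmatrix} a & * \\ c & d \end{pmatrix}\xi\Big) = e\Big(\frac{ma}{c}\Big)F\Big(-\frac{m}{c(c\xi + d)}\Big).$$

From this action one gets the Kloosterman sums in the Fourier series

$$P_m(z) = F(mz) + \sum_n\ \sum_{c \equiv 0 \ (\mathrm{mod}\ q)} S(m, n; c)\,F_c(m, n; y)\,e(nz) \tag{16.7}$$

where

$$F_c(m, n; y) = \int_{\operatorname{Im}\xi = y} F\Big(-\frac{m}{c^2\xi}\Big)e(-n\xi)\,d\xi$$

by shifting $\xi$ to $\xi - d/c$. We put

$$F(z) = p(4\pi y)e(z) \tag{16.8}$$

and call $p(y)$ the test function for the corresponding $P_m(z)$, getting

$$F_c(m, n; y) = \int_{\operatorname{Im}\xi = y} p\Big(\frac{4\pi my}{c^2|\xi|^2}\Big)e\Big(-\frac{m}{c^2\xi} - n\xi\Big)\,d\xi. \tag{16.9}$$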

In this generality we cannot compute the integral (16.9). An interesting choice for the test function is

$$p(y) = y^s \tag{16.10}$$

where $s$ is a complex parameter at our disposal with $\operatorname{Re} s > 1$ (there are other eigenfunctions of $\Delta$ which lead also to interesting results). In this case the Poincaré series becomes

$$P_m(z, s) = \sum_{\gamma \in \Gamma_\infty\backslash\Gamma}(4\pi m\operatorname{Im}\gamma z)^s e(m\gamma z) \tag{16.11}$$

and its Fourier expansion becomes

$$P_m(z, s) = (4\pi my)^s\Big\{e(mz) + \sum_n Z_{mn}(s; y)e(nz)\Big\} \tag{16.12}$$

where

$$Z_{mn}(s; y) = \sum_{c \equiv 0 \ (\mathrm{mod}\ q)} c^{-2s}S(m, n; c)\,I_s(m, n; y)$$

and

$$I_s(m, n; y) = \int_{\operatorname{Im}\xi = y}|\xi|^{-2s}e\Big(-n\xi - \frac{m}{\xi c^2}\Big)\,d\xi.$$

The general Poincaré series $P_m(z)$ can be represented by $P_m(z, s)$ by applying the Mellin transform to $p(y)$.

The particular Poincaré series $P_m(z, s)$ was introduced by A. Selberg [S7]. This is nearly a cusp form, but not exactly so, because it fails to be an eigenfunction of $\Delta$. Precisely we have

$$(\Delta + s(1 - s))P_m(z, s) = sP_m(z, s + 1), \tag{16.13}$$

because $(\Delta + s(1 - s))F(z) = 4\pi syF(z)$ for $F(z) = y^s e(z)$. Hence we get the recurrence formula

$$P_m(z, s) = sR_\lambda P_m(z, s + 1) \tag{16.14}$$

where $R_\lambda = (\Delta + \lambda)^{-1}$ is the resolvent of $\Delta$ at $\lambda = s(1 - s)$. By the spectral theory $R_\lambda$ is meromorphic, therefore $P_m(z, s)$ can be continued meromorphically to the whole complex $s$-plane. Moreover, in the half-plane $\operatorname{Re}(s) > \frac12$ the poles of $R_\lambda$ are at the points $\lambda_j = s_j(1 - s_j) > 0$ in the discrete spectrum of $\Delta$, and they are simple, so are the poles of $P_m(z, s)$ at $s = s_j$. In fact it is conjectured by Selberg that $\lambda_j \ge \frac14$, so $s_j = \frac12 + it_j$ with $t_j$ real and $P_m(z, s)$ is holomorphic in $\operatorname{Re}(s) > \frac12$.

In what follows we avoid the analytic continuation of the Selberg-Poincaré series and the resolvent operator altogether. Instead, we rely heavily on the spectral decomposition of automorphic functions.



16.3. The projection of Poincaré series on Maass forms.

We return to the Poincaré series $P_m(z)$ for a generating function of type (16.8). Let $f(z)$ be a Maass form of eigenvalue $\lambda = \frac14 + r^2$ given by the Fourier series

$$f(z) = ay^{\frac12 + ir} + by^{\frac12 - ir} + y^{\frac12}\sum_{n \ne 0} a_n K_{ir}(2\pi|n|y)e(nx) \tag{16.15}$$

(see Lemma 15.1). By the unfolding method we derive

$$\langle f, P_m\rangle = \int_0^\infty\!\!\int_0^1 f(z)\overline{F}(mz)\,d\mu z.$$

Inserting (16.15) and (16.8) we pick up the term $a_m$ by integration in $0 < x < 1$. Then, changing the variable $y \to y/2\pi m$ we obtain the formula

$$\langle f, P_m\rangle = (2\pi m)^{\frac12}a_m\int_0^\infty e^{-y}K_{ir}(y)\,\overline{p}(2y)\,y^{-\frac32}\,dy. \tag{16.16}$$

Hence observe that $P_m(z)$ is orthogonal to constant functions. In particular for $p(y) = y^s$ the above integral is a product of gamma functions giving (16.17). This is clearly an analog of the formula for the inner product of a holomorphic modular form with a holomorphic Poincaré series given in Lemma 14.3.

16.4. Kuznetsov's formulas.

Now we are ready to derive explicit relations between Kloosterman sums and Fourier coefficients of automorphic forms. To this end we compute in two ways the inner product $\langle P_m, Q_n\rangle$ of two Poincaré series $P_m(z)$ and $Q_n(z)$ generated by $F(z) = (4\pi y)^{\frac12}p(4\pi y)e(z)$ and $G(z) = (4\pi y)^{\frac12}q(4\pi y)e(z)$ respectively (note we modified the test functions $p(y)$, $q(y)$ by the factor $y^{\frac12}$). First by the spectral decomposition (see Theorem 15.2) we get

$$\langle P_m, Q_n\rangle = \sum_{j=1}^\infty\langle P_m, u_j\rangle\langle u_j, Q_n\rangle + \sum_{\mathfrak{a}}\frac{1}{4\pi}\int_{-\infty}^{\infty}\langle P_m, E_{\mathfrak{a}}(*, \tfrac12 + ir)\rangle\langle E_{\mathfrak{a}}(*, \tfrac12 + ir), Q_n\rangle\,dr \tag{16.18}$$

where (uj(z)) is an orthonormal basis of cusp forms and (E=(z, ~ + ir); r E JR.) is the eigenpacket of Eisenstein series associated with a system of inequivalent cusps. Suppose u1 (z) has the Fourier expansion (16.19)

Uj(z) = Pj(O)y•'

+ y! L Pj(n)Kit, (27rlniy)e(nx) n7'0

where s 1 = ~ +it1 with tj real, or~< s 1 < 1 because Then (16.20)

>.1 = s1(1-s 1 ) =

:l:+tJ > 0.

407


where (16.21) by the formula (16.16). Similarly for the Eisenstein series (16.22)

E 0 (z, ~

+ ir) = 00 y!;+ir + 'Pa(~ + ir)y!;-ir + y!i LTa(n, r)K;r(27rlnly)e(nx) n#O

we have (16.23) Inserting (16.19) and (16.23) into (16.18) we obtain (16.24)

(Pm,Qn) = 4Jrvmn{f=Pi(m)pi(n)wp(ti)wq(ti) j=1

Next we compute (Pm, Qn) by unfolding with respect to Qn and applying the Fourier expansion (16.7) for Pm. We obtain

(Pm,Qn) = =

r)Q t Jo Jo

Pm(z)G(nz)dJ.Lz

d Jor= (o(m,n)F(imy)+ LS(m,n;c)Fc(m,n;y)e(iny))G(iny)}· 0

qlc

Changing the variables we arrive at (16.25)

(Pm,Qn) = 41rvmn(o(m,n)(p,q)

+ L c- 1 S(m,n; c)Vpq(

4

1rv;rn))

qlc

where (16.26) and (16.27)

{ Jor= p (XYJZI )-(XY) (-X~~~ (1 ))dyd~ q 0 e 47r~ Y+ y YJIT"

Vpq(x) = lrm~=l

We simplify Vpq (x) by changing ~ = 2i (( 2 + 1) -l, where ( runs over the semi-circle 1(1 = 1, Re (() > 0 clockwise. Letting Re (() = 1) we have~= i((7J)-I, I~ I = 7)- 1 and 1~1- 1 ~ = i((7J)- 1 d(. Hence (16.28)

Vpq(x) = i

ji-ilor=

P (:_1J)iJ(xy7J)e- !!<

Y

~+Y)
Combining (16.24) and (16.25) we arrive at the following


PROPOSITION 16.1. Let m, n > 0 and p(y), q(y) be smooth, bounded functions on JR+. Then we have

Now let us choose p(y) = q(y) = y~+iv with v E JR. For these test functions we get whence wp(t )wp(t) = 1r 2 sinh( 1rv) jv cosh 1r(v- t) cosh 1r(v + t).

(16.30)

On the other hand, (p, q) = 1 and Vpp(x) = B2w(x), where 1

(16.31)

Bs(x) = 2ix l i K 8 ((x)C d(.

In this case (16.29) becomes (an auxiliary version of Kuznetsov's formula) COROLLARY 16.2. Form, n > 0 and v E IR we have (16.32)

f

j=

1

pj(m)p1 (n) cosh 1r(v - tj) cosh 1r(v + tj)

+

L _1_ Joo

f 0 (m,r)r0 (n,r) dr _ 00 cosh 1r(v- r) cosh 1r(v + r)

47r

0

2

1r- v

(

= sinh(1rv) o(m,n)+

~

L·

c

_1

(47r/ffin)) S(m,n;c)B2;v - - c - .

c=O(mod q)

The integral (16.31) can be expressed in terms of the Bessel function Js(Y) of real positive argument. Indeed, by Cauchy's theorem 00

B 8 (x) = 2ix ( [

= 2ix For y

l""

+ j_~~)

1

K 8 ((x)C d(

(Ks(ixy)- K 8 (-ixy))y- 1 dy.

> 0 we have (see (5.14.4), (5.22.5), (5.22.6) of [Leb]) Ks(iy) =

. 1r (e-"is/2J_s(Y)- e"isf2Js(Y)) 2sm(1rs)

Ks( -iy) = -.-Jr-(e"is/2 Ls(Y)- e-"is/2 Js(Y)). 2sm(1rs) Hence Ks(iy)- Ks( -iy) =

(1ri , (Js(Y) 2 COS JrS 12 )

+ Ls(Y)).


Finally, this yields (16.33)

B.(x) =

(JrX / )

2

COS 1rS

1"" x

(J.(y)

+ L.(y))y- 1dy.

With the real parameter $v$ one can generate quite a lot of functions $h(t)$ out of the particular one $h(t) = (\cosh\pi(v - t)\cosh\pi(v + t))^{-1}$. N. V. Kuznetsov [Kuz] produced from (16.32) the following

THEOREM 16.3. Suppose $h(t)$ satisfies the conditions (15.20). Then for any $m, n > 0$ we have

$$\sum_{j=1}^\infty\overline{\rho}_j(m)\rho_j(n)\frac{h(t_j)}{\cosh\pi t_j} + \sum_{\mathfrak{a}}\frac{1}{4\pi}\int_{-\infty}^{\infty}\overline{\tau}_{\mathfrak{a}}(m, r)\tau_{\mathfrak{a}}(n, r)\frac{h(r)}{\cosh\pi r}\,dr = \delta(m, n)g_0 + \sum_{c \equiv 0 \ (\mathrm{mod}\ q)} c^{-1}S(m, n; c)\,g\Big(\frac{4\pi\sqrt{mn}}{c}\Big) \tag{16.34}$$

where

$$g_0 = \pi^{-2}\int_{-\infty}^{\infty} rh(r)\tanh(\pi r)\,dr \tag{16.35}$$

and

$$g(x) = \frac{2i}{\pi}\int_{-\infty}^{\infty} J_{2ir}(x)\frac{rh(r)}{\cosh\pi r}\,dr. \tag{16.36}$$

REMARKS. The formula (16.34) was also established independently by R. W. Bruggeman [Bru], but only for a certain type of test functions $h(t)$ and with $g(x)$ in less refined form than (16.36) of Kuznetsov. The result also holds (and the proof is similar) for $mn < 0$, except for the integral transform (16.36) which changes to

$$g^-(x) = \frac{4}{\pi^2}\int_0^\infty K_{2ir}(x)\,rh(r)\sinh(\pi r)\,dr.$$

The Kuznetsov formula (16.34) can be derived from (16.32) by means of the following

LEMMA 16.4. Suppose $h(t)$ satisfies the conditions (15.20). Then

$$\int_{-\infty}^{\infty}\Big(h\big(r + \tfrac{i}{2}\big) + h\big(r - \tfrac{i}{2}\big)\Big)\frac{\cosh(\pi r)\,dr}{\cosh\pi(r - t)\cosh\pi(r + t)} = \frac{2h(t)}{\cosh(\pi t)}. \tag{16.37}$$

PROOF (SKETCH). Move the integral of $h(r + \frac{i}{2})$ down to the line $\operatorname{Im}(r) = -\frac12$ excluding small half-circles centered at $r = t - \frac{i}{2}$, $r = -t - \frac{i}{2}$, and move the integral of $h(r - \frac{i}{2})$ up to the line $\operatorname{Im}(r) = \frac12$ excluding small half-circles centered at $r = t + \frac{i}{2}$, $r = -t + \frac{i}{2}$. The horizontal integrals cancel out because $h(t)$ is even. The four integrals over the small half-circles yield in the limit $2h(t)/\cosh(\pi t)$ by computing residues. □


By integrating (16.32) according to (16.37) one obtains immediately the spectral terms on the left side of (16.34). On the right side one obtains the diagonal term with

joo (h(r+ ~). +h(r- ~))tanh(J_r1') . rdr

-1

9o = 27r2

= 2-1 1r

-oo

;= . _

h(r + ~)

00

rdr tan

h(

1rr)

= 21r1

;= _

.

h(r) tanh(1rr)(r- ~)dr

00

which is exactly (16.35) because h(r) is even. Moreover, one obtains the sum of Kloosterman sums with

-1;

g(x) = 27r 2

00

-oo

(h(r

.

+ ~) + h(r-

.

r~

~))B2ir(x)tanh(7rr).

Inserting (16.34) and interchanging the integration we get

-x 27r

g(x) = -

-x 21r

=-

;= joo ;= ;= x

_

x

_ 00

(h(r

.

.

rdr

dy

+ ~) + h(r- ~))(J2ir(Y) + L2ir(Y))-:---h( )sm 1rr y .

00

[(2ir + 1)J2ir+J(Y)- (2ir -1)J2ir-1(Y)]

h(r)dr dy cos

h(

1rr

) -, y

because h(r) is even. Next, by the recurrence formula

(s

+ 1)Js+J(Y)- (s -1)Js-J(Y) = -2sy(Js(Y)/y)',

we get 2

g(x) = - ix 1r

r= joo cosr~t)) (J2ir(Y)/y)'drdy

} x

_ 00

1rr

which coincides with (16.36).

The formula (16.34) finds its basic applications for estimation of the Fourier coefficients of cusp forms in spectral parameters and the level (see the next section). In the level aspect alone the preliminary formula (16.32) is often sufficient and easy to use. However, the extra flexibility offered by the large family of functions $h(t)$ in the Kuznetsov refined version (16.34) is helpful for tackling delicate issues, such as exceptional eigenvalues or the distribution of length of closed geodesics on the Riemann surface $\Gamma\backslash\mathbb{H}$; cf. [I11], [I12].

For problems in analytic number theory we need to treat the sums of Kloosterman sums rather than the Fourier coefficients of cusp forms. To this end we need an equation in which the test function attached to the Kloosterman sums is given a priori, and the ones attached to the spectral terms are obtained by integral transforms with explicit kernels (something like the Riemann explicit formula for primes). Unfortunately, the formula (16.34) is not sufficient because one cannot represent every requested function $f(x)$ by the integral (16.36). Moreover, even if $g(x)$ admits the integral representation (16.36), it is not easy to find its original function $h(r)$.

Historically speaking, it was E. C. Titchmarsh [T3] who first tried to invert the map $h \mapsto g$ given by (16.36) for $g$ in a dense subspace of $L^2(\mathbb{R}^+, x^{-1}dx)$, but not in the context of automorphic forms. Then jointly with D. B. Sears [ST] they realized that the image of this map is not dense. Indeed, writing (16.36) as

$$g(x) = \frac{2i}{\pi}\int_0^\infty\bigl(J_{2it}(x) - J_{-2it}(x)\bigr)\frac{h(t)\,t}{\cosh\pi t}\,dt$$

it is clear that the Bessel functions $J_{k-1}(x)$ of positive integral order $k - 1$ with $k$ even are missed, because they are orthogonal to $J_{2it}(x) - J_{-2it}(x)$. However, they showed that $\{J_\ell(x);\ \ell = 1, 3, 5, \ldots\}$ together with $\{J_{2it}(x) - J_{-2it}(x);\ 0 < t < \infty\}$ form a complete, orthogonal system in $L^2(\mathbb{R}^+, x^{-1}dx)$, the latter in the sense of continuous spectral measure. Precisely, let $f(x)$ be a function of class $C^2$ on $[0, \infty)$ such that

$$f(0) = 0, \qquad f^{(a)}(x) \ll (x + 1)^{-\alpha} \tag{16.38}$$

for $a = 0, 1, 2$ with $\alpha > 2$. Then we have (16.39)

f(x)

=

+

dy) 2tdt Joroo (J2it(x)- L2it(x)) (Joroo (hit(h)- L2it(Y))f(y)y sinh( 1rt) 2

1

00

L

2(k- 1)Jk-l(x)

Jk-1(Y)f(y)y- 1dy.

0

k>O,k even

Here the integral component of $f(x)$ is in the image of the integral transform (16.36). Therefore, for completeness one needs, in addition to (16.34), a formula for sums of Kloosterman sums $S(m, n; c)$ twisted by the Bessel functions $J_{k-1}(4\pi\sqrt{mn}/c)$. These we already know for each $k$ as the Petersson formula (14.15). Put

$$M_f(t) = \frac{\pi i}{\sinh(2\pi t)}\int_0^\infty\bigl(J_{2it}(x) - J_{-2it}(x)\bigr)f(x)\,x^{-1}\,dx \tag{16.40}$$

and

$$N_f(k) = \frac{4(k - 1)!}{(4\pi i)^k}\int_0^\infty J_{k-1}(x)f(x)\,x^{-1}\,dx. \tag{16.41}$$

Let $\rho_j(m)$ and $\tau_{\mathfrak{a}}(m, r)$ be the Fourier coefficients of a complete orthonormal system of cusp forms and the Eisenstein series respectively. In addition let $\psi_{jk}(m)$ for $1 \le j \le \dim S_k(\Gamma)$ be the Fourier coefficients of a complete orthonormal system of holomorphic cusp forms

$$f_{jk}(z) = \sum_{m=1}^\infty\psi_{jk}(m)\,m^{\frac{k-1}{2}}e(mz) \tag{16.42}$$

of weight $k = 2, 4, 6, \ldots$ Adding Kuznetsov's formula (16.34) to Petersson's formula (14.15) in accordance to the decomposition (16.39) one derives

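THEOREM 16.5. Let $f$ satisfy (16.38). For $mn > 0$ we have

$$\sum_{c \equiv 0 \ (\mathrm{mod}\ q)} c^{-1}S(m, n; c)\,f\Big(\frac{4\pi\sqrt{mn}}{c}\Big) = \sum_{j=1}^\infty M_f(t_j)\overline{\rho}_j(m)\rho_j(n) + \sum_{\mathfrak{a}}\frac{1}{4\pi}\int_{-\infty}^{\infty} M_f(r)\overline{\tau}_{\mathfrak{a}}(m, r)\tau_{\mathfrak{a}}(n, r)\,dr + \sum_{0 < k \equiv 0 \ (\mathrm{mod}\ 2)} N_f(k)\sum_{1 \le j \le \dim S_k(\Gamma)}\overline{\psi}_{jk}(m)\psi_{jk}(n). \tag{16.43}$$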
Observe that the diagonal term on the spectral side is gone! One also needs a formula for sums of Kloosterman sums $S(m, n; c)$ with $m$, $n$ of different sign. This is a similar case, actually it is simpler because the holomorphic cusp forms do not appear (they have no Fourier coefficients at negative frequencies). Put (16.44)

THEOREM 16.6. Let $f$ satisfy (16.38). For $mn < 0$ we have

$$\sum_{c \equiv 0 \ (\mathrm{mod}\ q)} c^{-1}S(m, n; c)\,f\Big(\frac{4\pi\sqrt{|mn|}}{c}\Big) = \sum_{j=1}^\infty K_f(t_j)\overline{\rho}_j(m)\rho_j(n) + \sum_{\mathfrak{a}}\frac{1}{4\pi}\int_{-\infty}^{\infty} K_f(r)\overline{\tau}_{\mathfrak{a}}(m, r)\tau_{\mathfrak{a}}(n, r)\,dr. \tag{16.45}$$

PROOF. This follows from Theorem 16.3 (with g(x) replaced by g-(x)) and the following Kontorovich-Lebedev inversion (see (5.27.14) of [Leb])

1

00

f(x) = :

(16.46)

2

K2zt(x)

(1

00

K2zt(Y)f(y)y- 1 dy) tsinh(2trt)dt. D

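We close this section with remarks on the Kloosterman sums zeta function. Applying the formula (16.43) for $f(x) = x^{2s-1}$ formally one derives the following spectral decomposition:

(16.47)   $$\frac{2(2\pi\sqrt{mn})^{2s-1}}{\sin(\pi s)}\, Z_{mn}(s) = \sum_{j=1}^\infty \frac{\Gamma(s-s_j)\Gamma(s-1+s_j)}{\cosh(\pi t_j)}\,\bar\rho_j(m)\rho_j(n) + \sum_{k>0,\ k\ \mathrm{even}} \pi^{-2}(4\pi)^{1-k}(k-1)!\,\Gamma\big(s-\tfrac k2\big)\Gamma\big(s-1+\tfrac k2\big) \sum_{1\le j\le\dim S_k(\Gamma)} \bar\psi_{jk}(m)\psi_{jk}(n) + \sum_{\mathfrak a}\frac{1}{4\pi}\int_{-\infty}^{\infty} \frac{\Gamma(s-\frac12-ir)\Gamma(s-\frac12+ir)}{\cosh(\pi r)}\,\bar\tau_{\mathfrak a}(m,r)\tau_{\mathfrak a}(n,r)\, dr.$$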

Of course, the function $f(x) = x^{2s-1}$ does not satisfy the conditions (16.38); nevertheless one can prove rigorously that the formula (16.47) holds true.



The series

(16.48)   $$L_{mn}(s) = \sum_{c\equiv 0 (\mathrm{mod}\, q)} c^{-1} S(m,n;c)\, J_s\Big(\frac{4\pi\sqrt{mn}}{c}\Big)$$

seems more natural than $Z_{mn}(s)$ because it satisfies a suitable functional equation

is) (n,-2is) =n _ (ns) 2 o(m,n),

1 ~ ( m,2 Ta (16.49) Lmn(s)-Lmn(-s)+2sL..._.Ta

1

.

Slll

a

(see Theorem 9.2 of [15]).

16.5. Estimates for the Fourier coefficients.

In this section we derive from the Petersson and the Kuznetsov formulas immediate estimates for the Fourier coefficients of basic modular forms. To this end we simply estimate the involved sums of Kloosterman sums by using Weil's bound for the individual sum. More advanced methods and stronger results have been established in [13], [113] and [D1]. Our problems reduce to estimating the series (16.48). First we show that

(16.50)   $$\sum_{c\equiv 0 (\mathrm{mod}\, q)} c^{-1-\sigma}\,|S(m,n;c)| \le 8\big(\sigma-\tfrac12\big)^{-2}\,\tau((m,n))\,(m,n,q)^{1/2}\,\tau(q)\, q^{-\frac12-\sigma}$$

if~
and the last series is bounded by

$$\zeta\big(\sigma+\tfrac12\big)^2 \sum_{d\mid(m,n)} \tau(d)\, d^{-\sigma}.$$

Hence (16.50) follows, because $\tau(d)\le 2d^{1/2}$ and $\zeta(\sigma+\frac12)\le 2(\sigma-\frac12)^{-1}$. For $k\ge 2$ we have

$$J_{k-1}(x) \ll \min\Big(1, \frac{x^{k-1}}{(k-1)!}\Big) \ll \Big(\frac{x}{k}\Big)^{\sigma}$$

for any $0\le\sigma\le 1$. Therefore (16.50) yields

(16.51)   $$L_{mn}(k-1) \ll (2\sigma-1)^{-2}\Big(\frac{\sqrt{mn}}{kq}\Big)^{\sigma}\frac{\tau(q)}{\sqrt q}\,\tau((m,n))\,(m,n,q)^{1/2}$$

for any ~
$$J_s(x) \ll \frac{x^{\sigma}}{|\Gamma(s+\frac12)|} \ll e^{\pi|s|/2}\Big(\frac{x}{|s|}\Big)^{\sigma}.$$

Therefore (16.50) yields

(16.52)   $$L_{mn}(s) \ll (2\sigma-1)^{-2}\, e^{\pi|s|/2}\Big(\frac{\sqrt{mn}}{|s|q}\Big)^{\sigma}\frac{\tau(q)}{\sqrt q}\,\tau((m,n))\,(m,n,q)^{1/2}$$


where the implied constant is absolute. Note the right side of the Petersson formula (14.14) equals

$$\delta(m,n) + 2\pi i^k L_{mn}(k-1).$$

Hence, using (16.52) with $2\sigma-1 = (\log 3mn)^{-1}$, we conclude

THEOREM 16.7. Let $k\ge 2$, $k$ even. The Fourier coefficients of an orthonormal system of holomorphic cusp forms of weight $k$ and level $q$ satisfy

(16.53)   $$\frac{\Gamma(k-1)}{(4\pi)^{k-1}}\sum_j \bar\psi_{jk}(m)\psi_{jk}(n) = \delta(m,n) + O\Big(\frac{\tau(q)}{q\sqrt k}\,\tau((m,n))\,(m,n,q)^{1/2}(mn)^{1/4}\log^2 3mn\Big)$$

where the implied constant is absolute.

Next notice that the right side of the Kuznetsov formula (16.34) equals

$$\delta(m,n)g_0 + \frac{1}{2\pi i}\int_{(\sigma)} L_{mn}(s)\,\frac{h(is/2)\,s}{\cos(\pi s/2)}\, ds$$

for any $\frac12<\sigma<1$. Inserting (16.52) with $2\sigma-1\le(\log 3mn)^{-1}$ we conclude

THEOREM 16.8. Suppose $h(r)$ satisfies (15.20). Then for $m,n>0$

(16.54)

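$$\sum_{j=1}^\infty \bar\rho_j(m)\rho_j(n)\frac{h(t_j)}{\cosh(\pi t_j)} + \sum_{\mathfrak a}\frac{1}{4\pi}\int_{-\infty}^{\infty}\bar\tau_{\mathfrak a}(m,r)\tau_{\mathfrak a}(n,r)\frac{h(r)}{\cosh(\pi r)}\, dr = \delta(m,n)g_0 + O\Big(H\,\frac{\tau(q)}{q}\,\tau((m,n))\,(m,n,q)^{1/2}(mn)^{1/4}\Big)$$

where

(16.55)

for any $0<\eta\le(4\log 3mn)^{-1}$, and the implied constant is absolute.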

In particular, Theorem 16.8 yields (16.56)

f .

J=l

e-(tjjT), IPJ(n)i2 cosh(rrt 3 )

+L ~ a

joo e-(t/T)21Ta(n, tW dt cosh(rrt)

47r -=

= 9o + O(T~ r(q)q- 1 r(n)(n, q)~ n! log2 3n) where g 0 = rr- 2 T 2

+ 0(1) and the

implied constants are absolute.

Another interesting choice of the test function is

(16.57)

where $X\ge 1$. This is non-negative on the spectrum, i.e., for $t$ or $it$ real, and large for the exceptional points; precisely, we have $h(t)\asymp X^{2s-1}$ if $s=\frac12+it>\frac12$. Clearly $g_0\ll 1$ and the integral (16.55) satisfies

$$H \ll \eta^{-2}X^{\frac12+2\eta} \ll X^{\frac12}(\log 3mnX)^2$$

by choosing $\eta=(4\log 3mnX)^{-1}$. Therefore (16.54) for the test function (16.57) yields

(16.58)

$$\sum_{\frac12<s_j<1} |\rho_j(n)|^2 X^{2s_j-1} \ll 1 + \frac{\tau(q)}{q}\,\tau(n)\,(n,q)^{1/2}(nX)^{1/2}(\log 3nX)^2.$$

According to Selberg's eigenvalue conjecture for the group $\Gamma_0(q)$ the sum on the left side of (16.58) is void. This has not yet been proved. However, the inequality (16.58) shows that the exceptional points appear infrequently. Specifically, taking $n=1$ and $X=q^2$ we get

(16.59)   $$\sum_{\frac12<s_j<1} |\rho_j(1)|^2 q^{2(2s_j-1)} \ll \tau(q)(\log 2q)^2.$$

Here one can choose a basis (a Hecke-Pizer basis) such that $|\rho_j(1)| \gg q^{-\frac12-\varepsilon}$, getting

(16.60)   $$\sum_{\frac12<s_j<1} q^{2(2s_j-1)} \ll q^{1+\varepsilon}.$$

Hence we derive the following density estimate

(16.61)

for any $\alpha\ge\frac12$ and $\varepsilon>0$, the implied constant depending only on $\varepsilon$. If $\alpha>\frac12$, this bound is quite small relative to the volume of $\Gamma_0(q)\backslash\mathbb H$.

16.6. Estimates for sums of Kloosterman sums.

Now we have everything ready for estimation of sums of Kloosterman sums. The original sum of Linnik given by (16.2) for $q=1$ was treated in 1977 by Kuznetsov [Kuz]. He applied his formula (16.43) for a suitable $f(x)$ and implemented Weil's bound for the Kloosterman sums near the boundary of the summation range getting

(16.62)

for $X\ge 1$ and $m,n\ge 1$, where the implied constant depends on $m$, $n$. The exponent $\frac16$ here is remarkable by comparison to the $\frac12$ in (16.3), though it is not yet arbitrarily small as conjectured by Linnik. In practice one is satisfied with sums of $S(m,n;c)$ over the modulus $c$ restricted smoothly in an interval rather than by the sharp cut-off. These smoothed sums, while not compromising real applications, can be given the true estimate (see (16.72)). Fix $m,n,q\ge 1$ and a function $f$ of class $C^3$ supported on $[\frac12,1]$. We shall estimate

(16.63)   $$S(X) = \sum_{c\equiv 0 (\mathrm{mod}\, q)} c^{-1} S(m,n;c)\, f\Big(\frac{2\pi\sqrt{mn}\,X}{c}\Big)$$

for all $X\ge 1$. By Theorem 16.5 the problem is transferred to estimation of the Fourier coefficients of basic modular forms and the integral transforms (16.40), (16.41) of $f(xX/2)$.
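For orientation, the objects in this section can be computed directly for small moduli. The following is a minimal numerical sketch, not part of the text: it evaluates $S(m,n;c)$ from its definition, compares it with Weil's bound in the commonly quoted form $|S(m,n;c)|\le (m,n,c)^{1/2}c^{1/2}\tau(c)$ (the explicit shape of the bound is an assumption here, since the text only refers to it by name), and forms the sharp cut-off sum of $c^{-1}S(m,n;c)$ for $q=1$. All function names are illustrative.

```python
import cmath
from math import gcd, pi, sqrt


def kloosterman(m, n, c):
    """S(m, n; c): sum of e((m*x + n*xbar)/c) over x mod c with gcd(x, c) = 1."""
    if c == 1:
        return 1.0
    total = 0j
    for x in range(1, c):
        if gcd(x, c) == 1:
            xbar = pow(x, -1, c)  # inverse of x modulo c (needs Python 3.8+)
            total += cmath.exp(2j * pi * (m * x + n * xbar) / c)
    return total.real  # the sum is real; the imaginary part is rounding noise


def tau(n):
    """Number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def weil_bound(m, n, c):
    """Assumed form of Weil's bound: (m, n, c)^(1/2) * c^(1/2) * tau(c)."""
    return sqrt(gcd(gcd(m, n), c) * c) * tau(c)


def linnik_sum(m, n, X):
    """Sharp cut-off sum of S(m, n; c)/c over 1 <= c <= X (the case q = 1)."""
    return sum(kloosterman(m, n, c) / c for c in range(1, X + 1))


if __name__ == "__main__":
    for X in (10, 100, 400):
        print(X, linnik_sum(1, 1, X), sqrt(X))
```

Printing the partial sums next to $\sqrt X$ gives a feel for how much cancellation occurs beyond what the individual bound predicts.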



Introducing the power series expansion (16.64) into

MJ(t) = where s =

r= (J2s-1(2x)- JHs(2x))f(xX)x- dx 1

-:-=------( 1r )

sm 21rs } 0

! +it, we get M (t) =

~ ~( -1)e sin(27rs) L

f

t=O

+

}(21! + 2s -1) x2s-1-2t f!!r(f! + 2s)

the same expression with s changed into 1 - s,

where $\hat f(s)$ is the Mellin transform of $f(x)$. All we need to know about $f$ is that $|f^{(a)}|\le\kappa$ for $a=0,1,2,3$ and some $\kappa>0$. Integrating by parts up to three times we deduce that

(16.65)   $$\hat f(s) \ll \kappa(|s|+1)^{-3} \qquad \text{if } \mathrm{Re}(s)>0,$$

where the implied constant is absolute. Moreover, we use the Stirling formula

(16.66)   if $\mathrm{Re}(s)>0$.

Hence, estimating all terms of the power series for M 1 (t), except for f = 0, we deduce that K.X- 1 log2X) MJ(t) = w(s, X)+ 0 ( ls13 cosh(7rt)

(16.67) where

w(s '

X)=~ sin(21rs)

(}(2s -1) x2s-1- }(1- 2s) x1-2s) r(2s) r(2- 2s) ·

Next by the functional equation r(s)r(1 - s) = 1r / sin(1rs) we write (16.68)

w(s, X)= r(2s- 1)}(1- 2s)XHs- r(1- 2s)}(2s- 1)X 28 -

We also estimate the leading term w(s, X), but only for s = getting (16.69)

1

.

! + it with t real,

Klog2X

MJ(t)

« Is 13 cos h( Jrt )"

Similarly for the integral transform (16.41) we deduce that . /-.~

(16.70)

!

Finally introducing (16.69) and (16.70) into (16.43), except for the points $\frac12<s_j<1$, then applying Cauchy's inequality and the estimates (16.53), (16.56) we arrive at

THEOREM 16.9. Let $f$ be of class $C^3$ with compact support in $[\frac12,1]$ and $|f^{(a)}|\le 1$ for $a=0,1,2,3$. Then for $m,n,q,X\ge 1$ we have

(16.71)   $$S(X) = \sum_{\frac12<s_j<1} w(s_j,X)\,\bar\rho_j(m)\rho_j(n) + R(X)$$

where $w(s,X)$ is given by (16.68) and $R(X)$ satisfies

(16.72)

R(XJ «

( 1 + (m, q)~) ~ ( 1 + (n,q)~) ~ r(q)r(mn)(log2mnX) 3

with the implied constant being absolute.

Assuming Selberg's eigenvalue conjecture the sum over $\frac12<s_j<1$ is void, so $S(X)=R(X)$ and (16.72) becomes a pure bound for (16.63). In particular, it is known that the Selberg conjecture holds true for the modular group where $q=1$ (for a proof, see e.g. [DI]; Hejhal computed that the smallest eigenvalue is $\lambda_1 = 91.14\ldots$). In this case we get the unconditional result

Lc- 1 S(m, n; c)f(27r/ffinX/c) « r(mn)(mn)~(log2mnf.

(16.73)

c>O

Notice we replaced $(\log 2mnX)^3$ by $(\log 2mn)^2$ because the factor $\log 2X$ in (16.69) appears for small $t$ which is not in the spectrum for the modular group. Denote the sum over the exceptional spectrum (hypothetically empty) by

(16.74)   $$E(X) = \sum_{\frac12<s_j<1} w(s_j,X)\,\bar\rho_j(m)\rho_j(n).$$

Suppose $s_1=\frac12+\alpha$ is the largest point, i.e., $\lambda_1=s_1(1-s_1)=\frac14-\alpha^2$ is the lowest eigenvalue in the space of cusp forms. Since $w(s_1,X)\gg X^{2\alpha}$ for sufficiently large $X$ we deduce by comparing (16.71) with (16.3) for $m=n$ with $\rho_1(n)\ne 0$ that $\alpha\le\frac14$ which corresponds to the Selberg lower bound $\lambda_1\ge\frac{3}{16}$. The best known estimate is $\alpha\le\frac{7}{64}$ due to Kim and Sarnak [KSa] (see Section 5.11). The estimate $\alpha\le\frac{3}{14}$ is obtained in [114] by simpler arguments. Applying Cauchy's inequality and $w(s,X)\ll X^{2s-1}\log 2X$ uniformly for $\frac12\le s\le 1$ we get

$$E(X) \ll E_m(X)^{1/2}E_n(X)^{1/2}\log 2X$$

where

$$E_m(X) = \sum_{\frac12<s_j<1} |\rho_j(m)|^2 X^{2s_j-1}.$$

For any $1\le Y\le X$ we have $E_m(X)\le(X/Y)^{2\alpha}E_m(Y)$, and (16.58) yields

$$E_m(Y) \ll 1+\tau(q)\tau(m)q^{-1}(m,q)^{1/2}(mY)^{1/2}(\log 2mY)^2.$$

Choosing $Y=\min(X,\max(1,q^2/m(m,q)))$ we derive

$$E_m(X) \ll \Big[1+\Big(\frac{(m,q)m}{q^2}\Big)^{1/2}X^{2\alpha}+\Big(\frac{(m,q)mX}{q^2}\Big)^{2\alpha}\Big]\tau(q)\tau(m)(\log 2mX)^2.$$

Combining the above estimates we obtain



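PROPOSITION 16.10. Suppose the lowest cuspidal eigenvalue for $\Gamma_0(q)$ is $\lambda_1=\frac14-\alpha^2$ with $\alpha>0$. Then

(16.75)   $$E(X) \ll \Big(1+\Big(\frac{(m,q)m}{q^2}\Big)^{1/4}X^{\alpha}+\Big(\frac{(m,q)m}{q^2}\Big)^{\alpha}X^{\alpha}\Big)\Big(1+\Big(\frac{(n,q)n}{q^2}\Big)^{1/4}X^{\alpha}+\Big(\frac{(n,q)n}{q^2}\Big)^{\alpha}X^{\alpha}\Big)\tau(q)\tau(mn)(\log 2mnX)^3$$

where the implied constant is absolute.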

Because the bound (16.72) is absorbed by (16.75), the latter constitutes our final estimate for the sum (16.63). Although the Selberg eigenvalue conjecture is not yet proved, our bound for $S(X)$ turns out to be strong enough in the level aspect to yield adequate results in the most important applications. Very often in practice it suffices to know that $\lambda_1\ge\frac{3}{16}$. Putting $\alpha=\frac14$ we get the following simple unconditional estimate

THEOREM 16.11. For $m,n,q,X\ge 1$ we have

(16.76)   $$S(X) \ll \Big(1+(m,q)\frac{mX}{q^2}\Big)^{1/4}\Big(1+(n,q)\frac{nX}{q^2}\Big)^{1/4}\tau(q)\tau(mn)(\log 2mnX)^3.$$


Similar arguments work and the same results hold true for the sum (16.63) if S(m, n; c) is replaced by S( -m, n; c) for any m, n:;?: 1. To this end replace Theorem 16.5 by Theorem 16.6.


CHAPTER 17

PRIMES IN ARITHMETIC PROGRESSIONS

17.1. Introduction.

The Prime Number Theorem asserts that every residue class $a$ (mod $q$) with $(a,q)=1$ contains the same proportion of primes, i.e., for fixed $q$ we have

(17.1)   $$\pi(x;q,a) \sim \frac{1}{\varphi(q)}\,\pi(x),$$

(17.2)   $$\psi(x;q,a) \sim \frac{1}{\varphi(q)}\,\psi(x)$$

as $x$ tends to infinity. One even has reasonable estimates for the error terms in the asymptotics (17.1), (17.2). However, more important than the error term is the range of uniformity for the modulus $q$ in terms of $x$. The Siegel-Walfisz Theorem (Corollary 5.29) asserts that

(17.3)   $$\psi(x;q,a) = \frac{x}{\varphi(q)} + O\big(x(\log x)^{-A}\big)$$

for any $q\ge 1$, $(a,q)=1$, $x\ge 2$ and $A\ge 0$, the implied constant depending only on $A$. Notice that this estimate is non-trivial only if $q\ll(\log x)^A$. The Grand Riemann Hypothesis for $L(s,\chi)$ with $\chi$ (mod $q$) implies

(17.4)   $$\psi(x;q,a) = \frac{x}{\varphi(q)} + O\big(x^{1/2}(\log x)^2\big)$$

where the implied constant is absolute, and this is non-trivial for $q\ll x^{1/2}(\log x)^{-2}$. It is conjectured by H. Montgomery that

(17.5)

where the implied constant depends only on $\varepsilon$. This implies that the asymptotic formula (17.2) holds true uniformly in $q\le x^{1-\varepsilon}$. It was shown, however, by Friedlander and Granville ([FG], [Gra]), extending previous work of Maier [Mai] on primes in short intervals, that (17.1) and (17.2) cannot hold for $q\le x(\log x)^{-A}$ for any $A>0$. The GRH may be out of reach for several generations of researchers, but we have already a satisfactory substitute of (17.4), or rather the extension of (17.3), in the following form.


THEOREM 17.1 (E. BOMBIERI, A. I. VINOGRADOV, 1965). We have

(17.6)   $$\sum_{q\le Q}\max_{(a,q)=1}\Big|\psi(x;q,a)-\frac{x}{\varphi(q)}\Big| \ll x(\log x)^{-A}$$

for any $A\ge 0$, where $Q=x^{1/2}(\log x)^{-B}$ with $B=B(A)$, the implied constant depending on $A$.
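To make the quantity in (17.6) concrete, here is a small numerical sketch (an illustration only, with hypothetical function names) that tabulates $\Lambda(n)$, forms $\psi(x;q,a)$ for every class, and evaluates the left side of (17.6) for modest $x$ and $Q$. It says nothing about the asymptotic range of the theorem; it only shows how the sum is assembled.

```python
from math import gcd, log


def von_mangoldt_table(x):
    """lam[n] = Lambda(n) for 0 <= n <= x (log p on prime powers, else 0)."""
    lam = [0.0] * (x + 1)
    composite = [False] * (x + 1)
    for p in range(2, x + 1):
        if not composite[p]:
            for k in range(p * p, x + 1, p):
                composite[k] = True
            pk = p
            while pk <= x:
                lam[pk] = log(p)
                pk *= p
    return lam


def psi_by_class(x, q, lam):
    """psi(x; q, a) for every residue a = 0, ..., q - 1."""
    psi = [0.0] * q
    for n in range(1, x + 1):
        psi[n % q] += lam[n]
    return psi


def euler_phi(q):
    return sum(1 for a in range(1, q + 1) if gcd(a, q) == 1)


def bombieri_vinogradov_sum(x, Q):
    """Sum over q <= Q of the max over (a, q) = 1 of |psi(x; q, a) - x/phi(q)|."""
    lam = von_mangoldt_table(x)
    total = 0.0
    for q in range(1, Q + 1):
        psi = psi_by_class(x, q, lam)
        main = x / euler_phi(q)
        total += max(abs(psi[a % q] - main)
                     for a in range(1, q + 1) if gcd(a, q) == 1)
    return total


if __name__ == "__main__":
    x = 20000
    for Q in (10, 50, 140):
        print(Q, bombieri_vinogradov_sum(x, Q) / x)
```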

Here are brief comments on the history of the problem. All treatments use in one way or another the idea of large sieve due to Linnik [Li1]. First, A. Rényi [Re] established (17.6) with $Q=x^{\theta-\varepsilon}$ for some $\theta>0$, then K. Roth [Rot] got this for $\theta=\frac13$ while M. B. Barban [Bar] succeeded with $\theta=\frac38$ (and died soon after). Finally A. I. Vinogradov [Vi] reached the level $\theta=\frac12$ while E. Bombieri [Bo5] working independently at the same time proved a slightly stronger result as stated in Theorem 17.1. Both authors derived their results from new density theorems for zeros of $L$-functions which they established first. But Gallagher [Ga3] and Motohashi [Mot2] simplified the argument to avoid the use of zeros, instead feeding the main ideas directly into the estimation of bilinear forms over arithmetic progressions. These developments are related to the identities of Vinogradov, Linnik, Heath-Brown or Vaughan type for sums over primes discussed in Chapter 13. In this chapter we borrow from these innovations as much as possible to prove Theorem 17.1 with $B(A)=2A+6$ by employing minimal work.

EXERCISE 1. Prove the asymptotic formula for the Titchmarsh divisor problem:

$$\sum_{p\le x}\tau(p-1) \sim cx \qquad \text{with } c=\frac{\zeta(2)\zeta(3)}{\zeta(6)}=1.943596\ldots$$

as $x\to+\infty$. This formula was proved first by Linnik [Li7]. [Hint: Use Dirichlet's hyperbola method to reduce to counting primes $\equiv 1$ (mod $d$) with $d\le\sqrt x$, then use Theorem 17.1 and the Brun-Titchmarsh inequality (6.95).] See [Kol] for a fairly natural analogue of this problem for elliptic curves.
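The constant in Exercise 1 and the slow approach of the sum to its asymptotic can be inspected numerically. The sketch below is only an illustration under the obvious reading of the exercise: it evaluates $\zeta(2)\zeta(3)/\zeta(6)$ using the closed forms $\zeta(2)=\pi^2/6$, $\zeta(6)=\pi^6/945$ and a truncated series for $\zeta(3)$, and sums $\tau(p-1)$ over primes $p\le x$. Function names are hypothetical.

```python
from math import pi


def divisor_counts(x):
    """tau[n] = number of divisors of n for 1 <= n <= x (tau[0] is unused)."""
    tau = [0] * (x + 1)
    for d in range(1, x + 1):
        for multiple in range(d, x + 1, d):
            tau[multiple] += 1
    return tau


def primes_upto(x):
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            for k in range(p * p, x + 1, p):
                sieve[k] = False
    return [p for p in range(2, x + 1) if sieve[p]]


def titchmarsh_sum(x):
    """Sum of tau(p - 1) over primes p <= x."""
    tau = divisor_counts(x)
    return sum(tau[p - 1] for p in primes_upto(x))


def zeta3(terms=10000):
    """zeta(3) from a partial sum plus the first Euler-Maclaurin tail terms."""
    partial = sum(1.0 / n ** 3 for n in range(1, terms + 1))
    return partial + 1.0 / (2 * terms ** 2) - 1.0 / (2 * terms ** 3)


def titchmarsh_constant():
    """c = zeta(2) zeta(3) / zeta(6)."""
    return (pi ** 2 / 6) * zeta3() / (pi ** 6 / 945)


if __name__ == "__main__":
    c = titchmarsh_constant()
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        print(x, titchmarsh_sum(x) / x, c)
```

The ratio printed in the second column approaches $c$ only slowly, which is consistent with lower order terms of relative size about $1/\log x$.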

The Bombieri-Vinogradov theorem is expected to be true with $Q=x^{1-\varepsilon}$ (the Elliott-Halberstam conjecture) which follows from (17.5), but not from the GRH. There are some unconditional results of type

(17.7)   $$\sum_{q\le Q}\lambda(q)\Big(\psi(x;q,a)-\frac{x}{\varphi(q)}\Big) \ll x(\log x)^{-A}$$

with $Q=x^{\frac47-\varepsilon}$, where $a\ne 0$ is fixed and $\lambda(q)$ is any well-factorable function in the sense of sieve theory (see [BFI]). Here $\varepsilon>0$, $A>0$ are arbitrary and the implied constant depends on $a$, $\varepsilon$, $A$. This result contains useful information about equidistribution of primes $p\le x$ in a fixed residue class $a$ (mod $q$) of modulus $q$ larger than $x^{1/2}$, in which range the GRH does not apply. One can handle the larger moduli $q$ if the averaging over the classes $a$ (mod $q$) is permitted. Thus, P. Turán [Th2] derived from the GRH that

(17.8)   $$\sum^{*}_{a(\mathrm{mod}\, q)}\Big(\psi(x;q,a)-\frac{x}{\varphi(q)}\Big)^2 \ll x(\log x)^4$$

where the implied constant is absolute. This estimate is non-trivial for $q$ as large as $x(\log x)^{-5}$. Introducing the additional averaging over the modulus M. B. Barban [Bar] and H. Davenport and H. Halberstam [DH2] established unconditional results of comparable strength.

THEOREM 17.2 (BARBAN, DAVENPORT AND HALBERSTAM, 1966). We have

(17.9)   $$\sum_{q\le Q}\ \sum^{*}_{a(\mathrm{mod}\, q)}\Big(\psi(x;q,a)-\frac{x}{\varphi(q)}\Big)^2 \ll x^2(\log x)^{-A}$$

for any $A>0$, where $Q=x(\log x)^{-B}$ with $B=B(A)$, the implied constant depending on $A$.

There are analogous results for the Möbius function in arithmetic progressions.

EXERCISE 2. Prove that

(17.10)   $$\sum_{q\le Q}\max_{(a,q)=1}\Big|\sum_{\substack{m\le x\\ m\equiv a(\mathrm{mod}\, q)}}\mu(m)\Big| \ll x(\log x)^{-A}$$

for any $A>0$, where $Q=x^{1/2}(\log x)^{-B}$ with $B=B(A)$, the implied constant depending on $A$.

17.2. Bilinear forms in arithmetic progressions.

Given a reasonable arithmetic function $f$ it is natural to expect its values to be equidistributed over primitive residue classes, i.e., one should have a non-trivial bound for

(17.11)   $$D_f(x;q,a) = \sum_{\substack{n\le x\\ n\equiv a(\mathrm{mod}\, q)}} f(n) - \frac{1}{\varphi(q)}\sum_{\substack{n\le x\\ (n,q)=1}} f(n)$$

for all $(a,q)=1$, provided $q$ is not extremely large. In many cases one can show that

(17.12)

$$D_f(x;q,a) \ll \Big(\sum_{n\le x}|f(n)|^2\Big)^{1/2} x^{1/2}(\log x)^{-A}.$$

This is non-trivial only for $q\le(\log x)^A$. Having (17.12) one can prove by the large sieve inequality (7.31) a bound for $D_f(x;q,a)$ for some type of $f$ which is good for almost all $q\le x^{1/2}(\log x)^{-B}$. It is surprising how little is required of $f$. Essentially $f$ needs to be represented as a convolution of two sequences, say $f=\alpha*\beta$, one of which, say $\beta$, is equidistributed over primitive residue classes uniformly with respect to the moduli $q\le(\log x)^A$, or more generally, $f$ should be well approximated by a linear combination of such convolutions. We begin our investigations by examining a sequence $\beta=(\beta_n)$ of complex numbers supported on $1\le n\le N$. We assume that for some $0<\Delta\le 1$ it satisfies

(17.13)

for all $(a,q)=1$, where

$$\|\beta\| = \Big(\sum_n |\beta_n|^2\Big)^{1/2}.$$


First we show that (17.13) implies the following bound for character sums.

LEMMA 17.3. For a non-principal character $\chi$ (mod $r$) and a positive integer $s$ we have

(17.14)   $$\Big|\sum_{(n,s)=1}\beta_n\chi(n)\Big| \le \|\beta\| N^{1/2}\Delta^{3} r\,\tau(s).$$

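PROOF. By Möbius inversion we change the condition $(n,s)=1$ and then we split the sum as follows:

$$\sum_{(n,s)=1}\beta_n\chi(n) = \sum_{k\mid s}\mu(k)\sum_{n\equiv 0(\mathrm{mod}\, k)}\beta_n\chi(n) = \sum_{\substack{k\mid s\\ k\le K}}\mu(k)\sum_{\ell\mid k}\mu(\ell)\sum_{(n,\ell)=1}\beta_n\chi(n) + \sum_{\substack{k\mid s\\ k>K}}\mu(k)\sum_{n\equiv 0(\mathrm{mod}\, k)}\beta_n\chi(n).$$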
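The two inversion steps in this splitting are elementary identities for indicator functions, and a short brute-force check may help the reader see why the inner sum over $\ell\mid k$ appears. The sketch below (illustrative names, not from the text) verifies that $[(n,s)=1]=\sum_{k\mid (n,s)}\mu(k)$ and that, for squarefree $k$, $[k\mid n]=\sum_{\ell\mid k}\mu(\ell)\,[(n,\ell)=1]$.

```python
from math import gcd


def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def coprimality_by_mobius(n, s):
    """[gcd(n, s) = 1] as the sum of mu(k) over k | s with k | n."""
    return sum(mobius(k) for k in divisors(s) if n % k == 0)


def divisibility_by_mobius(n, k):
    """[k | n] for squarefree k as the sum of mu(l) over l | k with gcd(n, l) = 1."""
    return sum(mobius(l) for l in divisors(k) if gcd(n, l) == 1)


if __name__ == "__main__":
    print(coprimality_by_mobius(35, 12), divisibility_by_mobius(30, 6))
```

Only squarefree $k$ matter in the proof, since $\mu(k)=0$ otherwise.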

Next we estimate the sum of $\beta_n\chi(n)$ over $(n,\ell)=1$ by splitting into classes modulo $\ell r$, and for each class we apply the hypothesis (17.13). This gives us (the leading terms cancel out because $\chi$ is non-principal)

$$\|\beta\| N^{1/2}\Delta^{9}\sum_{\substack{k\mid s\\ k\le K}}\sum_{\ell\mid k}|\mu(\ell)|\,\varphi(\ell r) \le \|\beta\| N^{1/2}\Delta^{9} K r\,\tau(s).$$

Then we estimate the sum of $\beta_n\chi(n)$ over $n\equiv 0$ (mod $k$) using Cauchy's inequality. This gives us

$$\|\beta\| N^{1/2}\sum_{\substack{k\mid s\\ k>K}} k^{-1/2} \le \|\beta\| N^{1/2} K^{-1/2}\tau(s).$$

Adding both estimates we obtain (17.14) by taking $K=\Delta^{-6}$. □

Now we proceed to the main result.

THEOREM 17.4. Suppose $\beta=(\beta_n)$ is a sequence supported on $1\le n\le N$ which satisfies (17.13). Let $\alpha=(\alpha_m)$ be any sequence supported on $1\le m\le M$. Then we have

(17.15)   $$\sum_{q\le Q}\max_{(a,q)=1}|D_{\alpha*\beta}(MN;q,a)| \ll \|\alpha\|\|\beta\|\big(\Delta M^{1/2}N^{1/2}+M^{1/2}+N^{1/2}+Q\big)(\log Q)^2$$

where the implied constant is absolute.

PROOF. Using Dirichlet characters we write

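$$D_{\alpha*\beta}(MN;q,a) = \frac{1}{\varphi(q)}\sum_{\substack{\chi(\mathrm{mod}\, q)\\ \chi\ne\chi_0}}\bar\chi(a)\Big(\sum_m\alpha_m\chi(m)\Big)\Big(\sum_n\beta_n\chi(n)\Big).$$

Hence, reducing to primitive characters, the left side of (17.15) is bounded by

$$\sum_{s\le Q}\frac{1}{\varphi(s)}\sum_{1<r\le Q/s}\frac{1}{\varphi(r)}\ \sum^{*}_{\chi(\mathrm{mod}\, r)}\Big|\sum_{(m,s)=1}\alpha_m\chi(m)\Big|\,\Big|\sum_{(n,s)=1}\beta_n\chi(n)\Big|.$$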

If $r$ is small, say $r\le R$, we apply (17.14) getting

For the remaining $r>R$ we split the range into dyadic segments $P<r\le 2P$ and for each of such reduced sums we apply the large sieve inequality (7.31) getting

$$\ll \|\alpha\|\|\beta\|\big(Q+M^{1/2}+N^{1/2}+M^{1/2}N^{1/2}R^{-1}\big)(\log Q)^2.$$

Adding both results for $R=\Delta^{-1}$ we get (17.15). □

17.3. Proof of the Bombieri-Vinogradov Theorem.

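Suppose $x^{1/5}<n\le x$. By (13.39) with $y=z=x^{1/5}$ we write $\Lambda(n)=\Lambda^{\sharp}(n)+\Lambda^{\flat}(n)$, where

$$\Lambda^{\sharp}(n) = \sum_{\substack{\ell m=n\\ m\le x^{1/5}}}\lambda(\ell)\mu(m), \qquad \Lambda^{\flat}(n) = \sum_{\substack{\ell m=n\\ x^{1/5}<m}}\lambda(\ell)\mu(m),$$

and $\lambda(\ell)$ is the incomplete logarithm

(17.16)   $$\lambda(\ell) = \log\ell - \sum_{\substack{k\mid\ell\\ k\le x^{1/5}}}\Lambda(k).$$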
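The decomposition of $\Lambda$ above can be checked by direct computation. The sketch below is a minimal illustration in which the parameters `y` and `z` play the role of the common cut-off $x^{1/5}$; it evaluates the incomplete logarithm and the two partial convolutions and confirms that their sum returns $\Lambda(n)$ for every $n>z$. The function names are hypothetical.

```python
from math import isclose, log


def mangoldt(n):
    """Lambda(n): log p if n is a power of the prime p, otherwise 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)  # n itself is prime


def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def incomplete_log(l, z):
    """lambda(l) = log l minus the sum of Lambda(k) over k | l with k <= z."""
    return log(l) - sum(mangoldt(k) for k in divisors(l) if k <= z)


def lambda_sharp(n, y, z):
    """Part of the convolution lambda * mu with m <= y."""
    return sum(incomplete_log(n // m, z) * mobius(m) for m in divisors(n) if m <= y)


def lambda_flat(n, y, z):
    """Part of the convolution lambda * mu with m > y."""
    return sum(incomplete_log(n // m, z) * mobius(m) for m in divisors(n) if m > y)


if __name__ == "__main__":
    y = z = 5
    for n in (6, 49, 64, 97, 210):
        print(n, mangoldt(n), lambda_sharp(n, y, z) + lambda_flat(n, y, z))
```

The check also makes visible that $\lambda(\ell)$ vanishes for $\ell\le z$, so only $\ell>z$ contributes to $\Lambda^{\flat}$.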

Accordingly we split

(17.17)   $$D_{\Lambda}(x;q,a) = D_{\Lambda^{\sharp}}(x;q,a) + D_{\Lambda^{\flat}}(x;q,a) + O\big(x^{1/5}\log x\big)$$

where the error term comes from the contribution of terms $n<x^{1/5}$ for which the above identities do not apply. If $f$ is continuous and monotonic on $[1,y]$, then by elementary arguments we derive $|D_f(y;q,a)|\le 2|f(1)|+2|f(y)|$. Applying this for $f(\ell)=\log\ell$ and $f(\ell)=1$ we obtain $D_{\lambda}(y;q,a)\ll x^{1/5}\log x$. Hence $D_{\Lambda^{\sharp}}(x;q,a)\ll x^{2/5}\log x$ and summing over $q\le Q$ we get

(17.18)   $$\sum_{q\le Q}\max_{(a,q)=1}|D_{\Lambda^{\sharp}}(x;q,a)| \ll Qx^{2/5}\log x.$$

We shall estimate $D_{\Lambda^{\flat}}(x;q,a)$ by an appeal to Theorem 17.4. To this end we need $\Lambda^{\flat}(n)$ to be in a form of convolution of two sequences, and we almost have it except that the restriction $n=\ell m\le x$ makes $\ell$ and $m$ interconnected. To relax this and to keep hold of the sizes of $\ell m$, we are going to divide the interval $1\le n\le x$ into short subintervals of type $y<n\le(1+\delta)y$ with $x^{-1/5}<\delta\le 1$. There are $O(\delta^{-1})$ such subintervals. We cover $\Lambda^{\flat}(n)$ by partial sums of type

(17.19)   $$\mathop{\sum\sum}_{\substack{\ell m=n\\ L<\ell\le(1+\delta)L\\ M<m\le(1+\delta)M}}\lambda(\ell)\mu(m)$$

with $L$, $M$ taking values $(1+\delta)^j$ in the range $x^{1/5}<L,M<x^{4/5}$, $LM<x$, except for $n<x^{1/5}$ and $(1+\delta)^{-1}x<n<(1+\delta)x$ in which ranges the covering is not exactly with multiplicity one. Estimating the contribution in these excess areas trivially we obtain

(17.20)

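$$D_{\Lambda^{\flat}}(x;q,a) = \sum_L\sum_M D(LM;q,a) + O\big(\delta q^{-1}x\log x\big)$$

where $D(LM;q,a)$ stands for

$$\mathop{\sum\sum}_{\substack{\ell m\equiv a(\mathrm{mod}\, q)\\ L<\ell\le(1+\delta)L\\ M<m\le(1+\delta)M}}\lambda(\ell)\mu(m) - \frac{1}{\varphi(q)}\mathop{\sum\sum}_{\substack{(\ell m,q)=1\\ L<\ell\le(1+\delta)L\\ M<m\le(1+\delta)M}}\lambda(\ell)\mu(m).$$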
For each of $D(LM;q,a)$ we can apply Theorem 17.4 with $\Delta=(\log x)^{-A}$ because the sequence $\mu(m)$ satisfies the hypothesis (17.12) by the Siegel-Walfisz Theorem (also the sequence $\lambda(\ell)$ satisfies this hypothesis). This gives

(17.21)   $$\sum_{q\le Q}\max_{(a,q)=1}|D(LM;q,a)| \ll \delta\Delta x(\log x)^3$$

for $Q=\Delta x^{1/2}$. Summing over $L$, $M$ (there are $O(\delta^{-2})$ such segments) we obtain by (17.20) and (17.21) that

(17.22)

We choose $\delta=\Delta^{1/2}$ getting the bound $\Delta^{1/2}x(\log x)^3$. Adding (17.22) to (17.18) we obtain

(17.23)   $$\sum_{q\le\Delta x^{1/2}}\max_{(a,q)=1}\Big|\psi(x;q,a)-\frac{\psi(x)}{\varphi(q)}\Big| \ll \Delta^{1/2}x(\log x)^3.$$

Here we can replace $\psi(x)$ by $x+O(\Delta x)$ by the Prime Number Theorem. Having done this, Theorem 17.1 follows with

(17.24)   $$B(A) = 2A+6.$$

17.4. Proof of the Barban-Davenport-Halberstam Theorem.

We shall establish Theorem 17.2 for arithmetic functions f more general than A. It does not require f to be a convolution type as in Theorem 17.4, but only that f is well distributed over residue classes to quite small moduli.

THEOREM 17.5. Suppose that for some $0<\Delta\le 1$ we have

(17.25)

for all $(a,q)=1$. Then

(17.26)   $$\sum_{q\le Q}\ \sum^{*}_{a(\mathrm{mod}\, q)}|D_f(x;q,a)|^2 \ll \|f\|^2(\Delta x+Q)(\log Q)^2$$

where the implied constant is absolute.

PROOF. By the orthogonality of characters

$$\sum^{*}_{a(\mathrm{mod}\, q)}|D_f(x;q,a)|^2 = \frac{1}{\varphi(q)}\sum_{\substack{\chi(\mathrm{mod}\, q)\\ \chi\ne\chi_0}}\Big|\sum_{n\le x}f(n)\chi(n)\Big|^2.$$
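The orthogonality step can be tested numerically in the simplest situation. The following sketch assumes $q$ is an odd prime, so that the characters modulo $q$ can be written down from a primitive root; this restriction is made only to keep the example short and is not in the text. It compares the mean square of $D_f(x;q,a)$ over reduced classes with the average of $|\sum_{n\le x}f(n)\chi(n)|^2$ over non-principal characters. Names are hypothetical.

```python
import cmath
from math import isclose, pi


def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def primitive_root(q):
    """Smallest primitive root modulo an odd prime q."""
    for g in range(2, q):
        if len({pow(g, k, q) for k in range(q - 1)}) == q - 1:
            return g
    raise ValueError("no primitive root found; q should be an odd prime")


def both_sides(f, x, q):
    """Return (sum over a of |D_f(x; q, a)|^2, character side) for an odd prime q."""
    g = primitive_root(q)
    index = {pow(g, k, q): k for k in range(q - 1)}  # discrete logarithm table
    values = [(n, f(n)) for n in range(1, x + 1) if n % q != 0]
    mean = sum(v for _, v in values) / (q - 1)
    lhs = sum(abs(sum(v for n, v in values if n % q == a) - mean) ** 2
              for a in range(1, q))
    rhs = 0.0
    for t in range(1, q - 1):  # t = 0 would be the principal character
        s = sum(v * cmath.exp(2j * pi * t * index[n % q] / (q - 1))
                for n, v in values)
        rhs += abs(s) ** 2
    return lhs, rhs / (q - 1)


if __name__ == "__main__":
    print(both_sides(mobius, 500, 11))
```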

Averaging over $q$, then reducing to the primitive characters, we see that the left side of (17.26) is bounded by

$$\sum_{s\le Q}\frac{1}{\varphi(s)}\sum_{1<r\le Q/s}\frac{1}{\varphi(r)}\ \sum^{*}_{\chi(\mathrm{mod}\, r)}\Big|\sum_{\substack{n\le x\\ (n,s)=1}}f(n)\chi(n)\Big|^2.$$

At this point we apply the same arguments as in the proof of Theorem 17.4 with $\alpha=\beta=f$ getting the bound

$$\|f\|^2\big\{x\Delta^3R^2+Q+x^{1/2}+xR^{-1}\big\}(\log Q)^2.$$

Choosing $R=\Delta^{-1}$ we arrive at (17.26). □

COROLLARY 17.6. Let the conditions be as in Theorem 17.4. Let $ab\ne 0$. Then we have

(17.27)   $$\sum_{\substack{q\le Q\\ (q,ab)=1}}\Big|\mathop{\sum\sum}_{\substack{am\equiv bn(\mathrm{mod}\, q)\\ (mn,q)=1}}\alpha_m\beta_n - \frac{1}{\varphi(q)}\Big(\sum_{(m,q)=1}\alpha_m\Big)\Big(\sum_{(n,q)=1}\beta_n\Big)\Big| \ll \|\alpha\|\|\beta\|(M+Q)^{1/2}(\Delta N+Q)^{1/2}(\log Q)^2.$$

PROOF. The difference of the sums in (17.27) is also given by

$$\mathop{{\sum}^{*}{\sum}^{*}}_{au\equiv bv(\mathrm{mod}\, q)} D_{\alpha}(M;q,u)\,D_{\beta}(N;q,v).$$

Hence the result follows by Cauchy's inequality from Theorem 17.5 applied for $f=\alpha$ with $\Delta=1$ and for $f=\beta$ with $\Delta$. □

CHAPTER 18

THE LEAST PRIME IN AN ARITHMETIC PROGRESSION

18.1. Introduction.

After having uniform distribution of primes in primitive residue classes to a given modulus $q$ the natural question to ask is how big is the first prime in a given class? Denote

(18.1)   $$p(q,a) = \min\{p : p\equiv a(\mathrm{mod}\, q)\}.$$

Clearly $p(q,a)$ cannot be smaller than $(1+o(1))\varphi(q)\log q$ for some $a$ (mod $q$). It is conjectured that

(18.2)   $$p(q,a) \ll q^{1+\varepsilon}$$

while the Riemann Hypothesis for Dirichlet $L$-functions of characters $\chi$ (mod $q$) implies

(18.3)   $$p(q,a) \ll (\varphi(q)\log q)^2.$$
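For small moduli the quantity $p(q,a)$ is easy to tabulate, which gives some feeling for the scales $\varphi(q)\log q$ and $(\varphi(q)\log q)^2$ that appear in this discussion. The sketch below is only an illustration with hypothetical function names; it finds the least prime in each reduced class by direct search and reports the worst class for a few moduli.

```python
from math import gcd, log


def is_prime(n):
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            return False
        p += 1
    return True


def least_prime(q, a):
    """p(q, a): the least prime congruent to a modulo q, for gcd(a, q) = 1."""
    if gcd(a, q) != 1:
        raise ValueError("a must be coprime to q")
    n = a % q
    if n == 0:  # happens only for q = 1
        n = q
    while not is_prime(n):
        n += q
    return n


def worst_class(q):
    """Maximum of p(q, a) over the reduced residue classes a modulo q."""
    return max(least_prime(q, a) for a in range(1, q + 1) if gcd(a, q) == 1)


def euler_phi(q):
    return sum(1 for a in range(1, q + 1) if gcd(a, q) == 1)


if __name__ == "__main__":
    for q in (7, 30, 101, 210, 997):
        scale = euler_phi(q) * log(q)
        print(q, worst_class(q), round(scale, 1), round(scale ** 2, 1))
```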
Unconditionally, it is easy to derive from the Siegel-Walfisz Theorem (see Corollary 5.29) that $p(q,a)\ll\exp(q^{\varepsilon})$ for any $\varepsilon>0$, the implied constant depending on $\varepsilon$ (ineffectively). Moreover, if there is no exceptional character modulo $q$, then it follows from the Prime Number Theorem that

$$p(q,a) \ll q^{c\log q}$$

where $c$ is a positive constant. The celebrated theorem of Yu. V. Linnik [Li4], [Li5], [Li6] of 1944 asserts

THEOREM 18.1 (LINNIK). There exist absolute constants $c\ge 1$ and $L\ge 2$ such that

(18.4)

This beautiful theorem is one of the greatest achievements in analytic number theory. Originally Linnik did not give a numerical value of $L$, though his method was effective, i.e., both constants $c$ and $L$ could have been computed given the time and will. Here is a selected list of the Linnik constant produced by various researchers

L = 10,000    Pan (1957),
L = 777       Chen (1965),
L = 80        Jutila (1977),
L = 20        Graham (1981),
L = 16        Wang (1986),
L = 5.5       Heath-Brown (1992).

In this chapter we prove Linnik's theorem without determining the constant $L$.

Our treatments borrow substantially from the magnificent work of S. Graham [G1], [G2]. For readers interested in learning other methods (the Turán power-sums), we recommend Chapter 6 of E. Bombieri [Bo2]. The sharpest tools are developed in Heath-Brown [HB4]. All proofs of Linnik's theorem use in one form or another the following three principles about the zeros of the Dirichlet $L$-functions. Throughout we denote

(18.5)   $$L_q(s) = \prod_{\chi(\mathrm{mod}\, q)} L(s,\chi).$$

A zero of $L(s,\chi)$ will be denoted by $\rho=\beta+i\gamma$, or by $\rho_\chi=\beta_\chi+i\gamma_\chi$ if the dependence on the character need be displayed. For $\frac12\le\alpha\le 1$ and $T\ge 1$ we denote by $N(\alpha,T,\chi)$ the number of zeros $\rho_\chi$ counted with multiplicity in the rectangle

(18.6)   $$\alpha<\sigma\le 1, \qquad |t|\le T.$$

Thus

(18.7)   $$N_q(\alpha,T) = \sum_{\chi(\mathrm{mod}\, q)} N(\alpha,T,\chi)$$

is the number of all zeros of $L_q(s)$ counted with multiplicity in the rectangle (18.6).

PRINCIPLE 1 (THE ZERO-FREE REGION). There is a positive constant $c_1$ (effectively computable) such that $L_q(s)$ has at most one zero in the region

(18.8)   $$\sigma\ge 1-c_1/\log qT, \qquad |t|\le T.$$

The exceptional zero, if it exists, is real and simple and it is for a real, non-principal character.

PRINCIPLE 2 (THE LOG-FREE ZERO-DENSITY ESTIMATE). There are positive constants $c$, $c_2$ (effectively computable) such that for any $\frac12\le\alpha\le 1$ and $T\ge 1$,

(18.9)   $$N_q(\alpha,T) \le c\,(qT)^{c_2(1-\alpha)}.$$

PRINCIPLE 3 (THE EXCEPTIONAL ZERO REPULSION). There is a positive constant $c_3$ (effectively computable) such that, if the exceptional zero $\beta_1$ exists, say $L(\beta_1,\chi_1)=0$ with

(18.10)   $$1-c_1/\log qT \le \beta_1 < 1,$$

then the function $L_q(s)$ has no other zeros in the region

(18.11)   $$\sigma\ge 1-c_3\,\frac{|\log((1-\beta_1)\log qT)|}{\log qT}, \qquad |t|\le T.$$

The first principle is classical (due to Landau), and is proved in Section 5.9. The second principle is due to Linnik [Li6] and the third principle is a quantitative version of the Deuring-Heilbronn phenomenon (also due to Linnik [Li5]). Bombieri [Bo2] established a somewhat stronger version of Principle 3. He showed that if there is an exceptional zero, then Principle 2 can be made stronger as well. We extract from his result the following

PROPOSITION 18.2 (BOMBIERI). There is a positive constant $c_2$ (effectively computable) such that, if the zero $\beta_1$ satisfying (18.10) exists, then

(18.12)   $$N_q(\alpha,T) \ll (1-\beta_1)(\log qT)(qT)^{c_2(1-\alpha)}$$

for any $\frac12\le\alpha\le 1$, where the implied constant is absolute (and effectively computable).

EXERCISE 1. Show that Proposition 18.2 implies Principle 3. EXERCISE 2. Derive from Proposition 18.2 the Siegel bound (5.73) for real zeros of Dirichlet £-functions. The numerical values of the constants c1 , c2, c3 and c yield the Linnik constant L, but from our estimates this will be quite large. Assuming (as we can) that q is sufficiently large and T '( log q the above principles are known to hold for the constants c 1 = c2 = 3, c3 =

fa,

!.

18.2. The log-free zero-density theorem.

Recall the Huxley density estimate (see Theorem 10.4)

(18.13)

where $A$ is an absolute constant. Therefore, (18.9) is new only for zeros near the line $\mathrm{Re}\, s=1$, namely, for $\alpha$ with

(18.14)   $$1-\alpha \ll A\,\mathcal L^{-1}\log\mathcal L$$

where

(18.15)   $$\mathcal L = \log qT.$$

In this section we are going to prove Principle 2 with the constant $c_2=47$; however, the method is capable of giving a much smaller constant by direct modifications. Note that for the principal character $N(\alpha,T,\chi_0)$ equals the number of zeros of the Riemann zeta function in the rectangle (18.6). Thus by the Vinogradov zero-free region (Corollary 8.28) and the Huxley density estimate (see Chapter 10) it follows that

(18.16)   $$N(\alpha,T,\chi_0) \ll T^{3(1-\alpha)}.$$

This could also be established without appealing to the zero-free region along the lines of this section, but we exclude the principal character from consideration here to simplify our exposition. The main idea behind the proof of Principle 2 is the same as for the density estimates in Chapter 10, i.e., we construct a Dirichlet polynomial which serves as a zero-detector because it assumes enormous values at the zeros of $L(s,\chi)$. Then, using the duality idea followed by estimation of resulting character sums, we obtain the desired bound for $N_q(\alpha,T)$. If we had applied this directly to $L(s,\chi)$ then a loss of logarithmic factors would occur, which is not acceptable. Therefore some refinements are necessary. We eliminate the logarithmic factors by reducing the coefficients of $L(s,\chi)$ with a device similar to that which gave us a positive proportion of zeros of $\zeta(s)$ on the critical line (see Chapter 24). There are other methods for establishing Principle 2, such as the Power Sum Method of P. Turán (used by Bombieri, Jutila, Montgomery). We introduce two kinds of mollifications to $L(s,\chi)$. Put

(18.17)   $$K(s) = \sum_{n=1}^{\infty}\Big(\sum_{d\mid n}\lambda_d\Big)\Big(\sum_{b\mid n}\theta_b\Big)n^{-s}$$

where $\lambda_d$ is a continuously cut-off Möbius function and $\theta_b$ is a kind of coefficient used in sieve methods. Though $\lambda_d$ and $\theta_b$ appear to have similar structures they play here distinct roles. The first one is applied to annihilate the early terms of the series (except for $n=1$) while the second one is applied to sift out (or rather to diminish the contribution) of the terms which have small prime factors. Precisely we choose

(18.18)   $$\lambda_d = \mu(d)\min\Big(1,\frac{\log z/d}{\log z/w}\Big)$$

for $1\le d\le z$ where $1<w<z$, and we set $\lambda_d=0$ if $d>z$. Then we choose

(18.19)   $$\theta_b = \frac{\mu(b)\,b}{G\,\varphi(b)}\sum_{\substack{ab\le y\\ (a,bq)=1}}\frac{\mu^2(a)}{\varphi(a)}$$

where $G$ is the normalization factor such that $\theta_1=1$, i.e.,

(18.20)   $$G = \sum_{\substack{a\le y\\ (a,q)=1}}\frac{\mu^2(a)}{\varphi(a)}.$$

The coefficients $\theta_b$ are coming from the $\Lambda^2$-sieve and have the following properties (see Section 6.5):

(18.21)   $$|\theta_b|\le 1,$$

(18.22)   $$\sum_{b_1,b_2}\frac{\theta_{b_1}\theta_{b_2}}{[b_1,b_2]} = G^{-1},$$

(18.23)   $$G \ge \frac{\varphi(q)}{q}\log y$$

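(in fact one could choose $\theta_b$ from essentially any upper-bound sieve of level $y$). Since $\lambda_d$ agrees with $\mu(d)$ for $1\le d\le w$ it follows by Möbius inversion that

(18.24)   $$\sum_{d\mid n}\lambda_d = \begin{cases} 1 & \text{if } n=1,\\ 0 & \text{if } 1<n\le w.\end{cases}$$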
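The properties of the two families of weights can be confirmed on small examples. The sketch below is an illustration under stated assumptions: it takes $\theta_b$ to be supported on squarefree $b\le y$ coprime to $q$ (the usual convention for the $\Lambda^2$-sieve, assumed here rather than quoted from the text), uses exact rational arithmetic for $\theta_b$ and $G$, and checks (18.21), (18.22), (18.23) together with the vanishing (18.24) of the mollified divisor sums. The function names and the sample values of $w$, $z$, $y$, $q$ are hypothetical.

```python
from fractions import Fraction
from math import gcd, log


def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def euler_phi(n):
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def lambda_weight(d, w, z):
    """(18.18): mu(d) * min(1, log(z/d)/log(z/w)) for d <= z, and 0 for d > z."""
    if d > z:
        return 0.0
    return mobius(d) * min(1.0, log(z / d) / log(z / w))


def mollified_divisor_sum(n, w, z):
    """Sum of lambda_d over the divisors d of n."""
    return sum(lambda_weight(d, w, z) for d in range(1, n + 1) if n % d == 0)


def partial_G(t, modulus):
    """Sum of mu(a)^2 / phi(a) over a <= t with gcd(a, modulus) = 1."""
    return sum(Fraction(mobius(a) ** 2, euler_phi(a))
               for a in range(1, t + 1) if gcd(a, modulus) == 1)


def theta_weights(y, q=1):
    """Weights (18.19) on squarefree b <= y coprime to q, and G from (18.20)."""
    G = partial_G(y, q)
    theta = {}
    for b in range(1, y + 1):
        if gcd(b, q) == 1 and mobius(b) != 0:
            theta[b] = (Fraction(mobius(b) * b, euler_phi(b))
                        * partial_G(y // b, b * q) / G)
    return theta, G


def quadratic_form(theta):
    """Sum of theta_{b1} theta_{b2} / [b1, b2], using 1/lcm = gcd/(b1*b2)."""
    return sum(t1 * t2 * Fraction(gcd(b1, b2), b1 * b2)
               for b1, t1 in theta.items() for b2, t2 in theta.items())


if __name__ == "__main__":
    theta, G = theta_weights(30, q=6)
    print(float(G), float(quadratic_form(theta)), theta[5], theta[7])
    print([mollified_divisor_sum(n, 10, 100) for n in range(1, 13)])
```

Exact fractions are used for $\theta_b$ so that (18.22) can be tested as an identity rather than up to rounding.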

For any n > w we have the trivial bound (18.25)


however, it was shown by S. Graham [G1] that

(18.26)

Since we do not seek the best constants we make the comfortable choice of the levels $y$, $w$, $z$, namely

(18.27)

For any $\chi$ (mod $q$) we have the twisted series

(18.28)   $$K(s,\chi) = \sum_{n=1}^{\infty}\Big(\sum_{d\mid n}\lambda_d\Big)\Big(\sum_{b\mid n}\theta_b\Big)\chi(n)n^{-s}.$$

This factors into

(18.29)   $$K(s,\chi) = L(s,\chi)M(s,\chi)$$

where

(18.30)   $$M(s,\chi) = \sum_m\Big(\mathop{\sum\sum}_{[b,d]=m}\lambda_d\theta_b\Big)\chi(m)m^{-s}.$$

Actually we shall take only a partial sum of $K(s,\chi)$,

(18.31)   $$K_x(s,\chi) = \sum_{1\le n\le x}\Big(\sum_{d\mid n}\lambda_d\Big)\Big(\sum_{b\mid n}\theta_b\Big)\chi(n)n^{-s}$$

with

(18.32)   $$x = (qT)^{23}.$$

Notice that $m=[b,d]\le bd\le yz$ and for $\chi\ne\chi_0$

(18.33)

Hence, for $s=\sigma+it$ in the rectangle (18.6) the tail of $K(s,\chi)$ satisfies

(18.34)   $$|K(s,\chi)-K_x(s,\chi)| \le 2q|s|yzx^{-\sigma} \le 4qTyzx^{-\alpha} \le \tfrac12.$$

For $s=\rho$, a zero of $L(s,\chi)$, we have $K(\rho,\chi)=0$ by (18.29), so (18.34) becomes

(18.35)

Extracting from $K_x(\rho,\chi)$ the first term, this inequality becomes the zero detector

(18.36)   $$\Big|\sum_{w<n\le x}\Big(\sum_{d\mid n}\lambda_d\Big)\Big(\sum_{b\mid n}\theta_b\Big)\chi(n)n^{-\rho}\Big| \ge \tfrac12.$$

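For the trivial character $\chi=\chi_0$ the detector of zeros is somewhat different from (18.36), so we shall count these zeros separately. Let $S(\chi)$ denote the set of zeros of $L(s,\chi)$ (counted with multiplicity) in the rectangle (18.6). By (18.36) we deduce that the total number

(18.37)   $$R = \sum_{\chi\ne\chi_0}|S(\chi)| = \sum_{\chi\ne\chi_0}N(\alpha,T,\chi)$$

satisfies

$$R \le 2\sum_{\chi}\sum_{s\in S(\chi)}\Big|\sum_{w<n\le x}\Big(\sum_{d\mid n}\lambda_d\Big)\Big(\sum_{b\mid n}\theta_b\Big)\chi(n)n^{-s}\Big| = 2\sum_{w<n\le x}\Big(\sum_{d\mid n}\lambda_d\Big)\Big(\sum_{b\mid n}\theta_b\Big)\sum_{\chi}\sum_{s\in S(\chi)}c_\chi(s)\chi(n)n^{-s}$$

for some numbers $c_\chi(s)$ with $|c_\chi(s)|=1$. By Cauchy's inequality we obtain

$$R^2 \le 4UV$$

where

(18.38)   $$U = \sum_{w<n\le x}\Big(\sum_{d\mid n}\lambda_d\Big)^2 n^{1-2\alpha}$$

and

(18.39)   $$V = \sum_n f(n)\Big(\sum_{b\mid n}\theta_b\Big)^2 n^{2\alpha-1}\Big|\sum_{\chi}\sum_{s\in S(\chi)}c_\chi(s)\chi(n)n^{-s}\Big|^2.$$

Here $f$ is any non-negative function such that $f(n)\ge 1$ for $w<n\le x$.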

Using (18.26) one derives by partial summation that for any $\frac12\le\alpha\le 1$,

(18.40)   $$U \le x^{2(1-\alpha)}\frac{\log x/w}{\log z/w}\Big\{1+O\Big(\frac{1}{\log z/w}\Big)\Big\} \ll x^{2(1-\alpha)}.$$

To estimate $V$ we square out and change the order of summation getting

(18.41)

where

$$B(\chi,s) = \sum_n f(n)\Big(\sum_{b\mid n}\theta_b\Big)^2\chi(n)n^{-s}$$

for $\chi=\chi_1\bar\chi_2$ and $s=s_1+\bar s_2+1-2\alpha$. Note that $\mathrm{Re}(s)\ge 1$. Take any $f$ supported on $[w/v,xv]$ which is continuous, bounded and piecewise monotonic. Then we have

L

f(dn)(dn}-s

=

n=a(mod q)

F(s) dq

+0(1s\~) w

where

(18.42)

F(s) =

j f(()C.d(

and the implied constant is absolute. Hence we derive that for any (d, q) = 1,

L

f(n)x(n)n-•

=

o(x)

'Pd:) F(s) + o(islq;J

n;::;:O(mod d)

where o(xo) = 1 and o(x) = 0 otherwise. This yields by (18.28) and (18.29)

(18.43)

B(x, s) = o(x)
F~s) + 0(\slq~/)·



Introducing (18.43) into (18.41) we get by (18.30) (18.44)

V ~

'2.::: '2.:::'2.:::

IF(sl + s2 + 1- 2a)l(logy)- 1 +0(R 2 Tqv~-V).

X s1 ,s2ES(xl

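One can find $f$ such that for $\mathrm{Re}(s) \geq 1$,

(18.45)  $F(s) \ll (1 + |s - 1| \log v)^{-2} \log x.$

For example, we may take

$f(\xi) = \min\Big\{1 - \frac{\log w/\xi}{\log v},\ 1,\ 1 - \frac{\log \xi/x}{\log v}\Big\}$

for $w/v \leq \xi \leq xv$ and $f(\xi) = 0$ otherwise. Indeed, writing

$f(\xi) \log v = \log^+(xv/\xi) - \log^+(x/\xi) - \log^+(w/\xi) + \log^+(w/v\xi)$

one verifies by the formula

$\int_0^\infty \log^+(y/\xi) \, \xi^{s-1} \, d\xi = y^s s^{-2}$

that

$F(1 - s) = \hat{f}(s) = \int_0^\infty f(\xi) \xi^{s-1} \, d\xi = \frac{(xv)^s - x^s - w^s + (w/v)^s}{s^2 \log v} = \frac{(v^s - 1)(x^s - w^s v^{-s})}{s^2 \log v} \ll \min\Big(\log x, \frac{1}{|s|^2 \log v}\Big).$

This gives (18.45). For $s = s_1 + \bar{s}_2 + 1 - 2\alpha$ we have $|s - 1| = |\beta_1 + \beta_2 - 2\alpha + i(\gamma_1 - \gamma_2)| \geq |\gamma_1 - \gamma_2|$. By (18.44) and (18.45) we get (18.46)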
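The identity between the two descriptions of the weight $f$ (the minimum of three pieces, and the combination of four $\log^+$ terms divided by $\log v$) can be checked numerically. The following is a minimal sketch; the function names and the sample values of $w$, $x$, $v$ are illustrative assumptions and are not taken from the text, where these parameters are powers of $qT$.

```python
import math

def log_plus(t):
    """log^+(t) = max(log t, 0) for t > 0."""
    return max(math.log(t), 0.0)

def f_min_form(xi, w, x, v):
    """f(xi) as the minimum of three pieces on [w/v, x*v], and 0 elsewhere."""
    if xi < w / v or xi > x * v:
        return 0.0
    log_v = math.log(v)
    return min(1.0 - math.log(w / xi) / log_v,
               1.0,
               1.0 - math.log(xi / x) / log_v)

def f_log_plus_form(xi, w, x, v):
    """f(xi) recovered from the four log^+ terms, divided by log v."""
    total = (log_plus(x * v / xi) - log_plus(x / xi)
             - log_plus(w / xi) + log_plus(w / (v * xi)))
    return total / math.log(v)

# Hypothetical sample parameters with w < x and v > 1.
W, X, V = 10.0, 1000.0, 3.0
GRID = [0.5 * j for j in range(1, 8001)]  # xi from 0.5 to 4000
MAX_GAP = max(abs(f_min_form(xi, W, X, V) - f_log_plus_form(xi, W, X, V))
              for xi in GRID)
```

On this grid the two forms agree up to rounding, which reflects that $f$ equals 1 on $[w, x]$ and decays linearly in $\log \xi$ to 0 at $w/v$ and at $xv$.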

LEMMA 18.3. Let $\chi \pmod q$ be a non-trivial character, $\frac12 \leq \alpha \leq 1$, $v \geq 2$ and $t$ a real number. Then

(18.47)  $\sum_{L(\rho,\chi)=0,\ \beta \geq \alpha} (1 + |\gamma - t| \log v)^{-2} \leq \frac12 \Big(1 - \alpha + \frac{1}{\log v}\Big) \log A v^2 q(|t| + 1)$

where $A$ is an absolute constant.

PROOF.

We first suppose that

x is

primitive. We have

L' 1 q 1 r' ( -+-(l+x(-1)) s 1 ) +B(xl+l.::: ( -1+ -1) (18.48) -(s,x)=--iog---L 2 11 2r 2 4 s-p p p

where the summation ranges over all zeros of L(s, x) with ,6 > 0 and B(x) is a constant with 1 Re B(x) =- l.:::Rep

p



(see (5.29)). Hence it follows that for 1
(18.49) For s =

CJ

+ it with u > 1 we have

1 (u-(3) 1 ( b-t[)-2 Re--= ?--- 1+-s-p (u-(3)2+(,-t)2 (u-(3) u-1 ' L' = (' ly(s, x)l :( A(n)n-" = -((u) =

L

1 (Y-

1 + 0(1).

I

Hence (18.49) yields (drop the zeros with (3
""""(1+ btf) ~ u-1

2 :(

/3)a

1 (u- n)(11og q(ftf + 1) + - - + 0(1)). u-1

Choosing $\sigma = 1 + 1/\log v$ we obtain (18.47). If $\chi \pmod q$ is induced by a primitive character $\chi^* \pmod{q^*}$ with $q^* \mid q$, then the left side of (18.47) agrees with that for $\chi^*$, so the result holds for any $\chi \neq \chi_0$.

Notice that $1 + (1 - \alpha) \log v \leq v^{1-\alpha}$. Therefore Lemma 18.3 and (18.46) yield (18.50)

V

«

by the trivial bound R

1ogxlogvqT R 2 T v z ---- + q-y R v 1_ 0

logy logv

«

w

«

RI-a v

qT log qT and by choosing v =qT.

(18.51)

Multiplying (18.40) and (18.53) we obtain by (18.38) that (18.52) Hence we get (18.9) by adding the number of zeros of $L(s, \chi_0)$, which is negligible by virtue of (18.16).

18.3. The exceptional zero repulsion.

In this section we are going to prove Principle 3; however, our method is capable of giving (18.13). It begins with a suitable modification of the arguments which we used for the proof of Principle 2 in the previous section. Suppose $\chi_1 \pmod q$ is the exceptional character and $\beta_1$ is the exceptional zero of $L(s, \chi_1)$ which satisfies (18.53) Throughout we assume (as we can) that $c_1$ is a sufficiently small, absolute positive constant. Our goal is to prove that there exists an absolute constant $c_0 \geq 2c_1$ such that the function $L_q(s)$ has no zero other than $\beta_1$ in the region

(18.54)  $\sigma \geq 1 - \frac{\log(c_0/\delta_1 \log qT)}{92 \log qT}, \qquad |t| \leq T.$

Clearly this result implies Principle 3.



To take full advantage of the repulsion property of $\beta_1$ we apply the method of zero-detectors from the previous section to the function $\zeta(s) L(s + \delta_1, \chi_1)$ rather than to $\zeta(s)$. We have

(18.55)  $\zeta(s) L(s + \delta_1, \chi_1) = \sum_{n=1}^{\infty} \rho(n) n^{-s}$

where $\rho(n)$ is a positive, multiplicative function given by

(18.56)  $\rho(n) = \sum_{a \mid n} \chi_1(a) a^{-\delta_1}.$

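REMARK. One could work as well with the product

(18.57)  $\zeta(s) L(s, \chi_1) = \sum_{n=1}^{\infty} \tau(n, \chi_1) n^{-s}$

where

(18.58)  $\tau(n, \chi_1) = \sum_{a \mid n} \chi_1(a)$

but the slight shift in $L(s + \delta_1, \chi_1)$ turns out to simplify some technical arguments.

As before we introduce two kinds of mollifications to $\zeta(s) L(s + \delta_1, \chi_1)$ getting

(18.59)  $K(s) = \sum_{n=1}^{\infty} \Big(\sum_{d \mid n} \lambda_d\Big)\Big(\sum_{b \mid n} \theta_b\Big) \rho(n) n^{-s}$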

where $\lambda_d$ are defined by (18.18) and $\theta_b$ are defined by (18.19) with the parameters $y, w, z$ given by (18.27). Then for any non-principal character $\chi \pmod q$ we consider the twisted series

(18.60)  $K(s, \chi) = \sum_{n=1}^{\infty} \Big(\sum_{d \mid n} \lambda_d\Big)\Big(\sum_{b \mid n} \theta_b\Big) \rho(n) \chi(n) n^{-s}.$

This factors into (18.61)

K(s, x) = L(s, x)L(s + oi, xxr)M(s; x)

where (18.62)

M(s, x) =

2.::: (2..::: 2.::: Adeb) II (p(p)- x(p)p-s- 201 )x(m)m-•. m

[b,d]=m

pjm

To see this we arrange K(s,x) as follows: K(s, X) =

2.::: (2..::: 2.::: Adeb) x(m)m-• 2.::: p(mn)x(n)n-•. m

[b,d]=m

n


Opening p(mn) by (18.56) we see that the innermost sum is

L Xl(a)a-"•

L

x(n)n-s = L(s,x) L

n=O(mod af(a,m))

L

L(s,x) LXl(c)c-"•

=

x~~~) x((a,am)) ((a,am))'

a

elm

XlX(a)a-s-8•

(a,c)=l

elm

pic

and this yields (18.61) and (18.62) because $m$ is squarefree. By (18.61) it follows that $K(s, \chi)$ is holomorphic in the whole complex plane (if $\chi = \chi_1$, then the pole of $L(s + \delta_1, \chi_0)$ at $s = 1 - \delta_1 = \beta_1$ cancels with the zero of $L(s, \chi_1)$). As before we take only a partial sum of $K(s, \chi)$,

(18.63)  $K_x(s, \chi) = \sum_{1 \leq n \leq x} \Big(\sum_{d \mid n} \lambda_d\Big)\Big(\sum_{b \mid n} \theta_b\Big) \rho(n) \chi(n) n^{-s}$

with $x$ given by (18.32), and we show by contour integration that the tail of $K(s, \chi)$ satisfies

(18.64)  $|K(s, \chi) - K_x(s, \chi)| \leq \frac12$

for $s = \sigma + it$ with $\sigma \geq \frac12$ and $|t| \leq T$. Then for $s = \rho$, a zero of $L(s, \chi)$ which is different from $\beta_1$ if $\chi = \chi_1$, we have $K(\rho, \chi) = 0$ by (18.61), therefore (18.64) yields

(18.65)  $\Big| \sum_{w < n \leq x} \Big(\sum_{d \mid n} \lambda_d\Big)\Big(\sum_{b \mid n} \theta_b\Big) \rho(n) \chi(n) n^{-\rho} \Big| \geq \frac12$

if $\rho = \beta + i\gamma$ with $\beta \geq \frac12$ and $|\gamma| \leq T$.

The inequality (18.65) is viewed as the zero-detector. This would not be useful if not for the fact that $\rho(n)$ is very small quite frequently. One cannot exploit a possible cancellation of terms because of the complexity of the coefficients $\lambda_d$. Remember that it is due to the introduction of the factors

(18.66)  $\omega(n) = \sum_{d \mid n} \lambda_d$

that the early terms with $1 < n \leq w$ are annihilated. Now, our strategy will be to remove these factors (by applying Hölder's inequality) to create a sum of $\nu(n)^2 \rho(n)$, where

(18.67)  $\nu(n) = \sum_{b \mid n} \theta_b$

over $n$ with $w < n \leq z$. When applying Hölder's inequality one could also remove the factor $\nu(n)^2$, but it would result in a loss of $\log q$ which is not acceptable. For this reason we retain $\nu(n)^2 \rho(n)$. Still, the sum of $\nu(n)^2 \rho(n)$ is manageable because the support of $\theta_b$ is relatively small by comparison to the range of $n$. In this way, by taking advantage of the fact that $\rho(n)$ nearly vanishes quite often (depending on how close to one the exceptional zero is), we derive from the inequality (18.65) that $\beta$ is not close to one; precisely, this inequality prohibits $\rho = \beta + i\gamma$ from being in the region (18.54).



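Having outlined the strategy we proceed to details. By (18.65) we get

$\sum_{w < n \leq x} |\omega(n) \nu(n)| \rho(n) n^{-\beta} \geq \frac12.$

Hence applying Hölder's inequality (with exponents 2, 4, 4)

(18.68)  $U^2 V W \geq 2^{-4}$

where

$U = \sum_{w < n \leq x} \omega^2(n) n^{1-2\beta},$

$V = \sum_{w < n \leq x} \nu^2(n) \rho^3(n) n^{-1},$

$W = \sum_{w < n \leq x} \nu^2(n) \rho(n) n^{-1}.$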
For $U$ we have by (18.40)

(18.69)  $U \ll x^{2(1-\beta)}.$

In $V$ we estimate $\rho^3(n)$ by $\tau^3(n)$ and apply a sieve of dimension 8 (see the Fundamental Lemma 6.3) getting (18.70)

We shall show that

(18.71)  $W \ll \delta_1 \log x.$

Multiplying these estimates we derive from (18.68) that

(18.72)  $\delta_1 (\log x) x^{4(1-\beta)} > 23 c_0,$

where $c_0$ is a positive absolute constant. Assuming (18.53) holds with $2c_1 \leq c_0$ we conclude by (18.72) with $x = (qT)^{23}$ that $\rho = \beta + i\gamma$ is not in the region (18.54) as claimed.

We now proceed to the proof of (18.71); it is here where the exceptional zero plays its role. Our treatment is essentially elementary, yet some arguments are quite subtle (we borrow these from the work of Bombieri [Bo2]). We begin with an asymptotic evaluation of $W$. To this end we examine the generating zeta function

(18.73)  $W(s) = \sum_{n=1}^{\infty} \nu^2(n) \rho(n) n^{-s}.$

This is just (18.60) with $\lambda_d = \theta_d$ and $\chi = \chi_0$, so by (18.61) and (18.62)

$W(s) = L(s, \chi_0) L(s + \delta_1, \chi_1) M(s)$

where

I

~-

(m,q)=l

plm



and

am=

LL fhJh2· (b1,b2]=m

Hence $W(s)$ is holomorphic everywhere except for a simple pole at $s = 1$ with

$\operatorname{res}_{s=1} W(s) = \frac{\varphi(q)}{q} L(1 + \delta_1, \chi_1) M(1).$

By contour integration we derive that

(18.74)  $W = \frac{\varphi(q)}{q} M(1) L(1 + \delta_1, \chi_1) \log \frac{x}{w} + O\Big(\frac{1}{q}\Big).$

Now we need estimates for J\!(1) and £(1 + 81, xi). From sieve theory (see Chapter 6) we get

1\1(1)=

(18.75)

L

amii(p(p)-p-1-2ot)m-1

(m,q)=1

«

II (1 -

p~y

p]m

p(p p

l) =

_q

II (1 -

cp(q) p~y

p(p p

l}

p(q

To estimate $L(1 + \delta_1, \chi_1)$ we first examine

(18.76)  $P(x, y) = \sum_{y < p \leq x} (1 + \chi_1(p)) p^{-1}.$

LEMMA 18.4. For $x > y \geq q^2$ and $q$ sufficiently large we have

(18.77)  $P(x, y) \leq 4 \delta_1 \log x.$

PROOF. Consider

$S(x) = \sum_{n \leq x} \tau(n, \chi_1) n^{-1}$

where T( n, Xd is given by (18.58), SO it is a multiplicative function with T(p, XI) = 1 + x1 (p). Multiplying P(x,y) and S(y) we deduce from the non-negativity of T(n,xi)that

P(x,y)S(y)

~

S(xy)- S(y).

Then by the elementary formula (see (22.11))

S(x) = £(1, xi)(log x + 1) + £'(1, xi)+ O(q~ x-! log x) we get

S(xy)- S(x) = £(1, xi) logx + O(q~y-~ logy) ~ 2£(1, XI) logx

,;



because L(1,xJ) » q-~ andy;? q2 . On the other hand, we derive a lower bound for S(y) as follows

S(y);? y- 01 LT(n,xl)(1- ~)n-Ih n~y

2

=- . 27rZ

1 (!)

~-~

((s+;3J)L(s+/'lt,xl)-(--)ds S S +1

2L(1,xl)

_!

= 81(1h +1) +O(y

2

1 1 q'):? 28 1 L( 1,xl).

0

Combining these bounds we complete the proof of (18.77).

REMARKS. Lemma 18.4 shows that if $\delta_1 = 1 - \beta_1 = o(1/\log q)$, then $\chi_1(p) = -1$ for almost all primes in segments $q^2 < p \leq q^A$. For very small primes the above arguments are rather crude, but see Chapter 22, in particular (22.100).
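The proof of Lemma 18.4 rests on three properties of $\tau(n, \chi_1) = \sum_{a \mid n} \chi_1(a)$ for a real character: it is non-negative, it is multiplicative, and $\tau(p, \chi_1) = 1 + \chi_1(p)$. The sketch below checks these numerically. Since no exceptional character is known, it uses the non-trivial character modulo 4 purely as a stand-in; that choice and all function names are assumptions made for illustration.

```python
from math import gcd

def chi_mod4(n):
    """Non-trivial real character mod 4 (a stand-in for chi_1, not an exceptional character)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def tau_twisted(n, chi):
    """tau(n, chi) = sum of chi(a) over the divisors a of n, as in (18.58)."""
    return sum(chi(a) for a in range(1, n + 1) if n % a == 0)

def is_multiplicative_up_to(limit, chi):
    """Check tau(mn, chi) = tau(m, chi) * tau(n, chi) for coprime m, n with mn <= limit."""
    for m in range(1, limit + 1):
        for n in range(1, limit // m + 1):
            if gcd(m, n) == 1:
                if tau_twisted(m * n, chi) != tau_twisted(m, chi) * tau_twisted(n, chi):
                    return False
    return True
```

For this particular character the values vanish at primes $p \equiv 3 \pmod 4$, which mirrors the mechanism in the lemma: primes with $\chi_1(p) = -1$ contribute nothing to $P(x, y)$.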

Now we are ready to estimate £(1 + 81 , xi). For any 1 < s

((s)L(s,x1) ::o::

II(l +T(p,xl)p-

~

2 we have

8 )

p

and by Lemma 18.4 we obtain by partial summation

II(1+T(p,x!)p-s)

4

~exp(LT(p,x!)p-•) «expC ~\)·

p>y

P>Y

Hence

L(s, xi)«: (s- 1) exp

(~) s-1

II (1 + T(p, xi)p-•). p(y

In particular, we get (18.78)

L( 1 + 81, XI)

«: 81

II (1 + pp(p)) .

p(y

Combining (18.74), (18.75) and (18.78) we get $W \ll \delta_1 \log x + q^{-1}$ which completes the proof of (18.71), and of Principle 3.

18.4. Proof of Linnik's Theorem.

Throughout $c_1, c_2, c_3$ and $c$ are the absolute constants from Principles 1, 2 and 3. We assume (as we can) that

(18.79)  $c, c_2 > 1 > c_1, c_3 > 0.$

Let (18.80) and put (18.81)



We use the truncated explicit formula (see (5.65))

X 1 '1/J(x;q,a)=-() --(-) 'P q

R xPx

L

'P q x(mod

(X

x(a)L - + 0 Rlogx Px

q)

)

Px

where I:R restricts the summation to zeros Px = f3x + hx of L( s, x) in u ~ ~,It! ~ R. First we estimate the sum over zeros Px with ~ T < lfx I ~ T for 1 ~ T ~ R by using Principle 2 as follows:

I L x(mod

x(a) q)

L

xPxp_;ll

~T
~ T2 L X

L

xf3x =

-T2

!1

1 2

1 1

x"'dNq(a,T)

I

hxi(T 1

2 1 = -Tx2Nq(!,T) ~

2c 1 02. 2 Tx2(qT)

2 + -(logx)

T

2c + -x(logx) T

1

((qT)c 2 /x) 1-ada

1/2

xlogx

2c

Nq(a,T)x"'da

112

x

~ T log(x(qT) c2) ~ 2c(c 2 + 1)T ~

provided x;? (qT)c 2 + 1 which is our case. Notice the absence of logx in the last bound. Hence the explicit formula reduces to (18.82)

1

X

L

'1/J(x;q,a) = -(-)- - ( ) 'P q 'P q x(mod

x(a) q)

( LPx -xPx +0 -( )T + R logx Px 'P q T

X

X

)

where I:T restricts the summation to zeros Px with lfxl ~ T, and the implied constant is absolute. We choose

T=q,

(18.83)

so the error term in (18.82) is O(x(logq)jq
I L x(a)L:~'\~2 L x(mod q

Px

X

x(mod q)

Lxf3x Px

where (18.85)

71 = cJ/2logq

if the zero {3 1 does not exist, and (18.86)

71

=

c31log(2o1logq)l/2logq

I:'

by an appeal to

l



if fJ1 exists. By Principle 2 we have

Nq(a,T) ~ cq2c2(1-a), so our sum over the zeros Px other than f3x is bounded by

2cx~ qc 2 + 2cx(log x)

1

1 -'7 (lc 2jx) 1 -"da

2

~

2 2 11 2cxlogx (q-~ex c) 4 1-'7/2 . 2 log(xq- c2) x

Therefore we have established the following PROPOSITION 18.5. Let c,c1,c2,c3 be the absolute constants from Principles 1, 2, 3. Then for x;? q4 c 2 we have

q) }

1 1 2 x { xf3 (log '1/J(x;q,a) = cp(q) 1- Xl(a)T + ecx-'11 + 0 -q-

(18.87)

where the term involving {31 does not exist if {31 does not, 1) is given by ( 18.85) or {18.86) respectively, 1e1 ~ 4 and the implied constant is absolute. In the first case of Proposition 18.5 we deduce THEOREM 18.6. Let c 1 , c2 and c be the absolute constants from Principle 1 and Principle 2 respectively. Suppose for any real character x1 (mod q) there is no real zero of L(s, xi) in

a> 1- cJ/2logq.

(18.88) 4

Then for x ;? q (18.89)

where 1e1

c2

we have

'1/J(x; q, a)= ~

cp~q) ( 1 + ecexp( -~ ~:!:) + oCo:q))

4, and the implied constant is absolute (effectively computable).

In the second case when {3 1 exists, we assume that x ;? qv where

(18.90) and we estimate as follows:

Inserting these estimates into (18.87) we deduce


THEOREM 18.7. Let c, c1, c2 , c3 be the absolute constants from Principles 1, 2 and 3. Suppose there exists a real zero /3 1 of L(s, x1) for a real character x1 (mod q) such that 81 = 1- {31 ~ ci/2logq. Then for x ) qv with v given by {18.90} we have

(18.91)

x

81 log q (

( 1 ))

'1/J(x;q,a)) r.p(q) ~ 1 + 0 y'q

where the implied constant is absolute. From Theorems 18.6 and 18.7 it follows that in any case we have (because 81logq » q-~) CoROLLARY

(18.92)

18.8. If q is sufficiently large and x ) qL with 4 4 log8c } L =max { 4c2, -log8c, - -- c1 c3 1 logc1 1

,

then (18.93)

X

'1/J(x; q, a)» r.p(q)y'q

where the implied constant is absolute. This result is a quantitative version of Linnik's theorem.
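Linnik's theorem bounds the least prime $p \equiv a \pmod q$ by a fixed power of $q$. For small moduli this can be observed directly. The following is a minimal brute-force sketch; the function names are hypothetical and trial division is chosen only for simplicity. It says nothing about the size of the exponent $L$ in Corollary 18.8.

```python
from math import gcd, log

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def least_prime_in_progression(q, a):
    """Smallest prime p with p = a (mod q); requires gcd(a, q) = 1."""
    if gcd(a, q) != 1:
        raise ValueError("a and q must be coprime")
    n = a % q
    if n == 0:  # only possible when q == 1
        n = q
    while not is_prime(n):
        n += q
    return n

def worst_case_exponent(q):
    """max over reduced classes a of log(least prime) / log(q), for q >= 2."""
    return max(log(least_prime_in_progression(q, a)) / log(q)
               for a in range(1, q) if gcd(a, q) == 1)
```

For example, `worst_case_exponent(7)` is attained at the class $a = 1$, whose least prime is 29.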

CHAPTER 19

THE GOLDBACH PROBLEM

19.1. Introduction.

Christian Goldbach (1690-1764), a member of the Petersburg Academy, in a letter of 1742 to Leonhard Euler posed the problem of proving that every number $N \geq 5$ can be represented as the sum of three primes. Euler answered with the equivalent hypothesis that every even number $N \geq 4$ is a sum of two primes. Inspired by many failures on these problems, Edmund Landau proposed (in 1912) the challenge of showing that every number $N > 1$ is a sum of at most $k$ primes with $k$ sufficiently large but fixed. Landau's problem was solved in 1930 by L. G. Shnirelman (see e.g. [Kh]). The work of Shnirelman opened a new direction in general additive number theory.

In 1937 I. M. Vinogradov [V6] succeeded in solving the original Goldbach problem for all odd $N$ which are sufficiently large. Before him, G. H. Hardy and J. E. Littlewood [HLl] made a serious attack by means of a then new circle method. Indeed, they succeeded conditionally under the Riemann Hypothesis for Dirichlet $L$-functions. Vinogradov followed their line of attack (with his own beautification of the circle method), and removed the use of GRH in the estimate of the exponential sum over primes

(19.1)  $\sum_{p \leq N} e(\alpha p),$

which is required by the circle method. His treatment of this particular sum (sieve and double sums methods) became a fundamental tool for estimating a large class of sums over primes (see Chapter 13).

In this chapter we derive Vinogradov's Three Primes Theorem by a mixture of his original and somewhat modern ideas. We replace the exponential sum over primes (19.1) by the exponential sum with the Möbius function for which we have a uniform bound (13.54) due to H. Davenport. In this way we minimize the role played by the circle method, and we do not appeal to the distribution of primes in residue classes (the use of these properties is accumulated in the proof of (13.54)). One can avoid completely the idea of the circle method by using only the elementary dispersion method (see the Rutgers Lecture Notes of 1994, Chapter 6).

None of the current methods promises any chance to solve the Goldbach-Euler problem for two primes. However, it was established independently by N. G. Tchudakov [Tchu], van der Corput [Cor3] and T. Estermann [Est] that almost all (in the sense of density) even numbers are representable as the sum of two primes. This follows by a suitable generalization of the Vinogradov result (one of the three primes can be taken from any sequence of integers which is relatively dense). H. L. Montgomery and R. C. Vaughan [MV3] showed that the number of exceptional even integers (those which are not representable as the sum of two primes) is quite small.

For even numbers $N$ we consider

(19.2)  $G_2(N) = \sum_{n_1 + n_2 = N} \Lambda(n_1) \Lambda(n_2),$

and for odd numbers $N$ we consider

(19.3)  $G_3(N) = \sum_{n_1 + n_2 + n_3 = N} \Lambda(n_1) \Lambda(n_2) \Lambda(n_3).$

Therefore

(19.4)  $G_3(N) = \sum_{n < N} \Lambda(n) G_2(N - n).$
Hence a good estimate of $G_2(N')$ for sufficiently many $N' < N$ will yield a correspondingly good estimate of $G_3(N)$. The heuristic Möbius Randomness Law described in Section 13.1 justifies the following stronger form of the Goldbach problem (which implies its solution for sufficiently large $N$):

CONJECTURE. For even numbers $N \geq 4$,

(19.5)  $G_2(N) = \mathfrak{S}_2(N) N + O(N (\log N)^{-A})$

where

(19.6)  $\mathfrak{S}_2(N) = C_2 \prod_{p \mid N,\ p > 2} \big(1 + (p - 2)^{-1}\big)$

and

(19.7)  $C_2 = 2 \prod_{p > 2} \big(1 - (p - 1)^{-2}\big).$

Here $A$ is any positive number and the implied constant in the error term depends only on $A$. We shall prove the conjecture for almost all even numbers.
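The quantities $G_2(N)$ and $G_3(N)$, and the relation (19.4) between them, are easy to compute for small $N$. The sketch below also evaluates a truncated singular series in the standard Hardy-Littlewood form $2 \prod_{p>2}(1 - (p-1)^{-2}) \prod_{p \mid N,\, p>2} (p-1)/(p-2)$, which is assumed here to be the series meant in (19.6) and (19.7). The function names and the prime cutoff are illustrative choices.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes; returns the primes <= limit (limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]

def von_mangoldt_table(limit):
    """lam[n] = log p if n is a power of the prime p, and 0 otherwise."""
    lam = [0.0] * (limit + 1)
    for p in primes_up_to(limit):
        pk = p
        while pk <= limit:
            lam[pk] = math.log(p)
            pk *= p
    return lam

def G2(N, lam):
    """Sum of Lambda(n1) * Lambda(n2) over n1 + n2 = N, as in (19.2)."""
    return sum(lam[n] * lam[N - n] for n in range(1, N))

def G3(N, lam):
    """Sum of Lambda(n) * G2(N - n) over n < N, as in (19.4)."""
    return sum(lam[n] * G2(N - n, lam) for n in range(1, N))

def singular_series_2(N, prime_cutoff=10_000):
    """Truncated Hardy-Littlewood singular series (assumed form); 0 for odd N."""
    if N % 2:
        return 0.0
    value = 2.0
    for p in primes_up_to(prime_cutoff):
        if p > 2:
            value *= 1.0 - 1.0 / (p - 1) ** 2
    for p in primes_up_to(N):
        if p > 2 and N % p == 0:
            value *= (p - 1) / (p - 2)
    return value
```

Comparing `G2(N, lam)` with `singular_series_2(N) * N` for even $N$ of moderate size gives a feeling for the conjecture, although the error term decays only like a power of $\log N$, so agreement at small $N$ is rough.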

THEOREM 19.1. Let $A, B, X \geq 4$. The number of even integers $N$ with $4 \leq N \leq X$ for which (19.8) does not exceed $CX(\log X)^{-A}$ where $C$ is a constant which depends only on $A, B$.

The celebrated result of Vinogradov for three primes follows immediately by (19.4) and Theorem 19.1.

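THEOREM 19.2 (VINOGRADOV). For odd numbers $N \geq 7$ we have (19.9) where $\mathfrak{S}_3(N) > 0$ is given by (19.10) and $A$ is any positive number, the implied constant depending on $A$.

19.2. Incomplete $\Lambda$-functions.

Given $z \geq 2$ we split $\Lambda(n)$ as follows:

(19.11)  $\Lambda(n) = -\sum_{m \mid n} \mu(m) \log m = \Lambda^\sharp(n) + \Lambda^\flat(n)$

where $\Lambda^\sharp(n)$ and $\Lambda^\flat(n)$ are the partial sums over the divisors $m \leq z$ and $m > z$ respectively. Then we write

(19.12)  $G_2(N) = G^{\sharp\sharp}(N) + 2 G^{\sharp\flat}(N) + G^{\flat\flat}(N)$

where

$G^{\sharp\sharp}(N) = \sum_{n_1 + n_2 = N} \Lambda^\sharp(n_1) \Lambda^\sharp(n_2)$

and $G^{\sharp\flat}(N)$, $G^{\flat\flat}(N)$ are defined similarly.

LEMMA 19.3. Let $N$ be an even number $\geq 4$ and $z \geq 2$. We have (19.13) for any $A \geq 0$, the implied constant depending only on $A$.

PROOF.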

We have 1.

The equation f\m 1 (m 1 ,m2)IN and

+ f.2m2

N is equivalent with the congruence conditions

f.mi(m 1 ,m2)- 1

=N(m1,m2)-

1

(mod m2(m 1 ,m2)- 1 )

with 1 ~ f. < Nm1 1. Therefore the number of solutions is N[m 1, m2]- 1 + 0(1) giving where ®2(N;z)=

l

L m1,m2(z

(mt,m2)IN


It remains to estimate ®2(N; z). We write

m1,m2(z

(mt,m 2 )=l (mtm,,d)=l

=

L

Jl~d)

diN

For cd

Jl~~d) (

L c(z/d

L

ll;:l

logcdm

t

m(z/cd (m,cd)=l

> z! we estimate trivially by

L L diN

c- 2 r

1

(logz) 4 ~ 2T(N)z-!(logz) 4 .

1

cd>z2

If cd ~ z! we derive from the Prime Number Theorem that "' ~

Jl(m) cd A ----:;;;:-logcdm= cp(cd){1+0(T(cd)(logz)- )}.

m~z/cd

(m,cd)=l

Hence

®2(N;z) =

L L

Jl(d)Jl(cd)dcp- 2(cd) + O((logz) 4 -A + T(N)z-!(logz) 4 ).

diN cd(z!

Now extending the sum over cd ~ z! to the infinite series (as we can with the error term which is already present) we obtain 2

(19.14)

"'Jl (d)d

"'

'P

(c,d)=l

~ ----z---(d) ~ din

Jl(c) 'P

2(c) = (52(n). 0

This completes the proof of Lemma 19.3.
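The splitting (19.11) of $\Lambda(n)$ into the partial sums over divisors $m \leq z$ and $m > z$ can be verified directly for small $n$. The sketch below is a brute-force check; the names `incomplete_lambda`, `mobius` and `von_mangoldt` are hypothetical helpers written for this illustration.

```python
import math

def mobius(n):
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a square divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)

def incomplete_lambda(n, z):
    """Return the two partial sums of -mu(m) log m over m | n: (m <= z, m > z)."""
    small = large = 0.0
    for m in range(1, n + 1):
        if n % m == 0:
            term = -mobius(m) * math.log(m)
            if m <= z:
                small += term
            else:
                large += term
    return small, large
```

For instance, with $n = 12$ and $z = 3$ the part over $m \leq z$ equals $\log 6$ and the part over $m > z$ equals $-\log 6$, so the two cancel and give $\Lambda(12) = 0$.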

Lemma 19.3 reveals that the main contribution to the binary Goldbach-Euler equation comes from the incomplete function $\Lambda^\sharp(n)$. In other words, the complementary function $\Lambda^\flat(n)$ gives considerably less. This is due to the sign change of the Möbius function involved in $\Lambda^\flat(n)$ which ranges over a large segment. However, we do not have yet rigorous arguments to justify this heuristic. But it is a relatively simple matter in ternary additive problems.

19.3. A ternary additive problem with $\Lambda^\flat$.

In this section we give a bound for a general sum

$T^\flat(N) = \sum_{\ell + m + n = N} u_\ell v_m \Lambda^\flat(n)$

where $u_\ell$ and $v_m$ are arbitrary complex numbers. To this end we need an estimate for the exponential sum

(19.15)  $S^\flat(\alpha) = \sum_{n \leq N} \Lambda^\flat(n) e(\alpha n)$

which is uniform in $\alpha$. By Theorem 13.10 we derive by partial summation that for any $\alpha \in \mathbb{R}$ and $x \geq 2$,

(19.16)  $\sum_{m \leq x} \mu(m) (\log m) e(\alpha m) \ll x (\log x)^{-A}$

with any $A \geq 0$, the implied constant depending only on $A$. Since

$|S^\flat(\alpha)| \leq \sum_{\ell \leq N/z} \Big| \sum_{z < m \leq N/\ell} \mu(m) (\log m) e(\alpha \ell m) \Big|$

we get by (19.16) that

(19.17)  $S^\flat(\alpha) \ll N (\log N) (\log z)^{-A}.$

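REMARK. Any sequence of numbers in place of $\Lambda^\flat(n)$ could be used provided the associated exponential sum (19.15) satisfies $S^\flat(\alpha) \ll N (\log N)^{-A}$ for any $A > 0$.

Now we are ready to prove the following

LEMMA 19.4. For any complex numbers $u_\ell$ and $v_m$ we have

(19.18)  $\sum_{\ell + m + n = N} u_\ell v_m \Lambda^\flat(n) \ll \|u\| \|v\| N (\log N) (\log z)^{-A}$

with any $A \geq 0$, the implied constant depending only on $A$.

PROOF. We have

$T^\flat(N) = \int_0^1 \Big(\sum_{\ell \leq N} u_\ell e(\alpha \ell)\Big)\Big(\sum_{m \leq N} v_m e(\alpha m)\Big) S^\flat(\alpha) e(-\alpha N) \, d\alpha.$

Hence (19.18) follows by (19.17), the Cauchy-Schwarz inequality and the Parseval formula

$\int_0^1 \Big| \sum_{\ell \leq N} u_\ell e(\alpha \ell) \Big|^2 d\alpha = \sum_{\ell \leq N} |u_\ell|^2 = \|u\|^2.$

19.4. Proof of Vinogradov's three primes theorem.

Let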

CN

be any complex numbers for N

~X,

N odd. We write

N(X

N(X

Then we use Lemma 19.3 with z =X~ getting

L

cNG~u(N)

N(X

=

L

cNf82(N)N + O(llciiX~ (logX)-A).

N(X

For the sums of cNGU'(N) and cNG;'(N) we apply Lemma 19.4 getting the same bounds as the error term above. Hence (19.19) N(X

N(X


For the special coefficients $c_N = G_2(N) - \mathfrak{S}_2(N) N$ this yields

PROPOSITION 19.5. For any $X \geq 3$ we have

(19.20)  $\sum_{N \leq X,\ N\ \mathrm{even}} \big(G_2(N) - \mathfrak{S}_2(N) N\big)^2 \ll X^3 (\log X)^{-2A}$

where $A$ is any positive number and the implied constant depends only on $A$.

Clearly (19.20) implies Theorem 19.1. For the proof of Theorem 19.2 we apply (19.19) to (19.4) getting $G_3(N) =$

L A(n)!8 2(N- n)(N- n) + O(N 3(1ogN)-A). n
By (19.14) the main term is equal to 2 " L...' Jl(C)J1 (d)d (c,d)=! cp2(c)cp2(d)

" L...'

A(n)(N- n).

n
The inner sum is equal to 2

if ( d, N) (19.21)

=

N2 cp(d)

+ O(N

2 -A (logN) )

1 by the Prime Number Theorem. Since

- L 2

(d,cN)=l

this completes the proof of (19.9).

Jl(C)J12(d)d = !83(N) cp 2(c)cp 3(d)

CHAPTER 20

THE CIRCLE METHOD

20.1. The partition number.

The circle method originated in the paper of 1918 by G. H. Hardy and S. Ramanujan [HR] on partitions. Any decomposition $n = n_1 + n_2 + n_3 + \cdots$ into non-negative integers without regard to their order is called a partition of $n$. Let $p(n)$ denote the number of such (unrestricted) partitions. This is also equal to the number of ordered solutions $(x_1, x_2, \ldots)$ in non-negative integers to the equation $n = x_1 + 2x_2 + 3x_3 + \cdots$. Algebraically, it is also the number of conjugacy classes in the symmetric group $\mathfrak{S}_n$.

L. Euler was fascinated by the remarkable properties of $p(n)$. He introduced the generating power series

$F(z) = \sum_{n=0}^{\infty} p(n) z^n = \prod_{m=1}^{\infty} (1 - z^m)^{-1}$

and showed that

$\prod_{m=1}^{\infty} (1 - z^m) = \sum_{n=-\infty}^{\infty} (-1)^n z^{n(3n-1)/2}$

which is known as Euler's Pentagonal Numbers Theorem. Playing with other power series Euler derived many amazing formulas, such as

$p(n) = \frac{1}{n} \sum_{0 < h \leq n} \sigma(h) p(n - h)$

where $\sigma(h)$ is the sum of divisors of $h$. Ramanujan noticed, and later gave proofs of the congruence properties

$p(n) \equiv 0 \pmod 5$ if $n \equiv 4 \pmod 5$,
$p(n) \equiv 0 \pmod 7$ if $n \equiv 5 \pmod 7$,
$p(n) \equiv 0 \pmod{11}$ if $n \equiv 6 \pmod{11}$.
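Euler's Pentagonal Numbers Theorem turns into a fast recurrence for $p(n)$, which in turn makes it easy to test the divisor-sum formula and Ramanujan's congruences numerically. The following is a minimal sketch; the function names are illustrative, and the recurrence $p(n) = \sum_{k \geq 1} (-1)^{k+1} \big(p(n - k(3k-1)/2) + p(n - k(3k+1)/2)\big)$ is the standard consequence of the pentagonal identity rather than a formula quoted from the text.

```python
def partition_numbers(limit):
    """Return [p(0), ..., p(limit)] using the pentagonal number recurrence."""
    p = [0] * (limit + 1)
    p[0] = 1
    for n in range(1, limit + 1):
        total = 0
        k = 1
        while True:
            g1 = k * (3 * k - 1) // 2   # generalized pentagonal numbers
            if g1 > n:
                break
            sign = 1 if k % 2 == 1 else -1
            total += sign * p[n - g1]
            g2 = k * (3 * k + 1) // 2
            if g2 <= n:
                total += sign * p[n - g2]
            k += 1
        p[n] = total
    return p

def sigma(h):
    """Sum of the positive divisors of h."""
    return sum(d for d in range(1, h + 1) if h % d == 0)

def divisor_sum_identity_holds(n, p):
    """Check n * p(n) = sum over 0 < h <= n of sigma(h) * p(n - h)."""
    return n * p[n] == sum(sigma(h) * p[n - h] for h in range(1, n + 1))
```

A typical use is `p = partition_numbers(200)` followed by checks such as `p[4] % 5 == 0` or `divisor_sum_identity_holds(30, p)`.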

He was also fascinated by analytic expressions for $p(n)$ in terms of modular forms. Changing the variable $z$ into $e(z)$ one transforms the power series $F(z)$ into the Fourier series

(20.1)  $f(z) = \sum_{n=0}^{\infty} p(n) e(nz).$

We write $f(z) = e(z/24)/\eta(z)$ where

$\eta(z) = e\Big(\frac{z}{24}\Big) \prod_{m=1}^{\infty} (1 - e(mz))$

is the Dedekind eta function. This is a modular form on the upper half-plane $\mathbb{H}$ with respect to the modular group $SL_2(\mathbb{Z})$ of weight $\frac12$ and a suitable multiplier system. Precisely, Dedekind proved that

(20.2)  $\eta(\gamma z) = \vartheta(\gamma) (cz + d)^{1/2} \eta(z)$

for any z E H and 1 = ( ~!) E SL 2 (Z), where the multiplier 8(1) is defined by the properties B(-1) = e(DB('y), B(e ~)) =

and

b)) e (a+d 1 ) 24 - 8c - 2s(d, c) '

a B (( c if c

e(f4)

d

=

> 0. Here s(d,c) is the Dedekind sum

(recall that $\psi(x) = x - [x] - \frac12$). Note that the cusp $\infty$ is non-singular with respect to the Dedekind multiplier system (see [I4] for definition), so the Fourier expansion of $\eta(z)$ has no constant term; it is a lacunary series

$\eta(z) = \sum_{n=1}^{\infty} a_n e\Big(\frac{n^2 z}{24}\Big)$

with $a_n = 1$ if $n \equiv \pm 1 \pmod{12}$, $a_n = -1$ if $n \equiv \pm 5 \pmod{12}$ and $a_n = 0$ otherwise. However, the Fourier coefficients $p(n)$ of $f(z) = e(z/24)/\eta(z)$ are quite large. Note the trivial bound $p(n) \leq 2^n$ which follows from

$\sum_{n=0}^{\infty} p(n) 2^{-n} = \prod_{m=1}^{\infty} (1 - 2^{-m})^{-1}.$

One can derive a better bound from the modular equation

(see (20.2) for 1 = ( 1 -I)). In particular, for z = i/y withy::;:: 1 this yields

p(n) Hence by taking y =

« y-~ exp(21r(;4 + ~)).

1r.jn16 one gets

(20.3) where $B = 2\pi/\sqrt{6}$. Hardy and Ramanujan proved the asymptotic formula $p(n) \sim (4\sqrt{3}\, n)^{-1} e^{B\sqrt{n}}$.

They also gave an asymptotic expansion with an estimate for the error term.


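Refining the method of Hardy-Ramanujan, H. Rademacher [R] established in 1937 the exact formula

THEOREM 20.1. For $n \geq 1$ we have

(20.4)  $p(n) = \frac{1}{\pi \sqrt{2}} \sum_{c=1}^{\infty} c^{1/2} A_c(n) \frac{d}{dn} \Big(\frac{1}{\lambda_n} \sinh\Big(\frac{B}{c} \lambda_n\Big)\Big)$

where $B = 2\pi/\sqrt{6}$, $\lambda_n = (n - \frac{1}{24})^{1/2}$ and

(20.5)  $A_c(n) = \sum^{*}_{a \,(\mathrm{mod}\ c)} e\Big(\frac12 s(a, c) - \frac{an}{c}\Big).$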

Notice that the Rademacher series (20.4) converges absolutely by virtue of the trivial estimates $|A_c(n)| \leq c$ and

$\frac{d}{dn} \Big(\frac{1}{\lambda_n} \sinh\Big(\frac{B}{c} \lambda_n\Big)\Big) \ll c^{-3} \exp\Big(\frac{B}{c} \lambda_n\Big).$

Taking only the first term in (20.4) one already gets a very good approximation

(20.6)
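The rapid convergence of the Rademacher series can be observed numerically: a modest number of terms, rounded to the nearest integer, already reproduces $p(n)$ for small $n$. The sketch below assumes the classical definition of the Dedekind sum, $s(h, k) = \sum_{r=1}^{k-1} \frac{r}{k} \big(\frac{hr}{k} - [\frac{hr}{k}] - \frac12\big)$, and uses the real part of each exponential since $A_c(n)$ is real. Function names and the number of terms are illustrative choices.

```python
import math
from fractions import Fraction

def dedekind_sum(h, k):
    """Classical Dedekind sum s(h, k) for gcd(h, k) = 1, computed exactly."""
    total = Fraction(0)
    for r in range(1, k):
        total += Fraction(r, k) * (Fraction((h * r) % k, k) - Fraction(1, 2))
    return total

def A(c, n):
    """A_c(n): sum over reduced residues a mod c of e(s(a, c)/2 - a*n/c)."""
    total = 0.0
    for a in range(c):
        if math.gcd(a, c) != 1:
            continue
        phase = float(dedekind_sum(a, c)) / 2.0 - ((a * n) % c) / c
        total += math.cos(2.0 * math.pi * phase)  # imaginary parts cancel in pairs
    return total

def rademacher_partial_sum(n, terms):
    """Sum of the first `terms` terms of the series for p(n)."""
    B = 2.0 * math.pi / math.sqrt(6.0)
    lam = math.sqrt(n - 1.0 / 24.0)
    total = 0.0
    for c in range(1, terms + 1):
        a = B / c
        # d/dn [ sinh(a*lam) / lam ] with lam = sqrt(n - 1/24), d(lam)/dn = 1/(2*lam)
        derivative = (a * lam * math.cosh(a * lam) - math.sinh(a * lam)) / (2.0 * lam ** 3)
        total += math.sqrt(c) * A(c, n) * derivative
    return total / (math.pi * math.sqrt(2.0))
```

For example, `round(rademacher_partial_sum(10, 60))` recovers $p(10) = 42$. Double-precision arithmetic limits this sketch to moderate $n$.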

The starting point of the Hardy-Ramanujan method is the Cauchy integral for the coefficients of a power series. In our case

(20.7)  $p(n) = \frac{1}{2\pi i} \int_{|z| = r} F(z) z^{-n-1} \, dz$

(this integration over a circle gave the method its name). Changing $z$ into $e(z)$ we get by periodicity

(20.8)  $p(n) = \int_{w}^{w+1} f(z) e(-nz) \, dz$

where $w$ is any point in the upper half-plane $\mathbb{H}$ and the path of integration is any continuous curve in $\mathbb{H}$ from $w$ to $w + 1$. We shall choose the path of integration as a continuous chain of certain circular arcs on which $f(z)$ can be analyzed quite precisely by exploiting its modularity. To construct the arcs in question we need some well-known properties of the Farey sequence or series (see e.g. [HW]).

Let $C$ be a positive integer. The collection of all reduced fractions $\frac{a}{c}$ with $1 \leq c \leq C$ and $(a, c) = 1$ arranged in the increasing sequence

$\cdots < \frac{a'}{c'} < \frac{a}{c} < \frac{a''}{c''} < \cdots$

is called the Farey sequence of order $C$. Given a point $\frac{a}{c}$ in that sequence let $\frac{a'}{c'}$ and $\frac{a''}{c''}$ denote the adjacent points. The denominators $c', c''$ are determined by the conditions

(20.9)  $C - c < c' \leq C,\ ac' \equiv 1 \pmod c; \qquad C - c < c'' \leq C,\ ac'' \equiv -1 \pmod c$

and the numerators are $a' = (ac' - 1)c^{-1}$, $a'' = (ac'' + 1)c^{-1}$, so the points are

(20.10)  $\frac{a'}{c'} = \frac{a}{c} - \frac{1}{cc'}, \qquad \frac{a''}{c''} = \frac{a}{c} + \frac{1}{cc''}.$

Between $a/c$ and $a'/c'$, $a''/c''$ there are the mediants $(a' + a)/(c' + c)$, $(a + a'')/(c + c'')$ which are also reduced fractions but do not belong to the Farey sequence of order $C$. They have denominators in the segment $(C, C + c]$ and satisfy

$\frac{a' + a}{c' + c} = \frac{a}{c} - \frac{1}{c(c + c')} = \frac{a'}{c'} + \frac{1}{c'(c + c')},$

$\frac{a'' + a}{c'' + c} = \frac{a}{c} + \frac{1}{c(c + c'')} = \frac{a''}{c''} - \frac{1}{c''(c + c'')}.$

To each reduced fraction $\frac{a}{c}$ with $c \geq 1$, $(a, c) = 1$ we associate the circle

(20.11)  $\Big| z - \frac{a}{c} - \frac{i}{2c^2} \Big| = \frac{1}{2c^2}$

which is denoted by $C(a/c)$ and is called a Ford circle; it is a circle in $\mathbb{H}$ tangent to the real line at $z = \frac{a}{c}$. Two different circles do not intersect; they are tangent if and only if they belong to consecutive fractions in the same Farey sequence. All the circles $C(a/c)$ are obtained as the images of the horizontal line $\mathrm{Im}\, z = 1$ under the action of $SL_2(\mathbb{Z})$. Indeed writing

(20.12)  $\gamma z = \frac{az + b}{cz + d} = \frac{a}{c} - \frac{1}{c(cz + d)}$

for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with $c \geq 1$ it follows that

$\gamma z = \frac{a}{c} + \frac{i}{2c^2} - \frac{i}{2c^2} \cdot \frac{\bar{z} + d/c}{z + d/c}$

if $\mathrm{Im}(z) = 1$, whence it is clear that $\gamma z$ satisfies (20.11).
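The facts about Farey neighbours and Ford circles used above can be checked with exact rational arithmetic: the neighbour conditions in (20.9), the tangency of circles attached to consecutive fractions, and the fact that no two circles overlap. The following is a minimal sketch restricted to fractions in $[0, 1]$; the helper names are hypothetical.

```python
from fractions import Fraction

def farey_sequence(order):
    """Reduced fractions a/c in [0, 1] with 1 <= c <= order, in increasing order."""
    return sorted({Fraction(a, c) for c in range(1, order + 1) for a in range(c + 1)})

def ford_circle(frac):
    """Centre (x, y) and radius of the Ford circle C(a/c): (a/c, 1/(2c^2)), 1/(2c^2)."""
    radius = Fraction(1, 2 * frac.denominator ** 2)
    return frac, radius, radius

def squared_gap(f1, f2):
    """dist(centres)^2 - (r1 + r2)^2: zero if tangent, positive if disjoint."""
    x1, y1, r1 = ford_circle(f1)
    x2, y2, r2 = ford_circle(f2)
    return (x1 - x2) ** 2 + (y1 - y2) ** 2 - (r1 + r2) ** 2

def neighbours_satisfy_20_9(order):
    """Check the size and congruence conditions on c', c'' for interior fractions."""
    seq = farey_sequence(order)
    for left, mid, right in zip(seq, seq[1:], seq[2:]):
        a, c = mid.numerator, mid.denominator
        c1, c2 = left.denominator, right.denominator
        if not (order - c < c1 <= order and order - c < c2 <= order):
            return False
        if (a * c1 - 1) % c != 0 or (a * c2 + 1) % c != 0:
            return False
    return True
```

Note that two fractions which are not adjacent in the sequence of order $C$ may still have tangent circles if they are adjacent in a Farey sequence of lower order, as happens for $0/1$ and $1/1$.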

Now let $C \geq 1$ be given. Suppose $a'/c' < a/c < a''/c''$ are three consecutive points from the Farey sequence of order $C$. Then $C(a/c)$ touches $C(a'/c')$ and $C(a''/c'')$ at the points

(20.13)  $\frac{a}{c} + \zeta'_{ac}$ with $\zeta'_{ac} = \frac{-1}{c(c' + ic)},$

(20.14)  $\frac{a}{c} + \zeta''_{ac}$ with $\zeta''_{ac} = \frac{1}{c(c'' - ic)}.$

Indeed, we see that the point (20.13) is an appropriate mean of the centers of C( afc) and C(a'fd) a c

1 c(c' + ic)

by (20.10), so this is the tangency point of the circles $C(a/c)$ and $C(a'/c')$. Similarly one verifies the point (20.14). On each circle $C(a/c)$ with $1 \leq c \leq C$ we choose the upper arc $\gamma_{ac}$ which connects the tangency points (20.13) and (20.14) with the circles of the adjacent fractions. The chain of such arcs $\gamma_{ac}$ gives us an infinite continuous curve which is periodic of period one. We choose as the path of integration in (20.8) the fragment of this curve which is contained in a vertical strip of width one beginning in a tangency point. We get

(20.15)  $p(n) = \sum_{c \leq C} \ \sum^{*}_{a \,(\mathrm{mod}\ c)} H_{ac}(n)$

where

$H_{ac}(n) = \int_{\gamma_{ac}} f(z) e(-nz) \, dz$

by virtue of the periodicity of $f(z)$ and $e(-nz)$. We shall evaluate each arc integral $H_{ac}(n)$ separately by applying a specific transformation $\gamma \in SL_2(\mathbb{Z})$. First, changing the variable $z$ into $a/c + iz/c^2$ (translation, magnification, rotation) we transform (20.11) onto the circle $|z - \frac12| = \frac12$ getting

$H_{ac}(n) = \frac{i}{c^2} e\Big(\frac{-an}{c}\Big) \int_{z'_{ac}}^{z''_{ac}} f\Big(\frac{a}{c} + \frac{iz}{c^2}\Big) e\Big(\frac{-inz}{c^2}\Big) \, dz,$

where $z$ is on the right arc of the circle between the points

(20.16)  $z'_{ac} = -ic^2 \zeta'_{ac} = \frac{ic}{c' + ic},$

(20.17)  $z''_{ac} = -ic^2 \zeta''_{ac} = \frac{-ic}{c'' - ic}.$


Given c ~ 1 and (a, c) = 1 the modular relation (20.2) with z replaced by -dfc1/cz translates into

f ( -a +-z) c

c

=

(z)~ 1 (1- +z )) f (-d 1) ~ e ( -s(d,c) +--z 24c z c cz 1 2

where ad= 1(mod c). Changing z into izfc we write this equation as

a iz) =c _l2e(-s(d,c))Ec(z)f 1 ( d i) f ( -+--+2 2 c

c

c

z

where Ec(z) = z~ exp(f2(~- Z,.). Applying this to the integral Hac(n) we get

Hac(n) = ic-§e(~s(a,c)- anjc)Iac(n) where

Iac(n) =

(~" J( -~ + i )Ec(z)e2nnzjc2 dz.

lz~c

C

Z

(More precise but cumbersome notation would be Ia,c(n)). To evaluate Iac(n) we replace f by its constant term in the Fourier expansion (20.1) which is p(O) = 1, and then we extend the arc of integration to the whole circle. This will give us the main term

Changing z into 1/z we transform the circle lz- ~~ = ~ into the vertical line Re (z) = 1 getting

Ic(n) =

J

I+ioo

exp

1-ioo

('l-rZ + 227r ( n - -1 )) z-2dz. 5

12

C Z

24

Therefore we have

(20.18)

Ic(n) = 2i7r

c2~J ~I~ C~>-n)

where Iv(Y) is the Bessel function (see [GR], 8.4-8.5). (This unfortunate coincidence in notation will end shortly!)



It remains to estimate the difference Iac(n)- Ic(n) = I;c(n)- I~c(n)- I~c(n), say, where

J~c(n) =fa<" Ec(z)e27rnz/c2 dz, I:c(n) = {0 Ec(z)e2"nzjc2 dz. lz'c:c For estimation of I~c(n) we move the path of integration from the arc to the chord of the arc because on this chord i/z has large height so f(-djc + ijz)- 1 is small. Indeed on this chord we have lzl ~ min(lz~cl, lz:cl) ~ 2cC- 1 and Re z ~ max(Re z~c' Re z:cl ~ 2C- 2 by (20.16), (20.19) and (20.9). The length of the chord is bounded by lz~cl + lz:cl ~ 4cC- 1 . Moreover, on the chord we have

, !;;: ~ p(m)e (--;;dm) exp (--;27f ( m- 1 ) + ---;!2 27fz ( n- 1 )) 24 24

= z2

1

«

(3) 1

exp

c~~)

because Re(z- 1 ) = 1, Re(z) ~ c2C- 2 and p(m) ~2m. Therefore (20.19) Next we estimate J~c(n). The length of the arc from 0 to z~c on the circle lz- = is bounded by ~ lz~cl ~ 1rcc- 1 and for z on that arc we have lzl ~ cc- 1 . Moreover, we derive (as we did when estimating I;c(n)) that

!I !

E~(z)e27rnzfc2 =z~expC;z + 2:ZZ(n- ~)) «

(3) 1

exp

c~~).

Therefore /~ 0 (n) satisfies the bound (20.19). Similarly we show that 1:c(n) satisfies the bound (20.19). Collecting these bounds and the formula (20.18) we get Hac(n) =

2 ; e( ~s(a, c)-

a;) (12.\n)-~ I~ C~>-n) + O(c- C- ~ exp(27rnC- 2)).

Inserting this into (20.15) we get

Letting C tend to infinity we obtain (20.20)

1



Using the formulas

$I_{1/2}(y) = \Big(\frac{2}{\pi y}\Big)^{1/2} \sinh y, \qquad I_{3/2}(y) = y^{1/2} \frac{d}{dy} \big(y^{-1/2} I_{1/2}(y)\big)$

we conclude (20.4).

REMARKS. Since $\eta(z)$ is essentially a theta function, the multiplier system $\vartheta(\gamma)$ can be expressed in terms of Gauss sums rather than the Dedekind sums. Consequently the coefficients $A_c(n)$ in the Rademacher formula (20.4) can also be expressed by simpler functions. Indeed A. Selberg showed (using directly the Euler Pentagonal Numbers Theorem) that

L

(20.21)

£(mod 2c) £(3£-l)=-2n(mod 2£)

Hence IAc(n)l ~ 2T(c)cL

20.2. Diophantine equations.

After his historical paper with Ramanujan on the partition function Hardy continued working on the circle method jointly with J. E. Littlewood. From 1920 to 1928 they published a series of eight papers under the common title, "Some problems of 'Partitio Numerorum'". They realized that the modularity of the generating Fourier series is not an essential property for the method, but rather good estimates for the relevant exponential sums are sufficient. This observation opened numerous new applications, the most popular one being for the Waring problem. This concerns the solvability of the equation

(20.22)  $x_1^k + \cdots + x_n^k = N$

in positive integers $x_1, \ldots, x_n$, where $k \geq 1$ and $N \geq 1$ are given numbers. In 1770 Edward Waring asserted that (20.22) does have a solution provided $n$ is sufficiently large in terms of $k$ alone. This was proved in 1909 by D. Hilbert [H]. By the circle method Hardy and Littlewood obtained an asymptotic formula for the number of representations (20.22). In this section we illustrate the method by showing a somewhat later result due to L.-K. Hua [H2].

THEOREM 20.2. Let $n > 2^k$. Then the number $\nu(N)$ of representations of $N$ as the sum of $n$ positive $k$-th powers satisfies (20.23)

Here $\mathfrak{S}(N)$ is given as the sum of infinite series of the multiplicative function

(20.24)  $c(q, N) = \sum^{*}_{a \,(\mathrm{mod}\ q)} \Big(\frac{1}{q} \sum_{x \,(\mathrm{mod}\ q)} e\Big(\frac{a x^k}{q}\Big)\Big)^n e\Big(\frac{-aN}{q}\Big).$

The series

(20.25)  $\mathfrak{S}(N) = \sum_{q=1}^{\infty} c(q, N)$

converges absolutely and satisfies $c_1 \leq \mathfrak{S}(N) \leq c_2$, where $c_1, c_2$ are positive numbers depending on $k$ and $n$, but not on $N$; also $\delta > 0$ and the implied constant depend at most on $k$, $n$.
We begin by applying the method to a general equation

$$f(x) = 0 \tag{20.26}$$

where $f$ is a polynomial of degree $k \geq 1$ with integer coefficients. We seek integral solutions $x = (x_1,\dots,x_n)$ restricted to a box $B \subset [-X,X]^n$ with $X \geq 1$. Thus

$$f(x) \ll X^k \quad\text{if } x \in B. \tag{20.27}$$

The method works if the number of variables is much larger than the degree and the values of $f(x)$ vary considerably (so $f(x)$ cannot be constant in many variables after any unimodular transformation, see our analytic condition (20.35)). Therefore, by a statistical reasoning we anticipate that the number $\nu_f(B)$ of solutions to (20.26) in $x \in B\cap\mathbb{Z}^n$ should satisfy

$$\nu_f(B) \ll X^{n-k}.$$

Exactly, the number of such solutions is given by the integral (Vinogradov's variation on the Hardy-Littlewood power series setting)

$$\nu_f(B) = \int_0^1\Big(\sum_{x\in B\cap\mathbb{Z}^n} e(\alpha f(x))\Big)\,d\alpha. \tag{20.28}$$
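The identity (20.28) rests on the orthogonality relation $\int_0^1 e(\alpha m)\,d\alpha = 1$ if $m = 0$ and $0$ otherwise. A minimal sketch follows, under the assumption that the integral may be replaced by an exact discrete average over $\alpha = j/M$ with $M > \max_{x\in B}|f(x)|$ (legitimate because $f$ takes integer values). The polynomial $x_1^2 + x_2^2 - x_3^2$ and the box $[1,X]^3$ are chosen purely for illustration.

```python
import cmath
from itertools import product

def count_direct(f, X, n):
    """Count x in [1, X]^n with f(x) = 0 by brute force."""
    return sum(1 for x in product(range(1, X + 1), repeat=n) if f(x) == 0)

def count_by_circle(f, X, n):
    """Count the same solutions by averaging the exponential sum S_f(alpha).

    The integral over [0, 1] is replaced by the average over alpha = j/M,
    which is exact once M exceeds |f(x)| on the box.
    """
    values = [f(x) for x in product(range(1, X + 1), repeat=n)]
    M = max(abs(v) for v in values) + 1
    total = 0j
    for j in range(M):
        alpha = j / M
        total += sum(cmath.exp(2j * cmath.pi * alpha * v) for v in values)
    return round((total / M).real)

def f(x):
    """Illustrative polynomial: Pythagorean triples x1^2 + x2^2 = x3^2."""
    return x[0] ** 2 + x[1] ** 2 - x[2] ** 2

if __name__ == "__main__":
    print(count_direct(f, 10, 3), count_by_circle(f, 10, 3))
```

This does no analytic work; it only confirms that the exponential sum carries the exact count, which is the starting point for splitting the range of $\alpha$ into arcs.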

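As in the Hardy-Ramanujan work on the partition number we expect here that the main contribution comes from the integration in neighborhoods of rational points with small denominators. According to the Dirichlet approximation theorem every $\alpha$ satisfies

$$\Big|\alpha - \frac aq\Big| \leq \frac{1}{qP} \quad\text{with } 1\leq q\leq P,\ (a,q)=1, \tag{20.29}$$

where $P$ is a fixed positive number. Let $P \geq 2Q \geq 2$. If $\alpha$ satisfies (20.29) with $q \leq Q$, we say $\alpha$ belongs to the "major arc"

$$\mathfrak{M} = \bigcup_{q\leq Q}\ \bigcup_{\substack{a\,(\mathrm{mod}\ q)\\ (a,q)=1}}\Big\{\alpha : \Big|\alpha - \frac aq\Big| \leq \frac{1}{qP}\Big\}. \tag{20.30}$$

Notice that the intervals of $\mathfrak{M}$ centered at different points $a/q$ do not overlap because

$$\Big|\frac aq - \frac{a_1}{q_1}\Big| \geq \frac{1}{qq_1} \geq \frac{1}{qQ} \geq \frac{2}{qP}$$

if $a/q \neq a_1/q_1$ with $q_1 \leq Q$. The remaining part of the segment, $\mathfrak{m} = [0,1]\setminus\mathfrak{M}$, is called the "minor arc".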
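The non-overlapping property can be checked in exact rational arithmetic. The sketch below builds the intervals $|\alpha - a/q| \leq 1/(qP)$ for $q \leq Q$ on the segment $[0,1]$ (so both endpoints $0/1$ and $1/1$ appear as centers) and verifies that consecutive intervals do not overlap when $P \geq 2Q$. The helper names are illustrative, not notation from the text.

```python
from fractions import Fraction
from math import gcd

def major_arcs(Q, P):
    """Intervals [a/q - 1/(qP), a/q + 1/(qP)] for q <= Q, (a, q) = 1, 0 <= a <= q."""
    arcs = []
    for q in range(1, Q + 1):
        for a in range(0, q + 1):
            if gcd(a, q) == 1:
                center = Fraction(a, q)
                radius = Fraction(1, q) / P
                arcs.append((center - radius, center + radius, a, q))
    return sorted(arcs)

def arcs_are_disjoint(arcs):
    """True if consecutive intervals share at most an endpoint."""
    return all(arcs[i][1] <= arcs[i + 1][0] for i in range(len(arcs) - 1))

def major_arc_measure(arcs):
    """Total length of the part of the major arc lying in [0, 1]."""
    return sum(min(hi, Fraction(1)) - max(lo, Fraction(0)) for lo, hi, _, _ in arcs)

if __name__ == "__main__":
    arcs = major_arcs(Q=5, P=10)
    print(len(arcs), arcs_are_disjoint(arcs), major_arc_measure(arcs))
```

Since the intervals are disjoint, the measure of $\mathfrak{M}$ equals $\sum_{q\leq Q} 2\varphi(q)/(qP)$, which is small when $Q$ is small compared with $P$; the minor arc therefore carries most of the measure.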


REMARK. In the Hardy-Ramanujan work the circle $\alpha\,(\mathrm{mod}\ 1)$ was divided exactly by the mediants of the Farey sequence of order $C = P$, so there was nothing in the minor arc.

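For $\alpha \in \mathfrak{M}$ we evaluate the exponential sum

$$S_f(\alpha) = \sum_{x\in B\cap\mathbb{Z}^n} e(\alpha f(x))$$

by splitting into residue classes

$$S_f(\alpha) = \sum_{u\,(\mathrm{mod}\ q)} e\Big(\frac aq f(u)\Big)\sum_{\substack{x\in B\cap\mathbb{Z}^n\\ x\equiv u\,(\mathrm{mod}\ q)}} e(\theta f(x))$$

where $\theta = \alpha - a/q$. Since $|\theta|$ is small, namely $|\theta| \leq (qP)^{-1}$ by (20.29), we can replace the sum over each residue class $u\,(\mathrm{mod}\ q)$ by a corresponding integral with a reasonable error term. Suppose that $P \gg X^{k-1}$ with the implied constant large enough so that

$$\Big|\frac{\partial f}{\partial x_\nu}(x)\Big| \leq \frac P2 \quad\text{for } 1\leq \nu\leq n,\ x\in B.$$

Hence $|\theta\,\partial f/\partial x_\nu| \leq 1/2q$, so Lemma 8.8 applied successively for each variable $x_\nu \equiv u_\nu\,(\mathrm{mod}\ q)$ yields

$$\sum_{\substack{x\in B\cap\mathbb{Z}^n\\ x\equiv u\,(\mathrm{mod}\ q)}} e(\theta f(x)) = q^{-n}B_f(\theta) + O\Big(\Big(1+\frac Xq\Big)^{n-1}\Big)$$

where $B_f(\theta)$ is the integral in question

$$B_f(\theta) = \int_B e(\theta f(x))\,dx. \tag{20.31}$$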

A trivial error term would be $(1+X/q)^n$, so we only saved the size of one variable of summation. Note that $B_f(\theta)$ does not depend on the residue class $u\,(\mathrm{mod}\ q)$ (in more sophisticated versions of the circle method the residue class does appear in the main term with a profound effect). Therefore summing over the classes $u\,(\mathrm{mod}\ q)$ we get

$$S_f(\alpha) = c_f(a/q)\,B_f(\theta) + O\big(q(q+X)^{n-1}\big) \tag{20.32}$$

where $c_f(a/q)$ is the normalized, complete exponential sum

$$c_f(a/q) = q^{-n}\sum_{u\,(\mathrm{mod}\ q)} e\Big(\frac aq f(u)\Big). \tag{20.33}$$

Integrating (20.32) over the major arcs we get

$$\int_{\mathfrak{M}} S_f(\alpha)\,d\alpha = \sum_{q\leq Q} c_f(q)\int_{|\theta|\leq 1/qP} B_f(\theta)\,d\theta + O\big(P^{-1}Q^2(Q+X)^{n-1}\big)$$

where

$$c_f(q) = q^{-n}\sum_{a\,(\mathrm{mod}\ q)}^{*}\ \sum_{u\,(\mathrm{mod}\ q)} e\Big(\frac aq f(u)\Big). \tag{20.34}$$

Note that $c_f(q)$ is real and $|c_f(q)| \leq q$.


First we evaluate asymptotically the integral of $B_f(\theta)$ over the segment $|\theta| \leq 1/qP$. Suppose that for any $\theta > 0$,

$$B_f(\theta) \ll (\theta X^k)^{-1-\gamma}X^n \tag{20.35}$$

where $\gamma > 0$. Keep in mind that $X^n$ is the trivial bound and $\theta X^k$ is about $\theta f(x)$, so we postulate a saving which is only slightly larger than the size of the amplitude. This condition is quite realistic for a smooth function $f(x)$ which changes fast in several variables. Observe that the bound (20.35) remains the same if a constant is added to $f$. For example, if

$$f(x) = N - x_1^k - \cdots - x_n^k \tag{20.36}$$

for $0 < x_\nu \leq X = N^{1/k}$, then

$$|B_f(\theta)| = \Big|\int_0^X e(\theta x^k)\,dx\Big|^n \ll \theta^{-n/k},$$

so (20.35) holds with $\gamma = \frac nk - 1 > 0$, if $n > k$. Using (20.35) we can extend the integration over $|\theta| \leq 1/qP$ to the whole real line getting

$$\int_{|\theta|\leq 1/qP} B_f(\theta)\,d\theta = V_f(B) + O\big((qPX^{-k})^{\gamma}X^{n-k}\big) \tag{20.37}$$

where

$$V_f(B) = \int_{-\infty}^{\infty} B_f(\theta)\,d\theta. \tag{20.38}$$

The condition (20.35) also implies that

$$V_f(B) \ll X^{n-k} \tag{20.39}$$

(use (20.35) if $|\theta| > X^{-k}$ and estimate trivially in $|\theta| \leq X^{-k}$).

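Next we evaluate asymptotically the sum of $c_f(q)$ over the moduli $q \leq Q$. Suppose that for any $q \geq 1$, $(a,q) = 1$,

$$c_f(a/q) \ll q^{-2-\eta} \tag{20.40}$$

where $\eta > 0$. Observe that this bound remains the same if a constant is added to $f$. Hence

$$c_f(q) \ll q^{-1-\eta}. \tag{20.41}$$

Using (20.41) we can extend the summation over $q \leq Q$ to all moduli getting

$$\sum_{q\leq Q} c_f(q) = \mathfrak{S}_f + O(Q^{-\eta}) \tag{20.42}$$

where

$$\mathfrak{S}_f = \sum_{q=1}^{\infty} c_f(q). \tag{20.43}$$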

Now collecting the above results we obtain (20.44)


where (20.45) Observe that we can take $\Delta \leq 2X^{-\delta\gamma\eta}$ with $\delta$ a small, positive number by choosing $Q = X^{\delta\gamma}$ and $P = X^{k-\delta(2+\gamma+\eta)}$, but some other choices of $Q$ and $P$ can be better, see the line above (20.62).

The infinite integral (20.38) and the infinite series (20.43) are called the "singular integral" and the "singular series" for the equation (20.26). These can be brought to expressions which allow meaningful interpretations in terms of local densities. First we work out the singular integral. We have

$$V_f(B) = \lim_{T\to\infty}\int_{-T}^{T}\Big(\int_B e(\theta f(x))\,dx\Big)d\theta = \lim_{T\to\infty}\int_B \frac{\sin(2\pi f(x)T)}{\pi f(x)}\,dx.$$

Integrating over the level set $\{x\in B;\ f(x) = t\}$ and using

$$\int_{-\infty}^{\infty}\frac{\sin 2\pi t}{\pi t}\,dt = 1$$

one can show that $V_f(B)$ is the $(n-1)$-dimensional volume of the set $\{x\in B;\ f(x) = 0\}$ with respect to a suitable measure which depends on $f$. It is also approximately equal to the $n$-dimensional Lebesgue measure of the set $\{x\in B;\ |f(x)| \leq \frac12\}$. Precisely, we derive by the condition (20.35) that for any $V > 0$,

$$V_f(B) = \frac{1}{2V}\,\big|\{x\in B;\ |f(x)|\leq V\}\big| + O\big((VX^{-k})^{\gamma}X^{n-k}\big) \tag{20.46}$$

where $|A|$ denotes the $n$-dimensional Lebesgue measure of $A$. This is non-trivial for $V \ll X^k$, and it yields

$$V_f(B) = \lim_{V\to 0}\frac{1}{2V}\,\big|\{x\in B;\ |f(x)|\leq V\}\big|. \tag{20.47}$$

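For the proof of (20.46) we use the formula

$$\int_{-\infty}^{\infty} e(\theta y)\,\frac{\sin 2\pi\theta V}{\pi\theta}\,d\theta = \begin{cases} 1 & \text{if } |y| < V,\\ 0 & \text{if } |y| > V,\end{cases}$$

and the estimates

$$\frac{\sin 2\pi\theta V}{2\pi\theta V} = 1 + O(\theta^2V^2), \qquad \sin 2\pi\theta V \ll 1.$$

Hence by (20.35) we obtain

$$\begin{aligned}
V_f(B) = \int_{-\infty}^{\infty} B_f(\theta)\,d\theta &= \int_{-\infty}^{\infty}\frac{\sin 2\pi\theta V}{2\pi\theta V}\,B_f(\theta)\,d\theta + O\Big(\int_0^T \theta V\,|B_f(\theta)|\,d\theta + \int_T^{\infty}(\theta V)^{-1}|B_f(\theta)|\,d\theta\Big)\\
&= \frac{1}{2V}\,\big|\{x\in B;\ |f(x)|\leq V\}\big| + O\big((TV + (TV)^{-1})(TX^k)^{-\gamma}X^{n-k}\big).
\end{aligned}$$

Choosing $TV = 1$ we get (20.46).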
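The limit formula (20.47) lends itself to a quick numerical experiment. The sketch below takes $f(x) = N - x_1^k - x_2^k - x_3^k$ with $k = 2$ (so $n = 3 > k$), estimates $\frac{1}{2V}|\{x \geq 0;\ |f(x)| \leq V\}|$ by a midpoint rule, and compares it with the value $\Gamma(1+\frac1k)^n\,\Gamma(\frac nk)^{-1} N^{n/k-1}$, which follows from the classical Dirichlet integral for the volume of $\{x \geq 0;\ x_1^k+\cdots+x_n^k \leq t\}$. Two simplifying assumptions are made: the restriction to the box $[0, N^{1/k}]^n$ is ignored, and $V$ is a small fixed number rather than a true limit.

```python
from math import gamma

def volume_below(t, k, steps=400):
    """Midpoint-rule volume of {x, y, z >= 0 : x^k + y^k + z^k <= t}.

    For each (x, y) the admissible z-range has length (t - x^k - y^k)^(1/k).
    """
    top = t ** (1.0 / k)
    h = top / steps
    powers = [((i + 0.5) * h) ** k for i in range(steps)]
    total = 0.0
    for px in powers:
        for py in powers:
            rest = t - px - py
            if rest > 0:
                total += rest ** (1.0 / k)
    return total * h * h

def singular_integral_estimate(N, k, V):
    """(1/2V) * measure of {x >= 0 : |N - x1^k - x2^k - x3^k| <= V}."""
    return (volume_below(N + V, k) - volume_below(N - V, k)) / (2 * V)

def singular_integral_closed_form(N, k, n):
    """Gamma(1 + 1/k)^n / Gamma(n/k) * N^(n/k - 1)."""
    return gamma(1 + 1 / k) ** n / gamma(n / k) * N ** (n / k - 1)

if __name__ == "__main__":
    print(singular_integral_estimate(1.0, 2, 0.05))
    print(singular_integral_closed_form(1.0, 2, 3))
```

For $k = 2$, $n = 3$ the closed form is $\frac{\pi}{4}\sqrt N$, the derivative in $t$ of the octant volume $\frac{\pi}{6}t^{3/2}$ at $t = N$.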


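For example, if $f(x)$ is given by (20.36), then (20.47) yields

$$V(N) = \frac1k\int\!\cdots\!\int_{\substack{x_2,\dots,x_n>0\\ x_2^k+\cdots+x_n^k<N}} \big(N - x_2^k - \cdots - x_n^k\big)^{\frac1k-1}\,dx_2\cdots dx_n.$$

EXERCISE 1. Show that the above integral is

$$V(N) = \frac{\Gamma(1+\frac1k)^n}{\Gamma(\frac nk)}\,N^{\frac nk-1}. \tag{20.48}$$

Now we work out the singular series. The sum over $a\,(\mathrm{mod}\ q)$ with $(a,q) = 1$ in (20.34) is the Ramanujan sum

$$\sum_{a\,(\mathrm{mod}\ q)}^{*} e\Big(\frac aq m\Big) = \sum_{d\mid(q,m)}\mu\Big(\frac qd\Big)d,$$

therefore

$$c_f(q) = \frac{1}{q^n}\sum_{d\mid q}\mu\Big(\frac qd\Big)d\,\big|\{u\,(\mathrm{mod}\ q);\ f(u)\equiv 0\,(\mathrm{mod}\ d)\}\big| = \sum_{d\mid q}\mu\Big(\frac qd\Big)\omega_f(d),$$

where $d^{n-1}\omega_f(d)$ is the number of solutions to the congruence

$$f(x)\equiv 0\ (\mathrm{mod}\ d). \tag{20.49}$$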

Since $\omega_f(d)$ is multiplicative, we have

$$c_f(q) = \prod_{p^{\alpha}\,\|\,q}\big(\omega_f(p^{\alpha}) - \omega_f(p^{\alpha-1})\big).$$

Hence the singular series is also given by the infinite product

$$\mathfrak{S}_f = \prod_p \delta_f(p) \tag{20.50}$$

where

$$\delta_f(p) = 1 + \sum_{\alpha=1}^{\infty}\big(\omega_f(p^{\alpha}) - \omega_f(p^{\alpha-1})\big). \tag{20.51}$$

Note that the series (20.51) and the product (20.50) converge absolutely by virtue of (20.41). The numbers $\delta_f(p)$ are called "local densities" for the reason that

$$\delta_f(p) = \lim_{\alpha\to\infty}\omega_f(p^{\alpha}), \tag{20.52}$$

so $\delta_f(p)$ is a density for the $p$-adic solutions of $f(x) = 0$. By the analogous formula (20.47) one can interpret $V_f(B)$ as a density measure for the real solutions of $f(x) = 0$.
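The passage from exponential sums to congruence counts can be tested on a small example. The sketch below takes $f(x) = N - x_1^k - \cdots - x_n^k$, computes $c_f(q)$ once from the exponential sum (20.34) and once from $\sum_{d\mid q}\mu(q/d)\omega_f(d)$, and forms the partial sums of (20.51). The choice of polynomial and all function names are illustrative assumptions.

```python
from fractions import Fraction
from cmath import exp, pi
from math import gcd

def residue_counts(q, k, n, N):
    """cnt[r] = #{u mod q : N - u_1^k - ... - u_n^k = r (mod q)}."""
    cnt = [0] * q
    cnt[N % q] = 1
    for _ in range(n):
        new = [0] * q
        for r, c in enumerate(cnt):
            if c:
                for x in range(q):
                    new[(r - pow(x, k, q)) % q] += c
        cnt = new
    return cnt

def omega(d, k, n, N):
    """omega_f(d) = d^(1-n) * #{u mod d : f(u) = 0 (mod d)}, as an exact fraction."""
    return Fraction(residue_counts(d, k, n, N)[0], d ** (n - 1))

def mobius(m):
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result

def c_via_densities(q, k, n, N):
    return sum(mobius(q // d) * omega(d, k, n, N) for d in range(1, q + 1) if q % d == 0)

def c_via_exponential_sum(q, k, n, N):
    """c_f(q) = q^(-n) * sum over reduced a mod q and all u mod q of e(a f(u) / q)."""
    cnt = residue_counts(q, k, n, N)
    total = 0j
    for a in range(1, q + 1):
        if gcd(a, q) == 1:
            total += sum(c * exp(2j * pi * a * r / q) for r, c in enumerate(cnt))
    return total / q ** n

def local_density_partial(p, k, n, N, A):
    """Partial sum of (20.51) up to alpha = A."""
    return 1 + sum(omega(p ** a, k, n, N) - omega(p ** (a - 1), k, n, N) for a in range(1, A + 1))

if __name__ == "__main__":
    for p in (2, 3, 5):
        print(p, [float(omega(p ** a, 2, 4, 7)) for a in range(1, 4)])
```

The partial sums of (20.51) telescope to $\omega_f(p^A)$, which is exactly the content of (20.52).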

If $\omega_f(p^{\alpha})$ stabilizes, i.e.,

$$\omega_f(p^{\alpha}) = \omega_f(p^{a}) \quad\text{for } \alpha \geq a, \tag{20.53}$$

then $\delta_f(p) = \omega_f(p^{a})$ is the density of solutions of the congruence

$$f(x)\equiv 0\ (\mathrm{mod}\ p^{a}). \tag{20.54}$$

For some polynomials $f$ (such as (20.36)) the stability property (20.53) holds for all $p$ with a common $a$ which is sufficiently large depending on $f$. In this case (20.55)

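We now proceed to the estimation of the integral of the exponential sum $S_f(\alpha)$ over the minor arc $\mathfrak{m} = [0,1]\setminus\mathfrak{M}$. First Hardy and Littlewood estimated $S_f(\alpha)$ for each $\alpha$ in $\mathfrak{m}$ and then integrated trivially. For a polynomial in one variable we have

LEMMA 20.3 (WEYL). Let $g(x) = cx^k + c_1x^{k-1} + \cdots$ with $c$ and $k$ positive integers. Suppose

$$\Big|\alpha - \frac aq\Big| \leq q^{-2} \quad\text{with } (a,q) = 1. \tag{20.56}$$

Then

$$\sum_{1\leq x\leq X} e(\alpha g(x)) \ll X^{1+\varepsilon}\Big(\frac1q + \frac1X + \frac{q}{X^k}\Big)^{2^{1-k}} \tag{20.57}$$

for any $\varepsilon > 0$ and $X \geq 1$, the implied constant depending only on $c$.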

REMARK. The range of summation in (20.57) can be shifted to any segment of length $X$ without affecting the bound.

PROOF. By Proposition 8.2 the exponential sum is bounded by

$$\begin{aligned}
&2X\Big\{X^{-k}\sum_{-X<n_1<X}\cdots\sum_{-X<n_{k-1}<X}\min\big(X, \|\alpha ck!\,n_1\cdots n_{k-1}\|^{-1}\big)\Big\}^{2^{1-k}}\\
&\qquad \ll X\Big\{X^{-1} + X^{-k}\sum_{1\leq m<M}\tau_{k-1}(m)\min\big(X, \|\alpha ck!\,m\|^{-1}\big)\Big\}^{2^{1-k}}
\end{aligned}$$

where the first term comes from $n_1\cdots n_{k-1} = 0$ and $\tau_{k-1}(m)$ appears as a bound for the number of representations of $m = n_1\cdots n_{k-1}$ with $1 \leq n_\nu < X$, so $1 \leq m < M = X^{k-1}$. After applying $\tau_{k-1}(m) \ll m^{\varepsilon}$ we extend the summation to

$$\sum_{1\leq \ell\leq L}\min\big(X, \|\alpha\ell\|^{-1}\big) \tag{20.58}$$

where $L = ck!M$ (i.e., we ignore the condition $\ell \equiv 0\ (\mathrm{mod}\ ck!)$). If $\ell$ ranges over an interval of length $q/2$ the points $\alpha\ell\ (\mathrm{mod}\ 1)$ are spaced by $q^{-1}$ by virtue of (20.56), therefore the sum (20.58) is bounded by

$$2\Big(1+\frac Lq\Big)\sum_{\ell\,(\mathrm{mod}\ q)}\min\Big(X, \Big\|\frac{\ell}{q}\Big\|^{-1}\Big) \ll \Big(1+\frac Lq\Big)(X + q\log q).$$

This estimate leads to (20.57). □

From Weyl's lemma for $\alpha = a/q$ and $X = q$ we obtain

COROLLARY 20.4. Let $g(x)$ be a polynomial of degree $k \geq 2$ with integer coefficients. Let $(a,q) = 1$. Then

$$\sum_{x\,(\mathrm{mod}\ q)} e\Big(\frac aq g(x)\Big) \ll (c,q)^{\frac12}\,q^{1-2^{1-k}+\varepsilon} \tag{20.59}$$

where $c$ is the leading coefficient of $g$, and the implied constant depends only on $\varepsilon$.
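For small moduli the complete sums in Corollary 20.4 can be computed directly, which gives a feeling for how much room the Weyl exponent $1 - 2^{1-k}$ leaves. The sketch below evaluates $\sum_{x\,(\mathrm{mod}\ q)} e(ax^k/q)$ for monomials only. As an extra sanity check it uses the classical bound $(\gcd(k,p-1)-1)\sqrt p$ for prime moduli $p \nmid a$, a standard fact about Gauss sums that is brought in here as an outside assumption and is not derived in this section.

```python
import cmath
from math import gcd, sqrt

def complete_sum(a, q, k):
    """Complete exponential sum: sum over x mod q of e(a * x^k / q)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * pow(x, k, q)) % q) / q) for x in range(q))

def weyl_exponent_size(q, k):
    """The comparison quantity q^(1 - 2^(1-k)) from (20.59), without the q^eps factor."""
    return q ** (1 - 2 ** (1 - k))

def prime_modulus_bound(p, k):
    """Classical bound (gcd(k, p-1) - 1) * sqrt(p) for prime p not dividing a (assumed fact)."""
    return (gcd(k, p - 1) - 1) * sqrt(p)

if __name__ == "__main__":
    k = 3
    for q in (7, 13, 19, 31, 37):
        worst = max(abs(complete_sum(a, q, k)) for a in range(1, q))
        print(q, round(worst, 3), round(weyl_exponent_size(q, k), 3), round(prime_modulus_bound(q, k), 3))
```

For $k = 3$ the Weyl exponent is $3/4$, while the observed sums at prime moduli are of size about $\sqrt q$; the weaker exponent is nevertheless enough for the applications in this chapter.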

REMARK. By the Riemann Hypothesis for curves over finite fields (proved by A. Weil [We1]) one can obtain the bound $q^{\frac12+\varepsilon}$, but with the implied constant depending on $g$. However, our exponent $1 - 2^{1-k}$ is more than sufficient for applications in this chapter.

For $\alpha$ in the minor arc $\mathfrak{m}$ we have (20.56) with $Q < q \leq P$, so the saving factor in the bound (20.57) is

$$\Big(\frac1Q + \frac1X + \frac{P}{X^k}\Big)^{2^{1-k}}$$

which is rather small. However, this can be multiplied if $f(x_1,\dots,x_n)$ has many summands in distinct single variables. For example let us consider

$$f(x) = f_1(x_1) + \cdots + f_n(x_n) \tag{20.60}$$

where $f_\ell(x_\ell) = c_\ell x_\ell^k + \cdots$ with $c_\ell > 0$ for all $1 \leq \ell \leq n$. Then, by Lemma 20.3

$$S_f(\alpha) \ll X^{n+\varepsilon}\Big(\frac1Q + \frac1X + \frac{P}{X^k}\Big)^{n2^{1-k}} \tag{20.61}$$

for any $\alpha \in \mathfrak{m}$. The same bound holds for the integral of $S_f(\alpha)$ on $\mathfrak{m}$. We require this to be of smaller order than $X^{n-k}$ which we accomplish by taking $n$ sufficiently large in terms of $k$. Choosing $Q = X^{\frac14}$ and $P = X^{k-\frac14}$ we get

$$\int_{\mathfrak{m}} S_f(\alpha)\,d\alpha \ll X^{n-k-\delta} \tag{20.62}$$

with some $\delta > 0$, if $n > k2^{k+1}$. Note that the condition (20.35) for the polynomial (20.60) holds with $\gamma = n/k - 1$. So the above choice of $Q$ and $P$ also gives $\Delta \ll X^{-\delta}$, see (20.45). Moreover, we verify (20.40) using Corollary 20.4 if $n > 2^k$. Therefore (20.44) becomes

$$\int_{\mathfrak{M}} S_f(\alpha)\,d\alpha = \mathfrak{S}_f V_f(B) + O\big(X^{n-k-\delta}\big) \tag{20.63}$$

where $\mathfrak{S}_f$ and $V_f(B)$ are the singular series and the singular integral for the polynomial (20.60). Adding (20.62) to (20.63) we conclude

THEOREM 20.5. Let $f_1(x_1),\dots,f_n(x_n)$ be polynomials of degree $k \geq 2$ with integer coefficients. Suppose $n > k2^{k+1}$. Then the number of integral solutions to the equation

$$f_1(x_1) + \cdots + f_n(x_n) = 0 \tag{20.64}$$

in a box $B \subset [-X,X]^n$ with $X \geq 1$ satisfies

$$\nu_f(B) = \mathfrak{S}_f V_f(B) + O\big(X^{n-k-\delta}\big) \tag{20.65}$$

for some $\delta > 0$, where $\mathfrak{S}_f$ and $V_f(B)$ are the singular series and the singular integral for the polynomial (20.60).

The implied constant in the error term of (20.65) may depend on the polynomials $f_1,\dots,f_n$ but rather mildly; it does not depend on the constant terms of $f_1,\dots,f_n$. In particular, the asymptotic formula (20.65) applies nicely to the equation (by adding a constant term)

$$f_1(x_1) + \cdots + f_n(x_n) = N. \tag{20.66}$$

Let $c_1,\dots,c_n$ be the leading coefficients, and suppose all are positive. Let $\nu_f(N)$ be the number of solutions to (20.66) in positive integers. They are contained in the box $B = [0,X]^n$ with $X = cN^{\frac1k}$ for some constant $c \geq 1$ depending on $f_1,\dots,f_n$. Therefore Theorem 20.5 yields (20.67) for some $\delta > 0$. In this case we can compute the singular integral $V_f(N)$ asymptotically. First we know from (20.46) that

$$\big|\{0\leq x_1,\dots,x_n\leq X;\ |f_1(x_1)+\cdots+f_n(x_n)-N|\leq U\}\big| \ll UX^{n-k}$$

for any $U > 0$. Hence and by (20.46) we derive using the approximations $f_\ell(x_\ell) = c_\ell x_\ell^k + O(X^{k-1})$ that

$$V_f(N) = \frac{1}{2U}\,\big|\{0\leq x_1,\dots,x_n\leq X;\ |c_1x_1^k+\cdots+c_nx_n^k-N|\leq U\}\big| + O\big(U^{-1}X^{n-1} + (UX^{-k})^{\gamma}X^{n-k}\big).$$

In the error term we choose $U = X^{k-1/(1+\gamma)}$ getting $O(X^{n-k-\gamma/(1+\gamma)})$. In the main term we can reduce $U$ to any smaller number $V$ without changing the above error term by virtue of (20.46). Therefore (20.68) where $V(N)$ is the same one as in (20.48). Inserting this into (20.67) we conclude that (20.69) In particular, we complete the proof of Theorem 20.2 if $n > k2^{k+1}$, except for the analysis of the singular series $\mathfrak{S}(N)$ about which we shall speak later.

In order to get results for smaller $n$ we look for improvements in the treatment of the minor arc. Suppose $f$ has the first variable separated,

$$f(x) = g(x_1) + h(x_2,\dots,x_n). \tag{20.70}$$

Then $S_f(\alpha) = S_g(\alpha)S_h(\alpha)$, so we can write

$$\Big|\int_{\mathfrak{m}} S_f(\alpha)\,d\alpha\Big| \leq \Big(\max_{\alpha\in\mathfrak{m}}|S_g(\alpha)|\Big)\int_0^1 |S_h(\alpha)|\,d\alpha. \tag{20.71}$$

In the minor arc we estimate $S_g(\alpha)$ by Weyl's lemma. Recall that for $\alpha \in \mathfrak{m}$ we have (20.56) with $Q < q \leq P$. We choose $Q = X^{\delta/3}$ and $P = X^{k-1+\delta}$, where $\delta$ is a small positive number. Assuming that $g$ is a non-constant polynomial we get a non-trivial bound (20.72) Since $\eta$ is small, we cannot waste much in the estimation of $S_h(\alpha)$. Our goal is (20.73) which is sufficient to yield

$$\int_{\mathfrak{m}} S_f(\alpha)\,d\alpha \ll X^{n-k-\eta}. \tag{20.74}$$

Moreover, the above choice of $Q$ and $P$ makes the error term in (20.44) of admissible order $X^{n-k-\eta}$. Adding (20.44) to (20.74) we get (20.75)

It remains to prove (20.73). Note that this bound is close to the true order of magnitude. Indeed, by dropping the absolute value of $S_h(\alpha)$ we estimate from below by the integral which is exactly equal to the number of solutions to

$$h(x_2,\dots,x_n) = 0 \tag{20.76}$$

in the relevant box, and this number is expected to be of order $X^{n-1-k}$. In view of this remark the task of showing (20.73) may seem to be almost as difficult as the original problem, or even more difficult because we have one variable less in (20.76) than in the original equation (20.26). However, the situation is not as bad in practice, the reason being that it is easier to give an upper bound for the number of points on a variety than to count these with asymptotic precision. In 1938 L.-K. Hua [H2] succeeded in proving (20.73) with any $\eta > 0$ for the polynomial which is the sum of sufficiently many $k$-th powers.

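LEMMA 20.6 (HUA). Let $k \geq 1$ and

$$S(\alpha) = \sum_{1\leq n\leq X} e(\alpha n^k). \tag{20.77}$$

Then for any $1 \leq \ell \leq k$ we have

$$\int_0^1 |S(\alpha)|^{2^{\ell}}\,d\alpha \ll X^{2^{\ell}-\ell+\varepsilon}, \tag{20.78}$$

the implied constant depending on $\varepsilon$ and $k$.

PROOF. For $\ell = 1$ we get

$$\int_0^1 |S(\alpha)|^2\,d\alpha = X.$$

Suppose that (20.78) holds for $\ell$ with $1 \leq \ell < k$. We shall prove it holds for $\ell+1$. By repeated application of the Weyl "differencing process" $\ell$ times (see Proposition 8.2) we arrive at the inequality

$$|S(\alpha)|^{2^{\ell}} \leq (2X)^{2^{\ell}-\ell-1}\sum_{-X<h_1<X}\cdots\sum_{-X<h_\ell<X}\ \sum_{n\in I} e\big(\alpha h_1\cdots h_\ell\,p(n;h_1,\dots,h_\ell)\big)$$

where $I = I(h_1,\dots,h_\ell)$ is a subinterval of $[1,X]$ and $p(x;h_1,\dots,h_\ell)$ is a polynomial of degree $k-\ell$ with integer coefficients. We simplify this by writing

$$|S(\alpha)|^{2^{\ell}} \leq (2X)^{2^{\ell}-\ell-1}\sum_m a_m e(\alpha m)$$

where $a_m$ is the number of solutions to the equation $h_1\cdots h_\ell\,p(n;h_1,\dots,h_\ell) = m$ in the above range. Therefore $a_0 \ll X^{\ell}$ and $a_m \ll X^{\varepsilon}$ if $m \neq 0$, because $p(x;h_1,\dots,h_\ell)$ is not a constant polynomial. We can also write

$$|S(\alpha)|^{2^{\ell}} = \sum_m b_m e(-\alpha m)$$

where $b_m$ is the number of solutions to

$$n_1^k + \cdots + n_s^k - n_{s+1}^k - \cdots - n_{2s}^k = m$$

with $s = 2^{\ell-1}$, $1 \leq n_1,\dots,n_{2s} \leq X$. Clearly

$$\sum_m b_m = |S(0)|^{2^{\ell}} \leq X^{2^{\ell}}$$

and

$$b_0 = \int_0^1 |S(\alpha)|^{2^{\ell}}\,d\alpha \ll X^{2^{\ell}-\ell+\varepsilon}$$

by the induction hypothesis. Now we derive by the Parseval identity

$$\int_0^1 |S(\alpha)|^{2^{\ell+1}}\,d\alpha \leq (2X)^{2^{\ell}-\ell-1}\sum_m a_m b_m \ll X^{2^{\ell}-\ell-1}X^{2^{\ell}+\varepsilon} = X^{2^{\ell+1}-\ell-1+\varepsilon}$$

completing the proof. □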
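The moments in Hua's lemma have a purely arithmetic meaning: $\int_0^1 |S(\alpha)|^{2s}\,d\alpha$ equals the number of solutions of $n_1^k+\cdots+n_s^k = n_{s+1}^k+\cdots+n_{2s}^k$ with $1 \leq n_i \leq X$. The sketch below counts these solutions exactly for small integer $X$ and prints them next to $X^{2^{\ell}-\ell}$ for $s = 2^{\ell-1}$. It is only a numerical illustration under that counting interpretation, and the function names are made up for the example.

```python
from collections import Counter

def power_sum_counts(X, k, s):
    """Counter r with r[v] = #{(n_1..n_s) in [1, X]^s : n_1^k + ... + n_s^k = v}."""
    counts = Counter({0: 1})
    for _ in range(s):
        new = Counter()
        for v, c in counts.items():
            for n in range(1, X + 1):
                new[v + n ** k] += c
        counts = new
    return counts

def moment(X, k, s):
    """Integral of |S(alpha)|^(2s) over [0, 1], as a count of solutions."""
    return sum(c * c for c in power_sum_counts(X, k, s).values())

if __name__ == "__main__":
    k = 3
    for X in (10, 20, 40):
        for ell in (1, 2):
            s = 2 ** (ell - 1)
            print(X, ell, moment(X, k, s), X ** (2 ** ell - ell))
```

For $\ell = 1$ the count is exactly $X$, and for $\ell = 2$ it is at least $2X^2 - X$ because of the diagonal solutions $\{n_1,n_2\} = \{n_3,n_4\}$, so the exponent $2^{\ell}-\ell$ in (20.78) cannot be lowered in these cases.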

From Hua's lemma with $\ell = k$ one gets (20.73) for any polynomial $h$ in which at least $2^k$ variables appear exactly as $k$-th powers. This holds for the polynomial (20.36) provided $n-1 \geq 2^k$ (we used one variable for the non-trivial estimation (20.72) in the minor arc). In particular, this completes the proof of Theorem 20.2 except for the analysis of the singular series. In general this analysis of $\mathfrak{S}_f$ is not simple. In the special case of the equation (20.23) one can estimate the series (20.43) by elementary yet long arguments, see the comments below.

The asymptotic formula (20.79) for the equation $f(x) = 0$ in $x \in B\cap\mathbb{Z}^n$ will be meaningful only if one knows that the singular series $\mathfrak{S}_f$ is bounded below by a positive quantity which does not or only barely depends on $f$, and that the box $B$ is large enough so that $V_f(B)$ has order of magnitude $X^{n-k}$. The latter is essentially true if the equation $f(x) = 0$ has solutions in real variables lying deeply in the interior of $B$. However, a good lower bound for $\mathfrak{S}_f$ is a more subtle question which we shall address on special occasions. The positivity of $\mathfrak{S}_f$ follows if the equation $f(x) = 0$ is solvable in every $p$-adic field by virtue of the product formula (20.50). Actually the local densities $\delta_f(p)$ are very close to one uniformly for all $p$ sufficiently large, so the problem is to give good lower bounds for $\delta_f(p)$ at a finite number of places which depend on $f$. If $f(x) = 0$ has local solutions and the number of variables $n$ is considerably larger than the degree $k$, then (subject to some extra incidental conditions) there are plenty of local solutions so that $\mathfrak{S}_f \geq c(k,n) > 0$. For example, if $f(x) = N - x_1^k - \cdots - x_n^k$ (the case of the Waring problem) it is known that (see H. Davenport [Da2])

$$\mathfrak{S}(N) \geq c(k,n) > 0 \quad\text{if } n \geq 4k. \tag{20.80}$$

20.3. The circle method after Kloosterman. The circle method is a tool which is suitable for detecting the equation $m = n$ by means of the additive characters $e(\alpha m)$, where $m$ runs over a given set of integers and $n$ is fixed. In a more general setting one has a sequence of complex numbers $\mathcal{A} = (a_m)$ with

$$\sum_{m\in\mathbb{Z}} |a_m| < \infty$$

and the problem is to evaluate asymptotically a selected term of $\mathcal{A}$. We have

$$a_n = \int_0^1 S_{\mathcal{A}}(\alpha)e(-\alpha n)\,d\alpha$$

where

$$S_{\mathcal{A}}(\alpha) = \sum_m a_m e(\alpha m).$$

Assuming we have an adequate knowledge about the distribution of $\mathcal{A}$ in residue classes to small moduli we can evaluate $S_{\mathcal{A}}(\alpha)$ at the points $\alpha$ which are close to rationals with small denominators. The set of such points is called the major arc because the main term in the asymptotic for $a_n$ comes from this area. However, harder work is needed for the estimation of $S_{\mathcal{A}}(\alpha)$ for $\alpha$ in the remaining area which is called the minor arc. Here the treatment depends very much on the structure of the sequence $\mathcal{A}$. We are particularly pleased with $\mathcal{A}$ being an additive convolution of many independent sequences. In this case $S_{\mathcal{A}}(\alpha)$ factors into several exponential sums, so even a small saving in the estimation of each of these can lead to a successful conclusion. It is intrinsic to the Hardy-Littlewood type arguments that one needs at least three summands to resolve the additive equation in question. The arguments certainly fail for binary additive problems (sometimes one summand can be partitioned into two so the additive problem under consideration seems to be of ternary type, but the range of the resulting variables is reduced drastically so this rearrangement does not really change the problem).

Another characteristic of the equations $m = n$ for which the circle method as developed by Hardy-Littlewood works is that the expected number of solutions must be large relative to $n$, and it shouldn't depend essentially on $n$. For example, the method may work for a diophantine equation $f(x) = n$ if the number of variables is much larger than the degree. A real challenge for the circle method is to detect the incident $m = n$ when it is expected to happen irregularly. Moreover, a new idea is needed to treat binary additive problems

$$\nu(n) = \mathop{\sum\sum}_{m+\ell=n} a_m\beta_\ell \tag{20.81}$$

where $\mathcal{A} = (a_m)$ and $\mathcal{B} = (\beta_\ell)$ are given sequences of some arithmetical nature. In 1926 H. D. Kloosterman [Klo] succeeded with the equation

$$a_1x_1^2 + a_2x_2^2 + a_3x_3^2 + a_4x_4^2 = n. \tag{20.82}$$

Actually his method works for the equation

$$Q(x) = n \tag{20.83}$$

where $Q(x)$ is any positive definite quadratic form in $r \geq 4$ variables and integral coefficients. (In particular, using it one can recover the four squares theorem of Lagrange, which the Hardy-Littlewood version could not.) In the next section we shall treat the equation (20.83) by extending the original work of Kloosterman. To explain the novelty of the Kloosterman method let us speculate on the general additive binary problem (20.81) (the quadric equation (20.82) is of this type, for one can take $m = a_1x_1^2 + a_2x_2^2$ and $\ell = a_3x_3^2 + a_4x_4^2$ as the summands with $a_m$, $\beta_\ell$ being the representation numbers by the corresponding binary forms). We have

$$\nu(n) = \int_0^1 S_{\mathcal{A}}(\alpha)S_{\mathcal{B}}(\alpha)e(-\alpha n)\,d\alpha$$

where $S_{\mathcal{A}}(\alpha)$ and $S_{\mathcal{B}}(\alpha)$ are the corresponding exponential sums. The best bounds for these exponential sums which one hopes to prove for almost all $\alpha$ are the $\ell^2$ norms

$$\|\mathcal{A}\| = \Big(\sum_m |a_m|^2\Big)^{\frac12}$$

and $\|\mathcal{B}\|$ respectively. Even if the bounds $S_{\mathcal{A}}(\alpha) \ll \|\mathcal{A}\|$ and $S_{\mathcal{B}}(\alpha) \ll \|\mathcal{B}\|$ were accomplished for $\alpha$ in the minor arc (which is rarely the case), it is not sufficient because the expected main term does not exceed $\|\mathcal{A}\|\,\|\mathcal{B}\|$. Therefore one cannot rely only on estimates. For some sequences one can transform the corresponding exponential sum into another sum, say

$$S_{\mathcal{A}}(\alpha) = \sum_m \gamma_m(\alpha) + E_{\mathcal{A}}(\alpha) \tag{20.84}$$

where the new terms $\gamma_m(\alpha)$ have the status of being "dual" to $a_m e(\alpha m)$ and the error term $E_{\mathcal{A}}(\alpha)$ is significantly smaller than $\|\mathcal{A}\|$. Of course, the dual sum is essentially as large as the original one, but it may have fewer terms. Now,


instead of estimating, one integrates the product of dual sums in $\alpha$ getting an extra saving from the cancellation which is due to the variation in the argument of each individual term $\gamma_m(\alpha)$. In this way one breaks the barrier $\|\mathcal{A}\|\,\|\mathcal{B}\|$ which couldn't be passed by absolute value integration. The main term emerges as before from the integration in small neighborhoods of rationals with small denominators. The diophantine nature of $\alpha$ plays a role in the transformation (20.84), so it is appropriate to divide the circle $\alpha\,(\mathrm{mod}\ 1)$ by the Farey points $\frac ac$ with $1 \leq c \leq C$, $(a,c) = 1$. However, in contrast to the previous practice, neither gaps nor overlapping of these segments is allowed (because of the prohibition of employing estimates for the exponential sums $S_{\mathcal{A}}(\alpha)$ and $S_{\mathcal{B}}(\alpha)$). To cover the circle exactly we choose naturally the division points to be the mediants of the Farey sequence of order $C$. The resulting segments

$$\mathfrak{M}\Big(\frac ac\Big) = \Big(\frac{a'+a}{c'+c}, \frac{a+a''}{c+c''}\Big] = \Big(\frac ac - \frac{1}{c(c+c')}, \frac ac + \frac{1}{c(c+c'')}\Big]$$

have length which changes in arithmetic fashion (see (20.9)) and this property has to be considered when integrating different pieces of the dual exponential sum. Kloosterman controls the length of the segments $\mathfrak{M}(\frac ac)$ by Fourier analysis, and consequently creates exponentials of type $e(-\frac{ud}{c})$ for various $u \in \mathbb{Z}$, where $d$ is the multiplicative inverse of $a\,(\mathrm{mod}\ c)$. By this analysis the integration in $\alpha\,(\mathrm{mod}\ 1)$ amounts to the summation over the Farey points $\frac ac$ with factors $e(-\frac{ud}{c})$. At the same time a nice coincidence may occur that the dual terms $\gamma_m(\alpha)$ for $\alpha$ close to $\frac ac$ also have a phase of type $e(-\frac{vd}{c})$ for various $v \in \mathbb{Z}$ (this happens for the sequence of values of a quadratic form). Multiplying these factors and summing over $a\,(\mathrm{mod}\ c)$ one arrives at the Kloosterman sum

$$S(m,n;c) = \sum_{ad\equiv 1\,(\mathrm{mod}\ c)} e\Big(\frac{dm+an}{c}\Big)$$

with $m = -u-v$. Now any non-trivial bound for $S(m,n;c)$ corresponds to a cancellation in the integral of the dual sums giving a critical improvement of the direct bound $\|\mathcal{A}\|\,\|\mathcal{B}\|$. Kloosterman succeeded in showing that

$$S(m,n;c) \ll (m,n,c)^{\frac14}c^{\frac34}\tau(c),$$

while the Weil bound (Corollary 11.12) asserts that

$$|S(m,n;c)| \leq (m,n,c)^{\frac12}c^{\frac12}\tau(c). \tag{20.85}$$
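The Kloosterman sum and the Weil bound (20.85) are easy to examine for small moduli. The sketch below evaluates $S(m,n;c)$ directly from the definition, pairing each reduced residue $a$ with its inverse $d$ (via Python's built-in modular inverse, available from Python 3.8), and compares $|S(m,n;c)|$ with $(m,n,c)^{1/2}c^{1/2}\tau(c)$. It is a brute-force illustration only; the helper names are not from the text.

```python
import cmath
from math import gcd, sqrt

def kloosterman(m, n, c):
    """S(m, n; c) = sum over a d = 1 (mod c) of e((d*m + a*n) / c)."""
    total = 0j
    for a in range(1, c + 1):
        if gcd(a, c) == 1:
            d = pow(a, -1, c) if c > 1 else 0   # inverse of a modulo c
            total += cmath.exp(2j * cmath.pi * ((d * m + a * n) % c) / c)
    return total

def divisor_count(c):
    """tau(c), the number of positive divisors of c."""
    return sum(1 for d in range(1, c + 1) if c % d == 0)

def weil_bound(m, n, c):
    """Right-hand side of (20.85)."""
    return sqrt(gcd(gcd(m, n), c)) * sqrt(c) * divisor_count(c)

if __name__ == "__main__":
    for c in (5, 8, 12, 17, 30):
        s = kloosterman(1, 1, c)
        print(c, round(s.real, 4), round(weil_bound(1, 1, c), 4))
```

The sums are real, since replacing $(a,d)$ by $(-a,-d)$ conjugates each term, and for $m = n = 0$ the sum is $\varphi(c)$, which is the case where the gcd factor in (20.85) is needed.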

Let us emphasize that if $Q(x)$ was not a quadratic polynomial, then the dual terms, if they exist, may have a phase other than $e(-\frac{vd}{c})$, in which case another exponential sum to modulus $c$ would appear in place of the Kloosterman sums, and the method still works by an appeal to appropriate results for the relevant sums.

We now proceed to a detailed presentation of Kloosterman's ideas in a somewhat abstract form. We begin by establishing a Fourier-Kloosterman expansion of the delta symbol

$$\delta(n) = \begin{cases} 1 & \text{if } n = 0,\\ 0 & \text{if } n \neq 0.\end{cases} \tag{20.86}$$

PROPOSITION 20.7. Let $C$ be a real number $\geq 1$. We have

$$\delta(n) = 2\,\mathrm{Re}\int_0^1 \mathop{\sum\ \sum{}^{*}}_{c\leq C<d\leq c+C} (cd)^{-1}\,e\Big(n\frac{\bar d}{c} - \frac{nx}{cd}\Big)\,dx \tag{20.87}$$

where $*$ restricts the summation by $(c,d) = 1$ and $\bar d$ is the multiplicative inverse of $d$ modulo $c$.

REMARK. For $n = 0$ the formula (20.87) gives a cute identity

$$\mathop{\sum\ \sum}_{\substack{c\leq C<d\leq c+C\\ (c,d)=1}} (cd)^{-1} = \frac12. \tag{20.88}$$

EXERCISE 2. Prove (20.88) directly.
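Both the identity (20.88) and the expansion (20.87) can be checked by machine for small integer $C$. The sketch below verifies (20.88) in exact rational arithmetic and evaluates the right-hand side of (20.87) using the closed form $\int_0^1 e(-nx/cd)\,dx = (e(-n/cd)-1)/(-2\pi i n/cd)$ for $n \neq 0$. The function names are illustrative, and the modular inverse uses `pow(d, -1, c)` (Python 3.8 or later).

```python
from fractions import Fraction
from math import gcd
import cmath

def farey_pairs(C):
    """Pairs (c, d) with c <= C < d <= c + C and gcd(c, d) = 1."""
    return [(c, d) for c in range(1, C + 1)
            for d in range(C + 1, c + C + 1) if gcd(c, d) == 1]

def identity_20_88(C):
    """Left-hand side of (20.88) as an exact fraction."""
    return sum(Fraction(1, c * d) for c, d in farey_pairs(C))

def delta_symbol(n, C):
    """Right-hand side of (20.87), with the x-integral evaluated in closed form."""
    total = 0j
    for c, d in farey_pairs(C):
        d_bar = pow(d, -1, c) if c > 1 else 0      # inverse of d modulo c
        if n == 0:
            integral = 1.0
        else:
            w = -2j * cmath.pi * n / (c * d)       # exponent of e(-n x / (c d)) at x = 1
            integral = (cmath.exp(w) - 1) / w
        total += cmath.exp(2j * cmath.pi * n * d_bar / c) * integral / (c * d)
    return 2 * total.real

if __name__ == "__main__":
    print([identity_20_88(C) for C in range(1, 6)])
    print([round(delta_symbol(n, 4), 12) for n in range(-3, 4)])
```

The printed values should be $1/2$ for every $C$ and, for each fixed $C$, the delta symbol $1$ at $n = 0$ and $0$ elsewhere, independently of the choice of $C$.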

For the proof of (20.87) we may assume that $C$ is a positive integer. Let $f : \mathbb{R}\to\mathbb{C}$ be a periodic function of period 1. We shall evaluate the constant term of the Fourier series of $f$ which is equal to the mean-value

$$\mu(f) = \int_0^1 f(x)\,dx.$$

Splitting into Farey segments

$$\mu(f) = \sum_{c\leq C}\ \sum_{\substack{a\,(\mathrm{mod}\ c)\\ (a,c)=1}}\int_{\mathfrak{M}(a/c)} f(x)\,dx = \sum_{c\leq C}\ \sum_{\substack{a\,(\mathrm{mod}\ c)\\ (a,c)=1}}\int_{-1/c(c+c')}^{1/c(c+c'')} f\Big(\frac ac + x\Big)\,dx,$$

the last line following by (20.9). Suppose that $f$ has the symmetry $f(-x) = \overline{f(x)}$ (i.e., $f$ has real Fourier coefficients), then

$$\mu(f) = 2\,\mathrm{Re}\sum_{c\leq C}\int_0^{1/cC}\ \sum_{\substack{C<d\leq c+C\\ cdx\leq 1}}^{*} f\Big(x - \frac{\bar d}{c}\Big)\,dx. \tag{20.89}$$

Applying this to $f(x) = e(nx)$ and changing $x$ into $x/cd$ one gets (20.87).

In the remaining part of this section we elaborate (20.89) further, not for immediate applications but rather to uncover more structure of the formula. We write (20.89) as

$$\mu(f) = 2\,\mathrm{Re}\sum_{c\leq C} c^{-1}\int_C^{\infty} K_f(x;c)\,x^{-2}\,dx \tag{20.90}$$

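where $K_f(x;c)$ is a kind of incomplete Kloosterman sum associated with $f$,

$$K_f(x;c) = \sum_{\substack{C<d\leq c+C\\ d\leq x}}^{*} f\Big(\frac{\bar d}{c} - \frac{1}{cx}\Big). \tag{20.91}$$

This can be expressed as a complete sum in $d\,(\mathrm{mod}\ c)$ by applying the formula

$$\min\Big(1, \Big[\frac{x-d}{c}\Big] - \Big[\frac{C-d}{c}\Big]\Big) = \begin{cases} 1 & \text{if } d\leq x,\\ 0 & \text{if } d > x\end{cases}$$

for $C < d \leq c+C$ and $x \geq C$. Hence we get

$$K_f(x;c) = \sum_{d\,(\mathrm{mod}\ c)}^{*}\min\Big(1, \Big[\frac{x-d}{c}\Big] - \Big[\frac{C-d}{c}\Big]\Big) f\Big(\frac{\bar d}{c} - \frac{1}{cx}\Big). \tag{20.92}$$

Moreover, the "min" factor in (20.92) equals 1 if $x > c+C$, and it is equal to

$$\Big[\frac{x-d}{c}\Big] - \Big[\frac{C-d}{c}\Big] = \frac{x-C}{c} - \psi\Big(\frac{x-d}{c}\Big) + \psi\Big(\frac{C-d}{c}\Big)$$

if $C < x < c+C$, therefore

$$K_f(x;c) = \min\Big(1, \frac{x-C}{c}\Big)\sum_{d\,(\mathrm{mod}\ c)}^{*} f\Big(\frac{\bar d}{c} - \frac{1}{cx}\Big) + \sum_{d\,(\mathrm{mod}\ c)}^{*}\Big(\psi\Big(\frac{C-d}{c}\Big) - \psi\Big(\frac{x-d}{c}\Big)\Big) f\Big(\frac{\bar d}{c} - \frac{1}{cx}\Big) \tag{20.93}$$

where the last sum is present only if $C < x < c+C$. Inserting this into (20.90) we obtain

PROPOSITION 20.8. Let $C$ be a positive integer and $f$ a periodic function on $\mathbb{R}$ of period 1. Then the constant term of $f$ in its Fourier series is given by

$$\begin{aligned}
\mu(f) ={}& 2\,\mathrm{Re}\sum_{c\leq C} c^{-1}\int_C^{\infty}\min\Big(1, \frac{x-C}{c}\Big)\Big(\sum_{d\,(\mathrm{mod}\ c)}^{*} f\Big(\frac{\bar d}{c} - \frac{1}{cx}\Big)\Big)x^{-2}\,dx\\
&+ 2\,\mathrm{Re}\sum_{c\leq C} c^{-1}\int_C^{c+C}\ \sum_{d\,(\mathrm{mod}\ c)}^{*}\Big(\psi\Big(\frac{C-d}{c}\Big) - \psi\Big(\frac{x-d}{c}\Big)\Big) f\Big(\frac{\bar d}{c} - \frac{1}{cx}\Big)x^{-2}\,dx.
\end{aligned} \tag{20.94}$$

REMARKS. Quite often in applications the first line of (20.94) yields the main term while the second line can be successfully estimated by means of exponential sums over finite fields. Expanding $\psi$ and $f$ into Fourier series the sum $K_f(x;c)$ can be expressed in terms of complete Kloosterman sums (20.85), but this option is not always practical. For $f(x) = e(-nx)$ one gets by (20.94) another expression for the delta-symbol

$$\begin{aligned}
\delta(n) ={}& 2\,\mathrm{Re}\sum_{c\leq C} c^{-1}R(n;c)\int_C^{\infty} e\Big(-\frac{n}{cx}\Big)\min\Big(1, \frac{x-C}{c}\Big)x^{-2}\,dx\\
&+ 2\,\mathrm{Re}\sum_{c\leq C} c^{-1}\int_C^{c+C}\big(R_C(n;c) - R_x(n;c)\big)\,e\Big(-\frac{n}{cx}\Big)x^{-2}\,dx
\end{aligned}$$

where $R(n;c) = S(0,n;c)$ is the Ramanujan sum and

$$R_x(n;c) = \sum_{d\,(\mathrm{mod}\ c)}^{*}\psi\Big(\frac{x-d}{c}\Big)e\Big(\frac{n\bar d}{c}\Big).$$

Note that

$$R_x(n;c) \ll \Big(\sum_{d\mid(c,n)} d^{-\frac12}\Big)c^{\frac12}\tau(c)\log 2c$$

by applying the Fourier expansion

$$\psi\Big(\frac{x-d}{c}\Big) = -\sum_{0<|\ell|\leq c}(2\pi i\ell)^{-1}e\Big(\ell\,\frac{x-d}{c}\Big) + O\Big(\Big(1 + c\,\Big\|\frac{x-d}{c}\Big\|\Big)^{-1}\Big)$$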
and Weil's bound for the Kloosterman sums $S(\ell,n;c)$.

20.4. Representations by quadratic forms. We proceed to treat the equation (20.83) by the Kloosterman circle method. Our goal is

THEOREM 20.9. Let $Q(x)$ be a positive definite quadratic form in $r \geq 4$ variables and integer coefficients. Then for any $n > 0$ the number $r(n,Q)$ of integral solutions $m = (m_1,\dots,m_r)$ to $Q(m) = n$ satisfies

$$r(n,Q) = \frac{(2\pi)^k n^{k-1}}{\Gamma(k)\,|A|^{\frac12}}\,\mathfrak{S}(n,Q) + O\big(n^{\frac k2-\frac14+\varepsilon}\big) \tag{20.95}$$

where $k = \frac r2$, $|A|$ is the determinant of $Q$ and $\mathfrak{S}(n,Q)$ is the singular series

$$\mathfrak{S}(n,Q) = \sum_{c=1}^{\infty} c^{-r}g_c(n,Q) \tag{20.96}$$

with

$$g_c(n,Q) = \sum_{d\,(\mathrm{mod}\ c)}^{*}\ \sum_{h\,(\mathrm{mod}\ c)} e\Big(\frac dc\big(Q(h)-n\big)\Big), \tag{20.97}$$

the implied constant depends on $\varepsilon$ and on the form $Q$.

The quadratic form $Q(x)$ in $x = (x_1,\dots,x_r)$ is associated with a symmetric matrix $A = (a_{ij})$ of rank $r$ with $a_{ij}\in\mathbb{Z}$ and $a_{ii}\in 2\mathbb{Z}$ which is positive definite. In Siegel's notation

$$Q(x) = \tfrac12 A[x] = \tfrac12\,{}^txAx = \tfrac12\sum_i a_{ii}x_i^2 + \mathop{\sum\sum}_{i<j} a_{ij}x_ix_j.$$

The discriminant of $Q$ is defined by

$$D = \begin{cases} (-1)^{\frac r2}|A| & \text{if } r \text{ is even},\\ \tfrac12(-1)^{\frac{r-1}{2}}|A| & \text{if } r \text{ is odd}\end{cases}$$

where $|A|$ is the determinant (one shows that $D \equiv 0, 1\ (\mathrm{mod}\ 4)$ if $r$ is even). Put

$$\theta(z) = \sum_{m\in\mathbb{Z}^r} e(Q(m)z) = \sum_{n\geq 0} r(n,Q)e(nz) \tag{20.98}$$

where

$$r(n,Q) = \big|\{m\in\mathbb{Z}^r;\ Q(m) = n\}\big|.$$
This theta function is a modular form of weight $k = \frac r2$, level $2N$, where $N$ is such that $NA^{-1}$ is integral, and it transforms by a suitable multiplier system defined in terms of the Jacobi symbol (see Theorem 10.8 of [I4]), but we do not use this fact. All we need is the following Jacobi inversion formula

LEMMA 20.10. Let $z\in\mathbb{H}$ and $x\in\mathbb{R}^r$. Then

$$\sum_{m\in\mathbb{Z}^r} e\big(\tfrac12 A[m+x]z\big) = |A|^{-\frac12}\Big(\frac iz\Big)^{k}\sum_{m\in\mathbb{Z}^r} e\Big(-\frac{1}{2z}A^{-1}[m] + {}^tmx\Big). \tag{20.99}$$

SKETCH OF PROOF. This follows by the Poisson summation formula

$$\sum_{m\in\mathbb{Z}^r} f(m+x) = \sum_{m\in\mathbb{Z}^r}\hat f(m)e({}^tmx)$$

for $f(x) = e(\tfrac12 A[x]z)$. The computation of the Fourier transform $\hat f(m)$ is carried out by a linear change of variables which diagonalizes $A$, so it reduces to the familiar integral in one variable. □

Let $ad \equiv 1\ (\mathrm{mod}\ c)$. We shall use Lemma 20.10 to transform $\theta(z - \frac ac)$ into a theta function for the adjoint quadratic form

$$Q^*(x) = \tfrac12 A^{-1}[x]. \tag{20.100}$$

REMARK. The adjoint form $Q^*(x)$ does not have integral coefficients. If $N$ is a positive integer such that $NA^{-1}$ is integral, then $NQ^*(x)$ has integral coefficients. For example, $N = |A|$ satisfies this condition, yet a smaller number suffices quite often. Note that $N^r \equiv 0\ (\mathrm{mod}\ |A|)$ so every prime factor of $|A|$ is in $N$.

Splitting the summation in (20.98) into residue classes $h = (h_1,\dots,h_r)$ of modulus $c$ we get

$$\theta\Big(z - \frac ac\Big) = \sum_{h\,(\mathrm{mod}\ c)} e\Big(-\frac ac Q(h)\Big)\sum_{m\equiv h\,(\mathrm{mod}\ c)} e(Q(m)z).$$

Then by the Jacobi inversion formula (20.99) for $x = hc^{-1}$ we get

$$\sum_{m\equiv h\,(\mathrm{mod}\ c)} e(Q(m)z) = |A|^{-\frac12}c^{-r}\Big(\frac iz\Big)^{k}\sum_{m\in\mathbb{Z}^r} e\Big(\frac{{}^tmh}{c} - \frac{Q^*(m)}{c^2z}\Big).$$

Inserting this into the former sum and changing $h$ into $dh$ modulo $c$ we arrive at


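LEMMA 20.11. Let $z\in\mathbb{H}$ and $ad \equiv 1\ (\mathrm{mod}\ c)$. Then

$$\theta\Big(z - \frac ac\Big) = |A|^{-\frac12}c^{-r}\Big(\frac iz\Big)^{k}\sum_{m\in\mathbb{Z}^r} G_m\Big(-\frac dc\Big)e\Big(-\frac{Q^*(m)}{c^2z}\Big) \tag{20.101}$$

where $Q^*$ is the adjoint quadratic form and $G_m(d/c)$ is the Gauss sum

$$G_m\Big(\frac dc\Big) = \sum_{h\,(\mathrm{mod}\ c)} e\Big(\frac dc\big(Q(h) + {}^tmh\big)\Big). \tag{20.102}$$

To pick up the coefficient $r(n,Q)$ in (20.98) we apply (20.89) for $f(x) = e(-nz)\theta(z)$, where $z = x+iy$ with $y > 0$ to be chosen later. We obtain

$$r(n,Q) = 2\,\mathrm{Re}\sum_{c\leq C}\int_0^{1/cC} T(c,n;x)e(-nz)\,dx \tag{20.103}$$

where

$$T(c,n;x) = \sum_{\substack{C<d\leq c+C\\ cdx\leq 1}}^{*} e\Big(n\frac{\bar d}{c}\Big)\theta\Big(z - \frac{\bar d}{c}\Big). \tag{20.104}$$

Then by Lemma 20.11 we obtain

$$T(c,n;x) = |A|^{-\frac12}c^{-r}\Big(\frac iz\Big)^{k}\sum_{m\in\mathbb{Z}^r} T_m(c,n;x)\,e\big(-Q^*(m)/c^2z\big) \tag{20.105}$$

where

$$T_m(c,n;x) = \sum_{\substack{C<d\leq c+C\\ cdx\leq 1}}^{*} e\Big(n\frac{\bar d}{c}\Big)G_m\Big(-\frac dc\Big). \tag{20.106}$$

Now we are going to estimate $T_m(c,n;x)$. First we estimate the Gauss sum

LEMMA 20.12. Let $(c,d) = 1$ and $m\in\mathbb{Z}^r$. We have

$$G_m\Big(\frac dc\Big) \ll c^{\frac r2} \tag{20.107}$$

where the implied constant depends only on the quadratic form $Q$.

PROOF. We have

$$\begin{aligned}
\Big|G_m\Big(\frac dc\Big)\Big|^2 &= \mathop{\sum\sum}_{x,y\,(\mathrm{mod}\ c)} e\Big(\frac dc\big(Q(x) - Q(y) + {}^t(x-y)m\big)\Big)\\
&= \mathop{\sum\sum}_{y,z\,(\mathrm{mod}\ c)} e\Big(\frac dc\big({}^tyAz + Q(z) + {}^tzm\big)\Big)\\
&\leq c^r\,\big|\{z\,(\mathrm{mod}\ c);\ Az\equiv 0\ (\mathrm{mod}\ c)\}\big|
\end{aligned}$$

which yields (20.107). □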

Next we evaluate the Gauss sum precisely, but only if the modulus is odd and coprime with the discriminant.


LEMMA 20.13. Let $(c, 2|A|d) = 1$ and $m\in\mathbb{Z}^r$. We have

(20.108)

PROOF. It reduces to the one-dimensional case by a local diagonalization and completing the square, (20.109) if $(c,2a) = 1$ (see (3.38)). Recall that a nonsingular symmetric matrix over a field of characteristic $\neq 2$ is equivalent to a diagonal matrix. Hence one shows that there exist an integral matrix $V$ and an integral diagonal matrix $B = \mathrm{diag}(b_1,\dots,b_r)$ such that $(c,|V|) = 1$ and ${}^tVAV \equiv B\ (\mathrm{mod}\ c)$. Changing $x$ to $Vy$ modulo $c$, we get

$$A[x] \equiv B[y] = \sum_{\nu=1}^{r} b_\nu y_\nu^2,$$

$$Q(x) + {}^txm \equiv \tfrac12\sum_{\nu=1}^{r}\big(b_\nu y_\nu^2 + 2d_\nu y_\nu\big) = \tfrac12\sum_{\nu=1}^{r} b_\nu\big(y_\nu + \bar b_\nu d_\nu\big)^2 - \tfrac12\sum_{\nu=1}^{r}\bar b_\nu d_\nu^2,$$

where $d_\nu$ are the coordinates of ${}^tVm$. Here the last sum is congruent modulo $c$ to

$$\sum_{\nu=1}^{r}\bar b_\nu d_\nu^2 \equiv A^{-1}[m] = 2Q^*(m).$$

Moreover, we have $b_1\cdots b_r = |B| \equiv |A|\,|V|^2\ (\mathrm{mod}\ c)$. Therefore, applying (20.109) in each variable $y_\nu$ we conclude (20.108). □

To estimate the sum (20.106) we first complete it as follows

$$T_m(c,n;x) = \sum_{\ell\,(\mathrm{mod}\ c)}\gamma(\ell)K(\ell,m,n;c) \tag{20.110}$$

where

$$K(\ell,m,n;c) = \sum_{d\,(\mathrm{mod}\ c)}^{*} e\Big(\frac{\ell d + n\bar d}{c}\Big)G_m\Big(-\frac dc\Big) \tag{20.111}$$

and

$$\gamma(\ell) = \frac1c\sum_{\substack{C<d\leq c+C\\ cdx\leq 1}} e\Big(-\frac{\ell d}{c}\Big).$$

If $|\ell| \leq \frac c2$ we have $\gamma(\ell) \ll (1+|\ell|)^{-1}$. Therefore the problem reduces to estimation of the complete sums $K(\ell,m,n;c)$.

The complete sum K(f,m,n;c) is multiplicative in c. Precisely, if c = c0 c 1 with (c0 ,c 1 ) = 1, then K(f,m,n;c) = K(co)(f!,m,n;c 1)K(cl)(£,m,n;c0 ) where the superscripts (c0 ), (c!) indicate that the additive characters e(~x) and e(~x) are used in place of e(t) and e( {;;) respectively. Let co denote the largest factor of c having all its prime divisors in 2IAI and c1 be the remaining factor, so (c 1 , 2IAI) = 1.


We estimate the sums (20.111) to moduli $c_0$ and $c_1$ separately. By (20.107) and trivial summation over $d_0 \pmod{c_0}$, we get (20.112) where the implied constant depends only on the form $Q$. Then by (20.108) we get (20.113) where

$$S_r(x,y) = \sum_{d \ (\mathrm{mod}\ c_1)} \Big(\frac{d}{c_1}\Big)^{r} e\Big(\frac{xd + y\bar d}{c_1}\Big)$$

is either a Kloosterman sum if $r$ is even, or a Salié sum if $r$ is odd. In each case this sum satisfies the bound (20.85) giving (20.114). Inserting (20.113) and (20.114) into (20.110) and summing over $|\ell| \le c/2$ we derive

LEMMA 20.14. We have

(20.115) $$T_m(c,n;x) \ll (n + Q^*(m), c_1)^{1/2}\, c_0^{1/2}\, c^{(r+1)/2}\, \tau(c)\log 2c$$

where the implied constant depends only on the form $Q$.

We shall apply the bound (20.115) to all terms in (20.105) except for $m = 0$ in the range $0 < x < 1/c(c+C)$, in which $T_0(c,n;x)$ is equal to

(20.116) $$T(c,n) = \sum^*_{d \ (\mathrm{mod}\ c)} e\Big(\frac{nd}{c}\Big) \sum_{h \ (\mathrm{mod}\ c)} e\Big(-\frac{\bar d}{c}\,Q(h)\Big).$$

We obtain

$$T(c,n;x) = |A|^{-1/2} c^{-r}\Big(\frac{i}{z}\Big)^{k} T(c,n) + O\Big((c_0c)^{1/2}\tau(c)(\log 2c)(c|z|)^{-k}\ {\sum_{m\in\mathbb{Z}^r}}' (n + Q^*(m), c_1)^{1/2}\exp\Big(-\frac{2\pi yQ^*(m)}{c^2|z|^2}\Big)\Big)$$

where $\sum'$ means that $m = 0$ is excluded from the summation if $0 < x < 1/c(c+C)$. We estimate this sum by

$${\sum}' \le \Big(\sum_{\ell \ge 0}(nN + \ell, c_1)^{1/2}(1+\ell)^{-2}\Big)\Big({\sum_{m\in\mathbb{Z}^r}}'\big(1 + NQ^*(m)\big)^{2}\exp\Big(-\frac{2\pi yQ^*(m)}{c^2|z|^2}\Big)\Big).$$

Recall that $N$ is a positive integer such that $NQ^*$ is integral and $(c_1, 2N) = 1$. We take

(20.117) $$C = n^{1/2} \quad\text{and}\quad y = C^{-2} = n^{-1}.$$

This choice implies that for $z = x + iy$ with $0 < x < (cC)^{-1}$,

$$c|z|y^{-1/2} \le (C^{-1} + Cy)\,y^{-1/2} = 2,$$

and if $(c(c+C))^{-1} < x < (cC)^{-1}$, then we also have the lower bound

$$c|z|y^{-1/2} \ge (2C)^{-1}y^{-1/2} = \tfrac12.$$

Moreover, we have $Q^*(m) \gg |m|^2$. Hence, for any $0 < x < (cC)^{-1}$ we deduce

$${\sum_{m\in\mathbb{Z}^r}}'\big(1 + NQ^*(m)\big)^{2}\exp\Big(-\frac{2\pi yQ^*(m)}{c^2|z|^2}\Big) \ll \big(c|z|y^{-1/2}\big)^{\kappa}$$

for any $\kappa \ge 0$. Applying this with $\kappa = k$ we get

$$T(c,n;x) = |A|^{-1/2} c^{-r}\Big(\frac{i}{z}\Big)^{k} T(c,n) + O\big(\xi(c_1)(c_0c)^{1/2}\tau(c)(\log 2c)\,n^{k/2}\big)$$

where

$$\xi(c_1) = \sum_{\ell \ge 0}(c_1, \ell + nN)^{1/2}(\ell+1)^{-2}.$$

Inserting this into (20.103) we get

$$r(n,Q) = |A|^{-1/2}\sum_{c \le C} c^{-r}T(c,n)\int_{-1/c(c+C)}^{1/c(c+C)}\Big(\frac{i}{z}\Big)^{k}e(-nz)\,dx + O\Big(n^{k/2}C^{-1}\sum_{c\le C}\xi(c_1)(c_0/c)^{1/2}\tau(c)\log 2c\Big).$$

Here the error term is bounded by

I:

On the other hand, the integral is equal to

$$\int_{-\infty}^{\infty}\Big(\frac{i}{z}\Big)^{k}e(-nz)\,dx = (2\pi)^k\,\Gamma(k)^{-1}n^{k-1}$$

up to an error term $O((cC)^{k-1})$. Summing this error term over $c \le C$ we find that its total contribution is absorbed by the error term already present (use the bound (20.115) for $T(c,n)$), therefore

$$r(n,Q) = \frac{(2\pi)^kn^{k-1}}{\Gamma(k)\sqrt{|A|}}\sum_{c\le C}c^{-r}T(c,n) + O\big(n^{\frac{k}{2}-\frac14+\varepsilon}\big).$$

Finally, extending the summation in the main term to all $c$ we obtain (20.95) (notice that $T(c,n) = g_c(n,Q)$ by changing the variables $h$ into $dh$ and $d$ into $-d$ in (20.116)).

Theorem 20.9 is also valid for ternary forms ($r = 2k = 3$), but the error term in (20.95) is just too large to yield an asymptotic formula. However, a slight improvement is possible by exploiting additional cancellation in sums of the sums (20.113) for various moduli of the relevant Farey sequence. In the case of $r = 3$ one obtains the Salié sums

(20.118) $$K(a,b;c) = \sum_{d \ (\mathrm{mod}\ c)}\Big(\frac{d}{c}\Big)e\Big(\frac{ad + b\bar d}{c}\Big).$$

These can be computed explicitly in terms of quadratic roots modulo $c$. For example, if $(c, 2a) = 1$, then (see Lemma 12.4)

(20.119) $$K(a,b;c) = \varepsilon_c\Big(\frac{a}{c}\Big)c^{1/2}\sum_{x^2 \equiv ab \ (\mathrm{mod}\ c)} e\Big(\frac{2x}{c}\Big),$$

which easily yields $|K(a,b;c)| \le c^{1/2}\tau(c)$. Clearly, this bound is best possible, yet one can derive a better estimate on average over the modulus $c$ due to the variation of $e(2x/c)$. Such a result belongs to the spectral theory of automorphic forms of weight $k = \frac32$, and it leads to the following formula (see [Du2] and [16]).

THEOREM 20.15. Let $Q(x)$ be a ternary, positive definite quadratic form with integral coefficients. Then for any $n > 0$ the number of integral representations $Q(m) = n$ satisfies

(20.120) $$r(n,Q) = 4\pi\Big(\frac{2n}{|A|}\Big)^{1/2}\mathfrak{S}(n,Q) + O\big(n^{\frac12-\frac{1}{221}}\big)$$

where $\mathfrak{S}(n,Q)$ is the singular series given by (20.96)-(20.97), and the implied constant depends on $Q$.
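The closed form (20.119) can be checked numerically in the simplest situation. The following is a minimal sketch, assuming the modulus is an odd prime $p$ not dividing $a$, that the Salié sum is taken in the form (20.118), and that $\varepsilon_p = 1$ or $i$ according as $p \equiv 1$ or $3 \pmod 4$; the function names are illustrative and not from the text.

```python
import cmath
import math


def e(t):
    """Additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


def legendre(d, p):
    """Legendre symbol (d/p) for an odd prime p (0 if p divides d)."""
    s = pow(d, (p - 1) // 2, p)
    return -1 if s == p - 1 else s


def salie_direct(a, b, p):
    """K(a, b; p) computed from the definition (20.118)."""
    return sum(
        legendre(d, p) * e((a * d + b * pow(d, -1, p)) / p)
        for d in range(1, p)
    )


def salie_closed(a, b, p):
    """Right-hand side of (20.119) for an odd prime p not dividing a."""
    eps = 1 if p % 4 == 1 else 1j
    # All square roots of ab modulo p (none, one, or two of them).
    roots = [x for x in range(p) if (x * x - a * b) % p == 0]
    return eps * legendre(a, p) * math.sqrt(p) * sum(e(2 * x / p) for x in roots)
```

Since at most two roots occur for a prime modulus, the closed form makes the bound $|K(a,b;p)| \le 2\sqrt p$ visible at a glance.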

The formulas (20.95) and (20.120) are true asymptotics only if the singular series $\mathfrak{S}(n,Q)$ does not vanish. We complete this section by highlighting a few features of $\mathfrak{S}(n,Q)$. We know from the general considerations in the previous section that the singular series (20.96) is the product of local densities (of $p$-adic solutions to $Q(x) = n$)

(20.121) $$\mathfrak{S}(n,Q) = \prod_p \delta_p(n,Q).$$

Also the leading factor (20.122) embodies the density of real solutions to $Q(x) = n$. We shall compute the local densities $\delta_p(n,Q)$ for $p \nmid 2|A|$ using the formula (20.108) for Gauss sums; it gives (20.123)

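Suppose $r$ is even. Then the above sum is the Ramanujan sum giving

$$g_c(n,Q) = \chi_D(c)\,c^k\sum_{q \mid (c,n)}\mu\Big(\frac{c}{q}\Big)q$$

where $D = (-1)^{r/2}|A|$ is the discriminant of the quadratic form $Q(x)$ and $\chi_D(c) = \big(\frac{D}{c}\big)$ is the Jacobi symbol. Hence for any $P$ with $(P, 2D) = 1$

$$\sum_{c \mid P}c^{-r}g_c(n,Q) = \sum_{q \mid (n,P)}\chi_D(q)\,q^{1-k}\prod_{p \mid P/q}\big(1 - \chi_D(p)p^{-k}\big).$$

In particular, if every prime power factor of $P$ is larger than that of $n$, this simplifies to

$$\sum_{c \mid P}c^{-r}g_c(n,Q) = \Big(\sum_{q \mid (n,P)}\chi_D(q)\,q^{1-k}\Big)\prod_{p \mid P}\big(1 - \chi_D(p)p^{-k}\big).$$

Taking $P = p^{\nu+1}$, where $\nu = \mathrm{ord}_p\,n$, we get the local density (20.124) for any $p \nmid 2D$. Hence it is clear that

(20.125) $$\delta_p(n,Q) \neq 0 \quad\text{if } p \nmid 2D.$$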
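The local density obtained from the choice $P = p^{\nu+1}$ can be written out explicitly. The following is a sketch derived from the two preceding displays, under the assumption (not restated in this section) that $\delta_p(n,Q)$ is the sum of $c^{-r}g_c(n,Q)$ over powers $c$ of $p$; its normalization should be checked against (20.96)-(20.97).

```latex
% Assumption: \delta_p(n,Q) = \sum_{j \ge 0} p^{-jr} g_{p^j}(n,Q).
% The Ramanujan-sum formula gives g_{p^j}(n,Q) = 0 for j > \nu + 1 (\nu = \mathrm{ord}_p n),
% so the sum over c \mid p^{\nu+1} already captures the full local density.
\delta_p(n,Q)
  = \bigl(1 - \chi_D(p)\,p^{-k}\bigr)
    \sum_{j=0}^{\nu} \bigl(\chi_D(p)\,p^{1-k}\bigr)^{j},
  \qquad p \nmid 2D .
```

For $k \ge 2$ both factors are non-zero, because the geometric sum has ratio of absolute value at most $p^{-1}$; this is one way to see the non-vanishing asserted in (20.125).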

Moreover, taking $P$ to be the product of an arbitrarily large number of primes $p \nmid 2D$ we deduce that

(20.126) $$\mathfrak{S}(n,Q) = \sigma_{1-k}(n,\chi_{4D})\,L(k,\chi_{4D})^{-1}\prod_{p \mid 2D}\delta_p(n,Q)$$

where $L(s,\chi_{4D})$ denotes the Dirichlet $L$-function, and

(20.127) $$\sigma_s(n,\chi) = \sum_{d \mid n}\chi(d)\,d^{s}.$$

The question when $\delta_p(n,Q) > 0$ for $p \mid 2D$ is quite delicate. If $r > 4$ one can show that the non-vanishing of all the local densities $\delta_p(n,Q)$ with $p \mid 2D$ is equivalent to the existence of $m \in \mathbb{Z}^r$ such that (20.128). Assuming this condition, one can infer by (20.96)-(20.97) that

(20.129) $$\mathfrak{S}(n,Q) \gg \prod_{p \mid n}\big(1 + \chi_D(p)\,p^{1-k}\big)$$

if $r$ is even, $r = 2k > 4$. For $r = 4$ a similar analysis applies, except for $\delta_2(n,Q)$. If $r$ is odd, $r \ge 5$, then the exact computation of the local densities $\delta_p(n,Q)$ is more involved; nevertheless, one can derive directly by the trivial estimation

(20.130) $$|g_c(n,Q)| \le c^{\frac{r}{2}+1} \quad\text{if } (c, 2|A|) = 1$$

that the singular series $\mathfrak{S}(n,Q)$ is bounded from below and above by positive constants depending only on $Q$, provided the congruence (20.128) has integral solutions.

REMARK. If $r = 3$, we know only that $\mathfrak{S}(n,Q) \gg n^{-\varepsilon}$ for any $\varepsilon > 0$ where the implied constant depends on $\varepsilon$, $Q$, still assuming the solvability of (20.128) and that $\delta_2(n,Q) > 0$. At present the implied constant is not effectively computable, because the result uses Siegel's bound (5.76).

There is more to say about the formula (20.95) in terms of modular forms. Indeed, the representation number $r(n,Q)$ is the $n$-th Fourier coefficient of the theta function (20.98) which belongs to the space $M_k(2N,\vartheta)$ of modular forms of weight $k$, level $2N$, and a suitable multiplier $\vartheta$. Suppose $r = 2k \ge 4$, so the whole space $M_k(2N,\vartheta)$ is spanned by the Eisenstein series (one for each singular cusp with respect to $\vartheta$) and a finite number of cusp forms. Therefore

$$\theta(z,Q) = E(z,Q) + F(z,Q)$$

where $E(z,Q)$ is a unique combination of the standard Eisenstein series (called the Eisenstein series of the form $Q$) and $F(z,Q)$ is a cusp form uniquely determined by $Q$. From this decomposition

$$r(n,Q) = \rho(n,Q) + \tau(n,Q)$$

where $\rho(n,Q)$, $\tau(n,Q)$ are the corresponding Fourier coefficients. It turns out that $\rho(n,Q)$ coincides with the main term in the formula (20.95)

$$\rho(n,Q) = \delta_\infty(n,Q)\prod_p\delta_p(n,Q) = \frac{(2\pi)^kn^{k-1}}{\Gamma(k)\sqrt{|A|}}\,\mathfrak{S}(n,Q).$$

In view of these observations (due to Siegel) the error term in (20.95) is just a bound for the Fourier coefficient $\tau(n,Q)$ of the cusp form $F(z,Q)$. Having revealed this fact we could apply the Ramanujan-Petersson conjecture

$$\tau(n,Q) \ll \tau(n)\,n^{\frac{k-1}{2}}$$

(the implied constant depending on $Q$) which was proved by P. Deligne if $k$ is an integer, $k \ge 2$, i.e., if $r = 2k$ is even, $r \ge 4$. Our result (20.95) corresponds to the Selberg bound for the Fourier coefficient $\tau(n,Q)$. If $r$ is odd, the theory of Deligne is not applicable.

Siegel [Sie3] also observed that the Eisenstein series $E(z,Q)$ for a form $Q$ is genus invariant. Actually he showed that

$$E(z,Q) = \sum_{Q_\nu \in \mathrm{gen}\,Q} w(Q_\nu)\,\theta(z,Q_\nu)$$

where $Q_\nu$ runs over inequivalent forms in the genus of $Q$, and $w(Q_\nu)$ are the genus masses. To define these numbers we first recall some definitions and basic facts from the theory of quadratic forms, in particular, genus theory.

Two quadratic forms $Q_1$, $Q_2$ (it is assumed throughout that all forms under consideration have the same rank $r \ge 2$) are equivalent if one can be obtained from the other by a unimodular (invertible over $\mathbb{Z}$) change of variables, i.e. if $A_1 = {}^tUA_2U$ for some $U \in M_r(\mathbb{Z})$ with $|U| = \pm1$. The equivalent forms form a class. The determinant is a class invariant. A quadratic form $Q(x) = \frac12A[x]$ is equivalent to itself in a number of ways. Denote by

$$O(Q) = \{U \in M_r(\mathbb{Z});\ {}^tUAU = A\}$$

the group of automorphs of $Q$. This is finite and the same one for equivalent forms; we denote its order by $|O(Q)|$. If $r = 2$ the number $|O(Q)|$ depends only on the determinant $|A|$, but if $r > 2$ it varies from class to class.

Two forms $Q_1$, $Q_2$ are in the same genus if they are equivalent over every $p$-adic field and over the real numbers (the latter condition is redundant since both $Q_1$, $Q_2$ are assumed to be positive definite). The determinant is a genus invariant. One can show that two forms $Q_1$, $Q_2$ of the same determinant $|A_1| = |A_2|$ are in the same genus if and only if they are equivalent over $\mathbb{Z}$ to forms congruent modulo $8|A_1||A_2|$. The number of classes in a genus of a fixed determinant is finite. Clearly, the number of representations $r(n,Q)$ is a class invariant, but it may vary within a given genus, because the number of automorphs may do so. Therefore, it is natural to add the following weight $w(Q)$ to the representations:

$$w(Q) = \frac{1}{|O(Q)|}\Big(\sum_{Q_\nu \in \mathrm{gen}\,Q}\frac{1}{|O(Q_\nu)|}\Big)^{-1}.$$

Note that the total mass of the forms in a genus is

$$\sum_{Q_\nu \in \mathrm{gen}\,Q} w(Q_\nu) = 1.$$

Accordingly, the average number of representations of $n$ by forms of a given genus is defined as the weighted sum

$$r(n,\mathrm{gen}\,Q) = \sum_{Q_\nu \in \mathrm{gen}\,Q} w(Q_\nu)\,r(n,Q_\nu).$$

Collecting the above notation and results we arrive at the Siegel mass formula,

(20.131) $$r(n,\mathrm{gen}\,Q) = \rho(n,Q) = \frac{(2\pi)^kn^{k-1}}{\Gamma(k)\sqrt{|A|}}\,\mathfrak{S}(n,Q).$$
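A quick sanity check is available in the simplest case $Q(x) = x_1^2 + x_2^2 + x_3^2 + x_4^2$ (so $r = 4$, $k = 2$). The sketch below assumes two classical facts that are not derived in this section: the genus of this form consists of a single class, so that $r(n,\mathrm{gen}\,Q) = r(n,Q)$ and there is no cusp form contribution, and Jacobi's formula $r(n,Q) = 8\sum_{d \mid n,\ 4 \nmid d} d$. It compares that formula with a brute-force count; the function names are illustrative.

```python
from math import isqrt


def r4_bruteforce(n):
    """Count (x1, x2, x3, x4) in Z^4 with x1^2 + x2^2 + x3^2 + x4^2 = n."""
    b = isqrt(n)
    # two_sq[t] = number of (x, y) in Z^2 with x^2 + y^2 = t, for 0 <= t <= n.
    two_sq = [0] * (n + 1)
    for x in range(-b, b + 1):
        for y in range(-b, b + 1):
            t = x * x + y * y
            if t <= n:
                two_sq[t] += 1
    # Split the four squares into two pairs and convolve the two counts.
    return sum(two_sq[t] * two_sq[n - t] for t in range(n + 1))


def r4_jacobi(n):
    """Jacobi's formula: 8 times the sum of the divisors of n not divisible by 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)
```

For genera with several classes the same comparison would require enumerating the classes and their automorph counts $|O(Q_\nu)|$, which this sketch does not attempt.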

20.5. Another decomposition of the delta-symbol. Dissecting the unit circle by the Farey points ~ of order C we developed a Fourier type expansion for
$$\sum^*_{d \ (\mathrm{mod}\ c)} e\Big(\frac{dm}{c}\Big) = \sum_{d \mid (c,m)}\mu\Big(\frac{c}{d}\Big)d.$$

We begin by a simple-minded treatment. An integer $n$ is equal to zero if and only if it has a divisor larger than $|n|$. Therefore, for any $q > |n|$ we have

$$\delta(n) = \frac1q\sum_{a \ (\mathrm{mod}\ q)} e\Big(\frac{an}{q}\Big).$$

Writing $a/q$ in its lowest terms we get

(20.132) $$\delta(n) = \frac1q\sum_{c \mid q} S(0,n;c).$$

Next we average over $q$ to gain an extra variable. Let $w(u)$ be a function supported in a segment $C \le u \le 2C$ and normalized by

(20.133) $$\sum_{q=1}^{\infty} w(q) = 1.$$

Summing (20.132) over $q$ we get

(20.134) $$\delta(n) = \sum_{c=1}^{\infty} S(0,n;c)\sum_{r=1}^{\infty}(cr)^{-1}w(cr).$$

This formula is valid only for $|n| < C$; consequently, it employs the characters $e(nd/c)$ to moduli as large as $|n|$, which is not a good feature in applications. In this respect a Kloosterman type expansion has the advantage of employing characters to much smaller moduli relative to $|n|$; precisely, one can make it useful with $c \le 2|n|^{1/2}$. To refine the above idea we list here the following observations:

- every positive integer divides zero,
- a non-zero integer has few divisors,
- of the two complementary divisors of $n > 0$ one does not exceed $n^{1/2}$.

Let $w(u)$ be a function of $u \ge 0$ which vanishes at $u = 0$ and decays rapidly to zero as $u \to \infty$. We normalize $w(u)$ by the condition (20.133). Then by the above observations we deduce the following identity:

$$\delta(n) = \sum_{q \mid n}\Big(w(q) - w\Big(\frac{|n|}{q}\Big)\Big).$$

Detecting the condition $q \mid n$ by orthogonality of additive characters $e(an/q)$, and then reducing the fractions $a/q$ we obtain

PROPOSITION 20.16. Let $w(u)$ satisfy the above conditions. Then for any integer $n$ we have

(20.135) $$\delta(n) = \sum_{c=1}^{\infty} S(0,n;c)\,\Delta_c(n)$$

where $\Delta_c(u)$ is a function on $\mathbb{R}$ given by

(20.136) $$\Delta_c(u) = \sum_{r=1}^{\infty}(cr)^{-1}\Big(w(cr) - w\Big(\frac{|u|}{cr}\Big)\Big).$$
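The identity (20.135) can be verified exactly for a concrete test function. The sketch below makes several assumptions that are not in the text: it takes the hypothetical choice $C = 4$ and the polynomial bump $w(u) = \kappa\,(u - C)(2C - u)$ on $C \le u \le 2C$ (zero elsewhere), with $\kappa$ fixed by (20.133), and it evaluates the Ramanujan sum through the divisor formula $S(0,n;c) = \sum_{d \mid (c,n)}\mu(c/d)\,d$. Rational arithmetic keeps every step exact.

```python
from fractions import Fraction
from math import gcd

C = 4  # hypothetical support parameter: w lives on [C, 2C]


def bump(u):
    """Unnormalized test function, supported on C <= u <= 2C."""
    u = Fraction(u)
    return (u - C) * (2 * C - u) if C <= u <= 2 * C else Fraction(0)


# Normalization (20.133): the values w(q) over integers q sum to 1.
KAPPA = 1 / sum(bump(q) for q in range(C, 2 * C + 1))


def w(u):
    return KAPPA * bump(u)


def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def ramanujan_sum(n, c):
    """S(0, n; c) via the sum over divisors d of gcd(c, n)."""
    g = gcd(c, abs(n))  # gcd(c, 0) == c
    return sum(mobius(c // d) * d for d in range(1, g + 1) if g % d == 0)


def delta_c(c, u):
    """Delta_c(u) from (20.136); the r-sum is finite because w has compact support."""
    u = abs(Fraction(u))
    r_max = max(Fraction(2 * C), u / C) // c + 1
    return sum(
        (w(c * r) - w(u / (c * r))) / (c * r) for r in range(1, r_max + 1)
    )


def delta_symbol(n):
    """Right-hand side of (20.135), truncated where all further terms vanish."""
    c_max = int(max(Fraction(2 * C), Fraction(abs(n), C))) + 1
    return sum(ramanujan_sum(n, c) * delta_c(c, n) for c in range(1, c_max + 1))
```

The truncation at $c \le \max(2C, |n|/C)$ reflects the remark in the text that, for compactly supported $w$, only moduli of this size contribute.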

Since $\Delta_c(u)$ is not compactly supported (even if the test function $w(u)$ is chosen with compact support), we make a practical alteration by multiplying (20.135) with a nice function $f(u)$ supported in $|u| \le 2N$ and normalized by $f(0) = 1$. We get

(20.137) $$\delta(n) = \sum_{c=1}^{\infty} S(0,n;c)\,\Delta_c(n)f(n).$$

Now suppose that $w(u)$ is also compactly supported, say in the dyadic interval $C \le u \le 2C$ (think of $w(u)$ as a bump function). Then the series (20.137) reduces to $c \le 2\max(C, N/C) = X$, say. Clearly the optimal choice is

(20.138) $$C = N^{1/2}$$

giving $X = 2C = 2N^{1/2}$ and

(20.139) $$\delta(n) = \sum_{c \le 2C} S(0,n;c)\,\Delta_c(n)f(n).$$

Therefore it takes characters of moduli $c \le 2N^{1/2}$ to detect the event $n = 0$ within the integers $|n| \le 2N$. Of course, the optimal choice for the magnitude of moduli is not necessary, so we continue our investigation without the condition (20.138), because there are some technical benefits from having $C$ independent of $N$.

Yes, our new expressions (20.135) and (20.137) still contain only the Ramanujan sums (there are nowhere hidden Kloosterman fractions!), but what about the factor $\Delta_c(n)$? Assuming that the test function $w(u)$ is smooth one can regard $\Delta_c(u)$ as a continuous analog of the Ramanujan sum $S(0,n;c)$. To be practical we must be able to control the variation of $\Delta_c(u)$ in both variables $c$, $u$, and to separate $c$ from $u$ at a low cost. To this end we are going to prove

LEMMA 20.17. Suppose $w(u)$ is smooth, supported in the segment $C \le u \le 2C$ with $C \ge 1$, normalized by (20.133), and has derivatives satisfying (20.140) for $0 \le a \le A$. Then for any $c \ge 1$ and $u \in \mathbb{R}$ we have

(20.141) $$\Delta_c(u) \ll \frac{1}{(c+C)C} + \frac{1}{|u| + cC},$$

and for $1 \le a \le A$, (20.142).

PROOF. We split (20.143) where

(20.144) $$V(y) = \sum_{r=1}^{\infty} r^{-1}w(yr) - \int_0^{\infty} r^{-1}w(r)\,dr,$$

(20.145). By the Euler-Maclaurin formula

$$\sum_r F(r) = \int\big(F(r) + \{r\}F'(r)\big)\,dr,$$

where $F$ is of class $C^1$ on $\mathbb{R}$ and $\{r\} = r - [r]$ is the fractional part of $r$, we get (20.146). Since $\{x\} \le \min(1,x) \le 2x(1+x)^{-1}$, we derive by (20.140) that (20.147). In the same way, by applying (20.147) for $w(u)$ changed into $w(1/u)$ and $y$ into $1/y$, we derive that

(20.148) $$W(y) \ll C^{-1}(1 + y/C)^{-1}.$$

Inserting these estimates into (20.143) we obtain (20.141). For the proof of (20.142) observe that $\Delta_c^{(a)}(u)$ vanishes, unless $|u| > cC$ in which case

$$\Delta_c^{(a)}(u) = -\sum_{r=1}^{\infty}(cr)^{-1}\frac{\partial^a}{\partial u^a}\,w\Big(\frac{|u|}{cr}\Big) \ll (cC)^{-1}(|u| + cC)^{-a}.$$

□

Next we shall show that $\Delta_c(u)$ approximates the Dirac distribution quite strongly on suitable test functions. To this end we first establish the following identity.

LEMMA 20.18. For any compactly supported function $f$ of class $C^a$ on $\mathbb{R}$ we have

(20.149) $$\int_{-\infty}^{\infty}\Delta_c(u)f(u)\,du = f(0)\hat w(0) + \hat f(0)\,c^{a-1}\int_0^{\infty}\beta_a\Big(\frac{r}{c}\Big)\Big(\frac{w(r)}{r}\Big)^{(a)}dr - c^{a-1}\int_{-\infty}^{\infty}\beta_a\Big(\frac{r}{c}\Big)\int_0^{\infty}w(u)\,u^{a}f^{(a)}(ru)\,du\,dr,$$

where $\beta_a(x) = \frac{(-1)^{a+1}}{a!}B_a(\{x\})$ and $B_a(X)$ is the Bernoulli polynomial, so

$$\beta_a(x) = \sum_{m \neq 0}(-2\pi im)^{-a}e(mx).$$

PROOF. We split the integral on the left side of (20.149) into two parts according to (20.143) and evaluate the sums over $r$ by the Euler-Maclaurin formula (see Theorem 4.2)

(20.150) $$\sum_r F(r) = \int\big(F(r) + \beta_a(r)F^{(a)}(r)\big)\,dr.$$

First we get

$$V(y) = y^{a}\int_0^{\infty}\beta_a\Big(\frac{r}{y}\Big)\Big(\frac{w(r)}{r}\Big)^{(a)}dr,$$

which leads to the second term on the right-hand side of (20.149). We are left with

$$\int_{-\infty}^{\infty}f(u)\,W\Big(\frac{|u|}{c}\Big)\frac{du}{c} = \int_{-\infty}^{\infty}w(|u|)\sum_{r=1}^{\infty}f(cru)\,du - \frac{\hat f(0)}{c}\int_0^{\infty}\frac{w(r)}{r}\,dr$$

$$= \int_0^{\infty}w(u)\sum_{r=-\infty}^{\infty}f(cru)\,du - f(0)\hat w(0) - \frac{\hat f(0)}{c}\int_0^{\infty}\frac{w(r)}{r}\,dr$$

$$= \int_0^{\infty}w(u)\int_{-\infty}^{\infty}\beta_a(r)\,\frac{\partial^a}{\partial r^a}f(cru)\,dr\,du - f(0)\hat w(0)$$

by (20.150). Changing $r$ into $r/c$ we obtain the remaining terms of the right-hand side of (20.149). □

COROLLARY 20.19. Let the conditions be as in Lemma 20.17. For any $f$ compactly supported and of class $C^a$ on $\mathbb{R}$, and for $c \ge 1$ we have

(20.151) $$\int_{-\infty}^{\infty}\Delta_c(u)f(u)\,du = f(0) + E_c(f)$$

where (20.152)

PROOF. This follows by $|\beta_a(x)| \le 1$ and trivial estimation of the integrals in (20.149), but with the main term $f(0)\hat w(0)$ in place of $f(0)$. However, by the normalization condition (20.133) and the Euler-Maclaurin formula (20.150) we have

Here the error term is absorbed by (20.152) because

$$f(0) \ll \int_{-\infty}^{\infty}\big(|f(u)| + |f^{(a)}(u)|\big)\,du.$$

□

Suppose $f(u)$ is supported in $|u| \le 2N$ and has derivatives (20.153) for $1 \le a \le A$. Then (20.152) gives (20.154). In particular, if $C = \sqrt N$, then this simplifies to $E_c(f) \ll (c/C)^a$ (change $a$ into $a+1$). Recall that the series (20.137) terminates at $X = X(C,N) = 2\max(C, N/C) = 2\sqrt N$. The bound (20.154) is very small if $c \le X^{1-\varepsilon}$ by letting $a$ be sufficiently large.

When applying the formula (20.137) to solve an additive problem it is convenient to have $\Delta_c(n)f(n)$ expressed in additive characters $e(nv)$ with $|v|$ quite small. To this end we consider the Fourier transform

(20.155) $$g_c(v) = \int_{-\infty}^{\infty}\Delta_c(u)f(u)e(-uv)\,du.$$

By Fourier inversion

(20.156) $$\Delta_c(u)f(u) = \int_{-\infty}^{\infty}g_c(v)e(uv)\,dv.$$

Inserting this to (20.137) we get

(20.157) $$\delta(n) = \sum_{c \le X}S(0,n;c)\int_{-\infty}^{\infty}g_c(v)e(nv)\,dv.$$

REMARKS. Here we keep the restriction $c \le X$, because after applying the Fourier integral (20.156) for $\Delta_c(n)f(n)$ one loses sight of the support of the original test functions.

Now we need estimates for $g_c(v)$. By Corollary 20.19 for $f(u)e(-uv)$ in place of $f(u)$ we find that

9c(v)

(20.158)

=

1+

oC~ (~ + ~ + rv~ccf).

I:

Moreover, we have by partial integration

$$g_c(v) = (2\pi iv)^{-a}\int_{-\infty}^{\infty}\big(\Delta_c(u)f(u)\big)^{(a)}e(-uv)\,du.$$

Hence we derive by (20.141), (20.142), (20.153) another estimate

(20.159) $$g_c(v) \ll (|v|cC)^{-a} + (1 + NC^{-2})(|v|N)^{-a}.$$

The formula (20.158) is fine for small $|v|$ while the estimate (20.159) is for larger $|v|$, the value $|v| = 1/cC$ being the transition. The test functions $f(u)$ and $w(u)$ do not need to be compactly supported, but only decaying rapidly enough outside the crucial ranges. We propose the following

EXERCISE 3. Give explicit computations for the special test functions

(20.160) $$f(u) = \exp\Big(-\frac{u^2}{N^2}\Big), \qquad w(u) = \frac{\kappa}{C}\exp\Big(-\frac{u}{C} - \frac{C}{u}\Big),$$

where $\kappa$ is a normalization constant.

It is understandable that the Fourier development of o(n) of Kloosterman type (20.87) and our formula (20.135) have essentially the same power in applications, because they employ the additive characters to moduli of comparable size. However, the absence of Kloosterman fractions in (20.135) can simplify some treatments. For example, when solving the diophantine equation f(x 1 , ... , Xn) = 0 by means of (20.135) one is led to an exponential sum for a polynomial in n + 1 variables over a finite field, whereas the Kloosterman method brings us a rational function.

CHAPTER 21

EQUIDISTRIBUTION

21.1. Weyl's criterion. We begin by reviewing the general principles of equidistribution theory. Let $(X, \mathcal{B}, d\mu)$ be a probability space, where $X$ is a topological space, $\mathcal{B}$ is the $\sigma$-algebra of Borel sets, and $d\mu$ is a measure normalized by

$$\int_X d\mu(x) = 1.$$

A sequence $\{x_n\} \subset X$ is said to be equidistributed (with respect to the measure $d\mu$) if

$$\lim_{N\to\infty}\frac1N\big|\{n \le N;\ x_n \in B\}\big| = \mu(B)$$

for any open set $B \in \mathcal{B}$. Under some mild assumptions this property is essentially that

(21.1) $$\lim_{N\to\infty}\frac1N\sum_{n \le N}f(x_n) = \int_X f(x)\,d\mu(x)$$

for any compactly supported continuous function $f : X \to \mathbb{C}$.

Clearly the equidistribution implies that the sequence $\{x_n\}$ is dense in $X$, yet it is a more refined property. For showing the equidistribution it is sufficient to verify (21.1) for a system of test functions $f$ which generates a dense subset of $C_0(X)$. The system of test functions can be simply the elements of an orthonormal basis of $L^2(X, d\mu)$. In particular, if $X = G$ is a compact group, the matrix coefficients of irreducible unitary representations of $G$ form an orthonormal basis of $L^2(G)$ with respect to Haar measure (the Peter-Weyl theorem). If $X = G^\sharp$ is the space of conjugacy classes of a compact group, then the characters of $G$ form an orthonormal basis of $L^2(G^\sharp, dg)$, with respect to the measure induced by Haar measure. Since the formula (21.1) always holds for $f(x) = 1$, which is the character of the trivial representation, specializing the above discussion to this case gives the

WEYL CRITERION. Let $G$ be a compact group, $G^\sharp$ the set of conjugacy classes in $G$. Then a sequence of elements $x_n \in G^\sharp$ is equidistributed with respect to Haar measure if and only if for any non-trivial irreducible unitary representation $\rho : G \to GL(V)$, we have

(21.2) $$\sum_{n \le N}\mathrm{Tr}(\rho(x_n)) = o(N), \quad\text{as } N \to \infty.$$

A classical example in number theory (initiated by H. Weyl [Wl] in 1916) is the circle $X = \mathbb{R}/\mathbb{Z}$ with the Lebesgue measure $dx$. In this case it is customary

to look at a sequence of real numbers $x_n$, and for the purpose of equidistribution to take the fractional parts $\{x_n\}$ as representatives on the circle. If the fractional parts are equidistributed on the circle, then the original sequence is said to be equidistributed modulo one. The characters $e(hx)$ for $h \in \mathbb{Z}$, $h \neq 0$, are then the representations involved, and the Weyl criterion for equidistribution modulo 1 is

$$\sum_{n \le N}e(hx_n) = o(N), \quad\text{as } N \to \infty, \text{ for any } h \neq 0.$$
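The criterion modulo one is easy to observe numerically. The sketch below is only an illustration under stated assumptions: it uses the sequence $x_n = n\sqrt2$ (for which the exponential sum is a geometric series, bounded independently of $N$) and contrasts it with the rational multiplier $1/4$, for which the sum with $h = 4$ has no cancellation; the function names are hypothetical.

```python
import cmath
import math


def weyl_sum(alpha, h, N):
    """Sum of e(h * n * alpha) over 1 <= n <= N, with e(t) = exp(2*pi*i*t)."""
    return sum(cmath.exp(2j * math.pi * h * n * alpha) for n in range(1, N + 1))


def proportion_in_interval(alpha, N, lo, hi):
    """Fraction of n <= N whose fractional part {n * alpha} lies in [lo, hi)."""
    hits = sum(1 for n in range(1, N + 1) if lo <= (n * alpha) % 1.0 < hi)
    return hits / N
```

For irrational `alpha` the ratio `abs(weyl_sum(alpha, h, N)) / N` tends to zero for each fixed `h != 0`, while `proportion_in_interval` approaches the length `hi - lo`.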

21.2. Selected equidistribution results. Originally Weyl introduced his criterion to study the distribution modulo one of the sequences $x_n = f(n)$, where $f$ is a polynomial with real coefficients.

PROPOSITION 21.1. Let $f \in \mathbb{R}[X]$ be a real polynomial of degree $d \ge 1$. Assume that $f(x) = \alpha x^d + \cdots$ with $\alpha \notin \mathbb{Q}$. Then $x_n = f(n)$ is equidistributed modulo one.

PROOF. We consider $f(x) = \alpha x^d$ for simplicity. Let $h \neq 0$ and $\beta = h\alpha$. For $Q = X^{d-\delta}$, with $0 < \delta < 1$, apply Lemma 20.3 (coming from Proposition 8.2) to an approximation $|\beta - a/q| \le q^{-2}$ with $(a,q) = 1$ and $0 < q \le Q$. This gives

$$\sum_{n \le X}e(\beta n^d) \ll X^{1+\varepsilon}\big(q^{-1} + X^{-1} + qX^{-d}\big)^{\gamma_d}$$

with $\gamma_d = 2^{1-d}$. If $q \ge X^dQ^{-1} = X^{\delta}$, this implies

$$\sum_{n \le X}e(\beta n^d) \ll X^{1-\gamma} \quad\text{for some } \gamma > 0.$$

If $q < X^dQ^{-1}$, denoting $g(t) = (\beta - \frac{a}{q})t^d$, we have by partial summation

$$S(X) = \sum_{n \le X}e(\beta n^d) = \sum_{n \le X}e\Big(\frac{an^d}{q}\Big)e(g(X)) - 2\pi i\int_1^X\Big(\sum_{n \le t}e\Big(\frac{an^d}{q}\Big)\Big)g'(t)e(g(t))\,dt.$$

By splitting into residue classes modulo $q$ we get

$$\sum_{n \le Y}e\Big(\frac{an^d}{q}\Big) = \frac{Y}{q}\,G + O(q) \quad\text{with}\quad G = \sum_{x \ (\mathrm{mod}\ q)}e\Big(\frac{ax^d}{q}\Big).$$

Hence by partial integration again we get

$$S(X) = \frac{G}{q}\int_0^X e(g(t))\,dt + O\Big(q\Big(1 + \frac{X^d}{qQ}\Big)\Big) \ll \frac{X|G|}{q} + O(X^{\delta}).$$

To the complete sum $G$ we can apply either the bounds for Ramanujan sums (if $d = 1$), Gauss sums (if $d = 2$) or Weil's bound for higher degree (also Weyl's Lemma (20.59) would be sufficient), getting $G \ll q^{1/2+\varepsilon}$ and hence $S(X) \ll Xq^{-1/2+\varepsilon} + X^{\delta}$. Since $q \to +\infty$ because $\beta \notin \mathbb{Q}$, we have in any case

$$\sum_{n \le X}e(\beta n^d) = o(X) \quad\text{as } X \to +\infty.$$

□

The equidistribution of arithmetical sequences built out of prime numbers is fascinating and most challenging. For primes themselves, if one takes as space $G = (\mathbb{Z}/q\mathbb{Z})^{\times}$ with the counting measure, then equidistribution is equivalent to the Prime Number Theorem in Arithmetic Progressions (see Chapter 5). Similarly, if $K/\mathbb{Q}$ is a finite Galois extension, then one wishes to study the distribution of the Frobenius conjugacy classes $\sigma_p$ in the Galois group of $K/\mathbb{Q}$. There the Chebotarev Density Theorem gives the result:

THEOREM 21.2. Let $K/\mathbb{Q}$ be a finite Galois extension with Galois group $G$. For any set $C \subset G$ which is stable by conjugacy, we have

$$\big|\{p \le X;\ \sigma_p \in C\}\big| \sim \frac{|C|}{|G|}\,\pi(X) \quad\text{as } X \to +\infty.$$

However, in this case as for arithmetic progressions, questions of the size of the error term and uniformity in terms of parameters are most important and quite different in perspective from the simple equidistribution statement.

One is naturally led to consider sequences $x_p = f(p)$, especially when $x_n = f(n)$ has already been proved to be equidistributed modulo one. There the methods of Chapter 13 are relevant. Indeed, Theorem 13.6 implies the following theorem of Vinogradov.

THEOREM 21.3. Let $\alpha \notin \mathbb{Q}$ be an irrational number. Then the sequence $\alpha p$ is equidistributed modulo one as $p \to +\infty$.

PROOF. Let $\beta = \alpha h$ with $h \neq 0$. We must prove

(21.3) $$\sum_{p \le X}e(\beta p) = o\big(X(\log X)^{-1}\big) \quad\text{as } X \to +\infty.$$

Let $Q = X(\log X)^{-B}$ for some $B > 0$, and let $a/q$ be a rational approximation to $\beta$ such that $|\beta - a/q| \le 1/(qQ) \le q^{-2}$. If $q \ge XQ^{-1} = (\log X)^B$, applying Theorem 13.6 yields

$$\sum_{p \le X}e(\beta p) \ll X(\log X)^{3-B/2}.$$

If $q < XQ^{-1}$, then by partial summation, splitting into arithmetic progressions modulo $q$ and the Siegel-Walfisz Theorem, we have

$$\sum_{p \le X}e(\beta p) \ll \frac{\mathrm{Li}(X)}{\varphi(q)} + X(\log X)^{B-A}$$

for any $A > 0$. If we take $B > 6$ and $A > B + 1$, then (21.3) follows since $q \to +\infty$ as $X \to +\infty$ for $\beta$ irrational. □

Similarly, (13.55) implies that $\alpha\sqrt p$ is equidistributed for any fixed real number $\alpha \neq 0$.

The techniques of exponential sums over finite fields, and particularly the powerful results based on the Riemann Hypothesis proved by Deligne (see Chapter 11), give very good tools to study many interesting equidistribution problems. We give an example of Fouvry and Katz [FK] using Theorem 11.43.

THEOREM 21.4. Let $P_1(X), \dots, P_r(X)$ be polynomials in $\mathbb{Z}[X_1, \dots, X_n]$ such that the total degree of any non-trivial integral linear combination $a_1P_1 + \cdots + a_rP_r$ is $\ge 2$. Let $\varphi(x) \to +\infty$ as $x \to +\infty$. Then, for $p \to +\infty$, the sequence

$$\Big(\frac{P_1(x)}{p}, \dots, \frac{P_r(x)}{p}\Big) \in (\mathbb{R}/\mathbb{Z})^r,$$

where $0 \le x_1, \dots, x_n \le \varphi(p)\sqrt p\log p$, is equidistributed modulo one.

SKETCH OF PROOF. Let $w(p) = \varphi(p)\sqrt p\log p$ and for $a = (a_1, \dots, a_r) \neq 0$, let

$$S = \mathop{\sum\cdots\sum}_{\substack{x_1, \dots, x_n \\ 0 \le x_i \le w(p)}} e\Big(\frac{a_1P_1(x) + \cdots + a_rP_r(x)}{p}\Big).$$

By the Weyl Criterion for $(\mathbb{R}/\mathbb{Z})^r$, we must show that

(21.4) $$S = o\big(w(p)^n\big) \quad\text{as } p \to +\infty.$$

First assume $w(p) < p$. Using Fourier inversion on $\mathbb{Z}/p\mathbb{Z}$, we have

$$S = \frac{1}{p^n}\sum_h T(h, w(p))\,S_1(a,P,h)$$

where

$$T(h,x) = \prod_{1 \le i \le n}\ \sum_{0 \le m \le x}e\Big(\frac{h_im}{p}\Big), \qquad S_1(a,P,h) = \sum_{x_1, \dots, x_n \ (\mathrm{mod}\ p)} e\Big(\frac{a\cdot P(x) - h\cdot x}{p}\Big)$$

(here $h\cdot x$ is the usual scalar product $h_1x_1 + \cdots + h_nx_n$). We have

$$T(h,x) \ll \prod_{1 \le i \le n}\min\big(x, \|h_i/p\|^{-1}\big)$$

where $\|t\|$ is the distance to the nearest integer as usual. For the complete sums $S_1(a,P,h)$, we have two bounds: first, a result of Weil shows that

(21.5) $$S_1(a,P,h) \ll p^{n-1/2}$$

(the implied constant depending only on the $P_i$) because $a\cdot P(x) - h\cdot x$ is never identically zero for $a \neq 0$ by assumption. This bound is insufficient because there are too many $h$. But by Theorem 11.43 with $f = a\cdot P$, $g = 1$ and $V = \mathbf{A}^n_{\mathbb{Z}}$, there are varieties $X_j$ of dimension $\le n - j$ such that $S_1(a,P,h) \ll Cp^{n/2+(j-1)/2}$ for $h \notin X_j$. Weil's bound (21.5) shows that $X_n = \emptyset$. Denoting $X_0 = \mathbf{A}^n$ we get

$$S \ll p^{-n}\sum_{1 \le j \le n}p^{(n+j-1)/2}\sum_{h \in X_{j-1}}|T(h, w(p))|.$$

Lemma 9.5 of [FK] gives

$$\sum_{h \in X_{j-1}}|T(h, w(p))| \ll p^{\,n-(j-1)}\,w(p)^{j-1}(\log p)^{n-(j-1)}$$

from which

$$S \ll p^{1/2}\,w(p)^{n-1}(\log p),$$

hence (21.4) follows. If $w(p) > p$, one dissects the set of summation into cubes with sides $< p$ and applies the above bound to each of them. □

A different set of problems, still very natural, involves the "angles" of exponential sums to prime modulus, or the local roots of $L$-functions for varieties over finite fields. Quite similar is the question of distribution of eigenvalues of Hecke operators $T_p$.

Maybe the simplest non-trivial exponential sums are Gauss sums. Let $\chi$ be a primitive character modulo $p$ and $\tau(\chi)$ the associated Gauss sum (3.10). By (3.14) one can write $\tau(\chi) = e(\alpha_p(\chi))\sqrt p$, where $\alpha_p(\chi) \in \mathbb{R}/\mathbb{Z}$ is called the "angle" of the Gauss sum. The question is, how are those distributed?

This question admits at least two variants. One can consider Gauss sums for characters of a fixed order $d$ modulo primes $p \equiv 1 \pmod d$, as $p \to +\infty$, or one can look at all characters modulo $p$. In the former case, when $d = 2$, Theorem 3.3 and the Prime Number Theorem show that $e(\alpha_p(\chi))$ is equidistributed among the two elements $\{1, i\}$ (note that in this case $e(\alpha_p(\chi)) \in \{\pm1, \pm i\}$ follows from $\tau(\chi)^2 = \tau(\chi)\tau(\bar\chi) = \chi(-1)p$, so the angle is not equidistributed in the group $\{\pm1, \pm i\}$).

Kummer studied the arguments of cubic Gauss sums. For $p \equiv 1 \pmod 3$, there exists a unique prime $\pi \in K = \mathbb{Q}(e(1/3)) \subset \mathbb{C}$ such that $\pi \equiv 1 \pmod 3$ and $N\pi = p$. The cubic residue symbol $\chi_\pi(x) = (x/\pi)_3$ is defined as the cube root of unity such that

$$x^{(p-1)/3} \equiv \chi_\pi(x) \pmod\pi, \quad\text{for } x \in \mathcal{O}/\pi\mathcal{O},$$

where $\mathcal{O}$ is the ring of integers in $K$. It is a character of order 3 on the multiplicative group of the residue field $\mathcal{O}/\pi\mathcal{O} = \mathbb{Z}/p\mathbb{Z}$, i.e. a (primitive) cubic Dirichlet character modulo $p$. Using Jacobi sums, one shows that the Gauss sum satisfies

T(X,..))3 (--

v'P

1r

= e(3ap(x,..)) = -.,.;

w

see [IR]. It is well-known that $\pi/|\pi|$ is equidistributed on $\mathbb{R}/\mathbb{Z}$ (proved using Hecke characters of $K$, see Section 3.8), but to get $e(\alpha_p(\chi_\pi))$ a certain cube root of unity "interferes". Kummer and later Hasse conjectured a certain non-uniform distribution based on some numerical evidence. However R. Heath-Brown and S. Patterson [HBP] proved:

THEOREM 21.5. The angles $e(\alpha_p(\chi_\pi))$ of cubic Gauss sums are equidistributed in $\mathbb{R}/\mathbb{Z}$ as $p \to +\infty$, $p \equiv 1 \pmod 3$.

The proof is based on techniques for sums over primes, as in Chapter 13, and properties of a (metaplectic) Eisenstein series which has the Gauss sums in its Fourier coefficients.

In the case of all characters modulo $p$, we have the following theorem of Deligne:

THEOREM 21.6. The $p - 2$ angles $\alpha_p(\chi)$ of Gauss sums of primitive characters modulo $p$ become equidistributed in $\mathbb{R}/\mathbb{Z}$ as $p$ tends to $\infty$, i.e. we have

$$\lim_{p \to +\infty}\frac{1}{p-2}\sum^*_{\chi \ (\mathrm{mod}\ p)}f(\alpha_p(\chi)) = \int_0^1 f(\theta)\,d\theta$$

for any continuous function $f : [0,1] \to \mathbb{C}$.

PROOF. By the Weyl Criterion, it is enough to show that

(21.6) $$\lim_{p \to +\infty}\frac{1}{p-2}\sum^*_{\chi \ (\mathrm{mod}\ p)}\Big(\frac{\tau(\chi)}{\sqrt p}\Big)^{n} = 0$$

for $n \neq 0$. One can reduce to $n \ge 1$ using the formula $\tau(\chi)\tau(\bar\chi) = \chi(-1)p$ (see (3.14)). We have

$$\tau(\chi)^n = \sum_{x_1, \dots, x_n}\chi(x_1\cdots x_n)\,e\Big(\frac{x_1 + \cdots + x_n}{p}\Big),$$

hence summing over all $\chi$ (including $\chi = 1$) we get by orthogonality of characters

$$\sum_\chi\tau(\chi)^n = (p-1)K_n(1,p)$$

where $K_n(1,p)$ is the multiple Kloosterman sum defined in (11.55). Subtracting the contribution of $\chi = 1$, we get

$$\sum^*_{\chi \ (\mathrm{mod}\ p)}\Big(\frac{\tau(\chi)}{\sqrt p}\Big)^{n} = p^{-n/2}\big((p-1)K_n(1,p) + (-1)^{n+1}\big).$$

As $p \to +\infty$, we have $K_n(1,p) \ll p^{(n-1)/2}$ by (11.58), where the implied constant depends only on $n$, hence (21.6) follows. □

Of course for analytic number theory a particular interest attaches to the classical Kloosterman sums (1.56). By Weil's bound, one can write

$$S(a,b;p) = 2\sqrt p\cos\theta_p(a,b)$$

for some unique $\theta_p(a,b) \in [0,\pi]$. The problem of the distribution of the angles is fascinating. Using all $a \neq 0$ modulo $p$, N. Katz [K4] solved the problem.

THEOREM 21.7. The $p - 1$ angles $\theta_p(1,a)$ for $a \neq 0$ are equidistributed with respect to the Sato-Tate measure on $[0,\pi]$ given by

$$d\mu_{ST} = 2\pi^{-1}(\sin\theta)^2\,d\theta$$

as $p \to +\infty$, i.e. we have

$$\frac{1}{p-1}\sum^*_{a \ (\mathrm{mod}\ p)}f(\theta_p(1,a)) \to \int_0^{\pi}f(\theta)\,d\mu_{ST}$$

as $p \to \infty$ for any continuous function $f : [0,\pi] \to \mathbb{C}$.
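Two exact identities make the shape of the Sato-Tate measure plausible already for a single prime. The sketch below is an illustration, not Katz's argument: it computes $S(1,a;p)$ directly for all $a \neq 0$, checks Weil's bound, and exposes the first two moments, which by orthogonality of additive characters are $\sum_{a \neq 0}S(1,a;p) = 1$ and $\sum_{a \neq 0}S(1,a;p)^2 = p^2 - p - 1$. After dividing by $\sqrt p$ these match the moments $\int 2\cos\theta\,d\mu_{ST} = 0$ and $\int(2\cos\theta)^2\,d\mu_{ST} = 1$ up to $O(1/p)$. The function names are illustrative.

```python
import math


def kloosterman(a, b, p):
    """S(a, b; p) for a prime p: sum of e((a*x + b*x^{-1}) / p) over x != 0.

    The sum is real (pair x with -x), so only cosines are accumulated.
    """
    return sum(
        math.cos(2 * math.pi * ((a * x + b * pow(x, -1, p)) % p) / p)
        for x in range(1, p)
    )


def kloosterman_family(p):
    """The p - 1 sums S(1, a; p) for a = 1, ..., p - 1."""
    return [kloosterman(1, a, p) for a in range(1, p)]


def angles(p):
    """Angles theta in [0, pi] with S(1, a; p) = 2 * sqrt(p) * cos(theta)."""
    scale = 2 * math.sqrt(p)
    # Clamp against rounding before taking the arccosine.
    return [math.acos(max(-1.0, min(1.0, s / scale))) for s in kloosterman_family(p)]
```

A histogram of `angles(p)` for a moderately large prime can then be compared with the density $2\pi^{-1}(\sin\theta)^2$.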

Moreover, he states the following conjecture where $a$ is fixed:

THE SATO-TATE CONJECTURE FOR KLOOSTERMAN SUMS. Let $a$, $b$ be fixed non-zero integers. For $p$ prime let $\theta_p \in [0,\pi]$ be such that

$$S(a,b;p) = 2\sqrt p\,(\cos\theta_p).$$

Then as $X \to +\infty$, the angles $\theta_p$, $p \le X$, become equidistributed with respect to the Sato-Tate measure $d\mu_{ST}$.

It is not even known whether $S(a,b;p)$ changes sign infinitely often, or that the statement

$$(2 - \varepsilon) \le \frac{|S(a,b;p)|}{\sqrt p} = 2|\cos\theta_p| \le 2$$

(for all $p$ large enough) is false, for any $\varepsilon \in\, ]0,2[$. See [FoM] for the best progress in this direction, combining sieve methods and cohomological studies of exponential sums as in Chapter 11. In Corollary 21.9, on the other hand, we prove that the angles of Salié sums $T(a,b;p)$ (for fixed $a$, $b$ with $ab$ not a square) are equidistributed as $p \to +\infty$, but with respect to Lebesgue measure.

The same Sato-Tate distribution is conjectured to hold for the distribution of coefficients $a_p$ associated to an elliptic curve $E/\mathbb{Q}$ without CM, or more generally for the eigenvalues $\lambda_f(p)$ of a non-dihedral primitive holomorphic cusp form $f$ of weight $\ge 2$. If $p$ is a prime not dividing the conductor of $f$, then by Deligne's bound we have $\lambda_f(p) = 2\cos\theta_p(f)$ for some $\theta_p(f) \in [0,\pi]$. For $E$, this reads

$$a_E(p) = 2\sqrt p\cos\theta_p(E).$$

Then we have

THE SATO-TATE CONJECTURE FOR MODULAR FORMS. Let $f$ be a primitive holomorphic cusp form of weight $\ge 2$ which is not of dihedral type. With the above notation, the angles $\theta_p(f)$, $p \le X$, become equidistributed as $X \to +\infty$ with respect to the Sato-Tate measure $d\mu_{ST}$.

Using the modularity of $f$ and the analytic properties of symmetric power $L$-functions, more progress has been made concerning this conjecture than about the analogue for Kloosterman sums (see Serre's letter at the end of [Shah]). In fact, there is a convincing argument in favor of this conjecture based on the expected functoriality of symmetric power $L$-functions and the analytic properties that would follow by the results of Chapter 5. We can sketch this argument, taking the case of the Ramanujan $\Delta$ function of weight 12 as an example - the level is 1, which eliminates complications arising from ramification. We have the Euler product factorizing as

$$L(\Delta,s) = \prod_p (1-\alpha_p p^{-s})^{-1}(1-\beta_p p^{-s})^{-1}$$

with $p^{11/2}(\alpha_p+\beta_p) = \tau(p)$ and $\alpha_p\beta_p = 1$. Here $\tau(p)$ is the Ramanujan $\tau$-function, not the divisor function. For $n \ge 1$, the $n$-th symmetric power of $\Delta$ is the Euler product

$$L(\mathrm{Sym}^n\Delta,s) = \prod_p \prod_{0\le j\le n} (1-\alpha_p^{j}\beta_p^{n-j}p^{-s})^{-1}.$$

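The functoriality principle of Langlands would imply that $L(\mathrm{Sym}^n\Delta, s)$ is an automorphic $L$-function of degree $n+1$ in the sense of Chapter 5, with conductor $q = 1$ and with no pole. Since $\alpha_p\beta_p = 1$, we have

$$\sum_{0\le j\le n}\alpha_p^{j}\beta_p^{n-j} = \frac{\sin((n+1)\theta_p)}{\sin\theta_p} = P_n(\cos\theta_p)$$

where $\theta_p = \theta_p(\Delta)$ and $P_n$ is the Tchebyshev polynomial. Applying Theorem 5.10 we find that for $n \ge 1$,

$$\sum_{p\le X} P_n(\cos\theta_p) = o(\pi(X))$$

as $X \to +\infty$, hence

$$\lim_{X\to+\infty}\frac{1}{\pi(X)}\sum_{p\le X} P_n(\cos\theta_p) = 0 = \int_0^{\pi} P_n(\cos t)\,d\mu_{ST}(t).$$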
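The averages above can be explored numerically for small $X$. The following is a minimal sketch, not part of the argument: it assumes that $P_n(\cos\theta) = \sin((n+1)\theta)/\sin\theta$, computes $\tau(m)$ from the product $q\prod_{n\ge1}(1-q^n)^{24}$, and averages $P_n(\cos\theta_p)$ over primes $p \le X$. The function names and the cut-off `limit` are illustrative choices.

```python
import math

def ramanujan_tau(limit):
    """Return [tau(1), ..., tau(limit)] from q * prod_{n>=1} (1 - q^n)^24."""
    coeffs = [0] * limit      # coeffs[k] = coefficient of q^k in prod (1 - q^n)^24
    coeffs[0] = 1
    for n in range(1, limit):
        for _ in range(24):
            # multiply the truncated series by (1 - q^n), in place
            for k in range(limit - 1, n - 1, -1):
                coeffs[k] -= coeffs[k - n]
    return coeffs             # tau(m) is coeffs[m - 1]

def primes_up_to(n):
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def sato_tate_moment(n, limit=300):
    """Average of sin((n+1) theta_p) / sin(theta_p) over primes p <= limit."""
    tau = ramanujan_tau(limit)
    values = []
    for p in primes_up_to(limit):
        cos_theta = tau[p - 1] / (2 * p ** 5.5)   # lies in [-1, 1] by Deligne's bound
        theta = math.acos(max(-1.0, min(1.0, cos_theta)))
        values.append(math.sin((n + 1) * theta) / math.sin(theta))
    return sum(values) / len(values)
```

For example, `sato_tate_moment(1)` and `sato_tate_moment(2)` give the first two averages for primes up to 300; such small ranges only illustrate the statement and prove nothing about the limit.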

It is easy to show that the functions $P_n(\cos t)$ span $C([0,\pi])$, so the Sato-Tate conjecture follows by the Weyl Criterion from these conjectured properties of the symmetric powers. It is currently known that $L(\mathrm{Sym}^n\Delta, s)$ is automorphic for $n \le 4$: $n = 2$ is due to Gelbart and Jacquet [GeJ] and $n = 3, 4$ to Kim and Shahidi (see [KSh]), and that it is an $L$-function in the sense of Chapter 5, possibly with extra poles, for $n \le 9$.

If $f$ is dihedral, which is equivalent to $E$ having CM in the elliptic curve case, the distribution is known (by work of Deuring) and is quite different. Notice that one can determine this distribution by the above argument if one knows the order of the pole of $L(\mathrm{Sym}^n f, s)$ at $s = 1$. Similarly for weight 1 forms where the Hecke eigenvalues can only take finitely many different values.

Finally we should mention that equidistribution problems in the hyperbolic plane are also very interesting. There one has to use not only cusp forms as test functions in the Weyl Criterion, but also Eisenstein series. As an example, W. Duke [Du2] proved that Heegner points become equidistributed with respect to the Poincaré measure as the discriminant goes to infinity. Also we refer to [Sa4] for discussion of the problems of "quantum chaos" and the links between the quantum unique ergodicity conjecture and subconvexity bounds for certain high-degree $L$-functions.

21.3. Roots of quadratic congruences.

Let $f(X) = aX^2 + bX + c$ be a quadratic polynomial with integer coefficients. For a given prime $p \nmid 2a$ the congruence

$$f(v) \equiv 0 \pmod p \tag{21.7}$$

has at most two solutions; precisely the number of solutions is

$$\rho(p) = 1 + \Bigl(\frac{D}{p}\Bigr)$$

where $D = b^2 - 4ac$ is the discriminant of $f$ and $\bigl(\frac{\cdot}{p}\bigr)$ is the Legendre symbol. Hence 50% of primes yield two roots and the other 50% yield none. Therefore $\rho(p)$ is one on average

$$\sum_{p\le x}\rho(p) \sim \pi(x), \quad \text{as } x\to\infty. \tag{21.8}$$
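The root count $\rho(p) = 1 + (\frac{D}{p})$ is easy to confirm by brute force. The sketch below is only an illustration; the helper names are hypothetical and the Legendre symbol is evaluated by Euler's criterion.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def count_roots(a, b, c, p):
    """Number of v mod p with a*v^2 + b*v + c = 0 (mod p)."""
    return sum(1 for v in range(p) if (a * v * v + b * v + c) % p == 0)

def check_root_count(a, b, c, primes):
    """Compare the brute-force count with 1 + (D/p) for primes p not dividing 2a."""
    D = b * b - 4 * a * c
    for p in primes:
        if (2 * a) % p == 0:
            continue
        if count_roots(a, b, c, p) != 1 + legendre(D, p):
            return False
    return True
```

For $f(X) = X^2 + 1$ one has $D = -4$, so `count_roots(1, 0, 1, p)` is 2 for $p \equiv 1 \pmod 4$ and 0 for $p \equiv 3 \pmod 4$.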

It is an interesting question how the two roots $v \pmod p$ are distributed as $p$ runs over the primes with $p \nmid 2a$, $\bigl(\frac{D}{p}\bigr) = 1$. Obviously, if $f(X)$ factors over $\mathbb{Q}$, then the fractions $\{v/p\}$ are not dense in $[0,1]$; in fact the limit points can only be $u/a$ with $0 \le u \le a$. Excluding this case W. Duke, J. Friedlander, H. Iwaniec [DFI5] and A. Toth [Tot] showed

THEOREM 21.8. Suppose $f(X)$ is a quadratic polynomial with integer coefficients, irreducible over $\mathbb{Q}$. Then for any $0 \le \alpha < \beta \le 1$ we have

$$\bigl|\{v \ (\mathrm{mod}\ p);\ p \le x,\ f(v) \equiv 0 \ (\mathrm{mod}\ p),\ \alpha \le \{v/p\} \le \beta\}\bigr| \sim (\beta-\alpha)\pi(x) \tag{21.9}$$

as $x \to \infty$.

Equivalently, for any continuous, periodic function $F(t)$ of period one

$$\lim_{x\to\infty}\frac{1}{\pi(x)}\sum_{p\le x}\ \sum_{f(v)\equiv 0\ (\mathrm{mod}\ p)} F(v/p) = \int_0^1 F(t)\,dt. \tag{21.10}$$

In other words, the sequence of numbers $v/p$ is equidistributed modulo one. Furthermore, by Weyl's criterion together with (21.8) this property is equivalent to

$$\sum_{p\le x}\rho_h(p) = o(\pi(x)), \tag{21.11}$$

as $x \to \infty$, for any positive integer $h$, where

$$\rho_h(n) = \sum_{\substack{v\ (\mathrm{mod}\ n)\\ f(v)\equiv 0\ (\mathrm{mod}\ n)}} e\Bigl(\frac{hv}{n}\Bigr). \tag{21.12}$$
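To get a feeling for the Weyl sums, one can evaluate $\rho_h(n)$ directly and average over primes. The sketch below assumes the definition $\rho_h(n) = \sum_{f(v)\equiv 0\,(n)} e(hv/n)$ with $e(t) = e^{2\pi i t}$ and uses $f(X) = X^2 + 1$ as a default; the function names are illustrative.

```python
import cmath

def primes_up_to(n):
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def rho_h(h, n, f=lambda v: v * v + 1):
    """Weyl sum over the roots of f modulo n, found by brute force."""
    return sum(cmath.exp(2j * cmath.pi * h * v / n)
               for v in range(n) if f(v) % n == 0)

def normalized_weyl_sum(h, x, f=lambda v: v * v + 1):
    """(1 / pi(x)) * sum over primes p <= x of rho_h(p)."""
    primes = primes_up_to(x)
    return sum(rho_h(h, p, f) for p in primes) / len(primes)
```

Theorem 21.8 predicts that `abs(normalized_weyl_sum(h, x))` tends to 0 for each fixed $h \ge 1$; at the sizes reachable by brute force this is only suggestive.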

A consequence of Theorem 21.8 is the uniform distribution of angles of the Salié sums

$$T(m,n;p) = \sum_{d\ (\mathrm{mod}\ p)}\Bigl(\frac{d}{p}\Bigr)e\Bigl(\frac{dm+\bar{d}n}{p}\Bigr).$$

By Lemma 12.4 we have

$$T(m,n;p) = 2\cos\Bigl(\frac{2\pi v}{p}\Bigr)T(0,n;p)$$

for $p \nmid 2mn$, where $T(0,n;p) = \varepsilon_p\sqrt{p}\bigl(\frac{n}{p}\bigr)$ is the Gauss sum and $v$ is a root of $v^2 \equiv 4mn \pmod p$. We think of $T(0,n;p)$ as the normalization factor and call $2\pi v/p$ the angle of the Salié sum. Therefore Theorem 21.8 implies

COROLLARY 21.9. If $mn$ is not a square, then the angles of the Salié sums $T(m,n;p)$ are uniformly distributed modulo $2\pi$ as $p$ runs over primes.
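The evaluation of the Salié sum quoted from Lemma 12.4 can be checked numerically for small primes. This sketch compares the defining sum with $T(0,n;p)\sum_{v^2\equiv 4mn\,(p)} e(v/p)$, which equals $2\cos(2\pi v/p)\,T(0,n;p)$ when the congruence has the two roots $\pm v$ and vanishes when it has none. The helper names are hypothetical.

```python
import cmath

def e(t):
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)

def legendre(a, p):
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def salie(m, n, p):
    """T(m, n; p) straight from the definition, p an odd prime."""
    total = 0
    for d in range(1, p):
        d_inv = pow(d, p - 2, p)              # inverse of d modulo p
        total += legendre(d, p) * e((d * m + d_inv * n) / p)
    return total

def salie_via_roots(m, n, p):
    """T(0, n; p) times the sum of e(v/p) over the roots of v^2 = 4mn (mod p)."""
    roots = [v for v in range(p) if (v * v - 4 * m * n) % p == 0]
    return salie(0, n, p) * sum(e(v / p) for v in roots)
```

For instance `salie(1, 1, 5)` and `salie_via_roots(1, 1, 5)` agree up to rounding, both being close to $-3.618$.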

This contrasts with the conjectured Sato-Tate distribution for Kloosterman sums. In the simplest case of the polynomial $f(X) = X^2 + 1$ the roots $v \pmod p$ with $p \equiv 1 \pmod 4$ correspond to the representations of $p$ as the sum of two squares $p = r^2 + s^2$. By choosing $-s < r \le s$ with $s$ odd, we see that every such representation gives the unique root $v \equiv r\bar{s} \pmod p$. Hence

$$\frac{v}{p} \equiv -\frac{\bar{r}}{s} + \frac{r}{sp} \pmod 1,$$

where $\bar{r}$ denotes the inverse of $r$ modulo $s$. In view of this construction the assertion (21.11) becomes

$$\sum^{*}_{r^2+s^2=p\le x} e\Bigl(h\frac{\bar{r}}{s}\Bigr) = o(\pi(x))$$

after clearing the almost constant perturbation $e(hr/sp) = 1 + O(h/p)$. This result is reminiscent of the following one:

which was established in [FI]. Other surprising applications of Theorem 21.8 are given in [VdP] and [Ko2].

In this chapter we present a proof of Theorem 21.8 for the polynomial

$$f(X) = X^2 - D \tag{21.13}$$

with $D < 0$ (so $f(X)$ is irreducible without further conditions). The general case of $f$ with negative discriminant can be dealt with by completing the square and modifying slightly the arguments. However, the case of positive discriminant requires substantial changes (because of the infinite group of units in $\mathbb{Q}(\sqrt{D})$ if $D > 0$). This case was settled by A. Toth [Tot] using an alternative path passing directly through Kloosterman sums. Our approach also depends on Kloosterman sums, but indirectly via Section 16.5.

21.4. Linear and bilinear forms in quadratic roots.

By the Weyl Criterion we must show the inequality (21.11) for the polynomial (21.13) with $D < 0$ and $h \ne 0$. We can assume that $D$ is squarefree because the Weyl sum (21.11) with frequency $h$ for $D = m^2E$ is equal to the Weyl sum with frequency $mh$ for $E$ (up to a bounded amount for primes dividing $m$). According to the equidistribution criteria it suffices to prove that

(21.14)

as $x \to \infty$, where $h$ is a fixed positive integer and $g(y)$ is a fixed smooth function supported on $1 \le y \le 2$. To this end we apply Theorem 13.12 for the sequence $\mathcal{A} = (a_n)$ with $a_n = \rho_h(n)g(n/x)$. We have to verify the conditions (13.67), (13.68) for $\Delta = x^{\varepsilon(x)}$ with $\varepsilon(x) \to 0$. The first condition is weaker than the estimate

(21.15)

for any $A \ge 2$, where $B = B(A)$ and the implied constant depends on $A$, $h$ and $g$. For the second condition it suffices to prove the following estimate

$$\sum_m\Bigl|\sum_n \beta_n\rho_h(mn)g\Bigl(\frac{mn}{x}\Bigr)\Bigr| \ll x(\log x)^{-A} \tag{21.16}$$

for any complex numbers $\beta_n$ supported in the interval

(21.17)

with $|\beta_n| \le 1$. We can also restrict the support of $\beta_n$ to primes (see (13.68)) which helps at technical points. In this section we derive both (21.15) and (21.16) from one estimate for the linear forms

$$\mathcal{L}_d(x) = \sum_{n\equiv 0\ (\mathrm{mod}\ d)}\rho_h(n)g\Bigl(\frac{n}{x}\Bigr). \tag{21.18}$$

However, we need a very strong bound for $\mathcal{L}_d(x)$ which is uniform in large ranges of $d$ and $h$.

PROPOSITION 21.10. Let $1 \le h \le x$. Then

(21.19)

where the implied constant depends on the polynomial $f$ and the test function $g$.

For d = 1 the spectral theorem for the modular group would give a stronger (best possible) result

L

Ph(n)g(~) « x ~ log 2 x

n

where the implied constant depends on h. Without smoothing, the spectral method yields

L

Ph(n)

« x~ log x

n.(x

as shown by V. A. Bykovsky [Byk]. Our approach to $\mathcal{L}_d(x)$ was inspired by that of Bykovsky. We shall work with the group $\Gamma_0(d)$ so we have to deal with the exceptional eigenvalues that might be there. All these will be taken care of in the estimation of a certain Poincaré series in the last section.

That (21.19) implies (21.15) is clear. Now we are going to derive (21.16) from (21.19). Let $\mathcal{B}(x)$ denote the double sum over $m$, $n$ on the left side of (21.16). First we install the condition $(m,n) = 1$ and estimate the missing part by

$$\sum_m\sum_{n\mid m}\Bigl|\beta_n\rho(mn)g\Bigl(\frac{mn}{x}\Bigr)\Bigr| \ll x\sum_n|\beta_n|n^{-2} \ll x(\log x)^{-B}$$

because $\beta_n$ are supported on primes in the interval (21.17). Let $\mathcal{B}^*(x)$ denote the double sum reduced by $(m,n) = 1$, so $\mathcal{B}(x) = \mathcal{B}^*(x) + O(x(\log x)^{-B})$. We arrange $\mathcal{B}^*(x)$ as follows

B*(x) (

L L I L f3ng(~n) L m

f(o)"'O(m) (n,m)=l

Then by Cauchy's inequality B*(x) 2

C*(x)=

Lm L m

«

f(v)"'O(mn) v"'o(m)

e(;:) I·

C*(x) log x, where

IL

L

f3ng(mn)

f(o)"'O(m) (n,m)=l

X

f(v)"'O(mn) v=8(m)

e(~)l .

2

mn

Squaring out and changing the order of summation we obtain

where

For n 1 = n 2 we use the trivial bound 'D*(n, n) « n- 2x 2. If n 1 =I n2, then (n1, n2) = 1 since we assumed they are primes, therefore

Note that ~ ( !!..!. ( 2, or else the sum 'D*(n 1, n 2) is void. Here we remove the condition (m, n 1 n 2 1 and estimate the excess part by O((n 1n 2)- 312x 2) again by taking advantage of n 1 , n 2 being primes. With the condition ( m, n 1n 2) = 1 removed the resulting complete sum 'D(n 1,n 2) is just the linear form (21.18) ford= n1n2, h replaced by h(n 2 - ni) and the test function g(y) replaced by ~g(,;;-)!1(.;;-). After rescaling, Proposition 21.10 yields

t:=

'D(n1, n2)

« ((n1n2)-~ x~ + (n1n2)-i x~ )x log2 x

where the implied constant depends only on $f$, $g$, $h$. Putting together the above estimates one can easily complete the proof of (21.16).

21.5. A Poincaré series for quadratic roots.

In this section we interpret the linear form $\mathcal{L}_d(x)$ given by (21.18) as a Poincaré series $P_h(z)$ for the group $\Gamma_0(d)$ of the kind considered in Sections 16.2 and 16.3. Throughout $z = x + iy$ denotes a point in the upper half-plane $\mathbb{H}$, so $x$ is no longer the variable from the previous sections. We begin by recalling various connections of positive definite binary quadratic forms with modular forms. A few relevant facts are also reviewed in Section 22.1. Fix a negative integer $D$. Consider all forms (with even middle coefficient)

$$\varphi(X,Y) = aX^2 + 2bXY + cY^2$$

of discriminant $b^2 - ac = D$. Multiplying by $-1$ (if necessary) we can assume that $a, c > 0$, so $\varphi$ is positive definite. There are two zeros of $\varphi$ which are complex conjugate. Choose the one in $\mathbb{H}$

$$z_\varphi = \frac{b+\sqrt{D}}{a} \in \mathbb{H}.$$

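The modular group $\Gamma = SL_2(\mathbb{Z})$ acts on the quadratic forms of discriminant $D$ by linear change of variables

$$\varphi^{\sigma}(X,Y) = \varphi(\alpha X + \gamma Y, \beta X + \delta Y) \quad \text{if } \sigma = \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix} \in \Gamma.$$

This action is compatible with taking the zeros, i.e., $\sigma z_\varphi = z_{\varphi^\sigma}$. Let $F$ be the standard fundamental domain for $\Gamma$ (see (22.10) and (22.11)). Every form of discriminant $D$ is equivalent (with respect to the action of $\Gamma$) to exactly one form with $z_\varphi \in F$. Denote the set of roots of representing classes by $\Lambda = \{z_\varphi \in F;\ \mathrm{discr}(\varphi) = D\}$. Therefore, there exists a one-to-one correspondence between the solutions of $b^2 - ac = D$ with $a, b, c \in \mathbb{Z}$, $a, c > 0$ and the points of the orbits $\{\sigma z;\ \sigma \in \Gamma/\Gamma_z\}$ for $z \in \Lambda$, where $\Gamma_z$ is the stability group of $z$. Hence we derive

$$\rho_h(n) = \sum_{b^2\equiv D\ (\mathrm{mod}\ n)} e\Bigl(\frac{hb}{n}\Bigr) = \sum_{z\in\Lambda}|\Gamma_z|^{-1}\sum_{\substack{\sigma\in\Gamma_\infty\backslash\Gamma\\ \mathrm{Im}\,\sigma z = \sqrt{|D|}/n}} e(h\,\mathrm{Re}\,\sigma z).$$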

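Next we split the inner sum into classes with respect to the subgroup $\Gamma_0(d)$ getting

$$\rho_h(n) = \sum_{z\in\Lambda}|\Gamma_z|^{-1}\sum_{\tau\in\Gamma_0(d)\backslash\Gamma}\ \sum_{\substack{\sigma\in\Gamma_\infty\backslash\Gamma_0(d)\\ \mathrm{Im}\,\sigma\tau z = \sqrt{|D|}/n}} e(h\,\mathrm{Re}\,\sigma\tau z). \tag{21.20}$$

We shall use this formula for numbers $n \equiv 0 \pmod d$. This congruence can be built into the summation conditions for $z$ and $\tau$ (but not $\sigma$) as the following congruence

$$\sqrt{|D|}/\mathrm{Im}\,\tau z \equiv 0 \pmod d. \tag{21.21}$$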

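Note that this condition does not depend on the choice of $\tau$ in the coset $\Gamma_0(d)\tau$. Now we are ready to evaluate the linear forms

$$\mathcal{L}_d = \sum_{n\equiv 0\ (\mathrm{mod}\ d)}\rho_h(n)F\Bigl(\frac{2\pi h\sqrt{|D|}}{n}\Bigr) \tag{21.22}$$

where $F(u)$ is a smooth function compactly supported on $\mathbb{R}^+$. Introducing (21.20) we arrange this into

$$\mathcal{L}_d = \sum_{z\in\Lambda}|\Gamma_z|^{-1}\ \sideset{}{^\flat}\sum_{\tau\in\Gamma_0(d)\backslash\Gamma} P_h(\tau z) \tag{21.23}$$

where $\sum^\flat$ restricts $\tau$ by the congruence (21.21) and $P_h(z)$ is the Poincaré series for the group $\Gamma_0(d)$

$$P_h(z) = \sum_{\sigma\in\Gamma_\infty\backslash\Gamma_0(d)} F(2\pi h\,\mathrm{Im}\,\sigma z)\,e(h\,\mathrm{Re}\,\sigma z). \tag{21.24}$$

The formula (21.23) truly represents the linear form $\mathcal{L}_d$ by the Poincaré series $P_h(\tau z)$ because there are relatively few points $\tau z$ satisfying (21.21). Precisely the number of points $z \in \Lambda$ is the class number $h(D)$ and the number of cosets $\Gamma_0(d)\tau$ with $\tau \in \Gamma$ is the index

$$[\Gamma : \Gamma_0(d)] = d\prod_{p\mid d}\Bigl(1+\frac{1}{p}\Bigr).$$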
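The index formula can be cross-checked against a direct count. The sketch below relies on the standard fact that the cosets $\Gamma_0(d)\backslash\Gamma$ correspond to the bottom rows $(\gamma : \delta)$, that is to the points of the projective line over $\mathbb{Z}/d\mathbb{Z}$; the function names are illustrative.

```python
from math import gcd

def prime_divisors(d):
    primes, p = [], 2
    while p * p <= d:
        if d % p == 0:
            primes.append(p)
            while d % p == 0:
                d //= p
        p += 1
    if d > 1:
        primes.append(d)
    return primes

def index_formula(d):
    """d * prod_{p | d} (1 + 1/p), computed in integers."""
    result = d
    for p in prime_divisors(d):
        result = result // p * (p + 1)
    return result

def index_by_counting(d):
    """Count primitive pairs (c, e) mod d up to multiplication by units mod d."""
    pairs = sum(1 for c in range(d) for e in range(d)
                if gcd(gcd(c, e), d) == 1)
    units = sum(1 for u in range(1, d + 1) if gcd(u, d) == 1)
    return pairs // units
```

Both functions give 24 for $d = 12$, for example.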

This alone is quite large, however, not every $\tau$ satisfies (21.21). For every $z \in \Lambda$, let $\nu(d,z)$ denote the number of cosets $\Gamma_0(d)\tau$ for which (21.21) holds. Let $\varphi = aX^2 + 2bXY + cY^2$ be the quadratic form corresponding to $z$. Then notice that $\sqrt{|D|}/\mathrm{Im}(z) = c = \varphi(0,1)$. Hence the condition (21.21) for $\tau = \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}$ is that

$$\varphi((0,1)\tau) = \varphi(\gamma,\delta) = a\gamma^2 + 2b\gamma\delta + c\delta^2 \equiv 0 \pmod d. \tag{21.25}$$

Moreover, counting the cosets $\Gamma_0(d)\tau$ means that we count the roots $(\gamma,\delta)$ of (21.25) projectively. We claim that

$$\nu(d,z) \le \tau(d). \tag{21.26}$$

To prove this, recall we assumed that $D$ is squarefree, so by the Chinese Remainder Theorem we must prove that $\nu(p,z) \le 2$ for $p \mid d$ prime. Since $p^2$ does not divide $D$, the quadratic form $aX^2 + 2bXY + cY^2$ is non-zero modulo $p$. The projective solutions to (21.25) for $d = p$ with $\delta \not\equiv 0 \pmod p$ correspond to roots of the non-zero quadratic polynomial $aX^2 + 2bX + c$, hence there are at most 2 of them. If $p \mid \delta$, the only possible projective root is the "point at infinity" $(1,0)$, which occurs if and only if $p \mid a$. In that case there was at most one solution in the first situation, so the total number of solutions is always $\le 2$.

Note that (21.18) becomes (21.22) for the test function

$$F(u) = g\bigl(2\pi h\sqrt{|D|}/ux\bigr). \tag{21.27}$$

This particular function is supported in $Y^{-1} \le u \le 2Y^{-1}$ with

$$Y = \frac{x}{\pi h\sqrt{|D|}}. \tag{21.28}$$

Therefore, by the above arrangements and observations, our problem of estimating the linear form $\mathcal{L}_d(x)$ reduces to that of the Poincaré series $P_h(z)$ for the test function $F(u)$ given by (21.27). In the next section we treat $P_h(z)$ by spectral methods.


21.6. Estimation of the Poincaré series.

We shall estimate $P_h(z)$ for all $z \in \mathbb{H}$ uniformly in $h$ and the level $d$. Our goal is

LEMMA 21.11. Let $P_h(z)$ be given by (21.24) with $F(u)$ supported in $Y^{-1} \le u \le 2Y^{-1}$ for $Y \ge 2$ such that $|F^{(j)}| \le Y^{j}$ for $0 \le j \le 4$. Then for any $z \in \mathbb{H}$ and $\tau \in \Gamma$ we have

$$P_h(\tau z) \ll (y+y^{-1})^{1/2}\bigl((hY)^{1/2} + d^{-1/2}(d,h)^{1/4}(hY)^{3/4}\bigr)\tau(dh)(\log Y)^2 \tag{21.29}$$

where $y = \mathrm{Im}\,z$ and the implied constant is absolute.

It is easy to see that the bound (21.29) for $Y$ given by (21.28) together with (21.26) yield (21.19). Hence one completes the proof of Theorem 21.8 for the polynomial $f(X) = X^2 - D$.

We still have to prove Lemma 21.11. We start from the spectral decomposition (see Theorem 15.2)

$$P_h(z) = \sum_j\langle P_h, u_j\rangle u_j(z) + \sum_{\mathfrak{a}}\frac{1}{4\pi}\int_{\mathbb{R}}\langle P_h, E_{\mathfrak{a}}(\cdot, \tfrac12+it)\rangle E_{\mathfrak{a}}(z, \tfrac12+it)\,dt$$

where $(u_j(z))$ is an orthonormal basis of Maass cusp forms, together with the constant function for the zero eigenvalue, and $\mathfrak{a}$ runs over the cusps of $\Gamma_0(d)$. By the Cauchy-Schwarz inequality

(21.30)

where

(21.31)

and

(21.32)

Here $h(t) > 0$ is a function decreasing fast enough at infinity, introduced for convergence. We choose

(21.33)

First we estimate $K_d(z)$. To this end observe that (21.31) is the spectral decomposition of the automorphic kernel

$$K_d(z) = \sum_{\gamma\in\Gamma_0(d)} k(u(z,\gamma z))$$

where $k(u)$ is the Harish-Chandra/Selberg transform of $h(t)$ (see Theorem 15.7). Moreover, notice that $k(u)$ is non-negative because the Fourier transform of $h(t)$ is positive and decreasing on $\mathbb{R}^+$. These observations show that $K_d(z)$ will only increase if one replaces $\Gamma_0(d)$ by the modular group $\Gamma = \Gamma_0(1)$. Having done this the dependence on $d$ is gone and the enlarged kernel is $\Gamma$-invariant, so we get $K_d(\tau z) \le K_1(\tau z) = K_1(z)$ for any $\tau \in \Gamma$ and $z \in \mathbb{H}$. Using crude estimates for cusp forms and Eisenstein series for the modular group we derive

(21.34)

for any $\tau \in \Gamma$ and $z \in \mathbb{H}$, where the implied constant is absolute. Actually for our application we could stop at the inequality $K_d(\tau z) \le K_1(z)$ because $z$ runs over a fixed set of $h(D)$ points and we allow the implied constant to depend on $D$.

Now it remains to estimate $R_d(h)$. By (16.16) we have $\langle P_h, u_j\rangle$

= (27rh)~PJ(h)F(tJ)

where PJ (h) is the Fourier coefficient of Uj ( z) and F(t)

=

lX) F(y)Kit(y)y-~dy.

First we explain how to estimate F(t). Since F(y) is supported on the dyadic segment y- 1 ,;; y ,;; 2Y- 1 with Y ) 2, it is useful to apply the power series expansion for Kit (y) getting a rapidly convergent series of Mellin transform of F(y) at the points ~ ±it+ 2£, for = 0, 1, 2, .... Then integrating the Mellin transform by parts four times we gain the factor h{t) (see (21.33)). Next we estimate the resulting terms by Stirling's formula (5.112), and sum them up to deduce that F(t) on the spectrum satisfies

e

F(t)

«

h(t)(cosh 1rt)-~ Y~ (Yit. + y-it + 3log Y).

Actually this result is a synthesis of three different bounds derived along the above lines separately in the ranges 0
where H(t) =

«

hY H(tj )IPj(hW,

~(Y2it +

y-2it +9log2Y). cosh( 1rt) A similar estimate is deduced for the inner product of the Poincare series Ph and the Eisenstein series E •. From these estimates we get Rd(h)

«

hY(LH(tj)IPJ(hW J

+ L 4~ a

1.

2

H(t)ira(h,t)l dt),

where $\tau_{\mathfrak{a}}(h,t)$ is the Fourier coefficient of $E_{\mathfrak{a}}(z, \tfrac12+it)$ (see (16.22)). Then applying (16.56) and (16.58) we arrive at

$$R_d(h) \ll hY\bigl\{1 + d^{-1}(d,h)^{1/2}(hY)^{1/2}\bigr\}\tau^2(dh)\log^4 Y. \tag{21.35}$$

Finally inserting (21.34) and (21.35) into (21.30) we finish the proof of Lemma 21.11.

CHAPTER 22

IMAGINARY QUADRATIC FIELDS

22.1. Binary quadratic forms.

We are primarily interested in the imaginary quadratic field $K = \mathbb{Q}(\sqrt{D})$, but to put our treatment into historical perspective we begin by reviewing the theory of binary quadratic forms

$$\varphi(X,Y) = aX^2 + bXY + cY^2 \tag{22.1}$$

of discriminant

$$D = b^2 - 4ac < 0. \tag{22.2}$$

We do not assume that $D$ is fundamental, however, we only consider primitive forms, i.e.,

$$(a,b,c) = 1. \tag{22.3}$$

This is not a serious restriction because any form is a multiple of a primitive form. Changing the sign of all coefficients (if necessary) we can assume that

$$a > 0 \quad\text{and}\quad c > 0 \tag{22.4}$$

so

$$\varphi(X,Y) = a(X + z_aY)(X + \bar{z}_aY) = c(z_cX + Y)(\bar{z}_cX + Y) \tag{22.5}$$

with

$$z_a = \frac{b+\sqrt{D}}{2a} \quad\text{and}\quad z_c = \frac{b+\sqrt{D}}{2c}. \tag{22.6}$$

Actually the roots $z_a$, $z_c$ depend on $b$ but we do not display it. Throughout we choose the root so that $z_a$ and $z_c$ are in the upper half-plane $\mathbb{H}$. Note that $z_a\bar{z}_c = 1$. The modular group $\Gamma = SL_2(\mathbb{Z})$ acts on the primitive forms of a given discriminant $D$ by unimodular transformations

$$\varphi^{\sigma}(X,Y) = \varphi(\alpha X + \gamma Y, \beta X + \delta Y) = a^{\sigma}X^2 + b^{\sigma}XY + c^{\sigma}Y^2, \quad\text{if } \sigma = \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix} \in \Gamma \tag{22.7}$$

say, with

(22.8)

This action is compatible with the action by linear fractional transformations on $\mathbb{H}$, and the roots are preserved. Two forms
(22.9)

Obviously, every equivalence class is represented by a unique form
(22.10)

p- =

{z

E lHI:

(22.11)

p+ =

{z

E lHI:

1

lzl > 1, -
2

such a form $\varphi$ is called reduced; it is characterized in terms of coefficients by

$$-a < b \le a \le c \quad\text{with } b \ge 0 \text{ if } a = c. \tag{22.12}$$

Since $\mathrm{Im}(z_a) = \sqrt{|D|}/2a \ge \sqrt{3}/2$, the first coefficient of a reduced form satisfies

$$a \le \sqrt{|D|/3}. \tag{22.13}$$
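The conditions (22.12) together with the bound (22.13) turn the enumeration of reduced forms into a finite search. The following sketch lists the reduced primitive forms $(a,b,c)$ of a given negative discriminant; the function name is an illustrative choice.

```python
from math import gcd

def reduced_forms(D):
    """Reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    forms = []
    a = 1
    while 3 * a * a <= -D:                  # (22.13): a <= sqrt(|D| / 3)
        for b in range(-a + 1, a + 1):      # -a < b <= a
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (a == c and b < 0): # a <= c, and b >= 0 when a = c
                continue
            if gcd(gcd(a, b), c) == 1:      # keep primitive forms only
                forms.append((a, b, c))
        a += 1
    return forms
```

For $D = -23$ this returns the three forms $(1,1,6)$, $(2,-1,3)$, $(2,1,3)$, and `len(reduced_forms(D))` is the number of reduced forms for any admissible $D$.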

Hence it follows that the number of reduced forms is finite. This number $h = h(D)$ is called the class number of discriminant $D$ because it equals the number of distinct equivalence classes. The transformations $\tau \in \Gamma$ which fix a form $\varphi$ are called automorphs of $\varphi$. Their number is

$$w = \begin{cases} 6 & \text{if } D = -3,\\ 4 & \text{if } D = -4,\\ 2 & \text{if } D < -4. \end{cases} \tag{22.14}$$

Equivalent forms have conjugate groups of automorphs, namely Aut(
+ Y2 ,

if D < -4.

Note that for all integers $k$ the forms $\varphi(X + kY, Y)$ are equivalent; such forms are said to be parallel. The parallel forms have the same first coefficient $a$ while the middle coefficients are in the same residue class modulo $2a$.

In "Disquisitiones Arithmeticae" Gauss introduced composition of forms of a given discriminant. His construction was subsequently simplified by Dirichlet but only for pairs of forms $\varphi_1(X,Y) = a_1X^2 + b_1XY + c_1Y^2$ and $\varphi_2(X,Y) = a_2X^2 + b_2XY + c_2Y^2$ satisfying

$$\Bigl(a_1, \frac{b_1+b_2}{2}, a_2\Bigr) = 1. \tag{22.15}$$

Note that $b_1$, $b_2$ have the same parity (namely $b_1 \equiv b_2 \equiv D \pmod 2$) so $\frac12(b_1+b_2)$ is an integer. If the Dirichlet condition (22.15) holds, then the forms $\varphi_1$, $\varphi_2$ are said to be united. Clearly, this relationship extends to the classes of parallel forms. For the united forms there exists a unique $b \pmod{2a_1a_2}$ such that

$$b \equiv b_1 \ (\mathrm{mod}\ 2a_1), \quad b \equiv b_2 \ (\mathrm{mod}\ 2a_2), \quad b^2 \equiv D \ (\mathrm{mod}\ 4a_1a_2). \tag{22.16}$$

Putting $b^2 - D = 4a_1a_2c$ we derive the form

$$\varphi(X,Y) = a_1a_2X^2 + bXY + cY^2 \tag{22.17}$$

which is primitive of discriminant $D$; it satisfies the identity

$$(a_1x_1^2 + bx_1y_1 + a_2cy_1^2)(a_2x_2^2 + bx_2y_2 + a_1cy_2^2) = \varphi(x,y) \tag{22.18}$$

with $x = x_1x_2 - cy_1y_2$ and $y = a_1x_1y_2 + by_1y_2 + a_2x_2y_1$, where the two factors on the left are the forms parallel to $\varphi_1$, $\varphi_2$ with middle coefficient $b$. This is the Dirichlet formula for composition of united classes of parallel forms.

Now we use the Dirichlet formula to define composition of any two equivalence classes. We say that a positive integer $m$ is properly represented by $\varphi$ if

$$m = \varphi(\alpha,\gamma) \quad\text{with } (\alpha,\gamma) = 1. \tag{22.19}$$

Since $\varphi$ is positive definite the number of proper representations, denoted by $r^*_\varphi(m)$, is finite. Equivalent forms represent properly the same numbers; these are exactly the numbers which appear as the first coefficients of forms in the equivalence class. Every primitive form represents properly a number $m$ which is co-prime with any fixed positive integer. Therefore given two classes $A_1$, $A_2$ we can choose representatives $\varphi_1$, $\varphi_2$ with $(a_1,a_2) = 1$; such forms are united. Then (22.17) determines the third class $A = A_1A_2 = [\varphi]$. One can show that $A$ does not depend on the chosen representatives $\varphi_1$, $\varphi_2$, so the Dirichlet composition induces a well-defined multiplication of classes. The law of composition is commutative and also associative (the latter property is not obvious). This makes the set $\mathcal{H}$ of all distinct equivalence classes a finite abelian group of order $h$. The identity element of $\mathcal{H}$, which we denote by 1, is the class which contains the principal form

$$\begin{cases} \varphi = X^2 - \dfrac{D}{4}Y^2 & \text{if } D \equiv 0 \pmod 4,\\[2mm] \varphi = X^2 + XY + \dfrac{1-D}{4}Y^2 & \text{if } D \equiv 1 \pmod 4. \end{cases} \tag{22.20}$$
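Dirichlet's composition of united forms is simple enough to carry out by machine. The sketch below assumes the standard form of the congruences defining the middle coefficient, namely $b \equiv b_1 \pmod{2a_1}$, $b \equiv b_2 \pmod{2a_2}$, $b^2 \equiv D \pmod{4a_1a_2}$, and finds $b$ by a brute-force search; `compose` and `check_identity` are hypothetical names. The second function tests the multiplicative identity for the parallel forms sharing the middle coefficient $b$.

```python
from math import gcd

def compose(f1, f2):
    """Dirichlet composition of two united forms (a, b, c) of equal discriminant."""
    a1, b1, c1 = f1
    a2, b2, c2 = f2
    D = b1 * b1 - 4 * a1 * c1
    assert D == b2 * b2 - 4 * a2 * c2, "discriminants differ"
    assert gcd(gcd(a1, a2), (b1 + b2) // 2) == 1, "forms are not united"
    for b in range(2 * a1 * a2):            # b is unique modulo 2*a1*a2
        if ((b - b1) % (2 * a1) == 0 and (b - b2) % (2 * a2) == 0
                and (b * b - D) % (4 * a1 * a2) == 0):
            return (a1 * a2, b, (b * b - D) // (4 * a1 * a2))
    raise ValueError("no admissible middle coefficient found")

def check_identity(f1, f2, x1, y1, x2, y2):
    """Verify the product identity for the parallel forms with middle coefficient b."""
    a1, a2 = f1[0], f2[0]
    a, b, c = compose(f1, f2)
    left = ((a1 * x1 * x1 + b * x1 * y1 + a2 * c * y1 * y1)
            * (a2 * x2 * x2 + b * x2 * y2 + a1 * c * y2 * y2))
    x = x1 * x2 - c * y1 * y2
    y = a1 * x1 * y2 + b * y1 * y2 + a2 * x2 * y1
    return left == a * x * x + b * x * y + c * y * y
```

For discriminant $-23$, `compose((2, 1, 3), (3, 1, 2))` gives $(6, 1, 1)$, which is equivalent to the principal form $X^2 + XY + 6Y^2$; this matches the fact that $(3,1,2)$ lies in the class inverse to that of $(2,1,3)$.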

The inverse class of $\varphi(X,Y) = aX^2 + bXY + cY^2$ is the class which contains $\bar\varphi(X,Y) = aX^2 - bXY + cY^2$ (this form is equivalent to $cX^2 + bXY + aY^2$, obtained by interchanging the first and the last coefficients). Therefore the sets of numbers properly represented by a class $A$ and its inverse $A^{-1}$ coincide. The question of which numbers are represented properly by a given form has no simple answer. However, there is a simple, sufficient and necessary condition for a positive integer $m$ to be properly represented by some primitive form of discriminant $D$, namely it is the solvability of the congruence

$$b^2 \equiv D \pmod{4m}. \tag{22.21}$$
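The criterion (22.21) can be compared with a direct search for proper representations. The sketch below does this for fundamental discriminants (where every form of discriminant $D$ is automatically primitive); it bounds the search using $4a\varphi(x,y) = (2ax+by)^2 + |D|y^2$ and the analogous identity with $c$. All function names are illustrative.

```python
from math import gcd, isqrt

def reduced_forms(D):
    """Reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0."""
    forms, a = [], 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a) == 0:
                c = (b * b - D) // (4 * a)
                if c >= a and not (a == c and b < 0) and gcd(gcd(a, b), c) == 1:
                    forms.append((a, b, c))
        a += 1
    return forms

def properly_represented(m, D):
    """Is m = phi(x, y) with gcd(x, y) = 1 for some reduced form phi?"""
    for a, b, c in reduced_forms(D):
        y_max = isqrt(4 * a * m // -D)      # from 4a*phi >= |D| * y^2
        x_max = isqrt(4 * c * m // -D)      # from 4c*phi >= |D| * x^2
        for y in range(-y_max, y_max + 1):
            for x in range(-x_max, x_max + 1):
                if gcd(x, y) == 1 and a * x * x + b * x * y + c * y * y == m:
                    return True
    return False

def congruence_solvable(m, D):
    """Is b^2 = D (mod 4m) solvable? It suffices to test b modulo 2m."""
    return any((b * b - D) % (4 * m) == 0 for b in range(2 * m))
```

For $D = -4$ the two tests agree that 5 and 25 are properly represented by $X^2 + Y^2$ while 4 and 9 are not.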

Let $R_D(m)$ denote the number of all representations of $m$ by a complete system of inequivalent forms of discriminant $D$, thus

$$R_D(m) = \sum_{\varphi}\sum_{d^2\mid m} r^*_\varphi(m/d^2)$$

where $\varphi$ runs over inequivalent forms. If $(m,D) = 1$ we have

$$R_D(m) = w\sum_{d\mid m}\chi_D(d) \tag{22.22}$$

where $\chi_D$ is the Kronecker symbol. We shall see that for special discriminants (idoneal numbers) the above formula yields the number of representations for an individual form (because exactly one class of forms represents $m$). This "convenient" situation is explained in the genus theory due to Gauss, which we are going to present in more modern terms.

First we classify the binary quadratic forms of a given discriminant according to rational unimodular transformations rather than the integral ones. Two forms $\varphi$, $\psi$ are said to be in the same genus if $\psi(X,Y) = \varphi(\alpha X + \gamma Y, \beta X + \delta Y)$ for some

$$\sigma = \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix} \in SL_2(\mathbb{Q}).$$

In fact it suffices to apply $\sigma$ with all entries having the denominator $8D$. Obviously, equivalent forms are in the same genus but not conversely. Indeed, the inverse class $A^{-1}$ is in the genus of $A$, though $A^{-1}$ does not always coincide with $A$. Every genus has the same number of classes. The genus which contains the principal class is called the principal genus; we denote it by $\mathcal{G}$ and put

$$h_1 = |\mathcal{G}|. \tag{22.23}$$

A beautiful theorem of Gauss asserts that $\mathcal{G}$ consists of squares of classes,

$$\mathcal{G} = \{A^2 : A \in \mathcal{H}\}. \tag{22.24}$$

Thus $\mathcal{G}$ is a subgroup of $\mathcal{H}$. The factor group $\mathcal{F} = \mathcal{H}/\mathcal{G}$ is called the genus group.

We say that a form $\varphi$ is ambiguous if it is equivalent to $\bar\varphi$; in other words, the class $A = [\varphi]$ has exponent two in the class group $\mathcal{H}$. Such classes form the subgroup

$$\mathcal{E} = \{A \in \mathcal{H} : A^2 = 1\}. \tag{22.25}$$

Thus $\mathcal{E}$ is isomorphic to $\mathcal{F}$ because it is the kernel of the homomorphism $A \to A^2$. Putting

$$h_0 = |\mathcal{E}| \tag{22.26}$$

we have $h = h_0h_1$. The group of ambiguous classes $\mathcal{E}$ is the easier part of $\mathcal{H}$. An ambiguous form $\varphi$ which is reduced is characterized by having its root $z_a$ on the boundary of $F^+$,

$$z_a = \frac{b+\sqrt{D}}{2a} \in \partial F^+. \tag{22.27}$$

Hence a reduced form $\varphi(X,Y) = aX^2 + bXY + cY^2$ is ambiguous if and only if $a = b$, $a = c$ or $b = 0$ in which cases the discriminant factors into $D = a(a-4c)$, $D = (b-2a)(b+2a)$ or $D = -4ac$, respectively. Using these characterizations one can compute $h_0$; we obtain

$$h_0 = 2^{r+s-1} \tag{22.28}$$

where $r$ is the number of distinct, odd prime divisors of $D$ and $s = 0, 1, 2$. Precisely, if $D \equiv 1 \pmod 4$, then $s = 0$, and if $D = -4N$, then

$$\begin{cases} s = 0 & \text{if } N \equiv 3, 7 \pmod 8,\\ s = 1 & \text{if } N \equiv 1, 2, 4, 5, 6 \pmod 8,\\ s = 2 & \text{if } N \equiv 0 \pmod 8. \end{cases} \tag{22.29}$$

If a positive integer $m$ prime to $D$ is represented properly by a form of discriminant $D$ (i.e., (22.21) is solvable), then it is represented by forms only of one genus, actually the residue class $m \pmod D$ alone determines this genus. In this connection Gauss assigned to a discriminant $D$ a system of $r + s$ genus characters. These are the Legendre symbols $\chi_p$ for every $p \mid D$, $p > 2$, and if $D = -4N$, we have extra $s$ characters modulo 8 listed below

$$\begin{cases} \text{none} & \text{if } N \equiv 3, 7 \pmod 8,\\ \chi_4 & \text{if } N \equiv 1, 4, 5 \pmod 8,\\ \chi_8 & \text{if } N \equiv 6 \pmod 8,\\ \chi_4\chi_8 & \text{if } N \equiv 2 \pmod 8,\\ \chi_4 \text{ and } \chi_8 & \text{if } N \equiv 0 \pmod 8. \end{cases}$$

Let $S$ denote the system of genus characters. We have

$$\prod_{\chi\in S}\chi = \chi_D. \tag{22.30}$$

Hence if $m$ is represented by a form of discriminant $D$ and $(m,D) = 1$, then by (22.30) and (22.21) we obtain $\prod_{\chi\in S}\chi(m) = \chi_D(m) = 1$. Two such numbers $m$, $n$ are represented by forms of the same genus if and only if $\chi(m) = \chi(n)$ for all $\chi \in S$. Therefore a genus of forms is characterized by a collection of signs $\varepsilon_\chi = \pm 1$ such that

$$\prod_{\chi\in S}\varepsilon_\chi = 1, \tag{22.31}$$

and we have exactly $2^{|S|-1}$ possibilities.

Now we return to the formula (22.22) which, in view of the genus theory, gives the number of representations of $m$ by forms representing the genus whose invariants are

$$\varepsilon_\chi = \chi(m) \quad\text{for all } \chi \in S. \tag{22.32}$$

If there is only one class in each genus, then $R_D(m)$ reduces to the number of representations by a single form. Hence a question: which discriminants $D$ satisfy $h_1 = |\mathcal{G}| = 1$, or equivalently

$$A^2 = 1 \quad\text{for every } A \in \mathcal{H}. \tag{22.33}$$

After Euler we call such a discriminant "numerus idoneus" (in English it is called "idoneal number", or "convenient number"; in French "nombre convenable"). As a matter of fact Euler considered discriminants of type $D = -4N$ and called $N$ a numerus idoneus rather than $D$. Gauss gave a list of 65 such numbers:

$$\begin{cases} N = 1, 2, 3, 4, 7 & \text{with } h(D) = 1,\\ N = 5, 6, 8, 9, 10 \text{ (plus ten more)} & \text{with } h(D) = 2,\\ N = 21, 24, 30, 33 \text{ (plus twenty more)} & \text{with } h(D) = 4,\\ N = 105, 120 \text{ (plus fifteen more)} & \text{with } h(D) = 8,\\ N = 840, 1320, 1365, 1848 & \text{with } h(D) = 16. \end{cases}$$

The last number in Gauss' list $N = 1848$ is the largest known numerus idoneus.
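Gauss' list can be regenerated from the characterization of ambiguous reduced forms. The sketch below treats $N$ as idoneal when every reduced primitive form of discriminant $-4N$ has $b = 0$, $a = b$ or $a = c$, which is the condition that every class has order at most two; the function names are illustrative and the search range is an arbitrary cut-off.

```python
from math import gcd

def reduced_forms(D):
    """Reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0."""
    forms, a = [], 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a) == 0:
                c = (b * b - D) // (4 * a)
                if c >= a and not (a == c and b < 0) and gcd(gcd(a, b), c) == 1:
                    forms.append((a, b, c))
        a += 1
    return forms

def is_idoneal(N):
    """True if every reduced form of discriminant -4N is ambiguous."""
    return all(b == 0 or a == b or a == c
               for a, b, c in reduced_forms(-4 * N))

def idoneal_numbers(limit):
    return [N for N in range(1, limit + 1) if is_idoneal(N)]
```

Running `idoneal_numbers(2000)` reproduces a list of 65 numbers ending in 1848, and `len(reduced_forms(-4 * N))` recovers the class numbers in the table above.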

W. E. Briggs and S. Chowla [BC] showed that there is at most one such number beyond $10^{65}$. Idoneal numbers $N$ have the property that the form $x^2 + Ny^2$ represents primes essentially in no more than one way, whereas a composite number has several representations if any (for details see [Cox]). This property offers an algorithm for primality testing which was an inspiration for Euler to study the idoneal numbers in the first place.

22.2. The class group.

From now on we assume that $D$ is a negative fundamental discriminant. A fundamental discriminant is characterized by the property that every form

(22.34) $N \equiv 1, 2, 5, 6 \pmod 8$.

The associated Kronecker symbol $\chi_D(m) = \bigl(\frac{D}{m}\bigr)$ (see (3.43)) is a real, primitive character of conductor $-D$. It turns out that the fundamental discriminants are exactly the discriminants of quadratic fields. Throughout we use the notation from the theory of the imaginary quadratic field $K = \mathbb{Q}(\sqrt{D})$ as described in Section 3.8. The quadratic forms $\varphi(X,Y) = aX^2 + bXY + cY^2$ (with $a > 0$, $c > 0$, $b^2 - 4ac = D < 0$) correspond in a one-to-one way to the primitive ideals

$$\mathfrak{a} = \Bigl[a, \frac{b+\sqrt{D}}{2}\Bigr] = a[1, z_a] \subset \mathcal{O}. \tag{22.35}$$

Hence the terminology and results for quadratic forms which were used in the previous section translate naturally for ideals. In this section we go through this transition. Recall the following:

$I$ - the group of non-zero fractional ideals,
$P$ - the subgroup of principal ideals,
$H = I/P$ - the class group,
$h = |H|$ - the class number,
$\mathfrak{d} = (\sqrt{D})$ - the different.

Every class $A \in H$ has a unique, primitive ideal (22.35) with $z_a$ in the fundamental domain $F = F^- \cup F^+$ of the modular group $\Gamma = SL_2(\mathbb{Z})$; such an ideal is said to be reduced. Thus the number of reduced ideals is $h$. The subgroup

$$\mathcal{G} = \{A^2 : A \in H\}$$

is called the principal genus and the factor group $\mathcal{F} = H/\mathcal{G}$ is called the genus group. Hence two non-zero ideals $\mathfrak{a}$, $\mathfrak{b}$ belong to the same genus if and only if $\mathfrak{a} = \mathfrak{b}\mathfrak{c}^2$ for some $\mathfrak{c} \in I$. One can also show that $\mathfrak{a}$, $\mathfrak{b}$ are in the same genus if and only if

$$N\mathfrak{a} = N\mathfrak{b}\,N\gamma \quad\text{for some } \gamma \in K^*. \tag{22.36}$$
A class $A \in H$ is said to be ambiguous if $A = A^{-1}$, i.e., $A^2 = 1$. The group of ambiguous classes

$$\mathcal{E} = \{A \in H : A^2 = 1\},$$

as the kernel of the homomorphism $A \to A^2$, is isomorphic to the genus group $\mathcal{F}$. The correspondence between quadratic forms and primitive ideals casts ambiguous forms into ambiguous ideals ($\mathfrak{a}$ is ambiguous if $\mathfrak{a}^2$ is principal); it proves that every ambiguous class $A$ has its reduced primitive ideal $\mathfrak{a}$ with $z_a$ on the boundary of $F^+$. Recall that a reduced ambiguous form $\varphi(X,Y) = aX^2 + bXY + cY^2$ has $a = b$, $a = c$ or $b = 0$, hence it yields a factorization of the discriminant into

$$-D = d_1d_2 = a(4c-a), \quad (2a-b)(2a+b), \quad 4ac,$$

respectively, with $d_2 > d_1 > 0$ and $(d_1,d_2) = 1$ (the coprimality of these factors results from $D$ being fundamental). Conversely, a factorization $-D = d_1d_2$ with $d_2 > d_1 > 0$, $(d_1,d_2) = 1$ yields a reduced ambiguous form. For example, if $D \equiv 1 \pmod 4$ we obtain

$$\varphi(X,Y) = d_1X^2 + d_1XY + \frac{d_1+d_2}{4}Y^2$$

or

$$\varphi(X,Y) = \frac{d_1+d_2}{4}X^2 + \frac{d_2-d_1}{2}XY + \frac{d_1+d_2}{4}Y^2.$$

Correspondingly, a reduced primitive ambiguous ideal yields a factorization of the different $\mathfrak{d} = (\sqrt{D})$ into $\mathfrak{d} = \mathfrak{d}_1\mathfrak{d}_2$ with $(\mathfrak{d}_1,\mathfrak{d}_2) = 1$; note that $\mathcal{C}\ell(\mathfrak{d}_1) = \mathcal{C}\ell(\mathfrak{d}_2)$. Distinct such factorizations (up to the order of factors $\mathfrak{d}_1$, $\mathfrak{d}_2$) yield distinct ambiguous classes, therefore

$$h_0 = |\mathcal{E}| = 2^{t-1} \tag{22.37}$$

where $t = \omega(|D|)$ is the number of distinct prime divisors of $D$ (note that (22.37) agrees with (22.28) in case of fundamental discriminants).

The genus of an ideal can be determined by values of real characters of the class group. These characters are assigned to every factorization $D = D_1D_2$ into two fundamental discriminants $D_1$, $D_2$. Note that $D_1$, $D_2$ have different signs and are coprime, so we have exactly $2^{t-1}$ distinct factorizations up to the order. We define $\chi_{D_1,D_2}$ first on prime ideals by

$$\chi_{D_1,D_2}(\mathfrak{p}) = \begin{cases} \chi_{D_1}(N\mathfrak{p}) & \text{if } \mathfrak{p} \nmid D_1,\\ \chi_{D_2}(N\mathfrak{p}) & \text{if } \mathfrak{p} \nmid D_2 \end{cases} \tag{22.38}$$

(this is well defined because $\chi_D(N\mathfrak{a}) = 1$ if $(\mathfrak{a},D) = 1$), and we extend $\chi_{D_1,D_2}$ to all non-zero fractional ideals by multiplicativity. We obtain a character $\chi_{D_1,D_2} : I \to \{\pm 1\}$ such that $\chi_{D_1,D_2}(\mathfrak{a}) = 1$ for all $\mathfrak{a} \in P$. Indeed, for $\mathfrak{a} = (\alpha)$ with $\alpha = \frac12(m + n\sqrt{D})$, $(\alpha,D) = 1$ we have $\chi_{D_1,D_2}(\mathfrak{a}) = \chi_{D_1}(N\mathfrak{a}) = \chi_{D_1}\bigl(\frac14(m^2 - n^2D)\bigr) = \chi_{D_1}\bigl(\frac{m^2}{4}\bigr) = 1$ (similarly one can check the other cases). Therefore $\chi_{D_1,D_2} \in \hat{H}$; these are exactly all real characters of the class group, called the genus characters. Two ideals $\mathfrak{a}, \mathfrak{b} \in I$ are in the same genus if and only if

$$\chi(\mathfrak{a}) = \chi(\mathfrak{b}) \quad\text{for all genus characters.}$$

The genus theory of Gauss with its explicit characterization of ambiguous classes proved to be very useful for primality testing and factorization techniques. A method for factoring - D which exploits the class group structure, as described by D. Shanks [Sha], requires only O(IDI~) operations. If-Dis prime, then the whole class group is the principal genus H = Q. In this case H tends to be cyclic more often than not. After Gauss, a discriminant D (not necessarily negative prime) is said to be regular if the principal genus is cyclic (this is not the same as the regular primes occurring in the theory of cyclotomic fields). We still do not know if there are infinitely many regular discriminants. However, the numerical evidence supports the existence of a large proportion of these. It has been conjectured in [Ge] that the proportion of regular to all discriminants (both negative) is

$$\Big(\zeta(6)\prod_{n=4}^{\infty}\zeta(n)\Big)^{-1} = 0.8469\ldots.$$
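The numerical value of the conjectured proportion can be checked directly. The following is a minimal sketch, not part of the original text: the helper names `zeta` and `gerth_constant` are hypothetical, and the truncation of the product at `max_n = 40` is an assumption (the omitted factors differ from 1 by less than $2^{-40}$).

```python
def zeta(n, terms=2000):
    """Approximate zeta(n) for an integer n >= 2.

    Partial sum up to `terms`, plus the Euler-Maclaurin tail estimate
    integral_{K}^{oo} x^{-n} dx - K^{-n}/2 with K = terms.
    """
    partial = sum(float(k) ** (-n) for k in range(1, terms + 1))
    tail = float(terms) ** (1 - n) / (n - 1) - 0.5 * float(terms) ** (-n)
    return partial + tail


def gerth_constant(max_n=40):
    """Evaluate (zeta(6) * prod_{n=4}^{max_n} zeta(n))^{-1}."""
    product = zeta(6)
    for n in range(4, max_n + 1):
        product *= zeta(n)
    return 1.0 / product


if __name__ == "__main__":
    print(round(gerth_constant(), 4))
```

The printed value should agree with the constant quoted above to the displayed precision.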

Gauss genus theory gives us a full description of the 2-Sylow subgroups of the class group. There are some interesting, though incomplete, developments for other $p$-Sylow subgroups of $H$, in particular for $p = 3$ by Davenport and Heilbronn [DH]. According to the Cohen-Lenstra heuristics the probability that an imaginary quadratic field has in its ideal class group an element of order $p$ is

$$1 - \prod_{n=1}^{\infty}(1 - p^{-n}).$$
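To get a feeling for the size of this heuristic probability, one can evaluate the infinite product numerically. This is a small illustrative sketch; the function name `cohen_lenstra_probability` and the cut-off `1e-18` are assumptions made here, not taken from the text.

```python
def cohen_lenstra_probability(p):
    """Evaluate 1 - prod_{n>=1} (1 - p^{-n}) for a prime p.

    The product is truncated once p^{-n} drops below 1e-18, where the
    remaining factors no longer change a double-precision result.
    """
    product = 1.0
    term = 1.0 / p
    while term > 1e-18:
        product *= 1.0 - term
        term /= p
    return 1.0 - product


if __name__ == "__main__":
    for p in (3, 5, 7):
        print(p, round(cohen_lenstra_probability(p), 5))
```

For $p = 3$ the value is roughly $0.44$, so under the heuristics a substantial share of imaginary quadratic fields should have class number divisible by 3.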


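22.3. The class group L-functions.

Let $\chi : H \to \mathbb C^*$ be a character of the class group. Define

(22.39)
$$L_K(s,\chi) = \sum_{\mathfrak a}\chi(\mathfrak a)(N\mathfrak a)^{-s}$$

for $\mathrm{Re}(s) > 1$ where the series converges absolutely. Notice that $L_K(s,\bar\chi) = L_K(s,\chi)$. For any primitive ideal $\mathfrak a = [a, \frac{b+\sqrt D}{2}]$ we have

$$\sum_{\chi\in\widehat H}\bar\chi(\mathfrak a)L_K(s,\chi) = h\sum_{\mathfrak b\sim\mathfrak a}(N\mathfrak b)^{-s} = \frac{h}{w}\,a^{-s}\sum_{0\neq\alpha\in\mathfrak a^{-1}}|\alpha|^{-2s}$$

by writing $\mathfrak b = (\alpha)\mathfrak a$. Since $\mathfrak a^{-1} = \mathbb Z + \bar z_{\mathfrak a}\mathbb Z$, we derive

$$\sum_{0\neq\alpha\in\mathfrak a^{-1}}|\alpha|^{-2s} = \sum_{(m,n)\neq(0,0)}|m+nz_{\mathfrak a}|^{-2s} = \zeta(2s)\sum_{(m,n)=1}|m+nz_{\mathfrak a}|^{-2s} = 2\zeta(2s)\Big(\frac{\sqrt{|D|}}{2a}\Big)^{-s}E(z_{\mathfrak a},s)$$

where

$$E(z,s) = \sum_{\gamma\in\Gamma_\infty\backslash\Gamma}(\mathrm{Im}\,\gamma z)^s = \frac12\sum_{(m,n)=1}\frac{y^s}{|m+nz|^{2s}}$$

is the Eisenstein series for the modular group. This gives us

(22.40)
$$\sum_{\chi\in\widehat H}\bar\chi(\mathfrak a)L_K(s,\chi) = \frac{2h}{w}\Big(\frac{\sqrt{|D|}}{2}\Big)^{-s}\zeta(2s)E(z_{\mathfrak a},s).$$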

We put

(22.41)
$$\theta(s) = \pi^{-s}\Gamma(s)\zeta(2s),$$

(22.42)
$$E^*(z,s) = \theta(s)E(z,s),$$

(22.43)
$$\Lambda_K(s,\chi) = (2\pi)^{-s}\Gamma(s)|D|^{s/2}L_K(s,\chi).$$

In this notation (22.40) becomes

(22.44)
$$\sum_{\chi\in\widehat H}\bar\chi(\mathfrak a)\Lambda_K(s,\chi) = \frac{2h}{w}E^*(z_{\mathfrak a},s).$$

By Fourier inversion (the orthogonality of characters) we obtain

(22.45)
$$\Lambda_K(s,\chi) = \frac{2}{w}\sum_{\mathfrak a}\chi(\mathfrak a)E^*(z_{\mathfrak a},s)$$

where $\mathfrak a$ runs over a set of primitive, inequivalent ideals (a good choice being the set of reduced ideals). By virtue of this formula the Hecke L-function inherits analytic properties of the Eisenstein series. It is well known (see (15.13)) that $E^*(z,s)$ has the Fourier expansion

(22.46)
$$E^*(z,s) = \theta(s)y^s + \theta(1-s)y^{1-s} + 4\sqrt y\sum_{n=1}^{\infty}\tau_{s-1/2}(n)K_{s-1/2}(2\pi ny)\cos(2\pi nx)$$

where

(22.47)
$$\tau_\nu(n) = \sum_{ad=n}\Big(\frac ad\Big)^{\nu}.$$

The Fourier expansion yields analytic continuation of $E(z,s)$ to the whole complex $s$-plane, it shows that $E(z,s)$ in $\mathrm{Re}(s) \geq \frac12$ has only a simple pole at $s = 1$ with constant residue $\frac{3}{\pi}$, and it satisfies the functional equation

(22.48)
$$E^*(z,s) = E^*(z,1-s).$$

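Inserting (22.46) into (22.45) we obtain the Fourier expansion

(22.49)
$$\Lambda_K(s,\chi) = \frac2w\sum_{\mathfrak a}\chi(\mathfrak a)\Big\{\theta(s)\Big(\frac{\sqrt{|D|}}{2a}\Big)^{s} + \theta(1-s)\Big(\frac{\sqrt{|D|}}{2a}\Big)^{1-s} + 4\Big(\frac{\sqrt{|D|}}{2a}\Big)^{1/2}\sum_{n=1}^{\infty}\tau_{s-1/2}(n)K_{s-1/2}\Big(\frac{\pi n\sqrt{|D|}}{a}\Big)\cos\Big(\frac{\pi nb}{a}\Big)\Big\}$$

where $\mathfrak a$ runs over primitive representatives of ideal classes. Note that the Fourier series converges rapidly since the Bessel function $K_\nu(y)$ has exponential decay

$$K_\nu(y) = \Big(\frac{\pi}{2y}\Big)^{1/2}e^{-y}\Big(1 + \frac{\theta}{2y}\Big)$$

for $y > 0$, $\nu \in \mathbb C$, where $|\theta| \leq |\nu^2 - \frac14|$ (see (23.451.6) of [GR]).

We deduce by (22.45) or (22.49) that $L_K(s,\chi)$ has analytic continuation to the whole complex $s$-plane, it is entire except for a simple pole at $s = 1$ if $\chi$ is trivial with

(22.50)
$$\operatorname*{res}_{s=1}\zeta_K(s) = \frac{2\pi h}{w\sqrt{|D|}}$$

and it satisfies the functional equation

(22.51)
$$\Lambda_K(s,\chi) = \Lambda_K(1-s,\chi).$$

All the above properties of $L_K(s,\chi)$ can be also deduced from the integral representation (à la Riemann, see (4.77)) due to Hecke

(22.52)
$$\Lambda_K(s,\chi) = \frac{\delta(\chi)h}{ws(s-1)} + \int_1^{\infty}(y^{-s} + y^{s-1})f_\chi\Big(\frac{iy}{\sqrt{|D|}}\Big)\,dy$$

where $\delta(\chi) = 1$ if $\chi$ is trivial, or else $\delta(\chi) = 0$, and

(22.53)
$$f_\chi(z) = \sum_{\mathfrak a}\chi(\mathfrak a)e(zN\mathfrak a)$$

where $\mathfrak a$ runs over all non-zero integral ideals. Integrating termwise we get

(22.54)
$$\Lambda_K(s,\chi) = \frac{\delta(\chi)h}{ws(s-1)} + \sum_{\mathfrak a}\chi(\mathfrak a)\Big(\frac{\sqrt{|D|}}{2\pi a}\Big)^{s}\Gamma\Big(s,\frac{2\pi a}{\sqrt{|D|}}\Big) + \sum_{\mathfrak a}\chi(\mathfrak a)\Big(\frac{\sqrt{|D|}}{2\pi a}\Big)^{1-s}\Gamma\Big(1-s,\frac{2\pi a}{\sqrt{|D|}}\Big)$$

where $a = N\mathfrak a$ and $\Gamma(s,x)$ is the incomplete gamma function

$$\Gamma(s,x) = \int_x^{\infty}e^{-y}y^{s-1}\,dy.$$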

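Hecke derived the formula (22.52) by splitting $L_K(s,\chi)$ into a sum of Epstein zeta functions (apparently introduced first by Dirichlet)

(22.55)
$$Z_{\mathfrak a}(s) = \sum_{m}\sum_{n}\varphi_{\mathfrak a}(m,n)^{-s}, \qquad (m,n)\neq(0,0),$$

where $\varphi_{\mathfrak a}(X,Y) = aX^2 + bXY + cY^2$ is the quadratic form corresponding to the ideal $\mathfrak a$. This in turn is the Mellin transform of the theta function

$$\Theta_{\mathfrak a}(z) = \sum_{m}\sum_{n}e(z\varphi_{\mathfrak a}(m,n)).$$

Both $\Theta_{\mathfrak a}(z)$ and $f_\chi(z)$ are modular forms of weight one for the group $\Gamma_0(|D|)$ and character $\chi_D$. If $\chi$ is not real, then $f_\chi(z)$ is a primitive cusp form with Hecke eigenvalues

(22.56)
$$\lambda_\chi(n) = \sum_{N\mathfrak a=n}\chi(\mathfrak a).$$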

If $\chi$ is real, say $\chi = \chi_{D_1,D_2}$ with $D_1D_2 = D$, then the Hecke L-function factors into Dirichlet L-functions. Precisely, it can be verified by comparing local factors of Euler products and using (22.38) together with the factorization law of ideals in $K = \mathbb Q(\sqrt D)$ that

KRONECKER FACTORIZATION FORMULA.

(22.57)
$$L_K(s,\chi_{D_1,D_2}) = L(s,\chi_{D_1})L(s,\chi_{D_2}).$$

For the trivial character we have the Dedekind zeta function

$$\zeta_K(s) = \sum_{\mathfrak a}(N\mathfrak a)^{-s} = \zeta(s)L(s,\chi_D) = \sum_{n=1}^{\infty}\tau(n,\chi_D)n^{-s}.$$

In this case, (22.45) becomes

(22.58)
$$(2\pi)^{-s}\Gamma(s)|D|^{s/2}\zeta(s)L(s,\chi_D) = \frac2w\sum_{\mathfrak a}E^*(z_{\mathfrak a},s).$$

Comparing the residues at $s = 1$ on both sides we obtain the celebrated

DIRICHLET CLASS NUMBER FORMULA.

(22.59)
$$L(1,\chi_D) = \frac{2\pi h(D)}{w\sqrt{|D|}}.$$
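The Class Number Formula (22.59) is easy to test numerically for small discriminants. The sketch below is only an illustration under stated assumptions: `kronecker`, `class_number`, `l_value_at_one` and `dirichlet_formula` are hypothetical helper names, $h(D)$ is computed by counting reduced primitive forms (a standard method, not one described in this section), and $L(1,\chi_D)$ is approximated by a partial sum over whole periods of $\chi_D$.

```python
from math import gcd, pi, sqrt


def kronecker(D, n):
    """Kronecker symbol (D/n) for an integer D and a positive integer n."""
    if n % 2 == 0 and D % 2 == 0:
        return 0
    sign, twos = 1, 0
    while n % 2 == 0:
        n //= 2
        twos += 1
    if twos % 2 == 1 and D % 8 in (3, 5):
        sign = -sign
    a = D % n
    while a:  # Jacobi symbol (a/n) for odd n
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                sign = -sign
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            sign = -sign
        a %= n
    return sign if n == 1 else 0


def class_number(D):
    """Count reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0."""
    h, b = 0, D % 2
    while 3 * b * b <= -D:
        q = (b * b - D) // 4
        a = max(b, 1)
        while a * a <= q:
            if q % a == 0 and gcd(a, gcd(b, q // a)) == 1:
                # forms with b = 0, a = b or a = c are their own opposites
                h += 1 if (b == 0 or a == b or a * a == q) else 2
            a += 1
        b += 2
    return h


def l_value_at_one(D, periods=2000):
    """Partial sum of L(1, chi_D) over `periods` full periods of chi_D."""
    q = abs(D)
    chi = [0] + [kronecker(D, n) for n in range(1, q)]
    return sum(chi[n % q] / n for n in range(1, q * periods + 1))


def dirichlet_formula(D):
    """Right side of (22.59); w = 6, 4, 2 for D = -3, D = -4, D < -4."""
    w = {-3: 6, -4: 4}.get(D, 2)
    return 2 * pi * class_number(D) / (w * sqrt(abs(D)))


if __name__ == "__main__":
    for D in (-3, -4, -7, -23, -163):
        print(D, l_value_at_one(D), dirichlet_formula(D))
```

Because the partial sum stops at a multiple of the period, its error is of order $1/\text{periods}$, so the two columns should agree to about three decimal places.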

EXERCISE 1. Prove that for $D < -4$,

$$(2 - \chi_D(2))h(D) = \sum_{0<n<|D|/2}\chi_D(n).$$
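Before attempting a proof, the identity of the exercise can be sanity-checked on a range of fundamental discriminants. This is a hedged sketch: the names `kronecker`, `class_number`, `is_fundamental`, `lhs` and `rhs` are hypothetical, and $h(D)$ is again obtained by counting reduced forms rather than by any method from the text.

```python
from math import gcd


def kronecker(D, n):
    """Kronecker symbol (D/n) for an integer D and a positive integer n."""
    if n % 2 == 0 and D % 2 == 0:
        return 0
    sign, twos = 1, 0
    while n % 2 == 0:
        n //= 2
        twos += 1
    if twos % 2 == 1 and D % 8 in (3, 5):
        sign = -sign
    a = D % n
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                sign = -sign
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            sign = -sign
        a %= n
    return sign if n == 1 else 0


def class_number(D):
    """Count reduced primitive forms of discriminant D < 0."""
    h, b = 0, D % 2
    while 3 * b * b <= -D:
        q = (b * b - D) // 4
        a = max(b, 1)
        while a * a <= q:
            if q % a == 0 and gcd(a, gcd(b, q // a)) == 1:
                h += 1 if (b == 0 or a == b or a * a == q) else 2
            a += 1
        b += 2
    return h


def is_squarefree(n):
    n = abs(n)
    return all(n % (d * d) for d in range(2, int(n ** 0.5) + 1))


def is_fundamental(D):
    """D = 1 (mod 4) squarefree, or D = 4m with m = 2, 3 (mod 4) squarefree."""
    if D % 4 == 1:
        return is_squarefree(D)
    return D % 4 == 0 and (D // 4) % 4 in (2, 3) and is_squarefree(D // 4)


def lhs(D):
    return (2 - kronecker(D, 2)) * class_number(D)


def rhs(D):
    # sum of chi_D(n) over 0 < n < |D|/2
    return sum(kronecker(D, n) for n in range(1, (abs(D) + 1) // 2))


if __name__ == "__main__":
    for D in (-7, -8, -15, -23, -163):
        print(D, lhs(D), rhs(D))
```

A numerical match is of course no substitute for the proof, but it guards against misreading the range of summation.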
Another interesting formula comes from (22.45) at $s = \frac12$. By the Fourier expansion (22.46) we see that $E(z,\frac12) \equiv 0$ and

$$E'(z,\tfrac12) = \sqrt y\log y + 4\sqrt y\sum_{n=1}^{\infty}\tau(n)K_0(2\pi ny)\cos(2\pi nx).$$

Hence the central value of $L_K(s,\chi)$ is given by

Inserting the Fourier expansion (22.61) we arrive at (22.60)

Lx(~,x) = ~ L ~{tog~ +4 I>(n)Ko(1fn ~) cos(":b) }. a

Using K 0 (y)

«

1

y- ~ e-Y one derives by trivial estimation that

This is interesting for the trivial class character $\chi_0$. Assuming $L(\frac12,\chi_D) \geq 0$ (which follows from the Riemann Hypothesis for $L(s,\chi_D)$, but may conceivably be established sometime without recourse to the GRH) the left side is $L_K(\frac12,\chi_0) = \zeta(\frac12)L(\frac12,\chi_D) \leq 0$ so we derive

h(D)

»

"L(~)~ log~ a

where $\mathfrak a$ runs over the complete system of primitive, reduced ideals. Note that we dropped the factor 2 in the logarithm (find out why this is permissible!). Recall that $a = N\mathfrak a \leq \sqrt{|D|/3}$ so all terms above are positive. This implies that $h(D)$ cannot be very small, namely

$$h(D) \gg |D|^{1/4}\log|D|,$$

where the implied constant is effectively computable. Moreover, this implies that a positive proportion of the primitive, reduced ideals have the norm $N\mathfrak a \asymp \sqrt{|D|}$. Duke [Du2] has shown unconditionally that the points $z_{\mathfrak a}$ are equidistributed in the fundamental domain $F$ with respect to the hyperbolic measure $d\mu = y^{-2}dx\,dy$. His result is ineffective because it uses Siegel's bound $h(D) \gg |D|^{1/2-\varepsilon}$ (see (5.76)).

The Class Number Formula (22.59) reveals the algebraic nature of the special value $L(1,\chi_D)$. One can derive infinitely many formulas of that kind by comparing coefficients in the Taylor expansion at $s = 1$ on both sides of (22.45). Of particular interest are the constant terms. On the left side we have

(22.61)
$$\Lambda_K(s,\chi) \sim \frac{\sqrt{|D|}}{2\pi}L_K(1,\chi)$$

if $\chi$ is non-trivial, and

(22.62)
$$\Lambda_K(s,\chi) \sim \frac hw\Big(\frac{1}{s-1} + \log\frac{\sqrt{|D|}}{2\pi} + \frac{L'}{L}(1,\chi_D)\Big)$$

if $\chi$ is trivial by the following expansions:

$$\Big(\frac{\sqrt{|D|}}{2\pi}\Big)^{s-1} = 1 + (s-1)\log\frac{\sqrt{|D|}}{2\pi} + \cdots,$$

$$\Gamma(s) = 1 - (s-1)\gamma + \cdots,$$

$$\zeta(s) = \frac{1}{s-1} + \gamma + \cdots,$$

$$L(s,\chi_D) = L(1,\chi_D) + (s-1)L'(1,\chi_D) + \cdots$$

and the Class Number Formula (22.59). On the right side of (22.45) we need an expansion of $E^*(z,s)$ which we shall derive from the Fourier series (22.46). First notice that $2\theta(s)y^s \sim -s^{-1} + \gamma - \log 4\pi y$ as $s \to 0$ by the following expansions:

$$\Big(\frac y\pi\Big)^s = 1 + s\log\frac y\pi + \cdots,$$

$$\Gamma(s) = \frac1s - \gamma + \cdots,$$

$$\zeta(2s) = -\frac12 - s\log 2\pi + \cdots$$

using the values $\zeta(0) = -\frac12$ and $\zeta'(0) = -\frac12\log 2\pi$. Moreover, $2\theta(1-s)y^{1-s} \sim \frac\pi3 y$ by using the value $\zeta(2) = \frac{\pi^2}{6}$. Hence

(22.63)
$$2E^*(z,s) \sim \frac{1}{s-1} + \gamma - \log 4\pi y + \frac\pi3 y + 2D(z)$$

where $D(z)$ is the tail in the Fourier expansion (22.46) at $s = 1$, precisely

$$D(z) = 4\sqrt y\sum_{n=1}^{\infty}\tau_{1/2}(n)K_{1/2}(2\pi ny)\cos(2\pi nx).$$

Since $K_{1/2}(y) = (\pi/2y)^{1/2}e^{-y}$ and $\tau_{1/2}(n) = \sigma_{-1}(n)n^{1/2}$, this becomes

(22.64)
$$D(z) = 2\,\mathrm{Re}\sum_{n=1}^{\infty}\sigma_{-1}(n)e(nz).$$

If we execute the summation as follows

$$D(z) = 2\,\mathrm{Re}\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}m^{-1}e(mnz) = -2\sum_{n=1}^{\infty}\log|1 - e(nz)|$$

we find that

(22.65)
$$D(z) = -\frac\pi6 y - 2\log|\eta(z)|,$$

where $\eta(z)$ is the Dedekind eta function

(22.66)
$$\eta(z) = e\Big(\frac{z}{24}\Big)\prod_{n=1}^{\infty}(1 - e(nz)).$$
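The passage from (22.64) to (22.65) can be confirmed numerically at any point of the upper half-plane. The following sketch is illustrative only; the function names and the truncation at 60 terms are assumptions made here (for $\mathrm{Im}\,z \geq 1/2$ the neglected terms are far below double precision).

```python
import cmath
import math


def e(z):
    """The additive character e(z) = exp(2*pi*i*z)."""
    return cmath.exp(2j * math.pi * z)


def sigma_minus_one(n):
    """sigma_{-1}(n): the sum of 1/d over the divisors d of n."""
    return sum(1.0 / d for d in range(1, n + 1) if n % d == 0)


def d_series(z, terms=60):
    """D(z) computed from the series (22.64)."""
    total = sum(sigma_minus_one(n) * e(n * z) for n in range(1, terms + 1))
    return 2.0 * total.real


def eta(z, terms=60):
    """Dedekind eta function from the product (22.66), truncated."""
    value = e(z / 24)
    for n in range(1, terms + 1):
        value *= 1 - e(n * z)
    return value


def d_eta(z):
    """D(z) computed from (22.65)."""
    return -math.pi * z.imag / 6 - 2.0 * math.log(abs(eta(z)))


if __name__ == "__main__":
    z = complex(0.3, 0.9)
    print(d_series(z), d_eta(z))
```

Both evaluations should coincide up to rounding, which also checks the sign conventions in (22.65).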

The eta function is a modular form of weight $\frac12$ on the group $\Gamma = SL_2(\mathbb Z)$ with a suitable multiplier system involving Dedekind sums (see e.g. [14]). Then $\eta(z)^{24} = \Delta(z)$ is the famous Ramanujan $\Delta$ function, the primitive cusp form of weight 12. Therefore the function

(22.67)
$$F(z) = \Big(\frac{2}{\sqrt{|D|}}\Big)^{1/2}(\mathrm{Im}\,z)^{1/2}|\eta(z)|^2$$

is $\Gamma$-invariant (here the factor $(2/\sqrt{|D|})^{1/2}$ is introduced for normalization so that $F(z_{\mathfrak a}) = a^{-1/2}|\eta(z_{\mathfrak a})|^2$). In this notation (22.65) becomes

(22.68)
$$D(z) = -\frac\pi6 y - \frac12\log\frac{\sqrt{|D|}}{2y} - \log F(z).$$

Inserting (22.68) into (22.63) we obtain

(22.69)
$$2E^*(z,s) \sim \frac{1}{s-1} + \gamma - \log 2\pi\sqrt{|D|} - 2\log F(z).$$

Next inserting (22.69) into (22.45) we obtain

(22.70)
$$\Lambda_K(s,\chi) \sim \delta(\chi)\frac hw\Big(\frac{1}{s-1} + \gamma - \log 2\pi\sqrt{|D|}\Big) - \frac2w\sum_{\mathfrak a}\chi(\mathfrak a)\log F(z_{\mathfrak a}).$$

Now matching (22.70) with (22.61), or (22.62) in case $\chi$ is trivial, we derive an exact equation which is called the

KRONECKER LIMIT FORMULA. If $\chi$ is a non-trivial class group character, then

(22.71)
$$L_K(1,\chi) = \frac{-4\pi}{w\sqrt{|D|}}\sum_{\mathfrak a}\chi(\mathfrak a)\log F(z_{\mathfrak a}).$$

Moreover,

(22.72)
$$-\frac{L'}{L}(1,\chi_D) = \log|D| - \gamma + \frac2h\sum_{\mathfrak a}\log F(z_{\mathfrak a}).$$

Consider (22.71) for a non-trivial genus character $\chi = \chi_{B,C}$ which corresponds to the factorization $D = BC$ into non-trivial fundamental discriminants $B$, $C$. One of these is negative and the other is positive, say $B < 0 < C$. By the Kronecker formula (22.57) we have $L_K(1,\chi_{B,C}) = L(1,\chi_B)L(1,\chi_C)$ and by the Dirichlet Class Number Formula (22.59) we have $L(1,\chi_B) = 2\pi h(B)/w(B)\sqrt{|B|}$. Dirichlet also established the class number formula for positive discriminants (see (2.31)), namely

(22.73)
$$L(1,\chi_C) = 2h(C)|C|^{-1/2}\log\varepsilon(C)$$

where $h(C)$ is the class number of the real quadratic field $\mathbb Q(\sqrt C)$ and $\varepsilon(C)$ is the fundamental unit. Combining these formulas we infer from (22.71) that

(22.74)
$$\varepsilon(C)^{2h(B)h(C)/w(B)} = \prod_{\mathfrak a}F(z_{\mathfrak a})^{-\chi(\mathfrak a)}.$$

Besides their significance for algebraic numbers the Kronecker limit formulas can be used to estimate derivatives of L-functions. For this purpose it is more convenient to operate with $D(z)$ rather than $F(z)$, so we return to (22.68) getting by (22.72) and (22.59)

(22.75)
$$L'(1,\chi_D) = \frac{2\pi}{w}\sum_{\mathfrak a}\Big(\frac{\pi}{6a} + \frac{1}{\sqrt{|D|}}\Big(\gamma + \log\frac{a}{|D|} + 2D(z_{\mathfrak a})\Big)\Big)$$

where $\mathfrak a$ runs over inequivalent primitive ideals and $a = N\mathfrak a$. It is not easily seen that the right side of (22.75) does not depend on the choice of representatives of the ideal classes while this is clear in (22.72). A good choice in practice is the system of primitive, reduced ideals because the points $z_{\mathfrak a}$ have the largest height making $D(z_{\mathfrak a})$ easy to estimate. Estimating trivially $D(z_{\mathfrak a}) \ll 1$ we obtain

(22.76)
$$L'(1,\chi_D) = \pi\sum_{\mathfrak a}\Big(\frac{\pi}{6a} - \frac{1}{\sqrt{|D|}}\log\frac{|D|}{a}\Big) + O\Big(\frac{h(D)}{\sqrt{|D|}}\Big)$$

where the implied constant is absolute. REMARKS. A different formula for L 1 (1, XD) was derived by S. Chowla and A. Selberg [CS], namely 1

IDih (1,XD) =

-7r

L

1 1

XD(n)logr( ~ )

O
2 + :h(/+log2n).

22.4. The class number problems.

Estimation of the order of the class group of a number field is one of the most intricate problems in arithmetic. In particular, one wants to know which fields have class number one since it means that the ring of integers of such a field has the unique factorization property. A small class number (relative to the discriminant) is not a common feature. As with many stories in arithmetic this one starts with Gauss. In his Disquisitiones Arithmeticae, Gauss is seriously concerned about the class number $h(D)$ of the imaginary quadratic field $K = \mathbb Q(\sqrt D)$.

GAUSS CONJECTURE. We have $h(D) \to \infty$ as $-D \to \infty$.

Even if this conjecture is proved the real issue in practice is

GAUSS CLASS NUMBER PROBLEM. Find an effective algorithm for determining all imaginary quadratic fields $K = \mathbb Q(\sqrt D)$ with a given class number $h = h(D)$.

Gauss knew that $h(D) = 1$ for the following nine discriminants

(22.77)
$$-D = 3, 4, 7, 8, 11, 19, 43, 67, 163.$$

Here is a remarkable characterization of these discriminants.

PROPOSITION 22.1 (Rabinovitch, 1913). For $D \equiv 1 \pmod 4$ we have $h(D) = 1$ if and only if the polynomial

$$f(x) = x^2 - x + \frac{1-D}{4}$$

represents primes for all natural numbers $x < \frac14|D|$.
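The criterion of Proposition 22.1 is simple enough to run by hand or by machine. Here is a minimal sketch, assuming the range of $x$ is $1 \leq x < |D|/4$ as stated above; the names `is_prime` and `rabinovitch_criterion` are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def rabinovitch_criterion(D):
    """Check whether x^2 - x + (1 - D)/4 is prime for all natural x < |D|/4.

    Assumes D < 0 and D = 1 (mod 4).
    """
    if D >= 0 or D % 4 != 1:
        raise ValueError("expected a negative discriminant D = 1 (mod 4)")
    constant = (1 - D) // 4
    x = 1
    while 4 * x < -D:
        if not is_prime(x * x - x + constant):
            return False
        x += 1
    return True


if __name__ == "__main__":
    for D in (-3, -7, -11, -19, -43, -67, -163, -15, -23):
        print(D, rabinovitch_criterion(D))
```

For $D = -163$ this is Euler's polynomial $x^2 - x + 41$, which takes prime values for $x = 1, \dots, 40$.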

The question becomes:


GAUSS CLASS NUMBER ONE PROBLEM. Prove that the above list of nine negative discriminants $D$ with $h(D) = 1$ is complete.

Recall that $h(D)$ is linked to $L(1,\chi_D)$ by the Dirichlet formula (22.59) but it does not help much. In fact this formula served Dirichlet to estimate $L(1,\chi_D)$ rather than $h(D)$ giving

(22.78)
$$L(1,\chi_D) \geq \frac{2\pi}{w\sqrt{|D|}}$$

since $h(D) \geq 1$. The class number one problem boils down to improving the lower bound (22.78) for $|D| > 163$. Next, recall that the effective lower bound

(22.79)
$$h(D) \gg |D|^{1/4}\log|D|$$

follows from the formula (22.60) subject to the hypothesis that $L(\frac12,\chi_D) \geq 0$. This can solve the CNOP assuming GRH (if the constants are made explicit). There are much stronger connections between $h(D)$ and zeros of $L(s,\chi_D)$, for example

PROPOSITION 22.2 (Hecke-Landau, 1918). If $L(s,\chi_D)$ does not vanish in the region $s > 1 - a/\log|D|$, then

(22.80)
$$h(D) > b\,\frac{\sqrt{|D|}}{\log|D|}$$

where $a$, $b$ are some positive constants, effectively computable.

In modern times the above result is routine, and one can push this implication much deeper when powered by new technologies (Linnik's density theorem and Turán's power-sums method). However, the following implication was quite surprising in its time.

PROPOSITION 22.3 (Deuring, 1933). If the Riemann Hypothesis for the zeta function $\zeta(s)$ is false, then $h(D) > 1$ for all sufficiently large $|D|$.

A year later Mordell improved Deuring's result to the effect that $h(D) \to \infty$ as $-D \to \infty$ under the same assumption. At the same time Heilbronn expanded the assumption to the falsity of the Riemann hypothesis for any Dirichlet L-series.

PROPOSITION 22.4 (Heilbronn, 1934). If GRH is false, then $h(D) \to \infty$ as $-D \to \infty$.

Combining both Proposition 22.2 and 22.4 one completes an unconditional solution of the Gauss Conjecture that $h(D) \to \infty$ as $-D \to \infty$. Of course, the result is not effective because we do not know which one of the two propositions is needed. The problem is that if some $L(s,\chi)$ has a zero off the line $\mathrm{Re}(s) = \frac12$, the resulting lower bound for $h(D)$ depends on this hypothetical point which has no numerical value. Therefore the CNOP still cannot be reduced to a finite number of computations. Some degree of effectivization was achieved in the following fashion.

PROPOSITION 22.5 (Heilbronn and Linfoot, 1934). There is at most one imaginary quadratic field $K = \mathbb Q(\sqrt D)$ with $h = h(D) = 1$ which is not in the list of Gauss.

For the above achievements some attention should be given to E. Landau for his early ideas in [La2]. Promptly after Heilbronn and Linfoot, he was able to prove the following remarkable estimate.

PROPOSITION 22.6 (Landau, 1935). There exists an absolute, effective constant $c > 0$ such that for every $h \geq 1$ all but one of the discriminants $D < 0$ with $h(D) = h$ satisfy

(22.81)
$$|D| \leq ch^8(\log 3h)^6.$$

Next C. L. Siegel [Sie1] used similar arguments to establish his famous

THEOREM 22.7 (Siegel, 1935). For any $\varepsilon > 0$ there exists a constant $c(\varepsilon) > 0$ (not computable) such that

(22.82)
$$L(1,\chi_D) > c(\varepsilon)|D|^{-\varepsilon}.$$

Hence

(22.83)
$$h(D) > c(\varepsilon)|D|^{1/2-\varepsilon}.$$

Both papers of Landau and Siegel appeared in the first volume of Acta Arithmetica, 1935. A simple proof of Siegel's theorem was given by D. Goldfeld [Gol] (see Chapter 5). The next interesting result is THEOREM 22.8 (Tatuzawa, 1951). Siegel's bound is effective except for one possible D for each e > 0. Precisely, ifO < e ( and elogiDI;? 1, then with at most one exception

f2

(22.84) In 1980 J. Hoffstein [Ho~ established further numerical improvements of Tatuzawa's estimate using Goldfeld's technique. The CNOP was solved for the first time by Kurt Heegner [Hee] in 1952, but his arguments (modular forms, complex multiplication) were understood and recognized only after his death in 1968. An additional note of tragedy was that this first recognition came two years after another solution was found by A. Baker. Baker's method [B] is very different: it uses an effective lower bound for linear forms of three logarithms of algebraic numbers. In the meantime, H. Stark [St1] gave a third solution which turned out to be similar to Heegner's. Then Stark looked into Baker's proof and realized that it sufficed to use linear forms only in two logarithms, so that the problem could have been solved already in 1949 by Gelfand and Linnik. Retrospectively, it seems that H. Weber was capable of solving the CNOP long before. See [Cox] for a presentation of the Heegner-Stark arguments. Next was the problem of class number two. This has been solved independently in 1971 by A. Baker [B] and H. Stark [St2]. They established that there are exactly eighteen negative discriminants D with h(D) = 2, namely

-D= 15,20,24,35,40,51,52,88,91, 115,123,148,187,232,235,267,403,427.



Our story culminates in two remarkable achievements by Goldfeld (1976) and Gross and Zagier (1983). Goldfeld [Go2] gave an effective lower bound for $h(D)$, the quality of which depends on the rank of an auxiliary elliptic curve, and its satisfying the Birch and Swinnerton-Dyer Conjecture. Gross and Zagier ([GZ1], [GZ2]) produced a curve with the required properties. The combined results of Goldfeld, Gross and Zagier solve the Class Number Problem for any fixed $h$, up to a finite amount of computation. J. Oesterlé [Oe] reduced the implied constants in Goldfeld's work so much as to make the result practical for computers. He succeeded in showing the following neat estimate:

(22.85)
$$h(D) > \frac{1}{55}(\log|D|)\prod_{p\mid D}\Big(1 - \frac{[2\sqrt p]}{p+1}\Big).$$
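To see how the bound (22.85) compares with actual class numbers, one can evaluate its right side for a few small discriminants. This is only a sketch under assumptions: the product is taken over all primes $p \mid D$ exactly as displayed above, the names `class_number`, `prime_divisors` and `oesterle_lower_bound` are hypothetical, and $h(D)$ is computed by counting reduced forms.

```python
from math import gcd, isqrt, log


def class_number(D):
    """Count reduced primitive forms of discriminant D < 0."""
    h, b = 0, D % 2
    while 3 * b * b <= -D:
        q = (b * b - D) // 4
        a = max(b, 1)
        while a * a <= q:
            if q % a == 0 and gcd(a, gcd(b, q // a)) == 1:
                h += 1 if (b == 0 or a == b or a * a == q) else 2
            a += 1
        b += 2
    return h


def prime_divisors(n):
    """Distinct prime divisors of a positive integer n."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def oesterle_lower_bound(D):
    """Right side of (22.85); isqrt(4p) equals the integer part of 2*sqrt(p)."""
    product = 1.0
    for p in prime_divisors(abs(D)):
        product *= 1.0 - isqrt(4 * p) / (p + 1)
    return log(abs(D)) / 55.0 * product


if __name__ == "__main__":
    for D in (-163, -427, -84, -23):
        print(D, class_number(D), round(oesterle_lower_bound(D), 4))
```

For discriminants this small the bound is far below 1, which illustrates why the estimate only becomes decisive for very large $|D|$.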

We shall present Goldfeld's approach in Section 22.7 with some variations of our own, and we survey the construction of Gross and Zagier in the Appendix to Chapter 23. The best effective lower bounds for $h(D)$ which current technology allows us to hope for are of type $h(D) \geq c_g(\log|D|)^g$ for any $g > 0$ and some computable constant $c_g > 0$, yet it is too short to solve some other popular problems such as

EULER IDONEAL NUMBER PROBLEM. Find all discriminants $D$ for which the class group of $\mathbb Q(\sqrt D)$ has one class in each genus.

If $D$ is an idoneal discriminant, then $h(D) = 2^{t-1}$ where $t = \omega(|D|)$ is the number of distinct prime divisors of $D$. Since $\omega(|D|)$ can be as large as $\log|D|/\log\log|D|$ the problem requires an effective lower bound

(22.86)
$$h(D) \gg |D|^{c/\log\log|D|} \quad\text{with } c > \log 2.$$
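The condition $h(D) = 2^{t-1}$ can be used to list idoneal discriminants in a small range. The sketch below is illustrative and restricted to fundamental discriminants; the helper names and the search bound of 500 are assumptions, and the class number is computed by counting reduced forms.

```python
from math import gcd


def class_number(D):
    """Count reduced primitive forms of discriminant D < 0."""
    h, b = 0, D % 2
    while 3 * b * b <= -D:
        q = (b * b - D) // 4
        a = max(b, 1)
        while a * a <= q:
            if q % a == 0 and gcd(a, gcd(b, q // a)) == 1:
                h += 1 if (b == 0 or a == b or a * a == q) else 2
            a += 1
        b += 2
    return h


def is_squarefree(n):
    n = abs(n)
    return all(n % (d * d) for d in range(2, int(n ** 0.5) + 1))


def is_fundamental(D):
    if D % 4 == 1:
        return is_squarefree(D)
    return D % 4 == 0 and (D // 4) % 4 in (2, 3) and is_squarefree(D // 4)


def omega(n):
    """Number of distinct prime divisors of a positive integer n."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)


def idoneal_fundamental_discriminants(bound):
    """Fundamental D with -bound <= D <= -3 and h(D) = 2^(omega(|D|) - 1)."""
    return [D for D in range(-3, -bound - 1, -1)
            if is_fundamental(D) and class_number(D) == 2 ** (omega(-D) - 1)]


if __name__ == "__main__":
    print(idoneal_fundamental_discriminants(500))
```

Such a search can only suggest where the list stops; proving that it is complete is exactly what requires the effective bound (22.86).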

We know by Landau's estimate (22.81) that there are only finitely many idoneal discriminants but we cannot list them all because of ineffective constants.

22.5. Splitting primes in $\mathbb Q(\sqrt D)$.

If the class number $h = h(D)$ is small, then there are only few prime ideals $\mathfrak p$ of degree one with small norm. Indeed, if $p = \mathfrak p\bar{\mathfrak p}$ with $(\mathfrak p, \bar{\mathfrak p}) = 1$, then $\mathfrak p^h$ is a principal ideal generated by $\frac12(m + n\sqrt D)$ with $n \neq 0$, whence $p^h = \frac14(m^2 - n^2D) \geq \Delta$ where

(22.87)
$$\Delta = \frac{|D|}{4}.$$

Therefore the least prime $p_1 = p_1(D)$ with $\chi_D(p_1) = 1$ satisfies

(22.88)
$$p_1 \geq \Delta^{1/h}.$$

Hence $\chi_D(n)$ agrees with $\mu(n)$ on all squarefree numbers $n \leq \Delta^{1/h}$ with $(n, \Delta) = 1$. This property is not likely to hold in long segments (because $\chi_D$ is periodic while $\mu$ is not), therefore (22.88) suggests that $h$ is rather large. In this section we establish several estimates which hold for any $D < -4$ but are interesting only when $h$ is relatively small. Let $\rho_D(a)$ be the number of solutions to

(22.89)
$$b^2 \equiv D \pmod{4a}$$

in $b \pmod{2a}$. This is the multiplicative function with $\rho_D(p^\alpha) = 1 + \chi_D(p)$ if $p \nmid D$, $\rho_D(p) = 1$ if $p \mid D$ and $\rho_D(p^\alpha) = 0$ if $p \mid D$, $\alpha > 1$. Thus the generating Dirichlet series for $\rho_D(a)$ has the Euler product

$$\zeta_K^{\flat}(s) = \sum_a\rho_D(a)a^{-s} = \prod_{p\mid D}\Big(1 + \frac{1}{p^s}\Big)\prod_{\chi_D(p)=1}\Big(1 + \frac{1}{p^s}\Big)\Big(1 - \frac{1}{p^s}\Big)^{-1}.$$

Since $\rho_D(a)$ is the number of primitive ideals of norm $a$ we have

$$\zeta_K(s) = \zeta(2s)\,\zeta_K^{\flat}(s),$$

whence

(22.90)
$$\rho_D(a) = \sum_{bc^2\mid a}\chi_D(b)\mu(c).$$
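Formula (22.90) can be compared with a direct count of the solutions of the congruence (22.89). The following sketch does this for a few fundamental discriminants; the names `kronecker`, `mobius`, `rho_bruteforce` and `rho_formula` are hypothetical and chosen here for illustration.

```python
def kronecker(D, n):
    """Kronecker symbol (D/n) for an integer D and a positive integer n."""
    if n % 2 == 0 and D % 2 == 0:
        return 0
    sign, twos = 1, 0
    while n % 2 == 0:
        n //= 2
        twos += 1
    if twos % 2 == 1 and D % 8 in (3, 5):
        sign = -sign
    a = D % n
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                sign = -sign
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            sign = -sign
        a %= n
    return sign if n == 1 else 0


def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def rho_bruteforce(D, a):
    """Number of b (mod 2a) with b^2 = D (mod 4a), counted directly."""
    return sum(1 for b in range(2 * a) if (b * b - D) % (4 * a) == 0)


def rho_formula(D, a):
    """Right side of (22.90): sum of chi_D(b) mu(c) over b c^2 dividing a."""
    total, c = 0, 1
    while c * c <= a:
        if a % (c * c) == 0:
            rest = a // (c * c)
            total += mobius(c) * sum(kronecker(D, b)
                                     for b in range(1, rest + 1) if rest % b == 0)
        c += 1
    return total


if __name__ == "__main__":
    D = -23
    print([(a, rho_bruteforce(D, a), rho_formula(D, a)) for a in range(1, 13)])
```

The two columns printed for each $a$ should agree, including at ramified primes and their powers where $\rho_D$ drops to 0.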

First we show the following basic inequalities

(22.91)
$$\sum_{a\leq\sqrt\Delta}\rho_D(a) \leq h \leq \sum_{a\leq\sqrt{|D|/3}}\rho_D(a).$$

This follows because the class number $h$ is equal to the number of primitive, reduced ideals $\mathfrak a = [a, \frac{b+\sqrt D}{2}]$ with $b$ satisfying (22.89). For $a \leq \sqrt\Delta$ the points

(22.92)
$$z_{\mathfrak a} = \frac{b+\sqrt D}{2a}$$

have height $\mathrm{Im}\,z_{\mathfrak a} \geq 1$, so choosing $b$ with $-a < b \leq a$, these points lie in the standard fundamental domain of $\Gamma = SL_2(\mathbb Z)$ proving the first inequality of (22.91). The second inequality follows along the same lines since every reduced ideal $\mathfrak a$ has $a = N\mathfrak a \leq \sqrt{|D|/3}$.
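The inequalities (22.91), and the bound $p_1^h \geq |D|/4$ for the least split prime discussed at the start of this section, can be checked on examples. This is a hedged sketch: the function names are hypothetical, $h(D)$ is computed by counting reduced forms, and a prime $p$ is recognized as split by the condition $\rho_D(p) = 2$.

```python
from math import gcd


def class_number(D):
    """Count reduced primitive forms of discriminant D < 0."""
    h, b = 0, D % 2
    while 3 * b * b <= -D:
        q = (b * b - D) // 4
        a = max(b, 1)
        while a * a <= q:
            if q % a == 0 and gcd(a, gcd(b, q // a)) == 1:
                h += 1 if (b == 0 or a == b or a * a == q) else 2
            a += 1
        b += 2
    return h


def rho(D, a):
    """Number of b (mod 2a) with b^2 = D (mod 4a)."""
    return sum(1 for b in range(2 * a) if (b * b - D) % (4 * a) == 0)


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def least_split_prime(D):
    """Least prime p with chi_D(p) = 1, detected via rho_D(p) = 2."""
    p = 2
    while not (is_prime(p) and rho(D, p) == 2):
        p += 1
    return p


def check_bounds(D):
    """Check both inequalities of (22.91) and the bound 4 * p_1^h >= |D|."""
    h = class_number(D)
    lower = sum(rho(D, a) for a in range(1, abs(D)) if 4 * a * a <= abs(D))
    upper = sum(rho(D, a) for a in range(1, abs(D)) if 3 * a * a <= abs(D))
    p1 = least_split_prime(D)
    return lower <= h <= upper and 4 * p1 ** h >= abs(D)


if __name__ == "__main__":
    for D in (-7, -23, -84, -163, -427):
        print(D, class_number(D), least_split_prime(D), check_bounds(D))
```

For $D = -163$ the least split prime is 41, only just above $|D|/4$, which shows that the bound on $p_1$ is sharp when $h = 1$.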

Now suppose $p_1 < \cdots < p_r$ are the first $r$ primes which split completely in $K = \mathbb Q(\sqrt D)$. By (22.91) we have

where o.1, ... , O.r run over non-negative integers and r' is the number of positive exponents. Hence v,.(log Aj21ogpr) ,:; h where Vr(o.) =

Since Vr(o.) ) (;~(,this implies log A,:; (2hr!) 1fr(logpr) (,:; rh 1fr logpr if r ) 2), i.e., (22.93) In particular, the second prime which splits completely satisfies p2 h ,:; (log A)9, then (22.94)

I{P,::: exp()logA): xv(p) = 1}1,::: 2g

1

;;:,

A 2,/h. If



provided/::;,. is sufficiently large in terms of g, namely log D.> (4g) 4 9. Moreover, (22.91) yields

~( L p(2\i'Z"

2r ~h.

x(p)=1

Hence for any positive integer r we obtain

\{p ~ t:;..fr : XD(P)

=

·~·

1}\ ~ rh~.

Choosing r = [ ~ log 8h J this gives (22.95)

\{p ~

t:;,.lflogBh:

X(p) = 1}\ ~ log8h.

Next we extend the basic inequality (22.91) to larger ranges. To this end we use the following elementary estimate (see Proposition 15.10):

$$|\{\gamma\in\Gamma_\infty\backslash\Gamma : \mathrm{Im}\,\gamma z > Y\}| \leq 1 + \frac{10}{Y}$$

which holds for any $Y > 0$ and $z \in \mathbb H$. Applying this for the points $z_{\mathfrak a}$ in the standard fundamental domain of $\Gamma = SL_2(\mathbb Z)$ with $Y = \sqrt\Delta/A$ we deduce that

(22.96)
$$\sum_{a\leq A}\rho_D(a) \leq h\Big(1 + \frac{10A}{\sqrt\Delta}\Big)$$

for any $A > 0$. As before this implies

~( L p(A

r~

2

h(1+

1~).

x(p)=1

Hence for any positive integer r we obtain

h ) ~. XD(P) = 1}\ < rh>'1 +5rA ( .;t;,.

I{P ~A: Choosing r = (22.97)

[A log 2h] + 1 we conclude \{p ~ A: XD(P)

= 1}\

that

«

(1

+ A\D\- 41log 2h) log 2h

for any A > 0, where the implied constant is absolute. In Chapter 23 we shall need the above estimates in various forms, therefore we end this section by deriving the required results. First restricting the summation in (22.96) to squarefree numbers we obtain

L >T(a, XD) ~ h ( 1 + lOA) .;t;,. .

(22.98)

a(A

Hence for any positive A, B we derive by partial summation (22.99)

L

b

A)

( 1 12 T(a,xD)a- 1 « h B + 6"

112

B
(22.100)

""b

L....

B
T(a, XD)a-

1

«

h

(

B1 + Jog .;t;,.A) .



Next we are going to estimate the sum

L'

S(A, B) =

(22.101)

T(a)T(a, xn)a- 112

l
where C stands for the product of primes in D and all the primes p xn(P) = 1. We have

~

B with

Applying (22.99) and (22.100) we get S(A, B)

«

1 B

h( -

1

+ -A)l/2 + h 6.

L'

A)l/2

«h ( 8+~

T(a,xn) ( 1 ya B

--- -

B
2 h (1

A)l/2

+ v'B 8+~

+ -A

)112

a6.

2(1

logA)(A)l/2

+h 8+ .JE

~

·

Rearranging terms this bound simplifies to (22.102)

where the implied constant is absolute, and effective.

22.6. Estimations for derivatives $L^{(k)}(1,\chi_D)$.

If the class number of the imaginary field $K = \mathbb Q(\sqrt D)$ is unusually small, say

(22.103)
$$h(D) = o\Big(\frac{\sqrt{|D|}}{\log|D|}\Big),$$

then so is the value of the Dirichlet series $L(s,\chi_D)$ at $s = 1$, namely

(22.104)
$$L(1,\chi_D) = o\Big(\frac{1}{\log|D|}\Big)$$

by virtue of the Class Number Formula (22.59). Though there is no direct relation between the class number and derivatives of $L(s,\chi_D)$ at $s = 1$ all of these values are also affected by the improbable hypothesis (22.103). Truncating the Dirichlet series for $L^{(k)}(s,\chi_D)$ we get (22.105)

by the Polya-Vinogradov inequality (12.50) (22.106)

L y
'

.

XD(n)

«

jDj1log\Dj.



Hence by absolute summation (22.107) It is amazing that this trivial bound is the best known so far (apart from some

improvements of the implied constant). By the Riemann Hypothesis for L(s, xv) one can show that (22.108) A slight improvement on (22.107) can be derived from the rather subtle results of Graham and Ringrose [GRi] for special D having only small prime divisors (see Chapter 12). In this section we give estimates which are better than (22.107) if the class number h(D) is relatively small. First we examine V(l, xv). We know by Landau's analytic method (see Chapter 5) that L(s,xv) cannot have two real zeros (counted with multiplicity) close to s = 1, therefore L(1,xv) being small should imply L'(1,xv) is not small. This can be shown quickly by elementary arguments. To this end we evaluate the sum

Ln- 1 T(n,xv)= L n::;;x

Xv~m)(log~+I+O(~))+o(~log 2 x)

m(y

=

L(1,xv)(logx +I)+ L'(1,xv) +

o(~ + ~ log X) 2

for any 2 ::::; y::::; x, where the error terms above carne from estimation of the relevant character sums with y < m ::::; x by means of the Polya-Vinogradov inequality. Hence we deduce the approximate formula (22.109) L

n- 1 T(n, xv) = £(1, xv)(log x + 1) + £'(1, xv) + O(IDit x-! logx).

n(x

Notice that on the left side all terms are non-negative and for n = dm 2 with dID we have T(n,xv);, 1. Summing over these numbers (with d squarefree to ensure the uniqueness of the representation n = dm2 ) we infer that (by choosing x = e-' D) 2

L(1,xv)log!DI+L'(1,xv)

>

(~ +O(Iog~DI))v(D),

where v(D) is almost constant, precisely (22.110)

v(D)=

J1(1+~). PID

Assuming (22.104) this yields L'(1,xv)

p

> ("i;'- +o(l))v(D).

Now we derive a precise asymptotic for L'(l, xv) by appealing to the Kronecker Limit Formula (22. 75) which allows us to take full advantage of the small class number condition. One could derive the same things from (22.109) but we choose the alternative path since it works for any class character and reveals other features. First, estimating trivially, we get 2

(22.111)

L'(1,xv) =

7r

6

£(D)+

o(h~ logiDI) viDI



where f.( D) is the sum over the norms of primitive reduced ideals (22.112) We shall approximate f.( D) by the product (22.113)

Clearly we have the upper bound f.( D)

L

<

PD(a)a- 1 ,:; P(D).

a(IDI

For a lower bound we apply Rankin's trick (recall f.(D);,

L

that~=

PD(a)a- 1 > (k(s)-

L

[D[/4):

PD(a)a-s

where (k(s) = ((2s)- 1 (x(s) is the zeta function of K = Q( JD) (reduced to the primitive ideals) (A.(s) =

L PD(a)a-s =IT (1 + p-s) II piD

(1

+ p-

5

)(1- P-T 1 ,

xo(p)=l

and s is any real number > 1 to be chosen at the end to optimize the obtained results. Applying (22.88) by partial summation we get

L

a>~

PD(a)a-s

<

llsh/3:. (s -1) ~

Here we can replace 11s by 33 because if s ;, 3 we have also the trivial bound 3/sv'iS.. derived by using PD(a) ,:; a and h ;, 1. Then we estimate (k(s) from below by the partial Euler product restricted to primes p ,:; [D[, say P8 (D), and replace Ps(D) by P(D) with admissible correction. Precisely, using the inequality TI(l-xp);, 1- ZxP with Xp

= 1- ( 1 + ~) ( 1-

~) ( 1 +~)-I (1- ~)-I

1

2 1-p -s ( )plogp = - - - - - , : ; 2 s - 12- p + 1 1 - p-S p - 1 we get Ps(D)/P(D) > 1- 2(s -1)ry(D) where rt(D) is given by (22.114)

rt(D)=

" L......

plogp p2 -1·

P(IDI

xo(p)#-1

Hencef.(D) > P(D){1- 2(s -l)ry(D)}- 33h(D)/(s -1 )v'iS... Choosing s close to 1, namely s = 1 + (33h(D)frt(D)P(D)JiDT) ~,we derive the unconditional formula (22.115) where 0

f.(D)

= P(D) -B(rt(D)P(D)h(D)/JiDT) 112

< B < 3v'33. Note that ry(D) «log [D[. Hence we obtain


PROPOSITION


22.9. Put E(D) = h(D)\D\-1 log \D\. We have

f(D) = P(D){1 + O(E(D)1)}

(22.116)

where the implied constant is absolute and effective.

Inserting (22.116) into (22.111) we come up with PROPOSITION

22.10. Suppose E(D) = h(D)\D\-1 log \D\ :( 1. Then we have

L'(1,XD) =

(22.117)

(

7!"2 6

1

)

+0(E(D)2) P(D).

The product over splitting primes in P(D) can be reduced considerably to that over a very few small primes by means of (22.97). Indeed (22.97) gives

L

(22.118)

P-l

«

(B-1 +\D\-4/log2hlog\D\)log2h.

B
22.11. Suppose

(22.119)

h (; \D\1/loglogJDI,

(22.120)

2 (; z (; log\D\.

Then we have

(22.121)

L'(l,xD)=

rr

~\(n)

1

(1+~)(1+~r (1+o(~)).

p(z log2h

Xo(p)=l

where v(D) is defined by (22.110).

Now we estimate higher derivatives. We start from the sum Sk(x) =

L

XD(n)n- 1 (1ogn)k

n(x

which approximates to (-1)k£(k)(1,xD) with an error term given by (22.105). On the other hand, we have

By the convergence of using

Z

JL( m) m - 1(log m )t for 0 (; f (; k, and in case of £ = 0

L

JL(m)m- 1 « (log2z)- 1,

m(z

we obtain sk (X )

«

"'r(n,xD) k-l logx ~ ---(log2n) Iog (2x I n )" n

n~x


Estimating the terms withy< n ::::; x by means of (22.100) we get Sk(x)

«

L n~y

T(n, XD) (log2n)k- 1 + ~(logy)k- 1 + ~(logx)k. n Y v6.

Here we estimate log n by logy, then extend the summation to n ::::; D 2 and apply (22.109) getting Sk(x)

«

h h L'(1,XD)(logy)k- 1 + y(logy)k- 1 + ..;;s(logx)k.

We choose x = D 2 and y = 2h 2 to arrive at (22.122) Combining (22.122) and (22.117) we obtain PROPOSITION 22.12. Suppose h(D) logJDJ::::; (22.123)

L(k)(1,XD)

«

jjDi,

then for any k;, 1,

P(D)(log2h(D))k- 1

where P(D) is the product given by {22.113) and the implied constant depends only on k.

REMARKS. Notice that (22.122) implies the trivial bound (22.107), and it becomes stronger when the class number is relatively small. In the derivation of (22.122) (which is unconditional) we made an appeal to the Prime Number Theorem in the form (22.124)

L m(z

JL(m) (logm)e m

«

1.

One can avoid these not so simple results altogether by examining a special combination of the truncated derivatives Se(x) with 0::::; I!::::; k. EXERCISE 2. Show that for

x;, 2 and a primitive character x(mod

JDJ),

2 logn =(log2 x-·n)L(1,x)-2,L 1 (1,x)-L 11 (1,x)+O(JDJ•x-'log 1 1 ) 2 """' L.... T(n,x)x. n n(x

Derive from this that log 2d < 0 L "( 1,XD ) < -2 """' L.... -ddiD

provided £(1, XD) < (log JDJ)- 2 and JDJ is sufficiently large.

CHAPTER 23

EFFECTIVE BOUNDS FOR THE CLASS NUMBER We have already seen how the assertion of h(D) being small relatively to the discriminant D leads to simple, yet fake, approximations to the special values L( ~, XD) and £(kl(1, xv). In this section we consider central values of derivatives of certain £-functions associated with automorphic forms twisted by the character XD· In this case the level of the twisted form is much larger than the conductor of the character (it is about D 2 ), so we shall be able to observe positive effects only for derivatives of order g ;, 2 and only when the class number is extremely small (essentially h(D) « (logiDI)g- 1 ). As in the previous cases our results are effective but we do not dwell on computing all the implied constants. Our motivation is to illustrate the masterwork of D. Goldfeld [Go2] which after Gross and Zagier [GZlj has applications to give effective lower bounds for h(D). Although for this application Goldfeld (and J. Oesterle [Oe]) considered only the £-function of an elliptic curve, we take any primitive cusp form of a fixed level and a central character (possibly the trivial central character). At the end we derive the lower bound for h( D) in question, and we make suggestions for further search of the £-functions which can be employed for estimating the class number. 23.1. Landau's plot of automorphic £-functions. Throughout, f is a primitive cusp form of weight k ;, 1 and character E(mod N) on the group fo(N) such that E(-1) = (-1)k. Therefore (see Chapter 14) f has the Fourier expansion 00

f(z) =

L .\(n)n k2' e(nz) 1

with coefficients .\(n) which are the eigenvalues of Heeke operators Tn for all n. Notice we normalized these so that the associated £-function

L(s, f)=

L .\(n)n-s 1

has an Euler product of type (23.1)

L(s, f)=

IJ(l- .\(p)p-s + E(p)p-2•)-1 p

and the complete £-function

A(s,f) =

(~rr(s+ k2 529

1

)L(s,f)

23. EFFECTIVE BOUNDS FOR THE CLASS NUMBER

530

(which is entire) satisfies the functional equation A(s, f) = w(f)A(1-

(23.2)

s,!).

Here w(f) is a complex number depending on f with [w(f)[ = 1 and form whose Fourier coefficients are :\( n).

1 is the cusp

Let $\chi$ be the character associated with the field $K=\mathbb{Q}(\sqrt D)$, that is, the Kronecker symbol $\chi(n)=\bigl(\frac{D}{n}\bigr)$; it is a real, primitive character of conductor $|D|$. We create the form $f\otimes\chi$ by twisting the coefficients with $\chi$,

$$(f\otimes\chi)(z)=\sum_{n=1}^{\infty}\lambda(n)\chi(n)n^{\frac{k-1}{2}}e(nz).$$

The twisted form $f\otimes\chi$ is a cusp form of weight $k$, character $\varepsilon$ and level $ND^2$, but it is not always primitive. However, there exists a unique primitive cusp form

$$f_\chi(z)=\sum_{n=1}^{\infty}\lambda_\chi(n)n^{\frac{k-1}{2}}e(nz)$$

of level $M\mid ND^2$ and character $\varepsilon_\chi$ such that $\varepsilon_\chi(n)=\varepsilon(n)$ and $\lambda_\chi(n)=\lambda(n)\chi(n)$ for all $(n,ND)=1$. Therefore the associated $L$-function

$$L(s,f_\chi)=\sum_{n=1}^{\infty}\lambda_\chi(n)n^{-s}$$

has the Euler product

$$L(s,f_\chi)=\prod_{p\mid ND}\bigl(1-\lambda_\chi(p)p^{-s}+\varepsilon_\chi(p)p^{-2s}\bigr)^{-1}\prod_{p\nmid ND}\bigl(1-\lambda(p)\chi(p)p^{-s}+\varepsilon(p)p^{-2s}\bigr)^{-1}.\tag{23.3}$$

The complete $L$-function

$$\Lambda(s,f_\chi)=\Bigl(\frac{\sqrt M}{2\pi}\Bigr)^{s}\Gamma\Bigl(s+\frac{k-1}{2}\Bigr)L(s,f_\chi)$$

is entire and it satisfies the functional equation

$$\Lambda(s,f_\chi)=w(f_\chi)\Lambda(1-s,\bar f_\chi)\tag{23.4}$$

with $|w(f_\chi)|=1$. Given $f$ and $\chi$ we consider the product $L$-function (in Landau's style)

$$L(s)=L(s,f)L(s,f_\chi)=\sum_{n=1}^{\infty}a(n)n^{-s}.\tag{23.5}$$

Note that $L(s)$ has an Euler product of degree four. Now we have all the information about $L(s,f)$ and $L(s,f_\chi)$ that is needed to work with $L(s)$. The complete product $L$-function defined by

$$\Lambda(s)=Q^{s}\Gamma^{2}\Bigl(s+\frac{k-1}{2}\Bigr)L(s,f)L(s,f_\chi)\qquad\text{with } Q=\frac{\sqrt{MN}}{4\pi^{2}}$$

satisfies the functional equation (the product of (23.2) and (23.4))

$$\Lambda(s)=w\,\overline{\Lambda}(1-s)\tag{23.6}$$

with the root number $w=w(f)w(f_\chi)$. Our first aim is to evaluate asymptotically the derivative $\Lambda^{(g)}(s)$ at $s=\tfrac12$ of order $g\geq 0$. We make no further hypothesis until the time comes to examine the main terms in our formula.

23.2. A partition of $\Lambda^{(g)}(\tfrac12)$. We begin by expressing $\Lambda^{(g)}(\tfrac12)$ as a sum of two rapidly converging conjugate series in coefficients of $L(s)$, similar to the formula (5.12). To this end we compute the contour integral

$$I=\frac{g!}{2\pi i}\int_{(1)}\Lambda(s+\tfrac12)s^{-g-1}\,ds$$

in two ways. First, moving to the line $\operatorname{Re}(s)=-1$ we pass a pole of order $g+1$ at $s=0$ with residue $\Lambda^{(g)}(\tfrac12)$. Then we observe that the new integral on the line $\operatorname{Re}(s)=-1$ is equal to $-(-1)^{g}w\bar I$ by the functional equation (23.6), therefore we obtain $\Lambda^{(g)}(\tfrac12)=I+(-1)^{g}w\bar I$. Here $\bar I$ denotes the complex conjugate of $I$. Next we compute $I$ by integrating termwise the Dirichlet series (23.5). We obtain $I=Q^{1/2}S$, so

$$\Lambda^{(g)}(\tfrac12)=Q^{1/2}\bigl(S+(-1)^{g}w\bar S\bigr)\tag{23.7}$$

where

$$S=\sum_{n=1}^{\infty}\frac{a(n)}{\sqrt n}V\Bigl(\frac nQ\Bigr)\tag{23.8}$$

and $V(y)$ is the inverse Mellin transform of $g!\,\Gamma^{2}(s+\tfrac k2)s^{-g-1}$,

$$V(y)=\frac{g!}{2\pi i}\int_{(1)}y^{-s}\Gamma^{2}(s+\tfrac k2)s^{-g-1}\,ds.\tag{23.9}$$
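The termwise integration behind $I=Q^{1/2}S$ can be written out as a short worked computation. This is only a sketch: it assumes the interchange of sum and integral is justified by absolute convergence on $\operatorname{Re}(s)=1$, and it uses $\Gamma(s+\tfrac12+\tfrac{k-1}{2})=\Gamma(s+\tfrac k2)$.

```latex
\begin{align*}
I &= \frac{g!}{2\pi i}\int_{(1)} Q^{s+\frac12}\,\Gamma^{2}\!\Bigl(s+\frac k2\Bigr)
      \sum_{n\ge 1} a(n)\,n^{-s-\frac12}\; s^{-g-1}\,ds \\
  &= Q^{\frac12}\sum_{n\ge 1}\frac{a(n)}{\sqrt n}\cdot
      \frac{g!}{2\pi i}\int_{(1)}\Bigl(\frac nQ\Bigr)^{-s}
      \Gamma^{2}\!\Bigl(s+\frac k2\Bigr)s^{-g-1}\,ds \\
  &= Q^{\frac12}\sum_{n\ge 1}\frac{a(n)}{\sqrt n}\,V\!\Bigl(\frac nQ\Bigr)
   \;=\; Q^{\frac12}S .
\end{align*}
```

The factor $g!$ is what makes the residue of the integrand at $s=0$ equal to $\Lambda^{(g)}(\tfrac12)$ rather than $\Lambda^{(g)}(\tfrac12)/g!$.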

Note that (move the integration to $\operatorname{Re}(s)=-\tfrac12$)

$$V(y)=\sum_{0\le j\le g}c_j\Bigl(\log\frac1y\Bigr)^{j}+O\Bigl(y^{1/2}\log\frac2y\Bigr)\tag{23.10}$$

if $0<y\le 1$, with the leading coefficient $c_g=\Gamma^{2}(\tfrac k2)$. Moreover, $V(y)\ll y^{k}e^{-2\sqrt y}$ if $y\ge 1$ (move the integration to $\operatorname{Re}(s)=\sqrt y$ and estimate by Stirling's formula (5.113)). Combining both cases we deduce the bound for the cut-off function

(23.11)

which holds for all $y>0$. By virtue of (23.11) the series (23.8) begins to decay exponentially with $n\gg Q$, so we are left essentially with $Q$ terms, which is still a lot. However, if the class number $h$ is very small, then many coefficients $a(n)$ vanish. Under this fictitious condition a main term for $S$ will emerge from a lacunary subsequence of the coefficients (essentially from $a(m^2)$). This main term is proportional to the value at $s=1$ of the symmetric square $L$-function attached to $f$ (which does not vanish, see Theorem 5.44 and (5.101)).



Before estimating $S$ we pull out all the ramified places and a few small ones which split in the field $K=\mathbb{Q}(\sqrt D)$. Precisely, let $C$ be the product of primes in $ND$ and all primes $p\le B$ with $\chi(p)=1$. We shall choose $B$ later; it will be quite small, but for now we assume only $B\le|D|$. We write

(23.12)

By the Euler product (23.1) the coefficients $\lambda(n)$ satisfy

$$\lambda(m)\lambda(n)=\sum_{d\mid(m,n)}\varepsilon(d)\lambda(mnd^{-2}).\tag{23.13}$$
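The Hecke relation (23.13) can be sanity-checked on a concrete form. The sketch below is an illustration only, not part of the argument: it takes the weight 12 cusp form $\Delta$ of level 1 (trivial character), whose unnormalized coefficients are the Ramanujan numbers $\tau(n)=\lambda(n)n^{11/2}$. In that normalization (23.13) reads $\tau(m)\tau(n)=\sum_{d\mid(m,n)}d^{11}\tau(mn/d^{2})$, which involves integers only. The function names are hypothetical.

```python
from math import gcd

def ramanujan_tau(limit):
    """Return [0, tau(1), ..., tau(limit)] from q * prod_{n>=1} (1 - q^n)^24."""
    # coeffs[j] = coefficient of q^j in prod (1 - q^n)^24, truncated below q^limit
    coeffs = [0] * limit
    coeffs[0] = 1
    for n in range(1, limit):
        for _ in range(24):
            # multiply in place by (1 - q^n); go downwards so old values are used
            for j in range(limit - 1, n - 1, -1):
                coeffs[j] -= coeffs[j - n]
    return [0] + coeffs  # the leading factor q shifts every index by one

def hecke_rhs(tau, m, n, weight=12):
    """Right-hand side sum over d | (m, n) of d^(weight-1) * tau(mn / d^2)."""
    g = gcd(m, n)
    return sum(d ** (weight - 1) * tau[m * n // (d * d)]
               for d in range(1, g + 1) if g % d == 0)

def check_hecke(tau, bound):
    """Check the multiplicativity relation for all 1 <= m, n <= bound."""
    return all(tau[m] * tau[n] == hecke_rhs(tau, m, n)
               for m in range(1, bound + 1) for n in range(1, bound + 1))

TAU = ramanujan_tau(144)  # enough coefficients for m, n <= 12

if __name__ == "__main__":
    print(TAU[1:7])
    print("Hecke relation holds for m, n <= 12:", check_hecke(TAU, 12))
```

Dividing both sides by $(mn)^{11/2}$ recovers (23.13) with $\varepsilon=1$.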

Hence for $(m,ND)=1$ we have

$$a(m)=\sum_{d^{2}n=m}\psi(d)\tau(n,\chi)\lambda(n)$$

where $\psi=\varepsilon\chi$. Accordingly $S$ is partitioned into

We expect that the main contribution to $S$ comes from $n=m^{2}$ and with $\tau(m^{2},\chi)$ deleted, which is

$$S_1=\sum_{c\mid C^{\infty}}\frac{a(c)}{\sqrt c}\sum_{(d,C)=1}\frac{\psi(d)}{d}\sum_{(m,C)=1}\frac{\lambda(m^{2})}{m}V\Bigl(\frac{cd^{2}m^{2}}{Q}\Bigr).$$

The remaining terms yield

Thus we have $S=S_1+S_2+S_3$. We shall treat each of these three sums separately. The first sum $S_1$ will be evaluated asymptotically by an appeal to analytic properties of the symmetric square $L$-function attached to the cusp form $f$, and the other two sums will be estimated by means of (22.86) and (22.88) respectively. We estimate $S_3$ essentially by the class number while $S_2$ turns out to be negligible. For the coefficients of the cusp form $f$ we use Deligne's bound

$$|\lambda(n)|\le\tau(n).\tag{23.14}$$

This rather advanced result is not essential; however, it is useful to simplify the exposition.


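23.3. Estimation of $S_3$ and $S_2$. We have

$$\tau(n,\chi)=\prod_{p^{2\alpha-1}\|n}\alpha\bigl(1+\chi(p)\bigr)\prod_{p^{2\alpha}\|n}\bigl(1+\alpha(1+\chi(p))\bigr).$$

Hence, writing $n=am^{2}$, where $a$ is squarefree, we see that $\tau(n,\chi)\le\tau(a,\chi)\tau(m^{2})$. Moreover, by Deligne's bound, $|\lambda(n)|\le\tau(n)\le\tau(a)\tau(m^{2})$. Therefore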

1531 (

la~l LLT2d:2)

L c!C=

d

T(a,x) Tjilv(acd~m2)1·

Lb a>l

m

(a,C)=l

Applying (22.88) and (23.11) we deduce that

53« h(1

+ ~){vc(1) + vcWB-~(logt.) 9 +1°}

where

$$\nu_C(s)=\sum_{c\mid C^{\infty}}|a(c)|c^{-s}.$$

We have a(pe) = L~ >.(pk)>.x(Pl-k), whence for any s > 0 by (23.14) 2

II (LT(pe)p-fs) =II (1- p-s)-4 = ~c(s), 00

vc(s) (

piC

0

PIC

say. For vn(1) we may have a more precise estimate vn(1) x

(23.15)

II(1 + l>.(p)l) « v (D) 2

p

PID

where $\nu(D)$ is defined by (22.110). To estimate $S_2$ first notice that $\tau(m^{2},\chi)=1$ unless $m$ has a prime divisor with $\chi(p)=1$; indeed, this is clear from the formula

$$\tau(m^{2},\chi)=\prod_{p^{\alpha}\|m}\bigl(1+\alpha(1+\chi(p))\bigr).$$
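The remark that $\tau(m^{2},\chi)=1$ unless $m$ has a split prime factor is easy to observe numerically. The sketch below is an illustration under stated assumptions: it takes $\tau(n,\chi)=\sum_{d\mid n}\chi(d)$ (the definition consistent with the product formula above), uses the discriminant $D=-163$ of class number one, and implements the Kronecker symbol only for fundamental discriminants and $n\ge 1$. The function names are hypothetical.

```python
def kronecker(D, n):
    """Kronecker symbol (D/n) for a fundamental discriminant D and an integer n >= 1."""
    result = 1
    p = 2
    while n > 1:
        if p * p > n:
            p = n  # whatever is left of n is prime
        if n % p == 0:
            if p == 2:
                if D % 2 == 0:
                    s = 0
                else:
                    s = 1 if D % 8 in (1, 7) else -1
            else:
                r = pow(D % p, (p - 1) // 2, p)  # Euler's criterion
                s = 0 if r == 0 else (1 if r == 1 else -1)
            while n % p == 0:
                result *= s
                n //= p
        p += 1
    return result

def tau_chi(n, D):
    """tau(n, chi) = sum of chi(d) over the divisors d of n, with chi = (D/.)."""
    return sum(kronecker(D, d) for d in range(1, n + 1) if n % d == 0)

if __name__ == "__main__":
    D = -163
    small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
    print({p: kronecker(D, p) for p in small_primes})
    print([tau_chi(m * m, D) for m in range(1, 43)])
```

For $D=-163$ no prime below 41 splits, so $\tau(m^{2},\chi)=1$ for every $m\le 40$ and $\tau(p,\chi)=0$ for every prime $p<41$. This is the kind of lacunarity that a very small class number forces.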

Therefore by (23.11) and Deligne's bound

" L,'

52« vc(~)(logt.) 9 + 1

(m,C)=l

«

vc(~)(logt.)g+l

2

2,x) -1)e-m/ .,fQ T(m ) --(T(m m

L ~~ L m T2(m2)

m

p-le-mp/VfJ.

pjC

x(p)=l Applying (22.86) we infer that (23.16)

52« h(1

+ ~)vcWB- 1 (logt.) 9 +1°.

We simplify both results by taking B in the following range (23.17)



For s = that

!, 1 we have vb(s) (

~b(s)

«

h~n(s)

« hT(!DI) (

h 2 , whence we conclude

(23.18)

23.4. Evaluation of $S_1$. To this end we use complex integration and analytic properties of the series which come out. Inserting (23.9) for $V(cd^{2}m^{2}/Q)$ into (23.12) we get

$$S_1=\frac{g!}{2\pi i}\int_{(1)}Q^{s}\Gamma^{2}(s+\tfrac k2)Z(2s+1)s^{-g-1}\,ds\tag{23.19}$$

where $Z(s)$ is the corresponding Dirichlet series

$$Z(s)=\Bigl(\sum_{c\mid C^{\infty}}a(c)c^{-s/2}\Bigr)\Bigl(\sum_{(d,C)=1}\psi(d)d^{-s}\Bigr)\Bigl(\sum_{(m,C)=1}\lambda(m^{2})m^{-s}\Bigr).$$

Recall that $\psi=\varepsilon\chi$. We write $Z(s)=L(s,\psi)M(s,f)P(s,f)$ where

$$P(s,f)=\Bigl(\sum_{c\mid C^{\infty}}a(c)c^{-s/2}\Bigr)\Bigl(\sum_{d\mid C^{\infty}}\psi(d)d^{-s}\Bigr)^{-1}\Bigl(\sum_{m\mid C^{\infty}}\lambda(m^{2})m^{-s}\Bigr)^{-1},$$

$L(s,\psi)$ is the Dirichlet $L$-function for the character $\psi=\varepsilon\chi$ and

$$M(s,f)=\sum_{m=1}^{\infty}\lambda(m^{2})m^{-s}.\tag{23.20}$$

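Every Dirichlet series above has an Euler product. To compute them we factor the Hecke polynomial for $L(s,f)$ into

$$1-\lambda(p)p^{-s}+\varepsilon(p)p^{-2s}=\bigl(1-\alpha(p)p^{-s}\bigr)\bigl(1-\beta(p)p^{-s}\bigr)\tag{23.21}$$

with $\alpha(p)+\beta(p)=\lambda(p)$ and $\alpha(p)\beta(p)=\varepsilon(p)$. Similarly, this holds for the local factors of $L(s,f_\chi)$ with $\alpha_\chi(p)=\alpha(p)\chi(p)$, $\beta_\chi(p)=\beta(p)\chi(p)$ in place of $\alpha(p)$, $\beta(p)$. This yields

$$\lambda(p^{\ell})=\frac{\alpha^{\ell+1}-\beta^{\ell+1}}{\alpha-\beta}.\tag{23.22}$$

Here and hereafter we drop the argument $p$ in the roots $\alpha,\beta,\alpha_\chi,\beta_\chi$ for simplicity. By (23.22) we derive

$$\sum_{\ell=0}^{\infty}\lambda(p^{2\ell})p^{-\ell s}=\frac{\alpha}{\alpha-\beta}(1-\alpha^{2}p^{-s})^{-1}-\frac{\beta}{\alpha-\beta}(1-\beta^{2}p^{-s})^{-1}$$

$$=(1-\alpha^{2}p^{-s})^{-1}(1+\alpha\beta p^{-s})(1-\beta^{2}p^{-s})^{-1}=(1-\alpha^{2}p^{-s})^{-1}(1-\alpha\beta p^{-s})^{-1}(1-\beta^{2}p^{-s})^{-1}(1-\varepsilon^{2}p^{-2s}).$$

Hence the $C$-part is equal to

$$P(s,f)=\prod_{p\mid C}(1-\alpha p^{-s/2})^{-1}(1-\beta p^{-s/2})^{-1}(1-\alpha_\chi p^{-s/2})^{-1}(1-\beta_\chi p^{-s/2})^{-1}\prod_{p\mid C}(1-\psi p^{-s})(1-\alpha^{2}p^{-s})(1+\varepsilon p^{-s})^{-1}(1-\beta^{2}p^{-s}).$$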
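The local identity for $\sum_{\ell}\lambda(p^{2\ell})p^{-\ell s}$ is a statement about formal power series in $x=p^{-s}$, so it can be checked with exact integer arithmetic. In the sketch below the integers passed as `alpha` and `beta` are merely stand-ins for the roots $\alpha(p)$, $\beta(p)$ (an assumption made so that everything stays in the integers), and $\lambda(p^{\ell})$ is computed as $\sum_{i=0}^{\ell}\alpha^{i}\beta^{\ell-i}$, which equals $(\alpha^{\ell+1}-\beta^{\ell+1})/(\alpha-\beta)$. The function names are hypothetical.

```python
def lam(alpha, beta, ell):
    """lambda(p^ell) written as the complete homogeneous sum of degree ell."""
    return sum(alpha ** i * beta ** (ell - i) for i in range(ell + 1))

def series_coeffs(alpha, beta, terms):
    """First coefficients in x of (1 + alpha*beta*x) / ((1 - alpha^2 x)(1 - beta^2 x))."""
    # 1 / ((1 - a x)(1 - b x)) has x^l coefficient sum_{i+j=l} a^i b^j
    base = [lam(alpha ** 2, beta ** 2, l) for l in range(terms)]
    return [base[l] + (alpha * beta * base[l - 1] if l > 0 else 0)
            for l in range(terms)]

def identity_holds(alpha, beta, terms=10):
    """Compare lambda(p^(2l)) with the series coefficients for l < terms."""
    return series_coeffs(alpha, beta, terms) == [lam(alpha, beta, 2 * l)
                                                 for l in range(terms)]

if __name__ == "__main__":
    for pair in [(2, 3), (1, -1), (5, -2)]:
        print(pair, identity_holds(*pair))
```

The check confirms the middle expression $(1-\alpha^{2}x)^{-1}(1+\alpha\beta x)(1-\beta^{2}x)^{-1}$; the last form follows from $1+\alpha\beta x=(1-\alpha^{2}\beta^{2}x^{2})/(1-\alpha\beta x)$ with $\alpha\beta=\varepsilon$.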


After factoring $1-\alpha^{2}p^{-s}$ and $1-\beta^{2}p^{-s}$ we arrange this product as

$$P(s,f)=\prod_{p\mid C}(1+\varepsilon p^{-s})^{-1}(1-\psi p^{-s})\prod_{p\mid C}(1+\alpha p^{-s/2})(1+\beta p^{-s/2})(1-\alpha_\chi p^{-s/2})^{-1}(1-\beta_\chi p^{-s/2})^{-1}.\tag{23.23}$$

Moreover, we get $M(s,f)=L(2s,\varepsilon^{2})^{-1}L(s,\mathrm{Sym}^{2}f)$ where

$$L(s,\mathrm{Sym}^{2}f)=\prod_{p}(1-\alpha^{2}p^{-s})^{-1}(1-\alpha\beta p^{-s})^{-1}(1-\beta^{2}p^{-s})^{-1}$$

is the symmetric square $L$-function attached to $f$; see Section 5.12. This $L$-function appears as a factor in the Rankin-Selberg $L$-function

$$L(s,f\otimes f)=\sum_{n=1}^{\infty}\lambda^{2}(n)n^{-s}.\tag{23.24}$$

Indeed, by (23.13) for $m=n$ we deduce that

$$L(s,f\otimes f)=L(s,\varepsilon)M(s,f)=\frac{L(s,\varepsilon)}{L(2s,\varepsilon^{2})}L(s,\mathrm{Sym}^{2}f).\tag{23.25}$$

The $L$-function $L(s,\mathrm{Sym}^{2}f)$ admits meromorphic continuation to the whole complex $s$-plane and has a functional equation. As proved by Shimura [Sh2], it is holomorphic except possibly for simple poles at $s=0$ and $s=1$. The cases for which $L(s,\mathrm{Sym}^{2}f)$ has a pole at $s=1$ are completely understood (the dihedral forms) and very rare. We shall assume that our function $L(s,\mathrm{Sym}^{2}f)$ has no pole at $s=1$. By (5.101) or Theorem 5.44, we have

$$L(1,\mathrm{Sym}^{2}f)\neq 0.\tag{23.26}$$

Besides the above arithmetic properties of the zeta functions we need crude estimates on the line Re ( s) = ~. By (23.23) we get (23.27)

IP(s,f)l ( ~c(~) «h.

For the £-function of the character 'If;= EX we use the convexity bound (see Section 5.9) (23.28) Hence we arrive at (23.29) where the implied constant depends on the cusp form

f.

Having gathered the arithmetic and analytic properties of the relevant zeta functions we are now ready to evaluate the series $S_1$. Moving in (23.19) to the line $\operatorname{Re}s=-1/4$ we obtain

$$S_1=S_0+O(h)\tag{23.30}$$

where $S_0$ is the residue at $s=0$, or simply

$$S_0=\frac{d^{g}}{ds^{g}}\,Q^{s}\Gamma^{2}(s+\tfrac k2)Z(2s+1)\quad\text{at } s=0.\tag{23.31}$$

REMARK. The error term in (23.30) could be improved by a factor $|D|^{-1/16}$ using Burgess' bound in place of (23.28), but we do not need anything better than (23.30).

The dependence of $S_0$ on the character $\psi=\varepsilon\chi$ needs to be exposed. To this end we set

$$R(s)=Q^{s}\Gamma^{2}(s+\tfrac k2)M(2s+1,f)P(2s+1,f)\tag{23.32}$$

and differentiate $L(2s+1,\psi)R(s)$ by the Leibniz rule, getting

$$S_0=\sum_{u+v=g}2^{u}\binom{g}{u}L^{(u)}(1,\psi)R^{(v)}(0).\tag{23.33}$$

Next, when computing the derivatives $R^{(v)}(0)$ we think of $Q^{s}$ as a dominant factor in $R(s)$. Recalling that all primes in $C$ are bounded by $Q$, we derive by repeated application of the Leibniz rule and the product representation (23.23) that

(23.34)

where

$$t(C)=\sum_{p\mid C}p^{-1/2}\log p.\tag{23.35}$$

This approximation would be useless if the error term was not small, but thanks to (23.17) we get by (22.81) that $t(C)=t(D)+O(\log 2h)$. We have also $t(D)\ll\omega(|D|)\ll\log 2h$, therefore $t(C)\ll\log 2h$. Finally by (23.30), (23.33) and (23.34) we obtain

$$S_1=R(0)\sum_{u+v=g}2^{u}\binom{g}{u}L^{(u)}(1,\psi)(\log|D|)^{v}\Bigl(1+O\Bigl(\frac{\log 2h}{\log|D|}\Bigr)\Bigr)+O(h).\tag{23.36}$$

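23.5. An asymptotic formula for $\Lambda^{(g)}(\tfrac12)$. Adding (23.18) to (23.36) we obtain

$$S=R(0)\sum_{u+v=g}2^{u}\binom{g}{u}L^{(u)}(1,\psi)(\log|D|)^{v}\Bigl(1+O\Bigl(\frac{\log 2h}{\log|D|}\Bigr)\Bigr)+O(h\nu_C(1))\tag{23.37}$$

where $R(0)=\Gamma^{2}(\tfrac k2)M(1,f)P(1,f)$. Recall that

$$M(1,f)=L(2,\varepsilon^{2})^{-1}L(1,\mathrm{Sym}^{2}f)\tag{23.38}$$

and $P(1,f)$ is given by (23.23). Keep in mind that $P(1,f)$ depends on $D$, but rather mildly; precisely

$$P(1,f)=\prod_{p\mid C}\Bigl(1+\frac{\varepsilon(p)}{p}\Bigr)^{-1}\Bigl(1-\frac{\varepsilon(p)\chi(p)}{p}\Bigr)\Bigl(1+\frac{\alpha(p)}{\sqrt p}\Bigr)\Bigl(1+\frac{\beta(p)}{\sqrt p}\Bigr)\Bigl(1-\frac{\alpha(p)\chi(p)}{\sqrt p}\Bigr)^{-1}\Bigl(1-\frac{\beta(p)\chi(p)}{\sqrt p}\Bigr)^{-1}\tag{23.39}$$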

where the product is over primes $p\mid D$ and $p\le B$ with $\chi(p)=1$. Inserting (23.37) into (23.7) we obtain the desired expression for $\Lambda^{(g)}(\tfrac12)$ in terms of derivatives of the $L$-function for the character $\psi=\varepsilon\chi$ at the point $s=1$.

From now on we assume that $\varepsilon$ is a real character, the symmetric square $\mathrm{Sym}^{2}f$ has real coefficients (though $f$ may have both real and imaginary coefficients) and $P(1,f)$ is also real. Then the approximate formula (23.37) for $S$ and $\bar S$ is the same. Recall the formula (23.7). Since we do not wish the main terms canceled we must match the parity of $g$ with the sign of the functional equation (23.6) for the product $L$-function $L(s)=L(s,f)L(s,f_\chi)$, i.e., we require

$$w=w(f)w(f_\chi)=(-1)^{g}.\tag{23.40}$$

Now inserting (23.37) into (23.7) we arrive at

$$Q^{-1/2}\Lambda^{(g)}(\tfrac12)=2\Gamma^{2}(\tfrac k2)M(1,f)P(1,f)\sum_{u+v=g}2^{u}\binom{g}{u}L^{(u)}(1,\psi)(\log|D|)^{v}\Bigl(1+O\Bigl(\frac{\log 2h}{\log|D|}\Bigr)\Bigr)+O(h\nu_C(1)).\tag{23.41}$$

We shall refine this rudimentary formula in special cases. Suppose E = 1, so 'lj; = XD· We can use the estimates for £(u)(1, XD) established in Section 22.6 under the minor assumption h(D) log IDI ( (see (22.122)) reducing (23.41) to

JlD1

(23.42)

Q-!A(g)W = 2gf 2 (~)M(1,f)P(1,f) 1 L'(1,xo)(log IDI) 9 - (1 +

oC::I~I)) + O(hvc(1)).

Also we could replace £'(1, xo) by ~P(D) using (22.117). Though we made so far rather minor restrictions for the class number (exactly such that the segment (23.17) is not void), our asymptotic formula (23.42) is interesting only if h(D) «(log IDI) 9 , or else the error term exceeds the main term. Having assumed the class number is that small, notice that there are at most 2g primes p ( exp( Jlog IDI) which split inK= Q(VD) (see (22.95)). Choosing B = (logiDI) 4 9+ 20 (which is within (23.17)) we find that vc(1) ~ vo(1) « v(D) 2 by (23.15), then L'(1,Xo) ~ v(D) by (22.119), and 1

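$$P(1,f)\asymp\prod_{p\mid D}\Bigl(1+\frac1p\Bigr)^{-1}\Bigl(1+\frac{\lambda(p)}{\sqrt p}+\frac1p\Bigr)=\prod_{p\mid D}\Bigl(1+\frac{\lambda(p)\sqrt p}{p+1}\Bigr)\tag{23.43}$$

by (23.39).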



23.6. A lower bound for the class number. As before $\chi=\chi_D$ is the character of the imaginary quadratic field $K=\mathbb{Q}(\sqrt D)$. If the left side of (23.42) vanishes, we are left with a lower bound for the error term which contains the class number $h=h(D)$. We put this observation as

PROPOSITION 23.1. Suppose $f$ is a primitive cusp form of weight $k\ge 1$, level $N$ and trivial central character. Suppose the product $L$-function vanishes at $s=\tfrac12$ to order

$$m=\operatorname*{ord}_{s=\frac12}L(s,f)L(s,f_\chi)\ge 3.\tag{23.44}$$

Let $g=m-1$ or $g=m-2$ be such that $(-1)^{g}=w=w(f)w(f_\chi)$. Then

$$h(D)\gg\theta(D)(\log|D|)^{g-1}\tag{23.45}$$

where

$$\theta(D)=\prod_{p\mid D}\Bigl(1+\frac1p\Bigr)^{-3}\Bigl(1+\frac{\lambda(p)\sqrt p}{p+1}\Bigr)^{-1}.\tag{23.46}$$

The implied constant depends only on $f$ and $m$ and it is effectively computable.

The theory of elliptic curves provides plenty of examples of cusp forms $f\in S_2(\Gamma_0(N))$ for which the associated $L$-function $L_E(s+\tfrac12)=L(s,f)$ should have large order of vanishing at $s=\tfrac12$, precisely equal to the rank of the group of rational points on $E$, by the Birch-Swinnerton-Dyer Conjecture. Besides the rank we need to know the root numbers $w(f)$, $w(f_\chi)$ to control the parity of $g$. In general, for a primitive form $f\in S_k(\Gamma_0(N))$ of squarefree level $N$ we have (see e.g. Proposition 14.16 or [14])

(23.47)

There is no simple formula for $w(f)$ if $N$ is not squarefree. If the greatest common divisor $(N,D^{2})$ is squarefree, then the twisted form is primitive, i.e., $f_\chi=f\otimes\chi$, has level $M=[N,D^{2}]$ and the root number is

(23.48)

where $N=N_1N_2$ with $(N_1,D)=1$ and $N_2\mid D$. This simplifies to $w(f_\chi)=\chi(-N)w(f)$ if $(N,D)=1$. Exploiting (23.47) and (23.48) we shall derive from Proposition 23.1 our final result.

THEOREM 23.2. There exists an absolute, effectively computable constant $c>0$ such that for any fundamental discriminant $D<0$ we have

$$h(D)\ge c\,\theta(D)\log|D|\tag{23.49}$$

where

$$\theta(D)=\prod_{p\mid D}\Bigl(1+\frac1p\Bigr)^{-3}\Bigl(1+\frac{2\sqrt p}{p+1}\Bigr)^{-1}.\tag{23.50}$$
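To get a feeling for the two sides of (23.49), one can tabulate $h(D)$ against $\theta(D)\log|D|$ for a few discriminants. The sketch below is an illustration only: it computes $h(D)$ by counting reduced primitive binary quadratic forms (a standard method, not the method of this chapter), evaluates $\theta(D)$ from (23.50), and makes no claim about the value of the constant $c$. The function names are hypothetical.

```python
from math import gcd, log, sqrt

def class_number(D):
    """Number of reduced primitive forms a x^2 + b x y + c y^2 with b^2 - 4ac = D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    count = 0
    a = 1
    while 3 * a * a <= -D:          # reduced forms satisfy a <= sqrt(|D| / 3)
        for b in range(-a + 1, a + 1):   # -a < b <= a
            numerator = b * b - D
            if numerator % (4 * a):
                continue
            c = numerator // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(gcd(a, abs(b)), c) == 1:
                count += 1
        a += 1
    return count

def prime_divisors(n):
    """Distinct prime divisors of |n| by trial division."""
    n, p, found = abs(n), 2, []
    while p * p <= n:
        if n % p == 0:
            found.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        found.append(n)
    return found

def theta(D):
    """theta(D) as in (23.50): product over the primes p dividing D."""
    value = 1.0
    for p in prime_divisors(D):
        value *= (1 + 1 / p) ** -3 / (1 + 2 * sqrt(p) / (p + 1))
    return value

if __name__ == "__main__":
    for D in (-23, -59, -139, -163):
        print(D, class_number(D), theta(D) * log(abs(D)))
```

The output only compares orders of magnitude for tiny $|D|$; the content of Theorem 23.2 is that the inequality holds for all $D$ with one effective constant.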


PROOF. First we handle the case when $p=37$ splits in $\mathbb{Q}(\sqrt D)$; in this case (22.72) implies $h(D)\log 37\ge\log|D/4|$, which is better than (23.49). Therefore, from now on we assume that

$$\chi_D(37)\neq 1.\tag{23.51}$$

We begin with the elliptic curve

$$E_0:\ y^{2}=x^{3}+10x^{2}-20x+8\tag{23.52}$$

which was considered by Gross and Zagier [GZ1]; this is a modular curve of conductor $N_0=37$ and rank $r_0=0$. The corresponding primitive cusp form is

$$f_0(z)=\sum_{n=1}^{\infty}\lambda_0(n)n^{1/2}e(nz)\in S_2(\Gamma_0(N_0))$$

and the associated Hasse-Weil zeta function is $L_{E_0}(s)=L(s-\tfrac12,f_0)$ where

$$L(s,f_0)=\sum_{n=1}^{\infty}\lambda_0(n)n^{-s}.$$

This satisfies the appropriate functional equation with sign $w(f_0)=1$. For the proof of Theorem 23.2 we take the twisted curve

$$E:\ -139y^{2}=x^{3}+10x^{2}-20x+8.\tag{23.53}$$

This is a modular curve of conductor $N=37\cdot139^{2}$ and rank $r=3$. The corresponding primitive cusp form is

$$f(z)=\sum_{n=1}^{\infty}\lambda(n)n^{1/2}e(nz)\in S_2(\Gamma_0(N))$$

with $\lambda(n)=\lambda_0(n)\chi_{-139}(n)$, i.e., $f=f_0\otimes\chi_{-139}$, and the associated Hasse-Weil zeta function is $L_E(s)=L(s-\tfrac12,f)$. The sign of the functional equation for $f$ is

$$w(f)=\chi_{-139}(-1)w(f_0)=\chi_{-139}(-1)=-1$$

according to (23.48). Hence $L(\tfrac12,f)=0$, but Gross and Zagier showed that also $L'(\tfrac12,f)=0$, therefore $L(s,f)$ vanishes at $s=\tfrac12$ to order at least three. (In Appendix 23.A, we sketch the proof of the result of Gross and Zagier.)

Now we examine the twisted form $f\otimes\chi_D\in S_2(\Gamma_0(ND^{2}))$ and its corresponding primitive form $f_{\chi_D}\in S_2(\Gamma_0(M))$ of level $M\mid ND^{2}$.

CASE $139\nmid D$. Then $M=ND^{2}$, $f_{\chi_D}=f\otimes\chi_D=f_0\otimes\chi_{-139D}$, and by virtue of (23.48)

$$w(f_{\chi_D})=\chi_{-139D}(-N_1)\mu(N_2)\lambda_0(N_2)$$

where $N_1N_2=37$ with $(N_1,D)=1$ and $N_2\mid D$. If $N_1=37$ and $N_2=1$, we get

$$w(f_{\chi_D})=\chi_{-139D}(-37)=\chi_{-139}(-37)=-w(f_0)=-1.$$

If $N_1=1$ and $N_2=37$, we get

$$w(f_{\chi_D})=-\chi_{-139D}(-1)\lambda_0(37)=-\lambda_0(37)=-w(f_0)=-1$$

by virtue of (23.47).



CASE $139\mid D$. We write $D=-139C$ where $C$ is a positive fundamental discriminant coprime with 139. Accordingly the field character factors into $\chi_D=\chi_{-139}\chi_C$ and $f\otimes\chi_D=f_0\otimes\chi_{-139}^{2}\otimes\chi_C$. Here the form $f_0\otimes\chi_{-139}^{2}$ is not primitive because the trivial character $\chi_{-139}^{2}$ simply kills the place $p=139$ in its $L$-function; however, it is induced by the primitive form $f_0$. Therefore the primitive form which induces $f\otimes\chi_D$ is $f_{\chi_D}=f_0\otimes\chi_C$; it has the level $M=[37,C^{2}]=37C^{2}/(37,C)$ and the root number

$$w(f_{\chi_D})=\chi_C(-N_1)\mu(N_2)\lambda_0(N_2)$$

by virtue of (23.48), where $N_1N_2=37$ with $(N_1,C)=1$ and $N_2\mid C$. If $N_1=37$ and $N_2=1$, we get $w(f_{\chi_D})=\chi_C(-37)=\chi_{-139D}(-37)=-1$, as in the former case. If $N_1=1$ and $N_2=37$, we get $w(f_{\chi_D})=-\lambda_0(37)=-1$.

Now we conclude that in all cases $w(f)=w(f_{\chi_D})=-1$, whence the $L$-function $L(s)=L(s,f)L(s,f_\chi)$ has root number $w=w(f)w(f_{\chi_D})=1$. Since $L(s)$ vanishes at $s=\tfrac12$ to order $m\ge 3$ (actually $m\ge 4$) we can apply Proposition 23.1 with $g=2$, proving (23.49). $\square$

EXERCISE.

(23.54)

Derive from (23.49) the following effective bound h(D)

»

exp( -(loglog IDI)-1)(1og IDI).

[Hint: Use genus theory if $D$ has many small prime factors.]

23.7. Concluding notes. Our formula (23.41) is not restricted to the $L$-functions associated with elliptic curves, and one hopes to use it more efficiently by employing other $L$-functions which vanish to high order at the central point. An interesting proposition is to try the $L$-function of a cusp form with non-trivial central character, the point being that the leading term in (23.41) has the order of magnitude $L(1,\varepsilon\chi_D)(\log|D|)^{g}$, which for $\varepsilon\neq 1$ is better than the bound $L'(1,\chi_D)(\log|D|)^{g-1}$ deduced from (23.42) when $\varepsilon=1$.

There are several examples of cuspidal $L$-functions other than the Hasse-Weil zeta functions of elliptic curves which vanish at the central point $s=\tfrac12$ to order at least two (but still with the trivial central character, unfortunately), found by F. Rodriguez Villegas [RV]. These are the $L$-functions for Hecke characters of an imaginary quadratic field which are described in Chapter 3 by formulas (3.92)-(3.95). Suppose $\xi$ is such a character of the field $\mathbb{Q}(\sqrt{-q})$ with $q>3$, $q\equiv 3 \pmod 4$ and of weight $\ell\ge 1$, $\ell\equiv 1 \pmod 2$, i.e., $\xi$ is a character which on principal ideals takes

~((a))= C:)(l:l)e

for $\alpha=\tfrac12(m+n\sqrt{-q})$. There are exactly $h(-q)$ characters of this type, each of which yields a primitive cusp form

$$h(z)=\sum_{\mathfrak a}\xi(\mathfrak a)(N\mathfrak a)^{\ell/2}e(zN\mathfrak a)\in S_{\ell+1}(\Gamma_0(q))$$

(see (3.89)), and the associated $L$-function

$$L(s,\xi)=\sum_{\mathfrak a}\xi(\mathfrak a)(N\mathfrak a)^{-s}$$

satisfies the following equation (self-dual)

(see Theorem 3.8) with the root number $w=(-1)^{\frac{\ell-1}{2}}\bigl(\frac{2}{q}\bigr)$. In the Main Theorem of Rodriguez-Villegas [RV] the central value $L(\tfrac12,\xi)$ is expressed as a square of a linear combination of certain theta functions, showing that $L(\tfrac12,\xi)\ge 0$. It is also proved that $L(\tfrac12,\xi)$ is positive if $\ell=1$ (because it is true on average over all characters $\xi$ and these characters are Galois conjugates of each other). For $\ell=3$ the non-vanishing of $L(\tfrac12,\xi)$ is guaranteed only if the class number $h(-q)$ is not a multiple of 3 (this condition is required to show there is only one Galois orbit). Consistently, for $q=59$ we have $h(-59)=3$ and $L(\tfrac12,\xi)=0$. Since the root number is $w=-\bigl(\frac{2}{59}\bigr)=1$ we get three distinct functions $L(s,\xi)$, each of which vanishes at $s=\tfrac12$ to order at least 2, and the corresponding cusp forms have weight $k=\ell+1=4$.

23.A Appendix: The Gross-Zagier $L$-function vanishes to order 3. We will sketch the proof by Gross and Zagier that the Hasse-Weil zeta function of the curve (23.53)

$$-139y^{2}=x^{3}+10x^{2}-20x+8$$

vanishes to order $\ge 3$ at the central critical point. Precisely, we explain the proof of:

THEOREM 23.3. For the above $L$-function, we have $L'(f,\tfrac12)=0$.

Since the sign of the functional equation for $L(f,s)$ is $-1$, this implies that the order of vanishing is at least 3, as desired. The proof is based on a formula of Gross and Zagier for the special value of the derivative of the $L$-function of $E$ (over a suitable auxiliary imaginary quadratic field) in terms of the height of a special point. To state this, we recall some notation.

Let $k$ be a number field and $M_k$ the set of (normalized) absolute values of $k$, denoted $|\cdot|_v$ for $v\in M_k$; those include archimedean as well as non-archimedean absolute values and we have the product formula

$$\prod_{v\in M_k}|x|_v=1\quad\text{for any } x\in k^{*}.$$

For any $x=[x_0:x_1:x_2]$ in the projective plane $\mathbb{P}^{2}(k)$, the absolute logarithmic height of $x$ is

(23.55)

(this is well-defined because of the product formula). If $E\subset\mathbb{P}^{2}$ is an elliptic curve over $k$, then the function $x\mapsto h(x)$ is almost a quadratic form (with respect to the group structure). Denoting

$$h_E(x)=\lim_{n\to+\infty}\frac{h(2^{n}x)}{2^{2n}},$$

Tate and Néron showed that $h_E$ is a quadratic form with $h_E(x)=h(x)+O(1)$, called the canonical height on $E$. We only need the obvious fact that $h_E(0)=0$, which follows because $h(\infty)=1$; see [Sil] for more details.



Let $E/\mathbb{Q}$ be an elliptic curve with conductor $N$. It is modular and a (non-obvious) consequence is that there exist non-constant maps $\pi:X_0(N)\to E$. Among these, there is a unique one such that $\pi(\infty)=0$ and $\pi^{*}(dz)=cf(z)dz$ for some constant $c>0$, where $dz$ is the translation-invariant differential form on $E$ and $f$ is the primitive weight 2 cusp form associated to $E$. The map $\pi$ is then called the modular parameterization of $E$ and in fact it is induced by the holomorphic function $\mathbb{H}\to\mathbb{C}$ defined by

$$\pi(z)=2\pi ic\int_{\infty}^{z}f(w)\,dw.$$

Let $K/\mathbb{Q}$ be any imaginary quadratic field with discriminant $D<0$ such that every prime $p$ dividing $N$ splits in $K$ (in particular, is not ramified). Note that there are infinitely many such fields because the conditions are $\chi_K(p)=1$ for $p\mid N$, which are congruence conditions for $D$. There exist $2^{\omega(N)}$ ideals $\mathfrak n\subset\mathcal O$ such that $\mathcal O/\mathfrak n\simeq\mathbb{Z}/N\mathbb{Z}$ (where $\mathcal O\subset K$ is the ring of integers), those being in one-to-one correspondence with solutions $\beta \pmod{2N}$ to the congruence $\beta^{2}\equiv D \pmod{4N}$. To each ideal class $\mathfrak a$ of $K$ is associated the corresponding complex point $z_{\mathfrak a}\in\mathbb{H}$ (see Section 22.1). One easily shows that there is an $SL(2,\mathbb{Z})$-equivalent point $z_{\mathfrak a}$ such that

$$z_{\mathfrak a}=\frac{-B+\sqrt D}{2A}$$

with $N\mid A$ and $B\equiv\beta \pmod{2N}$. This point is unique modulo $\Gamma_0(N)$. The Heegner point on $E$ associated to $K$ and $\mathfrak n$ is defined to be

$$y_K=\sum_{\mathfrak a}\pi(z_{\mathfrak a})\in E(\mathbb{C}),$$

where the sum runs over all ideal classes of $K$.

THEOREM 23.4 (THE GROSS-ZAGIER FORMULA). Let $E/\mathbb{Q}$ be an elliptic curve with associated cusp form $f$, $K$ an imaginary quadratic field as above with discriminant $D$, $\chi$ the Kronecker character associated to $K$. Let

$$L_K(f,s)=L(f,s)L(f\otimes\chi,s).$$

Then $L_K(f,\tfrac12)=0$ and

$$L_K'(f,\tfrac12)=\frac{8\pi^{2}\|f\|^{2}}{u^{2}(\deg\pi)|D|^{1/2}}\,h(y_K)\tag{23.56}$$

where $\|f\|$ is the Petersson norm of $f\in S_2(N)$, $u\in\{1,2,3\}$ is half the number of units of $K$, and $\deg(\pi)\ge 1$ is the degree of the map $\pi:X_0(N)\to E$.

REMARKS. (1) In the application below, the modularity of $E$ and the existence of $\pi$ will be obvious.

(2) What is implicit (and already quite deep) is that $y_K$ is an algebraic point in $E(\mathbb{C})$, so that its canonical height is defined. This comes from the modular interpretation of $X_0(N)$: points $\tau\in\Gamma_0(N)\backslash\mathbb{H}$ are in one-to-one correspondence with pairs $(E,H)$, where $E$ is an elliptic curve over $\mathbb{C}$ and $H\subset E(\mathbb{C})$ is a cyclic subgroup of order $N$. This bijection is induced by the map $\tau\mapsto(\mathbb{C}/(\mathbb{Z}\oplus\tau\mathbb{Z}),\langle N^{-1}\rangle)$, and the pair corresponding to $z_{\mathfrak a}$ is $(\mathbb{C}/\mathfrak a,\mathfrak n^{-1}\mathfrak a/\mathfrak a)$, where $N_{K/\mathbb{Q}}\mathfrak n=N$. This curve has complex multiplication, hence finitely many Galois conjugates, which implies that $z_{\mathfrak a}$ is defined over $\bar{\mathbb{Q}}$.



More precisely, one shows that $y_K\in E(K)$. The $z_{\mathfrak a}$, where $\mathfrak a$ runs over the ideal classes in $K$, are in fact a single Galois orbit under the Galois group of $K$.

SKETCH OF PROOF OF THEOREM 23.3. The principle is that if $E/\mathbb{Q}$ is an elliptic curve, with associated cusp form $f$, such that the root number of $E$ is $+1$, and if $D<0$ is a fundamental discriminant of an imaginary quadratic field $K/\mathbb{Q}$ such that the twisted curve $E_D/\mathbb{Q}$ has root number $-1$, then one has $L(f\otimes\chi,\tfrac12)=0$. If in addition the Heegner point $y_K\in E(\mathbb{C})$ turns out to have $h(y_K)=0$, then

$$L_K'(f,\tfrac12)=L(f,\tfrac12)L'(f\otimes\chi,\tfrac12)=0.$$

If $L(f,\tfrac12)\neq 0$ (which can be rigorously checked numerically), then it follows that $L'(f\otimes\chi,\tfrac12)=0$.

We now apply this, following [GZ2], to the curve

$$E:\ y^{2}=x^{3}+10x^{2}-20x+8$$

with the imaginary quadratic field $K=\mathbb{Q}(\sqrt{-139})$, so that $E_D$ is indeed the curve (23.53). This curve satisfies the assumptions of Theorem 23.4 since we have $\chi_{-139}(-37)=-1$. We choose the ideal $\mathfrak n=(1+\omega)$ (where $\omega=(1+\sqrt{-139})/2$).

The discriminant of $E$ is $\Delta=2^{12}\cdot 37$ and the $j$-invariant is $j=\frac{2^{15}5^{3}}{37}=\frac{4096000}{37}$. Thus the given Weierstrass equation is not minimal. A corresponding minimal model is the curve

$$E':\ y^{2}+y=x^{3}+x^{2}-3x+1,$$

the coordinate change bringing $E'$ to $E$ being $(x,y)\mapsto(4x-2,8y+4)$. The curve $E'$ has discriminant 37, and since it is squarefree, the conductor of $E$ is 37. We denote by $f$ the weight 2 primitive cusp form of level 37 corresponding to $E$. Write $E(\mathbb{C})=\mathbb{C}/(\mathbb{Z}\oplus\tau\mathbb{Z})$ for some $\tau\in\mathbb{H}$, as given by the theory of elliptic functions. Since the conductor $N=37$ is squarefree, the sign of the functional equation is given by $\varepsilon_E=a_{E'}(37)$ which gives $\varepsilon_E=1$, and numerical checking gives

$$L(f,\tfrac12)=L(E,1)\approx 0.72568106193615278\ldots\neq 0$$

as required. In fact, one can check that $E(\mathbb{Q})\simeq\mathbb{Z}/3\mathbb{Z}$ is generated by $(x,y)=(2,4)$. The ideal class group of $K$ is of order three, generated by $\mathfrak a=[5,\frac{1+\sqrt{-139}}{2}]$. The three Heegner points for the chosen $\mathfrak n$ are given by

$$z_{\mathfrak a}=\frac{-71+\sqrt{-139}}{370},\qquad z_{-\mathfrak a}=\frac{151+\sqrt{-139}}{370},\qquad z_0=\frac{3+\sqrt{-139}}{74}.$$

Denoting by $\pi$ the modular parameterization $X_0(37)\to E$, we need to prove that $\pi(z_{\mathfrak a})+\pi(z_{-\mathfrak a})+\pi(z_0)=0\in E(\mathbb{C})$, or equivalently, that

$$\pi(z_{\mathfrak a})+\pi(z_{-\mathfrak a})+\pi(z_0)\in\mathbb{Z}\oplus\tau\mathbb{Z}.\tag{23.57}$$
Notice that this is a problem stated in analytic terms. To prove it, one uses the following lemma from the theory of elliptic functions:

LEMMA 23.5. Let $\Lambda\subset\mathbb{C}$ be a lattice, and let $f\neq 0$ be a $\Lambda$-periodic meromorphic function on $\mathbb{C}$. Let $s_1,\ldots,s_d$ be a set of representatives of $\Lambda$-equivalence classes of poles and zeros of $f$, with multiplicity. Then we have $s_1+\cdots+s_d\in\Lambda$.

This is the complex-analytic analogue of part of Proposition 11.33, which we proved there for elliptic curves over finite fields. It is in fact only a special case of the Abel-Jacobi Theorem (see [Rey] for instance).

PROOF. Choose a fundamental parallelogram $P\subset\mathbb{C}$ for $\mathbb{C}/\Lambda$ and representatives for the $s_i$ such that $s_i$ is in the interior of $P$ for all $i$. Then consider the integral

$$\frac{1}{2\pi i}\int_{\partial P}z\frac{f'(z)}{f(z)}\,dz=\sum_i s_i$$

by the residue theorem. If $\gamma$ and $\gamma+\lambda_0$ are parallel sides of $P$, with $\lambda_0\in\Lambda$, then (taking into account orientation) the substitution $z\mapsto z-\lambda_0$ on the second one yields by $\Lambda$-periodicity

$$-\frac{1}{2\pi i}\int_{\gamma}z\frac{f'(z)}{f(z)}\,dz+\frac{1}{2\pi i}\int_{\gamma+\lambda_0}z\frac{f'(z)}{f(z)}\,dz=\frac{\lambda_0}{2\pi i}\int_{\gamma}\frac{f'(z)}{f(z)}\,dz\in\lambda_0\mathbb{Z}$$

since the integral is the winding number around 0 of the closed loop $t\mapsto f(\gamma(t))$. Hence the result follows. $\square$

We will exhibit an elliptic function on $\mathbb{C}/(\mathbb{Z}\oplus\tau\mathbb{Z})$ with a pole of order 3 at 0 and simple zeros at the three points $\pi(z_{\mathfrak b})$, thereby proving (23.57) by Lemma 23.5. Start with the meromorphic function $u$ on $X_0(37)$ given by

$$u(z)=\frac{\eta(z)^{2}}{\eta(37z)^{2}}$$

where $\eta$ is the Dedekind eta function given by

$$\eta(z)=e\Bigl(\frac{z}{24}\Bigr)\prod_{k\ge 1}\bigl(1-e(kz)\bigr)$$

(see (22.66)). Letting $q=e(z)$, we have at infinity the expansion

$$u(z)=\frac{q^{1/12}\prod_k(1-q^{k})^{2}}{q^{37/12}\prod_k(1-q^{37k})^{2}}=q^{-3}(1+\cdots),$$

so $u$ has a pole of order 3 at $\infty$. Since $\eta$ doesn't vanish except at infinity, this is the only pole of $u$. Moreover, because $\eta$ is of level 1, the substitution $z\mapsto-1/z$ shows that $u$ has a zero of order 3 at the cusp 0, and no other zeros.

Now consider $v(z)=u(z)-u(z_0)$. It still has a pole of order 3 at infinity, and a zero at $z_0$. This is a simple zero, and $v$ has two other simple zeros at $z_{\mathfrak a}$ and $z_{-\mathfrak a}$. Indeed, because there are as many zeros as poles, it suffices to prove $v(z_{\pm\mathfrak a})=0$, or $u(z_{\pm\mathfrak a})=u(z_0)$, to show this.

Notice that $u(z)^{12}=\Delta(z)\Delta(37z)^{-1}$ where $\Delta$ is the Ramanujan function. For a point $z\in X_0(37)$ corresponding to a pair $(E,H)$ of an elliptic curve and a cyclic subgroup of order $N=37$, as in the remark above, one has $\Delta(z)=\Delta(E)$ and $\Delta(37z)=\Delta(E/H)$. For a Heegner point $z_{\mathfrak b}$, this pair is $(\mathbb{C}/\mathfrak b,\mathfrak n^{-1}\mathfrak b/\mathfrak b)$. It follows that $u(z_0)^{12}=u(z_{\pm\mathfrak a})^{12}$. Then one can check (even numerically) that $u(z_0)=u(z_{\pm\mathfrak a})$ or, more precisely, prove that $u(z_0)=1+\omega\in K$, where $37=N(1+\omega)$, and use the fact from Complex Multiplication theory that if $\sigma$ is the generator of the Galois group of the Hilbert Class Field of $K$ corresponding to $\mathfrak a$, then $u(z_{\pm\mathfrak a})=u(z_0)^{\sigma^{\pm1}}=u(z_0)$.

To conclude, we need to push the function $v$ to $E(\mathbb{C})$. To do this, put

$$w(z)=\prod_{\pi(z')=z}v(z')$$

for $z\in E(\mathbb{C})$. It is easy to check that this is a well-defined meromorphic function on $E(\mathbb{C})$, and it has a pole of order 3 at $\pi(\infty)=0$ and simple zeros at $\pi(z_{\mathfrak b})$. This finishes the proof. $\square$

REMARK. Using programs such as PARI/GP, one can nowadays find explicit equations and formulas for all objects in the above constructions. Some are stated in [MSD], Sec. 5. We thank C. Delaunay for computing the Heegner points. It is easier to deal with the curve

$$F:\ y^{2}+y=x^{3}+x^{2}-23x-50$$

instead of $E$. By [MSD], we have $E\simeq F/\mu_3$, where $\mu_3$ is the group of 3-roots of unity (one has $F[3]\simeq\mu_3\times\mathbb{Z}/3\mathbb{Z}$ as Galois-module), so $F$ is isogenous to $E$; in particular, $E$ and $F$ (and their twists) have identical $L$-functions. The curve $X_0(37)$ is the nonsingular model of the plane curve with equation

$$X_0(37):\ y^{2}=-x^{6}-9x^{4}-11x^{2}+37,$$

and the modular parameterization of $F$ is the map

n(x,y) = (~(37x- 2 - 5), ~(37yx- 3 - 4)). The Hilbert Class Field H of K = IQ( J -139) is the splitting field over K of P = X 3 + 4X 2 + 6X + 1. Denote by ()_ 0 the real root of P, by Bo the root in lHl and Ba = Bo. Then n(zb) E F(H) is given by

n(zb) = (A 1(Bb)

+ A2(Bb)V-139,Bl(Bb) + B2(Bb)V-139)

with

1 A I = --(1152X 2 + 6398X + 13469 412 2 ) 1 A 2 = - --(32579X 2 + 53474X + 13881 2 ) 139 . 4!2 B1 = ~ 3 ( -39658X 2 + 225512X + 84810)

4

~ (2662280X 2 + 6561700X + 1270577). 139 413 One can then check "by hand" that n(z 0 ) + n(z-a) + n(zo) = 0 in F(C). (This means that the three points n(zb) are on a line in II'2). Note, however, that n(zo) is of infinite order (in F( H)) because the class number of IQ( J -139) is larger than the degree (= 2) of the modular parameterization n (this is a theorem of Nakazato B2

[Nak]).

= -


CHAPTER 24

THE CRITICAL ZEROS OF THE RIEMANN ZETA FUNCTION

The zeros $\rho=\tfrac12+i\gamma$ of $\zeta(s)$ on the critical line are called the critical zeros. Let $N_0(T)$ denote the number of critical zeros with $0<\gamma\le T$. Recall that $N(T)$ denotes the number of all zeros $\rho=\beta+i\gamma$ with $0<\beta<1$ and $0<\gamma\le T$, and that we have (see Theorem 5.24)

$$N(T)=\frac{T}{2\pi}\log\frac{T}{2\pi e}+O(\log T).\tag{24.1}$$

The Riemann Hypothesis asserts that $N_0(T)=N(T)$. By the density theorem (Theorem 10.1) we know that almost all zeros are near the critical line. Riemann showed by hand computations that there are zeros exactly on the line $\operatorname{Re}s=\tfrac12$, the first critical zero being $\rho_1=\tfrac12+i\gamma_1$ with $\gamma_1=14.13\ldots$. Using modern computers it has been checked that millions of the first zeros of $\zeta(s)$ are on the line $\operatorname{Re}s=\tfrac12$ and they all are simple (see e.g. [LRW]).

24.1. A lower bound for $N_0(T)$. In 1914, G. H. Hardy proved that $\zeta(s)$ has infinitely many zeros on the line $\operatorname{Re}s=\tfrac12$, and seven years later he showed jointly with J. E. Littlewood [HL2] that

$$N_0(T)\gg T\tag{24.2}$$

for all large $T$. In this section we give a proof of this result. Then in the next section we improve this bound to the following estimate of A. Selberg,

$$N_0(T)\gg T\log T,\tag{24.3}$$

I(t) =

(24.6)

J(t)

where~

(24.7)

=

i i

t+~

f(u)du,

t

t+~

t

if(u)idu

is a fixed positive number. Clearly II(t)l :( J(t). If

II(t)l < J(t), 547

548

24.

THE CRITICAL ZEROS OF THE RIEMANN ZETA FUNCTION

then f(u) must change sign in the interval (t, t +~),hence f(u) must have a zero in (t, t +~),so has((~+ iu) because g(s) does not vanish anywhere. Therefore our problem reduces to showing that (24.7) occurs quite often. To this end we are going to prove a lower bound for I(t) and an upper bound for J(t) on average over the segment [T, 2T]. ~,
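The criterion (24.7) is elementary and can be tried out on any real continuous function. The sketch below is a toy illustration under stated assumptions: it uses $\cos u$ in place of the function $f(u)$ built from $\zeta(\tfrac12+iu)$ (evaluating zeta would need a library outside the standard one), approximates both integrals with the midpoint rule, and the function names and the tolerance are hypothetical choices.

```python
import math

def integrals(f, t, delta, steps=10000):
    """Midpoint-rule approximations to I(t) and J(t) over the interval (t, t + delta)."""
    h = delta / steps
    total = 0.0
    total_abs = 0.0
    for i in range(steps):
        value = f(t + (i + 0.5) * h)
        total += value * h
        total_abs += abs(value) * h
    return total, total_abs

def detects_sign_change(f, t, delta, tol=1e-9):
    """True when |I(t)| < J(t) up to tol, which forces a sign change in (t, t + delta)."""
    I, J = integrals(f, t, delta)
    return abs(I) < J - tol

if __name__ == "__main__":
    # cos changes sign at pi/2, which lies in (1, 2) but not in (0, 1)
    print(detects_sign_change(math.cos, 1.0, 1.0))
    print(detects_sign_change(math.cos, 0.0, 1.0))
```

When the function keeps one sign on the interval, the two sums agree term by term and the test reports no sign change; the strict inequality appears exactly when cancellation occurs inside the integral.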

LEMMA 24.1. Let $\Delta \ge 1$. There exists a function $K(t)$, which also depends on $\Delta$, such that

(24.8)  $J(t) \ge \Delta - K(t),$

(24.9)  $\int_T^{2T} |K(t)|^2\,dt \ll T,$

if $T \ge \Delta^2$, where the implied constant is absolute.

LEMMA 24.2. Let $\Delta \ge 1$. We have

(24.10)  $\int_T^{2T} |I(t)|^2\,dt \ll \Delta T$

if $T \ge \Delta^6$, where the implied constant is absolute.

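By Lemmas 24.1 and 24.2 the estimate of Hardy-Littlewood can be deduced as follows. Let $\mathcal{T}$ be the subset of $[T, 2T]$ where $|I(t)| = J(t)$, thus

$\int_{\mathcal T} |I(t)|\,dt = \int_{\mathcal T} J(t)\,dt.$

We get by Cauchy's inequality and by Lemma 24.2

$\int_{\mathcal T} |I(t)|\,dt \le \int_T^{2T} |I(t)|\,dt \ll \Delta^{1/2} T.$

On the other hand, we get by Lemma 24.1 and Cauchy's inequality

$\int_{\mathcal T} J(t)\,dt \ge \Delta|\mathcal T| - \int_{\mathcal T} K(t)\,dt = \Delta|\mathcal T| + O(|\mathcal T|^{1/2}T^{1/2}).$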

Combining the above bounds we deduce that the measure of $\mathcal T$ satisfies $|\mathcal T| \ll \Delta^{-1/2}T$ where the implied constant is absolute. For $\Delta$ sufficiently large this gives $|\mathcal T| \le \frac12 T$. In other words, the set $S = [T, 2T]\setminus\mathcal T$ on which (24.7) holds has measure $|S| \ge \frac12 T$. Clearly $S$ contains a sequence $\{t_1, \ldots, t_R\}$ of $\Delta$-spaced points of length $R \ge \Delta^{-1}|S| \ge T/2\Delta$. For each $t_r$, $1 \le r \le R$, there is a sign change of $f(u)$ in $t_r < u < t_r + \Delta$, hence there is a critical zero $\rho_r = \frac12 + i\gamma_r$ with $t_r < \gamma_r < t_r + \Delta$. This proves that $N_0(T) \gg T/\Delta$ which is (24.2). It remains to prove the two lemmas.

PROOF OF LEMMA 24.1. We have

$J(t) = \int_0^{\Delta} |\zeta(\tfrac12 + it + iu)|\,du \ge \Big|\int_0^{\Delta} \zeta(\tfrac12 + it + iu)\,du\Big| \ge \Delta - \Big|\int_0^{\Delta} \big(\zeta(\tfrac12 + it + iu) - 1\big)\,du\Big| = \Delta - K(t),$


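say. To estimate $K(t)$ we use the approximation

(24.11)  $\zeta(s) = \sum_{1 \le n \le T} n^{-s} + O(T^{-1/2})$

which is valid for $s = \frac12 + it$ with $T < t < 3T$ (see (8.3)). Hence

$K(t) = \Big|\sum_{1 < n \le T} n^{-1/2 - it}\,\frac{1 - n^{-i\Delta}}{\log n}\Big| + O(\Delta T^{-1/2}),$

and by Theorem 9.1 we conclude that

$\int_T^{2T} |K(t)|^2\,dt \ll T\sum_{1 < n \le T} n^{-1}(\log n)^{-2} + \Delta^2 \ll T.$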
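□

PROOF OF LEMMA 24.2. Using the convexity bound $\zeta(s) \ll |s|^{1/4}$ we arrange the integral of (24.10) as follows

$I = \int_T^{2T} |I(t)|^2\,dt = \int_T^{2T} \Big|\int_0^{\Delta} f(t+u)\,du\Big|^2 dt$

$\quad = \int_0^{\Delta}\!\!\int_0^{\Delta}\!\!\int_T^{2T} f(t+u_1) f(t+u_2)\,dt\,du_1\,du_2$

$\quad = \int_0^{\Delta}\!\!\int_0^{\Delta}\!\!\int_T^{2T} f(t) f(t+u_2-u_1)\,dt\,du_1\,du_2 + O(\Delta^3 T^{1/2})$

$\quad = \int_{-\Delta}^{\Delta} (\Delta - |u|)\int_T^{2T} f(t) f(t+u)\,dt\,du + O(\Delta^3 T^{1/2}).$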

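Now we approximate $f(t)f(t+u)$ by a Dirichlet polynomial. First by Stirling's formula

$\Gamma(a + it) = (2\pi)^{1/2}(it)^{a - 1/2}(t/e)^{it}e^{-\pi t/2}\big(1 + O(t^{-1})\big)$

for $a > 0$, $t > 0$ we find that

$\dfrac{g(\frac12 + it)\,\overline{g(\frac12 + it + iu)}}{|g(\frac12 + it)\,g(\frac12 + it + iu)|} = \Big(\dfrac{t}{2\pi}\Big)^{-iu/2}\Big(1 + O\Big(\dfrac{u^2 + 1}{t}\Big)\Big).$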

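Then by (24.11) and the convexity bound $\zeta(s) \ll |s|^{1/4}$ we get

$f(t)f(t+u) = \sum\sum_{1 \le m, n \le T} (mn)^{-1/2}\Big(\frac{m}{n}\Big)^{it}\Big(\frac{2\pi m^2}{t}\Big)^{iu/2} + O(\Delta^2 T^{-1/2}).$

Hence

$I = \sum\sum_{1 \le m, n \le T} c(m,n)(mn)^{-1/2} + O(\Delta^4 T^{1/2})$

where

$c(m,n) = \int_{-\Delta}^{\Delta} (\Delta - |u|)\int_T^{2T}\Big(\frac{m}{n}\Big)^{it}\Big(\frac{2\pi m^2}{t}\Big)^{iu/2} dt\,du = \Delta^2\int_T^{2T}\Big(\frac{m}{n}\Big)^{it}\chi\Big(\frac{\Delta}{4}\log\frac{2\pi m^2}{t}\Big)\,dt$

with $\chi(x) = x^{-2}(\sin x)^2$. If $m \ne n$, we integrate by parts getting

$c(m,n)\log\frac{m}{n} \ll \Big(\frac{\sin\frac{\Delta}{4}\log(\pi m^2/T)}{\log(\pi m^2/T)}\Big)^2 + \Big(\frac{\sin\frac{\Delta}{4}\log(2\pi m^2/T)}{\log(2\pi m^2/T)}\Big)^2 + \int_T^{2T}\Big|d\Big(\frac{\sin\frac{\Delta}{4}\log(2\pi m^2/t)}{\log(2\pi m^2/t)}\Big)^2\Big|$

$\quad \ll \min\Big\{\Delta^2,\ \Big|\log\frac{\pi m^2}{T}\Big|^{-2} + \Big|\log\frac{2\pi m^2}{T}\Big|^{-2}\Big\}.$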
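The passage from the double integral to the kernel $\chi$ in $c(m,n)$ rests on the identity $\int_{-\Delta}^{\Delta}(\Delta - |u|)e^{iuy}\,du = \Delta^2\chi(\Delta y/2)$, applied with $y = \frac12\log(2\pi m^2/t)$. The following sketch checks this identity numerically; the helper names and the midpoint quadrature are illustrative choices, not part of the text.

```python
import math

def chi(x: float) -> float:
    # chi(x) = (sin x / x)^2, extended by continuity at x = 0.
    return 1.0 if x == 0 else (math.sin(x) / x) ** 2

def fejer_integral(delta: float, y: float, n: int = 20000) -> float:
    # Midpoint rule for the integral of (delta - |u|) cos(u y) over [-delta, delta].
    # The sine part of exp(i u y) integrates to zero by symmetry.
    h = 2 * delta / n
    total = 0.0
    for k in range(n):
        u = -delta + (k + 0.5) * h
        total += (delta - abs(u)) * math.cos(u * y)
    return total * h

def closed_form(delta: float, y: float) -> float:
    return delta ** 2 * chi(delta * y / 2)

for delta, y in [(1.0, 0.3), (2.0, 1.7), (3.0, -4.0)]:
    print(delta, y, fejer_integral(delta, y), closed_form(delta, y))
```

The two columns should agree to quadrature accuracy, which is the reason $c(m,n)$ decays like $|\log(2\pi m^2/t)|^{-2}$ away from $m^2 \approx t/2\pi$.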

If $m = n$, it is easier to integrate over $t$ first and then over $u$ getting

$c(m,m) = 2T\int_{-\Delta}^{\Delta} (\Delta - |u|)\,\frac{2 - 2^{iu/2}}{2 - iu}\Big(\frac{\pi m^2}{T}\Big)^{iu/2} du \ll \Delta T\min\Big\{1,\ \Big|\log\frac{\pi m^2}{T}\Big|^{-1} + \Big|\log\frac{2\pi m^2}{T}\Big|^{-1}\Big\}.$

Using these estimates one derives that

$\sum\sum_{1 \le m, n \le T} c(m,n)(mn)^{-1/2} \ll \Delta T$

which completes the proof of (24.10). □

24.2. A positive proportion of critical zeros.

In 1942, A. Selberg modified the arguments of Hardy and Littlewood showing that a positive proportion of zeros of $\zeta(s)$ lies on the line $\operatorname{Re} s = \frac12$. He did not give the percentage number. A very strong result is due to B. Conrey [Co2], namely that

(24.12)  $N_0(T) > \frac25 N(T)$

for sufficiently large $T$. Conrey's approach is based on Levinson's method (see [Lev]) which is quite different from the one used by Selberg. Selberg's refinement of the Hardy-Littlewood arguments appears in mollification of $\zeta(s)$ by a Dirichlet polynomial, its role being to diminish the contribution of large values of $\zeta(s)$. This idea was first exercised by Bohr and Landau [BL], however, less effectively than in the hands of A. Selberg [S8]. Selberg considers

(24.13)  $f(u) = \dfrac{g(\frac12 + iu)}{|g(\frac12 + iu)|}\,\zeta(\tfrac12 + iu)\,|\varphi(\tfrac12 + iu)|^2$

where

(24.14)  $\varphi(s) = \sum_{d \le D} h\Big(\frac{\log d}{\log D}\Big)\gamma_d\, d^{-s}$

and $h(x)$ is a real, continuous function on $[0, 1]$ such that

(24.15)  $h(x) = 1 + O(x),$

(24.16)  $h(x) \ll 1 - x,$

while the coefficients $\gamma_d$ have arithmetical nature (they are also real and bounded). Clearly $f(u)$ is real and even. The reason that l
="'

W

(24.17) We have for s > 1,

1)12 ( 1-p.

~

(

1-p.1)-l ~ ( 1-p.1)-1 , 2

whence $|\gamma_d| \le |\tilde\gamma_d| \le 1$, where $\tilde\gamma_d$ are the coefficients of

(24.18)  $\zeta(s)^{1/2} = \prod_p (1 - p^{-s})^{-1/2} = \sum_d \tilde\gamma_d\, d^{-s}.$

As in the Hardy-Littlewood proof we shall compare the integrals

(24.19)  $I(t) = \int_t^{t+\Delta} f(u)\,du,$

(24.20)  $J(t) = \int_t^{t+\Delta} |f(u)|\,du$

in the range $0 \le t \le T$. Due to the mollification, we shall be able to work with smaller $\Delta$,

(24.21)

where the implied constant is absolute. We shall prove three estimates.

LEMMA 24.3. Let $(\log T)^{-1} \le \Delta \le 1$ and $D = T^{\theta}$ with $0 < \theta \le$ fo. Then we have

(24.22)  $\int_0^T |f(t)|\,dt \gg T,$

(24.23)  $\int_0^T |f(t)|^2\,dt \ll T,$

(24.24)  $\int_0^T |I(t)|^2\,dt \ll \Delta T(\log T)^{-1},$

where the implied constant depends only on $\theta$.

From these estimates we derive Selberg's bound (24.3) as follows. Let $\mathcal E$ be the subset of $[0, T]$ in which $|I(t)| < J(t)$. We obtain

$A = \int_{\mathcal E} J(t)\,dt \ge \int_{\mathcal E} \big(J(t) - |I(t)|\big)\,dt = \int_0^T \big(J(t) - |I(t)|\big)\,dt = B - C$

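say. By the Cauchy-Schwarz inequality and by (24.23) we derive

$A^2 \le |\mathcal E|\int_0^T J^2(t)\,dt = |\mathcal E|\int_0^T\Big(\int_0^{\Delta} |f(t+v)|\,dv\Big)^2 dt \le \Delta|\mathcal E|\int_0^T\!\!\int_0^{\Delta} |f(t+v)|^2\,dv\,dt \le \Delta^2|\mathcal E|\int_0^{T+\Delta} |f(t)|^2\,dt \ll \Delta^2|\mathcal E|T.$

On the other hand, we have by (24.22)

$B = \int_0^T\!\!\int_t^{t+\Delta} |f(u)|\,du\,dt \ge \Delta\int_{\Delta}^T |f(u)|\,du \gg \Delta T,$

and by (24.24)

$C^2 \le T\int_0^T |I(t)|^2\,dt \ll \Delta T^2(\log T)^{-1}.$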

Combining the above estimates we obtain $\Delta T \ll \Delta^{1/2}T(\log T)^{-1/2} + \Delta|\mathcal E|^{1/2}T^{1/2}$, whence $|\mathcal E| \gg T$, provided $\Delta\log T$ is sufficiently large. The set $\mathcal E$ contains a sequence of points $\{t_1, \ldots, t_R\}$ which are $\Delta$-spaced of length $R \ge \Delta^{-1}|\mathcal E| \gg T\log T$. For each $t_r$ there is a sign change of $f(u)$ in $(t_r, t_r + \Delta)$, so there is a zero of $\zeta(\frac12 + iu)$ in this interval because l
{T lf(t)Jdt;?

lo

Ilr;2 { ((% + it)

By the approximation (24.11) and the trivial bound
«

D~ for Re(s) =!we

n(N

where a 1 = 1 and lanl ~ T3(n) for n ~ N = D 2 T. Assuming D ~ T~(logT)- 1 we obtain

T 1 ((s)

1 T

L

T T o(--) ;? log T 3

if $T$ is sufficiently large, which proves (24.22). The other two estimates (24.23) and (24.24) are much harder to establish because we have to appeal to the nature of the mollifier

(24.25)  $\int_0^T \zeta(s_1 + it)\zeta(s_2 - it)\,a^{s_1 + it}b^{s_2 - it}\,dt = T P_v\Big(\frac{T}{2\pi ab}\Big) + O\big((ab)^{3/2}T^{8/9}(\log T)^6\big),$

where $v = v_2 - v_1$ and $P_v(X)$ is defined by

(24.26)  $P_v(X) = \zeta(1 - iv) + \zeta(1 + iv)\,\dfrac{X^{iv}}{1 + iv}.$


The implied constant in (24.25) is absolute. Notice that by the Laurent expansion of $\zeta(1 + iv)$ we get

(24.27)  $P_v(X) = \dfrac{X^{iv} - 1}{iv} + 2\gamma - 1 + O(|v|)$

which is quite good if $|v|$ is small. In particular, we have

(24.28)  $P_0(X) = \log X + 2\gamma - 1.$
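The limit $v \to 0$ behind (24.28) can be checked numerically. The sketch below is illustrative only: `zeta` is a hypothetical helper based on a short Euler-Maclaurin expansion (adequate for moderate $|s|$, $s \ne 1$), and `P` evaluates the function $P_v(X) = \zeta(1-iv) + \zeta(1+iv)X^{iv}/(1+iv)$; the two poles at $v = 0$ cancel and the value approaches $\log X + 2\gamma - 1$.

```python
import math

EULER_GAMMA = 0.5772156649015329

def zeta(s, n_terms: int = 30):
    # Euler-Maclaurin approximation to zeta(s) with three Bernoulli corrections.
    N = n_terms
    total = sum(n ** (-s) for n in range(1, N))
    total += N ** (1 - s) / (s - 1) + 0.5 * N ** (-s)
    total += s * N ** (-s - 1) / 12                                            # B_2 / 2!
    total -= s * (s + 1) * (s + 2) * N ** (-s - 3) / 720                       # B_4 / 4!
    total += s * (s + 1) * (s + 2) * (s + 3) * (s + 4) * N ** (-s - 5) / 30240  # B_6 / 6!
    return total

def P(v: float, X: float) -> complex:
    # P_v(X) for v != 0.
    s = 1j * v
    return zeta(1 - s) + zeta(1 + s) * X ** s / (1 + s)

def P0(X: float) -> float:
    # The limiting value (24.28).
    return math.log(X) + 2 * EULER_GAMMA - 1

for v in (1e-2, 1e-3, 1e-4):
    print(v, P(v, 50.0), P0(50.0))
```

The discrepancy shrinks roughly linearly in $v$, in line with the $O(|v|)$ term of (24.27) for fixed $X$.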

COROLLARY 24.5. If $(a, b) = 1$, we have

(24.29)  $\int_0^T |\zeta(\tfrac12 + it)|^2\Big(\frac{a}{b}\Big)^{it} dt = \frac{T}{\sqrt{ab}}\Big(\log\frac{T}{2\pi ab} + 2\gamma - 1\Big) + O\big(abT^{8/9}(\log T)^6\big)$

where the implied constant is absolute.

We begin the proof of Theorem 24.4 by evaluating the following partial integrals

(24.30)

for $T \le T_1 \le T_2 \le 2T$. Changing the variable $t \to t - v_1$ we get

2

S = (ab)-! biv

£~ ((~+it)((~+ iv +it) (~tdt + E,

where $E$ is the non-overlapping part of the integral, and it satisfies

(24.31)

by applying the convexity bound $\zeta(s) \ll |s|^{1/4}$ on the line $\operatorname{Re} s = \frac12$. Next we approximate $\zeta(s)$ by its partial sum

(24.32)  $\zeta_X(s) = \sum_{n \le X} n^{-s}.$

Recall that for $\operatorname{Re} s = \sigma \ge \frac12$ and $|s| \le \pi X$ we have

(24.33)  $\zeta(s) = \zeta_X(s) - \dfrac{X^{1-s}}{1 - s} + O(X^{-\sigma}).$

In particular, for $s$ with $\operatorname{Re} s = \frac12$ and $T \ll |s| \le \pi T$ we have

(24.34)  $\zeta(s) = \zeta_T(s) + O(T^{-1/2}).$

Hence

where $E^{*}$ accounts for the contribution of the error term in (24.34), thus it satisfies

(24.35)

by virtue of the following estimate (obtained from (7.52) by Cauchy's inequality)

$\int_0^{3T} |\zeta(\tfrac12 + it)|\,dt \ll T(\log T)^{1/2}.$
Inserting the Dirichlet polynomials (24.32) we get

(24.36)

First we pull out the terms on the diagonal $am = bn$. Since $a$, $b$ are co-prime it follows that $m = b\ell$ and $n = a\ell$ with $\ell \le L = Tc^{-1}$ where $c = \max(a, b)$. Hence these terms contribute

Introducing the approximation (24.33) we arrive at

(24.37)  $S_0 = (T_2 - T_1)(ab)^{-1 + iv}\Big(\zeta(1 - iv) + \dfrac{L^{iv}}{iv}\Big) + O(1).$

Now we consider the contribution to the integral (24.36) of the terms $am \ne bn$, say $S^{*}$. By integration we get

(24.38)

where

(24.39)

We shall show that the main contribution to $S(t)$ comes from the terms near the diagonal, i.e., from $m$, $n$ with $1 - \varepsilon < am/bn < 1 + \varepsilon$ for some small $\varepsilon$ to be chosen later. Put (24.40)

S,(t) = i

·t

L.. L..

"t( log/;:;;: am)-! .

1 1 (am)-,--• (bn)-,-+w+•

"'"'

O
For the remaining terms we have I log ~';I > ~, and since log x is monotonic, we can apply partial summation to estimate the contribution of these terms by

E- 1 (ab)-1(1ogT) 2 i(x(~ +it)(y(~ -iv- it) I for some x,y with 1 ,;;; x,y ,;;; T. Now by Weyl's subconvexity bound (x(s) « jsjt(log3jsj) 2 , which is valid for s with Res= ~ and x ,;;; 4jsj 2 (see (8.22)), it follows that (24.41) For the terms in SE(t) we write am= bn + h with 0 < Ihi < Ebn, and use the approximations am) -I =1 + 0 ( -1 ) (abmn)-'1 ( logbn h bn bm)it = (1 ( am getting

SE(t) =

+~)-it= e-ithjbn + o(!!!:__) 2 bn

b2n

LL

ih-l(bn)ive-ithjbn +EJ

am-bn=h

O


where E 1 accounts for the contribution of the above error terms. We have

I

E1

«

(1

LL

+ c:t)

(bn)-

1

lam-bnl<
Clearly we can interchange a and b (see the first inequality), so combining both estimates we derive a symmetric bound (24.42) provided ET) logT, which condition we henceforth assume. Now we write S<(t) as

N1
where

H+ = min{EbT, E(1 +E)- 1 aT} HN1

= -min{EbT, E(1-E)- 1 aT} = lhl/cb, N2 = min{T, (aT- h)b- 1 }.

For the inner sum we apply the following formula

L

1 {N2 e(f(n))

=-

a

N1
J~

e(f(x))dx+0(1)

N1

which holds true for any smooth function f such that If'! ,;;; (2a)- 1 and f" f 0 in the interval [N1 , N 2 ] with N 1 < N2 (this follows from Proposition 8.7). Assuming (24.43) we obtain

I:

N1
Here we replace N 2 by Tmin{1,ajb} making an error O(lhlfab). Then we change the variable x into xtj2-rrb getting

L N,
!hi 1 ( ) l+iv lct/2nlhl e(-hx)x-iv-2dx+0(-+1)

(bn)ive-ithfbn=- ___!__ ab 2-rr

tf2ndT

n=-hb(mod a)

where d = min{a, b}. Inserting this into S<(t) we get

ab



where (24.44)

E2

L

«

cb +

k) « (aW~cT

+ logT.

0
The inner sum over $h$ in the above integral is a partial sum in the Fourier expansion for $-\psi(x) = [x] - x + \frac12$. Applying the approximation

-1/>(x) = -H1
h#O

where H = min{H1, H2} (compare this with (4.18)) we get

where E3

«

t -b a

1= ( + (1 1

t/21rdT 1

«

1

llxll)- x- 2 dx ct-

1 -b Ea

1= (llxll +-

=1 -

Eab

X

=

(

Et)) x- 2 dx+ Lk- 1 log 1+-k

tj27rdT

x )-1dx -

tj21rdT

«

1

Ef

X

1 -b(d+logT). Ea

We simplify this bound to (24.45) Finally we use the formula (24.46)

1="'·( )

'!-'XX

y

-2-wd - ((1 + w) X1+w

- y-w - + 1-y-1-w -w

21+w

which holds for 0 < y < 1 andRew> -1, getting

(.!__)iv ((11 ++ .iv) _ _t_b (dT)iv :rr_ (dT)l+iv . + b 1 . + E1 + E2 + E3.

t _ _t_ S, ( ) - ab 21r

zv

a

w

a

+w

Now adding up the relevant formulas we obtain

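$S = (ab)^{-1 + iv}\Big[T_2 P_v\Big(\frac{T_2}{2\pi ab}\Big) - T_1 P_v\Big(\frac{T_1}{2\pi ab}\Big)\Big] + R$

where $P_v(X)$ is defined by (24.26) and $R$ is the total error term. By (24.31), (24.35), (24.37), (24.42)-(24.45) we get

$R \ll (ab)^{-1/2}\big(T^{1/2} + \varepsilon^{-1}T^{1/3} + \varepsilon^2T^2\big)(\log T)^6 + (\log T)^2$

provided $\varepsilon T \ge \log T$ and $2\varepsilon^2 abT \le 1$. We choose $\varepsilon = T^{-5/9}$ getting

$R \ll (ab)^{-1/2}T^{8/9}(\log T)^6$

provided $2ab \le T^{1/9}$. We also have the trivial bound

$S \ll (ab)^{-1/2}\int_0^T |\zeta(\tfrac12 + it)|^2\,dt \ll (ab)^{-1/2}T\log T.$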

Combining both results we get rid of the condition $2ab \le T^{1/9}$ claiming a weaker result

This yields (24.25) by adding the results for dyadic points $T_j$.

REMARKS. The error terms in (24.25) and (24.29) could be improved substantially. For example, if $a = b = 1$, our Corollary 24.5 yields

(24.47)  $\int_0^T |\zeta(\tfrac12 + it)|^2\,dt = T\Big(\log\frac{T}{2\pi} + 2\gamma - 1\Big) + E(T)$

with $E(T) \ll T^{8/9}\log^6 T$, while it is known that $E(T) \ll T^{7/22 + \varepsilon}$. It is conjectured that (24.47) holds with $E(T) \ll T^{1/4 + \varepsilon}$ (see [Iv]).

Theorem 24.4 can be easily generalized to integrals of type

$G_v(T) = \int_0^T \zeta(\tfrac12 + it)\zeta(\tfrac12 - it - iv)\Big(\frac{a}{b}\Big)^{it} g(t)\,dt$

where $g(t)$ is a smooth function on $[0, T]$. For $g(t) = 1$ we get by (24.25)

$G_v(T) = \dfrac{T b^{iv}}{\sqrt{ab}}\Big\{\zeta(1 - iv) + \dfrac{\zeta(1 + iv)}{1 + iv}\Big(\dfrac{T}{2\pi ab}\Big)^{iv}\Big\} + O\big(abT^{8/9}(\log T)^6\big)$

if $(a, b) = 1$, $|v| \le 1$ and $T \ge 2$, the implied constant being absolute. Hence we derive by partial integration that in general

$G_v(T) = \dfrac{b^{iv}}{\sqrt{ab}}\int_0^T g(t)\Big(\zeta(1 - iv) + \zeta(1 + iv)\Big(\dfrac{t}{2\pi ab}\Big)^{iv}\Big)dt + O\big(abGT^{8/9}(\log T)^6\big)$

where

{T

G=jg(T)I+ Jo

tdt jg'(t)it+ . 1

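For $g(t) = (2\pi/t)^{iv/2}$ this yields

COROLLARY 24.6. Let $(a, b) = 1$, $|v| \le 1$ and $T \ge 2$. Then we have

(24.48)  $\int_0^T \zeta(\tfrac12 + it)\zeta(\tfrac12 - it - iv)\Big(\frac{a}{b}\Big)^{it}\Big(\frac{2\pi}{t}\Big)^{iv/2} dt = \dfrac{2T}{\sqrt{ab}}\Big\{\dfrac{\zeta(1 + iv)}{2 + iv}\Big(\dfrac{T}{2\pi a^2}\Big)^{iv/2} + \dfrac{\zeta(1 - iv)}{2 - iv}\Big(\dfrac{T}{2\pi b^2}\Big)^{-iv/2}\Big\} + O\big(abT^{8/9}(\log T)^7\big)$

where the implied constant is absolute.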

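Now we proceed to the proof of (24.23). Let $\alpha_\ell$ be the coefficients of

$\varphi^2(s) = \Big(\sum_{d \le D} \beta_d d^{-s}\Big)^2 = \sum_{\ell \le D^2} \alpha_\ell\,\ell^{-s}$

where $\beta_d = \gamma_d h(\log d/\log D)$, thus $\alpha_\ell \ll \tau(\ell)$. We have by Corollary 24.5

$\int_0^T |f(t)|^2\,dt = \int_0^T |\zeta(\tfrac12 + it)\varphi^2(\tfrac12 + it)|^2\,dt = \sum_a\sum_b \alpha_a\alpha_b(ab)^{-1/2}\int_0^T |\zeta(\tfrac12 + it)|^2\Big(\frac{a}{b}\Big)^{it} dt$

$\quad = AT\Big(\log\frac{T}{2\pi} + 2\gamma - 1\Big) - BT + O\big(D^8T^{8/9}(\log T)^6\big)$

where

(24.49)  $A = \sum_d \sum\sum_{(a,b)=1} \alpha_{ad}\alpha_{bd}(abd)^{-1},$

(24.50)  $B = \sum_d \sum\sum_{(a,b)=1} \alpha_{ad}\alpha_{bd}(abd)^{-1}\log ab.$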
Therefore it suffices to prove that

(24.51)  $A \ll (\log D)^{-1},$

(24.52)  $B \ll 1.$

First we treat $A$ in considerable detail and then we modify the arguments to reduce the case of $B$ to that of $A$. We begin by diagonalizing the quadratic form $A$ as follows

f f

1 2 1 A= Ld- LJ.L(8)8- (Laaoda-

o

d

= L d

d-

1

a

(L 11(8)8- 1) (L C
a

= L

d

where a=O(mod d)

a=O(mod d)

1 Ad= LL(8182)- Ad(81)Ad(82), o,o2\d= d\8!02

where (k,d)=l From now on we take h(x) = 1- x, so that

(24.53)  $\beta_d = \gamma_d\,\dfrac{\log^{+}(D/d)}{\log D}.$

These are the original coefficients of Selberg, however one could choose $h(x)$ almost arbitrarily subject to the conditions (24.15) and (24.16). In this particular case we have a simple Mellin integral representation

(24.54)  $\log^{+} x = \dfrac{1}{2\pi i}\int_{(\varepsilon)} x^{s} s^{-2}\,ds.$
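The representation (24.54) can be tested numerically by truncating the vertical line of integration. The sketch below is an illustration under assumptions: the abscissa `eps`, the truncation height `R` and the step `h` are arbitrary choices, and the truncation introduces an error of size at most about $x^{\varepsilon}/(\pi R)$.

```python
import cmath
import math

def log_plus_via_mellin(x: float, eps: float = 1.0, R: float = 400.0, h: float = 0.01) -> float:
    # Midpoint rule for (1 / (2 pi i)) * integral of x^s / s^2 over Re s = eps, |Im s| <= R.
    # With s = eps + i t we have ds = i dt, and the imaginary part cancels by symmetry.
    n = int(round(2 * R / h))
    log_x = math.log(x)
    total = 0.0
    for k in range(n):
        t = -R + (k + 0.5) * h
        s = complex(eps, t)
        total += (cmath.exp(s * log_x) / (s * s)).real
    return total * h / (2 * math.pi)

print(log_plus_via_mellin(math.e ** 2))  # close to log(e^2) = 2
print(log_plus_via_mellin(0.5))          # close to 0, since log+ vanishes for x < 1
```

For $x > 1$ the value comes from the double pole of $x^s s^{-2}$ at $s = 0$, whose residue is $\log x$; for $x < 1$ the contour can be pushed to the right and the integral vanishes.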

Now Ad(8)logD =

L /okk- 1 log+ (k,d)=l

~

= ro

2 ~i 1 Z(s + 1)(D/8)•s- ds 2

(<)

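where $Z(s)$ is the corresponding zeta function

$Z(s) = \sum_{(k,d)=1} \gamma_k k^{-s} = \prod_{p \nmid d} (1 - p^{-s})^{1/2} = \zeta_d(s)^{1/2}\zeta(s)^{-1/2}$

with

$\zeta_d(s) = \prod_{p \mid d} (1 - p^{-s})^{-1}.$

We have $\zeta_d(s + 1) \ll \zeta_d(1) = d/\varphi(d)$, $\zeta(s + 1) \gg |s|^{-1}$ and $(D/\delta)^s \ll 1$ on the line $\operatorname{Re} s = \varepsilon = (\log D)^{-1}$, and

$\int_{(\varepsilon)} |s|^{-3/2}\,|ds| \ll \varepsilon^{-1/2} = (\log D)^{1/2}.$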
(<)

Hence we derive Ad(8) where

because lrol ,;;; "to and i

«

lroJ(dj
*i

«

>..(d)d/
= 1. Therefore we have

(24.55) Finally inserting this to the diagonal form of A we argue as follows: A(logD)2

«

L

d2
d(D 2

«

d(D 2

d-1

II (1- ~ Pld

p

r3

2:: d-lrr(1+p-~J,;;; 2:: d-]L.:w-~ d(D 2

,; ; ((~) L

pld d-

d(D 2 1

«

wid

logD

d(D 2

which completes the proof of (24.51). For the proof of (24.52) we write log ab = log abd 2 - 2log d and we split B accordingly, say B = B'- 2B". By the arguments which led us to the diagonalization of A we derive similar expressions for B' and B". Thus we have

B' = L'P(d) L L e


where

Bd =

O
L a=o(mod d)

*

by the formula L = A

Bd

1. Applying (24.55) we derive

« (L

(d,q)A(q)q- 1 )d
«

d
q(D2

This bound is larger than that for Ad by factor log D, hence the former arguments yield B' « 1 as required for (24.52). Moreover, we have

B"= LLL£XadlXbd(abd)- 1 logd= L1fl(d)A~, (a,b)=1

d

d

where

d d d 1fl(d) = LJ1(8)8log 8 = L/1(8)8 LA(q) Jjd

Jjd

Jqjd

= LA(q)q
qjd Hence B" :( 4A log D

«

qjd 1 by (24.51 ). This completes the proof of (24.52).

Finally we sketch a proof of (24.24). We have

$I = \int_0^T |I(t)|^2\,dt = \int_0^T\Big|\int_0^{\Delta} f(t+u)\,du\Big|^2 dt = \int_{-\Delta}^{\Delta} (\Delta - |u|)\int_0^T f(t)f(t+u)\,dt\,du + O(D^2T^{1/2})$

and by the argument (Stirling's formula) which was applied in the proof of Lemma 24.2 we get f(t)f(t + u)

=((~+it)((~- it- iu)
=~~~((~+it)((~- it- iu)(~)

it

2

;)

(27rta2) 'f +

'f

D2

+ o(T~)

o(~: ).

Integrating over t by Corollary 24.6 we get

lT

f(t)](t

+ u)dt =

2T L

0

d

L

LIXadlXbd(abd)- 1

(a,b)=l

(T)iu + ((1-iu) (abd ('h)iu} +0(D4T~(logT)7).

+iu) (d { ((1 2+zu V'h

2-zu

Vr

Now integrating over u we get (24.56)

I= 2T L d

L

L

lXadlXbd(abd)- 1 { ( d{f)

+ (VI];)}

(a,b)=1

·11

.-,-


561

where

_ 1~logX

- tl

o

(sinu) (~) du +0 I X . u og 2

Putting

(24.57) we write (X) = ~tl - tlw(tliog X) + O(tlflog X). (T(27r)~(abd)- 1 into (24.56) we arrive at

I= -2tlTA' + O(tlTA"

Inserting this for X

+ tlTA"'(iog X)- 1 + D 4 Ti (iogT) 7 )

where

~)

A I = """"' ~~~CladClbd(abd)""""' """"' I w ( tliog ~ , d

(a,b)=l 1

A"= LILLCladClbd(abd)- 1, d

(a,b)=l

A"'= LLLietadClbdl(abd)d

1

•

(a,b)=l

The sum $A''$ can be estimated in the same way as $A$,

The sum $A'''$ looks like $A$ except that its terms are taken with absolute value. Consequently, in the analytic argument applied for $A'''$ the Riemann zeta function occurs as $\zeta(s)^{1/2}$ rather than $\zeta(s)^{-1/2}$, resulting in a loss of a factor $\log D$ in the bound for $A'''$ by comparison to $A$, i.e., one gets $A''' \ll 1$. For the sum $A'$ one derives the same bound as for $A$, namely $A' \ll (\log D)^{-1}$, since the function $w(\Delta\log X)$ is bounded and it has nice derivatives. For example, one can use the Mellin transform of $w(\Delta\log X)$ in $X$ which brings a slight shift in the argument of the involved Dirichlet series, and this makes no essential change in the rest of the proof. Another approach would be to reduce $A'$ to $A$ by arithmetic arguments as we reduced $B'$ to $B$. To this end write the power series

w(z) =

(-l)k+l (2z)2k+J 2k + 1 ·

L (2k + 2)' k=O


For z = tllog X = ~ tllog(T (27r) - tllog( abd) this reduces the problem to estimating sums of type A(k) = LLL:Cl
Writing

$(\log abd)^k = \sum_{q \mid abd} \Lambda_k(q)$

where $\Lambda_k$ is the von Mangoldt function of degree $k$ we arrange $A(k)$ in terms of $A_d$ and use the bound (24.55). Now collecting the above estimates for $A'$, $A''$, $A'''$ we get

$I \ll \Delta T(\log D)^{-1} + D^4T^{8/9}(\log T)^7.$

Taking $D =$ rio we complete the proof of (24.24).

REMARKS. Selberg's mollification method works well for counting zeros on and near the critical line. He showed that

(24.58)  $N(\tfrac12 + 4\delta, T) \ll T^{1 - \delta}\log T$

uniformly for $\delta \ge 0$. Hence it follows that almost all zeros of $\zeta(s)$ lie in the region

(24.59)  $\big|\sigma - \tfrac12\big| \le \dfrac{\eta(t)}{\log(|t| + 3)}$

where $\eta(t)$ is any positive function which increases to infinity.

CHAPTER 25

THE SPACING OF ZEROS OF THE RIEMANN ZETA-FUNCTION

25.1. Introduction.

Throughout this chapter we assume the validity of the Riemann Hypothesis for $\zeta(s)$. This allows us to put the critical zeros $\rho = \frac12 + i\gamma$ in an increasing sequence according to their ordinates

(25.1)  $\cdots \le \gamma_{-2} \le \gamma_{-1} < 0 < \gamma_1 \le \gamma_2 \le \cdots$

where $\gamma_{-n} = -\gamma_n$. Recall that the counting function $N(T) = |\{n \mid 0 < \gamma_n \le T\}|$ satisfies

(25.2)  $N(T) = \dfrac{T}{2\pi}\log\dfrac{T}{2\pi e} + O(\log T)$

for $T \ge 2$ (see Theorem 5.24). Hence it follows that

(25.3)  $\gamma_n \sim 2\pi n(\log n)^{-1}$, as $n \to \infty$,

so the numbers

(25.4)  $\zeta_n = \dfrac{1}{2\pi}\gamma_n\log|\gamma_n|$

have unit mean spacing, i.e., $\zeta_n \sim n$ as $|n| \to \infty$. Actually (25.2) is much more precise than (25.4), still it is not satisfactory for understanding the subtle aspects of the distribution of the $\gamma_n$'s, in particular, it tells very little about gaps between consecutive zeros. The Riemann Hypothesis alone does not shed enough light on the problem (it does improve the error term in (25.2) only by a factor $\log\log T$).
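As a small sanity check of (25.2), one can compare the main term with an actual count of zeros. The sketch below is illustrative: the list `FIRST_ZEROS` holds rounded values of the first ten ordinates (standard tabulated values, not taken from this text), and the function names are hypothetical.

```python
import math

# Rounded ordinates of the first ten critical zeros (tabulated values).
FIRST_ZEROS = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
               37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def main_term(T: float) -> float:
    # Main term (T / 2 pi) log(T / (2 pi e)) of N(T).
    return T / (2 * math.pi) * math.log(T / (2 * math.pi * math.e))

def count_zeros(T: float) -> int:
    # N(T) for T <= 50, read off from the table.
    return sum(1 for g in FIRST_ZEROS if g <= T)

for T in (20.0, 30.0, 40.0, 50.0):
    print(T, count_zeros(T), round(main_term(T), 3))
```

Already for $T \le 50$ the discrepancy stays well within the $O(\log T)$ error term.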

"'n

Our first goal would be to understand the statistic of the sequence of numbers $\zeta_n$. One expects that the consecutive spacings

(25.5)  $\delta_n = \zeta_{n+1} - \zeta_n$

are not purely random (i.e. are not poissonian), as expected for the normalized distances between primes, or for the normalized distances between eigenvalues of the Laplace operator for the modular group (see the final remarks of Chapter 10 and [Sa4] respectively), but rather they follow the Gaussian Unitary Ensemble distribution from random matrix theory. We say that a sequence $0 < \zeta_1 \le \zeta_2 \le \cdots$ follows GUE if

(25.6)  $\dfrac{1}{N}\sum_{1 \le n \le N} f(\delta_n) \sim \int_0^{\infty} f(s)P(s)\,ds$


for any nice function $f : \mathbb{R}^{+} \to \mathbb{C}$ (say of Schwartz class), where $P(s)$ is the limit density distribution of consecutive spacing of eigenvalues of random unitary matrices (the limit with respect to the rank). Explicitly this distribution was determined by Gaudin and Mehta [GM], and is given by $P(s) = \det(I - Q_s)$, where $Q_s$ is the integral operator on $L^2([-1, 1])$ whose kernel is

(25.7)  $Q_s(x, y) = \dfrac{\sin\frac{\pi s}{2}(x - y)}{\pi(x - y)}.$

Specifically the density can be expressed by the following infinite product

(25.8)  $P(s) = \prod_{j=0}^{\infty}\big(1 - \lambda_j(s)\big)$

where $1 \ge \lambda_0(s) \ge \lambda_1(s) \ge \cdots$ are the eigenvalues of the integral operator. In general the GUE law as stated above is very hard to establish, because harmonic analysis cannot control consecutive points of a sequence, but it can locate the points in small domains. Therefore more tractable problems are about correlation of sets of points in question. We begin by presenting the original work of H. Montgomery [Mo4] on the pair correlation of zeros of the Riemann zeta function, and then we state more general results and conjectures for the $n$-level correlation (see definitions in Section 25.3).

25.2. The pair correlation of zeros.

The main tool for dealing with the correlation problems is the explicit formula of Riemann (or its variations) which connects a sum over the zeros of $\zeta(s)$ with a sum over prime numbers. General versions are proved in Chapter 5, but here we use a formula in somewhat customized form. Let $g \in C_c^{\infty}(\mathbb{R})$ and let

(25.9)  $h(r) = \int_{-\infty}^{\infty} g(u)e^{iur}\,du.$

Put $\Gamma_{\mathbb R}(s) = \pi^{-s/2}\Gamma(s/2)$, the local factor at the infinite place for the Euler product of $\zeta(s)$. Then

(25.10)  $\sum_{\gamma} h(\gamma) = h\Big(\dfrac{i}{2}\Big) + h\Big(-\dfrac{i}{2}\Big) + \dfrac{1}{2\pi}\int_{-\infty}^{\infty} h(r)\Big\{\dfrac{\Gamma_{\mathbb R}'}{\Gamma_{\mathbb R}}\big(\tfrac12 + ir\big) + \dfrac{\Gamma_{\mathbb R}'}{\Gamma_{\mathbb R}}\big(\tfrac12 - ir\big)\Big\}dr - \sum_{n=1}^{\infty}\dfrac{\Lambda(n)}{\sqrt n}\big(g(\log n) + g(-\log n)\big).$

This is the case of Theorem 5.12 for the Riemann zeta function. The strategy goes as follows. First we localize the $\gamma$-terms, say $\gamma$ near $t$, by choosing a suitable test function $h(r) = h(r, t)$, where $t$ is a parameter at our disposal ($h(r)$ has to be an entire function). Of course, by the uncertainty principle of harmonic analysis, such a localization cannot be exact; at best one can see $\gamma$ near $t$ at the distance $c/\log t$, where $c$ is a positive constant. Computing a properly weighted $L^2$-norm of such a $\gamma$-sum with respect to the parameter $t$ involved in the test function $h(\gamma, t)$ one picks up terms which are close to the diagonal, i.e., one gets a sum over pairs of zeros $\gamma$, $\gamma'$ with $\gamma - \gamma'$ being quite small. Then, by an approximation argument, or by Fourier inversion in another parameter which is


built in the test function, one creates a sum whose terms are the desired function of the difference $\gamma - \gamma'$. The original approach of Montgomery was somewhat special, and it is still being used by many researchers. Since his choice of test functions is quite natural we use them first. Montgomery [Mo4] introduced the function (a Fourier transform of the gaps)

(25.11)  $F(\alpha, T) = \dfrac{2\pi}{T\log T}\sum\sum_{0 < \gamma, \gamma' \le T} T^{i\alpha(\gamma - \gamma')}w(\gamma - \gamma')$

for any real $\alpha$ and $T \ge 2$, where $w(u)$ is a suitable localizing function. Specifically he takes

(25.12)  $w(u) = \dfrac{4}{4 + u^2}.$

Note that $F(\alpha, T)$ is real and $F(\alpha, T) = F(-\alpha, T) \ge 0$, because the Fourier transform of $w$ is positive. Precisely, $\hat w(v) = 2\pi e^{-4\pi|v|}$, so

(25.13)  $F(\alpha, T) = \dfrac{2\pi}{T\log T}\int_{-\infty}^{\infty} e^{-2|v|}\Big|\sum_{0 < \gamma \le T} e^{i\gamma(v + \alpha\log T)}\Big|^2 dv.$

By trivial estimation, using (25.2), one derives

(25.14)  $F(\alpha, T) \le F(0, T) \ll \log T.$

Our goal is to give an asymptotic formula for $F(\alpha, T)$ as $T \to \infty$ which is uniform in $\alpha$ in ranges as large as possible.

THEOREM 25.1 (MONTGOMERY). For $0 \le \alpha \le 1$ and $T \ge 2$ we have

REMARKS. For a with 2(loglogT)/logT ~a~ 1-1/(loglogT) logT we have F(a, T) ~a,

(25.16)

as T __. oo.

The asymptotic formula (25.15) holds for all $\alpha \ge 0$ but it loses its meaning if $\alpha \ge 1$, because the error term $O(\alpha T^{\alpha - 1})$ exceeds the leading term $\alpha$. In proving Theorem 25.1, Montgomery appealed to the explicit formula

(25.17)  $L(x, t) = R(x, t)$

where $L(x, t)$ is a special sum over the zeros of $\zeta(s)$ localized near $t$, namely

(25.18)  $L(x, t) = 2\sum_{\gamma}\dfrac{x^{i\gamma}}{1 + (t - \gamma)^2}$

and $R(x, t)$ is the corresponding sum over primes, including the infinite place terms; see Section 5.5. We have

(25.19)  $R(x, t) = -\sum_{n=1}^{\infty}\dfrac{\Lambda(n)}{\sqrt n}\Big(\dfrac{x}{n}\Big)^{it}\min\Big(\dfrac{n}{x}, \dfrac{x}{n}\Big) + x^{-1 + it}\log(t + 2) + E(x, t)$


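where the last term satisfies

(25.20)  $E(x, t) \ll \dfrac{1}{x} + \dfrac{\sqrt x}{t + 1},$

if $x \ge 1$ and $t > 0$, the implied constant being absolute. This particular equation (25.17) can be deduced from the formula

(25.21)  $\sum_{n \le x}\Lambda(n)n^{-s} = -\dfrac{\zeta'}{\zeta}(s) + \dfrac{x^{1-s}}{1 - s} - \sum_{\rho}\dfrac{x^{\rho - s}}{\rho - s} + \sum_{n=1}^{\infty}\dfrac{x^{-2n - s}}{2n + s}$

which holds for $x > 1$, $x \ne p^n$ and $s \ne 1, \rho, -2n$, or directly by a contour integration of

(25.22)  $-\dfrac{\zeta'}{\zeta}(s) = \sum_{n=1}^{\infty}\Lambda(n)n^{-s}$

and the functional equation

(25.23)  $-\dfrac{\zeta'}{\zeta}(s) - \dfrac{\zeta'}{\zeta}(1 - s) = \dfrac{\Gamma_{\mathbb R}'}{\Gamma_{\mathbb R}}(s) + \dfrac{\Gamma_{\mathbb R}'}{\Gamma_{\mathbb R}}(1 - s) = \log(|s| + 3) + O(1).$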

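We square the equation (25.17) and integrate (this is a way of picking small differences $\gamma - \gamma'$), getting

(25.24)  $\int_0^T |L(x, t)|^2\,dt = \int_0^T |R(x, t)|^2\,dt.$

The left side is equal to

$4\sum_{\gamma}\sum_{\gamma'} x^{i(\gamma - \gamma')}\int_0^T \big(1 + (t - \gamma)^2\big)^{-1}\big(1 + (t - \gamma')^2\big)^{-1}dt$

where the summation is over all the critical zeros $\rho = \frac12 + i\gamma$, $\rho' = \frac12 + i\gamma'$ (with $\gamma$, $\gamma'$ positive and negative). Using (25.2) we reduce the above expression as follows:

$4\sum\sum_{0 < \gamma, \gamma' \le T} x^{i(\gamma - \gamma')}\int_0^T \big(1 + (t - \gamma)^2\big)^{-1}\big(1 + (t - \gamma')^2\big)^{-1}dt + O(\log^3 T)$

$\quad = 4\sum\sum_{0 < \gamma, \gamma' \le T} x^{i(\gamma - \gamma')}\int_{-\infty}^{\infty} \big(1 + (t - \gamma)^2\big)^{-1}\big(1 + (t - \gamma')^2\big)^{-1}dt + O(\log^3 T)$

$\quad = 2\pi\sum\sum_{0 < \gamma, \gamma' \le T} x^{i(\gamma - \gamma')}w(\gamma - \gamma') + O(\log^3 T).$

Putting $x = T^{\alpha}$ with $\alpha \ge 0$ we arrive at

(25.25)  $\int_0^T |L(x, t)|^2\,dt = F(\alpha, T)\,T\log T + O(\log^3 T)$

where the implied constant is absolute.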

Next we evaluate the other side of (25.24). To this end we use the following form of Parseval's formula for Dirichlet series (see [MV1], or refine the proof of Theorem 9.1)

(25.26)  $\int_0^T\Big|\sum_n a_n n^{-it}\Big|^2 dt = \sum_n |a_n|^2\big(T + O(n)\big).$

Write (25.19) as

$R(x, t) = C(t) + D(t) + E(t),$

respectively. Then

$|R(x, t)|^2 = |C(t)|^2 + |D(t)|^2 + |E(t)|^2 + O\big(|C(t)D(t)| + |D(t)E(t)| + |E(t)C(t)|\big).$

Integrating in $0 < t \le T$ we derive by the Cauchy-Schwarz inequality

(25.27)  $\int_0^T |R(x, t)|^2\,dt = C^2 + D^2 + E^2 + O(CD + DE + EC),$

with $C^2 = \int_0^T |C(t)|^2\,dt$ and $D^2$, $E^2$ defined similarly. By (25.26) we obtain

$C^2 = \sum_{n=1}^{\infty}\big(T + O(n)\big)\dfrac{\Lambda^2(n)}{n}\min{}^2\Big(\dfrac{n}{x}, \dfrac{x}{n}\Big).$

By the Prime Number Theorem $\psi(x) = x + O(x/\log 2x)$ this is

(25.28)  $C^2 = T\log x + O(T + x\log x).$

For $D^2$ we get simply

(25.29)  $D^2 = x^{-2}\int_0^T \log^2(t + 2)\,dt = x^{-2}T\log^2 T + O(x^{-2}T\log T),$

and from the error term (25.20) we get

(25.30)

Inserting (25.28), (25.29), (25.30) into (25.27) we arrive at

(25.31)  $\int_0^T |R(x, t)|^2\,dt = T\log x + x^{-2}T\log^2 T + B(x, T)$

where

(25.32)  $B(x, T) \ll T + x\log x + x^{-1}(\log 2x)T\log T.$

Comparing (25.31) for $x = T^{\alpha}$ with (25.25) we complete the proof of (25.15).

Now we derive a few consequences of Theorem 25.1.

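THEOREM 25.2. Let $f(x)$ be a Schwartz function on $\mathbb{R}$ such that its Fourier transform

(25.33)  $\hat f(y) = \int_{-\infty}^{\infty} f(x)e(-xy)\,dx$

is of class $C^1$ with $\operatorname{supp}\hat f \subset (-1, 1)$. Then

(25.34)  $\sum\sum_{0 < \gamma, \gamma' \le T} w(\gamma - \gamma')\,f\Big((\gamma - \gamma')\dfrac{\log T}{2\pi}\Big) = \Big\{\int_{-1}^{1}\hat f(\alpha)|\alpha|\,d\alpha + \hat f(0)\Big\}\dfrac{T}{2\pi}\log T + O(T)$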

where the implied constant depends only on $f$.

PROOF. For any $f$ of Schwartz class we have

(25.35)  $\sum\sum_{0 < \gamma, \gamma' \le T} w(\gamma - \gamma')\,f\Big((\gamma - \gamma')\dfrac{\log T}{2\pi}\Big) = \dfrac{T}{2\pi}(\log T)\int_{-\infty}^{\infty}\hat f(\alpha)F(\alpha, T)\,d\alpha.$

Here the integral equals

The first term agrees with that on the right-hand side of (25.34). The second term is

$2(\log T)\int_0^1 \big(\hat f(0) + O(\alpha)\big)T^{-2\alpha}\,d\alpha = \hat f(0) + O\big((\log T)^{-1}\big),$

and the error term is trivially estimated by $O(1/\log T)$. This completes the proof of (25.34). □

In particular, choosing $f(x) = \big((\sin\pi\alpha x)/\pi\alpha x\big)^2$ we get

COROLLARY 25.3. If $0 < \alpha < 1$ is fixed, then as $T \to \infty$,

(25.36)  $\sum\sum_{0 < \gamma, \gamma' \le T} w(\gamma - \gamma')\Big(\dfrac{\sin\big((\gamma - \gamma')\frac{\alpha}{2}\log T\big)}{(\gamma - \gamma')\frac{\alpha}{2}\log T}\Big)^2 \sim \Big(\dfrac{1}{\alpha} + \dfrac{\alpha}{3}\Big)\dfrac{T}{2\pi}\log T.$
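The constant $\alpha^{-1} + \alpha/3$ in (25.36) is the bracket of (25.34) evaluated for this test function. The sketch below checks it numerically; it assumes the standard fact that, with the convention (25.33), the Fourier transform of $((\sin\pi\alpha x)/\pi\alpha x)^2$ is the triangle function $\alpha^{-1}\max(0, 1 - |y|/\alpha)$, and the function names are illustrative.

```python
def f_hat(y: float, alpha: float) -> float:
    # Assumed Fourier transform of (sin(pi alpha x) / (pi alpha x))^2: a triangle on [-alpha, alpha].
    return max(0.0, 1.0 - abs(y) / alpha) / alpha

def bracket(alpha: float, n: int = 20000) -> float:
    # Midpoint rule for the integral of f_hat(y) |y| over [-1, 1], plus f_hat(0).
    h = 2.0 / n
    integral = 0.0
    for k in range(n):
        y = -1.0 + (k + 0.5) * h
        integral += f_hat(y, alpha) * abs(y)
    return integral * h + f_hat(0.0, alpha)

def simple_zero_proportion(alpha: float) -> float:
    # Lower bound 2 - (1/alpha + alpha/3) obtained from the diagonal terms of (25.36).
    return 2.0 - (1.0 / alpha + alpha / 3.0)

for alpha in (0.3, 0.6, 0.9):
    print(alpha, bracket(alpha), 1.0 / alpha + alpha / 3.0)
print(simple_zero_proportion(1.0))
```

As $\alpha \to 1^{-}$ the constant tends to $4/3$, which is the source of the proportion $2 - 4/3 = 2/3$ of simple zeros.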

From this Montgomery deduced a lower bound for the number of simple zeros of $\zeta(s)$,

(25.37)  $N_1(T) = |\{0 < \gamma \le T;\ \rho = \tfrac12 + i\gamma \text{ is simple}\}|$

(recall we assume the Riemann Hypothesis).

COROLLARY 25.4 (MONTGOMERY). As $T \to \infty$, we have

(25.38)  $N_1(T) > \Big(\dfrac23 + o(1)\Big)\dfrac{T}{2\pi}\log T,$

i.e., at least 2/3 of the zeros are simple.

PROOF. Discarding all terms in (25.36) except for $\gamma = \gamma'$ we get

$\sum_{0 < \gamma \le T} m_\gamma \le \Big(\dfrac{1}{\alpha} + \dfrac{\alpha}{3} + o(1)\Big)\dfrac{T}{2\pi}\log T,$

where $m_\gamma$ is the multiplicity of $\rho = \frac12 + i\gamma$. This holds for any fixed $\alpha$ with $0 < \alpha < 1$, the best bound being $\frac43 + o(1)$ for $\alpha \to 1^{-}$. Hence we derive that

$N_1(T) \ge \sum_{0 < \gamma \le T} (2 - m_\gamma) \ge \Big(2 - \dfrac43 + o(1)\Big)\dfrac{T}{2\pi}\log T$

completing the proof of (25.38). □

REMARK. The test function which was used in (25.36) is not optimal for the purpose of estimating $N_1(T)$ along the above lines, so the percentage 2/3 of simple zeros can be slightly increased. Another consequence of Theorem 25.2 (which we do not draw here) is that $\gamma_{n+1} - \gamma_n$ can be smaller than its average $2\pi/\log\gamma_n$ by a factor $\lambda < 1$ infinitely often, i.e., $\liminf\delta_n \le \lambda$. Montgomery succeeded in showing this with $\lambda = 0.68$. The best published result is $\lambda = 0.5171$ which is due to Conrey, Ghosh and Gonek [CGG], while a new refinement points to $\lambda = 0.5169\ldots$. Recently it was shown that if $\delta_n \le \frac12 - \varepsilon$ sufficiently often, then the class number $h(-D)$ of the imaginary quadratic field $\mathbb{Q}(\sqrt{-D})$ is bounded below by $c(\varepsilon)D^{1/2}(\log D)^{-2}$ for some constant $c(\varepsilon)$ effectively computable in terms of $\varepsilon$; see [CI2].

The formula (25.16) holds uniformly for $\varepsilon \le \alpha \le 1 - \varepsilon$. With a little more effort one could show that (25.16) holds uniformly for $\varepsilon \le \alpha \le 1$. One may wonder what is the true behavior of $F(\alpha, T)$ for $\alpha \ge 1$? Montgomery showed that (25.16) cannot hold for $\alpha > 1$. Moreover, he described heuristic arguments which suggest that

(25.39)  $F(\alpha, T) \sim 1$, as $T \to \infty$

uniformly in bounded intervals $1 \le \alpha \le A$. By applying this to (25.35) with appropriate test functions $f$ of Schwartz class, Montgomery was led to the following

PAIR CORRELATION CONJECTURE. For $\alpha < \beta$ put

$N(\alpha, \beta; T) = \Big|\Big\{m \ne n;\ 0 < \gamma_m, \gamma_n \le T,\ \dfrac{2\pi\alpha}{\log T} < \gamma_m - \gamma_n < \dfrac{2\pi\beta}{\log T}\Big\}\Big|.$

Then, as $T \to \infty$ we have

(25.40)  $N(\alpha, \beta; T) \sim N(T)\int_{\alpha}^{\beta}\Big(1 - \Big(\dfrac{\sin\pi u}{\pi u}\Big)^2\Big)du.$
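To get a feeling for the density in (25.40), the following sketch integrates $1 - (\sin\pi u/\pi u)^2$ numerically. The function names and the midpoint quadrature are illustrative choices; the output is the conjectured value of $N(\alpha, \beta; T)/N(T)$ in the limit, not a statement about actual zeros.

```python
import math

def pair_density(u: float) -> float:
    # The conjectured pair correlation density 1 - (sin(pi u) / (pi u))^2.
    if u == 0:
        return 0.0
    s = math.sin(math.pi * u) / (math.pi * u)
    return 1.0 - s * s

def expected_fraction(alpha: float, beta: float, n: int = 10000) -> float:
    # Midpoint rule for the integral of the density over (alpha, beta).
    h = (beta - alpha) / n
    return h * sum(pair_density(alpha + (k + 0.5) * h) for k in range(n))

print(expected_fraction(0.0, 1.0))   # normalized gaps below the mean spacing
print(expected_fraction(0.0, 0.1))   # very small gaps are rare (repulsion of zeros)
```

The density vanishes to second order at $u = 0$, so pairs of zeros at much less than the mean spacing are predicted to be scarce, in contrast with a Poisson model where the density would be identically 1.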

REMARKS. The assertions (25.39) for $\alpha \ge 1$ together with (25.16) for $0 < \alpha \le 1$ are essentially equivalent to (25.40). By (25.39) the asymptotic formula (25.36) extends to all $\alpha \ge 1$ with the function $\alpha^{-1} + \alpha/3$ replaced by $1 + 1/3\alpha^2$. Hence letting $\alpha \to \infty$ it follows by the former argument that almost all zeros of $\zeta(s)$ are simple.


25.3. The n-level correlation function for consecutive spacing.

The Pair Correlation Conjecture of Montgomery indicates that the zeros of $\zeta(s)$ behave much like the eigenvalues of large complex Hermitian random matrices whose statistical distribution is known as the Gaussian Unitary Ensemble. Further support for this behavior was provided by Z. Rudnick and P. Sarnak [RS] who extended Montgomery's results to the $n$-level correlation sums for the normalized zeros of $\zeta(s)$. In this section we formulate the main results of Rudnick and Sarnak and we give a few comments about their arguments. Recall that $\zeta_m$ denote the normalized ordinates of the zeros $\rho_m = \frac12 + i\gamma_m$ of $\zeta(s)$ given by (25.4), they are ordered in increasing sequence

$\cdots \le \zeta_{-2} \le \zeta_{-1} < 0 < \zeta_1 \le \zeta_2 \le \cdots,$

and $\zeta_n$ have unit mean spacing. The Pair Correlation Conjecture answers the question of how often the differences $\zeta_{m_1} - \zeta_{m_2}$ fall in a given interval. One may inquire about pairs of differences $\zeta_{m_1} - \zeta_{m_2}$, $\zeta_{m_2} - \zeta_{m_3}$ falling into a given rectangle, the triple of differences in a box, and so on. To control the problem we restrict our considerations to the first $M$ points $\zeta_m$, $1 \le m \le M$, where $M$ is a large number. Let $n \ge 2$ and $B \subset \mathbb{R}^{n-1}$ be a box of dimension $n - 1$. Put

$R_n(M; B) = \dfrac{1}{M}\big|\{1 \le m_1, \ldots, m_n \le M \text{ all distinct};\ (\zeta_{m_1} - \zeta_{m_2}, \zeta_{m_2} - \zeta_{m_3}, \ldots, \zeta_{m_{n-1}} - \zeta_{m_n}) \in B\}\big|.$

For $n = 2$ and $B = (\alpha, \beta)$ this is simply

$R_2(M; B) = \dfrac{1}{M}\big|\{1 \le m_1 \ne m_2 \le M;\ \alpha < \zeta_{m_1} - \zeta_{m_2} < \beta\}\big|.$

For $M = N(T)$ this is very close to $N(\alpha, \beta; T)/N(T)$. As in (25.40) we expect that there is a function $R_n(B)$ such that

(25.41)  $R_n(M; B) \sim R_n(B)$, as $M \to \infty$.

Our goal is to determine $R_n(B)$ for any $n \ge 2$. To simplify the Fourier analysis here we introduce test functions $f(x_1, \ldots, x_n)$ of the following type:

TF1. $f(x_1, \ldots, x_n)$ is symmetric,
TF2. $f(x_1 + t, \ldots, x_n + t) = f(x_1, \ldots, x_n)$ for any $t \in \mathbb{R}$,
TF3. $f(x_1, \ldots, x_n) \to 0$ rapidly as $|x| \to \infty$ on the plane $x_1 + \cdots + x_n = 0$.

REMARKS. The condition TF2 says that $f(x_1, \ldots, x_n)$ depends only on the successive differences. This together with TF3 implies that $f(x_1, \ldots, x_n)$ decays rapidly on any hyperplane $x_1 + \cdots + x_n = t$. For $n = 2$ such a function is given by $f(x_1, x_2) = f(x_1 - x_2)$ where $f(x)$ is an even function on $\mathbb{R}$ which decays rapidly to zero as $|x| \to \infty$. For any test function as above we put (25.42)

\[ R_n(M;f) = \frac{n!}{M} \sum\cdots\sum_{\substack{1 \leq m_1,\dots,m_n \leq M \\ \text{all distinct}}} f(\tilde\gamma_{m_1},\dots,\tilde\gamma_{m_n}), \]

and seek an asymptotic formula

\[ R_n(M;f) \sim R_n(f), \quad \text{as } M \to \infty. \tag{25.43} \]

The multiple sum $R_n(M;f)$ can be smoothed further by replacing the range of summation $1 \leq m_1,\dots,m_n \leq M$ by a smooth (rapidly decreasing) cut-off function. Thus we also consider m1, ... ,mn

all distinct

where $h(y)$ is the cut-off function in question and

\[ L = \frac{1}{2\pi}\log T. \tag{25.45} \]

REMARK. There is no reason to use a cut-off function $h(y_1,\dots,y_n)$ more general than the product $h(y_1)\cdots h(y_n)$ because we wish the points $\gamma_{m_1},\dots,\gamma_{m_n}$ to be close to each other.

In 1962, Dyson [D] considered a statistical distribution of energy levels of some complex systems, in which context he determined the n-level correlation density $W_n(x_1,\dots,x_n)$ for the GUE model. He showed that

\[ W_n(x_1,\dots,x_n) = \det\bigl(K(x_i - x_j)\bigr)_{1 \leq i,j \leq n} \tag{25.46} \]

where

\[ K(x) = \frac{\sin \pi x}{\pi x}. \tag{25.47} \]

One can show the following properties of $W_n(x)$:

(25.48) $0 \leq W_n(x) \leq 1$,
(25.49) $W_n(x) = 0$ if and only if $x_i = x_j$ for some $i \neq j$,
(25.50) $W_n(x) = 1$ if and only if $x \in \mathbb{Z}^n$ and $x_i \neq x_j$ for $i \neq j$.
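The determinantal density and the properties (25.48) to (25.50) can be probed numerically. The following is only a minimal sketch, assuming NumPy is available; the function names `sine_kernel`, `gue_density` and `pair_density` are illustrative and not from the text. It uses the fact that `numpy.sinc(x)` is $\sin(\pi x)/(\pi x)$ with value 1 at $x = 0$.

```python
import numpy as np

def sine_kernel(x):
    # K(x) = sin(pi x) / (pi x), with K(0) = 1 (this is numpy's sinc normalization).
    return np.sinc(x)

def gue_density(points):
    # W_n(x_1, ..., x_n) = det( K(x_i - x_j) ) for the sine kernel.
    x = np.asarray(points, dtype=float)
    return float(np.linalg.det(sine_kernel(x[:, None] - x[None, :])))

def pair_density(u):
    # Closed form for n = 2 in terms of u = x_1 - x_2.
    return 1.0 - float(np.sinc(u)) ** 2

w_pair = gue_density([0.3, 1.0])            # generic pair
w_coincident = gue_density([0.5, 0.5, 2.0])  # two equal coordinates
w_integers = gue_density([0.0, 1.0, 3.0])    # distinct integers
```

The three sample evaluations correspond to a generic pair, a point with two equal coordinates (where the density vanishes), and a point with distinct integer coordinates (where the kernel matrix is the identity).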

Let us compute the density function for $n = 2$. We get

\[ W_2(x_1,x_2) = \det\begin{pmatrix} K(0) & K(x_1-x_2) \\ K(x_1-x_2) & K(0) \end{pmatrix} = 1 - \left(\frac{\sin \pi(x_1-x_2)}{\pi(x_1-x_2)}\right)^2, \]

which agrees with that in (25.40).

THEOREM 25.5 (RUDNICK-SARNAK). Suppose that the cut-off function $h(r)$ is given by the Fourier integral (25.9) with some smooth compactly supported function $g(u)$. Suppose the Fourier transform of the test function $f(x)$,

\[ \hat f(\xi) = \int_{\mathbb{R}^n} f(x)\,e(-\xi\cdot x)\,dx, \]

is supported in the polyhedral domain $|\xi_1| + \cdots + |\xi_n| \leq 2$. Then, as $T \to \infty$, we have


where $\delta(x)$ is the Dirac distribution at point 0. By making an appropriate approximation to the cut-off function one can derive from (25.51) the following

COROLLARY 25.6. Suppose the test function $f(x)$ has its Fourier transform $\hat f(\xi)$ supported in $|\xi_1| + \cdots + |\xi_n| \leq 2$. Then we have (25.43) with

\[ R_n(f) = \int_{\mathbb{R}^n} f(x)\,W_n(x)\,\delta\Bigl(\tfrac{1}{n}(x_1+\cdots+x_n)\Bigr)\,dx \tag{25.52} \]

where $x = (x_1,\dots,x_n)$. This can be written as

\[ R_n(f) = n\int_{\mathbb{R}^{n-1}} f(x)\,W_n(x)\,dx_1\cdots dx_{n-1} \tag{25.53} \]

where $x_n = -x_1 - \cdots - x_{n-1}$.

REMARKS. In particular, we have

\[ R_2(f) = 2\int_{\mathbb{R}} f(x,-x)\,W_2(x,-x)\,dx = \int_{\mathbb{R}} f(x)\left(1 - \left(\frac{\sin \pi x}{\pi x}\right)^2\right)dx \]

where $f(x) = f(x,0)$, which agrees with the Montgomery pair correlation result.

25.4. Low-lying zeros of L-functions. A somewhat different problem from the consecutive spacing of zeros of an L-function concerns the zeros near the central point, i.e., in diminishing distance to $s = \frac12$. Such a problem, of course, depends on the scaling of zeros. Consider an L-function of degree $d$ (still satisfying GRH) as in Chapter 5, given by (25.54) with conductor $c_f > 1$ (see (5.7), (5.8)). The number of zeros of $L(f,s)$,

\[ \rho_f = \tfrac12 + i\gamma_f \quad \text{with } |\gamma_f| \leq T, \tag{25.55} \]

satisfies (25.56) as $T \to \infty$ by Theorem 5.8. This suggests that the proper scaling factor for the low-lying zeros of $L(f,s)$ should be $\frac{1}{2\pi}\log c_f$. Now the question is: how does $N(T,f)$ behave for small $T$? To make the question more interesting we take a test function $\phi$ in the Schwartz class on $\mathbb{R}$ and define the low-zeros sum

\[ D(f;\phi) = \sum_{\gamma_f} \phi\Bigl(\frac{\gamma_f}{2\pi}\log c_f\Bigr), \tag{25.57} \]

where $\gamma_f$ runs over the ordinates of zeros counted with multiplicity. For $\phi$ with rapid decay this sum concentrates on zeros which are within $O(1/\log c_f)$ of the point $s = \frac12$.


Because, for a single L-function, we are effectively counting very few zeros, classical harmonic analysis cannot detect their distribution. Therefore we must take a sufficiently large family of L-functions into consideration. Suppose $\mathcal{F}$ is an infinite family ordered by the conductor. Put

\[ \mathcal{F}(Q) = \{f \in \mathcal{F};\ c_f \leq Q\}. \tag{25.58} \]

Now we want to know the asymptotic behavior of the average density

\[ AD(\mathcal{F};\phi,Q) = \frac{1}{|\mathcal{F}(Q)|}\sum_{f \in \mathcal{F}(Q)} D(f;\phi). \tag{25.59} \]

If $\mathcal{F}$ is a complete family in a certain spectral sense, one should expect that

\[ AD(\mathcal{F};\phi,Q) \sim \int_{\mathbb{R}} \phi(x)\,W(\mathcal{F})(x)\,dx \tag{25.60} \]

where $W(\mathcal{F})(x)$ is a density function characterized by $\mathcal{F}$. Some investigations in this direction were undertaken in the 1970's (see M. Jutila [Ju5], H. L. Montgomery [Mo4]) for the family of Dirichlet L-functions $L(s,\chi)$. However, the recent studies by N. Katz and P. Sarnak [KS1], [KS2] expanded far beyond this family. By examining various families $\mathcal{F}$ they revealed that the low-lying zeros are always governed by a symmetry (or monodromy) group $G(\mathcal{F})$ associated with $\mathcal{F}$. Their assertions are particularly convincing for zeta and L-functions over finite fields, not only because the Riemann Hypothesis is known to be true, but also due to connections with geometry which provide additional intuition (the zeros are then eigenvalues of the Frobenius operator, see Section 11.11 and [KS1]). There are also attractive results for families of global L-functions, but progress is not yet as deep as in the case of local L-functions. In this section we present a few such results.

First we inspect the family of Dirichlet L-functions with real primitive characters $\chi_d$. Here $L(s,\chi_d)$ has the conductor $c_d = |d|$. Since $L(s,\chi_d)$ have even functional equations, we may put their zeros $\rho_d^{(\ell)} = \frac12 + i\gamma_d^{(\ell)}$ into the sequence (25.61) with $\gamma_d^{(-\ell)} = -\gamma_d^{(\ell)}$ for $\ell \in \mathbb{Z}\setminus\{0\}$ (it is believed that $L(\frac12,\chi_d)$ is never zero). Let

\[ \mathcal{F}(Q) = \{d \text{ fundamental discriminant};\ |d| \leq Q\}. \tag{25.62} \]

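CONJECTURE (KATZ-SARNAK). As $Q \to \infty$, we have

\[ \frac{1}{|\mathcal{F}(Q)|}\sum_{d \in \mathcal{F}(Q)}\ \sum_{\ell > 0} \phi\Bigl(\gamma_d^{(\ell)}\,\frac{\log|d|}{2\pi}\Bigr) \sim \int_{\mathbb{R}} \phi(x)\,W_{sp}(x)\,dx \tag{25.63} \]

where the density function is given by

\[ W_{sp}(x) = 1 - \frac{\sin 2\pi x}{2\pi x}. \tag{25.64} \]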



REMARKS. Katz and Sarnak [KS2] proved this conjecture for any test function $\phi$ whose Fourier transform

\[ \hat\phi(\xi) = \int_{\mathbb{R}} \phi(x)\,e(-\xi x)\,dx \tag{25.65} \]

is supported in $(-2,2)$ (of course, subject to the validity of the GRH). The case of $L(s,\chi_d)$ is obtained by twisting $\zeta(s)$ with $\chi_d$. One may generalize this case by starting with any $L(f,s)$ in place of $\zeta(s)$ and taking the family of L-functions $L(f \otimes \chi_d, s)$ twisted by the real characters $\chi_d$. For example, choose an elliptic curve $E/\mathbb{Q}$, and let $E^{(d)}/\mathbb{Q}$ be the corresponding quadratic twists. Then the associated L-functions $L(E^{(d)},s)$ are obtained from $L(E,s)$ by twisting with $\chi_d$. The corresponding conjecture of Katz and Sarnak predicts that the density function for the low-lying zeros of $L(E^{(d)},s)$ is given by

\[ W^{+}(x) = 1 + \frac{\sin 2\pi x}{2\pi x}. \tag{25.66} \]

Many interesting examples of families of L-functions are offered by the theory of automorphic forms, and a variety of these are studied by Iwaniec, Luo and Sarnak [ILS]. We are going to describe some of their results. Let $\Gamma = SL_2(\mathbb{Z})$ and $k \geq 2$ be an even integer. Let $S_k(\Gamma)$ be the linear space of cusp forms for $\Gamma$ of weight $k$. Let $H_k(\Gamma)$ be the Hecke basis of primitive forms of $S_k(\Gamma)$. To any

\[ f(z) = \sum_{n=1}^{\infty} \lambda_f(n)\,n^{\frac{k-1}{2}}\,e(nz) \in H_k(\Gamma) \]

we associate the Hecke L-function

\[ L(f,s) = \sum_{n=1}^{\infty} \lambda_f(n)\,n^{-s} = \prod_p \bigl(1 - \lambda_f(p)p^{-s} + p^{-2s}\bigr)^{-1}. \]

As described in Chapter 14, the completed function

\[ \Lambda(f,s) = (2\pi)^{-s}\,\Gamma\bigl(s + \tfrac{k-1}{2}\bigr)\,L(f,s) \]

is entire and satisfies the functional equation $\Lambda(f,s) = \varepsilon_f\Lambda(f,1-s)$ with root number given by $\varepsilon_f = i^k = \pm 1$ (note $\varepsilon_f$ depends only on $k$ but not on $f$). Because the sign of the functional equation has some impact on the distribution of low-lying zeros we separate the families of even and odd L-functions. Put

\[ M^{+}(K) = \sum_{\substack{k \leq K \\ k \equiv 0\ (\mathrm{mod}\ 4)}} |H_k(\Gamma)|, \tag{25.67} \]

\[ M^{-}(K) = \sum_{\substack{k \leq K \\ k \equiv 2\ (\mathrm{mod}\ 4)}} |H_k(\Gamma)|. \tag{25.68} \]

The corresponding densities turn out to be

\[ W^{+}(x) = 1 + \frac{\sin 2\pi x}{2\pi x}, \tag{25.69} \]

\[ W^{-}(x) = 1 - \frac{\sin 2\pi x}{2\pi x} + \delta_0(x). \tag{25.70} \]

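THEOREM 25.7 (IWANIEC-LUO-SARNAK). Suppose $\hat\phi$ is a smooth function on $\mathbb{R}$ compactly supported in $]-2,2[$. Then, as $K \to \infty$, we have

\[ \frac{1}{M^{+}(K)} \sum_{\substack{k \leq K \\ k \equiv 0\ (\mathrm{mod}\ 4)}}\ \sum_{f \in H_k(\Gamma)} D(f;\phi) \sim \int_{\mathbb{R}} \phi(x)\,W^{+}(x)\,dx, \tag{25.71} \]

\[ \frac{1}{M^{-}(K)} \sum_{\substack{k \leq K \\ k \equiv 2\ (\mathrm{mod}\ 4)}}\ \sum_{f \in H_k(\Gamma)} D(f;\phi) \sim \int_{\mathbb{R}} \phi(x)\,W^{-}(x)\,dx. \tag{25.72} \]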

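Let $L(\mathrm{Sym}^2 f, s)$ be the symmetric square L-function associated with $f$. Recall from Section 5.12 that it is the L-function of degree 3 given by

\[ L(\mathrm{Sym}^2 f, s) = \zeta(2s)\sum_{n=1}^{\infty} \lambda_f(n^2)\,n^{-s} = \prod_p \bigl(1 - \lambda_f(p^2)p^{-s} + \lambda_f(p^2)p^{-2s} - p^{-3s}\bigr)^{-1}. \tag{25.73} \]

As shown by Shimura [Sh2], the completed function

\[ \Lambda(\mathrm{Sym}^2 f, s) = \pi^{-\frac{3s}{2}}\,\Gamma\bigl(\tfrac{s+1}{2}\bigr)\Gamma\bigl(\tfrac{s+k-1}{2}\bigr)\Gamma\bigl(\tfrac{s+k}{2}\bigr)\,L(\mathrm{Sym}^2 f, s) \tag{25.74} \]

is entire and satisfies the functional equation $\Lambda(\mathrm{Sym}^2 f, s) = \Lambda(\mathrm{Sym}^2 f, 1-s)$ (note that the root number is always one). In this case we have

THEOREM 25.8 (IWANIEC-LUO-SARNAK). Suppose $\hat\phi$ is a smooth function on $\mathbb{R}$ compactly supported in $]-\frac32,\frac32[$. Then, as $K \to \infty$,

\[ \frac{1}{M(K)} \sum_{\substack{k \leq K \\ k \equiv 0\ (\mathrm{mod}\ 2)}}\ \sum_{f \in H_k(\Gamma)} D(\mathrm{Sym}^2 f;\phi) \sim \int_{\mathbb{R}} \phi(x)\,W_{sp}(x)\,dx \tag{25.75} \]

where $M(K) = M^{+}(K) + M^{-}(K)$ and $W_{sp}(x)$ is given by (25.64).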

REMARKS. We emphasize that the conditions on the test function $\phi$ in Theorem 25.7 and Theorem 25.8 allow the support of the Fourier transform $\hat\phi$ (see (25.65)) to be larger than $[-1,1]$. To appreciate this fact we write

\[ \int_{\mathbb{R}} \phi(x)\,W(x)\,dx = \int_{\mathbb{R}} \hat\phi(y)\,\hat W(y)\,dy \]

by Plancherel's theorem. Here the Fourier transforms of the corresponding density distributions are:

\[ \hat W^{+}(y) = \delta_0(y) + \tfrac12\eta(y), \]
\[ \hat W^{-}(y) = \delta_0(y) - \tfrac12\eta(y) + 1, \]
\[ \hat W_{sp}(y) = \delta_0(y) - \tfrac12\eta(y), \]

where $\delta_0(y)$ is the Dirac distribution and $\eta(y)$ is the characteristic function of the segment $[-1,1]$. These Fourier transforms have discontinuities at $y = 1$ and $y = -1$, therefore the results are distinguishable only if the support of $\hat\phi$ covers the points $y = 1$ and $y = -1$.
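The role of the points $y = \pm 1$ can be illustrated numerically on the Fourier side. The sketch below is an illustration only: it assumes the particular test function $\phi(x) = (\sin \pi a x/\pi a x)^2$, whose Fourier transform is the triangle $a^{-1}\max(0, 1-|y|/a)$ supported in $[-a,a]$, and the helper names are hypothetical. It evaluates the three pairings $\int \hat\phi\,\hat W$ using the formulas for $\hat W^{+}$, $\hat W^{-}$ and $\hat W_{sp}$ above.

```python
import numpy as np

def integrate(values, grid):
    # Composite trapezoid rule, written out to avoid depending on a NumPy version.
    return float(np.sum((values[1:] + values[:-1]) * np.diff(grid)) / 2.0)

def triangle_hat(y, a):
    # Fourier transform of phi(x) = (sin(pi a x)/(pi a x))^2; supported in [-a, a].
    return np.maximum(0.0, 1.0 - np.abs(y) / a) / a

def density_integrals(a, n=200001):
    # Requires a <= 3 so that the support of phi_hat fits in the grid [-3, 3].
    y = np.linspace(-3.0, 3.0, n)
    phi_hat = triangle_hat(y, a)
    eta_part = integrate(np.where(np.abs(y) <= 1.0, phi_hat, 0.0), y)  # pairing with eta
    total = integrate(phi_hat, y)                # pairing with the constant 1, equals phi(0)
    at_zero = float(triangle_hat(0.0, a))        # pairing with the Dirac mass at 0
    w_plus = at_zero + 0.5 * eta_part
    w_minus = at_zero - 0.5 * eta_part + total
    w_sp = at_zero - 0.5 * eta_part
    return w_plus, w_minus, w_sp

narrow = density_integrals(0.8)   # support of phi_hat inside (-1, 1)
wide = density_integrals(2.0)     # support of phi_hat covers y = 1 and y = -1
```

With support inside $(-1,1)$ the pairings against $\hat W^{+}$ and $\hat W^{-}$ coincide, while for the wider support they separate; the symplectic pairing differs from both in either case.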

CHAPTER 26

CENTRAL VALUES OF L-FUNCTIONS

26.1. Introduction. We have already mentioned many times the Grand Riemann Hypothesis for L-functions and discussed many aspects of the fascinating zeros of L-functions (see Chapters 5 or 24 for instance). Recently, much of the analytic theory of L-functions has focused on estimates for the values of L-functions at the central critical point $s = \frac12$. These special values appear in various contexts, and vanishing or non-vanishing are the main questions. For example, algebraic techniques developed by Mazur to study rational points on certain modular curves, with applications to diophantine equations, require for full effect a proof that there exists a modular form of a certain type which does not vanish at the central point (see [Ell] for a recent example). Another very different situation is the surprising link discovered by Iwaniec and Sarnak between the proportion of non-vanishing of various families of L-functions and the Gauss Class Number Problem (see Theorem 26.1 below). Here is a short panorama of selected accomplishments in recent years, emphasizing the strongest results from analytic number theory. The first is the Iwaniec-Sarnak result mentioned above [IS2]. Among different variants, we choose the following:

THEOREM 26.1. For $k \equiv 0 \pmod 4$, let $H_k$ be the set of weight $k$ Hecke forms on $SL(2,\mathbb{Z})$. For any real primitive character $\psi$ modulo $q$ with $q \leq k^{\vartheta}$, we have

\[ \bigl|\{f \in H_k \mid L(f \otimes \psi, \tfrac12) \geq (\log k)^{-2}\}\bigr| \geq \bigl(\tfrac12 - o(1)\bigr)|H_k| \tag{26.1} \]

as $k \to +\infty$ and $\vartheta \to 0$.

They also show that

\[ \sum_{f \in H_k} L(f,\tfrac12)\,L(f \otimes \chi, \tfrac12) \sim L(\chi,1)\,|H_k| \]

for any real primitive character $\chi$ modulo $D$, uniformly for $D \leq k^{\vartheta}$ as $k \to +\infty$ and $\vartheta \to 0$. Hence, if the constant $\frac12$ in (26.1) for $\psi = 1$ could be replaced by $\frac12 + \delta$ for any $\delta > 0$, with an effective implied constant, it would follow that

\[ L(1,\chi) \gg \frac{1}{(\log D)^2} \]

for any real primitive character $\chi$ modulo $D$ with an absolute and effective implied constant.

Note it is important to have a lower bound as in (26.1) instead of simply $L(f,\frac12) \neq 0$: it is conceivable that all special values are non-zero, yet too small to solve the Class Number Problem along these lines. From the point of view of analytic number theory, a lower bound as in (26.1) and simple non-vanishing are usually indistinguishable.

In part because of the relationship to the Birch and Swinnerton-Dyer Conjecture, the order of vanishing of weight 2 cusp forms at $s = \frac12$ has attracted particular attention. Recall that in one instance it is easy to guarantee that the central point is a zero, namely if $L(f,s)$ is a self-dual L-function and the sign of the functional equation is $-1$. It is reasonable to assume that the order of vanishing will be the smallest compatible with this restriction, namely a "generic" even form will have $L(f,\frac12) \neq 0$ and a generic odd form $L'(f,\frac12) \neq 0$. Brumer (and independently Murty) conjectured this and used GRH to give some evidence. Iwaniec and Sarnak proved that half the special values of even forms do not vanish (and are not too small, as in (26.1)). Kowalski and Michel [KM1], and independently VanderKam [vdK], considered higher derivatives. We quote two results:

THEOREM 26.2 (Kowalski-Michel). Let $S_2(q)^*$ be the set of primitive weight 2 forms of level $q$. We have

\[ \bigl|\{f \in S_2(q)^* \mid \varepsilon_f = -1 \text{ and } L'(f,\tfrac12) \neq 0\}\bigr| \geq \bigl(\tfrac{7}{16} - o(1)\bigr)|S_2(q)^*| \]

for primes $q \to +\infty$.

THEOREM 26.3 (Kowalski-Michel-VanderKam). For any fixed $k \geq 0$ we have

\[ \bigl|\{f \in S_2(q)^* \mid \varepsilon_f = (-1)^k \text{ and } L^{(k)}(f,\tfrac12) \neq 0\}\bigr| \geq \bigl(p_k - o(1)\bigr)|S_2(q)^*| \]

for primes $q \to +\infty$, where $p_0 = 1/4$, $p_1 = 7/16$, $p_2 = 0.4825$, $p_3 = 0.495$, and moreover, $p_k = \frac12 - \frac{1}{32k^2} + O(k^{-3})$.

Although the error term depends on $k$, it is possible to combine this result with an average bound

L

2

(or1L(!,s)) «1

/ES2(q)•

s=2

for $q$ prime and derive (see [KMV3] for both results)

COROLLARY 26.4. We have

\[ \sum_{f \in S_2(q)^*} \operatorname*{ord}_{s=1/2} L(f,s) \leq (c + o(1))\,|S_2(q)^*| \]

for primes $q \to +\infty$ with $c = 1.1891$.

This is particularly interesting because $1.1891 < \frac32$, which was the value obtained by Brumer on GRH. In the following sections we will give considerable details of the proof of Theorem 26.2. This result expands an earlier one of B. Duke [Du3] where the proportion was $q/(\log q)^4$. Those non-vanishing results have a nice interpretation in terms of arithmetic geometry, as already mentioned in Section 5.14. Indeed associated to the curve $X_0(q) = \Gamma_0(q)\backslash\mathbb{H}$ is an abelian variety, its jacobian variety, denoted $J_0(q)$, which


is of dimension $\dim J_0(q) = \dim S_2(q) = |S_2(q)^*|$ (the latter only for $q$ prime). The jacobian is defined over $\mathbb{Q}$ and its group of rational points is finitely generated by the Mordell-Weil theorem. Eichler and Shimura computed the Hasse-Weil zeta function of $J_0(q)$ in terms of modular forms: for $q$ prime it takes the form

\[ L(J_0(q),s) = \prod_{f \in S_2(q)^*} L(f,s). \]

If the Birch and Swinnerton-Dyer Conjecture for abelian varieties holds for $J_0(q)$, we have

\[ \operatorname{rank} J_0(q)(\mathbb{Q}) = \operatorname*{ord}_{s=1/2} L(J_0(q),s) = \sum_{f \in S_2(q)^*} \operatorname*{ord}_{s=1/2} L(f,s), \]

so lower bounds for the order of vanishing give lower bounds for the rank of $J_0(q)$. In fact, by the general form of the Gross-Zagier formula (Theorem 23.4) for an arbitrary weight 2 cusp form, one shows that if the order of vanishing is exactly 1, then $f$ contributes 1 to the rank (by conjugates of Heegner points). So Theorem 26.2 implies

THEOREM 26.5. For $q$ prime, $q \to +\infty$, we have

\[ \operatorname{rank} J_0(q)(\mathbb{Q}) \geq \bigl(\tfrac{7}{16} - o(1)\bigr)\dim J_0(q). \]

On the Birch and Swinnerton-Dyer Conjecture, Corollary 26.4 gives a corresponding fairly good upper bound for $\operatorname{rank} J_0(q)$. There is currently no comparable unconditional bound. Recall from Proposition 5.21 that, on GRH, the maximal order of vanishing of an L-function of degree $d$ with conductor $q$ at the central critical point is about $(\log q)(\log \frac{3}{d}\log q)^{-1}$. Assuming the Birch and Swinnerton-Dyer Conjecture, this translates to a conjectural upper bound for the maximum rank of an abelian variety over $\mathbb{Q}$. One can see it is best possible by taking powers of an elliptic curve with positive rank. However, more convincing is the case of $J_0(q)$ (which is conjecturally "almost" irreducible). For $q$ prime one gets by GRH and the Petersson formula the slightly better estimate (5.94). The conductor is $N_q = q^{\dim J_0(q)}$, hence by $\dim J_0(q) \sim q/12$ we have $\dim J_0(q) \sim (\log N_q)(\log\log N_q)^{-1}$. So Theorem 26.5 shows that the upper bound on GRH cannot be improved (except for the implied constant).

Of course one can also study classical Dirichlet L-functions. Michel and VanderKam [MvdK] studied the analogue of Theorem 26.3 for the family of primitive characters modulo $q$. A different challenge is to consider only real characters modulo $q \leq Q$. There Soundararajan [Sou] has proved

THEOREM 26.6. Let $\mathcal{Q}$ be the set of odd squarefree integers. For $q \in \mathcal{Q}$, let $\chi_{8q}(n) = \left(\frac{8q}{n}\right)$ be the even quadratic character modulo $8q$. Then

\[ \bigl|\{q \leq Q \mid q \in \mathcal{Q} \text{ and } L(\tfrac12,\chi_{8q}) \neq 0\}\bigr| \geq \bigl(\tfrac78 - o(1)\bigr)\,\bigl|\{q \leq Q \mid q \in \mathcal{Q}\}\bigr| \]

as $Q \to +\infty$.

Notice the coincidence of the proportion 7/8 with Theorem 26.2 (where a factor $\frac12$ really comes from consideration of odd forms). In fact, Soundararajan's proof uses a mollifier $M(\chi)$ as in Theorem 26.2 (see Section 26.2 and following), and remarkably for a mollifier of length $Q^{\Delta}$, the proportion obtained is $1 - (2\Delta+1)^{-3}$ exactly as in Theorem 26.9! The Katz-Sarnak philosophy of monodromy groups of a family of L-functions can nicely explain this phenomenon. This concept of monodromy group of families, although still lacking a formal definition, has already been used to explain very striking results. For example, Conrey and Soundararajan [CoSo] have been able to prove for the first time that there are infinitely many L-functions of real characters with no real zeros in the critical strip (not only at $s = \frac12$). The success of their approach (which depends on a constant turning out to be positive) was "predicted" by the conjectured symmetry of the family of real characters.

THEOREM 26.7. Let $\mathcal{Q}$ be as in Theorem 26.6. We have

\[ \bigl|\{q \leq Q \mid L(\chi_{-8q},s) \neq 0 \text{ for } s \in [0,1]\}\bigr| \geq \bigl(\tfrac15 - o(1)\bigr)\,\bigl|\{q \leq Q \mid q \in \mathcal{Q}\}\bigr| \]

as $Q \to +\infty$.

We conclude with a still challenging problem: prove that, given a modular form $f$ (say of weight 2 and level $q$), there exists a positive proportion of quadratic characters $\chi$ such that $L(f \otimes \chi, \frac12) \neq 0$. This is a conjecture of Goldfeld that still resists proof. The best known result is due to Ono [Ono], using algebraic methods (congruences and Fourier coefficients of half-integral weight modular forms), losing a factor $(\log Q)^{\delta}$ for some $\delta \in (0,1)$. Analytically, using Heath-Brown's large sieve inequality for real characters (Theorem 7.20), Perelli and Pomykala [PP] have the best result, with a factor $Q^{\varepsilon}$ lost.

REMARK. Other special points on the critical line are those $s_j = \frac12 + it_j$ associated to an eigenvalue of the Laplace operator of a Maass cusp form $\varphi_j$ for a congruence subgroup. The deformation theory of Phillips and Sarnak shows that the vanishing of the special value of the Rankin-Selberg L-function L(f ⊗
$S_2(q)^*$ (properly normalized) of primitive forms of $S_2(q)$, as described in Corollary 14.23 and afterwards. Let $f \in S_2(q)^*$. We denote its Fourier expansion

\[ f(z) = \sum_{n \geq 1} \lambda_f(n)\sqrt{n}\,e(nz) \]

and recall its multiplicative properties (14.50), (14.51). To prove Theorem 26.2, we first derive the weighted analogue:

THEOREM 26.8. We have

\[ \sum^{h}_{\substack{f \in S_2(q)^* \text{ odd} \\ L'(f,1/2) \neq 0}} 1 \;\geq\; \tfrac{7}{16} - o(1) \]

for primes $q \to +\infty$.

Recall from (14.60) that the $h$ in superscript indicates that the normalizing factor $(4\pi)^{-1}\|f\|^{-2}$ is inserted. The method of proof is based on comparison of the first and second moments of special values $L'(f,\frac12)$. Since "large values" of $L'(f,\frac12)$ have a large effect on the second moment, we use mollification, as in the proof of Selberg's theorem in Chapter 24. Let

\[ M_1 = \sum^{h}_{f \in S_2(q)^*} \tfrac12(1-\varepsilon_f)\,M(f)\,L'(f,\tfrac12) \]

and

\[ M_2 = \sum^{h}_{f \in S_2(q)^*} \tfrac12(1-\varepsilon_f)\,|M(f)\,L'(f,\tfrac12)|^2 \]

for some $M(f) \in \mathbb{C}$. Comparison of an upper bound for $M_2$ and a lower bound for $M_1$ yields by Cauchy's inequality a lower bound for the quantity we study

\[ \sum^{h}_{\substack{\varepsilon_f = -1 \\ L'(f,1/2) \neq 0}} 1 \;\geq\; \frac{M_1^2}{M_2}. \tag{26.2} \]
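The Cauchy inequality step behind (26.2) does not depend on the arithmetic of the family, so it can be checked on synthetic data. The following sketch is purely illustrative: the positive weights stand in for the harmonic weights, and the values stand in for the products $M(f)L'(f,\frac12)$ over odd forms; all names and the random data are assumptions made for the example.

```python
import random

def nonvanishing_lower_bound(weights, values):
    # M1 = sum of w * a and M2 = sum of w * a^2; the bound is M1^2 / M2.
    m1 = sum(w * a for w, a in zip(weights, values))
    m2 = sum(w * a * a for w, a in zip(weights, values))
    return m1 * m1 / m2

def nonvanishing_mass(weights, values):
    # Total weight carried by the entries whose value is non-zero.
    return sum(w for w, a in zip(weights, values) if a != 0)

random.seed(0)
weights = [random.uniform(0.5, 1.5) for _ in range(1000)]
values = [0.0 if random.random() < 0.4 else random.uniform(0.1, 3.0) for _ in range(1000)]

bound = nonvanishing_lower_bound(weights, values)
mass = nonvanishing_mass(weights, values)
```

The bound is sharp exactly when the non-zero values are all equal, which is why a mollifier that flattens the values improves the resulting proportion.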
In order to achieve the best possible estimate, we will find asymptotics for $M_1$ and $M_2$. Note that if the mollifier is ignored (take $M(f) = 1$), a factor $\log q$ is lost in the final estimate. To get sums amenable to Petersson's formula, we look for mollifiers as linear forms in $\lambda_f(m)$:

\[ M(f) = \sum_{m \leq M} \frac{x_m}{\sqrt{m}}\,\lambda_f(m) \tag{26.3} \]

for some $M = q^{\Delta}$ with $0 \leq \Delta < \frac12$ and real coefficients $x_m$ supported on squarefree numbers $\leq M$, such that (26.4) for some absolute implied constant. Theorem 26.8 follows by letting $\Delta \to \frac12$ in the following

THEOREM 26.9. For $0 \leq \Delta < 1/2$, we have

\[ \sum^{h}_{\substack{f \in S_2(q)^* \text{ odd} \\ L'(f,1/2) \neq 0}} 1 \;\geq\; \frac12\Bigl(1 - \frac{1}{(2\Delta+1)^3}\Bigr) \]

for all primes $q$ large enough in terms of $\Delta$.
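As a small arithmetic check of how Theorem 26.9 feeds into Theorem 26.8, the bound can be evaluated exactly with rational numbers. This is a sketch only; the function name is illustrative.

```python
from fractions import Fraction

def mollified_proportion(delta):
    # Lower bound of Theorem 26.9: (1/2) * (1 - 1 / (2*delta + 1)^3).
    delta = Fraction(delta)
    return Fraction(1, 2) * (1 - 1 / (2 * delta + 1) ** 3)

no_mollifier = mollified_proportion(0)
limit_value = mollified_proportion(Fraction(1, 2))
```

At $\Delta = 0$ the bound is empty, and as $\Delta$ approaches $\frac12$ it increases to the constant appearing in Theorem 26.8.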


26.3. Formulas for the first and the second moment. The moments $M_1$ and $M_2$ can both be reduced to combinations of averages of $L'(f,\frac12)$ or $L'(f,\frac12)^2$ twisted by Hecke eigenvalues. For $m \geq 1$, let (26.5) (26.6)

We have (26.7) and by (26.3)

(26.8) We will obtain asymptotic formulas for $D(m)$ and $H(m)$ valid if $m$ is small enough with respect to $q$. We first use contour integration and the functional equation to express $L'(f,\frac12)$ and $L'(f,\frac12)^2$ as sums of rapidly convergent series. This could be derived from (5.12) by differentiation, but we redo the computations anyway. Let $N \geq 2$ and $G$ be a fixed polynomial such that $G(-s) = G(s)$, $G(-N) = \cdots = G(-1) = 0$ and $G(0) = 1$. In particular, $G'(0) = G^{(3)}(0) = 0$. Consider

\[ I = \frac{1}{2\pi i}\int_{(2)} \Lambda(f,s+\tfrac12)\,G(s)\,\frac{ds}{s^2}. \]

We can shift the contour of integration to the line $\mathrm{Re}(s) = -2$, picking up a double pole at $s = 0$, hence

\[ I = \operatorname*{res}_{s=0} \Lambda(f,s+\tfrac12)\,\frac{G(s)}{s^2} + \frac{1}{2\pi i}\int_{(-2)} \Lambda(f,s+\tfrac12)\,G(s)\,\frac{ds}{s^2}. \]

By the functional equation (Theorem 14.17), with sign $\varepsilon_f$, the integral on $\mathrm{Re}(s) = -2$ is equal to $\varepsilon_f I$, so

\[ (1-\varepsilon_f)\,I = \operatorname*{res}_{s=0} \Lambda(f,s+\tfrac12)\,\frac{G(s)}{s^2}. \]

Computing the residue by writing the Taylor expansions around 0, we obtain


On the other hand, we can compute $I$ by expanding the L-function in an absolutely convergent Dirichlet series on the line $\mathrm{Re}(s) = 2$, which gives

where

\[ V(y) = \frac{1}{2\pi i}\int_{(3/2)} \Gamma(s+1)\,G(s)\,y^{-s}\,\frac{ds}{s^2}. \]

Hence, comparing both expressions, we have

\[ (1-\varepsilon_f)\,L'(f,\tfrac12) = 2(1-\varepsilon_f)\sum_{\ell \geq 1} \frac{\lambda_f(\ell)}{\sqrt{\ell}}\,V\Bigl(\frac{2\pi\ell}{\sqrt{q}}\Bigr). \tag{26.9} \]

We estimate $V$ easily by shifting the contour to the left, or right: we have

\[ V(y) = -\log y - \gamma + O(y), \tag{26.10} \]
\[ V(y) \ll y^{-j} \quad \text{for all } j \geq 1. \tag{26.11} \]

Similarly, we consider the integral

\[ J = \frac{1}{2\pi i}\int_{(2)} \Lambda(f,s+\tfrac12)^2\,G(s)\,\frac{ds}{s^3} \]

and proceed to evaluate it in the same fashion as for $I$. Shifting the contour to $\mathrm{Re}(s) = -2$ and applying the functional equation for the square of the L-function $\Lambda(f,s)^2 = \Lambda(f,1-s)^2$ (where the sign is $\varepsilon_f^2 = 1$), we get

\[ 2J = \operatorname*{res}_{s=0} \Lambda(f,s+\tfrac12)^2\,\frac{G(s)}{s^3}. \]

Further, from the multiplicativity of the Hecke eigenvalues (see (14.50)) we derive the Dirichlet series expansion

\[ L(f,s)^2 = \zeta_q(2s)\sum_{n \geq 1} \tau(n)\,\lambda_f(n)\,n^{-s}, \]

so the term by term integration yields

where

\[ W(y) = \frac{1}{2\pi i}\int_{(1/2)} \zeta_q(1+2s)\,\Gamma(s)^2\,G(s)\,y^{-s}\,\frac{ds}{s}. \tag{26.12} \]

Hence

\[ \frac{\sqrt{q}}{\pi}\sum_{n \geq 1} \frac{\lambda_f(n)}{\sqrt{n}}\,\tau(n)\,W\Bigl(\frac{4\pi^2 n}{q}\Bigr) = \operatorname*{res}_{s=0} \Lambda(f,s+\tfrac12)^2\,\frac{G(s)}{s^3}. \]


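For odd $f$, we have $L(f,\frac12) = \Lambda(f,\frac12) = 0$ so the above residue equals $\Lambda'(f,\frac12)^2 = \frac{\sqrt{q}}{2\pi}L'(f,\frac12)^2$ from the Taylor expansions of $\Lambda(f,s+\frac12)$ and $G(s)$. Therefore for odd forms we have

\[ L'(f,\tfrac12)^2 = 2\sum_{n \geq 1} \frac{\lambda_f(n)}{\sqrt{n}}\,\tau(n)\,W\Bigl(\frac{4\pi^2 n}{q}\Bigr). \tag{26.13} \]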

The function W satisfies the bound (26.14)

« y'W(i)(y) «

yiW(j)(y)

(26.15)

lol(Y

+ y- 1 ),

y-1,

for all i? 0, j? 1.

for all i? j? 0,

(the implied constant depending on $i$ and $j$) and there exists a polynomial $P$, independent of $q$, of degree at most 2, such that

\[ W(y) = -\frac{1}{12}(\log y)^3 + P(\log y) + O\bigl(q^{-1}(\log y)^2 + y\bigr). \tag{26.16} \]

This last fact follows by writing

\[ W(y) = \operatorname*{res}_{s=0} \frac{G(s)\,\Gamma(s)^2\,\zeta_q(1+2s)}{s\,y^{s}} + O(y) \]

by shifting the contour, and computing the residue. Using (26.5) and (26.9) we get

\[ D(m) = \sum_{\ell} \frac{1}{\sqrt{\ell}}\,V\Bigl(\frac{2\pi\ell}{\sqrt{q}}\Bigr)\Delta'(\ell,m) \]

where

\[ \Delta'(\ell,m) = \sum^{h}_{f} (1-\varepsilon_f)\,\lambda_f(\ell)\,\lambda_f(m). \]

We now appeal to (14.65) of Corollary 14.26 which shows that approximation to the Kronecker delta

t.'(l,m)

= 8(/,m)

+ oC3 (~m))

C;)

t.' is a strong

i(logq)i).

One could use also (14.64), keeping the Kloosterman sums, but this will not be necessary for the application to lvfr and Theorem 26.2. Hence we have by (26.10) and (26.11) D(m) = _1_v(2?rm)

Vm

y/q

+ O(q-3/Sml/4(logq)2)

with an absolute implied constant. By (26.7), (26.10) and (26.4) we get PROPOSITION

26.10. We have form? 1,

(26.17)

where ij = vq/(2?re'), and if M = qt>, then (26.18)


where 8 = ~(~- t.).

In the following, when we write an error term of the form $O(q^{-\delta})$, it is implied that $\delta > 0$, $\delta$ depends only on $\Delta$, and the value of $\delta$ may change from line to line. We now proceed to get an expression for $M_2$ as a quadratic form in the $x_m$. The idea is the same as above, but we cannot simply estimate the contribution of the sum of Kloosterman sums in the Petersson formula by Weil's bound: we open the Kloosterman sums and transform further to derive the required expression. Let $m \geq 1$ be fixed, $m \leq q^{2\Delta} < q$. We are now ready to compute $M_2$. By (26.6) and (26.13) we have

\[ H(m) = \sum_{n \geq 1} \frac{\tau(n)}{\sqrt{n}}\,W\Bigl(\frac{4\pi^2 n}{q}\Bigr) \sum^{h}_{f} (1-\varepsilon_f)\,\lambda_f(m)\,\lambda_f(n) = \sum_{n \geq 1} \frac{\tau(n)}{\sqrt{n}}\,W\Bigl(\frac{4\pi^2 n}{q}\Bigr)\Delta'(m,n). \]

Our previous approximation to t.'(m, n) by 8(m, n) is not sufficiently strong. From Corollary 14.26 we have the more precise formula 1 271" "'L.."""" -S(mij,n;r)J 1 (471" ff-n) t.(m,n)=8(m,n)-1 Vii (r,q)=l r r q

I

(mn)4 ) + 0 ( T3((m,n))-q-(logq) Hence by (26.14), (26.15)

(26.19) where X(m) is a sum of series of Kloosterman sums

(26.20)

X(m) =

~

L

yfq

(r,q)=l

r-

1

Xr(m), 2

(26.21)

n) ~(n).

_ (471" Xr(m) =- "'""""T(n) L..- 'n S(mq,n;r)JI --:;: ff-n) W (4n -q

n:;n Y"

q

For technical reasons we have inserted in the summation the factor a fixed function~ : JR+ ---> [0, 1], which is coo and satisfies ~(x) = 0, 0 '( x '( ~,

~(n) where~

is

~(x) = 1, x ~ 1.

Obviously this extra factor doesn't affect the sum (all positive integers are at least 1!), but it will be useful to gain convergence in some series appearing later. We first estimate the contribution of those terms with r > R in (26.20), where R > 0 will be chosen later (R = q2 ). Using J 1 (x) « x, Weil's bound for Kloosterman sums, (26.14) and (26.15), we get

2 71" y/q

L

~Xr(m) «

r>R r (r,q)=l

§(logq) 11 •

VR

I


586

We denote now X'(m) the remaining part of the sum in X(m), namely

X'(m) =

(26.22)

2 7r

L

y/q

r<:,R

r- 1 Xr(m).

(r,q)=1

Let r <:,; R. The expression Xr(m) can be computed using the summation formula (4.56) for sums of Kloosterman sums. This yields

2 Xr(m) = --S(m,O;r) r

1=

Vx

(log- +!')t(x)dx r

0

27r +-;Lr(h)S(hq-m,O;r) h;;,1

1+= 0

Yo (47rVhx) - r - t(x)dx

1+=

- -4 Lr(h)S(hq+m,O;r) r h;;,1 o

Ko (47rVhx) - - t(x)dx r

where

(26.23) Hence

(26.24)

X '( m) = -47r

"""" L..- 21 S(m, 0; r ) L(r)

y/q

r

r<(,R

(r,q)=1

47r 2

+/7i

1 2 :L:r(h)S(hq-m,O;r)y(h)

L

r

r<:,R

h;;,1

(r,q)=1

8

-

~

Vq

"

L

~ Lr(h)S(hq+m,O;r)k(h)+O((iogq)

r<:,R

r

h)1

5 )

y/q

(r,q)=1

with

2 r= (log--:;:vfx + 1')11 (47r ~) (47r X) dx -:;: Vq W -q- ,jX'

(26.25)

L(r) = Jo

(26.26)

y(h) = Jo

(26.27)

k(h) =

r= 1+= Ko( 7r~)t(x)dx, Yo (47rVhx) - r - t(x)dx, 4

the error term coming from removing the function ~(x) from the first term using t(x) « (logq) 3 x- 1 12 for 0 <:,; x <:,; 1, and IS(m,O; r)l <:,; r.

r


Let X"(m) be the first term in X'(m). We have thus by changing the integraz tion variable x into 'i!¥-, 1 X"(m) = -2 "'"""' L..- -S(m,O;r) r

1= 0

r~R

.Jiii +'1")11 (2vmx)W(r 2 x) dx'(log2n yx

(r,q)=1

I

_2__,

=

2nz

Z!{1 + 2s)(q(1 + 2s)s- 1 r(s) 2G(s)L{s)ds,

(1/2)

by {26.12), with

L

Z!(s) =

S(m,O;r)r-s,

r~R

(r,q)=1

L(s) = -2

+= 1 0

.Jiii )1I(2vmx)x-•-If2dx. (log-+ 1 271"

Extending the sum over r to all r

~

1, we get

Z!(s) = (q(s)- 1

Ld

1

-s

+ O(r(m)R- 1 )

dim

if Re (s) = 2 by the formula (3.2) for the Ramanujan sum. Moreover, for all s with < Re (s) < 1, we have

i

(26.28) where 1/; = r' jr and -2 < Re(s) < -~)

1+= o

1+=

Q = q/4n 2 Indeed formula 14.561.14 of [GR] gives (for 8

J1(x)x dx = 2 8

(logx)J1(x)x dx =

8

r(1

+ .§.)

- ( ;)

r

1- 2

2 8 ~~~ ~ :~ (log2 + ~1/1(1 + D+ ~1/1(1- ~))

and a simple computation gives (26.28). Consequently we have X"(m) =

I

_2__, 2nz

(-2)a-2s(m)s- 1r(s) 2G(s)L(s)ds + o(T(m) logq) R

(1/2)

since (on Re (s) = ~) (q(1 + 2s)r(s) 2G(s)L(s)

« lr(s)r(-s)s- 1 l(logq + 11/1(1 + s)l + 11/1(1- s)i).

The function F(s) being integrated can be written

F(s) = m- 112s17s(m)- 1G(s)r(s)r( -s) (log~ + 21" + 1/;(1 + s) + 1/;(1- s)) where

1"/.(mJ =

L

ab=m

Gr



(Fourier coefficients of the non-holomorphic Eisenstein series for S£(2, Z)). Thus, F(s) is an odd fnnction of s. Moreover, it is holomorphic in the strip IRe (s)i < 1, except for a triple pole at s = 0, and decreases exponentially in vertical strips. Shifting the contour toRe (s) =-~and then changing s into -s, we conclude that X"(m) =

~ ~~~F(s) +ocr;) logq).

Expanding around s = 0 we have s- 1 r(s)r( -s) = -s- 3 +

(l- r"(1))s- 1 + O(s),

G(s) = 1 + ~G"(O)s 2 + O(s 3 ), 21" + ¢(1 + s) + ¢(1- s) = ¢"(O)s 2 + O(s 4 ), 1Js(m) = T(m) + ~T(m)s 2 + O(s 3 ) where (26.29)

T(m) =

L

(log

~t

ab=m

Combining those, we obtain therefore X"(m) =- T(m) (log

4vm

(26.30)

2) +a T(m) (log 2) + 0( m Vm m

7

(m) loo-q), R

b

2

where a= ~(1" - r"(1)- ~G"(O) -1/J"(O)). Coming to the other two terms in (26.24), we claim that for q < m and R :( q2 (26.31) (26.32) for any c > 0, the implied constant depending only on c. We choose R = q2 By (26.20), (26.22), (26.24), (26.30), (26.31), (26.32) together, we deduce (26.33)

X(m)

=-~~(log~)+ a 7~ (log~)+ 0( (v;n + ~)q')

Inserting this in (26.19) and appealing to (26.16), we obtain

PROPOSITION 26.11. The twisted mean-value (26.6) is given by

\[ H(m) = \frac{1}{12}\,\frac{\tau(m)}{\sqrt{m}}\Bigl(\log\frac{Q}{m}\Bigr)^3 - \frac14\,\frac{T(m)}{\sqrt{m}}\Bigl(\log\frac{Q}{m}\Bigr) + \frac{\tau(m)}{\sqrt{m}}\,P_1\Bigl(\log\frac{Q}{m}\Bigr) + O\bigl(m^{1/4}q^{-1/4+\varepsilon}\bigr) \tag{26.34} \]

where $Q = q/4\pi^2$ and $P_1(X) = P(X) + aX$, for $1 \leq m \leq q^{2\Delta}$ with $\Delta < 1/2$. The implied constant depends only on $\varepsilon$ and $\Delta$.

REMARK. In particular, for $m = 1$, (26.17) and (26.34) give asymptotic formulas for the first and second moments of the special values of the derivatives $L'(f,\frac12)$:

\[ \sum^{h}_{f \in S_2(q)^*} \tfrac12(1-\varepsilon_f)\,L'(f,\tfrac12) = \tfrac12\log q - \gamma - \log 2\pi + O(q^{-1/2}), \tag{26.35} \]

\[ \sum^{h}_{f \in S_2(q)^*} \tfrac12(1-\varepsilon_f)\,L'(f,\tfrac12)^2 = \frac{1}{12}\Bigl(\log\frac{q}{4\pi^2}\Bigr)^3 + P_1\Bigl(\log\frac{q}{4\pi^2}\Bigr) + O(q^{-1/4+\varepsilon}). \tag{26.36} \]

Finally, the expression for the second moment of special values as quadratic form in the mollifier coefficients follows from (26.8) and (26.34).

THEOREM 26.12. Assume $M = q^{\Delta}$ with $\Delta < \frac12$. Then

\[ M_2 = \frac{1}{12}M_{21} - \frac14 M_{22} + M_{23} + O(q^{-\delta}) \]

where $M_{21}$, $M_{22}$ and $M_{23}$ are quadratic forms in the variables $x_m$ given by

\[ M_{21} = \sum_{b} \frac{1}{b}\sum_{m_1,m_2} \frac{\tau(m_1m_2)}{m_1m_2}\,x_{bm_1}x_{bm_2}\Bigl(\log\frac{Q}{m_1m_2}\Bigr)^3, \tag{26.37} \]

(26.38)

(26.39) and $\delta$ is a positive number depending only on $\Delta$. Recall that $Q = q/(4\pi^2)$.

i-

REMARK. In fact our proof is complete only for a smaller range of $\Delta$. This limitation comes from the error term in (26.19), all other parts being adequate for $\Delta < \frac12$. The full extension to this case can be established by arguments essentially the same as in the other parts, so we omit the details. The proofs of (26.31) and (26.32) require only upper bounds for the integrals $y(h)$ and $k(h)$. These are estimated by standard integration by parts (as in Section 4.4). Combining with the bound for Ramanujan sums $|S(hq \pm m, 0; r)| \leq (hq \pm m, r)$ one derives the results (for details see the original paper [KM1]).

26.4. Optimizing the mollifier. Now (26.18) expresses $M_1$ as a linear form, and Theorem 26.12 expresses $M_2$ as a quadratic form, in the coefficients $x_m$. The optimal choice of $x_m$ to maximize $M_1^2/M_2$ (see (26.2)) is not easy to express explicitly because the quadratic form is not diagonal. Our strategy will be to decompose $M_2$ as a sum of quadratic forms, one of which, say $M_2'$, is easily diagonalized. We then choose $x_m$ to maximize $M_1^2/M_2'$, and simply evaluate the other quadratic forms for this chosen vector. One could justify afterwards that this is asymptotically best possible even for the full quadratic form. It turns out that both $M_{21}$ and $M_{22}$ in Theorem 26.12 have contributions of the same order of magnitude, whereas $M_{23}$ is smaller.



We separate $m_1$ and $m_2$ in (26.37) by means of the formula

$$\tau(m_1m_2) = \sum_{a\mid(m_1,m_2)}\mu(a)\,\tau\Big(\frac{m_1}{a}\Big)\tau\Big(\frac{m_2}{a}\Big)$$

getting (after relaxing the resulting coprimality condition by means of the Mobius function)

Expanding the logarithm, $M_{21}$ is a linear combination of the quadratic forms

$$\Pi(t,u,v,w) = (\log Q)^{u}\sum_{k}\nu_t(k)\,y_k^{(v)}y_k^{(w)}$$

where

$$y_k^{(i)} = \sum_{m}\frac{\tau(m)}{m}(\log m)^{i}\,x_{km},\qquad \nu_t(k) = \frac1k\sum_{ab=k}\frac{\mu(a)}{a}(\log a)^{t},$$

and $t$, $u$, $v$ and $w$ are non-negative integers such that $t + u + v + w = 3$. In fact $M_{23}$ is also such a linear combination, however with parameters such that $t + u + v + w \le 2$. For the chosen vector $(x_m)$ it will be clear that

$$\Pi(t,u,v,w) \ll \Pi(0,u,v,w)\,\frac{(\log\log q)^{t+2}}{\log q}$$

by direct estimation, so we can restrict our attention to $\Pi(u,v,w) = \Pi(0,u,v,w)$. Accordingly we write $\nu(k)$ for $\nu_0(k)$. Note that for $k \le M$, we have $\nu(k) = \varphi(k)k^{-2}$. The contribution to $M_{21}$ of those $\Pi(u,v,w)$ is the quadratic form

$$m_{21} = \Pi(3,0,0) - 6\Pi(2,1,0) + 6\Pi(1,1,1) + 6\Pi(1,2,0) - 6\Pi(0,1,2) - 2\Pi(0,0,3)$$

(using the obvious symmetry $\Pi(u,v,w) = \Pi(u,w,v)$). We select the one quadratic form $\Pi = \Pi(3,0,0)$ as reference and we choose $(x_m)$ which maximizes $M_1^2/\Pi$. Then we evaluate the other quadratic forms $\Pi(u,v,w)$ and $M_{22}$ for that vector. By definition, $\Pi$ is already diagonalized:

$$\Pi = (\log Q)^3\sum_{k}\nu(k)\,y_k^2.$$
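The separation formula for $\tau(m_1m_2)$ and the closed form $\nu(k) = \varphi(k)k^{-2}$ can both be confirmed numerically for small arguments. The following is a minimal Python sketch; the helper names (`divisors`, `mobius`, `tau_separated`, `nu0`) are ours and are not taken from the text.

```python
from fractions import Fraction
from math import gcd


def divisors(n):
    """All positive divisors of n (naive, adequate for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def tau(n):
    """Number of divisors of n."""
    return len(divisors(n))


def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # n had a square factor
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def phi(n):
    """Euler's totient by direct count."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def tau_separated(m1, m2):
    """Sum over a | (m1, m2) of mu(a) * tau(m1/a) * tau(m2/a)."""
    return sum(mobius(a) * tau(m1 // a) * tau(m2 // a)
               for a in divisors(gcd(m1, m2)))


def nu0(k):
    """nu_0(k) = (1/k) * sum over ab = k of mu(a)/a, as an exact fraction."""
    return sum(Fraction(mobius(a), a) for a in divisors(k)) / k


separation_ok = all(tau_separated(m1, m2) == tau(m1 * m2)
                    for m1 in range(1, 41) for m2 in range(1, 41))
nu_ok = all(nu0(k) == Fraction(phi(k), k * k) for k in range(1, 200))
print(separation_ok, nu_ok)
```

Exact rational arithmetic is used for $\nu_0$ so that the comparison with $\varphi(k)k^{-2}$ is an equality test rather than a floating-point approximation.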

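The $x_m$ are recovered by the formula

$$x_m = \sum_{k}\frac{g(k)}{k}\,y_{km} \tag{26.40}$$

where $g = \mu * \mu$ is the Dirichlet convolution inverse of $\tau$. Hence the linear form $M_1$ can be written as

$$M_1 = \sum_{m}\frac{x_m}{m}\log\frac{\hat q}{m} = \sum_{k} j(k)\,y_k$$

where

$$j(k) = \frac1k\sum_{ab=k} g(a)\log\frac{\hat q}{b}.$$

Since

$$\sum_{k\ge1} g(k)k^{-s} = \zeta(s)^{-2},$$

we have

$$\sum_{k\ge1} j(k)k^{-s} = \zeta(s+1)^{-2}\big((\log\hat q)\,\zeta(s+1) + \zeta'(s+1)\big) = \frac{\log\hat q}{\zeta(s+1)} - (\zeta^{-1})'(s+1)$$

which shows that

$$j(k) = \frac{\mu(k)}{k}\log(\hat q k). \tag{26.41}$$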
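The inversion (26.40) and the identity $\sum_m (x_m/m)\log(\hat q/m) = \sum_k j(k)y_k$ with $j$ as in (26.41) can be checked on a small test vector. The sketch below is illustrative only: the length `M = 30`, the level `q = 10007`, the normalization $\hat q = \sqrt q/(2\pi)$ and the test vector `x` are assumptions made for the example (the identity itself holds for any positive value of $\hat q$).

```python
from fractions import Fraction
from math import isclose, log, pi, sqrt

M = 30  # hypothetical mollifier length, small enough for exact arithmetic


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def tau(n):
    return len(divisors(n))


def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def g(n):
    """g = mu * mu, the Dirichlet convolution inverse of tau."""
    return sum(mobius(a) * mobius(n // a) for a in divisors(n))


# Arbitrary rational test vector supported on 1..M (not an optimized mollifier).
x = {m: Fraction((7 * m * m + 3) % 11 - 5, m + 1) for m in range(1, M + 1)}

# Forward transform: y_k = sum over m of tau(m)/m * x_{km}.
y = {k: sum(Fraction(tau(m), m) * x[k * m] for m in range(1, M // k + 1))
     for k in range(1, M + 1)}

# Inversion (26.40): x_m = sum over k of g(k)/k * y_{km}.
x_back = {m: sum(Fraction(g(k), k) * y[k * m] for k in range(1, M // m + 1))
          for m in range(1, M + 1)}

q = 10007                    # hypothetical level
q_hat = sqrt(q) / (2 * pi)   # assumed normalization of q-hat

# The linear form written in the x-variables and in the y-variables.
m1_from_x = sum(float(x[m]) / m * log(q_hat / m) for m in range(1, M + 1))
m1_from_y = sum(mobius(k) / k * log(q_hat * k) * float(y[k])
                for k in range(1, M + 1))

inversion_ok = (x_back == x)
linear_form_ok = isclose(m1_from_x, m1_from_y, rel_tol=1e-9, abs_tol=1e-9)
print(inversion_ok, linear_form_ok)
```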

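By Cauchy's inequality we have

$$M_1^2 \le \Big(\sum_{k}\frac{j(k)^2}{\nu(k)}\Big)\Big(\sum_{k}\nu(k)\,y_k^2\Big)$$

with equality if $y_k = j(k)/\nu(k)$ for $k \le M$. Hence the best choice of mollifier is

$$y_k = \begin{cases}\dfrac{\mu(k)k}{\varphi(k)}\log(\hat q k), & \text{if } k \le M,\\[1ex] 0, & \text{if } k > M.\end{cases}$$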
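The optimization step is an instance of the equality case in Cauchy's inequality: the ratio (linear form)$^2$/(diagonal quadratic form) is maximized by $y_k = j(k)/\nu(k)$, with maximum $\sum_k j(k)^2/\nu(k)$. A minimal numerical sketch, with hypothetical parameters `M = 200` and `q = 10007` and the assumed normalization $\hat q = \sqrt q/(2\pi)$:

```python
import random
from math import gcd, isclose, log, pi, sqrt

M = 200                          # hypothetical mollifier length
q_hat = sqrt(10007) / (2 * pi)   # assumed normalization, hypothetical level


def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def phi(n):
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


ks = list(range(1, M + 1))
j = {k: mobius(k) / k * log(q_hat * k) for k in ks}   # coefficients of the linear form
nu = {k: phi(k) / k ** 2 for k in ks}                 # diagonal of the quadratic form


def ratio(y):
    """(sum j_k y_k)^2 / (sum nu_k y_k^2)."""
    linear = sum(j[k] * y[k] for k in ks)
    quadratic = sum(nu[k] * y[k] ** 2 for k in ks)
    return linear * linear / quadratic


y_opt = {k: j[k] / nu[k] for k in ks}
best = ratio(y_opt)
bound = sum(j[k] ** 2 / nu[k] for k in ks)

# Random perturbations of the optimal vector never beat the optimum.
rng = random.Random(0)
perturbed = [ratio({k: y_opt[k] + rng.uniform(-1.0, 1.0) for k in ks})
             for _ in range(100)]
print(isclose(best, bound, rel_tol=1e-9), max(perturbed) <= best)
```

Note that `y_opt[k]` vanishes whenever `mobius(k) == 0`, which is the support on squarefree numbers.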

In particular $y_k$, and so also $x_m$ (see (26.40)), is supported on squarefree numbers and the growth condition (26.4) is immediately verified. Since $j(k)$ is about $(\log k)/k$ and $\nu$ is about $k^{-1}$, it is quite clear from looking at the various expressions involved that we will get a positive (harmonic) proportion if $M = q^{\Delta}$ for any $\Delta > 0$, with both $M_1^2$ and $M_2$ being of size $(\log q)^6$. The results of evaluating $M_1$, $M_{21}$ and $M_{23}$ are as follows.

PROPOSITION 26.13. Let $M = q^{\Delta}$ and $x_m$ be as above. We have

$$M_1 = \Delta\Big(\frac{\Delta^2}{3} + \frac{\Delta}{2} + \frac14\Big)(\log q)^3 + O((\log q)^2),$$

$$M_{21} = \Big(\frac43\Delta^6 + \frac{28}{5}\Delta^5 + \frac{33}{4}\Delta^4 + \frac{35}{6}\Delta^3 + 2\Delta^2 + \frac14\Delta\Big)(\log q)^6 + O((\log q)^5),$$

$$M_{23} \ll (\log q)^5(\log\log q)^4.$$

PROOF. For $M_1$, by definition of $y_k$ and (26.41), we have

$$M_1 = (\log Q)^{-3}\,\Pi = \sum_{k}\frac{j(k)^2}{\nu(k)} = \sum_{k\le M}\frac{\mu(k)^2}{\varphi(k)}(\log\hat q k)^2$$

and the result follows by partial summation from

$$\sum_{k\le x}\frac{\mu(k)^2}{\varphi(k)} = \log x + O(1).$$
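The partial summation rests on the fact that $\sum_{k\le x}\mu(k)^2/\varphi(k)$ differs from $\log x$ by a bounded amount. A short sieve-based sketch makes this visible; the function names are ours, and the cut-offs are arbitrary.

```python
from math import log


def mu_phi_sieve(n):
    """Return lists mu[0..n], phi[0..n] computed by a simple prime sieve."""
    mu = [1] * (n + 1)
    phi = list(range(n + 1))
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for m in range(p, n + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]            # one more prime factor
                phi[m] -= phi[m] // p     # multiply by (1 - 1/p)
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0                 # divisible by a square
    return mu, phi


def squarefree_phi_sum(x, mu, phi):
    """Sum of mu(k)^2 / phi(k) over 1 <= k <= x."""
    return sum(1.0 / phi[k] for k in range(1, x + 1) if mu[k] != 0)


N = 10 ** 4
mu, phi = mu_phi_sieve(N)
offsets = {x: squarefree_phi_sum(x, mu, phi) - log(x)
           for x in (10 ** 2, 10 ** 3, 10 ** 4)}
print(offsets)
```

The printed differences stay bounded as `x` grows, which is all the proof needs; integrating $(\tfrac12\log q + u)^2$ against $du$ over $0 \le u \le \Delta\log q$ then gives the leading term $\Delta(\Delta^2/3 + \Delta/2 + 1/4)(\log q)^3$.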



For $M_{21}$, we have

$$\Pi(u,v,w) = (\log Q)^{u}\sum_{k}\nu(k)\,y_k^{(v)}y_k^{(w)}$$

and we express $y_k^{(i)}$ in terms of $y_k$ using the higher von Mangoldt function $\Lambda_i = \mu * (\log)^i$. Precisely we have

$$y_k^{(i)} = \sum_{\ell\le M/k}\frac{\tau(\ell)}{\ell}\,\Lambda_i(\ell)\,y_{k\ell}. \tag{26.42}$$
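Formula (26.42) comes down to an identity between arithmetic functions on squarefree integers: the coefficient $\sum_{mn=\ell}\tau(m)(\log m)^i g(n)$ obtained by inserting (26.40) into the definition of $y_k^{(i)}$ equals $\tau(\ell)\Lambda_i(\ell)$ when $\ell$ is squarefree (which suffices, since $y_{k\ell}$ is supported on squarefree numbers). A minimal floating-point check, with helper names of our own choosing:

```python
from math import isclose, log


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def tau(n):
    return len(divisors(n))


def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def g(n):
    """g = mu * mu, the Dirichlet inverse of tau."""
    return sum(mobius(a) * mobius(n // a) for a in divisors(n))


def lambda_i(n, i):
    """Higher von Mangoldt function: (mu * log^i)(n)."""
    return sum(mobius(a) * log(n // a) ** i for a in divisors(n))


def coefficient(l, i):
    """Coefficient of y_{k l} / l after inverting: sum over mn = l of tau(m) log(m)^i g(n)."""
    return sum(tau(m) * log(m) ** i * g(l // m) for m in divisors(l))


squarefree = [l for l in range(1, 400) if mobius(l) != 0]
identity_ok = all(
    isclose(coefficient(l, i), tau(l) * lambda_i(l, i), rel_tol=1e-9, abs_tol=1e-9)
    for l in squarefree for i in (1, 2, 3)
)
# Lambda_i vanishes on integers with more than i distinct prime factors.
vanishing_ok = abs(lambda_i(2 * 3 * 5 * 7, 3)) < 1e-9
print(identity_ok, vanishing_ok)
```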

See Section 1.4 ((1.43) and after) for the properties of $\Lambda_i$ that we need. The sum in (26.42) is thus a sum over squarefree $\ell$ having at most $i$ prime factors ($i \le 3$). To evaluate it we can find the associated Dirichlet series or, more elementarily, separate the sum into the parts with a fixed number of prime factors. Those are multiple sums over primes, and they are of Mertens type since $\tau(\ell)\ell^{-1} = 2^{j}\ell^{-1}$ for such $\ell$ with $\omega(\ell) = j$ prime factors, hence easy to evaluate asymptotically. We quote the result (see [KM1], 2.5.2, for details): one finds

$$y_k^{(i)} = c_i\,\frac{k\mu(k)}{\varphi(k)}\Big(\log\frac{M}{k}\Big)^{i}\log\big(\hat q^{\,i+1}M^{i}k\big) + O\Big(\frac{k}{\varphi(k)}(\log q)^{i}\log\log q\Big)$$

for $i = 1, 2, 3$ with $c_1 = -1$, $c_2 = \tfrac13$ and $c_3 = 0$. Then the computation of the various $\Pi(u,v,w)$ is a simple matter of summation by parts (recall $\log\hat q = \log\sqrt q + O(1)$). When everything is computed, the value of $M_{21}$ is obtained as claimed. □
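The final summation by parts can be carried out mechanically. Writing $k = q^{x}$ with $0 \le x \le \Delta$, replacing $\sum_k \mu(k)^2/\varphi(k)$ by $\int dx$, and using the main terms of $y_k^{(i)}$, each $\Pi(u,v,w)$ becomes an integral of a polynomial in $x$ and $\Delta$. The sketch below does this with exact fractions. It is a check of the leading coefficients only, under the assumptions $c_1 = -1$, $c_2 = 1/3$, $c_3 = 0$ and $\log\hat q \approx \tfrac12\log q$, $\log Q \approx \log q$; the tiny polynomial helpers are ours.

```python
from fractions import Fraction as F

# A polynomial in (x, D) is a dict {(i, j): c} standing for c * x**i * D**j,
# where D plays the role of Delta and k = q**x.


def linear(c0=0, cx=0, cd=0):
    """The polynomial c0 + cx*x + cd*D."""
    return {(0, 0): F(c0), (1, 0): F(cx), (0, 1): F(cd)}


def mul(p, r):
    out = {}
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in r.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, F(0)) + c1 * c2
    return out


def combine(terms):
    """Sum of c * p over (c, p) pairs."""
    out = {}
    for c, p in terms:
        for key, v in p.items():
            out[key] = out.get(key, F(0)) + F(c) * v
    return out


def integrate_x_from_0_to_D(p):
    """Integrate in x over [0, D]; returns {degree in D: coefficient}."""
    out = {}
    for (i, j), c in p.items():
        out[i + j + 1] = out.get(i + j + 1, F(0)) + c / (i + 1)
    return {d: c for d, c in out.items() if c != 0}


t = linear(0, -1, 1)                                   # log(M/k) / log q = D - x
y0 = linear(F(1, 2), 1, 0)                             # log(q_hat k) / log q
y1 = combine([(-1, mul(t, linear(1, 1, 1)))])          # c_1 = -1, factor log(q_hat^2 M k)
y2 = combine([(F(1, 3), mul(mul(t, t), linear(F(3, 2), 1, 2)))])  # c_2 = 1/3
y3 = {}                                                # c_3 = 0

# m21 = Pi(3,0,0) - 6 Pi(2,1,0) + 6 Pi(1,1,1) + 6 Pi(1,2,0) - 6 Pi(0,1,2) - 2 Pi(0,0,3)
integrand = combine([
    (1, mul(y0, y0)), (-6, mul(y1, y0)), (6, mul(y1, y1)),
    (6, mul(y2, y0)), (-6, mul(y1, y2)), (-2, mul(y0, y3)),
])
m21 = integrate_x_from_0_to_D(integrand)
print(m21)
```

The resulting dictionary maps each power of $\Delta$ to its coefficient in the $(\log q)^6$ term and can be compared with the polynomial stated for $M_{21}$ in Proposition 26.13.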

There remains to treat the quadratic form $M_{22}$ defined by (26.38). Recall that $T(n)$ is defined by (26.29). It follows that (26.43)

We use this to separate $m_1$ and $m_2$, which yields the expression

This is rewritten as

$$M_{22} = 2\big(\tilde\Pi(1,0,0) - \tilde\Pi(0,1,0) - \tilde\Pi(0,0,1)\big) - 4\sum_{k}\nu_1(k)\,y_kz_k \tag{26.44}$$

where

$$z_k^{(c)} = \sum_{m}\frac{T(m)}{m}(\log m)^{c}\,x_{km},\qquad z_k = z_k^{(0)},$$

and

$$\tilde\Pi(a,b,c) = (\log Q)^{a}\sum_{k}\nu(k)\,y_k^{(b)}z_k^{(c)}.$$

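As before we express the new variables in terms of $y_k$ by

$$z_k = 2\sum_{\ell\le M/k}\frac{\Lambda(\ell)\log\ell}{\ell}\,y_{k\ell}, \tag{26.45}$$

$$z_k^{(1)} = \sum_{\ell\le M/k}\frac{\tau(\ell)\Lambda(\ell)}{\ell}\,z_{k\ell} + \sum_{\ell\le M/k}\frac{T(\ell)\Lambda(\ell)}{\ell}\,y_{k\ell}. \tag{26.46}$$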

To see this, notice that by (26.40) we have

The Dirichlet generating series for the coefficient of $\ell$ is $L(s+1)$ where

It is easy to see that

so $L(s) = 2(\zeta'\zeta^{-1})'$, which gives (26.45). Then we derive (26.46) as follows

$$z_k^{(1)} = \sum_{\ell\le M/k}\frac{\tau(\ell)\Lambda(\ell)}{\ell}\,z_{k\ell} + \sum_{\ell\le M/k}\frac{T(\ell)\Lambda(\ell)}{\ell}\,y_{k\ell}$$

by (26.43). By (26.45) and (26.46) we can evaluate $z_k$ and $z_k^{(1)}$ much as before with $y_k^{(i)}$, and we get

From this we easily evaluate the quadratic forms that occur in (26.44):

$$\tilde\Pi(1,0,0) = -(\log Q)\sum_{k}\nu(k)\,y_ky_k^{(2)} + O((\log q)^5) = -\Pi(1,2,0) + O((\log q)^5),$$

$$\tilde\Pi(0,1,0) = -\sum_{k}\nu(k)\,y_k^{(1)}y_k^{(2)} + O((\log q)^5) = -\Pi(0,1,2) + O((\log q)^5),$$

$$\tilde\Pi(0,0,1) \ll (\log q)^5.$$

Moreover, by direct estimation

$$\sum_{k}\nu_1(k)\,y_kz_k \ll (\log q)^5(\log\log q)^3.$$

Now (26.44) yields:

PROPOSITION 26.14. Let $M = q^{\Delta}$. For $x_m$ as above, we have

From Theorem 26.12, Proposition 26.13 and Proposition 26.14, we can compute that for $\Delta < \tfrac14$,

$$\frac{M_1^2}{M_2} = \frac12\Big(1 - \frac{1}{(2\Delta+1)^3}\Big) + O\Big(\frac{(\log\log q)^4}{\log q}\Big)$$

(where the very simple fraction appears quite miraculously by partial fraction expansion). Inequality (26.2) then implies Theorem 26.9, and therefore Theorem 26.8.
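The "miraculous" simplification can be reproduced with exact arithmetic. The sketch below assumes the decomposition $M_2 = \tfrac1{12}M_{21} - \tfrac14 M_{22} + M_{23} + O(q^{-\delta})$ and works with leading coefficients of $(\log q)^6$ only. The polynomial used for $M_{22}$ is not quoted from the text: it is our own evaluation of $2(-\Pi(1,2,0) + \Pi(0,1,2))$ from (26.44) with the main terms of $y_k^{(i)}$, and should be read as an assumption of this example.

```python
from fractions import Fraction as F


def m1_lead(d):
    """Leading coefficient of M_1 / (log q)^3."""
    return d * (d * d / 3 + d / 2 + F(1, 4))


def m21_lead(d):
    """Leading coefficient of M_21 / (log q)^6."""
    return (F(4, 3) * d**6 + F(28, 5) * d**5 + F(33, 4) * d**4
            + F(35, 6) * d**3 + 2 * d**2 + d / 4)


def m22_lead(d):
    """ASSUMED leading coefficient of M_22 / (log q)^6 (our own evaluation)."""
    return -(F(4, 9) * d**6 + F(4, 5) * d**5 + F(7, 12) * d**4 + F(1, 6) * d**3)


def m2_lead(d):
    """Leading coefficient of M_2, using M_2 ~ M_21/12 - M_22/4."""
    return m21_lead(d) / 12 - m22_lead(d) / 4


def proportion(d):
    """The closed form (1/2) * (1 - (2d + 1)^(-3))."""
    return F(1, 2) * (1 - 1 / (2 * d + 1) ** 3)


samples = [F(1, 10), F(1, 5), F(1, 4), F(1, 3), F(1)]
ratio_ok = all(m1_lead(d) ** 2 / m2_lead(d) == proportion(d) for d in samples)
value_at_quarter = proportion(F(1, 4))
print(ratio_ok, value_at_quarter)
```

Under these assumptions the ratio of leading coefficients agrees exactly with the closed form at every sample point; at $\Delta = \tfrac14$, for instance, the main term equals $19/54$.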

EXERCISE. Show by similar methods that

$$\sum^{h}_{\substack{f\in S_2(q)^*\\ L(f,1/2)\ne0}} 1 \ \ge\ \tfrac16 - o(1)$$

for primes $q \to +\infty$.

26.5. Proof of Theorem 26.2.

Passing from Theorem 26.8 (non-vanishing in "harmonic" average) to Theorem 26.2 (non-vanishing in "natural" average) can be done in different ways. We use Shimura's formula

$$4\pi\|f\|^2 = \frac{|S_2(q)^*|}{\zeta(2)}\,L(\mathrm{Sym}^2 f, 1) + O((\log q)^3)\qquad(\text{for } q \text{ prime}) \tag{26.47}$$

relating the Petersson norm to the special value of the symmetric square $L$-function of $f$ at $s = 1$. The latter is given by

$$L(\mathrm{Sym}^2 f, s) = \zeta^{(q)}(2s)\sum_{n\ge1}\lambda_f(n^2)\,n^{-s} = \sum_{n\ge1}\rho_f(n)\,n^{-s}$$

for $\mathrm{Re}(s) > 1$, where $\zeta^{(q)}$ denotes the zeta function with the Euler factor at $q$ removed; see Section 5.12. At $s = 1$ the series barely fails to converge absolutely, but on average it can be represented by a very short sum. From [KM2], we quote a general useful result based on the large sieve type estimate for symmetric square $L$-functions (see Theorem 7.28).

PROPOSITION 26.15. Let $\alpha_f$, for $f \in S_2(q)^*$, be complex numbers such that

$$\alpha_f \ll q^{1-\delta}\quad\text{for some } \delta > 0, \tag{26.48}$$

$$\sum^{h}_{f}|\alpha_f| \ll (\log q)^{A}\quad\text{for some } A > 0. \tag{26.49}$$

Then we have

$$\sum_{f}\alpha_f = \frac{|S_2(q)^*|}{\zeta(2)}\sum^{h}_{f}\alpha_f\sum_{n\le x}\frac{\rho_f(n)}{n} + O(q^{1-\gamma})$$

with $x = q^{\kappa}$, where $\kappa$ is any positive number and $\gamma$ is a positive number which depends only on $\delta$ and $\kappa$. The implied constant depends only on $\kappa$, $\delta$ and $A$.

We use a mollifier $M(f)$ as in (26.3) of length $M = q^{\Delta}$ with $\Delta < \tfrac14$, defining

$$N_1 = \sum_{\varepsilon_f=-1} M(f)\,L'(f,\tfrac12),\qquad N_2 = \sum_{\varepsilon_f=-1}\big|M(f)\,L'(f,\tfrac12)\big|^2$$

and apply Proposition 26.15 with $\alpha_f = M(f)L'(f,\tfrac12)$ and $\alpha_f = |M(f)L'(f,\tfrac12)|^2$ respectively. The convexity bound $L'(f,\tfrac12) \ll q^{1/4}(\log q)^2$ (use (26.9) to prove it again) yields easily (26.48) and (26.49).

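It follows that

$$N_1 = \frac{|S_2(q)^*|}{\zeta(2)}\sum^{h}_{\varepsilon_f=-1}\ \sum_{\substack{d\ell^2\le x\\ (\ell,q)=1}}\frac{1}{d\ell^2}\,\lambda_f(d^2)\,M(f)\,L'(f,\tfrac12) + O(q^{1-\gamma}),$$

$$N_2 = \frac{|S_2(q)^*|}{\zeta(2)}\sum^{h}_{\varepsilon_f=-1}\ \sum_{\substack{d\ell^2\le x\\ (\ell,q)=1}}\frac{1}{d\ell^2}\,\lambda_f(d^2)\,\big|M(f)\,L'(f,\tfrac12)\big|^2 + O(q^{1-\gamma})$$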

with $x = q^{\kappa}$ where $\kappa > 0$ can be chosen arbitrarily. Inserting the formulas (26.9) and (26.13), and using results of Section 26.3, we are led to a linear form expression for $N_1$ and a quadratic form for $N_2$ which are very similar to those of Theorem 26.12. It is important to be able to take $\kappa$ arbitrarily small to retain an asymptotic formula. The same principle as in Section 26.4 applies to find the mollifier and evaluate $N_1$ and $N_2$, but there are many technical complications due to lack of complete multiplicativity of the coefficients involved.

REMARK. If one is willing to accept a slightly weaker result (26.50)

for $q$ prime, with no explicit constant (the analogue of Selberg's theorem on critical zeros of $\zeta(s)$), some shortcuts are available. First, there is no reason to optimize the mollifier as in Section 26.4, and the computations there can be simplified quite a bit. Then instead of appealing to Proposition 26.15 and performing the computation of $N_1$ and $N_2$ one can argue by means of Cauchy's inequality

$$\sum^{h}_{\substack{f\in S_2(q)^*\ \mathrm{odd}\\ L'(f,1/2)\ne0}} 1 \;=\; \sum_{\substack{f\in S_2(q)^*\ \mathrm{odd}\\ L'(f,1/2)\ne0}}\frac{1}{4\pi\|f\|^2} \;\le\; \Big(\sum_{f}\frac{1}{16\pi^2\|f\|^4}\Big)^{1/2}\Big(\sum_{\substack{f\in S_2(q)^*\ \mathrm{odd}\\ L'(f,1/2)\ne0}} 1\Big)^{1/2}.$$
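The Cauchy step converts a lower bound for a weighted count into a lower bound for the plain count: if the weights are $w_f$ and $S$ is the set being counted, then $|S| \ge (\sum_{f\in S} w_f)^2 / \sum_f w_f^2$. The following toy sketch illustrates the inequality with made-up data; the weights and the subset are hypothetical stand-ins for $1/(4\pi\|f\|^2)$ and for the forms with $L'(f,\tfrac12)\ne0$, not values computed from modular forms.

```python
import random

rng = random.Random(1)
n_forms = 1000

# Hypothetical positive weights, roughly of size 1/n_forms.
weights = [rng.uniform(0.5, 1.5) / n_forms for _ in range(n_forms)]

# Hypothetical subset of indices (the "non-vanishing" forms).
subset = [i for i in range(n_forms) if rng.random() < 0.4]

harmonic_mass = sum(weights[i] for i in subset)   # weighted count of the subset
second_moment = sum(w * w for w in weights)       # sum of squared weights over all forms
lower_bound = harmonic_mass ** 2 / second_moment  # Cauchy: lower_bound <= len(subset)
print(lower_bound, len(subset))
```

When the weights are nearly equal the bound loses little; the loss grows with the variance of the weights, which is why a second-moment estimate for the weights is what the argument requires.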

Then the asymptotic formula

$$\sum_{f}\frac{1}{16\pi^2\|f\|^4} \sim c\,q^{-1} \tag{26.51}$$

as $q \to +\infty$ for some constant $c > 0$ together with Theorem 26.8 imply (26.50). Formula (26.51) is a special case of a result of E. Royer [Roy] about moments of $L(\mathrm{Sym}^2 f, 1)$ over $f \in S_2(q)^*$. Note that by (26.47), this amounts to the negative second moment of $L(\mathrm{Sym}^2 f, 1)$. The proof of (26.51) uses a density theorem for zeros of families of $L$-functions to restrict the summation to $f$ satisfying a "quasi-Riemann Hypothesis".

REMARK. Besides being used to build mollifiers, as we have done, the twisted mean-values of type (26.5), (26.6) for families of $L$-functions have a powerful use for building an "amplifier", i.e. a sum of the type

$$A_k = \sum_{f\in\mathcal F} A(f)\,|L(f,\tfrac12)|^{k}$$

where $A(f) \ge 0$, say

$$A(f) = \Big|\sum_{\ell\le L} c_\ell\,\lambda_f(\ell)\Big|,$$

but this time $A(f)$ is meant to be large for $f = f_0 \in \mathcal F$, some fixed form, so that by positivity one deduces from an upper bound for the mean-value $A_k$ an individual upper bound (26.52)

In the best circumstances, the bound for $A_k$ can be achieved if $L$ is a small power of $|\mathcal F|$, and matches that from the Lindelöf Hypothesis and orthogonality

$$A_k \ll |\mathcal F|^{1+\varepsilon}\Big(\sum_{\ell}|c_\ell|^2\Big)^{1/2}$$

so that (26.52) is a gain if the mean-value of $c_\ell$ is comparable in size to the peak value $A(f_0)$. Think of the case of primitive Dirichlet characters of conductor $q$ where one can take $c_\ell = \chi(\ell)$ (see Proposition 14.22 for a suitable choice for classical modular forms). This amplification method has been the most successful for the proof of subconvexity bounds for $L$-functions of degree $\ge 2$, particularly in conductor aspect where other techniques (Weyl shifts, reducing the averaging family) are inoperative. The implementation is, however, quite delicate in all cases (requiring off-diagonal analysis) so we refer to papers like [DFI2], [DFI3], [KMV2] and the survey [M2] for more information on this important subject.

Bibliography

[AS1] A. Adolphson and S. Sperber, On twisted exponential sums, Math. Ann. 290 (1991), 713-726.
[AS2] A. Adolphson and S. Sperber, Character sums in finite fields, Compositio Math. 52 (1984), 325-354.
[AS3] A. Adolphson and S. Sperber, Newton polyhedra and the total degree of the L-function associated to an exponential sum, Invent. math. 88 (1987), 555-569.
[Ahl] L. Ahlfors, Complex Analysis, McGraw Hill, 1978.
[AM] M. Atiyah and I.G. MacDonald, Introduction to commutative algebra, Addison-Wesley, 1969.
[AL] A.O.L. Atkin and J. Lehner, Hecke operators on Γ_0(m), Math. Ann. 185 (1970), 134-160.
[At] F. V. Atkinson, The mean value of the zeta-function on the critical line, Proc. London Math. Soc. (2) 47 (1941), 174-200.
[B] A. Baker, Transcendental Number Theory, Cambridge Univ. Press, 1990.
[BH] R. Baker and G. Harman, The difference between consecutive primes, Proc. London Math. Soc. 72 (1996), 261-280.
[Ba] W. Banks, Twisted symmetric-square L-functions and the nonexistence of Siegel zeros on GL(3), Duke Math. J. 87 (1997), 343-353.
[Bar] M. B. Barban, The "large sieve" method and its application to number theory, Uspehi Mat. Nauk 21 (1966), 51-102; English transl. in Russian Math. Surveys 21 (1966), 49-103.
[BG] J. Bernstein and S. Gelbart, An introduction to the Langlands program, Birkhäuser, 2003.
[BL] H. Bohr and E. Landau, Sur les zéros de la fonction ζ(s) de Riemann, Compte Rendus de l'Acad. des Sciences (Paris) 158 (1914), 106-110.
[Bo1] E. Bombieri, Maggiorazione del resto nel "Primzahlsatz" col metodo di Erdős-Selberg, Ist. Lombardo Accad. Sci. Lett. Rend. A 96 (1962), 343-350.
[Bo2] E. Bombieri, Le grand crible dans la théorie analytique des nombres, S.M.F, 1974.
[Bo3] E. Bombieri, Counting points on curves over finite fields (d'après S. A. Stepanov), Lecture Notes in Math. 383 (1974), 234-241.
[Bo4] E. Bombieri, On exponential sums in finite fields, II, Invent. math. 47 (1978), 29-39.
[Bo5] E. Bombieri, On the large sieve, Mathematika 12 (1965), 201-225.
[BD] E. Bombieri and H. Davenport, Some inequalities involving trigonometrical polynomials, Ann. Scuola Norm. Sup. Pisa (3) 23 (1969), 223-241.
[BFI] E. Bombieri, J. Friedlander and H. Iwaniec, Primes in arithmetic progressions to large moduli, Acta Math. 156 (1986), 203-251.
[BI] E. Bombieri and H. Iwaniec, Some mean-value theorems for exponential sums, Ann. Scuola Norm. Sup. Pisa Cl. Sci. 13 (1986), 473-486.
[Bou] J. Bourgain, Remarks on Montgomery's conjectures on Dirichlet sums, Lecture Notes in Math. 1469 (1991), 153-165.
[BDCT] C. Breuil, B. Conrad, F. Diamond and R. Taylor, On the modularity of elliptic curves over Q: wild 3-adic exercises, J. Amer. Math. Soc. 14 (2001), 843-939.
[BC] W. E. Briggs and S. Chowla, On discriminants of binary quadratic forms with a single class in each genus, Canad. J. Math. 6 (1954), 463-470.
[Bru] R. W. Bruggeman, Fourier coefficients of cusp forms, Invent. math. 45 (1978), 1-18.
[Br1] V. Brun, Über das Goldbachsche Gesetz und die Anzahl der Primzahlpaare, Archiv for Math. og Naturvid. B34 (1915), no. 8.

[Br2] V. Brun, Le crible d'Eratosthène et le théorème de Goldbach, C. R. Acad. Sci. Paris 168 (1919), 544-546.
[Bu] D. Bump, Automorphic forms and representations, Cambridge Univ. Press, 1996.
[BDHI] D. Bump, W. Duke, J. Hoffstein and H. Iwaniec, An estimate for the Hecke eigenvalues of Maass forms, Internat. Math. Res. Notices 4 (1992), 75-81.
[Bur1] D. A. Burgess, On character sums and L-series, I, Proc. London Math. Soc. (3) 12 (1962), 193-206.
[Bur2] D. A. Burgess, On character sums and L-series, II, Proc. London Math. Soc. (3) 13 (1963), 524-536.
[BH] C. J. Bushnell and G. Henniart, An upper bound on conductors for pairs, J. Number Theory 65 (1997), 183-196.
[Byk] V. A. Bykovsky, Spectral expansion of certain automorphic functions and its number-theoretical applications, Proc. Steklov Inst. (LOMI) 134 (1984), 15-33; English transl. in J. Soviet Math. 36 (1987), 8-21.
[Car] F. Carlson, Über die Nullstellen der Dirichletschen Reihen und der Riemannschen ζ-Funktion, Arkiv. für Mat. Astr. och Fysik 15 (1920).
[CF] J.W.S. Cassels and A. Fröhlich, Algebraic Number Theory, Academic Press, 1990.
[ChI] F. Chamizo and H. Iwaniec, On the sphere problem, Rev. Mat. Iberoamericana 11 (1995), 417-429.
[Ch] J. R. Chen, On the representation of a larger even integer as the sum of a prime and the product of at most two primes, Sci. Sinica 16 (1973), 157-176.
[CS] S. Chowla and A. Selberg, On Epstein's zeta-function, J. Reine Angew. Math. 227 (1967), 86-110.
[Co1] B. Conrey, Zeros of derivatives of Riemann's ξ-function on the critical line, J. Number Theory 16 (1983), 49-74.
[Co2] B. Conrey, More than two fifths of the zeros of the Riemann zeta function are on the critical line, J. Reine Angew. Math. 399 (1989), 1-26.
[CGG] B. Conrey, A. Ghosh and S. M. Gonek, A note on gaps between zeros of the zeta function, Bull. London Math. Soc. 16 (1984), 421-424.
[CI1] B. Conrey and H. Iwaniec, The cubic moment of central values of automorphic L-functions, Ann. of Math. (2) 151 (2000), 1175-1216.
[CI2] B. Conrey and H. Iwaniec, Spacing of zeros of Hecke L-functions and the class number problem, Acta Arithmetica 103 (2002), 259-312.
[CoSo] B. Conrey and K. Soundararajan, Real zeros of quadratic Dirichlet L-functions, Invent. math. 150 (2002), 1-44.
[Cor1] J.G. van der Corput, Zahlentheoretische Abschätzungen, Math. Ann. 84 (1921), 53-79.
[Cor2] J.G. van der Corput, Verschärfung der Abschätzungen beim Teilerproblem, Math. Ann. 87 (1922), 39-65.
[Cor3] J. G. van der Corput, Sur l'hypothèse de Goldbach pour presque tous les nombres premiers, Acta Arithmetica 2 (1937), 266-290.
[Cox] D. Cox, Primes of the form x² + ny², Wiley, 1989.
[Cra] H. Cramér, On the order of magnitude of the difference between consecutive prime numbers, Prace Mat.-Fiz. 45 (1937), 51-74.
[Da1] H. Davenport, On some infinite series involving arithmetical functions, II, Quart. J. Math. Oxf. 8 (1937), 313-320.
[Da2] H. Davenport, Analytic methods for Diophantine equations and Diophantine inequalities, Ann Arbor Publishers, 1963.
[DH1] H. Davenport and H. Halberstam, The values of a trigonometrical polynomial at well spaced points, Mathematika 13 (1966), 91-96.
[DH2] H. Davenport and H. Halberstam, Primes in arithmetic progressions, Michigan Math. J. 13 (1966), 485-489.
[DHe] H. Davenport and Heilbronn, On the class-number of binary cubic forms, I, II, J. London Math. Soc. 26 (1951), 183-192, 192-198.
[De1] P. Deligne, La conjecture de Weil, I, Inst. Hautes Études Sci. Publ. Math. 43 (1972), 206-226.
[De2] P. Deligne, La conjecture de Weil, II, Inst. Hautes Études Sci. Publ. Math. 52 (1980), 137-252.

[De3] P. Deligne, Cohomologie étale, SGA 4½, Lecture Notes Math. 569, Springer Verlag, 1977.
[DeSe] P. Deligne and J-P. Serre, Formes modulaires de poids 1, Ann. Sci. École Norm. Sup. (4) 7 (1974), 507-530.
[DI] J.M. Deshouillers and H. Iwaniec, Kloosterman sums and Fourier coefficients of cusp forms, Invent. math. 70 (1982/83), 219-288.
[DS] H. Diamond and J. Steinig, An elementary proof of the prime number theorem with a remainder term, Invent. math. 11 (1970), 199-258.
[Dir] P.G. Dirichlet, Démonstration d'une propriété analogue à la loi de réciprocité qui existe entre deux nombres premiers quelconques, J. Reine Angew. Math. 9 (1832), 379-389.
[Du1] W. Duke, The dimension of the space of cusp forms of weight one, Internat. Math. Res. Notices (1995), 99-109.
[Du2] W. Duke, Hyperbolic distribution problems and half-integral weight Maass forms, Invent. math. 92 (1988), 73-90.
[Du3] W. Duke, The critical order of vanishing of automorphic L-functions with large level, Invent. math. 119 (1995), 165-174.
[Du4] W. Duke, Some problems in multidimensional analytic number theory, Acta Arithmetica 52 (1989), 203-228.
[Du5] W. Duke, Elliptic curves with no exceptional primes, C. R. Acad. Sci. Paris Sér. I Math. 325 (1997), 813-818.
[DFI1] W. Duke, J. Friedlander and H. Iwaniec, A quadratic divisor problem, Invent. math. 115 (1994), 209-217.
[DFI2] W. Duke, J. Friedlander and H. Iwaniec, Bounds for automorphic L-functions, II, Invent. math. 115 (1994), 219-239.
[DFI3] W. Duke, J. Friedlander and H. Iwaniec, The subconvexity problem for Artin L-functions, Invent. math. 149 (2002), 489-577.
[DFI4] W. Duke, J. Friedlander and H. Iwaniec, Bilinear forms with Kloosterman fractions, Invent. math. 128 (1997), 23-43.
[DFI5] W. Duke, J. Friedlander, H. Iwaniec, Equidistribution of roots of a quadratic congruence to prime moduli, Ann. of Math. (2) 141 (1995), 423-441.
[DuI1] W. Duke and H. Iwaniec, Bilinear forms in the Fourier coefficients of half-integral weight cusp forms and sums over primes, Math. Ann. 286 (1990), 783-802.
[DuI2] W. Duke and H. Iwaniec, Estimates for coefficients of L-functions, I, Automorphic forms and analytic number theory (Montreal, PQ, 1989), CRM, 1989, pp. 43-47.
[DuI3] W. Duke and H. Iwaniec, Estimates for coefficients of L-functions, II, Proceedings of the Amalfi Conference on Analytic Number Theory, Univ. Salerno, Salerno, 1992, pp. 71-82.
[DuI4] W. Duke and H. Iwaniec, Estimates for coefficients of L-functions, IV, Amer. J. Math. 116 (1994), 207-217.
[DK] W. Duke and E. Kowalski, A problem of Linnik for elliptic curves and mean-value estimates for automorphic representations, Invent. math. 139 (2000), 1-39.
[D] F. J. Dyson, Statistical theory of the energy levels of complex systems, I, J. Mathematical Phys. 3 (1962), 140-156.
[Ell] J. Ellenberg, Galois representations attached to Q-curves and the generalized Fermat equation A⁴ + B² = C^p, Amer. J. Math. (to appear).
[El] P.D.T.A. Elliott, On inequalities of large sieve type, Acta Arithmetica 18 (1971), 405-422.
[EK] P. Erdős and M. Kac, The Gaussian law of errors in the theory of additive number theoretic functions, Amer. J. Math. 62 (1940), 738-742.
[Est] T. Estermann, On Goldbach's Problem: Proof that Almost All Even Positive Integers are Sums of Two Primes, Proc. London Math. Soc. (2) 44 (1938), 307-314.
[Fon] J.M. Fontaine, Il n'y a pas de variétés abéliennes sur Z, Invent. math. 81 (1985), 515-538.
[FK] E. Fouvry and N. Katz, A general stratification theorem for exponential sums, and applications, J. Reine Angew. Math. 540 (2001), 115-166.
[FI] E. Fouvry and H. Iwaniec, Gaussian Primes, Acta Arithmetica 79 (1997), 249-287.
[FoM] E. Fouvry and P. Michel, Sur le changement de signe des sommes de Kloosterman (preprint).

[Fri] J. Friedlander, Primes in arithmetic progressions and related topics, Analytic Number Theory and Diophantine problems, Birkhäuser, 1987, pp. 125-134.
[FG] J. Friedlander and A. Granville, Limitations to the equi-distribution of primes, III, Compositio Math. 81 (1992), 19-32.
[FI1] J. Friedlander and H. Iwaniec, The polynomial X² + Y⁴ captures its primes, Ann. of Math. (2) 148 (1998), 945-1040.
[FI2] J. Friedlander and H. Iwaniec, Summation formulae for coefficients of L-functions (to appear).
[FI3] J. Friedlander and H. Iwaniec, A Note on Character Sums, Contemporary Math. 166 (1994), 295-299.
[FI4] J. Friedlander and H. Iwaniec, Incomplete Kloosterman sums and a divisor problem, Ann. of Math. 121 (1985), 319-350.
[Ga1] P.X. Gallagher, A large sieve density estimate near σ = 1, Invent. math. 11 (1970), 329-339.
[Ga2] P.X. Gallagher, Primes in progressions to prime-power modulus, Invent. math. 16 (1972), 191-201.
[Ga3] P. X. Gallagher, Bombieri's mean value theorem, Mathematika 15 (1968), 1-6.
[Ga4] P. X. Gallagher, The large sieve and probabilistic Galois theory, Proc. Sympos. Pure Math., Vol. XXIV, Amer. Math. Soc., 1973, pp. 91-101.
[GM] M. L. Gaudin and M. Mehta, On the density of eigenvalues of a random matrix, Nuclear Phys. 18 (1960), 420-427.
[Ge] F. Gerth III, Extension of conjectures of Cohen and Lenstra, Expositiones Math. 5 (1987), 181-184.
[GeJ] S. Gelbart and H. Jacquet, A relation between automorphic representations of GL(2) and GL(3), Ann. Sci. École Norm. Sup. (4) 11 (1978), 471-542.
[GJ] R. Godement and H. Jacquet, Zeta functions of simple algebras, Lecture Notes Math. 260, Springer Verlag, 1972.
[Go1] D. Goldfeld, A simple proof of Siegel's theorem, Proc. Nat. Acad. Sci. U.S.A. 71 (1974), 1055.
[Go2] D. Goldfeld, The class number of quadratic fields and the conjectures of Birch and Swinnerton-Dyer, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 3 (1976), 624-663.
[GS] D. Goldfeld and P. Sarnak, Sums of Kloosterman sums, Invent. math. 71 (1983), 243-250.
[GHL] D. Goldfeld, J. Hoffstein and D. Lieman, Annals of Math. (2) 140 (1994), 161-181.
[GHB] D. Goldston and D. R. Heath-Brown, A note on the differences between consecutive primes, Math. Ann. 266 (1984), 317-320.
[GR] I.S. Gradshteyn and I.M. Ryzhik, Table of integrals, series and products, 6th Edition, Academic Press, 2000.
[G1] S. Graham, An asymptotic estimate related to Selberg's sieve, J. Number Theory 10 (1978), 83-94.
[G2] S. Graham, On Linnik's constant, Acta Arithmetica 39 (1981), 163-179.
[GK] S.W. Graham and G. Kolesnik, van der Corput's method of exponential sums, Cambridge Univ. Press, 1991.
[GRi] S. W. Graham and C. J. Ringrose, Lower bounds for least quadratic nonresidues, Analytic number theory (Allerton Park, IL, 1989), Birkhäuser, 1990, pp. 269-309.
[Gra] A. Granville, Unexpected irregularities in the distribution of prime numbers, Proceedings of the International Congress of Mathematicians, Vol. 1, 2 (Zürich, 1994), Birkhäuser, Basel, 1995, pp. 388-399.
[Gr] G. Greaves, Sieves in number theory, Ergebnisse der Mathematik und ihrer Grenzgebiete (3), vol. 43, Springer Verlag, 2001.
[GZ1] B. Gross and D. Zagier, Heegner points and derivatives of L-series, Invent. math. 84 (1986), 225-320.
[GZ2] B. Gross and D. Zagier, Points de Heegner et dérivées de fonctions L, C. R. Acad. Sci. Paris Sér. I Math. 297 (1983), 85-87.
[HaTu] G. Halász and P. Turán, On the distribution of roots of Riemann zeta and allied functions, I, J. Number Theory 1 (1969), 121-137.
[Hal] H. Halberstam, On the distribution of additive number-theoretic functions, II, III, J. London Math. Soc. 31 (1956), 1-14, 14-27.

[HaRi] H. Halberstam and H. E. Richert, Sieve methods, Academic Press, 1974.
[HaLa] G. H. Hardy and E. Landau, The lattice points of a circle, Proc. Royal Soc. A 105 (1924), 244-258.
[HL1] G. H. Hardy and J. E. Littlewood, Some Problems of 'Partitio Numerorum.' III. On the Expression of a Number as a Sum of Primes, Acta Math. 44 (1922), 1-70.
[HL2] G. H. Hardy and J. E. Littlewood, The zeros of Riemann's zeta function on the critical line, Math. Z. 10 (1921), 283-317.
[HR] G.H. Hardy and S. Ramanujan, Asymptotic Formulae in Combinatory Analysis, Proc. London Math. Soc. 17 (1918), 75-115.
[HW] G.H. Hardy and E.M. Wright, An introduction to the theory of numbers, 5th edition, Oxford University Press, 1979.
[HT] M. Harris and R. Taylor, The geometry and cohomology of some simple Shimura varieties, Princeton Univ. Press, 2002.
[Ha] R. Hartshorne, Algebraic Geometry, Grad. Texts in Math. 52, Springer Verlag, 1977.
[HB1] D.R. Heath-Brown, A mean value estimate for real character sums, Acta Arithmetica 72 (1995), 235-275.
[HB2] D.R. Heath-Brown, An estimate for Heilbronn's exponential sum, Analytic number theory, Vol. 2 (Allerton Park, IL, 1995), Birkhäuser, 1995, pp. 451-463.
[HB3] D. R. Heath-Brown, Prime numbers in short intervals and a generalized Vaughan identity, Canad. J. Math. 34 (1982), 1365-1377.
[HB4] D. R. Heath-Brown, Zero-free regions for Dirichlet L-functions, and the least prime in an arithmetic progression, Proc. London Math. Soc. (3) 64 (1992), 265-338.
[HB5] D. R. Heath-Brown, Lattice points in the sphere, Number theory in progress, Vol. 2, de Gruyter, Berlin, 1999, pp. 883-892.
[HB6] D. R. Heath-Brown, Cubic forms in ten variables, Proc. London Math. Soc. (3) 47 (1983), 225-257.
[HBP] D.R. Heath-Brown and S.J. Patterson, The distribution of Kummer sums at prime arguments, J. Reine Angew. Math. 310 (1979), 111-130.
[Hec] E. Hecke, Über eine neue Art von Zetafunktionen, Math. Zeit. 6 (1920), 11-51.
[Hee] K. Heegner, Diophantische Analysis und Modulfunktionen, Math. Z. 56 (1952), 227-253.
[H] D. Hilbert, Beweis für die Darstellbarkeit der ganzen Zahlen durch eine feste Anzahl n-ter Potenzen (Waringsches Problem), Math. Annalen 67 (1909), 281-305.
[Hi1] A. Hildebrand, An asymptotic formula for the variance of an additive function, Math. Z. 183 (1983), 145-170.
[Hi2] A. Hildebrand, On the constant in the Pólya-Vinogradov inequality, Canad. Math. Bull. 31 (1988), 347-352.
[Hof] J. Hoffstein, On the Siegel-Tatuzawa theorem, Acta Arithmetica 38 (1980/81), 167-174.
[HL] J. Hoffstein and P. Lockhart, Coefficients of Maass forms and the Siegel zero, Annals of Math. 140 (1994), 161-181.
[Ho] G. Hoheisel, Primzahlprobleme in der Analysis, S.-B. Preuss. Akad. Wiss. Phys.-Math. Kl. (1930), 580-588.
[H1] Loo-Keng Hua, Introduction to number theory, Springer, 1982.
[H2] Loo Keng Hua, On Waring's problem, Quart. J. Math. Oxford 9 (1938), 199-202.
[Hub] H. Huber, Zur analytischen Theorie hyperbolischen Raumformen und Bewegungsgruppen, Math. Ann. 138 (1959), 1-26.
[Hu1] M. N. Huxley, The large sieve inequality for algebraic number fields, Mathematika 15 (1968), 178-187.
[Hu2] M. N. Huxley, Large values of Dirichlet polynomials, Acta Arithmetica 24 (1973), 329-346.
[Hu3] M. N. Huxley, On the differences between consecutive primes, Invent. math. 15 (1972), 164-170.
[Hu4] M. N. Huxley, Area, lattice points, and exponential sums, The Clarendon Press, 1996.
[HuW] M. N. Huxley and N. Watt, Exponential sums and the Riemann zeta function, Proc. London Math. Soc. 57 (1988), 1-24.
[Ing] A. E. Ingham, On the estimation of N(σ,T), Quart. J. Math. 11 (1940), 291-292.

[IR]

(Ivj [!1] [I2] [I3]

[!4] [!5] [I6] [!7] [I8] [I9] [!10] [Ill] [!12] [113] [I14]

[ILS] [IM] [ISl] [IS2] [JS] [JPS] [Jul] [Ju2] [Ju3] [Ju4] [Ju5] [Kl] [K2] [K3]

[K4] [K5]

BIBLIOGRAPHY

K. Ireland and M. Rosen, A Classical Introduction to M adem Number Theory, 2nd Edition, Grad. Texts in Math. 84, Springer-Verlag, 1990. A. lviC, The Riemann zeta-function, Theory and applications, Dover Publications, 2003. Iwaniec, Character sums and small eigenvalues for ro(p), Glasgow Math. J. 27 (1985), 99-116. H. Iwaniec, On zeros of Dirichlet's L series, Invent. math. 23 (1974), 97-104. H. Iwaniec, Spectral theory of automorphic functions and recent developments in analytic number theory, Proceedings of the ICM Berkeley 1986, Amer. Math. Soc., 1987, pp. 444-456. H. Iwaniec, Topics in Classical Automorphic Forms, A.M.S, 1997. H. Iwaniec, Introduction to the spectral theory of automorphic forms, 2nd edition, A.M.S and R.M.I, 2002. H. Iwaniec, Fourier coefficients of modular forms of half-integral weight, Invent. math. 87 (1987), 385-401. H. Iwaniec, The spectral growth of automorphic £-functions, J. Reine Angew. Math. 428 (1992), 139-159. H. Iwaniec, The half-dimensional sieve, Acta Arithmetica 29 (1976), 69-95. H. Iwaniec, A new form of the error term in the linear sieve, Acta Arithmetica 37 (1980), 307-320. H. Iwaniec, Almost primes represented by quadmtic polynomials, Invent. math. 47 (1978), 171-188. H. Iwaniec, Small eigenvalues of Laplacian for ro(N), Acta Arithmetica 56 (1990), 65-82. H. Iwaniec, Prime geodesic theorem, Journal Reine Angew. Math. 349 (1984), 136-159. H. Iwaniec, Nonholomorphic modular forms and their applications, Modular forms (Durham, 1983), Horwood, 1984, pp. 157-196. H. Iwaniec, The lowest eigenvalue for congruence groups, Topics in geometry, Birkhiiuser, 1996, pp. 203-212. H. Iwaniec, W. Luo and P. Sarnak, Low-lying zeros of families of £-functions, Inst. Hautes Etudes Sci. Pub!. Math. 91 (2001), 55-131. H. lwaniec and P. Michel, The second moment of the symmetric square £-functions, Ann. Acad. Sci. Fenn. Math. 26 (2001), 465-482. H. Iwaniec and P. 
Sarnak, Perspectives on the analytic theory of £-functions, GAFA Special Volume GAFA2000 (2000), 705-741. H. Iwaniec and P. Sarnak, The non-vanishing of central values of automorphic £functions and Landau-Siegel zeros, Israel J. Math. 120 (2000), 155-177. H. Jacquet and J. Shalika, On Euler products and the classification of automorphic representations, I and II, Amer. J. Math. 103 (1981), 499-588, 777-815. H. Jacquet, I. Piatetskii-Shapiro and J. Shalika, Rankin-Selberg convolutions, Amer. J. Math. 105 (1983), 367-464. M. Jutila, Lectures on a method in the theory of exponential sums, Springer Verlag, 1987. M. Jutila, On character sums and class numbers, .J. Number Theory 5 (1973), 203-214. M. Jutila, On large values of Dirichlet polynomials, Topics in number theory (Proc. Colloq., Debrecen, 1974), North-Holland, 1976, pp. 129-140. M. Jutila, Zero-density estimates for £-functions, Acta Arithmetica 32 (1977), 55-62. M. Jutila, Statistical Deuring-Heilbronn phenomenon, Acta Arithmetica 37 (1980), 221-231. N. Katz, Twisted £-functions and monodromy, Princeton Univ. Press, 2002. N. Katz, Exponential sums over finite fields and differential equations over the complex numbers: some interactions, Bull. Amer. Math. Soc. (N.S.) 23 (1990), 269-309. N. Katz, Gauss sums, Kloosterman sums and monodromy groups, Princeton Univ. Press, 1988. N. Katz, Sommes exponentielles, S.M.F, 1980. N. Katz, Sums of Betti numbers in arbitrary characteristic, Finite Fields Appl. 7 (2001), 29-44.


[KS1] N. Katz and P. Sarnak, Random matrices, Frobenius eigenvalues, and monodromy, A.M.S, 1999.
[KS2] N. Katz and P. Sarnak, Zeroes of zeta functions and symmetry, Bull. Amer. Math. Soc. 36 (1999), 1-26.
[Kh] A. Y. Khinchine, Three pearls of number theory, Dover Publications, 1998.
[KSa] H. Kim and P. Sarnak, Refined estimates towards the Ramanujan and Selberg conjectures, J. Amer. Math. Soc. 16 (2003), 175-181.
[KSh] H. Kim and F. Shahidi, Cuspidality of symmetric powers with applications, Duke Math. J. 112 (2002), 177-197.
[Klo] H.D. Kloosterman, On the representation of numbers in the form ax² + by² + cz² + dt², Acta Math. 49 (1926), 407-464.
[Ko1] E. Kowalski, Analytic problems for elliptic curves (2001) (preprint).
[Ko2] E. Kowalski, On the "reducibility" of arctangents of integers, Amer. Math. Monthly 111 (2004), 351-354.
[KM1] E. Kowalski and P. Michel, A lower bound for the rank of J₀(q), Acta Arithmetica 94 (2000), 303-343.
[KM2] E. Kowalski and P. Michel, The analytic rank of J₀(q) and zeros of automorphic L-function, Duke Math. J. 100 (1999), 503-542.
[KMV1] E. Kowalski, P. Michel and J. VanderKam, Mollification of the fourth moment of automorphic L-functions and arithmetic applications, Invent. math. 142 (2000), 95-151.
[KMV2] E. Kowalski, P. Michel and J. VanderKam, Rankin-Selberg L-functions in the level aspect, Duke Math. J. 114 (2002), 123-191.
[KMV3] E. Kowalski, P. Michel and J. VanderKam, Non-vanishing of high derivatives of automorphic L-functions at the center of the critical strip, J. für die Reine und Angew. Math. 526 (2000), 1-34.
[Kor] N. M. Korobov, Estimates of trigonometric sums and their applications, Uspehi Mat. Nauk. 13 (1958), 185-192.
[Kub1] J. Kubilius, Probability methods in number theory, Usp. Mat. Nauk. 68 (1956), 31-66.
[Kub2] J. Kubilius, Sharpening of the estimate of the second central moment for additive arithmetical functions, Litovsk. Mat. Sb. 25 (1985), 104-110.
[Kuz] N. V. Kuznetsov, The Petersson conjecture for cusp forms of weight zero and the Linnik conjecture, Mat. Sb. (N.S.) 111 (1980), 334-383; Math. USSR-Sb 39 (1981), 299-342.
[LO] J. Lagarias and A. Odlyzko, Effective versions of the Chebotarev Density Theorem, Algebraic number fields: L-functions and Galois properties (Proc. Sympos., Univ. Durham, Durham, 1975), Academic Press, 1977, pp. 409-464.
[La1] E. Landau, Über die Einteilung der positiven ganzen Zahlen in vier Klassen nach der Mindestzahl der zu ihrer additiven Zusammensetzung erforderlichen Quadrate, Arch. der Math. u. Phys. (3) 13 (1908), 305-312.
[La2] E. Landau, Bemerkungen zum Heilbronnschen Satz, Acta Arithmetica 1 (1936), 1-18.
[La3] E. Landau, Über die Nullstellen der Dirichletschen Reihen und der Riemannschen ζ-Funktion, Arkiv. für Mat. Astr. och Fysik 16 (1921).
[La] S. Lang, Algebraic Number Theory, 2nd edition, Grad. Texts in Math. 110, Springer-Verlag, 1994.
[LT] S. Lang and H. Trotter, Frobenius distribution in GL₂ extensions, Lecture Notes in Math. 504, Springer Verlag, 1976.
[Lau] G. Laumon, Exponential sums and ℓ-adic cohomology: a survey, Israel J. Math. 120 (2000), 225-257.
[Lav] A. F. Lavrik, Approximate functional equations of Dirichlet functions, Izv. Akad. Nauk SSSR Ser. Mat. 32 (1968), 134-185.
[Leb] N. N. Lebedev, Special functions and their applications, Dover Publications, 1972.
[Lev] N. Levinson, More than one-third of the zeros of the Riemann zeta-function are on σ = 1/2, Adv. Math. 13 (1974), 383-436.
[L] W. Li, L-series of Rankin type and their functional equations, Math. Ann. 244 (1979), 135-166.
[Li1] Yu. V. Linnik, The large sieve, Dokl. Akad. Nauk SSSR 30 (1941), 292-294. (in Russian)


[Li2] Yu. V. Linnik, The dispersion method in binary additive problems, AMS, 1963.
[Li3] Yu. V. Linnik, Additive problems and eigenvalues of the modular operators, Proc. Internat. Congr. Mathematicians (Stockholm, 1962), pp. 270-284.
[Li4] Yu. V. Linnik, On the least prime in an arithmetic progression, I. The basic theorem, Rec. Math. [Mat. Sbornik] N.S. 15(57) (1944), 139-178.
[Li5] Yu. V. Linnik, On the least prime in an arithmetic progression, II. The Deuring-Heilbronn phenomenon, Rec. Math. [Mat. Sbornik] N.S. 15(57) (1944), 347-368.
[Li6] Yu. V. Linnik, On Dirichlet's L-series and prime-number sums, Rec. Math. [Mat. Sbornik] N.S. 15(57) (1944), 3-12.
[Li7] Yu. V. Linnik, New versions and new uses of the dispersion methods in binary additive problems, Dokl. Akad. Nauk SSSR 137 (1961), 1299-1302.
[LR] J. H. van Lint and H.-E. Richert, On primes in arithmetic progressions, Acta Arithmetica 11 (1965), 209-216.
[Lit] J. E. Littlewood, On the zeros of the Riemann Zeta-function, Cambridge Phil. Soc. Proc. 22 (1924), 295-318.
[LRW] J. van de Lune, H.J.J te Riele, D.T. Winter, On the zeros of the Riemann zeta function in the critical strip, IV, Math. Comp. 174 (1986), 667-681.
[Luo] W. Luo, Nonvanishing of L-values and the Weyl law, Ann. of Math. (2) 154 (2001), 477-502.
[LRS] W. Luo, Z. Rudnick and P. Sarnak, On Selberg's eigenvalue conjecture, Geom. Funct. Anal. 5 (1995), 387-401.
[Ma] H. Maass, Über eine neue Art von nichtanalytischen automorphen Funktionen und die Bestimmung Dirichletscher Reihen durch Funktionalgleichungen, Math. Ann. 121 (1949), 141-183.
[Mai] H. Maier, Primes in short intervals, Michigan Math. J. 32 (1985), 221-225.
[Mar] G. Margulis, Discrete Subgroups of Semisimple Lie Groups, Ergebnisse der Math. und ihrer Grenzgebiete 68, Springer Verlag, 1991.
[MSD] B. Mazur and P. Swinnerton-Dyer, Arithmetic of Weil curves, Invent. math. 25 (1974), 1-61.
[Mes] J-F. Mestre, Formules explicites et minorations de conducteurs de variétés algébriques, Compositio Math. 58 (1986), 209-232.
[M1] P. Michel, The subconvexity problem for Rankin-Selberg L-functions and equidistribution of Heegner points, Annals. of Math. (to appear).
[M2] P. Michel, Analytic number theory and families of automorphic L-functions (to appear).
[M3] P. Michel, Autour de la conjecture de Sato-Tate pour les sommes de Kloosterman, I, Invent. Math. 121 (1995), 61-78.
[MvdK] P. Michel and J. VanderKam, Non-vanishing of high derivatives of Dirichlet L-functions at the central point, J. Number Theory 81 (2000), 130-148.
[Mi] T. Miyake, Modular forms, Springer Verlag, 1989.
[MW] C. Moeglin and J-L. Waldspurger, Pôles des fonctions L de paires pour GL(N), app. to Le spectre résiduel de GL(N), Ann. Sci. ENS (4ème série) 22 (1989), 605-674.
[Mol] G. Molteni, Upper and lower bounds at s = 1 for certain Dirichlet series with Euler product, Duke Math. J. 111 (2002), 133-158.
[Mo1] H.L. Montgomery, The analytic principle of the large sieve, Bull. Amer. Math. Soc. 84 (1978), 547-567.
[Mo2] H.L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math. 227, Springer Verlag, 1971.
[Mo3] H. Montgomery, Zeros of L-functions, Invent. math. 8 (1969), 346-354.
[Mo4] H. L. Montgomery, The pair correlation of zeros of the zeta function, Analytic number theory (Proc. Sympos. Pure Math., Vol. XXIV), Amer. Math. Soc., 1972, pp. 181-193.
[MV1] H. L. Montgomery and R. C. Vaughan, The large sieve, Mathematika 20 (1973), 119-134.
[MV2] H.L. Montgomery and R.C. Vaughan, Hilbert's inequality, J. London Math. Soc. (2) 8 (1974), 73-82.
[MV3] H.L. Montgomery and R.C. Vaughan, The exceptional set in Goldbach's problem, Acta Arithmetica 27 (1975), 353-370.


[Mor1] C. Moreno, Prime number theorems for the coefficients of modular forms, Bull. Amer. Math. Soc. 78 (1972), 796-798.
[Mor2] C. Moreno, Algebraic curves over finite fields, Cambridge Univ. Press, 1991.
[Mor3] C. Moreno, Analytic proof of the strong multiplicity one theorem, Amer. J. Math. 107 (1985), 163-206.
[Mot1] Y. Motohashi, Spectral theory of the Riemann zeta-function, Cambridge Univ. Press, 1997.
[Mot2] Y. Motohashi, An induction principle for the generalization of Bombieri's prime number theorem, Proc. Japan Acad. 52 (1976), 273-275.
[Nak] H. Nakazato, Heegner points on modular elliptic curves, Proc. Japan Acad. Ser. A Math. Sci. 72 (1996), 223-225.
[Ne] J. Nekovář, On the parity of ranks of Selmer groups, II, C. R. Acad. Sci. Paris Sér. I Math. 332 (2001), 99-104.
[Od] A. M. Odlyzko, Some analytic estimates of class numbers and discriminants, Invent. math. 29 (1975), 275-286.
[Oe] J. Oesterlé, Nombres de classes des corps quadratiques imaginaires, Astérisque 121-122 (1985), 309-323.
[Ono] K. Ono, Nonvanishing of quadratic twists of modular L-functions and applications to elliptic curves, J. Reine Angew. Math. 533 (2001), 81-97.
[PP] A. Perelli and J. Pomykala, Averages over twisted elliptic L-functions, Acta Arithmetica 80 (1997), 149-163.
[PSa] Y. Petridis and P. Sarnak, Quantum unique ergodicity for SL₂(O)\ℍ³ and estimates for L-functions, Journal of Evolution Equations 1 (2001), 277-290.
[Ph] E. Phillips, The zeta-function of Riemann; further developments of van der Corput's method, Quart. J. Math. 4 (1933), 209-225.
[PS] R. Phillips and P. Sarnak, On cusp forms for co-finite subgroups of PSL(2, ℝ), Invent. math. 80 (1985), 339-364.
[Pi] N. Pitt, On shifted convolution of ζ³(s) with automorphic L-functions, Duke Math. J. 77 (1995), 383-406.
[Poi] G. Poitou, Sur les petits discriminants, Séminaire Delange-Pisot-Poitou, 18e année (1976-77).
[Pol] G. Pólya, Über die Verteilung der quadratischen Reste und Nichtreste, Nachr. Königl. Gesell. Wissensch. Göttingen, Math.-phys. Klasse (1918), 21-29.
[Pos] A. G. Postnikov, On Dirichlet L-series with the character modulus equal to the power of a prime number, J. Indian Math. Soc. (N.S.) 20 (1956), 217-226.
[PR] A. G. Postnikov and N. P. Romanov, A simplification of A. Selberg's elementary proof of the asymptotic law of distribution of prime numbers, Uspehi Mat. Nauk (N.S.) 10 (1955), 75-87.
[R] H. Rademacher, On the Partition Function p(n), Proc. London Math. Soc. 43 (1937), 241-254.
[Ra] D. Ramakrishnan, Modularity of the Rankin-Selberg L-series, and multiplicity one for SL(2), Annals of Math. (2) 152 (2000), 45-111.
[Ra1] R. A. Rankin, Van der Corput's method and the theory of exponent pairs, Quart. J. Math. (2) 6 (1955), 147-153.
[Ra2] R. A. Rankin, The difference between consecutive prime numbers, V, Proc. Edinburgh Math. Soc. (2) 13 (1962/1963), 331-332.
[Ra3] R. A. Rankin, Contributions to the theory of Ramanujan's τ function and similar arithmetical functions, II, Proc. Camb. Phil. Soc. 35 (1939), 351-372.
[Re] A. Rényi, On the representation of an even number as the sum of a single prime and single almost-prime number, Izvestiya Akad. Nauk SSSR. Ser. Mat. 12 (1948), 57-78.
[Rey] E. Reyssat, Quelques aspects des surfaces de Riemann, Progress in Math. 77, Birkhäuser, 1989.
[Rie] B. Riemann, Über die Anzahl der Primzahlen unter einer gegebenen Grösse, Monatsber. Berlin. Akad. (1859), 671-680.
[RV] F. Rodriguez-Villegas, Square root formulas for central values of Hecke L-series, II, Duke Math. J. 72 (1993), 431-440.
[Rot] K.F. Roth, On the large sieves of Linnik and Rényi, Mathematika 12 (1965), 1-9.


[Roy] E. Royer, Statistique de la variable aléatoire L(Sym² f, 1), Math. Ann. 321 (2001), 667-687.
[Ru] W. Rudin, Real and complex analysis, McGraw-Hill, 1987.
[RS] Z. Rudnick and P. Sarnak, Zeros of principal L-functions and random matrix theory, Duke Math. J. 81 (1996), 269-322.
[Sal] H. Salié, Über die Kloostermanschen Summen S(u, v, q), Math. Z. 34 (1931), 91-109.
[Sa1] P. Sarnak, Estimates for Rankin-Selberg L-Functions and Quantum Unique Ergodicity, J. Funct. Anal. 184 (2001), 419-453.
[Sa2] P. Sarnak, Class Numbers of indefinite binary quadratic forms, Journal of Number Theory 15 (1982), 229-247.
[Sa3] P. Sarnak, Some applications of modular forms, Cambridge Univ. Press, 1990.
[Sa4] P. Sarnak, Arithmetic quantum chaos, Bar-Ilan Univ., 1995.
[Sch] W. Schmidt, Equations over finite fields. An elementary approach, Lecture Notes in Math. 536, Springer Verlag, 1976.
[ST] D. B. Sears and E. C. Titchmarsh, Some eigenfunction formulae, Quart. J. Math. Oxford 1 (1950), 165-175.
[S1] A. Selberg, The general sieve method and its place in prime number theory, Proc. ICM, vol. 1, Cambridge, MA., 1950, pp. 286-292.
[S2] A. Selberg, On the estimation of Fourier coefficients of modular forms, Proc. Sympos. Pure Math., Vol. VIII, Amer. Math. Soc., 1965, pp. 1-15.
[S3] A. Selberg, Lectures on sieves, Collected Papers Vol. II, Springer Verlag, 1991, pp. 66-247.
[S4] A. Selberg, On the zeros of Riemann's zeta-function, Skr. Norske Vid. Akad. Oslo I (1942).
[S5] A. Selberg, Bemerkungen über eine Dirichletsche Reihe, die mit der Theorie der Modulformen nahe verbunden ist, Arch. Math. Naturvid. 43 (1940), 47-50.
[S6] A. Selberg, Harmonic analysis and discontinuous groups in weakly symmetric Riemannian spaces with applications to Dirichlet series, J. Indian Math. Soc. (N.S.) 20 (1956), 47-87.
[S7] A. Selberg, On the estimation of Fourier coefficients of modular forms, Proc. Symp. Pure Math. 8, 1965, pp. 1-8.
[S8] A. Selberg, On the zeros of the zeta function of Riemann, Der Kong. Norske Vidensk. Selsk. Forhand. 15 (1942), 59-62.
[Se1] J-P. Serre, Cours d'arithmétique, 2nd edition, P.U.F, 1977.
[Se2] J-P. Serre, Modular forms of weight one and Galois representations, Algebraic number fields: L-functions and Galois properties (Proc. Sympos., Univ. Durham, Durham, 1975), Academic Press, 1977, pp. 193-268.
[Se3] J-P. Serre, Congruences et formes modulaires (d'après H. P. F. Swinnerton-Dyer), Séminaire Bourbaki, 24e année (1971/1972), Exp. No. 416, Lecture Notes in Math. 317, Springer Verlag, 1973, pp. 319-338.
[Se4] J-P. Serre, Corps locaux, Hermann, 1968.
[Se5] J-P. Serre, Représentations linéaires des corps finis, Hermann, 1971.
[Se6] J-P. Serre, Quelques applications du théorème de densité de Chebotarev, Publ. Math. IHES 54 (1981), 123-201.
[Se7] J-P. Serre, Minorations de discriminants, Oeuvres, Vol. III, Springer-Verlag, 1986, pp. 240-243.
[Se8] J-P. Serre, Letter to J.M. Deshouillers.
[Shah] F. Shahidi, Symmetric power L-functions for GL(2), Elliptic curves and related topics, CRM Proc. Lecture Notes 4, Amer. Math. Soc., 1994, pp. 159-182.
[Sha] D. Shanks, Class number, a theory of factorization, and genera, 1969 Number Theory Institute (Proc. Sympos. Pure Math., Vol. XX), Amer. Math. Soc., 1971, pp. 415-440.
[Sh1] G. Shimura, On modular forms of half-integral weight, Annals of Math. 97 (1973), 440-481.
[Sh2] G. Shimura, On the holomorphy of certain Dirichlet series, Proc. London Math. Soc. (3) 31 (1975), 79-98.
[Sh3] G. Shimura, Introduction to the arithmetic theory of automorphic functions, Princeton Univ. Press, 1971.


[Sie1] C. L. Siegel, Über die Classenzahl quadratischer Zahlkörper, Acta Arithmetica 1 (1936), 83-86.
[Sie2] C.L. Siegel, On the theory of indefinite quadratic forms, Ann. of Math. 45 (1944), 577-622.
[Sie3] C. L. Siegel, Lectures on quadratic forms, Tata Institute, 1967.
[Sil] J. Silverman, The arithmetic of elliptic curves, Grad. Texts in Math. 106, Springer-Verlag, 1986.
[Sou] K. Soundararajan, Nonvanishing of quadratic Dirichlet L-functions at s = 1/2, Ann. of Math. (2) 152 (2000), 447-488.
[St1] H. Stark, A complete determination of the complex quadratic fields with class-number one, Michigan Math. J. 14 (1967), 1-27.
[St2] H. Stark, A transcendence theorem for class-number problems. II, Annals of Math. (2) 96 (1972), 174-209.
[St3] H. Stark, Some effective cases of the Brauer-Siegel theorem, Invent. math. 23 (1974), 135-152.
[Ste] S. A. Stepanov, The number of points of a hyperelliptic curve over a finite prime field, Izv. Akad. Nauk SSSR Ser. Mat. 33 (1969), 1171-1181.
[Ta1] J. Tate, Fourier analysis in number fields and Hecke's zeta functions, Algebraic Number Theory, Academic Press, 1990, pp. 305-347.
[Tat] J. Tate, Number theoretic preliminaries, Proceedings of Symposia in Pure Math. 33, vol 2, A.M.S, 1979, pp. 3-26.
[TW] R. Taylor and A. Wiles, Ring-theoretic properties of certain Hecke algebras, Ann. of Math. (2) 141 (1995), 553-572.
[Tchu] N. G. Tchudakov, Sur le problème de Goldbach, C. R. (Dokl.) Acad. Sci. URSS, n. Ser. 17 (1937), 335-338.
[Th] A. Thue, Über Annäherungswerte algebraischer Zahlen, J. Reine Angew. Math. 135 (1909), 284-305.
[T1] E.C. Titchmarsh, The theory of functions, 2nd Edition, Oxford Univ. Press, 1939.
[T2] E. C. Titchmarsh, The theory of the Riemann zeta-function, 2nd edition, Oxford Univ. Press, 1986.
[T3] E. C. Titchmarsh, Eigenfunction expansions associated with second-order differential equations, Clarendon Press, 1962.
[Tot] A. Toth, Roots of quadratic congruences, Internat. Math. Res. Notices (2000), 719-739.
[Tu1] P. Turán, Über einige Verallgemeinerungen eines Satzes von Hardy und Ramanujan, J. Lond. Math. Soc. 11 (1936), 125-133.
[Tu2] P. Tur
[vdK] [VdP] [Va] [V1] [V2] [V3] [V4] [V5] [V6] [Vi] [Vor] [W]

[Wa] T. Watson, Rankin triple products and quantum chaos, PhD thesis, Princeton University, 2001.
[We1] A. Weil, On some exponential sums, Proc. Nat. Acad. Sci. U.S.A. 34 (1948), 204-207.
[We2] A. Weil, Über die Bestimmung Dirichletscher Reihen durch Funktionalgleichungen, Math. Ann. 168 (1967), 149-156.
[W1] H. Weyl, Über die Gleichverteilung von Zahlen mod. Eins, Math. Ann. 77 (1916), 313-352.
[W2] H. Weyl, Zur Abschätzung von ζ(1 + ti), Math. Zeit. 10 (1921), 88-101.
[W] A. Wiles, Modular elliptic curves and Fermat's last theorem, Ann. of Math. (2) 141 (1995), 443-551.
[Wi1] E. Wirsing, Das asymptotische Verhalten von Summen über multiplikative Funktionen, II, Acta Math. Acad. Sci. Hungar. 18 (1967), 411-467.
[Wi2] E. Wirsing, Elementare Beweise des Primzahlsatzes mit Restglied, II, J. Reine Angew. Math. 214/215 (1964), 1-18.
[Wi3] E. Wirsing, Growth and differences of additive arithmetic functions, Topics in classical number theory, Vol. I, II (Budapest, 1981), Colloq. Math. Soc. János Bolyai, 34, North-Holland, Amsterdam, 1984, pp. 1651-1661.

Index

A-process, 204, 211, 215
B-process, 204, 211, 216
L-functions of varieties, 96
Λ²-sieve, 160, 430
ℓ-adic cohomology, 269
ℓ-adic cohomology groups with compact support, 304
ℓ-adic sheaf, 310
j-invariant, 364

Abel-Jacobi Theorem, 299, 543
Abelian variety, 145, 146
Absolute logarithmic height, 541
Absolute values, 541
Additive binary problem, 468
Additive character, 44, 175, 181, 271, 467, 475
Additive function, 9, 28
Additive reduction, 365
Adjoint square, 137, 373
Algebraic curves, 145
Algebraic surface, 310
Almost primes, 159
Ambiguous classes, 509
Ambiguous form, 507
Amplification method, 379, 597
Amplifier, 596
Analytic L-function, 141, 143, 149
Analytic conductor, 95
Approximate functional equation, 97, 257
Arithmetic etale fundamental group, 300
Artin L-functions, 96, 126, 141, 356
Artin-Schreier covering, 302
Automorphic L-functions, 84, 93, 375
Automorphic form, 61, 93, 131
Auxiliary polynomial, 282
Averaging technique, 357, 387
Bad reduction, 364
Bernoulli numbers, 67
Bernoulli polynomial, 66, 484
Bessel function, 72, 73, 90, 91, 245, 258, 358, 386, 408, 411, 454, 512
Betti numbers, 305, 307
Bilinear form, 169, 174, 186, 218, 326, 340, 343, 346, 350, 420
Binary quadratic forms, 498, 503
Birch and Swinnerton-Dyer Conjecture, 148, 367, 520, 538, 578, 579
Brun-Titchmarsh inequality, 167, 420
Buchstab identity, 349
Burgess' bound, 536
Canonical height, 541
Cauchy's inequality, 162, 169, 220
Central character, 540
Central critical point, 577
Central value, 514, 529
Character sum, 74, 101, 365, 422, 429
Characters, 43, 487
Chebotarev Density Theorem, 143, 489
Circle method, 171, 315, 443
Class group, 58, 510
Class group character, 59, 131, 134, 361, 369, 516
Class number, 58, 125, 368, 402, 504, 523, 537, 569
Class Number Formula, 38, 124, 402, 513, 516
Class number one problem, 124, 577
Closed point, 297
Cohen-Lenstra heuristics, 510
Combinatorial sieve, 164
Companion sums, 273, 278
Complete L-function, 94
Complete exponential sum, 458
Complete family, 573
Complete sums, 269, 318
Completing technique, 318
Complex multiplication, 367, 369
Conductor, 45, 60, 93, 94, 109, 130, 149, 190, 247, 365, 539, 572, 579, 597
Conductor exponent, 365
Congruence subgroup, 354
Conjugacy classes, 394
Constant term in the Fourier expansion, 390
Continuous spectrum, 75
Converse theorems, 377


Convexity bound, 101, 119, 137, 244, 535, 549, 553, 595
Critical line, 113, 547, 562
Critical strip, 96, 101, 105, 113, 145
Critical zero, 547, 563
Cubic Gauss sum, 491
Cusp, 355, 387
Cusp form, 61, 65, 83, 136, 186, 356, 368, 479, 574
Dedekind eta function, 450, 516, 544
Dedekind multiplier system, 450
Dedekind sum, 450, 456, 516
Dedekind zeta function, 125, 513
Degree, 94
Deligne bound, 291, 357, 379, 493, 532
Density conjecture, 232, 249, 265
Density theorem, 420, 547, 596
Determinant equation, 383
Deuring-Heilbronn phenomenon, 428
Difference between consecutive primes, 266
Differencing process, 201, 204, 211, 212, 220, 330, 466
Diophantine approximation, 282
Dirichlet L-functions, 45, 83, 84, 182, 267, 324, 329, 368, 388, 479, 513, 534, 573, 579
Dirichlet approximation theorem, 199, 457
Dirichlet character, 45, 179
Dirichlet convolution, 12, 590
Dirichlet divisor problem, 21, 79, 198, 215
Dirichlet polynomial, 194, 229, 253, 429, 549, 550, 554
Dirichlet series, 11
Discrete subgroup, 354
Discrete valuation, 292, 301
Discriminant, 363
Dispersion method, 171
Divisor function, 13, 74, 86, 334
Divisor on a curve, 293
Dual sum, 170, 468
Duality principle, 170, 174, 184, 189, 234
Effective divisor, 293, 298
Eichler-Shimura theory, 367
Eisenstein series, 80, 357, 360, 369, 388, 411, 479, 491, 511, 588
Elliptic curve, 145, 190, 269, 313, 363, 403, 520, 529, 574
Elliptic differential operator, 386
Elliptic functions, 543
Elliptic motion, 395
Epstein zeta function, 513
Equidistribution, 130, 137, 198, 313, 352, 487
Etale covering, 300
Euler characteristic, 307
Euler Idoneal Number Problem, 520


Euler Pentagonal Numbers Theorem, 449, 456
Euler product, 11, 41, 45, 60, 94, 113, 146, 297, 366, 371, 493, 521, 529, 551, 564
Euler-Maclaurin formula, 66, 68, 76, 206, 483, 485
Exceptional character, 37, 434
Exceptional eigenvalues, 390, 399, 410, 497
Exceptional zero, 93, 111, 112, 121, 140, 390, 428, 434
Exceptional zero repulsion, 428
Exclusion-inclusion, 145, 156, 171, 308, 339
Explicit formula, 108, 118, 127, 265, 337, 410, 440, 564
Exponent pair, 214
Exponent Pair Hypothesis, 86, 214
Exponential integrals, 206, 208
Exponential sums, 197, 225, 443, 456
Family of L-functions, 96, 175, 573, 580
Family of exponential sums, 312
Farey sequence, 451, 458, 469, 477
Ford circle, 452
Four squares theorem, 468
Fourier coefficients, 174
Fourier coefficients of cusp forms, 319, 404
Fourier coefficients of modular forms, 83, 291, 479
Fourier expansion at infinity, 132
Fourier integrals, 204
Fricke involution, 83, 366, 368
Frobenius automorphism, 270, 302
Frobenius conjugacy class, 302
Function field, 292, 301
Functional equation, 81, 84, 94, 244, 258, 362, 366, 368, 375, 377, 512, 530, 539, 547, 566, 582
Functional equation of Eisenstein series, 389
Functor, 301
Fundamental discriminant, 52, 124, 508, 538, 540, 543
Fundamental domain, 58, 354, 396, 499, 504, 521
Fundamental Lemma, 153, 159, 437
Fundamental unit, 38, 402, 516
Gamma factor, 94
Gauss circle problem, 20, 73, 198, 215
Gauss sum, 18, 47, 49, 60, 79, 84, 119, 179, 192, 199, 239, 274, 321, 347, 456, 474, 478, 488, 491, 495
Gauss-Bonnet formula, 355
Gaussian prime, 53, 130, 290
Gaussian Unitary Ensemble, 563, 570
Genus character, 362, 516
Genus theory, 480, 506, 540
Geometric Frobenius, 302, 304
Geometric fundamental group, 304
Goldbach problem, 171, 339


Good reduction, 364
Grand Density Conjecture, 250
Grand Riemann Hypothesis, 101, 113, 136, 175, 182, 194, 249, 324, 419, 577
GRH, 113
Gross-Zagier curve, 368
Gross-Zagier formula, 579
Group law on elliptic curves, 291
Haar measure, 487
Hadamard and de la Vallée Poussin, 41, 101, 306
Hankel transform, 71, 99
Harish-Chandra/Selberg transform, 393, 397, 501
Harmonic polynomial, 362
Harmonics, 9, 174, 239, 319, 356, 360, 383
Hasse bound, 193, 269
Hasse derivative, 282
Hasse-Weil zeta function, 99, 145, 146, 272, 366, 369, 539, 541, 579
Hecke L-function, 60, 86, 366, 368, 374, 379, 511, 574
Hecke basis, 574
Hecke character, 56, 59, 129, 141, 142, 302, 362
Hecke congruence groups, 354
Hecke eigenvalues, 134, 192, 513, 582
Hecke form, 372
Hecke operator, 80, 186, 367
Heegner point, 494, 542-544, 579
Heilbronn sums, 300
Higher von Mangoldt function, 16, 592
Hilbert Class Field, 544
Hilbert inequality, 175
Hilbert's Theorem 90, 296
Hybrid large sieve, 183
Hyperbola method, 22, 31, 37, 420
Hyperbolic conjugacy classes, 401
Hyperbolic geodesics, 384
Hyperbolic laplacian, 385
Hyperbolic measure, 354, 514
Hyperbolic motion, 395
Hyperbolic plane, 384
Hyperelliptic curve, 281, 287
Idoneal number, 508
Imaginary quadratic field, 17, 56, 361, 503, 508, 542
Incomplete character sum, 317
Incomplete Eisenstein series, 387, 389
Incomplete gamma function, 513
Incomplete Kloosterman sum, 471
Invariant integral operator, 392, 393
Isoperimetric inequality, 398
Jacobi inversion formula, 473
Jacobi sum, 49, 311, 491
Jacobi symbol, 52, 473


Jacobian variety, 147, 578
Kloosterman fractions, 185, 481
Kloosterman sum, 18, 78, 185, 187, 278, 308, 313, 322, 382, 404, 469, 476, 481, 492, 584
Kloosterman sums zeta-function, 412
Kloosterman zeta function, 278
Kloosterman-Salie sum, 281, 358
Kronecker Limit Formula, 516, 524
Kronecker symbol, 52, 57, 124, 506, 508, 530
Kuznetsov formula, 65, 186, 380
Langlands functoriality, 96, 369, 493
Langlands program, 132
Laplace operator, 89, 131, 132, 186, 362, 404, 563, 580
Large sieve, 164, 167, 174, 194, 239, 420, 595
Large sieve inequality, 125, 423, 580
Lattice points, 197
Least quadratic non-residue, 182
Lefschetz trace formula, 305, 310
Legendre formula, 155, 341
Legendre symbol, 271
Length of closed geodesics, 410
Length spectrum, 401
Levinson's method, 550
Lindelöf Hypothesis, 101, 116, 186, 194, 214, 235, 243, 256, 597
Linear equivalence classes, 298
Linear forms, 497, 499
Linear fractional transformation, 353, 504
Linearly equivalent divisors, 293
Liouville function, 14
Lisse ℓ-adic sheaf, 301
Local Langlands conjecture, 138
Local root number, 136
Local roots, 94
Local zeta function, 365
Log-free zero-density estimate, 428
Low-lying zero, 574
Möbius function, 32, 36, 42, 345, 421, 430, 443, 446, 590
Möbius inversion, 13, 44, 46, 154
Möbius Randomness Law, 338, 444
Maass form, 131, 387
Major arc, 467
Matrix coefficients, 487
Mellin transform, 61, 90, 257
Mertens formula, 34
Minimal model, 364
Minor arc, 457, 462, 464, 466, 467
Mixed of weights ≤ w, 305
Modular curves, 577
Modular form, 145, 356
Modular form of weight one, 356
Modular group, 503
Modular interpretation, 542


Modular parameterization, 542, 543, 545
Modularity conjecture, 191, 366
Mollification, 550, 562, 581
Mollifier, 251
Monodromy group, 313, 573
Mordell-Weil theorem, 579
Multiple Kloosterman sum, 308, 492
Multiplicative characters, 34, 271, 308
Multiplicative function, 154, 165, 339, 521
Multiplicative inverse, 19, 55
Multiplicative reduction, 365
Multiplicity one principle, 373
Near-orthogonality, 170, 192
Nebentypus, 131
Negative curvature, 384
Newton polyhedron, 308, 309
Norm map, 270
Numerus idoneus, 508
Order of vanishing of an L-function, 117
Orthogonality of characters, 35, 46, 121, 311, 318, 425, 482, 492
Orthogonality relations, 44, 45, 179, 271, 511
Pólya-Vinogradov inequality, 325, 326
Pair Correlation Conjecture, 266, 569
Parabolic motion, 395
Peter-Weyl theorem, 487
Petersson formula, 136, 187, 188, 380, 404, 411, 579, 580
Petersson inner product, 357, 371
Petersson norm, 138, 542, 595
Poincare duality, 305
Poincare metric, 353, 384
Poincare series, 357, 404, 500
Poincare upper half-plane, 353, 384
Point-pair invariant, 391
Poisson distribution, 266
Poisson summation formula, 61, 69, 75, 99, 204, 319, 359, 391, 398, 404, 473
Polya-Vinogradov inequality, 523, 524
Positivity, 41, 105, 157, 180, 377
Primary element, 54
Prime Number Theorem, 25, 31, 110, 264, 401, 419, 427, 446, 448, 489, 527, 567
Primes in arithmetic progressions to large moduli, 266
Primes in short intervals, 264, 266
Primes splitting completely, 521
Primitive character, 46, 48, 70, 79, 85, 119, 179, 244, 422, 491, 508, 597
Primitive conjugacy classes, 396, 402
Primitive cusp form, 99, 134, 370, 373, 374, 513, 529
Primitive Hecke character, 60
Primitive ideal, 57, 508
Primitive root, 271


Principal divisors, 293
Principal genus, 506
Pure of weight w, 305
Quadratic character, 184
Quadratic Reciprocity Law, 51
Quantum unique ergodicity conjecture, 494
Radius of convergence, 289
Ramanujan Δ function, 360, 367, 493, 516
Ramanujan τ function, 372
Ramanujan sum, 18, 44, 48, 179, 280, 323, 325, 461, 472, 478, 481, 483, 488, 587, 589
Ramanujan-Petersson conjecture, 95, 100, 101, 115, 131, 137, 146, 378, 391, 480
Random matrix theory, 563
Rankin's trick, 341, 349, 525
Rankin-Selberg L-function, 97, 118, 189, 375, 378, 535, 580
Rankin-Selberg convolution, 14, 97, 106, 110
Real character, 46, 49
Real primitive character, 38, 46, 57, 84, 122, 573, 577
Real quadratic field, 38, 516
Reflection method, 243, 257
Regular discriminants, 510
Regulator, 125
Residual spectrum, 388
Resolvent of the laplacian, 390, 405
Riemann Hypothesis, 305, 306, 309, 320, 337, 427, 514, 518, 524, 547, 563, 568
Riemann Hypothesis for curves over finite fields, 146, 329, 463
Riemann Hypothesis for Dirichlet L-functions, 443
Riemann Hypothesis for elliptic curves over finite fields, 366
Riemann zeta function, 12, 119, 197, 204, 216, 388, 561
Riemann-Roch Theorem, 294, 296, 298
Root number, 94, 142, 145, 367, 377, 538, 574
Roots of an exponential sum, 274
Roots of quadratic congruences, 55
Salie sum, 272, 323, 476, 477, 495
Sato-Tate Conjecture, 137, 289, 324, 493
Sato-Tate measure, 492
Scaling matrix, 355, 387
Scattering matrix, 388, 397
Selberg Eigenvalue Conjecture, 96, 390, 415
Selberg sieve, 160
Selberg trace formula, 65, 391
Self-dual L-function, 95, 106, 578
Semistable, 365
Separation of variables, 326, 350
Series of Kloosterman sums, 380
Shifted primes, 30, 153


Short character sums, 289, 326
Siegel mass formula, 481
Siegel's bound, 124, 479, 514
Siegel-Walfisz Theorem, 419, 427, 489
Sieve dimension, 157
Sieve methods, 25, 153, 171, 265, 340, 349, 430, 493
Sieve problem, 27
Sieve weights, 338
Sieving level, 349
Sifted sum, 153
Sifting range, 161
Sign of the functional equation, 95, 537
Singular integral, 460
Singular series, 460, 466, 478
Smoothing, 40, 73, 74, 111, 169, 239, 497
Special value, 194, 577, 595
Spectral decomposition, 403, 406, 501
Spectral theory, 19, 344
Spectral theory of Kloosterman sums, 403
Square root cancellation, 306, 312, 314
Strict divisor function, 342
Strong multiplicity one principle, 139, 375
Subconvexity bound, 101, 204, 329, 494, 554, 597
Subdivision method, 236
Sums of Kloosterman sums, 410, 586
Symmetric square, 14, 137, 189, 532, 535, 575, 595
Tate twists, 301
Tate-Shafarevitch group, 367
Tempered divisor function, 75
Theta function, 18, 61, 85, 361, 456, 473, 479, 513, 541
Theta multiplier, 186
Trace map, 270
Trivial sheaf, 306
Trivial zeros, 96
Twisted modular form, 133
Uncertainty principle, 564
Unramified, 94, 118
Upper-bound sieve, 430
Von Mangoldt function, 15, 31, 42
Voronoi formula, 398
Waring problem, 456
Weierstrass equation, 363, 543
Weil bound, 79, 130, 269, 288, 381, 403, 413, 472, 585
Weil conjectures, 378
Well-factorable function, 420
Well-spaced, 175, 218, 232
Weyl Law, 186, 391
Weyl shift, 216, 314
Weyl sum, 198, 201, 335


Zero-density theorem, 249, 264
Zero-detecting polynomials, 252
Zero-free region, 105, 110, 128, 135, 138, 249, 264, 336, 347, 428, 429
Zeta function of an exponential sum, 273
