To the memory of Sylvan Burgstaller, Duane E. Anderson, and especially James L. Nelson who, at the University of Minnesota Duluth, taught me the fundamentals of writing proofs in analysis.
Acknowledgments
I wish to thank Natalya St. Clair for her excellent work creating the illustrations appearing in this textbook. She took my crude sketches and vague ideas and turned them into pleasing artwork and instructive diagrams. I also wish to thank Daniel M. Kane, Alan Gluchoff, Thomas Drucker, and Walter Stromquist for their insightful comments about the presentation, content, and correctness of the text.
Preface
After learning to solve many types of problems such as those found in the first courses in Algebra, Geometry, Trigonometry, and Calculus, mathematics students are usually exposed to a “transition” course where they are expected to write proofs of various theorems. I taught such a course for a dozen years and was never satisfied with the textbooks available for that course. Although such textbooks often teach the fundamentals of logic (conditionals, biconditionals, negations, truth tables) and give some common proof strategies such as mathematical induction, the textbooks failed to teach what a student needs to be thinking about when trying to construct a proof. Many of these books present a great number of well-written proofs and then ask students to write proofs of similar statements in the hope that the students will be able to mimic what they have seen. Some of these books are also designed to be used as an introductory textbook in Analysis, Abstract Algebra, Topology, Number Theory, or Discrete Mathematics, and, as such, they concentrate more on explaining the fundamentals of those topic areas than on the fundamentals of writing good proofs.

This Book Is Not Your Traditional Transition Textbook

The goal of this book is to give the student precise training in the writing of proofs by explaining what elements make up a correct proof, by teaching how to construct an acceptable proof, by explaining what the student is supposed to be thinking about when trying to write a proof, and by warning about pitfalls that result in incorrect proofs. In particular, this book was written with the following directives:
• Unlike many transition books which do not give enough instruction about how to write proofs, most of the proofs presented in this text are preceded by detailed explanations describing the thought process one goes through when constructing the proof. Then a good proof is given that incorporates the elements of that discussion.
• For proofs that share the same general structure, such as the proof of $\lim_{x \to a} f(x) = L$ for various functions, proof templates are provided that give a generic approach to writing that type of proof.
• Many transition books begin with several chapters covering an introduction to logic, set theory, cardinal numbers, and an axiomatic construction of the real numbers. I find that students do not appreciate the details of these discussions when these concepts are presented before they are needed to write a specific proof. For example, truth tables are very helpful in verifying the truth of a complex logical statement, but it is hard for students at that level to see the connection between the truth value of a complex statement and the formation of a proof. Therefore, I introduce many of these ideas as needed within the contexts of writing Analysis proofs and have kept the introductory material to a minimum.
• Many books that propose to teach students to write proofs in Analysis get carried away with covering those great topics in Analysis and cut back on the proof writing instruction. The books may start out teaching about proofs, but after a few chapters of introduction, they assume that the students now understand everything they need to know about writing proofs, and the books concentrate entirely on the concepts of Analysis. This book covers plenty of Analysis and can be used as a textbook for a typical beginning Real Analysis course, but it never loses sight of the fact that its primary focus is on proof writing skills. Certainly, one can use this book for a beginning course in Real Analysis because it thoroughly covers the standard theorems, but as a first course in proof writing, it will succeed where others fail.

If the students using this book have already had a thorough background in writing proofs, then this book could be used for a standard one-semester course in Real Analysis. These students might begin in Sect. 2.5 and, depending on their background, be expected to cover the material through Chaps. 6, 7, or 8.
On the other hand, if the students are using this book both as an introduction to proof writing and an introduction to Analysis, then the textbook can be used for a two-semester course in Real Analysis and proof writing. The first semester might aim to cover the first five or six chapters, while the second semester aims to complete the book. For most of the topics, it is important that the chapters be covered in their prescribed order. Elements of later chapters do depend on the material covered in earlier chapters.

Madison, Wisconsin, USA
2016
List of Proof Templates
Proof
Proving $A \subseteq B$ for sets $A$ and $B$
Proving $A = B$ for sets $A$ and $B$
Proving a function $f$ is surjective
Proving a function $f$ is injective
Proving $\lim_{x \to a} f(x) = L$
Proving a result using mathematical induction
Proving $\lim_{x \to a} f(x)$ does not exist
Proving the function $f$ is continuous at the point $a$
Proving the function $f$ is uniformly continuous on the set $A$
Proving is a metric space
1.1 What Is a Proof?
A statement in Mathematics is just a sentence which could be designated as true or false. The sentences “$1 + 1 = 2$” and “$x = 4$ implies $x^2 = 16$” are true statements while “All rational numbers are positive” and “There is a real number $x$ such that $x^2 + 5 = 2$” are false statements. Some sentences like “Green is nice” or “Authenticity runs hot” are too ambiguous, a matter of opinion, or are just plain nonsense and cannot be said to be true or false, so mathematicians would not consider them to be statements. Mathematicians have a lot of words for kinds of statements including many that you have heard: definition, axiom, postulate, principle, conjecture, lemma, proposition, law, theorem, contradiction, and others.

You are certainly familiar with the numbers you use for counting items: $1, 2, 3, 4$, and so forth. Suppose you wish to investigate statements about these numbers to see which statements hold true for all of these numbers. This is an admirable mathematical pursuit, so how would you get started? Mathematicians know from experience that if you want to begin an investigation, you better start with definitions, that is, you better make some clear statements about the objects you are about to study, because there are examples of mathematicians running off to study something without first making clear what it is they are studying, and later running into problems because they have not been consistent about how they are treating these new objects. This happened, for example, when people investigated the concept of limit before a precise definition of limit was in place.

OK, so perhaps you make some statements about the numbers with which you want to work so that you are confident that you understand the collection $1, 2, 3, 4, 5, \ldots$. What are you going to be able to do with these numbers? If you only know the names of these numbers and have a symbolic representation for each, there is not a great deal you can do with them.
Perhaps you could get a collection of blocks and paint one number on each block. Then you could have fun rearranging these numbers just as you have seen done by countless children.
But more likely you are interested in investigating some properties of these numbers having to do with their order or how they behave when operated on by addition or multiplication. This, of course, would mean that you will need to make clear statements about addition and multiplication operations and a less-than relationship, again, so that you do not run into problems later because you were being ambiguous. So, you might write definitions of addition, multiplication, and less than, and then make statements about how these operations behave such as a Commutative Law of Addition ($m + n = n + m$), a Distributive Law of Multiplication over Addition ($a(b + c) = ab + ac$), and an Order Property of Addition ($r < s$ implies $r + t < s + t$). These statements about how these defined quantities work are called axioms, postulates, or principles. They are statements that you accept as the guiding rules for how your mathematical objects behave and go beyond the definitions to describe and make precise just what the definitions are talking about.

Once you have made definitions and laid out your axioms, you should have the tools necessary to begin an investigation of other properties. Suppose that someone looks at a few examples and notices that $1 + 9 = 10$ and 10 is 2 times another number, 5. They then notice that $4 + 12 = 16$, $3 + 147 = 150$, and $1002 + 6 = 1008$, and all of these results are also numbers equal to 2 times another number. This might lead them to make the statement that “if you add two natural numbers together, the result is always 2 times another number.” Such a statement would be called a conjecture, a statement whose truth has not yet been determined. Of course, you know that this statement is false and came about because the investigator had not yet considered enough examples. Once they stumble upon $5 + 8 = 13$ and notice that 13 cannot be represented as 2 times another number, they will know that the statement does not hold true in every case.
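The way a single counterexample settles this conjecture can be mimicked with a brute-force search. The following is only an illustrative sketch: the function name `first_odd_sum` and the search bound are assumptions made here, not anything taken from the text. A search like this can refute a conjecture, but finding no counterexample would never amount to a proof.

```python
from itertools import product

def first_odd_sum(limit):
    """Return the first pair (m, n) of natural numbers up to `limit`
    whose sum is NOT 2 times another natural number, or None if no
    such pair exists within the bound."""
    for m, n in product(range(1, limit + 1), repeat=2):
        if (m + n) % 2 != 0:
            return (m, n)
    return None

# The investigator's examples all happen to have even sums...
examples = [(1, 9), (4, 12), (3, 147), (1002, 6)]
print(all((m + n) % 2 == 0 for m, n in examples))  # True

# ...but a systematic search finds a counterexample immediately.
print(first_odd_sum(10))  # (1, 2)
```

The pair $5 + 8 = 13$ from the discussion is simply another of the many counterexamples such a search would eventually reach.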
Other conjectures such as “for every natural number $a$, the number $a^2 - 3a + 12$ is a multiple of 2” hold up to more scrutiny. At some point in your investigation you might see a convincing argument that this conjecture is, in fact, a true statement. Such a convincing argument is what is called a proof. Once it is known that a statement has a proof, it is known as a theorem, lemma, corollary, proposition, or law. So, a proof of a statement in mathematics is a convincing argument that establishes the truth of that statement.

Some statements are very easily proved, and certainly mathematicians often set up axioms in order to make particular statements easy to prove. At first this may appear to be cheating or, at best, unproductive and uninteresting because it seems to defeat the purpose of establishing truth by dictating rules that make it trivial to establish the truth. But this is certainly not the case. It is common for mathematicians to have an intuitive idea about how a system should work before they feel that they understand it enough to set down formal definitions and axioms. Perhaps you wanted addition of all natural numbers $a$ and $b$ to satisfy $a + b = b + a$. Then it would make sense to include this rule among your axioms. The axioms are written with the idea of establishing enough structure so that the statements the mathematicians want to hold true can easily be proved. The richness of mathematics is that after assuring that the obvious can be proved from the axioms, there are many more results that can be proved that are not immediately obvious from the definitions and axioms,
statements which might never have been apparent to those who set up the system in the first place. For example, Fermat’s Last Theorem (there are no natural numbers $a$, $b$, $c$, and $n > 2$ such that $a^n + b^n = c^n$) is a statement about natural numbers which could only be conjectured after investigating a large number of examples, and stood as a conjecture for hundreds of years before a proof was provided.

Occasionally, it is shown that a conjecture is independent of the axioms; that is, neither the truth nor the falseness of the statement follows from the axioms. Two famous examples are the statements about sets known as the Axiom of Choice and the Continuum Hypothesis which have been shown to be independent of the original axioms of Zermelo–Fraenkel Set Theory. The independence of such statements suggests that the axiom system is not rich enough in structure to establish the truth of these statements, and that if one chose to do so, those statements could be added to the list of axioms for the system. The Axiom of Choice or something equivalent to it, for example, is now usually listed along with the Zermelo–Fraenkel axioms.

One certainly hopes that it is not possible to prove two contradictory statements about objects in a system. Such an occurrence would say that the axioms of the system were inconsistent, and this would require the axioms to be changed. After the original ground rules for Set Theory were established by Georg Cantor in the 1870s and 1880s, Bertrand Russell pointed out in 1901 a paradox (contradiction) that is a consequence of those rules. Now commonly known as Russell’s paradox, it stimulated a flurry of activity which resulted in the young field of Set Theory being put on a firm foundation (we hope) with the creation and adoption of the Zermelo–Fraenkel axioms.

The language of a proof can vary depending on who is writing the proof and who is the intended reader.
In other words, what makes a convincing argument may well depend on who it is that needs to be convinced. For example, if two experts in Functional Analysis are speaking to each other, one might prove a statement by saying “Oh, that’s just a consequence of the Hahn-Banach Theorem.” That proof might be sufficient since it completely describes the reasoning behind the statement in question due to the shared knowledge of the two experts. On the other hand, if one of these experts were speaking to a beginning mathematics graduate student, the proof would need to include far more detail in order for it to be a convincing argument. If the expert were speaking to a high school student, the proof might need to be a complete book that both introduces the needed concepts and explains many results needed to understand the proof. It is important to understand that there is a difference between knowing why a statement is true and knowing how to write a good proof of the statement. It is quite possible to learn a great deal of mathematics, to be able to solve many types of mathematical problems, and to understand why particular properties must hold without being able to write coherent proofs of these properties. It is analogous to a police detective who has gathered enough evidence to be convinced which of the many suspects has committed a particular crime, but it is quite another thing to have the criminal successfully prosecuted in a court of law resulting in the criminal’s conviction and eventual punishment for the crime. A student in Analysis needs to learn many strategies that can be brought to bear when writing proofs. Some of these
1 What Are Proofs, and Why Do We Write Them?
strategies are methods or tricks that enter a student’s bag of tricks which can be employed later when solving problems or writing proofs. A student of proof writing needs to learn how to take those strategies and turn them into coherent proofs where the ideas are presented in a logical order, fill in all necessary details, and make clear to the proof reader exactly why the chosen strategies justify the needed result.

This book talks about how you should go about writing proofs of the kinds of statements typically found in the branch of Mathematics called Analysis. The branches of mathematics are not precisely defined. After a new branch arises, some mathematicians begin to combine ideas from older branches with ideas from the new branch to form even newer areas of study. For example, there are branches called Algebra, Geometry, and Topology. During the twentieth century mathematicians began talking about Algebraic Topology, Algebraic Geometry, and Geometric Topology. Very roughly speaking, then, some of the branches of mathematics are
• Set Theory: the study of sets, set operations, functions between sets, orderings of sets, and sizes of sets
• Algebra: the study of sets upon which there are binary operations defined (such as addition or multiplication); it includes Group Theory, Ring Theory, Field Theory, and Linear Algebra
• Topology: the study of continuous functions and properties of sets that are preserved by continuous functions
• Analysis: the study of sets for which there is a measure of distance allowing for the definition of various limiting processes such as those found in the subjects of Calculus, Differential Equations, Functional Analysis, Complex Variables, Measure Theory, and many other areas.
Other areas of study such as Applied Mathematics, Combinatorics, Geometry, Logic, and Probability are considered by some mathematicians to be branches of mathematics in their own right and by others to be part of one or more of the above four branches.
The exact designation is important to some mathematicians and not to others. Although mathematicians learn to write proofs in each of these branches of mathematics, one has to begin the learning process someplace. Many teachers feel that Analysis is a good area to start because students who have completed a study of Calculus will already be familiar with just about all of the theorems discussed in a beginning course in Analysis, and may already have an intuitive feeling for why these results hold. That does not mean that those same students can write convincing proofs of these theorems. It is the goal of this book to provide the training necessary so that a student can learn to write proofs of these and similar theorems. Undergraduate courses in Topology, Group Theory, Advanced Calculus, Graph Theory, and so forth generally present the beginning concepts in each of these fields and try to give students a feel for why the major results in the fields are true. Sometimes this involves having the students learn proofs of these results while other times it only involves a presentation of definitions and known results with the idea that the students will be able to take the why it is true and turn it into a proof themselves. This book is much more interested in turning known strategies into proofs than in introducing a wealth of new strategies.
Some arguments in Analysis follow a standard format or template. This book will present several templates for proofs as a tool for teaching how one might approach the writing of a proof. For example, one can learn to prove a statement of the form $\lim_{x \to \infty} f(x) = L$ by following a standard pattern. This book will display proof patterns by presenting proof templates, and for each template it will discuss proof examples showing how to use the template and the thought process needed to complete such proofs. After that, a student would be expected to produce similar proofs.

There are other theorems in Analysis whose proofs involve the introduction of some clever idea which time has shown to be useful. Beginning students would not be expected to produce proofs using these new ideas on their own, so some of these proofs are presented in order to teach the new proof strategy. The experienced mathematician will have seen a large number of these clever proof techniques and can be expected to reuse these techniques when writing a proof of some new statement. Beginning students do not have this catalog of proof techniques from which to draw, so they are not expected to be able to write proofs for such a wide variety of statements. But one must start someplace when building up this catalog, and it is a goal of this book to get students started in the right direction.
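To give a rough idea of what a proof with a standard pattern looks like, here is a small worked example using the usual definition of a limit at infinity. It is offered only as an illustration under that standard definition; it is not one of the proof templates of this book, and the choice of the function $1/x$ is an assumption made for the sake of the example.

```latex
% Illustration only: the standard definition of a limit at infinity,
% followed by a short proof in the "let epsilon > 0 be given" pattern.
\textbf{Definition.} $\lim_{x \to \infty} f(x) = L$ means: for every
$\varepsilon > 0$ there is a number $N$ such that $x > N$ implies
$|f(x) - L| < \varepsilon$.

\textbf{Claim.} $\lim_{x \to \infty} \frac{1}{x} = 0$.

\textbf{Proof.} Let $\varepsilon > 0$ be given, and let $N = \frac{1}{\varepsilon}$.
If $x > N$, then $x > 0$ and
\[
  \left| \frac{1}{x} - 0 \right| = \frac{1}{x} < \frac{1}{N} = \varepsilon .
\]
Hence $\lim_{x \to \infty} \frac{1}{x} = 0$.
```

The shape is what matters here: introduce an arbitrary $\varepsilon$, name a suitable $N$ in terms of $\varepsilon$, and verify the required inequality for every $x > N$.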
1.2 Why We Write Proofs
There are many reasons why mathematicians put a lot of weight on the writing of proofs. Here are some of the reasons.

Determining Truth
Research mathematicians use proofs to determine what mathematical statements are true. Although many statements in mathematics are obviously true, many remain unproved conjectures for long periods of time before being proved. When a conjecture stands unproved for many years, there is time for more mathematicians to learn about the statement, and the conjecture may attract a great deal of attention. When the conjecture is first stated, some may find it interesting, but finding a suitable proof may not appear to be a difficult problem until many people have tried unsuccessfully to find a proof. As this interesting statement remains a conjecture for a longer and longer period of time, the mathematical community realizes that the problem of finding a proof is much more involved than originally expected. This is exciting partly because a wider community of experts begins to wonder whether the statement under consideration is true and because it becomes clear that new techniques will be needed to find a suitable proof if, in fact, the statement can be proved at all. The problem of determining whether or not the mathematical statement is true takes on the same sort of interest that some people would take in the success of their favorite sports teams, sitting and waiting to see how they will fare in the upcoming contest. When a longstanding conjecture is finally proved, the announcement of the accomplishment will often be covered by the lay press giving mathematics an uncharacteristic brief period of public admiration. Perhaps you are familiar with some of these famous problems whose
Fig. 1.1 Dividing the disk with the chords from n points. Panels: 1 point, 1 region; 2 points, 2 regions; 3 points, 4 regions; 4 points, 8 regions; 5 points, 16 regions; 6 points, 31 regions
resolution has eluded mathematicians for years (at least at the time of the writing of this text in January 2016): the Riemann Hypothesis, the Goldbach Conjecture, the Twin Prime Conjecture, the P versus NP Problem, and the Navier–Stokes Equations Existence and Smoothness Problem. During the last 40 years resolutions have been announced for several long-standing problems including the Four Color Theorem, the Bieberbach Conjecture (now called de Branges’s Theorem), Fermat’s Last Theorem, and the Poincaré Conjecture.

Why do mathematicians expend so much effort trying to prove statements, some of which may seem obvious from the start? One reason is that mathematicians are very skeptical of statements that appear obvious, and rightfully so. There is a long history of mathematical statements that appear to be true but are eventually shown to be false. Even very clear patterns can be deceptively seductive. Take, for example, the following problem. Select a set of $n$ points along the circumference of a circle, draw the chords between each pair of points, and find out the maximum number of regions into which these segments can divide the disk. Figure 1.1 shows the results for the first few values of $n$. Although from considering $n = 1, 2, 3, 4, 5$ it appears that the chords can divide the disk into $2^{n-1}$ regions, this fails to be true when $n = 6$. With a bit more thinking it is not hard to see that $2^{n-1}$ could not be the correct answer. With $n$ points there are $\binom{n}{2}$ chords and at most $\binom{n}{4}$ intersections of two chords. This number of intersections grows as a fourth-degree polynomial in $n$ suggesting that the number of regions will
also grow as a fourth-degree polynomial in $n$. It would, therefore, be suspicious for the number of regions to grow at the exponential rate of $2^{n-1}$.

Another well-known example comes from Number Theory. The function $\pi(x)$ gives the number of positive prime integers less than or equal to the number $x$. The growth rate of this function has long been of central importance in Number Theory. The Prime Number Theorem says that the function grows at the same rate as the logarithmic integral
$$\operatorname{li}(x) = \int_0^x \frac{dt}{\ln t}.$$
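Both patterns can be explored numerically. The sketch below is an illustration under stated assumptions rather than anything given in the text: it uses the standard closed form $1 + \binom{n}{2} + \binom{n}{4}$ for the maximum number of regions (a known result that is not derived here), and it approximates $\operatorname{li}(x)$ for $x \ge 2$ by adding the known constant $\operatorname{li}(2) \approx 1.0451637801$ to a Simpson's-rule estimate of the integral from 2 to $x$, which sidesteps the singularity at $t = 1$. The function names are hypothetical.

```python
import math

def max_regions(n):
    """Maximum number of regions cut from a disk by all chords joining
    n points on the circle (standard formula, assumed, not derived here)."""
    return 1 + math.comb(n, 2) + math.comb(n, 4)

def prime_count(x):
    """pi(x): the number of primes <= x, via a sieve of Eratosthenes."""
    if x < 2:
        return 0
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)

LI_2 = 1.045163780117493  # known value of li(2)

def li(x, steps=20000):
    """Approximate li(x) for x >= 2 as li(2) plus the integral of
    1/ln(t) from 2 to x, using composite Simpson's rule (steps is even)."""
    h = (x - 2) / steps
    total = 1 / math.log(2) + 1 / math.log(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) / math.log(2 + k * h)
    return LI_2 + total * h / 3

# Regions versus the tempting guess 2^(n-1).
for n in range(1, 8):
    print(n, max_regions(n), 2 ** (n - 1))

# pi(x) next to an approximation of li(x) for a few small x.
for x in (100, 1000, 10000):
    print(x, prime_count(x), round(li(x), 2))
```

The first table agrees with $2^{n-1}$ through $n = 5$ and departs from it at $n = 6$, where the count is 31 rather than 32, matching Fig. 1.1. The second table can only ever sample small values of $x$, which is exactly the limitation of numerical evidence.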
In fact, for many years it was thought that $\operatorname{li}(x) > \pi(x)$ for all $x > 0$ because this holds for all small values of $x$ which can be practically checked, for example, all $x$ between 0 and $10^{24}$. It has now been shown that $\operatorname{li}(x) - \pi(x)$ switches sign infinitely often, although only for extremely large values of $x$. It is apparent that sometimes seemingly very obvious patterns do not hold in every case, so mathematicians rely on proofs to convince themselves that the patterns do indeed hold in the general case.

Testing Axiom Systems
In the next chapter you will read about the writing of proofs for some very elementary facts in mathematics; so elementary that you may wonder why anyone would bother with these proofs. Clearly, it makes sense to begin any training in the writing of proofs with some very simple results that are easy to understand so that the student can feel confident about all the statements being made in the proofs. But these proofs are not being presented just because they are elementary. When one sets up a mathematical system by making definitions and determining axioms, it is usually with a particular application or example in mind. The desired result is that the new system will include the already partially understood application so that any new discoveries will immediately tell something new about the original application. Suppose someone sets up an axiom system for the real numbers, for example, but is not able to prove that addition of real numbers satisfies the commutative property. Since the commutative property is an important aspect of addition of real numbers, it would appear that the new axiom system does not have enough power to represent all that one would want to show about the real numbers. Perhaps the axiom system will need to be expanded to include an axiom about the commutativity of addition. Thus, if one cannot prove that the expected simple properties hold, then it says that something is missing from the axioms.
So mathematicians write proofs to confirm that their axiom systems are representative of the applications they are trying to describe.

Exhibiting Beauty
There are no rules about what composers of music need to write, but many composers try to write in standardly accepted formats such as string quartets or symphonies because there are already organizations ready to perform such works and groups of people happy to listen to such works. Scholars of literature compare literary works by writing literary analysis, a form which holds a lot of meaning for those who read and write in that field. Although painters
choose to make pictures of every sort of object or scene, real or imagined, most painters eventually try their hand at painting some of the standard subjects (still life, nudes, famous religious or historical depictions). Similarly, mathematicians write proofs partly because that is what mathematicians enjoy doing. Although many mathematicians make substantial contributions to the sciences, social sciences, and arts through the application of their mathematical skills, others live in a world of creating and discussing abstract concepts that have no immediate application to real world problems, or at least no application apparent to the mathematicians doing the research. To them, mathematics is studied as part of the humanities and is appreciated for its beauty. And much of the beauty of mathematics lies in the proofs of its theorems. One gets a great deal of pleasure reading a clever proof of a complicated result when the proof can be stated in just a few lines, especially if previous proofs of the same result were considerably longer and more difficult to understand. Many mathematicians like reading articles and attending conferences where they are exposed mainly to proofs of results, partly so that they can learn about new results, but more importantly so they can appreciate the techniques brought to bear to construct the proofs.

Testing Students
One should not underestimate the need to educate future mathematicians. A good way to test whether a student understands a particular result is to ask the student to present a proof of the result. The presentation of a proof shows a deep understanding of why the result is true and shows an ability to discuss many details about the objects involved. At the graduate school level in mathematics, most test problems require the student to produce a proof of a particular result.
The student who has completed a study of Calculus is likely to have mastered basic skills in Algebra, Geometry, Trigonometry, and Elementary Functions. This is a good point in one’s studies to begin writing proofs. It should not be assumed that one can just begin writing proofs at this stage even if they have had years of experience watching teachers and authors present proofs to them any more than someone can be expected to sit down and begin playing the piano just because they have watched many other people present concerts using the instrument. In this book the reader will be taken through the construction of many proofs in a step-by-step manner that presents the thought process used to write the proofs. Some incorrect proofs are shown and explained so that the student can learn about common pitfalls to avoid. Some students dread the transition to writing proofs because they feel that they do not understand how to write proofs, and are leery of the day when they will be expected to produce what they cannot now do. But the ability to write good proofs is a skill no different from the ability to factor polynomials or integrate rational functions. There is no expectation that the beginner can produce a good proof, but every expectation that the beginner can learn.
Chapter 2
The Basics of Proofs
2.1 The Language of Proofs
2.1.1 Conditional Statements
Most theorems concern mathematical objects $x$ that satisfy a set of properties $P$, that is, $P(x)$ = the properties $P$ hold for object $x$. The theorem may say that if $P(x)$ is true, then some additional properties $Q(x)$ must also be true. Such statements are called conditional statements and can be written $P(x) \to Q(x)$. In the context of proving theorems, the “$P(x)$” portion of the statement is referred to as the hypothesis of the statement, and the “$Q(x)$” portion of the statement is referred to as the conclusion of the statement. The hypothesis of a conditional statement is often called the antecedent while the conclusion of the conditional statement is often called the consequent. For example, a well-known theorem is that all functions differentiable at a point are also continuous at that point. There are many equivalent ways to express this fact:
• All functions differentiable at a point are also continuous at that point.
• If the function $f$ is differentiable at a point, then $f$ is continuous at that point.
• The function $f$ is differentiable at a point only if $f$ is continuous at that point.
• If the function $f$ is not continuous at a point, then $f$ is not differentiable at that point.
• There are no functions $f$ such that $f$ is both differentiable at a point and discontinuous at that point.
• The function $f$ is differentiable at a point implies that $f$ is continuous at that point.
• The function $f$ is differentiable at $x$ $\to$ $f$ is continuous at $x$.
The statement $Q(x) \to P(x)$ is called the converse of the conditional statement $P(x) \to Q(x)$. Indeed, the converse of this theorem is the clearly false statement: “If the function $f$ is continuous at a point, then $f$ is differentiable at that point.” Certainly, there are functions $f$ both continuous and differentiable at a point, but knowing that a function is continuous at a point does not allow one to conclude that it is differentiable at that point. The converse of a conditional statement is not logically equivalent to the original statement, but since the two statements are concerned with the same subject matter, mathematicians are often interested in the converse of a given conditional. If someone succeeds in proving a new theorem expressed as a conditional statement, you might wonder whether the converse of the statement could also be true. Sometimes the truth of the converse statement is a trivial matter because it is well known. But there are many examples where the converse does not hold in every case; that is, there are many known values of $x$ where the converse statement “$Q(x) \to P(x)$” is false. Other times, the converse statement is something that has been previously established. But very often, the truth of the converse statement remains an open question, and the proof of the original conditional statement may generate research interest in its converse.

One of the equivalent forms of a conditional statement $P(x) \to Q(x)$ is the statement “if $Q(x)$ is false, then $P(x)$ must be false.” This can be written as “$\neg Q(x) \to \neg P(x)$” using the negation symbol $\neg$. This form of the statement is called the contrapositive of the original conditional statement.
For example, the contrapositive of the statement discussed above is “If the function f is not continuous at a point, then f is not differentiable at that point.” Although logically equivalent to the original conditional statement, the contrapositive often gives you a different way to think about the statement, and you will often see a proof which is a proof of the contrapositive statement instead of a proof of the original conditional statement.
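The relationship among a conditional, its converse, and its contrapositive can be checked mechanically with a truth table. The following is a minimal sketch, not part of the original text; the helper name `implies` is an assumption introduced for illustration, and the variables p and q stand for the truth values of P(x) and Q(x) at one fixed x.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Truth value of the conditional 'p implies q' (hypothetical helper)."""
    return (not p) or q

# All four combinations of truth values for p and q.
rows = list(product([True, False], repeat=2))

# The contrapositive (not q implies not p) agrees with the conditional on every row.
contrapositive_equivalent = all(
    implies(p, q) == implies(not q, not p) for p, q in rows
)

# The converse (q implies p) disagrees with the conditional on at least one row.
converse_equivalent = all(implies(p, q) == implies(q, p) for p, q in rows)

print(contrapositive_equivalent)  # True
print(converse_equivalent)        # False
```

The row with p true and q false is where the conditional and its converse part ways, which is why knowing a theorem tells you nothing about its converse.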
2.1.2 Negation of a Statement

The negation of a statement is a statement with the opposite truth value of the original statement, that is, a statement which is false exactly when the original statement is true. For example, the negation of “n is an integer” is “n is not an integer.” The negation of the statement $P(x)$ is “not $P(x)$” or simply “$\neg P(x)$.” The conditional statement “$P(x) \to Q(x)$” says that every time $P(x)$ holds it must be the case that $Q(x)$ also holds. The negation of this statement must, therefore, state that for at least one value of x, $P(x)$ is true and $Q(x)$ is false or “$P(x)$ and $\neg Q(x)$.” A proof by contradiction is a proof that assumes both that $P(x)$ and $\neg Q(x)$ are true, and derives a statement that must be false (known as a contradiction) showing that it is impossible to have $P(x)$ being true at the same time that $Q(x)$ is false.

The well-known Pythagorean Theorem is a conditional statement: “If a right triangle has legs with lengths a and b and a hypotenuse with length c, then $a^2 + b^2 = c^2$.” The converse of the Pythagorean Theorem is also true: “If a triangle has sides with lengths a, b, and c satisfying $a^2 + b^2 = c^2$, then the triangle is a right triangle.” When a conditional statement, $P(x) \to Q(x)$ and its converse
$Q(x) \to P(x)$ are both true, the two statements can be combined into one as $P(x) \leftrightarrow Q(x)$. This can also be stated as “$P(x)$ if and only if $Q(x)$.” Such statements are called biconditional statements. Thus, the Pythagorean Theorem and its converse could be combined into the single biconditional statement: “A triangle is a right triangle if and only if the triangle has side lengths a, b, and c satisfying $a^2 + b^2 = c^2$.”
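Two claims from this subsection can be confirmed at the level of truth values: that the negation of a conditional is “P and not Q,” and that a biconditional is the conditional together with its converse. The sketch below is an illustration only; it works with plain truth values for a single fixed x (so it says nothing about the “for at least one value of x” part of the negation), and the helper `implies` is a hypothetical name.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Truth value of 'p implies q' (hypothetical helper)."""
    return (not p) or q

rows = list(product([True, False], repeat=2))

# not (p implies q) has the same truth value as (p and not q) on every row.
negation_matches = all(
    (not implies(p, q)) == (p and not q) for p, q in rows
)

# (p implies q) and (q implies p) together are true exactly when
# p and q have the same truth value, i.e. "p if and only if q".
biconditional_matches = all(
    (implies(p, q) and implies(q, p)) == (p == q) for p, q in rows
)

print(negation_matches)       # True
print(biconditional_matches)  # True
```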
2.1.3 Proofs of Conditional Statements

Conditional statements often make assertions about a very large number of objects or even an infinite set of objects. Indeed, the statement about differentiable functions being continuous refers to infinitely many functions, and the Pythagorean Theorem refers to an infinite number of triangles. How, then, are you supposed to prove these results since you clearly cannot consider every case individually? A general approach to proving the conditional statement “$P(x) \to Q(x)$” is to select a generic element x which could represent any object satisfying $P(x)$ and then to prove the statement $Q(x)$. Since a generic object x satisfying $P(x)$ has been shown to satisfy $Q(x)$, it follows that every object satisfying $P(x)$ must also satisfy $Q(x)$, and the result has been proved. This will be the format of most of the proofs you will ever write in analysis.

If the statement “$P(x) \to Q(x)$” is not true, it means that there is at least one value of x that makes “$P(x) \to Q(x)$” a false statement. Such an x is called a counterexample to the statement, and exhibiting such a counterexample would be a way to prove that “$P(x) \to Q(x)$” is false. A proof of “$P(x) \to Q(x)$” is essentially an argument showing that no counterexamples exist.

There are many phrases that occur so frequently when writing proofs that mathematicians have developed a shorthand notation for these phrases. There is little need to use these abbreviations within a textbook such as this or even in a journal article, but the shorthand can be useful when writing out a proof by hand on paper or a blackboard. Here is a list of some of the commonly used symbols.

Shorthand Symbols for Proofs
$\exists$    there exists
$\exists!$   there exists exactly one
$\forall$    for all
S    suppose (or assume)
$\ni$    such that
$\to$    implies
$\leftrightarrow$    if and only if
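Returning to the idea of a counterexample described in this subsection: over a finite range of values, a search for counterexamples can be carried out directly, and the quantifiers “for all” and “there exists” correspond loosely to Python's `all` and `any`. The statement tested here, “if n is prime, then n is odd,” is an example chosen for illustration and does not come from the text; the names `is_prime`, `P`, `Q`, and `domain` are likewise assumptions. A finite search can refute a statement but can never prove one about infinitely many objects.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

def P(n: int) -> bool:
    """Hypothesis: n is prime."""
    return is_prime(n)

def Q(n: int) -> bool:
    """Conclusion: n is odd."""
    return n % 2 == 1

domain = range(1, 101)  # a finite stand-in for the natural numbers

# A counterexample is a value where the hypothesis holds and the conclusion fails.
counterexamples = [n for n in domain if P(n) and not Q(n)]
print(counterexamples)  # [2]

# "For all n, P(n) implies Q(n)" fails on this domain ...
holds_for_all = all((not P(n)) or Q(n) for n in domain)
# ... precisely because "there exists n with P(n) and not Q(n)" succeeds.
exists_counterexample = any(P(n) and not Q(n) for n in domain)
print(holds_for_all, exists_counterexample)  # False True
```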
2.1.4 Exercises

Perform the following steps for each of the conditional statements in Exercises 1–6.
A. Identify the hypothesis and the conclusion.
B. Write the converse of the statement.
C. Decide whether or not the converse of the statement is true.
D. Write the contrapositive of the statement.
E. Write the negation of the statement.
1. If $x = 1$ and $y = 1$, then $xy = 1$.
2. If x is an integer, then $2x + 1$ is also an integer.
3. $f(x)$ and $g(x)$ are both continuous at $x = 0$ only if $f(x) + g(x)$ is continuous at $x = 0$.
4. $xy = 0$ if $x = 0$ or $y = 0$.
5. If $xy - 9y = 0$ and $y > 0$, then $x = 9$.
6. A rectangle has area xy if two adjacent sides of the rectangle have lengths x and y.
7. Write the following without using shorthand symbols.
   (a) $\exists x \in \mathbb{R} \ni x + 4 = 2$.
   (b) $\forall x \in \mathbb{R}\ \exists y \in \mathbb{R} \ni x + y = 10$.
2.2 Template for Proofs

Many proofs can be written by following a simple formula or template that suggests guidelines to follow when writing the proof. Mathematicians reading a proof that follows a traditional template will find the proof easier to follow because there will be an expectation about what will be presented in the proof. For example, many proofs will follow the general template given here.

TEMPLATE followed by many proofs
• SET THE CONTEXT
• ASSERT THE HYPOTHESIS
• LIST IMPLICATIONS
• STATE THE CONCLUSION
To illustrate this template, consider this proof of a well-known theorem from elementary Algebra.
PROOF (Quadratic Formula): For constants a, b, and c, the quadratic polynomial $ax^2 + bx + c$ has roots given by $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.
• SET THE CONTEXT: Let a, b, and c be constants with $a \neq 0$.
• ASSERT THE HYPOTHESIS: Suppose that x satisfies $ax^2 + bx + c = 0$.
• LIST IMPLICATIONS: Since $a \neq 0$, it follows that $x^2 + \frac{b}{a}x + \frac{c}{a} = 0$.
• Then $x^2 + \frac{b}{a}x + \frac{c}{a} + \frac{b^2}{4a^2} = \frac{b^2}{4a^2}$.
• Factoring shows that $\left(x + \frac{b}{2a}\right)^2 + \frac{c}{a} = \frac{b^2}{4a^2}$.
• Then $\left(x + \frac{b}{2a}\right)^2 = \frac{b^2}{4a^2} - \frac{c}{a} = \frac{b^2 - 4ac}{4a^2}$.
• This means that $x + \frac{b}{2a}$ must be one of the two square roots of $\frac{b^2 - 4ac}{4a^2}$.
• So, $x + \frac{b}{2a} = \pm\sqrt{\frac{b^2 - 4ac}{4a^2}} = \pm\frac{\sqrt{b^2 - 4ac}}{2a}$, and $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.
• STATE THE CONCLUSION: Thus, the roots of the quadratic polynomial $ax^2 + bx + c$ are given by $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.
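The conclusion of the proof above can be spot-checked numerically. The following is a minimal sketch, assuming real or complex coefficients and using Python's `cmath.sqrt` so that a negative discriminant is handled; the function name `quadratic_roots` and the sample coefficients are illustrative choices, not part of the text. A numerical check of this kind illustrates the formula for particular polynomials only and is no substitute for the proof.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return the two roots (-b +/- sqrt(b^2 - 4ac)) / (2a) of ax^2 + bx + c."""
    if a == 0:
        # Mirrors SET THE CONTEXT: the formula requires a nonzero leading coefficient.
        raise ValueError("a must be nonzero for a quadratic polynomial")
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))

# Sample polynomials: one with real roots, one with complex roots.
samples = [(1, -3, 2), (1, 0, 1)]
for a, b, c in samples:
    for x in quadratic_roots(a, b, c):
        # Substitute the root back into ax^2 + bx + c; the result should be 0
        # up to floating-point rounding.
        residual = a * x * x + b * x + c
        print((a, b, c), x, abs(residual) < 1e-9)
```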
The proof template begins with the suggestion to “SET THE CONTEXT” which represents statements designed to tell the reader what is being assumed in the proof. This is usually a sentence or two telling the reader about the properties of the objects that will be encountered in the proof. It may also introduce which variables will appear in the proof and what kinds of objects they represent. So, in the given proof of the Quadratic Formula, the first line tells that the variables a, b, and c are going to represent known constants with a not being 0. Clearly, the fact that a is not 0 needs to be stipulated because if $a = 0$, the polynomial $ax^2 + bx + c$ would not be quadratic and would not have the proposed roots. Generally, you are not looking for a lengthy narrative here, and, in fact, brevity is a particularly cherished attribute of a proof. Saying what needs to be said, but only what needs to be said, is usually best.

Some authors who state a theorem and immediately follow the statement of the theorem with its proof will forgo setting the context at the beginning of the proof because the reader will have just seen the statement of the theorem and may not need to see a repeat of the context for that proof. For example, in the example proof, the statement of the theorem does introduce the constants a, b, and c and polynomial $ax^2 + bx + c$, so some authors might just skip the first line of the proof. On the other hand, if the first line of the proof instead introduced the constants r, s, and t, the proof could have proceeded using these variables instead of a, b, and c. The same result would have been proved. So the “SET THE CONTEXT” of the proof makes the proof independent of the statement of the theorem being proved. Thus, for completeness, it is good to establish the habit of including the setting of the context at the beginning of each proof, at least until the student’s experience in proof writing has matured.
Your choices of variables used to represent particular objects in the proof are not critically important to the structure or correctness of the proof, but there are
certain variables that mathematicians associate with various uses, and sticking to these conventional choices simplifies the understanding of the proof because those variable choices bring with them a history of context that the reader will recognize. There are very few Algebra students who would recognize the Quadratic Formula if you gave them $z = \frac{-s \pm \sqrt{s^2 - 4rt}}{2r}$. Proofs about limits usually refer to the variables $\epsilon$ and $\delta$ which represent small positive real numbers used in specific ways in the proof. Using these two variables in their traditional contexts makes the proofs easier to understand because the reader will expect these variables to play specific roles, just as they have in many other proofs the reader has seen. Seeing many examples of proofs will familiarize the novice proof writer with these traditional uses of variables.

Suppose that the statement being proved indicates that every object satisfying the properties listed in the hypothesis of the theorem also satisfies some properties listed in the conclusion of the theorem. One generally structures a proof of such a statement by first selecting a generic object satisfying the properties listed in the hypothesis. The “ASSERT THE HYPOTHESIS” part of the proof is where the writer selects an arbitrary element satisfying the hypothesized properties. In the Quadratic Formula proof, it was assumed that x satisfied the quadratic equation $ax^2 + bx + c = 0$. Other examples would be statements such as
• Let n be any natural number bigger than 3.
• Let x be an element of set A.
• Let y be a root of the polynomial $p(x)$.
• Assume that the real valued function f has a zero at the point z.
• Suppose G and H are any two lines that intersect at a point P.
• Assume that the function $f(s) = \int_0^s g(x)\,dx$ is a differentiable function of s. In addition assume that $0 \le f(s) \le 10$ for all $s \ge 0$.
It is possible that there are infinitely many objects which could play the role of the generically chosen object. But if an argument proves the result is true for this generic object, then the theorem will have been shown to hold for any object that could have played the role of the generic object, and, therefore, the theorem will have been proved for all objects satisfying the hypothesis. The Quadratic Formula proof addresses the one generic polynomial $ax^2 + bx + c$ and in doing so derives a formula that works for all quadratic polynomials including $5x^2 - 17x + 126$ and $rx^2 + sx + t$.

Often the reader of a proof will form a mental picture of the generic object being chosen. For example, after reading “Let n be any natural number bigger than 3,” the reader may think, “OK, how about $n = 7$?” As the proof progresses, the reader may take each statement of the proof and verify that it is valid and makes sense for their choice of $n = 7$. This helps the reader follow the logic of the proof and verifies that they are understanding what the proof is saying.

The proof will be completed when it is shown that the generically chosen element satisfying the hypothesis of the theorem is, in fact, an element satisfying the conclusion of the theorem as stated in the “STATE THE CONCLUSION” part of the template. There will certainly need to be some statements placed between
the original assertion of the hypothesis and the end of the proof that justify the conclusion of the theorem. Those statements make up the “LIST IMPLICATIONS” part of the template. In almost all cases, most of the body of the proof belongs to this list of statements. Each statement in the list should follow from definitions or be simple implications following from previous statements in the proof. In a well-written complete proof, the reader should easily see why each implication follows logically from other statements made earlier in the proof (Fig. 2.1). If an implication is not clear on its own, it will need some justification so the reader can follow the logic. The justification may just be a reminder of a key point made earlier in the proof (“as shown earlier, f is continuous at point a”) or a reminder of a well-known definition or theorem (“Since all continuous functions on the interval $[0, 4]$ are integrable there, it follows that $\int_0^4 f(x)\,dx$ exists”).

The given Quadratic Formula proof contains six lines of implications. Each line follows easily from the line before using standard rules of Algebra, and any student familiar with the algebraic manipulations of equations will be able to understand these implications. In the fourth step of the proof, the quantity $\frac{b^2}{4a^2}$ is added to both sides of an equation. Although this step surely follows the rules of Algebra, it may not be clear to the reader of the proof why the step is important. As it turns out, this “completing the square” operation prepares for the factoring performed in the fifth step of the proof and is arguably the most clever step of the proof. A proof will often require a clever step such as this. The proof writer may have labored for years looking for the inspiration needed to find such a step, but the proof itself need only make clear the justification for what is being done and does not need to refer to the sweat that went into producing it.

Some implications will be easy for the reader to follow without having to justify the step. Other statements may need some deeper explanation. Here is where the proof writer will need to consider the expertise of the target audience for the proof in order to decide how much detail to provide. How to make your proof “easy to follow” is only clear when you know for whom it is meant to be easy. For example, it made sense to follow the line $x^2 + \frac{b}{a}x + \frac{c}{a} = 0$ with the statement $x^2 + \frac{b}{a}x + \frac{c}{a} + \frac{b^2}{4a^2} = \frac{b^2}{4a^2}$ because this just used the fact that you can add equal quantities to both sides of an equation to get a new equation that is equivalent. On the other hand, suppose you wish to combine a conditional statement on line 8 of a proof with the fact stated on line 15 of the proof in order to show that the hypothesis of that conditional is satisfied. This would allow the writer to state the conclusion of the conditional
Fig. 2.1 List of implications for $P \to Q$
statement to get line 16 of the proof, but the reader may have to be reminded about which statements are being combined to get that conclusion. Sometimes the writer of a long or complicated proof will need to make a new definition or point out some new property that will be important later in the proof. Depending on the complexity of the new idea, the proof writer may want to include an example or two of objects satisfying the new definition or property. This will serve to help the reader understand the new concept or to verify that the reader is understanding the new concept. It is admirable to include such examples if the complexity of the proof can be made clearer. But in most other contexts, the proof should be kept short without the inclusion of unnecessary statements. If the intended readers are able to easily construct these examples on their own, then the examples should be left out of the proof. The remainder of this chapter will discuss proofs that follow this general proof template in contexts that the student should find easy to follow. It will also give an opportunity to present some definitions and notation that will be used in later chapters.
2.2.1 Exercises

1. If you were writing a proof of “All prime numbers greater than 2 are odd,” which of the following would be appropriate ways to begin the proof? (There may be more than one correct answer.)
   (a) Let n be an odd prime number.
   (b) Assume that all odd prime numbers are greater than 2.
   (c) Let n be a prime number greater than 2.
   (d) Assume that 2 is a prime number.
   (e) Assume that n and k are integers with n > k > 2.
   (f) The numbers 3, 5, 7, and 11 are prime numbers greater than 2.
   (g) Let n be a number greater than 2 which is not prime.
2. If you were writing a proof of “The diagonals of a parallelogram bisect each other,” which of the following would be appropriate ways to begin the proof? (There may be more than one correct answer.)
   (a) Let ABCD be a parallelogram.
   (b) Let ABCD be a quadrilateral whose diagonals bisect each other.
   (c) Let ABCD be a parallelogram whose diagonals bisect each other.
   (d) All rectangles are parallelograms.
   (e) Assume that the diagonals of a parallelogram bisect each other.
   (f) Assume that if the diagonals of a quadrilateral bisect each other, then the quadrilateral is a parallelogram.
3. If you were writing a proof of “Every cubic polynomial with real coefficients has at least one real root,” which of the following would be appropriate ways to begin the proof? (There may be more than one correct answer.)
   (a) Assume that every cubic polynomial with real coefficients has at least one real root.
   (b) Assume that $p(x)$ is a polynomial with at least one real root.
   (c) Assume that a, b, c, and d are real numbers with $a \neq 0$, and let $p(x) = ax^3 + bx^2 + cx + d$.
   (d) The polynomial $x^3 - 8$ has exactly one real root at $x = 2$.
   (e) Let $p(x)$ be a cubic polynomial with real coefficients and $q(x)$ be a cubic polynomial with complex coefficients.
   (f) Let $p(x)$ be a cubic polynomial with real coefficients with real root r.

Write an appropriate first sentence that would begin proofs of each of the following statements.
4. If m and n are relatively prime integers, then there exist integers x and y such that $mx + ny = 1$.
5. The three angle bisectors of any triangle intersect at a common point.
6. If a and b are real numbers with $a < b$, and f is a function continuous on the closed interval $[a, b]$, then there is a real number M such that $|f(x)| \le M$ for all $x \in [a, b]$.
7. If $\vec{u}$, $\vec{v}$, and $\vec{w}$ are 3-dimensional vectors, then $(\vec{u} \times \vec{v}) \cdot \vec{w} = \vec{u} \cdot (\vec{v} \times \vec{w})$.
2.3 Proofs About Sets

2.3.1 Set Notation

Most courses in mathematics discuss sets: sets of numbers, sets of points, sets of functions, sample spaces, and so forth. This should have given any Calculus student an intuitive understanding of sets. Many theorems in mathematics are statements about sets in disguise. For example, the statement that “If the function f is differentiable at a point, then f is continuous at that point” is equivalent to the statement “The set of functions differentiable at a point is a subset of the set of functions continuous at that point.”

For the purposes of this text, it will be enough to define a set as a collection of elements. That is, elements are those objects that belong to sets, and the notation $x \in A$ says that x is an element of the set A, and $x \notin A$ says that x is not an element of the set A. The set A is a subset of the set B, or A is contained in the set B, if each element of A is also an element of B in which case this fact is written as $A \subseteq B$. Two sets, A and B, are equal if they have the same elements, that is, all the elements in the set A are in the set B, and all the elements in the set B are in the set A. Notationally, this says that $A = B$ if and only if both $A \subseteq B$ and $B \subseteq A$.
There are many ways to express the contents of a set. One is to list the elements such as $A = \{a, b, c\}$ or $B = \{1, 3, 5, 7, \ldots\}$. Another way is to use set builder notation which states that the set consists of all elements satisfying a given property $P(x)$ and is written $\{x \mid P(x)\}$, or to emphasize that the elements of the set are also in set A, it is often written $\{x \in A \mid P(x)\}$. Examples are $\{x \mid x > 0\}$, $\{y \mid y^2 + 3y^2 > 7\}$, and $\{f \mid f \text{ is a function differentiable at } x = 3\}$.

Note that a set is determined by the elements that are in the set. Thus, $\{1, 2, 3\} = \{3, 2, 1\} = \{1, 2, 2, 3, 3, 3, 1, 2, 3\}$ because all three of these sets contain exactly the same three elements. In some contexts, mathematicians will talk about a multiset which is an object similar to a set but allows elements of the collection to appear with different multiplicities. Thus, $\{1, 2, 3\}$ and $\{1, 2, 2, 3, 3, 3, 1, 2, 3\}$ would be different multisets even though, in the notation of sets, they are the same set. One special set is the empty set written as $\emptyset$ or $\{\}$ and is the set that has no elements. In some contexts there is an understanding of a universal set, U, such that all other sets under consideration are subsets of U. For example, the sets $A = \{1, 2, 3\}$ and $B = \{2, 4, 6, 8, \ldots\}$ can be thought of as subsets of the universal set $U = \{1, 2, 3, 4, \ldots\}$.

Take care not to confuse elements and subsets. Remember that sets are collections of elements and sets are subsets of other sets. It is possible that a set contains other sets as elements, but this would need to be explicitly clear from the definition of that set. It is correct to write $3 \in \{1, 2, 3, 4, 5\}$, $\{1, 3, 5\} \subseteq \{1, 2, 3, 4, 5\}$, and $\{1, 2\} \in \{1, \{1, 2\}, \{1, \{1, 2\}\}\}$, but it is incorrect to write $3 \subseteq \{1, 2, 3, 4, 5\}$, $\{1, 2\} \in \{1, 2, 3, 4, 5\}$, or $\emptyset \in \{1, 2, 3, 4, 5\}$. The student should be familiar with the following standard set operations.
The union of sets A and B is $A \cup B = \{x \mid x \in A \text{ or } x \in B\}$, and the intersection of sets A and B is $A \cap B = \{x \mid x \in A \text{ and } x \in B\}$. When there is an understood universal set, U, it makes sense to refer to the complement of a set $A^c = \{x \in U \mid x \notin A\}$. It does not make sense to discuss the complement of a set if there is no understood universal set. For example, is $\{1, 2, 3\}^c = \{4, 5, 6, 7, \ldots\}$, or is it $\{\ldots, -3, -2, -1, 0, 4, 5, 6, 7, \ldots\}$? For that matter, is your right shoe an element of $\{1, 2, 3\}^c$? The difference of two sets is $A \setminus B = \{x \in A \mid x \notin B\}$, and, equivalently, if there is an understood universal set, $A \setminus B = A \cap B^c$. For example, if $A = \{1, 2, 3, 4, 5\}$ and $B = \{2, 4, 6, 8\}$, then $A \cup B = \{1, 2, 3, 4, 5, 6, 8\}$, $A \cap B = \{2, 4\}$, $A \setminus B = \{1, 3, 5\}$, and $B \setminus A = \{6, 8\}$.
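The notation of this subsection maps closely onto Python's built-in `set` type, which can be a convenient way to experiment with small examples. The sketch below reuses the sets A and B from the example above; the universal set `U`, the use of `collections.Counter` as a stand-in for a multiset, and the use of `frozenset` for a set that is an element of another set are assumptions made for this illustration.

```python
from collections import Counter

A = {1, 2, 3, 4, 5}
B = {2, 4, 6, 8}

union = A | B          # {1, 2, 3, 4, 5, 6, 8}
intersection = A & B   # {2, 4}
a_minus_b = A - B      # {1, 3, 5}
b_minus_a = B - A      # {6, 8}

# A set is determined by its elements: order and repetition do not matter.
same_set = {1, 2, 3} == {3, 2, 1} == {1, 2, 2, 3, 3, 3, 1, 2, 3}

# A Counter records multiplicities, so it distinguishes the two multisets.
same_multiset = Counter([1, 2, 3]) == Counter([1, 2, 2, 3, 3, 3, 1, 2, 3])

# A complement only makes sense relative to an understood universal set.
U = set(range(1, 11))  # assumed universal set {1, ..., 10}
complement_of_A = U - A
difference_via_complement = A & (U - B)  # equals A - B when A and B lie in U

# Elements versus subsets.
S = {1, 2, 3, 4, 5}
element_check = 3 in S                 # 3 is an element of S
subset_check = {1, 3, 5} <= S          # {1, 3, 5} is a subset of S
nested = {1, frozenset({1, 2})}        # a set having a set as an element
nested_element_check = frozenset({1, 2}) in nested
not_an_element = frozenset({1, 2}) in S  # {1, 2} is a subset of S, not an element

print(union, intersection, a_minus_b, b_minus_a)
print(same_set, same_multiset)
print(element_check, subset_check, nested_element_check, not_an_element)
```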
2.3.2 Exercises

1. Which of the following statements are true?
   (a) (b) (c) (d)
   (e) $\{2, 5, 5, 6\} \subseteq \{1, 2, 3, \ldots, 10\}$.
   (f) $\{1, 2, 3, \ldots, 10\} \subseteq \{1, 2, 3, \ldots, 10\}$.
2. Given $A = \{1, 3, 5, 7, 9, 11, 13\}$, $B = \{2, 3, 4, 5, 6\}$, and $C = \{1, 4, 7, 11, 14\}$, evaluate each of the following expressions.
   (a) $A \cup A$
   (b) $A \cap B$
   (c) $(A \cup B) \cap C$
   (d) $(B \cup C) \cap A$
   (e) $(A \cap B) \setminus C$
   (f) $(A \setminus B) \cup C$
2.3.3 Proofs About Subsets

There are many simple statements about sets which should be immediately obvious to students reading this text, but learning to write proofs for these types of statements will be instructive and useful in the proof writing discussed in the following chapters. Here are some of those simple statements that apply to all sets A, B, and C.

Some Statements About All Sets A, B, and C
• $A \subseteq A \cup B$.
• $A \cap B \subseteq A$.
• $A \setminus (B \cup C) \subseteq (A \cup B) \setminus C$.
• $(A \cup B) \cap C \subseteq A \cup (B \cap C)$.
• $A \cup B = B \cup A$, the Commutative Law of Union.
• $A \cap B = B \cap A$, the Commutative Law of Intersection.
• $(A \cup B) \cup C = A \cup (B \cup C)$, the Associative Law of Union.
• $(A \cap B) \cap C = A \cap (B \cap C)$, the Associative Law of Intersection.
• $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$, the Distributive Law of Union Over Intersection.
• $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, the Distributive Law of Intersection Over Union.
• $(A \cup B)^c = A^c \cap B^c$, DeMorgan’s Laws.
• $(A \cap B)^c = A^c \cup B^c$, DeMorgan’s Laws.
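Before proving statements like those in the list above, it can be reassuring to test them on every choice of A, B, and C drawn from a small universal set. The following is a sketch of such an exhaustive check; the universal set `U = {0, 1, 2}`, the helper names `all_subsets` and `comp`, and the string labels are assumptions made for this illustration. Passing the check on one small universe is evidence, not a proof, since the statements are claims about all sets.

```python
from itertools import combinations, product

U = frozenset({0, 1, 2})  # small universal set chosen only for illustration

def all_subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]

def comp(X):
    """Complement of X relative to the universal set U."""
    return U - X

# One entry per listed statement; `<=` means "is a subset of".
statements = {
    "A <= A|B": lambda A, B, C: A <= (A | B),
    "A&B <= A": lambda A, B, C: (A & B) <= A,
    "A-(B|C) <= (A|B)-C": lambda A, B, C: (A - (B | C)) <= ((A | B) - C),
    "(A|B)&C <= A|(B&C)": lambda A, B, C: ((A | B) & C) <= (A | (B & C)),
    "A|B == B|A": lambda A, B, C: (A | B) == (B | A),
    "A&B == B&A": lambda A, B, C: (A & B) == (B & A),
    "(A|B)|C == A|(B|C)": lambda A, B, C: ((A | B) | C) == (A | (B | C)),
    "(A&B)&C == A&(B&C)": lambda A, B, C: ((A & B) & C) == (A & (B & C)),
    "A|(B&C) == (A|B)&(A|C)": lambda A, B, C: (A | (B & C)) == ((A | B) & (A | C)),
    "A&(B|C) == (A&B)|(A&C)": lambda A, B, C: (A & (B | C)) == ((A & B) | (A & C)),
    "comp(A|B) == comp(A)&comp(B)": lambda A, B, C: comp(A | B) == (comp(A) & comp(B)),
    "comp(A&B) == comp(A)|comp(B)": lambda A, B, C: comp(A & B) == (comp(A) | comp(B)),
}

subsets = all_subsets(U)

# Evaluate each statement on all 8 * 8 * 8 triples of subsets of U.
results = {
    name: all(check(A, B, C) for A, B, C in product(subsets, repeat=3))
    for name, check in statements.items()
}

for name, holds in results.items():
    print(f"{name}: {holds}")
```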
The first four of these statements propose that one set is a subset of a second set. From the definition of subset, for $A \subseteq B$ to be true, it is required that for every $x \in A$, x must also be in B. There is a standard template for proofs of statements of this form:
TEMPLATE for proving $A \subseteq B$ for sets A and B
• SET THE CONTEXT: State what is being assumed about the sets A and B.
• ASSERT THE HYPOTHESIS: Let $x \in A$.
• LIST IMPLICATIONS: Use the properties of set A to show x belongs to set B.
• STATE THE CONCLUSION: $x \in B$. Therefore, by the definition of subset, $A \subseteq B$.

For example, how would one prove the statement “For all sets A and B, $A \subseteq A \cup B$”? Because this proof is supposed to apply to any sets A and B regardless of what properties they may possess, all that would be necessary for the “SET THE CONTEXT” part of the proof is a statement introducing to the reader the fact that the variables A and B will represent sets. Since $A \subseteq A \cup B$ exactly when every element of A is also an element of $A \cup B$, the “ASSERT THE HYPOTHESIS” part of the proof needs to select a generic element of the set A so that the proof can conclude that the generic element is an element of set $A \cup B$. The first two lines of the proof read: Suppose that A and B are any two sets. Let $x \in A$. The “LIST IMPLICATIONS” for this proof can be very short. It merely needs to show that the definition of “set union” implies that x is in the union $A \cup B$. This completes the proof.

PROOF: $A \subseteq A \cup B$.
• SET THE CONTEXT: Suppose that A and B are any two sets.
• ASSERT THE HYPOTHESIS: Let $x \in A$.
• LIST IMPLICATIONS: Since $x \in A$, it is true that $x \in A$ or $x \in B$.
• By the definition of set union $x \in A \cup B$.
• STATE THE CONCLUSION: Therefore, by the definition of subset, $A \subseteq A \cup B$.
Do the statements of this proof have to appear in exactly this order using exactly these words? Of course not. There can be many variations in what makes up a good proof. But it does not hurt to review why these statements make a good proof. The first line, “Suppose that A and B are any two sets,” just makes it clear to the reader that the variables A and B can be used to represent any two sets. Here is where the reader of the proof may well mentally choose two sets so that when reading the remainder of the proof, the reader can verify that the statements make sense when applied to those two sets. The second line, “Let $x \in A$,” is required because by the definition of “subset,” one must show that each element of A is also an element of $A \cup B$, so selecting an arbitrary element of A is the natural way to do this. The next line, “Since $x \in A$, it is true that $x \in A$ or $x \in B$,” is just a statement of logic that says if statement p is true, then statement p or q is also true. Of course, this particular p or q statement is exactly the definition of x being a member of $A \cup B$, which is exactly what is needed to complete the proof.
Could one have interchanged the third and fourth lines of this proof? Well, yes; the proof would be complete if that were done, but the fact that the definition of set union is invoked right after its conditions are verified makes the statements of the proof flow smoothly. The reader facing the definition of set union in line three might wonder why that definition is being shown at that point. By placing that statement as the fourth statement where the proof reader has just seen that $x \in A$ or $x \in B$, the proof reader will immediately see that the definition of set union applies. Note that each of the five statements in the proof has been placed on a separate line in the display box. This has been done merely to facilitate the discussion about that proof. In practice, there is no requirement that these statements appear on separate lines.

The second statement about all sets is $A \cap B \subseteq A$. This can be proved using the same proof template as the first statement. Since this statement also applies to any two sets A and B, the first line of this proof will be the same as the first line of the previous proof. Because the assertion of the statement being proved is that $A \cap B$ is a subset of another set, the “ASSERT THE HYPOTHESIS” line of the proof would change to the assertion that x belongs to $A \cap B$. After reading this second line, what does the proof reader know about x? Only that x belongs to the intersection of two sets. Thus, the only direction that the proof can proceed is to invoke the definition of set intersection to make the additional assertion that $x \in A$ and $x \in B$. This is a statement of the form p and q, so logic allows the assertion that p is true, or, in this case, that $x \in A$. This is the required “STATE THE CONCLUSION” statement, and the complete proof would be

PROOF: $A \cap B \subseteq A$.
• SET THE CONTEXT: Suppose that A and B are any two sets.
• ASSERT THE HYPOTHESIS: Let $x \in A \cap B$.
• LIST IMPLICATIONS: By the definition of set intersection, $x \in A$ and $x \in B$.
• Thus, $x \in A$.
• STATE THE CONCLUSION: Therefore, by the definition of subset, $A \cap B \subseteq A$.

For a more substantial example, consider the third of the list of statements about sets, $A \setminus (B \cup C) \subseteq (A \cup B) \setminus C$. A proof of this statement will need to refer to the definition of set difference as well as the definitions of set union and subset. Since the statement being proved involves three sets, the “SET THE CONTEXT” part of the proof will need to refer to all three sets. The “ASSERT THE HYPOTHESIS” statement will need to select an arbitrary element from $A \setminus (B \cup C)$. To emphasize that the choice of which variable to use is arbitrary, this time use y rather than x to represent the arbitrarily chosen element. Once it is known that $y \in A \setminus (B \cup C)$, the only property of y that can be used is the fact that y is a member of a set difference. Thus, this would be a good time to invoke the definition of set difference. That assures that $y \in A$ and $y \notin (B \cup C)$. At that point one can use the definition of
set union to conclude that, since $y \notin (B \cup C)$, both $y \notin B$ and $y \notin C$. Now these facts can be combined to get the “STATE THE CONCLUSION” statement required by the proof template. The complete proof would be

PROOF: $A \setminus (B \cup C) \subseteq (A \cup B) \setminus C$.
• SET THE CONTEXT: Suppose that A, B, and C are any three sets.
• ASSERT THE HYPOTHESIS: Let $y \in A \setminus (B \cup C)$.
• LIST IMPLICATIONS: By the definition of set difference, $y \in A$ and $y \notin (B \cup C)$.
• By the definition of set union y cannot be an element of either set B or set C, or it would be in $B \cup C$.
• Also by the definition of set union, since $y \in A$, y is also a member of $A \cup B$.
• Now, $y \in (A \cup B)$ and $y \notin C$, so by the definition of set difference, $y \in (A \cup B) \setminus C$.
• STATE THE CONCLUSION: Therefore, by the definition of subset, $A \setminus (B \cup C) \subseteq (A \cup B) \setminus C$.
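The statement just proved is a subset relation, not an equality, and a small example shows why the reverse inclusion can fail. The sketch below uses three sets chosen for illustration (they do not come from the text) and then retraces the implications of the proof for each element of the left-hand set.

```python
# Hypothetical sets chosen so that the two sides differ.
A, B, C = {1}, {2}, set()

left = A - (B | C)    # A \ (B ∪ C)
right = (A | B) - C   # (A ∪ B) \ C

forward_inclusion = left <= right   # the statement that was proved
reverse_inclusion = right <= left   # the converse inclusion

print(left, right)                           # {1} {1, 2}
print(forward_inclusion, reverse_inclusion)  # True False

# Retrace the LIST IMPLICATIONS steps of the proof for each y in the left side.
for y in left:
    assert y in A and y not in (B | C)   # definition of set difference
    assert y not in B and y not in C     # definition of set union
    assert y in (A | B)                  # definition of set union, since y is in A
    assert y in (A | B) - C              # definition of set difference
```

Here the element 2 belongs to the right-hand set because it is in B, yet it is not in A, so it cannot belong to the left-hand set.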
2.3.4 Exercises

Write proofs for each of the following statements.
1. For all sets A, B, and C, $(A \cap B) \cap C \subseteq A \cap C$.
2. For all sets A, B, and C, $(A \cap B) \cap (A \cap C) \subseteq B \cap C$.
3. For all sets A, B, and C, $(A \setminus B) \cap (A \setminus C) \subseteq A \setminus (B \cap C)$.
2.3.5 Proofs About Set Equality

Let A and B be sets. From the definition of set equality it follows that one can prove $A = B$ by proving the two separate facts $A \subseteq B$ and $B \subseteq A$. That suggests the following proof template for proving that two sets are equal.
TEMPLATE for proving A D B for sets A and B • SET THE CONTEXT: Make statements about what is being assumed about sets A and B. PART 1: SHOW A B. • ASSERT THE HYPOTHESIS: Let x 2 A. • LIST IMPLICATIONS: Use the properties of set A to show x belongs to set B. • CONCLUDE PART 1: x 2 B. Therefore, by the definition of subset, A B. PART 2: SHOW B A. • ASSERT THE HYPOTHESIS: Let x 2 B. • LIST IMPLICATIONS: Use the properties of set B to show x belongs to set A. • CONCLUDE PART 2: x 2 A. Therefore, by the definition of subset, B A. • STATE THE CONCLUSION: Therefore, because A and B are subsets of each other, by the definition of set equality, A D B. Is it correct to use the same variable x in both parts of the above proof template? Yes, since the use of the variable x is only important in the context of showing A B or B A, there is little chance that the reader will be confused by these two uses of the same variable. On the other hand, there would be nothing incorrect about using the variable x to represent the element of set A in the first part of the proof and to use the variable y to represent the element of set B in the second part of the proof. Using the same variable has the advantage that it is used the same way in both parts of the proof, that is, to represent an element of a set that is being shown to also be an element of a second set. Is it correct that the variables A and B are used to represent the sets in both parts of the proof? Could, for example, the first part of the proof use sets A and B, and the second part of the proof use sets C and D? Here the answer is that it is very important to use the same variables in both parts of the proof. To show A D B it must be shown that A B and B A for the same pair of sets A and B. Showing A B and C D does not let one conclude that A D B. 
After introducing $A$ and $B$ in the “SET THE CONTEXT” part of the proof, it would be wrong to change the use of these variables later in the proof or to change which variables were representing the two sets.

Consider how to write proofs of three of the example statements:
• $A \cup B = B \cup A$, the Commutative Law of Union.
• $(A \cap B) \cap C = A \cap (B \cap C)$, the Associative Law of Intersection.
• $(A \cup B)^c = A^c \cap B^c$, DeMorgan’s Law.

The first proof follows easily from the fact that in logic the statements "$p$ or $q$" and "$q$ or $p$" are equivalent. This leads to the proof
PROOF: $A \cup B = B \cup A$.
• SET THE CONTEXT: Suppose that $A$ and $B$ are any two sets.
PART 1: $A \cup B \subseteq B \cup A$
• ASSERT THE HYPOTHESIS: Let $x \in A \cup B$.
• LIST IMPLICATIONS: By the definition of set union, $x \in A$ or $x \in B$.
• Thus, $x \in B$ or $x \in A$.
• By the definition of set union, $x \in B \cup A$.
• CONCLUDE PART 1: Hence, from the definition of subset, it follows that $A \cup B \subseteq B \cup A$.
PART 2: $B \cup A \subseteq A \cup B$
• ASSERT THE HYPOTHESIS: Now suppose that $x \in B \cup A$.
• LIST IMPLICATIONS: By the definition of set union, $x \in B$ or $x \in A$.
• Thus, $x \in A$ or $x \in B$.
• By the definition of set union, $x \in A \cup B$.
• CONCLUDE PART 2: Hence, from the definition of subset, it follows that $B \cup A \subseteq A \cup B$.
• STATE THE CONCLUSION: Therefore, because $A \cup B$ and $B \cup A$ are subsets of each other, by the definition of set equality $A \cup B = B \cup A$.

Note that the “PART 1” and “PART 2” labels have been included in the above display as guides to the student, but they are not required elements of the proof itself.

This proof can be shortened. Since the second part of the proof is identical to the first part except that the roles of the sets $A$ and $B$ are interchanged, one might save the reader from having to think through the details of the second half of the proof, which are identical to the details of the first half. The proof could be written as

PROOF: $A \cup B = B \cup A$.
• Suppose that $A$ and $B$ are any two sets.
• Let $x \in A \cup B$.
• By the definition of set union, $x \in A$ or $x \in B$.
• Thus, $x \in B$ or $x \in A$.
• By the definition of set union, $x \in B \cup A$.
• Hence, from the definition of subset, it follows that $A \cup B \subseteq B \cup A$.
• Similarly, one can conclude that $B \cup A \subseteq A \cup B$.
• Therefore, since $A \cup B$ and $B \cup A$ are subsets of each other, by the definition of set equality $A \cup B = B \cup A$.
In fact, the first half of the proof is the second half of the proof. The first half of the proof shows that $A \cup B \subseteq B \cup A$ for any two sets $A$ and $B$. In particular, that proof applies when the roles of the two sets are interchanged; just let the variable $A$ in the first part of the proof represent the set $B$ from the second part of the proof, and let the variable $B$ in the first part of the proof represent the set $A$ from the second part of the proof.

The Associative Law of Intersection refers to three sets and requires repeated use of the definition of “set intersection.” The definition is used to break down the statement $x \in (A \cap B) \cap C$ into the three simple statements $x \in A$, $x \in B$, and $x \in C$, and then these facts are put back together to form the needed $x \in A \cap (B \cap C)$. Again, the proof needs two parts. The result is

PROOF: $(A \cap B) \cap C = A \cap (B \cap C)$.
• Suppose that $A$, $B$, and $C$ are any three sets.
PART 1: $(A \cap B) \cap C \subseteq A \cap (B \cap C)$
• Let $x \in (A \cap B) \cap C$.
• By the definition of set intersection, $x \in (A \cap B)$ and $x \in C$.
• Also, by the definition of set intersection, $x \in A$ and $x \in B$.
• Thus, $x \in A$, $x \in B$, and $x \in C$.
• Since $x \in B$ and $x \in C$, by the definition of set intersection $x \in B \cap C$.
• Since $x \in A$ and $x \in B \cap C$, by the definition of set intersection $x \in A \cap (B \cap C)$.
• Hence, from the definition of subset, it follows that $(A \cap B) \cap C \subseteq A \cap (B \cap C)$.
PART 2: $A \cap (B \cap C) \subseteq (A \cap B) \cap C$
• Now, let $x \in A \cap (B \cap C)$.
• By the definition of set intersection, $x \in A$ and $x \in B \cap C$.
• Also, by the definition of set intersection, $x \in B$ and $x \in C$.
• Thus, $x \in A$, $x \in B$, and $x \in C$.
• Since $x \in A$ and $x \in B$, by the definition of set intersection $x \in A \cap B$.
• Since $x \in A \cap B$ and $x \in C$, by the definition of set intersection $x \in (A \cap B) \cap C$.
• Hence, from the definition of subset, it follows that $A \cap (B \cap C) \subseteq (A \cap B) \cap C$.
• Therefore, because $(A \cap B) \cap C$ and $A \cap (B \cap C)$ are subsets of each other, by the definition of set equality $(A \cap B) \cap C = A \cap (B \cap C)$.
The two DeMorgan’s Laws are useful because they tell how to simplify the complement of a set formed by a combination of unions and intersections of sets. Proving these laws can follow the template for showing set equality. The proofs will need to refer to the definitions of set union, set intersection, and set complement. The order in which these definitions are invoked follows from what is known at that point of the proof. For example, if you know that $x \in (A \cup B)^c$, then the only way to make progress in the proof is to apply the definition of set complement, because the only attribute known about the set is that it is the complement of some other set.
Fig. 2.2 $(A \cup B)^c$ is equal to $A^c \cap B^c$ (two Venn diagrams of sets $A$ and $B$, one labeled $(A \cup B)^c$ and one labeled $A^c \cap B^c$)
Yes, that other set is a union of two sets, but there is no way to use that information at this point of the proof because complementation was performed after the union was taken (Fig. 2.2).

PROOF: $(A \cup B)^c = A^c \cap B^c$.
• Suppose that $A$ and $B$ are any two sets.
PART 1: $(A \cup B)^c \subseteq A^c \cap B^c$
• Let $x \in (A \cup B)^c$.
• By the definition of set complement, $x \notin (A \cup B)$.
• If $x \in A$ or $x \in B$, then $x \in A \cup B$, which is false.
• Thus, $x \notin A$ and $x \notin B$, so by the definition of set complement, $x \in A^c$ and $x \in B^c$.
• By the definition of set intersection, $x \in A^c \cap B^c$.
• Hence, from the definition of subset, it follows that $(A \cup B)^c \subseteq A^c \cap B^c$.
PART 2: $A^c \cap B^c \subseteq (A \cup B)^c$
• Now, let $x \in A^c \cap B^c$.
• By the definition of set intersection, $x \in A^c$ and $x \in B^c$.
• Thus, by the definition of set complement, $x \notin A$ and $x \notin B$.
• If $x \in A \cup B$, then by the definition of union, it would follow that $x \in A$ or $x \in B$, which is false.
• Thus, $x \notin A \cup B$, and, by the definition of set complement, $x \in (A \cup B)^c$.
• Hence, from the definition of subset, it follows that $A^c \cap B^c \subseteq (A \cup B)^c$.
• Therefore, because $A^c \cap B^c$ and $(A \cup B)^c$ are subsets of each other, by the definition of set equality $(A \cup B)^c = A^c \cap B^c$.
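The three identities proved in this section can also be spot-checked by brute force on a small universe. The following is only a sanity check, not a proof: it is a minimal Python sketch in which the helper names `subsets` and `check_set_laws` are hypothetical, and the complement is taken relative to an assumed finite universe `U`.

```python
from itertools import combinations

def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]

def check_set_laws(universe):
    """Check the three identities for all subsets A, B, C of `universe`."""
    U = frozenset(universe)
    all_subsets = subsets(U)
    for A in all_subsets:
        for B in all_subsets:
            if A | B != B | A:                      # Commutative Law of Union
                return False
            if U - (A | B) != (U - A) & (U - B):    # DeMorgan's Law
                return False
            for C in all_subsets:
                if (A & B) & C != A & (B & C):      # Associative Law of Intersection
                    return False
    return True

print(check_set_laws({1, 2, 3}))
```

A check of this kind can expose a false conjecture quickly, but only the element-chasing proofs above establish the identities for all sets.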
2.3.6 Exercises

Given that $A$, $B$, and $C$ are sets, write proofs for each of the following statements.
1. $A \cap B = B \cap A$.
2. $A \cap (B \setminus A) = \emptyset$.
3. $(A \setminus B) \cup (B \setminus A) = (A \cup B) \setminus (A \cap B)$.
4. $(A \cup B) \cup C = A \cup (B \cup C)$.
5. $(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$.
6. $(A \cap B) \cup C = (A \cup C) \cap (B \cup C)$.
7. $A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C)$.
2.4 Proofs About Even and Odd Integers

2.4.1 Definitions of Even and Odd Integers

You are already very familiar with the natural numbers, $\mathbb{N} = \{1, 2, 3, 4, \ldots\}$, which are sometimes called the counting numbers or whole numbers. By adding zero and the negative natural numbers to this set, one obtains the integers, $\mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$. The natural numbers are often referred to as the positive integers. Much of a student’s first study of mathematics is concerned with these two sets of numbers.

By a very young age most people are already familiar with even and odd integers and some of their properties. This section will construct proofs of some of these properties, both because the student will feel very comfortable with the concepts and because it allows for the introduction of some basics about how to write proofs. Before proceeding with proofs, though, it is necessary that there is agreement on the definitions of even and odd integers. Indeed, there are many possible definitions of even integers: $n \in \mathbb{Z}$ is an even integer if
• the decimal representation of $n$ has a ones digit equal to 0, 2, 4, 6, or 8.
• $n$ is either 0 or the prime factorization of $n$ contains a factor of 2.
• there is an integer $k$ such that $n = 2k$.
• $i^n$ is a real number, where $i = \sqrt{-1}$.
• the number $(-1)^n$ is positive.
• $9^n \equiv 1 \pmod{10}$.
• $\sin\left(\frac{n\pi}{2}\right) = 0$.
• $\frac{n}{2} \in \mathbb{Z}$.

Which of these definitions should be used when writing proofs about even and odd integers? Actually, since all the definitions are equivalent, one could adopt any one of these definitions and then prove theorems that show that all the other definitions are equivalent to the chosen definition. This is not an unusual situation in mathematics, especially for a concept as elementary as even integers. But it turns out that one of these definitions is particularly well suited for writing proofs, and that is, $n \in \mathbb{Z}$ is even if there is a $k \in \mathbb{Z}$ such that $n = 2k$. This makes a useful definition because it provides a fairly easy way to check whether a given integer is even, and because knowing that a number $n$ is even immediately gives you a number $k$ for which $n = 2k$, and that is a powerful tool for proving facts about even integers. For this reason, this chosen definition is called the working definition, that is, it is the definition easiest to apply in a wide variety of contexts. It is the definition from which all other properties of even numbers can be derived.

A similar discussion could take place about how to define odd integers. The working definition is that $n \in \mathbb{Z}$ is odd if there is a $k \in \mathbb{Z}$ such that $n = 2k + 1$. There is a long list of facts you could prove about even and odd numbers.

Facts About Even and Odd Integers
• Every integer is either even or odd.
• No integer is both even and odd.
• $n \in \mathbb{Z}$ is even if and only if $n + 1 \in \mathbb{Z}$ is odd.
• The sum of any two even integers is even.
• The sum of any two odd integers is even.
• The sum of an even integer and an odd integer is odd.
• The product of two odd integers is odd.
• The product of two integers is odd only if both of the factors are odd.
Together, the first two of these facts say that each integer is either even or odd but not both. This says that the sets of even and odd integers form a partition of $\mathbb{Z}$, that is, the sets are disjoint and the union of the sets is all of $\mathbb{Z}$. Some authors require that all the sets of a partition be nonempty, as is the case with even and odd integers.

So why is it that every integer is either even or odd? This depends on the Division Algorithm, which states that if $m, n \in \mathbb{Z}$ with $n > 0$, then there are unique $q, r \in \mathbb{Z}$ with $0 \le r < n$ such that $m = nq + r$. In this case $q$ is called the quotient of the division, and $r$ is called the remainder of the division. Using the Division Algorithm, any integer $m$ can be divided by 2, giving a quotient and remainder where the remainder is either 0 or 1. If the remainder is 0, then $m = 2q$ for integer $q$, implying that $m$ is even, and if the remainder is 1, then $m = 2q + 1$ for integer $q$, implying that $m$ is odd.
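The role of the Division Algorithm can be illustrated with a short computation. The sketch below is an illustration only; the function name `classify` is hypothetical, and it relies on the fact that Python's built-in `divmod(m, 2)` returns a quotient and a remainder with $0 \le r < 2$ even when $m$ is negative.

```python
def classify(m):
    """Return 'even' or 'odd' using the quotient and remainder on division by 2."""
    q, r = divmod(m, 2)      # m == 2*q + r with 0 <= r < 2, also for negative m
    assert m == 2 * q + r and 0 <= r < 2
    return "even" if r == 0 else "odd"

for m in (-7, -4, 0, 5, 12):
    print(m, classify(m))
```

The quotient `q` is exactly the integer $k$ required by the working definitions: $m = 2q$ when the remainder is 0 and $m = 2q + 1$ when the remainder is 1.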
2.4.2 Proofs About Even and Odd Integers

How can these ideas be used to write a good proof of "Every integer is either even or odd"? First, it is easier to reword the statement as "If $m \in \mathbb{Z}$, then either $m$ is even or $m$ is odd." This is a conditional statement, so the natural way to begin a proof is to assume that the hypothesis of the statement is satisfied, that is, that $m$ is an integer. Now apply the Division Algorithm to get the quotient $q$ and remainder $r$ guaranteed by the algorithm. Finally, the value of $r$ shows that $m$ either satisfies the definition of being an even integer or the definition of being an odd integer. The result would be

PROOF: Every integer is either even or odd.
• Let $m$ be an integer.
• By the Division Algorithm there are integers $q$ and $r$ with $0 \le r < 2$ such that $m = 2q + r$.
• If $r = 0$, then $m = 2q$ for integer $q$, which means that $m$ satisfies the definition for being even.
• If $r = 1$, then $m = 2q + 1$ for integer $q$, which means that $m$ satisfies the definition for being odd.
• Since $r$ must be either 0 or 1, it follows that every integer is either even or odd.

Next consider how to prove the statement "The sum of any two odd integers is even." The statement concerns the sum of any two odd integers, so the proof reader would expect the proof to consider two arbitrarily chosen odd integers. Once two odd integers are chosen, the definition of odd integer should be invoked because, at that point, that is the only information that is known about the two integers. Finally, a little algebra will help to show that the sum of these two odd integers satisfies the definition of even integer. Here is an attempt to write such a proof that makes several common proof writing errors.

PROOF ATTEMPT: The sum of any two odd integers is even.
• The two integers are odd, so each has the form $2k + 1$.
• The sum of these two integers is $(2k + 1) + (2k + 1) = 4k + 2$.
• $k$ could be even or odd.
• The number 2 is even since it is $2 \cdot 1$.
• $4k$ is even since it is $2 \cdot 2k$.
• The sum of two even numbers is even, so the sum of $4k$ and 2 is an even number.
• Therefore, the sum of two odd integers is always even.
Here are some complaints about the above proof attempt.
• The proof begins talking about two integers, but the proof reader has not yet been introduced to these integers and does not know what two integers are being discussed. The proof is missing a “SET THE CONTEXT” sentence to introduce the idea of starting with any two odd integers.
• The proof uses the variable $k$ without introducing what that variable represents. The proof requires that $k$ be an integer, but the fact that $k$ is an integer is not stated anywhere. As far as the proof reader knows, $k$ could be any complex number. Later, the proof claims that $2k$ is an integer, which is needed to show $4k$ is an even integer. Without knowing that $k$ is an integer, it does not follow that $2k$ is also an integer.
• The definition of odd integer allows you to take an odd integer and represent it as $2k + 1$, where $k$ is another integer. To apply this definition, then, the proof should start with an odd integer, say $m$, and then represent it as $2k + 1$ rather than starting with $2k + 1$. The subtle point is that one should start with an odd integer and use its definition to move on to $2k + 1$ rather than starting with $2k + 1$, which jumps the gun. The reader of the proof could wonder whether $2k + 1$ could represent a generic odd integer. Well, it can, but this takes some thought which can be avoided by starting with an odd integer $m$ and then using the definition of odd to select the integer $k$ such that $m = 2k + 1$.
• The definition of “odd integer” does refer to “$2k + 1$,” but it is more precise. It does not say “has the form.” It says that there is an integer $k$ such that the odd number equals $2k + 1$.
• It is a major error to allow both odd integers to equal $2k + 1$ for the same number $k$. The only way this can happen is for the two odd integers themselves to be equal. Thus, this “proof” only applies to a small subset of cases where one adds two identical odd integers together such as $3 + 3$ or $117 + 117$.
• The statement "$k$ could be even or odd" is certainly correct, but it does not contribute to the proof. It is a statement about items in the proof that is not part of the proof. Occasionally, one will make a definition as part of a long proof, and then give some examples to help the reader understand that definition. But if a statement is not needed either as a critical step in a proof or an important illustration to aid the understanding of the proof, then the statement should be left out of the proof because it distracts from the proof and complicates it.
• The statement "The sum of two even numbers is even" is correct, but it has not been proved yet, at least in this text, and is equivalent in difficulty to proving the corresponding statement about the sum of odd integers. Thus, it is not appropriate to use the result about sums of even integers to prove one about the sum of odd integers.
Considering these ideas, one can construct a better proof.

PROOF: The sum of any two odd integers is even.
• Let $m$ and $n$ be two odd integers.
• From the definition of odd integer, there is an integer $k_1$ such that $m = 2k_1 + 1$ and an integer $k_2$ such that $n = 2k_2 + 1$.
• Then $m + n = (2k_1 + 1) + (2k_2 + 1) = 2(k_1 + k_2 + 1)$.
• Since $k_1$ and $k_2$ are integers, so is $k_1 + k_2 + 1$.
• Thus, the sum $m + n$ is equal to twice an integer, so by the definition of even integer, $m + n$ is even.
• Therefore, the sum of any two odd integers is always even.

The form of this proof can be copied almost word for word to get a similar proof of the statement "The product of two odd integers is odd."

PROOF: The product of any two odd integers is odd.
• Let $m$ and $n$ be two odd integers.
• From the definition of odd integer, there is an integer $k_1$ such that $m = 2k_1 + 1$ and an integer $k_2$ such that $n = 2k_2 + 1$.
• Then $mn = (2k_1 + 1)(2k_2 + 1) = 4k_1 k_2 + 2k_1 + 2k_2 + 1 = 2(2k_1 k_2 + k_1 + k_2) + 1$.
• Since $k_1$ and $k_2$ are integers, so is $2k_1 k_2 + k_1 + k_2$.
• Thus, the product $mn$ is equal to one more than twice an integer, so by the definition of odd integer, $mn$ is odd.
• Therefore, the product of any two odd integers is always odd.
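The two proofs above are constructive: they name the integer that witnesses the evenness of $m + n$ and the oddness of $mn$. As an informal illustration, the following Python sketch computes those witnesses; the function names `odd_witness` and `sum_and_product_witnesses` are hypothetical and chosen only for this example.

```python
def odd_witness(m):
    """Return the integer k with m == 2*k + 1, or raise if m is not odd."""
    k, r = divmod(m, 2)
    if r != 1:
        raise ValueError(f"{m} is not odd")
    return k

def sum_and_product_witnesses(m, n):
    """For odd m and n, return (j, p) with m + n == 2*j and m*n == 2*p + 1."""
    k1, k2 = odd_witness(m), odd_witness(n)   # two separate witnesses, as in the proof
    j = k1 + k2 + 1                           # from m + n = 2(k1 + k2 + 1)
    p = 2 * k1 * k2 + k1 + k2                 # from mn = 2(2*k1*k2 + k1 + k2) + 1
    return j, p

m, n = 7, -13
j, p = sum_and_product_witnesses(m, n)
print(m + n == 2 * j, m * n == 2 * p + 1)
```

Note that the two odd integers receive separate witnesses `k1` and `k2`; using a single `k` for both would only cover the case $m = n$, which is the major error in the earlier proof attempt.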
2.4.3 Exercises

Write proofs for each of the following statements.
1. The sum of any two even integers is even.
2. The product of an even integer and an odd integer is even.
3. The difference of an even integer and an odd integer is odd.
4. If the product of two integers is odd, then both of the integers must have been odd.
5. The sum of any four consecutive integers is even.
2.5 Basic Facts About Real Numbers

2.5.1 Ordered Fields

Many of the theorems of Calculus involve properties of the real numbers. Some of these properties are subtle, so it is essential to understand this important set of numbers. Already introduced are the sets of natural numbers, $\mathbb{N}$, and the integers, $\mathbb{Z}$. Also of importance is the set of rational numbers, $\mathbb{Q} = \left\{ \frac{m}{n} \mid m, n \in \mathbb{Z},\ n \ne 0 \right\}$. This definition comes with the understanding that the two rational numbers $\frac{m}{n}$ and $\frac{a}{b}$ are equal whenever $mb = na$. Thus, there are always infinitely many representations for each rational number. For all rational numbers $r \ne 0$, one can always find relatively prime integers $m$ and $n$ with $n > 0$ such that $r = \frac{m}{n}$. Together with an agreement to write the rational number 0 as $\frac{0}{1}$, each rational number has a unique lowest terms representation.

The set of rational numbers is more than a set of fractions with integers for numerators and denominators. It also comes with the two binary operations of addition ($+$) and multiplication ($\cdot$) and with the order relation less than ($<$).
The binary operations satisfy conditions which make $\mathbb{Q}$ into a field. A field $F$ is a set with operations of addition and multiplication that satisfies the following axioms.

Axioms for a Field F
A set $F$ together with the binary operations of addition ($+$) and multiplication ($\cdot$) form a field if $F$ contains the two elements 0 and 1 with $0 \ne 1$ such that for every $r, s, t \in F$
• the Closure Properties: $r + s \in F$ and $r \cdot s \in F$, and $r = s \rightarrow r + t = s + t$ and $r = s \rightarrow r \cdot t = s \cdot t$
• the Associative Properties: $(r + s) + t = r + (s + t)$ and $(r \cdot s) \cdot t = r \cdot (s \cdot t)$
• the Commutative Properties: $r + s = s + r$ and $r \cdot s = s \cdot r$
• the Identity Properties: $r + 0 = r$ and $r \cdot 1 = r$
• the Inverse Properties: there exists $-r \in F$ such that $r + (-r) = 0$, and if $r \ne 0$, there exists $\frac{1}{r} \in F$ such that $r \cdot \frac{1}{r} = 1$
• the Distributive Law of Multiplication Over Addition: $r \cdot (s + t) = r \cdot s + r \cdot t$
Notice that the rational numbers do satisfy the eleven field axioms. One defines the operation subtraction ($-$) by $r - s = r + (-s)$ and the operation division ($\div$) for $s \ne 0$ by $r \div s = r \cdot \frac{1}{s} = \frac{r}{s}$. Moreover, the field $\mathbb{Q}$ together with the less than order relation is an ordered field that obeys the following axioms.

Axioms for an Ordered Field F
A field $F$ is an ordered field with order relation $<$ if for every $r, s, t \in F$
• the Trichotomy Property: exactly one of the following holds: $r < s$, $r = s$, $s < r$
• the Transitive Property: $r < s$ and $s < t$ imply $r < t$
• the Addition Property of Less Than: $r < s$ implies $r + t < s + t$
• the Multiplication Property of Less Than: $r < s$ and $0 < t$ imply $r \cdot t < s \cdot t$
Notice that the rational numbers do satisfy the four ordered field axioms. One defines the other order relations of greater than ($>$), greater than or equal to ($\ge$), and less than or equal to ($\le$) in the obvious ways, that is, $r > s$ whenever $s < r$, $r \ge s$ whenever either $r > s$ or $r = s$, and $r \le s$ whenever either $r < s$ or $r = s$.

There are many other ordered fields, and it is instructive to consider how to justify the fifteen ordered field axioms for a different ordered field. For example, the set $T = \{ r + s\sqrt{2} \mid r, s \in \mathbb{Q} \}$ is an ordered field using the usual addition and multiplication operations. For two elements $a + b\sqrt{2}$ and $c + d\sqrt{2}$ in $T$, define addition as $(a + b\sqrt{2}) + (c + d\sqrt{2}) = (a + c) + (b + d)\sqrt{2}$ and multiplication as $(a + b\sqrt{2}) \cdot (c + d\sqrt{2}) = (ac + 2bd) + (ad + bc)\sqrt{2}$. To define the less than relation you would want $(a + b\sqrt{2}) < (c + d\sqrt{2})$ whenever $a - c < (d - b)\sqrt{2}$, which can be checked by squaring both $a - c$ and $(d - b)\sqrt{2}$, although you will also need to consider the signs of $a - c$ and $d - b$. Thus, the definition becomes $(a + b\sqrt{2}) < (c + d\sqrt{2})$ if one of the following holds:
• $a - c \le 0$ and $0 \le d - b$, with at least one of these two inequalities strict,
• $0 < a - c$, $0 < d - b$, and $(a - c)^2 < 2(d - b)^2$, or
• $a - c < 0$, $d - b < 0$, and $(a - c)^2 > 2(d - b)^2$.

It is fairly easy to check that $T$ is an ordered field. The only field axiom which does not follow immediately from the properties of rational numbers is the inverse axiom for multiplication. You should verify that for $a + b\sqrt{2} \ne 0$, its multiplicative inverse is
$\frac{a}{a^2 - 2b^2} + \frac{-b}{a^2 - 2b^2}\sqrt{2}$,
which is in $T$. The order axioms take more work to verify due to the complicated definition of less than. For example, to verify that the less than relation works correctly with addition, one would begin with three elements of $T$, $a + b\sqrt{2}$, $c + d\sqrt{2}$, and $e + f\sqrt{2}$, where it is given that $a + b\sqrt{2} < c + d\sqrt{2}$. One needs to compare $(a + b\sqrt{2}) + (e + f\sqrt{2})$ with $(c + d\sqrt{2}) + (e + f\sqrt{2})$. To do this, one compares the values of $(a + e) - (c + e) = a - c$ and $(d + f) - (b + f) = d - b$. But this reduces to comparing $a - c$ and $d - b$, which are known to satisfy the correct conditions because $a + b\sqrt{2} < c + d\sqrt{2}$ was given.

Every ordered field satisfies a long list of simple properties that you will associate with facts learned in Arithmetic and Algebra. Here are some of those properties.
Some Properties Obeyed By All Ordered Fields
Let $r, s, t$ all be elements of ordered field $F$. Then
1. $r \cdot 0 = 0$.
2. If $r + t = s + t$, then $r = s$.
3. If $r \cdot t = s \cdot t$ and $t \ne 0$, then $r = s$.
4. $-(-r) = r$.
5. If $r \ne 0$, then $\frac{1}{1/r} = r$.
6. $-r = -s$ if and only if $r = s$.
7. $-r = (-1) \cdot r$.
8. $(-r) + (-s) = -(r + s)$.
9. $(-r) \cdot (-s) = r \cdot s$.
10. If $r < s$ and $t < 0$, then $s \cdot t < r \cdot t$.
11. If $r \ne 0$, then $r^2 > 0$.
12. $0 < 1$.
13. If $0 < r$, then $0 < \frac{1}{r}$.
14. If $0 < r < s$, then $0 < \frac{1}{s} < \frac{1}{r}$.
15. If $n$ is any natural number and $r > 1$, then $r^n < r^{n+1}$.
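The field $T = \{ r + s\sqrt{2} \mid r, s \in \mathbb{Q} \}$ discussed earlier gives a concrete ordered field on which properties like these can be tried out with exact arithmetic. The following Python sketch is one possible model, not a definitive implementation: the class name `T` and its methods are assumptions made for this illustration, the pair $(a, b)$ stands for $a + b\sqrt{2}$, and the comparison treats the boundary cases $a - c = 0$ and $d - b = 0$ explicitly.

```python
from fractions import Fraction

class T:
    """An element a + b*sqrt(2) with rational a and b (illustrative model)."""
    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def __add__(self, other):
        return T(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        return T(self.a * other.a + 2 * self.b * other.b,
                 self.a * other.b + self.b * other.a)

    def inverse(self):
        n = self.a ** 2 - 2 * self.b ** 2   # nonzero unless a = b = 0
        return T(self.a / n, -self.b / n)

    def __lt__(self, other):
        p = self.a - other.a     # a - c
        q = other.b - self.b     # d - b; the claim is p < q*sqrt(2)
        if p <= 0 <= q:
            return (p, q) != (0, 0)
        if p > 0 and q > 0:
            return p * p < 2 * q * q
        if p < 0 and q < 0:
            return p * p > 2 * q * q
        return False

x = T(-1, 1)                          # sqrt(2) - 1, a positive element
print(x * x.inverse() == T(1))        # the inverse formula gives the identity
print(T(0) < x, T(0) < x.inverse())   # property 13 for this particular x
print(T(7, -5) < T(0))                # 7 - 5*sqrt(2) is negative since 49 < 50
```

Because every comparison reduces to arithmetic on rational numbers, no floating-point approximation of $\sqrt{2}$ is needed.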
The reader may wish to prove some of these properties by applying the axioms. This book will not dwell on these proofs since the techniques used in proving them are not essential for writing most proofs in Analysis. Two simple proofs are given here as examples.

PROOF: If $r$ is any element of the field $F$, then $r \cdot 0 = 0$.
• Let $r$ be an element of field $F$.
• Since 0 is the additive identity of $F$, $0 = 0 + 0$.
• Then $r \cdot 0 = r \cdot (0 + 0)$.
• By the Distributive Law, $r \cdot 0 = r \cdot 0 + r \cdot 0$.
• By adding $-(r \cdot 0)$ to each side of this equality, one gets $0 = r \cdot 0 - r \cdot 0 = (r \cdot 0 + r \cdot 0) - r \cdot 0 = r \cdot 0 + (r \cdot 0 - r \cdot 0) = r \cdot 0 + 0 = r \cdot 0$.
• Therefore, for any $r \in F$, $r \cdot 0 = 0$.
The next theorem essentially says that if $(-1) \cdot r$ has the same properties as $-r$, it must equal $-r$.

PROOF: If $r$ is any element of the field $F$, then $-r = (-1) \cdot r$.
• Let $r$ be an element of field $F$.
• Then
$(-1) \cdot r = (-1) \cdot r + 0$    (Additive Identity)
$= (-1) \cdot r + (r + (-r))$    (Additive Inverses)
$= [(-1) \cdot r + r] + (-r)$    (Associative Law of Addition)
$= [(-1) \cdot r + 1 \cdot r] + (-r)$    (Multiplicative Identity)
$= [(-1) + 1] \cdot r + (-r)$    (Distributive Law)
$= 0 \cdot r + (-r)$    (Additive Inverses)
$= 0 + (-r)$    ($r \cdot 0 = 0$)
$= -r$    (Additive Identity)
• Therefore, $(-1) \cdot r = -r$ for every $r \in F$.

Note that every ordered field $F$ will contain a copy of $\mathbb{Q}$. This follows since $0, 1 \in F$, and if $n$ is a natural number in $F$, then $n + 1 \in F$. Thus, it follows by mathematical induction that $n \in F$ for all $n \in \mathbb{N}$. Moreover, since $0 < 1$, it follows for each $n \in \mathbb{N}$ that $n = n + 0 < n + 1$, showing that all natural numbers are distinct elements of $F$. The existence of the negatives of all numbers in $F$ implies that the integers form a subset of $F$, and the existence of reciprocals implies that all of $\mathbb{Q}$ lies in $F$. There are fields which are not ordered fields, and some of them do not contain copies of $\mathbb{Q}$. Indeed, there are finite fields as well as infinite fields that do not contain $\mathbb{Q}$ or even $\mathbb{N}$.
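A standard example of a finite field, not developed in this text, is arithmetic modulo a prime. The sketch below is a brute-force illustration under that assumption; the function names `is_field_mod` and `repeated_ones` are hypothetical. It checks the inverse axioms for arithmetic modulo 5 and modulo 6 and shows why the integers modulo 5 cannot contain a copy of $\mathbb{N}$: the sums $1, 1 + 1, 1 + 1 + 1, \ldots$ eventually return to 0 instead of producing distinct elements.

```python
def is_field_mod(n):
    """Brute-force check of the inverse axioms for arithmetic modulo n.

    Closure, associativity, commutativity, the identities, and distributivity
    hold for every modulus n >= 2, so only the inverse axioms need checking.
    """
    elements = range(n)
    has_negatives = all(any((r + s) % n == 0 for s in elements) for r in elements)
    has_reciprocals = all(any((r * s) % n == 1 for s in elements)
                          for r in elements if r != 0)
    return has_negatives and has_reciprocals

def repeated_ones(n, count):
    """Return 1, 1+1, 1+1+1, ... (count terms) computed modulo n."""
    total, out = 0, []
    for _ in range(count):
        total = (total + 1) % n
        out.append(total)
    return out

print(is_field_mod(5), is_field_mod(6))
print(repeated_ones(5, 7))
```

Modulo 6 the element 2 has no reciprocal, so the inverse axiom for multiplication fails there.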
2.5.2 The Completeness Axiom and the Real Numbers

There are infinitely many ordered fields. The set of real numbers, $\mathbb{R}$, is special because it includes every number that is considered a possible “distance from zero,” either positive, negative, or zero. An easy way to ensure that $\mathbb{R}$ contains every possible distance is to require it to satisfy the Completeness Axiom. This axiom considers nonempty subsets of an ordered field, $F$ (actually, any ordered set would do). A subset $S \subseteq F$ is said to be bounded above if there is an $M \in F$ such that all $x \in S$ satisfy $x \le M$. In this case, $M$ is called an upper bound of $S$. Similarly, $S \subseteq F$ is bounded below by lower bound $K \in F$ if all $x \in S$ satisfy $x \ge K$. If $S \subseteq F$ is both bounded above and bounded below, then $S$ is said to be bounded. If $M$ is an upper bound for a set $S$, and it is less than or equal to every upper bound of $S$, then $M$ is the least upper bound of $S$. Similarly, if $K$ is a lower bound for a set $S$, and it is greater than or equal to every lower bound of $S$, then $K$ is the greatest lower bound of $S$. For example, if $S$ is the interval $[1, 5) = \{x \mid 1 \le x < 5\}$, then 10, 6, and $\sqrt{30}$ are all upper bounds of $S$, but 5 is the least upper bound of $S$. Also, $-2$, 0, and $\frac{1}{2}$ are all lower bounds of $S$, but 1 is the greatest lower bound of $S$. One often uses the notation l.u.b.$(S)$ or $\sup(S)$ to represent the least upper bound or supremum of $S$ and g.l.b.$(S)$ or $\inf(S)$ to represent the greatest lower bound or infimum of $S$.

Axioms for the Real Numbers
The real numbers, $\mathbb{R}$, is an ordered field that satisfies
The Completeness Axiom: Every nonempty set $S \subseteq \mathbb{R}$ which is bounded above has a least upper bound in $\mathbb{R}$.

Note, for example, that the set $S = \{x \in \mathbb{Q} \mid x^2 < 7\}$ is a nonempty subset of $\mathbb{Q}$ which is bounded above by 4, 3, and 2.7, but there is no element of $\mathbb{Q}$ which is a least upper bound of $S$. The set of real numbers, though, does contain a least upper bound of $S$, namely $\sqrt{7}$. The Completeness Axiom is sometimes called the Least Upper Bound Principle.

The Completeness Axiom comes up frequently in proofs about the real numbers to show that numbers with particular properties exist. For example, consider the two theorems, the Archimedean Principle and the Existence of Square Roots. Both of these theorems are easily understood, but they cannot be proved without using the Completeness Axiom.

The Archimedean Principle states that for every real number $r$ there is a natural number greater than $r$. It can be proved using a proof by contradiction. The proof makes the assumption that there is a real number greater than or equal to every natural number and uses this to derive a contradiction, a statement that is false. Because one cannot derive a false statement from a true statement, the assumption most recently made in the proof must be a false statement, and you can conclude that no real number exists that is greater than or equal to every natural number.

PROOF (Archimedean Principle): If $r \in \mathbb{R}$, then there exists $n \in \mathbb{N}$ such that $r < n$.
• Suppose that there is an $r \in \mathbb{R}$ such that $r \ge n$ for every $n \in \mathbb{N}$.
• Then the set $\mathbb{N}$ is a nonempty subset of $\mathbb{R}$ with an upper bound, so by the Completeness Axiom, $\mathbb{N}$ has least upper bound $M$.
• Then $M - 1 < M$, so $M - 1$ is not an upper bound for $\mathbb{N}$.
• Thus, there is a $k \in \mathbb{N}$ with the property that $k > M - 1$.
• But then $k + 1$ is also in $\mathbb{N}$, yet $k + 1 > (M - 1) + 1 = M$ where $M$ is an upper bound for $\mathbb{N}$.
• This is a contradiction since no element of a set can be greater than an upper bound for that set.
• Therefore, the assumption that $r \ge n$ for every $n \in \mathbb{N}$ must be false, and for every $r \in \mathbb{R}$ there must be at least one $n \in \mathbb{N}$ with $n > r$.
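Returning to the set $S = \{x \in \mathbb{Q} \mid x^2 < 7\}$ from the discussion of the Completeness Axiom, the claim that no rational number is a least upper bound of $S$ can be made concrete: every rational upper bound can be replaced by a strictly smaller rational upper bound. The sketch below uses one averaging step of the Babylonian (Newton) square root method to do this. That technique is an assumption brought in for illustration and is not part of the text, and the function names are hypothetical.

```python
from fractions import Fraction

def is_upper_bound(u):
    """A rational u bounds S = {x in Q : x^2 < 7} above exactly when u > 0 and u^2 >= 7."""
    return u > 0 and u * u >= 7

def smaller_upper_bound(u):
    """Given a rational upper bound u of S, return a strictly smaller rational upper bound."""
    # Average u with 7/u: the result v satisfies v < u and v^2 > 7 whenever u^2 > 7.
    return (u + Fraction(7) / u) / 2

u = Fraction(4)
for _ in range(4):
    assert is_upper_bound(u)
    print(u, float(u))
    u = smaller_upper_bound(u)
```

Each printed value is a rational upper bound of $S$ smaller than the one before it, so none of them can be the least upper bound.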
You may not have ever doubted that every nonnegative real number has a square root, but this is a fact that can be proved using the axioms for the real numbers. It is a nice application of both the Trichotomy Property and the Completeness Axiom. Given a positive real number, $r$, the proof constructs the set $S = \{x \in \mathbb{R} \mid x^2 \le r\}$ and then uses the Completeness Axiom to exhibit a value, $s$, equal to the least upper bound of $S$. Then it shows that $s^2$ cannot be greater than $r$ and cannot be less than $r$, so by the Trichotomy Property, $s^2$ must equal $r$.

In particular, the proof first assumes that $s^2 > r$ and shows that there is a number $y > 0$ such that the square of $s - y$ is also greater than $r$. This shows that $s - y$ is an upper bound for $S$, which contradicts the fact that $s$ is the least upper bound of $S$. The proof magically suggests that $y = \frac{s^2 - r}{4s}$ works. Where did this magical expression for $y$ come from? It came from considering what property you would want such a $y$ to have. If you want $(s - y)^2 > r$, this suggests that you want $s^2 - 2sy + y^2 > r$. This inequality is quadratic in $y$ and has an unnecessarily messy solution. But one of the most important lessons about writing proofs in Analysis is that one can often be a little sloppy when trying to show that an inequality holds. Here, for example, rather than finding a $y$ such that $s^2 - 2sy + y^2 > r$, it would be sufficient to find a $y$ such that $s^2 - 2sy > r$, because if $s^2 - 2sy > r$, then certainly the needed $s^2 - 2sy + y^2 > r$ also holds. The advantage of making this change is that the inequality $s^2 - 2sy > r$ is very easy to solve for $y$, yielding $y < \frac{s^2 - r}{2s}$. Thus, the value $y = \frac{s^2 - r}{4s}$ ought to work fine, and, hence, the magic is demystified. Of course, there are many other possible values of $y$ that would also have worked in this proof, but only one value for $y$ is needed.

After showing that $s^2 > r$ cannot be true, the proof assumes that $s^2 < r$ and shows that there is a number $y > 0$ such that the square of $s + y$ is less than or equal to $r$. This shows that $s + y$ is in $S$, which contradicts the fact that $s$ is an upper bound of $S$. Again, the proof just suggests setting $y = \frac{r - s^2}{4s}$. Can you figure out where this expression for $y$ came from? Indeed, the calculation is similar to the one above. You need $(s + y)^2 \le r$, so $s^2 + 2sy + y^2 \le r$. It is simpler if $y^2$ could be replaced by $2sy$. If you assume $y \le 2s$, it allows you to conclude $y^2 \le 2sy$ so that $(s + y)^2 = s^2 + 2sy + y^2 \le s^2 + 2sy + 2sy = s^2 + 4sy$. You then want a $y$ that satisfies $s^2 + 4sy \le r$. Thus, the value $y = \frac{r - s^2}{4s}$ gives the needed value of $y$ (Fig. 2.3). Putting these ideas together gives the following proof.
Fig. 2.3 Showing the least upper bound of $S$ is $s = \sqrt{r}$ (number lines showing the set $S$, the point $s$, and the points $s - \frac{s^2 - r}{4s}$ and $s + \frac{r - s^2}{4s}$)
PROOF (Existence of Square Roots): If $r \in \mathbb{R}$ and $r \ge 0$, then there exists an $s \in \mathbb{R}$ such that $s^2 = r$.
• Let $r \ge 0$ be a real number.
• If $r = 0$, then $0^2 = r$ and 0 satisfies the needed condition.
• So assume that $r > 0$.
• Let $S = \{x \in \mathbb{R} \mid x^2 \le r\}$.
• $S$ is nonempty since $0 \in S$.
• $S$ is bounded above by $r + 1$ since $x > (r + 1)$ implies $x^2 > r^2 + 2r + 1 > r$, so $x \notin S$.
• Thus, by the Completeness Axiom, $S$ has a least upper bound $s$.
• Since $r > 0$, the smaller of 1 and $r$ is a positive number whose square is at most $r$, so it belongs to $S$. Hence $s > 0$.
• By the Trichotomy Property, either $s^2 > r$, $s^2 < r$, or $s^2 = r$.
• If $s^2 > r$, note that $y = \frac{s^2 - r}{4s} > 0$, and $(s - y)^2 = s^2 - 2sy + y^2 > s^2 - 2sy = s^2 - 2s \cdot \frac{s^2 - r}{4s} = \frac{s^2 + r}{2} > r$. Moreover, $s - y > 0$ because $y < \frac{s^2}{4s} = \frac{s}{4}$. Because $(s - y)^2 > r$ and $s - y > 0$, it follows that $s - y < s$ is an upper bound of $S$. This contradicts the fact that $s$ is the least upper bound of $S$. Therefore, $s^2 > r$ must be false.
• If $s^2 < r$, let $y$ be the smaller of $2s$ and $\frac{r - s^2}{4s}$. Then $y > 0$, and $(s + y)^2 = s^2 + 2sy + y^2 \le s^2 + 2sy + 2sy = s^2 + 4sy \le s^2 + 4s \cdot \frac{r - s^2}{4s} = r$. Because $(s + y)^2 \le r$, it follows that $s + y \in S$ and $s + y > s$. This contradicts the fact that $s$ is an upper bound of $S$. Therefore, $s^2 < r$ must be false.
• Thus, it must be true that $s^2 = r$, which proves that for every real number $r \ge 0$ there is an $s \in \mathbb{R}$ with $s^2 = r$.
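The two choices of $y$ used in this proof can be tried on specific numbers. The following Python sketch uses exact rational arithmetic to confirm, for sample values, that $y = \frac{s^2 - r}{4s}$ gives $(s - y)^2 > r$ when $s^2 > r$, and that the smaller of $2s$ and $\frac{r - s^2}{4s}$ gives $(s + y)^2 \le r$ when $s^2 < r$. The function names `step_down` and `step_up` are hypothetical, and the sample values are chosen only for illustration.

```python
from fractions import Fraction

def step_down(s, r):
    """Assuming s > 0 and s*s > r > 0, return y > 0 with (s - y)**2 > r."""
    y = (s * s - r) / (4 * s)
    assert y > 0 and (s - y) ** 2 > r
    return y

def step_up(s, r):
    """Assuming s > 0 and s*s < r, return y > 0 with (s + y)**2 <= r."""
    y = min(2 * s, (r - s * s) / (4 * s))
    assert y > 0 and (s + y) ** 2 <= r
    return y

r = Fraction(2)
print(step_down(Fraction(3, 2), r))   # s = 3/2, where s^2 = 9/4 > 2
print(step_up(Fraction(7, 5), r))     # s = 7/5, where s^2 = 49/25 < 2
```

In the first case $s - y$ is a smaller candidate upper bound, and in the second case $s + y$ is a larger element whose square is still at most $r$, which are exactly the two contradictions used in the proof.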
2.5.3 Absolute Value, the Triangle Inequality, and Intervals

The concept that separates the area of Mathematics known as Analysis from other branches such as Algebra, Topology, Set Theory, and Combinatorics is the idea of distance. In the real numbers, one can measure distance by using the absolute value function, which is defined as
$|x| = \begin{cases} x & \text{if } x \ge 0 \\ -x & \text{if } x < 0. \end{cases}$
For a real number $x$, the absolute value of $x$ can be thought of as the distance that $x$ is from the real number 0. Note that for all $x \in \mathbb{R}$ it holds that $-|x| \le x \le |x|$. If $k > 0$, then the set $\{x \mid |x| < k\}$ is the same as the set $\{x \mid -k < x < k\}$. Similarly, the set $\{x \mid |x| > k\}$ is the same as the set $\{x \mid -k > x \text{ or } x > k\}$. The distance between two real numbers $x$ and $y$ can be defined as $|x - y|$. Note that this distance is positive unless $x = y$.

One property of the absolute value function used frequently in proofs in Analysis is the triangle inequality, which states that for all $x, y \in \mathbb{R}$, $|x + y| \le |x| + |y|$. The name of this inequality comes from geometry where it is known that the sum of the lengths of two sides of a triangle always exceeds the length of the third side of the triangle (Fig. 2.4).

Fig. 2.4 Triangle inequality (a triangle with sides labeled $x$, $y$, and $x + y$)

One simple proof of the triangle inequality is

PROOF (Triangle Inequality): $|x + y| \le |x| + |y|$
• Let $x$ and $y$ be elements of $\mathbb{R}$.
• Then $-|x| \le x \le |x|$ and $-|y| \le y \le |y|$. Adding these inequalities yields $-(|x| + |y|) \le x + y \le (|x| + |y|)$.
• This last inequality is equivalent to $|x + y| \le |x| + |y|$.

A subset $S$ contained in $\mathbb{R}$ is called connected if it has the property that for any two numbers in $S$, all the numbers between those two numbers are also in $S$. More precisely, $S$ is connected if for all $x, y \in S$ with $x < y$, it follows that $z \in S$ for all $z$ with $x < z < y$. Informally, this means that there are no holes in the set $S$. Another word for a connected set of real numbers is an interval. If $a < b$ are real numbers, all of the following sets are intervals.

Intervals of Real Numbers
$\emptyset$    empty set
$\{a\} = [a, a]$    single point
$\{x \mid a < x < b\} = (a, b)$    open bounded interval
$\{x \mid a \le x \le b\} = [a, b]$    closed bounded interval
$\{x \mid a \le x < b\} = [a, b)$    bounded interval open on the right
$\{x \mid a < x \le b\} = (a, b]$    bounded interval open on the left
$\{x \mid a < x\} = (a, \infty)$    open right infinite interval
$\{x \mid x < b\} = (-\infty, b)$    open left infinite interval
$\{x \mid a \le x\} = [a, \infty)$    closed right infinite interval
$\{x \mid x \le b\} = (-\infty, b]$    closed left infinite interval
$\mathbb{R}$    entire real line
2.5.4 Exercises

1. Show that for any real number $x$ it follows that $|x| + |x - 6| \ge 6$.
2. Show that for any real number $x$, $|x - 1| + |x - 3| \ge 2$.
3. Show that for any real numbers $x$ and $y$ it follows that $|x^2 + 3x - 4y| + |x - 1 - 4y| \ge (x + 1)^2$.
4. Show that for any real numbers $x$ and $y$, $|x + y| + |x - y - 2| + |2x + 8| \ge 10$.
5. Show that for any real numbers $x$ and $y$ it follows that $|x| + |3x - 5y| + |5x - 4y| \ge |x + y|$.
6. Show that the intersection of any two intervals is always an interval.
7. Under what conditions is the union of two intervals an interval?
2.6 Functions

2.6.1 Function, Domain, Codomain

Intuitively, a function is a mapping that assigns to each point of some domain $A$ a value that resides in some codomain $B$. This is usually written $f : A \to B$. More precisely, the function $f$ is defined as a set of ordered pairs $(x, y)$ where each $x$ resides in the domain $A$ of $f$ and each $y$ resides in the codomain $B$ of $f$, and for each $x \in A$ there is exactly one $y \in B$ such that $(x, y) \in f$. Since there is a unique ordered pair $(x, y) \in f$ for each $x \in A$, $f$ associates or links the value of $y$ to the value of $x$ and allows one to write $f(x) = y$.
2.6.2 Surjection

The domain of the function $f$ is exactly the set of all $x$ that are first coordinates of the ordered pairs in $f$, that is, the domain is $A = \{x \mid (x, y) \in f\}$. The range of $f$ is defined as the image of $f$, that is, the range is $\{y \mid (x, y) \in f\}$. Clearly, the codomain of $f$ can be any set that contains the range of $f$. This can lead to some confusion since the codomain of $f$ is not precisely defined. It is simply a convenience. When one defines a function $f : \mathbb{R} \to \mathbb{R}$, one means that $f$ is defined for every real number, and that for any $x \in \mathbb{R}$, the value $f(x)$ also lies in $\mathbb{R}$. This is the case whether $\mathbb{R}$ is the range of $f$ or the range of $f$ is actually some proper subset of $\mathbb{R}$. It could be difficult and unnecessary to calculate exactly which subset of $\mathbb{R}$ is the range of $f$, so it might be easier to just give the codomain as $\mathbb{R}$ and avoid the technicalities of figuring out just what values of $\mathbb{R}$ are in the range of $f$. For example, the function $f(x) = 3x^6 - 15x^4 + 12x^3 + 25x^2 - 32x + 14$ maps the real numbers into the real numbers, but to find the range of $f$, you would need to find the minimum value of $f$. This minimum exists, but it may not be possible to give its value explicitly.
If the range of $f : A \to B$ is actually all of $B$, we say that $f$ maps $A$ onto $B$, and $f$ is called surjective, and $f$ is called a surjection. Thus, one can prove that a function is surjective by showing that each element of the codomain is in the range.

TEMPLATE for proving a function $f$ is surjective
• SET THE CONTEXT: Make a statement introducing $f$, its domain $A$, and its codomain $B$.
• Select an arbitrary value $y \in B$.
• Exhibit a value $x \in A$ such that $y = f(x)$.
• STATE THE CONCLUSION: Therefore, $f$ is a surjection.

Note that the crucial step in proving that a function is surjective is showing the existence of an $x$ with $f(x) = y$ and verifying that the $x$ is in the domain $A$ of the function. For example, the function $f(x) = 5x^2 + 1$ is a surjection from the negative real numbers onto the interval $(1, \infty)$. To prove this you would need to show that for each real number $y > 1$ there is a negative real number $x$ for which $f(x) = y$. But this just involves a simple algebraic manipulation. That is, if you need $5x^2 + 1 = y$, then you can solve to get $5x^2 = y - 1$ and $x^2 = \frac{y - 1}{5}$. Here one needs to be careful because it is easy to continue by writing $x = \sqrt{\frac{y - 1}{5}}$ which always results in a positive value for $x$. The proof needs to exhibit a negative value for $x$, so it is important to set $x = -\sqrt{\frac{y - 1}{5}}$. There is no need for the proof to display the steps of solving the equation for $x$. The goal is to produce a value of $x \in A$ such that $f(x) = y$; how you arrived at that $x$ is not important. It may be interesting, but it is not an essential part of the proof, and, therefore, it should not be part of the proof.

PROOF: The function $f(x) = 5x^2 + 1$ is a surjection from the negative real numbers onto the interval $(1, \infty)$.
• Let $f(x) = 5x^2 + 1$.
• For any $y > 1$ let $x = -\sqrt{\frac{y - 1}{5}}$.
• Because $y > 1$, $\frac{y - 1}{5} > 0$, so $\sqrt{\frac{y - 1}{5}}$ is a positive real number and $x$ is a negative real number.
• Moreover, $f(x) = 5x^2 + 1 = 5\left(-\sqrt{\frac{y - 1}{5}}\right)^2 + 1 = 5 \cdot \frac{y - 1}{5} + 1 = y$.
• Therefore, $f$ is a surjection.
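As a quick sanity check of the witness used in this proof, the sketch below evaluates $x = -\sqrt{(y - 1)/5}$ for a few sample values of $y > 1$ and confirms that $x$ is negative and that $f(x)$ returns $y$ up to rounding. The helper name `preimage` is an illustrative assumption; a handful of samples is evidence, not a proof.

```python
import math

def f(x):
    return 5 * x**2 + 1

def preimage(y):
    """Negative x with f(x) = y, for y > 1 (the witness exhibited in the proof)."""
    if y <= 1:
        raise ValueError("y must lie in the interval (1, infinity)")
    return -math.sqrt((y - 1) / 5)

for y in [1.5, 2.0, 6.0, 101.0]:
    x = preimage(y)
    assert x < 0                      # x lies in the domain (negative reals)
    assert math.isclose(f(x), y)      # f(x) = y up to floating-point rounding
```

Replacing the minus sign with a plus would still satisfy $f(x) = y$ but would fail the `x < 0` check, which is exactly the pitfall the text warns about.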
2.6.3 Injection

The definition of function requires that each value $x$ in the domain of $f$ is found in exactly one ordered pair $(x, y) \in f$. The same does not have to hold for values in the codomain, that is, one value $y$ in the codomain could appear in many ordered pairs $(x, y) \in f$. For example, for the constant function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = 1$ for all $x \in \mathbb{R}$, the value 1 appears as the second coordinate in all the ordered pairs of the function. If a function has the property that no value of $y$ appears as the second coordinate of more than one ordered pair in $f$, then $f$ is said to be injective or, less formally, one-to-one. In this case the function $f$ is called an injection. In such a case, one sees that $f(x_1) = f(x_2)$ only if $x_1 = x_2$. This gives a procedure for proving that a function is injective.

TEMPLATE for proving a function $f$ is injective
• SET THE CONTEXT: Make a statement introducing $f$, its domain $A$, and its codomain $B$.
• Assume that for two values $x_1$ and $x_2$ in $A$ that $f(x_1) = f(x_2)$.
• Show that $x_1 = x_2$.
• STATE THE CONCLUSION: Therefore, $f$ is an injection.

For example, the function $f(x) = \sqrt{4x + 7}$ maps the positive real numbers to the positive real numbers. It is not a surjection, but it is an injection. The proof would require that you show that $f(x_1) = f(x_2)$ implies that $x_1 = x_2$. Again, this is just an algebraic manipulation.

PROOF: The function $f(x) = \sqrt{4x + 7}$ is an injection from the positive real numbers to the positive real numbers.
• Let $f(x) = \sqrt{4x + 7}$.
• Assume that for positive real numbers $x_1$ and $x_2$, $f(x_1) = f(x_2)$.
• Then $\sqrt{4x_1 + 7} = \sqrt{4x_2 + 7}$.
• Squaring yields $4x_1 + 7 = 4x_2 + 7$, so $4x_1 = 4x_2$, and $x_1 = x_2$.
• Therefore, $f$ is an injection.

If a function $f : A \to B$ is both surjective and injective, that is, if $f$ is both one-to-one and onto, then $f$ is bijective, and $f$ is called a bijection. In this case, $f$ exhibits a one-to-one correspondence between the set $A$ and the set $B$.

Two functions $f$ and $g$ whose ranges are in the real numbers can be combined arithmetically. Specifically, one can define $f + g$, $f - g$, $fg$, and $\frac{f}{g}$ in natural ways: $(f + g)(x) = f(x) + g(x)$, $(f - g)(x) = f(x) - g(x)$, $(fg)(x) = f(x)\,g(x)$, and, for $x$ such that $g(x) \ne 0$, $\frac{f}{g}(x) = \frac{f(x)}{g(x)}$. When functions $f$ and $g$ are combined in this way, the domain of the sum, difference, product, or quotient is assumed to be the intersection of the domain of $f$ and the domain of $g$ with the exception that the domain of $\frac{f}{g}$ also excludes values of $x$ for which $g(x) = 0$. Thus, the function $f(x) = \sqrt{x - 4}$ is defined for all $x \ge 4$, and the function $g(x) = \sqrt{5 - x}$ is defined for all $x \le 5$. It follows that the function $\sqrt{x - 4} + \sqrt{5 - x}$ is defined only for those $x$ satisfying $4 \le x \le 5$. Similarly, the function $\frac{f}{f}(x)$ is only defined for $x > 4$ even though it is identically 1 for those $x$. That function has a natural extension to all real numbers.
Fig. 2.5 Composition $(f \circ g)(x) = z$ [diagram labels: $x$ in $A$, $y$ in $B$, $z$ in $C$, with arrows $g(x)$, $f(x)$, and $f \circ g$]
2.6.4 Composition

If $g$ is a function assigning values in its domain $A$ to values in its range contained in the set $B$, and if $f$ is a function assigning values in its domain $B$ to values in its range contained in the set $C$, then the composition of $f$ with $g$ is the function $(f \circ g)(x) = f(g(x))$ which assigns to values in its domain $A$ values in its range contained in set $C$ (Fig. 2.5). The main reason for considering compositions is that it is often easiest to represent complicated functions as compositions of simpler functions. For example, the function $f(x) = \frac{\sin^2 x}{\sqrt{x + 4}}$ is clearly a quotient where the numerator is the composition of the function $x^2$ with the function $\sin x$, and the denominator is the composition of the function $\sqrt{x}$ with the function $x + 4$.

It is easily shown that if $g : A \to B$ and $f : B \to C$ are both surjective functions, then their composition, $f \circ g : A \to C$, is also surjective. To prove this, you would follow the template for proving that a function is surjective. That requires that you select an arbitrary $z \in C$ and show that there is an $x \in A$ such that $(f \circ g)(x) = z$. Why might you use the variable $z$ here rather than the variable $y$? Well, that allows you to think of $g$ as mapping $x$ to $y$, and $f$, in turn, mapping $y$ to $z$. Faced with the statement $f(g(x)) = z$, there is little you can do except to apply what you know about the function $f$, that is, that $f$ is surjective. Because $f$ is surjective, and $z$ is in the codomain of $f$, you know that there is a $y$ in the domain of $f$ such that $f(y) = z$. Can you find an $x$ such that $g(x) = y$? Of course $y$ is in the domain of $f$ which is the codomain of $g$. The function $g$ is surjective, so there must be an $x$ in the domain of $g$ that maps onto $y$. These ideas give the following proof.
PROOF: If $g : A \to B$ and $f : B \to C$ are both surjective functions, then their composition $f \circ g : A \to C$ is also surjective.
• Let $g : A \to B$ and $f : B \to C$ be two surjective functions.
• Let $z \in C$.
• Then since $f$ is a surjection from $B$ to $C$, there is a $y \in B$ such that $f(y) = z$.
• Since $g$ is a surjection from $A$ to $B$, there is an $x \in A$ such that $g(x) = y$.
• Therefore, $(f \circ g)(x) = f(g(x)) = f(y) = z$.
• It follows that $f \circ g : A \to C$ is surjective.
It is also true that if $g : A \to B$ and $f : B \to C$ are both injective functions, then their composition, $f \circ g : A \to C$, is also injective. You would prove this by following the template for proving that a function is injective. That is, you would assume that $(f \circ g)(x_1) = (f \circ g)(x_2)$ for some $x_1$ and $x_2$ in $A$. Again, what can you say if you know $f(g(x_1)) = f(g(x_2))$? All that you can do is apply what you know about the function $f$, that is, that $f$ is injective. Since $f$ is injective, you can conclude that $g(x_1) = g(x_2)$. Then because $g$ is injective, you can conclude $x_1 = x_2$, and you are done.

PROOF: If $g : A \to B$ and $f : B \to C$ are both injective functions, then their composition $f \circ g : A \to C$ is also injective.
• Let $g : A \to B$ and $f : B \to C$ be two injective functions.
• Assume that for some $x_1$ and $x_2$ in $A$, $(f \circ g)(x_1) = (f \circ g)(x_2)$.
• By the definition of composition $f(g(x_1)) = f(g(x_2))$.
• Then since $f$ is an injection, it follows that $g(x_1) = g(x_2)$.
• Since $g$ is an injection, it follows that $x_1 = x_2$.
• It follows that $f \circ g : A \to C$ is injective.
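For functions on small finite sets, both composition results can be checked by direct enumeration. The sketch below represents a function as a Python dictionary from inputs to outputs; the helper names `compose`, `is_surjective`, and `is_injective` and the sample sets are illustrative assumptions, not notation from the text.

```python
def compose(f, g):
    """Return f o g as a dict, where g: A -> B and f: B -> C are dicts."""
    return {x: f[g[x]] for x in g}

def is_surjective(func, codomain):
    # Every element of the codomain appears as some output.
    return set(func.values()) == set(codomain)

def is_injective(func):
    # No output value is produced by two different inputs.
    return len(set(func.values())) == len(func)

C = {"u", "v"}
g_onto = {1: "p", 2: "q", 3: "r", 4: "r"}      # surjective onto {"p", "q", "r"}
f_onto = {"p": "u", "q": "v", "r": "v"}        # surjective onto C
print(is_surjective(compose(f_onto, g_onto), C))   # True

g_inj = {1: "p", 2: "q"}                       # injective
f_inj = {"p": "u", "q": "v", "r": "w"}         # injective
print(is_injective(compose(f_inj, g_inj)))         # True
```

The dictionary comprehension in `compose` mirrors the definition $(f \circ g)(x) = f(g(x))$: each $x$ in the domain of $g$ is sent first to $g(x)$ and then to $f(g(x))$.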
2.6.5 Exercises

Write a proof for each of the following statements.
1. For each real number $r$ there is a real number $x$ such that $x^3 = r$.
2. For each real number $r \ge 0$ there is a real number $x \ge 0$ such that $x^4 = r$.
3. If $n$ is an odd positive integer, then for each real number $r$ there is a real number $x$ such that $x^n = r$.
4. If $n$ is an even positive integer, then for each real number $r \ge 0$ there is a real number $x \ge 0$ such that $x^n = r$.
5. If $h : A \to B$, $g : B \to C$, and $f : C \to D$ are three functions, then $(f \circ g) \circ h = f \circ (g \circ h)$. In other words, function composition is associative. (Hint: Show that both functions $(f \circ g) \circ h$ and $f \circ (g \circ h)$ give the same result when applied to an $x \in A$.)
6. If $h : A \to B$, $g : B \to C$, and $f : C \to D$ are three surjective functions, then their composition $f \circ g \circ h$ is surjective.
7. If $h : A \to B$, $g : B \to C$, and $f : C \to D$ are three injective functions, then their composition $f \circ g \circ h$ is injective.
Chapter 3
Limits
3.1 The Definition of Limit

In a typical Calculus course students develop an intuitive understanding of the concept of limit which, of course, is the central concept of Calculus and, indeed, the central concept of Analysis. In particular, if $f$ is a function defined on an open interval containing $a \in \mathbb{R}$, then $f$ has limit $L$ at $a$ if the values of $f(x)$ get closer and closer to $L$ as $x$ approaches $a$. In order to prove theorems about limits, one needs a rigorous definition of limit which makes clear what is meant by “closer and closer” and “approaches.” In Analysis the distance between two real numbers is measured by the absolute value of the difference of the two numbers. Thus, the ideas of “closer and closer” and “approaches” naturally involve statements about the absolute values of differences of two quantities. Consider the definition of $\lim_{x \to a} f(x) = L$, where the function $f$ is defined in an
such that $|f(x) - L| < \varepsilon$ for all $x$ in that interval. That is, there is a value $\delta > 0$ such that every $x$ satisfying $|x - a| < \delta$ also satisfies $|f(x) - L| < \varepsilon$ as seen in Fig. 3.3. Again, the choice of the Greek letter $\delta$ (delta) is completely arbitrary, but the tradition of using $\delta$ in this context is universal. Note that for the function whose graph appears in Fig. 3.3 the value $L$ is the limit of the function as $x$ approaches $a$, and $L$ also happens to be the value of $f(x)$ at $x = a$. You should recall that this sometimes happens (specifically when $f$ is continuous at $x = a$), but that this is not a requirement. Indeed, one reason for discussing limits in the first place is because there is a need to evaluate the behavior of a function as $x$ approaches a value $a$ when the function fails to be defined at $x = a$. Thus, in general, one does not want to require $|f(x) - L| < \varepsilon$ for all $x$ with $|x - a|$ less than some positive value $\delta$ since this would require $|f(x) - L| < \varepsilon$ at $x = a$. Instead, one excludes the need for the function to satisfy any conditions at all at $x = a$ by saying that there is a positive $\delta$ such that $f$ is within the desired tolerance of $L$ for all $x$ with $0 < |x - a| < \delta$. Clearly, the value of $\delta$ must be chosen to be positive since no negative value would represent a distance, and $\delta = 0$ would not result in a region around the number $a$ satisfying $|x - a| < \delta$.
Combining these ideas results in the following definition. Suppose that the function $f$ is defined for all $x$ in an open interval containing $a \in \mathbb{R}$ except perhaps at $x = a$. Then the limit of $f$ as $x$ approaches $a$ is $L$, $\lim_{x \to a} f(x) = L$, means that for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for every $x$ satisfying $0 < |x - a| < \delta$, it follows that $|f(x) - L| < \varepsilon$.

The power of this definition is the fact that the $\varepsilon$ and $\delta$ are arbitrary positive numbers. For example, what if you knew that for every $\varepsilon > 0$ there were a $\delta > 0$ such that whenever $0 < |x - a| < \delta$, then $|f(x) - L| < 2\varepsilon$? Would this be sufficient for showing $\lim_{x \to a} f(x) = L$? The answer is yes, because the $\varepsilon$ is arbitrary. Suppose that for any $\varepsilon > 0$ you can find a $\delta > 0$ that will ensure that $|f(x) - L| < 2\varepsilon$. Then since $\frac{\varepsilon}{2}$ is also a positive number, you can find a $\delta' > 0$, likely smaller than $\delta$, that will ensure that $|f(x) - L| < 2 \cdot \frac{\varepsilon}{2} = \varepsilon$. The point here is that since $\varepsilon$ was arbitrary, you can replace it with any positive number, including $\frac{\varepsilon}{2}$.
3.1.1 Exercises

Which of the following definitions is equivalent to the definition of $\lim_{x \to a} f(x) = L$?
1. For all $\varepsilon \ge 0$ there is a $\delta \ge 0$ such that if $0 < |x - a| < \delta$, then $|f(x) - L| < \varepsilon$.
2. For all $\varepsilon > 0$ there is a $\delta > 0$ such that if $0 < |x - a| < \frac{\delta}{4}$, then $|f(x) - L| < 7\varepsilon$.
3. For all $\varepsilon > 0.001$ there is a $\delta > 0.001$ such that if $0 < |x - a| < \delta$, then $|f(x) - L| < \varepsilon$.
4. For all $\delta > 0$ there is an $\varepsilon > 0$ such that if $0 < |x - a| < \delta$, then $|f(x) - L| < \varepsilon$.
5. There exists a $\delta > 0$ such that for all $\varepsilon > 0$, if $0 < |x - a| < \delta$, then $|f(x) - L| < \varepsilon$.
6. For all $\delta > 0$ there is an $\varepsilon > 0$, such that if $0 < |x - a| < \varepsilon$, then $|f(x) - L| < \delta$.
7. For all $\varepsilon > 1$ there is a $\delta > 1$, such that if $1 < |x - a| + 1 < \delta$, then $|f(x) - L| + 1 < \varepsilon$.
3.2 Proving $\lim_{x \to a} f(x) = L$
The definition of limit provides a formula by which one can construct a proof that a particular function $f$ has a limit of $L$ at the point $a$. The definition requires that for every $\varepsilon > 0$ there is a $\delta > 0$ that satisfies certain properties. Thus, a proof of a limit must show that for every $\varepsilon > 0$ you can exhibit a $\delta > 0$ which has the needed property. As with other proofs that some property holds for all elements of a set, the proof begins by selecting an arbitrary element of that set. In this case, one would select an arbitrary $\varepsilon > 0$. The goal is to present a value for $\delta > 0$ such that every $x$ satisfying $0 < |x - a| < \delta$ also satisfies $|f(x) - L| < \varepsilon$. That suggests the following proof template.
TEMPLATE for proving $\lim_{x \to a} f(x) = L$
• SET THE CONTEXT: Make statements telling what is known about the function $f$ and the numbers $a$ and $L$.
• SELECT AN ARBITRARY $\varepsilon$: Given $\varepsilon > 0$,
• PROPOSE A VALUE FOR $\delta$: let $\delta =$ ____. Here you would insert an appropriate value for $\delta$.
• SELECT AN ARBITRARY $x$: Select $x$ such that $0 < |x - a| < \delta$.
• LIST IMPLICATIONS: Derive the result $|f(x) - L| < \varepsilon$.
• STATE THE CONCLUSION: Therefore, $\lim_{x \to a} f(x) = L$.
For example, consider proving that $\lim_{x \to 5} 2x - 3 = 7$. As stated in the template, the proof would begin with “Let $f(x) = 2x - 3$. Given $\varepsilon > 0$, ....” Your task is to determine a value of $\delta > 0$ that ensures the inequality $|f(x) - L| < \varepsilon$ holds for all $x$ with $0 < |x - a| < \delta$. Since the function $f$ is not constant, the choice of $\delta$ will surely depend on the value of $\varepsilon$. But how is this value of $\delta$ determined? A common approach is to work backwards from the final conclusion $|f(x) - L| < \varepsilon$ to see what value of $\delta$ is needed. In this example, $f(x) = 2x - 3$, $a = 5$, and $L = 7$. The value of $|f(x) - L| = |(2x - 3) - 7| = |2x - 10| = 2|x - 5|$. Note that this expression has a factor of $x - a$, where $a = 5$. When finding the limit of a polynomial where $L = f(a)$, this will always be the case. For more complicated functions $f$, other properties of $f$ will often allow you to write $f(x) - L$ as an expression where $x - a$ is a factor. This makes it easier to determine a value of $\delta$ since the choice of $\delta$ restricts the size of $|x - a|$ which, in turn, will make $|f(x) - L|$ small, the desired result. So here, $|f(x) - L| = 2|x - 5|$. To force $|f(x) - L|$ to be less than some arbitrary $\varepsilon > 0$, it is, therefore, sufficient for $2|x - 5|$ to be made less than $\varepsilon$. This is done by making $|x - 5| < \frac{\varepsilon}{2}$, and the needed value of $\delta$ is $\frac{\varepsilon}{2}$. Note that it is stipulated that $\varepsilon$ is positive, so $\delta = \frac{\varepsilon}{2}$ is also greater than zero, a requirement of the definition of limit. Now a complete proof can be written by following the template.

PROOF: $\lim_{x \to 5} 2x - 3 = 7$
• Let $f(x) = 2x - 3$.
• Given $\varepsilon > 0$, let $\delta = \frac{\varepsilon}{2} > 0$.
• Select $x$ such that $0 < |x - 5| < \delta = \frac{\varepsilon}{2}$.
• Then $\delta > |x - 5|$ implies $\varepsilon > 2|x - 5| = |2x - 10| = |(2x - 3) - 7| = |f(x) - 7|$.
• Therefore, $\lim_{x \to 5} 2x - 3 = 7$.
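A proposed $\delta$ can be spot-checked numerically before the proof is written. The sketch below samples points with $0 < |x - a| < \delta$ on both sides of $a$ and tests $|f(x) - L| < \varepsilon$ at each one. The helper name `holds_on_samples` and the sampling scheme are illustrative assumptions; passing such a check is only evidence, since a proof must cover every $x$, while a failed check does show that the proposed $\delta$ is too large.

```python
def holds_on_samples(f, a, L, eps, delta, n=1000):
    """Test |f(x) - L| < eps at n sample points on each side of a
    satisfying 0 < |x - a| < delta."""
    for k in range(1, n + 1):
        offset = delta * k / (n + 1)          # 0 < offset < delta
        for x in (a - offset, a + offset):
            if not abs(f(x) - L) < eps:
                return False
    return True

f = lambda x: 2 * x - 3

# delta = eps / 2, the choice made in the proof
print(all(holds_on_samples(f, 5, 7, eps, eps / 2) for eps in [1.0, 0.1, 0.001]))  # True

# delta = eps is too generous: |f(x) - 7| = 2|x - 5| can approach 2 * eps
print(holds_on_samples(f, 5, 7, 0.1, 0.1))   # False
```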
Identifying an appropriate value for $\delta$ is very easy when $f$ is linear as in the example above, but it can be trickier for other functions. Consider proving $\lim_{x \to 4} 3x^2 = 48$. In this example, $f(x) = 3x^2$, $a = 4$, and $L = 48$. The value of $|f(x) - L| = |3x^2 - 48| = 3|x^2 - 16| = 3|x + 4|\,|x - 4|$. As suggested above, it should not be surprising that this expression includes a factor of $x - a = x - 4$ which can be forced to be small by selecting a small value for $\delta$. In particular, the proof will need to justify $|f(x) - L| < \varepsilon$ which means $3|x + 4|\,|x - 4| < \varepsilon$ and $|x - 4| < \frac{\varepsilon}{3|x + 4|}$. Here is an attempt at a proof using this idea, but it falls short of being correct.

PROOF ATTEMPT: $\lim_{x \to 4} 3x^2 = 48$
• Let $f(x) = 3x^2$.
• For any $x$, let $\delta = \frac{\varepsilon}{3|x + 4|}$.
• Then $0 < |x - 4| < \delta = \frac{\varepsilon}{3|x + 4|}$ implies that $\varepsilon > |x - 4| \cdot 3|x + 4| = 3|x^2 - 16| = |f(x) - 48|$.
• Therefore, $\lim_{x \to 4} 3x^2 = 48$.
Can you spot some of the errors in this proof?
• The second line of the proof refers to the variable $\varepsilon$ which has not yet been introduced in the proof. In particular, without having specified that $\varepsilon > 0$, one does not know that $\delta > 0$ which is required by the definition of limit. The proof should include the phrase Given $\varepsilon > 0$.
• The value of $\delta$ in the second line of the proof is undefined when $x = -4$.
• The most serious error here is that the value of $\delta$ depends on the value chosen for $x$. The definition of limit requires that for every $\varepsilon > 0$ there is a $\delta > 0$. That value of $\delta$ can depend on the value of $\varepsilon$ but certainly cannot depend on $x$ which has not yet been introduced in the definition. After $\delta$ is specified, the definition requires that a condition hold for all $x$ satisfying $0 < |x - a| < \delta$, and only then does the definition refer to values of $x$.

Still one needs a value of $\delta$ which will be less than $\frac{\varepsilon}{3|x + 4|}$ for all the values of $x$ considered in the proof. One way around this would be to find a value for $\delta$ which is less than $\frac{\varepsilon}{3|x + 4|}$ for every value of $x$. But this cannot be done because the expression $\frac{\varepsilon}{3|x + 4|}$ gets arbitrarily small as $x$ gets large. On the other hand, the value of $x$ will be restricted so that $0 < |x - 4| < \delta$. Thus, unless $\delta$ is very large, $x$ cannot wander too far away from $a = 4$, and $\frac{\varepsilon}{3|x + 4|}$ cannot get arbitrarily small.

So how does one choose a $\delta$ which both ensures that $|x + 4|$ does not grow too large and also makes $|x - 4|$ small? The technique is to select $\delta$ in two stages. First, to ensure that $|x + 4|$ does not grow too large, restrict the value of $\delta$ so that $x$ cannot wander too far from $a = 4$. Almost any restriction in the size of $\delta$ will work, so how about suggesting that $\delta$ not exceed 1? If $\delta \le 1$, then when you choose an $x$ with $0 < |x - 4| < \delta$, you will know that $|x - 4| < 1$ which is equivalent to $-1 < x - 4 < 1$ and, thus, $-1 + 8 < (x - 4) + 8 < 1 + 8$. That is, $7 < x + 4 < 9$, and it follows that $|x + 4| < 9$. Here is another attempt at a proof that uses this idea. Unfortunately, it too has problems.
PROOF ATTEMPT: $\lim_{x \to 4} 3x^2 = 48$
• Let $f(x) = 3x^2$.
• Given $\varepsilon > 0$, let $\delta = 1$.
• Then for any $x$ such that $0 < |x - 4| < \delta = 1$, it follows that $-1 < x - 4 < 1$ so $-1 + 8 < (x - 4) + 8 < 1 + 8$ and $|x + 4| < 9$.
• Now let $\delta = \frac{\varepsilon}{27} > 0$.
• Then $0 < |x - 4| < \delta = \frac{\varepsilon}{27}$ implies that $\varepsilon > |x - 4| \cdot 27 > |x - 4| \cdot 3|x + 4| = |3x^2 - 48| = |f(x) - 48|$.
• Therefore, $\lim_{x \to 4} 3x^2 = 48$.
The only problem with the above proof is in its use of the variable $\delta$. In the second line of the proof $\delta$ is set to 1, and in the fourth line it is set to $\frac{\varepsilon}{27}$. It does not make sense to set the value of $\delta$ equal to both of these values because, except in the rare case that $\varepsilon = 27$, the value of $\delta$ cannot be equal to both values at the same time. The solution is to choose one value for $\delta$ that satisfies two separate conditions. For example, you can first require that $\delta < 1$. Then a choice of $x$ with $0 < |x - 4| < \delta$ will guarantee that $|x + 4| < 9$. Then $\frac{\varepsilon}{3|x + 4|} > \frac{\varepsilon}{3 \cdot 9} = \frac{\varepsilon}{27}$. This suggests that you should select $\delta = \frac{\varepsilon}{27}$. But you also need $\delta \le 1$. What happens if someone suggests that $\varepsilon$ be some rather large number such as $\varepsilon = 100$? Then $\delta = \frac{\varepsilon}{27}$ would not satisfy $\delta < 1$. This is not a problem since one can always get away with selecting a positive value for $\delta$ that is smaller than needed. Thus, you can select $\delta$ to be the lesser of 1 and $\frac{\varepsilon}{27}$. This choice is usually written as $\delta = \min\left(\frac{\varepsilon}{27}, 1\right)$. Now you can put this all together to get a formal proof that is completely correct.

PROOF: $\lim_{x \to 4} 3x^2 = 48$
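• Let $f(x) = 3x^2$.
• Given $\varepsilon > 0$, let $\delta = \min\left(\frac{\varepsilon}{27}, 1\right)$.
• Select $x$ such that $0 < |x - 4| < \delta$.
• Since $\delta \le 1$, it follows that $|x - 4| < 1$ and $-1 < x - 4 < 1$, so $7 < x + 4 < 9$. Thus, $|x + 4| < 9$.
• Since $\delta \le \frac{\varepsilon}{27}$, it follows that $|x - 4| < \frac{\varepsilon}{27} = \frac{\varepsilon}{3 \cdot 9} < \frac{\varepsilon}{3|x + 4|}$.
• Then $\delta > |x - 4|$ implies $\varepsilon > 3|x + 4|\,|x - 4| = |3x^2 - 48| = |f(x) - 48|$.
• Therefore, $\lim_{x \to 4} 3x^2 = 48$.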
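The need for the cap $\delta \le 1$ shows up numerically when $\varepsilon$ is large. The sketch below, a rough illustration rather than part of the proof, samples the largest value of $|3x^2 - 48|$ over $0 < |x - 4| < \delta$ and compares the uncapped choice $\delta = \frac{\varepsilon}{27}$ with $\delta = \min\left(\frac{\varepsilon}{27}, 1\right)$ at $\varepsilon = 100$. The helper name `worst_error` is an illustrative assumption.

```python
def f(x):
    return 3 * x**2

def worst_error(delta, n=2000):
    """Largest sampled value of |f(x) - 48| over 0 < |x - 4| < delta."""
    worst = 0.0
    for k in range(1, n + 1):
        offset = delta * k / (n + 1)          # 0 < offset < delta
        worst = max(worst, abs(f(4 + offset) - 48), abs(f(4 - offset) - 48))
    return worst

eps = 100.0
print(worst_error(eps / 27) < eps)            # False: x can stray far enough that |x + 4| > 9
print(worst_error(min(eps / 27, 1)) < eps)    # True: the cap keeps |x + 4| < 9
```

With $\delta = \frac{100}{27} \approx 3.7$, values of $x$ near 7.7 are allowed, where $|x + 4|$ exceeds 9 and the bound used in the proof no longer applies.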
As a third example, consider proving the limit $\lim_{x \to -2} \frac{x + 2}{x^2 + 3x + 2} = -1$. In this example $f(x) = \frac{x + 2}{x^2 + 3x + 2}$, $a = -2$, and $L = -1$. Note that $f(-2)$ is not defined even though the limit as $x$ approaches $-2$ exists. The proof of this limit must conclude with the inequality $\varepsilon > |f(x) - L| = \left|\frac{x + 2}{x^2 + 3x + 2} - (-1)\right|$. As in the previous examples, it would be convenient if the expression $\frac{x + 2}{x^2 + 3x + 2} - (-1)$ would contain a factor of $x + 2$ so that it could be made small by requiring $x - (-2)$ to be less than some $\delta$. But this follows with some fairly straightforward algebra. Assuming that $x \ne -2$,
$$\frac{x + 2}{x^2 + 3x + 2} - (-1) = \frac{x + 2}{(x + 2)(x + 1)} + 1 = \frac{1}{x + 1} + 1 = \frac{1 + (x + 1)}{x + 1} = \frac{x + 2}{x + 1}.$$
The needed inequality $\varepsilon > \left|\frac{x + 2}{x + 1}\right| = |x + 2| \cdot \left|\frac{1}{x + 1}\right|$ would follow if $|x + 2|$ never exceeds $\varepsilon|x + 1|$ which, in turn, would happen if $\delta < \varepsilon|x + 1|$. Again, there is a problem because the choice of $\delta > 0$ cannot depend on the value of $x$, yet $\varepsilon|x + 1|$ can get arbitrarily close to zero as $x$ gets close to $-1$. The strategy, then, would be to restrict the value of $\delta$ so that $x$ could not get close to $-1$. If $x$ is supposed to be close to $-2$, $\delta$ could be chosen so that it does not exceed $\frac{1}{2}$. Then, $|x + 1|$ could not get smaller than $1 - \frac{1}{2} = \frac{1}{2}$, and $\varepsilon|x + 1| > \frac{\varepsilon}{2}$. You would not want $\delta$ to exceed either $\frac{\varepsilon}{2}$ or $\frac{1}{2}$. Thus, one can select $\delta = \min\left(\frac{\varepsilon}{2}, \frac{1}{2}\right)$. The complete proof follows.

PROOF: $\lim_{x \to -2} \frac{x + 2}{x^2 + 3x + 2} = -1$
• Let $f(x) = \frac{x + 2}{x^2 + 3x + 2}$.
• Given $\varepsilon > 0$, let $\delta = \min\left(\frac{\varepsilon}{2}, \frac{1}{2}\right)$.
• Select $x$ such that $0 < |x - (-2)| < \delta$.
• Since $\delta \le \frac{1}{2}$, it follows that $|x + 2| < \frac{1}{2}$ and $-\frac{1}{2} < x + 2 < \frac{1}{2}$, so $-\frac{3}{2} < x + 1 < -\frac{1}{2}$. Thus, $|x + 1| > \frac{1}{2}$.
• Since $\delta \le \frac{\varepsilon}{2}$, it follows that $|x + 2| < \frac{\varepsilon}{2} < \varepsilon|x + 1|$.
• Then $\delta > |x - (-2)| > 0$ implies $\frac{\varepsilon}{2} > |x + 2|$ and $\varepsilon > 2|x + 2| > \frac{1}{|x + 1|}|x + 2| = \frac{1}{|x + 1|}|1 + (x + 1)| = \left|\frac{1}{x + 1} + 1\right| = \left|\frac{x + 2}{x^2 + 3x + 2} - (-1)\right| = |f(x) - (-1)|$.
• Therefore, $\lim_{x \to -2} \frac{x + 2}{x^2 + 3x + 2} = -1$.
Clearly, at the point that you stipulate that $\delta$ should be less than $\frac{1}{2}$, you are making a rather arbitrary decision. What would have happened if you had chosen some other reasonable bound on the size of $\delta$? For example, what if instead you only require $\delta < \frac{3}{4}$? This would also work, although that decision would affect the final choice of $\delta$ for now $|x + 1|$ can get as small as $\frac{1}{4}$, and $\frac{|x + 2|}{|x + 1|}$ could be as large as $4|x + 2|$. This suggests that you then select $\delta = \min\left(\frac{\varepsilon}{4}, \frac{3}{4}\right)$. This choice is no better or worse than the $\delta$ chosen earlier. When one makes such arbitrary decisions, it is good form to make a selection that does not lead to unnecessary arithmetic or algebraic complications because one does not want to make the proof any harder to read than necessary. Thus, it would be perfectly adequate but enormously awkward to select the bound on $\delta$ to be $\frac{\sqrt{5}}{1 + \sqrt{5}}$. As long as the bound is less than 1, it will do the job of keeping $|x + 1|$ bounded away from zero, but $\frac{\sqrt{5}}{1 + \sqrt{5}}$ would certainly not be an optimal choice.
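The claim that either bound on $\delta$ works can be spot-checked with exact rational arithmetic, which avoids any rounding near the excluded point. The sketch below compares the rules $\delta = \min\left(\frac{\varepsilon}{2}, \frac{1}{2}\right)$ and $\delta = \min\left(\frac{\varepsilon}{4}, \frac{3}{4}\right)$ on sample points, and also shows an uncapped rule failing. The helper name `check` and the rule names are illustrative assumptions.

```python
from fractions import Fraction

def f(x):
    return (x + 2) / (x**2 + 3 * x + 2)

def check(delta_rule, eps, n=500):
    """Test |f(x) - (-1)| < eps at sample points with 0 < |x + 2| < delta."""
    delta = delta_rule(eps)
    for k in range(1, n + 1):
        offset = delta * Fraction(k, n + 1)   # 0 < offset < delta
        for x in (-2 - offset, -2 + offset):
            if not abs(f(x) + 1) < eps:
                return False
    return True

rule_half = lambda eps: min(eps / 2, Fraction(1, 2))
rule_three_quarters = lambda eps: min(eps / 4, Fraction(3, 4))
uncapped = lambda eps: eps                     # no bound keeping x away from -1

for eps in [Fraction(10), Fraction(1), Fraction(1, 100)]:
    assert check(rule_half, eps) and check(rule_three_quarters, eps)

print(check(uncapped, Fraction(1)))            # False: x may come close to -1
```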
3.2.1 Exercises

Write a proof of each of the following limits.
1. $\lim_{x \to 5} \frac{3}{5}x + 1 = 4$
2. $\lim_{x \to 3} 5x - 8 = 7$
3. $\lim_{x \to 3} 2x^2 = 18$
4. $\lim_{x \to \frac{2}{3}} 9x^2 = 4$
5. $\lim_{x \to -1} 3x^2 - 5x - 7 = 1$
6. $\lim_{x \to 4} x^2 + 3x + 1 = 29$
7. $\lim_{x \to 2} 2x^3 = 16$
8. $\lim_{x \to -1} \frac{6}{2x + 5} = 2$
9. $\lim_{x \to 8} \frac{x + 4}{x^2 - 10x + 10} = -2$
10. $\lim_{x \to a} mx + b = ma + b$
11. $\lim_{x \to u} ax^2 = au^2$
12. $\lim_{x \to u} ax^2 + bx + c = au^2 + bu + c$
3.3 One-Sided Limits

The one-sided limits $\lim_{x \to a^+} f(x) = L$ and $\lim_{x \to a^-} f(x) = L$ are very similar to two-sided limits except that the value of $x$ is only allowed to approach the real number $a$ from one side. As a result, the definitions of these one-sided limits are very similar to the definition of limit with minor alterations that force $x$ to stay on one side of $a$. The definition of limit states that for a function $f$ defined in a neighborhood of $a$, but not necessarily at $a$, the limit $\lim_{x \to a} f(x) = L$ means for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for every $x$ satisfying $0 < |x - a| < \delta$, it follows that $|f(x) - L| < \varepsilon$. What is it about this definition that allows $x$ to approach $a$ from two sides? It is the inequality $0 < |x - a| < \delta$ that allows $x$ to be either greater than or less than $a$ since $|x - a|$ is positive in either case. By removing the absolute value function in this inequality and writing instead $0 < x - a < \delta$, the choice of $x$ becomes restricted to being a value greater than $a$, or writing instead $0 < a - x < \delta$, the choice of $x$ becomes restricted to being a value less than $a$.

Thus, if $f$ is a function defined for all $x$ in an open interval with right end at $a$, then the limit of $f$ at $a$ from the left is $L$, $\lim_{x \to a^-} f(x) = L$, means that for every $\varepsilon > 0$ there is a $\delta > 0$ such that for every $x$ satisfying $0 < a - x < \delta$, it follows that $|f(x) - L| < \varepsilon$. Similarly, if $f$ is a function defined for all $x$ in an open interval with left end at $a$, then the limit of $f$ at $a$ from the right is $L$, $\lim_{x \to a^+} f(x) = L$, means that for every $\varepsilon > 0$ there is a $\delta > 0$ such that for every $x$ satisfying $0 < x - a < \delta$, it follows that $|f(x) - L| < \varepsilon$.

One-sided limits are particularly useful in cases where the function $f$ behaves differently on one side of $a$ than on the other side. For example, $e^{1/x}$ behaves quite differently as $x$ approaches 0 from the right, where $\frac{1}{x}$ is positive, from how it behaves as $x$ approaches 0 from the left, where $\frac{1}{x}$ is negative. Similarly, the derivative of $f(x) = |x|$ has different limits as $x$ approaches 0 from the right and from the left. There are also cases where a function is not even defined for $x$ on one side of $a$ such as $f(x) = \sqrt{x}$ which is not defined for $x < 0$.

Proving the existence of one-sided limits is very similar to proving two-sided limits except that care must be taken to ensure that the value of $x$ remains on one side of $a$. Take, for example, the limit $\lim_{x \to 3^+} 2x^2 - 5x = 3$. Here $f(x) = 2x^2 - 5x$, $a = 3$, and $L = 3$. As with a proof of other limits earlier in the chapter, the proof needs to give a value for $\delta > 0$ which will ensure $\varepsilon > |f(x) - L| = |(2x^2 - 5x) - 3| = |(2x + 1)(x - 3)|$. This will follow if $|x - 3| < \delta \le \frac{\varepsilon}{|2x + 1|}$ for all suitable values of $x$. What is needed is the largest possible value of $2x + 1$, but $2x + 1$ is not bounded unless $x$ is restricted to be close to 3. Thus, stipulate that $\delta$ be less than 1 which will ensure that $x - 3$ will be less than 1, $x$ will not exceed 4, and $2x + 1$ will not exceed 9. Then $\delta$ can be chosen to be $\min\left(\frac{\varepsilon}{9}, 1\right)$, and the proof can be written as follows.

PROOF: $\lim_{x \to 3^+} 2x^2 - 5x = 3$
• Let $f(x) = 2x^2 - 5x$.
• Given $\varepsilon > 0$, let $\delta = \min\left(\frac{\varepsilon}{9}, 1\right)$.
• Select $x$ such that $0 < x - 3 < \delta$.
• Since $\delta \le 1$, it follows that $0 < x - 3 < 1$, $3 < x < 4$, and $|2x + 1| < 9$.
• Then $0 < x - 3 < \delta$ implies $\frac{\varepsilon}{9} > x - 3$ and $\varepsilon > 9(x - 3) > (2x + 1)(x - 3) = 2x^2 - 5x - 3 = |f(x) - 3|$.
• Therefore, $\lim_{x \to 3^+} 2x^2 - 5x = 3$.
Consider a function whose left limit differs from its right limit such as the function
$$f(x) = \begin{cases} 5 - 7x & \text{if } x < 1 \\ x & \text{if } x \ge 1. \end{cases}$$
Then $\lim_{x \to 1^-} f(x) = -2$ while $\lim_{x \to 1^+} f(x) = 1$. Thus, while proving $\lim_{x \to 1^-} f(x) = -2$, it is important to use the fact that $x < 1$ as part of the proof since the required inequalities will not hold for $x > 1$ (Fig. 3.4). The following shows one possible proof.
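Fig. 3.4 Graph of $f(x)$

PROOF: $\lim_{x \to 1^-} \begin{cases} 5 - 7x & \text{if } x < 1 \\ x & \text{if } x \ge 1 \end{cases} = -2$
• Let $f(x) = \begin{cases} 5 - 7x & \text{if } x < 1 \\ x & \text{if } x \ge 1 \end{cases}$.
• Given $\varepsilon > 0$, let $\delta = \frac{\varepsilon}{7}$.
• Select $x$ such that $0 < 1 - x < \delta = \frac{\varepsilon}{7}$. Then $x < 1$, so $f(x) = 5 - 7x$.
• It follows that $\varepsilon > 7(1 - x) = 5 - 7x - (-2) = |f(x) - (-2)|$.
• Therefore, $\lim_{x \to 1^-} f(x) = -2$.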
In the third line of the proof, $0 < 1 - x < \delta$ ensures that $x < 1$ which, in turn, is needed to conclude that $f(x) = 5 - 7x$ and not $f(x) = x$. The fact that $x < 1$ is also used in the fourth line of the proof to conclude that $5 - 7x - (-2) = |f(x) - (-2)|$ which follows because $5 - 7x - (-2)$ is positive for all $x < 1$.
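The role of the restriction $x < 1$ can be seen by sampling on each side of 1 separately. The sketch below, an illustration under the choice $\delta = \frac{\varepsilon}{7}$ from the proof, tests $|f(x) - (-2)| < \varepsilon$ at points with $0 < 1 - x < \delta$ and then at points with $0 < x - 1 < \delta$. The helper names `left_ok` and `right_ok` are illustrative assumptions.

```python
def f(x):
    return 5 - 7 * x if x < 1 else x

def left_ok(eps, n=1000):
    """Test |f(x) + 2| < eps at sample points with 0 < 1 - x < eps / 7."""
    delta = eps / 7
    return all(abs(f(1 - delta * k / (n + 1)) + 2) < eps for k in range(1, n + 1))

def right_ok(eps, n=1000):
    """Same test at sample points with 0 < x - 1 < eps / 7."""
    delta = eps / 7
    return all(abs(f(1 + delta * k / (n + 1)) + 2) < eps for k in range(1, n + 1))

print(left_ok(0.5))    # True: approaching 1 from the left, f(x) is near -2
print(right_ok(0.5))   # False: to the right of 1, f(x) = x is near 1, not -2
```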
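3.3.1 Exercises

Write a proof of each of the following one-sided limits.
1. $\lim_{x \to 3^+} x^2 + 4x = 21$
2. $\lim_{x \to 3^-} 8 - 3x = -1$
3. $\lim_{x \to 2^-} \frac{x^2 - 4}{x^2 - 3x + 2} = 4$
4. $\lim_{x \to 4^+} \frac{x^2 - 4x}{2x^2 - 7x - 4} = \frac{4}{9}$
5. $\lim_{x \to 2^-} \frac{8|x - 2|}{x^2 - 4} = -2$
6. $\lim_{x \to 2^+} \frac{8|x - 2|}{x^2 - 4} = 2$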
3.4 Limits at Infinity

The definitions given in the last two sections do not make sense when the real number that $x$ approaches, $a$, is replaced by infinity. Infinity, of course, is not an element of the real numbers, $\mathbb{R}$, but it does make sense to ask whether a function approaches a limit when $x$ increases without bound, that is, as $x$ approaches infinity. When one writes $\lim_{x \to \infty} f(x) = L$, one is thinking that $f(x)$ is getting close to the real number $L$ as $x$ increases without bound. But it does not make sense to measure how close $x$ is to infinity by choosing a $\delta > 0$ so that when $x$ is within $\delta$ of infinity, $f(x)$ is close to $L$. Since infinity is not a real number, one cannot measure the distance from the real number $x$ to infinity, much less expect $x$ to get within $\delta$ of infinity. So how does one quantify “getting closer to infinity?” The answer lies in the phrase “increases without bound,” which suggests that for any bound, $N$, you could place on the size of $x$, the value of $x$ can be made to be greater than that bound. Thus, instead of selecting a $\delta > 0$ and requiring $0 < |x - a| < \delta$, one chooses a number $N \in \mathbb{R}$ and requires $x > N$. This allows the following definition.

Suppose that the function $f$ is defined for all $x > K$ for some real number $K$. Then the statement that the limit of $f$ as $x$ approaches infinity is $L$, written $\lim_{x \to \infty} f(x) = L$, means that for every $\varepsilon > 0$ there exists an $N \in \mathbb{R}$ such that for every $x > N$, it follows that $|f(x) - L| < \varepsilon$ (Fig. 3.5).

Now consider how one might write a proof of a limit at infinity. For example, consider proving the limit $\lim_{x \to \infty} \frac{x}{x^2 + 6} = 0$. Here $f(x) = \frac{x}{x^2 + 6}$ and $L = 0$. As with other limit proofs, the goal is to arrange that $|f(x) - L| < \varepsilon$ for an arbitrarily chosen $\varepsilon > 0$. Again, you can work backwards. Since $|f(x) - L| = \left|\frac{x}{x^2 + 6}\right|$, as long as $x > 0$, it would follow that $\left|\frac{x}{x^2 + 6}\right| < \frac{x}{x^2} = \frac{1}{x}$. Thus, there is an expression, $\frac{1}{x}$, which is larger than $|f(x) - L|$ for all suitably large values of $x$. This will help because if you can assure that $\frac{1}{x}$ is less than $\varepsilon$, it will follow that $|f(x) - L|$ is also less than $\varepsilon$. It would not have been helpful to exhibit an expression that was always less than $|f(x) - L|$ because making that expression small would not imply that $|f(x) - L|$ is small. Now, if $x > \frac{1}{\varepsilon}$, it follows that $\frac{1}{x} < \varepsilon$, suggesting that $\frac{1}{\varepsilon}$ is a suitable value for $N$.
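Fig. 3.5 Approaching a limit as $x \to \infty$

PROOF: $\lim_{x \to \infty} \frac{x}{x^2 + 6} = 0$

• Let $f(x) = \frac{x}{x^2 + 6}$.
• Given $\varepsilon > 0$, let $N = \frac{1}{\varepsilon}$.
• Select $x$ such that $x > N > 0$.
• Then $x > \frac{1}{\varepsilon}$ implies $\varepsilon > \frac{1}{x} = \frac{x}{x^2} > \frac{x}{x^2 + 6} = \left|\frac{x}{x^2 + 6} - 0\right| = |f(x) - 0|$.
• Therefore, $\lim_{x \to \infty} \frac{x}{x^2 + 6} = 0$.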
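The role of $N$ in the proof above can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not part of the proof: the names `N_for` and `check_limit_at_infinity` are hypothetical, and the sample points $x = Nt$ with $t > 1$ are just one convenient way to pick values of $x$ greater than $N$.

```python
def f(x):
    return x / (x**2 + 6)

def N_for(eps):
    # The choice made in the proof: N = 1/eps.
    return 1 / eps

def check_limit_at_infinity(eps, factors=(1.001, 2, 10, 1e3, 1e6)):
    """Return True if |f(x) - 0| < eps at sample points x = N*t > N."""
    N = N_for(eps)
    return all(abs(f(N * t)) < eps for t in factors)

for eps in (10.0, 1.0, 0.1, 1e-6):
    print(eps, N_for(eps), check_limit_at_infinity(eps))
```

Because $N = \frac{1}{\varepsilon}$ is positive, every sampled $x$ is positive as well, which mirrors the observation in the third step of the proof.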
Note that it is important that the third step of the proof pointed out that $N$ is positive. This fact is used in the fourth step when $\frac{1}{x}$ is calculated, which would not have been allowed if the value of $x$ could have been zero.

For a second example, consider proving $\lim_{x \to \infty} \frac{2x + 5}{x - 7} = 2$. Again, you can work backwards to get $\varepsilon > |f(x) - L| = \left|\frac{2x + 5}{x - 7} - 2\right| = \left|\frac{(2x + 5) - 2(x - 7)}{x - 7}\right| = \left|\frac{19}{x - 7}\right|$. From here there are a number of ways to proceed. You can solve for $x$ in the previous inequality to get $x > 7 + \frac{19}{\varepsilon}$, which gives a reasonable value for $N$. Another way would be to say that if $x > 14$, then $x - 7 > x - \frac{x}{2} = \frac{x}{2}$. In this case the fraction $\frac{19}{x - 7}$ is less than $\frac{19}{x/2} = \frac{38}{x}$, and it becomes clear that $x > \frac{38}{\varepsilon}$, together with $x > 14$, is sufficient.

This is an example demonstrating the enormous flexibility one sometimes has in writing proofs in analysis, where you often need to prove an inequality which can be done in many ways. It is usually easier to prove an inequality involving a simple fraction rather than a complicated fraction, so you can use the strategy of replacing a fraction with a simpler fraction that is clearly larger or, in some cases, clearly smaller. Keep in mind that a ratio of positive values gets larger if its numerator gets larger or its denominator gets smaller. A complete proof can be written as follows.
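PROOF: $\lim_{x \to \infty} \frac{2x + 5}{x - 7} = 2$

• Let $f(x) = \frac{2x + 5}{x - 7}$.
• Given $\varepsilon > 0$, let $N = 7 + \frac{19}{\varepsilon}$.
• Select $x$ such that $x > N > 7$.
• Then $x > 7 + \frac{19}{\varepsilon}$ implies $x - 7 > \frac{19}{\varepsilon}$ and $\varepsilon > \frac{19}{x - 7} = \frac{2x + 5}{x - 7} - 2 = |f(x) - 2|$.
• Therefore, $\lim_{x \to \infty} \frac{2x + 5}{x - 7} = 2$.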
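The two ways of choosing $N$ discussed before this proof can be compared side by side. In the Python sketch below, `N_exact` is the value used in the proof, while `N_simple` is one possible way (an assumption of this example) to combine the conditions $x > 14$ and $x > \frac{38}{\varepsilon}$ into a single bound; the helper `works` and its sample points are likewise hypothetical.

```python
def f(x):
    return (2 * x + 5) / (x - 7)

def N_exact(eps):
    # From solving 19/(x - 7) < eps for x, as in the proof.
    return 7 + 19 / eps

def N_simple(eps):
    # From the cruder bound 19/(x - 7) < 38/x, which needs x > 14.
    return max(14.0, 38 / eps)

def works(N, eps, factors=(1.001, 2, 10, 1e3)):
    """Return True if |f(x) - 2| < eps at sample points x = N*t > N."""
    return all(abs(f(N * t) - 2) < eps for t in factors)

for eps in (5.0, 1.0, 0.01):
    print(eps, N_exact(eps), works(N_exact(eps), eps),
          N_simple(eps), works(N_simple(eps), eps))

# Dropping the requirement x > 14 breaks the simpler bound when eps is large.
print(works(38 / 5.0, 5.0))
```

The simpler bound always produces an $N$ at least as large as the exact one, and that is acceptable: the definition asks only that some $N$ exists, not that it be the smallest possible.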
As in the previous proof, it is important that $x > 7$ is pointed out in the third step of the proof because that fact is needed both to ensure that $f(x)$ is defined, by assuring $x - 7 \ne 0$, and to ensure that $x - 7$ is positive, allowing the absolute value function to be introduced in the fourth step of the proof.

With a slight adjustment of the definition of $\lim_{x \to \infty} f(x) = L$, one gets a definition of $\lim_{x \to -\infty} f(x) = L$. This time, rather than choosing an $N$ and requiring $|f(x) - L| < \varepsilon$ for all $x > N$, one instead needs $f(x)$ to be within $\varepsilon$ of $L$ for those $x < N$. Thus, $\lim_{x \to -\infty} f(x) = L$ means that for every $\varepsilon > 0$ there exists an $N \in \mathbb{R}$ such that for all $x < N$ it follows that $|f(x) - L| < \varepsilon$.

To prove $\lim_{x \to -\infty} \frac{6x^2 + 5x}{2x^2 - 7} = 3$, one can identify an $N$ such that $x < N$ implies that $\left|\frac{6x^2 + 5x}{2x^2 - 7} - 3\right| < \varepsilon$ by working backwards. That is, $\left|\frac{6x^2 + 5x}{2x^2 - 7} - 3\right| = \left|\frac{(6x^2 + 5x) - 3(2x^2 - 7)}{2x^2 - 7}\right| = \left|\frac{5x + 21}{2x^2 - 7}\right|$. It would be nice to simplify this rather messy expression, something you can do as long as you do not introduce changes that prevent the final inequality from holding. In this case, the $-7$ term in the denominator of $\frac{5x + 21}{2x^2 - 7}$ is an inconvenience,
and it would be nice to remove it. Simply removing this negative term would make the absolute value of the fraction smaller when what is needed is to make the fraction larger. A strategy that does work is to take part of the $2x^2$ term, which grows very large as $x$ goes to $-\infty$, and pair it with the $-7$ term. For example, $2x^2 - 7$ can be written as $x^2 + (x^2 - 7)$. Because $x^2 - 7$ is a positive value for all $x < -\sqrt{7}$, removing it from the denominator makes the absolute value of the fraction greater. Also note that when $x < -\frac{21}{10}$, the numerator satisfies $|5x + 21| < 5|x|$, and this happens for all $x < -\sqrt{7}$. It would then be sufficient for $\varepsilon > \frac{5}{|x|} = \frac{5|x|}{x^2} > \left|\frac{5x + 21}{2x^2 - 7}\right|$ or that $x < -\frac{5}{\varepsilon}$ as long as $x < -\sqrt{7}$. A proof would be as follows.
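PROOF: $\lim_{x \to -\infty} \frac{6x^2 + 5x}{2x^2 - 7} = 3$

• Let $f(x) = \frac{6x^2 + 5x}{2x^2 - 7}$.
• Given $\varepsilon > 0$, let $N = \min\left(-\sqrt{7}, -\frac{5}{\varepsilon}\right)$.
• Select $x$ such that $x < N \le -\sqrt{7}$.
• Then $x < N \le -\frac{5}{\varepsilon}$ implies $\varepsilon > \left|\frac{5}{x}\right| = \left|\frac{5x}{x^2}\right| > \left|\frac{5x + 21}{x^2 + (x^2 - 7)}\right| = \left|\frac{(6x^2 + 5x) - 3(2x^2 - 7)}{2x^2 - 7}\right| = \left|\frac{6x^2 + 5x}{2x^2 - 7} - 3\right| = |f(x) - 3|$.
• Therefore, $\lim_{x \to -\infty} \frac{6x^2 + 5x}{2x^2 - 7} = 3$.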
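For a limit at $-\infty$ the sample points must lie to the left of $N$. The Python sketch below is an illustrative check of the value of $N$ used in this proof; the function names and the sampling rule $x = Nt$ with $t > 1$ (which gives $x < N$ because $N$ is negative) are assumptions made for the example.

```python
import math

def f(x):
    return (6 * x**2 + 5 * x) / (2 * x**2 - 7)

def N_for(eps):
    # The choice made in the proof: N = min(-sqrt(7), -5/eps).
    return min(-math.sqrt(7), -5 / eps)

def check_limit_at_minus_infinity(eps, factors=(1.001, 2, 10, 1e3)):
    """Return True if |f(x) - 3| < eps at sample points x = N*t < N."""
    N = N_for(eps)  # N is negative, so N*t < N whenever t > 1
    return all(abs(f(N * t) - 3) < eps for t in factors)

for eps in (100.0, 2.0, 1.0, 0.001):
    print(eps, N_for(eps), check_limit_at_minus_infinity(eps))
```

For large values of `eps` the minimum is attained by $-\sqrt{7}$, which keeps every sampled $x$ in the region where $x^2 - 7$ is positive and $|5x + 21| < 5|x|$.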
3.4.1 Exercises

Find ways to justify each of the following inequalities that hold for large values of $x$.

1.
2.
3.

Write proofs for each of the following limits.

4. $\lim_{x \to \infty} \frac{4}{x + 4} = 0$
5. $\lim_{x \to \infty} \frac{3x - 9}{3x + 4} = 1$
6. $\lim_{x \to \infty} \frac{9x^2}{3x^2 - 10} = 3$
7. $\lim_{x \to \infty} \frac{x^3}{5x^3 - 2x^2 - 4} = \frac{1}{5}$
3.5 Limit of a Sequence

3.5.1 Definition of Sequence

A sequence is just a function whose domain is the set of natural numbers, $\mathbb{N}$. In this chapter the codomain of a sequence will be the real numbers, $\mathbb{R}$, but you can have a sequence with any set serving as the codomain. Functions are usually referenced using the notation $f(x)$. But for sequences it is traditional to place the argument of a sequence in a subscript rather than within parentheses, as in $a_1, a_2, a_3, \ldots$. The entire sequence is notated with angle brackets, as in $\langle a_n \rangle$. Note that this is not the same as the set $\{a_1, a_2, a_3, \ldots\}$, which is just the collection of the values taken on by the sequence, that is, the range of the function $a : \mathbb{N} \to \mathbb{R}$. For each $n \in \mathbb{N}$, $a_n$ is called a term of the sequence, or specifically, the $n$th term of the sequence.
3.5.2 Arithmetic with Sequences

As with any real-valued function, you can add, subtract, multiply, and divide sequences. The sum of sequences $\langle a_n \rangle$ and $\langle b_n \rangle$ is the sequence $\langle c_n \rangle$ where, for each $n \in \mathbb{N}$, $c_n = a_n + b_n$. Similarly, one can define the difference of sequences and product of sequences as $c_n = a_n - b_n$ and $c_n = a_n b_n$, respectively. If the sequence $\langle b_n \rangle$ has no terms equal to zero, then the quotient of sequences $\langle a_n \rangle$ and $\langle b_n \rangle$ is the sequence $c_n = \frac{a_n}{b_n}$. Other arithmetic operations can be similarly defined. If $f$ is any real-valued function with a domain that includes the range of the sequence $\langle a_n \rangle$, then it makes sense to define the sequence $c_n = f(a_n)$. For example, if $\langle a_n \rangle$ is the sequence $1, 3, 5, 7, \ldots$, then the sequence $\langle \sqrt{a_n} \rangle$ is the sequence $1, \sqrt{3}, \sqrt{5}, \sqrt{7}, \ldots$.
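Since a sequence is a function on $\mathbb{N}$, the termwise operations described above can be modeled directly with functions. The following Python sketch is one possible illustration: it assumes that the sequence $1, 3, 5, 7, \ldots$ continues as the odd numbers $a_n = 2n - 1$, it introduces a hypothetical second sequence $b_n = n$, and the helper names `termwise` and `compose` are invented for this example.

```python
import math

def a(n):
    # Assumed formula for 1, 3, 5, 7, ...: the n-th odd number.
    return 2 * n - 1

def b(n):
    # A hypothetical second sequence with no zero terms: 1, 2, 3, ...
    return n

def termwise(op, s, t):
    """Build the sequence c with c(n) = op(s(n), t(n))."""
    return lambda n: op(s(n), t(n))

def compose(f, s):
    """Build the sequence c with c(n) = f(s(n))."""
    return lambda n: f(s(n))

total = termwise(lambda u, v: u + v, a, b)     # c_n = a_n + b_n
quotient = termwise(lambda u, v: u / v, a, b)  # c_n = a_n / b_n
root = compose(math.sqrt, a)                   # c_n = sqrt(a_n)

print([total(n) for n in range(1, 5)])
print([quotient(n) for n in range(1, 5)])
print([root(n) for n in range(1, 5)])
```

The quotient is safe to form here only because $b_n$ is never zero, which is the same restriction stated in the definition of the quotient of two sequences.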