Numerical Computing, CSC 2702
Required textbook: Numerical Analysis, Burden & Faires, 8th edition, Thomson Brooks/Cole
Dr Azeddine M, KICT, CS, IIUM, October 12, 2009
Contents

1 Mathematical Preliminaries .......................... 4
  1.1 Review of Calculus .............................. 4
      1.1.1 Exercises ................................. 6
  1.2 Round-off Errors ................................ 8
      1.2.1 Exercises ................................. 17

2 Solution of Equations in One Variable ............... 19
  2.1 Bisection Method ................................ 20
  2.2 Fixed-Point Iteration ........................... 23
  2.3 Newton's Method ................................. 26
  2.4 Secant Method ................................... 27
  2.5 False Position Method ........................... 27

3 Error Analysis for Iterative Methods ................ 29
  3.1 Linearly and Quadratically Convergent Procedures  29
  3.2 Zero multiplicity ............................... 31
  3.3 Exercises ....................................... 31

4 Accelerating Convergence ............................ 33
  4.1 Aitken's ∆² method .............................. 33
  4.2 Steffensen's Method ............................. 35
  4.3 Zeros of Polynomials ............................ 37
  4.4 Horner's Method ................................. 37
  4.5 Deflation ....................................... 39
  4.6 Müller's method ................................. 40
  4.7 Exercises ....................................... 40

5 Interpolation and Polynomial Approximation .......... 43
  5.1 Weierstrass Approximation Theorem ............... 43
  5.2 Lagrange Polynomial ............................. 43
  5.3 Neville's Method ................................ 45
  5.4 Newton Interpolating Polynomial ................. 46
  5.5 Polynomial Forms ................................ 49
  5.6 Spline Interpolation ............................ 50
  5.7 Parametric Curves ............................... 52

6 Numerical Differentiation and Integration ........... 57
  6.1 Numerical Differentiation ....................... 57
  6.2 Richardson's Extrapolation ...................... 62
  6.3 Elements of Numerical Integration ............... 64
      6.3.1 Trapezoidal rule .......................... 64
      6.3.2 Simpson's rule ............................ 65
      6.3.3 Degree of precision ....................... 65
      6.3.4 Newton-Cotes Formula ...................... 66
  6.4 Composite Numerical Integration ................. 66
  6.5 Adaptive Quadrature Methods ..................... 68
  6.6 Gaussian Quadrature ............................. 72
  6.7 Improper Integrals .............................. 74

7 Initial-Value Problems for ODE ...................... 77
  7.1 Introduction .................................... 77
  7.2 Elementary Theory of Initial-Value Problems ..... 78
  7.3 Euler's Method .................................. 80
  7.4 Higher-Order Taylor Methods ..................... 82
  7.5 Runge-Kutta Methods ............................. 83
  7.6 Predictor-Corrector Methods ..................... 87

8 Direct Methods for Solving Linear Systems ........... 90
  8.1 Gaussian Elimination ............................ 91
  8.2 Pivoting Strategies ............................. 92
  8.3 Matrix Inverse .................................. 92
  8.4 Determinant of Matrix ........................... 92
  8.5 Matrix Factorization ............................ 92

9 Iterative Methods for Solving Linear Systems ........ 93
  9.1 Norms of Vectors and Matrices ................... 93
  9.2 Eigenvalues and Eigenvectors .................... 93
  9.3 Iterative Techniques for Solving Linear Systems . 93

10 Some Useful Remarks ................................ 94
  10.1 Largest Possible Root .......................... 94
  10.2 Convergence of Bisection Method ................ 94
  10.3 Convergence of False Position Method ........... 95
  10.4 Convergence of Newton-Raphson Method ........... 95
  10.5 Convergence of Secant Method ................... 96
  10.6 Convergence of Fixed Point Method .............. 97

11 Exams .............................................. 98
  11.1 Exam 1 ......................................... 98
  11.2 Exam 2 ......................................... 100
  11.3 Exam 3 ......................................... 102
Chapter 1: Mathematical Preliminaries

1.1 Review of Calculus

Definition of the limit: A function f defined on a set X of real numbers has a limit L at x0, written

    lim_{x→x0} f(x) = L,                                              (1.1)

if the following statement is true:

    ∀ε > 0 ∃δ > 0 such that 0 < |x − x0| < δ ⇒ |f(x) − L| < ε.        (1.2)

f is continuous at x0 if

    lim_{x→x0} f(x) = f(x0).                                          (1.3)
Convergence of sequences: The sequence (xn) has a limit x (converges to x) if

    ∀ε > 0 ∃N > 0 such that n > N ⇒ |xn − x| < ε.                     (1.4)
Differentiable functions: The function f is differentiable at x0 if

    f′(x0) = lim_{x→x0} (f(x) − f(x0)) / (x − x0)                     (1.5)

exists. This limit is called the derivative of f at x0. The set of all functions that have n continuous derivatives on X is denoted by C^n(X).

Theorem: If the function f is differentiable at x0, then f is continuous at x0.
Rolle's Theorem: Suppose f ∈ C[a, b] and f is differentiable on (a, b). If f(a) = f(b), then a number c ∈ (a, b) exists with f′(c) = 0.
Mean value theorem: Suppose f ∈ C[a, b] and f is differentiable on (a, b). A number c ∈ (a, b) exists with

    f′(c) = (f(b) − f(a)) / (b − a).                                  (1.6)
Extreme value theorem: If f ∈ C[a, b], then c1 and c2 exist with f(c1) ≤ f(x) ≤ f(c2) for all x ∈ [a, b]. In addition, if f is differentiable on (a, b), then c1 and c2 occur either at the endpoints of [a, b] or where f′ is zero.
Riemann integral: The Riemann integral of a function f on the interval [a, b] is the limit (provided it exists)

    ∫_a^b f(x) dx = lim_{max Δxi → 0} Σ_{i=1}^{n} f(zi) Δxi,          (1.7)

where the numbers xi satisfy a = x0 ≤ x1 ≤ ... ≤ xn = b, where Δxi = xi − x(i−1), and zi is arbitrarily chosen in [x(i−1), xi]. If the points are equally spaced and we choose zi = xi, the Riemann integral of f on [a, b] becomes the limit (provided it exists)

    ∫_a^b f(x) dx = lim_{n→∞} (b − a)/n Σ_{i=1}^{n} f(xi),            (1.8)

where xi = a + i(b − a)/n.
Weighted Mean Value Theorem for Integrals: Suppose f ∈ C[a, b], the Riemann integral of g exists on [a, b], and g(x) does not change sign on [a, b]. Then there exists a number c in (a, b) with

    ∫_a^b f(x) g(x) dx = f(c) ∫_a^b g(x) dx.                          (1.9)

When g(x) = 1, this gives the average value of the function f over the interval [a, b]:

    f(c) = 1/(b − a) ∫_a^b f(x) dx.                                   (1.10)
Generalized Rolle's Theorem: Suppose that f ∈ C[a, b] is n times differentiable on (a, b). If f(x) is zero at n + 1 distinct numbers x0, ..., xn in [a, b], then a number c in (a, b) exists with f^(n)(c) = 0.

Intermediate Value Theorem: If f ∈ C[a, b] and K is any number between f(a) and f(b), then there exists c in (a, b) for which f(c) = K.
Taylor's Theorem: Suppose f ∈ C^n[a, b], f^(n+1) exists on [a, b], and x0 ∈ [a, b]. For every x ∈ [a, b] there exists a number ξ(x) between x0 and x with f(x) = Pn(x) + Rn(x), where

    Pn(x) = Σ_{k=0}^{n} f^(k)(x0)/k! (x − x0)^k,                      (1.11)

    Rn(x) = f^(n+1)(ξ(x))/(n + 1)! (x − x0)^(n+1).                    (1.12)

Pn(x) is called the nth Taylor polynomial for f about x0, and Rn(x) is called the remainder term (or truncation error). In the case when x0 = 0, the Taylor polynomial is often called a Maclaurin polynomial. If we take n → ∞, the Taylor polynomial becomes the Taylor series for f about x0. For x0 = 0, the Taylor series is called a Maclaurin series.
Example: We want to determine an approximate value of cos(0.01) using the second Maclaurin polynomial:

    cos x = 1 − (1/2) x² + (1/6) x³ sin ξ,                            (1.13)

where ξ is a number between 0 and x. Thus

    cos(0.01) = 0.99995 + 0.16 × 10⁻⁶ sin ξ,                          (1.14)

where we use a bar over the 6 to indicate that this digit repeats indefinitely. Hence

    |cos(0.01) − 0.99995| = |0.16 × 10⁻⁶ sin ξ| ≤ 0.16 × 10⁻⁶,        (1.15)

which gives

    0.99994983 < 0.99995 − 0.16 × 10⁻⁶ ≤ cos(0.01)
               ≤ 0.99995 + 0.16 × 10⁻⁶ < 0.99995017.                  (1.16)

The error bound is much larger than the actual error. This is due in part to the poor bound we used for sin ξ. It can be shown that |sin x| ≤ |x|. Since 0 ≤ ξ ≤ 0.01, we find the bound 0.16 × 10⁻⁸. Note that eq. (1.13) can also be written as cos x = 1 − (1/2) x² + (1/24) x⁴ cos(ξ′), and then the error will be no more than (1/24) × 10⁻⁸ ≈ 0.416 × 10⁻⁹.

This example illustrates two objectives of numerical analysis: find an approximation to the solution, and determine a bound for the error.

1.1.1 Exercises
The exercises are from the textbook, Section 1.1, pages 14-16.
Tutor: Exercises 1, 2, 3, 4, 15, 23.
Students: all odd exercises except 17.
Assignment 1: Exercises 15 and 26.
Solution of exercise 13:
The Taylor polynomial P4(x) for the function f(x) = x e^(x²) is obtained from

    f(x) = x + x³ + (1/2) x⁵ + O(x⁷),    so    P4(x) = x + x³.

The remainder is

    R4(x) = f^(5)(ξ(x))/5! x⁵
          = (1/30) e^(ξ²) (15 + 90 ξ² + 60 ξ⁴ + 8 ξ⁶) x⁵
          ≤ 1.211406197 x⁵
          ≤ 0.01240479946,

where the last inequality follows by substituting ξ = x = 0.4. The integral can be approximated by

    ∫_0^0.4 f(x) dx ≈ ∫_0^0.4 (x + x³) dx = 0.086400.

The upper bound on the error is

    ∫_0^0.4 1.211406197 x⁵ dx = 0.0008269866307.

To approximate f′(0.2) using P4′(0.2) we can write

    f′(0.2) ≈ (1 + 3x²)|_{x=0.2} = 1.12,
    f′(0.2) = e^(x²) (1 + 2x²)|_{x=0.2} = 1.124075636.

The actual error is 1.124075636 − 1.12 = 0.004075636.
Solution of exercise 15:
The nth derivative of cos(x) is

    cos^(n)(x) = cos(x + nπ/2).

According to Taylor's theorem the error is

    Rn(x) = (x − x0)^(n+1)/(n + 1)! cos(ξ + (n + 1)π/2),

where ξ is between x and x0. For x = 42° and x0 = π/4 (so |x − x0| = π/60) the error is bounded by

    |Rn(x)| ≤ E = (π/60)^(n+1)/(n + 1)!:

    n = 1:  E = 0.001370778390
    n = 2:  E = 0.00002392459621
    n = 3:  E = 3.131722321 × 10⁻⁷
    n = 4:  E = 3.279531946 × 10⁻⁹

So n = 3 is sufficient to get the value accurate to 10⁻⁶. In this case P3(x) is

    P3(x) = cos(x0) − sin(x0)(x − x0) − (1/2) cos(x0)(x − x0)² + (1/6) sin(x0)(x − x0)³
          = 0.7431446016.

The actual value is cos 42° = 0.7431448255..., so the error is of the order of 0.2239 × 10⁻⁶.
Solution of exercise 26:
Let m = min[f(x1), f(x2)] and M = max[f(x1), f(x2)]. For the positive constants c1 and c2 we have

    c1 m ≤ c1 f(x1) ≤ c1 M,
    c2 m ≤ c2 f(x2) ≤ c2 M,

which leads to

    (c1 + c2) m ≤ c1 f(x1) + c2 f(x2) ≤ (c1 + c2) M,

and therefore

    m ≤ (c1 f(x1) + c2 f(x2)) / (c1 + c2) ≤ M.

Without loss of generality, assume that m = f(x1) and M = f(x2); then the last inequality gives

    f(x1) ≤ (c1 f(x1) + c2 f(x2)) / (c1 + c2) ≤ f(x2).

According to the intermediate value theorem, there exists ξ between x1 and x2 such that

    f(ξ) = (c1 f(x1) + c2 f(x2)) / (c1 + c2).
1.2 Round-off Errors

An n-digit floating-point number in base β has the form

    x = ±(.d1 d2 ... dn)_β × β^e,                                     (1.17)

where (.d1 d2 ... dn)_β is a β-fraction called the mantissa, and e is an integer called the exponent. Such a floating-point number is said to be normalized when d1 ≠ 0, or else when d1 = d2 = ... = dn = 0.
The 64-bit (binary digit) representation for a number is called a long real. The first bit is a sign indicator, denoted s. This is followed by an 11-bit exponent, c, called the characteristic, and a 52-bit binary fraction, f, called the mantissa; the base for the exponent is 2. The 52 binary digits can hold up to 16 decimal digits. The 11-bit exponent gives a range of 0 to 2¹¹ − 1 = 2047. To allow numbers of small magnitude, the exponent is shifted (biased) by 1023, so the actual range is from 0 − 1023 = −1023 to 2047 − 1023 = 1024.

To save storage and provide a unique representation for each floating-point number, we use the normalized form

    (−1)^s 2^(c−1023) (1 + f).                                        (1.18)
Example: Consider the machine number

    0 10000000011 1011100100010000000000000000000000000000000000000000

The leftmost bit is zero, so the number is positive. The next eleven bits give the characteristic, (10000000011)₂ = 1 + 2 + 2¹⁰ = 1027, so the exponent part is 2^(1027−1023) = 2⁴. The final 52 bits specify the mantissa

    f = (.101110010001)₂ = 1/2 + 1/2³ + 1/2⁴ + 1/2⁵ + 1/2⁸ + 1/2¹² = 0.722900390625.

This number represents

    (−1)^s 2^(c−1023) (1 + f) = 2⁴ × (1 + 0.722900390625) = 27.56640625.

The next smallest machine number is represented by

    0 10000000011 1011100100001111111111111111111111111111111111111111

and the next largest by

    0 10000000011 1011100100010000000000000000000000000000000000000001

[Figure 1.1: the machine number 27.56640625 with its next smallest and next largest machine-number neighbors.]

This means that our original number represents not only 27.56640625, but also half of the real numbers that lie between it and its two nearest machine-number neighbors (see Fig. 1.1).
Underflow and overflow: ..............

Round-off errors: Round-off errors arise because it is impossible to represent all real numbers exactly on a finite-state machine (which is what all practical digital computers are). On a pocket calculator, if one enters 0.0000000000001 (or the maximum number of zeros possible), then a '+', and then 100000000000000 (again, with the maximum number of zeros possible), one obtains the number 100000000000000 again, and not 100000000000000.0000000000001. The calculator's answer is incorrect because of round-off in the calculation.
Round-off errors in a computer¹: The most basic source of errors in a computer is the error in representing a real number with a limited number of bits. The machine epsilon, ε, is the interval between 1 and the next number greater than 1 that is distinguishable from 1. This means that no number between 1 and 1 + ε can be represented in the computer. Machine epsilon can be found by the following program:

10 E=1
20 IF E+1>1 THEN PRINT E ELSE STOP
30 E=E/2: GOTO 20
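As a cross-check, the same halving search can be written in Python (our sketch, not part of the original notes); on IEEE 754 double-precision hardware it terminates at 2⁻⁵²:

```python
# Find machine epsilon by repeated halving, mirroring the BASIC
# program above: halve eps until 1 + eps/2 is indistinguishable from 1.
eps = 1.0
while 1.0 + eps / 2 > 1.0:
    eps /= 2

print(eps)  # for IEEE 754 double precision this is 2**-52
```

The loop stops when 1 + eps/2 rounds back to 1, leaving eps equal to the spacing of doubles just above 1.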
When numbers are added or subtracted, an accurate representation of the result may require many more digits than the operands themselves. Serious amounts of round-off error occur in two situations:

1. when adding (or subtracting) a very small number to (or from) a large number;
2. when a number is subtracted from another that is very close to it.

To test the first case on the computer, let us add 0.00001 to unity ten thousand times. The program to do this job would be:

10 sum=1
20 for i=1 to 10000
30 sum=sum+0.00001
40 next
50 print*, sum
The result of this program would be sum = 1.100136. Since the exact answer is 1.1, the relative error of this computation is

    (1.100136 − 1.1)/1.1 = 0.000124 = 0.0124%.

The cause of this round-off error can be understood as follows. Consider the computation of 1 + 0.00001. The binary representations of 1 and 0.00001 in 32-bit single precision are, respectively,

    (1)₁₀       = (0.1000 0000 0000 0000 0000 0000)₂ × 2¹
    (0.00001)₁₀ = (0.1010 0111 1100 0101 1010 1100)₂ × 2⁻¹⁶

Adjusting the exponents, we get

      (0.10000 0000 0000 0000 0000 0000 0000 0000 0000 0000)₂ × 2¹
    + (0.00000 0000 0000 0000 1010 0111 1100 0101 1010 1100)₂ × 2¹
    = (0.10000 0000 0000 0000 1010 0111 1100 0101 1010 1100)₂ × 2¹

Since only 24 bits are available for the mantissa, rounding gives

    (0.1000 0000 0000 0000 0101 0100)₂ × 2¹,

which is equivalent to (1.0000100136)₁₀. Thus, whenever 0.00001 is added to 1, the sum actually gains 0.0000100136. Ten thousand such additions therefore accumulate 10000 × 0.0000100136 = 0.100136, giving the result 1.100136 instead of 1.1. Although the calculated result gains in the present example, it can lose if some digits are cut off. Loss and gain are both referred to as round-off error.

¹ Applied Numerical Methods with Software, Shoichiro Nakamura
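The same experiment in double precision still shows a much smaller, but nonzero, accumulation error, because 0.00001 has no exact binary representation at any precision. This Python sketch is ours, not part of the notes:

```python
# Add 0.00001 to 1 ten thousand times in IEEE 754 double precision.
# 0.00001 is not exactly representable in binary, so a small error
# accumulates even with a 52-bit mantissa.
total = 1.0
for _ in range(10000):
    total += 0.00001

print(total)          # close to 1.1, but not exactly 1.1
print(total - 1.1)    # the accumulated round-off error
```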
Strategies to minimize round-off errors:

1. Double precision
2. Grouping
3. Taylor expansion
4. Changing the definition of a variable
5. Rewriting the equation to avoid subtractions
Example: We want to add 0.00001 ten thousand times to unity by using (a) double precision, and (b) the grouping method.

Double precision method:

10    SUM=1.0D0
20    DO I=1,10000
30    SUM=SUM+0.00001D0
40    END DO
50    PRINT *, SUM

Grouping method:

      SUM=1
      DO 47 I=1,100
      TOTAL=0
      DO 40 K=1,100
      TOTAL=TOTAL+0.00001
40    CONTINUE
      SUM=SUM+TOTAL
47    CONTINUE
      PRINT *, SUM
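The effect of grouping can be reproduced with Python's `decimal` module by forcing 5-significant-digit arithmetic (our illustration; the notes use FORTRAN). At this precision 1 + 0.00001 rounds straight back to 1, so the naive loop never moves, while grouping sums of 100 terms into exactly representable sub-totals:

```python
from decimal import Decimal, getcontext

getcontext().prec = 5          # simulate 5-significant-digit arithmetic
step = Decimal("0.00001")

# Naive accumulation: 1 + 0.00001 rounds to 1.0000, so nothing accumulates.
naive = Decimal(1)
for _ in range(10000):
    naive += step

# Grouping: build sub-totals of 100 terms first, then add them to the sum.
grouped = Decimal(1)
for _ in range(100):
    subtotal = Decimal(0)
    for _ in range(100):
        subtotal += step       # 100 * 0.00001 = 0.001, exactly representable
    grouped += subtotal

print(naive)    # 1.0000 : all ten thousand additions were lost
print(grouped)  # 1.100  : grouping recovers the correct sum
```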
Example: As θ approaches 0, the accuracy of a direct numerical evaluation of

    d = (sin(1 + θ) − sin(1)) / θ

becomes very poor because of round-off errors. Using a Taylor expansion we can write

    sin(1 + θ) = sin(1) + θ cos(1) − 0.5 θ² sin(1) + ...

Therefore

    d ≈ D = cos(1) − 0.5 θ sin(1).

The accuracy of the approximation D improves as θ approaches 0, while the directly computed d deteriorates. A FORTRAN program to compare the two is:

      program testeps
      implicit none
      real :: d, da, t=1.0e0, h=10.0e0
      integer :: i
      do i=1,7
        t=t/h
        da=cos(1.0e0)-0.5e0*t*sin(1.0e0)
        d=(sin(1.0e0+t)-sin(1.0e0))/t
        print*, t, d, da
      end do
      end

The output is:

    theta            d            D
    ----------------------------------------
    0.1000000        0.49736413   0.49822873
    9.99999978E-03   0.53608829   0.53609490
    9.99999931E-04   0.53993475   0.53988153
    9.99999902E-05   0.54062998   0.54026020
    9.99999884E-06   0.54383242   0.54029804
    9.99999884E-07   0.54327762   0.54030186
Laws of arithmetic: Due to the errors introduced in floating-point arithmetic, the associative and distributive laws of arithmetic are not always satisfied. That is,

    x + (y + z) ≠ (x + y) + z,
    x × (y × z) ≠ (x × y) × z,
    x × (y + z) ≠ (x × y) + (x × z).

Example: Let x = 0.456732 × 10⁻², y = 0.243451, and z = −0.248000. In six-digit arithmetic,

    x + y       = 0.00456732 + 0.243451 = 0.248018
    (x + y) + z = 0.248018 − 0.248000   = 0.000018   = 0.180000 × 10⁻⁴
    y + z       = 0.243451 − 0.248000   = −0.0045490
    x + (y + z) = 0.00456732 − 0.00454900 = 0.0000183200 = 0.183200 × 10⁻⁴

so the two orders of summation give different results.
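This example can be replayed with Python's `decimal` module at 6 significant digits (our illustration, not from the notes):

```python
from decimal import Decimal, getcontext

getcontext().prec = 6  # six-significant-digit arithmetic, as in the example

x = Decimal("0.00456732")
y = Decimal("0.243451")
z = Decimal("-0.248000")

left = (x + y) + z   # 0.248018 - 0.248000
right = x + (y + z)  # 0.00456732 - 0.004549

print(left, right)   # the two groupings disagree
```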
Chopping and rounding: Let us use the normalized decimal floating-point form

    ±0.d1 d2 ... dk × 10^n,                                           (1.19)

where 1 ≤ d1 ≤ 9 and 0 ≤ di ≤ 9. Such a number is called a k-digit decimal machine number. Any positive real number within the numerical range of the machine can be normalized to the form

    y = 0.d1 d2 ... dk d(k+1) d(k+2) ... × 10^n.                      (1.20)

The floating-point form of y, denoted fl(y), is obtained by terminating the mantissa of y at k decimal digits. There are two ways of performing this termination. One, called chopping, is simply to chop off the digits d(k+1) d(k+2) .... The other, called rounding, adds 5 × 10^(n−(k+1)) to y and then chops the result. So, when rounding, if d(k+1) ≥ 5 we add 1 to dk to obtain fl(y); we round up. When d(k+1) < 5, we merely chop off all but the first k digits; we round down. When rounding up, the digits and even the exponent might change.
Example: The number π = 0.314159... × 10¹. The floating-point form of π using five-digit chopping is fl(π) = 0.31415 × 10¹ = 3.1415. The floating-point form of π using five-digit rounding is 3.1416, because the sixth digit of the expansion of π is 9 > 5.
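The two terminations can be sketched as Python helper functions (the function names and structure are ours, not from the notes; rounding is implemented as the "add 5 in digit k+1, then chop" rule from the text):

```python
import math

def fl_chop(y, k):
    """k-digit decimal chopping: terminate the mantissa after k digits."""
    if y == 0:
        return 0.0
    n = math.floor(math.log10(abs(y))) + 1   # exponent with mantissa in [0.1, 1)
    mantissa = y / 10**n
    return math.trunc(mantissa * 10**k) / 10**k * 10**n

def fl_round(y, k):
    """k-digit decimal rounding: add 5 in digit k+1, then chop."""
    if y == 0:
        return 0.0
    n = math.floor(math.log10(abs(y))) + 1
    mantissa = abs(y) / 10**n
    rounded = math.floor(mantissa * 10**k + 0.5) / 10**k * 10**n
    return math.copysign(rounded, y)

print(fl_chop(math.pi, 5))   # close to 3.1415
print(fl_round(math.pi, 5))  # close to 3.1416
```

Note that these helpers work in binary doubles, so the results match the decimal values only up to double-precision round-off.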
Absolute error and relative error: If p* is an approximation to p, the absolute error is |p − p*|, and the relative error is |p − p*| / |p|.
Significant digits: The number p* is said to approximate p to t significant digits if t is the largest nonnegative integer for which

    |p − p*| / |p| ≤ 5 × 10⁻ᵗ.                                        (1.21)

A more formal definition of significant digits is the following. Let the true value have digits p = d1 d2 ... dk d(k+1) ... dn, and let the approximate value have p* = d1 d2 ... dk e(k+1) ... en, where d1 ≠ 0 and the first difference in the digits occurs at the (k+1)st digit. We then say that p and p* agree to k significant digits if |d(k+1) − e(k+1)| < 5. Otherwise, we say they agree to k − 1 significant digits.
Example: Let the true value be p = 10/3 and the approximate value p* = 3.333. The absolute error is |10/3 − 3.333| = 1/3000. The relative error is (1/3000)/(10/3) = 1/10000 = 10⁻⁴ ≤ 5 × 10⁻⁴, so the number of significant digits is t = 4.
Assume that the floating-point representations fl(x) and fl(y) are given for the real numbers x and y, and that the symbols ⊕, ⊖, ⊗, ⊘ represent machine addition, subtraction, multiplication, and division operations, respectively. The finite-digit arithmetic is given by

    x ⊕ y = fl(fl(x) + fl(y))
    x ⊖ y = fl(fl(x) − fl(y))
    x ⊗ y = fl(fl(x) × fl(y))
    x ⊘ y = fl(fl(x) ÷ fl(y))

For k-digit chopping we have

    |(y − fl(y)) / y| ≤ 10^(−k+1).                                    (1.22)

For k-digit rounding we have

    |(y − fl(y)) / y| ≤ 0.5 × 10^(−k+1).                              (1.23)
One of the most common errors involves the cancellation of significant digits due to the subtraction of two nearly equal numbers. Suppose we have two nearly equal numbers x and y, with x > y, whose floating-point forms agree in the first p digits:

    fl(x) = 0.d1 d2 ... dp α(p+1) α(p+2) ... αk × 10^n
    fl(y) = 0.d1 d2 ... dp β(p+1) β(p+2) ... βk × 10^n

The floating-point form of x − y then takes the form

    fl(fl(x) − fl(y)) = 0.σ(p+1) σ(p+2) ... σk × 10^(n−p),

where 0.σ(p+1) ... σk = 0.α(p+1) ... αk − 0.β(p+1) ... βk. The floating-point number used to represent x − y therefore has at most k − p digits of significance, and any further calculation involving x − y retains the problem of having only k − p digits of significance.
Loss of significance: Consider, for example, x* = 0.76545421 × 10¹ and y* = 0.76544200 × 10¹ as approximations to x and y, respectively, correct to seven significant digits. Then, in eight-digit floating-point arithmetic, the difference is z* = x* − y* = 0.12210000 × 10⁻³. But as an approximation to z = x − y it is good only to three digits, since the fourth significant digit of z* is derived from the eighth digits of x* and y*, both possibly in error. Hence, while the error in z* is at most the sum of the errors in x* and y*, the relative error in z* is possibly 10000 times the relative error in x* or y*. Loss of significant digits is therefore dangerous only if we wish to keep the relative error small.
We can also introduce error when dividing by a small number (or, equivalently, multiplying by a large number). Suppose, for example, that the number z has a finite-digit approximation z + δ, where the error δ is introduced by representation or by a previous calculation. If we divide by ε = 10⁻ⁿ, where n > 0, then

    z/ε ≈ fl( fl(z) / fl(ε) ) = (z + δ) × 10ⁿ,

so the absolute error in this approximation, |δ| × 10ⁿ, is the original absolute error |δ| multiplied by the factor 10ⁿ.
Example: Let p = 0.54617 and q = 0.54601. The exact value of r = p − q is r = 0.16 × 10⁻³. If we perform the subtraction using 4-digit rounding, we find p* = 0.5462 and q* = 0.5460, so r* = p* − q* = 0.2 × 10⁻³. The relative error is

    |r − r*| / |r| = 0.25,
which shows that r* has only one significant digit, whereas p* and q* were accurate to four and five significant digits, respectively.

Example: The quadratic formula states that the roots of ax² + bx + c = 0, when a ≠ 0, are

    x± = (−b ± √(b² − 4ac)) / (2a).                                   (1.24)

Using four-digit rounding arithmetic, consider this formula applied to x² + 62.10x + 1 = 0, whose roots are approximately x+ = −0.01610723 and x− = −62.08390. Since b² ≫ 4ac, the numerator of x+ involves the subtraction of two nearly equal numbers: with √(b² − 4ac) = 62.06 we get fl(x+) = (−62.10 + 62.06)/2 = −0.02000, which is a poor approximation to x+ = −0.01611, with a relative error of about 2.4 × 10⁻¹. The other root, fl(x−) = −62.10, has a small relative error of about 3.2 × 10⁻⁴.

To obtain a more accurate root we can rationalize the numerator:

    x+ = (−b + √(b² − 4ac)) / (2a)
       = [(−b + √(b² − 4ac)) (−b − √(b² − 4ac))] / [2a (−b − √(b² − 4ac))]
       = −2c / (b + √(b² − 4ac)),

so we get fl(x+) = −0.01610, which has the small relative error 6.2 × 10⁻⁴. We can also derive a formula for x−:

    x− = −2c / (b − √(b² − 4ac)).

In this case, however, fl(x−) = −50.00, which has the large relative error 1.9 × 10⁻¹.
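A stable double-precision implementation picks the rationalized formula for whichever root would suffer cancellation (our sketch; the function and variable names are ours):

```python
import math

def quadratic_roots(a, b, c):
    """Roots of a*x^2 + b*x + c = 0 (real roots assumed), avoiding cancellation.

    The root whose numerator would subtract nearly equal numbers is
    computed from the rationalized formula 2c / (-b -/+ sqrt(b^2 - 4ac)).
    """
    disc = math.sqrt(b * b - 4 * a * c)
    if b >= 0:
        x1 = (-b - disc) / (2 * a)   # -b and -disc have the same sign: no cancellation
        x2 = (2 * c) / (-b - disc)   # rationalized form for the other root
    else:
        x1 = (-b + disc) / (2 * a)
        x2 = (2 * c) / (-b + disc)
    return x1, x2

x1, x2 = quadratic_roots(1.0, 62.10, 1.0)
print(x1, x2)   # approximately -62.08390 and -0.01610723
```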
Example: This example shows how we can avoid loss of significance. We want to evaluate f(x) = 1 − cos(x) near zero in six-digit arithmetic. Since cos(x) ≈ 1 for x near zero, there will be a loss of significant digits if we first compute cos(x) and then subtract it from 1. Without loss of generality, assume that x is close to zero with x > 0. We have

    cos(x) = 0.a1 a2 a3 a4 a5 a6 a7 ...

If x = x0 is close enough to zero, then

    cos(x0) = 0.999999 a7 ...,

and the difference is

    1 − cos(x0) = 0.100000 × 10⁻⁵ − 0.a7 a8 a9 ... × 10⁻⁶.

If we use rounding and a7 ≥ 5, we cannot calculate the value of 1 − cos(x) in six-digit arithmetic at all for x ≤ x0, because the rounded value of 1 − cos(x) is zero. For example, 1 − cos(0.001) evaluates to 0.000000, although its true value is 0.500000 × 10⁻⁶. To overcome this we can use another formula:

    1 − cos(x) = (1 − cos(x))(1 + cos(x)) / (1 + cos(x)) = sin²(x) / (1 + cos(x)).

Using this last equation we find, for x = 0.001,

    1 − cos(0.001) = sin²(0.001) / (1 + cos(0.001)) = (0.1 × 10⁻⁵) / 2 = 0.5 × 10⁻⁶.

We can also use a Taylor polynomial:

    1 − cos(x) ≈ x²/2 − x⁴/24 + ...,

which gives

    1 − cos(0.001) ≈ 0.001²/2 − 0.001⁴/24 + ...
                   = 0.5 × 10⁻⁶ − 0.416667 × 10⁻¹³ + ...
                   ≈ 0.5 × 10⁻⁶.
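The same cancellation appears in double precision for small enough x. This comparison of the naive and rewritten formulas is our sketch, not part of the notes:

```python
import math

def one_minus_cos_naive(x):
    # Subtracts two nearly equal numbers: catastrophic cancellation near 0.
    return 1.0 - math.cos(x)

def one_minus_cos_stable(x):
    # Rewritten as sin^2(x) / (1 + cos(x)): no cancellation near 0.
    s = math.sin(x)
    return s * s / (1.0 + math.cos(x))

x = 1e-8
print(one_minus_cos_naive(x))   # cancellation destroys all digits
print(one_minus_cos_stable(x))  # close to the true value x**2 / 2 = 5e-17
```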
Example: The value of the polynomial p(x) = 2x³ − 3x² + 5x − 4 at x = 3 can be calculated as:

⋆ Directly: x² = 9, x³ = 27, then we put everything together, p(3) = 54 − 27 + 15 − 4 = 38. We have five multiplications (x², x³, 2x³, 3x², 5x), one addition, and two subtractions: 8 operations in total.

⋆ The polynomial can be arranged in nested form as p(x) = [(2x − 3)x + 5]x − 4. We then need three multiplications, one addition, and two subtractions: six operations in total.

– In general, for a polynomial of degree n, the direct evaluation needs (n − 1) + n = 2n − 1 multiplications: n − 1 for x², x³, ..., xⁿ, and n for multiplying the coefficients, an xⁿ, a(n−1) x^(n−1), ..., a1 x. The nested form needs only n multiplications.

– Both need n addition/subtraction operations.
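The nested arrangement is Horner's method (treated later, in Section 4.4); a minimal Python sketch:

```python
def horner(coeffs, x):
    """Evaluate a polynomial by Horner's (nested) method.

    coeffs are given from the highest degree down:
    [2, -3, 5, -4] represents 2x^3 - 3x^2 + 5x - 4.
    Uses n multiplications and n additions for degree n.
    """
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

print(horner([2, -3, 5, -4], 3))  # 38.0, matching the example above
```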
1.2.1 Exercises

Assignment: odd exercises from Section 1.2, pages 26-29.
Tutorial: Exercise 1.

Exercise 1: p = π = 3.1415926... and p* = 22/7 = 3.142857... The absolute error satisfies

    0.0012644 < |p − p*| = |3.142857 − 3.1415926...| < 0.0012645.

If we round it, the absolute error is 0.00126. The relative error satisfies

    0.0012644/3.1415927 < |p − p*|/|p| < 0.0012645/3.1415926,
    4.0247 × 10⁻⁴ < |p − p*|/|p| < 4.0250 × 10⁻⁴.

If we round it, the relative error is 4.025 × 10⁻⁴.
The relative error in p*, as an approximation to p, is defined with |p| in the denominator: α = |p − p*|/|p|. Note that this number is close to |p − p*|/|p*| if α ≪ 1. One can show that

    |p − p*|/|p| = α  ⇒  |p − p*|/|p*| = α / |1 ± α| ≈ α.             (1.25)
Exercise 5: Three-digit rounding arithmetic for:

a) 133 + 0.921 = 133.921 → 134
b) 133 − 0.499 = 132.501 → 133
c) (121 − 0.327) − 119 = 120.673 − 119 → 121 − 119 = 2
d) (121 − 119) − 0.327 = 2 − 0.327 = 1.673 → 1.67
e) (13/14 − 6/7)/(2e − 5.4) → (0.929 − 0.857)/(5.44 − 5.4) = 0.072/0.04 = 1.80

The absolute and relative errors are:
a) 0.79 × 10⁻¹ and 0.59 × 10⁻³
b) 0.499 and 0.377 × 10⁻²
e) 0.154 and 0.0786
Assignment: Suppose two points (x0, y0) and (x1, y1) are on a straight line with y0 ≠ y1. The x-intercept of the line is given by

x = (x0 y1 − x1 y0)/(y1 − y0)    or    x = x0 − y0 (x1 − x0)/(y1 − y0)

Group 1: Use the data (x0, y0) = (1.31, 3.24) and (x1, y1) = (1.93, 4.76) and three-digit rounding arithmetic to compute the x-intercept both ways. Which method is better and why?

Group 2: Use the data (x0, y0) = (0.2, 0.2) and (x1, y1) = (1.2, 1.01) and three-digit rounding arithmetic to compute the x-intercept both ways. Which method is better and why?

Solution:

    X1 := (x0 y1 - x1 y0) / (y1 - y0)
    X2 := x0 - y0 (x1 - x0) / (y1 - y0)

Group 1:
    x0 := 1.31   y0 := 3.24   x1 := 1.93   y1 := 4.76
    Actual solution is -0.0115789473737
    X1 := -0.00658                          X2 := -0.01
    Relative error for X1 is 0.4317272728   Relative error for X2 is 0.1363636365
    X2 is better than X1.

Group 2:
    x0 := 0.2    y0 := 0.2    x1 := 1.2    y1 := 1.01
    Actual solution is -0.0469135802025
    X1 := -0.0469                           X2 := -0.047
    Relative error for X1 is 0.000289473750 Relative error for X2 is 0.00184210519
    X1 is better than X2.
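This assignment is easy to replay in code. The sketch below (the names rnd3, x_intercept_v1, and x_intercept_v2 are mine, not the textbook's) simulates three-digit rounding arithmetic by rounding every intermediate result to three significant digits:

```python
def rnd3(x):
    """Round x to three significant digits, mimicking 3-digit rounding arithmetic."""
    return float(f"{x:.3g}")

def x_intercept_v1(x0, y0, x1, y1, r=lambda v: v):
    """x = (x0*y1 - x1*y0) / (y1 - y0); r rounds each intermediate result."""
    return r(r(r(x0 * y1) - r(x1 * y0)) / r(y1 - y0))

def x_intercept_v2(x0, y0, x1, y1, r=lambda v: v):
    """x = x0 - y0*(x1 - x0) / (y1 - y0)."""
    return r(x0 - r(r(y0 * r(x1 - x0)) / r(y1 - y0)))

# Group 1 data: the first formula loses accuracy to cancellation (6.24 - 6.25)
x1_rounded = x_intercept_v1(1.31, 3.24, 1.93, 4.76, rnd3)   # -0.00658
x2_rounded = x_intercept_v2(1.31, 3.24, 1.93, 4.76, rnd3)   # -0.01
exact = x_intercept_v1(1.31, 3.24, 1.93, 4.76)              # about -0.0115789
```

Passing `rnd3` as the rounding hook keeps the same code usable for both exact and three-digit arithmetic, which makes the cancellation in the first formula easy to see side by side.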
Chapter 2

Solution of Equations in One Variable

Finding the roots of a function f is very important in science and engineering, and it is not always simple. Consider the function

f(x) = (1 − x)^8 = 1 − 8x + 28x^2 − 56x^3 + 70x^4 − 56x^5 + 28x^6 − 8x^7 + x^8

It is clear that x = 1 is the only real root of f. The graph of f (computed from the expanded form) is given in Fig. 2.1. It appears to show many roots for f, because the computed values of f(x) oscillate between positive and negative values near x = 1.
Figure 2.1: The strange behavior of f(x) near x = 1 is due to round-off errors in the computation of the expanded form of f(x).
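The round-off effect of Fig. 2.1 is easy to reproduce. This sketch (function names are mine) evaluates the expanded and factored forms near x = 1; the factored form is always nonnegative, while the expanded form is dominated by cancellation noise:

```python
def f_expanded(x):
    # (1 - x)^8 multiplied out: binomial coefficients with alternating signs
    coeffs = [1, -8, 28, -56, 70, -56, 28, -8, 1]   # a0, a1, ..., a8
    return sum(c * x**k for k, c in enumerate(coeffs))

def f_factored(x):
    return (1 - x) ** 8

# Near x = 1 the true value is tiny (at x = 0.99 it is (0.01)^8 = 1e-16),
# below the round-off noise incurred while summing terms as large as 70.
xs = [0.97 + 0.0005 * i for i in range(121)]    # the x-range of Fig. 2.1
noise = max(abs(f_expanded(x) - f_factored(x)) for x in xs)
```

Plotting `f_expanded` over `xs` reproduces the oscillation of Fig. 2.1, while `f_factored` stays at essentially zero.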
2.1 Bisection Method

Definition: The first technique, based on the Intermediate Value Theorem, is called the bisection method. To begin, set a1 = a and b1 = b, and let p1 be the midpoint of the interval [a, b]; that is,

p1 = a1 + (b1 − a1)/2 = (a1 + b1)/2        (2.1)

If f(p1) = 0, then the root of f(x) = 0 is p = p1. If f(p1) ≠ 0, then f(p1) has the same sign as either f(a1) or f(b1). When f(p1) and f(a1) have the same sign, p ∈ (p1, b1), and we set a2 = p1 and b2 = b1. When f(p1) and f(b1) have the same sign, p ∈ (a1, p1), and we set a2 = a1 and b2 = p1. We then reapply the process to the interval [a2, b2].
Algorithm:
INPUT: endpoints a, b; tolerance TOL; maximum number of iterations N0.
OUTPUT: approximate solution p or message of failure.
Step 1: Set i = 1; FA = f(a);
Step 2: while i ≤ N0 do Steps 3-6.
Step 3:   set p = a + (b − a)/2; (compute p_i) FP = f(p);
Step 4:   if FP = 0 or (b − a)/2 < TOL then OUTPUT (p); (procedure terminated successfully) STOP.
Step 5:   set i = i + 1;
Step 6:   if FA · FP > 0 then set a = p; FA = FP; (compute a_i, b_i) else set b = p.
Step 7: OUTPUT ("method failed after N0 iterations"); STOP.

The stopping test in Step 4 can instead be built by selecting a tolerance ε > 0 and one of the following conditions:

|p_N − p_{N−1}| < ε        (2.2)
|p_N − p_{N−1}| / |p_N| < ε,  p_N ≠ 0        (2.3)
|f(p_N)| < ε        (2.4)

The first and the last are not good measures of the tolerance (see Ex. 16 and 17, page 52). The middle one, which is a relative error, is better than the other two. If you apply the bisection method to the function f(x) = 1/(x − 1) on the interval [0, 2], you will find that the method converges to the singularity at x = 1: a stopping test based on (2.4) will never be satisfied for any number of iterations N0, and the method fails.
To start the algorithm we need to check that the sign of the product f(a)·f(b) ≤ 0. We can use the sign function defined by

sign(x) = −1 if x < 0;  0 if x = 0;  1 if x > 0        (2.5)

and simply test sign(f(a))·sign(f(b)) instead of f(a)·f(b). It is good practice to set an upper bound on the number of iterations; this eliminates the possibility of entering an infinite loop. It is also good to choose the interval [a, b] as small as possible, so we can reduce the number of iterations. Bisection is slow to converge: N may become quite large for a small tolerance.
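A direct Python transcription of the algorithm might look like this (a sketch; the function and variable names are mine, not the textbook's):

```python
def bisect(f, a, b, tol=1e-4, nmax=100):
    """Bisection method for f(x) = 0 on [a, b].

    Returns (p, n): the approximate root and the number of iterations used.
    Raises ValueError if f(a) and f(b) do not have opposite signs.
    """
    fa = f(a)
    if fa * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for n in range(1, nmax + 1):
        p = a + (b - a) / 2          # midpoint, computed the stable way (eq. 2.11 below)
        fp = f(p)
        if fp == 0 or (b - a) / 2 < tol:
            return p, n
        if fa * fp > 0:              # root lies in (p, b)
            a, fa = p, fp
        else:                        # root lies in (a, p)
            b = p
    raise RuntimeError(f"method failed after {nmax} iterations")

# Example: cube root of 25 (Ex. 13 below), accurate to within 1e-4
p, n = bisect(lambda x: x**3 - 25, 2, 3, tol=1e-4)
```

The run stops at iteration 14, consistent with the bound (b − a)/2^n < 10^−4 derived in Ex. 13.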
Theorem: Suppose that f ∈ C[a, b] and f(a)·f(b) < 0. The bisection method generates a sequence {p_n} approximating a zero p of f with

|p_n − p| ≤ (b − a)/2^n,  n ≥ 1        (2.6)

Proof: For each n ≥ 1, we have a1 = a, b1 = b, and

b_n − a_n = (b − a)/2^{n−1}        (2.7)

Since p_n is the midpoint of [a_n, b_n] and the root p lies in (a_n, b_n), we also have

|p_n − p| ≤ (b_n − a_n)/2 = (b − a)/2^n        (2.10)

which shows that the sequence {p_n} converges to p.

The bound for the number of iterations assumes calculations performed in infinite-digit arithmetic. When implementing the method on a computer, we have to consider round-off error. For example, the midpoint of [a_n, b_n] should be found from the equation

p_n = a_n + (b_n − a_n)/2        (2.11)

instead of from the algebraically equivalent equation

p_n = (a_n + b_n)/2        (2.12)

The first equation adds a small correction (b_n − a_n)/2 to the known value a_n; if b_n − a_n is near the maximum precision of the machine, this correction will not significantly affect p_n. However, (a_n + b_n)/2 may return a midpoint that is not even in the interval [a_n, b_n].
Exercises: Odd numbers of sec. 2.1, pages 51-52.

– Ex 13: Find an approximate value of ∛25 correct to within 10^−4. Consider the function f(x) = x^3 − 25. We can choose the interval [2, 3]: we have f(2) = −17 and f(3) = 2. The two values have different signs, so we can apply the bisection method.
n    a_n         b_n         p_n         b_n − a_n    f(p_n)
1    2           3           2.5         1            −9.3750
2    2.5         3           2.75        0.5          −4.2031
3    2.75        3           2.875       0.25         −1.2363
4    2.875       3           2.9375      0.125        +0.34741
5    2.875       2.9375      2.90625     0.0625       −0.45297
6    2.90625     2.9375      2.921875    0.031250     −0.054920
7    2.921875    2.9375      2.9296875   0.015625     +0.145710
8    2.921875    2.9296875   2.9257813   0.0078125    +0.0452607
9    2.921875    2.9257813   2.9238281   0.0039062    −0.0048632
10   2.9238281   2.9257813   2.9248047   1.9531E−03   +2.0190E−02
11   2.9238281   2.9248047   2.9243164   9.7656E−04   +7.6615E−03
12   2.9238281   2.9243164   2.9240723   4.8828E−04   +1.3986E−03
13   2.9238281   2.9240723   2.9239502   2.4414E−04   −1.7324E−03
14   2.9239502   2.9240723   2.9240112   1.2207E−04   −1.6692E−04
So, the approximate value of ∛25 is p14 = 2.9240, because (b14 − a14)/2 = 6.1035E−05 < 10^−4, while (b13 − a13)/2 = 1.2207E−04 is not. Alternatively, if we use the Theorem:

|p_n − p| ≤ (b − a)/2^n < 10^−4        (2.13)

we find that

1/2^n < 10^−4  ⇒  −n log10(2) < −4  ⇒  n > 4/log10(2) = 13.288

So, n should be at least 14.
– Ex 18: The function f(x) = sin(πx) has zeros at every integer. We want to show that when −1 < a < 0 and 2 < b < 3, the bisection method converges to:

0  if a + b < 2
2  if a + b > 2
1  if a + b = 2

We have to check the signs of sin(aπ), sin((a + b)π/2), and sin(bπ) at each iteration. For the starting interval we have: for a ∈ (−1, 0), sin(aπ) ∈ [−1, 0), and for b ∈ (2, 3), sin(bπ) ∈ (0, 1]. So the bisection method applies on the interval [a, b]. The only roots we can reach are 0, 1, or 2, because these are the only integers belonging to (−1, 3). Next, we check the sign of sin((a + b)π/2), knowing that a + b ∈ (1, 3).

If a + b < 2, then p = (a + b)/2 satisfies 0.5 < p < 1 and sin(pπ) > 0. So sin(aπ) < 0 and sin(pπ) > 0 have different signs, and the only root between a and p is 0. Therefore the bisection method converges to 0.

If a + b > 2, then p = (a + b)/2 satisfies 1 < p < 1.5 and sin(pπ) < 0. So sin(pπ) < 0 and sin(bπ) > 0 have different signs, and the only root between p and b is 2. Therefore the bisection method converges to 2.

If a + b = 2, then p = 1 and sin(pπ) = 0, which of course is the root 1.
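The two strict-inequality cases can be checked numerically with a bare-bones bisection (a sketch; the helper name is mine). The a + b = 2 case is fragile in floating point, since sin(π) does not evaluate to exactly zero, so only the inequalities are exercised here:

```python
import math

def bisect_limit(a, b, steps=60):
    """Run plain bisection on f(x) = sin(pi*x) and return the final midpoint."""
    f = lambda x: math.sin(math.pi * x)
    for _ in range(steps):
        p = a + (b - a) / 2
        if f(p) == 0:
            return p
        if f(a) * f(p) > 0:
            a = p
        else:
            b = p
    return a + (b - a) / 2

root_low = bisect_limit(-0.9, 2.1)   # a + b = 1.2 < 2 -> converges to 0
root_high = bisect_limit(-0.5, 2.7)  # a + b = 2.2 > 2 -> converges to 2
```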
We will solve exercises 1,3,11,15,16 from sec 2.1 page 51-52.
2.2 Fixed-Point Iteration

Definition: A number p is a fixed point for a given function g if

g(p) = p        (2.14)

Theorem: If g ∈ C[a, b] and g(x) ∈ [a, b] for all x ∈ [a, b], then g has a fixed point in [a, b]. If, in addition, g′(x) exists on (a, b) and a positive constant k < 1 exists with

|g′(x)| ≤ k,  for all x ∈ (a, b)        (2.15)

then the fixed point in [a, b] is unique.

Proof: Consider the function f(x) = g(x) − x. We have the following relations:

f(b) = g(b) − b ≤ 0
f(a) = g(a) − a ≥ 0        (2.16)

So, according to the Intermediate Value Theorem, f(x) = 0 has a root. Thus g(x) = x has a solution, and therefore g has a fixed point. For uniqueness, assume there is more than one fixed point, say p and q with p ≠ q. We know from the Mean Value Theorem that there exists ξ between p and q such that

(g(p) − g(q))/(p − q) = g′(ξ) = 1        (2.17)

which contradicts the fact that |g′(x)| ≤ k < 1. So there is only one fixed point.

What about the case |g′(x)| > 1: does a function satisfying the other hypotheses exist with |g′(x)| > 1 everywhere? No. By the Mean Value Theorem there exists ξ ∈ (a, b) with g′(ξ) = (g(b) − g(a))/(b − a), and since g(a), g(b) ∈ [a, b],

−1 ≤ (g(b) − g(a))/(b − a) ≤ 1

so |g′(x)| > 1 cannot hold for every x in (a, b).
Algorithm:
INPUT: initial point p0; tolerance TOL; maximum number of iterations N0.
OUTPUT: approximate solution p or message of failure.
Step 1: Set i = 1;
Step 2: while i ≤ N0 do Steps 3-6.
Step 3:   set p = g(p0); (compute p_i)
Step 4:   if |p − p0| < TOL then OUTPUT (p); (procedure terminated successfully) STOP.
Step 5:   set i = i + 1;
Step 6:   set p0 = p; (update p0)
Step 7: OUTPUT ("method failed after N0 iterations"); STOP.

Fixed-Point Theorem: If g ∈ C[a, b] and g(x) ∈ [a, b] for all x ∈ [a, b], and if in addition g′(x) exists on (a, b) and a positive constant 0 < k < 1 exists with

|g′(x)| ≤ k,  for all x ∈ (a, b)        (2.18)

then for any p0 ∈ [a, b], the sequence p_n = g(p_{n−1}) converges to the unique fixed point p ∈ [a, b].

If g satisfies the fixed-point theorem, then bounds for the error involved in using p_n to approximate p are given by

|p_n − p| ≤ k^n max{p0 − a, b − p0}        (2.19)
|p_n − p| ≤ (k^n/(1 − k)) |p1 − p0|        (2.20)
Proof: Study examples 3 and 4, page 57.

Exercises:

– Ex. 5: We use a fixed-point iteration method to determine a solution accurate to 10^−2 for x^4 − 3x^2 − 3 = 0 on [1, 2], with p0 = 1.

Solution: To use fixed-point iteration we have to find a function g(x) that satisfies the fixed-point theorem. The equation x^4 − 3x^2 − 3 = 0 leads to x^4 = 3x^2 + 3, which in turn leads to x = (3x^2 + 3)^{1/4}. Now we check whether the function g(x) = (3x^2 + 3)^{1/4} satisfies the fixed-point theorem. Its derivative is

g′(x) = (3/2) x / (3x^2 + 3)^{3/4}

The derivative is always positive in the region [1, 2], so g is increasing there, with g(1) = 1.565084580 and g(2) = 1.967989671. Therefore g(x) ∈ [1, 2]. Moreover, for 1 ≤ x ≤ 2,

g′(x) = (3/2) x / (3x^2 + 3)^{3/4} ≤ (3/2) · 2 / (3 · 1^2 + 3)^{3/4} = 0.7825422900 < 1

So the function g(x) satisfies the fixed-point theorem. To get a sharper constant, the second derivative is given by

g″(x) = −(3/4) (x^2 − 2) / ((x^2 + 1)(3x^2 + 3)^{3/4})

which vanishes at x = √2. In the region [1, 2] we have g′(1) = 0.3912711450, g′(2) = 0.3935979342 and g′(√2) = 0.4082482906, so g′(x) ≤ 0.4082482906 < 1. Therefore k = 0.4082482906. According to the theorem we have

|p_n − p| ≤ k^n max{p0 − a, b − p0}

which becomes

|p_n − p| ≤ 0.4082482906^n ≤ 10^−2

which implies that n ≥ 6. The answer is p6 = 1.943316930, accurate to within 10^−2.
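Ex. 5 can be checked with a minimal fixed-point iteration sketch (names are mine, not the textbook's):

```python
def fixed_point(g, p0, tol=1e-2, nmax=100):
    """Fixed-point iteration p_n = g(p_{n-1}), stopping when |p_n - p_{n-1}| < tol.

    Returns (p, n): the approximation and the number of iterations used.
    """
    for n in range(1, nmax + 1):
        p = g(p0)
        if abs(p - p0) < tol:
            return p, n
        p0 = p
    raise RuntimeError(f"method failed after {nmax} iterations")

# Ex. 5: g(x) = (3x^2 + 3)^(1/4), p0 = 1
g = lambda x: (3 * x**2 + 3) ** 0.25
p, n = fixed_point(g, 1.0, tol=1e-2)
```

The run returns at n = 6 with p ≈ 1.9433, in agreement with the a priori estimate from k^n ≤ 10^−2.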
– Ex 7: We want to show that the function g(x) = π + 0.5 sin(x/2) has a unique fixed point on [0, 2π]. We know that

0 < π − 0.5 ≤ g(x) ≤ π + 0.5 < 2π        (2.21)

The derivative of the function g is given by

g′(x) = cos(x/2)/4,  so  |g′(x)| ≤ 1/4        (2.22)

Therefore the function has a unique fixed point. To find an approximation to the fixed point that is accurate to 10^−2, let us estimate the number of iterations. Using p0 = π, bound (2.19) gives

|p_n − p| ≤ (1/4)^n π ≤ 10^−2        (2.23)

This leads to a number of iterations n ≥ 5. We can do better using

|p_n − p| ≤ (k^n/(1 − k)) |p1 − p0|        (2.24)

We have p1 = g(π) = π + 1/2, so the last equation becomes

|p_n − p| ≤ ((1/4)^n/(3/4)) · (1/2) = (2/3)(1/4)^n ≤ 10^−2        (2.25)

This leads to a number of iterations n ≥ 4. Therefore:

p0 = 3.141592654
p1 = 3.641592654
p2 = 3.626048865
p3 = 3.626995623
p4 = 3.626938795
p5 = 3.626942209
2.3 Newton's Method

Derivation of Newton's (or the Newton-Raphson) method: Suppose that f ∈ C^2[a, b], i.e. f has a continuous second derivative. Let p0 ∈ [a, b] be an approximation to the solution p of f(x) = 0 such that f′(p0) ≠ 0 and |p − p0| is "small". The first Taylor polynomial of f(x) about p0 gives

f(p) = f(p0) + (p − p0) f′(p0) + ((p − p0)^2/2) f″(ξ(p))        (2.26)

where ξ(p) lies between p and p0. Since f(p) = 0, if we neglect the quadratic term we get

p ≈ p0 − f(p0)/f′(p0)        (2.27)

We can write the sequence

p_n = p_{n−1} − f(p_{n−1})/f′(p_{n−1}),  n ≥ 1        (2.28)

The last equation is what we call Newton's Method.

Algorithm: (please see page 64)

Newton's method is a functional iteration technique of the form p_n = g(p_{n−1}), where

g(p_{n−1}) = p_{n−1} − f(p_{n−1})/f′(p_{n−1}),  n ≥ 1        (2.29)

Newton's method cannot be continued if f′(p_{n−1}) = 0. The derivation depends on the assumption that p0 is close to p, so it is important that the initial approximation p0 be chosen close to the actual value p. In some cases the method converges even with a poor initial approximation. The following theorem illustrates the theoretical importance of the choice of p0.

Theorem: Let f ∈ C^2[a, b]. If p ∈ [a, b] is such that f(p) = 0 and f′(p) ≠ 0, then there exists a δ > 0 such that Newton's method generates a sequence {p_n} converging to p for any initial approximation p0 ∈ [p − δ, p + δ].

Proof: (see page 66).

The theorem states that, under reasonable assumptions, Newton's method converges provided a sufficiently accurate initial approximation is chosen. In practice, the theorem does not tell us how to calculate δ. In general, either the method converges quickly or it will be clear that convergence is unlikely. Newton's method is a powerful technique, but it has a major weakness: the need for the first derivative. The calculation of the first derivative f′(x) may need more arithmetic operations than f(x) itself.
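A minimal sketch of the iteration (2.28), with the guard against f′(p_{n−1}) = 0 mentioned above (names are mine):

```python
def newton(f, fprime, p0, tol=1e-4, nmax=50):
    """Newton's method p_n = p_{n-1} - f(p_{n-1})/f'(p_{n-1}).

    Stops when successive iterates agree to within tol.
    """
    for n in range(1, nmax + 1):
        fp = fprime(p0)
        if fp == 0:
            raise ZeroDivisionError("f'(p) = 0: method cannot be continued")
        p = p0 - f(p0) / fp
        if abs(p - p0) < tol:
            return p, n
        p0 = p
    raise RuntimeError(f"method failed after {nmax} iterations")

# Ex 5 of this chapter (below): x^3 - 2x^2 - 5 = 0 on [1, 4], p0 = 2.5
p, n = newton(lambda x: x**3 - 2 * x**2 - 5,
              lambda x: 3 * x**2 - 4 * x, 2.5, tol=1e-4)
```

With p0 = 2.5 this reproduces the table of Ex 5 below: four iterations to reach 2.690647448.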
2.4 Secant Method

Newton's method is a powerful technique, but it has a major weakness: the need for the first derivative. The calculation of the first derivative f′(x) may need more arithmetic operations than f(x). We know that the derivative is defined by

f′(p_{n−1}) = lim_{x → p_{n−1}} (f(x) − f(p_{n−1}))/(x − p_{n−1})        (2.30)

Letting x = p_{n−2}, we have

f′(p_{n−1}) ≈ (f(p_{n−2}) − f(p_{n−1}))/(p_{n−2} − p_{n−1})        (2.31)

Using this last equation in Newton's method we get

p_n = p_{n−1} − f(p_{n−1}) (p_{n−1} − p_{n−2})/(f(p_{n−1}) − f(p_{n−2}))        (2.32)

This is called the Secant Method. Starting with two initial approximations p0 and p1, the approximation p2 is the x-intercept of the line joining the two points (p0, f(p0)) and (p1, f(p1)). The approximation p3 is the x-intercept of the line joining (p1, f(p1)) and (p2, f(p2)) (see Fig. 2.2).
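Equation (2.32) translates directly into code (a sketch; names are mine). Note that it reuses the previous function value instead of calling f′:

```python
def secant(f, p0, p1, tol=1e-4, nmax=50):
    """Secant method (eq. 2.32) for f(x) = 0, starting from p0 and p1."""
    f0, f1 = f(p0), f(p1)
    for n in range(2, nmax + 1):
        p = p1 - f1 * (p1 - p0) / (f1 - f0)   # x-intercept of the secant line
        if abs(p - p1) < tol:
            return p, n
        p0, f0, p1, f1 = p1, f1, p, f(p)       # shift the two-point window
    raise RuntimeError(f"method failed after {nmax} iterations")

# Same equation as Newton's Ex 5: x^3 - 2x^2 - 5 = 0
p, n = secant(lambda x: x**3 - 2 * x**2 - 5, 2.0, 3.0, tol=1e-4)
```

Only one new evaluation of f is needed per step, at the cost of slightly slower (superlinear rather than quadratic) convergence.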
2.5 False Position Method

Figure 2.2: Secant Method (left panel) and False Position Method (right panel) for finding the root of f(x) = 0.
False Position Method: generates approximations in the same way as the Secant Method, but includes a test to ensure that the root is always bracketed between successive iterations. This method is not generally recommended; it is presented here to illustrate how bracketing can be incorporated. First we choose two approximations p0 and p1 such that f(p0)·f(p1) < 0. The approximation p2 is chosen in the same manner as in the secant method, as the x-intercept of the line joining (p0, f(p0)) and (p1, f(p1)). To decide which secant line to use to compute p3, we check the sign of f(p1)·f(p2). If it is negative, then p1 and p2 bracket a root, and we choose p3 as the x-intercept of the line joining (p1, f(p1)) and (p2, f(p2)). If not, we choose p3 as the x-intercept of the line joining (p0, f(p0)) and (p2, f(p2)). In a similar manner we find p_n for n ≥ 4.
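The bracketing test adds just one branch to the secant sketch (names are mine):

```python
def false_position(f, p0, p1, tol=1e-4, nmax=100):
    """False position: a secant step, but the root stays bracketed."""
    f0, f1 = f(p0), f(p1)
    if f0 * f1 >= 0:
        raise ValueError("f(p0) and f(p1) must have opposite signs")
    for n in range(2, nmax + 1):
        p = p1 - f1 * (p1 - p0) / (f1 - f0)   # x-intercept of the chord
        fp = f(p)
        if abs(p - p1) < tol:
            return p, n
        if f1 * fp < 0:        # p1 and p bracket the root: drop p0
            p0, f0 = p1, f1
        p1, f1 = p, fp         # otherwise keep p0 as the bracketing endpoint
    raise RuntimeError(f"method failed after {nmax} iterations")

p, n = false_position(lambda x: x**3 - 2 * x**2 - 5, 2.0, 3.0, tol=1e-4)
```

The price of guaranteed bracketing is that one endpoint can become "stuck", which typically reduces the convergence rate back to linear.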
Exercises:

– Ex 5: We want to use Newton's method to find an approximate solution, accurate to 10^−4, of the equation

x^3 − 2x^2 − 5 = 0, in the interval [1, 4]        (2.33)

Solution: Newton's method is

p_n = p_{n−1} − f(p_{n−1})/f′(p_{n−1}),  n ≥ 1

with f(x) = x^3 − 2x^2 − 5 and f′(x) = 3x^2 − 4x. The approximation is accurate to the places on which p_{n−1} and p_n agree. With p0 = 2.5, Newton's method gives:

n    p_{n−1}        p_n            |p_n − p_{n−1}|
1    2.500000000    2.714285714    0.214285714
2    2.714285714    2.690951516    0.023334198
3    2.690951516    2.690647500    0.000304017
4    2.690647500    2.690647448    5.1 × 10^−8
5    2.690647448    2.690647448    0.00

so the answer is p4 = 2.690647448. If we use p0 = 2 we get:

n    p_{n−1}        p_n            |p_n − p_{n−1}|
1    2.000000000    3.250000000    1.250000000
2    3.250000000    2.811036789    0.438963211
3    2.811036789    2.697989503    0.113047286
4    2.697989503    2.690677153    0.007312350
5    2.690677153    2.690647448    0.000029705

so the answer is p5 = 2.690647448.
Chapter 3

Error Analysis for Iterative Methods

3.1 Linearly and Quadratically Convergent Procedures

We investigate the order of convergence of functional iteration.

Definition: Suppose {p_n} is a sequence that converges to p, with p_n ≠ p for all n. If positive constants λ and α exist with

lim_{n→∞} |p_{n+1} − p| / |p_n − p|^α = λ        (3.1)

then the sequence converges to p of order α, with asymptotic error constant λ. An iterative technique of the form p_n = g(p_{n−1}) is said to be of order α if the sequence {p_n} converges to the solution p = g(p) of order α.
In general, a sequence with a high order of convergence converges more rapidly than a sequence with a lower order. The asymptotic constant affects the speed of convergence but is not as important as the order. If α = 1, the sequence is linearly convergent. If α = 2, the sequence is quadratically convergent.

Assume that we have two sequences, one converging linearly to zero and the other converging quadratically to zero. For simplicity, suppose that

|p_{n+1}| ≈ 0.5 |p_n|        (linearly convergent)
|p̃_{n+1}| ≈ 0.5 |p̃_n|^2      (quadratically convergent)

For the linearly convergent sequence, we have

|p_n − 0| = |p_n| ≈ 0.5 |p_{n−1}| ≈ 0.5^2 |p_{n−2}| ≈ ... ≈ 0.5^n |p_0|

For the quadratically convergent sequence, we have

|p̃_n − 0| = |p̃_n| ≈ 0.5 |p̃_{n−1}|^2 ≈ 0.5^3 |p̃_{n−2}|^4 ≈ ... ≈ (0.5)^{2^n − 1} |p̃_0|^{2^n}

If we use p0 = p̃0 = 1 we can see that

|p_n| ≈ 0.5^n,  |p̃_n| ≈ (0.5)^{2^n − 1}

After seven iterations we get

|p_7| ≈ 0.78125 × 10^−2,  |p̃_7| ≈ 0.58775 × 10^−38

In order for the linearly convergent sequence to reach the same accuracy as the quadratically convergent one, we need:

(0.5)^n = (0.5)^{2^7 − 1}
⇒ n = 2^7 − 1 = 127.

Theorem: Let g ∈ C[a, b] be such that g(x) ∈ [a, b] for all x ∈ [a, b]. Suppose, in addition, that g′ is continuous on (a, b) and that a positive constant k < 1 exists with |g′(x)| ≤ k for all x ∈ (a, b). If g′(p) ≠ 0, then for any p0 ∈ [a, b], the sequence

p_n = g(p_{n−1}),  n ≥ 1        (3.2)

converges only linearly to the unique fixed point p in [a, b].

Theorem: Let p be a solution of the equation x = g(x). Suppose that g′(p) = 0 and g″ is continuous with |g″(x)| < M on an open interval I containing p. Then there exists a number δ > 0 such that, for p0 ∈ [p − δ, p + δ], the sequence defined by p_n = g(p_{n−1}), n ≥ 1, converges at least quadratically to p. Moreover, for sufficiently large values of n,

|p_{n+1} − p| < (M/2) |p_n − p|^2        (3.3)
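The linear-versus-quadratic comparison earlier in this section can be reproduced exactly in a few lines (a sketch; variable names are mine). Starting both error sequences at 1, the linear one halves each step while the quadratic one is squared (times 0.5):

```python
# |p_n| ~ 0.5**n (linear) versus |q_n| ~ 0.5**(2**n - 1) (quadratic),
# both starting from p_0 = q_0 = 1.
p, q = 1.0, 1.0
for n in range(7):
    p = 0.5 * p        # linear: error halves each step
    q = 0.5 * q * q    # quadratic: error is squared (times 0.5) each step
```

After seven steps p = 0.5^7 ≈ 0.78 × 10^−2 while q = 0.5^127 ≈ 0.59 × 10^−38, matching the figures quoted above (all values are exact powers of two, so the floating-point computation is exact).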
Proof: The easiest way to construct a fixed-point problem associated with a root-finding problem f(x) = 0 is to subtract a multiple of f(x) from x:

p_n = g(p_{n−1}),  with  g(x) = x − φ(x) f(x)

where φ(x) is a differentiable function that will be chosen later. If p satisfies f(p) = 0, then it is clear that g(p) = p. For the iteration procedure derived from g to be quadratically convergent, we need g′(p) = 0 when f(p) = 0. Since

g′(x) = 1 − φ′(x) f(x) − φ(x) f′(x)

we have

g′(p) = 1 − φ′(p) f(p) − φ(p) f′(p) = 1 − φ(p) f′(p)

which implies

φ(p) = 1/f′(p)

If we let φ(x) = 1/f′(x), we ensure that φ(p) = 1/f′(p) and produce the quadratically convergent procedure

p_n = g(p_{n−1}) = p_{n−1} − f(p_{n−1})/f′(p_{n−1})        (3.4)

This is of course Newton's method, which is quadratically convergent provided that f′(p_{n−1}) ≠ 0.
3.2 Zero Multiplicity

Definition: A solution p of f(x) = 0 is a zero of multiplicity m of f if, for x ≠ p, we can write f(x) = (x − p)^m q(x), where lim_{x→p} q(x) ≠ 0.

Theorem: f ∈ C^1[a, b] has a simple zero at p in (a, b) iff f(p) = 0 but f′(p) ≠ 0.

If p is a simple root of f, then Newton's method converges quadratically. If p is not a simple root, then Newton's method may not converge quadratically (see Example 2, page 79).
Theorem: The function f ∈ C^m[a, b] has a zero of multiplicity m at p in (a, b) iff

f(p) = f′(p) = ... = f^{(m−1)}(p) = 0,  f^{(m)}(p) ≠ 0

One method to handle problems of multiple roots is to define the function

µ(x) = f(x)/f′(x)        (3.5)

If p is a zero of f of multiplicity m and f(x) = (x − p)^m q(x), then

µ(x) = (x − p) q(x) / (m q(x) + (x − p) q′(x))

which has a simple zero at p. Newton's method can then be applied to µ, giving

g(x) = x − µ(x)/µ′(x) = x − f(x) f′(x) / ([f′(x)]^2 − f(x) f″(x))        (3.6)

If g has the required continuity conditions, functional iteration applied to g will be quadratically convergent regardless of the multiplicity of the zero of f. Theoretically, the only drawback of this method is the additional calculation of the second derivative f″. In practice, multiple roots can cause serious round-off problems, since the denominator consists of the difference of two numbers that are both close to zero.
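A sketch of the modified iteration (3.6), applied to the double root of the example used later in Chapter 4 (names are mine, and the `denom == 0` guard reflects the round-off caveat above):

```python
def modified_newton(f, fp, fpp, p0, tol=1e-6, nmax=100):
    """Newton applied to mu(x) = f(x)/f'(x) (eq. 3.6): quadratic convergence
    even at multiple roots, at the cost of needing f''."""
    for n in range(1, nmax + 1):
        denom = fp(p0) ** 2 - f(p0) * fpp(p0)
        if denom == 0:               # difference of two near-zero numbers
            break
        p = p0 - f(p0) * fp(p0) / denom
        if abs(p - p0) < tol:
            return p, n
        p0 = p
    return p0, n

# f(x) = x^3 - 3x + 2 = (x - 1)^2 (x + 2): double root at x = 1
p, n = modified_newton(lambda x: x**3 - 3 * x + 2,
                       lambda x: 3 * x**2 - 3,
                       lambda x: 6 * x, 2.0)
```

Plain Newton converges only linearly to this double root (ratio ≈ 1/2 per step, as the Chapter 4 example shows); the modified iteration reaches it in a handful of steps, though accuracy bottoms out at the round-off noise floor of the denominator.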
3.3 Exercises

Ex 6: Show that the following sequences converge linearly to p = 0. How large must n be before we have |p_n − p| ≤ 5 × 10^−2?
a) p_n = 1/n
b) q_n = 1/n^2

Sol 6: It is clear that the limit of both sequences is zero, and that

lim_{n→∞} |p_{n+1} − p| / |p_n − p| = 1

For (a), 1/n ≤ 5 × 10^−2 implies that n ≥ 20.
For (b), 1/n^2 ≤ 5 × 10^−2 implies that n^2 ≥ 20, which in turn gives n ≥ 5.
Ex 8a: We want to show that the sequence p_n = 10^{−2^n} converges quadratically to 0.
Sol 8a: We know that

lim_{n→∞} 10^{−2^n} = 0

We now calculate the following limit:

lim_{n→∞} 10^{−2^{n+1}} / (10^{−2^n})^2 = lim_{n→∞} 10^{−2^{n+1}} / 10^{−2^{n+1}} = 1

Thus α = 2 (with λ = 1), and the sequence converges quadratically to zero.

Ex 8b: We want to show that the sequence p_n = 10^{−n^k} does not converge to zero quadratically, regardless of the size of the exponent k > 1.
Sol 8b: We know that

lim_{n→∞} 10^{−n^k} = 0

We now calculate the following limit:

lim_{n→∞} 10^{−(n+1)^k} / (10^{−n^k})^2 = lim_{n→∞} 10^{2n^k − (n+1)^k}

To find this limit, note that

lim_{n→∞} (2n^k − (n + 1)^k) = lim_{n→∞} n^k (2 − ((n + 1)/n)^k) = ∞

since ((n + 1)/n)^k → 1. So the ratio diverges and we cannot find a positive number λ. Therefore the sequence does not converge quadratically.
Chapter 4

Accelerating Convergence

4.1 Aitken's ∆² Method

We consider a technique that can be used to accelerate the convergence of a sequence that is linearly convergent. Suppose {p_n} is a linearly convergent sequence with limit p. Aitken's ∆² method is based on the assumption that the sequence {p̂_n}, defined by

p̂_n = p_n − (p_{n+1} − p_n)^2 / (p_{n+2} − 2p_{n+1} + p_n) = p_n − (∆p_n)^2 / ∆^2 p_n,  for n ≥ 0        (4.1)

converges more rapidly to p than does the original sequence {p_n}. The symbol ∆p_n is the forward difference, which is defined by

∆p_n = p_{n+1} − p_n,  for n ≥ 0        (4.2)

Higher powers of the operator ∆ are defined recursively by

∆^k p_n = ∆(∆^{k−1} p_n),  for k ≥ 2        (4.4)

For example,

∆^2 p_n = ∆(∆p_n) = ∆(p_{n+1} − p_n) = (p_{n+2} − p_{n+1}) − (p_{n+1} − p_n) = p_{n+2} − 2p_{n+1} + p_n        (4.6)

Theorem: Suppose that {p_n} is a sequence that converges linearly to the limit p and that

lim_{n→∞} (p_{n+1} − p)/(p_n − p) < 1        (4.7)

Then the sequence

p̂_n = p_n − (∆p_n)^2 / ∆^2 p_n        (4.8)

converges to p faster than {p_n} in the sense that

lim_{n→∞} (p̂_n − p)/(p_n − p) = 0        (4.9)

Example: Consider p_n = cos(1/n). This sequence converges linearly to p = 1: using cos θ − 1 = −2 sin^2(θ/2),

lim_{n→∞} (cos(1/(n+1)) − 1)/(cos(1/n) − 1) = lim_{n→∞} sin^2(1/(2(n+1))) / sin^2(1/(2n)) = lim_{n→∞} (n/(n+1))^2 = 1

Aitken's ∆² sequence is defined by

p̂_n = p_n − (∆p_n)^2/∆^2 p_n = p_n − (p_{n+1} − p_n)^2/(p_{n+2} − 2p_{n+1} + p_n)        (4.10)

n    p_n             p̂_n
1    0.5403023059    0.9617750599
2    0.8775825619    0.9821293535
3    0.9449569463    0.9897855148
4    0.9689124217    0.9934156481
5    0.9800665778    0.9954099422
6    0.9861432316    0.9966199575
7    0.9898132604    0.9974083190
Example: The function f(x) = x^3 − 3x + 2 = (x − 1)^2 (x + 2) has a double root p = 1. If Newton's method converges to p = 1, it converges linearly. We choose p0 = 2. Newton's method produces the sequence

p_{n+1} = p_n − (p_n^3 − 3p_n + 2)/(3p_n^2 − 3),  p0 = 2

n    p_n            (p_n − 1)/(p_{n−1} − 1)
1    1.5555555556   0.5555555560
2    1.2979066023   0.5362318832
3    1.1553901992   0.5216071009
4    1.0795622104   0.5120156259
5    1.0402884352   0.5063765197
6    1.0202768098   0.5032910809
7    1.0101723234   0.5016727483
8    1.0050947411   0.5008434160
9    1.0025495281   0.5004234759
10   1.0012753050   0.5002121961
11   1.0006377880   0.5001062491
12   1.0003189279   0.5000533092
13   1.0001594724   0.5000250840
14   1.0000797383   0.5000125414
15   1.0000398697   0.5000125411

It is clear that Newton's method is linearly convergent here: it converges slowly to p = 1. Let us apply Aitken's acceleration process to the sequence {p_n} of iterates generated by Newton's method:

p̂_n = p_n − (p_{n+1} − p_n)^2/(p_{n+2} − 2p_{n+1} + p_n)        (4.11)

which gives:

n    p_n            p̂_n            p_n − 1       p̂_n − 1
0    2.0            0.9425287356   1.0           −0.0574712644
1    1.5555555556   0.9789767949   0.555555556   −0.0210232051
2    1.2979066023   0.9933420783   0.297906602   −0.0066579217
3    1.1553901992   0.9980927682   0.155390199   −0.0019072318
4    1.0795622104   0.9994865474   0.079562210   −0.0005134526
5    1.0402884352   0.9998665586   0.040288435   −0.0001334414
6    1.0202768098   0.9999659695   0.020276810   −0.0000340305
7    1.0101723234   0.9999914062   0.010172323   −0.0000085938
8    1.0050947411   0.9999978406   0.005094741   −0.0000021594
9    1.0025495281   0.9999994588   0.002549528   −5.412 × 10^−7
10   1.0012753050   0.9999998645   0.001275305   −1.355 × 10^−7
11   1.0006377880   0.9999999661   0.000637788   −3.39 × 10^−8
12   1.0003189279   0.9999999915   0.000318928   −8.5 × 10^−9
13   1.0001594724   0.9999999979   0.000159472   −2.1 × 10^−9
14   1.0000797383   **             0.000079738   **
15   1.0000398697   **             0.000039870   **

(** : p̂_n is not available here, since it requires p_{n+1} and p_{n+2}.)
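The acceleration in the tables above can be reproduced with a few lines (a sketch; names are mine):

```python
def aitken(p):
    """Aitken's Delta^2 acceleration (eq. 4.1) of a list of iterates p."""
    return [p[n] - (p[n + 1] - p[n]) ** 2 / (p[n + 2] - 2 * p[n + 1] + p[n])
            for n in range(len(p) - 2)]

# Newton iterates for f(x) = x^3 - 3x + 2 (double root at 1), p0 = 2
p = [2.0]
for _ in range(15):
    x = p[-1]
    p.append(x - (x**3 - 3 * x + 2) / (3 * x**2 - 3))

ph = aitken(p)   # accelerated sequence p-hat
```

Each accelerated iterate p̂_n is dramatically closer to 1 than the corresponding p_n, matching the right-hand columns of the table.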
4.2 Steffensen's Method

By applying a modification of Aitken's ∆² method to a linearly convergent sequence obtained from fixed-point iteration, we can accelerate the convergence to quadratic. This procedure is known as Steffensen's method. Aitken's method constructs the terms in order:

p0,  p1 = g(p0),  p2 = g(p1),  p̂0 = {∆²}(p0),  p3 = g(p2),  p̂1 = {∆²}(p1), ...        (4.12)

where {∆²} indicates that Aitken's method, given by eq. (4.10), is used. Steffensen's method constructs the same first four terms p0, p1, p2, and p̂0. However, at this step it assumes that p̂0 is a better approximation to p than p2, and applies fixed-point iteration to p̂0 instead of p2. This leads to the following sequence:

p0^(0),  p1^(0) = g(p0^(0)),  p2^(0) = g(p1^(0)),  p0^(1) = {∆²}(p0^(0)),  p1^(1) = g(p0^(1)), ...        (4.13)
Note that the denominator in (4.10) can be zero at some iteration. If this occurs, we terminate the sequence and select the last iterate computed before the zero appeared¹.

Ex 3, page 86: Let g(x) = cos(x − 1) and p0^(0) = 2. We want to use Steffensen's method to get p0^(1).

p0^(0) = 2
p1^(0) = cos(2 − 1) = 0.5403023059
p2^(0) = cos(0.5403023059 − 1) = 0.8961866647
p0^(1) = p0^(0) − (p1^(0) − p0^(0))^2 / (p2^(0) − 2 p1^(0) + p0^(0)) = 0.826427396
Ex 4, page 86: Let g(x) = 1 + (sin x)^2 and p0^(0) = 2. We want to use Steffensen's method to get p0^(1) and p0^(2).

p0^(0) = 2
p1^(0) = 1 + (sin 2)^2 = 1.708073418
p2^(0) = 1 + (sin 1.708073418)^2 = 1.981273081
p0^(1) = p0^(0) − (p1^(0) − p0^(0))^2 / (p2^(0) − 2 p1^(0) + p0^(0)) = 2.152904629

To calculate p0^(2) we start with:

p0^(1) = 2.152904629
p1^(1) = 1 + (sin 2.152904629)^2 = 1.697735097
p2^(1) = 1 + (sin 1.697735097)^2 = 1.983972911
p0^(2) = p0^(1) − (p1^(1) − p0^(1))^2 / (p2^(1) − 2 p1^(1) + p0^(1)) = 1.873464043

¹ See page 85 of the textbook.
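The whole procedure fits in one loop (a sketch; names are mine). Each pass does two fixed-point evaluations and one Aitken step, with the zero-denominator guard described above:

```python
import math

def steffensen(g, p0, tol=1e-8, nmax=50):
    """Steffensen's method: fixed-point iteration accelerated by one
    Aitken Delta^2 step per pass."""
    for n in range(nmax):
        p1 = g(p0)
        p2 = g(p1)
        denom = p2 - 2 * p1 + p0
        if denom == 0:                 # Delta^2 is zero: keep the last iterate
            return p0, n
        p = p0 - (p1 - p0) ** 2 / denom
        if abs(p - p0) < tol:
            return p, n + 1
        p0 = p
    raise RuntimeError(f"method failed after {nmax} iterations")

# Ex 3: g(x) = cos(x - 1) with p0 = 2; the fixed point is exactly p = 1
p, n = steffensen(lambda x: math.cos(x - 1), 2.0)
```

The first pass reproduces p0^(1) = 0.826427396 from Ex 3, and the sequence then converges rapidly to the fixed point 1.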
4.3 4.3
Zero Zeross Polyn olynom omia iall
A polynomial of degree n has the form P ( P (x) = anxn + an−1 xn−1 + ... + a1 x + a0
(4.14)
where ai ’s are the coefficients of P of P .. are constant and an = 0.
Fundamental undamental Theorem of Algebra: Algebra: If P If P ((x) is polynomial of degree n one root (possibly complex).
≥ 1, then P ( P (x) = 0 has at least
If P ( P (x) is a polynomial of degree n 1, then there exist unique constants x1, x2,...,xk , possibly complex, and unique positive integers m1, m2 ,...,mk , such that
≥
k
mi = n
i=1
P ( P (x) = an (x
m1
−x ) 1
(x
−x ) 1
m2
...( ...(x
mk
−x ) k
mi is the multiplicity of the zero xi . Let P ( P (x) and Q(x) be polynomials of degree at most n, If x1 , x2 ,...,xk , with k > n, are distinct numbers numbers with P ( P (xi ) = Q(xi ) for i = 1, 2,...,k, ,...,k, then p(x) = Q(x) for all values of x of x.
4.4 Horner's Method

To use Newton's method to locate approximate zeros of a polynomial P(x), we need to evaluate P(x) and P'(x) at specified values. To compute them efficiently we evaluate them in a nested manner. Horner's method incorporates this nesting technique and requires only n multiplications and n additions to evaluate an arbitrary nth-degree polynomial.

Synthetic Division Algorithm and the Remainder Theorem: A polynomial of degree n can be written as

P_n(x) = a_n x^n + a_{n−1} x^{n−1} + ... + a_1 x + a_0     (4.15)

Divide this polynomial P_n(x) by (x − x_1), giving a reduced polynomial Q_{n−1}(x) of degree n − 1 and a remainder R:

P_n(x) = (x − x_1) Q_{n−1}(x) + R     (4.16)

We can see that P_n(x_1) = R. If we differentiate P_n(x) we get

P_n'(x) = (x − x_1) Q'_{n−1}(x) + Q_{n−1}(x)     (4.17)

thus,

P_n'(x_1) = Q_{n−1}(x_1)     (4.18)
We evaluate Q_{n−1}(x_1) by a second division whose remainder equals Q_{n−1}(x_1), and so on. Now we can write

P_n(x) = a_n x^n + a_{n−1} x^{n−1} + ... + a_1 x + a_0
       = (x − x_1) Q_{n−1}(x) + R
       = (x − x_1)(b_{n−1} x^{n−1} + b_{n−2} x^{n−2} + ... + b_1 x + b_0) + R
       = b_{n−1} x^n + b_{n−2} x^{n−1} + ... + b_1 x² + b_0 x − b_{n−1} x_1 x^{n−1} − b_{n−2} x_1 x^{n−2} − ... − b_1 x_1 x − b_0 x_1 + R

We collect the terms:

P_n(x) = b_{n−1} x^n + [b_{n−2} − x_1 b_{n−1}] x^{n−1} + [b_{n−3} − x_1 b_{n−2}] x^{n−2} + ... + [b_0 − x_1 b_1] x + [R − x_1 b_0]

By comparison we get:

b_{n−1} = a_n
b_{n−2} = a_{n−1} + x_1 b_{n−1}
...
b_i = a_{i+1} + x_1 b_{i+1}
...
b_0 = a_1 + x_1 b_1

So, the remainder can be evaluated from R = a_0 + x_1 b_0.
Horner's Method: Let

P(x) = a_n x^n + a_{n−1} x^{n−1} + ... + a_1 x + a_0     (4.19)

If b_n = a_n and

b_k = a_k + b_{k+1} x_0,   for k = n − 1, n − 2, ..., 1, 0     (4.20)

then b_0 = P(x_0). Since k runs from n − 1 to 0, you need only n multiplications and n additions to get P(x_0). Moreover, if

Q(x) = b_n x^{n−1} + b_{n−1} x^{n−2} + ... + b_2 x + b_1     (4.21)

then

P(x) = (x − x_0) Q(x) + b_0     (4.22)

Proof:

The derivative of P(x) is given by

P'(x) = Q(x) + (x − x_0) Q'(x)     (4.23)

Thus

P'(x_0) = Q(x_0)     (4.24)
Example: We want to evaluate P(x) = 2x⁴ − 3x² + 3x − 4 at x_0 = −2 using Horner's method. We start with:

Coeff x⁴:  a_4 = 2    b_4 = a_4 = 2
Coeff x³:  a_3 = 0    b_3 = a_3 + b_4(−2) = −4
Coeff x²:  a_2 = −3   b_2 = a_2 + b_3(−2) = 5
Coeff x¹:  a_1 = 3    b_1 = a_1 + b_2(−2) = −7
Coeff x⁰:  a_0 = −4   b_0 = a_0 + b_1(−2) = 10

Therefore, P(−2) = 10 and P(x) = (x + 2)(2x³ − 4x² + 5x − 7) + 10.
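The table above can be reproduced in a few lines. The sketch below (our own naming; coefficients listed from highest degree down) returns both P(x0) and P'(x0) using the two synthetic divisions of eqs. (4.20)-(4.24).

```python
def horner(coeffs, x0):
    """Evaluate P(x0) and P'(x0) by nested multiplication.

    coeffs = [a_n, a_{n-1}, ..., a_0].  The running value q carries the
    second synthetic division, so q ends up equal to Q(x0) = P'(x0).
    """
    b = coeffs[0]                  # b_n = a_n
    q = coeffs[0]
    for a in coeffs[1:-1]:
        b = a + b * x0             # b_k = a_k + b_{k+1} x0
        q = b + q * x0             # evaluates the quotient Q at x0
    b = coeffs[-1] + b * x0        # last step gives P(x0)
    return b, q

# P(x) = 2x^4 - 3x^2 + 3x - 4 at x0 = -2  ->  P = 10, P' = -49
```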
Example: Find an approximation to one of the zeros of P(x) = 2x⁴ − 3x² + 3x − 4 using Newton's Method, with synthetic division used to evaluate P(x_n) and P'(x_n) at each iterate x_n. At x_0 = −2 we use b_n = a_n and b_k = a_k + b_{k+1} x_0 for k = n − 1 down to k = 0.

First division (coefficients 2, 0, −3, 3, −4):

b_4 = 2
b_3 = 0 + 2(−2) = −4
b_2 = −3 + (−4)(−2) = 5
b_1 = 3 + 5(−2) = −7
b_0 = −4 + (−7)(−2) = 10

so P(−2) = 10 and Q(x) = 2x³ − 4x² + 5x − 7. Using the theorem P'(x_0) = Q(x_0), a second division on the coefficients 2, −4, 5, −7 gives

2,   −4 + 2(−2) = −8,   5 + (−8)(−2) = 21,   −7 + 21(−2) = −49

so P'(−2) = Q(−2) = −49, and

x_1 = x_0 − P(x_0)/Q(x_0) = −2 − 10/(−49) ≈ −1.796

Repeating the procedure with x_1 = −1.796 we get P(x_1) = 1.742 and P'(x_1) = −32.565, so x_2 ≈ −1.7425. In a similar manner we get x_3 = −1.73897. An actual zero to five decimal places is −1.73896.
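The Newton iteration of this example, with P and P' evaluated by synthetic division at every step, can be sketched as follows (the helper name and tolerances are our own):

```python
def newton_horner(coeffs, x0, tol=1e-6, max_iter=50):
    """Newton's method for a polynomial given as [a_n, ..., a_0];
    each iterate evaluates P and P' with two synthetic divisions."""
    x = x0
    for _ in range(max_iter):
        p = coeffs[0]
        q = coeffs[0]
        for a in coeffs[1:-1]:
            p = a + p * x          # first division: builds P(x)
            q = p + q * x          # second division: builds Q(x) = P'(x)
        p = coeffs[-1] + p * x
        x_next = x - p / q         # Newton step x - P(x)/P'(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Starting from x0 = -2 this converges to the zero near -1.73896.
```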
4.5 Deflation

If the Nth iterate, x_N, in Newton's method is an approximate zero of the polynomial P(x), then

P(x) = (x − x_N) Q(x) + b_0 = (x − x_N) Q(x) + P(x_N) ≈ (x − x_N) Q(x)     (4.25)

so x − x_N is an approximate factor of P(x). Letting x̂_1 = x_N be the approximate zero of P and Q_1(x) = Q(x) the approximate factor, we have

P(x) ≈ (x − x̂_1) Q_1(x)     (4.26)

We can find a second approximate zero of P by applying Newton's method to Q_1(x). If P(x) is of degree n we can apply the procedure repeatedly to find x̂_2 and Q_2(x), ..., x̂_{n−2} and Q_{n−2}(x). After finding (n − 2) roots we are left with a quadratic, which we can solve to get the last two approximate roots. This procedure is called deflation.

The accuracy difficulty with deflation is due to the fact that, when obtaining the approximate zeros of P(x), Newton's method is used on the reduced polynomial Q_k(x), that is,

P(x) ≈ (x − x̂_1)(x − x̂_2)...(x − x̂_k) Q_k(x)     (4.27)

An approximate zero x̂_{k+1} of Q_k(x) will generally be a good approximation to a root of Q_k(x) but not of P(x). To eliminate this difficulty we can use the reduced equations to find the approximations x̂_i, and then apply Newton's Method to the original polynomial P(x), using these values as starting points.

One problem with applying the Secant, False Position, or Newton's methods to polynomials is the possibility of complex roots. If the initial approximation is real, all subsequent approximations will also be real. To overcome this problem we start with a complex initial approximation.
4.6 Müller's method

The secant method uses two initial approximations, p_0 and p_1, to get p_2, which is the x-intercept of the line joining the two points (p_0, f(p_0)) and (p_1, f(p_1)). Müller's method uses three initial approximations, p_0, p_1, p_2, and determines the next approximation p_3 by considering the intersection of the x-axis with the parabola through (p_0, f(p_0)), (p_1, f(p_1)), and (p_2, f(p_2)). The parabola takes the form

P(x) = a(x − p_2)² + b(x − p_2) + c     (4.28)

Of course we can find these parameters a, b, and c from the three points. To determine p_3, a zero of P(x), we apply the quadratic formula in its rationalized form:

p_3 − p_2 = −2c / (b ± √(b² − 4ac))     (4.29)

This form has no problem with subtracting nearly equal numbers (see example 5, section 1.2). The formula gives two roots. In Müller's method, the sign is chosen to agree with the sign of b, so

p_3 = p_2 − 2c / (b + sign(b) √(b² − 4ac))     (4.30)

Once p_3 is determined, we apply the procedure to p_1, p_2, and p_3 to get p_4, and so on. The method involves a square root, which means that complex approximations can arise; thus complex roots can be found using Müller's method.
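A sketch of Müller's method follows. Using cmath for the square root lets the iterates leave the real line, which is the point emphasized above; for complex b the code picks the denominator of larger magnitude, the natural analogue of the sign(b) rule in eq. (4.30). The function name is our own.

```python
import cmath

def muller(f, p0, p1, p2, tol=1e-10, max_iter=50):
    """Fit a parabola through three iterates and take its root nearest p2."""
    for _ in range(max_iter):
        h1, h2 = p1 - p0, p2 - p1
        d1 = (f(p1) - f(p0)) / h1
        d2 = (f(p2) - f(p1)) / h2
        a = (d2 - d1) / (h2 + h1)          # coefficients of
        b = d2 + h2 * a                    # P(x) = a(x-p2)^2 + b(x-p2) + c
        c = f(p2)
        disc = cmath.sqrt(b * b - 4 * a * c)
        # rationalized quadratic formula, eqs. (4.29)-(4.30)
        denom = b + disc if abs(b + disc) > abs(b - disc) else b - disc
        p3 = p2 - 2 * c / denom
        if abs(p3 - p2) < tol:
            return p3
        p0, p1, p2 = p1, p2, p3
    return p2

# Even with real starting values, a complex zero can be reached:
z = muller(lambda x: x * x + 1, 0.0, 0.5, 1.0)   # a zero of x^2 + 1
```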
4.7 Exercises

We want to find, to within 10⁻⁴, all real zeros of the polynomial P(x) = x³ − 2x² − 5 using Newton's method.

Sol: Descartes' rule of signs. The rule states that the number n_p of positive zeros of a polynomial P(x) is less than or equal to the number of variations v in sign of the coefficients of P(x). Moreover, the difference v − n_p is a nonnegative even integer. For our example, the number of variations in sign of the coefficients of P(x) is v = 1, so there is at most 1 positive root. Moreover, v − n_p ≥ 0 must be even, which implies that n_p = 1. Therefore there is exactly one positive root. Now we change x → −x; we find

P(−x) = −x³ − 2x² − 5

So there are no variations in sign, v = 0. Thus there is no negative root. Our conclusion: there is only one real root, and it is positive.

We then apply Newton's Method starting with p_0 = 2:

f(x) = x³ − 2x² − 5,   f'(x) = 3x² − 4x,   p_0 = 2
p_{n+1} = p_n − f(p_n)/f'(p_n)

n    p_n            |p_n − p_{n−1}|
1    3.250000000    1.250000000
2    2.811036789    0.438963211
3    2.697989503    0.113047286
4    2.690677153    0.007312350
5    2.690647448    0.000029705
6    2.690647448    0.000000001
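The iteration table can be regenerated with a plain Newton loop; a minimal sketch (the helper name is our own):

```python
def newton_iterates(f, df, p0, n_steps):
    """Return [p0, p1, ..., pn] for Newton's method p_{k+1} = p_k - f/f'."""
    ps = [p0]
    for _ in range(n_steps):
        p0 = p0 - f(p0) / df(p0)
        ps.append(p0)
    return ps

ps = newton_iterates(lambda x: x ** 3 - 2 * x ** 2 - 5,
                     lambda x: 3 * x ** 2 - 4 * x, 2.0, 6)
# ps[1] = 3.25 and ps[6] ~ 2.690647448, as in the table above
```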
We want to find, to within 10⁻⁵, all the zeros of the polynomial

f(x) = x⁴ + 5x³ − 9x² − 85x − 136

by first finding the real zeros using Newton's method and then deflating to polynomials of lower degree to determine any complex zeros. According to Descartes' rule we have:

1. For positive zeros: the number of variations of sign is 1. Thus, there is exactly one positive zero.
2. For negative zeros: the number of variations of sign of f(−x) is 3. Thus, there are one or three negative zeros.

We apply Newton's method with p_0 = 0.00000 and we get

n    p_n             |p_n − p_{n−1}|
1    −1.600000000    1.600000000
2    −2.681394805    1.081394805
3    −5.595348023    2.913953218
4    −4.842605061    0.752742962
5    −4.377210956    0.465394105
6    −4.167343093    0.209867863
7    −4.124721017    0.042622076
8    −4.123107873    0.001613144
9    −4.123105624    0.000002249

After rounding, the negative zero is −4.123106.

We now use Horner's method to obtain a reduced polynomial. Dividing by (x − 4.123106) gives zero remainder (b_0 = 0), which shows that +4.123106, the single positive zero predicted by Descartes' rule, is also a zero of f. The division gives

b_4 = 1
b_3 = 9.123105625
b_2 = 28.61552812
b_1 = 32.9848450
b_0 = 0

The polynomial is now reduced to

Q_1(x) = x³ + 9.123105625 x² + 28.61552812 x + 32.9848450

We use Newton's Method to get a solution of Q_1(x) = 0; we find, to within 10⁻⁵, the root −4.123106. We use Horner's method once more, dividing by (x + 4.123106), to get the reduced polynomial:

b_3 = 1
b_2 = 5
b_1 = 8
b_0 = 0

Thus, the reduced polynomial is

Q_2(x) = x² + 5x + 8

This gives two complex roots, −2.5 ± 1.32288i.
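The whole exercise can be automated: Newton's method with synthetic division finds a real zero, deflation reduces the degree, and the final quadratic is solved with the quadratic formula. The sketch below uses our own helper names; it deflates at the negative zero first, so its intermediate quotient differs from the Q_1 printed above, but the four zeros agree.

```python
import cmath

def horner_div(coeffs, r):
    """Synthetic division of [a_n, ..., a_0] by (x - r): (quotient, remainder)."""
    out = [coeffs[0]]
    for a in coeffs[1:]:
        out.append(a + out[-1] * r)
    return out[:-1], out[-1]

def newton_poly(coeffs, x, tol=1e-10, max_iter=100):
    """Newton's method using horner_div for P(x) and P'(x)."""
    for _ in range(max_iter):
        quot, p = horner_div(coeffs, x)    # p  = P(x)
        _, dp = horner_div(quot, x)        # dp = Q(x) = P'(x)
        x_next = x - p / dp
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

P = [1, 5, -9, -85, -136]          # x^4 + 5x^3 - 9x^2 - 85x - 136
r1 = newton_poly(P, 0.0)           # -> -4.123106
Q1, _ = horner_div(P, r1)          # deflate once
r2 = newton_poly(Q1, 3.0)          # -> +4.123106
Q2, _ = horner_div(Q1, r2)         # remaining quadratic x^2 + 5x + 8
a, b, c = Q2
disc = cmath.sqrt(b * b - 4 * a * c)
r3, r4 = (-b + disc) / (2 * a), (-b - disc) / (2 * a)   # -2.5 +/- 1.32288i
```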
Chapter 5

Interpolation and Polynomial Approximation

Consider data with two columns, x and y. We plot this data and see if we can fit it to a function y = f(x). This is what we call "interpolation". We will study the case when we fit the data to a polynomial.
5.1 Weierstrass Approximation Theorem

Theorem: Suppose that f is defined and continuous on [a, b]. For each ε > 0, there exists a polynomial P(x) with the property that

|f(x) − P(x)| < ε,   for all x ∈ [a, b].

The proof of this theorem can be found in most elementary textbooks on real analysis.

Taylor polynomials are used mainly to approximate a function at a specified point, whereas a good interpolating polynomial needs to provide a relatively accurate approximation over an entire interval. A Taylor polynomial is therefore not always appropriate for interpolation. As an example, approximating f(x) = 1/x at x = 3 using Taylor polynomials expanded about x = 1 leads to inaccurate results:

n        0    1    2    3    4     5    6    7
P_n(3)   1   −1    3   −5   11   −21   43  −85

5.2 Lagrange Polynomial
In this section we find approximating polynomials that are determined simply by specifying certain points in the plane through which they must pass.

Let us determine a polynomial of degree one that passes through the distinct points (x_0, f(x_0)) and (x_1, f(x_1)). We define the functions

L_0(x) = (x − x_1)/(x_0 − x_1)     (5.1)
L_1(x) = (x − x_0)/(x_1 − x_0)     (5.2)

and define

P(x) = L_0(x) f(x_0) + L_1(x) f(x_1)     (5.3)

It is clear that the polynomial P(x) coincides with f(x) at x_0 and x_1, and it is the unique linear function passing through (x_0, f(x_0)) and (x_1, f(x_1)).
To generalize this concept of linear interpolation, consider the construction of a polynomial of degree at most n that passes through the n + 1 points (x_0, f(x_0)), (x_1, f(x_1)), ..., (x_n, f(x_n)).

Theorem: If x_0, x_1, ..., x_n are n + 1 distinct numbers and f is a function whose values are given at these numbers, then a unique polynomial P(x) of degree at most n exists with

f(x_k) = P(x_k),   k = 0, 1, ..., n     (5.4)

This polynomial is given by

P(x) = Σ_{i=0}^{n} f(x_i) L_{n,i}(x)     (5.5)

with

L_{n,k}(x) = Π_{i=0, i≠k}^{n} (x − x_i)/(x_k − x_i)     (5.6)

Proof:

Theorem: Suppose x_0, x_1, ..., x_n are n + 1 distinct numbers in [a, b] and f ∈ C^{n+1}[a, b]. Then

f(x) = P(x) + [f^{(n+1)}(ξ(x)) / (n + 1)!] (x − x_0)(x − x_1)...(x − x_n)     (5.7)

where P(x) is the interpolating polynomial given by eq. (5.5) and ξ(x) ∈ (a, b).

Proof:
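Eqs. (5.5)-(5.6) translate directly into code; a minimal sketch (the function name is our own):

```python
def lagrange_interp(xs, ys, x):
    """Evaluate the Lagrange form P(x) = sum_k y_k L_{n,k}(x), eq. (5.5)."""
    total = 0.0
    n = len(xs)
    for k in range(n):
        L = 1.0                        # build L_{n,k}(x), eq. (5.6)
        for i in range(n):
            if i != k:
                L *= (x - xs[i]) / (xs[k] - xs[i])
        total += ys[k] * L
    return total
```

With the data of the worked example in section 5.3 (f at 1.0, 1.3, 1.6), the quadratic through the first three nodes gives P(1.5) ≈ 0.5124715, the same value Neville's method produces there.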
Definition: Let f be a function defined at x_0, x_1, ..., x_n, and suppose that m_1, m_2, ..., m_k are k distinct integers, with 0 ≤ m_i ≤ n for each i. The Lagrange polynomial that agrees with f(x) at the k points x_{m_1}, x_{m_2}, ..., x_{m_k} is denoted by P_{m_1 m_2 ... m_k}(x).

Theorem: Let f be defined at x_0, x_1, ..., x_k, and let x_j ≠ x_i be two numbers from this set. Then

P(x) = [(x − x_j) P_{0,1,...,j−1,j+1,...,k}(x) − (x − x_i) P_{0,1,...,i−1,i+1,...,k}(x)] / (x_i − x_j)     (5.8)

describes the kth Lagrange polynomial that interpolates f at the k + 1 points x_0, x_1, ..., x_k.

Proof:
5.3 Neville's Method

This theorem implies that the interpolating polynomials can be generated recursively. To avoid multiple subscripts, let Q_{i,j}, for 0 ≤ j ≤ i, denote the interpolating polynomial of degree j on the (j + 1) numbers x_{i−j}, x_{i−j+1}, ..., x_{i−1}, x_i; that is,

Q_{i,j} = P_{i−j, i−j+1, ..., i−1, i}     (5.9)

The table of polynomials is built column by column:

x_0   P_0 = Q_{0,0}
x_1   P_1 = Q_{1,0}   P_{0,1} = Q_{1,1}
x_2   P_2 = Q_{2,0}   P_{1,2} = Q_{2,1}   P_{0,1,2} = Q_{2,2}
x_3   P_3 = Q_{3,0}   P_{2,3} = Q_{3,1}   P_{1,2,3} = Q_{3,2}   P_{0,1,2,3} = Q_{3,3}
x_4   P_4 = Q_{4,0}   P_{3,4} = Q_{4,1}   P_{2,3,4} = Q_{4,2}   P_{1,2,3,4} = Q_{4,3}   P_{0,1,2,3,4} = Q_{4,4}     (5.10)

Neville's Iterated Interpolation is given by

Q_{i,j} = [(x − x_{i−j}) Q_{i,j−1} − (x − x_i) Q_{i−1,j−1}] / (x_i − x_{i−j})     (5.11)

with

Q_{0,0} = f(x_0), Q_{1,0} = f(x_1), ..., Q_{n,0} = f(x_n)     (5.12)
Example: Suppose the function f is given at the following values:

x            f(x)
x_0 = 1.0    0.7651977
x_1 = 1.3    0.6200860
x_2 = 1.6    0.4554022
x_3 = 1.9    0.2818186
x_4 = 2.2    0.1103623

We want to approximate f(1.5) using various interpolating polynomials at x = 1.5. Using Neville's method, eq. (5.11), we can calculate the Q_{i,j}:

Q_{0,0} = P_0 = 0.7651977
Q_{1,0} = P_1 = 0.6200860
Q_{1,1} = P_{0,1} = [(x − x_0) Q_{1,0} − (x − x_1) Q_{0,0}] / (x_1 − x_0) = 0.5233449
Q_{2,0} = P_2 = 0.4554022
Q_{2,1} = P_{1,2} = [(x − x_1) Q_{2,0} − (x − x_2) Q_{1,0}] / (x_2 − x_1) = 0.5102968
Q_{2,2} = P_{0,1,2} = [(x − x_0) Q_{2,1} − (x − x_2) Q_{1,1}] / (x_2 − x_0) = 0.5124715

Assignment: study example 6, page 113.
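The recursion (5.11) is easy to implement; the sketch below returns the whole Q table so the intermediate entries above can be read off (the function name is ours):

```python
def neville(xs, ys, x):
    """Neville's iterated interpolation, eq. (5.11); returns the table Q."""
    n = len(xs)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][0] = ys[i]                       # Q_{i,0} = f(x_i)
    for i in range(1, n):
        for j in range(1, i + 1):
            Q[i][j] = ((x - xs[i - j]) * Q[i][j - 1]
                       - (x - xs[i]) * Q[i - 1][j - 1]) / (xs[i] - xs[i - j])
    return Q

Q = neville([1.0, 1.3, 1.6], [0.7651977, 0.6200860, 0.4554022], 1.5)
# Q[1][1] ~ 0.5233449, Q[2][1] ~ 0.5102968, Q[2][2] ~ 0.5124715
```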
5.4 Newton Interpolating Polynomial

Suppose there is a known polynomial P_{n−1}(x) that interpolates the data set (x_i, y_i), i = 0, 1, ..., n − 1. When one more data point (x_n, y_n), which is distinct from all the other data points, is added to the data set, we can construct a new polynomial P_n(x) that interpolates the new data set. To do so, let us consider the polynomial

P_n(x) = P_{n−1}(x) + c_n Π_{i=0}^{n−1} (x − x_i)     (5.13)

where c_n is an unknown constant. For the case when n = 0, we write

P_0(x) = y_0     (5.14)

It is clear that for all the old points

P_n(x_i) = P_{n−1}(x_i),   for i = 0, ..., n − 1     (5.15)

At the new point (x_n, y_n) we have

P_n(x_n) = P_{n−1}(x_n) + c_n Π_{i=0}^{n−1} (x_n − x_i)     (5.16)

which leads to

c_n = [P_n(x_n) − P_{n−1}(x_n)] / Π_{i=0}^{n−1} (x_n − x_i) = [y_n − P_{n−1}(x_n)] / Π_{i=0}^{n−1} (x_n − x_i)     (5.17)

So, for any given data set (x_i, y_i), i = 0, 1, ..., n, we can obtain the interpolating polynomial by a recursive process that starts from P_0(x) and uses the above construction to get P_1(x), P_2(x), ..., P_n(x). We will demonstrate this process through the following example:

i     0     1      2     3      4
x_i   0     0.25   0.5   0.75   1
y_i   −1    0      1     0      1
First step: for i = 0 we have

P_0(x) = y_0 = −1

Second step: adding the point (x_1, y_1) = (0.25, 0), we get

P_1(x) = P_0(x) + c_1 (x − x_0) = −1 + c_1 x

The constant c_1 is given by

c_1 = [y_1 − P_0(x_1)] / (x_1 − x_0) = [0 − (−1)] / (0.25 − 0) = 4

Thus,

P_1(x) = −1 + 4x

Third step: adding the point (x_2, y_2) = (0.5, 1),

P_2(x) = P_1(x) + c_2 (x − x_0)(x − x_1) = −1 + 4x + c_2 x (x − 0.25)

The constant c_2 is given by

c_2 = [y_2 − P_1(x_2)] / [(x_2 − x_0)(x_2 − x_1)] = [1 − (−1 + 4 × 0.5)] / [(0.5 − 0)(0.5 − 0.25)] = 0

Thus,

P_2(x) = −1 + 4x

Continuing the calculations we find c_3 = −64/3 and c_4 = 64, so

P_3(x) = −1 + 4x − (64/3) x (x − 1/4)(x − 1/2)
P_4(x) = −1 + 4x − (64/3) x (x − 1/4)(x − 1/2) + 64 x (x − 1/4)(x − 1/2)(x − 3/4)
Divided difference polynomial: The divided difference polynomial is a helpful method to generate interpolation polynomials. The first-order divided difference of f at x = x_i is given by

f[x_i, x_{i+1}] = [f(x_{i+1}) − f(x_i)] / (x_{i+1} − x_i)     (5.18)

The second-order divided difference of f at x_i is given by

f[x_i, x_{i+1}, x_{i+2}] = [f[x_{i+1}, x_{i+2}] − f[x_i, x_{i+1}]] / (x_{i+2} − x_i)     (5.19)

We can generalize this to higher order:

f[x_0, x_1, ..., x_n] = [f[x_1, ..., x_n] − f[x_0, ..., x_{n−1}]] / (x_n − x_0)     (5.20)

With these definitions we get the interpolation polynomial as

P_n(x) = f[x_0] + f[x_0, x_1](x − x_0) + ... + f[x_0, ..., x_n] Π_{i=0}^{n−1} (x − x_i)
       = f[x_0] + Σ_{i=1}^{n} f[x_0, ..., x_i] Π_{j=0}^{i−1} (x − x_j)     (5.21)

Example: consider again the data

i     0     1      2     3      4
x_i   0     0.25   0.5   0.75   1
y_i   −1    0      1     0      1
Let us find the interpolation polynomial for the above table. The divided differences are:

f[x_0] = −1,  f[x_1] = 0,  f[x_2] = 1,  f[x_3] = 0,  f[x_4] = 1

1st DD:
f[x_0, x_1] = (f[x_1] − f[x_0]) / (x_1 − x_0) = 4
f[x_1, x_2] = (f[x_2] − f[x_1]) / (x_2 − x_1) = 4
f[x_2, x_3] = (f[x_3] − f[x_2]) / (x_3 − x_2) = −4
f[x_3, x_4] = (f[x_4] − f[x_3]) / (x_4 − x_3) = 4

2nd DD:
f[x_0, x_1, x_2] = 0
f[x_1, x_2, x_3] = −16
f[x_2, x_3, x_4] = 16

3rd DD:
f[x_0, x_1, x_2, x_3] = −64/3
f[x_1, x_2, x_3, x_4] = 128/3

4th DD:
f[x_0, x_1, x_2, x_3, x_4] = 64

So, the polynomial is

P_4(x) = −1 + 4(x − 0) + 0(x − 0)(x − 1/4) − (64/3)(x − 0)(x − 1/4)(x − 1/2) + 64(x − 0)(x − 1/4)(x − 1/2)(x − 3/4)
       = −1 + 4x − (64/3) x (x − 1/4)(x − 1/2) + 64 x (x − 1/4)(x − 1/2)(x − 3/4)
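The table above can be generated mechanically. The sketch below computes the leading divided differences f[x_0], f[x_0,x_1], ..., f[x_0,...,x_n]; using Fraction keeps the arithmetic exact, so the coefficients come out as −64/3 and 64 rather than rounded floats.

```python
from fractions import Fraction

def divided_differences(xs, ys):
    """Return [f[x0], f[x0,x1], ..., f[x0,...,xn]] via eq. (5.20)."""
    n = len(xs)
    dd = list(ys)                       # current column of the DD table
    coeffs = [dd[0]]
    for order in range(1, n):
        dd = [(dd[i + 1] - dd[i]) / (xs[i + order] - xs[i])
              for i in range(n - order)]
        coeffs.append(dd[0])
    return coeffs

xs = [Fraction(k, 4) for k in range(5)]       # 0, 1/4, 1/2, 3/4, 1
ys = [Fraction(v) for v in (-1, 0, 1, 0, 1)]
coeffs = divided_differences(xs, ys)          # [-1, 4, 0, -64/3, 64]
```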
5.5 Polynomial Forms

We follow the book Introduction to Numerical Analysis, Alastair Wood, Addison-Wesley.

Power form: A polynomial of degree n can be written in power form as

P_n(x) = a_0 + a_1 x + ... + a_n x^n = Σ_{k=0}^{n} a_k x^k     (5.22)

This form is convenient for analysis but may lead to loss of significance. For example, consider

P_1(x) = 18001/3 − x

This polynomial takes the value 1/3 at x = 6000 and −2/3 at x = 6001. On a finite-precision machine with 5 decimal digits the coefficients are stored as a_0* = 6000.3 and a_1* = −1, and hence

P_1(6000) = 6000.3 − 6000 = 0.3
P_1(6001) = 6000.3 − 6001 = −0.7

Only one digit of the exact value is recovered, yet the coefficients are accurate to 5 digits! Four significant digits have been lost due to the subtraction of two nearly equal large numbers.
Shifted power form: The drawback seen in the previous example can be alleviated by changing the origin of x to a non-zero value c and writing the polynomial (5.22) as

P_n(x) = Σ_{k=0}^{n} a_k x^k = b_0 + b_1 (x − c) + ... + b_n (x − c)^n = Σ_{k=0}^{n} b_k (x − c)^k     (5.23)

This form is called the shifted power form; c is a centre and the b_k are constant coefficients. The previous example can be written as

P_1(x) = 18001/3 − x = 1/3 − (x − 6000)

So we get

P_1(6000) = 1/3 − (6000 − 6000) = 1/3 = 0.33333
P_1(6001) = 1/3 − (6001 − 6000) = 1/3 − 1 = −0.66667

These values are accurate to 5 digits and there is no loss of significance.

We can find the coefficients b_k by using the Taylor polynomial at x = c. This gives

P_n(x) = Σ_{k=0}^{n} [P_n^{(k)}(c) / k!] (x − c)^k     (5.24)

where P_n^{(k)}(c) is the k-th derivative of P_n(x) at x = c. Thus, b_k = P_n^{(k)}(c) / k!.
Newton form: We can generalize equation (5.24) by choosing n centres c_1, ..., c_n; we get

P_n(x) = d_0 + d_1 (x − c_1) + d_2 (x − c_1)(x − c_2) + ... + d_n (x − c_1)(x − c_2) ... (x − c_n)
       = d_0 + Σ_{k=1}^{n} d_k Π_{j=1}^{k} (x − c_j)     (5.25)

This is the Newton form, which is particularly useful for polynomial interpolation. If c_1 = c_2 = ... = c_n = c we recover the shifted power form, and for c_1 = c_2 = ... = c_n = 0 we recover the power form.
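The Newton form is evaluated efficiently by nested multiplication, a generalized Horner scheme: P(x) = d_0 + (x − c_1)(d_1 + (x − c_2)(d_2 + ...)). A sketch (the function name is ours):

```python
def newton_form_eval(d, centres, x):
    """Evaluate eq. (5.25): d = [d0, ..., dn], centres = [c1, ..., cn]."""
    result = d[-1]
    for k in range(len(d) - 2, -1, -1):
        result = d[k] + (x - centres[k]) * result   # centres[k] is c_{k+1}
    return result

# With all centres equal to 0 this reduces to the power form.  With the
# interpolation centres 0, 1/4, 1/2, 3/4 and the divided-difference
# coefficients -1, 4, 0, -64/3, 64 it reproduces the data of section 5.4:
value = newton_form_eval([-1, 4, 0, -64 / 3, 64], [0, 0.25, 0.5, 0.75], 0.5)
```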
5.6 Spline Interpolation

For a large data set, a single approximating polynomial satisfying all the data (x_i, f(x_i)) will have high degree. In general a polynomial of high degree oscillates, which may not be acceptable behavior. One solution of this problem is to use interpolation in a piecewise manner.

The simplest approach uses linear interpolants. Given n + 1 items of data in ascending order by x, the data (x_i, f(x_i)) and (x_{i+1}, f(x_{i+1})) are interpolated by a straight line. A piecewise linear interpolant is called a linear spline, S_1. The linear spline suffers from a lack of smoothness: continuity is assured, but there is a jump in the first derivative at the data points. The solution is to use splines having greater smoothness. We concentrate on the cubic spline.

Definition(1): For the data (x_i, f(x_i)), i = 0, ..., n, S_3 is a cubic spline on [x_0, x_n] if:
(1) S_3 restricted to [x_{i−1}, x_i] is a polynomial of degree at most 3;
(2) S_3 ∈ C²[x_0, x_n];
(3) if s_{3,i} and s_{3,i+1} are the cubic interpolants on adjacent sub-intervals, then

s_{3,i}(x_i) = s_{3,i+1}(x_i) = f(x_i)
s'_{3,i}(x_i) = s'_{3,i+1}(x_i)
s''_{3,i}(x_i) = s''_{3,i+1}(x_i)

A consequence of this definition is that the individual interpolants can no longer be constructed in isolation: the piecewise interpolants s_{3,1}, ..., s_{3,n} are interdependent through the derivative continuity conditions.

(1) Alastair Wood, Introduction to Numerical Analysis
On the interval [x_{i−1}, x_i], and for i = 1, 2, ..., n, we have

s_{3,i}(x) = f(x_{i−1}) + a_i (x − x_{i−1}) + b_i (x − x_{i−1})² + c_i (x − x_{i−1})³     (5.26)

There are 3n constants to be determined: a_i, b_i, c_i, i = 1, ..., n. Continuity enforces

s_{3,i}(x_i) = f(x_{i−1}) + a_i (x_i − x_{i−1}) + b_i (x_i − x_{i−1})² + c_i (x_i − x_{i−1})³
s_{3,i+1}(x_i) = f(x_i) + a_{i+1}(x_i − x_i) + b_{i+1}(x_i − x_i)² + c_{i+1}(x_i − x_i)³ = f(x_i)

which leads to

f(x_{i−1}) + a_i h_i + b_i h_i² + c_i h_i³ = f(x_i)

where h_i = x_i − x_{i−1}, for i = 1, ..., n.

For the first derivative we have

s'_{3,i}(x_i) = a_i + 2 b_i (x_i − x_{i−1}) + 3 c_i (x_i − x_{i−1})²
s'_{3,i+1}(x_i) = a_{i+1} + 2 b_{i+1}(x_i − x_i) + 3 c_{i+1}(x_i − x_i)² = a_{i+1}

which leads to

a_i + 2 b_i h_i + 3 c_i h_i² = a_{i+1}

for i = 1, ..., n − 1.

For the second derivative we get

s''_{3,i}(x_i) = 2 b_i + 6 c_i (x_i − x_{i−1})
s''_{3,i+1}(x_i) = 2 b_{i+1} + 6 c_{i+1}(x_i − x_i) = 2 b_{i+1}

which leads to

b_i + 3 c_i h_i = b_{i+1}

for i = 1, ..., n − 1.

The natural cubic spline is defined by

s''_{3,1}(x_0) = s''_{3,n}(x_n) = 0     (5.27)

in other words

b_1 = 0   and   b_n + 3 c_n h_n = 0     (5.28)

We have:
3n constants to be determined;
n equations from continuity;
n − 1 equations from the 1st derivative;
n − 1 equations from the 2nd derivative;
2 equations from the natural cubic spline.

Therefore, we can find all the constants.
5.7 Parametric Curves

In some cases curves cannot be expressed as a function of one coordinate variable y in terms of the other variable x. A straightforward method to represent such curves is the parametric technique. We choose a parameter t on the interval [t_0, t_n], with t_0 < t_1 < ... < t_n, and construct approximation functions with

x_i = x(t_i)   and   y_i = y(t_i)     (5.29)

Consider the curve given in figure 3.14, page 158, of the textbook. From the curve we can extract the following table:

i     0    1      2     3      4
t_i   0    0.25   0.5   0.75   1
x_i   −1   0      1     0      1
y_i   0    1      0.5   0      −1
Please refer to page 158.

Example: The first part of the graph is described by the points

x = [10, 6, 2, 1, 2, 6, 10]
y = [3, 1, 1, 4, 6, 7, 6]
t = [0, 1/6, 1/3, 1/2, 2/3, 5/6, 1]

The second part of the graph is described by

x = [2, 6, 10, 10, 10, 10, 13]
y = [10, 12, 12, 10, 10, 1, 1]
t = [0, 1/4, 2/4, 3/4, 1]
The cubic spline polynomials fitted to the two parts of the graph are piecewise cubics in the parameter u. For the first part, f_x(u) and f_y(u) are given piecewise on the sub-intervals with breakpoints u = 1/6, 1/3, 1/2, 2/3, 5/6 (eqs. (5.30)-(5.32)); for the second part, f_x(u) and f_y(u) are given piecewise with breakpoints u = 1/4, 1/2, 3/4 (eqs. (5.33)-(5.35)). [The coefficient tables for these piecewise cubics were garbled in extraction and are not reproduced here.]

Merging the two graphs, we get the complete curve.
2
− x ) + a (x − x ) + a (x − x ) − x ) + b (x − x ) + b (x − x ) 0
2
1
0
2
2
1
3
0
3
1
which become f 1 (x) = 2 + a1 (x f 2 (x) = 3 + b1 (x
2
− 4) + a (x − 4) + a (x − 4) − 9) + b (x − 9) + b (x − 9) 2
2
2
3
3
3
The continuity leads to 3 1 4 1
= = = =
2 + a1 (9 4) + a2(9 4)2 + a3 (9 4)3 5a1 + 25a 25a2 + 125a 125a3 3 + b1 (16 9) + b2 (16 9)2 + b3 (16 9)3 +7b1 + 49b 49b2 + 343b 343b3
−
−
−
−
−
−
The first derivatives: 2a2 (x f 1′ (x) = a1 + 2a f 2′ (x) = b1 + 2b 2b2 (x
3a (x − 4) − 4) + 3a − 9) + 3b 3b (x − 9) 3
3
The continuity of the derivatives lead to: b1 = a1 + 10a 10a2 + 75a 75a3 The second derivatives: f 1′′ (x) = 2a2 + 6a 6a3 (x f 2′′ (x) = 2b2 + 6b 6b3 (x 54
− 4) − 9)
3
2
2
3
3
The continuity of the second derivatives lead to b2 = a2 + 15a 15a3
The condition of natural spline 0 = a2 0 = b2 + 21b 21b3 Now we collect our equations: 1 1 b1 b2 0 0
= = = = = =
5a1 + 25a 25a2 + 125a 125a3 7b1 + 49b 49b2 + 343b 343b3 a1 + 10a 10a2 + 75a 75a3 a2 + 15a 15a3 a2 b2 + 21b 21b3
Because we are interested in f (7) f (7) we can use
55
1 1 b1 b2 0
5a1 + 125a 125a3 +7b1 + 49b 49b2 + 343b 343b3 a1 + 75a 75a3 15a 15a3 b2 + 21b 21b3
= = = = =
then, we get: 1 = 5a1 + 125a 125a3 and 1 = 7(a1 + 75a 75a3) + 49(15a 49(15a3 ) + 343 1 = 7a1 + 1015a 1015a3
− 15a 15a3 21
The solutions of the following equations are: 89 420 1 = 2100
a1 = a3
−
Thus, the cubic polynomial is f 1 (x) = 2 +
89 (x 420
1 − 4) − 2100 (x − 4)
3
Therefore f 1 (7) =
459 175
(5.36)
We can find the other polynomial in the same way f 2 (x) = 3 +
37 (x 210
1 − 9) − 140 (x − 9)
56
2
+
1 (x 2940
− 9)
3
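The 6×6 linear system of this example can be assembled and solved numerically. The sketch below uses a tiny Gaussian-elimination helper (our own, with partial pivoting) rather than a library, so it is self-contained; the six equations are exactly the ones collected above.

```python
def solve_linear(A, rhs):
    """Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(A)
    M = [row[:] + [b] for row, b in zip(A, rhs)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# unknowns ordered a1, a2, a3, b1, b2, b3
A = [
    [5, 25, 125, 0, 0, 0],      # f1(9) = 3
    [0, 0, 0, 7, 49, 343],      # f2(16) = 4
    [1, 10, 75, -1, 0, 0],      # f1'(9) = f2'(9)
    [0, 1, 15, 0, -1, 0],       # f1''(9) = f2''(9)
    [0, 1, 0, 0, 0, 0],         # natural condition: a2 = 0
    [0, 0, 0, 0, 1, 21],        # natural condition: b2 + 21 b3 = 0
]
rhs = [1, 1, 0, 0, 0, 0]
a1, a2, a3, b1, b2, b3 = solve_linear(A, rhs)
f1_at_7 = 2 + a1 * 3 + a2 * 9 + a3 * 27    # = 459/175 ~ 2.62286
```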
Chapter 6

Numerical Differentiation and Integration

6.1 Numerical Differentiation

Consider a small increment ∆x = h in x. According to Taylor's theorem, we have

f(x + h) = f(x) + h f'(x) + (h²/2) f''(ξ)     (6.1)

where ξ is a real number between x and x + h. We can get

f'(x) = [f(x + h) − f(x)]/h − (h/2) f''(ξ)     (6.2)

We can get the same formula using the linear Lagrange polynomial. Using two points x_0 and x_1 = x_0 + h, we get

f(x) = f(x_0)(x − x_1)/(x_0 − x_1) + f(x_1)(x − x_0)/(x_1 − x_0) + [f''(ξ)/2!](x − x_0)(x − x_1)
     = −f(x_0)(x − x_1)/h + f(x_1)(x − x_0)/h + [f''(ξ)/2!](x − x_0)(x − x_1)

Now we calculate the derivative:

f'(x) = [f(x_1) − f(x_0)]/h + [f''(ξ)/2!](2x − x_0 − x_1) + [(x − x_0)(x − x_1)/2!] d/dx[f''(ξ)]

and at x = x_0,

f'(x_0) = [f(x_0 + h) − f(x_0)]/h − (h/2) f''(ξ)

This formula is known as the forward-difference formula if h > 0 and the backward-difference formula if h < 0. The previous formula is also called the two-point formula.

The backward-difference formula can be written for h > 0 as

f'(x) = [f(x) − f(x − h)]/h + (h/2) f''(ξ)     (6.3)
(n+1)-point formula: To obtain general derivative approximation formulas, suppose that x_0, x_1, ..., x_n are (n+1) distinct numbers in some interval I and that f ∈ C^{n+1}(I). We can use Lagrange polynomials:

f(x) = Σ_{k=0}^{n} f(x_k) L_k(x) + [f^{(n+1)}(ξ(x))/(n + 1)!] Π_{k=0}^{n} (x − x_k)     (6.4)

Then, differentiating f(x) and evaluating at the node x_j, we get

f'(x_j) = Σ_{k=0}^{n} f(x_k) L'_k(x_j) + [f^{(n+1)}(ξ_j)/(n + 1)!] Π_{k=0, k≠j}^{n} (x_j − x_k)     (6.5)

In general, using more evaluation points produces greater accuracy, although the number of functional evaluations and the growth of round-off error discourage this somewhat.

Three-point formulas:

f'(x_0) = (1/2h)[−3 f(x_0) + 4 f(x_0 + h) − f(x_0 + 2h)] + (h²/3) f^{(3)}(ξ_0)     (6.6)
f'(x_0) = (1/2h)[f(x_0 + h) − f(x_0 − h)] − (h²/6) f^{(3)}(ξ_1)     (6.7)

where ξ_0 lies between x_0 and x_0 + 2h and ξ_1 lies between x_0 − h and x_0 + h. Although the errors in both formulas are O(h²), the error in the second equation is approximately half the error in the first. This is because it uses data from both sides of x_0.
Five-point formulas:

f'(x_0) = (1/12h)[f(x_0 − 2h) − 8 f(x_0 − h) + 8 f(x_0 + h) − f(x_0 + 2h)] + (h⁴/30) f^{(5)}(ξ)     (6.8)

where ξ lies between x_0 − 2h and x_0 + 2h. The other five-point formula is useful for end-point approximations. It is given by

f'(x_0) = (1/12h)[−25 f(x_0) + 48 f(x_0 + h) − 36 f(x_0 + 2h) + 16 f(x_0 + 3h) − 3 f(x_0 + 4h)] + (h⁴/5) f^{(5)}(ξ)     (6.9)

where ξ lies between x_0 and x_0 + 4h. Left-endpoint approximations are found using h > 0 and right-endpoint approximations with h < 0.

Example: Let f(x) = x e^x. The values of f at different x are given:

x      f(x)
1.8    10.889365
1.9    12.703199
2.0    14.778112
2.1    17.148957
2.2    19.855030

Since f'(x) = (x + 1) e^x, we have f'(2) = 22.167168. Let us approximate f'(2) using the three-point formulas.

f'(x_0) ≈ (1/2h)[−3 f(x_0) + 4 f(x_0 + h) − f(x_0 + 2h)]:
f'(2) ≈ (1/0.2)[−3 f(2.0) + 4 f(2.1) − f(2.2)] ≈ 22.032310,   h = 0.1
f'(2) ≈ (1/(−0.2))[−3 f(2.0) + 4 f(1.9) − f(1.8)] ≈ 22.054525,   h = −0.1

f'(x_0) ≈ (1/2h)[f(x_0 + h) − f(x_0 − h)]:
f'(2) ≈ (1/0.2)[f(2.1) − f(1.9)] ≈ 22.228790,   h = 0.1

The errors are 22.167168 − 22.032310 = 0.134858, 22.167168 − 22.054525 = 0.112643, and 22.167168 − 22.228790 = −0.61622 × 10⁻¹, respectively.
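The formulas and the example are easy to check in code; a minimal sketch (the function names are ours):

```python
import math

def three_point_endpoint(f, x0, h):
    """Eq. (6.6) without the error term."""
    return (-3 * f(x0) + 4 * f(x0 + h) - f(x0 + 2 * h)) / (2 * h)

def three_point_midpoint(f, x0, h):
    """Eq. (6.7) without the error term."""
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

def five_point_midpoint(f, x0, h):
    """Eq. (6.8) without the error term."""
    return (f(x0 - 2 * h) - 8 * f(x0 - h)
            + 8 * f(x0 + h) - f(x0 + 2 * h)) / (12 * h)

f = lambda x: x * math.exp(x)       # exact derivative: f'(2) = 3 e^2
# three_point_endpoint(f, 2.0, 0.1)  ~ 22.03230
# three_point_midpoint(f, 2.0, 0.1)  ~ 22.22879
# five_point_midpoint(f, 2.0, 0.1)   ~ 22.16700, much closer to 22.167168
```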
It is also possible to find approximations to higher derivatives of a function using only tabulated values of the function at various points. Expand the function f in a third Taylor polynomial about x0:

f(x) = f(x0) + (x − x0) f'(x0) + ((x − x0)^2 / 2) f''(x0) + ((x − x0)^3 / 6) f^(3)(x0) + ((x − x0)^4 / 24) f^(4)(ξ),    (6.10)

and evaluate at x = x0 ± h:

f(x0 + h) = f(x0) + h f'(x0) + (h^2/2) f''(x0) + (h^3/6) f^(3)(x0) + (h^4/24) f^(4)(ξ+)    (6.11)
f(x0 − h) = f(x0) − h f'(x0) + (h^2/2) f''(x0) − (h^3/6) f^(3)(x0) + (h^4/24) f^(4)(ξ−)    (6.12)

where x0 − h < ξ− < x0 < ξ+ < x0 + h. Adding the two last equations we get

f(x0 + h) + f(x0 − h) = 2 f(x0) + h^2 f''(x0) + (h^4/24) [f^(4)(ξ+) + f^(4)(ξ−)],    (6.13)

and solving this last equation for f''(x0) we find that

f''(x0) = (1/h^2) [f(x0 − h) − 2 f(x0) + f(x0 + h)] − (h^2/24) [f^(4)(ξ+) + f^(4)(ξ−)]    (6.14)

Suppose that f^(4) is continuous on [x0 − h, x0 + h]. Since (1/2) (f^(4)(ξ+) + f^(4)(ξ−)) is between f^(4)(ξ+) and f^(4)(ξ−), the Intermediate Value Theorem implies that a number ξ exists between ξ+ and ξ−, and hence in (x0 − h, x0 + h), with

f^(4)(ξ) = (f^(4)(ξ+) + f^(4)(ξ−)) / 2

This leads to

f''(x0) = (1/h^2) [f(x0 − h) − 2 f(x0) + f(x0 + h)] − (h^2/12) f^(4)(ξ)    (6.15)

where ξ ∈ (x0 − h, x0 + h).
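The second-derivative midpoint formula, with its error term dropped, is easy to try numerically. The Python sketch below (an illustration added here; f = sin is an arbitrary choice) shows the O(h^2) central-difference approximation:

```python
import math

def second_derivative(f, x0, h):
    """Central-difference approximation of f''(x0); error is O(h^2)."""
    return (f(x0 - h) - 2*f(x0) + f(x0 + h)) / h**2

approx = second_derivative(math.sin, 1.0, 0.01)
exact = -math.sin(1.0)   # (sin x)'' = -sin x
print(approx, exact)
```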
It is important to pay attention to round-off error when approximating derivatives. Let us illustrate this with an example. Consider the two-point formula

f'(x) ≈ (f(x + h) − f(x)) / h = (f1 − f0) / h    (6.16)

If we assume the round-off errors in f0 and f1 are e0 and e1, respectively, then

f'(x) ≈ ((f1 + e1) − (f0 + e0)) / h = (f1 − f0)/h + (e1 − e0)/h    (6.17)

If the errors are of magnitude e, we can at worst get

Round-off error = 2e / h    (6.18)

We know that the truncation error is bounded by

|(h/2) f''(ξ)| ≤ M h / 2,    where M = max |f''(ξ)|, x ≤ ξ ≤ x + h    (6.19)

Thus the bound for the total error is

Total error = Round-off error + Truncation error
E = 2e/h + M h/2    (6.20)

Note that when the step size h is increased, the truncation error increases while the round-off error decreases. The optimal value of h can be found from

E'(h) = M/2 − 2e/h^2 = 0  ⟹  h = 2 sqrt(e/M).    (6.21)
Three-point formula:

f'(x0) = (1/(2h)) [f(x0 + h) − f(x0 − h)] − (h^2/6) f^(3)(ξ)    (6.22)

Suppose that in evaluating f(x0 ± h) we encounter round-off errors e(x0 ± h). Then our computed values f̃(x0 ± h) are related to f(x0 ± h) by

f(x0 ± h) = f̃(x0 ± h) + e(x0 ± h)    (6.23)

The total error,

f'(x0) − (f̃(x0 + h) − f̃(x0 − h)) / (2h) = (e(x0 + h) − e(x0 − h)) / (2h) − (h^2/6) f^(3)(ξ),    (6.24)

is due in part to the round-off error and in part to the truncation error. If we assume that the round-off errors are bounded by ǫ > 0 and that the third derivative is bounded by M > 0, then the total error is bounded by

|f'(x0) − (f̃(x0 + h) − f̃(x0 − h)) / (2h)| ≤ ǫ/h + (h^2/6) M.    (6.26)

To reduce the truncation error, h^2 M/6, we must reduce h. But if h is reduced, the round-off error ǫ/h grows. In practice, then, it is seldom advantageous to let h be too small, since the round-off error will dominate the calculations. The minimum error is attained at the optimal value of h given by

h = (3ǫ/M)^(1/3)    (6.27)

Numerical differentiation is unstable: the small values of h needed to reduce the truncation error also cause the round-off error to grow.
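This trade-off between truncation and round-off error can be observed directly. The Python sketch below (added for illustration; eps and M are rough assumed bounds, not values from the notes) compares the central-difference error at a moderate h, at the near-optimal h = (3ǫ/M)^(1/3), and at an h that is far too small:

```python
import math

def central(f, x, h):
    # Three-point midpoint formula (truncation error ~ h^2 M / 6)
    return (f(x + h) - f(x - h)) / (2 * h)

f, x = math.exp, 1.0
exact = math.e              # f'(1) = e

eps = 1e-16                 # assumed bound on round-off in evaluating f
M = math.e                  # assumed bound on |f'''| near x = 1
h_opt = (3 * eps / M) ** (1/3)

for h in (1e-2, h_opt, 1e-12):
    print(h, abs(central(f, x, h) - exact))
```

The error at h_opt is far smaller than at h = 1e-2 (truncation dominates) and at h = 1e-12 (round-off dominates).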
Example: Show that the differentiation rule

f'(x0) ≈ a0 f0 + a1 f1 + a2 f2

is exact for all f ∈ C^3 if, and only if, it is exact for

f(x) = 1,    f(x) = x,    f(x) = x^2,

and find the values of a0, a1, and a2.

Solution: According to the Lagrange interpolation formula

f(x) = sum_{k=0}^{n} f(xk) Lk(x) + (f^(n+1)(ξ(x)) / (n+1)!) prod_{k=0}^{n} (x − xk)    (6.28)

with

Lk(x) = prod_{i=0, i≠k}^{n} (x − xi) / (xk − xi),    (6.29)

we calculate the derivative of f(x) at a node xj:

f'(xj) = sum_{k=0}^{n} f(xk) Lk'(xj) + (f^(n+1)(ξj) / (n+1)!) prod_{k=0, k≠j}^{n} (xj − xk)    (6.30)

Therefore, for n = 2 we get

f'(x0) = f0 L0'(x0) + f1 L1'(x0) + f2 L2'(x0) + (f^(3)(ξ0) / 6) prod_{k=1}^{2} (x0 − xk)    (6.31)

We have to show that:

– For f(x) = 1, x, x^2, the formula f'(x0) = a0 f0 + a1 f1 + a2 f2 is exact, since f^(3)(ξ0) = 0 for all ξ0.

– Conversely, suppose the formula f'(x0) = a0 f0 + a1 f1 + a2 f2 is exact. Then:

if f'(x0) = a0 f0 + a1 f1 + a2 f2 is exact
  ⟹ for all ξ, f^(3)(ξ) = 0
  ⟹ f(x) = α + βx + γx^2 for some α, β, γ
  ⟹ the formula is exact for f(x) = 1, x, x^2

Exercises section 4.1: 1, 3, 5, 9, 13, 15, 19
6.2 Richardson’s Extrapolation
Richardson’s extrapolation is used to generate high-accuracy results while using low-order formulas. Extrapolation can be applied whenever it is known that an approximation technique has an error term with a predictable form like

M − N(h) = K1 h + K2 h^2 + ...,    (6.32)

for some collection of unknown constants Ki, where N(h) approximates an unknown value M. In general M − N(h) ≈ K1 h, which is O(h), unless there is a large variation in magnitude among the constants Ki. If M can be written in the form

M = N(h) + sum_{j=1}^{m−1} Kj h^j + O(h^m),    (6.33)

then for j = 2, 3, ..., m, we have an O(h^j) approximation of the form

Nj(h) = N_{j−1}(h/2) + (N_{j−1}(h/2) − N_{j−1}(h)) / (2^{j−1} − 1)    (6.34)

These approximations are generated by rows, in the order indicated by the numbered entries in the following table:

O(h)                  O(h^2)        O(h^3)        O(h^4)
1: N1(h)   ≡ N(h)
2: N1(h/2) ≡ N(h/2)   3: N2(h)
4: N1(h/4) ≡ N(h/4)   5: N2(h/2)    6: N3(h)
7: N1(h/8) ≡ N(h/8)   8: N2(h/4)    9: N3(h/2)    10: N4(h)
                                                       (6.35)

If M is in a form that contains only even powers of h, then

Nj(h) = N_{j−1}(h/2) + (N_{j−1}(h/2) − N_{j−1}(h)) / (4^{j−1} − 1)    (6.36)

for j = 2, 3, ..., gives an O(h^{2j}) approximation.
Example: We want to determine an approximate value of f'(1.0) with h = 0.4, where f(x) = ln x. We use Richardson’s extrapolation up to N3(h). We have

f'(x0) = (1/(2h)) [f(x0 + h) − f(x0 − h)] − (h^2/6) f^(3)(x0) − (h^4/120) f^(5)(x0) − ...    (6.37)

In the case h = 0.4 and x0 = 1, we can calculate N3(h) using

N1(h) = (1/(2h)) [ln(x0 + h) − ln(x0 − h)]

N1(0.4) = 1.059122326
N1(0.2) = 1.013662770
N1(0.1) = 1.003353478

We then use

Nj(h) = N_{j−1}(h/2) + (N_{j−1}(h/2) − N_{j−1}(h)) / (4^{j−1} − 1)    (6.38)

to get

N2(0.4) = N1(0.2) + (N1(0.2) − N1(0.4)) / (4 − 1)   = 0.9985095847
N2(0.2) = N1(0.1) + (N1(0.1) − N1(0.2)) / (4 − 1)   = 0.9999170473
N3(0.4) = N2(0.2) + (N2(0.2) − N2(0.4)) / (4^2 − 1) = 1.000010878
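The whole extrapolation table for this example can be generated programmatically. This Python sketch (added for illustration) builds the even-power Richardson table of (6.36) for f(x) = ln x at x0 = 1:

```python
import math

def N1(h, x0=1.0):
    # Central difference for f(x) = ln x: the O(h^2) first column
    return (math.log(x0 + h) - math.log(x0 - h)) / (2 * h)

def richardson(h, levels=3):
    """Richardson table for even-power error terms, eq. (6.36)."""
    table = [[N1(h / 2**i) for i in range(levels)]]
    for j in range(1, levels):
        prev = table[j - 1]
        table.append([prev[i + 1] + (prev[i + 1] - prev[i]) / (4**j - 1)
                      for i in range(len(prev) - 1)])
    return table

table = richardson(0.4)
print(table[0])      # N1(0.4), N1(0.2), N1(0.1)
print(table[2][0])   # N3(0.4), close to f'(1) = 1
```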
Exercises section 4.2: 1, 3, 5, 11

Maple session for Exercises 5 (a) and 7 (a):
> a1:=1.1; f1:=exp(2*a1);
> a2:=1.2; f2:=exp(2*a2);
> a3:=1.3; f3:=exp(2*a3);
> a4:=1.4; f4:=exp(2*a4);

a1 := 1.1    f1 := 9.025013499
a2 := 1.2    f2 := 11.02317638
a3 := 1.3    f3 := 13.46373804
a4 := 1.4    f4 := 16.44464677

> h:=0.1;
> f1p:=1/2/h*(-3*f1+4*f2-f3);
> f2p:=1/2/h*(f3-f1);
> f3p:=1/2/h*(f4-f2);
> f4p:=-1/2/h*(-3*f4+4*f3-f2);

h := 0.1
f1p := 17.76963490
f2p := 22.19362270
f3p := 27.10735195
f4p := 32.51082265

> evalf(subs(x=1.3,h^2/3*diff(exp(2*x),x$3)));
> evalf(subs(x=1.3,h^2/6*diff(exp(2*x),x$3)));
> evalf(subs(x=1.4,h^2/6*diff(exp(2*x),x$3)));
> evalf(subs(x=1.4,h^2/3*diff(exp(2*x),x$3)));

0.3590330144
0.1795165072
0.2192619569
0.4385239139
6.3 Elements of Numerical Integration
The basic method of approximating ∫_a^b f(x) dx is called numerical quadrature. It uses a sum sum_{i=0}^{n} ai f(xi) to approximate the integral. The methods of quadrature in this section are based on the Lagrange interpolating polynomial.
6.3.1 Trapezoidal rule
To derive the Trapezoidal rule, we use the linear Lagrange polynomial which agrees with the function f(x) at x0 = a and x1 = b. The Trapezoidal rule is:

∫_a^b f(x) dx = (h/2) [f(x0) + f(x1)] − (h^3/12) f''(ξ),    (6.39)

where h = x1 − x0 = b − a and ξ ∈ (a, b).

We want to evaluate the integral

I = ∫_1^2 (x^3 + 1) dx    (6.40)

using the trapezoidal rule:

∫_1^2 f(x) dx = (1/2) [f(1) + f(2)] − (h^3/12) f''(ξ)
             = (1/2) [2 + 9] − (1/12) (6ξ)
             = 11/2 − ξ/2

Therefore the trapezoidal rule gives 5.5, with a maximum error |E|max equal to

|E|max = max_{1 ≤ ξ ≤ 2} |ξ/2| = 1

The exact value of the integral is

I = ∫_1^2 (x^3 + 1) dx = [x^4/4 + x] from 1 to 2 = (2^4/4 + 2) − (1/4 + 1) = 19/4 = 4.75

Thus, the absolute error is |5.5 − 4.75| = 0.75. Note that the error is less than the maximum error calculated from the formula.
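The computation above is short enough to verify with a few lines of Python (a sketch added for illustration):

```python
def trapezoid(f, a, b):
    """Single-interval trapezoidal rule, eq. (6.39) without the error term."""
    return (b - a) / 2 * (f(a) + f(b))

approx = trapezoid(lambda x: x**3 + 1, 1, 2)
exact = 19 / 4
print(approx, abs(approx - exact))   # 5.5, error 0.75 (below the bound 1)
```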
6.3.2 Simpson’s rule
Simpson’s rule uses the second Lagrange polynomial with nodes at x0 = a, x1 = a + h, x2 = b, where h = (b − a)/2. There are a few tricks to get the formula (please see sec 4.3 and exercise 24 of sec 4.3):

∫_{x0}^{x2} f(x) dx = (h/3) [f(x0) + 4 f(x1) + f(x2)] − (h^5/90) f^(4)(ξ).    (6.41)

We want to evaluate the integral

I = ∫_{−1}^{1} e^x dx    (6.42)

using Simpson’s rule:

∫_{−1}^{1} f(x) dx = (1/3) [f(−1) + 4 f(0) + f(1)] − (1/90) f^(4)(ξ)    (6.43)

We get

∫_{−1}^{1} e^x dx = (1/3) [e^{−1} + 4 + e] − (1/90) e^ξ = 2.36205 − (1/90) e^ξ    (6.44)

The maximum error is

|E|max = max_{−1 ≤ ξ ≤ 1} (1/90) e^ξ = (1/90) e = 0.03020

The exact value of the integral is

I = ∫_{−1}^{1} e^x dx = e − 1/e = 2.35040    (6.45)

The absolute error is |2.36205 − 2.35040| = 0.01165, which is less than the maximum error 0.03020.
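Again, the numbers are easy to reproduce; a small Python sketch (added for illustration):

```python
import math

def simpson(f, a, b):
    """Simpson's rule over [a, b], eq. (6.41) without the error term."""
    h = (b - a) / 2
    return h / 3 * (f(a) + 4 * f(a + h) + f(b))

approx = simpson(math.exp, -1, 1)
exact = math.e - 1 / math.e
print(approx, abs(approx - exact))   # ~2.36205, error ~0.01165 < 0.03020
```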
6.3.3 Degree of precision
The degree of accuracy, or precision, of a quadrature formula is the largest positive integer n such that the formula is exact for x^k, for each k = 0, 1, ..., n. What degree of precision does the following formula have?

∫_{−1}^{1} f(x) dx = f(−1/sqrt(3)) + f(1/sqrt(3))

The integral of x^k is

∫_{−1}^{1} x^k dx = (1 + (−1)^k) / (1 + k)

and the formula gives

(−1/sqrt(3))^k + (1/sqrt(3))^k = (1 + (−1)^k) (1/sqrt(3))^k

The formula has degree of precision n if it is exact for x^k, k = 0, ..., n, i.e. if

(1 + (−1)^k) / (1 + k) = (1 + (−1)^k) (1/sqrt(3))^k

This is true when k is an odd number (both sides vanish). For k an even number it reduces to

1 / (1 + k) = (1/sqrt(3))^k

which is true for k = 0, 2 but not for k = 4. Therefore, the formula is exact for k = 0, 1, 2, 3. Thus, the degree of precision of the formula is 3. We can conclude from this example that the formula is exact for all polynomials of degree at most 3.
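The degree-of-precision computation can also be checked numerically; the Python sketch below (added for illustration) compares the two-point rule against the exact integrals of x^k:

```python
from math import sqrt

def gauss2(f):
    # Two-point rule f(-1/sqrt(3)) + f(1/sqrt(3)) on [-1, 1]
    x = 1 / sqrt(3)
    return f(-x) + f(x)

def exact_moment(k):
    # Integral of x^k over [-1, 1]
    return (1 + (-1)**k) / (1 + k)

for k in range(6):
    err = abs(gauss2(lambda x, k=k: x**k) - exact_moment(k))
    print(k, err)   # essentially zero for k = 0..3, clearly nonzero for k = 4
```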
6.3.4 Newton-Cotes Formulas
There are two types of Newton-Cotes formulas, open and closed. The (n+1)-point closed Newton-Cotes formula uses nodes xi = x0 + ih, for i = 0, 1, ..., n, where x0 = a, xn = b, and h = (b − a)/n. It is called closed because the endpoints of the closed interval [a, b] are included as nodes. The open Newton-Cotes formulas use nodes xi = x0 + ih, for each i = 0, 1, ..., n, where h = (b − a)/(n + 2) and x0 = a + h. This implies that xn = b − h, so we label the endpoints by setting x_{−1} = a and x_{n+1} = b. In both cases

∫_a^b f(x) dx ≈ sum_{i=0}^{n} ai f(xi),    (6.46)

where

ai = ∫_a^b Li(x) dx    (6.47)

Theorem 4.2, Theorem 4.3 (see the textbook for the error terms).

Exercises section 4.3:
6.4 Composite Numerical Integration
The Newton-Cotes formulas are generally unsuitable for use over large integration intervals. One way to overcome this problem is to split the large interval into subintervals and sum Newton-Cotes formulas over all these subintervals.

Composite Simpson’s rule: Let f ∈ C^4[a, b], n be an even number, h = (b − a)/n, and xj = a + jh, for each j = 0, 1, ..., n. There exists a µ ∈ (a, b) for which the composite Simpson’s rule for n subintervals can be written as

∫_a^b f(x) dx = (h/3) [f(a) + 2 sum_{j=1}^{n/2−1} f(x_{2j}) + 4 sum_{j=1}^{n/2} f(x_{2j−1}) + f(b)] − ((b − a)/180) h^4 f^(4)(µ)    (6.48)

Composite Trapezoidal rule: Let f ∈ C^2[a, b], h = (b − a)/n, and xj = a + jh, for each j = 0, 1, ..., n. There exists a µ ∈ (a, b) for which the composite trapezoidal rule for n subintervals can be written as

∫_a^b f(x) dx = (h/2) [f(a) + 2 sum_{j=1}^{n−1} f(xj) + f(b)] − ((b − a)/12) h^2 f''(µ)    (6.49)

Composite Midpoint rule: Let f ∈ C^2[a, b], n be even, h = (b − a)/(n + 2), and xj = a + (j + 1)h, for each j = −1, 0, ..., n + 1. There exists a µ ∈ (a, b) for which the composite Midpoint rule for n + 2 subintervals can be written as

∫_a^b f(x) dx = 2h sum_{j=0}^{n/2} f(x_{2j}) + ((b − a)/6) h^2 f''(µ)    (6.50)

An important property shared by all these composite integration techniques is stability with respect to round-off errors. Let us demonstrate this property for the composite Simpson’s rule with n subintervals applied to a function f on [a, b]. Assume that

f(xi) = f̃(xi) + ei    (6.51)

where ei is the round-off error and f̃(xi) is an approximation to f(xi). From the composite Simpson’s rule, the accumulated round-off error e(h) is given by

e(h) = (h/3) [e0 + 2 sum_{j=1}^{n/2−1} e_{2j} + 4 sum_{j=1}^{n/2} e_{2j−1} + en]

If the round-off errors are uniformly bounded by ǫ, then

e(h) ≤ (h/3) ǫ [1 + (n − 2) + 2n + 1] = nhǫ = (b − a)ǫ

It is clear from the last equation that the bound is independent of h and n. This means that, even though we may need to divide an interval into more parts to ensure accuracy, the increased computation that is required does not increase the round-off error. The procedure is stable as h approaches zero. Recall that this was not true for the numerical differentiation procedures.
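The composite Simpson's rule (6.48) translates directly into code. A Python sketch (added for illustration; the test integrand is an arbitrary choice):

```python
import math

def composite_simpson(f, a, b, n):
    """Composite Simpson's rule, eq. (6.48) without the error term; n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2*j - 1) * h) for j in range(1, n//2 + 1))
    s += 2 * sum(f(a + 2*j * h) for j in range(1, n//2))
    return h / 3 * s

approx = composite_simpson(math.sin, 0, math.pi, 20)
print(approx)   # close to the exact value 2
```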
Exercises section 4.4:
6.5 Adaptive Quadrature Methods
The composite quadrature rules use equally spaced points. This is not good if the function to be integrated has large variations in some regions and small variations in others. So, it is useful to introduce a method to adjust the step size: the step size has to be smaller over regions where large variations occur. This technique is called adaptive quadrature. The method here is based on Simpson’s rule, but we can also use any Newton-Cotes formula. We know that Simpson’s rule uses two subintervals over [ak, bk]:

S(ak, bk) = (h/3) (f(ak) + 4 f(ck) + f(bk)),    (6.52)

where ck is the center of [ak, bk] and h = (bk − ak)/2. Furthermore, if f ∈ C^4[ak, bk], then there exists ξk ∈ [ak, bk] so that

∫_{ak}^{bk} f(x) dx = S(ak, bk) − (h^5/90) f^(4)(ξk)    (6.53)

A composite Simpson rule using four subintervals of [ak, bk] can be performed by bisecting this interval into two equal subintervals [ak, ck] = [ak1, bk1] and [ck, bk] = [ak2, bk2]. We then write

S(ak1, bk1) + S(ak2, bk2) = (h/(3 × 2)) (f(ak1) + 4 f(ck1) + f(bk1)) + (h/(3 × 2)) (f(ak2) + 4 f(ck2) + f(bk2))    (6.54)–(6.55)

where only two additional evaluations of f(x) are needed, at ck1 and ck2, the midpoints of the intervals [ak1, bk1] and [ak2, bk2], respectively. Furthermore, if f ∈ C^4[ak, bk], then there exists ξk1 ∈ [ak, bk] so that

∫_{ak}^{bk} f(x) dx = S(ak1, bk1) + S(ak2, bk2) − (h^5/(16 × 90)) f^(4)(ξk1)    (6.56)

If we assume that f^(4)(ξk) ≈ f^(4)(ξk1), we get

S(ak, bk) − (h^5/90) f^(4)(ξk) ≈ S(ak1, bk1) + S(ak2, bk2) − (h^5/(16 × 90)) f^(4)(ξk1),

which can be written as

(h^5/90) f^(4)(ξk1) ≈ (16/15) (S(ak, bk) − S(ak1, bk1) − S(ak2, bk2))

Thus, we can find that

|∫_{ak}^{bk} f(x) dx − S(ak1, bk1) − S(ak2, bk2)| ≈ (1/15) |S(ak1, bk1) + S(ak2, bk2) − S(ak, bk)|    (6.57)

Because of the assumption f^(4)(ξk) ≈ f^(4)(ξk1), the fraction 1/15 is replaced with 1/10 when implementing the method in a program. Assume that we want the tolerance to be ǫk > 0 for the interval [ak, bk]. If

(1/10) |S(ak1, bk1) + S(ak2, bk2) − S(ak, bk)| < ǫk    (6.58)

we infer that

|∫_{ak}^{bk} f(x) dx − S(ak1, bk1) − S(ak2, bk2)| < ǫk    (6.59)

Thus the composite Simpson rule

∫_{ak}^{bk} f(x) dx ≈ S(ak1, bk1) + S(ak2, bk2)    (6.60)

is used to approximate the integral, and the error bound for this approximation over the interval [ak, bk] is ǫk. The adaptive quadrature is implemented by applying Simpson’s rule in this way:

1. We start with the interval [a0, b0] and tolerance ǫ0.

2. We then apply Simpson’s rule:

S(ak, bk) = (h/3) (f(ak) + 4 f(ck) + f(bk)),
S(ak1, bk1) + S(ak2, bk2) = (h/(3 × 2)) (f(ak1) + 4 f(ck1) + f(bk1)) + (h/(3 × 2)) (f(ak2) + 4 f(ck2) + f(bk2))

3. The interval is refined into subintervals labeled [a01, b01] and [a02, b02].

4. If the accuracy test

(1/10) |S(a01, b01) + S(a02, b02) − S(a0, b0)| < ǫ0    (6.61)

is passed, the quadrature formula

S(a01, b01) + S(a02, b02) = (h/6) (f(a01) + 4 f(c01) + f(b01)) + (h/6) (f(a02) + 4 f(c02) + f(b02))

is applied to [a0, b0] and we are done.

5. If the accuracy test fails, the two subintervals are labeled [a1, b1] and [a2, b2], over which the tolerances are halved: ǫ1 = ǫ0/2, ǫ2 = ǫ0/2. We repeat steps 2–5 for the two intervals with the new tolerances.

6. We add all the quadrature formulas where the accuracy tests passed.
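Steps 1–6 above can be sketched as a short recursive routine. This Python version (added for illustration) uses the same 1/10 acceptance test of (6.58):

```python
def S(f, a, b):
    # Simpson's rule over [a, b]
    return (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))

def adaptive(f, a, b, eps):
    """Recursive adaptive Simpson quadrature with the 1/10 test, eq. (6.58)."""
    c = (a + b) / 2
    whole, left, right = S(f, a, b), S(f, a, c), S(f, c, b)
    if abs(left + right - whole) / 10 < eps:
        return left + right
    # Test failed: halve the tolerance and recurse on both halves
    return adaptive(f, a, c, eps / 2) + adaptive(f, c, b, eps / 2)

approx = adaptive(lambda x: 1.5 * x**0.5, 0, 1, 0.001)
print(approx)   # the exact value of the integral is 1
```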
Example: We apply the adaptive quadrature algorithm to approximate

∫_0^1 (3/2) sqrt(x) dx = 1

We first define the function

S(µ, v) = ((v − µ)/6) [f(µ) + 4 f((µ + v)/2) + f(v)]

Accuracy test for [0, 1], starting with ǫ0 = 0.001:

S(0, 1)   = 0.9571067813
S(0, 0.5) = 0.3383883477
S(0.5, 1) = 0.6464010497
|S(0, 0.5) + S(0.5, 1) − S(0, 1)| − 10ǫ0 = 0.0176826161 > 0

We have to refine the interval [0, 1] into [0, 0.5] and [0.5, 1].

Accuracy test for [0, 0.5], with ǫ1 = ǫ0/2:

S(0, 0.5)    = 0.3383883477
S(0, 0.25)   = 0.1196383477
S(0.25, 0.5) = 0.2285372827
|S(0, 0.25) + S(0.25, 0.5) − S(0, 0.5)| − 10ǫ1 = 0.004787282700 > 0

We have to refine the interval [0, 0.5] into [0, 0.25] and [0.25, 0.5].

Accuracy test for [0, 0.25], with ǫ2 = ǫ1/2:

S(0, 0.25)     = 0.1196383477
S(0, 0.125)    = 0.04229854347
S(0.125, 0.25) = 0.08080013118
|S(0, 0.125) + S(0.125, 0.25) − S(0, 0.25)| − 10ǫ2 = 0.000960326900 > 0

We have to refine the interval [0, 0.25] into [0, 0.125] and [0.125, 0.25].

Accuracy test for [0, 0.125], with ǫ3 = ǫ2/2:

S(0, 0.125)      = 0.04229854347
S(0, 0.0625)     = 0.01495479346
S(0.0625, 0.125) = 0.02856716035
|S(0, 0.0625) + S(0.0625, 0.125) − S(0, 0.125)| − 10ǫ3 = −0.000026589660 < 0

The test has passed. So, we can keep the interval [0, 0.125] with S(0, 0.0625) + S(0.0625, 0.125) = 0.04352195381. We now go back, keeping ǫ3.

Accuracy test for [0.125, 0.25]:

S(0.125, 0.25)   = 0.08080013118
S(0.125, 0.1875) = 0.03699538942
S(0.1875, 0.25)  = 0.04381002180
|S(0.125, 0.1875) + S(0.1875, 0.25) − S(0.125, 0.25)| − 10ǫ3 = −0.001244719960 < 0

The test has passed. So, we can keep the interval [0.125, 0.25] with S(0.125, 0.1875) + S(0.1875, 0.25) = 0.08080541122. We go back with ǫ2.

Accuracy test for [0.25, 0.5]:

S(0.25, 0.5)   = 0.2285372827
S(0.25, 0.375) = 0.1046387629
S(0.375, 0.5)  = 0.1239134540
|S(0.25, 0.375) + S(0.375, 0.5) − S(0.25, 0.5)| − 10ǫ2 = −0.002485065800 < 0

The test has passed. So, we can keep the interval [0.25, 0.5] with S(0.25, 0.375) + S(0.375, 0.5) = 0.2285522169. We go back with ǫ1.

Accuracy test for [0.5, 1]:

S(0.5, 1)    = 0.6464010497
S(0.5, 0.75) = 0.2959631153
S(0.75, 1)   = 0.3504801743
|S(0.5, 0.75) + S(0.75, 1) − S(0.5, 1)| − 10ǫ1 = −0.004957760100 < 0

The test has passed. So, we can keep the interval [0.5, 1] with S(0.5, 0.75) + S(0.75, 1) = 0.6464432896.

Now we can add:

∫_0^1 (3/2) sqrt(x) dx ≈ S(0, 0.0625) + S(0.0625, 0.125) + S(0.125, 0.1875) + S(0.1875, 0.25) + S(0.25, 0.375) + S(0.375, 0.5) + S(0.5, 0.75) + S(0.75, 1) = 0.9993228715

The actual error is 0.0006771285. We can summarize our results:

ak      bk      AQ
0.000   0.125   0.04352195381
0.125   0.250   0.08080541122
0.250   0.500   0.2285522169
0.500   1.000   0.6464432896
Total           0.9993228715
[Figure: the adaptive subdivision of [0, 1], with break points at 0.125, 0.25, and 0.5]
Exercises section 4.6:
6.6 Gaussian Quadrature
Gaussian integration is based on the concept that the accuracy of a quadrature formula can be improved by choosing the nodes wisely, rather than on the basis of equally spaced nodes. Gaussian integration assumes an approximation of the form

∫_{−1}^{1} f(x) dx ≈ sum_{i=1}^{n} ai f(xi)    (6.62)

This equation contains 2n unknowns to be determined. Gaussian quadrature uses the fact that the quadrature formula should have the highest degree of precision possible.

Let us find the Gaussian quadrature formula for n = 2. In this case the functions 1, x, x^2, and x^3 should give exact results:

f(x) = 1    ⟹  a1 + a2           = ∫_{−1}^{1} dx     = 2
f(x) = x    ⟹  a1 x1 + a2 x2     = ∫_{−1}^{1} x dx   = 0
f(x) = x^2  ⟹  a1 x1^2 + a2 x2^2 = ∫_{−1}^{1} x^2 dx = 2/3
f(x) = x^3  ⟹  a1 x1^3 + a2 x2^3 = ∫_{−1}^{1} x^3 dx = 0

Solving these equations we get

a1 = a2 = 1,    x1 = −1/sqrt(3),    x2 = 1/sqrt(3)    (6.63)–(6.64)

Thus, we have the Gaussian integration formula for two nodes:

∫_{−1}^{1} f(x) dx ≈ f(−1/sqrt(3)) + f(1/sqrt(3))
This method can be generalized to more than two nodes, but there is an alternative way to obtain the nodes more easily, using what we call the Legendre polynomials. For each n, Pn is a monic polynomial of degree n, i.e. a polynomial x^n + a_{n−1} x^{n−1} + ... + a1 x + a0 in which the coefficient of the highest-order term is 1. Whenever P(x) is a polynomial of degree less than n we have

∫_{−1}^{1} P(x) Pn(x) dx = 0    (6.65)

These are the first few Legendre polynomials:

P0(x) = 1,   P1(x) = x,   P2(x) = x^2 − 1/3,   P3(x) = x^3 − (3/5) x,   P4(x) = x^4 − (6/7) x^2 + 3/35    (6.66)

The nodes xi, i = 1, ..., n, are determined by the roots of Pn(x), and the coefficients ai are defined by

ai = ∫_{−1}^{1} prod_{j=1, j≠i}^{n} (x − xj) / (xi − xj) dx    (6.67)

If P(x) is any polynomial of degree less than 2n, then the quadrature is exact:

∫_{−1}^{1} P(x) dx = sum_{i=1}^{n} ai P(xi)    (6.68)
Note that the Gaussian formula imposes a restriction on the limits of integration: they must be from −1 to 1. It is possible to overcome this restriction by the technique of changing the variable:

∫_a^b f(x) dx = ∫_{−1}^{1} g(z) dz    (6.69)

We define z = Ax + B, so that x = (z − B)/A. Requiring that the endpoints map correctly,

−1 = Aa + B
 1 = Ab + B

gives the solution

A = 2/(b − a),    B = (a + b)/(a − b)    (6.70)–(6.71)

Therefore

∫_a^b f(x) dx = (1/A) ∫_{−1}^{1} g(z) dz    (6.72)

where

g(z) = f((z − B)/A)    (6.73)
Example: Convert the integral

I = ∫_{−2}^{2} e^{−x/2} dx

to the Gaussian limits of integration.

Solution:

I = ∫_{−2}^{2} e^{−x/2} dx = (1/A) ∫_{−1}^{1} f((z − B)/A) dz

We have

A = 2/(b − a) = 1/2,    B = (a + b)/(a − b) = 0

Thus

I = ∫_{−2}^{2} e^{−x/2} dx = 2 ∫_{−1}^{1} f(2z) dz = 2 ∫_{−1}^{1} e^{−z} dz
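Combining the two-point Gauss rule with this change of variable gives a usable quadrature on any [a, b]. A Python sketch (added for illustration), applied to the integral just converted:

```python
import math

def gauss2_ab(f, a, b):
    """Two-point Gauss-Legendre rule mapped from [-1, 1] to [a, b], eq. (6.72)."""
    A = 2 / (b - a)
    B = (a + b) / (a - b)
    g = lambda z: f((z - B) / A)
    x = 1 / math.sqrt(3)
    return (g(-x) + g(x)) / A

approx = gauss2_ab(lambda x: math.exp(-x / 2), -2, 2)
exact = 2 * (math.e - 1 / math.e)   # antiderivative: -2 e^(-x/2)
print(approx, exact)
```

With only two function evaluations the rule already lands within a few percent of the exact value.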
6.7 Improper Integrals
Improper integrals arise in two cases:

– the notion of integration is extended to an interval of integration on which the function is unbounded;
– an endpoint of the interval of integration is infinite.

Let us consider the first case, when the integrand f(x) is unbounded; the second case can be reduced to the first. It is well known that the improper integral

∫_a^b dx / (x − a)^p    (6.74)

converges iff the power satisfies 0 < p < 1, in which case

∫_a^b dx / (x − a)^p = (b − a)^{1−p} / (1 − p)    (6.75)

If the function f(x) can be written as

f(x) = g(x) / (x − a)^p    (6.76)

where g is continuous on [a, b], the improper integral

∫_a^b f(x) dx    (6.77)

exists for 0 < p < 1.

Let us now approximate the integral given by eq. (6.77) using the composite Simpson’s rule. Assume that g ∈ C^5[a, b]. In this case, the fourth Taylor polynomial P4(x) for g about a can be written as

P4(x) = g(a) + (x − a) g'(a) + ((x − a)^2 / 2) g''(a) + ((x − a)^3 / 3!) g'''(a) + ((x − a)^4 / 4!) g^(4)(a)    (6.78)

Therefore

∫_a^b f(x) dx = ∫_a^b (g(x) − P4(x)) / (x − a)^p dx + ∫_a^b P4(x) / (x − a)^p dx    (6.79)

The last term can be integrated exactly:

∫_a^b P4(x) / (x − a)^p dx = sum_{k=0}^{4} ∫_a^b (g^(k)(a) / k!) (x − a)^{k−p} dx = sum_{k=0}^{4} (g^(k)(a) / ((k − p + 1) k!)) (b − a)^{k−p+1}    (6.80)

Now, we can define a function G(x) such that

G(x) = (g(x) − P4(x)) / (x − a)^p,  if a < x ≤ b;    G(a) = 0    (6.81)

Since 0 < p < 1 and P4^(k)(a) agrees with g^(k)(a) for each k = 0, ..., 4, we have G ∈ C^4[a, b]. This implies that the composite Simpson’s rule can be applied to approximate the integral of G on [a, b].

Example: We want to approximate

∫_0^1 sin(x) / x^{1/4} dx    (6.82)

using the composite Simpson’s rule with n = 4.
1. We find the fourth-order Taylor polynomial for sin(x) at x = 0:

sin(x) ≈ P4(x) = x − (1/6) x^3    (6.83)

2. We evaluate the integral

∫_0^1 P4(x) / x^{1/4} dx = ∫_0^1 (x^{3/4} − (1/6) x^{11/4}) dx = 166/315 = 0.5269841270

3. We define the function

G(x) = (sin(x) − x + (1/6) x^3) / x^{1/4},  if 0 < x ≤ 1;    G(0) = 0

4. We apply the composite Simpson’s rule with n = 4:

∫_0^1 G(x) dx ≈ (1/12) [G(0) + 4 G(0.25) + 2 G(0.5) + 4 G(0.75) + G(1)] = 0.001432198742

5. Now we get

∫_0^1 sin(x) / x^{1/4} dx ≈ 0.001432198742 + 0.5269841270 = 0.5284163257
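The five steps above can be sketched end to end in Python (added for illustration):

```python
import math

def composite_simpson(f, a, b, n):
    # Composite Simpson's rule, eq. (6.48) without the error term
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2*j - 1) * h) for j in range(1, n//2 + 1))
    s += 2 * sum(f(a + 2*j * h) for j in range(1, n//2))
    return h / 3 * s

def G(x):
    # (sin x - P4(x)) / x^(1/4), extended by 0 at x = 0, as in eq. (6.81)
    if x == 0:
        return 0.0
    return (math.sin(x) - x + x**3 / 6) / x**0.25

taylor_part = 166 / 315   # exact integral of P4(x) / x^(1/4) over [0, 1]
approx = taylor_part + composite_simpson(G, 0, 1, 4)
print(approx)   # ~0.5284163257
```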
To approximate an improper integral with a singularity at the right endpoint, we could apply the technique used above after the transformation

∫_a^b f(x) dx = ∫_{−b}^{−a} f(−z) dz    (6.84)

which has its singularity at the left endpoint. An improper integral with a singularity at an interior point a < c < b can be treated as the sum

∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx    (6.85)

The other type of improper integral, involving infinite limits of integration, can be treated with the substitution t = 1/x:

∫_a^∞ f(x) dx = ∫_0^{1/a} t^{−2} f(1/t) dt    (6.86)

for a > 0. For the case a ≤ 0 we can split the integral into two parts: one from a to some c > 0 and the other from c to ∞.
Chapter 7

Initial-Value Problems for ODE

7.1 Introduction

An ordinary differential equation (ODE) is an equation containing one or more derivatives of y. Differential equations are classified according to their order. The order of a differential equation is the highest derivative that appears in the equation. When the equation contains only a first derivative, it is called a first-order differential equation. A first-order differential equation can be expressed as

dy/dt = f(t, y)    (7.1)

The degree of a differential equation is the power of the highest-order derivative. For example, t y'' + 3t^2 + 2 = 0 is a first-degree, second-order differential equation. A differential equation is a linear equation when it does not contain terms involving the product of the dependent variable y or its derivatives. For example, y'' + 2y' + t^2 is linear but y'' + 2y' y + t^2 is not. If the order of the equation is n, we need n conditions in order to obtain a unique solution. When all the conditions are specified at a particular value of the independent variable t, then the problem is called an initial-value problem. It is also possible to specify the conditions at different values of t; such problems are called boundary-value problems. All numerical techniques for solving differential equations involve a series of estimates of y(t) starting from the given conditions. There are two basic approaches: one-step and multistep methods. In one-step methods, we use information from only one preceding point: to estimate yi we only need y_{i−1}. Multistep methods use information at two or more previous steps to estimate a value.
7.2 Elementary Theory of Initial-Value Problems

Lipschitz condition: A function f(t, y) is said to satisfy a Lipschitz condition in the variable y on a set D ⊂ R^2 if a constant L > 0 exists with

|f(t, y1) − f(t, y2)| ≤ L |y1 − y2|,    (7.2)

whenever (t, y1), (t, y2) ∈ D.

Convex set: A set D ⊂ R^2 is said to be convex if whenever (t1, y1) and (t2, y2) belong to D and λ is in [0, 1], the point ((1 − λ)t1 + λt2, (1 − λ)y1 + λy2) belongs to D. This means that the entire straight-line segment between the two points also belongs to the set D.

[Figure: examples of a convex set and a non-convex set]
The set D = {(t, y) | a ≤ t ≤ b, y ∈ R} is a convex set.

Theorem: Suppose that f(t, y) is defined on a convex set D ⊂ R^2. If a constant L > 0 exists with

|∂f(t, y)/∂y| ≤ L,    for all (t, y) ∈ D,

then f satisfies a Lipschitz condition on D in the variable y.

proof:
Theorem: Suppose that D = {(t, y) | a ≤ t ≤ b, y ∈ R} and that f(t, y) is continuous on D. If f satisfies a Lipschitz condition on D in the variable y, then the initial-value problem

y'(t) = f(t, y),    a ≤ t ≤ b,    y(a) = α,

has a unique solution y(t) for a ≤ t ≤ b.

Example: We want to show that the following initial-value problem has a unique solution:

y' = y cos(t),    0 ≤ t ≤ 1,    y(0) = 1.

We apply the previous theorem. Let us check that f(t, y) = y cos(t) satisfies a Lipschitz condition on 0 ≤ t ≤ 1:

|y1 cos(t) − y2 cos(t)| = |cos(t)| |y1 − y2| ≤ |y1 − y2|

Thus, with L = 1, f(t, y) satisfies the Lipschitz condition. Therefore, there is a unique solution.
Example: We want to show that

y' = −(y^3 + y) / ((3y^2 + 1) t)

has the solution y^3 t + y t = 2 for 1 ≤ t ≤ 2, with y(1) = 1.

– We differentiate y^3 t + y t = 2 implicitly with respect to t:

3y^2 y' t + y^3 + y' t + y = 0
y' (3y^2 t + t) + y^3 + y = 0

which gives

y' = −(y^3 + y) / ((3y^2 + 1) t)

– From y^3 t + y t = 2 at t = 1 we get y^3 + y = 2. There is only one real root of y^3 + y = 2, which is y = 1, so y(1) = 1.

– To get y(2) we can use Newton’s method. At t = 2, y^3 t + y t = 2 gives f(y) = y^3 + y − 1 = 0, and the iteration is

p_{n+1} = pn − (pn^3 + pn − 1) / (3 pn^2 + 1)

It is clear that f(0) = −1 and f(1) = 1, so we can start the Newton iteration from p0 = 0.5. We get

i    pi
0    0.5
1    0.7142857143
2    0.6831797236
3    0.6823284233
4    0.6823278037
5    0.6823278038
6    0.6823278038

So, the approximate value of y(2) is 0.6823278038.

Exercises section 5.1: 1, 3, 5, 7
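The Newton iteration above is a few lines of Python (a sketch added for illustration):

```python
def newton(f, fprime, p0, tol=1e-10, max_iter=50):
    """Newton's method, as used above to solve y^3 + y - 1 = 0."""
    p = p0
    for _ in range(max_iter):
        p_next = p - f(p) / fprime(p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p

# y^3 t + y t = 2 at t = 2 reduces to y^3 + y - 1 = 0
root = newton(lambda y: y**3 + y - 1, lambda y: 3*y**2 + 1, 0.5)
print(root)   # ~0.6823278038
```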
7.3 Euler's Method

The objective of Euler's method is to obtain an approximate solution to the well-posed initial-value problem

dy/dt = f(t, y), a ≤ t ≤ b, y(a) = α.   (7.3)

We can obtain approximate solutions at fixed points, called mesh points. Let us assume that the mesh points are equally distributed throughout the interval [a, b], so we choose

t_i = a + ih, for each i = 0, 1, ..., N.   (7.4)

The common distance between the points, h = (b − a)/N, is called the step size. To derive Euler's method we use Taylor's Theorem:

y(t_{i+1}) = y(t_i) + (t_{i+1} − t_i) y'(t_i) + ((t_{i+1} − t_i)^2 / 2) y''(ξ_i)
           = y(t_i) + h y'(t_i) + (h^2/2) y''(ξ_i)
           = y(t_i) + h f(t_i, y(t_i)) + (h^2/2) y''(ξ_i).   (7.5)

Euler's method constructs w_i ≈ y(t_i), for each i = 1, 2, ..., N, by dropping the remainder term. Thus, it is given by

w_0 = α, w_{i+1} = w_i + h f(t_i, w_i), for each i = 0, 1, ..., N − 1.   (7.6)

This last equation is called the difference equation associated with Euler's method.
Example: Let us approximate the solution of

y' = y − t^2 + 1, 0 ≤ t ≤ 2, y(0) = 0.5,

using Euler's method with N = 10 (please see Example 1, page 259). We have h = (b − a)/N = 0.2 and t_0 = 0, and Euler's method reads

w_0 = 0.5, w_{i+1} = w_i + 0.2 (w_i − t_i^2 + 1), for each i = 0, 1, ..., N − 1.

t_i    w_i
0.0    0.50000
0.2    0.80000
0.4    1.15200
0.6    1.55040
...    ...
2.0    4.86580
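The difference equation (7.6) translates directly into a short program. A minimal Python sketch (the function name euler is illustrative, not from the text):

```python
# Euler's method for y' = f(t, y), y(a) = alpha, with N equal steps on [a, b].
def euler(f, a, b, alpha, N):
    h = (b - a) / N
    t, w = a, alpha
    ws = [w]
    for i in range(N):
        w = w + h * f(t, w)        # w_{i+1} = w_i + h f(t_i, w_i)
        t = a + (i + 1) * h
        ws.append(w)
    return ws

# The example: y' = y - t^2 + 1, 0 <= t <= 2, y(0) = 0.5, N = 10 (h = 0.2).
ws = euler(lambda t, y: y - t**2 + 1, 0.0, 2.0, 0.5, 10)
print(ws[3])   # w_3 = 1.55040, as in the table
print(ws[-1])  # w_10 ≈ 4.86578
```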
Theorem: Suppose that f is continuous and satisfies a Lipschitz condition with constant L on

D = {(t, y) | a ≤ t ≤ b, −∞ < y < ∞}   (7.7)

and that a constant M exists with

|y''(t)| ≤ M, for all t ∈ [a, b].   (7.8)

Let y(t) denote the unique solution to the initial-value problem

y'(t) = f(t, y), a ≤ t ≤ b, y(a) = α,   (7.9)

and let w_i, i = 0, ..., N, be the approximations generated by Euler's method for some positive integer N. Then, for each i,

|y(t_i) − w_i| ≤ (hM / 2L) (e^{L(t_i − a)} − 1).   (7.10)

Theorem: Assume that the hypotheses of the previous theorem hold, and let u_i, i = 0, ..., N, be the approximations obtained from

u_0 = α + δ_0, u_{i+1} = u_i + h f(t_i, u_i) + δ_{i+1}.   (7.11)

If |δ_i| < δ for each i, then

|y(t_i) − u_i| ≤ (1/L) (hM/2 + δ/h) (e^{L(t_i − a)} − 1) + |δ_0| e^{L(t_i − a)}.   (7.12)
This error bound is no longer linear in h; in fact, it goes to infinity as h goes to zero:

lim_{h → 0} (hM/2 + δ/h) = ∞.   (7.13)

It can be shown that the minimum value of the error bound occurs when

h = sqrt(2δ / M).   (7.14)
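Bound (7.10) can be checked numerically on Euler's method for y' = y − t^2 + 1, y(0) = 0.5, on [0, 2] (the example above): here f_y = 1 so L = 1, the exact solution is y(t) = (t + 1)^2 − 0.5 e^t, and |y''(t)| = |2 − 0.5 e^t| ≤ 0.5 e^2 − 2 = M on [0, 2]. A sketch (the bound is guaranteed by the theorem to dominate the actual error at every mesh point):

```python
import math

L = 1.0                                  # Lipschitz constant (f_y = 1)
M = 0.5 * math.e**2 - 2.0                # max of |y''| on [0, 2]
a, b, alpha, N = 0.0, 2.0, 0.5, 10
h = (b - a) / N

y_exact = lambda t: (t + 1.0)**2 - 0.5 * math.exp(t)

t, w = a, alpha
for i in range(N):
    w += h * (w - t**2 + 1.0)            # Euler step
    t = a + (i + 1) * h
    err = abs(y_exact(t) - w)
    bound = (h * M / (2 * L)) * (math.exp(L * (t - a)) - 1.0)
    assert err <= bound                  # bound (7.10) holds at each t_i
```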
Example (ex. 9): Given the initial-value problem

y' = (2/t) y + t^2 e^t, 1 ≤ t ≤ 2, y(1) = 0,

with the exact solution y(t) = t^2 (e^t − e), Euler's method with h = 0.1 gives:

t      w(t)        y(t)        y(t) − w(t)
1.0    0.000000    0.000000    0.000000
1.1    0.271828    0.345920    0.074092
1.2    0.684756    0.866643    0.181887
1.3    1.276978    1.607215    0.330237
1.4    2.093548    2.620360    0.526812
1.5    3.187445    3.967666    0.780221
1.6    4.620818    5.720962    1.100144
1.7    6.466396    7.963873    1.497477
1.8    8.809120    10.793625   1.984505
1.9    11.747997   14.323082   2.575085
2.0    15.398236   18.683097   3.284861

A linear interpolation to approximate y(1.04) can be found as follows. Using x_0 = 1 and x_1 = 1.1 and the values w(x_0) = 0 and w(x_1) = 0.271828, we get the Lagrange polynomial

P(x) = 2.718281828 x − 2.718281828.

Thus P(1.04) = 0.108731273. The exact value is y(1.04) = 0.119987497, so the error is 0.011256224.
7.4 Higher-Order Taylor Methods

Euler's method is Taylor's method of order one. For order n we write the Taylor polynomial

y(t_{i+1}) = Σ_{k=0}^{n} (h^k / k!) y^{(k)}(t_i) + (h^{n+1} / (n+1)!) y^{(n+1)}(ξ_i)   (7.15)

for some ξ_i ∈ (t_i, t_{i+1}). The difference-equation method corresponding to the previous equation can be found by deleting the remainder term. We get the Taylor method of order n:

ω_0 = α, ω_{i+1} = ω_i + h T^{(n)}(t_i, ω_i), i = 0, 1, ..., N − 1,   (7.16)

where

T^{(n)}(t_i, ω_i) = f(t_i, ω_i) + Σ_{j=1}^{n−1} (h^j / (j+1)!) f^{(j)}(t_i, ω_i).
Example: We want to approximate the solution of the initial-value problem

y' = t^2 + y^2, 0 ≤ t ≤ 0.4, y(0) = 0,

with h = 0.2 using Taylor's method of order 4. We calculate the derivatives:

y'     = t^2 + y^2
y''    = 2t + 2yy'
y'''   = 2 + 2(y')^2 + 2yy''
y^{(4)} = 6y'y'' + 2yy'''

We find y(0) = 0.0, y(0.2) ≈ 0.00266666667, and y(0.4) ≈ 0.02135325469. If we use the step size h = 0.4 instead, we get y(0.4) ≈ 0.02133333333. The correct answer is y(0.4) = 0.021359...; this shows that the accuracy is improved by using subintervals, i.e., by decreasing the step size.
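The order-4 Taylor step for this problem can be written out directly, using the derivatives computed above (the function name taylor4_step is illustrative):

```python
def taylor4_step(t, y, h):
    # Derivatives of y for y' = t^2 + y^2, as computed above.
    d1 = t**2 + y**2
    d2 = 2*t + 2*y*d1
    d3 = 2 + 2*d1**2 + 2*y*d2
    d4 = 6*d1*d2 + 2*y*d3
    # y(t + h) ≈ y + h (d1 + (h/2) d2 + (h^2/6) d3 + (h^3/24) d4)
    return y + h*(d1 + h/2*d2 + h**2/6*d3 + h**3/24*d4)

y = 0.0
for i in range(2):                 # two steps of h = 0.2 reach t = 0.4
    y = taylor4_step(0.2*i, y, 0.2)
print(y)                           # ≈ 0.0213533, as in the text
```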
7.5 7.5
Rung Ru ngee-Ku Kutt tta a Meth Methods ods The Taylor’s method provides the formal definition of a step-by-step numerical method for solving initial-value problems. The difficulty of applying Taylor’s method is connected with evaluating higher derivatives which is extremely complicated. We can explore a class of methods that agree with the first n + 1 terms of the Tayl Taylor or series using function value value only (no need to construct f (r) ). These are Runge-Kutta Methods . Runge-K Runge-Kutta utta methods refer to a family family of one-step one-step methods. methods. They They all based on the general form of the extrapolation equation, yi+1 = yi + slope = yi + m h
×
interval size
(7.17) (7.18)
where m is the slope slope that is weighted weighted averag averages es of the slopes at various arious points in the interv interval. al. If we estimate m using slopes at r points in the interval (t (ti , ti+1 ), then m can be written as m = w1 m1 + w2 m2 + . . . + wr mr , where wi are weights of the slopes at various points. Runge-Kutta methods are known by their order. For instance, a Runge-Kutta method is called the r-order Runge-Kutta method when slopes at r points are used to construct the weighted average slope m. In Euler’ Euler’ss method method we use only only one one point point slope slope at (t (ti , yi) to estimate yi+1, and therefore, Euler’s method is a first-order Runge-Kutta method. Second Taylor Polynomial in two variables for the function f ( f (t, y) near the point (t (t0 , y0 ) can be written as
[(t t0 )f t + (y (y y0 )f y ] P 2 (t, y ) = f ( f (t0 , y0 ) + [(t (t t0 )2 (y y0 )2 + f tt f yy (t tt + yy + (t 2 2
−
−
−
−
)(y − y )f − t )(y 0
0
ty ty
(7.19)
We consider here the second-order method which has the form yi+1 = yi + (w (w1 m1 + w2 m2 )h
(7.20)
m1 = f ( f (ti , yi ) m2 = f ( f (ti + a1 h, yi + b1 m1 h)
(7.21) (7.22)
where
83
The weights w_1 and w_2 and the constants a_1 and b_1 are to be determined. The principle of the Runge-Kutta method is that these parameters are chosen such that the power series expansion of the right side of eq. (7.20) agrees with the Taylor series expansion of y_{i+1} in terms of y_i and f(t_i, y_i). The second-order Taylor series expansion of y_{i+1} about y_i is

y_{i+1} = y_i + y' h + (y''/2) h^2.   (7.23)

We know that

y'_i = f(t_i, y_i),
y''_i = dy'/dt = ∂f/∂t + (∂f/∂y) f(t_i, y_i).

We get

y_{i+1} = y_i + f h + (f_t + f_y f) h^2 / 2.

Now consider the right side of eq. (7.20). To expand it we need the Taylor series in two variables. We can write

y_{i+1} = y_i + (w_1 m_1 + w_2 m_2) h
        = y_i + (w_1 f + w_2 f(t_i + a_1 h, y_i + b_1 m_1 h)) h
        = y_i + [w_1 f + w_2 (f + a_1 h f_t + b_1 m_1 h f_y + O(h^2))] h
        = y_i + w_1 h f + w_2 h f + w_2 a_1 h^2 f_t + w_2 b_1 m_1 h^2 f_y + O(h^3)
        = y_i + (w_1 + w_2) h f + w_2 (a_1 f_t + b_1 m_1 f_y) h^2 + O(h^3).

Then we can compare

y_{i+1} = y_i + f h + (f_t + f_y f) h^2 / 2
y_{i+1} = y_i + (w_1 + w_2) h f + w_2 (a_1 f_t + b_1 m_1 f_y) h^2

and, since m_1 = f, we find

w_1 + w_2 = 1, w_2 a_1 = 1/2, w_2 b_1 = 1/2.

Note that we have only three equations but four unknowns, so this set of equations has no unique solution. The index runs over i = 0, 1, ..., N − 1.
If we choose w_1 = 0, w_2 = 1, and a_1 = b_1 = 1/2, we get what we call the Midpoint method:

m_1 = f(t_i, y_i)
m_2 = f(t_i + h/2, y_i + (m_1/2) h)
y_{i+1} = y_i + m_2 h

If we choose w_1 = w_2 = 1/2 and a_1 = b_1 = 1, we get what we call the Modified Euler method:

m_1 = f(t_i, y_i)
m_2 = f(t_i + h, y_i + m_1 h)
y_{i+1} = y_i + (1/2)(m_1 + m_2) h

If we choose w_1 = 1/4, w_2 = 3/4, and a_1 = b_1 = 2/3, we get what we call Heun's method:

m_1 = f(t_i, y_i)
m_2 = f(t_i + (2/3) h, y_i + (2/3) m_1 h)
y_{i+1} = y_i + (h/4)(m_1 + 3 m_2)

The derivation of the Runge-Kutta method of order four is too long, so we just give it here without details:

m_1 = f(t_i, y_i)
m_2 = f(t_i + h/2, y_i + (1/2) m_1 h)
m_3 = f(t_i + h/2, y_i + (1/2) m_2 h)
m_4 = f(t_i + h, y_i + m_3 h)
y_{i+1} = y_i + (1/6)(m_1 + 2m_2 + 2m_3 + m_4) h
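The four-slope scheme translates line for line into code. A minimal sketch (the function name rk4_step is illustrative):

```python
def rk4_step(f, t, y, h):
    m1 = f(t, y)
    m2 = f(t + h/2, y + m1*h/2)
    m3 = f(t + h/2, y + m2*h/2)
    m4 = f(t + h,   y + m3*h)
    return y + (m1 + 2*m2 + 2*m3 + m4) * h / 6

# For instance, one step for y' = 1 + y^2, y(0) = 0, with h = 0.2:
y1 = rk4_step(lambda t, y: 1 + y**2, 0.0, 0.0, 0.2)
print(y1)   # ≈ 0.2027074
```

The exact solution of this test problem is y(t) = tan(t), so y1 can be compared against tan(0.2) ≈ 0.2027100.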
Example: We want to approximate the solution of

y' = t e^{3t} − 2y, 0 ≤ t ≤ 1, y(0) = 0,

with h = 0.5. Let us use the Modified Euler method:

N = (1 − 0)/h = 2
y_0 = 0
t_i = 0 + ih, i = 0, 1, ..., N − 1
m_1 = f(t_i, y_i)
m_2 = f(t_i + h, y_i + m_1 h)
y_{i+1} = y_i + (h/2)(m_1 + m_2)

We get

t_i    w_i
0.0    0.0
0.5    0.560211134
1.0    5.301489796
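The two steps above can be reproduced with a few lines of Python (a sketch; the variable names are illustrative):

```python
import math

f = lambda t, y: t * math.exp(3*t) - 2*y
h, t, w = 0.5, 0.0, 0.0
for _ in range(2):                  # N = 2 steps on [0, 1]
    m1 = f(t, w)
    m2 = f(t + h, w + m1*h)
    w += (m1 + m2) * h / 2          # modified Euler update
    t += h
print(w)                            # ≈ 5.3014898
```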
Example: We want to show that the midpoint method, the modified Euler method, and Heun's method give the same approximations to the initial-value problem

y' = −y + t + 1, 0 ≤ t ≤ 1, y(0) = 1.

Here f_t = 1 and f_y = −1, so the general second-order step becomes

y_{i+1} = y_i + (w_1 + w_2) f h + w_2 (a_1 f_t + b_1 m_1 f_y) h^2 = y_i + f h + w_2 (a_1 − b_1 m_1) h^2,

with w_1 + w_2 = 1.

– Midpoint method: w_2 = 1, a_1 = b_1 = 1/2:

y_{i+1} = y_i + f h + (h^2/2)(1 − m_1)

– Modified Euler method: w_2 = 1/2, a_1 = b_1 = 1:

y_{i+1} = y_i + f h + (h^2/2)(1 − m_1)

– Heun's method: w_2 = 3/4, a_1 = b_1 = 2/3:

y_{i+1} = y_i + f h + (h^2/2)(1 − m_1)

Therefore, all three methods give the same approximations.
Example: We want to find y(0.2) by using the Runge-Kutta fourth-order method for

y' = 1 + y^2, y(0) = 0.

We take t_0 = 0, y_0 = 0, and h = 0.2, and get

m_1 = f(t_0, y_0) = 1
m_2 = f(t_0 + h/2, y_0 + h m_1/2) = 1.01000000
m_3 = f(t_0 + h/2, y_0 + h m_2/2) = 1.01020100
m_4 = f(t_0 + h, y_0 + m_3 h) = 1.040820242
y(0.2) ≈ y_0 + h (m_1 + 2m_2 + 2m_3 + m_4)/6 = 0.2027074080
7.6 Predictor-Corrector Methods

The previous methods (Euler, Heun, Taylor, and Runge-Kutta) are called one-step methods because only the value at the beginning of each interval is required: they use information from one previous point to compute the successive point; that is, y_i is needed to compute y_{i+1}. Multistep methods make use of information about the solution at more than one point. A desirable feature of a multistep method is that the local truncation error can be determined and a correction term can be included, which improves the accuracy of the answer at each step.

Definition: An m-step multistep method for solving the initial-value problem

y' = f(t, y), a ≤ t ≤ b, y(a) = α   (7.25)

has a difference equation for finding the approximation w_{i+1} at the mesh point t_{i+1}, represented by the following equation, where the integer m > 1:

w_{i+1} = a_{m−1} w_i + a_{m−2} w_{i−1} + ... + a_0 w_{i+1−m}
        + h [b_m f(t_{i+1}, w_{i+1}) + b_{m−1} f(t_i, w_i) + ... + b_0 f(t_{i+1−m}, w_{i+1−m})]   (7.26)

for i = m − 1, ..., N − 1, where h = (b − a)/N, the a_0, a_1, ..., a_{m−1} and b_0, ..., b_m are constants, and the starting values are w_0 = α, w_1 = α_1, ..., w_{m−1} = α_{m−1}. When b_m = 0 the method is called explicit, or open, since in eq. (7.26) w_{i+1} is explicitly given in terms of previously determined values. When b_m ≠ 0 the method is called implicit, or closed, since w_{i+1} occurs on both sides.

Euler's method gives

y_{i+1} = y_i + h f(t_0 + ih, y_i), i = 0, 1, ...   (7.27)

The modified Euler's method can be written as

y_{i+1} = y_i + (h/2) [f(t_i, y_i) + f(t_{i+1}, y_{i+1})].   (7.28)

The value of y_{i+1} is first estimated by Euler's method, eq. (7.27), and then used in the right-hand side of the modified Euler's method, eq. (7.28), giving a better approximation of y_{i+1}. The value of y_{i+1} is again substituted in eq. (7.28) to find a still better approximation. This procedure is repeated until two consecutive iterated values of y_{i+1} agree. Equation (7.27) is therefore called the predictor, while eq. (7.28) is called the corrector.

We will describe only the multistep method called the Adams-Bashforth-Moulton method. It can be derived from the fundamental theorem of calculus:

y_{i+1} = y_i + ∫_{t_i}^{t_{i+1}} f(t, y) dt.   (7.29)
The predictor uses the Lagrange polynomial approximation of f(t, y) based on its four values f_{i−3}, f_{i−2}, f_{i−1}, f_i at the points t_{i−3}, t_{i−2}, t_{i−1}, and t_i (writing f_k = f(t_k, y_k)):

P_4(t) = f_{i−3} ((t − t_{i−2})(t − t_{i−1})(t − t_i)) / ((t_{i−3} − t_{i−2})(t_{i−3} − t_{i−1})(t_{i−3} − t_i))
       + f_{i−2} ((t − t_{i−3})(t − t_{i−1})(t − t_i)) / ((t_{i−2} − t_{i−3})(t_{i−2} − t_{i−1})(t_{i−2} − t_i))
       + f_{i−1} ((t − t_{i−3})(t − t_{i−2})(t − t_i)) / ((t_{i−1} − t_{i−3})(t_{i−1} − t_{i−2})(t_{i−1} − t_i))
       + f_i ((t − t_{i−3})(t − t_{i−2})(t − t_{i−1})) / ((t_i − t_{i−3})(t_i − t_{i−2})(t_i − t_{i−1}))

It is integrated over the interval [t_i, t_{i+1}]:

∫_{t_i}^{t_{i+1}} P_4(t) dt = (h/24)(55 f_i − 59 f_{i−1} + 37 f_{i−2} − 9 f_{i−3}).   (7.30)

Then we can write

y_{i+1} = y_i + (h/24)(55 f_i − 59 f_{i−1} + 37 f_{i−2} − 9 f_{i−3}).   (7.31)
This last equation is called the Adams-Bashforth predictor. Note that extrapolation is used here. The corrector is developed in a similar way. A second Lagrange polynomial for f(t, y) is constructed based on the four points (t_{i−2}, y_{i−2}), (t_{i−1}, y_{i−1}), (t_i, y_i), and the new point (t_{i+1}, y_{i+1}) just calculated by eq. (7.31):

P_4(t) = f_{i−2} ((t − t_{i−1})(t − t_i)(t − t_{i+1})) / ((t_{i−2} − t_{i−1})(t_{i−2} − t_i)(t_{i−2} − t_{i+1}))
       + f_{i−1} ((t − t_{i−2})(t − t_i)(t − t_{i+1})) / ((t_{i−1} − t_{i−2})(t_{i−1} − t_i)(t_{i−1} − t_{i+1}))
       + f_i ((t − t_{i−2})(t − t_{i−1})(t − t_{i+1})) / ((t_i − t_{i−2})(t_i − t_{i−1})(t_i − t_{i+1}))
       + f_{i+1} ((t − t_{i−2})(t − t_{i−1})(t − t_i)) / ((t_{i+1} − t_{i−2})(t_{i+1} − t_{i−1})(t_{i+1} − t_i))

It is integrated over the interval [t_i, t_{i+1}]:

∫_{t_i}^{t_{i+1}} P_4(t) dt = (h/24)(9 f_{i+1} + 19 f_i − 5 f_{i−1} + f_{i−2}),   (7.32)

which gives the Adams-Moulton corrector:

y_{i+1} = y_i + (h/24)(9 f_{i+1} + 19 f_i − 5 f_{i−1} + f_{i−2}).   (7.33)

We repeat the last equation, the corrector, until we obtain the needed accuracy.
Algorithm

– Compute the first four initial values w_0, w_1, w_2, w_3 with the Runge-Kutta method.
– Compute y^{(0)}_{i+1} using the predictor formula

y^{(0)}_{i+1} = y_i + (h/24) [55 f_i − 59 f_{i−1} + 37 f_{i−2} − 9 f_{i−3}].   (7.34)

– Compute f^{(0)}_{i+1} = f(t_{i+1}, y^{(0)}_{i+1}).
– Compute y^{(k)}_{i+1} from the corrector equation

y^{(k)}_{i+1} = y_i + (h/24) [9 f(t_{i+1}, y^{(k−1)}_{i+1}) + 19 f_i − 5 f_{i−1} + f_{i−2}].   (7.35)

– Iterate on k until

|y^{(k)}_{i+1} − y^{(k−1)}_{i+1}| / |y^{(k)}_{i+1}| < ε,   (7.36)

then move on to the next i.

Example: Consider the initial-value problem

y' = 1 + y^2, 0 ≤ t ≤ 0.8, y(0) = 0.
The first step is to calculate the four initial values w_0, w_1, w_2, and w_3. To do this we can use, for example, the fourth-order Runge-Kutta method with t_0 = 0, w_0 = 0, and h = 0.2:

k_1 = h f(t_0, y_0)
k_2 = h f(t_0 + h/2, y_0 + k_1/2)
k_3 = h f(t_0 + h/2, y_0 + k_2/2)
k_4 = h f(t_0 + h, y_0 + k_3)
w_1 = w_0 + (k_1 + 2k_2 + 2k_3 + k_4)/6 = 0.2027074081

In the same way we get

w_2 = 0.4227889928, w_3 = 0.6841334020,

with t_1 = 0.2, t_2 = 0.4, t_3 = 0.6, t_4 = 0.8. The predictor gives

w_4^{(0)} = w_3 + (h/24) [55 f(t_3, w_3) − 59 f(t_2, w_2) + 37 f(t_1, w_1) − 9 f(t_0, w_0)] = 1.023434882.

Now we can correct the predicted value using the corrector formula:

w_4^{(1)} = w_3 + (h/24) [9 f(t_4, w_4^{(0)}) + 19 f(t_3, w_3) − 5 f(t_2, w_2) + f(t_1, w_1)] = 1.029690402
w_4^{(2)} = w_3 + (h/24) [9 f(t_4, w_4^{(1)}) + 19 f(t_3, w_3) − 5 f(t_2, w_2) + f(t_1, w_1)] = 1.030653654

So the predictor-corrector method gives the approximate solution 1.030653654. The actual solution of the ODE is y(t) = tan(t), so

y(0.8) = tan(0.8) = 1.029638557.

The errors are

|w_4^{(0)} − tan(0.8)| = 0.006203675
|w_4^{(2)} − tan(0.8)| = 0.001015097
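The whole computation (RK4 starters, Adams-Bashforth predictor, Adams-Moulton corrections) fits in a short script. A sketch using the data of this example (variable names are illustrative):

```python
f = lambda t, y: 1 + y*y           # y' = 1 + y^2, exact solution y(t) = tan(t)
h = 0.2

def rk4_step(t, y):
    k1 = h*f(t, y)
    k2 = h*f(t + h/2, y + k1/2)
    k3 = h*f(t + h/2, y + k2/2)
    k4 = h*f(t + h,   y + k3)
    return y + (k1 + 2*k2 + 2*k3 + k4)/6

t = [0.2*i for i in range(5)]
w = [0.0]
for i in range(3):                 # starters w_1, w_2, w_3 by Runge-Kutta
    w.append(rk4_step(t[i], w[i]))

# Adams-Bashforth predictor (7.31)
w4_pred = w[3] + h/24*(55*f(t[3], w[3]) - 59*f(t[2], w[2])
                       + 37*f(t[1], w[1]) - 9*f(t[0], w[0]))

# Two applications of the Adams-Moulton corrector (7.33), as in the example
w4 = w4_pred
for _ in range(2):
    w4 = w[3] + h/24*(9*f(t[4], w4) + 19*f(t[3], w[3])
                      - 5*f(t[2], w[2]) + f(t[1], w[1]))
print(w4_pred, w4)   # ≈ 1.023434882 and ≈ 1.030653654
```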
Chapter 8
Direct Methods for Solving Linear Systems

A linear system has one of the following properties: a unique solution, no solution, no unique solution (infinitely many solutions), or it may be ill conditioned. Let us consider the following linear system:

x − 2y = −2
0.45x − 0.91y = −1

This system is ill conditioned because the two equations represent two nearly parallel lines.

Definition: An n × m matrix can be represented by

A = (a_ij) =
[ a_11  a_12  ...  a_1m ]
[ a_21  a_22  ...  a_2m ]
[  ...   ...        ... ]
[ a_n1  a_n2  ...  a_nm ]

Linear equations can be represented by a matrix. Solving linear equations can be done by three main row operations:

– Row E_i can be multiplied by any nonzero constant λ.
– Row E_j can be multiplied by λ and added to row E_i.
– Rows E_i and E_j can be transposed in order.

Example:

[  1   1   0   3 |  4 ]
[  2   1  -1   1 |  1 ]
[  3  -1  -1   2 | -3 ]
[ -1   2   3  -1 |  4 ]

becomes

[ 1   1   0   3 |  4 ]
[ 0  -1  -1  -5 | -7 ]
[ 0  -4  -1  -7 | -15 ]
[ 0   3   3   2 |  8 ]

and then

[ 1   1   0   3 |  4 ]
[ 0  -1  -1  -5 | -7 ]
[ 0   0   3  13 | 13 ]
[ 0   0   0 -13 | -13 ]

The matrix becomes a triangular matrix. It is then possible to solve the linear equations by a backward-substitution process.
8.1 Gaussian Elimination

In general, an n × n matrix A can be reduced to triangular form by elementary operations.

Example: Consider a 4 × 4 system. After the elementary operations of the forward elimination we get the triangular system

[ 1  -1   1    1  |  3 ]
[ 0   3  -3    0  |  6 ]
[ 0   0   5   -2  | -8 ]
[ 0   0   0  21/5 | 84/5 ]

Now the solution can be obtained with backward substitution:

(21/5) x_4 = 84/5        ⟹  x_4 = 4
5x_3 − 2 × 4 = −8        ⟹  x_3 = 0
3x_2 − 3 × 0 + 0 × 4 = 6 ⟹  x_2 = 2
x_1 − 2 + 0 + 4 = 3      ⟹  x_1 = 1
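Forward elimination followed by backward substitution can be sketched in a few lines (no pivoting, so this sketch assumes every pivot it meets is nonzero; the data below is an illustrative 4 × 4 system whose exact solution is (−1, 2, 0, 1)):

```python
def gauss_solve(A, b):
    """Solve Ax = b by forward elimination and backward substitution."""
    n = len(A)
    A = [row[:] for row in A]            # work on copies
    b = b[:]
    # Forward elimination: zero out the entries below each pivot.
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]        # assumes A[k][k] != 0 (no pivoting)
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    # Backward substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

A = [[1, 1, 0, 3], [2, 1, -1, 1], [3, -1, -1, 2], [-1, 2, 3, -1]]
b = [4, 1, -3, 4]
print(gauss_solve(A, b))   # [-1.0, 2.0, 0.0, 1.0]
```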
If the forward elimination gives the final row

[0 0 ... 0 a_nn | b_n]

with a_nn ≠ 0, which gives a_nn x_n = b_n, the original system has a unique solution, obtained by backward substitution. If the final row is

[0 0 ... 0 0 | b_n]

where b_n ≠ 0, the system has no solution. If the final row is

[0 0 ... 0 0 | 0],

the system has infinitely many solutions.

There are a few strategies for implementing Gaussian elimination:
8.2 Pivoting Strategies

8.3 Matrix Inverse

8.4 Determinant of a Matrix

8.5 Matrix Factorization

Chapter 9
Iterative Methods for Solving Linear Systems

9.1 Norms of Vectors and Matrices

9.2 Eigenvalues and Eigenvectors

9.3 Iterative Techniques for Solving Linear Systems
Chapter 10
Some Useful Remarks

10.1 Largest Possible Root

For a polynomial represented by

f(x) = a_n x^n + a_{n−1} x^{n−1} + ... + a_1 x + a_0,   (10.1)

the largest possible root is estimated by

x_l = −a_{n−1} / a_n.   (10.2)

This value is taken as the initial approximation when no other value is given.

Search bracket: all real roots are within the interval

[ −sqrt( (a_{n−1}/a_n)^2 − 2 (a_{n−2}/a_n) ), sqrt( (a_{n−1}/a_n)^2 − 2 (a_{n−2}/a_n) ) ].   (10.3)

Another relationship also suggests an interval for the roots: all real roots are within the interval

[ −1 − (1/|a_n|) Max{|a_0|, ..., |a_{n−1}|}, 1 + (1/|a_n|) Max{|a_0|, ..., |a_{n−1}|} ].   (10.4)
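Both bounds are easy to check on a concrete polynomial. For f(x) = x^3 − 6x^2 + 11x − 6 = (x − 1)(x − 2)(x − 3), an illustrative choice with roots 1, 2, and 3:

```python
import math

# f(x) = x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3)
a3, a2, a1, a0 = 1.0, -6.0, 11.0, -6.0
roots = [1.0, 2.0, 3.0]

x_l = -a2 / a3                                   # estimate (10.2): 6.0
r = math.sqrt((a2/a3)**2 - 2*(a1/a3))            # bracket radius (10.3): sqrt(14)
cauchy = 1 + max(abs(a0), abs(a1), abs(a2)) / abs(a3)   # bound (10.4): 12.0

print(x_l, r, cauchy)
assert all(-r <= p <= r for p in roots)          # all roots inside (10.3)
assert all(-cauchy <= p <= cauchy for p in roots)  # and inside (10.4)
```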
10.2 Convergence of Bisection Method [1]

In the bisection method, after n iterations the root must lie within an interval of length (b − a)/2^n. This means that the error bound at iteration n is

E_n = (b − a) / 2^n.   (10.5)

Similarly,

E_{n+1} = (b − a) / 2^{n+1}.   (10.6)

That is, the error decreases with each step by a factor of 1/2. Therefore, the bisection method is linearly convergent.

[1] Numerical Methods, E Balagurusamy
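The halving of the bracketing interval is easy to observe numerically. A sketch using f(x) = x^3 + x − 1 on [0, 1] (its root ≈ 0.6823278038 was computed earlier in these notes with Newton's method):

```python
f = lambda x: x**3 + x - 1          # root at p ≈ 0.6823278038
a, b = 0.0, 1.0                     # f(0) < 0 < f(1), so the root is bracketed
for n in range(40):
    p = (a + b) / 2                 # midpoint
    if f(a) * f(p) <= 0:
        b = p                       # root lies in [a, p]
    else:
        a = p                       # root lies in [p, b]
    # after this step the bracket has length (1 - 0) / 2**(n+1)
print(p)                            # ≈ 0.6823278038
```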
10.3 Convergence of False Position Method

The false position formula is based on the linear interpolation model. One of the starting points is fixed while the other moves towards the solution. Assume that the initial points bracketing the solution are a and b, that a moves towards the solution, and that b is fixed. Let x_1 = a and let p be the solution. Then

E_0 = p − p_0, E_1 = p − p_1, ..., E_i = p − p_i.   (10.7)

It can be shown that

E_{i+1} ≈ C E_i   (10.8)

for a constant C depending on f and on the fixed endpoint; that is, the false position method, like bisection, is linearly convergent.

10.4 Convergence of Newton-Raphson Method

Let p_n be an estimate of a root of the function f(x) = 0. If p_n and p_{n+1} are close to each other, then we can use the Taylor series expansion

f(p_{n+1}) = f(p_n) + (p_{n+1} − p_n) f'(p_n) + (f''(ξ)/2)(p_{n+1} − p_n)^2,   (10.9)

where ξ lies between p_n and p_{n+1}. Let us assume that the exact root is p. Then

0 = f(p_n) + (p − p_n) f'(p_n) + (f''(ξ)/2)(p − p_n)^2.   (10.10)

Assume that f'(p_n) ≠ 0. From the Newton-Raphson formula we have

p_{n+1} = p_n − f(p_n)/f'(p_n)  ⟹  f(p_n) = (p_n − p_{n+1}) f'(p_n).   (10.11)

Substituting this for f(p_n) in eq. (10.10) yields

0 = (p − p_{n+1}) f'(p_n) + (f''(ξ)/2)(p − p_n)^2.   (10.12)

The errors in the estimates are

E_{n+1} = p − p_{n+1}, E_n = p − p_n.

Now eq. (10.12) can be written as

0 = E_{n+1} f'(p_n) + (f''(ξ)/2) E_n^2,   (10.13)

which leads to

E_{n+1} = −(f''(ξ) / (2 f'(p_n))) E_n^2.   (10.14)

The last equation shows that the error is roughly proportional to the square of the error in the previous iteration. Thus, the Newton-Raphson method is quadratically convergent.

The Newton-Raphson method has certain limitations. The method may fail in the following situations:

1. If f'(p_n) = 0.
2. If the initial guess is too far away from the required root, the process may converge to some other root.
3. A particular value in the iteration sequence may repeat, resulting in an infinite loop. This occurs when the tangent to the curve f(x) at p_{n+1} cuts the x-axis again at p_n.
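Quadratic convergence can be observed by printing the errors for the earlier example f(p) = p^3 + p − 1, p_0 = 0.5: each error is roughly the square of the previous one, scaled by f''/(2f'). A sketch (the reference root value is taken from the iteration table earlier in these notes):

```python
f  = lambda p: p**3 + p - 1
df = lambda p: 3*p**2 + 1
p, root = 0.5, 0.6823278038        # root value from the earlier table

errs = []
for _ in range(5):
    p = p - f(p) / df(p)           # Newton-Raphson step
    errs.append(abs(p - root))
print(errs)
# The first errors shrink roughly quadratically: each is below
# the square of the one before it.
```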
10.5 Convergence of Secant Method

The secant method uses two initial estimates but does not require that they bracket the root. The secant iteration formula is

p_{n+1} = p_n − f(p_n) (p_n − p_{n−1}) / (f(p_n) − f(p_{n−1})).   (10.15)

Let p be the actual root of f(x) and E_n the error in the estimate p_n. Then

p_i = E_i + p, for i = n − 1, n, n + 1.   (10.16)

Substituting in eq. (10.15) and simplifying, we get

E_{n+1} = (E_{n−1} f(p_n) − E_n f(p_{n−1})) / (f(p_n) − f(p_{n−1})).   (10.17)

According to the mean value theorem, there exists at least one point ξ_n between p_n and p such that

f'(ξ_n) = (f(p_n) − f(p)) / (p_n − p) = f(p_n)/E_n  ⟹  f(p_n) = E_n f'(ξ_n).   (10.18)

Similarly, f(p_{n−1}) = E_{n−1} f'(ξ_{n−1}). Equation (10.17) becomes

E_{n+1} = ((f'(ξ_n) − f'(ξ_{n−1})) / (f(p_n) − f(p_{n−1}))) E_n E_{n−1}.   (10.19)

That is,

E_{n+1} ∝ E_n E_{n−1}.   (10.20)

Let us now find the order of convergence of this iteration process. If the order of convergence is α, then

E_{n+1} ∝ E_n^α, and likewise E_n ∝ E_{n−1}^α, so E_{n−1} ∝ E_n^{1/α}.   (10.21)

So, from equation (10.20), we can write

E_n^α ∝ E_n E_{n−1} ∝ E_n E_n^{1/α} = E_n^{(α+1)/α}.   (10.22)

This means that

(α + 1)/α = α  ⟹  α = (1 ± sqrt(5))/2.   (10.23)

Since α is always positive, the order of convergence of the secant method is α = (1 + sqrt(5))/2 ≈ 1.618, and the convergence is referred to as superlinear convergence.
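Formula (10.15) is as short in code as on paper. A sketch, again on f(x) = x^3 + x − 1 (the two starting estimates need not bracket the root):

```python
f = lambda x: x**3 + x - 1
p0, p1 = 0.0, 1.0                  # two initial estimates
for _ in range(12):
    if f(p1) == f(p0):
        break                      # denominator vanishes: already converged
    p0, p1 = p1, p1 - f(p1) * (p1 - p0) / (f(p1) - f(p0))   # eq. (10.15)
print(p1)                          # ≈ 0.6823278038
```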
10.6 Convergence of Fixed Point Method
Chapter 11
Exams

11.1 Exam 1

Answer all questions.

1. Let x = 0.456 × 10^{−2}, y = 0.134, and z = 0.920. Use three-digit rounding arithmetic to evaluate: (a) (x + y) + z; (b) x + (y + z). Is the associative law for addition satisfied? Explain your answer. (10 Marks)

2. Evaluate the polynomial P(x) = x^3 − 6x^2 + 11x − 5 at p = 2 using Horner's Theorem. (6 Marks)

3. We want to evaluate the square root of 5 using the equation x^2 − 5 = 0 by applying the fixed-point iteration algorithm.
(a) Use algebraic manipulation to show that g(x) = x/2 + 5/(2x) has a fixed point exactly where x^2 − 5 = 0.
(b) Use the fixed-point theorem to show that the function g(x) converges to the unique fixed point for any initial p_0 ∈ [2, 5].
(c) Use p_0 = 3 to evaluate p_2. (8 Marks)

4. (a) Evaluate exactly the integral ∫_0^4 e^x dx.
(b) Find an approximation to ∫_0^4 e^x dx using Simpson's rule with h = 2.
(c) Find an approximation to ∫_0^4 e^x dx using the Composite Simpson's rule with h = 1.
(d) Does the composite Simpson's rule improve the approximation? (8 Marks)

5. Given the equation y' = 3x^2 + 1, with y(1) = 2, estimate y(2) by Euler's method using h = 0.25. (8 Marks)

END
Useful Formulas and Theorems

Horner's Theorem: Let

P(x) = a_n x^n + a_{n−1} x^{n−1} + ... + a_1 x + a_0.   (11.1)

If b_n = a_n and

b_k = a_k + b_{k+1} x_0, for k = n − 1, n − 2, ..., 1, 0,   (11.2)

then b_0 = P(x_0).

Fixed-Point Theorem: Let g ∈ C[a, b] be such that g(x) ∈ [a, b] for all x in [a, b]. Suppose, in addition, that g' exists on (a, b) and that a constant 0 < k < 1 exists with

|g'(x)| ≤ k, for all x ∈ (a, b).

Then, for any number p_0 in [a, b], the sequence defined by

p_n = g(p_{n−1}), n ≥ 1,

converges to the unique fixed point p in [a, b].

Euler's method: To approximate the solution of the initial-value problem dy/dt = f(t, y), a ≤ t ≤ b, y(a) = α, at N + 1 equally spaced numbers in the interval [a, b], we construct the approximations w_i ≈ y(t_i) for i = 0, 1, ..., N:

w_0 = α, t_0 = a,
w_{i+1} = w_i + h f(t_i, w_i), t_i = a + ih,

where h = (b − a)/N.
11.2 Exam 2

Answer all questions.

1. (a) Evaluate f(x) = x^3 − 6.1x^2 + 3.2x + 1.5 at x = 4.71 using three-digit rounding arithmetic.
(b) Find the relative error in (a).
(c) Use Horner's Theorem to evaluate f(x) at x = 4.71 using three-digit rounding arithmetic.
(d) Find the relative error in (c).
(e) Why is the relative error in (c) less than the relative error in (a)? (10 Marks)

2. Let f(x) = −x^3 − cos x and p_0 = −1. Use Newton's method to find p_2. (6 Marks)

3. (a) Let A be a given positive constant and g(x) = 2x − Ax^2. Show that if fixed-point iteration converges to a nonzero limit, then the limit is p = 1/A, so the reciprocal of a number can be found using only multiplications and subtractions.
(b) Use fixed-point iteration with p_0 = 0.1 to find p_2 that approximates 1/11. (8 Marks)

4. Use the forward-difference and backward-difference formulas to determine each missing entry in the following table:

x     f(x)     f'(x)
1.0   1.0000   ....
1.2   1.2625   ....
1.4   1.6595   ....

(8 Marks)

5. Use Euler's method to approximate the solution of the following initial-value problem:
y' = e^{t−y}, 0 ≤ t ≤ 1, y(0) = 1, with h = 0.5. (8 Marks)

END
Useful Formulas and Theorems

Horner's Theorem: Let

P(x) = a_n x^n + a_{n−1} x^{n−1} + ... + a_1 x + a_0.   (11.3)

If b_n = a_n and

b_k = a_k + b_{k+1} x_0, for k = n − 1, n − 2, ..., 1, 0,   (11.4)

then b_0 = P(x_0).

Fixed-Point Theorem: Let g ∈ C[a, b] be such that g(x) ∈ [a, b] for all x in [a, b]. Suppose, in addition, that g' exists on (a, b) and that a constant 0 < k < 1 exists with

|g'(x)| ≤ k, for all x ∈ (a, b).

Then, for any number p_0 in [a, b], the sequence defined by

p_n = g(p_{n−1}), n ≥ 1,

converges to the unique fixed point p in [a, b].

Euler's method: To approximate the solution of the initial-value problem dy/dt = f(t, y), a ≤ t ≤ b, y(a) = α, at N + 1 equally spaced numbers in the interval [a, b], we construct the approximations w_i ≈ y(t_i) for i = 0, 1, ..., N:

w_0 = α, t_0 = a,
w_{i+1} = w_i + h f(t_i, w_i), t_i = a + ih,

where h = (b − a)/N.
11.3 Exam 3

Answer all questions.

1. Let f(x) = (e^x − e^{−x}) / x.
(a) Find lim_{x→0} f(x).
(b) Use three-digit rounding arithmetic to evaluate f(0.100).
(c) The actual value is f(0.100) = 2.003335000; find the relative error for the value obtained in (b). (8 Marks)

2. (a) Find the actual value of ∫_{−1}^{1} e^x dx.
(b) Use Simpson's rule to approximate the integral in (a).
(c) Find the maximum error from Simpson's formula. (8 Marks)

3. Use the forward-difference and backward-difference formulas to determine each missing entry in the following table:

x     f(x)     f'(x)
1.0   1.0000   ....
1.2   1.2625   ....
1.4   1.6595   ....

(8 Marks)

4. (a) Find the actual value of ∫_0^2 1/(x + 4) dx.
(b) Use the Trapezoidal rule to approximate ∫_0^2 1/(x + 4) dx and find the actual error.
(c) Determine the values of n and h required for the Composite Trapezoidal rule to approximate ∫_0^2 1/(x + 4) dx to within 10^{−6}. (8 Marks)

5. Use Euler's method to approximate the solution of the following initial-value problem:
y' = e^{t−y}, 0 ≤ t ≤ 1, y(0) = 1, with h = 0.5. (8 Marks)

END
∈
−
b
f ( f (x) dx =
a
h f ( f (a) + 2 f ( f (x j ) + f ( f (b) 2 j=1 j =1
Simpson’s rule With nodes at x0 = a, x1 = a + h, x2 = b, where h = (b x2
n−1
2
′′
(11.5)
− a)/2, the Simpson’s rule is
h f ( f (x)dx = [f ( f (x0 ) + 4f 4 f ((x1 ) + f ( f (x2 )] 3
x0
− a h f (µ) − b 12
∈ (a, b)
−
h5 (4) f (ξ ). 90
(11.6)
Euler’s method dy To approximate the solution of the initial-value problem = f ( f (t, y ), a t b, y(a) = α at (N+1) dt equally spaced numbers in the interval [a, [a, b], we construct the solution solution y(ti ) = wi for i = 0, 1,...,N 1 and
≤ ≤
w0 = α, t0 = a, wi+1 = wi + hf ( hf (ti , wi ), ti = a + ih, where h = (b
− a)/N
103
−