Numerical methods to solve systems of equations in Python

Contents
Introduction
Conjugate gradient method
    Definitions and methodology
    Python code
Jacobi method
    Methodology
    Python code
Jacobi method for diagonal dominant matrix
    Definitions and methodology
    Python code
Gauss Seidel Method
    Methodology
    Python code
Jacobi relaxation method
    Methodology
    Python code
Introduction:
All five methods presented below are iterative. We will make frequent use of the numpy and numpy.linalg libraries (for norm, the dot operator, inv, and transpose).
When, for the system Ax = b, the matrix A is not symmetric and positive definite, we can turn this system into A^T A x = A^T b and it becomes one (A^T A is symmetric, and positive definite whenever A is nonsingular).
Hence most of the algorithms presented below are stated for symmetric matrices (the conjugate gradient and relaxation methods require this; the Jacobi and Gauss-Seidel methods do not). Also, the norm I will be working with when considering the error terms is the 2-norm, induced by the classical dot product between two vectors, or between a matrix and a vector.
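A minimal sketch of this transformation (my own illustration; the matrix and the right-hand side are made up):

import numpy as np

A = np.array([[3.0, 1.0], [0.0, 2.0]])   # not symmetric
b = np.array([5.0, 4.0])
M = np.dot(np.transpose(A), A)           # M = A^T A is symmetric positive definite
c = np.dot(np.transpose(A), b)           # c = A^T b
print(np.allclose(np.linalg.solve(M, c), np.linalg.solve(A, b)))   # True: same solution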
Conjugate gradient method

Definitions and methodology
1. We say that two directions u, v ∈ R^n are conjugate over A if <u, Av> = 0, where <·,·> is the scalar product of two vectors.
2. Problem: suppose we want to solve the system Ax = b, where A ∈ R^{n×n} is real, symmetric and positive definite. The bilinear form <u, v>_A = <u, Av> defines a scalar product over R^n, and if {p_1, ..., p_n} forms an A-conjugate basis of R^n then we may express the solution as x* = Σ_{k=1}^{n} α_k p_k.
Based on this expansion we calculate:
<p_i, Ax*> = <p_i, b> ⇒ α_i <p_i, Ap_i> = <p_i, b> ⇒ α_i = <p_i, b> / <p_i, Ap_i>.
This gives the following method for solving the equation Ax = b: find a sequence of n conjugate directions, and then compute the coefficients α_i.
The corresponding algorithm:
S1) We read A ∈ R^{n×n}, a symmetric and positive definite matrix, b ∈ R^n, and an initial guess x_0 ∈ R^n.
S2) We compute r_0 = b − Ax_0 and set p_0 = r_0.
S3) For k ≥ 0 we compute
    α_k = <p_k, r_k> / <p_k, Ap_k>,
    x_{k+1} = x_k + α_k p_k,
    r_{k+1} = r_k − α_k Ap_k,
    β_k = −<r_{k+1}, Ap_k> / <p_k, Ap_k>,
    p_{k+1} = r_{k+1} + β_k p_k,
until r_{k+1} = 0 or ||x_{k+1} − x_k|| < tol.
S4) Display the solution x_{k+1}.
Python code
First I test the functionality of the aforementioned libraries.
import numpy as np

def tests_norms():
    def test1():
        A = np.array([[1, 2], [3, 4]])
        print(np.linalg.norm(A))                     # computes the Frobenius norm
        print(np.sqrt(1**2 + 2**2 + 3**2 + 4**2))    # the same value, by hand
        print(np.linalg.norm(A, 2))                  # computes the 2-norm
    test1()

    def test2():
        # for the inverse of a matrix one can use the inv member
        # of the linalg package directly
        A = np.array([[1, 2], [2, 3]])
        print(np.linalg.inv(A))

Now we get on with our algorithms and tests.

def tests_CGM():
    def test(A, b, x0):
        # I test the first step of the conjugate gradient method
        r0 = b - np.dot(A, x0)
        p0 = r0
        alpha0 = np.dot(p0, r0) / np.dot(p0, np.dot(A, p0))
        x = x0
        x = x + alpha0 * p0
        print(x)

    def ttest():
        A = np.array([[2, 1], [1, 4]])
        b = np.array([1, 2])
        x0 = np.array([1, 1])
        test(A, b, x0)
    ttest()

def CGM(A, b, x0, tol):
    err, y = 1, x0
    r = b - np.dot(A, x0)
    p = r
    i = 0
    while err > tol:
        alpha = np.dot(p, r) / np.dot(p, np.dot(A, p))
        z = y + alpha * p
        r = r - alpha * np.dot(A, p)
        beta = -np.dot(r, np.dot(A, p)) / np.dot(p, np.dot(A, p))
        p = r + beta * p
        i = i + 1
        err = np.dot(z - y, z - y)   # squared 2-norm of z - y
        y = z
    return y, err, i

def ttest3():
    A = np.array([[10, 7, 8, 7], [7, 5, 6, 5], [8, 6, 10, 9], [7, 5, 9, 10]])
    b = np.array([32, 23, 33, 31])
    x0 = np.zeros(4)
    print(CGM(A, b, x0, 10**(-10)))

Remark: the errors are computed under the 2-norm.
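As an extra sanity check (my own addition, not part of the original tests; the matrix is made up), the result of CGM can be compared against numpy's direct solver:

def ttest4():
    A = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric and positive definite
    b = np.array([1.0, 2.0])
    x0 = np.zeros(2)
    print(CGM(A, b, x0, 10**(-10)))
    print(np.linalg.solve(A, b))              # the two results should agree closely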
Jacobi method

Methodology
We again seek to solve Ax = b, with A ∈ R^{n×n}, b ∈ R^n. We consider B = I_n − A, so the system becomes Ax = b ⇔ x = Bx + b. For any x_0 ∈ R^n we define the sequence (x_k)_{k∈N} through the relation x_{k+1} = Bx_k + b, ∀k ∈ N. The sequence is convergent iff ρ(B) < 1.
Remark: We denote by ρ(B) the spectral radius of B, ρ(B) = max{|λ_1|, |λ_2|, ..., |λ_n|}, where λ_1, ..., λ_n are the eigenvalues of B.
Python code

import numpy as np

def jacobi_methods(num):
    def spectral_radius(A):
        eig = np.linalg.eig(A)[0]
        abs_eig = np.abs(eig)
        return np.max(abs_eig)

    def spectral_radius2(A):
        # spectral radius of I - A
        B = np.identity(np.shape(A)[0]) - A
        return spectral_radius(B)

    def jacobi_method(A, b, x0, tol):
        # A = square matrix, b = the free-term column vector, x0 = initial guess
        sz = np.shape(A)
        if sz[0] != len(b):
            raise IndexError('the number of rows of A and the length of b are not compatible')
        x, i, err, B = x0, 0, 1, np.identity(np.shape(A)[0]) - A
        if spectral_radius(B) >= 1:
            print('The method is not convergent')
            return False
        else:
            while err >= tol and i < 30:
                z = b + np.dot(B, x)
                i = i + 1
                err = np.dot(z - x, z - x)
                x = z
            return x, err, i

    def test_jacobi():
        A = np.array([[0.2, 0.3], [0.1, 0.2]])
        b = np.array([0.2, 0.3])
        x0 = np.array([0.1, 0.2])
        tol = 0.01 * np.exp(-5)
        print(jacobi_method(A, b, x0, tol))

    def test_matrix():
        A = np.array([[0.2, 0.3], [0.1, 0.2]])
        B = np.identity(np.shape(A)[0]) - A
        print(spectral_radius(B))

    # test_matrix() -- uncomment these to be able to test the functions
    # test_jacobi() -- from outside

    dic = {1: spectral_radius, 2: spectral_radius2, 3: jacobi_method}
    return dic[num]
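A short usage sketch of the dispatcher (my own illustration, not in the original file): jacobi_methods(1) retrieves the spectral-radius helper and jacobi_methods(3) retrieves the solver itself.

rho = jacobi_methods(1)
solve = jacobi_methods(3)
A = np.array([[0.2, 0.3], [0.1, 0.2]])
b = np.array([0.2, 0.3])
print(rho(np.identity(2) - A))                      # about 0.97 < 1, so the iteration converges
print(solve(A, b, np.array([0.1, 0.2]), 10**(-7)))  # the loop is capped at 30 iterations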
Jacobi method for diagonal dominant matrix

Definitions and methodology
Definition 1:
We say that the matrix A = (a_ij)_{1≤i,j≤n} is diagonally dominant on rows if and only if |a_ii| > Σ_{j=1, j≠i}^{n} |a_ij|, ∀ 1 ≤ i ≤ n. Similarly, A = (a_ij)_{1≤i,j≤n} is diagonally dominant on columns if and only if |a_jj| > Σ_{i=1, i≠j}^{n} |a_ij|, ∀ 1 ≤ j ≤ n.
Method of solving the system:
1. We consider A ∈ R^{n×n} diagonally dominant, b ∈ R^n, and D = diag(a_11, ..., a_nn); diagonal dominance guarantees a_ii ≠ 0, so D^{-1} exists.
2. The system becomes Ax = b ⇔ D^{-1}Ax = D^{-1}b ⇔ x = (I_n − D^{-1}A)x + D^{-1}b, and we iterate x_{k+1} = (I_n − D^{-1}A)x_k + D^{-1}b.
Python code
First, code for testing the dominance of a matrix.
"Jacobi for diagonal dominant matrices" import numpy as np def tests def tests(num): (num): def dominance_rows def dominance_rows(A): (A): "I test if a matrix A is diagonal dominant on rows" sz = np.shape(A)[ np.shape(A)[0 0] for i for i in range range(sz): (sz): if (np.abs(A[i,i])<=1 (np.abs(A[i,i])<=1 / /2 2*np.sum(np.abs(A[i]))): return 0 return 1 def dominance_cols def dominance_cols(A): (A): "I test if a matrix A is diagonal dominant on columns" sz = np.shape(A)[ np.shape(A)[1 1] for i for i in range range(sz): (sz): if (np.abs(A[i,i]<=1 (np.abs(A[i,i]<=1 / /2 2 *np.sum(np.abs(np. *np.sum(np.abs(np.transpose(A)[i])) transpose(A)[i])))): )): return 0 return 1 def diagonal def diagonal(A): (A): "Create a diagonal matrix out of A" return np.diag(np.diag(A)) return np.diag(np.diag(A)) dic = {1 {1:dominance_rows, :dominance_rows,2 2:dominance_cols, :dominance_cols,3 3:diagonal} return dic[num] return dic[num] def test_dominance(): def test_dominance (): A = np.array([[1 np.array([[ 1,0,0,0.5 0.5],[ ],[1 1,2,0.3 0.3,,0.6 0.6],[ ],[0 0,1,3,1.5 1.5],[ ],[0 0,1,1,,-3 3]]) print(A) print (A) print(tests( print (tests(1 1)(A)) print(tests( print (tests(2 2)(A)) print((1 / print /2 2*np.sum(np.abs(A[ *np.sum(np.abs(A[3 3]))) print(np.abs(A[ print (np.abs(A[3 3,3]))
print(np.abs(A[3 print(np.abs(A[ 3,3])<= ])<=1 1 / /2 2*np.sum(np.abs(A[ *np.sum(np.abs(A[3 3]))) Now I get to the core algorithm. def jacobi_functions jacobi_functions(): (): def jacobi_matrix1(A,b): jacobi_matrix1(A,b): "creates the matrix B = I - D^(-1)*A, and the new vector b=D^(-1)*A" "for diagonal dominant matrix on rows" D = np.diag(np.diag(A)) return np.identity(np.shape(A) np.identity(np.shape(A)[[0]) - np.dot(np.linalg.inv(D), np.dot(np.linalg.inv(D),A),\ A),\ np.dot(np.linalg.inv(D),b) def jacobi_matrix2(A): jacobi_matrix2(A): "creates the matrix B = I - A*D^(-1)" "in case of diagonal dominant matrix on columns" D = np.diag(np.diag(A)) B = np.identity(np.shape(A) np.identity(np.shape(A)[[ 0]) - np.dot(A,np.linalg.inv(D)) return B return B def test_jacobi_matrix def test_jacobi_matrix(): (): A = np.array([[1 np.array([[ 1,2,3],[ ],[1 1,2,0],[ ],[0 0,1,1]]) b = np.array([1 np.array([ 1,1,1]) print(jacobi_matrix1(A,b)) print (jacobi_matrix1(A,b)) print(jacobi_matrix2(A)) print (jacobi_matrix2(A)) #test_jacobi_matrix() def test_jacobi_matrix2 def test_jacobi_matrix2(): (): h2 = tests(1 tests(1) A = np.array([[1 np.array([[ 1,0,0,0],[ ],[0 0,2,0,0],[ ],[0 0,0,3,0],[ ],[0 0,0,0,4]]) B = np.diag(([1 np.diag(([ 1,2,3,4])) print(A) print (A) print(B) print (B) print(h2(A)== print (h2(A)==1 1) #test_jacobi_matrix2() def jacobi_method(A,x0,a,tol): jacobi_method(A,x0,a,tol): sz = np.shape(A)[ np.shape(A)[0 0] if sz!=len sz!=len(x0): (x0): raise IndexError IndexError('The ('The dimensions of A and x0 are not compatible' ) h2,h3 = tests(1 tests( 1),tests( ),tests(2 2) if h2(A)==1 h2(A)==1: "If A is dominant on rows" B,b = jacobi_matrix1(A,a jacobi_matrix1(A,a)) x,err,i = x0,1 x0,1,0 while(err>=tol while (err>=tol and i< and i<30 30): ): z = np.dot(B,x) + b err = np.dot(z - x,z-x) i = i+1 i+1 x=z return x,err,i return x,err,i elif h3(A)==1 h3(A)==1: "If A is dominant on cols" B = jacobi_matrix2(A) x,err,i = x0,1 x0,1,0 y = np.diag(np.diag(A))*x
while(err>=tol and i< while(err>=tol and i<30 30): ): z = np.dot(B,y) + a err = np.dot(z-y,z-y) i = i+1 i+1 y=z return x,err,i return x,err,i else:: else raise ValueError ValueError('You ('You better try another method for solving the system') def test_jacobi_method def test_jacobi_method(): (): A = np.diag([2 np.diag([2,3,2,4]) b = np.array([1 np.array([ 1,2,3,4]) x0 = np.ones(4 np.ones(4) print(jacobi_method(A,x0,b, print (jacobi_method(A,x0,b,10 10**(**(-4 4))) test_jacobi_method()
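A small extra check of my own (hypothetical; it would sit inside jacobi_functions() next to the other tests, so that jacobi_method is in scope) that the row-dominant branch reproduces the direct solution:

def test_row_dominant():
    A = np.array([[4.0, 1.0, 1.0], [1.0, 5.0, 2.0], [0.0, 1.0, 3.0]])   # strictly row dominant
    a = np.array([6.0, 8.0, 4.0])
    x0 = np.zeros(3)
    print(jacobi_method(A, x0, a, 10**(-8)))
    print(np.linalg.solve(A, a))   # reference: the solution is [1, 1, 1]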
Gauss Seidel Method

Methodology
For the system Ax = b we write B = I_n − A and decompose B = L + R, where L = (l_ij)_{1≤i,j≤n} with l_ij = b_ij for i > j and l_ij = 0 otherwise (the part of B strictly under the main diagonal), and R = B − L. Then det(I_n − L) = 1, so (I_n − L)^{-1} exists. The system Ax = b ⇔ x = Bx + b ⇔ x = Lx + Rx + b ⇔ (I_n − L)x = Rx + b ⇔ x = (I_n − L)^{-1}Rx + (I_n − L)^{-1}b. If we denote C = (I_n − L)^{-1}R and c = (I_n − L)^{-1}b, then the system ⇔ x = Cx + c. Hence we can define, for any x_0 ∈ R^n, the sequence x_{k+1} = Cx_k + c, ∀k ∈ N.
Remark on convergence of the sequence: the sequence is convergent if and only if ρ(C) < 1.
Python code "Gauss Seidel method" import numpy as np def tests def tests(): (): def test1 def test1(): (): l = list list([[(i,j) ([[(i,j) for for j j in range range((4)] for for i i in range range((4)]) print(l) print (l) m = list list([x ([x if x[0 x[0]>x[ ]>x[1 1] else 0 for for x x in l]) in l]) print(m) print (m) def test2 def test2(): (): A = np.random.normal(0 np.random.normal( 0,1,( ,(4 4,4)) B,C = np.zeros((4 np.zeros(( 4,4)),np.zeros(( )),np.zeros((4 4,4)) for i for i in range range((1,4): for j for j in range range(i): (i): B[i,j] = A[i,j]
for i in range for i range((0,4): for k for k in range range(i, (i,4 4): C[i,k] = A[i,k] print(A) print (A) print(B) print (B) print(C) print (C) print(np.zeros(( print (np.zeros((4 4,4))) test2() def decomposition(A): def decomposition (A): "A = L + R, where L contains the matrix A under its main diagonal" sz = np.shape(A)[0 np.shape(A)[ 0] L,R = np.zeros((sz,sz)), np.zeros((sz,sz)),np.zeros((sz, np.zeros((sz,sz)) sz)) for i for i in range range((1,sz): for j for j in range range(i): (i): L[i,j] = A[i,j] for i for i in range range((0,sz): for k for k in range range(i,sz): (i,sz): R[i,k]=A[i,k] return L,R return L,R def test_decomposition(): def test_decomposition (): A = np.random.normal(0 np.random.normal( 0,1,( ,(5 5,5)) print(A) print (A) print(decomposition(A)[ print (decomposition(A)[0 0]) print(decomposition(A)[ print (decomposition(A)[1 1]) def gauss_seidel_method(A,b,x0,tol): def gauss_seidel_method (A,b,x0,tol): if np.linalg.det(A)==0 np.linalg.det(A)==0: raise Exception Exception("The ("The matrix is singular") else:: else sz = np.shape(A)[ np.shape(A)[0 0] B = np.identity(sz) - A L,R = decomposition(B) K = np.identity(sz) - L C = np.dot(np.linalg.inv(K),R) c = np.dot(np.linalg.inv(K),b) x,err,i= x0,1 x0,1,0 while (err>=tol while (err>=tol and i< and i<30 30): ): z = np.dot(C,x)+c err = np.dot(z-x,z-x) i=i+1 i=i+ 1 x=z return x,err,i return x,err,i def test_gauss_seidel_method(): def test_gauss_seidel_method (): A1,A2 = np.diag([1 np.diag([1,2,3,4]),np.diag([ ]),np.diag([1 1,2,3,4]) b = np.ones(4 np.ones(4) x0 = np.zeros(4 np.zeros( 4)
A1[3,0],A1[ A1[3 ],A1[0 0,3] = 2,2 print(gauss_seidel_method(A2,b,x0, print (gauss_seidel_method(A2,b,x0,0.000001 0.000001)) )) #print(gauss_seidel_method(A1,b,x0,0.000001)) "We see that for diagonal matrices the method i s not always converging" "I will need to import from Jacobi the jacobi_methods(num) function containing" "spectral radius function" from Jacobi import import jacobi_methods jacobi_methods def gs_spectral_radius(A): def gs_spectral_radius (A): B=np.identity(np.shape(A)[0 B=np.identity(np.shape(A)[ 0])-A L,R = decomposition(B) K = np.identity(np.shape( np.identity(np.shape(A)[ A)[0 0])-L C = np.dot(np.linalg.inv(K),R) return jacobi_methods( return jacobi_methods(1 1)(C) def test_spectral_radius(): def test_spectral_radius (): A = np.array([[1 np.array([[ 1,2,,-2 2],[ ],[1 1,1,1],[ ],[2 2,2,1]]) print(gs_spectral_radius(A)) print (gs_spectral_radius(A)) B = np.diag([1 np.diag([1,2,3,4]) print(gs_spectral_radius(B)) print (gs_spectral_radius(B)) def gauss_seidel_method2(A,b,x0,tol): def gauss_seidel_method2 (A,b,x0,tol): "An improved version, which abandonds the task if the spectral radius " " of the matrix C previously defined is greater or equal than 1" sz = np.shape(A)[0 np.shape(A)[ 0] B = np.identity(sz)-A L,R = decomposition(B) K = np.identity(sz) - L C = np.dot(np.linalg.inv(K),R) if np.linalg.det(A)==0 np.linalg.det(A)==0: raise Exception Exception("The ("The matrix is singular") elif gs_spectral_radius( elif gs_spectral_radius(C)>= C)>=1 1: raise Exception Exception("The ("The method will not converge, the spectral radius is >=1") else:: else c = np.dot(np.linalg.inv(K),b) x,err,i = x0,1 x0, 1,0 while (err>=tol while (err>=tol and i< and i<30 30): ): z=np.dot(C,x)+c err = np.dot(z-x,z-x) i = i+1 i+1 x=z return x,err,i return x,err,i def test_gauss_seidel_method2(): def test_gauss_seidel_method2 (): A1,b,x0 = np.diag([1 np.diag([ 1,2,3,4]), np.ones(4 np.ones(4),np.zeros( ),np.zeros(4 4) print(gauss_seidel_method2(A1,b,x0, print (gauss_seidel_method2(A1,b,x0,0.00001 0.00001)) ))
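As an extra sanity check (my own addition, not part of the original tests; the matrix is made up), gauss_seidel_method2 can be compared against numpy's direct solver on a system whose iteration matrix C has a small spectral radius:

def test_gs_vs_direct():
    A = np.array([[1.0, 0.2], [0.1, 1.0]])   # close to the identity, so rho(C) is small
    b = np.array([1.0, 2.0])
    x0 = np.zeros(2)
    print(gauss_seidel_method2(A, b, x0, 10**(-12)))
    print(np.linalg.solve(A, b))              # the two results should agree closely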
def norm_inf_gauss_tests(num):
    def modified_ones(q, n):
        # q is a vector whose length must be less than n;
        # the result is [q1, q2, ..., qk, 1, 1, ..., 1]
        z = np.ones(n)
        for i in range(len(q)):
            z[i] = q[i]
        return z

    def test1():
        q = [2.1, 2.2, 2.3]
        x = modified_ones(q, 5)
        print(x)

    def norm_inf(B):
        # cumulative scalar products:
        # q1 = sum of the elements of B[0]
        # q2 = <(q1, 1, ..., 1), (B[1,0], ..., B[1,m])>
        # q3 = <(q1, q2, 1, ..., 1), (B[2,0], B[2,1], ..., B[2,m])>
        sz = np.shape(B)[0]
        q = []
        q.append(np.dot(np.ones(sz), B[0]))
        for i in range(1, sz):
            aux = q[0:i]
            q.append(np.dot(modified_ones(aux, np.shape(B)[0]), B[i]))
        return q

    def test2():
        B = np.array([[1, 2], [3, -1]])
        print(norm_inf(B))
        # should return 3.0 and 8.0 if it is correctly implemented
    # test2()

    def norm_inf_gauss(A):
        # the return value specifies whether Gauss-Seidel is applicable:
        # if the maximum of the q's is < 1 then Gauss-Seidel converges, otherwise not
        B = np.identity(np.shape(A)[0]) - A
        q = norm_inf(np.abs(B))
        return q

    dic = {1: modified_ones, 2: norm_inf, 3: norm_inf_gauss}
    return dic[num]

Remark (alternative sufficient condition for convergence): regarding the last function, norm_inf_gauss, I have to take the following into consideration.
Considering B = I_n − A = (b_ij)_{1≤i,j≤n}, we define
    q_1 = Σ_{j=1}^{n} |b_1j|,
    q_i = Σ_{j=1}^{i−1} |b_ij| q_j + Σ_{j=i}^{n} |b_ij| for 2 ≤ i ≤ n,
and q = max_{1≤i≤n} q_i. If q < 1 then we have the following evaluation:
    ||x_{k+1} − x*||_∞ ≤ q ||x_k − x*||_∞,
so to test whether the method converges for a given system we can check whether q < 1.
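An illustration of my own (the matrix is made up) of using norm_inf_gauss as a convergence pre-check:

check = norm_inf_gauss_tests(3)
A = np.array([[0.9, 0.05], [0.05, 0.9]])
q = check(A)
print(q, max(q) < 1)   # max(q) < 1, so Gauss-Seidel converges for this system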
Jacobi relaxation method

Methodology
Suppose we have a system Ax = b, with A symmetric and positive definite.
Instead of considering B = I_n − D^{-1}A and c = D^{-1}b, we take, for iterative purposes, B_σ = I_n − σD^{-1}A and c_σ = σD^{-1}b, where σ > 0 is the relaxation parameter. The system of equations becomes x = (I_n − σD^{-1}A)x + σD^{-1}b ⇒ x_{k+1} = B_σ x_k + c_σ. The optimal choice of σ is made as follows: let λ_1 ≥ λ_2 ≥ ⋯ ≥ λ_n > 0 be the eigenvalues of A; then σ_opt = 2/(λ_1 + λ_n). For 0 < σ < 2/λ_1 we can be sure of a convergent sequence.
Python code "Simultaneous Relaxation method" import numpy as np def relaxed_method def relaxed_method(): (): def issymetric def issymetric(A): (A): if np.all(np.transpose( if np.all(np.transpose(A)==A): A)==A): return 1 else:: else return 0 def is_pos_def def is_pos_def (A): (A): if np.all(np.linalg.eig(A)[0 np.all(np.linalg.eig(A)[ 0]> ]>0 0) and i and issymetric(A)= ssymetric(A)== = 1: return 1 else:: else return 0 def opt_relax_param def opt_relax_param(A): (A): "Optimal relaxation parameter.Interest parameter.Interest on positive-definite and " "symmetric matrices" eig = np.linalg.eig(A)[0 np.linalg.eig(A)[0] m,M = np.min(eig),np.max(eig) return 2 /(m+M) def chg_var def chg_var(A,a): (A,a): "change of variables in relaxed method" sz = np.shape(A)[ np.shape(A)[0 0] sigma = opt_relax_param(A) D = np.diag(np.diag(A)) C = np.identity(sz) - sigma * np.dot(np.linalg.inv(D),A) c = sigma*np.dot(np.linalg.inv( sigma*np.dot(np.linalg.inv(D),a) D),a) return C,c return C,c def test def test(): (): A = np.array([[5 np.array([[ 5,3,2],[ ],[3 3,6,3],[ ],[2 2,3,5]]) a= np.array([1 np.array([ 1,2,3]) print(chg_var(A,a)) print (chg_var(A,a)) #test() def solution1 def solution1(A,a,x0,tol,n): (A,a,x0,tol,n): "n is the number of iterations" i terations" "I assume matrix A fullfils all necessar necessary y conditions"
x,err,i=x0,1 x,err,i=x0, 1,0 C,c = chg_var(A,a) while (err>=tol while (err>=tol and i
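A usage sketch of my own (hypothetical: it assumes solution1 is reachable, e.g. placed inside relaxed_method() next to the other tests). The iterate approaches the direct solution on the symmetric positive definite matrix from test above:

def test_solution1():
    A = np.array([[5.0, 3.0, 2.0], [3.0, 6.0, 3.0], [2.0, 3.0, 5.0]])
    a = np.array([1.0, 2.0, 3.0])
    x0 = np.zeros(3)
    print(solution1(A, a, x0, 10**(-8), 500))
    print(np.linalg.solve(A, a))   # reference solution for comparison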