Numerical Methods for Constrained Optimization
Cheng-Liang Chen
PSE Laboratory
Department of Chemical Engineering, National Taiwan University
Chen CL
1
Power Generation via Fuel Oil: Process Description
A two-boiler, turbine-generator combination is to produce a power output of 50 MW. Ranges of operation: [18, 30] MW and [14, 25] MW, respectively.
Chen CL
2
Max supply of Blast Furnace Gas (BFG): 10 fuel units/hr
Fuel used (f, ton/h) as a function of power generated (x, MW):  f = a0 + a1 x + a2 x^2

Generator  Fuel Type  a0      a1       a2
1          Fuel Oil   1.4609  0.15186  0.001450
1          BFG        1.5742  0.16310  0.001358
2          Fuel Oil   0.8008  0.20310  0.000916
2          BFG        0.7266  0.22560  0.000778
Objective: min amount of fuel oil purchased
Chen CL
3
Power Generation via Fuel Oil: Formulation
Amount of fuel type j used by generator i:
    f_ij = a_ij0 + a_ij1 x_ij + a_ij2 x_ij^2,    i = 1, 2;  j = 1, 2
Power generated by generator i and the total power demand:
    p_i = x_i1 + x_i2,    i = 1, 2
    p_1 + p_2 ≥ 50
Amount of fuel type j used must not exceed the amount purchased z_j:
    z_j ≥ Σ_{i=1}^{2} f_ij = Σ_{i=1}^{2} ( a_ij0 + a_ij1 x_ij + a_ij2 x_ij^2 ),    j = 1, 2
Chen CL
4
NLP Formulation
    min   z_1
    s.t.  p_i = x_i1 + x_i2,    i = 1, 2
          p_1 + p_2 ≥ 50
          z_j ≥ Σ_{i=1}^{2} ( a_ij0 + a_ij1 x_ij + a_ij2 x_ij^2 ),    j = 1, 2
          x_ij ≥ 0,   0 ≤ z_2 ≤ 10,   18 ≤ p_1 ≤ 30,   14 ≤ p_2 ≤ 25
Chen CL
5
Power Generation via Fuel Oil: GAMS Code

$TITLE Power Generation via Fuel Oil
$OFFUPPER
$OFFSYMXREF OFFSYMLIST
OPTION SOLPRINT = OFF;
* Define index sets
SETS
  G  Power Generators                        /gen1*gen2/
  F  Fuels                                   /oil, gas/
  K  Constants in Fuel Consumption Equations /0*2/;
* Define and input the problem data
TABLE A(G,F,K)  Coefficients in the fuel consumption equations
               0        1        2
  gen1.oil  1.4609   .15186   .00145
  gen1.gas  1.5742   .16310   .001358
  gen2.oil  0.8008   .20310   .000916
  gen2.gas  0.7266   .22560   .000778 ;
PARAMETER PMAX(G) Maximum power outputs of generators / GEN1 30.0, GEN2 25.0 /;
PARAMETER PMIN(G) Minimum power outputs of generators / GEN1 18.0, GEN2 14.0 /;
SCALAR GASSUP Maximum supply of BFG in units per h /10.0/
       PREQ   Total power output required in MW   /50.0/;
* Define optimization variables
VARIABLES
  P(G)    Total power output of generators in MW
  X(G,F)  Power outputs of generators from specific fuels
  Z(F)    Total amounts of fuel purchased
  OILPUR  Total amount of fuel oil purchased;
POSITIVE VARIABLES P, X, Z;
* Define objective function and constraints
EQUATIONS
  TPOWER      Required power must be generated
  PWR(G)      Power generated by individual generators
  OILUSE      Amount of oil purchased to be minimized
  FUELUSE(F)  Fuel usage must not exceed purchase;
TPOWER..      SUM(G, P(G)) =G= PREQ;
PWR(G)..      P(G) =E= SUM(F, X(G,F));
FUELUSE(F)..  Z(F) =G= SUM((K,G), A(G,F,K)*X(G,F)**(ORD(K)-1));
OILUSE..      OILPUR =E= Z("oil");
* Impose bounds and initialize optimization variables
* Upper and lower bounds on P from the operating ranges
P.UP(G) = PMAX(G);
P.LO(G) = PMIN(G);
* Upper bound on BFG consumption from GASSUP
Z.UP("gas") = GASSUP;
* Specify initial values for power outputs
P.L(G) = 0.5*(PMAX(G)+PMIN(G));
* Define model and solve
MODEL FUELOIL /ALL/;
SOLVE FUELOIL USING NLP MINIMIZING OILPUR;
DISPLAY X.L, P.L, Z.L, OILPUR.L;
6
Chen CL
7
Power Generation via Fuel Oil: Result
Power outputs for generators 1 and 2 are 30 MW and 20 MW
Use all of the BFG (10 units) since it is free
  ⇒ 36.325 MW total power  ⇒ 13.675 MW remaining power
Purchase 4.681 ton/h of fuel oil for the remaining power
Generator 1 is operating at the maximum capacity of its operating range
Chen CL
8
Alkylation Process Optimization: Process Flowsheet
Chen CL
9
Alkylation Process Optimization: Notation and Variables
Chen CL
10
Alkylation Process Optimization: Total Profit
    f(x) = C1 x4 x7 − C2 x1 − C3 x2 − C4 x3 − C5 x5        (a)
C1 = alkylate product value ($0.063/octane-barrel)
C2 = olefin feed cost ($5.04/barrel)
C3 = isobutane recycle cost ($0.035/barrel)
C4 = acid addition cost ($10.00 per thousand pounds)
C5 = isobutane makeup cost ($3.36/barrel)
Chen CL
11
Alkylation Process Optimization: Some Regression Models
For reactor temperatures between 80 and 90 °F and reactor acid strengths by weight percent between 85 and 93, the alkylate yield was
    x4 = x1 ( 1.12 + 0.13167 x8 − 0.00667 x8^2 )                    (b)
The alkylate yield x4 equals the olefin feed x1 plus the isobutane makeup x5 less shrinkage. The volumetric shrinkage can be expressed as 0.22 volume per volume of alkylate yield:
    x5 = 1.22 x4 − x1                                               (c)
The acid strength by weight percent x6 could be derived from an equation that expresses the acid addition rate x3 as a function of the alkylate yield x4, the acid dilution factor x9, and the acid strength by weight percent x6 (the added acid was assumed to have an acid strength of 98%):
    x6 = 98000 x3 / ( x4 x9 + 1000 x3 )                             (d)
Chen CL
12
The motor octane number x7 was a function of the external isobutane-to-olefin ratio x8 and the acid strength by weight percent x6 (for the same reactor temperatures and acid strengths as for the alkylate yield x4):
    x7 = 86.35 + 1.098 x8 − 0.038 x8^2 + 0.325 ( x6 − 89 )          (e)
The external isobutane-to-olefin ratio x8 was equal to the sum of the isobutane recycle x2 and the isobutane makeup x5 divided by the olefin feed x1:
    x8 = ( x2 + x5 ) / x1                                           (f)
The acid dilution factor x9 could be expressed as a linear function of the F-4 performance number x10:
    x9 = 35.82 − 0.222 x10                                          (g)
The last dependent variable is the F-4 performance number x10, which was expressed as a function of the motor octane number x7:
    x10 = −133 + 3 x7                                               (h)
Chen CL
13
Alkylation Process Optimization: Constrained NLP Problem
    max  f(x) = C1 x4 x7 − C2 x1 − C3 x2 − C4 x3 − C5 x5            (a)
    s.t. Eqs. (b) ∼ (h)
(10 variables, 7 constraints)
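As a quick check of the model before handing it to a solver, the profit (a) and the residuals of Eqs. (b)-(h) can be evaluated at any candidate point. The sketch below does this in Python; it is an illustrative helper (the function names are assumptions, not part of the original GAMS formulation) evaluated at the starting point used in the GAMS code that follows.

# Hypothetical helpers: evaluate the alkylation profit (a) and the residuals of Eqs. (b)-(h).
def alkylation_profit(x):
    # x = [x1, ..., x10]; objective (a) with the cost coefficients C1..C5 from the slides
    x1, x2, x3, x4, x5, x6, x7, x8, x9, x10 = x
    return 0.063*x4*x7 - 5.04*x1 - 0.035*x2 - 10.0*x3 - 3.36*x5

def alkylation_residuals(x):
    x1, x2, x3, x4, x5, x6, x7, x8, x9, x10 = x
    return [
        x4 - x1*(1.12 + 0.13167*x8 - 0.00667*x8**2),                 # (b)
        x5 - (1.22*x4 - x1),                                         # (c)
        x6 - 98000.0*x3/(x4*x9 + 1000.0*x3),                         # (d)
        x7 - (86.35 + 1.098*x8 - 0.038*x8**2 + 0.325*(x6 - 89.0)),   # (e)
        x8 - (x2 + x5)/x1,                                           # (f)
        x9 - (35.82 - 0.222*x10),                                    # (g)
        x10 - (-133.0 + 3.0*x7),                                     # (h)
    ]

# Starting point from the GAMS code below
x0 = [1745, 12000, 110, 3048, 1974, 89.2, 92.8, 8, 3.6, 145]
print(alkylation_profit(x0))      # roughly the f(x0) = 872.3 initial value quoted later
print(alkylation_residuals(x0))   # small residuals: x0 nearly satisfies (b)-(h)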
Chen CL
14
Alkylation Process Optimization: GAMS Code

$TITLE ALKYLATION PROBLEM FROM GINO USER'S MANUAL
$OFFSYMXREF
$OFFSYMLIST
OPTION LIMROW=0;
OPTION LIMCOL=0;
POSITIVE VARIABLES X1,X2,X3,X4,X5,X6,X7,X8,X9,X10;
VARIABLE OBJ;
EQUATIONS E1,E2,E3,E4,E5,E6,E7,E8;
E1.. X4 =E= X1*(1.12+.13167*X8-0.0067*X8**2);
E2.. X7 =E= 86.35+1.098*X8-0.038*X8**2+0.325*(X6-89.);
E3.. X9 =E= 35.82-0.222*X10;
E4.. X10 =E= 3*X7-133;
E5.. X8*X1 =E= X2+X5;
E6.. X5 =E= 1.22*X4-X1;
E7.. X6*(X4*X9+1000*X3) =E= 98000*X3;
E8.. OBJ =E= 0.063*X4*X7-5.04*X1-0.035*X2-10*X3-3.36*X5;
X1.LO  = 0.;     X1.UP  = 2000.;
X2.LO  = 0.;     X2.UP  = 16000.;
X3.LO  = 0.;     X3.UP  = 120.;
X4.LO  = 0.;     X4.UP  = 5000.;
X5.LO  = 0.;     X5.UP  = 2000.;
X6.LO  = 85.;    X6.UP  = 93.;
X7.LO  = 90;     X7.UP  = 95;
X8.LO  = 3.;     X8.UP  = 12.;
X9.LO  = 1.2;    X9.UP  = 4.;
X10.LO = 145.;   X10.UP = 162.;
X1.L = 1745;  X2.L = 12000;  X3.L = 110;  X4.L = 3048;  X5.L = 1974;
X6.L = 89.2;  X7.L = 92.8;   X8.L = 8;    X9.L = 3.6;   X10.L = 145;
MODEL ALKY/ALL/;
SOLVE ALKY USING NLP MAXIMIZING OBJ;
15
Chen CL
16
Alkylation Process Optimization: Relaxation of Regression Variables
max  f(x) = C1 x4 x7 − C2 x1 − C3 x2 − C4 x3 − C5 x5                (a)
s.t. Rx4  = x1 ( 1.12 + 0.13167 x8 − 0.00667 x8^2 )                 (b)
     x5   = 1.22 x4 − x1                                            (c)
     x6   = 98000 x3 / ( x4 x9 + 1000 x3 )                          (d)
     Rx7  = 86.35 + 1.098 x8 − 0.038 x8^2 + 0.325 ( x6 − 89 )       (e)
     x8 x1 = x2 + x5                                                (f)
     Rx9  = 35.82 − 0.222 x10                                       (g)
     Rx10 = −133 + 3 x7                                             (h)
     0.99 x4  ≤ Rx4  ≤ 1.01 x4
     0.99 x7  ≤ Rx7  ≤ 1.01 x7
     0.99 x9  ≤ Rx9  ≤ 1.01 x9
     0.99 x10 ≤ Rx10 ≤ 1.01 x10
Chen CL
17
Alkylation Process Optimization: Solution
    f(x0) = 872.3 (initial guess)   ⇒   f(x*) = 1768.75
Chen CL
18
Constraint Status at a Design Point
Minimize:    f(x)
Subject to:  gj(x) ≤ 0,   j = 1, ..., m
             hi(x) = 0,   i = 1, ..., p
⇓
Active constraint:     gj(x(k)) = 0,   hi(x(k)) = 0
Inactive constraint:   gj(x(k)) < 0
Violated constraint:   gj(x(k)) > 0,   hi(x(k)) ≠ 0
ε-active constraint:   gj(x(k)) < 0,   gj(x(k)) + ε > 0
Chen CL
Conceptual Steps of Constrained Optimization Algorithms Initialized from A Feasible Point
19
Chen CL
Conceptual Steps of Constrained Optimization Algorithms Initialized from An Infeasible Point
20
Chen CL
21
Sequential Unconstrained Minimization Techniques (SUMT)
Minimize:    f(x)
Subject to:  gj(x) ≤ 0,   j = 1, ..., m
             hk(x) = 0,   k = 1, ..., p
⇒ Minimize:  Φ(x, r_p) = f(x) + r_p P(x)
    P(x):  penalty if any constraint is violated
    r_p:   weighting factor
Chen CL
22
The Exterior Penalty Function Methods
Minimize:    f(x)
Subject to:  gj(x) ≤ 0,   j = 1, ..., m
             hk(x) = 0,   k = 1, ..., p
⇓
    min  Φ(x, r_p) = f(x) + r_p P(x)
    P(x) = Σ_{j=1}^{m} [ max( 0, gj(x) ) ]^2 + Σ_{k=1}^{p} [ hk(x) ]^2
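A minimal sketch of how Φ(x, r_p) is built and driven by an increasing penalty weight is shown below. Python is used only for illustration; the unconstrained minimizer scipy.optimize.minimize and the simple update r_p ← 10 r_p are assumptions, not part of the original slides. The test problem is the two-variable example that appears a few slides later.

# Exterior penalty (SUMT) sketch:  min f(x)  s.t.  g(x) <= 0, h(x) = 0
import numpy as np
from scipy.optimize import minimize

def phi(x, r_p, f, gs, hs):
    # Pseudo-objective Phi(x, r_p) = f(x) + r_p * P(x)
    P = sum(max(0.0, g(x))**2 for g in gs) + sum(h(x)**2 for h in hs)
    return f(x) + r_p * P

def sumt_exterior(f, gs, hs, x0, r_p=1.0, growth=10.0, iters=6):
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):                      # r_p: small value -> large value
        res = minimize(lambda z: phi(z, r_p, f, gs, hs), x, method="BFGS")
        x, r_p = res.x, r_p * growth
    return x

# Example from the slides: min f = x1 + x2, g1 = -2 + x1 - 2 x2 <= 0, g2 = 8 - 6 x1 + x1^2 - x2 <= 0
f  = lambda x: x[0] + x[1]
g1 = lambda x: -2 + x[0] - 2*x[1]
g2 = lambda x: 8 - 6*x[0] + x[0]**2 - x[1]
print(sumt_exterior(f, [g1, g2], [], x0=[0.0, 0.0]))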
Chen CL
23
The Exterior Penalty Function Method: Example
Minimize:    f = (x + 2)^2 / 20
Subject to:  g1 = 1 − x/2 ≤ 0
             g2 = x/2 − 2 ≤ 0
Chen CL
24
⇓
Min:  Φ(x, r_p) = (x + 2)^2 / 20 + r_p { [ max( 0, 1 − x/2 ) ]^2 + [ max( 0, x/2 − 2 ) ]^2 }
Chen CL
25
The Exterior Penalty Function Method: Example
Minimize:    f = x1 + x2
Subject to:  g1 = −2 + x1 − 2 x2 ≤ 0
             g2 = 8 − 6 x1 + x1^2 − x2 ≤ 0
Chen CL
Pseudo-objective function: r p = 0.05
26
Chen CL
Pseudo-objective function: r p = 0.1
27
Chen CL
Pseudo-objective function: r p = 1.0
28
Chen CL
29
Algorithm for the Exterior Penalty Function Method
    r_p :  small value  −→  large value  −→  ∞
Chen CL
30
Advantages and Disadvantages of Exterior Penalty Function Methods
It is applicable to general constrained problems, i.e. equalities as well as inequalities can be treated
The starting design point can be arbitrary
The method iterates through the infeasible region where the problem functions may be undefined
If the iterative process terminates prematurely, the final design may not be feasible and hence not usable
Chen CL
31
The Interior Penalty Function Methods (Barrier Function Methods)
Minimize:    f(x)
Subject to:  gj(x) ≤ 0,   j = 1, ..., m
             hk(x) = 0,   k = 1, ..., p
⇓
    min  Φ(x, r'_p, r_p) = f(x) + r'_p Σ_{j=1}^{m} [ −1 / gj(x) ] + r_p Σ_{k=1}^{p} [ hk(x) ]^2
Chen CL
32
The Interior Penalty Function Method: Example
Minimize:    f = (x + 2)^2 / 20
Subject to:  g1 = 1 − x/2 ≤ 0
             g2 = x/2 − 2 ≤ 0
Chen CL
33
⇓
Min:  Φ(x, r'_p) = (x + 2)^2 / 20 + r'_p [ 2/(x − 2) + 2/(4 − x) ]
Chen CL
34
Algorithm for the Interior Penalty Function Method
    r'_p :  large value  −→  small value  −→  0
Chen CL
35
Advantages and Disadvantages of Interior Penalty Function Methods
The starting design point must be feasible
The method always iterates through the feasible region
Chen CL
36
Augmented Lagrange Multiplier Method: Equality-Constrained Problems
Min:   f(x)
s.t.   hk(x) = 0,   k = 1, ..., p
⇒  L(x, λ) = f(x) + Σ_{∀k} λk hk(x)
⇒  A(x, λ, r_p) = f(x) + Σ_{∀k} { λk hk(x) + r_p [ hk(x) ]^2 }
NC at x*, λ*:
    ∂A/∂xi = ∂f/∂xi + Σ_{∀k} ( λk + 2 r_p hk ) ∂hk/∂xi = 0
    ∂L/∂xi = ∂f/∂xi + Σ_{∀k} λk* ∂hk/∂xi = 0
⇒  λk* = λk + 2 r_p hk(x)    (update λk with a finite upper bound for r_p)
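A compact sketch of this update loop is given below (illustrative Python; the inner unconstrained solve via scipy.optimize.minimize and the fixed penalty weight are assumptions), applied to the example on the next slide, min x1^2 + x2^2 s.t. x1 + x2 − 1 = 0.

# Augmented Lagrangian sketch for equality constraints:
#   A(x, lam, r_p) = f(x) + sum_k [ lam_k * h_k(x) + r_p * h_k(x)^2 ]
import numpy as np
from scipy.optimize import minimize

def alm_equality(f, hs, x0, r_p=1.0, outer_iters=10):
    x = np.asarray(x0, dtype=float)
    lam = np.zeros(len(hs))
    def A(z):
        hv = np.array([h(z) for h in hs])
        return f(z) + np.dot(lam, hv) + r_p * np.dot(hv, hv)
    for _ in range(outer_iters):
        x = minimize(A, x, method="BFGS").x                        # minimize A for fixed lam, r_p
        lam = lam + 2.0 * r_p * np.array([h(x) for h in hs])       # lam_k* = lam_k + 2 r_p h_k(x)
    return x, lam

# Example from the next slide: min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0
f = lambda x: x[0]**2 + x[1]**2
h = lambda x: x[0] + x[1] - 1.0
x_opt, lam_opt = alm_equality(f, [h], x0=[0.0, 0.0], r_p=1.0)
print(x_opt, lam_opt)   # approaches x* = (0.5, 0.5), lambda* = -1 as stated in the example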
Chen CL
37
Chen CL
38
Augmented Lagrange Multiplier Method: Example
Min:   f(x) = x1^2 + x2^2
s.t.   h(x) = x1 + x2 − 1 = 0
Chen CL
39
⇒  A(x, λ, r_p) = x1^2 + x2^2 + λ ( x1 + x2 − 1 ) + r_p ( x1 + x2 − 1 )^2
   ∇x A(x, λ, r_p) = [ 2 x1 + λ + 2 r_p ( x1 + x2 − 1 ),  2 x2 + λ + 2 r_p ( x1 + x2 − 1 ) ]ᵀ = 0
⇒  x1 = x2 = ( 2 r_p − λ ) / ( 2 + 4 r_p )
Solve the problem for r_p = 1; λ = 0, 2, respectively
True optimum:  λ* = −1,   x1* = x2* = 0.5
Chen CL
40
Augmented Lagrange Multiplier Method: Inequality-Constrained Problems
Min:   f(x)
s.t.   gj(x) ≤ 0,   j = 1, ..., m
⇒  gj(x) + Zj^2 = 0    (slack variables)
⇒  A(x, λ, Z, r_p) = f(x) + Σ_{∀j} { λj [ gj(x) + Zj^2 ] + r_p [ gj(x) + Zj^2 ]^2 }
⇒  A(x, λ, r_p) = f(x) + Σ_{∀j} { λj Ψj + r_p Ψj^2 }
    Ψj = max[ gj(x),  −λj / (2 r_p) ]
    λj = λj + 2 r_p Ψj    (update rule)
Case 1: g active    ⇒  Z^2 = 0,   λ* = λ + 2 r_p g > 0,          Ψ = g > −λ/(2 r_p)
Case 2: g inactive  ⇒  Z^2 > 0,   λ* = λ + 2 r_p ( g + Z^2 ) = 0,  Ψ = g + Z^2 = −λ/(2 r_p) > g
Chen CL
41
Augmented Lagrange Multiplier Method: General Problems
Min:   f(x)
s.t.   gj(x) ≤ 0,   j = 1, ..., m
       hk(x) = 0,   k = 1, ..., p
⇓
A(x, λ, r_p) = f(x) + Σ_{∀j} { λj Ψj + r_p Ψj^2 } + Σ_{∀k} { λ_{k+m} hk(x) + r_p [ hk(x) ]^2 }
    Ψj = max[ gj(x),  −λj / (2 r_p) ]
    λj = λj + 2 r_p Ψj(x),           j = 1, ..., m
    λ_{k+m} = λ_{k+m} + 2 r_p hk(x),   k = 1, ..., p
Chen CL
42
Chen CL
43
Advantages of ALM Method
Relatively insensitive to value of r p
Precise satisfaction of g(x) = 0 and h(x) = 0 is possible
Acceleration is achieved by updating λ
Starting point may be either feasible or infeasible
At the optimum, λj* ≠ 0 will automatically identify the active constraint set
Chen CL
44
Generalized Reduced Gradient Method (GRG)
Min:   f(x),   x ∈ R^n
s.t.   gj(x) ≤ 0,          j = 1, ..., J
       hk(x) = 0,          k = 1, ..., K
       xi ≤ xi^u
⇓  (add slack variables x_{n+j} to the inequalities)
Min:   f(x),   x ∈ R^n
s.t.   gj(x) + x_{n+j} = 0,   j = 1, ..., J
       hk(x) = 0,             k = 1, ..., K
       xi ≤ xi^u,   x_{n+j} ≥ 0,   j = 1, ..., J
Chen CL
45
⇓
Min:   f(x),   x ∈ R^{n+J}
s.t.   hk(x) = 0,   k = 1, ..., K, ..., K + J
       xi ≤ xi^u,   i = 1 ∼ n ∼ n + J
Note: each hk(x) = 0 reduces one degree of freedom
  ⇒ one variable can be represented by the others
  ⇒ the K + J dependent variables can be represented by the other (n + J) − (K + J) = (n − K) independent variables
Chen CL
46
Divide x_{(n+J)×1} into independent/dependent (design/state, Y/Z) variables so as to satisfy hk(x) = 0:
    x = [ Y ; Z ],    Y = [ y1, ..., y_{n−K} ]ᵀ,    Z = [ z1, ..., z_{K+J} ]ᵀ
Variations of the objective and constraint functions:
    df(x)  = Σ_{i=1}^{n−K} (∂f/∂yi) dyi + Σ_{i=1}^{K+J} (∂f/∂zi) dzi   =  ∇ᵀ_Y f dY + ∇ᵀ_Z f dZ
    dhk(x) = Σ_{i=1}^{n−K} (∂hk/∂yi) dyi + Σ_{i=1}^{K+J} (∂hk/∂zi) dzi  =  ∇ᵀ_Y hk dY + ∇ᵀ_Z hk dZ
⇒  dh = [ dh1, ..., dh_K, ..., dh_{K+J} ]ᵀ = C dY + D dZ
Chen CL
47
C = [ ∂h1/∂y1       ···   ∂h1/∂y_{n−K}
      ...
      ∂hK/∂y1       ···   ∂hK/∂y_{n−K}
      ...
      ∂h_{K+J}/∂y1  ···   ∂h_{K+J}/∂y_{n−K} ]   =  [ ∇ᵀ_Y h1 ; ... ; ∇ᵀ_Y hK ; ... ; ∇ᵀ_Y h_{K+J} ]

D = [ ∂h1/∂z1       ···   ∂h1/∂z_{K+J}
      ...
      ∂hK/∂z1       ···   ∂hK/∂z_{K+J}
      ...
      ∂h_{K+J}/∂z1  ···   ∂h_{K+J}/∂z_{K+J} ]   =  [ ∇ᵀ_Z h1 ; ... ; ∇ᵀ_Z hK ; ... ; ∇ᵀ_Z h_{K+J} ]
Chen CL
48
x −→ x + dx should maintain h = 0  ⇒  dh = 0
    dh = C dY + D dZ = 0
⇒  D dZ = −C dY    OR    dZ = −D^{−1} C dY
⇒  df(x) = ∇ᵀ_Y f dY + ∇ᵀ_Z f dZ
         = [ ∇ᵀ_Y f − ∇ᵀ_Z f D^{−1} C ]_{1×(n−K)} dY
         = G_Rᵀ dY
    G_R = df(x)/dY = ∇_Y f − [ D^{−1} C ]ᵀ ∇_Z f     ((n − K) × 1)
    (Generalized Reduced Gradient)
Chen CL
49
Use G_R to generate the search direction S  ⇒  step size
dZ = −D^{−1} C dY is based on a linear approximation
  ⇒ some hk(x + dx) ≠ 0
  ⇒ fix Y, correct dZ to maintain h + dh = 0 (repeat until dZ is small)
    h(x) + dh(x) = 0,   h(x) + C dY + D dZ = 0
    dZ = −D^{−1} [ h(x) + C dY ]
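A minimal numerical sketch of these two formulas, G_R = ∇_Y f − (D^{−1}C)ᵀ ∇_Z f and the correction dZ = −D^{−1}[h(x) + C dY], is given below. Python with finite-difference gradients is an illustrative assumption, and the partition of x into (Y, Z) is supplied by the caller; the test problem is the GRG example that follows.

# GRG building blocks: reduced gradient and state-variable (Z) correction.
import numpy as np

def num_grad(fun, x, eps=1e-6):
    # Forward-difference gradient of a scalar or vector function (rows = outputs)
    x = np.asarray(x, dtype=float)
    f0 = np.atleast_1d(fun(x))
    g = np.zeros((f0.size, x.size))
    for i in range(x.size):
        xp = x.copy(); xp[i] += eps
        g[:, i] = (np.atleast_1d(fun(xp)) - f0) / eps
    return g

def reduced_gradient(f, h, x, idx_Y, idx_Z):
    gf = num_grad(f, x)[0]                       # gradient of f
    J  = num_grad(h, x)                          # Jacobian of the equality constraints
    C, D = J[:, idx_Y], J[:, idx_Z]
    GR = gf[idx_Y] - (np.linalg.solve(D, C)).T @ gf[idx_Z]
    return GR, C, D

def correct_Z(h, x, idx_Z, D, tol=1e-8, iters=20):
    # Fix Y and update Z until h(x) ~= 0 (dY = 0 in the correction formula)
    x = x.copy()
    for _ in range(iters):
        hv = np.atleast_1d(h(x))
        if np.linalg.norm(hv) < tol:
            break
        x[idx_Z] += np.linalg.solve(D, -hv)
        D = num_grad(h, x)[:, idx_Z]             # refresh D at the corrected point
    return x

# Example from the next slides: f = (x1-x2)^2 + (x2-x3)^4, h = x1(1+x2^2) + x3^4 - 3 = 0
f = lambda x: (x[0]-x[1])**2 + (x[1]-x[2])**4
h = lambda x: np.array([x[0]*(1.0 + x[1]**2) + x[2]**4 - 3.0])
x0 = np.array([-2.6, 2.0, 2.0])                  # Y = (x1, x2), Z = (x3)
GR, C, D = reduced_gradient(f, h, x0, idx_Y=[0, 1], idx_Z=[2])
print(GR)    # about (-9.2, 9.2), matching the worked example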
Chen CL
50
GRG: Algorithm
Specify design (Y) and state (Z) variables:
  select the state variables (Z) to avoid singularity of D;
  any component of x that is equal to its lower/upper bound initially is set to be a design variable;
  slack variables should be set as state variables
Compute G_R (analytically or numerically)
Test for convergence: stop if ||G_R|| < ε
Determine the search direction S:
  steepest descent: S = −G_R
  Fletcher-Reeves conjugate gradients, DFP, BFGS, ···
Chen CL
51
Find the optimum step size along S:
    dY = λ S = λ [ s1, ..., s_{n−K} ]ᵀ
    dZ = −D^{−1} C dY = −D^{−1} C (λ S) = λ [ −D^{−1} C S ] = λ T = λ [ t1, ..., t_{K+J} ]ᵀ
Chen CL
52
Distance to the side constraints, considering the design variables:
    λi = ( yi^u − yi,old ) / si   if si > 0
    λi = ( yi^ℓ − yi,old ) / si   if si < 0
    ⇒ λ1 = min_i { λi }
Distance to the side constraints, considering the state variables:
    λj = ( zj^u − zj,old ) / tj   if tj > 0
    λj = ( zj^ℓ − zj,old ) / tj   if tj < 0
    ⇒ λ2 = min_j { λj }
    0 < λ* ≤ min{ λ1, λ2 }
    x_new = [ Y_old + dY ; Z_old + dZ ] = [ Y_old + λ* S ; Z_old + λ* T ]
If x_new is infeasible  ⇒  keep Y_new and modify Z_new
Chen CL
53
GRG: Example
    min:  f(x) = (x1 − x2)^2 + (x2 − x3)^4
    s.t.  h(x) = x1 (1 + x2^2) + x3^4 − 3 = 0
          −3 ≤ x1, x2, x3 ≤ 3
Choose arbitrarily the design/state variables as
    Y = [ y1 ; y2 ] = [ x1 ; x2 ] = [ −2.6 ; 2 ],    Z = [ z1 ] = [ x3 ] = [ 2 ]
    x^(1) = (−2.6, 2, 2),    f(x^(1)) = 21.16
Chen CL
54
Compute the reduced gradient:
    ∇_Y f = [ ∂f/∂x1 ; ∂f/∂x2 ] = [ 2(x1 − x2) ; −2(x1 − x2) + 4(x2 − x3)^3 ] = [ −9.2 ; 9.2 ]
    ∇_Z f = ∂f/∂x3 = −4(x2 − x3)^3 = 0
    C = [ ∂h/∂x1  ∂h/∂x2 ] = [ 1 + x2^2   2 x1 x2 ] = [ 5   −10.4 ]
    D = ∂h/∂x3 = 4 x3^3 = 32
    D^{−1} C = (1/32) [ 5   −10.4 ] = [ 0.15625   −0.325 ]
    G_R = ∇_Y f − [ D^{−1} C ]ᵀ ∇_Z f = [ −9.2 ; 9.2 ]
    ||G_R|| = 9.2  ⇒  continue
Use the steepest descent method:   S = −G_R = [ 9.2 ; −9.2 ]
Find λ* along S:
    λ1 = min { ( 3 − (−2.6) )/9.2 ,  ( −3 − 2 )/(−9.2) } = min { 0.6087, 0.5435 } = 0.5435
Chen CL
55
    T = −D^{−1} C S = −(1/32) [ 5   −10.4 ] [ 9.2 ; −9.2 ] = −4.4275
    λ2 = ( −3 − 2 )/(−4.4275) = 1.1293
    λ = min { λ1, λ2 } = 0.5435
    x = [ Y + λ S ; Z + λ T ] = [ −2.6 ; 2 ; 2 ] + λ [ 9.2 ; −9.2 ; −4.4275 ] = [ −2.6 + 9.2 λ ; 2 − 9.2 λ ; 2 − 4.4275 λ ]
    f(λ) = [ (−2.6 + 9.2 λ) − (2 − 9.2 λ) ]^2 + [ (2 − 9.2 λ) − (2 − 4.4275 λ) ]^4
         = 518.7806 λ^4 + 338.56 λ^2 − 169.28 λ + 21.16
    df/dλ = 2075.1225 λ^3 + 677.12 λ − 169.28 = 0   ⇒   λ* = 0.22
⇒  x_new = [ −2.6 + 9.2 λ* ; 2 − 9.2 λ* ; 2 − 4.4275 λ* ] = [ −2.6 + 2.024 ; 2 − 2.024 ; 2 − 0.97405 ] = [ −0.576 ; −0.024 ; 1.02595 ]
Chen CL
56
    h(x_new) = −2.4684 ≠ 0   ⇒   x_new is not in the feasible region
    D_new = [ ∂h/∂x3 ]_new = 4 (x3^3)_new = 4 (1.02595)^3 = 4.32
    C_new = [ ∂h/∂x1  ∂h/∂x2 ]_new = [ 1 + x2^2   2 x1 x2 ]_new = [ 1   0.028 ]
    dZ = −D^{−1} [ h(x) + C dY ]_new = −(1/4.32) [ −2.4684 + (1)(2.024) + (0.028)(−2.024) ] = 0.116
⇒  Z_new = Z_old + dZ = 1.02595 + 0.116 = 1.142
    h(x_new) = −1.876 ≠ 0   ⇒   x_new is not in the feasible region  ⇒  repeat the correction
Chen CL
57
Linearization of the Constrained Problem
Min:   f(x)
s.t.   hj(x) = 0,   j = 1, ..., p
       gj(x) ≤ 0,   j = 1, ..., m
⇓
Min:   f(x^(k) + Δx) ≈ f̄ = f(x^(k)) + ∇ᵀf(x^(k)) Δx
s.t.   hj(x^(k) + Δx) ≈ h̄j = hj(x^(k)) + ∇ᵀhj(x^(k)) Δx = 0
       gj(x^(k) + Δx) ≈ ḡj = gj(x^(k)) + ∇ᵀgj(x^(k)) Δx ≤ 0
Chen CL
58
Define
    fk ≡ f(x^(k)),     ci  ≡ ∂f(x^(k))/∂xi,      ej ≡ −hj(x^(k)),    bj ≡ −gj(x^(k))
    di ≡ Δxi^(k),      nij ≡ ∂hj(x^(k))/∂xi,     aij ≡ ∂gj(x^(k))/∂xi
⇓
Min:   f̄ = Σ_{i=1}^{n} ci di = cᵀ d
s.t.   h̄j:  Σ_{i=1}^{n} nij di = ej    or   [ n1 ··· np ]ᵀ d = Nᵀ d = e
       ḡj:  Σ_{i=1}^{n} aij di ≤ bj    or   [ a1 ··· am ]ᵀ d = Aᵀ d ≤ b
Chen CL
59
Definition of the Linearized Subproblem: Example
Min:   f(x) = x1^2 + x2^2 − 3 x1 x2
s.t.   g1(x) = (1/6) x1^2 + (1/6) x2^2 − 1 ≤ 0
       g2(x) = −x1 ≤ 0
       g3(x) = −x2 ≤ 0,     x^(0) = (1, 1)
⇒  ∇f(x)  = [ 2 x1 − 3 x2 ;  2 x2 − 3 x1 ]
   ∇g1(x) = [ x1/3 ;  x2/3 ]
   ∇g2(x) = [ −1 ; 0 ],     ∇g3(x) = [ 0 ; −1 ]
Chen CL
60
Chen CL
61
⇒  f(x^(0)) = −1,    ∇f(x^(0)) = [ −1 ; −1 ]
    b1 = −g1(x^(0)) = 2/3,    b2 = −g2(x^(0)) = 1,    b3 = −g3(x^(0)) = 1
    ∇g1(x^(0)) = [ 1/3 ; 1/3 ]
⇒  A = [ 1/3   −1    0
         1/3    0   −1 ],       b = [ 2/3 ; 1 ; 1 ]
Chen CL
62
⇓
Min:   f̄ = [ −1  −1 ] [ d1 ; d2 ]
s.t.   [ 1/3   1/3
         −1     0
          0    −1 ] [ d1 ; d2 ]  ≤  [ 2/3 ; 1 ; 1 ]
Or
Min:   f̄ = −d1 − d2
s.t.   ḡ1 = (1/3) d1 + (1/3) d2 ≤ 2/3
       ḡ2 = −d1 ≤ 1
       ḡ3 = −d2 ≤ 1
Chen CL
63
Or
Min:   f̄(x) = f(x^(0)) + ∇f · (x − x^(0)) = −1 + [ −1  −1 ] [ x1 − 1 ; x2 − 1 ] = −x1 − x2 + 1
s.t.   ḡ1(x) = g1(x^(0)) + ∇g1 · (x − x^(0)) = −2/3 + [ 1/3  1/3 ] [ x1 − 1 ; x2 − 1 ] = (1/3)( x1 + x2 − 4 ) ≤ 0
       ḡ2(x) = −x1 ≤ 0
       ḡ3(x) = −x2 ≤ 0
Chen CL
64
Optimum solution: along line D–E
    d1 + d2 = 2,    (x1 − 1) + (x2 − 1) = 2,    x1 + x2 − 4 = 0
Chen CL
65
Sequential Linear Programming Algorithm: Basic Idea
Min:   f(x)
s.t.   hj(x) = 0,   j = 1, ..., p
       gj(x) ≤ 0,   j = 1, ..., m
⇓
Min:   f(x^(k) + Δx) ≈ f̄ = f(x^(k)) + ∇ᵀf(x^(k)) Δx
s.t.   hj(x^(k) + Δx) ≈ h̄j = hj(x^(k)) + ∇ᵀhj(x^(k)) Δx = 0
       gj(x^(k) + Δx) ≈ ḡj = gj(x^(k)) + ∇ᵀgj(x^(k)) Δx ≤ 0
Chen CL
66
Define
    fk ≡ f(x^(k)),     ci  ≡ ∂f(x^(k))/∂xi,      ej ≡ −hj(x^(k)),    bj ≡ −gj(x^(k))
    di ≡ Δxi^(k),      nij ≡ ∂hj(x^(k))/∂xi,     aij ≡ ∂gj(x^(k))/∂xi
⇓
Min:   f̄ = Σ_{i=1}^{n} ci di = cᵀ d
s.t.   h̄j:  Σ_{i=1}^{n} nij di = ej    or   [ n1 ··· np ]ᵀ d = Nᵀ d = e
       ḡj:  Σ_{i=1}^{n} aij di ≤ bj    or   [ a1 ··· am ]ᵀ d = Aᵀ d ≤ b
       −Δiℓ^(k) ≤ di ≤ Δiu^(k),   i = 1, ..., n    (move limits)
Chen CL
67
Move limits: without them the linearized problem may not have a bounded solution, or the changes in design may become too large
Selection of proper move limits is of critical importance because it can mean success or failure of the SLP algorithm
All bi ≥ 0
di has no sign restriction  ⇒  di = di⁺ − di⁻,   di⁺ ≥ 0,  di⁻ ≥ 0
Stopping criteria:   gj ≤ ε1, j = 1 ∼ m;    |hj| ≤ ε1, j = 1 ∼ p;    ||d|| ≤ ε2
Chen CL
68
An SLP Algorithm
Step 1: set x^(0), k = 0, and tolerances ε1, ε2
Step 2: calculate fk;  bj, j = 1 ∼ m;  ej, j = 1 ∼ p
Step 3: evaluate ci = ∂f/∂xi,  nij = ∂hj/∂xi,  aij = ∂gj/∂xi
Step 4: select proper move limits Δiℓ^(k), Δiu^(k) (some fraction of the current design)
Step 5: define the LP subproblem
Step 6: use the Simplex method to solve for d^(k)
Step 7: check for convergence; stop if satisfied
Step 8: update the design x^(k+1) = x^(k) + d^(k), set k = k + 1 and go to Step 2
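The loop in Steps 2-8 can be sketched compactly as below. This is an illustrative Python sketch: the finite-difference gradients, the use of scipy.optimize.linprog for Step 6, and the fixed 15% move limit (handled through the variable bounds) are assumptions for the demonstration.

# SLP sketch: linearize at x(k), solve the LP with move limits, update, repeat.
import numpy as np
from scipy.optimize import linprog

def grad(fun, x, eps=1e-6):
    x = np.asarray(x, dtype=float)
    return np.array([(fun(x + eps*e) - fun(x)) / eps for e in np.eye(x.size)])

def slp(f, gs, x0, move_frac=0.15, eps1=1e-3, eps2=1e-3, iters=30):
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        c = grad(f, x)                                   # Step 3: cost gradient
        A = np.array([grad(g, x) for g in gs])           # a_ij = dg_j/dx_i (rows are constraints)
        b = np.array([-g(x) for g in gs])                # b_j = -g_j(x(k))
        lim = move_frac * np.maximum(np.abs(x), 1.0)     # Step 4: move limits
        res = linprog(c, A_ub=A, b_ub=b,
                      bounds=list(zip(-lim, lim)), method="highs")   # Step 6
        if not res.success:
            break
        d = res.x
        x = x + d                                        # Step 8
        if np.linalg.norm(d) <= eps2 and max(g(x) for g in gs) <= eps1:
            break                                        # Step 7
    return x

# Example from the slides: f = x1^2 + x2^2 - 3 x1 x2, g1 = x1^2/6 + x2^2/6 - 1 <= 0, x >= 0
f  = lambda x: x[0]**2 + x[1]**2 - 3*x[0]*x[1]
g1 = lambda x: x[0]**2/6 + x[1]**2/6 - 1
g2 = lambda x: -x[0]
g3 = lambda x: -x[1]
# Iterates move toward x* = (sqrt(3), sqrt(3)); as the slides note, SLP may cycle near a non-vertex optimum.
print(slp(f, [g1, g2, g3], x0=[1.0, 1.0]))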
Chen CL
69
An SLP Example
Min:   f(x) = x1^2 + x2^2 − 3 x1 x2
s.t.   g1(x) = (1/6) x1^2 + (1/6) x2^2 − 1 ≤ 0
       g2(x) = −x1 ≤ 0
       g3(x) = −x2 ≤ 0,     x^(0) = (3, 3)
⇒  ∇f(x)  = [ 2 x1 − 3 x2 ;  2 x2 − 3 x1 ]
   ∇g1(x) = [ x1/3 ;  x2/3 ]
   ∇g2(x) = [ −1 ; 0 ],     ∇g3(x) = [ 0 ; −1 ]
Chen CL
70
f(x^(0)) = −9,    ∇f(x^(0)) = [ −3 ; −3 ] = c
b1 = −g1(x^(0)) = −2 < 0    (violated)
b2 = −g2(x^(0)) = 3          (inactive)
b3 = −g3(x^(0)) = 3          (inactive)
∇g1(x^(0)) = [ 1 ; 1 ]
⇒  A = [ 1   −1    0
         1    0   −1 ],       b = [ −2 ; 3 ; 3 ]
Chen CL
71
⇓
Min:   f̄ = [ −3  −3 ] [ d1 ; d2 ]
s.t.   [ 1    1
         −1    0
          0   −1 ] [ d1 ; d2 ]  ≤  [ −2 ; 3 ; 3 ]
Or
Min:   f̄ = −3 d1 − 3 d2
s.t.   ḡ1 = d1 + d2 ≤ −2
       ḡ2 = −d1 ≤ 3
       ḡ3 = −d2 ≤ 3
Chen CL
72
Feasible region: triangle A–B–C; the cost function is parallel to B–C (optimal solution: any point on B–C, e.g. d1 = d2 = −1)
100% move limits ⇒ −3 ≤ d1, d2 ≤ 3 (region ADEF) ⇒ d1 = d2 = −1 ⇒ x1 = 2, x2 = 2  (x1 − 3 = d1 = −1)
20% move limits ⇒ −0.6 ≤ d1, d2 ≤ 0.6 (region A1D2E1F1) ⇒ no feasible solution
The linearized constraints are satisfied, but the original nonlinear constraint g1 is still violated
Chen CL
73
SLP: Example
Min:   f(x) = x1^2 + x2^2 − 3 x1 x2
s.t.   g1(x) = (1/6) x1^2 + (1/6) x2^2 − 1 ≤ 0
       g2(x) = −x1 ≤ 0
       g3(x) = −x2 ≤ 0,     x^(0) = (1, 1)
       ε1,2 = 0.001,   15% design change (move limits)
⇓
Min:   f̄ = −d1 − d2
s.t.   ḡ1 = (1/3) d1 + (1/3) d2 ≤ 2/3
       ḡ2 = −d1 ≤ 1
       ḡ3 = −d2 ≤ 1
       −0.15 ≤ d1, d2 ≤ 0.15
Chen CL
74
Solution region: DEFG   ⇒   F:  (d1 = d2 = 0.15)
Chen CL
75
Set  d1 ≡ d1⁺ − d1⁻,   d2 ≡ d2⁺ − d2⁻
⇓
Min:   f̄ = −d1⁺ + d1⁻ − d2⁺ + d2⁻
s.t.   ḡ1 = (1/3)( d1⁺ − d1⁻ ) + (1/3)( d2⁺ − d2⁻ ) ≤ 2/3
       ḡ2 = −d1⁺ + d1⁻ ≤ 1
       ḡ3 = −d2⁺ + d2⁻ ≤ 1
       d1⁺ − d1⁻ ≤ 0.15,     −d1⁺ + d1⁻ ≤ 0.15
       d2⁺ − d2⁻ ≤ 0.15,     −d2⁺ + d2⁻ ≤ 0.15
       d1⁺, d1⁻, d2⁺, d2⁻ ≥ 0
Chen CL
76
⇓
d1⁺ = 0.15,  d1⁻ = 0,   d2⁺ = 0.15,  d2⁻ = 0
d1 = d1⁺ − d1⁻ = 0.15,     d2 = d2⁺ − d2⁻ = 0.15
x^(1) = x^(0) + d^(0) = (1.15, 1.15)
f(x^(1)) = −1.3225
g1(x^(1)) = −0.6166    (inactive)
||d|| = 0.212 > ε2   ⇒   next iteration
Chen CL
77
Observations on The SLP Algorithm
The rate of convergence and performance depend to a large extent on the selection of move limits
The method cannot be used as a black box approach for engineering design problems (selection of move limits is a trial and error process)
The method is not convergent since no descent function is defined
The method can cycle between two points if the optimum solution is not a vertex of the constraint set
Lack of robustness
Chen CL
78
Quadratic Programming Problem: Quadratic Step Size Constraint
Constrained Nonlinear Programming:
  ⇒ Linear Programming subproblem: lack of robustness
  ⇒ Quadratic Programming subproblem: with a descent function and a step size determination strategy
Quadratic step size constraint:
    −Δiℓ^(k) ≤ di ≤ Δiu^(k),   i = 1, ..., n    (move limits)
    ⇒   (1/2) ||d||^2 = (1/2) dᵀ d = (1/2) Σ (di)^2 ≤ ξ
Chen CL
79
⇓
Min:   f̄ = Σ_{i=1}^{n} ci di = cᵀ d
s.t.   h̄j = Σ_{i=1}^{n} nij di = ej    or   [ n1 ··· np ]ᵀ d = Nᵀ d = e
       ḡj = Σ_{i=1}^{n} aij di ≤ bj    or   [ a1 ··· am ]ᵀ d = Aᵀ d ≤ b
       (1/2) Σ (di)^2 ≤ ξ    (move limits)
Chen CL
80
Quadratic Programming (QP) Subproblem
Min:   f̄ = Σ_{i=1}^{n} ci di = cᵀ d
s.t.   h̄j = Σ_{i=1}^{n} nij di = ej    or   Nᵀ d = e
       ḡj = Σ_{i=1}^{n} aij di ≤ bj    or   Aᵀ d ≤ b
       (1/2) Σ (di)^2 ≤ ξ    (check KTC, λ = 1 for this constraint)
⇓
Min:   f̄ = cᵀ d + 0.5 dᵀ d
s.t.   Nᵀ d = e
       Aᵀ d ≤ b              (0.5 dᵀ d ≤ ξ)
a convex programming problem  ⇒  unique solution
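As a quick way to check a hand-derived QP subproblem solution (the slides solve these via the KKT conditions or the two-phase Simplex procedure described later), the small QP can also be handed to a general solver. The sketch below is only an illustration, using scipy.optimize.minimize rather than the slides' procedure; it reproduces the direction d^(0) = (1, 1) found in the CSD example later.

# Solve the QP subproblem  min c'd + 0.5 d'd  s.t.  N'd = e,  A'd <= b
import numpy as np
from scipy.optimize import minimize

def solve_qp_subproblem(c, A=None, b=None, N=None, e=None):
    c = np.asarray(c, dtype=float)
    n = c.size
    cons = []
    if A is not None:
        A, b = np.asarray(A, float), np.asarray(b, float)
        cons.append({"type": "ineq", "fun": lambda d: b - A.T @ d})   # A'd <= b
    if N is not None:
        N, e = np.asarray(N, float), np.asarray(e, float)
        cons.append({"type": "eq", "fun": lambda d: N.T @ d - e})     # N'd = e
    obj = lambda d: c @ d + 0.5 * d @ d
    return minimize(obj, np.zeros(n), constraints=cons, method="SLSQP").x

# QP subproblem from iteration 1 of the CSD example at x = (1, 1):
#   min -d1 - d2 + 0.5(d1^2 + d2^2)
#   s.t. d1/3 + d2/3 <= 2/3,  -d1 <= 1,  -d2 <= 1
c = [-1.0, -1.0]
A = np.array([[1/3, -1.0,  0.0],
              [1/3,  0.0, -1.0]])      # columns are the constraint gradients
b = [2/3, 1.0, 1.0]
print(solve_qp_subproblem(c, A=A, b=b))   # about (1, 1), as found in that example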
Chen CL
81
Quadratic Programming (QP) Subproblem: Example
Min:   f(x) = 2 x1^3 + 15 x2^2 − 8 x1 x2 − 4 x1
s.t.   h(x) = x1^2 + x1 x2 + 1 = 0
       g(x) = x1 − (1/4) x2^2 − 1 ≤ 0
       x^(0) = (1, 1)
Opt:   x* = A: (1, −2)  or  B: (−1, 2),     f(x*) = 74 (A),  78 (B)
Chen CL
82
c = ∇f = ( 6 x1^2 − 8 x2 − 4,  30 x2 − 8 x1 ) |_{x^(0)} = ( −6, 22 )
    ∇h = ( 2 x1 + x2,  x1 ) |_{x^(0)} = ( 3, 1 )
    ∇g = ( 1,  −x2/2 ) |_{x^(0)} = ( 1, −0.5 )
    f(1, 1) = 5,    h(1, 1) = 3 ≠ 0,    g(1, 1) = −0.25 < 0
Chen CL
83
LP subproblem:
Min:   f̄ = −6 d1 + 22 d2
s.t.   h̄ = 3 d1 + d2 = −3
       ḡ = d1 − 0.5 d2 ≤ 0.25
50% move limits:  −0.5 ≤ d1, d2 ≤ 0.5  ⇒  infeasible (region HIJK)
100% move limits: −1 ≤ d1, d2 ≤ 1  ⇒  L:  d1 = −2/3,  d2 = −1,   f̄ = −18
Chen CL
84
QP subproblem:
Min:   f̄ = −6 d1 + 22 d2 + 0.5 ( d1^2 + d2^2 )
s.t.   h̄ = 3 d1 + d2 = −3
       ḡ = d1 − 0.5 d2 ≤ 0.25
⇒  G:  d1 = −0.5,  d2 = −1.5,   f̄ = −28.75
Chen CL
85
Solution of QP Problems
Min:   q(x) = cᵀ x + (1/2) xᵀ H x     (H = I for the QP subproblem)
s.t.   Aᵀ x ≤ b_{m×1}
       Nᵀ x = e_{p×1}
       x ≥ 0_{n×1}
⇒  Aᵀ x + s = b_{m×1}    (slack variables s ≥ 0)
    −x ≤ 0
⇒  L = cᵀ x + (1/2) xᵀ H x + uᵀ ( Aᵀ x + s − b ) + vᵀ ( Nᵀ x − e ) − ζᵀ x
Chen CL
86
KTC:
    c + H x + A u − ζ + N ( y − z ) = 0,     v = y − z,   y, z ≥ 0
    Aᵀ x + s − b = 0
    Nᵀ x − e = 0
    uj sj = 0,    ζi xi = 0
    sj, uj, ζi ≥ 0,    j = 1 ∼ m,   i = 1 ∼ n
Chen CL
87
⇓
    [  H    A        −I_{n×n}   0_{n×m}   N        −N       ]  [ x ]      [ −c ]
    [  Aᵀ   0_{m×m}   0_{m×n}   I_{m×m}   0_{m×p}   0_{m×p}  ]  [ u ]   =  [  b ]
    [  Nᵀ   0_{p×m}   0_{p×n}   0_{p×m}   0_{p×p}   0_{p×p}  ]  [ ζ ]      [  e ]
                                                                [ s ]
                                                                [ y ]
                                                                [ z ]
⇒  B_{(n+m+p)×2(n+m+p)} X_{2(n+m+p)×1} = D_{(n+m+p)×1}
    xi ζi = 0,  uj sj = 0   ⇒   Xi X_{n+m+i} = 0,   i = 1 ∼ n + m
    Xi ≥ 0,   i = 1 ∼ 2(n + m + p)
Chen CL
88
Two-Phase Simplex Method to Solve the QP Problem
    B X + Y = D,     Y: artificial variables
    w = Σ_{i=1}^{n+m+p} Yi = Σ_{i=1}^{n+m+p} ( Di − Σ_{j=1}^{2(n+m+p)} Bij Xj )
      = w0 + Σ_{j=1}^{2(n+m+p)} Cj Xj
Note: Xi and X_{n+m+i} must not be basic variables simultaneously
Chen CL
89
Procedures:
Step 1: find B
Step 2: define D; rearrange so that Di ≥ 0
Step 3: calculate w0 and the Cj's
Step 4: complete Phase I
Step 5: recover the optimum values of the original QP variables x, the Lagrange multipliers u, v, ζ, and the slack variables s
Chen CL
90
Solution of a QP Problem: Example
Min:   f(x) = (x1 − 3)^2 + (x2 − 3)^2 = x1^2 − 6 x1 + x2^2 − 6 x2 (+18)
s.t.   g(x) = x1 + x2 ≤ 4
       h(x) = x1 − 3 x2 = 1
       x1, x2 ≥ 0
⇓
q(x) = [ −6  −6 ] [ x1 ; x2 ] + 0.5 [ x1  x2 ] [ 2  0 ; 0  2 ] [ x1 ; x2 ]
c = [ −6 ; −6 ],   H = [ 2  0 ; 0  2 ],   A = [ 1 ; 1 ],   b = [ 4 ],   N = [ 1 ; −3 ],   e = [ 1 ]
Chen CL
91
B = [ 2    0    1   −1    0    0    1   −1
      0    2    1    0   −1    0   −3    3
      1    1    0    0    0    1    0    0
      1   −3    0    0    0    0    0    0 ],        D = [ 6 ; 6 ; 4 ; 1 ]
X = [ x1  x2  u1  ζ1  ζ2  s1  y1  z1 ]ᵀ
Complementarity:   X1 X4 = 0,   X2 X5 = 0,   X3 X6 = 0
⇒  X1 = 13/4,   X2 = 3/4,   X3 = 3/4,   X8 = 5/4,    X4 = X5 = X6 = X7 = 0
⇒  x1 = 13/4,   x2 = 3/4,   u1 = 3/4,   ζ1 = 0,   ζ2 = 0
    s1 = 0,   y1 = 0,   z1 = 5/4,   v1 = y1 − z1 = −5/4
    f(13/4, 3/4) = 41/8
Chen CL
Basic Y 1 Y 2 Y 3 Y 4 w Y 1 Y 2 Y 3 X 1 w X 2 Y 2 Y 3 X 1 w
−
−
−
92
−w
0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
X 1 2 0 1 1 4 0 0 0 1 0 0 0 0 1 0
−
X 2 X 3 X 4 X 5 X 6 X 7 X 8 Y 1 Y 2 Y 3 Y 4 0 1 1 0 0 1 1 1 0 0 0 2 1 0 1 0 3 3 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 3 0 0 0 0 0 0 0 0 0 1 0 2 1 1 1 2 2 0 0 0 0 6 1 1 0 0 1 1 1 0 0 2 2 1 0 1 0 3 3 0 1 0 0 4 0 0 0 1 0 0 0 0 1 1 3 0 0 0 0 0 0 0 0 0 1 12 2 1 1 1 2 2 0 0 0 4 1 1 1 1 1 16 16 0 0 0 0 6 6 6 3 2 1 10 10 1 2 0 3 3 1 0 1 0 3 3 3 3 2 2 2 1 0 23 23 0 1 0 1 3 3 3 3 1 1 1 0 12 12 0 0 0 0 0 2 2 2 0 0 1 1 1 4 4 2 0 0 0
−
−
−
− − −
−
−
−
− −
−
−
−
−
−
−
−
− −
−
− − −
− − − −
− −
− −
−
D 6 6 4 1 17 4 6 3 1 13
−
−
2 3 14 3 1 3
3 5
−
Chen CL
93
Basic X 2 Y 2 X 8 X 1 w
−w
X 2 X 3 X 8 X 1 w
0 0 0 0 1
−
−
0 0 0 0 1
X 1 X 2 X 3 X 4 X 5 X 6 X 7 X 8 Y 1 Y 2 Y 3 Y 4 D 0 1 0 0 0 14 0 0 0 0 14 14 34 0 0 4 3 1 5 0 0 3 1 5 1 3 0 0 1 1 0 32 1 1 1 0 32 12 12 1 0 0 0 0 34 0 0 0 0 34 14 13 4 0 0 4 3 1 5 0 0 2 0 6 2 3
− − −
−
0 0 0 1 0
1 0 0 0 0
−
0 1 0 0 0
0
0
−
1 4 5 4 1 4 3 4
0 0 1 0 0
−34 −14 − 1 1 − − 4 4 0 0
0 0
0
− − −
−
0 0 1 0 0
−
0
0
3 4 1 4
1 4 1 4
0 1
0 1
−
−14 − −14
−
1 4 5 4 1 4 3 4
1 4 1 4
3 4 3 4 5 4 13 4
1
1
0
Chen CL
94
Sequential Quadratic Programming (SQP)
Constrained Steepest Descent Method
Min:   f̄ = cᵀ x + (1/2) xᵀ x   ⇒   d = −c    (steepest descent direction if no constraint)
⇓  the constrained QP subproblem is solved sequentially:
Min:   f̄ = cᵀ d + 0.5 dᵀ d
s.t.   Nᵀ d = e
       Aᵀ d ≤ b
⇒  d:  modify −c to satisfy the constraints
Q:  Descent function?    Step size?
Chen CL
95
CSD: Pshenichny's Descent Function
    Φ(x) = f(x) + R V(x);      Φk = Φ(x^(k)) = f(x^(k)) + R V(x^(k)) = fk + R Vk
    Vk = max { 0;  |h1|, ..., |hp|;  g1, ..., gm }  ≥  0    (maximum constraint violation)
    R  ≥  rk = Σ_{i=1}^{p} |vi^(k)| + Σ_{i=1}^{m} ui^(k)
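A direct transcription of Φ and V is shown below (illustrative Python; the function signatures are assumptions), using the numbers of the descent-function example that follows.

# Pshenichny's descent function: Phi(x) = f(x) + R * V(x),
# V(x) = max{0; |h_1|,...,|h_p|; g_1,...,g_m} is the maximum constraint violation.
def max_violation(x, gs, hs):
    vals = [0.0]
    vals += [abs(h(x)) for h in hs]
    vals += [g(x) for g in gs]
    return max(vals)

def descent_function(x, f, gs, hs, R):
    return f(x) + R * max_violation(x, gs, hs)

# Data from the example: f = x1^2 + 320 x1 x2, R = 10000, x(0) = (40, 0.5)
f  = lambda x: x[0]**2 + 320*x[0]*x[1]
g1 = lambda x: x[0]/(60*x[1]) - 1
g2 = lambda x: 1 - x[0]*(x[0] - x[1])/3600
g3 = lambda x: -x[0]
g4 = lambda x: -x[1]
x0 = (40.0, 0.5)
print(descent_function(x0, f, [g1, g2, g3, g4], [], R=10000))   # about 13611, as in the example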
Chen CL
96
Note: the Lagrange function is a lower bound on the descent function
Min:   f(x)
s.t.   hi(x) = 0,   i = 1, ..., p
       gi(x) ≤ 0,   i = 1, ..., m
⇒  L(x) = f(x) + Σ_{i=1}^{p} vi hi + Σ_{i=1}^{m} ui gi
        ≤ f(x) + Σ_{i=1}^{p} |vi| |hi| + Σ_{i=1}^{m} ui gi
        ≤ f(x) + [ Σ_{i=1}^{p} |vi| + Σ_{i=1}^{m} ui ] · max{ 0; |h1|, ..., |hp|; g1, ..., gm }    ( = f(x) + r V )
        ≤ f(x) + R V = Φ(x),      R = max{ R0, r }
Chen CL
97
CSD: Descent Function Example Min: s.t.
f (x) = x21 + 320x1x2 g1(x) =
x1 60x2
g2(x) = 1
−x1, −x2 ≤
−1≤0
x −x ) − x (3600 ≤0
0,
1
1
2
x(0) = (40, 0.5),
R = 10000
f (x(0)) = (40)2 + 320(40)(0.5) = 8000 g1(x(0)) = g2(x(0)) = g3 = V 0 =
40 1 = 0.333 > 0, 60(0.5) .5) 1 40(40−0 = 0.5611 > 3600
−
(violation)
− 0, (violation) −40 < 0, g4 = −0.5 < 0, (inactive) max {0; 0.333, 0.5611, −40, −0.5} = 0.5611
Φ0 = f 0 + RV 0
= 8000 + (10000)(0.5611) = 13611
Chen CL
98
CSD: Step Size Determination
Use inexact step size (for efficiency, exact line search is not necessary)
Trial step size tj : j = 0
t0 =
j = 1
t1 =
j = 2 ..
t2 =
1 0 2 1 1 2 1 2 2
= 1 = =
1 2 1 4
Chen CL
99
Determine the step size as αk = tj, with j the smallest integer satisfying the descent condition:
    x^(k+1,j) = x^(k) + tj d^(k)
    Φ_{k+1,j} = f_{k+1,j} + R V_{k+1,j}  ≤  Φk − tj βk
    βk = γ ||d^(k)||^2,     0 < γ ≤ 1
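The rule αk = tj = (1/2)^j with the descent condition Φ_{k+1,j} ≤ Φk − tj βk can be written as a short loop (illustrative Python; phi stands for the descent function from the earlier sketch and the argument names are assumptions):

# Inexact step size for CSD: try t_j = (1/2)^j until the descent condition holds.
def csd_step_size(phi, x, d, beta, max_halvings=20):
    # phi(x): descent function Phi = f + R*V evaluated at a point
    # beta = gamma * ||d||^2 with 0 < gamma <= 1
    phi_k = phi(x)
    t = 1.0
    x_trial = list(x)
    for _ in range(max_halvings):
        x_trial = [xi + t*di for xi, di in zip(x, d)]
        if phi(x_trial) <= phi_k - t*beta:     # descent condition satisfied
            return t, x_trial
        t *= 0.5                               # t_j = (1/2)^j
    return t, x_trial

In the worked example that follows, this loop rejects t0 = 1 (27912 + 328 > 21624) and accepts t1 = 0.5.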
Chen CL
100
Effect of the parameter γ on step size determination: a larger γ tends to reduce the step size required to satisfy the descent condition
Chen CL
101
CSD: Step Size Example
Min:   f(x) = x1^2 + 320 x1 x2
s.t.   g1(x) = x1/(60 x2) − 1 ≤ 0
       g2(x) = 1 − x1 ( x1 − x2 )/3600 ≤ 0
       g3(x) = −x1 ≤ 0
       g4(x) = −x2 ≤ 0,     x^(0) = (40, 0.5)
⇒  d^(0) = (25.6, 0.45),    u = (4880, 19400, 0, 0),    γ = 0.5
    R = Σ ui = 4880 + 19400 + 0 + 0 = 24280
    β = 0.5 ( 25.6^2 + 0.45^2 ) = 328
Chen CL
102
f(x^(0)) = (40)^2 + 320 (40)(0.5) = 8000
g1(x^(0)) = 40/( 60 × 0.5 ) − 1 = 0.333 > 0    (violation)
g2(x^(0)) = 1 − 40 ( 40 − 0.5 )/3600 = 0.5611 > 0    (violation)
g3(x^(0)) = −40 < 0    (inactive)
g4(x^(0)) = −0.5 < 0   (inactive)
V0 = max { 0; 0.333, 0.5611, −40, −0.5 } = 0.5611
Φ0 = f0 + R V0 = 8000 + (24280)(0.5611) = 21624
Chen CL
103
Initial trial:   t0 = 1   (j = 0)
    x1^(1,0) = x1^(0) + t0 d1^(0) = 40 + (1)(25.6) = 65.6
    x2^(1,0) = x2^(0) + t0 d2^(0) = 0.5 + (1)(0.45) = 0.95
    f_{1,0} = f(x^(1,0)) = (65.6)^2 + 320 (65.6)(0.95) = 24246
    g1(x^(1,0)) = 65.6/( 60 × 0.95 ) − 1 = 0.151 > 0    (violation)
    g2(x^(1,0)) = 1 − 65.6 ( 65.6 − 0.95 )/3600 = −0.1781 < 0    (inactive)
    g3(x^(1,0)) = −65.6 < 0    (inactive)
    g4(x^(1,0)) = −0.95 < 0   (inactive)
    V_{1,0} = max { 0; 0.151, −0.1781, −65.6, −0.95 } = 0.151
    Φ_{1,0} = f_{1,0} + R V_{1,0} = 24246 + (24280)(0.151) = 27912
    LHS = 27912 + 328 = 28240 > 21624 = RHS!   ⇒   descent condition not satisfied
Chen CL
104
Second trial:   t1 = 0.5   (j = 1)
    x1^(1,1) = x1^(0) + t1 d1^(0) = 40 + (0.5)(25.6) = 52.8
    x2^(1,1) = x2^(0) + t1 d2^(0) = 0.5 + (0.5)(0.45) = 0.725
    f_{1,1} = f(x^(1,1)) = (52.8)^2 + 320 (52.8)(0.725) = 15037
    g1(x^(1,1)) = 52.8/( 60 × 0.725 ) − 1 = 0.2138 > 0    (violation)
    g2(x^(1,1)) = 1 − 52.8 ( 52.8 − 0.725 )/3600 = 0.2362 > 0    (violation)
    g3(x^(1,1)) = −52.8 < 0    (inactive)
    g4(x^(1,1)) = −0.725 < 0   (inactive)
    V_{1,1} = max { 0; 0.2138, 0.2362, −52.8, −0.725 } = 0.2362
    Φ_{1,1} = f_{1,1} + R V_{1,1} = 15037 + (24280)(0.2362) = 20772
    LHS = 20772 + (0.5)(328) = 20936 < 21624 = RHS   ⇒   accept the step size
Chen CL
105
SQP: CSD Algorithm
Step 1: set k = 0; choose x^(0), R0 (= 1), 0 < γ (= 0.2) < 1, and tolerances ε1, ε2
Step 2: evaluate f(x^(k)), hj(x^(k)), gj(x^(k));  Vk;  ∇f(x^(k)), ∇hj(x^(k)), ∇gj(x^(k))
Step 3: define and solve the QP subproblem  ⇒  d^(k), v^(k), u^(k)
Step 4: stop if ||d^(k)|| ≤ ε2 and Vk ≤ ε1
Step 5: calculate rk (sum of the Lagrange multipliers); set R = max{ Rk, rk }
Step 6: set x^(k+1) = x^(k) + αk d^(k)  (αk from the inexact minimization of Φ(x) = f(x) + R V(x) along d^(k))
Chen CL
106
Step 7: save Rk+1 = R, set k = k + 1 and go to Step 2
The CSD algorithm, along with the foregoing step size determination procedure, is convergent provided the second derivatives of all the functions are piecewise continuous (Lipschitz condition) and the set of design points x^(k) is bounded as follows:  Φ(x^(k)) ≤ Φ(x^(0))
Chen CL
107
SQP: CSD Example
Min:   f(x) = x1^2 + x2^2 − 3 x1 x2
s.t.   g1(x) = (1/6) x1^2 + (1/6) x2^2 − 1 ≤ 0
       g2(x) = −x1 ≤ 0
       g3(x) = −x2 ≤ 0,     x^(0) = (1, 1)
       R0 = 10,  γ = 0.5,  ε1 = ε2 = 0.001
x* = (√3, √3),    u* = (3, 0, 0),    f* = −3
Chen CL
108
Iteration 1 (k = 0)
Step 1: k = 0;  x^(0) = (1, 1),  R0 = 10,  γ = 0.5;  ε1 = ε2 = 0.001
Step 2: f(x^(0)) = −1          g1(x^(0)) = −2/3
        g2(x^(0)) = −1         g3(x^(0)) = −1    (all inactive)
        ∇f(x^(0)) = (−1, −1)    ∇g1(x^(0)) = (1/3, 1/3)
        ∇g2(x^(0)) = (−1, 0)    ∇g3(x^(0)) = (0, −1)
        V0 = 0
Chen CL
linearized constraints and linearized constraint set
109
Chen CL
110
Step 3: define and solve the QP subproblem  ⇒  d^(k), v^(k), u^(k)
Min:   f̄ = ( −d1 − d2 ) + 0.5 ( d1^2 + d2^2 )
s.t.   ḡ1 = (1/3) d1 + (1/3) d2 ≤ 2/3
       ḡ2 = −d1 ≤ 1,     ḡ3 = −d2 ≤ 1
L = ( −d1 − d2 ) + 0.5 ( d1^2 + d2^2 ) + u1 [ (1/3)( d1 + d2 − 2 ) + s1^2 ] + u2 ( −d1 − 1 + s2^2 ) + u3 ( −d2 − 1 + s3^2 )
    ∂L/∂d1 = −1 + d1 + (1/3) u1 − u2 = 0
    ∂L/∂d2 = −1 + d2 + (1/3) u1 − u3 = 0
    (1/3)( d1 + d2 − 2 ) + s1^2 = 0
    −d1 − 1 + s2^2 = 0
    −d2 − 1 + s3^2 = 0
    ui si = 0,   ui ≥ 0,   i = 1, 2, 3
⇒  d^(0) = (1, 1)  (point D),    u^(0) = (0, 0, 0),    f̄ = −1
Chen CL
111
Step 4: ||d^(0)|| = √2 ≥ ε2, continue
Step 5: r0 = Σ ui = 0,    R = max{ R0, r0 } = max{ 10, 0 } = 10
Chen CL
112
Step 6:
    Φ0 = f0 + R V0 = −1 + (10)(0) = −1
    β0 = γ ||d^(0)||^2 = 0.5 (1 + 1) = 1
    x^(1,0) = x^(0) + t0 d^(0) = (2, 2)     (t0 = 1)
    f_{1,0} = f(2, 2) = −4
    V_{1,0} = V(2, 2) = max { 0; 1/3, −2, −2 } = 1/3
    Φ_{1,0} = f_{1,0} + R V_{1,0} = −4 + (10)(1/3) = −2/3
    Φ0 − t0 β0 = −1 − 1 = −2 < Φ_{1,0}   ⇒   t1 = 0.5
    x^(1,1) = x^(0) + t1 d^(0) = (1.5, 1.5)     (t1 = 0.5)
    f_{1,1} = f(1.5, 1.5) = −2.25
    V_{1,1} = V(1.5, 1.5) = max { 0; −1/4, −1.5, −1.5 } = 0
    Φ_{1,1} = f_{1,1} + R V_{1,1} = −2.25 + (10)(0) = −2.25
    Φ0 − t1 β0 = −1 − 0.5 = −1.5 > Φ_{1,1}   ⇒   α0 = t1 = 0.5,    x^(1) = (1.5, 1.5)
Chen CL
113
Step 7: R1 = R0 = 10, k = 1, go to Step 2
Iteration 2 (k = 1)
Step 2: f(x^(1)) = −2.25         g1(x^(1)) = −0.25
        g2(x^(1)) = −1.5         g3(x^(1)) = −1.5    (inactive)
        ∇f(x^(1)) = (−1.5, −1.5)  ∇g1(x^(1)) = (0.5, 0.5)
        ∇g2(x^(1)) = (−1, 0)      ∇g3(x^(1)) = (0, −1)
        V1 = 0
Chen CL
114
Step 3: define and solve the QP subproblem  ⇒  d^(k), v^(k), u^(k)
Min:   f̄ = ( −1.5 d1 − 1.5 d2 ) + 0.5 ( d1^2 + d2^2 )
s.t.   ḡ1 = 0.5 d1 + 0.5 d2 ≤ 0.25
       ḡ2 = −d1 ≤ 1.5,     ḡ3 = −d2 ≤ 1.5
⇒  d^(1) = (0.25, 0.25)  (point D),    u^(1) = (2.5, 0, 0)
Step 4: ||d^(1)|| = 0.3535 ≥ ε2, continue
Step 5: r1 = Σ ui = 2.5,    R = max{ R1, r1 } = max{ 10, 2.5 } = 10
Chen CL
115
Step 6:
    Φ1 = f1 + R V1 = −2.25 + (10)(0) = −2.25
    β1 = γ ||d^(1)||^2 = 0.5 (0.125) = 0.0625
    x^(2,0) = x^(1) + t0 d^(1) = (1.75, 1.75)     (t0 = 1)
    f_{2,0} = f(1.75, 1.75) = −3.0625
    V_{2,0} = V(1.75, 1.75) = max { 0; 0.0208, −1.75, −1.75 } = 0.0208
    Φ_{2,0} = f_{2,0} + R V_{2,0} = −3.0625 + (10)(0.0208) = −2.8541
    Φ1 − t0 β1 = −2.25 − (1)(0.0625) = −2.3125 > Φ_{2,0}   ⇒   α1 = t0 = 1.0,    x^(2) = (1.75, 1.75)
Step 7: R2 = R1 = 10, k = 2, go to Step 2
Chen CL
116
SQP: Effect of γ in the CSD Method
Same as the last example, case 1: γ = 0.5 ⇒ 0.01
Steps 1 ∼ 5: same as before
Chen CL
117
Step 6, Iteration 1:
    Φ0 = f0 + R V0 = −1 + (10)(0) = −1
    β0 = γ ||d^(0)||^2 = 0.01 (1 + 1) = 0.02
    x^(1,0) = x^(0) + t0 d^(0) = (2, 2)     (t0 = 1)
    f_{1,0} = f(2, 2) = −4
    V_{1,0} = V(2, 2) = max { 0; 1/3, −2, −2 } = 1/3
    Φ_{1,0} = −4 + (10)(1/3) = −2/3
    Φ0 − t0 β0 = −1 − (1)(0.02) = −1.02 < Φ_{1,0}   ⇒   t1 = 0.5
    x^(1,1) = x^(0) + t1 d^(0) = (1.5, 1.5)     (t1 = 0.5)
    f_{1,1} = f(1.5, 1.5) = −2.25
    V_{1,1} = V(1.5, 1.5) = max { 0; −1/4, −1.5, −1.5 } = 0
    Φ_{1,1} = −2.25 + (10)(0) = −2.25
    Φ0 − t1 β0 = −1 − (0.5)(0.02) = −1.01 > Φ_{1,1}   ⇒   α0 = t1 = 0.5,    x^(1) = (1.5, 1.5)
Chen CL
118
Step 6, Iteration 2:
    Φ1 = f1 + R V1 = −2.25 + (10)(0) = −2.25
    β1 = γ ||d^(1)||^2 = 0.02 (0.125) = 0.0025
    x^(2,0) = x^(1) + t0 d^(1) = (1.75, 1.75)     (t0 = 1)
    f_{2,0} = f(1.75, 1.75) = −3.0625
    V_{2,0} = V(1.75, 1.75) = max { 0; 0.0208, −1.75, −1.75 } = 0.0208
    Φ_{2,0} = −3.0625 + (10)(0.0208) = −2.8541
    Φ1 − t0 β1 = −2.25 − (1)(0.0025) = −2.2525 > Φ_{2,0}   ⇒   α1 = t0 = 1.0,    x^(2) = (1.75, 1.75)
A smaller value of γ has no effect on the first two iterations
Chen CL
119
Same as the last example, case 2: γ = 0.5 ⇒ 0.9
Iteration 2: step size t1 = 0.5  ⇒  x^(2) = (1.625, 1.625),   f2 = −2.641,   g1 = −0.1198,   V1 = 0
A larger γ value results in a smaller step size, and the new design point remains strictly feasible
Chen CL
120
SQP: Effect of R in the CSD Method
Same as the last example, case 1: R0 = 10 ⇒ 1
Iteration 1
Steps 1 ∼ 5: same as before
Step 6:
    Φ0 = f0 + R V0 = −1 + (1)(0) = −1
    β0 = γ ||d^(0)||^2 = 0.5 (1 + 1) = 1
    x^(1,0) = x^(0) + t0 d^(0) = (2, 2)     (t0 = 1)
    f_{1,0} = f(2, 2) = −4
    V_{1,0} = V(2, 2) = max { 0; 1/3, −2, −2 } = 1/3
    Φ_{1,0} = −4 + (1)(1/3) = −11/3
    Φ0 − t0 β0 = −1 − (1)(1) = −2 > Φ_{1,0}   ⇒   α0 = t0 = 1,    x^(1) = (2, 2)
Chen CL
121
Iteration 2
Step 2: f(x^(1)) = −4           g1(x^(1)) = 1/3    (violated)
        g2(x^(1)) = −2          g3(x^(1)) = −2     (inactive)
        ∇f(x^(1)) = (−2, −2)     ∇g1(x^(1)) = (2/3, 2/3)
        ∇g2(x^(1)) = (−1, 0)     ∇g3(x^(1)) = (0, −1)
        V1 = 1/3
Step 3: define and solve the QP subproblem  ⇒  d^(k), v^(k), u^(k)
Min:   f̄ = ( −2 d1 − 2 d2 ) + 0.5 ( d1^2 + d2^2 )
s.t.   ḡ1 = (2/3) d1 + (2/3) d2 ≤ −1/3
       ḡ2 = −d1 ≤ 2,     ḡ3 = −d2 ≤ 2
⇒  d^(1) = (−0.25, −0.25)  (point D),    u^(1) = (27/8, 0, 0)
Chen CL
122
Step 4: ||d^(1)|| = 0.3535 ≥ ε2, continue
Step 5: r1 = Σ ui = 27/8,    R = max{ R1, r1 } = max{ 1, 27/8 } = 27/8
Step 6:
    Φ1 = f1 + R V1 = −4 + (27/8)(1/3) = −2.875
    β1 = γ ||d^(1)||^2 = 0.5 (0.3535)^2 = 0.0625
    x^(2,0) = x^(1) + t0 d^(1) = (1.75, 1.75)     (t0 = 1)
    f_{2,0} = f(1.75, 1.75) = −3.0625
    V_{2,0} = V(1.75, 1.75) = max { 0; 0.0208, −1.75, −1.75 } = 0.0208
    Φ_{2,0} = f_{2,0} + R V_{2,0} = −3.0625 + (27/8)(0.0208) = −2.9923
    Φ1 − t0 β1 = −2.875 − (1)(0.0625) = −2.9375 > Φ_{2,0}   ⇒   α1 = t0 = 1.0,    x^(2) = (1.75, 1.75)
Step 7: R2 = R1 = 27/8, k = 2, go to Step 2
A smaller value of R gives a larger step size in the first iteration, but the result at the end of iteration 2 is the same as before
Chen CL
123
Observations on the CSD Algorithm
CSD algorithm is a 1st-order method for constrained optimization
CSD can treat equality as well as inequality constraints
Golden section search may be used to find the step size by minimizing the descent function instead of trying to satisfy the descent function (not suggested as it is inefficient)
Rate of convergence of CSD can be improved by including higher-order information about the problem functions in the QP subproblem
Here the step size is not allowed to be greater than one; in practice, the step size can be larger than one
Numerical uncertainties in selection of parameters γ, R0
Starting point can affect performance of the algorithm
Chen CL
124
SQP: Constrained Quasi-Newton Methods
To introduce curvature information for the Lagrange function into the quadratic cost function ⇒ a new QP subproblem
Min:   f(x)
s.t.   hi(x) = 0,   i = 1 ∼ p
⇒  L(x, v) = f(x) + Σ_{i=1}^{p} vi hi(x) = f(x) + vᵀ h(x)
KTC:
    ∇L(x, v) = ∇f(x) + Σ_{i=1}^{p} vi ∇hi(x) = ∇f(x) + N v = 0
    h(x) = 0
Note:  N = [ ∂h1/∂x1   ···   ∂hp/∂x1
             ...
             ∂h1/∂xn   ···   ∂hp/∂xn ]_{n×p}
Chen CL
125
    F(y) = [ ∇L(x, v) ; h(x) ] = 0,      y = [ x ; v ]
Newton:   ∇Fᵀ(y^(k)) Δy^(k) = −F(y^(k))
⇒  [ ∇²L   N ]  [ Δx^(k) ]      [ −∇L ]
    [ Nᵀ    0 ]  [ Δv^(k) ]  =   [ −h  ]      (evaluated at iteration k)
⇒  ∇²L Δx^(k) + N Δv^(k) = −∇L = −∇f(x^(k)) − N v^(k)
    with  v^(k+1) = v^(k) + Δv^(k):
    ∇²L Δx^(k) + N v^(k+1) = −∇f(x^(k))
    Nᵀ Δx^(k) = −h
⇒  [ ∇²L   N ]  [ Δx^(k)   ]      [ −∇f ]
    [ Nᵀ    0 ]  [ v^(k+1) ]  =   [ −h  ]
solve these equations to obtain Δx^(k) and v^(k+1)
Chen CL
126
Note: this is the same as solving the KTC of the following constrained (quadratic) problem:
Min:   ∇fᵀ Δx + 0.5 Δxᵀ ∇²L Δx
s.t.   h + Nᵀ Δx = 0
⇒  L̄ = ∇fᵀ Δx + 0.5 Δxᵀ ∇²L Δx + vᵀ ( h + Nᵀ Δx )
KTC:
    ∇L̄ = ∇f + ∇²L Δx + N v = 0
    h + Nᵀ Δx = 0
⇒  [ ∇²L   N ]  [ Δx ]      [ −∇f ]
    [ Nᵀ    0 ]  [ v  ]  =   [ −h  ]
Chen CL
127
Now the solution Δx should be treated as a search direction d, with the step size determined by minimizing an appropriate descent function, to obtain a convergent algorithm ⇒ the new QP subproblem:
Min:   f̄ = cᵀ d + (1/2) dᵀ H d
s.t.   h̄ = Nᵀ d = e
       ḡ = Aᵀ d ≤ b
Chen CL
128
Quasi-Newton Hessian Approximation
    H^(k+1) = H^(k) + D^(k) − E^(k)
    s^(k) = αk d^(k)    (change in design)
    z^(k) = H^(k) s^(k)
    y^(k) = ∇L(x^(k+1), u^(k), v^(k)) − ∇L(x^(k), u^(k), v^(k))
    ξ1 = s^(k) · y^(k),     ξ2 = s^(k) · z^(k)
    θ = 1  if ξ1 ≥ 0.2 ξ2;   otherwise  θ = 0.8 ξ2 / ( ξ2 − ξ1 )
    w^(k) = θ y^(k) + (1 − θ) z^(k),     ξ3 = s^(k) · w^(k)
    D^(k) = (1/ξ3) w^(k) w^(k)ᵀ,     E^(k) = (1/ξ2) z^(k) z^(k)ᵀ
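These update formulas translate directly into code; the sketch below is an illustrative Python version (the function name and the way ∇L is supplied are assumptions).

# Damped BFGS update of the Lagrangian Hessian approximation H(k) -> H(k+1).
import numpy as np

def update_hessian(H, s, y):
    # s = alpha_k * d(k) (design change); y = grad L(x(k+1)) - grad L(x(k)) at fixed multipliers
    s, y = np.asarray(s, float), np.asarray(y, float)
    z = H @ s
    xi1, xi2 = s @ y, s @ z
    theta = 1.0 if xi1 >= 0.2 * xi2 else 0.8 * xi2 / (xi2 - xi1)
    w = theta * y + (1.0 - theta) * z        # damping keeps the update positive definite
    xi3 = s @ w
    D = np.outer(w, w) / xi3
    E = np.outer(z, z) / xi2
    return H + D - E

# Using the numbers from the modified CSD example below: s(0) = (0.5, 0.5), y(0) = (-0.5, -0.5)
H = update_hessian(np.eye(2), s=[0.5, 0.5], y=[-0.5, -0.5])
print(H)    # reproduces H(1) = [[0.6, -0.4], [-0.4, 0.6]], as in that example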
Chen CL
129
SQP: Modified CSD Algorithm
PLBA (Pshenichny-Lim-Belegundu-Arora) method
Step 1: same as CSD; let H^(0) = I
Step 2: calculate the functions, gradients, and maximum constraint violation; update the Hessian approximation if k > 0
Step 3: solve the QP subproblem for d^(k), u^(k), v^(k)
Steps 4-7: same as CSD
130
SQP: Modified CSD Algorithm Example
Min:   f(x) = x1^2 + x2^2 − 3 x1 x2
s.t.   g1(x) = (1/6) x1^2 + (1/6) x2^2 − 1 ≤ 0
       g2(x) = −x1 ≤ 0
       g3(x) = −x2 ≤ 0,     x^(0) = (1, 1)
       R0 = 10,  γ = 0.5,  ε1 = ε2 = 0.001
x* = (√3, √3),    u* = (3, 0, 0),    f* = −3
Chen CL
131
Iteration 1: same as before
    d^(0) = (1, 1),   u^(0) = (0, 0, 0),   α0 = 0.5,   x^(1) = (1.5, 1.5),   R1 = 10,   H^(0) = I
Iteration 2
Step 2: at x^(1) = (1.5, 1.5):
    f = −2.25,    g1 = −0.25,    g2 = g3 = −1.5
Chen CL
132
    ∇f = (−1.5, −1.5),    ∇g1 = (0.5, 0.5),    ∇g2 = (−1, 0),    ∇g3 = (0, −1)
    s^(0) = α0 d^(0) = (0.5, 0.5)
    z^(0) = H^(0) s^(0) = (0.5, 0.5)
    y^(0) = ∇f(x^(1)) − ∇f(x^(0)) = (−0.5, −0.5)
    ξ1 = s^(0) · y^(0) = −0.5,     ξ2 = s^(0) · z^(0) = 0.5
    θ  = 0.8 (0.5)/(0.5 + 0.5) = 0.4
    w^(0) = 0.4 (−0.5, −0.5) + (1 − 0.4)(0.5, 0.5) = (0.1, 0.1)
    ξ3 = s^(0) · w^(0) = 0.1
    D^(0) = [ 0.1  0.1 ; 0.1  0.1 ],     E^(0) = [ 0.5  0.5 ; 0.5  0.5 ]
    H^(1) = [ 1  0 ; 0  1 ] + [ 0.1  0.1 ; 0.1  0.1 ] − [ 0.5  0.5 ; 0.5  0.5 ] = [ 0.6  −0.4 ; −0.4  0.6 ]
Chen CL
133
Step 3: QP subproblem
Min:   f̄ = −1.5 d1 − 1.5 d2 + 0.5 ( 0.6 d1^2 − 0.8 d1 d2 + 0.6 d2^2 )
s.t.   ḡ1 = 0.5 d1 + 0.5 d2 ≤ 0.25
       ḡ2 = −d1 ≤ 1.5
       ḡ3 = −d2 ≤ 1.5
⇒  d^(1) = (0.25, 0.25),    u^(1) = (2.9, 0, 0)
This is the same direction as in the previous CSD method; in general, inclusion of the approximate Hessian will give different directions and better convergence
Chen CL
134
Newton-Raphson Method to Solve Multiple Nonlinear Equations
    F(x) = 0     (n × 1)
    x^(k+1) = x^(k) + Δx^(k),    k = 0, 1, 2, ...
Stop if  ||F(x*)|| = [ Σ_{i=1}^{n} { Fi(x*) }^2 ]^{1/2} ≤ δ
Ex:   F1(x1, x2) = 0,    F2(x1, x2) = 0
    F1^(k) + (∂F1/∂x1) Δx1^(k) + (∂F1/∂x2) Δx2^(k) = 0
    F2^(k) + (∂F2/∂x1) Δx1^(k) + (∂F2/∂x2) Δx2^(k) = 0
⇒  [ ∂F1/∂x1   ∂F1/∂x2 ]  [ Δx1^(k) ]             [ 0 ]
    [ ∂F2/∂x1   ∂F2/∂x2 ]  [ Δx2^(k) ]  + F^(k)  =  [ 0 ]
    (the coefficient matrix is ∇Fᵀ = J)
⇒  Δx^(k) = −J^{−1} F^(k),    or solve these equations directly
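A generic implementation of this update, Δx^(k) = −J^{−1} F^(k), is sketched below (illustrative Python; the finite-difference Jacobian is an assumption), and it reproduces the first step of the worked example that follows.

# Newton-Raphson for a system F(x) = 0: solve J*dx = -F at each iterate.
import numpy as np

def newton_raphson(F, x0, delta=0.1, max_iter=50, eps=1e-7):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = np.asarray(F(x))
        if np.linalg.norm(Fx) <= delta:        # stop if ||F(x)|| <= delta
            break
        J = np.empty((Fx.size, x.size))        # finite-difference Jacobian
        for i in range(x.size):
            xp = x.copy(); xp[i] += eps * max(1.0, abs(x[i]))
            J[:, i] = (np.asarray(F(xp)) - Fx) / (xp[i] - x[i])
        x = x + np.linalg.solve(J, -Fx)        # dx(k) = -J^{-1} F(k)
    return x

# Worked example that follows: F1 = 1 - 4e6/(x1^2 x2), F2 = 250 - 4e6/(x1 x2^2), x(0) = (500, 1)
F = lambda x: np.array([1.0 - 4.0e6/(x[0]**2 * x[1]),
                        250.0 - 4.0e6/(x[0] * x[1]**2)])
# The first Newton step is about (651, 1.33); the loop then continues until ||F|| <= delta.
print(newton_raphson(F, [500.0, 1.0]))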
Chen CL
135
Newton-Raphson Method to Solve Multiple Nonlinear Equations: Example
    F1(x) = 1.0 − (4.0 × 10^6) / ( x1^2 x2 ) = 0
    F2(x) = 250 − (4.0 × 10^6) / ( x1 x2^2 ) = 0
    J = [ ∂F1/∂x1   ∂F1/∂x2 ]  =  4.0 × 10^6 [ 2/(x1^3 x2)      1/(x1^2 x2^2) ]
        [ ∂F2/∂x1   ∂F2/∂x2 ]                [ 1/(x1^2 x2^2)    2/(x1 x2^3)   ]
Step 1:  x^(0) = (500, 1.0),    δ = 0.1,    k = 0
Step 2:  F1 = −15,   F2 = −7750   ⇒   ||F|| = 7750 > δ
Step 3:  J^(0) = [ 8/125    16
                  16       16000 ]
Step 4:  solve   [ 8/125    16    ] [ Δx1^(k) ]   =   [ 15   ]
                 [ 16       16000 ] [ Δx2^(k) ]       [ 7750 ]
         ⇒   Δx^(0) = (151, 0.33)    ⇒   x^(1) = (651, 1.33)