Numerical Methods for Constrained Optimization
Cheng-Liang Chen
PSE Laboratory
Department of Chemical Engineering, National Taiwan University
Chen CL
1
Power Generation via Fuel Oil: Process Description
A two-boiler, turbine-generator combination is to produce a power output of 50 MW. Ranges of operation: [18, 30] MW and [14, 25] MW, respectively.
Chen CL
2
Max supply of Blast Furnace Gas (BFG): 10 fuel units/hr
Fuel used (f, ton/h) as a function of power generated (x, MW):  f = a0 + a1 x + a2 x^2

Generator  Fuel Type  a0      a1       a2
1          Fuel Oil   1.4609  0.15186  0.001450
1          BFG        1.5742  0.16310  0.001358
2          Fuel Oil   0.8008  0.20310  0.000916
2          BFG        0.7266  0.22560  0.000778
Objective: min amount of fuel oil purchased
Chen CL
3
Power Generation via Fuel Oil: Formulation
Amount of fuel type j used by generator i:
    f_ij = a_ij0 + a_ij1 x_ij + a_ij2 x_ij^2,    i = 1, 2;  j = 1, 2
Power generated by generator i and the total power demand:
    p_i = x_i1 + x_i2,    i = 1, 2
    p_1 + p_2 ≥ 50
Amount of fuel type j used must not exceed the amount purchased z_j:
    z_j ≥ Σ_{i=1}^{2} f_ij = Σ_{i=1}^{2} ( a_ij0 + a_ij1 x_ij + a_ij2 x_ij^2 ),    j = 1, 2
Chen CL
4
NLP Formulation
    min   z_1
    s.t.  p_i = x_i1 + x_i2,    i = 1, 2
          p_1 + p_2 ≥ 50
          z_j ≥ Σ_{i=1}^{2} ( a_ij0 + a_ij1 x_ij + a_ij2 x_ij^2 ),    j = 1, 2
          x_ij ≥ 0,   0 ≤ z_2 ≤ 10,   18 ≤ p_1 ≤ 30,   14 ≤ p_2 ≤ 25
Chen CL
5
Power Generation via Fuel Oil: GAMS Code

$TITLE Power Generation via Fuel Oil
$OFFUPPER
$OFFSYMXREF OFFSYMLIST
OPTION SOLPRINT = OFF;
* Define index sets
SETS
  G  Power Generators                        /gen1*gen2/
  F  Fuels                                   /oil, gas/
  K  Constants in Fuel Consumption Equations /0*2/;
* Define and input the problem data
TABLE A(G,F,K)  Coefficients in the fuel consumption equations
               0        1        2
  gen1.oil  1.4609   .15186   .00145
  gen1.gas  1.5742   .16310   .001358
  gen2.oil  0.8008   .20310   .000916
  gen2.gas  0.7266   .22560   .000778 ;
PARAMETER PMAX(G) Maximum power outputs of generators / GEN1 30.0, GEN2 25.0 /;
PARAMETER PMIN(G) Minimum power outputs of generators / GEN1 18.0, GEN2 14.0 /;
SCALAR GASSUP Maximum supply of BFG in units per h /10.0/
       PREQ   Total power output required in MW   /50.0/;
* Define optimization variables
VARIABLES
  P(G)    Total power output of generators in MW
  X(G,F)  Power outputs of generators from specific fuels
  Z(F)    Total amounts of fuel purchased
  OILPUR  Total amount of fuel oil purchased;
POSITIVE VARIABLES P, X, Z;
* Define objective function and constraints
EQUATIONS
  TPOWER      Required power must be generated
  PWR(G)      Power generated by individual generators
  OILUSE      Amount of oil purchased to be minimized
  FUELUSE(F)  Fuel usage must not exceed purchase;
TPOWER..      SUM(G, P(G)) =G= PREQ;
PWR(G)..      P(G) =E= SUM(F, X(G,F));
FUELUSE(F)..  Z(F) =G= SUM((K,G), A(G,F,K)*X(G,F)**(ORD(K)-1));
OILUSE..      OILPUR =E= Z("oil");
* Impose bounds and initialize optimization variables
* Upper and lower bounds on P from the operating ranges
P.UP(G) = PMAX(G);
P.LO(G) = PMIN(G);
* Upper bound on BFG consumption from GASSUP
Z.UP("gas") = GASSUP;
* Specify initial values for power outputs
P.L(G) = 0.5*(PMAX(G)+PMIN(G));
* Define model and solve
MODEL FUELOIL /ALL/;
SOLVE FUELOIL USING NLP MINIMIZING OILPUR;
DISPLAY X.L, P.L, Z.L, OILPUR.L;
6
Chen CL
7
Power Generation via Fuel Oil: Result
Power outputs for generators 1 and 2 are 30 MW and 20 MW
Use all of the BFG (10 units) since it is free
  ⇒ 36.325 MW total power  ⇒ 13.675 MW remaining power
Purchase 4.681 ton/h of fuel oil for the remaining power
Generator 1 is operating at the maximum capacity of its operating range
Chen CL
8
Alkylation Process Optimization: Process Flowsheet
Chen CL
9
Alkylation Process Optimization: Notation and Variables
Chen CL
10
Alkylation Process Optimization: Total Profit
    f(x) = C1 x4 x7 − C2 x1 − C3 x2 − C4 x3 − C5 x5        (a)
C1 = alkylate product value ($0.063/octane-barrel)
C2 = olefin feed cost ($5.04/barrel)
C3 = isobutane recycle cost ($0.035/barrel)
C4 = acid addition cost ($10.00 per thousand pounds)
C5 = isobutane makeup cost ($3.36/barrel)
Chen CL
11
Alkylation Process Optimization: Some Regression Models
For reactor temperatures between 80 and 90 °F and reactor acid strengths by weight percent between 85 and 93, the alkylate yield was
    x4 = x1 ( 1.12 + 0.13167 x8 − 0.00667 x8^2 )                    (b)
The alkylate yield x4 equals the olefin feed x1 plus the isobutane makeup x5 less shrinkage. The volumetric shrinkage can be expressed as 0.22 volume per volume of alkylate yield:
    x5 = 1.22 x4 − x1                                               (c)
The acid strength by weight percent x6 could be derived from an equation that expresses the acid addition rate x3 as a function of the alkylate yield x4, the acid dilution factor x9, and the acid strength by weight percent x6 (the added acid was assumed to have an acid strength of 98%):
    x6 = 98000 x3 / ( x4 x9 + 1000 x3 )                             (d)
Chen CL
12
The motor octane number x7 was a function of the external isobutane-to-olefin ratio x8 and the acid strength by weight percent x6 (for the same reactor temperatures and acid strengths as for the alkylate yield x4):
    x7 = 86.35 + 1.098 x8 − 0.038 x8^2 + 0.325 ( x6 − 89 )          (e)
The external isobutane-to-olefin ratio x8 was equal to the sum of the isobutane recycle x2 and the isobutane makeup x5 divided by the olefin feed x1:
    x8 = ( x2 + x5 ) / x1                                           (f)
The acid dilution factor x9 could be expressed as a linear function of the F-4 performance number x10:
    x9 = 35.82 − 0.222 x10                                          (g)
The last dependent variable is the F-4 performance number x10, which was expressed as a function of the motor octane number x7:
    x10 = −133 + 3 x7                                               (h)
Chen CL
13
Alkylation Process Optimization: Constrained NLP Problem
    max  f(x) = C1 x4 x7 − C2 x1 − C3 x2 − C4 x3 − C5 x5            (a)
    s.t. Eqs. (b) ∼ (h)
(10 variables, 7 constraints)
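As a quick check of the model before handing it to a solver, the profit (a) and the residuals of Eqs. (b)-(h) can be evaluated at any candidate point. The sketch below does this in Python; it is an illustrative helper (the function names are assumptions, not part of the original GAMS formulation) evaluated at the starting point used in the GAMS code that follows.

# Hypothetical helpers: evaluate the alkylation profit (a) and the residuals of Eqs. (b)-(h).
def alkylation_profit(x):
    # x = [x1, ..., x10]; objective (a) with the cost coefficients C1..C5 from the slides
    x1, x2, x3, x4, x5, x6, x7, x8, x9, x10 = x
    return 0.063*x4*x7 - 5.04*x1 - 0.035*x2 - 10.0*x3 - 3.36*x5

def alkylation_residuals(x):
    x1, x2, x3, x4, x5, x6, x7, x8, x9, x10 = x
    return [
        x4 - x1*(1.12 + 0.13167*x8 - 0.00667*x8**2),                 # (b)
        x5 - (1.22*x4 - x1),                                         # (c)
        x6 - 98000.0*x3/(x4*x9 + 1000.0*x3),                         # (d)
        x7 - (86.35 + 1.098*x8 - 0.038*x8**2 + 0.325*(x6 - 89.0)),   # (e)
        x8 - (x2 + x5)/x1,                                           # (f)
        x9 - (35.82 - 0.222*x10),                                    # (g)
        x10 - (-133.0 + 3.0*x7),                                     # (h)
    ]

# Starting point from the GAMS code below
x0 = [1745, 12000, 110, 3048, 1974, 89.2, 92.8, 8, 3.6, 145]
print(alkylation_profit(x0))      # roughly the f(x0) = 872.3 initial value quoted later
print(alkylation_residuals(x0))   # small residuals: x0 nearly satisfies (b)-(h)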
Chen CL
14
Alkylation Process Optimization: GAMS Code

$TITLE ALKYLATION PROBLEM FROM GINO USER'S MANUAL
$OFFSYMXREF
$OFFSYMLIST
OPTION LIMROW=0;
OPTION LIMCOL=0;
POSITIVE VARIABLES X1,X2,X3,X4,X5,X6,X7,X8,X9,X10;
VARIABLE OBJ;
EQUATIONS E1,E2,E3,E4,E5,E6,E7,E8;
E1.. X4 =E= X1*(1.12+.13167*X8-0.0067*X8**2);
E2.. X7 =E= 86.35+1.098*X8-0.038*X8**2+0.325*(X6-89.);
E3.. X9 =E= 35.82-0.222*X10;
E4.. X10 =E= 3*X7-133;
E5.. X8*X1 =E= X2+X5;
E6.. X5 =E= 1.22*X4-X1;
E7.. X6*(X4*X9+1000*X3) =E= 98000*X3;
E8.. OBJ =E= 0.063*X4*X7-5.04*X1-0.035*X2-10*X3-3.36*X5;
X1.LO  = 0.;     X1.UP  = 2000.;
X2.LO  = 0.;     X2.UP  = 16000.;
X3.LO  = 0.;     X3.UP  = 120.;
X4.LO  = 0.;     X4.UP  = 5000.;
X5.LO  = 0.;     X5.UP  = 2000.;
X6.LO  = 85.;    X6.UP  = 93.;
X7.LO  = 90;     X7.UP  = 95;
X8.LO  = 3.;     X8.UP  = 12.;
X9.LO  = 1.2;    X9.UP  = 4.;
X10.LO = 145.;   X10.UP = 162.;
X1.L = 1745;  X2.L = 12000;  X3.L = 110;  X4.L = 3048;  X5.L = 1974;
X6.L = 89.2;  X7.L = 92.8;   X8.L = 8;    X9.L = 3.6;   X10.L = 145;
MODEL ALKY/ALL/;
SOLVE ALKY USING NLP MAXIMIZING OBJ;
15
Chen CL
16
Alkylation Process Optimization: Relaxation of Regression Variables
max  f(x) = C1 x4 x7 − C2 x1 − C3 x2 − C4 x3 − C5 x5                (a)
s.t. Rx4  = x1 ( 1.12 + 0.13167 x8 − 0.00667 x8^2 )                 (b)
     x5   = 1.22 x4 − x1                                            (c)
     x6   = 98000 x3 / ( x4 x9 + 1000 x3 )                          (d)
     Rx7  = 86.35 + 1.098 x8 − 0.038 x8^2 + 0.325 ( x6 − 89 )       (e)
     x8 x1 = x2 + x5                                                (f)
     Rx9  = 35.82 − 0.222 x10                                       (g)
     Rx10 = −133 + 3 x7                                             (h)
     0.99 x4  ≤ Rx4  ≤ 1.01 x4
     0.99 x7  ≤ Rx7  ≤ 1.01 x7
     0.99 x9  ≤ Rx9  ≤ 1.01 x9
     0.99 x10 ≤ Rx10 ≤ 1.01 x10
Chen CL
17
Alkylation Process Optimization: Solution
    f(x0) = 872.3 (initial guess)   ⇒   f(x*) = 1768.75
Chen CL
18
Constraint Status at a Design Point
Minimize:    f(x)
Subject to:  gj(x) ≤ 0,   j = 1, ..., m
             hi(x) = 0,   i = 1, ..., p
⇓
Active constraint:     gj(x(k)) = 0,   hi(x(k)) = 0
Inactive constraint:   gj(x(k)) < 0
Violated constraint:   gj(x(k)) > 0,   hi(x(k)) ≠ 0
ε-active constraint:   gj(x(k)) < 0,   gj(x(k)) + ε > 0
Chen CL
Conceptual Steps of Constrained Optimization Algorithms Initialized from A Feasible Point
19
Chen CL
Conceptual Steps of Constrained Optimization Algorithms Initialized from An Infeasible Point
20
Chen CL
21
Sequential Unconstrained Minimization Techniques (SUMT)
Minimize:    f(x)
Subject to:  gj(x) ≤ 0,   j = 1, ..., m
             hk(x) = 0,   k = 1, ..., p
⇒ Minimize:  Φ(x, r_p) = f(x) + r_p P(x)
    P(x):  penalty if any constraint is violated
    r_p:   weighting factor
Chen CL
22
The Exterior Penalty Function Methods
Minimize:    f(x)
Subject to:  gj(x) ≤ 0,   j = 1, ..., m
             hk(x) = 0,   k = 1, ..., p
⇓
    min  Φ(x, r_p) = f(x) + r_p P(x)
    P(x) = Σ_{j=1}^{m} [ max( 0, gj(x) ) ]^2 + Σ_{k=1}^{p} [ hk(x) ]^2
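A minimal sketch of how Φ(x, r_p) is built and driven by an increasing penalty weight is shown below. Python is used only for illustration; the unconstrained minimizer scipy.optimize.minimize and the simple update r_p ← 10 r_p are assumptions, not part of the original slides. The test problem is the two-variable example that appears a few slides later.

# Exterior penalty (SUMT) sketch:  min f(x)  s.t.  g(x) <= 0, h(x) = 0
import numpy as np
from scipy.optimize import minimize

def phi(x, r_p, f, gs, hs):
    # Pseudo-objective Phi(x, r_p) = f(x) + r_p * P(x)
    P = sum(max(0.0, g(x))**2 for g in gs) + sum(h(x)**2 for h in hs)
    return f(x) + r_p * P

def sumt_exterior(f, gs, hs, x0, r_p=1.0, growth=10.0, iters=6):
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):                      # r_p: small value -> large value
        res = minimize(lambda z: phi(z, r_p, f, gs, hs), x, method="BFGS")
        x, r_p = res.x, r_p * growth
    return x

# Example from the slides: min f = x1 + x2, g1 = -2 + x1 - 2 x2 <= 0, g2 = 8 - 6 x1 + x1^2 - x2 <= 0
f  = lambda x: x[0] + x[1]
g1 = lambda x: -2 + x[0] - 2*x[1]
g2 = lambda x: 8 - 6*x[0] + x[0]**2 - x[1]
print(sumt_exterior(f, [g1, g2], [], x0=[0.0, 0.0]))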
Chen CL
23
The Exterior Penalty Function Method: Example
Minimize:    f = (x + 2)^2 / 20
Subject to:  g1 = 1 − x/2 ≤ 0
             g2 = x/2 − 2 ≤ 0
Chen CL
24
⇓
Min:  Φ(x, r_p) = (x + 2)^2 / 20 + r_p { [ max( 0, 1 − x/2 ) ]^2 + [ max( 0, x/2 − 2 ) ]^2 }
Chen CL
25
The Exterior Penalty Function Method: Example
Minimize:    f = x1 + x2
Subject to:  g1 = −2 + x1 − 2 x2 ≤ 0
             g2 = 8 − 6 x1 + x1^2 − x2 ≤ 0
Chen CL
Pseudo-objective function: r p = 0.05
26
Chen CL
Pseudo-objective function: r p = 0.1
27
Chen CL
Pseudo-objective function: r p = 1.0
28
Chen CL
29
Algorithm for the Exterior Penalty Function Method
    r_p :  small value  −→  large value  −→  ∞
Chen CL
30
Advantages and Disadvantages of Exterior Penalty Function Methods
It is applicable to general constrained problems, i.e. equalities as well as inequalities can be treated
The starting design point can be arbitrary
The method iterates through the infeasible region where the problem functions may be undefined
If the iterative process terminates prematurely, the final design may not be feasible and hence not usable
Chen CL
31
The Interior Penalty Function Methods (Barrier Function Methods)
Minimize:    f(x)
Subject to:  gj(x) ≤ 0,   j = 1, ..., m
             hk(x) = 0,   k = 1, ..., p
⇓
    min  Φ(x, r'_p, r_p) = f(x) + r'_p Σ_{j=1}^{m} [ −1 / gj(x) ] + r_p Σ_{k=1}^{p} [ hk(x) ]^2
Chen CL
32
The Interior Penalty Function Method: Example
Minimize:    f = (x + 2)^2 / 20
Subject to:  g1 = 1 − x/2 ≤ 0
             g2 = x/2 − 2 ≤ 0
Chen CL
33
⇓
Min:  Φ(x, r'_p) = (x + 2)^2 / 20 + r'_p [ 2/(x − 2) + 2/(4 − x) ]
Chen CL
34
Algorithm for the Interior Penalty Function Method
    r'_p :  large value  −→  small value  −→  0
Chen CL
35
Advantages and Disadvantages of Interior Penalty Function Methods
The starting design point must be feasible
The method always iterates through the feasible region
Chen CL
36
Augmented Lagrange Multiplier Method: Equality-Constrained Problems
Min:   f(x)
s.t.   hk(x) = 0,   k = 1, ..., p
⇒  L(x, λ) = f(x) + Σ_{∀k} λk hk(x)
⇒  A(x, λ, r_p) = f(x) + Σ_{∀k} { λk hk(x) + r_p [ hk(x) ]^2 }
NC at x*, λ*:
    ∂A/∂xi = ∂f/∂xi + Σ_{∀k} ( λk + 2 r_p hk ) ∂hk/∂xi = 0
    ∂L/∂xi = ∂f/∂xi + Σ_{∀k} λk* ∂hk/∂xi = 0
⇒  λk* = λk + 2 r_p hk(x)    (update λk with a finite upper bound for r_p)
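A compact sketch of this update loop is given below (illustrative Python; the inner unconstrained solve via scipy.optimize.minimize and the fixed penalty weight are assumptions), applied to the example on the next slide, min x1^2 + x2^2 s.t. x1 + x2 − 1 = 0.

# Augmented Lagrangian sketch for equality constraints:
#   A(x, lam, r_p) = f(x) + sum_k [ lam_k * h_k(x) + r_p * h_k(x)^2 ]
import numpy as np
from scipy.optimize import minimize

def alm_equality(f, hs, x0, r_p=1.0, outer_iters=10):
    x = np.asarray(x0, dtype=float)
    lam = np.zeros(len(hs))
    def A(z):
        hv = np.array([h(z) for h in hs])
        return f(z) + np.dot(lam, hv) + r_p * np.dot(hv, hv)
    for _ in range(outer_iters):
        x = minimize(A, x, method="BFGS").x                        # minimize A for fixed lam, r_p
        lam = lam + 2.0 * r_p * np.array([h(x) for h in hs])       # lam_k* = lam_k + 2 r_p h_k(x)
    return x, lam

# Example from the next slide: min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0
f = lambda x: x[0]**2 + x[1]**2
h = lambda x: x[0] + x[1] - 1.0
x_opt, lam_opt = alm_equality(f, [h], x0=[0.0, 0.0], r_p=1.0)
print(x_opt, lam_opt)   # approaches x* = (0.5, 0.5), lambda* = -1 as stated in the example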
Chen CL
37
Chen CL
38
Augmented Lagrange Multiplier Method: Example
Min:   f(x) = x1^2 + x2^2
s.t.   h(x) = x1 + x2 − 1 = 0
Chen CL
39
⇒  A(x, λ, r_p) = x1^2 + x2^2 + λ ( x1 + x2 − 1 ) + r_p ( x1 + x2 − 1 )^2
   ∇x A(x, λ, r_p) = [ 2 x1 + λ + 2 r_p ( x1 + x2 − 1 ),  2 x2 + λ + 2 r_p ( x1 + x2 − 1 ) ]ᵀ = 0
⇒  x1 = x2 = ( 2 r_p − λ ) / ( 2 + 4 r_p )
Solve the problem for r_p = 1; λ = 0, 2, respectively
True optimum:  λ* = −1,   x1* = x2* = 0.5
Chen CL
40
Augmented Lagrange Multiplier Method: Inequality-Constrained Problems
Min:   f(x)
s.t.   gj(x) ≤ 0,   j = 1, ..., m
⇒  gj(x) + Zj^2 = 0    (slack variables)
⇒  A(x, λ, Z, r_p) = f(x) + Σ_{∀j} { λj [ gj(x) + Zj^2 ] + r_p [ gj(x) + Zj^2 ]^2 }
⇒  A(x, λ, r_p) = f(x) + Σ_{∀j} { λj Ψj + r_p Ψj^2 }
    Ψj = max[ gj(x),  −λj / (2 r_p) ]
    λj = λj + 2 r_p Ψj    (update rule)
Case 1: g active    ⇒  Z^2 = 0,   λ* = λ + 2 r_p g > 0,          Ψ = g > −λ/(2 r_p)
Case 2: g inactive  ⇒  Z^2 > 0,   λ* = λ + 2 r_p ( g + Z^2 ) = 0,  Ψ = g + Z^2 = −λ/(2 r_p) > g
Chen CL
41
Augmented Lagrange Multiplier Method: General Problems
Min:   f(x)
s.t.   gj(x) ≤ 0,   j = 1, ..., m
       hk(x) = 0,   k = 1, ..., p
⇓
A(x, λ, r_p) = f(x) + Σ_{∀j} { λj Ψj + r_p Ψj^2 } + Σ_{∀k} { λ_{k+m} hk(x) + r_p [ hk(x) ]^2 }
    Ψj = max[ gj(x),  −λj / (2 r_p) ]
    λj = λj + 2 r_p Ψj(x),           j = 1, ..., m
    λ_{k+m} = λ_{k+m} + 2 r_p hk(x),   k = 1, ..., p
Chen CL
42
Chen CL
43
Advantages of ALM Method
Relatively insensitive to value of r p
Precise satisfaction of g(x) = 0 and h(x) = 0 is possible
Acceleration is achieved by updating λ
Starting point may be either feasible or infeasible
At the optimum, λj* ≠ 0 will automatically identify the active constraint set
Chen CL
44
Generalized Reduced Gradient Method (GRG)
Min:   f(x),   x ∈ R^n
s.t.   gj(x) ≤ 0,          j = 1, ..., J
       hk(x) = 0,          k = 1, ..., K
       xi ≤ xi^u
⇓  (add slack variables x_{n+j} to the inequalities)
Min:   f(x),   x ∈ R^n
s.t.   gj(x) + x_{n+j} = 0,   j = 1, ..., J
       hk(x) = 0,             k = 1, ..., K
       xi ≤ xi^u,   x_{n+j} ≥ 0,   j = 1, ..., J
Chen CL
45
⇓
Min:   f(x),   x ∈ R^{n+J}
s.t.   hk(x) = 0,   k = 1, ..., K, ..., K + J
       xi ≤ xi^u,   i = 1 ∼ n ∼ n + J
Note: each hk(x) = 0 reduces one degree of freedom
  ⇒ one variable can be represented by the others
  ⇒ the K + J dependent variables can be represented by the other (n + J) − (K + J) = (n − K) independent variables
Chen CL
46
Divide x_{(n+J)×1} into independent/dependent (design/state, Y/Z) variables so as to satisfy hk(x) = 0:
    x = [ Y ; Z ],    Y = [ y1, ..., y_{n−K} ]ᵀ,    Z = [ z1, ..., z_{K+J} ]ᵀ
Variations of the objective and constraint functions:
    df(x)  = Σ_{i=1}^{n−K} (∂f/∂yi) dyi + Σ_{i=1}^{K+J} (∂f/∂zi) dzi   =  ∇ᵀ_Y f dY + ∇ᵀ_Z f dZ
    dhk(x) = Σ_{i=1}^{n−K} (∂hk/∂yi) dyi + Σ_{i=1}^{K+J} (∂hk/∂zi) dzi  =  ∇ᵀ_Y hk dY + ∇ᵀ_Z hk dZ
⇒  dh = [ dh1, ..., dh_K, ..., dh_{K+J} ]ᵀ = C dY + D dZ
Chen CL
47
C = [ ∂h1/∂y1       ···   ∂h1/∂y_{n−K}
      ...
      ∂hK/∂y1       ···   ∂hK/∂y_{n−K}
      ...
      ∂h_{K+J}/∂y1  ···   ∂h_{K+J}/∂y_{n−K} ]   =  [ ∇ᵀ_Y h1 ; ... ; ∇ᵀ_Y hK ; ... ; ∇ᵀ_Y h_{K+J} ]

D = [ ∂h1/∂z1       ···   ∂h1/∂z_{K+J}
      ...
      ∂hK/∂z1       ···   ∂hK/∂z_{K+J}
      ...
      ∂h_{K+J}/∂z1  ···   ∂h_{K+J}/∂z_{K+J} ]   =  [ ∇ᵀ_Z h1 ; ... ; ∇ᵀ_Z hK ; ... ; ∇ᵀ_Z h_{K+J} ]
Chen CL
48
x −→ x + dx should maintain h = 0  ⇒  dh = 0
    dh = C dY + D dZ = 0
⇒  D dZ = −C dY    OR    dZ = −D^{−1} C dY
⇒  df(x) = ∇ᵀ_Y f dY + ∇ᵀ_Z f dZ
         = [ ∇ᵀ_Y f − ∇ᵀ_Z f D^{−1} C ]_{1×(n−K)} dY
         = G_Rᵀ dY
    G_R = df(x)/dY = ∇_Y f − [ D^{−1} C ]ᵀ ∇_Z f     ((n − K) × 1)
    (Generalized Reduced Gradient)
Chen CL
49
Use G_R to generate the search direction S  ⇒  step size
dZ = −D^{−1} C dY is based on a linear approximation
  ⇒ some hk(x + dx) ≠ 0
  ⇒ fix Y, correct dZ to maintain h + dh = 0 (repeat until dZ is small)
    h(x) + dh(x) = 0,   h(x) + C dY + D dZ = 0
    dZ = −D^{−1} [ h(x) + C dY ]
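A minimal numerical sketch of these two formulas, G_R = ∇_Y f − (D^{−1}C)ᵀ ∇_Z f and the correction dZ = −D^{−1}[h(x) + C dY], is given below. Python with finite-difference gradients is an illustrative assumption, and the partition of x into (Y, Z) is supplied by the caller; the test problem is the GRG example that follows.

# GRG building blocks: reduced gradient and state-variable (Z) correction.
import numpy as np

def num_grad(fun, x, eps=1e-6):
    # Forward-difference gradient of a scalar or vector function (rows = outputs)
    x = np.asarray(x, dtype=float)
    f0 = np.atleast_1d(fun(x))
    g = np.zeros((f0.size, x.size))
    for i in range(x.size):
        xp = x.copy(); xp[i] += eps
        g[:, i] = (np.atleast_1d(fun(xp)) - f0) / eps
    return g

def reduced_gradient(f, h, x, idx_Y, idx_Z):
    gf = num_grad(f, x)[0]                       # gradient of f
    J  = num_grad(h, x)                          # Jacobian of the equality constraints
    C, D = J[:, idx_Y], J[:, idx_Z]
    GR = gf[idx_Y] - (np.linalg.solve(D, C)).T @ gf[idx_Z]
    return GR, C, D

def correct_Z(h, x, idx_Z, D, tol=1e-8, iters=20):
    # Fix Y and update Z until h(x) ~= 0 (dY = 0 in the correction formula)
    x = x.copy()
    for _ in range(iters):
        hv = np.atleast_1d(h(x))
        if np.linalg.norm(hv) < tol:
            break
        x[idx_Z] += np.linalg.solve(D, -hv)
        D = num_grad(h, x)[:, idx_Z]             # refresh D at the corrected point
    return x

# Example from the next slides: f = (x1-x2)^2 + (x2-x3)^4, h = x1(1+x2^2) + x3^4 - 3 = 0
f = lambda x: (x[0]-x[1])**2 + (x[1]-x[2])**4
h = lambda x: np.array([x[0]*(1.0 + x[1]**2) + x[2]**4 - 3.0])
x0 = np.array([-2.6, 2.0, 2.0])                  # Y = (x1, x2), Z = (x3)
GR, C, D = reduced_gradient(f, h, x0, idx_Y=[0, 1], idx_Z=[2])
print(GR)    # about (-9.2, 9.2), matching the worked example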
Chen CL
50
GRG: Algorithm
Specify design (Y) and state (Z) variables:
  select the state variables (Z) to avoid singularity of D;
  any component of x that is equal to its lower/upper bound initially is set to be a design variable;
  slack variables should be set as state variables
Compute G_R (analytically or numerically)
Test for convergence: stop if ||G_R|| < ε
Determine the search direction S:
  steepest descent: S = −G_R
  Fletcher-Reeves conjugate gradients, DFP, BFGS, ···
Chen CL
51
Find the optimum step size along S:
    dY = λ S = λ [ s1, ..., s_{n−K} ]ᵀ
    dZ = −D^{−1} C dY = −D^{−1} C (λ S) = λ [ −D^{−1} C S ] = λ T = λ [ t1, ..., t_{K+J} ]ᵀ
Chen CL
52
Distance to the side constraints, considering the design variables:
    λi = ( yi^u − yi,old ) / si   if si > 0
    λi = ( yi^ℓ − yi,old ) / si   if si < 0
    ⇒ λ1 = min_i { λi }
Distance to the side constraints, considering the state variables:
    λj = ( zj^u − zj,old ) / tj   if tj > 0
    λj = ( zj^ℓ − zj,old ) / tj   if tj < 0
    ⇒ λ2 = min_j { λj }
    0 < λ* ≤ min{ λ1, λ2 }
    x_new = [ Y_old + dY ; Z_old + dZ ] = [ Y_old + λ* S ; Z_old + λ* T ]
If x_new is infeasible  ⇒  keep Y_new and modify Z_new
Chen CL
53
GRG: Example
    min:  f(x) = (x1 − x2)^2 + (x2 − x3)^4
    s.t.  h(x) = x1 (1 + x2^2) + x3^4 − 3 = 0
          −3 ≤ x1, x2, x3 ≤ 3
Choose arbitrarily the design/state variables as
    Y = [ y1 ; y2 ] = [ x1 ; x2 ] = [ −2.6 ; 2 ],    Z = [ z1 ] = [ x3 ] = [ 2 ]
    x^(1) = (−2.6, 2, 2),    f(x^(1)) = 21.16
Chen CL
54
Compute the reduced gradient:
    ∇_Y f = [ ∂f/∂x1 ; ∂f/∂x2 ] = [ 2(x1 − x2) ; −2(x1 − x2) + 4(x2 − x3)^3 ] = [ −9.2 ; 9.2 ]
    ∇_Z f = ∂f/∂x3 = −4(x2 − x3)^3 = 0
    C = [ ∂h/∂x1  ∂h/∂x2 ] = [ 1 + x2^2   2 x1 x2 ] = [ 5   −10.4 ]
    D = ∂h/∂x3 = 4 x3^3 = 32
    D^{−1} C = (1/32) [ 5   −10.4 ] = [ 0.15625   −0.325 ]
    G_R = ∇_Y f − [ D^{−1} C ]ᵀ ∇_Z f = [ −9.2 ; 9.2 ]
    ||G_R|| = 9.2  ⇒  continue
Use the steepest descent method:   S = −G_R = [ 9.2 ; −9.2 ]
Find λ* along S:
    λ1 = min { ( 3 − (−2.6) )/9.2 ,  ( −3 − 2 )/(−9.2) } = min { 0.6087, 0.5435 } = 0.5435
Chen CL
55
    T = −D^{−1} C S = −(1/32) [ 5   −10.4 ] [ 9.2 ; −9.2 ] = −4.4275
    λ2 = ( −3 − 2 )/(−4.4275) = 1.1293
    λ = min { λ1, λ2 } = 0.5435
    x = [ Y + λ S ; Z + λ T ] = [ −2.6 ; 2 ; 2 ] + λ [ 9.2 ; −9.2 ; −4.4275 ] = [ −2.6 + 9.2 λ ; 2 − 9.2 λ ; 2 − 4.4275 λ ]
    f(λ) = [ (−2.6 + 9.2 λ) − (2 − 9.2 λ) ]^2 + [ (2 − 9.2 λ) − (2 − 4.4275 λ) ]^4
         = 518.7806 λ^4 + 338.56 λ^2 − 169.28 λ + 21.16
    df/dλ = 2075.1225 λ^3 + 677.12 λ − 169.28 = 0   ⇒   λ* = 0.22
⇒  x_new = [ −2.6 + 9.2 λ* ; 2 − 9.2 λ* ; 2 − 4.4275 λ* ] = [ −2.6 + 2.024 ; 2 − 2.024 ; 2 − 0.97405 ] = [ −0.576 ; −0.024 ; 1.02595 ]
Chen CL
56
    h(x_new) = −2.4684 ≠ 0   ⇒   x_new is not in the feasible region
    D_new = [ ∂h/∂x3 ]_new = 4 (x3^3)_new = 4 (1.02595)^3 = 4.32
    C_new = [ ∂h/∂x1  ∂h/∂x2 ]_new = [ 1 + x2^2   2 x1 x2 ]_new = [ 1   0.028 ]
    dZ = −D^{−1} [ h(x) + C dY ]_new = −(1/4.32) [ −2.4684 + (1)(2.024) + (0.028)(−2.024) ] = 0.116
⇒  Z_new = Z_old + dZ = 1.02595 + 0.116 = 1.142
    h(x_new) = −1.876 ≠ 0   ⇒   x_new is not in the feasible region  ⇒  repeat the correction
Chen CL
57
Linearization of the Constrained Problem
Min:   f(x)
s.t.   hj(x) = 0,   j = 1, ..., p
       gj(x) ≤ 0,   j = 1, ..., m
⇓
Min:   f(x^(k) + Δx) ≈ f̄ = f(x^(k)) + ∇ᵀf(x^(k)) Δx
s.t.   hj(x^(k) + Δx) ≈ h̄j = hj(x^(k)) + ∇ᵀhj(x^(k)) Δx = 0
       gj(x^(k) + Δx) ≈ ḡj = gj(x^(k)) + ∇ᵀgj(x^(k)) Δx ≤ 0
Chen CL
58
Define
    fk ≡ f(x^(k)),     ci  ≡ ∂f(x^(k))/∂xi,      ej ≡ −hj(x^(k)),    bj ≡ −gj(x^(k))
    di ≡ Δxi^(k),      nij ≡ ∂hj(x^(k))/∂xi,     aij ≡ ∂gj(x^(k))/∂xi
⇓
Min:   f̄ = Σ_{i=1}^{n} ci di = cᵀ d
s.t.   h̄j:  Σ_{i=1}^{n} nij di = ej    or   [ n1 ··· np ]ᵀ d = Nᵀ d = e
       ḡj:  Σ_{i=1}^{n} aij di ≤ bj    or   [ a1 ··· am ]ᵀ d = Aᵀ d ≤ b
Chen CL
59
Definition of the Linearized Subproblem: Example
Min:   f(x) = x1^2 + x2^2 − 3 x1 x2
s.t.   g1(x) = (1/6) x1^2 + (1/6) x2^2 − 1 ≤ 0
       g2(x) = −x1 ≤ 0
       g3(x) = −x2 ≤ 0,     x^(0) = (1, 1)
⇒  ∇f(x)  = [ 2 x1 − 3 x2 ;  2 x2 − 3 x1 ]
   ∇g1(x) = [ x1/3 ;  x2/3 ]
   ∇g2(x) = [ −1 ; 0 ],     ∇g3(x) = [ 0 ; −1 ]
Chen CL
60
Chen CL
61
⇒  f(x^(0)) = −1,    ∇f(x^(0)) = [ −1 ; −1 ]
    b1 = −g1(x^(0)) = 2/3,    b2 = −g2(x^(0)) = 1,    b3 = −g3(x^(0)) = 1
    ∇g1(x^(0)) = [ 1/3 ; 1/3 ]
⇒  A = [ 1/3   −1    0
         1/3    0   −1 ],       b = [ 2/3 ; 1 ; 1 ]
Chen CL
62
⇓
Min:   f̄ = [ −1  −1 ] [ d1 ; d2 ]
s.t.   [ 1/3   1/3
         −1     0
          0    −1 ] [ d1 ; d2 ]  ≤  [ 2/3 ; 1 ; 1 ]
Or
Min:   f̄ = −d1 − d2
s.t.   ḡ1 = (1/3) d1 + (1/3) d2 ≤ 2/3
       ḡ2 = −d1 ≤ 1
       ḡ3 = −d2 ≤ 1
Chen CL
63
Or
Min:   f̄(x) = f(x^(0)) + ∇f · (x − x^(0)) = −1 + [ −1  −1 ] [ x1 − 1 ; x2 − 1 ] = −x1 − x2 + 1
s.t.   ḡ1(x) = g1(x^(0)) + ∇g1 · (x − x^(0)) = −2/3 + [ 1/3  1/3 ] [ x1 − 1 ; x2 − 1 ] = (1/3)( x1 + x2 − 4 ) ≤ 0
       ḡ2(x) = −x1 ≤ 0
       ḡ3(x) = −x2 ≤ 0
Chen CL
64
Optimum solution: along line D–E
    d1 + d2 = 2,    (x1 − 1) + (x2 − 1) = 2,    x1 + x2 − 4 = 0
Chen CL
65
Sequential Linear Programming Algorithm: Basic Idea
Min:   f(x)
s.t.   hj(x) = 0,   j = 1, ..., p
       gj(x) ≤ 0,   j = 1, ..., m
⇓
Min:   f(x^(k) + Δx) ≈ f̄ = f(x^(k)) + ∇ᵀf(x^(k)) Δx
s.t.   hj(x^(k) + Δx) ≈ h̄j = hj(x^(k)) + ∇ᵀhj(x^(k)) Δx = 0
       gj(x^(k) + Δx) ≈ ḡj = gj(x^(k)) + ∇ᵀgj(x^(k)) Δx ≤ 0
Chen CL
66
Define
    fk ≡ f(x^(k)),     ci  ≡ ∂f(x^(k))/∂xi,      ej ≡ −hj(x^(k)),    bj ≡ −gj(x^(k))
    di ≡ Δxi^(k),      nij ≡ ∂hj(x^(k))/∂xi,     aij ≡ ∂gj(x^(k))/∂xi
⇓
Min:   f̄ = Σ_{i=1}^{n} ci di = cᵀ d
s.t.   h̄j:  Σ_{i=1}^{n} nij di = ej    or   [ n1 ··· np ]ᵀ d = Nᵀ d = e
       ḡj:  Σ_{i=1}^{n} aij di ≤ bj    or   [ a1 ··· am ]ᵀ d = Aᵀ d ≤ b
       −Δiℓ^(k) ≤ di ≤ Δiu^(k),   i = 1, ..., n    (move limits)
Chen CL
67
Move limits: without them the linearized problem may not have a bounded solution, or the changes in design may become too large
Selection of proper move limits is of critical importance because it can mean success or failure of the SLP algorithm
All bi ≥ 0
di has no sign restriction  ⇒  di = di⁺ − di⁻,   di⁺ ≥ 0,  di⁻ ≥ 0
Stopping criteria:   gj ≤ ε1, j = 1 ∼ m;    |hj| ≤ ε1, j = 1 ∼ p;    ||d|| ≤ ε2
Chen CL
68
An SLP Algorithm
Step 1: set x^(0), k = 0, and tolerances ε1, ε2
Step 2: calculate fk;  bj, j = 1 ∼ m;  ej, j = 1 ∼ p
Step 3: evaluate ci = ∂f/∂xi,  nij = ∂hj/∂xi,  aij = ∂gj/∂xi
Step 4: select proper move limits Δiℓ^(k), Δiu^(k) (some fraction of the current design)
Step 5: define the LP subproblem
Step 6: use the Simplex method to solve for d^(k)
Step 7: check for convergence; stop if satisfied
Step 8: update the design x^(k+1) = x^(k) + d^(k), set k = k + 1 and go to Step 2
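The loop in Steps 2-8 can be sketched compactly as below. This is an illustrative Python sketch: the finite-difference gradients, the use of scipy.optimize.linprog for Step 6, and the fixed 15% move limit (handled through the variable bounds) are assumptions for the demonstration.

# SLP sketch: linearize at x(k), solve the LP with move limits, update, repeat.
import numpy as np
from scipy.optimize import linprog

def grad(fun, x, eps=1e-6):
    x = np.asarray(x, dtype=float)
    return np.array([(fun(x + eps*e) - fun(x)) / eps for e in np.eye(x.size)])

def slp(f, gs, x0, move_frac=0.15, eps1=1e-3, eps2=1e-3, iters=30):
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        c = grad(f, x)                                   # Step 3: cost gradient
        A = np.array([grad(g, x) for g in gs])           # a_ij = dg_j/dx_i (rows are constraints)
        b = np.array([-g(x) for g in gs])                # b_j = -g_j(x(k))
        lim = move_frac * np.maximum(np.abs(x), 1.0)     # Step 4: move limits
        res = linprog(c, A_ub=A, b_ub=b,
                      bounds=list(zip(-lim, lim)), method="highs")   # Step 6
        if not res.success:
            break
        d = res.x
        x = x + d                                        # Step 8
        if np.linalg.norm(d) <= eps2 and max(g(x) for g in gs) <= eps1:
            break                                        # Step 7
    return x

# Example from the slides: f = x1^2 + x2^2 - 3 x1 x2, g1 = x1^2/6 + x2^2/6 - 1 <= 0, x >= 0
f  = lambda x: x[0]**2 + x[1]**2 - 3*x[0]*x[1]
g1 = lambda x: x[0]**2/6 + x[1]**2/6 - 1
g2 = lambda x: -x[0]
g3 = lambda x: -x[1]
# Iterates move toward x* = (sqrt(3), sqrt(3)); as the slides note, SLP may cycle near a non-vertex optimum.
print(slp(f, [g1, g2, g3], x0=[1.0, 1.0]))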
Chen CL
69
An SLP Example
Min:   f(x) = x1^2 + x2^2 − 3 x1 x2
s.t.   g1(x) = (1/6) x1^2 + (1/6) x2^2 − 1 ≤ 0
       g2(x) = −x1 ≤ 0
       g3(x) = −x2 ≤ 0,     x^(0) = (3, 3)
⇒  ∇f(x)  = [ 2 x1 − 3 x2 ;  2 x2 − 3 x1 ]
   ∇g1(x) = [ x1/3 ;  x2/3 ]
   ∇g2(x) = [ −1 ; 0 ],     ∇g3(x) = [ 0 ; −1 ]
Chen CL
70
f(x^(0)) = −9,    ∇f(x^(0)) = [ −3 ; −3 ] = c
b1 = −g1(x^(0)) = −2 < 0    (violated)
b2 = −g2(x^(0)) = 3          (inactive)
b3 = −g3(x^(0)) = 3          (inactive)
∇g1(x^(0)) = [ 1 ; 1 ]
⇒  A = [ 1   −1    0
         1    0   −1 ],       b = [ −2 ; 3 ; 3 ]
Chen CL
71
⇓
Min:   f̄ = [ −3  −3 ] [ d1 ; d2 ]
s.t.   [ 1    1
         −1    0
          0   −1 ] [ d1 ; d2 ]  ≤  [ −2 ; 3 ; 3 ]
Or
Min:   f̄ = −3 d1 − 3 d2
s.t.   ḡ1 = d1 + d2 ≤ −2
       ḡ2 = −d1 ≤ 3
       ḡ3 = −d2 ≤ 3
Chen CL
72
Feasible region: triangle A–B–C; the cost function is parallel to B–C (optimal solution: any point on B–C, e.g. d1 = d2 = −1)
100% move limits ⇒ −3 ≤ d1, d2 ≤ 3 (region ADEF) ⇒ d1 = d2 = −1 ⇒ x1 = 2, x2 = 2  (x1 − 3 = d1 = −1)
20% move limits ⇒ −0.6 ≤ d1, d2 ≤ 0.6 (region A1D2E1F1) ⇒ no feasible solution
The linearized constraints are satisfied, but the original nonlinear constraint g1 is still violated
Chen CL
73
SLP: Example
Min:   f(x) = x1^2 + x2^2 − 3 x1 x2
s.t.   g1(x) = (1/6) x1^2 + (1/6) x2^2 − 1 ≤ 0
       g2(x) = −x1 ≤ 0
       g3(x) = −x2 ≤ 0,     x^(0) = (1, 1)
       ε1,2 = 0.001,   15% design change (move limits)
⇓
Min:   f̄ = −d1 − d2
s.t.   ḡ1 = (1/3) d1 + (1/3) d2 ≤ 2/3
       ḡ2 = −d1 ≤ 1
       ḡ3 = −d2 ≤ 1
       −0.15 ≤ d1, d2 ≤ 0.15
Chen CL
74
Solution region: DEFG   ⇒   F:  (d1 = d2 = 0.15)
Chen CL
75
Set  d1 ≡ d1⁺ − d1⁻,   d2 ≡ d2⁺ − d2⁻
⇓
Min:   f̄ = −d1⁺ + d1⁻ − d2⁺ + d2⁻
s.t.   ḡ1 = (1/3)( d1⁺ − d1⁻ ) + (1/3)( d2⁺ − d2⁻ ) ≤ 2/3
       ḡ2 = −d1⁺ + d1⁻ ≤ 1
       ḡ3 = −d2⁺ + d2⁻ ≤ 1
       d1⁺ − d1⁻ ≤ 0.15,     −d1⁺ + d1⁻ ≤ 0.15
       d2⁺ − d2⁻ ≤ 0.15,     −d2⁺ + d2⁻ ≤ 0.15
       d1⁺, d1⁻, d2⁺, d2⁻ ≥ 0
Chen CL
76
⇓
d1⁺ = 0.15,  d1⁻ = 0,   d2⁺ = 0.15,  d2⁻ = 0
d1 = d1⁺ − d1⁻ = 0.15,     d2 = d2⁺ − d2⁻ = 0.15
x^(1) = x^(0) + d^(0) = (1.15, 1.15)
f(x^(1)) = −1.3225
g1(x^(1)) = −0.6166    (inactive)
||d|| = 0.212 > ε2   ⇒   next iteration
Chen CL
77
Observations on The SLP Algorithm
The rate of convergence and performance depend to a large extent on the selection of move limits
The method cannot be used as a black box approach for engineering design problems (selection of move limits is a trial and error process)
The method is not convergent since no descent function is defined
The method can cycle between two points if the optimum solution is not a vertex of the constraint set
Lack of robustness
Chen CL
78
Quadratic Programming Problem: Quadratic Step Size Constraint
Constrained Nonlinear Programming:
  ⇒ Linear Programming subproblem: lack of robustness
  ⇒ Quadratic Programming subproblem: with a descent function and a step size determination strategy
Quadratic step size constraint:
    −Δiℓ^(k) ≤ di ≤ Δiu^(k),   i = 1, ..., n    (move limits)
    ⇒   (1/2) ||d||^2 = (1/2) dᵀ d = (1/2) Σ (di)^2 ≤ ξ
Chen CL
79
⇓
Min:   f̄ = Σ_{i=1}^{n} ci di = cᵀ d
s.t.   h̄j = Σ_{i=1}^{n} nij di = ej    or   [ n1 ··· np ]ᵀ d = Nᵀ d = e
       ḡj = Σ_{i=1}^{n} aij di ≤ bj    or   [ a1 ··· am ]ᵀ d = Aᵀ d ≤ b
       (1/2) Σ (di)^2 ≤ ξ    (move limits)
Chen CL
80
Quadratic Programming (QP) Subproblem
Min:   f̄ = Σ_{i=1}^{n} ci di = cᵀ d
s.t.   h̄j = Σ_{i=1}^{n} nij di = ej    or   Nᵀ d = e
       ḡj = Σ_{i=1}^{n} aij di ≤ bj    or   Aᵀ d ≤ b
       (1/2) Σ (di)^2 ≤ ξ    (check KTC, λ = 1 for this constraint)
⇓
Min:   f̄ = cᵀ d + 0.5 dᵀ d
s.t.   Nᵀ d = e
       Aᵀ d ≤ b              (0.5 dᵀ d ≤ ξ)
a convex programming problem  ⇒  unique solution
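As a quick way to check a hand-derived QP subproblem solution (the slides solve these via the KKT conditions or the two-phase Simplex procedure described later), the small QP can also be handed to a general solver. The sketch below is only an illustration, using scipy.optimize.minimize rather than the slides' procedure; it reproduces the direction d^(0) = (1, 1) found in the CSD example later.

# Solve the QP subproblem  min c'd + 0.5 d'd  s.t.  N'd = e,  A'd <= b
import numpy as np
from scipy.optimize import minimize

def solve_qp_subproblem(c, A=None, b=None, N=None, e=None):
    c = np.asarray(c, dtype=float)
    n = c.size
    cons = []
    if A is not None:
        A, b = np.asarray(A, float), np.asarray(b, float)
        cons.append({"type": "ineq", "fun": lambda d: b - A.T @ d})   # A'd <= b
    if N is not None:
        N, e = np.asarray(N, float), np.asarray(e, float)
        cons.append({"type": "eq", "fun": lambda d: N.T @ d - e})     # N'd = e
    obj = lambda d: c @ d + 0.5 * d @ d
    return minimize(obj, np.zeros(n), constraints=cons, method="SLSQP").x

# QP subproblem from iteration 1 of the CSD example at x = (1, 1):
#   min -d1 - d2 + 0.5(d1^2 + d2^2)
#   s.t. d1/3 + d2/3 <= 2/3,  -d1 <= 1,  -d2 <= 1
c = [-1.0, -1.0]
A = np.array([[1/3, -1.0,  0.0],
              [1/3,  0.0, -1.0]])      # columns are the constraint gradients
b = [2/3, 1.0, 1.0]
print(solve_qp_subproblem(c, A=A, b=b))   # about (1, 1), as found in that example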
Chen CL
81
Quadratic Programming (QP) Subproblem: Example
Min:   f(x) = 2 x1^3 + 15 x2^2 − 8 x1 x2 − 4 x1
s.t.   h(x) = x1^2 + x1 x2 + 1 = 0
       g(x) = x1 − (1/4) x2^2 − 1 ≤ 0
       x^(0) = (1, 1)
Opt:   x* = A: (1, −2)  or  B: (−1, 2),     f(x*) = 74 (A),  78 (B)
Chen CL
82
c = ∇f = ( 6 x1^2 − 8 x2 − 4,  30 x2 − 8 x1 ) |_{x^(0)} = ( −6, 22 )
    ∇h = ( 2 x1 + x2,  x1 ) |_{x^(0)} = ( 3, 1 )
    ∇g = ( 1,  −x2/2 ) |_{x^(0)} = ( 1, −0.5 )
    f(1, 1) = 5,    h(1, 1) = 3 ≠ 0,    g(1, 1) = −0.25 < 0
Chen CL
83
LP subproblem:
Min:   f̄ = −6 d1 + 22 d2
s.t.   h̄ = 3 d1 + d2 = −3
       ḡ = d1 − 0.5 d2 ≤ 0.25
50% move limits:  −0.5 ≤ d1, d2 ≤ 0.5  ⇒  infeasible (region HIJK)
100% move limits: −1 ≤ d1, d2 ≤ 1  ⇒  L:  d1 = −2/3,  d2 = −1,   f̄ = −18
Chen CL
84
QP subproblem:
Min:   f̄ = −6 d1 + 22 d2 + 0.5 ( d1^2 + d2^2 )
s.t.   h̄ = 3 d1 + d2 = −3
       ḡ = d1 − 0.5 d2 ≤ 0.25
⇒  G:  d1 = −0.5,  d2 = −1.5,   f̄ = −28.75
Chen CL
85
Solution of QP Problems
Min:   q(x) = cᵀ x + (1/2) xᵀ H x     (H = I for the QP subproblem)
s.t.   Aᵀ x ≤ b_{m×1}
       Nᵀ x = e_{p×1}
       x ≥ 0_{n×1}
⇒  Aᵀ x + s = b_{m×1}    (slack variables s ≥ 0)
    −x ≤ 0
⇒  L = cᵀ x + (1/2) xᵀ H x + uᵀ ( Aᵀ x + s − b ) + vᵀ ( Nᵀ x − e ) − ζᵀ x
Chen CL
86
KTC:
    c + H x + A u − ζ + N ( y − z ) = 0,     v = y − z,   y, z ≥ 0
    Aᵀ x + s − b = 0
    Nᵀ x − e = 0
    uj sj = 0,    ζi xi = 0
    sj, uj, ζi ≥ 0,    j = 1 ∼ m,   i = 1 ∼ n
Chen CL
87
⇓
    [  H    A        −I_{n×n}   0_{n×m}   N        −N       ]  [ x ]      [ −c ]
    [  Aᵀ   0_{m×m}   0_{m×n}   I_{m×m}   0_{m×p}   0_{m×p}  ]  [ u ]   =  [  b ]
    [  Nᵀ   0_{p×m}   0_{p×n}   0_{p×m}   0_{p×p}   0_{p×p}  ]  [ ζ ]      [  e ]
                                                                [ s ]
                                                                [ y ]
                                                                [ z ]
⇒  B_{(n+m+p)×2(n+m+p)} X_{2(n+m+p)×1} = D_{(n+m+p)×1}
    xi ζi = 0,  uj sj = 0   ⇒   Xi X_{n+m+i} = 0,   i = 1 ∼ n + m
    Xi ≥ 0,   i = 1 ∼ 2(n + m + p)
Chen CL
88
Two-Phase Simplex Method to Solve the QP Problem
    B X + Y = D,     Y: artificial variables
    w = Σ_{i=1}^{n+m+p} Yi = Σ_{i=1}^{n+m+p} ( Di − Σ_{j=1}^{2(n+m+p)} Bij Xj )
      = w0 + Σ_{j=1}^{2(n+m+p)} Cj Xj
Note: Xi and X_{n+m+i} must not be basic variables simultaneously
Chen CL
89
Procedures:
Step 1: find B
Step 2: define D; rearrange so that Di ≥ 0
Step 3: calculate w0 and the Cj's
Step 4: complete Phase I
Step 5: recover the optimum values of the original QP variables x, the Lagrange multipliers u, v, ζ, and the slack variables s
Chen CL
90
Solution of a QP Problem: Example
Min:   f(x) = (x1 − 3)^2 + (x2 − 3)^2 = x1^2 − 6 x1 + x2^2 − 6 x2 (+18)
s.t.   g(x) = x1 + x2 ≤ 4
       h(x) = x1 − 3 x2 = 1
       x1, x2 ≥ 0
⇓
q(x) = [ −6  −6 ] [ x1 ; x2 ] + 0.5 [ x1  x2 ] [ 2  0 ; 0  2 ] [ x1 ; x2 ]
c = [ −6 ; −6 ],   H = [ 2  0 ; 0  2 ],   A = [ 1 ; 1 ],   b = [ 4 ],   N = [ 1 ; −3 ],   e = [ 1 ]
Chen CL
91
B = [ 2    0    1   −1    0    0    1   −1
      0    2    1    0   −1    0   −3    3
      1    1    0    0    0    1    0    0
      1   −3    0    0    0    0    0    0 ],        D = [ 6 ; 6 ; 4 ; 1 ]
X = [ x1  x2  u1  ζ1  ζ2  s1  y1  z1 ]ᵀ
Complementarity:   X1 X4 = 0,   X2 X5 = 0,   X3 X6 = 0
⇒  X1 = 13/4,   X2 = 3/4,   X3 = 3/4,   X8 = 5/4,    X4 = X5 = X6 = X7 = 0
⇒  x1 = 13/4,   x2 = 3/4,   u1 = 3/4,   ζ1 = 0,   ζ2 = 0
    s1 = 0,   y1 = 0,   z1 = 5/4,   v1 = y1 − z1 = −5/4
    f(13/4, 3/4) = 41/8
Chen CL
Basic Y 1 Y 2 Y 3 Y 4 w Y 1 Y 2 Y 3 X 1 w X 2 Y 2 Y 3 X 1 w
−
−
−
92
−w
0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
X 1 2 0 1 1 4 0 0 0 1 0 0 0 0 1 0
−
X 2 X 3 X 4 X 5 X 6 X 7 X 8 Y 1 Y 2 Y 3 Y 4 0 1 1 0 0 1 1 1 0 0 0 2 1 0 1 0 3 3 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 3 0 0 0 0 0 0 0 0 0 1 0 2 1 1 1 2 2 0 0 0 0 6 1 1 0 0 1 1 1 0 0 2 2 1 0 1 0 3 3 0 1 0 0 4 0 0 0 1 0 0 0 0 1 1 3 0 0 0 0 0 0 0 0 0 1 12 2 1 1 1 2 2 0 0 0 4 1 1 1 1 1 16 16 0 0 0 0 6 6 6 3 2 1 10 10 1 2 0 3 3 1 0 1 0 3 3 3 3 2 2 2 1 0 23 23 0 1 0 1 3 3 3 3 1 1 1 0 12 12 0 0 0 0 0 2 2 2 0 0 1 1 1 4 4 2 0 0 0
−
−
−
− − −
−
−
−
− −
−
−
−
−
−
−
−
− −
−
− − −
− − − −
− −
− −
−
D 6 6 4 1 17 4 6 3 1 13
−
−
2 3 14 3 1 3
3 5
−
Chen CL
93
Basic X 2 Y 2 X 8 X 1 w
−w
X 2 X 3 X 8 X 1 w
0 0 0 0 1
−
−
0 0 0 0 1
X 1 X 2 X 3 X 4 X 5 X 6 X 7 X 8 Y 1 Y 2 Y 3 Y 4 D 0 1 0 0 0 14 0 0 0 0 14 14 34 0 0 4 3 1 5 0 0 3 1 5 1 3 0 0 1 1 0 32 1 1 1 0 32 12 12 1 0 0 0 0 34 0 0 0 0 34 14 13 4 0 0 4 3 1 5 0 0 2 0 6 2 3
− − −
−
0 0 0 1 0
1 0 0 0 0
−
0 1 0 0 0
0
0
−
1 4 5 4 1 4 3 4
0 0 1 0 0
−34 −14 − 1 1 − − 4 4 0 0
0 0
0
− − −
−
0 0 1 0 0
−
0
0
3 4 1 4
1 4 1 4
0 1
0 1
−
−14 − −14
−
1 4 5 4 1 4 3 4
1 4 1 4
3 4 3 4 5 4 13 4
1
1
0
Chen CL
94
Sequential Quadratic Programming (SQP)
Constrained Steepest Descent Method
Min:   f̄ = cᵀ x + (1/2) xᵀ x   ⇒   d = −c    (steepest descent direction if no constraint)
⇓  the constrained QP subproblem is solved sequentially:
Min:   f̄ = cᵀ d + 0.5 dᵀ d
s.t.   Nᵀ d = e
       Aᵀ d ≤ b
⇒  d:  modify −c to satisfy the constraints
Q:  Descent function?    Step size?
Chen CL
95
CSD: Pshenichny's Descent Function
    Φ(x) = f(x) + R V(x);      Φk = Φ(x^(k)) = f(x^(k)) + R V(x^(k)) = fk + R Vk
    Vk = max { 0;  |h1|, ..., |hp|;  g1, ..., gm }  ≥  0    (maximum constraint violation)
    R  ≥  rk = Σ_{i=1}^{p} |vi^(k)| + Σ_{i=1}^{m} ui^(k)
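A direct transcription of Φ and V is shown below (illustrative Python; the function signatures are assumptions), using the numbers of the descent-function example that follows.

# Pshenichny's descent function: Phi(x) = f(x) + R * V(x),
# V(x) = max{0; |h_1|,...,|h_p|; g_1,...,g_m} is the maximum constraint violation.
def max_violation(x, gs, hs):
    vals = [0.0]
    vals += [abs(h(x)) for h in hs]
    vals += [g(x) for g in gs]
    return max(vals)

def descent_function(x, f, gs, hs, R):
    return f(x) + R * max_violation(x, gs, hs)

# Data from the example: f = x1^2 + 320 x1 x2, R = 10000, x(0) = (40, 0.5)
f  = lambda x: x[0]**2 + 320*x[0]*x[1]
g1 = lambda x: x[0]/(60*x[1]) - 1
g2 = lambda x: 1 - x[0]*(x[0] - x[1])/3600
g3 = lambda x: -x[0]
g4 = lambda x: -x[1]
x0 = (40.0, 0.5)
print(descent_function(x0, f, [g1, g2, g3, g4], [], R=10000))   # about 13611, as in the example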
Chen CL
96
Note: the Lagrange function is a lower bound on the descent function
Min:   f(x)
s.t.   hi(x) = 0,   i = 1, ..., p
       gi(x) ≤ 0,   i = 1, ..., m
⇒  L(x) = f(x) + Σ_{i=1}^{p} vi hi + Σ_{i=1}^{m} ui gi
        ≤ f(x) + Σ_{i=1}^{p} |vi| |hi| + Σ_{i=1}^{m} ui gi
        ≤ f(x) + [ Σ_{i=1}^{p} |vi| + Σ_{i=1}^{m} ui ] · max{ 0; |h1|, ..., |hp|; g1, ..., gm }    ( = f(x) + r V )
        ≤ f(x) + R V = Φ(x),      R = max{ R0, r }
Chen CL
97
CSD: Descent Function Example Min: s.t.
f (x) = x21 + 320x1x2 g1(x) =
x1 60x2
g2(x) = 1
−x1, −x2 ≤
−1≤0
x −x ) − x (3600 ≤0
0,
1
1
2
x(0) = (40, 0.5),
R = 10000
f (x(0)) = (40)2 + 320(40)(0.5) = 8000 g1(x(0)) = g2(x(0)) = g3 = V 0 =
40 1 = 0.333 > 0, 60(0.5) .5) 1 40(40−0 = 0.5611 > 3600
−
(violation)
− 0, (violation) −40 < 0, g4 = −0.5 < 0, (inactive) max {0; 0.333, 0.5611, −40, −0.5} = 0.5611
Φ0 = f 0 + RV 0
= 8000 + (10000)(0.5611) = 13611
Chen CL
98
CSD: Step Size Determination
Use inexact step size (for efficiency, exact line search is not necessary)
Trial step size tj : j = 0
t0 =
j = 1
t1 =
j = 2 ..
t2 =
1 0 2 1 1 2 1 2 2
= 1 = =
1 2 1 4
Chen CL
99
Determine the step size as αk = tj, with j the smallest integer satisfying the descent condition:
    x^(k+1,j) = x^(k) + tj d^(k)
    Φ_{k+1,j} = f_{k+1,j} + R V_{k+1,j}  ≤  Φk − tj βk
    βk = γ ||d^(k)||^2,     0 < γ ≤ 1
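The rule αk = tj = (1/2)^j with the descent condition Φ_{k+1,j} ≤ Φk − tj βk can be written as a short loop (illustrative Python; phi stands for the descent function from the earlier sketch and the argument names are assumptions):

# Inexact step size for CSD: try t_j = (1/2)^j until the descent condition holds.
def csd_step_size(phi, x, d, beta, max_halvings=20):
    # phi(x): descent function Phi = f + R*V evaluated at a point
    # beta = gamma * ||d||^2 with 0 < gamma <= 1
    phi_k = phi(x)
    t = 1.0
    x_trial = list(x)
    for _ in range(max_halvings):
        x_trial = [xi + t*di for xi, di in zip(x, d)]
        if phi(x_trial) <= phi_k - t*beta:     # descent condition satisfied
            return t, x_trial
        t *= 0.5                               # t_j = (1/2)^j
    return t, x_trial

In the worked example that follows, this loop rejects t0 = 1 (27912 + 328 > 21624) and accepts t1 = 0.5.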
Chen CL
100
Effect of the parameter γ on step size determination: a larger γ tends to reduce the step size required to satisfy the descent condition
Chen CL
101
CSD: Step Size Example
Min:   f(x) = x1^2 + 320 x1 x2
s.t.   g1(x) = x1/(60 x2) − 1 ≤ 0
       g2(x) = 1 − x1 ( x1 − x2 )/3600 ≤ 0
       g3(x) = −x1 ≤ 0
       g4(x) = −x2 ≤ 0,     x^(0) = (40, 0.5)
⇒  d^(0) = (25.6, 0.45),    u = (4880, 19400, 0, 0),    γ = 0.5
    R = Σ ui = 4880 + 19400 + 0 + 0 = 24280
    β = 0.5 ( 25.6^2 + 0.45^2 ) = 328
Chen CL
102
f(x^(0)) = (40)^2 + 320 (40)(0.5) = 8000
g1(x^(0)) = 40/( 60 × 0.5 ) − 1 = 0.333 > 0    (violation)
g2(x^(0)) = 1 − 40 ( 40 − 0.5 )/3600 = 0.5611 > 0    (violation)
g3(x^(0)) = −40 < 0    (inactive)
g4(x^(0)) = −0.5 < 0   (inactive)
V0 = max { 0; 0.333, 0.5611, −40, −0.5 } = 0.5611
Φ0 = f0 + R V0 = 8000 + (24280)(0.5611) = 21624
Chen CL
103
Initial trial:   t0 = 1   (j = 0)
    x1^(1,0) = x1^(0) + t0 d1^(0) = 40 + (1)(25.6) = 65.6
    x2^(1,0) = x2^(0) + t0 d2^(0) = 0.5 + (1)(0.45) = 0.95
    f_{1,0} = f(x^(1,0)) = (65.6)^2 + 320 (65.6)(0.95) = 24246
    g1(x^(1,0)) = 65.6/( 60 × 0.95 ) − 1 = 0.151 > 0    (violation)
    g2(x^(1,0)) = 1 − 65.6 ( 65.6 − 0.95 )/3600 = −0.1781 < 0    (inactive)
    g3(x^(1,0)) = −65.6 < 0    (inactive)
    g4(x^(1,0)) = −0.95 < 0   (inactive)
    V_{1,0} = max { 0; 0.151, −0.1781, −65.6, −0.95 } = 0.151
    Φ_{1,0} = f_{1,0} + R V_{1,0} = 24246 + (24280)(0.151) = 27912
    LHS = 27912 + 328 = 28240 > 21624 = RHS!   ⇒   descent condition not satisfied
Chen CL
104
Second trial:   t1 = 0.5   (j = 1)
    x1^(1,1) = x1^(0) + t1 d1^(0) = 40 + (0.5)(25.6) = 52.8
    x2^(1,1) = x2^(0) + t1 d2^(0) = 0.5 + (0.5)(0.45) = 0.725
    f_{1,1} = f(x^(1,1)) = (52.8)^2 + 320 (52.8)(0.725) = 15037
    g1(x^(1,1)) = 52.8/( 60 × 0.725 ) − 1 = 0.2138 > 0    (violation)
    g2(x^(1,1)) = 1 − 52.8 ( 52.8 − 0.725 )/3600 = 0.2362 > 0    (violation)
    g3(x^(1,1)) = −52.8 < 0    (inactive)
    g4(x^(1,1)) = −0.725 < 0   (inactive)
    V_{1,1} = max { 0; 0.2138, 0.2362, −52.8, −0.725 } = 0.2362
    Φ_{1,1} = f_{1,1} + R V_{1,1} = 15037 + (24280)(0.2362) = 20772
    LHS = 20772 + (0.5)(328) = 20936 < 21624 = RHS   ⇒   accept the step size
Chen CL
105
SQP: CSD Algorithm
Step 1: set k = 0; choose x^(0), R0 (= 1), 0 < γ (= 0.2) < 1, and tolerances ε1, ε2
Step 2: evaluate f(x^(k)), hj(x^(k)), gj(x^(k));  Vk;  ∇f(x^(k)), ∇hj(x^(k)), ∇gj(x^(k))
Step 3: define and solve the QP subproblem  ⇒  d^(k), v^(k), u^(k)
Step 4: stop if ||d^(k)|| ≤ ε2 and Vk ≤ ε1
Step 5: calculate rk (sum of the Lagrange multipliers); set R = max{ Rk, rk }
Step 6: set x^(k+1) = x^(k) + αk d^(k)  (αk from the inexact minimization of Φ(x) = f(x) + R V(x) along d^(k))
Chen CL
106
Step 7: save Rk+1 = R, set k = k + 1 and go to Step 2
The CSD algorithm, along with the foregoing step size determination procedure, is convergent provided the second derivatives of all the functions are piecewise continuous (Lipschitz condition) and the set of design points x^(k) is bounded as follows:  Φ(x^(k)) ≤ Φ(x^(0))
Chen CL
107
SQP: CSD Example
Min:   f(x) = x1^2 + x2^2 − 3 x1 x2
s.t.   g1(x) = (1/6) x1^2 + (1/6) x2^2 − 1 ≤ 0
       g2(x) = −x1 ≤ 0
       g3(x) = −x2 ≤ 0,     x^(0) = (1, 1)
       R0 = 10,  γ = 0.5,  ε1 = ε2 = 0.001
x* = (√3, √3),    u* = (3, 0, 0),    f* = −3
Chen CL
108
Iteration 1 (k = 0)
Step 1: k = 0;  x^(0) = (1, 1),  R0 = 10,  γ = 0.5;  ε1 = ε2 = 0.001
Step 2: f(x^(0)) = −1          g1(x^(0)) = −2/3
        g2(x^(0)) = −1         g3(x^(0)) = −1    (all inactive)
        ∇f(x^(0)) = (−1, −1)    ∇g1(x^(0)) = (1/3, 1/3)
        ∇g2(x^(0)) = (−1, 0)    ∇g3(x^(0)) = (0, −1)
        V0 = 0
Chen CL
linearized constraints and linearized constraint set
109
Chen CL
110
Step 3: define and solve the QP subproblem  ⇒  d^(k), v^(k), u^(k)
Min:   f̄ = ( −d1 − d2 ) + 0.5 ( d1^2 + d2^2 )
s.t.   ḡ1 = (1/3) d1 + (1/3) d2 ≤ 2/3
       ḡ2 = −d1 ≤ 1,     ḡ3 = −d2 ≤ 1
L = ( −d1 − d2 ) + 0.5 ( d1^2 + d2^2 ) + u1 [ (1/3)( d1 + d2 − 2 ) + s1^2 ] + u2 ( −d1 − 1 + s2^2 ) + u3 ( −d2 − 1 + s3^2 )
    ∂L/∂d1 = −1 + d1 + (1/3) u1 − u2 = 0
    ∂L/∂d2 = −1 + d2 + (1/3) u1 − u3 = 0
    (1/3)( d1 + d2 − 2 ) + s1^2 = 0
    −d1 − 1 + s2^2 = 0
    −d2 − 1 + s3^2 = 0
    ui si = 0,   ui ≥ 0,   i = 1, 2, 3
⇒  d^(0) = (1, 1)  (point D),    u^(0) = (0, 0, 0),    f̄ = −1
Chen CL
111
Step 4: ||d^(0)|| = √2 ≥ ε2, continue
Step 5: r0 = Σ ui = 0,    R = max{ R0, r0 } = max{ 10, 0 } = 10
Chen CL
112
Step 6:
    Φ0 = f0 + R V0 = −1 + (10)(0) = −1
    β0 = γ ||d^(0)||^2 = 0.5 (1 + 1) = 1
    x^(1,0) = x^(0) + t0 d^(0) = (2, 2)     (t0 = 1)
    f_{1,0} = f(2, 2) = −4
    V_{1,0} = V(2, 2) = max { 0; 1/3, −2, −2 } = 1/3
    Φ_{1,0} = f_{1,0} + R V_{1,0} = −4 + (10)(1/3) = −2/3
    Φ0 − t0 β0 = −1 − 1 = −2 < Φ_{1,0}   ⇒   t1 = 0.5
    x^(1,1) = x^(0) + t1 d^(0) = (1.5, 1.5)     (t1 = 0.5)
    f_{1,1} = f(1.5, 1.5) = −2.25
    V_{1,1} = V(1.5, 1.5) = max { 0; −1/4, −1.5, −1.5 } = 0
    Φ_{1,1} = f_{1,1} + R V_{1,1} = −2.25 + (10)(0) = −2.25
    Φ0 − t1 β0 = −1 − 0.5 = −1.5 > Φ_{1,1}   ⇒   α0 = t1 = 0.5,    x^(1) = (1.5, 1.5)
Chen CL
113
Step 7: R1 = R0 = 10, k = 1, go to Step 2
Iteration 2 (k = 1)
Step 2: f(x^(1)) = −2.25         g1(x^(1)) = −0.25
        g2(x^(1)) = −1.5         g3(x^(1)) = −1.5    (inactive)
        ∇f(x^(1)) = (−1.5, −1.5)  ∇g1(x^(1)) = (0.5, 0.5)
        ∇g2(x^(1)) = (−1, 0)      ∇g3(x^(1)) = (0, −1)
        V1 = 0
Chen CL
114
Step 3: define and solve the QP subproblem  ⇒  d^(k), v^(k), u^(k)
Min:   f̄ = ( −1.5 d1 − 1.5 d2 ) + 0.5 ( d1^2 + d2^2 )
s.t.   ḡ1 = 0.5 d1 + 0.5 d2 ≤ 0.25
       ḡ2 = −d1 ≤ 1.5,     ḡ3 = −d2 ≤ 1.5
⇒  d^(1) = (0.25, 0.25)  (point D),    u^(1) = (2.5, 0, 0)
Step 4: ||d^(1)|| = 0.3535 ≥ ε2, continue
Step 5: r1 = Σ ui = 2.5,    R = max{ R1, r1 } = max{ 10, 2.5 } = 10
Chen CL
115
Step 6:
    Φ1 = f1 + R V1 = −2.25 + (10)(0) = −2.25
    β1 = γ ||d^(1)||^2 = 0.5 (0.125) = 0.0625
    x^(2,0) = x^(1) + t0 d^(1) = (1.75, 1.75)     (t0 = 1)
    f_{2,0} = f(1.75, 1.75) = −3.0625
    V_{2,0} = V(1.75, 1.75) = max { 0; 0.0208, −1.75, −1.75 } = 0.0208
    Φ_{2,0} = f_{2,0} + R V_{2,0} = −3.0625 + (10)(0.0208) = −2.8541
    Φ1 − t0 β1 = −2.25 − (1)(0.0625) = −2.3125 > Φ_{2,0}   ⇒   α1 = t0 = 1.0,    x^(2) = (1.75, 1.75)
Step 7: R2 = R1 = 10, k = 2, go to Step 2
Chen CL
116
SQP: Effect of γ in the CSD Method
Same as the last example, case 1: γ = 0.5 ⇒ 0.01
Steps 1 ∼ 5: same as before
Chen CL
117
Step 6, Iteration 1:
    Φ0 = f0 + R V0 = −1 + (10)(0) = −1
    β0 = γ ||d^(0)||^2 = 0.01 (1 + 1) = 0.02
    x^(1,0) = x^(0) + t0 d^(0) = (2, 2)     (t0 = 1)
    f_{1,0} = f(2, 2) = −4
    V_{1,0} = V(2, 2) = max { 0; 1/3, −2, −2 } = 1/3
    Φ_{1,0} = −4 + (10)(1/3) = −2/3
    Φ0 − t0 β0 = −1 − (1)(0.02) = −1.02 < Φ_{1,0}   ⇒   t1 = 0.5
    x^(1,1) = x^(0) + t1 d^(0) = (1.5, 1.5)     (t1 = 0.5)
    f_{1,1} = f(1.5, 1.5) = −2.25
    V_{1,1} = V(1.5, 1.5) = max { 0; −1/4, −1.5, −1.5 } = 0
    Φ_{1,1} = −2.25 + (10)(0) = −2.25
    Φ0 − t1 β0 = −1 − (0.5)(0.02) = −1.01 > Φ_{1,1}   ⇒   α0 = t1 = 0.5,    x^(1) = (1.5, 1.5)
Chen CL
118
Step 6, Iteration 2:
    Φ1 = f1 + R V1 = −2.25 + (10)(0) = −2.25
    β1 = γ ||d^(1)||^2 = 0.02 (0.125) = 0.0025
    x^(2,0) = x^(1) + t0 d^(1) = (1.75, 1.75)     (t0 = 1)
    f_{2,0} = f(1.75, 1.75) = −3.0625
    V_{2,0} = V(1.75, 1.75) = max { 0; 0.0208, −1.75, −1.75 } = 0.0208
    Φ_{2,0} = −3.0625 + (10)(0.0208) = −2.8541
    Φ1 − t0 β1 = −2.25 − (1)(0.0025) = −2.2525 > Φ_{2,0}   ⇒   α1 = t0 = 1.0,    x^(2) = (1.75, 1.75)
A smaller value of γ has no effect on the first two iterations
Chen CL
119
Same as the last example, case 2: γ = 0.5 ⇒ 0.9
Iteration 2: step size t1 = 0.5  ⇒  x^(2) = (1.625, 1.625),   f2 = −2.641,   g1 = −0.1198,   V1 = 0
A larger γ value results in a smaller step size, and the new design point remains strictly feasible
Chen CL
120
SQP: Effect of R in the CSD Method
Same as the last example, case 1: R0 = 10 ⇒ 1
Iteration 1
Steps 1 ∼ 5: same as before
Step 6:
    Φ0 = f0 + R V0 = −1 + (1)(0) = −1
    β0 = γ ||d^(0)||^2 = 0.5 (1 + 1) = 1
    x^(1,0) = x^(0) + t0 d^(0) = (2, 2)     (t0 = 1)
    f_{1,0} = f(2, 2) = −4
    V_{1,0} = V(2, 2) = max { 0; 1/3, −2, −2 } = 1/3
    Φ_{1,0} = −4 + (1)(1/3) = −11/3
    Φ0 − t0 β0 = −1 − (1)(1) = −2 > Φ_{1,0}   ⇒   α0 = t0 = 1,    x^(1) = (2, 2)
Chen CL
121
Iteration 2
Step 2: f(x^(1)) = −4           g1(x^(1)) = 1/3    (violated)
        g2(x^(1)) = −2          g3(x^(1)) = −2     (inactive)
        ∇f(x^(1)) = (−2, −2)     ∇g1(x^(1)) = (2/3, 2/3)
        ∇g2(x^(1)) = (−1, 0)     ∇g3(x^(1)) = (0, −1)
        V1 = 1/3
Step 3: define and solve the QP subproblem  ⇒  d^(k), v^(k), u^(k)
Min:   f̄ = ( −2 d1 − 2 d2 ) + 0.5 ( d1^2 + d2^2 )
s.t.   ḡ1 = (2/3) d1 + (2/3) d2 ≤ −1/3
       ḡ2 = −d1 ≤ 2,     ḡ3 = −d2 ≤ 2
⇒  d^(1) = (−0.25, −0.25)  (point D),    u^(1) = (27/8, 0, 0)
Chen CL
122
Step 4: ||d^(1)|| = 0.3535 ≥ ε2, continue
Step 5: r1 = Σ ui = 27/8,    R = max{ R1, r1 } = max{ 1, 27/8 } = 27/8
Step 6:
    Φ1 = f1 + R V1 = −4 + (27/8)(1/3) = −2.875
    β1 = γ ||d^(1)||^2 = 0.5 (0.3535)^2 = 0.0625
    x^(2,0) = x^(1) + t0 d^(1) = (1.75, 1.75)     (t0 = 1)
    f_{2,0} = f(1.75, 1.75) = −3.0625
    V_{2,0} = V(1.75, 1.75) = max { 0; 0.0208, −1.75, −1.75 } = 0.0208
    Φ_{2,0} = f_{2,0} + R V_{2,0} = −3.0625 + (27/8)(0.0208) = −2.9923
    Φ1 − t0 β1 = −2.875 − (1)(0.0625) = −2.9375 > Φ_{2,0}   ⇒   α1 = t0 = 1.0,    x^(2) = (1.75, 1.75)
Step 7: R2 = R1 = 27/8, k = 2, go to Step 2
A smaller value of R gives a larger step size in the first iteration, but the result at the end of iteration 2 is the same as before
Chen CL
123
Observations on the CSD Algorithm
CSD algorithm is a 1st-order method for constrained optimization
CSD can treat equality as well as inequality constraints
Golden section search may be used to find the step size by minimizing the descent function instead of trying to satisfy the descent function (not suggested as it is inefficient)
Rate of convergence of CSD can be improved by including higher-order information about the problem functions in the QP subproblem
Here the step size is not allowed to be greater than one; in practice, the step size can be larger than one
Numerical uncertainties in selection of parameters γ, R0
Starting point can affect performance of the algorithm
Chen CL
124
SQP: Constrained Quasi-Newton Methods
To introduce curvature information for the Lagrange function into the quadratic cost function ⇒ a new QP subproblem
Min:   f(x)
s.t.   hi(x) = 0,   i = 1 ∼ p
⇒  L(x, v) = f(x) + Σ_{i=1}^{p} vi hi(x) = f(x) + vᵀ h(x)
KTC:
    ∇L(x, v) = ∇f(x) + Σ_{i=1}^{p} vi ∇hi(x) = ∇f(x) + N v = 0
    h(x) = 0
Note:  N = [ ∂h1/∂x1   ···   ∂hp/∂x1
             ...
             ∂h1/∂xn   ···   ∂hp/∂xn ]_{n×p}
Chen CL
125
    F(y) = [ ∇L(x, v) ; h(x) ] = 0,      y = [ x ; v ]
Newton:   ∇Fᵀ(y^(k)) Δy^(k) = −F(y^(k))
⇒  [ ∇²L   N ]  [ Δx^(k) ]      [ −∇L ]
    [ Nᵀ    0 ]  [ Δv^(k) ]  =   [ −h  ]      (evaluated at iteration k)
⇒  ∇²L Δx^(k) + N Δv^(k) = −∇L = −∇f(x^(k)) − N v^(k)
    with  v^(k+1) = v^(k) + Δv^(k):
    ∇²L Δx^(k) + N v^(k+1) = −∇f(x^(k))
    Nᵀ Δx^(k) = −h
⇒  [ ∇²L   N ]  [ Δx^(k)   ]      [ −∇f ]
    [ Nᵀ    0 ]  [ v^(k+1) ]  =   [ −h  ]
solve these equations to obtain Δx^(k) and v^(k+1)
Chen CL
126
Note: this is the same as solving the KTC of the following constrained (quadratic) problem:
Min:   ∇fᵀ Δx + 0.5 Δxᵀ ∇²L Δx
s.t.   h + Nᵀ Δx = 0
⇒  L̄ = ∇fᵀ Δx + 0.5 Δxᵀ ∇²L Δx + vᵀ ( h + Nᵀ Δx )
KTC:
    ∇L̄ = ∇f + ∇²L Δx + N v = 0
    h + Nᵀ Δx = 0
⇒  [ ∇²L   N ]  [ Δx ]      [ −∇f ]
    [ Nᵀ    0 ]  [ v  ]  =   [ −h  ]
Chen CL
127
Now the solution Δx should be treated as a search direction d, with the step size determined by minimizing an appropriate descent function, to obtain a convergent algorithm ⇒ the new QP subproblem:
Min:   f̄ = cᵀ d + (1/2) dᵀ H d
s.t.   h̄ = Nᵀ d = e
       ḡ = Aᵀ d ≤ b
Chen CL
128
Quasi-Newton Hessian Approximation
    H^(k+1) = H^(k) + D^(k) − E^(k)
    s^(k) = αk d^(k)    (change in design)
    z^(k) = H^(k) s^(k)
    y^(k) = ∇L(x^(k+1), u^(k), v^(k)) − ∇L(x^(k), u^(k), v^(k))
    ξ1 = s^(k) · y^(k),     ξ2 = s^(k) · z^(k)
    θ = 1  if ξ1 ≥ 0.2 ξ2;   otherwise  θ = 0.8 ξ2 / ( ξ2 − ξ1 )
    w^(k) = θ y^(k) + (1 − θ) z^(k),     ξ3 = s^(k) · w^(k)
    D^(k) = (1/ξ3) w^(k) w^(k)ᵀ,     E^(k) = (1/ξ2) z^(k) z^(k)ᵀ
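These update formulas translate directly into code; the sketch below is an illustrative Python version (the function name and the way ∇L is supplied are assumptions).

# Damped BFGS update of the Lagrangian Hessian approximation H(k) -> H(k+1).
import numpy as np

def update_hessian(H, s, y):
    # s = alpha_k * d(k) (design change); y = grad L(x(k+1)) - grad L(x(k)) at fixed multipliers
    s, y = np.asarray(s, float), np.asarray(y, float)
    z = H @ s
    xi1, xi2 = s @ y, s @ z
    theta = 1.0 if xi1 >= 0.2 * xi2 else 0.8 * xi2 / (xi2 - xi1)
    w = theta * y + (1.0 - theta) * z        # damping keeps the update positive definite
    xi3 = s @ w
    D = np.outer(w, w) / xi3
    E = np.outer(z, z) / xi2
    return H + D - E

# Using the numbers from the modified CSD example below: s(0) = (0.5, 0.5), y(0) = (-0.5, -0.5)
H = update_hessian(np.eye(2), s=[0.5, 0.5], y=[-0.5, -0.5])
print(H)    # reproduces H(1) = [[0.6, -0.4], [-0.4, 0.6]], as in that example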
Chen CL
129
SQP: Modified CSD Algorithm
PLBA (Pshenichny-Lim-Belegundu-Arora) method
Step 1: same as CSD; let H^(0) = I
Step 2: calculate the functions, gradients, and maximum constraint violation; update the Hessian approximation if k > 0
Step 3: solve the QP subproblem for d^(k), u^(k), v^(k)
Steps 4-7: same as CSD
130
SQP: Modified CSD Algorithm Example
Min:   f(x) = x1^2 + x2^2 − 3 x1 x2
s.t.   g1(x) = (1/6) x1^2 + (1/6) x2^2 − 1 ≤ 0
       g2(x) = −x1 ≤ 0
       g3(x) = −x2 ≤ 0,     x^(0) = (1, 1)
       R0 = 10,  γ = 0.5,  ε1 = ε2 = 0.001
x* = (√3, √3),    u* = (3, 0, 0),    f* = −3
Chen CL
131
Iteration 1: same as before
    d^(0) = (1, 1),   u^(0) = (0, 0, 0),   α0 = 0.5,   x^(1) = (1.5, 1.5),   R1 = 10,   H^(0) = I
Iteration 2
Step 2: at x^(1) = (1.5, 1.5):
    f = −2.25,    g1 = −0.25,    g2 = g3 = −1.5
Chen CL
132
    ∇f = (−1.5, −1.5),    ∇g1 = (0.5, 0.5),    ∇g2 = (−1, 0),    ∇g3 = (0, −1)
    s^(0) = α0 d^(0) = (0.5, 0.5)
    z^(0) = H^(0) s^(0) = (0.5, 0.5)
    y^(0) = ∇f(x^(1)) − ∇f(x^(0)) = (−0.5, −0.5)
    ξ1 = s^(0) · y^(0) = −0.5,     ξ2 = s^(0) · z^(0) = 0.5
    θ  = 0.8 (0.5)/(0.5 + 0.5) = 0.4
    w^(0) = 0.4 (−0.5, −0.5) + (1 − 0.4)(0.5, 0.5) = (0.1, 0.1)
    ξ3 = s^(0) · w^(0) = 0.1
    D^(0) = [ 0.1  0.1 ; 0.1  0.1 ],     E^(0) = [ 0.5  0.5 ; 0.5  0.5 ]
    H^(1) = [ 1  0 ; 0  1 ] + [ 0.1  0.1 ; 0.1  0.1 ] − [ 0.5  0.5 ; 0.5  0.5 ] = [ 0.6  −0.4 ; −0.4  0.6 ]
Chen CL
133
Step 3: QP subproblem
Min:   f̄ = −1.5 d1 − 1.5 d2 + 0.5 ( 0.6 d1^2 − 0.8 d1 d2 + 0.6 d2^2 )
s.t.   ḡ1 = 0.5 d1 + 0.5 d2 ≤ 0.25
       ḡ2 = −d1 ≤ 1.5
       ḡ3 = −d2 ≤ 1.5
⇒  d^(1) = (0.25, 0.25),    u^(1) = (2.9, 0, 0)
This is the same direction as in the previous CSD method; in general, inclusion of the approximate Hessian will give different directions and better convergence
Chen CL
134
Newton-Raphson Method to Solve Multiple Nonlinear Equations
    F(x) = 0     (n × 1)
    x^(k+1) = x^(k) + Δx^(k),    k = 0, 1, 2, ...
Stop if  ||F(x*)|| = [ Σ_{i=1}^{n} { Fi(x*) }^2 ]^{1/2} ≤ δ
Ex:   F1(x1, x2) = 0,    F2(x1, x2) = 0
    F1^(k) + (∂F1/∂x1) Δx1^(k) + (∂F1/∂x2) Δx2^(k) = 0
    F2^(k) + (∂F2/∂x1) Δx1^(k) + (∂F2/∂x2) Δx2^(k) = 0
⇒  [ ∂F1/∂x1   ∂F1/∂x2 ]  [ Δx1^(k) ]             [ 0 ]
    [ ∂F2/∂x1   ∂F2/∂x2 ]  [ Δx2^(k) ]  + F^(k)  =  [ 0 ]
    (the coefficient matrix is ∇Fᵀ = J)
⇒  Δx^(k) = −J^{−1} F^(k),    or solve these equations directly
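A generic implementation of this update, Δx^(k) = −J^{−1} F^(k), is sketched below (illustrative Python; the finite-difference Jacobian is an assumption), and it reproduces the first step of the worked example that follows.

# Newton-Raphson for a system F(x) = 0: solve J*dx = -F at each iterate.
import numpy as np

def newton_raphson(F, x0, delta=0.1, max_iter=50, eps=1e-7):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = np.asarray(F(x))
        if np.linalg.norm(Fx) <= delta:        # stop if ||F(x)|| <= delta
            break
        J = np.empty((Fx.size, x.size))        # finite-difference Jacobian
        for i in range(x.size):
            xp = x.copy(); xp[i] += eps * max(1.0, abs(x[i]))
            J[:, i] = (np.asarray(F(xp)) - Fx) / (xp[i] - x[i])
        x = x + np.linalg.solve(J, -Fx)        # dx(k) = -J^{-1} F(k)
    return x

# Worked example that follows: F1 = 1 - 4e6/(x1^2 x2), F2 = 250 - 4e6/(x1 x2^2), x(0) = (500, 1)
F = lambda x: np.array([1.0 - 4.0e6/(x[0]**2 * x[1]),
                        250.0 - 4.0e6/(x[0] * x[1]**2)])
# The first Newton step is about (651, 1.33); the loop then continues until ||F|| <= delta.
print(newton_raphson(F, [500.0, 1.0]))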
Chen CL
135
Newton-Raphson Method to Solve Multiple Nonlinear Equations: Example
    F1(x) = 1.0 − (4.0 × 10^6) / ( x1^2 x2 ) = 0
    F2(x) = 250 − (4.0 × 10^6) / ( x1 x2^2 ) = 0
    J = [ ∂F1/∂x1   ∂F1/∂x2 ]  =  4.0 × 10^6 [ 2/(x1^3 x2)      1/(x1^2 x2^2) ]
        [ ∂F2/∂x1   ∂F2/∂x2 ]                [ 1/(x1^2 x2^2)    2/(x1 x2^3)   ]
Step 1:  x^(0) = (500, 1.0),    δ = 0.1,    k = 0
Step 2:  F1 = −15,   F2 = −7750   ⇒   ||F|| = 7750 > δ
Step 3:  J^(0) = [ 8/125    16
                  16       16000 ]
Step 4:  solve   [ 8/125    16    ] [ Δx1^(k) ]   =   [ 15   ]
                 [ 16       16000 ] [ Δx2^(k) ]       [ 7750 ]
         ⇒   Δx^(0) = (151, 0.33)    ⇒   x^(1) = (651, 1.33)