---
title: "Finite-horizon LQR-optimal control as a QP"
format:
  html:
    html-math-method: katex
    code-fold: true
execute:
  enabled: false
jupyter: julia-1.10
---
Here we specialize the general procedure from the previous section to the case of a linear system and a quadratic cost. We start by considering a simple problem of *regulation*, wherein the goal is to bring the system either exactly or approximately to a zero final state, that is, $\mathbf x^\text{ref}=\mathbf 0$ and we want $\bm x_N=\mathbf x^\text{ref}$ or $\bm x_N\approx\mathbf x^\text{ref}$, respectively.
$$
\begin{aligned}
\operatorname*{minimize}_{\mathbf u_0,\ldots, \mathbf u_{N-1}, \mathbf x_{0},\ldots, \mathbf x_N} &\quad \frac{1}{2} \bm x_N^\top \mathbf S \bm x_N + \frac{1}{2} \sum_{k=0}^{N-1} \left(\bm x_k^\top \mathbf Q \bm x_k + \bm u_k^\top \mathbf R \bm u_k \right)\\
\text{subject to} &\quad \bm x_{k+1} = \mathbf A\bm x_k + \mathbf B\bm u_k,\quad k = 0, \ldots, N-1, \\
&\quad \bm x_0 = \mathbf x_0,\\
&\quad \bm x_N = \mathbf 0\; (\text{or}\, \bm x_N \approx \mathbf 0).
\end{aligned}
$$
Referring to the two options for the last constraint,
- if the condition $\bm x_N=\mathbf 0$ on the final state is strictly enforced, the terminal state cost (the term $\frac{1}{2} \bm x_N^\top \mathbf S \bm x_N$ in the cost function) is redundant and can be removed;
- if the final state condition can be relaxed to $\bm x_N\approx\mathbf 0$, it is by increasing the weight $\mathbf S$ in the terminal cost function $\frac{1}{2} \bm x_N^\top \mathbf S \bm x_N$ that $\bm x_N$ can be made arbitrarily close to $\mathbf 0$.
::: {.callout-tip}
It is a standard dilemma in optimization, not only in optimal control, that if we want to satisfy some requirement, we can either enforce it strictly through constraints, or we can relax it and instead set a cost to be paid for not satisfying it.
:::
### Simultaneous (sparse) formulation
Below we rewrite the latter problem, that is, $\bm x_N\approx\mathbf 0$, in the "unrolled" form, where we stack the state and control variables into "long" vectors $\bar{\bm x}$ and $\bar{\bm u}$. Doing the same for the former is straightforward.
$$
\begin{aligned}
\operatorname*{minimize}_{\bar{\bm u},\bar{\bm x}} & \frac{1}{2}\left(\begin{bmatrix} \bm x_1^\top & \bm x_2^\top & \ldots & \bm x_N^\top \end{bmatrix}
\underbrace{\begin{bmatrix}\mathbf Q & & & \\ & \mathbf Q & &\\ & &\ddots & \\ & & & \mathbf S \end{bmatrix}}_{\overline{\mathbf Q}}
\underbrace{\begin{bmatrix} \bm x_1 \\ \bm x_2 \\ \vdots \\ \bm x_N \end{bmatrix}}_{\bar{\bm x}}\right.\\
&\qquad +\left.
\begin{bmatrix} \bm u_0^\top & \bm u_1^\top & \ldots & \bm u_{N-1}^\top \end{bmatrix}
\underbrace{\begin{bmatrix}\mathbf R & & & \\ & \mathbf R & &\\ & &\ddots & \\ & & & \mathbf R \end{bmatrix}}_{\overline{\mathbf R}}
\underbrace{\begin{bmatrix} \bm u_0 \\ \bm u_1 \\ \vdots \\ \bm u_{N-1} \end{bmatrix}}_{\bar{\bm u}}\right)
+ \underbrace{\frac{1}{2}\mathbf x_0^\top \mathbf Q \mathbf x_0}_{\mathrm{constant}}
\end{aligned}
$$
subject to
$$
\begin{bmatrix} \bm x_1 \\ \bm x_2 \\ \bm x_3\\ \vdots \\ \bm x_N \end{bmatrix} = \underbrace{\begin{bmatrix}\mathbf 0 & & & &\\\mathbf A & \mathbf 0 & & &\\ &\mathbf A &\mathbf 0 & & \\ & & &\ddots & \\& & &\mathbf A & \mathbf 0 \end{bmatrix}}_{\overline{\mathbf A}}
\begin{bmatrix} \bm x_1 \\ \bm x_2 \\ \bm x_3\\ \vdots \\ \bm x_N \end{bmatrix} + \underbrace{\begin{bmatrix}\mathbf B & & & & \\ & \mathbf B & & & \\& &\mathbf B & \\ & & &\ddots \\ & & & & \mathbf B \end{bmatrix}}_{\overline{\mathbf B}}\begin{bmatrix} \bm u_0 \\ \bm u_1 \\ \bm u_2\\\vdots \\ \bm u_{N-1} \end{bmatrix} + \underbrace{\begin{bmatrix}\mathbf A\\\mathbf 0\\\mathbf 0\\\vdots\\\mathbf 0\end{bmatrix}}_{\overline{\mathbf A}_0}\mathbf x_0.
$$
Note that the last term in the cost function can be discarded because it is constant.
The terms with the $\bar{\bm x}$ vector can be combined and we get
$$
\begin{bmatrix} \mathbf 0 \\ \mathbf 0 \\ \mathbf 0\\ \vdots \\ \mathbf 0 \end{bmatrix} = \underbrace{\begin{bmatrix}-\mathbf I & & & &\\\mathbf A & -\mathbf I & & &\\ &\mathbf A &-\mathbf I & & \\ & & &\ddots & \\& & &\mathbf A & -\mathbf I \end{bmatrix}}_{\overline{\mathbf A} - \mathbf I}
\begin{bmatrix} \bm x_1 \\ \bm x_2 \\ \bm x_3\\ \vdots \\ \bm x_N \end{bmatrix} + \underbrace{\begin{bmatrix}\mathbf B & & & & \\ & \mathbf B & & & \\& &\mathbf B & \\ & & &\ddots \\ & & & & \mathbf B \end{bmatrix}}_{\overline{\mathbf B}}\begin{bmatrix} \bm u_0 \\ \bm u_1 \\ \bm u_2\\\vdots \\ \bm u_{N-1} \end{bmatrix} + \underbrace{\begin{bmatrix}\mathbf A\\\mathbf 0\\\mathbf 0\\\vdots\\\mathbf 0\end{bmatrix}}_{\overline{\mathbf A}_0}\mathbf x_0.
$$ {#eq-linear_system_linear_constraints_matrix_format}
Upon stacking the two "long" vectors into $\bar{\bm z}\in\mathbb{R}^{N(n+m)}$, where $n$ and $m$ are the dimensions of the state and the control input, respectively, we reformulate the optimization problem as
$$
\operatorname*{minimize}_{\bar{\bm z}\in\mathbb{R}^{N(n+m)}}\quad \frac{1}{2}\underbrace{\begin{bmatrix}\bar{\bm x}^\top &\bar{\bm u}^\top\end{bmatrix}}_{\bar{\bm z}^\top} \underbrace{\begin{bmatrix}\overline{\mathbf Q} & \\ & \overline{\mathbf R} \end{bmatrix}}_{\widetilde{\mathbf Q}}\underbrace{\begin{bmatrix}\bar{\bm x}\\\bar{\bm u}\end{bmatrix}}_{\bar{\bm z}}
$$
subject to
$$
\mathbf 0 = \underbrace{\begin{bmatrix}(\overline{\mathbf A}-\mathbf I) & \overline{\mathbf B}\end{bmatrix}}_{\widetilde{\mathbf A}}\underbrace{\begin{bmatrix}\bar{\bm x}\\\bar{\bm u}\end{bmatrix}}_{\bar{\bm z}} + \underbrace{\overline{\mathbf A}_0 \mathbf x_0}_{\tilde{\mathbf b}}.
$$
To summarize, we have reformulated the optimal control problem as a linearly constrained quadratic program
$$
\boxed{
\begin{aligned}
\underset{\bar{\bm z}\in\mathbb{R}^{N(n+m)}}{\text{minimize}} &\quad \frac{1}{2}\bar{\bm z}^\top \widetilde{\mathbf Q} \bar{\bm z}\\
\text{subject to} &\quad \widetilde{\mathbf A} \bar{\bm z} + \tilde{\mathbf b} = \mathbf 0.
\end{aligned}}
$$
```{julia}
using BlockArrays, SparseArrays, LinearAlgebra
using QDLDL
using Plots

function direct_dlqr_simultaneous(A,B,x₀,Q,R,S,N)
    n = size(A,1)           # Number of state variables.
    m = size(B,2)           # Number of (control) input variables.
    Qbar = BlockArray(spzeros(N*n,N*n),repeat([n],N),repeat([n],N))
    for i=1:(N-1)
        Qbar[Block(i,i)] = Q
    end
    Qbar[Block(N,N)] = S
    Rbar = BlockArray(spzeros(N*m,N*m),repeat([m],N),repeat([m],N))
    for i=1:N
        Rbar[Block(i,i)] = R
    end
    Qtilde = blockdiag(sparse(Qbar),sparse(Rbar)) # The matrix defining the quadratic cost.
    Bbar = BlockArray(spzeros(N*n,N*m),repeat([n],N),repeat([m],N))
    for i=1:N
        Bbar[Block(i,i)] = B
    end
    Abar = BlockArray(sparse(-1.0*I,n*N,n*N),repeat([n],N),repeat([n],N)) # Represents Ābar - I.
    for i=2:N
        Abar[Block(i,(i-1))] = A
    end
    Atilde = sparse([Abar Bbar])         # The matrix defining the linear (affine) equation.
    A0bar = spzeros(n*N,n)
    A0bar[1:n,1:n] = A
    btilde = A0bar*x₀                    # The constant offset for the linear (affine) equation.
    K = [Qtilde Atilde'; Atilde spzeros(size(Atilde,1),size(Atilde,1))] # Sparse KKT matrix.
    F = qdldl(K)                         # LDL factorization of the KKT matrix.
    k = [zeros(size(Atilde,2)); -btilde] # Right-hand side of the KKT system.
    xtildeλ = solve(F,k)                 # Solving the KKT system using the factorization.
    xopt = reshape(xtildeλ[1:(n*N)],(n,:))
    uopt = reshape(xtildeλ[(n*N+1):((n+m)*N)],(m,:))
    return xopt,uopt
end

n = 2                   # Number of state variables.
m = 1                   # Number of (control) input variables.
A = rand(n,n)           # State matrix.
B = rand(n,m)           # Input coupling matrix.
x₀ = [1.0, 3.0]         # Initial state.
N = 10                  # Time horizon.
s = [1.0, 2.0]
q = [1.0, 2.0]
r = [1.0]
S = diagm(0=>s)         # Matrix defining the terminal state cost.
Q = diagm(0=>q)         # Matrix defining the running state cost.
R = diagm(0=>r)         # Matrix defining the cost of control.

xopts,uopts = direct_dlqr_simultaneous(A,B,x₀,Q,R,S,N)

p1 = plot(0:(N-1),uopts',marker=:diamond,label="u",linetype=:steppost)
xlabel!("k")
ylabel!("u")
p2 = plot(0:N,hcat(x₀,xopts)',marker=:diamond,label=["x1" "x2"],linetype=:steppost)
xlabel!("k")
ylabel!("x")
plot(p1,p2,layout=(2,1))
```
This constrained optimization problem can still be solved without invoking a numerical solver for quadratic programs (QP). We do it by introducing a vector $\boldsymbol\lambda$ of Lagrange multipliers to form the Lagrangian function
$$
\mathcal{L}(\bar{\bm z}, \boldsymbol \lambda) = \frac{1}{2}\bar{\bm z}^\top \widetilde{\mathbf Q} \bar{\bm z} + \boldsymbol\lambda^\top(\widetilde{\mathbf A} \bar{\bm z} + \tilde{\mathbf b}),
$$
for which the gradients with respect to $\bar{\bm z}$ and $\boldsymbol\lambda$ are
$$
\begin{aligned}
\nabla_{\bar{\bm{z}}} \mathcal{L}(\bar{\bm z}, \boldsymbol\lambda) &= \widetilde{\mathbf Q}\bar{\bm z} + \widetilde{\mathbf A}^\top\boldsymbol\lambda,\\
\nabla_{\boldsymbol{\lambda}} \mathcal{L}(\bar{\bm z}, \boldsymbol\lambda) &=\widetilde{\mathbf A} \bar{\bm z} + \tilde{\mathbf b}.
\end{aligned}
$$
Requiring that the overall gradient vanishes leads to the following KKT set of linear equations
$$
\begin{bmatrix}
\widetilde{\mathbf Q} & \widetilde{\mathbf A}^\top\\ \widetilde{\mathbf A} & \mathbf 0
\end{bmatrix}
\begin{bmatrix}
\bar{\bm z}\\\boldsymbol\lambda
\end{bmatrix}
=
\begin{bmatrix}
\mathbf 0\\ -\tilde{\mathbf b}
\end{bmatrix}.
$$
Solving this can be accomplished by a general solver for linear systems, or by a solver tailored to symmetric indefinite systems (based on [LDL factorization](https://en.wikipedia.org/wiki/Cholesky_decomposition#LDL_decomposition), for example [ldl](https://www.mathworks.com/help/matlab/ref/ldl.html) in Matlab).
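If a tailored symmetric-indefinite solver is not at hand, a general sparse solver works too. Below is a minimal self-contained sketch using Julia's backslash, which dispatches to a sparse LU; the matrices here are small hypothetical stand-ins for $\widetilde{\mathbf Q}$, $\widetilde{\mathbf A}$, and $\tilde{\mathbf b}$, not the LQR data.
```{julia}
using SparseArrays, LinearAlgebra

# Stand-in data, just to exercise the KKT solve (hypothetical, not the LQR matrices).
Qt = sparse(Diagonal([2.0, 1.0, 4.0]))   # Q̃: positive definite cost matrix.
At = sparse([1.0 1.0 0.0; 0.0 1.0 1.0])  # Ã: full-row-rank constraint matrix.
bt = [1.0, -2.0]                         # b̃: constant offset of the constraint.

# Assemble the symmetric indefinite KKT matrix and the right-hand side.
K = [Qt At'; At spzeros(2,2)]
rhs = [zeros(3); -bt]

zλ = K \ rhs                 # Sparse LU; an LDL factorization would exploit symmetry.
z̄, λ = zλ[1:3], zλ[4:5]      # Primal minimizer and Lagrange multipliers.
```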
#### Adding constraints on controls and states
When solving a real optimal control problem, we may want to impose inequality constraints on $\bm u_k$ due to the saturation of actuators.
We may also want to add constraints on $\bm x_k$, which may reflect some performance specifications. In both cases the KKT system above would have to be augmented, and we would rather resort to an already fine-tuned numerical solver for quadratic programming (QP) instead.
### Sequential (dense) formulation
We can express $\bar{\bm x}$ as a function of $\bar{\bm u}$ and $\mathbf x_0$. This can be done in a straightforward way using (@eq-linear_system_linear_constraints_matrix_format), namely,
$$
\bar{\bm x} = (\mathbf I-\overline{\mathbf A})^{-1}\overline{\mathbf B} \bar{\bm u} + (\mathbf I-\overline{\mathbf A})^{-1} \overline{\mathbf A}_0 \mathbf x_0.
$$
However, instead of solving this set of equations, we can perform the substitution in a more insightful way. Write down the state equation for the first few discrete times
$$
\begin{aligned}
\bm x_1 &= \mathbf A\mathbf x_0 + \mathbf B\bm u_0\\
\bm x_2 &= \mathbf A\bm x_1 + \mathbf B\bm u_1\\
&= \mathbf A(\mathbf A\mathbf x_0 + \mathbf B\bm u_0)+ \mathbf B\bm u_1\\
&= \mathbf A^2\mathbf x_0 + \mathbf A\mathbf B\bm u_0 + \mathbf B\bm u_1\\
&\vdots\\
\bm x_k &= \mathbf A^k\mathbf x_0 + \mathbf A^{k-1}\mathbf B\bm u_0 +\mathbf A^{k-2}\mathbf B\bm u_1 +\ldots + \mathbf B\bm u_{k-1}.
\end{aligned}
$$
Rewriting into matrix-vector form (and extending the time $k$ up to the final time $N$)
$$
\begin{bmatrix}
\bm x_1\\\bm x_2\\\vdots\\\bm x_N
\end{bmatrix}
=
\underbrace{
\begin{bmatrix}
\mathbf B & & & \\
\mathbf A\mathbf B & \mathbf B & & \\
\vdots & & \ddots &\\
\mathbf A^{N-1}\mathbf B & \mathbf A^{N-2}\mathbf B & \ldots & \mathbf B
\end{bmatrix}}_{\widehat{\mathbf C}}
\begin{bmatrix}
\bm u_0\\\bm u_1\\\vdots\\\bm u_{N-1}
\end{bmatrix}
+
\underbrace{
\begin{bmatrix}
\mathbf A\\\mathbf A^2\\\vdots\\\mathbf A^N
\end{bmatrix}}_{\widehat{\mathbf A}}\mathbf x_0.
$$
For convenience, we write this compact relation between $\bar{\bm x}$, $\bar{\bm u}$, and $\mathbf x_0$ as
$$
\bar{\bm x} = \widehat{\mathbf C} \bar{\bm u} + \widehat{\mathbf A} \mathbf x_0.
$$ {#eq-x_as_function_of_u_and_x0}
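Here is a sketch of how $\widehat{\mathbf C}$ and $\widehat{\mathbf A}$ could be assembled in Julia; the function name `build_prediction_matrices` is ours, and dense matrices are used since these blocks generally fill in anyway.
```{julia}
using LinearAlgebra

# Builds Ĉ (block lower triangular, block (i,j) = A^(i-j)B for j ≤ i)
# and Â (block row i = A^i) from the state-space matrices and horizon N.
function build_prediction_matrices(A, B, N)
    n, m = size(A,1), size(B,2)
    Ĉ = zeros(N*n, N*m)
    Â = zeros(N*n, n)
    for i = 1:N
        Â[(i-1)*n+1:i*n, :] = A^i
        for j = 1:i
            Ĉ[(i-1)*n+1:i*n, (j-1)*m+1:j*m] = A^(i-j)*B
        end
    end
    return Ĉ, Â
end
```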
We can now substitute (@eq-x_as_function_of_u_and_x0) into the original cost function, which then becomes independent of $\bar{\bm x}$; we reflect this by using a new name $\tilde J$
$$
\begin{aligned}
\tilde J(\bar{\bm u};\mathbf x_0) &= \frac{1}{2}(\widehat{\mathbf C} \bar{\bm u} + \widehat{\mathbf A} \mathbf x_0)^\top\overline{\mathbf Q} (\widehat{\mathbf C} \bar{\bm u} + \widehat{\mathbf A} \mathbf x_0) + \frac{1}{2}\bar{\bm u}^\top\overline{\mathbf R} \bar{\bm u} + \frac{1}{2}\mathbf x_0^\top\mathbf Q\mathbf x_0\\
&= \frac{1}{2}\bar{\bm u}^\top\widehat{\mathbf C}^\top \overline{\mathbf Q} \widehat{\mathbf C} \bar{\bm u} + \mathbf x_0^\top\widehat{\mathbf A}^\top \overline{\mathbf Q} \widehat{\mathbf C} \bar{\bm u} + \frac{1}{2} \mathbf x_0^\top\widehat{\mathbf A}^\top \overline{\mathbf Q} \widehat{\mathbf A} \mathbf x_0 + \frac{1}{2}\bar{\bm u}^\top\overline{\mathbf R} \bar{\bm u} + \frac{1}{2}\mathbf x_0^\top\mathbf Q\mathbf x_0\\
&= \frac{1}{2}\bar{\bm u}^\top(\widehat{\mathbf C}^\top \overline{\mathbf Q} \widehat{\mathbf C} + \overline{\mathbf R})\bar{\bm u} + \mathbf x_0^\top\widehat{\mathbf A}^\top \overline{\mathbf Q} \widehat{\mathbf C} \bar{\bm u} + \frac{1}{2} \mathbf x_0^\top(\widehat{\mathbf A}^\top \overline{\mathbf Q} \widehat{\mathbf A} + \mathbf Q)\mathbf x_0.
\end{aligned}
$$
The last term (the one independent of $\bar{\bm u}$) has no impact on the optimal $\bar{\bm u}$ and can therefore be discarded, but such a minor modification perhaps does not justify a new name for the cost function, and we write it as
$$
\tilde J(\bar{\bm u};\mathbf x_0) = \frac{1}{2}\bar{\bm u}^\top\underbrace{(\widehat{\mathbf C}^\top \overline{\mathbf Q} \widehat{\mathbf C} + \overline{\mathbf R})}_{\mathbf H}\bar{\bm u} + \mathbf x_0^\top\underbrace{\widehat{\mathbf A}^\top \overline{\mathbf Q} \widehat{\mathbf C}}_{\mathbf F^\top} \bar{\bm u}.
$$
This cost is a function of $\bar{\bm u}$; the initial state $\mathbf x_0$ is regarded as a fixed parameter. Its gradient is
$$
\nabla \tilde J = \mathbf H\bar{\bm u}+\mathbf F\mathbf x_0.
$$
Setting it to zero leads to the following linear system of equations
$$
\mathbf H\bar{\bm u}=-\mathbf F\mathbf x_0
$$
that needs to be solved for $\bar{\bm u}$. Formally, we write the solution as
$$
\bar{\bm u} = -\mathbf H^{-1} \mathbf F \mathbf x_0.
$$
::: {.callout-note}
Solving linear equations by direct computation of the matrix inverse is not a recommended practice. Use dedicated solvers for linear systems instead; for example, the backslash operator in Matlab (and in Julia), which invokes a suitable solver.
:::
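Putting the pieces together, here is a sketch of the sequential solution, reusing the hypothetical `build_prediction_matrices` from above and following the advice in the note (backslash rather than an explicit inverse).
```{julia}
using LinearAlgebra

Ĉ, Â = build_prediction_matrices(A, B, N)

# Q̄ stacks Q for stages 1,…,N-1 and S at stage N; R̄ stacks R.
Q̄ = cat(kron(I(N-1), Q), S; dims=(1,2))
R̄ = kron(I(N), R)

H = Ĉ'*Q̄*Ĉ + R̄        # Hessian of the reduced cost.
F = Ĉ'*Q̄*Â             # F, so that the linear term in the cost is x₀ᵀFᵀū.
ū = -(H \ (F*x₀))       # Solve Hū = -Fx₀ for the optimal control sequence.
```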
#### Adding the constraints on controls
Adding constraints on $\bar{\bm u}$ is straightforward. Instead of an unconstrained problem leading to a linear system of equations, we now have a quadratic program with inequality constraints
$$
\begin{aligned}
\operatorname*{minimize}_{\bar{\bm u}} & \quad \frac{1}{2}\bar{\bm u}^T \mathbf H \bar{\bm u} + \mathbf x_0^T\mathbf F^T \bar{\bm u}\\
\text{subject to} &\quad \bar{\bm u} \leq \bar{\mathbf u}_\mathrm{max}\\
&\quad \bar{\bm u} \geq \bar{\mathbf u}_\mathrm{min},
\end{aligned}
$$
which we can rewrite more explicitly (in the matrix-vector format) as
$$
\begin{aligned}
\operatorname*{minimize}_{\bar{\bm u}} & \quad \frac{1}{2}\bar{\bm u}^T \mathbf H \bar{\bm u} + \mathbf x_0^T\mathbf F^T \bar{\bm u}\\
\text{subject to} & \begin{bmatrix}
\mathbf{I} & & & \\
& \mathbf{I} & & \\
& & \ddots & \\
& & & \mathbf{I} \\
-\mathbf{I} & & & \\
& -\mathbf{I} & & \\
& & \ddots & \\
& & & -\mathbf{I}
\end{bmatrix}
\begin{bmatrix}
\bm u_0 \\ \bm u_1 \\ \vdots \\ \bm u_{N-1}
\end{bmatrix}
\leq
\begin{bmatrix}
\mathbf u_\mathrm{max} \\ \mathbf u_\mathrm{max} \\ \vdots \\ \mathbf u_\mathrm{max}\\ -\mathbf u_\mathrm{min} \\ -\mathbf u_\mathrm{min} \\ \vdots \\ -\mathbf u_\mathrm{min}
\end{bmatrix}.
\end{aligned}
$$
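Such a box-constrained QP is exactly the form accepted by off-the-shelf solvers. Below is a minimal sketch using JuMP with the OSQP solver, assuming the matrices `H` and `F` from above; the scalar bounds `u_min` and `u_max` are hypothetical.
```{julia}
using JuMP, OSQP

u_min, u_max = -1.0, 1.0        # Hypothetical actuator limits.

model = Model(OSQP.Optimizer)
set_silent(model)
@variable(model, u_min <= ū[1:N*m] <= u_max)      # Stacked control vector.
@objective(model, Min, 0.5*ū'*H*ū + (F*x₀)'*ū)    # ½ūᵀHū + x₀ᵀFᵀū.
optimize!(model)
ū_opt = value.(ū)
```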
#### Adding the constraints on states
We might feel a little uneasy about losing immediate access to $\bar{\bm x}$. But the game is not lost: we just need to express $\bar{\bm x}$ as a function of $\bar{\bm u}$ and $\mathbf x_0$ and impose the constraints on the result. Such an expression is already available, see (@eq-x_as_function_of_u_and_x0). Therefore, we can formulate the constraints, say, lower and upper bounds on the state vector
$$
\mathbf x_\mathrm{min} \leq \bm x_k \leq \mathbf x_\mathrm{max},
$$
as
$$
\bar{\mathbf x}_\mathrm{min} \leq \widehat{\mathbf C} \bar{\bm u} + \widehat{\mathbf A} \mathbf x_0 \leq \bar{\mathbf x}_\mathrm{max},
$$
where the bars in $\bar{\mathbf x}_\mathrm{min}$ and $\bar{\mathbf x}_\mathrm{max}$ indicate that these vectors were obtained by stacking the corresponding vectors for all times $k=1,\ldots,N$.
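In the JuMP sketch above, these stacked state bounds become affine constraints on $\bar{\bm u}$ via (@eq-x_as_function_of_u_and_x0); a sketch, again with hypothetical scalar bounds `x_min` and `x_max` and the matrices `Ĉ` and `Â` from before, added to the model before calling `optimize!`:
```{julia}
x_min, x_max = -10.0, 10.0      # Hypothetical state bounds.

# x̄ = Ĉū + Âx₀ must stay in the box for all k = 1,…,N.
@constraint(model, x_min .<= Ĉ*ū .+ Â*x₀ .<= x_max)
```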