Commit

Built site for gh-pages
hurak committed Jul 6, 2024
1 parent c4f1f94 commit 74c1786
Showing 75 changed files with 5,382 additions and 3,764 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
@@ -1 +1 @@
103e56eb
03b73c09
1,174 changes: 1,174 additions & 0 deletions cont_dp_DDP.html

Large diffs are not rendered by default.

24 changes: 22 additions & 2 deletions cont_dp_DPP.html
@@ -7,7 +7,7 @@
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">


<title>cont_dp_dpp – B(E)3M35ORR – Optimal and Robust Control</title>
<title>Differential dynamic programming (DDP) – B(E)3M35ORR – Optimal and Robust Control</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
@@ -492,6 +492,9 @@
</div>
</li>
<li class="sidebar-item">
<span class="menu-text">cont_dp_DDP.qmd</span>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./cont_dp_references.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">References</span></a>
@@ -691,9 +694,26 @@
<!-- main -->
<main class="content" id="quarto-document-content">

<header id="title-block-header" class="quarto-title-block default">
<div class="quarto-title">
<h1 class="title">Differential dynamic programming (DDP)</h1>
</div>



<div class="quarto-title-meta">




</div>



</header>


<p>blabla</p>



@@ -790,7 +810,7 @@
}
var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
var mailtoRegex = new RegExp(/^mailto:/);
var filterRegex = new RegExp("https:\/\/hurak\.github\.io\/orr\/");
var filterRegex = new RegExp('/' + window.location.host + '/');
var isInternal = (href) => {
return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
}
30 changes: 25 additions & 5 deletions cont_dp_HJB.html
@@ -533,6 +533,12 @@
<a href="./cont_dp_LQR.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Using HJB equation to solve the continuous-time LQR problem</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./cont_dp_DDP.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Differential dynamic programming (DDP)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
@@ -728,8 +734,15 @@
</nav>
<div id="quarto-sidebar-glass" class="quarto-sidebar-collapse-item" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item"></div>
<!-- margin-sidebar -->
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar zindex-bottom">

<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
<nav id="TOC" role="doc-toc" class="toc-active">
<h2 id="toc-title">On this page</h2>

<ul>
<li><a href="#hamilton-jacobi-bellman-hjb-equation" id="toc-hamilton-jacobi-bellman-hjb-equation" class="nav-link active" data-scroll-target="#hamilton-jacobi-bellman-hjb-equation">Hamilton-Jacobi-Bellman (HJB) equation</a></li>
<li><a href="#hjb-equation-and-hamiltonian" id="toc-hjb-equation-and-hamiltonian" class="nav-link" data-scroll-target="#hjb-equation-and-hamiltonian">HJB equation and Hamiltonian</a></li>
</ul>
<div class="toc-actions"><ul><li><a href="https://github.com/hurak/orr/issues/new" class="toc-action"><i class="bi bi-github"></i>Report an issue</a></li></ul></div></nav>
</div>
<!-- main -->
<main class="content" id="quarto-document-content">
@@ -762,6 +775,8 @@ <h1 class="title">Dynamic programming for continuous-time optimal control</h1>
<p>Optionally we can also consider constraints on the state at the final time (be it a particular value or some set of values) <span class="math display">
\psi(\bm x(t_\mathrm{f}),t_\mathrm{f})=0.
</span></p>
<section id="hamilton-jacobi-bellman-hjb-equation" class="level2">
<h2 class="anchored" data-anchor-id="hamilton-jacobi-bellman-hjb-equation">Hamilton-Jacobi-Bellman (HJB) equation</h2>
<p>We now consider an arbitrary time <span class="math inline">t</span> and split the (remaining) time interval <span class="math inline">[t,t_\mathrm{f}]</span> into two parts <span class="math inline">[t,t+\Delta t]</span> and <span class="math inline">[t+\Delta t,t_\mathrm{f}]</span> , and structure the cost function accordingly <span class="math display">
J(\bm x(t),\bm u(\cdot),t) = \int_{t}^{t+\Delta t} L(\bm x,\bm u,\tau)\,\mathrm{d}\tau + \underbrace{\int_{t+\Delta t}^{t_\mathrm{f}} L(\bm x,\bm u,\tau)\,\mathrm{d}\tau + \phi(\bm x(t_\mathrm{f}),t_\mathrm{f})}_{J(\bm x(t+\Delta t), \bm u(\cdot), t+\Delta t)}.
</span></p>
@@ -780,16 +795,21 @@ <h1 class="title">Dynamic programming for continuous-time optimal control</h1>
-\frac{\partial {\color{blue}J^\star (\bm x(t),t)}}{\partial t} = \min_{\bm u(t)}\left[L(\bm x(t),\bm u(t),t)+(\nabla_{\bm x} {\color{blue} J^\star (\bm x(t),t)})^\top \bm f(\bm x(t),\bm u(t),t)\right].}
</span></p>
<p>This is obviously a partial differential equation (PDE) for the optimal cost function <span class="math inline">J^\star(\bm x,t)</span>.</p>
<p>And since this is a differential equation, boundary value(s) must also be specified. In particular, the optimal cost function must be specified at the final state and the final time, i.e. <span class="math display">
<p>And since this is a differential equation, boundary value(s) must be specified to determine a unique solution. In particular, since the equation is first-order with respect to both time and state, specifying the value of the optimal cost function at the final state and the final time is enough. With the general final-state constraints we have introduced above, the boundary value condition reads <span class="math display">
J^\star (\bm x(t_\mathrm{f}),t_\mathrm{f}) = \phi(\bm x(t_\mathrm{f}),t_\mathrm{f}),\qquad \text{on the hypersurface } \psi(\bm x(t_\mathrm{f}),t_\mathrm{f}) = 0.
</span></p>
<p>By the way, recall the definition of Hamiltonian <span class="math inline">H(\bm x,\bm u,\bm \lambda,t) = L(\bm x,\bm u,t) + \boldsymbol{\lambda}^\top \mathbf f(\bm x,\bm u,t)</span>. The HJB equation can also be written as <span class="math display">\boxed
<p>Note that this includes as special cases the fixed-final-state and free-final-state cases.</p>
</section>
<section id="hjb-equation-and-hamiltonian" class="level2">
<h2 class="anchored" data-anchor-id="hjb-equation-and-hamiltonian">HJB equation and Hamiltonian</h2>
<p>Recall the definition of the Hamiltonian <span class="math inline">H(\bm x,\bm u,\bm \lambda,t) = L(\bm x,\bm u,t) + \boldsymbol{\lambda}^\top \mathbf f(\bm x,\bm u,t)</span>. The HJB equation can also be written as <span class="math display">\boxed
{-\frac{\partial J^\star (\bm x(t),t)}{\partial t} = \min_{\bm u(t)}H(\bm x(t),\bm u(t),\nabla_{\bm x} J^\star (\bm x(t),t),t).}
</span></p>
<p>What we have just derived is one of the most profound results in optimal control – the Hamiltonian must be minimized by the optimal control. We will exploit it next for some derivations.</p>
<p>Recall also that we have already encountered a similar result that made a statement about the necessary maximization (or minimization) of the Hamiltonian with respect to the control – the celebrated Pontryagin’s principle of maximum (or minimum).</p>


</section>

<a onclick="window.scrollTo(0, 0); return false;" role="button" id="quarto-back-to-top"><i class="bi bi-arrow-up"></i> Back to top</a></main> <!-- /main -->
<script id="quarto-html-after-body" type="application/javascript">
@@ -1232,7 +1252,7 @@ <h1 class="title">Dynamic programming for continuous-time optimal control</h1>
</div>
<div class="nav-footer-center">
<p>Copyright 2024, Zdeněk Hurák</p>
<div class="toc-actions"><ul><li><a href="https://github.com/hurak/orr/issues/new" class="toc-action"><i class="bi bi-github"></i>Report an issue</a></li></ul></div></div>
<div class="toc-actions d-sm-block d-md-none"><ul><li><a href="https://github.com/hurak/orr/issues/new" class="toc-action"><i class="bi bi-github"></i>Report an issue</a></li></ul></div></div>
<div class="nav-footer-right">
&nbsp;
</div>
38 changes: 25 additions & 13 deletions cont_dp_LQR.html
@@ -30,7 +30,7 @@
<script src="site_libs/quarto-search/fuse.min.js"></script>
<script src="site_libs/quarto-search/quarto-search.js"></script>
<meta name="quarto:offset" content="./">
<link href="./cont_dp_references.html" rel="next">
<link href="./cont_dp_DDP.html" rel="next">
<link href="./cont_dp_HJB.html" rel="prev">
<script src="site_libs/quarto-html/quarto.js"></script>
<script src="site_libs/quarto-html/popper.min.js"></script>
@@ -533,6 +533,12 @@
<a href="./cont_dp_LQR.html" class="sidebar-item-text sidebar-link active">
<span class="menu-text">Using HJB equation to solve the continuous-time LQR problem</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./cont_dp_DDP.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Differential dynamic programming (DDP)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
@@ -753,14 +759,15 @@ <h1 class="title">Using HJB equation to solve the continuous-time LQR problem</h
</header>


<p>As we have already discussed a couple of times, in the LQR problem wwe consider a linear (and time invariant) system modelled by <span class="math display">
\dot{\bm x}(t) = \mathbf A\bm x(t) + \mathbf B\bm u(t)
<p>As we have already discussed a couple of times, in the LQR problem we consider a linear time-invariant (LTI) system modelled by <span class="math display">
\dot{\bm x}(t) = \mathbf A\bm x(t) + \mathbf B\bm u(t),
</span> and the quadratic cost function <span class="math display">
J(\bm x(t_\mathrm{i}),\bm u(\cdot), t_\mathrm{i}) = \frac{1}{2}\bm x^\top(t_\mathrm{f})\mathbf S_\mathrm{f}\bm x(t_\mathrm{f}) + \frac{1}{2}\int_{t_\mathrm{i}}^{t_\mathrm{f}}\left(\bm x^\top \mathbf Q\bm x + \bm u^\top \mathbf R \bm u\right)\mathrm{d}t.
</span></p>
<p>The Hamiltonian is <span class="math display">
H(\bm x,\bm u,\bm \lambda) = \frac{1}{2}\left(\bm x^\top \mathbf Q\bm x + \bm u^\top \mathbf R \bm u\right) + \boldsymbol{\lambda}^\top \left(\mathbf A\bm x + \mathbf B\bm u\right)
</span> and according to the HJB equation our goal is to minimize <span class="math inline">H</span> at a given time <span class="math inline">t</span>, which enforces the condition on its gradient <span class="math display">
H(\bm x,\bm u,\bm \lambda) = \frac{1}{2}\left(\bm x^\top \mathbf Q\bm x + \bm u^\top \mathbf R \bm u\right) + \boldsymbol{\lambda}^\top \left(\mathbf A\bm x + \mathbf B\bm u\right).
</span></p>
<p>According to the HJB equation our goal is to minimize <span class="math inline">H</span> at a given time <span class="math inline">t</span>, which enforces the condition on its gradient <span class="math display">
\mathbf 0 = \nabla_{\bm u} H = \mathbf R\bm u + \mathbf B^\top \boldsymbol\lambda,
</span> from which it follows that the optimal control must necessarily satisfy <span class="math display">
\bm u^\star = -\mathbf R^{-1} \mathbf B^\top \boldsymbol\lambda.
@@ -771,7 +778,7 @@ <h1 class="title">Using HJB equation to solve the continuous-time LQR problem</h
<p>The minimized Hamiltonian is <span class="math display">
\min_{\bm u(t)}H(\bm x, \bm u, \bm \lambda) = \frac{1}{2}\bm x^\top \mathbf Q \bm x + \boldsymbol\lambda^\top \mathbf A \bm x - \frac{1}{2}\boldsymbol\lambda^\top \mathbf B\mathbf R^{-1}\mathbf B^\top \boldsymbol\lambda.
</span></p>
<p>Setting <span class="math inline">\boldsymbol\lambda = (\nabla_{\bm x} J^\star)^\top</span>, the HJB equation is <span class="math display">\boxed
<p>Setting <span class="math inline">\boldsymbol\lambda = \nabla_{\bm x} J^\star</span>, the HJB equation is <span class="math display">\boxed
{-\frac{\partial J^\star}{\partial t} = \frac{1}{2}\bm x^\top \mathbf Q \bm x + (\nabla_{\bm x} J^\star)^\top \mathbf A\bm x - \frac{1}{2}(\nabla_{\bm x} J^\star)^\top \mathbf B\mathbf R^{-1}\mathbf B^\top \nabla_{\bm x} J^\star,}
</span> and the boundary condition is <span class="math display">
J^\star(\bm x(t_\mathrm{f}),t_\mathrm{f}) = \frac{1}{2}\bm x^\top (t_\mathrm{f})\mathbf S_\mathrm{f}\bm x(t_\mathrm{f}).
@@ -803,11 +810,16 @@ <h1 class="title">Using HJB equation to solve the continuous-time LQR problem</h
<p><span class="math display">
-\frac{1}{2}\bm x^\top \dot{\mathbf{S}} \bm x = \frac{1}{2} \bm x^\top \left[\mathbf Q + \mathbf S \mathbf A + \mathbf A^\top \mathbf S - \mathbf S \mathbf B\mathbf R^{-1}\mathbf B^\top \mathbf S \right ] \bm x.
</span></p>
<p>Finally, since the above single (scalar) equation should hold for all <span class="math inline">\bm x(t)</span>, the matrix equation must hold too, and we get the familiar differential Riccati equation <span class="math display">\boxed
{-\dot{\mathbf S}(t) = \mathbf A^\top \mathbf S(t) + \mathbf S(t)\mathbf A - \mathbf S(t)\mathbf B\mathbf R^{-1}\mathbf B^\top \mathbf S(t) + \mathbf Q.}
</span></p>
<p>We also get the optimal control <span class="math display">\boxed
{\bm u^\star(t) = - \underbrace{\mathbf R^{-1}\mathbf B^\top \mathbf S(t)}_{\bm K(t)}\bm x(t).}
<p>Finally, since the above single (scalar) equation should hold for all <span class="math inline">\bm x(t)</span>, the matrix equation must hold too, and we get the familiar differential Riccati equation for the matrix variable <span class="math inline">\mathbf S(t)</span> <span class="math display">\boxed
{-\dot{\mathbf S}(t) = \mathbf A^\top \mathbf S(t) + \mathbf S(t)\mathbf A - \mathbf S(t)\mathbf B\mathbf R^{-1}\mathbf B^\top \mathbf S(t) + \mathbf Q}
</span> initialized at the final time <span class="math inline">t_\mathrm{f}</span> by <span class="math inline">\mathbf S(t_\mathrm{f}) = \mathbf S_\mathrm{f}</span>.</p>
<p>Having obtained <span class="math inline">\mathbf S(t)</span>, we can get the optimal control by substituting it into <span class="math display">\boxed
{
\begin{aligned}
\bm u^\star(t) &amp;= - \mathbf R^{-1}\mathbf B^\top \nabla_{\bm x} J^\star(\bm x(t),t) \\
&amp;= - \underbrace{\mathbf R^{-1}\mathbf B^\top \mathbf S(t)}_{\bm K(t)}\bm x(t).
\end{aligned}
}
</span></p>
<p>We have just rederived the solution to the continuous-time LQR problem using the HJB equation (previously we obtained it by massaging the two-point boundary value problem that followed as the necessary condition of optimality from the techniques of calculus of variations).</p>
<p>Note that we have also just seen how, for a quadratic optimal cost function, a first-order nonlinear PDE (the HJB equation) reduces to a first-order nonlinear ODE (the differential Riccati equation).</p>
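The backward-in-time character of the differential Riccati equation can be illustrated numerically. The following is only a minimal sketch under stated assumptions: a scalar system (so that A, B, Q, R, S reduce to the numbers a, b, q, r, s), illustrative parameter values, and a hand-written fourth-order Runge-Kutta integrator. The function names `riccati_rhs` and `solve_riccati_backward` are hypothetical and not part of the course material.

```python
import math


def riccati_rhs(s, a, b, q, r):
    """Return ds/dt for the scalar Riccati equation
    -ds/dt = 2*a*s - (b**2 / r) * s**2 + q."""
    return -(2.0 * a * s - (b ** 2 / r) * s ** 2 + q)


def solve_riccati_backward(a, b, q, r, s_f, t_i, t_f, n_steps=10_000):
    """Integrate from s(t_f) = s_f back to t_i with classical RK4.

    Returns a list whose k-th entry approximates s at
    t_i + k * (t_f - t_i) / n_steps.
    """
    h = (t_i - t_f) / n_steps  # negative step: we march backward in time
    s = s_f
    history = [s]
    for _ in range(n_steps):
        k1 = riccati_rhs(s, a, b, q, r)
        k2 = riccati_rhs(s + 0.5 * h * k1, a, b, q, r)
        k3 = riccati_rhs(s + 0.5 * h * k2, a, b, q, r)
        k4 = riccati_rhs(s + h * k3, a, b, q, r)
        s += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        history.append(s)
    history.reverse()  # reorder so that index 0 corresponds to t_i
    return history


# Assumed illustrative data: unstable scalar plant, unit weights, no terminal cost.
a, b, q, r = 1.0, 1.0, 1.0, 1.0
s = solve_riccati_backward(a, b, q, r, s_f=0.0, t_i=0.0, t_f=10.0)

# Time-varying feedback gain at the initial time, K(t_i) = R^{-1} B^T S(t_i).
k0 = b * s[0] / r

# Positive root of the algebraic Riccati equation, for comparison.
s_inf = r * (a + math.sqrt(a ** 2 + b ** 2 * q / r)) / b ** 2

print(s[0], s_inf, k0)
```

With a horizon that is long compared to the closed-loop time constant, the value at the initial time should be close to the positive root of the algebraic Riccati equation, while the value at the final time equals the prescribed terminal weight.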
@@ -1242,8 +1254,8 @@ <h1 class="title">Using HJB equation to solve the continuous-time LQR problem</h
</a>
</div>
<div class="nav-page nav-page-next">
<a href="./cont_dp_references.html" class="pagination-link" aria-label="References">
<span class="nav-page-text">References</span> <i class="bi bi-arrow-right-short"></i>
<a href="./cont_dp_DDP.html" class="pagination-link" aria-label="Differential dynamic programming (DDP)">
<span class="nav-page-text">Differential dynamic programming (DDP)</span> <i class="bi bi-arrow-right-short"></i>
</a>
</div>
</nav>
12 changes: 9 additions & 3 deletions cont_dp_references.html
@@ -51,7 +51,7 @@
<script src="site_libs/quarto-search/quarto-search.js"></script>
<meta name="quarto:offset" content="./">
<link href="./ext_stochastic_LQR.html" rel="next">
<link href="./cont_dp_LQR.html" rel="prev">
<link href="./cont_dp_DDP.html" rel="prev">
<script src="site_libs/quarto-html/quarto.js"></script>
<script src="site_libs/quarto-html/popper.min.js"></script>
<script src="site_libs/quarto-html/tippy.umd.min.js"></script>
@@ -510,6 +510,12 @@
<a href="./cont_dp_LQR.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Using HJB equation to solve the continuous-time LQR problem</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./cont_dp_DDP.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Differential dynamic programming (DDP)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
@@ -1170,8 +1176,8 @@ <h1 class="title">References</h1>
</script>
<nav class="page-navigation">
<div class="nav-page nav-page-previous">
<a href="./cont_dp_LQR.html" class="pagination-link" aria-label="Using HJB equation to solve the continuous-time LQR problem">
<i class="bi bi-arrow-left-short"></i> <span class="nav-page-text">Using HJB equation to solve the continuous-time LQR problem</span>
<a href="./cont_dp_DDP.html" class="pagination-link" aria-label="Differential dynamic programming (DDP)">
<i class="bi bi-arrow-left-short"></i> <span class="nav-page-text">Differential dynamic programming (DDP)</span>
</a>
</div>
<div class="nav-page nav-page-next">
6 changes: 6 additions & 0 deletions cont_indir_CARE.html
@@ -490,6 +490,12 @@
<a href="./cont_dp_LQR.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Using HJB equation to solve the continuous-time LQR problem</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./cont_dp_DDP.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Differential dynamic programming (DDP)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
6 changes: 6 additions & 0 deletions cont_indir_LQR_fin_horizon.html
@@ -490,6 +490,12 @@
<a href="./cont_dp_LQR.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Using HJB equation to solve the continuous-time LQR problem</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./cont_dp_DDP.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Differential dynamic programming (DDP)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
6 changes: 6 additions & 0 deletions cont_indir_LQR_inf_horizon.html
@@ -490,6 +490,12 @@
<a href="./cont_dp_LQR.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Using HJB equation to solve the continuous-time LQR problem</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./cont_dp_DDP.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Differential dynamic programming (DDP)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
6 changes: 6 additions & 0 deletions cont_indir_Pontryagin.html
@@ -490,6 +490,12 @@
<a href="./cont_dp_LQR.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Using HJB equation to solve the continuous-time LQR problem</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./cont_dp_DDP.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Differential dynamic programming (DDP)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
6 changes: 6 additions & 0 deletions cont_indir_calculus_of_variations.html
@@ -490,6 +490,12 @@
<a href="./cont_dp_LQR.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Using HJB equation to solve the continuous-time LQR problem</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./cont_dp_DDP.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Differential dynamic programming (DDP)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
6 changes: 6 additions & 0 deletions cont_indir_constrained.html
@@ -490,6 +490,12 @@
<a href="./cont_dp_LQR.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Using HJB equation to solve the continuous-time LQR problem</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./cont_dp_DDP.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Differential dynamic programming (DDP)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
6 changes: 6 additions & 0 deletions cont_indir_overview.html
@@ -533,6 +533,12 @@
<a href="./cont_dp_LQR.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Using HJB equation to solve the continuous-time LQR problem</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./cont_dp_DDP.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Differential dynamic programming (DDP)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">