STEM Essays: Classical Mechanics: Dynamics [IV]

Suppose we read 'a certain text' such as

Linear–quadratic regulator

The theory of optimal control is concerned with operating a dynamic system at minimum cost. The case where the system dynamics are described by a set of linear differential equations and the cost is described by a quadratic function is called the LQ problem. One of the main results in the theory is that the solution is provided by the linear–quadratic regulator (LQR), a feedback controller whose equations are given below. The LQR is an important part of the solution to the LQG (linear–quadratic–Gaussian) problem. Like the LQR problem itself, the LQG problem is one of the most fundamental problems in control theory.

……

Infinite-horizon, continuous-time LQR

For a continuous-time linear system described by

\displaystyle {\dot {x}}=Ax+Bu

with a cost functional defined as
\displaystyle J=\int _{0}^{\infty }\left(x^{T}Qx+u^{T}Ru+2x^{T}Nu\right)dt
the feedback control law that minimizes the value of the cost is
\displaystyle u=-Kx\,
where \displaystyle K is given by
\displaystyle K=R^{-1}(B^{T}P+N^{T})\,
and \displaystyle P is found by solving the continuous time algebraic Riccati equation
\displaystyle A^{T}P+PA-(PB+N)R^{-1}(B^{T}P+N^{T})+Q=0\,
This can be also written as
\displaystyle {\mathcal {A}}^{T}P+P{\mathcal {A}}-PBR^{-1}B^{T}P+{\mathcal {Q}}=0\,
with
\displaystyle {\mathcal {A}}=A-BR^{-1}N^{T}\qquad {\mathcal {Q}}=Q-NR^{-1}N^{T}\,
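
As a quick numerical check of these formulas, here is a minimal Python sketch using SciPy's continuous-time algebraic Riccati solver. The double-integrator matrices below are illustrative choices of mine, not taken from the text above.

import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])   # illustrative double-integrator dynamics
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)                # state weight
R = np.array([[1.0]])        # input weight
N = np.zeros((2, 1))         # cross weight (often zero)

# Solve A^T P + P A - (P B + N) R^{-1} (B^T P + N^T) + Q = 0
P = solve_continuous_are(A, B, Q, R, s=N)

# Feedback gain K = R^{-1} (B^T P + N^T); the control law is u = -K x
K = np.linalg.solve(R, B.T @ P + N.T)
print("K =", K)

# Sanity check: the closed-loop matrix A - B K should have eigenvalues
# with negative real parts
print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))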

───

 

, only to find that we 'do not grasp its purport', what are we to do? At such a moment, 'guessing the meaning from the surface of the words' is certainly not the way. Does the 'control' here truly refer to the pursuit of 'cost-performance'?

If so, how would one ever decide on the 'price'!

Or perhaps it is because we do not yet know where

Pontryagin's maximum principle

Pontryagin's maximum principle, also called Pontryagin's minimum principle, is a result in optimal control theory: when the state or the control inputs are subject to constraints, it finds the optimal control signal that takes a dynamical system from one state to another. The principle was formulated in 1956 by the Soviet mathematician Lev Pontryagin and his students.[1] When the control set is unconstrained, the principle reduces to the Euler–Lagrange equation of the calculus of variations.

Roughly speaking, the principle states that, among all admissible controls, the chosen control must extremize the "control Hamiltonian"; whether the extremum is a maximum or a minimum depends on the problem and on the sign convention adopted for the Hamiltonian. In the original formulation the Hamiltonian is maximized, but with the sign convention used in this article the extremum is a minimum.

If \displaystyle {\mathcal {U}} is the set of all admissible control values, the principle states that the optimal control \displaystyle u^{*} must satisfy the following condition:

\displaystyle H(x^{*}(t),u^{*}(t),\lambda ^{*}(t),t)\leq H(x^{*}(t),u,\lambda ^{*}(t),t),\quad \forall u\in {\mathcal {U}},\quad t\in [t_{0},t_{f}]

where \displaystyle x^{*}\in C^{1}[t_{0},t_{f}] is the optimal state trajectory and \displaystyle \lambda ^{*}\in BV[t_{0},t_{f}] is the optimal costate trajectory.[2]

This result was first applied successfully to minimum-time problems in which the control input is constrained, but it can also be applied to problems with state constraints.

Special conditions on the control Hamiltonian can also be derived. If the final time \displaystyle t_{f} is fixed and the Hamiltonian does not depend explicitly on time \displaystyle \left({\tfrac {\partial H}{\partial t}}\equiv 0\right), then:

\displaystyle H(x^{*}(t),u^{*}(t),\lambda ^{*}(t))\equiv \mathrm {constant} \,

If instead the final time is free, then:
\displaystyle H(x^{*}(t),u^{*}(t),\lambda ^{*}(t))\equiv 0.\,
Satisfying Pontryagin's maximum principle along a trajectory is a necessary condition for optimality. The Hamilton–Jacobi–Bellman equation supplies a necessary and sufficient condition for an optimum, but that condition must hold over the whole of the state space.

Maximization and minimization

The principle was first known as Pontryagin's maximum principle, and its proof is historically based on maximizing the control Hamiltonian. Its initial application was to maximize the terminal speed of a rocket. However, since most later applications minimize a performance index, the result is often called Pontryagin's minimum principle. Pontryagin's book in fact solved the problem of minimizing a performance index.[3]

Notation

The following notation is used below.

\displaystyle \Psi _{T}(x(T))={\frac {\partial \Psi (x)}{\partial T}}|_{x=x(T)}\,

\displaystyle \Psi _{x}(x(T))={\begin{bmatrix}{\frac {\partial \Psi (x)}{\partial x_{1}}}|_{x=x(T)}&\cdots &{\frac {\partial \Psi (x)}{\partial x_{n}}}|_{x=x(T)}\end{bmatrix}}
\displaystyle H_{x}(x^{*},u^{*},\lambda ^{*},t)={\begin{bmatrix}{\frac {\partial H}{\partial x_{1}}}|_{x=x^{*},u=u^{*},\lambda =\lambda ^{*}}&\cdots &{\frac {\partial H}{\partial x_{n}}}|_{x=x^{*},u=u^{*},\lambda =\lambda ^{*}}\end{bmatrix}}
\displaystyle L_{x}(x^{*},u^{*})={\begin{bmatrix}{\frac {\partial L}{\partial x_{1}}}|_{x=x^{*},u=u^{*}}&\cdots &{\frac {\partial L}{\partial x_{n}}}|_{x=x^{*},u=u^{*}}\end{bmatrix}}
\displaystyle f_{x}(x^{*},u^{*})={\begin{bmatrix}{\frac {\partial f_{1}}{\partial x_{1}}}|_{x=x^{*},u=u^{*}}&\cdots &{\frac {\partial f_{1}}{\partial x_{n}}}|_{x=x^{*},u=u^{*}}\\\vdots &\ddots &\vdots \\{\frac {\partial f_{n}}{\partial x_{1}}}|_{x=x^{*},u=u^{*}}&\ldots &{\frac {\partial f_{n}}{\partial x_{n}}}|_{x=x^{*},u=u^{*}}\end{bmatrix}}

Formal statement of necessary conditions for the minimization problem

Here the necessary conditions for minimizing a functional are stated. Let \displaystyle x be the state of the dynamical system with input \displaystyle u, such that

\displaystyle {\dot {x}}=f(x,u),\quad x(0)=x_{0},\quad u(t)\in {\mathcal {U}},\quad t\in [0,T]

where
\displaystyle {\mathcal {U}} is the set of admissible controls, and
\displaystyle T is the terminal time of the system.

The control \displaystyle u\in {\mathcal {U}} must be chosen for all \displaystyle t\in [0,T] to minimize the objective functional \displaystyle J, which is defined by the application and can be written as

\displaystyle J=\Psi (x(T))+\int _{0}^{T}L(x(t),u(t))\,dt

The constraints on the system dynamics can be adjoined to the Lagrangian \displaystyle L by introducing a time-varying Lagrange multiplier vector \displaystyle \lambda, whose elements are called the costates of the system. This motivates the construction of the Hamiltonian, defined for all \displaystyle t\in [0,T], as:
\displaystyle H(x(t),u(t),\lambda (t),t)=\lambda ^{\rm {T}}(t)f(x(t),u(t))+L(x(t),u(t))\,
where \displaystyle \lambda ^{\rm {T}} is the transpose of \displaystyle \lambda .

Pontryagin's minimum principle states that the optimal state trajectory \displaystyle x^{*}, optimal control \displaystyle u^{*}, and corresponding Lagrange multiplier vector \displaystyle \lambda ^{*} must minimize the Hamiltonian \displaystyle H, so that

\displaystyle (1)\qquad H(x^{*}(t),u^{*}(t),\lambda ^{*}(t),t)\leq H(x^{*}(t),u,\lambda ^{*}(t),t)\,

for all time \displaystyle t\in [0,T] and for all admissible control inputs \displaystyle u\in {\mathcal {U}} . The following must also hold:
\displaystyle (2)\qquad \Psi _{T}(x(T))+H(T)=0\,
The costate equation must also be satisfied:
\displaystyle (3)\qquad -{\dot {\lambda }}^{\rm {T}}(t)=H_{x}(x^{*}(t),u^{*}(t),\lambda (t),t)=\lambda ^{\rm {T}}(t)f_{x}(x^{*}(t),u^{*}(t))+L_{x}(x^{*}(t),u^{*}(t))
If the final state \displaystyle x(T) is not fixed (i.e., its differential variation is not zero), the terminal costates must also satisfy
\displaystyle (4)\qquad \lambda ^{\rm {T}}(T)=\Psi _{x}(x(T))\,
Conditions (1)-(4) above are the necessary conditions for an optimal control. Condition (4) applies only when \displaystyle x(T) is free; if \displaystyle x(T) is fixed, it is not part of the necessary conditions.
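
As an illustrative worked example (added here for clarity; it is not part of the quoted article), apply conditions (1)-(4) to the scalar system \displaystyle {\dot {x}}=u with running cost \displaystyle L={\tfrac {1}{2}}u^{2}, free terminal state, and terminal cost \displaystyle \Psi (x(T))={\tfrac {1}{2}}x(T)^{2}. The Hamiltonian is \displaystyle H=\lambda u+{\tfrac {1}{2}}u^{2}. Condition (1) gives \displaystyle u^{*}=-\lambda , since \displaystyle \partial H/\partial u=\lambda +u=0 and \displaystyle \partial ^{2}H/\partial u^{2}=1>0. Condition (3) gives \displaystyle -{\dot {\lambda }}=H_{x}=0, so \displaystyle \lambda is constant, and condition (4) gives \displaystyle \lambda (T)=\Psi _{x}(x(T))=x(T). Hence \displaystyle u^{*}=-x(T) throughout, and substituting into \displaystyle x(T)=x_{0}+u^{*}T yields

\displaystyle x(T)={\frac {x_{0}}{1+T}},\qquad u^{*}=-{\frac {x_{0}}{1+T}}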

These results have been applied in cosmology and astrophysics.[4]

 

comes from.

We do not yet realize that the 'concept' had a 'beginning':

Fermat's principle of least time.

and an 'inception':

Euler's principle of least action.

With Lagrange it attained 'completion', and 'analytical mechanics' unfolded.

And how is its 'value' to be reckoned? A name passed down in history, nothing more!

Optimal control

Optimal control theory deals with the problem of finding a control law for a given system such that a certain optimality criterion is achieved.

It is an extension of the calculus of variations, and is a mathematical optimization method for deriving control policies. The method is largely due to the work of Lev Pontryagin and Richard Bellman in the 1950s, after contributions to calculus of variations by Edward J. McShane.[1] Optimal control can be seen as a control strategy in control theory.

General method

Optimal control deals with the problem of finding a control law for a given system such that a certain optimality criterion is achieved. A control problem includes a cost functional that is a function of state and control variables. An optimal control is a set of differential equations describing the paths of the control variables that minimize the cost function. The optimal control can be derived using Pontryagin’s maximum principle (a necessary condition also known as Pontryagin’s minimum principle or simply Pontryagin’s Principle),[2] or by solving the Hamilton–Jacobi–Bellman equation (a sufficient condition).

We begin with a simple example. Consider a car traveling in a straight line on a hilly road. The question is, how should the driver press the accelerator pedal in order to minimize the total traveling time? Clearly in this example, the term control law refers specifically to the way in which the driver presses the accelerator and shifts the gears. The system consists of both the car and the road, and the optimality criterion is the minimization of the total traveling time. Control problems usually include ancillary constraints. For example, the amount of available fuel might be limited, the accelerator pedal cannot be pushed through the floor of the car, speed limits must be respected, and so on.

A proper cost function will be a mathematical expression giving the traveling time as a function of the speed, geometrical considerations, and initial conditions of the system. It is often the case that the constraints are interchangeable with the cost function.

Another optimal control problem is to find the way to drive the car so as to minimize its fuel consumption, given that it must complete a given course in a time not exceeding some amount. Yet another control problem is to minimize the total monetary cost of completing the trip, given assumed monetary prices for time and fuel.

A more abstract framework goes as follows. Minimize the continuous-time cost functional

\displaystyle J=\Phi \,[\,{\textbf {x}}(t_{0}),t_{0},{\textbf {x}}(t_{f}),t_{f}\,]+\int _{t_{0}}^{t_{f}}{\mathcal {L}}\,[\,{\textbf {x}}(t),{\textbf {u}}(t),t\,]\,\operatorname {d} t

subject to the first-order dynamic constraints (the state equation)
\displaystyle {\dot {\textbf {x}}}(t)={\textbf {a}}\,[\,{\textbf {x}}(t),{\textbf {u}}(t),t\,],
the algebraic path constraints
\displaystyle {\textbf {b}}\,[\,{\textbf {x}}(t),{\textbf {u}}(t),t\,]\leq {\textbf {0}},
and the boundary conditions
\displaystyle {\boldsymbol {\phi }}\,[\,{\textbf {x}}(t_{0}),t_{0},{\textbf {x}}(t_{f}),t_{f}\,]=0
where \displaystyle {\textbf {x}}(t) is the state, \displaystyle {\textbf {u}}(t) is the control, \displaystyle t is the independent variable (generally speaking, time), \displaystyle t_{0} is the initial time, and \displaystyle t_{f} is the terminal time. The terms \displaystyle \Phi and \displaystyle {\mathcal {L}} are called the endpoint cost and Lagrangian, respectively. Furthermore, it is noted that the path constraints are in general inequality constraints and thus may not be active (i.e., equal to zero) at the optimal solution. It is also noted that the optimal control problem as stated above may have multiple solutions (i.e., the solution may not be unique). Thus, it is most often the case that any solution \displaystyle [{\textbf {x}}^{*}(t^{*}),{\textbf {u}}^{*}(t^{*}),t^{*}] to the optimal control problem is locally minimizing.
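
One way to make this abstract statement concrete is direct transcription: discretize the control, enforce the dynamics and boundary conditions numerically, and hand the result to a general-purpose NLP solver. The following hedged Python sketch does this for an illustrative scalar problem ({\dot {x}}=u, running cost u^2, x(t_0)=1, x(t_f)=0, path constraint |u| <= 5); all the problem data here are assumptions of mine, not the text's.

import numpy as np
from scipy.optimize import minimize

tf, n = 1.0, 50                  # horizon and number of time steps
dt = tf / n
x0, xT = 1.0, 0.0                # boundary conditions: x(0)=1, x(tf)=0

def cost(u):                     # running cost L = u^2, no endpoint cost
    return np.sum(u**2) * dt

def terminal_defect(u):          # forward-Euler integration of xdot = u
    x = x0 + np.cumsum(u) * dt
    return x[-1] - xT            # boundary condition as an equality constraint

res = minimize(cost, np.zeros(n),
               constraints={"type": "eq", "fun": terminal_defect},
               bounds=[(-5.0, 5.0)] * n)    # path constraint |u| <= 5
print("optimal cost:", res.fun)  # analytic optimum is (xT - x0)^2 / tf = 1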

Linear quadratic control

A special case of the general nonlinear optimal control problem given in the previous section is the linear quadratic (LQ) optimal control problem. The LQ problem is stated as follows. Minimize the quadratic continuous-time cost functional

\displaystyle J={\tfrac {1}{2}}\mathbf {x} ^{\mathsf {T}}(t_{f})\mathbf {S} _{f}\mathbf {x} (t_{f})+{\tfrac {1}{2}}\int _{t_{0}}^{t_{f}}[\,\mathbf {x} ^{\mathsf {T}}(t)\mathbf {Q} (t)\mathbf {x} (t)+\mathbf {u} ^{\mathsf {T}}(t)\mathbf {R} (t)\mathbf {u} (t)\,]\,\operatorname {d} t

Subject to the linear first-order dynamic constraints
\displaystyle {\dot {\mathbf {x} }}(t)=\mathbf {A} (t)\mathbf {x} (t)+\mathbf {B} (t)\mathbf {u} (t),
and the initial condition
\displaystyle \mathbf {x} (t_{0})=\mathbf {x} _{0}
A particular form of the LQ problem that arises in many control system problems is that of the linear quadratic regulator (LQR), where all of the matrices (i.e., \displaystyle \mathbf {A} , \displaystyle \mathbf {B} , \displaystyle \mathbf {Q} , and \displaystyle \mathbf {R} ) are constant, the initial time is arbitrarily set to zero, and the terminal time is taken in the limit \displaystyle t_{f}\rightarrow \infty (this last assumption is what is known as infinite horizon). The LQR problem is stated as follows. Minimize the infinite horizon quadratic continuous-time cost functional
\displaystyle J={\tfrac {1}{2}}\int _{0}^{\infty }[\,\mathbf {x} ^{\mathsf {T}}(t)\mathbf {Q} \mathbf {x} (t)+\mathbf {u} ^{\mathsf {T}}(t)\mathbf {R} \mathbf {u} (t)\,]\,\operatorname {d} t
Subject to the linear time-invariant first-order dynamic constraints
\displaystyle {\dot {\mathbf {x} }}(t)=\mathbf {A} \mathbf {x} (t)+\mathbf {B} \mathbf {u} (t),
and the initial condition
\displaystyle \mathbf {x} (t_{0})=\mathbf {x} _{0}
In the finite-horizon case the matrices are restricted in that \displaystyle \mathbf {Q} and \displaystyle \mathbf {R} are positive semi-definite and positive definite, respectively. In the infinite-horizon case, however, the matrices \displaystyle \mathbf {Q} and \displaystyle \mathbf {R} are not only positive-semidefinite and positive-definite, respectively, but are also constant. These additional restrictions on \displaystyle \mathbf {Q} and \displaystyle \mathbf {R} in the infinite-horizon case are enforced to ensure that the cost functional remains positive. Furthermore, in order to ensure that the cost function is bounded, the additional restriction is imposed that the pair \displaystyle (\mathbf {A} ,\mathbf {B} ) is controllable. Note that the LQ or LQR cost functional can be thought of physically as attempting to minimize the control energy (measured as a quadratic form).
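
The controllability requirement on the pair \displaystyle (\mathbf {A} ,\mathbf {B} ) can be checked numerically from the rank of the controllability matrix [B, AB, ..., A^(n-1)B]; here is a small Python sketch with illustrative matrices of my own choosing.

import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # illustrative system
B = np.array([[0.0], [1.0]])
n = A.shape[0]

# Stack [B, AB, ..., A^(n-1)B] and check for full row rank
C = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
print("controllable:", np.linalg.matrix_rank(C) == n)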

The infinite horizon problem (i.e., LQR) may seem overly restrictive and essentially useless because it assumes that the operator is driving the system to zero-state and hence driving the output of the system to zero. This is indeed correct. However the problem of driving the output to a desired nonzero level can be solved after the zero output one is. In fact, it can be proved that this secondary LQR problem can be solved in a very straightforward manner. It has been shown in classical optimal control theory that the LQ (or LQR) optimal control has the feedback form

\displaystyle \mathbf {u} (t)=-\mathbf {K} (t)\mathbf {x} (t)

where \displaystyle \mathbf {K} (t) is a properly dimensioned matrix, given as
\displaystyle \mathbf {K} (t)=\mathbf {R} ^{-1}\mathbf {B} ^{\mathsf {T}}\mathbf {S} (t),
and \displaystyle \mathbf {S} (t) is the solution of the differential Riccati equation. The differential Riccati equation is given as
\displaystyle {\dot {\mathbf {S} }}(t)=-\mathbf {S} (t)\mathbf {A} -\mathbf {A} ^{\mathsf {T}}\mathbf {S} (t)+\mathbf {S} (t)\mathbf {B} \mathbf {R} ^{-1}\mathbf {B} ^{\mathsf {T}}\mathbf {S} (t)-\mathbf {Q}
For the finite horizon LQ problem, the Riccati equation is integrated backward in time using the terminal boundary condition
\displaystyle \mathbf {S} (t_{f})=\mathbf {S} _{f}
For the infinite horizon LQR problem, the differential Riccati equation is replaced with the algebraic Riccati equation (ARE) given as
\displaystyle \mathbf {0} =-\mathbf {S} \mathbf {A} -\mathbf {A} ^{\mathsf {T}}\mathbf {S} +\mathbf {S} \mathbf {B} \mathbf {R} ^{-1}\mathbf {B} ^{\mathsf {T}}\mathbf {S} -\mathbf {Q}
Understanding that the ARE arises from the infinite horizon problem, the matrices \displaystyle \mathbf {A} , \displaystyle \mathbf {B} , \displaystyle \mathbf {Q} , and \displaystyle \mathbf {R} are all constant. It is noted that there are in general multiple solutions to the algebraic Riccati equation and the positive definite (or positive semi-definite) solution is the one that is used to compute the feedback gain. The LQ (LQR) problem was elegantly solved by Rudolf Kalman.[3]
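
To connect the finite- and infinite-horizon cases numerically, the hedged sketch below integrates the differential Riccati equation above backward in time from S(t_f) = S_f with SciPy; for a long enough horizon, S(t_0) approaches the stabilizing solution of the ARE. The matrices are illustrative assumptions.

import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
Sf = np.eye(2)                           # terminal condition S(tf) = Sf
t0, tf = 0.0, 5.0

def riccati_rhs(t, s_flat):
    # Sdot = -S A - A^T S + S B R^{-1} B^T S - Q
    S = s_flat.reshape(2, 2)
    dS = -S @ A - A.T @ S + S @ B @ np.linalg.solve(R, B.T) @ S - Q
    return dS.ravel()

# Integrate backward in time, from tf down to t0
sol = solve_ivp(riccati_rhs, (tf, t0), Sf.ravel())

S0 = sol.y[:, -1].reshape(2, 2)          # S(t0)
K0 = np.linalg.solve(R, B.T @ S0)        # time-varying gain at t0
print("K(t0) =", K0)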

 

※ Reread

control.lqr

control.lqr(A, B, Q, R[, N])
Linear quadratic regulator design

The lqr() function computes the optimal state feedback controller that minimizes the quadratic cost

J = \int_0^\infty (x' Q x + u' R u + 2 x' N u) dt

The function can be called with either 3, 4, or 5 arguments:

  • lqr(sys, Q, R)
  • lqr(sys, Q, R, N)
  • lqr(A, B, Q, R)
  • lqr(A, B, Q, R, N)

where sys is an LTI object, and A, B, Q, R, and N are 2d arrays or matrices of appropriate dimension.

PARAMETERS:

  A, B : 2-d array
      Dynamics and input matrices
  sys : LTI (StateSpace or TransferFunction)
      Linear I/O system
  Q, R : 2-d array
      State and input weight matrices
  N : 2-d array, optional
      Cross weight matrix

RETURNS:

  K : 2-d array
      State feedback gains
  S : 2-d array
      Solution to Riccati equation
  E : 1-d array
      Eigenvalues of the closed loop system
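
A minimal usage sketch of the function (assuming the python-control package is installed; the system matrices are illustrative choices of mine):

import numpy as np
import control

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # illustrative double integrator
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Gain matrix, Riccati solution, and closed-loop eigenvalues
K, S, E = control.lqr(A, B, Q, R)
print("K =", K)
print("closed-loop eigenvalues:", E)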