STEM Essays: Classical Mechanics: Dynamics 【5.5.A】

[Figure: portrait of Pierre François Verhulst]

[Figure: the logistic curve, P(t) = \frac{1}{1 + \mathrm e^{-t}}]

[Figure: the logit function, \operatorname{logit}(p)=\log\left( \frac{p}{1-p} \right)]

[Figure: linear regression, Y \approx F(X, \Box)]

[Figure: Maple plot of the logistic map, x_{n+1} = r x_n(1 - x_n)]

[Figure: animation of the logistic map]

[Figure: phase plot of x_{n+1} - x_n versus x_n for the logistic map]

[Figure: bifurcation diagram of the logistic map]

[Figure: logistic map]

[Figure: logistic map scatterplots]

[Figure: cobweb plots of the logistic map: fixed point, oscillation, chaos]

[Figure: Ganzhi (sexagenary cycle) diagram]

[Figure: animation of Newton's iteration]

In 1838 the Belgian mathematician Pierre François Verhulst published a 'population growth' equation,

\frac{dN}{dt} = r N \left(1 - \frac {N}{K} \right)

where N(t) is the population at time t, r is the natural growth rate, and K is the carrying capacity of the environment. Solving it yields

N(t) = \frac{K}{1+ C K e^{-rt}}

where C = \frac{1}{N(0)} - \frac{1}{K} encodes the initial condition. Verhulst called this function the 'logistic function', and so the differential equation came to be called the 'logistic equation'. If we 'normalize' it by writing P = \frac{N}{K}, so that \frac{dP}{dt} = r P \left(1 - P \right), and take CK = 1, r = 1, then from the logistic curve plotted above we see 0 < P < 1; that is, the population can never exceed the carrying capacity!
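As a quick numerical sanity check, here is a minimal sketch that integrates the equation with Euler steps and compares against the closed-form solution; the parameter values r, K, and N(0) are illustrative choices, not from the text:

```python
import numpy as np

# Logistic growth dN/dt = r*N*(1 - N/K); closed form N(t) = K/(1 + C*K*exp(-r*t))
# with C = 1/N(0) - 1/K.  Parameters are illustrative, not from the text.
r, K, N0 = 1.0, 100.0, 5.0
C = 1.0 / N0 - 1.0 / K

def N_exact(t):
    return K / (1.0 + C * K * np.exp(-r * t))

# Forward-Euler integration of the ODE for comparison
dt, T = 0.01, 10.0
N = N0
for _ in range(int(T / dt)):
    N += r * N * (1.0 - N / K) * dt

print(f"Euler: N({T}) = {N:.4f},  exact: N({T}) = {N_exact(T):.4f}")
# both values lie just below the carrying capacity K = 100
```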

If we invert P(t), we obtain t = \ln{\frac{P}{1-P}}. This inverse is called the 'logit' function, defined as

\operatorname{logit}(p)=\log\left( \frac{p}{1-p} \right) , \ 0 < p < 1

It is commonly used for 'binary choices', say the 'probability distribution' of 'To Be or Not To Be', and also in 'regression analysis' to test whether two 'variables' are statistically 'related' or 'unrelated'. If we examine it with 'infinitesimal' numbers, \log\left( \frac{\delta p}{1-\delta p} \right) = \log(\delta p) \approx - \infty and \log\left( \frac{1-\delta p} {\delta p}\right) = \log\left(\frac{1}{\delta p}\right) = \log(H) \approx \infty, we can perhaps better appreciate its 'polarity'!!
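A two-line check (a minimal sketch using only the standard library) that the logit really is the inverse of the logistic function:

```python
import math

def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))

def logit(p):   # defined for 0 < p < 1
    return math.log(p / (1.0 - p))

for t in (-3.0, 0.0, 2.5):
    print(t, logit(logistic(t)))   # recovers t up to rounding
```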

In 1976 the Australian scientist Robert McCredie May published the paper 'Simple mathematical models with very complicated dynamics', proposing the 'logistic map' recurrence x_{n+1} = r x_n(1 - x_n), \ 0\leq x_n <1. This recurrence looks like a 'difference version' of the 'logistic equation', yet it is the classic example of 'chaotic phenomena'. If a 'recurrence' has a 'limit' x_{\infty} = x_H, then x_H = r x_H(1-x_H), hence r{x_H}^2 = (r - 1) x_H, so x_H \approx 0 or x_H \approx \frac{r - 1}{r}. For r < 1 the 'logistic map' converges, quickly or slowly, to 'zero'; for 1 < r < 2 it approaches \frac{r - 1}{r} rapidly; for 2 < r < 3 it oscillates up and down while converging to \frac{r - 1}{r} linearly; at r = 3 it still converges to \frac{r - 1}{r}, but very slowly and no longer linearly; for 3 < r < 1 + \sqrt{6} \approx 3.45, from almost every 'initial condition' the system settles into an 'oscillation' between two values, which then becomes a 'sustained oscillation' among four, eight, sixteen … values as r grows; finally, at about r = 3.5699, the period-doubling ends and the system steps into the so-called 'chaotic state'!!
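The period-doubling route described above is easy to see numerically; a minimal sketch (the initial value x0 = 0.2 and the sampled r values are arbitrary illustrative choices):

```python
# Iterate x_{n+1} = r*x_n*(1 - x_n), discard a transient, and print the orbit tail.
def orbit_tail(r, x0=0.2, transient=1000, tail=8):
    x = x0
    for _ in range(transient):
        x = r * x * (1.0 - x)
    out = []
    for _ in range(tail):
        x = r * x * (1.0 - x)
        out.append(round(x, 5))
    return out

for r in (0.8, 1.8, 2.8, 3.2, 3.5, 3.9):
    print(r, orbit_tail(r))
# r <= 2.8: a single fixed point near (r-1)/r; r = 3.2: two values;
# r = 3.5: four values; r = 3.9: no visible period (chaos)
```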

A 'continuous' differential equation shows no 'chaos', yet the 'discrete' difference equation does produce 'chaotic phenomena'; so is this 'quantum' 'universe' chaotic or not?? Recalling the 'recursive functions' of the 'λ calculus' and the mathematical definition of a 'fixed point', the 'logistic map' can be viewed as 'iterated evaluation' of the function f(x) = r \cdot x(1 - x): x_1 = f(x_0), x_2 = f(x_1), \cdots x_{k+1} = f(x_k) \cdots. Whenever f^{(p)}(x_f) = \underbrace{f(f(\cdots f}_{p \text{ times}}(x_f)\cdots)) = x_f, this x_f is a 'fixed point'; the cobweb figure above shows how the iteration behaves for different values of r, passing from 'fixed point' through 'oscillation' to 'chaos'. If we rewrite the 'logistic equation' as \Delta P(t) = P(t + \Delta t) - P(t) = \left( r P(t) \left[ 1 - P(t) \right] \right) \cdot \Delta t and take t = n \Delta t, \Delta t = 1, we get P(n + 1) - P(n) = r P(n) \left[ 1 - P(n) \right], whose 'limits' are P(H) \approx 0, 1 regardless of r, which shows that the two systems have different 'roots'. Yet this suggests a 'time series' viewpoint: reading x_n as x(n \Delta t), \ \Delta t = 1, the quotient \frac{x[(n+1) \Delta t] - x[n \Delta t]}{\Delta t} = x_{n+1} - x_n can be called a 'velocity', so that (x_n, x_{n+1} - x_n) forms a hypothetical 'phase space', translating a 'recurrence' into a kind of 'symbolic dynamics'!!
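The pseudo phase space is equally easy to construct; a minimal sketch (again with arbitrary illustrative r and x0):

```python
# Collect points (x_n, x_{n+1} - x_n): position vs. one-step "velocity" (Delta t = 1).
def phase_points(r, x0=0.2, n=500):
    pts, x = [], x0
    for _ in range(n):
        x_next = r * x * (1.0 - x)
        pts.append((x, x_next - x))
        x = x_next
    return pts

print(phase_points(3.9)[:3])   # feed these pairs to any scatter plot
```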

For certain values of r this 'recurrence' has an exact solution. For example, at r=2, x_n = \frac{1}{2} - \frac{1}{2}(1-2x_0)^{2^{n}}. Since x_0 \in (0,1), we have (1-2x_0)\in (-1,1), so n \approx \infty \Longrightarrow (1-2x_0)^{2^{n}} \approx 0 and therefore x_H \approx \frac{1}{2}. Moreover, since the 'exponent' 2^n is 'even', this 'symbolic dynamical system' approaches its 'limit' at a non-uniform (nonlinear) rate and without oscillation.
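A minimal check of this closed form against direct iteration (x0 = 0.1 is an arbitrary illustrative choice):

```python
# For r = 2 the orbit has the exact solution x_n = 1/2 - (1/2)*(1 - 2*x0)**(2**n).
x0 = 0.1
x = x0
for n in range(1, 6):
    x = 2.0 * x * (1.0 - x)                              # direct iteration
    exact = 0.5 - 0.5 * (1.0 - 2.0 * x0) ** (2 ** n)     # closed form
    print(n, x, exact)   # columns agree; both rise monotonically toward 1/2
```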

─── 《【Sonic π】電路學之補充《四》無窮小算術‧中下上》

 

Inconceivable events leave one dumbfounded; unbelievable phenomena spur one to probe the mystery:

Non-Standard Approach

Two different approaches to the polynomial parts of the splines are implemented in PyTrajectory. They differ in the choice of nodes. Given an interval [x_k, x_{k+1}],\ k \in \{0, \ldots, N-1\}, the standard way in spline interpolation is to define the corresponding polynomial part using the left endpoint:

P_k(t) = c_{k,0} (t - x_k)^3 + c_{k,1}(t - x_k)^2 + c_{k,2}(t - x_k) + c_{k,3}

However, the thesis of O. Schnabel, from which PyTrajectory emerged, takes a different approach: the polynomial parts are defined using the right endpoint, i.e.

P_k(t) = c_{k,0} (t - x_{k+1})^3 + c_{k,1}(t - x_{k+1})^2 + c_{k,2}(t - x_{k+1}) + c_{k,3}

This results in a different matrix for enforcing the smoothness and boundary conditions of the splines.
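Either way, the two forms describe the same cubic on the interval; re-expanding about the other endpoint is just a linear change of coefficients. A minimal sketch (the coefficients and interval are invented for illustration; this is not PyTrajectory code):

```python
import numpy as np

xk, xk1 = 0.0, 1.0
h = xk1 - xk
c = np.array([1.0, -2.0, 0.5, 3.0])   # hypothetical left-endpoint coefficients c_{k,0..3}

# Re-expand P_k about x_{k+1}: substitute (t - x_k) = (t - x_{k+1}) + h.
c0, c1, c2, c3 = c
c_right = np.array([
    c0,
    3*c0*h + c1,
    3*c0*h**2 + 2*c1*h + c2,
    c0*h**3 + c1*h**2 + c2*h + c3,
])

t = np.linspace(xk, xk1, 5)
print(np.polyval(c, t - xk))          # left-endpoint form
print(np.polyval(c_right, t - xk1))   # right-endpoint form: identical values
```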

Both approaches remain implemented because, after switching to the standard approach, some of the examples no longer converged to a solution. The affected examples are:

 

Can 'choosing left' versus 'choosing right' really make a difference?

Recall that PyTrajectory uses the Levenberg-Marquardt algorithm,

Levenberg-Marquardt Method

The Levenberg-Marquardt method can be used to solve nonlinear least squares problems. It is an extension of the Gauss-Newton method and, writing F for the residual function and J for its Jacobian, solves the following minimization problem:

\min_{s} \; \left\| F(x_k) + J(x_k)\, s \right\|_2^2 + \mu^2 \left\| s \right\|_2^2

The real number \mu is a freely chosen parameter that attenuates the step size (x_{k+1} - x_k). This prevents excessively large corrections, which often occur with the Gauss-Newton method and can keep it from reaching the local minimum. With vanishing attenuation, \mu = 0, the Gauss-Newton method is recovered as a special case of the Levenberg-Marquardt method. The iteration can be written in the following form:

x_{k+1} = x_k + s_k , \qquad s_k = -\left( J(x_k)^T J(x_k) + \mu^2 I \right)^{-1} J(x_k)^T F(x_k)

Convergence can now be influenced through the parameter \mu. The disadvantage is that, to guarantee convergence, \mu must be chosen large enough, but this at the same time leads to very small corrections. Thus the Levenberg-Marquardt method has a lower order of convergence than the Gauss-Newton method, but it approaches the desired solution at every step.
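A minimal sketch of this damped iteration on a toy linear residual F(x) = Jx - y (the data and the fixed choice of \mu are invented for illustration; with a linear residual the least-squares solution is known, so the effect of \mu is easy to observe):

```python
import numpy as np

J = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])          # exactly y = 1 + 2*t

def lm_step(x, mu):
    """s_k = -(J^T J + mu^2 I)^{-1} J^T F(x_k) with F(x) = J x - y."""
    F = J @ x - y
    A = J.T @ J + mu**2 * np.eye(J.shape[1])
    return np.linalg.solve(A, -J.T @ F)

x = np.zeros(2)
for _ in range(50):
    x = x + lm_step(x, mu=1.0)
print(x)   # approaches (1, 2); a larger mu gives smaller steps, hence slower progress
```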

Control of the parameter \mu

The criterion by which the parameter is chosen is the change of the actual residual

R(x_k, s_k) = \left\| F(x_k) \right\|_2^2 - \left\| F(x_k + s_k) \right\|_2^2

and the change of the residual of the linearized approximation

\tilde{R}(x_k, s_k) = \left\| F(x_k) \right\|_2^2 - \left\| F(x_k) + J(x_k)\, s_k \right\|_2^2 .

As a control criterion, the following quotient is introduced:

\rho = \frac{R(x_k, s_k)}{\tilde{R}(x_k, s_k)}

It follows that R(x_k,s_k) \geq 0, and for a meaningful correction \tilde{R}(x_k, s_k) \geq 0 must hold as well. Thus \rho is positive and \rho \rightarrow 1 for \mu \rightarrow \infty. Therefore \rho should lie between 0 and 1. To control \mu, two limits b_0 and b_1 are introduced with 0 < b_0 < b_1 < 1; for b_0 = 0.2, b_1 = 0.8 the following criteria are used (a code sketch of the rule follows the list).

  • \rho \leq b_0 : \mu is doubled and s_k is recalculated
  • b_0 < \rho < b_1 : \mu is kept for the next step and s_k is used
  • \rho \geq b_1 : s_k is accepted and \mu is halved for the next iteration
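A minimal sketch of this update rule (illustrative only, not PyTrajectory's implementation):

```python
B0, B1 = 0.2, 0.8

def update_mu(rho, mu):
    """Return (new_mu, accept) for the quotient rho = R / R~ of the last step."""
    if rho <= B0:
        return 2.0 * mu, False    # double mu and recalculate s_k
    elif rho < B1:
        return mu, True           # keep mu for the next step and use s_k
    else:
        return 0.5 * mu, True     # accept s_k and halve mu for the next iteration
```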

 

This belongs to the family of so-called

Trust-region methods

置信域方法Trust-region methods)又稱為信賴域方法,它是一種最優化方法,能夠保證最優化方法總體收斂。

Development of the algorithm

Trust-region methods can be traced back to Levenberg (1944), Marquardt (1963), and Goldfeld, Quandt and Trotter (1966), but the modern form was proposed by Powell (1970). He explicitly formulated the trust-region subproblem, the criterion for accepting the step \displaystyle s_{k}, the rule for updating the trust-region radius \displaystyle \Delta _{k}, and convergence theorems. These measures give trust-region methods a significant advantage over line-search methods.

 

Indeed! Could it be that some of those examples harbor 'dynamical chaos' inside their trust regions??

Trust region

Trust region is a term used in mathematical optimization to denote the subset of the region of the objective function that is approximated using a model function (often a quadratic). If an adequate model of the objective function is found within the trust region then the region is expanded; conversely, if the approximation is poor then the region is contracted. Trust region methods are also known as restricted step methods.

The fit is evaluated by comparing the ratio of expected improvement from the model approximation with the actual improvement observed in the objective function. Simple thresholding of the ratio is used as the criterion for expansion and contraction—a model function is “trusted” only in the region where it provides a reasonable approximation.

Trust region methods are in some sense dual to line search methods: trust region methods first choose a step size (the size of the trust region) and then a step direction while line search methods first choose a step direction and then a step size.

The earliest use of the term seems to be by Sorensen (1982).

Example

Conceptually, in the Levenberg–Marquardt algorithm, the objective function is iteratively approximated by a quadratic surface, then using a linear solver, the estimate is updated. This alone may not converge nicely if the initial guess is too far from the optimum. For this reason, the algorithm instead restricts each step, preventing it from stepping “too far”. It operationalizes “too far” as follows. Rather than solving \displaystyle A\Delta x=b for \displaystyle \Delta x , it solves \displaystyle (A+\lambda \operatorname {diag} (A))\Delta x=b where \displaystyle \operatorname {diag} (A) is the diagonal matrix with the same diagonal as A and λ is a parameter that controls the trust-region size. Geometrically, this adds a paraboloid centered at \displaystyle \Delta x=0 to the quadratic form, resulting in a smaller step.

The trick is to change the trust-region size (λ). At each iteration, the damped quadratic fit predicts a certain reduction in the cost function, \displaystyle \Delta f_{\mathrm {pred} } , which we would expect to be a smaller reduction than the true reduction. Given \displaystyle \Delta x we can evaluate

\displaystyle \Delta f_{\mathrm {actual} }=f(x)-f(x+\Delta x)

By looking at the ratio \displaystyle \Delta f_{\mathrm {pred} }/\Delta f_{\mathrm {actual} } we can adjust the trust-region size. In general, we expect \displaystyle \Delta f_{\mathrm {pred} } to be a bit less than \displaystyle \Delta f_{\mathrm {actual} } and so the ratio would be between, say, 0.25 and 0.5. If the ratio is more than 0.5, then we aren’t damping the step much, so expand the trust region (decrease λ), and iterate. If the ratio is smaller than 0.25, then the true function is diverging “too much” from the trust-region approximation, so shrink the trust region (increase λ) and try again.
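A minimal sketch of this λ-control loop for a generic cost f with gradient g and model Hessian A (all names here are invented for illustration; the thresholds 0.25 and 0.5 follow the text):

```python
import numpy as np

def trust_region_step(f, g, A, x, lam):
    """One damped step: solve (A + lam*diag(A)) dx = -g(x), then resize lam."""
    dx = np.linalg.solve(A + lam * np.diag(np.diag(A)), -g(x))
    pred = -(g(x) @ dx + 0.5 * dx @ A @ dx)   # reduction predicted by the model
    actual = f(x) - f(x + dx)                 # reduction actually observed
    ratio = pred / actual
    if ratio > 0.5:        # barely damped: trust the model more
        return x + dx, 0.5 * lam
    elif ratio < 0.25:     # model diverges from f: shrink region, retry from x
        return x, 2.0 * lam
    else:                  # acceptable fit: take the step, keep lam
        return x, lam

# Toy quadratic cost f(x) = 0.5*x^T A x - b^T x, so g(x) = A x - b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
g = lambda x: A @ x - b

x, lam = np.zeros(2), 1.0
for _ in range(20):
    x, lam = trust_region_step(f, g, A, x, lam)
print(x, np.linalg.solve(A, b))   # the iterates approach the exact minimizer
```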

 

All of which forces one to confront 'numerical stability',

In the mathematical subfield of numerical analysis, numerical stability is a generally desirable property of numerical algorithms. The precise definition of stability depends on the context. One is numerical linear algebra and the other is algorithms for solving ordinary and partial differential equations by discrete approximation.

In numerical linear algebra the principal concern is instabilities caused by proximity to singularities of various kinds, such as very small or nearly colliding eigenvalues. In numerical algorithms for differential equations, on the other hand, the concern is the growth of round-off errors and/or of small fluctuations in the initial data, which might cause a large deviation of the final answer from the exact solution.

Some numerical algorithms may damp out the small fluctuations (errors) in the input data; others might magnify such errors. Calculations that can be proven not to magnify approximation errors are called numerically stable. One of the common tasks of numerical analysis is to try to select algorithms which are robust – that is to say, do not produce a wildly different result for very small change in the input data.

An opposite phenomenon is instability. Typically, an algorithm involves an approximate method, and in some cases one could prove that the algorithm would approach the right solution in some limit (when using exact real numbers rather than floating point). Even then, there is no guarantee that it converges to the correct solution, because floating-point round-off or truncation errors can be magnified instead of damped, causing the deviation from the exact solution to grow exponentially.[1]
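A classic illustration of such magnification (a standard textbook example, not taken from the quoted passage): the integrals I_n = \int_0^1 x^n e^{x-1}\,dx obey I_n = 1 - n\,I_{n-1} with I_0 = 1 - 1/e. Run forward, the recurrence multiplies the initial rounding error by n! and is violently unstable; run backward, the same relation damps errors and is stable:

```python
import math

I = 1.0 - math.exp(-1.0)        # I_0, correct to machine precision
for n in range(1, 21):
    I = 1.0 - n * I             # forward recurrence: error grows like n!
print("forward  I_20 =", I)     # wildly wrong, of absurd magnitude

J = 0.0                         # crude guess for I_30; each backward step divides the error by n
for n in range(30, 20, -1):
    J = (1.0 - J) / n           # backward recurrence: stable
print("backward I_20 =", J)     # approx 0.0455, correct to about 10 digits
```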

 

that headache of a problem!!