Time Series: From the Micro to the Macro

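How should one understand a country's wealth? If every trade prospers, markets flourish, and people's incomes rise year after year, is that enough to secure that wealth? Or might one say that the country is not merely wealthy but already on the broad road to wealth and power? People need reasons and data to explain what is happening now, and they would like a crystal ball to forecast future trends even more. Because everything changes with the flow of time, people are driven again and again to study time series.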

Time series

Time series (時間序列) is a statistical method used in empirical economics.

Meaning

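A time series is a set of random variables ordered in time. Gross domestic product (GDP), the consumer price index (CPI), the Taiwan Capitalization Weighted Stock Index, interest rates, and exchange rates are all time series.

The interval of a time series can be minutes or seconds (as in high-frequency financial data), or days, weeks, months, quarters, years, or even longer units of time.

Time series are one of the three major forms of data studied in econometrics (the other two are cross-sectional data and longitudinal data), and they are widely used in macroeconomics, international economics, finance, financial engineering, and related disciplines.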

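Characteristics of time series variables

  • Nonstationarity (非平穩性, also rendered 不平穩性 or 非穩定性): the variance of the time series does not show a long-run trend that eventually settles to a constant or a linear function
  • Time-varying volatility: the variance of a time series variable changes as time passes

These two characteristics make it very hard to analyze time series variables effectively.

A stationary time series is one whose statistical properties do not change over time.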
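To make the contrast concrete, here is a minimal Python sketch that is not taken from the source. It assumes Gaussian white noise as the stationary example and a random walk (the running sum of that noise) as the nonstationary one; the function names and the choice of 2000 simulated paths are illustrative only. It estimates the variance across paths at an early and a late time index.

```python
import random
import statistics

def white_noise(n, rng):
    """Independent draws with fixed mean 0 and fixed variance 1 (stationary)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def random_walk(n, rng):
    """Running sum of white noise; its variance grows with time (nonstationary)."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def variance_at(make_series, t, n_paths=2000, seed=0):
    """Variance, across many simulated paths, of the value at time index t."""
    rng = random.Random(seed)
    values = [make_series(t + 1, rng)[t] for _ in range(n_paths)]
    return statistics.pvariance(values)

noise_early, noise_late = variance_at(white_noise, 9), variance_at(white_noise, 99)
walk_early, walk_late = variance_at(random_walk, 9), variance_at(random_walk, 99)
print("white noise:", noise_early, noise_late)
print("random walk:", walk_early, walk_late)
```

Under these assumptions the white-noise variance stays near 1 at both times, while the random-walk variance grows roughly in proportion to the number of steps taken.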

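Assumptions of traditional econometrics

  1. A time series variable is assumed to be drawn at random from some stochastic process and arranged in time order, so a (narrow-sense) stable trend (stationarity) must exist, that is, the mean is fixed
  2. The volatility of a time series variable is assumed not to change over time, that is, the variance is fixed. This clearly does not match reality: it has long been known that the volatility of stock returns changes over time and is not a constant

These two assumptions keep traditional econometric methods from analyzing real-world time series variables effectively. The contributions of Clive Granger and Robert Engle solved this problem.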
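One simple way to see whether the fixed-variance assumption holds is to compute the variance over a sliding window. The sketch below is only an illustration: the "returns" are hypothetical simulated numbers (a calm stretch followed by a turbulent one), and `rolling_variance` is a helper name introduced here, not a method attributed to Granger or Engle.

```python
import random
import statistics

def rolling_variance(series, window):
    """Sample variance over each consecutive window of the given length."""
    return [statistics.variance(series[i:i + window])
            for i in range(len(series) - window + 1)]

rng = random.Random(1)
# Hypothetical returns: 500 calm observations (sigma = 1), then 500 turbulent ones (sigma = 3).
returns = ([rng.gauss(0.0, 1.0) for _ in range(500)]
           + [rng.gauss(0.0, 3.0) for _ in range(500)])

rv = rolling_variance(returns, window=100)
calm, turbulent = rv[0], rv[-1]   # first and last 100-observation windows
print(calm, turbulent)
```

If the variance were truly fixed, the two printed values would be close to each other; here the last window is several times more variable than the first.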

 

This is all the more true today, in the age of big data, when people long to master methods for turning information into gold!

Time series: random data plus trend, with best-fit line and different applied filters

Time series

A time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.

Time series are very frequently plotted via line charts. Time series are used in statistics, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, intelligent transport and trajectory forecasting,[1] earthquake prediction, electroencephalography, control engineering, astronomy, communications engineering, and largely in any domain of applied science and engineering which involves temporal measurements.

Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. Time series forecasting is the use of a model to predict future values based on previously observed values. While regression analysis is often employed in such a way as to test theories that the current values of one or more independent time series affect the current value of another time series, this type of analysis of time series is not called “time series analysis”, which focuses on comparing values of a single time series or multiple dependent time series at different points in time.[2]

Time series data have a natural temporal ordering. This makes time series analysis distinct from cross-sectional studies, in which there is no natural ordering of the observations (e.g. explaining people’s wages by reference to their respective education levels, where the individuals’ data could be entered in any order). Time series analysis is also distinct from spatial data analysis where the observations typically relate to geographical locations (e.g. accounting for house prices by the location as well as the intrinsic characteristics of the houses). A stochastic model for a time series will generally reflect the fact that observations close together in time will be more closely related than observations further apart. In addition, time series models will often make use of the natural one-way ordering of time so that values for a given period will be expressed as deriving in some way from past values, rather than from future values (see time reversibility.)

Time series analysis can be applied to real-valued, continuous data, discrete numeric data, or discrete symbolic data (i.e. sequences of characters, such as letters and words in the English language[3]).

Methods for time series analysis

Methods for time series analysis may be divided into two classes: frequency-domain methods and time-domain methods. The former include spectral analysis and wavelet analysis; the latter include auto-correlation and cross-correlation analysis. In the time domain, correlation and analyses can be made in a filter-like manner using scaled correlation, thereby mitigating the need to operate in the frequency domain.
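As a small time-domain illustration, the following Python sketch computes a sample autocorrelation at a given lag. It uses one common estimator (the lagged sum of products divided by the lag-0 sum of squares); the function name and the toy alternating series are assumptions made for this example.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation: lagged covariance sum divided by the lag-0 sum."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t + lag] - mean)
                for t in range(n - lag))
    return numer / denom

# Toy series that flips sign at every step: 1, -1, 1, -1, ...
alternating = [1.0, -1.0] * 4

print(autocorrelation(alternating, 1))  # -0.875
print(autocorrelation(alternating, 2))  # 0.75
```

Neighbouring values are strongly anti-correlated and values two steps apart are strongly correlated, which is the kind of dependence between time points that time-domain methods are built to detect.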

Additionally, time series analysis techniques may be divided into parametric and non-parametric methods. The parametric approaches assume that the underlying stationary stochastic process has a certain structure which can be described using a small number of parameters (for example, using an autoregressive or moving average model). In these approaches, the task is to estimate the parameters of the model that describes the stochastic process. By contrast, non-parametric approaches explicitly estimate the covariance or the spectrum of the process without assuming that the process has any particular structure.
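The parametric idea can be sketched with the simplest autoregressive model, x_t = phi * x_{t-1} + noise. The example below is a hedged illustration rather than a reference implementation: it simulates such a series with an assumed phi = 0.7 and Gaussian noise, then recovers the single parameter by ordinary least squares. The function names are hypothetical.

```python
import random

def simulate_ar1(phi, n, seed=0):
    """Simulate x_t = phi * x_{t-1} + e_t with standard Gaussian noise e_t."""
    rng = random.Random(seed)
    x, series = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series

def estimate_ar1(series):
    """Least-squares estimate of phi from consecutive pairs (x_{t-1}, x_t)."""
    numer = sum(prev * curr for prev, curr in zip(series, series[1:]))
    denom = sum(prev * prev for prev in series[:-1])
    return numer / denom

series = simulate_ar1(phi=0.7, n=5000)
phi_hat = estimate_ar1(series)
print(phi_hat)
```

With 5000 observations the estimate should land close to the assumed value of 0.7; the whole stochastic process is then summarised by that one number plus the noise variance.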

Methods of time series analysis may also be divided into linear and non-linear, and univariate and multivariate.

 

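However, because the subject involves 'randomness' and 'statistics', anyone who wants to move from the micro and the individual to the macro and the aggregate had better have a solid foundation. So let us look back at the birth of one physical 'model':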

[Figure: Early pinball machine]

[Figure: Electrona in crystallo fluentia (electrons flowing in a crystal)]

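In 1900 the German physicist Paul Karl Ludwig Drude proposed a model of 'electrical conduction'. He wanted to derive 'Ohm's law' from a 'microscopic' point of view. Although today it may need some quantum-mechanical corrections, this simple classical model gives a very good account of 'direct-current' and 'alternating-current' conduction in 'metals', of the 'Hall effect' in a magnetic field, and of 'thermal conduction'.

Drude pictured a 'conductor' as made up of relatively fixed 'positive ions' and mobile 'free electrons'. These very numerous 'free electrons' constantly 'collide' with one another and also with the fixed 'positive ions', much like the 'balls' in a 'pinball machine'. So just how many 'free electrons' are there? If D_e denotes the 'electron density', then D_e = N_A \frac{Z_c \rho_m}{A_m}, where N_A = 6.02 \times {10}^{23} is the Avogadro constant, Z_c is the number of 'free electrons' contributed by one metal atom, \rho_m is the mass density of the metal, and A_m is the atomic weight of the metal.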

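For example, 'sodium' (Na) readily forms the monovalent 'sodium ion', so take Z_c = 1. Then D_{eNa} = 6.02 \times {10}^{23} atoms/mole \cdot \frac {1 e / atom \cdot 0.968 \times {10}^6 g / m^3}{22.98 g/mole} = 2.54 \times {10}^{28} e / m^3. One gram of sodium, which occupies about one cubic centimeter, therefore holds on the 'order of magnitude' of {10}^{22} 'free electrons'.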
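The arithmetic can be checked with a few lines of Python. This is only a sketch of the formula D_e = N_A Z_c \rho_m / A_m using the sodium figures quoted in the text; the function and variable names are introduced here for illustration.

```python
AVOGADRO = 6.02e23  # atoms per mole, the rounded value used in the text

def electron_density(valence, mass_density_g_per_m3, atomic_weight_g_per_mol):
    """Free electrons per cubic meter: N_A * Z_c * rho_m / A_m."""
    return AVOGADRO * valence * mass_density_g_per_m3 / atomic_weight_g_per_mol

SODIUM_DENSITY = 0.968e6   # g / m^3
SODIUM_WEIGHT = 22.98      # g / mole

d_na = electron_density(1, SODIUM_DENSITY, SODIUM_WEIGHT)
electrons_per_gram = d_na / SODIUM_DENSITY  # electrons in one gram of sodium

print(f"{d_na:.3e} electrons per m^3")
print(f"{electrons_per_gram:.3e} electrons per gram")
```

The first figure comes out at about 2.54e28 per cubic meter and the second is of order 1e22, in line with the estimate above.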

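Suppose we treat them as a 'free electron gas' and study these 'free electrons' with the classical 'kinetic theory' of gases developed by the Austrian physicist Ludwig Eduard Boltzmann. Then, just as for an ideal gas, in 'thermal equilibrium' the 'thermal velocity' v_{thermal} of a 'free electron' can be computed from \frac {1} {2} m \cdot \overline{{v_{thermal}}^2} = \frac {3} {2} k_B T, where k_B = 1.3806488(13) \times 10^{-23} \mbox{ J K}^{-1} is the Boltzmann constant and T is the 'absolute temperature'. At room temperature, {25}^{\circ} C = 298.15 K, the 'thermal velocity' of a 'free electron' is then about v_{thermal} = \sqrt{\frac {3 k_B T}{m}} = 1.16 \times {10}^5 m/s.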
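A quick numerical check of this estimate, as a sketch: the Boltzmann constant is the value quoted in the text, while the electron mass (about 9.109e-31 kg) is not given in the source and is supplied here as an assumption.

```python
import math

K_B = 1.3806488e-23      # J / K, Boltzmann constant as quoted in the text
ELECTRON_MASS = 9.109e-31  # kg, assumed value (not stated in the source)

def thermal_speed(temperature_kelvin, mass_kg=ELECTRON_MASS):
    """Root-mean-square speed from (1/2) m <v^2> = (3/2) k_B T."""
    return math.sqrt(3.0 * K_B * temperature_kelvin / mass_kg)

v_room = thermal_speed(298.15)  # 25 degrees Celsius
print(f"{v_room:.3e} m/s")
```

The result is roughly 1.16e5 m/s, that is, a little over a hundred kilometers per second.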

[Figure: Ludwig Boltzmann, pioneer of statistical mechanics]

[Figure: Translational motion]

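This speed is more than a hundred kilometers per second, which is hardly small, and in the quantum statistical mechanics of a 'Fermi gas' it would be about ten times larger still. Yet because it is 'equally distributed' over 'all directions', its statistical contribution to the 'net current' is 'zero'. In other words, \langle \vec{v}_{thermal} \rangle = 0.

So how did Drude treat these 'collisions'? Put differently, which 'assumptions' did he make? This is exactly what matters when judging the 'reasonableness' and 'suitability' of a 'physical model'. In today's terms, Drude assumed: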

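1. In the absence of an external 'electromagnetic field', a 'free electron' moves in a 'straight line', and the 'electromagnetic forces' between electrons can be neglected. This amounts to an 'independent electron' assumption: each electron sits in an 'average environment' formed by the 'positive ions' and the 'other electrons', so the net effect on it is zero. Statistically, this is generally regarded as 'reasonable'.

2. 'Collisions' between 'electrons' and 'positive ions' are 'instantaneous' and are statistically uncorrelated 'random events', so in aggregate they make no 'net contribution', although various scholars have 'criticized' the 'appropriateness' of this assumption. Seen as 'scattering events', however, it may be just another way of stating the 'isotropy' of certain 'material properties'.

3. A 'mean collision time' \tau is assumed to 'exist', so the 'chance' of a 'collision' during a very short interval \delta t is \frac {\delta t}{\tau}, and this 'probability' does not depend on the 'position' or 'momentum' of the 'free electron'. It is like 'dropping a needle' of length \delta t onto a board ruled with 'grid lines' spaced \tau apart and asking for the 'probability' that the 'needle' lands 'on a line'. This is usually considered a very good 'approximation'.

4. After a 'collision', the 'thermal electron' should have the velocity of the local 'thermal equilibrium'. This is an assumption that the interaction obeys a 'principle of locality', and from the standpoint of 'physical causality' it is generally taken to be 'correct'.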
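Assumption 3 can be illustrated with a small Monte Carlo experiment. The sketch below relies on an implication that the source does not spell out: if the collision chance per short interval is \delta t / \tau regardless of history, the waiting time to the next collision is exponentially distributed with mean \tau. The function name, the electron count, and the units (\tau = 1) are arbitrary choices for this example.

```python
import random

def collision_fraction(delta_t, tau, n_electrons=200_000, seed=0):
    """Fraction of simulated electrons whose next collision falls within delta_t.

    Waiting times are drawn from an exponential distribution with mean tau,
    which is what a constant collision rate 1/tau implies (an assumption here).
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_electrons)
               if rng.expovariate(1.0 / tau) < delta_t)
    return hits / n_electrons

tau = 1.0        # mean collision time, arbitrary units
delta_t = 0.01   # short interval, much smaller than tau

frac = collision_fraction(delta_t, tau)
print(frac, delta_t / tau)
```

For \delta t much smaller than \tau, the simulated fraction should be close to \delta t / \tau; the agreement degrades as \delta t grows, which is why the assumption is stated for a very short interval.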

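How, then, do we derive the equation for the 'mean momentum' of a 'free electron' in an external, time-varying 'force field' \vec{F}(t)? Suppose that at time t a 'free electron' has 'momentum' \vec{p}(t). Its momentum \vec{p}(t+dt) at time t + dt can be reasoned out as follows. If the 'free electron' undergoes a 'collision', which by 'Assumption 3' happens with probability P_c = \frac{dt}{\tau}, then by 'Assumption 2' its net contribution to the 'mean momentum' is 'zero': \langle {\vec{p}}_c(t+dt) \rangle = 0. If instead the 'free electron' does not 'collide', then by 'Newton's second law of motion' \langle {\vec{p}}_{nc}(t+dt) \rangle = \langle \vec{p}(t) \rangle + \vec{F}(t)dt, and the probability of no 'collision' is P_{nc} = 1 - P_c = 1 - \frac{dt}{\tau}. Therefore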

\langle \vec{p}(t+dt) \rangle = P_c \cdot \langle {\vec{p}}_c(t+dt) \rangle + P_{nc} \cdot \langle {\vec{p}}_{nc}(t+dt) \rangle

= \left( 1 - \frac{dt}{\tau} \right) \left( \langle \vec{p}(t) \rangle + \vec{F}(t)dt \right), so, keeping only terms of first order in dt, we obtain

\frac{d  \langle \vec{p}(t) \rangle}{dt} = \frac{\langle \vec{p}(t+dt) \rangle - \langle \vec{p}(t) \rangle}{dt} = - \frac{\langle \vec{p}(t) \rangle}{\tau} + \vec{F}(t)

This is the equation of motion of the electron in the Drude model.

Excerpted from 《【Sonic π】電聲學補充《三》上》
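The update rule used in the derivation, \langle \vec{p}(t+dt) \rangle = (1 - dt/\tau)(\langle \vec{p}(t) \rangle + \vec{F}(t)dt), can be iterated directly. The following one-dimensional Python sketch does so; the function name, the constant force, and the numerical values of \tau and dt are assumptions chosen only to show the behaviour.

```python
def evolve_mean_momentum(p0, force, tau, dt, n_steps):
    """Iterate <p(t+dt)> = (1 - dt/tau) * (<p(t)> + F(t) * dt) in one dimension.

    `force` is a function of time, so time-varying forces are allowed.
    """
    p = p0
    history = [p]
    for step in range(n_steps):
        t = step * dt
        p = (1.0 - dt / tau) * (p + force(t) * dt)
        history.append(p)
    return history

tau = 0.5                      # mean collision time, arbitrary units
constant_force = lambda t: 2.0  # hypothetical constant force F = 2

history = evolve_mean_momentum(p0=0.0, force=constant_force,
                               tau=tau, dt=1e-3, n_steps=10_000)
print(history[-1], 2.0 * tau)
```

After many collision times the mean momentum settles near F\tau, the steady state obtained by setting the time derivative in the equation of motion to zero.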

【Errata】

2. For a constant external force \vec{F} that does not vary with time, \langle \vec{p}(t) \rangle = \langle \vec{p}(0) \rangle \cdot e^{- \frac {t}{\tau}} + \vec{F} \cdot \tau

should read \langle \vec{p}(t) \rangle = \langle \vec{p}(0) \rangle \cdot e^{- \frac {t}{\tau}} + \vec{F} \cdot \tau (1 -e^{- \frac {t}{\tau}})
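The difference between the two expressions shows up at t = 0. The Python sketch below, written for one dimension with arbitrary illustrative values, evaluates both and also checks the equation of motion d\langle p \rangle/dt = -\langle p \rangle/\tau + F by a central finite difference; the function names are hypothetical.

```python
import math

def corrected(t, p0, force, tau):
    """Corrected solution: p0 * exp(-t/tau) + F * tau * (1 - exp(-t/tau))."""
    return p0 * math.exp(-t / tau) + force * tau * (1.0 - math.exp(-t / tau))

def uncorrected(t, p0, force, tau):
    """Expression before the erratum: p0 * exp(-t/tau) + F * tau."""
    return p0 * math.exp(-t / tau) + force * tau

def ode_residual(solution, t, p0, force, tau, h=1e-5):
    """dp/dt + p/tau - F, with dp/dt from a central finite difference."""
    dpdt = (solution(t + h, p0, force, tau)
            - solution(t - h, p0, force, tau)) / (2.0 * h)
    return dpdt + solution(t, p0, force, tau) / tau - force

p0, force, tau = 3.0, 2.0, 0.5  # arbitrary illustrative values

print(corrected(0.0, p0, force, tau), uncorrected(0.0, p0, force, tau))
print(ode_residual(corrected, 1.0, p0, force, tau))
```

With these values the corrected expression returns the initial momentum at t = 0, while the earlier one is off by F\tau. Both approach F\tau at long times, so the initial condition is what the erratum repairs.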

 

With that foundation in place, we set out on this journey through time series.