W!o+'s 《小伶鼬工坊演義》: Neural Networks【backpropagation】Part One

Before taking up the 'backpropagation algorithm', Mr. Michael Nielsen first sets out the 'notation' he will use:

……

Let’s begin with a notation which lets us refer to weights in the network in an unambiguous way. We’ll use w^l_{jk} to denote the weight for the connection from the k^{\rm th} neuron in the (l-1)^{\rm th} layer to the j^{\rm th} neuron in the l^{\rm th} layer. So, for example, the diagram below shows the weight on a connection from the fourth neuron in the second layer to the second neuron in the third layer of a network:

This notation is cumbersome at first, and it does take some work to master. But with a little effort you’ll find the notation becomes easy and natural. One quirk of the notation is the ordering of the j and k indices. You might think that it makes more sense to use j to refer to the input neuron, and k to the output neuron, not vice versa, as is actually done. I’ll explain the reason for this quirk below.

We use a similar notation for the network’s biases and activations. Explicitly, we use b^l_j for the bias of the j^{\rm th} neuron in the l^{\rm th} layer. And we use a^l_j for the activation of the j^{\rm th} neuron in the l^{\rm th} layer. The following diagram shows examples of these notations in use:

With these notations, the activation a^l_j of the j^{\rm th} neuron in the l^{\rm th} layer is related to the activations in the (l-1)^{\rm th} layer by the equation (compare Equation (4) and surrounding discussion in the last chapter)

a^{l}_j = \sigma\left( \sum_k w^{l}_{jk} a^{l-1}_k + b^l_j \right), \ \ \ \ (23)

where the sum is over all neurons k in the (l-1)^{\rm th} layer. To rewrite this expression in a matrix form we define a weight matrix w^l for each layer, l. The entries of the weight matrix w^l are just the weights connecting to the l^{\rm th} layer of neurons, that is, the entry in the j^{\rm th} row and k^{\rm th} column is w^l_{jk}. Similarly, for each layer l we define a bias vector, b^l. You can probably guess how this works – the components of the bias vector are just the values b^l_j, one component for each neuron in the l^{\rm th} layer. And finally, we define an activation vector a^l whose components are the activations a^l_j.

───
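To make the index convention concrete, here is a minimal NumPy sketch of Equation (23) and of its matrix form. The layer sizes, the random values, and the helper name feedforward_layer are illustrative assumptions of this post, not code from Nielsen's book:

import numpy as np

def sigmoid(z):
    """The sigmoid activation, sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def feedforward_layer(w, b, a_prev):
    """Compute a^l = sigma(w^l a^{l-1} + b^l) for a single layer.

    w has shape (n_l, n_{l-1}): entry w[j, k] is w^l_{jk}, the weight from
    the k-th neuron in layer l-1 to the j-th neuron in layer l.
    b has shape (n_l, 1); a_prev has shape (n_{l-1}, 1).
    """
    return sigmoid(np.dot(w, a_prev) + b)

# Illustrative sizes: layer l-1 has 4 neurons, layer l has 3 neurons.
rng = np.random.default_rng(0)
w = rng.standard_normal((3, 4))    # w[j, k] = w^l_{jk}
b = rng.standard_normal((3, 1))    # b[j, 0] = b^l_j
a_prev = rng.standard_normal((4, 1))

# Matrix form of Equation (23): a^l = sigma(w^l a^{l-1} + b^l).
a = feedforward_layer(w, b, a_prev)

# Component form: a^l_j = sigma(sum_k w^l_{jk} a^{l-1}_k + b^l_j).
a_check = np.array([[sigmoid(sum(w[j, k] * a_prev[k, 0] for k in range(4)) + b[j, 0])]
                    for j in range(3)])
assert np.allclose(a, a_check)

Keeping the weight to the j-th neuron of layer l from the k-th neuron of layer l-1 in row j, column k is exactly what lets the matrix form be the plain product np.dot(w, a_prev), with no transpose needed.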

 

Is there really something profound here?? Imagine doing arithmetic with 'Roman numerals':

XIV \times LXX

Even a great scholar would probably get a headache. Written instead as

14 \times 70

even a small child can handle it with ease!!

Is this not precisely why the ancient Chinese 'tian yuan method' (天元術) was so hard to learn and so hard to pass on!!??

If we consult history, perhaps the earliest 'algebra in words' arose from the ancient Chinese 'tian yuan method' (天元術):

In the history of Chinese mathematics, the tian yuan concept was first established in the 《益古集》 written by Jiang Zhou (蔣周) of Pingyang in the Northern Song. It was followed by 《照膽》 by Li Wenyi (李文一) of Bolu, 《鈐經》 by Shi Xindao (石信道) of Luquan, 《如積釋鎖》 by Liu Ruxie (劉汝諧) of Pingshui, and 《洞淵九容》 by Li Sicong (李思聰) of Chuzhou; it was through these works that later generations came to know of the tian yuan.

Li Ye (李冶) obtained Liu Ruxie's 《如積釋鎖》 in Dongping (東平); the book used nineteen single characters to denote the powers of the unknown from x^9 down to x^{-9}:

仙、明、霄、漢、壘、層、高、上、天、人、地、下、低、減、落、逝、泉、暗、鬼; here the tian element (天元) was set in the upper position.

Later 彭澤彥 of Taiyuan (太原) took the opposite approach, placing the tian element at the bottom.[2]

Early works on the tian yuan method such as 《益古集》, 《照膽》, 《鈐經》, 《如積釋鎖》, and 《洞淵九容》 are now all lost. In 《測圓海鏡》 Li Ye used the tian-element-on-top form of the method; later, in 《益古演段》, he adopted the tian-element-at-the-bottom ordering. Zhu Shijie's (朱世傑) 《四元玉鑒》 and the final volume of 《算學啟蒙》 also use the tian-element-at-the-bottom ordering.

[Figure: Wylie on Tian Yuen]

In the tian yuan method, the character 「元」 is written beside the coefficient of the first-degree term (or the character 「太」 beside the constant term).

Historically there were two orderings:

《測圓海鏡》 style

The coefficients above 「元」 denote the positive powers; the coefficients below 「元」 denote the constant term and the negative powers.

Example: the equation of Problem 14 in Volume 2 of Li Ye's 《測圓海鏡》, -x^2 - 680x + 96000 = 0, is laid out as:

[counting-rod layout, rows top to bottom: −1 (the x^2 term), −680 (the x term, marked 元), 96000 (the constant term)]
《益古演段》 style

The coefficients below 「元」 denote the positive powers; the coefficients above 「元」 denote the constant term and the negative powers.

Example 1: the equation of Problem 36 in the middle volume of Li Ye's 《益古演段》, 3x^2 + 210x - 20325 = 0, is written in tian yuan notation as:

[counting-rod layout, rows top to bottom: −20325 (太, the constant term), 210 (marked 元, the x term), 3 (the x^2 term)]

Here 「太」 marks the constant term, a diagonal stroke drawn across a rod numeral indicates that the term is negative, and 「元」 corresponds to the unknown x.
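As a small cross-check of the two orderings, here is a Python sketch that lays out a polynomial's coefficient column in both conventions. The dictionary representation and the function name tianyuan_column are this post's own illustrative assumptions, and, unlike the historical practice of writing only one marker, it prints both 元 and 太 for clarity:

def tianyuan_column(coeffs, yuan_on_top=True):
    """Lay out polynomial coefficients as a tian yuan column.

    coeffs maps each power of x to its coefficient, e.g.
    {2: -1, 1: -680, 0: 96000} for -x^2 - 680x + 96000.
    yuan_on_top=True follows the 《測圓海鏡》 convention (positive powers
    above 元); False follows 《益古演段》 (positive powers below 元).
    """
    powers = sorted(coeffs, reverse=yuan_on_top)
    lines = []
    for p in powers:
        mark = " 元" if p == 1 else (" 太" if p == 0 else "")
        lines.append(f"{coeffs[p]:>8}{mark}")
    return "\n".join(lines)

# Problem 14, Volume 2 of 《測圓海鏡》: -x^2 - 680x + 96000 = 0
print(tianyuan_column({2: -1, 1: -680, 0: 96000}, yuan_on_top=True))
print()
# Problem 36, middle volume of 《益古演段》: 3x^2 + 210x - 20325 = 0
print(tianyuan_column({2: 3, 1: 210, 0: -20325}, yuan_on_top=False))

Run as-is, it reproduces the two layouts above: −1 / −680 元 / 96000 太 for the 《測圓海鏡》 style, and −20325 太 / 210 元 / 3 for the 《益古演段》 style.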

Readers interested in the early Eastern classical theory of higher-degree equations may wish to read 《測圓海鏡》 by the Jin-dynasty mathematician Li Ye and get a feel for a different line of thought. They may also come to sense how the development of 'semiotics' has contributed to 'mathematical logic', and to see that the ease or difficulty of understanding sometimes hides inside the 'notation'; after all, 'symbols' can have an 'aesthetics' of their own!!

─ Excerpted from 《勇闖新世界︰《 pyDatalog 》【專題】之約束編程‧一》

 

Hence, one must first become thoroughly fluent in this 'notation' and learn to use it well!!