【鼎革‧革鼎】︰ Raspbian Stretch 《六之 J.3‧MIR-13.0 》

If one says that a "perceptron network" is nothing but a "linear classifier"

Linear classifier

In the field of machine learning, the goal of statistical classification is to use an object’s characteristics to identify which class (or group) it belongs to. A linear classifier achieves this by making a classification decision based on the value of a linear combination of the characteristics. An object’s characteristics are also known as feature values and are typically presented to the machine in a vector called a feature vector. Such classifiers work well for practical problems such as document classification, and more generally for problems with many variables (features), reaching accuracy levels comparable to non-linear classifiers while taking less time to train and use.[1]

Definition

If the input feature vector to the classifier is a real vector $\vec{x}$, then the output score is

$$y = f(\vec{w}\cdot\vec{x}) = f\left(\sum_j w_j x_j\right),$$

where $\vec{w}$ is a real vector of weights and $f$ is a function that converts the dot product of the two vectors into the desired output. (In other words, $\vec{w}$ is a one-form or linear functional mapping $\vec{x}$ onto $\mathbb{R}$.) The weight vector $\vec{w}$ is learned from a set of labeled training samples. Often $f$ is a simple function that maps all values above a certain threshold to the first class and all other values to the second class. A more complex $f$ might give the probability that an item belongs to a certain class.

For a two-class classification problem, one can visualize the operation of a linear classifier as splitting a high-dimensional input space with a hyperplane: all points on one side of the hyperplane are classified as "yes", while the others are classified as "no".

A linear classifier is often used in situations where the speed of classification is an issue, since it is often the fastest classifier, especially when $\vec{x}$ is sparse. Also, linear classifiers often work very well when the number of dimensions in $\vec{x}$ is large, as in document classification, where each element in $\vec{x}$ is typically the number of occurrences of a word in a document (see document-term matrix). In such cases, the classifier should be well-regularized.
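As a concrete illustration, here is a minimal sketch of such a classifier, assuming NumPy; the weight vector is hand-picked for illustration rather than learned from training samples, and $f$ is a simple threshold:

```python
# A minimal sketch of a two-class linear classifier: score = w . x,
# then a threshold f decides the class. Weights are illustrative only.
import numpy as np

def linear_classify(w, x, threshold=0.0):
    """Return 1 if f(w . x) clears the threshold, else 0."""
    score = np.dot(w, x)  # the linear combination sum_j w_j x_j
    return 1 if score > threshold else 0

# Hypothetical two-feature example with hand-picked weights.
w = np.array([0.8, -0.5])
print(linear_classify(w, np.array([1.0, 0.2])))  # score  0.70 -> class 1
print(linear_classify(w, np.array([0.1, 1.0])))  # score -0.42 -> class 0
```

A more complex $f$, such as the logistic sigmoid $1/(1 + e^{-\vec{w}\cdot\vec{x}})$, would instead return a class probability.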

[Figure: SVM separating hyperplanes]

In this case, the solid and empty dots can be correctly classified by any number of linear classifiers. H1 (blue) classifies them correctly, as does H2 (red). H2 could be considered “better” in the sense that it is also furthest from both groups. H3 (green) fails to correctly classify the dots.

───

and no more, will people be greatly disappointed? So-called "recognition" is such an intelligent act! How could it possibly be mere "classification"?? Surely that "perceptron model" is too simplistic to reflect "reality"!!

……

If we ask someone to "distinguish" what is what in the figure below,

[Figure: samples of handwritten digits]

 

it is probably quite easy! But if we ask someone to describe "why that looks like that"?? That is likely very difficult!! Suppose someone wanted to "define" what an image of "4" looks like:

[Figure: an MNIST test image of the digit "4"]

 

Who knows whether that can even be done!!?? Say, for instance, that those images all belong to the class "4"; by some fuzzy "similarity", one can always say: splitting a "4" down the middle, there is a "hook" on the left and a "vertical stroke" on the right that crosses the "hook" near the "bottom" ……

Then how would the "attributes" of such a "definition" "decide" where the images below belong?

[Figure: eight handwritten samples of the digit "4"]

 

Could it be that because "four" and "nine" both belong to the Metal of the West, their "yin" and "yang" are hard to tell apart??!!

………

Perhaps what is truly astonishing is that an "artificial neural network" can actually be "trained" to "learn" to "classify" handwritten Arabic numerals so well!!!

─── 《Neural Networks【Perceptron】Five》

 

If we reflect on the practice of representing some object by a "feature vector" $\vec{x}$, it is in fact quite "abstract"! Why, for instance, can a handwritten digit of $28 \times 28$ two-dimensional pixels be represented by 784 one-dimensional components as its "input feature"? And when two such handwritten digits "differ", what exactly is being computed??
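As a minimal sketch of that flattening step (assuming NumPy, with a random array standing in for a real handwritten-digit image):

```python
# A sketch of turning a 28 x 28 pixel image into a 784-component
# feature vector; the image here is synthetic, not real MNIST data.
import numpy as np

image = np.random.randint(0, 256, size=(28, 28))  # stand-in for a handwritten digit
x = image.flatten()                               # the 784-component feature vector
print(image.shape, "->", x.shape)                 # (28, 28) -> (784,)
```

Take, for example,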

Cosine similarity

Cosine similarity measures the similarity between two vectors by the cosine of the angle between them. The cosine of a 0° angle is 1, and the cosine of any other angle is no greater than 1, with a minimum of -1. The cosine of the angle between two vectors thus determines whether they point in roughly the same direction. When two vectors point the same way, the cosine similarity is 1; when the angle between them is 90°, it is 0; when they point in exactly opposite directions, it is -1. The result is independent of the vectors' lengths and depends only on their directions. Cosine similarity is usually applied in positive spaces, where it yields values between 0 and 1.

Note that these bounds hold in vector spaces of any dimension, and that cosine similarity is most often applied in high-dimensional positive spaces. In information retrieval, for example, each term is assigned its own dimension, and a document is represented by a vector whose value along each dimension corresponds to how often that term occurs in the document. Cosine similarity can then measure how similar two documents are in their subject matter.

It is also commonly used to compare documents in text mining. In addition, in the field of data mining, it is used to measure the cohesion within clusters.[1]

Definition

The cosine of the angle between two vectors can be derived from the Euclidean dot-product formula:

$$\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\| \, \|\mathbf{b}\| \cos\theta$$

Given two attribute vectors $A$ and $B$, their cosine similarity $\cos(\theta)$ is obtained from the dot product and the vector magnitudes as follows:

$$\text{similarity} = \cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum\limits_{i=1}^{n} A_i \times B_i}{\sqrt{\sum\limits_{i=1}^{n} (A_i)^2} \times \sqrt{\sum\limits_{i=1}^{n} (B_i)^2}},$$

where $A_i$ and $B_i$ are the components of vectors $A$ and $B$, respectively.

The resulting similarity ranges from -1 to 1: -1 means the two vectors point in exactly opposite directions, 1 means they point in exactly the same direction, 0 usually indicates independence between them, and values in between indicate intermediate similarity or dissimilarity.

For text matching, the attribute vectors $A$ and $B$ are usually the term-frequency vectors of the documents. Cosine similarity can then be seen as a way of normalizing for document length during comparison.

In the case of information retrieval, since the frequency of a term (its TF-IDF weight) cannot be negative, the cosine similarity of two documents ranges from 0 to 1, and the angle between two term-frequency vectors cannot exceed 90°.
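As a quick check of the formula above, here is a sketch (assuming NumPy) applied to two hypothetical term-frequency vectors, so every component is non-negative and the result falls between 0 and 1:

```python
# A sketch of the cosine-similarity formula: cos(theta) = a.b / (|a||b|).
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Term counts for three words in two hypothetical documents.
doc_a = np.array([3, 0, 2])
doc_b = np.array([1, 1, 1])
print(cosine_similarity(doc_a, doc_b))  # ~0.80: fairly similar in "topic"
```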

 

And that is how it bears on whether two texts are "similar in topic"!!

This is why it is said: "Affairs gather by kind, and creatures divide into groups", brief yet comprehensive ☆★

In this exercise notebook, we will segment audio files, extract features from them, and analyze them (a rough sketch in code follows the list). Goals:

1. Detect onsets in an audio signal.
2. Segment the audio signal at each onset.
3. Compute features for each segment.
4. Gain intuition into the features by listening to each segment separately.
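A rough sketch of those four steps, assuming librosa is installed; 'audio.wav' is a placeholder filename, and zero-crossing rate stands in for whichever features the exercise actually computes:

```python
# Sketch: detect onsets, segment at each onset, compute a feature per
# segment. Assumes librosa; 'audio.wav' is a placeholder file name.
import librosa

y, sr = librosa.load('audio.wav')

# 1. Detect onsets, returned here as sample indices.
onsets = librosa.onset.onset_detect(y=y, sr=sr, units='samples')

# 2. Segment the signal between consecutive onsets.
segments = [y[start:end] for start, end in zip(onsets[:-1], onsets[1:])]

# 3. Compute a simple feature for each segment (mean zero-crossing rate).
for i, seg in enumerate(segments):
    zcr = librosa.feature.zero_crossing_rate(seg).mean()
    print(f'segment {i}: {len(seg)} samples, mean ZCR = {zcr:.4f}')

# 4. In a notebook, listen to a single segment with
#    IPython.display.Audio(seg, rate=sr).
```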