【鼎革‧革鼎】︰ Raspbian Stretch 《六之 J.3‧MIR-13.3 》

If one only presses forward and forgets to look back, will not confusion arise?!

Let us first define the input signal \vec{x} clearly and explicitly:

x(t)=\begin{cases} \cos(440 \pi t); & t < 0.5 \\ \cos(660 \pi t); & 0.5 \leq t < 1 \\ \cos(524 \pi t); & t \ge 1 \end{cases}

Surely nothing changes just because an example renames it \vec{y}!?

Why, then, does the 'time axis' turn out to depend on hop_length??


If one further considers the connection with n_fft, the intent behind the defaults should become clear!!
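The puzzle resolves once the frame-to-time mapping is written out: frame index t corresponds to sample t * hop_length, so the time axis advances by hop_length / sr seconds per frame, and the documented default hop_length is win_length / 4 = n_fft / 4 = 512. A minimal numpy sketch of this relationship (sr and n_fft set to librosa's documented defaults; this is the conversion formula, not librosa's code):

```python
import numpy as np

sr = 22050                 # librosa's default sampling rate
n_fft = 2048               # librosa's default FFT window size
hop_length = n_fft // 4    # documented default: win_length / 4 = 512

# Frame index t corresponds to sample t * hop_length,
# hence to time t * hop_length / sr seconds.
frames = np.arange(5)
times = frames * hop_length / sr
print(times)  # spacing between frames is hop_length / sr, about 23.2 ms
```

Change hop_length and the same frame indices map to different times, which is exactly why the 'time axis' depends on it.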

 

No wonder librosa offers so many

Time and frequency conversion

frames_to_samples(frames[, hop_length, n_fft]) Converts frame indices to audio sample indices
frames_to_time(frames[, sr, hop_length, n_fft]) Converts frame counts to time (seconds)
samples_to_frames(samples[, hop_length, n_fft]) Converts sample indices into STFT frames.
samples_to_time(samples[, sr]) Convert sample indices to time (in seconds).
time_to_frames(times[, sr, hop_length, n_fft]) Converts time stamps into STFT frames.
time_to_samples(times[, sr]) Convert timestamps (in seconds) to sample indices.
hz_to_note(frequencies, **kwargs) Convert one or more frequencies (in Hz) to the nearest note names.
hz_to_midi(frequencies) Get the closest MIDI note number(s) for given frequencies
midi_to_hz(notes) Get the frequency (Hz) of MIDI note(s)
midi_to_note(midi[, octave, cents]) Convert one or more MIDI numbers to note strings.
note_to_hz(note, **kwargs) Convert one or more note names to frequency (Hz)
note_to_midi(note[, round_midi]) Convert one or more spelled notes to MIDI number(s).
hz_to_mel(frequencies[, htk]) Convert Hz to Mels
hz_to_octs(frequencies[, A440]) Convert frequencies (Hz) to (fractional) octave numbers.
mel_to_hz(mels[, htk]) Convert mel bin numbers to frequencies
octs_to_hz(octs[, A440]) Convert octave numbers to frequencies.
fft_frequencies([sr, n_fft]) Alternative implementation of np.fft.fftfreq
cqt_frequencies(n_bins, fmin[, …]) Compute the center frequencies of Constant-Q bins.
mel_frequencies([n_mels, fmin, fmax, htk]) Compute the center frequencies of mel bands.
tempo_frequencies(n_bins[, hop_length, sr]) Compute the frequencies (in beats-per-minute) corresponding to an onset auto-correlation or tempogram matrix.
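Most of these conversions are one-line formulas. As an illustration, the standard MIDI pitch relation behind hz_to_midi and midi_to_hz (A4 = 440 Hz = MIDI note 69, 12 semitones per octave) can be sketched in plain numpy; this is the textbook formula, not librosa's code:

```python
import numpy as np

def hz_to_midi(f):
    # MIDI note number: A4 = 440 Hz maps to 69, one octave = 12 semitones
    return 69 + 12 * np.log2(np.asarray(f) / 440.0)

def midi_to_hz(m):
    # inverse of the formula above
    return 440.0 * 2.0 ** ((np.asarray(m) - 69) / 12.0)

print(hz_to_midi(440.0))   # 69.0
print(midi_to_hz(60))      # about 261.63 Hz (middle C)
```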

 

Conversion functions☆

Let us trace things back to the source◎

References


【鼎革‧革鼎】︰ Raspbian Stretch 《六之 J.3‧MIR-13.2 》

Music = □ + □

Bai Juyi, 'Song of the Pipa' (琵琶行)

……

Only after a thousand calls did she appear, still half hiding her face behind the pipa.
She turned the pegs and plucked a few notes; before any melody formed, feeling was already there.
String after string muted, note after note pensive, as if lamenting a lifetime of thwarted hopes.
With lowered brow she played on freely, telling all the endless matters of her heart.
Lightly pressing, slowly twisting, strumming and plucking: first 《霓裳》 (Rainbow Skirts), then 《六么》 (Liuyao).
The thick strings clattered like sudden rain; the thin strings murmured like whispered secrets.
Clatter and murmur interwoven, like pearls large and small falling on a jade plate.
Now liquid as an oriole gliding beneath the flowers, now choked as a spring trickling down the shoals.
The spring turned cold and sluggish, the strings froze; frozen and blocked, the sound paused a while.
A hidden sorrow and secret grief arose; in that moment silence said more than sound.
Then a silver vase suddenly burst, water gushing; armored cavalry charged out, swords and spears ringing.
At the tune's end she drew the plectrum across the strings; the four strings gave one cry like tearing silk.
Boats east and west fell silent; only the autumn moon shone white on the river's heart.
……

The great Tang poet Bai Juyi (Bai Letian), deeply accomplished in letters and thoroughly versed in music, portrays in 'Song of the Pipa' a woman playing the pipa, in lines so vivid that the sound seems to leap straight off the page.

……… from 《走進音樂世界》 (Stepping into the World of Music)!!

 

If 'clatter and murmur interwoven, like pearls large and small falling on a jade plate' was heard so distinctly by Bai Juyi, he must long ago have mastered the art of telling which onsets are there and which are not!

Onset (audio)

Onset refers to the beginning of a musical note or other sound. It is related to (but different from) the concept of a transient: all musical notes have an onset, but do not necessarily include an initial transient.

In phonetics the term is used differently – see syllable onset.

Onset detection

In signal processing, onset detection is an active research area. For example, the MIREX annual competition features an Audio Onset Detection contest.

Approaches to onset detection can operate in the time domain, frequency domain, phase domain, or complex domain, and include looking for increases in spectral energy, changes in spectral energy distribution (spectral flux) or phase, changes in detected pitch, or spectral patterns recognisable by machine learning techniques such as neural networks.

Simpler techniques such as detecting increases in time-domain amplitude can typically lead to an unsatisfactorily high amount of false positives or false negatives.

The aim is often to judge onsets similarly to how a human would: so psychoacoustically-motivated strategies may be employed. Sometimes the onset detector can be restricted to a particular domain (depending on intended application), for example being targeted at detecting percussive onsets. With a narrower focus, it can be more straightforward to obtain reliable detection.

※ Note

Spectral flux

Spectral flux is a measure of how quickly the power spectrum of a signal is changing, calculated by comparing the power spectrum for one frame against the power spectrum from the previous frame.[1] More precisely, it is usually calculated as the 2-norm (also known as the Euclidean distance) between the two normalised spectra.

Calculated this way, the spectral flux is not dependent upon overall power (since the spectra are normalised), nor on phase considerations (since only the magnitudes are compared).

The spectral flux can be used to determine the timbre of an audio signal, or in onset detection,[2] among other things.
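The definition above fits in a few lines of numpy: normalise each frame's spectrum, then take the 2-norm of successive differences. This is a minimal sketch of the stated definition (the toy spectrogram S is made up for illustration):

```python
import numpy as np

def spectral_flux(S):
    """S: magnitude spectrogram, shape (n_bins, n_frames).
    Returns the 2-norm of the frame-to-frame change of the
    normalised spectra, per the definition above."""
    Sn = S / (np.linalg.norm(S, axis=0, keepdims=True) + 1e-12)  # normalise each frame
    diff = Sn[:, 1:] - Sn[:, :-1]
    return np.linalg.norm(diff, axis=0)

# a spectrum whose shape jumps at frame 3 has high flux there, low elsewhere
S = np.ones((4, 6))
S[:, 3:] = np.array([[4.0], [0.0], [0.0], [0.0]])  # sudden spectral change
flux = spectral_flux(S)
print(flux)
```

Because the spectra are normalised first, scaling the whole signal up or down leaves the flux unchanged, matching the remark that it does not depend on overall power.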

 

It is said that onset-detection methods are still far from perfect, so should one not rely blindly on ready-made libraries?

librosa.onset.onset_detect

librosa.onset.onset_detect(y=None, sr=22050, onset_envelope=None, hop_length=512, backtrack=False, energy=None, units='frames', **kwargs)
 

Basic onset detector. Locate note onset events by picking peaks in an onset strength envelope.

The peak_pick parameters were chosen by large-scale hyper-parameter optimization over the dataset provided by [R42].

Parameters:

y : np.ndarray [shape=(n,)]

audio time series

sr : number > 0 [scalar]

sampling rate of y

onset_envelope : np.ndarray [shape=(m,)]

(optional) pre-computed onset strength envelope

hop_length : int > 0 [scalar]

hop length (in samples)

units : {'frames', 'samples', 'time'}

The units to encode detected onset events in. By default, 'frames' are used.

backtrack : bool

If True, detected onset events are backtracked to the nearest preceding minimum of energy.

This is primarily useful when using onsets as slice points for segmentation.

energy : np.ndarray [shape=(m,)] (optional)

An energy function to use for backtracking detected onset events. If none is provided, then onset_envelope is used.

kwargs : additional keyword arguments

Additional parameters for peak picking.

See librosa.util.peak_pick for details.

Returns:

onsets : np.ndarray [shape=(n_onsets,)]

estimated positions of detected onsets, in whichever units are specified. By default, frame indices.

Note

If no onset strength could be detected, onset_detect returns an empty list.

Raises:

ParameterError

if neither y nor onsets are provided

or if units is not one of 'frames', 'samples', or 'time'

 

※ Examples
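The peak-picking idea behind onset_detect can be sketched in plain numpy, without audio files at hand. This toy stand-in picks local maxima of an onset strength envelope above a single delta threshold (an illustrative simplification; librosa.util.peak_pick uses several tuned parameters, and the envelope here is made up):

```python
import numpy as np

def detect_onsets(onset_envelope, delta=0.5):
    """Toy stand-in for librosa.onset.onset_detect: pick local maxima
    of the onset strength envelope that exceed delta."""
    env = np.asarray(onset_envelope, dtype=float)
    peaks = []
    for t in range(1, len(env) - 1):
        if env[t] > env[t - 1] and env[t] >= env[t + 1] and env[t] > delta:
            peaks.append(t)
    return np.array(peaks)  # frame indices, like units='frames'

env = np.array([0.0, 0.1, 0.9, 0.2, 0.0, 0.0, 1.2, 0.3, 0.0])
frames = detect_onsets(env)
hop_length, sr = 512, 22050
print(frames, frames * hop_length / sr)  # frame indices and their times
```

Converting the returned frame indices to seconds with frames * hop_length / sr mirrors what units='time' would give.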


Better still, trace things to their source and understand the core of the algorithm◎

librosa.onset.onset_strength

librosa.onset.onset_strength(y=None, sr=22050, S=None, lag=1, max_size=1, detrend=False, center=True, feature=None, aggregate=None, centering=None, **kwargs)
 

Compute a spectral flux onset strength envelope.

Onset strength at time t is determined by:

mean_f max(0, S[f, t] - ref_S[f, t - lag])

where ref_S is S after local max filtering along the frequency axis [R43].

By default, if a time series y is provided, S will be the log-power Mel spectrogram.
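The formula can be sketched directly in numpy for the simplest case max_size=1, where ref_S = S: the envelope is the mean over frequency of the half-wave-rectified difference between a frame and the frame lag steps earlier. This is a toy illustration of the stated formula, not librosa's implementation, and the tiny spectrogram S is made up:

```python
import numpy as np

def onset_strength(S, lag=1):
    """Spectral-flux onset strength per the formula above, with
    max_size=1 so that ref_S = S."""
    diff = S[:, lag:] - S[:, :-lag]
    env = np.mean(np.maximum(0.0, diff), axis=0)
    # left-pad with zeros so the envelope aligns with the frames of S
    return np.concatenate([np.zeros(lag), env])

# two frequency bins, four frames; energy appears suddenly at frame 2
S = np.array([[0.0, 0.0, 3.0, 3.0],
              [0.0, 0.0, 1.0, 1.0]])
print(onset_strength(S))  # strength concentrates at the frame where energy rises
```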


【鼎革‧革鼎】︰ Raspbian Stretch 《六之 J.3‧MIR-13.1 》

Although the STFT has been briefly introduced before:

The human ear, listening to a melody, may not be able to name each pitch and duration, yet it knows the notes come in a certain order. Mozart's absolute pitch was legendarily so good that he could transcribe a tune on first hearing!! To gain the benefits of both domains at once (time and frequency), people devised the

Short-time Fourier transform

The short-time Fourier transform (STFT) is a variant of the Fourier transform and one of the important tools of time-frequency analysis.

Conceptual difference from the Fourier transform

The result of taking the Fourier transform of a signal gives no information about how the signal's frequency content changes over time. The following example illustrates this:

x(t)=\begin{cases} \cos(440 \pi t); & t < 0.5 \\ \cos(660 \pi t); & 0.5 \leq t < 1 \\ \cos(524 \pi t); & t \ge 1 \end{cases}

The spectrum after the Fourier transform and the result of the short-time Fourier transform are shown below:

After the Fourier transform: the horizontal axis is frequency (Hz)

After the short-time Fourier transform: the horizontal axis is time (s), the vertical axis frequency (Hz)

The figures show that the Fourier transform only tells us which frequency components are present, with no time information, whereas the short-time Fourier transform clearly provides both. This kind of time-frequency analysis suits signals whose frequency content varies over time, such as music and speech signals.
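The claim can be checked numerically. The sketch below (plain numpy, no plotting; the 8000 Hz sampling rate and 0.25 s Hann-windowed segmentation are illustrative choices) builds the piecewise signal x(t) above and shows the dominant frequency stepping through 220 Hz, 330 Hz, and 262 Hz over time, which the whole-signal Fourier transform alone cannot reveal:

```python
import numpy as np

sr = 8000                        # illustrative sampling rate
t = np.arange(0, 1.5, 1 / sr)    # 1.5 s of the piecewise signal x(t)
x = np.where(t < 0.5, np.cos(440 * np.pi * t),
    np.where(t < 1.0, np.cos(660 * np.pi * t),
                      np.cos(524 * np.pi * t)))

# Fourier transform of the whole signal: three peaks, but no time axis
whole = np.abs(np.fft.rfft(x))

# a crude STFT: dominant frequency of successive 0.25 s Hann-windowed frames
n = sr // 4
doms = []
for k in range(len(x) // n):
    frame = x[k * n:(k + 1) * n] * np.hanning(n)
    bins = np.abs(np.fft.rfft(frame))
    doms.append(np.fft.rfftfreq(n, 1 / sr)[bins.argmax()])
print(doms)  # steps from ~220 Hz to ~330 Hz to ~262 Hz
```

Note that cos(440πt) oscillates at 440π/2π = 220 Hz, and similarly for the other two segments.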

Definition

Mathematical definition

Simply put, in the continuous-time case, a function is first multiplied by a window function that is nonzero only over a short interval, and then the one-dimensional Fourier transform is taken. Sliding the window along the time axis yields a family of Fourier transforms that, laid side by side, form a two-dimensional representation. Mathematically, this operation is written:

 X(t, f) = \int_{-\infty}^{\infty} w(t-\tau)x(\tau) e^{-j 2 \pi f \tau} \, d\tau

It can also be expressed in terms of angular frequency:

 X(t, \omega) = \int_{-\infty}^{\infty} w(t-\tau)x(\tau) e^{-j \omega \tau} \, d\tau

Here w(t) is the window function, of which there are many kinds (discussed in detail later), and x(t) is the signal to be transformed. X(t, \omega) is the Fourier transform of w(t-\tau)x(\tau). As t varies, the window slides along the time axis; after multiplication by w(t-\tau), only the portion of the signal selected by the window remains for the final Fourier transform.

The inverse short-time Fourier transform is mathematically similar to the inverse Fourier transform, but the effect of the window function must be removed:

 x(t)=w(t_1-t)^{-1} \int_{-\infty}^{\infty} X(t_1, f) e^{j 2 \pi f t}\, df ; w(t_1-t)\ne 0

Window functions

A window function usually satisfies the following properties:

  1. w(t) = w(-t) \,, i.e. it is an even function.
  2. max(w(t))=w(0) \,, i.e. the maximum of the window usually sits at its center.
  3. w(t_1)\ge w(t_2), |t_2| \ge |t_1|, i.e. the window decreases monotonically from the center toward both sides.
  4. w(t)\cong 0 , |t|\to \infty, i.e. the window decays to zero on both sides.

Common windows include the rectangular, triangular, and Gaussian windows, and the short-time Fourier transform takes different names according to the window used. The Gabor transform is the short-time Fourier transform with a Gaussian window; when no window is specified, the short-time Fourier transform usually means the Gabor transform.

……

Spectrogram

The spectrogram is the squared magnitude of the short-time Fourier transform; the two are essentially the same thing, and the term spectrogram appears frequently in the literature.

SP_x(t,f) = |X(t,f)|^2 = | \int_{-\infty}^{\infty} w(t-\tau)x(\tau) e^{-j 2 \pi f \tau} \, d\tau |^2

─── excerpted from 《W!o+ 的《小伶鼬工坊演義》︰神經網絡【FFT】六》

 

but that alone may not help much with actually using the librosa stft API:

librosa.core.stft

librosa.core.stft(y, n_fft=2048, hop_length=None, win_length=None, window='hann', center=True, dtype=<class 'numpy.complex64'>, pad_mode='reflect')
Short-time Fourier transform (STFT)

Returns a complex-valued matrix D such that

np.abs(D[f, t]) is the magnitude of frequency bin f at frame t

np.angle(D[f, t]) is the phase of frequency bin f at frame t

Parameters:

y : np.ndarray [shape=(n,)], real-valued

the input signal (audio time series)

n_fft : int > 0 [scalar]

FFT window size

hop_length : int > 0 [scalar]

number of audio samples between STFT columns. If unspecified, defaults to win_length / 4.

win_length : int <= n_fft [scalar]

Each frame of audio is windowed by window(). The window will be of length win_length and then padded with zeros to match n_fft.

If unspecified, defaults to win_length = n_fft.

window : string, tuple, number, function, or np.ndarray [shape=(n_fft,)]

center : boolean

  • If True, the signal y is padded so that frame D[:, t] is centered at y[t * hop_length].
  • If False, then D[:, t] begins at y[t * hop_length]

dtype : numeric type

Complex numeric type for D. Default is 64-bit complex.

pad_mode : string

If center=True, the padding mode to use at the edges of the signal. By default, STFT uses reflection padding.

Returns:

D : np.ndarray [shape=(1 + n_fft/2, t), dtype=dtype]

STFT matrix

※ Examples
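As a companion to the parameter list, here is a minimal numpy sketch of what stft computes under the documented conventions (Hann window, win_length = n_fft, center=True reflection padding, D of shape (1 + n_fft//2, t)). It is an illustration of those conventions, not librosa's actual implementation:

```python
import numpy as np

def stft(y, n_fft=2048, hop_length=None, center=True):
    """Minimal numpy sketch of librosa.core.stft."""
    if hop_length is None:
        hop_length = n_fft // 4            # the documented default
    if center:                             # pad so frame t is centred at y[t * hop_length]
        y = np.pad(y, n_fft // 2, mode="reflect")
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop_length
    D = np.empty((1 + n_fft // 2, n_frames), dtype=np.complex64)
    for t in range(n_frames):
        frame = y[t * hop_length:t * hop_length + n_fft] * window
        D[:, t] = np.fft.rfft(frame)       # one column per frame
    return D

y = np.cos(2 * np.pi * 440 * np.arange(22050) / 22050)  # 1 s of 440 Hz at sr=22050
D = stft(y)
print(D.shape)                                   # (1 + n_fft//2, n_frames)
print(np.abs(D[:, 10]).argmax() * 22050 / 2048)  # peak bin lies near 440 Hz
```

np.abs(D[f, t]) and np.angle(D[f, t]) then give magnitude and phase of bin f at frame t, as the documentation states.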


Hence I cite here the great work of JULIUS O. SMITH III,

SPECTRAL AUDIO SIGNAL PROCESSING

JULIUS O. SMITH III
Center for Computer Research in Music and Acoustics (CCRMA)

 

and in particular its chapter Time-Frequency Displays,

 

in the hope of deepening our understanding◎


【鼎革‧革鼎】︰ Raspbian Stretch 《六之 J.3‧MIR-13.0 》

If it is said that a 'perceptron network' is nothing but a 'linear classifier'

Linear classifier

In the field of machine learning, the goal of statistical classification is to use an object’s characteristics to identify which class (or group) it belongs to. A linear classifier achieves this by making a classification decision based on the value of a linear combination of the characteristics. An object’s characteristics are also known as feature values and are typically presented to the machine in a vector called a feature vector. Such classifiers work well for practical problems such as document classification, and more generally for problems with many variables (features), reaching accuracy levels comparable to non-linear classifiers while taking less time to train and use.[1]

Definition

If the input feature vector to the classifier is a real vector \vec x, then the output score is

y = f(\vec{w}\cdot\vec{x}) = f\left(\sum_j w_j x_j\right),

where \vec w is a real vector of weights and f is a function that converts the dot product of the two vectors into the desired output. (In other words, \vec{w} is a one-form or linear functional mapping \vec x onto R.) The weight vector \vec w is learned from a set of labeled training samples. Often f is a simple function that maps all values above a certain threshold to the first class and all other values to the second class. A more complex f might give the probability that an item belongs to a certain class.

For a two-class classification problem, one can visualize the operation of a linear classifier as splitting a high-dimensional input space with a hyperplane: all points on one side of the hyperplane are classified as "yes", while the others are classified as "no".

A linear classifier is often used in situations where the speed of classification is an issue, since it is often the fastest classifier, especially when \vec x is sparse. Also, linear classifiers often work very well when the number of dimensions in \vec x is large, as in document classification, where each element in \vec x is typically the number of occurrences of a word in a document (see document-term matrix). In such cases, the classifier should be well-regularized.
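The definition above fits in a few lines. A toy numpy sketch of the thresholded score f(w·x) (the weights, threshold, and feature vectors are all made up for illustration):

```python
import numpy as np

def linear_classify(x, w, threshold=0.0):
    """Class 1 ("yes") if the linear combination of features w.x
    exceeds the threshold, else class 0 ("no")."""
    return 1 if np.dot(w, x) > threshold else 0

# toy 2-D feature vectors separated by the hyperplane x0 + x1 = 1
w = np.array([1.0, 1.0])
print(linear_classify(np.array([0.9, 0.8]), w, threshold=1.0))  # 1 ("yes")
print(linear_classify(np.array([0.1, 0.2]), w, threshold=1.0))  # 0 ("no")
```

Here f is the simple step-at-a-threshold function the text mentions; a more complex f could instead return a class probability.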


In this case, the solid and empty dots can be correctly classified by any number of linear classifiers. H1 (blue) classifies them correctly, as does H2 (red). H2 could be considered “better” in the sense that it is also furthest from both groups. H3 (green) fails to correctly classify the dots.

───

Well then, will people be sorely disappointed? 'Recognition' is such an intelligent act! How could it be mere 'classification'?? Surely that 'perceptron model' is too crude to reflect 'reality'!!

……

Suppose we ask a person to 'tell apart' what is what in the image below:

(image: handwritten digits)

 

Probably quite easy! But suppose we ask that person to describe 'why this looks like that'?? Probably very hard!! And suppose someone wanted to 'define' what an image of '4' looks like:

(image: MNIST test sample of the digit '4')

 

Who knows whether that can even be done!!?? Say all those images belong to the class '4'; by fuzzy 'similarity' one can always say: a '4', viewed down the middle, has a 'hook' on the left and a 'vertical stroke' on the right crossing the 'hook' at the 'bottom'……

Then how would the 'attributes' of such a 'definition' 'decide' the class of the images below?

(images: eight handwritten samples of the digit '4')

 

Could it be that, since 'four and nine belong to metal in the west', 'yin and yang' are hard to tell apart??!!

………

Perhaps the truly astonishing thing is that an 'artificial neural network' can be 'trained' to 'learn' to 'classify' handwritten digits so well!!!

─── 《神經網絡【Perceptron】五

 

Reflecting on the practice of representing some object as a 'feature vector' \vec{x}, it is really quite 'abstract'! Why, for instance, can a handwritten digit of 28 \times 28 two-dimensional pixels be represented as an 'input feature' of 784 one-dimensional components? And what exactly is being computed when we measure the 'difference' between two such digits?? Consider, say,

Cosine similarity

Cosine similarity measures the similarity of two vectors by the cosine of the angle between them. The cosine of 0° is 1, and it is no greater than 1 for any other angle, with minimum value -1. The cosine of the angle thus indicates whether two vectors point in roughly the same direction: it is 1 when they point the same way, 0 when the angle between them is 90°, and -1 when they point in exactly opposite directions. The result is independent of the vectors' lengths and depends only on their directions. Cosine similarity is usually used in positive spaces, where it gives values between 0 and 1.

Note that these bounds hold in vector spaces of any dimension, and cosine similarity is most often applied in high-dimensional positive spaces. In information retrieval, for example, each term is assigned its own dimension, and a document is represented by a vector whose value in each dimension corresponds to the frequency of that term in the document. Cosine similarity can then measure how similar two documents are in topic.

It is also commonly used to compare documents in text mining, and in data mining to measure the cohesion of clusters.[1]

Definition

The cosine of the angle between two vectors follows from the Euclidean dot-product formula:

  {\mathbf {a}}\cdot {\mathbf {b}}=\left\|{\mathbf {a}}\right\|\left\|{\mathbf {b}}\right\|\cos \theta

Given two attribute vectors A and B, the cosine similarity cos(θ) is given by the dot product and the vector lengths, as follows:

{\text{similarity}}=\cos(\theta )={A\cdot B \over \|A\|\|B\|}={\frac {\sum \limits _{i=1}^{n}{A_{i}\times B_{i}}}{{\sqrt {\sum \limits _{i=1}^{n}{(A_{i})^{2}}}}\times {\sqrt {\sum \limits _{i=1}^{n}{(B_{i})^{2}}}}}}, where A_i and B_i are the components of vectors A and B respectively.

The resulting similarity ranges from -1 to 1: -1 means the two vectors point in exactly opposite directions, 1 that they point the same way, 0 usually indicates independence, and values in between indicate intermediate similarity or dissimilarity.

For text matching, the attribute vectors A and B are usually term-frequency vectors of the documents. Cosine similarity can then be seen as a way of normalizing for document length during comparison.

In information retrieval, since term frequencies (TF-IDF weights) cannot be negative, the cosine similarity of two documents ranges from 0 to 1, and the angle between two term-frequency vectors cannot exceed 90°.
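The definition translates directly into code (a numpy sketch; the example vectors are made up to exhibit the three landmark values):

```python
import numpy as np

def cosine_similarity(a, b):
    """cos(theta) = a.b / (||a|| ||b||): depends only on direction,
    not on the lengths of the vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0, 0.0])   # e.g. a term-frequency vector
print(cosine_similarity(a, 3 * a))   # same direction: 1, length ignored
print(cosine_similarity(a, -a))      # opposite directions: -1
print(cosine_similarity(np.array([1.0, 0.0]), np.array([0.0, 1.0])))  # orthogonal: 0
```

With nonnegative term-frequency vectors, as noted above, the result stays in [0, 1].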

 

what could that possibly have to do with two texts 'sharing a topic'!!

This is why the saying 'things of a kind gather together; creatures divide into groups' is so concise and apt☆★

 In this exercise notebook, we will segment, feature extract, and analyze audio files. Goals:
 
  1. Detect onsets in an audio signal.
  2. Segment the audio signal at each onset.
  3. Compute features for each segment.
  4. Gain intuition into the features by listening to each segment separately.
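The four steps can be sketched end to end in plain numpy on a synthetic signal (everything here, the toy two-burst signal, the energy threshold of 0.1, and the zero-crossing-rate feature, is an illustrative stand-in for the librosa-based notebook):

```python
import numpy as np

sr = 8000
t = np.arange(sr) / sr
# toy "audio": two short tone bursts starting at 0.1 s and 0.6 s
y = np.zeros(sr)
y[800:2000] = np.sin(2 * np.pi * 220 * t[800:2000])
y[4800:6400] = np.sin(2 * np.pi * 440 * t[4800:6400])

# 1. detect onsets: frames where the energy envelope jumps above a threshold
hop = 200
energy = np.array([np.sum(y[i:i + hop] ** 2) for i in range(0, sr - hop, hop)])
rises = np.flatnonzero((energy[1:] > 0.1) & (energy[:-1] <= 0.1)) + 1
onset_samples = rises * hop

# 2. segment at each onset, 3. compute a feature per segment
bounds = list(onset_samples) + [len(y)]
feats = []
for s, e in zip(bounds[:-1], bounds[1:]):
    seg = y[s:e]
    zcr = np.mean(np.abs(np.diff(np.sign(seg))) > 0)  # zero-crossing rate
    feats.append(zcr)
    print(f"segment {s / sr:.2f}s-{e / sr:.2f}s  zcr={zcr:.3f}")
# the higher-pitched segment shows a higher zero-crossing rate
```

Step 4 (listening to each segment) would simply play y[s:e] for each pair of boundaries; here the feature values stand in for that intuition.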


【鼎革‧革鼎】︰ Raspbian Stretch 《六之 J.3‧MIR-12後 》

Modern research on 'hearing' shows that although different people differ in 'pitch sense', everyone exhibits the auditory 'common phenomenon' in the figure below: take a reference pitch F1 and a varying pitch F2; with F2 starting far below F1 and gradually rising, the sensation passes from 'smooth' to 'rough' (the 'critical band'); when the frequency difference falls below 15 Hz it enters the 'beating region' (most people perceive 'beats' once the difference is below 12.5 Hz); finally, when F2 = F1, the two are heard as the same pitch. The same happens if F2 starts above F1 and descends.

(figure: perceived sensation vs. pitch difference)

─── 《Sonic π 之節拍體驗

 

In matters of auditory sensation, can physics alone be the judge? Whether sounds entering the ear are 'harmonious' must still be decided by 'hearing'. Such are the objective and subjective sides of 'musical acoustics'!

Musical acoustics

Musical acoustics or music acoustics is a branch of acoustics concerned with researching and describing the physics of music – how sounds are employed to make music. Examples of areas of study are the function of musical instruments, the human voice (the physics of speech and singing), computer analysis of melody, and in the clinical use of music in music therapy.

Physical aspects

Whenever two different pitches are played at the same time, their sound waves interact with each other – the highs and lows in the air pressure reinforce each other to produce a different sound wave. Any repeating sound wave that is not a sine wave can be modeled by many different sine waves of the appropriate frequencies and amplitudes (a frequency spectrum). In humans the hearing apparatus (composed of the ears and brain) can usually isolate these tones and hear them distinctly. When two or more tones are played at once, a variation of air pressure at the ear “contains” the pitches of each, and the ear and/or brain isolate and decode them into distinct tones.

When the original sound sources are perfectly periodic, the note consists of several related sine waves (which mathematically add to each other) called the fundamental and the harmonics, partials, or overtones. The sounds have harmonic frequency spectra. The lowest frequency present is the fundamental, and is the frequency at which the entire wave vibrates. The overtones vibrate faster than the fundamental, but must vibrate at integer multiples of the fundamental frequency for the total wave to be exactly the same each cycle. Real instruments are close to periodic, but the frequencies of the overtones are slightly imperfect, so the shape of the wave changes slightly over time.[citation needed]

Subjective aspects

Variations in air pressure against the ear drum, and the subsequent physical and neurological processing and interpretation, give rise to the subjective experience called sound. Most sound that people recognize as musical is dominated by periodic or regular vibrations rather than non-periodic ones; that is, musical sounds typically have a definite pitch. The transmission of these variations through air is via a sound wave. In a very simple case, the sound of a sine wave, which is considered the most basic model of a sound waveform, causes the air pressure to increase and decrease in a regular fashion, and is heard as a very pure tone. Pure tones can be produced by tuning forks or whistling. The rate at which the air pressure oscillates is the frequency of the tone, which is measured in oscillations per second, called hertz. Frequency is the primary determinant of the perceived pitch. Frequency of musical instruments can change with altitude due to changes in air pressure.

……

Harmonics, partials, and overtones

The fundamental is the frequency at which the entire wave vibrates. Overtones are other sinusoidal components present at frequencies above the fundamental. All of the frequency components that make up the total waveform, including the fundamental and the overtones, are called partials. Together they form the harmonic series.

Overtones that are perfect integer multiples of the fundamental are called harmonics. When an overtone is near to being harmonic, but not exact, it is sometimes called a harmonic partial, although they are often referred to simply as harmonics. Sometimes overtones are created that are not anywhere near a harmonic, and are just called partials or inharmonic overtones.

The fundamental frequency is considered the first harmonic and the first partial. The numbering of the partials and harmonics is then usually the same; the second partial is the second harmonic, etc. But if there are inharmonic partials, the numbering no longer coincides. Overtones are numbered as they appear above the fundamental. So strictly speaking, the first overtone is the second partial (and usually the second harmonic). As this can result in confusion, only harmonics are usually referred to by their numbers, and overtones and partials are described by their relationships to those harmonics.

Scale of harmonics

Harmonics and non-linearities

When a periodic wave is composed of a fundamental and only odd harmonics (f, 3f, 5f, 7f, …), the summed wave is half-wave symmetric; it can be inverted and phase shifted and be exactly the same. If the wave has any even harmonics (0f, 2f, 4f, 6f, …), it is asymmetrical; the top half is not a mirror image of the bottom.

Conversely, a system that changes the shape of the wave (beyond simple scaling or shifting) creates additional harmonics (harmonic distortion). This is called a non-linear system. If it affects the wave symmetrically, the harmonics produced are all odd. If it affects the harmonics asymmetrically, at least one even harmonic is produced (and probably also odd harmonics).

A symmetric and asymmetric waveform. The red (upper) wave contains only the fundamental and odd harmonics; the green (lower) wave contains the fundamental and even harmonics.

Harmony

If two notes are simultaneously played, with frequency ratios that are simple fractions (e.g. 2/1, 3/2 or 5/4), the composite wave is still periodic, with a short period—and the combination sounds consonant. For instance, a note vibrating at 200 Hz and a note vibrating at 300 Hz (a perfect fifth, or 3/2 ratio, above 200 Hz) add together to make a wave that repeats at 100 Hz: every 1/100 of a second, the 300 Hz wave repeats three times and the 200 Hz wave repeats twice. Note that the total wave repeats at 100 Hz, but there is no actual 100 Hz sinusoidal component.
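The last point is easy to verify numerically (a numpy sketch; the 8000 Hz sampling rate and 1 s duration are arbitrary choices): the 200 Hz + 300 Hz combination repeats every 1/100 s, yet its spectrum contains no energy at 100 Hz.

```python
import numpy as np

f1, f2, sr = 200.0, 300.0, 8000
t = np.arange(sr) / sr                       # 1 second of signal
y = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

# the combined wave repeats at gcd(200, 300) = 100 Hz ...
period = sr // 100                           # 1/100 s in samples
print(np.allclose(y[:sr - period], y[period:]))

# ... yet the spectrum has components only at 200 and 300 Hz, none at 100 Hz
spec = np.abs(np.fft.rfft(y)) / sr           # bin k corresponds to k Hz here
print(spec[100] < 1e-6, spec[200] > 0.1, spec[300] > 0.1)
```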

Additionally, the two notes have many of the same partials. For instance, a note with a fundamental frequency of 200 Hz has harmonics at: (200,) 400, 600, 800, 1000, 1200, …

A note with fundamental frequency of 300 Hz has harmonics at: (300,) 600, 900, 1200, 1500, … The two notes share harmonics at 600 and 1200 Hz, and more coincide further up the series.

The combination of composite waves with short fundamental frequencies and shared or closely related partials is what causes the sensation of harmony. When two frequencies are near to a simple fraction, but not exact, the composite wave cycles slowly enough to hear the cancellation of the waves as a steady pulsing instead of a tone. This is called beating, and is considered unpleasant, or dissonant.

The frequency of beating is calculated as the difference between the frequencies of the two notes. For the example above, |200 Hz - 300 Hz| = 100 Hz. As another example, a combination of 3425 Hz and 3426 Hz would beat once per second (|3425 Hz - 3426 Hz| = 1 Hz). This follows from modulation theory.

The difference between consonance and dissonance is not clearly defined, but the higher the beat frequency, the more likely the interval is dissonant. Helmholtz proposed that maximum dissonance would arise between two pure tones when the beat rate is roughly 35 Hz. [1]

 

Since the notes on tuning systems are already clear and complete, all that remains is to supplement them with beating and dissonance◎


※ For writing the notebook text, see

Markdown Cells

Text can be added to Jupyter Notebooks using Markdown cells. Markdown is a popular markup language that is a superset of HTML. Its specification can be found here:

https://daringfireball.net/projects/markdown/