【鼎革‧革鼎】︰ Raspbian Stretch 《六之 J.3‧MIR-4 》

As the saying goes, different trades are worlds apart! Why should that be? Judging by how people wield their tools, it is genuinely hard to say. Does every trade really possess its own ingenious devices? More likely we are simply lost in the fog of each field's conventions and jargon. That is why, in the past, I annotated the subject of neural networks and deep learning:

What more can one say about a small, complete, and well-written book? After some deliberation, I decided simply to share my old reading notes and stray thoughts. This is, after all, a topic at once old and new, still waiting for a spark to set off creativity and fresh ideas. Perhaps a single insight is all it would take to change the future of artificial intelligence?

Michael Nielsen states his purpose right at the opening of the first chapter:

The human visual system is one of the wonders of the world. Consider the following sequence of handwritten digits:

Most people effortlessly recognize those digits as 504192. That ease is deceptive. In each hemisphere of our brain, humans have a primary visual cortex, also known as V1, containing 140 million neurons, with tens of billions of connections between them. And yet human vision involves not just V1, but an entire series of visual cortices – V2, V3, V4, and V5 – doing progressively more complex image processing. We carry in our heads a supercomputer, tuned by evolution over hundreds of millions of years, and superbly adapted to understand the visual world. Recognizing handwritten digits isn’t easy. Rather, we humans are stupendously, astoundingly good at making sense of what our eyes show us. But nearly all that work is done unconsciously. And so we don’t usually appreciate how tough a problem our visual systems solve.

The difficulty of visual pattern recognition becomes apparent if you attempt to write a computer program to recognize digits like those above. What seems easy when we do it ourselves suddenly becomes extremely difficult. Simple intuitions about how we recognize shapes – “a 9 has a loop at the top, and a vertical stroke in the bottom right” – turn out to be not so simple to express algorithmically. When you try to make such rules precise, you quickly get lost in a morass of exceptions and caveats and special cases. It seems hopeless.

Neural networks approach the problem in a different way. The idea is to take a large number of handwritten digits, known as training examples,

[image: scanned handwritten digits used as training examples]

and then develop a system which can learn from those training examples. In other words, the neural network uses the examples to automatically infer rules for recognizing handwritten digits. Furthermore, by increasing the number of training examples, the network can learn more about handwriting, and so improve its accuracy. So while I’ve shown just 100 training digits above, perhaps we could build a better handwriting recognizer by using thousands or even millions or billions of training examples.

In this chapter we’ll write a computer program implementing a neural network that learns to recognize handwritten digits. The program is just 74 lines long, and uses no special neural network libraries. But this short program can recognize digits with an accuracy over 96 percent, without human intervention. Furthermore, in later chapters we’ll develop ideas which can improve accuracy to over 99 percent. In fact, the best commercial neural networks are now so good that they are used by banks to process cheques, and by post offices to recognize addresses.

We’re focusing on handwriting recognition because it’s an excellent prototype problem for learning about neural networks in general. As a prototype it hits a sweet spot: it’s challenging – it’s no small feat to recognize handwritten digits – but it’s not so difficult as to require an extremely complicated solution, or tremendous computational power. Furthermore, it’s a great way to develop more advanced techniques, such as deep learning. And so throughout the book we’ll return repeatedly to the problem of handwriting recognition. Later in the book, we’ll discuss how these ideas may be applied to other problems in computer vision, and also in speech, natural language processing, and other domains.

Of course, if the point of the chapter was only to write a computer program to recognize handwritten digits, then the chapter would be much shorter! But along the way we’ll develop many key ideas about neural networks, including two important types of artificial neuron (the perceptron and the sigmoid neuron), and the standard learning algorithm for neural networks, known as stochastic gradient descent. Throughout, I focus on explaining why things are done the way they are, and on building your neural networks intuition. That requires a lengthier discussion than if I just presented the basic mechanics of what’s going on, but it’s worth it for the deeper understanding you’ll attain. Amongst the payoffs, by the end of the chapter we’ll be in position to understand what deep learning is, and why it matters.
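The chapter's two artificial neurons, the perceptron and the sigmoid neuron, differ only in how the weighted sum is turned into an output. That difference can be sketched in a few lines of plain Python; the function names, weights, and inputs below are my own illustration, not Nielsen's 74-line program:

```python
import math

def perceptron(w, b, x):
    # Perceptron: weighted sum plus bias, thresholded to a hard 0/1 output.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def sigmoid_neuron(w, b, x):
    # Sigmoid neuron: the same weighted sum, squashed smoothly into (0, 1).
    # Small weight changes now cause small output changes -- the property
    # that stochastic gradient descent relies on.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

w, b = [0.6, -0.4], 0.1   # illustrative weights and bias
x = [1.0, 0.5]            # one input vector
print(perceptron(w, b, x))      # -> 1 (hard decision)
print(sigmoid_neuron(w, b, x))  # -> about 0.62 (graded confidence)
```

The smooth output is what makes gradient-based learning workable: the derivative of the sigmoid tells each weight which way, and roughly how far, to move.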

………

These passages state the book's aim: to use the single theme of handwritten digit recognition to thread together the essentials of neural networks and deep learning, so that the reader can glimpse the whole subject from a minimum of text and infer much from little. Accordingly, he keeps the mathematics to a minimum and describes the important principles and concepts in plain language wherever possible.

─── Excerpted from 《W!o+ 的《小伶鼬工坊演義》︰神經網絡與深度學習【發凡】》

 

All this leaves me reassured ◎

And although I once wrote

W!o+ 的《小伶鼬工坊演義》︰神經網絡【FFT】一

a number of short audio-visual pieces on the fast Fourier transform, to this day I still dread reading sheet music and its "bean sprouts" (note heads)!

I am therefore very glad to recommend, from CCRMA,

Julius Orion Smith III

Home Page

Online Books

  1. Mathematics of the Discrete Fourier Transform (DFT)
  2. Introduction to Digital Filters
  3. Physical Audio Signal Processing
  4. Spectral Audio Signal Processing

All Publications in Chronological Order

 

the online books that Mr. Smith has made publicly available ☆

MATHEMATICS OF THE DISCRETE FOURIER TRANSFORM (DFT) WITH AUDIO APPLICATIONS

SECOND EDITION

JULIUS O. SMITH III
Center for Computer Research in Music and Acoustics (CCRMA)


Preface

The Discrete Fourier Transform (DFT) can be understood as a numerical approximation to the Fourier transform. However, the DFT has its own exact Fourier theory, which is the main focus of this book. The DFT is normally encountered in practice as a Fast Fourier Transform (FFT), i.e., a high-speed algorithm for computing the DFT. FFTs are used extensively in a wide range of digital signal processing applications, including spectrum analysis, high-speed convolution (linear filtering), filter banks, signal detection and estimation, system identification, audio compression (e.g., MPEG-II AAC), spectral modeling sound synthesis, and many other applications; some of these will be discussed in Chapter 8.
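The "exact Fourier theory" mentioned here is easy to see in code: the DFT is a finite sum that can be evaluated directly. The sketch below is my own, not the book's; the direct O(N²) loop computes exactly the values an FFT would return, just more slowly:

```python
import cmath
import math

def dft(x):
    # Direct O(N^2) evaluation of X[k] = sum_n x[n] * exp(-2*pi*j*k*n/N).
    # An FFT computes exactly these numbers in O(N log N) time.
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

# A cosine making exactly one cycle over N = 8 samples puts all of its
# energy in bin k = 1 and the conjugate bin k = N - 1 = 7.
N = 8
x = [math.cos(2 * math.pi * n / N) for n in range(N)]
X = dft(x)
print([round(abs(Xk), 6) for Xk in X])  # -> [0.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.0]
```

The peak magnitude N/2 = 4 at bins 1 and 7 is the discrete counterpart of the Fourier transform of a cosine: two impulses at plus and minus the signal frequency.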

This book started out as a series of readers for my introductory course in digital audio signal processing that I have given at the Center for Computer Research in Music and Acoustics (CCRMA) since 1984. The course was created primarily for entering Music Ph.D. students in the Computer Based Music Theory program at CCRMA. As a result, the only prerequisite is a good high-school math background, including some calculus exposure.

……

SPECTRAL AUDIO SIGNAL PROCESSING

JULIUS O. SMITH III
Center for Computer Research in Music and Acoustics (CCRMA)

Preface

This book precipitated from my “spectral modeling” course which has been offered at the Center for Computer Research in Music and Acoustics (CCRMA) since 1984. The course originally evolved as a dissemination vehicle for spectral-oriented signal-processing research in computer music, aimed at beginning graduate students in computer music and engineering programs et al. Over the years it has become more of a tour of fundamentals in spectral audio signal processing, with occasional mention and citation of prior and ongoing related research. In principle, the only prerequisites are the first two books in the music signal processing series [264,263].

The focus of this book is on spectral modeling applied to audio signals. More completely, the principal tasks are spectral analysis, modeling, and resynthesis (and/or effects). We analyze sound in terms of spectral models primarily because this is what the human brain does. We may synthesize/modify sound in terms of spectral models for the same reason.

The primary tool for audio spectral modeling is the short-time Fourier transform (STFT). The applications we will consider lie in the fields of audio signal processing and musical sound synthesis and effects.
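At its core, the STFT the preface names as the primary tool is a DFT taken on overlapping, windowed frames of the signal. A bare-bones sketch in plain Python follows; the Hann window, frame length, hop size, and toy signal are my own illustrative choices, not values from the book:

```python
import cmath
import math

def stft(x, frame_len, hop):
    # Slide a periodic Hann window along the signal and take the DFT of
    # each windowed frame, yielding a time-frequency grid frames[t][k].
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = [x[start + n] * window[n] for n in range(frame_len)]
        frames.append([sum(frame[n] * cmath.exp(-2j * cmath.pi * k * n / frame_len)
                           for n in range(frame_len))
                       for k in range(frame_len)])
    return frames

# Toy signal: a sinusoid at 4 cycles per 16-sample frame, 64 samples long.
x = [math.sin(2 * math.pi * 4 * n / 16) for n in range(64)]
S = stft(x, frame_len=16, hop=8)
# Search bins 0..8 (up to Nyquist) for the spectral peak of the first frame.
peak_bin = max(range(9), key=lambda k: abs(S[0][k]))
print(len(S), peak_bin)  # -> 7 4  (7 frames; energy peaks in bin 4)
```

Each row of the result is one short-time spectrum; stepping the window by `hop` samples is what gives the representation its time axis.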

The reader should already be familiar with the Fourier transform and elementary digital signal processing. One source of this background material is [264]. Some familiarity with digital filtering and associated linear systems theory, e.g., on the level of [263], is also assumed.

There is a notable absence in this book of emphasis on audio coding of spectral representations. While audio coding is closely related, there are other books which cover this topic in detail (e.g., [273,16,159]). On the other hand, comparatively few works address applications of spectral modeling in areas other than audio compression. This book attempts to help fill that gap.