Regarding that little seventy-four-line Python program, Michael Nielsen writes:
I said above that our program gets pretty good results. What does that mean? Good compared to what? It’s informative to have some simple (non-neural-network) baseline tests to compare against, to understand what it means to perform well. The simplest baseline of all, of course, is to randomly guess the digit. That’ll be right about ten percent of the time. We’re doing much better than that!
What about a less trivial baseline? Let’s try an extremely simple idea: we’ll look at how dark an image is. For instance, an image of a 2 will typically be quite a bit darker than an image of a 1, just because more pixels are blackened out, as the following examples illustrate:
It’s not difficult to find other ideas which achieve accuracies in the 20 to 50 percent range. If you work a bit harder you can get up over 50 percent. But to get much higher accuracies it helps to use established machine learning algorithms. Let’s try using one of the best known algorithms, the support vector machine or SVM. If you’re not familiar with SVMs, not to worry, we’re not going to need to understand the details of how SVMs work. Instead, we’ll use a Python library called scikit-learn, which provides a simple Python interface to a fast C-based library for SVMs known as LIBSVM.
If we run scikit-learn’s SVM classifier using the default settings, then it gets 9,435 of 10,000 test images correct. (The code is available here.) That’s a big improvement over our naive approach of classifying an image based on how dark it is. Indeed, it means that the SVM is performing roughly as well as our neural networks, just a little worse. In later chapters we’ll introduce new techniques that enable us to improve our neural networks so that they perform much better than the SVM.
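The code behind that number is short. Here is a rough sketch of such an SVM baseline, assuming the mnist_loader module from the book's repository; the actual mnist_svm.py script in the repository may differ in detail:

# A rough sketch of the SVM baseline using scikit-learn's default SVC.
# Assumes mnist_loader from Nielsen's repository; the repository's own
# mnist_svm.py may differ in detail.
from sklearn import svm

import mnist_loader

def svm_baseline():
    training_data, validation_data, test_data = mnist_loader.load_data()
    # Train an SVM with default settings on the 50,000 training images.
    clf = svm.SVC()
    clf.fit(training_data[0], training_data[1])
    # Count how many of the 10,000 test images are classified correctly.
    predictions = [int(a) for a in clf.predict(test_data[0])]
    num_correct = sum(int(a == y) for a, y in zip(predictions, test_data[1]))
    print("Baseline classifier using an SVM.")
    print("%s of %s values correct." % (num_correct, len(test_data[1])))

if __name__ == "__main__":
    svm_baseline()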
That’s not the end of the story, however. The 9,435 of 10,000 result is for scikit-learn’s default settings for SVMs. SVMs have a number of tunable parameters, and it’s possible to search for parameters which improve this out-of-the-box performance. I won’t explicitly do this search, but instead refer you to this blog post by Andreas Mueller if you’d like to know more. Mueller shows that with some work optimizing the SVM’s parameters it’s possible to get the performance up above 98.5 percent accuracy. In other words, a well-tuned SVM only makes an error on about one digit in 70. That’s pretty good! Can neural networks do better?
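To give a concrete feel for what such a parameter search looks like, here is an illustrative sketch using scikit-learn's GridSearchCV. The grid of C and gamma values below is only a guess for illustration, not Mueller's settings, and the search runs on a training subset so it stays tractable; it also assumes a scikit-learn version that provides sklearn.model_selection.

# An illustrative hyper-parameter search for the SVM; not Mueller's settings.
# Assumes mnist_loader from Nielsen's repository and a scikit-learn version
# that provides sklearn.model_selection.
from sklearn import svm
from sklearn.model_selection import GridSearchCV

import mnist_loader

def tune_svm():
    training_data, validation_data, test_data = mnist_loader.load_data()
    # Search on a 10,000-image subset so the grid search stays tractable.
    X, y = training_data[0][:10000], training_data[1][:10000]
    param_grid = {"C": [1, 10, 100], "gamma": [0.001, 0.01, 0.1]}
    search = GridSearchCV(svm.SVC(kernel="rbf"), param_grid, n_jobs=-1)
    search.fit(X, y)
    print("Best parameters found: %s" % search.best_params_)
    # Retrain with the best settings on the full training set, then test.
    clf = svm.SVC(kernel="rbf", **search.best_params_)
    clf.fit(training_data[0], training_data[1])
    predictions = clf.predict(test_data[0])
    num_correct = sum(int(p == t) for p, t in zip(predictions, test_data[1]))
    print("%s of %s values correct." % (num_correct, len(test_data[1])))

if __name__ == "__main__":
    tune_svm()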
In fact, they can. At present, well-designed neural networks outperform every other technique for solving MNIST, including SVMs. The current (2013) record is classifying 9,979 of 10,000 images correctly. This was done by Li Wan, Matthew Zeiler, Sixin Zhang, Yann LeCun, and Rob Fergus. We’ll see most of the techniques they used later in the book. At that level the performance is close to human-equivalent, and is arguably better, since quite a few of the MNIST images are difficult even for humans to recognize with confidence, for example:
As for plain 'guessing', the 'ten percent' figure is really just what one would expect, so there is not much more to say about it!
As for using 'average darkness', even judging only by how people actually write,
some writing large, some small, some with thick strokes and some with thin, one can see the idea will not go far! Still, it is worth reading through
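Still, it costs only a few lines to check that figure empirically. A minimal sketch, assuming the mnist_loader module from the book's repository; the helper below is illustrative only:

# A minimal sketch of the random-guess baseline (illustrative only).
# Assumes mnist_loader from Nielsen's repository is importable.
import random

import mnist_loader

def random_baseline():
    training_data, validation_data, test_data = mnist_loader.load_data()
    # Guess a digit 0-9 uniformly at random for every test image.
    guesses = [random.randint(0, 9) for _ in test_data[1]]
    num_correct = sum(int(g == y) for g, y in zip(guesses, test_data[1]))
    print("Random-guess baseline.")
    print("%s of %s values correct." % (num_correct, len(test_data[1])))

if __name__ == "__main__":
    random_baseline()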
""" mnist_average_darkness ~~~~~~~~~~~~~~~~~~~~~~ A naive classifier for recognizing handwritten digits from the MNIST data set. The program classifies digits based on how dark they are --- the idea is that digits like "1" tend to be less dark than digits like "8", simply because the latter has a more complex shape. When shown an image the classifier returns whichever digit in the training data had the closest average darkness. The program works in two steps: first it trains the classifier, and then it applies the classifier to the MNIST test data to see how many digits are correctly classified. Needless to say, this isn't a very good way of recognizing handwritten digits! Still, it's useful to show what sort of performance we get from naive ideas.""" #### Libraries # Standard library from collections import defaultdict # My libraries import mnist_loader def main(): training_data, validation_data, test_data = mnist_loader.load_data() # training phase: compute the average darknesses for each digit, # based on the training data avgs = avg_darknesses(training_data) # testing phase: see how many of the test images are classified # correctly num_correct = sum(int(guess_digit(image, avgs) == digit) for image, digit in zip(test_data[0], test_data[1])) print "Baseline classifier using average darkness of image." print "%s of %s values correct." % (num_correct, len(test_data[1])) def avg_darknesses(training_data): """ Return a defaultdict whose keys are the digits 0 through 9. For each digit we compute a value which is the average darkness of training images containing that digit. The darkness for any particular image is just the sum of the darknesses for each pixel.""" digit_counts = defaultdict(int) darknesses = defaultdict(float) for image, digit in zip(training_data[0], training_data[1]): digit_counts[digit] += 1 darknesses[digit] += sum(image) avgs = defaultdict(float) for digit, n in digit_counts.iteritems(): avgs[digit] = darknesses[digit] / n return avgs def guess_digit(image, avgs): """Return the digit whose average darkness in the training data is closest to the darkness of ``image``. Note that ``avgs`` is assumed to be a defaultdict whose keys are 0...9, and whose values are the corresponding average darknesses across the training data.""" darkness = sum(image) distances = {k: abs(v-darkness) for k, v in avgs.iteritems()} return min(distances, key=distances.get) if __name__ == "__main__": main()
which turns out to be quite interesting. On a Raspberry Pi 3, a test run gives the following:
pi@raspberrypi:~/neural-networks-and-deep-learning/src $ python mnist_svm.py
Baseline classifier using an SVM.
9435 of 10000 values correct.