W!o+'s 《小伶鼬工坊演義》: Neural Networks【Perceptron】Part Two

It is said that in 1943 Warren McCulloch and Walter Pitts, drawing inspiration from the biological

Neuron

A neuron, also known as a nerve cell, is one of the structural and functional units of the nervous system. Neurons make up roughly 10% of the nervous system; most of the rest consists of glial cells. The basic structure comprises dendrites, an axon, a myelin sheath, and the cell body with its nucleus. Signals travel as electrical impulses; at the axon terminal they are passed on chemically by neurotransmitters (such as dopamine and acetylcholine), and once enough transmitter has been released across the synapse, an electrical signal forms again in the next cell.

The human brain contains roughly 86 billion nerve cells, of which about 70 billion are cerebellar granule cells.

[Figure: a scientist's 1899 drawing of a neuron (Purkinje cell)]

 

neuron, created the mathematical model of an 'artificial neuron' that today is called the 'McCulloch–Pitts (MCP) neuron':

Some specific models of artificial neural nets

McCulloch-Pitts Model

In 1943 Warren McCulloch, a neurophysiologist, and Walter Pitts, a logician, published the first paper describing what we would call a neural network. Their “neurons” operated under the following assumptions:

  1. They are binary devices (Vi = [0,1])
  2. Each neuron has a fixed threshold, theta
  3. The neuron receives inputs from excitatory synapses, all having identical weights. (However it may receive multiple inputs from the same source, so the excitatory weights are effectively positive integers.)
  4. Inhibitory inputs have an absolute veto power over any excitatory inputs.
  5. At each time step the neurons are simultaneously (synchronously) updated by summing the weighted excitatory inputs and setting the output (Vi) to 1 iff the sum is greater than or equal to the threshold AND the neuron receives no inhibitory input.

We can summarize these rules with the McCulloch-Pitts output rule

        Vi(t+1) = 1  if the weighted sum of the excitatory inputs is >= theta and no inhibitory input is active,
        Vi(t+1) = 0  otherwise,

and the accompanying diagram of the unit.

Using this scheme we can figure out how to implement any Boolean logic function. As you probably know, with a NOT function and either an OR or an AND, you can build up XORs, adders, shift registers, and anything you need to perform computation.

We represent the output for various inputs as a truth table, where 0 = FALSE, and 1 = TRUE. You should verify that when W = 1 and theta = 1, we get the truth table for the logical NOT,

        Vin  |  Vout
        -----+------
          1  |   0
          0  |   1

by using this circuit (Vin is wired as an inhibitory input, while a constant excitatory input of 1 enters with weight W):

With two excitatory inputs V1 and V2, and W = 1, we can get either an OR or an AND, depending on the value of theta:

        an OR  if theta = 1,

        an AND if theta = 2.

Can you verify that with these weights and thresholds, the various possible inputs for V1 and V2 result in this table?

        V1 | V2 | OR | AND
        ---+----+----+----
         0 |  0 |  0 |  0
         0 |  1 |  1 |  0
         1 |  0 |  1 |  0
         1 |  1 |  1 |  1
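To check these tables programmatically, here is a minimal Python sketch of an MCP unit following the rules above (binary inputs, a common excitatory weight W, a threshold theta, and an absolute inhibitory veto). The function mcp_neuron and its arguments are illustrative names of my own, not anything from the quoted text.

```python
def mcp_neuron(excitatory, inhibitory, W=1, theta=1):
    """McCulloch-Pitts unit: fire (1) iff the weighted excitatory sum
    reaches the threshold AND no inhibitory input is active."""
    if any(inhibitory):
        return 0
    return 1 if W * sum(excitatory) >= theta else 0

# NOT: a constant excitatory input of 1, with Vin wired as an inhibitory input
for v in (0, 1):
    print("NOT", v, "=", mcp_neuron(excitatory=[1], inhibitory=[v], theta=1))

# OR and AND: two excitatory inputs, thresholds 1 and 2 respectively
for v1 in (0, 1):
    for v2 in (0, 1):
        v_or  = mcp_neuron([v1, v2], [], theta=1)
        v_and = mcp_neuron([v1, v2], [], theta=2)
        print(v1, v2, "->", "OR:", v_or, "AND:", v_and)
```

Running it reproduces the NOT table and the OR/AND table above.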

 

The exclusive OR (XOR) has the truth table:

        V1 | V2 | XOR
        ---+----+----
         0 |  0 |  0 
         0 |  1 |  1       (Note that this is also a
         1 |  0 |  1        "1 bit adder".)
         1 |  1 |  0 

It cannot be represented with a single neuron, but the relationship
XOR = (V1 OR V2) AND NOT (V1 AND V2) suggests that it can be represented with the two-layer network sketched below.
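A minimal sketch of that network, assuming the same MCP rules as above (unit excitatory weights and an absolute inhibitory veto, redefined here so the snippet is self-contained): the first layer computes V1 OR V2 and V1 AND V2, and the second layer fires only when the OR unit is on and the AND unit is off.

```python
def mcp_neuron(excitatory, inhibitory, theta):
    """MCP unit with unit excitatory weights and absolute inhibitory veto."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= theta else 0

def xor_net(v1, v2):
    # Layer 1: OR (theta = 1) and AND (theta = 2)
    v_or  = mcp_neuron([v1, v2], [], theta=1)
    v_and = mcp_neuron([v1, v2], [], theta=2)
    # Layer 2: fire iff the OR unit fires and the AND unit does not
    # (the AND output is wired in as an inhibitory input)
    return mcp_neuron([v_or], [v_and], theta=1)

for v1 in (0, 1):
    for v2 in (0, 1):
        print(v1, v2, "->", xor_net(v1, v2))
```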

 

Because the 'XOR' logic function cannot be 'represented' by a single, single-layer MCP neuron, must it then become a 'network'? And even if that makes it a 'universal computer', how is it supposed to 'learn'?? After fifteen years of quiet gestation, Frank Rosenblatt proposed the 'perceptron' model:

The Perceptron

The next major advance was the perceptron, introduced by Frank Rosenblatt in his 1958 paper. The perceptron had the following differences from the McCullough-Pitts neuron:

  1. The weights and thresholds were not all identical.
  2. Weights can be positive or negative.
  3. There is no absolute inhibitory synapse.
  4. Although the neurons were still two-state, the output function f(u) goes from [-1,1], not [0,1]. (This is no big deal, as a suitable change in the threshold lets you transform from one convention to the other.)
  5. Most importantly, there was a learning rule.

In a slightly more modern and conventional notation (and with Vi = [0,1]), we can describe the perceptron like this:

This shows a perceptron unit, i, receiving various inputs Ij, weighted by a “synaptic weight” Wij.

The ith perceptron receives its input from n input units, which do nothing but pass on the input from the outside world. The output of the perceptron is a step function:

        Vi = f(ui),   with f(u) = 1 if u > 0 and f(u) = 0 otherwise,

and

        ui = sum_j Wij Vj + thetai.

For the input units, Vj = Ij. There are various ways of implementing the threshold, or bias, thetai. Sometimes it is subtracted, instead of added to the input u, and sometimes it is included in the definition of f(u).
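As a tiny concrete check of these conventions (the threshold added to the weighted sum, step function at zero), here is a sketch of the forward pass; the function name and the particular weights are illustrative choices only, not values from the quoted text.

```python
def perceptron_output(weights, inputs, theta):
    """Single perceptron unit: u = sum_j Wij * Vj + thetai, output f(u)."""
    u = sum(w * x for w, x in zip(weights, inputs)) + theta
    return 1 if u > 0 else 0

# Hand-picked weights and bias that make the unit behave like an OR gate
for pattern in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pattern, "->", perceptron_output([1.0, 1.0], pattern, theta=-0.5))
```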

A network of two perceptrons with three inputs would look like:

Note that they don’t interact with each other – they receive inputs only from the outside. We call this a “single layer perceptron network” because the input units don’t really count. They exist just to provide an output that is equal to the external input to the net.

The learning scheme is very simple. Let ti be the desired “target” output for a given input pattern, and Vi be the actual output. The error (called “delta”) is the difference between the desired and the actual output, and the change in the weight is chosen to be proportional to delta.

Specifically,

        deltai = ti - Vi    and    ΔWij = eta · deltai · Ij,

where eta is the learning rate.

Can you see why this is reasonable? Note that if the output of the ith neuron is too small, the weights of all its inputs are changed to increase its total input. Likewise, if the output is too large, the weights are changed to decrease the total input. We’ll better understand the details of why this works when we take up back propagation. First, an example.

……

How many epochs does it take until the perceptron has been trained to generate the correct truth table for an OR? Note that, except for a scale factor, this is the same result which McCulloch and Pitts deduced for the weights and bias without letting the net do the learning. (Do you see why a positive threshold for an M-P neuron is equivalent to adding a negative bias term in the expression for the perceptron total input u?)
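One way to answer the question is simply to run the rule. Below is a minimal sketch of the learning rule above applied to the OR truth table, counting epochs until all four outputs are correct; the learning rate, initial weights, and variable names are arbitrary illustrative choices, not values from the quoted text.

```python
# Train a single perceptron on the OR truth table with the delta rule
patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = [0.0, 0.0]      # synaptic weights Wij (arbitrary start)
theta = 0.0         # bias/threshold term, learned like a weight
eta = 0.1           # learning rate (arbitrary choice)

def output(x):
    u = w[0] * x[0] + w[1] * x[1] + theta
    return 1 if u > 0 else 0

epoch = 0
while any(output(x) != t for x, t in patterns):
    epoch += 1
    for x, t in patterns:
        delta = t - output(x)          # error for this pattern
        w[0] += eta * delta * x[0]     # delta Wij = eta * deltai * Ij
        w[1] += eta * delta * x[1]
        theta += eta * delta           # the bias input is effectively 1

print("epochs:", epoch, "weights:", w, "bias:", theta)
```

Since OR is linearly separable, the loop is guaranteed to terminate, and the final weights and bias have the same sign pattern as the hand-chosen MCP values, up to a scale factor.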

───

 

This pushed the 'artificial neural network' to a new peak once more. Yet that 'XOR' problem remained troublesome:

why do neurons make networks

On the logical operations page, I showed how single neurons can perform simple logical operations, but that they are unable to perform some more difficult ones like the XOR operation (shown above). I also described how an XOR network can be made, but didn’t go into much detail about why the XOR requires an extra layer for its solution. This page uses the knowledge we have from the formalising & visualising page to help us understand why neurons need to make networks. The only network we will look at is the XOR, but at the end you will play with a network that visualises the XOR problem as a pair of lines through input space, which you can adjust by changing the parameters of the neurons.

the xor problem

We have a problem that can be described with the logic table below, and visualised in input space as shown on the right.

……
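The quoted page goes on to visualise this in input space. As a stand-alone illustration of the same point (my own sketch, not code from that page), the following brute-force search over a grid of weights and thresholds for a single threshold unit finds solutions for OR and AND but none for XOR:

```python
import itertools

# Targets for the four input pairs (0,0), (0,1), (1,0), (1,1)
targets = {
    "OR":  (0, 1, 1, 1),
    "AND": (0, 0, 0, 1),
    "XOR": (0, 1, 1, 0),
}
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
grid = [x / 2 for x in range(-8, 9)]   # weights/threshold in [-4, 4], step 0.5

for name, target in targets.items():
    found = None
    for w1, w2, theta in itertools.product(grid, repeat=3):
        outs = tuple(1 if w1 * a + w2 * b >= theta else 0 for a, b in inputs)
        if outs == target:
            found = (w1, w2, theta)
            break
    print(name, "realisable by a single unit:", found is not None, found)
```

The XOR row stays unrealisable no matter how fine the grid, which is just the linear-separability argument in executable form: one unit draws one line through input space, and no single line separates the XOR targets.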

 

Moreover, how exactly is a 'neural network' to be 'taught' so that it 'learns effectively'??!! Is it that the model is 'too simplistic', so that in 'theory' it simply cannot learn 'very much'?? Or is it that training is 'too time-consuming', so that in 'practice' it is of little value!!

Hence the 'debate' persists to this day; no wonder some say:

Author: Michael Marsalli
Overview:


MODULE DESCRIPTION:

In 1943 Warren S. McCulloch, a neuroscientist, and Walter Pitts, a logician, published “A logical calculus of the ideas immanent in nervous activity” in the Bulletin of Mathematical Biophysics 5:115-133. In this paper McCulloch and Pitts tried to understand how the brain could produce highly complex patterns by using many basic cells that are connected together. These basic brain cells are called neurons, and McCulloch and Pitts gave a highly simplified model of a neuron in their paper. The McCulloch and Pitts model of a neuron, which we will call an MCP neuron for short, has made an important contribution to the development of artificial neural networks — which model key features of biological neurons.

The original MCP Neurons had limitations. Additional features were added which allowed them to “learn.” The next major development in neural networks was the concept of a perceptron which was introduced by Frank Rosenblatt in 1958. Essentially the perceptron is an MCP neuron where the inputs are first passed through some “preprocessors,” which are called association units. These association units detect the presence of certain specific features in the inputs. In fact, as the name suggests, a perceptron was intended to be a pattern recognition device, and the association units correspond to feature or pattern detectors.

……

 

Others put it this way:

The McCulloch-Pitts Neuron
Written by Harry Fairhead

Nowadays McCulloch-Pitts neurons tend to be overlooked in favour of simpler neuronal models, but they were and still are important. They proved that something that behaved like a biological neuron was capable of computation, and early computer designers often thought in terms of them.

Before the neural network algorithms in use today were devised, there was an alternative. It was invented in 1943 by neurophysiologist  Warren McCulloch and logician Walter Pitts. Now networks of the McCulloch-Pitts type tend to be overlooked in favour of “gradient descent” type neural networks and this is a shame. McCulloch-Pitts neurons are more like the sort of approach we see today in neuromorphic chips where neurons are used as computational units.

[Photo: Warren McCulloch]

[Photo: Walter Pitts]

What is interesting about the McCulloch-Pitts model of a neural network is that it can be used as the components of computer-like systems.

……

What can the brain compute?

You can see that it would be possible to continue in this way to build more and more complicated neural circuits using cells. Shift registers are easy, so are half and full adders – give them a try!
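Taking up that invitation, here is a minimal half-adder sketch built from the same style of MCP unit used earlier (the sum bit is the XOR construction, the carry bit is an AND); the helper names mcp and half_adder are mine, not from the article.

```python
def mcp(excitatory, inhibitory, theta):
    """MCP unit: unit excitatory weights, absolute inhibitory veto."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= theta else 0

def half_adder(a, b):
    v_or  = mcp([a, b], [], theta=1)       # a OR b
    carry = mcp([a, b], [], theta=2)       # a AND b gives the carry bit
    total = mcp([v_or], [carry], theta=1)  # OR AND NOT(AND) = XOR gives the sum bit
    return total, carry

for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(a, b, "-> sum", s, "carry", c)
```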

But at this point you might well be wondering why we are bothering at all?

The answer is that back in the early days of AI the McCulloch-Pitts neuron, and its associated mathematics, gave us clear proof that you could do computations with elements that looked like biological neurons.

To be more precise, it is relatively easy to show how to construct a network that will recognise or “accept” a regular expression. A regular expression is something that can be made up using simple rules. In terms of production rules any regular expression can be described by a grammar having rules of the type:

<non-terminal1> ->  symbol <non-terminal2>

or

<non-terminal1> -> symbol

That is, rules are only “triggered” on the right and symbols are only added at the left.
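To make the shape of such rules concrete, here is a small illustrative sketch (my own example, not from the article) of the right-linear grammar S -> a S | b, which generates the regular language a*b, together with the finite-state recogniser it corresponds to:

```python
# Right-linear grammar for the regular language a*b:
#   S -> a S
#   S -> b
# Equivalent finite-state recogniser:
def accepts(s):
    state = "S"
    for ch in s:
        if state == "S" and ch == "a":
            state = "S"            # rule S -> a S
        elif state == "S" and ch == "b":
            state = "ACCEPT"       # rule S -> b
        else:
            return False           # no rule applies
    return state == "ACCEPT"

for word in ("b", "aab", "aaab", "ba", "aa", ""):
    print(repr(word), accepts(word))
```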

……

Why is this important?

Well, if you agree that McCulloch-Pitts neurons capture the essence of the way biological neurons work, then you also have to conclude that biological networks are just finite state machines and as such can only recognise or generate regular sequences.

In their original work McCulloch and Pitts extended this observation into deducing a great deal about human brain function. Most of this seems a bit far-fetched from today’s standpoint but the basic conclusion that the brain is probably nothing more than a simple computer – i.e. a finite state machine – still seems reasonable.

If you know a little about the theory of computation you might well not be happy about this “bottom line” because a finite state machine isn’t even as powerful as a Turing machine. That is, there are lots of things that a Turing machine can compute that in theory we, as finite state machines, can’t. In fact there are three or more complexities of grammar, and hence types of sequence, that finite state machines, and hence presumably us, cannot recognise.

This sort of argument is often used to demonstrate that there has to be more to a human brain than mere logic – it has a non-physical “mind” component or some strange quantum phenomena that are required to explain how we think.

All nonsense of course!

You shouldn’t get too worried about these conclusions because when you look at them in more detail some interesting facts emerge. For example, all finite sequences are regular and so we are really only worrying about philosophical difficulties that arise because we are willing to allow infinite sequences of symbols.

While this seems reasonable when the infinite sequence is just ABAB… it is less reasonable when there is no finite repetitive sequence which generates the chain. If you want to be philosophical about such things perhaps it would be better to distinguish between sequences that have no fixed length limit – i.e. unbounded but finite sequences – and truly infinite sequences.

Surprisingly, even in this case things work out in more or less the same way with finite state machines, and hence human brains, lagging behind other types of computer. The reason for this is simply that as soon as you consider a sequence longer than the number of elements in the brain it might as well be infinite!

As long as we restrict our attention to finite sequences with some upper limit on length, and assume that the number of computing elements available is much greater than this, then all computers are equal and the human brain is as good as anything!

McCulloch and Pitts neural networks are not well known or widely studied these days because they grew into, or were superseded by, another sort of neural net – one that can be trained to generate any logic function, or indeed any function you care to name.

───