The learning process
- This section describes the learning ability of neural networks. First, the term learning is explained, followed by an overview of specific learning algorithms for neural nets.
What does "learning" mean with regard to neural nets?
- In the human brain, information is passed between neurons as electrical stimulation: a neuron receives signals through its dendrites and sends its own signal along its axon. If a neuron receives a certain amount of stimulation, it generates an output to the neurons connected to it, and so the information makes its way to its destination, where some reaction occurs. If the incoming stimulation is too low, the neuron generates no output and the further transport of the information is blocked. Explaining how the human brain learns is quite difficult, and nobody knows exactly how it works. It is supposed that the connection structure among the neurons changes during the learning process, so that certain stimulations are only accepted by certain neurons. This means that firm connections exist between neural cells that have once learned a specific fact, enabling fast recall of this information. If related information is acquired later, the same neural cells are stimulated and adapt their connection structure to the new information. On the other hand, if a specific piece of information isn't recalled for a long time, the established connections between the responsible neural cells become weaker. This is what has happened when someone has "forgotten" a once-learned fact or can only remember it vaguely.
As mentioned before, neural nets try to simulate the human brain's ability to learn. That is, an artificial neural net is also made of neurons and connections between them. Unlike the biological model, a neural net has a fixed structure, built of a specified number of neurons and a specified number of connections between them, each of which carries a value called a "weight". What changes during the learning process are the values of those weights. Compared to the original this means: incoming information "stimulates" certain neurons (it exceeds a specified threshold value), and these neurons either pass the information on to connected neurons or prevent its further transport along the weighted connections.
The value of a weight is increased if information should be transported and decreased if not. While the net learns different inputs, the weight values are changed dynamically until they are balanced, so that each input leads to the desired output. The training of a neural net results in a matrix that holds the weight values between the neurons. Once a neural net has been trained correctly, it will probably be able to find the desired output for a learned input by using these matrix values. I said "probably". That is sad but true, for it can't be guaranteed that a neural net will recall the correct result in every case. Very often a certain error is left after the learning process, so in most cases the generated output is only a good approximation of the perfect output. The following sections introduce several learning algorithms for neural networks.
Supervised and unsupervised learning
- The learning algorithm of a neural network can be either supervised or unsupervised. A neural net is said to learn supervised if the desired output is already known. It learns unsupervised if no target patterns are given, as in the selforganization algorithm described later.
Example: pattern association. Suppose a neural net shall learn to associate the following pairs of patterns. Each input pattern encodes a decimal number (1 to 4) as a sequence of bits in which a single bit is set. Each target pattern is the binary value of the same decimal number:
input pattern    target pattern
0001             001
0010             010
0100             011
1000             100
Forwardpropagation
- Forwardpropagation is a supervised learning algorithm and describes the "flow of information" through a neural net from its input layer to its output layer. The algorithm works as follows:
1. Set all weights to random values ranging from -1.0 to +1.0.
2. Set an input pattern (binary values) to the neurons of the net's input layer.
3. Activate each neuron of the following layer:
   - Multiply the weight values of the connections leading to this neuron by the output values of the preceding neurons.
   - Add up these values.
   - Pass the result to an activation function, which computes the output value of this neuron.
   - Repeat this layer by layer until the output layer is reached.
4. Compare the calculated output pattern to the desired target pattern and compute an error value.
5. Change each weight by adding to its old value the product of the learning rate, the output of the neuron the weight comes from, and the error value.
6. Go to step 2.
7. The algorithm ends when all output patterns match their target patterns.
Patterns to be learned:

input    target
0 1      0
1 1      1

- First, the weight values are set to random values (0.35 and 0.81).
- The learning rate of the net is set to 0.25.
- Next, the values of the first input pattern (0 1) are set to the neurons of the input layer (the output of the input layer is the same as its input).
- The neurons in the following layer (only one neuron in the output layer) are activated:
Input 1 of output neuron: 0 * 0.35 = 0
Input 2 of output neuron: 1 * 0.81 = 0.81
Add the inputs: 0 + 0.81 = 0.81 (= output)
Compute an error value by subtracting output from target: 0 - 0.81 = -0.81
Value for changing weight 1: 0.25 * 0 * (-0.81) = 0 (0.25 = learning rate)
Value for changing weight 2: 0.25 * 1 * (-0.81) = -0.2025
Change weight 1: 0.35 + 0 = 0.35 (not changed)
Change weight 2: 0.81 + (-0.2025) = 0.6075
- Now that the weights are changed, the second input pattern (1 1) is set to the input layer's neurons and the activation of the output neuron is performed again, now with the new weight values:
Input 1 of output neuron: 1 * 0.35 = 0.35
Input 2 of output neuron: 1 * 0.6075 = 0.6075
Add the inputs: 0.35 + 0.6075 = 0.9575 (= output)
Compute an error value by subtracting output from target: 1 - 0.9575 = 0.0425
Value for changing weight 1: 0.25 * 1 * 0.0425 = 0.010625
Value for changing weight 2: 0.25 * 1 * 0.0425 = 0.010625
Change weight 1: 0.35 + 0.010625 = 0.360625
Change weight 2: 0.6075 + 0.010625 = 0.618125
- That was one learning step. Each input pattern has been propagated through the net and the weight values have been changed.
- The error of the net can now be calculated by adding up the squared values of the output errors of each pattern:
Compute the net error: (-0.81)^2 + (0.0425)^2 = 0.65790625
- By performing this procedure repeatedly, this error value gets smaller and smaller.
The algorithm has finished successfully when the net error is zero (perfect) or approximately zero.
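The worked example above can be retraced in a few lines of Python. This is only a minimal sketch, assuming a net with two input neurons, one output neuron and no activation function (the output is the plain weighted sum, as in the example). The names `forward` and `train_step` are made up for illustration and do not come from the original text.

```python
# Minimal sketch of the worked example: two input neurons, one output
# neuron, no activation function (output = weighted sum of the inputs).
# The function names are illustrative assumptions.

def forward(weights, inputs):
    """Weighted sum of the inputs: the output of the single output neuron."""
    return sum(w * x for w, x in zip(weights, inputs))


def train_step(weights, patterns, learning_rate):
    """Present every pattern once; return the new weights and the net error."""
    weights = list(weights)
    net_error = 0.0
    for inputs, target in patterns:
        error = target - forward(weights, inputs)
        # weight change = learning rate * input * error
        weights = [w + learning_rate * x * error for w, x in zip(weights, inputs)]
        net_error += error ** 2
    return weights, net_error


patterns = [((0, 1), 0), ((1, 1), 1)]
weights, net_error = train_step([0.35, 0.81], patterns, learning_rate=0.25)
print(weights)    # approximately [0.360625, 0.618125]
print(net_error)  # approximately 0.65790625

# Repeating the learning step drives the net error towards zero.
for _ in range(200):
    weights, net_error = train_step(weights, patterns, learning_rate=0.25)
print(net_error)
```

The first call reproduces the weight values and the net error of the first learning step. The loop at the end only illustrates that the net error shrinks when the step is repeated; the count of 200 repetitions is an arbitrary choice.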
Backpropagation
- Backpropagation is a supervised learning algorithm and is mainly used by Multi-Layer-Perceptrons to change the weights connected to the net's hidden neuron layer(s). The backpropagation algorithm uses a computed output error to change the weight values in the backward direction. To get this net error, a forwardpropagation phase must be performed first. During forward propagation, the neurons are activated using the sigmoid activation function. The formula of sigmoid activation is:
$$f(x) = \frac{1}{1 + e^{-x}}$$

where x is the input of the neuron.
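As a small sketch, the sigmoid can be written in Python as follows. The helper `sigmoid_slope` is an illustrative addition: it computes output * (1 - output), which is the derivative of the sigmoid expressed through its own output and is the factor that appears in the weight-change formula of this section.

```python
import math


def sigmoid(x):
    """Sigmoid activation: maps any input to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_slope(output):
    """Slope of the sigmoid, expressed through its own output value."""
    return output * (1.0 - output)


out = sigmoid(0.55)
print(out)                 # approximately 0.634135591
print(sigmoid_slope(out))  # approximately 0.2320
```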
The algorithm works as follows:

1. Perform the forwardpropagation phase for an input pattern and calculate the output error.
2. Change all weight values of each weight matrix using the formula
   weight(new) = weight(old) + learning rate * output error * output(neurons i) * output(neurons i+1) * ( 1 - output(neurons i+1) )
   where neurons i is the layer the weight comes from and neurons i+1 is the layer it leads to.
3. Go to step 1.
4. The algorithm ends when all output patterns match their target patterns.
Patterns to be learned:

input    target
0 1      0
1 1      1

- First, the weight values are set to random values: 0.62, 0.42, 0.55, -0.17 for weight matrix 1 and 0.35, 0.81 for weight matrix 2.
- The learning rate of the net is set to 0.25.
- Next, the values of the first input pattern (0 1) are set to the neurons of the input layer (the output of the input layer is the same as its input).
- The neurons in the hidden layer are activated:
Input of hidden neuron 1: 0 * 0.62 + 1 * 0.55 = 0.55
Input of hidden neuron 2: 0 * 0.42 + 1 * (-0.17) = -0.17
Output of hidden neuron 1: 1 / ( 1 + exp(-0.55) ) = 0.634135591
Output of hidden neuron 2: 1 / ( 1 + exp(+0.17) ) = 0.457602059
- The neurons in the output layer are activated:
Input of output neuron: 0.634135591 * 0.35 + 0.457602059 * 0.81 = 0.592605124
Output of output neuron: 1 / ( 1 + exp(-0.592605124) ) = 0.643962658
Compute an error value by subtracting output from target: 0 - 0.643962658 = -0.643962658
- Now that we have the output error, let's do the backpropagation.
- We start with changing the weights in weight matrix 2:
Value for changing weight 1: 0.25 * (-0.643962658) * 0.634135591 * 0.643962658 * (1-0.643962658) = -0.023406638
Value for changing weight 2: 0.25 * (-0.643962658) * 0.457602059 * 0.643962658 * (1-0.643962658) = -0.016890593
Change weight 1: 0.35 + (-0.023406638) = 0.326593362
Change weight 2: 0.81 + (-0.016890593) = 0.793109407
- Now we will change the weights in weight matrix 1:
Value for changing weight 1: 0.25 * (-0.643962658) * 0 * 0.634135591 * (1-0.634135591) = 0
Value for changing weight 2: 0.25 * (-0.643962658) * 0 * 0.457602059 * (1-0.457602059) = 0
Value for changing weight 3: 0.25 * (-0.643962658) * 1 * 0.634135591 * (1-0.634135591) = -0.037351064
Value for changing weight 4: 0.25 * (-0.643962658) * 1 * 0.457602059 * (1-0.457602059) = -0.039958271
Change weight 1: 0.62 + 0 = 0.62 (not changed)
Change weight 2: 0.42 + 0 = 0.42 (not changed)
Change weight 3: 0.55 + (-0.037351064) = 0.512648936
Change weight 4: -0.17 + (-0.039958271) = -0.209958271
- The first input pattern has now been propagated through the net.
- The same procedure is used for the next input pattern, but with the changed weight values.
- After the forward and backward propagation of the second pattern, one learning step is complete and the net error can be calculated by adding up the squared output errors of each pattern.
- By performing this procedure repeatedly, this error value gets smaller and smaller.
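The algorithm has finished successfully when the net error is zero (perfect) or approximately zero.
Note that this algorithm is also applicable to Multi-Layer-Perceptrons with more than one hidden layer. Also note that the walkthrough above applies the output error unchanged to both weight matrices. In the textbook form of backpropagation, the error used for weight matrix 1 is first propagated back to each hidden neuron through the corresponding weight of weight matrix 2.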
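The first pattern of the worked example can be retraced with the following sketch. It follows the weight-change formula exactly as given in this section, so the output error is applied unchanged to both weight matrices. The function name `learn_pattern` and the nested-list layout of weight matrix 1 are assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def learn_pattern(w1, w2, inputs, target, learning_rate):
    """One forward and one backward pass for a net with two input, two
    hidden and one output neuron, using this section's weight-change rule.

    w1[i][j] is the weight from input neuron i to hidden neuron j,
    w2[j] is the weight from hidden neuron j to the output neuron.
    """
    # Forward propagation
    hidden = [sigmoid(sum(inputs[i] * w1[i][j] for i in range(2)))
              for j in range(2)]
    output = sigmoid(sum(hidden[j] * w2[j] for j in range(2)))
    error = target - output

    # Backward pass: weight matrix 2 first, then weight matrix 1
    new_w2 = [w2[j] + learning_rate * error * hidden[j] * output * (1 - output)
              for j in range(2)]
    new_w1 = [[w1[i][j] + learning_rate * error * inputs[i]
               * hidden[j] * (1 - hidden[j])
               for j in range(2)]
              for i in range(2)]
    return new_w1, new_w2, error


w1 = [[0.62, 0.42], [0.55, -0.17]]
w2 = [0.35, 0.81]
w1, w2, error = learn_pattern(w1, w2, inputs=(0, 1), target=0, learning_rate=0.25)
print(error)  # approximately -0.643962658
print(w2)     # approximately [0.326593362, 0.793109407]
print(w1)     # approximately [[0.62, 0.42], [0.512648936, -0.209958271]]
```

Calling `learn_pattern` again with the second pattern (1 1) and the returned weights would complete one learning step.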
"What happens, if all values of an input pattern are zero?"
- If all values of an input pattern are zero, the weights in weight matrix 1 would never be changed for this pattern, because every weight change is multiplied by the (zero) output of the input neuron, and the net could not learn it. For that reason, a "pseudo input" called bias is created that has a constant output value of 1. The bias is connected through additional weights to the neurons of the hidden layer and the output layer.
These additional weights have initial random values and are changed in the same way as the other weights. Because the bias sends a constant output of 1 to the following neurons, each of those neurons always receives at least one nonzero signal, so its bias weight can be adapted even when all input values are zero.
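The effect of the bias on the weight changes can be illustrated with a short sketch. The error and slope values below are hypothetical numbers chosen only for illustration, and `weight_changes` is a made-up helper that applies the weight-change formula of the backpropagation section to all connections leading to one neuron.

```python
def weight_changes(outputs_from, error, slope, learning_rate=0.25):
    """Weight changes for all connections leading to one neuron.

    outputs_from: outputs of the preceding neurons
    slope:        output * (1 - output) of the receiving neuron
    """
    return [learning_rate * error * out * slope for out in outputs_from]


error, slope = 0.6, 0.25  # hypothetical values for illustration

# Without bias: an all-zero input pattern changes no weight at all.
print(weight_changes([0, 0], error, slope))

# With bias: the constant output 1 is appended as a third "input",
# so at least the bias weight receives a nonzero change.
print(weight_changes([0, 0, 1], error, slope))
```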
Selforganization
- Selforganization is an unsupervised learning algorithm used by the Kohonen Feature Map neural net. As mentioned in previous sections, a neural net tries to simulate the biological human brain, and selforganization is probably the best way to realize this. It is commonly known that the cortex of the human brain is subdivided into different regions, each responsible for certain functions. The neural cells organize themselves in groups according to incoming information. This incoming information is not only received by a single neural cell but also influences other cells in its neighbourhood. This organization results in a kind of map, where neural cells with similar functions are arranged close together. This selforganization process can also be performed by a neural network. Such neural nets are mostly used for classification purposes, because similar input values are represented in certain areas of the net's map. A sample structure of a Kohonen Feature Map that uses the selforganization algorithm is shown below:
[Figure: Kohonen Feature Map with 2-dimensional input and 2-dimensional map (3x3 neurons)]

As you can see, each neuron of the input layer is connected to each neuron on the map. The resulting weight matrix is used to propagate the net's input values to the map neurons. Additionally, all neurons on the map are connected among themselves. These connections are used to influence neurons in a certain area of activation around the neuron with the greatest activation, received from the input layer's output. The amount of feedback between the map neurons is usually calculated using the Gauss function:

$$\mathrm{feedback}_{ci} = e^{-\frac{|x_c - x_i|^2}{2 \cdot \mathrm{sig}^2}}$$

where x_c is the position of the most activated neuron, x_i are the positions of the other map neurons, and sig is the activation area (radius).
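A minimal sketch of the Gauss function in Python might look like this. It assumes that the positions are (row, column) coordinates on the map and that the distance between two neurons is the Euclidean distance between those coordinates. The text does not fix these details, so treat them as assumptions.

```python
import math


def feedback(winner_pos, neuron_pos, sig):
    """Gauss feedback between the most activated neuron and another map neuron.

    winner_pos and neuron_pos are (row, column) positions on the map,
    sig is the radius of the activation area.
    """
    squared_distance = sum((c - i) ** 2 for c, i in zip(winner_pos, neuron_pos))
    return math.exp(-squared_distance / (2 * sig ** 2))


# Feedback from the centre neuron of a 3x3 map to every map neuron
for row in range(3):
    print([round(feedback((1, 1), (row, col), sig=1.0), 3) for col in range(3)])
```

With sig = 1.0 the centre neuron itself gets a feedback of 1.0, its direct neighbours about 0.607 and the corner neurons about 0.368. A smaller sig makes the feedback fall off faster with distance.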
In the beginning, the activation area is large and so is the feedback between the map neurons. This results in an activation of neurons in a wide area around the most activated neuron. As the learning progresses, the activation area is steadily decreased, and only neurons closer to the activation center are influenced by the most activated neuron. Unlike the biological model, the map neurons don't change their positions on the map. The "arranging" is simulated by changing the values in the weight matrix (the same way as other neural nets do). Because selforganization is an unsupervised learning algorithm, no input/target patterns exist. The input values passed to the net's input layer are taken from a specified value range and represent the "data" that should be organized. The algorithm works as follows:

1. Define the range of the input values.
2. Set all weights to random values taken from the input value range.
3. Define the initial activation area.
4. Take a random input value and pass it to the input layer neuron(s).
5. Determine the most activated neuron on the map:
   - Multiply the input layer's output by the weight values.
   - The map neuron with the greatest resulting value is said to be "most activated".
6. Compute the feedback value of each other map neuron using the Gauss function.
7. Change the weight values using the formula:
   weight(new) = weight(old) + feedback value * ( input value - weight(old) ) * learning rate
8. Decrease the activation area.
9. Go to step 4.
10. The algorithm ends when the activation area is smaller than a specified value.
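The loop described above could be sketched in Python roughly as follows. All parameter values (map size, value range, initial activation area, shrink factor, learning rate) are illustrative assumptions. One step deliberately differs from the text: the most activated neuron is chosen here as the neuron whose weights are closest to the input value (smallest Euclidean distance), a common substitute for the multiplication described in the list.

```python
import math
import random


def train_map(rows=3, cols=3, dims=2, value_range=(0.0, 1.0),
              sig=2.0, sig_min=0.1, shrink=0.99, learning_rate=0.3, seed=0):
    """Sketch of the selforganization loop; all parameter values are
    illustrative assumptions, not taken from the original text."""
    rng = random.Random(seed)
    low, high = value_range
    # Random weights taken from the input value range
    weights = {(r, c): [rng.uniform(low, high) for _ in range(dims)]
               for r in range(rows) for c in range(cols)}
    # Stop once the activation area has become smaller than sig_min
    while sig >= sig_min:
        # Random input value from the defined range
        x = [rng.uniform(low, high) for _ in range(dims)]
        # Assumption: winner = neuron whose weights are closest to the input
        winner = min(weights, key=lambda pos: math.dist(weights[pos], x))
        for pos, w in weights.items():
            # Gauss feedback based on the distance on the map
            fb = math.exp(-math.dist(pos, winner) ** 2 / (2 * sig ** 2))
            # Move the weights towards the input value
            for k in range(dims):
                w[k] += fb * (x[k] - w[k]) * learning_rate
        # Decrease the activation area
        sig *= shrink
    return weights


for pos, w in sorted(train_map().items()):
    print(pos, [round(v, 2) for v in w])
```

Because every weight change moves a weight only part of the way towards an input value from the defined range, the weights stay inside that range during the whole run.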