BEE4333 Intelligent Control
Hamzah Ahmad
Ext: 6024/6130
Artificial Intelligence: Artificial Neural Network (ANN)
Copyright of Hamzah Ahmad, FKEE, UMP
Today's Lesson: 4.1 Basic Concept, 4.2 ANN Applications
LO1 : Able to understand the concept of Artificial Neural Network and its applications
Basic Concept
ANN was born from the demand for machine learning: a computer that learns from experience, examples, and analogy.
Simple concept: the computer attempts to model the human brain.
Also known as parallel distributed processors.
Why do we need an intelligent processor or computer to replace current technology? To decide intelligently and interact accordingly.
Human Brain: biological NN
Soma: the cell body of the neuron.
Synapses: connections between neurons.
Dendrites: receive information.
Axon: sends information.
Plasticity: neurons leading to the right answer are strengthened, and those leading to the wrong answer are weakened.
Learning from experience!
Learning
[Figure: ANN Architecture - input signals enter the input layer, pass through the middle layer, and leave the output layer as output signals.]
Learning
Each synapse has its own weight, which expresses the importance of its input.
An ANN learns through iterated adjustment of the synaptic weights.
The weights are adjusted so that the network's input/output behaviour matches its environment.
Each neuron computes its activation level from the numerical weights on its inputs and outputs.
The output of a neuron may be the final solution or the input to other networks.
How to design an ANN?
Decide how many neurons are to be used.
How are the connections between neurons constructed?
How many layers are needed?
Which learning algorithm is to be applied?
Train the ANN by initializing the weights and updating them from training sets.
ANN characteristics
Advantages:
A neural network can perform tasks that a linear program cannot.
When an element of the neural network fails, the network can continue without any problem because of its parallel nature.
A neural network learns and does not need to be reprogrammed.
It can be implemented in any application without any problem.
Disadvantages:
A neural network needs training before it can operate.
The architecture of a neural network differs from the architecture of microprocessors and therefore needs to be emulated.
Large neural networks require high processing time.
Today's Lesson: 4.3 ANN Model, 4.4 ANN Learning, 4.5 Simple ANN
LO1 : Able to understand basic concept of biases, thresholds and linear separability
LO2 : Able to analyze simple ANN (Perceptrons)
Examples of ANN
Categorization
Feedforward: all signals flow in one direction only, i.e. from lower layers (input) to upper layers (output).
Feedback: signals from neurons in upper layers are fed back either to themselves or to neurons in lower layers.
Cellular: neurons are connected in a cellular manner.
Exercise
Construct 4 artificial neurons: 2 neurons on the input and 2 neurons on the output.
Each arrow has its own weight, and that weight multiplies each value passing through the arrow - what does this process define?
If there is only ONE (1) input and one weight, the output is the multiplication of the two.
For more than ONE (1) input and weight, the neuron sums up the weighted values.
Set the weight to ONE (1) for each arrow and set the input to (0,0), (0,1), (1,1), (1,-1), (-1,1). What happens?
Change the weights randomly, differently for each arrow, between -0.5 and 0.5. What happens?
Try changing the weights again to values other than the above. Observe what happens. (A small sketch of this exercise follows below.)
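A minimal Python sketch of this exercise; the function name, the list-of-lists weight layout, and the printout are my own illustrative choices, since the slide does not prescribe any code:

```python
# Two input neurons feed two output neurons; each arrow carries a weight.
# Each output neuron multiplies every incoming value by its arrow's weight
# and sums the results (a weighted sum).
import random

def forward(inputs, weights):
    """weights[j][i] is the weight on the arrow from input i to output j."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

# Case 1: every weight is 1, as the exercise suggests.
ones = [[1.0, 1.0], [1.0, 1.0]]
for x in [(0, 0), (0, 1), (1, 1), (1, -1), (-1, 1)]:
    print(x, "->", forward(x, ones))

# Case 2: random weights between -0.5 and 0.5 on each arrow.
rand = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(2)]
for x in [(0, 0), (0, 1), (1, 1), (1, -1), (-1, 1)]:
    print(x, "->", forward(x, rand))
```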
ANN Learning
In all of the neural paradigms, the application of an ANN involves two phases:
(1) Learning phase (training)
(2) Recall phase (usage)
In the learning phase (usually offline), the ANN is trained until it has learned its task (through the adaptation of its weights).
The recall phase is used to solve the task.
ANN Learning
An ANN solves a task when its weights are adapted through a learning phase.
All neural networks have to be trained before they can be used.
They are given training patterns, and their weights are adjusted iteratively until an error function is minimized.
Once the ANN has been trained, no more training is needed.
Two types of learning prevail in ANNs:
Supervised learning: learning with teacher signals or targets.
Unsupervised learning: learning without the use of teacher signals.
Supervised Learning
In supervised learning, the training patterns are provided to the ANN together with a teaching signal or target.
The difference between the ANN output and the target is the error signal.
Initially, the ANN output gives a large error during the learning phase.
The error is then minimized through continuous adaptation of the weights by a learning algorithm.
In the end, when the error becomes very small, the ANN is assumed to have learned the task and training is stopped.
It can then be used to solve the task in the recall phase.
Supervised Learning
Matching the I/O pattern
Unsupervised Learning
In unsupervised learning, the ANN is trained without teaching signals or targets.
It is only supplied with examples of the input patterns that it will eventually solve.
The ANN usually has an auxiliary cost function which needs to be minimized, such as an energy function, a distance, etc.
Usually a neuron is designated as a "winner" from similarities in the input patterns through competition.
The weights of the ANN are modified so that the cost function is minimized.
At the end of the learning phase, the weights have been adapted in such a manner that similar patterns are clustered into a particular node. (A minimal sketch follows below.)
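As a rough illustration of the competition described above, here is a minimal winner-take-all sketch. The two-node network, the 2-D cluster data, the seed, and the learning rate are all invented for illustration, with Euclidean distance playing the role of the cost function:

```python
import math
import random

def nearest(weights, x):
    """Index of the 'winner' neuron: smallest Euclidean distance to x."""
    dists = [math.dist(w, x) for w in weights]
    return dists.index(min(dists))

# Two cluster nodes with random initial weight vectors (2-D inputs).
random.seed(0)
weights = [[random.random(), random.random()] for _ in range(2)]

# Unlabeled input patterns: two loose clusters around (0,0) and (1,1).
patterns = [(0.1, 0.0), (0.0, 0.2), (0.9, 1.0), (1.0, 0.8)] * 20

alpha = 0.1  # learning rate
for x in patterns:
    k = nearest(weights, x)
    # Move only the winner's weights toward the input (minimizes distance).
    weights[k] = [w + alpha * (xi - w) for w, xi in zip(weights[k], x)]

print(weights)  # each node's weights settle near one cluster centre
```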
ANN paradigm
A number of ANN paradigms have been developed over the past few decades.
These ANN paradigms are mainly distinguished by their different learning algorithms rather than by their models.
Some ANN paradigms are named after their proposer, such as Hopfield, Kohonen, etc.
Most ANNs are named after their learning algorithm, such as Backpropagation, Competitive learning, Counter propagation, ART, etc., and some are named after their model, such as BAM.
Basically, a particular ANN can be divided into either a feedforward or a feedback model, and into either a supervised or an unsupervised learning mode.
ANN Classifications
ANN Performance
The performance of an ANN is described by the figure of merit, which expresses the number of patterns recalled when input patterns (complete, partially complete, or even noisy) are applied.
A 100% performance in recalled patterns means that for every trained input stimulus signal, the ANN always produces the desired output pattern.
ANN Performance
Basis of the ANN computing idea
The neuron computes the input signals and compares the result with a threshold value, θ.
If the input is less than θ, the neuron output is -1; otherwise it is +1.
Hence, the following activation function (the sign function) is used:

$$X = \sum_{i=1}^{n} x_i w_i, \qquad Y = \begin{cases} +1 & \text{if } X \geq \theta \\ -1 & \text{if } X < \theta \end{cases}, \qquad Y = \mathrm{sign}\!\left[\sum_{i=1}^{n} x_i w_i - \theta\right]$$

where X is the net weighted input to the neuron, xi is the i-th input value, wi is the weight of input i, n is the number of neuron inputs, and Y is the neuron output.
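Read directly from these formulas, a single sign-activation neuron can be sketched in Python as follows; the sample input, weights, and threshold are borrowed from the AND example later in these slides:

```python
def neuron(x, w, theta):
    """Y = +1 if the net weighted input X reaches the threshold, else -1."""
    X = sum(xi * wi for xi, wi in zip(x, w))  # X = sum of x_i * w_i
    return 1 if X >= theta else -1

print(neuron([1, 0], [0.3, -0.1], theta=0.2))  # X = 0.3 >= 0.2, so Y = +1
```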
Other types of activation function
[Figure: four activation functions plotted as Y against X - step, sign, sigmoid, and linear.]

$$Y_{step} = \begin{cases} +1 & \text{if } X \geq 0 \\ 0 & \text{if } X < 0 \end{cases} \qquad Y_{sign} = \begin{cases} +1 & \text{if } X \geq 0 \\ -1 & \text{if } X < 0 \end{cases} \qquad Y_{sigmoid} = \frac{1}{1+e^{-X}} \qquad Y_{linear} = X$$
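A direct Python transcription of the four functions, as a small runnable check:

```python
import math

def step(X):    return 1 if X >= 0 else 0
def sign(X):    return 1 if X >= 0 else -1
def sigmoid(X): return 1 / (1 + math.exp(-X))
def linear(X):  return X

for f in (step, sign, sigmoid, linear):
    print(f.__name__, f(-1.0), f(0.0), f(1.0))
```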
Simple ANN: A Perceptron
A perceptron is used to classify inputs into two classes, e.g. class A1 or A2.
A linearly separable function is used to divide the n-dimensional space as follows;
say there are 2 inputs, then we have the characteristic shown in the figure below. θ is used to shift the boundary.
A three-dimensional case can also be visualized.
$$\sum_{i=1}^{n} x_i w_i - \theta = 0$$

[Figure: the x1-x2 plane split by this line into class 1 and class 2 regions.]
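To make the boundary concrete, here is a tiny sketch for the two-input case; the weights and threshold (w1 = w2 = 1, θ = 1.5) are invented for illustration, and they happen to separate the AND patterns:

```python
# For two inputs, w1*x1 + w2*x2 - theta = 0 is a straight line; points on
# either side of it fall into class A1 or class A2.
w1, w2, theta = 1.0, 1.0, 1.5

def classify(x1, x2):
    return "A1" if w1 * x1 + w2 * x2 - theta >= 0 else "A2"

for p in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(p, classify(*p))  # only (1, 1) lies on the A1 side
```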
Simple Perceptron
[Figure: inputs x1 and x2, weighted by w1 and w2, feed a linear combiner Σ; the result is compared with the threshold θ and passed through a hard limiter to give the output. The output must be boolean!]
Different training patterns based on the defined weights
• Note that p1 and p2 are incorrectly classified.
• The p1 target is t = 1 and the p2 target is t = -1.
[Figure: decision boundary.]
Learning: Classification
Learning is done by adjusting the actual output Y to meet the desired output Yd.
Usually, the initial weights are set between -0.5 and 0.5.
At iteration k of the training example, the error e is

$$e(k) = Y_d(k) - Y(k)$$

If the error is positive, we need to increase the perceptron output Y; if it is negative, we need to decrease Y.
The perceptron learning rule can then be written as

$$w_i(k+1) = w_i(k) + \alpha \times x_i(k) \times e(k)$$

where α is the learning rate and 0 < α < 1.
Training algorithm
Step 1: Initialization
Set the initial weights wi within [-0.5, 0.5] and set the threshold θ.
Step 2: Activation
Activate the perceptron at iteration p for each input and a specific Yd; e.g. for a step activation function we have Y(p) = step[Σ xi(p)wi(p) − θ].
Step 3: Weight training
The perceptron weights are updated by wi(p+1) = wi(p) + Δwi(p), where Δwi(p) = α × xi(p) × e(p).
Step 4: Iteration
Increase p by one and go back to Step 2.
Example
Consider the truth table of the AND operation.
How can an ANN consisting of a single perceptron be trained?
Consider a step activation function in this example.

Input x1   Input x2   AND (x1 ∧ x2)
    0          0             0
    0          1             0
    1          0             0
    1          1             1

Threshold θ = 0.2; learning rate α = 0.1.
Use initial weights of 0.3 for x1 and -0.1 for x2.
Epoch   x1   x2   Desired yd   Initial w1   Initial w2   Actual Y   Error e   Final w1   Final w2
  1      0    0       0           0.3          -0.1          0          0        0.3        -0.1
         0    1       0           0.3          -0.1          0          0        0.3        -0.1
         1    0       0           0.3          -0.1          1         -1        0.2        -0.1
         1    1       1           0.2          -0.1          0          1        0.3         0.0
$$Y(p) = \mathrm{step}\!\left[\sum_{i=1}^{n} x_i(p)\, w_i(p) - \theta\right], \qquad \text{where } \Delta w_i(p) = \alpha \times x_i(p) \times e(p)$$
The epochs continue until the weights converge to steady-state values. (A runnable check follows below.)
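The whole procedure can be checked with a short Python script. The parameters (θ = 0.2, α = 0.1, initial weights 0.3 and −0.1) come from the slide, while the loop structure and stopping test are my own sketch; its first-epoch printout reproduces the table above, and it converges after a few epochs:

```python
# Perceptron training for the AND gate with the slide's parameters.
theta, alpha = 0.2, 0.1
w = [0.3, -0.1]
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

def step(X):
    return 1 if X >= 0 else 0

epoch = 0
while True:
    epoch += 1
    total_error = 0
    for x, yd in data:
        y = step(sum(xi * wi for xi, wi in zip(x, w)) - theta)
        e = yd - y
        total_error += abs(e)
        # w_i(p+1) = w_i(p) + alpha * x_i(p) * e(p)
        w = [wi + alpha * xi * e for wi, xi in zip(w, x)]
        print(epoch, x, yd, y, e, [round(wi, 2) for wi in w])
    if total_error == 0:  # every pattern classified correctly: converged
        break
```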
Now consider this problem
Design a mobile robot that avoids collisions using an ANN.
There are three inputs: right wheel velocity, left wheel velocity, and the relative distance between the robot and the obstacle.
The output is the mobile robot's heading angle.
Write only two epochs for this problem.
Today's Lesson: 4.3 ANN Model, 4.4 ANN Learning, 4.5 Simple ANN
LO1 : Able to understand basic concept of biases, thresholds and linear separability
LO2 : Able to analyze simple ANN (Perceptrons)
Today's Lesson: 4.6 Multilayer Neural Networks & Backpropagation Algorithm
LO1 : Able to understand basic concept of biases, thresholds and linear separability
Sigmoid function characteristics
The sigmoid activation function can be used with different values of the slope c.
When c is large, the sigmoid becomes like a threshold function; when c is small, the sigmoid becomes more like a straight line (linear).
When c is large, learning is much faster but a lot of information is lost; when c is small, learning is very slow but information is retained.
Because this function is differentiable, it enables the B.P. algorithm to adapt the lower layers of weights in a multilayer neural network.
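Writing the slope parameter explicitly as f(X) = 1/(1 + e^(−cX)), a short sketch shows the two extremes; the sample values of c are invented for illustration:

```python
import math

def sigmoid(X, c=1.0):
    """f(X) = 1 / (1 + exp(-c*X)); large c is step-like, small c near-linear.
    Its derivative c*f(X)*(1 - f(X)) is what back-propagation relies on."""
    return 1 / (1 + math.exp(-c * X))

for c in (0.1, 1.0, 10.0):
    print(c, [round(sigmoid(x, c), 3) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)])
```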
Multilayer neural networks
A multilayer NN is a feedforward neural network with one or more hidden layers.
The model consists of an input layer, a middle or hidden layer, and an output layer.
Why is the hidden layer important?
The input layer only receives input signals.
The output layer only displays the output patterns.
The hidden layer processes the input signals; its weights represent features of the inputs.
Multilayer NN model
[Figure: inputs x1, x2, x3 feed the 1st hidden layer, then the 2nd hidden layer, then the output.]
Multilayer Neural Network learning
A multilayer NN learns through a learning algorithm; the popular one is BACK-PROPAGATION.
The computations are similar to those of a simple perceptron.
Back-propagation has two phases:
the input layer presents the training input pattern, which then propagates from layer to layer to the output;
the calculated error then notifies the system to modify the weights appropriately.
Back-propagation
Each neuron in one layer is connected to every neuron in the adjacent layer.
Calculations: the same as in the simple perceptron case.
Typically, the sigmoid function is used in a multilayer NN.
Back-propagation: Learning mode
Before BP can be used, it requires target patterns or signals, as it is a supervised learning algorithm.
Training patterns are obtained from samples of the types of inputs to be given to the multilayer neural network, and their answers are identified by the researcher.
Examples of training patterns are samples of handwritten characters, process data, etc., following the tasks to be solved.
The configuration for training a neural network using the BP algorithm is shown in the figure below, in which the training is done offline.
The objective is to minimize the error between the target and the actual output and to find Δw.
BP: Learning mode The error is calculated at every iteration and
is backpropagated through the layers of the ANN to adapt the weights.
The weights are adapted such that the error is minimized.
Once the error has reached a justified minimum value, the training is stopped, and the neural network is reconfigured in the recall mode to solve the task.
Error gradient
Let's look at a specific case
[Figure: a three-layer network with input layer i, hidden layer j, and output layer k; input signals xi flow forward through weights wij and wjk to the outputs, while error signals propagate backward.]
Understand more, learn more
Error propagation starts from the output layer back to the hidden layer.
How do we calculate the error signals at layer k? For the sigmoid function, the error gradient at output neuron k is δk = yk(1 − yk)ek, with ek = yd,k − yk.
How about the calculation to update the weights at layer k? The weight correction is Δwjk = α × yj × δk, so that wjk(p+1) = wjk(p) + Δwjk(p).
Weight correction in the hidden layer
We use the same technique to find the weights in the hidden layer: the error gradient is δj = yj(1 − yj) Σk δk wjk, and the correction is Δwij = α × xi × δj.
Steps for calculations
Step 1: Initialization
Set the weights and thresholds randomly within a small range.
Step 2: Activation
Use the sigmoid activation function at the hidden layer and the output layer.
Hidden layer: yj(p) = sigmoid[Σi xi(p) wij(p) − θj]
Output layer: yk(p) = sigmoid[Σj yj(p) wjk(p) − θk]
Steps for calculations
Step 3: Weight training
Calculate the error gradient in the output layer: δk(p) = yk(p)[1 − yk(p)] ek(p), where ek(p) = yd,k(p) − yk(p).
Update: wjk(p+1) = wjk(p) + α × yj(p) × δk(p).
Calculate the error gradient in the hidden layer: δj(p) = yj(p)[1 − yj(p)] Σk δk(p) wjk(p).
Update: wij(p+1) = wij(p) + α × xi(p) × δj(p).
Step 4: Iteration
Go back to Step 2 and repeat the process until the selected error criterion is satisfied. (A sketch combining these steps follows below.)
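A compact sketch combining Steps 1-3 for one hidden layer (sigmoid at both layers, thresholds treated as in the formulas above). The network size, learning rate, random seed, and the XOR training data are illustrative choices, not prescribed by the slide:

```python
import math
import random

def sigmoid(X):
    return 1 / (1 + math.exp(-X))

random.seed(1)
n_in, n_hid = 2, 2
# Step 1: initialise weights and thresholds randomly within a small range.
w_ij = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
w_jk = [random.uniform(-0.5, 0.5) for _ in range(n_hid)]
th_j = [random.uniform(-0.5, 0.5) for _ in range(n_hid)]
th_k = random.uniform(-0.5, 0.5)
alpha = 0.5  # learning rate

def train_step(x, yd):
    global w_jk, th_k
    # Step 2: activation (forward pass), hidden layer then output layer.
    yj = [sigmoid(sum(wi * xi for wi, xi in zip(row, x)) - t)
          for row, t in zip(w_ij, th_j)]
    yk = sigmoid(sum(wj * yv for wj, yv in zip(w_jk, yj)) - th_k)
    # Step 3: error gradient in the output layer.
    e = yd - yk
    delta_k = yk * (1 - yk) * e
    # Error gradients in the hidden layer (error propagated back via w_jk).
    delta_j = [yv * (1 - yv) * delta_k * wj for yv, wj in zip(yj, w_jk)]
    # Weight updates; a threshold behaves like a weight on a constant -1 input.
    w_jk = [wj + alpha * yv * delta_k for wj, yv in zip(w_jk, yj)]
    th_k += alpha * (-1) * delta_k
    for j in range(n_hid):
        w_ij[j] = [wi + alpha * xi * delta_j[j] for wi, xi in zip(w_ij[j], x)]
        th_j[j] += alpha * (-1) * delta_j[j]
    return e

# Step 4: iterate until the sum of squared errors falls below 0.001.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
for epoch in range(50000):
    sse = sum(train_step(x, yd) ** 2 for x, yd in data)
    if sse < 0.001:
        print("converged at epoch", epoch, "SSE =", sse)
        break
# Note: with an unlucky seed, XOR training can stall in a local minimum.
```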
When does the training process stop?
The training process stops when the sum of squared errors for the output y is less than a prescribed value, e.g. 0.001.
The sum of squared errors is a performance indicator of the system: the smaller it is, the better the system performs. (A two-line check follows below.)
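As a two-line check (the targets and outputs here are invented for illustration):

```python
targets = [0, 1, 1, 0]               # desired outputs y_d
outputs = [0.01, 0.99, 0.99, 0.01]   # actual network outputs y
sse = sum((yd - y) ** 2 for yd, y in zip(targets, outputs))
print(sse)  # 0.0004 < 0.001, so training would stop here
```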
More about back-propagation
Different initial weights and thresholds may lead to different solutions, but in the end the system reaches almost similar solutions.
The decision boundaries can be viewed if we use the sign activation function.
Drawbacks of back-propagation:
Not biologically plausible; biological neurons do not adjust their weights this way.
Computationally expensive.
Slow training.
Consider another technique
Sigmoid function: f(x) = (1 + e^(-x))^(-1); linear function: f(x) = x.
The error signals are as follows:
δk = Lk(1 − Lk)(tk − Lk)
δj = Lj(1 − Lj) Σk δk wkj
The adaptations of the weights are defined as below:
Δwkj(t+1) = η δk Lj + α Δwkj(t)
Δwji(t+1) = η δj Li + α Δwji(t)
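These updates add a momentum term: the new weight change blends the gradient step with a fraction α of the previous change, smoothing the weight trajectory. A fragment showing how this could replace the plain update in the earlier sketch; the names follow the slide, while the function wrapper is my own:

```python
eta, mom = 0.1, 0.9  # learning rate eta and momentum alpha from the slide

def momentum_update(w, prev_dw, delta, L):
    """dw(t+1) = eta * delta * L + alpha * dw(t);
    returns the new weight and the change to remember for next time."""
    dw = eta * delta * L + mom * prev_dw
    return w + dw, dw

w, dw = 0.55, 0.0
w, dw = momentum_update(w, dw, -0.0035, 0.0)  # matches the worked step below
print(w, dw)  # 0.55, 0.0 (zero input L and zero previous change)
```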
XOR PROBLEM
In this example, θ is not used!
ΔWji(t+1) = 0.1 × (−0.0035) × 0 + (0.9 × 0) = 0
Wj01(t+1) = 0.55 + 0 = 0.55
Wj02(t+1) = 0.15 + 0 = 0.15
Examples