5/7/2012
WHAT ARE NEURAL NETWORKS?
Neural networks are parallel information processing systems whose architecture is inspired by the structure and functioning of the brain.
A neural network has the ability to learn and generalize.
Neural networks can be trained to make classifications and predictions based on historical data.
Neural nets are included in many data mining products.
They are very popular and effective techniques.
The Artificial Neural Network goes by many names, such as connectionism, parallel distributed processing, neuro-computing, and natural intelligent systems. It is abbreviated ANN.
An ANN has a strong similarity to the biological brain. It is composed of interconnected elements called neurons.
The Biological Neuron
A biological neuron receives inputs from other sources, combines them in some way, performs a generally nonlinear operation on the result, and then sends out the final result.
[Figure: inputs arrive from the senses (eyes, ears); the neuron processes the inputs and turns the processed inputs into outputs; synapses are the electrochemical contacts between neurons.]
The Artificial Neuron
Network Layers
The neurons are grouped into layers:
The input layer.
The output layer.
Hidden layers between these two layers.
Neural Network Architecture
In a Neural Network, neurons are grouped into layers. The neurons in each layer are of the same type. There are different types of layers:
The Input layer consists of neurons that receive input from the external environment.
The Output layer consists of neurons that communicate the result to the user or external environment.
The Hidden layer consists of neurons that ONLY communicate with other layers of the network.
Now that we have a model for an artificial neuron, we can imagine connecting many of them together to form an Artificial Neural Network:
[Figure: input layer, hidden layer, output layer.]
A neural network is based on the following assumptions:
1. Information processing occurs at many simple
processing elements called neurons.
2. Signals are passed between neurons over
interconnection links.
3. Each interconnection link has an associated
weight.
4. Each neuron applies an activation function to
determine its output signal.
A Neuron
The n-dimensional input vector x is mapped into the variable y by means of a scalar product and a nonlinear function mapping.
[Figure: the input vector x = (x0, x1, ..., xn) is combined with the weight vector w = (w0, w1, ..., wn) in a weighted sum, a bias μk is subtracted, and the activation function f produces the output y.]
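The weighted-sum-then-activation behaviour of a single neuron can be sketched in Python; the input, weight, and bias values below are illustrative, and a sigmoid is assumed as the activation function:

```python
import math

def neuron(x, w, bias):
    """One artificial neuron: a weighted sum of the inputs minus a bias,
    passed through a nonlinear activation (here a sigmoid)."""
    u = sum(wi * xi for wi, xi in zip(w, x)) - bias  # net input
    return 1.0 / (1.0 + math.exp(-u))                # activation f

# Illustrative 3-input example; the numbers are arbitrary.
y = neuron(x=[1.0, 0.5, -0.2], w=[0.4, -0.3, 0.9], bias=0.1)
```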
What is a neuron?
A (biological) neuron is a node that has many inputs and one output.
The inputs come from other neurons.
The inputs are weighted; weights can be both positive and negative.
The inputs are summed at the node to produce an activation value, which is passed through a function f.
The Artificial Neuron Model
In order to simulate neurons on a computer, we need a mathematical model of this node.
Node j (a neuron) has n inputs x_i, i = 1 to n.
Each input (connection) is associated with a weight w_ij.
The neuron includes a bias denoted by b_j.
The bias has the effect of increasing or decreasing the net input.
The net input to node j (the net result), u_j, is the sum of the products of the connection inputs and their weights, plus the bias:

u_j = Σ_{i=1..n} w_ij x_i + b_j

where u_j is called the "activation of the neuron".
The output of node j is determined by applying a non-linear transfer function g to the net input:

x_j = g(u_j)

where g is called the "transfer function".
A common choice for the transfer function is the sigmoid:

g(u_j) = 1 / (1 + e^(−u_j))

The sigmoid has similar non-linear properties to the transfer function of real neurons:
It accepts inputs varying from −∞ to ∞.
It is bounded below by 0.
It is bounded above by 1.
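A quick check of these three properties; a minimal sketch, with arbitrary sample inputs:

```python
import math

def g(u):
    """Sigmoid transfer function: accepts any real input,
    output bounded below by 0 and above by 1."""
    return 1.0 / (1.0 + math.exp(-u))

# Large negative inputs approach 0, g(0) is exactly 0.5,
# and large positive inputs approach 1.
values = [g(-10), g(0), g(10)]
```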
An example of a multilayer feed-forward neural network
Architecture of ANN
Feed-Forward Networks: the signals travel one way, from input to output.
Feed-Back Networks: the signals travel in loops in the network; the output of the network is connected back to the input.
Learning ANN
The purpose of the learning function is to modify the variable connection weights, causing the network to learn the solution to a problem according to some neural-based algorithm.
There are two types of learning:
Supervised learning.
Unsupervised learning.
Supervised learning
Supervised learning means there exists external help or a teacher.
The teacher may be a training set of data.
The target is to minimize the error between the desired and the actual output.
The process takes place as follows:
Input and output data are presented to the network; this data is often referred to as the "training set".
The network processes the inputs and compares its resulting outputs against the desired outputs.
Errors are then propagated back through the system, causing the system to adjust the weights, which are usually set randomly to begin with.
This process is repeated over and over until the predicted output closely matches the desired output.
When no further learning is necessary, the
weights are typically frozen for the
application.
There are many algorithms used to implement
the adaptive feedback required to adjust the
weights during training. The most common
technique is called “Back-Propagation”.
Let us suppose that a sufficiently large set of
examples (training set) is available.
Supervised learning:
– The network's answer to each input pattern is directly compared with the desired answer, and feedback is given to the network to correct possible errors.
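The compare-and-correct loop above can be sketched for a single linear neuron trained with a delta rule; the training data, learning rate, and the choice of a one-weight linear unit are illustrative assumptions, not the method of these slides:

```python
# Each pattern is presented, the network's answer is compared with the
# desired answer, and the error feeds back to correct the weight.
training_set = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, desired output)
w, c = 0.0, 0.05  # weight starts arbitrary; c is the learning rate

for epoch in range(100):
    for x, target in training_set:
        output = w * x            # network answer to the input pattern
        error = target - output   # compare with the desired answer
        w += c * error * x        # feedback corrects the weight
```

After repeated presentations the weight converges toward 2.0, the value that reproduces every desired output in this toy training set.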
Back Propagation (BP)
It is a method for improving performance in the training of multilayered feed-forward neural networks.
It is used to adjust the weights and biases of the network to minimize the sum squared error of the network, which is given by

SSE = (1/2) Σ_t (x_t − x̂_t)²

where x_t and x̂_t are the desired and predicted outputs of the t-th output node.
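The SSE formula above as a small Python sketch; the desired and predicted vectors are illustrative:

```python
def sum_squared_error(desired, predicted):
    """SSE = 1/2 * sum over output nodes t of (x_t - x_hat_t)^2."""
    return 0.5 * sum((x - xh) ** 2 for x, xh in zip(desired, predicted))

# Two output nodes, slightly off their targets.
sse = sum_squared_error([1.0, 0.0], [0.8, 0.1])
```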
BP Network – Supervised Training
The desired outputs of the training examples are known.
Error = the difference between the actual and desired output.
Each weight is changed relative to the error size.
The output layer error is calculated first, then propagated back to the previous layer.
The hidden weights are updated.
Performance improves.
Probably the most common type of ANN used
today is a multilayer feed forward network
trained using back-propagation (BP)
Often called a Multilayer Perceptron (MLP)
Unsupervised learning
The training set consists of input training patterns only, without desired outputs.
This method is often referred to as self-organization.
Therefore, the network is trained without the benefit of any teacher.
Applications in Clustering and reducing dimensionality
Learning may be very slow
No help from the outside
No training data, no information available on the desired output
Learning by doing
Used to pick out structure in the input:
Clustering
Compression
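A minimal illustration of picking out clusters from inputs alone; k-means on 1-D data with k = 2 is an assumption for this sketch (the slides do not name a specific algorithm), and the data values are arbitrary:

```python
# No desired outputs are given: structure is found in the inputs themselves.
data = [1.0, 1.2, 0.9, 8.0, 8.3, 7.9]
centers = [0.0, 10.0]  # arbitrary initial cluster centers

for _ in range(10):
    clusters = [[], []]
    for x in data:  # assign each point to its nearest center
        nearest = min(range(2), key=lambda k: abs(x - centers[k]))
        clusters[nearest].append(x)
    # move each center to the mean of its assigned points
    centers = [sum(c) / len(c) if c else centers[i]
               for i, c in enumerate(clusters)]
```

The two centers settle near the two natural groups in the data (around 1 and around 8).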
Choosing the network size
It seems better to start with a small number of neurons, because:
learning is faster;
it is often enough;
it avoids over-fitting problems.
If the number of neurons is too large, we will get an over-fit.
In principle, one hidden layer is sufficient to solve any problem. In practice, it may happen that two hidden layers with a small number of neurons work better (and/or learn faster) than a network with a single layer.
• Too few nodes: the network does not fit the curve very well.
• Too many nodes: over-parameterization; the network may fit noise and becomes more difficult to train.
The Learning Rate
The learning rate c determines by how much we change the weights w at each step.
If c is too small, the algorithm will take a long time to converge.
[Figure: sum-squared error vs. epoch on the error surface.]
If c is too large, the network may not be able to make the fine discriminations possible with a system that learns more slowly; the algorithm diverges.
[Figure: sum-squared error vs. epoch on the error surface.]
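Both failure modes can be seen on a hypothetical one-parameter error surface E(w) = w², whose gradient is 2w; the specific step sizes below are illustrative assumptions:

```python
def descend(c, steps=50, w=1.0):
    """Gradient descent on E(w) = w^2: the weight change at each step
    is proportional to the learning rate c times the gradient 2w."""
    for _ in range(steps):
        w -= c * 2 * w
    return abs(w)

slow = descend(c=0.01)     # converges, but slowly
good = descend(c=0.4)      # converges quickly
diverged = descend(c=1.1)  # overshoots every step: the algorithm diverges
```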
Multi-Layer Perceptron
[Figure: the input vector x_i feeds the input nodes; weights w_ij connect them through the hidden nodes to the output nodes, which produce the output vector.]
The net input to unit j is

I_j = Σ_i w_ij O_i + θ_j

where O_i is the output of unit i in the previous layer and θ_j is the bias of unit j. The output of unit j is

O_j = 1 / (1 + e^(−I_j))

For a unit j in the output layer, the error is

Err_j = O_j (1 − O_j)(T_j − O_j)

where T_j is the true (target) output. For a unit j in a hidden layer, the error is

Err_j = O_j (1 − O_j) Σ_k Err_k w_jk

where Err_k is the error of unit k in the next layer. The weights and biases are then updated:

w_ij = w_ij + (l) Err_j O_i
θ_j = θ_j + (l) Err_j

where l is the learning rate.
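The update rules above can be sketched in NumPy for a small network trained on XOR; the layer sizes (2-4-1), random seed, learning rate, and epoch count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.uniform(-1, 1, (2, 4)); th1 = rng.uniform(-1, 1, 4)  # hidden weights, biases
W2 = rng.uniform(-1, 1, (4, 1)); th2 = rng.uniform(-1, 1, 1)  # output weights, biases
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)  # XOR inputs
T = np.array([[0], [1], [1], [0]], float)              # XOR targets
l = 0.5  # learning rate

def sigmoid(I):
    return 1.0 / (1.0 + np.exp(-I))

for epoch in range(5000):
    for x, t in zip(X, T):
        O1 = sigmoid(x @ W1 + th1)          # I_j = sum_i w_ij O_i + theta_j
        O2 = sigmoid(O1 @ W2 + th2)
        err2 = O2 * (1 - O2) * (t - O2)     # output-layer error
        err1 = O1 * (1 - O1) * (W2 @ err2)  # error propagated back to hidden layer
        W2 += l * np.outer(O1, err2); th2 += l * err2  # w_ij += l * Err_j * O_i
        W1 += l * np.outer(x, err1);  th1 += l * err1  # theta_j += l * Err_j
```

Each pass applies exactly the forward and backward formulas above, one training pattern at a time; after training, the network's sum squared error on the four XOR patterns is far below its initial value.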
Applications of ANNs
Prediction – weather, stocks, disease, predicting financial time series.
Classification – financial risk assessment, image processing.
Data Association – text recognition.
Data Conceptualization – customer purchasing habits.
Filtering – normalizing telephone signals.
Optimization.
Diagnosing medical conditions.
Identifying clusters in customer databases.
Identifying fraudulent credit card transactions.
Hand-written character recognition.
And many more….
Advantages
Adapt to unknown situations.
Autonomous learning and generalization.
Their most important advantage is in solving problems that are too complex for conventional technologies: problems that do not have an algorithmic solution, or for which an algorithmic solution is too complex to be found.
Disadvantages
Not exact.
Large complexity of the network structure.
Using a neural network for prediction
Identify the inputs and outputs.
Preprocess the inputs – often scale them to the range [0, 1].
Choose an ANN architecture.
Train the ANN with a representative set of training examples (usually using BP).
Test the ANN with another set of known examples; often the known data set is divided into training and test sets. Cross-validation is a more rigorous validation procedure.
Apply the model to unknown input data.
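The preprocessing and data-splitting steps above can be sketched as follows; the raw values and the 70/30 split ratio are illustrative assumptions:

```python
# Raw input values for one feature (arbitrary example data).
raw = [12.0, 45.0, 33.0, 7.0, 28.0, 40.0, 19.0, 36.0, 25.0, 10.0]

# Scale the inputs to the range [0, 1] (min-max scaling).
lo, hi = min(raw), max(raw)
scaled = [(v - lo) / (hi - lo) for v in raw]

# Divide the known data into training and test sets (70/30 split here).
split = int(0.7 * len(scaled))
train, test = scaled[:split], scaled[split:]
```

The training set would then be fed to the chosen ANN, and the held-out test set used to check generalization before applying the model to unknown data.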