5/7/2012
WHAT ARE NEURAL NETWORKS?
Neural networks are parallel information processing systems whose architecture is inspired by the structure and functioning of the brain.
A neural network has the ability to learn and generalize.
Neural networks can be trained to make classifications and predictions based on historical data.
Neural nets are included in many data mining products.
They are very popular and effective techniques.
The Artificial Neural Network goes by many names, such as connectionism, parallel distributed processing, neuro-computing, and natural intelligent systems. It is abbreviated ANN.
An ANN has a strong similarity to the biological brain. It is composed of interconnected elements called neurons.
The Biological Neuron
A biological neuron receives inputs from other sources, combines them in some way, performs a generally nonlinear operation on the result, and then sends out the final result.
[Figure: inputs arrive from the senses (eyes, ears); the neuron processes the inputs and turns the processed inputs into outputs; synapses are the electrochemical contacts between neurons.]
The Artificial Neuron
Network Layers
The neurons are grouped into layers:
The input layer.
The output layer.
Hidden layers between these two layers.
Neural Network Architecture
In a Neural Network, neurons are grouped into layers. The neurons in each layer are of the same type. There are different types of layers:
The Input layer consists of neurons that receive input from the external environment.
The Output layer consists of neurons that communicate the result to the user or external environment.
The Hidden layer consists of neurons that ONLY communicate with other layers of the network.
Now that we have a model for an artificial neuron, we can imagine connecting many of them together to form an Artificial Neural Network:
[Figure: input layer, hidden layer, output layer.]
A neural network is based on the following assumptions:
1. Information processing occurs at many simple
processing elements called neurons.
2. Signals are passed between neurons over
interconnection links.
3. Each interconnection link has an associated
weight.
4. Each neuron applies an activation function to
determine its output signal.
A Neuron
The n-dimensional input vector x is mapped into the variable y by means of a scalar product and a nonlinear function mapping.
[Figure: the input vector x = (x0, x1, ..., xn) is combined with the weight vector w = (w0, w1, ..., wn) in a weighted sum, a bias μk is subtracted, and the activation function f produces the output y.]
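The weighted-sum-then-activation behaviour of a single neuron can be sketched in Python; the input, weight, and bias values below are illustrative, and a sigmoid is assumed as the activation function:

```python
import math

def neuron(x, w, bias):
    """One artificial neuron: a weighted sum of the inputs minus a bias,
    passed through a nonlinear activation (here a sigmoid)."""
    u = sum(wi * xi for wi, xi in zip(w, x)) - bias  # net input
    return 1.0 / (1.0 + math.exp(-u))                # activation f

# Illustrative 3-input example; the numbers are arbitrary.
y = neuron(x=[1.0, 0.5, -0.2], w=[0.4, -0.3, 0.9], bias=0.1)
```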
What is a neuron?
A (biological) neuron is a node that has many inputs and one output.
The inputs come from other neurons.
The inputs are weighted; weights can be both positive and negative.
The inputs are summed at the node to produce an activation value, which is passed through a function f.
The Artificial Neuron Model
In order to simulate neurons on a computer, we need a mathematical model of this node.
Node j (a neuron) has n inputs x_i, i = 1 to n.
Each input (connection) is associated with a weight w_ij.
The neuron includes a bias denoted by b_j.
The bias has the effect of increasing or decreasing the net input.
The net input to node j (the net result), u_j, is the sum of the products of the connection inputs and their weights, plus the bias:

u_j = Σ_{i=1..n} w_ij x_i + b_j

where u_j is called the "activation of the neuron".
The output of node j is determined by applying a non-linear transfer function g to the net input:

x_j = g(u_j)

where g is called the "transfer function".
A common choice for the transfer function is the sigmoid:

g(u_j) = 1 / (1 + e^(−u_j))

The sigmoid has similar non-linear properties to the transfer function of real neurons:
It accepts inputs varying from −∞ to ∞.
It is bounded below by 0.
It is bounded above by 1.
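A quick check of these three properties; a minimal sketch, with arbitrary sample inputs:

```python
import math

def g(u):
    """Sigmoid transfer function: accepts any real input,
    output bounded below by 0 and above by 1."""
    return 1.0 / (1.0 + math.exp(-u))

# Large negative inputs approach 0, g(0) is exactly 0.5,
# and large positive inputs approach 1.
values = [g(-10), g(0), g(10)]
```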
An example of a multilayer feed-forward neural network
Architecture of ANN
Feed-Forward Networks: the signals travel one way, from input to output.
Feed-Back Networks: the signals travel in loops in the network; the output of the network is connected back to the input.
Learning ANN
The purpose of the learning function is to modify the variable connection weights, causing the network to learn the solution to a problem according to some neural-based algorithm.
There are two types of learning:
Supervised learning.
Unsupervised learning.
Supervised learning
Supervised learning means there exists external help or a teacher.
The teacher may be a training set of data.
The target is to minimize the error between the desired and the actual output.
The process takes place as follows:
Input and output data are presented to the network; this data is often referred to as the "training set".
The network processes the inputs and compares its resulting outputs against the desired outputs.
Errors are then propagated back through the system, causing the system to adjust the weights, which are usually set randomly to begin with.
This process is repeated over and over until the predicted output closely matches the desired output.
When no further learning is necessary, the
weights are typically frozen for the
application.
There are many algorithms used to implement
the adaptive feedback required to adjust the
weights during training. The most common
technique is called “Back-Propagation”.
Let us suppose that a sufficiently large set of
examples (training set) is available.
Supervised learning:
– The network's answer to each input pattern is directly compared with the desired answer, and feedback is given to the network to correct possible errors.
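The compare-and-correct loop above can be sketched for a single linear neuron trained with a delta rule; the training data, learning rate, and the choice of a one-weight linear unit are illustrative assumptions, not the method of these slides:

```python
# Each pattern is presented, the network's answer is compared with the
# desired answer, and the error feeds back to correct the weight.
training_set = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, desired output)
w, c = 0.0, 0.05  # weight starts arbitrary; c is the learning rate

for epoch in range(100):
    for x, target in training_set:
        output = w * x            # network answer to the input pattern
        error = target - output   # compare with the desired answer
        w += c * error * x        # feedback corrects the weight
```

After repeated presentations the weight converges toward 2.0, the value that reproduces every desired output in this toy training set.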
Back Propagation (BP)
It is a method for improving performance in the training of multilayered feed-forward neural networks.
It is used to adjust the weights and biases of the network to minimize the sum squared error of the network, which is given by

SSE = (1/2) Σ_t (x_t − x̂_t)²

where x_t and x̂_t are the desired and predicted outputs of the t-th output node.
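The SSE formula above as a small Python sketch; the desired and predicted vectors are illustrative:

```python
def sum_squared_error(desired, predicted):
    """SSE = 1/2 * sum over output nodes t of (x_t - x_hat_t)^2."""
    return 0.5 * sum((x - xh) ** 2 for x, xh in zip(desired, predicted))

# Two output nodes, slightly off their targets.
sse = sum_squared_error([1.0, 0.0], [0.8, 0.1])
```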
BP Network – Supervised Training
The desired outputs of the training examples are known.
Error = the difference between the actual and desired output.
Each weight is changed relative to the error size.
The output layer error is calculated first, then propagated back to the previous layer.
The hidden weights are updated.
Performance improves.
Probably the most common type of ANN used
today is a multilayer feed forward network
trained using back-propagation (BP)
Often called a Multilayer Perceptron (MLP)
Unsupervised learning
The training set consists of input training patterns only, without desired outputs.
This method is often referred to as self-organization.
Therefore, the network is trained without the benefit of any teacher.
Applications in Clustering and reducing dimensionality
Learning may be very slow
No help from the outside
No training data, no information available on the desired output
Learning by doing
Used to pick out structure in the input:
Clustering
Compression
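A minimal illustration of picking out clusters from inputs alone; k-means on 1-D data with k = 2 is an assumption for this sketch (the slides do not name a specific algorithm), and the data values are arbitrary:

```python
# No desired outputs are given: structure is found in the inputs themselves.
data = [1.0, 1.2, 0.9, 8.0, 8.3, 7.9]
centers = [0.0, 10.0]  # arbitrary initial cluster centers

for _ in range(10):
    clusters = [[], []]
    for x in data:  # assign each point to its nearest center
        nearest = min(range(2), key=lambda k: abs(x - centers[k]))
        clusters[nearest].append(x)
    # move each center to the mean of its assigned points
    centers = [sum(c) / len(c) if c else centers[i]
               for i, c in enumerate(clusters)]
```

The two centers settle near the two natural groups in the data (around 1 and around 8).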
Choosing the network size
It seems better to start with a small number of neurons, because:
learning is faster;
it is often enough;
it avoids over-fitting problems.
If the number of neurons is too large, we will get an over-fit.
In principle, one hidden layer is sufficient to solve any problem. In practice, it may happen that two hidden layers with a small number of neurons work better (and/or learn faster) than a network with a single layer.
• Too few nodes: the network does not fit the curve very well.
• Too many nodes: over-parameterization; the network may fit noise and becomes more difficult to train.
The Learning Rate
The learning rate c determines by how much we change the weights w at each step.
If c is too small, the algorithm will take a long time to converge.
[Figure: sum-squared error vs. epoch on the error surface.]
If c is too large, the network may not be able to make the fine discriminations possible with a system that learns more slowly; the algorithm diverges.
[Figure: sum-squared error vs. epoch on the error surface.]
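Both failure modes can be seen on a hypothetical one-parameter error surface E(w) = w², whose gradient is 2w; the specific step sizes below are illustrative assumptions:

```python
def descend(c, steps=50, w=1.0):
    """Gradient descent on E(w) = w^2: the weight change at each step
    is proportional to the learning rate c times the gradient 2w."""
    for _ in range(steps):
        w -= c * 2 * w
    return abs(w)

slow = descend(c=0.01)     # converges, but slowly
good = descend(c=0.4)      # converges quickly
diverged = descend(c=1.1)  # overshoots every step: the algorithm diverges
```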
Multi-Layer Perceptron
[Figure: the input vector x_i feeds the input nodes; weights w_ij connect them through the hidden nodes to the output nodes, which produce the output vector.]
The net input to unit j is

I_j = Σ_i w_ij O_i + θ_j

where O_i is the output of unit i in the previous layer and θ_j is the bias of unit j. The output of unit j is

O_j = 1 / (1 + e^(−I_j))

For a unit j in the output layer, the error is

Err_j = O_j (1 − O_j)(T_j − O_j)

where T_j is the true (target) output. For a unit j in a hidden layer, the error is

Err_j = O_j (1 − O_j) Σ_k Err_k w_jk

where Err_k is the error of unit k in the next layer. The weights and biases are then updated:

w_ij = w_ij + (l) Err_j O_i
θ_j = θ_j + (l) Err_j

where l is the learning rate.
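The update rules above can be sketched in NumPy for a small network trained on XOR; the layer sizes (2-4-1), random seed, learning rate, and epoch count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.uniform(-1, 1, (2, 4)); th1 = rng.uniform(-1, 1, 4)  # hidden weights, biases
W2 = rng.uniform(-1, 1, (4, 1)); th2 = rng.uniform(-1, 1, 1)  # output weights, biases
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)  # XOR inputs
T = np.array([[0], [1], [1], [0]], float)              # XOR targets
l = 0.5  # learning rate

def sigmoid(I):
    return 1.0 / (1.0 + np.exp(-I))

for epoch in range(5000):
    for x, t in zip(X, T):
        O1 = sigmoid(x @ W1 + th1)          # I_j = sum_i w_ij O_i + theta_j
        O2 = sigmoid(O1 @ W2 + th2)
        err2 = O2 * (1 - O2) * (t - O2)     # output-layer error
        err1 = O1 * (1 - O1) * (W2 @ err2)  # error propagated back to hidden layer
        W2 += l * np.outer(O1, err2); th2 += l * err2  # w_ij += l * Err_j * O_i
        W1 += l * np.outer(x, err1);  th1 += l * err1  # theta_j += l * Err_j
```

Each pass applies exactly the forward and backward formulas above, one training pattern at a time; after training, the network's sum squared error on the four XOR patterns is far below its initial value.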
Applications of ANNs
Prediction – weather, stocks, disease, predicting financial time series.
Classification – financial risk assessment, image processing.
Data Association – text recognition.
Data Conceptualization – customer purchasing habits.
Filtering – normalizing telephone signals.
Optimization.
Diagnosing medical conditions.
Identifying clusters in customer databases.
Identifying fraudulent credit card transactions.
Hand-written character recognition.
And many more….
Advantages
Adapt to unknown situations.
Autonomous learning and generalization.
Their most important advantage is in solving problems that are too complex for conventional technologies: problems that do not have an algorithmic solution, or for which an algorithmic solution is too complex to be found.
Disadvantages
Not exact.
Large complexity of the network structure.
Using a neural network for prediction
Identify the inputs and outputs.
Preprocess the inputs – often scale them to the range [0, 1].
Choose an ANN architecture.
Train the ANN with a representative set of training examples (usually using BP).
Test the ANN with another set of known examples; often the known data set is divided into training and test sets. Cross-validation is a more rigorous validation procedure.
Apply the model to unknown input data.
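The preprocessing and data-splitting steps above can be sketched as follows; the raw values and the 70/30 split ratio are illustrative assumptions:

```python
# Raw input values for one feature (arbitrary example data).
raw = [12.0, 45.0, 33.0, 7.0, 28.0, 40.0, 19.0, 36.0, 25.0, 10.0]

# Scale the inputs to the range [0, 1] (min-max scaling).
lo, hi = min(raw), max(raw)
scaled = [(v - lo) / (hi - lo) for v in raw]

# Divide the known data into training and test sets (70/30 split here).
split = int(0.7 * len(scaled))
train, test = scaled[:split], scaled[split:]
```

The training set would then be fed to the chosen ANN, and the held-out test set used to check generalization before applying the model to unknown data.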