
ISAN-DSP GROUP

Artificial Neural Networks

Basic Theory and Research Activity in KKU

Dr. Nawapak Eua-anant
Department of Computer Engineering
Khon Kaen University

1 April 2546

Path I

Basic Theory of Artificial Neural Network

A Neuron: The smallest unit in the brain

[Diagram of a neuron: cell body (soma), nucleus, dendrites, axon, axon hillock, myelin sheath, synapse]

The brain consists of about 10,000 million neurons.

A Neuron: The smallest unit in the brain (continued)

Pictures from http://vv.carleton.ca/~neil/neural/neuron-a.html

A Neuron: The smallest unit in the brain (continued)

Dendrite: Each neuron contains approximately 10,000 dendrites connected to other neurons. Dendrites receive "electrical signals" from other neurons. Some neurons have 200,000 connections or more.

Axon: Each axon ends in several terminal buttons, called synapses, connected to the dendrites of other neurons. The length of some axons may reach 1 meter. Myelin sheaths act as an "insulator".

Example of neurons

Cultured retinal explants taken from chick embryos.

From http://www.ams.sunysb.edu/research/pinezich/neuron_reconstruction/

Example of neurons (continued)

Neurons located in the cerebral cortex of the hamster.

From http://faculty.washington.edu/chudler/cellpyr.html (image courtesy of Dr. James Crandall, Eunice Kennedy Shriver Center)

How a Neuron Works

[Diagram: concentrations of Na+, K+, Cl- and organic ions inside and outside the cell membrane, with a potential of about 70 mV across it]

At the equilibrium point, there is a higher concentration of potassium inside the cell and a higher concentration of sodium outside the cell. This results in a potential across the cell membrane of about 70-100 mV, called the resting potential.

How a Neuron Works (continued)

[Diagram: ion distributions before the depolarization (about 70 mV across the membrane, excitatory inputs arriving) and after the neuron has fired (about -30 mV)]

When the aggregate input is greater than the axon hillock's threshold value, there is a large influx of sodium ions into the cell, contributing to the depolarization. This results in the action potential transmitted down the axon to other cells.

Artificial Neural Network

[Diagram: input pattern → input nodes → hidden nodes → output nodes → output pattern, with many connections between layers]

In an ANN, each node is similar to a single neuron, and there are many connections between nodes.

Mathematical Model of a Neuron

McCulloch-Pitts model (1943)

[Diagram: inputs x1, x2, x3 enter node i through weights wi1, wi2, wi3; the node has threshold θi and output yi]

yi = g(wi1 x1 + wi2 x2 + ... + wiN xN - θi)
   = g( Σj wij xj - θi )

where g is the unit step function:

g(z) = 1 if z ≥ 0
       0 otherwise

θ = the threshold level, w = weight of the connection, x = input
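The model above can be sketched in a few lines of code. A minimal illustration in Python (the function name is ours, not from the slides):

```python
def mcp_neuron(x, w, theta):
    """McCulloch-Pitts neuron: unit step of the weighted sum minus threshold."""
    z = sum(wj * xj for wj, xj in zip(w, x)) - theta
    return 1 if z >= 0 else 0
```

With w = (1, 1) and theta = 1.5 this single neuron computes the logical AND of two binary inputs.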

How can we use a mathematical function to classify?

Consider a simple problem: how can we classify fat and thin people?

[Plot: weight (kg) on the horizontal axis (40-80) and height (cm) on the vertical axis (140-180); a decision line separates the "Thin" region from the "Fat" region]

How can we use a mathematical function to classify? (cont.)

We use 2 inputs, weight (x1) and height (x2), to classify "thin" and "fat".

[Plot of the weight-height space: the decision line x2 - x1 - 100 = 0 separates the "Thin" area, where x2 - x1 - 100 > 0, from the "Fat" area, where x2 - x1 - 100 < 0]

We can use a line to classify the data.

How can we use a mathematical function to classify? (cont.)

The decision function to classify "thin" and "fat":

y = 1 (thin) if x2 - x1 - 100 ≥ 0
    0 (fat)  if x2 - x1 - 100 < 0

which is similar to the McCulloch-Pitts model:

y = g(w1 x1 + w2 x2 - θ) = g(-x1 + x2 - 100)

where g(z) = 1 if z ≥ 0, 0 otherwise.

Advantage: a universal linear classifier.
Problem: for a particular problem, how can we choose suitable weights w and threshold θ of the function?
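Plugging the weights read off the decision line (w1 = -1, w2 = 1, θ = 100) into this form gives a direct implementation, sketched here in Python (function name ours):

```python
def thin_or_fat(weight_kg, height_cm):
    """Return 1 (thin) or 0 (fat) using y = g(-x1 + x2 - 100)."""
    z = -1.0 * weight_kg + 1.0 * height_cm - 100.0  # w1 = -1, w2 = 1, theta = 100
    return 1 if z >= 0 else 0
```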

A Single Layer Perceptron: Adaptive linear classifier

Rosenblatt (1958) proposed a model with supervised learning:

[Diagram: inputs x1, x2, x3 enter the network through weights wi1, wi2, wi3 to produce the network output y; the error between the desired output and the network output is used to adjust the weights]

A Single Layer Perceptron (continued)

[Diagram: input-layer nodes x1, x2, x3, x4 connect through weights wi1, wi2, wi3, wi4 to output-layer node yi]

For each node, the output is given by

yi = g( Σ(j=1..N) wij xj - θi )

wij = connection weight of branch (i, j)
xj = input data from node j in the input layer
θi = threshold value of node i in the output layer
g = activation function

A Single Layer Perceptron (continued)

The number of input nodes depends on the number of components of an input pattern.

There are many types of activation functions. For example:

The threshold function: g(x) = 1 if x ≥ T, 0 if x < T

The sigmoid function: g(x) = 1 / (1 + e^(-λx))

[Plots of the two activation functions for several parameter values (T = 0.5, 1; λ = 2, 4)]
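Both activation functions can be written down directly (a Python sketch; the parameter names T and lam are ours):

```python
import math

def threshold(x, T=0.0):
    """Threshold (unit step) activation: 1 if x >= T, else 0."""
    return 1.0 if x >= T else 0.0

def sigmoid(x, lam=1.0):
    """Sigmoid activation 1 / (1 + e^(-lam*x)); lam controls the slope."""
    return 1.0 / (1.0 + math.exp(-lam * x))
```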

How a Single Layer Perceptron Works

Consider a 2-input single layer perceptron with the threshold activation function. The output y is given by

y = 1 if w1 x1 + w2 x2 - θ ≥ 0
    0 if w1 x1 + w2 x2 - θ < 0

[Plot: the x1-x2 plane is divided by the decision line L: w1 x1 + w2 x2 - θ = 0 into a region where y = 1 and a region where y = 0]

The slope and position of line L depend on w1, w2, and θ.

How a Single Layer Perceptron Works (continued)

Example: Function AND

x1 x2 | y
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

[Plot: points (0,0), (0,1), (1,0) have y = 0 and point (1,1) has y = 1; lines in a certain range can be used as function AND, while lines outside that range cannot]

The decision line must be suitably set for each problem. In other words, the weights of the network must be selected properly.

Training the Perceptron

For input (x1, x2) and desired output y:

Procedure for training the network
1. Feed input data into the network.
2. Compute the network output ŷ = f(w1 x1 + w2 x2 - θ).
3. Compute the error ε = (y - ŷ)².
4. Adjust the weights: wnew = wold + Δw, θnew = θold + Δθ.
5. Repeat from step 1 until the error is small enough.

The weights of the network must be adjusted in the direction that minimizes the error.

Training the Perceptron (continued)

Updating the weights using the gradient descent method:

For a single layer perceptron with the threshold activation function,

Δwi = η (y - ŷ) xi
Δθ = -η (y - ŷ)

η = learning rate
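This update rule, applied to the AND data with the slide's initial values (w1 = 0.5, w2 = 2.5, θ = 1.0, η = 0.2), can be sketched as follows (Python; the function name and epoch count are ours):

```python
def train_perceptron(data, w1=0.5, w2=2.5, theta=1.0, eta=0.2, epochs=20):
    """Train a 2-input perceptron with the threshold activation.

    data: list of (x1, x2, desired_output) tuples.
    Returns the learned (w1, w2, theta).
    """
    for _ in range(epochs):
        for x1, x2, y in data:
            y_hat = 1 if w1 * x1 + w2 * x2 - theta >= 0 else 0
            err = y - y_hat
            # gradient descent rule: Δwi = η(y - ŷ)xi, Δθ = -η(y - ŷ)
            w1 += eta * err * x1
            w2 += eta * err * x2
            theta -= eta * err
    return w1, w2, theta
```

On the AND truth table this converges after a handful of epochs, consistent with the slides' "OK!" at iteration 16.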

Training the Perceptron (continued)

Example: Function AND

x1 x2 | y
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

Initial weights: w1 = 0.5, w2 = 2.5, θ = 1.0, η = 0.2

Δwi = η (y - ŷ) xi, Δθ = -η (y - ŷ)

[Plot: in the x1-x2 plane, the decision line moves from its starting position through iterations 4, 8, and 12 to iteration 16, where it correctly separates the points: OK!]

Training the Perceptron (continued)

n  | x1 | x2 | y | ŷ | y-ŷ | new w1 | Δw1 | new w2 | Δw2  | new θ
0  |    |    |   |   |     | 0.5    |     | 2.5    |      | 1.0
1  | 0  | 0  | 0 | 0 | 0   | 0.5    | 0   | 2.5    | 0    | 1.0
2  | 0  | 1  | 0 | 1 | -1  | 0.5    | 0   | 2.3    | -0.2 | 1.2
3  | 1  | 0  | 0 | 0 | 0   | 0.5    | 0   | 2.3    | 0    | 1.2
4  | 1  | 1  | 1 | 1 | 0   | 0.5    | 0   | 2.3    | 0    | 1.2
5  | 0  | 0  | 0 | 0 | 0   | 0.5    | 0   | 2.3    | 0    | 1.2
6  | 0  | 1  | 0 | 1 | -1  | 0.5    | 0   | 2.1    | -0.2 | 1.4
7  | 1  | 0  | 0 | 0 | 0   | 0.5    | 0   | 2.1    | 0    | 1.4
8  | 1  | 1  | 1 | 1 | 0   | 0.5    | 0   | 2.1    | 0    | 1.4
9  | 0  | 0  | 0 | 0 | 0   | 0.5    | 0   | 2.1    | 0    | 1.4
10 | 0  | 1  | 0 | 1 | -1  | 0.5    | 0   | 1.9    | -0.2 | 1.6
…  | …  | …  | … | … | …   | …      | …   | …      | …    | …

ŷ = f(w1 x1 + w2 x2 - θ), error = y - ŷ
Δwi = η (y - ŷ) xi, Δθ = -η (y - ŷ)
wnew = wold + Δw, θnew = θold + Δθ

Linearly Separable Data

[Plot: points (0,0), (0,1), (1,0) belong to class y = 0 and point (1,1) to class y = 1; a single line separates the two classes]

In 2-dimensional space, the decision function of a single layer perceptron is a line. Therefore, the data to be classified must be separable by a single line. We say that such data is linearly separable.

Nonlinearly Separable Data: Limitation of a single layer perceptron

Example: Function XOR

x1 x2 | y
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0

[Plot: points (0,0) and (1,1) belong to class y = 0, while (0,1) and (1,0) belong to class y = 1; no single line separates the classes: not OK!]

There are some cases in which a single layer perceptron does not work: we cannot use a single line to divide the data between classes. Such data is nonlinearly separable.

Linearly Separable vs. Nonlinearly Separable

[Plots: an example of a linearly separable data set and an example of a nonlinearly separable data set]

Higher Dimensional Space

In the case of input patterns having more than 2 components, the output of the network is given by

y = 1 if w1 x1 + w2 x2 + ... + wN xN - θ ≥ 0
    0 if w1 x1 + w2 x2 + ... + wN xN - θ < 0

The decision function becomes the hyperplane

w1 x1 + w2 x2 + ... + wN xN - θ = 0

Example: 3D case

[Plot: classes A and B in (x1, x2, x3) space separated by the decision plane w1 x1 + w2 x2 + w3 x3 - θ = 0]

Updating Equation for the Gradient Descent Method

Δwj = η ( y - g(Σk wk xk) ) g'(Σk wk xk) xj

(learning rate × output error × derivative of the activation × input)

Derivatives for some activation functions:

1. Linear unit: g(x) = x, g'(x) = 1

2. Sigmoid function: g(x) = 1 / (1 + e^(-2λx)), g'(x) = 2λ g(x) (1 - g(x))

3. Tanh function: g(x) = tanh(λx), g'(x) = λ (1 - g(x)²)
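These derivative formulas can be checked numerically. A Python sketch (we write the sigmoid as 1/(1 + e^(-2λx)), the convention under which the factor 2λ in its derivative is exact; the function names are ours):

```python
import math

def sigmoid(x, lam=1.0):
    return 1.0 / (1.0 + math.exp(-2.0 * lam * x))

def sigmoid_deriv(x, lam=1.0):
    g = sigmoid(x, lam)
    return 2.0 * lam * g * (1.0 - g)   # g'(x) = 2λ g(x)(1 - g(x))

def tanh_deriv(x, lam=1.0):
    g = math.tanh(lam * x)
    return lam * (1.0 - g * g)         # g'(x) = λ(1 - g(x)^2)

def numdiff(f, x, h=1e-6):
    """Central-difference numerical derivative for checking the formulas."""
    return (f(x + h) - f(x - h)) / (2.0 * h)
```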

A Multilayer Perceptron

Feed-forward network

[Diagram of an N-layer network: layer 0 holds the input nodes, layers 1 to N-1 hold the hidden nodes, layer N holds the output nodes, with connections between successive layers]

A Multilayer Perceptron (continued)

[Diagram: input → feed-forward network → network output; the error between the desired output and the network output is fed back through the back propagation algorithm to adjust the weights]

How a Multilayer Perceptron Works: XOR Example

Function XOR

x1 x2 | y
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0

[Diagram of a 2-layer network: inputs x1, x2 feed layer-1 nodes y1 and y2 through weights w1(1,1), w1(1,2), w1(2,1), w1(2,2); y1 and y2 feed the output node o through weights w2(1,1), w2(1,2)]

Layer 1:
y1 = g( w1(1,1) x1 + w1(1,2) x2 - θ1(1) )
y2 = g( w1(2,1) x1 + w1(2,2) x2 - θ1(2) )

Layer 2:
o = g( w2(1,1) y1 + w2(1,2) y2 - θ2(1) )

g( ) = threshold function; the superscript in wn and θn denotes the layer.
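A two-layer threshold network that realizes XOR can be written out by hand. The weights below are illustrative choices (they are not given numerically on the slides) that reproduce the layer-1 outputs y1, y2 tabulated in the slides:

```python
def step(z):
    """Threshold activation g."""
    return 1 if z >= 0 else 0

def xor_net(x1, x2):
    # layer 1: y1 acts as OR, y2 acts as AND (hand-picked weights)
    y1 = step(1 * x1 + 1 * x2 - 0.5)
    y2 = step(1 * x1 + 1 * x2 - 1.5)
    # layer 2: the output fires only when y1 = 1 and y2 = 0
    return step(1 * y1 - 1 * y2 - 0.5)
```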

How a Multilayer Perceptron Works (cont.)

At the first layer

[Plot: points (0,0), (0,1), (1,0), (1,1) in the x1-x2 plane with two decision lines]

Line L1: w1(1,1) x1 + w1(1,2) x2 - θ1(1) = 0
Line L2: w1(2,1) x1 + w1(2,2) x2 - θ1(2) = 0

The 2 nodes in the first layer correspond to 2 lines.

Outputs at layer 1:

x1 x2 | y1 y2
 0  0 |  0  0
 0  1 |  1  0
 1  0 |  1  0
 1  1 |  1  1

How a Multilayer Perceptron Works (cont.)

At the first layer

[Plots: in x1-x2 space, lines L1 and L2 cut between class 0 points (0,0), (1,1) and class 1 points (0,1), (1,0); mapping through layer 1 into y1-y2 space sends the inputs to (0,0), (1,0), and (1,1): linearly separable!]

Hidden layers transform input data into linearly separable data!

How a Multilayer Perceptron Works (cont.)

At the output layer

Line L3: w2(1,1) y1 + w2(1,2) y2 - θ2(1) = 0

[Plot: in y1-y2 space, line L3 separates class 0 points (0,0) and (1,1) from class 1 point (1,0)]

The y1-y2 space is linearly separable. Therefore the output layer can classify the data correctly.

Back Propagation Algorithm

The gradient descent method:

Δwn(j,i) = -η ∂E/∂wn(j,i) = η δn(j) g'(hn(j)) xn(i)
Δθn(j) = -η ∂E/∂θn(j) = -η δn(j) g'(hn(j))

where hn(k) = Σj wn(k,j) yj - θn(k) and δn(j) is the output error back-propagated to node j of layer n.

Updating the weights and bias:

wn(j,i)(new) = wn(j,i)(old) + Δwn(j,i)
θn(j)(new) = θn(j)(old) + Δθn(j)
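The update equations can be turned into a small from-scratch trainer. This Python sketch is our own illustration (layer sizes, learning rate, and the use of added biases rather than subtracted thresholds are assumptions): it back-propagates the error of a 2-4-1 sigmoid network on XOR and records the total squared error per epoch.

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_xor(epochs=1500, eta=0.5, n_hidden=4, seed=1):
    """Back propagation on XOR; returns the per-epoch squared-error history."""
    rng = random.Random(seed)
    w = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(n_hidden)]
    bw = [rng.uniform(-1, 1) for _ in range(n_hidden)]   # hidden biases
    v = [rng.uniform(-1, 1) for _ in range(n_hidden)]    # hidden-to-output weights
    bv = rng.uniform(-1, 1)                              # output bias
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:
            # forward pass
            h = [sigmoid(w[j][0] * x[0] + w[j][1] * x[1] + bw[j]) for j in range(n_hidden)]
            o = sigmoid(sum(v[j] * h[j] for j in range(n_hidden)) + bv)
            total += (y - o) ** 2
            # backward pass: output delta, then back-propagated hidden deltas
            delta_o = (y - o) * o * (1 - o)
            delta_h = [delta_o * v[j] * h[j] * (1 - h[j]) for j in range(n_hidden)]
            for j in range(n_hidden):
                v[j] += eta * delta_o * h[j]
                w[j][0] += eta * delta_h[j] * x[0]
                w[j][1] += eta * delta_h[j] * x[1]
                bw[j] += eta * delta_h[j]
            bv += eta * delta_o
        losses.append(total)
    return losses
```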

Example: Application of MLP for classification

MATLAB example:

x = randn([2 200]);
o = (x(1,:).^2+x(2,:).^2)<1;

Input points (x1, x2) are generated from random numbers.

Desired output: if (x1, x2) lies in a circle of radius 1 centered at the origin, then o = 1, else o = 0.

[Plot: the 200 points in the x1-x2 plane (about -3 to 3 on each axis); class 1 points lie inside the unit circle, class 0 points outside]

Example: Application of MLP for classification (cont.)

Network structure

[Diagram: input nodes x1, x2 → hidden nodes (sigmoid) → output node (sigmoid) → threshold unit (for binary output)]

Example: Application of MLP for classification (cont.)

MATLAB command: create a 2-layer network

PR = [min(x(1,:)) max(x(1,:));
      min(x(2,:)) max(x(2,:))];               % Range of inputs
S1 = 10;                                      % No. of nodes in layer 1
S2 = 1;                                       % No. of nodes in layer 2
TF1 = 'logsig';                               % Activation function of layer 1
TF2 = 'logsig';                               % Activation function of layer 2
BTF = 'traingd';                              % Training function
BLF = 'learngd';                              % Learning function
PF = 'mse';                                   % Cost function
net = newff(PR,[S1 S2],{TF1 TF2},BTF,BLF,PF); % Create the network

Example: Application of MLP for classification (cont.)

MATLAB command: train the network

net.trainParam.epochs = 2000;   % Max. no. of training rounds
net.trainParam.goal = 0.002;    % Maximum desired error
net = train(net,x,o);           % Training command
y = sim(net,x);                 % Compute network outputs (continuous)
netout = y>0.5;                 % Convert to binary outputs

Example: Application of MLP for classification (cont.)

Initial weights of the 10 hidden-layer nodes, displayed as the lines w1 x1 + w2 x2 - θ = 0

[Plot: the 10 initial decision lines drawn over the class 0 / class 1 data in the x1-x2 plane]

Example: Application of MLP for classification (cont.)

Training algorithm: Levenberg-Marquardt back propagation

Graph of MSE vs. training epochs (success within only 10 epochs!)

[Plot: training curve (blue) against the goal (black); performance is 0.00172594, goal is 0.002, reached in 10 epochs]

Example: Application of MLP for classification (cont.)

Results obtained using the Levenberg-Marquardt back propagation algorithm

Classification error: 0/200

[Plot: the final decision lines of the hidden nodes over the data; several hidden nodes end up unused]

Only 6 hidden nodes are adequate!

Path II

ANN Research in the Department of Computer Engineering, Khon Kaen University

Pattern Recognition Applications

1. Face Recognition Project
2. Resonant Inspection Project
3. Other Projects

Elements of Pattern Recognition

[Diagram: data acquisition → feature extraction → recognition process (ANN)]

Feature extraction is used to reduce the amount of data to be processed by extracting the important features of the raw data. This process can reduce the computational cost dramatically.

Face Recognition Project

Feature extraction: Discrete Wavelet + Fourier Transform → Neural Network

The wavelet transform was chosen for:
1. Its multi-resolution analysis capability, which can eliminate unwanted variations of the facial image in wavelet scale-space.
2. Its ability to compress the image using few coefficients.

Senior Project 2001: 1. Chavis Srichan, 2. Piyapong Sripikul, 3. Suranuch Sapsoe

Multiresolution Analysis Using Wavelets

[Diagram: the original image and its 2-level multiresolution decomposition into subbands LL2, LH2, HL2, HH2 and LH1, HL1, HH1]

L = low-frequency component, H = high-frequency component
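One decomposition level can be illustrated with the Haar wavelet, the simplest case (a from-scratch Python sketch, not the project's actual code; normalization conventions vary):

```python
def haar2d(img):
    """One level of a 2-D Haar wavelet decomposition of a 2-D list
    with even dimensions; returns the (LL, LH, HL, HH) subbands."""
    rows, cols = len(img), len(img[0])
    LL, LH, HL, HH = ([[0.0] * (cols // 2) for _ in range(rows // 2)] for _ in range(4))
    for r in range(0, rows, 2):
        for c in range(0, cols, 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            LL[r // 2][c // 2] = (a + b + d + e) / 4.0  # local average (low/low)
            LH[r // 2][c // 2] = (a + b - d - e) / 4.0  # vertical detail
            HL[r // 2][c // 2] = (a - b + d - e) / 4.0  # horizontal detail
            HH[r // 2][c // 2] = (a - b - d + e) / 4.0  # diagonal detail
    return LL, LH, HL, HH
```

Applying it again to the LL subband gives the 2-level decomposition shown in the diagram.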

Feature Extraction of Facial Images

Original image: 640x480 pixels

1. Segmentation: eliminates unwanted pixels
2. DWT, 4 levels: reduces the size of the image to 40x30 pixels
3. FFT: transforms to the frequency domain

Recognition Process

[Diagram: database of facial images → FFT image → 2-layer feed-forward network (120-10-1) → network output]

Training the network

Identification Results

[Examples: 90% match and 24% match]

Resonant Inspection Project

Senior Project 2002: 1. Min Nguyen, 2. Prachya Damprapa, 3. Tiranattada Nantama

[Diagram: tapping → process → analyze and classify]

Basic Concept of Resonant Inspection

Objects of different structures have different resonant frequencies. Good and broken parts have different tapping sounds.

[Plots: spectra of the tapping sounds of a good dining plate and a broken dining plate]

Tapping Device

[Photo: the tapping device, with solenoid and microphone]

Feature Extraction

Sampled audio signals were converted into spectral signals using the Fast Fourier Transform.

[Plots: spectra of tapping 1 and tapping 2 for a good plate and a broken plate]

Feature Extraction (continued)

The whole spectrum was divided into 16 equal intervals.

[Plots: the 16-interval spectra of a good plate and a broken plate]
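The resulting 16-interval feature vector can be sketched as follows (Python, with a naive DFT so the block is self-contained; averaging the magnitudes per band is our assumption, not necessarily the project's exact recipe):

```python
import cmath

def band_features(signal, n_bands=16):
    """Average spectral magnitude in n_bands equal frequency intervals."""
    n = len(signal)
    mags = []
    for k in range(n // 2):  # magnitudes of the first half of the spectrum
        s = sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        mags.append(abs(s))
    size = len(mags) // n_bands
    return [sum(mags[b * size:(b + 1) * size]) / size for b in range(n_bands)]
```

A real implementation would use an FFT instead of this O(n²) loop; the point is only the banding of the spectrum into a fixed-length feature vector for the network.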

Recognition Process

[Diagram: spectrum of the tapping sound (16 intervals) → 3-layer (16-10-4-1) feed-forward network → network output: good or bad]

Resonant Inspection Project: Software Part

[Screenshot: software modules to create training data, set training-data parameters, set network parameters, train the network, and test & classify]

Training the network

Experimental Results

[Screenshot: classification of a good plate]

Experimental Results (continued)

[Screenshot: classification of a broken plate]

Experimental Results (continued)

Test objects   | Classified good | Classified bad | % Error
Good plates    | 30              | 0              | 0
Broken plates  | 0               | 30             | 0

Data set: 100 tapping sounds of good plates (70 for training, 30 for testing) and 100 tapping sounds of broken plates (70 for training, 30 for testing).

Other Projects: Function Approximation Application

[Plot: a function to be approximated; the outputs from neural nodes 1, 2, and 3 are combined to obtain the approximated function]

Fitting Problem

[Plots: the function to be approximated against the network results for an underfit and an overfit network]

Underfit problem: too few nodes, or nodes that are too wide.
Overfit problem: too many nodes, or nodes that are too narrow.

Graduate Research on Function Approximation

"Solving the Initial Centers and Widths Problem of Radial Basis Function Networks for Time Series Prediction" by Chatklao Jareonbhol

Application: rainfall forecasting.

Goal: to determine the suitable parameters of the network: the number of nodes, and suitable widths and locations of the nodes.