Neural Network Applications Using an Improved Performance Training Algorithm
Annamária R. Várkonyi-Kóczy 1, 2, Balázs Tusor 2
1 Institute of Mechatronics and Vehicle Engineering, Óbuda University
2 Integrated Intelligent Space Japanese-Hungarian Laboratory
e-mail: [email protected]
Outline
- Introduction, motivation for using SC techniques
- Neural networks, fuzzy neural networks, circular fuzzy neural networks
- The place and success of NNs
- A new training and clustering algorithm
- Classification examples
- A real-world application: fuzzy hand posture and gesture detection system
  - Inputs of the system
  - Fuzzy hand posture models
  - The NN based hand posture identification system
- Results
- Conclusions
Motivation for using SC Techniques: we need something "non-classical"
Problems:
- Nonlinearity, never before seen spatial and temporal complexity of systems and tasks
- Imprecise, uncertain, insufficient, ambiguous, contradictory information, lack of knowledge
- Finite resources
- Strict time requirements (real-time processing)
- Need for optimization
+ Need for user's comfort

New challenges and more complex tasks to be solved, so more sophisticated solutions are needed.
Motivation for using SC Techniques: we need something "non-classical"
Intentions:
- We would like to build MACHINES able to do the same as humans do (e.g. autonomous cars driving in heavy traffic).
- We always would like to find an algorithm leading to an OPTIMUM solution (even when facing too much uncertainty and lack of knowledge).
- We would like to ensure MAXIMUM performance (usually impossible from every point of view, i.e. some kind of trade-off, e.g. between performance and costs).
- We prefer environmental COMFORT (user-friendly machines).
Need for optimization
- Traditionally: optimization = precision
- New definition (L.A. Zadeh): optimization = cost optimization
- But what is cost? Precision and certainty also carry a cost.
User's comfort
- preprocessing, then processing
- Aims of preprocessing: improving the performance of the algorithms; giving more support to the processing (new)
- Image processing / computer vision: noise smoothing, feature extraction (edge and corner detection), pattern recognition, etc.
- Aims of the processing: 3D modeling, medical diagnostics, etc.; automatic 3D modeling, automatic ...
- Human language: modularity, simplicity, hierarchical structures
Motivation for using SC Techniques: we need something "non-classical"
Elements of the solution:
- Low complexity, approximate modeling
- Application of adaptive and robust techniques
- Definition and application of the proper cost function, including the hierarchy and measure of importance of the elements
- Trade-off between accuracy (granularity) and complexity (computational time and resource need)
- Giving support for the further processing

Traditional and AI methods cannot cope with these requirements; only Soft Computing techniques and Computational Intelligence can.
What is Computational Intelligence?
Computer (increased computing facilities) + Intelligence (added by the new methods)
L.A. Zadeh, Fuzzy Sets [1965]: “In traditional – hard – computing, the prime desiderata are precision, certainty, and rigor. By contrast, the point of departure of soft computing is the thesis that precision and certainty carry a cost and that computation, reasoning, and decision making should exploit – whenever possible – the tolerance for imprecision and uncertainty.”
What is Computational Intelligence?
CI can be viewed as a consortium of methodologies which play an important role in the conception, design, and utilization of information/intelligent systems. The principal members of the consortium are: fuzzy logic (FL), neuro computing (NC), evolutionary computing (EC), anytime computing (AC), probabilistic computing (PC), chaotic computing (CC), and (parts of) machine learning (ML). The methodologies are complementary and synergistic, rather than competitive. What is common: they exploit the tolerance for imprecision, uncertainty, and partial truth to achieve tractability, robustness, low solution cost, and better rapport with reality.
Soft Computing methods (Computational Intelligence) fulfill all five requirements:
- low complexity, approximate modeling
- application of adaptive and robust techniques
- definition and application of the proper cost function, including the hierarchy and measure of importance of the elements
- trade-off between accuracy (granularity) and complexity (computational time and resource need)
- giving support for the further processing
Methods of Computational Intelligence
- fuzzy logic: low complexity, easy building of a priori knowledge into computers, tolerance for imprecision, interpretability
- neuro computing: learning ability
- evolutionary computing: optimization, optimum learning
- anytime computing: robustness, flexibility, adaptivity, coping with the temporal circumstances
- probabilistic reasoning: uncertainty, logic
- chaotic computing: open mind
- machine learning: intelligence
Neural Networks
They mimic the human brain (McCulloch & Pitts, 1943; Hebb, 1949)
- Rosenblatt, 1958 (Perceptron)
- Widrow-Hoff, 1960 (Adaline)
- ...
Neural Networks
Neural nets are parallel, distributed information processing tools which are
- highly connected systems composed of identical or similar operational units (processing elements, neurons) evaluating local processing, usually in a well-ordered topology,
- possessing some kind of learning algorithm, which usually means learning by patterns and also determines the mode of the information processing,
- and possessing an information recall algorithm, making possible the usage of the previously learned information.
Application areas where NNs are successfully used
- One- and multi-dimensional signal processing (image processing, speech processing, etc.)
- System identification and control
- Robotics
- Medical diagnostics
- Economic feature estimation
- Associative memory (= content addressable memory)
Application areas where NNs are successfully used
- Classification systems (e.g. pattern recognition, character recognition)
- Optimization systems, where the (usually feedback) NN approximates the cost function (e.g. radio frequency distribution, A/D converters, the traveling salesman problem)
- Approximation systems (any input-output mapping)
- Nonlinear dynamic system models (e.g. solution of partial differential equation systems, prediction, rule learning)
Main features
- Complex, nonlinear input-output mapping
- Adaptivity, learning ability
- Distributed architecture
- Fault tolerance
- Possibility of parallel analog or digital VLSI implementations
- Analogy with neurobiology
Classical neural nets
- Static nets (without memory; feedforward networks)
  - One layer
  - Multilayer: MLP (Multi-Layer Perceptron), RBF (Radial Basis Function), CMAC (Cerebellar Model Articulation Controller)
- Dynamic nets (with memory or feedback recall networks)
  - Feedforward (with memory elements)
  - Feedback: local feedback, global feedback
Feedforward architectures
- One-layer architectures: the Rosenblatt perceptron
[Figure: the input passes through the tunable parameters (weighting factors) to the output.]
Feedforward architectures
- Multilayer network (the static MLP net)
Approximation property
- The universal approximation property holds for some kinds of NNs.
- Kolmogorov: any continuous real-valued N-variable function defined over the [0,1]^N compact interval can be represented with the help of appropriately chosen one-variable functions and the sum operation.
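The statement above can be written out explicitly; this is the standard form of the Kolmogorov superposition theorem (the symbols Φ_q and φ_{q,p} are from the usual textbook statement, not from the slides):

```latex
f(x_1,\dots,x_N) \;=\; \sum_{q=0}^{2N} \Phi_q\!\left( \sum_{p=1}^{N} \varphi_{q,p}(x_p) \right),
\qquad f \in C\big([0,1]^N\big),
```

where the outer functions Φ_q and the inner functions φ_{q,p} are continuous functions of a single variable.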
Learning
- Learning = structure + parameter estimation
- supervised learning
- unsupervised learning
- analytic learning
- Convergence??
- Complexity??
Supervised learning
[Block diagram: the system d = f(x, n) and the NN model y = fM(x, w) receive the same input x, where n is noise. The criterion C(d, y) = C(ε) compares the desired output d with the model output y, and parameter tuning estimates the model parameters from x, y, and d.]
Supervised learning
Criterion function:
- quadratic
- ...
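The quadratic criterion mentioned above can be written in its standard form (not copied from the slide):

```latex
C(\mathbf{w}) \;=\; \frac{1}{2}\sum_{k} \varepsilon_k^2
            \;=\; \frac{1}{2}\sum_{k} \big(d_k - y_k(\mathbf{w})\big)^2 ,
```

where ε_k = d_k − y_k is the error between the desired output and the model output for the k-th sample.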
Minimization of the criterion
- Analytic solution (only if it is very simple)
- Iterative techniques
  - Gradient methods
  - Searching methods: exhaustive, random, genetic search
Parameter correction
- Perceptron rule
- Gradient methods: LMS (least mean squares algorithm)
- ...
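A minimal sketch of the Widrow-Hoff LMS rule named above, w ← w + 2μεx, where ε is the instantaneous error. The data, step size, and epoch count are illustrative, not from the slides:

```python
import numpy as np

def lms_fit(X, d, mu=0.01, epochs=50):
    """Widrow-Hoff LMS: for each sample x with target t,
    eps = t - w.x and w <- w + 2*mu*eps*x (gradient step on eps^2)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, t in zip(X, d):
            eps = t - w @ x          # instantaneous error
            w += 2 * mu * eps * x    # LMS parameter correction
    return w

# identify a known (noise-free) linear system d = 2*x0 - 1*x1
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 2))
d = X @ np.array([2.0, -1.0])
w = lms_fit(X, d)
```

With a realizable, noise-free target the weights converge to the true system parameters.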
Fuzzy Neural Networks
- Fuzzy Neural Networks (FNNs)
  - based on the concept of NNs
  - numerical inputs
  - weights, biases, outputs: fuzzy numbers
- Circular Fuzzy Neural Networks (CFNNs)
  - based on the concept of FNNs
  - topology realigned to a circular shape
  - connections between the hidden and input layers trimmed
  - the trimming depends on the input data; e.g., for 3D coordinates, each coordinate can be connected to only 3 neighboring hidden layer neurons
  - dramatic decrease in the required training time
Classification
- Clustering = the most important unsupervised learning problem: it deals with finding a structure in a collection of unlabeled data
- Clustering = assigning a set of objects into groups (clusters) whose members are similar in some way and are "dissimilar" to the objects belonging to other groups
- A (usually iterative) multi-objective optimization problem
- Clustering is a main task of explorative data mining and statistical data analysis, used in machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, etc.
- A difficult problem: multi-dimensional spaces, time/data complexity, finding an adequate distance measure, non-unambiguous interpretation of the results, overlapping of the clusters, etc.
The Training and Clustering Algorithms
Goal: to further increase the speed of the training of ANNs used for classification.
Idea: during the learning phase, instead of directly using the training data, the data are clustered and the ANNs are trained using the centers of the obtained clusters.
Notation: u is the input, u' the centers of the appointed clusters, y the output of the model, d the desired output, and c the value determined by the criterion function.
The Algorithm of the Clustering Step (modified K-means algorithm)
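The slides do not spell out the modified K-means step, so the following is a hypothetical sketch of one plausible reading: a sample joins the nearest existing cluster if it lies within the clustering distance of that cluster's center, and otherwise appoints a new cluster; centers are the means of their members.

```python
import numpy as np

def threshold_cluster(samples, dist):
    """One-pass distance-threshold clustering (an assumed sketch of the
    'modified K-means' step): each sample either joins the nearest
    cluster within `dist` of its center or starts a new cluster."""
    centers, members = [], []
    for s in samples:
        if centers:
            d = [np.linalg.norm(s - c) for c in centers]
            i = int(np.argmin(d))
            if d[i] <= dist:
                members[i].append(s)
                centers[i] = np.mean(members[i], axis=0)  # recenter
                continue
        centers.append(s.astype(float))   # appoint a new cluster
        members.append([s])
    return np.array(centers)

# two tight groups of points -> two appointed cluster centers
pts = np.array([[0.0, 0.0], [0.02, 0.01], [1.0, 1.0], [0.98, 1.01]])
centers = threshold_cluster(pts, dist=0.25)
```

The ANN is then trained on `centers` instead of the full sample set, which is what produces the training-time savings reported below.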
The ANNs
- Feedforward MLP, BP algorithm
- Number of neurons: 2-10-2
- Learning rate: 0.8
- Momentum factor: 0.1
- Teaching set: 500 samples, randomly chosen from the clusters
- Test set: 1000 samples, separately generated
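A self-contained sketch of a 2-10-2 MLP trained by backpropagation with momentum, using the slide's learning rate (0.8) and momentum factor (0.1). The toy two-class dataset and iteration count are illustrative assumptions, not the experiments' data:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 2-10-2 network: weights, biases, and momentum terms
W1 = rng.normal(0, 0.5, (2, 10)); b1 = np.zeros(10)
W2 = rng.normal(0, 0.5, (10, 2)); b2 = np.zeros(2)
vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)
vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)
lr, mom = 0.8, 0.1

# toy 2-class problem: class depends on which side of x0 = 0.5 a point is
X = rng.uniform(0, 1, (200, 2))
D = np.stack([(X[:, 0] < 0.5).astype(float),
              (X[:, 0] >= 0.5).astype(float)], axis=1)

for _ in range(2000):
    h = sigmoid(X @ W1 + b1)              # hidden layer
    y = sigmoid(h @ W2 + b2)              # output layer
    e = y - D
    d2 = e * y * (1 - y)                  # output deltas
    d1 = (d2 @ W2.T) * h * (1 - h)        # backpropagated hidden deltas
    # momentum update: v <- mom*v - lr*grad ; w <- w + v
    vW2 = mom * vW2 - lr * (h.T @ d2) / len(X); W2 += vW2
    vb2 = mom * vb2 - lr * d2.mean(0);          b2 += vb2
    vW1 = mom * vW1 - lr * (X.T @ d1) / len(X); W1 += vW1
    vb1 = mom * vb1 - lr * d1.mean(0);          b1 += vb1

acc = (y.argmax(1) == D.argmax(1)).mean()
```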
Examples: Problem #1
An easily solvable problem: 4 classes, no overlapping.
The Resulting Clusters and Required Training Time in the First Experiment, with Clustering Distances A: 0.05, B: 0.1, and C: 0.25 (first experiment)

Clustering distance | Time spent on training (min:sec) | Appointed clusters per class | Σ
Unclustered         | 2:07 (100%)                      | 113, 127, 127, 133           | 500
Clustered A1        | 2:00 (94.5%)                     | 30, 30, 8, 14                | 82
Clustered B1        | 1:53 (89%)                       | 11, 13, 3, 4                 | 31
Clustered C1        | 0:53 (41.7%)                     | 3, 2, 1, 1                   | 7
Comparison between the Results of the Training Using the Clustered and the Cropped Datasets of the 1st Experiment

Clustering distance | Accuracy of the training | Decrease in quality | Decrease in required time
Clustered A1        | 1000/1000 (100%)         | no decrease         | 5.5%
Clustered B1        | 1000/1000 (100%)         | no decrease         | 11%
Clustered C1        | 1000/1000 (100%)         | no decrease         | 58.3%
Cropped A1'         | 1000/1000 (100%)         | no decrease         | 18%
Cropped B1'         | 1000/1000 (100%)         | no decrease         | 62.99%
Cropped C1'         | 965/1000 (96.5%)         | 3.5% decrease       | 63.78%
Examples: Problem #2
A moderately hard problem: 4 classes, slight overlapping.
The Resulting Clusters and Required Training Time in the Second Experiment, with Clustering Distances A: 0.05, B: 0.1, and C: 0.25

Clustering distance | Time spent on training (hour:min:sec) | Appointed clusters per class | Σ
Unclustered         | 3:38:02 (100%)                        | 127, 125, 137, 111           | 500
Clustered A2        | 0:44:51 (20.57%)                      | 28, 31, 14, 2                | 78
Clustered B2        | 0:11:35 (5.31%)                       | 11, 10, 5, 2                 | 28
Clustered C2        | 0:03:00 (1.38%)                       | 2, 3, 1, 1                   | 7
Comparison between the Results of the Training Using the Clustered and Cropped Datasets of the 2nd Experiment

Clustering distance | Accuracy of the training | Decrease in accuracy | Decrease in required time
Clustered A2        | 997/1000 (99.7%)         | 0.3%                 | 79.43%
Clustered B2        | 883/1000 (88.3%)         | 11.7%                | 94.69%
Clustered C2        | 856/1000 (85.6%)         | 14.4%                | 98.62%
Cropped A2'         | 834/1000 (83.4%)         | 16.6%                | 96.32%
Cropped B2'         | 869/1000 (86.9%)         | 13.1%                | 96.49%
Cropped C2'         | 834/1000 (83.4%)         | 16.6%                | 96.68%
Comparison of the Accuracy and Training Time Results of the Clustered and Cropped Cases of the 2nd Experiment

Group    | Decrease in accuracy: clustered / cropped | Decrease in required time: clustered / cropped
A2 | A2' | 0.3% / 16.6%                              | 79.43% / 96.32%
B2 | B2' | 11.7% / 13.1%                             | 94.69% / 96.49%
C2 | C2' | 14.4% / 16.6%                             | 98.62% / 96.68%
Examples: Problem #3
A hard problem: 4 classes, significant overlapping.
The Resulting Clusters and Required Training Time in the Third Experiment, with Clustering Distances A: 0.05, B: 0.1, and C: 0.2

Clustering distance | Time spent on training (min:sec) | Appointed clusters per class | Σ
Unclustered         | N/A                              | 127, 125, 137, 111           | 500
Clustered 0.05      | 52:29                            | 28, 30, 33, 6                | 97
Clustered 0.1       | 24:13                            | 12, 10, 12, 3                | 37
Clustered 0.2       | 7:35                             | 3, 4, 4, 1                   | 12
Comparison between the Results of the Training Using the Clustered and Cropped Datasets of the 3rd Experiment

Clustering distance | Accuracy of the training | Decrease in quality
Clustered A3        | 956/1000 (95.6%)         | 4.4%
Clustered B3        | 858/1000 (85.8%)         | 14.2%
Clustered C3        | 870/1000 (87%)           | 13%
Cropped A3'         | 909/1000 (90.9%)         | 9.1%
Cropped B3'         | 864/1000 (86.4%)         | 13.6%
Cropped C3'         | 773/1000 (77.3%)         | 22.7%
Comparison of the Accuracy Results of the Clustered and Cropped Cases of the 3rd Experiment

Group    | Decrease in quality: clustered / cropped
A3 | A3' | 4.4% / 9.1%
B3 | B3' | 14.2% / 13.6%
C3 | C3' | 13% / 22.7%
Examples: Problem #4
An easy problem: 4 classes, no overlapping.
[Figures: the original dataset and the trained network's classifying ability for clustering distances d = 0.2, 0.1, and 0.05; both axes span 0 to 1.]
Clustering distance | Accuracy on the test set | Number of samples | Required time for training | Relative speed increase
Original            | 100%                     | 500               | 2 minutes 38 seconds       | -
Clustered 0.2       | 89.8%                    | 7                 | 6 seconds                  | 96%
Clustered 0.1       | 95.6%                    | 21                | 22 seconds                 | 86%
Clustered 0.05      | 99.7%                    | 75                | 44 seconds                 | 72%
Cropped 0.2         | 96.8%                    | 7                 | 5 seconds                  | 96.8%
Cropped 0.1         | 97.1%                    | 21                | 11 seconds                 | 93%
Cropped 0.05        | 98.7%                    | 75                | 23 seconds                 | 85%
Clustering distance | Accuracy on the original training set | Accuracy on the test set
Clustered 0.2       | 450/500 (90%)                         | 898/1000 (89.8%)
Clustered 0.1       | 481/500 (96.2%)                       | 956/1000 (95.6%)
Clustered 0.05      | 499/500 (99.8%)                       | 997/1000 (99.7%)
Cropped 0.2         | 447/500 (89.4%)                       | 898/1000 (89.8%)
Cropped 0.1         | 488/500 (97.6%)                       | 971/1000 (97.1%)
Cropped 0.05        | 498/500 (99.6%)                       | 987/1000 (98.7%)
Test-set accuracy:
Clustering distance | Clustered | Cropped | Clustered-to-cropped relation
0.2                 | 89.8%     | 89.8%   | equal
0.1                 | 95.6%     | 97.1%   | 1.5% better
0.05                | 99.7%     | 98.7%   | 1% better
Training time:
Clustering distance | Clustered  | Cropped    | Clustered-to-cropped relation
0.2                 | 6 seconds  | 5 seconds  | 16.6% slower
0.1                 | 22 seconds | 11 seconds | 50% slower
0.05                | 44 seconds | 23 seconds | 47.7% slower
[Figure: accuracy versus training time]
Examples: Problem #5
A moderately complex problem: 3 classes, with some overlapping. The network could not learn the original training data with the same options.
[Figures: the original dataset and classification results for clustering distances d = 0.2, 0.1, and 0.05; both axes span 0 to 1.]
Clustering distance | Accuracy on the original training set | Number of clusters | Required time for training
Clustered 0.2       | 80.6%                                 | 16                 | 35 seconds
Clustered 0.1       | 91%                                   | 44                 | 1 minute 47 seconds
Clustered 0.05      | 95.2%                                 | 134                | 17 minutes 37 seconds
Cropped 0.2         | 80.2%                                 | 16                 | 32 seconds
Cropped 0.1         | 93.4%                                 | 44                 | 1 minute 20 seconds
Cropped 0.05        | 91.4%                                 | 134                | 1 hour 50 minutes 9 seconds
Clustering distance | Accuracy on the original training set | Accuracy on the test set
Clustered 0.2       | 403/500 (80.6%)                       | 888/1000 (88.8%)
Clustered 0.1       | 455/500 (91%)                         | 977/1000 (97.7%)
Clustered 0.05      | 476/500 (95.2%)                       | 971/1000 (97.1%)
Cropped 0.2         | 401/500 (80.2%)                       | 884/1000 (88.4%)
Cropped 0.1         | 467/500 (93.4%)                       | 974/1000 (97.4%)
Cropped 0.05        | 457/500 (91.4%)                       | 908/1000 (90.8%)
Training time:
Clustering distance | Clustered             | Cropped                     | Clustered-to-cropped relation
0.2                 | 35 seconds            | 32 seconds                  | 8.6% slower
0.1                 | 1 minute 47 seconds   | 1 minute 20 seconds         | 25% slower
0.05                | 17 minutes 37 seconds | 1 hour 50 minutes 9 seconds | 625% faster
A Real-World Application: Man-Machine Cooperation in ISpace
- Man-machine cooperation in ISpace using visual (hand posture and gesture based) communication
- Stereo-camera system
- Recognition of hand gestures / hand tracking and classification of hand movements
- 3D computation of feature points / 3D model building
- Hand model identification
- Interpretation and execution of instructions
The Inputs: The 3D Coordinate Model of the Detected Hand
The method uses two cameras, viewing the hand from two different viewpoints, and works in the following way:
- It locates the areas in the pictures of the two cameras where visible human skin can be detected, using histogram back projection.
- Then it extracts the feature points in the back-projected picture by considering curvature extrema: peaks and valleys.
- Finally, the selected feature points are matched in the stereo image pair.
The result: the 3D coordinate model of the hand, 15 spatial points.
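The skin-locating step above relies on histogram back projection; here is a minimal single-channel (hue) sketch of the idea, assuming hue values normalized to [0, 1] and a toy image (real systems typically use a 2D hue-saturation histogram):

```python
import numpy as np

def back_project(image_hue, model_hue, bins=32):
    """Histogram back projection on one channel: each pixel of the
    search image is replaced by the normalized frequency of its hue
    bin in the model (skin) histogram, so skin-colored areas light up."""
    hist, _ = np.histogram(model_hue, bins=bins, range=(0.0, 1.0))
    hist = hist / hist.max()                         # normalize to [0, 1]
    idx = np.clip((image_hue * bins).astype(int), 0, bins - 1)
    return hist[idx]

model = np.full(100, 0.05)                   # "skin" sample: hue near 0.05
image = np.array([[0.05, 0.06],
                  [0.50, 0.90]])             # top row skin-like, bottom not
bp = back_project(image, model)
```

Thresholding the back-projected map then yields the candidate skin regions in which feature points are searched.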
Fuzzy Hand Posture Models
- Describing the human hand by fuzzy hand feature sets; theoretically 3^14 different hand postures
- 1st set: four fuzzy features describing the distance between the fingertips of each adjacent finger (How far are finger X and finger Y from each other?)
- 2nd set: five fuzzy features describing the relative angle between the lowest joint of each finger and the plane of the palm (How big is the angle between the lowest joint of finger W and the plane of the palm?)
- 3rd set: five fuzzy features describing the bentness of each finger (How bent is finger Z?)
Fuzzy Hand Posture Models
Example: "Victory"

Feature group                                                                    | Feature: value
Relative distance between adjacent fingers                                       | a: Large, b: Medium, c: Small, d: Small
Relative angle between the lowest joint of each finger and the plane of the palm | A: Medium, B: Small, C: Small, D: Large, E: Large
Relative bentness of each finger                                                 | A: Medium, B: Large, C: Large, D: Small, E: Small
Fuzzy Hand Posture and Gesture Identification System
Components: ModelBase, GestureBase, Target Generator, Circular Fuzzy Neural Networks (CFNNs), Fuzzy Inference Machine (FIM), Gesture Detector.
- ModelBase: stores the features of the models as linguistic variables.
- GestureBase: contains the predefined hand gestures as sequences of FHPMs.
- Target Generator: calculates the target parameters for the CFNNs and the FIM. Input parameters: d, the identification value (ID) of the model in the ModelBase, and SL, a linguistic variable setting the width of the triangular fuzzy sets.
- Fuzzy Inference Machine (FIM): identifies the detected FHPMs by using a fuzzy min-max algorithm, Max(Min(βi)), where βi is the intersection of the fuzzy feature sets.
- Gesture Detector: searches for the predefined hand gesture patterns in the sequence of detected hand postures.
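The Max(Min(βi)) decision can be sketched in a few lines; the posture names and matching degrees below are illustrative assumptions, not values from the system:

```python
def min_max_classify(betas):
    """Fuzzy min-max decision: a model's support is the minimum of its
    per-feature matching degrees, and the winner is the model with the
    maximum support, i.e. Max over models of Min over features."""
    supports = {model: min(degrees) for model, degrees in betas.items()}
    return max(supports, key=supports.get), supports

# assumed matching degrees of a detected posture against three models
betas = {
    "victory": [0.9, 0.8, 0.7],
    "fist":    [0.9, 0.2, 0.9],
    "open":    [0.4, 0.5, 0.6],
}
winner, supports = min_max_classify(betas)
```

Note how one badly matching feature (0.2 for "fist") vetoes an otherwise strong candidate, which is exactly the conservative behavior the min operator provides.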
- Circular Fuzzy Neural Networks (CFNNs): convert the coordinate model to an FHPM. 3 different NNs for the 3 feature groups; 15 hidden layer neurons; 4 or 5 output layer neurons; 45 inputs (= 15 coordinate triplets), but only 9 inputs connected to each hidden neuron.
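The trimmed circular connectivity can be expressed as a binary mask over the full 15x45 input weight matrix; this is a sketch under the assumption that hidden neuron j sees the triplets of the 3 circularly neighboring points centered on point j (9 of the 45 inputs):

```python
import numpy as np

def circular_mask(n_points=15, coords=3, neighbors=3):
    """Connectivity mask for the trimmed CFNN input layer: hidden
    neuron j is connected only to the coordinate triplets of the
    `neighbors` points around point j, indices taken circularly."""
    mask = np.zeros((n_points, n_points * coords), dtype=int)
    half = neighbors // 2
    for j in range(n_points):
        for k in range(-half, half + 1):
            p = (j + k) % n_points                    # circular neighbor
            mask[j, p * coords:(p + 1) * coords] = 1  # its x, y, z inputs
    return mask

mask = circular_mask()
```

Multiplying the input weight matrix elementwise by `mask` keeps only 9 of 45 connections per hidden neuron, which is the source of the reduced training time.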
The Experiments
- Six hand models
- Separate training and testing sets
- Training parameters: learning rate 0.8, coefficient of the momentum method 0.5, error threshold 0.1, SL: small
- 3 experiments:
  - The first and second experiments compare the speed of the training using the clustered and the original unclustered data, and the accuracy of the trained system, for a given clustering distance (0.5).
  - The third experiment compares the necessary training time and the accuracy of the trained system for different clustering distances.
- The first two experiments were conducted on an average PC (Intel Pentium 4 CPU, 3.00 GHz, 1 GB RAM, Windows XP SP3), while the third experiment was conducted on another PC (Intel Core 2 Duo CPU T5670, 1.80 GHz, 2 GB RAM, Windows 7 32-bit).
Experimental Results: Required Training Time (first experiment)

Time required per error threshold interval:
Network type  | 0.5-0.25        | 0.25-0.2        | 0.2-0.15          | 0.15-0.12
Unclustered A | 28 min          | 39 min          | 1 h 17 min        | 2 h 24 min
Unclustered B | 50 min          | 2 h 14 min      | 2 h 28 min        |
Unclustered C | 53 min          | 52 min          | 2 h 40 min        |
Clustered A   | 16 min (42.86%) | 25 min (35.9%)  | 1 h 14 min (3.9%) | 1 h 18 min (45.8%)
Clustered B   | 32 min (36%)    | 1 h 3 min (52.9%) | 1 h 1 min (58.8%) |
Clustered C   | 31 min (41.5%)  | 46 min (11.5%)  | 58 min (63.75%)   |
Experimental Results: Another Training Session with Only One Error Threshold Interval (second experiment)

Network type  | Time for interval 0.5-0.12 | Speed increase
Unclustered A | 4 hours 27 minutes         | 51.6%
Clustered A   | 2 hours 9 minutes          |
Unclustered B | 3 hours 8 minutes          | 27.1%
Clustered B   | 2 hours 22 minutes         |
Unclustered C | 4 hours 5 minutes          | 18%
Clustered C   | 3 hours 21 minutes         |
Experimental Results: Comparative Analysis of the Results of the Trainings of the Two Sessions

Measured attribute                             | Unclustered         | Clustered          | Difference in ratio
First experiment: total time spent on training | 14 hours 38 minutes | 8 hours 8 minutes  | 44.4% decrease
First experiment: classification accuracy      | 98.125%             | 95.2%              | 2.9% decrease
Second experiment: total time spent on training| 11 hours 41 minutes | 7 hours 52 minutes | 32.5% decrease
Second experiment: classification accuracy     | 98.125%             | 95.83%             | 2.3% decrease
Experimental Results: The Quantity of Clusters Resulting from Multiple Clustering Steps for Different Clustering Distances

Clustering distance | Open hand | Fist | Three | Point | Thumb-up | Victory | Σ
Unclustered | 20 | 20 | 20 | 20 | 20 | 20 | 120
d = 0.5 | 10 | 13 | 4 | 7 | 4 | 5 | 42
d = 0.4 | 13 | 16 | 5 | 9 | 5 | 8 | 55
d = 0.35 | 13 | 17 | 5 | 12 | 10 | 8 | 65

(Third experiment)
Experimental Results: Comparative Analysis of the Characteristics of the Differently Clustered Data Sets

Data set | Measured attribute | Measured value | Difference in ratio
Unclustered | Total time spent on training | 6 h 30 min | –
Unclustered | Average classification accuracy | 97% | –
d = 0.5 | Total time spent on training | 3 h 57 min | 39% decrease
d = 0.5 | Average classification accuracy | 95.2% | 1.8% decrease
d = 0.4 | Total time spent on training | 4 h 22 min | 32.8% decrease
d = 0.4 | Average classification accuracy | 97% | 0% decrease
d = 0.35 | Total time spent on training | 5 h 46 min | 11.1% decrease
d = 0.35 | Average classification accuracy | 97% | 0% decrease

(Third experiment)
Experimental Results: Classification Accuracy of the Clustered Data Sets

(Entries: number of correctly classified samples / number of all samples. UC = unclustered.)

Hand posture type | UC | d = 0.5 | d = 0.4 | d = 0.35
Open hand | 77/80 | 76/80 | 76/80 | 76/80
Fist | 72/80 | 77/80 | 76/80 | 76/80
Three | 78/80 | 74/80 | 79/80 | 80/80
Point | 80/80 | 77/80 | 78/80 | 79/80
Thumb-up | 80/80 | 78/80 | 80/80 | 78/80
Victory | 79/80 | 75/80 | 77/80 | 77/80
Average (in ratio) | 97% | 95.2% | 97% | 97%

(Third experiment)
References to the examples

– Tusor, B., A.R. Várkonyi-Kóczy, "Reduced Complexity Training Algorithm of Circular Fuzzy Neural Networks," Journal of Advanced Research in Physics, 2012.
– Tusor, B., A.R. Várkonyi-Kóczy, I.J. Rudas, G. Klie, G. Kocsis, "An Input Data Set Compression Method for Improving the Training Ability of Neural Networks," in CD-ROM Proc. of the 2012 IEEE Int. Instrumentation and Measurement Technology Conference, I2MTC'2012, Graz, Austria, May 13-16, 2012, pp. 1775-1783.
– Tóth, A.A., A.R. Várkonyi-Kóczy, "A New Man-Machine Interface for ISpace Applications," Journal of Automation, Mobile Robotics & Intelligent Systems, Vol. 3, No. 4, pp. 187-190, 2009.
– Várkonyi-Kóczy, A.R., B. Tusor, "Human-Computer Interaction for Smart Environment Applications Using Fuzzy Hand Posture and Gesture Models," IEEE Trans. on Instrumentation and Measurement, Vol. 60, No. 5, pp. 1505-1514, May 2011.
Conclusions

– SC and NN based methods can offer solutions to many "unsolvable" cases, though with a burden of convergence and complexity problems.
– New training and clustering procedures are proposed which can advantageously be used in the supervised training of neural networks for classification.
– Idea: reduce the size of the training sample set in a way that has little (or no) impact on its training ability.
– The clustering is based on the k-means method, with the main difference in the assignment step: each sample is assigned to the first cluster that is "near enough".
– As a result, for classification problems, the complexity of the training algorithm (and thus the training time) of neural networks can be significantly reduced.
– Open questions:
  – dependency of the decrease in classification accuracy and training time on the type of ANN
  – the optimal clustering distance
  – generalization of the method towards other types of NNs, problems, etc.
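The modified assignment step described in the conclusions can be sketched as follows. This is a minimal illustration, assuming Euclidean distance and a running-mean center update; the paper's exact update rule and data representation may differ, and all names are ours:

```python
import math

def cluster_samples(samples, d):
    """Greedy variant of the k-means assignment step: each sample joins
    the FIRST existing cluster whose center is 'near enough' (within
    distance d); otherwise it seeds a new cluster. Returns the cluster
    centers, which can replace the raw samples as a reduced training set."""
    centers = []   # current cluster centers
    counts = []    # number of samples assigned to each cluster
    for x in samples:
        for i, c in enumerate(centers):
            if math.dist(x, c) <= d:
                # Assign to the first near-enough cluster (not the nearest,
                # as plain k-means would) and update its running-mean center.
                counts[i] += 1
                n = counts[i]
                centers[i] = tuple((ci * (n - 1) + xi) / n
                                   for ci, xi in zip(c, x))
                break
        else:
            centers.append(tuple(x))
            counts.append(1)
    return centers

# Two tight groups of samples collapse to two cluster centers:
centers = cluster_samples([(0, 0), (0.1, 0), (5, 5), (5, 5.1)], d=1.0)
print(len(centers))   # → 2
```

Note that a smaller clustering distance d yields more (smaller) clusters, which matches the trend in the third experiment: d = 0.35 produced 65 clusters versus 42 for d = 0.5.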