
CHAPTER – 2

ARTIFICIAL NEURAL NETWORKS


2.1 ARTIFICIAL NEURAL NETWORK

An artificial neural network (ANN) is a computational model based on the structure and functions of biological neural networks. A neural network is a massively parallel distributed processor made of simple processing units called artificial neurons. Artificial neural networks differ considerably from biological networks, although many of the concepts and characteristics of biological systems are faithfully reproduced in the artificial systems. Artificial neural networks are a type of non-linear processing system that is well suited to a wide range of tasks, especially tasks for which no explicit algorithm exists. An ANN can be trained to solve certain problems using a teaching method and sample data; in this way, identically constructed ANNs can be used to perform different tasks depending on the training they receive. With proper training, ANNs are capable of generalization: the ability to recognize similarities among different input patterns, especially patterns that have been corrupted by noise. Artificial neural network models can be categorized into two basic types, namely the one neuron model and the perceptron model.

2.1.1 ONE NEURON MODEL

In this model, the neuron has several inputs and one output. Each input is associated with a weight wi. The neuron first calculates the sum of the products of the inputs and their respective weights. This summation is then the input to an activation function, which calculates the output.

Figure 2.1. Model of a neuron
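The computation described above can be sketched in a few lines of code. The sigmoid activation and the specific weight values below are illustrative assumptions, not values from the text; any activation function could be substituted.

```python
import math

def neuron_output(inputs, weights, bias=0.0):
    """One neuron model: weighted sum of the inputs, then an activation.
    The sigmoid used here is one common choice of activation function."""
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-s))  # sigmoid squashes the sum into (0, 1)

# Two inputs with illustrative weights and bias
y = neuron_output([0.5, 0.3], [0.4, -0.2], bias=0.1)
```

The output always lies between 0 and 1 for the sigmoid; other activations (step, tanh, linear) change only the final line.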


2.1.2 PERCEPTRON MODEL

A more complex model called perceptron model (Kasabov et al. 1996) was later

introduced to extend the one neuron model. The concept of perceptron was the starting point for

the development of neural networks. A perceptron element consists of a single node which

receives weighted inputs and produces results using thresholds according to certain rules. The

perceptron is efficiently used to classify linearly separable data but is not suitable for nonlinear

data.
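As a sketch of how such a threshold element classifies linearly separable data, the snippet below trains a perceptron with the classic error-correction rule on the logical AND function, which is linearly separable. The learning rate and epoch count are illustrative assumptions.

```python
def predict(w, b, x):
    """Perceptron output: 1 if the weighted sum exceeds the threshold, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def perceptron_train(samples, lr=0.1, epochs=20):
    """Error-correction rule: nudge the weights whenever a sample is misclassified."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, t in samples:
            err = t - predict(w, b, x)          # 0 when correct, +/-1 when wrong
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Logical AND is linearly separable, so the perceptron can learn it exactly
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = perceptron_train(data)
```

The same routine would fail to converge on XOR, the standard example of nonlinearly separable data.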

2.2 ARCHITECTURE OF NEURAL NETWORKS

A neural network consists of a number of interconnected nodes, or neurons. Each node is a simple processing element that responds to the weighted inputs it receives from the other nodes. The arrangement of the nodes or neurons is referred to as the network architecture. A number of parameters must be decided upon when designing a neural network, among them the number of layers, the number of nodes or neurons per layer, and the number of training iterations. Some of the more important parameters in terms of training and network capacity are the number of hidden neurons, the learning rate, and the momentum parameter. The network can learn more complex problems when the number of hidden layers is increased.
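One way to make these design parameters concrete is to represent the architecture as a list of layer sizes and initialise one weight matrix per connection between adjacent layers. The layer sizes and the weight range below are illustrative assumptions.

```python
import random

def build_network(layer_sizes, seed=0):
    """Create random weight matrices for a fully connected feed-forward network.
    layer_sizes such as [4, 8, 3] means 4 inputs, one hidden layer of
    8 neurons, and 3 output neurons."""
    rng = random.Random(seed)
    weights = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # one row of incoming weights per neuron in the next layer
        weights.append([[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                        for _ in range(n_out)])
    return weights

net = build_network([4, 8, 3])
```

Adding a hidden layer is then just a matter of extending the list, e.g. `[4, 8, 8, 3]`, which matches the point above about learning more complex problems with more hidden layers.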

2.3 TRAINING OF NEURAL NETWORK

Once a network has been structured for a particular application, it is ready to be trained. To start this process, the initial weights are chosen randomly; then the training, or learning, begins. There are two approaches to training: supervised and unsupervised. Supervised training involves a mechanism for providing the network with the desired output, either by manually "grading" the network's performance or by providing the desired outputs together with the inputs. In unsupervised training, the network has to make sense of the inputs without outside help.


2.3.1 SUPERVISED TRAINING

In supervised training, both the inputs and the outputs are provided. The network processes the inputs and compares its resulting outputs against the desired outputs. Errors are then propagated back through the system, causing it to adjust the weights that control the network. This process occurs over and over as the weights are continually tweaked. The set of data that enables the training is called the "training set". During the training of a network, the same set of data is processed many times as the connection weights are progressively refined.

2.3.2 UNSUPERVISED TRAINING

The other type of training is called unsupervised training. In unsupervised training, the network is provided with inputs but not with desired outputs. The system itself must then decide which features it will use to group the input data. This is often referred to as self-organization or adaptation.

2.4 FEED FORWARD BACK PROPAGATION ALGORITHM FOR TRAINING OF ANN

Since the real uniqueness, or 'intelligence', of the network resides in the values of the weights between neurons, we need a method of adjusting the weights to solve a particular problem. For this type of network, the most common learning algorithm is called Back Propagation (BP). A back propagation network learns by example; that is, we must provide a learning set that consists of input examples and the known-correct output for each case. We use these input-output examples to show the network what type of behaviour is expected, and the back propagation algorithm allows the network to adapt.

The back propagation learning process works in small iterative steps: one of the example cases is applied to the network, and the network produces some output based on the current state of its synaptic weights (initially, the output will be random). This output is compared to the known-good output, and a mean-squared error signal is calculated. The error value is then propagated backwards through the network, and small changes are made to the weights in each layer. The weight changes are calculated to reduce the error signal for the case in question. The whole process is repeated for each of the example cases, then back to the first case again, and so on. The cycle is repeated until the overall error value drops below some pre-determined threshold. The network never learns the ideal function exactly; rather, it approaches it asymptotically.
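A minimal sketch of this iterative loop for a network with one hidden layer is given below. The sigmoid activation, the XOR training set, the learning rate, and the epoch count are all illustrative assumptions, and how far the error falls depends on the random initialisation.

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(w_h, w_o, x):
    """Forward pass; each weight row carries a trailing bias weight."""
    xb = x + [1.0]
    h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
    y = sigmoid(sum(w * v for w, v in zip(w_o, h + [1.0])))
    return h, y

def mse(w_h, w_o, samples):
    """Mean-squared error over a set of example cases."""
    return sum((t - forward(w_h, w_o, x)[1]) ** 2 for x, t in samples) / len(samples)

def train(samples, n_hidden=4, lr=0.5, epochs=3000, seed=1):
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, t in samples:
            h, y = forward(w_h, w_o, x)
            # error terms: output layer first, then propagated back to hidden layer
            d_o = (t - y) * y * (1 - y)
            d_h = [d_o * w_o[j] * h[j] * (1 - h[j]) for j in range(n_hidden)]
            hb, xb = h + [1.0], x + [1.0]
            # small weight changes that reduce the error for this case
            for j in range(n_hidden + 1):
                w_o[j] += lr * d_o * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_h[j][i] += lr * d_h[j] * xb[i]
    return w_h, w_o

xor = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
w_h, w_o = train(xor)
```

In practice the outer loop would test `mse` against the pre-determined threshold from the text and stop once the error drops below it, rather than running a fixed number of epochs.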

2.5 FACTORS AFFECTING THE PERFORMANCE OF THE NEURAL NETWORK

The factors affecting the performance of the neural network in generalizing, that is, its capability to interpolate and extrapolate to data that it has not seen before, are as follows.

2.5.1 NUMBER OF NODES

The mathematical structure can be made very flexible, and the neural network can be used for a wide range of applications, by using a large number of simple processing elements. However, this may not be necessary for all applications, because very simple topologies have been investigated using a small number of data points. The network's ability to represent the training data increases with the number of nodes in the hidden layer(s), at the expense of its capability to generalize.

2.5.2 SIZE OF TRAINING DATA SET

The data set used must be representative of the entire distribution of values corresponding to a particular class. If the distribution of the data in feature space is not covered sufficiently, the network may not classify new data accurately. Therefore, a sufficient amount of data is often required for training, and researchers are often concerned with finding the minimum size of data set necessary. However, large training data sets also require longer training times. Several modifications to the multilayer perceptron (MLP) training algorithm have been introduced to speed up the training process, including the momentum term, the delta-bar-delta rule, and optimization procedures.
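Of the modifications mentioned, the momentum term is the simplest to sketch: a fraction of the previous weight update is added to the current one, which damps oscillation and speeds descent along consistent gradient directions. The coefficients and the quadratic toy objective below are illustrative assumptions; the delta-bar-delta rule is not shown.

```python
def momentum_step(w, grad, velocity, lr=0.1, momentum=0.9):
    """One gradient-descent step with a momentum term:
    v <- momentum * v - lr * grad, then w <- w + v."""
    v = [momentum * vi - lr * gi for vi, gi in zip(velocity, grad)]
    w = [wi + vi for wi, vi in zip(w, v)]
    return w, v

# Toy objective f(w) = w^2, gradient 2w, starting from w = 1
w, v = [1.0], [0.0]
for _ in range(100):
    w, v = momentum_step(w, [2 * w[0]], v)
```

With momentum set to 0 this reduces to plain gradient descent; nonzero momentum carries the update through flat regions of the error surface.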

2.5.3 TRAINING TIME

The generalizing capability of the network is affected by the time taken to train it. A network trained on a specific data set for a longer time may classify those data more accurately; however, its capability to classify previously unseen data decreases as a consequence. An overtrained network is therefore able to memorize the training data but fails to generalize when it is applied to different data sets. The accuracy of a trained neural network is evaluated with a testing data set, which allows the accuracy of the trained network to be assessed. The choice of data for testing should, like the training data, be representative of the entire distribution of values corresponding to a given class. The available data is generally divided into training and testing data sets, and the test data is used to evaluate the performance of the model.
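A common way to carry out this division is a random shuffle-and-split, as sketched below. The 70/30 split fraction and the toy data set are illustrative assumptions.

```python
import random

def split_data(data, test_fraction=0.3, seed=0):
    """Shuffle the data set and divide it into training and testing subsets."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(model, test_set):
    """Fraction of test samples that the model labels correctly."""
    return sum(1 for x, t in test_set if model(x) == t) / len(test_set)

# Toy data: feature x with label 1 when x >= 5
data = [(x, int(x >= 5)) for x in range(10)]
train_set, test_set = split_data(data)
acc = accuracy(lambda x: int(x >= 5), test_set)
```

Shuffling before splitting helps both subsets remain representative of the class distribution, which is the requirement stated above.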

2.6 ARTIFICIAL NEURAL NETWORKS IN REMOTE SENSING

Artificial neural networks (ANNs) have become an important tool in the analysis of remotely sensed data, mainly due to their ability to learn complex relationships. Research has shown that ANNs may be used to classify remotely sensed data more accurately (Gopal et al. 1999; Liu et al. 2003; Murthy et al. 2003). Since artificial neural networks are capable of adjusting their synaptic weights to adapt to the environment, they are able to deal with incomplete information and provide responses under uncertainty.

Artificial neural networks can be used for surface water quality assessment (Gross et al. 1999), retrieval of soil moisture (Chang et al. 2000; Del Frate et al. 2003), estimation of crop variables, prediction of crop yield, classification of different crops, etc.

Artificial neural networks are also used to detect disease in plants. Aji et al. (2013) applied an ANN to detect pathogens that decrease oil production. Xiaoli Li and Yong He (2008) applied an ANN in their study on tea leaves; they were able to discriminate low-quality tea leaves and obtained a good accuracy of 77.3% in classifying all three tea gardens using ANN models. ANNs have also been used for the identification of plant viruses. The results obtained indicated that methods using ANNs can be a reliable tool, very helpful in such analyses; it was therefore suggested that ANN models be used as an alternative to traditional methods for verifying large amounts of data (Glezakos et al. 2010).

Artificial neural networks are widely used in the field of agriculture, for example in the classification of hard red wheat by feed-forward back-propagation neural networks (Chen et al. 1995), tomato maturity evaluation using colour image analysis (Choi et al. 1995), single wheat kernel colour classification (Wang 1999), retrieval of crop parameters of spinach (Pandey et al. 2010), and estimation of rice crop variables (Gupta et al. 2015). Nakano et al. (1992) developed a method that classifies the quality of the external appearance of apples using a neural network, which can be used to evaluate apple quality.
