International Journal of Machine Tools & Manufacture 42 (2002) 663–674
Selecting an artificial neural network for efficient modeling and accurate simulation of the milling process

Jorge F. Briceno a, Hazim El-Mounayri a,*, Snehasis Mukhopadhyay b

a Mechanical Engineering Department at Indiana University Purdue University Indianapolis (IUPUI), 723 W. Michigan Street, SL 260, Indianapolis, IN 46202-5132, USA
b Department of Computer and Information Science at IUPUI, 723 W. Michigan Street, Indianapolis, IN, USA

Received 16 August 2001; accepted 15 January 2002
Abstract
In this paper, two supervised neural networks are used to estimate the forces developed during milling. These two Artificial Neural Networks (ANNs) are compared based on a cost function that relates the size of the training data to the accuracy of the model. Training experiments are screened based on design of experiments. Verification experiments are conducted to evaluate these two models. It is shown that the Radial Basis Network model is superior in this particular case. Orthogonal design, and specifically equally spaced dimensioning, proved to be a good way to select the training experiments. © 2002 Elsevier Science Ltd. All rights reserved.
Keywords: End milling; Artificial neural networks; Back propagation; Radial basis
1. Introduction
As one of the most useful methods of metal cutting, the milling process attempts to remove an amount of material through chip formation by the two continuous motions of a tool and a workpiece (see Fig. 1). In this
Fig. 1. Flat-end milling process.
* Corresponding author. Tel.: +1-317-278-3320; fax: +1-317-274-9744.
E-mail address: [email protected] (H. El-Mounayri).
0890-6955/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved.
PII: S0890-6955(02)00008-1
case, the tool has a rotational motion (expressed by spindle speed) and the workpiece a linear movement (expressed by feed rate). The cutting edge is in contact with the material at many points, which change depending on the position of the edge relative to the material. This makes the present process involved in terms of operational variables. Many parameters have to be defined to conduct this operation. Among the principal ones are spindle speed (tool rotational velocity), feed rate (workpiece velocity), diameter of the tool, helix angle, radial depth of cut (RDC), axial depth of cut (ADC), rake angle, clearance angle and number of flutes. These variables, together with the tool and workpiece materials, define the state of cutting, which controls the process parameters. The latter include tool wear, tool life, surface finish, etc. The forces developed during the milling process can be used to directly or indirectly measure/estimate such process parameters. In general, excessive cutting forces result in low product quality, while small cutting forces often indicate low machining efficiency [1]. Thus, controlling these forces is of paramount importance.
The majority of milling operations have been carried out based on cutting conditions determined from previous experience and/or existing machining data. On the
other hand, researchers have been trying to develop mathematical models that would predict the cutting forces based on the geometry and physical characteristics of the process. Such prediction could then be used to optimize the process. However, due to its complexity, the milling process still represents a challenge to the modeling and simulation research effort. In fact, most of the research work reported in this regard, which is based on either analytical or semi-empirical approaches, has in general shown only limited levels of accuracy and/or generality.
In the present paper, a different approach, based on advanced artificial intelligence techniques, is implemented and tested. More specifically, two different neural networks are used to predict the forces developed during end milling. The networks are then compared and the best network is selected based on certain criteria.
2. Literature review
This relatively new methodology of Artificial Neural Networks (ANNs), inspired by biological nervous systems, has found application in many real-world problems. One of the first engineering applications was reported by Minsky and Papert, who developed perceptrons in 1969. The field then stayed dormant until about 1986, when the PDP group comprising Rumelhart and McClelland [2] published a two-volume book on explorations in the microstructure of cognition. It is only in the past few years that this methodology was implemented in metal-cutting operations. In [3], a feed-forward neural network algorithm is implemented to predict flank wear in orthogonal turning; in this case, feed rate, cutting speed and force ratio are used as inputs. Liu and Wang [4] also propose a back propagation (BP) ANN for on-line modeling of the milling system. However, this study has several limitations, the most important of which is the use of a single machining parameter as the variable input. In [5], a more efficient model is created using a BP ANN (with the Levenberg–Marquardt approach); in this case, three inputs are considered, with different levels for each parameter. This approach has the disadvantage of requiring too many experiments to train the ANN, which, in terms of industrial usability, is unattractive and expensive.
Radial Basis Networks (RBNs), a neural network architecture different from the multi-layer BP ANN, have been used mainly for pattern recognition. However, recent studies have indicated that this important network can be successfully used as a function modeler as well. Cook and Chiu [6] used a radial basis network as a framework to establish some network improvements considering a time series model of a manufacturing process. Cheng and Lin [7] used three ANNs to estimate bending angles formed by laser; the RBN proved to be superior to the other models. Elanayar and Shin [8] utilized an RBN to predict tool wear based on certain machining conditions. A more general representation of the milling process cannot be found in the literature. In addition, no work has been conducted yet to evaluate and compare different artificial neural networks used to model the milling process.
3. Artificial neural network models of the milling process
In the current work, two supervised neural networks for modeling the milling process are compared. The first one is a back propagation (BP) neural network with log-sigmoid transfer functions in the hidden layers and a linear transfer function in the output layer; the second is a radial basis network (RBN) with Gaussian activation functions. The first ANN is very popular, especially in the area of manufacturing modeling, as its design and operation are relatively simple. The radial basis network has some additional advantages such as rapid convergence and lower error. In particular, most commonly used RBNs involve fixed basis functions with linearly appearing unknown parameters in the output layer. In contrast, multi-layer BP ANNs involve adjustable basis functions, which result in nonlinearly appearing unknown parameters. It is commonly known that linearity in parameters in an RBN allows the use of least-squares-error-based updating schemes that have faster convergence than the gradient-descent methods used to update the nonlinear parameters of a multi-layer BP ANN. On the other hand, it is also known that the use of fixed basis functions in an RBN results in exponential complexity in terms of the number of parameters, while the adjustable basis functions of a BP ANN can lead to much less complexity in terms of the number of parameters or network size [9]. However, in practice, the number of parameters in an RBN starts becoming unmanageably large only when the number of input features increases beyond about 10 or 20, which is not the case in our study. Hence, the use of an RBN was practically possible for our problem. The MATLAB Neural Network Toolbox was used as a platform to create the networks.
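To illustrate the linearity-in-parameters argument above, the following sketch (not the paper's MATLAB code; an illustrative Python/NumPy rendering with hypothetical data, centers and names) fits the output weights of a fixed-basis RBN in a single least-squares solve:

```python
# Illustrative sketch: with fixed Gaussian basis functions, the RBN output
# weights appear linearly, so they can be obtained in one least-squares
# solve rather than by iterative gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(35, 3))              # 35 experiments, 3 inputs
T = np.sin(X @ np.array([2.0, 1.0, 3.0]))[:, None]   # toy target function

centers = X.copy()      # one basis function centered on each training point
spread = 0.2            # spread factor V (the paper considers 0.2, 0.5, 0.8)

# Hidden-layer activations: Gaussian of the distance to each center
d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
H = np.exp(-d2 / spread**2)

# Linear output layer: solve H @ W = T by least squares in one step
W, *_ = np.linalg.lstsq(H, T, rcond=None)
print("max training error:", np.abs(H @ W - T).max())
```

Because H is fixed once the centers and spread are chosen, this single solve trains the entire output layer, which is the source of the rapid convergence mentioned above.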
3.1. Back-propagation neural network (BPNN)
Since the objective is to evolve a model that relates selected inputs to outputs, the BPNN constitutes an excellent tool to approximate such a function. The general network topology is shown in Fig. 2. This network is composed of several neurons or processing elements (PEs) operating in parallel. The PEs are arranged in different sections or layers: an input layer, hidden layer(s) and an output layer. Each layer is connected to other layers through the weight lines that
Fig. 2. Back-propagation network topology.
come from each PE. The architecture of each PE is shown in Fig. 3. In general terms, the operation of this type of network can be described in terms of two major phases: the feed-forward phase and the back-propagation phase.
3.1.1. Feed-forward phase
The input patterns are represented by the input PEs; no calculation is made here. The following set of neurons is found in the hidden layer(s). From the ith input PE, the information is conducted to the jth PE in the hidden layer through the weight W_ij. As depicted in Fig. 3, the incoming data at such an element is represented by

a_j = \sum_{i=0}^{n} W_{ij} I_i    (1)

where a_j is the linear combination of each I_i multiplied by W_ij, i.e. the value used in the activation function; I_i is the ith input; W_ij is the weight value from the ith input PE to the jth hidden PE; and n is the number of incoming connections to the jth PE. The value a_j is fed to the squashing function, which gives the output of the jth PE
Fig. 3. Architecture of an individual PE for BP.
to the next layer(s). The output of this element is given by

Y_j = SF_l(a_j)    (2)

where Y_j is the output value of the jth element and SF_l is the squashing function (or activation function) of the lth hidden layer. In this paper, the squashing functions used in the hidden and output layers are the log-sigmoid transfer function and the linear transfer function, respectively. The value of Y_j is propagated through each further layer until the output is generated.
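The feed-forward phase of Eqs. (1) and (2) can be summarized in a short sketch. This is an illustrative Python/NumPy rendering, not the paper's MATLAB implementation; the weights and the input vector are placeholders:

```python
# Sketch of the feed-forward phase of Eqs. (1)-(2) for a 3.n.4 topology
# (log-sigmoid hidden layer, linear output layer). Weights are random
# stand-ins; in the paper they are learned by back-propagation.
import numpy as np

def logsig(a):
    """Log-sigmoid squashing function used in the hidden layer."""
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 5, 4              # 3.5.4 topology
W1 = rng.normal(size=(n_hidden, n_in + 1))   # +1 column for the bias term
W2 = rng.normal(size=(n_out, n_hidden + 1))

def forward(x):
    # Eq. (1): a_j = sum_i W_ij * I_i (bias handled as extra input I_0 = 1)
    a = W1 @ np.concatenate(([1.0], x))
    # Eq. (2): Y_j = SF(a_j), log-sigmoid in the hidden layer
    y_hidden = logsig(a)
    # Linear transfer function in the output layer
    return W2 @ np.concatenate(([1.0], y_hidden))

print(forward(np.array([0.5, 0.3, 0.8])))    # normalized [feed, speed, RDC]
```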
3.1.2. Back-propagation phase
In this phase the learning process is conducted. In general terms, the implementation of BP consists of updating the network weights in the direction in which the performance function decreases most rapidly. Once the output (Y_j) is calculated, it is compared with the target value (t_j). Then the following error is computed:

e_j = \frac{1}{2} (t_j - Y_j)^2    (3)

This error e_j corresponds to just one output PE; therefore the overall error (the vector E) is expressed by

E = (e_1, \ldots, e_j, \ldots, e_k)    (4)

where k is the number of outputs. The error is then transmitted backwards from the output layer to the input layer. The connection weights are updated by each PE, leading the network to converge. Several techniques can be used to conduct this back-propagation; one of the most widely used is the Levenberg–Marquardt technique, which approximates the Hessian matrix by the product of the Jacobian matrix and its transpose. In this way, the weight update is based on the following equation:
W_{ij}^{new} = W_{ij}^{old} - \left[ J^T J + \mu I \right]^{-1} J^T \delta    (5)

where W_{ij}^{new} is the corrected weight for the jth PE coming from the previous layer; W_{ij}^{old} is the previous weight for the jth PE from the previous layer; J is the Jacobian matrix containing the first derivatives of the network errors with respect to the network weights and error signals for the ith pattern; \mu is a scalar factor (when it equals zero, the method is second-order Newton's, while when it is set to a large number, the method becomes gradient descent with a small step size); and \delta is the error signal for the jth PE.
This network offers a good generalization methodology and fast convergence using the Levenberg–Marquardt algorithm. In the same way, regularization is used to improve generalization, through automated regularization based on a Bayesian framework. For this particular case, since the size of the data is relatively
small, and based on White's theorem [10] (which states that one layer with non-linear activation functions is sufficient to map any non-linear functional relationship with a reasonable level of accuracy), a single hidden layer neural network was utilized and the number of weights was kept at around 3/4 of the number of experiments:

Number of weights = (Number of experiments) × (3/4)

Normally this factor is about 1/10, but due to the small size of the data in this particular case a factor of 3/4 was used, which still resulted in more data points than the number of unknown weights.
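For illustration, one Levenberg–Marquardt update of Eq. (5) can be sketched as follows. This is a hypothetical Python/NumPy rendering; the Jacobian and error vector are random stand-ins for the values a real training pass would supply:

```python
# Sketch of one Levenberg-Marquardt weight update, Eq. (5): given the
# Jacobian J of the network errors w.r.t. the weights and the error
# vector delta, the step is -(J^T J + mu*I)^{-1} J^T delta.
import numpy as np

def lm_step(w, J, delta, mu):
    """One LM update; mu -> 0 gives a Newton-like step, a large mu gives
    a small gradient-descent step, as described after Eq. (5)."""
    JtJ = J.T @ J
    step = np.linalg.solve(JtJ + mu * np.eye(JtJ.shape[0]), J.T @ delta)
    return w - step

rng = np.random.default_rng(0)
n_weights, n_patterns = 14, 27               # e.g. topology 3.2.4, 1st set
w = rng.normal(size=n_weights)
J = rng.normal(size=(n_patterns, n_weights)) # placeholder Jacobian
delta = rng.normal(size=n_patterns)          # placeholder error signals
w_new = lm_step(w, J, delta, mu=0.01)
```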
The effect of topology is also studied by considering different cases. The topologies are varied by varying the number of neurons in the hidden layer (n in Fig. 2) between a lower limit of 2 and an upper limit of 3/4 of the total number of experiments. The lower limit was selected based on the fact that one neuron in the hidden layer represents a model in which a linear relation is implied between the inputs and outputs. The following notation is used to describe the topology: 3.n.4, which means 3 inputs, n neurons in the hidden layer and 4 outputs.
3.2. Radial basis network
This neural network utilizes the Gaussian curve to map values. The RBN works considerably well in function approximation: it is very fast in convergence and very simple to define in terms of a number of characteristic parameters.
A radial basis network (RBN), or radial basis function network, is a two-layer, fully interconnected neural network. It has two general characteristics. First, it may require more neurons than the standard feed-forward BP networks. Second, it can be designed in a fraction of the time that it takes to train the aforementioned BP.
A typical RBN is shown in Fig. 4. The network has
Fig. 4. Radial basis network architecture.
n inputs and k outputs. The first layer is connected with the second (internal) layer by weights that come from the input elements and the bias element. Weights from the internal layer to the outputs are also defined. Each element in the internal layer receives an input pattern vector and compares it with the mean weight vector that connects the input with the second layer. This weight vector determines the position of the center of the radial hidden element in the input space. Here, the activation function is similar to a Gaussian density function, defined as follows:

Y_{ki} = e^{-\sum_h (u_{ih} - a_{ih})^2 C / V^2}    (6)

Here Y_{ki} is the response of the ith element in the hidden layer. The weights u_{ih} define the mean value vector associated with each hidden PE, and a_{ih} represent the inputs. The parameter V is the factor that shapes the form of the squashing function and is called the spread factor; C is a constant. The PE architecture of the hidden layer can be seen in Fig. 5.
Finally, the connection weights between the second layer and the output layer are multiplied by the outputs of the internal elements (a linear summation function), giving the output value to be compared with the target vectors:

z_{kj} = \sum_{i=0}^{p} W_{ij} Y_{ki}    (7)

The radial basis network is a very efficient network when function approximation is needed. This artificial neural network has the following characteristics:

1. it is very fast in comparison to back-propagation;
2. it has the ability to represent nonlinear functions;
3. it does not experience the local minima problems of back-propagation.

The RBN is being used for an increasing number of applications, providing a very helpful modeling tool. A sketch of its forward pass is given below.
Fig. 5. Radial basis neuron.
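The forward pass of Eqs. (6) and (7) can be sketched as follows. This is an illustrative Python/NumPy rendering, not the paper's implementation; the centers, output weights and input are placeholders:

```python
# Sketch of the RBN forward pass of Eqs. (6)-(7): each hidden PE compares
# the input pattern with its center (mean weight vector) through a
# Gaussian, and the output layer is a weighted linear summation.
import numpy as np

def rbn_forward(x, centers, W_out, spread, C=1.0):
    # Eq. (6): Y_i = exp(-sum_h (u_ih - a_ih)^2 * C / V^2)
    y_hidden = np.exp(-C * ((centers - x) ** 2).sum(axis=1) / spread**2)
    # Eq. (7): z_j = sum_i W_ij * Y_i (linear summation function)
    return W_out @ y_hidden

rng = np.random.default_rng(0)
centers = rng.uniform(0, 1, size=(10, 3))  # 10 hidden PEs, 3 inputs
W_out = rng.normal(size=(4, 10))           # 4 outputs (MAX, MIN, MEAN, STDV)
print(rbn_forward(np.array([0.5, 0.3, 0.8]), centers, W_out, spread=0.2))
```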
In summary, two parameters need to be defined: the spread factor and the goal factor. The spread factor V has to be specified depending on the particular case at hand. It has to be smaller than the highest limit of the input data and larger than the lowest limit [11]. Based on this, and given that all the training data (as explained in Section 4.4) are mapped between 0 and 1, three values are considered: 0.2, 0.5 and 0.8. The goal factor value is set to zero, since error is a decisive factor in this study.
4. Experimental data for training the ANN models
4.1. Experimental set-up
The three components of the cutting force are measured using a Kistler 9257B dynamometer. They were sampled at 2500 Hz for 10 s each and stored in files in a spreadsheet format. The machine tool used for all the experiments in this work is a FADAL VMC-3016L 4-axis CNC milling machine.
The experiments were conducted using a 1/4 in. diameter, 2-flute, HSS, Do-All end mill. The tool geometry parameters were a 14° rake angle, a 16° primary clearance angle, and a 37.5° helix angle. This is a tool designed specifically for non-ferrous metals like aluminum, and it has a higher rake angle. The data acquisition package used was LabVIEW. The set-up can be seen in Fig. 6.
4.2. Design of experiments
Design of experiments (DOE) is utilized here to determine the optimum number of experiments needed to successfully model the process within the required accuracy. This technique came into the picture as a link between statistical design and engineering knowledge. The literature on experimental design is extensive, and this paper is not intended to cover aspects of experimental design techniques; detailed information can be found in Ross
Fig. 6. Experimental set-up.
[12]. Experimental design is made up of three stages. First, system design: in this phase, the flat-end milling experimental set-up is built, including the dynamometer to measure the required forces. Second, parameter design: here the variables that are involved in the process are evaluated; in this particular case, orthogonal arrays are used to host the variations of process parameters. Third, tolerance design, which is not considered here, as this study aims at comparing two artificial neural networks. The present work constitutes a first step, and eventually further enhancements and refinements would be needed.
4.3. Set of experiments
As noted earlier, there is a number of machining parameters that significantly affect the milling process. Of these parameters, spindle speed, feed rate and depth of cut have been varied in the current experiments, and the cutting force variation with time recorded. Other parameters such as tool diameter, rake angle, etc. are kept constant for the scope of this study. In fact, the selected parameters are very critical in the flat-end milling process and should provide a basis for meaningful results for comparing the two models.
In order to select the data to be used in the training phase, several experimental sets were designed. All these sets represent states, or points, in a 3D space, since only 3 parameters were selected.
4.3.1. First set of experiments
The first set consists of 27 experiments. Three values were selected for each parameter. This approach gives 3³ = 27 experiments (full factorial). The ranges of values were selected based on recommendations given in [13]. Next, DOE is applied. Since no sensitivity relation is known at this stage, an equally spaced division is used in order to set the particular values. This results in the following (enumerated programmatically in the sketch below):

- feed rate (mm/min): range 100–200, selected values 100, 150 and 200;
- spindle speed (rpm): range 600–1800, selected values 600, 1200 and 1800;
- radial depth of cut (%D): range 0–100, selected values 25, 62.5 and 100.
The corresponding space is shown in Fig. 7.
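For reference, this full factorial can be enumerated directly from the levels listed above (a minimal Python sketch; the variable names are ours):

```python
# Sketch of the first training set: a 3x3x3 full factorial over the
# equally spaced parameter levels listed above (values from the paper).
from itertools import product

feed_rates = [100, 150, 200]          # mm/min
spindle_speeds = [600, 1200, 1800]    # rpm
radial_docs = [25, 62.5, 100]         # % of tool diameter

experiments = list(product(feed_rates, spindle_speeds, radial_docs))
assert len(experiments) == 27         # 3^3 full factorial
```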
4.3.2. Second set of experiments
For this set, the work-space is divided as shown in Fig. 8 (bold points are in different RDC planes). Again, an equally spaced division is used. In this set, the number of states in the work-space has increased. In fact, the total number of experiments (full factorial) is then 5³ = 125, 27 of which are already in the first set. This
Fig. 7. First set of experiments.
Fig. 8. Second set of experiments.
reduces the number of additional experiments towards a full factorial to 98. As mentioned above, the second set is represented by equally spaced states inside the defined range, and subsequent models partially cover the rest of the space. The second set has in total 35 experiments: 27 from the first set plus an additional 8 experiments (see Fig. 8).
It is important to point out that these eight additional experiments are in different RDC (radial depth of cut) planes from the ones used in the previous set. Again, they are equally spaced.
4.3.3. Third set of experiments
This set consists of 12 additional experiments (see
Fig. 9), which results in a total of 47 experiments.
4.3.4. Fourth set of experiments
Eighteen additional points inside the range, as shown in Fig. 10, are considered, resulting in a total of 65 experiments.
In summary, four different experimental sets are defined to be used in the training phase. Each set is used to train each one of the ANN models.
Fig. 9. Third set of experiments.
Fig. 10. Fourth set of experiments.
4.3.5. Validation set
This set (made of 20 new experiments) is used to compare the measured values with the ones predicted from the ANNs. These experiments also support the determination of the optimum number of representative training data.
All experiments were performed using the above-mentioned milling machine. Forces in the X-, Y- and Z-directions were measured and are found to be periodic.
4.4. Data pre-processing
After collecting the force components, the resultant force R was calculated using the following equation:

R = \sqrt{F_x^2 + F_y^2 + F_z^2}    (8)
The maximum (MAX), minimum (MIN), mean (MEAN) and standard deviation (STDV) values of this resultant force are calculated for each experiment (as they represent important characteristics of a continuous force pattern). Next, the data are normalized in order to make them suitable for the training process [11]. This was done by mapping each term to a value between 0 and 1 using the following formula:
-
8/3/2019 Selecting an Artificial Neural Network for Efficient Modeling And
7/12
669J.F. Briceno et al. / International Journal of Machine Tools & Manufacture 42 (2002) 663674
Fig. 11. General ANN topology.
N = \frac{(R - R_{min})(N_{max} - N_{min})}{R_{max} - R_{min}} + N_{min}    (9)

where N is the normalized value of the real variable; N_min and N_max are the minimum and maximum values of normalization, respectively; R is the real value of the variable; and R_min and R_max are the minimum and maximum values of the real variable, respectively.
This normalized data was utilized as the inputs (machining conditions) and outputs (characteristics of the resultant force) to train the ANN. In other words, two vectors are formed in order to train the neural network (see Fig. 11):

Input = [feed rate; spindle speed; radial depth of cut];
Output = [MAX; MIN; MEAN; STDV];
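The pre-processing pipeline of Eqs. (8) and (9) can be sketched as follows. This is an illustrative Python/NumPy rendering; the force signals are synthetic stand-ins for the sampled dynamometer data:

```python
# Sketch of the pre-processing of Eqs. (8)-(9): resultant force from the
# three measured components, the four summary outputs, and min-max
# normalization to [0, 1].
import numpy as np

rng = np.random.default_rng(0)
Fx, Fy, Fz = (rng.normal(500, 100, size=25000) for _ in range(3))  # 10 s at 2500 Hz

# Eq. (8): resultant force
R = np.sqrt(Fx**2 + Fy**2 + Fz**2)

# Output features for one experiment: MAX, MIN, MEAN, STDV
features = np.array([R.max(), R.min(), R.mean(), R.std()])

def normalize(r, r_min, r_max, n_min=0.0, n_max=1.0):
    # Eq. (9): map a real value r in [r_min, r_max] onto [n_min, n_max]
    return (r - r_min) * (n_max - n_min) / (r_max - r_min) + n_min
```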
Table 1
Linear regression (R) for the training phase (using BP)

Set    W     MAX    MIN    MEAN   STDV
1st    14    0.987  0.981  0.984  0.992
       21    0.995  0.983  0.993  0.995
2nd    14    0.973  0.965  0.974  0.973
       21    0.983  0.97   0.981  0.974
       28    0.984  0.985  0.986  0.975
3rd    14    0.966  0.917  0.974  0.951
       21    0.973  0.922  0.977  0.958
       28    0.973  0.952  0.987  0.959
       35    0.977  0.955  0.988  0.968
4th    14    0.958  0.912  0.974  0.946
       21    0.967  0.924  0.978  0.953
       28    0.968  0.945  0.983  0.953
       35    0.969  0.933  0.987  0.955
       42    0.97   0.951  0.986  0.956
       49    0.97   0.951  0.988  0.955
Table 2
Error values of BP network, 1st experimental set, topology 3.2.4

MAX     MIN     MEAN    STDV
0.1293  0.0353  0.5958  0.0832
0.4576  0.048   0.1146  0.1396
0.7888  0.088   0.0775  0.2374
0.6513  0.0241  0.2677  0.2231
0.002   0.0002  0.3962  0.0399
0.0837  0.0356  0.2639  0.0494
0.6995  0.2055  0.0747  0.2582
0.5869  0.4021  0.0476  0.164
0.3958  0.2684  0.167   0.1484
0.4416  0.0148  0.4065  0.0945
0.5721  0.0447  0.1487  0.2264
0.1314  0.473   0.1748  0.1092
0.721   0.178   0.0093  0.1913
0.0028  0.0217  0.4942  0.0382
0.5879  1.4483  0.9077  0.109
0.0794  0.0158  0.3518  0.0318
0.4278  0.2741  0.205   0.2454
0.6645  0.0548  0.0371  0.2152
0.4569  0.2025  0.1407  0.1116
0.9767  0.0218  0.1056  0.3863
Table 3
Values to report (mean and standard deviation of the errors in Table 2)

        MAXIMUM  MINIMUM  MEAN    STDV
Mean    0.4428   0.1928   0.2493  0.1551
Stdv    0.2853   0.3261   0.2235  0.0924
5. Results
5.1. Training results
Each experimental set (except the validation set) is used to train each network. This training is repeated for each topology. The performance is measured by the linear regression (R) of each output. With this analysis it is possible to determine the response of the network with respect to the targets. A value of 1 indicates that the network is perfectly simulating the training set, while 0 means the opposite. For all the cases in this study, the value of R (for all output sets) is shown in Table 1.
The case of the RBN showed a perfect fitting pattern (R = 1 for all the cases), as expected, since the goal error factor is set to zero.
5.2. Validation results of the BP model and the RBN model

For each network, the difference between the real value and the predicted value is calculated, producing a matrix of 20 by 4 elements: 20 validation experiments (rows) by 4 output parameters (columns).
Table 4
Results from BP

ERROR (mean and stdv), (real − predicted) × 10² N

Topology          MAXIMUM  MINIMUM  MEAN    STDV
1st Set, 27 Experiments
3.2.4 (w=14)      0.4428   0.1928   0.2493  0.1551  Mean
                  0.2853   0.3261   0.2235  0.0924  Stdv
3.3.4 (w=21)      0.4328   0.2161   0.2168  0.1672  Mean
                  0.2346   0.3018   0.1636  0.0874  Stdv
2nd Set, 35 Experiments
3.2.4 (w=14)      0.3396   0.13     0.2128  0.1073  Mean
                  0.2053   0.3133   0.2173  0.0896  Stdv
3.3.4 (w=21)      0.2726   0.1194   0.1332  0.1082  Mean
                  0.1802   0.2981   0.1711  0.0628  Stdv
3.4.4 (w=28)      0.2554   0.133    0.1404  0.1035  Mean
                  0.1783   0.2325   0.1497  0.0591  Stdv
3rd Set, 47 Experiments
3.2.4 (w=14)      0.3524   0.1127   0.2078  0.1009  Mean
                  0.2144   0.2194   0.1887  0.0844  Stdv
3.3.4 (w=21)      0.2293   0.1128   0.1309  0.09    Mean
                  0.2002   0.1855   0.1397  0.0485  Stdv
3.4.4 (w=28)      0.253    0.1215   0.079   0.0843  Mean
                  0.1815   0.1132   0.0945  0.0471  Stdv
3.5.4 (w=35)      0.2394   0.1309   0.0906  0.0769  Mean
                  0.2018   0.1092   0.1131  0.0478  Stdv
4th Set, 65 Experiments
3.2.4 (w=14)      0.2891   0.0859   0.2262  0.079   Mean
                  0.2028   0.212    0.1934  0.0628  Stdv
3.3.4 (w=21)      0.1905   0.0794   0.138   0.0569  Mean
                  0.187    0.1981   0.1469  0.0424  Stdv
3.4.4 (w=28)      0.1884   0.0998   0.1255  0.0517  Mean
                  0.1742   0.1167   0.1321  0.0429  Stdv
3.5.4 (w=35)      0.1923   0.1307   0.0978  0.0515  Mean
                  0.1661   0.1147   0.1098  0.0332  Stdv
3.6.4 (w=42)      0.1935   0.0809   0.0974  0.0551  Mean
                  0.1733   0.091    0.0797  0.0394  Stdv
3.7.4 (w=49)      0.1988   0.086    0.0892  0.0533  Mean
                  0.1715   0.0893   0.0624  0.0388  Stdv
For each column, the mean and standard deviation are calculated. These two values represent the mean error and standard deviation of each output element, respectively. In this way, a vector of two elements is used to make the comparison. To illustrate the calculations, an example is presented. For the back-propagation network, using the 1st experimental set with topology 3.2.4, the error is calculated as follows:

e_{ij} = |m_{ij} - p_{ij}|, \quad i = 1, \ldots, 20; \; j = 1, \ldots, 4    (10)

where i refers to the experiment number and j refers to the jth output of the network; e_ij is the error value of the ith machining condition state for the jth output; m_ij is the measured value of the ith machining condition state for the jth output; and p_ij is the predicted value of the ith machining condition state for the jth output.
The calculated errors are shown in Table 2. From this
table the mean and standard deviation are calculated for
each column. The reported results are shown in Table
3. This was done for each model, for each topology (in BP) and for each spread-factor value (in the RB network). The results are shown in Table 4 (BP) and Table 5 (RBN).
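The validation statistics of Eq. (10) and Table 3 reduce to a few array operations. A minimal Python/NumPy sketch, with placeholder measured and predicted matrices:

```python
# Sketch of the validation statistics: Eq. (10) gives a 20x4 matrix of
# absolute errors, and Table 3 reports the column-wise mean and standard
# deviation (20 validation experiments, 4 outputs).
import numpy as np

rng = np.random.default_rng(0)
measured = rng.uniform(0, 1, size=(20, 4))                  # placeholder m_ij
predicted = measured + rng.normal(0, 0.002, size=(20, 4))   # placeholder p_ij

e = np.abs(measured - predicted)   # Eq. (10): e_ij = |m_ij - p_ij|
mean_error = e.mean(axis=0)        # one value per output (MAX..STDV)
stdv_error = e.std(axis=0)
```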
6. Methodology used to compare the two Artificial Neural Networks

The selection of the best network is carried out in terms of accuracy and efficiency. The latter is measured by selecting a minimum number of training experiments that results in a sufficiently accurate model. It is known that the larger the training set, the more accurate the evolved model. Consequently, a cost function is needed to evaluate the simultaneous influence of training set size and model accuracy.
Table 5
Results from RBN

ERROR (mean and stdv), (real − predicted) × 10² N

Spread          MAXIMUM  MINIMUM  MEAN    STDV
1st Set, 27 Experiments
0.2   Mean      0.3758   0.3143   0.2954  0.1911
      STDV      0.2893   0.2858   0.1277  0.1071
0.5   Mean      0.5111   0.2909   0.1925  0.1685
      STDV      0.28     0.2207   0.193   0.0819
0.8   Mean      0.7133   0.2084   0.8924  0.1149
      STDV      0.5511   0.2551   0.3299  0.1062
2nd Set, 35 Experiments
0.2   Mean      0.224    0.1147   0.1497  0.09
      STDV      0.1981   0.1621   0.0895  0.0825
0.5   Mean      0.2602   0.2183   0.1813  0.0892
      STDV      0.2129   0.1809   0.1537  0.0793
0.8   Mean      0.2532   0.2048   0.1754  0.0917
      STDV      0.2188   0.1717   0.1452  0.0754
3rd Set, 47 Experiments
0.2   Mean      0.2794   0.0753   0.1379  0.0928
      STDV      0.263    0.0989   0.1046  0.0982
0.5   Mean      0.4704   0.2839   0.2028  0.1197
      STDV      0.4654   0.2399   0.1552  0.1107
0.8   Mean      0.5866   0.2995   0.2051  0.1346
      STDV      0.5228   0.235    0.1761  0.1151
4th Set, 65 Experiments
0.2   Mean      0.2272   0.0337   0.1506  0.0621
      STDV      0.2402   0.0401   0.1068  0.0586
0.5   Mean      2.2885   0.7389   0.3053  0.8367
      STDV      2.5721   0.8868   0.2742  0.9921
0.8   Mean      4.7221   1.4937   0.3416  1.8317
      STDV      5.2183   1.7041   0.2734  2.0805
6.1. Establishment of the cost function
The cost function (C) is set to relate the following parameters:

1. the number of experiments (NE);
2. the error of prediction in terms of two important variables: the maximum resultant force (E_MAX) and the mean resultant force (E_MEAN).

Therefore, the overall cost function is given by

C = \lambda_1 \frac{NE}{N} + \lambda_2 \frac{E_{MAX}}{E} + \lambda_3 \frac{E_{MEAN}}{E}    (11)

where \lambda_i (i = 1, 2, 3) are the weights of each equation term, N is the maximum number of possible experiments (in this case 125, which represents the full factorial condition), and E is the maximum allowed error (which is set to 30 N). The last value was selected based on the fact that this error constitutes a relatively small value compared to the magnitude of the forces developed during the milling experiments conducted here.
Eq. (11) shows that the closer NE is to N, the higher the value of C. This is compensated by the fact that the error would then be much smaller than E. On the other hand, using a small NE, the cost is reduced by the first term but augmented by the last two terms, since the accuracy would be compromised.
In addition, the equation is set to be unitless in order to provide a fair basis of comparison; a short sketch of the computation is given below.
The experimental set that gives the least cost is the one selected to be utilized for the particular ANN model. Then, the two networks are compared.
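A minimal sketch of the cost computation of Eq. (11), using the constants stated above (N = 125, E = 30 N) and the weight values selected in this study (λ1 = 0.8, λ2 = λ3 = 0.5 initially, as discussed below); the example numbers are illustrative only:

```python
# Sketch of the cost function of Eq. (11): C = l1*NE/N + l2*EMAX/E + l3*EMEAN/E
def cost(ne, e_max, e_mean, lam=(0.8, 0.5, 0.5), n_total=125, e_limit=30.0):
    l1, l2, l3 = lam
    return l1 * ne / n_total + l2 * e_max / e_limit + l3 * e_mean / e_limit

# Example: a model trained on 35 experiments with a 20 N maximum error and
# a 10 N mean error on validation (illustrative numbers only)
print(cost(35, 20.0, 10.0))
```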
The weights of each parameter in Eq. (11) are selected based on the needs of this study. Previous studies have been criticized for the number of experiments required in the training phase: normally, the use of artificial neural networks requires a large number of experiments for training. For this very reason, the heaviest weight is \lambda_1 (the term that determines the influence of the number of experiments in the cost function), while \lambda_2 and \lambda_3 are set to smaller, equal values. The reason for choosing equal values for \lambda_2 and \lambda_3 is that E_MAX and E_MEAN represent forces that have the same value in terms of cost relevance. The maximum force is important in this study due to the great significance that this particular variable has in tool breakage, while the mean force indicates the average force experienced during a cutting cycle, giving an indication of the total power used.
Table 6
Cost values (BP)

                           Sets
               1st         2nd         3rd         4th
Cost (λ = 0.5)
W = 14         1.3263      1.14466667  1.23446667  1.274833
W = 21         1.25546667  0.90033333  0.90113333  0.9635
W = 28         N/A         0.88366667  0.85413333  0.939167
W = 35         N/A         N/A         0.8508      0.8995
W = 42         N/A         N/A         N/A         0.900833
W = 49         N/A         N/A         N/A         0.896
Cost (λ = 0.7)
W = 14         1.7877      1.51293333  1.60793333  1.618367
W = 21         1.68853333  1.17086667  1.14126667  1.1825
W = 28         N/A         1.14753333  1.07546667  1.148433
W = 35         N/A         N/A         1.0708      1.0929
W = 42         N/A         N/A         N/A         1.094767
W = 49         N/A         N/A         N/A         1.088
Cost (λ = 0.2)
W = 14         0.6342      0.59226667  0.67426667  0.759533
W = 21         0.60586667  0.49453333  0.54093333  0.635
W = 28         N/A         0.48786667  0.52213333  0.625267
W = 35         N/A         N/A         0.5208      0.6094
W = 42         N/A         N/A         N/A         0.609933
W = 49         N/A         N/A         N/A         0.608
Based on the above, the weights are selected as follows: \lambda_1 = 0.8 and \lambda_2 = \lambda_3 = 0.5 (initially).
Since the effect of E_MAX and E_MEAN in the cost function is important, a sensitivity analysis is carried out to see how the value of the cost function varies when the weights of these terms (\lambda_2, \lambda_3) are set to different values. It is necessary to mention that these values are limited by \lambda_1 = 0.8 (the upper limit, since this factor is the largest weight in Eq. (11)) and 0 (the lowest limit, which corresponds to zero participation). Therefore, three values are considered for \lambda_2 and \lambda_3: 0.5, 0.7 and 0.2.
The cost value is then calculated based on Eq. (11), using the values of E_MAX and E_MEAN from Tables 4 and 5 for BP and RBN, respectively. The cost is calculated for all the values of \lambda_2 and \lambda_3; these two values are represented by \lambda. The results are shown in Tables 6 (BP) and 7 (RBN).
Table 7
Cost values (RBN); results correspond to spread factor = 0.2

Set    Cost (λ = 0.5)  Cost (λ = 0.7)  Cost (λ = 0.2)
1st    1.29146667      1.73893333      0.62026667
2nd    0.84683333      1.09596667      0.47313333
3rd    0.9963          1.2745          0.579
4th    1.04566667      1.29753333      0.66786667
7. Discussion of results
For the training phase, Table 1 shows the effectiveness of the selected ANN architecture for BP. All R-values are over 0.9. This table shows that the more neurons in the hidden layer (higher W), the better the representation (higher R). By the same token, an increased number of experiments results in a reduction in the value of R, which is compensated by the addition of more PEs in the hidden layer. This tendency was expected: when a larger training set is available, more neurons are needed to establish a good model.
Since all R-values are sufficiently high, it is possible to conclude that any of these W combinations in any set can be used to successfully train the neural network. The same applies to the RBN, where all R-values are 1. Furthermore, this indicates that the methodology of DOE can be successfully applied with good results.
From these results, it is possible to state that, based on the training performance, the RBN is better than the BP.
Table 4 shows that the 4th set produces smaller errors than the 1st set. This is because the former set contains more experiments (more information about the process) than the latter one. This table also shows the effect of increasing the number of neurons or PEs in the hidden layer: the more neurons, the more accurate the model for a given experiment set. The magnitudes of the errors are relatively small compared with the forces developed during milling, which in this study vary between 200 and 1000 N.
The results given by the radial basis network indicate that the smallest errors are reached when the spread value is equal to 0.2 for all sets. For this reason, the cost calculation, and therefore the model comparison, is conducted using this particular value.
Tables 6 and 7 show interesting results pertaining to the cost values for BP and RBN, respectively. Each model indicates a different lowest-cost set. For BP, the set with the lowest cost is the third set, with 0.85 and 1.07 for λ = 0.5 and 0.7, respectively; these two values are at W = 35. For λ = 0.2, the lowest cost corresponds to the second set, at W = 28. This trend is represented in Fig. 12, where the cost vs set is plotted.
This tendency is due to the lower weight given to the error term (λ = 0.2), compared with the weight given to the number of experiments (NE). From these results, λ = 0.5 can be considered as the basis of comparison, since this value does not underestimate the participation of the error part in the cost function. Therefore, the best model from this network is the one with 47 experiments, with a cost value of 0.8508 and a topology of 3 inputs, 5 neurons in the hidden layer and 4 outputs.
In the radial basis network, the 2nd set gives the lowest cost regardless of the value of λ (see Fig. 13). Again, λ = 0.5 is selected for comparison purposes. For this network, the best model is the one with 35 experiments, a cost value of 0.8468, and a spread value of 0.2.
Fig. 12. Cost vs sets (back-propagation network, using the lowest cost of each experimental set).
Fig. 13. Cost vs sets (RB network).
Based on these results (BP: cost = 0.8508 with 47 experiments; RBN: cost = 0.8468 with 35 experiments), and the fact that the radial basis network is, for this particular case, about 3 times faster to train than the back-propagation network, the model that best represents the functional relation between the considered milling parameters and the cutting forces is the radial basis network.
The selected network not only gave the lowest cost, but it could be trained with fewer experiments and is much faster. This indicates that, for this particular case, the RBN is more efficient than the BP.
8. Conclusions and future work
In this paper, two supervised neural networks are used to successfully estimate the forces developed during the milling process. Design of experiments, and specifically the orthogonal arrangement, is used to select the experiments to perform and to establish the different sets considered for each ANN model. DOE contributed to increasing the efficiency of the system by drastically reducing the amount of experimental data needed for successful training.
Based on the results of this study, it is possible to conclude that, by taking 5 equally spaced values of the selected milling parameters and applying an orthogonal arrangement, 35 experiments (out of 125) are enough to train and evolve an accurate ANN model of the end milling process. In back-propagation networks, the use of a single hidden layer was shown to work sufficiently well for the process in consideration. However, it is shown that the radial basis network is superior to the back-propagation network in predicting the milling forces, when evaluated in terms of a cost function that combines the cost of experiments with accuracy.
In this study, a cost function is defined based on specific needs. This fitness function can be refined in the future in order to represent more extensively the characteristics of the milling process. In the same way, it is
possible to design a more systematic methodology to select the spread factor in the radial basis network. This could increase the accuracy of the model.
References
[1] H.Y. Feng, N. Su, A mechanistic cutting force model for ball-end milling, Journal of Manufacturing Science and Engineering, November (1998).
[2] D.E. Rumelhart, J.L. McClelland, PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, 2 vols., MIT Press, Cambridge, MA, 1986.
[3] Q. Liu, Y. Altintas, On-line monitoring of flank wear in turning with multi-layered feed-forward neural network, International Journal of Machine Tools & Manufacture 39 (1999) 1945–1959.
[4] Y. Liu, C. Wang, Neural network based adaptive control and optimisation in the milling process, International Journal of Advanced Manufacturing Technology 15 (11) (1999) 791–795.
[5] V. Tandon, Closing the gap between CAD/CAM and optimized CNC end milling, MSME Thesis, Purdue School of Engineering & Technology, 2000.
[6] D. Cook, C. Chiu, Combining a radial basis neural network with time series analysis techniques to predict manufacturing process parameters, Applied Artificial Intelligence 9 (6) (1995) 623–631.
[7] P.J. Cheng, S.C. Lin, Using neural networks to predict bending angle of sheet metal formed by laser, International Journal of Machine Tools & Manufacture 40 (1999) 1185–1197.
[8] S. Elanayar, Y.C. Shin, Design and implementation of tool wear monitoring with radial basis function neural networks, in: Proceedings of the 1995 American Control Conference, Part 3 (of 6), 1995, pp. 1722–1726.
[9] A.R. Barron, Neural net approximation, in: Proceedings of the Seventh Yale Workshop on Adaptive and Learning Systems, 1992, pp. 68–72.
[10] R.C. Eberhart, P. Simpson, R. Dobbins, Computational Intelligence PC Tools, AP Professional, New York, 1996.
[11] H. Demuth, M. Beale, Neural Network Toolbox v3 User's Guide, The MathWorks Inc., USA, 1999.
[12] P.J. Ross, Taguchi Techniques for Quality Engineering, McGraw-Hill, New York, 1988.
[13] R.A. Walsh, McGraw-Hill Machining and Metalworking Handbook, McGraw-Hill, New York, 1994.