International Journal of Machine Tools & Manufacture 42 (2002) 663–674
Selecting an artificial neural network for efficient modeling and accurate simulation of the milling process

Jorge F. Briceno a, Hazim El-Mounayri a,*, Snehasis Mukhopadhyay b

a Mechanical Engineering Department at Indiana University Purdue University Indianapolis (IUPUI), 723 W. Michigan Street, SL 260, Indianapolis, IN 46202-5132, USA
b Department of Computer and Information Science at IUPUI, 723 W. Michigan Street, Indianapolis, IN, USA

Received 16 August 2001; accepted 15 January 2002
Abstract
In this paper, two supervised neural networks are used to estimate the forces developed during milling. These two Artificial Neural Networks (ANNs) are compared based on a cost function that relates the size of the training data to the accuracy of the model. Training experiments are screened based on design of experiments. Verification experiments are conducted to evaluate these two models. It is shown that the Radial Basis Network model is superior in this particular case. Orthogonal design, and specifically equally spaced dimensioning, proved to be a good way to select the training experiments. © 2002 Elsevier Science Ltd. All rights reserved.
Keywords: End milling; Artificial neural networks; Back propagation; Radial basis
1. Introduction
As one of the most useful methods of metal cutting, the milling process attempts to remove an amount of material through chip formation by the two continuous motions of a tool and a workpiece (see Fig. 1). In this
Fig. 1. Flat-end milling process.
* Corresponding author. Tel.: +1-317-278-3320; fax: +1-317-274-9744.
E-mail address: [email protected] (H. El-Mounayri).
0890-6955/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved.
PII: S0890-6955(02)00008-1
case, the tool has a rotational motion (expressed by spindle speed) and the workpiece a linear movement (expressed by feed rate). The cutting edge is in contact with the material at many points, which change depending on the position of the edge relative to the material. This makes the present process involved in terms of operational variables. Many parameters have to be defined to conduct this operation. Among the principal ones are spindle speed (tool rotational velocity), feed rate (workpiece velocity), diameter of the tool, helix angle, radial depth of cut (RDC), axial depth of cut (ADC), rake angle, clearance angle and number of flutes. These variables, together with the tool and workpiece materials, define the state of cutting, which controls the process parameters. The latter include tool wear, tool life, surface finish, etc. The forces developed during the milling process can be used to directly or indirectly measure/estimate such process parameters. In general, excessive cutting forces result in low product quality, while small cutting forces often indicate low machining efficiency [1]. Thus, controlling these forces is of paramount importance.
The majority of milling operations have been carried out based on cutting conditions determined from previous experience and/or existing machining data. On the
other hand, researchers have been trying to develop mathematical models that would predict the cutting forces based on the geometry and physical characteristics of the process. Such prediction could then be used to optimize the process. However, due to its complexity, the milling process still represents a challenge to the modeling and simulation research effort. In fact, most of the research work reported in this regard, which is based on either analytical or semi-empirical approaches, has in general shown only limited levels of accuracy and/or generality.
In the present paper, a different approach, based on advanced artificial intelligence techniques, is implemented and tested. More specifically, two different neural networks are used to predict the forces developed during end milling. The networks are then compared and the best network is selected based on certain criteria.
2. Literature review
This relatively new methodology of Artificial Neural Networks (ANNs), inspired by biological nervous systems, has found application in many real-world problems. One of the first engineering applications was reported by Minsky and Papert, who developed perceptrons in 1969. The field then stayed dormant until about 1986, when the PDP group comprising Rumelhart and McClelland [2] published a two-volume book on explorations in the microstructure of cognition. It is only in the past few years that this methodology was implemented in metal-cutting operations. In [3], a feed-forward neural network algorithm is implemented to predict flank wear in orthogonal turning; in this case, feed rate, cutting speed and force ratio are used as inputs. Liu and Wang [4] also propose a back propagation (BP) ANN for on-line modeling of the milling system. However, this study has several limitations, the most important of which is the use of a single machining parameter as the variable input. In [5], a more efficient model is created using a BP ANN (with the Levenberg–Marquardt approach); in this case, three inputs are considered, with different levels for each parameter. This approach has the disadvantage of requiring too many experiments to train the ANN, which, in terms of industrial usability, is unattractive and expensive.
Radial Basis Networks (RBNs), a neural network architecture different from the multi-layer BP ANN, have been used mainly for pattern recognition. However, recent studies have indicated that this important network can be successfully used as a function modeler as well. Cook and Chiu [6] used a radial basis network as a framework to establish some network improvements considering a time series model of a manufacturing process. Cheng and Lin [7] used three ANNs to estimate bending angles formed by laser; the RBN proved to be superior to the other models. Elanayar and Shin [8] utilized an RBN to predict tool wear based on certain machining conditions. A more general representation of the milling process cannot be found in the literature. In addition, no work has been conducted yet to evaluate and compare different artificial neural networks used to model the milling process.
3. Artificial neural network models of the milling process
In the current work, two supervised neural networks for modeling the milling process are compared. The first one is a back propagation (BP) neural network with log-sigmoid transfer functions in the hidden layers and a linear transfer function in the output layer; the second is a radial basis network (RBN) with Gaussian activation functions. The first ANN is very popular, especially in the area of manufacturing modeling, as its design and operation are relatively simple. The radial basis network has some additional advantages such as rapid convergence and lower error. In particular, most commonly used RBNs involve fixed basis functions with linearly appearing unknown parameters in the output layer. In contrast, multi-layer BP ANNs involve adjustable basis functions, which result in nonlinearly appearing unknown parameters. It is commonly known that linearity in parameters in an RBN allows the use of least-squares-error-based updating schemes that have faster convergence than the gradient-descent methods used to update the nonlinear parameters of a multi-layer BP ANN. On the other hand, it is also known that the use of fixed basis functions in an RBN results in exponential complexity in terms of the number of parameters, while the adjustable basis functions of a BP ANN can lead to much less complexity in terms of the number of parameters or network size [9]. However, in practice, the number of parameters in an RBN starts becoming unmanageably large only when the number of input features increases beyond about 10 or 20, which is not the case in our study. Hence, the use of an RBN was practically possible for our problem. The MATLAB Neural Network Toolbox was used as a platform to create the networks.
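To illustrate the linearity-in-parameters argument above, the following sketch (not the paper's MATLAB code; an illustrative Python/NumPy rendering with hypothetical data, centers and names) fits the output weights of a fixed-basis RBN in a single least-squares solve:

```python
# Illustrative sketch: with fixed Gaussian basis functions, the RBN output
# weights appear linearly, so they can be obtained in one least-squares
# solve rather than by iterative gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(35, 3))              # 35 experiments, 3 inputs
T = np.sin(X @ np.array([2.0, 1.0, 3.0]))[:, None]   # toy target function

centers = X.copy()      # one basis function centered on each training point
spread = 0.2            # spread factor V (the paper considers 0.2, 0.5, 0.8)

# Hidden-layer activations: Gaussian of the distance to each center
d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
H = np.exp(-d2 / spread**2)

# Linear output layer: solve H @ W = T by least squares in one step
W, *_ = np.linalg.lstsq(H, T, rcond=None)
print("max training error:", np.abs(H @ W - T).max())
```

Because H is fixed once the centers and spread are chosen, this single solve trains the entire output layer, which is the source of the rapid convergence mentioned above.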
3.1. Back-propagation neural network (BPNN)
Since the objective is to evolve a model that relates selected inputs to outputs, the BPNN constitutes an excellent tool to approximate such a function. The general network topology is shown in Fig. 2. This network is composed of several neurons or processing elements (PEs) operating in parallel. The PEs are arranged in different sections or layers: an input layer, hidden layer(s) and an output layer. Each layer is connected to other layers through the weight lines that
Fig. 2. Back-propagation network topology.
come from each PE. The architecture of each PE is shown in Fig. 3. In general terms, the operation of this type of network can be described in terms of two major phases: the feed-forward phase and the back-propagation phase.
3.1.1. Feed-forward phase
The input patterns are represented by the input PEs; no calculation is made here. The following set of neurons is found in the hidden layer(s). From the ith input PE, the information is conducted to the jth PE in the hidden layer through the weight W_ij. As depicted in Fig. 3, the incoming data at such an element is represented by

a_j = \sum_{i=0}^{n} W_{ij} I_i    (1)

where a_j is the linear combination of each I_i multiplied by W_ij, i.e. the value used in the activation function; I_i is the ith input; W_ij is the weight value from the ith input PE to the jth hidden PE; and n is the number of incoming connections to the jth PE. The value a_j is fed to the squashing function, which gives the output of the jth PE
Fig. 3. Architecture of an individual PE for BP.
to the next layer(s). The output of this element is given by

Y_j = SF_l(a_j)    (2)

where Y_j is the output value of the jth element and SF_l is the squashing function (or activation function) of the lth hidden layer. In this paper, the squashing functions used in the hidden and output layers are the log-sigmoid transfer function and the linear transfer function, respectively. The value of Y_j is propagated through each further layer until the output is generated.
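The feed-forward phase of Eqs. (1) and (2) can be summarized in a short sketch. This is an illustrative Python/NumPy rendering, not the paper's MATLAB implementation; the weights and the input vector are placeholders:

```python
# Sketch of the feed-forward phase of Eqs. (1)-(2) for a 3.n.4 topology
# (log-sigmoid hidden layer, linear output layer). Weights are random
# stand-ins; in the paper they are learned by back-propagation.
import numpy as np

def logsig(a):
    """Log-sigmoid squashing function used in the hidden layer."""
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 5, 4              # 3.5.4 topology
W1 = rng.normal(size=(n_hidden, n_in + 1))   # +1 column for the bias term
W2 = rng.normal(size=(n_out, n_hidden + 1))

def forward(x):
    # Eq. (1): a_j = sum_i W_ij * I_i (bias handled as extra input I_0 = 1)
    a = W1 @ np.concatenate(([1.0], x))
    # Eq. (2): Y_j = SF(a_j), log-sigmoid in the hidden layer
    y_hidden = logsig(a)
    # Linear transfer function in the output layer
    return W2 @ np.concatenate(([1.0], y_hidden))

print(forward(np.array([0.5, 0.3, 0.8])))    # normalized [feed, speed, RDC]
```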
3.1.2. Back-propagation phase
In this phase the learning process is conducted. In general terms, the implementation of BP consists of updating the network weights in the direction in which the performance function decreases most rapidly. Once the output (Y_j) is calculated, it is compared with the target value (t_j). Then the following error is computed:

e_j = \frac{1}{2} (t_j - Y_j)^2    (3)

This error e_j corresponds to just one output PE; therefore the overall error (the vector E) is expressed by

E = (e_1, \ldots, e_j, \ldots, e_k)    (4)

where k is the number of outputs. The error is then transmitted backwards from the output layer to the input layer. The connection weights are updated by each PE, leading the network to converge. Several techniques can be used to conduct this back-propagation; one of the most widely used is the Levenberg–Marquardt technique, which approximates the Hessian matrix by the product of the Jacobian matrix and its transpose. In this way, the weight update is based on the following equation:
W_{ij}^{new} = W_{ij}^{old} - \left[ J^T J + \mu I \right]^{-1} J^T \delta    (5)

where W_{ij}^{new} is the corrected weight for the jth PE coming from the previous layer; W_{ij}^{old} is the previous weight for the jth PE from the previous layer; J is the Jacobian matrix containing the first derivatives of the network errors with respect to the network weights and error signals for the ith pattern; \mu is a scalar factor (when it equals zero, the method is second-order Newton's, while when it is set to a large number, the method becomes gradient descent with a small step size); and \delta is the error signal for the jth PE.
This network offers a good generalization methodology and fast convergence using the Levenberg–Marquardt algorithm. In the same way, regularization is used to improve generalization, through automated regularization based on a Bayesian framework. For this particular case, since the size of the data is relatively
small, and based on White's theorem [10] (which states that one layer with non-linear activation functions is sufficient to map any non-linear functional relationship with a reasonable level of accuracy), a single hidden layer neural network was utilized and the number of weights was kept at around 3/4 of the number of experiments:

Number of weights = (Number of experiments) × (3/4)

Normally this factor is about 1/10, but due to the small size of the data in this particular case a factor of 3/4 was used, which still resulted in more data points than the number of unknown weights.
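For illustration, one Levenberg–Marquardt update of Eq. (5) can be sketched as follows. This is a hypothetical Python/NumPy rendering; the Jacobian and error vector are random stand-ins for the values a real training pass would supply:

```python
# Sketch of one Levenberg-Marquardt weight update, Eq. (5): given the
# Jacobian J of the network errors w.r.t. the weights and the error
# vector delta, the step is -(J^T J + mu*I)^{-1} J^T delta.
import numpy as np

def lm_step(w, J, delta, mu):
    """One LM update; mu -> 0 gives a Newton-like step, a large mu gives
    a small gradient-descent step, as described after Eq. (5)."""
    JtJ = J.T @ J
    step = np.linalg.solve(JtJ + mu * np.eye(JtJ.shape[0]), J.T @ delta)
    return w - step

rng = np.random.default_rng(0)
n_weights, n_patterns = 14, 27               # e.g. topology 3.2.4, 1st set
w = rng.normal(size=n_weights)
J = rng.normal(size=(n_patterns, n_weights)) # placeholder Jacobian
delta = rng.normal(size=n_patterns)          # placeholder error signals
w_new = lm_step(w, J, delta, mu=0.01)
```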
The effect of topology is also studied by considering different cases. The topologies are varied by varying the number of neurons in the hidden layer (n in Fig. 2) between a lower limit of 2 and an upper limit of 3/4 of the total number of experiments. The lower limit was selected based on the fact that one neuron in the hidden layer represents a model in which a linear relation is implied between the inputs and outputs. The following notation is used to describe the topology: 3.n.4, which means 3 inputs, n neurons in the hidden layer and 4 outputs.
3.2. Radial basis network
This neural network utilizes the Gaussian curve to map values. The RBN works considerably well in function approximation: it is very fast in convergence and very simple to define in terms of a number of characteristic parameters.
A radial basis network (RBN), or radial basis function network, is a two-layer, fully interconnected neural network. It has two general characteristics. First, it may require more neurons than the standard feed-forward BP networks. Second, it can be designed in a fraction of the time that it takes to train the aforementioned BP.
A typical RBN is shown in Fig. 4. The network has
Fig. 4. Radial basis network architecture.
n inputs and k outputs. The first layer is connected with the second (internal) layer by weights that come from the input elements and the bias element. Weights from the internal layer to the outputs are also defined. Each element in the internal layer receives an input pattern vector and compares it with the mean weight vector that connects the input with the second layer. This weight vector determines the position of the center of the radial hidden element in the input space. Here, the activation function is similar to a Gaussian density function, defined as follows:

Y_{ki} = e^{-\sum_h (u_{ih} - a_{ih})^2 C / V^2}    (6)

Here Y_{ki} is the response of the ith element in the hidden layer. The weights u_{ih} define the mean value vector associated with each hidden PE, and a_{ih} represent the inputs. The parameter V is the factor that shapes the form of the squashing function and is called the spread factor; C is a constant. The PE architecture of the hidden layer can be seen in Fig. 5.
Finally, the connection weights between the second layer and the output layer are multiplied by the outputs of the internal elements (a linear summation function), giving the output value to be compared with the target vectors:

z_{kj} = \sum_{i=0}^{p} W_{ij} Y_{ki}    (7)

The radial basis network is a very efficient network when function approximation is needed. This artificial neural network has the following characteristics:

1. it is very fast in comparison to back-propagation;
2. it has the ability to represent nonlinear functions;
3. it does not experience the local minima problems of back-propagation.

The RBN is being used for an increasing number of applications, providing a very helpful modeling tool. A sketch of its forward pass is given below.
Fig. 5. Radial basis neuron.
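The forward pass of Eqs. (6) and (7) can be sketched as follows. This is an illustrative Python/NumPy rendering, not the paper's implementation; the centers, output weights and input are placeholders:

```python
# Sketch of the RBN forward pass of Eqs. (6)-(7): each hidden PE compares
# the input pattern with its center (mean weight vector) through a
# Gaussian, and the output layer is a weighted linear summation.
import numpy as np

def rbn_forward(x, centers, W_out, spread, C=1.0):
    # Eq. (6): Y_i = exp(-sum_h (u_ih - a_ih)^2 * C / V^2)
    y_hidden = np.exp(-C * ((centers - x) ** 2).sum(axis=1) / spread**2)
    # Eq. (7): z_j = sum_i W_ij * Y_i (linear summation function)
    return W_out @ y_hidden

rng = np.random.default_rng(0)
centers = rng.uniform(0, 1, size=(10, 3))  # 10 hidden PEs, 3 inputs
W_out = rng.normal(size=(4, 10))           # 4 outputs (MAX, MIN, MEAN, STDV)
print(rbn_forward(np.array([0.5, 0.3, 0.8]), centers, W_out, spread=0.2))
```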
In summary, two parameters need to be defined: the spread factor and the goal factor. The spread factor V has to be specified depending on the particular case at hand. It has to be smaller than the highest limit of the input data and larger than the lowest limit [11]. Based on this, and given that all the training data (as explained in Section 4.4) are mapped between 0 and 1, three values are considered: 0.2, 0.5 and 0.8. The goal factor value is set to zero, since error is a decisive factor in this study.
4. Experimental data for training the ANN models
4.1. Experimental set-up
The three components of the cutting force are measured using a Kistler 9257B dynamometer. They were sampled at 2500 Hz for 10 s each and stored in files in a spreadsheet format. The machine tool used for all the experiments in this work is a FADAL VMC-3016L 4-axis CNC milling machine.
The experiments were conducted using a 1/4 in. diameter, 2-flute, HSS, Do-All end mill. The tool geometry parameters were a 14° rake angle, a 16° primary clearance angle, and a 37.5° helix angle. This is a tool designed specifically for non-ferrous metals like aluminum, and it has a higher rake angle. The data acquisition package used was LabVIEW. The set-up can be seen in Fig. 6.
4.2. Design of experiments
Design of experiments (DOE) is utilized here to determine the optimum number of experiments needed to successfully model the process within the required accuracy. This technique came into the picture as a link between statistical design and engineering knowledge. The literature on experimental design is extensive, and this paper is not intended to cover aspects of experimental design techniques; detailed information can be found in Ross
Fig. 6. Experimental set-up.
[12]. Experimental design is made up of three stages. First, system design: in this phase, the flat-end milling experimental set-up is built, including the dynamometer to measure the required forces. Second, parameter design: here the variables that are involved in the process are evaluated; in this particular case, orthogonal arrays are used to host the variations of process parameters. Third, tolerance design, which is not considered here, as this study aims at comparing two artificial neural networks. The present work constitutes a first step, and eventually further enhancements and refinements would be needed.
4.3. Set of experiments
As noted earlier, there is a number of machining parameters that significantly affect the milling process. Of these parameters, spindle speed, feed rate and depth of cut have been varied in the current experiments, and the cutting force variation with time recorded. Other parameters such as tool diameter, rake angle, etc. are kept constant for the scope of this study. In fact, the selected parameters are very critical in the flat-end milling process and should provide a basis for meaningful results for comparing the two models.
In order to select the data to be used in the training phase, several experimental sets were designed. All these sets represent states, or points, in a 3D space, since only 3 parameters were selected.
4.3.1. First set of experiments
The first set consists of 27 experiments. Three values were selected for each parameter. This approach gives 3³ = 27 experiments (full factorial). The ranges of values were selected based on recommendations given in [13]. Next, DOE is applied. Since no sensitivity relation is known at this stage, an equally spaced division is used in order to set the particular values. This results in the following (enumerated programmatically in the sketch below):

- feed rate (mm/min): range 100–200, selected values 100, 150 and 200;
- spindle speed (rpm): range 600–1800, selected values 600, 1200 and 1800;
- radial depth of cut (%D): range 0–100, selected values 25, 62.5 and 100.
The corresponding space is shown in Fig. 7.
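For reference, this full factorial can be enumerated directly from the levels listed above (a minimal Python sketch; the variable names are ours):

```python
# Sketch of the first training set: a 3x3x3 full factorial over the
# equally spaced parameter levels listed above (values from the paper).
from itertools import product

feed_rates = [100, 150, 200]          # mm/min
spindle_speeds = [600, 1200, 1800]    # rpm
radial_docs = [25, 62.5, 100]         # % of tool diameter

experiments = list(product(feed_rates, spindle_speeds, radial_docs))
assert len(experiments) == 27         # 3^3 full factorial
```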
4.3.2. Second set of experiments
For this set, the work-space is divided as shown in Fig. 8 (bold points are in different RDC planes). Again, an equally spaced division is used. In this set, the number of states in the work-space has increased. In fact, the total number of experiments (full factorial) is then 5³ = 125, 27 of which are already in the first set. This
Fig. 7. First set of experiments.
Fig. 8. Second set of experiments.
reduces the number of additional experiments towards a full factorial to 98. As mentioned above, the second set is represented by equally spaced states inside the defined range, and subsequent models partially cover the rest of the space. The second set has in total 35 experiments: 27 from the first set plus an additional 8 experiments (see Fig. 8).
It is important to point out that these eight additional experiments are in different RDC (radial depth of cut) planes from the ones used in the previous set. Again, they are equally spaced.
4.3.3. Third set of experiments
This set consists of 12 additional experiments (see
Fig. 9), which results in a total of 47 experiments.
4.3.4. Fourth set of experiments
Eighteen additional points inside the range, as shown in Fig. 10, are considered, resulting in a total of 65 experiments.
In summary, four different experimental sets are defined to be used in the training phase. Each set is used to train each one of the ANN models.
Fig. 9. Third set of experiments.
Fig. 10. Fourth set of experiments.
4.3.5. Validation set
This set (made of 20 new experiments) is used to compare the measured values with the ones predicted from the ANNs. These experiments also support the determination of the optimum number of representative training data.
All experiments were performed using the above-mentioned milling machine. Forces in the X-, Y- and Z-directions were measured and are found to be periodic.
4.4. Data pre-processing
After collecting the force components, the resultant force R was calculated using the following equation:

R = \sqrt{F_x^2 + F_y^2 + F_z^2}    (8)
The maximum (MAX), minimum (MIN), mean (MEAN) and standard deviation (STDV) values of this resultant force are calculated for each experiment (as they represent important characteristics of a continuous force pattern). Next, the data are normalized in order to make them suitable for the training process [11]. This was done by mapping each term to a value between 0 and 1 using the following formula:
-
8/3/2019 Selecting an Artificial Neural Network for Efficient Modeling And
7/12
669J.F. Briceno et al. / International Journal of Machine Tools & Manufacture 42 (2002) 663674
Fig. 11. General ANN topology.
N = \frac{(R - R_{min})(N_{max} - N_{min})}{R_{max} - R_{min}} + N_{min}    (9)

where N is the normalized value of the real variable; N_min and N_max are the minimum and maximum values of normalization, respectively; R is the real value of the variable; and R_min and R_max are the minimum and maximum values of the real variable, respectively.
This normalized data was utilized as the inputs (machining conditions) and outputs (characteristics of the resultant force) to train the ANN. In other words, two vectors are formed in order to train the neural network (see Fig. 11):

Input = [feed rate; spindle speed; radial depth of cut];
Output = [MAX; MIN; MEAN; STDV];
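The pre-processing pipeline of Eqs. (8) and (9) can be sketched as follows. This is an illustrative Python/NumPy rendering; the force signals are synthetic stand-ins for the sampled dynamometer data:

```python
# Sketch of the pre-processing of Eqs. (8)-(9): resultant force from the
# three measured components, the four summary outputs, and min-max
# normalization to [0, 1].
import numpy as np

rng = np.random.default_rng(0)
Fx, Fy, Fz = (rng.normal(500, 100, size=25000) for _ in range(3))  # 10 s at 2500 Hz

# Eq. (8): resultant force
R = np.sqrt(Fx**2 + Fy**2 + Fz**2)

# Output features for one experiment: MAX, MIN, MEAN, STDV
features = np.array([R.max(), R.min(), R.mean(), R.std()])

def normalize(r, r_min, r_max, n_min=0.0, n_max=1.0):
    # Eq. (9): map a real value r in [r_min, r_max] onto [n_min, n_max]
    return (r - r_min) * (n_max - n_min) / (r_max - r_min) + n_min
```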
Table 1
Linear regression (R) for the training phase (using BP)

Set    W     MAX    MIN    MEAN   STDV
1st    14    0.987  0.981  0.984  0.992
       21    0.995  0.983  0.993  0.995
2nd    14    0.973  0.965  0.974  0.973
       21    0.983  0.97   0.981  0.974
       28    0.984  0.985  0.986  0.975
3rd    14    0.966  0.917  0.974  0.951
       21    0.973  0.922  0.977  0.958
       28    0.973  0.952  0.987  0.959
       35    0.977  0.955  0.988  0.968
4th    14    0.958  0.912  0.974  0.946
       21    0.967  0.924  0.978  0.953
       28    0.968  0.945  0.983  0.953
       35    0.969  0.933  0.987  0.955
       42    0.97   0.951  0.986  0.956
       49    0.97   0.951  0.988  0.955
Table 2
Error values of BP network, 1st experimental set, topology 3.2.4

MAX     MIN     MEAN    STDV
0.1293  0.0353  0.5958  0.0832
0.4576  0.048   0.1146  0.1396
0.7888  0.088   0.0775  0.2374
0.6513  0.0241  0.2677  0.2231
0.002   0.0002  0.3962  0.0399
0.0837  0.0356  0.2639  0.0494
0.6995  0.2055  0.0747  0.2582
0.5869  0.4021  0.0476  0.164
0.3958  0.2684  0.167   0.1484
0.4416  0.0148  0.4065  0.0945
0.5721  0.0447  0.1487  0.2264
0.1314  0.473   0.1748  0.1092
0.721   0.178   0.0093  0.1913
0.0028  0.0217  0.4942  0.0382
0.5879  1.4483  0.9077  0.109
0.0794  0.0158  0.3518  0.0318
0.4278  0.2741  0.205   0.2454
0.6645  0.0548  0.0371  0.2152
0.4569  0.2025  0.1407  0.1116
0.9767  0.0218  0.1056  0.3863
Table 3
Values to report (mean and standard deviation of the errors in Table 2)

        MAXIMUM  MINIMUM  MEAN    STDV
Mean    0.4428   0.1928   0.2493  0.1551
Stdv    0.2853   0.3261   0.2235  0.0924
5. Results
5.1. Training results
Each experimental set (except the validation set) is used to train each network. This training is repeated for each topology. The performance is measured by the linear regression (R) of each output. With this analysis it is possible to determine the response of the network with respect to the targets. A value of 1 indicates that the network is perfectly simulating the training set, while 0 means the opposite. For all the cases in this study, the value of R (for all output sets) is shown in Table 1.
The case of the RBN showed a perfect fitting pattern (R = 1 for all the cases), as expected, since the goal error factor is set to zero.
5.2. Validation results of the BP model and the RBN model

For each network, the difference between the real value and the predicted value is calculated, producing a matrix of 20 by 4 elements: 20 validation experiments (rows) by 4 output parameters (columns).
Table 4
Results from BP

ERROR (mean and stdv), (real − predicted) × 10² N

Topology          MAXIMUM  MINIMUM  MEAN    STDV
1st Set, 27 Experiments
3.2.4 (w=14)      0.4428   0.1928   0.2493  0.1551  Mean
                  0.2853   0.3261   0.2235  0.0924  Stdv
3.3.4 (w=21)      0.4328   0.2161   0.2168  0.1672  Mean
                  0.2346   0.3018   0.1636  0.0874  Stdv
2nd Set, 35 Experiments
3.2.4 (w=14)      0.3396   0.13     0.2128  0.1073  Mean
                  0.2053   0.3133   0.2173  0.0896  Stdv
3.3.4 (w=21)      0.2726   0.1194   0.1332  0.1082  Mean
                  0.1802   0.2981   0.1711  0.0628  Stdv
3.4.4 (w=28)      0.2554   0.133    0.1404  0.1035  Mean
                  0.1783   0.2325   0.1497  0.0591  Stdv
3rd Set, 47 Experiments
3.2.4 (w=14)      0.3524   0.1127   0.2078  0.1009  Mean
                  0.2144   0.2194   0.1887  0.0844  Stdv
3.3.4 (w=21)      0.2293   0.1128   0.1309  0.09    Mean
                  0.2002   0.1855   0.1397  0.0485  Stdv
3.4.4 (w=28)      0.253    0.1215   0.079   0.0843  Mean
                  0.1815   0.1132   0.0945  0.0471  Stdv
3.5.4 (w=35)      0.2394   0.1309   0.0906  0.0769  Mean
                  0.2018   0.1092   0.1131  0.0478  Stdv
4th Set, 65 Experiments
3.2.4 (w=14)      0.2891   0.0859   0.2262  0.079   Mean
                  0.2028   0.212    0.1934  0.0628  Stdv
3.3.4 (w=21)      0.1905   0.0794   0.138   0.0569  Mean
                  0.187    0.1981   0.1469  0.0424  Stdv
3.4.4 (w=28)      0.1884   0.0998   0.1255  0.0517  Mean
                  0.1742   0.1167   0.1321  0.0429  Stdv
3.5.4 (w=35)      0.1923   0.1307   0.0978  0.0515  Mean
                  0.1661   0.1147   0.1098  0.0332  Stdv
3.6.4 (w=42)      0.1935   0.0809   0.0974  0.0551  Mean
                  0.1733   0.091    0.0797  0.0394  Stdv
3.7.4 (w=49)      0.1988   0.086    0.0892  0.0533  Mean
                  0.1715   0.0893   0.0624  0.0388  Stdv
For each column, the mean and standard deviation are calculated. These two values represent the mean error and standard deviation of each output element, respectively. In this way, a vector of two elements is used to make the comparison. To illustrate the calculations, an example is presented. For the back-propagation network, using the 1st experimental set with topology 3.2.4, the error is calculated as follows:

e_{ij} = |m_{ij} - p_{ij}|, \quad i = 1, \ldots, 20; \; j = 1, \ldots, 4    (10)

where i refers to the experiment number and j refers to the jth output of the network; e_ij is the error value of the ith machining condition state for the jth output; m_ij is the measured value of the ith machining condition state for the jth output; and p_ij is the predicted value of the ith machining condition state for the jth output.
The calculated errors are shown in Table 2. From this
table the mean and standard deviation are calculated for
each column. The reported results are shown in Table
3. This was done for each model, for each topology (in BP) and for each spread-factor value (in the RB network). The results are shown in Table 4 (BP) and Table 5 (RBN).
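The validation statistics of Eq. (10) and Table 3 reduce to a few array operations. A minimal Python/NumPy sketch, with placeholder measured and predicted matrices:

```python
# Sketch of the validation statistics: Eq. (10) gives a 20x4 matrix of
# absolute errors, and Table 3 reports the column-wise mean and standard
# deviation (20 validation experiments, 4 outputs).
import numpy as np

rng = np.random.default_rng(0)
measured = rng.uniform(0, 1, size=(20, 4))                  # placeholder m_ij
predicted = measured + rng.normal(0, 0.002, size=(20, 4))   # placeholder p_ij

e = np.abs(measured - predicted)   # Eq. (10): e_ij = |m_ij - p_ij|
mean_error = e.mean(axis=0)        # one value per output (MAX..STDV)
stdv_error = e.std(axis=0)
```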
6. Methodology used to compare the two Artificial Neural Networks

The selection of the best network is carried out in terms of accuracy and efficiency. The latter is measured by selecting a minimum number of training experiments that results in a sufficiently accurate model. It is known that the larger the training set, the more accurate the evolved model. Consequently, a cost function is needed to evaluate the simultaneous influence of training set size and model accuracy.
Table 5
Results from RBN

ERROR (mean and stdv), (real − predicted) × 10² N

Spread          MAXIMUM  MINIMUM  MEAN    STDV
1st Set, 27 Experiments
0.2   Mean      0.3758   0.3143   0.2954  0.1911
      STDV      0.2893   0.2858   0.1277  0.1071
0.5   Mean      0.5111   0.2909   0.1925  0.1685
      STDV      0.28     0.2207   0.193   0.0819
0.8   Mean      0.7133   0.2084   0.8924  0.1149
      STDV      0.5511   0.2551   0.3299  0.1062
2nd Set, 35 Experiments
0.2   Mean      0.224    0.1147   0.1497  0.09
      STDV      0.1981   0.1621   0.0895  0.0825
0.5   Mean      0.2602   0.2183   0.1813  0.0892
      STDV      0.2129   0.1809   0.1537  0.0793
0.8   Mean      0.2532   0.2048   0.1754  0.0917
      STDV      0.2188   0.1717   0.1452  0.0754
3rd Set, 47 Experiments
0.2   Mean      0.2794   0.0753   0.1379  0.0928
      STDV      0.263    0.0989   0.1046  0.0982
0.5   Mean      0.4704   0.2839   0.2028  0.1197
      STDV      0.4654   0.2399   0.1552  0.1107
0.8   Mean      0.5866   0.2995   0.2051  0.1346
      STDV      0.5228   0.235    0.1761  0.1151
4th Set, 65 Experiments
0.2   Mean      0.2272   0.0337   0.1506  0.0621
      STDV      0.2402   0.0401   0.1068  0.0586
0.5   Mean      2.2885   0.7389   0.3053  0.8367
      STDV      2.5721   0.8868   0.2742  0.9921
0.8   Mean      4.7221   1.4937   0.3416  1.8317
      STDV      5.2183   1.7041   0.2734  2.0805
6.1. Establishment of the cost function
The cost function (C) is set to relate the following parameters:

1. the number of experiments (NE);
2. the error of prediction in terms of two important variables: the maximum resultant force (E_MAX) and the mean resultant force (E_MEAN).

Therefore, the overall cost function is given by

C = \lambda_1 \frac{NE}{N} + \lambda_2 \frac{E_{MAX}}{E} + \lambda_3 \frac{E_{MEAN}}{E}    (11)

where \lambda_i (i = 1, 2, 3) are the weights of each equation term, N is the maximum number of possible experiments (in this case 125, which represents the full factorial condition), and E is the maximum allowed error (which is set to 30 N). The last value was selected based on the fact that this error constitutes a relatively small value compared to the magnitude of the forces developed during the milling experiments conducted here.
Eq. (11) shows that the closer NE is to N, the higher the value of C. This is compensated by the fact that the error would then be much smaller than E. On the other hand, using a small NE, the cost is reduced by the first term but augmented by the last two terms, since the accuracy would be compromised.
In addition, the equation is set to be unitless in order to provide a fair basis of comparison; a short sketch of the computation is given below.
The experimental set that gives the least cost is the one selected to be utilized for the particular ANN model. Then, the two networks are compared.
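A minimal sketch of the cost computation of Eq. (11), using the constants stated above (N = 125, E = 30 N) and the weight values selected in this study (λ1 = 0.8, λ2 = λ3 = 0.5 initially, as discussed below); the example numbers are illustrative only:

```python
# Sketch of the cost function of Eq. (11): C = l1*NE/N + l2*EMAX/E + l3*EMEAN/E
def cost(ne, e_max, e_mean, lam=(0.8, 0.5, 0.5), n_total=125, e_limit=30.0):
    l1, l2, l3 = lam
    return l1 * ne / n_total + l2 * e_max / e_limit + l3 * e_mean / e_limit

# Example: a model trained on 35 experiments with a 20 N maximum error and
# a 10 N mean error on validation (illustrative numbers only)
print(cost(35, 20.0, 10.0))
```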
The weights of each parameter in Eq. (11) are selected based on the needs of this study. Previous studies have been criticized for the number of experiments required in the training phase: normally, the use of artificial neural networks requires a large number of experiments for training. For this very reason, the heaviest weight is \lambda_1 (the term that determines the influence of the number of experiments in the cost function), while \lambda_2 and \lambda_3 are set to smaller, equal values. The reason for choosing equal values for \lambda_2 and \lambda_3 is that E_MAX and E_MEAN represent forces that have the same value in terms of cost relevance. The maximum force is important in this study due to the great significance that this particular variable has in tool breakage, while the mean force indicates the average force experienced during a cutting cycle, giving an indication of the total power used.
Table 6
Cost values (BP)

                           Sets
               1st         2nd         3rd         4th
Cost (λ = 0.5)
W = 14         1.3263      1.14466667  1.23446667  1.274833
W = 21         1.25546667  0.90033333  0.90113333  0.9635
W = 28         N/A         0.88366667  0.85413333  0.939167
W = 35         N/A         N/A         0.8508      0.8995
W = 42         N/A         N/A         N/A         0.900833
W = 49         N/A         N/A         N/A         0.896
Cost (λ = 0.7)
W = 14         1.7877      1.51293333  1.60793333  1.618367
W = 21         1.68853333  1.17086667  1.14126667  1.1825
W = 28         N/A         1.14753333  1.07546667  1.148433
W = 35         N/A         N/A         1.0708      1.0929
W = 42         N/A         N/A         N/A         1.094767
W = 49         N/A         N/A         N/A         1.088
Cost (λ = 0.2)
W = 14         0.6342      0.59226667  0.67426667  0.759533
W = 21         0.60586667  0.49453333  0.54093333  0.635
W = 28         N/A         0.48786667  0.52213333  0.625267
W = 35         N/A         N/A         0.5208      0.6094
W = 42         N/A         N/A         N/A         0.609933
W = 49         N/A         N/A         N/A         0.608
Based on the above, the weights are selected as follows: \lambda_1 = 0.8 and \lambda_2 = \lambda_3 = 0.5 (initially).
Since the effect of E_MAX and E_MEAN in the cost function is important, a sensitivity analysis is carried out to see how the value of the cost function varies when the weights of these terms (\lambda_2, \lambda_3) are set to different values. It is necessary to mention that these values are limited by \lambda_1 = 0.8 (the upper limit, since this factor is the largest weight in Eq. (11)) and 0 (the lowest limit, which corresponds to zero participation). Therefore, three values are considered for \lambda_2 and \lambda_3: 0.5, 0.7 and 0.2.
The cost value is then calculated based on Eq. (11), using the values of E_MAX and E_MEAN from Tables 4 and 5 for BP and RBN, respectively. The cost is calculated for all the values of \lambda_2 and \lambda_3; these two values are represented by \lambda. The results are shown in Tables 6 (BP) and 7 (RBN).
Table 7
Cost values (RBN); results correspond to spread factor = 0.2

Set    Cost (λ = 0.5)  Cost (λ = 0.7)  Cost (λ = 0.2)
1st    1.29146667      1.73893333      0.62026667
2nd    0.84683333      1.09596667      0.47313333
3rd    0.9963          1.2745          0.579
4th    1.04566667      1.29753333      0.66786667
7. Discussion of results
For the training phase, Table 1 shows the effectiveness of the selected ANN architecture for BP. All R-values are over 0.9. This table shows that the more neurons in the hidden layer (higher W), the better the representation (higher R). By the same token, an increased number of experiments results in a reduction in the value of R, which is compensated by the addition of more PEs in the hidden layer. This tendency was expected: when a larger training set is available, more neurons are needed to establish a good model.
Since all R-values are sufficiently high, it is possible to conclude that any of these W combinations in any set can be used to successfully train the neural network. The same applies to the RBN, where all R-values are 1. Furthermore, this indicates that the methodology of DOE can be successfully applied with good results.
From these results, it is possible to state that, based on the training performance, the RBN is better than the BP.
Table 4 shows that the 4th set produces smaller errors than the 1st set. This is because the former set contains more experiments (more information about the process) than the latter one. This table also shows the effect of increasing the number of neurons or PEs in the hidden layer: the more neurons, the more accurate the model for a given experiment set. The magnitudes of the errors are relatively small compared with the forces developed during milling, which in this study vary between 200 and 1000 N.
The results given by the radial basis network indicate that the smallest errors are reached when the spread value is equal to 0.2 for all sets. For this reason, the cost calculation, and therefore the model comparison, is conducted using this particular value.
Tables 6 and 7 show interesting results pertaining to the cost values for BP and RBN, respectively. Each model indicates a different lowest-cost set. For BP, the set with the lowest cost is the third set, with 0.85 and 1.07 for λ = 0.5 and 0.7, respectively; these two values are at W = 35. For λ = 0.2, the lowest cost corresponds to the second set, at W = 28. This trend is represented in Fig. 12, where the cost vs set is plotted.
This tendency is due to the lower weight given to the error term (λ = 0.2), compared with the weight given to the number of experiments (NE). From these results, λ = 0.5 can be considered as the basis of comparison, since this value does not underestimate the participation of the error part in the cost function. Therefore, the best model from this network is the one with 47 experiments, with a cost value of 0.8508 and a topology of 3 inputs, 5 neurons in the hidden layer and 4 outputs.
In the radial basis network, the 2nd set gives the lowest cost regardless of the value of λ (see Fig. 13). Again, λ = 0.5 is selected for comparison purposes. For this network, the best model is the one with 35 experiments, a cost value of 0.8468, and a spread value of 0.2.
Fig. 12. Cost vs sets (back-propagation network, using the lowest cost of each experimental set).
Fig. 13. Cost vs sets (RB network).
Based on these results (BP: cost = 0.8508 with 47 experiments; RBN: cost = 0.8468 with 35 experiments), and the fact that the radial basis network is, for this particular case, about 3 times faster to train than the back-propagation network, the model that best represents the functional relation between the considered milling parameters and the cutting forces is the radial basis network.
The selected network not only gave the lowest cost, but it could be trained with fewer experiments and is much faster. This indicates that, for this particular case, the RBN is more efficient than the BP.
8. Conclusions and future work
In this paper, two supervised neural networks are used to successfully estimate the forces developed during the milling process. Design of experiments, and specifically the orthogonal arrangement, is used to select the experiments to perform and to establish the different sets considered for each ANN model. DOE contributed to increasing the efficiency of the system by drastically reducing the amount of experimental data needed for successful training.
Based on the results of this study, it is possible to conclude that, by taking 5 equally spaced values of the selected milling parameters and applying an orthogonal arrangement, 35 experiments (out of 125) are enough to train and evolve an accurate ANN model of the end milling process. In back-propagation networks, the use of a single hidden layer was shown to work sufficiently well for the process in consideration. However, it is shown that the radial basis network is superior to the back-propagation network in predicting the milling forces, when evaluated in terms of a cost function that combines the cost of experiments with accuracy.
In this study, a cost function is defined based on specific needs. This fitness function can be refined in the future in order to represent more extensively the characteristics of the milling process. In the same way, it is
possible to design a more systematic methodology to select the spread factor in the radial basis network. This could increase the accuracy of the model.
References
[1] H.Y. Feng, N. Su, A mechanistic cutting force model for ball-end milling, Journal of Manufacturing Science and Engineering, November (1998).
[2] D.E. Rumelhart, J.L. McClelland, PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, 2 vols., MIT Press, Cambridge, MA, 1986.
[3] Q. Liu, Y. Altintas, On-line monitoring of flank wear in turning with multi-layered feed-forward neural network, International Journal of Machine Tools & Manufacture 39 (1999) 1945–1959.
[4] Y. Liu, C. Wang, Neural network based adaptive control and optimisation in the milling process, International Journal of Advanced Manufacturing Technology 15 (11) (1999) 791–795.
[5] V. Tandon, Closing the gap between CAD/CAM and optimized CNC end milling, MSME Thesis, Purdue School of Engineering & Technology, 2000.
[6] D. Cook, C. Chiu, Combining a radial basis neural network with time series analysis techniques to predict manufacturing process parameters, Applied Artificial Intelligence 9 (6) (1995) 623–631.
[7] P.J. Cheng, S.C. Lin, Using neural networks to predict bending angle of sheet metal formed by laser, International Journal of Machine Tools & Manufacture 40 (1999) 1185–1197.
[8] S. Elanayar, Y.C. Shin, Design and implementation of tool wear monitoring with radial basis function neural networks, in: Proceedings of the 1995 American Control Conference, Part 3 (of 6), 1995, pp. 1722–1726.
[9] A.R. Barron, Neural net approximation, in: Proceedings of the Seventh Yale Workshop on Adaptive and Learning Systems, 1992, pp. 68–72.
[10] R.C. Eberhart, P. Simpson, R. Dobbins, Computational Intelligence PC Tools, AP Professional, New York, 1996.
[11] H. Demuth, M. Beale, Neural Network Toolbox v3 User's Guide, The MathWorks Inc., USA, 1999.
[12] P.J. Ross, Taguchi Techniques for Quality Engineering, McGraw-Hill, New York, 1988.
[13] R.A. Walsh, McGraw-Hill Machining and Metalworking Handbook, McGraw-Hill, New York, 1994.