8/8/2019 Forecasting Disruptions in the ADITYA Tokamak
1/16
Forecasting disruptions in the ADITYA tokamak
using neural networks
A. Sengupta, P. Ranjan
Institute for Plasma Research, Bhat, Gandhinagar, India
Abstract. A neural network technique has been used to predict disruptions in the ADITYA tokamak. A time series prediction method is employed whereby a series of past values of some time dependent
quantity is used to predict its value in the future. The time varying observables used in the present
work are the different diagnostic signals from four Mirnov probes, one soft X ray monitor and one Hα monitor. The predicted quantities are the same observables at some future time. The neural network
is trained with the past values of the different diagnostic signals as inputs and the future values of the
same quantities as targets. The trained neural network is used to forecast in a multistep sequence. This
amounts to a prediction several time steps earlier. Very good prediction results have been obtained
up to 8 ms earlier with little distortion of the signals and no appreciable time lag, a capability which
is believed to be well suited to the task of on-line predictions of disruptions in ADITYA. As actual
experimental signals are used, confidence regarding the performance of the neural network on hardware
implementation is automatically ensured.
1. Introduction
A disruption in a tokamak is a sudden loss of confinement followed by transfer of the plasma energy to the surrounding structures. As a result, the machine walls and the supporting structures are subjected to an enormous heat load, causing moderate to severe damage. Disruptions also result in rapid plasma current decay, which induces large electric fields that in turn drive large eddy currents in the conducting structures and mechanical supports. This results in enormous j × B forces. The damage caused by these forces determines the lifetime of a machine. Disruption avoidance, or minimization of disruptivity, is therefore important for cost effective operation of tokamaks.
Artificial neural networks (ANNs) have already been used for studying different aspects of tokamak plasmas. These include fast estimation of plasma parameters in DIII-D [1], ASDEX Upgrade [2] and ITER [3], prediction of disruptions [4–6] and prediction of the vertical position of the plasma current centroid [7]. ANNs have also been used to order the magnetic sensors according to their importance in the estimation of plasma parameters [2, 3].
The motivation for using ANNs for prediction of disruptions came from the early use of ANNs in various forecasting applications [8, 9]. However, the ultimate aim of the prediction is to reduce the frequency of disruptions on-line in hardware. Therefore, if used as a disruption alarm, an ANN should not only give an accurate prediction of an approaching disruption, but should also make this prediction sufficiently early to allow for measures to be taken to soften the impact of the disruption. In this article, ways to predict plasma disruptions in the ADITYA tokamak [10, 11] are discussed, using time series of various time dependent quantities obtained from diagnostics. These include fluctuations of the tangential component of the poloidal magnetic field, $B_\theta$, as measured by Mirnov probes placed at different poloidal locations around the plasma. Such signals have been used earlier [4], where only a single probe was used as input to the ANN for the prediction. The results of that study are not suitable for the goal of disruption control, since:
(a) Large errors are present for predictions more than 1.1 ms earlier.
(b) An increasing time lag appears between the actual and the predicted instants of disruption as the prediction is made earlier and earlier.
In a recent work [6], soft X rays were used as inputs instead of magnetic signals, and the prediction was made 3.12 ms in advance of the event, which is a 200% improvement over the results of Ref. [4]. However, the time lag problem persists for predictions more than 3.12 ms in advance. For effective real time measures to be taken, this time has to increase by at least a factor of 2.
Nuclear Fusion, Vol. 40, No. 12 ©2000, IAEA, Vienna 1993
There are two goals for this article:
(i) To use an ANN to predict the instant of triggering of a disruption.
(ii) To use an ANN to make the prediction sufficiently early that measures can be taken to soften the impact of disruptions.
The criterion for (i) above is that the exact instant of triggering of the disruptive instabilities should be picked up, rather than the instant of current decay, because once the current quench starts, control measures, even if taken, may prove futile. The triggering of instabilities is signalled primarily by:
(a) Increased MHD activity around the plasma edge, primarily the (m, n) = (2, 1) mode, picked up by a set of Mirnov coils located around the plasma. This activity immediately precedes the thermal quench.
(b) A fall in the soft X ray (SXR) intensity at the plasma core, which immediately follows edge cooling.
(c) Increased Hα emission.
For (ii), the earliness, i.e. the extent of early prediction, which can be quantified by a time interval Δt, is the major issue. This Δt, when applied to disruption avoidance or minimization, must be around 5–7 ms for effective measures to be taken.
The purpose of this article is to find out whether ANN architectures different from those used earlier, and the use of additional diagnostic information, help to improve upon these results. So, in addition to several Mirnov probe signals, SXR and Hα emission signals have also been used here. A series of values of the diagnostic signals is taken as their past values, and a prediction involves a continuation of the series. This prediction can be a single time step into the future, or several time steps. The latter represents an earlier forecast, and this earliness can be increased by increasing the number of predicted time steps. However, since the prediction error increases with the number of time steps, the choice of a sufficiently early prediction must remain within permissible errors.
The organization of this article is as follows. Section 2 contains a general treatment of time series prediction, while Section 3 briefly discusses ANNs and their relation to time series prediction. In Section 4 an overview of the different ANN architectures used for time series prediction is given. Section 5 describes the preparation of the database, Section 6 gives our forecasting results in detail, Section 7 discusses the results and Section 8 summarizes the results and conclusions.
2. Time series prediction
A time series [9] is a set of values taken to be measurements of an observable over time. The system on which the observable is measured evolves with time, i.e. it is a dynamical system. The observable is a function only of the state of the system; as soon as the system returns to the original state, the observable also returns to the original value.
Let the state of the system at present be represented by $a$ and the observable being measured by $p(a)$. It is assumed that state $a$ contains all the information required to predict the state $t$ time units into the future. Let the state at this future time be $F_t(a)$. The prediction refers to the calculation of the observable at time $t$ from a knowledge only of the present. Similarly, if one goes backwards in time from the present instant, a time series of past values of the observable is obtained:
$$b = [p(a),\, p(F_{-\tau}(a)),\, p(F_{-2\tau}(a)),\, \ldots,\, p(F_{-m\tau}(a))] \tag{1}$$
where $\tau$ is the time step length (the sampling interval of the observable). $b$ is thus a segment of a time series where the time dependence is now expressed explicitly:
$$b = [x_{t_1},\, x_{t_1-\tau},\, \ldots,\, x_{t_1-m\tau}] \tag{2}$$
where $x$ is the measured quantity, $x_{t_1} = p(a)$, and $t_1$ is the present time instant.
Equation (2) is the form of the time series that is generally used [4, 6, 8, 9].
Prediction means estimating the measured variable at future times, i.e. the continuation of the series by way of extrapolation. For the extrapolation, some functional representation of the extrapolated (predicted) value is required in terms of the given time series. This should have the following form:
$$x^{\mathrm{pred}}_{t_1+n\tau} = f_n[x_{t_1},\, x_{t_1-\tau},\, \ldots,\, x_{t_1-m\tau}]. \tag{3}$$
The left hand side of the above equation gives the predicted value of the dynamical quantity at the future time $t_1 + n\tau$ ($n = 1, 2, \ldots$), where again $t_1$ refers to the present. $f_n$ gives the functional form for the transformation. The problem, therefore, is to find an approximation for $f_n$ to bring about the extrapolation.
Extrapolation schemes for $f_n$ can be divided into two broad categories, linear and non-linear. Linear models such as auto-regressive (AR), moving average (MA) or auto-regressive moving average (ARMA) models have been most frequently used for time series analysis [9]. These models work well only for simple time series and are most likely to fail for stochastic or chaotic series. Analysis of such complex series requires a long time history of the series, yielding very high order linear models, i.e. models involving a very large number of linear terms (corresponding to the past temporal points of the series). Such high order models are impractical from a computational point of view.
Non-linear techniques, such as ANN, wavelet and chaos analysis, can provide good insight into a complex time series when linear models fail (Ref. [12] and Refs [2–15] therein). The ANN algorithm invokes non-linear models that approximate a much broader class of functions than linear models, so that it can analyse complex time series without involving large errors due to numerical instabilities.
3. Artificial neural networks
The ANN technique, which has its origins as an artificial model of the parallel processing capabilities of the human brain, is typically used in pattern recognition, where a collection of images is presented to the network and its task is to assign the images to one or more classes. Another typical use of the ANN is non-linear regression, where the algorithm is used to find a smooth interpolation between data points. By way of contrast, time series prediction involves processing of patterns which evolve over time, the response at a particular point of time depending not only on the current value of the observable, but also on the past. The ANN, of which the multilayer perceptron (MLP) is the most widely used type, consists of several layers of nodes or neurons, and represents an analytic mapping between a set of inputs $x_i$ and a set of outputs $y_k$ (shown in Fig. 1, where $i = 1$–$5$, $k = 1$–$2$). The layer(s) not directly accessible to the user, referred to as the hidden layer(s), produce the inherent non-linearity in the transformation, and also increase the network's ability to model different classes of function. While the sizes of the input and output layers are determined by the problem being solved, the size of the hidden layer is determined by trial and error, from the training and testing errors.
Signals propagate in the forward direction only, i.e. from the input towards the output. Those impinging on a particular neuron $j$ of a hidden layer are weighted by certain factors to give the net input $g_j$ to the neuron $j$:
$$g_j = \sum_{i=0}^{m} w_{ji} x_i \tag{4}$$
where $x_i$ refers to the output of the $i$th neuron of the input layer, $m$ is the total number of input neurons and the weight $w_{ji}$ represents the strength of the connection between the neuron $j$ of a hidden layer and the neuron $i$ of the input layer. $i = 0$ corresponds to the bias term, whose value is $x_0$. The non-linear function usually chosen for the mapping is a sigmoidal function [13], acting on $g_j$, with the form
$$f(g_j) = \frac{2}{1 + e^{-g_j}} - 1. \tag{5}$$
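Equations (4) and (5) can be illustrated with a minimal Python sketch of a single neuron. The function names are hypothetical, and the bias input is taken as $x_0 = 1$, which is a common convention assumed here rather than something the text states.

```python
import math


def net_input(weights, inputs, bias_input=1.0):
    """Eq. (4): g_j = sum over i = 0..m of w_ji * x_i.

    weights[0] multiplies the bias input x_0 (assumed to be 1.0 here);
    weights[1:] multiply the m input values.
    """
    xs = [bias_input] + list(inputs)
    if len(weights) != len(xs):
        raise ValueError("need one weight per input plus one for the bias")
    return sum(w * x for w, x in zip(weights, xs))


def symmetric_sigmoid(g):
    """Eq. (5): f(g) = 2 / (1 + exp(-g)) - 1, which lies between -1 and +1."""
    return 2.0 / (1.0 + math.exp(-g)) - 1.0


def neuron_output(weights, inputs):
    """Output of one hidden neuron: the sigmoid applied to the net input."""
    return symmetric_sigmoid(net_input(weights, inputs))
```

The function in Eq. (5) is algebraically identical to $\tanh(g_j/2)$, which is a convenient way to check an implementation. For very large negative arguments the literal form above can overflow in `math.exp`, so a practical version might use `math.tanh(g / 2.0)` instead.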
Neural network training refers to an adjustment of the weights to minimize an error, called the mean square error, defined by
$$E^2 = \frac{\sum_k \sum_l \left( y_k^{(l)} - y_k^{(l)\mathrm{des}} \right)^2}{N_{\mathrm{out}} N_{\mathrm{ex}}} \tag{6}$$
where $y_k^{(l)\mathrm{des}}$ is the desired value of the $k$th output as determined by the $l$th member of a training data set. $N_{\mathrm{out}}$ and $N_{\mathrm{ex}}$ are the total number of outputs and examples, respectively, in a given problem. Note that $E^2$ is averaged over all examples and all outputs (normalized). Training is stopped when $E^2$ decreases to a pre-defined error goal.
To evaluate the performance of the network, the same network with the trained weights is applied to another set of known input/output examples called the test dataset. If the network performance on this dataset is satisfactory, it is supposed to have a generalization capability over any set of similar data, and can be used to process the unknown data in those data sets.
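The error measure of Eq. (6) can be written out as a short Python sketch. The function name and the nested-list data layout (one row per example, one column per output) are illustrative assumptions.

```python
def mean_square_error(outputs, targets):
    """Eq. (6): squared differences summed over all outputs k and all
    examples l, divided by N_out * N_ex.

    outputs and targets are lists of rows; each row holds the N_out
    network outputs (or desired values) for one example.
    """
    n_ex = len(outputs)
    n_out = len(outputs[0])
    total = 0.0
    for y_row, des_row in zip(outputs, targets):
        for y, des in zip(y_row, des_row):
            total += (y - des) ** 2
    return total / (n_out * n_ex)
```

The same function serves for both the training and the test dataset, so comparing the two values gives a simple view of the generalization gap discussed above.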
For time series analysis, the inputs to the ANN are the past values of the measured (temporally varying) quantity and the output is the predicted value. The more complex the time series, the more past information is needed. This results in a larger number of inputs and weights. The $y_k$ in the numerator of Eq. (6) are the outputs (ANN calculated and target) measured at a certain (future) time instant and are therefore local in time.
The functional representation $f_n$ shown in Eq. (3) is in general unknown and, for ANN modelling, is usually approximated by a sigmoidal function, shown in Eq. (5). Here a very important property of ANNs is used: it is only the nature of the function, i.e. whether it is linear or non-linear, that determines a transformation, rather than its actual form. This property is what allows the inherent non-linearity of the ANN to be defined by only certain specific forms of sigmoidal functions, while the examples in different problems may involve a broad spectrum of non-linear functions. If the time series is multivariate rather than univariate, the scalars $x$ and $y$ representing the inputs and outputs are to be replaced by vectors. In that case the product in Eq. (4) is also to be replaced by $W \cdot x$.
Figure 1. Structure of the ANN. This shows a general 5:3:3:2 MLP-2 network. The offset bias is not shown.
4. General methods for the prediction
There are three possible methods for the predic-
tion of disruption from the past values of a given time
series, using a feedforward neural network [9]. These
methods are used to predict the dynamical observable at a future time $t_1 + n$, i.e. $x_{t_1+n}$, from the available data at time $t_1$.
Method 1. One possibility is to construct a single function $f$ which predicts one point into the future, and iterate this function on its own outputs to predict further into the future. Expressed mathematically,
$$x^{\mathrm{pred}}_{t_1+1} = f(x_{t_1}, x_{t_1-1}, \ldots) \tag{7}$$
$$x^{\mathrm{pred}}_{t_1+2} = f(x^{\mathrm{pred}}_{t_1+1}, x_{t_1}, x_{t_1-1}, \ldots) \tag{8}$$
$$\vdots$$
$$x^{\mathrm{pred}}_{t_1+n-1} = f(x^{\mathrm{pred}}_{t_1+n-2}, x^{\mathrm{pred}}_{t_1+n-3}, \ldots, x_{t_1}, x_{t_1-1}, \ldots) \tag{9}$$
$$x^{\mathrm{pred}}_{t_1+n} = f(x^{\mathrm{pred}}_{t_1+n-1}, x^{\mathrm{pred}}_{t_1+n-2}, \ldots, x_{t_1}, x_{t_1-1}, \ldots) \tag{10}$$
$x^{\mathrm{pred}}_{t_1+n}$ is the predicted value of $x$ at a time $n$ steps ahead of $t_1$.
Method 2. One function can be constructed that uses only past data as inputs to directly predict one desired future point; i.e.
$$x^{\mathrm{pred}}_{t_1+n} = f(x_{t_1}, x_{t_1-1}, \ldots). \tag{11}$$
Method 3. Another method is to construct functions which take both previous predictions and past values as inputs, and predict only the future point as output:
$$x^{\mathrm{pred}}_{t_1+1} = f_1(x_{t_1}, x_{t_1-1}, \ldots) \tag{12}$$
$$x^{\mathrm{pred}}_{t_1+2} = f_2(x^{\mathrm{pred}}_{t_1+1}, x_{t_1}, x_{t_1-1}, \ldots) \tag{13}$$
$$\vdots$$
$$x^{\mathrm{pred}}_{t_1+n-1} = f_{n-1}(x^{\mathrm{pred}}_{t_1+n-2}, x^{\mathrm{pred}}_{t_1+n-3}, x_{t_1}, x_{t_1-1}, \ldots) \tag{14}$$
$$x^{\mathrm{pred}}_{t_1+n} = f_n(x^{\mathrm{pred}}_{t_1+n-1}, x^{\mathrm{pred}}_{t_1+n-2}, x_{t_1}, x_{t_1-1}, \ldots). \tag{15}$$
Table 1. Major parameters of the tokamak ADITYA
Parameters                            Design values   Range of the discharges used
Major radius (cm)                     75
Minor radius (cm)                     25
Plasma cross-section shape            Circular
Plasma current (kA)                   250             80–100
Toroidal field at plasma centre (T)   1.5             0.75
Plasma duration (ms)                  300             60–85
Electron temperature (eV)             500             250–300
In all the above cases, n = 1 implies a single step
prediction. Although both single and multistep pre-
diction can be used, our primary aim will be the
latter, since the application here requires long term
prediction. Methods 1 and 3 are called iterated pre-
diction methods, while method 2 is a direct predic-
tion method.
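The difference between the iterated prediction of method 1 and the direct prediction of method 2 can be illustrated with a small Python sketch. The helper names are hypothetical, and the iterated version keeps a fixed-length input window (the oldest value is dropped as each prediction is fed back), which is a simplifying assumption made for this illustration only.

```python
def iterate_one_step(f, history, n):
    """Method 1 (sketch): apply a one-step predictor n times, feeding each
    prediction back as an input.

    history is ordered most recent first: [x_t1, x_t1-1, ...].
    """
    window = list(history)
    for _ in range(n):
        next_value = f(window)
        # The newest value goes to the front; the oldest drops out so that
        # the window length stays fixed (an assumption of this sketch).
        window = [next_value] + window[:-1]
    return window[0]


def direct(f_n, history):
    """Method 2 (sketch): one function maps past data straight to x_(t1+n)."""
    return f_n(history)


# Toy predictors for a linearly growing series (illustration only).
def one_step(w):
    return 2.0 * w[0] - w[1]


def five_step(w):
    return w[0] + 5.0 * (w[0] - w[1])


history = [3.0, 2.0, 1.0]  # x_t1 = 3, x_t1-1 = 2, x_t1-2 = 1
iterated = iterate_one_step(one_step, history, 5)
direct_value = direct(five_step, history)
```

For this toy series both routes give the same value five steps ahead, because the one-step predictor is exact. With an imperfect predictor such as a trained network, each fed-back value carries its own error into the next step, which is the main practical drawback of the iterated methods.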
5. Database preparation
The database for the prediction task was prepared using experimental ADITYA discharges (Table 1 lists some major parameters of ADITYA). One disruptive discharge was used for training purposes and one for testing. Forecasting was then done with three disruptive discharges. The plasma discharges chosen for our work were all sampled at intervals of 0.02 ms.
Ten past values of each of the input variables were used, with one predicted value at the output, which was chosen many steps ahead, given the requirements for real time prediction. This number of past temporal points was slightly less than that used for the TEXT studies, where 15 past values of a single input were used. However, we shall see later that there were a total of 60 inputs in the present study, consisting of 10 past temporal values of six different diagnostic signals. This will be shown to be the optimum number of inputs.
The type of network chosen for this work was an MLP-2 ANN with two hidden layers of 16 neurons each. The reason for using this rather than the MLP-1 network lay in the quality of fitting. It was found that although the training error was less for the MLP-1 network with 32 hidden neurons, the testing error, as well as the difference between the training and the testing errors, was much smaller for the MLP-2 network with the same 32 neurons divided equally between the two hidden layers. This was not surprising, because if the number of input neurons is large in comparison with that of the hidden neurons (as in our case), an MLP-2 contains a smaller number of weights and therefore shows a better generalization property than an MLP-1 network with the same total number of hidden neurons.
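The weight counts behind this comparison can be checked with a short Python sketch. It assumes fully connected layers with one bias weight per neuron, and uses the 60 inputs and six outputs of the present study; the function name is hypothetical.

```python
def count_weights(layer_sizes):
    """Number of weights in a fully connected feedforward network,
    counting one extra bias weight for every non-input neuron."""
    return sum(
        (n_in + 1) * n_out
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    )


# 60 inputs and 6 outputs, with 32 hidden neurons in one or two layers.
mlp1_weights = count_weights([60, 32, 6])      # single hidden layer
mlp2_weights = count_weights([60, 16, 16, 6])  # two hidden layers of 16
```

Under these assumptions the single hidden layer network has 2150 weights and the two hidden layer network has 1350, so splitting the same 32 neurons across two layers removes roughly a third of the free parameters.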
Looking at the iterated methods in Section 4, it was observed that since the number of inputs and outputs increased with every iteration, long term prediction would be computationally intensive, while it is known that for real time prediction of disruptions, these predictions should be long enough to require iterations of the order of 200–400. Moreover, the single step predicted variable $x_{t_1+1}$, which is fed back to the input to predict $x_{t_1+2}$, is certainly not as accurate as the measured values $x_{t_1}, x_{t_1-1}, \ldots$ Therefore, the iterative method was not thought to be well suited to the task of disruption prediction. Hence method 2, the direct method, was used for our predictions. In the present study, with a sampling time of 0.02 ms, there were 50 predicted time steps corresponding to a prediction 1 ms earlier (i.e. $n = 50$ in Eq. (11)). Similarly, $n = 100$ for a 2 ms early prediction and $n = 400$ when a forecast is made 8 ms in advance.
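The relation between the lead time and the number of predicted steps n in Eq. (11) is a simple division by the sampling interval, sketched below in Python. The function name is hypothetical.

```python
SAMPLING_MS = 0.02  # sampling interval of the discharges, in ms


def steps_ahead(lead_ms, sampling_ms=SAMPLING_MS):
    """Number of time steps n corresponding to a lead time in ms."""
    # round() guards against floating point error in the division.
    return round(lead_ms / sampling_ms)


n_values = [steps_ahead(lead) for lead in (1.0, 2.0, 4.0, 8.0)]
```

For lead times of 1, 2, 4 and 8 ms this gives n = 50, 100, 200 and 400, matching the values quoted in the text.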
The non-linear mapping was brought about by the sigmoid of Eq. (5). This is a symmetric sigmoid, bounded in the interval [−1, +1]. The inputs and outputs were normalized to the same interval. Without this normalization, a normalization constant would have been required in Eq. (6), as the outputs had different dimensions.
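One common way to map a signal onto the interval [−1, +1] is linear min-max scaling, sketched below in Python. The text does not state which normalization was actually used, so this scheme and the function names are assumptions made for illustration.

```python
def fit_range(values):
    """Return the (min, max) of a signal, used as the scaling range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant signal cannot be rescaled")
    return lo, hi


def normalize(value, lo, hi):
    """Map value linearly from [lo, hi] onto [-1, +1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def denormalize(scaled, lo, hi):
    """Inverse of normalize: map from [-1, +1] back to [lo, hi]."""
    return (scaled + 1.0) * (hi - lo) / 2.0 + lo
```

Scaling each diagnostic separately in this way puts all six outputs on the same dimensionless range, so their squared errors can be summed directly in Eq. (6).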
The ANN was trained using the general adaptive recipe (GAR) algorithm [14]. The learning rate, or gradient descent step length, was initialized to 1.0. On-line modification of the learning rate was possible in GAR, through specification of up and down adaptation parameters, which were set at 0.002 and 0.8, respectively. These values were determined during the network training process. A larger up adaptation increased the gradient descent step length so much that it often overshot the minima, whereby the error increased. A smaller down adaptation did not reduce the learning rate enough, so that after a few iterations the learning rate increased once again to overshoot the minimum. This effectively slowed down the training.
[Figure 2: five stacked panels (expt.; predicted Δt = 1, 2, 3 and 4 ms) plotted against time (ms), 55–90 ms.]
Figure 2. Using only soft X ray signals as input to the neural network, the quality of prediction for Δt = 1, 2, 3 and 4 ms, respectively, is compared with the actual signal. It is observed that for Δt = 3 ms, a time lag appears for the first time in the predicted signal with respect to the actual experimental data. This lag increases with higher Δt. The vertical lines represent the instant the disruption is actually triggered.
To begin with, the ANN was trained with only one diagnostic signal. This was to test the performance of the network with input information similar to that already used in Refs [4] and [6]. First, one Mirnov probe was used as input, followed by the SXR signal. Finally, only the Hα signal was used as the single input. From the training stage itself it became clear that the network required additional information to learn the trends in the data, as the learning remained very slow throughout. The only exception was the training with the Hα signal, when the error reduced much faster.
The performance of the trained network, fed with SXR signals, in forecasting disruptions is presented in Fig. 2. The vertical lines denote the actual triggering instant of the instabilities. The main observation here is that the predicted instant of triggering of the disruption started lagging behind the actual signal when the prediction was made 3 ms or more early. This more or less agreed with the results of Ref. [6].
The number of inputs was then increased by choosing two Mirnov probes and the SXR and Hα signals. The Mirnov probes chosen first were two closely located ones, at poloidal angles of 114° and 138°. It was observed that the rate of learning worsened, as did the forecasting errors on a new discharge. Next, two probes located more or less diametrically opposite to each other were selected, at angles of 42° and 234°. For this set of inputs, the learning improved over the SXR case but was worse than that of the Hα case. The performance of the ANN on new data, however, remained more or less the same as that on the single input cases. The experiment was repeated with similar inputs, but now the two Mirnov probes were those located at 138° and 330°. A much improved generalization capability of the ANN was noticed. Moreover, the ANN seemed to have gained a better tolerance for long term predictions.
Table 2. Comparison of the mean square training errors for the ANN provided with different combinations of diagnostic signals as inputs
Combination of inputs   Training error
SXR                     0.0165
Hα                      2.65 × 10⁻⁴
Four inputs (a)         0.0054
Four inputs (b)         0.0103
Six inputs (c)          0.0086
(a) Four inputs: Mirnov probes at 42° and 234°, together with SXR and Hα.
(b) Four inputs: Mirnov probes at 138° and 330°, together with SXR and Hα.
(c) Six inputs: Mirnov probes at 42°, 138°, 234° and 330°, together with SXR and Hα.
The number of inputs was further increased to four magnetic signals from probes located in the four quadrants around the plasma at angles of 42°, 138°, 234° and 330°, together with the SXR and Hα signals. Although the execution time increased because of the larger ANN structure, this set of inputs clearly produced an overall improvement in the fitting.
This observation was believed to be due to the uniformity of the probe locations around the plasma, so that more information was now put into the network. This was corroborated by the fact that initially the choice of Mirnov probes located diametrically opposite improved the performance, as compared with the set of signals from two closely located probes. The trained ANN then behaved still better with four probes more uniformly spread out in the four quadrants. This shows that the poloidal distribution of the probes was crucial for the ANN to perform well on out of sample discharges. Use of more probes, however, did not improve the fitting much, and the network ran the risk of being too heavy, resulting in unnecessary computation time.
Table 3. Comparison of ANN performance with respect to mean square error E² for single input and multiple input cases
Δt (ms)   E²_SXR   E²_Hα         E²_4 (42°, 234°)   E²_4 (138°, 330°)   E²_6
0.02      0.0224   5.62 × 10⁻⁴   0.1396             0.0221              0.0144
1.00      0.0342   0.1039        0.1758             0.0579              0.0365
2.00      0.0721   0.1898        0.2032             0.0922              0.0562
3.00      0.1141   0.2107        0.2143             0.1105              0.0667
4.00      0.1621   0.2254        0.2278             0.1262              0.0762
8.00      0.3038   0.2546        0.2662             0.1763              0.1088
Notes:
E²_SXR: mean square error for single input with SXR signal.
E²_Hα: mean square error for single input with Hα signal.
E²_4 (42°, 234°): mean square error for four inputs when the Mirnov probes were chosen from the 42° and 234° locations.
E²_4 (138°, 330°): mean square error for four inputs when the Mirnov probes were chosen from the 138° and 330° locations.
E²_6: mean square error for six inputs.
Table 2 compares the training errors for different ANN inputs. Table 3 displays the performance of the trained ANN with various combinations of inputs. These include:
(a) A single SXR signal input;
(b) A single Hα signal input;
(c) Four inputs consisting of the two Mirnov probes at poloidal angles of 42° and 234°, and the SXR and Hα signals;
(d) Four inputs consisting of the two Mirnov probes at poloidal angles of 138° and 330°, and the SXR and Hα signals;
(e) Six inputs comprising all four Mirnov probe signals, and the SXR and Hα signals.
When applied to new data, it is clear from Table 3 that the ANN was most tolerant to the increase of predicted time steps when six different diagnostics were used, although the training error as well as the single step prediction error were at their minimum when only the Hα signal was the input.
Therefore, the final set of diagnostic data used in this study consisted of the following:
(i) Four Mirnov probe signals. The probes chosen are located more or less symmetrically around the plasma, at poloidal angles of 42°, 138°, 234° and 330°.
(ii) One set of SXR monitor data.
(iii) One set of Hα monitor data.
Since each of the inputs to the ANN was an array composed of the past values of the variable, it had to be expressed as a vector rather than a scalar, the vector components corresponding to the past values (ten in our case). Thus there were six input vectors in the network, corresponding to the six diagnostic signals listed above. The outputs were the future values of the same signals, which in this study were predicted at a single time instant only, according to Eq. (11). Thus, the ANN had six scalar outputs.
Table 4. Comparison of the instants of disruption triggering as displayed by the indicator for various Δt using the Hα signal
Δt (ms)   Actual instant (ms)   Predicted instant (ms)
1.00      81.495                81.50
2.00      81.495                81.50
3.00      81.495                81.50
4.00      81.495                81.50
5.00      81.495                81.50
6.00      81.495                81.50
7.00      81.495                81.50
8.00      81.495                81.52
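The input and output layout described in this section (six signals, ten past values each, one future value per signal) can be sketched in Python as follows. The function name and the ordering of the values within the input vector are assumptions, as is counting the present sample $x_{t_1}$ as the first of the ten past values.

```python
def build_example(signals, t1, n_ahead, n_past=10):
    """Assemble one input/target pair for direct (method 2) prediction.

    signals : list of equal-length sequences, one per diagnostic
    t1      : index of the present sample
    n_ahead : number of steps n between the present and the target
    Returns (inputs, targets). inputs holds n_past values per signal,
    most recent first, signal after signal; targets holds the value of
    each signal n_ahead steps after t1.
    """
    if t1 - (n_past - 1) < 0 or t1 + n_ahead >= len(signals[0]):
        raise IndexError("window or target falls outside the record")
    inputs = []
    for signal in signals:
        inputs.extend(signal[t1 - k] for k in range(n_past))
    targets = [signal[t1 + n_ahead] for signal in signals]
    return inputs, targets


# Synthetic stand-ins for the six diagnostics (illustration only).
toy_signals = [[100 * c + i for i in range(500)] for c in range(6)]
toy_inputs, toy_targets = build_example(toy_signals, t1=20, n_ahead=400)
```

With six signals this gives 60 inputs and six targets per example. Sliding `t1` along a discharge produces the full set of training or test examples for a given lead time.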
6. Forecasting disruption
After the ANN was trained and the weight factors properly set, it was used to forecast disruptions in three disruptive discharges from ADITYA. These discharges differed in the maximum plasma current and the duration, but the general behaviour of the fluctuating quantities was similar. Another notable feature was that all these discharges ended in a major disruption, without any preceding minor disruption. As already mentioned, an important criterion for all our forecasting was to pick up the instant of disruption triggering.
For the actual detection of the instant of disruption triggering, which was our first goal, an indicator was made whereby, the moment the instabilities set in, an alarm would be given to the control system, which could then take measures to soften the impact of the disruption. Table 4 shows the triggering instants as displayed by the indicator for various Δt, using one of the forecasting discharges for the Hα signal. The Hα radiation in ADITYA was seen to remain at a more or less constant value (Figs 3, 6 and 9) during the ramp-up and flat-top phases of the discharge before starting to rise at the instant the disruption precursors set in (which coincides with the instant of disruption triggering). So the criterion for defining the disruption triggering was that the signal value should be greater than 2.00. The results showed that the predicted instants remained exactly the same up to Δt = 7 ms (although there was a very small discrepancy with the actual signal), while for Δt = 8 ms, a small time lag of 0.02 ms was observed for the first time. This seemed to be the trend in all the discharges used for forecasting, where this time lag varied from 0.02 to 0.03 ms. Therefore, in our results Δt was limited to 8 ms.
Since the ANN inputs were experimental signals, the inherent noise was inevitably present. It was observed that the error was reduced substantially after filtering of the noise, as shown in Table 5, so that a better fitting was achieved. This motivated us to use filtered experimental data as inputs in the subsequent cases.
Table 5. Comparison of ANN performance with respect to mean square error E² for unfiltered and filtered input signals (the first value of Δt corresponds to a single step ahead prediction)
Δt (ms)   Without filter   With filter
0.02      0.0272           0.0090
1.00      0.0577           0.0365
2.00      0.0650           0.0452
3.00      0.1037           0.0858
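The triggering indicator described in this section (an alarm as soon as the Hα signal exceeds 2.00) amounts to a first threshold crossing, sketched below in Python. The function name is hypothetical and the sample values are made up for illustration; they are not data from the discharges.

```python
def triggering_instant(times_ms, signal, threshold=2.0):
    """Return the first time at which the signal exceeds the threshold,
    or None if it never does."""
    for t, value in zip(times_ms, signal):
        if value > threshold:
            return t
    return None


# Made-up samples around the rise of the signal (illustration only).
toy_times = [81.46, 81.48, 81.50, 81.52]
toy_signal = [1.2, 1.9, 2.4, 3.0]
alarm_time = triggering_instant(toy_times, toy_signal)

# Lag between two predicted instants, using the Table 4 entries
# for lead times of 7 ms and 8 ms.
lag_ms = round(81.52 - 81.50, 2)
```

Applying the same function to the measured signal and to the network prediction gives the two instants compared in Table 4; their difference is the time lag, 0.02 ms (one sampling interval) for the 8 ms case.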
Figure 3 shows the first of the discharges used for forecasting, shot 6690. This 95.28 kA plasma disrupted at t ≈ 82 ms, while the disruption was triggered at t ≈ 81.50 ms, as our indicator shows. Figure 4 compares the quality of prediction of this disruptive event Δt = 1, 2, 4 and 8 ms earlier with the SXR experimental signal. Figure 5 does the same, but with the Hα signals. With a sampling time of 0.02 ms for these discharges, this corresponded to predicted time instants 50, 100, 200 and 400 time steps ahead, respectively, these being the values of n in Eq. (11).
The major observations from these figures were
the following.
(a) Unlike the previous articles [4] and [6] where
a time lag was reported for the predicted instant of
2000 Nuclear Fusion, Vol. 40, No. 12 (2000)
8/8/2019 Forecasting Disruptions in the ADITYA Tokamak
9/16
Article: Neural network forecasts of disruptions in ADITYA
0
10
20
Vloop2
5
0
5
SX
R
5
0
5
mag.
fluct.
0
50
100
Ip(kA)
Shot : 6690 06Jan1999 01:46:58 PM
0
5
H
0 10 20 30 40 50 60 70 80 90 1000
1
2
Bv(kA)
Time(ms)
Figure 3. The first disruptive discharge, shot 6690, was used for forecasting. This
plasma shot disrupted around 82 ms, and the disruption was triggered around 81 ms.
The plasma current attained prior to disruption was t 90 kA.
5
0
5
predicted t = 1ms
5
0
5
predicted t = 2ms
5
0
5
predicted t = 4ms
50 55 60 65 70 75 80 85 905
0
5
time(ms)
predicted t = 8ms
5
0
5
expt.
Figure 4. Forecasting disruption using our full network with six inputs for shot
6690. Only SXR signals are shown. The actual experimental signal is compared with
the neural network predictions for t = 1, 2, 4 and 8 ms early, as shown. The vertical
lines indicate the actual instant of triggering the disruption.
Nuclear Fusion, Vol. 40, No. 12 (2000) 2001
8/8/2019 Forecasting Disruptions in the ADITYA Tokamak
10/16
A. Sengupta and D. Ranjan
[Figure 5 panels: experimental Hα signal and predictions for Δt = 1, 2, 4 and 8 ms, plotted against time (ms).]
Figure 5. Forecasting disruption using shot 6690. Only Hα signals are shown. The
actual experimental signal is compared with the neural network predictions for Δt =
1, 2, 4 and 8 ms early, as shown. The vertical lines indicate the actual instant of
triggering the disruption.
disruption beyond 1.12 and 3.12 ms, respectively, the present study
did not show any appreciable time lag
even for a prediction 8 ms earlier. This showed a
significant improvement of the results through the use of
more diagnostic information in our neural network.
(b) As the temporal activities were predicted earlier
and earlier, there was only a small change in the
waveform of the predicted signals with respect to the
corresponding targets.
(c) The last 30 ms of the discharge was scanned.
This was found to be enough for our purpose, as
the temporal activities around the time the instabilities
were triggered have been well depicted. Moreover,
sawtooth phenomena are clearly observed in
Fig. 4, around 55 ms, which are also included within
the predicted part of the signal.
(d) The vertical lines in Figs 4 and 5 indicate
the instant the disruptive instabilities have just been
triggered. By following this line for each of the five
plots of each figure, the ANN prediction and the
actual disruption can be compared very well.
(e) A prediction at a time Δt early means that
the signal at time t is predicted at the instant t − Δt.
If the prediction results are analysed for Δt = 8 ms,
it is observed that the instant of observation of disruption
precursors around 81 ms was predicted by
using the temporal behaviour around 73 ms.
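The relation between a target sample and the past samples used to predict it, as described in observation (e), can be illustrated with a minimal sketch of how input and target pairs for a direct prediction might be assembled. This is not the authors' code: the function name, the ordering of the inputs and the random placeholder data are assumptions, and Eq. (11) is not reproduced here. The window of ten past values per signal follows the figure quoted in the summary of this article.

```python
import numpy as np


def make_direct_pairs(signals, n_past, n_ahead):
    """Build (input, target) pairs for direct n_ahead-step prediction.

    signals has shape (T, n_signals), one column per diagnostic signal.
    Each input row concatenates the n_past most recent samples of every
    signal; the target is the signal vector n_ahead samples later.
    """
    signals = np.asarray(signals, dtype=float)
    n_samples = signals.shape[0]
    # Index of the newest sample in each input window.
    newest = np.arange(n_past - 1, n_samples - n_ahead)
    inputs = np.stack([signals[i - n_past + 1:i + 1].ravel() for i in newest])
    targets = signals[newest + n_ahead]
    return inputs, targets, newest


DT_MS = 0.02      # sampling time quoted in the text
N_PAST = 10       # past values per signal (assumed from the summary)
N_AHEAD = 400     # 8 ms lead time at 0.02 ms sampling

rng = np.random.default_rng(0)
signals = rng.standard_normal((5000, 6))   # placeholder for six diagnostics over 100 ms

inputs, targets, newest = make_direct_pairs(signals, N_PAST, N_AHEAD)

# Which input window is paired with the target sample at 81 ms?
target_index = int(round(81.0 / DT_MS))
row = int(np.flatnonzero(newest + N_AHEAD == target_index)[0])
newest_input_time_ms = newest[row] * DT_MS
print(inputs.shape, targets.shape, newest_input_time_ms)
```

With these assumed settings, the newest input sample paired with the target at 81 ms lies at about 73 ms, in line with observation (e).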
Figure 6 shows the second plasma discharge used
for forecasting, shot 6520. This 83.57 kA discharge disrupted
at t ≈ 62 ms, the disruption being triggered at t ≈
60 ms. Figures 7 and 8 display the performance of the
neural network for prediction of this disruption 1, 2,
4 and 8 ms early, with only two of the inputs, the
SXR and Hα signals, being shown.
Analysis of shot 6520 revealed the following:
(i) Once again a very good prediction of the triggering
of the instability, the instant of which is
given by the vertical lines, was observed even for
Δt = 8 ms.
(ii) The last 28 ms of this discharge were predicted.
The reason for choosing only this portion
was that in this temporal range the SXR signal was
observed to rise along with the current ramp-up.
It was observed that the signal from the monitor
was able to pick up the actual rise of core temperature
only around 30 ms. However, once again this
[Figure 6 panels, shot 6520 (24 Dec 1998, 05:09:45 PM): Vloop, SXR, magnetic fluctuations, Ip (kA), Hα and Bv (kA), plotted against time (ms).]
Figure 6. The second disruptive discharge, shot 6520, used for forecasting. This
plasma discharge disrupted around 62 ms, and the disruption was triggered around
60 ms. The plasma current attained prior to disruption was 80 kA.
[Figure 7 panels: experimental SXR signal and predictions for Δt = 1, 2, 4 and 8 ms, plotted against time (ms).]
Figure 7. Forecasting disruption using shot 6520. Only SXR signals are shown.
The actual experimental signal is compared with the neural network predictions for
Δt = 1, 2, 4 and 8 ms early, as shown. The vertical lines indicate the actual instant
of triggering the disruption.
[Figure 8 panels: experimental Hα signal and predictions for Δt = 1, 2, 4 and 8 ms, plotted against time (ms).]
Figure 8. Forecasting disruption using shot 6520. Only Hα signals are shown. The
actual experimental signal is compared with the neural network predictions for Δt =
1, 2, 4 and 8 ms early, as shown. The vertical lines indicate the actual instant of
triggering the disruption.
sufficed, as this time regime contained the disruption
precursors followed by the current quench, as well as a
portion of the discharge prior to the triggering of the
instabilities.
(iii) The spikes of the SXR signal towards the
negative side were only noise and obviously did not
have any physical significance. These spikes continued
even after the discharge terminated. However,
Fig. 7 shows that the noise level was considerably filtered,
and the negative spikes were greatly reduced.
The third discharge used for forecasting,
shot 6688, is shown in Fig. 9. In this case the
98.39 kA plasma disrupted at t ≈ 65 ms, while
the triggering instabilities set in around 63.32 ms,
according to the indicator. The observations from
this discharge are described below:
(a) The signal from the Mirnov probe at 42°, and
the SXR and Hα signals were predicted remarkably
well, with very little distortion in the signals even for
a prediction 8 ms early.
(b) The SXR signals in this case did not contain
any negative spikes. In addition, sawtooth oscillations
were observed prior to the disruption, for the
last 30 ms. These sawteeth were excellently picked
up by the neural network.
(c) The vertical lines in Figs 10–12 show the
instant of triggering of the disruptive instabilities.
From Fig. 10 one observes that the MHD activities
as picked up by the Mirnov probe started increasing
around 63 ms, when the magnetic fluctuations
increased in amplitude.
It was seen from the results of all the three disrup-
tive discharges that, while predicting the disruption
occurrence, the ANN did not give any false predic-
tion within the non-disruptive part of the discharge.
This should be a good motivation for using this algo-
rithm as a disruption alarm.
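How a false alarm might be checked for on a predicted signal can be sketched as follows. The threshold rule, the tolerance, the function names and the synthetic signal are all hypothetical choices made for illustration; the disruption indicator actually used in this work is not reproduced here.

```python
import numpy as np


def first_alarm_time(t_ms, predicted, threshold):
    """Return the first time at which |predicted| exceeds threshold, else None."""
    above = np.flatnonzero(np.abs(predicted) > threshold)
    return None if above.size == 0 else float(t_ms[above[0]])


def is_false_alarm(alarm_ms, trigger_ms, tolerance_ms=1.0):
    """Count an alarm as false if it fires more than tolerance_ms before the trigger."""
    return alarm_ms is not None and alarm_ms < trigger_ms - tolerance_ms


# Synthetic stand-in for a predicted fluctuation signal (not ADITYA data):
# small oscillations that start growing at the trigger time.
TRIGGER_MS = 81.5
t_ms = np.arange(50.0, 90.0, 0.02)
amplitude = np.where(t_ms > TRIGGER_MS, 0.2 + 2.0 * (t_ms - TRIGGER_MS), 0.2)
predicted = amplitude * np.sin(2.0 * np.pi * t_ms / 0.5)

alarm_ms = first_alarm_time(t_ms, predicted, threshold=1.0)
print(alarm_ms, is_false_alarm(alarm_ms, TRIGGER_MS))
```

In this synthetic case the alarm can only fire after the assumed trigger time, so it is not counted as false; an alarm raised in the quiet part of the discharge would be.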
A general feature of all the predictions was that
towards the beginning of the predicted interval, several
of the predicted signals became a little distorted
with respect to the actual signal, especially at higher
Δt. However, for achieving the goals of the present
study, this was not likely to prove any hurdle, as only
the prediction of the signal around the instant of
the triggering of disruptive instabilities was of prime
concern. In the earlier part of the discharges, the
[Figure 9 panels, shot 6688 (06 Jan 1999, 01:34:55 PM): Vloop, SXR, magnetic fluctuations, Hα, Bv (kA) and Ip (kA), plotted against time (ms).]
Figure 9. The third disruptive discharge, shot 6688, used for forecasting. This plasma
discharge shows a major disruption at t ≈ 66 ms. The plasma current attained prior
to disruption was ≈103 kA.
[Figure 10 panels: experimental Mirnov probe signal and predictions for Δt = 1, 2, 4 and 8 ms, plotted against time (ms).]
Figure 10. Forecasting disruption using shot 6688. Only the signals from the Mirnov probe at 42° are shown.
The actual experimental signal is compared with the neural network predictions for
Δt = 1, 2, 4 and 8 ms early, as shown. The vertical lines indicate the actual instant
of triggering the disruption.
[Figure 11 panels: experimental SXR signal and predictions for Δt = 1, 2, 4 and 8 ms, plotted against time (ms).]
Figure 11. Forecasting disruption using shot 6688. Only SXR signals are shown.
The actual experimental signal is compared with the neural network predictions for
Δt = 1, 2, 4 and 8 ms early, as shown. The vertical lines indicate the actual instant
of triggering the disruption.
[Figure 12 panels: experimental Hα signal and predictions for Δt = 1, 2, 4 and 8 ms, plotted against time (ms).]
Figure 12. Forecasting disruption using shot 6688. Only Hα signals are shown.
The actual experimental signal is compared with the neural network predictions for
Δt = 1, 2, 4 and 8 ms early, as shown. The vertical lines indicate the actual instant
of triggering the disruption.
point that was of real importance for our purpose
was whether any false alarms were produced by the
ANN, when there were no indications of the triggering
of a disruption in the actual data.
None of the discharges used in this work was predicted
entirely. To do this, a fresh training was necessary,
as the plasma dynamics during the startup
phase were not picked up by the ANN during training,
which was also done using the last 35 ms
of the training discharge. For forecasting the whole
discharge, a large error was, therefore, anticipated.
But although the time series prediction formalism
requires the use of more past information for an accurate
prediction of the future, the initial phase of the
discharges was unlikely to provide any extra information
regarding the triggering of the instabilities
leading to the disruption.
7. Discussions
The forecasting of plasma disruptions in the ADITYA
tokamak was described in the previous section,
using a set of diagnostics different from what
had been used in the earlier works [4, 6]. The
use of a combination of several diagnostic signals,
rather than a single type of diagnostic as had been
used in the studies of [4, 6], was thought to have
produced the improved forecasting capabilities of
the ANN.
Apart from changing the nature of the inputs,
another major change was made in the present work
with respect to Refs [4, 6]. This concerns the use of a direct
prediction of the disruption, unlike the iterated prediction
methods incorporated earlier. However, it was shown
that the improvement in forecasting was not really
due to this change, as the use of a single input in this
work did not produce better results. In particular,
the performance of a trained network with only an
SXR signal showed a result similar to that of Ref. [6],
as the time lag was first observed around 3 ms. A
glance at Table 3 would reveal that the ANN prediction
error with only an Hα signal worsened further.
There was not much change when the SXR and Hα
signals were used along with two Mirnov probes at
42° and 234°. There was, however, a marked improvement
when the magnetic signals were from probes at
138° and 330°. The best predictions were obtained
from all four probes, together with the SXR and Hα
signals. From this it appears that two main factors
were responsible for the best prediction results in this
work:
(a) A definite combination of diagnostic signals
from four Mirnov probes, one SXR and one Hα monitor.
(b) The poloidal distribution of the Mirnov probes
around the plasma.
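The distinction between direct and iterated prediction drawn above can be sketched as follows. A linear least-squares predictor stands in for the ANN purely to keep the example short; it is not the authors' network, and the synthetic sinusoid, the window length and all function names are assumptions made for illustration.

```python
import numpy as np


def embed(x, p):
    """Return rows of p consecutive samples; row i is x[i:i + p]."""
    return np.stack([x[i:i + p] for i in range(len(x) - p + 1)])


def fit_linear(windows, targets):
    """Least-squares weights (last entry is the bias) mapping windows to targets."""
    design = np.hstack([windows, np.ones((len(windows), 1))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def predict(weights, window):
    """Apply the linear predictor to one window of past samples."""
    return float(np.dot(weights[:-1], window) + weights[-1])


def iterated_forecast(weights, window, n):
    """Apply a one-step predictor n times, feeding each output back as input."""
    window = np.array(window, dtype=float)
    for _ in range(n):
        window = np.append(window[1:], predict(weights, window))
    return float(window[-1])


P, N = 10, 50                           # window length; steps ahead (1 ms at 0.02 ms)
t_ms = np.arange(3000) * 0.02
x = np.sin(2.0 * np.pi * t_ms / 3.0)    # synthetic signal, not ADITYA data

windows = embed(x, P)                   # newest sample of row i is x[i + P - 1]
w_direct = fit_linear(windows[:len(windows) - N], x[P - 1 + N:])   # target N steps ahead
w_step = fit_linear(windows[:len(windows) - 1], x[P:])             # target 1 step ahead

start = 1000
truth = x[start + P - 1 + N]
direct_value = predict(w_direct, windows[start])
iterated_value = iterated_forecast(w_step, windows[start], N)
print(abs(direct_value - truth), abs(iterated_value - truth))
```

The direct predictor is fitted once to map a window straight to the value N steps later, whereas the iterated scheme reuses a one-step predictor N times. The same noiseless samples are used for fitting and evaluation here, so the printed errors say nothing about how either scheme would behave on experimental signals.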
The performance of the trained ANN in an actual
real time application for plasma disruption forecast
could not be in doubt, as the discharges used in this
study were experimental, and noise tolerance of the
ANN was automatically ensured.
Regarding the timescales of TEXT and ADITYA,
it can be stated that the scales are much shorter
for ADITYA, as the plasma duration for ADITYA
is around 100 ms while that for TEXT varies from
250 to 400 ms or more [6, 15]. Thus a detection of an
approaching disruption 8 ms in advance in ADITYA
would correspond to a much more reliable situation
for a real time prediction.
8. Summary and conclusions
In this article a neural network was used for forecasting
plasma disruptions in ADITYA. A number of
diagnostic signals were fed into the network input.
Although this made the structure of the network
heavier, it is believed that this increased input information
from a definite combination of the diagnostics
was the main reason for the significantly improved
performance. This combination provided the optimum
number of inputs to the ANN, with ten past
values of each of the temporal variables.
Confidence about the performance of the ANN
in real time could be gained from the fact that the
algorithm not only predicted the trigger of the instabilities
correctly, but did so sufficiently early, which
was the basic requirement for real time operations.
A forecast of an approaching disruption about 8 ms
in advance is extremely crucial, not only for medium
sized machines like ADITYA, but also for reactor
grade tokamaks like ITER where the pulse lengths
are to be around 1000 s. Since such long pulse operations
can be strongly inhibited by major disruptions,
a forecast as proposed in this study can be effectively
used to alert the real time control systems, and measures
such as electron cyclotron resonance heating,
pellet injections and neutral beam heating can be
put into operation to soften the harmful effects of disruptive
termination of a plasma discharge. In addition,
it was amply demonstrated that in the absence
of any approaching disruption, the network would
not give any false alarms. Finally, since experimen-
tal plasma discharges were used in this study, the
ability of the ANN from the point of view of noise
tolerance was automatically ensured.
One crucial observation in this work was that the
discharges used were not taken on the same day, and
yet no effect was noticed in the prediction quality.
The quality degraded slightly only due to a larger
Δt. From this it could be concluded that the physical
conditions, such as wall conditioning and average
plasma density, do not have any effect on the prediction
of disruption. Prediction depends basically on
the nature of the discharges. The discharges used in
this work were, by nature, similar in so far as the
general variation of the different temporally varying
plasma parameters is concerned. Moreover, all the
discharges ended in a major disruption without any
intermediate minor disruption. So although the maximum
plasma current, loop voltage and the duration
of the discharges varied from discharge to discharge,
these had no real effect on the quality of prediction.
Acknowledgements
The authors take this opportunity to express
their sincere thanks to J.B. Lister for providing
them with the neural network program. They gratefully
acknowledge H. Ramachandran for his suggestions
and critical comments after going through this
manuscript. One of the authors (AS) would like to
thank C. Ramdas who helped in drawing the neural
network structure of Fig. 1. Finally, the authors
thank the entire ADITYA team for supplying the
experimental data.
References
[1] Lister, J.B., Schnurrenberger, H., Nucl. Fusion 31 (1991) 1291.
[2] Coccorese, E., Morabito, C., Martone, R., Nucl. Fusion 34 (1994) 1349.
[3] Albanese, R., et al., Fusion Technol. 30 (1996) 219.
[4] Hernandez, J.V., et al., Nucl. Fusion 36 (1996) 1009.
[5] Wroblewski, D., Jahns, G.L., Leuer, J.A., Nucl. Fusion 37 (1997) 725.
[6] Vannucci, A., Oliveira, K.A., Tajima, T., Nucl. Fusion 39 (1999) 255.
[7] Yoshino, R., Koga, J.K., Takeda, T., Fusion Technol. 30 (1996) 237.
[8] Hamilton, J.D., Time Series Analysis, Princeton University Press, Princeton, NJ (1994).
[9] Weigend, A.S., Gershenfeld, N.A., Time Series Prediction: Forecasting the Future and Understanding the Past, Addison-Wesley, Reading, MA (1992).
[10] Bhatt, S.B., et al., Indian J. Pure Appl. Phys. 27 (1989) 710.
[11] Saxena, Y.C., Curr. Sci. 65 (1993) 25.
[12] Geva, A.B., IEEE Trans. Neural Networks NN-9 (1998) 1471.
[13] Bishop, C.M., Rev. Sci. Instrum. 65 (1994) 1803.
[14] Lister, J.B., Schnurrenberger, H., Marmillod, P., Implementation of a Multilayer Perceptron for a Non-linear Control Problem, Rep. LRP 398/90, CRPP-EPFL, Lausanne (1990).
[15] Vannucci, A., McCool, S.C., Nucl. Fusion 37 (1997) 1229.
(Manuscript received 27 October 1999
Final manuscript accepted 30 August 2000)
E-mail address of A. Sengupta:
Subject classification: C0, Tm
2008 Nuclear Fusion, Vol. 40, No. 12 (2000)