Research Article
Fuzzy Clustering-Based Ensemble Approach to Predicting Indian Monsoon
Moumita Saha,1 Pabitra Mitra,1 and Arun Chakraborty2
1 Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, Paschim Medinipur, West Bengal 721302, India
2 Centre for Oceans, Rivers, Atmosphere and Land Sciences, Indian Institute of Technology Kharagpur, Kharagpur, Paschim Medinipur, West Bengal 721302, India
Correspondence should be addressed to Moumita Saha; moumitasaha2012@gmail.com
Received 2 January 2015; Revised 31 March 2015; Accepted 3 April 2015
Academic Editor: Xiaolong Jia
Copyright © 2015 Moumita Saha et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Indian monsoon is an important climatic phenomenon and a global climatic marker. Both statistical and numerical prediction schemes for Indian monsoon have been widely studied in the literature. Statistical schemes are mainly based on regression or neural networks. However, the variability of monsoon is significant over the years, and a single model is often inadequate. Meteorologists revise their models in different years based on prevailing global climatic incidents like El Niño. These incidents often have a degree of severity associated with them. In this paper, we cluster the monsoon years based on their fuzzy degree of associativity to these climatic event patterns. Next, we develop individual prediction models for the year clusters. A weighted ensemble of these individual models is used to obtain the final forecast. The proposed method performs competitively with existing forecast models.
1. Introduction
Monsoon is a complex phenomenon of the climatic system. It is influenced by multiple climatic parameters and sea-atmosphere interactions. Prediction of monsoon is challenging due to the large variability present in its patterns. The Indian Meteorological Department (IMD) has issued forecasts of Indian summer monsoon rainfall (ISMR) since 1886. Indian monsoon forecasting was initiated by Blanford [1] as early as 1882. The success of the forecasts in the span 1882–1885 encouraged Blanford to design an operational long-range forecast model for monsoon in 1886. Subsequently, Walker [2] developed models studying the statistical correlations between rainfall and different global climate parameters. Thapliyal and Kulshrestha [3] introduced a regression model for predicting south-west Indian monsoon rainfall. Gowariker et al. [4] proposed a power regression model for long-term forecast of monsoon, which provided accurate forecasts for a long period but failed to predict the extreme condition of 2002. In 2004, Rajeevan et al. [5] reassessed different climatic parameters and introduced four new parameters to design a statistical model for issuing long-range forecasts of Indian monsoon. Subsequently, in 2007, Rajeevan et al. [6] built models using ensemble multiple regression and projection pursuit regression to forecast Indian rainfall, which proved superior to past IMD models. Schewe and Levermann [7] explain the change in the distribution of Indian rainfall and also explain the reasons behind the failure of monsoon in certain years. Wu et al. [8] propose a linear Markov model to predict short-term climate variability of the East Asian monsoon. Fan et al. [9] develop two statistical prediction schemes for seasonal forecast of the East Asian summer monsoon. The schemes take the direct outputs of the existing models and give better prediction of the summer monsoon.

Artificial neural networks (ANN) [10] are widely used in modelling the nonlinearity present in the monsoon process. Sahai et al. [11] use ANN techniques with error backpropagation to forecast Indian summer monsoon rainfall. Hong [12]
Hindawi Publishing Corporation, Advances in Meteorology, Volume 2015, Article ID 329835, 12 pages, http://dx.doi.org/10.1155/2015/329835
predicts Indian summer monsoon utilizing a recurrent neural network and also demonstrates the successful employment of support vector machines in solving nonlinear regression and time series problems. Three different backpropagation neural learning rules, namely, momentum learning, conjugate gradient descent learning, and Levenberg-Marquardt learning, are used by S. Chattopadhyay and G. Chattopadhyay [13] to perform a comparative study of different neural network methods for predicting rainfall time series.
The presence of large variability in monsoon patterns makes it difficult for a single model to predict its distribution. A number of uncertainties, including boundary condition, parameter, and structural uncertainties, are involved in the construction of these models. Thus, it remains fundamentally challenging to have a single model for prediction. Multimodel ensembles, which combine the outcomes of different models to produce efficient results, are proposed to overcome the weakness of a single model [14, 15]. In addition, monsoon shows different characteristics over the years. There exist groups of years where the variation of climatic parameters and the pattern of rainfall are similar. We use fuzzy clustering to cluster the similar years together and model them separately. The motivation behind using fuzzy clustering is that each year manifests a mixture of physical climatic events. We cannot hard-cluster a year into a specific group; years have a membership of belongingness to every cluster. Fuzzy clustering is used to capture the characteristics of the different events related to a year of study. We use the same set of climatic parameters as the predictor set for every cluster but frame different models for each cluster.
A number of prediction models, namely, multiple regression (MR), multilayer perceptron (MLP), recurrent neural network (RNN), and generalized regression neural network (GRNN) models, are used for prediction of Indian monsoon for the year clusters. There exist viable reasons for using neural networks like MLP, RNN, and GRNN for modelling: (i) Indian monsoon is a complex process which cannot be adequately modelled by linear models; (ii) nonlinearity in the time-series pattern can be well captured by neural network learning; (iii) climatic events are much more closely related to parameter disturbances in near years than to those in distant years, and a neural network enables attaching weights to the year parameters in an appropriate manner.
In this work, climatic parameters that are strongly correlated with Indian monsoon are identified at the onset, which is followed by fuzzy clustering of years into groups with a degree of belongingness of each year to the clusters. Then, we model each cluster with four types of models, namely, MR, MLP, RNN, and GRNN, to forecast rainfall. The weighted ensemble of the forecasts given by the respective models for each cluster is considered as the final predicted rainfall. Analysis and comparisons are performed on aggregate Indian rainfall, and finally a meteorological interpretation of the obtained clusters is presented.
The paper is organised in the following manner. We discuss the details of the data and the predictor climatic parameters in Sections 2 and 3. The proposed clustering-based approach, prediction models, and ensemble technique are presented in Section 4, with experimental results in Section 5. Meteorological significance is discussed in Section 6, and finally conclusions are provided in Section 7.
2. Data Sets Used
We consider the annual Indian summer monsoon rainfall (ISMR) occurring in the four months of June, July, August, and September. Annual ISMR is considered during the period 1948–2013 for our study. The long period average (LPA) (1948–2013) of ISMR is 891.8 mm. ISMR is expressed as a percentage of the LPA value. The data is obtained from the Indian Institute of Tropical Meteorology, Pune (http://www.imdpune.gov.in/research/ncc/longrange/data/data.html) [16].

Predictor parameters sea level pressure (SLP) (http://www.esrl.noaa.gov/psd/gcos_wgsp/Gridded/data.noaa.erslp.html) and sea surface temperature (SST) (http://www.esrl.noaa.gov/psd/data/gridded/data.noaa.ersst.html) data are provided by the NOAA/OAR/ESRL PSD at a spatial resolution of 2° × 2° [17]. Surface pressure (SP) and zonal wind velocity (WV) data are collected from NCEP Reanalysis Derived data provided by the NOAA/OAR/ESRL PSD (http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis.derived.surface.html) [18], available at a resolution of 2.5° × 2.5°. Finally, Niño 3.4 data, which is the sea surface temperature anomaly for the spatial coverage of 5°S to 5°N and 170°W to 120°W in the Pacific Ocean region, is acquired from the National Center for Atmospheric Research (http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ensoyears.shtml) [19]. All the above monthly data are considered for the period 1948–2013 in our study and analysis.
3. Global Climatic Parameters Influencing Indian Monsoon
Indian monsoon is strongly influenced by several global climatic parameters occurring at places distant from the Indian subcontinent. Identification of predictor parameters relies on a physical understanding of the monsoon event and the wind pattern flow. We have selected the climatic parameters based on the parameters used by the Indian Meteorological Department's models [5, 6], studying their correlation with Indian summer monsoon rainfall (ISMR) during our period of study (1948–2013). In the data preprocessing phase, climatic anomaly data are evaluated by calculating the deviation of the parameter value from the long-term average value of the parameter, separately for each month, followed by a correlation study between ISMR and the climatic parameters for a lag of zero to twelve months. For each parameter, we take the lagged month having the highest correlation with ISMR as the predictor. The predictor climatic parameters and their correlation values with Indian monsoon are shown in Table 1. Figure 1 shows the geographic location of the climatic parameters influencing Indian monsoon.

Predictor Sets of Climatic Parameters. Based on the correlation with Indian monsoon, we have built five predictor sets for forecasting. Different combinations of the identified climatic parameters (Table 1) form the predictor sets. The predictor sets are shown in Table 2.
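The preprocessing described in this section (monthly anomalies followed by a lagged correlation study) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the data layout (one list of 12 monthly values per year), and in particular the choice of June as the reference month from which lags of 0 to 12 months are counted are assumptions made here for the example.

```python
from math import sqrt

def monthly_anomalies(values):
    """values[y][m]: parameter value for year y, month m (0 = January).
    Returns the deviation from the long-term mean of each calendar month."""
    n_years = len(values)
    month_means = [sum(year[m] for year in values) / n_years for m in range(12)]
    return [[year[m] - month_means[m] for m in range(12)] for year in values]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def lagged_correlations(anoms, rainfall, ref_month=5, max_lag=12):
    """Correlate the rainfall of year t with the anomaly observed `lag` months
    before the reference month of year t (ref_month=5, June, is an assumption).
    Returns {lag: correlation}."""
    flat = [v for year in anoms for v in year]  # one continuous monthly series
    corrs = {}
    for lag in range(max_lag + 1):
        xs, ys = [], []
        for t, rain in enumerate(rainfall):
            idx = t * 12 + ref_month - lag
            if idx >= 0:  # skip years whose lagged month precedes the record
                xs.append(flat[idx])
                ys.append(rain)
        corrs[lag] = pearson(xs, ys)
    return corrs
```

The lag with the largest absolute correlation, for example `max(corrs, key=lambda k: abs(corrs[k]))`, would then identify the correlated month of a parameter; with June as the reference, a lag of 9 months corresponds to September of the previous year.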
Table 1: Climatic parameters (CP) influencing Indian monsoon with geographical location, correlation values, and correlated month (0 signifies same year and −1 signifies previous year).

| CP | CP name | Location | Correlation value | Correlated month |
|---|---|---|---|---|
| CP1 | North Atlantic Ocean SST anomaly | 20°N–30°N, 100°W–80°W | 0.242 | Jan (0) |
| CP2 | North Atlantic Ocean surface pressure anomaly | 20°N–30°N, 100°W–80°W | 0.256 | April (0) |
| CP3 | East Asia SLP anomaly | 35°N–45°N, 120°E–130°E | 0.337 | May (0) |
| CP4 | East Asia surface pressure anomaly | 35°N–45°N, 120°E–130°E | 0.341 | Mar (0) |
| CP5 | Equatorial South Eastern Indian Ocean SST anomaly | 20°S–10°S, 100°E–120°E | 0.200 | Sept (−1) |
| CP6 | Pressure gradient between Madagascar and Tibet | - | 0.253 | May (0) |
| CP7 | Niño 3.4 SST anomaly | 5°S–5°N, 170°W–120°W | 0.311 | Sept (−1) |
| CP8 | Equatorial Pacific Ocean SLP anomaly | 5°S–5°N, 120°E–80°W | 0.272 | Aug (−1) |
| CP9 | North West Europe surface pressure anomaly | 55°N–65°N, 20°E–40°E | 0.183 | Jan (0) |
| CP10 | North Central Pacific zonal wind anomaly | 5°N–15°N, 180°E–150°W | 0.457 | May (0) |
Figure 1: Climatic parameters over the globe governing Indian monsoon (purple patches signify the locations of the climatic parameters taken, and the blue patch represents the Indian region). CP$_i$ represents parameter $i$ in Table 1.
Table 2: Predictor sets with climatic parameters.

| Predictor set | Climatic parameters |
|---|---|
| PredSet1 | CP1, CP4, CP5, CP6 |
| PredSet2 | CP4, CP5, CP6, CP7 |
| PredSet3 | CP2, CP4, CP10 |
| PredSet4 | CP2, CP4, CP7, CP10 |
| PredSet5 | CP3, CP7, CP8, CP9 |
4. Methodology
We propose fuzzy clustering of monsoon years into groups, followed by building models for each group separately and finally predicting Indian summer monsoon rainfall (ISMR) as a weighted ensemble of the forecasts provided by the cluster models. The block diagram of the proposed fuzzy clustering-based approach to prediction of ISMR is shown in Figure 2. Detailed steps are described in the following subsections.
4.1. Motivation: Variability of Monsoon Patterns. Trends and distributions of monsoon vary to a large extent over the years. It is thus necessary to group the years into clusters which have similar patterns of the predictor climatic parameters affecting monsoon. Clustering the years is effective because we can build separate models for each cluster. These cluster models will be more accurate, as the variation within a cluster is smaller. Finally, an ensemble of the forecasts of these cluster models results in better prediction of Indian monsoon. As an example, consider two clusters of years corresponding to strong El Niño and North Atlantic Oscillation, respectively. A drought year has correlation with both events and hence might have a significant degree of belongingness to both clusters.
4.2. Fuzzy Clustering of Monsoon Years. Fuzzy $c$-means clustering is used for grouping similar years together. Fuzzy $c$-means (FCM) is a method of clustering which allows one instance of input to belong to more than one cluster with some membership of belongingness. FCM attempts to partition a set of $N$ elements $Y = \{y_1, \ldots, y_N\}$ into a collection of $c$ fuzzy clusters $C = \{\mathrm{cen}_1, \ldots, \mathrm{cen}_c\}$ and a partition matrix $W = [w_{ij}]$, $w_{ij} \in [0, 1]$, $i = 1, \ldots, N$, $j = 1, \ldots, c$, where $w_{ij}$ gives the degree of belongingness of element $y_i$ to the cluster with center $\mathrm{cen}_j$. FCM aims to minimize the objective function of (1). The updates of the partition matrix and the centers occur in accordance with (2) and (3), respectively:

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{c} w_{ij}^{m} \left\| y_i - \mathrm{cen}_j \right\|^{2}, \quad 1 \le m \le \infty, \tag{1}$$

$$w_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{\left\| y_i - \mathrm{cen}_j \right\|}{\left\| y_i - \mathrm{cen}_k \right\|} \right)^{2/(m-1)}}, \tag{2}$$

$$\mathrm{cen}_j = \frac{\sum_{i=1}^{N} w_{ij}^{m} \cdot y_i}{\sum_{i=1}^{N} w_{ij}^{m}}, \tag{3}$$

where $m$ denotes the level of cluster fuzziness.
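The alternating updates of the partition matrix, Eq. (2), and of the centers, Eq. (3), can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions rather than the implementation used in the paper: the deterministic initialisation from evenly spaced data points, the default $m = 2$, and the stopping tolerance are choices made here, and the function name `fuzzy_c_means` is hypothetical.

```python
import math

def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6):
    """Alternate the membership update of Eq. (2) and the centre update of Eq. (3).

    points: list of feature vectors (one per year); c: number of clusters;
    m > 1: fuzziness level. Returns (centers, W) with W[i][j] = w_ij.
    """
    n, dim = len(points), len(points[0])
    # Assumption: deterministic start from evenly spaced data points.
    centers = [list(points[(j * (n - 1)) // max(c - 1, 1)]) for j in range(c)]
    W = []
    for _ in range(max_iter):
        # Eq. (2): memberships from relative distances to all centres.
        W = []
        for y in points:
            d = [_dist(y, cen) for cen in centers]
            if min(d) == 0.0:
                # y coincides with a centre: put all membership there.
                hits = [1.0 if dj == 0.0 else 0.0 for dj in d]
                W.append([h / sum(hits) for h in hits])
            else:
                e = 2.0 / (m - 1.0)
                W.append([1.0 / sum((d[j] / d[k]) ** e for k in range(c))
                          for j in range(c)])
        # Eq. (3): centres as membership-weighted means of the points.
        new_centers = []
        for j in range(c):
            wm = [W[i][j] ** m for i in range(n)]
            total = sum(wm)
            new_centers.append([sum(wm[i] * points[i][a] for i in range(n)) / total
                                for a in range(dim)])
        shift = max(_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # centres have stopped moving
            break
    return centers, W
```

Each row of `W` sums to 1, so a year's memberships can be read as its relative association with each cluster. In the setting of this paper, `points` would hold one vector of predictor-set anomalies per year and `c` would be 3.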
[Figure 2 (block diagram). Input: climate parameters (CP) and rainfall. Data preprocessing: climate anomaly data evaluation; correlation study of CP and rainfall; identification of the month of each CP highly correlated to rainfall; building predictor sets with different CP combinations. Clustering years: clustering the predictor set using fuzzy clustering; applying an alpha-cut to obtain clusters and membership values. Ensemble neural network model: forecast by each model (MR, MLP, RNN, GRNN) for every cluster; ensemble of the outputs of all clusters. Validation: error calculation; comparison with existing models; comparison with the nonclustered method. Physical interpretation of clusters: support, confidence, threshold. Output: forecast of rainfall; error in forecasting; model comparison results; climatic events associated with each cluster.]

Figure 2: Proposed fuzzy clustering-based ensemble approach for prediction of Indian summer monsoon rainfall.
4.3. Prediction Models. Multiple regression and three models of artificial neural networks (ANN), namely, multilayer perceptron, recurrent neural network, and generalized regression neural network, are used to design prediction models for each cluster exclusively. A forecast of annual ISMR is provided by each cluster model separately and also by the ensemble of all the clusters' model forecasts. We describe the models used below.
4.3.1. Multiple Regression (MR). The multiple regression model is used to learn the relationship between several independent predictor variables ($X_i$s) and a dependent variable ($Y$).
Table 3: Model parameter setting for MLP models.

| Parameter set | Hidden layers | Training years | Training method |
|---|---|---|---|
| ParSet1 | [3, 5] | 20 | BFGS quasi-Newton backpropagation |
| ParSet2 | [3, 5, 10] | 15 | Conjugate gradient backpropagation with Powell-Beale restarts |
| ParSet3 | [5, 10] | 10 | Scaled conjugate gradient backpropagation |
| ParSet4 | [3, 5] | 15 | Resilient backpropagation |
The multiple regression model having $p$ independent variables is shown in

$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \tag{4}$$

where $x_{ij}$ is the $i$th observation of the $j$th independent variable, the first independent variable takes the value 1 for all $i$, and $\varepsilon_i$ represents the residual.
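As a rough illustration of Eq. (4), the coefficients can be estimated by ordinary least squares. The sketch below assumes least squares via the normal equations, which the text does not state explicitly; the helper names `fit_mr` and `predict_mr` are hypothetical. The constant first variable of Eq. (4) is added inside the function, so it plays the role of the intercept.

```python
def fit_mr(X, y):
    """Least-squares fit of Eq. (4). X: list of predictor rows (without the
    constant column); y: list of targets. Returns beta, with beta[0] the
    coefficient of the constant first variable (the intercept)."""
    A = [[1.0] + list(row) for row in X]  # first variable is 1 for all i
    p = len(A[0])
    # Augmented normal equations: (A^T A | A^T y).
    M = [[sum(r[a] * r[b] for r in A) for b in range(p)]
         + [sum(r[a] * yi for r, yi in zip(A, y))]
         for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]

def predict_mr(beta, row):
    """Forecast for one row of predictor values."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))
```

In the cluster-based setting, one such model would be fitted per cluster, using only the training years assigned to that cluster.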
4.3.2. Multilayer Perceptron Neural Network (MLP). The multilayer perceptron neural network is a class of ANN where the connections between the neurons do not form a directed cycle. In this network, information propagates in only one direction, from the input nodes through the hidden nodes to the output nodes. The independent and dependent variables constitute the input and output layers, respectively. The number of hidden layers and the corresponding number of nodes must be determined empirically for each prediction task. Four different parameter sets, chosen empirically, are considered for the models designed to forecast ISMR, as shown in Table 3.
4.3.3. Recurrent Neural Network (RNN). The recurrent neural network is a class of ANN which creates an internal state of the network to exhibit dynamic temporal behaviour. Climatic changes or events occurring in near or the same time periods are highly correlated. Similarly, rainfall patterns are more correlated to influencing factors in near years than to those in distant years. This phenomenon is well captured by RNN, which during training of the network gives weights that decrease from near to distant years. Thus, it assists in modelling the system dynamics in a much more natural manner. The same model parameter sets as for the MLP network (Table 3) are considered, with a delay span of 2 units.
4.3.4. Generalized Regression Neural Network (GRNN). The generalized regression neural network is a variant of the radial basis function network. GRNN has three layers of artificial neurons: input, hidden, and output. The hidden layer has radial basis neurons, while the neurons in the output layer have a linear transfer function. The output of the radial basis neurons is the input scaled by the spread factor. Given $p$ input-output pairs $(x_i, y_i) \in \mathbb{R}^n \times \mathbb{R}^1$ with $n$ input variables and $i = 1, 2, \ldots, p$, $y_i$ represents the output from each hidden unit. The GRNN output for a test point $x \in \mathbb{R}^n$ is described by

$$y(x) = \sum_{i=1}^{p} W_i y_i, \tag{5}$$

where

$$W_i = \frac{\exp\left(-\left\| x - x_i \right\|^{2} / 2\sigma^{2}\right)}{\sum_{k=1}^{p} \exp\left(-\left\| x - x_k \right\|^{2} / 2\sigma^{2}\right)}. \tag{6}$$
The reasons behind modelling using GRNN are as follows: (i) it has only one tunable design parameter (the spread factor), (ii) it is a one-pass algorithm (less time consuming), and (iii) it accurately approximates functions from sparse data.
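Equations (5) and (6) amount to a kernel-weighted average of the training outputs, which is why no iterative training is needed. A minimal sketch follows; the function name `grnn_predict` and the default spread of 1.0 are assumptions for illustration, and $\sigma$ is read here as the spread factor mentioned above.

```python
import math

def grnn_predict(x, train_X, train_y, sigma=1.0):
    """GRNN output of Eq. (5) with the normalised Gaussian weights of Eq. (6).

    x: test point; train_X: list of training input vectors;
    train_y: list of training outputs; sigma: spread factor (assumed value).
    """
    kernels = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * sigma ** 2))
        for xi in train_X
    ]
    total = sum(kernels)  # denominator of Eq. (6)
    return sum(k * yi for k, yi in zip(kernels, train_y)) / total
```

A small `sigma` makes the forecast follow the nearest training year closely, while a large `sigma` pulls it toward the mean of all training outputs. For a test point very far from every training point relative to `sigma`, the kernels can underflow to zero, so a practical version would need to guard the division.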
The optimal number of training years is ascertained for the MR and GRNN models by varying the training years from 5 to 30 and selecting the value with the least absolute error in prediction during the validation period (1984–1993). A training of $m$ years specifies that, for predicting the rainfall of the $r$th year, the available preceding $m$ years $r-1, r-2, \ldots, r-m$ present in a particular cluster are considered for training.
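The search over training lengths can be sketched as below. This is only one possible reading of the procedure: the callback `fit_predict` (which trains on the given years and returns a forecast for the target year), the use of the mean absolute error over the validation years, and the rule of skipping a validation year when fewer than $m$ preceding cluster years are available are all assumptions made for this example.

```python
def select_training_years(years, targets, fit_predict, val_years,
                          candidates=range(5, 31)):
    """Pick the training length m with the least mean absolute validation error.

    years: sorted years belonging to one cluster;
    targets: dict mapping year -> observed rainfall;
    fit_predict(train_years, test_year): hypothetical callback that trains a
        model on train_years and returns its forecast for test_year;
    val_years: validation years (1984-1993 in the paper).
    Returns (best_m, best_error).
    """
    best_m, best_err = None, float("inf")
    for m in candidates:
        errors = []
        for r in val_years:
            preceding = [y for y in years if y < r][-m:]  # last m cluster years
            if len(preceding) < m:
                continue  # not enough history for this m
            errors.append(abs(fit_predict(preceding, r) - targets[r]))
        if errors:
            err = sum(errors) / len(errors)
            if err < best_err:
                best_m, best_err = m, err
    return best_m, best_err
```

Ties are resolved in favour of the shorter training length, since a candidate replaces the current best only when its error is strictly smaller.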
4.4. Ensemble of Predictors. The complexity of the monsoon process makes it difficult for a single model to predict rainfall accurately. We design separate models for each cluster of years obtained by fuzzy clustering, using the four predictors described in Section 4.3. Finally, annual ISMR is presented as a weighted ensemble of the forecasts of the models designed for each cluster. The weight is taken as the fuzzy membership of belongingness of the test year in the different clusters.
$$\text{Ensemble prediction}_t = \sum_{i=1}^{c} W_i^{t} \cdot P_i, \tag{7}$$

where $P_i$ represents the prediction given by a model for cluster $i$, $W_i^{t}$ is the fuzzy membership of the $t$th test year to cluster $i$, and $c$ is the total number of clusters.
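A worked instance of Eq. (7) may help. The sketch below uses made-up membership values and cluster forecasts purely for illustration; the function name is hypothetical.

```python
def ensemble_forecast(memberships, cluster_forecasts):
    """Eq. (7): membership-weighted sum of the cluster model forecasts.

    memberships: fuzzy memberships W_i^t of the test year in each cluster;
    cluster_forecasts: forecast P_i from the model of each cluster.
    """
    return sum(w * p for w, p in zip(memberships, cluster_forecasts))

# Illustrative numbers only: a test year with memberships 0.2, 0.5, 0.3 and
# cluster forecasts of 90, 100, 110 (% of LPA) gives 0.2*90 + 0.5*100 + 0.3*110.
example = ensemble_forecast([0.2, 0.5, 0.3], [90.0, 100.0, 110.0])
```

Because the memberships produced by Eq. (2) sum to 1 for each year, the ensemble forecast is a convex combination and always lies between the smallest and largest cluster forecasts.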
4.5. Validation of Proposed Approach. The study is performed on data for the period 1948–2013. Fuzzy clustering is performed over the period to cluster it into three groups. The number of clusters is decided based on cluster quality. Separate prediction models are designed for all three clusters, and the ensemble of the forecasts of these models is provided as the predicted Indian summer monsoon rainfall. The test period 2001–2013 is considered to evaluate the forecasting skills of our proposed approach.

The forecast models for annual ISMR are chiefly evaluated in terms of mean absolute error. Other error statistics, namely, root mean square error, prediction yields, Pearson correlation, and Willmott index of agreement, are also evaluated to judge the efficacy of our proposed approach for prediction. They are described below.
(i) Mean Absolute Error (MAE). Mean absolute error for prediction of annual ISMR is calculated in the following way:

$$\mathrm{MAE} = \frac{\sum_{i=1}^{N} \left| Y_i - X_i \right|}{N}, \tag{8}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $N$ denotes the total number of test years.

(ii) Root Mean Square Error (RMSE). Root mean square error aggregates the differences between the model-predicted output and the actual values. It is a good measure for comparing the forecasting errors of various models:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( Y_i - X_i \right)^{2}}{N}}. \tag{9}$$

(iii) Prediction Yield (PY). Prediction yields are evaluated at three different error categories (5%, 10%, and 15% errors) to assess the overall prediction results by judging the percentage of predicted years within each allowed range of errors.

(iv) Pearson Correlation Coefficient (PC). The Pearson correlation coefficient measures the strength of linear association between actual and predicted values, where the value of 1 means a perfect positive correlation and the value of −1 means a perfect negative correlation:

$$\mathrm{PC} = \frac{\sum_{i=1}^{N} \left( X_i - \bar{X} \right) \left( Y_i - \bar{Y} \right)}{\sqrt{\sum_{i=1}^{N} \left( X_i - \bar{X} \right)^{2}} \sqrt{\sum_{i=1}^{N} \left( Y_i - \bar{Y} \right)^{2}}}, \tag{10}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $\bar{X}$ and $\bar{Y}$ are their corresponding means.

(v) Willmott Index of Agreement (WI). The Willmott index of agreement is a standardized measure of the degree of model prediction error. It varies between 0 and 1, with higher values indicating a better fit of the model for prediction:

$$\text{Index of agreement} = 1 - \frac{\sum_{i=1}^{N} \left| X_i - Y_i \right|^{2}}{\sum_{i=1}^{N} \left( \left| Y_i - \bar{X} \right| + \left| X_i - \bar{X} \right| \right)^{2}}. \tag{11}$$
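The five verification measures of Eqs. (8) to (11) and the prediction yield can be computed as sketched below. The function names are hypothetical, and the prediction yield is implemented under the assumption that a year counts as "within the allowed error" when its absolute error, in the same units as the series (percent of LPA here), does not exceed the tolerance.

```python
from math import sqrt

def mae(actual, pred):
    """Eq. (8): mean absolute error."""
    return sum(abs(p - a) for a, p in zip(actual, pred)) / len(actual)

def rmse(actual, pred):
    """Eq. (9): root mean square error."""
    return sqrt(sum((p - a) ** 2 for a, p in zip(actual, pred)) / len(actual))

def prediction_yield(actual, pred, tol):
    """Percentage of years whose absolute error is at most tol (assumed rule)."""
    hits = sum(1 for a, p in zip(actual, pred) if abs(p - a) <= tol)
    return 100.0 * hits / len(actual)

def pearson(actual, pred):
    """Eq. (10): Pearson correlation between actual and predicted series."""
    n = len(actual)
    ma, mp = sum(actual) / n, sum(pred) / n
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, pred))
    va = sum((a - ma) ** 2 for a in actual)
    vp = sum((p - mp) ** 2 for p in pred)
    return cov / (sqrt(va) * sqrt(vp))

def willmott_index(actual, pred):
    """Eq. (11): Willmott index of agreement."""
    ma = sum(actual) / len(actual)
    num = sum((a - p) ** 2 for a, p in zip(actual, pred))
    den = sum((abs(p - ma) + abs(a - ma)) ** 2 for a, p in zip(actual, pred))
    return 1.0 - num / den
```

For instance, with illustrative series `actual = [100, 90, 110, 95]` and `pred = [102, 96, 108, 95]`, the absolute errors are 2, 6, 2, and 0, so the MAE is 2.5 and the prediction yield at a tolerance of 5 is 75%.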
5. Experimental Results and Analysis
In this section, we present the evaluation of our proposed fuzzy clustering-based approach. We first present the results of fuzzy clustering of the monsoon years for different predictor sets. Forecasting skills are evaluated for all cluster models and the ensemble model in terms of mean absolute errors for the test period 2001–2013. In addition, other measures like root mean square errors in prediction, correlation between predicted and actual rainfall, prediction yields, and agreement index between actual and predicted rainfall are also estimated to establish the efficiency of our proposed approach to prediction of Indian summer monsoon rainfall.

Table 4: Cluster size (number of years) by fuzzy $c$-means clustering with $\alpha$-cut of 0.3 over the period 1948–2013.

| Predictor set | Cluster1 | Cluster2 | Cluster3 |
|---|---|---|---|
| PredSet1 | 16 | 38 | 30 |
| PredSet2 | 30 | 17 | 40 |
| PredSet3 | 32 | 14 | 38 |
| PredSet4 | 42 | 31 | 21 |
| PredSet5 | 15 | 37 | 26 |
5.1. Clustering of Monsoon Years. Fuzzy clustering is performed over the period 1948–2013 to cluster the data into three clusters. We have performed an $\alpha$-cut with value $\alpha = 0.3$ to assign the data instances to the clusters. The value is ascertained empirically such that the distribution of elements within clusters is regular. A data instance can be assigned to more than one cluster simultaneously. The cluster sizes obtained with the various predictor sets are shown in Table 4.
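The $\alpha$-cut assignment can be illustrated with a short sketch. The function name and the use of "greater than or equal to" at the threshold are assumptions; the text specifies only the value $\alpha = 0.3$. The membership rows in the example are made up for illustration.

```python
def alpha_cut(W, alpha=0.3):
    """Assign each instance to every cluster in which its fuzzy membership
    is at least alpha. W[i][j] is the membership of instance i in cluster j.
    Returns one list of instance indices per cluster."""
    n_clusters = len(W[0])
    return [[i for i, row in enumerate(W) if row[j] >= alpha]
            for j in range(n_clusters)]

# Illustrative memberships for three years and three clusters.
memberships = [
    [0.50, 0.30, 0.20],  # year 0 joins clusters 0 and 1
    [0.10, 0.10, 0.80],  # year 1 joins cluster 2 only
    [0.35, 0.35, 0.30],  # year 2 joins all three clusters
]
clusters = alpha_cut(memberships)
```

Since one year can pass the threshold in several clusters, the cluster sizes can add up to more than the number of years. This is consistent with Table 4, where, for example, the PredSet1 cluster sizes 16, 38, and 30 total 84 although the period 1948–2013 contains 66 years.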
5.2. Prediction Accuracy. We predict annual rainfall for all five predictor sets (Table 2) separately using four models, namely, MR, MLP, RNN, and GRNN. The test period is 2001–2013.
5.2.1. Multiple Regression Model (MR). Multiple regression models are built for every cluster by ascertaining the optimal training period for each predictor set. The optimal training period is evaluated by varying the training years and validating them for the least absolute error in prediction during the validation period (1984–1993). Individual cluster-based as well as weighted ensemble models are considered for prediction. Table 5 gives the mean absolute error for the individual cluster-based and ensemble models for the test period 2001–2013. The model provides a mean absolute error of 6.2% for PredSet4 (Table 2). It is observed that the ensemble model outperforms all the single cluster models for every predictor set. Figure 3 shows the interannual variability of actual and ensemble predicted rainfall as a percentage of the long period average (LPA).
5.2.2. Multilayer Perceptron Neural Network Model (MLP). The multilayer perceptron neural network model is designed with the four different sets of parameters described in Table 3. Mean absolute errors of all cluster and ensemble models are shown in Table 6. The MLP model reports an error of 4.0% for PredSet4 (Table 2) with MLP parameters ParSet1 (Table 3). The actual rainfall and the rainfall predicted by the models built for the clusters and by the ensemble model are shown in Figure 4. The ensemble predicted rainfall closely follows the actual rainfall.
5.2.3. Recurrent Neural Network Model (RNN). Mean absolute errors for prediction of annual rainfall by the recurrent neural network model for the test period 2001–2013 are
Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and ensemble model for test period 2001–2013. Reports minimum error of 6.2%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 9.4 | 9.3 | 10.9 | 8.6 |
| PredSet2 | 20 | 11.0 | 7.5 | 9.4 | 8.3 |
| PredSet3 | 15 | 10.9 | 6.5 | 9.2 | 6.7 |
| PredSet4 | 15 | 10.4 | 10.1 | 6.8 | 6.2 |
| PredSet5 | 15 | 7.6 | 8.5 | 8.4 | 7.9 |
Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and ensemble model for test period 2001–2013. Reports minimum error of 4.0%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet4 | 13.8 | 18.1 | 16.9 | 8.2 |
| PredSet2 | ParSet3 | 16.0 | 7.9 | 11.0 | 5.2 |
| PredSet3 | ParSet1 | 8.0 | 7.8 | 6.5 | 6.5 |
| PredSet4 | ParSet1 | 9.3 | 10.7 | 4.5 | 4.0 |
| PredSet5 | ParSet1 | 8.5 | 15.3 | 13.7 | 11.0 |
[Figure 3 plot. Legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model. Horizontal axis: test years (2001–2013). Vertical axis: rainfall as % of LPA.]

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013.
presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. RNN gives the training years weights that decrease with their distance from the test year. The pattern of actual and ensemble predicted rainfall in terms of percentage of LPA is shown in Figure 5.
5.2.4. Generalized Regression Neural Network Model (GRNN). Generalized regression neural network ensemble and
[Figure 4 plot. Legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model. Horizontal axis: test years (2001–2013). Vertical axis: rainfall as % of LPA.]

Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013.
individual cluster models' errors in terms of mean absolute errors are presented in Table 8. The model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variations of the ensemble forecast of rainfall by the GRNN ensemble model along with the actual rainfall pattern in terms of percentage of LPA for the period 2001–2013. It is observed that the predicted values are close to the actual rainfall patterns. Predictions by the models designed for the clusters are shown by different symbols.
Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and ensemble model for test period 2001–2013. Reports minimum error of 5.1%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet1 | 11.3 | 7.1 | 16.8 | 7.0 |
| PredSet2 | ParSet1 | 13.2 | 13.5 | 12.6 | 8.5 |
| PredSet3 | ParSet1 | 12.9 | 5.4 | 6.0 | 5.1 |
| PredSet4 | ParSet1 | 12.3 | 6.4 | 4.7 | 5.9 |
| PredSet5 | ParSet2 | 15.1 | 16.1 | 13.4 | 8.8 |
Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and ensemble model for test period 2001–2013. Reports minimum error of 6.1%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 10.0 | 7.6 | 7.6 | 6.4 |
| PredSet2 | 30 | 7.1 | 8.9 | 7.6 | 6.4 |
| PredSet3 | 20 | 5.8 | 9.2 | 6.0 | 6.1 |
| PredSet4 | 20 | 6.3 | 6.6 | 7.2 | 6.3 |
| PredSet5 | 25 | 7.1 | 9.4 | 11.9 | 6.6 |
[Figure 5 plot. Legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model. Horizontal axis: test years (2001–2013). Vertical axis: rainfall as % of LPA.]

Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013.
5.3. Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of other accuracy measures besides mean absolute error. Table 9 shows different forecast verification statistics for the ensemble models during the test period 2001–2013. We summarize the observations below.
(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSE of 7.4% and 8.4%, respectively.
[Figure 6 plot. Legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model. Horizontal axis: test years (2001–2013). Vertical axis: rainfall as % of LPA.]

Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013.
(ii) Prediction Yield (PY). PY for the 5% error category of the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. They give prediction yields of 76%, 92%, 92%, and 84% for the allowed error category of 10%. Finally, at the error category of 15%, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, almost none of the predicted years shows an abrupt deviation from the corresponding actual rainfall pattern.
Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

| Verification measure | MR | MLP | RNN | GRNN |
|---|---|---|---|---|
| RMSE for forecast (%) | 8.4 | 5.3 | 6.4 | 7.4 |
| PY (%) at allowed error 5% | 46 | 69 | 53 | 46 |
| PY (%) at allowed error 10% | 76 | 92 | 92 | 84 |
| PY (%) at allowed error 15% | 92 | 100 | 92 | 100 |
| PC between actual and predicted rainfall | 0.61 | 0.81 | 0.71 | 0.49 |
| WI between actual and predicted rainfall | 0.71 | 0.89 | 0.81 | 0.62 |
Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) approach to the standard method with the same models without clustering (NC) approach. NC columns give the total error (%) without clustering; WC columns give the ensemble error (%) with clustering.
                 MR              MLP             RNN             GRNN
Predictor set    NC      WC      NC      WC      NC      WC      NC      WC
PredSet1         8.9     8.6     10.0    8.2     11.7    7.0     6.9     6.4
PredSet2         9.2     8.2     12.8    5.2     10.7    8.5     7.2     6.4
PredSet3         7.4     6.7     6.7     6.5     6.2     5.1     6.1     6.1
PredSet4         6.7     6.2     5.8     4.0     6.0     5.5     6.3     6.3
PredSet5         8.2     7.9     9.7     11.0    8.9     8.8     9.0     6.7
(iii) Pearson Correlation (PC). PC of 0.61, 0.81, 0.71, and 0.49 is observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.
(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.
All of the mentioned statistical measures (Table 9), as well as the mean absolute error (Table 6) in prediction of monsoon, indicate that the MLP model is the best among all four proposed models.
5.4. Comparison of Results
5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD), namely, the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years from 1996 to 2002 is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.
5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining the outputs of all clusters' models is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various combinations of models and predictor sets are shown in Table 10. The clustering and ensemble method matches or improves on the nonclustered conventional method in every combination except MLP with PredSet5.
[Figure 7 plot: x-axis, different models (16-par, 8-par, 10-par, MR, MLP, RNN, GRNN); y-axis, root mean square error as % of LPA (0–12); legend: IMD 16-param, IMD 8-param, IMD 10-param, MR, MLP, RNN, GRNN.]
Figure 7: Comparison of MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with IMD existing 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the time period 1996–2002 [4, 5]. Striped bars represent errors by our proposed models.
Table 11: Physical climatic events under study.
Climatic event   Number of years   Years associated with the event
Drought          13                1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009
Flood            11                1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994
El-Nino          23                1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009
La-Nina          22                1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011
Positive IOD     12                1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007
Negative IOD     10                1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996
Table 12: Threshold of support and confidence measures for associating obtained clusters with physical climatic events.
Predictor set   Support threshold   Confidence threshold
PredSet1        0.37                0.30
PredSet2        0.25                0.46
PredSet3        0.21                0.43
PredSet4        0.29                0.61
PredSet5        0.21                0.54
5.5. Prediction of the Year 2014. The annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of about 7.0% in forecasting the rainfall of 2014.
6. Meteorological Analysis
Next, we try to interpret each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by some global climatic events. The climatic events considered and studied during the time period 1948 to 2013 (the period considered for clustering in our work) are El-Nino and La-Nina (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean Dipole (http://bom.gov.au/climate/IOD), drought, and flood, as shown in Table 11.
Figure 8 shows the El-Nino and La-Nina years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Nino and La-Nina years are shown by color codes (light green and green) in the figure. The chart helps to visualize the cooccurrence of El-Nino and La-Nina events with extremes of ISMR.
6.1. Measuring Association between Climatic Events and ISMR. Support and confidence measures are considered to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.
[Figure 8 plot: x-axis, years (1945–2015); y-axis, deviation of rainfall from LPA in % of LPA (−25 to 25); legend: normal years, El-Nino years, La-Nina years.]
Figure 8: El-Nino (light green) and La-Nina (green) years association with drought (years below 10% of LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years above 10% of LPA rainfall) years during the period 1948–2013.
(i) Support. Support is defined as the percentage of the total number of years in the cluster that correspond to the climatic event:
$$\text{Support} = \frac{x_{ce}}{N}, \tag{12}$$
where $x_{ce}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.
(ii) Confidence. Confidence is defined as the percentage of years associated with the climatic event in the cluster relative to the total number of such event years:
$$\text{Confidence} = \frac{x_{ce}}{T_{ce}}, \tag{13}$$
where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
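As a minimal sketch of the two measures in (12) and (13), the following Python function computes support and confidence for one cluster and one event. The drought years are taken from Table 11; the cluster membership list `toy_cluster` and the function name are hypothetical and only serve as an illustration.

```python
def support_confidence(cluster_years, event_years):
    """Return (support, confidence) as fractions, following (12) and (13)."""
    cluster = set(cluster_years)
    events = set(event_years)
    x_ce = len(cluster & events)          # event years that fall in the cluster
    support = x_ce / len(cluster)         # x_ce / N
    confidence = x_ce / len(events)       # x_ce / T_ce
    return support, confidence

# Drought years as listed in Table 11.
drought_years = [1951, 1965, 1966, 1968, 1972, 1974, 1979,
                 1982, 1986, 1987, 2002, 2004, 2009]
# Hypothetical cluster of ten years, for illustration only.
toy_cluster = [1951, 1952, 1965, 1966, 1970, 1972, 1980, 1982, 1990, 2002]

support, confidence = support_confidence(toy_cluster, drought_years)
```

In this toy case six of the ten cluster years are drought years, so the support is 6/10 and the confidence is 6/13.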
[Figure 9 plots: two three-dimensional histograms, panels (a) and (b); axes: Confidence, Support, and Year-count.]
Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1.
Table 13: Identified physical climatic events associated with clusters obtained by fuzzy clustering.
Predictor   Cluster1 Cluster2 Cluster3
PredSet1    Drought El-Nino La-Nina La-Nina
PredSet2    Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3    El-Nino positive IOD Drought Drought El-Nino
PredSet4    La-Nina Flood La-Nina Drought
PredSet5    - Drought El-Nino Flood
We relate a cluster to a physical climatic event described in Table 11 if both support and confidence measures attain the corresponding thresholds. The thresholds are chosen in such a way that 50% of the years of study are under consideration. A low threshold compromises the importance of a climatic event being related to a particular cluster; on the other hand, if fewer years are taken, the threshold values must be high, which in turn leaves out most of the clusters. Therefore, as a compromise between the extremes, 50% of the years are considered. Figure 9 shows histograms with confidence and support as bins of year-count before and after the thresholding process, respectively, for predictor set PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results establish the coexistence of La-Nina and flood events. They also shed light on the high probability of simultaneous occurrence of El-Nino, drought, and positive IOD events.
7. Conclusion
Monsoon is an important phenomenon for the economic development of an agricultural country like India. Large variability of monsoon over the years makes prediction of rainfall a challenging task. The paper attempts to address this problem by clustering the years into similar groups, and finally a multimodel ensemble forecast is provided for Indian summer monsoon rainfall.
Different climatic parameters with their best correlated month values are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of belongingness in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. Performance of the proposed clustering-based ensemble models is superior to the existing IMD models [4, 5]. The error statistics also indicate the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in meteorological context, the clusters are linked with global climatic events.
In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work is supported by the RBU project through the RESPOND program of ISRO, through KCSTC, IIT Kharagpur.
References
[1] H. F. Blanford, "On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India," Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.
[2] G. T. Walker, "Correlation in seasonal variations of weather, IV: a further study of world weather," Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.
[3] V. Thapliyal and S. M. Kulshrestha, "Recent models for long range forecasting of South-West monsoon rainfall in India," Mausam, vol. 43, no. 3, pp. 239–248, 1992.
[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, "A power regression model for long range forecast of southwest monsoon rainfall over India," Mausam, vol. 42, no. 2, pp. 125–130, 1991.
[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, "IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003," Current Science, vol. 86, no. 3, pp. 422–431, 2004.
[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, "New statistical models for long-range forecasting of southwest monsoon rainfall over India," Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.
[7] J. Schewe and A. Levermann, "A statistically predictive model for future monsoon failure in India," Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.
[8] Q. Wu, Y. Yan, and D. Chen, "A linear markov model for east asian monsoon seasonal forecast," Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.
[9] K. Fan, Y. Liu, and H. Chen, "Improving the prediction of the east asian summer monsoon: new approaches," Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.
[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, "Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes," Journal of Hydrology, vol. 503, pp. 11–21, 2013.
[11] A. K. Sahai, M. K. Soman, and V. Satyan, "All India summer monsoon rainfall prediction using an artificial neural network," Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.
[12] W.-C. Hong, "Rainfall forecasting by technological machine learning models," Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.
[13] S. Chattopadhyay and G. Chattopadhyay, "Comparative study among different neural net learning algorithms applied to rainfall time series," Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.
[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, "Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India," Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.
[15] V. R. Durai and R. Bhardwaj, "Improving precipitation forecasts skill over India using a multi-model ensemble technique," Geofizika, vol. 30, no. 2, pp. 119–141, 2013.
[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, "Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994," Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.
[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., "The twentieth century reanalysis project," Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.
[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., "The NCEP/NCAR 40-year reanalysis project," Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.
[19] E. M. Rasmusson and T. H. Carpenter, "Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Nino," Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.
predicts Indian summer monsoon utilizing recurrent neural network and also demonstrates successful employment of support vector machine in solving nonlinear regression and time series problems. Three different backpropagation neural learning rules, namely, momentum learning, conjugate gradient descent learning, and Levenberg-Marquardt learning, are used by S. Chattopadhyay and G. Chattopadhyay [13] to perform a comparative study of different neural network methods for predicting rainfall time series.
The presence of large variability in monsoon patterns makes it difficult for a single model to predict its distribution. A number of uncertainties, including boundary condition, parameter, and structural uncertainties, are involved in the construction of these models. Thus, it remains fundamentally challenging to have a single model for prediction. Multimodel ensembles, which combine the outcomes of different models to produce efficient results, are proposed to overcome the weakness of a single model [14, 15]. In addition, monsoon shows different characteristics over the years. There exist groups of years in which the variation of climatic parameters and the pattern of rainfall are similar. We use fuzzy clustering to cluster similar years together and model them separately. The motivation behind using fuzzy clustering is that each year manifests a mixture of physical climatic events. We cannot hard cluster a year into a specific group; years have a membership of belongingness to every cluster. Fuzzy clustering is used to capture the characteristics of the different events related to a year of study. We use the same set of climatic parameters as the predictor set for every cluster but frame different models for each cluster.
A number of prediction models, namely, multiple regression (MR), multilayer perceptron (MLP), recurrent neural network (RNN), and generalized regression neural network (GRNN) models, are used for prediction of Indian monsoon for the year clusters. There are sound reasons for using neural networks like MLP, RNN, and GRNN for modelling: (i) Indian monsoon is a complex process which cannot be adequately modelled by linear models; (ii) nonlinearity in the time-series pattern can be well captured by neural network learning; (iii) climatic events are much more closely related to disturbances in the parameters of near years than of distant years, and a neural network enables attaching weight to the year parameter in an appropriate manner.
In this work, climatic parameters that are strongly correlated with Indian monsoon are identified at the outset, followed by fuzzy clustering of years into groups with a degree of belongingness of each year to the clusters. Then, we model each cluster with four types of models, namely, MR, MLP, RNN, and GRNN, to forecast rainfall. The weighted ensemble of the forecasts given by the respective models for each cluster is considered as the final predicted rainfall. Analysis and comparisons are performed on aggregate Indian rainfall, and finally a meteorological interpretation of the obtained clusters is presented.
The paper is organised in the following manner. We discuss the details of the data and the predictor climatic parameters in Sections 2 and 3. The proposed clustering-based approach, prediction models, and ensemble technique are presented in Section 4, with experimental results in Section 5. Meteorological significance is discussed in Section 6, and finally conclusions are provided in Section 7.
2. Data Sets Used
We consider the annual Indian summer monsoon rainfall (ISMR) occurring in the four months of June, July, August, and September. Annual ISMR is considered during the period 1948–2013 for our study. The long period average (LPA) (1948–2013) of ISMR is 891.8 mm. ISMR is expressed as a percentage of the LPA value. The data is obtained from the Indian Institute of Tropical Meteorology, Pune (http://www.imdpune.gov.in/research/ncc/longrange/data/data.html) [16].
Data for the predictor parameters sea level pressure (SLP) (http://www.esrl.noaa.gov/psd/gcos_wgsp/Gridded/data.noaa.erslp.html) and sea surface temperature (SST) (http://www.esrl.noaa.gov/psd/data/gridded/data.noaa.ersst.html) are provided by the NOAA/OAR/ESRL PSD at a spatial resolution of 2° × 2° [17]. Surface pressure (SP) and zonal wind velocity (WV) data are collected from the NCEP Reanalysis Derived data provided by the NOAA/OAR/ESRL PSD (http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis.derived.surface.html) [18], available at a resolution of 2.5° × 2.5°. Finally, Nino 3.4 data, which is the sea surface temperature anomaly for the spatial coverage of 5°S to 5°N and 170°W to 120°W in the Pacific Ocean region, is acquired from the National Center for Atmospheric Research (http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ensoyears.shtml) [19]. All the above monthly data are considered for the period 1948–2013 in our study and analysis.
3. Global Climatic Parameters Influencing Indian Monsoon
Indian monsoon is strongly influenced by several global climatic parameters occurring at places distant from the Indian subcontinent. Identification of predictor parameters relies on physical understanding of the monsoon event and wind pattern flow. We have selected the climatic parameters based on the parameters used by the Indian Meteorological Department's models [5, 6], studying their correlation with Indian summer monsoon rainfall (ISMR) during our period of study (1948–2013). In the data preprocessing phase, climatic anomaly data are evaluated by calculating the deviation of the parameter value from the long-term average value of the parameter, separately for each month, followed by a correlation study between ISMR and the climatic parameters for lags of zero to twelve months. We consider the best lagged predictor month, that is, the one having the highest correlation with ISMR. The predictor climatic parameters and their correlation values with Indian monsoon are shown in Table 1. Figure 1 shows the geographic location of the climatic parameters influencing Indian monsoon.
Predictor Sets of Climatic Parameters. Based on the correlation with Indian monsoon, we have built five predictor sets for forecasting. Different combinations of the identified climatic parameters (Table 1) form the predictor sets. The predictor sets are shown in Table 2.
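The anomaly evaluation and lagged correlation study described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the lag is counted backwards from an assumed reference month (May), the "best" lag is taken as the one with the largest absolute Pearson correlation, and the data are synthetic rather than the series used in the paper.

```python
import numpy as np

def monthly_anomaly(values):
    """Deviation of each value from the long-term mean of its calendar month.

    values: array of shape (n_years, 12), one row per year.
    """
    values = np.asarray(values, dtype=float)
    return values - values.mean(axis=0, keepdims=True)

def lagged_correlations(anomaly, rainfall, ref_month=4, max_lag=12):
    """Pearson correlation between annual rainfall and the anomaly observed
    `lag` months before the reference month, for lag = 0..max_lag.

    ref_month is a 0-based month index (4 = May); this choice is an assumption.
    """
    anomaly = np.asarray(anomaly, dtype=float)
    rainfall = np.asarray(rainfall, dtype=float)
    n_years = anomaly.shape[0]
    series = anomaly.reshape(-1)              # continuous month-by-month series
    correlations = {}
    for lag in range(max_lag + 1):
        idx = np.arange(n_years) * 12 + ref_month - lag
        valid = idx >= 0                      # drop years whose lagged month precedes the record
        correlations[lag] = np.corrcoef(series[idx[valid]], rainfall[valid])[0, 1]
    return correlations

# Synthetic demonstration: rainfall is built from the February anomaly,
# which lies three months before the assumed reference month (May).
rng = np.random.default_rng(0)
demo_anomaly = monthly_anomaly(rng.normal(size=(30, 12)))
demo_rainfall = demo_anomaly[:, 1].copy()
demo_corr = lagged_correlations(demo_anomaly, demo_rainfall)
best_lag = max(demo_corr, key=lambda lag: abs(demo_corr[lag]))
```

On the synthetic data the selected lag is the one used to construct the rainfall series, which is the behaviour the preprocessing step relies on when picking the correlated month for each parameter.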
Table 1: Climatic parameters (CP) influencing Indian monsoon with geographical location, correlation values, and correlated month (0 signifies the same year and -1 signifies the previous year).
CP     CP name                                              Location                   Correlation value   Correlated month
CP1    North Atlantic Ocean SST anomaly                     20°N–30°N, 100°W–80°W      0.242               Jan (0)
CP2    North Atlantic Ocean surface pressure anomaly        20°N–30°N, 100°W–80°W      0.256               April (0)
CP3    East Asia SLP anomaly                                35°N–45°N, 120°E–130°E     0.337               May (0)
CP4    East Asia surface pressure anomaly                   35°N–45°N, 120°E–130°E     0.341               Mar (0)
CP5    Equatorial South Eastern Indian Ocean SST anomaly    20°S–10°S, 100°E–120°E     0.200               Sept (-1)
CP6    Pressure gradient between Madagascar and Tibet       -                          0.253               May (0)
CP7    Nino 3.4 SST anomaly                                 5°S–5°N, 170°W–120°W       0.311               Sept (-1)
CP8    Equatorial Pacific Ocean SLP anomaly                 5°S–5°N, 120°E–80°W        0.272               Aug (-1)
CP9    North West Europe surface pressure anomaly           55°N–65°N, 20°E–40°E       0.183               Jan (0)
CP10   North Central Pacific zonal wind anomaly             5°N–15°N, 180°E–150°W      0.457               May (0)
[Figure 1 map: world map with latitude and longitude grid.]
Figure 1: Climatic parameters over the globe governing Indian monsoon (purple patches signify the locations of the climatic parameters taken and the blue patch represents the Indian region). CP$_i$ represents parameter $i$ in Table 1.
Table 2: Predictor sets with climatic parameters.
Predictor set   Climatic parameters
PredSet1        CP1, CP4, CP5, CP6
PredSet2        CP4, CP5, CP6, CP7
PredSet3        CP2, CP4, CP10
PredSet4        CP2, CP4, CP7, CP10
PredSet5        CP3, CP7, CP8, CP9
4. Methodology
We propose fuzzy clustering of monsoon years into groups, followed by building models for each group separately and finally predicting Indian summer monsoon rainfall (ISMR) as a weighted ensemble of the forecasts provided by the cluster models. The block diagram of the proposed fuzzy clustering-based approach to prediction of ISMR is shown in Figure 2. Detailed steps are described in the following subsections.
4.1. Motivation: Variability of Monsoon Patterns. Trends and distributions of monsoon vary to a large extent over the years. It is thus necessary to group the years into clusters which have similar patterns of the predictor climatic parameters affecting monsoon. Clustering the years is effective because we can build separate models for each cluster. These cluster models will be more accurate, as the variation within a cluster is smaller. Finally, the ensemble of the forecasts of these cluster models results in better prediction of Indian monsoon. As an example, consider two clusters of years corresponding to strong El-Nino and North Atlantic Oscillation, respectively. A drought year is correlated with both events and hence might have a significant degree of belongingness to both clusters.
4.2. Fuzzy Clustering of Monsoon Years. Fuzzy $c$-means clustering is used for grouping similar years together. Fuzzy $c$-means (FCM) is a method of clustering which allows one instance of input to belong to more than one cluster with some membership of belongingness. FCM attempts to partition a set of $N$ elements $Y = \{y_1, \ldots, y_N\}$ into a collection of $c$ fuzzy clusters $C = \{\mathrm{cen}_1, \ldots, \mathrm{cen}_c\}$ and a partition matrix $W = [w_{ij}]$, $w_{ij} \in [0, 1]$, $i = 1, \ldots, N$, $j = 1, \ldots, c$, where $w_{ij}$ gives the degree of belongingness of element $y_i$ to the cluster with center $\mathrm{cen}_j$. FCM aims to minimize the objective function in (1). The updates of the partition matrix and the centers occur in accordance with (2) and (3), respectively:
$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{c} w_{ij}^{m} \left\| y_i - \mathrm{cen}_j \right\|^2, \quad 1 \le m \le \infty, \tag{1}$$
$$w_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{\left\| y_i - \mathrm{cen}_j \right\|}{\left\| y_i - \mathrm{cen}_k \right\|} \right)^{2/(m-1)}}, \tag{2}$$
$$\mathrm{cen}_j = \frac{\sum_{i=1}^{N} w_{ij}^{m} \cdot y_i}{\sum_{i=1}^{N} w_{ij}^{m}}, \tag{3}$$
where $m$ denotes the level of cluster fuzziness.
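The alternating updates (2) and (3) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the paper: the function name, the random initialisation, the stopping tolerance, and the default $m = 2$ are assumptions, and the update (2) is only defined for $m > 1$.

```python
import numpy as np

def fuzzy_c_means(Y, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, W): W[i, j] is the membership of element i in cluster j."""
    Y = np.asarray(Y, dtype=float)
    rng = np.random.default_rng(seed)
    W = rng.random((Y.shape[0], c))
    W /= W.sum(axis=1, keepdims=True)                  # each row of W sums to 1
    for _ in range(n_iter):
        Wm = W ** m
        centers = (Wm.T @ Y) / Wm.sum(axis=0)[:, None]            # update (3)
        dist = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)                 # guard against zero distances
        ratio = dist[:, :, None] / dist[:, None, :]    # ratio[i, j, k] = d_ij / d_ik
        W_new = 1.0 / (ratio ** (2.0 / (m - 1.0))).sum(axis=2)    # update (2)
        converged = np.abs(W_new - W).max() < tol
        W = W_new
        if converged:
            break
    return centers, W
```

Each row of the returned matrix `W` sums to one, so it can be used directly as the set of ensemble weights for the corresponding year.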
[Figure 2 block diagram. Input: climate parameter (CP) and rainfall. Data preprocessing: climate anomaly data evaluation; correlation study of CP and rainfall; identification of the month of CP highly correlated to rainfall; building predictor sets with different CP combinations. Clustering years: clustering the predictor set using fuzzy clustering; applying alpha-cut to obtain clusters; membership values. Ensemble neural network model: forecast for every cluster by each model (MR, MLP, RNN, GRNN); ensemble of the outputs of all clusters; ensemble forecast. Validation: error calculation; comparison with existing models; comparison with the nonclustered method. Physical interpretation of clusters: support, confidence, threshold, cluster interpretation. Output: forecast of rainfall; error in forecasting; model comparison results; climatic events associated with each cluster.]
Figure 2: Proposed fuzzy clustering-based ensemble approach for prediction of Indian summer monsoon rainfall.
4.3. Prediction Models. Multiple regression and three models of artificial neural networks (ANN), namely, multilayer perceptron, recurrent neural network, and generalized regression neural network, are used to design prediction models for each cluster exclusively. The forecast of annual ISMR is provided by each cluster model separately and also by the ensemble of all the clusters' model forecasts. We describe the models used below.
4.3.1. Multiple Regression (MR). The multiple regression model is used to learn the relationship between several independent predictor variables ($X_i$'s) and a dependent variable ($Y$). The multiple regression model having $p$ independent variables is shown in
$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \tag{4}$$
where $x_{ij}$ is the $i$th observation of the $j$th independent variable, the first independent variable takes the value 1 for all $i$, and $\varepsilon_i$ represents the residual.
Table 3: Model parameter setting for MLP models.
Parameter set   Hidden layers   Training years   Training method
ParSet1         [3 5]           20               BFGS quasi-Newton backpropagation
ParSet2         [3 5 10]        15               Conjugate gradient backpropagation with Powell-Beale restarts
ParSet3         [5 10]          10               Scaled conjugate gradient backpropagation
ParSet4         [3 5]           15               Resilient backpropagation
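A least-squares fit of the multiple regression model (4) can be sketched as below. The helper names are hypothetical and the data are synthetic; the paper does not state which solver was used, so ordinary least squares via `numpy.linalg.lstsq` is an assumption.

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Estimate the coefficients of (4) by ordinary least squares.

    A column of ones is prepended so that the first independent variable
    takes the value 1 for every observation, as in (4).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def predict_multiple_regression(beta, X):
    """Apply fitted coefficients to new predictor values."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([np.ones(len(X)), X]) @ beta

# Synthetic, noise-free example: y = 2 + 3*x1 - x2.
X_demo = np.array([[1.0, 2.0], [2.0, 0.0], [3.0, 5.0], [4.0, 1.0], [0.0, 3.0]])
y_demo = 2.0 + 3.0 * X_demo[:, 0] - X_demo[:, 1]
beta_demo = fit_multiple_regression(X_demo, y_demo)
```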
4.3.2. Multilayer Perceptron Neural Network (MLP). The multilayer perceptron neural network is a class of ANN in which connections between the neurons do not form a directed cycle. In this network, the information propagates in only one direction, from the input nodes through the hidden nodes to the output nodes. The independent and dependent variables constitute the input and output layers, respectively. The number of hidden layers with the corresponding nodes must be determined empirically for each prediction task. Four different parameter sets, chosen empirically, are considered for the model designed to forecast ISMR; they are shown in Table 3.
4.3.3. Recurrent Neural Network (RNN). The recurrent neural network is a class of ANN which creates an internal state of the network to exhibit dynamic temporal behaviour. Climatic changes or events occurring in near or the same time period are highly correlated. Similarly, rainfall patterns are more correlated with influencing factors in near years than in distant years. This phenomenon is well captured by the RNN, which during training gives weights in decreasing order to the values from near to distant years. Thus, it assists in modelling the system dynamics in a more natural manner. The same parameter sets as for the MLP network (Table 3) are considered, with a delay span of 2 units.
4.3.4. Generalized Regression Neural Network (GRNN). The generalized regression neural network is a variant of the radial basis function network. GRNN has three layers of artificial neurons: input, hidden, and output. The hidden layer has radial basis neurons, while neurons in the output layer have a linear transfer function. The output of the radial basis neurons is the input scaled by the spread factor. Consider $p$ input-output pairs $(x_i, y_i) \in \mathbb{R}^n \times \mathbb{R}^1$ with $n$ input variables and $i = 1, 2, \ldots, p$, where $y_i$ represents the output from each hidden unit. The GRNN output for a test point $x \in \mathbb{R}^n$ is described by
$$y(x) = \sum_{i=1}^{p} W_i y_i, \tag{5}$$
where
$$W_i = \frac{\exp\left( -\left\| x - x_i \right\|^2 / 2\sigma^2 \right)}{\sum_{k=1}^{p} \exp\left( -\left\| x - x_k \right\|^2 / 2\sigma^2 \right)}. \tag{6}$$
The reasons behind modelling using GRNN are (i) only one tunable design parameter (the spread factor $\sigma$), (ii) a one-pass algorithm (less time consuming), and (iii) accurate approximation of functions from sparse data.
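Equations (5) and (6) amount to a kernel-weighted average of the training outputs, which can be sketched as follows. The function name and the toy data are hypothetical; subtracting the smallest squared distance before exponentiating is a numerical safeguard added here and leaves the normalised weights in (6) unchanged.

```python
import numpy as np

def grnn_predict(x, X_train, y_train, sigma):
    """GRNN output (5) for one test point x, with weights given by (6)."""
    x = np.asarray(x, dtype=float)
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    d2 = np.sum((X_train - x) ** 2, axis=1)            # squared distances ||x - x_i||^2
    kernel = np.exp(-(d2 - d2.min()) / (2.0 * sigma ** 2))
    weights = kernel / kernel.sum()                    # W_i in (6)
    return float(weights @ y_train)                    # y(x) in (5)

# Toy one-dimensional training set, for illustration only.
X_toy = np.array([[0.0], [1.0], [2.0]])
y_toy = np.array([10.0, 20.0, 30.0])
near_point = grnn_predict([0.0], X_toy, y_toy, sigma=0.1)    # close to y at x = 0
smooth_all = grnn_predict([0.0], X_toy, y_toy, sigma=1e6)    # close to the mean of y
```

The two calls show the role of the single tunable parameter: a small spread reproduces the nearest training output, whereas a very large spread averages all training outputs.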
The optimal number of training years is determined for the MR and GRNN models by varying it from 5 to 30 and selecting the value that gives the least absolute prediction error during the validation period (1984–1993). A training length of $m$ years specifies that, for predicting the rainfall of the $r$th year, those of the preceding $m$ years $r - 1, r - 2, \ldots, r - m$ that are present in a particular cluster are considered for training.
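The selection of training years for a given target year can be illustrated with a short helper. The function name and the cluster membership list are hypothetical; the sketch only encodes the rule stated above, namely that the preceding $m$ years are scanned and only those belonging to the cluster are kept.

```python
def training_years(target_year, cluster_years, m):
    """Years among target_year - m, ..., target_year - 1 that belong to the cluster."""
    members = set(cluster_years)
    return [year for year in range(target_year - m, target_year) if year in members]

# Hypothetical cluster membership, for illustration only.
toy_cluster_years = [1990, 1993, 1995, 1996, 1999, 2000, 2003]
selected = training_years(2001, toy_cluster_years, m=5)
```

With a window of five years before 2001, only 1996, 1999, and 2000 fall in the toy cluster, so the model for that cluster would be trained on three years.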
4.4. Ensemble of Predictors. The complexity of the monsoon process makes it difficult for a single model to predict rainfall accurately. We design separate models for each cluster of years obtained by fuzzy clustering, using the four predictors described in Section 4.3. Finally, annual ISMR is presented as a weighted ensemble of the forecasts of the models designed for each cluster. The weight is taken as the fuzzy membership of belongingness of the test year in the different clusters:
$$\text{Ensemble prediction}_t = \sum_{i=1}^{c} W_i^t \cdot P_i, \tag{7}$$
where $P_i$ represents the prediction given by a model for cluster $i$, $W_i^t$ is the fuzzy membership of the $t$th test year to cluster $i$, and $c$ is the total number of clusters.
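As a worked illustration of (7), the sketch below combines three cluster forecasts using the memberships of one test year. The membership values and forecasts are hypothetical numbers chosen for the example, not results from the paper.

```python
import numpy as np

def ensemble_forecast(memberships, cluster_predictions):
    """Weighted ensemble (7): sum over clusters of W_i^t * P_i."""
    memberships = np.asarray(memberships, dtype=float)
    cluster_predictions = np.asarray(cluster_predictions, dtype=float)
    return float(memberships @ cluster_predictions)

# Hypothetical memberships of one test year in three clusters (summing to 1)
# and hypothetical cluster-model forecasts in percent of LPA.
toy_memberships = [0.6, 0.3, 0.1]
toy_predictions = [95.0, 102.0, 88.0]
toy_ensemble = ensemble_forecast(toy_memberships, toy_predictions)
```

Here the ensemble forecast is 0.6 × 95.0 + 0.3 × 102.0 + 0.1 × 88.0 = 96.4% of LPA, dominated by the cluster to which the year belongs most strongly.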
4.5. Validation of Proposed Approach. The study is performed on data for the period 1948–2013. Fuzzy clustering is performed over the period to cluster the years into three groups. The number of clusters is decided based on cluster quality. Separate prediction models are designed for all three clusters, and the ensemble of the forecasts of these models is provided as the predicted Indian summer monsoon rainfall. The test period 2001–2013 is considered to evaluate the forecasting skill of our proposed approach.
The forecast models for annual ISMR are chiefly evaluated in terms of mean absolute error. Other error statistics, namely, root mean square error, prediction yield, Pearson correlation, and Willmott index of agreement, are also evaluated to judge the efficacy of our proposed approach for prediction. They are described below.
(i) Mean Absolute Error (MAE). The mean absolute error for prediction of annual ISMR is calculated as

$$\mathrm{MAE} = \frac{\sum_{i=1}^{N} \left| Y_i - X_i \right|}{N} \tag{8}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $N$ denotes the total number of test years.
(ii) Root Mean Square Error (RMSE). The root mean square error measures the differences between the model-predicted output and the actual values. It is a good measure for comparing the forecasting errors of various models:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( Y_i - X_i \right)^2}{N}} \tag{9}$$
(iii) Prediction Yield (PY). Prediction yields are evaluated at three error categories (5%, 10%, and 15% error) to assess the overall prediction results by reporting the percentage of predicted years that fall within each allowed range of error.
(iv) Pearson Correlation Coefficient (PC). The Pearson correlation coefficient measures the strength of the linear association between actual and predicted values, where a value of 1 means a perfect positive correlation and a value of −1 means a perfect negative correlation:

$$\mathrm{PC} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2} \, \sqrt{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}} \tag{10}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $\bar{X}$ and $\bar{Y}$ are their corresponding means.
(v) Willmott Index of Agreement (WI). The Willmott index of agreement is a standardized measure of the degree of model prediction error. It varies between 0 and 1, with higher values indicating a better fit of the model for prediction:

$$\text{Index of agreement} = 1 - \frac{\sum_{i=1}^{N} \left| X_i - Y_i \right|^2}{\sum_{i=1}^{N} \left( \left| Y_i - \bar{X} \right| + \left| X_i - \bar{X} \right| \right)^2} \tag{11}$$
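The five verification measures (8)–(11) can be sketched in a few lines of standard-library Python. The function names are hypothetical, and the prediction yield assumes that the allowed error is an absolute difference in the same units as the series (percent of LPA), which the text does not state explicitly.

```python
import math


def mae(actual, predicted):
    """Equation (8): mean absolute error."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Equation (9): root mean square error."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def prediction_yield(actual, predicted, allowed_error):
    """Percentage of years whose absolute error is within allowed_error."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(p - a) <= allowed_error)
    return 100.0 * hits / len(actual)


def pearson(actual, predicted):
    """Equation (10): Pearson correlation coefficient."""
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def willmott_index(actual, predicted):
    """Equation (11): Willmott index of agreement."""
    mean_a = sum(actual) / len(actual)
    num = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    den = sum((abs(p - mean_a) + abs(a - mean_a)) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - num / den
```

The Pearson coefficient and the Willmott index are undefined for a constant actual series, since the denominators vanish; the sketch does not guard against that case.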
5 Experimental Results and Analysis
In this section we present the evaluation of the proposed fuzzy clustering-based approach. We first present the results of fuzzy clustering of the monsoon years for the different predictor sets. Forecasting skill is then evaluated for every cluster model and for the ensemble model in terms of mean absolute error over the test period 2001–2013. In addition, other measures, namely, root mean square error of prediction, correlation between predicted and actual rainfall, prediction yield, and index of agreement between actual and predicted rainfall, are estimated to establish the efficiency of the proposed approach to prediction of Indian summer monsoon rainfall.

Table 4: Cluster size (number of years) by fuzzy $c$-means clustering with an $\alpha$-cut of 0.3 over the period 1948–2013.

Predictor set   Cluster1   Cluster2   Cluster3
PredSet1        16         38         30
PredSet2        30         17         40
PredSet3        32         14         38
PredSet4        42         31         21
PredSet5        15         37         26
5.1 Clustering of Monsoon Years. Fuzzy clustering is performed over the period 1948–2013 to group the data into three clusters. We apply an $\alpha$-cut with $\alpha = 0.3$ to assign the data instances to the clusters; the value is chosen empirically so that the distribution of elements among the clusters is regular. A data instance can be assigned to more than one cluster simultaneously. The cluster sizes for the various predictor sets are shown in Table 4.
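The $\alpha$-cut assignment can be sketched as below. The sketch assumes that a year joins a cluster when its membership is at least $\alpha$ (the text does not say whether the comparison is strict); the function name and the membership values are illustrative only.

```python
def alpha_cut(memberships, alpha=0.3):
    """Assign each year to every cluster in which its membership is >= alpha.

    memberships maps a year to its list of fuzzy memberships, one per cluster.
    Returns one list of years per cluster; a year may appear in several lists.
    """
    n_clusters = len(next(iter(memberships.values())))
    clusters = [[] for _ in range(n_clusters)]
    for year, row in sorted(memberships.items()):
        for j, w in enumerate(row):
            if w >= alpha:
                clusters[j].append(year)
    return clusters


# Illustrative memberships for three years and three clusters.
example = {
    2001: [0.6, 0.3, 0.1],
    2002: [0.2, 0.35, 0.45],
    2003: [0.1, 0.1, 0.8],
}
clusters = alpha_cut(example, alpha=0.3)
```

Multiple assignment is why the cluster sizes in Table 4 add up to more than the 66 years in 1948–2013.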
5.2 Prediction Accuracy. We predict annual rainfall for all five predictor sets (Table 2) separately using four models, namely, MR, MLP, RNN, and GRNN. The test period is 2001–2013.
5.2.1 Multiple Regression Model (MR). Multiple regression models are built for every cluster after determining the optimal training period for each predictor set. The optimal training period is found by varying the number of training years and selecting the value with the least absolute prediction error over the validation period (1984–1993). Both individual cluster-based models and the weighted ensemble model are considered for prediction. Table 5 gives the mean absolute errors of the individual cluster-based models and the ensemble model for the test period 2001–2013. The ensemble model gives a mean absolute error of 6.2% for PredSet4 (Table 2). The ensemble model outperforms every single-cluster model for PredSet1 and PredSet4 and remains close to the best single-cluster model for the other predictor sets (Table 5). Figure 3 shows the interannual variability of actual and ensemble-predicted rainfall as percent of the long period average (LPA).
5.2.2 Multilayer Perceptron Neural Network Model (MLP). The multilayer perceptron neural network model is designed with the four different parameter sets described in Table 3. The mean absolute errors of all cluster models and the ensemble model are shown in Table 6. The MLP model reports an error of 4.0% for PredSet4 (Table 2) with MLP parameters ParSet1 (Table 3). The actual rainfall and the rainfall predicted by the cluster models and the ensemble model are shown in Figure 4. The ensemble-predicted rainfall closely follows the actual rainfall.
Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and the ensemble model for test period 2001–2013. The minimum error reported is 6.2%.

Predictor set   Training years   Cluster1 error (%)   Cluster2 error (%)   Cluster3 error (%)   Ensemble error (%)
PredSet1        20               9.4                  9.3                  10.9                 8.6
PredSet2        20               11.0                 7.5                  9.4                  8.3
PredSet3        15               10.9                 6.5                  9.2                  6.7
PredSet4        15               10.4                 10.1                 6.8                  6.2
PredSet5        15               7.6                  8.5                  8.4                  7.9

Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and the ensemble model for test period 2001–2013. The minimum error reported is 4.0%.

Predictor set   Parameter set   Cluster1 error (%)   Cluster2 error (%)   Cluster3 error (%)   Ensemble error (%)
PredSet1        ParSet4         13.8                 18.1                 16.9                 8.2
PredSet2        ParSet3         16.0                 7.9                  11.0                 5.2
PredSet3        ParSet1         8.0                  7.8                  6.5                  6.5
PredSet4        ParSet1         9.3                  10.7                 4.5                  4.0
PredSet5        ParSet1         8.5                  15.3                 13.7                 11.0

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models by MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. [Plot not reproduced: x-axis, test years 2001–2013; y-axis, rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]

5.2.3 Recurrent Neural Network Model (RNN). The mean absolute errors for prediction of annual rainfall by the recurrent neural network model for the test period 2001–2013 are presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. RNN gives the training years weights that decrease with their distance from the test year. The pattern of actual and ensemble-predicted rainfall in terms of percentage of LPA is shown in Figure 5.
Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models by MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. [Plot not reproduced: x-axis, test years 2001–2013; y-axis, rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]

5.2.4 Generalized Regression Neural Network Model (GRNN). The mean absolute errors of the generalized regression neural network ensemble and individual cluster models are presented in Table 8. The model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variation of the rainfall forecast by the GRNN ensemble model along with the actual rainfall pattern in terms of percentage of LPA for the period 2001–2013. The predicted values are close to the actual rainfall pattern. Predictions by the models designed for the individual clusters are shown by different symbols.
Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and the ensemble model for test period 2001–2013. The minimum error reported is 5.1%.

Predictor set   Parameter set   Cluster1 error (%)   Cluster2 error (%)   Cluster3 error (%)   Ensemble error (%)
PredSet1        ParSet1         11.3                 7.1                  16.8                 7.0
PredSet2        ParSet1         13.2                 13.5                 12.6                 8.5
PredSet3        ParSet1         12.9                 5.4                  6.0                  5.1
PredSet4        ParSet1         12.3                 6.4                  4.7                  5.9
PredSet5        ParSet2         15.1                 16.1                 13.4                 8.8

Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and the ensemble model for test period 2001–2013. The minimum error reported is 6.1%.

Predictor set   Training years   Cluster1 error (%)   Cluster2 error (%)   Cluster3 error (%)   Ensemble error (%)
PredSet1        20               10.0                 7.6                  7.6                  6.4
PredSet2        30               7.1                  8.9                  7.6                  6.4
PredSet3        20               5.8                  9.2                  6.0                  6.1
PredSet4        20               6.3                  6.6                  7.2                  6.3
PredSet5        25               7.1                  9.4                  11.9                 6.6
Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models by RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. [Plot not reproduced: x-axis, test years 2001–2013; y-axis, rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]
5.3 Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of accuracy measures other than mean absolute error. Table 9 shows the forecast verification statistics for the ensemble models during the test period 2001–2013. We summarize the observations below.
(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSEs of 7.4% and 8.4%, respectively.
Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models by GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. [Plot not reproduced: x-axis, test years 2001–2013; y-axis, rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]
(ii) Prediction Yield (PY). The PY at the 5% error category for the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. The models give prediction yields of 76%, 92%, 92%, and 84% for an allowed error of 10%. Finally, at the 15% error category, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, for every model, at most one of the 13 test years deviates by more than 15% from the corresponding actual rainfall.
Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

Verification measure                        MR     MLP    RNN    GRNN
RMSE for forecast (%)                       8.4    5.3    6.4    7.4
PY (%) at allowed error 5%                  46     69     53     46
PY (%) at allowed error 10%                 76     92     92     84
PY (%) at allowed error 15%                 92     100    92     100
PC between actual and predicted rainfall    0.61   0.81   0.71   0.49
WI between actual and predicted rainfall    0.71   0.89   0.81   0.62
Table 10: Comparison of absolute errors (%) for rainfall prediction by the proposed ensemble models with clustering (Ensml error, WC) against the same models trained by the standard method without clustering (Tot error, NC).

                MR              MLP             RNN             GRNN
Predictor set   NC      WC      NC      WC      NC      WC      NC      WC
PredSet1        8.9     8.6     10.0    8.2     11.7    7.0     6.9     6.4
PredSet2        9.2     8.2     12.8    5.2     10.7    8.5     7.2     6.4
PredSet3        7.4     6.7     6.7     6.5     6.2     5.1     6.1     6.1
PredSet4        6.7     6.2     5.8     4.0     6.0     5.5     6.3     6.3
PredSet5        8.2     7.9     9.7     11.0    8.9     8.8     9.0     6.7
(iii) Pearson Correlation (PC). PCs of 0.61, 0.81, 0.71, and 0.49 are observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.
(iv) Willmott Index of Agreement (WI). The WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.
All of the statistical measures above (Table 9), as well as the mean absolute error of prediction (Table 6), establish the MLP model as the best of the four proposed models.
5.4 Comparison of Results

5.4.1 Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD), namely, the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, 1996–2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.
Figure 7: Comparison of the MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the period 1996–2002 [4, 5]. Striped bars represent errors of the proposed models. [Plot not reproduced: x-axis, different models (16-par, 8-par, 10-par, MR, MLP, RNN, GRNN); y-axis, root mean square error as % of LPA.]

5.4.2 Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining the outputs of all cluster models is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various combinations of models and predictor sets are shown in Table 10. The results show the improvement in prediction of the clustering and ensemble method over the nonclustered conventional method.
Table 11: Physical climatic events under study.

Climatic event   Number of years   Years associated with the event
Drought          13                1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009
Flood            11                1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994
El-Nino          23                1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009
La-Nina          22                1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011
Positive IOD     12                1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007
Negative IOD     10                1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996
Table 12: Thresholds of support and confidence measures for associating the obtained clusters with physical climatic events.

Predictor set   Support threshold   Confidence threshold
PredSet1        0.37                0.30
PredSet2        0.25                0.46
PredSet3        0.21                0.43
PredSet4        0.29                0.61
PredSet5        0.21                0.54
5.5 Prediction of the Year 2014. The annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of 7.0% for forecasting the rainfall of 2014.
6 Meteorological Analysis
Next, we interpret each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by some global climatic events. The climatic events studied over the period 1948–2013 (the period considered for clustering in our work) are El-Nino and La-Nina (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, as shown in Table 11.
Figure 8 shows the El-Nino and La-Nina years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Nino and La-Nina years are shown by color codes (light green and green) in the figure. The chart helps to visualize the co-occurrence of El-Nino and La-Nina events with extremes of ISMR.
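The ±10% rule for labelling years can be written as a short sketch. Treating a deviation of exactly 10% as normal is an assumption, as are the function name and the sample values.

```python
def classify_year(rainfall_percent_of_lpa, band=10.0):
    """Label a year from its rainfall expressed as percent of LPA."""
    deviation = rainfall_percent_of_lpa - 100.0
    if deviation > band:
        return "excess"
    if deviation < -band:
        return "drought"
    return "normal"


# Illustrative values only.
labels = [classify_year(v) for v in (112.0, 95.0, 88.0)]
```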
6.1 Measuring Association between Climatic Events and ISMR. Support and confidence measures are used to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.
Figure 8: Association of El-Nino (light green) and La-Nina (green) years with drought (years more than 10% below LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years more than 10% above LPA rainfall) years during the period 1948–2013. [Plot not reproduced: x-axis, years; y-axis, deviation of rainfall from % of LPA; legend: Normal years, El-Nino years, La-Nina years.]
(i) Support. Support is defined as the proportion of the years in the cluster that correspond to the climatic event:

$$\text{Support} = \frac{x_{ce}}{N} \tag{12}$$

where $x_{ce}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.
(ii) Confidence. Confidence is defined as the proportion of all years associated with the climatic event that fall in the cluster:

$$\text{Confidence} = \frac{x_{ce}}{T_{ce}} \tag{13}$$

where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
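Equations (12) and (13) reduce to two ratios over set intersections. The sketch below uses hypothetical function names and an illustrative cluster; the event list is a subset of the drought years in Table 11, and the thresholds in the usage line are the PredSet1 values from Table 12.

```python
def support_confidence(cluster_years, event_years):
    """Equations (12) and (13) for one cluster and one climatic event."""
    cluster = set(cluster_years)
    event = set(event_years)
    x_ce = len(cluster & event)          # event years that fall in the cluster
    support = x_ce / len(cluster)        # share of the cluster
    confidence = x_ce / len(event)       # share of all event years
    return support, confidence


def is_associated(cluster_years, event_years, support_threshold, confidence_threshold):
    """A cluster is linked to an event only if both thresholds are met."""
    support, confidence = support_confidence(cluster_years, event_years)
    return support >= support_threshold and confidence >= confidence_threshold


# Illustrative cluster; the event years are a subset of Table 11's drought years.
cluster = [1951, 1965, 1970, 1980]
droughts = [1951, 1965, 1966, 1968, 1972]
linked = is_associated(cluster, droughts, 0.37, 0.30)
```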
Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1. [Plots not reproduced: panels (a) and (b); horizontal axes, Confidence and Support; vertical axis, Year-count.]
Table 13: Identified physical climatic events associated with the clusters obtained by fuzzy clustering.

Predictor Cluster1 Cluster2 Cluster3
PredSet1 Drought El-Nino La-Nina La-Nina
PredSet2 Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3 El-Nino positive IOD Drought Drought El-Nino
PredSet4 La-Nina Flood La-Nina Drought
PredSet5 - Drought El-Nino Flood
We relate a cluster to a physical climatic event listed in Table 11 if both the support and confidence measures attain the corresponding thresholds. The thresholds are chosen such that 50% of the years of study remain under consideration. A low threshold weakens the significance of a climatic event being related to a particular cluster; on the other hand, if fewer years are to be retained, the threshold values must be high, which in turn leaves out most of the clusters. Therefore, as a compromise between the extremes, 50% of the years are considered. Figure 9 shows histograms of year-count binned by confidence and support before and after thresholding for PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both the support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results indicate the co-occurrence of La-Nina and flood events. They also point to a high probability of El-Nino, drought, and positive IOD events occurring simultaneously.
7 Conclusion
Monsoon is an important phenomenon for the economic development of an agricultural country like India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. The paper attempts to address this problem by clustering the years into similar groups and then providing a multimodel ensemble forecast of Indian summer monsoon rainfall.
Different climatic parameters, each taken at its best-correlated month, are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of the test year in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also establish the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in the meteorological context, the clusters are linked with global climatic events.
In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgment
This work is supported by RBU project through RESPONDprogram of ISRO through KCSTC IIT Kharagpur
References
[1] H. F. Blanford, "On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India," Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.
[2] G. T. Walker, "Correlation in seasonal variations of weather, IV: a further study of world weather," Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.
[3] V. Thapliyal and S. M. Kulshrestha, "Recent models for long range forecasting of South-West monsoon rainfall in India," Mausam, vol. 43, no. 3, pp. 239–248, 1992.
[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, "A power regression model for long range forecast of southwest monsoon rainfall over India," Mausam, vol. 42, no. 2, pp. 125–130, 1991.
[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, "IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003," Current Science, vol. 86, no. 3, pp. 422–431, 2004.
[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, "New statistical models for long-range forecasting of southwest monsoon rainfall over India," Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.
[7] J. Schewe and A. Levermann, "A statistically predictive model for future monsoon failure in India," Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.
[8] Q. Wu, Y. Yan, and D. Chen, "A linear Markov model for East Asian monsoon seasonal forecast," Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.
[9] K. Fan, Y. Liu, and H. Chen, "Improving the prediction of the East Asian summer monsoon: new approaches," Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.
[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, "Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes," Journal of Hydrology, vol. 503, pp. 11–21, 2013.
[11] A. K. Sahai, M. K. Soman, and V. Satyan, "All India summer monsoon rainfall prediction using an artificial neural network," Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.
[12] W.-C. Hong, "Rainfall forecasting by technological machine learning models," Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.
[13] S. Chattopadhyay and G. Chattopadhyay, "Comparative study among different neural net learning algorithms applied to rainfall time series," Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.
[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, "Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India," Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.
[15] V. R. Durai and R. Bhardwaj, "Improving precipitation forecasts skill over India using a multi-model ensemble technique," Geofizika, vol. 30, no. 2, pp. 119–141, 2013.
[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, "Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994," Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.
[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., "The twentieth century reanalysis project," Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.
[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., "The NCEP/NCAR 40-year reanalysis project," Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.
[19] E. M. Rasmusson and T. H. Carpenter, "Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Nino," Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.
Table 1: Climatic parameters (CP) influencing Indian monsoon with geographical location, correlation values, and correlated month (0 signifies the same year and −1 signifies the previous year).

CP     CP name                                               Location                    Correlation value   Correlated month
CP1    North Atlantic Ocean SST anomaly                      20°N–30°N, 100°W–80°W       0.242               Jan (0)
CP2    North Atlantic Ocean surface pressure anomaly         20°N–30°N, 100°W–80°W       0.256               April (0)
CP3    East Asia SLP anomaly                                 35°N–45°N, 120°E–130°E      0.337               May (0)
CP4    East Asia surface pressure anomaly                    35°N–45°N, 120°E–130°E      0.341               Mar (0)
CP5    Equatorial South Eastern Indian Ocean SST anomaly     20°S–10°S, 100°E–120°E      0.200               Sept (−1)
CP6    Pressure gradient between Madagascar and Tibet        -                           0.253               May (0)
CP7    Nino 3.4 SST anomaly                                  5°S–5°N, 170°W–120°W        0.311               Sept (−1)
CP8    Equatorial Pacific Ocean SLP anomaly                  5°S–5°N, 120°E–80°W         0.272               Aug (−1)
CP9    North West Europe surface pressure anomaly            55°N–65°N, 20°E–40°E        0.183               Jan (0)
CP10   North Central Pacific zonal wind anomaly              5°N–15°N, 180°E–150°W       0.457               May (0)
Figure 1: Climatic parameters over the globe governing Indian monsoon (purple patches signify the locations of the climatic parameters considered and the blue patch represents the Indian region). CP$_i$ represents parameter $i$ in Table 1. [Map not reproduced: world map with latitude and longitude gridlines.]
Table 2: Predictor sets with climatic parameters.

Predictor set   Climatic parameters
PredSet1        CP1, CP4, CP5, CP6
PredSet2        CP4, CP5, CP6, CP7
PredSet3        CP2, CP4, CP10
PredSet4        CP2, CP4, CP7, CP10
PredSet5        CP3, CP7, CP8, CP9
4 Methodology
We propose fuzzy clustering of monsoon years into groups, followed by building models for each group separately and, finally, predicting Indian summer monsoon rainfall (ISMR) as a weighted ensemble of the forecasts provided by the cluster models. The block diagram of the proposed fuzzy clustering-based approach to prediction of ISMR is shown in Figure 2. Detailed steps are described in the following subsections.
4.1 Motivation: Variability of Monsoon Patterns. Trends and distributions of monsoon vary to a large extent over the years. It is thus necessary to group the years into clusters that have similar patterns of the predictor climatic parameters affecting monsoon. Clustering the years is effective because we can build separate models for each cluster. These cluster models will be more accurate because the variation within a cluster is smaller. Finally, the ensemble of the forecasts of these cluster models results in better prediction of Indian monsoon. As an example, consider two clusters of years corresponding to strong El-Nino and North Atlantic Oscillation, respectively. A drought year is correlated with both events and hence might have a significant degree of membership in both clusters.
4.2 Fuzzy Clustering of Monsoon Years. Fuzzy $c$-means clustering is used to group similar years together. Fuzzy $c$-means (FCM) is a clustering method that allows one input instance to belong to more than one cluster, each with some degree of membership. FCM attempts to partition a set of $N$ elements $Y = \{y_1, \ldots, y_N\}$ into a collection of $c$ fuzzy clusters with centers $C = \{\mathrm{cen}_1, \ldots, \mathrm{cen}_c\}$ and a partition matrix $W = [w_{ij}]$, $w_{ij} \in [0, 1]$, $i = 1, \ldots, N$, $j = 1, \ldots, c$, where $w_{ij}$ gives the degree of membership of element $y_i$ in the cluster with center $\mathrm{cen}_j$. FCM aims to minimize the objective function (1). The partition matrix and the centers are updated in accordance with (2) and (3), respectively:

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{c} w_{ij}^{m} \left\| y_i - \mathrm{cen}_j \right\|^2, \quad 1 \le m < \infty \tag{1}$$

$$w_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{\left\| y_i - \mathrm{cen}_j \right\|}{\left\| y_i - \mathrm{cen}_k \right\|} \right)^{2/(m-1)}} \tag{2}$$

$$\mathrm{cen}_j = \frac{\sum_{i=1}^{N} w_{ij}^{m} \cdot y_i}{\sum_{i=1}^{N} w_{ij}^{m}} \tag{3}$$

where $m$ denotes the level of cluster fuzziness.
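The alternating updates of Equations (2) and (3) can be sketched in standard-library Python. This is a minimal illustration, not the authors' implementation: it assumes $m > 1$, a fixed number of iterations instead of a convergence test, and caller-supplied initial centers; all names and the toy data are hypothetical.

```python
import math


def _dist(a, b):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def update_memberships(points, centers, m=2.0):
    """Equation (2): w_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)); requires m > 1."""
    exponent = 2.0 / (m - 1.0)
    weights = []
    for y in points:
        dists = [_dist(y, c) for c in centers]
        if 0.0 in dists:
            # The point sits exactly on a center: full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            weights.append([h / sum(hits) for h in hits])
            continue
        weights.append([1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                        for d_j in dists])
    return weights


def update_centers(points, weights, m=2.0):
    """Equation (3): centers as w^m-weighted means of the points."""
    centers = []
    for j in range(len(weights[0])):
        wm = [row[j] ** m for row in weights]
        total = sum(wm)
        centers.append([sum(w * y[d] for w, y in zip(wm, points)) / total
                        for d in range(len(points[0]))])
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iterations):
        weights = update_memberships(points, centers, m)
        centers = update_centers(points, weights, m)
    return update_memberships(points, centers, m), centers


# Toy one-dimensional data with two obvious groups.
toy_points = [[0.0], [1.0], [9.0], [10.0]]
toy_weights, toy_centers = fuzzy_c_means(toy_points, [[0.0], [10.0]])
```

Each row of the returned membership matrix sums to one, which is what allows the memberships to be reused directly as ensemble weights.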
Figure 2: Proposed fuzzy clustering-based ensemble approach for prediction of Indian summer monsoon rainfall. [Block diagram not reproduced. Input: climate parameters (CP) and rainfall. Data preprocessing: climate anomaly data evaluation; correlation study of CP and rainfall; identification of the month of each CP highly correlated to rainfall; building predictor sets with different CP combinations. Clustering years: clustering each predictor set using fuzzy clustering; applying an alpha-cut to obtain clusters and membership values. Ensemble neural network model: forecast for every cluster by each model (MR, MLP, RNN, GRNN); ensemble of the outputs of all clusters, giving the ensemble forecast. Validation: error calculation; comparison with existing models; comparison with the nonclustered method. Physical interpretation of clusters: support, confidence, threshold, cluster interpretation. Output: forecast of rainfall; error in forecasting; model comparison results; climatic events associated with each cluster.]
4.3 Prediction Models. Multiple regression and three artificial neural network (ANN) models, namely, multilayer perceptron, recurrent neural network, and generalized regression neural network, are used to design prediction models for each cluster exclusively. The forecast of annual ISMR is provided by each cluster model separately and also by the ensemble of all the cluster models' forecasts. We describe the models below.
431 Multiple Regression (MR) Multiple regression model isused to learn the relationship between several independentpredictor variables (119883
119894s) and a dependent variable (119884)
Table 3: Model parameter setting for MLP models.

Parameter set | Hidden layers | Training years | Training method
--- | --- | --- | ---
ParSet1 | [3, 5] | 20 | BFGS quasi-Newton backpropagation
ParSet2 | [3, 5, 10] | 15 | Conjugate gradient backpropagation with Powell-Beale restarts
ParSet3 | [5, 10] | 10 | Scaled conjugate gradient backpropagation
ParSet4 | [3, 5] | 15 | Resilient backpropagation
A multiple regression model with $p$ independent variables is shown in

$$ y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \tag{4} $$

where $x_{ij}$ is the $i$th observation of the $j$th independent variable, the first independent variable takes the value 1 for all $i$, and $\varepsilon_i$ represents the residual.
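As an illustration of (4), the coefficients $\beta_1, \ldots, \beta_p$ can be estimated by ordinary least squares. The text does not state which estimator was used, so least squares via the normal equations is an assumption here, and the function names are hypothetical. Each row of `X` is expected to start with the constant 1, matching the convention stated for (4).

```python
def fit_multiple_regression(X, y):
    """Least-squares estimate of beta in (4) by solving (X^T X) beta = X^T y.

    Assumption: X^T X is nonsingular; otherwise a ZeroDivisionError is raised.
    """
    p = len(X[0])
    # Build the normal equations.
    A = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    rhs = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= factor * A[col][c]
            rhs[r] -= factor * rhs[col]
    # Back substitution.
    beta = [0.0] * p
    for r in reversed(range(p)):
        tail = sum(A[r][c] * beta[c] for c in range(r + 1, p))
        beta[r] = (rhs[r] - tail) / A[r][r]
    return beta

def predict(beta, x_row):
    """Fitted value for one observation; x_row also starts with the constant 1."""
    return sum(b * x for b, x in zip(beta, x_row))
```

For example, fitting the rows `[1, 0], [1, 1], [1, 2], [1, 3]` against the targets `1, 3, 5, 7` recovers an intercept of 1 and a slope of 2 up to rounding.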
4.3.2. Multilayer Perceptron Neural Network (MLP). The multilayer perceptron neural network is a class of ANN in which the connections between neurons do not form a directed cycle. In this network, information propagates in only one direction, from the input nodes through the hidden nodes to the output nodes. The independent and dependent variables constitute the input and output layers, respectively. The number of hidden layers and the number of nodes in each must be determined empirically for each prediction task. Four parameter sets, chosen empirically for the models designed to forecast ISMR, are shown in Table 3.
4.3.3. Recurrent Neural Network (RNN). The recurrent neural network is a class of ANN that maintains an internal state of the network in order to exhibit dynamic temporal behaviour. Climatic changes or events occurring in the same or nearby time periods are highly correlated. Similarly, rainfall patterns are more strongly correlated with influencing factors in nearby years than with those in distant years. This behaviour is captured well by the RNN, which during training assigns weights that decrease from near to distant years. It thus helps to model the system dynamics in a more natural manner. The same parameter sets as for the MLP network (Table 3) are considered, with a delay span of 2 units.
4.3.4. Generalized Regression Neural Network (GRNN). The generalized regression neural network is a variant of the radial basis function network. GRNN has three layers of artificial neurons: input, hidden, and output. The hidden layer has radial basis neurons, while the neurons in the output layer have a linear transfer function. The output of the radial basis neurons is the input scaled by the spread factor. Given $p$ input-output pairs $(x_i, y_i) \in \mathbb{R}^n \times \mathbb{R}^1$ with $n$ input variables and $i = 1, 2, \ldots, p$, $y_i$ represents the output from each hidden unit. The GRNN output for a test point $x \in \mathbb{R}^n$ is described by

$$ y(x) = \sum_{i=1}^{p} W_i y_i, \tag{5} $$

where

$$ W_i = \frac{\exp\left(-\left\| x - x_i \right\|^{2} / 2\sigma^{2}\right)}{\sum_{k=1}^{p} \exp\left(-\left\| x - x_k \right\|^{2} / 2\sigma^{2}\right)} \tag{6} $$

and $\sigma$ is the spread factor.

The reasons for modelling with GRNN are (i) it has only one tunable design parameter (the spread factor), (ii) it is a one-pass algorithm (less time consuming), and (iii) it accurately approximates functions from sparse data.
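Equations (5) and (6) amount to a kernel-weighted average of the training outputs, which the following sketch makes concrete. The function name and the sample values in the usage note are hypothetical; subtracting the smallest squared distance before exponentiating is a numerical-stability choice made here and cancels out in the ratio of (6).

```python
import math

def grnn_predict(x, train_x, train_y, sigma):
    """GRNN output y(x) for one test point, following (5) and (6)."""
    sq_dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in train_x]
    # Shift by the minimum so that at least one kernel value equals 1
    # (avoids a zero denominator when sigma is very small).
    d_min = min(sq_dists)
    kernels = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(kernels)
    # W_i = kernels[i] / total; the prediction is the weighted sum of outputs.
    return sum((k / total) * yi for k, yi in zip(kernels, train_y))
```

With training inputs `[[0.0], [1.0]]` and outputs `[10.0, 20.0]`, a test point midway between the two inputs receives equal weights and is predicted as 15.0 for any spread, while a small spread makes the prediction at `[0.0]` collapse onto the nearest training output.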
The optimal number of training years for the MR and GRNN models is ascertained by varying the training years from 5 to 30 and selecting the value with the least absolute prediction error during the validation period (1984–1993). A training length of $m$ years specifies that, for predicting the rainfall of the $r$th year, the available preceding $m$ years $r-1, r-2, \ldots, r-m$ that are present in a particular cluster are considered for training.
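One reading of this training rule is that the window covers the $m$ calendar years before the target year and that only the window years belonging to the cluster are kept. The sketch below follows that reading; it is an interpretation, and the function name and the sample years are hypothetical.

```python
def training_years(target_year, cluster_years, m):
    """Years r-m, ..., r-1 that are present in the cluster, in ascending order."""
    members = set(cluster_years)
    return [year for year in range(target_year - m, target_year) if year in members]
```

For a target year of 2001, a window of 5 years, and a cluster containing 1990, 1995, 1997, 1999, 2000, 2001, and 2003, only 1997, 1999, and 2000 would be used for training under this reading.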
4.4. Ensemble of Predictors. The complexity of the monsoon process makes it difficult for a single model to predict rainfall accurately. We design separate models for each cluster of years obtained by fuzzy clustering, using the four predictors described in Section 4.3. Finally, annual ISMR is presented as a weighted ensemble of the forecasts of the models designed for each cluster. The weight is the fuzzy membership of the test year in each cluster:

$$ \text{Ensemble prediction}_t = \sum_{i=1}^{c} W_i^{t} \cdot P_i, \tag{7} $$

where $P_i$ represents the prediction given by the model for cluster $i$, $W_i^{t}$ is the fuzzy membership of the $t$th test year in cluster $i$, and $c$ is the total number of clusters.
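Equation (7) is a plain weighted sum, as the short sketch below shows. The function name and the numbers in the usage note are illustrative only and are not taken from the paper's results.

```python
def ensemble_forecast(memberships, cluster_forecasts):
    """Weighted ensemble of (7): sum over clusters of W_i^t * P_i.

    memberships       -- fuzzy memberships of the test year, one per cluster
    cluster_forecasts -- forecasts P_i of the per-cluster models, same order
    """
    if len(memberships) != len(cluster_forecasts):
        raise ValueError("need one membership value per cluster forecast")
    return sum(w * p for w, p in zip(memberships, cluster_forecasts))
```

For a hypothetical test year with memberships 0.2, 0.5, and 0.3 and cluster forecasts of 90, 100, and 110 (percent of LPA), the ensemble forecast is 0.2 × 90 + 0.5 × 100 + 0.3 × 110 = 101.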
4.5. Validation of Proposed Approach. The study is performed on data for the period 1948–2013. Fuzzy clustering is performed over this period to partition it into three groups. The number of clusters is decided based on cluster quality. Separate prediction models are designed for all three clusters, and the ensemble of the forecasts of these models is provided as the predicted Indian summer monsoon rainfall. The test period 2001–2013 is used to evaluate the forecasting skill of our proposed approach.

The forecast models for annual ISMR are chiefly evaluated in terms of mean absolute error. Other error statistics, namely, root mean square error, prediction yield, Pearson correlation, and Willmott index of agreement, are also evaluated to judge the efficacy of our proposed approach. They are described below.
(i) Mean Absolute Error (MAE). Mean absolute error for the prediction of annual ISMR is calculated in the following way:

$$ \mathrm{MAE} = \frac{\sum_{i=1}^{N} \left| Y_i - X_i \right|}{N}, \tag{8} $$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $N$ denotes the total number of test years.

(ii) Root Mean Square Error (RMSE). Root mean square error measures the differences between the model-predicted output and the actual values. It is a good measure for comparing the forecasting errors of various models:

$$ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( Y_i - X_i \right)^{2}}{N}}. \tag{9} $$

(iii) Prediction Yield (PY). Prediction yields are evaluated at three error categories (5%, 10%, and 15% errors) to assess the overall prediction results by the percentage of predicted years that fall within each allowed range of error.

(iv) Pearson Correlation Coefficient (PC). The Pearson correlation coefficient measures the strength of the linear association between actual and predicted values, where a value of 1 means a perfect positive correlation and a value of −1 means a perfect negative correlation:

$$ \mathrm{PC} = \frac{\sum_{i=1}^{N} \left( X_i - \bar{X} \right)\left( Y_i - \bar{Y} \right)}{\sqrt{\sum_{i=1}^{N} \left( X_i - \bar{X} \right)^{2}} \, \sqrt{\sum_{i=1}^{N} \left( Y_i - \bar{Y} \right)^{2}}}, \tag{10} $$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $\bar{X}$ and $\bar{Y}$ are their corresponding means.

(v) Willmott Index of Agreement (WI). The Willmott index of agreement is a standardized measure of the degree of model prediction error. It varies between 0 and 1, with higher values indicating a better fit of the model for prediction:

$$ \text{Index of agreement} = 1 - \frac{\sum_{i=1}^{N} \left| X_i - Y_i \right|^{2}}{\sum_{i=1}^{N} \left( \left| Y_i - \bar{X} \right| + \left| X_i - \bar{X} \right| \right)^{2}}. \tag{11} $$
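The five verification measures can be computed as sketched below. This is an illustrative sketch: the function names are hypothetical, and the prediction yield assumes that both series are expressed as percent of LPA, so that the absolute difference can be compared directly with the allowed error (the text does not spell out how the percentage error is normalized).

```python
import math

def mae(actual, predicted):
    """Mean absolute error, following (8)."""
    return sum(abs(y - x) for x, y in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, following (9)."""
    return math.sqrt(sum((y - x) ** 2 for x, y in zip(actual, predicted)) / len(actual))

def prediction_yield(actual, predicted, allowed_error):
    """Percentage of years whose absolute error is within the allowed error.

    Assumption: series are in percent of LPA and the bound is inclusive.
    """
    hits = sum(1 for x, y in zip(actual, predicted) if abs(y - x) <= allowed_error)
    return 100.0 * hits / len(actual)

def pearson(actual, predicted):
    """Pearson correlation coefficient, following (10)."""
    mean_x = sum(actual) / len(actual)
    mean_y = sum(predicted) / len(predicted)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, predicted))
    den_x = math.sqrt(sum((x - mean_x) ** 2 for x in actual))
    den_y = math.sqrt(sum((y - mean_y) ** 2 for y in predicted))
    return num / (den_x * den_y)

def willmott_index(actual, predicted):
    """Willmott index of agreement, following (11)."""
    mean_x = sum(actual) / len(actual)
    num = sum(abs(x - y) ** 2 for x, y in zip(actual, predicted))
    den = sum((abs(y - mean_x) + abs(x - mean_x)) ** 2 for x, y in zip(actual, predicted))
    return 1.0 - num / den
```

For a hypothetical actual series of 100, 90, 110, and 95 and a predicted series of 98, 93, 108, and 99, the absolute errors are 2, 3, 2, and 4, so the MAE is 2.75 and the prediction yield at an allowed error of 3 is 75%.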
5. Experimental Results and Analysis

In this section, we present the evaluation of our proposed fuzzy clustering-based approach. We first present the results of fuzzy clustering of the monsoon years for different predictor sets. Forecasting skill is evaluated for all cluster models and the ensemble model in terms of mean absolute error for the test period 2001–2013. In addition, other measures, namely, root mean square error in prediction, correlation between predicted and actual rainfall, prediction yield, and agreement index between actual and predicted rainfall, are also estimated to establish the efficiency of our proposed approach to prediction of Indian summer monsoon rainfall.

Table 4: Cluster size (number of years) by fuzzy $c$-means clustering with $\alpha$-cut of 0.3 over the period 1948–2013.

Predictor set | Cluster1 | Cluster2 | Cluster3
--- | --- | --- | ---
PredSet1 | 16 | 38 | 30
PredSet2 | 30 | 17 | 40
PredSet3 | 32 | 14 | 38
PredSet4 | 42 | 31 | 21
PredSet5 | 15 | 37 | 26
5.1. Clustering of Monsoon Years. Fuzzy clustering is performed over the period 1948–2013 to group the data into three clusters. We apply an $\alpha$-cut with $\alpha = 0.3$ to assign the data instances to the clusters. The value is ascertained empirically such that the distribution of elements within clusters is regular. A data instance can be assigned to more than one cluster simultaneously. The cluster sizes for the various predictor sets are shown in Table 4.
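The $\alpha$-cut assignment can be sketched as follows. Whether the comparison is inclusive is not stated in the text, so the `>=` used here is an assumption, and the function name and membership values are hypothetical. Because a year may pass the cut in several clusters, the cluster sizes can add up to more than the 66 years of the study period, as they do in Table 4.

```python
def alpha_cut(memberships, alpha=0.3):
    """Assign each year to every cluster in which its membership is >= alpha.

    memberships -- dict mapping year -> list of fuzzy memberships, one per cluster
    Returns one ascending list of years per cluster.
    """
    n_clusters = len(next(iter(memberships.values())))
    clusters = [[] for _ in range(n_clusters)]
    for year in sorted(memberships):
        for j, w in enumerate(memberships[year]):
            if w >= alpha:
                clusters[j].append(year)
    return clusters
```

A year with memberships 0.50, 0.35, and 0.15 is placed in the first two clusters, whereas a year with memberships 0.80, 0.10, and 0.10 is placed in the first cluster only.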
5.2. Prediction Accuracy. We predict annual rainfall for all five predictor sets (Table 2) separately using four models, namely, MR, MLP, RNN, and GRNN. The test period is 2001–2013.
5.2.1. Multiple Regression Model (MR). Multiple regression models are built for every cluster by ascertaining the optimal training period for each predictor set. The optimal training period is found by varying the number of training years and choosing the value with the least absolute prediction error during the validation period (1984–1993). Individual cluster-based models as well as weighted ensemble models are considered for prediction. Table 5 gives the mean absolute error of the individual cluster-based and ensemble models for the test period 2001–2013. The model provides a mean absolute error of 6.2% for PredSet4 (Table 2). In Table 5, the ensemble model outperforms all the single-cluster models for PredSet1 and PredSet4; for the other predictor sets, one single-cluster model attains a lower error. Figure 3 shows the interannual variability of actual and ensemble-predicted rainfall as a percentage of the long period average (LPA).
5.2.2. Multilayer Perceptron Neural Network Model (MLP). The multilayer perceptron neural network model is designed with the four parameter sets described in Table 3. Mean absolute errors of all cluster and ensemble models are shown in Table 6. The MLP model reports an error of 4.0% for PredSet4 (Table 2) with MLP parameters ParSet1 (Table 3). The actual rainfall and the rainfall predicted by the cluster models and the ensemble model are shown in Figure 4. The ensemble-predicted rainfall closely follows the actual rainfall.
Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and the ensemble model for test period 2001–2013. The minimum reported error is 6.2%.

Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%)
--- | --- | --- | --- | --- | ---
PredSet1 | 20 | 9.4 | 9.3 | 10.9 | 8.6
PredSet2 | 20 | 11.0 | 7.5 | 9.4 | 8.3
PredSet3 | 15 | 10.9 | 6.5 | 9.2 | 6.7
PredSet4 | 15 | 10.4 | 10.1 | 6.8 | 6.2
PredSet5 | 15 | 7.6 | 8.5 | 8.4 | 7.9

Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and the ensemble model for test period 2001–2013. The minimum reported error is 4.0%.

Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%)
--- | --- | --- | --- | --- | ---
PredSet1 | ParSet4 | 13.8 | 18.1 | 16.9 | 8.2
PredSet2 | ParSet3 | 16.0 | 7.9 | 11.0 | 5.2
PredSet3 | ParSet1 | 8.0 | 7.8 | 6.5 | 6.5
PredSet4 | ParSet1 | 9.3 | 10.7 | 4.5 | 4.0
PredSet5 | ParSet1 | 8.5 | 15.3 | 13.7 | 11.0

[Figure 3 plot. x-axis: test years (2001–2013); y-axis: rainfall as % of LPA; legend: actual, ensemble, Clus1 model, Clus2 model, Clus3 model.]

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013.

5.2.3. Recurrent Neural Network Model (RNN). Mean absolute errors for the prediction of annual rainfall by the recurrent neural network model for the test period 2001–2013 are presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. The RNN weights the training years so that the weights decrease with their distance from the test year. The pattern of actual and ensemble-predicted rainfall as a percentage of LPA is shown in Figure 5.
[Figure 4 plot. x-axis: test years (2001–2013); y-axis: rainfall as % of LPA; legend: actual, ensemble, Clus1 model, Clus2 model, Clus3 model.]

Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013.

5.2.4. Generalized Regression Neural Network Model (GRNN). The mean absolute errors of the generalized regression neural network ensemble and individual cluster models are presented in Table 8. The model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variation of the rainfall forecast by the GRNN ensemble model along with the actual rainfall pattern, as a percentage of LPA, for the period 2001–2013. It is observed that the predicted values are close to the actual rainfall patterns. Predictions by the models designed for individual clusters are shown by different symbols.
Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and the ensemble model for test period 2001–2013. The minimum reported error is 5.1%.

Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%)
--- | --- | --- | --- | --- | ---
PredSet1 | ParSet1 | 11.3 | 7.1 | 16.8 | 7.0
PredSet2 | ParSet1 | 13.2 | 13.5 | 12.6 | 8.5
PredSet3 | ParSet1 | 12.9 | 5.4 | 6.0 | 5.1
PredSet4 | ParSet1 | 12.3 | 6.4 | 4.7 | 5.9
PredSet5 | ParSet2 | 15.1 | 16.1 | 13.4 | 8.8

Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and the ensemble model for test period 2001–2013. The minimum reported error is 6.1%.

Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%)
--- | --- | --- | --- | --- | ---
PredSet1 | 20 | 10.0 | 7.6 | 7.6 | 6.4
PredSet2 | 30 | 7.1 | 8.9 | 7.6 | 6.4
PredSet3 | 20 | 5.8 | 9.2 | 6.0 | 6.1
PredSet4 | 20 | 6.3 | 6.6 | 7.2 | 6.3
PredSet5 | 25 | 7.1 | 9.4 | 11.9 | 6.6
[Figure 5 plot. x-axis: test years (2001–2013); y-axis: rainfall as % of LPA; legend: actual, ensemble, Clus1 model, Clus2 model, Clus3 model.]

Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013.
[Figure 6 plot. x-axis: test years (2001–2013); y-axis: rainfall as % of LPA; legend: actual, ensemble, Clus1 model, Clus2 model, Clus3 model.]

Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013.

5.3. Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of accuracy measures other than mean absolute error. Table 9 shows different forecast verification statistics for the ensemble models during the test period 2001–2013. We summarize the observations below.

(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSE of 7.4% and 8.4%, respectively.

(ii) Prediction Yield (PY). PY for the 5% error category of the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. They give prediction yields of 76%, 92%, 92%, and 84% for the 10% allowed error category. Finally, at the 15% error category, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, for every model, at most one of the 13 test years deviates from the corresponding actual rainfall by more than 15%.

(iii) Pearson Correlation (PC). PC of 0.61, 0.81, 0.71, and 0.49 is observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.

(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.

All of the mentioned statistical measures (Table 9) as well as the mean absolute error (Table 6) in prediction of monsoon ascertain the MLP model to be the best among all four proposed models.

Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

Verification measure | MR | MLP | RNN | GRNN
--- | --- | --- | --- | ---
RMSE for forecast (%) | 8.4 | 5.3 | 6.4 | 7.4
PY (%) at allowed error 5% | 46 | 69 | 53 | 46
PY (%) at allowed error 10% | 76 | 92 | 92 | 84
PY (%) at allowed error 15% | 92 | 100 | 92 | 100
PC between actual and predicted rainfall | 0.61 | 0.81 | 0.71 | 0.49
WI between actual and predicted rainfall | 0.71 | 0.89 | 0.81 | 0.62

Table 10: Comparison of absolute errors (%) for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) against the same models trained without clustering (NC).

Predictor set | MR: Tot error (NC) | MR: Ensml error (WC) | MLP: Tot error (NC) | MLP: Ensml error (WC) | RNN: Tot error (NC) | RNN: Ensml error (WC) | GRNN: Tot error (NC) | GRNN: Ensml error (WC)
--- | --- | --- | --- | --- | --- | --- | --- | ---
PredSet1 | 8.9 | 8.6 | 10.0 | 8.2 | 11.7 | 7.0 | 6.9 | 6.4
PredSet2 | 9.2 | 8.2 | 12.8 | 5.2 | 10.7 | 8.5 | 7.2 | 6.4
PredSet3 | 7.4 | 6.7 | 6.7 | 6.5 | 6.2 | 5.1 | 6.1 | 6.1
PredSet4 | 6.7 | 6.2 | 5.8 | 4.0 | 6.0 | 5.5 | 6.3 | 6.3
PredSet5 | 8.2 | 7.9 | 9.7 | 11.0 | 8.9 | 8.8 | 9.0 | 6.7
5.4. Comparison of Results

5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD), namely, the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, from 1996 to 2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.
[Figure 7 bar graph. x-axis: different models (IMD 16-param, IMD 8-param, IMD 10-param, MR, MLP, RNN, GRNN); y-axis: root mean square error as % of LPA.]

Figure 7: Comparison of MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the period 1996–2002 [4, 5]. Striped bars represent errors by our proposed models.

5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining the outputs of all clusters' models is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various combinations of models and predictor sets are shown in Table 10. With clustering and ensembling, the error is lower than or equal to that of the nonclustered conventional method in every case except MLP with PredSet5.
Table 11: Physical climatic events under study.

Climatic event | Number of years | Years associated with the event
--- | --- | ---
Drought | 13 | 1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009
Flood | 11 | 1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994
El-Nino | 23 | 1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009
La-Nina | 22 | 1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011
Positive IOD | 12 | 1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007
Negative IOD | 10 | 1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996
Table 12: Threshold of support and confidence measures for associating obtained clusters with physical climatic events.

Predictor set | Support threshold | Confidence threshold
--- | --- | ---
PredSet1 | 0.37 | 0.30
PredSet2 | 0.25 | 0.46
PredSet3 | 0.21 | 0.43
PredSet4 | 0.29 | 0.61
PredSet5 | 0.21 | 0.54
5.5. Prediction of the Year 2014. The annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of 7.0% for forecasting the rainfall of 2014.
6. Meteorological Analysis

Next, we interpret each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by certain global climatic events. The climatic events considered during the period 1948–2013 (the period used for clustering in our work) are El-Nino and La-Nina (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, as shown in Table 11.

Figure 8 shows the El-Nino and La-Nina years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Nino and La-Nina years are shown by color codes (light green and green) in the figure. The chart helps to visualize the co-occurrence of El-Nino and La-Nina events with the extremes of ISMR.
[Figure 8 bar chart. x-axis: years (1945–2015); y-axis: deviation of rainfall from LPA (% of LPA, −25 to 25); legend: normal years, El-Nino years, La-Nina years.]

Figure 8: Association of El-Nino (light green) and La-Nina (green) years with drought (years more than 10% below LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years more than 10% above LPA rainfall) years during the period 1948–2013.

6.1. Measuring Association between Climatic Events and ISMR. Support and confidence measures are used to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.
(i) Support. Support is defined as the percentage of the total number of years in the cluster that correspond to the climatic event:

$$ \text{Support} = \frac{x_{\mathrm{ce}}}{N}, \tag{12} $$

where $x_{\mathrm{ce}}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.

(ii) Confidence. Confidence is defined as the percentage of years associated with the climatic event in the cluster relative to the total number of such event years:

$$ \text{Confidence} = \frac{x_{\mathrm{ce}}}{T_{\mathrm{ce}}}, \tag{13} $$

where $T_{\mathrm{ce}}$ is the number of years associated with the climatic event during the period 1948–2013.
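The two measures in (12) and (13) reduce to simple set counts, as sketched below. The function names are hypothetical, the cluster used in the usage note is invented for illustration, and treating "attaining" a threshold as greater than or equal is an assumption. The measures are returned as fractions, the scale on which the thresholds of Table 12 are expressed.

```python
def support_confidence(cluster_years, event_years):
    """Support (12) and confidence (13) of a climatic event for one cluster."""
    x_ce = len(set(cluster_years) & set(event_years))  # event years inside the cluster
    support = x_ce / len(set(cluster_years))           # relative to cluster size N
    confidence = x_ce / len(set(event_years))          # relative to all event years T_ce
    return support, confidence

def is_associated(support, confidence, support_threshold, confidence_threshold):
    """A cluster is linked to an event only if both thresholds are attained."""
    return support >= support_threshold and confidence >= confidence_threshold

# Drought years as listed in Table 11.
DROUGHT_YEARS = [1951, 1965, 1966, 1968, 1972, 1974, 1979,
                 1982, 1986, 1987, 2002, 2004, 2009]
```

For an invented cluster containing the years 1951, 1965, 1966, 1970, 1975, 1982, 1990, and 2002, five of the eight years are drought years, so the support is 5/8 = 0.625 and the confidence is 5/13 ≈ 0.38. With the PredSet1 thresholds of Table 12 (0.37 and 0.30), such a cluster would be associated with drought.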
[Figure 9: two 3D histograms, panels (a) and (b). Axes: confidence, support, and year-count.]

Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1.
Table 13: Identified physical climatic events associated with the clusters obtained by fuzzy clustering.

Predictor Cluster1 Cluster2 Cluster3
PredSet1 Drought El-Nino La-Nina La-Nina
PredSet2 Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3 El-Nino positive IOD Drought Drought El-Nino
PredSet4 La-Nina Flood La-Nina Drought
PredSet5 - Drought El-Nino Flood
We relate a cluster to a physical climatic event described in Table 11 if both the support and confidence measures attain the corresponding thresholds. The thresholds are chosen such that 50% of the years of study are under consideration. A low threshold compromises the importance of a climatic event being related to a particular cluster; on the other hand, if even fewer years are taken, the threshold values would be high, which in turn would leave out most of the clusters. Therefore, as a compromise between the extremes, 50% of years are considered. Figure 9 shows histograms with confidence and support as bins of year-count before and after the thresholding process, respectively, for predictor set PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both the support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results indicate the co-occurrence of La-Nina and flood events. They also shed light on the high probability of El-Nino, drought, and positive IOD events occurring simultaneously.
7. Conclusion

Monsoon is an important phenomenon for the economic development of an agricultural land like India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. This paper addresses the problem by clustering the years into similar groups and then providing a multimodel ensemble forecast for Indian summer monsoon rainfall.
Different climatic parameters, each with its best correlated month value, are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of the test year in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also ascertain the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in a meteorological context, the clusters are linked with global climatic events.
In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work is supported by the RBU project through the RESPOND program of ISRO, through KCSTC, IIT Kharagpur.
References
[1] H. F. Blanford, "On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India," Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.
[2] G. T. Walker, "Correlation in seasonal variations of weather, IV: a further study of world weather," Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.
[3] V. Thapliyal and S. M. Kulshrestha, "Recent models for long range forecasting of South-West monsoon rainfall in India," Mausam, vol. 43, no. 3, pp. 239–248, 1992.
[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, "A power regression model for long range forecast of southwest monsoon rainfall over India," Mausam, vol. 42, no. 2, pp. 125–130, 1991.
[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, "IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003," Current Science, vol. 86, no. 3, pp. 422–431, 2004.
[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, "New statistical models for long-range forecasting of southwest monsoon rainfall over India," Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.
[7] J. Schewe and A. Levermann, "A statistically predictive model for future monsoon failure in India," Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.
[8] Q. Wu, Y. Yan, and D. Chen, "A linear Markov model for East Asian monsoon seasonal forecast," Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.
[9] K. Fan, Y. Liu, and H. Chen, "Improving the prediction of the East Asian summer monsoon: new approaches," Weather and Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.
[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, "Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes," Journal of Hydrology, vol. 503, pp. 11–21, 2013.
[11] A. K. Sahai, M. K. Soman, and V. Satyan, "All India summer monsoon rainfall prediction using an artificial neural network," Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.
[12] W.-C. Hong, "Rainfall forecasting by technological machine learning models," Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.
[13] S. Chattopadhyay and G. Chattopadhyay, "Comparative study among different neural net learning algorithms applied to rainfall time series," Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.
[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, "Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India," Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.
[15] V. R. Durai and R. Bhardwaj, "Improving precipitation forecasts skill over India using a multi-model ensemble technique," Geofizika, vol. 30, no. 2, pp. 119–141, 2013.
[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, "Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994," Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.
[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., "The twentieth century reanalysis project," Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.
[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., "The NCEP/NCAR 40-year reanalysis project," Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.
[19] E. M. Rasmusson and T. H. Carpenter, "Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño," Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ClimatologyJournal of
EcologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
EarthquakesJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom
Applied ampEnvironmentalSoil Science
Volume 2014
Mining
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
International Journal of
Geophysics
OceanographyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal ofPetroleum Engineering
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Atmospheric SciencesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MineralogyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MeteorologyAdvances in
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Geological ResearchJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Geology Advances in
Figure 2: Proposed fuzzy clustering-based ensemble approach for prediction of Indian summer monsoon rainfall. [Flowchart. Input: climate parameters (CP) and rainfall. Data preprocessing: climate anomaly data evaluation; correlation study of CP and rainfall; identification of the month of each CP most highly correlated to rainfall; building predictor sets with different CP combinations. Clustering years: clustering each predictor set using fuzzy clustering (membership values); applying an alpha-cut to obtain clusters. Ensemble neural network model: forecast for every cluster by each model (MR, MLP, RNN, GRNN); ensemble of the outputs of all clusters. Validation: error calculation; comparison with existing models; comparison with the nonclustered method. Physical interpretation of clusters: support, confidence, threshold, cluster interpretation. Output: forecast of rainfall; error in forecasting; model comparison results; climatic events associated with each cluster.]
4.3. Prediction Models. Multiple regression and three artificial neural network (ANN) models, namely, multilayer perceptron, recurrent neural network, and generalized regression neural network, are used to design prediction models for each cluster exclusively. A forecast of annual ISMR is provided by each cluster model separately and also by an ensemble of all the clusters' model forecasts. We describe the models used below.
Table 3: Model parameter settings for MLP models.

| Parameter set | Hidden layers | Training years | Training method |
|---|---|---|---|
| ParSet1 | [3 5] | 20 | BFGS quasi-Newton backpropagation |
| ParSet2 | [3 5 10] | 15 | Conjugate gradient backpropagation with Powell-Beale restarts |
| ParSet3 | [5 10] | 10 | Scaled conjugate gradient backpropagation |
| ParSet4 | [3 5] | 15 | Resilient backpropagation |

4.3.1. Multiple Regression (MR). A multiple regression model is used to learn the relationship between several independent predictor variables ($X_i$s) and a dependent variable ($Y$). A multiple regression model having $p$ independent variables is shown in

$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \tag{4}$$

where $x_{ij}$ is the $i$th observation of the $j$th independent variable, the first independent variable takes the value 1 for all $i$, and $\varepsilon_i$ represents the residual.
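As a rough illustration of equation (4), the sketch below estimates the coefficients by ordinary least squares through the normal equations. The source does not say how the coefficients are estimated, so least squares, the helper names `fit_multiple_regression` and `predict`, and the toy data are all assumptions made for illustration.

```python
def fit_multiple_regression(rows, targets):
    """Least-squares fit of y_i = b_1*x_i1 + ... + b_p*x_ip + e_i.

    Each row must already start with the constant 1 (the first
    independent variable in equation (4)). Hypothetical helper.
    """
    p = len(rows[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    aug = [
        [sum(r[j] * r[k] for r in rows) for k in range(p)]
        + [sum(r[j] * y for r, y in zip(rows, targets))]
        for j in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[j][p] / aug[j][j] for j in range(p)]


def predict(coefficients, row):
    """Evaluate the fitted regression for one observation."""
    return sum(b * x for b, x in zip(coefficients, row))


# Toy data generated from y = 2 + 3*x2 - 1*x3 without noise.
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 2, 1], [1, 1, 3]]
y = [2, 5, 1, 7, 2]
beta = fit_multiple_regression(X, y)
```

Because the toy targets are noise-free, the fitted coefficients recover the generating values up to rounding; with real predictors the residual term in (4) would be nonzero.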
4.3.2. Multilayer Perceptron Neural Network (MLP). A multilayer perceptron neural network is a class of ANN in which the connections between neurons do not form a directed cycle. Information propagates in only one direction, from the input nodes through the hidden nodes to the output nodes. The independent and dependent variables constitute the input and output layers, respectively. The number of hidden layers and the number of nodes in each must be determined empirically for each prediction task. Four different parameter sets, chosen empirically, are considered for the models designed to forecast ISMR; they are shown in Table 3.
4.3.3. Recurrent Neural Network (RNN). A recurrent neural network is a class of ANN that maintains an internal state of the network to exhibit dynamic temporal behaviour. Climatic changes or events occurring in the same or nearby time periods are highly correlated. Similarly, rainfall patterns are more strongly correlated with influencing factors in nearby years than with those in distant years. This behaviour is well captured by an RNN, which during training gives weights in decreasing order to values from near to distant years. It thus assists in modelling the system dynamics in a more natural manner. The same parameter sets as for the MLP network (Table 3) are considered, with a delay span of 2 units.
4.3.4. Generalized Regression Neural Network (GRNN). A generalized regression neural network is a variant of the radial basis function network. GRNN has three layers of artificial neurons: input, hidden, and output. The hidden layer has radial basis neurons, while the neurons in the output layer have a linear transfer function. The output of the radial basis neurons is the input scaled by the spread factor. Consider $p$ input-output pairs $(x_i, y_i) \in \mathbb{R}^n \times \mathbb{R}^1$ with $n$ input variables and $i = 1, 2, \ldots, p$, where $y_i$ represents the output from each hidden unit. The GRNN output for a test point $x \in \mathbb{R}^n$ is described by

$$y(x) = \sum_{i=1}^{p} W_i y_i, \tag{5}$$

where

$$W_i = \frac{\exp\left(-\|x - x_i\|^2 / 2\sigma^2\right)}{\sum_{k=1}^{p} \exp\left(-\|x - x_k\|^2 / 2\sigma^2\right)}. \tag{6}$$

Here $\sigma$ denotes the spread factor.
The reasons for modelling with GRNN are that (i) it has only one tunable design parameter (the spread factor), (ii) it is a one-pass algorithm (less time-consuming), and (iii) it accurately approximates functions from sparse data.
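The GRNN output in equations (5) and (6) is a normalized, distance-weighted average of the training outputs, which can be sketched in a few lines. This is a minimal sketch and not the authors' implementation; the function name `grnn_predict`, the argument name `spread`, and the toy data are assumptions.

```python
import math


def grnn_predict(train_x, train_y, x, spread):
    """GRNN output for test point x, following equations (5) and (6)."""
    # Unnormalized activations exp(-||x - x_i||^2 / (2 * sigma^2)).
    activations = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2 * spread ** 2))
        for xi in train_x
    ]
    total = sum(activations)
    # Dividing by the total makes the weights W_i sum to 1.
    return sum(a * yi for a, yi in zip(activations, train_y)) / total


# Toy one-dimensional training set, for illustration only.
train_x = [[0.0], [1.0], [2.0]]
train_y = [10.0, 20.0, 30.0]
estimate = grnn_predict(train_x, train_y, [1.0], spread=0.5)
```

No iterative training is involved: the training pairs are stored and the only quantity to tune is the spread, which matches points (i) and (ii) above. A small spread makes the output follow the nearest training point, while a large spread averages over many points.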
The optimal number of training years is ascertained for the MR and GRNN models by varying the number of training years from 5 to 30 and selecting the value that gives the least absolute prediction error during the validation period (1984–1993). A training length of $m$ years specifies that, for predicting the rainfall of the $r$th year, the available preceding $m$ years $r-1, r-2, \ldots, r-m$ that are present in a particular cluster are considered for training.
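The training-window rule can be sketched as follows. The reading used here is an assumption: the window covers the calendar years r-1 to r-m, and only those window years that belong to the cluster are kept. The helper names, the made-up cluster, and the stand-in validation error are hypothetical.

```python
def training_years(cluster_years, target_year, m):
    """Years r-1, ..., r-m that belong to the cluster (assumed reading)."""
    window = range(target_year - m, target_year)
    members = set(cluster_years)
    return [year for year in window if year in members]


def best_training_length(validation_error, candidates=range(5, 31)):
    """Pick the training length (5 to 30 years) with the least error."""
    return min(candidates, key=validation_error)


cluster = [1990, 1992, 1995, 1996, 1999, 2000, 2003]  # made-up membership
selected = training_years(cluster, target_year=2001, m=10)

# Stand-in for the validation error of 1984-1993; a real run would
# train a cluster model for each m and measure its absolute error.
chosen_m = best_training_length(lambda m: abs(m - 20))
```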
4.4. Ensemble of Predictors. The complexity of the monsoon process makes it difficult for a single model to predict rainfall accurately. We design separate models for each cluster of years obtained by fuzzy clustering, using the four predictors described in Section 4.3. Finally, annual ISMR is presented as a weighted ensemble of the forecasts of the models designed for each cluster. The weight is the fuzzy membership of the test year in each cluster:

$$\text{Ensemble prediction}_t = \sum_{i=1}^{c} W_i^t \cdot P_i, \tag{7}$$

where $P_i$ represents the prediction given by the model for cluster $i$, $W_i^t$ is the fuzzy membership of the $t$th test year in cluster $i$, and $c$ is the total number of clusters.
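Equation (7) amounts to a membership-weighted sum of the cluster forecasts for one test year. A minimal sketch follows; the function name and the numbers are made up for illustration.

```python
def ensemble_prediction(memberships, cluster_predictions):
    """Equation (7): sum over clusters of W_i^t * P_i for one test year."""
    if len(memberships) != len(cluster_predictions):
        raise ValueError("need one membership per cluster prediction")
    return sum(w * p for w, p in zip(memberships, cluster_predictions))


# Made-up memberships of one test year in c = 3 clusters and the
# corresponding cluster-model forecasts (percent of LPA).
weights = [0.2, 0.5, 0.3]
forecasts = [90.0, 100.0, 110.0]
combined = ensemble_prediction(weights, forecasts)
```

In fuzzy c-means, the memberships of a data point sum to 1 across clusters, so the ensemble forecast always lies between the smallest and largest cluster forecasts.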
4.5. Validation of Proposed Approach. The study is performed on data for the period 1948–2013. Fuzzy clustering is performed over this period to divide the years into three groups; the number of clusters is decided based on cluster quality. Separate prediction models are designed for all three clusters, and the ensemble of the forecasts of these models is provided as the predicted Indian summer monsoon rainfall. The test period 2001–2013 is used to evaluate the forecasting skill of our proposed approach.
The forecast models for annual ISMR are chiefly evaluated in terms of mean absolute error. Other error statistics, namely, root mean square error, prediction yield, Pearson correlation, and Willmott index of agreement, are also evaluated to judge the efficacy of our proposed approach. They are described below.
(i) Mean Absolute Error (MAE). The mean absolute error for prediction of annual ISMR is calculated as

$$\text{MAE} = \frac{\sum_{i=1}^{N} |Y_i - X_i|}{N}, \tag{8}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $N$ denotes the total number of test years.

(ii) Root Mean Square Error (RMSE). The root mean square error measures the differences between the model-predicted output and the actual values. It is a good measure for comparing the forecasting errors of various models:

$$\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - X_i)^2}{N}}. \tag{9}$$

(iii) Prediction Yield (PY). Prediction yields are evaluated at three different error categories (5%, 10%, and 15% errors) to assess the overall prediction results by judging the percentage of predicted years within each allowed range of error.

(iv) Pearson Correlation Coefficient (PC). The Pearson correlation coefficient measures the strength of the linear association between actual and predicted values, where a value of 1 means a perfect positive correlation and a value of −1 means a perfect negative correlation:

$$\text{PC} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}}, \tag{10}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $\bar{X}$ and $\bar{Y}$ are their corresponding means.

(v) Willmott Index of Agreement (WI). The Willmott index of agreement is a standardized measure of the degree of model prediction error. It varies between 0 and 1, with higher values indicating a better fit of the model for prediction:

$$\text{Index of agreement} = 1 - \frac{\sum_{i=1}^{N} |X_i - Y_i|^2}{\sum_{i=1}^{N} \left(|Y_i - \bar{X}| + |X_i - \bar{X}|\right)^2}. \tag{11}$$
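The five measures above can be sketched as plain functions over the actual series X and the predicted series Y. This is a minimal sketch under stated assumptions: the function names and the sample series are made up, and the prediction yield is read as the percentage of years whose absolute error, in the same units as the series, does not exceed the allowed error.

```python
import math


def mae(actual, predicted):
    """Equation (8)."""
    return sum(abs(y - x) for x, y in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Equation (9)."""
    return math.sqrt(
        sum((y - x) ** 2 for x, y in zip(actual, predicted)) / len(actual)
    )


def prediction_yield(actual, predicted, allowed_error):
    """Percent of years whose absolute error is within allowed_error."""
    hits = sum(abs(y - x) <= allowed_error for x, y in zip(actual, predicted))
    return 100.0 * hits / len(actual)


def pearson(actual, predicted):
    """Equation (10)."""
    mx = sum(actual) / len(actual)
    my = sum(predicted) / len(predicted)
    cov = sum((x - mx) * (y - my) for x, y in zip(actual, predicted))
    sx = math.sqrt(sum((x - mx) ** 2 for x in actual))
    sy = math.sqrt(sum((y - my) ** 2 for y in predicted))
    return cov / (sx * sy)


def willmott_index(actual, predicted):
    """Equation (11); both deviations are taken from the actual mean."""
    mx = sum(actual) / len(actual)
    num = sum((x - y) ** 2 for x, y in zip(actual, predicted))
    den = sum((abs(y - mx) + abs(x - mx)) ** 2
              for x, y in zip(actual, predicted))
    return 1 - num / den


# Made-up rainfall series in percent of LPA, for illustration only.
actual = [92.0, 81.0, 102.0, 86.0, 99.0]
predicted = [95.0, 85.0, 100.0, 90.0, 98.0]
```

For these made-up series the absolute errors are 3, 4, 2, 4, and 1, so the MAE is 2.8 and three of the five years fall within an allowed error of 3.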
5. Experimental Results and Analysis

In this section, we present the evaluation of our proposed fuzzy clustering-based approach. We first present the results of fuzzy clustering of the monsoon years for different predictor sets. Forecasting skill is evaluated for all cluster models and the ensemble model in terms of mean absolute error for the test period 2001–2013. In addition, other measures, namely, root mean square error in prediction, correlation between predicted and actual rainfall, prediction yield, and agreement index between actual and predicted rainfall, are also estimated to establish the efficiency of our proposed approach to the prediction of Indian summer monsoon rainfall.

Table 4: Cluster size (number of years) by fuzzy c-means clustering with α-cut of 0.3 over the period 1948–2013.

| Predictor set | Cluster1 | Cluster2 | Cluster3 |
|---|---|---|---|
| PredSet1 | 16 | 38 | 30 |
| PredSet2 | 30 | 17 | 40 |
| PredSet3 | 32 | 14 | 38 |
| PredSet4 | 42 | 31 | 21 |
| PredSet5 | 15 | 37 | 26 |
5.1. Clustering of Monsoon Years. Fuzzy clustering is performed over the period 1948–2013 to group the data into three clusters. We apply an α-cut with α = 0.3 to assign the data instances to the clusters. The value is ascertained empirically such that the distribution of elements within the clusters is regular. A data instance can be assigned to more than one cluster simultaneously. The cluster sizes for the various predictor sets are shown in Table 4.
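The α-cut assignment can be sketched as below, assuming that a year joins every cluster in which its membership degree is at least α. The source does not say whether the comparison is strict, and the membership values here are made up.

```python
def alpha_cut(memberships, alpha=0.3):
    """Assign each year to every cluster whose membership is >= alpha."""
    n_clusters = len(next(iter(memberships.values())))
    clusters = [[] for _ in range(n_clusters)]
    for year, degrees in sorted(memberships.items()):
        for index, degree in enumerate(degrees):
            if degree >= alpha:
                clusters[index].append(year)
    return clusters


# Made-up membership degrees of four years in three clusters.
memberships = {
    1948: [0.70, 0.20, 0.10],
    1949: [0.35, 0.33, 0.32],
    1950: [0.10, 0.45, 0.45],
    1951: [0.05, 0.15, 0.80],
}
clusters = alpha_cut(memberships)
```

Here 1949 lands in all three clusters and 1950 in two, which is why the cluster sizes in Table 4 add up to more than the 66 years of the study period. With three clusters and memberships that sum to 1, the largest membership of any year is at least 1/3, so a cut at 0.3 never leaves a year unassigned.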
5.2. Prediction Accuracy. We predict annual rainfall for all five predictor sets (Table 2) separately using four models, namely, MR, MLP, RNN, and GRNN. The test period is 2001 to 2013.
5.2.1. Multiple Regression Model (MR). Multiple regression models are built for every cluster by ascertaining the optimal training period for each predictor set. The optimal training period is found by varying the number of training years and selecting the value with the least absolute prediction error during the validation period (1984–1993). Individual cluster-based models as well as weighted ensemble models are considered for prediction. Table 5 gives the mean absolute error of the individual cluster-based and ensemble models for the test period 2001–2013. The model provides a mean absolute error of 6.2% for PredSet4 (Table 2). The ensemble model outperforms the single cluster models for most predictor sets (Table 5). Figure 3 shows the interannual variability of actual and ensemble predicted rainfall as percent of long period average (LPA).
5.2.2. Multilayer Perceptron Neural Network Model (MLP). The multilayer perceptron neural network model is designed with the four different parameter sets described in Table 3. Mean absolute errors of all cluster and ensemble models are shown in Table 6. The MLP model reports an error of 4.0% for PredSet4 (Table 2) with MLP parameters ParSet1 (Table 3). The actual rainfall and the rainfall predicted by the cluster models and the ensemble model are shown in Figure 4. The ensemble predicted rainfall closely follows the actual rainfall.
Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and ensemble model for test period 2001–2013. The minimum reported error is 6.2%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 9.4 | 9.3 | 10.9 | 8.6 |
| PredSet2 | 20 | 11.0 | 7.5 | 9.4 | 8.3 |
| PredSet3 | 15 | 10.9 | 6.5 | 9.2 | 6.7 |
| PredSet4 | 15 | 10.4 | 10.1 | 6.8 | 6.2 |
| PredSet5 | 15 | 7.6 | 8.5 | 8.4 | 7.9 |

Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and ensemble model for test period 2001–2013. The minimum reported error is 4.0%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet4 | 13.8 | 18.1 | 16.9 | 8.2 |
| PredSet2 | ParSet3 | 16.0 | 7.9 | 11.0 | 5.2 |
| PredSet3 | ParSet1 | 8.0 | 7.8 | 6.5 | 6.5 |
| PredSet4 | ParSet1 | 9.3 | 10.7 | 4.5 | 4.0 |
| PredSet5 | ParSet1 | 8.5 | 15.3 | 13.7 | 11.0 |

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. [Bar chart; x-axis: test years 2001–2013; y-axis: rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]

5.2.3. Recurrent Neural Network Model (RNN). Mean absolute errors for prediction of annual rainfall by the recurrent neural network model for the test period 2001–2013 are presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. RNN gives the training years weights that decrease with their distance from the test year. The pattern of actual and ensemble predicted rainfall in terms of percentage of LPA is shown in Figure 5.
Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. [Bar chart; x-axis: test years 2001–2013; y-axis: rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]

5.2.4. Generalized Regression Neural Network Model (GRNN). The mean absolute errors of the generalized regression neural network ensemble and individual cluster models are presented in Table 8. The model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variation of the rainfall forecast by the GRNN ensemble model along with the actual rainfall pattern, in terms of percentage of LPA, for the period 2001–2013. The predicted values are close to the actual rainfall pattern. Predictions by the models designed for individual clusters are shown by different symbols.
Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and ensemble model for test period 2001–2013. The minimum reported error is 5.1%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet1 | 11.3 | 7.1 | 16.8 | 7.0 |
| PredSet2 | ParSet1 | 13.2 | 13.5 | 12.6 | 8.5 |
| PredSet3 | ParSet1 | 12.9 | 5.4 | 6.0 | 5.1 |
| PredSet4 | ParSet1 | 12.3 | 6.4 | 4.7 | 5.9 |
| PredSet5 | ParSet2 | 15.1 | 16.1 | 13.4 | 8.8 |

Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and ensemble model for test period 2001–2013. The minimum reported error is 6.1%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 10.0 | 7.6 | 7.6 | 6.4 |
| PredSet2 | 30 | 7.1 | 8.9 | 7.6 | 6.4 |
| PredSet3 | 20 | 5.8 | 9.2 | 6.0 | 6.1 |
| PredSet4 | 20 | 6.3 | 6.6 | 7.2 | 6.3 |
| PredSet5 | 25 | 7.1 | 9.4 | 11.9 | 6.6 |

Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. [Bar chart; x-axis: test years 2001–2013; y-axis: rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]
Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. [Bar chart; x-axis: test years 2001–2013; y-axis: rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]

5.3. Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of accuracy measures other than mean absolute error. Table 9 shows different forecast verification statistics for the ensemble models during the test period 2001–2013. We summarize the observations below.

Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

| Verification measure | MR | MLP | RNN | GRNN |
|---|---|---|---|---|
| RMSE for forecast (%) | 8.4 | 5.3 | 6.4 | 7.4 |
| PY (%) at allowed error 5% | 46 | 69 | 53 | 46 |
| PY (%) at allowed error 10% | 76 | 92 | 92 | 84 |
| PY (%) at allowed error 15% | 92 | 100 | 92 | 100 |
| PC between actual and predicted rainfall | 0.61 | 0.81 | 0.71 | 0.49 |
| WI between actual and predicted rainfall | 0.71 | 0.89 | 0.81 | 0.62 |

(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSEs of 7.4% and 8.4%, respectively.

(ii) Prediction Yield (PY). PY for the 5% error category of the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. They give prediction yields of 76%, 92%, 92%, and 84% for the 10% allowed error category. Finally, at the 15% error category, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, none of the predicted years shows an abrupt deviation from the corresponding actual rainfall pattern.

(iii) Pearson Correlation (PC). PCs of 0.61, 0.81, 0.71, and 0.49 are observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.

(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.

All of the mentioned statistical measures (Table 9), as well as the mean absolute error (Table 6) in prediction of monsoon, ascertain the MLP model to be the best among all four proposed models.

Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) approach against the standard method with the same models without clustering (NC) approach.

| Predictor set | MR: Tot error (NC) (%) | MR: Ensml error (WC) (%) | MLP: Tot error (NC) (%) | MLP: Ensml error (WC) (%) | RNN: Tot error (NC) (%) | RNN: Ensml error (WC) (%) | GRNN: Tot error (NC) (%) | GRNN: Ensml error (WC) (%) |
|---|---|---|---|---|---|---|---|---|
| PredSet1 | 8.9 | 8.6 | 10.0 | 8.2 | 11.7 | 7.0 | 6.9 | 6.4 |
| PredSet2 | 9.2 | 8.2 | 12.8 | 5.2 | 10.7 | 8.5 | 7.2 | 6.4 |
| PredSet3 | 7.4 | 6.7 | 6.7 | 6.5 | 6.2 | 5.1 | 6.1 | 6.1 |
| PredSet4 | 6.7 | 6.2 | 5.8 | 4.0 | 6.0 | 5.5 | 6.3 | 6.3 |
| PredSet5 | 8.2 | 7.9 | 9.7 | 11.0 | 8.9 | 8.8 | 9.0 | 6.7 |
5.4. Comparison of Results

5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD), namely, the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, from 1996 to 2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.
Figure 7: Comparison of MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the time period 1996–2002 [4, 5]. Striped bars represent errors by our proposed models. [Bar chart; x-axis: different models (16-par, 8-par, 10-par, MR, MLP, RNN, GRNN); y-axis: root mean square error as % of LPA.]

5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining the outputs of all clusters' models is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various model and predictor set combinations are shown in Table 10. The results show an improvement in prediction by the clustering and ensemble method over the nonclustered conventional method for most combinations.
Table 11: Physical climatic events under study.

| Climatic event | Number of years | Years associated with the event |
|---|---|---|
| Drought | 13 | 1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009 |
| Flood | 11 | 1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994 |
| El-Nino | 23 | 1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009 |
| La-Nina | 22 | 1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011 |
| Positive IOD | 12 | 1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007 |
| Negative IOD | 10 | 1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996 |

Table 12: Thresholds of support and confidence measures for associating the obtained clusters with physical climatic events.

| Predictor set | Support threshold | Confidence threshold |
|---|---|---|
| PredSet1 | 0.37 | 0.30 |
| PredSet2 | 0.25 | 0.46 |
| PredSet3 | 0.21 | 0.43 |
| PredSet4 | 0.29 | 0.61 |
| PredSet5 | 0.21 | 0.54 |
5.5. Prediction of the Year 2014. The annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of 7.0% in forecasting the rainfall of 2014.
6. Meteorological Analysis

Next, we try to visualize each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by certain global climatic events. The climatic events studied over the period 1948 to 2013 (the period considered for clustering in our work) are El-Nino and La-Nina (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, as shown in Table 11.
Figure 8 shows the El-Nino and La-Nina years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Nino and La-Nina years are shown by colour codes (light green and green) in the figure. The chart helps to visualize the co-occurrence of El-Nino and La-Nina events with extremes of ISMR.
Figure 8: El-Nino (light green) and La-Nina (green) years association with drought (years below 10% of LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years above 10% of LPA rainfall) years during period 1948–2013. [Bar chart; x-axis: years 1945–2015; y-axis: deviation of rainfall from LPA (%), from −25 to 25; legend: Normal years, El-Nino years, La-Nina years.]

6.1. Measuring Association between Climatic Events and ISMR. Support and confidence measures are used to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.
(i) Support. Support is defined as the percentage of the total number of years in the cluster that correspond to the climatic event:

$$\text{Support} = \frac{x_{ce}}{N}, \tag{12}$$

where $x_{ce}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.

(ii) Confidence. Confidence is defined as the percentage of years associated with the climatic event in the cluster relative to the total number of such event years:

$$\text{Confidence} = \frac{x_{ce}}{T_{ce}}, \tag{13}$$

where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
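Equations (12) and (13) can be sketched with set intersections. The drought years below are taken from Table 11, while the cluster membership and the function names are made up for illustration.

```python
def support(cluster_years, event_years):
    """Equation (12): share of the cluster's years tied to the event."""
    x_ce = len(set(cluster_years) & set(event_years))
    return x_ce / len(set(cluster_years))


def confidence(cluster_years, event_years):
    """Equation (13): share of all event years that fall in the cluster."""
    x_ce = len(set(cluster_years) & set(event_years))
    return x_ce / len(set(event_years))


# Drought years from Table 11; the cluster membership is made up.
drought = [1951, 1965, 1966, 1968, 1972, 1974, 1979,
           1982, 1986, 1987, 2002, 2004, 2009]
cluster = [1951, 1952, 1965, 1972, 1980, 1982, 1990, 2002, 2009, 2012]

drought_support = support(cluster, drought)
drought_confidence = confidence(cluster, drought)
```

Six of the ten cluster years are drought years, so the support is 0.6, and those six cover 6 of the 13 drought years, so the confidence is about 0.46. Support describes how strongly the event dominates the cluster, while confidence describes how much of the event the cluster captures.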
Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1. [Two 3D histograms, panels (a) and (b); axes: Confidence, Support, Year-count.]

Table 13: Identified physical climatic events associated with clusters obtained by fuzzy clustering (columns: Predictor, Cluster1, Cluster2, Cluster3).
PredSet1: Drought El-Nino La-Nina La-Nina
PredSet2: Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3: El-Nino positive IOD Drought Drought El-Nino
PredSet4: La-Nina Flood La-Nina Drought
PredSet5: - Drought El-Nino Flood
We relate a cluster to a physical climatic event described in Table 11 if both the support and confidence measures attain the corresponding thresholds. The thresholds are chosen such that 50% of the years of study are under consideration. A low threshold weakens the significance of relating a climatic event to a particular cluster; on the other hand, retaining fewer years requires high threshold values, which in turn leaves out most of the clusters. Therefore, as a compromise between the extremes, 50% of the years are considered. Figure 9 shows histograms with confidence and support as bins of year-count before and after the thresholding process, respectively, for predictor set PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both the support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results establish the coexistence of La-Nina and flood events. They also shed light on the high probability of El-Nino, drought, and positive IOD events occurring simultaneously.
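The association rule described above, in which an event is attached to a cluster only when both measures reach their thresholds, might look like the following. The event lists come from Table 11 and the thresholds are those of PredSet1 in Table 12; the cluster membership, the function name, and the use of "greater than or equal" for attaining a threshold are assumptions.

```python
def associated_events(cluster_years, events, support_min, confidence_min):
    """Events whose support and confidence both reach the thresholds."""
    members = set(cluster_years)
    selected = []
    for name, years in events.items():
        shared = len(members & set(years))
        if (shared / len(members) >= support_min
                and shared / len(years) >= confidence_min):
            selected.append(name)
    return selected


# Two event lists from Table 11; the cluster itself is made up.
events = {
    "Drought": [1951, 1965, 1966, 1968, 1972, 1974, 1979,
                1982, 1986, 1987, 2002, 2004, 2009],
    "Flood": [1953, 1956, 1958, 1959, 1961, 1964,
              1970, 1975, 1983, 1988, 1994],
}
cluster = [1951, 1953, 1965, 1972, 1980, 1982, 1990, 2002, 2009, 2012]
# Support and confidence thresholds of PredSet1 from Table 12.
labels = associated_events(cluster, events, 0.37, 0.30)
```

In this made-up cluster, drought passes both thresholds (support 0.6, confidence about 0.46), while flood shares only one year with the cluster and is rejected on support.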
7. Conclusion

Monsoon is an important phenomenon for the economic development of an agricultural country like India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. This paper attempts to address the problem by clustering the years into similar groups and then providing a multimodel ensemble forecast for Indian summer monsoon rainfall.
Different climatic parameters are identified at their best correlated month values, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of the test year in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also ascertain the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in the meteorological context, the clusters are linked with global climatic events.
In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.
Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work is supported by RBU project through RESPOND program of ISRO through KCSTC, IIT Kharagpur.
References

[1] H. F. Blanford, "On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India," Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.
[2] G. T. Walker, "Correlation in seasonal variations of weather, IV: a further study of world weather," Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.
[3] V. Thapliyal and S. M. Kulshrestha, "Recent models for long range forecasting of South-West monsoon rainfall in India," Mausam, vol. 43, no. 3, pp. 239–248, 1992.
[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, "A power regression model for long range forecast of southwest monsoon rainfall over India," Mausam, vol. 42, no. 2, pp. 125–130, 1991.
[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, "IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003," Current Science, vol. 86, no. 3, pp. 422–431, 2004.
[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, "New statistical models for long-range forecasting of southwest monsoon rainfall over India," Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.
[7] J. Schewe and A. Levermann, "A statistically predictive model for future monsoon failure in India," Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.
[8] Q. Wu, Y. Yan, and D. Chen, "A linear Markov model for East Asian monsoon seasonal forecast," Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.
[9] K. Fan, Y. Liu, and H. Chen, "Improving the prediction of the East Asian summer monsoon: new approaches," Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.
[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, "Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes," Journal of Hydrology, vol. 503, pp. 11–21, 2013.
[11] A. K. Sahai, M. K. Soman, and V. Satyan, "All India summer monsoon rainfall prediction using an artificial neural network," Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.
[12] W.-C. Hong, "Rainfall forecasting by technological machine learning models," Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.
[13] S. Chattopadhyay and G. Chattopadhyay, "Comparative study among different neural net learning algorithms applied to rainfall time series," Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.
[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, "Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India," Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.
[15] V. R. Durai and R. Bhardwaj, "Improving precipitation forecasts skill over India using a multi-model ensemble technique," Geofizika, vol. 30, no. 2, pp. 119–141, 2013.
[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, "Monthly and seasonal rainfall series for All-India, homogeneous regions and meteorological subdivisions: 1871–1994," Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.
[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., "The twentieth century reanalysis project," Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.
[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., "The NCEP/NCAR 40-year reanalysis project," Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.
[19] E. M. Rasmusson and T. H. Carpenter, "Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño," Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.
Table 3: Model parameter settings for MLP models.

Parameter set   Hidden layers   Training years   Training method
ParSet1         [3, 5]          20               BFGS quasi-Newton backpropagation
ParSet2         [3, 5, 10]      15               Conjugate gradient backpropagation with Powell-Beale restarts
ParSet3         [5, 10]         10               Scaled conjugate gradient backpropagation
ParSet4         [3, 5]          15               Resilient backpropagation
A multiple regression model with $p$ independent variables is shown in

$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \tag{4}$$

where $x_{ij}$ is the $i$th observation of the $j$th independent variable, the first independent variable takes the value 1 for all $i$, and $\varepsilon_i$ represents the residual.
4.3.2. Multilayer Perceptron Neural Network (MLP). A multilayer perceptron neural network is a class of ANN in which the connections between neurons do not form a directed cycle. Information propagates in only one direction: from the input nodes, through the hidden nodes, to the output nodes. The independent and dependent variables constitute the input and output layers, respectively. The number of hidden layers and the number of nodes in each must be determined empirically for each prediction task. Four parameter sets, chosen empirically for the models designed to forecast ISMR, are shown in Table 3.
4.3.3. Recurrent Neural Network (RNN). A recurrent neural network is a class of ANN that maintains an internal state of the network in order to exhibit dynamic temporal behaviour. Climatic changes or events occurring in the same or nearby time periods are highly correlated. Similarly, rainfall patterns are more strongly correlated with influencing factors in nearby years than in distant years. This behaviour is captured well by the RNN, which during training gives weights that decrease from near to distant years. It thus helps to model the system dynamics in a more natural manner. The same set of parameters as for the MLP network (Table 3) is considered, with a delay span of 2 units.
4.3.4. Generalized Regression Neural Network (GRNN). A generalized regression neural network is a variant of the radial basis function network. GRNN has three layers of artificial neurons: input, hidden, and output. The hidden layer has radial basis neurons, while the neurons in the output layer have a linear transfer function. The output of the radial basis neurons is the input scaled by the spread factor. Given $p$ input-output pairs $(x_i, y_i) \in \mathbb{R}^n \times \mathbb{R}^1$, with $n$ input variables and $i = 1, 2, \ldots, p$, $y_i$ represents the output from each hidden unit. The GRNN output for a test point $x \in \mathbb{R}^n$ is described by

$$y(x) = \sum_{i=1}^{p} W_i \, y_i, \tag{5}$$

where

$$W_i = \frac{\exp\left(-\lVert x - x_i \rVert^2 / 2\sigma^2\right)}{\sum_{k=1}^{p} \exp\left(-\lVert x - x_k \rVert^2 / 2\sigma^2\right)}. \tag{6}$$
The reasons for modelling with GRNN are that (i) it has only one tunable design parameter (the spread factor $\sigma$), (ii) it is a one-pass algorithm and therefore less time consuming, and (iii) it approximates functions accurately from sparse data.
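Equations (5) and (6) can be sketched in a few lines. The following is only an illustrative sketch, not the authors' implementation: the function name `grnn_predict`, the list-based data layout, and the fallback to a plain mean when every kernel underflows are assumptions made here.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """GRNN output for one test point x, following Eqs. (5) and (6).

    train_x is a list of input vectors x_i, train_y the matching outputs y_i,
    and sigma the spread factor (the single tunable parameter).
    """
    kernels = []
    for x_i in train_x:
        # Squared Euclidean distance ||x - x_i||^2
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_i))
        kernels.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))

    total = sum(kernels)
    if total == 0.0:
        # Assumption: if every kernel underflows, fall back to the plain mean.
        return sum(train_y) / len(train_y)

    # Normalised weights W_i times the stored outputs y_i
    return sum(k * y for k, y in zip(kernels, train_y)) / total
```

Because the weights are normalised, the output is always a convex combination of the training outputs; a small $\sigma$ makes the prediction follow the nearest training point, while a large $\sigma$ pulls it toward the overall mean.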
The optimal number of training years for the MR and GRNN models is ascertained by varying the number of training years from 5 to 30 and selecting the value that gives the least absolute error in prediction during the validation period (1984–1993). A training span of $m$ years means that, for predicting the rainfall of the $r$th year, those of the preceding $m$ years $r-1, r-2, \ldots, r-m$ that are present in a particular cluster are used for training.
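The selection of the training span can be sketched as follows. This is a hedged illustration under one reading of the text: the window contains only those of the $m$ preceding years that belong to the cluster. The names `training_window` and `select_training_span`, and the `fit_predict` callable standing in for an MR or GRNN model, are hypothetical.

```python
def training_window(cluster_years, target_year, m):
    """Years among target_year-1, ..., target_year-m that belong to the cluster."""
    members = set(cluster_years)
    return [y for y in range(target_year - m, target_year) if y in members]


def select_training_span(cluster_years, actual, fit_predict,
                         validation_years=range(1984, 1994),
                         candidates=range(5, 31)):
    """Return the span m with the lowest mean absolute validation error.

    actual maps year -> observed rainfall. fit_predict(train_years, target_year)
    is a hypothetical callable that trains a model on train_years and returns
    its forecast for target_year.
    """
    best_m, best_error = None, float("inf")
    for m in candidates:
        errors = []
        for year in validation_years:
            train_years = training_window(cluster_years, year, m)
            if not train_years:
                continue  # no cluster members fall inside this window
            errors.append(abs(fit_predict(train_years, year) - actual[year]))
        if errors:
            mean_error = sum(errors) / len(errors)
            if mean_error < best_error:
                best_m, best_error = m, mean_error
    return best_m
```

Ties are resolved in favour of the shorter span, which is a design choice of this sketch rather than something the text specifies.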
4.4. Ensemble of Predictors. The complexity of the monsoon process makes it difficult for a single model to predict rainfall accurately. We design separate models for each cluster of years obtained by fuzzy clustering, using the four predictors described in Section 4.3. Finally, annual ISMR is presented as a weighted ensemble of the forecasts of the models designed for each cluster. The weight is the fuzzy membership of the test year in each cluster:

$$\text{Ensemble prediction}_t = \sum_{i=1}^{c} W_i^t \cdot P_i, \tag{7}$$

where $P_i$ represents the prediction given by the model for cluster $i$, $W_i^t$ is the fuzzy membership of the $t$th test year in cluster $i$, and $c$ is the total number of clusters.
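Equation (7) amounts to a membership-weighted sum. A minimal sketch, with the function name `ensemble_prediction` and the example numbers chosen here purely for illustration:

```python
def ensemble_prediction(memberships, cluster_predictions):
    """Weighted ensemble of Eq. (7): sum over clusters of W_i^t * P_i.

    memberships holds the fuzzy membership of the test year in each cluster,
    cluster_predictions the forecast P_i of each cluster's model.
    """
    if len(memberships) != len(cluster_predictions):
        raise ValueError("one membership value is needed per cluster model")
    return sum(w * p for w, p in zip(memberships, cluster_predictions))


# Illustrative values only: three cluster forecasts (as % of LPA) and the
# memberships of one test year, which sum to 1 as in fuzzy c-means.
forecast = ensemble_prediction([0.2, 0.5, 0.3], [90.0, 100.0, 110.0])
```

When the memberships sum to 1, the ensemble forecast always lies between the smallest and largest cluster forecasts.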
4.5. Validation of Proposed Approach. The study is performed on data for the period 1948–2013. Fuzzy clustering is performed over this period to cluster it into three groups. The number of clusters is decided based on cluster quality. Separate prediction models are designed for all three clusters, and the ensemble of the forecasts of these models is provided as the predicted Indian summer monsoon rainfall. The test period 2001–2013 is used to evaluate the forecasting skill of our proposed approach.
The forecast models for annual ISMR are chiefly evaluated in terms of mean absolute error. Other error statistics, namely, root mean square error, prediction yield, Pearson correlation, and Willmott index of agreement, are also evaluated to judge the efficacy of our proposed approach. They are described below.
(i) Mean Absolute Error (MAE). The mean absolute error for prediction of annual ISMR is calculated in the following way:

$$\text{MAE} = \frac{\sum_{i=1}^{N} \lvert Y_i - X_i \rvert}{N}, \tag{8}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $N$ denotes the total number of test years.
(ii) Root Mean Square Error (RMSE). The root mean square error aggregates the differences between the model-predicted output and the actual values. It is a good measure for comparing the forecasting errors of various models:

$$\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - X_i)^2}{N}}. \tag{9}$$
(iii) Prediction Yield (PY). Prediction yields are evaluated at three error categories (5%, 10%, and 15% errors) to assess the overall prediction results by measuring the percentage of predicted years that fall within each allowed range of error.
(iv) Pearson Correlation Coefficient (PC). The Pearson correlation coefficient measures the strength of linear association between actual and predicted values, where a value of 1 means a perfect positive correlation and a value of −1 means a perfect negative correlation:

$$\text{PC} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2} \, \sqrt{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}}, \tag{10}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $\bar{X}$ and $\bar{Y}$ are their corresponding means.
(v) Willmott Index of Agreement (WI). The Willmott index of agreement is a standardized measure of the degree of model prediction error. It varies between 0 and 1, with higher values indicating a better fit of the model for prediction:

$$\text{Index of agreement} = 1 - \frac{\sum_{i=1}^{N} \lvert X_i - Y_i \rvert^2}{\sum_{i=1}^{N} \left( \lvert Y_i - \bar{X} \rvert + \lvert X_i - \bar{X} \rvert \right)^2}. \tag{11}$$
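The five verification measures in (8)–(11) can be sketched together as below. This is an illustrative sketch, not the authors' code: the function names and the four-year toy series are assumptions, and the prediction yield assumes that the allowed error is expressed in the same units as the series (for example, percentage points of LPA).

```python
import math


def mae(actual, predicted):
    """Mean absolute error, Eq. (8)."""
    return sum(abs(y - x) for x, y in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, Eq. (9)."""
    return math.sqrt(sum((y - x) ** 2 for x, y in zip(actual, predicted)) / len(actual))


def prediction_yield(actual, predicted, allowed_error):
    """Percent of years whose absolute error is within allowed_error."""
    hits = sum(1 for x, y in zip(actual, predicted) if abs(y - x) <= allowed_error)
    return 100.0 * hits / len(actual)


def pearson(actual, predicted):
    """Pearson correlation coefficient, Eq. (10)."""
    n = len(actual)
    mean_x, mean_y = sum(actual) / n, sum(predicted) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, predicted))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in actual))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in predicted))
    return cov / (sd_x * sd_y)


def willmott(actual, predicted):
    """Willmott index of agreement, Eq. (11)."""
    mean_x = sum(actual) / len(actual)
    num = sum((x - y) ** 2 for x, y in zip(actual, predicted))
    den = sum((abs(y - mean_x) + abs(x - mean_x)) ** 2
              for x, y in zip(actual, predicted))
    return 1.0 - num / den


# Toy series (as % of LPA), for illustration only.
actual = [100.0, 90.0, 110.0, 95.0]
predicted = [98.0, 93.0, 104.0, 95.0]
print(mae(actual, predicted), rmse(actual, predicted),
      prediction_yield(actual, predicted, 5.0))
```

For the toy series the absolute errors are 2, 3, 6, and 0, so the MAE is 2.75, the RMSE is 3.5, and three of the four years fall within a 5-point error band.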
5. Experimental Results and Analysis

In this section, we present the evaluation of our proposed fuzzy clustering-based approach. We first present the results of fuzzy clustering of the monsoon years for the different predictor sets. Forecasting skill is evaluated for all cluster models and the ensemble model in terms of mean absolute error for the test period 2001–2013. In addition, other measures, namely, root mean square error in prediction, correlation between predicted and actual rainfall, prediction yields, and the agreement index between actual and predicted rainfall, are also estimated to establish the efficiency of our proposed approach to prediction of Indian summer monsoon rainfall.

Table 4: Cluster size (number of years) by fuzzy $c$-means clustering with an $\alpha$-cut of 0.3 over the period 1948–2013.

Predictor set   Cluster1   Cluster2   Cluster3
PredSet1        16         38         30
PredSet2        30         17         40
PredSet3        32         14         38
PredSet4        42         31         21
PredSet5        15         37         26
5.1. Clustering of Monsoon Years. Fuzzy clustering is performed over the period 1948–2013 to cluster the data into three clusters. We perform an $\alpha$-cut with $\alpha = 0.3$ to assign the data instances to the clusters. The value is ascertained empirically such that the distribution of elements within clusters is regular. A data instance can be assigned to more than one cluster simultaneously. The cluster sizes for the various predictor sets are shown in Table 4.
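The $\alpha$-cut assignment can be sketched as follows. This is a minimal sketch under stated assumptions: the membership values are made up for illustration, the function name `alpha_cut_assign` is hypothetical, and the cut is taken as "membership greater than or equal to $\alpha$" (the usual weak $\alpha$-cut), which the text does not spell out.

```python
def alpha_cut_assign(memberships, alpha=0.3):
    """Assign each year to every cluster where its membership is >= alpha.

    memberships maps year -> list of fuzzy membership degrees, one per cluster.
    Returns one set of years per cluster; a year may appear in several sets.
    """
    n_clusters = len(next(iter(memberships.values())))
    clusters = [set() for _ in range(n_clusters)]
    for year, degrees in memberships.items():
        for k, degree in enumerate(degrees):
            if degree >= alpha:
                clusters[k].add(year)
    return clusters


# Illustrative memberships for three years and three clusters.
example = {
    2001: [0.60, 0.30, 0.10],
    2002: [0.35, 0.33, 0.32],
    2003: [0.10, 0.10, 0.80],
}
clusters = alpha_cut_assign(example, alpha=0.3)
```

Overlapping assignment of this kind is consistent with Table 4, where the three cluster sizes of each predictor set add up to more than the 66 years in 1948–2013.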
5.2. Prediction Accuracy. We predict annual rainfall for each of the five predictor sets (Table 2) separately, using four models, namely, MR, MLP, RNN, and GRNN. The test period is 2001–2013.
5.2.1. Multiple Regression Model (MR). Multiple regression models are built for every cluster by ascertaining the optimal training period for each predictor set. The optimal training period is found by varying the number of training years and selecting the value that gives the least absolute error in prediction during the validation period (1984–1993). Both the individual cluster-based models and the weighted ensemble model are considered for prediction. Table 5 gives the mean absolute error of the individual cluster-based and ensemble models for the test period 2001–2013. The ensemble model attains its lowest mean absolute error, 6.2%, for PredSet4 (Table 2). The ensemble model outperforms all the single-cluster models for every predictor set except PredSet3 and PredSet5, where one cluster model has a slightly lower error (Table 5). Figure 3 shows the interannual variability of actual and ensemble-predicted rainfall as a percentage of the long period average (LPA).
5.2.2. Multilayer Perceptron Neural Network Model (MLP). The multilayer perceptron neural network model is designed with the four parameter sets described in Table 3. Mean absolute errors of all cluster and ensemble models are shown in Table 6. The MLP model reports an error of 4.0% for PredSet4 (Table 2) with MLP parameters ParSet1 (Table 3). The actual rainfall and the rainfall predicted by the cluster models and the ensemble model are shown in Figure 4. The ensemble-predicted rainfall closely follows the actual rainfall.
5.2.3. Recurrent Neural Network Model (RNN). Mean absolute errors for prediction of annual rainfall by the recurrent neural network model for the test period 2001–2013 are presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. The RNN assigns weights to the training years that decrease with their distance from the test year. The pattern of actual and ensemble-predicted rainfall as a percentage of LPA is shown in Figure 5.

Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and the ensemble model for test period 2001–2013. Reports a minimum error of 6.2%.

Predictor set   Training years   Cluster1 error (%)   Cluster2 error (%)   Cluster3 error (%)   Ensemble error (%)
PredSet1        20               9.4                  9.3                  10.9                 8.6
PredSet2        20               11.0                 7.5                  9.4                  8.3
PredSet3        15               10.9                 6.5                  9.2                  6.7
PredSet4        15               10.4                 10.1                 6.8                  6.2
PredSet5        15               7.6                  8.5                  8.4                  7.9

Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and the ensemble model for test period 2001–2013. Reports a minimum error of 4.0%.

Predictor set   Parameter set   Cluster1 error (%)   Cluster2 error (%)   Cluster3 error (%)   Ensemble error (%)
PredSet1        ParSet4         13.8                 18.1                 16.9                 8.2
PredSet2        ParSet3         16.0                 7.9                  11.0                 5.2
PredSet3        ParSet1         8.0                  7.8                  6.5                  6.5
PredSet4        ParSet1         9.3                  10.7                 4.5                  4.0
PredSet5        ParSet1         8.5                  15.3                 13.7                 11.0

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models using MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR as a percentage of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. [Bar chart; x-axis: test years 2001–2013; y-axis: rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]
5.2.4. Generalized Regression Neural Network Model (GRNN). Mean absolute errors of the generalized regression neural network ensemble and individual cluster models are presented in Table 8. The model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variation of the rainfall forecast by the GRNN ensemble model along with the actual rainfall pattern, as a percentage of LPA, for the period 2001–2013. The predicted values are close to the actual rainfall pattern. Predictions by the models designed for individual clusters are shown by different symbols.

Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models using MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR as a percentage of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. [Bar chart; x-axis: test years 2001–2013; y-axis: rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]
Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and the ensemble model for test period 2001–2013. Reports a minimum error of 5.1%.

Predictor set   Parameter set   Cluster1 error (%)   Cluster2 error (%)   Cluster3 error (%)   Ensemble error (%)
PredSet1        ParSet1         11.3                 7.1                  16.8                 7.0
PredSet2        ParSet1         13.2                 13.5                 12.6                 8.5
PredSet3        ParSet1         12.9                 5.4                  6.0                  5.1
PredSet4        ParSet1         12.3                 6.4                  4.7                  5.9
PredSet5        ParSet2         15.1                 16.1                 13.4                 8.8

Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and the ensemble model for test period 2001–2013. Reports a minimum error of 6.1%.

Predictor set   Training years   Cluster1 error (%)   Cluster2 error (%)   Cluster3 error (%)   Ensemble error (%)
PredSet1        20               10.0                 7.6                  7.6                  6.4
PredSet2        30               7.1                  8.9                  7.6                  6.4
PredSet3        20               5.8                  9.2                  6.0                  6.1
PredSet4        20               6.3                  6.6                  7.2                  6.3
PredSet5        25               7.1                  9.4                  11.9                 6.6
Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models using RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR as a percentage of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. [Bar chart; x-axis: test years 2001–2013; y-axis: rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]
5.3. Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of accuracy measures other than mean absolute error. Table 9 shows different forecast verification statistics for the ensemble models during the test period 2001–2013. We summarize the observations below.
(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSEs of 7.4% and 8.4%, respectively.

(ii) Prediction Yield (PY). PY at the 5% error category for the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. The models give prediction yields of 76%, 92%, 92%, and 84% at the 10% error category. Finally, at the 15% error category, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, very few of the predicted years deviate abruptly from the corresponding actual rainfall.

(iii) Pearson Correlation (PC). PCs of 0.61, 0.81, 0.71, and 0.49 are observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.

(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.

Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models using GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR as a percentage of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. [Bar chart; x-axis: test years 2001–2013; y-axis: rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]

Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

Verification measure                        MR     MLP    RNN    GRNN
RMSE for forecast (%)                       8.4    5.3    6.4    7.4
PY (%) at allowed error 5%                  46     69     53     46
PY (%) at allowed error 10%                 76     92     92     84
PY (%) at allowed error 15%                 92     100    92     100
PC between actual and predicted rainfall    0.61   0.81   0.71   0.49
WI between actual and predicted rainfall    0.71   0.89   0.81   0.62

Table 10: Comparison of absolute errors (%) for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) against the same models without clustering (NC). For each model, the first column is the total error (NC) and the second is the ensemble error (WC).

Predictor set   MR NC   MR WC   MLP NC   MLP WC   RNN NC   RNN WC   GRNN NC   GRNN WC
PredSet1        8.9     8.6     10.0     8.2      11.7     7.0      6.9       6.4
PredSet2        9.2     8.2     12.8     5.2      10.7     8.5      7.2       6.4
PredSet3        7.4     6.7     6.7      6.5      6.2      5.1      6.1       6.1
PredSet4        6.7     6.2     5.8      4.0      6.0      5.5      6.3       6.3
PredSet5        8.2     7.9     9.7      11.0     8.9      8.8      9.0       6.7
All of the statistical measures above (Table 9), as well as the mean absolute error in prediction of monsoon (Table 6), identify the MLP model as the best of the four proposed models.
5.4. Comparison of Results

5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD): the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, from 1996 to 2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.
5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining the outputs of all cluster models is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various model and predictor set combinations are shown in Table 10. The results show that clustering with ensembling improves on the nonclustered conventional method in most combinations. The exceptions in Table 10 are MLP with PredSet5, where the nonclustered model has the lower error, and GRNN with PredSet3 and PredSet4, where the two errors are equal.

Figure 7: Comparison of MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with IMD's existing 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the period 1996–2002 [4, 5]. Striped bars represent errors of our proposed models. [Bar chart; x-axis: different models (16-par, 8-par, 10-par, MR, MLP, RNN, GRNN); y-axis: root mean square error as % of LPA.]
Table 11: Physical climatic events under study.

Climatic event   Number of years   Years associated with the event
Drought          13                1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009
Flood            11                1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994
El-Nino          23                1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009
La-Nina          22                1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011
Positive IOD     12                1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007
Negative IOD     10                1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996
Table 12: Thresholds of support and confidence measures for associating the obtained clusters with physical climatic events.

Predictor set   Support threshold   Confidence threshold
PredSet1        0.37                0.30
PredSet2        0.25                0.46
PredSet3        0.21                0.43
PredSet4        0.29                0.61
PredSet5        0.21                0.54
5.5. Prediction of the Year 2014. The annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of about 7.0% for forecasting the rainfall of 2014.
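The per-model errors for 2014 can be checked with a short calculation. The sketch below assumes the extracted figures are read with their decimal points restored, that is, an actual value of 87.8% of LPA and forecasts of 96.1%, 80.3%, 80.0%, and 95.3% of LPA; the variable names are illustrative.

```python
# Values as % of LPA, read from the text with decimal points restored (assumption).
actual_2014 = 87.8
forecasts_2014 = {"MR": 96.1, "MLP": 80.3, "RNN": 80.0, "GRNN": 95.3}

# Absolute error of each ensemble model, in percentage points of LPA.
abs_errors_2014 = {
    model: round(abs(forecast - actual_2014), 1)
    for model, forecast in forecasts_2014.items()
}
print(abs_errors_2014)
```

Under this reading, the individual absolute errors lie between 7.5 and 8.3 percentage points of LPA, with MR and GRNN overestimating the rainfall and MLP and RNN underestimating it.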
6. Meteorological Analysis

Next, we try to interpret each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by some global climatic events. The climatic events considered over the period 1948 to 2013 (the period used for clustering in our work) are El-Nino and La-Nina (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, shown in Table 11.
Figure 8 shows the El-Nino and La-Nina years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Nino and La-Nina years are shown by color codes (light green and green) in the figure. The chart helps to visualize the co-occurrence of El-Nino and La-Nina events with extremes of ISMR.
Figure 8: Association of El-Nino (light green) and La-Nina (green) years with drought (years below −10% of LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years above +10% of LPA rainfall) years during the period 1948–2013. [Bar chart; x-axis: years; y-axis: deviation of rainfall from LPA (%), from −25 to 25; legend: Normal years, El-Nino years, La-Nina years.]

6.1. Measuring Association between Climatic Events and ISMR. Support and confidence measures are used to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.
(i) Support. Support is defined as the proportion of the total number of years in the cluster that correspond to the climatic event:

$$\text{Support} = \frac{x_{ce}}{N}, \tag{12}$$

where $x_{ce}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.

(ii) Confidence. Confidence is defined as the proportion of the years associated with the climatic event that fall in the cluster, relative to the total number of such event years:

$$\text{Confidence} = \frac{x_{ce}}{T_{ce}}, \tag{13}$$

where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
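The two measures in (12) and (13), and the thresholding rule that follows, can be sketched as below. The drought years are taken from Table 11; the cluster membership, the function names, and the use of "greater than or equal to" for attaining a threshold are assumptions made for illustration.

```python
def support_confidence(cluster_years, event_years):
    """Support (Eq. 12) and confidence (Eq. 13) of an event within a cluster."""
    cluster, event = set(cluster_years), set(event_years)
    x_ce = len(cluster & event)          # event years that fall in the cluster
    support = x_ce / len(cluster)        # share of the cluster
    confidence = x_ce / len(event)       # share of all event years
    return support, confidence


def is_associated(cluster_years, event_years, support_threshold, confidence_threshold):
    """A cluster is related to an event only if both thresholds are attained."""
    support, confidence = support_confidence(cluster_years, event_years)
    return support >= support_threshold and confidence >= confidence_threshold


# Drought years from Table 11.
drought_years = [1951, 1965, 1966, 1968, 1972, 1974, 1979,
                 1982, 1986, 1987, 2002, 2004, 2009]
# Hypothetical cluster of ten years, for illustration only.
toy_cluster = [1951, 1965, 1966, 1970, 1972, 1980, 1990, 2002, 2009, 2010]

support, confidence = support_confidence(toy_cluster, drought_years)
linked = is_associated(toy_cluster, drought_years, 0.37, 0.30)
```

In this toy case six of the ten cluster years are drought years, giving a support of 0.6 and a confidence of 6/13, so the cluster would be linked to drought at threshold values of 0.37 and 0.30.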
Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1. [Two 3D histograms; horizontal axes: confidence and support; vertical axis: year-count.]
Table 13: Identified physical climatic events associated with the clusters obtained by fuzzy clustering.

Predictor   Cluster1 Cluster2 Cluster3
PredSet1    Drought El-Nino La-Nina La-Nina
PredSet2    Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3    El-Nino positive IOD Drought Drought El-Nino
PredSet4    La-Nina Flood La-Nina Drought
PredSet5    - Drought El-Nino Flood
We relate a cluster to a physical climatic event described in Table 11 if both the support and confidence measures attain the corresponding thresholds. The thresholds are chosen in such a way that 50% of the years of study are under consideration. A low threshold compromises the importance of a climatic event being related to a particular cluster; on the other hand, if fewer years are taken, then the threshold values must be high, which in turn leaves out most of the clusters. Therefore, as a compromise between the extremes, 50% of the years are considered. Figure 9 shows histograms with confidence and support as bins of year-count before and after the thresholding process, respectively, for predictor set PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both the support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results establish the coexistence of La-Nina and flood events. They also indicate a high probability of El-Nino, drought, and positive IOD events occurring simultaneously.
7. Conclusion

Monsoon is an important phenomenon for the economic development of an agricultural land like India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. The paper attempts to address this problem by clustering the years into similar groups, and finally a multimodel ensemble forecast is provided for Indian summer monsoon rainfall.
Different climatic parameters with their best correlated month values are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of the test year in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also ascertain the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in the meteorological context, the clusters are linked with global climatic events.
In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work is supported by the RBU project through the RESPOND program of ISRO, through KCSTC, IIT Kharagpur.
References
[1] H. F. Blanford, "On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India," Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.
[2] G. T. Walker, "Correlation in seasonal variations of weather, IV: a further study of world weather," Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.
[3] V. Thapliyal and S. M. Kulshrestha, "Recent models for long range forecasting of South-West monsoon rainfall in India," Mausam, vol. 43, no. 3, pp. 239–248, 1992.
[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, "A power regression model for long range forecast of southwest monsoon rainfall over India," Mausam, vol. 42, no. 2, pp. 125–130, 1991.
[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, "IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003," Current Science, vol. 86, no. 3, pp. 422–431, 2004.
[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, "New statistical models for long-range forecasting of southwest monsoon rainfall over India," Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.
[7] J. Schewe and A. Levermann, "A statistically predictive model for future monsoon failure in India," Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.
[8] Q. Wu, Y. Yan, and D. Chen, "A linear Markov model for East Asian monsoon seasonal forecast," Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.
[9] K. Fan, Y. Liu, and H. Chen, "Improving the prediction of the East Asian summer monsoon: new approaches," Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.
[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, "Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes," Journal of Hydrology, vol. 503, pp. 11–21, 2013.
[11] A. K. Sahai, M. K. Soman, and V. Satyan, "All India summer monsoon rainfall prediction using an artificial neural network," Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.
[12] W.-C. Hong, "Rainfall forecasting by technological machine learning models," Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.
[13] S. Chattopadhyay and G. Chattopadhyay, "Comparative study among different neural net learning algorithms applied to rainfall time series," Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.
[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, "Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India," Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.
[15] V. R. Durai and R. Bhardwaj, "Improving precipitation forecasts skill over India using a multi-model ensemble technique," Geofizika, vol. 30, no. 2, pp. 119–141, 2013.
[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, "Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994," Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.
[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., "The twentieth century reanalysis project," Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.
[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., "The NCEP/NCAR 40-year reanalysis project," Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.
[19] E. M. Rasmusson and T. H. Carpenter, "Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño," Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.
6 Advances in Meteorology
(i) Mean Absolute Error (MAE). The mean absolute error for prediction of annual ISMR is calculated as
$$\mathrm{MAE} = \frac{\sum_{i=1}^{N} \left| Y_i - X_i \right|}{N}, \tag{8}$$
where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $N$ denotes the total number of test years.
(ii) Root Mean Square Error (RMSE). The root mean square error measures the differences between the model predicted output and the actual values. It is a good measure for comparing the forecasting errors of various models:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - X_i)^2}{N}}. \tag{9}$$
(iii) Prediction Yield (PY). Prediction yields are evaluated at three different error categories (5%, 10%, and 15% errors) to assess the overall prediction results by judging the percent of predicted years within each allowed range of errors.
(iv) Pearson Correlation Coefficient (PC). The Pearson correlation coefficient measures the strength of linear association between actual and predicted values, where a value of 1 means a perfect positive correlation and a value of −1 means a perfect negative correlation:
$$\mathrm{PC} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}}, \tag{10}$$
where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $\bar{X}$ and $\bar{Y}$ are their corresponding means.
(v) Willmott Index of Agreement (WI). The Willmott index of agreement is a standardized measure of the degree of model prediction error. It varies between 0 and 1, with higher values indicating a better fit of the model for prediction:
$$\text{Index of agreement} = 1 - \frac{\sum_{i=1}^{N} \left| X_i - Y_i \right|^2}{\sum_{i=1}^{N} \left( \left| Y_i - \bar{X} \right| + \left| X_i - \bar{X} \right| \right)^2}. \tag{11}$$
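The five measures above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' code: the function name `forecast_metrics` and the `tolerances` argument are hypothetical, and the sketch assumes both series are expressed as percent of LPA so that the prediction-yield tolerances (5, 10, and 15) are in the same units as the errors.

```python
import numpy as np

def forecast_metrics(actual, predicted, tolerances=(5.0, 10.0, 15.0)):
    """Return MAE, RMSE, PY, PC and WI for two equal-length series."""
    x = np.asarray(actual, dtype=float)      # X: actual series
    y = np.asarray(predicted, dtype=float)   # Y: predicted series
    err = y - x

    mae = float(np.mean(np.abs(err)))        # Eq. (8)
    rmse = float(np.sqrt(np.mean(err ** 2))) # Eq. (9)

    # Prediction yield: percent of years whose absolute error is within
    # each allowed tolerance (assumed to share the units of the series).
    py = {tol: float(100.0 * np.mean(np.abs(err) <= tol)) for tol in tolerances}

    dx, dy = x - x.mean(), y - y.mean()
    pc = float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))  # Eq. (10)

    # Eq. (11): both deviations in the denominator are taken from the actual mean.
    wi = float(1.0 - np.sum(err ** 2)
               / np.sum((np.abs(y - x.mean()) + np.abs(dx)) ** 2))
    return {"MAE": mae, "RMSE": rmse, "PY": py, "PC": pc, "WI": wi}
```

For a made-up four-year series with actual values 100, 90, 110, 95 and forecasts 102, 84, 110, 107, the absolute errors are 2, 6, 0, and 12, so the MAE is 5.0 and the prediction yields at 5, 10, and 15 are 50%, 75%, and 100%.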
5 Experimental Results and Analysis
In this section, we present the evaluation of our proposed fuzzy clustering-based approach. We first present the results of fuzzy clustering of the monsoon years for different predictor sets. Forecasting skills are evaluated for all cluster models and the ensemble model in terms of mean absolute error for the test period 2001–2013. In addition, other measures, namely, root mean square error of prediction, correlation between predicted and actual rainfall, prediction yield, and agreement index between actual and predicted rainfall, are also estimated to establish the efficiency of our proposed approach to prediction of Indian summer monsoon rainfall.

Table 4: Cluster size (number of years) by fuzzy c-means clustering with α-cut of 0.3 over the period 1948–2013.

| Predictor set | Cluster1 | Cluster2 | Cluster3 |
|---|---|---|---|
| PredSet1 | 16 | 38 | 30 |
| PredSet2 | 30 | 17 | 40 |
| PredSet3 | 32 | 14 | 38 |
| PredSet4 | 42 | 31 | 21 |
| PredSet5 | 15 | 37 | 26 |
5.1. Clustering of Monsoon Years. Fuzzy clustering is performed over the period 1948–2013 to group the data into three clusters. We apply an α-cut with α = 0.3 to assign the data instances to the clusters. The value is ascertained empirically such that the distribution of elements within clusters is regular. A data instance can be assigned to more than one cluster simultaneously. The cluster sizes for the various predictor sets are shown in Table 4.
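The clustering step can be sketched as a plain fuzzy c-means loop followed by an α-cut. This is an illustrative implementation under stated assumptions, not the authors' code: the fuzzifier `m = 2`, the random initialization, the Euclidean distance, and the use of "membership ≥ α" as the cut rule are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def fuzzy_c_means(data, n_clusters=3, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means; returns (centers, memberships)."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    u = rng.random((data.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)          # each row of memberships sums to 1
    for _ in range(n_iter):
        um = u ** m
        # Cluster centers are membership-weighted means of the data.
        centers = (um.T @ data) / um.sum(axis=0)[:, None]
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)            # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u

def alpha_cut(memberships, alpha=0.3):
    """Boolean matrix: year i is assigned to cluster j if its membership >= alpha."""
    return np.asarray(memberships) >= alpha
```

With three clusters the memberships of a year sum to 1, so its largest membership is at least 1/3. An α-cut at 0.3 therefore assigns every year to at least one cluster while still allowing a year to fall into several, which is consistent with the row sums in Table 4 exceeding the 66 years of the study period.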
5.2. Prediction Accuracy. We predict annual rainfall for each of the five predictor sets (Table 2) separately using four models, namely, MR, MLP, RNN, and GRNN. The test period is 2001–2013.
5.2.1. Multiple Regression Model (MR). Multiple regression models are built for every cluster by ascertaining the optimal training period for each predictor set. The optimal training period is found by varying the number of training years and selecting the one with the least absolute prediction error during the validation period (1984–1993). Individual cluster-based as well as weighted ensemble models are considered for prediction. Table 5 gives the mean absolute error of the individual cluster-based and ensemble models for the test period 2001–2013. The ensemble model attains its lowest mean absolute error, 6.2%, for PredSet4 (Table 2). The ensemble model outperforms every single-cluster model for PredSet1 and PredSet4 and stays within 0.8 percentage points of the best single-cluster model for the other predictor sets. Figure 3 shows the interannual variability of actual and ensemble predicted rainfall as percent of long period average (LPA).
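The selection of the training period can be sketched as a small search over candidate lengths. The sketch below is an assumption-laden illustration rather than the authors' procedure: it fits ordinary least squares with an intercept, takes the training window to be the years immediately preceding the validation period, and uses the hypothetical candidate lengths 15, 20, 25, and 30 years. In the paper the models are built per cluster, which would correspond to passing only the years of one cluster.

```python
import numpy as np

def fit_mr(features, target):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

def predict_mr(coef, features):
    """Apply fitted coefficients (intercept first) to new predictor rows."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef

def select_training_length(years, features, target,
                           candidates=(15, 20, 25, 30),
                           val_start=1984, val_end=1993):
    """Return (length, MAE) of the candidate with the lowest validation MAE."""
    years = np.asarray(years)
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    val = (years >= val_start) & (years <= val_end)
    best_len, best_mae = None, np.inf
    for length in candidates:
        # Assumed window: the `length` years directly before validation.
        train = (years >= val_start - length) & (years < val_start)
        coef = fit_mr(features[train], target[train])
        mae = float(np.mean(np.abs(predict_mr(coef, features[val]) - target[val])))
        if mae < best_mae:
            best_len, best_mae = length, mae
    return best_len, best_mae
```

The model refit on the chosen window would then be evaluated on the separate test period, so the validation years only influence the choice of window length.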
5.2.2. Multilayer Perceptron Neural Network Model (MLP). The multilayer perceptron neural network model is designed with four different sets of parameters described in Table 3. Mean absolute errors of all cluster and ensemble models are shown in Table 6. The MLP model reports an error of 4.0% for PredSet4 (Table 2) with MLP parameters ParSet1 (Table 3). The actual rainfall and the rainfall predicted by the cluster models and the ensemble model are shown in Figure 4. The ensemble predicted rainfall closely follows the actual rainfall.
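The ensemble forecasts reported in this section combine the outputs of the three cluster models, with the fuzzy membership of the year in each cluster used as the weight. A minimal sketch follows; the function name is hypothetical, and normalizing by the sum of the weights is an assumption that only matters when the memberships passed in do not already sum to 1 (for example, after an α-cut has zeroed some of them).

```python
import numpy as np

def weighted_ensemble(cluster_forecasts, memberships):
    """Membership-weighted average of per-cluster forecasts for each year.

    Both arguments have shape (n_years, n_clusters).
    """
    f = np.asarray(cluster_forecasts, dtype=float)
    w = np.asarray(memberships, dtype=float)
    return np.sum(w * f, axis=1) / np.sum(w, axis=1)
```

For example, cluster forecasts of 90, 100, and 110 (percent of LPA) with memberships 0.2, 0.5, and 0.3 give an ensemble forecast of 0.2·90 + 0.5·100 + 0.3·110 = 101.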
5.2.3. Recurrent Neural Network Model (RNN). Mean absolute errors for prediction of annual rainfall by the recurrent neural network model for the test period 2001–2013 are presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. RNN assigns weights to the training years in decreasing order of their distance from the test year. The pattern of actual and ensemble predicted rainfall in terms of percentage of LPA is shown in Figure 5.

Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and ensemble model for test period 2001–2013. The minimum error is 6.2%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 9.4 | 9.3 | 10.9 | 8.6 |
| PredSet2 | 20 | 11.0 | 7.5 | 9.4 | 8.3 |
| PredSet3 | 15 | 10.9 | 6.5 | 9.2 | 6.7 |
| PredSet4 | 15 | 10.4 | 10.1 | 6.8 | 6.2 |
| PredSet5 | 15 | 7.6 | 8.5 | 8.4 | 7.9 |

Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and ensemble model for test period 2001–2013. The minimum error is 4.0%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet4 | 13.8 | 18.1 | 16.9 | 8.2 |
| PredSet2 | ParSet3 | 16.0 | 7.9 | 11.0 | 5.2 |
| PredSet3 | ParSet1 | 8.0 | 7.8 | 6.5 | 6.5 |
| PredSet4 | ParSet1 | 9.3 | 10.7 | 4.5 | 4.0 |
| PredSet5 | ParSet1 | 8.5 | 15.3 | 13.7 | 11.0 |

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models by MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years 2001–2013 against rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)
5.2.4. Generalized Regression Neural Network Model (GRNN). The mean absolute errors of the generalized regression neural network ensemble and individual cluster models are presented in Table 8. The model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variations of the rainfall forecast by the GRNN ensemble model along with the actual rainfall pattern in terms of percentage of LPA for the period 2001–2013. The predicted values are close to the actual rainfall patterns. Predictions by the models designed for individual clusters are shown by different symbols.

Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models by MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years 2001–2013 against rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)
Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and ensemble model for test period 2001–2013. The minimum error is 5.1%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet1 | 11.3 | 7.1 | 16.8 | 7.0 |
| PredSet2 | ParSet1 | 13.2 | 13.5 | 12.6 | 8.5 |
| PredSet3 | ParSet1 | 12.9 | 5.4 | 6.0 | 5.1 |
| PredSet4 | ParSet1 | 12.3 | 6.4 | 4.7 | 5.9 |
| PredSet5 | ParSet2 | 15.1 | 16.1 | 13.4 | 8.8 |
Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and ensemble model for test period 2001–2013. The minimum error is 6.1%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 10.0 | 7.6 | 7.6 | 6.4 |
| PredSet2 | 30 | 7.1 | 8.9 | 7.6 | 6.4 |
| PredSet3 | 20 | 5.8 | 9.2 | 6.0 | 6.1 |
| PredSet4 | 20 | 6.3 | 6.6 | 7.2 | 6.3 |
| PredSet5 | 25 | 7.1 | 9.4 | 11.9 | 6.6 |
Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models by RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years 2001–2013 against rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)
5.3. Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of accuracy measures other than mean absolute error. Table 9 shows different forecast verification statistics for the ensemble models during the test period 2001–2013. We summarize the observations below.
(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSE of 7.4% and 8.4%, respectively.
Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models by GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years 2001–2013 against rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)
(ii) Prediction Yield (PY). PY for the 5% error category of the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. They give prediction yields of 76%, 92%, 92%, and 84% for the allowed error of 10%. Finally, at the 15% error category, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, none of the predicted years shows an abrupt deviation from the corresponding actual rainfall pattern.
Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

| Verification measures | MR | MLP | RNN | GRNN |
|---|---|---|---|---|
| RMSE for forecast (%) | 8.4 | 5.3 | 6.4 | 7.4 |
| PY (%) at allowed error 5% | 46 | 69 | 53 | 46 |
| PY (%) at allowed error 10% | 76 | 92 | 92 | 84 |
| PY (%) at allowed error 15% | 92 | 100 | 92 | 100 |
| PC between actual and predicted rainfall | 0.61 | 0.81 | 0.71 | 0.49 |
| WI between actual and predicted rainfall | 0.71 | 0.89 | 0.81 | 0.62 |
Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) against the standard method using the same models without clustering (NC).

| Predictor set | MR Tot error (NC) (%) | MR Ensml error (WC) (%) | MLP Tot error (NC) (%) | MLP Ensml error (WC) (%) | RNN Tot error (NC) (%) | RNN Ensml error (WC) (%) | GRNN Tot error (NC) (%) | GRNN Ensml error (WC) (%) |
|---|---|---|---|---|---|---|---|---|
| PredSet1 | 8.9 | 8.6 | 10.0 | 8.2 | 11.7 | 7.0 | 6.9 | 6.4 |
| PredSet2 | 9.2 | 8.2 | 12.8 | 5.2 | 10.7 | 8.5 | 7.2 | 6.4 |
| PredSet3 | 7.4 | 6.7 | 6.7 | 6.5 | 6.2 | 5.1 | 6.1 | 6.1 |
| PredSet4 | 6.7 | 6.2 | 5.8 | 4.0 | 6.0 | 5.5 | 6.3 | 6.3 |
| PredSet5 | 8.2 | 7.9 | 9.7 | 11.0 | 8.9 | 8.8 | 9.0 | 6.7 |
(iii) Pearson Correlation (PC). PC of 0.61, 0.81, 0.71, and 0.49 is observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.
(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.
All of the mentioned statistical measures (Table 9), as well as the mean absolute error of monsoon prediction (Table 6), show the MLP model to be the best among the four proposed models.
5.4. Comparison of Results
5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD), namely, the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years from 1996 to 2002 is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.
Figure 7: Comparison of MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the period 1996–2002 [4, 5]. Striped bars represent errors of our proposed models. (Axes: different models against root mean square error as % of LPA.)

5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining all clusters' model outputs is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various model and predictor-set combinations are shown in Table 10. Clustering with ensembling lowers the error in 17 of the 20 combinations and matches it in two; the only exception is MLP with PredSet5.
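The comparison in Table 10 can be tallied directly. The short sketch below transcribes the table's mean absolute errors (in percent, rows ordered PredSet1 to PredSet5) and counts how often the clustered ensemble (WC) beats, ties with, or loses to the nonclustered model (NC); the names `TABLE_10` and `compare` are hypothetical.

```python
# Mean absolute errors (%) transcribed from Table 10 as (NC, WC) pairs,
# one pair per predictor set, PredSet1 through PredSet5.
TABLE_10 = {
    "MR":   [(8.9, 8.6), (9.2, 8.2), (7.4, 6.7), (6.7, 6.2), (8.2, 7.9)],
    "MLP":  [(10.0, 8.2), (12.8, 5.2), (6.7, 6.5), (5.8, 4.0), (9.7, 11.0)],
    "RNN":  [(11.7, 7.0), (10.7, 8.5), (6.2, 5.1), (6.0, 5.5), (8.9, 8.8)],
    "GRNN": [(6.9, 6.4), (7.2, 6.4), (6.1, 6.1), (6.3, 6.3), (9.0, 6.7)],
}

def compare(table):
    """Return ((improved, tied, worse), mean error reduction per model)."""
    improved = tied = worse = 0
    mean_reduction = {}
    for model, pairs in table.items():
        for nc, wc in pairs:
            if wc < nc:
                improved += 1
            elif wc == nc:
                tied += 1
            else:
                worse += 1
        # Positive values mean clustering lowered the error on average.
        mean_reduction[model] = sum(nc - wc for nc, wc in pairs) / len(pairs)
    return (improved, tied, worse), mean_reduction

counts, reduction = compare(TABLE_10)
print(counts)  # (17, 2, 1)
```

The average reduction is largest for MLP (about 2.0 percentage points) and RNN (about 1.7), and smaller for GRNN (about 0.7) and MR (about 0.6).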
Table 11: Physical climatic events under study.

| Climatic event | Number of years | Years associated with the event |
|---|---|---|
| Drought | 13 | 1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009 |
| Flood | 11 | 1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994 |
| El-Nino | 23 | 1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009 |
| La-Nina | 22 | 1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011 |
| Positive IOD | 12 | 1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007 |
| Negative IOD | 10 | 1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996 |
Table 12: Threshold of support and confidence measures for associating obtained clusters with physical climatic events.

| Predictor set | Support threshold | Confidence threshold |
|---|---|---|
| PredSet1 | 0.37 | 0.30 |
| PredSet2 | 0.25 | 0.46 |
| PredSet3 | 0.21 | 0.43 |
| PredSet4 | 0.29 | 0.61 |
| PredSet5 | 0.21 | 0.54 |
5.5. Prediction of the Year 2014. Annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of 7.0% for forecasting the rainfall of 2014.
6 Meteorological Analysis
Next, we try to interpret each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by some global climatic events. The climatic events studied during the period 1948 to 2013 (the period considered for clustering in our work) are El-Nino and La-Nina (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, as shown in Table 11.
Figure 8 shows the El-Nino and La-Nina years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Nino and La-Nina years are shown by color codes (light green and green) in the figure. The chart helps to visualize the cooccurrence of El-Nino and La-Nina events with extremes of ISMR.
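The ±10% rule used to label the years in Figure 8 can be written as a small helper. This is a sketch only: the function name is hypothetical, and treating a deviation of exactly 10% as a normal year is an assumption, since the text does not say which side the boundary falls on.

```python
def rainfall_category(percent_of_lpa, band=10.0):
    """Label a year as drought, normal, or excess from its rainfall in % of LPA."""
    deviation = percent_of_lpa - 100.0   # deviation from LPA in percentage points
    if deviation < -band:
        return "drought"
    if deviation > band:
        return "excess"
    return "normal"                      # boundary years are assumed to be normal
```

Under this rule the 2014 rainfall of 87.8% of LPA quoted in Section 5.5 corresponds to a deviation of −12.2% and falls in the drought category.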
6.1. Measuring Association between Climatic Events and ISMR. Support and confidence measures are considered to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.
Figure 8: El-Nino (light green) and La-Nina (green) years associated with drought (years more than 10% below LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years more than 10% above LPA rainfall) years during the period 1948–2013. (Axes: years against deviation of rainfall from LPA in %; legend: Normal years, El-Nino years, La-Nina years.)
(i) Support. Support is defined as the percentage of the total number of years in the cluster corresponding to the climatic event:
$$\text{Support} = \frac{x_{ce}}{N}, \tag{12}$$
where $x_{ce}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.
(ii) Confidence. Confidence is defined as the percentage of years associated with the climatic event in the cluster relative to the total number of such event years:
$$\text{Confidence} = \frac{x_{ce}}{T_{ce}}, \tag{13}$$
where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
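The two measures can be computed from sets of years. In the sketch below the drought years come from Table 11 and the thresholds from the PredSet1 row of Table 12, while the cluster membership is made up for illustration; the function names and the use of "greater than or equal" for attaining a threshold are assumptions.

```python
def support_confidence(cluster_years, event_years):
    """Return (support, confidence) of a climatic event for one cluster."""
    cluster, event = set(cluster_years), set(event_years)
    x_ce = len(cluster & event)          # event years that fall in the cluster
    support = x_ce / len(cluster)        # Eq. (12): share of the cluster
    confidence = x_ce / len(event)       # Eq. (13): share of all event years
    return support, confidence

def is_associated(support, confidence, support_thr, confidence_thr):
    """A cluster is linked to an event only if both thresholds are attained."""
    return support >= support_thr and confidence >= confidence_thr

# Drought years from Table 11.
DROUGHT_YEARS = [1951, 1965, 1966, 1968, 1972, 1974, 1979,
                 1982, 1986, 1987, 2002, 2004, 2009]
# Hypothetical cluster of eight years, for illustration only.
cluster_years = [1951, 1965, 1966, 1972, 1990, 1995, 2000, 2001]

s, c = support_confidence(cluster_years, DROUGHT_YEARS)
linked = is_associated(s, c, support_thr=0.37, confidence_thr=0.30)
```

Here four of the eight cluster years are drought years, so the support is 4/8 = 0.5 and the confidence is 4/13 ≈ 0.31; both exceed the PredSet1 thresholds, so this hypothetical cluster would be associated with drought.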
Figure 9: Histogram of the confidence and support measures as bins of year-count (a) before and (b) after thresholding for PredSet1. (Axes: confidence, support, and year-count.)
Table 13: Identified physical climatic events associated with the clusters obtained by fuzzy clustering.
Predictor: Cluster1 Cluster2 Cluster3
PredSet1: Drought El-Nino La-Nina La-Nina
PredSet2: Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3: El-Nino positive IOD Drought Drought El-Nino
PredSet4: La-Nina Flood La-Nina Drought
PredSet5: - Drought El-Nino Flood
We relate a cluster to a physical climatic event described in Table 11 if both support and confidence measures attain the corresponding thresholds. The thresholds are chosen in such a way that 50% of the years of study are under consideration. A low threshold compromises the importance of a climatic event being related to a particular cluster; on the other hand, if fewer years are taken, then the threshold values must be high, which in turn leaves out most of the clusters. Therefore, as a compromise between the extremes, 50% of the years are considered. Figure 9 shows histograms with confidence and support as bins of year-count before and after the thresholding process, respectively, for PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results establish the coexistence of La-Nina and flood events. They also shed light on the high probability of simultaneous occurrence of El-Nino, drought, and positive IOD events.
7 Conclusion
Monsoon is an important phenomenon for the economic development of an agricultural country like India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. The paper attempts to address this problem by clustering the years into similar groups; finally, a multimodel ensemble forecast is provided for Indian summer monsoon rainfall.
Different climatic parameters with their best correlated month values are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of belongingness in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also ascertain the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in the meteorological context, the clusters are linked with global climatic events.
In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work is supported by RBU project through RESPOND program of ISRO through KCSTC, IIT Kharagpur.
References
[1] H. F. Blanford, "On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India," Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.
[2] G. T. Walker, "Correlation in seasonal variations of weather-IV: a further study of world weather," Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.
[3] V. Thapliyal and S. M. Kulshrestha, "Recent models for long range forecasting of South-West monsoon rainfall in India," Mausam, vol. 43, no. 3, pp. 239–248, 1992.
[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, "A power regression model for long range forecast of southwest monsoon rainfall over India," Mausam, vol. 42, no. 2, pp. 125–130, 1991.
[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, "IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003," Current Science, vol. 86, no. 3, pp. 422–431, 2004.
[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, "New statistical models for long-range forecasting of southwest monsoon rainfall over India," Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.
[7] J. Schewe and A. Levermann, "A statistically predictive model for future monsoon failure in India," Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.
[8] Q. Wu, Y. Yan, and D. Chen, "A linear Markov model for East Asian monsoon seasonal forecast," Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.
[9] K. Fan, Y. Liu, and H. Chen, "Improving the prediction of the East Asian summer monsoon: new approaches," Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.
[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, "Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes," Journal of Hydrology, vol. 503, pp. 11–21, 2013.
[11] A. K. Sahai, M. K. Soman, and V. Satyan, "All India summer monsoon rainfall prediction using an artificial neural network," Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.
[12] W.-C. Hong, "Rainfall forecasting by technological machine learning models," Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.
[13] S. Chattopadhyay and G. Chattopadhyay, "Comparative study among different neural net learning algorithms applied to rainfall time series," Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.
[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, "Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India," Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.
[15] V. R. Durai and R. Bhardwaj, "Improving precipitation forecasts skill over India using a multi-model ensemble technique," Geofizika, vol. 30, no. 2, pp. 119–141, 2013.
[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, "Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994," Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.
[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., "The twentieth century reanalysis project," Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.
[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., "The NCEP/NCAR 40-year reanalysis project," Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.
[19] E. M. Rasmusson and T. H. Carpenter, "Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño," Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ClimatologyJournal of
EcologyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
EarthquakesJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom
Applied ampEnvironmentalSoil Science
Volume 2014
Mining
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
International Journal of
Geophysics
OceanographyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal ofPetroleum Engineering
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Atmospheric SciencesInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MineralogyInternational Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MeteorologyAdvances in
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Geological ResearchJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Geology Advances in
Advances in Meteorology 7
Table 5 Mean absolute errors () for annual Indian summer monsoon rainfall prediction by individual MR cluster models and ensemblemodel for test period 2001ndash2013 Reports minimum error of 62
Predictor set Training years Cluster1 error () Cluster2 error () Cluster3 error () Ensemble error ()PredSet1 20 94 93 109 86PredSet2 20 110 75 94 83PredSet3 15 109 65 92 67PredSet4 15 104 101 68 62PredSet5 15 76 85 84 79
Table 6 Mean absolute errors () for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and ensemblemodel for test period 2001ndash2013 Reports minimum error of 40
Predictor set Parameter set Cluster1 error () Cluster2 error () Cluster3 error () Ensemble error ()PredSet1 ParSet4 138 181 169 82PredSet2 ParSet3 160 79 110 52PredSet3 ParSet1 80 78 65 65PredSet4 ParSet1 93 107 45 40PredSet5 ParSet1 85 153 137 110
ActualEnsembleClus1 model
Clus2 modelClus3 model
140
120
100
80
60
40
20
0
Rain
fall
as
of L
PA
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
Test years
Figure 3 Performance of forecasts by proposed fuzzy clustering-based ensemble model and its respective three clusters modelsby MR for PredSet4 The deep and light purple bars represent theactual and predicted ISMR in terms of percent of LPA The symbolsrepresent forecasts given by individual cluster models The resultsare shown for test period 2001ndash2013
presented inTable 7PredSet3 (Table 2)withRNN parametersParSet1 (Table 3) gives error of 51 RNN gives weightsin decreasing order of their distance from test year to thetraining years The pattern of actual and ensemble predictedrainfall in terms of percentage of LPA is shown in Figure 5
524 Generalized Regression Neural Network Model (GRNN)Generalized regression neural network ensemble and
ActualEnsembleClus1 model
Clus2 modelClus3 model
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
Test years
120
100
80
60
40
20
0
Rain
fall
as
of L
PA
Figure 4 Performance of forecasts by proposed fuzzy clustering-based ensemble model and its respective three clusters models byMLP for PredSet4 The deep and light purple bars represent theactual and predicted ISMR in terms of percent of LPA The symbolsrepresent forecasts given by individual cluster models The resultsare shown for test period 2001ndash2013
individual cluster modelsrsquo errors in terms of mean absoluteerrors are presented in Table 8 The model reports anerror of 61 for PredSet3 (Table 2) Figure 6 shows theinterannual variations of ensemble forecast of rainfall byGRNN ensemble model along with actual rainfall patternin terms of percentage of LPA for period 2001ndash2013 It isobserved that the predicted values are close to actual rainfallpatterns Prediction by models designed for clusters is shownby different symbols
8 Advances in Meteorology
Table 7 Mean absolute errors () for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and ensemblemodel for test period 2001ndash2013 Reports minimum error of 51
Predictor set Parameter set Cluster1 error () Cluster2 error () Cluster3 error () Ensemble error ()PredSet1 ParSet1 113 71 168 70PredSet2 ParSet1 132 135 126 85PredSet3 ParSet1 129 54 60 51PredSet4 ParSet1 123 64 47 59PredSet5 ParSet2 151 161 134 88
Table 8 Mean absolute errors () for annual Indian summermonsoon rainfall prediction by individual GRNN cluster models and ensemblemodel for test period 2001ndash2013 Reports minimum error of 61
Predictor set Training years Cluster1 error () Cluster2 error () Cluster3 error () Ensemble error ()PredSet1 20 100 76 76 64PredSet2 30 71 89 76 64PredSet3 20 58 92 60 61PredSet4 20 63 66 72 63PredSet5 25 71 94 119 66
ActualEnsembleClus1 model
Clus2 modelClus3 model
120
100
80
60
40
20
0
Rain
fall
as
of L
PA
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
Test years
Figure 5 Performance of forecasts by proposed fuzzy clustering-based ensemble model and its respective three clusters models byRNN for PredSet3 The deep and light purple bars represent theactual and predicted ISMR in terms of percent of LPA The symbolsrepresent forecasts given by individual cluster models The resultsare shown for test period 2001ndash2013
53 Statistical Measures for Validation of Proposed ApproachNext we validate the models in terms of other accuracymeasures besidesmean absolute error Table 9 shows differentforecast verification statistics for ensemblemodels during testperiod 2001ndash2013 We summarize the observations below
(i) Root Mean Square Error (RMSE) MLP ensemblemodel gives RMSE of 53 followed by RNN ensem-ble model with 64 GRNN and MR models giveRMSE of 74 and 84 respectively
ActualEnsembleClus1 model
Clus2 modelClus3 model
120
100
80
60
40
20
0
Rain
fall
as
of L
PA
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
Test years
Figure 6 Performance of forecasts by proposed fuzzy clustering-based ensemble model and its respective three clusters models byGRNN for PredSet3 The deep and light purple bars represent theactual and predicted ISMR in terms of percent of LPA The symbolsrepresent forecasts given by individual cluster models The resultsare shown for test period 2001ndash2013
(ii) Prediction Yield (PY) PY for 5 error category ofMRMLP RNN andGRNN ensemblemodels is 4669 53 and 46 respectivelyThey give predictionyield of 76 92 92 and 84 for allowed errorof 10 category Finally at error category of 15MRMLP RNN andGRNN ensemblemodels give yield of92 100 92 and 100 respectively Thus noneof the predicted years show abrupt deviation fromcorresponding actual rainfall pattern
Advances in Meteorology 9
Table 9 Prediction evaluation statistics for ensemble models during test period 2001ndash2013 (Section 45)
Verification measures MR MLP RNN GRNNRMSE for forecast () 84 53 64 74PY () at allowed error 5 46 69 53 46PY () at allowed error 10 76 92 92 84PY () at allowed error 15 92 100 92 100PC between actual and predicted rainfall 061 081 071 049WI between actual and predicted rainfall 071 089 081 062
Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) against the standard method using the same models without clustering (NC). NC columns give the total error (%); WC columns give the ensemble error (%).
Predictor set   MR NC   MR WC   MLP NC   MLP WC   RNN NC   RNN WC   GRNN NC   GRNN WC
PredSet1        8.9     8.6     10.0     8.2      11.7     7.0      6.9       6.4
PredSet2        9.2     8.2     12.8     5.2      10.7     8.5      7.2       6.4
PredSet3        7.4     6.7     6.7      6.5      6.2      5.1      6.1       6.1
PredSet4        6.7     6.2     5.8      4.0      6.0      5.5      6.3       6.3
PredSet5        8.2     7.9     9.7      11.0     8.9      8.8      9.0       6.7
(iii) Pearson Correlation (PC). PCs of 0.61, 0.81, 0.71, and 0.49 are observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.
(iv) Willmott Index of Agreement (WI). The WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.
All of the statistical measures mentioned (Table 9), as well as the mean absolute error (Table 6) in prediction of monsoon, establish the MLP model as the best among the four proposed models.
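The four verification measures listed above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes rainfall is expressed as a percentage of LPA, that prediction yield is the share of test years whose absolute error lies within the allowed error, and that WI is Willmott's standard index of agreement. The function names and the five-year sample series are hypothetical and are not the paper's data.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length series."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)

def prediction_yield(actual, predicted, allowed_error):
    """Percentage of years whose absolute error is within allowed_error."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(p - a) <= allowed_error)
    return 100.0 * hits / len(actual)

def pearson_correlation(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def willmott_index(actual, predicted):
    """Willmott index of agreement: 1 means perfect agreement."""
    mean_a = sum(actual) / len(actual)
    num = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    den = sum((abs(p - mean_a) + abs(a - mean_a)) ** 2
              for a, p in zip(actual, predicted))
    return 1.0 - num / den

# Hypothetical rainfall series in % of LPA (illustrative values only).
actual = [92.0, 81.0, 102.0, 87.0, 99.0]
predicted = [95.0, 88.0, 100.0, 90.0, 97.0]

print(round(rmse(actual, predicted), 2))
print(prediction_yield(actual, predicted, 5.0))
print(round(pearson_correlation(actual, predicted), 2))
print(round(willmott_index(actual, predicted), 2))
```

Under these assumptions, a lower RMSE and a higher PY, PC, and WI all indicate a better forecast, which is how the rows of Table 9 are read.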
5.4 Comparison of Results
5.4.1 Comparison with State-of-the-Art Methods
The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD), namely, the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, from 1996 to 2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.
Figure 7: Comparison of the MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the period 1996–2002 [4, 5]. Striped bars represent errors by our proposed models. (Plot: x-axis, different models: 16-par, 8-par, 10-par, MR, MLP, RNN, GRNN; y-axis, root mean square error as % of LPA, 0–12.)
5.4.2 Improvement of Cluster-Based Models over Conventional Models
The ensemble model error obtained by combining the outputs of all cluster models is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various model and predictor set combinations are shown in Table 10. The results show that clustering with ensembling lowers the error relative to the nonclustered conventional method in most combinations; in Table 10 the only increase is for MLP with PredSet5, and GRNN is unchanged for PredSet3 and PredSet4.
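The comparison in Table 10 can be tallied directly. The sketch below is only an illustration: the (NC, WC) pairs are transcribed from Table 10 as read here, and the names `TABLE_10`, `tally`, and `mean_reduction` are hypothetical helpers rather than anything defined in the paper.

```python
# (NC, WC) mean absolute errors in % of LPA, transcribed from Table 10.
TABLE_10 = {
    "MR": {"PredSet1": (8.9, 8.6), "PredSet2": (9.2, 8.2), "PredSet3": (7.4, 6.7),
           "PredSet4": (6.7, 6.2), "PredSet5": (8.2, 7.9)},
    "MLP": {"PredSet1": (10.0, 8.2), "PredSet2": (12.8, 5.2), "PredSet3": (6.7, 6.5),
            "PredSet4": (5.8, 4.0), "PredSet5": (9.7, 11.0)},
    "RNN": {"PredSet1": (11.7, 7.0), "PredSet2": (10.7, 8.5), "PredSet3": (6.2, 5.1),
            "PredSet4": (6.0, 5.5), "PredSet5": (8.9, 8.8)},
    "GRNN": {"PredSet1": (6.9, 6.4), "PredSet2": (7.2, 6.4), "PredSet3": (6.1, 6.1),
             "PredSet4": (6.3, 6.3), "PredSet5": (9.0, 6.7)},
}

def tally(table):
    """Count combinations where clustering improved, tied, or worsened the error."""
    improved = tied = worse = 0
    for per_set in table.values():
        for nc, wc in per_set.values():
            if wc < nc:
                improved += 1
            elif wc == nc:
                tied += 1
            else:
                worse += 1
    return improved, tied, worse

def mean_reduction(table, model):
    """Average error reduction (NC minus WC) over the predictor sets of one model."""
    pairs = table[model].values()
    return sum(nc - wc for nc, wc in pairs) / len(pairs)

print(tally(TABLE_10))
for name in TABLE_10:
    print(name, round(mean_reduction(TABLE_10, name), 2))
```

Averaging over predictor sets, as `mean_reduction` does, is one simple way to summarize the gain per model; the paper itself reports only the individual entries.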
Table 11: Physical climatic events under study.
Climatic event   Number of years   Years associated with the event
Drought          13                1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009
Flood            11                1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994
El-Nino          23                1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009
La-Nina          22                1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011
Positive IOD     12                1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007
Negative IOD     10                1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996
Table 12: Thresholds of the support and confidence measures for associating the obtained clusters with physical climatic events.
Predictor set   Support threshold   Confidence threshold
PredSet1        0.37                0.30
PredSet2        0.25                0.46
PredSet3        0.21                0.43
PredSet4        0.29                0.61
PredSet5        0.21                0.54
5.5 Prediction of the Year 2014
The annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of 7.0% in forecasting the rainfall of 2014.
6 Meteorological Analysis
Next, we interpret each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by certain global climatic events. The climatic events considered and studied during the period 1948 to 2013 (the period considered for clustering in our work) are El-Nino and La-Nina (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, as shown in Table 11.
Figure 8 shows the El-Nino and La-Nina years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Nino and La-Nina years are shown by color codes (light green and green) in the figure. The chart helps to visualize the co-occurrence of El-Nino and La-Nina events with extremes of ISMR.
6.1 Measuring Association between Climatic Events and ISMR
Support and confidence measures are used to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.
Figure 8: El-Nino (light green) and La-Nina (green) years in association with drought (years more than 10% below LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years more than 10% above LPA rainfall) years during the period 1948–2013. (Plot: x-axis, years 1945–2015; y-axis, deviation of rainfall as % of LPA, −25 to 25; legend: Normal years, El-Nino years, La-Nina years.)
(i) Support. Support is defined as the proportion of the total number of years in the cluster that correspond to the climatic event:
\[ \mathrm{Support} = \frac{x_{\mathrm{ce}}}{N}, \tag{12} \]
where $x_{\mathrm{ce}}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.
(ii) Confidence. Confidence is defined as the proportion of years associated with the climatic event in the cluster to the total number of such event years:
\[ \mathrm{Confidence} = \frac{x_{\mathrm{ce}}}{T_{\mathrm{ce}}}, \tag{13} \]
where $T_{\mathrm{ce}}$ is the number of years associated with the climatic event during the period 1948–2013.
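Equations (12) and (13) reduce to two set intersections. The following sketch assumes each cluster is represented as a plain set of years; the drought and flood years are taken from Table 11, while the `cluster` membership and the function names are hypothetical and chosen only for illustration.

```python
# Event years from Table 11 (only two events shown to keep the example short).
EVENT_YEARS = {
    "Drought": {1951, 1965, 1966, 1968, 1972, 1974, 1979,
                1982, 1986, 1987, 2002, 2004, 2009},
    "Flood": {1953, 1956, 1958, 1959, 1961, 1964,
              1970, 1975, 1983, 1988, 1994},
}

def support(cluster_years, event_years):
    """Equation (12): x_ce / N, with N the number of years in the cluster."""
    return len(cluster_years & event_years) / len(cluster_years)

def confidence(cluster_years, event_years):
    """Equation (13): x_ce / T_ce, with T_ce the number of event years."""
    return len(cluster_years & event_years) / len(event_years)

# Hypothetical cluster of ten years (not a cluster reported in the paper).
cluster = {1951, 1965, 1972, 1982, 1987, 1990, 2001, 2002, 2009, 2012}

for event, years in EVENT_YEARS.items():
    print(event, round(support(cluster, years), 2), round(confidence(cluster, years), 2))
```

For this hypothetical cluster, seven of its ten years are drought years, so the drought support is 7/10 and the drought confidence is 7/13, while both measures are zero for flood.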
Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1. (Plots: panels (a) and (b); axes: Confidence, Support, and Year-count.)
Table 13: Physical climatic events identified as associated with the clusters obtained by fuzzy clustering.
Predictor Cluster1 Cluster2 Cluster3
PredSet1 Drought El-Nino La-Nina La-Nina
PredSet2 Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3 El-Nino positive IOD Drought Drought El-Nino
PredSet4 La-Nina Flood La-Nina Drought
PredSet5 - Drought El-Nino Flood
We relate a cluster to a physical climatic event described in Table 11 if both the support and confidence measures attain the corresponding thresholds. The thresholds are chosen so that 50% of the years of study remain under consideration. A low threshold weakens the significance of relating a climatic event to a particular cluster; on the other hand, if even fewer years are retained, the threshold values must be high, which in turn leaves out most of the clusters. Therefore, as a compromise between the extremes, 50% of the years are considered. Figure 9 shows histograms of year-count binned by confidence and support before and after the thresholding process, respectively, for predictor set PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with the physical climatic events that satisfy both the support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results establish the coexistence of La-Nina and flood events. They also shed light on the high probability of El-Nino, drought, and positive IOD events occurring simultaneously.
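The association rule described above, where an event is linked to a cluster only when both measures attain their thresholds, can be sketched as follows. The thresholds are the PredSet1 values from Table 12; the per-event (support, confidence) pairs and the function name `associate` are hypothetical and serve only to show the rule in action.

```python
# PredSet1 thresholds from Table 12.
SUPPORT_THRESHOLD = 0.37
CONFIDENCE_THRESHOLD = 0.30

def associate(measures, support_threshold, confidence_threshold):
    """Return the events whose support and confidence both attain the thresholds."""
    return [
        event
        for event, (sup, conf) in measures.items()
        if sup >= support_threshold and conf >= confidence_threshold
    ]

# Hypothetical (support, confidence) values for one cluster.
cluster_measures = {
    "Drought": (0.40, 0.54),   # passes both thresholds
    "Flood": (0.10, 0.09),     # fails both
    "El-Nino": (0.45, 0.26),   # passes support, fails confidence
    "La-Nina": (0.38, 0.31),   # passes both
}

print(associate(cluster_measures, SUPPORT_THRESHOLD, CONFIDENCE_THRESHOLD))
```

Requiring both conditions means an event that is frequent within a cluster but mostly occurs in other clusters (high support, low confidence) is not attributed to it, as with the El-Nino entry in this example.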
7 Conclusion
Monsoon is an important phenomenon for the economic development of an agricultural land like India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. The paper attempts to address this problem by clustering the years into similar groups and then providing a multimodel ensemble forecast for Indian summer monsoon rainfall.
Different climatic parameters with their best correlated month values are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of the year in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also establish the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in the meteorological context, the clusters are linked with global climatic events.
In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work is supported by the RBU project through the RESPOND program of ISRO through KCSTC, IIT Kharagpur.
References
[1] H. F. Blanford, "On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India," Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.
[2] G. T. Walker, "Correlation in seasonal variations of weather, IV: a further study of world weather," Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.
[3] V. Thapliyal and S. M. Kulshrestha, "Recent models for long range forecasting of South-West monsoon rainfall in India," Mausam, vol. 43, no. 3, pp. 239–248, 1992.
[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, "A power regression model for long range forecast of southwest monsoon rainfall over India," Mausam, vol. 42, no. 2, pp. 125–130, 1991.
[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, "IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003," Current Science, vol. 86, no. 3, pp. 422–431, 2004.
[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, "New statistical models for long-range forecasting of southwest monsoon rainfall over India," Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.
[7] J. Schewe and A. Levermann, "A statistically predictive model for future monsoon failure in India," Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.
[8] Q. Wu, Y. Yan, and D. Chen, "A linear Markov model for East Asian monsoon seasonal forecast," Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.
[9] K. Fan, Y. Liu, and H. Chen, "Improving the prediction of the East Asian summer monsoon: new approaches," Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.
[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, "Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes," Journal of Hydrology, vol. 503, pp. 11–21, 2013.
[11] A. K. Sahai, M. K. Soman, and V. Satyan, "All India summer monsoon rainfall prediction using an artificial neural network," Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.
[12] W.-C. Hong, "Rainfall forecasting by technological machine learning models," Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.
[13] S. Chattopadhyay and G. Chattopadhyay, "Comparative study among different neural net learning algorithms applied to rainfall time series," Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.
[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, "Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India," Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.
[15] V. R. Durai and R. Bhardwaj, "Improving precipitation forecasts skill over India using a multi-model ensemble technique," Geofizika, vol. 30, no. 2, pp. 119–141, 2013.
[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, "Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994," Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.
[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., "The twentieth century reanalysis project," Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.
[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., "The NCEP/NCAR 40-year reanalysis project," Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.
[19] E. M. Rasmusson and T. H. Carpenter, "Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño," Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.
ActualEnsembleClus1 model
Clus2 modelClus3 model
120
100
80
60
40
20
0
Rain
fall
as
of L
PA
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
Test years
Figure 5 Performance of forecasts by proposed fuzzy clustering-based ensemble model and its respective three clusters models byRNN for PredSet3 The deep and light purple bars represent theactual and predicted ISMR in terms of percent of LPA The symbolsrepresent forecasts given by individual cluster models The resultsare shown for test period 2001ndash2013
53 Statistical Measures for Validation of Proposed ApproachNext we validate the models in terms of other accuracymeasures besidesmean absolute error Table 9 shows differentforecast verification statistics for ensemblemodels during testperiod 2001ndash2013 We summarize the observations below
(i) Root Mean Square Error (RMSE) MLP ensemblemodel gives RMSE of 53 followed by RNN ensem-ble model with 64 GRNN and MR models giveRMSE of 74 and 84 respectively
ActualEnsembleClus1 model
Clus2 modelClus3 model
120
100
80
60
40
20
0
Rain
fall
as
of L
PA
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
Test years
Figure 6 Performance of forecasts by proposed fuzzy clustering-based ensemble model and its respective three clusters models byGRNN for PredSet3 The deep and light purple bars represent theactual and predicted ISMR in terms of percent of LPA The symbolsrepresent forecasts given by individual cluster models The resultsare shown for test period 2001ndash2013
(ii) Prediction Yield (PY) PY for 5 error category ofMRMLP RNN andGRNN ensemblemodels is 4669 53 and 46 respectivelyThey give predictionyield of 76 92 92 and 84 for allowed errorof 10 category Finally at error category of 15MRMLP RNN andGRNN ensemblemodels give yield of92 100 92 and 100 respectively Thus noneof the predicted years show abrupt deviation fromcorresponding actual rainfall pattern
Advances in Meteorology 9
Table 9 Prediction evaluation statistics for ensemble models during test period 2001ndash2013 (Section 45)
Verification measures MR MLP RNN GRNNRMSE for forecast () 84 53 64 74PY () at allowed error 5 46 69 53 46PY () at allowed error 10 76 92 92 84PY () at allowed error 15 92 100 92 100PC between actual and predicted rainfall 061 081 071 049WI between actual and predicted rainfall 071 089 081 062
Table 10 Comparison of absolute errors for rainfall prediction by proposed ensemble models (Ensml) with clustering (WC) approach tostandard method with same models without clustering (NC) approach
Predictor setMR MLP RNN GRNN
Tot error(NC) ()
Ensml error(WC) ()
Tot error(NC) ()
Ensml error(WC) ()
Tot error(NC) ()
Ensml error(WC) ()
Tot error(NC) ()
Ensml error(WC) ()
PredSet1 89 86 100 82 117 70 69 64PredSet2 92 82 128 52 107 85 72 64PredSet3 74 67 67 65 62 51 61 61PredSet4 67 62 58 40 60 55 63 63PredSet5 82 79 97 110 89 88 90 67
(iii) Pearson Correlation (PC) PC of 061 081 071and 049 is observed for prediction by MR MLPRNN and GRNN ensemble models respectively Itis noticed that predicted rainfall by MLP ensemblemodel is highly correlated to actual values whilecorrelation for GRNN forecast is least
(iv) Willmott Index of Agreement (WI)WI forMRMLPRNN and GRNN ensemble models is 071 089081 and 062 respectively The index shows that theagreement between actual and predicted rainfall ishigh forMLP and RNN ensemble models
All of the mentioned statistical measures (Table 9) as wellas mean absolute error (Table 6) in prediction of monsoonascertainMLPmodel to be the best among all four proposedmodels
54 Comparison of Results
541 Comparison with State-of-the-Art Methods Proposedfuzzy clustering-based ensemble prediction models are com-paredwith themodels used by IndianMeteorologicalDepart-ment (IMD) It is comparedwith existing 16-parameter powerregression model [4] and Rajeevan et al [5] 8- and 10-parameter models Test period of seven years from 1996 to2002 is considered IMDmodels give rootmean square errorsof 108 76 and 64 respectively The MR MLP RNNandGRNN ensemblemodels give 60 34 44 and 55rootmean square errors respectively outperforming all threeIMDmodelsThe results are shown as a bar graph in Figure 7
542 Improvement of Cluster-BasedModels over ConventionalModels Ensemble model error obtained by combining allclustersrsquo model output is compared with error obtained by
IMD 16-paramIMD 8-paramIMD 10-paramMR
MLPRNNGRNN
16-par 8-par 10-par MR MLP RNN GRNNDifferent models
12
10
8
6
4
2
0
Root
mea
n sq
uare
erro
r as
of L
PA
Figure 7 Comparison of MR (grey) MLP (purple) RNN (lightpurple) and GRNN (deep purple) models with IMD existing 16-param (deep blue) 10-param (blue) and 8-param (light blue)models for time period of 1996ndash2002 [4 5] Striped bars representerrors by our proposed models
same model (parameter) trained on the whole dataset with-out clustering The mean absolute error for various modelsand predictor sets combinations are shown in Table 10 Theresult clearly depicts the improvement in prediction by clus-tering and ensemble method over nonclustered conventionalmethod
10 Advances in Meteorology
Table 11 Physical climatic events under study
Climatic event Numberof years Years associated with the event
Drought 13 1951 1965 1966 1968 1972 1974 1979 1982 1986 1987 2002 2004 2009Flood 11 1953 1956 1958 1959 1961 1964 1970 1975 1983 1988 1994
El-Nino 23 1951 1953 1957 1958 1963 1965 1966 1968 1969 1972 1977 1982 1983 1986 1987 1991 1992 1994 19972002 2004 2006 2009
La-Nina 22 1950 1954 1955 1956 1964 1970 1971 1973 1974 1975 1984 1985 1988 1989 1995 1998 1999 2000 20072008 2010 2011
Positive IOD 12 1957 1961 1963 1967 1972 1977 1982 1983 1994 1997 2006 2007Negative IOD 10 1958 1960 1964 1971 1974 1975 1989 1992 1993 1996
Table 12 Threshold of support and confidence measures forassociating obtained clusters with physical climatic events
Predictor set Support threshold Confidence thresholdPredSet1 037 030PredSet2 025 046PredSet3 021 043PredSet4 029 061PredSet5 021 054
55 Prediction of the Year 2014 Annual Indian summermonsoon rainfall for the year of 2014 is 7817mm which is878 of LPA value Proposed clustering-based ensembleMRMLP RNN and GRNN models predict rainfall of 2014 as961 803 800 and 953 of LPA respectively Thusproposed models show absolute error of 70 for forecastingrainfall of 2014
6 Meteorological Analysis
Next we try to visualize each cluster in terms of physicalclimatic events The clusters obtained by fuzzy clusteringare physically interpreted as being characterized by someglobal climatic events The climatic events considered andstudied during the time period 1948 to 2013 (period con-sidered for clustering in our work) are El-Nino La-Nina(httpggweathercomensoonihtm) positive and nega-tive Indian ocean dipole (httpbomgovauclimateIOD)drought and flood shown in Table 11
Figure 8 shows the El-Nino and La-Nina years associatedwith drought normal and excess rainfall years during 1948ndash2013 The years having rainfall 10 above LPA are excessrainfall years and years having rainfall 10 below LPA aredrought years The El-Nino and La-Nina years are shown bycolor codes (light green and green) in the figure The charthelps to visualize the cooccurrence of El-Nino and La-Ninaevents with extremities of ISMR
61Measuring Association betweenClimatic Events and ISMRSupport and confidence measures are considered to relatephysical climatic event to the clusters generated by fuzzyclustering They are defined below
25
20
15
10
5
0
minus5
minus10
minus15
minus20
minus25
Dev
iatio
n of
rain
fall
from
o
f LPA
1945
1950
1955
1960
1965
1970
1975
1980
1985
1990
1995
2000
2005
2010
2015
Years
Normal years
El-Nino yearsLa-Nina years
Figure 8 El-Nino (light green) and La-Nina (green) years associa-tion with drought (years below 10 of LPA rainfall) normal (yearsbetween +10 and minus10 of LPA rainfall) and excess (years above10 of LPA rainfall) years during period 1948ndash2013
(i) Support Support is defined as percentage of totalnumber of years in the cluster corresponding to theclimatic event
Support =119909ce119873
(12)
where119909ce denotes the number of years associatedwitha specific climatic event in the cluster and 119873 is thetotal count of years in the cluster
(ii) Confidence Confidence is defined as percentage ofyears associated with the climatic event in the clusterto the total number of such event years
Confidence =119909ce119879ce
(13)
where 119879ce is the number of years associated with theclimatic event during the period 1948ndash2013
Advances in Meteorology 11
14
12
10
8
6
4
2
060
5040
3020
100 0
2040
6080
100
ConfidenceSupport
Year
-cou
nt
50400
302022
10 2040
60
ConfidenSupp
(a)
14
12
10
8
6
4
2
060
5040
3020
100 0
2040
6080
100
ConfidenceSupport
Year
-cou
nt
5040
3020
10 2040
60
C nfidencSupport
(b)
Figure 9 Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1
Table 13 Identified physical climatic events being associated with clusters obtained by fuzzy clustering
Predictor Cluster1 Cluster2 Cluster3PredSet1 Drought El-Nino La-Nina La-NinaPredSet2 Flood La-Nina Drought Drought El-Nino La-NinaPredSet3 El-Nino positive IOD Drought Drought El-NinoPredSet4 La-Nina Flood La-Nina DroughtPredSet5 mdash Drought El-Nino Flood
We relate a cluster to a physical climatic event describedin Table 11 if both support and confidence measures attainthe corresponding thresholds The thresholds are chosen in away that 50 of years of study are under consideration A lowthreshold compromises the importance of a climatic eventbeing related to a particular cluster on the other hand if evenless number of years are taken then threshold values shouldbe high which in turn will leave out most of the clustersTherefore as an optimal between the extremes 50 of yearsare considered Figure 9 shows histograms with confidenceand support as bins of year-count for cases before andafter threshold process respectively for predictors PredSet1(Table 2) The threshold values obtained for predictor setsare presented in Table 12 For each predictor set we associatethe clusters with physical climatic events if they satisfy bothsupport and confidence thresholds The climatic events cor-responding to cluster are shown in Table 13 Results establishcoexistence of events of La-Nina and flood It also puts lighton high probability of occurrence of El-Nino drought andpositive IOD events simultaneously
7 Conclusion
Monsoon is an important phenomenon for economic devel-opment of agricultural-land like India Large variability ofmonsoon over years makes prediction of rainfall a challeng-ing task The paper attempts to address this problem by clus-tering the years into similar groups and finally multimodel
ensemble forecast is provided for Indian summer monsoonrainfall
Different climatic parameters with best correlated monthvalue are identified and five different predictor sets are builtfor prediction of Indian monsoon Four different modelsnamely MR MLP RNN and GRNN are designed foreach cluster exclusively The final forecast is provided byweighted ensemble of forecasts by each clusterrsquos model whereweight is considered as fuzzy membership of belonging-ness in each cluster Multilayer perceptron ensemble modelprovides mean absolute error of 40 for prediction ofannual rainfall which is appreciable for forecasting complexmonsoon process Proposed fuzzy clustering-based ensembleapproach surpasses the conventional approach Performanceof proposed clustering-based ensemble models is superior toexisting IMDrsquosmodels [4 5]The error statistics also ascertainthe superiority of multilayer perceptron model over otherthree proposed models Lastly in meteorological context theclusters are linked with global climatic events
In the future large number of climatic parametersinfluencing Indian monsoon can be explored and differentpredictor set can be used for different clusters of years toprovide even better forecasting accuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
12 Advances in Meteorology
Acknowledgment
This work is supported by RBU project through RESPONDprogram of ISRO through KCSTC IIT Kharagpur
References
[1] H F Blanford ldquoOn the connexion of the Himalaya snowfallwith dry winds and seasons of drought in Indiardquo Proceedingsof the Royal Society of London vol 37 no 232ndash234 pp 3ndash221884
[2] G T Walker ldquoCorrelation in seasonal variations of weathermdashIV a further study of world weatherrdquo Memoirs of the IndiaMeteorological Department vol 24 pp 275ndash332 1924
[3] V Thapliyal and S M Kulshrestha ldquoRecent models for longrange forecasting of South-West monsoon rainfall in IndiardquoMausam vol 43 no 3 pp 239ndash248 1992
[4] V Gowariker V Thapliyal S M Kulshrestha G S MandalN Sen Roy and D R Sikka ldquoA power regression model forlong range forecast of southwest monsoon rainfall over IndiardquoMausam vol 42 no 2 pp 125ndash130 1991
[5] M Rajeevan D S Pai S K Dikshit and R R Kelkar ldquoIMDrsquosnew operational models for long-range forecast of southwestmonsoon rainfall over India and their verification for 2003rdquoCurrent Science vol 86 no 3 pp 422ndash431 2004
[6] M Rajeevan D S Pai R A Kumar and B Lal ldquoNew statisticalmodels for long-range forecasting of southwest monsoon rain-fall over Indiardquo Climate Dynamics vol 28 no 7-8 pp 813ndash8282007
[7] J Schewe and A Levermann ldquoA statistically predictive modelfor future monsoon failure in Indiardquo Environmental ResearchLetters vol 7 no 4 Article ID 044023 2012
[8] Q Wu Y Yan and D Chen ldquoA linear markov model for eastasian monsoon seasonal forecastrdquo Journal of Climate vol 26no 14 pp 5183ndash5195 2013
[9] K Fan Y Liu and H Chen ldquoImproving the prediction ofthe east asian summer monsoon new approachesrdquo Weather ampForecasting vol 27 no 4 pp 1017ndash1030 2012
[10] F Mekanik M A Imteaz S Gato-Trinidad and A ElmahdildquoMultiple regression and artificial neural network for long-termrainfall forecasting using large scale climate modesrdquo Journal ofHydrology vol 503 pp 11ndash21 2013
[11] A K Sahai M K Soman and V Satyan ldquoAll India summermonsoon rainfall prediction using an artificial neural networkrdquoClimate Dynamics vol 16 no 4 pp 291ndash302 2000
[12] W-C Hong ldquoRainfall forecasting by technological machinelearning modelsrdquo Applied Mathematics and Computation vol200 no 1 pp 41ndash57 2008
[13] S Chattopadhyay and G Chattopadhyay ldquoComparative studyamong different neural net learning algorithms applied torainfall time seriesrdquo Meteorological Applications vol 15 no 2pp 273ndash280 2008
[14] N Acharya S C Kar M A Kulkarni U C Mohanty andL N Sahoo ldquoMulti-model ensemble schemes for predictingnortheast monsoon rainfall over peninsular Indiardquo Journal ofEarth System Science vol 120 no 5 pp 795ndash805 2011
[15] V R Durai and R Bhardwaj ldquoImproving precipitation forecastsskill over India using a multi-model ensemble techniquerdquoGeofizika vol 30 no 2 pp 119ndash141 2013
[16] B Parthasarathy A A Munot and D R Kothawale ldquoMonthlyand seasonal rainfall series for All-India homogeneous regions
and meteorological subdivisions 1871ndash1994rdquo Tech Rep RR-065 Indian Institute of Tropical Meteorology 1995
[17] G P Compo J S Whitaker P D Sardeshmukh et al ldquoThetwentieth century reanalysis projectrdquo Quarterly Journal of theRoyal Meteorological Society vol 137 no 654 pp 1ndash28 2011
[18] E Kalnay M Kanamitsu R Kistler et al ldquoThe NCEPNCAR40-year reanalysis projectrdquo Bulletin of the AmericanMeteorolog-ical Society vol 77 no 3 pp 437ndash471 1996
[19] E M Rasmusson and T H Carpenter ldquoVariations in tropicalsea surface temperature and surface wind fields associated withthe Southern OscillationEl Ninordquo Monthly Weather Reviewvol 110 no 5 pp 354ndash384 1982
Table 9: Prediction evaluation statistics for ensemble models during the test period 2001–2013 (Section 4.5).

| Verification measure | MR | MLP | RNN | GRNN |
|---|---|---|---|---|
| RMSE for forecast (%) | 8.4 | 5.3 | 6.4 | 7.4 |
| PY (%) at allowed error 5% | 46 | 69 | 53 | 46 |
| PY (%) at allowed error 10% | 76 | 92 | 92 | 84 |
| PY (%) at allowed error 15% | 92 | 100 | 92 | 100 |
| PC between actual and predicted rainfall | 0.61 | 0.81 | 0.71 | 0.49 |
| WI between actual and predicted rainfall | 0.71 | 0.89 | 0.81 | 0.62 |
Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with the clustering (WC) approach against the standard method using the same models without clustering (NC). NC columns give the total error (%) without clustering; WC columns give the ensemble error (%) with clustering.

| Predictor set | MR NC | MR WC | MLP NC | MLP WC | RNN NC | RNN WC | GRNN NC | GRNN WC |
|---|---|---|---|---|---|---|---|---|
| PredSet1 | 8.9 | 8.6 | 10.0 | 8.2 | 11.7 | 7.0 | 6.9 | 6.4 |
| PredSet2 | 9.2 | 8.2 | 12.8 | 5.2 | 10.7 | 8.5 | 7.2 | 6.4 |
| PredSet3 | 7.4 | 6.7 | 6.7 | 6.5 | 6.2 | 5.1 | 6.1 | 6.1 |
| PredSet4 | 6.7 | 6.2 | 5.8 | 4.0 | 6.0 | 5.5 | 6.3 | 6.3 |
| PredSet5 | 8.2 | 7.9 | 9.7 | 11.0 | 8.9 | 8.8 | 9.0 | 6.7 |
(iii) Pearson Correlation (PC): PC values of 0.61, 0.81, 0.71, and 0.49 are observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. Rainfall predicted by the MLP ensemble model is the most highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.

(iv) Willmott Index of Agreement (WI): WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.

All of the statistical measures mentioned (Table 9), as well as the mean absolute error (Table 6) in the prediction of monsoon, establish the MLP model as the best among the four proposed models.
5.4 Comparison of Results

5.4.1 Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD): the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, from 1996 to 2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.
5.4.2 Improvement of Cluster-Based Models over Conventional Models. The ensemble model error, obtained by combining the outputs of all cluster models, is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various model and predictor set combinations are shown in Table 10. The results show that the clustering and ensemble method improves prediction over the nonclustered conventional method for most model and predictor set combinations.

Figure 7: Comparison of the MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-parameter (deep blue), 10-parameter (blue), and 8-parameter (light blue) models for the period 1996–2002 [4, 5]. Striped bars represent errors of our proposed models. Horizontal axis: different models (16-par, 8-par, 10-par, MR, MLP, RNN, GRNN); vertical axis: root mean square error as % of LPA.
Table 11: Physical climatic events under study.

| Climatic event | Number of years | Years associated with the event |
|---|---|---|
| Drought | 13 | 1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009 |
| Flood | 11 | 1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994 |
| El-Nino | 23 | 1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009 |
| La-Nina | 22 | 1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011 |
| Positive IOD | 12 | 1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007 |
| Negative IOD | 10 | 1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996 |
Table 12: Thresholds of the support and confidence measures for associating the obtained clusters with physical climatic events.

| Predictor set | Support threshold | Confidence threshold |
|---|---|---|
| PredSet1 | 0.37 | 0.30 |
| PredSet2 | 0.25 | 0.46 |
| PredSet3 | 0.21 | 0.43 |
| PredSet4 | 0.29 | 0.61 |
| PredSet5 | 0.21 | 0.54 |
5.5 Prediction of the Year 2014. The annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus the proposed models show an absolute error of 7.0% for forecasting the rainfall of 2014.
6 Meteorological Analysis
Next, we interpret each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by some global climatic events. The climatic events studied during the period 1948 to 2013 (the period considered for clustering in our work) are El-Nino and La-Nina (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, as shown in Table 11.

Figure 8 shows the El-Nino and La-Nina years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Nino and La-Nina years are shown by color codes (light green and green) in the figure. The chart helps to visualize the cooccurrence of El-Nino and La-Nina events with the extremes of ISMR.

6.1 Measuring Association between Climatic Events and ISMR. Support and confidence measures are used to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.
Figure 8: El-Nino (light green) and La-Nina (green) years association with drought (years below 10% of LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years above 10% of LPA rainfall) years during the period 1948–2013. Horizontal axis: years (1945–2015); vertical axis: deviation of rainfall from LPA, in % of LPA (−25 to 25). Legend: normal years, El-Nino years, La-Nina years.
(i) Support: Support is defined as the percentage of the total number of years in the cluster that correspond to the climatic event:

$$\text{Support} = \frac{x_{ce}}{N}, \tag{12}$$

where $x_{ce}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.

(ii) Confidence: Confidence is defined as the percentage of years associated with the climatic event in the cluster relative to the total number of such event years:

$$\text{Confidence} = \frac{x_{ce}}{T_{ce}}, \tag{13}$$

where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
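The two measures in (12) and (13) can be illustrated with a short sketch. It assumes that each year has already been assigned to a single cluster (for example, the cluster with the highest fuzzy membership), which is an assumption made here for illustration and not a step confirmed by the text. The function name `support_confidence` and the years in `example_cluster` are hypothetical; the flood years are those listed in Table 11.

```python
def support_confidence(cluster_years, event_years):
    """Compute support (12) and confidence (13) as fractions in [0, 1]."""
    cluster = set(cluster_years)
    event = set(event_years)
    x_ce = len(cluster & event)        # event years that fall in the cluster
    support = x_ce / len(cluster)      # x_ce / N
    confidence = x_ce / len(event)     # x_ce / T_ce
    return support, confidence


# Flood years during 1948-2013 as listed in Table 11
FLOOD_YEARS = [1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994]

# Hypothetical cluster membership, made up for illustration only
example_cluster = [1953, 1956, 1960, 1964, 1970, 1975, 1980, 1988, 1990, 2001]

support, confidence = support_confidence(example_cluster, FLOOD_YEARS)
```

Here six of the ten cluster years are flood years, so the support is 6/10, and six of the eleven flood years fall in the cluster, so the confidence is 6/11.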
Figure 9: Histogram of the confidence and support measures as bins of year-count (a) before and (b) after thresholding for PredSet1. Axes in both panels: confidence, support, and year-count.
Table 13: Physical climatic events identified as associated with the clusters obtained by fuzzy clustering.

Predictor  Cluster1  Cluster2  Cluster3
PredSet1  Drought El-Nino La-Nina La-Nina
PredSet2  Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3  El-Nino positive IOD Drought Drought El-Nino
PredSet4  La-Nina Flood La-Nina Drought
PredSet5  - Drought El-Nino Flood
We relate a cluster to a physical climatic event described in Table 11 if both the support and confidence measures attain the corresponding thresholds. The thresholds are chosen in such a way that 50% of the years of study are under consideration. A low threshold compromises the importance of a climatic event being related to a particular cluster; on the other hand, if fewer years are taken, the threshold values must be high, which in turn leaves out most of the clusters. Therefore, as a compromise between the extremes, 50% of the years are considered. Figure 9 shows histograms with confidence and support as bins of year-count before and after the thresholding process, respectively, for predictor set PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both the support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results establish the coexistence of La-Nina and flood events. They also shed light on the high probability of El-Nino, drought, and positive IOD events occurring simultaneously.
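The association rule described above, under which an event is linked to a cluster only when both measures attain their thresholds, might be sketched as follows. The thresholds are the PredSet1 values from Table 12 and the event years come from Table 11; the function name `associate_events`, the hard assignment of years to a cluster, and the years in `hypothetical_cluster` are assumptions made for illustration. The sketch does not reproduce how the thresholds themselves were derived from the 50% criterion.

```python
def associate_events(cluster_years, events, support_threshold, confidence_threshold):
    """Return the events whose support and confidence both attain the thresholds."""
    cluster = set(cluster_years)
    matched = []
    for name, years in events.items():
        x_ce = len(cluster & set(years))   # event years inside the cluster
        support = x_ce / len(cluster)      # equation (12)
        confidence = x_ce / len(years)     # equation (13)
        if support >= support_threshold and confidence >= confidence_threshold:
            matched.append(name)
    return matched


# Event years as listed in Table 11 (two events shown for brevity)
EVENTS = {
    "Drought": [1951, 1965, 1966, 1968, 1972, 1974, 1979,
                1982, 1986, 1987, 2002, 2004, 2009],
    "Flood": [1953, 1956, 1958, 1959, 1961, 1964,
              1970, 1975, 1983, 1988, 1994],
}

# Hypothetical cluster membership, made up for illustration only
hypothetical_cluster = [1951, 1965, 1966, 1972, 1982, 1987, 1990, 1995, 2002, 2009]

# PredSet1 thresholds from Table 12: support 0.37, confidence 0.30
matched = associate_events(hypothetical_cluster, EVENTS, 0.37, 0.30)
```

For this made-up cluster, drought has a support of 8/10 and a confidence of 8/13, so it passes both thresholds, while flood has no years in the cluster and is left out.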
7 Conclusion
Monsoon is an important phenomenon for the economic development of an agricultural country like India. The large variability of monsoon over the years makes the prediction of rainfall a challenging task. This paper addresses the problem by clustering the years into similar groups and then providing a multimodel ensemble forecast for Indian summer monsoon rainfall.
Different climatic parameters, each with its best correlated month value, are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of the year in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also establish the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in the meteorological context, the clusters are linked with global climatic events.
In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work is supported by the RBU project through the RESPOND program of ISRO through KCSTC, IIT Kharagpur.
References
[1] H. F. Blanford, “On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India,” Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.
[2] G. T. Walker, “Correlation in seasonal variations of weather-IV: a further study of world weather,” Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.
[3] V. Thapliyal and S. M. Kulshrestha, “Recent models for long range forecasting of South-West monsoon rainfall in India,” Mausam, vol. 43, no. 3, pp. 239–248, 1992.
[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, “A power regression model for long range forecast of southwest monsoon rainfall over India,” Mausam, vol. 42, no. 2, pp. 125–130, 1991.
[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, “IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003,” Current Science, vol. 86, no. 3, pp. 422–431, 2004.
[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, “New statistical models for long-range forecasting of southwest monsoon rainfall over India,” Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.
[7] J. Schewe and A. Levermann, “A statistically predictive model for future monsoon failure in India,” Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.
[8] Q. Wu, Y. Yan, and D. Chen, “A linear markov model for east asian monsoon seasonal forecast,” Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.
[9] K. Fan, Y. Liu, and H. Chen, “Improving the prediction of the east asian summer monsoon: new approaches,” Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.
[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, “Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes,” Journal of Hydrology, vol. 503, pp. 11–21, 2013.
[11] A. K. Sahai, M. K. Soman, and V. Satyan, “All India summer monsoon rainfall prediction using an artificial neural network,” Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.
[12] W.-C. Hong, “Rainfall forecasting by technological machine learning models,” Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.
[13] S. Chattopadhyay and G. Chattopadhyay, “Comparative study among different neural net learning algorithms applied to rainfall time series,” Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.
[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, “Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India,” Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.
[15] V. R. Durai and R. Bhardwaj, “Improving precipitation forecasts skill over India using a multi-model ensemble technique,” Geofizika, vol. 30, no. 2, pp. 119–141, 2013.
[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, “Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994,” Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.
[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., “The twentieth century reanalysis project,” Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.
[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., “The NCEP/NCAR 40-year reanalysis project,” Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.
[19] E. M. Rasmusson and T. H. Carpenter, “Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Nino,” Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.
Advances in Meteorology 11
14
12
10
8
6
4
2
060
5040
3020
100 0
2040
6080
100
ConfidenceSupport
Year
-cou
nt
50400
302022
10 2040
60
ConfidenSupp
(a)
14
12
10
8
6
4
2
060
5040
3020
100 0
2040
6080
100
ConfidenceSupport
Year
-cou
nt
5040
3020
10 2040
60
C nfidencSupport
(b)
Figure 9 Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1
Table 13 Identified physical climatic events being associated with clusters obtained by fuzzy clustering
Predictor Cluster1 Cluster2 Cluster3PredSet1 Drought El-Nino La-Nina La-NinaPredSet2 Flood La-Nina Drought Drought El-Nino La-NinaPredSet3 El-Nino positive IOD Drought Drought El-NinoPredSet4 La-Nina Flood La-Nina DroughtPredSet5 mdash Drought El-Nino Flood
We relate a cluster to a physical climatic event describedin Table 11 if both support and confidence measures attainthe corresponding thresholds The thresholds are chosen in away that 50 of years of study are under consideration A lowthreshold compromises the importance of a climatic eventbeing related to a particular cluster on the other hand if evenless number of years are taken then threshold values shouldbe high which in turn will leave out most of the clustersTherefore as an optimal between the extremes 50 of yearsare considered Figure 9 shows histograms with confidenceand support as bins of year-count for cases before andafter threshold process respectively for predictors PredSet1(Table 2) The threshold values obtained for predictor setsare presented in Table 12 For each predictor set we associatethe clusters with physical climatic events if they satisfy bothsupport and confidence thresholds The climatic events cor-responding to cluster are shown in Table 13 Results establishcoexistence of events of La-Nina and flood It also puts lighton high probability of occurrence of El-Nino drought andpositive IOD events simultaneously
7 Conclusion
Monsoon is an important phenomenon for the economic development of an agricultural land like India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. This paper attempts to address the problem by clustering the years into similar groups and then providing a multimodel ensemble forecast for Indian summer monsoon rainfall.
Different climatic parameters, each at its best-correlated month value, are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where each weight is the fuzzy membership of the year in the corresponding cluster. The multilayer perceptron ensemble model provides a mean absolute error of 40 for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also confirm the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in the meteorological context, the clusters are linked with global climatic events.
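The ensemble step summarized above can be sketched as follows. This is a minimal illustration, assuming that the weights are a year's fuzzy memberships in each cluster and that they are normalized to sum to one; the function name and the numbers are hypothetical, and the per-cluster forecasts stand in for the outputs of any of the four model types.

```python
def ensemble_forecast(memberships, cluster_forecasts):
    """Membership-weighted average of the per-cluster model forecasts."""
    if len(memberships) != len(cluster_forecasts):
        raise ValueError("one forecast is needed per cluster")
    total = sum(memberships)
    if total <= 0:
        raise ValueError("memberships must sum to a positive value")
    # Dividing by the total is a no-op when the memberships already sum to 1.
    weighted = sum(u * f for u, f in zip(memberships, cluster_forecasts))
    return weighted / total


# Hypothetical test year: its memberships in three clusters and the rainfall
# forecast produced by each cluster's model (illustrative values only).
memberships = [0.5, 0.25, 0.25]
cluster_forecasts = [800.0, 900.0, 1000.0]
forecast = ensemble_forecast(memberships, cluster_forecasts)
print(forecast)
```

A year that belongs almost entirely to one cluster is forecast almost entirely by that cluster's model, while a year with mixed membership blends the models in proportion to its memberships.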
In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
12 Advances in Meteorology
Acknowledgment
This work is supported by the RBU project through the RESPOND program of ISRO through KCSTC, IIT Kharagpur.
References
[1] H. F. Blanford, "On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India," Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.
[2] G. T. Walker, "Correlation in seasonal variations of weather - IV: a further study of world weather," Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.
[3] V. Thapliyal and S. M. Kulshrestha, "Recent models for long range forecasting of South-West monsoon rainfall in India," Mausam, vol. 43, no. 3, pp. 239–248, 1992.
[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, "A power regression model for long range forecast of southwest monsoon rainfall over India," Mausam, vol. 42, no. 2, pp. 125–130, 1991.
[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, "IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003," Current Science, vol. 86, no. 3, pp. 422–431, 2004.
[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, "New statistical models for long-range forecasting of southwest monsoon rainfall over India," Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.
[7] J. Schewe and A. Levermann, "A statistically predictive model for future monsoon failure in India," Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.
[8] Q. Wu, Y. Yan, and D. Chen, "A linear Markov model for East Asian monsoon seasonal forecast," Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.
[9] K. Fan, Y. Liu, and H. Chen, "Improving the prediction of the East Asian summer monsoon: new approaches," Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.
[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, "Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes," Journal of Hydrology, vol. 503, pp. 11–21, 2013.
[11] A. K. Sahai, M. K. Soman, and V. Satyan, "All India summer monsoon rainfall prediction using an artificial neural network," Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.
[12] W.-C. Hong, "Rainfall forecasting by technological machine learning models," Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.
[13] S. Chattopadhyay and G. Chattopadhyay, "Comparative study among different neural net learning algorithms applied to rainfall time series," Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.
[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, "Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India," Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.
[15] V. R. Durai and R. Bhardwaj, "Improving precipitation forecasts skill over India using a multi-model ensemble technique," Geofizika, vol. 30, no. 2, pp. 119–141, 2013.
[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, "Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994," Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.
[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., "The twentieth century reanalysis project," Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.
[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., "The NCEP/NCAR 40-year reanalysis project," Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.
[19] E. M. Rasmusson and T. H. Carpenter, "Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño," Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.