
Port throughput forecasting by MARS-RSVR with chaotic simulated annealing particle swarm optimization algorithm

Jing Geng a, Ming-Wei Li a,*, Zhi-Hui Dong a, Yu-Sheng Liao b

a College of Shipbuilding Engineering, Harbin Engineering University, Harbin 150001, Heilongjiang, China
b Department of Healthcare Administration, Oriental Institute of Technology, No. 58, Sec. 2, Sichuan Rd., Panchiao, New Taipei, Taiwan, ROC

Article info

Article history:
Received 30 April 2014
Received in revised form 22 June 2014
Accepted 22 June 2014
Communicated by Wei Chiang Hong

Keywords:
Port throughput
Forecasting
Chaotic mapping
Particle swarm optimization (PSO)
Simulated annealing (SA)
Robust v-support vector regression (RSVR)

Abstract

Port throughput forecasting is a very complex nonlinear dynamic process. Prediction accuracy is influenced by the uncertainty of socio-economic factors, and especially by the mixed noise (singular points) produced in the collection, transfer and calculation of statistical data; consequently, it is difficult to obtain a satisfactory port throughput forecasting result. Thus, establishing an effective port throughput forecasting scheme is still a significant research issue. Since the robust v-support vector regression (RSVR) model can handle the nonlinearity and mixed noise in the port throughput history data and its related socio-economic factors, this paper introduces the RSVR model to forecast port throughput. To search for a more appropriate parameter combination for the RSVR model, and considering that the previously proposed simulated annealing particle swarm optimization (SAPSO) algorithm and the original PSO algorithm still suffer from premature convergence and are time consuming, this study presents a chaotic simulated annealing particle swarm optimization (CSAPSO) algorithm to determine the parameter combination. To identify the final input vectors for the RSVR model, multivariate adaptive regression splines (MARS) are adopted to select the final input vectors from the candidate input variables. This study thus proposes a port throughput forecasting scheme that hybridizes RSVR, CSAPSO and MARS to obtain a more accurate forecasting result. The port throughput data and the corresponding socio-economic indicator data of Shanghai are then compiled as an illustrative example to evaluate the feasibility and performance of the proposed scheme. The experimental results indicate that the proposed port throughput forecasting scheme obtains better forecasting results than the six competing models in terms of forecasting error.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Port throughput prediction underpins port strategic decisions, including the development scale, general layout and district division of a port. If the prediction is not accurate enough, policy will be biased, which may cause huge financial losses. Therefore, developing an effective port throughput forecasting model has become a crucial and challenging task.

Numerous forecasting approaches have been developed for port throughput prediction. Among the conventional quantitative forecasting approaches, autoregressive integrated moving average (ARIMA) models [1] are the most popular and practical for time series forecasting. They are often applied when the data are inadequate to construct an econometric model, or when knowledge of the structure of the forecasting model is limited. Time series models are simple to compute and fast, and are likely to outperform other models in some cases, especially in short-term forecasting [2,3]. They are therefore widely used in port throughput prediction [4–7]. However, time series forecasting models fail to reflect other factors related to the series being predicted.

An artificial neural network (ANN) is primarily a model that emulates the processing of the human neurological system to identify related spatial and temporal characteristics from historical data patterns (particularly for nonlinear and dynamic evolutions); therefore, ANNs can approximate any level of complexity and do not need prior knowledge of problem solving. Since port throughput prediction is too complex to be solved by a single linear statistical algorithm, ANNs should be considered as an alternative. Owing to these properties, ANN models [8–10] have been widely applied in port throughput forecasting [11]. Although ANN-based forecasting models can approximate any function, particularly nonlinear functions, they have several difficulties: the network training error is a non-convex problem, their black-box operations are hard to explain, training is easily trapped in local minima [12,13], training procedures are time consuming, and the selection of an ANN model architecture is subjective [14]. Additionally, training an ANN model requires a large number of training samples, while the data available for port throughput and its related impact indicators are limited. Thus, ANN models cannot achieve satisfactory performance in port throughput forecasting.

Contents lists available at ScienceDirect
Neurocomputing
journal homepage: www.elsevier.com/locate/neucom
http://dx.doi.org/10.1016/j.neucom.2014.06.070
0925-2312/© 2014 Elsevier B.V. All rights reserved.
* Corresponding author (M.-W. Li). Tel.: +886 18845615108; fax: +886 0451 82568317.
E-mail addresses: [email protected] (J. Geng), [email protected], [email protected] (M.-W. Li), [email protected] (Y.-S. Liao).
Please cite this article as: J. Geng, et al., Port throughput forecasting by MARS-RSVR with chaotic simulated annealing particle swarm optimization algorithm, Neurocomputing (2014), http://dx.doi.org/10.1016/j.neucom.2014.06.070

Support vector regression (SVR), which is based on the structural risk minimization criterion, overcomes the inherent defects of the ANN model [15]. It offers not only greater nonlinear modeling ability but also several other advantages, such as a theoretically guaranteed global optimum, a simple modeling structure and process, and good generalization from small samples; therefore, SVR-based models [16] have been successfully applied to forecasting problems in many fields, such as financial time series forecasting [17–21], tourist arrival prediction [22,23], atmospheric science forecasting [24–27], traffic flow prediction [28–30], and electric load forecasting [31–36]. SVR has also been used successfully for port throughput prediction [37,38].

The prediction of port throughput is a complex nonlinear dynamic procedure affected by numerous factors, such as gross domestic product, gross national product, total imports and exports, and industrial output. These factors mostly have random and nonlinear characteristics and may be connected to each other in complex nonlinear ways; thus, it is difficult to express them by a definite method in a low-dimensional space. Noise may arise from stochastic errors when data are collected, transmitted and analyzed, and it can exist both in the port throughput sequence and in its related factors. The noise mostly follows a normal distribution but takes high amplitude values at some points. In theory, the models mentioned above do not account for this noise. The mixed noise in the throughput sequence and in the data of the related impact factors strongly affects the final prediction results, especially for the noise-sensitive SVR model (the matrix is no longer sparse because outliers approach the decision boundary, and the number of support vectors grows very quickly under the influence of outliers). Wu [39] designed a robust loss function that applies different loss functions in different situations, taking into account the mixed noise in the prediction sequence and its related impact factors (normally distributed noise, high amplitude values and singular points). This yielded a new support vector regression model (namely RSVR), which was applied to time series prediction of product sales. The numerical results indicate that the model effectively suppresses noise and leads to better prediction results. To deal with the mixed noise in the port throughput sequence and its related impact factors, this paper adopts the RSVR model to improve the robustness and accuracy of port throughput prediction.

Practical results indicate that the forecasting accuracy of SVR-based models is significantly influenced by the choice of parameters [40]. Although the literature offers some recommendations on appropriate settings of SVR parameters [41], those approaches do not simultaneously consider the interaction effects among the parameters. The commonly used cross-validation method for selecting SVR parameters has a certain cross error [42]; especially in complex forecasting problems, it cannot guarantee a high level of forecasting accuracy.

To identify which approach is suitable for specified data patterns, researchers have employed different hybrid evolutionary algorithms [43–45] (such as particle swarm optimization, simulated annealing, genetic algorithms and immune algorithms) to determine the parameters. In those studies, all SVR models with parameters determined by evolutionary algorithms are superior to the competing forecasting models (ARIMA, ANNs, etc.); however, these evolutionary algorithms are still time consuming or inefficient in optimizing the parameters of SVR models.

Therefore, an alternative evolutionary algorithm needs to be developed to improve parameter optimization. PSO is a stochastic optimization algorithm based on swarm theory, first introduced by Kennedy and Eberhart, and is now widely used in function optimization, neural network training, pattern classification, fuzzy control systems and other engineering fields. In the late stage of evolution, PSO is affected by random oscillation, which makes it time consuming near the global optimum: the convergence rate becomes very low and the search is easily trapped in local minima, which is a bottleneck for further development of the PSO algorithm. The SA algorithm is a heuristic random search algorithm based on Monte Carlo iteration, originally introduced by N. Metropolis. It has a strong global search ability and accepts both better and inferior solutions with a certain probability. Thus, when SA falls into a local optimum, it can in theory escape after sufficient time and finally reach the global optimum. SA is now widely used in engineering, for example in production management, control engineering, machine learning, neural networks and image processing. PSO-SA, which combines SA and PSO, was proposed to enhance the algorithm's ability to escape local optima and to reduce convergence time [46]. Employing the PSO-SA algorithm to select the parameters of the SVR model can improve the parameter optimization [47,48].

However, the PSO-SA algorithm still easily falls into local optima (premature convergence) because of the lack of diversity in the late stage of evolution. To enhance the diversity of the population and improve the global exploration ability of PSO-SA, this paper employs a chaotic sequence to transform the three hyper-parameters of an SVR model from the solution space to the chaotic space; any variable in this chaotic space can travel ergodically over the whole space of interest, which eventually leads to an improved solution.

Many chaotic sequences adopt the Logistic mapping function, whose values concentrate at both ends of the interval [0,1]; it therefore does not strengthen the chaotic distribution characteristics well [49]. Based on a comparison of the chaotic distribution characteristics obtained after mapping the hyper-parameters into chaotic space, the analysis in [50] concludes that the Cat mapping function has good ergodic uniformity in the interval [0,1] and does not easily fall into a minor cycle.

Therefore, this study employs the Cat mapping function to apply chaotic disturbance to the evolutionary population of PSO-SA, designs the coupled evolution mechanism of Cat mapping, SA and PSO, and proposes the CSAPSO algorithm to improve the optimization of the RSVR parameters.

Meanwhile, port throughput is influenced by various socio-economic factors (the candidate input variables, such as gross domestic product, total investment in fixed assets, total imports and exports, and industrial output), but a major disadvantage of SVR-based forecasting models is that they cannot select the final input vectors from the candidate input variables. Thus, selecting the final input vectors is crucial in constructing an SVR-based forecasting model.

MARS is a multivariate, nonlinear, nonparametric regression approach [51]; it not only has excellent variable selection capabilities but can also effectively analyze the differences in the degree of significance of different variables. MARS has thus been widely used in various fields, such as sales prediction [52], credit evaluation [53–55], stock price forecasting [56–58], software reliability analysis [59–61] and species distribution prediction [62,63]. However, MARS has rarely been used in port throughput forecasting in the existing literature; therefore, this paper employs MARS to determine the final input vectors for RSVR and to analyze the degrees of significance of the different factors, in support of further research on the port throughput generation mechanism.


To improve forecasting performance in port throughput prediction, this study integrates MARS, RSVR and CSAPSO into a port throughput forecasting scheme (namely the MRSVR-CSAPSO scheme). The main idea of the scheme is first to employ MARS to select the final input vectors for RSVR from the candidate input variables, which yields the MRSVR model. CSAPSO is then adopted to determine the values of the three parameters of the MRSVR model. Finally, the MRSVR-CSAPSO port throughput forecasting scheme is built.

In the forecasting process, this study selects numerous factors as the candidate input variables: gross domestic product (x1: GDP), total investment in fixed assets (x2: TIFA), total imports and exports (x3: TIE), industrial output (x4: IO), first industry value (x5: FIV), second industry value (x6: SIV), tertiary industry value (x7: TIV), population measurement (x8: PM), total retail sales of consumer goods (x9: TRSCG), freight volume (x10: FV), highway freight volume (x11: HFV), and railway freight volume (x12: RFV). Meanwhile, historical port throughput is the basis of future port throughput, has a strong influence on it, and is easily obtained and quantified; thus, this paper also uses historical port throughput data as candidate input variables: the port throughput of three years earlier (x13: PTYPT1), the port throughput of two years earlier (x14: PTYPT), the previous year's port throughput (x15: PYPT), the previous 2-year moving average (x16: P2YMA) and the previous 3-year moving average (x17: P3YMA).
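The history-based candidate inputs x13 to x17 can be derived from the annual throughput series alone. The following is a minimal sketch under stated assumptions: the function name `lag_features` and the sample values are hypothetical, and the moving averages are assumed to be plain arithmetic means of the preceding two and three years, which the paper does not spell out in this section.

```python
def lag_features(series):
    """Build the five history-based candidate inputs for each target year.

    series: annual throughput values, oldest first (hypothetical data).
    Returns a list of (x13, x14, x15, x16, x17, target) tuples, starting at
    the fourth observation so that three previous years are always available.
    """
    rows = []
    for t in range(3, len(series)):
        y3, y2, y1 = series[t - 3], series[t - 2], series[t - 1]
        ma2 = (y1 + y2) / 2.0        # x16: previous 2-year moving average (assumed plain mean)
        ma3 = (y1 + y2 + y3) / 3.0   # x17: previous 3-year moving average (assumed plain mean)
        rows.append((y3, y2, y1, ma2, ma3, series[t]))
    return rows


# Hypothetical throughput values in arbitrary units, for illustration only.
demo_series = [10.0, 12.0, 14.0, 16.0, 18.0]
for row in lag_features(demo_series):
    print(row)
```

Each row pairs the five lagged inputs with the throughput of the year being forecast, so the first three years of a series serve only as history.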

To test the forecasting performance of the proposed scheme, the results of the proposed MRSVR-CSAPSO model are compared with those of the ARIMA, MBPNN, RSVR-CSAPSO, MSVR-CSAPSO, MRSVR-PSO and MRSVR-SAPSO models.

The rest of this paper is organized as follows. Section 2 introduces MARS, the basic formulation of RSVR and the determination of the RSVR parameters by CSAPSO. The proposed MRSVR-CSAPSO port throughput forecasting scheme is described in Section 3. Section 4 provides a numerical example and compares the forecasting performance of all alternatives. The conclusions are provided in Section 5.

2. Methodology

2.1. Robust v-support vector regression (RSVR)

2.1.1. Fundamentals of SVR

The basic idea of the SVR model is to define a nonlinear mapping $\Phi: R^n \to R^m$ ($m \ge n$) that maps the input data $x$ into a high-dimensional feature space, and then to perform linear regression in that space.

Given a dataset $\{(x_i, y_i),\ i = 1, 2, \ldots, N\}$, where $x_i \in R^n$ is the input vector (the final input vectors), $y_i \in R$ is the output variable corresponding to $x_i$ (the port throughput forecasting result), and $N$ is the total number of data points, SVR estimates the regression function according to the following formula:

$$
f(x) = \omega \cdot \Phi(x) + b \tag{1}
$$

where $\omega$ and $\Phi(x)$ are $m$-dimensional vectors, "$\cdot$" is the dot product in the feature space, and $b \in R$ is the threshold. SVR uses structural risk minimization to compute Formula (1); the core idea of structural risk minimization is

$$
R[f] \le R_{emp} + R_{reg} \tag{2}
$$

where $R[f]$ is the actual risk, $R_{emp}$ is the empirical risk, which measures the deviation between $f(x)$ and the samples, and $R_{reg}$ is the confidence range, which measures the complexity of $f(x)$. The empirical risk $R_{emp}$ is determined by the loss function; different loss functions give different $R_{emp}$ and therefore different SVR models.

2.1.2. Robust loss function

There is a relationship between the optimal loss function and the intrinsic characteristics of the sample set. For normally distributed noise, using the Gaussian function as the loss function gives the best noise reduction. For larger noise and singular points, the Laplace loss function (linear loss function) performs best. Considering the unpredictability, randomness and nonstationarity of the sample data and the characteristics of the various loss functions, the new loss function (namely the robust loss function) is designed to combine the above loss functions in order to improve the robustness of the SVR model. The mathematical model of the robust loss function is expressed as follows:

$$
L(\xi) =
\begin{cases}
0, & |\xi| \le \varepsilon \\
\frac{1}{2}(|\xi| - \varepsilon)^2, & \varepsilon < |\xi| \le \varepsilon_\mu \\
\mu(|\xi| - \varepsilon) - \frac{1}{2}\mu^2, & |\xi| > \varepsilon_\mu
\end{cases}
\tag{3}
$$

where $\varepsilon + \mu = \varepsilon_\mu$, $\varepsilon \ge 0$, $\mu \ge 0$. This paper only considers the condition $\mu = 1$.

The robust loss function is divided into three parts (Fig. 1):

(1) when $|\xi| \le \varepsilon$, the deviation is not penalized, in order to ensure sparse solutions of the learning machine.

(2) when $\varepsilon < |\xi| \le \varepsilon_\mu$, the Gaussian loss function, $\frac{1}{2}(|\xi| - \varepsilon)^2$, is used to penalize the deviation and suppress noise that follows a Gaussian distribution.

(3) when $|\xi| > \varepsilon_\mu$, the Laplace loss function, $\mu(|\xi| - \varepsilon)$, is used to penalize the deviation and restrain large-amplitude noise and anomalous data.

The robust loss function weakens the influence of abnormal data, which gives the SVR model better robustness and generalization ability.
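The three-part loss can be written directly as a small function. This is an illustrative sketch, not code from the paper: the function name `robust_loss` is hypothetical, the default `mu=1.0` follows the paper's stated choice, and the third branch is assumed to apply for deviations beyond ε + μ, as item (3) of the list describes.

```python
def robust_loss(xi, eps, mu=1.0):
    """Robust loss L(xi): zero, quadratic, then linear in the deviation.

    xi  : deviation between the target and the model output
    eps : half-width of the insensitive zone (eps >= 0)
    mu  : width of the quadratic zone (mu >= 0); the paper uses mu = 1
    """
    a = abs(xi)
    if a <= eps:
        # Insensitive zone: no penalty, which keeps the solution sparse.
        return 0.0
    if a <= eps + mu:
        # Gaussian (quadratic) zone for normally distributed noise.
        return 0.5 * (a - eps) ** 2
    # Laplace (linear) zone for large-amplitude noise and singular points.
    return mu * (a - eps) - 0.5 * mu ** 2


for deviation in (0.25, 1.0, 1.5, 2.5):
    print(deviation, robust_loss(deviation, eps=0.5))
```

The constant term in the linear branch makes the two outer branches meet at the boundary: at a deviation of ε + μ both expressions equal μ²/2, so the loss is continuous.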

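2.1.3. Robust v-support vector regression

The expected risk obtained from $f(x) = w \cdot x$ also meets the structural risk minimization principle. The robust v-support vector regression (namely RSVR) has a hyperplane $f(x)$. Suppose the dataset $T = \{(x_i, y_i)\}$, where $x_i \in R^d$ is a $d$-dimensional column vector, $y_i \in R$, and $i = 1, 2, \ldots, l$; the mathematical model of the RSVR with the robust loss function is then as follows:

$$
\begin{aligned}
\min_{w,\,\xi^{(*)},\,b,\,\varepsilon} \quad & \frac{1}{2}\left(\|w\|^2 + b^2\right) + C\left[ v\varepsilon + \frac{1}{l}\sum_{i \in I_1} \frac{1}{2}\left(\xi_i^2 + \xi_i^{*2}\right) + \frac{1}{l}\sum_{i \in I_2} \mu\left(\xi_i + \xi_i^{*}\right) \right] \\
\text{s.t.} \quad & y_i - w \cdot x_i - b \le \varepsilon + \xi_i, \\
& w \cdot x_i + b - y_i \le \varepsilon + \xi_i^{*}, \\
& \xi_i,\ \xi_i^{*} \ge 0,\quad v \in (0, 1],\quad \varepsilon \ge 0,\quad i = 1, \ldots, l
\end{aligned}
\tag{4}
$$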

Fig. 1. Robust loss function $L(\xi)$ (the original figure marks $\varepsilon$, $\varepsilon_\mu$ and $\mu$ on the axes).


where $w$ is a $d$-dimensional column vector; $C$ ($C \ge 0$) is a penalty coefficient that decides the balance between confidence risk and empirical risk; $v \in (0, 1]$ is the upper bound of the proportion of error samples in the total number of training samples and the lower bound of the proportion of support vectors in the total number of training samples; $\varepsilon$ controls the size of the tube; $\xi_i, \xi_i^{*}$ ($i = 1, \ldots, l$) are slack variables used to ensure that the constraints are satisfied; $I_1$ is the set of samples whose slack variables fall in the interval $0 < |\xi_i|, |\xi_i^{*}| \le \varepsilon_\mu$; and $I_2$ is the set of samples whose slack variables fall in the interval $\varepsilon_\mu < |\xi_i|, |\xi_i^{*}|$. Using the dual principle, the Karush–Kuhn–Tucker (KKT) conditions and the kernel trick, the dual problem of the above optimization problem is derived and expressed as Eq. (5):

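$$
\begin{aligned}
\min_{\alpha,\,\alpha^{*}} \quad & \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l} (\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*})\left(K(x_i, x_j) + 1\right) - \sum_{i=1}^{l} y_i(\alpha_i - \alpha_i^{*}) + \frac{l}{2C}\sum_{i=1}^{l}\left(\alpha_i^2 + \alpha_i^{*2}\right) \\
\text{s.t.} \quad & e^T(\alpha + \alpha^{*}) \le Cv, \\
& 0 \le \alpha_i,\ \alpha_i^{*} \le \min\left\{\frac{C}{l}, \frac{C\mu}{l}\right\}
\end{aligned}
\tag{5}
$$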

Transform Eq. (5) into the matrix form as follows:

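$$
\begin{aligned}
\min_{\alpha,\,\alpha^{*}} \quad & \frac{1}{2}\left[\alpha^T, (\alpha^{*})^T\right]
\begin{bmatrix} Q + \frac{l}{C}E & -Q \\ -Q & Q + \frac{l}{C}E \end{bmatrix}
\begin{bmatrix} \alpha \\ \alpha^{*} \end{bmatrix}
+ \left[-y^T, y^T\right]
\begin{bmatrix} \alpha \\ \alpha^{*} \end{bmatrix} \\
\text{s.t.} \quad & e^T(\alpha + \alpha^{*}) \le Cv, \\
& 0 \le \alpha_i,\ \alpha_i^{*} \le \min\left\{\frac{C}{l}, \frac{C\mu}{l}\right\}
\end{aligned}
\tag{6}
$$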

where $Q_{ij} = K(x_i, x_j) + 1$, $i = 1, \ldots, l$, $j = 1, \ldots, l$; $e = [1, \ldots, 1]^T$ is an $l$-dimensional column vector; $E$ is the identity matrix of order $l$; and $\alpha$ and $\alpha^{*}$ are $l$-dimensional non-negative column vectors formed by the Lagrange multipliers.

The output of RSVR is as follows:

$$
f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^{*})\left(K(x_i, x) + 1\right) \tag{7}
$$

According to Eq. (7), RSVR has a more concise dual form, which simplifies the calculation. The threshold $b$ does not appear in the output expression: because $b$ is regularized in the objective of Eq. (4), it is absorbed into the term $K(x_i, x) + 1$.

Owing to the good performance of the radial basis function in SVR applications [35], this paper employs it as the kernel function of the RSVR model. The radial basis function is expressed as Eq. (8):

$$
K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right) \tag{8}
$$
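Eqs. (7) and (8) together define how a trained RSVR produces a forecast. The sketch below shows that evaluation step only, in plain Python; the names `rbf_kernel` and `rsvr_predict` are hypothetical, and the multiplier values in the usage lines are made-up numbers for illustration, not a solution of the dual problem in Eq. (6).

```python
import math


def rbf_kernel(a, b, sigma):
    """Radial basis function kernel of Eq. (8) for two equal-length vectors."""
    squared_distance = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def rsvr_predict(x, train_x, alpha, alpha_star, sigma):
    """RSVR output of Eq. (7): sum of (alpha_i - alpha_i*) * (K(x_i, x) + 1).

    train_x           : training input vectors x_i
    alpha, alpha_star : Lagrange multipliers (assumed already solved elsewhere)
    """
    return sum(
        (a - a_star) * (rbf_kernel(xi, x, sigma) + 1.0)
        for xi, a, a_star in zip(train_x, alpha, alpha_star)
    )


# Hypothetical two-sample model; the multipliers are illustrative only.
train_x = [[0.0, 0.0], [1.0, 1.0]]
alpha = [0.5, 0.0]
alpha_star = [0.0, 0.2]
print(rsvr_predict([0.0, 0.0], train_x, alpha, alpha_star, sigma=1.0))
```

The "+ 1" inside the sum plays the role of the threshold b, which is why no separate bias term is passed to `rsvr_predict`.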

The RSVR model has three parameters (C, v, and δ, where δ denotes the kernel width written as σ in Eq. (8)); determining them is the first step in training the RSVR model. Practical results indicate that the forecasting accuracy of an RSVR model depends on a good setting of the parameter combination (C, v, and δ). Therefore, parameter determination is a valuable research issue. Recently, authors have applied a series of intelligent optimization algorithms to parameter determination for RSVR models to test their potential and suitability. However, most of these algorithms are easily trapped in local optima and converge slowly. To improve the efficiency of optimizing the parameter combination, this paper proposes the CSAPSO algorithm and uses it to determine the parameters of the RSVR model.

2.2. CSAPSO

PSO is easy to implement, does not require many parameters to be adjusted, and converges quickly in the early stage; however, because of random oscillation in the late stage, it needs a longer search time in the vicinity of the global optimum and is easily trapped in local extrema. SA can converge to the global optimum with 100% probability, provided the initial temperature is high enough and the temperature drops slowly enough. It can jump out of local optima because it accepts poor particles with a certain probability. Combining the mechanisms of SA and PSO gives the PSO-SA hybrid algorithm, which both improves the convergence speed and, owing to its jump characteristic during evolution, enhances the ability to escape local extrema. The PSO-SA hybrid algorithm is superior to the standard PSO algorithm and to the SA algorithm alone in calculation accuracy and optimization result. However, the proposed PSO-SA, like the original PSO algorithm, still suffers from premature convergence (being trapped in local optima) and is time consuming.
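The acceptance of poor particles with a certain probability is usually implemented with the Metropolis criterion. The sketch below shows that standard rule as an assumption: this section does not state the exact acceptance formula used in the paper, and the function name `metropolis_accept` and its arguments are hypothetical.

```python
import math
import random


def metropolis_accept(current_cost, candidate_cost, temperature, rng=random):
    """Standard Metropolis rule for a minimization problem (assumed here).

    A candidate that is no worse is always accepted. A worse candidate is
    accepted with probability exp(-(candidate_cost - current_cost) / T),
    so acceptance of poor solutions becomes rarer as T decreases.
    """
    if candidate_cost <= current_cost:
        return True
    delta = candidate_cost - current_cost
    return rng.random() < math.exp(-delta / temperature)


# At a high temperature a slightly worse candidate is often accepted;
# at a very low temperature it is practically never accepted.
print(metropolis_accept(1.0, 1.1, temperature=10.0))
print(metropolis_accept(1.0, 1.1, temperature=1e-9))
```

Lowering the temperature gradually is what lets the search explore widely at first and settle into a single basin later.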

To overcome the deficiency of the PSO-SA hybrid algorithm and search for more appropriate parameter combinations of the RSVR model, this paper introduces Cat mapping into the PSO-SA hybrid optimization algorithm to establish a more effective hybrid optimization algorithm (CSAPSO), which is employed to select the parameters of the RSVR model. In the proposed CSAPSO algorithm, the basic operations of PSO are used to initialize the particle swarm; the SA mechanism is then adopted to drive the particles toward the lowest energy state, which yields the improved particles from SA. When the PSO-SA evolution comes to a standstill, Cat mapping, with its excellent chaotic ergodicity, is employed to apply chaotic disturbance to the particle swarm in order to explore better particles and obtain a more accurate solution. The modified individuals are then sent back to standard PSO for the next generation, until the termination condition of the algorithm is reached. Finally, the optimal parameter combination of the RSVR model is obtained by the CSAPSO algorithm.

2.2.1. Global chaotic disturbance using cat mapping (GCDCM)

Chaos [49] is a common nonlinear phenomenon; its behavior is complex and appears random, but it has a delicate internal regularity. Searching with a chaotic mapping is superior to blind, disordered random search; owing to the ergodicity of chaos, it can avoid being trapped in local optima. The chaos optimization approach is a global optimization technique and has been widely used to improve evolutionary algorithms. However, the chaotic sequence generators currently used for this purpose mostly adopt the Logistic mapping, the Tent mapping and the An mapping. This paper introduces the Cat mapping (with better chaos distribution characteristics) into the PSO-SA algorithm to strengthen the global exploration ability of the CSAPSO hybrid optimization algorithm.

1) Cat mapping

The two-dimensional Cat mapping function [64] is shown in Eq. (9):

$$
\begin{cases}
x_{n+1} = (x_n + y_n) \bmod 1 \\
y_{n+1} = (x_n + 2y_n) \bmod 1
\end{cases}
\tag{9}
$$

where x mod 1¼ x�½x�.Analysis [50] on chaotic characteristics of these four mappingfunctions (Logistic mapping, Tent mapping, An mapping andCat mapping) indicated that the distribution of Cat mapping is

J. Geng et al. / Neurocomputing ∎ (∎∎∎∎) ∎∎∎–∎∎∎4

Please cite this article as: J. Geng, et al., Port throughput forecasting by MARS-RSVR with chaotic simulated annealing particle swarmoptimization algorithm, Neurocomputing (2014), http://dx.doi.org/10.1016/j.neucom.2014.06.070i

Page 5: Port throughput forecasting by MARS-RSVR with chaotic simulated annealing particle swarm optimization algorithm

relatively uniform and has no cyclic phenomenon during theiteration process. Meanwhile, the chaotic sequence value of theCat mapping may take 0 and 1; in short, Cat mapping has betterchaos distribution characteristic. Therefore, this paper appliesCat mapping to comply with GCDCM, in order to furtherstrengthen the swarm diversity and reduce the convergencetime.

2) The procedure of GCDCM is as follows:

Step 1: Set $i = 1$.

Step 2: Employ the Cat mapping function to generate pop_size chaotic variables with different trajectories, $x_i = \{x_{i1}, x_{i2}, \cdots, x_{iQ}\}$, $i = 1, 2, \ldots,$ pop_size.

Step 3: Set $j = 1$.

Step 4: According to Eq. (10), map all the components of $x_i$ to the value interval $[g_{j\text{-}\min}, g_{j\text{-}\max}]$ to obtain $g_i$,

$$
g_i = g_{j\text{-}\min} + (g_{j\text{-}\max} - g_{j\text{-}\min})\, x_i
\tag{10}
$$

where $g_{j\text{-}\min}$ and $g_{j\text{-}\max}$ are, respectively, the minimum value and the maximum value of the $j$th vector of $g_i$, $j = 1, 2, \ldots, Q$.

Step 5: If $j \ge Q$, go to step 7; otherwise, go to step 6.

Step 6: Set $j = j + 1$, go to step 4.

Step 7: If $i \ge$ pop_size, go to step 9; otherwise, go to step 8.

Step 8: Set $i = i + 1$, go to step 2.

Step 9: Calculate the fitness values of the particles, mix them with the particle swarm from SA and sort them according to the fitness value. Then, select the pop_size best-ranked particles as the new modified particle swarm.
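The GCDCM procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the choice of one Cat-mapping trajectory per particle, and the start points passed in `seeds` are all hypothetical, and fitness is treated as an error to be minimized (as with MAPE).

```python
def cat_map(x, y):
    """One iteration of the two-dimensional Cat mapping, Eq. (9)."""
    return (x + y) % 1.0, (x + 2.0 * y) % 1.0


def chaotic_sequence(x0, y0, length):
    """Collect `length` x-values along one Cat-mapping trajectory."""
    xs, x, y = [], x0, y0
    for _ in range(length):
        x, y = cat_map(x, y)
        xs.append(x)
    return xs


def gcdcm_candidates(pop_size, bounds, seeds):
    """Steps 1-8: build pop_size chaotic particles inside the given intervals.

    bounds: list of (g_min, g_max), one pair per component j = 1..Q.
    seeds:  assumed list of (x0, y0) start points, one per particle.
    """
    q = len(bounds)
    swarm = []
    for i in range(pop_size):
        xs = chaotic_sequence(seeds[i][0], seeds[i][1], q)
        # Eq. (10): map each chaotic variable into its value interval
        swarm.append([lo + (hi - lo) * x for (lo, hi), x in zip(bounds, xs)])
    return swarm


def select_best(chaotic_swarm, sa_swarm, fitness, pop_size):
    """Step 9: mix both swarms and keep the pop_size particles with lowest error."""
    return sorted(chaotic_swarm + sa_swarm, key=fitness)[:pop_size]
```

For example, calling `gcdcm_candidates(5, [(0.01, 1000.0), (0.01, 1.0), (0.1, 1000.0)], seeds)` with five start points yields five candidate (C, v, δ) triples inside the parameter ranges used in Section 4.3.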

2.2.2. Implementation structure of the CSAPSO algorithm

This study proposes a hybrid CSAPSO algorithm that uses SA to escape from local minima; in addition, it employs Cat mapping to implement global chaotic disturbance, further strengthening the global exploration capability and speeding up the convergence rate. The proposed CSAPSO algorithm consists of the standard PSO part, the SA part and the GCDCM part. The standard PSO part evaluates the initial particle swarm and uses four basic PSO operators (historical extremum updating, global extremum updating, speed updating and location updating) to obtain a new particle swarm (best particle). Then, each generation of standard PSO is delivered to SA for further processing. After all the SA processes are completed, the modified individuals are delivered to GCDCM for global chaotic disturbance. After the GCDCM processes are completed, the new modified individuals are sent back to standard PSO for the next generation. These evolution iterations continue until the termination condition of the algorithm is reached. The procedure of the proposed CSAPSO is illustrated as follows, and the flowchart is shown in Fig. 2.
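The speed and location updating operators mentioned above can be illustrated with the textbook PSO update. The sketch below assumes that standard form (this passage does not restate the update equations), uses the acceleration values c1 = c2 = 2.0 reported in Section 4.3, and takes the random factors `r1` and `r2` as explicit arguments so that the result is reproducible; the function name is hypothetical.

```python
def pso_step(position, velocity, personal_best, global_best,
             w, c1=2.0, c2=2.0, r1=0.5, r2=0.5):
    """One textbook PSO speed and location update for a single particle.

    w is the inertia weight; r1 and r2 would normally be drawn from rand(0, 1).
    """
    new_velocity = [
        w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        for x, v, pb, gb in zip(position, velocity, personal_best, global_best)
    ]
    new_position = [x + nv for x, nv in zip(position, new_velocity)]
    return new_position, new_velocity
```

In a full implementation the new position would also be clipped to the feasible interval of each parameter before the fitness is evaluated.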

2.2.3. Parameters determination by CSAPSO

The PSO and SA employ the mean absolute percentage error (MAPE, expressed as Eq. (11)) of the regression values as the fitness function:

$$
\mathrm{Fitness} = \mathrm{MAPE}(\%) = \frac{100}{N} \sum_{i=1}^{N} \left| \frac{\hat{f}_i(x) - f_i(x)}{f_i(x)} \right|
\tag{11}
$$

where $N$ is the number of forecasting samples, $f_i(x)$ is the actual value of the $i$th period, and $\hat{f}_i(x)$ is the forecasting value of the $i$th period.

The proposed CSAPSO algorithm is applied to optimize the parameter combination of the RSVR model, and the procedure is illustrated as follows:

Step 1: Set population parameters. Set the size of the particle swarm, pop_size; the acceleration parameters, $c_1$ and $c_2$; the maximum evolution generation, $g_{\max}$; the population distribution coefficient, pop_distr; the initial temperature, $T_0$; the thermal equilibrium temperature, $T$; and the maximum inner iteration, $T_{\max}$.

Step 2: Generate initial population. Each particle, $X_k(i)$, $k = (C, v, \delta)$, $i = 1, 2, \ldots,$ pop_size, has three components, representing the three parameters ($C$, $v$, and $\delta$). Adopt rand(0,1) and Eq. (12) to generate pop_size particles in the feasible intervals $(Min_k, Max_k)$. In addition, randomly initialize the speed and update the individual optimal position, $P_i^G$, and the global optimal position, $P_g^G$. Set $g = 1$.

$$
X_k(i) = Min_k + x_k(i)\,(Max_k - Min_k), \quad k = (C, v, \delta)
\tag{12}
$$

Step 3: Evaluate fitness. Set each particle $X_k(i)$, $k = (C, v, \delta)$, $i = 1, 2, \ldots,$ pop_size, as the parameter combination of the RSVR model, train the RSVR model, calculate the fitness function value, then evaluate the fitness (forecasting errors) of each particle.

Step 4: If the current swarm satisfies the stopping criterion (i.e., the number of generations is equal to the maximum evolution generation), go to step 12; otherwise, go to step 5.

Step 5: Calculate the adaptive inertia weight factor $\varpi$ according to Eq. (13). Update the speed and position of each particle in the particle swarm. Then, calculate the fitness values of all the particles, and update the individual optimal position $P_i^G$ and the global optimal position $P_g^G$ of the pop_size particles. Set $g = g + 1$, go to step 6.

$$
\varpi = \varpi_{\max} - g \times \frac{\varpi_{\max} - \varpi_{\min}}{gen}
\tag{13}
$$

where $\varpi$ is the updated inertia weight, and $\varpi_{\min}$ and $\varpi_{\max}$ are the minimum value and the maximum value of the inertia weight, respectively. In general, they are set as 0.4 and 0.9, correspondingly.

Step 6: Provisional state. Receive the values of the three parameters from PSO, and make a random move to change the existing system state to a provisional state. Another set of three positive parameters is generated in this stage.

Step 7: Metropolis criterion tests. The acceptance or rejection of the provisional state is determined by the following Metropolis criterion equation [44]. If the provisional state is accepted, then the provisional state is set as the current state; otherwise, return to step 6.

$$
\begin{cases}
\text{If } E(S_n) \le E(S_o), & \text{the provisional state is accepted;} \\
\text{If } E(S_n) > E(S_o) \text{ and } p < P(\text{accept } S_n),\ 0 \le p \le 1, & \text{the provisional state is accepted;} \\
\text{Otherwise,} & \text{the provisional state is rejected.}
\end{cases}
\tag{14}
$$

where $p$ is a random number that determines the acceptance of the provisional state, and $P(\text{accept } S_n)$, the probability of accepting the new state, is given by the following probability function:

$$
P(\text{accept } S_n) = \exp\left( \frac{E(S_o) - E(S_n)}{kT} \right)
\tag{15}
$$

where $T$ is the thermal equilibrium temperature, and $k$ represents the Boltzmann constant.

Step 8: Incumbent solutions. If the provisional state is accepted and the current state is superior to the system state, then set the current state as the new system state; otherwise, return to step 6 until the maximum number of loops ($N_s$) is reached. The investigation [65] demonstrated that $N_s$ should be $100d$ to avoid infinitely repeated loops, where $d$ denotes the problem dimension. In this paper, three parameters ($C$, $v$, and $\delta$) are used to determine the system states; thus, $N_s$ is set to 300.

Step 9: Temperature reduction. After the new system state is obtained, reduce the temperature. The temperature reduction is implemented by the following equation:

$$
N_t = C_t\, \rho
\tag{16}
$$

where $0 < \rho < 1$, $N_t$ is the updated temperature, $C_t$ is the current temperature, and $\rho$ is set at 0.9 [66].

Step 10: If the pre-determined temperature is reached, then stop the SA algorithm and set the latest state as the approximate optimal solution, go to step 11. Otherwise, go to step 6.

[Fig. 2 flowchart: the PSO part (initialize the parameters, velocity and position of the particle swarm; evaluate fitness; update the historical extremum and the global extremum; update the velocity and position of the particles; test g > gmax), the SA part (start with the best individual obtained from PSO; random move to a provisional state; Metropolis criterion test; temperature reduction until the pre-determined temperature is reached) and the GCDCM part (use Cat mapping to generate a new particle swarm; mix it with the particle swarm from SA; obtain the new modified particle swarm by sorting the fitness values).]

Fig. 2. CSAPSO algorithm flowchart.


Step 11: Global chaotic disturbance. Start GCDCM with the improved individuals obtained from SA. Adopt the GCDCM module to complete the global chaotic disturbance. Send the modified particle swarm from GCDCM back to standard PSO, and continue the PSO operation. Go to step 4.

Step 12: Output optimization results. The global optimal position, $P_g^G = (C_{best}, v_{best}, \delta_{best})$, is presented as the three parameters ($C$, $v$, and $\delta$) of the RSVR model.
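The inertia weight of Eq. (13) and the SA stage (steps 6 to 10) can be sketched as follows. This is an illustrative reading of the steps, not the authors' code: the helper names, the `energy` and `neighbor` callables, and the scaling k = 1.0 are assumptions, and the inner loop stops early once an accepted state improves on the incumbent.

```python
import math
import random


def inertia_weight(g, gen, w_min=0.4, w_max=0.9):
    """Linearly decreasing inertia weight, Eq. (13)."""
    return w_max - g * (w_max - w_min) / gen


def accept_probability(e_old, e_new, temperature, k=1.0):
    """Eq. (15); k = 1.0 is an assumed scaling, not a value from the text."""
    return math.exp((e_old - e_new) / (k * temperature))


def metropolis_accept(e_old, e_new, temperature, p):
    """Metropolis criterion, Eq. (14); p is a random number in [0, 1]."""
    if e_new <= e_old:
        return True
    return p < accept_probability(e_old, e_new, temperature)


def cool(current_temperature, rho=0.9):
    """Temperature reduction, Eq. (16)."""
    return current_temperature * rho


def anneal(state, energy, neighbor, t0, t_end, n_s=300, rng=random):
    """SA stage around hypothetical `energy` and `neighbor` callables."""
    current, e_cur = state, energy(state)
    best, e_best = current, e_cur
    t = t0
    while t > t_end:                      # step 10: stop at the end temperature
        for _ in range(n_s):              # step 8: at most N_s provisional moves
            cand = neighbor(current, rng)     # step 6: provisional state
            e_cand = energy(cand)
            if metropolis_accept(e_cur, e_cand, t, rng.random()):  # step 7
                current, e_cur = cand, e_cand
                if e_cur < e_best:        # incumbent (system state) improved
                    best, e_best = current, e_cur
                    break
        t = cool(t)                       # step 9
    return best, e_best
```

In the forecasting scheme, `energy` would train the RSVR model with a candidate (C, v, δ) and return its MAPE, which is by far the most expensive part of each iteration.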

2.3. Multivariable adaptive regression splines

MARS is a nonlinear, nonparametric regression method. It was proposed by the statistician Jerry Friedman [67] in 1991, and its application software was developed by Salford Systems. The modeling procedure of MARS is based on a divide-and-conquer strategy: the training dataset is partitioned into separate regions, each of which obtains its own regression equation. The non-linearity of a model is approximated using separate linear regression slopes in distinct intervals of the independent variable space. Therefore, the slope of the regression line is allowed to change from one interval to the other as the two 'knot' points are crossed [52]. The general MARS function can be represented as the following equation [51]:

$$
\hat{f}(x) = a_0 + \sum_{m=1}^{M} a_m \prod_{k=1}^{K_m} \left[ s_{km} \left( x(k,m) - t_{km} \right) \right]
\tag{17}
$$

where $a_0$ and $a_m$ are parameters, $M$ is the number of basis functions, $K_m$ is the number of knots, $s_{km}$ takes on values of either 1 or -1 and indicates the right/left sense of the associated step function, $x(k,m)$ is the label of the independent variable, and $t_{km}$ indicates the knot location.

MARS constructs a very large number of basis functions to build the optimal MARS model, but some of the selected basis functions initially over-fit the data; therefore, these basis functions are deleted in order of least contribution according to the generalized cross-validation (GCV) criterion. The change in the GCV when a basis function is removed from the model is used to assess the importance of that basis function. This process continues until the remaining basis functions all satisfy the pre-determined requirements.

The GCV can be expressed as follows [51]:

$$
GCV(M) = \frac{1}{N} \sum_{i=1}^{N} \frac{\left[ y_i - \hat{f}_M(x_i) \right]^2}{\left[ 1 - \dfrac{C(M)}{N} \right]^2}
\tag{18}
$$

where there are $N$ observations, and $C(M)$ is the cost-penalty measure of a model containing $M$ basis functions. Therefore, the numerator measures the lack of fit of the $M$ basis function model $\hat{f}_M(x_i)$ and the denominator denotes the penalty for model complexity $C(M)$. That is, the purpose of $C(M)$ is to penalize model complexity, to avoid over-fitting, and to promote the parsimony of models. To do so, $C(M)$ introduces a cost incurred per basis function included in the model, much like the adjusted $R^2$ in least-squares regression. It is usually defined as $C(M) = M$ in linear least-squares regression, and this definition is used in this paper.
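Eq. (18) is straightforward to compute once the fitted values are available. The following sketch is only an illustration of the formula; the function name and the guard on the cost term are assumptions, not part of ARESLab or of the paper.

```python
def gcv(y_true, y_pred, cost):
    """Generalized cross-validation score of Eq. (18).

    cost is the complexity penalty C(M) and must be smaller than N.
    """
    n = len(y_true)
    if not 0 <= cost < n:
        raise ValueError("C(M) must satisfy 0 <= C(M) < N")
    mean_sq_error = sum((y - f) ** 2 for y, f in zip(y_true, y_pred)) / n
    # The penalty term is the same for every observation, so it factors out.
    return mean_sq_error / (1.0 - cost / n) ** 2
```

With `cost = 0` the score reduces to the ordinary mean squared error; as C(M) approaches N the denominator shrinks and the score grows, which is how over-parameterized models are penalized.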

After creating a MARS model, the contribution to the fit of the model is used to estimate the relative importance of a variable. Because each variable can be added to the basis functions, the importance of a variable can be estimated from the degree of fit of the model. The GCV criterion is used to evaluate the importance of the variable. To implement the selection process, MARS removes one variable at a time, keeps the other variables, refits the model and then calculates the reduction in fit. The highest scoring, and most important, variable is the one whose deletion most reduces the fit of the model, and it is assigned a score of 100. Less important variables receive lower scores, in proportion to the reduction in fit of the model that their deletion causes. More details regarding the MARS model are available in the literature [51].

[Fig. 3 flowchart: Stage I, selecting the candidate input variables (GDP, TIFA, TIE, ..., PYPT, P2YMA, P3YMA); Stage II, using Min-Max normalization to process the data; Stage III, determining the final input vectors (FIV 1, FIV 2, ..., FIV n) of the RSVR by MARS to obtain the MRSVR model; Stage IV, CSAPSO algorithm (PSO, SA, Cat mapping) to obtain Cbest, vbest, δbest of the MRSVR model; Stage V, MRSVR model.]

Fig. 3. The flowchart of the MRSVR-CSAPSO port throughput forecasting scheme.


MARS is a flexible procedure with the ability to reveal optimal variable transformations and interactions, as well as the complex data structure that often hides in high-dimensional data [51], which makes MARS particularly suitable for problems with high input dimensions.

Considering that port throughput is influenced by numerous factors (the candidate input variables) and that the RSVR cannot determine the final input vectors from the candidate input variables, this study adopts MARS to select the final input vectors for RSVR and to analyze the significance of the different factors for further research on the port throughput generation mechanism.

3. The proposed port throughput forecasting scheme

The MARS, RSVR and CSAPSO are integrated to establish a forecasting scheme for port throughput. The flowchart of the proposed MRSVR-CSAPSO port throughput forecasting scheme is presented in Fig. 3. As shown in Fig. 3, the proposed port throughput forecasting scheme consists of five stages.

The first stage of the proposed MRSVR-CSAPSO port throughput forecasting scheme is selecting the candidate input variables. Because the forecasting year's port throughput is strongly influenced by numerous socio-economic factors, such as GDP (x1), TIFA (x2), TIE (x3), IO (x4), FIV (x5), SIV (x6), TIV (x7), PM (x8), TRSCG (x9), FV (x10), HFV (x11) and RFV (x12) [11,68], this study selects these 12 socio-economic factors as candidate input variables. Meanwhile, considering that the previous years' port throughput also affects the forecasting year's port throughput, this paper also selects five historical factors, PTYPT1 (x13), PTYPT (x14), PYPT (x15), P2YMA (x16) and P3YMA (x17), as candidate input variables. In total, 17 candidate input variables for port throughput forecasting are obtained.

In the second stage, the original datasets are processed by the Min–Max normalization method. The original datasets employed in this paper include a variety of measurements reflecting a wide range of units. The magnitude of the absolute value can vary significantly between variables. Data normalization prevents predictor variables with large values from overwhelming predictor variables with small values, which can improve the forecasting performance [69,70]. Thus, in this paper, the Min–Max normalization method is used to standardize the original datasets.

The Min–Max normalization method converts variable $X$ to $x$ in the range $[-1.0, 1.0]$ using the following equation:

$$
x = -1 + \frac{2\,(X - X_{\min})}{X_{\max} - X_{\min}}
\tag{19}
$$

where $X_{\max}$ and $X_{\min}$ are the maximum and minimum values, respectively, of variable $X$.
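As a small illustration of Eq. (19), the sketch below normalizes one variable at a time. The function name is hypothetical, and taking the minimum and maximum over the values passed in is an assumption, since the text does not say whether they are computed on the training sample only or on the full series.

```python
def min_max_normalize(values):
    """Scale a list of raw values to [-1.0, 1.0] following Eq. (19)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("a constant variable cannot be normalized")
    return [-1.0 + 2.0 * (v - x_min) / (x_max - x_min) for v in values]
```

Applied to a GDP column whose smallest and largest entries are 272.81 and 21602.12, those two years map to -1.0 and 1.0 and every other year falls in between.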

After determining the candidate input variables and normalizing the original datasets, the third stage is to obtain the final input vectors for RSVR. In this stage, MARS is employed to select the final input vectors for the RSVR model, and then the MARS-RSVR model (namely MRSVR) is obtained. MARS is implemented with the Matlab toolbox based on the MARS algorithm (namely ARESLab), which can be downloaded from http://www.cs.rtu.lv/jekabsons/regression.html.

The key step of training the MRSVR is to set appropriate parameters. In the fourth stage, the CSAPSO algorithm is developed to search for the appropriate combination of three parameters (Cbest, vbest, and δbest) for the MRSVR model. For details about the procedure of parameter determination, please refer to Section 2.2.3.

In the final stage, the trained MRSVR model with the appropriate parameters (Cbest, vbest, and δbest) is applied to forecast the future value of port throughput.

4. Empirical study

4.1. Datasets and performance criteria

To evaluate the performance of the proposed port throughput forecasting scheme, the port throughput and the corresponding socio-economic indicator data, obtained from the yearbook of Shanghai and the yearbook of China ports, are used in this study. The data collection period for the socio-economic indicators and the corresponding port throughput data is from 1978 to 2013. There are 36 data points in total in the dataset.

The original data employed in this example are shown in Tables 1 and 2. The data are divided into two datasets: the first 29 data points are used as the training sample, while the remaining 7 data points are employed as the testing sample for measuring the forecasting performance of the proposed model.

The prediction results of the proposed MRSVR-CSAPSO port throughput forecasting scheme are compared to those of the RSVR-CSAPSO model (for modeling the RSVR-CSAPSO model, all the 17 candidate input variables are used as the final input vectors), the MARS-SVR-CSAPSO model (hereinafter simply called MSVR-CSAPSO) with ε-loss function in SVR, the MARS-RSVR-PSO model (hereinafter simply called MRSVR-PSO) and the MARS-RSVR-SAPSO model (determining parameters by SAPSO [47], hereinafter simply called MRSVR-SAPSO). As the BP neural network is a well-known forecasting method, the MARS-BPNN (hereinafter simply called MBPNN) [13] model is also employed to forecast the port throughput in this study as a comparison model. Besides, to fully assess the performance of the proposed port throughput forecasting scheme, the ARIMA model is also employed as a comparison model.

Table 1
The data of port throughput and the corresponding socio-economic indicators of Shanghai.
Units: GDP, TLFV, IO, FIV and SIV in billion yuan; TAE in billion dollars.

Year  GDP       TLFV      TAE       IO        FIV       SIV
1978  272.81    27.91     30.26     514.01    11.00     211.05
1979  286.43    35.58     38.78     556.30    11.39     221.21
1980  311.89    45.43     45.06     598.75    10.10     236.10
1981  324.76    54.60     41.50     620.12    10.58     244.34
1982  337.07    71.34     38.93     634.65    13.31     249.32
1983  351.81    75.94     41.40     663.53    13.52     255.32
1984  390.85    92.30     44.00     728.12    17.26     275.37
1985  466.75    118.56    51.74     862.73    19.53     325.63
1986  490.83    146.93    52.04     952.21    19.69     336.02
1987  545.46    186.30    59.96     1073.84   21.60     364.38
1988  648.30    245.27    72.45     1304.66   27.36     433.05
1989  696.54    214.76    78.48     1524.67   29.63     466.18
1990  781.66    227.08    74.31     1642.75   34.24     505.60
1991  893.77    258.30    80.44     1947.18   34.06     550.64
1992  1114.32   357.38    97.57     2429.96   34.16     677.39
1993  1519.23   653.91    127.32    3327.04   37.82     902.38
1994  1990.86   1123.29   158.67    4255.19   47.61     1148.45
1995  2499.43   1601.79   190.25    4547.47   59.82     1419.41
1996  2957.55   1952.05   222.63    5126.22   68.72     1596.73
1997  3438.79   1977.59   247.64    5649.93   72.03     1744.02
1998  3801.09   1964.83   313.44    5763.67   73.84     1871.89
1999  4188.73   1856.72   386.04    6213.24   74.49     1984.64
2000  4771.17   1869.67   547.10    7022.98   76.68     2207.63
2001  5210.12   1994.73   608.98    7806.18   78.00     2403.18
2002  5741.03   2187.06   726.64    8730.00   79.68     2622.45
2003  6694.23   2452.11   1123.97   11708.49  81.02     3209.02
2004  8072.83   3084.66   1600.26   14595.29  83.45     3892.12
2005  9247.66   3542.55   1863.65   16876.78  90.26     4381.20
2006  10572.24  3925.09   2274.89   19631.23  93.81     4969.95
2007  12494.01  4458.61   2829.73   23108.63  101.84    5571.06
2008  14069.87  4829.45   3221.38   25968.38  111.80    6085.84
2009  15046.45  5273.33   2777.31   24888.08  113.82    6001.78
2010  17165.98  5317.67   3688.69   31038.57  114.15    7218.32
2011  19195.69  5067.09   4374.36   33834.44  124.94    7927.89
2012  20181.72  5254.38   4367.58   33186.41  127.8     7854.77
2013  21602.12  5647.79   4413.98   33899.38  129.28    8027.77

The mean absolute deviation (MAD), root mean square error (RMSE), mean absolute percentage error (MAPE) and root mean square percentage error (RMSPE) are employed to evaluate the prediction performance of the proposed port throughput forecasting model. These four performance criteria are measures of the deviation between actual and predicted values. The smaller the values of these criteria, the more accurate the predicted results. The definitions of these criteria are as follows:

$$
\mathrm{MAD} = \frac{1}{N} \sum_{i=1}^{N} \left| f_i^{*}(x) - f_i(x) \right|
\tag{20}
$$

$$
\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( f_i^{*}(x) - f_i(x) \right)^2 }
\tag{21}
$$

$$
\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{f_i^{*}(x) - f_i(x)}{f_i(x)} \right|
\tag{22}
$$

$$
\mathrm{RMSPE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( \frac{f_i^{*}(x) - f_i(x)}{f_i(x)} \right)^2 }
\tag{23}
$$

where $f_i^{*}(x)$ and $f_i(x)$ represent the predicted and actual values of the $i$th point, respectively (so that, as in Eq. (11), the actual value appears in the denominator), and $N$ is the total number of data points.
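The four criteria can be checked against the published forecasts. The sketch below applies Eqs. (20) to (23) to the actual values and the MRSVR-CSAPSO predictions for 2007 to 2013 as printed in Table 4; the function names are illustrative, the percentage criteria are multiplied by 100 to match the units of Table 5, and the actual value is assumed to be the denominator. Because Table 4 lists rounded forecasts, the results agree with Table 5 only approximately.

```python
import math


def mad(actual, predicted):
    """Mean absolute deviation, Eq. (20)."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, Eq. (21)."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, Eq. (22), expressed in percent."""
    return 100.0 * sum(abs((p - a) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmspe(actual, predicted):
    """Root mean square percentage error, Eq. (23), expressed in percent."""
    return 100.0 * math.sqrt(
        sum(((p - a) / a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Values for 2007-2013 copied from Table 4 (unit: 10^4 T).
actual = [56144, 58170, 59205, 65339, 72758, 73559, 77574]
mrsvr_csapso = [55628, 58763, 58782, 66138, 73648, 72883, 78444]

for name, metric in (("MAD", mad), ("RMSE", rmse), ("MAPE", mape), ("RMSPE", rmspe)):
    print(f"{name}: {metric(actual, mrsvr_csapso):.2f}")
```

The same four functions can be applied to any other row of Table 4 to compare the competing models on an equal footing.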

Numerical experiments were implemented in Matlab 7.1 on a personal computer (PC) with a 1.80 GHz Core(TM)2 CPU and 2.0 GB of memory running Microsoft Windows XP Professional.

4.2. Selecting the final input vectors by MARS

This study employs the MARS variable screening mechanism to analyze the 17 factors of Shanghai in order to select the final input vectors for RSVR.

In constructing the MARS model, the maximum number of basis functions is set as 21, the maximum number of intersection functions is set as 10, and the remaining parameters are determined by the aresparams(·) function of ARESLab. The calculated result of the MARS model is shown in Table 3.

The calculation result of the MARS model shows that, among the 17 candidate factors, SIV (x6) and PYPT (x15) play crucial roles in building the MARS model; that is, SIV (x6) and PYPT (x15) are the predominant influencing factors on port throughput. The reason is twofold. First, the port is the distribution center of raw materials, energy and products for import and export; as a result, most of the port throughput is in the form of raw materials, energy and products. The consumption of raw materials and energy, as well as production, belong to the category of secondary industry (SIV); thus, the development level of the secondary industry largely determines the scale of port throughput. Second, as the base of the prediction year's port throughput, the PYPT (previous year's port throughput) largely affects the level of the prediction year's port throughput.

Consequently, SIV (x6) and PYPT (x15) are selected as the final input vectors for RSVR. Taking the port throughput forecast for 2008 as an example, the final input vectors are composed of x6 (SIV) and x15 (PYPT) of 2008.

Table 2
The data of port throughput and the corresponding socio-economic indicators of Shanghai.
Units: TIV and TRSCG in billion yuan; PM in million people; FV, HFV, RFV and PT in million tons.

Year  TIV       PM        TRSCG     FV      HFV     RFV     PT
1978  50.76     1098.28   54.10     19645   7184    4329    7955
1979  53.83     1132.14   68.28     19841   7234    4406    8219
1980  65.69     1146.52   80.43     20037   7284    4484    8483
1981  69.84     1162.84   88.73     20878   7670    4599    9044
1982  74.44     1180.51   89.80     21719   8056    4714    9605
1983  82.97     1194.01   100.68    22560   8442    4829    10166
1984  98.22     1204.78   123.72    23401   8828    4944    10727
1985  121.59    1216.69   173.39    24243   9216    5059    11291
1986  135.12    1232.33   196.84    23964   9113    4299    11825
1987  159.48    1249.51   225.25    23685   9017    3539    12359
1988  187.89    1262.42   295.83    23406   8912    2779    12893
1989  200.73    1276.45   331.38    23127   8808    2019    13427
1990  241.82    1283.35   333.86    22848   8714    1257    13959
1991  309.07    1287.20   382.06    22785   8225    1280    14481
1992  402.77    1289.37   464.82    22722   7737    1303    15003
1993  579.03    1294.74   675.92    22659   7249    1326    15525
1994  794.80    1298.81   834.76    22596   6761    1349    16047
1995  1020.20   1301.37   1050.96   22531   6273    1376    16567
1996  1292.11   1304.43   1258.00   40928   25023   1320    16401
1997  1592.74   1305.46   1435.38   41373   25991   1252    16397
1998  1855.36   1306.58   1593.27   42090   26351   1152    16387
1999  2129.60   1313.12   1722.33   44485   27171   997     18641
2000  2486.86   1321.63   1865.28   47954   28369   1055    20440
2001  2728.94   1327.14   2016.37   49545   28869   1080    22099
2002  3038.90   1344.23   2203.89   54196   29759   1131    26384
2003  3404.19   1341.77   2404.45   58669   30678   1208    36621
2004  4097.26   1352.39   2656.91   63180   31554   1284    37897
2005  4776.20   1360.26   2979.50   68741   32684   1278    44317
2006  5508.48   1368.08   3375.20   72617   33799   1233    53748
2007  6821.11   1378.86   3873.30   78108   35634   1143    56144
2008  7872.23   1391.04   4457.23   84347   40328   985     58170
2009  8930.85   1400.70   5173.24   76967   37745   941     59205
2010  9833.51   1412.31   6070.50   81024   40890   959     65339
2011  11142.86  1419.36   6814.80   93318   42685   888     72758
2012  12199.15  1426.93   7412.30   94376   42911   825     73559
2013  13445.07  1425.14   8019.05   91535   43809   694     77574

Table 3
Result of MARS model.

Selected variable                             Importance score  GCV
x6   SIV   Second industry value              100.00            0.774
x15  PYPT  Previous year's port throughput    70.58             0.223

Basis functions:
BF1 = max(0, x6 + 0.47456)
BF2 = max(0, x15 + 0.411)
BF3 = max(0, -0.411 - x15)

MARS function: y = -0.49692 + 0.67143 × BF1 + 0.3571 × BF2 - 0.82445 × BF3

Note: Type: piecewise-linear. GCV: 0.011. Total number of basis functions: 4. Total effective number of parameters: 8.5. Execution time: 0.22 s.
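The fitted MARS function reported in Table 3 is a sum of hinge functions and can be evaluated directly. The sketch below assumes that x6 and x15 are the Min–Max normalized inputs in [-1, 1] and that y is the normalized port throughput; the function names are illustrative.

```python
def hinge(value):
    """Piecewise-linear basis: max(0, value)."""
    return max(0.0, value)


def mars_predict(x6, x15):
    """Evaluate the MARS function of Table 3 on normalized SIV (x6) and PYPT (x15)."""
    bf1 = hinge(x6 + 0.47456)
    bf2 = hinge(x15 + 0.411)
    bf3 = hinge(-0.411 - x15)
    return -0.49692 + 0.67143 * bf1 + 0.3571 * bf2 - 0.82445 * bf3
```

At the two knots (x6 = -0.47456, x15 = -0.411) every basis function is zero and the prediction equals the intercept; moving x15 below its knot switches BF3 on and BF2 off, which is how the slope changes across a knot.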

J. Geng et al. / Neurocomputing ∎ (∎∎∎∎) ∎∎∎–∎∎∎ 9

Please cite this article as: J. Geng, et al., Port throughput forecasting by MARS-RSVR with chaotic simulated annealing particle swarmoptimization algorithm, Neurocomputing (2014), http://dx.doi.org/10.1016/j.neucom.2014.06.070i

Page 10: Port throughput forecasting by MARS-RSVR with chaotic simulated annealing particle swarm optimization algorithm

4.3. Forecasting results and discussions

To ensure the comparability of the contrast models, this study uses the same computer for each model and selects the mean absolute percentage error (MAPE) of the regression series as the fitness function. The maximum optimization time should be as similar as possible across models, because more optimization time can improve the optimization performance. Considering the randomness of the evolutionary algorithms in the optimization process, the MBPNN, RSVR-CSAPSO, MSVR-CSAPSO, MRSVR-PSO, MRSVR-SAPSO and MRSVR-CSAPSO models are each run 10 times to forecast the port throughput, and the mean optimal solution is taken as the final result. For the ARIMA model, the original port throughput data are directly used for model building, and the SAS statistical package is adopted in this study to build the ARIMA model. For the BPNN model, the network is trained by the trainlm(·) function, learngdm(·) is selected as the learning function of the weights, and the learning rate is set as 0.01. The ranges of the three parameters in the SVR-based models are set as $C \in [0.01, 1000]$, $v \in [0.01, 1]$ and $\delta \in [0.1, 1000]$. The common parameters in the PSO and CSAPSO algorithms are set as pop_size = 100, gen = 500, and c1 = c2 = 2.0.

Table 4 and Fig. 4 show the actual values and predicted values of port throughput obtained from the ARIMA, MBPNN, RSVR-CSAPSO, MSVR-CSAPSO, MRSVR-PSO, MRSVR-SAPSO and MRSVR-CSAPSO models. The four measurement criteria (MAD, RMSE, MAPE and RMSPE) of the seven models are computed and summarized in Table 5.

Table 5 clearly shows that the MAD, RMSE, MAPE and RMSPE of the proposed MRSVR-CSAPSO forecasting model are 681.06, 701.29, 1.02% and 1.03%, respectively, and are smaller than those of the six competing models in this paper (ARIMA, MBPNN, RSVR-CSAPSO, MSVR-CSAPSO, MRSVR-PSO and MRSVR-SAPSO). The superiority of the proposed MRSVR-CSAPSO forecasting model can be summarized as follows:

1) The prediction performance of the ARIMA model is the worst among the comparison models. The reason is that the future year's port throughput is influenced not only by historical values but also by numerous socio-economic factors; as a result, the forecasting performance of ARIMA, which is based only on historical values, is inferior to that of the other competing models. Therefore, in this study, selecting both the historical values and numerous socio-economic factors as the input vector is highly necessary.

2) The forecasting accuracy of the SVR-based models is superior to that of the BPNN model. This indicates that the SVR-based models overcome the deficiencies of the BPNN model, such as poor global search and a tendency to converge to local minima, and can handle nonlinearity and fluctuation problems well. Therefore, employing an SVR-based model to build the port throughput forecasting scheme is appropriate.

3) Owing to its excellent variable selection capabilities, MARS is adopted to reduce the dimension of the candidate input variables. In the forecasting process, selecting the final input vectors (x6: SIV and x15: PYPT) from the 17 candidate input variables by MARS improves the forecasting performance of MRSVR-CSAPSO. For example, in Table 5, using MARS shifts the performance of the RSVR-CSAPSO model (MAD = 1877.28, RMSE = 1917.84, MAPE = 2.84%, RMSPE = 2.88%) to the better performance of the MRSVR-CSAPSO model (MAD = 681.06, RMSE = 701.29, MAPE = 1.02%, RMSPE = 1.03%).

Table 4
Comparison of prediction results (unit: 10^4 T).

Years         2007   2008   2009   2010   2011   2012   2013
Actual        56144  58170  59205  65339  72758  73559  77574
ARIMA         59781  55257  62866  68032  66758  78324  81229
MBPNN         54183  59693  60927  64008  70217  71204  79605
RSVR-CSAPSO   58057  56683  57523  66637  70278  75856  75591
MSVR-CSAPSO   55191  59649  60334  64369  71061  74962  75271
MRSVR-PSO     57720  56944  60174  63657  74630  72009  75577
MRSVR-SAPSO   57088  57279  60020  66438  71644  74571  78760
MRSVR-CSAPSO  55628  58763  58782  66138  73648  72883  78444

[Fig. 4 plot: actual and predicted port throughput values (vertical axis "Value", 50000 to 90000) against "Years" (2007 to 2013) for Actual, ARIMA, MBPNN, RSVR-CSAPSO, MSVR-CSAPSO, MRSVR-PSO, MRSVR-SAPSO and MRSVR-CSAPSO.]

Fig. 4. Comparison result of actual and predicted values.

Table 5
Performance criteria results of seven models.

Models        Parameters (C, v, σ)  MAD      RMSE     MAPE (%)  RMSPE (%)
ARIMA         -                     3903.73  4043.47  5.89      6.03
MBPNN         -                     1923.55  1965.12  2.91      2.95
RSVR-CSAPSO   (784, 0.43, 191)      1877.28  1917.84  2.84      2.88
MSVR-CSAPSO   (513, 0.68, 367)      1419.14  1486.25  2.12      2.17
MRSVR-PSO     (25, 0.38, 912)       1553.13  1587.88  2.34      2.37
MRSVR-SAPSO   (219, 0.23, 257)      1008.97  1016.49  1.53      1.53
MRSVR-CSAPSO  (691, 0.84, 289)      681.06   701.29   1.02      1.03


4) In Table 5, the MAD, RMSE, MAPE and RMSPE values of the proposed MRSVR-CSAPSO model with the robust loss function are clearly smaller than those of the MSVR-CSAPSO model with the ε-loss function. This indicates that, first, mixed noise does exist in the historical data and socio-economic factors of the port throughput; second, the MRSVR-CSAPSO model with the robust loss function has successfully handled this mixed noise. Consequently, the proposed MRSVR-CSAPSO model obtains better performance than the MSVR-CSAPSO model in port throughput forecasting.

5) Table 5 shows that the SAPSO algorithm shifts the local solution of the MRSVR-PSO model, (C, v, σ) = (25, 0.38, 912), with performance criteria (MAD = 1553.13, RMSE = 1587.88, MAPE = 2.34% and RMSPE = 2.37%), to a better solution, (C, v, σ) = (219, 0.23, 257), of the MRSVR-SAPSO model, with performance criteria (MAD = 1008.97, RMSE = 1016.49, MAPE = 1.53% and RMSPE = 1.53%). This reveals that the SAPSO hybrid algorithm is superior to the standard PSO algorithm in determining the parameters of the MRSVR model, owing to its jump characteristic during evolution.

6) In the evolution of CSAPSO, when the SAPSO evolution stalls, mapping the three hyper-parameters (C, v, σ) from the solution space to the chaotic space by the Cat mapping enhances the diversity of the particle swarm, explores better particles and searches for a more appropriate solution. In short, introducing the Cat mapping into the SAPSO hybrid algorithm further improves the global exploration capability. For instance, in Table 5, using CSAPSO significantly improves the forecasting performance of MRSVR-CSAPSO over that of MRSVR-SAPSO, shifting the solution of the MRSVR-SAPSO model, (C, v, σ) = (219, 0.23, 257), with performance criteria (MAD = 1008.97, RMSE = 1016.49, MAPE = 1.53% and RMSPE = 1.53%), to a better solution, (C, v, σ) = (691, 0.84, 289), of the MRSVR-CSAPSO model, with performance criteria (MAD = 681.06, RMSE = 701.29, MAPE = 1.02% and RMSPE = 1.03%). This shows that the CSAPSO algorithm developed in this paper is more appropriate than PSO and SAPSO for determining the parameter combination (C, v, σ) of MRSVR.
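The two escape mechanisms discussed in items 5 and 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard Metropolis acceptance rule for the simulated annealing "jump" and the classical two-dimensional Arnold cat map for the chaotic perturbation. The function names, the pairing of neighbouring hyper-parameters as cat-map coordinates, the iteration count and the search bounds are all hypothetical; the exact Cat mapping and annealing schedule used by CSAPSO are defined in an earlier section of the paper.

```python
import math
import random


def metropolis_accept(delta, temperature, rng):
    """Decide whether to accept a candidate whose error changed by `delta` (new - old)."""
    if delta <= 0:
        return True  # an improvement is always accepted
    # A worse candidate is accepted with probability exp(-delta / T): the SA "jump".
    return rng.random() < math.exp(-delta / temperature)


def cat_map_step(x, y):
    """One iteration of the classical 2-D Arnold cat map on the unit square."""
    return (x + y) % 1.0, (x + 2.0 * y) % 1.0


def chaotic_perturb(params, bounds, n_iter=5):
    """Relocate a stalled particle by iterating the cat map on normalised coordinates.

    Hypothetical scheme: each hyper-parameter is paired with its neighbour to form
    the (x, y) seed of the map; the paper's own pairing may differ.
    """
    # Normalise every hyper-parameter to the unit interval.
    unit = [(p - lo) / (hi - lo) for p, (lo, hi) in zip(params, bounds)]
    k = len(unit)
    moved = []
    for i in range(k):
        x, y = unit[i], unit[(i + 1) % k]
        for _ in range(n_iter):
            x, y = cat_map_step(x, y)
        moved.append(x)
    # Map the chaotic values back to the original search ranges.
    return tuple(lo + u * (hi - lo) for u, (lo, hi) in zip(moved, bounds))


# Hypothetical search ranges for (C, v, sigma); not taken from the paper.
bounds = [(1.0, 1000.0), (0.01, 1.0), (1.0, 1000.0)]
stalled = (219.0, 0.23, 257.0)  # MRSVR-SAPSO solution reported in Table 5
candidate = chaotic_perturb(stalled, bounds)
print("chaotic candidate (C, v, sigma):", candidate)

rng = random.Random(0)
# Example: accepting a candidate that is worse by 50 error units at T = 100.
print("accepted worse candidate:", metropolis_accept(50.0, 100.0, rng))
```

In a full optimizer the candidate would be evaluated by the forecasting error of the resulting RSVR model and kept or rejected by the acceptance rule; here both pieces are shown in isolation.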

5. Conclusion

Port throughput forecasting is an important part of port planning and feasibility studies, and plays an indispensable role in determining the direction of port development, the scale of investment in basic facilities, berth location and port management strategy. An accurate port throughput forecast is the basis of sound decisions and planning in port management; thus, it is necessary to improve the forecasting precision of port throughput. This paper proposes an MRSVR-CSAPSO port throughput forecasting scheme, combining MARS, RSVR and CSAPSO. An empirical study based on Shanghai statistical data is used to evaluate the forecasting accuracy of the proposed scheme. The prediction results of the proposed MRSVR-CSAPSO scheme are compared with those of the ARIMA, MBPNN, RSVR-CSAPSO, MSVR-CSAPSO, MRSVR-PSO and MRSVR-SAPSO models. The empirical study indicates that the proposed MRSVR-CSAPSO scheme obtains better prediction results and outperforms the six comparison models. In short, the established scheme is a valid approach for port throughput prediction.

In future studies, more socio-economic factors, such as interactions between ports and economic policy, should be considered when determining candidate input variables. More appropriate tools for identifying the final input vectors of SVR should be explored. Other approaches based on advanced optimization algorithms and novel hybrid methods should be further studied to search for more appropriate parameter combinations for the SVR model and to obtain more accurate forecasting results of port throughput.

Acknowledgments

The work was supported by the Fundamental Research Funds for the Central Universities (HEUCF140108) and the Science and Technology Project of Western Transportation Construction of the Ministry of Communications (2014364554050).

References

[1] G.E.P. Box, G.M. Jenkins, Time Series Analysis: Forecasting and Control,Holden-Day, San Francisco, 1976.

[2] P.J. Sheldon, T. Var, Tourism forecasting: a review of empirical research,J. Forecast. 4 (1985) 183–195.

[3] S.F. Witt, C.A. Witt, Modeling and Forecasting Demand in Tourism, AcademicPress, London, 1992.

[4] F. Jiang, K. Lei, Grey prediction of port throughput based on GM(1,1,a) model,Logistics Sci-Tech. 9 (2009) 68–70.

[5] X.L. Jiang, Y.Q. WU, Y.J. Shi, Empirical study of an intermix algorithm forforecasting port throughput, China Harbour Eng. 2 (2009) 7–11.

[6] G. Chen, L. Lei, H. Zou, China's port throughput of iron ore forecast based oncubic exponential smoothing method, Econ. Res. Guide 4 (2012) 191–192,229.

[7] W.P. Zhang, Research on combined forecasting model in the containerthroughput forecasting of Ningbo port, Bull. Sci. Technol. 28 (5) (2012)133–136.

[8] N. Shang, M. Qin, Y. Wang, Z. Cui, Y. Cui, Y. Zhu, A BP neural network methodfor short-term traffic flow forecasting on crossroads, Comput. Appl. Softw.23 (2) (2006) 33–35.

[9] Y. Ye, Z. Lu, Neural network short-term traffic flow forecasting model based onparticle swarm optimization, Comput. Eng. Des. 30 (18) (2009) 4296–4299.

[10] H. Liu, C. Zhou, A. Zhu, L. Li, Multi-population genetic neural network modelfor short-term traffic flow prediction at intersections, Acta Geod. Cartogr. Sin.38 (4) (2009) 364–368.

[11] T.T. Chen, Y.Y. Chen, Port Cargo Throughput, Forecast based on BP neuralnetwork, Jisuanji Yu Xiandaihua 10 (2009) 4–5.

[12] M.L. Liu, M.H. Zhu, The port throughput forecast model based on the BP neuralnetwork, J. Syst. Sci. 20 (4) (2012) 88–91.

[13] J.P. Mu, J. Li, M. Zhang, Harbor cargo throughput forecast based on BP neuralnetwork and PCA, Tech. Method 31 (19) (2012) 79–82.

[14] J.A.K. Suykens, Nonlinear modeling and support vector machines, in: Proceed-ings of IEEE Instrumentation and Measurement Technology Conference, 2001,pp. 87–294.

[15] J. Cao, A. Cai, A robust shot transition detection method based on supportvector machine in compressed domain, Patt. Recognit. Lett. 28 (12) (2007)1534–1540.

[16] V. Vapnik, S. Golowich, A. Smola, Support vector machine for functionapproximation regression estimation and signal processing, Adv. Neural. Inf.Process. Syst. 9 (1996) 281–287.

[17] L. Cao, Support vector machine experts for time series forecasting, Neuro-computing 51 (2003) 321–339.

[18] W. Huang, Y. Nakamori, S.Y. Wang, Forecasting stock market movementdirection with support vector machine, Comput. Oper. Res. 32 (2005)2513–2522.

[19] W.M. Hung, W.C. Hong, Application of SVR with improved ant colonyoptimization algorithms in exchange rate forecasting, Control Cybern. 38(2009) 863–891.

[20] P.F. Pai, C.S. Lin, A hybrid ARIMA and support vector machines model in stockprice forecasting, Omega 33 (2005) 497–505.

[21] P.F. Pai, C.S. Lin, W.C. Hong, C.T. Chen, A hybrid support vector machineregression for exchange rate prediction, Int. J. Inf. Manag. Sci. 17 (2006) 19–32.

[22] W.C. Hong, Y. Dong, L.Y. Chen, S.Y. Wei, SVR with hybrid chaotic geneticalgorithms for tourism demand forecasting, Appl. Soft Comput. 11 (2011)1881–1890.

[23] P.F. Pai, W.C. Hong, An improved neural network model in forecasting arrivals,Ann. Tour. Res. 32 (2005) 1138–1141.

[24] W.C. Hong, Rainfall forecasting by technological machine learning models,Appl. Math. Comput. 200 (2008) 41–57.

[25] W.C. Hong, P.F. Pai, Potential assessment of the support vector regressiontechnique in rainfall forecasting, Water Resour. Manag. 21 (2007) 495–513.

[26] P.F. Pai, W.C. Hong, A recurrent support vector regression model in rainfallforecasting, Hydrol. Process. 21 (2007) 819–827.

[27] M.A. Mohandes, T.O. Halawani, S. Rehman, A.A. Hussain, Support vectormachines for wind speed prediction, Renew. Energy 29 (2004) 939–947.

[28] W.C. Hong, Traffic flow forecasting by seasonal SVR with chaotic simulatedannealing algorithm, Neurocomputing 74 (2011) 2096–2107.

J. Geng et al. / Neurocomputing ∎ (∎∎∎∎) ∎∎∎–∎∎∎ 11

Please cite this article as: J. Geng, et al., Port throughput forecasting by MARS-RSVR with chaotic simulated annealing particle swarmoptimization algorithm, Neurocomputing (2014), http://dx.doi.org/10.1016/j.neucom.2014.06.070i

Page 12: Port throughput forecasting by MARS-RSVR with chaotic simulated annealing particle swarm optimization algorithm

[29] W.C. Hong, Y. Dong, F. Zheng, S.Y. Wei, Hybrid evolutionary algorithms in aSVR traffic flow forecasting model, Appl. Math. Comput. 217 (2011)6733–6747.

[30] W.C. Hong, Y. Dong, F. Zheng, C.Y. Lai, Forecasting urban traffic flow by SVRwith continuous ACO, Appl. Math. Model. 35 (2011) 1282–1291.

[31] W.C. Hong, Hybrid evolutionary algorithms in a SVR-based electric loadforecasting model, Int. J. Electr. Power Energy Syst. 31 (2009) 409–417.

[32] W.C. Hong, Chaotic particle swarm optimization algorithm in a support vectorregression electric load forecasting model, Energy Convers. Manag. 50 (2009)105–117.

[33] W.C. Hong, Electric load forecasting by support vector model, Appl. Math.Model. 33 (2009) 2444–2454.

[34] W.C. Hong, Application of chaotic ant swarm optimization in electric loadforecasting, Energy Policy 38 (2010) 5830–5839.

[35] P.F. Pai, W.C. Hong, Forecasting regional electric load based on recurrentsupport vector machines with genetic algorithms, Electr. Power Syst. Res.74 (2005) 417–425.

[36] P.F. Pai, W.C. Hong, Support vector machines with simulated annealingalgorithms in electricity load forecasting, Energy Convers. Manag. 46 (2005)2669–2688.

[37] W.Z. Cheng, F.X. Ren f, X.C. Zhou, Application of SVM model in the predicationof Harbor handling capacity in Jiujiang, Logist. Eng. Manag. 34 (9) (2012)61–64.

[38] M.W. Li, W.C. Hong, H.G. Kang, Urban traffic flow forecasting using Gauss-SVRwith cat mapping, cloud model and PSO hybrid algorithm, Neurocomputing23 (99) (2013) 230–240.

[39] Q. Wu, H.S. Yan, Product sales forecasting model based on robust M-supportvector machine, Comput. Integr. Manuf. Syst. 15 (6) (2009) 1081–1087.

[40] P.F. Pai, W.C. Hong, Forecasting regional electric load based on recurrentsupport vector machines with genetic algorithms, Electr. Power Syst. Res. 74(3) (2005) 417–425.

[41] V. Cherkassky, Y. Ma, Practical selection of SVR parameters and noiseestimation for SVM regression, Neural Netw. 17 (2004) 113–126.

[42] F. Friedriehs, C. Igel, Evolutionary tuning of multiple SVM parameters, NeuralComput. 64 (1) (2005) 107–117.

[43] X. Wang, C. Qin, W.H. Gui, Parameters election of support vector regressionbased on hybrid optimization algorithm and its application, J. Control TheoryAppl. 3 (4) (2005) 372–376.

[44] O. Chapelle, V. Vapnik, O. Bousquet, et al., Choosing mu1tip1e parameters forsupport vector machines, Mach. Learn. 46 (1) (2002) 131–159.

[45] M.W. Chang, C.H. Lin, Leave-one-out bounds for support vector regressionmodel selection, Neural Comput. 17 (5) (2005) 1188–1222.

[46] X.T. Zhu, Power short-term load forecasting based on SA-LSSVM, Sci. Technol.Eng. (2012) 6171–6174.

[47] H.Q. Zhang, X.Y. Zhang, Steam load forecasting based on chaos theory andLSSVM, Syst. Eng.-Theory & Pract. (2013) 1058–1066.

[48] N. Lu, Y. Liu, Research on self-adaptive integrated model for short-term loadforecasting based on algorithm combination, Power Syst. Protect. Control(2012) 109–113.

[49] G. Chen, Y. Mao, C.K. Chui, A symmetric image encryption scheme based on 3Dchaotic cat maps, Chaos Solitons Fractals 21 (2004) 61–74.

[50] M.W. Li, H.G. Kang, P.H. Zhou, W.C. Hong, Hybrid optimization algorithm basedon chaos, cloud and particle swarm optimization algorithm, J. Syst. Eng.Electron. 24 (2) (2013) 324–334.

[51] J.H. Friedman, Multivariate adaptive regression splines (with discussion), Ann.Stat. 19 (1991) 1–141.

[52] C.J. Lu, Sales forecasting of computer products based on variable selectionscheme and support vector regression, Neurocomputing 128 (2014) 491–499.

[53] W. Chen, C. Ma, L. Ma, Mining the customer credit using hybrid support vectormachine technique, Expert Syst. Appl. 36 (4) (2009) 7611–7616.

[54] T.S. Lee, C.C. Chiu, Y.C. Chou, C.J. Lu, Mining the customer credit usingclassification and regression tree and multivariate adaptive regression splines,Comput. Stat. Data Anal. 50 (4) (2006) 1113–1130.

[55] W. Xiao, Q. Zhao, Q. Fei, A comparative study of data mining methods inconsumer loans credit scoring management, J. Syst. Sci. Syst. Eng. 15 (4)(2006) 419–435.

[56] M.H.F. Zarandi, M. Zarinbal, N. Ghanbari, I.B. Turksen, A new fuzzy functionsmodel tuned by hybridizing imperialist competitive algorithm and simulatedannealing, application: stock price prediction, Inf. Sci. 222 (2013) 213–228.

[57] L.J. Kao, C.C. Chiu, C.J. Lu, C.H. Chang, A hybrid approach by integratingwavelet-based feature extraction with MARS and SVR for stock index fore-casting, Decis. Support Syst. 54 (3) (2013) 1228–1244.

[58] W. Dai, Y.E. Shao, C.J. Lu, Incorporating feature selection method into supportvector regression for stock index forecasting, Neural Comput & Applic. 23(2013) 1551–1561.

[59] R. Mohanty, V. Ravi, M.R. Patra, Hybrid intelligent systems for predictingsoftware reliability, Appl. Soft Comput. 13 (1) (2013) 189–200.

[60] N. Raj Kiran, V. Ravi, Software reliability prediction by soft computingtechniques, J. Syst. Softw. 81 (4) (2008) 576–583.

[61] Y. Zhou, H. Leung, Predicting object-oriented software maintainability usingmultivariate adaptive regression splines, J. Syst. Softw. 80 (8) (2007) 1349–1361.

[62] J.R. Leathwick, D. Rowe, J. Richardson, et al., Using multivariate adaptiveregression splines to predict the distributions of New Zealand’s freshwaterdiadromous fish, Freshw. Biol. 50 (12) (2005) 2034–2052.

[63] J. Elith, J. Leathwick, Predicting species distribution from museum andherbarium records using multiresponse model fitted with multivariate adap-tive regression splines, Divers. Distrib. 13 (3) (2007) 265–275.

[64] Y. Kao, E. Zahara, A hybrid genetic algorithm and particle swarm optimizationfor multimodal functions, Applied Soft Computing 8 (2008) 849–857.

[65] S. Kirkpatrick, C.D. Gelatt, M.P. Vecchi, Optimization by simulated annealing,Science 220 (1983) 671–680.

[66] A. Dekkers, E.H.L. Aarts, Global optimization and simulated annealing, Math.Progr. 50 (1991) 367–393.

[67] J.H. Friedman, C.B. Roosen, An introduction to multivariate adaptive regressionsplines, Statist. Method Med. Res. 4 (1995) 197–217.

[68] J.H. Xu, H. Jin, Study on intrinsic factors of port throughput based on principalcomponent analysis, Port Waterw. Eng. 1 (2010) 2–5.

[69] S.F. Crone, S. Lessmann, R. Stahlbock, The impact of preprocessing on datamining: an evaluation of classifier sensitivity in direct marketing, Eur. J. Oper.Res. 173 (3) (2006) 781–800.

[70] S. Crone, J. Guajardo, R. Weber, The impact of preprocessing on support vectorregression and neural networks in time series prediction, in: Proceedings ofthe International Conference on Data Mining, DMIN'06, Las Vegas, USA, 2006,pp. 37–44.

Jing Geng was born in 1968. She received her Master’s degree in engineering from Dalian University of Technology in 2013. She is currently an associate professor in the College of Shipbuilding Engineering of Harbin Engineering University. Her research interests are port planning simulation, geographic information systems and hybrid evolutionary algorithms.

Ming-Wei Li was born in 1984. He received his Doctorate degree in Engineering from Dalian University of Technology in 2013. Since September 2013, he has been with the College of Shipbuilding Engineering of Harbin Engineering University, where he is currently a lecturer. His research interests mainly include port policy and digital applications of forecasting technology, computational intelligence and support vector forecasting.

Zhi-Hui Dong was born in 1985. She received her Master’s degree in Naval Architecture & Ocean Engineering Design and Manufacturing from Harbin Engineering University in 2010. She is working toward a Ph.D. degree at HEU. Her research interests are intelligent algorithms, modeling and simulation optimization, and logistics optimization.

Yu-Sheng Liao is a professor with the Department of Healthcare Administration, Oriental Institute of Technology, Taiwan. He received his PhD in Business Administration from National Taiwan University in 1988. His research interests are hybrid evolutionary algorithms, grey theory, and computational intelligence.
