Statistical Downscaling of Temperature with the Random...

12
Research Article Statistical Downscaling of Temperature with the Random Forest Model Bo Pang, 1,2 Jiajia Yue, 1,2 Gang Zhao, 1,2 and Zongxue Xu 1,2 1 College of Water Sciences, Beijing Normal University, Beijing 100875, China 2 Beijing Key Laboratory of Urban Hydrological Cycle and Sponge City Technology, Beijing 100875, China Correspondence should be addressed to Gang Zhao; [email protected] Received 20 December 2016; Revised 25 March 2017; Accepted 17 May 2017; Published 15 June 2017 Academic Editor: Jorge E. Gonzalez Copyright © 2017 Bo Pang et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. e issues with downscaling the outputs of a global climate model (GCM) to a regional scale that are appropriate to hydrological impact studies are investigated using the random forest (RF) model, which has been shown to be superior for large dataset analysis and variable importance evaluation. e RF is proposed for downscaling daily mean temperature in the Pearl River basin in southern China. Four downscaling models were developed and validated by using the observed temperature series from 61 national stations and large-scale predictor variables derived from the National Center for Environmental Prediction–National Center for Atmospheric Research reanalysis dataset. e proposed RF downscaling model was compared to multiple linear regression, artificial neural network, and support vector machine models. Principal component analysis (PCA) and partial correlation analysis (PAR) were used in the predictor selection for the other models for a comprehensive study. It was shown that the model efficiency of the RF model was higher than that of the other models according to five selected criteria. By evaluating the predictor importance, the RF could choose the best predictor combination without using PCA and PAR. e results indicate that the RF is a feasible tool for the statistical downscaling of temperature. 1. Introduction Global climate models (GCMs) are considered the most cred- ible tools for the projection of future global climate change [1]. However, there is a general mismatch between the spatial and temporal resolution of the GCM output and regional scale climate change impact studies. Various techniques have been developed to downscale GCM outputs to finer scales. ese methods are widely divided into dynamic (physical) and statistical (empirical) downscaling [2, 3]. Because of the complexity in modeling and computing dynamic downscal- ing, statistical downscaling techniques have been used widely in climate change studies due to their simplicity and ease of implementation. Statistical downscaling techniques can be divided into three categories: weather typing, weather generators, and re- gression-based methods. Various models have been devel- oped and applied in the downscaling of temperature, like linear regression [4, 5], canonical correlation analysis (CCA) [6, 7], artificial neural networks (ANN) [8], support vector machines (SVM) [9], and so forth. Some comparison stud- ies have been made in the past. Schoof and Pryor [10] demonstrated that ANN models give better estimates than multiple linear regression (MLR) models for daily temper- ature downscaling at Indianapolis. Kostopoulou et al. [11] demonstrated that MLR and CCA are superior to ANN in the simulation of minimum and maximum temperatures over Greece. Duhan and Pandey [12] compared MLR, ANN, and the least square support vector machine (LS-SVM) models to downscale the temperature of the Tons River basin in India and demonstrated that LS-SVM models perform better than ANN and MLR models. ese comparison studies indicate that none of the aforementioned methods can assure an accurate estimate of temperature under different situations. e predictor selection is critical for developing a sta- tistical downscaling model. Suitable predictors should be informative, and the relationship between the predictors and predictands should be stationary [13]. Informative predictors can be identified using statistical measures, such as the Hindawi Advances in Meteorology Volume 2017, Article ID 7265178, 11 pages https://doi.org/10.1155/2017/7265178

Transcript of Statistical Downscaling of Temperature with the Random...

Page 1: Statistical Downscaling of Temperature with the Random ...downloads.hindawi.com/journals/amete/2017/7265178.pdf · Statistical downscaling techniques can be divided into threecategories:weathertyping,weathergenerators,andre-gression-based

Research ArticleStatistical Downscaling of Temperature with the RandomForest Model

Bo Pang12 Jiajia Yue12 Gang Zhao12 and Zongxue Xu12

1College of Water Sciences Beijing Normal University Beijing 100875 China2Beijing Key Laboratory of Urban Hydrological Cycle and Sponge City Technology Beijing 100875 China

Correspondence should be addressed to Gang Zhao gangzhaomailbnueducn

Received 20 December 2016 Revised 25 March 2017 Accepted 17 May 2017 Published 15 June 2017

Academic Editor Jorge E Gonzalez

Copyright copy 2017 Bo Pang et alThis is an open access article distributed under the Creative Commons Attribution License whichpermits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

The issues with downscaling the outputs of a global climate model (GCM) to a regional scale that are appropriate to hydrologicalimpact studies are investigated using the random forest (RF) model which has been shown to be superior for large dataset analysisand variable importance evaluation The RF is proposed for downscaling daily mean temperature in the Pearl River basin insouthern China Four downscalingmodels were developed and validated by using the observed temperature series from 61 nationalstations and large-scale predictor variables derived from the National Center for Environmental PredictionndashNational Center forAtmospheric Research reanalysis datasetTheproposedRFdownscalingmodelwas compared tomultiple linear regression artificialneural network and support vector machine models Principal component analysis (PCA) and partial correlation analysis (PAR)were used in the predictor selection for the other models for a comprehensive study It was shown that the model efficiency of theRF model was higher than that of the other models according to five selected criteria By evaluating the predictor importance theRF could choose the best predictor combination without using PCA and PAR The results indicate that the RF is a feasible tool forthe statistical downscaling of temperature

1 Introduction

Global climatemodels (GCMs) are considered themost cred-ible tools for the projection of future global climate change[1] However there is a general mismatch between the spatialand temporal resolution of the GCM output and regionalscale climate change impact studies Various techniques havebeen developed to downscale GCM outputs to finer scalesThese methods are widely divided into dynamic (physical)and statistical (empirical) downscaling [2 3] Because of thecomplexity in modeling and computing dynamic downscal-ing statistical downscaling techniques have been used widelyin climate change studies due to their simplicity and ease ofimplementation

Statistical downscaling techniques can be divided intothree categories weather typing weather generators and re-gression-based methods Various models have been devel-oped and applied in the downscaling of temperature likelinear regression [4 5] canonical correlation analysis (CCA)[6 7] artificial neural networks (ANN) [8] support vector

machines (SVM) [9] and so forth Some comparison stud-ies have been made in the past Schoof and Pryor [10]demonstrated that ANN models give better estimates thanmultiple linear regression (MLR) models for daily temper-ature downscaling at Indianapolis Kostopoulou et al [11]demonstrated that MLR and CCA are superior to ANN inthe simulation ofminimumandmaximum temperatures overGreece Duhan and Pandey [12] compared MLR ANN andthe least square support vector machine (LS-SVM) modelsto downscale the temperature of the Tons River basin inIndia and demonstrated that LS-SVMmodels perform betterthan ANN and MLR models These comparison studiesindicate that none of the aforementioned methods canassure an accurate estimate of temperature under differentsituations

The predictor selection is critical for developing a sta-tistical downscaling model Suitable predictors should beinformative and the relationship between the predictors andpredictands should be stationary [13] Informative predictorscan be identified using statistical measures such as the

HindawiAdvances in MeteorologyVolume 2017 Article ID 7265178 11 pageshttpsdoiorg10115520177265178

2 Advances in Meteorology

0 500250(km)

Pearl River BasinMeteorological stations

Elevation (m)High 2635

Low minus6

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E 120

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E 120

∘09984000998400998400E

30∘09984000998400998400

N

N

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

Figure 1 Study area

Pearson Spearmen and Kendall correlation analysis [9]CCA [14] maximum covariance analysis (MCA) [15] partialcorrelation (PAR) [16ndash18] and principal component analysis(PCA) [19 20] Interactive model fitting approaches are alsoused in predictor selection [21] However some limitationsare found during the application such as the limited ability oftraditional correlation analysis for interpreting nonstationaryand nonlinear relationships [22] Therefore a precise statisti-cal downscaling method with an inbuilt predictor selectionmechanism will be helpful for researchers studying climatechange impact

The random forest (RF) model is an ensemble machinelearning technique based on a combination of classificationor regression methods and statistical learning theory [23]RF models have been well applied in various fields such asrisk analysis [24] ground water studies [25] remote sensinganalysis [26] and flood hazard assessment [27] and especiallyshow advantages in land cover classification [28ndash31] Thereare two important advantages of RF models The first is theability to handle large datasets with correlated conditionalvariables because it includes precision in the prediction isnonparametric and is robust in the presence of outliersnoise and overfitting [23 32] The second is the inbuiltvariable importance evaluation By permuting the variablesrandomly each variable can be compared to the predictionresults and evaluated for its importance [33] Based on thisbody of knowledge RF should be in theory highly appli-cable to downscaling and able to rectify multivariable andnonlinear issues Eccel et al [34] adopted RF with four linearand nonlinearmodels in the postprocessing of two numerical

weather prediction models for the prediction of minimumtemperatures in an alpine region However the RF just servedas one of the comparativemodels in the studyThe advantagesand disadvantages of the RF and its applicability in statisticaldownscaling have not been studied in detail

The purpose of this study is to fully investigate theapplicability of RF for statistical downscaling of temperatureThe predictors are the 26 large-scale variables derived fromthe National Center for Environmental Prediction (NCEP)reanalysis daily dataset and the predictands are the observedtemperatures at 61 national standard stations located in thePearl River basin The RF is used to capture the complexrelationship between selected predictors from the NCEP dataand observed daily mean temperature from these stations Acomparison study was conducted involving the application ofMLP ANN and SVM models The PCA and PAR methodswere used in predictor selection for the comparative modelsfor a comprehensive study

2 Study Area and Data Description

21 Study Area Description The Pearl River (97∘391015840Esim117∘181015840E 3∘411015840Nsim29∘151015840N) is the second largest river in Chinawith a drainage area of 454 times 105 km2 of which 442 times105 km2 is located in China [35 36] The Pearl River basin(Figure 1) is located in tropical and subtropical climatezones the annual temperature is 14ndash22∘C and the annualprecipitation is 1200ndash2200mm The distribution of precipi-tation is gradually reduced from east to west This regionaldistribution is significantly different and the interannual

Advances in Meteorology 3

161ndash201202ndash218

219ndash236237ndash258Pearl River Basin

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N

N

20∘09984000998400998400

N

Mean temperature ( ∘C)105ndash160

(a) Observation of annual mean temperature

0 150 300(km)

Standard deviation30ndash4647ndash5758ndash63

64ndash6970ndash79Pearl River Basin

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(b) Standard deviation of observed temperature

Figure 2 Observed mean temperature and standard deviation at the meteorological stations

change is large The precipitation is concentrated duringAprilndashSeptember [36] accounting for 72ndash88 of the annualprecipitation [35]

The Pearl River basin is a rich water resource The waterresource in the entire river basin is 4700 cubic meters percapita equivalent to 17 times the national per capita rate butthe spatial and temporal distribution of the water resourceare uneven and basin flooding waterlogging drought andsaltiness create frequent natural disasters The Pearl Riverbasin is a highly developed region having a prominentposition in the economic development of China

22 Data Description

221 Temperature Data The observed meteorological dataused in this study are the daily mean temperatures at61 national standard stations in the Pearl River basin Acontinuous data series for the period of 1961ndash2005 wasselected for the study These observations were obtainedfrom the National Climate Center which is in charge ofmonitoring collecting compiling and releasing high qualityhydrological data in China The observed mean temperatureand standard deviation at the meteorological stations areshown in Figure 2 The mean daily temperature shows anincreasing trend from the north to south in the periodHowever the standard deviation showed an opposite trend

222 Large-Scale Atmospheric Variables TheNCEP reanaly-sis daily data were downloaded from the website httpwwwcdcnoaagov which included 26 large-scale atmosphericvariables at a scale of 25∘ times 25∘ that were derived from thedataset over the period of 1961ndash2005Thedetails of the predic-tors are shown in Table 1 The NCEP data were interpolated

Table 1 NCEP reanalysis predictors

Number Predictor Description1 mslp Mean sea level pressure2 p_f Surface airflow strength3 p_u Surface zonal velocity4 p_v Surface meridional velocity5 p_z Surface vorticity6 p_th Surface wind direction7 p_zh Surface divergence8 p5_f 500 hPa airflow strength9 p5_u 500 hPa zonal velocity10 p5_v 500 hPa meridional velocity11 p5_z 500 hPa vorticity12 p5th 500 hPa wind direction13 p5zh 500 hPa divergence14 p8_f 850 hPa airflow strength15 p8_u 850 hPa zonal velocity16 p8_v 850 hPa meridional velocity17 p8_z 850 hPa vorticity18 p8th 850 hPa wind direction19 p8zh 850 hPa divergence20 p500 500 hPa geopotential height21 p850 850 hPa geopotential height22 r500 500 hPa relative humidity23 r850 850 hPa relative humidity24 rhum Surface relative humidity25 shum Surface-specific humidity26 temp Mean temperature at 2m height

4 Advances in Meteorology

Trainingdataset(TD)

Randomized

Regressionresult 1

Regressionresult 2

Regressionresult 3

Regressionresult k

The averageof all results

TD1

TD2

TD3

TDk

Figure 3 Structure of a random forest model

to each station using the bilinear interpolationmethodwhichhas been widely used in statistical downscaling [37ndash39]

3 Downscaling by Using the RandomForest Method

31 Random Forest Method

311 Methodology The random forest (RF) method is anenhanced classification and regression tree (CART) methodproposed by Breiman in 2001 which consists of an ensembleof unpruned decision trees generated through bootstrapsamples of the training data and random variable subsetselection

As shown in Figure 3 the RF is composed of a set ofCARTs The accuracy of the RF prediction depends on thestrength of the individual CARTs [23] Each CART consistsof a root node internal nodes and leaves Each internal nodeis associated with a test function to split the incoming dataFor regression trees splitting is made in accordance witha squared residuals minimization algorithm which impliesthat the expected sum variances in the two resulting nodesshould be minimized as shown in

argmin119909119895le119909119877119895119895=1119872

[119901119897 var (119884119897) + 119901120591 var (119884120591)] (1)

Here 119901119897 and 119901120591 are fractions of samples in the left andright nodes var(119884119897) and var(119884120591) are response vectors forcorresponding left and right child nodes and 119909119895 le 119909119877119895 119895 =1 2 119872 are optimal splitting questions

Compared to traditional CART methods that use wholedata sets the RF trains each individual CART on bootstrapresamples (119872 samples) of the total dataset Instead of usingall the features the RF uses a random selection of features tosplit each node The best split is chosen among a randomlyselected subset of Ntry input variables at each node The treeis then grown to the maximum size without pruning In thisway119872 CARTs are grown and the final output is the averageof the predictions of those trees

312 Importance of Variables The RF performs an inbuiltcross-validation in parallel to the training process by usingout-of-bag (OOB) samples which are not chosen during thebootstrap split In the regression mode the total learningerror is obtained by averaging the prediction error of eachindividual tree using their OOB samples as shown in

MSE asymp MSEOOB = 1119899119899sum119894=1

[ (119883119894) minus 119884119894]2 (2)

where 119899 is the total number of OOB samples (119883119894) is the RFoutput corresponding to a given input sample119883119894 and119884119894 is theobserved output

The RF provides two methods of evaluating the impor-tance of each variable [27] The first method evaluatesthe variable importance based on how much poorer theprediction will be if the variable is permuted randomlyThe prediction errors of the OOB samples of each tree(termed EOOB1) are calculated during the training procedureAt the same time each input variable in the OOB samplesis permuted one at a time These modified datasets arealso predicted by the tree (termed EOOB2) At the end ofthe training procedure the importance of each variable isobtained by averaging the difference between EOOB1 andEOOB2 It is then normalized by the standard deviation Thesecond method is based on the calculation of the nodeimpurity criterion As illustrated in (1) we can calculate howmuch the split decreases the node impurity For regressiontrees the decrease of the node impurity can be calculatedusing the difference between the residual sum of squares(RSS) before and after the split The importance of a variablecan be rapidly calculated by combining the decreases of nodeimpurity for the variable over all trees [40]

32 Model Implementation and Validation TheRF is utilizedto simulate the nonlinear relationship between the NCEPpredictors and the observed temperatures at the 61 stationsIn this study the RF algorithm is implemented in the Rprogramming language with a package ldquorandom forestrdquowhich has built-in functions to measure variable importance

Advances in Meteorology 5

Guangzhou Station Nanning Station

0

05

1

15

2

25Si

gnifi

canc

e

5 10 15 20 25 300Number of predictors

0

02

04

06

08

1

12

14

16

18

2

Sign

ifica

nce

5 10 15 20 25 300Number of predictors

times105

times105

Figure 4 Relative significance of the predictors at Guangzhou Station and Nanning Station

As illustrated previously 119872 and Ntry are two sensitiveparameters in the RF models Ntry is the square root of thetotal number of variables [41]119872 influences the convergenceof the RF and can be determined through the OOB error

For model development the daily mean temperatureseries from the national standard stations and the large-scaleatmospheric variables of the NCEP data are divided into twodatasetsThe first 31 years (1961ndash1991) are used for calibratingthe regression model while the remaining 14 years of data(1992ndash2005) are used to validate the model For testing theperformance of the proposedmodel two data-drivenmodelsANN and SVM are selected for model comparison whichare commonly used in statistical downscalingThe three-layerbackpropagation artificial neural network (BP-ANN) andLS-SVM are adopted which have been applied successfullyin downscaling temperature [12] The MATLAB functionsldquonewffrdquo and ldquotunelssvmrdquo are employed for obtaining thevalues of the models parameters Multiple linear regressionanalysis is adopted for the MATLAB implementation ThePAR [18] and PCA [19 20] methods are used in predictorselection for the MLR ANN and SVMmodels

33 Model Performance Analysis Five criteria are selected toevaluate the performance of the RF and comparative modelsincluding the NashndashSutcliffe model efficiency index (Nash)root mean square error (RMSE) mean absolute error (MAE)correlation coefficient (119877) and model Bias (Bias) which aredefined as

Nash = 1 minus sum119899119894=1 (119910119894obs minus 119910119894sim)2sum119899119894=1 (119910119894obs minus 119910obs)2

RMSE = radicsum119899119894=1 (119910119894obs minus 119910119894sim)2119899

MAE = sum119899119894=1 1003816100381610038161003816119910119894obs minus 119910119894sim1003816100381610038161003816119899 119877 = sum119899119894=1 (119910119894obs minus 119910obs) (119910119894sim minus 119910sim)radicsum119899119894=1 (119910119894obs minus 119910obs)2 (119910119894sim minus 119910sim)2

Bias = sum119899119894=1 (119910119894sim minus 119910119894s)119899 (3)

Here119910obs is the vector of the observed predictands119910obs isthemean of the observed predictands 119910sim is the vector of thesimulated predictands and 119910sim is the mean of the simulatedpredictands In general a higher Nash and 119877 indicate bettermodel efficiency In contrast smaller values of RMSE MAEand Bias indicate higher accuracy in the model prediction

4 Result Analysis and Discussion

41The Choice of Predictors One of themost important stepsin the development of downscaling models is the choice ofappropriate predictors [42] The RF can evaluate the relativecontribution of each predictor for downscaling results bycombining the RSS decreases over all trees which makes itconvenient in choosing predictors for the RFUsing two of thestations as examples Figure 4 shows the relative importanceof the predictors for the predictands at Guangzhou andNanning Stations where the number of the predictor hasbeen given as indicated in Table 1 For a comprehensiveinvestigation of the predictorrsquos importance in the Pearl Riverbasin stations the rank (the most important predictor isindicated as 1) of the relative importance of the predictor ineach station is calculated and indicated in Figure 5

6 Advances in Meteorology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Number of predictors

Random forest

0

5

10

15

20

25

Rank

Figure 5 Rank of relative importance of predictors in the 61 sta-tions

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Numbers of predictors

123456789

1011

EOO

B

Figure 6 MSEs of out-of-bag samples

According to Figures 4 and 5 the number 26 predictorthe mean temperature at 2m height is the most importantpredictor for all RF models at the 61 stations Similarly thenumber 25 predictor the surface-specific humidity rankedsecond for predictor importance at all stations In contrastthe number 5 predictor surface vorticity is the least impor-tant predictor for all stations The ranks of the other predic-tors vary for the different stations whichmay be caused by themeteorological and geographical differences of the stations

Because the OOB samples can provide unbiased esti-mation of the RF model performance the rationality ofincluding each factor can be tested using the MSEs of theOOB (indicated by EOOB)The predictors of each station arescreened using the ranks of relative importance which wereevaluated in previous stepThen theMSE of theOOB samplesin each station is calculated with the increase of the predictorcombination and plotted in Figure 6

The results show that EOOB generally decreases butat a decreasing rate as more predictors are included Thisindicates that the RF avoids overfitting successfully Howeverthe improvement of model performance by including morepredictors is slight when the predictor number exceeds acertain number which is seven in this study With enoughcomputation resources all predictors are chosen in the RFmodeling to downscale the temperature for these stations

42 Comparative Study The performance of the RF modelis compared with that of the MLR ANN and SVM models

Partial Correlation Analysis

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Number of predictors

0

5

10

15

20

25

Rank

Figure 7 Rank of partial correlation of predictors in the 61 stations

Two predictor selection methods PAR and PCA are appliedin the four models

The partial correlations of the 26 predictors with thepredictands are shown in Figure 7 It can be observed that theranks of the predictorsrsquo partial correlation vary in differentstations In general the number 1 predictor mean sea levelpressure has the largest partial correlation in most of thestations However the number 5 and number 8 predic-tors corresponding to surface vorticity and 500 hPa airflowstrength have the smallest partial correlations in themajorityof the stations

The partial correlation coefficients are used to decide thevariables that are included in the input combination Based onformer studies [18 43 44] a combination of seven predictorswith the highest partial correlation coefficients are selectedand applied in the MLR ANN and SVM models Theseresults are marked as MLR-par ANN-par and SVM-par

PCA which was commonly used by previous researchersis also used in the comparative study Before PCA thepredictors are standardized by subtracting themean from theoriginal values and then dividing the results by the standarddeviation of the original variables The PCA method is thenapplied to the standardized NCEP predictor variables toextract principal components (PCs) that are orthogonal Theobtained PCs preservemore than 90 of the variance presentat each station Then the PCs are used in the MLR ANNand SVM modeling and these results are marked as MLR-pca ANN-pca and SVM-pca

The calibration and validation results of the RF andcomparative models are summarized in Table 2 and theaverage Nash in the calibration and validation periods for allmodels are plotted in Figure 8

The results showed that the RF is superior to the com-parative models in the calibration and validation periodsAccording to the average values of Nash RMSEMAE119877 andBias the RF for the 61 stations are 098 080 058 099 and000 in the calibration period and 094 146 112 097 and021 in the validation period respectively All of these criteriaare superior to those of the comparativemodels It can also beobserved in Figure 8 that the RF shows higher precision andmore stability over the other models in the study area

The results of the two parameter selection are alsocompared for the MLR ANN and SVM models The PAR is

Advances in Meteorology 7

Calibration Validation

08

082

084

086

088

09

092

094

096

098

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ar

AN

N-p

ca

Model

078

08

082

084

086

088

09

092

094

096

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ca

AN

N-p

ar

Model

Figure 8 Average Nash in calibration and validation period for all models

Table 2 Performance assessment for predictands in calibration and validation

Models Periods Nash RMSE MAE 119877 Bias

MLR-par Calibration 092 176 135 096 000Validation 092 166 129 096 001

MLR-pca Calibration 090 196 151 095 000Validation 088 200 155 094 031

ANN-par Calibration 093 156 119 097 000Validation 093 152 117 097 007

ANN-pca Calibration 092 168 128 096 000Validation 091 172 132 096 035

SVM-par Calibration 092 176 134 096 minus009Validation 092 166 128 096 minus006

SVM-pca Calibration 089 197 150 095 minus010Validation 088 199 153 094 022

RF Calibration 098 080 058 099 000Validation 094 146 112 097 021

superior to PCA in theMLR ANN and SVMmodels For theANNmodels the average119877 are similar in both the calibrationand validation period however the average Nash RMSEMAE and Bias of the results using PAR are superior to thoseof PCA with decreases of 001 012 008 and 0 in the cali-bration period and 002 020 015 and 028 in the validationperiod respectively For the SVM models the increases inaverage Nash and119877 are 003 and 001 in the calibration periodand 004 and 002 in the validation period respectively Thedecreases of average RMSE MAE and Bias are 021 016and 001 in the calibration period and 033 025 and 028

in the validation period respectively For the same the PARis superior to PCA for MLR for most of the criteria withincreases in Nash and 119877 and decreases in RMSE and MAE

The spatial distributions of model precision of the RFand the comparative models are shown in Figure 9 inwhich Nash is selected as the evaluating criteria For thecomparative results the PAR was used in the MLR ANNand SVM modeling It is shown that the RF is superior tothe comparative models at most of the individual stations Inaddition it can be observed that the precision of the RF instations located in the plain region is higher than that in the

8 Advances in Meteorology

MLR calibration

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basin

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

le088

ge096

N

(a) MLR calibration

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

088ndash090090ndash092

092ndash094094ndash096

MLR validationPearl River Basinle088

ge096

N

(b) MLR validation

ANN calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(c) ANN calibration

ANN validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(d) ANN validation

SVM calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(e) SVM calibration

SVM validation

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

0 150 300(km)

N

(f) SVM validation

Figure 9 Continued

Advances in Meteorology 9

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

RF calibration

(g) RF calibration

RF validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(h) RF validation

Figure 9 The spatial distribution of Nash for different models

mountain regions which indicates the limited applicability incomplex terrains

5 Summary and Conclusions

Statistical downscaling models are effective in solving themismatch between large-scale climate models and local scalehydrological responses In this study a statisticalmodel basedon the random forest method was proposed and appliedto the Pearl River basin The objective of this study was toinvestigate whether the RF approach could successfully sim-ulate the complicated relationship between the predictors andpredictands The daily mean temperature observations from61 stations in the Pearl River Basin and the NCEP reanalysisdaily data from 1961 to 2005were selected in order to comparethe results of the RF model with those of the MLR ANNand SVM models The following summarizes the discussionpoints and provides conclusions derived from this analysis

(1) The RF model successfully simulated the relation-ship between the predictors and predictands andperformed better than the MLR ANN and SVMmodels According to five statistical criteria the RFshowed the highest model efficiency in both thecalibration and validation periods In addition it wasobserved that themodel efficiency of the RF increasedcontinuously by considering more predictors In thisstudy all 26 predictors from the NCEP were consid-ered in the RF modeling By taking full advantageof the information in the predictors and avoidingthe influence of noise the RF model performancedominated the other results

(2) The built-in variable importance evaluation processand the OOB samples in the RF made predictorselection convenient The built-in variable impor-tance evaluation process ranked the importance of

each predictor for the prediction of the predictandsThe OOB samples gave the unbiased estimation ofthe model efficiency Both of these were helpful inthe predictor selection Although all predictors wereconsidered in this study they will be most valuablewhen the RF is used in more complex downscalingproblems such as precipitation downscaling

(3) The spatial distribution of model precision at theindividual stations for the RF and the comparativemodels was also discussed Although the RF wassuperior to the comparative models for most of thestations there was still room for improvement for theprediction accuracy in mountainous areas

As this studymainly discussed the development and com-parison of temperature downscalingmethods future projectsin the Pearl River basin were not further explored We willpursue further research in the application of the RF to addi-tionalmeteorological elements such as the precipitationThiswill be helpful in understanding the strength and limitationof the RF method Furthermore the surface parameters likethe elevation slope vegetation and so forth also influencethe distribution of these meteorological elements especiallyin complex terrain [45] wewill explore further in consideringthese parameters as part of model input

Conflicts of Interest

The authors declare no conflicts of interest

Authorsrsquo Contributions

Bo Pang contributed to the literature review statisticalanalysis manuscript preparation and editing Jiajia Yue andGang Zhao contributed to the model design and simulation

10 Advances in Meteorology

and Zongxue Xu contributed to the manuscript revision andreview and supervised the study

Acknowledgments

This research is funded by the Youth Science Foundation ofthe National Natural Science Foundation (51309009) theNational Natural Science Foundation (91125015) andNation-al Key Research and Development Program (during the 13thFive-Year Plan) under Grant no 2016YFC0401309 Ministryof Science and Technology China

References

[1] IPCC Climate Change 2007 The Physical Science Basis Con-tribution of Working Group I to the Fourth Assessment Reportof the Intergovernmental Panel on Climate Change CambridgeUniversity Press Cambridge UK 2007

[2] R L Wilby S P Charles E Zorita B Timbal P Whetton andL O Mearns Guidelines for Use of Climate Scenarios Developedfrom Statistical Downscaling Methods UEA Norwich UK2004

[3] J T Chu J Xia C-Y Xu and V P Singh ldquoStatistical down-scaling of daily mean temperature pan evaporation and pre-cipitation for climate change scenarios in Haihe River ChinardquoTheoretical and Applied Climatology vol 99 no 1-2 pp 149ndash1612010

[4] R L Wilby L E Hay and G H Leavesley ldquoA comparisonof downscaled and raw GCM output implications for climatechange scenarios in the San JuanRiver Basin Coloradordquo Journalof Hydrology vol 225 no 1-2 pp 67ndash91 1999

[5] M K Goyal and C S P Ojha ldquoDownscaling of surfacetemperature for lake catchment in an arid region in India usinglinear multiple regression and neural networksrdquo InternationalJournal of Climatology vol 32 no 4 pp 552ndash566 2012

[6] R Huth ldquoStatistical downscaling in central Europe evaluationof methods and potential predictorsrdquo Climate Research vol 13no 2 pp 91ndash101 1999

[7] D Chen and Y Chen ldquoAssociation between winter temperaturein China and upper air circulation over East Asia revealed bycanonical correlation analysisrdquo Global and Planetary Changevol 37 no 3-4 pp 315ndash325 2003

[8] P Coulibaly Y B Dibike and F Anctil ldquoDownscaling precipi-tation and temperature with temporal neural networksrdquo Journalof Hydrometeorology vol 6 no 4 pp 483ndash496 2005

[9] A Anandhi V V Srinivas D N Kumar and R S NanjundiahldquoRole of predictors in downscaling surface temperature to riverbasin in India for IPCC SRES scenarios using support vectormachinerdquo International Journal of Climatology vol 29 no 4 pp583ndash603 2009

[10] J T Schoof and S C Pryor ldquoDownscaling temperature andprecipitation a comparison of regression-based methods andartificial neural networksrdquo International Journal of Climatologyvol 21 no 7 pp 773ndash790 2001

[11] E Kostopoulou CGiannakopoulos CAnagnostopoulou et alldquoSimulatingmaximumandminimum temperature overGreecea comparison of three downscaling techniquesrdquoTheoretical andApplied Climatology vol 90 no 1-2 pp 65ndash82 2007

[12] D Duhan and A Pandey ldquoStatistical downscaling of tempera-ture using three techniques in the Tons River basin in Central

IndiardquoTheoretical and Applied Climatology vol 121 no 3-4 pp605ndash622 2015

[13] D Maraun H W Rust and T J Osborn ldquoThe annual cycleof heavy precipitation across the United Kingdom a modelbased on extreme value statisticsrdquo International Journal ofClimatology vol 29 no 12 pp 1731ndash1744 2009

[14] M Widmann ldquoOne-dimensional CCA and SVD and theirrelationship to regression mapsrdquo Journal of Climate vol 18 no14 pp 2785ndash2792 2005

[15] M K Tippett T DelSole S J Mason and A G BarnstonldquoRegression-based methods for finding coupled patternsrdquo Jour-nal of Climate vol 21 no 17 pp 4384ndash4398 2008

[16] M Hessami P Gachon T B M J Ouarda and A St-Hilaire ldquoAutomated regression-based statistical downscalingtoolrdquo Environmental Modelling and Software vol 23 no 6 pp813ndash834 2008

[17] Z Liu Z Xu S P Charles G Fu and L Liu ldquoEvaluation of twostatistical downscaling models for daily precipitation over anarid basin in Chinardquo International Journal of Climatology vol31 no 13 pp 2006ndash2020 2011

[18] C Yang N Wang S Wang and L Zhou ldquoPerformancecomparison of three predictor selection methods for statisticaldownscaling of daily precipitationrdquo Theoretical and AppliedClimatology pp 1ndash12 2016

[19] I Hanssen-Bauer E J Foslashrland J E Haugen and O E TveitoldquoTemperature and precipitation scenarios for Norway com-parison of results from dynamical and empirical downscalingrdquoClimate Research vol 25 no 1 pp 15ndash27 2003

[20] A Hannachi I T Jolliffe and D B Stephenson ldquoEmpiricalorthogonal functions and related techniques in atmosphericscience a reviewrdquo International Journal of Climatology vol 27no 9 pp 1119ndash1152 2007

[21] O Fistikoglu andUOkkan ldquoStatistical downscaling ofmonthlyprecipitation using NCEPNCAR reanalysis data for tahtaliriver basin in Turkeyrdquo Journal of Hydrologic Engineering vol 16no 2 pp 157ndash164 2010

[22] A Sharma ldquoSeasonal to interannual rainfall probabilistic fore-casts for improved water supply management part 1mdashA strat-egy for system predictor identificationrdquo Journal of Hydrologyvol 239 no 1ndash4 pp 232ndash239 2000

[23] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[24] M Malekipirbazari and V Aksakalli ldquoRisk assessment in sociallending via random forestsrdquo Expert Systems with Applicationsvol 42 no 10 pp 4621ndash4631 2015

[25] P Baudron F Alonso-Sarrıa J L Garcıa-Arostegui F Canovas-Garcıa D Martınez-Vicente and J Moreno-Brotons ldquoIdentify-ing the origin of groundwater samples in a multi-layer aquifersystemwithRandomForest classificationrdquo Journal ofHydrologyvol 499 pp 303ndash315 2013

[26] K Tatsumi Y Yamashiki M A Canales Torres and C L RTaipe ldquoCrop classification of upland fields using Random forestof time-series Landsat 7 ETM+datardquoComputers and Electronicsin Agriculture vol 115 pp 171ndash179 2015

[27] Z Wang C Lai X Chen B Yang S Zhao and X Bai ldquoFloodhazard risk assessment model based on random forestrdquo Journalof Hydrology vol 527 pp 1130ndash1141 2015

[28] P O Gislason J A Benediktsson and J R Sveinsson ldquoRandomforests for land cover classificationrdquo Pattern Recognition Lettersvol 27 no 4 pp 294ndash300 2006

Advances in Meteorology 11

[29] M Pal ldquoRandom forest classifier for remote sensing classifica-tionrdquo International Journal of Remote Sensing vol 26 no 1 pp217ndash222 2005

[30] V F Rodriguez-Galiano M Chica-Olmo F Abarca-Hernan-dez P M Atkinson and C Jeganathan ldquoRandom Forestclassification of Mediterranean land cover using multi-seasonalimagery and multi-seasonal texturerdquo Remote Sensing of Envi-ronment vol 121 pp 93ndash107 2012

[31] V F Rodriguez-Galiano B Ghimire J Rogan M Chica-Olmoand J P Rigol-Sanchez ldquoAn assessment of the effectiveness ofa random forest classifier for land-cover classificationrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 67 no 1pp 93ndash104 2012

[32] C Strobl and A Zeileis ldquoDanger high power mdash Exploringthe statistical properties of a test for random forest variableimportancerdquo Tech Rep 17 University of Munich 2008

[33] V Svetnik A Liaw C Tong J Christopher Culberson R PSheridan and B P Feuston ldquoRandom forest a classification andregression tool for compound classification and QSAR mod-elingrdquo Journal of Chemical Information and Computer Sciencesvol 43 no 6 pp 1947ndash1958 2003

[34] E Eccel L Ghielmi P Granitto R Barbiero F Grazzini and DCesari ldquoPrediction of minimum temperatures in an alpineregion by linear and non-linear post-processing of meteorolog-ical modelsrdquoNonlinear Processes in Geophysics vol 14 no 3 pp211ndash222 2007

[35] Pearl River Water Resources Committee (PRWRC) The Zhu-jiang Archive vol 1 Guangdong Science and Technology PressGuangzhou China 1991 (in Chinese)

[36] Q Zhang C-Y Xu and Z Zhang ldquoObserved changes ofdroughtwetness episodes in the Pearl River basin Chinausing the standardized precipitation index and aridity indexrdquoTheoretical and Applied Climatology vol 98 no 1-2 pp 89ndash992009

[37] E V Dmitriev I V Nogotkov V S Rogutov G Komenko andA Chavro ldquoTemporal error estimate for statistical downscalingregional meteorological modelsrdquo Fısica de la Tierra vol 19 pp219ndash241 2007

[38] C Lavaysse M Vrac P Drobinski M Lengaigne and TVischel ldquoStatistical downscaling of the French Mediterraneanclimate assessment for present and projection in an anthro-pogenic scenariordquo Natural Hazards and Earth System Sciencevol 12 no 3 pp 651ndash670 2012

[39] L Liu Z Liu X Ren T Fischer and Y Xu ldquoHydrologicalimpacts of climate change in the Yellow River Basin for the 21stcentury using hydrological model and statistical downscalingmodelrdquo Quaternary International vol 244 no 2 pp 211ndash2202011

[40] F-F Ai J Bin Z-M Zhang et al ldquoApplication of randomforests to select premiumquality vegetable oils by their fatty acidcompositionrdquo Food Chemistry vol 143 pp 472ndash478 2014

[41] P Gislason J Benediktsson and J Sveinsson ldquoRandom forestclassification of multisource remote sensing and geographicdatardquo in Proceedings of the Geoscience and Remote Sensing Sym-posium vol 2 pp 1049ndash1052 IEEE Anchorage Alaska USA2004

[42] B C Hewitson and R G Crane ldquoClimate downscaling tech-niques and applicationrdquo Climate Research vol 7 no 2 pp 85ndash95 1996

[43] B Timbal A Dufour and B McAvaney ldquoAn estimate of fu-ture climate change for western France using a statistical

downscaling techniquerdquo Climate Dynamics vol 20 no 7-8 pp807ndash823 2003

[44] K Tatsumi T Oizumi and Y Yamashiki ldquoEffects of climatechange on daily minimum and maximum temperatures andcloudiness in the Shikoku region a statistical downscalingmodel approachrdquoTheoretical and Applied Climatology vol 120no 1-2 pp 87ndash98 2015

[45] Z A Holden J T Abatzoglou C H Luce and L S BaggettldquoEmpirical downscaling of daily minimum air temperature atvery fine resolutions in complex terrainrdquoAgricultural and ForestMeteorology vol 151 no 8 pp 1066ndash1073 2011

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ClimatologyJournal of

EcologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

EarthquakesJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mining

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporation httpwwwhindawicom Volume 201

International Journal of

OceanographyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal ofPetroleum Engineering

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Atmospheric SciencesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MineralogyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MeteorologyAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geological ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geology Advances in

Page 2: Statistical Downscaling of Temperature with the Random ...downloads.hindawi.com/journals/amete/2017/7265178.pdf · Statistical downscaling techniques can be divided into threecategories:weathertyping,weathergenerators,andre-gression-based

2 Advances in Meteorology

0 500250(km)

Pearl River BasinMeteorological stations

Elevation (m)High 2635

Low minus6

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E 120

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E 120

∘09984000998400998400E

30∘09984000998400998400

N

N

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

Figure 1 Study area

Pearson Spearmen and Kendall correlation analysis [9]CCA [14] maximum covariance analysis (MCA) [15] partialcorrelation (PAR) [16ndash18] and principal component analysis(PCA) [19 20] Interactive model fitting approaches are alsoused in predictor selection [21] However some limitationsare found during the application such as the limited ability oftraditional correlation analysis for interpreting nonstationaryand nonlinear relationships [22] Therefore a precise statisti-cal downscaling method with an inbuilt predictor selectionmechanism will be helpful for researchers studying climatechange impact

The random forest (RF) model is an ensemble machinelearning technique based on a combination of classificationor regression methods and statistical learning theory [23]RF models have been well applied in various fields such asrisk analysis [24] ground water studies [25] remote sensinganalysis [26] and flood hazard assessment [27] and especiallyshow advantages in land cover classification [28ndash31] Thereare two important advantages of RF models The first is theability to handle large datasets with correlated conditionalvariables because it includes precision in the prediction isnonparametric and is robust in the presence of outliersnoise and overfitting [23 32] The second is the inbuiltvariable importance evaluation By permuting the variablesrandomly each variable can be compared to the predictionresults and evaluated for its importance [33] Based on thisbody of knowledge RF should be in theory highly appli-cable to downscaling and able to rectify multivariable andnonlinear issues Eccel et al [34] adopted RF with four linearand nonlinearmodels in the postprocessing of two numerical

weather prediction models for the prediction of minimumtemperatures in an alpine region However the RF just servedas one of the comparativemodels in the studyThe advantagesand disadvantages of the RF and its applicability in statisticaldownscaling have not been studied in detail

The purpose of this study is to fully investigate theapplicability of RF for statistical downscaling of temperatureThe predictors are the 26 large-scale variables derived fromthe National Center for Environmental Prediction (NCEP)reanalysis daily dataset and the predictands are the observedtemperatures at 61 national standard stations located in thePearl River basin The RF is used to capture the complexrelationship between selected predictors from the NCEP dataand observed daily mean temperature from these stations Acomparison study was conducted involving the application ofMLP ANN and SVM models The PCA and PAR methodswere used in predictor selection for the comparative modelsfor a comprehensive study

2 Study Area and Data Description

21 Study Area Description The Pearl River (97∘391015840Esim117∘181015840E 3∘411015840Nsim29∘151015840N) is the second largest river in Chinawith a drainage area of 454 times 105 km2 of which 442 times105 km2 is located in China [35 36] The Pearl River basin(Figure 1) is located in tropical and subtropical climatezones the annual temperature is 14ndash22∘C and the annualprecipitation is 1200ndash2200mm The distribution of precipi-tation is gradually reduced from east to west This regionaldistribution is significantly different and the interannual

Advances in Meteorology 3

161ndash201202ndash218

219ndash236237ndash258Pearl River Basin

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N

N

20∘09984000998400998400

N

Mean temperature ( ∘C)105ndash160

(a) Observation of annual mean temperature

0 150 300(km)

Standard deviation30ndash4647ndash5758ndash63

64ndash6970ndash79Pearl River Basin

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(b) Standard deviation of observed temperature

Figure 2 Observed mean temperature and standard deviation at the meteorological stations

change is large The precipitation is concentrated duringAprilndashSeptember [36] accounting for 72ndash88 of the annualprecipitation [35]

The Pearl River basin is a rich water resource The waterresource in the entire river basin is 4700 cubic meters percapita equivalent to 17 times the national per capita rate butthe spatial and temporal distribution of the water resourceare uneven and basin flooding waterlogging drought andsaltiness create frequent natural disasters The Pearl Riverbasin is a highly developed region having a prominentposition in the economic development of China

22 Data Description

221 Temperature Data The observed meteorological dataused in this study are the daily mean temperatures at61 national standard stations in the Pearl River basin Acontinuous data series for the period of 1961ndash2005 wasselected for the study These observations were obtainedfrom the National Climate Center which is in charge ofmonitoring collecting compiling and releasing high qualityhydrological data in China The observed mean temperatureand standard deviation at the meteorological stations areshown in Figure 2 The mean daily temperature shows anincreasing trend from the north to south in the periodHowever the standard deviation showed an opposite trend

222 Large-Scale Atmospheric Variables TheNCEP reanaly-sis daily data were downloaded from the website httpwwwcdcnoaagov which included 26 large-scale atmosphericvariables at a scale of 25∘ times 25∘ that were derived from thedataset over the period of 1961ndash2005Thedetails of the predic-tors are shown in Table 1 The NCEP data were interpolated

Table 1 NCEP reanalysis predictors

Number Predictor Description1 mslp Mean sea level pressure2 p_f Surface airflow strength3 p_u Surface zonal velocity4 p_v Surface meridional velocity5 p_z Surface vorticity6 p_th Surface wind direction7 p_zh Surface divergence8 p5_f 500 hPa airflow strength9 p5_u 500 hPa zonal velocity10 p5_v 500 hPa meridional velocity11 p5_z 500 hPa vorticity12 p5th 500 hPa wind direction13 p5zh 500 hPa divergence14 p8_f 850 hPa airflow strength15 p8_u 850 hPa zonal velocity16 p8_v 850 hPa meridional velocity17 p8_z 850 hPa vorticity18 p8th 850 hPa wind direction19 p8zh 850 hPa divergence20 p500 500 hPa geopotential height21 p850 850 hPa geopotential height22 r500 500 hPa relative humidity23 r850 850 hPa relative humidity24 rhum Surface relative humidity25 shum Surface-specific humidity26 temp Mean temperature at 2m height

4 Advances in Meteorology

Trainingdataset(TD)

Randomized

Regressionresult 1

Regressionresult 2

Regressionresult 3

Regressionresult k

The averageof all results

TD1

TD2

TD3

TDk

Figure 3 Structure of a random forest model

to each station using the bilinear interpolationmethodwhichhas been widely used in statistical downscaling [37ndash39]

3 Downscaling by Using the RandomForest Method

31 Random Forest Method

311 Methodology The random forest (RF) method is anenhanced classification and regression tree (CART) methodproposed by Breiman in 2001 which consists of an ensembleof unpruned decision trees generated through bootstrapsamples of the training data and random variable subsetselection

As shown in Figure 3 the RF is composed of a set ofCARTs The accuracy of the RF prediction depends on thestrength of the individual CARTs [23] Each CART consistsof a root node internal nodes and leaves Each internal nodeis associated with a test function to split the incoming dataFor regression trees splitting is made in accordance witha squared residuals minimization algorithm which impliesthat the expected sum variances in the two resulting nodesshould be minimized as shown in

argmin119909119895le119909119877119895119895=1119872

[119901119897 var (119884119897) + 119901120591 var (119884120591)] (1)

Here 119901119897 and 119901120591 are fractions of samples in the left andright nodes var(119884119897) and var(119884120591) are response vectors forcorresponding left and right child nodes and 119909119895 le 119909119877119895 119895 =1 2 119872 are optimal splitting questions

Compared to traditional CART methods that use wholedata sets the RF trains each individual CART on bootstrapresamples (119872 samples) of the total dataset Instead of usingall the features the RF uses a random selection of features tosplit each node The best split is chosen among a randomlyselected subset of Ntry input variables at each node The treeis then grown to the maximum size without pruning In thisway119872 CARTs are grown and the final output is the averageof the predictions of those trees

312 Importance of Variables The RF performs an inbuiltcross-validation in parallel to the training process by usingout-of-bag (OOB) samples which are not chosen during thebootstrap split In the regression mode the total learningerror is obtained by averaging the prediction error of eachindividual tree using their OOB samples as shown in

MSE asymp MSEOOB = 1119899119899sum119894=1

[ (119883119894) minus 119884119894]2 (2)

where 119899 is the total number of OOB samples (119883119894) is the RFoutput corresponding to a given input sample119883119894 and119884119894 is theobserved output

The RF provides two methods of evaluating the impor-tance of each variable [27] The first method evaluatesthe variable importance based on how much poorer theprediction will be if the variable is permuted randomlyThe prediction errors of the OOB samples of each tree(termed EOOB1) are calculated during the training procedureAt the same time each input variable in the OOB samplesis permuted one at a time These modified datasets arealso predicted by the tree (termed EOOB2) At the end ofthe training procedure the importance of each variable isobtained by averaging the difference between EOOB1 andEOOB2 It is then normalized by the standard deviation Thesecond method is based on the calculation of the nodeimpurity criterion As illustrated in (1) we can calculate howmuch the split decreases the node impurity For regressiontrees the decrease of the node impurity can be calculatedusing the difference between the residual sum of squares(RSS) before and after the split The importance of a variablecan be rapidly calculated by combining the decreases of nodeimpurity for the variable over all trees [40]

32 Model Implementation and Validation TheRF is utilizedto simulate the nonlinear relationship between the NCEPpredictors and the observed temperatures at the 61 stationsIn this study the RF algorithm is implemented in the Rprogramming language with a package ldquorandom forestrdquowhich has built-in functions to measure variable importance

Advances in Meteorology 5

Guangzhou Station Nanning Station

0

05

1

15

2

25Si

gnifi

canc

e

5 10 15 20 25 300Number of predictors

0

02

04

06

08

1

12

14

16

18

2

Sign

ifica

nce

5 10 15 20 25 300Number of predictors

times105

times105

Figure 4 Relative significance of the predictors at Guangzhou Station and Nanning Station

As illustrated previously 119872 and Ntry are two sensitiveparameters in the RF models Ntry is the square root of thetotal number of variables [41]119872 influences the convergenceof the RF and can be determined through the OOB error

For model development the daily mean temperatureseries from the national standard stations and the large-scaleatmospheric variables of the NCEP data are divided into twodatasetsThe first 31 years (1961ndash1991) are used for calibratingthe regression model while the remaining 14 years of data(1992ndash2005) are used to validate the model For testing theperformance of the proposedmodel two data-drivenmodelsANN and SVM are selected for model comparison whichare commonly used in statistical downscalingThe three-layerbackpropagation artificial neural network (BP-ANN) andLS-SVM are adopted which have been applied successfullyin downscaling temperature [12] The MATLAB functionsldquonewffrdquo and ldquotunelssvmrdquo are employed for obtaining thevalues of the models parameters Multiple linear regressionanalysis is adopted for the MATLAB implementation ThePAR [18] and PCA [19 20] methods are used in predictorselection for the MLR ANN and SVMmodels

33 Model Performance Analysis Five criteria are selected toevaluate the performance of the RF and comparative modelsincluding the NashndashSutcliffe model efficiency index (Nash)root mean square error (RMSE) mean absolute error (MAE)correlation coefficient (119877) and model Bias (Bias) which aredefined as

Nash = 1 minus sum119899119894=1 (119910119894obs minus 119910119894sim)2sum119899119894=1 (119910119894obs minus 119910obs)2

RMSE = radicsum119899119894=1 (119910119894obs minus 119910119894sim)2119899

MAE = sum119899119894=1 1003816100381610038161003816119910119894obs minus 119910119894sim1003816100381610038161003816119899 119877 = sum119899119894=1 (119910119894obs minus 119910obs) (119910119894sim minus 119910sim)radicsum119899119894=1 (119910119894obs minus 119910obs)2 (119910119894sim minus 119910sim)2

Bias = sum119899119894=1 (119910119894sim minus 119910119894s)119899 (3)

Here119910obs is the vector of the observed predictands119910obs isthemean of the observed predictands 119910sim is the vector of thesimulated predictands and 119910sim is the mean of the simulatedpredictands In general a higher Nash and 119877 indicate bettermodel efficiency In contrast smaller values of RMSE MAEand Bias indicate higher accuracy in the model prediction

4 Result Analysis and Discussion

41The Choice of Predictors One of themost important stepsin the development of downscaling models is the choice ofappropriate predictors [42] The RF can evaluate the relativecontribution of each predictor for downscaling results bycombining the RSS decreases over all trees which makes itconvenient in choosing predictors for the RFUsing two of thestations as examples Figure 4 shows the relative importanceof the predictors for the predictands at Guangzhou andNanning Stations where the number of the predictor hasbeen given as indicated in Table 1 For a comprehensiveinvestigation of the predictorrsquos importance in the Pearl Riverbasin stations the rank (the most important predictor isindicated as 1) of the relative importance of the predictor ineach station is calculated and indicated in Figure 5

6 Advances in Meteorology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Number of predictors

Random forest

0

5

10

15

20

25

Rank

Figure 5 Rank of relative importance of predictors in the 61 sta-tions

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Numbers of predictors

123456789

1011

EOO

B

Figure 6 MSEs of out-of-bag samples

According to Figures 4 and 5 the number 26 predictorthe mean temperature at 2m height is the most importantpredictor for all RF models at the 61 stations Similarly thenumber 25 predictor the surface-specific humidity rankedsecond for predictor importance at all stations In contrastthe number 5 predictor surface vorticity is the least impor-tant predictor for all stations The ranks of the other predic-tors vary for the different stations whichmay be caused by themeteorological and geographical differences of the stations

Because the OOB samples can provide unbiased esti-mation of the RF model performance the rationality ofincluding each factor can be tested using the MSEs of theOOB (indicated by EOOB)The predictors of each station arescreened using the ranks of relative importance which wereevaluated in previous stepThen theMSE of theOOB samplesin each station is calculated with the increase of the predictorcombination and plotted in Figure 6

The results show that EOOB generally decreases butat a decreasing rate as more predictors are included Thisindicates that the RF avoids overfitting successfully Howeverthe improvement of model performance by including morepredictors is slight when the predictor number exceeds acertain number which is seven in this study With enoughcomputation resources all predictors are chosen in the RFmodeling to downscale the temperature for these stations

42 Comparative Study The performance of the RF modelis compared with that of the MLR ANN and SVM models

Partial Correlation Analysis

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Number of predictors

0

5

10

15

20

25

Rank

Figure 7 Rank of partial correlation of predictors in the 61 stations

Two predictor selection methods PAR and PCA are appliedin the four models

The partial correlations of the 26 predictors with thepredictands are shown in Figure 7 It can be observed that theranks of the predictorsrsquo partial correlation vary in differentstations In general the number 1 predictor mean sea levelpressure has the largest partial correlation in most of thestations However the number 5 and number 8 predic-tors corresponding to surface vorticity and 500 hPa airflowstrength have the smallest partial correlations in themajorityof the stations

The partial correlation coefficients are used to decide thevariables that are included in the input combination Based onformer studies [18 43 44] a combination of seven predictorswith the highest partial correlation coefficients are selectedand applied in the MLR ANN and SVM models Theseresults are marked as MLR-par ANN-par and SVM-par

PCA which was commonly used by previous researchersis also used in the comparative study Before PCA thepredictors are standardized by subtracting themean from theoriginal values and then dividing the results by the standarddeviation of the original variables The PCA method is thenapplied to the standardized NCEP predictor variables toextract principal components (PCs) that are orthogonal Theobtained PCs preservemore than 90 of the variance presentat each station Then the PCs are used in the MLR ANNand SVM modeling and these results are marked as MLR-pca ANN-pca and SVM-pca

The calibration and validation results of the RF andcomparative models are summarized in Table 2 and theaverage Nash in the calibration and validation periods for allmodels are plotted in Figure 8

The results showed that the RF is superior to the com-parative models in the calibration and validation periodsAccording to the average values of Nash RMSEMAE119877 andBias the RF for the 61 stations are 098 080 058 099 and000 in the calibration period and 094 146 112 097 and021 in the validation period respectively All of these criteriaare superior to those of the comparativemodels It can also beobserved in Figure 8 that the RF shows higher precision andmore stability over the other models in the study area

The results of the two parameter selection are alsocompared for the MLR ANN and SVM models The PAR is

Advances in Meteorology 7

Calibration Validation

08

082

084

086

088

09

092

094

096

098

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ar

AN

N-p

ca

Model

078

08

082

084

086

088

09

092

094

096

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ca

AN

N-p

ar

Model

Figure 8 Average Nash in calibration and validation period for all models

Table 2 Performance assessment for predictands in calibration and validation

Models Periods Nash RMSE MAE 119877 Bias

MLR-par Calibration 092 176 135 096 000Validation 092 166 129 096 001

MLR-pca Calibration 090 196 151 095 000Validation 088 200 155 094 031

ANN-par Calibration 093 156 119 097 000Validation 093 152 117 097 007

ANN-pca Calibration 092 168 128 096 000Validation 091 172 132 096 035

SVM-par Calibration 092 176 134 096 minus009Validation 092 166 128 096 minus006

SVM-pca Calibration 089 197 150 095 minus010Validation 088 199 153 094 022

RF Calibration 098 080 058 099 000Validation 094 146 112 097 021

superior to PCA in theMLR ANN and SVMmodels For theANNmodels the average119877 are similar in both the calibrationand validation period however the average Nash RMSEMAE and Bias of the results using PAR are superior to thoseof PCA with decreases of 001 012 008 and 0 in the cali-bration period and 002 020 015 and 028 in the validationperiod respectively For the SVM models the increases inaverage Nash and119877 are 003 and 001 in the calibration periodand 004 and 002 in the validation period respectively Thedecreases of average RMSE MAE and Bias are 021 016and 001 in the calibration period and 033 025 and 028

in the validation period respectively For the same the PARis superior to PCA for MLR for most of the criteria withincreases in Nash and 119877 and decreases in RMSE and MAE

The spatial distributions of model precision of the RFand the comparative models are shown in Figure 9 inwhich Nash is selected as the evaluating criteria For thecomparative results the PAR was used in the MLR ANNand SVM modeling It is shown that the RF is superior tothe comparative models at most of the individual stations Inaddition it can be observed that the precision of the RF instations located in the plain region is higher than that in the

8 Advances in Meteorology

MLR calibration

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basin

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

le088

ge096

N

(a) MLR calibration

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

088ndash090090ndash092

092ndash094094ndash096

MLR validationPearl River Basinle088

ge096

N

(b) MLR validation

ANN calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(c) ANN calibration

ANN validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(d) ANN validation

SVM calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(e) SVM calibration

SVM validation

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

0 150 300(km)

N

(f) SVM validation

Figure 9 Continued

Advances in Meteorology 9

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

RF calibration

(g) RF calibration

RF validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(h) RF validation

Figure 9 The spatial distribution of Nash for different models

mountain regions which indicates the limited applicability incomplex terrains

5 Summary and Conclusions

Statistical downscaling models are effective in solving themismatch between large-scale climate models and local scalehydrological responses In this study a statisticalmodel basedon the random forest method was proposed and appliedto the Pearl River basin The objective of this study was toinvestigate whether the RF approach could successfully sim-ulate the complicated relationship between the predictors andpredictands The daily mean temperature observations from61 stations in the Pearl River Basin and the NCEP reanalysisdaily data from 1961 to 2005were selected in order to comparethe results of the RF model with those of the MLR ANNand SVM models The following summarizes the discussionpoints and provides conclusions derived from this analysis

(1) The RF model successfully simulated the relation-ship between the predictors and predictands andperformed better than the MLR ANN and SVMmodels According to five statistical criteria the RFshowed the highest model efficiency in both thecalibration and validation periods In addition it wasobserved that themodel efficiency of the RF increasedcontinuously by considering more predictors In thisstudy all 26 predictors from the NCEP were consid-ered in the RF modeling By taking full advantageof the information in the predictors and avoidingthe influence of noise the RF model performancedominated the other results

(2) The built-in variable importance evaluation processand the OOB samples in the RF made predictorselection convenient The built-in variable impor-tance evaluation process ranked the importance of

each predictor for the prediction of the predictandsThe OOB samples gave the unbiased estimation ofthe model efficiency Both of these were helpful inthe predictor selection Although all predictors wereconsidered in this study they will be most valuablewhen the RF is used in more complex downscalingproblems such as precipitation downscaling

(3) The spatial distribution of model precision at theindividual stations for the RF and the comparativemodels was also discussed Although the RF wassuperior to the comparative models for most of thestations there was still room for improvement for theprediction accuracy in mountainous areas

As this studymainly discussed the development and com-parison of temperature downscalingmethods future projectsin the Pearl River basin were not further explored We willpursue further research in the application of the RF to addi-tionalmeteorological elements such as the precipitationThiswill be helpful in understanding the strength and limitationof the RF method Furthermore the surface parameters likethe elevation slope vegetation and so forth also influencethe distribution of these meteorological elements especiallyin complex terrain [45] wewill explore further in consideringthese parameters as part of model input

Conflicts of Interest

The authors declare no conflicts of interest

Authorsrsquo Contributions

Bo Pang contributed to the literature review statisticalanalysis manuscript preparation and editing Jiajia Yue andGang Zhao contributed to the model design and simulation

10 Advances in Meteorology

and Zongxue Xu contributed to the manuscript revision andreview and supervised the study

Acknowledgments

This research is funded by the Youth Science Foundation ofthe National Natural Science Foundation (51309009) theNational Natural Science Foundation (91125015) andNation-al Key Research and Development Program (during the 13thFive-Year Plan) under Grant no 2016YFC0401309 Ministryof Science and Technology China

References

[1] IPCC Climate Change 2007 The Physical Science Basis Con-tribution of Working Group I to the Fourth Assessment Reportof the Intergovernmental Panel on Climate Change CambridgeUniversity Press Cambridge UK 2007

[2] R L Wilby S P Charles E Zorita B Timbal P Whetton andL O Mearns Guidelines for Use of Climate Scenarios Developedfrom Statistical Downscaling Methods UEA Norwich UK2004

[3] J T Chu J Xia C-Y Xu and V P Singh ldquoStatistical down-scaling of daily mean temperature pan evaporation and pre-cipitation for climate change scenarios in Haihe River ChinardquoTheoretical and Applied Climatology vol 99 no 1-2 pp 149ndash1612010

[4] R L Wilby L E Hay and G H Leavesley ldquoA comparisonof downscaled and raw GCM output implications for climatechange scenarios in the San JuanRiver Basin Coloradordquo Journalof Hydrology vol 225 no 1-2 pp 67ndash91 1999

[5] M K Goyal and C S P Ojha ldquoDownscaling of surfacetemperature for lake catchment in an arid region in India usinglinear multiple regression and neural networksrdquo InternationalJournal of Climatology vol 32 no 4 pp 552ndash566 2012

[6] R Huth ldquoStatistical downscaling in central Europe evaluationof methods and potential predictorsrdquo Climate Research vol 13no 2 pp 91ndash101 1999

[7] D Chen and Y Chen ldquoAssociation between winter temperaturein China and upper air circulation over East Asia revealed bycanonical correlation analysisrdquo Global and Planetary Changevol 37 no 3-4 pp 315ndash325 2003

[8] P Coulibaly Y B Dibike and F Anctil ldquoDownscaling precipi-tation and temperature with temporal neural networksrdquo Journalof Hydrometeorology vol 6 no 4 pp 483ndash496 2005

[9] A Anandhi V V Srinivas D N Kumar and R S NanjundiahldquoRole of predictors in downscaling surface temperature to riverbasin in India for IPCC SRES scenarios using support vectormachinerdquo International Journal of Climatology vol 29 no 4 pp583ndash603 2009

[10] J T Schoof and S C Pryor ldquoDownscaling temperature andprecipitation a comparison of regression-based methods andartificial neural networksrdquo International Journal of Climatologyvol 21 no 7 pp 773ndash790 2001

[11] E Kostopoulou CGiannakopoulos CAnagnostopoulou et alldquoSimulatingmaximumandminimum temperature overGreecea comparison of three downscaling techniquesrdquoTheoretical andApplied Climatology vol 90 no 1-2 pp 65ndash82 2007

[12] D Duhan and A Pandey ldquoStatistical downscaling of tempera-ture using three techniques in the Tons River basin in Central

IndiardquoTheoretical and Applied Climatology vol 121 no 3-4 pp605ndash622 2015

[13] D Maraun H W Rust and T J Osborn ldquoThe annual cycleof heavy precipitation across the United Kingdom a modelbased on extreme value statisticsrdquo International Journal ofClimatology vol 29 no 12 pp 1731ndash1744 2009

[14] M Widmann ldquoOne-dimensional CCA and SVD and theirrelationship to regression mapsrdquo Journal of Climate vol 18 no14 pp 2785ndash2792 2005

[15] M K Tippett T DelSole S J Mason and A G BarnstonldquoRegression-based methods for finding coupled patternsrdquo Jour-nal of Climate vol 21 no 17 pp 4384ndash4398 2008

[16] M Hessami P Gachon T B M J Ouarda and A St-Hilaire ldquoAutomated regression-based statistical downscalingtoolrdquo Environmental Modelling and Software vol 23 no 6 pp813ndash834 2008

[17] Z Liu Z Xu S P Charles G Fu and L Liu ldquoEvaluation of twostatistical downscaling models for daily precipitation over anarid basin in Chinardquo International Journal of Climatology vol31 no 13 pp 2006ndash2020 2011

[18] C Yang N Wang S Wang and L Zhou ldquoPerformancecomparison of three predictor selection methods for statisticaldownscaling of daily precipitationrdquo Theoretical and AppliedClimatology pp 1ndash12 2016

[19] I Hanssen-Bauer E J Foslashrland J E Haugen and O E TveitoldquoTemperature and precipitation scenarios for Norway com-parison of results from dynamical and empirical downscalingrdquoClimate Research vol 25 no 1 pp 15ndash27 2003

[20] A Hannachi I T Jolliffe and D B Stephenson ldquoEmpiricalorthogonal functions and related techniques in atmosphericscience a reviewrdquo International Journal of Climatology vol 27no 9 pp 1119ndash1152 2007

[21] O Fistikoglu andUOkkan ldquoStatistical downscaling ofmonthlyprecipitation using NCEPNCAR reanalysis data for tahtaliriver basin in Turkeyrdquo Journal of Hydrologic Engineering vol 16no 2 pp 157ndash164 2010

[22] A Sharma ldquoSeasonal to interannual rainfall probabilistic fore-casts for improved water supply management part 1mdashA strat-egy for system predictor identificationrdquo Journal of Hydrologyvol 239 no 1ndash4 pp 232ndash239 2000

[23] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[24] M Malekipirbazari and V Aksakalli ldquoRisk assessment in sociallending via random forestsrdquo Expert Systems with Applicationsvol 42 no 10 pp 4621ndash4631 2015

[25] P Baudron F Alonso-Sarrıa J L Garcıa-Arostegui F Canovas-Garcıa D Martınez-Vicente and J Moreno-Brotons ldquoIdentify-ing the origin of groundwater samples in a multi-layer aquifersystemwithRandomForest classificationrdquo Journal ofHydrologyvol 499 pp 303ndash315 2013

[26] K Tatsumi Y Yamashiki M A Canales Torres and C L RTaipe ldquoCrop classification of upland fields using Random forestof time-series Landsat 7 ETM+datardquoComputers and Electronicsin Agriculture vol 115 pp 171ndash179 2015

[27] Z Wang C Lai X Chen B Yang S Zhao and X Bai ldquoFloodhazard risk assessment model based on random forestrdquo Journalof Hydrology vol 527 pp 1130ndash1141 2015

[28] P O Gislason J A Benediktsson and J R Sveinsson ldquoRandomforests for land cover classificationrdquo Pattern Recognition Lettersvol 27 no 4 pp 294ndash300 2006

Advances in Meteorology 11

[29] M Pal ldquoRandom forest classifier for remote sensing classifica-tionrdquo International Journal of Remote Sensing vol 26 no 1 pp217ndash222 2005

[30] V F Rodriguez-Galiano M Chica-Olmo F Abarca-Hernan-dez P M Atkinson and C Jeganathan ldquoRandom Forestclassification of Mediterranean land cover using multi-seasonalimagery and multi-seasonal texturerdquo Remote Sensing of Envi-ronment vol 121 pp 93ndash107 2012

[31] V F Rodriguez-Galiano B Ghimire J Rogan M Chica-Olmoand J P Rigol-Sanchez ldquoAn assessment of the effectiveness ofa random forest classifier for land-cover classificationrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 67 no 1pp 93ndash104 2012

[32] C Strobl and A Zeileis ldquoDanger high power mdash Exploringthe statistical properties of a test for random forest variableimportancerdquo Tech Rep 17 University of Munich 2008

[33] V Svetnik A Liaw C Tong J Christopher Culberson R PSheridan and B P Feuston ldquoRandom forest a classification andregression tool for compound classification and QSAR mod-elingrdquo Journal of Chemical Information and Computer Sciencesvol 43 no 6 pp 1947ndash1958 2003

[34] E Eccel L Ghielmi P Granitto R Barbiero F Grazzini and DCesari ldquoPrediction of minimum temperatures in an alpineregion by linear and non-linear post-processing of meteorolog-ical modelsrdquoNonlinear Processes in Geophysics vol 14 no 3 pp211ndash222 2007

[35] Pearl River Water Resources Committee (PRWRC) The Zhu-jiang Archive vol 1 Guangdong Science and Technology PressGuangzhou China 1991 (in Chinese)

[36] Q Zhang C-Y Xu and Z Zhang ldquoObserved changes ofdroughtwetness episodes in the Pearl River basin Chinausing the standardized precipitation index and aridity indexrdquoTheoretical and Applied Climatology vol 98 no 1-2 pp 89ndash992009

[37] E V Dmitriev I V Nogotkov V S Rogutov G Komenko andA Chavro ldquoTemporal error estimate for statistical downscalingregional meteorological modelsrdquo Fısica de la Tierra vol 19 pp219ndash241 2007

[38] C Lavaysse M Vrac P Drobinski M Lengaigne and TVischel ldquoStatistical downscaling of the French Mediterraneanclimate assessment for present and projection in an anthro-pogenic scenariordquo Natural Hazards and Earth System Sciencevol 12 no 3 pp 651ndash670 2012

[39] L Liu Z Liu X Ren T Fischer and Y Xu ldquoHydrologicalimpacts of climate change in the Yellow River Basin for the 21stcentury using hydrological model and statistical downscalingmodelrdquo Quaternary International vol 244 no 2 pp 211ndash2202011

[40] F-F Ai J Bin Z-M Zhang et al ldquoApplication of randomforests to select premiumquality vegetable oils by their fatty acidcompositionrdquo Food Chemistry vol 143 pp 472ndash478 2014

[41] P Gislason J Benediktsson and J Sveinsson ldquoRandom forestclassification of multisource remote sensing and geographicdatardquo in Proceedings of the Geoscience and Remote Sensing Sym-posium vol 2 pp 1049ndash1052 IEEE Anchorage Alaska USA2004

[42] B C Hewitson and R G Crane ldquoClimate downscaling tech-niques and applicationrdquo Climate Research vol 7 no 2 pp 85ndash95 1996

[43] B Timbal A Dufour and B McAvaney ldquoAn estimate of fu-ture climate change for western France using a statistical

downscaling techniquerdquo Climate Dynamics vol 20 no 7-8 pp807ndash823 2003

[44] K Tatsumi T Oizumi and Y Yamashiki ldquoEffects of climatechange on daily minimum and maximum temperatures andcloudiness in the Shikoku region a statistical downscalingmodel approachrdquoTheoretical and Applied Climatology vol 120no 1-2 pp 87ndash98 2015

[45] Z A Holden J T Abatzoglou C H Luce and L S BaggettldquoEmpirical downscaling of daily minimum air temperature atvery fine resolutions in complex terrainrdquoAgricultural and ForestMeteorology vol 151 no 8 pp 1066ndash1073 2011

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ClimatologyJournal of

EcologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

EarthquakesJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mining

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporation httpwwwhindawicom Volume 201

International Journal of

OceanographyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal ofPetroleum Engineering

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Atmospheric SciencesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MineralogyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MeteorologyAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geological ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geology Advances in

Page 3: Statistical Downscaling of Temperature with the Random ...downloads.hindawi.com/journals/amete/2017/7265178.pdf · Statistical downscaling techniques can be divided into threecategories:weathertyping,weathergenerators,andre-gression-based

Advances in Meteorology 3

161ndash201202ndash218

219ndash236237ndash258Pearl River Basin

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N

N

20∘09984000998400998400

N

Mean temperature ( ∘C)105ndash160

(a) Observation of annual mean temperature

0 150 300(km)

Standard deviation30ndash4647ndash5758ndash63

64ndash6970ndash79Pearl River Basin

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(b) Standard deviation of observed temperature

Figure 2 Observed mean temperature and standard deviation at the meteorological stations

change is large The precipitation is concentrated duringAprilndashSeptember [36] accounting for 72ndash88 of the annualprecipitation [35]

The Pearl River basin is a rich water resource The waterresource in the entire river basin is 4700 cubic meters percapita equivalent to 17 times the national per capita rate butthe spatial and temporal distribution of the water resourceare uneven and basin flooding waterlogging drought andsaltiness create frequent natural disasters The Pearl Riverbasin is a highly developed region having a prominentposition in the economic development of China

22 Data Description

221 Temperature Data The observed meteorological dataused in this study are the daily mean temperatures at61 national standard stations in the Pearl River basin Acontinuous data series for the period of 1961ndash2005 wasselected for the study These observations were obtainedfrom the National Climate Center which is in charge ofmonitoring collecting compiling and releasing high qualityhydrological data in China The observed mean temperatureand standard deviation at the meteorological stations areshown in Figure 2 The mean daily temperature shows anincreasing trend from the north to south in the periodHowever the standard deviation showed an opposite trend

222 Large-Scale Atmospheric Variables TheNCEP reanaly-sis daily data were downloaded from the website httpwwwcdcnoaagov which included 26 large-scale atmosphericvariables at a scale of 25∘ times 25∘ that were derived from thedataset over the period of 1961ndash2005Thedetails of the predic-tors are shown in Table 1 The NCEP data were interpolated

Table 1 NCEP reanalysis predictors

Number Predictor Description1 mslp Mean sea level pressure2 p_f Surface airflow strength3 p_u Surface zonal velocity4 p_v Surface meridional velocity5 p_z Surface vorticity6 p_th Surface wind direction7 p_zh Surface divergence8 p5_f 500 hPa airflow strength9 p5_u 500 hPa zonal velocity10 p5_v 500 hPa meridional velocity11 p5_z 500 hPa vorticity12 p5th 500 hPa wind direction13 p5zh 500 hPa divergence14 p8_f 850 hPa airflow strength15 p8_u 850 hPa zonal velocity16 p8_v 850 hPa meridional velocity17 p8_z 850 hPa vorticity18 p8th 850 hPa wind direction19 p8zh 850 hPa divergence20 p500 500 hPa geopotential height21 p850 850 hPa geopotential height22 r500 500 hPa relative humidity23 r850 850 hPa relative humidity24 rhum Surface relative humidity25 shum Surface-specific humidity26 temp Mean temperature at 2m height

4 Advances in Meteorology

Trainingdataset(TD)

Randomized

Regressionresult 1

Regressionresult 2

Regressionresult 3

Regressionresult k

The averageof all results

TD1

TD2

TD3

TDk

Figure 3 Structure of a random forest model

to each station using the bilinear interpolationmethodwhichhas been widely used in statistical downscaling [37ndash39]

3 Downscaling by Using the RandomForest Method

31 Random Forest Method

311 Methodology The random forest (RF) method is anenhanced classification and regression tree (CART) methodproposed by Breiman in 2001 which consists of an ensembleof unpruned decision trees generated through bootstrapsamples of the training data and random variable subsetselection

As shown in Figure 3 the RF is composed of a set ofCARTs The accuracy of the RF prediction depends on thestrength of the individual CARTs [23] Each CART consistsof a root node internal nodes and leaves Each internal nodeis associated with a test function to split the incoming dataFor regression trees splitting is made in accordance witha squared residuals minimization algorithm which impliesthat the expected sum variances in the two resulting nodesshould be minimized as shown in

argmin119909119895le119909119877119895119895=1119872

[119901119897 var (119884119897) + 119901120591 var (119884120591)] (1)

Here 119901119897 and 119901120591 are fractions of samples in the left andright nodes var(119884119897) and var(119884120591) are response vectors forcorresponding left and right child nodes and 119909119895 le 119909119877119895 119895 =1 2 119872 are optimal splitting questions

Compared to traditional CART methods that use wholedata sets the RF trains each individual CART on bootstrapresamples (119872 samples) of the total dataset Instead of usingall the features the RF uses a random selection of features tosplit each node The best split is chosen among a randomlyselected subset of Ntry input variables at each node The treeis then grown to the maximum size without pruning In thisway119872 CARTs are grown and the final output is the averageof the predictions of those trees

312 Importance of Variables The RF performs an inbuiltcross-validation in parallel to the training process by usingout-of-bag (OOB) samples which are not chosen during thebootstrap split In the regression mode the total learningerror is obtained by averaging the prediction error of eachindividual tree using their OOB samples as shown in

MSE asymp MSEOOB = 1119899119899sum119894=1

[ (119883119894) minus 119884119894]2 (2)

where 119899 is the total number of OOB samples (119883119894) is the RFoutput corresponding to a given input sample119883119894 and119884119894 is theobserved output

The RF provides two methods of evaluating the impor-tance of each variable [27] The first method evaluatesthe variable importance based on how much poorer theprediction will be if the variable is permuted randomlyThe prediction errors of the OOB samples of each tree(termed EOOB1) are calculated during the training procedureAt the same time each input variable in the OOB samplesis permuted one at a time These modified datasets arealso predicted by the tree (termed EOOB2) At the end ofthe training procedure the importance of each variable isobtained by averaging the difference between EOOB1 andEOOB2 It is then normalized by the standard deviation Thesecond method is based on the calculation of the nodeimpurity criterion As illustrated in (1) we can calculate howmuch the split decreases the node impurity For regressiontrees the decrease of the node impurity can be calculatedusing the difference between the residual sum of squares(RSS) before and after the split The importance of a variablecan be rapidly calculated by combining the decreases of nodeimpurity for the variable over all trees [40]

32 Model Implementation and Validation TheRF is utilizedto simulate the nonlinear relationship between the NCEPpredictors and the observed temperatures at the 61 stationsIn this study the RF algorithm is implemented in the Rprogramming language with a package ldquorandom forestrdquowhich has built-in functions to measure variable importance

Advances in Meteorology 5

Guangzhou Station Nanning Station

0

05

1

15

2

25Si

gnifi

canc

e

5 10 15 20 25 300Number of predictors

0

02

04

06

08

1

12

14

16

18

2

Sign

ifica

nce

5 10 15 20 25 300Number of predictors

times105

times105

Figure 4 Relative significance of the predictors at Guangzhou Station and Nanning Station

As illustrated previously 119872 and Ntry are two sensitiveparameters in the RF models Ntry is the square root of thetotal number of variables [41]119872 influences the convergenceof the RF and can be determined through the OOB error

For model development the daily mean temperatureseries from the national standard stations and the large-scaleatmospheric variables of the NCEP data are divided into twodatasetsThe first 31 years (1961ndash1991) are used for calibratingthe regression model while the remaining 14 years of data(1992ndash2005) are used to validate the model For testing theperformance of the proposedmodel two data-drivenmodelsANN and SVM are selected for model comparison whichare commonly used in statistical downscalingThe three-layerbackpropagation artificial neural network (BP-ANN) andLS-SVM are adopted which have been applied successfullyin downscaling temperature [12] The MATLAB functionsldquonewffrdquo and ldquotunelssvmrdquo are employed for obtaining thevalues of the models parameters Multiple linear regressionanalysis is adopted for the MATLAB implementation ThePAR [18] and PCA [19 20] methods are used in predictorselection for the MLR ANN and SVMmodels

33 Model Performance Analysis Five criteria are selected toevaluate the performance of the RF and comparative modelsincluding the NashndashSutcliffe model efficiency index (Nash)root mean square error (RMSE) mean absolute error (MAE)correlation coefficient (119877) and model Bias (Bias) which aredefined as

Nash = 1 minus sum119899119894=1 (119910119894obs minus 119910119894sim)2sum119899119894=1 (119910119894obs minus 119910obs)2

RMSE = radicsum119899119894=1 (119910119894obs minus 119910119894sim)2119899

MAE = sum119899119894=1 1003816100381610038161003816119910119894obs minus 119910119894sim1003816100381610038161003816119899 119877 = sum119899119894=1 (119910119894obs minus 119910obs) (119910119894sim minus 119910sim)radicsum119899119894=1 (119910119894obs minus 119910obs)2 (119910119894sim minus 119910sim)2

Bias = sum119899119894=1 (119910119894sim minus 119910119894s)119899 (3)

Here119910obs is the vector of the observed predictands119910obs isthemean of the observed predictands 119910sim is the vector of thesimulated predictands and 119910sim is the mean of the simulatedpredictands In general a higher Nash and 119877 indicate bettermodel efficiency In contrast smaller values of RMSE MAEand Bias indicate higher accuracy in the model prediction

4 Result Analysis and Discussion

41The Choice of Predictors One of themost important stepsin the development of downscaling models is the choice ofappropriate predictors [42] The RF can evaluate the relativecontribution of each predictor for downscaling results bycombining the RSS decreases over all trees which makes itconvenient in choosing predictors for the RFUsing two of thestations as examples Figure 4 shows the relative importanceof the predictors for the predictands at Guangzhou andNanning Stations where the number of the predictor hasbeen given as indicated in Table 1 For a comprehensiveinvestigation of the predictorrsquos importance in the Pearl Riverbasin stations the rank (the most important predictor isindicated as 1) of the relative importance of the predictor ineach station is calculated and indicated in Figure 5

6 Advances in Meteorology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Number of predictors

Random forest

0

5

10

15

20

25

Rank

Figure 5 Rank of relative importance of predictors in the 61 sta-tions

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Numbers of predictors

123456789

1011

EOO

B

Figure 6 MSEs of out-of-bag samples

According to Figures 4 and 5 the number 26 predictorthe mean temperature at 2m height is the most importantpredictor for all RF models at the 61 stations Similarly thenumber 25 predictor the surface-specific humidity rankedsecond for predictor importance at all stations In contrastthe number 5 predictor surface vorticity is the least impor-tant predictor for all stations The ranks of the other predic-tors vary for the different stations whichmay be caused by themeteorological and geographical differences of the stations

Because the OOB samples can provide unbiased esti-mation of the RF model performance the rationality ofincluding each factor can be tested using the MSEs of theOOB (indicated by EOOB)The predictors of each station arescreened using the ranks of relative importance which wereevaluated in previous stepThen theMSE of theOOB samplesin each station is calculated with the increase of the predictorcombination and plotted in Figure 6

The results show that EOOB generally decreases butat a decreasing rate as more predictors are included Thisindicates that the RF avoids overfitting successfully Howeverthe improvement of model performance by including morepredictors is slight when the predictor number exceeds acertain number which is seven in this study With enoughcomputation resources all predictors are chosen in the RFmodeling to downscale the temperature for these stations

42 Comparative Study The performance of the RF modelis compared with that of the MLR ANN and SVM models

Partial Correlation Analysis

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Number of predictors

0

5

10

15

20

25

Rank

Figure 7 Rank of partial correlation of predictors in the 61 stations

Two predictor selection methods PAR and PCA are appliedin the four models

The partial correlations of the 26 predictors with thepredictands are shown in Figure 7 It can be observed that theranks of the predictorsrsquo partial correlation vary in differentstations In general the number 1 predictor mean sea levelpressure has the largest partial correlation in most of thestations However the number 5 and number 8 predic-tors corresponding to surface vorticity and 500 hPa airflowstrength have the smallest partial correlations in themajorityof the stations

The partial correlation coefficients are used to decide thevariables that are included in the input combination Based onformer studies [18 43 44] a combination of seven predictorswith the highest partial correlation coefficients are selectedand applied in the MLR ANN and SVM models Theseresults are marked as MLR-par ANN-par and SVM-par

PCA which was commonly used by previous researchersis also used in the comparative study Before PCA thepredictors are standardized by subtracting themean from theoriginal values and then dividing the results by the standarddeviation of the original variables The PCA method is thenapplied to the standardized NCEP predictor variables toextract principal components (PCs) that are orthogonal Theobtained PCs preservemore than 90 of the variance presentat each station Then the PCs are used in the MLR ANNand SVM modeling and these results are marked as MLR-pca ANN-pca and SVM-pca

The calibration and validation results of the RF andcomparative models are summarized in Table 2 and theaverage Nash in the calibration and validation periods for allmodels are plotted in Figure 8

The results showed that the RF is superior to the com-parative models in the calibration and validation periodsAccording to the average values of Nash RMSEMAE119877 andBias the RF for the 61 stations are 098 080 058 099 and000 in the calibration period and 094 146 112 097 and021 in the validation period respectively All of these criteriaare superior to those of the comparativemodels It can also beobserved in Figure 8 that the RF shows higher precision andmore stability over the other models in the study area

The results of the two parameter selection are alsocompared for the MLR ANN and SVM models The PAR is

Advances in Meteorology 7

Calibration Validation

08

082

084

086

088

09

092

094

096

098

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ar

AN

N-p

ca

Model

078

08

082

084

086

088

09

092

094

096

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ca

AN

N-p

ar

Model

Figure 8 Average Nash in calibration and validation period for all models

Table 2 Performance assessment for predictands in calibration and validation

Models Periods Nash RMSE MAE 119877 Bias

MLR-par Calibration 092 176 135 096 000Validation 092 166 129 096 001

MLR-pca Calibration 090 196 151 095 000Validation 088 200 155 094 031

ANN-par Calibration 093 156 119 097 000Validation 093 152 117 097 007

ANN-pca Calibration 092 168 128 096 000Validation 091 172 132 096 035

SVM-par Calibration 092 176 134 096 minus009Validation 092 166 128 096 minus006

SVM-pca Calibration 089 197 150 095 minus010Validation 088 199 153 094 022

RF Calibration 098 080 058 099 000Validation 094 146 112 097 021

superior to PCA in theMLR ANN and SVMmodels For theANNmodels the average119877 are similar in both the calibrationand validation period however the average Nash RMSEMAE and Bias of the results using PAR are superior to thoseof PCA with decreases of 001 012 008 and 0 in the cali-bration period and 002 020 015 and 028 in the validationperiod respectively For the SVM models the increases inaverage Nash and119877 are 003 and 001 in the calibration periodand 004 and 002 in the validation period respectively Thedecreases of average RMSE MAE and Bias are 021 016and 001 in the calibration period and 033 025 and 028

in the validation period respectively For the same the PARis superior to PCA for MLR for most of the criteria withincreases in Nash and 119877 and decreases in RMSE and MAE

The spatial distributions of model precision of the RFand the comparative models are shown in Figure 9 inwhich Nash is selected as the evaluating criteria For thecomparative results the PAR was used in the MLR ANNand SVM modeling It is shown that the RF is superior tothe comparative models at most of the individual stations Inaddition it can be observed that the precision of the RF instations located in the plain region is higher than that in the

8 Advances in Meteorology

MLR calibration

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basin

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

le088

ge096

N

(a) MLR calibration

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

088ndash090090ndash092

092ndash094094ndash096

MLR validationPearl River Basinle088

ge096

N

(b) MLR validation

ANN calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(c) ANN calibration

ANN validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(d) ANN validation

SVM calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(e) SVM calibration

SVM validation

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

0 150 300(km)

N

(f) SVM validation

Figure 9 Continued

Advances in Meteorology 9

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

RF calibration

(g) RF calibration

RF validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(h) RF validation

Figure 9 The spatial distribution of Nash for different models

mountain regions which indicates the limited applicability incomplex terrains

5 Summary and Conclusions

Statistical downscaling models are effective in solving themismatch between large-scale climate models and local scalehydrological responses In this study a statisticalmodel basedon the random forest method was proposed and appliedto the Pearl River basin The objective of this study was toinvestigate whether the RF approach could successfully sim-ulate the complicated relationship between the predictors andpredictands The daily mean temperature observations from61 stations in the Pearl River Basin and the NCEP reanalysisdaily data from 1961 to 2005were selected in order to comparethe results of the RF model with those of the MLR ANNand SVM models The following summarizes the discussionpoints and provides conclusions derived from this analysis

(1) The RF model successfully simulated the relation-ship between the predictors and predictands andperformed better than the MLR ANN and SVMmodels According to five statistical criteria the RFshowed the highest model efficiency in both thecalibration and validation periods In addition it wasobserved that themodel efficiency of the RF increasedcontinuously by considering more predictors In thisstudy all 26 predictors from the NCEP were consid-ered in the RF modeling By taking full advantageof the information in the predictors and avoidingthe influence of noise the RF model performancedominated the other results

(2) The built-in variable importance evaluation processand the OOB samples in the RF made predictorselection convenient The built-in variable impor-tance evaluation process ranked the importance of

each predictor for the prediction of the predictandsThe OOB samples gave the unbiased estimation ofthe model efficiency Both of these were helpful inthe predictor selection Although all predictors wereconsidered in this study they will be most valuablewhen the RF is used in more complex downscalingproblems such as precipitation downscaling

(3) The spatial distribution of model precision at theindividual stations for the RF and the comparativemodels was also discussed Although the RF wassuperior to the comparative models for most of thestations there was still room for improvement for theprediction accuracy in mountainous areas

As this studymainly discussed the development and com-parison of temperature downscalingmethods future projectsin the Pearl River basin were not further explored We willpursue further research in the application of the RF to addi-tionalmeteorological elements such as the precipitationThiswill be helpful in understanding the strength and limitationof the RF method Furthermore the surface parameters likethe elevation slope vegetation and so forth also influencethe distribution of these meteorological elements especiallyin complex terrain [45] wewill explore further in consideringthese parameters as part of model input

Conflicts of Interest

The authors declare no conflicts of interest

Authorsrsquo Contributions

Bo Pang contributed to the literature review statisticalanalysis manuscript preparation and editing Jiajia Yue andGang Zhao contributed to the model design and simulation

10 Advances in Meteorology

and Zongxue Xu contributed to the manuscript revision andreview and supervised the study

Acknowledgments

This research is funded by the Youth Science Foundation ofthe National Natural Science Foundation (51309009) theNational Natural Science Foundation (91125015) andNation-al Key Research and Development Program (during the 13thFive-Year Plan) under Grant no 2016YFC0401309 Ministryof Science and Technology China

References

[1] IPCC Climate Change 2007 The Physical Science Basis Con-tribution of Working Group I to the Fourth Assessment Reportof the Intergovernmental Panel on Climate Change CambridgeUniversity Press Cambridge UK 2007

[2] R L Wilby S P Charles E Zorita B Timbal P Whetton andL O Mearns Guidelines for Use of Climate Scenarios Developedfrom Statistical Downscaling Methods UEA Norwich UK2004

[3] J T Chu J Xia C-Y Xu and V P Singh ldquoStatistical down-scaling of daily mean temperature pan evaporation and pre-cipitation for climate change scenarios in Haihe River ChinardquoTheoretical and Applied Climatology vol 99 no 1-2 pp 149ndash1612010

[4] R L Wilby L E Hay and G H Leavesley ldquoA comparisonof downscaled and raw GCM output implications for climatechange scenarios in the San JuanRiver Basin Coloradordquo Journalof Hydrology vol 225 no 1-2 pp 67ndash91 1999

[5] M K Goyal and C S P Ojha ldquoDownscaling of surfacetemperature for lake catchment in an arid region in India usinglinear multiple regression and neural networksrdquo InternationalJournal of Climatology vol 32 no 4 pp 552ndash566 2012

[6] R Huth ldquoStatistical downscaling in central Europe evaluationof methods and potential predictorsrdquo Climate Research vol 13no 2 pp 91ndash101 1999

[7] D Chen and Y Chen ldquoAssociation between winter temperaturein China and upper air circulation over East Asia revealed bycanonical correlation analysisrdquo Global and Planetary Changevol 37 no 3-4 pp 315ndash325 2003

[8] P Coulibaly Y B Dibike and F Anctil ldquoDownscaling precipi-tation and temperature with temporal neural networksrdquo Journalof Hydrometeorology vol 6 no 4 pp 483ndash496 2005

[9] A Anandhi V V Srinivas D N Kumar and R S NanjundiahldquoRole of predictors in downscaling surface temperature to riverbasin in India for IPCC SRES scenarios using support vectormachinerdquo International Journal of Climatology vol 29 no 4 pp583ndash603 2009

[10] J T Schoof and S C Pryor ldquoDownscaling temperature andprecipitation a comparison of regression-based methods andartificial neural networksrdquo International Journal of Climatologyvol 21 no 7 pp 773ndash790 2001

[11] E Kostopoulou CGiannakopoulos CAnagnostopoulou et alldquoSimulatingmaximumandminimum temperature overGreecea comparison of three downscaling techniquesrdquoTheoretical andApplied Climatology vol 90 no 1-2 pp 65ndash82 2007

[12] D Duhan and A Pandey ldquoStatistical downscaling of tempera-ture using three techniques in the Tons River basin in Central

IndiardquoTheoretical and Applied Climatology vol 121 no 3-4 pp605ndash622 2015

[13] D Maraun H W Rust and T J Osborn ldquoThe annual cycleof heavy precipitation across the United Kingdom a modelbased on extreme value statisticsrdquo International Journal ofClimatology vol 29 no 12 pp 1731ndash1744 2009

[14] M Widmann ldquoOne-dimensional CCA and SVD and theirrelationship to regression mapsrdquo Journal of Climate vol 18 no14 pp 2785ndash2792 2005

[15] M K Tippett T DelSole S J Mason and A G BarnstonldquoRegression-based methods for finding coupled patternsrdquo Jour-nal of Climate vol 21 no 17 pp 4384ndash4398 2008

[16] M Hessami P Gachon T B M J Ouarda and A St-Hilaire ldquoAutomated regression-based statistical downscalingtoolrdquo Environmental Modelling and Software vol 23 no 6 pp813ndash834 2008

[17] Z Liu Z Xu S P Charles G Fu and L Liu ldquoEvaluation of twostatistical downscaling models for daily precipitation over anarid basin in Chinardquo International Journal of Climatology vol31 no 13 pp 2006ndash2020 2011

[18] C Yang N Wang S Wang and L Zhou ldquoPerformancecomparison of three predictor selection methods for statisticaldownscaling of daily precipitationrdquo Theoretical and AppliedClimatology pp 1ndash12 2016

[19] I Hanssen-Bauer E J Foslashrland J E Haugen and O E TveitoldquoTemperature and precipitation scenarios for Norway com-parison of results from dynamical and empirical downscalingrdquoClimate Research vol 25 no 1 pp 15ndash27 2003

[20] A Hannachi I T Jolliffe and D B Stephenson ldquoEmpiricalorthogonal functions and related techniques in atmosphericscience a reviewrdquo International Journal of Climatology vol 27no 9 pp 1119ndash1152 2007

[21] O Fistikoglu andUOkkan ldquoStatistical downscaling ofmonthlyprecipitation using NCEPNCAR reanalysis data for tahtaliriver basin in Turkeyrdquo Journal of Hydrologic Engineering vol 16no 2 pp 157ndash164 2010

[22] A Sharma ldquoSeasonal to interannual rainfall probabilistic fore-casts for improved water supply management part 1mdashA strat-egy for system predictor identificationrdquo Journal of Hydrologyvol 239 no 1ndash4 pp 232ndash239 2000

[23] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[24] M Malekipirbazari and V Aksakalli ldquoRisk assessment in sociallending via random forestsrdquo Expert Systems with Applicationsvol 42 no 10 pp 4621ndash4631 2015

[25] P Baudron F Alonso-Sarrıa J L Garcıa-Arostegui F Canovas-Garcıa D Martınez-Vicente and J Moreno-Brotons ldquoIdentify-ing the origin of groundwater samples in a multi-layer aquifersystemwithRandomForest classificationrdquo Journal ofHydrologyvol 499 pp 303ndash315 2013

[26] K Tatsumi Y Yamashiki M A Canales Torres and C L RTaipe ldquoCrop classification of upland fields using Random forestof time-series Landsat 7 ETM+datardquoComputers and Electronicsin Agriculture vol 115 pp 171ndash179 2015

[27] Z Wang C Lai X Chen B Yang S Zhao and X Bai ldquoFloodhazard risk assessment model based on random forestrdquo Journalof Hydrology vol 527 pp 1130ndash1141 2015

[28] P O Gislason J A Benediktsson and J R Sveinsson ldquoRandomforests for land cover classificationrdquo Pattern Recognition Lettersvol 27 no 4 pp 294ndash300 2006

Advances in Meteorology 11

[29] M Pal ldquoRandom forest classifier for remote sensing classifica-tionrdquo International Journal of Remote Sensing vol 26 no 1 pp217ndash222 2005

[30] V F Rodriguez-Galiano M Chica-Olmo F Abarca-Hernan-dez P M Atkinson and C Jeganathan ldquoRandom Forestclassification of Mediterranean land cover using multi-seasonalimagery and multi-seasonal texturerdquo Remote Sensing of Envi-ronment vol 121 pp 93ndash107 2012

[31] V F Rodriguez-Galiano B Ghimire J Rogan M Chica-Olmoand J P Rigol-Sanchez ldquoAn assessment of the effectiveness ofa random forest classifier for land-cover classificationrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 67 no 1pp 93ndash104 2012

[32] C Strobl and A Zeileis ldquoDanger high power mdash Exploringthe statistical properties of a test for random forest variableimportancerdquo Tech Rep 17 University of Munich 2008

[33] V Svetnik A Liaw C Tong J Christopher Culberson R PSheridan and B P Feuston ldquoRandom forest a classification andregression tool for compound classification and QSAR mod-elingrdquo Journal of Chemical Information and Computer Sciencesvol 43 no 6 pp 1947ndash1958 2003

[34] E Eccel L Ghielmi P Granitto R Barbiero F Grazzini and DCesari ldquoPrediction of minimum temperatures in an alpineregion by linear and non-linear post-processing of meteorolog-ical modelsrdquoNonlinear Processes in Geophysics vol 14 no 3 pp211ndash222 2007

[35] Pearl River Water Resources Committee (PRWRC) The Zhu-jiang Archive vol 1 Guangdong Science and Technology PressGuangzhou China 1991 (in Chinese)

[36] Q Zhang C-Y Xu and Z Zhang ldquoObserved changes ofdroughtwetness episodes in the Pearl River basin Chinausing the standardized precipitation index and aridity indexrdquoTheoretical and Applied Climatology vol 98 no 1-2 pp 89ndash992009

[37] E V Dmitriev I V Nogotkov V S Rogutov G Komenko andA Chavro ldquoTemporal error estimate for statistical downscalingregional meteorological modelsrdquo Fısica de la Tierra vol 19 pp219ndash241 2007

[38] C Lavaysse M Vrac P Drobinski M Lengaigne and TVischel ldquoStatistical downscaling of the French Mediterraneanclimate assessment for present and projection in an anthro-pogenic scenariordquo Natural Hazards and Earth System Sciencevol 12 no 3 pp 651ndash670 2012

[39] L Liu Z Liu X Ren T Fischer and Y Xu ldquoHydrologicalimpacts of climate change in the Yellow River Basin for the 21stcentury using hydrological model and statistical downscalingmodelrdquo Quaternary International vol 244 no 2 pp 211ndash2202011

[40] F-F Ai J Bin Z-M Zhang et al ldquoApplication of randomforests to select premiumquality vegetable oils by their fatty acidcompositionrdquo Food Chemistry vol 143 pp 472ndash478 2014

[41] P Gislason J Benediktsson and J Sveinsson ldquoRandom forestclassification of multisource remote sensing and geographicdatardquo in Proceedings of the Geoscience and Remote Sensing Sym-posium vol 2 pp 1049ndash1052 IEEE Anchorage Alaska USA2004

[42] B C Hewitson and R G Crane ldquoClimate downscaling tech-niques and applicationrdquo Climate Research vol 7 no 2 pp 85ndash95 1996

[43] B Timbal A Dufour and B McAvaney ldquoAn estimate of fu-ture climate change for western France using a statistical

downscaling techniquerdquo Climate Dynamics vol 20 no 7-8 pp807ndash823 2003

[44] K Tatsumi T Oizumi and Y Yamashiki ldquoEffects of climatechange on daily minimum and maximum temperatures andcloudiness in the Shikoku region a statistical downscalingmodel approachrdquoTheoretical and Applied Climatology vol 120no 1-2 pp 87ndash98 2015

[45] Z A Holden J T Abatzoglou C H Luce and L S BaggettldquoEmpirical downscaling of daily minimum air temperature atvery fine resolutions in complex terrainrdquoAgricultural and ForestMeteorology vol 151 no 8 pp 1066ndash1073 2011

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ClimatologyJournal of

EcologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

EarthquakesJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mining

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporation httpwwwhindawicom Volume 201

International Journal of

OceanographyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal ofPetroleum Engineering

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Atmospheric SciencesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MineralogyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MeteorologyAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geological ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geology Advances in

Page 4: Statistical Downscaling of Temperature with the Random ...downloads.hindawi.com/journals/amete/2017/7265178.pdf · Statistical downscaling techniques can be divided into threecategories:weathertyping,weathergenerators,andre-gression-based

4 Advances in Meteorology

Trainingdataset(TD)

Randomized

Regressionresult 1

Regressionresult 2

Regressionresult 3

Regressionresult k

The averageof all results

TD1

TD2

TD3

TDk

Figure 3 Structure of a random forest model

to each station using the bilinear interpolationmethodwhichhas been widely used in statistical downscaling [37ndash39]

3 Downscaling by Using the RandomForest Method

31 Random Forest Method

311 Methodology The random forest (RF) method is anenhanced classification and regression tree (CART) methodproposed by Breiman in 2001 which consists of an ensembleof unpruned decision trees generated through bootstrapsamples of the training data and random variable subsetselection

As shown in Figure 3 the RF is composed of a set ofCARTs The accuracy of the RF prediction depends on thestrength of the individual CARTs [23] Each CART consistsof a root node internal nodes and leaves Each internal nodeis associated with a test function to split the incoming dataFor regression trees splitting is made in accordance witha squared residuals minimization algorithm which impliesthat the expected sum variances in the two resulting nodesshould be minimized as shown in

argmin119909119895le119909119877119895119895=1119872

[119901119897 var (119884119897) + 119901120591 var (119884120591)] (1)

Here 119901119897 and 119901120591 are fractions of samples in the left andright nodes var(119884119897) and var(119884120591) are response vectors forcorresponding left and right child nodes and 119909119895 le 119909119877119895 119895 =1 2 119872 are optimal splitting questions

Compared to traditional CART methods that use wholedata sets the RF trains each individual CART on bootstrapresamples (119872 samples) of the total dataset Instead of usingall the features the RF uses a random selection of features tosplit each node The best split is chosen among a randomlyselected subset of Ntry input variables at each node The treeis then grown to the maximum size without pruning In thisway119872 CARTs are grown and the final output is the averageof the predictions of those trees

312 Importance of Variables The RF performs an inbuiltcross-validation in parallel to the training process by usingout-of-bag (OOB) samples which are not chosen during thebootstrap split In the regression mode the total learningerror is obtained by averaging the prediction error of eachindividual tree using their OOB samples as shown in

MSE asymp MSEOOB = 1119899119899sum119894=1

[ (119883119894) minus 119884119894]2 (2)

where 119899 is the total number of OOB samples (119883119894) is the RFoutput corresponding to a given input sample119883119894 and119884119894 is theobserved output

The RF provides two methods of evaluating the impor-tance of each variable [27] The first method evaluatesthe variable importance based on how much poorer theprediction will be if the variable is permuted randomlyThe prediction errors of the OOB samples of each tree(termed EOOB1) are calculated during the training procedureAt the same time each input variable in the OOB samplesis permuted one at a time These modified datasets arealso predicted by the tree (termed EOOB2) At the end ofthe training procedure the importance of each variable isobtained by averaging the difference between EOOB1 andEOOB2 It is then normalized by the standard deviation Thesecond method is based on the calculation of the nodeimpurity criterion As illustrated in (1) we can calculate howmuch the split decreases the node impurity For regressiontrees the decrease of the node impurity can be calculatedusing the difference between the residual sum of squares(RSS) before and after the split The importance of a variablecan be rapidly calculated by combining the decreases of nodeimpurity for the variable over all trees [40]

32 Model Implementation and Validation TheRF is utilizedto simulate the nonlinear relationship between the NCEPpredictors and the observed temperatures at the 61 stationsIn this study the RF algorithm is implemented in the Rprogramming language with a package ldquorandom forestrdquowhich has built-in functions to measure variable importance

Advances in Meteorology 5

Guangzhou Station Nanning Station

0

05

1

15

2

25Si

gnifi

canc

e

5 10 15 20 25 300Number of predictors

0

02

04

06

08

1

12

14

16

18

2

Sign

ifica

nce

5 10 15 20 25 300Number of predictors

times105

times105

Figure 4 Relative significance of the predictors at Guangzhou Station and Nanning Station

As illustrated previously 119872 and Ntry are two sensitiveparameters in the RF models Ntry is the square root of thetotal number of variables [41]119872 influences the convergenceof the RF and can be determined through the OOB error

For model development the daily mean temperatureseries from the national standard stations and the large-scaleatmospheric variables of the NCEP data are divided into twodatasetsThe first 31 years (1961ndash1991) are used for calibratingthe regression model while the remaining 14 years of data(1992ndash2005) are used to validate the model For testing theperformance of the proposedmodel two data-drivenmodelsANN and SVM are selected for model comparison whichare commonly used in statistical downscalingThe three-layerbackpropagation artificial neural network (BP-ANN) andLS-SVM are adopted which have been applied successfullyin downscaling temperature [12] The MATLAB functionsldquonewffrdquo and ldquotunelssvmrdquo are employed for obtaining thevalues of the models parameters Multiple linear regressionanalysis is adopted for the MATLAB implementation ThePAR [18] and PCA [19 20] methods are used in predictorselection for the MLR ANN and SVMmodels

33 Model Performance Analysis Five criteria are selected toevaluate the performance of the RF and comparative modelsincluding the NashndashSutcliffe model efficiency index (Nash)root mean square error (RMSE) mean absolute error (MAE)correlation coefficient (119877) and model Bias (Bias) which aredefined as

Nash = 1 minus sum119899119894=1 (119910119894obs minus 119910119894sim)2sum119899119894=1 (119910119894obs minus 119910obs)2

RMSE = radicsum119899119894=1 (119910119894obs minus 119910119894sim)2119899

MAE = sum119899119894=1 1003816100381610038161003816119910119894obs minus 119910119894sim1003816100381610038161003816119899 119877 = sum119899119894=1 (119910119894obs minus 119910obs) (119910119894sim minus 119910sim)radicsum119899119894=1 (119910119894obs minus 119910obs)2 (119910119894sim minus 119910sim)2

Bias = sum119899119894=1 (119910119894sim minus 119910119894s)119899 (3)

Here119910obs is the vector of the observed predictands119910obs isthemean of the observed predictands 119910sim is the vector of thesimulated predictands and 119910sim is the mean of the simulatedpredictands In general a higher Nash and 119877 indicate bettermodel efficiency In contrast smaller values of RMSE MAEand Bias indicate higher accuracy in the model prediction

4 Result Analysis and Discussion

41The Choice of Predictors One of themost important stepsin the development of downscaling models is the choice ofappropriate predictors [42] The RF can evaluate the relativecontribution of each predictor for downscaling results bycombining the RSS decreases over all trees which makes itconvenient in choosing predictors for the RFUsing two of thestations as examples Figure 4 shows the relative importanceof the predictors for the predictands at Guangzhou andNanning Stations where the number of the predictor hasbeen given as indicated in Table 1 For a comprehensiveinvestigation of the predictorrsquos importance in the Pearl Riverbasin stations the rank (the most important predictor isindicated as 1) of the relative importance of the predictor ineach station is calculated and indicated in Figure 5

6 Advances in Meteorology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Number of predictors

Random forest

0

5

10

15

20

25

Rank

Figure 5 Rank of relative importance of predictors in the 61 sta-tions

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Numbers of predictors

123456789

1011

EOO

B

Figure 6 MSEs of out-of-bag samples

According to Figures 4 and 5 the number 26 predictorthe mean temperature at 2m height is the most importantpredictor for all RF models at the 61 stations Similarly thenumber 25 predictor the surface-specific humidity rankedsecond for predictor importance at all stations In contrastthe number 5 predictor surface vorticity is the least impor-tant predictor for all stations The ranks of the other predic-tors vary for the different stations whichmay be caused by themeteorological and geographical differences of the stations

Because the OOB samples can provide unbiased esti-mation of the RF model performance the rationality ofincluding each factor can be tested using the MSEs of theOOB (indicated by EOOB)The predictors of each station arescreened using the ranks of relative importance which wereevaluated in previous stepThen theMSE of theOOB samplesin each station is calculated with the increase of the predictorcombination and plotted in Figure 6

The results show that EOOB generally decreases butat a decreasing rate as more predictors are included Thisindicates that the RF avoids overfitting successfully Howeverthe improvement of model performance by including morepredictors is slight when the predictor number exceeds acertain number which is seven in this study With enoughcomputation resources all predictors are chosen in the RFmodeling to downscale the temperature for these stations

42 Comparative Study The performance of the RF modelis compared with that of the MLR ANN and SVM models

Partial Correlation Analysis

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Number of predictors

0

5

10

15

20

25

Rank

Figure 7 Rank of partial correlation of predictors in the 61 stations

Two predictor selection methods PAR and PCA are appliedin the four models

The partial correlations of the 26 predictors with thepredictands are shown in Figure 7 It can be observed that theranks of the predictorsrsquo partial correlation vary in differentstations In general the number 1 predictor mean sea levelpressure has the largest partial correlation in most of thestations However the number 5 and number 8 predic-tors corresponding to surface vorticity and 500 hPa airflowstrength have the smallest partial correlations in themajorityof the stations

The partial correlation coefficients are used to decide thevariables that are included in the input combination Based onformer studies [18 43 44] a combination of seven predictorswith the highest partial correlation coefficients are selectedand applied in the MLR ANN and SVM models Theseresults are marked as MLR-par ANN-par and SVM-par

PCA which was commonly used by previous researchersis also used in the comparative study Before PCA thepredictors are standardized by subtracting themean from theoriginal values and then dividing the results by the standarddeviation of the original variables The PCA method is thenapplied to the standardized NCEP predictor variables toextract principal components (PCs) that are orthogonal Theobtained PCs preservemore than 90 of the variance presentat each station Then the PCs are used in the MLR ANNand SVM modeling and these results are marked as MLR-pca ANN-pca and SVM-pca

The calibration and validation results of the RF andcomparative models are summarized in Table 2 and theaverage Nash in the calibration and validation periods for allmodels are plotted in Figure 8

The results showed that the RF is superior to the com-parative models in the calibration and validation periodsAccording to the average values of Nash RMSEMAE119877 andBias the RF for the 61 stations are 098 080 058 099 and000 in the calibration period and 094 146 112 097 and021 in the validation period respectively All of these criteriaare superior to those of the comparativemodels It can also beobserved in Figure 8 that the RF shows higher precision andmore stability over the other models in the study area

The results of the two parameter selection are alsocompared for the MLR ANN and SVM models The PAR is

Advances in Meteorology 7

Calibration Validation

08

082

084

086

088

09

092

094

096

098

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ar

AN

N-p

ca

Model

078

08

082

084

086

088

09

092

094

096

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ca

AN

N-p

ar

Model

Figure 8 Average Nash in calibration and validation period for all models

Table 2 Performance assessment for predictands in calibration and validation

Models Periods Nash RMSE MAE 119877 Bias

MLR-par Calibration 092 176 135 096 000Validation 092 166 129 096 001

MLR-pca Calibration 090 196 151 095 000Validation 088 200 155 094 031

ANN-par Calibration 093 156 119 097 000Validation 093 152 117 097 007

ANN-pca Calibration 092 168 128 096 000Validation 091 172 132 096 035

SVM-par Calibration 092 176 134 096 minus009Validation 092 166 128 096 minus006

SVM-pca Calibration 089 197 150 095 minus010Validation 088 199 153 094 022

RF Calibration 098 080 058 099 000Validation 094 146 112 097 021

superior to PCA in theMLR ANN and SVMmodels For theANNmodels the average119877 are similar in both the calibrationand validation period however the average Nash RMSEMAE and Bias of the results using PAR are superior to thoseof PCA with decreases of 001 012 008 and 0 in the cali-bration period and 002 020 015 and 028 in the validationperiod respectively For the SVM models the increases inaverage Nash and119877 are 003 and 001 in the calibration periodand 004 and 002 in the validation period respectively Thedecreases of average RMSE MAE and Bias are 021 016and 001 in the calibration period and 033 025 and 028

in the validation period respectively For the same the PARis superior to PCA for MLR for most of the criteria withincreases in Nash and 119877 and decreases in RMSE and MAE

The spatial distributions of model precision of the RFand the comparative models are shown in Figure 9 inwhich Nash is selected as the evaluating criteria For thecomparative results the PAR was used in the MLR ANNand SVM modeling It is shown that the RF is superior tothe comparative models at most of the individual stations Inaddition it can be observed that the precision of the RF instations located in the plain region is higher than that in the

8 Advances in Meteorology

MLR calibration

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basin

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

le088

ge096

N

(a) MLR calibration

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

088ndash090090ndash092

092ndash094094ndash096

MLR validationPearl River Basinle088

ge096

N

(b) MLR validation

ANN calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(c) ANN calibration

ANN validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(d) ANN validation

SVM calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(e) SVM calibration

SVM validation

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

0 150 300(km)

N

(f) SVM validation

Figure 9 Continued

Advances in Meteorology 9

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

RF calibration

(g) RF calibration

RF validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(h) RF validation

Figure 9 The spatial distribution of Nash for different models

mountain regions which indicates the limited applicability incomplex terrains

5 Summary and Conclusions

Statistical downscaling models are effective in solving themismatch between large-scale climate models and local scalehydrological responses In this study a statisticalmodel basedon the random forest method was proposed and appliedto the Pearl River basin The objective of this study was toinvestigate whether the RF approach could successfully sim-ulate the complicated relationship between the predictors andpredictands The daily mean temperature observations from61 stations in the Pearl River Basin and the NCEP reanalysisdaily data from 1961 to 2005were selected in order to comparethe results of the RF model with those of the MLR ANNand SVM models The following summarizes the discussionpoints and provides conclusions derived from this analysis

(1) The RF model successfully simulated the relation-ship between the predictors and predictands andperformed better than the MLR ANN and SVMmodels According to five statistical criteria the RFshowed the highest model efficiency in both thecalibration and validation periods In addition it wasobserved that themodel efficiency of the RF increasedcontinuously by considering more predictors In thisstudy all 26 predictors from the NCEP were consid-ered in the RF modeling By taking full advantageof the information in the predictors and avoidingthe influence of noise the RF model performancedominated the other results

(2) The built-in variable importance evaluation processand the OOB samples in the RF made predictorselection convenient The built-in variable impor-tance evaluation process ranked the importance of

each predictor for the prediction of the predictandsThe OOB samples gave the unbiased estimation ofthe model efficiency Both of these were helpful inthe predictor selection Although all predictors wereconsidered in this study they will be most valuablewhen the RF is used in more complex downscalingproblems such as precipitation downscaling

(3) The spatial distribution of model precision at theindividual stations for the RF and the comparativemodels was also discussed Although the RF wassuperior to the comparative models for most of thestations there was still room for improvement for theprediction accuracy in mountainous areas

As this studymainly discussed the development and com-parison of temperature downscalingmethods future projectsin the Pearl River basin were not further explored We willpursue further research in the application of the RF to addi-tionalmeteorological elements such as the precipitationThiswill be helpful in understanding the strength and limitationof the RF method Furthermore the surface parameters likethe elevation slope vegetation and so forth also influencethe distribution of these meteorological elements especiallyin complex terrain [45] wewill explore further in consideringthese parameters as part of model input

Conflicts of Interest

The authors declare no conflicts of interest

Authorsrsquo Contributions

Bo Pang contributed to the literature review statisticalanalysis manuscript preparation and editing Jiajia Yue andGang Zhao contributed to the model design and simulation

10 Advances in Meteorology

and Zongxue Xu contributed to the manuscript revision andreview and supervised the study

Acknowledgments

This research is funded by the Youth Science Foundation ofthe National Natural Science Foundation (51309009) theNational Natural Science Foundation (91125015) andNation-al Key Research and Development Program (during the 13thFive-Year Plan) under Grant no 2016YFC0401309 Ministryof Science and Technology China

References

[1] IPCC Climate Change 2007 The Physical Science Basis Con-tribution of Working Group I to the Fourth Assessment Reportof the Intergovernmental Panel on Climate Change CambridgeUniversity Press Cambridge UK 2007

[2] R L Wilby S P Charles E Zorita B Timbal P Whetton andL O Mearns Guidelines for Use of Climate Scenarios Developedfrom Statistical Downscaling Methods UEA Norwich UK2004

[3] J T Chu J Xia C-Y Xu and V P Singh ldquoStatistical down-scaling of daily mean temperature pan evaporation and pre-cipitation for climate change scenarios in Haihe River ChinardquoTheoretical and Applied Climatology vol 99 no 1-2 pp 149ndash1612010

[4] R L Wilby L E Hay and G H Leavesley ldquoA comparisonof downscaled and raw GCM output implications for climatechange scenarios in the San JuanRiver Basin Coloradordquo Journalof Hydrology vol 225 no 1-2 pp 67ndash91 1999

[5] M K Goyal and C S P Ojha ldquoDownscaling of surfacetemperature for lake catchment in an arid region in India usinglinear multiple regression and neural networksrdquo InternationalJournal of Climatology vol 32 no 4 pp 552ndash566 2012

[6] R Huth ldquoStatistical downscaling in central Europe evaluationof methods and potential predictorsrdquo Climate Research vol 13no 2 pp 91ndash101 1999

[7] D Chen and Y Chen ldquoAssociation between winter temperaturein China and upper air circulation over East Asia revealed bycanonical correlation analysisrdquo Global and Planetary Changevol 37 no 3-4 pp 315ndash325 2003

[8] P Coulibaly Y B Dibike and F Anctil ldquoDownscaling precipi-tation and temperature with temporal neural networksrdquo Journalof Hydrometeorology vol 6 no 4 pp 483ndash496 2005

[9] A Anandhi V V Srinivas D N Kumar and R S NanjundiahldquoRole of predictors in downscaling surface temperature to riverbasin in India for IPCC SRES scenarios using support vectormachinerdquo International Journal of Climatology vol 29 no 4 pp583ndash603 2009

[10] J T Schoof and S C Pryor ldquoDownscaling temperature andprecipitation a comparison of regression-based methods andartificial neural networksrdquo International Journal of Climatologyvol 21 no 7 pp 773ndash790 2001

[11] E Kostopoulou CGiannakopoulos CAnagnostopoulou et alldquoSimulatingmaximumandminimum temperature overGreecea comparison of three downscaling techniquesrdquoTheoretical andApplied Climatology vol 90 no 1-2 pp 65ndash82 2007

[12] D Duhan and A Pandey ldquoStatistical downscaling of tempera-ture using three techniques in the Tons River basin in Central

IndiardquoTheoretical and Applied Climatology vol 121 no 3-4 pp605ndash622 2015

[13] D Maraun H W Rust and T J Osborn ldquoThe annual cycleof heavy precipitation across the United Kingdom a modelbased on extreme value statisticsrdquo International Journal ofClimatology vol 29 no 12 pp 1731ndash1744 2009

[14] M Widmann ldquoOne-dimensional CCA and SVD and theirrelationship to regression mapsrdquo Journal of Climate vol 18 no14 pp 2785ndash2792 2005

[15] M K Tippett T DelSole S J Mason and A G BarnstonldquoRegression-based methods for finding coupled patternsrdquo Jour-nal of Climate vol 21 no 17 pp 4384ndash4398 2008

[16] M Hessami P Gachon T B M J Ouarda and A St-Hilaire ldquoAutomated regression-based statistical downscalingtoolrdquo Environmental Modelling and Software vol 23 no 6 pp813ndash834 2008

[17] Z Liu Z Xu S P Charles G Fu and L Liu ldquoEvaluation of twostatistical downscaling models for daily precipitation over anarid basin in Chinardquo International Journal of Climatology vol31 no 13 pp 2006ndash2020 2011

[18] C Yang N Wang S Wang and L Zhou ldquoPerformancecomparison of three predictor selection methods for statisticaldownscaling of daily precipitationrdquo Theoretical and AppliedClimatology pp 1ndash12 2016

[19] I Hanssen-Bauer E J Foslashrland J E Haugen and O E TveitoldquoTemperature and precipitation scenarios for Norway com-parison of results from dynamical and empirical downscalingrdquoClimate Research vol 25 no 1 pp 15ndash27 2003

[20] A Hannachi I T Jolliffe and D B Stephenson ldquoEmpiricalorthogonal functions and related techniques in atmosphericscience a reviewrdquo International Journal of Climatology vol 27no 9 pp 1119ndash1152 2007

[21] O Fistikoglu andUOkkan ldquoStatistical downscaling ofmonthlyprecipitation using NCEPNCAR reanalysis data for tahtaliriver basin in Turkeyrdquo Journal of Hydrologic Engineering vol 16no 2 pp 157ndash164 2010

[22] A Sharma ldquoSeasonal to interannual rainfall probabilistic fore-casts for improved water supply management part 1mdashA strat-egy for system predictor identificationrdquo Journal of Hydrologyvol 239 no 1ndash4 pp 232ndash239 2000

[23] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[24] M Malekipirbazari and V Aksakalli ldquoRisk assessment in sociallending via random forestsrdquo Expert Systems with Applicationsvol 42 no 10 pp 4621ndash4631 2015

[25] P Baudron F Alonso-Sarrıa J L Garcıa-Arostegui F Canovas-Garcıa D Martınez-Vicente and J Moreno-Brotons ldquoIdentify-ing the origin of groundwater samples in a multi-layer aquifersystemwithRandomForest classificationrdquo Journal ofHydrologyvol 499 pp 303ndash315 2013

[26] K Tatsumi Y Yamashiki M A Canales Torres and C L RTaipe ldquoCrop classification of upland fields using Random forestof time-series Landsat 7 ETM+datardquoComputers and Electronicsin Agriculture vol 115 pp 171ndash179 2015

[27] Z Wang C Lai X Chen B Yang S Zhao and X Bai ldquoFloodhazard risk assessment model based on random forestrdquo Journalof Hydrology vol 527 pp 1130ndash1141 2015

[28] P O Gislason J A Benediktsson and J R Sveinsson ldquoRandomforests for land cover classificationrdquo Pattern Recognition Lettersvol 27 no 4 pp 294ndash300 2006

Advances in Meteorology 11

[29] M Pal ldquoRandom forest classifier for remote sensing classifica-tionrdquo International Journal of Remote Sensing vol 26 no 1 pp217ndash222 2005

[30] V F Rodriguez-Galiano M Chica-Olmo F Abarca-Hernan-dez P M Atkinson and C Jeganathan ldquoRandom Forestclassification of Mediterranean land cover using multi-seasonalimagery and multi-seasonal texturerdquo Remote Sensing of Envi-ronment vol 121 pp 93ndash107 2012

[31] V F Rodriguez-Galiano B Ghimire J Rogan M Chica-Olmoand J P Rigol-Sanchez ldquoAn assessment of the effectiveness ofa random forest classifier for land-cover classificationrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 67 no 1pp 93ndash104 2012

[32] C Strobl and A Zeileis ldquoDanger high power mdash Exploringthe statistical properties of a test for random forest variableimportancerdquo Tech Rep 17 University of Munich 2008

[33] V Svetnik A Liaw C Tong J Christopher Culberson R PSheridan and B P Feuston ldquoRandom forest a classification andregression tool for compound classification and QSAR mod-elingrdquo Journal of Chemical Information and Computer Sciencesvol 43 no 6 pp 1947ndash1958 2003

[34] E Eccel L Ghielmi P Granitto R Barbiero F Grazzini and DCesari ldquoPrediction of minimum temperatures in an alpineregion by linear and non-linear post-processing of meteorolog-ical modelsrdquoNonlinear Processes in Geophysics vol 14 no 3 pp211ndash222 2007

[35] Pearl River Water Resources Committee (PRWRC) The Zhu-jiang Archive vol 1 Guangdong Science and Technology PressGuangzhou China 1991 (in Chinese)

[36] Q Zhang C-Y Xu and Z Zhang ldquoObserved changes ofdroughtwetness episodes in the Pearl River basin Chinausing the standardized precipitation index and aridity indexrdquoTheoretical and Applied Climatology vol 98 no 1-2 pp 89ndash992009

[37] E V Dmitriev I V Nogotkov V S Rogutov G Komenko andA Chavro ldquoTemporal error estimate for statistical downscalingregional meteorological modelsrdquo Fısica de la Tierra vol 19 pp219ndash241 2007

[38] C Lavaysse M Vrac P Drobinski M Lengaigne and TVischel ldquoStatistical downscaling of the French Mediterraneanclimate assessment for present and projection in an anthro-pogenic scenariordquo Natural Hazards and Earth System Sciencevol 12 no 3 pp 651ndash670 2012

[39] L Liu Z Liu X Ren T Fischer and Y Xu ldquoHydrologicalimpacts of climate change in the Yellow River Basin for the 21stcentury using hydrological model and statistical downscalingmodelrdquo Quaternary International vol 244 no 2 pp 211ndash2202011

[40] F-F Ai J Bin Z-M Zhang et al ldquoApplication of randomforests to select premiumquality vegetable oils by their fatty acidcompositionrdquo Food Chemistry vol 143 pp 472ndash478 2014

[41] P Gislason J Benediktsson and J Sveinsson ldquoRandom forestclassification of multisource remote sensing and geographicdatardquo in Proceedings of the Geoscience and Remote Sensing Sym-posium vol 2 pp 1049ndash1052 IEEE Anchorage Alaska USA2004

[42] B C Hewitson and R G Crane ldquoClimate downscaling tech-niques and applicationrdquo Climate Research vol 7 no 2 pp 85ndash95 1996

[43] B Timbal A Dufour and B McAvaney ldquoAn estimate of fu-ture climate change for western France using a statistical

downscaling techniquerdquo Climate Dynamics vol 20 no 7-8 pp807ndash823 2003

[44] K Tatsumi T Oizumi and Y Yamashiki ldquoEffects of climatechange on daily minimum and maximum temperatures andcloudiness in the Shikoku region a statistical downscalingmodel approachrdquoTheoretical and Applied Climatology vol 120no 1-2 pp 87ndash98 2015

[45] Z A Holden J T Abatzoglou C H Luce and L S BaggettldquoEmpirical downscaling of daily minimum air temperature atvery fine resolutions in complex terrainrdquoAgricultural and ForestMeteorology vol 151 no 8 pp 1066ndash1073 2011

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ClimatologyJournal of

EcologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

EarthquakesJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mining

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporation httpwwwhindawicom Volume 201

International Journal of

OceanographyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal ofPetroleum Engineering

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Atmospheric SciencesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MineralogyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MeteorologyAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geological ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geology Advances in

Page 5: Statistical Downscaling of Temperature with the Random ...downloads.hindawi.com/journals/amete/2017/7265178.pdf · Statistical downscaling techniques can be divided into threecategories:weathertyping,weathergenerators,andre-gression-based

Advances in Meteorology 5

Guangzhou Station Nanning Station

0

05

1

15

2

25Si

gnifi

canc

e

5 10 15 20 25 300Number of predictors

0

02

04

06

08

1

12

14

16

18

2

Sign

ifica

nce

5 10 15 20 25 300Number of predictors

times105

times105

Figure 4 Relative significance of the predictors at Guangzhou Station and Nanning Station

As illustrated previously 119872 and Ntry are two sensitiveparameters in the RF models Ntry is the square root of thetotal number of variables [41]119872 influences the convergenceof the RF and can be determined through the OOB error

For model development the daily mean temperatureseries from the national standard stations and the large-scaleatmospheric variables of the NCEP data are divided into twodatasetsThe first 31 years (1961ndash1991) are used for calibratingthe regression model while the remaining 14 years of data(1992ndash2005) are used to validate the model For testing theperformance of the proposedmodel two data-drivenmodelsANN and SVM are selected for model comparison whichare commonly used in statistical downscalingThe three-layerbackpropagation artificial neural network (BP-ANN) andLS-SVM are adopted which have been applied successfullyin downscaling temperature [12] The MATLAB functionsldquonewffrdquo and ldquotunelssvmrdquo are employed for obtaining thevalues of the models parameters Multiple linear regressionanalysis is adopted for the MATLAB implementation ThePAR [18] and PCA [19 20] methods are used in predictorselection for the MLR ANN and SVMmodels

33 Model Performance Analysis Five criteria are selected toevaluate the performance of the RF and comparative modelsincluding the NashndashSutcliffe model efficiency index (Nash)root mean square error (RMSE) mean absolute error (MAE)correlation coefficient (119877) and model Bias (Bias) which aredefined as

Nash = 1 minus sum119899119894=1 (119910119894obs minus 119910119894sim)2sum119899119894=1 (119910119894obs minus 119910obs)2

RMSE = radicsum119899119894=1 (119910119894obs minus 119910119894sim)2119899

MAE = sum119899119894=1 1003816100381610038161003816119910119894obs minus 119910119894sim1003816100381610038161003816119899 119877 = sum119899119894=1 (119910119894obs minus 119910obs) (119910119894sim minus 119910sim)radicsum119899119894=1 (119910119894obs minus 119910obs)2 (119910119894sim minus 119910sim)2

Bias = sum119899119894=1 (119910119894sim minus 119910119894s)119899 (3)

Here119910obs is the vector of the observed predictands119910obs isthemean of the observed predictands 119910sim is the vector of thesimulated predictands and 119910sim is the mean of the simulatedpredictands In general a higher Nash and 119877 indicate bettermodel efficiency In contrast smaller values of RMSE MAEand Bias indicate higher accuracy in the model prediction

4 Result Analysis and Discussion

41The Choice of Predictors One of themost important stepsin the development of downscaling models is the choice ofappropriate predictors [42] The RF can evaluate the relativecontribution of each predictor for downscaling results bycombining the RSS decreases over all trees which makes itconvenient in choosing predictors for the RFUsing two of thestations as examples Figure 4 shows the relative importanceof the predictors for the predictands at Guangzhou andNanning Stations where the number of the predictor hasbeen given as indicated in Table 1 For a comprehensiveinvestigation of the predictorrsquos importance in the Pearl Riverbasin stations the rank (the most important predictor isindicated as 1) of the relative importance of the predictor ineach station is calculated and indicated in Figure 5

6 Advances in Meteorology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Number of predictors

Random forest

0

5

10

15

20

25

Rank

Figure 5 Rank of relative importance of predictors in the 61 sta-tions

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Numbers of predictors

123456789

1011

EOO

B

Figure 6 MSEs of out-of-bag samples

According to Figures 4 and 5 the number 26 predictorthe mean temperature at 2m height is the most importantpredictor for all RF models at the 61 stations Similarly thenumber 25 predictor the surface-specific humidity rankedsecond for predictor importance at all stations In contrastthe number 5 predictor surface vorticity is the least impor-tant predictor for all stations The ranks of the other predic-tors vary for the different stations whichmay be caused by themeteorological and geographical differences of the stations

Because the OOB samples can provide unbiased esti-mation of the RF model performance the rationality ofincluding each factor can be tested using the MSEs of theOOB (indicated by EOOB)The predictors of each station arescreened using the ranks of relative importance which wereevaluated in previous stepThen theMSE of theOOB samplesin each station is calculated with the increase of the predictorcombination and plotted in Figure 6

The results show that EOOB generally decreases butat a decreasing rate as more predictors are included Thisindicates that the RF avoids overfitting successfully Howeverthe improvement of model performance by including morepredictors is slight when the predictor number exceeds acertain number which is seven in this study With enoughcomputation resources all predictors are chosen in the RFmodeling to downscale the temperature for these stations

42 Comparative Study The performance of the RF modelis compared with that of the MLR ANN and SVM models

Partial Correlation Analysis

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Number of predictors

0

5

10

15

20

25

Rank

Figure 7 Rank of partial correlation of predictors in the 61 stations

Two predictor selection methods PAR and PCA are appliedin the four models

The partial correlations of the 26 predictors with thepredictands are shown in Figure 7 It can be observed that theranks of the predictorsrsquo partial correlation vary in differentstations In general the number 1 predictor mean sea levelpressure has the largest partial correlation in most of thestations However the number 5 and number 8 predic-tors corresponding to surface vorticity and 500 hPa airflowstrength have the smallest partial correlations in themajorityof the stations

The partial correlation coefficients are used to decide thevariables that are included in the input combination Based onformer studies [18 43 44] a combination of seven predictorswith the highest partial correlation coefficients are selectedand applied in the MLR ANN and SVM models Theseresults are marked as MLR-par ANN-par and SVM-par

PCA which was commonly used by previous researchersis also used in the comparative study Before PCA thepredictors are standardized by subtracting themean from theoriginal values and then dividing the results by the standarddeviation of the original variables The PCA method is thenapplied to the standardized NCEP predictor variables toextract principal components (PCs) that are orthogonal Theobtained PCs preservemore than 90 of the variance presentat each station Then the PCs are used in the MLR ANNand SVM modeling and these results are marked as MLR-pca ANN-pca and SVM-pca

The calibration and validation results of the RF andcomparative models are summarized in Table 2 and theaverage Nash in the calibration and validation periods for allmodels are plotted in Figure 8

The results showed that the RF is superior to the com-parative models in the calibration and validation periodsAccording to the average values of Nash RMSEMAE119877 andBias the RF for the 61 stations are 098 080 058 099 and000 in the calibration period and 094 146 112 097 and021 in the validation period respectively All of these criteriaare superior to those of the comparativemodels It can also beobserved in Figure 8 that the RF shows higher precision andmore stability over the other models in the study area

The results of the two parameter selection are alsocompared for the MLR ANN and SVM models The PAR is

Advances in Meteorology 7

Calibration Validation

08

082

084

086

088

09

092

094

096

098

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ar

AN

N-p

ca

Model

078

08

082

084

086

088

09

092

094

096

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ca

AN

N-p

ar

Model

Figure 8 Average Nash in calibration and validation period for all models

Table 2 Performance assessment for predictands in calibration and validation

Models Periods Nash RMSE MAE 119877 Bias

MLR-par Calibration 092 176 135 096 000Validation 092 166 129 096 001

MLR-pca Calibration 090 196 151 095 000Validation 088 200 155 094 031

ANN-par Calibration 093 156 119 097 000Validation 093 152 117 097 007

ANN-pca Calibration 092 168 128 096 000Validation 091 172 132 096 035

SVM-par Calibration 092 176 134 096 minus009Validation 092 166 128 096 minus006

SVM-pca Calibration 089 197 150 095 minus010Validation 088 199 153 094 022

RF Calibration 098 080 058 099 000Validation 094 146 112 097 021

superior to PCA in theMLR ANN and SVMmodels For theANNmodels the average119877 are similar in both the calibrationand validation period however the average Nash RMSEMAE and Bias of the results using PAR are superior to thoseof PCA with decreases of 001 012 008 and 0 in the cali-bration period and 002 020 015 and 028 in the validationperiod respectively For the SVM models the increases inaverage Nash and119877 are 003 and 001 in the calibration periodand 004 and 002 in the validation period respectively Thedecreases of average RMSE MAE and Bias are 021 016and 001 in the calibration period and 033 025 and 028

in the validation period respectively For the same the PARis superior to PCA for MLR for most of the criteria withincreases in Nash and 119877 and decreases in RMSE and MAE

The spatial distributions of model precision of the RFand the comparative models are shown in Figure 9 inwhich Nash is selected as the evaluating criteria For thecomparative results the PAR was used in the MLR ANNand SVM modeling It is shown that the RF is superior tothe comparative models at most of the individual stations Inaddition it can be observed that the precision of the RF instations located in the plain region is higher than that in the

8 Advances in Meteorology

MLR calibration

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basin

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

le088

ge096

N

(a) MLR calibration

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

088ndash090090ndash092

092ndash094094ndash096

MLR validationPearl River Basinle088

ge096

N

(b) MLR validation

ANN calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(c) ANN calibration

ANN validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(d) ANN validation

SVM calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(e) SVM calibration

SVM validation

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

0 150 300(km)

N

(f) SVM validation

Figure 9 Continued

Advances in Meteorology 9

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

RF calibration

(g) RF calibration

RF validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(h) RF validation

Figure 9 The spatial distribution of Nash for different models

mountain regions which indicates the limited applicability incomplex terrains

5 Summary and Conclusions

Statistical downscaling models are effective in solving themismatch between large-scale climate models and local scalehydrological responses In this study a statisticalmodel basedon the random forest method was proposed and appliedto the Pearl River basin The objective of this study was toinvestigate whether the RF approach could successfully sim-ulate the complicated relationship between the predictors andpredictands The daily mean temperature observations from61 stations in the Pearl River Basin and the NCEP reanalysisdaily data from 1961 to 2005were selected in order to comparethe results of the RF model with those of the MLR ANNand SVM models The following summarizes the discussionpoints and provides conclusions derived from this analysis

(1) The RF model successfully simulated the relation-ship between the predictors and predictands andperformed better than the MLR ANN and SVMmodels According to five statistical criteria the RFshowed the highest model efficiency in both thecalibration and validation periods In addition it wasobserved that themodel efficiency of the RF increasedcontinuously by considering more predictors In thisstudy all 26 predictors from the NCEP were consid-ered in the RF modeling By taking full advantageof the information in the predictors and avoidingthe influence of noise the RF model performancedominated the other results

(2) The built-in variable importance evaluation processand the OOB samples in the RF made predictorselection convenient The built-in variable impor-tance evaluation process ranked the importance of

each predictor for the prediction of the predictandsThe OOB samples gave the unbiased estimation ofthe model efficiency Both of these were helpful inthe predictor selection Although all predictors wereconsidered in this study they will be most valuablewhen the RF is used in more complex downscalingproblems such as precipitation downscaling

(3) The spatial distribution of model precision at theindividual stations for the RF and the comparativemodels was also discussed Although the RF wassuperior to the comparative models for most of thestations there was still room for improvement for theprediction accuracy in mountainous areas

As this studymainly discussed the development and com-parison of temperature downscalingmethods future projectsin the Pearl River basin were not further explored We willpursue further research in the application of the RF to addi-tionalmeteorological elements such as the precipitationThiswill be helpful in understanding the strength and limitationof the RF method Furthermore the surface parameters likethe elevation slope vegetation and so forth also influencethe distribution of these meteorological elements especiallyin complex terrain [45] wewill explore further in consideringthese parameters as part of model input

Conflicts of Interest

The authors declare no conflicts of interest

Authorsrsquo Contributions

Bo Pang contributed to the literature review statisticalanalysis manuscript preparation and editing Jiajia Yue andGang Zhao contributed to the model design and simulation

10 Advances in Meteorology

and Zongxue Xu contributed to the manuscript revision andreview and supervised the study

Acknowledgments

This research is funded by the Youth Science Foundation ofthe National Natural Science Foundation (51309009) theNational Natural Science Foundation (91125015) andNation-al Key Research and Development Program (during the 13thFive-Year Plan) under Grant no 2016YFC0401309 Ministryof Science and Technology China

References

[1] IPCC Climate Change 2007 The Physical Science Basis Con-tribution of Working Group I to the Fourth Assessment Reportof the Intergovernmental Panel on Climate Change CambridgeUniversity Press Cambridge UK 2007

[2] R L Wilby S P Charles E Zorita B Timbal P Whetton andL O Mearns Guidelines for Use of Climate Scenarios Developedfrom Statistical Downscaling Methods UEA Norwich UK2004

[3] J T Chu J Xia C-Y Xu and V P Singh ldquoStatistical down-scaling of daily mean temperature pan evaporation and pre-cipitation for climate change scenarios in Haihe River ChinardquoTheoretical and Applied Climatology vol 99 no 1-2 pp 149ndash1612010

[4] R L Wilby L E Hay and G H Leavesley ldquoA comparisonof downscaled and raw GCM output implications for climatechange scenarios in the San JuanRiver Basin Coloradordquo Journalof Hydrology vol 225 no 1-2 pp 67ndash91 1999

[5] M K Goyal and C S P Ojha ldquoDownscaling of surfacetemperature for lake catchment in an arid region in India usinglinear multiple regression and neural networksrdquo InternationalJournal of Climatology vol 32 no 4 pp 552ndash566 2012

[6] R Huth ldquoStatistical downscaling in central Europe evaluationof methods and potential predictorsrdquo Climate Research vol 13no 2 pp 91ndash101 1999

[7] D Chen and Y Chen ldquoAssociation between winter temperaturein China and upper air circulation over East Asia revealed bycanonical correlation analysisrdquo Global and Planetary Changevol 37 no 3-4 pp 315ndash325 2003

[8] P Coulibaly Y B Dibike and F Anctil ldquoDownscaling precipi-tation and temperature with temporal neural networksrdquo Journalof Hydrometeorology vol 6 no 4 pp 483ndash496 2005

[9] A Anandhi V V Srinivas D N Kumar and R S NanjundiahldquoRole of predictors in downscaling surface temperature to riverbasin in India for IPCC SRES scenarios using support vectormachinerdquo International Journal of Climatology vol 29 no 4 pp583ndash603 2009

[10] J T Schoof and S C Pryor ldquoDownscaling temperature andprecipitation a comparison of regression-based methods andartificial neural networksrdquo International Journal of Climatologyvol 21 no 7 pp 773ndash790 2001

[11] E Kostopoulou CGiannakopoulos CAnagnostopoulou et alldquoSimulatingmaximumandminimum temperature overGreecea comparison of three downscaling techniquesrdquoTheoretical andApplied Climatology vol 90 no 1-2 pp 65ndash82 2007

[12] D Duhan and A Pandey ldquoStatistical downscaling of tempera-ture using three techniques in the Tons River basin in Central

IndiardquoTheoretical and Applied Climatology vol 121 no 3-4 pp605ndash622 2015

[13] D Maraun H W Rust and T J Osborn ldquoThe annual cycleof heavy precipitation across the United Kingdom a modelbased on extreme value statisticsrdquo International Journal ofClimatology vol 29 no 12 pp 1731ndash1744 2009

[14] M Widmann ldquoOne-dimensional CCA and SVD and theirrelationship to regression mapsrdquo Journal of Climate vol 18 no14 pp 2785ndash2792 2005

[15] M K Tippett T DelSole S J Mason and A G BarnstonldquoRegression-based methods for finding coupled patternsrdquo Jour-nal of Climate vol 21 no 17 pp 4384ndash4398 2008

[16] M Hessami P Gachon T B M J Ouarda and A St-Hilaire ldquoAutomated regression-based statistical downscalingtoolrdquo Environmental Modelling and Software vol 23 no 6 pp813ndash834 2008

[17] Z Liu Z Xu S P Charles G Fu and L Liu ldquoEvaluation of twostatistical downscaling models for daily precipitation over anarid basin in Chinardquo International Journal of Climatology vol31 no 13 pp 2006ndash2020 2011

[18] C Yang N Wang S Wang and L Zhou ldquoPerformancecomparison of three predictor selection methods for statisticaldownscaling of daily precipitationrdquo Theoretical and AppliedClimatology pp 1ndash12 2016

[19] I Hanssen-Bauer E J Foslashrland J E Haugen and O E TveitoldquoTemperature and precipitation scenarios for Norway com-parison of results from dynamical and empirical downscalingrdquoClimate Research vol 25 no 1 pp 15ndash27 2003

[20] A Hannachi I T Jolliffe and D B Stephenson ldquoEmpiricalorthogonal functions and related techniques in atmosphericscience a reviewrdquo International Journal of Climatology vol 27no 9 pp 1119ndash1152 2007

[21] O Fistikoglu andUOkkan ldquoStatistical downscaling ofmonthlyprecipitation using NCEPNCAR reanalysis data for tahtaliriver basin in Turkeyrdquo Journal of Hydrologic Engineering vol 16no 2 pp 157ndash164 2010

[22] A Sharma ldquoSeasonal to interannual rainfall probabilistic fore-casts for improved water supply management part 1mdashA strat-egy for system predictor identificationrdquo Journal of Hydrologyvol 239 no 1ndash4 pp 232ndash239 2000

[23] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[24] M Malekipirbazari and V Aksakalli ldquoRisk assessment in sociallending via random forestsrdquo Expert Systems with Applicationsvol 42 no 10 pp 4621ndash4631 2015

[25] P Baudron F Alonso-Sarrıa J L Garcıa-Arostegui F Canovas-Garcıa D Martınez-Vicente and J Moreno-Brotons ldquoIdentify-ing the origin of groundwater samples in a multi-layer aquifersystemwithRandomForest classificationrdquo Journal ofHydrologyvol 499 pp 303ndash315 2013

[26] K Tatsumi Y Yamashiki M A Canales Torres and C L RTaipe ldquoCrop classification of upland fields using Random forestof time-series Landsat 7 ETM+datardquoComputers and Electronicsin Agriculture vol 115 pp 171ndash179 2015

[27] Z Wang C Lai X Chen B Yang S Zhao and X Bai ldquoFloodhazard risk assessment model based on random forestrdquo Journalof Hydrology vol 527 pp 1130ndash1141 2015

[28] P O Gislason J A Benediktsson and J R Sveinsson ldquoRandomforests for land cover classificationrdquo Pattern Recognition Lettersvol 27 no 4 pp 294ndash300 2006

Advances in Meteorology 11

[29] M Pal ldquoRandom forest classifier for remote sensing classifica-tionrdquo International Journal of Remote Sensing vol 26 no 1 pp217ndash222 2005

[30] V F Rodriguez-Galiano M Chica-Olmo F Abarca-Hernan-dez P M Atkinson and C Jeganathan ldquoRandom Forestclassification of Mediterranean land cover using multi-seasonalimagery and multi-seasonal texturerdquo Remote Sensing of Envi-ronment vol 121 pp 93ndash107 2012

[31] V F Rodriguez-Galiano B Ghimire J Rogan M Chica-Olmoand J P Rigol-Sanchez ldquoAn assessment of the effectiveness ofa random forest classifier for land-cover classificationrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 67 no 1pp 93ndash104 2012

[32] C Strobl and A Zeileis ldquoDanger high power mdash Exploringthe statistical properties of a test for random forest variableimportancerdquo Tech Rep 17 University of Munich 2008

[33] V Svetnik A Liaw C Tong J Christopher Culberson R PSheridan and B P Feuston ldquoRandom forest a classification andregression tool for compound classification and QSAR mod-elingrdquo Journal of Chemical Information and Computer Sciencesvol 43 no 6 pp 1947ndash1958 2003

[34] E Eccel L Ghielmi P Granitto R Barbiero F Grazzini and DCesari ldquoPrediction of minimum temperatures in an alpineregion by linear and non-linear post-processing of meteorolog-ical modelsrdquoNonlinear Processes in Geophysics vol 14 no 3 pp211ndash222 2007

[35] Pearl River Water Resources Committee (PRWRC) The Zhu-jiang Archive vol 1 Guangdong Science and Technology PressGuangzhou China 1991 (in Chinese)

[36] Q Zhang C-Y Xu and Z Zhang ldquoObserved changes ofdroughtwetness episodes in the Pearl River basin Chinausing the standardized precipitation index and aridity indexrdquoTheoretical and Applied Climatology vol 98 no 1-2 pp 89ndash992009

[37] E V Dmitriev I V Nogotkov V S Rogutov G Komenko andA Chavro ldquoTemporal error estimate for statistical downscalingregional meteorological modelsrdquo Fısica de la Tierra vol 19 pp219ndash241 2007

[38] C Lavaysse M Vrac P Drobinski M Lengaigne and TVischel ldquoStatistical downscaling of the French Mediterraneanclimate assessment for present and projection in an anthro-pogenic scenariordquo Natural Hazards and Earth System Sciencevol 12 no 3 pp 651ndash670 2012

[39] L Liu Z Liu X Ren T Fischer and Y Xu ldquoHydrologicalimpacts of climate change in the Yellow River Basin for the 21stcentury using hydrological model and statistical downscalingmodelrdquo Quaternary International vol 244 no 2 pp 211ndash2202011

[40] F-F Ai J Bin Z-M Zhang et al ldquoApplication of randomforests to select premiumquality vegetable oils by their fatty acidcompositionrdquo Food Chemistry vol 143 pp 472ndash478 2014

[41] P Gislason J Benediktsson and J Sveinsson ldquoRandom forestclassification of multisource remote sensing and geographicdatardquo in Proceedings of the Geoscience and Remote Sensing Sym-posium vol 2 pp 1049ndash1052 IEEE Anchorage Alaska USA2004

[42] B C Hewitson and R G Crane ldquoClimate downscaling tech-niques and applicationrdquo Climate Research vol 7 no 2 pp 85ndash95 1996

[43] B Timbal A Dufour and B McAvaney ldquoAn estimate of fu-ture climate change for western France using a statistical

downscaling techniquerdquo Climate Dynamics vol 20 no 7-8 pp807ndash823 2003

[44] K Tatsumi T Oizumi and Y Yamashiki ldquoEffects of climatechange on daily minimum and maximum temperatures andcloudiness in the Shikoku region a statistical downscalingmodel approachrdquoTheoretical and Applied Climatology vol 120no 1-2 pp 87ndash98 2015

[45] Z A Holden J T Abatzoglou C H Luce and L S BaggettldquoEmpirical downscaling of daily minimum air temperature atvery fine resolutions in complex terrainrdquoAgricultural and ForestMeteorology vol 151 no 8 pp 1066ndash1073 2011

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ClimatologyJournal of

EcologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

EarthquakesJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mining

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporation httpwwwhindawicom Volume 201

International Journal of

OceanographyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal ofPetroleum Engineering

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Atmospheric SciencesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MineralogyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MeteorologyAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geological ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geology Advances in

Page 6: Statistical Downscaling of Temperature with the Random ...downloads.hindawi.com/journals/amete/2017/7265178.pdf · Statistical downscaling techniques can be divided into threecategories:weathertyping,weathergenerators,andre-gression-based

6 Advances in Meteorology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Number of predictors

Random forest

0

5

10

15

20

25

Rank

Figure 5 Rank of relative importance of predictors in the 61 sta-tions

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Numbers of predictors

123456789

1011

EOO

B

Figure 6 MSEs of out-of-bag samples

According to Figures 4 and 5 the number 26 predictorthe mean temperature at 2m height is the most importantpredictor for all RF models at the 61 stations Similarly thenumber 25 predictor the surface-specific humidity rankedsecond for predictor importance at all stations In contrastthe number 5 predictor surface vorticity is the least impor-tant predictor for all stations The ranks of the other predic-tors vary for the different stations whichmay be caused by themeteorological and geographical differences of the stations

Because the OOB samples can provide unbiased esti-mation of the RF model performance the rationality ofincluding each factor can be tested using the MSEs of theOOB (indicated by EOOB)The predictors of each station arescreened using the ranks of relative importance which wereevaluated in previous stepThen theMSE of theOOB samplesin each station is calculated with the increase of the predictorcombination and plotted in Figure 6

The results show that EOOB generally decreases butat a decreasing rate as more predictors are included Thisindicates that the RF avoids overfitting successfully Howeverthe improvement of model performance by including morepredictors is slight when the predictor number exceeds acertain number which is seven in this study With enoughcomputation resources all predictors are chosen in the RFmodeling to downscale the temperature for these stations

42 Comparative Study The performance of the RF modelis compared with that of the MLR ANN and SVM models

Partial Correlation Analysis

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Number of predictors

0

5

10

15

20

25

Rank

Figure 7 Rank of partial correlation of predictors in the 61 stations

Two predictor selection methods PAR and PCA are appliedin the four models

The partial correlations of the 26 predictors with thepredictands are shown in Figure 7 It can be observed that theranks of the predictorsrsquo partial correlation vary in differentstations In general the number 1 predictor mean sea levelpressure has the largest partial correlation in most of thestations However the number 5 and number 8 predic-tors corresponding to surface vorticity and 500 hPa airflowstrength have the smallest partial correlations in themajorityof the stations

The partial correlation coefficients are used to decide thevariables that are included in the input combination Based onformer studies [18 43 44] a combination of seven predictorswith the highest partial correlation coefficients are selectedand applied in the MLR ANN and SVM models Theseresults are marked as MLR-par ANN-par and SVM-par

PCA which was commonly used by previous researchersis also used in the comparative study Before PCA thepredictors are standardized by subtracting themean from theoriginal values and then dividing the results by the standarddeviation of the original variables The PCA method is thenapplied to the standardized NCEP predictor variables toextract principal components (PCs) that are orthogonal Theobtained PCs preservemore than 90 of the variance presentat each station Then the PCs are used in the MLR ANNand SVM modeling and these results are marked as MLR-pca ANN-pca and SVM-pca

The calibration and validation results of the RF andcomparative models are summarized in Table 2 and theaverage Nash in the calibration and validation periods for allmodels are plotted in Figure 8

The results showed that the RF is superior to the com-parative models in the calibration and validation periodsAccording to the average values of Nash RMSEMAE119877 andBias the RF for the 61 stations are 098 080 058 099 and000 in the calibration period and 094 146 112 097 and021 in the validation period respectively All of these criteriaare superior to those of the comparativemodels It can also beobserved in Figure 8 that the RF shows higher precision andmore stability over the other models in the study area

The results of the two parameter selection are alsocompared for the MLR ANN and SVM models The PAR is

Advances in Meteorology 7

Calibration Validation

08

082

084

086

088

09

092

094

096

098

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ar

AN

N-p

ca

Model

078

08

082

084

086

088

09

092

094

096

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ca

AN

N-p

ar

Model

Figure 8 Average Nash in calibration and validation period for all models

Table 2 Performance assessment for predictands in calibration and validation

Models Periods Nash RMSE MAE 119877 Bias

MLR-par Calibration 092 176 135 096 000Validation 092 166 129 096 001

MLR-pca Calibration 090 196 151 095 000Validation 088 200 155 094 031

ANN-par Calibration 093 156 119 097 000Validation 093 152 117 097 007

ANN-pca Calibration 092 168 128 096 000Validation 091 172 132 096 035

SVM-par Calibration 092 176 134 096 minus009Validation 092 166 128 096 minus006

SVM-pca Calibration 089 197 150 095 minus010Validation 088 199 153 094 022

RF Calibration 098 080 058 099 000Validation 094 146 112 097 021

superior to PCA in theMLR ANN and SVMmodels For theANNmodels the average119877 are similar in both the calibrationand validation period however the average Nash RMSEMAE and Bias of the results using PAR are superior to thoseof PCA with decreases of 001 012 008 and 0 in the cali-bration period and 002 020 015 and 028 in the validationperiod respectively For the SVM models the increases inaverage Nash and119877 are 003 and 001 in the calibration periodand 004 and 002 in the validation period respectively Thedecreases of average RMSE MAE and Bias are 021 016and 001 in the calibration period and 033 025 and 028

in the validation period respectively For the same the PARis superior to PCA for MLR for most of the criteria withincreases in Nash and 119877 and decreases in RMSE and MAE

The spatial distributions of model precision of the RFand the comparative models are shown in Figure 9 inwhich Nash is selected as the evaluating criteria For thecomparative results the PAR was used in the MLR ANNand SVM modeling It is shown that the RF is superior tothe comparative models at most of the individual stations Inaddition it can be observed that the precision of the RF instations located in the plain region is higher than that in the

8 Advances in Meteorology

MLR calibration

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basin

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

le088

ge096

N

(a) MLR calibration

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

088ndash090090ndash092

092ndash094094ndash096

MLR validationPearl River Basinle088

ge096

N

(b) MLR validation

ANN calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(c) ANN calibration

ANN validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(d) ANN validation

SVM calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(e) SVM calibration

SVM validation

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

0 150 300(km)

N

(f) SVM validation

Figure 9 Continued

Advances in Meteorology 9

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

RF calibration

(g) RF calibration

RF validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(h) RF validation

Figure 9 The spatial distribution of Nash for different models

mountain regions which indicates the limited applicability incomplex terrains

5 Summary and Conclusions

Statistical downscaling models are effective in solving themismatch between large-scale climate models and local scalehydrological responses In this study a statisticalmodel basedon the random forest method was proposed and appliedto the Pearl River basin The objective of this study was toinvestigate whether the RF approach could successfully sim-ulate the complicated relationship between the predictors andpredictands The daily mean temperature observations from61 stations in the Pearl River Basin and the NCEP reanalysisdaily data from 1961 to 2005were selected in order to comparethe results of the RF model with those of the MLR ANNand SVM models The following summarizes the discussionpoints and provides conclusions derived from this analysis

(1) The RF model successfully simulated the relation-ship between the predictors and predictands andperformed better than the MLR ANN and SVMmodels According to five statistical criteria the RFshowed the highest model efficiency in both thecalibration and validation periods In addition it wasobserved that themodel efficiency of the RF increasedcontinuously by considering more predictors In thisstudy all 26 predictors from the NCEP were consid-ered in the RF modeling By taking full advantageof the information in the predictors and avoidingthe influence of noise the RF model performancedominated the other results

(2) The built-in variable importance evaluation processand the OOB samples in the RF made predictorselection convenient The built-in variable impor-tance evaluation process ranked the importance of

each predictor for the prediction of the predictandsThe OOB samples gave the unbiased estimation ofthe model efficiency Both of these were helpful inthe predictor selection Although all predictors wereconsidered in this study they will be most valuablewhen the RF is used in more complex downscalingproblems such as precipitation downscaling

(3) The spatial distribution of model precision at theindividual stations for the RF and the comparativemodels was also discussed Although the RF wassuperior to the comparative models for most of thestations there was still room for improvement for theprediction accuracy in mountainous areas

As this studymainly discussed the development and com-parison of temperature downscalingmethods future projectsin the Pearl River basin were not further explored We willpursue further research in the application of the RF to addi-tionalmeteorological elements such as the precipitationThiswill be helpful in understanding the strength and limitationof the RF method Furthermore the surface parameters likethe elevation slope vegetation and so forth also influencethe distribution of these meteorological elements especiallyin complex terrain [45] wewill explore further in consideringthese parameters as part of model input

Conflicts of Interest

The authors declare no conflicts of interest

Authorsrsquo Contributions

Bo Pang contributed to the literature review statisticalanalysis manuscript preparation and editing Jiajia Yue andGang Zhao contributed to the model design and simulation

10 Advances in Meteorology

and Zongxue Xu contributed to the manuscript revision andreview and supervised the study

Acknowledgments

This research is funded by the Youth Science Foundation ofthe National Natural Science Foundation (51309009) theNational Natural Science Foundation (91125015) andNation-al Key Research and Development Program (during the 13thFive-Year Plan) under Grant no 2016YFC0401309 Ministryof Science and Technology China

References

[1] IPCC Climate Change 2007 The Physical Science Basis Con-tribution of Working Group I to the Fourth Assessment Reportof the Intergovernmental Panel on Climate Change CambridgeUniversity Press Cambridge UK 2007

[2] R L Wilby S P Charles E Zorita B Timbal P Whetton andL O Mearns Guidelines for Use of Climate Scenarios Developedfrom Statistical Downscaling Methods UEA Norwich UK2004

[3] J T Chu J Xia C-Y Xu and V P Singh ldquoStatistical down-scaling of daily mean temperature pan evaporation and pre-cipitation for climate change scenarios in Haihe River ChinardquoTheoretical and Applied Climatology vol 99 no 1-2 pp 149ndash1612010

[4] R L Wilby L E Hay and G H Leavesley ldquoA comparisonof downscaled and raw GCM output implications for climatechange scenarios in the San JuanRiver Basin Coloradordquo Journalof Hydrology vol 225 no 1-2 pp 67ndash91 1999

[5] M K Goyal and C S P Ojha ldquoDownscaling of surfacetemperature for lake catchment in an arid region in India usinglinear multiple regression and neural networksrdquo InternationalJournal of Climatology vol 32 no 4 pp 552ndash566 2012

[6] R Huth ldquoStatistical downscaling in central Europe evaluationof methods and potential predictorsrdquo Climate Research vol 13no 2 pp 91ndash101 1999

[7] D Chen and Y Chen ldquoAssociation between winter temperaturein China and upper air circulation over East Asia revealed bycanonical correlation analysisrdquo Global and Planetary Changevol 37 no 3-4 pp 315ndash325 2003

[8] P Coulibaly Y B Dibike and F Anctil ldquoDownscaling precipi-tation and temperature with temporal neural networksrdquo Journalof Hydrometeorology vol 6 no 4 pp 483ndash496 2005

[9] A Anandhi V V Srinivas D N Kumar and R S NanjundiahldquoRole of predictors in downscaling surface temperature to riverbasin in India for IPCC SRES scenarios using support vectormachinerdquo International Journal of Climatology vol 29 no 4 pp583ndash603 2009

[10] J T Schoof and S C Pryor ldquoDownscaling temperature andprecipitation a comparison of regression-based methods andartificial neural networksrdquo International Journal of Climatologyvol 21 no 7 pp 773ndash790 2001

[11] E Kostopoulou CGiannakopoulos CAnagnostopoulou et alldquoSimulatingmaximumandminimum temperature overGreecea comparison of three downscaling techniquesrdquoTheoretical andApplied Climatology vol 90 no 1-2 pp 65ndash82 2007

[12] D Duhan and A Pandey ldquoStatistical downscaling of tempera-ture using three techniques in the Tons River basin in Central

IndiardquoTheoretical and Applied Climatology vol 121 no 3-4 pp605ndash622 2015

[13] D Maraun H W Rust and T J Osborn ldquoThe annual cycleof heavy precipitation across the United Kingdom a modelbased on extreme value statisticsrdquo International Journal ofClimatology vol 29 no 12 pp 1731ndash1744 2009

[14] M Widmann ldquoOne-dimensional CCA and SVD and theirrelationship to regression mapsrdquo Journal of Climate vol 18 no14 pp 2785ndash2792 2005

[15] M K Tippett T DelSole S J Mason and A G BarnstonldquoRegression-based methods for finding coupled patternsrdquo Jour-nal of Climate vol 21 no 17 pp 4384ndash4398 2008

[16] M Hessami P Gachon T B M J Ouarda and A St-Hilaire ldquoAutomated regression-based statistical downscalingtoolrdquo Environmental Modelling and Software vol 23 no 6 pp813ndash834 2008

[17] Z Liu Z Xu S P Charles G Fu and L Liu ldquoEvaluation of twostatistical downscaling models for daily precipitation over anarid basin in Chinardquo International Journal of Climatology vol31 no 13 pp 2006ndash2020 2011

[18] C Yang N Wang S Wang and L Zhou ldquoPerformancecomparison of three predictor selection methods for statisticaldownscaling of daily precipitationrdquo Theoretical and AppliedClimatology pp 1ndash12 2016

[19] I Hanssen-Bauer E J Foslashrland J E Haugen and O E TveitoldquoTemperature and precipitation scenarios for Norway com-parison of results from dynamical and empirical downscalingrdquoClimate Research vol 25 no 1 pp 15ndash27 2003

[20] A Hannachi I T Jolliffe and D B Stephenson ldquoEmpiricalorthogonal functions and related techniques in atmosphericscience a reviewrdquo International Journal of Climatology vol 27no 9 pp 1119ndash1152 2007

[21] O Fistikoglu andUOkkan ldquoStatistical downscaling ofmonthlyprecipitation using NCEPNCAR reanalysis data for tahtaliriver basin in Turkeyrdquo Journal of Hydrologic Engineering vol 16no 2 pp 157ndash164 2010

[22] A Sharma ldquoSeasonal to interannual rainfall probabilistic fore-casts for improved water supply management part 1mdashA strat-egy for system predictor identificationrdquo Journal of Hydrologyvol 239 no 1ndash4 pp 232ndash239 2000

[23] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[24] M Malekipirbazari and V Aksakalli ldquoRisk assessment in sociallending via random forestsrdquo Expert Systems with Applicationsvol 42 no 10 pp 4621ndash4631 2015

[25] P Baudron F Alonso-Sarrıa J L Garcıa-Arostegui F Canovas-Garcıa D Martınez-Vicente and J Moreno-Brotons ldquoIdentify-ing the origin of groundwater samples in a multi-layer aquifersystemwithRandomForest classificationrdquo Journal ofHydrologyvol 499 pp 303ndash315 2013

[26] K Tatsumi Y Yamashiki M A Canales Torres and C L RTaipe ldquoCrop classification of upland fields using Random forestof time-series Landsat 7 ETM+datardquoComputers and Electronicsin Agriculture vol 115 pp 171ndash179 2015

[27] Z Wang C Lai X Chen B Yang S Zhao and X Bai ldquoFloodhazard risk assessment model based on random forestrdquo Journalof Hydrology vol 527 pp 1130ndash1141 2015

[28] P O Gislason J A Benediktsson and J R Sveinsson ldquoRandomforests for land cover classificationrdquo Pattern Recognition Lettersvol 27 no 4 pp 294ndash300 2006

Advances in Meteorology 11

[29] M Pal ldquoRandom forest classifier for remote sensing classifica-tionrdquo International Journal of Remote Sensing vol 26 no 1 pp217ndash222 2005

[30] V F Rodriguez-Galiano M Chica-Olmo F Abarca-Hernan-dez P M Atkinson and C Jeganathan ldquoRandom Forestclassification of Mediterranean land cover using multi-seasonalimagery and multi-seasonal texturerdquo Remote Sensing of Envi-ronment vol 121 pp 93ndash107 2012

[31] V F Rodriguez-Galiano B Ghimire J Rogan M Chica-Olmoand J P Rigol-Sanchez ldquoAn assessment of the effectiveness ofa random forest classifier for land-cover classificationrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 67 no 1pp 93ndash104 2012

[32] C Strobl and A Zeileis ldquoDanger high power mdash Exploringthe statistical properties of a test for random forest variableimportancerdquo Tech Rep 17 University of Munich 2008

[33] V Svetnik A Liaw C Tong J Christopher Culberson R PSheridan and B P Feuston ldquoRandom forest a classification andregression tool for compound classification and QSAR mod-elingrdquo Journal of Chemical Information and Computer Sciencesvol 43 no 6 pp 1947ndash1958 2003

[34] E Eccel L Ghielmi P Granitto R Barbiero F Grazzini and DCesari ldquoPrediction of minimum temperatures in an alpineregion by linear and non-linear post-processing of meteorolog-ical modelsrdquoNonlinear Processes in Geophysics vol 14 no 3 pp211ndash222 2007

[35] Pearl River Water Resources Committee (PRWRC) The Zhu-jiang Archive vol 1 Guangdong Science and Technology PressGuangzhou China 1991 (in Chinese)

[36] Q Zhang C-Y Xu and Z Zhang ldquoObserved changes ofdroughtwetness episodes in the Pearl River basin Chinausing the standardized precipitation index and aridity indexrdquoTheoretical and Applied Climatology vol 98 no 1-2 pp 89ndash992009

[37] E V Dmitriev I V Nogotkov V S Rogutov G Komenko andA Chavro ldquoTemporal error estimate for statistical downscalingregional meteorological modelsrdquo Fısica de la Tierra vol 19 pp219ndash241 2007

[38] C Lavaysse M Vrac P Drobinski M Lengaigne and TVischel ldquoStatistical downscaling of the French Mediterraneanclimate assessment for present and projection in an anthro-pogenic scenariordquo Natural Hazards and Earth System Sciencevol 12 no 3 pp 651ndash670 2012

[39] L Liu Z Liu X Ren T Fischer and Y Xu ldquoHydrologicalimpacts of climate change in the Yellow River Basin for the 21stcentury using hydrological model and statistical downscalingmodelrdquo Quaternary International vol 244 no 2 pp 211ndash2202011

[40] F-F Ai J Bin Z-M Zhang et al ldquoApplication of randomforests to select premiumquality vegetable oils by their fatty acidcompositionrdquo Food Chemistry vol 143 pp 472ndash478 2014

[41] P Gislason J Benediktsson and J Sveinsson ldquoRandom forestclassification of multisource remote sensing and geographicdatardquo in Proceedings of the Geoscience and Remote Sensing Sym-posium vol 2 pp 1049ndash1052 IEEE Anchorage Alaska USA2004

[42] B C Hewitson and R G Crane ldquoClimate downscaling tech-niques and applicationrdquo Climate Research vol 7 no 2 pp 85ndash95 1996

[43] B Timbal A Dufour and B McAvaney ldquoAn estimate of fu-ture climate change for western France using a statistical

downscaling techniquerdquo Climate Dynamics vol 20 no 7-8 pp807ndash823 2003

[44] K Tatsumi T Oizumi and Y Yamashiki ldquoEffects of climatechange on daily minimum and maximum temperatures andcloudiness in the Shikoku region a statistical downscalingmodel approachrdquoTheoretical and Applied Climatology vol 120no 1-2 pp 87ndash98 2015

[45] Z A Holden J T Abatzoglou C H Luce and L S BaggettldquoEmpirical downscaling of daily minimum air temperature atvery fine resolutions in complex terrainrdquoAgricultural and ForestMeteorology vol 151 no 8 pp 1066ndash1073 2011

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ClimatologyJournal of

EcologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

EarthquakesJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mining

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporation httpwwwhindawicom Volume 201

International Journal of

OceanographyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal ofPetroleum Engineering

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Atmospheric SciencesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MineralogyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MeteorologyAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geological ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geology Advances in

Page 7: Statistical Downscaling of Temperature with the Random ...downloads.hindawi.com/journals/amete/2017/7265178.pdf · Statistical downscaling techniques can be divided into threecategories:weathertyping,weathergenerators,andre-gression-based

Advances in Meteorology 7

Calibration Validation

08

082

084

086

088

09

092

094

096

098

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ar

AN

N-p

ca

Model

078

08

082

084

086

088

09

092

094

096

Aver

age o

f Nas

h

RF

MLR

-par

SVM

-par

MLR

-pca

SVM

-pca

AN

N-p

ca

AN

N-p

ar

Model

Figure 8 Average Nash in calibration and validation period for all models

Table 2 Performance assessment for predictands in calibration and validation

Models Periods Nash RMSE MAE 119877 Bias

MLR-par Calibration 092 176 135 096 000Validation 092 166 129 096 001

MLR-pca Calibration 090 196 151 095 000Validation 088 200 155 094 031

ANN-par Calibration 093 156 119 097 000Validation 093 152 117 097 007

ANN-pca Calibration 092 168 128 096 000Validation 091 172 132 096 035

SVM-par Calibration 092 176 134 096 minus009Validation 092 166 128 096 minus006

SVM-pca Calibration 089 197 150 095 minus010Validation 088 199 153 094 022

RF Calibration 098 080 058 099 000Validation 094 146 112 097 021

superior to PCA in theMLR ANN and SVMmodels For theANNmodels the average119877 are similar in both the calibrationand validation period however the average Nash RMSEMAE and Bias of the results using PAR are superior to thoseof PCA with decreases of 001 012 008 and 0 in the cali-bration period and 002 020 015 and 028 in the validationperiod respectively For the SVM models the increases inaverage Nash and119877 are 003 and 001 in the calibration periodand 004 and 002 in the validation period respectively Thedecreases of average RMSE MAE and Bias are 021 016and 001 in the calibration period and 033 025 and 028

in the validation period respectively For the same the PARis superior to PCA for MLR for most of the criteria withincreases in Nash and 119877 and decreases in RMSE and MAE

The spatial distributions of model precision of the RFand the comparative models are shown in Figure 9 inwhich Nash is selected as the evaluating criteria For thecomparative results the PAR was used in the MLR ANNand SVM modeling It is shown that the RF is superior tothe comparative models at most of the individual stations Inaddition it can be observed that the precision of the RF instations located in the plain region is higher than that in the

8 Advances in Meteorology

MLR calibration

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basin

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

le088

ge096

N

(a) MLR calibration

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

088ndash090090ndash092

092ndash094094ndash096

MLR validationPearl River Basinle088

ge096

N

(b) MLR validation

ANN calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(c) ANN calibration

ANN validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(d) ANN validation

SVM calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(e) SVM calibration

SVM validation

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

0 150 300(km)

N

(f) SVM validation

Figure 9 Continued

Advances in Meteorology 9

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

RF calibration

(g) RF calibration

RF validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(h) RF validation

Figure 9 The spatial distribution of Nash for different models

mountain regions which indicates the limited applicability incomplex terrains

5 Summary and Conclusions

Statistical downscaling models are effective in solving themismatch between large-scale climate models and local scalehydrological responses In this study a statisticalmodel basedon the random forest method was proposed and appliedto the Pearl River basin The objective of this study was toinvestigate whether the RF approach could successfully sim-ulate the complicated relationship between the predictors andpredictands The daily mean temperature observations from61 stations in the Pearl River Basin and the NCEP reanalysisdaily data from 1961 to 2005were selected in order to comparethe results of the RF model with those of the MLR ANNand SVM models The following summarizes the discussionpoints and provides conclusions derived from this analysis

(1) The RF model successfully simulated the relation-ship between the predictors and predictands andperformed better than the MLR ANN and SVMmodels According to five statistical criteria the RFshowed the highest model efficiency in both thecalibration and validation periods In addition it wasobserved that themodel efficiency of the RF increasedcontinuously by considering more predictors In thisstudy all 26 predictors from the NCEP were consid-ered in the RF modeling By taking full advantageof the information in the predictors and avoidingthe influence of noise the RF model performancedominated the other results

(2) The built-in variable importance evaluation processand the OOB samples in the RF made predictorselection convenient The built-in variable impor-tance evaluation process ranked the importance of

each predictor for the prediction of the predictandsThe OOB samples gave the unbiased estimation ofthe model efficiency Both of these were helpful inthe predictor selection Although all predictors wereconsidered in this study they will be most valuablewhen the RF is used in more complex downscalingproblems such as precipitation downscaling

(3) The spatial distribution of model precision at theindividual stations for the RF and the comparativemodels was also discussed Although the RF wassuperior to the comparative models for most of thestations there was still room for improvement for theprediction accuracy in mountainous areas

As this studymainly discussed the development and com-parison of temperature downscalingmethods future projectsin the Pearl River basin were not further explored We willpursue further research in the application of the RF to addi-tionalmeteorological elements such as the precipitationThiswill be helpful in understanding the strength and limitationof the RF method Furthermore the surface parameters likethe elevation slope vegetation and so forth also influencethe distribution of these meteorological elements especiallyin complex terrain [45] wewill explore further in consideringthese parameters as part of model input

Conflicts of Interest

The authors declare no conflicts of interest

Authorsrsquo Contributions

Bo Pang contributed to the literature review statisticalanalysis manuscript preparation and editing Jiajia Yue andGang Zhao contributed to the model design and simulation

10 Advances in Meteorology

and Zongxue Xu contributed to the manuscript revision andreview and supervised the study

Acknowledgments

This research is funded by the Youth Science Foundation ofthe National Natural Science Foundation (51309009) theNational Natural Science Foundation (91125015) andNation-al Key Research and Development Program (during the 13thFive-Year Plan) under Grant no 2016YFC0401309 Ministryof Science and Technology China

References

[1] IPCC Climate Change 2007 The Physical Science Basis Con-tribution of Working Group I to the Fourth Assessment Reportof the Intergovernmental Panel on Climate Change CambridgeUniversity Press Cambridge UK 2007

[2] R L Wilby S P Charles E Zorita B Timbal P Whetton andL O Mearns Guidelines for Use of Climate Scenarios Developedfrom Statistical Downscaling Methods UEA Norwich UK2004

[3] J T Chu J Xia C-Y Xu and V P Singh ldquoStatistical down-scaling of daily mean temperature pan evaporation and pre-cipitation for climate change scenarios in Haihe River ChinardquoTheoretical and Applied Climatology vol 99 no 1-2 pp 149ndash1612010

[4] R L Wilby L E Hay and G H Leavesley ldquoA comparisonof downscaled and raw GCM output implications for climatechange scenarios in the San JuanRiver Basin Coloradordquo Journalof Hydrology vol 225 no 1-2 pp 67ndash91 1999

[5] M K Goyal and C S P Ojha ldquoDownscaling of surfacetemperature for lake catchment in an arid region in India usinglinear multiple regression and neural networksrdquo InternationalJournal of Climatology vol 32 no 4 pp 552ndash566 2012

[6] R Huth ldquoStatistical downscaling in central Europe evaluationof methods and potential predictorsrdquo Climate Research vol 13no 2 pp 91ndash101 1999

[7] D Chen and Y Chen ldquoAssociation between winter temperaturein China and upper air circulation over East Asia revealed bycanonical correlation analysisrdquo Global and Planetary Changevol 37 no 3-4 pp 315ndash325 2003

[8] P Coulibaly Y B Dibike and F Anctil ldquoDownscaling precipi-tation and temperature with temporal neural networksrdquo Journalof Hydrometeorology vol 6 no 4 pp 483ndash496 2005

[9] A Anandhi V V Srinivas D N Kumar and R S NanjundiahldquoRole of predictors in downscaling surface temperature to riverbasin in India for IPCC SRES scenarios using support vectormachinerdquo International Journal of Climatology vol 29 no 4 pp583ndash603 2009

[10] J T Schoof and S C Pryor ldquoDownscaling temperature andprecipitation a comparison of regression-based methods andartificial neural networksrdquo International Journal of Climatologyvol 21 no 7 pp 773ndash790 2001

[11] E Kostopoulou CGiannakopoulos CAnagnostopoulou et alldquoSimulatingmaximumandminimum temperature overGreecea comparison of three downscaling techniquesrdquoTheoretical andApplied Climatology vol 90 no 1-2 pp 65ndash82 2007

[12] D Duhan and A Pandey ldquoStatistical downscaling of tempera-ture using three techniques in the Tons River basin in Central

IndiardquoTheoretical and Applied Climatology vol 121 no 3-4 pp605ndash622 2015

[13] D Maraun H W Rust and T J Osborn ldquoThe annual cycleof heavy precipitation across the United Kingdom a modelbased on extreme value statisticsrdquo International Journal ofClimatology vol 29 no 12 pp 1731ndash1744 2009

[14] M Widmann ldquoOne-dimensional CCA and SVD and theirrelationship to regression mapsrdquo Journal of Climate vol 18 no14 pp 2785ndash2792 2005

[15] M K Tippett T DelSole S J Mason and A G BarnstonldquoRegression-based methods for finding coupled patternsrdquo Jour-nal of Climate vol 21 no 17 pp 4384ndash4398 2008

[16] M Hessami P Gachon T B M J Ouarda and A St-Hilaire ldquoAutomated regression-based statistical downscalingtoolrdquo Environmental Modelling and Software vol 23 no 6 pp813ndash834 2008

[17] Z Liu Z Xu S P Charles G Fu and L Liu ldquoEvaluation of twostatistical downscaling models for daily precipitation over anarid basin in Chinardquo International Journal of Climatology vol31 no 13 pp 2006ndash2020 2011

[18] C Yang N Wang S Wang and L Zhou ldquoPerformancecomparison of three predictor selection methods for statisticaldownscaling of daily precipitationrdquo Theoretical and AppliedClimatology pp 1ndash12 2016

[19] I Hanssen-Bauer E J Foslashrland J E Haugen and O E TveitoldquoTemperature and precipitation scenarios for Norway com-parison of results from dynamical and empirical downscalingrdquoClimate Research vol 25 no 1 pp 15ndash27 2003

[20] A Hannachi I T Jolliffe and D B Stephenson ldquoEmpiricalorthogonal functions and related techniques in atmosphericscience a reviewrdquo International Journal of Climatology vol 27no 9 pp 1119ndash1152 2007

[21] O Fistikoglu andUOkkan ldquoStatistical downscaling ofmonthlyprecipitation using NCEPNCAR reanalysis data for tahtaliriver basin in Turkeyrdquo Journal of Hydrologic Engineering vol 16no 2 pp 157ndash164 2010

[22] A Sharma ldquoSeasonal to interannual rainfall probabilistic fore-casts for improved water supply management part 1mdashA strat-egy for system predictor identificationrdquo Journal of Hydrologyvol 239 no 1ndash4 pp 232ndash239 2000

[23] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[24] M Malekipirbazari and V Aksakalli ldquoRisk assessment in sociallending via random forestsrdquo Expert Systems with Applicationsvol 42 no 10 pp 4621ndash4631 2015

[25] P Baudron F Alonso-Sarrıa J L Garcıa-Arostegui F Canovas-Garcıa D Martınez-Vicente and J Moreno-Brotons ldquoIdentify-ing the origin of groundwater samples in a multi-layer aquifersystemwithRandomForest classificationrdquo Journal ofHydrologyvol 499 pp 303ndash315 2013

[26] K Tatsumi Y Yamashiki M A Canales Torres and C L RTaipe ldquoCrop classification of upland fields using Random forestof time-series Landsat 7 ETM+datardquoComputers and Electronicsin Agriculture vol 115 pp 171ndash179 2015

[27] Z Wang C Lai X Chen B Yang S Zhao and X Bai ldquoFloodhazard risk assessment model based on random forestrdquo Journalof Hydrology vol 527 pp 1130ndash1141 2015

[28] P O Gislason J A Benediktsson and J R Sveinsson ldquoRandomforests for land cover classificationrdquo Pattern Recognition Lettersvol 27 no 4 pp 294ndash300 2006

Advances in Meteorology 11

[29] M Pal ldquoRandom forest classifier for remote sensing classifica-tionrdquo International Journal of Remote Sensing vol 26 no 1 pp217ndash222 2005

[30] V F Rodriguez-Galiano M Chica-Olmo F Abarca-Hernan-dez P M Atkinson and C Jeganathan ldquoRandom Forestclassification of Mediterranean land cover using multi-seasonalimagery and multi-seasonal texturerdquo Remote Sensing of Envi-ronment vol 121 pp 93ndash107 2012

[31] V F Rodriguez-Galiano B Ghimire J Rogan M Chica-Olmoand J P Rigol-Sanchez ldquoAn assessment of the effectiveness ofa random forest classifier for land-cover classificationrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 67 no 1pp 93ndash104 2012

[32] C Strobl and A Zeileis ldquoDanger high power mdash Exploringthe statistical properties of a test for random forest variableimportancerdquo Tech Rep 17 University of Munich 2008

[33] V Svetnik A Liaw C Tong J Christopher Culberson R PSheridan and B P Feuston ldquoRandom forest a classification andregression tool for compound classification and QSAR mod-elingrdquo Journal of Chemical Information and Computer Sciencesvol 43 no 6 pp 1947ndash1958 2003

[34] E Eccel L Ghielmi P Granitto R Barbiero F Grazzini and DCesari ldquoPrediction of minimum temperatures in an alpineregion by linear and non-linear post-processing of meteorolog-ical modelsrdquoNonlinear Processes in Geophysics vol 14 no 3 pp211ndash222 2007

[35] Pearl River Water Resources Committee (PRWRC) The Zhu-jiang Archive vol 1 Guangdong Science and Technology PressGuangzhou China 1991 (in Chinese)

[36] Q Zhang C-Y Xu and Z Zhang ldquoObserved changes ofdroughtwetness episodes in the Pearl River basin Chinausing the standardized precipitation index and aridity indexrdquoTheoretical and Applied Climatology vol 98 no 1-2 pp 89ndash992009

[37] E V Dmitriev I V Nogotkov V S Rogutov G Komenko andA Chavro ldquoTemporal error estimate for statistical downscalingregional meteorological modelsrdquo Fısica de la Tierra vol 19 pp219ndash241 2007

[38] C Lavaysse M Vrac P Drobinski M Lengaigne and TVischel ldquoStatistical downscaling of the French Mediterraneanclimate assessment for present and projection in an anthro-pogenic scenariordquo Natural Hazards and Earth System Sciencevol 12 no 3 pp 651ndash670 2012

[39] L Liu Z Liu X Ren T Fischer and Y Xu ldquoHydrologicalimpacts of climate change in the Yellow River Basin for the 21stcentury using hydrological model and statistical downscalingmodelrdquo Quaternary International vol 244 no 2 pp 211ndash2202011

[40] F-F Ai J Bin Z-M Zhang et al ldquoApplication of randomforests to select premiumquality vegetable oils by their fatty acidcompositionrdquo Food Chemistry vol 143 pp 472ndash478 2014

[41] P Gislason J Benediktsson and J Sveinsson ldquoRandom forestclassification of multisource remote sensing and geographicdatardquo in Proceedings of the Geoscience and Remote Sensing Sym-posium vol 2 pp 1049ndash1052 IEEE Anchorage Alaska USA2004

[42] B C Hewitson and R G Crane ldquoClimate downscaling tech-niques and applicationrdquo Climate Research vol 7 no 2 pp 85ndash95 1996

[43] B Timbal A Dufour and B McAvaney ldquoAn estimate of fu-ture climate change for western France using a statistical

downscaling techniquerdquo Climate Dynamics vol 20 no 7-8 pp807ndash823 2003

[44] K Tatsumi T Oizumi and Y Yamashiki ldquoEffects of climatechange on daily minimum and maximum temperatures andcloudiness in the Shikoku region a statistical downscalingmodel approachrdquoTheoretical and Applied Climatology vol 120no 1-2 pp 87ndash98 2015

[45] Z A Holden J T Abatzoglou C H Luce and L S BaggettldquoEmpirical downscaling of daily minimum air temperature atvery fine resolutions in complex terrainrdquoAgricultural and ForestMeteorology vol 151 no 8 pp 1066ndash1073 2011

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ClimatologyJournal of

EcologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

EarthquakesJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mining

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporation httpwwwhindawicom Volume 201

International Journal of

OceanographyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal ofPetroleum Engineering

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Atmospheric SciencesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MineralogyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MeteorologyAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geological ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geology Advances in

Page 8: Statistical Downscaling of Temperature with the Random ...downloads.hindawi.com/journals/amete/2017/7265178.pdf · Statistical downscaling techniques can be divided into threecategories:weathertyping,weathergenerators,andre-gression-based

8 Advances in Meteorology

MLR calibration

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basin

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

le088

ge096

N

(a) MLR calibration

0 150 300(km)

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

088ndash090090ndash092

092ndash094094ndash096

MLR validationPearl River Basinle088

ge096

N

(b) MLR validation

ANN calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(c) ANN calibration

ANN validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(d) ANN validation

SVM calibration

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(e) SVM calibration

SVM validation

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

0 150 300(km)

N

(f) SVM validation

Figure 9 Continued

Advances in Meteorology 9

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

RF calibration

(g) RF calibration

RF validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(h) RF validation

Figure 9 The spatial distribution of Nash for different models

mountain regions which indicates the limited applicability incomplex terrains

5 Summary and Conclusions

Statistical downscaling models are effective in solving themismatch between large-scale climate models and local scalehydrological responses In this study a statisticalmodel basedon the random forest method was proposed and appliedto the Pearl River basin The objective of this study was toinvestigate whether the RF approach could successfully sim-ulate the complicated relationship between the predictors andpredictands The daily mean temperature observations from61 stations in the Pearl River Basin and the NCEP reanalysisdaily data from 1961 to 2005were selected in order to comparethe results of the RF model with those of the MLR ANNand SVM models The following summarizes the discussionpoints and provides conclusions derived from this analysis

(1) The RF model successfully simulated the relation-ship between the predictors and predictands andperformed better than the MLR ANN and SVMmodels According to five statistical criteria the RFshowed the highest model efficiency in both thecalibration and validation periods In addition it wasobserved that themodel efficiency of the RF increasedcontinuously by considering more predictors In thisstudy all 26 predictors from the NCEP were consid-ered in the RF modeling By taking full advantageof the information in the predictors and avoidingthe influence of noise the RF model performancedominated the other results

(2) The built-in variable importance evaluation processand the OOB samples in the RF made predictorselection convenient The built-in variable impor-tance evaluation process ranked the importance of

each predictor for the prediction of the predictandsThe OOB samples gave the unbiased estimation ofthe model efficiency Both of these were helpful inthe predictor selection Although all predictors wereconsidered in this study they will be most valuablewhen the RF is used in more complex downscalingproblems such as precipitation downscaling

(3) The spatial distribution of model precision at theindividual stations for the RF and the comparativemodels was also discussed Although the RF wassuperior to the comparative models for most of thestations there was still room for improvement for theprediction accuracy in mountainous areas

As this studymainly discussed the development and com-parison of temperature downscalingmethods future projectsin the Pearl River basin were not further explored We willpursue further research in the application of the RF to addi-tionalmeteorological elements such as the precipitationThiswill be helpful in understanding the strength and limitationof the RF method Furthermore the surface parameters likethe elevation slope vegetation and so forth also influencethe distribution of these meteorological elements especiallyin complex terrain [45] wewill explore further in consideringthese parameters as part of model input

Conflicts of Interest

The authors declare no conflicts of interest

Authorsrsquo Contributions

Bo Pang contributed to the literature review statisticalanalysis manuscript preparation and editing Jiajia Yue andGang Zhao contributed to the model design and simulation

10 Advances in Meteorology

and Zongxue Xu contributed to the manuscript revision andreview and supervised the study

Acknowledgments

This research is funded by the Youth Science Foundation ofthe National Natural Science Foundation (51309009) theNational Natural Science Foundation (91125015) andNation-al Key Research and Development Program (during the 13thFive-Year Plan) under Grant no 2016YFC0401309 Ministryof Science and Technology China

References

[1] IPCC Climate Change 2007 The Physical Science Basis Con-tribution of Working Group I to the Fourth Assessment Reportof the Intergovernmental Panel on Climate Change CambridgeUniversity Press Cambridge UK 2007

[2] R L Wilby S P Charles E Zorita B Timbal P Whetton andL O Mearns Guidelines for Use of Climate Scenarios Developedfrom Statistical Downscaling Methods UEA Norwich UK2004

[3] J T Chu J Xia C-Y Xu and V P Singh ldquoStatistical down-scaling of daily mean temperature pan evaporation and pre-cipitation for climate change scenarios in Haihe River ChinardquoTheoretical and Applied Climatology vol 99 no 1-2 pp 149ndash1612010

[4] R L Wilby L E Hay and G H Leavesley ldquoA comparisonof downscaled and raw GCM output implications for climatechange scenarios in the San JuanRiver Basin Coloradordquo Journalof Hydrology vol 225 no 1-2 pp 67ndash91 1999

[5] M K Goyal and C S P Ojha ldquoDownscaling of surfacetemperature for lake catchment in an arid region in India usinglinear multiple regression and neural networksrdquo InternationalJournal of Climatology vol 32 no 4 pp 552ndash566 2012

[6] R Huth ldquoStatistical downscaling in central Europe evaluationof methods and potential predictorsrdquo Climate Research vol 13no 2 pp 91ndash101 1999

[7] D Chen and Y Chen ldquoAssociation between winter temperaturein China and upper air circulation over East Asia revealed bycanonical correlation analysisrdquo Global and Planetary Changevol 37 no 3-4 pp 315ndash325 2003

[8] P Coulibaly Y B Dibike and F Anctil ldquoDownscaling precipi-tation and temperature with temporal neural networksrdquo Journalof Hydrometeorology vol 6 no 4 pp 483ndash496 2005

[9] A Anandhi V V Srinivas D N Kumar and R S NanjundiahldquoRole of predictors in downscaling surface temperature to riverbasin in India for IPCC SRES scenarios using support vectormachinerdquo International Journal of Climatology vol 29 no 4 pp583ndash603 2009

[10] J T Schoof and S C Pryor ldquoDownscaling temperature andprecipitation a comparison of regression-based methods andartificial neural networksrdquo International Journal of Climatologyvol 21 no 7 pp 773ndash790 2001

[11] E Kostopoulou CGiannakopoulos CAnagnostopoulou et alldquoSimulatingmaximumandminimum temperature overGreecea comparison of three downscaling techniquesrdquoTheoretical andApplied Climatology vol 90 no 1-2 pp 65ndash82 2007

[12] D Duhan and A Pandey ldquoStatistical downscaling of tempera-ture using three techniques in the Tons River basin in Central

IndiardquoTheoretical and Applied Climatology vol 121 no 3-4 pp605ndash622 2015

[13] D Maraun H W Rust and T J Osborn ldquoThe annual cycleof heavy precipitation across the United Kingdom a modelbased on extreme value statisticsrdquo International Journal ofClimatology vol 29 no 12 pp 1731ndash1744 2009

[14] M Widmann ldquoOne-dimensional CCA and SVD and theirrelationship to regression mapsrdquo Journal of Climate vol 18 no14 pp 2785ndash2792 2005

[15] M K Tippett T DelSole S J Mason and A G BarnstonldquoRegression-based methods for finding coupled patternsrdquo Jour-nal of Climate vol 21 no 17 pp 4384ndash4398 2008

[16] M Hessami P Gachon T B M J Ouarda and A St-Hilaire ldquoAutomated regression-based statistical downscalingtoolrdquo Environmental Modelling and Software vol 23 no 6 pp813ndash834 2008

[17] Z Liu Z Xu S P Charles G Fu and L Liu ldquoEvaluation of twostatistical downscaling models for daily precipitation over anarid basin in Chinardquo International Journal of Climatology vol31 no 13 pp 2006ndash2020 2011

[18] C Yang N Wang S Wang and L Zhou ldquoPerformancecomparison of three predictor selection methods for statisticaldownscaling of daily precipitationrdquo Theoretical and AppliedClimatology pp 1ndash12 2016

[19] I Hanssen-Bauer E J Foslashrland J E Haugen and O E TveitoldquoTemperature and precipitation scenarios for Norway com-parison of results from dynamical and empirical downscalingrdquoClimate Research vol 25 no 1 pp 15ndash27 2003

[20] A Hannachi I T Jolliffe and D B Stephenson ldquoEmpiricalorthogonal functions and related techniques in atmosphericscience a reviewrdquo International Journal of Climatology vol 27no 9 pp 1119ndash1152 2007

[21] O Fistikoglu andUOkkan ldquoStatistical downscaling ofmonthlyprecipitation using NCEPNCAR reanalysis data for tahtaliriver basin in Turkeyrdquo Journal of Hydrologic Engineering vol 16no 2 pp 157ndash164 2010

[22] A Sharma ldquoSeasonal to interannual rainfall probabilistic fore-casts for improved water supply management part 1mdashA strat-egy for system predictor identificationrdquo Journal of Hydrologyvol 239 no 1ndash4 pp 232ndash239 2000

[23] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[24] M Malekipirbazari and V Aksakalli ldquoRisk assessment in sociallending via random forestsrdquo Expert Systems with Applicationsvol 42 no 10 pp 4621ndash4631 2015

[25] P Baudron F Alonso-Sarrıa J L Garcıa-Arostegui F Canovas-Garcıa D Martınez-Vicente and J Moreno-Brotons ldquoIdentify-ing the origin of groundwater samples in a multi-layer aquifersystemwithRandomForest classificationrdquo Journal ofHydrologyvol 499 pp 303ndash315 2013

[26] K Tatsumi Y Yamashiki M A Canales Torres and C L RTaipe ldquoCrop classification of upland fields using Random forestof time-series Landsat 7 ETM+datardquoComputers and Electronicsin Agriculture vol 115 pp 171ndash179 2015

[27] Z Wang C Lai X Chen B Yang S Zhao and X Bai ldquoFloodhazard risk assessment model based on random forestrdquo Journalof Hydrology vol 527 pp 1130ndash1141 2015

[28] P O Gislason J A Benediktsson and J R Sveinsson ldquoRandomforests for land cover classificationrdquo Pattern Recognition Lettersvol 27 no 4 pp 294ndash300 2006

Advances in Meteorology 11

[29] M Pal ldquoRandom forest classifier for remote sensing classifica-tionrdquo International Journal of Remote Sensing vol 26 no 1 pp217ndash222 2005

[30] V F Rodriguez-Galiano M Chica-Olmo F Abarca-Hernan-dez P M Atkinson and C Jeganathan ldquoRandom Forestclassification of Mediterranean land cover using multi-seasonalimagery and multi-seasonal texturerdquo Remote Sensing of Envi-ronment vol 121 pp 93ndash107 2012

[31] V F Rodriguez-Galiano B Ghimire J Rogan M Chica-Olmoand J P Rigol-Sanchez ldquoAn assessment of the effectiveness ofa random forest classifier for land-cover classificationrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 67 no 1pp 93ndash104 2012

[32] C Strobl and A Zeileis ldquoDanger high power mdash Exploringthe statistical properties of a test for random forest variableimportancerdquo Tech Rep 17 University of Munich 2008

[33] V Svetnik A Liaw C Tong J Christopher Culberson R PSheridan and B P Feuston ldquoRandom forest a classification andregression tool for compound classification and QSAR mod-elingrdquo Journal of Chemical Information and Computer Sciencesvol 43 no 6 pp 1947ndash1958 2003

[34] E Eccel L Ghielmi P Granitto R Barbiero F Grazzini and DCesari ldquoPrediction of minimum temperatures in an alpineregion by linear and non-linear post-processing of meteorolog-ical modelsrdquoNonlinear Processes in Geophysics vol 14 no 3 pp211ndash222 2007

[35] Pearl River Water Resources Committee (PRWRC) The Zhu-jiang Archive vol 1 Guangdong Science and Technology PressGuangzhou China 1991 (in Chinese)

[36] Q Zhang C-Y Xu and Z Zhang ldquoObserved changes ofdroughtwetness episodes in the Pearl River basin Chinausing the standardized precipitation index and aridity indexrdquoTheoretical and Applied Climatology vol 98 no 1-2 pp 89ndash992009

[37] E V Dmitriev I V Nogotkov V S Rogutov G Komenko andA Chavro ldquoTemporal error estimate for statistical downscalingregional meteorological modelsrdquo Fısica de la Tierra vol 19 pp219ndash241 2007

[38] C Lavaysse M Vrac P Drobinski M Lengaigne and TVischel ldquoStatistical downscaling of the French Mediterraneanclimate assessment for present and projection in an anthro-pogenic scenariordquo Natural Hazards and Earth System Sciencevol 12 no 3 pp 651ndash670 2012

[39] L Liu Z Liu X Ren T Fischer and Y Xu ldquoHydrologicalimpacts of climate change in the Yellow River Basin for the 21stcentury using hydrological model and statistical downscalingmodelrdquo Quaternary International vol 244 no 2 pp 211ndash2202011

[40] F-F Ai J Bin Z-M Zhang et al ldquoApplication of randomforests to select premiumquality vegetable oils by their fatty acidcompositionrdquo Food Chemistry vol 143 pp 472ndash478 2014

[41] P Gislason J Benediktsson and J Sveinsson ldquoRandom forestclassification of multisource remote sensing and geographicdatardquo in Proceedings of the Geoscience and Remote Sensing Sym-posium vol 2 pp 1049ndash1052 IEEE Anchorage Alaska USA2004

[42] B C Hewitson and R G Crane ldquoClimate downscaling tech-niques and applicationrdquo Climate Research vol 7 no 2 pp 85ndash95 1996

[43] B Timbal A Dufour and B McAvaney ldquoAn estimate of fu-ture climate change for western France using a statistical

downscaling techniquerdquo Climate Dynamics vol 20 no 7-8 pp807ndash823 2003

[44] K Tatsumi T Oizumi and Y Yamashiki ldquoEffects of climatechange on daily minimum and maximum temperatures andcloudiness in the Shikoku region a statistical downscalingmodel approachrdquoTheoretical and Applied Climatology vol 120no 1-2 pp 87ndash98 2015

[45] Z A Holden J T Abatzoglou C H Luce and L S BaggettldquoEmpirical downscaling of daily minimum air temperature atvery fine resolutions in complex terrainrdquoAgricultural and ForestMeteorology vol 151 no 8 pp 1066ndash1073 2011

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ClimatologyJournal of

EcologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

EarthquakesJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mining

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporation httpwwwhindawicom Volume 201

International Journal of

OceanographyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal ofPetroleum Engineering

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Atmospheric SciencesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MineralogyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MeteorologyAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geological ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geology Advances in

Page 9: Statistical Downscaling of Temperature with the Random ...downloads.hindawi.com/journals/amete/2017/7265178.pdf · Statistical downscaling techniques can be divided into threecategories:weathertyping,weathergenerators,andre-gression-based

Advances in Meteorology 9

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

RF calibration

(g) RF calibration

RF validation

0 150 300(km)

088ndash090090ndash092

092ndash094094ndash096

Pearl River Basinle088

ge096

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

105∘09984000998400998400E 110

∘09984000998400998400E 115

∘09984000998400998400E

25∘09984000998400998400

N20∘09984000998400998400

N

25∘09984000998400998400

N20∘09984000998400998400

N

N

(h) RF validation

Figure 9 The spatial distribution of Nash for different models

mountain regions which indicates the limited applicability incomplex terrains

5 Summary and Conclusions

Statistical downscaling models are effective in solving themismatch between large-scale climate models and local scalehydrological responses In this study a statisticalmodel basedon the random forest method was proposed and appliedto the Pearl River basin The objective of this study was toinvestigate whether the RF approach could successfully sim-ulate the complicated relationship between the predictors andpredictands The daily mean temperature observations from61 stations in the Pearl River Basin and the NCEP reanalysisdaily data from 1961 to 2005were selected in order to comparethe results of the RF model with those of the MLR ANNand SVM models The following summarizes the discussionpoints and provides conclusions derived from this analysis

(1) The RF model successfully simulated the relation-ship between the predictors and predictands andperformed better than the MLR ANN and SVMmodels According to five statistical criteria the RFshowed the highest model efficiency in both thecalibration and validation periods In addition it wasobserved that themodel efficiency of the RF increasedcontinuously by considering more predictors In thisstudy all 26 predictors from the NCEP were consid-ered in the RF modeling By taking full advantageof the information in the predictors and avoidingthe influence of noise the RF model performancedominated the other results

(2) The built-in variable importance evaluation processand the OOB samples in the RF made predictorselection convenient The built-in variable impor-tance evaluation process ranked the importance of

each predictor for the prediction of the predictandsThe OOB samples gave the unbiased estimation ofthe model efficiency Both of these were helpful inthe predictor selection Although all predictors wereconsidered in this study they will be most valuablewhen the RF is used in more complex downscalingproblems such as precipitation downscaling

(3) The spatial distribution of model precision at theindividual stations for the RF and the comparativemodels was also discussed Although the RF wassuperior to the comparative models for most of thestations there was still room for improvement for theprediction accuracy in mountainous areas

As this studymainly discussed the development and com-parison of temperature downscalingmethods future projectsin the Pearl River basin were not further explored We willpursue further research in the application of the RF to addi-tionalmeteorological elements such as the precipitationThiswill be helpful in understanding the strength and limitationof the RF method Furthermore the surface parameters likethe elevation slope vegetation and so forth also influencethe distribution of these meteorological elements especiallyin complex terrain [45] wewill explore further in consideringthese parameters as part of model input

Conflicts of Interest

The authors declare no conflicts of interest

Authorsrsquo Contributions

Bo Pang contributed to the literature review statisticalanalysis manuscript preparation and editing Jiajia Yue andGang Zhao contributed to the model design and simulation

10 Advances in Meteorology

and Zongxue Xu contributed to the manuscript revision andreview and supervised the study

Acknowledgments

This research is funded by the Youth Science Foundation ofthe National Natural Science Foundation (51309009) theNational Natural Science Foundation (91125015) andNation-al Key Research and Development Program (during the 13thFive-Year Plan) under Grant no 2016YFC0401309 Ministryof Science and Technology China

References

[1] IPCC Climate Change 2007 The Physical Science Basis Con-tribution of Working Group I to the Fourth Assessment Reportof the Intergovernmental Panel on Climate Change CambridgeUniversity Press Cambridge UK 2007

[2] R L Wilby S P Charles E Zorita B Timbal P Whetton andL O Mearns Guidelines for Use of Climate Scenarios Developedfrom Statistical Downscaling Methods UEA Norwich UK2004

[3] J T Chu J Xia C-Y Xu and V P Singh ldquoStatistical down-scaling of daily mean temperature pan evaporation and pre-cipitation for climate change scenarios in Haihe River ChinardquoTheoretical and Applied Climatology vol 99 no 1-2 pp 149ndash1612010

[4] R L Wilby L E Hay and G H Leavesley ldquoA comparisonof downscaled and raw GCM output implications for climatechange scenarios in the San JuanRiver Basin Coloradordquo Journalof Hydrology vol 225 no 1-2 pp 67ndash91 1999

[5] M K Goyal and C S P Ojha ldquoDownscaling of surfacetemperature for lake catchment in an arid region in India usinglinear multiple regression and neural networksrdquo InternationalJournal of Climatology vol 32 no 4 pp 552ndash566 2012

[6] R Huth ldquoStatistical downscaling in central Europe evaluationof methods and potential predictorsrdquo Climate Research vol 13no 2 pp 91ndash101 1999

[7] D Chen and Y Chen ldquoAssociation between winter temperaturein China and upper air circulation over East Asia revealed bycanonical correlation analysisrdquo Global and Planetary Changevol 37 no 3-4 pp 315ndash325 2003

[8] P Coulibaly Y B Dibike and F Anctil ldquoDownscaling precipi-tation and temperature with temporal neural networksrdquo Journalof Hydrometeorology vol 6 no 4 pp 483ndash496 2005

[9] A Anandhi V V Srinivas D N Kumar and R S NanjundiahldquoRole of predictors in downscaling surface temperature to riverbasin in India for IPCC SRES scenarios using support vectormachinerdquo International Journal of Climatology vol 29 no 4 pp583ndash603 2009

[10] J T Schoof and S C Pryor ldquoDownscaling temperature andprecipitation a comparison of regression-based methods andartificial neural networksrdquo International Journal of Climatologyvol 21 no 7 pp 773ndash790 2001

[11] E Kostopoulou CGiannakopoulos CAnagnostopoulou et alldquoSimulatingmaximumandminimum temperature overGreecea comparison of three downscaling techniquesrdquoTheoretical andApplied Climatology vol 90 no 1-2 pp 65ndash82 2007

[12] D Duhan and A Pandey ldquoStatistical downscaling of tempera-ture using three techniques in the Tons River basin in Central

IndiardquoTheoretical and Applied Climatology vol 121 no 3-4 pp605ndash622 2015

[13] D Maraun H W Rust and T J Osborn ldquoThe annual cycleof heavy precipitation across the United Kingdom a modelbased on extreme value statisticsrdquo International Journal ofClimatology vol 29 no 12 pp 1731ndash1744 2009

[14] M Widmann ldquoOne-dimensional CCA and SVD and theirrelationship to regression mapsrdquo Journal of Climate vol 18 no14 pp 2785ndash2792 2005

[15] M K Tippett T DelSole S J Mason and A G BarnstonldquoRegression-based methods for finding coupled patternsrdquo Jour-nal of Climate vol 21 no 17 pp 4384ndash4398 2008

[16] M Hessami P Gachon T B M J Ouarda and A St-Hilaire ldquoAutomated regression-based statistical downscalingtoolrdquo Environmental Modelling and Software vol 23 no 6 pp813ndash834 2008

[17] Z Liu Z Xu S P Charles G Fu and L Liu ldquoEvaluation of twostatistical downscaling models for daily precipitation over anarid basin in Chinardquo International Journal of Climatology vol31 no 13 pp 2006ndash2020 2011

[18] C Yang N Wang S Wang and L Zhou ldquoPerformancecomparison of three predictor selection methods for statisticaldownscaling of daily precipitationrdquo Theoretical and AppliedClimatology pp 1ndash12 2016

[19] I Hanssen-Bauer E J Foslashrland J E Haugen and O E TveitoldquoTemperature and precipitation scenarios for Norway com-parison of results from dynamical and empirical downscalingrdquoClimate Research vol 25 no 1 pp 15ndash27 2003

[20] A Hannachi I T Jolliffe and D B Stephenson ldquoEmpiricalorthogonal functions and related techniques in atmosphericscience a reviewrdquo International Journal of Climatology vol 27no 9 pp 1119ndash1152 2007

[21] O Fistikoglu andUOkkan ldquoStatistical downscaling ofmonthlyprecipitation using NCEPNCAR reanalysis data for tahtaliriver basin in Turkeyrdquo Journal of Hydrologic Engineering vol 16no 2 pp 157ndash164 2010

[22] A Sharma ldquoSeasonal to interannual rainfall probabilistic fore-casts for improved water supply management part 1mdashA strat-egy for system predictor identificationrdquo Journal of Hydrologyvol 239 no 1ndash4 pp 232ndash239 2000

[23] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[24] M Malekipirbazari and V Aksakalli ldquoRisk assessment in sociallending via random forestsrdquo Expert Systems with Applicationsvol 42 no 10 pp 4621ndash4631 2015

[25] P Baudron F Alonso-Sarrıa J L Garcıa-Arostegui F Canovas-Garcıa D Martınez-Vicente and J Moreno-Brotons ldquoIdentify-ing the origin of groundwater samples in a multi-layer aquifersystemwithRandomForest classificationrdquo Journal ofHydrologyvol 499 pp 303ndash315 2013

[26] K Tatsumi Y Yamashiki M A Canales Torres and C L RTaipe ldquoCrop classification of upland fields using Random forestof time-series Landsat 7 ETM+datardquoComputers and Electronicsin Agriculture vol 115 pp 171ndash179 2015

[27] Z Wang C Lai X Chen B Yang S Zhao and X Bai ldquoFloodhazard risk assessment model based on random forestrdquo Journalof Hydrology vol 527 pp 1130ndash1141 2015

[28] P O Gislason J A Benediktsson and J R Sveinsson ldquoRandomforests for land cover classificationrdquo Pattern Recognition Lettersvol 27 no 4 pp 294ndash300 2006

Advances in Meteorology 11

[29] M Pal ldquoRandom forest classifier for remote sensing classifica-tionrdquo International Journal of Remote Sensing vol 26 no 1 pp217ndash222 2005

[30] V F Rodriguez-Galiano M Chica-Olmo F Abarca-Hernan-dez P M Atkinson and C Jeganathan ldquoRandom Forestclassification of Mediterranean land cover using multi-seasonalimagery and multi-seasonal texturerdquo Remote Sensing of Envi-ronment vol 121 pp 93ndash107 2012

[31] V F Rodriguez-Galiano B Ghimire J Rogan M Chica-Olmoand J P Rigol-Sanchez ldquoAn assessment of the effectiveness ofa random forest classifier for land-cover classificationrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 67 no 1pp 93ndash104 2012

[32] C Strobl and A Zeileis ldquoDanger high power mdash Exploringthe statistical properties of a test for random forest variableimportancerdquo Tech Rep 17 University of Munich 2008

[33] V Svetnik A Liaw C Tong J Christopher Culberson R PSheridan and B P Feuston ldquoRandom forest a classification andregression tool for compound classification and QSAR mod-elingrdquo Journal of Chemical Information and Computer Sciencesvol 43 no 6 pp 1947ndash1958 2003

[34] E Eccel L Ghielmi P Granitto R Barbiero F Grazzini and DCesari ldquoPrediction of minimum temperatures in an alpineregion by linear and non-linear post-processing of meteorolog-ical modelsrdquoNonlinear Processes in Geophysics vol 14 no 3 pp211ndash222 2007

[35] Pearl River Water Resources Committee (PRWRC) The Zhu-jiang Archive vol 1 Guangdong Science and Technology PressGuangzhou China 1991 (in Chinese)

[36] Q Zhang C-Y Xu and Z Zhang ldquoObserved changes ofdroughtwetness episodes in the Pearl River basin Chinausing the standardized precipitation index and aridity indexrdquoTheoretical and Applied Climatology vol 98 no 1-2 pp 89ndash992009

[37] E V Dmitriev I V Nogotkov V S Rogutov G Komenko andA Chavro ldquoTemporal error estimate for statistical downscalingregional meteorological modelsrdquo Fısica de la Tierra vol 19 pp219ndash241 2007

[38] C Lavaysse M Vrac P Drobinski M Lengaigne and TVischel ldquoStatistical downscaling of the French Mediterraneanclimate assessment for present and projection in an anthro-pogenic scenariordquo Natural Hazards and Earth System Sciencevol 12 no 3 pp 651ndash670 2012

[39] L Liu Z Liu X Ren T Fischer and Y Xu ldquoHydrologicalimpacts of climate change in the Yellow River Basin for the 21stcentury using hydrological model and statistical downscalingmodelrdquo Quaternary International vol 244 no 2 pp 211ndash2202011

[40] F-F Ai J Bin Z-M Zhang et al ldquoApplication of randomforests to select premiumquality vegetable oils by their fatty acidcompositionrdquo Food Chemistry vol 143 pp 472ndash478 2014

[41] P Gislason J Benediktsson and J Sveinsson ldquoRandom forestclassification of multisource remote sensing and geographicdatardquo in Proceedings of the Geoscience and Remote Sensing Sym-posium vol 2 pp 1049ndash1052 IEEE Anchorage Alaska USA2004

[42] B C Hewitson and R G Crane ldquoClimate downscaling tech-niques and applicationrdquo Climate Research vol 7 no 2 pp 85ndash95 1996

[43] B Timbal A Dufour and B McAvaney ldquoAn estimate of fu-ture climate change for western France using a statistical

downscaling techniquerdquo Climate Dynamics vol 20 no 7-8 pp807ndash823 2003

[44] K Tatsumi T Oizumi and Y Yamashiki ldquoEffects of climatechange on daily minimum and maximum temperatures andcloudiness in the Shikoku region a statistical downscalingmodel approachrdquoTheoretical and Applied Climatology vol 120no 1-2 pp 87ndash98 2015

[45] Z A Holden J T Abatzoglou C H Luce and L S BaggettldquoEmpirical downscaling of daily minimum air temperature atvery fine resolutions in complex terrainrdquoAgricultural and ForestMeteorology vol 151 no 8 pp 1066ndash1073 2011

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ClimatologyJournal of

EcologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

EarthquakesJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mining

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporation httpwwwhindawicom Volume 201

International Journal of

OceanographyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal ofPetroleum Engineering

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Atmospheric SciencesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MineralogyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MeteorologyAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geological ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geology Advances in

Page 10: Statistical Downscaling of Temperature with the Random ...downloads.hindawi.com/journals/amete/2017/7265178.pdf · Statistical downscaling techniques can be divided into threecategories:weathertyping,weathergenerators,andre-gression-based

10 Advances in Meteorology

and Zongxue Xu contributed to the manuscript revision andreview and supervised the study

Acknowledgments

This research is funded by the Youth Science Foundation ofthe National Natural Science Foundation (51309009) theNational Natural Science Foundation (91125015) andNation-al Key Research and Development Program (during the 13thFive-Year Plan) under Grant no 2016YFC0401309 Ministryof Science and Technology China

References

[1] IPCC Climate Change 2007 The Physical Science Basis Con-tribution of Working Group I to the Fourth Assessment Reportof the Intergovernmental Panel on Climate Change CambridgeUniversity Press Cambridge UK 2007

[2] R L Wilby S P Charles E Zorita B Timbal P Whetton andL O Mearns Guidelines for Use of Climate Scenarios Developedfrom Statistical Downscaling Methods UEA Norwich UK2004

[3] J T Chu J Xia C-Y Xu and V P Singh ldquoStatistical down-scaling of daily mean temperature pan evaporation and pre-cipitation for climate change scenarios in Haihe River ChinardquoTheoretical and Applied Climatology vol 99 no 1-2 pp 149ndash1612010

[4] R L Wilby L E Hay and G H Leavesley ldquoA comparisonof downscaled and raw GCM output implications for climatechange scenarios in the San JuanRiver Basin Coloradordquo Journalof Hydrology vol 225 no 1-2 pp 67ndash91 1999

[5] M K Goyal and C S P Ojha ldquoDownscaling of surfacetemperature for lake catchment in an arid region in India usinglinear multiple regression and neural networksrdquo InternationalJournal of Climatology vol 32 no 4 pp 552ndash566 2012

[6] R Huth ldquoStatistical downscaling in central Europe evaluationof methods and potential predictorsrdquo Climate Research vol 13no 2 pp 91ndash101 1999

[7] D Chen and Y Chen ldquoAssociation between winter temperaturein China and upper air circulation over East Asia revealed bycanonical correlation analysisrdquo Global and Planetary Changevol 37 no 3-4 pp 315ndash325 2003

[8] P Coulibaly Y B Dibike and F Anctil ldquoDownscaling precipi-tation and temperature with temporal neural networksrdquo Journalof Hydrometeorology vol 6 no 4 pp 483ndash496 2005

[9] A Anandhi V V Srinivas D N Kumar and R S NanjundiahldquoRole of predictors in downscaling surface temperature to riverbasin in India for IPCC SRES scenarios using support vectormachinerdquo International Journal of Climatology vol 29 no 4 pp583ndash603 2009

[10] J T Schoof and S C Pryor ldquoDownscaling temperature andprecipitation a comparison of regression-based methods andartificial neural networksrdquo International Journal of Climatologyvol 21 no 7 pp 773ndash790 2001

[11] E Kostopoulou CGiannakopoulos CAnagnostopoulou et alldquoSimulatingmaximumandminimum temperature overGreecea comparison of three downscaling techniquesrdquoTheoretical andApplied Climatology vol 90 no 1-2 pp 65ndash82 2007

[12] D Duhan and A Pandey ldquoStatistical downscaling of tempera-ture using three techniques in the Tons River basin in Central

IndiardquoTheoretical and Applied Climatology vol 121 no 3-4 pp605ndash622 2015

[13] D Maraun H W Rust and T J Osborn ldquoThe annual cycleof heavy precipitation across the United Kingdom a modelbased on extreme value statisticsrdquo International Journal ofClimatology vol 29 no 12 pp 1731ndash1744 2009

[14] M Widmann ldquoOne-dimensional CCA and SVD and theirrelationship to regression mapsrdquo Journal of Climate vol 18 no14 pp 2785ndash2792 2005

[15] M K Tippett T DelSole S J Mason and A G BarnstonldquoRegression-based methods for finding coupled patternsrdquo Jour-nal of Climate vol 21 no 17 pp 4384ndash4398 2008

[16] M Hessami P Gachon T B M J Ouarda and A St-Hilaire ldquoAutomated regression-based statistical downscalingtoolrdquo Environmental Modelling and Software vol 23 no 6 pp813ndash834 2008

[17] Z Liu Z Xu S P Charles G Fu and L Liu ldquoEvaluation of twostatistical downscaling models for daily precipitation over anarid basin in Chinardquo International Journal of Climatology vol31 no 13 pp 2006ndash2020 2011

[18] C Yang N Wang S Wang and L Zhou ldquoPerformancecomparison of three predictor selection methods for statisticaldownscaling of daily precipitationrdquo Theoretical and AppliedClimatology pp 1ndash12 2016

[19] I Hanssen-Bauer E J Foslashrland J E Haugen and O E TveitoldquoTemperature and precipitation scenarios for Norway com-parison of results from dynamical and empirical downscalingrdquoClimate Research vol 25 no 1 pp 15ndash27 2003

[20] A Hannachi I T Jolliffe and D B Stephenson ldquoEmpiricalorthogonal functions and related techniques in atmosphericscience a reviewrdquo International Journal of Climatology vol 27no 9 pp 1119ndash1152 2007

[21] O Fistikoglu andUOkkan ldquoStatistical downscaling ofmonthlyprecipitation using NCEPNCAR reanalysis data for tahtaliriver basin in Turkeyrdquo Journal of Hydrologic Engineering vol 16no 2 pp 157ndash164 2010

[22] A Sharma ldquoSeasonal to interannual rainfall probabilistic fore-casts for improved water supply management part 1mdashA strat-egy for system predictor identificationrdquo Journal of Hydrologyvol 239 no 1ndash4 pp 232ndash239 2000

[23] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001

[24] M Malekipirbazari and V Aksakalli ldquoRisk assessment in sociallending via random forestsrdquo Expert Systems with Applicationsvol 42 no 10 pp 4621ndash4631 2015

[25] P Baudron F Alonso-Sarrıa J L Garcıa-Arostegui F Canovas-Garcıa D Martınez-Vicente and J Moreno-Brotons ldquoIdentify-ing the origin of groundwater samples in a multi-layer aquifersystemwithRandomForest classificationrdquo Journal ofHydrologyvol 499 pp 303ndash315 2013

[26] K Tatsumi Y Yamashiki M A Canales Torres and C L RTaipe ldquoCrop classification of upland fields using Random forestof time-series Landsat 7 ETM+datardquoComputers and Electronicsin Agriculture vol 115 pp 171ndash179 2015

[27] Z Wang C Lai X Chen B Yang S Zhao and X Bai ldquoFloodhazard risk assessment model based on random forestrdquo Journalof Hydrology vol 527 pp 1130ndash1141 2015

[28] P O Gislason J A Benediktsson and J R Sveinsson ldquoRandomforests for land cover classificationrdquo Pattern Recognition Lettersvol 27 no 4 pp 294ndash300 2006

Advances in Meteorology 11

[29] M Pal ldquoRandom forest classifier for remote sensing classifica-tionrdquo International Journal of Remote Sensing vol 26 no 1 pp217ndash222 2005

[30] V F Rodriguez-Galiano M Chica-Olmo F Abarca-Hernan-dez P M Atkinson and C Jeganathan ldquoRandom Forestclassification of Mediterranean land cover using multi-seasonalimagery and multi-seasonal texturerdquo Remote Sensing of Envi-ronment vol 121 pp 93ndash107 2012

[31] V F Rodriguez-Galiano B Ghimire J Rogan M Chica-Olmoand J P Rigol-Sanchez ldquoAn assessment of the effectiveness ofa random forest classifier for land-cover classificationrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 67 no 1pp 93ndash104 2012

[32] C Strobl and A Zeileis ldquoDanger high power mdash Exploringthe statistical properties of a test for random forest variableimportancerdquo Tech Rep 17 University of Munich 2008

[33] V Svetnik A Liaw C Tong J Christopher Culberson R PSheridan and B P Feuston ldquoRandom forest a classification andregression tool for compound classification and QSAR mod-elingrdquo Journal of Chemical Information and Computer Sciencesvol 43 no 6 pp 1947ndash1958 2003

[34] E Eccel L Ghielmi P Granitto R Barbiero F Grazzini and DCesari ldquoPrediction of minimum temperatures in an alpineregion by linear and non-linear post-processing of meteorolog-ical modelsrdquoNonlinear Processes in Geophysics vol 14 no 3 pp211ndash222 2007

[35] Pearl River Water Resources Committee (PRWRC) The Zhu-jiang Archive vol 1 Guangdong Science and Technology PressGuangzhou China 1991 (in Chinese)

[36] Q Zhang C-Y Xu and Z Zhang ldquoObserved changes ofdroughtwetness episodes in the Pearl River basin Chinausing the standardized precipitation index and aridity indexrdquoTheoretical and Applied Climatology vol 98 no 1-2 pp 89ndash992009

[37] E V Dmitriev I V Nogotkov V S Rogutov G Komenko andA Chavro ldquoTemporal error estimate for statistical downscalingregional meteorological modelsrdquo Fısica de la Tierra vol 19 pp219ndash241 2007

[38] C Lavaysse M Vrac P Drobinski M Lengaigne and TVischel ldquoStatistical downscaling of the French Mediterraneanclimate assessment for present and projection in an anthro-pogenic scenariordquo Natural Hazards and Earth System Sciencevol 12 no 3 pp 651ndash670 2012

[39] L Liu Z Liu X Ren T Fischer and Y Xu ldquoHydrologicalimpacts of climate change in the Yellow River Basin for the 21stcentury using hydrological model and statistical downscalingmodelrdquo Quaternary International vol 244 no 2 pp 211ndash2202011

[40] F-F Ai J Bin Z-M Zhang et al ldquoApplication of randomforests to select premiumquality vegetable oils by their fatty acidcompositionrdquo Food Chemistry vol 143 pp 472ndash478 2014

[41] P Gislason J Benediktsson and J Sveinsson ldquoRandom forestclassification of multisource remote sensing and geographicdatardquo in Proceedings of the Geoscience and Remote Sensing Sym-posium vol 2 pp 1049ndash1052 IEEE Anchorage Alaska USA2004

[42] B C Hewitson and R G Crane ldquoClimate downscaling tech-niques and applicationrdquo Climate Research vol 7 no 2 pp 85ndash95 1996

[43] B Timbal A Dufour and B McAvaney ldquoAn estimate of fu-ture climate change for western France using a statistical

downscaling techniquerdquo Climate Dynamics vol 20 no 7-8 pp807ndash823 2003

[44] K Tatsumi T Oizumi and Y Yamashiki ldquoEffects of climatechange on daily minimum and maximum temperatures andcloudiness in the Shikoku region a statistical downscalingmodel approachrdquoTheoretical and Applied Climatology vol 120no 1-2 pp 87ndash98 2015

[45] Z A Holden J T Abatzoglou C H Luce and L S BaggettldquoEmpirical downscaling of daily minimum air temperature atvery fine resolutions in complex terrainrdquoAgricultural and ForestMeteorology vol 151 no 8 pp 1066ndash1073 2011

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ClimatologyJournal of

EcologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

EarthquakesJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mining

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporation httpwwwhindawicom Volume 201

International Journal of

OceanographyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal ofPetroleum Engineering

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Atmospheric SciencesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MineralogyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MeteorologyAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geological ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geology Advances in

Page 11: Statistical Downscaling of Temperature with the Random ...downloads.hindawi.com/journals/amete/2017/7265178.pdf · Statistical downscaling techniques can be divided into threecategories:weathertyping,weathergenerators,andre-gression-based

Advances in Meteorology 11

[29] M Pal ldquoRandom forest classifier for remote sensing classifica-tionrdquo International Journal of Remote Sensing vol 26 no 1 pp217ndash222 2005

[30] V F Rodriguez-Galiano M Chica-Olmo F Abarca-Hernan-dez P M Atkinson and C Jeganathan ldquoRandom Forestclassification of Mediterranean land cover using multi-seasonalimagery and multi-seasonal texturerdquo Remote Sensing of Envi-ronment vol 121 pp 93ndash107 2012

[31] V F Rodriguez-Galiano B Ghimire J Rogan M Chica-Olmoand J P Rigol-Sanchez ldquoAn assessment of the effectiveness ofa random forest classifier for land-cover classificationrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 67 no 1pp 93ndash104 2012

[32] C Strobl and A Zeileis ldquoDanger high power mdash Exploringthe statistical properties of a test for random forest variableimportancerdquo Tech Rep 17 University of Munich 2008

[33] V Svetnik A Liaw C Tong J Christopher Culberson R PSheridan and B P Feuston ldquoRandom forest a classification andregression tool for compound classification and QSAR mod-elingrdquo Journal of Chemical Information and Computer Sciencesvol 43 no 6 pp 1947ndash1958 2003

[34] E Eccel L Ghielmi P Granitto R Barbiero F Grazzini and DCesari ldquoPrediction of minimum temperatures in an alpineregion by linear and non-linear post-processing of meteorolog-ical modelsrdquoNonlinear Processes in Geophysics vol 14 no 3 pp211ndash222 2007

[35] Pearl River Water Resources Committee (PRWRC) The Zhu-jiang Archive vol 1 Guangdong Science and Technology PressGuangzhou China 1991 (in Chinese)

[36] Q Zhang C-Y Xu and Z Zhang ldquoObserved changes ofdroughtwetness episodes in the Pearl River basin Chinausing the standardized precipitation index and aridity indexrdquoTheoretical and Applied Climatology vol 98 no 1-2 pp 89ndash992009

[37] E V Dmitriev I V Nogotkov V S Rogutov G Komenko andA Chavro ldquoTemporal error estimate for statistical downscalingregional meteorological modelsrdquo Fısica de la Tierra vol 19 pp219ndash241 2007

[38] C Lavaysse M Vrac P Drobinski M Lengaigne and TVischel ldquoStatistical downscaling of the French Mediterraneanclimate assessment for present and projection in an anthro-pogenic scenariordquo Natural Hazards and Earth System Sciencevol 12 no 3 pp 651ndash670 2012

[39] L Liu Z Liu X Ren T Fischer and Y Xu ldquoHydrologicalimpacts of climate change in the Yellow River Basin for the 21stcentury using hydrological model and statistical downscalingmodelrdquo Quaternary International vol 244 no 2 pp 211ndash2202011

[40] F-F Ai J Bin Z-M Zhang et al ldquoApplication of randomforests to select premiumquality vegetable oils by their fatty acidcompositionrdquo Food Chemistry vol 143 pp 472ndash478 2014

[41] P Gislason J Benediktsson and J Sveinsson ldquoRandom forestclassification of multisource remote sensing and geographicdatardquo in Proceedings of the Geoscience and Remote Sensing Sym-posium vol 2 pp 1049ndash1052 IEEE Anchorage Alaska USA2004

[42] B C Hewitson and R G Crane ldquoClimate downscaling tech-niques and applicationrdquo Climate Research vol 7 no 2 pp 85ndash95 1996

[43] B Timbal A Dufour and B McAvaney ldquoAn estimate of fu-ture climate change for western France using a statistical

downscaling techniquerdquo Climate Dynamics vol 20 no 7-8 pp807ndash823 2003

[44] K Tatsumi T Oizumi and Y Yamashiki ldquoEffects of climatechange on daily minimum and maximum temperatures andcloudiness in the Shikoku region a statistical downscalingmodel approachrdquoTheoretical and Applied Climatology vol 120no 1-2 pp 87ndash98 2015

[45] Z A Holden J T Abatzoglou C H Luce and L S BaggettldquoEmpirical downscaling of daily minimum air temperature atvery fine resolutions in complex terrainrdquoAgricultural and ForestMeteorology vol 151 no 8 pp 1066ndash1073 2011

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ClimatologyJournal of

EcologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

EarthquakesJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mining

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporation httpwwwhindawicom Volume 201

International Journal of

OceanographyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal ofPetroleum Engineering

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Atmospheric SciencesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MineralogyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MeteorologyAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geological ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geology Advances in

Page 12: Statistical Downscaling of Temperature with the Random ...downloads.hindawi.com/journals/amete/2017/7265178.pdf · Statistical downscaling techniques can be divided into threecategories:weathertyping,weathergenerators,andre-gression-based

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ClimatologyJournal of

EcologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

EarthquakesJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mining

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporation httpwwwhindawicom Volume 201

International Journal of

OceanographyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of Computational Environmental SciencesHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal ofPetroleum Engineering

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GeochemistryHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Atmospheric SciencesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OceanographyHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MineralogyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MeteorologyAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Paleontology JournalHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ScientificaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geological ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Geology Advances in