
Computers & Geosciences 37 (2011) 554–566


An analysis of landslide susceptibility zonation using a subjective geomorphic mapping and existing landslides

Mihai Pavel a,*, John D. Nelson a, R. Jonathan Fannin b

a Department of Forest Resources Management, Faculty of Forestry, University of British Columbia, 2424 Main Mall, Vancouver, BC, Canada V6T 1Z4
b Department of Civil Engineering, University of British Columbia, 6250 Applied Science Lane, Vancouver, BC, Canada V6T 1Z4

Article info

Article history:

Received 23 December 2009

Received in revised form 15 September 2010

Accepted 7 October 2010
Available online 10 November 2010

Keywords:

Landslide susceptibility mapping

Subjective geomorphic mapping

Artificial Neural Networks (ANN)

Learning Vector Quantization (LVQ)

Geographic Information Systems (GIS)

doi:10.1016/j.cageo.2010.10.006

* Corresponding author. Present address: FPInnovations, 2601 East Mall, Vancouver, BC, Canada V6T 1Z4. Tel.: +1 604 228 1555; fax: +1 604 228 0999. E-mail address: [email protected] (M. Pavel).

Abstract

This study explores the possibility of creating landslide susceptibility mappings by using two types of data: (i) an existing subjective geomorphic mapping; and (ii) landslides already identified in the area analyzed. The analysis is conducted using a type of Artificial Neural Network (ANN) named Learning Vector Quantization. For the subjective geomorphic mapping various definitions of stability were considered and analyzed, some using a 2-class system and some using a 5-class system.

The study concludes that mappings using an existing subjective geomorphic classification and based on two stability classes can be successfully replicated with the ANN-based approach. However, mappings based on existing landslides and on the 5-class system do not yield results sufficiently accurate for practical applications. Creation of landslide susceptibility mappings involved utilization of data of numerous types (numerical and class-type variables). This study also investigated various methods of data coding and identified the most appropriate method for this type of analysis.

1. Introduction

In the literature related to terrain stability assessment the saying "the past and the present are the key to the future" is often cited (e.g., Dai and Lee, 2001). However, as in many other instances the opposite has also been stated, as "we can never plan the future by the past" (Edmund Burke, quoted in Gosavi, 2003, p. 133). The main objective of this study lies between these two extremes: it explores two different types of data used for terrain stability mapping and four criteria of stability, and assesses their outcomes. The study was conducted in the Canadian province of British Columbia (BC).

The first principle for terrain stability assessment investigated in this study states that the future can be predicted based on the past and present. However, there is no doubt that to adequately anticipate the future a representative description of the past and present must be available. For landslide susceptibility mapping the most commonly available description of the past and present for a certain region is represented by existing events. The methods based on previous landslides are presented in numerous papers, of which we mention Guzzetti et al. (1999), Aleotti and Chowdhury (1999), Dai et al. (2002), and Brenning (2005).


The second principle investigated in this study is based (in a more or less explicit manner) on the belief that the future cannot be predicted by the past, and as a method is best represented by subjective geomorphic mapping. This is a widely used method for assessing terrain stability over large areas (Keaton and DeGraff, 1996; Soeters and van Westen, 1996; Carrara et al., 2003) and is the official system used in British Columbia. In essence, subjective geomorphic mapping is based on the experience and knowledge of the terrain specialist (mapper) and involves delineating terrain polygons that are uniform with respect to surficial materials, landforms, and geomorphic processes. Assessments are carried out by geoscientists or engineers using a combination of field and office based techniques. Subjective geomorphic mappings reflect the stability of a certain area only in space and there is no temporal resolution explicitly assigned to them. However, these products also include information on existing landslides, i.e., the presence of events automatically forces classification of a terrain polygon in a certain class. The main issues associated with this method are its relatively high cost, and the challenge of achieving consistency between products generated by different terrain mappers. A method for replicating landslide susceptibility maps in a consistent manner and at reduced cost by using subjective geomorphic mappings and based on a chosen definition of stability was developed by Pavel et al. (2008). Also, Pavel et al. (2008) identified slope, elevation, aspect, and existing geomorphic processes as the attributes most important in such analyses.

The main objectives of this study are (i) to produce landslide susceptibility mappings by using existing high-quality subjective geomorphic maps based on three criteria of stability (different from the one used in Pavel et al., 2008); and (ii) to investigate the possibility of producing landslide susceptibility mappings based on existing landslides and explore the limitations of this approach. To address these objectives a type of Artificial Neural Network (ANN) named Learning Vector Quantization (LVQ; Kohonen, 2001) is used to "learn" the patterns of instability associated with each data set. The LVQ algorithm used in this study is based on supervised learning: some spatial entities (areas) already classified are available and are used to classify new entities. The study aimed at developing a method that can be used with minimal changes in other locations.

Also, two important tasks (secondary objectives) were set for this study. The first one consists of identifying the most important attributes that contribute to the quality of the mapping. Since the data used in this analysis are of different types, they must be converted to a format usable by the LVQ algorithm. The second task is therefore to test various data coding methods and identify the method that produces the best results.

In practical applications, the risk assessment of landslides involves both the potential hazard of landslide initiation and the consequences of the events. For this reason, the travel distance of landslides needs to be assessed. Based on the findings of this study, a method for predicting travel distance of landslides using the same principles of analysis was developed. However, given the limitations of the data available, the method for predicting travel distance is more speculative in nature, and is presented only in Section 9. In the literature, mappings that incorporate a temporal dimension are commonly referred to as hazard mappings, whereas those that have only a spatial resolution are called susceptibility mappings. In this study, we argue that none of the methods investigated (using landslides or a subjective geomorphic mapping) is a true temporal mapping. Therefore, in the remainder of this paper both methods are referred to as landslide susceptibility mappings.

2. Previous work

The majority of existing methods for terrain stability mapping can be included in one of the following broad classes: physically based models, models based on data, and methods that use the experience and knowledge of the mapper. For each type of model various methods were developed either from the perspective of crisp (Boolean) logic or based on Fuzzy Logic.

Similarly to other fields of study, the methods developed in the early stages of terrain stability analysis aimed at developing functional models. These are commonly referred to as physically based models and incorporate knowledge from various sciences like Physics, Soil Mechanics, Groundwater Hydrology, etc., to assess terrain stability. In the early stages, models were developed in two dimensions (2-D), later evolved to 3-D models applied to relatively small areas, and with the advancement of GIS they were applied to large areas. Numerous examples exist in the literature, of which we mention Wu and Sidle (1995), Dietrich and Montgomery (1998), and Pack et al. (2001). The main idea of an analysis conducted with physically based models is summarized in the following series of steps (National Institute of Standards and Technology, NIST, 2010):

Problem–Data–Model–Analysis–Conclusions.

This sequence shows that in this approach the model is developed before the analysis. The issues that limit the applicability of physically based models are related to the assumptions used, which are sometimes considered oversimplifying, and the availability of reliable data over large areas for the parameters used by these models.

As advanced computers became available, development of models based on data became feasible and popular. The main input of these models consists of data (numbers) which are usually readily available or derived from primary data sources. Other sources (like physically based models) can be used in the data preparation stages, but only to provide input (i.e., only numbers) to be used in the models based on data. These models use statistical methods, of which logistic regression and discriminant analysis seem to be the most popular, and machine learning methods like Artificial Neural Networks (ANN) and Support Vector Machines. The first step in such analyses usually consists in delineating terrain units that are (relatively) uniform with respect to their attributes relevant to stability. Various types of terrain units are used (Guzzetti et al., 1999), of which the most common are the grid cell and the slope/catchment area. Each terrain unit is assigned a series of attributes relevant to stability, like elevation, slope, aspect, type of surficial material, texture, etc. Thus, terrain units can be represented by n-dimensional vectors (i.e., their attributes can be regarded as coordinates in a multi-dimensional space), and the terrain classification problem consists of analyzing high-dimensional data sets.
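To make the vector representation concrete, the following sketch (illustrative Python, not code from the original study) stacks co-registered raster attribute layers so that each grid-cell terrain unit becomes one row of an n-dimensional feature matrix; the function name and layer list are assumptions made here for illustration.

import numpy as np

def rasters_to_feature_matrix(layers):
    # layers: list of 2-D arrays of identical shape, one per attribute
    # (e.g., slope, elevation, aspect). Returns an (n_cells, n_attributes)
    # matrix plus the grid shape, so results can be mapped back to the grid.
    stack = np.stack([np.asarray(layer, dtype=float) for layer in layers], axis=-1)
    rows, cols, n_attributes = stack.shape
    return stack.reshape(rows * cols, n_attributes), (rows, cols)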

Numerous studies describe the application of ANN for landslide hazard zonation based on the criterion of presence or absence of landslides: Fernandez-Steeger et al. (2002), Lee et al. (2003), Gomez and Kazvoglu (2005), Ermini et al. (2005), Catani et al. (2005), Lui et al. (2006), Melchiorre et al. (2008), and Pradhan and Lee (2010a, 2010b). Also, ANN were used for the validation of physically based models as described in Al-Tuhami (2000) and Vulliet and Mayoraz (2000). The most common approach for ANN applications consists in utilization of the Multi-Layer Perceptron (MLP) trained with the back-propagation algorithm (Rumelhart et al., 1986) or one of its modified versions. Other approaches were used by Lui et al. (2006) and Melchiorre et al. (2008), who trained an MLP with the Levenberg–Marquardt algorithm, and Ermini et al. (2005), who used a Probabilistic Neural Network. For the vast majority of studies conducted so far, examples of unstable terrain (i.e., existing landslides) were used for the landslide hazard zonation; this type of analysis is called supervised classification. Pavel (2003) and Ferentinou and Sakellariou (2005) also conducted terrain stability analyses that did not use existing examples (unsupervised classification) by using a type of ANN called the Self-Organizing Map (Kohonen, 2001).

For analyses using models based on data the following sequence of steps is used (NIST, 2010):

Problem–Data–Analysis–Model–Conclusions.

Unlike physically based models, in this case the model is not developed a priori. The model is simply derived from the data and may take a very different format compared to the physically based approach. For example, for ANN, once the architecture used is selected, the "model" is in fact stored in the weights of the network. However, this is also the main criticism of this approach: ANN do not explain the reasoning used to attain a certain result; they act like black boxes providing little insight into how decisions are made, and this can make users lack confidence in the reasoning of the system.

Models based on Bayesian statistics were also developed (Demoulin and Chung, 2007). The following sequence of steps is specific for these models (NIST, 2010):

Problem–Data–Model–Prior distributions–Analysis–Conclusions,

which is clearly different from the previous two sequences presented.

A different approach to terrain stability mapping involved the development of Expert Systems (ES). In ES, knowledge modeling is achieved using the symbolic Artificial Intelligence approach. ES are developed to capture human expertise, mostly in terms of if–then–else rules. Examples are provided in Aste et al. (1995), Faure et al. (1995), and Wislocki and Bentley (1989). However, to date, the success of these approaches is relatively low. The most important problem in the development of ES consists in extracting and combining knowledge from various sources (e.g., experts, technical reports, factual data, etc.), and to date, no formal method was developed to address it. This issue is described in detail in Kohonen (2001), Huang and Xing (2002), and Negnevitsky (2005).

A fundamental shift in terms of the underlying principles of analysis is represented by methods developed under the formalism of Fuzzy Sets and Fuzzy Logic. Fuzziness is a type of imprecision characterizing classes that for various reasons cannot have or do not have sharply defined boundaries. Fuzziness is not a probabilistic attribute, in which the degree of membership of an entity is linked to a given statistically defined probability function. Rather, it is an admission of possibility that an individual is a member of a set, or that a given statement is true (Burrough and McDonnell, 1998). Instead of probability, fuzzy sets theory uses concepts of admitted possibility, which is described in terms of the fuzzy membership function; these functions permit individuals to be partial members of different, overlapping sets. In fuzzy sets, the grade of membership is expressed in terms of a scale that can vary continuously between 0 and 1; this gives the degree to which the entity belongs to the set in question. Fuzzy sets refer to how we assign an object to one class or another, and fuzzy logic is about how we manipulate rules and concepts and how we infer different conclusions from these. Essentially, the difference between Fuzzy Logic and Boolean (crisp) Logic consists in a modification/generalization of the principle of the excluded middle (Burrough and McDonnell, 1998).

Applications of Fuzzy Sets Theory and Fuzzy Logic in terrain stability analysis are described in Dodagoudar and Venkatachalam (2000) and Saboya et al. (2006), who used various methods for terrain stability analysis and incorporated the uncertainties in the soil parameters by considering them as fuzzy variables. In general, the problem with such methods is to determine the membership function unambiguously. Existing methods employ either expert opinion (which is subjective) or mathematically derived functions (which need large amounts of data). Besides, the relationship derived may be only site specific, and extrapolation would be very difficult. For terrain stability mapping over large areas it is recognized that many geographical phenomena are vague in their nature. Vagueness is an inherent property of geographical data, and ignoring it strips away the essence of much of those data (Fisher, 2000). Fuzzy boundaries and fuzzy attributes are needed for such a spatial analysis, and the theoretical foundation exists or is in an advanced stage of development. For example, Cross and Firat (2000) developed a "fuzzy object data model", and Guesgen and Albrecht (2000) address the problem of imprecise reasoning in GIS. However, the problems related to lack of reliable spatial data persist and reduce the applicability of fuzzy methods to real-life situations.

New approaches were developed that combine ANN, ES, and Fuzzy Sets. The new approaches consider the specific problems of each method and attempt to eliminate them or reduce their influence on the results. Two main ideas are used in the development of the new approaches: (i) use domain-specific knowledge extracted from experts combined with a fuzzy representation of the attributes, and then conduct the classification with an ANN; and (ii) represent the primary data in a crisp format, use them in an ANN-based analysis, and then extract fuzzy rules from the trained network. The theoretical foundations of the methods are presented in Fletcher and Hinde (1995), Huang and Xing (2002), and Kasabov (1996), and specific applications to the landslide susceptibility problem are presented in Vahidnia et al. (2010) and Pradhan et al. (in press).

Other formalisms used to address the terrain stability problem include Fractals (Shunmin and Yunzhi, 1997), Catastrophe Theory (Runqiu and Qiang, 1997), Chaos Theory (Qin et al., 2001), and Self-organized Criticality (Turcotte and Malamud, 2004).

One of the main inputs of terrain stability studies consists of existing events. However, it is well known that describing past landslides with reasonable accuracy is difficult, and sometimes impossible. Inaccuracies in the description of landslides relate to the completeness of the inventory, and the spatial and temporal location of the events. To date, there is no satisfactory solution for identifying the spatial location of the initiation point of landslides. Frequently, the initiation point is assumed to be the headscarp of the landslide. Based on this representation, the area classified as "landslide" is much smaller than the area classified as "non-landslide", which raises the issue of representativeness, or in other words, the ability to capture the "patterns" of instability existing in the area analyzed. To address the difference with respect to the size of the "landslide" and "non-landslide" areas, sampling of "non-landslides" was sometimes used (Dhakal et al., 2000), although this may also raise complex issues on the selection and representativeness of the samples used.

The completeness of landslide inventories is discussed in numerous publications. For example, Malamud et al. (2004) state that "landslide inventories are generally incomplete and the level of completeness of the inventory is unknown, particularly for historical inventories". A comprehensive spatial and temporal location of events is necessary for a description of the conditions existing at the moment when the landslide occurred and which may have played a role in the initiation of the event. These conditions refer to, but are not limited to, vegetation (species composition, age, closure), water conditions (distance to streams, lakes, etc.), and development that occurred in the area (distances to harvest blocks, to roads, to culverts, harvesting method used, etc.).

3. Study area

This study was conducted using data from the Lower Seymour watershed within the Greater Vancouver Regional District (GVRD; currently, Metro Vancouver), situated in southwestern British Columbia (BC). The Seymour watershed (56.7 km²) lies within the Pacific Ranges of the Coast Mountains and is adjacent to the Metro Vancouver area. A detailed physiographic description of the area is presented in GVRD (1999). The topography of the Seymour area is rugged and is the result of geologically recent and ongoing tectonic uplift, combined with glacial erosion and with the effect of rock strength. Tectonic uplift of the Coast Mountains in response to the convergence of the Pacific and Juan de Fuca plates along the West Coast of North America has been occurring for tens of millions of years. Bedrock consists primarily of intrusive igneous rocks, which are sometimes exposed at mid and upper elevations. Elevation in the Seymour watershed ranges from 40 to about 1500 m, and slope varies between 0° and 72°, with steepness generally increasing with elevation. The most common surficial materials in the study area are colluvial and morainal deposits (till), or combinations of these materials and rock, and fluvial and glaciofluvial materials on the valley floors. In Seymour, the valley floors represent a particular challenge with respect to stability because the surficial deposits overlie glaciolacustrine deposits. Although these areas are characterized by relatively low slope gradients, a high clay content makes them prone to instability. Climate is generally mild and wet, but temperature and precipitation are significantly influenced by elevation. The mean annual air temperature varies from 10 to 4.6 °C. The average annual precipitation is 4033 mm, with a daily maximum of 314 mm.

Apart from the Seymour River, the watershed is dissected by numerous tributaries, as presented in Fig. 1. The soils in the Seymour area have limited water retention capability and high values of hydraulic conductivity, which are enhanced by macropores distributed throughout the soils. Little overland flow occurs; instead, water is transmitted rapidly through the soil matrix to streams and gullies. Storm runoff is generated by snowmelt, by rainfall, and by rain-on-snow events which occur at lower and mid elevations. Melt of the mid- and upper-elevation snow packs continues over several months and, on average, the greatest monthly inflows to the Seymour River occur in late spring, as a result of snowmelt.

Fig. 1. Seymour study site: DEM, roads, streams, and location of existing landslides.

GVRD conducted a comprehensive ecological study of the watersheds they administer, between 1995 and 1999, utilizing remote sensing techniques combined with on-the-ground surveys. Underhill (2000, p. 17) estimates that this was the most comprehensive landscape-level database developed to that date for any watershed in Canada, and perhaps in North America. The database was summarized in Anon. (1999) and is the main data source for this study. Fig. 1 displays a Digital Elevation Model (DEM) of the study site and the locations (initiation points) of existing landslides identified in the area. Within Seymour, 212 landslides were identified from aerial photographs and confirmed by ground truthing. Examples are provided in Fig. 2. The age of those landslides was estimated to vary from 1 year to more than 20 years at the time of mapping. However, the exact date of each event was not established because this is very difficult, especially for older landslides.

The subjective terrain stability map of Seymour is presented in Fig. 3. In total, 397 polygons were delineated according to the BC terrain classification system (Howes and Kenk, 1997) and stability classes from Class I (stable) to Class V (most unstable) were assigned to each polygon based on principles presented in Province of BC (1999). Terrain stability classes provide a relative ranking of the likelihood of a landslide occurring after timber harvesting or road construction, but they give no indication of the expected magnitude of a landslide.

Based on the BC terrain classification, each polygon may be described by up to three surficial materials and two subsurficial materials, their corresponding texture and surficial expression. Up to three geomorphic processes can be recorded for each polygon, in decreasing order of their areal extent; no information about frequency and intensity of events is intended, and the areal proportion of the processes is not stated. The processes identified in the study area are avalanches, gully erosion, rapid mass movements, and to a smaller extent, slow mass movements. Polygon slope is also defined using five classes which are then used to describe the slope ranges in individual polygons.

4. Data description and coding

4.1. Data description

The main types of data used were topographic attributes derived from the DEM, location of existing landslides, and geomorphic attributes included in the terrain mapping. Data in digital format were stored and manipulated using the ArcView™ GIS.

Topographic data were represented using the raster structure and consisted of a 20-m (cell size) DEM. Based on the DEM, the following parameters relevant to terrain stability were computed: slope, aspect, plan (planform) curvature, and profile curvature. Two additional topographic features considered relevant to terrain stability were generated: interaction of curvatures (computed by subtracting the value of the plan curvature from that of the profile curvature) and specific/hydrologic catchment area (Sca). Sca is defined as the ratio of upslope drainage area per unit contour length; it was computed based on the description presented in Tarboton (1997) using the software implementation of Pack et al. (2001).
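As an illustration of the DEM-derived attributes, the sketch below (hypothetical code, not the software used in the study) computes slope and aspect from a gridded DEM with a 20-m cell size using finite differences; the Sca computation itself follows Tarboton's (1997) D-infinity method and is not reproduced here.

import numpy as np

def slope_aspect(dem, cell_size=20.0):
    # Finite-difference gradients; axis 0 is rows, axis 1 is columns.
    dz_dy, dz_dx = np.gradient(dem.astype(float), cell_size)
    grad = np.hypot(dz_dx, dz_dy)
    slope_deg = np.degrees(np.arctan(grad))
    # Direction of steepest descent, clockwise from north, assuming rows
    # increase northward; flip the sign of dz_dy if the raster is stored
    # north-to-south (the usual GeoTIFF convention).
    aspect_deg = (np.degrees(np.arctan2(-dz_dx, -dz_dy)) + 360.0) % 360.0
    aspect_deg[grad == 0] = np.nan   # aspect undefined on flat cells
    return slope_deg, aspect_deg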

The location of existing slides was provided in vector format, as a series of points where the initiation of an event was assumed to have occurred. Since the initiation point of a landslide cannot be located accurately, a 40-m buffer was created in the GIS around the initiation points. The diameter of the buffer was selected subjectively, based on experience and on knowledge of existing landslides in the Seymour watershed. In the DEM, all cells having a centroid located within the buffer were assumed unstable.
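A minimal sketch of this buffering step (illustrative only; the study performed it in the GIS) flags each DEM cell whose centroid lies within the buffer distance of a mapped initiation point. The distance is passed in as a parameter because the text describes 40 m as the buffer diameter, so a 20-m radius is an equally defensible reading.

import numpy as np

def flag_unstable_cells(cell_centroids, initiation_points, buffer_dist=40.0):
    # cell_centroids: (N, 2) array of x, y coordinates of DEM cell centroids.
    # initiation_points: (M, 2) array of landslide initiation coordinates.
    # Returns a boolean array of length N; True means the cell is assumed unstable.
    cells = np.asarray(cell_centroids, dtype=float)
    slides = np.asarray(initiation_points, dtype=float)
    # Distance from every cell centroid to every initiation point.
    dists = np.linalg.norm(cells[:, None, :] - slides[None, :, :], axis=2)
    return (dists <= buffer_dist).any(axis=1)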

Geomorphic data consist of terrain polygons and their corresponding attributes. These data were initially stored in the GIS using the vector model and were rasterized for the analysis. As a result of this, all cells that have their centroids inside a certain terrain polygon share the same geomorphic attributes. Discussions on this type of data transformation and its potential problems are presented in Davis (1999) and Pavel (2003).

4.2. Data coding

LVQ can only utilize numerical information, so data coding was used to convert the variables to a suitable form. Based on the common criteria for data classification (e.g., Manly, 1994), the data used in this study were classified as follows: (1) ratio-type: all topographic attributes; (2) ordinal: polygon slope; and (3) nominal: all geomorphic attributes.

For ratio-type data two cases were distinguished: (1) raw data; and (2) data standardized to mean 0 and 1 standard deviation, to account for the difference in magnitude of various attributes. The method chosen to code nominal and ordinal data is 1-of-c coding, which is described in many ANN publications (e.g., Bishop, 1995), and is used for coding symbolic variables into a set of numerical variables (similar to 'dummy' variables used in Statistics). The method consists of assigning to each attribute a number of classes equal to the number of levels specific for the respective attribute. Each attribute is represented by a vector, which has a value of '1' for the class that corresponds to the actual value of the attribute and values of '0' for all the other classes (Table 1). However, to be combined with the geomorphic data, the topographic attributes had to be coded using the 1-of-c coding method. Therefore, 1-of-c coding was investigated both for standardized and non-standardized topographic attributes. Each topographic attribute was coded according to a ten-class system and classes were defined by equally dividing the range of the respective attribute.

Fig. 2. Field photos (a–d) of mass movements identified in Seymour area.

For the 1-of-c coding method, apart from the binary notation used, the bipolar notation was also explored. The bipolar notation is similar to the binary, but the value '0' is replaced with '-1'. Additionally, the m-of-n (or thermometer) coding was investigated. When using this method, each attribute is represented by a vector, which has values of '1' for all classes up to the class that includes the actual value of the parameter and values of '0' for the remaining classes (Table 1). This method is intended for ordinal- and ratio-type data and attributes should be arranged in increasing order of magnitude. When this method was applied, the nominal attributes were first subjectively ranked based on their impact on terrain stability. More details on how various coding methods were used in the analysis are presented in Fig. 4.
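The three coding schemes can be summarized with a short sketch (illustrative Python; the function names are not from the paper). For an attribute discretized into ten ordered classes, 1-of-c coding places a single '1' at the class position (binary) or replaces the zeros with '-1' (bipolar), while thermometer coding sets '1' for every class up to and including the one that holds the value, reproducing the pattern of Table 1.

import numpy as np

def one_of_c(class_index, n_classes, bipolar=False):
    # Binary: a single 1 at the class position, 0 elsewhere;
    # bipolar: the 0 entries are replaced with -1.
    v = np.full(n_classes, -1.0 if bipolar else 0.0)
    v[class_index] = 1.0
    return v

def thermometer(class_index, n_classes):
    # 1 for every class up to and including the class holding the value.
    v = np.zeros(n_classes)
    v[: class_index + 1] = 1.0
    return v

# A slope of 38 degrees with ten 7-degree classes falls in class 35-42 (index 5):
idx = min(int(38 // 7), 9)
one_of_c(idx, 10)        # -> 0 0 0 0 0 1 0 0 0 0   (Table 1, 1-of-c row)
thermometer(idx, 10)     # -> 1 1 1 1 1 1 0 0 0 0   (Table 1, thermometer row)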

5. Methodology

5.1. Modeling principles

Michalewicz and Fogel (2000) state that in any real-world problem the analyst chooses the evaluation function because it is not given with the problem. In this study, the evaluation function comprises both the definition of the accuracy of predictions and the definition of stable and unstable terrain. As there is no "perfect" mapping that constitutes an ideal solution for purposes of comparison, accuracy is defined in this study as similarity to the expert's mapping when replicating the subjective geomorphic mapping, and as the ability of the method to identify areas where landslides might occur when the mapping is conducted based on existing landslides.

Considering the type of data available in this study (i.e., location of landslides and subjective geomorphic mapping), the following criteria were used to delineate stable and unstable terrain and to assess the quality of the predictions yielded:

• Criterion a: unstable terrain in this case corresponds to Classes IV–V, and terrain Classes I–III are considered stable (Pavel et al., 2008).
• Criterion b: unstable terrain is considered to be only terrain Class V, and terrain Classes I–IV are considered stable.
• Criterion c: classification is based on the 5-class system, and attempts to classify terrain according to the system used in BC (Province of BC, 1999).
• Criterion d: unstable is considered to be only terrain already affected by the 212 landslides, and the remaining area is considered stable.

Criteria a–c are based only on the subjective geomorphic mapping and criterion d uses only existing landslides. The analyses were first performed separately for topographic and for geomorphic attributes, with respect to the four criteria. Based on these analyses, the attributes believed to contribute most to these partial classifications were identified. Next, the topographic and geomorphic attributes were combined and analyzed using this reduced set of attributes. Upon completion of these analyses the quality (accuracy) of terrain stability mappings was assessed and the most important topographic and geomorphic attributes were identified. This approach is amenable to criticism because there might be some combinations that could yield good results but were excluded. However, conducting the study this way was mainly driven by knowledge about terrain stability in general and the terrain attributes used in the analyses (i.e., domain knowledge). An overview of all the analyses conducted in this study is presented in Fig. 4.

Fig. 3. Seymour—terrain stability mapping.

Table 1. Example of coding of slope using 1-of-c and "thermometer" coding methods. The slope coded falls in the class 35–42°.

Coding method | Slope class (deg.): <7 | 7–14 | 14–21 | 21–28 | 28–35 | 35–42 | 42–49 | 49–56 | 56–63 | >63
1-of-c | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
"Thermometer" | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0

Analysis was performed in two steps at all levels. In the first step the ANN was trained using a portion of the data. Next, the quality of predictions of the trained network was tested with the remaining data. The amount of data to separate into training and testing sets is problem-dependent. Most practitioners recommend randomly splitting the data into two thirds for training and one third for testing. For Seymour, after the elimination of areas which were not of interest in this study (lakes, swamps, etc.) the total number of cells in the DEM was 126,168, which were split according to the above rule. The only situation that required more attention was in relation to criterion d. As the number of unstable cells (i.e., landslides) is much smaller than the number of stable cells, care was exercised to ensure a representative balance in both training and testing datasets. For this reason, unstable cells were first separated from the rest of the data and split in a 2/3 vs. 1/3 ratio, then combined with the stable cells which were split based on the same rule.
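A sketch of this split (illustrative, not the original scripts): cells are divided randomly into two thirds for training and one third for testing, and for criterion d the unstable (landslide) cells and the stable cells are split separately so both sets retain the rare class.

import numpy as np

def split_two_thirds(indices, rng):
    shuffled = rng.permutation(indices)
    cut = int(round(len(shuffled) * 2 / 3))
    return shuffled[:cut], shuffled[cut:]

def train_test_split_criterion_d(labels, seed=0):
    # labels: 1-D array with 1 for unstable (landslide) cells, 0 for stable cells.
    rng = np.random.default_rng(seed)
    unstable = np.flatnonzero(labels == 1)
    stable = np.flatnonzero(labels == 0)
    u_train, u_test = split_two_thirds(unstable, rng)
    s_train, s_test = split_two_thirds(stable, rng)
    train = np.concatenate([u_train, s_train])
    test = np.concatenate([u_test, s_test])
    return train, test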

5.2. The modeling process

Topographic attributes were examined using correlation analysis and Multiple Discriminant Analysis (MDA) using Statistical Analysis Software (SAS, SAS Institute, 2001), with the procedures Candisc and Discrim. From MDA, the standardized discriminant function coefficients were extracted. These coefficients express the relative contribution of each attribute to the quality of the classification (Guzzetti et al., 2006). Based on these, the topographic attributes were ranked in the following order of importance: slope, elevation, Sca, aspect, profile curvature, interaction of curvatures, plan curvature. To account for the uncertainty related to this ranking, nine predictions were analyzed, which included various combinations of topographic attributes. Combinations were driven both by the ranking of attributes and by domain knowledge. The combinations of topographic attributes analyzed are shown in Table 2.
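One way to approximate this ranking step with open tools is sketched below; the study used the SAS Candisc and Discrim procedures, so scikit-learn and the use of the overall standard deviation for standardizing the coefficients are assumptions made here purely for illustration.

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def rank_attributes(X, y, names):
    # X: (n_cells, n_attributes) NumPy array of topographic attributes;
    # y: stability class of each cell; names: list of attribute names.
    lda = LinearDiscriminantAnalysis().fit(X, y)
    # Scale the coefficients of the first discriminant function by the
    # standard deviation of each attribute to make them comparable.
    coeffs = lda.scalings_[:, 0] * X.std(axis=0)
    order = np.argsort(-np.abs(coeffs))
    return [(names[i], float(coeffs[i])) for i in order]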

Geomorphic attributes were first ranked based on their relevance to terrain stability and the accuracy with which they were delineated, in the following order of decreasing importance: geomorphic processes, polygon slope, expression-thickness, surficial and subsurficial materials, expression-slope, expression-complex. To account for the subjectivity of this classification, numerous analyses were carried out by combining these attributes. A detailed description of the analyses performed to assess the impact of various geomorphic attributes is presented as supplementary data to this paper.


Fig. 4. Description of analyses conducted. Three modules (A–C) identified in this figure correspond to three parts of the "Results" section. Module A: topographic attributes, raw and standardized, represented as numbers (not coded), 1-of-c binary, 1-of-c bipolar, and "thermometer" coding (8 combinations analyzed), evaluated against criteria a–d. Module B: geomorphic attributes coded as 1-of-c binary, 1-of-c bipolar, and "thermometer" (3 combinations analyzed), evaluated against criteria a–c. Module C: reduced set of topographic and geomorphic attributes with 1-of-c binary coding, evaluated against criteria a–d, yielding the final results and mappings.

Table 2. Results for predictions based on topographic attributes. Prediction 4 (marked *) identifies the attributes retained for the next phase of the analysis.

Prediction | Attributes included | Accuracy (%): Crit. a | Crit. b | Crit. c | Crit. d
1 | Slope, elevation, Sca, aspect, profile curvature, interaction of curvatures, plan curvature | 84 | 82 | 57 | 71
2 | Slope, elevation, Sca, aspect, profile curvature, interaction of curvatures | 83.7 | 82.6 | 57.3 | 74
3 | Slope, elevation, Sca, aspect, profile curvature | 83.6 | 82.4 | 57.6 | 71
4* | Slope, elevation, Sca, aspect | 83.6 | 82.1 | 57 | 72.6
5 | Slope, elevation, Sca | 79 | 79.2 | 51.2 | 71.8
6 | Slope, elevation, aspect | 82.9 | 81 | 57.1 | 70.3
7 | Slope, elevation | 76 | 77.6 | 48 | 73.2
8 | Slope, Sca | 75.1 | 74.3 | 46.8 | 71.9
9 | Slope | 67.9 | 62.3 | 47.1 | 97.4

When analyses are based on both topographic and geomorphic attributes, each pixel is characterized by unique topographic attributes derived from the DEM and by geomorphic attributes common to all pixels located inside the same terrain polygon. The attributes retained from the initial phases of the analysis, considering separately the topographic and geomorphic attributes, and the types of analyses performed after combining them, are presented in Section 8.

5.3. Assumptions used in this study

The common assumptions applied to GIS-based analyses are also used in this study. These relate to the accuracy of representation of natural data in a GIS environment, e.g., the DEM is a faithful representation of the topography, and hence the other attributes derived from elevation are correct, the surficial and subsurficial materials associated with terrain polygons are correctly identified and described, etc. A special note needs to be made in relation to existing landslides: although the events identified in Seymour are of different types, no attempt was made to identify the different types. Given the uncertainty on the occurrence of these events it was concluded that a correct classification of past landslides is not possible. Also, as Terzaghi (1950) noted, classifications of landslides are highly subjective. As such, it is assumed that a correct sampling of landslides by type, in the training and testing datasets, was achieved by virtue of the simple random sampling used.

As presented in Section 4, the mapping system used assigns more than one geomorphic process to polygons and records them in the order of their areal spread, although most processes do not cover the entire area of the polygon. A delineation of areas affected by geomorphic processes was not available; therefore, it is assumed that processes apply to the whole polygon where they are recorded.

5.4. Evaluation of results

For the evaluation of the results, the testing data, which represent the mapping produced by LVQ, were compared with the mapping produced by terrain specialists for criteria a–c, and with the number of existing landslides for criterion d (i.e., cells characterized as landslides after buffering). Accuracy was evaluated based on the number of cells (i.e., area) predicted as unstable by the model vs. the "correct" classification as identified by terrain specialists. For criteria a, b, and d, which are based on two classes, the classical binary classification problem includes two kinds of errors: type I error, which is a false alarm, and type II error, which is a miss. In terrain stability, like in many other fields, type II errors are much more costly than type I, because extra checks are much less of a problem than a miss. For criterion c, which is based on five classes, any misclassification is counted as an error.
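A sketch of this evaluation for the 2-class criteria (illustrative only): overall accuracy together with counts of type I (false alarm) and type II (miss) errors, comparing the LVQ output with the reference classification cell by cell.

import numpy as np

def evaluate_two_class(predicted, reference):
    # predicted, reference: 1-D arrays with 1 = unstable and 0 = stable.
    predicted = np.asarray(predicted)
    reference = np.asarray(reference)
    accuracy = float(np.mean(predicted == reference))
    type_i = int(np.sum((predicted == 1) & (reference == 0)))    # false alarms
    type_ii = int(np.sum((predicted == 0) & (reference == 1)))   # misses
    return accuracy, type_i, type_ii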

Similarly to other domains where models for natural systems are developed, in terrain stability modeling there is much debate in relation to the terminology used. The most frequently used (and debated) words are "verification", "validation", and "confirmation". We adopted the terminology used by one of the most widely accepted publications in this matter (Oreskes et al., 1994) and use the word "confirmation" in relation to the assessment of the quality of the model developed in this study.

6. Learning Vector Quantization (LVQ)

LVQ is a relatively simple and yet efficient classification algorithm which was purposely developed for statistical pattern recognition, especially when dealing with very noisy, high-dimensional stochastic data. The main benefit of this ANN is good recognition accuracy and a significant reduction of computing operations when compared with more traditional statistical methods. In this study, the LVQ-based analysis consists of two phases. In the first phase, called the training phase, N input vectors (which represent spatial entities) of known classification are grouped into M clusters (M < N), by using M representative (or prototype) vectors. At the end of this phase each stability class is represented by its own set of prototype vectors. In the second phase, called the testing phase, new vectors (terrain entities) of unknown classification can be classified based on their similarity to the M prototypes. Vector similarity is based on the shortest distance, and in this study the Euclidean distance was used. A complete mathematical description of the LVQ is presented in Kohonen (2001).

The analysis starts with the initialization of the prototype vectors for each class. During the training phase, the training vectors are read, the network calculates the distance from the current input vector to all prototype vectors, and finds the vector for which the distance is minimum, i.e., the "best match" vector. Next, the difference between the current vector and the "best match" is calculated and adjusted with the "learning rate". The "learning rate" is a small scalar value which is either kept constant or is gradually reduced to zero during the training phase. At the last step, the "best match" is updated by either adding or subtracting the quantity computed at the previous step (i.e., the adjusted difference). Addition is performed when the two vectors belong to the same class and subtraction when they belong to different classes. Essentially, LVQ training consists of moving the prototype vectors iteratively toward or away from the training vectors. Training continues until the stopping condition is met. This condition may specify a fixed number of epochs (the number of times the data set is presented to the net) or may be triggered when no changes occur in the classification.
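The update rule just described corresponds to Kohonen's LVQ1 scheme, sketched below in illustrative Python (the study relied on Kohonen's published method and parameters; this code is only a paraphrase of the description above, not the implementation used).

import numpy as np

def train_lvq1(X, y, prototypes, proto_labels, learning_rate=0.03, epochs=2, seed=0):
    # X: (n, d) training vectors; y: (n,) class labels.
    # prototypes: (m, d) initial prototype vectors; proto_labels: (m,) their labels.
    rng = np.random.default_rng(seed)
    prototypes = np.array(prototypes, dtype=float, copy=True)
    proto_labels = np.asarray(proto_labels)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            x, c = X[i], y[i]
            # Best-matching prototype: smallest Euclidean distance.
            j = int(np.argmin(np.linalg.norm(prototypes - x, axis=1)))
            step = learning_rate * (x - prototypes[j])
            # Move toward the input if classes agree, away from it otherwise.
            prototypes[j] += step if proto_labels[j] == c else -step
    return prototypes

def classify(X, prototypes, proto_labels):
    # Label each vector with the class of its nearest prototype.
    dists = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    return np.asarray(proto_labels)[np.argmin(dists, axis=1)]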

At the end of the training process, the ANN can be used to classify new vectors, for which the classification is unknown. The new vectors are input into the network, and a label (class) is assigned to each of them based on the closest prototype vector. Since LVQ is a network based on prototypes, the number of stability classes used in the analysis (training and testing) and in the reporting phase can be (easily) selected in a consistent manner. The number of stability classes for both analysis and reporting was 2 for criteria a, b, and d, and 5 for criterion c.

An important note about the utilization of LVQ is needed: the objective of this study is not to test the ability of LVQ to "learn" and replicate the patterns of instability identified in a geographic area. The validity of the method is proved by the extensive literature that describes and analyzes it (e.g., Holmstrom et al., 1996a, b; Laaksonen and Oja, 1996; Kohonen, 2001). Similarly to statistical techniques (e.g., regression analysis), the objective of this study is to investigate whether a useful model for landslide susceptibility can be developed by using LVQ, and to assess its quality. By comparison with regression analysis, the accuracy of the mappings produced fulfills the same role as the multiple coefficient of determination (R2), or the standard error of the estimate. Also, identification of "significant" variables is an important task of such an investigation, and this study aims at finding the terrain attributes that contribute most to a good mapping.

The parameters used in the training of LVQ were selected based on preliminary analyses and on data from the literature (Kohonen et al., 1996; Kohonen, 2001). The learning rate selected was 0.03 and the number of prototype vectors was set to 2000 (1000 per class for criteria a, b, and d, and 400 per class for criterion c). The entire training data were passed to the ANN twice (i.e., two epochs). However, the LVQ algorithm is influenced by the number of patterns used (learned) for different classes and for this reason appropriate changes had to be performed. For criterion d, the ratio between stable and unstable terrain is about 1 to 40, and to account for this, in the training phase, "landslide" cells were presented to the ANN 40 times in each epoch. For the same reason, for criterion b the unstable pixels were presented to the ANN three times in each epoch. For criteria a and c, the numbers of pixels for each class were approximately equal and no such corrections were necessary.
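The class balancing described above can be expressed as a simple oversampling step (illustrative sketch, not the original code): before training, the rows belonging to the rare class are repeated so that each epoch presents them the stated number of times (40 for criterion d, 3 for the unstable class under criterion b).

import numpy as np

def balance_by_repetition(X, y, minority_label, repeats):
    # Repeat the minority-class rows so that each epoch presents them
    # `repeats` times; the majority class is left unchanged.
    minority = np.flatnonzero(y == minority_label)
    extra = np.tile(minority, repeats - 1)          # rows are already present once
    idx = np.concatenate([np.arange(len(y)), extra])
    return X[idx], y[idx]

# e.g., X_bal, y_bal = balance_by_repetition(X_train, y_train, minority_label=1, repeats=40)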

7. Preliminary analyses

Numerous preliminary analyses were conducted to assess the influence of the attributes and the impact of coding methods. The attributes were analyzed individually with respect to the coding method and their impact on prediction. The procedure involved selecting a subset of polygons with all other attributes similar except for the one analyzed. Predictions made with and without that parameter were then evaluated prior to deciding whether or not changes were required. As a result of the preliminary analyses, texture of materials was eliminated. Also, the surficial expression, which is a multifaceted description of materials, was split into three components denoted as expression-thickness, expression-slope, and expression-complex.

As presented in Fig. 4, the analyses employing only topographic attributes used standardized and non-standardized data, both with raw numbers (i.e., numbers as they are) and with 1-of-c coding. When numbers were used, standardized data produced better results. In general, these results were 2–4% better with respect to criterion a, and varied for the other criteria. However, when coupled with 1-of-c coding, these differences become less visible, and make standardization unnecessary. Results obtained with bipolar notation followed the same trend, but were in general less accurate than those obtained with binary notation. For thermometer coding the results were less accurate, and indeed sometimes erratic. Thermometer coding appears very much affected by the subjectivity in ranking the nominal data, and by the number of classes created for an attribute. Results followed the same trend for geomorphic attributes, which were coded using the 1-of-c method in binary and bipolar format and "thermometer" coding. More details on these analyses are presented in Pavel (2003). Therefore, in the next section of the paper, only results using non-standardized numbers and 1-of-c binary coding are presented.

8. Results and confirmation of the model

8.1. Results and confirmation of the model when using topographic attributes

The results discussed in this section correspond to Module A in Fig. 4 and are presented in Table 2. In Table 2, results with respect to criteria a and b are relatively similar, which indicates consistent terrain stability mapping with respect to terrain Classes IV and V. In essence, these results show that dropping the three curvatures does not have a significant impact on the accuracy of predictions, and the only attributes to be included in the next phase of the analysis are slope, elevation, aspect, and Sca. For criterion a, results indicate a continuous increase in prediction accuracy after the successive introduction of slope, elevation, aspect, and Sca (predictions no. 9, 7, 6, and 4) from 67.9% to 83.6%. Classification with respect to criterion b follows a similar trend, and the accuracy of predictions based on the same attributes (slope, elevation, aspect, and Sca) increases from 62.3% to 82.1%.

With respect to criterion c, Table 2 shows that the highest accuracy of classifications is 57.6% (prediction 3). This accuracy indicates that the model is unable to replicate the 5-class terrain classification system when using only topographic attributes.

Results based on existing landslides (criterion d) have accuracies in the range of 70–74%. For prediction 9, the accuracy was 97.4% because the entire area was classified as stable. This situation occurred simply because the area of existing landslides (and associated buffers) is small compared to the entire watershed; this is an artifact produced by reporting outcomes as a percentage.

8.2. Results and confirmation of the model when using geomorphic attributes

The results discussed in this section correspond to Module B in Fig. 4. A detailed presentation of results based on geomorphic attributes is provided as supplementary data to this article. These results are presented only for criteria a–c; criterion d is not feasible in this case because landslides are not associated with whole polygons. Predictions based on certain combinations of geomorphic attributes yield accuracies greater than 95%. However, these values are unrealistically high since they are a direct consequence of the representation used, which assigns the same attributes to all cells in a polygon. Also, these results have no practical applicability, as they are based on polygons already delineated and fully described (with terrain symbols attached). Delineation and description of polygons is the most time-consuming part of subjective geomorphic mapping. If the mapping process is so advanced, there is no point in using an ANN for assigning stability classes, as this can be done manually.

Analyses based only on geomorphic attributes indicate that the most dominant geomorphic process recorded for each polygon, polygon slope, and expression-thickness are the most important for determining stability. However, where both terrain mapping and a DEM are available, each cell has a slope derived from the DEM, hence the slope of the polygon is redundant. Therefore, the only attributes to be used in the next phase of analysis are geomorphic processes and expression-thickness.

8.3. Results and confirmation of the model when using topographic and geomorphic attributes

The results discussed in this section correspond to Module C in Fig. 4. Analyses were conducted with the attributes retained in previous phases and ranked in the following order: slope, geomorphic processes, elevation, aspect, Sca, and expression-thickness. Similarly to previous sections, six predictions with various combinations of these attributes were carried out to account for the subjectivity of the ranking and also for the uncertainty related to the influence of each attribute. These results are presented in Table 3.

The results in Table 3 show that for criterion a the greatest accuracy is achieved with prediction 12, and for criterion b with prediction 11. However, criterion a is considered the most important for practical applications, and for criterion b the difference between predictions 11 and 12 is only 0.3%. Accordingly, the best agreement is achieved with prediction 12, by using the attributes slope, geomorphic processes, elevation, and aspect. For criterion c, results in Table 3 show that the greatest accuracy is 70.8%. This finding implies there is an improvement in the quality of prediction compared to the case when only topographic attributes are used. However, the accuracy of predictions achieved for criterion c is relatively low. For criterion d, results in Table 3 show an improvement in accuracy compared to predictions based only on topographic attributes. The greatest accuracy (75.4%) is again achieved with the attributes slope, geomorphic processes, elevation, and aspect (prediction 12). In the remainder of this section, maps showing results with the attributes slope, geomorphic processes, elevation, and aspect (prediction 12) are presented.


Fig. 5. Results and confirmation of the model with respect to terrain Class IV and V (criterion a; from Pavel et al., 2008).

Fig. 6. Results and confirmation of the model with respect to terrain Class V (criterion b).

Table 3. Results for predictions based on the reduced set of topographic and geomorphic attributes. Prediction 12 is the most important one for practical applications.

Prediction  Attributes included                                                         Accuracy (%)
                                                                                        Crit. a   Crit. b   Crit. c   Crit. d
10          Slope, geomorphic processes, elevation, aspect, Sca, expression-thickness   90.5      93.9      70.8      74.3
11          Slope, geomorphic processes, elevation, aspect, Sca                         90.2      94.2      70.7      75.0
12          Slope, geomorphic processes, elevation, aspect                              90.9      93.9      70.3      75.4
13          Slope, geomorphic processes, elevation, Sca                                 88.8      91.9      71.1      73.7
14          Slope, geomorphic processes, elevation                                      87.7      91.3      68.5      72.1
15          Slope, geomorphic processes                                                 80.2      81.5      62.2      65.1


Prediction with respect to criterion a is presented in Fig. 5 along with terrain polygons Class IV and Class V. Fig. 5 shows that Type I errors associated with this prediction cover a relatively small area. Type II errors, however, affect much larger areas, which consist of an almost complete miss of a polygon situated in the lower (south) part of the valley and another one near the central part. Also, portions of polygons are misclassified, mainly at lower elevations. The accuracy achieved for criterion a is 90.9%.

Results for criterion b are presented in Fig. 6, which also displays terrain polygons Class V. The error consists mainly of small areas classified as unstable within the stable area (Type I errors). Overall, an accuracy of 93.9% is achieved with respect to criterion b.

For criterion c the error consists mainly of misclassifications between neighbouring (stability) classes, especially between polygons belonging to classes I and II, and to classes IV and V. The misclassified areas are relatively small in size and are distributed over the entire area. For this reason, displaying a meaningful map at a reasonable scale for the results obtained with respect to criterion c is not possible.

Fig. 7. Results and confirmation of the model with respect to existing landslides (criterion d).

The prediction for criterion d is presented in Fig. 7 (terrain polygons Class V are displayed as well). Terrain identified as unstable in Fig. 7 captures the majority of existing landslides (198 of 212 in total). All the landslides missed occur at lower elevation, on the southern side of the study area. For criterion d, terrain mapped as unstable outside the existing landslides and buffers represents Type I errors (which constitute the vast majority of errors in this case), while existing landslides not identified by the system represent Type II errors. In general, unstable terrain is delineated within terrain polygons identified as Class V by the mappers. However, for practical applications the accuracy of 75.4% with respect to this criterion is still relatively low.
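For a raster representation, the two error types used for criterion d can be extracted with simple mask operations. The sketch below uses tiny hypothetical boolean rasters, not the study data.

import numpy as np

# Hypothetical boolean rasters (True = cell flagged unstable / cell within a landslide or its buffer).
predicted_unstable = np.array([[True,  True,  False],
                               [False, True,  False],
                               [False, False, False]])
landslide_or_buffer = np.array([[True,  False, False],
                                [False, True,  True],
                                [False, False, False]])

type_1 = predicted_unstable & ~landslide_or_buffer   # mapped unstable outside landslides and buffers
type_2 = landslide_or_buffer & ~predicted_unstable   # landslide/buffer cells missed by the system
print(type_1.sum(), "Type I cells;", type_2.sum(), "Type II cells")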

9. Discussion

Results of this study show that the ANN-based method was successful at identifying landslide-prone terrain based on existing geomorphic mapping and using a 2-class system. Good accuracies are achieved in relation to criteria a and b; hence, the method is able to replicate the subjective geomorphic mapping with respect to unstable and potentially unstable terrain (Class IV and V), and (strictly) unstable terrain (Class V). Results based only on topographic attributes yielded accuracies of 82.1% and 83.6% for criteria a and b, respectively (prediction 4). Adding geomorphic processes (prediction 12), the accuracies increased to 90.9% and 93.9%, respectively. The greater increase in accuracy with respect to criterion b shows that geomorphic processes are better indicators of instability for Class V polygons than for Class IV polygons.

Overall, the findings of this study indicate that terrain stability mapping of a site using a 2-class system can be done based on the LVQ algorithm and a subset of topographic and geomorphic attributes. Topographic attributes (slope, elevation, and aspect) are convenient and inexpensive to measure or derive with commonly used methods. Of all the geomorphic attributes, "geomorphic processes" was the only one found to be important in this study, implying that a different, much simplified terrain mapping method may be sufficient for extracting this attribute. Geomorphic processes can be accurately identified on aerial photographs, and field inspections can confirm their areal extent. Delineation of geomorphic processes is much easier, requires less time, and demands a less skilled person than the entire terrain mapping procedure. Accordingly, it is believed that the method developed in this study can be used to produce consistent landslide susceptibility mappings at reduced cost. This approach can be used both as a planning tool and for independent risk analysis. The model can be used to delineate, in a consistent manner, areas of potential instability and to identify zones that require detailed ground checks, thereby limiting expensive field assessments to the most potentially unstable ground.
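For readers unfamiliar with the classifier, the core LVQ1 update rule can be sketched as follows. This is a generic illustration (the prototype count, learning rate, and toy data are assumptions), not the authors' implementation.

import numpy as np

def train_lvq1(X, y, protos_per_class=2, lr=0.05, epochs=30, seed=0):
    """Minimal LVQ1: the winning prototype moves toward same-class samples, away from others."""
    rng = np.random.default_rng(seed)
    protos, proto_labels = [], []
    for c in np.unique(y):
        idx = rng.choice(np.flatnonzero(y == c), protos_per_class, replace=False)
        protos.append(X[idx].copy())
        proto_labels.append(np.full(protos_per_class, c))
    protos, proto_labels = np.vstack(protos), np.concatenate(proto_labels)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            w = np.argmin(np.linalg.norm(protos - X[i], axis=1))   # winning prototype
            sign = 1.0 if proto_labels[w] == y[i] else -1.0
            protos[w] += sign * lr * (X[i] - protos[w])
    return protos, proto_labels

def predict_lvq(protos, proto_labels, X):
    d = np.linalg.norm(X[:, None, :] - protos[None, :, :], axis=2)
    return proto_labels[np.argmin(d, axis=1)]

# Toy usage with two inputs (e.g., slope and one coded geomorphic-process column)
rng = np.random.default_rng(1)
X = rng.random((200, 2))
y = (X[:, 0] > 0.6).astype(int)
protos, labels = train_lvq1(X, y)
print("training accuracy:", (predict_lvq(protos, labels, X) == y).mean())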

The LVQ-based system used in this study was not able to replicate the 5-class system (criterion c) with good accuracy. However, it is very likely that the (strict) definition of the error played an important role in this case. In practical applications it is also important to quantify the magnitude of the error, i.e., to consider whether a terrain entity was misclassified between two adjacent classes or between two extreme classes (it is very different if a terrain cell belonging to class I is misclassified as class II rather than as class V). It is very likely that results would be different if a different definition of error were used. As well, although the 5-class system is the official system in use in the Province of British Columbia, there are few applications where delineation of terrain class I and/or II is of great importance. Future studies need to explore in greater detail other definitions of errors and classification systems based on more than two classes.
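One possible magnitude-aware error definition, illustrated below with hypothetical labels only, weights each misclassification by the number of classes separating the reference and predicted values, so adjacent-class confusions count far less than a class I/class V swap:

# Hypothetical 5-class (I = 1 ... V = 5) reference and predicted labels for a few cells.
reference = [1, 2, 4, 5, 3, 1]
predicted = [2, 2, 5, 4, 1, 5]

strict_error = sum(r != p for r, p in zip(reference, predicted)) / len(reference)
# Normalize |r - p| by the largest possible separation (4) so the result stays in [0, 1].
weighted_error = sum(abs(r - p) for r, p in zip(reference, predicted)) / (4 * len(reference))
print(f"strict error: {strict_error:.2f}, magnitude-weighted error: {weighted_error:.2f}")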

Results of this study indicate that delineation of landslide-prone terrain based on existing landslides (criterion d) did not yield good accuracies. It is very likely that this result was influenced by the practical challenges related to the temporal and spatial identification/description of past events. Also, the reduced area occupied by existing landslides may have affected the ability of the model to "capture" the patterns of instability existing in the area.

For landslide hazard mapping based on existing events, Pavel (2003) speculates on a possible method for analyzing a landslide database with adequate temporal resolution. This would involve an analysis similar to the one conducted here, using a temporal implementation of the same ANN (Kohonen, 2001). The first step would consist of simply splitting the database "in time", so that the older events are used to train the ANN and the most recent ones to test the prediction. Next, the method could be used to address the landslide travel distance problem. The method presented in Fannin and Wise (2001) can be used to decompose the travel paths of existing landslides into elementary units. Based on this decomposition, an analysis method similar to the one used in this study becomes feasible. This method has all the advantages of data-driven analysis methods, it would implicitly take into account the magnitude-frequency problem, and it is a powerful alternative to existing approaches to landslide travel prediction. However, such an analysis would still be affected by inaccuracies related to the location of initiation points. Also, in the absence of good data sets that can be consistently collected for large areas, the method has little practical applicability. In relation to similarly complex problems, Horgan (1995) states that many of the most profound achievements of science do not actually solve problems, but rather prescribe the limits of knowledge. This might also be the case for the creation of landslide hazard mappings based on existing events.
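The temporal split itself is straightforward; a sketch with a hypothetical inventory and an assumed cut-off date is given below.

from datetime import date

# Hypothetical landslide inventory: (event date, attribute vector) pairs.
events = [
    (date(1981, 11, 3), [0.62, 0.10]),
    (date(1995, 1, 17), [0.71, 0.55]),
    (date(2003, 10, 8), [0.44, 0.80]),
    (date(2006, 11, 6), [0.69, 0.35]),
]
cutoff = date(2000, 1, 1)                                  # assumed train/test boundary
train = [attrs for d, attrs in events if d < cutoff]       # older events train the ANN
test = [attrs for d, attrs in events if d >= cutoff]       # recent events test the prediction
print(len(train), "training events;", len(test), "test events")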

Given the heterogeneity of data used for landslide susceptibility mapping, a comprehensive exploration of various coding methods was conducted in this study. The results support the conclusion that the 1-of-c binary coding method produces good results for replication of a subjective geomorphic mapping when a 2-class system is used.
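For reference, 1-of-c binary coding expands each class-type attribute into c binary inputs, which are then concatenated with the numeric attributes to form the network input vector. The process codes and slope values below are illustrative only.

# 1-of-c (one-hot) coding of a class-type attribute such as "geomorphic process",
# combined with an already numeric attribute such as slope.
processes = ["R", "F", "Rb", "F"]             # hypothetical per-cell process codes
categories = sorted(set(processes))           # fixed category order, here ['F', 'R', 'Rb']
slope = [31.0, 12.5, 38.2, 9.8]               # per-cell slope (degrees), illustrative

encoded = []
for p, s in zip(processes, slope):
    one_hot = [1.0 if p == c else 0.0 for c in categories]
    encoded.append(one_hot + [s])             # input vector: [is_F, is_R, is_Rb, slope]
print(encoded)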

10. Conclusions

This study concluded that the creation of landslide susceptibility mappings based on a 2-class system can be done by using an existing subjective geomorphic mapping. For criteria a and b, similarities to the experts' mapping greater than 90% were achieved. However, replication of the 5-class system used in British Columbia was relatively unsuccessful, and the highest accuracy achieved was about 71%.

Also, the study concludes that, at present, the creation of terrain mappings with a temporal resolution (i.e., hazard mappings based on existing landslides) is not possible, except perhaps for some intensely studied areas. In this study, the highest accuracy for this criterion was about 75%. In recent years, great effort and important resources have been allocated to terrain stability mapping based on existing events. However, in the absence of mappings valid in time, the problem is far from being solved, and to date no theoretical foundation seems to exist to prove that the problem is actually solvable.

Acknowledgements

Funding for this study was provided by the Science Council of British Columbia, the University of British Columbia, the Natural Sciences and Engineering Research Council of Canada, and Western Forest Products, Ltd. The authors are very grateful to Metro Vancouver for providing the data used in this study. The field photos used in this article are also courtesy of Metro Vancouver.

Appendix A. Supplementary materials

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.cageo.2010.10.006.

References

Al-Tuhami, A.A., 2000. Neural Networks: a solution for the factor of safety problem in slopes. In: Proceedings of the 8th International Symposium on Landslides, Cardiff, UK. Thomas Telford, London, pp. 45–50.

Aleotti, P., Chowdhury, R., 1999. Landslide hazard assessment: summary review and new perspectives. Bulletin of Engineering Geology and the Environment 58, 21–44.

Aste, J.P., Ke, C., Faure, R.M., Mascarelli, D., 1995. The SISIPHE and XPENT projects—Expert systems for slope instability. In: Proceedings of the Sixth International Symposium on Landslides, Christchurch. Balkema, Rotterdam, pp. 1647–1652.

Bishop, C.M., 1995. Neural Networks for Pattern Recognition. Oxford University Press, Oxford, 482 pp.

Brenning, A., 2005. Spatial prediction models for landslide hazard: review, comparison and evaluation. Natural Hazards and Earth System Sciences 5, 853–862.

Burrough, P.A., McDonnell, R.A., 1998. Principles of Geographical Information Systems. Oxford University Press, Oxford, 333 pp.

Carrara, A., Crosta, G., Frattini, P., 2003. Geomorphological and historical data in assessing landslide hazard. Earth Surface Processes and Landforms 28, 1125–1142.

Catani, F., Casagli, N., Ermini, L., Righini, G., Menduni, G., 2005. Landslide hazard and risk mapping at catchment scale in the Arno River basin. Landslides 2, 329–342.

Cross, V., Firat, A., 2000. Fuzzy objects for Geographical Information Systems. Fuzzy Sets and Systems 113, 19–36.

Dai, F.C., Lee, C.F., 2001. Terrain-based mapping of landslide susceptibility using a geographical information system: a case study. Canadian Geotechnical Journal 38, 911–923.

Dai, F.C., Lee, C.F., Ngai, Y.Y., 2002. Landslide risk assessment and management: an overview. Engineering Geology 64, 65–87.

Davis, T.J., 1999. Towards verification of a natural resource uncertainty model. Ph.D. Dissertation, University of British Columbia, Vancouver.

Demoulin, A., Chung, C.-J.F., 2007. Mapping landslide susceptibility from small datasets: a case study in the Pays de Herve (E Belgium). Geomorphology 89 (3–4), 391–404.

Dhakal, A.S., Amada, T., Aniya, M., 2000. Landslide hazard mapping and its evaluation using GIS: an investigation of sampling schemes for a grid-cell based quantitative method. Photogrammetric Engineering & Remote Sensing 66 (8), 981–989.

Dietrich, W.E., Montgomery, D.R., 1998. SHALSTAB: a digital terrain model for mapping shallow landslide potential, 30+ pp. http://socrates.berkeley.edu/~geomorph/shalstab/index.htm (accessed June 20, 2010).

Dodagoudar, G.R., Venkatachalam, G., 2000. Reliability analysis of slopes using fuzzy sets theory. Computers and Geotechnics 27, 101–115.

Ermini, L., Catani, F., Casagli, N., 2005. Artificial neural networks applied to landslide susceptibility assessment. Geomorphology 66, 327–343.

Fannin, R.J., Wise, M.P., 2001. An empirical-statistical model for debris flow travel distance. Canadian Geotechnical Journal 38, 982–994.

Faure, R.M., Mascarelli, D., Vaunat, J., Leroueil, S., Tavenas, F., 1995. Present state of development of XPENT, expert system for slope stability problems. In: Proceedings of the Sixth International Symposium on Landslides, Christchurch. Balkema, Rotterdam, pp. 1671–1678.

Ferentinou, M.D., Sakellariou, M.G., 2005. Assessing landslide hazard on medium and large scales using self-organizing maps. In: Proceedings of the International Conference on Landslide Risk Management, Vancouver, Canada, 31 May–3 June. Taylor & Francis Group, London, UK, pp. 639–648.

Fernandez-Steeger, T.M., Rohn, J., Czurda, K., 2002. Identification of landslide areas with neural nets for hazard analysis. In: Proceedings of the I European Conference on Landslides, Prague, Czech Republic, June 24–26. Balkema, The Netherlands, pp. 163–168.

Fisher, P., 2000. Sorites paradox and vague geographies. Fuzzy Sets and Systems 113, 7–18.

Fletcher, G.P., Hinde, C.J., 1995. Using neural networks as a tool for constructing rule based systems. Knowledge-Based Systems 8 (4), 183–189.

Gomez, H., Kavzoglu, T., 2005. Assessment of shallow landslide susceptibility using artificial neural networks in Jabonosa River Basin, Venezuela. Engineering Geology 78, 11–27.

Gosavi, A., 2003. Simulation-based Optimization: Parametric Optimization Techniques and Reinforcement Learning. Kluwer Academic Publishers, Boston, MA, 554 pp.

Greater Vancouver Regional District (GVRD; currently, Metro Vancouver), 1999. Watershed Ecological Inventory Program. Study coordinated by Acres International Limited, 3 vols.

Guesgen, H.W., Albrecht, J., 2000. Imprecise reasoning in geographic information systems. Fuzzy Sets and Systems 113, 121–131.

Guzzetti, F., Carrara, A., Cardinali, M., Reichenbach, P., 1999. Landslide hazard evaluation: a review of current techniques and their application in a multi-scale study, Central Italy. Geomorphology 31, 181–216.

Guzzetti, F., Reichenbach, P., Ardizzone, F., Cardinali, M., Galli, M., 2006. Estimating the quality of landslide susceptibility models. Geomorphology 81, 166–184.

Holmstrom, L., Koistinen, P., Laaksonen, J., Oja, E., 1996a. Comparison of neural and statistical classifiers—theory and practice. Research Report A13, Rolf Nevanlinna Institute, University of Helsinki, Finland, 37 pp.

Holmstrom, L., Koistinen, P., Laaksonen, J., Oja, E., 1996b. Neural network and statistical perspectives of classification. In: Proceedings of the ICPR '96, Vienna, pp. 286–290.

Horgan, J., 1995. From complexity to perplexity. Scientific American 272 (6), 104–110.

Howes, D.E., Kenk, E., 1997. Terrain Classification System for British Columbia (version 2). Fisheries Branch, Ministry of Environment, and Surveys and Resource Mapping Branch, Ministry of Crown Lands, Province of British Columbia, Victoria, BC, 101 pp.

Huang, S.H., Xing, H., 2002. Extract intelligible and concise fuzzy rules from neural networks. Fuzzy Sets and Systems 132, 233–243.

Kasabov, N.K., 1996. Learning fuzzy rules and approximate reasoning in fuzzy neural networks and hybrid systems. Fuzzy Sets and Systems 82, 135–149.

Keaton, J.R., DeGraff, J.V., 1996. Surface observation and geologic mapping. In: Turner, A.K., Schuster, R.L. (Eds.), Landslides—investigation and mitigation. Transportation Research Board Special Report 247. National Academy Press, Washington, DC, pp. 178–230.

Kohonen, T., 2001. Self-organizing Maps, Third ed. Springer-Verlag, Berlin, 501 pp.

Kohonen, T., Hynninen, J., Kangas, J., Laaksonen, J., Torkkola, K., 1996. Learning vector quantization. Technical Report A30, Helsinki University of Technology, Laboratory of Computer and Information Science, Espoo, 32 pp.

Laaksonen, J., Oja, E., 1996. Classification with learning k-nearest neighbors. In: Proceedings of ICNN '96, Washington, DC, pp. 1480–1483.

Lee, S., Ryu, J.-H., Min, K., Won, J.-S., 2003. Landslide susceptibility analysis using GIS and artificial neural network. Earth Surface Processes and Landforms 28, 1361–1376.

Liu, Y., Guo, H.C., Zou, R., Wang, L.J., 2006. Neural network modelling for regional hazard assessment of debris flow in Lake Qionghai Watershed, China. Environmental Geology 49, 968–976.

Malamud, B.D., Turcotte, D.L., Guzzetti, F., Reichenbach, P., 2004. Landslide inventories and their statistical properties. Earth Surface Processes and Landforms 29, 687–711.

Manly, B.F.J., 1994. Multivariate Statistical Methods: A Primer, Second ed. Chapman and Hall, London, 215 pp.


Melchiorre, C., Matteucci, M., Azzoni, A., Zanchi, A., 2008. Artificial neural networks and cluster analysis in landslide susceptibility zonation. Geomorphology 94, 379–400.

Michalewicz, Z., Fogel, D.B., 2000. How to Solve It: Modern Heuristics. Springer-Verlag, Berlin, 467 pp.

National Institute of Standards and Technology (NIST)—US Department of Commerce, 2010. Engineering Statistics Handbook. http://www.itl.nist.gov/div898/handbook/ (accessed June 20, 2010).

Negnevitsky, M., 2005. Artificial Intelligence: A Guide to Intelligent Systems, Second ed. Addison-Wesley, London, 415 pp.

Oreskes, N., Shrader-Frechette, K., Belitz, K., 1994. Verification, validation, and confirmation of numerical models in the Earth Sciences. Science 263, 641–646.

Pack, R.T., Tarboton, D.G., Goodwin, C.N., 2001. SINMAP: a stability index approach to terrain stability hazard mapping. User's Manual, 68 pp. http://hydrology.neng.usu.edu/sinmap/ (accessed June 20, 2010).

Pavel, M., 2003. Application of artificial neural networks for terrain stability mapping. Ph.D. Dissertation, University of British Columbia, Vancouver.

Pavel, M., Fannin, R.J., Nelson, J.D., 2008. Replication of a terrain stability mapping using an artificial neural network. Geomorphology 97 (3–4), 356–373.

Pradhan, B., Lee, S., 2010a. Landslide susceptibility assessment and factor effect analysis: backpropagation artificial neural networks and their comparison with frequency ratio and bivariate logistic regression modeling. Environmental Modelling & Software 25 (6), 747–759.

Pradhan, B., Lee, S., 2010b. Regional landslide susceptibility analysis using back-propagation neural network model at Cameron Highland, Malaysia. Landslides 7 (1), 13–30.

Pradhan, B., Sezer, E.A., Gokceoglu, C., Buchroithner, M.F. Landslide susceptibility mapping by neuro-fuzzy approach in a landslide prone area (Cameron Highland, Malaysia). IEEE Transactions on Geoscience and Remote Sensing, in press, doi:10.1109/TGRS.2010.2050328.

Province of BC (British Columbia), 1999. Mapping and assessing terrain stability guidebook. Forest Practices Code of British Columbia, second ed. Ministry of Forests and Range, Victoria, BC, 36 pp. http://www.for.gov.bc.ca/TASB/LEGSREGS/FPC/FPCGUIDE/terrain/index.htm (accessed June 20, 2010).

Qin, S.Q., Jiao, J.J., Wang, S.J., 2001. The predictable time scale of landslides. Bulletin of Engineering Geology and the Environment 59, 307–312.

Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning representations by back-propagating errors. Nature 323, 533–536.

Runqiu, H., Qiang, X., 1997. Originality in applying nonlinear theories in geological hazard analysis. In: Proceedings of the International Symposium on Engineering Geology and the Environment, organized by the Greek National Group of IAEG, Athens, Greece, 23–27 June, pp. 713–718.

Saboya Jr., F., da Gloria Alves, M., Dias Pinto, W., 2006. Assessment of failure susceptibility of soil slopes using fuzzy logic. Engineering Geology 86 (4), 211–224.

SAS Institute, 2001. SAS User's Guide, Version 8.02. The SAS Institute, Cary, North Carolina.

Shunmin, Y., Yunzhi, S., 1997. Fractal characterization of regional landslide activities and its significance. In: Proceedings of the International Symposium on Engineering Geology and the Environment, organized by the Greek National Group of IAEG, Athens, Greece, 23–27 June, pp. 1155–1158.

Soeters, R., van Westen, C.J., 1996. Slope instability recognition, analysis, and zonation. In: Turner, A.K., Schuster, R.L. (Eds.), Landslides—investigation and mitigation. Transportation Research Board Special Report 247. National Academy Press, Washington, DC, pp. 129–177.

Tarboton, D.G., 1997. A new method for the determination of flow directions and upslope areas in grid digital elevation models. Water Resources Research 33, 309–319.

Terzaghi, K., 1950. Mechanism of landslides. In: Application of Geology to Engineering Practice, Berkeley Volume. The Geological Society of America, pp. 83–123.

Turcotte, D.L., Malamud, B.D., 2004. Landslides, forest fires, and earthquakes: examples of self-organized critical behavior. Physica A 340, 580–589.

Underhill, D. (Ed.), 2000. Watershed Restoration Technical Bulletin, Streamline, vol. 5, no. 3. BC Ministry of Environment, Watershed Restoration Program, Vancouver, BC. http://www.forrex.org/streamline/ISS18/vol5no3.pdf (accessed June 2010).

Vahidnia, M.H., Alesheikh, A.A., Alimohammadi, A., Hosseinali, F., 2010. A GIS-based neuro-fuzzy procedure for integrating knowledge and data in landslide susceptibility mapping. Computers & Geosciences 36, 1101–1114.

Vulliet, L., Mayoraz, F., 2000. Coupling neural networks and mechanical models for a better landslide management. In: Proceedings of the 8th International Symposium on Landslides, Cardiff, UK. Thomas Telford, London, pp. 1521–1526.

Wislocki, A.P., Bentley, S.P., 1989. An expert system for landslide hazard and risk assessment. In: Topping, B.H.V. (Ed.), Artificial Intelligence Techniques and Applications for Civil and Structural Engineers. Civil-Comp Press, Edinburgh, pp. 249–252.

Wu, W., Sidle, R.C., 1995. A distributed slope stability model for steep forested basins. Water Resources Research 31 (8), 2097–2110.