Stochastic Automata for Outdoor Semantic Mapping using … · 2017. 12. 18. · Stochastic Automata...

General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

• Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal

If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from orbit.dtu.dk on: Dec 18, 2017

Stochastic Automata for Outdoor Semantic Mapping using Optimised SignalQuantisation

Caponetti, Fabio; Blas, Morten Rufus; Blanke, Mogens

Published in:Control Engineering Practice

Link to article, DOI:10.1016/j.conengprac.2010.11.010

Publication date:2011

Link back to DTU Orbit

Citation (APA):Caponetti, F., Blas, M. R., & Blanke, M. (2011). Stochastic Automata for Outdoor Semantic Mapping usingOptimised Signal Quantisation. Control Engineering Practice, 19(3), 223-233. DOI:10.1016/j.conengprac.2010.11.010

http://dx.doi.org/10.1016/j.conengprac.2010.11.010

http://orbit.dtu.dk/en/publications/stochastic-automata-for-outdoor-semantic-mapping-using-optimised-signal-quantisation(1610ce48-f357-413a-a9d6-d4305ca837b2).html

Stochastic Automata for Outdoor Semantic Mapping using OptimisedSignal Quantisation

Fabio Caponetti∗,a,b,e, Morten Rufus Blasc,a, Mogens Blankea,d

aTechnical University of Denmark, Department of Electrical Engineering, Automation and Control Group, Elektrovej build.326, DK-2800 Kgs. Lyngby, Denmark

bUniversita Politecnica delle Marche, Dipartimento di Ingegneria Informatica Gestionale e dell’Automazione, Via BrecceBianche, 60131 Ancona, Italy

cClaas Agrosystems, Bøgeskovvej 6, 3490 Kvistgaard, DenmarkdCeSOS, Norwegian University of Science and Technology, 7491 Trondheim, Norway

eIntegra Software srl, Via Leopardi 1, 60035 Jesi, Italy

Abstract

Autonomous robots require many types of information to obtain intelligent and safe behaviours. For outdooroperations, semantic mapping is essential and this paper proposes a stochastic automaton to localise therobot within the semantic map. For correct modelling and classification under uncertainty, this papersuggests quantising robotic perceptual features, according to a probabilistic description, and then optimisingthe quantisation. The proposed method is compared with other state-of-the-art techniques that can assessthe confidence of their classification. Data recorded on an autonomous agricultural robot are used forverification and the new method is shown to compare very favourably with existing ones.

Key words: Stochastic automata, Robotics, Classification, Probabilistic models, Quantisation

1. Introduction

Recent developments in outdoor robots and sen-sor technology have made autonomous field opera-tions a realistic aim. The challenge is to developand add functionality so that vehicles will behavein a safe and reliable manner under unmanned ope-ration. Safe behaviour is crucial if outdoor robotsare to become acceptable to authorities and society.High reliability is also required if robots are to beattractive to farmers and other professional users.The biggest technological challenge in such autono-mous outdoor systems is ability to sense the envi-ronment, as well as to classify and use perceptualinformation for control.Classification and perception in natural environ-

ments are difficult. High dimensional data are in-volved, including those coming from video streams,and with noisy sensor signals that have limitations

∗Corresponding author. Tel.: +45 4525 3565; fax: +454588 1295

Email addresses: [email protected] (Fabio Caponetti),[email protected] (Morten Rufus Blas),[email protected] (Mogens Blanke)

in accuracy and range, it is hard to model andgeneralise objects and contexts. Metric mapping,which is commonly based on position from GlobalPositioning System (GPS) Satellites combined withmotion sensing from Inertial Measurement Units(IMU), provides very useful information, but needsto be supplemented by environment sensing andperception to obtain safe autonomous operation.Semantic mapping is a technique to extract non-metric features from sensors such as cameras andlaser scanners. In a changing natural environmentautonomous operation is a challenge. GPS avai-lability is irregular or prone to outliers, and fea-ture extraction from advanced sensors frequentlysuffers from artifacts. Robust methods to recogniseobjects and scenes are, therefore, a key to assurethat a robot shows correct and safe behaviour, evenin faulty conditions or unforeseen situations (Bou-guerra et al., 2008). Safe autonomous operationtherefore requires robust semantic mapping and ob-ject recognition, and the technology needed mustreach far beyond conventional robotic motion plan-ning, (LaValle, 2006; Mettler et al., 2010). The se-mantic mapping problem is in essence a classifica-

Preprint submitted to Control Engineering Practice 7th December 2010

tion problem, and several techniques are availablefrom in-door robotic applications, from computervision and from the film industry. Detailed refe-rences and discussion of main features of classifica-tion algorithms are presented in Section 4.2.1.

This paper suggests a novel approach based onstochastic automata (Lunze, 2001) to create anduse outdoor semantic maps for safe, autonomousnavigation. In this framework, automaton statesare shown to conveniently correspond to differentcharacteristic environments, hence giving modelsan intuitive interpretation. States are interconnec-ted by probabilistic transitions, which represent to-pological relations. Transitions are associated withactivation conditions in the form of perception pat-terns and vehicle motion history. It is discussed howsignal quantisation can enhance robustness againstnoise and achieve fault-tolerant performances, andhow quantisation can be optimised according to theprobability distributions of observations. Combi-ning well-known task planning methods (Galindoet al., 2008) with robust semantic mapping, it isshown how safe operation can be achieved. Theproposed method for semantic mapping is compa-red to existing state-of-the-art techniques on datarecorded from an autonomous agricultural robot inan experimental orchard.

The paper is organised as follows. The problemand the context are first outlined. Then, perceptionfor semantic mapping is introduced using a stochas-tic automaton framework, signal space quantisationis discussed and an optimisation is suggested tominimise false alarm probability. Detailed resultsare then presented using field test data, comparisonwith other algorithms is discussed and conclusionsfinalise the paper.

2. Autonomous orchard operations

Semantic mapping is of extreme importance formobile robots. With a good state estimate of thesemantic map, this information can be used to su-pervise plan execution, redefine controllers, or aidthe localisation process. This helps to assure sa-fety and reliability. Semantic mapping is commonlydone by matching the metric position of the robotin a known map. This is still viable outdoors butfalls short when evaluated in terms of robustness.Missions have to continue, regardless of weather orseason, leading to the need for a sensing system ro-bust against vegetation shape, environment changes

and possible map or localisation faults. Fig.1 illus-trates the use of an automaton model to representsemantic information of an orchard.

Figure 1: Environment-distinctive areas and their topolo-gical relations are modelled by states and transitions of astochastic automaton. To identify the model probabilities,the real-valued, measured input and perceived output arequantised and are tuned through frequency count probabi-lities. Robust semantic localisation is achieved in real timeby recursive evaluation in an observer.

To stimulate new robust solutions, the data usedin this paper were recorded during autonomous ope-rations of a tractor in an experimental orchard ow-ned by Copenhagen University. The area is cha-racterised by geometric features: trees or plants,are set along straight and parallel lines as visiblein Fig.2. The distance between each tree in a rowdepends on the type of tree, while rows are wideenough to let a tractor drive through. A typicalwork plan involves driving from the docking sta-tion to the headland, choosing a row and drivingthrough while performing tasks like cutting grassor spraying. Missions are defined by the user, interms of trajectories and operations. Each planis composed of one or more alternatives, useful inunexpected situations, like faults or new obstacles.

2

Figure 2: A typical tour covers the track shown in the imageas GPS route on Google Earth background. The numbersdenote the zones in which the area has been divided for se-mantic mapping. 1. Open field (Road) 2. Headland 3. Densetrees 4. Sparse trees

2.1. Autonomous tractor

The tractor is a standard orchard tractor that hasbeen retrofitted with additional sensors and compu-ting power (Griepentrog et al., 2009). A Sick laserscanner mounted in front of the robot scans a maxi-mum of 8m for 180degs with a configurable resolu-tion of 0.5 or 1deg at ∼ 70Hz. The tractor has asecond distance sensor, a stereo vision device, whichmakes available a 3D point cloud of the field of view.It gives more information than the laser althoughit is quite noisy, and light and weather dependent.The tractor is also equipped with a global positio-ning device, based on real-time kinematics (RTK)technology and odometry sensors.

2.2. Semantic states

A state is a geographical area, or context, whichthe tractor must recognise. Referring to Fig.2, atypical orchard can be divided into four intercon-nected zones, enumerated and described as follows.

1. Open field Few obstacles and freely traversablespace in the field of view of the tractor, i.e. aroad or a low vegetation field, see Fig.4.1.

2. Headland Defines the start/end of the agricul-tural area. It is usually delimited by fences,markers or open space, see Fig.4.2.

3. Dense trees The distance between trees is verytight so there re very limited manoeuvringpossibilities and the robot may need to pushthrough branches to get by, see Fig.4.3.

Figure 3: Picture of the operational tractor. Sensor referencesystems are overlaid and also shown on a top-down sketch.The position of the tractor is relative to its centre of massin UTM coordinates.

4. Sparse trees Sparse vegetation, robot can ea-sily manoeuvre between the trees while avoi-ding physical contact, see Fig.4.4.

2.3. Perception

The perception system is designed to provide acompact set of signals that can discriminate bet-ween the selected states (Sec.2.2). Due to the limi-ted resolution of the sensors, and to obtain robustfeatures, the system estimates: (a) Amount of vi-sible ground. (b) Space occupied by obstacles. (c)Linear features such as fences or walls. (d) Freespace around the tractor. (e) Vegetation permea-bility. Visible ground plane and space occupied byobstacles are not mutually exclusive because the 3Ddata makes it possible to observe large amounts ofground while in presence of obstacles. Sparse trees,for example, do not prevent observing the groundplane, even if the crown of the trees form large obs-tacles that block the line of view. Fig.5 shows howlaser and vision streams are fused to get the abovefeatures.

2.3.1. Signal ygp: Visible Ground plane

The approach followed for extracting groundplane is similar to Konolige et al. (2009). Given a3D point cloud a RANSAC technique (Fischler andBolles, 1981) is used to construct a ground plane hy-potheses. This is done by: (a) Choosing three non-collinear points at random from the point cloud; (b)

3

(1)

(2)

(3)

(4)

Figure 4: Pictures from the left stereo-camera of the tractorfor each semantic state during summer. 1. Open field 2.Headland 3. Dense trees 4. Sparse trees

Estimating a plane from the chosen points; (c) Ran-king the plane estimates by the number of inliers.Fig.6.b shows an on-line estimated plane from the3D point cloud P in the form of the 3D grid map.

Let the found plane be written in the Hessiannormal form defining the unit normal vector n andthe distance along the line, p. A 2D cell grid iscreated on the estimated plane. Each cell cxy, de-fined by the interval Qxy, is classified as groundif a cloud point projected on the plane falls wi-thin it and its distance to the plane is less thanDmax. The amount of ground plane is the sum ofthese grid cells evaluated from n0 − n1 along thex-axis to m0 − m1 along the y-axis. For x ∈ P,

Figure 5: A single laser scan is enough to estimate theamount of free space and to classify the vegetation. By fu-sing the laser measurements with the synchronized 3D pointcloud a 3D grid map of the field of view is built and popula-ted. This is in order to determine the proportion of occupiedspace and its characteristics. Although the 3D cloud is noisy,it is alone sufficient to reliably estimate the ground plane.

cx,y =∣∣{x

∣∣ xx ∈ Qxy ∧ n · x+ p < Dmax

}∣∣ > 0

ygp =

n1∑x=n0

m1∑y=m0

cx,y.

2.3.2. Signal yo: Obstacles

The obstacle signal yo is constructed by projec-ting each measure from the stereo and laser into a3D grid map. The value of 1 is given to a 3D gridcell cxyz if a point falls inside the cell and its heightabove the ground plane is larger than Dmax. Thenumber of occupied cells is counted and used as ameasure of the obstacles in view. Fig.6 provides avisual illustration of the signal.

cxyz =∣∣ {

x∣∣ x ∈ Qxyz ∧ n · x+ p ≥ Dmax

}∣∣ > 0

yo =

n1∑x=n0

m1∑y=m0

k1∑

z=k0

cxyz.

2.3.3. Signal yls: Linear structures

This signal quantifies the presence of linear struc-tures in the environment such as walls, fences,or hedges. The 3D grid map of obstacles, des-cribed in Sec.2.3.2, is collapsed into a 2D gridmap on the ground plane by summing the num-ber of occupied cells along the vertical component,

gxy =k1∑

z=k0

cx,y,z. A RANSAC line-fitting algorithm

is then used on the 2D grid map to extract thestrongest line. A function l(a, b) returns the gridcells that intersect with the line parameterised bya and b, l (a, b) = {j |j ∈ gx,y , jy = ajx + b}. TheRANSAC algorithm finally attempts to maximise

4

(a) (b)

Figure 6: Ground plane and obstacles. (a) Left camera image taken in between dense trees. (b) Corresponding 3D grid map.The estimated ground plane and the obstacles are coloured with a dark and a light colour, respectively. Trees and the humanare correctly segmented out and classified in the grid map.

the sum of grid cells that intersect with the line,

yls = maxa,b|w|∑i=1

l (a, b). Fig.7 shows the steps of

the process using experimental data.

2.3.4. Signal yfs: Laser free space

The free space observed by the laser is the areaspanned by a laser scan. As depicted in Fig.8, eachpair of adjacent observations, si, si+1 (i = 1, . . . n),defines a triangle from the scanner. Its area can beevaluated by using the Heron’s formula,

di = ||li − li+1||2,

si =li + li+1 + di

2,

Ai =√si(si − li)(si − li+1)(si − di).

By summing the area of all triangles from a la-ser scan, an estimate of the free space is yfs =∑nl−1

i=0 Ai.

2.3.5. Signal yvp: Vegetation permeability

Considering a region of interest, the local spa-tial laser range distribution can be captured by theprincipal components of the spatial covariance ma-trix (Lalonde et al., 2006). By singular value de-composition the covariance matrix is decomposedinto principal components ordered by decreasing ei-genvalues. By examining the eigenvalues the vege-tation can be classified. Eigenvalues of similar ma-gnitude can be used to model/recognise bushes orgeneral foliage. If one eigenvalue dominate signifi-cantly, this is the signature of walls or other solidobjects. A maximum likelihood strategy for point

wise classification is used to distinguish between thetwo classes of objects. The details are provided inCaponetti and Blanke (2009).

3. Methods

This section recalls elements of stochastic auto-mata theory and introduces other methods used inthe paper.

3.1. Stochastic automaton

An autonomous vehicle in its environment is heremodelled as a discrete-event system subject to in-put that may cause a change of state of the model.Output is state-specific. The states in the model arethe locations identified in Sec.2.2. The dynamic be-haviour of the model is described by changes in thediscrete values of signals, referred to as events inthis framework. The system’s discrete input, stateand output are denoted by v, z and w and theirdiscrete value sets are enumerated as:

Input: v ∈ Nv ⊂ Q, Nv = {1, 2, . . . ,M} ,State: z ∈ Nz ⊂ Q, Nz = {1, 2, . . . , N} ,

Output: w ∈ Nw ⊂ Q, Nw = {1, 2, . . . , R} .Using the notation of Schroder (2003), an initialisedstochastic automaton is described by the five −tuple: S = (Nz,Nv,Nw, L, P (zk)) . Where P (zk)represents the set of state probabilities at time k.L is the behavioural function, the law that governsthe stochastic process underlying the automaton.

L : Nz ×Nw ×Nz ×Nv → [0, 1] ⊂ RL(z′, w, z, v) = P (zk+1 = z′, wk = w|zk = z, vk = v) .

5

(a) (b) (c)

Figure 7: Linear structures. (a) Laser reading overlaid on a top-down view of the 3D grid map. The lighter the colour, thebigger the number of measurements contained in the bin. (b-c) The strongest line estimated corresponds to the fence delimitingthe orchard.

Figure 8: Each pair of consecutive laser readings forms a triangle. By summing the area of all the triangles forming a scan,the laser free space is estimated.

3.2. Classification

Semantic mapping using the available perceptualdata is equivalent to solving an observation pro-blem for the modelling stochastic automaton. Gi-ven an input and output sequence and an initia-lised automaton S, the current state is obtainedby determining the conditional probability distribu-tion P (zk|k) = P (zk|V (0...k),W (0...k)), (Schroder,2003). The solution of the observation problem isgiven by the set of all the states zk to which the au-tomaton may move with non-zero probability, whileaccepting the input and generating the output se-quence specified. The a-posterior state probabilitydistribution can be evaluated on-line by iterativeapplication of a prediction and correction schemadiscussed in Schroder (2003), and briefly reported

here.

P (zk|k) =

∑zk+1

L(k)P (zk|k − 1)

∑zk,zk+1

L(k)P (zk|k − 1), k ≥ 0,(1)

P (zk|k − 1) =

∑zk−1

L(k − 1)P (zk−1|k − 2)

∑zk,zk−1

L(k − 1)P (zk−1|k − 2)(2)

P (z0| − 1) ::= P (z0)

The input vk and the output wk are argumentsin L(k) = L(zk+1, wk|zk, vk). Eq.(1) describeshow the prediction obtained from the previoustime point, Eq.(2), have to be corrected after thenew measurements v(k) and w(k) become available.The recursive formulation of the algorithm is publi-shed in Blanke et al. (2006), Chap. 8.

3.3. Quantisation of the signal spaces

Through quantisation, the real-valued signalsdescribed in Sec.2.3 are fed to the automaton.

6

For generality, consider a generic continuous signalfunction of time t, x(t), R→ R. A quantiser splitsthe signal space R into a finite number of disjointsets Qx(ξ) where ξ ∈ Nx ⊂ Q. With Qx(ξ) de-noting the set of values in x associated with thequantised value ξ, the quantiser function reads, interms of intervals,

Qx(ξ) = (xlowξ , xup

ξ ], ξ ∈ Nx = {1, . . . , Nx} (3)

where xlowξ and xup

ξ are the lower and upper boundof the quantisation respectively. The index ξ of thepartition Qx(ξ) to which the current value of x be-longs, represents the qualitative level of the signal.Schroder (2003) designs arbitrarily the quantisers,hence this paper will describe data-driven proce-dure to define the discretisation levels.Let x ⊂ R be a generic continuous signal sampled

in the time with uniform frequency. N samples aredrawn independently from the acquired signal todefine the subset γ = {xi, i = 1...N}. Each sampleis associated to the corresponding class z to definethe training data (xi, θi), i = 1, . . . , N .Since all xi are independent and identically-

distributed samples of a random variable, the pro-bability density function f(ρ) can be estimated bykernel density estimation, Parzen (1962).

f(ρ) ≈ 1

Nh

N∑

i=1

K(ρ− xi

h), ρ ∈ R, min(γ) ≤ ρ ≤ max(γ)

(4)where K is a kernel function, N is the number ofsamples and h is a smoothing parameter. Givena Gaussian kernel in the form of Eq.(5), the valueof h can be chosen to maximise the reconstructionperformance (Bowman and Azzalini (1997)),

K(ϕ) =1

2πe−

12ϕ

2

. (5)

Given the approximated probability density func-tion the cumulative density function F (x) is,

F (x) =

x∫

−inf

f(ρ)dρ ≈x∑

ρ=−inf

f(ρ). (6)

Defining

γz = {(xj , θj); ∀j : θj = z} (7)

as the subset of samples related to the state z,it is possible to use the above results to estimate

F (x|θ = z), for each z ∈ Nz. Fig.9 shows the re-sults of the procedure when applied to a syntheticdata-set.To simplify the notation, define

F (x)z = F (xj), xj ∈ γz, z ∈ Nz. (8)

Reverting to classification, the quantiser definedby Eq.(3) can be interpreted as a linear machine,which splits the continuous time signal x into seg-ments. A simple classifier would select the stateby looking at the discrete level in which the signalfalls. The probability that the robot is in state zwhile observing a discrete output equal to ξ is,

P (z|Qx(ξ)) =P (Qx(ξ)|z)P (z)∑

ζ∈Nz

P (Qx(ξ)|ζ)P (ζ). (9)

Rewriting Eq.(9) using Eq.(3),

P (θ = z|xlowξ < x ≤ xup

ξ )

=P (xlow

ξ < x ≤ xupξ |θ = z)P (θ = z)∑

ζ∈Nz

P (xlowξ < x ≤ xup

ξ |θ = ζ)P (θ = ζ).

Further, from Eq.(8), the conditional probabilityis a function of the conditional cumulative densityfunction,

P (Qx(ξ)|θ = z) = P (xlowξ < x ≤ xup

ξ |θ = z) = F (xupξ )z−F (xlow

ξ )z.

Hence,

P (z|Qx(ξ)) =P (Qx(ξ)|z)P (z)∑

ζ∈Nz

P (Qx(ξ)|ζ)P (ζ)

=

(F (xup

ξ )z − F (xlowξ )z

)P (z)

∑

ζ∈Nz

(F (xup

ξ )ζ − F (xlowξ )ζ

)P (ζ)

. (10)

Eq.(10) describes the probability that the systemis in the semantic state z while the continuous si-gnal x is contained in the quantised level ξ. Thisinformation is used to define the quantisation levelsby maximising the probability that the robot is in astate z while x ∈ Qx(ξ) and minimising the numberof levels ξ, (|Nx|).

min|Nx|

maxQx(ξ)

P (z|Qx(ξ)) , ξ ∈ Nx, z ∈ Nz. (11)

Non-informative intervals, with low discriminativeperformance are merged with the confining interval.

7

0 5 10 15 200

10

20

30

40

50

60

70

80

90

Signal 1

Pro

babi

lity

dist

ribut

ion

State 2State 1State 3

(a)

0 5 10 150

10

20

30

40

50

60

70

Signal 2

Pro

babi

lity

dist

ribut

ion


(b)

0 5 10 15 200

0.05

0.1

0.15

0.2

0.25

0.3

0.35

Signal 1

Pro

babi

lity

dens

ity fu

nctio

n


(c)

0 5 10 150

0.05

0.1

0.15

0.2

0.25

0.3

0.35

Signal 2

Pro

babi

lity

dens

ity fu

nctio

n


(d)

Figure 9: Synthetic data-set composed of two signals generated by three states. (a)(b) Two continuous independent signalsare uniformly sampled and grouped in states according to a hand-labelled classification. (c)(d) State-conditioned probabilitydistribution estimated via kernel density estimation

Since the variability space of continuous variablescannot be known exactly, the quantisation levelsare defined in the probability space. Probabilitiesare delimited to the set [0, 1] ⊂ R so the wholespace is sampled as: {pj}, pj ∈ [0, 1] ⊂ R; j =1, . . . , Np. For each pj a value in the signal spacecan be found from the state conditional cumulativedensity function. Summarising, the algorithm tofind initial estimates of the quantisation levels is,

∀ pj find xj such that F (xj)z = pj , z ∈ Nz

X = {xi : F (xi)z = pj , ∀z ∈ Nz, j = 1...Np, i = 1...Nx}Fig.10 shows the procedure an example signal ofFig.9.Supposing the model composed by N states, X

contains Nx = N ·Np points. By sorting X , theinitial quantisation intervals are defined as:

Qx(1) =(−∞, x1]

Qx(2) =(x1, x2]

...

Qx(ξ) =(xi, xi+1], ξ ∈ Nx (12)

...

Qx(Nx) =(xNx , ∞)

Each quantisation level ξ ∈ Nx is associated to thestate z for which the condition max

zP (θξ = z|Qx(ξ))

holds. Two quantisers, Qx(ξ) and Qx(ξ + 1) aretherefore merged if:

P (θξ+1 = θξ|Qx(ξ+1)) ≥ P (θξ+1 = ζ|Qx(ξ+1)), ζ 6= θξ,∀ζ ∈ Nx,(13)

The conditional probability of the resulting mergedquantisation interval is re-evaluated before beingcompared to the succeeding interval Qx(ξ + 2).By this procedure the number of levels decreasesdrastically while fulfilling the merging condition, asshown by Fig.11 referring to Fig.9 datasets.

3.4. Model identification

The estimation of the behavioural relation foreach transition is done by applying the identifica-tion algorithm documented by Blanke et al. (2006).From a hand-classified data-set the transition pro-babilities are approximated by frequency count.For large sample sizes not all state transition for allthe possible input/output couples are found, yiel-ding to L(z′, w|z, f, v) = 0 even if the transitionz → z′ is feasible. To overcome these limitations,the same proportion of training points were selec-ted for each state while a bias in the state transitionmatrix was added for transitions that might not be

8

0 2 4 6 8 10 12

0.1

0.3

0.5

0.7

0.9

State 1 conditioned cdf

Signal 1

Cum

ulat

ive

dens

ity fu

nctio

n

0 5 10 15 20

0.1

0.3

0.5

0.7

0.9


Signal 1

Cum

ulat

ive

dens

ity fu

nctio

n

0 2 4 6 8 10

0.1

0.3

0.5

0.7

0.9


Signal 1

Cum

ulat

ive

dens

ity fu

nctio

n

︸︷︷︸

0 5 10 15 20

0.1

0.3

0.5

0.7

0.9

State conditioned cdf

Signal 1

Cum

ulat

ive

dens

ity fu

nctio

n


Figure 10: The probability space is uniformly sampled between 0.1 and 0.9 with steps of 0.1. For each level, the correspondingsignal value is mapped on the state-conditional cumulative distribution function.

0 5 10 15 200

0.05

0.1

0.15

0.2

0.25

0.3

0.35

Signal 1

Pro

babi

lity

dens

ity fu

nctio

n


Qx(1)Qx(2)Qx(3)

0 5 10 150

0.05

0.1

0.15

0.2

0.25

0.3

0.35

Signal 2

Pro

babi

lity

dens

ity fu

nctio

n


Qx(1)Qx(2)Qx(3)

Figure 11: Estimated discretisation levels resulting by the application of the proposed procedure on the synthetic data. Theinitial quantisation levels estimated, shown in Fig.10, are reduced by application of Eq.(13).

represented in the sample sequence but are physi-cally feasible.

4. Semantic mapping results

This section describes how the above methods areused for semantic mapping and field test results arepresented.

4.1. Model design

According to Sec.2.2 the modelling stochastic au-tomaton is composed of N = 4 states,

Nz = {Open field, Headland, Dense trees, Sparse trees} = {1, 2, 3, 4}.

The signals used by the robot for semantic mappingare divided in two sets with respect to the automa-ton: input u ∈ Rm and output y ∈ Rr.

9

4.1.1. Discrete-valued input

For semantic mapping, robot motion is theonly input considered u = [um], hence Nv ={moving, stand still} = {1, 2}. This is done to di-sallow state transitions when the tractor is stopped.

4.1.2. Discrete-valued output

Collecting the signals described in Sec.2.3,the observed output consist of a vector y =[ygp, yfs, yls, yo, yvp] of continuous signals. The pro-cedure introduced in section 3.3 is here used to de-sign quantisers for the perceptual data. The γ setof Eq.(7) is populated by randomly picking samplesfrom a training set. The state-conditioned proba-bility distributions are obtained by kernel densityestimation and shown in Fig.12. Perceptual alia-sing is recognisable in the probability space as anoverlap of the distribution curves. This problem ishandled by creating quantisation levels for each re-gion of interest and by combining all the signals inthe automaton and using the state transition mo-del.

4.2. Classification results

The validation data-set was recorded during arun which covered the track shown in Fig.2. Thepath first passed through an apple orchard (densetrees), then followed the back fence (headland) to apear orchard (sparse trees), the path returned alongthe line, crossed over behind some nut-trees andtook the back route home to the garage. The runwas made in summer time to catch one extremumof the scenario. Full grown foliage, bushes and treebranches hanging increased the variability of eachzone. To stress more the robustness of the methods,people were moving or standing in the tractor fieldof view during the data recording.

4.2.1. State-of-art algorithms

Classification problems appear frequently in dif-ferent areas of science and technology and severalstate-of-art methods are available. Support Vec-tor Machines (SVM), Adaboost, Gaussian mixtureemitting Hidden Markov Models (GHMM) (Wolfand Sukhatme, 2008) and Finite State Machines(FSM) provide different classification methods foruse in semantic mapping. The SVM technique isbased on statistical learning theory and is usedfor classification and regression problems, Vapnik(2006). Adaboost is a technique introduced byFreund and Schapire (1997) that linearly combines

simple, weak classifiers on the basis of classificationperformance obtained on a training set. Such clas-sifiers have been used previously Mozos et al. (2005)to classify indoor places. Later, several approacheswere proposed in literature to improve performanceby taking advantage of object recognition (Nuch-ter et al., 2005) and probabilistic environment mo-dels (Mozos et al., 2007). A GHMM consists ofa discrete-time and discrete-space Markov processthat contains some hidden parameters and emitsobservable outputs (Rabiner, 1989). For semanticmapping, a GHMM could be built for each pos-sible state by using labelled observations to trainGaussian mixtures that characterise emission. Fi-nite state machines were used to classify film scenesfor information retrieval in Zhai et al. (2004) wherestructural information, together with low and mid-level features, were used to classify the scenes.

4.2.2. Comparison of algorithms

To benchmark the solutions, the Matlab imple-mentation of the automata was compared to opensource libraries for SVM, Adaboost and GHMM.The automaton was configured so that all the statetransitions were allowed in order to make the com-parison fair with the other methods. Four standardSVM kernels were tested: linear, polynomial (of de-gree 3), radial basis function (RBF), and sigmoid.Kernel parameters were fine-tuned by an iterativeprocedure based on the training data. The packageLIBSVM by Chang and Lin (2001) was used forlearning and classification. Adaboost was set to usea maximum number of weak classifiers of 10. Theimplementation was based on Matlab and configu-red with tree stumps as weak classifiers. The BayesNet Toolbox for Matlab was used to produce theresults presented regarding the GHMM.A K-fold cross validation procedure was perfor-

med to evaluate the variability of the results. Thedata collected was split by random sampling in Kdisjoint sets. K − 1 sets were merged and usedas training while the remaining data were used asvalidation. In this case, the data collected was com-posed of 2281 synchronised laser and vision obser-vations. Due to the amount of data K was chosenequal to 2. In this way the generalisation capa-bilities together with the robustness were stressedby using half of the available data-set for trainingand half for testing. The performances were evalua-ted by collecting the classification results from fiveindependent runs of the 2-fold validation as shownby Fig.13. In addition to the classification rates the

10

20 40 60 80 1000

0.02

0.04

0.06

0.08

0.1

0.12

0.14

Laser free space

Pro

babi

lity

dens

ity fu

nctio

n

OpenFieldHeadlandDenseTreesSparseTrees

(a)

0 2 4 6 8 100

0.2

0.4

0.6

0.8

1

1.2

1.4

Obstacles

Pro

babi

lity

dens

ity fu

nctio

n


(b)

0 10 20 30 40 50 60 70 800

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

0.2

Visible ground plane

Pro

babi

lity

dens

ity fu

nctio

n


(c)

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

1

2

3

4

5

6

Linear structures

Pro

babi

lity

dens

ity fu

nctio

n


(d)

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.90

2

4

6

8

10

12

14

16

18

20

Vegetation permeability

Pro

babi

lity

dens

ity fu

nctio

n


(e)

Figure 12: Refined quantisation levels overlaid on state conditional probability density functions. Levels are represented asvertical dotted lines and conditional PDFs as lines with different styles. (a) Laser free space (b) Obstacles (c) Visible groundspace (d) Linear structures (e) Vegetation permeability

confusion matrices of each method have been eva-luated and shown in Fig. 14. Each row of the ma-trices represents the instances in a predicted class,while each column represents the instances in anactual class.

Classification method

Correct classification ratio

Aut Lin Pol Rbf Sig Adb Ghmm0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0.84 0.84 0.90 0.90 0.84 0.84 0.85

Figure 13: Correct classification rate for all the methods.The performances were evaluated by collecting the results offive runs with a 2-fold cross validation process (both trai-ning and validation were done using two disjoint sets of 1141samples of five features). The standard deviation of the re-sults has been represented as an error line for each bar.

Polynomial and RBF kernel based SVMs showthe best classification rates. The GHMM shows theworst overall performance, which is due to singula-rity problems that in turn are mapped to the clas-sification rate variance. All the methods have pro-blems distinguishing between state 1 and 4, apartfrom the GHMM. Sparse orchards, are characteri-sed by spaced trees, letting either stereovision andlaser perceive only few obstacles. Missing trees inthe sparse orchard are then labelled as open areas.For this reason, only the GHMM has the requireddegree of freedom to discriminate between the twostates but suffers from training problems, which inturn worsens the overall performance. The auto-mata shows similar performance to the other state-of-art methods, even though it employs quantisedsignals. This shows that the information loss inquantising the signals is minimal. The design of theraw signals has been made in order to have signalsas independent as possible. This is further demons-trated by the fact that the automata can achievethe same performances as the other methods. Theadvantage of working with quantised signals is thatit simplifies the classification problem by reducingthe amount of data that goes in.

In Fig.15 the timings for both training and clas-sification are shown for each method. SVMs clearly

11

require the longest training time and show the worstscalability. This is due to the fact that the si-gnals are not bounded, and the implementationutilised here has problems handling this. Regar-ding the classification times, GHMMs are the slo-west. The SVMs again seem to scale the worst.One exception is the linear SVM, which is fasterthan the automata but the confusion matrix showsthe worst performance in terms of discrimination.Adaboost performs remarkably well for both trai-ning and classification timings. However, the out-put of Adaboost does not give information aboutthe confidence of the estimate. This makes the me-thod unsuitable for supervision and diagnosis taskswhere low confidence estimates should not triggerfalse alarms. The automata trains faster than SVMand classifies faster than the GHMMs. It showsgood scalability properties and outputs a confidenceestimate unlike Adaboost. Automata-based classi-fiers can be implemented efficiently compared to theother methods, which makes them suitable for ro-botic hardware.

5. Conclusion

This paper presented a novel approach for out-door semantic mapping by means of a stochasticautomaton. A main advantage of the automatacompared to other classification methods was astraightforward inclusion of how the dynamical sys-tem evolves over time. In the case study of theorchard presented here, the tractor motion and thespatial connection of environments formed an intui-tive basis for a model. Spurious observations wereeffectively dealt with in the updating method forstate belief but had a slight penalty in the form oflower adherence to the ground truth during transi-tions. This behaviour could be fine-tuned accordingto the needs of a particular use of the algorithm.A case study of an autonomous vehicle in an

orchard showed the properties of the automata-based diagnosis with optimised signal quantisation.A comparison with state-of-the-art classifiers wasmade on the orchard data. Results showed thatthe automata-based method trains faster than SVMand classifies faster than the GHMMs. The au-tomata approach shows good scalability propertiesand outputs a confidence estimate, an essential fea-ture to avoid false alarms from low-confidence hy-pothesis results. Automata-based classifiers wereshown to be efficient compared to the other me-thods, and they were therefore found attractive for

implementation in robotic environments with hardreal-time constraints.The method was used to optimise quantisation is

general and could well be applied to general classi-fication problems.

6. Acknowledgements

The support from the Danish Ministry of FoodAgriculture and Fisheries, under contract 3412-06-01729 is gratefully acknowledged. Our colleaguesDr. J.C. Andersen from the Technical Universityof Denmark, Dr. H-W. Griepentrog and Mr. J.Resting-Jeppesen, both from Copenhagen Univer-sity, Department of Life Sciences, are gratefully ack-nowledged for allowing access to equipment anddata. Hako Werke is acknowledged for providingthe tractor used for the orchard experiments.

References

Blanke, M., Kinnaert, M., Lunze, J., Staroswiecki, M., 2006.Diagnosis and fault tolerant control, 2nd Edition. Sprin-ger.

Bouguerra, A., Karlsson, L., Saffiotti, A., 2008. Monitoringthe execution of robot plans using semantic knowledge.Robotics and autonomous systems 56, 942–954.

Bowman, A. W., Azzalini, A., 1997. Applied SmoothingTechniques for Data Analysis. Oxford University Press.

Caponetti, F., Blanke, M., 2009. Combining stochastic auto-mata and classification techniques for supervision and safeorchard navigation. In: 7th IFAC Symposium on FaultDetection, Supervision and Safety of Technical Processes.

Chang, C.-C., Lin, C.-J., 2001. LIBSVM: a libraryfor support vector machines. Software available athttp://www.csie.ntu.edu.tw/ cjlin/libsvm.

Fischler, M. A., Bolles, R. C., June 1981. Random sampleconsensus: A paradigm for model fitting with applicationsto image analysis and automated cartography. Graphicsand Image Processing 24 (6).

Freund, Y., Schapire, R. E., 1997. A decision-theoretic gene-ralization of on-line learning and an application to boos-ting. Journal of Computer and System Sciences (55).

Galindo, C., Fernandez-Madrigal, J.-A., Gonzalez, J., Saf-fiotti, A., 2008. Robot task planning using semantic maps.Robotics and autonomous systems 56 (11), 955–966.

Griepentrog, H. W., Andersen, N. A., Andersen, J., Blanke,M., Heinemann, O., Madsen, T., Pedersen, S., Ravn, O.,Wulfsohn, D., July 2009. Safe and reliable - further deve-lopment of a field robot. In: Proc. 7th European Confe-rence on Precision Agriculture (ECPA). Academic Publi-shers, Wageningen.

Konolige, K., Agrawal, M., Blas, M. R., Bolles, R. C., Ger-key, B., Sola, J., Sundaresan, A., 2009. Mapping, naviga-tion, and learning for off-road traversal. J. of Field Robo-tics 26 (1), 88–113.

Lalonde, J., Vandapel, N., Huber, D., Hebert, M., October2006. Natural terrain classification using three dimensio-nal ladar data for ground robot mobility. Journal of fieldrobotics 23 (10), 839–861.

12

0.19

0.02

0.00

0.06

0.06

0.93

0.02

0.01

0.03

0.04

0.97

0.01

0.71

0.01

0.00

0.92

Tru

e cl

ass

Linear SVM1 2 3 4

1

2

3

4

0.52

0.03

0.00

0.04

0.06

0.95

0.01

0.01

0.03

0.02

0.98

0.01

0.40

0.01

0.00

0.94

Tru

e cl

ass

Polynomial SVM1 2 3 4

1

2

3

4

0.54

0.03

0.00

0.05

0.05

0.96

0.02

0.00

0.02

0.01

0.98

0.01

0.39

0.00

0.00

0.94

Tru

e cl

ass

RBF SVM1 2 3 4

1

2

3

4

0.16

0.03

0.00

0.03

0.07

0.92

0.03

0.02

0.03

0.04

0.97

0.01

0.74

0.01

0.01

0.94

Tru

e cl

ass

Sigmoid SVM1 2 3 4

1

2

3

4

0.22

0.00

0.00

0.03

0.09

0.97

0.08

0.05

0.09

0.03

0.91

0.00

0.60

0.00

0.01

0.91

Tru

e cl

ass

Adaboost1 2 3 4

1

2

3

4

0.80

0.13

0.00

0.00

0.13

0.87

0.20

0.10

0.00

0.00

0.70

0.00

0.07

0.00

0.10

0.90

Tru

e cl

ass

GHMM1 2 3 4

1

2

3

4

0.35

0.02

0.01

0.02

0.04

0.88

0.03

0.03

0.04

0.05

0.93

0.01

0.58

0.05

0.03

0.94

Tru

e cl

ass

Automata1 2 3 4

1

2

3

4

Figure 14: Mean confusion matrices collected evaluating theperformances from five trials of a 2-fold cross validation.Rows of the matrices represent the instances of predictedclass and columns the instances of an actual class. The dar-ker the square, the higher the classification rate.

5 10 150

50

100

150

200

Number of features

Tra

inin

g tim

e [s

]

SVM−LinearSVM−RBFSVM−SIGAutomatonAdaBoostGHMM

5 10 150

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Number of features

Cla

ssifi

catio

n tim

e [s

]

SVM−LinearSVM−RBFSVM−SIGAutomatonAdaBoostGHMM

Figure 15: Comparison of training and classification timefor evaluated methods. The test dataset is composed of 1141samples with five features per sample. For 10 and 15 featuresthe existing five features have been replicated two and threetimes to see how algorithms scale.

LaValle, S. M., 2006. Planning Algorithms. Cam-bridge University Press, Cambridge, U.K., available athttp://planning.cs.uiuc.edu/.

Lunze, J., June 2001. Diagnosis of quantised systems. FaultDetection, Supervision and Safety for Technical Processes2000 1 (1), 29–40.

Mettler, B., Dadkhah, N., Kong, Z., July 2010. Agile au-tonomous guidance using spatial value functions. ControlEngineering Practice 18 (7), 773–788.

Mozos, O., Stachniss, C., Burgard, W., April 2005. Super-vised learning of places from range data using adaboost.Proceedings of the 2005 IEEE International Conferenceon Robotics and Automation.

Mozos, O. M., Jensfelt, P., Zender, H., Kruijff, G. J., Bur-gard, W., April 2007. From labels to semantics: An in-tegrated system for conceptual spatial representations ofindoor environments for mobile robots. In: Workshop ”Se-mantic information in robotics” at the IEEE International

13

Conference on Robotics and Automation.Nuchter, A., Wulf, O., Lingemann, K., Hertzberg, J., Wag-

ner, B., Surmann, H., 2005. 3d mapping with semanticknowledge. RoboCup International Symposium, 335–346.

Parzen, E., 1962. On estimation of a probability density func-tion and mode. Ann. Math. Stat. 33, 1065–1076.

Rabiner, L., February 1989. A tutorial on hidden markovmodels and selected applications in speech recognition.Proc. IEEE 77 (2), 257–286.

Schroder, J., 2003. Modeling, state observation and diagnosis

of quantised systems. No. 282 in Lecture notes in controland information science. Springer.

Vapnik, V., 2006. Estimation of Dependences Based on Em-pirical Data. Information Science and Statistics. Springer.

Wolf, D. F., Sukhatme, G. S., April 2008. Semantic map-ping using mobile robots. IEEE Transactions on Robotics24 (2), 245–258.

Zhai, Y., Rasheed, Z., Shah, M., 2004. A framework for se-mantic classification of scenes using finite state machines.Lecture notes in computer science 3115, 279–288.

14

Stochastic Automata for Outdoor Semantic Mapping using … · 2017. 12. 18. · Stochastic Automata...

Documents

Transcript of Stochastic Automata for Outdoor Semantic Mapping using … · 2017. 12. 18. · Stochastic Automata...