2D-environment navigation using a neural network

Cognitive Robotics Project

Di Lecce Arturo

POLITECNICO DI MILANO


Abstract

The purpose of this project is to simulate a robot that can navigate in a 2D environment while avoiding obstacles.

What we need:

1. a robot with some kind of obstacle sensors

2. a controller for obstacle avoidance


The robot

The robot is a two-wheeled, circular rover equipped with:

1) touch sensors

2) range finder sensors

[Figure: robot body (diameter = 1 robot unit), head direction, and sensor range indicators]

The robot - touch sensors


Two strips of touch sensors:
- one from 0° to 45° with respect to the head direction (left)
- one from -45° to 0° with respect to the head direction (right)

LEFT TOUCH SENSORS detect objects at 0.6 r.u. in [0°, 45°]

RIGHT TOUCH SENSORS detect objects at 0.6 r.u. in [-45°, 0°]

The robot - range finder sensors


Two range finder sensors:
- one from 0° to 45° with respect to the head direction (left)
- one from -45° to 0° with respect to the head direction (right)

LEFT RANGE FINDER starts to detect objects at 11 r.u. in [0°, 45°]

RIGHT RANGE FINDER starts to detect objects at 11 r.u. in [-45°, 0°]
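The sensor layout described on the last two slides can be sketched as follows. This is a minimal illustration, not the project's code: the function names, the use of a single point obstacle, measuring distance from the robot centre, and assigning a bearing of exactly 0° to the left sector are all assumptions. Only the 0.6 r.u. and 11 r.u. ranges and the 45° sectors come from the slides.

```python
import math

TOUCH_RANGE = 0.6            # robot units (from the slides)
FINDER_RANGE = 11.0          # robot units (from the slides)
SECTOR = math.radians(45.0)  # half-width of the sensed cone (from the slides)


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def sense(pose, obstacle):
    """Report which sensors detect a point obstacle (hypothetical helper).

    pose     -- (x, y, heading) with heading in radians, counterclockwise
    obstacle -- (x, y) position of the obstacle
    """
    x, y, heading = pose
    ox, oy = obstacle
    # Assumption: distance is measured from the robot centre.
    dist = math.hypot(ox - x, oy - y)
    # Bearing relative to the head direction; positive means "to the left".
    bearing = wrap_angle(math.atan2(oy - y, ox - x) - heading)
    left = 0.0 <= bearing <= SECTOR
    right = -SECTOR <= bearing < 0.0
    return {
        "left_touch": left and dist <= TOUCH_RANGE,
        "right_touch": right and dist <= TOUCH_RANGE,
        "left_range": left and dist <= FINDER_RANGE,
        "right_range": right and dist <= FINDER_RANGE,
    }
```

For example, `sense((0.0, 0.0, 0.0), (5.0, -1.0))` reports only the right range finder, because the obstacle lies slightly to the right of the heading and well outside touch range.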

Neural network controller


“Learning Anticipation via Spiking Networks: Application to Navigation Control” (Arena et al., 2009)

● Describes a 2-layer neural network controller for robot navigation in 2D spaces, capable of obstacle avoidance and target approaching

● We focus on obstacle avoidance

Neural network controller - the spiking network (1/3)


The neuron model used for this controller is the one proposed by Eugene M. Izhikevich in his paper “Simple Model of Spiking Neurons” (2003).

Neural network controller - the spiking network (2/3)


Depending on four parameters (a, b, c, d), the model reproduces the spiking and bursting behavior of known types of cortical neurons.

● a is the time scale of the recovery variable u(t)

● b is the sensitivity of the recovery variable u(t)

● c is the after-spike reset value of the membrane potential v(t)

● d is the after-spike reset of the recovery variable u(t) (u is incremented by d)
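As an illustration of how the four parameters act, here is a minimal sketch of the Izhikevich (2003) model integrated with a simple Euler scheme: v' = 0.04v² + 5v + 140 - u + I, u' = a(bv - u), with v reset to c and u incremented by d when v reaches 30 mV. The function name, the time step, the number of steps, and the constant input current are assumptions made for this example and are not taken from the project.

```python
def simulate_izhikevich(a, b, c, d, current, steps=1000, dt=0.5):
    """Return the spike times (ms) of one Izhikevich neuron.

    current -- constant input current I (illustrative)
    dt      -- Euler integration step in ms (assumed value)
    """
    v = -65.0   # membrane potential, initial value used in Izhikevich's paper
    u = b * v   # recovery variable
    spike_times = []
    for k in range(steps):
        if v >= 30.0:              # spike threshold of the model
            spike_times.append(k * dt)
            v = c                  # after-spike reset of v
            u += d                 # after-spike reset of u
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt * dv
        u += dt * du
    return spike_times


# "Regular spiking" parameters from Izhikevich's paper.
spikes = simulate_izhikevich(a=0.02, b=0.2, c=-65.0, d=8.0, current=10.0)
print(len(spikes), "spikes in", 1000 * 0.5, "ms")
```

With no input current the same neuron settles at its resting potential and stays silent, which is why the controller needs sensor input or the Go-On source to produce motor spikes.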

Neural network controller - the spiking network (3/3)


Neural network controller – controller structure (1/3)


Composed of:

● 2 unconditioned stimuli neurons (US)

● 2 conditioned stimuli neurons (CS)

● a constant source of neural spikes (Go-On)

● 2 motor neurons RMgo & LMgo

● 2 motor neurons RMturn & LMturn



[Figure: controller network showing the unconditioned stimuli neurons, the Go-On motor neurons, the Turn motor neurons, and the conditioned stimuli neurons]



2 kinds of inputs to the 2nd-layer neurons:

● excitatory synapses (like USL → RMTURN), marked with an arrow

● inhibitory synapses (like USL → RMGO), marked with a dot

Neural network controller – controller integration


[Figure: the left and right touch sensors and the left and right range finders feed the controller, which drives the left and right motors]

Neural network controller – controller inputs


● Unconditioned stimuli & touch sensors

An unconditioned stimulus neuron receives input from its touch sensor when the distance between the robot and the nearest obstacle is <= 0.6 robot units.

● Conditioned stimuli & range finders

The input function of the range finders (where d0 is the obstacle distance) makes them start to fire at a distance of approximately 11 robot units.

Neural network controller – neuron input


The synaptic input to a generic neuron j is the weighted sum of the contributions of the spikes emitted by the neurons connected to it, where:

● wij represents the weight of the synapse from neuron i to neuron j

● ti is the instant at which a generic neuron i, connected to neuron j, emits a spike

● The function ε(t) is
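The weighted sum described above can be sketched as follows. The kernel shape used here, ε(t) = (t/τ)·exp(1 - t/τ) for t ≥ 0 and 0 otherwise, is an assumption (a common alpha-function choice for spike response kernels), as are the value of τ, the function names, and the choice to sum over all past spikes of each presynaptic neuron. Negative weights stand for inhibitory synapses.

```python
import math


def epsilon(t, tau=5.0):
    """Assumed alpha-function kernel; peaks at t = tau with value 1."""
    if t < 0.0:
        return 0.0
    return (t / tau) * math.exp(1.0 - t / tau)


def synaptic_input(t, weights, spike_times, tau=5.0):
    """Synaptic input to neuron j at time t (hypothetical helper).

    weights     -- weights[i] is w_ij, the weight of the synapse i -> j
    spike_times -- spike_times[i] lists the instants t_i at which
                   presynaptic neuron i emitted a spike
    """
    total = 0.0
    for w_ij, times in zip(weights, spike_times):
        for t_i in times:
            total += w_ij * epsilon(t - t_i, tau)
    return total
```

A spike that has not happened yet (t < ti) contributes nothing, so the input at any time depends only on past spikes.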

Neural network controller – controller outputs


A single motor input is the sum of:

● the number of spikes emitted by the GoOn neuron

● the number of spikes emitted by the Turn neuron

Neural network controller – synaptic weights learning


Synaptic weights for the CS → Turn motor synapses are learned continuously according to the STDP rule, where Δt is the difference between the spiking time of the presynaptic neuron and that of the postsynaptic one, and the remaining quantities are parameters of the learning algorithm.

To prevent the weights from increasing steadily, a weight decay has been introduced.
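A minimal sketch of an STDP update with weight decay, assuming the standard exponential STDP window. The parameter names A+, A-, τ+, τ- appear later in the slides, but every numeric value below, the multiplicative form of the decay, and the clipping bounds are illustrative assumptions rather than the values used in the project.

```python
import math

# Illustrative values only (assumptions, not taken from the slides).
A_PLUS, A_MINUS = 0.02, -0.02
TAU_PLUS, TAU_MINUS = 20.0, 20.0


def stdp_delta(dt):
    """Weight change for dt = t_pre - t_post.

    dt < 0  (pre fires before post): potentiation
    dt >= 0 (pre fires after post):  depression
    """
    if dt < 0.0:
        return A_PLUS * math.exp(dt / TAU_PLUS)
    return A_MINUS * math.exp(-dt / TAU_MINUS)


def update_weight(w, dt, decay=0.999, w_min=0.0, w_max=1.0):
    """Apply an assumed multiplicative decay, then STDP, then clip."""
    w = decay * w + stdp_delta(dt)
    return min(max(w, w_min), w_max)
```

The closer the two spikes are in time, the larger the change; the decay term slowly pulls unused weights back toward zero.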

Neural network controller – controller behavior (1/2)


● The motor driver signal depends on the number of spikes in the output neurons assigned to the motor

● GoOn motor neurons generate the spike train needed to let the robot advance in the forward direction

● The spikes of the Turn motor neurons are then added to those of the GoOn neurons

● In the presence of collisions, GoOn neurons are inhibited and the forward movement is suppressed

● When the left and right motor neurons emit an equal number of spikes, the robot moves forward with a speed proportional to the number of spikes. In the absence of conditioned stimuli, the amplitude of the forward movement is about 0.3 r.u. per step

FORWARD MOVEMENT

Neural network controller – controller behavior (2/2)


● When there is a difference in the number of spikes emitted by the left and right motor neurons, the robot rotates. The angle of rotation (in the counterclockwise direction) is:

θ = 0.14 * Δns, where Δns = nr - nl

● We count the spikes emitted by both the left and the right neuron, and the robot advances with a speed proportional to the smaller of the two counts

ROTATION
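The forward and rotation rules from the last two slides can be combined into a single pose update. This is a sketch under stated assumptions: the 0.14 factor and the use of the minimum spike count come from the slides, while the angle unit (radians), the forward gain of 0.1 r.u. per spike, the rotate-then-advance order, and the function name are hypothetical.

```python
import math

TURN_GAIN = 0.14  # from the slides; assumed here to be in radians per spike


def step_pose(pose, n_left, n_right, forward_gain=0.1):
    """Advance the robot pose by one simulation step (hypothetical helper).

    pose            -- (x, y, theta), theta counterclockwise in radians
    n_left, n_right -- spikes counted for the left and right motor
                       (GoOn spikes plus Turn spikes for that side)
    forward_gain    -- assumed distance in r.u. travelled per spike
    """
    x, y, theta = pose
    # Rotation is proportional to the spike-count difference nr - nl.
    theta += TURN_GAIN * (n_right - n_left)
    # Forward speed is proportional to the smaller of the two counts.
    dist = forward_gain * min(n_left, n_right)
    x += dist * math.cos(theta)
    y += dist * math.sin(theta)
    return (x, y, theta)
```

With equal counts the heading is unchanged and the robot moves straight ahead; with the assumed gain, three spikes per side reproduce the forward step of about 0.3 r.u. mentioned in the slides.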

Simulation - Intro


Main simulation function:

Parameters:

● posn: start position of the robot [x, y, θ]

● map_name: relative path of the map image (.png)

● steps: number of simulation steps

● w: initial weights (if 0, the weights are initialized inside the function)

● debug: flag for debugging and showing some graphs

● drawPath: flag for drawing the robot movements at the end of the simulation

Returns:

● posn: final position of the robot

● w: weights at the end of the simulation

Simulation – neural network representation


Membrane potentials

Recovery variables

Inputs

Simulation – some settings


Simulation – soft sensor


offset = π/2, delta = π/2

Simulation – neuron input function


Simulation – setting inputs

+ if excitatory input

- if inhibitory input

Touch sensors input

Range finders input


Simulation – weights update


Simulation – controller (1/2)


Retrieve the number of spikes of the LM & RM go and turn neurons

Go forward

Rotate

Simulation – controller (2/2)


Compute new position

Update distances

Simulation 1 – Results (1/3)


First simulation: 200 timesteps

Weights have not yet been learned! If the touch sensors detect an obstacle, forward movement is inhibited and the robot turns.

After 1600 timesteps

From 1600 to 2300 timesteps

Weights are learned and, if the range sensors detect an obstacle, the robot turns.

Weights learning curves

Simulation – Some problems (1/2)


Weights reach their maximum/minimum values after some thousands of simulation steps

In the simulation by Arena et al., this doesn't happen!

Simulation – What can be wrong?


The simulation result depends on many factors:

● Neuron model parameters (a, b, c, d): they influence spike rate, sensitivity, and default membrane potential

● STDP parameters (A+, A-, Tau+, Tau-): they influence the weights' learning rate

● Simulation step: lower values increase the precision of the spike time variables, so the weights are learned better, but the simulation time grows

● Robot's speed

● Soft touch/range sensor algorithm

● Different simulation environment (Arena et al. used C++)

Simulation – Goin' further ...


● It's possible to integrate target approaching by adding an interneuron layer to the controller and a visual input sensor (a camera)
