Rakesh Gosangi PRISM lab Department of Computer Science and Engineering Texas A&M University
Slide 1
Biologically-inspired robot spatial cognition based on rat neurophysiological studies
Alejandra Barrera and Alfredo Weitzenfeld, Auton Robot 2008
Rakesh Gosangi, PRISM lab, Department of Computer Science and Engineering, Texas A&M University

Outline
- Introduction
- Related work
- Biologically inspired spatial cognition
- Experimental results
- Conclusion and discussion

Introduction
SLAM: the problem of a mobile robot acquiring a map of its environment while simultaneously localizing itself within that map.
Challenges in SLAM
- Data association: deciding whether two features observed at different times correspond to the same object
- Perceptual ambiguity: distinguishing between places that produce similar or equivalent visual patterns

Spatial cognition in rats
- Data association, or place recognition, in rats is based on cognitive maps generated in the hippocampus
- Cognitive maps are created from visual and kinesthetic feedback information
- Rats can learn and unlearn reward locations in goal-oriented tasks
Contribution of the paper
A neural-network-based spatial cognition model for a mobile robot, inspired by the rat's brain structure. The model can:
- Build a holistic topological map of the environment
- Recognize previously visited places
- Learn and unlearn reward locations
- Perform goal-directed navigation
- Use kinesthetic and visual cues from the environment
Related work
Comparison with Milford (2006) - RatSLAM
- The two models coincide in mapping and map adaptation but differ in goal-directed navigation
- Milford et al. use a topological map of experiences, where each experience codifies a location and an orientation; transitions are associated with locomotion
- In this paper, the nodes correspond to visual information patterns and path integration signals; transitions correspond to the orientation and locomotion of the rat

Experimental basis: Morris experiment (1981)
- Two types of rats: normal rats, and rats with hippocampal lesions
- Two experimental situations: a visible platform, and a submerged platform with visual cues around the arena
- Normal rats relate their position to the visual cues and recognize the target location
Image borrowed from: Morris, R. G. M. (1981). Spatial localization does not require the presence of local cues. Learning and Motivation, 12, 239-260.

Experimental basis: O'Keefe's experiment (1983)
A reversal task on a T-maze.
- Rats with hippocampal lesions:
  - Learned to turn to the right arm in a T-maze
  - Gradually shifted their orientation from the left arm to the right arm in an 8-arm maze
  - Their behavior was based on the goal's location relative to the body
- Normal rats:
  - Learned to turn to the right arm in a T-maze
  - The shift from left to right was not gradual in an 8-arm maze
  - Their behavior was based on a spatial map constructed in the hippocampus
Biologically inspired spatial cognition
Biological background:
- Affordance processing
- Rat's motivation
- Path integration
- Landmark processing
- Place representation and recognition
- Learning
- Action selection
Affordance processing
- Affordances are coded as a linear array of cells called the affordance perceptual schema
- An affordance corresponds to a 45° turn relative to the rat's head
- Each affordance is represented as a Gaussian distribution; the activation of neuron i is given by a Gaussian centered on the cell coding that turn
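The activation equation itself appeared only as an image in the original slides; the following is a minimal sketch of Gaussian-coded affordances over a linear array, assuming unit variance and one cell per 45° rotation increment:

```python
import math

def affordance_activation(cells, center, sigma=1.0):
    """Activate a linear array of cells with a Gaussian bump.

    cells:  number of cells in the affordance perceptual schema
    center: index of the cell coding the afforded turn
    sigma:  spread of the bump (an assumed value, not from the paper)
    """
    return [math.exp(-((i - center) ** 2) / (2 * sigma ** 2))
            for i in range(cells)]

# Eight cells, one per 45-degree turn; a bump centered on cell 2 (+90 deg).
schema = affordance_activation(8, center=2)
```

The peak of the bump sits on the cell for the afforded rotation and falls off smoothly over neighboring cells.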
Motivation
- The rat's motivation is modeled by its hunger drive
- The rat obtains a reward r(t) in the presence of food
- alpha_d: a binary property of the animal, set to one if food is ingested at time t
- d_max: maximum value of the hunger drive
- b: incentive produced by the presence of food (sight or smell)

Path integration
- The process of updating the position of the point of departure each time the animal moves
- Path integration helps an animal return home
- Path integration uses kinesthetic information: the magnitude of rotation and the magnitude of translation
- The path integration module is composed of two neural network layers: the Dynamic Remapping Layer (DRL) and the Path Integration Feature Detector Layer (PIFDL)

Dynamic Remapping Layer
- A 2-D array of neurons; the activation of a neuron (i, j) is computed as a Gaussian centered on the anchor position
- (x, y) codifies the anchor position relative to the initial coordinates in the plane
- The anchor position is displaced each time the rat moves, by the same magnitude but in the opposite direction
- The anchor position is updated by convolving the DR layer with a mask M
- The DR layer is then updated by centering the Gaussian at (r, c), the position of the maximum value of the convolution result C
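The remapping steps above can be sketched as follows. This is not the paper's implementation: instead of an explicit convolution with the mask M, it uses the equivalent shortcut of locating the bump's peak, displacing it opposite to the motion, and re-centering the Gaussian there; the grid size and sigma are assumptions.

```python
import numpy as np

def gaussian_2d(shape, center, sigma=1.0):
    """A 2-D Gaussian activity bump centered at `center`."""
    rows, cols = np.indices(shape)
    r0, c0 = center
    return np.exp(-((rows - r0) ** 2 + (cols - c0) ** 2) / (2 * sigma ** 2))

def remap(dr_layer, move):
    """Shift the anchor bump opposite to the rat's motion."""
    r, c = np.unravel_index(np.argmax(dr_layer), dr_layer.shape)
    dr, dc = move
    new_center = (r - dr, c - dc)   # same magnitude, opposite direction
    return gaussian_2d(dr_layer.shape, new_center)

layer = gaussian_2d((21, 21), (10, 10))     # anchor starts at the center
layer = remap(layer, move=(2, 0))           # rat moves +2 rows
```

After the move, the bump sits two rows closer to the departure point, so the layer always points back toward "home".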
Path Integration Feature Detector Layer
- The PIFDL is also a 2-D array of neurons
- Every neuron in the DRL is randomly connected to 50% of the neurons in the PIFDL
- The weights between the two layers are learned through Hebbian learning
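A minimal sketch of this connectivity and weight update, assuming the plain Hebbian rule dw = lr * pre * post (the paper's exact rule and learning rate are not given in the slides); the layer sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_drl, n_pifdl = 100, 64

# Each DRL neuron is randomly connected to 50% of the PIFDL neurons.
mask = rng.random((n_drl, n_pifdl)) < 0.5
weights = np.zeros((n_drl, n_pifdl))

def hebbian_step(weights, pre, post, lr=0.1):
    """Hebbian update: strengthen a weight when its pre- and
    post-synaptic neurons are active together; only existing
    connections (mask) are ever updated."""
    return weights + lr * np.outer(pre, post) * mask

pre = rng.random(n_drl)      # DRL activations at this time step
post = rng.random(n_pifdl)   # PIFDL activations at this time step
weights = hebbian_step(weights, pre, post)
```

Non-connected pairs keep zero weight forever; connected pairs grow in proportion to their correlated activity.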
Landmark processing
- The distance and orientation of each landmark are represented as a linear array of cells (LPS)
- Each LPS is connected to a 2-D array of neurons called the Landmark Feature Detector Layer (LFDL)
- The connecting weights are learned through Hebbian learning
- All the LFDLs are combined into a single Landmark Layer (LL)
- The visual information pattern is stored in an array called LP
Place representation and recognition
- The Place Cell Layer (PCL) is a 2-D layer of neurons
- Every neuron in the PIFDL is randomly connected to 50% of the neurons in the PCL
- Every neuron in the Landmark Layer (LL) is connected to 50% of the neurons in the PCL
- The synaptic efficacies between the layers are learned through Hebbian learning
- A place cell encodes the kinesthetic and visual information sensed by the rat at a given location and orientation
World Graph Layer
- The nodes in the map represent different places
- Arcs between the nodes represent the direction of the rat's head and the number of steps taken by the rat to move from one node to the other
- Every node can be connected to eight actor units, one for each direction

Place recognition
- SD is the similarity degree; N is the number of cells
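The SD formula itself appeared as an image in the original slides. One plausible reading, offered here only as an assumption, is a match count between the current PCL activation and a stored node's pattern, normalized by the number of cells N; the tolerance and recognition threshold below are likewise assumed:

```python
def similarity_degree(current, stored, tol=0.1):
    """Fraction of place cells whose activation matches the stored
    pattern within a tolerance (a hypothetical form of SD)."""
    assert len(current) == len(stored)
    n = len(current)  # N, the number of cells
    matches = sum(1 for a, b in zip(current, stored) if abs(a - b) <= tol)
    return matches / n

sd = similarity_degree([0.9, 0.1, 0.5, 0.0], [0.85, 0.4, 0.45, 0.05])
# A node is considered recognized when SD exceeds a fixed threshold.
recognized = sd > 0.7
```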
Learning
- Reward locations are learned and unlearned by reinforcement learning through an actor-critic architecture
- The Adaptive Critic (AC) unit contains a Predictive Unit (PU), which estimates the future reward for every place
- Every neuron in the Place Cell Layer (PCL) is connected to the PU, and every connection has a weight w and an eligibility trace e
- P(t) is the expected reward at time t
- r̂(t) is the effective reinforcement signal
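A sketch of one critic step, assuming the standard temporal-difference form of the effective reinforcement, r̂(t) = r(t) + γP(t) − P(t−1), with linear prediction from the PCL activations; the discount factor, trace decay, and learning rate are assumed values, not taken from the paper:

```python
import numpy as np

GAMMA = 0.9    # discount factor (assumed)
LAMBDA = 0.8   # eligibility-trace decay (assumed)
LR = 0.1       # learning rate (assumed)

def critic_update(w, e, pcl_prev, pcl_now, reward):
    """One step of the Predictive Unit.

    w: weights from PCL neurons to the PU
    e: eligibility traces, one per connection
    pcl_prev, pcl_now: PCL activations at t-1 and t
    """
    p_prev = float(w @ pcl_prev)              # P(t-1)
    p_now = float(w @ pcl_now)                # P(t)
    delta = reward + GAMMA * p_now - p_prev   # effective reinforcement
    e = LAMBDA * e + pcl_prev                 # traces mark recently active cells
    w = w + LR * delta * e                    # credit flows along the traces
    return w, e, delta

n = 5
w, e = np.zeros(n), np.zeros(n)
pcl_a = np.array([1.0, 0.2, 0.0, 0.0, 0.0])   # place A active at t-1
pcl_b = np.array([0.0, 0.9, 0.3, 0.0, 0.0])   # place B active at t
w, e, delta = critic_update(w, e, pcl_a, pcl_b, reward=1.0)
```

A positive r̂(t) (food found) strengthens the connections of the place cells that were recently active, which is how reward locations are learned; repeated negative r̂(t) weakens them again, which is the unlearning.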
Action selection
Action selection is based on four signals:
- Available affordances at time t (AF)
- Random rotations among the available affordances (RPS)
- Unexplored rotations from the current location (CPS)
- Global expectation of maximum reward (EMR)

Representation
- Each affordance in AF is represented as a Gaussian
- RPS is a single Gaussian centered at a random array position
- CPS captures the animal's curiosity: as many Gaussians as there are unexecuted rotations at that location
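A sketch of how the four Gaussian-coded signals might be combined; the equal weighting of the signals, the Gaussian width, and the tie to eight 45° directions are all assumptions, since the slides do not give the combination rule:

```python
import math

N_DIRS = 8  # one cell per 45-degree rotation

def gaussian(center, sigma=1.0):
    return [math.exp(-((i - center) ** 2) / (2 * sigma ** 2))
            for i in range(N_DIRS)]

def select_action(available, rps_center, unexplored, emr):
    """Pick the afforded rotation with the strongest combined drive.

    available:  indices of currently afforded rotations (AF)
    rps_center: center of the single random Gaussian (RPS)
    unexplored: rotations not yet executed at this place (CPS)
    emr:        per-direction expected-reward signal (EMR)
    """
    signals = [gaussian(rps_center)]
    signals += [gaussian(c) for c in available]
    signals += [gaussian(c) for c in unexplored]
    signals.append(list(emr))
    total = [sum(s[i] for s in signals) for i in range(N_DIRS)]
    # Only currently afforded rotations may be executed.
    return max(available, key=lambda i: total[i])

action = select_action(available=[2, 6], rps_center=0,
                       unexplored=[6], emr=[0.0] * N_DIRS)
```

With no reward expectation yet, the curiosity signal dominates and pulls the choice toward the unexplored rotation.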
Experiments
Hardware:
- Sony AIBO ERS-210 four-legged robot
- 1.8 GHz P4 processor
- A local camera with a 50° horizontal and 40° vertical field of view
- At a given time step, the robot takes three non-overlapping snapshots (0°, +90°, -90°)
- Visual processing analyzes the number of colored pixels in the images
- Kinesthetic information is obtained from the external motor control; there is no odometry

Four experimental conditions were tested.

Experiment 1: T-maze
- The departure point is the base of the maze
- During the training phase, the goal is set at the end of the left arm
- During the testing phase, the goal is shifted to the right arm
Results
- The robot takes 16 trials to completely unlearn the previously correct hypothesis
- When the expectation of reward exceeds the noise level, the robot starts visiting the right arm
- In O'Keefe's experiments (1983), the rats chose the right arm 90% of the time by the 24th trial
Experiment 2: 8-arm radial maze
- The goal is set at the -90° arm during the training phase
- During the testing phase, the goal is set at the +90° arm

Results
- When the expectation of reward for the -90° arm falls below the noise level, the robot visits the other arms randomly
- By the 12th trial, the robot starts choosing the +90° arm
- In O'Keefe's experiments (1983), the rats chose the correct arm by the 20th trial
Experiment 3: Multiple T-maze
- The robot departs from the base of the vertical T-maze
- During the training phase, the goal is placed at the right arm (90°) of the left horizontal T-maze
- During the testing phase, the goal is placed at the right arm (270°) of the right horizontal T-maze

Results
- If the robot reaches the goal at the end of a path, the path is positively reinforced
- If a path does not lead the robot to the goal, it is negatively reinforced, and the path is unlearned
- The robot completely unlearns the previous goal by the 20th trial
Experiment 4: Maze with landmarks
- Three colored cylinders were placed outside the maze as landmarks
- During testing, the robot was placed at different starting locations

Results
- The robot uses place recognition to find the goal
- The robot found the goal successfully from all starting positions
Discussion and conclusions
The proposed model captures some behavioral aspects of rats.
Abilities:
- Build a holistic topological map in real time
- Learn and unlearn goal locations
- Exploit the cognitive map to recognize visited places
Limitations:
- Very simplistic perceptual system; the current model cannot deal with real environments
- The affordance space and landmark space are discrete; processing continuous spaces is computationally expensive
Questions / Comments