Rakesh Gosangi PRISM lab Department of Computer Science and Engineering Texas A&M University
Slide 1
Biologically-inspired robot spatial cognition based on rat neurophysiological studies
Alejandra Barrera and Alfredo Weitzenfeld, Auton Robot 2008
Rakesh Gosangi, PRISM lab, Department of Computer Science and Engineering, Texas A&M University

Outline
- Introduction
- Related work
- Biologically inspired spatial cognition
- Experimental results
- Conclusion and discussion

Introduction
SLAM: the problem of a mobile robot acquiring a map of its environment while simultaneously localizing itself within that map.
Challenges in SLAM
- Data association: deciding whether two features observed at different times correspond to the same object
- Perceptual ambiguity: distinguishing between places that produce similar or equivalent visual patterns

Spatial cognition in rats
- Data association, or place recognition, in rats is based on cognitive maps generated in the hippocampus
- Cognitive maps are created from visual and kinesthetic feedback information
- Rats can learn and unlearn reward locations in goal-oriented tasks
Contribution of the paper
A neural-network-based spatial cognition model for a mobile robot, inspired by the rat's brain structure. The model can:
- Build a holistic topological map of the environment
- Recognize previously visited places
- Learn and unlearn reward locations
- Perform goal-directed navigation
- Use kinesthetic and visual cues from the environment
Related work
Comparison with Milford (2006) - RatSLAM
- The two models coincide in mapping and map adaptation but differ in goal-directed navigation
- Milford et al. use a topological map of experiences, where each experience codifies a location and an orientation; transitions are associated with locomotion
- In this paper, the nodes correspond to visual information patterns and path integration signals; transitions correspond to the orientation and locomotion of the rat

Experimental basis: Morris experiment (1981)
- Two types of rats: normal rats, and rats with hippocampal lesions
- Two experimental situations: a visible platform, and a submerged platform with visual cues around the arena
- Normal rats relate their position to the visual cues and recognize the target location
Image borrowed from: Morris, R. G. M. (1981). Spatial localization does not require the presence of local cues. Learning and Motivation, 12, 239-260.

Experimental basis: O'Keefe's experiment (1983)
A reversal task on a T-maze.
- Rats with hippocampal lesions:
  - Learned to turn to the right arm in a T-maze
  - Gradually shifted their orientation from the left arm to the right arm in an 8-arm maze
  - Their behavior was based on the goal's location relative to the body
- Normal rats:
  - Learned to turn to the right arm in a T-maze
  - The shift from left to right was not gradual in an 8-arm maze
  - Their behavior was based on a spatial map constructed in the hippocampus
Biologically inspired spatial cognition
Biological background:
- Affordance processing
- Rat's motivation
- Path integration
- Landmark processing
- Place representation and recognition
- Learning
- Action selection
Affordance processing
- Affordances are coded as a linear array of cells called the affordance perceptual schema
- An affordance corresponds to a 45° turn relative to the rat's head
- Each affordance is represented as a Gaussian distribution; the activation of neuron i is given by a Gaussian centered on the cell coding that turn
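The activation equation itself appeared only as an image in the original slides; the following is a minimal sketch of Gaussian-coded affordances over a linear array, assuming unit variance and one cell per 45° rotation increment:

```python
import math

def affordance_activation(cells, center, sigma=1.0):
    """Activate a linear array of cells with a Gaussian bump.

    cells:  number of cells in the affordance perceptual schema
    center: index of the cell coding the afforded turn
    sigma:  spread of the bump (an assumed value, not from the paper)
    """
    return [math.exp(-((i - center) ** 2) / (2 * sigma ** 2))
            for i in range(cells)]

# Eight cells, one per 45-degree turn; a bump centered on cell 2 (+90 deg).
schema = affordance_activation(8, center=2)
```

The peak of the bump sits on the cell for the afforded rotation and falls off smoothly over neighboring cells.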
Motivation
- The rat's motivation is modeled by its hunger drive
- The rat obtains a reward r(t) in the presence of food
- alpha_d: a binary property of the animal, set to one if food is ingested at time t
- d_max: maximum value of the hunger drive
- b: incentive produced by the presence of food (sight or smell)

Path integration
- The process of updating the position of the point of departure each time the animal moves
- Path integration helps an animal return home
- Path integration uses kinesthetic information: the magnitude of rotation and the magnitude of translation
- The path integration module is composed of two neural network layers: the Dynamic Remapping Layer (DRL) and the Path Integration Feature Detector Layer (PIFDL)

Dynamic Remapping Layer
- A 2-D array of neurons; the activation of a neuron (i, j) is computed as a Gaussian centered on the anchor position
- (x, y) codifies the anchor position relative to the initial coordinates in the plane
- The anchor position is displaced each time the rat moves, by the same magnitude but in the opposite direction
- The anchor position is updated by convolving the DR layer with a mask M
- The DR layer is then updated by centering the Gaussian at (r, c), the position of the maximum value of the convolution result C
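The remapping steps above can be sketched as follows. This is not the paper's implementation: instead of an explicit convolution with the mask M, it uses the equivalent shortcut of locating the bump's peak, displacing it opposite to the motion, and re-centering the Gaussian there; the grid size and sigma are assumptions.

```python
import numpy as np

def gaussian_2d(shape, center, sigma=1.0):
    """A 2-D Gaussian activity bump centered at `center`."""
    rows, cols = np.indices(shape)
    r0, c0 = center
    return np.exp(-((rows - r0) ** 2 + (cols - c0) ** 2) / (2 * sigma ** 2))

def remap(dr_layer, move):
    """Shift the anchor bump opposite to the rat's motion."""
    r, c = np.unravel_index(np.argmax(dr_layer), dr_layer.shape)
    dr, dc = move
    new_center = (r - dr, c - dc)   # same magnitude, opposite direction
    return gaussian_2d(dr_layer.shape, new_center)

layer = gaussian_2d((21, 21), (10, 10))     # anchor starts at the center
layer = remap(layer, move=(2, 0))           # rat moves +2 rows
```

After the move, the bump sits two rows closer to the departure point, so the layer always points back toward "home".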
Path Integration Feature Detector Layer
- The PIFDL is also a 2-D array of neurons
- Every neuron in the DRL is randomly connected to 50% of the neurons in the PIFDL
- The weights between the two layers are learned through Hebbian learning
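A minimal sketch of this connectivity and weight update, assuming the plain Hebbian rule dw = lr * pre * post (the paper's exact rule and learning rate are not given in the slides); the layer sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_drl, n_pifdl = 100, 64

# Each DRL neuron is randomly connected to 50% of the PIFDL neurons.
mask = rng.random((n_drl, n_pifdl)) < 0.5
weights = np.zeros((n_drl, n_pifdl))

def hebbian_step(weights, pre, post, lr=0.1):
    """Hebbian update: strengthen a weight when its pre- and
    post-synaptic neurons are active together; only existing
    connections (mask) are ever updated."""
    return weights + lr * np.outer(pre, post) * mask

pre = rng.random(n_drl)      # DRL activations at this time step
post = rng.random(n_pifdl)   # PIFDL activations at this time step
weights = hebbian_step(weights, pre, post)
```

Non-connected pairs keep zero weight forever; connected pairs grow in proportion to their correlated activity.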
Landmark processing
- The distance and orientation of each landmark are represented as a linear array of cells (LPS)
- Each LPS is connected to a 2-D array of neurons called the Landmark Feature Detector Layer (LFDL)
- The connecting weights are learned through Hebbian learning
- All the LFDLs are combined into a single Landmark Layer (LL)
- The visual information pattern is stored in an array called LP
Place representation and recognition
- The Place Cell Layer (PCL) is a 2-D layer of neurons
- Every neuron in the PIFDL is randomly connected to 50% of the neurons in the PCL
- Every neuron in the Landmark Layer (LL) is connected to 50% of the neurons in the PCL
- The synaptic efficacies between the layers are learned through Hebbian learning
- A place cell encodes the kinesthetic and visual information sensed by the rat at a given location and orientation
World Graph Layer
- The nodes in the map represent different places
- Arcs between the nodes represent the direction of the rat's head and the number of steps taken by the rat to move from one node to the other
- Every node can be connected to eight actor units, one for each direction

Place recognition
- SD is the similarity degree; N is the number of cells
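The SD formula itself appeared as an image in the original slides. One plausible reading, offered here only as an assumption, is a match count between the current PCL activation and a stored node's pattern, normalized by the number of cells N; the tolerance and recognition threshold below are likewise assumed:

```python
def similarity_degree(current, stored, tol=0.1):
    """Fraction of place cells whose activation matches the stored
    pattern within a tolerance (a hypothetical form of SD)."""
    assert len(current) == len(stored)
    n = len(current)  # N, the number of cells
    matches = sum(1 for a, b in zip(current, stored) if abs(a - b) <= tol)
    return matches / n

sd = similarity_degree([0.9, 0.1, 0.5, 0.0], [0.85, 0.4, 0.45, 0.05])
# A node is considered recognized when SD exceeds a fixed threshold.
recognized = sd > 0.7
```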
Learning
- Reward locations are learned and unlearned by reinforcement learning through an actor-critic architecture
- The Adaptive Critic (AC) unit contains a Predictive Unit (PU), which estimates the future reward for every place
- Every neuron in the Place Cell Layer (PCL) is connected to the PU, and every connection has a weight w and an eligibility trace e
- P(t) is the expected reward at time t
- r̂(t) is the effective reinforcement signal
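A sketch of one critic step, assuming the standard temporal-difference form of the effective reinforcement, r̂(t) = r(t) + γP(t) − P(t−1), with linear prediction from the PCL activations; the discount factor, trace decay, and learning rate are assumed values, not taken from the paper:

```python
import numpy as np

GAMMA = 0.9    # discount factor (assumed)
LAMBDA = 0.8   # eligibility-trace decay (assumed)
LR = 0.1       # learning rate (assumed)

def critic_update(w, e, pcl_prev, pcl_now, reward):
    """One step of the Predictive Unit.

    w: weights from PCL neurons to the PU
    e: eligibility traces, one per connection
    pcl_prev, pcl_now: PCL activations at t-1 and t
    """
    p_prev = float(w @ pcl_prev)              # P(t-1)
    p_now = float(w @ pcl_now)                # P(t)
    delta = reward + GAMMA * p_now - p_prev   # effective reinforcement
    e = LAMBDA * e + pcl_prev                 # traces mark recently active cells
    w = w + LR * delta * e                    # credit flows along the traces
    return w, e, delta

n = 5
w, e = np.zeros(n), np.zeros(n)
pcl_a = np.array([1.0, 0.2, 0.0, 0.0, 0.0])   # place A active at t-1
pcl_b = np.array([0.0, 0.9, 0.3, 0.0, 0.0])   # place B active at t
w, e, delta = critic_update(w, e, pcl_a, pcl_b, reward=1.0)
```

A positive r̂(t) (food found) strengthens the connections of the place cells that were recently active, which is how reward locations are learned; repeated negative r̂(t) weakens them again, which is the unlearning.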
Action selection
Action selection is based on four signals:
- Available affordances at time t (AF)
- Random rotations among the available affordances (RPS)
- Unexplored rotations from the current location (CPS)
- Global expectation of maximum reward (EMR)

Representation
- Each affordance in AF is represented as a Gaussian
- RPS is a single Gaussian centered at a random array position
- CPS captures the animal's curiosity: as many Gaussians as there are unexecuted rotations at that location
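A sketch of how the four Gaussian-coded signals might be combined; the equal weighting of the signals, the Gaussian width, and the tie to eight 45° directions are all assumptions, since the slides do not give the combination rule:

```python
import math

N_DIRS = 8  # one cell per 45-degree rotation

def gaussian(center, sigma=1.0):
    return [math.exp(-((i - center) ** 2) / (2 * sigma ** 2))
            for i in range(N_DIRS)]

def select_action(available, rps_center, unexplored, emr):
    """Pick the afforded rotation with the strongest combined drive.

    available:  indices of currently afforded rotations (AF)
    rps_center: center of the single random Gaussian (RPS)
    unexplored: rotations not yet executed at this place (CPS)
    emr:        per-direction expected-reward signal (EMR)
    """
    signals = [gaussian(rps_center)]
    signals += [gaussian(c) for c in available]
    signals += [gaussian(c) for c in unexplored]
    signals.append(list(emr))
    total = [sum(s[i] for s in signals) for i in range(N_DIRS)]
    # Only currently afforded rotations may be executed.
    return max(available, key=lambda i: total[i])

action = select_action(available=[2, 6], rps_center=0,
                       unexplored=[6], emr=[0.0] * N_DIRS)
```

With no reward expectation yet, the curiosity signal dominates and pulls the choice toward the unexplored rotation.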
Experiments
Hardware:
- Sony AIBO ERS-210 four-legged robot
- 1.8 GHz P4 processor
- A local camera with a 50° horizontal and 40° vertical field of view
- At a given time step, the robot takes three non-overlapping snapshots (0°, +90°, -90°)
- Visual processing analyzes the number of colored pixels in the images
- Kinesthetic information is obtained from the external motor control; there is no odometry

Four experimental conditions were tested.

Experiment 1: T-maze
- The departure point is the base of the maze
- During the training phase, the goal is set at the end of the left arm
- During the testing phase, the goal is shifted to the right arm
Results
- The robot takes 16 trials to completely unlearn the previously correct hypothesis
- When the expectation of reward exceeds the noise level, the robot starts visiting the right arm
- In O'Keefe's experiments (1983), the rats chose the right arm 90% of the time by the 24th trial
Experiment 2: 8-arm radial maze
- The goal is set at the -90° arm during the training phase
- During the testing phase, the goal is set at the +90° arm

Results
- When the expectation of reward for the -90° arm falls below the noise level, the robot visits the other arms randomly
- By the 12th trial, the robot starts choosing the +90° arm
- In O'Keefe's experiments (1983), the rats chose the correct arm by the 20th trial
Experiment 3: Multiple T-maze
- The robot departs from the base of the vertical T-maze
- During the training phase, the goal is placed at the right arm (90°) of the left horizontal T-maze
- During the testing phase, the goal is placed at the right arm (270°) of the right horizontal T-maze

Results
- If the robot reaches the goal at the end of a path, the path is positively reinforced
- If a path does not lead the robot to the goal, it is negatively reinforced, and the path is unlearned
- The robot completely unlearns the previous goal by the 20th trial
Experiment 4: Maze with landmarks
- Three colored cylinders were placed outside the maze as landmarks
- During testing, the robot was placed at different starting locations

Results
- The robot uses place recognition to find the goal
- The robot found the goal successfully from all starting positions
Discussion and conclusions
The proposed model captures some behavioral aspects of rats.
Abilities:
- Build a holistic topological map in real time
- Learn and unlearn goal locations
- Exploit the cognitive map to recognize visited places
Limitations:
- Very simplistic perceptual system; the current model cannot deal with real environments
- The affordance space and landmark space are discrete; processing continuous spaces is computationally expensive
Questions / Comments