Thesis Defense Ph.D. Program in Information Systems and Computer Engineering
2/38 Pedro Sequeira – Ph.D. Thesis Defense – September 18, 2013
Problem
Approach
Emotion-based Intrinsic Rewards
Emerging Emotions
Socially-Aware Learning Agents
Conclusions
Problem
• Simple navigation problem
  • State: agent location
  • Actions: movement
• RL main ideas (Sutton & Barto, 1998; Kaelbling et al., 1996)
  • Objective: maximize the reward accumulated over time
  • Task: discover which actions maximize reward in each state
  • e.g., using Q-learning (Watkins, 1989)
[Figure label: +1]
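As a rough illustration of the Q-learning idea named on this slide, the following is a minimal tabular sketch. The corridor environment, the function names `q_learning` and `greedy_policy`, the reward of +1 at the last cell, and all parameter values are assumptions made for this example and are not taken from the thesis.

```python
import random

def q_learning(n_states=5, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (illustrative only).

    States are cells 0..n_states-1; actions are 0 (left) and 1 (right).
    Entering the last cell yields reward +1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == goal else 0.0
            # Q-learning target: reward plus discounted best value of the next state.
            target = r if s2 == goal else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q

def greedy_policy(q):
    """Return the greedy action (0 = left, 1 = right) for every state."""
    return [0 if row[0] > row[1] else 1 for row in q]

if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table))
```

In this toy setting the learned greedy policy moves right in every non-goal cell, which is the "discover which actions maximize reward in each state" task in its simplest form.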
• Assumptions (Sutton & Barto, 1998; Kaelbling et al., 1996)
  • Fully observable environments
  • Infinite visits to all states and actions
  • Stationary environments
• Agent limitations
  • Limited perception and computational resources
  • Dynamic, unpredictable and unreliable environment
  • Demand for manual adjustments
• Design assumptions too restrictive (Littman, 1994; Loch and Singh, 1998; Singh et al., 1994)
• Rewards are fundamental (Sutton & Barto, 1998; Kaelbling et al., 1996)
  • Implicitly define the agent’s task
  • Impact on the learning time
  • Impact on what is learned
• Major challenge (Abbeel and Ng, 2004; Ng and Russell, 2000; Sorg et al., 2010a)
  • Build reward mechanisms so the task is learned efficiently
  • Flexible and robust
  • Enhance agent’s autonomy
“Design reward mechanisms for RL agents that are able to
alleviate their inherent perceptual limitations and
make them operate in a wide variety of domains
without the explicit intervention of others or relying on
expert or domain knowledge about a particular task.”
Approach
• Agent has access to several sources of information
  • Perception, reasoning, learning, etc.
How long since I observed this state?
What is the best state that I can be?
How many times have I performed this action?
What is the value of this state?
[Figure: agent diagram with labels: state, reward; processing mechanisms; interaction information; decision making; action]
• Intrinsically Motivated RL (IMRL) (Singh et al. 2009, 2010; Sorg et al. 2010)
  • Agent learns with intrinsic rewards
  • Fitness: measures performance
  • Mitigates computational limitations of learning agents
[Figure: agent diagram with labels: state, extrinsic reward; processing mechanisms; interaction information; intrinsic motivation; intrinsic reward; decision making]
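The separation between the reward an agent learns from and the fitness used to judge it can be sketched as follows. This is only an illustration under stated assumptions: the corridor task, the visit-count bonus weighted by `theta`, and the names `run_agent` and `best_reward_parameter` are hypothetical and do not reproduce the reward designs of the thesis or of the cited IMRL papers.

```python
import random

def run_agent(theta, n_cells=6, steps=2000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Train a Q-learner on an intrinsic reward and return its fitness.

    Hypothetical task: the agent starts in cell 0 and food sits in the last cell.
    Eating food gives extrinsic reward 1 and resets the agent to cell 0.
    Intrinsic reward (assumed form) = extrinsic + theta / visits(next cell).
    Fitness = total extrinsic reward collected over the run.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_cells)]
    visits = [0] * n_cells
    s, fitness = 0, 0.0
    for _ in range(steps):
        if rng.random() < epsilon or q[s][0] == q[s][1]:
            a = rng.randrange(2)
        else:
            a = 0 if q[s][0] > q[s][1] else 1
        s2 = max(0, s - 1) if a == 0 else s + 1
        visits[s2] += 1
        extrinsic = 1.0 if s2 == n_cells - 1 else 0.0
        intrinsic = extrinsic + theta / visits[s2]
        # The learning update only ever sees the intrinsic reward.
        q[s][a] += alpha * (intrinsic + gamma * max(q[s2]) - q[s][a])
        # Fitness is accumulated separately from the extrinsic reward.
        fitness += extrinsic
        s = 0 if extrinsic else s2
    return fitness

def best_reward_parameter(candidates):
    """Evaluate each candidate reward parameter by the fitness it induces."""
    fitness = {theta: run_agent(theta) for theta in candidates}
    return max(fitness, key=fitness.get), fitness

if __name__ == "__main__":
    print(best_reward_parameter([0.0, 0.1, 1.0]))
```

The design point is that `theta` changes what the agent learns from, while the comparison between candidates is always made on the same fitness measure.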
• Parallel with biological organisms
  • Limited perception and resources
  • Dynamic, unpredictable and unreliable environment
• Natural motivational mechanisms
  • Shaped by evolution
  • Provide adaptive advantages
• Social mechanisms
  • Cooperation in inherently competitive environments
“We focus on the role of emotions and also on the
way individuals interact and cooperate with each other as a social group
to design more flexible and robust reward mechanisms that
enhance the autonomy of RL agents in
both single and multiagent settings.”
• Emotion-based Intrinsic Rewards
  • Role of emotions in decision-making
  • 4 emotion-based domain-independent reward features
• Emerging Emotions
  • Emergence of useful sources of information
  • Discuss relation with emotions
• Socially-Aware Learning Agents
  • Extend IMRL to multiagent scenarios
  • Socially-aware behaviors
Emotion-based Intrinsic Rewards
• Studies show that emotions (Cardinal et al. 2002, Dawkins 2000, Phelps & LeDoux 2005):
  • Are a basic and ancient survival mechanism
  • Are a beneficial adaptive mechanism for decision-making
  • Elicit physiological signals
• Absence of emotions (Bechara et al. 2000, Damasio 1994, LeDoux 2000)
  • Impairs the ability to make advantageous decisions
• How to provide emotion-based motivation?
• Evaluation through appraisal (Frijda and Mesquita 1998, Lazarus 2001, Reisenzein 2009, Roseman 2001, Scherer 2001)
[Figure: appraisal diagram with labels: goals, beliefs, norms; novelty, causality, focus of event, importance, …; appraisal mechanism; emotional state; re-appraisal; belief update; behavior]
• Major dimensions of appraisal (Ellsworth & Scherer 2003, Leventhal & Scherer, 1987)
  • 4 domain-independent reward features
  • Evaluate agent’s history of interaction with environment
[Figure: agent diagram with labels: processing mechanisms; interaction information; appraisal-based reward mechanism; emotion-based intrinsic reward; decision making]
• Foraging scenarios
  • Each presents a distinct challenge
  • Partially observable
  • Compare performance: emotion-based vs. fitness-based
• Results
  • Emotional agents outperform standard agents
  • Careful consideration of emotional aspects
  • Learn the intended task
  • Overcome perceptual limitations
• 4 emotion-based reward features
  • Novelty, Valence, Goal relevance and Control
  • Domain-independent
• General-purpose guiding system for RL agents
  • Mitigation of perceptual limitations
• Departs from previous works within Affective Computing
  • Domain-independent appraisal-based
  • Does not alter RL algorithm
  • Does not focus on a set of basic emotions
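One way to picture four reward features feeding an unchanged RL algorithm is a weighted sum computed from the agent's interaction history. The sketch below assumes that form; the `Snapshot` fields, the formula chosen for each feature, and the weights are hypothetical stand-ins for illustration and are not the definitions used in the thesis.

```python
from dataclasses import dataclass

@dataclass
class Snapshot:
    """Hypothetical summary of the agent's interaction history at one step."""
    visits: int              # how often the current state was seen before
    extrinsic: float         # extrinsic reward just received
    value: float             # current value estimate of the state
    best_value: float        # highest value estimate seen so far
    prediction_error: float  # error of the agent's last prediction

def features(h):
    """Return (novelty, valence, goal relevance, control); all formulas are assumed."""
    novelty = 1.0 / (1.0 + h.visits)            # rarely seen states score higher
    valence = h.extrinsic                       # how pleasant the outcome was
    goal_relevance = h.value / h.best_value if h.best_value > 0 else 0.0
    control = 1.0 - min(1.0, abs(h.prediction_error))  # predictable means controllable
    return (novelty, valence, goal_relevance, control)

def intrinsic_reward(h, weights):
    """Weighted sum of the four features (the linear form is an assumption)."""
    return sum(w * f for w, f in zip(weights, features(h)))

if __name__ == "__main__":
    step = Snapshot(visits=1, extrinsic=1.0, value=2.0, best_value=4.0, prediction_error=0.25)
    print(features(step), intrinsic_reward(step, (1.0, 1.0, 1.0, 1.0)))
```

Because the result is just a scalar reward, it can be passed to any standard learner in place of the extrinsic reward, which matches the slide's point that the RL algorithm itself is left untouched.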
Emerging Emotions
• Answer the question:
  • “Are emotions the best candidate to complement the agents’ information processing mechanism?”
[Figure: agent diagram with labels: interaction information; processing mechanisms; decision making; intrinsic motivation; intrinsic reward]
• Reward optimization using Genetic Programming (Niekum et al. 2010)
  • Population of reward functions
  • Evolved and evaluated according to agent’s performance
[Figure: agent diagram with labels: basic variables; processing mechanisms; GP mechanism (selection, mutation, crossover); evolved reward function; decision making]
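The loop of selection, mutation and crossover over a population of reward functions can be sketched with small expression trees. Everything specific here is an assumption for illustration: the variable names in `VARIABLES`, the operator set, the truncation selection scheme, the size cap, and the `fitness` callable, which in the setting described on the slide would be the performance of an agent trained with the candidate reward.

```python
import random

VARIABLES = ("r_ext", "visits", "value")  # hypothetical basic variables
OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b, "mul": lambda a, b: a * b}
MAX_NODES = 31  # size cap to keep trees small (an assumed safeguard)

def random_expr(rng, depth=2):
    """Random tree: a variable name, a constant, or a tuple (op, left, right)."""
    if depth <= 0 or rng.random() < 0.3:
        return rng.choice(VARIABLES) if rng.random() < 0.7 else round(rng.uniform(-1, 1), 2)
    return (rng.choice(sorted(OPS)), random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, env):
    """Compute the reward value of an expression for given variable values."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, env), evaluate(right, env))
    return env[expr] if isinstance(expr, str) else expr

def subtrees(expr):
    """Yield every subtree of an expression, including the expression itself."""
    yield expr
    if isinstance(expr, tuple):
        yield from subtrees(expr[1])
        yield from subtrees(expr[2])

def mutate(expr, rng):
    """Replace one randomly chosen subtree with a fresh random one."""
    if not isinstance(expr, tuple) or rng.random() < 0.3:
        return random_expr(rng, depth=1)
    op, left, right = expr
    if rng.random() < 0.5:
        return (op, mutate(left, rng), right)
    return (op, left, mutate(right, rng))

def crossover(a, b, rng):
    """Replace one randomly chosen subtree of a with a random subtree of b."""
    if not isinstance(a, tuple) or rng.random() < 0.3:
        return rng.choice(list(subtrees(b)))
    op, left, right = a
    if rng.random() < 0.5:
        return (op, crossover(left, b, rng), right)
    return (op, left, crossover(right, b, rng))

def evolve(fitness, pop_size=20, generations=10, seed=0):
    """Evolve reward expressions; the better half always survives (elitism)."""
    rng = random.Random(seed)
    population = [random_expr(rng) for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            child = crossover(rng.choice(parents), rng.choice(parents), rng)
            if rng.random() < 0.2:
                child = mutate(child, rng)
            if sum(1 for _ in subtrees(child)) > MAX_NODES:
                child = rng.choice(parents)  # reject oversized offspring
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

A caller would supply `fitness` as a function that trains an agent with `evaluate(expr, ...)` as its reward and returns the resulting performance; any cheaper stand-in works for trying out the operators.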
• Same foraging scenarios as before
  • Observe resulting evolved optimal rewards
  • Discover patterns in the reward functions’ expressions
• Results
  • Set of 5 informative signals
  • Fitness, relevance, advantage, prediction, frequency
  • Each signal can be used as an intrinsic reward feature
• Scenarios based on PacMan
  • Objective: validate emerged optimal sources of information
  • Different and much more complex scenarios
  • Limited perception
• Results
  • GP-based agent outperformed standard agents
  • Learn the intended task
  • Overcome perceptual limitations
• Analyze emerged information signals
  • Compare the “kinds” of evaluation
  • According to appraisal theories of emotion
• Results
  • Informative signals provide similar evaluation
  • Share structural and dynamical properties
• Information-processing reward mechanism
  • Emerged by means of genetic programming
  • Domain-independent
  • Mitigation of perceptual limitations
• Relation with natural agents and emotions
  • Dynamic and structural connections with appraisal dimensions
  • Adaptive mechanism that allows for higher fitness
  • Reinforces the role of emotions in agents’ adaptation
Socially-Aware Learning Agents
• Extend previous research into multiagent settings
  • Shared environment
  • Each agent has its own goals
  • May be conflicting
  • Achieve cooperation
• Inspiration from social mechanisms
  • Living in a group increases survival chances
  • Still there is competition for resources and power
  • Communicate intentions and evaluate each other’s actions
• Cooperation in competitive contexts (Axelrod 1984, de Waal 2008, Dörner 1999, Falk & Fischbacher 2006, Hamilton 1964, Trivers 1971)
• Need for affiliation
  • Altruistic behaviors despite momentary losses
• Legitimacy signals
  • Signal socially-aware behaviors
• Reciprocation mechanism
  • Evaluates “kindness” of others’ actions
• Limited resource sharing scenarios
  • Internal and external social rewards
  • Evaluate appropriateness of behaviors towards social group
[Figure: agent diagram with labels: social aspects; processing mechanisms; social motivation mechanism; social intrinsic reward; decision making]
• Limited resource and mutual dependency scenarios
  • 2 or 3 agents interact in the same environment
  • Limited perception
  • Agents learn and act individually; fitness is measured for the social group
• Results
  • Socially motivated agents outperformed “greedy” group
  • Mostly homogeneous populations
  • Emergence of “socially-aware” behaviors
• Extend IMRL to multiagent scenarios
  • Social intrinsic motivation
  • Based on affiliation, social signaling and reciprocity
• Emergence of “socially aware” behaviors
  • Trade off immediate gains for future collaboration
  • Learned behaviors benefit whole group
• Accordance with how cooperation thrives in nature
  • Existence of signaling mechanism
  • Reciprocation opportunities
  • Future interactions
Conclusions
“We focus on the role of emotions and also on
the way individuals interact and cooperate with each other as a social group
to design more flexible and robust reward mechanisms that
enhance the autonomy of RL agents in
both single and multiagent settings.”
• RL and IMRL
  • Alleviate perceptual limitations and modeling effort
  • More autonomous, robust and flexible mechanisms
  • Novel approach for multiagent IMRL
• Affective Computing
  • Show the importance of emotion-based reward design
  • Independent of algorithm, not focused on set of basic emotions
• Parallel with natural organisms
  • Importance of emotion-related information
• Multiagent Systems
  • Relation with evolutionary game theory
  • Emergence of cooperation with relatedness and reciprocation
  • Signaling mechanism according to internal social standards
• Emergence of cooperation by means of
  • Social pressures
  • Pure altruism