Post on 21-Jan-2016


Learning object affordances based on structural object representation

Kadir F. Uyanik, Asil Kaan Bozcuoglu

EE 583 Pattern Recognition, Jan 4, 2011

Content
• Goal
• Inspirations
• Potential Difficulties
• Problem Definition
• Proposed Method
• References
• Appendix

Goal


Inspirations

Ecological psychologist James Jerome Gibson (1904-1979)

Cognitive psychologist Irving Biederman (1939-)

Inspirations: Affordances [1]

[1] J. J. Gibson (1977), The Theory of Affordances. In Perceiving, Acting, and Knowing, Eds. Robert Shaw and John Bransford, ISBN 0-470-99014-7.
[2] E. Sahin, M. Cakmak, M. R. Dogar, E. Ugur, G. Ucoluk, To Afford or Not to Afford: A New Formalization of Affordances Toward Affordance-Based Robot Control, Adaptive Behavior, 2007, pp. 447-472.

“… an affordance is neither an objective property nor a subjective property; or both if you like. An affordance cuts across the dichotomy of subjective-objective and helps us to understand its inadequacy. It is equally a fact of the environment and a fact of behavior. It is both physical and psychical, yet neither. An affordance points both ways, to the environment and to the observer.”


[Figure: the <entity> (environment) and the <behavior> (agent) together yield the <effect>, written (<effect>, <(entity, behavior)>)]

Revised Definition: An affordance is an acquired relation between an <effect> and an <(entity, behavior)> tuple of an agent, such that applying the <behavior> to the <entity> generates that <effect> [2].
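As a rough illustration of the revised definition, the relation can be stored as a mapping from (entity, behavior) pairs to the effects an agent has observed. This is only a sketch: the class name `AffordanceMemory`, its methods, and the example entities, behaviors and effects are all hypothetical and do not come from [2].

```python
from collections import defaultdict


class AffordanceMemory:
    """Stores acquired (effect, (entity, behavior)) relations (hypothetical helper)."""

    def __init__(self):
        # (entity, behavior) -> set of effects observed for that pair
        self._relations = defaultdict(set)

    def observe(self, entity, behavior, effect):
        """Record that applying `behavior` to `entity` generated `effect`."""
        self._relations[(entity, behavior)].add(effect)

    def behaviors_for(self, entity, effect):
        """Behaviors known to generate `effect` when applied to `entity`."""
        return sorted(
            b for (e, b), effects in self._relations.items()
            if e == entity and effect in effects
        )


memory = AffordanceMemory()
memory.observe("cup", "grasp", "lifted")   # made-up interaction data
memory.observe("cup", "push", "moved")
memory.observe("ball", "push", "rolled")
print(memory.behaviors_for("cup", "lifted"))
```

Keying the store on the (entity, behavior) pair mirrors the tuple in the definition; the effect is what the agent learns to associate with it.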


Inspirations: Human Image Understanding [3]

“There is a small number of geometric components that constitute the primitive elements of the object recognition system (like letters to form words)”

[3] I. Biederman, Recognition-by-components: A theory of Human Image Understanding, Psychological Review, Vol. 94 (1987), pp. 115-148


Potential Difficulties [4]

• A structural description alone is not enough; metric information is also needed
• It is difficult to extract geons from real images
• The structural description is ambiguous: most often there are several candidates
• For some objects, deriving a structural representation can be difficult

[4] M. A. Arbib, CS564 – Brain Theory and Artificial Intelligence, USC, Fall 2001, Lecture 7: Object Recognition

Problem Definition

How to
• decompose objects into parts/components?
• find relations between components?
• find a generic graph representation of an <action-entity-effect> triple?

Object Decomposition

Proposed Algorithm


Object Decomposition

What is missing?

• Use/try different clustering algorithms
• Triangulate 3D surfaces (Delaunay)
• Compute Gaussian curvature at each vertex
• Detect region boundaries (curvature thresholding)
• Perform iterative region growing (flood fill)
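The last three steps could be sketched as follows for a mesh that is already triangulated. This is not the presenters' implementation. It assumes a closed triangle mesh and uses the angle-deficit estimate of Gaussian curvature (2π minus the sum of the triangle angles at a vertex), which is one common discrete approximation. The function names, the threshold and the tetrahedron test mesh are illustrative assumptions.

```python
import math
from collections import defaultdict, deque


def angle_deficit(vertices, faces):
    """Per-vertex discrete Gaussian curvature: 2*pi minus the sum of incident angles."""
    angle_sum = [0.0] * len(vertices)
    for tri in faces:
        for k in range(3):
            i, j, l = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
            u = [vertices[j][d] - vertices[i][d] for d in range(3)]
            v = [vertices[l][d] - vertices[i][d] for d in range(3)]
            dot = sum(a * b for a, b in zip(u, v))
            norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
            # Clamp to [-1, 1] to guard acos against rounding error.
            angle_sum[i] += math.acos(max(-1.0, min(1.0, dot / norm)))
    return [2.0 * math.pi - s for s in angle_sum]


def grow_regions(faces, curvature, threshold):
    """Flood-fill connected low-curvature vertices; boundary vertices keep label -1."""
    neighbours = defaultdict(set)
    for a, b, c in faces:
        neighbours[a].update((b, c))
        neighbours[b].update((a, c))
        neighbours[c].update((a, b))
    labels = [-1] * len(curvature)
    next_label = 0
    for seed in range(len(curvature)):
        if labels[seed] != -1 or abs(curvature[seed]) >= threshold:
            continue  # already assigned, or a high-curvature boundary vertex
        labels[seed] = next_label
        queue = deque([seed])
        while queue:
            v = queue.popleft()
            for w in neighbours[v]:
                if labels[w] == -1 and abs(curvature[w]) < threshold:
                    labels[w] = next_label
                    queue.append(w)
        next_label += 1
    return labels


# Regular tetrahedron used only as a small closed test mesh.
tet_vertices = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
tet_faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
tet_curvature = angle_deficit(tet_vertices, tet_faces)
tet_labels = grow_regions(tet_faces, tet_curvature, threshold=0.5)
```

Note that the angle deficit is zero along a simple fold, so a segmentation based on Gaussian curvature alone may miss crease-like part boundaries; that is one reason trying other clustering criteria is listed above.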

Graphical Representation

• We represent each object as an undirected graph:
  – Each node stores the geometric shape of the part
  – Each edge stores its direction along the three axes, e.g., from node1 to node2 the x coordinate increases
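A minimal sketch of such a graph is given below. It assumes that the per-axis direction of an edge is taken from the sign of the difference between part centroids; the slides do not say how the direction is obtained, so that choice, the `PartGraph` class, the tolerance and the cup example are all hypothetical.

```python
from dataclasses import dataclass, field


def sign(delta, tol):
    """Return -1, 0 or +1 for delta below, within or above the tolerance band."""
    return (delta > tol) - (delta < -tol)


@dataclass
class PartGraph:
    shapes: dict = field(default_factory=dict)     # node -> geometric shape label
    centroids: dict = field(default_factory=dict)  # node -> (x, y, z), assumed available
    edges: dict = field(default_factory=dict)      # (a, b) -> per-axis direction from a to b

    def add_part(self, name, shape, centroid):
        self.shapes[name] = shape
        self.centroids[name] = centroid

    def connect(self, a, b, tol=1e-3):
        """Store one undirected edge, annotated with the direction seen from a to b."""
        ca, cb = self.centroids[a], self.centroids[b]
        self.edges[(a, b)] = tuple(sign(q - p, tol) for p, q in zip(ca, cb))


cup = PartGraph()
cup.add_part("body", "cylinder", (0.0, 0.0, 0.0))
cup.add_part("handle", "torus", (0.06, 0.0, 0.0))
cup.connect("body", "handle")
print(cup.edges)
```

Each edge is stored once; reading it in the opposite direction simply negates the sign vector, which keeps the graph undirected while preserving the spatial relation.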

Graphical Representation

Similarity Checking

[isIsomorphic, label_list] = check_Isomorphism(G1, G2)
If isIsomorphic
    Check geometric shapes of same-labeled nodes in the two graphs
    Check direction of equivalent edges in both graphs
    If both match, return true
    Else return false
Else return false

Isomorphism check: two candidates
- n1 = n6, n2 = n4, n3 = n5 (attributes match)
- n1 = n4, n2 = n6, n3 = n5 (attributes do not match)
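The check above can be sketched in a few lines for small graphs. This version is an assumption-laden illustration rather than the presenters' `check_Isomorphism`: it tries every node permutation (fine for a handful of parts, too slow for large graphs), assumes each undirected edge is stored once as `(a, b) -> direction`, and uses made-up shapes and directions for nodes n1 to n6.

```python
from itertools import permutations


def edge_dir(edges, a, b):
    """Direction from a to b, negating the stored vector if the edge is stored as (b, a)."""
    if (a, b) in edges:
        return edges[(a, b)]
    if (b, a) in edges:
        return tuple(-d for d in edges[(b, a)])
    return None


def similar(shapes1, edges1, shapes2, edges2):
    """Return a node mapping under which structure, shapes and directions agree, else None."""
    nodes1, nodes2 = sorted(shapes1), sorted(shapes2)
    if len(nodes1) != len(nodes2) or len(edges1) != len(edges2):
        return None
    for perm in permutations(nodes2):
        mapping = dict(zip(nodes1, perm))
        # Node attributes: mapped parts must have the same geometric shape.
        if any(shapes1[n] != shapes2[mapping[n]] for n in nodes1):
            continue
        # Edge attributes: every edge must map to an edge with the same direction.
        if all(edge_dir(edges2, mapping[a], mapping[b]) == d
               for (a, b), d in edges1.items()):
            return mapping
    return None


# Hypothetical attributes chosen so that only the first candidate above matches.
shapes_a = {"n1": "cylinder", "n2": "torus", "n3": "box"}
edges_a = {("n3", "n1"): (1, 0, 0), ("n3", "n2"): (0, 0, 1)}
shapes_b = {"n4": "torus", "n5": "box", "n6": "cylinder"}
edges_b = {("n6", "n5"): (-1, 0, 0), ("n5", "n4"): (0, 0, 1)}
print(similar(shapes_a, edges_a, shapes_b, edges_b))
```

Because both graphs have the same number of edges and the mapping is one-to-one, checking every edge of the first graph is enough to establish the isomorphism.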


Current System
• 80% success rate
• Assumes no occlusion
  – For the cup case, handles should always be visible
• Needs metric info to distinguish bigger objects from smaller ones

One way to go…

• Learning a generic graph for each affordance type.
• Checking the maximal cliques of the match graph while comparing the graph of an object with a generic graph.
• Using a Mahalanobis distance metric for generic graphs, with maximum likelihood estimation (MLE).
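The clique idea can be illustrated with a small sketch. It assumes the match graph is the usual association graph (one node per shape-compatible part pair, with an edge when two pairs are mutually consistent) and enumerates maximal cliques with a plain Bron-Kerbosch recursion. The function names and the example graphs are hypothetical; the slides propose this only as future work.

```python
def match_graph(shapes1, edges1, shapes2, edges2):
    """Association graph: nodes are shape-compatible pairs, edges join consistent pairs."""
    def rel(edges, a, b):
        if (a, b) in edges:
            return edges[(a, b)]
        if (b, a) in edges:
            return tuple(-d for d in edges[(b, a)])
        return None  # the two parts are not connected

    pairs = [(u, v) for u in shapes1 for v in shapes2 if shapes1[u] == shapes2[v]]
    adj = {p: set() for p in pairs}
    for p in pairs:
        for q in pairs:
            distinct = p[0] != q[0] and p[1] != q[1]
            if distinct and rel(edges1, p[0], q[0]) == rel(edges2, p[1], q[1]):
                adj[p].add(q)
    return adj


def maximal_cliques(adj):
    """Bron-Kerbosch enumeration (no pivoting) of all maximal cliques."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(frozenset(), set(adj), set())
    return cliques


# Hypothetical generic "graspable" graph and an observed object with one extra part.
generic_shapes = {"g_body": "cylinder", "g_handle": "torus"}
generic_edges = {("g_body", "g_handle"): (1, 0, 0)}
object_shapes = {"a": "cylinder", "b": "torus", "c": "box"}
object_edges = {("a", "b"): (1, 0, 0), ("a", "c"): (0, 0, 1)}

adj = match_graph(object_shapes, object_edges, generic_shapes, generic_edges)
best = max(maximal_cliques(adj), key=len)
print(sorted(best))
```

Unlike a full isomorphism test, the largest clique gives the biggest consistent partial match, so an object with extra or missing parts can still be compared against a generic graph.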

Tools

References

[1] J. J. Gibson (1977), The Theory of Affordances. In Perceiving, Acting, and Knowing, Eds. Robert Shaw and John Bransford, ISBN 0-470-99014-7.
[2] E. Sahin, M. Cakmak, M. R. Dogar, E. Ugur, G. Ucoluk, To Afford or Not to Afford: A New Formalization of Affordances Toward Affordance-Based Robot Control, Adaptive Behavior, 2007, pp. 447-472.
[3] I. Biederman, Recognition-by-components: A theory of Human Image Understanding, Psychological Review, Vol. 94 (1987), pp. 115-148.
[4] M. A. Arbib, CS564 – Brain Theory and Artificial Intelligence, USC, Fall 2001, Lecture 7: Object Recognition.

Thanks for listening

Appendix

Human Image Understanding

• Hypothesis: a small number of geometric components (geons) constitute the primitive elements of the object recognition system (like letters to form words)

• Geons are directly recognized from edges, based on their non-accidental properties (i.e., 3D features that are usually preserved by the projective imaging process):
  – edges are straight or curved
  – pairs of edges are parallel or non-parallel
  – vertices will always appear to be vertices

• Non-accidental properties allow geons to be recognized from any perspective.

• The information in the geons is redundant, so they can be recognized even when partially occluded.

Appendix: The Importance of Spatial Arrangement

Appendix: The Principle of Non-Accidentalness

Examples:

• Collinearity

• Smoothness

• Symmetry

• Parallelism

• Cotermination

Appendix: Some Non-Accidental Differences