Zhiwen Fang
BeingTogether Centre, IMI, Research Fellow

Understanding Human-Object Interaction in RGB-D videos for Human Robot Interaction

Motivation

Human-robot interaction (HRI) [1,2,3]

Social robot

Verbal language

Non-verbal language

Facial expression

Body gesture

Object

[1] Yang Xiao, Zhijun Zhang, Aryel Beck, Junsong Yuan, and Daniel Thalmann. 2014. Human–robot interaction by understanding upper body gestures. Presence: Teleoperators and Virtual Environments 23, 2 (2014), 133–154.
[2] Isibor Kennedy Ihianle, Usman Naeem, and Abdel-Rahman Tawil. 2016. Recognition of activities of daily living from topic model. Procedia Computer Science 98 (2016), 24–31.
[3] Marina Pérez-Jiménez, Borja Bordel Sánchez, and Ramón Alcarria. 2016. T4AI: A system for monitoring people based on improved wearable devices. Research Briefs on Information & Communication Technology Evolution (ReBICTE) 2 (2016), 1–16.

Motivation

Understand the intention of the human based on the object information

With a cell phone in hand and close to the ear, it may indicate that the person is having a call.

With a cup in hand and close to the mouth, it may indicate that the person is drinking.
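The two examples above can be read as simple rules that combine the held object's class with its proximity to a body part. The following is a minimal sketch of that idea, not the method used in the talk: the rule table, the function name `infer_intention`, the body-part names, and the 0.15 m distance threshold are all assumptions made for illustration.

```python
import math

# Hypothetical rule table: (object class, nearby body part) -> activity.
# Only the two examples from the slide are encoded here.
RULES = {
    ("cell phone", "ear"): "having a call",
    ("cup", "mouth"): "drinking",
}


def infer_intention(object_label, hand_xyz, body_parts, max_dist=0.15):
    """Return an activity label, or None if no rule applies.

    object_label: class of the object detected in the hand.
    hand_xyz:     3D position of the hand (metres).
    body_parts:   dict mapping a part name to its 3D position (metres).
    max_dist:     assumed proximity threshold in metres.
    """
    for (label, part), activity in RULES.items():
        if label != object_label or part not in body_parts:
            continue
        # The rule fires only when the hand is close to the body part.
        if math.dist(hand_xyz, body_parts[part]) <= max_dist:
            return activity
    return None
```

For instance, `infer_intention("cup", (0, 0, 0), {"mouth": (0.05, 0, 0)})` would match the drinking rule, while the same cup held half a metre away would match nothing.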

How to detect hand-held objects?

Outline

1 Introduction
2 Method
3 System overview
4 Results
Conclusions

Wearable sensors & Radio Frequency Identification tags [1]

Thermal band images [2]

Computer vision method based on RGB camera [3][4]

[1] K. P. Fishkin, M. Philipose, and A. Rea. 2005. Hands-on RFID: wireless wearables for detecting use of objects. In IEEE International Symposium on Wearable Computers, 2005. Proceedings. 38–43.
[2] Cigdem Beyan and Alptekin Temizel. 2015. A multimodal approach for individual tracking of people and their belongings. The Imaging Science Journal 63, 4 (2015), 192–202.
[3] Chaitanya Desai, Deva Ramanan, and Charless Fowlkes. 2010. Discriminative models for static human-object interactions. In Computer vision and pattern recognition workshops (CVPRW), 2010 IEEE computer society conference on. IEEE, 9–16.
[4] Zhaozhuo Xu, Yuan Tian, Xinjue Hu, and Fangling Pu. 2015. Dangerous human event understanding using human-object interaction model. In Signal Processing, Communications and Computing (ICSPCC), 2015 IEEE International Conference on. IEEE, 1–5.

Introduction

Research problems in hand-held object detection

(1) Relationship between objects and a person

(2) Hand-held objects are often very small

(3) Targets are lost because of appearance changes and/or partial occlusion in the sequence.

Chair, bottle, cell phone, keyboard…
About 5 meters, bottle
Partial occlusion, cell phone

Method

Human contextual information

1. Skeleton data (25 body joint positions)

2. Local patch around the hand joint
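As a rough illustration of the second item, a local patch can be cut out of the RGB image around the hand joint's pixel position. This is a sketch under assumptions: the function name `crop_hand_patch` and the default patch half-size of 64 pixels are hypothetical, and the hand joint is assumed to be already projected into image coordinates.

```python
import numpy as np


def crop_hand_patch(image, hand_xy, half_size=64):
    """Crop a square patch centred on the hand joint, clipped to the image.

    image:     H x W x C array (RGB frame).
    hand_xy:   (x, y) pixel position of the hand joint.
    half_size: assumed half-width of the patch in pixels.
    Returns the patch and the (x0, y0) offset of its top-left corner,
    so detections in the patch can be mapped back to the full frame.
    """
    h, w = image.shape[:2]
    x, y = int(round(hand_xy[0])), int(round(hand_xy[1]))
    x0, x1 = max(0, x - half_size), min(w, x + half_size)
    y0, y1 = max(0, y - half_size), min(h, y + half_size)
    return image[y0:y1, x0:x1], (x0, y0)
```

Returning the offset alongside the patch is a small design choice that keeps later stages (detection, tracking) able to report boxes in full-frame coordinates.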

RGB image, person index map

Estimate the probability of belonging to a person

1. Object detection in the local patch

2. Estimate the probability using the person index map

Method

Estimate the probability of belonging to a person
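The slides do not state how the probability is computed from the person index map. One plausible sketch, offered purely as an assumption, is to take the fraction of pixels inside a detected box that carry the target person's index. The function name `belonging_probability` and the box layout `(x0, y0, x1, y1)` are hypothetical.

```python
import numpy as np


def belonging_probability(person_index_map, box, person_id):
    """Fraction of pixels inside `box` labelled with `person_id`.

    person_index_map: H x W integer array, one person index per pixel
                      (background pixels hold some other value).
    box:              (x0, y0, x1, y1) in pixels, x1/y1 exclusive.
    person_id:        index of the person of interest.
    This ratio is an assumed stand-in for the probability in the talk.
    """
    x0, y0, x1, y1 = box
    region = person_index_map[y0:y1, x0:x1]
    if region.size == 0:
        return 0.0
    return float(np.mean(region == person_id))
```

A box that overlaps the person's silhouette heavily scores close to 1, while a box over the background scores 0, which gives a simple way to rank detections by how likely they are to be held by that person.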

Method

Object detection in a local patch by YOLO [1, 2]

(1) resize the image to 544 × 544

(2) run a convolutional network on the resized image

(3) threshold the resulting detections by the model's confidence.
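The three steps can be sketched as a thin wrapper around a detector. This is not the YOLO implementation: `network` is a hypothetical callable standing in for the convolutional network, assumed to return `(label, confidence, box)` tuples, and the nearest-neighbour resize and the 0.5 threshold are assumptions chosen to keep the example self-contained.

```python
import numpy as np


def resize_nearest(image, size=544):
    """Step 1: resize an H x W x C image to size x size (nearest neighbour)."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return image[rows[:, None], cols[None, :]]


def detect(image, network, conf_threshold=0.5, size=544):
    """Steps 2 and 3: run the network, then keep confident detections.

    network: hypothetical callable mapping the resized image to a list of
             (label, confidence, box) tuples.
    """
    resized = resize_nearest(image, size)
    detections = network(resized)
    return [d for d in detections if d[1] >= conf_threshold]
```

In practice the resize would usually be done by the detection framework itself; the point here is only the order of operations and the final confidence filter.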

[1] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 779-788.
[2] Redmon J, Farhadi A. YOLO9000: better, faster, stronger[J]. arXiv preprint, 2017.

Method

Object tracking based on correlation filter [1]

(1) dense sampling by modeling all possible translations of the base sample in a search window as circulant shifts

(2) learning the correlation filter by solving a ridge regression problem in the Fourier domain.

[1] Henriques J F, Caseiro R, Martins P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583-596.
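The two steps can be illustrated with a simplified, linear-kernel version of the filter. Because all circulant shifts of the base sample are diagonalised by the Fourier transform, the ridge regression reduces to an element-wise division. This sketch assumes a single-channel patch and omits the kernel trick, cosine windowing, and model updates of the full tracker; the function names and the regularisation value are hypothetical.

```python
import numpy as np


def gaussian_target(shape, sigma=2.0):
    """Regression target: a Gaussian peaked at shift (0, 0), with wrap-around."""
    h, w = shape
    ys = np.minimum(np.arange(h), h - np.arange(h))
    xs = np.minimum(np.arange(w), w - np.arange(w))
    d2 = ys[:, None] ** 2 + xs[None, :] ** 2
    return np.exp(-0.5 * d2 / sigma ** 2)


def train_filter(x, y, lam=1e-4):
    """Solve the ridge regression over all circulant shifts of x.

    In the Fourier domain the solution is alpha_hat = y_hat / (kxx_hat + lam),
    where kxx_hat is the spectrum of the linear-kernel auto-correlation of x.
    """
    x_hat = np.fft.fft2(x)
    kxx_hat = np.conj(x_hat) * x_hat
    return np.fft.fft2(y) / (kxx_hat + lam)


def detect_shift(alpha_hat, x, z):
    """Evaluate the filter on every shift of z; return the peak (dy, dx)."""
    kxz_hat = np.conj(np.fft.fft2(x)) * np.fft.fft2(z)
    response = np.real(np.fft.ifft2(kxz_hat * alpha_hat))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    return int(dy), int(dx)
```

Training on a patch `x` and then calling `detect_shift` on a circularly shifted copy of `x` places the response peak at the applied shift, which is how the tracker localises the target in the next frame.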

Method

System overview

[System diagram. Modules: speech recognition, natural language processing, hand-held object detection, object detection. Human and robot interaction: language interaction, object exchange.]

Results

Detection rate of different methods in three categories (i.e. bottle, cup, cell phone).

* w/o represents the method without human contextual information

Conclusions

To provide intelligent human-robot interaction, it is critical to understand the interaction between the human and daily objects, so that we can analyze the intention of the human.

Using an RGB-D sensor, we provide a method to detect hand-held objects.

Human contextual information is introduced to improve the performance of hand-held object detection.

THANK YOU!
