Who’s Doing What: Joint Modeling of Names and Verbs for Simultaneous Face and Pose Annotation


Presenter: Maresh Naresh Singh

Authors: Luo Jie, Barbara Caputo and Vittorio Ferrari

• Given: News items consisting of images with their associated text.

• Goal: Figure out who is doing what.

Who is doing what?

• Guess the possible action of a person in the image.

• Use pose as well as verb for this purpose.

• Associate actions with the person in the image.

• Predict the name of the person.

(b) US Democratic presidential candidate Senator Barack Obama waves to supporters together with his wife Michelle Obama standing beside him at his North Carolina and Indiana primary election night rally in Raleigh.

(a) Four sets ... Roger Federer prepares to hit a backhand in a quarter-final match with Andy Roddick at the US Open.

Correspondence ambiguity problem.

• Multiple persons in the image and captions.

• Person in the image but not mentioned in the caption.

• Mention in the caption but not present in the image.

• The title: “Joint Modeling of Names and Verbs for Simultaneous Face and Pose Annotation”
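One way to make the ambiguity concrete is to list every admissible pairing between detected persons and caption entries. The sketch below is illustrative only: the function name `enumerate_correspondences`, the use of `None` for an unmatched person, and the rule that each caption entry is used at most once are assumptions, not details taken from the slides.

```python
from itertools import permutations

NULL = None  # marks a detected person that matches no caption entry


def enumerate_correspondences(num_persons, caption_entries):
    """Return every way to assign caption entries to detected persons.

    Each assignment is a tuple of length num_persons. Position i holds the
    caption entry given to person i, or NULL. A caption entry is used at
    most once, so entries may also remain unassigned.
    """
    # Pad the entries with one NULL per person so any person can be left
    # unexplained, then take ordered selections of the right length.
    pool = list(caption_entries) + [NULL] * num_persons
    unique = set(permutations(pool, num_persons))  # drops repeats from NULLs
    return sorted(unique, key=lambda a: [str(x) for x in a])


# Two detected persons, two names in the caption.
assignments = enumerate_correspondences(2, ["Barack Obama", "Michelle Obama"])
for assignment in assignments:
    print(assignment)
```

The number of candidates grows quickly with the number of persons and caption entries, which is why the correspondence is treated as a latent variable rather than fixed by hand.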

Generative Model

• Observed variables: Names and verbs in the caption. Detected persons in the image.

• Latent variables: Image-caption correspondence.

• Parameters: Visual appearances of face and pose classes corresponding to different names and verbs.

• EM is used to infer the latent variables and estimate the parameters.

Features

Face and pose recognition

• Uses a face detector and an upper-body detector.

• A face and an upper body are considered to belong to the same person if the face lies in the center of the upper-body bounding box.
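The association rule above can be sketched as a simple geometric test. The slides do not say how "in the center" is measured, so the version below makes its own assumptions: boxes are `(x_min, y_min, x_max, y_max)` tuples, the face center must fall within a horizontal band around the body's vertical center line, and the hypothetical `tolerance` parameter sets the band's half-width as a fraction of the body width.

```python
def box_center(box):
    """Return the (x, y) center of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def face_belongs_to_body(face_box, body_box, tolerance=0.25):
    """Check whether a face sits in the central part of an upper-body box.

    Assumption: "center" means the face center is horizontally within
    tolerance * body_width of the body's center line and vertically
    inside the body box.
    """
    face_x, face_y = box_center(face_box)
    body_x, _ = box_center(body_box)
    body_x_min, body_y_min, body_x_max, body_y_max = body_box
    half_band = tolerance * (body_x_max - body_x_min)
    horizontally_centered = abs(face_x - body_x) <= half_band
    vertically_inside = body_y_min <= face_y <= body_y_max
    return horizontally_centered and vertically_inside


body = (0, 0, 100, 100)
centered_face = (40, 10, 60, 30)   # face center at x = 50
off_side_face = (0, 10, 20, 30)    # face center at x = 10
print(face_belongs_to_body(centered_face, body))
print(face_belongs_to_body(off_side_face, body))
```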

Name-Verb pair.

• Language parser extracts name-verb pair from each caption.

• Uses OpenNLP.

Summary from last class.

Probability Function

• Uses EM to maximize the above function.

• Maximizing the previous equation boils down to minimizing the equation:

EM algorithm (Initialization)

• Compute distance matrix between faces/poses from images sharing some name/verb in the caption.

• For each name/verb pair, select all captions containing only that name/verb.

• If the corresponding images contain only one person, their faces/poses are used to initialize the center vectors.

• If the corresponding images always contain multiple persons, a person is assigned by random selection.
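The initialization steps above can be sketched roughly as follows. This is not the authors' code: the data layout (a list of dicts with hypothetical `labels` and `persons` keys), the use of a plain mean as the center vector, and the fixed random seed are all assumptions made for illustration. The distance-matrix step is omitted.

```python
import random


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def initialize_centers(items, seed=0):
    """Build one starting center vector per name (or verb).

    items: list of dicts with
      "labels":  names (or verbs) found in the caption
      "persons": feature vectors of the persons detected in the image
    """
    rng = random.Random(seed)
    all_labels = sorted({label for item in items for label in item["labels"]})
    centers = {}
    for label in all_labels:
        # Captions that contain only this label, with at least one detection.
        exclusive = [it for it in items
                     if it["labels"] == [label] and it["persons"]]
        single = [it["persons"][0] for it in exclusive
                  if len(it["persons"]) == 1]
        if single:
            # Unambiguous images: one caption label, one detected person.
            centers[label] = mean_vector(single)
        elif exclusive:
            # Only multi-person images exist: pick one person at random each.
            picks = [rng.choice(it["persons"]) for it in exclusive]
            centers[label] = mean_vector(picks)
        # Labels with no exclusive caption get no center in this sketch.
    return centers


items = [
    {"labels": ["Federer"], "persons": [[0.0, 0.0]]},
    {"labels": ["Federer"], "persons": [[2.0, 2.0]]},
    {"labels": ["Roddick"], "persons": [[5.0, 5.0], [7.0, 7.0]]},
    {"labels": ["Federer", "Roddick"], "persons": [[1.0, 1.0], [6.0, 6.0]]},
]
print(initialize_centers(items))
```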

EM algorithm (E-Step)

EM algorithm (M-Step)
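The transcript does not reproduce the update equations for these two steps. As a rough intuition only, the sketch below uses a hard-assignment, k-means-style simplification: the E-step picks the cheapest pairing of caption labels to detected persons given the current center vectors, and the M-step re-averages each center. The squared Euclidean distance, the hypothetical `null_cost` penalty for leaving a person unmatched, and the data layout are all assumptions, and this is not claimed to be the paper's actual procedure.

```python
from itertools import permutations


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def e_step(persons, labels, centers, null_cost):
    """Return the lowest-cost assignment of labels (or None) to persons."""
    pool = list(labels) + [None] * len(persons)
    best, best_cost = None, float("inf")
    for assignment in set(permutations(pool, len(persons))):
        cost = sum(null_cost if label is None else sq_dist(p, centers[label])
                   for p, label in zip(persons, assignment))
        if cost < best_cost:
            best, best_cost = assignment, cost
    return best


def m_step(items, assignments, centers):
    """Recompute each center as the mean of the persons assigned to it."""
    grouped = {}
    for item, assignment in zip(items, assignments):
        for person, label in zip(item["persons"], assignment):
            if label is not None:
                grouped.setdefault(label, []).append(person)
    updated = dict(centers)  # labels with no assigned person keep their center
    for label, vectors in grouped.items():
        updated[label] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return updated


def hard_em(items, centers, null_cost=1.0, iterations=10):
    """Alternate the two steps; centers must hold every label in items."""
    assignments = []
    for _ in range(iterations):
        assignments = [e_step(it["persons"], it["labels"], centers, null_cost)
                       for it in items]
        centers = m_step(items, assignments, centers)
    return centers, assignments


items = [{"labels": ["A", "B"], "persons": [[0.0, 0.0], [10.0, 10.0]]}]
start = {"A": [1.0, 1.0], "B": [9.0, 9.0]}
final_centers, final_assignments = hard_em(items, start, null_cost=100.0)
print(final_assignments, final_centers)
```

A soft version would keep a probability for each candidate correspondence instead of a single best one; the hard version is shown here only because it is shorter.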

Experiment and observations

Comments

• Better results on the chosen dataset.

• Somewhat successful in recognizing persons in images without captions.

Comments

• Assumes independence between persons in an image.

• Limited dataset of 1610 images used for experimentation.

• Manual involvement in writing captions.

• Images collected using search queries like “Barack Obama” + “Shake hands”.

• Such queries result in images with strong correspondence between pose and face.

Thanks for tolerating me.
