Visual Dialog - Stanford University

Visual Dialog

Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M.F. Moura, Devi Parikh, Dhruv Batra

Presented by: Alan Luo



Introduction

Natural Language Processing + Computer Vision

● Aiding visually impaired users in understanding their surroundings or social media content

● Interacting with an AI assistant


Related Work

Image/Video Captioning

● Image Captioning

● Video Captioning


Related Work

Visual-Semantic Alignments

● Visual-Semantic Alignments

● Datasets


Related Work

Visual Q&A


Contributions

1. Propose a new AI task: Visual Dialog

2. Develop a novel two-person chat data-collection protocol and introduce a new dataset

3. Introduce a family of neural encoder-decoder models for Visual Dialog


Technical Details

With Late Fusion Encoder
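The slide names the Late Fusion encoder without describing it. As a rough illustration only: late fusion is generally understood to mean encoding each input separately (image, current question, dialog history), concatenating the encodings, and mapping the result to a joint vector with a linear layer. The sketch below uses plain Python with toy, hypothetical dimensions and random weights. The function name `late_fusion` and every size are assumptions made for illustration, not the authors' implementation.

```python
import random


def late_fusion(image_feat, question_vec, history_vec, weights, bias):
    """Concatenate the three encodings, then apply a single linear layer."""
    fused = image_feat + question_vec + history_vec  # list concatenation
    return [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]


# Toy inputs (hypothetical sizes; real encoders produce far larger vectors).
rng = random.Random(0)
image_feat = [rng.uniform(-1, 1) for _ in range(4)]    # stand-in for CNN image features
question_vec = [rng.uniform(-1, 1) for _ in range(3)]  # stand-in for the question encoding
history_vec = [rng.uniform(-1, 1) for _ in range(3)]   # stand-in for the history encoding

joint_dim = 5
in_dim = len(image_feat) + len(question_vec) + len(history_vec)
weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(joint_dim)]
bias = [0.0] * joint_dim

# Joint representation that a decoder would consume to produce or rank answers.
joint = late_fusion(image_feat, question_vec, history_vec, weights, bias)
```

The inputs only interact at the final layer, which is what makes the fusion "late". The encoders themselves are replaced by random vectors here.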


Dataset

VisDial

● Qualitative

● Quantitative


Results

● Qualitative Results

● Quantitative Results