Yiannis A picture is worth 13.6 words -...


Page 1: Yiannis A picture is worth 13.6 words - UMIACSusers.umiacs.umd.edu/~hal/talks/15-02-msrny-teatalk.pdf · 2015-12-18 · 1/59 Hal Daumé III, me@hal3.name A picture is worth 13.6 words

1/59 A picture is worth 13.6 words Hal Daumé III, me@hal3.name

A picture is worth 13.6 words (on average)

Alex Berg, Amit Goyal, Tamara Berg, Jesse Dodge, Yejin Choi, Yiannis Aloimonos, Kota Yamaguchi, Alyssa Mensch, Karl Stratos, Meg Mitchell, Xufeng Han, Ching Lik Teo, Yezhou Yang

Page 2:

An on-paper experiment

Write a caption for this image, one sentence in length.

(In English.)

Page 5:

People write weird captions

Another dream car to add to the list, this one spotted in Hanbury St.

Shot out my car window while stuck in traffic because people in Cincinatti can't drive in the rain

1. A distorted photo of a man cutting up a large cut of meat in a garage.

2. A man smiling at the camera while carving up meat.

3. A man smiling while he cuts up a piece of meat.

4. A smiling man is standing next to a table dressing a piece of venison.

5. The man is smiling into the camera as he cuts meat.
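
The "(on average)" in the talk's title refers to caption length. As a small illustration only (this is not the computation behind the 13.6 figure, whose corpus and tokenization are not given in these slides), the sketch below averages whitespace-delimited word counts over the five numbered captions above. The function name `mean_caption_length` is a hypothetical helper.

```python
# The five numbered captions from the slide, copied verbatim.
captions = [
    "A distorted photo of a man cutting up a large cut of meat in a garage.",
    "A man smiling at the camera while carving up meat.",
    "A man smiling while he cuts up a piece of meat.",
    "A smiling man is standing next to a table dressing a piece of venison.",
    "The man is smiling into the camera as he cuts meat.",
]


def mean_caption_length(captions):
    """Average number of whitespace-delimited tokens per caption.

    Assumption: a "word" is anything separated by whitespace; a real
    study would likely use a proper tokenizer.
    """
    lengths = [len(caption.split()) for caption in captions]
    return sum(lengths) / len(lengths)


print(mean_caption_length(captions))
```

Under this naive tokenization the five captions have 16, 10, 11, 14 and 11 words, so the mean for this one image is 12.4.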

Page 6:

What I used to think vision did...

Page 10:

Now I know better....

Page 11:

bird

boat

bottle

bowl

Detecting on a large scale...

Page 14:

Predict what people will describe

Given an image

1)

“two women sitting brunette blonde on bench reading magazine”

women ●   bench ●   magazine ●   grass   skirt

What do people describe?

Page 18:

What do people describe?

“A bearded man is holding a child in a sling.”

man, baby, sling, ladder, fridge, table, watermelon, chair, boxes, cups, water bottle, wall, pacifier, beard, glasses, shirt

What’s in this image?

Predicting what will be described

“A bearded man stands while holding a small child in a green sheet.”
“A bearded man with a baby in a sling poses.”
“Man standing in kitchen with little girl in green sack.”
“Man with beard and baby”
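
One simple way to look at the gap between what is in the image and what gets described is to count how many captions mention each labeled object. The sketch below does that for the object labels and five captions on this slide. It is an illustrative assumption, not the method used in the talk: `mention_counts` is a hypothetical helper, and it matches exact words only, so "child" or "little girl" does not count as a mention of "baby", and "bearded" does not count as "beard".

```python
import re
from collections import Counter

# Object labels and captions copied from the slide.
objects = [
    "man", "baby", "sling", "ladder", "fridge", "table", "watermelon",
    "chair", "boxes", "cups", "water bottle", "wall", "pacifier",
    "beard", "glasses", "shirt",
]
captions = [
    "A bearded man is holding a child in a sling.",
    "A bearded man stands while holding a small child in a green sheet.",
    "A bearded man with a baby in a sling poses.",
    "Man standing in kitchen with little girl in green sack.",
    "Man with beard and baby",
]


def mention_counts(objects, captions):
    """Count, for each object label, the captions that mention it verbatim."""
    counts = Counter()
    for caption in captions:
        # Lowercase, keep alphabetic tokens, and pad with spaces so that
        # multi-word labels such as "water bottle" match on word boundaries.
        text = " " + " ".join(re.findall(r"[a-z]+", caption.lower())) + " "
        for obj in objects:
            if f" {obj} " in text:
                counts[obj] += 1
    return counts


counts = mention_counts(objects, captions)
for obj in objects:
    print(f"{obj}: {counts[obj]} of {len(captions)}")
```

Even with this crude matching, most labeled objects (ladder, fridge, watermelon, and so on) are never mentioned, while "man" appears in every caption.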

Page 20:

Two kinds of factors
– Compositional
– Semantic

What factors influence what someone will describe about an image?

Description factors

Page 21:

“A sail boat on the ocean.”

Size/Saliency

Location

Compositional factors

Page 22:

Compositional factors

“Two men standing on beach.”

Size/Saliency

Location

Page 23:

“girl in the street”

Object Type

Nameable Scene

Unusualness

Semantic factors

Page 24:

Semantic factors

“kitchen in house”

Object Type

Nameable Scene

Unusualness

Page 25:

Semantic factors

“elephant in the beach”

Object Type

Nameable Scene

Unusualness

Page 26:

Semantic factors

“A tree in water and a boy with a beard”

Object Type

Nameable Scene

Unusualness

Page 27:

Using large corpora to compose natural captions

(why write your own material when you can just “steal” it?)

Page 28:

a) monkey playing in the tree canopy, Monte Verde in the rain forest

b) capuchin monkey in front of my window

c) monkey spotted in Apenheul Netherlands under the tree

d) a white-faced or capuchin in the tree in the garden

e) the monkey sitting in a tree, posing for his picture

Composing captions

Page 32:

Caption images where:

We assume some evidence for 1 object

&

Object detector is confident

Tag: “mare”

High detection score

Evidence for horse

Captioning with (some) evidence

Page 33:

Grab phrases based on image similarity between query and captioned database
– Object detection similarity – NPs, VPs
– Stuff detection similarity – PPs
– Scene similarity – PPs

Mash phrases
– Compose descriptions using simple rule-based concatenation

Generation: Grab 'N Mash
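
The "grab" step can be pictured as nearest-neighbor retrieval: compare a feature vector for the query detection with the vectors stored for captioned database images, and take the phrases attached to the closest matches. The sketch below is a minimal illustration under stated assumptions, not the system from the talk. The 3-dimensional feature vectors are made up, cosine similarity is a stand-in for whatever color, shape, or scene similarity is actually used, and `cosine` and `grab` are hypothetical helper names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def grab(query_vec, database, k=2):
    """Return the phrases of the k database entries most similar to the query."""
    ranked = sorted(
        database,
        key=lambda entry: cosine(query_vec, entry["features"]),
        reverse=True,
    )
    return [entry["phrase"] for entry in ranked[:k]]


# Toy database: invented feature vectors paired with noun phrases.
database = [
    {"features": [0.9, 0.1, 0.0], "phrase": "mandarin oranges"},
    {"features": [0.8, 0.2, 0.1], "phrase": "a box of oranges"},
    {"features": [0.0, 0.1, 0.9], "phrase": "the muddy elephant"},
]

print(grab([1.0, 0.0, 0.0], database, k=2))
```

In a fuller version, one such lookup would be run per phrase type (object detections for NPs and VPs, stuff detections and scene descriptors for PPs), each with its own feature space.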

Page 36:

Detect: fruit

Find matching fruit detections by color similarity

Tray of glace fruit in the market at Nice, France

Fresh fruit in the market

A box of oranges was just catching the sun, bringing out detail in the skin.

The street market in Santanyi, Mallorca is a must for the oranges and local crafts.

An orange tree in the backyard of the house.

mandarin oranges in glass bowl

Getting NPs – Objects

Page 37:

Getting NPs – Objects

The muddy elephant
An elephant
small elephant
A very large and seemingly old elephant
musk male elephant
African elephant
the temple elephant

Fushia flower
a flower
a pink zinna flower
This beautiful flower
a roman pink flower
a tiny pink flower
pink bursting flowers
a perfectly pink gerbera daisy

a lonesome duck
a native new zealand duck
The duck
male Mallard duck
several other ducks
a so-called navigation duck
this duck
a duck
duck
mandarin duck

Page 38:

theses cows live in the field behind my house

A cow eating flowers in the south of the Netherlands.

The cow was more interested in eating than looking at me with a camera!

While cycling north on Tremaine Road near Milton, this cow gazed across the road intently.

Detect: cow

Find matching cow detections by shape/pose similarity

Getting VPs – objects

Page 39:

Detect: grass

green manure in the veg field - Plaw Hatch

Find matching grass detections by color similarity

Found on hawthorn in boggy grass field

Sheep in a field spotted during a coastal drive from Tramore to Dungervan

I am happy in a field of green Maryland grass

Getting PPs – stuff

Page 40:

View from our B&B in this photo

Extract scene descriptor

Find matching images by scene similarity

Pedestrian street in the Old Lyon with stairs to climb up the hill of fourviere

I'm about to blow the building across the street over with my massive lung power.

Only in Paris will you find a bottle of wine on a table outside a bookstore

Getting PPs – scenes

Page 45:

the sheep meandered along a desolate road in the highlands of Scotland through frozen grass

NP: the sheep

VP: meandered along a desolate road

PP: in the highlands of Scotland

PP: through frozen grass

object color

object pose

scene

stuff

Various composition patterns:
NP VP
NP PP_stuff
NP PP_scene
…
NP VP PP_scene PP_stuff

Composing captions
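
The "mash" step on this slide is plain concatenation of retrieved phrases according to a pattern. The following is a minimal sketch of that idea using the phrases and the patterns listed above. The slide elides some patterns with "…", so only the four that are shown appear here. The assignment of "in the highlands of Scotland" to the scene slot and "through frozen grass" to the stuff slot is inferred from the composed caption on the slide, and `mash` is a hypothetical helper name.

```python
# Composition patterns listed on the slide (elided ones are not guessed).
PATTERNS = [
    ("NP", "VP"),
    ("NP", "PP_stuff"),
    ("NP", "PP_scene"),
    ("NP", "VP", "PP_scene", "PP_stuff"),
]


def mash(phrases, pattern):
    """Concatenate the phrases named by `pattern`.

    Returns None when a slot required by the pattern has no phrase,
    e.g. when no confident stuff detection was available.
    """
    if any(slot not in phrases for slot in pattern):
        return None
    return " ".join(phrases[slot] for slot in pattern)


# Phrases from the slide, keyed by the slot they fill.
phrases = {
    "NP": "the sheep",
    "VP": "meandered along a desolate road",
    "PP_scene": "in the highlands of Scotland",
    "PP_stuff": "through frozen grass",
}

for pattern in PATTERNS:
    print(" ".join(pattern), "->", mash(phrases, pattern))
```

The last pattern reproduces the caption shown on the slide. Because nothing checks that the pieces agree with each other, this kind of concatenation can also produce fluent-looking but incoherent output.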

Page 46:

cat enjoys hiding under the tree

A female Monarch butterfly was visiting the plant in my front yard in Devon 17/10/10

Stained glass window depicting Christ and numerous saints in Washington National Cathedral in the Eglise

A double-decker bus under some spreading shade trees

her flower girl dress designed by Mainbocher in the house

A duck was having a bath in the harbor at whitehaven, cumbria, england in the water near Camley St

Good results

Page 50:

Language issues

A Moo cow tied up around the city eating grass in various places under the tree at the young tree

male tiger sighting in twelve months of a street

Vision issues

The silhouetted building and cross stands under water around Loon Mountain

a girl walking by in a green field in the sun

Just plain silly

dogs running pic, this time, racing through the sea at Fraisthorpe near Bridlington of Christmas tree in bed

bike was left here by an ancient civilization not as sophisticated as our own in the grass of granite

Not so good results

Page 54:

What about 2nd language learning?
● Obvious problems
  ● Assumes knowledge of 1st language
  ● Assumes knowledge of the world
  ● Still don't have a robot...
● But we do have software with exercises for SLA

It's hard for people, too!

Page 56:

Aspects of computational 2ndLL
● Very specific linguistic variants
  ● Number, case, agreement, etc.
  ● Not enough to get the majority case
● Focus on subtle visual differences

Page 57:

Aspects of computational 2ndLL
● AI-style reasoning & one-shot learning

Page 58:

What is needed to solve this?
● Linguistic model over character sequences (words not okay!) w/o any L-specific background
● Pre-trained (?) visual detectors for objects, poses and physical relationships (e.g., gaze)
● Ability to reason and generalize from a few examples

Page 59:

Thanks! Questions?

Alex Berg, Amit Goyal, Tamara Berg, Jesse Dodge, Yejin Choi, Yiannis Aloimonos, Kota Yamaguchi, Alyssa Mensch, Karl Stratos, Meg Mitchell, Xufeng Han, Ching Lik Teo, Yezhou Yang