WordsEye: From Text To Pictures


Page 1: WordsEye: From Text To Pictures

WordsEye: From Text To Pictures

The very humongous silver sphere is fifty feet above the ground. The silver castle is in the sphere. The castle is 80 feet wide. The ground is black. The sky is partly cloudy.

Page 2: WordsEye: From Text To Pictures

Why is it hard to create 3D graphics?

Page 3: WordsEye: From Text To Pictures

The tools are complex

Page 4: WordsEye: From Text To Pictures

Too much detail

Page 5: WordsEye: From Text To Pictures

Involves training, artistic skill, and expense

Page 6: WordsEye: From Text To Pictures

Pictures from Language

No GUI bottlenecks - Just describe it!

• Low entry barrier - no special skill or training required

• Give up detailed direct manipulation for speed and economy of expression

– Language expresses constraints

– Bypass rigid, pre-defined paths of expression (dialogs, menus, etc) as defined by GUI

– Objects vs Polygons – draw upon objects in pre-made 3D and 2D libraries

Enable novel applications in education, gaming, online communication, . . .

• Using language is fun and stimulates imagination

Semantics

• 3D scenes provide an intuitive representation of meaning by making explicit the contextual elements implicit in our mental models.

Page 7: WordsEye: From Text To Pictures

WordsEye Initial Version (with Richard Sproat)

Developed at AT&T Labs

• Graphics: Mirai 3D animation system on Windows NT

• Church Tagger, Collins Parser on Linux

• WordNet (http://wordnet.princeton.edu/)

• Viewpoint 3D model library

• NLP (Linux) and depiction/graphics (Windows NT) communicate via sockets

• WordsEye code in Common Lisp

Siggraph paper (August 2001)

Page 8: WordsEye: From Text To Pictures

New Version (with Richard Sproat)

Rewrote software from scratch

• Linux and CMUCL

• Custom Parser/Tagger

• OpenGL for 3D preview display

• Radiance Renderer

• ImageMagick, GIMP for 2D post-effects

Different subset of functionality

•No verbs/poses yet

Web interface (www.wordseye.com)

• Webserver and multiple backend text-to-scene servers

• Gallery/Forum/E-Cards/Picture Books/2D effects

Page 9: WordsEye: From Text To Pictures

A tiny grey manatee is in the aquarium. It is facing right. The manatee is six inches below the top of the aquarium. The ground is tile. There is a large brick wall behind the aquarium.

Example

Page 10: WordsEye: From Text To Pictures

A silver head of time is on the grassy ground. The blossom is next to the head. the blossom is in the ground. the green light is three feet above the blossom. the yellow light is 3 feet above the head. The large wasp is behind the blossom. the wasp is facing the head.

Example

Page 11: WordsEye: From Text To Pictures

The humongous white shiny bear is on the American mountain range. The mountain range is 100 feet tall. The ground is water. The sky is partly cloudy. The airplane is 90 feet in front of the nose of the bear. The airplane is facing right.

Example

Page 12: WordsEye: From Text To Pictures

A microphone is in front of a clown. The microphone is three feet above the ground. The microphone is facing the clown. A brick wall is behind the clown. The light is on the ground and in front of the clown.

Example

Page 13: WordsEye: From Text To Pictures

Example: using user-uploaded image

Page 14: WordsEye: From Text To Pictures

Mary uses the crossbow. She rides the horse by the store. The store is under the large willow. The small allosaurus is in front of the horse. The dinosaur faces Mary. The gigantic teacup is in front of the store. The gigantic mushroom is in the teacup. The castle is to the right of the store.

Example: original version of software

Page 15: WordsEye: From Text To Pictures

Web Interface – preview mode

Page 16: WordsEye: From Text To Pictures

Web Interface – rendered (raytraced)

Page 17: WordsEye: From Text To Pictures

WordsEye Overview

Linguistic Analysis

• Parsing

• Create dependency-tree representation

• Anaphora resolution

Interpretation

• Add implicit objects, relations

• Resolve semantics and references

Depiction

• Database of 3D objects, poses, textures

• Depiction rules generate graphical constraints

• Apply constraints to create scene

Page 18: WordsEye: From Text To Pictures

Linguistic Analysis

Tag part-of-speech

Parse

Generate semantic representation

•WordNet-like dictionary for nouns

Anaphora resolution

Page 19: WordsEye: From Text To Pictures

Example: John said that the cat is on the table.

Page 20: WordsEye: From Text To Pictures

Parse tree for: John said that the cat was on the table.

Said (Verb)
  John (Noun)
  That (Comp)
    Was (Verb)
      Cat (Noun)
      On (Prep)
        Table (Noun)

Page 21: WordsEye: From Text To Pictures

Nouns: Hierarchical Dictionary

Physical Object
  Living thing
    Animal
      Cat: cat-vp2842, cat-vp2843
      Dog: dog-vp23283, dog_standing-vp5041
    Plant
  Inanimate Object
    . . .

Page 22: WordsEye: From Text To Pictures

WordNet problems

Inheritance conflates functional and lexical relations

• “Terrace” is a “plateau”

• “Spoon” is a “container”

• “Crossing Guard” is a “traffic cop”

• “Bellybutton” is a “point”

Lack of multiple inheritance between synsets

• “Princess” is an aristocrat, but not a female

• “Ceramic-ware” is grouped under “utensil” and has “earthenware”, etc. under it. But there are no dishes or plates under it, because those are categorized elsewhere under “tableware”

Lacks relations other than ISA. Thesaurus vs dictionary.

• Snowball “made-of” snow

• Italian “resident-of” Italy

Cluttered with obscure words and word senses

• “Spoon” as a type of golf club

Create our own dictionary to address these problems
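One way to picture a dictionary that addresses these problems is a small table in which each entry can have several ISA parents plus other named relations. The sketch below is illustrative only: the `LEXICON` entries, the `ancestors` and `is_a` helpers, and the `person` and `ball` parents are assumptions made for the example, not the WordsEye dictionary.

```python
# Hypothetical mini-lexicon: each entry may have several ISA parents
# plus other named relations such as made-of or resident-of.
LEXICON = {
    "princess":   {"isa": ["aristocrat", "female"]},
    "aristocrat": {"isa": ["person"]},
    "female":     {"isa": ["person"]},
    "person":     {"isa": []},
    "snowball":   {"isa": ["ball"], "made-of": ["snow"]},
    "ball":       {"isa": []},
    "italian":    {"isa": ["person"], "resident-of": ["italy"]},
}


def ancestors(word):
    """Return every concept reachable from `word` through ISA links."""
    seen = set()
    stack = list(LEXICON.get(word, {}).get("isa", []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(LEXICON.get(parent, {}).get("isa", []))
    return seen


def is_a(word, concept):
    """True if `concept` is an ISA ancestor of `word`."""
    return concept in ancestors(word)


print(is_a("princess", "female"))      # multiple inheritance
print(LEXICON["snowball"]["made-of"])  # a non-ISA relation
```

Keeping "made-of" and "resident-of" as separate keys means they never leak into the ISA hierarchy, so a snowball is not treated as a kind of snow.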

Page 23: WordsEye: From Text To Pictures

Semantic Representation for: John said that the blue cat was on the table.

1. Object: “mr-happy” (John)

2. Object: “cat-vp39798” (cat)

3. Object: “table-vp6204” (table)

4. Action: “say” :subject <element 1> :direct-object <elements 2,3,5,6> :tense “PAST”

5. Attribute: “blue” :object <element 2>

6. Spatial-Relation “on” :figure <element 2> :ground <element 3>
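The numbered elements above refer to one another by index, which maps naturally onto a list of records. The following is a minimal sketch of that idea; the `Element` class, its field names, and the `describe` helper are assumptions for illustration and are not taken from the WordsEye code (which the slides say is written in Common Lisp).

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One numbered element of the semantic representation."""
    index: int
    kind: str    # "Object", "Action", "Attribute", or "Spatial-Relation"
    value: str   # e.g. "cat-vp39798", "say", "blue", "on"
    # Role name -> element index, list of indices, or a literal such as "PAST".
    roles: dict = field(default_factory=dict)


# Elements 1-6 from the slide; cross-references are element indices.
elements = [
    Element(1, "Object", "mr-happy"),
    Element(2, "Object", "cat-vp39798"),
    Element(3, "Object", "table-vp6204"),
    Element(4, "Action", "say",
            {"subject": 1, "direct-object": [2, 3, 5, 6], "tense": "PAST"}),
    Element(5, "Attribute", "blue", {"object": 2}),
    Element(6, "Spatial-Relation", "on", {"figure": 2, "ground": 3}),
]

by_index = {e.index: e for e in elements}


def describe(relation):
    """Render a spatial relation by following its figure/ground references."""
    figure = by_index[relation.roles["figure"]].value
    ground = by_index[relation.roles["ground"]].value
    return f"{figure} {relation.value} {ground}"


print(describe(by_index[6]))
```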

Page 24: WordsEye: From Text To Pictures

Anaphora resolution: The duck is in the sea. It is upside down. The sea is shiny and transparent. The ground is invisible. The apple is 3 inches below the duck. It is in front of the duck. The yellow illuminator is 3 feet above the apple. The cyan illuminator is 6 inches to the left of it. The magenta illuminator is 6 inches to the right of it. It is partly cloudy.

Page 25: WordsEye: From Text To Pictures

Indexical Reference: Three dogs are on the table. The first dog is blue. The first dog is 5 feet tall. The second dog is red. The third dog is purple.

Page 26: WordsEye: From Text To Pictures

Interpretation

Interpret semantic representation

• Object selection

• Resolve semantic relations/properties based on object types

• Answer Who? What? When? Where? How?

• Disambiguate/normalize relations and actions

• Identify and resolve references to implicit objects

Page 27: WordsEye: From Text To Pictures

Object Selection: When object is missing or doesn't exist . . .

Text object: “Foo on table”

Substitute image: “Fox on table”

Related object: “Robin on table”

Page 28: WordsEye: From Text To Pictures

Object attribute interpretation (modify versus selection)

Conventional: “American horse”

Unconventional: “Richard Sproat horse”

Substance: “Stone horse”

Selection: “Chinese house”

Page 29: WordsEye: From Text To Pictures

Semantic Interpretation of “Of”

Containment: “bowl of cats”

Part: “head of the cow”

Property: “height of horse is..”

Grouping: “stack of cats”

Substance: “horse of stone”

Abstraction: “head of time”

Page 30: WordsEye: From Text To Pictures

Implicit objects & references

Mary rode by the store. Her motorcycle was red.

• Verb resolution: Identify implicit vehicle

•Functional properties of objects

• Reference

•Motorcycle matches the vehicle

•Her matches with Mary
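The two steps listed above (a verb implies an object with some functional property, and a later noun is matched against it) can be sketched with two lookup tables. Everything named here is hypothetical: `VERB_IMPLIES`, `FUNCTIONAL_PROPERTIES`, `resolve_implicit`, and the table entries are assumptions for illustration, not the system's actual rules.

```python
# Hypothetical knowledge tables; the entries are illustrative only.
VERB_IMPLIES = {"ride": "vehicle"}  # verb -> functional property of an implicit object
FUNCTIONAL_PROPERTIES = {
    "motorcycle": {"vehicle"},
    "horse": {"vehicle"},
    "store": set(),
}


def resolve_implicit(verb, explicit_objects, later_mentions):
    """Return the noun that fills the verb's implicit slot, or a placeholder."""
    needed = VERB_IMPLIES.get(verb)
    if needed is None:
        return None
    # An object in the same sentence with the right function wins.
    for noun in explicit_objects:
        if needed in FUNCTIONAL_PROPERTIES.get(noun, set()):
            return noun
    # Otherwise look for a later mention that matches the function.
    for noun in later_mentions:
        if needed in FUNCTIONAL_PROPERTIES.get(noun, set()):
            return noun
    return f"<default {needed}>"


# "Mary rode by the store. Her motorcycle was red."
print(resolve_implicit("ride", ["store"], ["motorcycle"]))
```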

Page 31: WordsEye: From Text To Pictures

Implicit Reference: Mary rode by the store. Her motorcycle was red.

Page 32: WordsEye: From Text To Pictures

Depiction

3D object and image database

Graphical constraints

•Spatial relations

•Attributes

•Posing

•Shape/Topology changes

Depiction process

Page 33: WordsEye: From Text To Pictures

3D Object Database

2,000+ 3D polygonal objects

Augmented with:

•Spatial tags (top surface, base, cup, push handle, wall, stem, enclosure)

•Skeletons

•Default size, orientation

•Functional properties (vehicle, weapon . . .)

•Placement/attribute conventions

Page 34: WordsEye: From Text To Pictures

2000+ 3D Objects

Page 35: WordsEye: From Text To Pictures

10,000 images and textures

B&W drawings Texture Maps Artwork Photographs

Page 36: WordsEye: From Text To Pictures

3D Objects and Images tagged with semantic info

Spatial tags for 3D object regions

Object type (e.g. WordNet synset)

• Is-a

• represents

Object size

Object orientation (front, preferred supporting surface -- wall/top)

Compound object constituents

Other object properties (style, parts, etc.)

Page 37: WordsEye: From Text To Pictures

Canopy (under, beneath) Top Surface (on, in)

Spatial Tags

Page 38: WordsEye: From Text To Pictures

Spatial Tags

Base (under, below, on) Cup (in, on)

Page 39: WordsEye: From Text To Pictures

Spatial Tags

Push Handle (actions) Wall (on, against)

Page 40: WordsEye: From Text To Pictures

Spatial Tags

Stem (in) Enclosure (in)

Page 41: WordsEye: From Text To Pictures

Stem in Cup: The daisy is in the test tube.

Page 42: WordsEye: From Text To Pictures

Enclosure and top surface: The bird is in the bird cage. The bird cage is on the chair.
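The tag-to-preposition pairings on the Spatial Tags slides can be read as a lookup table. The sketch below copies those pairings into `SPATIAL_TAGS`; the per-object tag lists in `OBJECT_TAGS` and the `matching_tags` helper are assumptions for illustration. The slides do not say how tags on the figure and on the ground object are combined, so the helper only reports which tags of one object are compatible with a preposition.

```python
# Tag -> prepositions, copied from the Spatial Tags slides.
# "Push Handle (actions)" is omitted because it is tied to actions, not prepositions.
SPATIAL_TAGS = {
    "canopy": {"under", "beneath"},
    "top surface": {"on", "in"},
    "base": {"under", "below", "on"},
    "cup": {"in", "on"},
    "wall": {"on", "against"},
    "stem": {"in"},
    "enclosure": {"in"},
}

# Hypothetical per-object tag lists; a real object database would supply these.
OBJECT_TAGS = {
    "test tube": ["cup"],
    "bird cage": ["enclosure"],
    "chair": ["top surface"],
    "daisy": ["stem"],
}


def matching_tags(preposition, obj):
    """Return the object's tagged regions that are compatible with the preposition."""
    return [tag for tag in OBJECT_TAGS.get(obj, [])
            if preposition in SPATIAL_TAGS[tag]]


# "The daisy is in the test tube."
print(matching_tags("in", "daisy"), matching_tags("in", "test tube"))
# "The bird is in the bird cage. The bird cage is on the chair."
print(matching_tags("in", "bird cage"), matching_tags("on", "chair"))
```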

Page 43: WordsEye: From Text To Pictures

Spatial Relations

Relative positions

• On, under, in, below, off, onto, over, above . . .

• Distance

Sub-region positioning

• Left, middle, corner, right, center, top, front, back

Orientation

• facing (object, left, right, front, back, east, west . . .)

Time-of-day relations

Page 44: WordsEye: From Text To Pictures

Vertical vs Horizontal “on”, distances, directions: The couch is against the wood wall. The window is on the wall. The window is next to the couch. the door is 2 feet to the right of the window. the man is next to the couch. The animal wall is to the right of the wood wall. The animal wall is in front of the wood wall. The animal wall is facing left. The walls are on the huge floor. The zebra skin coffee table is two feet in front of the couch. The lamp is on the table. The floor is shiny.

Page 45: WordsEye: From Text To Pictures

Attributes

Size

•height, width, depth

•Aspect ratio (flat, wide, thin . . .)

Surface attributes

•Texture database

•Color, Texture, Opacity, reflectivity

•Applied to objects or textures themselves

•Brightness (for lights)

Page 46: WordsEye: From Text To Pictures

Attributes: The orange battleship is on the brick cow. The battleship is 3 feet long.

Page 47: WordsEye: From Text To Pictures

Time of day & cloudiness

Page 48: WordsEye: From Text To Pictures

Time of day & lighting

Page 49: WordsEye: From Text To Pictures

Poses (original version only -- not yet implemented in web version)

Represent actions

Database of 500+ human poses

• Grips

• Usage (specialized/generic)

• Standalone

Merge poses (upper/lower body, hands)

• Gives wide variety by mix’n’match

Dynamic posing/IK

Page 50: WordsEye: From Text To Pictures

Grip wine_bottle-bc0014 Use bicycle_10-speed-vp8300

Poses

Page 51: WordsEye: From Text To Pictures

Throw “round object” Run

Poses

Page 52: WordsEye: From Text To Pictures

Combined poses: Mary rides the bicycle. She plays the trumpet.
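Merging an upper-body pose with a lower-body pose can be pictured as combining two joint tables. This is a rough sketch under stated assumptions: the joint names, the angle values, the `UPPER_BODY` set, and `merge_poses` are all hypothetical, and the actual pose database format is not described in the slides.

```python
# Hypothetical pose data: joint name -> rotation in degrees.
UPPER_BODY = {"left_shoulder", "right_shoulder", "left_elbow", "right_elbow"}

ride_bicycle = {"left_shoulder": 40, "right_shoulder": 40, "left_elbow": 20,
                "right_elbow": 20, "left_knee": 90, "right_knee": 45}
play_trumpet = {"left_shoulder": 70, "right_shoulder": 75, "left_elbow": 110,
                "right_elbow": 100, "left_knee": 0, "right_knee": 0}


def merge_poses(lower_pose, upper_pose):
    """Take upper-body joints from one pose and all other joints from the other."""
    merged = dict(lower_pose)
    for joint in UPPER_BODY:
        if joint in upper_pose:
            merged[joint] = upper_pose[joint]
    return merged


# Legs from riding, arms from playing the trumpet.
combined = merge_poses(ride_bicycle, play_trumpet)
print(combined)
```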

Page 53: WordsEye: From Text To Pictures

The Broadway Boogie Woogie vase is on the Richard Sproat coffee table. The table is in front of the brick wall. The van Gogh picture is on the wall. The Matisse sofa is next to the table. Mary is sitting on the sofa. She is playing the violin. She is wearing a straw hat.

Combined poses

Page 54: WordsEye: From Text To Pictures

Mary pushes the lawn mower. The lawnmower is 5 feet tall. The cat is 5 feet behind Mary. The cat is 10 feet tall.

Dynamically defined poses using Inverse Kinematics (IK)

Page 55: WordsEye: From Text To Pictures

Shape Changes (not implemented in web version)

Deformations

•Facial expressions

•Happy, angry, sad, confused . . . mixtures

•Combined with poses

Topological changes

•Slicing

Page 56: WordsEye: From Text To Pictures

Facial Expressions

Edward runs. He is happy. Edward is shocked.

Page 57: WordsEye: From Text To Pictures

The rose is in the vase. The vase is on the half dog.

Page 58: WordsEye: From Text To Pictures

Depiction Process

Given a semantic representation

•Generate graphical constraints

•Handle implicit and conflicting constraints.

•Generate 3d scene from constraints

•Add environment, lights, camera

•Render scene

Page 59: WordsEye: From Text To Pictures

Example: Generate constraints for kick

• Case1: No path or recipient; direct object is large

– Pose: Actor in kick pose

– Position: Actor directly behind direct object

– Orientation: Actor facing direct object

• Case2: No path or recipient; direct object is small

– Pose: Actor in kick pose

– Position: Direct object above foot

• Case3: Path and recipient

– Pose + relations . . . (some tentative)
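The first two cases above translate directly into a small rule function. The sketch below is one possible encoding, not the system's depiction rules: the function name, the tuple format of the constraints, the `object_is_large` flag (how size is decided is not stated on the slide), and the handling of any path or recipient as Case3 are all assumptions.

```python
def kick_constraints(actor, direct_object, object_is_large, path=None, recipient=None):
    """Return constraint tuples for 'actor kicks direct_object'."""
    constraints = [("pose", actor, "kick")]
    if path is None and recipient is None:
        if object_is_large:
            # Case1: stand directly behind the object and face it.
            constraints.append(("position", actor, "directly-behind", direct_object))
            constraints.append(("orientation", actor, "facing", direct_object))
        else:
            # Case2: put the small object above the kicking foot.
            constraints.append(("position", direct_object, "above", f"{actor}.foot"))
    else:
        # Case3 is only outlined on the slide ("some tentative"), so it is left open.
        constraints.append(("unspecified", actor, direct_object, path, recipient))
    return constraints


print(kick_constraints("John", "pickup truck", object_is_large=True))
print(kick_constraints("John", "football", object_is_large=False))
```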

Page 60: WordsEye: From Text To Pictures

Some varieties of kick

Case3: John kicked the ball to the cat on the skateboard

Case1: John kicked the pickup truck

Case2: John kicked the football

Page 61: WordsEye: From Text To Pictures

Implicit Constraint. The vase is on the nightstand. The lamp is next to the vase.

Page 62: WordsEye: From Text To Pictures

Figurative & Metaphorical Depiction

• Textualization

• Conventional Icons and emblems

• Literalization

• Characterization

• Personification

• Functionalization

Page 63: WordsEye: From Text To Pictures

Textualization: The cat is facing the wall.

Page 64: WordsEye: From Text To Pictures

Conventional Icons: The blue daisy is not in the army boot.

Page 65: WordsEye: From Text To Pictures

Literalization: Life is a bowl of cherries.

Page 66: WordsEye: From Text To Pictures

Characterization: The policeman ran by the parking meter

Page 67: WordsEye: From Text To Pictures

Functionalization: The hippo flies over the church

Page 68: WordsEye: From Text To Pictures

Future/Ongoing Work

Build/use scenario-based lexical resource

• Word knowledge (dictionary)

• Frame knowledge

– For verbs and event nouns

– Finer-grained representation of prepositions and spatial relations

• Contextual knowledge

– Default verb arguments

– Default constituents and spatial relations in settings/environments

• Decompose actions into poses and spatial relations

• Learn contextual knowledge from corpora

Graphics/output support

• Add dynamic posing of characters to depict actions

• Handle more complex, natural text

• Handle object parts

• Add more 2D/3D content (including user uploadable 3D objects)

• Physics, animation, sound, and speech

Page 69: WordsEye: From Text To Pictures

FrameNet – Digital lexical resource http://framenet.icsi.berkeley.edu/

• 947 hierarchically defined frames

• 10,000 lexical entries (verbs, nouns, adjectives)

• Relations between frames (perspective-on, subframe, using, …)

• Annotated sentences for each lexical unit

Page 70: WordsEye: From Text To Pictures

Lexical Units in “Revenge” Frame

Page 71: WordsEye: From Text To Pictures

Frame elements for avenge.v

Frame Element Core Type

Degree Core

Depictive Peripheral

Injured_party Extra_thematic

Injury Core

Instrument Core

Manner Peripheral

Offender Peripheral

Place Core

Punishment Peripheral

Purpose Core

Result Extra_thematic

Time Peripheral

Page 72: WordsEye: From Text To Pictures

Annotations for “avenge.v”

Page 73: WordsEye: From Text To Pictures

Relations between frames

Page 74: WordsEye: From Text To Pictures

Frame element mappings between frames

• Core vs Peripheral

• Inheritance

• Renaming (eg. agent -> helper)

Page 75: WordsEye: From Text To Pictures

Valence patterns for verb “sell” (commerce_sell frame) and two related frames

<LU-2986 "sell.v" Commerce_sell> patterns:
 (33 ((Seller Ext) (Goods Obj)))
 (11 ((Goods Ext)))
 (7 ((Seller Ext) (Goods Obj) (Buyer Dep(to))))
 (4 ((Seller Ext)))
 (2 ((Goods Ext) (Buyer Dep(to))))

<frame: Commerce_buy> patterns:
 (91 ((Buyer Ext) (Goods Obj)))
 (27 ((Buyer Ext) (Goods Obj) (Seller Dep(from))))
 (11 ((Buyer Ext)))
 (2 ((Buyer Ext) (Goods Obj) (Seller Dep(at))))
 (2 ((Buyer Ext) (Seller Dep(from))))
 (2 ((Goods Obj)))

<frame: Expensiveness> patterns:
 (17 ((Goods Ext) (Money Dep(NP))))
 (8 ((Goods Ext)))
 (4 ((Goods Ext) (Money Dep(between))))
 (4 ((Goods Ext) (Money Dep(from))))
 (2 ((Goods Ext) (Money Dep(under))))
 (1 ((Goods Ext) (Money Dep(just))))
 (1 ((Goods Ext) (Money Dep(NP)) (Seller Dep(from))))
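Counts like these could be used to rank candidate readings of a clause. The sketch below copies the sell.v patterns from the slide and ranks those whose grammatical functions match what was observed; the `rank_patterns` helper and the idea of choosing by raw frequency are assumptions for illustration, not a description of how WordsEye uses FrameNet.

```python
# (count, pattern) pairs for sell.v in Commerce_sell, copied from the slide.
# Each pattern is a tuple of (frame element, grammatical function) pairs.
SELL_PATTERNS = [
    (33, (("Seller", "Ext"), ("Goods", "Obj"))),
    (11, (("Goods", "Ext"),)),
    (7, (("Seller", "Ext"), ("Goods", "Obj"), ("Buyer", "Dep(to)"))),
    (4, (("Seller", "Ext"),)),
    (2, (("Goods", "Ext"), ("Buyer", "Dep(to)"))),
]


def rank_patterns(patterns, observed_functions):
    """Return patterns whose grammatical functions match, most frequent first."""
    wanted = sorted(observed_functions)
    matches = [(count, pattern) for count, pattern in patterns
               if sorted(gf for _, gf in pattern) == wanted]
    return sorted(matches, key=lambda item: item[0], reverse=True)


# A clause with only an external argument: is the subject the Seller or the Goods?
ranked = rank_patterns(SELL_PATTERNS, ["Ext"])
best_count, best_pattern = ranked[0]
print(best_count, best_pattern)
```

With these counts the subject-only case favors the Goods reading (as in "the book sells well") over the Seller reading.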

Page 76: WordsEye: From Text To Pictures

Parsing and generating semantic relations using FrameNet

NLP> (interpret-sentence "the boys on the beach said that the fish swam to island")

Parse:
(S
 (NP (NP (DT "the") (NN2 (NNS "boys")))
     (PREPP* (PREPP (IN "on") (NP (DT "the") (NN2 (NN "beach"))))))
 (VP (VP1 (VERB (VBD "said"))) (COMP "that")
     (S (NP (DT "the") (NN2 (NN "fish")))
        (VP (VP1 (VERB (VBD "swam")))
            (PREPP* (PREPP (TO "to") (NP (NN2 (NN "island")))))))))

Word Dependency:
((#<noun: "boy" (Plural) ID=18> (:DEP #<prep: "on" ID=19>))
 (#<prep: "on" ID=19> (:DEP #<noun: "beach" ID=21>))
 (#<verb: "said" ID=22> (:SUBJECT #<noun: "boy" (Plural) ID=18>)
                        (:DIRECT-OBJECT #<verb: "swam" ID=26>))
 (#<verb: "swam" ID=26> (:SUBJECT #<noun: "fish" ID=25>)
                        (:DEP #<prep: "to" ID=27>))
 (#<prep: "to" ID=27> (:DEP #<noun: "island" ID=28>)))

Frame Dependency:
((#<relation: CN-SPATIAL-RELATION-ON ID=19>
  (:FIGURE #<noun: "boy" (Plural) ID=18>)
  (:GROUND #<noun: "beach" ID=21>))
 (#<action: "say.v" ID=22>
  (#<frame-element: "Text" ID=29> #<action: "swim.v" ID=26>)
  (#<frame-element: "Author" ID=30> #<noun: "boy" (Plural) ID=18>))
 (#<action: "swim.v" ID=26>
  (#<frame-element: "Self_mover" ID=31> #<noun: "fish" ID=25>)
  (#<frame-element: ("Goal") ID=32> #<prep: "to" ID=27>))
 (#<prep: "to" ID=27> (:DEP #<noun: "island" ID=28>)))

Page 77: WordsEye: From Text To Pictures

Acquiring contextual knowledge

Where does “eating breakfast” take place?

• Inferring the environment in a text-to-scene conversion system. K-CAP 2001, Richard Sproat

Default locations and spatial relations (by Gino Miceli)

• Project Gutenberg corpus of online English prose (http://www.gutenberg.org/)

• Use seed-object pairs to extract other pairs with equivalent spatial relations (e.g. cups are (typically) on tables, while books are on desks).

• Leverage verb/preposition semantics as well as simple syntactic structure to identify spatial templates based on verb/{preposition,particle} plus intervening modifiers.
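The seed-pair idea can be illustrated with a toy bootstrapping pass: find which prepositions link the seed pairs, then collect other pairs linked by the same prepositions. This is a deliberately simplified sketch, not the method used in the work cited above: the four-sentence `CORPUS`, the `PAIR` regular expression, and the `harvest` function are assumptions, and a real system would use parsed text rather than a regex.

```python
import re

# Tiny stand-in corpus; the slides use Project Gutenberg text instead.
CORPUS = [
    "The cup was on the table when she came in.",
    "He left the book on the desk.",
    "A lamp stood on the desk beside the window.",
    "The cat slept under the bed.",
]

# One seed pair known to stand in the relation of interest.
SEEDS = {("cup", "table")}

# determiner + noun + optional verb + preposition + determiner + noun
PAIR = re.compile(
    r"\b(?:the|a)\s+(\w+)\s+(?:\w+\s+)?(on|under|in)\s+(?:the|a)\s+(\w+)",
    re.IGNORECASE,
)


def harvest(corpus, seeds):
    """Find prepositions that link seed pairs, then collect other pairs using them."""
    found = [(m.group(1).lower(), m.group(2).lower(), m.group(3).lower())
             for line in corpus for m in PAIR.finditer(line)]
    seed_preps = {prep for fig, prep, gnd in found if (fig, gnd) in seeds}
    return {(fig, gnd) for fig, prep, gnd in found
            if prep in seed_preps and (fig, gnd) not in seeds}


print(sorted(harvest(CORPUS, SEEDS)))
```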

Page 78: WordsEye: From Text To Pictures

Pragmatic Ambiguity: The lamp is next to the vase on the nightstand . . .

Page 79: WordsEye: From Text To Pictures

Syntactic Ambiguity: Prepositional phrase attachment

John looks at the cat on the skateboard.

John draws the man in the moon.

Page 80: WordsEye: From Text To Pictures

Potential Applications

• Online communications: Electronic postcards, visual chat/IM, social networks

• Gaming, virtual environments

• Storytelling/comic books/art

• Education (ESL, reading, disabled learning, graphics arts)

• Graphics authoring/prototyping tool

• Visual summarization and/or “translation” of text

• Embedded in toys

Page 81: WordsEye: From Text To Pictures

Storytelling: The stagecoach is in front of the old west hotel. Mary is next to the stagecoach. She plays the guitar. Edward exercises in front of the stagecoach. The large sunflower is to the left of the stagecoach.

Page 82: WordsEye: From Text To Pictures

Scenes within scenes . . .

Page 83: WordsEye: From Text To Pictures

Greeting Cards

Page 84: WordsEye: From Text To Pictures

1st grade homework: The duck sat on a hen; the hen sat on a pig;...

Page 85: WordsEye: From Text To Pictures

Conclusion

New approach to scene generation

•Low overhead (skill, training . . .)

• Immediacy

•Usable with minimal hardware: text or speech input device and display screen.

Work is ongoing

•Available as experimental web service

Page 86: WordsEye: From Text To Pictures

Related Work

• Adorni, Di Manzo, Giunchiglia, 1984

• Put: Clay and Wilhelms, 1996

• PAR: Badler et al., 2000

• CarSim: Dupuy et al., 2000

• SHRDLU: Winograd, 1972

Page 87: WordsEye: From Text To Pictures

Bloopers – John said the cat is on the table

Page 88: WordsEye: From Text To Pictures

Bloopers: Mary says the cat is blue.

Page 89: WordsEye: From Text To Pictures

Bloopers: John wears the axe. He plays the violin.

Page 90: WordsEye: From Text To Pictures

Bloopers: Happy John holds the red shark

Page 91: WordsEye: From Text To Pictures

Bloopers: Jack carried the television

Page 92: WordsEye: From Text To Pictures

Web Interface - Entry Page (www.wordseye.com)

• Registration

• Login

• Learn more

• Example pictures

Page 93: WordsEye: From Text To Pictures

Web Interface - Public Gallery

Page 94: WordsEye: From Text To Pictures

Web Interface - Add Comments to Picture

Page 95: WordsEye: From Text To Pictures

Web Interface - Link Pictures into Stories & Games

Page 96: WordsEye: From Text To Pictures

The tall granite mountain range is 300 feet wide. The enormous umbrella is on the mountain range. The gray elephant is under the umbrella. The chicken cube is 6 feet to the right of the gray elephant. The cube is 5 feet tall. The cube is on the mountain range. A clown is on the elephant. The large sewing machine is on the cube. A die is on the clown. It is 3 feet tall.
