Building character animation for intelligent storytelling with the H-Anim standard
Minhua Eunice Ma and Paul Mc Kevitt
School of Computing and Intelligent Systems
Faculty of Informatics, University of Ulster
EuroGraphics Ireland
29 April 2003
Previous research
MultiModal interactive storytelling
  AesopWorld
  KidsRoom
  Larsen & Petersen’s Interactive Storytelling
  Computer games
Virtual humans & embodied agents
  Jack (University of Pennsylvania)
  Improv (Perlin & Goldberg, 1996)
  BEAT (Cassell et al., 2000)
  SimHuman
  Gandalf
Previous research
Automatic Text-to-Graphics Systems
  WordsEye (Coyne & Sproat, 2001)
  ‘Micons’ and CD-based language animation (Narayanan et al. 1995)
  Spoken Image (Ó Nualláin & Smith, 1994) & successor SONAS (Kelleher et al. 2000)
Semantic representations
  Schank’s (1972) Conceptual Dependency (CD) Theory & scripts
  Jackendoff’s (1990) Lexical Conceptual Structure (LCS)
Objectives of CONFUCIUS
To interpret natural language story and movie (drama) script input and to extract conceptual semantics from the natural language
To generate 3D animation and virtual worlds automatically from natural language
To integrate 3D animation with speech and non-speech audio, to form an intelligent multimedia storytelling system for presenting multimodal stories
CONFUCIUS’ context diagram
[Diagram: the storywriter/playwright provides a story in natural language or a movie/drama script, using a tailored menu for script input; CONFUCIUS presents 3D animation, speech (dialogue) and non-speech audio to the user/story listener.]
Architecture of CONFUCIUS
[Diagram with the following components]
Input: natural language stories, script writer
Processing: script parser, Natural Language Processing, Text To Speech, sound effects, animation generation, synchronizing & fusion
Output: 3D world with audio in VRML
Knowledge base: language knowledge (lexicon, grammar, etc.), semantic representations, visual knowledge (3D graphic library), and the mapping between language knowledge and visual knowledge
Prefabricated objects (knowledge base): 3D authoring tools, existing 3D models & character models
Knowledge base of CONFUCIUS
Language knowledge
  Semantic knowledge - lexicons (e.g. WordNet)
  Syntactic knowledge - grammars
  Statistical models of language
  Associations between words
Visual knowledge
  Object model (nouns)
  Functional information
  Internal coordinate axes (for spatial reasoning)
  Associations between objects
  Event model (event verbs, describes the motion of objects/humans)
World knowledge
Spatial & quantitative reasoning knowledge
Software & Standards
Java: parsing intermediate representation, changing VRML code to add/modify animation, integrating modules
3D graphic modelling
  Authoring tools
    Humanoid characters: Character Studio, Internet Character Animator (ICA)
    Narrator: Microsoft Agent
    Props & stage: 3D Studio Max
  Modelling language & standard
    VRML 97 for modelling the geometry of objects, props and environment
    Humanoid modelling
      MPEG-4 Face and Body Animation (FBA)
      Humanoid Animation (H-Anim) specifications
      Main problem to solve: defining standards for high-level behaviours of virtual humans
Natural language processing tools
  PC-PARSE (morphological and syntactic analysis)
  WordNet (lexicon, semantic inference)
Level of Articulation 1 of H-Anim
[Figure: joints and segments of LOA1]
Although CONFUCIUS adopts Level of Articulation 1 (LOA1) for its human character animation, its animation script engine adds ROUTEs dynamically, based on the joint list of the H-Anim file and the animation keyframe list. As long as the animation keyframes conform to the joint definitions in the H-Anim file, CONFUCIUS’ animation engine works with any level of articulation.
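The dynamic addition of ROUTEs can be illustrated with a minimal Python sketch (the slides name Java as the implementation language, so this is not the CONFUCIUS code). It assumes each animated joint has one OrientationInterpolator driven by a shared TimeSensor; the function name, the `Timer` DEF name and the interpolator names are hypothetical. Joints without keyframes are skipped, which is why the same loop works for any level of articulation.

```python
def build_routes(joints, keyframes, timer="Timer"):
    """Return VRML97 ROUTE statements for every joint that has keyframes.

    joints    -- joint DEF names read from the H-Anim file (assumed input)
    keyframes -- dict: joint DEF name -> OrientationInterpolator DEF name
                 (assumed input; names are hypothetical)
    """
    routes = []
    for joint in joints:
        interp = keyframes.get(joint)
        if interp is None:
            continue  # no keyframes for this joint, so it is not animated
        # The clock drives the interpolator, the interpolator drives the joint.
        routes.append(f"ROUTE {timer}.fraction_changed TO {interp}.set_fraction")
        routes.append(f"ROUTE {interp}.value_changed TO {joint}.set_rotation")
    return routes


# Example joint names; the interpolator names are made up for illustration.
joints = ["hanim_l_shoulder", "hanim_l_elbow", "hanim_r_hip"]
keyframes = {"hanim_l_shoulder": "l_shoulder_rot", "hanim_r_hip": "r_hip_rot"}
for line in build_routes(joints, keyframes):
    print(line)
```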
Agents and Avatars—How much autonomy?
[Diagram: virtual humans placed along an autonomy & intelligence axis running from high to low. Labels: autonomous characters, interface agents, avatars, characters in non-interactive storytelling.]
Autonomous characters/agents have higher requirements for sensing, memory, reasoning, planning, behaviour control, and even emotional status (a sense-<emotion>-control-action structure)
Avatars are “user-controlled” and hence require fewer autonomous actions. However, basic naïve physics such as collision detection and reaction is still demanded when the user controls an avatar to hit a wall or grasp an object
A virtual character in non-interactive storytelling is somewhere in between an agent and an avatar. Most of its behaviours, emotion, and responses to the changing environment are described in story input
Semantic representations
Knowledge representations                      | Decomposite | Typical applications
Category (1): general knowledge representation & reasoning
rule-based representation                      |             | expert systems
FOPC (First Order Predicate Calculus)          |             | sentence representation, expert systems
semantic networks                              |             | lexical semantics
Schank’s scripts                               |             | story understanding
frame-based representations                    |             |
XML-based representations                      |             | multimodal semantics
Category (2): physical knowledge representation & reasoning (inc. spatial/temporal reasoning)
Conceptual Dependency (CD)                     |             |
event-logic truth conditions                   |             |
x-schema and f-structure                       |             |
Lexical-Conceptual Structure (LCS)             |             |
Lexical Visual Semantic Representation (LVSR)  |             | dynamic vision (movement) recognition & generation
Lexical Visual Semantic Representation
Lexical Visual Semantic Representation (LVSR) is a necessary semantic representation between 3D model information and syntactic information because 3D model differences, although crucial in distinguishing word meanings, are invisible to syntax
LVSR is based on Jackendoff’s LCS and adapts it to the task of language visualization. It enhances LCS with Schank’s scripts
Ontological categories of LVSR: OBJ, HUMAN, EVENT, STATE, PLACE, PATH, and PROPERTY
  OBJ for props or places (e.g. buildings)
  HUMAN for human beings or any other articulated animated characters (e.g. animals), as long as their skeleton hierarchy is defined in the graphic library
  EVENT for actions, movements and manners
  STATE for static existence
  PROPERTY for attributes of OBJ/HUMAN
PATH & PLACE predicates
PATH predicates    Direction feature    Termination feature
to                 1                    1
from               0                    1
toward             1                    0
away_from          0                    0
via                n/a                  0
across             n/a                  n/a
along              n/a                  n/a

PLACE predicates   contact/attach feature
at                 unmarked
behind             <-contact>
end_of             n/a
in                 unmarked
in_front_of        <-contact>
near               <-contact>
on                 <+contact>
out                unmarked
over               <-contact>
top_of             n/a
under              unmarked

We analysed 62 common English prepositions and defined 7 PATH predicates and 11 PLACE predicates for interpreting spatial movement events of OBJ/HUMANs
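One possible encoding of the two predicate tables as lookup tables is sketched below in Python. The feature values are transcribed from the slide, with `None` standing for "n/a". The helper `reaches_goal` is hypothetical, and it rests on an assumed reading of the features: direction 1 means motion directed at the reference object, and termination 1 means the path is bounded by it.

```python
# PATH predicates: name -> (direction feature, termination feature)
PATH_PREDICATES = {
    "to": (1, 1),
    "from": (0, 1),
    "toward": (1, 0),
    "away_from": (0, 0),
    "via": (None, 0),
    "across": (None, None),
    "along": (None, None),
}

# PLACE predicates: name -> contact/attach feature
PLACE_PREDICATES = {
    "at": "unmarked",
    "behind": "-contact",
    "end_of": None,
    "in": "unmarked",
    "in_front_of": "-contact",
    "near": "-contact",
    "on": "+contact",
    "out": "unmarked",
    "over": "-contact",
    "top_of": None,
    "under": "unmarked",
}


def reaches_goal(path_predicate):
    """Assumed reading: the mover heads for the reference object and arrives."""
    direction, termination = PATH_PREDICATES[path_predicate]
    return direction == 1 and termination == 1


print(reaches_goal("to"), reaches_goal("toward"))
```

Under that reading, "John walked to the house" ends at the house, while "John walked towards the house" only fixes the heading.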
Examples of LVSR & animation generation
Manipulating environment & spatial relations
  Input sentence: John walked towards the house.
  LVSR: [EVENT walk ([HUMAN john],[PATH toward [OBJ house]])]
  Output animation: [figure]
  Input sentence: Nancy ran across the field.
  LVSR: [EVENT run ([HUMAN nancy],[PATH via [PLACE on [OBJ field]]])]
  Output animation: [figure]
Manipulating objects
  Input sentence: John lifted his hat.
  LVSR: [EVENT go ([OBJ hat],[PATH from [PLACE on [OBJ john.head]]])] [EVENT lift ([HUMAN john],[OBJ hat])]
  Output animation: [figure]
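The bracket notation used in these examples is regular enough to parse mechanically. The following Python sketch is only an illustration, not the CONFUCIUS parser: it assumes every term has the form `[CATEGORY head]`, optionally followed by either a parenthesised, comma-separated argument list or a single nested term. The names `Node` and `parse_lvsr` are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Brackets, parentheses and commas are tokens; anything else is a word.
TOKEN = re.compile(r"[\[\](),]|[^\s\[\](),]+")


@dataclass
class Node:
    category: str                      # e.g. EVENT, HUMAN, PATH
    head: str                          # e.g. walk, john, toward
    args: list = field(default_factory=list)


def _expect(tokens, pos, tok):
    if pos >= len(tokens) or tokens[pos] != tok:
        raise ValueError(f"expected {tok!r} at token {pos}")
    return pos + 1


def _parse_node(tokens, pos):
    pos = _expect(tokens, pos, "[")
    node = Node(tokens[pos], tokens[pos + 1])
    pos += 2
    if tokens[pos] == "(":             # argument list: (term, term, ...)
        pos += 1
        while True:
            arg, pos = _parse_node(tokens, pos)
            node.args.append(arg)
            if tokens[pos] != ",":
                break
            pos += 1
        pos = _expect(tokens, pos, ")")
    elif tokens[pos] == "[":           # single nested term
        arg, pos = _parse_node(tokens, pos)
        node.args.append(arg)
    pos = _expect(tokens, pos, "]")
    return node, pos


def parse_lvsr(text):
    """Parse one LVSR term into a Node tree."""
    tokens = TOKEN.findall(text)
    node, pos = _parse_node(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return node


tree = parse_lvsr("[EVENT walk ([HUMAN john],[PATH toward [OBJ house]])]")
print(tree.head, [a.category for a in tree.args])
```

The sketch handles one term per call, so the two EVENT terms of the hat example would be parsed separately.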
Graphics library
[Diagram: objects/props are stored as simple geometry files; characters as geometry & joint hierarchy files (H-Anim); motions as an animation library (key frames). Library entries are used through instantiation.]
Animation generator
[Flowchart]
Input: syntax tree, LVSR
Verb semantic analysis: use lexical entries in Lexical Visual Semantics to analyse verb semantics, replace synonyms, spatial reasoning
Match basic motions in library? (Y if the event predicate matches basic human motions in the animation library)
  Y: motion instantiation
  N: motion decomposition (apply scripts)
Animation controller
Environment placement: apply spatial info & place OBJ/HUMAN into a specified environment
Output: VRML file of the virtual story world
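The central decision of the animation generator can be sketched in Python as follows. This is an illustration under stated assumptions: the reading that the Y branch instantiates a library motion directly and the N branch decomposes the verb via a script is inferred from the flowchart labels, and the motion library, synonym table, decomposition scripts and the function `plan_motions` are all made-up example data and names.

```python
# Hypothetical example data, not taken from CONFUCIUS.
BASIC_MOTIONS = {"walk", "run", "sit", "wave"}       # keyframed in the library
SYNONYMS = {"stroll": "walk", "sprint": "run"}       # synonym replacement
DECOMPOSITION_SCRIPTS = {"greet": ["walk", "wave"]}  # verb -> basic motions


def plan_motions(verb):
    """Return the basic motions used to animate an event predicate."""
    verb = SYNONYMS.get(verb, verb)           # replace synonyms first
    if verb in BASIC_MOTIONS:                 # Y branch: instantiate directly
        return [verb]
    script = DECOMPOSITION_SCRIPTS.get(verb)  # N branch: decompose via script
    if script is None:
        raise KeyError(f"no motion or script for verb {verb!r}")
    return list(script)


print(plan_motions("stroll"))
print(plan_motions("greet"))
```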
Collision detection
Collision detection is a crucial issue for path planning, manoeuvring objects, reactive behaviour, and multiple characters’ activities
VRML provides a built-in collision detection mechanism for the avatar (user), but the mechanism does not apply to intersection between other characters/objects
Collision avoidance algorithms for humanoid bodies:
  Coarse approximations (e.g. bounding boxes or spheres)
  Polygon level checks between humans and objects
  Dynamic LOD checking according to distance to the observer, users’ observation focus, and whether the human is in a crowd, etc.
CONFUCIUS’ animation generator uses
  bounding cylinders around the human body segments for protagonists
  a bounding cylinder around the whole human body for minor characters, characters in a crowd, and characters beyond the scope of attention
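A bounding-cylinder test of the whole-body kind can be sketched in a few lines of Python. The sketch makes a simplifying assumption that is not stated on the slide: every cylinder is upright, with its axis parallel to the vertical Y axis, which suits a standing character but not arbitrarily oriented body segments. The class and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Cylinder:
    x: float       # axis position in the ground plane
    z: float
    y_min: float   # height of the bottom cap
    y_max: float   # height of the top cap
    radius: float


def cylinders_collide(a, b):
    """True if two upright cylinders overlap (touching counts as overlap)."""
    # The circles overlap in the ground plane ...
    horizontal = math.hypot(a.x - b.x, a.z - b.z) <= a.radius + b.radius
    # ... and the height intervals overlap as well.
    vertical = a.y_min <= b.y_max and b.y_min <= a.y_max
    return horizontal and vertical


john = Cylinder(x=0.0, z=0.0, y_min=0.0, y_max=1.8, radius=0.3)
nancy = Cylinder(x=0.5, z=0.0, y_min=0.0, y_max=1.7, radius=0.3)
print(cylinders_collide(john, nancy))
```

Because the test is a distance comparison plus an interval check, it is cheap enough to run for every pair of minor characters each frame, which fits the coarse level of detail chosen for them.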
Multiple characters’ synchronization & coordination
Multiple characters’ activities
  A character can start a task when another signals that the situation (pre-conditions) is ready
  Characters can communicate with one another
  Two or more characters can cooperate in a shared task
Multiple characters’ synchronization
  Event-driven timing mechanism (VRML provides a utility for event routing, the ROUTE statement)
  Exact time-driven synchronization
Example: Nancy was walking along the street. John called her. Nancy stopped and saw John. John walked towards her. They exchanged greetings.
The end of the animation john_speech (calling Nancy) triggers:
  (1) to stop the animation of nancy_walk
  (2) to start the animation of nancy_gazeWander (searching for who’s calling)
  (3) to start the animation of john_walk (walking towards Nancy)
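The event-driven timing of this example can be mimicked outside VRML with a small publish/subscribe dispatcher. The Python sketch below is a stand-in for VRML event routing, not a description of how CONFUCIUS implements it; the class `EventBus` and the event name `john_speech.end` are hypothetical, while the animation names come from the example above.

```python
from collections import defaultdict


class EventBus:
    """Minimal dispatcher: one event can trigger several animation actions."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self.log = []  # actions in the order they were triggered

    def on(self, event, action, target):
        self._handlers[event].append((action, target))

    def fire(self, event):
        for action, target in self._handlers[event]:
            self.log.append(f"{action} {target}")


bus = EventBus()
# The end of John's call triggers the three actions listed above.
bus.on("john_speech.end", "stop", "nancy_walk")
bus.on("john_speech.end", "start", "nancy_gazeWander")
bus.on("john_speech.end", "start", "john_walk")

bus.fire("john_speech.end")
print(bus.log)
```

No absolute timestamps appear anywhere, so lengthening John's speech shifts all three follow-on actions automatically. That is the main attraction of event-driven over exact time-driven synchronization.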
Relation to other work
A general purpose humanoid character animation system
Compared with other related virtual human modelling systems, CONFUCIUS’ character animation focuses on the language-to-humanoid animation process rather than considering human modelling & motion solely
Makes full use of existing 3D OBJ/HUMAN models, tools and programs, such as the H-Anim models Nancy (by C. Ballreich, © 1997 3Name3D / Yglesias, Wallock, Divekar, Inc.) and Baxter (by C. Babski, © LIG/EPFL), animation keyframe files, and a BVH to H-Anim keyframe conversion script (by M. Lewis, The Ohio State University)
Adopts current studies in linguistics such as LCS and improves them to meet the demands of language visualization
Prospective applications
  Children’s education
  Multimedia presentation
  Movie/drama production
  Computer games
  Virtual Reality
Conclusion & future work
CONFUCIUS’ humanoid character animation explores challenging problems in language visualization and automatic animation production:
  formalizes the meaning of action verbs and spatial prepositions
  maps language primitives to visual primitives
  provides a reusable common-sense knowledge base for other systems
Future work
  Deformation for facial expressions
  Under-specified language input
  Action composition for simultaneous activities