VISION-BASED CONTROL OF 3D FACIAL ANIMATION
Jin-xiang Chai - Jing Xiao - Jessica Hodgins Carnegie Mellon University
Eurographics / SIGGRAPH 2003
Yusuf OSMANLIOĞLU 2010
OUTLINE
• Aim • Existing techniques • Proposed method and challenges • Related work • Overall system • Analysis of system • Results • Future work
AIM
“Interactive avatar control”
• Designing a rich set of realistic facial actions for a virtual character
• Providing intuitive and interactive control over these actions in real time
• Physically modeling skin and muscles of the face
• Motion capturing techniques – Vision based – Online
EXISTING TECHNIQUES
Motion capture techniques:
• Online motion capture – Quality: high resolution (+) – Control interface: expensive (-), troublesome (-)
• Vision-based animation – Quality: noisy (-), low resolution (-) – Control interface: inexpensive (+), easy to use (+)
PROPOSED METHOD
Vision-based interface + Motion capture database = Interactive avatar control
CHALLENGES
• Map low quality visual signals to high quality motion data
• Extract meaningful animation control signals from the video sequence of a live performer in real time
• Move the vertices of the face model to form facial expressions, driven by the displacements of a limited number of markers
• Allow any user to control any 3D face model
RELATED WORK
• Keyframe interpolation
• Performance capture
• Pseudo-muscle-based / muscle-based simulation
• 2D facial data for speech (viseme-driven approach)
• Full 3D motion capture data
Motion capture
• Making Faces [Guenter et al. 98]
• Expression Cloning [Noh and Neumann 01]
Vision-based tracking for direct animation
• Physical markers [Williams 90]
• Edges [Terzopoulos and Waters 93, Lanitis et al. 97]
• Optical flow with 3D models [Essa et al. 96, Pighin et al. 99, DeCarlo et al. 00]
Vision-based animation with blendshapes
• Hand-drawn expressions [Buck et al. 00]
• 3D avatar model [FaceStation]
SYSTEM OVERVIEW
[Figure: system pipeline. Components: video analysis (performance capture, 3D head pose), preprocessed motion capture data, expression control and animation, expression retargeting, avatar animation]
Video Analysis
• Vision-based facial tracking
– Tracking 19 2D features on the face
– 2 x lips, 2 x mouth, 4 x eyebrows, 8 x eyes, 3 x nose
• Initialization
– Neutral face
– Positioning and initializing the parameters of the cylinder model used to capture head pose
– Manually positioning the 19 feature points
• Tracking the pose of the head
– 6 DOF: yaw, pitch, roll, 3D position
– Updating position and orientation per frame
– Resetting accumulated errors
• Expression tracking
– Defining square windows centered at each feature's position
• Expression control parameters
– 15 parameters extracted automatically from the 2D tracking points
– Mouth (6), nose (2), eyes (2), eyebrows (5)
– Parameter types: distance between two tracking points, distance between a line and a point, orientation and center of the mouth
– Together the parameters form the expression control signal
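The three parameter types named above can be sketched in a few lines of NumPy. This is only an illustration: the function names and the choice of which tracked points feed each parameter (for example, using the two mouth corners for the mouth orientation) are assumptions, not details given in the slides.

```python
import numpy as np

def point_distance(p, q):
    """Euclidean distance between two 2D tracking points."""
    return float(np.linalg.norm(np.asarray(p, float) - np.asarray(q, float)))

def point_line_distance(p, a, b):
    """Distance from point p to the infinite line through points a and b."""
    p, a, b = (np.asarray(v, float) for v in (p, a, b))
    d = b - a
    # |2D cross product| is the parallelogram area; dividing by the base gives the height.
    cross = d[0] * (p[1] - a[1]) - d[1] * (p[0] - a[0])
    return float(abs(cross) / np.linalg.norm(d))

def mouth_center_and_orientation(left_corner, right_corner):
    """Midpoint and in-plane angle (radians) of the segment joining two mouth points.

    Using the mouth corners here is a hypothetical choice for illustration.
    """
    l = np.asarray(left_corner, float)
    r = np.asarray(right_corner, float)
    center = (l + r) / 2.0
    angle = float(np.arctan2(r[1] - l[1], r[0] - l[0]))
    return center, angle
```

Stacking 15 such scalar values per frame would give one sample of the expression control signal.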
Motion Capture Data Preprocessing
• Building the face model from a 3D laser scan
• Motion capture
– Attaching 76 reflective markers to the actor's face
– The actor is allowed to move his head freely
• Coupled head and facial movements
– Decoupling pose and expressions
• Steps: 3D poses, expression separation, expression control parameter extraction
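One common way to decouple head pose from expression is to fit a rigid transform between each marker frame and a neutral reference frame, then treat the residual as expression. The sketch below uses the Kabsch algorithm for that fit. It is an assumption for illustration, not necessarily the procedure used by the authors, and `remove_rigid_motion` is a hypothetical name. Fitting on all markers is a simplification; in practice the fit would favor markers that move little with expression.

```python
import numpy as np

def remove_rigid_motion(markers, reference):
    """Rigidly align an (N, 3) marker frame to an (N, 3) reference frame (Kabsch).

    Returns the aligned markers and the 3x3 rotation that was applied.
    """
    markers = np.asarray(markers, float)
    reference = np.asarray(reference, float)
    mc, rc = markers.mean(axis=0), reference.mean(axis=0)
    A, B = markers - mc, reference - rc
    # Cross-covariance between the centered point sets.
    U, _, Vt = np.linalg.svd(A.T @ B)
    # Flip the last axis if needed so the result is a proper rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    aligned = A @ R.T + rc
    return aligned, R
```

After alignment, `aligned - reference` holds the per-marker displacement that remains once head rotation and translation are removed.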
Motion capture database
• 70,000 frames captured at 120 fps (~10 minutes of recording)
• 76 reference points on the face
• 6 basic facial expressions
• Anger, fear, surprise, sadness, joy, disgust
• Eating, yawning, snoring
• Each expression repeated 6 times during the mocap session
• Very limited motion data related to speaking (6,000 frames)
• Does not cover all variations of the facial movements related to speaking
Expression Control and Animation
• Vision-based interface: 2D tracking data (19*2 DOF) reduced to facial expression control parameters (15 DOF), the expression control signal
• Motion capture database: 3D motion data (76*3 DOF) reduced to facial expression control parameters (15 DOF), the expression control parameter
• Visual expression control signals are very noisy
• One-to-many mapping from the expression control signal space (15 DOF) to the 3D motion space (76*3 DOF)
Filtering the noisy control signal
• Input: noisy control signal
• Nearest neighbor search in the preprocessed motion capture database: K = 120 closest examples
• Time interval: W = 20 frames at 60 fps = 0.33 s
• Online PCA: 7 largest eigen-curves (99.5% of the energy)
• Filter by eigen-curves
• Output: filtered control signal
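The filtering step above can be sketched as a local PCA projection. This is a minimal illustration under several assumptions: each window of W frames is flattened into one vector, neighbors are ranked by Euclidean distance, and `filter_control_window` is a hypothetical name. Only K = 120 and the 99.5% energy threshold come from the slides.

```python
import numpy as np

def filter_control_window(noisy_window, database_windows, k=120, energy=0.995):
    """Project a noisy control-signal window onto a local PCA subspace.

    noisy_window:     (W*D,) flattened window of control parameters.
    database_windows: (M, W*D) flattened windows from the mocap database.
    """
    noisy_window = np.asarray(noisy_window, float)
    database_windows = np.asarray(database_windows, float)
    # K closest database windows to the noisy query.
    dists = np.linalg.norm(database_windows - noisy_window, axis=1)
    nearest = database_windows[np.argsort(dists)[:k]]
    mean = nearest.mean(axis=0)
    # PCA of the neighbors via SVD; rows of vt are the eigen-curves.
    _, s, vt = np.linalg.svd(nearest - mean, full_matrices=False)
    var = s ** 2
    if var.sum() == 0.0:
        return mean  # all neighbors identical: nothing to project onto
    cum = np.cumsum(var) / var.sum()
    n_keep = min(int(np.searchsorted(cum, energy)) + 1, len(s))
    basis = vt[:n_keep]
    # Keep only the part of the query explained by the retained eigen-curves.
    return mean + (noisy_window - mean) @ basis.T @ basis
```

Because the basis is rebuilt from the neighbors of each query, the subspace adapts to the current expression instead of using one global PCA for the whole database.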
Nearest neighbor search on the filtered control signal
• Input: filtered control signal
• Distances to the K closest examples: d1, d2, ..., dK
• Weights computed from the distances: w(d1), w(d2), ..., w(dK)
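The distances d1..dK and weights w(d1)..w(dK) suggest a weighted blend of the motion data attached to the nearest examples. The sketch below assumes inverse-distance weights normalized to sum to one. The slides do not state the form of w, so that choice, the `eps` guard, and the function name are all assumptions.

```python
import numpy as np

def blend_nearest_examples(query, control_params, motion_data, k=120, eps=1e-8):
    """Blend the 3D motion of the k nearest examples with inverse-distance weights.

    query:          (15,) filtered control parameters.
    control_params: (M, 15) control parameters of the database frames.
    motion_data:    (M, 228) matching 3D marker data (76 markers * 3).
    """
    control_params = np.asarray(control_params, float)
    motion_data = np.asarray(motion_data, float)
    dists = np.linalg.norm(control_params - np.asarray(query, float), axis=1)
    idx = np.argsort(dists)[:k]
    # Assumed weighting: w(d) = 1 / (d + eps), then normalized to sum to 1.
    weights = 1.0 / (dists[idx] + eps)
    weights /= weights.sum()
    return weights @ motion_data[idx]
```

With this weighting, a query that coincides with a database example reproduces that example's motion almost exactly, and the output is always a convex combination of real captured frames.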
EXPRESSION RETARGETING
Synthesized expression → avatar expression
• Learn the surface mapping function f using radial basis functions such that xt = f(xs)
• Transfer the motion vector through the local Jacobian matrix Jf(xs): δxt = Jf(xs) δxs
• Run-time computational cost is independent of the number of vertices of the head model
[Figure: source point xs with motion vector δxs, mapped to target point xt with unknown motion vector δxt]
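The mapping xt = f(xs) and the transfer δxt = Jf(xs) δxs can be sketched with a Gaussian radial basis interpolant, whose Jacobian has a closed form. The Gaussian kernel, the `sigma` width, and the class name `RBFMap` are assumptions for illustration; the slides only say that radial basis functions are used.

```python
import numpy as np

class RBFMap:
    """Gaussian RBF interpolant f: R^3 -> R^3 fitted to corresponding points."""

    def __init__(self, src_pts, tgt_pts, sigma=1.0):
        self.centers = np.asarray(src_pts, float)      # (N, 3) source points
        self.sigma = float(sigma)
        phi = self._kernel(self.centers)                # (N, N) kernel matrix
        # Solve phi @ W = targets so that f(src_i) = tgt_i.
        self.weights = np.linalg.solve(phi, np.asarray(tgt_pts, float))

    def _kernel(self, x):
        x = np.atleast_2d(np.asarray(x, float))
        diff = x[:, None, :] - self.centers[None, :, :]
        return np.exp(-np.sum(diff ** 2, axis=2) / self.sigma ** 2)

    def __call__(self, x):
        """Map one point (3,) or many points (M, 3); returns an (M, 3) array."""
        return self._kernel(x) @ self.weights

    def jacobian(self, x):
        """3x3 Jacobian of f at a single point x, with J[a, b] = d f_a / d x_b."""
        diff = np.asarray(x, float) - self.centers      # (N, 3)
        phi = np.exp(-np.sum(diff ** 2, axis=1) / self.sigma ** 2)
        grad = (-2.0 / self.sigma ** 2) * diff * phi[:, None]  # row i: gradient of phi_i
        return self.weights.T @ grad

def transfer_motion(rbf, xs, dxs):
    """Map a source motion vector dxs at source point xs to the target surface."""
    return rbf.jacobian(xs) @ np.asarray(dxs, float)
```

Using the Jacobian rather than differencing f(xs + δxs) and f(xs) keeps the transfer linear in δxs, so the matrices can be evaluated once per point and reused for every frame.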
RESULTS
CONCLUSIONS
Developed a performance-based facial animation system for interactive expression control
• Tracking facial movements in video in real time
• Preprocessing the motion capture database
• Transforming low-quality 2D visual control signals into high-quality 3D facial expressions
• Efficient online expression retargeting
FUTURE WORK
• Formal user study on the quality of the synthesized motion
• Controlling and animating 3D photorealistic facial expression
• Size of database
• Speech as an input to the system
THANKS…
QUESTIONS?