Elliptical Head Tracking Using Intensity Gradients and Color Histograms
Stan Birchfield
Stanford University
Autodesk Advanced Products Group
http://vision.stanford.edu/~birch
PROBLEM
[Figure labels: PAN, TILT, ZOOM]
CHALLENGES: * rotation * multiple people * zoom
APPLICATIONS: * video conferencing * distance learning
PREVIOUS METHODS
Method \ Criterion                               Flesh-colored  Multiple moving  Arbitrary camera  Out-of-plane
                                                 objects        people           movement          rotation
1. TEMPLATE [Hager & Belhumeur, 1996]            Y              Y                Y                 N
2. FLESH COLOR [Fieguth & Terzopoulos, 1997]     N              N                Y                 N
3. BACKGROUND DIFFERENCING [Graf et al., 1996]   Y              N                N                 Y
COMPLEMENTARY CRITERIA
INTERIOR CUES: * color * motion * texture
BOUNDARY CUES: * intensity edges * depth and motion discontinuities
WHY FOCUS ON THE HEAD?
APPLICATION: 1. Interesting, useful 2. Well-connected to other body parts
GEOMETRIC: 1. Nearly rigid 2. Nearly ellipsoid (easy to model)
HEAD MODEL
Ellipse: vertical, aspect ratio = 1.2
State: $s = (x, y, \sigma)$, with center $(x, y)$ and ellipse size $\sigma$
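SEARCH
Search set: $S = \{\, s : |x - x^p| \le x^r,\ |y - y^p| \le y^r,\ |\sigma - \sigma^p| \le \sigma^r \,\}$
(superscript $p$: predicted value; superscript $r$: search range)
Velocity prediction:
$x_t^p = 2x_{t-1}^* - x_{t-2}^*$
$y_t^p = 2y_{t-1}^* - y_{t-2}^*$
$\sigma_t^p = \sigma_{t-1}^*$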
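The prediction and search set above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names `predict_state` and `search_set`, the integer pixel grid with unit steps, and the `min_sigma` guard are all assumptions made for the example.

```python
from itertools import product
from typing import List, Tuple

State = Tuple[int, int, int]  # (x, y, sigma); integer grid is an assumption


def predict_state(prev1: State, prev2: State) -> State:
    """Constant-velocity prediction for position; size is carried over."""
    x1, y1, s1 = prev1  # best state at time t-1
    x2, y2, _ = prev2   # best state at time t-2
    return (2 * x1 - x2, 2 * y1 - y2, s1)


def search_set(pred: State, x_r: int, y_r: int, s_r: int,
               min_sigma: int = 1) -> List[State]:
    """All states within the search range of the prediction (unit steps)."""
    xp, yp, sp = pred
    offsets = product(range(-x_r, x_r + 1),
                      range(-y_r, y_r + 1),
                      range(-s_r, s_r + 1))
    # min_sigma is a hypothetical guard that drops degenerate ellipse sizes.
    return [(xp + dx, yp + dy, sp + ds)
            for dx, dy, ds in offsets if sp + ds >= min_sigma]


pred = predict_state((110, 52, 20), (100, 50, 20))
candidates = search_set(pred, x_r=4, y_r=4, s_r=1)
print(pred, len(candidates))  # (120, 54, 20) 243
```

With a range of 4 pixels in x and y and 1 step in size, the set holds 9 x 9 x 3 = 243 candidate states.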
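LOCAL HEAD SEARCH
$s^* = \arg\max_{s_i \in S} \{ \bar{\phi}_g(s_i) + \bar{\phi}_c(s_i) \}$
where $\bar{\phi}_g$ is the normalized GRADIENT score, $\bar{\phi}_c$ is the normalized COLOR score, and $S$ is the SEARCH RANGE.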
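The local search can be sketched as follows: score every candidate with both modules, rescale each module's scores to [0, 1] over the search set, and pick the candidate with the largest sum. The names `normalize` and `best_state` are hypothetical, the two scoring functions are passed in as placeholders, and returning all zeros when every score is equal is an assumption that the slides do not address.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[int, int, int]  # (x, y, sigma)


def normalize(scores: Sequence[float]) -> List[float]:
    """Min-max rescale scores over the search set to [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:  # assumed handling of the degenerate case
        return [0.0] * len(scores)
    return [(v - lo) / (hi - lo) for v in scores]


def best_state(candidates: Sequence[State],
               grad_score: Callable[[State], float],
               color_score: Callable[[State], float]) -> State:
    """Return the candidate maximizing normalized gradient + color score."""
    cands = list(candidates)
    g = normalize([grad_score(s) for s in cands])
    c = normalize([color_score(s) for s in cands])
    totals = [gi + ci for gi, ci in zip(g, c)]
    return cands[max(range(len(cands)), key=totals.__getitem__)]


# Toy scores: raw sums would favor the middle candidate (9.4 vs 7.8),
# but after normalization the third candidate wins (1.5 vs about 1.33).
cands = [(0, 0, 10), (1, 0, 10), (2, 0, 10)]
grad = {cands[0]: 5.0, cands[1]: 9.0, cands[2]: 7.0}
color = {cands[0]: 0.2, cands[1]: 0.4, cands[2]: 0.8}
print(best_state(cands, grad.get, color.get))  # (2, 0, 10)
```

The toy example shows why the normalization matters: without it, the module with the larger numeric range would dominate the sum.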
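GRADIENT MODULE
TWO CHOICES:
1. MAGNITUDE: $\phi_g(s) = \frac{1}{N} \sum_{i=1}^{N} |\mathbf{g}_s(i)|$
2. DOT PRODUCT: $\phi_g(s) = \frac{1}{N} \sum_{i=1}^{N} |\mathbf{n}(i) \cdot \mathbf{g}_s(i)|$
where $\mathbf{g}_s(i)$ is the intensity gradient at perimeter pixel $i$ of the ellipse at state $s$, $\mathbf{n}(i)$ is the unit ellipse normal at that pixel, and $N$ is the number of perimeter pixels.
NORMALIZATION: $\bar{\phi}_g(s) = \dfrac{\phi_g(s) - \min_{s_i \in S} \phi_g(s_i)}{\max_{s_i \in S} \phi_g(s_i) - \min_{s_i \in S} \phi_g(s_i)}$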
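A rough Python sketch of the two gradient scores follows. Several details are assumptions rather than anything stated on the slides: the perimeter is sampled at `n_samples` evenly spaced angles instead of enumerating perimeter pixels, the gradient is a central difference, and $\sigma$ is treated as the horizontal semi-axis. The helper `ellipse_image` exists only to build a synthetic test image.

```python
import math
from typing import List, Tuple

ASPECT = 1.2  # vertical aspect ratio from the head model
Image = List[List[float]]
State = Tuple[int, int, int]  # (x, y, sigma)


def gradient_at(img: Image, x: int, y: int) -> Tuple[float, float]:
    """Central-difference gradient (gx, gy); zero on or outside the border."""
    h, w = len(img), len(img[0])
    if not (1 <= x < w - 1 and 1 <= y < h - 1):
        return (0.0, 0.0)
    return ((img[y][x + 1] - img[y][x - 1]) / 2.0,
            (img[y + 1][x] - img[y - 1][x]) / 2.0)


def gradient_score(img: Image, state: State, use_dot: bool = True,
                   n_samples: int = 64) -> float:
    """Mean |g| (magnitude) or mean |n . g| (dot product) around the ellipse."""
    x0, y0, sigma = state
    a, b = float(sigma), ASPECT * sigma  # assumed: sigma is the horizontal semi-axis
    total = 0.0
    for k in range(n_samples):
        t = 2.0 * math.pi * k / n_samples
        px = int(round(x0 + a * math.cos(t)))
        py = int(round(y0 + b * math.sin(t)))
        gx, gy = gradient_at(img, px, py)
        if use_dot:
            # Outward ellipse normal at angle t, normalized to unit length.
            nx, ny = b * math.cos(t), a * math.sin(t)
            total += abs(nx * gx + ny * gy) / math.hypot(nx, ny)
        else:
            total += math.hypot(gx, gy)
    return total / n_samples


def ellipse_image(w: int, h: int, state: State) -> Image:
    """Synthetic test image: bright filled ellipse on a dark background."""
    x0, y0, sigma = state
    a, b = float(sigma), ASPECT * sigma
    return [[1.0 if ((x - x0) / a) ** 2 + ((y - y0) / b) ** 2 <= 1.0 else 0.0
             for x in range(w)] for y in range(h)]


img = ellipse_image(40, 40, (20, 20, 8))
on_target = gradient_score(img, (20, 20, 8))
off_target = gradient_score(img, (26, 20, 8))
print(on_target > off_target)
```

On this synthetic image the score should be higher when the hypothesized ellipse sits on the true boundary than when it is shifted sideways, since a shifted ellipse crosses the boundary at only a few perimeter samples.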
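COLOR MODULE
COLOR SPACE: B-G (8 bins), G-R (8 bins), B+G+R (4 bins)
HISTOGRAM INTERSECTION [Swain & Ballard 1991]: $\phi_c(s) = \dfrac{\sum_{i=1}^{N} \min(I_s(i), M(i))}{\sum_{i=1}^{N} I_s(i)}$
where $I_s(i)$ and $M(i)$ are the counts in bin $i$ of the current image histogram (inside the ellipse at state $s$) and of the model histogram, and $N$ is the number of bins.
NORMALIZATION: $\bar{\phi}_c(s) = \dfrac{\phi_c(s) - \min_{s_i \in S} \phi_c(s_i)}{\max_{s_i \in S} \phi_c(s_i) - \min_{s_i \in S} \phi_c(s_i)}$
[Figure: MODEL, CURRENT, and INTERSECTION histograms, with SKIN and HAIR labeled]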
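The color score can be illustrated with a short Python sketch. The bin counts (8, 8, 4) come from the slide; everything else is an assumption for the example: 8-bit RGB input, uniform bin widths over the full range of each axis, and the names `bin_index`, `histogram`, and `color_score`.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), assumed 8-bit

N_BG, N_GR, N_SUM = 8, 8, 4  # bins for B-G, G-R, B+G+R


def bin_index(pixel: Pixel) -> int:
    """Map an RGB pixel to one of 8 * 8 * 4 = 256 bins (uniform widths assumed)."""
    r, g, b = pixel
    i_bg = (b - g + 255) * N_BG // 511   # B-G ranges over [-255, 255]
    i_gr = (g - r + 255) * N_GR // 511   # G-R ranges over [-255, 255]
    i_sum = (b + g + r) * N_SUM // 766   # B+G+R ranges over [0, 765]
    return (i_bg * N_GR + i_gr) * N_SUM + i_sum


def histogram(pixels: Iterable[Pixel]) -> List[int]:
    """Count pixels per color bin."""
    hist = [0] * (N_BG * N_GR * N_SUM)
    for p in pixels:
        hist[bin_index(p)] += 1
    return hist


def color_score(image_hist: List[int], model_hist: List[int]) -> float:
    """Histogram intersection, normalized by the image histogram's total count."""
    total = sum(image_hist)
    if total == 0:
        return 0.0
    return sum(min(i, m) for i, m in zip(image_hist, model_hist)) / total


skin, hair, blue = (200, 150, 120), (40, 30, 25), (20, 40, 220)
model = histogram([skin] * 60 + [hair] * 40)
print(color_score(histogram([skin] * 30 + [hair] * 20), model))  # 1.0
print(color_score(histogram([skin] * 25 + [blue] * 75), model))  # 0.25
```

In the toy data, a region made only of model colors scores 1.0, while a region that is three-quarters background scores 0.25, because only the skin-colored quarter intersects the model.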
SUMMARY OF ALGORITHM
OFF-LINE:
1. Manually place head within ellipse
2. Store model histogram
RUN TIME:
1. At each hypothesized location, compute
   - Sum of gradient around perimeter
   - Histogram intersection
2. Move ellipse to location that maximizes sum of two criteria
COMPARISON OF MODULES
COLOR: * Controls pan, tilt, zoom * Handles textured backgrounds * More robust * Large basin of attraction
GRADIENT: * Controls pan, tilt * Keeps off neck * Scale in front of flesh-colored object * Scale when back turned
BASIN OF ATTRACTION
GRADIENT: confused, pulls to left
COLOR: pulls to right
COMPUTING TIME
[Figure: bar chart of computing time per frame (ms, axis from 0 to 70) on a 200 MHz Pentium Pro for five configurations (magnitude, dot product, color, magnitude & color, dot product & color) at two search ranges (4x4x1 and 8x8x1), with a reference line marking real time (30 Hz)]
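To put the two search ranges in perspective, the arithmetic below counts candidate states and the time available per state at 30 Hz. It assumes that a label such as "4x4x1" means $(x^r, y^r, \sigma^r) = (4, 4, 1)$ and that the search uses unit steps; neither is spelled out on the slide, and the function name `num_states` is hypothetical.

```python
FRAME_RATE_HZ = 30


def num_states(x_r: int, y_r: int, s_r: int) -> int:
    """Number of candidates when each axis spans -r..+r in unit steps (assumed)."""
    return (2 * x_r + 1) * (2 * y_r + 1) * (2 * s_r + 1)


budget_ms = 1000.0 / FRAME_RATE_HZ  # about 33.3 ms per frame at 30 Hz

for search_range in [(4, 4, 1), (8, 8, 1)]:
    n = num_states(*search_range)  # 243 and 867 candidates respectively
    per_state_us = budget_ms / n * 1000.0
    print(search_range, n, f"{per_state_us:.1f} us per state")
```

Under these assumptions, doubling the range in x and y raises the candidate count from 243 to 867, roughly 3.6 times as many states to score within the same 33.3 ms frame budget.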
CONCLUSION
SUCCESSES:
1. Tracks head in real time on standard hardware
2. Insensitive to
   - full 360-degree out-of-plane rotation
   - arbitrary camera movement (including zoom)
   - multiple moving people
   - severe but brief occlusion
   - hair/skin color, hair length, facial hair, glasses
FUTURE WORK:
1. Speed (computer speed and NTSC video standard)
2. Color adaptation, but imprecise localization
3. No explicit model of occlusion