Scenes From Video Workshop Talk
What’s so good about pieces, Lego and understanding?
Anton van den Hengel
Australian Centre for Visual Technologies (ACVT), The University of Adelaide, South Australia
People think in 3D
It has been a theme …
"the perception of solid objects is a process which can be based on the
properties of three-dimensional transformations and the laws of nature"
Larry Roberts (1965)
Geometry is not enough
Structure and semantics interact
Structure and geometry interact
WHY PLANTS ARE LIKE LEGO
Developmental changes in response to drought
Boris Parent, ACPFG
[Figure: absolute growth rate [mm2 d-1] against time after sowing [d], 30 to 65 d, for drought and well-watered plants; annotations mark 39 d after sowing and 46 d after sowing]
The escape response of Clipper under drought is reflected in
an earlier time of absolute maximum growth
Morphological changes in response to drought
Boris Parent, ACPFG
[Figure: relative ratio of shoot area / height against time after sowing [d], 30 to 60 d]
The reduced number of tillers under drought is
reflected in the area/height ratio
Barley cv Clipper (legend: drought, well watered)
Deep reasoning
• Try to explain as much as possible
• Fine-grained and detailed
• Deep semantics
• And the implied constraints
• Shape is only an intermediate step
Deconstruction
Silhouettes
• We’re only interested in shape (at least for now)
Deconstruction
• Render every possible building block in every possible position, and recover the silhouette of each
• Then reconstruct object silhouettes from templates
• Requires enough camera information to achieve this
Template shapes
• nTemplates = nShapes x nPositions x nRotations
• So there are lots of them
• But they are sparsely used
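To make the count concrete, here is a minimal sketch that enumerates a template set as the product of shapes, positions and rotations. The shape names, grid size and rotation steps are hypothetical values chosen for illustration, not figures from the talk.

```python
from itertools import product

# Hypothetical building blocks, placement grid and rotations (illustrative only).
shapes = ["brick_2x2", "brick_2x4", "plate_1x2"]
positions = [(x, y, z) for x in range(10) for y in range(10) for z in range(5)]
rotations = [0, 90, 180, 270]  # degrees about the vertical axis

# One template per (shape, position, rotation) combination.
templates = list(product(shapes, positions, rotations))
n_templates = len(shapes) * len(positions) * len(rotations)

print(n_templates)  # 3 * 500 * 4 = 6000
```

Even this small grid gives thousands of templates, while any one object uses only a handful of them.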
Sparse recovery
• \alpha a vector of binary template coefficients
• \Pi a matrix with one template silhouette per column
• y the silhouette of the shape to be recovered
• NP-hard and fragile
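The slide's equation did not survive in this transcript. Assuming the problem is to find the binary \alpha with the fewest non-zero entries such that \Pi \alpha = y, a brute-force sketch shows why it is combinatorial: it tries supports of increasing size until one reproduces y exactly. The toy templates and the function name are hypothetical.

```python
from itertools import combinations

def sparsest_exact_cover(templates, y):
    """Return the smallest set of template indices whose pixel-wise sum equals y."""
    n = len(templates)
    for k in range(n + 1):                         # supports of growing size
        for support in combinations(range(n), k):  # up to 2**n candidates in total
            summed = [sum(templates[j][p] for j in support) for p in range(len(y))]
            if summed == y:
                return support
    return None

# Hypothetical 6-pixel silhouettes, one template per entry (a column of \Pi).
templates = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 0],
]
y = [1, 1, 0, 0, 1, 1]

support = sparsest_exact_cover(templates, y)
print(support)  # (0, 2)
```

The exact-equality test also hints at the fragility: a single flipped pixel in y means no support matches.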
Sparse recovery – L_1 norm
• But there may still be millions of templates, and they’re enormous (|Pixels| x |Images|)
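As a hedged illustration of the relaxation, the sketch below replaces the count of non-zero coefficients with their sum (the L_1 norm for non-negative \alpha) and lets \alpha take values in [0, 1]. The penalised least-squares objective, the projected-gradient solver and the value of `lam` are assumptions made for this example; the talk does not say which solver was used.

```python
def l1_relaxed_recovery(templates, y, lam=0.01, iters=500):
    """Projected gradient for min 0.5*||Pi a - y||^2 + lam*sum(a), with 0 <= a <= 1."""
    n_pix, n_tmpl = len(y), len(templates)
    # Step 1/L, where L is bounded above by the squared Frobenius norm of Pi.
    step = 1.0 / sum(v * v for t in templates for v in t)
    alpha = [0.0] * n_tmpl
    for _ in range(iters):
        # Residual r = Pi a - y; each template is one column of Pi.
        r = [sum(templates[j][p] * alpha[j] for j in range(n_tmpl)) - y[p]
             for p in range(n_pix)]
        for j in range(n_tmpl):
            grad = sum(templates[j][p] * r[p] for p in range(n_pix)) + lam
            # Gradient step, then projection back onto the box [0, 1].
            alpha[j] = min(1.0, max(0.0, alpha[j] - step * grad))
    return alpha

# Hypothetical 6-pixel silhouettes; y is the sum of templates 0 and 2.
templates = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 0],
]
y = [1, 1, 0, 0, 1, 1]

alpha = l1_relaxed_recovery(templates, y)
print([round(a, 3) for a in alpha])
```

The coefficients come out close to, but not exactly, binary, and the cost of each iteration grows with both the number of templates and the number of pixels.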
Sparse recovery – Random projections
• Random projection by DxS matrix \Phi
• D << S
• \Phi is sparsely sampled from N(0,1)
• But there are still too many templates
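A minimal sketch of the projection step, reading "sparsely sampled from N(0,1)" as a matrix whose entries are mostly zero with the rest drawn from a standard normal. That reading, the density of 0.1 and the toy sizes are assumptions. Because the projection is linear, the same \alpha relates the projected silhouette to the projected templates.

```python
import random

def sparse_gaussian_matrix(d, s, density, rng):
    """D x S matrix: each entry is N(0,1) with probability `density`, else 0."""
    return [[rng.gauss(0.0, 1.0) if rng.random() < density else 0.0
             for _ in range(s)] for _ in range(d)]

def project(phi, v):
    """Matrix-vector product phi @ v for list-based data."""
    return [sum(a * b for a, b in zip(row, v)) for row in phi]

rng = random.Random(0)
S, D, N_TEMPLATES = 200, 20, 5  # S pixels reduced to D measurements, D << S

# Hypothetical binary template silhouettes, each flattened to S pixels.
templates = [[rng.randint(0, 1) for _ in range(S)] for _ in range(N_TEMPLATES)]
alpha = [1, 0, 1, 0, 0]
# Linear model of the observed silhouette: y = Pi @ alpha.
y = [sum(a * t[p] for a, t in zip(alpha, templates)) for p in range(S)]

phi = sparse_gaussian_matrix(D, S, density=0.1, rng=rng)
y_small = project(phi, y)                               # length D
templates_small = [project(phi, t) for t in templates]  # each of length D
```

The projection shrinks each template from S to D values but leaves the number of templates unchanged.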
Sparse recovery - Cropping
• Eliminate templates with a footprint that extends significantly beyond that of the object
• Reduces the number of templates by at least an order of magnitude
• Down to between tens and tens of thousands of templates
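The cropping rule can be sketched as a filter on pixel sets. The threshold `max_outside`, standing in for "significantly", and the toy silhouettes are hypothetical.

```python
def crop_templates(templates, object_pixels, max_outside):
    """Keep templates with at most `max_outside` pixels outside the object silhouette."""
    return [t for t in templates if len(t - object_pixels) <= max_outside]

# Hypothetical 4 x 4 object silhouette and three candidate template footprints.
object_pixels = {(x, y) for x in range(4) for y in range(4)}
templates = [
    {(0, 0), (1, 0)},  # entirely inside the object
    {(3, 3), (4, 3)},  # one pixel outside
    {(6, 6), (7, 6)},  # entirely outside
]

kept = crop_templates(templates, object_pixels, max_outside=1)
print(len(kept))  # 2
```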
Binarising the solution
• Solutions are not binary
• Randomly generate binary hypotheses from non-binary \alpha
• Evaluate using an accurate composition model
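One way to read these steps, offered as a sketch rather than the talk's implementation: treat each non-binary coefficient as the probability of switching its template on, sample binary hypotheses, and score each against the target silhouette. The union-of-footprints composition model and the pixel-mismatch score are assumptions.

```python
import random

def compose(templates, selection):
    """Union composition: a pixel is foreground if any selected template covers it."""
    covered = set()
    for template, chosen in zip(templates, selection):
        if chosen:
            covered |= template
    return covered

def binarise(templates, alpha, target, n_hypotheses, rng):
    """Sample binary hypotheses from alpha and keep the one with fewest wrong pixels."""
    best, best_err = None, None
    for _ in range(n_hypotheses):
        hypothesis = [1 if rng.random() < a else 0 for a in alpha]
        err = len(compose(templates, hypothesis) ^ target)  # mismatched pixels
        if best_err is None or err < best_err:
            best, best_err = hypothesis, err
    return best, best_err

# Hypothetical 1-D footprints and a non-binary solution from the relaxed problem.
templates = [{0, 1}, {2, 3}, {4, 5}, {1, 2}]
target = {0, 1, 4, 5}
alpha = [0.9, 0.1, 0.8, 0.3]

best, best_err = binarise(templates, alpha, target, n_hypotheses=200,
                          rng=random.Random(0))
print(best, best_err)
```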
Results
Plants
Results
[Figure: fraction of true leaves recovered against number of templates, 200 to 1000; series: Max, Search, Viable]
Results
[Figure: fraction of pixels explained against noise level (fraction of pixels changed), 0 to 0.06; series: Max, Search]
Composition problems
Not a true model of silhouette formation
So doesn’t deal well with template overlap
Working on this by subtracting overlaps, graph-based approaches
Somewhat overcome by…
Inequality
• Isn’t physically accurate for foreground pixels, so split
• Background (0) pixels
• And foreground pixels
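The slide's inequalities are missing from this transcript. The sketch below shows one plausible reading, stated as an assumption: the summed templates must be exactly 0 at background pixels, but only at least 1 at foreground pixels, because overlapping templates can cover a foreground pixel more than once.

```python
def linear_sum(templates, alpha, n_pixels):
    """Per-pixel sum of the selected templates (the linear model Pi @ alpha)."""
    return [sum(a for t, a in zip(templates, alpha) if p in t)
            for p in range(n_pixels)]

# Hypothetical 1-D example: two selected templates overlap at pixel 2.
n_pixels = 6
templates = [{0, 1, 2}, {2, 3}]
alpha = [1, 1]
silhouette = [1, 1, 1, 1, 0, 0]  # observed binary silhouette (union of footprints)

summed = linear_sum(templates, alpha, n_pixels)
print(summed)  # [1, 1, 2, 1, 0, 0]

background = [p for p in range(n_pixels) if silhouette[p] == 0]
foreground = [p for p in range(n_pixels) if silhouette[p] == 1]

equality_holds = summed == silhouette                     # False: pixel 2 sums to 2
background_ok = all(summed[p] == 0 for p in background)   # True
foreground_ok = all(summed[p] >= 1 for p in foreground)   # True
```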
Practicality again
• Only interested in the number of pixels outside the object silhouette, not the location
• So not
• but
Practicality again
• Want to ensure that
• Need to project to a lower dimension
• But \Phi_I must have only positive elements
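A short sketch of why the sign of the projection matters. If v >= 1 holds element-wise and every entry of the projection is non-negative, then the projected inequality still holds; a signed projection gives no such guarantee. Here `counts` stands in for the per-pixel foreground coverage, and drawing entries as |N(0,1)| is an assumption made for illustration.

```python
import random

def project(phi, v):
    """Matrix-vector product phi @ v for list-based data."""
    return [sum(a * b for a, b in zip(row, v)) for row in phi]

rng = random.Random(0)
counts = [1, 2, 1, 3]      # hypothetical coverage of four foreground pixels, all >= 1
ones = [1] * len(counts)

# Non-negative projection: absolute values of standard normal draws.
phi_pos = [[abs(rng.gauss(0.0, 1.0)) for _ in counts] for _ in range(2)]
preserved = all(pv >= p1 for pv, p1 in
                zip(project(phi_pos, counts), project(phi_pos, ones)))

# A signed row can break the inequality: 1*1 - 1*2 = -1, which is below 1 - 1 = 0.
phi_signed = [[1.0, -1.0, 0.0, 0.0]]
broken = project(phi_signed, counts)[0] < project(phi_signed, ones)[0]

print(preserved, broken)  # True True
```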
A better model of composition
• Left with
Constraints - Intersection
• Form J where every row represents a constraint
• If templates i and k intersect then insert a row in J with only elements i and k set to 1
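A minimal sketch of building J from pairwise overlaps. Representing each template as a set of occupied voxels, and enforcing the constraint as J \alpha <= 1 so that at most one of each intersecting pair is selected, are assumptions; the transcript does not state the inequality.

```python
from itertools import combinations

def intersection_constraints(templates):
    """One row per intersecting pair (i, k), with 1 in columns i and k only."""
    n = len(templates)
    rows = []
    for i, k in combinations(range(n), 2):
        if templates[i] & templates[k]:  # the two templates share a voxel
            row = [0] * n
            row[i] = row[k] = 1
            rows.append(row)
    return rows

def satisfies(J, alpha):
    """Check J @ alpha <= 1 row by row."""
    return all(sum(j * a for j, a in zip(row, alpha)) <= 1 for row in J)

# Hypothetical voxel sets: templates 0 and 1 overlap, template 2 is separate.
templates = [{(0, 0, 0), (1, 0, 0)}, {(1, 0, 0), (2, 0, 0)}, {(5, 0, 0)}]
J = intersection_constraints(templates)
print(J)  # [[1, 1, 0]]
```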
Constraints - Support
• Form K where every row represents a constraint
• If template i needs support t set K_ii = t
• If template j provides s support to i then K_ij = -s
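A sketch of the support matrix, reading the second rule as "template j provides s support to template i". Enforcing the constraint as K \alpha <= 0, so that a selected template receives at least the support it needs, is an assumption, as are the toy values.

```python
def support_matrix(needs, provides):
    """K[i][i] = support needed by i; K[i][j] = -s if j provides s support to i."""
    n = len(needs)
    K = [[0] * n for _ in range(n)]
    for i, t in enumerate(needs):
        K[i][i] = t
    for (j, i), s in provides.items():
        K[i][j] = -s
    return K

def supported(K, alpha):
    """Check K @ alpha <= 0 row by row."""
    return all(sum(k * a for k, a in zip(row, alpha)) <= 0 for row in K)

# Hypothetical stack: block 0 rests on the ground, block 1 on block 0, block 2 on block 1.
needs = [0, 1, 1]
provides = {(0, 1): 1, (1, 2): 1}  # (provider j, supported i): amount s
K = support_matrix(needs, provides)
print(K)  # [[0, 0, 0], [-1, 1, 0], [0, -1, 1]]
```

With this K, selecting block 1 without block 0 violates the second row, while selecting the whole stack satisfies every row.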
Measurement benefit tails off
[Figure: accuracy vs noise for varying numbers of measurements. Accuracy (fraction of true blocks recovered) against noise level (added to camera extrinsics), 0 to 0.4; series: 49, 441, 1225, 2401, 3969, 5929, 8281, 11025 measurements]
Results
Limitations
• One template per value per parameter
• Fixable?