Automatic Dense Semantic Mapping From Visual Street-level Imagery
Sunando Sengupta[1], Paul Sturgess[1], Lubor Ladicky[2], Philip H.S. Torr[1]
[1] Oxford Brookes University, [2] Visual Geometry Group, Oxford University
http://cms.brookes.ac.uk/research/visiongroup/index.php
Dense Semantic Map
• Generate an overhead view of an urban region.
• Every pixel in the map view is associated with an object class label.
• Street images are captured inexpensively from a vehicle with multiple mounted cameras[1].
Object class labels: Building, Road, Tree, Vegetation, Fence, Signage, Sky, Pavement, Car, Pedestrian, Bollard, Shop Sign, Post
[1] Yotta DCL, “Yotta DCL case studies.” Available: http://www.yottadcl.com/surveys/case-studies/
Semantic Mapping Framework
• The semantic mapping framework comprises two stages:
– Semantic image segmentation at street level.
– Ground plane labelling at a global level.
• One of the first attempts at overhead mapping from street-level images.
[Figure: Street-level image acquisition → Image segmentation → Ground plane labelling]
Semantic Image Segmentation
• Label every pixel in the image with an object class.
[Figure: Raw image (input) → Automatic labeller, given the object class labels → Labelled image (output)]
Semantic Image Segmentation
• We use a Conditional Random Field (CRF) framework.
• Each pixel is a node in a grid graph G = (V, E).
• Each node is a random variable x taking a label from the label set.
[Figure: Input image → CRF construction → Final segmentation]
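Semantic Image Segmentation - CRF
• Total energy:
$$E(\mathbf{x}) = \sum_{i \in V} \psi_i(x_i) + \sum_{i \in V,\, j \in N_i} \psi_{ij}(x_i, x_j) + \sum_{c \in C} \psi_c(\mathbf{x}_c) = E_{pix} + E_{pair} + E_{region}$$
• Optimal labelling given as:
$$\mathbf{x}^* = \arg\min_{\mathbf{x}} E(\mathbf{x})$$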
Semantic Image Segmentation - CRF
• Total energy E = Epix + Epair + Eregion
• Epix - Models an individual pixel's cost of taking a label.
– Computed via the dense boosting approach.
– Multi-feature variant of TextonBoost[1].
[Figure: example values for pixel x: Car 0.2, Road 0.3]
[1] L. Ladicky, C. Russell, P. Kohli, and P. H. Torr, “Associative hierarchical CRFs for object class image segmentation,” in ICCV, 2009.
Semantic Image Segmentation - CRF
• Total energy E = Epix + Epair + Eregion
• Epair - Models interactions between neighbouring pixels.
– Encourages label consistency in adjacent pixels.
– Sensitive to edges in images.
– Contrast-sensitive Potts model.
[Figure: Epair for labels (xi, xj): 0 when the labels agree (Car, Car or Road, Road), g(i,j) when they differ (Car, Road)]
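The pairwise term above can be sketched as a small function. This is only an illustration: the slides do not give the form of g(i,j), so the exponential colour-difference form and the `theta_p`, `theta_v` and `theta_beta` parameters below are assumptions borrowed from common contrast-sensitive Potts formulations.

```python
import math

def contrast_sensitive_potts(label_i, label_j, colour_i, colour_j,
                             theta_p=1.0, theta_v=4.0, theta_beta=0.01):
    """Pairwise cost for two neighbouring pixels.

    Returns 0 when the labels agree. Otherwise returns g(i, j), which
    shrinks as the colour difference grows, so a label change is cheaper
    across an image edge. The exponential form and the theta_* defaults
    are assumptions, not values from the slides.
    """
    if label_i == label_j:
        return 0.0
    sq_diff = sum((a - b) ** 2 for a, b in zip(colour_i, colour_j))
    return theta_p + theta_v * math.exp(-theta_beta * sq_diff)

# Same label: no cost, whatever the colours are.
cost_same = contrast_sensitive_potts("Car", "Car", (10, 10, 10), (200, 200, 200))
# Different labels inside a flat colour region: full penalty.
cost_flat = contrast_sensitive_potts("Car", "Road", (100, 100, 100), (100, 100, 100))
# Different labels across a strong edge: reduced penalty.
cost_edge = contrast_sensitive_potts("Car", "Road", (10, 10, 10), (200, 200, 200))
```

With these assumed parameters, a label change costs most where neighbouring colours match and least across a strong colour edge.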
Semantic Image Segmentation - CRF
• Total energy E = Epix + Epair + Eregion
• Eregion - Models the behaviour of a group of pixels.
– Classifies a region.
– Encourages all the pixels in a region to take the same label.
– Groups of pixels are given by multiple mean-shift segmentations.
[Figure: example values for region c: Car 0.3, Road 0.1]
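To make the three-term energy concrete, the sketch below evaluates E = Epix + Epair + Eregion for a labelling of three pixels. The data layout, the numeric costs and the simplified region term (a per-label cost when the region is uniform, a fixed penalty otherwise) are assumptions chosen for illustration, not the potentials used by the authors.

```python
def crf_energy(labels, unary, edges, regions):
    """Total energy E = Epix + Epair + Eregion of one labelling.

    labels  : list with one label per pixel
    unary   : list of dicts, unary[i][label] is the pixel cost
    edges   : list of (i, j, g_ij) for neighbouring pixels
    regions : list of (pixel_indices, cost_if_uniform, cost_if_mixed)
    The region term is a simplified, assumed form: a region pays its
    per-label cost when all its pixels agree, and a fixed penalty otherwise.
    """
    e_pix = sum(unary[i][label] for i, label in enumerate(labels))
    e_pair = sum(g for i, j, g in edges if labels[i] != labels[j])
    e_region = 0.0
    for pixels, cost_if_uniform, cost_if_mixed in regions:
        region_labels = {labels[p] for p in pixels}
        if len(region_labels) == 1:
            e_region += cost_if_uniform[region_labels.pop()]
        else:
            e_region += cost_if_mixed
    return e_pix + e_pair + e_region

# Hypothetical three-pixel row forming a single region.
unary = [{"Car": 0.2, "Road": 0.3},
         {"Car": 0.4, "Road": 0.1},
         {"Car": 0.5, "Road": 0.1}]
edges = [(0, 1, 0.5), (1, 2, 0.5)]
regions = [((0, 1, 2), {"Car": 0.3, "Road": 0.1}, 0.6)]

energy_uniform = crf_energy(["Road", "Road", "Road"], unary, edges, regions)
energy_mixed = crf_energy(["Car", "Road", "Road"], unary, edges, regions)
```

In this toy case the first pixel slightly prefers Car on its own, but the pairwise and region terms make the all-Road labelling cheaper overall.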
Semantic Image Segmentation
• Solved using the alpha-expansion algorithm[1].
[Figure sequence: Input image, Road expansion, Building expansion, Sky expansion, Pavement expansion, Final solution]
[1] Y. Boykov et al., “Fast Approximate Energy Minimization via Graph Cuts,” ICCV 1999.
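The move-making idea behind alpha-expansion can be sketched on a toy problem. In each move, every pixel either keeps its current label or switches to the label alpha. The real algorithm finds the best such move with a graph cut; the sketch below swaps that step for exhaustive search, which is only feasible for a handful of pixels. The unary costs and edge weights are hypothetical.

```python
from itertools import product

def energy(labels, unary, edges):
    """Unary plus Potts pairwise energy of a labelling."""
    e = sum(unary[i][label] for i, label in enumerate(labels))
    e += sum(w for i, j, w in edges if labels[i] != labels[j])
    return e

def best_expansion_move(labels, alpha, unary, edges):
    """Best move in which each pixel keeps its label or switches to alpha.

    Exhaustive search over all 2^n moves stands in for the graph cut
    used by the real algorithm (an assumption made to keep this short).
    """
    best, best_e = list(labels), energy(labels, unary, edges)
    for switch in product([False, True], repeat=len(labels)):
        cand = [alpha if s else label for s, label in zip(switch, labels)]
        e = energy(cand, unary, edges)
        if e < best_e:
            best, best_e = cand, e
    return best, best_e

def alpha_expansion(unary, edges, label_set):
    """Cycle through the labels until no expansion move lowers the energy."""
    labels = [label_set[0]] * len(unary)
    current = energy(labels, unary, edges)
    improved = True
    while improved:
        improved = False
        for alpha in label_set:
            new_labels, new_e = best_expansion_move(labels, alpha, unary, edges)
            if new_e < current:
                labels, current, improved = new_labels, new_e, True
    return labels, current

# Hypothetical chain of four pixels with three candidate labels.
unary = [{"Road": 0.1, "Building": 1.0, "Sky": 1.0},
         {"Road": 0.2, "Building": 0.9, "Sky": 1.0},
         {"Road": 0.8, "Building": 0.3, "Sky": 1.0},
         {"Road": 1.0, "Building": 0.1, "Sky": 0.9}]
edges = [(0, 1, 0.3), (1, 2, 0.3), (2, 3, 0.3)]

final_labels, final_energy = alpha_expansion(unary, edges, ["Road", "Building", "Sky"])
```

Starting from an all-Road labelling, the Building expansion relabels the last two pixels, after which no further expansion lowers the energy.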
Ground Plane Labelling
• Combine many labellings from street-level imagery.
[Figure: Street-level labellings (input) → Automatic labeller → Labelled ground plane (output)]
Ground Plane CRF
• A CRF is defined over the ground plane.
• Each ground plane pixel (zi) is a random variable taking a label from the label set.
• The energy for the ground plane CRF is:
$$E^{g}(Z) = E^{g}_{pix} + E^{g}_{pair}$$
Ground Plane Pixel Cost
• We assume a flat world.
• A ground plane region is estimated.
• Each point in the image projects to a unique point on the ground plane, creating a homography.
• The image labelling is mapped to the ground plane via the homography.
• Labels projected from many views are combined in a histogram.
• The normalised histogram gives the naïve probability of the ground plane pixel taking a label.
[Figure: labelled image X projected by homography onto ground plane Z; ground plane pixel histograms over Road, Pavement, Post/Pole]
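The projection and voting steps above can be sketched as follows. The 3x3 homographies, pixel coordinates and grid cell size are hypothetical placeholders; in practice each homography would come from the camera calibration and the estimated ground plane.

```python
from collections import Counter, defaultdict

def apply_homography(H, u, v):
    """Map image pixel (u, v) to ground plane coordinates with a 3x3 matrix H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w

def accumulate_votes(views, cell_size=1.0):
    """Build one label histogram per ground plane cell.

    views: iterable of (H, labelled_pixels), where labelled_pixels is a
    list of ((u, v), label) pairs taken from one segmented street image.
    """
    histograms = defaultdict(Counter)
    for H, labelled_pixels in views:
        for (u, v), label in labelled_pixels:
            x, y = apply_homography(H, u, v)
            cell = (int(x // cell_size), int(y // cell_size))
            histograms[cell][label] += 1
    return histograms

def normalise(histogram):
    """Turn label counts into the naive per-label probabilities."""
    total = sum(histogram.values())
    return {label: count / total for label, count in histogram.items()}

# Two hypothetical views: an identity homography and a one-unit shift in x.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
shifted = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]
views = [
    (identity, [((2, 3), "Road"), ((5, 5), "Pavement")]),
    (shifted, [((1, 3), "Road"), ((4, 5), "Road")]),
]

histograms = accumulate_votes(views)
probabilities = {cell: normalise(h) for cell, h in histograms.items()}
```

Here both views agree on Road for cell (2, 3), while cell (5, 5) receives one Pavement vote and one Road vote, so its normalised histogram is split evenly.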
Ground Plane Labelling
• A histogram is built for every ground plane pixel, giving Egpix.
• A pairwise cost (Egpair) is added to induce smoothness.
– Contrast-sensitive Potts model.
• Final CRF solution obtained using alpha-expansion.
[Figure sequence: Void, Road expansion, Building expansion, Pavement expansion, Car expansion, Final solution]
Dataset
• Subset of the images captured by the van.
– 14.8 km of track, 8000 images from each camera.
• Pixel-level labelled ground truth images. Dataset available[1].
• 13 object categories: Building, Road, Tree, Vegetation, Fence, Signage, Sky, Pavement, Car, Pedestrian, Bollard, Shop Sign, Post.
• Training: 44 images; testing: 42 images.
[1] http://cms.brookes.ac.uk/research/visiongroup/projects/SemanticMap/index.php
SIS Results
• Input images, output of our image-level CRF, and ground truth.
• Used the Automatic Labelling Environment[1].
[Figure rows: Input, Semantic segmentation, Ground truth]
[1] The Automatic Labelling Environment, L. Ladicky, P. H. S. Torr. Code available: http://cms.brookes.ac.uk/staff/PhilipTorr/ale.htm
Semantic Map Results
Semantic map of Pembroke city
Ground plane Map Evaluation
• We back-project the ground plane map into the image domain and evaluate the results.
• Global pixel accuracy of 86%.
[Figure rows: Street images, Back-projected map results, Ground truth]
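Global pixel accuracy is the fraction of evaluated pixels whose predicted label matches the ground truth. The sketch below computes it for a tiny hypothetical label grid; skipping pixels marked "Void" in the ground truth is an assumption, since the slides do not say how unlabelled pixels are handled.

```python
def global_pixel_accuracy(predicted, ground_truth, ignore_label="Void"):
    """Fraction of evaluated pixels whose predicted label is correct.

    predicted and ground_truth are equally sized 2D lists of labels.
    Ground truth pixels equal to ignore_label are skipped (an assumption).
    """
    correct = 0
    total = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for pred, gt in zip(pred_row, gt_row):
            if gt == ignore_label:
                continue
            total += 1
            if pred == gt:
                correct += 1
    return correct / total if total else 0.0

# Hypothetical 2x3 back-projected map and its ground truth.
predicted = [["Road", "Road", "Car"],
             ["Pavement", "Road", "Building"]]
ground_truth = [["Road", "Road", "Road"],
                ["Pavement", "Void", "Building"]]

accuracy = global_pixel_accuracy(predicted, ground_truth)
```

In this example five pixels are evaluated and four are correct.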
Results
Conclusions
• Presented a method to generate an overhead-view semantic map.
• Experiments on long tracks (~15 km); the method can be scaled up to country-wide mapping.
• Dataset available[1].
[1] http://cms.brookes.ac.uk/research/visiongroup/projects/SemanticMap/index.php
Future Work
• Perform 3D street-level semantic mapping and reconstruction.
• Add detailed street-level information such as signs and information boards.
Thank you!!!
Ground Plane Pixel Cost
• Using a single view creates a shadow effect for objects that violate the flat-world assumption, leading to wrong label estimates.
[Figure: single-view versus multi-view homography projection onto ground plane Z; labels Road, Pavement, Post/Pole]