Image Segmentation Using Hardware Forest Classifiers

IMAGE SEGMENTATION USING HARDWARE FOREST CLASSIFIERS Neil Pittman & Alessandro Forin, Microsoft Research Antonio Criminisi & Jamie Shotton, Microsoft Research Cambridge Atabak Mahram, Boston University


Page 1: Image Segmentation Using Hardware Forest Classifiers

IMAGE SEGMENTATION USING HARDWARE FOREST CLASSIFIERS

Neil Pittman & Alessandro Forin, Microsoft Research

Antonio Criminisi & Jamie Shotton, Microsoft Research Cambridge

Atabak Mahram, Boston University

Page 2: Image Segmentation Using Hardware Forest Classifiers

Kinect Pipeline

Depth Image → Background Removal (BGR) → Body Part Classification → Centroid Calculation → Model Fitting → User Skeletons

Page 3: Image Segmentation Using Hardware Forest Classifiers

Partitioning – Xbox & K4W

[Diagram: pipeline stages mapped onto platform components]

Components: Sensor (Kinect), SW Host (PC/Xbox/SoC), HW Accelerator (FPGA/GPU), Application (PC/Xbox)

Stages: Depth, BGR, Body Parts, Centroids, Model Fit, Skeleton

Page 4: Image Segmentation Using Hardware Forest Classifiers

Partitioning – What we Want

[Diagram: pipeline stages mapped onto platform components]

Components: Sensor (Kinect), SW Host (PC/Xbox/SoC), HW Accelerator (FPGA/GPU), Application (PC/Xbox)

Stages: Depth, BGR, Body Parts, Centroids, Model Fit, Skeleton

Page 5: Image Segmentation Using Hardware Forest Classifiers

What we Want – ‘noBGR’

Directly connect the sensor to the compute unit.

Offload the computation to the device.

Lower power for embedded and mobile.

Higher frame rate for next-generation apps.

We want hardware Background Removal (BGR)!!!

Page 6: Image Segmentation Using Hardware Forest Classifiers

Background Removal (BGR)

Grows moving/active pixels into islands using a Connected Components algorithm.

Uses history and complex rules for merging and splitting islands into a player mask.

Highly sequential: each pixel is compared with its neighbors, which is undesirable for a hardware implementation.
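To illustrate the neighbor-by-neighbor island growing described on this slide, here is a minimal sketch of connected-components labeling on a binary "active pixel" mask. It is not the Kinect BGR implementation: the 4-connectivity, the function name `label_islands`, and the toy mask are all assumptions made for illustration, and the history and merge/split rules are not modeled.

```python
from collections import deque


def label_islands(mask):
    """Label 4-connected islands of truthy pixels; label 0 means background."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue  # background pixel, or already part of an island
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:  # grow the island one neighbor at a time
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels, next_label


# Hypothetical 3x4 mask with two separate islands of active pixels.
mask = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, count = label_islands(mask)
```

Each pixel's label depends on labels already assigned to its neighbors, which is the data dependency that makes this style of algorithm awkward to pipeline in hardware.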

Page 7: Image Segmentation Using Hardware Forest Classifiers

Background Removal (BGR)

BGR represents a large computational workload.

Required to classify pixels into one of two classes: player or not player.

We have hardware that can classify pixels into 31 classes: FPGA Forest Fire.

CPU           BGR fps
Intel Atom    14.3
Arm Cortex    7.0

Page 8: Image Segmentation Using Hardware Forest Classifiers

Forest Fire

A random-tree-based classification algorithm.

Starting at the root, a pixel traverses each tree to a leaf.

The decision to go to the left or right child is based on an evaluation function.

Each leaf contains probabilities for each class.

The results of each tree are aggregated for the final result.
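The traversal and aggregation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Forest Fire implementation: the dictionary node layout, the `evaluate` callback, the threshold comparison, and averaging as the aggregation rule are all hypothetical choices, since the slides do not specify the evaluation function or how leaf results are combined.

```python
def classify_pixel(forest, pixel, evaluate):
    """Walk every tree from root to leaf and average the leaf probabilities."""
    totals = None
    for root in forest:
        node = root
        while "probs" not in node:  # internal node: choose left or right child
            go_left = evaluate(pixel, node["params"]) < node["threshold"]
            node = node["left"] if go_left else node["right"]
        probs = node["probs"]  # leaf: per-class probabilities
        if totals is None:
            totals = list(probs)
        else:
            totals = [t + p for t, p in zip(totals, probs)]
    return [t / len(forest) for t in totals]


# Hypothetical two-tree forest over two classes; "params" selects a feature index.
tree_a = {"params": 0, "threshold": 0.5,
          "left": {"probs": [0.75, 0.25]}, "right": {"probs": [0.25, 0.75]}}
tree_b = {"params": 1, "threshold": 2.0,
          "left": {"probs": [0.5, 0.5]}, "right": {"probs": [0.0, 1.0]}}


def feature(pixel, index):
    """Assumed evaluation function: read one precomputed feature of the pixel."""
    return pixel[index]


probabilities = classify_pixel([tree_a, tree_b], (0.7, 1.0), feature)
best_class = max(range(len(probabilities)), key=probabilities.__getitem__)
```

Because each pixel walks the trees independently of every other pixel, this kind of classifier parallelizes far more naturally than neighbor-dependent island growing.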

Page 9: Image Segmentation Using Hardware Forest Classifiers

‘noBGR’ Hypothesis:

BGR is simply classifying pixels as player or background.

If Forest Fire can be trained to classify human body parts, it can be trained to classify player from background.

We have an efficient hardware implementation of Forest Fire for the FPGA.

We can implement BGR functionality using FPGA Forest Fire.

Page 10: Image Segmentation Using Hardware Forest Classifiers

Two Experiments – Baseline

[Block diagram] Blocks: Connected Components (BGR), Upsample & Tag, RANSAC Floor, Decimate, Subsample, Forest Fire (Body Parts), K-means Centroids, Model Fit

Page 11: Image Segmentation Using Hardware Forest Classifiers

Two Experiments – One Stage

[Block diagram] Blocks: Forest Fire (BGR & Body Parts), K-means Floor, Subsample, K-means Centroids, Pre-ModelFit, Model Fit

Page 12: Image Segmentation Using Hardware Forest Classifiers

Two Experiments – Two Stage

[Block diagram] Blocks: Forest Fire (BGR), Upsample/Decimate, K-means Floor, Subsample, Forest Fire (Body Parts), K-means Centroids, Pre-ModelFit, Model Fit

Page 13: Image Segmentation Using Hardware Forest Classifiers

Floor Calculation

Forest Fire classifies the floor pixels of the input image.

The floor plane is calculated from the centroids of these pixels.

The plane is represented by its normal vector where the player is standing.
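As a rough illustration of turning floor-pixel centroids into a normal vector, the sketch below takes three centroids and computes the unit normal of the plane through them with a cross product. The use of exactly three centroids, the y-up camera convention, and the names `plane_normal` and `inclination_deg` are assumptions for this example; the slides do not state how many centroids are used or how the plane is fitted.

```python
import math


def plane_normal(p0, p1, p2):
    """Unit normal of the plane through three non-collinear 3-D points."""
    ux, uy, uz = (p1[i] - p0[i] for i in range(3))
    vx, vy, vz = (p2[i] - p0[i] for i in range(3))
    # Cross product of the two edge vectors is perpendicular to the plane.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("centroids are collinear")
    if ny < 0:  # orient the normal upward (assumed y-up axis)
        nx, ny, nz = -nx, -ny, -nz
    return (nx / length, ny / length, nz / length)


def inclination_deg(normal):
    """Angle in degrees between the normal and the assumed vertical (+y) axis."""
    return math.degrees(math.acos(max(-1.0, min(1.0, normal[1]))))


# Hypothetical floor centroids lying on the plane y = 0.
centroids = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 0.0, 2.0)]
normal = plane_normal(*centroids)
tilt = inclination_deg(normal)
```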

Page 14: Image Segmentation Using Hardware Forest Classifiers

Floor Calculation

Replaces the function of the SW BGR.

Finds the floor plane for the Model Fitting stage.

Implemented in hardware using both RANSAC and k-means algorithms.

Algorithm        Floors Detected  Percentage of Total  Inclination Error (deg)  Azimuth Error (deg)
Float RANSAC     2,366            80.9                 20.5                     9.9
Integer RANSAC   2,483            84.9                 2.5                      6.3
k-Means          2,912            99.6                 11.0                     11.8
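For readers unfamiliar with RANSAC plane fitting, the following is a generic floating-point sketch: sample three points, fit a plane, count the points within a tolerance, and keep the best candidate. It is not the hardware implementation compared in the table; the iteration count, tolerance, fixed seed, function names, and synthetic point cloud are all assumptions, and the integer variant is not modeled.

```python
import random


def fit_plane(p0, p1, p2):
    """Return (normal, d) with normal . p + d = 0, or None if degenerate."""
    u = [p1[i] - p0[i] for i in range(3)]
    v = [p2[i] - p0[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = sum(c * c for c in n) ** 0.5
    if norm < 1e-12:
        return None  # the three samples are collinear
    n = tuple(c / norm for c in n)
    return n, -sum(n[i] * p0[i] for i in range(3))


def ransac_plane(points, iterations=200, tolerance=0.01, seed=0):
    """Keep the sampled plane with the most points within `tolerance`."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, -1
    for _ in range(iterations):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = sum(
            1 for p in points
            if abs(n[0] * p[0] + n[1] * p[1] + n[2] * p[2] + d) <= tolerance
        )
        if inliers > best_inliers:
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Synthetic data: 20 floor points on y = 0 plus three off-plane clutter points.
floor = [(float(x), 0.0, float(z)) for x in range(5) for z in range(1, 5)]
clutter = [(1.0, 0.8, 2.0), (3.0, 1.5, 1.0), (2.0, 0.4, 3.0)]
plane, inliers = ransac_plane(floor + clutter)
```

Inclination and azimuth errors such as those in the table would be measured by comparing the recovered normal against a reference normal, which this sketch does not attempt.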

Page 15: Image Segmentation Using Hardware Forest Classifiers

Player Tagging

[Figure panels: Software BGR, Hardware BGR]

The BGR software partitions the foreground mask into player masks.

The BGR hardware outputs a single foreground mask. All foreground pixels and their resulting centroids are labeled ‘player 1’.

Page 16: Image Segmentation Using Hardware Forest Classifiers

Player Tagging

Model Fitting requires that body parts be assigned to individual players.

Pre-ModelFit partitions the centroids by player using a heuristic.

Page 17: Image Segmentation Using Hardware Forest Classifiers

Results

[Figure panels: Input Depth, SW BGR, Forest Fire BGR, Forest Fire Floor, Skeleton]

Page 18: Image Segmentation Using Hardware Forest Classifiers

Results

[Figure panels: Baseline, Hardware (Two Stage), Ground Truth]

Page 19: Image Segmentation Using Hardware Forest Classifiers

Results

[Bar chart: Percentage Difference from Ground Truth (axis from 0 to 90) for Suite 94*, Suite 97**, and Suite 111***. Series: Baseline; One Stage; Two Stage, Depth 16; Two Stage, Depth 10.]

*standing with varied scenes, **standing with similar scenes, ***seated and furniture.

Page 20: Image Segmentation Using Hardware Forest Classifiers

Power Comparison

Platform w/Kinect    Power (W)
PC                   162.6
Xbox                 92.1
Xilinx ML605         25.7
Digilent ZedBoard    9.0

The Xbox powers the Kinect sensor via USB. With the other platforms, the sensor is powered by an external power supply. The Kinect sensor alone draws 3.5 W; this is added to those platforms’ system power in the table above.
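To put the table's totals in proportion, the short calculation below expresses each platform's power as a multiple of the ZedBoard total. The numbers are copied from the table; the choice of the ZedBoard as the reference and the rounding to one decimal place are assumptions made for this example.

```python
# Totals from the table above (platform plus Kinect), in watts.
power_w = {
    "PC": 162.6,
    "Xbox": 92.1,
    "Xilinx ML605": 25.7,
    "Digilent ZedBoard": 9.0,
}

# Express every platform as a multiple of the lowest-power platform.
baseline = power_w["Digilent ZedBoard"]
ratios = {name: round(watts / baseline, 1) for name, watts in power_w.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x the ZedBoard total")
```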

Page 21: Image Segmentation Using Hardware Forest Classifiers

FPGA Utilization – One Stage

Component          LUTs    %      FF      %      BRAM  %
Full System        37470   24.9   31796   10.6   27    6.49
Forest Fire Core   30129   20.0   23198   7.69   5     1.2
Sorting FIFO       425     0.28   428     0.14   2     0.48
DDR3 Controller    5488    3.64   7496    2.49   0     0.0
PC Interface       1852    1.23   1101    0.37   22    5.29
Input Buffer       76      0.05   42      0.01   11    2.64
Output Buffer      0       0.0    0       0.0    8     1.92

Using Virtex 6 240t (xc6vlx240t-1ff1156)

Page 22: Image Segmentation Using Hardware Forest Classifiers

FPGA Utilization – Two Stage

Component                   LUTs    %      FF      %      BRAM  %
Full System                 67236   44.6   55405   18.4   32    7.69
Forest Fire Two Instances   60611   40.2   46614   15.5   10    2.4
Forest Fire Core0           29888   19.8   23275   7.72   5     1.2
Forest Fire Core1           29773   19.8   23337   7.74   5     1.2
DDR3 Controller             4746    3.15   7686    2.55   0     0.0
PC Interface                1876    1.24   1101    0.37   22    5.29
Input Buffer                76      0.05   42      0.01   11    2.64
Output Buffer               0       0.0    0       0.0    8     1.92

Using Virtex 6 240t (xc6vlx240t-1ff1156)

Page 24: Image Segmentation Using Hardware Forest Classifiers

Demo Night

Please come see our Hand Tracking Demo.

Page 25: Image Segmentation Using Hardware Forest Classifiers

BACKUP

Page 26: Image Segmentation Using Hardware Forest Classifiers

Kinect Pipeline

Depth → BGR → Body Parts → Centroids → Model Fit → Skeletons

Page 27: Image Segmentation Using Hardware Forest Classifiers

One Stage

Database: 32 classes, 3 trees, 20 levels.

Input: raw depth image.

Output: pixels tagged with body parts and floor.

Average performance: ≈ 56 fps

Page 28: Image Segmentation Using Hardware Forest Classifiers

Two Stage, Core0

Database: 3 classes, 3 trees, 16 levels.

Input: raw depth image.

Output: pixels tagged with player, non-player and floor.

Average performance: ≈ 200+ fps if subsampled.

Page 29: Image Segmentation Using Hardware Forest Classifiers

Two Stage, Core1

Database: 31 classes, 3 trees, 20 levels.

Input: depth image filtered by player tags.

Output: pixels tagged with body parts.

Average performance: ≈ 200+ fps