Object Detection - Through Machine Learningmachinelearning.math.rs/Stegic-ObjectDetection.pdf ·...

Object DetectionThrough Machine Learning

Machine Learning and Applications Group, 2018.

Uroš Stegić

urosstegic@gmx.com

TRADITIONAL COMPUTER VISION

General Overview

Convolution Operator

Filters

Convolutions Over Volume

Traditional Computer Vision Convolutional Neural Networks Object Detection

Description

Process & analyze visual signalExtract information from visual signalPerform on raw signal (pixel intensities values)

Enhance Intuition

Computer Graphics

(x0, y0) = (3, 4)r = 6↓

Computer Vision

↓(x0, y0) = (3, 4)

Tasks in Computer Vision

Object RecognitionImage RetrievalObject DetectionOCRPose Estimation...

TrackingScene ReconstructionOptical FlowSemantic SegmentationImage Reconstuction...

Tasks in Computer Vision

Object RecognitionImage RetrievalObject DetectionOCRPose Estimation...

TrackingScene ReconstructionOptical FlowSemantic SegmentationImage Reconstuction...

Convolution Operator - Definition

DefinitionLet A,B ∈ D ⊆ Rn×n. Convolution operator, denoted as ∗maps the spaceD×D to a field of real numbers and is definedas follows:

A ∗ B =

n∑i=1

n∑j=1

AijBij

Convolution Operator - Example

1 2 34 5 67 8 9

0 1 01 0 10 1 0

1 2 34 5 67 8 9

0 1 01 0 10 1 0

= 2 ∗ 1 + 4 ∗ 1 + 6 ∗ 1 + 8 ∗ 1 = 20

1 2 34 5 67 8 9

0 1 01 0 10 1 0

= 2 ∗ 1 + 4 ∗ 1 + 6 ∗ 1 + 8 ∗ 1 = 20

1 2 34 5 67 8 9

0 1 01 0 10 1 0

= 2 ∗ 1 + 4 ∗ 1 + 6 ∗ 1 + 8 ∗ 1 = 20

1 2 34 5 67 8 9

0 1 01 0 10 1 0

= 2 ∗ 1 + 4 ∗ 1 + 6 ∗ 1 + 8 ∗ 1 = 20

1 2 34 5 67 8 9

0 1 01 0 10 1 0

= 2 ∗ 1 + 4 ∗ 1 + 6 ∗ 1 + 8 ∗ 1 = 20

Filters

211 39 200 102 174 25 90 144138 44 184 110 193 30 92 136151 73 190 114 189 41 105 128129 101 123 181 201 169 117 191140 122 153 231 209 157 124 113221 115 77 244 198 149 156 247

0 1 01 0 10 1 0

Filters - Examples

Vertical Edge ExtractorHorizontal Edge ExtractorSobel filterSharpenGaussian Blur

Filters - Edge Extractor

1 0 −11 0 −11 0 −1

Filters - Edge Extractor

1 0 −11 0 −11 0 −1

Filters - Sobel

1 0 −12 0 −21 0 −1

Filters - Gaussian Blur

1 4 6 4 14 16 24 16 46 24 36 24 64 16 24 16 41 4 6 4 1

Multiple Input Channels

Figure: Convolution of multichannel image

Multiple Filters

Figure: Convolution of multichannel image with two filters21

CONVOLUTIONAL NEURALNETWORKS

Parameter Learning

Basic CNNs

Residual Networks

Inception Networks

Basic Concepts

Figure: Convolutional layers stacked

Basic Concepts - Takeaway

Image ClassificationParameters (filters) Learning [LBD+89]Weight SharingFeature Extraction

Basic Concepts - Feature Abstractions

Figure: Feature Visualization [ZF13]

Basic Concepts - Pooling Layers

Sampling important FeaturesReduce Computation TimeMake Features Robust

Basic Concepts - Pooling Layers (Example)

Pooling Layer - Max Pooling

9 2 4 13 1 8 24 5 9 25 6 0 1

−→[9 86 9

Basic Concepts - Architecture

Figure: Convolutional Neural Network - Example

CNN Architecture - Lenet-5

Figure: Lenet-5 Architecture [LBBH98]

CNN Architecture - VGG

Figure: VGG Architecture

CNN Architecture - AlexNet

Figure: AlexNet Architecture [KSH12]

CNN - Problems

Vanishing GradientExploding GradientComputational Complexity

Residual Block

Figure: Residual Block (Skip Connection) [HZRS15]

Residual Network

Figure: CNN Architecture - ResNet-34 [HZRS15]

1x1 Convolution

Figure: 1x1 Convolution [LCY13]

Inception Module - Idea

Figure: Inception Module Naive Version [SLJ+14]

Inception Module - Redone

Figure: Inception Module With Dimension Reduction [SLJ+14]

Inception Network

Figure: Inception Network (GoogLeNet) [SLJ+14]

OBJECT DETECTION

Task Outline

RCNN Family

Other Influental Models

Speed/Accuracy Trade-Off

Visualizing the Task

Figure: Object Detection Task40

Understanding the Bounding Box Error

Figure: Bounding Box Missmatch

Defining the IoU

Figure: Intersection over Union

Gaining Intuition on IoU

Figure: Intersection over Union - Example

Similar Bounding Boxes Problem

Figure: Elimination of Multiple Bounding Boxes

Non-Maximum Suppression

Threshold every bounding boxSort bounding boxes by detection probability indecresing orderFor each bounding box bi remove all bounding boxesbj(j = i) such that IoU(bi,bj) ≥ t for some fixed t

YOLO - Introduction

Figure: Grid for YOLO

pcbxbybwbhc1c2...cn

0You Only Look Once: Unified, Real-Time Object Detection [RDGF15]

Limitations (already?)

Problem: Multiple objects centered in same cell

Anchor Boxes

Choose a number of anchors (predefined bboxes)Select a ratio (width and height) for each of themModify the output to include this anchors...Profit

Anchor Boxes - Example

pc1bx1by1bw1bh1c11...cn1

, y2 =

pc2bx2by2bw2bh2c12...cn2

YOLO - Loss Fucntion

L(y, y) = λcoord

s2∑i=0

B∑j=0

1objij [(xi − xi)2 + (yi − yi)2]

+ λcoords2∑i=0

B∑j=0

1objij [(

√wi −√

wi)2 + (

√hi −

√hi)

+s2∑i=0

B∑j=0

1objij (Ci − Ci)

+ λnoobjs2∑i=0

B∑j=0

1objij (Ci − Ci)

+s2∑i=0

∑c∈classes

(pi(c)− pi(c))2

Region Based Approach

Propose Regions of InterestClassify each RoIRegress Bounding Box Coordinates

Region Models

Regions with CNN (R-CNN) [GDDM13]Fast R-CNN [Gir15]Faster R-CNN [RHGS15]Mask R-CNN [HGDG17]

Region Proposals - Selective Search

Figure: Selective Search Algorithm Visualized

Figure: R-CNN Pipeline

Fast R-CNN

Convolution Based Sliding WindowROI PoolingSoftmax Classification

Fast R-CNN - Sliding Window

Figure: Sliding Window - CNN Implementation

Fast R-CNN - Visualized

Figure: Fast R-CNN Pipeline

Fast R-CNN - Loss

L(p,u, tu, v) = Lcls(p,u) + λ[u ≥ 1]Lloc(tu, v)

Lcls(p,u) = − logpu

Lloc(tu, v) =∑

i∈{x,y,w,h}smoothL1(tui − vi)

smoothL1(x) ={0.5x2, if x ≤ 1

x− 0.5, otherwise

Faster R-CNN

Bottleneck: Region Proposals by Selective Search (2s)Solution: Region Proposals by CNN (0.01s)

Region Proposal Network

Figure: Region Proposal Network for Faster R-CNN60

RPN - Loss

L(pi, ti) =1

∑iLcls(pi,p∗i ) + λ

∑ip∗i Lreg(ti, t∗i )

Faster R-CNN - Architecture

Figure: Model Scheme of Faster R-CNN

Mask R-CNN

Figure: Model Scheme of Faster R-CNN

Other Influential Models

RetinaNet (Focal Loss) [LGG+17]Single Shot Detector [LAE+15]

RetinaNet

Figure: Retina Net - Overview

Speed vs. Precision

Figure: GPU Time vs. Precision [HRS+16]

Lecture Pronouncement

CONVERGENCE

References I

Ross B. Girshick, Jeff Donahue, Trevor Darrell, andJitendra Malik, Rich feature hierarchies for accurate objectdetection and semantic segmentation, CoRRabs/1311.2524 (2013).Ross B. Girshick, Fast R-CNN, CoRR abs/1504.08083(2015).Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross B.Girshick, Mask R-CNN, CoRR abs/1703.06870 (2017).

References II

Jonathan Huang, Vivek Rathod, Chen Sun, MenglongZhu, Anoop Korattikara, Alireza Fathi, Ian Fischer,Zbigniew Wojna, Yang Song, Sergio Guadarrama, andKevin Murphy, Speed/accuracy trade-offs for modernconvolutional object detectors, CoRR abs/1611.10012(2016).Kaiming He, Xiangyu Zhang, Shaoqing Ren, and JianSun, Deep residual learning for image recognition, CoRRabs/1512.03385 (2015).

References III

Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton,Imagenet classification with deep convolutional neuralnetworks, Advances in Neural Information ProcessingSystems 25 (F. Pereira, C. J. C. Burges, L. Bottou, andK. Q. Weinberger, eds.), Curran Associates, Inc., 2012,pp. 1097–1105.

Wei Liu, Dragomir Anguelov, Dumitru Erhan, ChristianSzegedy, Scott E. Reed, Cheng-Yang Fu, andAlexander C. Berg, SSD: single shot multibox detector,CoRR abs/1512.02325 (2015).Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner,Gradient-based learning applied to document recognition,IEEE (1998), 2278–2324.

References IV

Yann Lecun, Bernhard Boser, John Denker, DonHenderson, R E. Howard, W.E. Hubbard, and LarryJackel, Backpropagation applied to handwritten zip coderecognition, Neural Computation 1 (1989), 541–551.

Min Lin, Qiang Chen, and Shuicheng Yan, Network innetwork, CoRR abs/1312.4400 (2013).Tsung-Yi Lin, Priya Goyal, Ross B. Girshick, Kaiming He,and Piotr Dollár, Focal loss for dense object detection,CoRR abs/1708.02002 (2017).Joseph Redmon, Santosh Kumar Divvala, Ross B.Girshick, and Ali Farhadi, You only look once: Unified,real-time object detection, CoRR abs/1506.02640 (2015).

References V

Shaoqing Ren, Kaiming He, Ross B. Girshick, and JianSun, Faster R-CNN: towards real-time object detection withregion proposal networks, CoRR abs/1506.01497 (2015).

Christian Szegedy, Wei Liu, Yangqing Jia, PierreSermanet, Scott E. Reed, Dragomir Anguelov, DumitruErhan, Vincent Vanhoucke, and Andrew Rabinovich,Going deeper with convolutions, CoRR abs/1409.4842(2014).Matthew D. Zeiler and Rob Fergus, Visualizing andunderstanding convolutional networks, CoRRabs/1311.2901 (2013).

Object Detection - Through Machine Learningmachinelearning.math.rs/Stegic-ObjectDetection.pdf ·...

Documents

Transcript of Object Detection - Through Machine Learningmachinelearning.math.rs/Stegic-ObjectDetection.pdf ·...

1 The Object OBJECT Attributes contained in data structures Procedures which can operate on the data structures Object Identity Object State Object behaviour.

AnEvaluationofDeepLearningMethodsforSmall ObjectDetection

Problems.Net Анатолий Крыжановский Обра. ООП 2 class Bar { } void Foo(object a) { Console.WriteLine("object"); } void Foo(object a, object b) { Console.WriteLine("object,

Object Object Description Object abbr. Strategic Director ... · Object Description Object abbr. Object Type CE Change & Efficiency CECORSER O Strategic Director Change & Efficiency

· Web viewBCA305 :Object Oriented Programming and C++ (Core) Unit 1: Introduction-Object Orientation- object oriented development-Object oriented Methodology-Object oriented Models-Object

Object Recognizing. Object Classes Individual Recognition.

Supervised object recognition, unsupervised object ...courses.csail.mit.edu/6.869/lectnotes/lect18/lect18-slides.pdfSupervised object recognition, unsupervised object recognition then

Object Persistence using Hibernate An Object-Relational mapping framework for object persistence.

MOVING OBJECTDETECTION USING CELLULAR …umpir.ump.edu.my/291/1/Prema_Latha_Subramaniam.pdf · Author :PREMALATHA SUBRAMANIAM Date :15NOVEMBER2008 . iii Speciallydedicatedto mybelovedparents

MDCN:Multi-Scale,DeepInception ... · MDCN:Multi-Scale,DeepInception ConvolutionalNeuralNetworksforEfficient ObjectDetection Wenchi Ma1; YuanweiWu1; ZongboWang2;GuanghuiWang1 1University

OBJECTDETECTION - CVMLcvml.ajou.ac.kr/wiki/images/8/8b/Ch7_1_Object_Detection.pdf · Faster RCNN •Insert a Region Proposal Network (RPN) after the last convolutional layer •RPN

Object and object-relational databases · Object and object-relational databases ... Multiple inheritance and selective inheritance ! ... (OODBMS) – Object ...

OBJECTS OBJECT IDENTITY REFERENCE TYPES. Object identifier - oid FILE SYSTEM Data object.

Pillar-based Object Detection for Autonomous Driving...Pillar-based ObjectDetection for Autonomous Driving 7 (a) CylindricalView (b) SphericalView Fig.2. Comparison of (a) cylindricalview

Object Database Vs_ Object-Relational Databases

Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

Object identification techniques for object-oriented ...

Object profile Object for transfer Transferor Object profile Beer market analysis

Object Tracking Using Color Object

Regularization for - Machine Learningmachinelearning.math.rs/Jovanovic-MultiTask.pdf · Regularization for ... Zhou,Chen,Ye (2012) Multi-Task Learning: Theory, Algorithms, and Applications