COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4...
Transcript of COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4...
![Page 1: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/1.jpg)
COMPUTER VISION
![Page 2: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/2.jpg)
Tasks
ADAS
Self Driving
Localizati
onPerception
Planning/
Control
Driver
state
Vehicle
Diagnosis
Smart
factory
Meth
od
s
Trad
ition
al
Non-machine LearningGPS,
SLAM
Optimal
control
Mach
ine-L
earnin
gb
ased m
etho
d
Su
perv
ised
SVM
MLP
Pedestrian
detection(HOG+SVM)
Deep
-Learn
ing
based
CNN
Detection/
Segmentat
ion/Classif
ication
End-to-
end
Learning
RNN
(LSTM)
Dry/wet
road
classificati
on
End-to-
end
Learning
Behavior
Prediction/
Driver
identificati
on
*
DNN * *
Reinforcement *
Unsupervised *
![Page 3: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/3.jpg)
Vision tasks
Object
recognition
Object
detection
Semantic
segmentation
Object
tracking
Visual
SLAM
![Page 4: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/4.jpg)
Semantic segmentation
• Building/road/sky/object/grass/water/tree
Clement Farabet, Camille Couprie, Laurent Najman and Yann LeCun: Learning Hierarchical Features for Scene Labeling, IEEE Transactions
on Pattern Analysis and Machine Intelligence, August, 2013
![Page 5: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/5.jpg)
Object tracking
Yi Wu, Jongwoo Lim, and Ming-Hsuan Yang, "Object Tracking Benchmark",
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015
![Page 6: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/6.jpg)
Visual SLAM
![Page 7: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/7.jpg)
Tasks
ADAS
Self Driving
Localizati
onPerception
Planning/
Control
Driver
state
Vehicle
Diagnosis
Smart
factory
Meth
od
s
Trad
ition
al
Non-machine LearningGPS,
SLAM
Optimal
control
Mach
ine-L
earnin
gb
ased m
etho
d
Su
perv
ised
SVM
MLP
Pedestrian
detection(HOG+SVM)
Deep
-Learn
ing
based
CNN
Detection/
Segmentat
ion/Classif
ication
End-to-
end
Learning
RNN
(LSTM)
Dry/wet
road
classificati
on
End-to-
end
Learning
Behavior
Prediction/
Driver
identificati
on
*
DNN * *
Reinforcement *
Unsupervised *
![Page 8: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/8.jpg)
ORB-SLAM in the KITTI dataset
• ORB-SLAM2 is a real-time SLAM library
for Monocular, Stereo and RGB-D cameras that computes the
camera trajectory and a sparse 3D reconstruction
![Page 9: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/9.jpg)
COMPUTER VISION
IMAGE UNDERSTANDING …
![Page 10: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/10.jpg)
Why understanding images is hard
Image
Very many
sources of
variability
From J. Winn, MSR
![Page 11: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/11.jpg)
Sources of image variability
Scene type
Scene geometry
Street scene
From J. Winn, MSR
![Page 12: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/12.jpg)
Sources of image variability
Scene type
Scene geometry
Object classes
Street scene
Sky
Building×3
Road
Sidewalk
Tree×3
Person×4
Bicycle
Car×5
Bench
Bollard
From J. Winn, MSR
![Page 13: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/13.jpg)
Sources of image variability
Street scene
Sky
Building×3
Road
Sidewalk
Tree×3
Person×4
Bicycle
Car×5
Bench
Bollard
Scene type
Scene geometry
Object classes
Object position
Object orientation
From J. Winn, MSR
![Page 14: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/14.jpg)
Sources of image variability
Scene type
Scene geometry
Object classes
Object position
Object orientation
Scene type
Scene geometry
Object classes
Object position
Object orientation
Object shape
Street scene
From J. Winn, MSR
![Page 15: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/15.jpg)
Scene type
Scene geometry
Object classes
Object position
Object orientation
Object shape
Sources of image variability
Scene type
Scene geometry
Object classes
Object position
Object orientation
Object shape
Depth/occlusions
From J. Winn, MSR
![Page 16: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/16.jpg)
Sources of image variability
Scene type
Scene geometry
Object classes
Object position
Object orientation
Object shape
Depth/occlusions
Scene type
Scene geometry
Object classes
Object position
Object orientation
Object shape
Depth/occlusions
Object appearance
From J. Winn, MSR
![Page 17: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/17.jpg)
Sources of image variability
Scene type
Scene geometry
Object classes
Object position
Object orientation
Object shape
Depth/occlusions
Object appearance
Scene type
Scene geometry
Object classes
Object position
Object orientation
Object shape
Depth/occlusions
Object appearance
Illumination
Shadows
From J. Winn, MSR
![Page 18: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/18.jpg)
Sources of image variability
Scene type
Scene geometry
Object classes
Object position
Object orientation
Object shape
Depth/occlusions
Object appearance
Illumination
Shadows
From J. Winn, MSR
![Page 19: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/19.jpg)
Sources of image variability
Scene type
Scene geometry
Object classes
Object position
Object orientation
Object shape
Depth/occlusions
Object appearance
Illumination
Shadows
Motion blur
Camera effects
From J. Winn, MSR
![Page 20: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/20.jpg)
Computer vision problems
Scene type
Scene geometry
Object classes
Object position
Object orientation
Object shape
Depth/occlusions
Object appearance
Illumination
Shadows
Motion blur
Camera effects
![Page 21: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/21.jpg)
Now you see me
![Page 22: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/22.jpg)
![Page 23: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/23.jpg)
![Page 24: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/24.jpg)
![Page 25: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/25.jpg)
![Page 26: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/26.jpg)
![Page 27: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/27.jpg)
![Page 28: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/28.jpg)
![Page 29: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/29.jpg)
![Page 30: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/30.jpg)
![Page 31: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/31.jpg)
Moravec’s Paradox
• The main lesson of 35 years of AI research is that the
hard problems are easy and the easy problems are
hard. The mental abilities of a four-year-old that we take
for granted – recognizing a face, lifting a pencil, walking
across a room, answering a question – in fact solve some
of the hardest engineering problems ever conceived... As
the new generation of intelligent devices appears, it will be
the stock analysts and petrochemical engineers and parole
board members who are in danger of being replaced by
machines. The gardeners, receptionists, and cooks are
secure in their jobs for decades to come.
• Pinker, Steven (September 4, 2007) [1994], The Language Instinct,
Perennial Modern Classics, Harper, ISBN 0-06-133646-7
![Page 32: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/32.jpg)
MACHINE LEARNINGNeural network을중심으로
![Page 33: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/33.jpg)
Tasks
ADAS
Self Driving
Localizati
onPerception
Planning/
Control
Driver
state
Vehicle
Diagnosis
Smart
factory
Meth
od
s
Trad
ition
al
Non-machine LearningGPS,
SLAM
Optimal
control
Mach
ine-L
earnin
gb
ased m
etho
d
Su
perv
ised
MLPPedestrian
detection(HOG+SVM)
Deep
-Learn
ing
based
CNN
Detection/
Segmentat
ion/Classif
ication
End-to-
end
Learning
RNN
(LSTM)
Dry/wet
road
classificati
on
End-to-
end
Learning
DNN
Reinforcement
Unsupervised
![Page 34: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/34.jpg)
LINEAR PERCEPTRON
![Page 35: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/35.jpg)
뉴런: 신경망의기본단위
![Page 36: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/36.jpg)
Basic model
• The first learning machine: the Perceptron (built in 1960)
• The perceptron was a linear classifier
𝑦 = sign(wTx + b)
𝑦 = +1 if 𝑤1𝑥1 + 𝑤2𝑥2 +⋯+𝑤𝑛𝑥𝑛 + 𝑏 > 0−1 otherwise
![Page 37: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/37.jpg)
Linear Perceptron
• The goal: Find the best line (or hyper-plane) to separate the training data.
• In two dimensions, the equation of the line is given by a line:
• 𝑎𝑥 + 𝑏𝑦 + 𝑐 = 0
• A better notation for n dimensions: treat each data point and the coefficients as vectors. Then the equation is given by:
• w⊤x + b = 0
Class (+1)Class (-1)
![Page 38: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/38.jpg)
예시: 연어와농어의구별
![Page 39: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/39.jpg)
예시: 연어와농어의구별
밝기 (𝑙)
폭 (𝑤) 7.3𝑙 + 3.4𝑤 = 100
𝑙
𝑤
7.3𝑙 + 3.4𝑤 ≥ 100
7.3𝑙 + 3.4𝑤 < 100
농어
연어
![Page 40: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/40.jpg)
4 2 2 4
0.2
0.4
0.6
0.8
1.0
예시: 연어와농어의구별
𝑙
𝑤
7.3 × 𝑙
3.4 × 𝑤
Σ
100
연어/농어
![Page 41: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/41.jpg)
Artificial Neuron
𝑤𝑇𝑥 𝑔(𝑤𝑇𝑥)
Activation function (non-linear)
![Page 42: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/42.jpg)
![Page 43: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/43.jpg)
Artificial Neuron
• However, it cannot solve non-linearly-separable problems
![Page 44: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/44.jpg)
MULTI-LAYER PERCEPTRON
![Page 45: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/45.jpg)
![Page 46: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/46.jpg)
Multi-layer Neural Network
• 1st Layer
• h1 = 𝑔(𝑊1x + 𝑏1)
• 2nd Layer
• h2 = 𝑔(𝑊2h1 + 𝑏2)
• …..
• Output layer
• o = softmax(Wnhn−1 + bn)
=
hk
hk−1
𝑊𝑘
![Page 47: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/47.jpg)
4 2 2 4
1.0
0.5
0.5
1.0
4 2 2 4
0.2
0.4
0.6
0.8
1.0
Activation function 𝑔(⋅)
• Sigmoid activation function
• Squashes the neuron’s pre-activation between 0 and 1
• Always positive/Bounded/Strictly increasing
• Hyperbolic tangent (‘‘tanh’’) activation function
• Squashes the neuron’s pre-activation between -1 and 1
• Bounded/Strictly increasing
𝑔 𝑥 =1
1 + exp(−𝑥)
𝑔 𝑥 = tanh 𝑥 =exp 𝑥 − exp(−𝑥)
exp(𝑥) + exp(−𝑥)
![Page 48: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/48.jpg)
Activation function 𝑔(⋅)
• Rectified linear activation function (ReLU)
• Bounded below by 0
• Not upper bounded
• Strictly increasing
𝑔 𝑎 = rectlin 𝑎 = max 0, 𝑎
![Page 49: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/49.jpg)
Soft-max activation function at the output
• For multi-class classification
• We need multiple outputs (1 output per class)
• We use the softmax activation function at the output
• strictly positive
• sums to one
𝑂 𝐚 = softmax 𝐚 =
exp 𝑎1 𝑐 exp 𝑎𝑐exp 𝑎2
𝑐 exp 𝑎𝑐⋮⋮
exp 𝑎𝑐 𝑐 exp 𝑎𝑐
예시: 𝑎, 𝑏, 𝑐 →𝑒𝑎
𝑒𝑎+𝑒𝑏+𝑒𝑐,
𝑒𝑏
𝑒𝑎+𝑒𝑏+𝑒𝑐,
𝑒𝑐
𝑒𝑎+𝑒𝑏+𝑒𝑐
![Page 50: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/50.jpg)
Example (character recognition example)
𝑥 ∈ 0,1 10×14
𝑝(𝑐 = "0"|𝑥)
𝑝(𝑐 = "1"|𝑥)
𝑝(𝑐 = "9"|𝑥)
⋮
140 inputs
Layer 1
with 12 perceptrons
Layer 2
with 10 perceptrons
Each having 12 inputs𝑊1 ∈ ℜ140×12
𝑊2 ∈ ℜ12×10
softm
ax
𝑏1 ∈ ℜ12
𝑏2 ∈ ℜ10
![Page 51: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/51.jpg)
TRAINING OF MULTI-LAYER
PERCEPTRON
![Page 52: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/52.jpg)
Training: Loss function
• Cross entropy (classification)
• 𝑦, 𝑦 ∈ 0,1 𝑁, 𝑖=1𝑦𝑖 = 1, 𝑖=1 𝑦𝑖 = 1
• 𝐿 = − 𝑦𝑖log 𝑦𝑖
• Square Euclidean distance (regression)
• 𝑦, 𝑦 ∈ ℜ𝑁
• 𝐿 =1
2 𝑦𝑖 − 𝑦𝑖
2
Error
![Page 53: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/53.jpg)
Cross Entropy (예시)
• Label:
• [𝑦1 𝑦2 𝑦3] = [1,0,0] : class 1
• [𝑦1 𝑦2 𝑦3] = [0,1,0] : class 2
• [𝑦1 𝑦2 𝑦3] = [0,0,1] : class 3
• 예시• Network output: 𝑦 = [ 𝑦1 𝑦2 𝑦3 ] = [0.3, 0.6, 0.1]
• Loss
• If ground truth is class 1 (i.e., y = [1,0,0]) → −log 0.3 = 1.204
• If ground truth is class 2 (i.e., y = [0,1,0]) → −log 0.6 = 0.511
• If ground truth is class 3 (i.e., y = [0,0,1]) → −log 0.1 = 2.303
• Network output: 𝑦 = [ 𝑦1 𝑦2 𝑦3 ]= [0.01, 0.98, 0.01]
• Loss
• If ground truth is class 1 (i.e., y = [1,0,0]) → −log 0.01 = 4.605
• If ground truth is class 2 (i.e., y = [0,1,0]) → −log 0.98 = 0.020
• If ground truth is class 3 (i.e., y = [0,0,1]) → −log 0.01 = 4.605
𝐿 = − 𝑦𝑖log 𝑦𝑖
![Page 54: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/54.jpg)
Forward/Backward propagation
• Chain rule
𝑊𝑛𝑒𝑤 = 𝑊𝑜𝑙𝑑 − 𝜂𝑑𝐿
𝑑𝑊
![Page 55: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/55.jpg)
![Page 56: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/56.jpg)
![Page 57: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/57.jpg)
Forward/Backward propagation
Forward propagation
Backward propagation
𝑊
y
𝜕𝐿
𝜕𝑦
𝜕𝐿
𝜕x
𝜕𝐿
𝜕𝑊=𝜕𝐿
𝜕y
𝜕y
𝜕𝑊
y = 𝑔(𝑊x + 𝑏)
𝜕y
𝜕x
𝜕y
𝜕𝑊
𝜕𝐿
𝜕x=
𝜕𝐿
𝜕y⋅𝜕y
𝜕x
x
![Page 58: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/58.jpg)
FEED-FORWARD NEURAL
NETWORK (예시)
![Page 59: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/59.jpg)
Forward propagation
The 1st hidden layer The 2nd hidden layer
Soft
Max
Lay
er 1
Lay
er 2
Output
![Page 60: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/60.jpg)
Forward propagation matrix repr.
The 1st hidden layer The 2nd hidden layer
Soft
Max
Lay
er 1
Lay
er 2
Output
![Page 61: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/61.jpg)
BACK-PROPAGATION
ALGORITHM (예시)
![Page 62: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/62.jpg)
Forward propagation
(block-based representation)
Layer 1
Layer 1
Layer 2
Layer 2
Soft
Max
Output
![Page 63: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/63.jpg)
Backward propagation; 2nd layer
Ground
Truth
VS
Output
Soft
Max
Layer 1 Layer 2
• Error propagation
![Page 64: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/64.jpg)
Backward propagation; 2nd layer
• Weight update• Error propagation
Ground
Truth
VS
Output
Soft
Max
![Page 65: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/65.jpg)
Backward propagation; 1st layer
• Error propagation
Layer 1
Ground
Truth
VS
Output
Soft
Max
Layer 2
![Page 66: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/66.jpg)
Backward propagation; 1st layer
• Error propagationLayer 1 Layer 2
• Weight update
Ground
Truth
VS
Output
Soft
Max
![Page 67: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/67.jpg)
TENSORFLOW실습
![Page 68: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/68.jpg)
TENSORFLOW INTRODUCTION
![Page 69: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/69.jpg)
What is TensorFlow?
• TensorFlow is a deep learning library open-sourced by Google.
• TensorFlow provides primitives for defining functions on
tensors and automatically computing their derivatives.
• Tensor is a multidimensional array of numbers
![Page 70: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/70.jpg)
Design Choice
• Network structures
• The mathematical relationship between inputs and outputs
• Loss function
• Optimization
• Optimization methods
• Hyper-parameters (Batch size, Learning rate, … )
![Page 71: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/71.jpg)
Classification vs RegressionClassification Regression
?Expected
Price
1600 sq ft
Color
Weight
Apple
Orange
?
The variable we are trying to predict is
DISCRETE
The variable we are trying to predict is
CONTINUOUS
Housing
Prices
![Page 72: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/72.jpg)
MNIST dataset (classification example)
• handwritten digits
• a training set of 60,000 examples
• 28x28 images
![Page 73: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/73.jpg)
![Page 74: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/74.jpg)
![Page 76: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/76.jpg)
Classification Example Code
=
0
0
1
2
3
4
5
6
7
8
9
1
784 = 282
28
28
![Page 78: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/78.jpg)
VALIDATION
![Page 79: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/79.jpg)
Tasks
ADAS
Self Driving
Localizati
onPerception
Planning/
Control
Driver
state
Vehicle
Diagnosis
Smart
factory
Meth
od
s
Trad
ition
al
Non-machine LearningGPS,
SLAM
Optimal
control
Mach
ine-L
earnin
gb
ased m
etho
d
Su
perv
ised
MLPPedestrian
detection(HOG+SVM)
Deep
-Learn
ing
based
CNN
Detection/
Segmentat
ion/Classif
ication
End-to-
end
Learning
RNN
(LSTM)
Dry/wet
road
classificati
on
End-to-
end
Learning
DNN
Reinforcement
Unsupervised
![Page 80: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/80.jpg)
Validation set approach
• Divide the data in three parts:
• training, validation (development), and test. We use the train and
validation data to select the best model and the test data to assess the
chosen model.
WorldWorld
Samples trainingvalid
ation test
![Page 81: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/81.jpg)
Validation set approach
• Training set
• To fit the models
• Validation set
• To estimate prediction error for model selection
• Test set
• To assess of the generalization error of the final chose model
Train Validation Test
![Page 82: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/82.jpg)
k-fold cross validation
• We partition the data into 𝐾 parts. For the 𝑘 −th part, we fit the
model to the other 𝐾 − 1 parts of the data, and calculate the
prediction error of the fitted model when predicting the 𝑘th part
of the data. We do this for 𝑘 = 1,2,⋯ , 𝐾 and combine the 𝐾estimates
𝐾 = 7
![Page 83: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/83.jpg)
k-Fold cross validation
WorldWorld
Samples
![Page 84: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/84.jpg)
Leave-one-out cross validation
![Page 85: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/85.jpg)
Development Cycle
![Page 86: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/86.jpg)
전통적인접근법
![Page 87: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/87.jpg)
Tasks
ADAS
Self Driving
Localizati
onPerception
Planning/
Control
Driver
state
Vehicle
Diagnosis
Smart
factory
Meth
od
s
Trad
ition
al
Non-machine LearningGPS,
SLAM
Optimal
control
Mach
ine-L
earnin
gb
ased m
etho
d
Su
perv
ised
MLPPedestrian
detection(HOG+SVM)
Deep
-Learn
ing
based
CNN
Detection/
Segmentat
ion/Classif
ication
End-to-
end
Learning
RNN
(LSTM)
Dry/wet
road
classificati
on
End-to-
end
Learning
DNN
Reinforcement
Unsupervised
![Page 88: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/88.jpg)
Conventional approach
• Image classification
“Motocycle”
Slides from “Andrew Ng”
![Page 89: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/89.jpg)
Why is this hard?
Slides from “Andrew Ng”
![Page 90: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/90.jpg)
Feature representation
Slides from “Andrew Ng”
Classification
Algorithm
![Page 91: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/91.jpg)
Feature representation
Classification
Algorithm
Slides from “Andrew Ng”
![Page 92: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/92.jpg)
Example of Feature Representation
• But, … we don’t have a handlebars detector. So, researchers try to hand-
design features to capture various statistical properties of the image
Slides from “Andrew Ng”
![Page 93: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/93.jpg)
Feature representation
Slides from “Andrew Ng”
Classification
Algorithm
![Page 94: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/94.jpg)
Computer vision features
Slides from “Andrew Ng”
![Page 95: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/95.jpg)
Audio features
![Page 96: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/96.jpg)
Traditional pattern recognition
• Fixed/engineered feature + trainable classifier
![Page 97: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/97.jpg)
CASE STUDY:
PEDESTRIAN DETECTOR
![Page 98: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/98.jpg)
Tasks
ADAS
Self Driving
Localizati
onPerception
Planning/
Control
Driver
state
Vehicle
Diagnosis
Smart
factory
Meth
od
s
Trad
ition
al
Non-machine LearningGPS,
SLAM
Optimal
control
Mach
ine-L
earnin
gb
ased m
etho
d
Su
perv
ised
MLPPedestrian
detection(HOG+SVM)
Deep
-Learn
ing
based
CNN
Detection/
Segmentat
ion/Classif
ication
End-to-
end
Learning
RNN
(LSTM)
Dry/wet
road
classificati
on
End-to-
end
Learning
DNN
Reinforcement
Unsupervised
![Page 99: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/99.jpg)
Detection problem (binary) classification problem
• Sliding window scheme
![Page 100: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/100.jpg)
Each window is separately classified
![Page 101: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/101.jpg)
• 64x128 images of humans cropped from a varied set of personal
photos• Positive data – 1239 positive window examples (reflections->2478)
• Negative data – 1218 person-free training photos (12180 patches)
Training data
![Page 102: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/102.jpg)
• A preliminary detector
• Trained with (2478) vs (12180) samples
• Retraining
• With augmented data set
• initial 12180 + hard examples
• Hard examples
• 1218 negative training photos are searched exhaustively for false positive
Training
![Page 103: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/103.jpg)
Feature: histogram of oriented gradients (HOG)
![Page 104: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/104.jpg)
![Page 105: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/105.jpg)
Averaged examples
![Page 106: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/106.jpg)
Algorithm
![Page 107: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/107.jpg)
Dalal, Navneet, and Bill Triggs. "Histograms of oriented gradients for human detection." 2005 IEEE Computer Society Conference on
Computer Vision and Pattern Recognition (CVPR'05). Vol. 1. IEEE, 2005.
![Page 108: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/108.jpg)
Learned model
![Page 109: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/109.jpg)
BACKUPS
![Page 110: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/110.jpg)
http://playground.tensorflow.org/
![Page 111: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/111.jpg)
Visualization MNIST with t-SNE
![Page 112: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/112.jpg)
Why are Deep Architectures hard to train?
• Vanishing gradient problem in back-
propagation.
• Local Optimum (saddle points?) Issue in
Neural Nets
• For Deep Architectures, back-propagation is
apparently getting a local optimum (saddle
points?) that does not generalize well
![Page 113: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/113.jpg)
Empirical Results: Poor performance of Backpropagation on
Deep Neural Nets [Erhan et al., 2009]
• MNIST digit classification task; 400 trials (random seed)
• Each layer: initialize weights with random numbers
• Although 𝐿 + 1 layers is more expressive, worse error than 𝐿layers.
![Page 114: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/114.jpg)
![Page 115: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/115.jpg)
115
An example
![Page 116: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/116.jpg)
Feature extraction
Width feature Lightness feature
116
![Page 117: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function](https://reader036.fdocuments.us/reader036/viewer/2022081323/5f06986d7e708231d418c5ce/html5/thumbnails/117.jpg)
Classification
Simple model Complex model
117