Artificial Neural Networks: Intro
CSC411: Machine Learning and Data Mining, Winter 2017
Michael Guerzhoy
“Making Connections” by Filomena Booth (2013)
Slides from Andrew Ng, Geoffrey Hinton, and Tom Mitchell
Non-Linear Decision Surfaces
[Figure: data plotted on axes x1 and x2]
• There is no linear decision boundary
Car Classification
[Figure: example images labeled “Cars” and “Not a car”]
Testing: What is this?
You see this:
But the camera sees this:
[Figure: Raw image → Learning Algorithm; scatter plot with axes pixel 1 and pixel 2; legend: Cars, “Non”-Cars]
[Figure: Raw image → Learning Algorithm; scatter plot with axes pixel 1 and pixel 2; legend: Cars, “Non”-Cars]
50 x 50 pixel images → 2500 pixels (7500 if RGB)
Features: pixel 1 intensity, pixel 2 intensity, …, pixel 2500 intensity
Quadratic features ($x_i \times x_j$): ≈ 3 million features
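A quick check of the ≈3 million figure. This sketch assumes "quadratic features" means the distinct pairwise products of pixel intensities; the function name and the `include_squares` flag are made up for illustration.

```python
from math import comb


def count_quadratic_features(n_pixels, include_squares=True):
    """Count distinct products x_i * x_j over n_pixels input features."""
    pairs = comb(n_pixels, 2)  # products with i < j
    return pairs + n_pixels if include_squares else pairs


print(count_quadratic_features(50 * 50))         # 3126250
print(count_quadratic_features(50 * 50, False))  # 3123750
```

Either way the count is about 3 million for a 50 x 50 grayscale image, which is why adding hand-built quadratic features does not scale to raw pixels.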
Simple Non-Linear Classification Example
[Figure: two plots, each with axes x1 and x2]
Linear Neuron
[Figure: a linear neuron with inputs $x_1, x_2, x_3$, weights $w_1, w_2, w_3$, and bias $w_0$]
Output: $w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3$
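A minimal sketch of the linear neuron's output in plain Python. The function name and the numeric weights and inputs are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def linear_neuron(w, x):
    """Return w[0] + w[1]*x[0] + ... + w[n]*x[n-1]; w[0] is the bias w0."""
    if len(w) != len(x) + 1:
        raise ValueError("expected one weight per input plus a bias")
    return w[0] + sum(wk * xk for wk, xk in zip(w[1:], x))


# Hypothetical weights and inputs, for illustration only.
w = [0.5, 1.0, -2.0, 3.0]   # w0, w1, w2, w3
x = [1.0, 2.0, 3.0]         # x1, x2, x3
print(linear_neuron(w, x))  # 0.5 + 1.0 - 4.0 + 9.0 = 6.5
```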
Linear Neuron: Cost Function
• Any number of choices. The one made for linear regression is
$\sum_{i=1}^{m} \left( y^{(i)} - w^T x^{(i)} \right)^2$
• Can minimize using gradient descent to obtain the best weights w for the training set
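A sketch of batch gradient descent on the quadratic cost for a linear neuron. The learning rate, step count, and toy data set are assumptions picked so that this small example converges; they are not from the slides.

```python
def predict(w, x):
    """Linear neuron output; w[0] is the bias."""
    return w[0] + sum(wk * xk for wk, xk in zip(w[1:], x))


def quadratic_cost(w, xs, ys):
    """Sum over training examples of (y - w^T x)^2."""
    return sum((y - predict(w, x)) ** 2 for x, y in zip(xs, ys))


def gradient_descent(xs, ys, lr=0.01, steps=5000):
    """Minimize quadratic_cost by repeatedly stepping against its gradient."""
    w = [0.0] * (len(xs[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            err = predict(w, x) - y
            grad[0] += 2 * err           # derivative w.r.t. the bias
            for k, xk in enumerate(x, start=1):
                grad[k] += 2 * err * xk  # derivative w.r.t. w_k
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


# Toy data generated from y = 1 + 2x (hypothetical).
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [1.0, 3.0, 5.0, 7.0]
w = gradient_descent(xs, ys)
print(w, quadratic_cost(w, xs, ys))
```

On this data the learned weights should approach a bias of 1 and a slope of 2. A learning rate that is too large makes the updates diverge, so `lr` usually needs tuning per data set.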
Logistic Neuron
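[Figure: a logistic neuron with inputs $x_1, x_2, x_3$, weights $w_1, w_2, w_3$, and bias $w_0$]
Output: $\sigma(w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3)$, where $\sigma(t) = \frac{1}{1 + \exp(-t)}$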
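A minimal sketch of the logistic neuron: the same weighted sum as the linear neuron, passed through the sigmoid. The weights and inputs are hypothetical.

```python
import math


def sigmoid(t):
    """Logistic function 1 / (1 + exp(-t)).

    Note: math.exp overflows for very negative t (below about -709);
    this simple form is fine for moderate inputs.
    """
    return 1.0 / (1.0 + math.exp(-t))


def logistic_neuron(w, x):
    """Return sigmoid(w[0] + w[1]*x[0] + ...); w[0] is the bias w0."""
    return sigmoid(w[0] + sum(wk * xk for wk, xk in zip(w[1:], x)))


# Hypothetical weights and inputs, for illustration only.
w = [0.5, 1.0, -2.0, 3.0]
x = [1.0, 2.0, 3.0]
print(logistic_neuron(w, x))  # sigmoid(6.5), a value strictly between 0 and 1
```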
Logistic Neuron: Cost Function
• Could use the quadratic cost function again
• Could use the “log-loss” function to make the neuron perform logistic regression
$-\sum_{i=1}^{m} \left[ y^{(i)} \log \frac{1}{1 + \exp(-w^T x^{(i)})} + (1 - y^{(i)}) \log \frac{\exp(-w^T x^{(i)})}{1 + \exp(-w^T x^{(i)})} \right]$
(Note: we derived this cost function by saying we want to maximize the likelihood of the data under a certain model, but there’s nothing stopping us from just making up a loss function)
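A sketch of the log-loss for a logistic neuron over a small training set. The data and both weight vectors are hypothetical; the code uses the fact that $1 - \sigma(t) = \exp(-t) / (1 + \exp(-t))$, so the second log term is written as `log(1 - h)`.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def log_loss(w, xs, ys):
    """Negative log-likelihood of labels ys (0 or 1) under a logistic neuron."""
    total = 0.0
    for x, y in zip(xs, ys):
        h = sigmoid(w[0] + sum(wk * xk for wk, xk in zip(w[1:], x)))
        total += y * math.log(h) + (1 - y) * math.log(1.0 - h)
    return -total


# Hypothetical 1-D data: small x labeled 0, large x labeled 1.
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [0, 0, 1, 1]
print(log_loss([0.0, 0.0], xs, ys))   # every prediction is 0.5
print(log_loss([-3.0, 2.0], xs, ys))  # boundary at x = 1.5, lower loss
```

With all-zero weights each example contributes log 2, so the loss is 4 log 2; weights that separate the two classes give a smaller value.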
Logistic Regression Cost Function: Another Look
• $\mathrm{Cost}(h_w(x), y) = \begin{cases} -\log h_w(x), & y = 1 \\ -\log(1 - h_w(x)), & y = 0 \end{cases}$
• If y = 1, want the cost to be small if $h_w(x)$ is close to 1 and large if $h_w(x)$ is close to 0
  • −log(t) is 0 for t = 1 and infinity for t = 0
• If y = 0, want the cost to be small if $h_w(x)$ is close to 0 and large if $h_w(x)$ is close to 1
• Note: $0 < \sigma(t) < 1$, where $\sigma(t) = \frac{1}{1 + \exp(-t)}$
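The two cases can be folded into one expression, which shows that this per-example cost is the summand of the log-loss. A short derivation, assuming $y \in \{0, 1\}$ and $h_w(x) = \sigma(w^T x)$:

```latex
\begin{align*}
\mathrm{Cost}(h_w(x), y) &= -y \log h_w(x) - (1 - y) \log\bigl(1 - h_w(x)\bigr) \\
1 - \sigma(t) &= 1 - \frac{1}{1 + \exp(-t)} = \frac{\exp(-t)}{1 + \exp(-t)}
\end{align*}
```

Setting y = 1 leaves only the first term and y = 0 leaves only the second. Summing the cost over the m training examples and substituting the second identity gives the log-loss.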
Multilayer Neural Networks
• $h_{i,j} = g(W_{i,j} x) = g\left(\sum_k W_{i,j,k} x_k\right)$
• $x_0 = 1$ always
• $W_{i,j,0}$ is the “bias”
• g is the activation function
  • Could be $g(t) = t$
  • Could be $g(t) = \sigma(t)$
• Nobody uses those anymore…
[Figure: network diagram with input units $x_1, x_2, x_3$, hidden units $h_{i,j}$, and output units $o_1, o_2$]
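A sketch of the forward pass through a network like the one in the diagram (3 inputs, hidden units, 2 outputs). It assumes each layer's weights are stored as a list of rows `W[j][k]` with `k = 0` holding the bias, and that every layer takes the previous layer's outputs as its `x`; the layer sizes and weight values are hypothetical.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def layer(W, x, g):
    """Compute g(sum_k W[j][k] * x_k) for each unit j, with x_0 = 1 prepended."""
    x_aug = [1.0] + list(x)
    return [g(sum(wjk * xk for wjk, xk in zip(row, x_aug))) for row in W]


def forward(layers, x, g=sigmoid):
    """Feed x through each weight matrix in turn and return the last outputs."""
    h = x
    for W in layers:
        h = layer(W, h, g)
    return h


# Hypothetical weights: 3 inputs -> 2 hidden units -> 2 output units.
W1 = [[1.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 1.0]]
W2 = [[0.0, 1.0, 1.0],
      [1.0, 1.0, -1.0]]
print(forward([W1, W2], [1.0, 2.0, 3.0]))                   # sigmoid units
print(forward([W1, W2], [1.0, 2.0, 3.0], g=lambda t: t))    # linear units
```

Passing the activation `g` as a parameter makes it easy to swap in either of the two choices listed on the slide.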
Multilayer Neural Network: Speech Recognition Example
How to compute AND?
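One possible answer, sketched with a single logistic neuron. The weights (-30, 20, 20) are an assumed, commonly used choice; the slide's own answer is not in this transcript, and many other weight settings work.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def and_neuron(x1, x2):
    """Logistic neuron with assumed weights w0 = -30, w1 = 20, w2 = 20."""
    return sigmoid(-30.0 + 20.0 * x1 + 20.0 * x2)


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(and_neuron(x1, x2)))
```

The weighted sum is positive only when both inputs are 1, so the output is close to 1 for (1, 1) and close to 0 otherwise.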
How to compute XOR?
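XOR is not linearly separable, so a single neuron cannot compute it, but one hidden layer is enough. A sketch that builds XOR as (x1 OR x2) AND NOT (x1 AND x2); all weights are assumed values for illustration, since the slide's own answer is not in this transcript.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def xor_network(x1, x2):
    """Two hidden logistic units (OR, NAND) feeding an AND output unit."""
    h_or = sigmoid(-10.0 + 20.0 * x1 + 20.0 * x2)    # about 1 unless both inputs are 0
    h_nand = sigmoid(30.0 - 20.0 * x1 - 20.0 * x2)   # about 1 unless both inputs are 1
    return sigmoid(-30.0 + 20.0 * h_or + 20.0 * h_nand)


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(xor_network(x1, x2)))
```

The hidden units carve the input space with two linear boundaries, and the output unit keeps only the region between them.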