Neural Networks (slides from Domingos, Pardo, others)
![Page 1: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/1.jpg)
Machine Learning
Neural Networks
(slides from Domingos, Pardo, others)
![Page 2: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/2.jpg)
Human Brain
![Page 3: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/3.jpg)
Neurons
![Page 4: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/4.jpg)
Input-Output Transformation
(Diagram labels: input spikes, Excitatory Post-Synaptic Potential, output spike. A spike is a brief pulse.)
![Page 5: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/5.jpg)
Human Learning
• Number of neurons: ~ $10^{11}$
• Connections per neuron: ~ $10^{3}$ to $10^{5}$
• Neuron switching time: ~ 0.001 second
• Scene recognition time: ~ 0.1 second
That leaves only about 100 sequential inference steps (0.1 s / 0.001 s), which doesn’t seem like much
![Page 6: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/6.jpg)
Machine Learning Abstraction
![Page 7: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/7.jpg)
Artificial Neural Networks
• Typically, machine learning ANNs are very artificial, ignoring:
– Time
– Space
– Biological learning processes
• More realistic neural models exist
– Hodgkin & Huxley (1952) won a Nobel Prize for theirs (in 1963)
• Nonetheless, very artificial ANNs have been useful in many ML applications
![Page 8: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/8.jpg)
Perceptrons
• The “first wave” in neural networks
• Big in the 1960’s
– McCulloch & Pitts (1943), Widrow & Hoff (1960), Rosenblatt (1962)
![Page 9: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/9.jpg)
Perceptrons
• Problem def:
– Let f be a target function from X = <x1, x2, …>, where xi ∈ {0, 1}, to y ∈ {0, 1}
– Given training data {(X1, y1), (X2, y2)…}
• Learn h (X ), an approximation of f (X )
![Page 10: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/10.jpg)
A single perceptron
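$$output = \begin{cases} 1 & \text{if } \sum_{i=0}^{n} w_i x_i > 0 \\ 0 & \text{else} \end{cases}$$
(Diagram: inputs x1 … x5 with weights w1 … w5, plus a bias input x0 with weight w0.)
Bias (x0 = 1, always)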
![Page 11: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/11.jpg)
Logical Operators
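Each operator is a single threshold unit, output = 1 if $\sum_{i=0}^{n} w_i x_i > 0$, else 0, with x0 = 1:
• AND: w0 = -0.8 (on x0), w1 = 0.5 (on x1), w2 = 0.5 (on x2)
• OR: w0 = -0.3 (on x0), w1 = 0.5 (on x1), w2 = 0.5 (on x2)
• NOT: w0 = 0.1 (on x0), w1 = -1.0 (on x1)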
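The weights on this slide can be checked directly. The following is a minimal sketch, assuming the threshold rule "output 1 if the weighted sum is greater than 0" and a bias input x0 = 1; the function and variable names are illustrative and not from the slides.

```python
def perceptron(weights, inputs):
    """Threshold unit: 1 if sum_i w_i * x_i > 0, else 0 (x0 = 1 is the bias input)."""
    x = [1] + list(inputs)  # prepend the bias input x0 = 1
    net = sum(w * xi for w, xi in zip(weights, x))
    return 1 if net > 0 else 0

# Weights as listed on the slide: (w0, w1, w2) or (w0, w1)
AND_W = [-0.8, 0.5, 0.5]
OR_W = [-0.3, 0.5, 0.5]
NOT_W = [0.1, -1.0]

and_table = {(a, b): perceptron(AND_W, (a, b)) for a in (0, 1) for b in (0, 1)}
or_table = {(a, b): perceptron(OR_W, (a, b)) for a in (0, 1) for b in (0, 1)}
not_table = {a: perceptron(NOT_W, (a,)) for a in (0, 1)}
```

Each table maps an input combination to the unit's output, so it can be compared line by line with the truth table of the corresponding operator.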
![Page 12: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/12.jpg)
Learning Weights
• Perceptron Training Rule
• Gradient Descent
• (other approaches: Genetic Algorithms)
(Diagram: a threshold unit, output = 1 if $\sum_{i=0}^{n} w_i x_i > 0$, else 0, with unknown weights "?" on inputs x0, x1, x2.)
![Page 13: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/13.jpg)
Perceptron Training Rule
• Weights modified for each training example
• Update Rule:
$$w_i \leftarrow w_i + \Delta w_i$$
where
$$\Delta w_i = \eta\,(t - o)\,x_i$$
– $\eta$: learning rate
– $t$: target value
– $o$: perceptron output
– $x_i$: input value
![Page 14: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/14.jpg)
Perceptron Training for NOT
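Bryan Pardo, Machine Learning: EECS 349 Fall 2009
$$w_i \leftarrow w_i + \Delta w_i, \qquad \Delta w_i = \eta\,(t - o)\,x_i$$
(Diagram: a threshold unit for NOT, output = 1 if $\sum_{i=0}^{n} w_i x_i > 0$, else 0, with inputs x0 and x1.)
(Worked table with columns Start, Work, End and rows w0, w1; the cell values are not present in this transcript.)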
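The training rule can be sketched as a short loop that learns NOT. This is an illustrative sketch only: the learning rate of 0.1, the zero starting weights and the epoch limit are assumptions, since the slide's worked values did not survive in this transcript.

```python
def predict(w, x):
    """Threshold unit: 1 if sum_i w_i * x_i > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

def train_perceptron(examples, eta=0.1, max_epochs=100):
    """Perceptron training rule: w_i <- w_i + eta * (t - o) * x_i after every example."""
    w = [0.0] * len(examples[0][0])  # assumed starting point: all weights zero
    for _ in range(max_epochs):
        errors = 0
        for x, t in examples:
            o = predict(w, x)
            if o != t:
                errors += 1
            # When o == t the update is zero, so correct examples leave w unchanged.
            w = [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]
        if errors == 0:  # a full pass with no mistakes: converged
            break
    return w

# NOT: each input is (x0, x1) with the bias input x0 = 1.
NOT_EXAMPLES = [((1, 0), 1), ((1, 1), 0)]
w_not = train_perceptron(NOT_EXAMPLES)
```

The learned weights need not match the hand-picked ones shown earlier (0.1 and -1.0); any weights that put the two examples on the correct side of the threshold are a valid solution.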
![Page 15: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/15.jpg)
What weights make XOR?
• No combination of weights works
• Perceptrons can only represent linearly separable functions
(Diagram: a threshold unit, output = 1 if $\sum_{i=0}^{n} w_i x_i > 0$, else 0, with unknown weights "?" on inputs x0, x1, x2.)
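A short argument shows why no weights work, assuming the threshold rule used throughout these slides (output 1 if $w_0 + w_1 x_1 + w_2 x_2 > 0$, else 0). The four XOR cases give four constraints that cannot hold together:

```latex
\begin{align*}
(x_1, x_2) = (0,0) \mapsto 0 &: \quad w_0 \le 0 \\
(x_1, x_2) = (1,0) \mapsto 1 &: \quad w_0 + w_1 > 0 \\
(x_1, x_2) = (0,1) \mapsto 1 &: \quad w_0 + w_2 > 0 \\
(x_1, x_2) = (1,1) \mapsto 0 &: \quad w_0 + w_1 + w_2 \le 0
\end{align*}
% Adding the two middle inequalities:
%   2 w_0 + w_1 + w_2 > 0  \implies  w_0 + w_1 + w_2 > -w_0 \ge 0,
% which contradicts the last line.
```

Adding the second and third constraints gives $w_0 + w_1 + w_2 > -w_0 \ge 0$, which contradicts the fourth.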
![Page 16: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/16.jpg)
Linear Separability
(Plot: OR in the x1, x2 plane)
![Page 17: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/17.jpg)
Linear Separability
(Plot: AND in the x1, x2 plane)
![Page 18: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/18.jpg)
Linear Separability
(Plot: XOR in the x1, x2 plane)
![Page 19: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/19.jpg)
Perceptron Training Rule
• Converges to the correct classification IF
– Cases are linearly separable
– Learning rate is small enough
– Proved by Minsky and Papert in 1969
Their book, which also stressed limitations such as XOR, killed widespread interest in perceptrons until the 1980s
![Page 20: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/20.jpg)
XOR
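(Diagram: three threshold units, each with output = 1 if $\sum_{i=0}^{n} w_i x_i > 0$, else 0, wired in two layers to compute XOR. Each unit has weight 0 on its bias input x0. The weights from inputs x1 and x2 into the first layer are 0.6 and -0.6; the two weights into the output unit are 1 and 1.)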
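One wiring consistent with the weights on this slide can be written out and tested. The exact assignment of 0.6 and -0.6 to each first-layer unit is an assumption, because the diagram's layout did not survive; here one unit computes "x1 and not x2" and the other "x2 and not x1".

```python
def unit(w, x):
    """Threshold unit: 1 if sum_i w_i * x_i > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

def xor_net(x1, x2):
    x = (1, x1, x2)                    # x0 = 1 is the bias input
    h1 = unit((0, 0.6, -0.6), x)       # assumed wiring: fires only for (1, 0)
    h2 = unit((0, -0.6, 0.6), x)       # assumed wiring: fires only for (0, 1)
    return unit((0, 1, 1), (1, h1, h2))  # output fires if either hidden unit fires

xor_table = {(a, b): xor_net(a, b) for a in (0, 1) for b in (0, 1)}
```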
![Page 21: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/21.jpg)
What’s wrong with perceptrons?
• You can always plug multiple perceptrons together to calculate any Boolean function.
• BUT…who decides what the weights are?
– Assignment of error to parental inputs becomes a problem….
![Page 22: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/22.jpg)
Perceptrons use a step function
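Perceptron threshold: a step function
$$output = \begin{cases} 1 & \text{if } \sum_{i=0}^{n} w_i x_i > 0 \\ 0 & \text{else} \end{cases}$$
(Diagram: unknown weights "?" on inputs x0, x1, x2.)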
• Small changes in inputs -> either no change or large change in output.
![Page 23: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/23.jpg)
Solution: Differentiable Function
Simple linear function:
$$output = \sum_{i=0}^{n} w_i x_i$$
(Diagram: unknown weights "?" on inputs x0, x1, x2.)
• Varying any input a little creates a perceptible change in the output
• We can now characterize how the error changes with each weight $w_i$, even in the multi-layer case
![Page 24: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/24.jpg)
Measuring error for linear units
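• Output Function:
$$o(\vec{x}) = \vec{w} \cdot \vec{x}$$
• Error Measure:
$$E(\vec{w}) = \frac{1}{2} \sum_{d \in D} (t_d - o_d)^2$$
where $D$ is the data, $t_d$ is the target value and $o_d$ is the linear unit output for example $d$.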
![Page 25: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/25.jpg)
Gradient Descent
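Gradient: $\nabla E(\vec{w}) = \left[ \frac{\partial E}{\partial w_0}, \frac{\partial E}{\partial w_1}, \ldots, \frac{\partial E}{\partial w_n} \right]$
Training rule: $\Delta \vec{w} = -\eta\, \nabla E(\vec{w})$, i.e. $\Delta w_i = -\eta\, \frac{\partial E}{\partial w_i}$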
![Page 26: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/26.jpg)
Gradient Descent Rule
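$$\frac{\partial E}{\partial w_i} = \frac{\partial}{\partial w_i}\, \frac{1}{2} \sum_{d \in D} (t_d - o_d)^2 = \sum_{d \in D} (t_d - o_d)(-x_{i,d})$$
Update Rule:
$$w_i \leftarrow w_i + \eta \sum_{d \in D} (t_d - o_d)\, x_{i,d}$$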
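The update rule for a linear unit can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the toy data (points on the line t = 2·x1 + 1), the learning rate and the step count are all assumptions chosen so that the loop converges.

```python
def linear_output(w, x):
    """Linear unit: o = sum_i w_i * x_i."""
    return sum(wi * xi for wi, xi in zip(w, x))

def batch_gradient_descent(data, eta=0.05, steps=2000):
    """Repeat w_i <- w_i + eta * sum_d (t_d - o_d) * x_{i,d} over the whole data set."""
    w = [0.0] * len(data[0][0])
    for _ in range(steps):
        delta = [0.0] * len(w)
        for x, t in data:
            o = linear_output(w, x)
            for i, xi in enumerate(x):
                delta[i] += (t - o) * xi  # accumulate (t_d - o_d) * x_{i,d}
        w = [wi + eta * di for wi, di in zip(w, delta)]
    return w

# Toy data: x = (x0, x1) with bias input x0 = 1 and target t = 2 * x1 + 1.
DATA = [((1.0, x1), 2.0 * x1 + 1.0) for x1 in (0.0, 1.0, 2.0, 3.0)]
w_fit = batch_gradient_descent(DATA)
```

With these settings `w_fit` approaches (1, 2), the intercept and slope of the generating line. A learning rate that is too large for the data makes the same loop diverge.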
![Page 27: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/27.jpg)
Gradient Descent for Multiple Layers
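(Diagram: a two-layer network for XOR built from three linear units, each computing $\sum_{i=0}^{n} w_i x_i$, with inputs x0, x1, x2, a bias input x0 on every unit, and weights $w_{ij}$.)
We can compute:
$$\frac{\partial E}{\partial w_{ij}}$$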
![Page 28: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/28.jpg)
Gradient Descent vs. Perceptrons
• Perceptron Rule & Threshold Units
– Learner converges on an answer ONLY IF data is linearly separable
– Can’t assign proper error to parent nodes
• Gradient Descent
– (locally) Minimizes error even if examples are not linearly separable
– Works for multi-layer networks
• But…linear units only make linear decision surfaces (can’t learn XOR even with many layers)
– And the step function isn’t differentiable…
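The reason stacked linear units gain nothing can be made explicit. As a sketch, write the two layers as weight matrices $W_1$ and $W_2$ (notation introduced here, not used on the slides):

```latex
\begin{align*}
\vec{h} &= W_1 \vec{x} \\
\vec{o} &= W_2 \vec{h} = W_2 (W_1 \vec{x}) = (W_2 W_1)\, \vec{x} = W \vec{x},
\qquad W = W_2 W_1
\end{align*}
```

The two-layer network is therefore equivalent to a single linear layer with weights $W$, so its decision surface is still linear however many layers are stacked.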
![Page 29: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/29.jpg)
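A compromise function
• Perceptron
$$output = \begin{cases} 1 & \text{if } \sum_{i=0}^{n} w_i x_i > 0 \\ 0 & \text{else} \end{cases}$$
• Linear
$$output = net = \sum_{i=0}^{n} w_i x_i$$
• Sigmoid (Logistic)
$$output = \sigma(net) = \frac{1}{1 + e^{-net}}$$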
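The three output functions differ only in what they do with the same weighted sum. A small sketch, with function names chosen here for illustration:

```python
import math

def net_input(w, x):
    """Weighted sum net = sum_i w_i * x_i."""
    return sum(wi * xi for wi, xi in zip(w, x))

def perceptron_output(net):
    """Step function: jumps from 0 to 1 as net becomes positive."""
    return 1 if net > 0 else 0

def linear_output(net):
    """Identity: differentiable, but only linear."""
    return net

def sigmoid_output(net):
    """Logistic function: smooth, differentiable, bounded in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))
```

Far from zero the sigmoid behaves like the step function, while near zero it is close to linear, which is why it serves as a compromise between the two.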
![Page 30: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/30.jpg)
The sigmoid (logistic) unit
• Has differentiable function
– Allows gradient descent
• Can be used to learn non-linear functions
$$output = \frac{1}{1 + e^{-\sum_{i=0}^{n} w_i x_i}}$$
(Diagram: unknown weights "?" on inputs x1, x2.)
![Page 31: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/31.jpg)
Logistic function
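(Diagram: a single logistic unit.)
Inputs (independent variables): Age = 34, Gender = 1, Stage = 4
Coefficients: .5, .8, .4
Unit: Σ followed by $\frac{1}{1 + e^{-\sum_{i=0}^{n} w_i x_i}}$
Output (prediction): 0.6, the “Probability of being Alive”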
![Page 32: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/32.jpg)
Neural Network Model
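(Diagram: a feed-forward network with a hidden layer of two Σ units and one Σ output unit.)
Inputs (independent variables): Age = 34, Gender = 2, Stage = 4
Weights shown (input-to-hidden and hidden-to-output): .6, .5, .8, .2, .1, .3, .7, .2, .4, .2
Output (dependent variable, prediction): 0.6, the “Probability of being Alive”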
![Page 33: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/33.jpg)
Getting an answer from a NN
(Diagram: the same network with only part of it shown.)
Inputs (independent variables): Age = 34, Gender = 2, Stage = 4
Weights shown: .6, .5, .8, .1, .7
Output (dependent variable, prediction): 0.6, the “Probability of being Alive”
![Page 34: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/34.jpg)
Getting an answer from a NN
(Diagram: the same network with only part of it shown.)
Inputs (independent variables): Age = 34, Gender = 2, Stage = 4
Weights shown: .5, .8, .2, .3, .2
Output (dependent variable, prediction): 0.6, the “Probability of being Alive”
![Page 35: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/35.jpg)
Getting an answer from a NN
(Diagram: the full network.)
Inputs (independent variables): Age = 34, Gender = 1, Stage = 4
Weights shown: .6, .5, .8, .2, .1, .3, .7, .2
Output (dependent variable, prediction): 0.6, the “Probability of being Alive”
![Page 36: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/36.jpg)
Minimizing the Error
(Plot: error surface over the weights. Training moves from $w_{initial}$ (initial error) to $w_{trained}$ (final error) at a local minimum; a negative derivative gives a positive weight change.)
![Page 37: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/37.jpg)
Differentiability is key!
• Sigmoid is easy to differentiate
• For gradient descent on multiple layers, a little dynamic programming can help:
– Compute errors at each output node
– Use these to compute errors at each hidden node
– Use these to compute weight gradient
$$\frac{\partial \sigma(y)}{\partial y} = \sigma(y)\,(1 - \sigma(y))$$
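For readers who want the intermediate step, the derivative follows directly from the definition $\sigma(y) = 1 / (1 + e^{-y})$:

```latex
\begin{align*}
\frac{\partial \sigma(y)}{\partial y}
  &= \frac{e^{-y}}{(1 + e^{-y})^2}
   = \frac{1}{1 + e^{-y}} \cdot \frac{e^{-y}}{1 + e^{-y}} \\
  &= \sigma(y)\,(1 - \sigma(y)),
  \qquad \text{since } 1 - \sigma(y) = \frac{e^{-y}}{1 + e^{-y}} .
\end{align*}
```

The derivative is thus available from the unit's output alone, with no extra exponentials to evaluate.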
![Page 38: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/38.jpg)
The Backpropagation Algorithm
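For each training example $\langle \vec{x}, \vec{t} \rangle$:
1. Input instance $\vec{x}$ to the network and compute the output $o_u$ for every unit $u$ in the network
2. For each output unit $k$, calculate its error term $\delta_k$:
$$\delta_k \leftarrow o_k (1 - o_k)(t_k - o_k)$$
3. For each hidden unit $h$, calculate its error term $\delta_h$:
$$\delta_h \leftarrow o_h (1 - o_h) \sum_{k \in outputs} w_{kh}\, \delta_k$$
4. Update each network weight $w_{ji}$:
$$w_{ji} \leftarrow w_{ji} + \eta\, \delta_j\, x_{ji}$$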
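The four steps can be sketched for a tiny network with two inputs, two sigmoid hidden units and one sigmoid output. Everything concrete here is an assumption made for illustration: the network size, the starting weights, the training example and the learning rate of 0.5.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w_hidden, w_out, x):
    """Step 1: compute the output of every unit (index 0 of each vector is the bias input 1)."""
    xb = [1.0] + list(x)
    hb = [1.0] + [sigmoid(sum(w * xi for w, xi in zip(wh, xb))) for wh in w_hidden]
    o = sigmoid(sum(w * hi for w, hi in zip(w_out, hb)))
    return xb, hb, o

def error(w_hidden, w_out, x, t):
    """Squared error E = 1/2 * (t - o)^2 for one example."""
    return 0.5 * (t - forward(w_hidden, w_out, x)[2]) ** 2

def backprop_step(w_hidden, w_out, x, t, eta=0.5):
    xb, hb, o = forward(w_hidden, w_out, x)
    # Step 2: error term of the single output unit.
    delta_k = o * (1 - o) * (t - o)
    # Step 3: error term of each hidden unit h (its output is hb[h + 1],
    # and w_out[h + 1] is its weight into the output unit).
    delta_h = [hb[h + 1] * (1 - hb[h + 1]) * w_out[h + 1] * delta_k
               for h in range(len(w_hidden))]
    # Step 4: w_ji <- w_ji + eta * delta_j * x_ji, returned as new lists.
    new_out = [w + eta * delta_k * hi for w, hi in zip(w_out, hb)]
    new_hidden = [[w + eta * dh * xi for w, xi in zip(wh, xb)]
                  for wh, dh in zip(w_hidden, delta_h)]
    return new_hidden, new_out

# Hypothetical starting weights and a single training example.
W_HIDDEN = [[0.1, 0.4, -0.3], [-0.2, 0.3, 0.5]]
W_OUT = [0.05, 0.6, -0.4]
X, T = (1.0, 0.0), 1.0
new_hidden, new_out = backprop_step(W_HIDDEN, W_OUT, X, T)
```

A useful sanity check for any hand-written version is to compare the implied weight change with a finite-difference estimate of the error gradient; in practice the slides' later advice to use a package still applies.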
![Page 39: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/39.jpg)
Learning Weights
(Diagram: the full network.)
Inputs (independent variables): Age = 34, Gender = 1, Stage = 4
Weights shown: .6, .5, .8, .2, .1, .3, .7, .2
Output (dependent variable, prediction): 0.6, the “Probability of being Alive”
![Page 40: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/40.jpg)
The fine print
• Don’t implement back-propagation
– Use a package
– Second-order or variable step-size optimization techniques exist
• Feature normalization
– Typical to normalize inputs to lie in [0,1]
• (and outputs must be normalized)
• Problems with NN training:
– Slow training times (though, getting better)
– Local minima
![Page 41: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/41.jpg)
Minimizing the Error
(Plot: error surface over the weights. Training moves from $w_{initial}$ (initial error) to $w_{trained}$ (final error) at a local minimum; a negative derivative gives a positive weight change.)
![Page 42: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/42.jpg)
Expressive Power of ANNs
• Universal Function Approximator:
– Given enough hidden units, can approximate any continuous function f
• Need 2+ hidden units to learn XOR
• Why not use millions of hidden units?
– Efficiency (training is slow)
– Overfitting
![Page 43: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/43.jpg)
Overfitting
(Figure panels: Real Distribution; Overfitted Model)
![Page 44: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/44.jpg)
Combating Overfitting in Neural Nets
• Many techniques
• Two popular ones:
– Early Stopping
• Use “a lot” of hidden units
• Just don’t over-train
– Cross-validation
• Test different architectures to choose “right” number of hidden units
![Page 45: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/45.jpg)
Early Stopping
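(Plot: error versus epochs, with one curve per data set.)
a = validation set (curve $error_a$)
b = training set (curve $error_b$)
Stopping criterion: min(error); training past this point gives an overfitted model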
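The stopping criterion can be sketched as a scan over per-epoch validation errors. The `patience` parameter is an assumption added here (the slide only marks the minimum of the error curve); it decides how many non-improving epochs to tolerate before giving up.

```python
def early_stopping_epoch(validation_errors, patience=3):
    """Return the epoch with the lowest validation error seen before the scan stops.

    The scan stops once the validation error has failed to improve for
    `patience` consecutive epochs (a hypothetical parameter, not from the slides).
    """
    best_epoch, best_error = 0, float("inf")
    for epoch, err in enumerate(validation_errors):
        if err < best_error:
            best_epoch, best_error = epoch, err
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch

# Validation error falls, then rises as the model starts to overfit.
stop_at = early_stopping_epoch([0.9, 0.6, 0.4, 0.35, 0.37, 0.40, 0.45, 0.30])
```

In a real training loop the weights from the best epoch would be saved and restored. Note that a finite patience can miss a later, lower minimum, as the final value in this toy sequence illustrates.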
![Page 46: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/46.jpg)
Cross-validation
• Cross-validation: general-purpose technique for model selection
– E.g., “how many hidden units should I use?”
• More extensive version of validation-set approach.
![Page 47: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/47.jpg)
Cross-validation
• Break training set into k sets
• For each model M
– For i=1…k
•Train M on all but set i
•Test on set i
• Output M with highest average test score, trained on full training set
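The procedure above can be sketched generically. The names `train_fn` and `score_fn` are placeholders assumed here for "train model M on these examples" and "score a trained model on these examples"; the toy threshold models at the end exist only to exercise the loop.

```python
def k_fold_splits(examples, k):
    """Break the training set into k sets and yield (train, test) pairs."""
    folds = [examples[i::k] for i in range(k)]
    for i in range(k):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        yield train, folds[i]

def cross_validate(models, examples, k, train_fn, score_fn):
    """Return (best model retrained on all examples, best model specification)."""
    best_model, best_score = None, float("-inf")
    for model in models:
        scores = [score_fn(train_fn(model, train), test)
                  for train, test in k_fold_splits(examples, k)]
        average = sum(scores) / k
        if average > best_score:
            best_model, best_score = model, average
    return train_fn(best_model, examples), best_model

# Toy usage: each "model" is a fixed threshold, so training is a no-op.
EXAMPLES = [(x, int(x > 5)) for x in range(10)]

def toy_train(threshold, data):
    return threshold

def toy_accuracy(threshold, data):
    return sum(int(x > threshold) == y for x, y in data) / len(data)

trained, chosen = cross_validate([2, 5, 8], EXAMPLES, 5, toy_train, toy_accuracy)
```

For neural networks the candidate models would be architectures, for example different numbers of hidden units, and `train_fn` would run the full training procedure on each fold.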
![Page 48: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/48.jpg)
Summary of Neural Networks
When are Neural Networks useful?
– Instances represented by attribute-value pairs
• Particularly when attributes are real valued
– The target function is
• Discrete-valued
• Real-valued
• Vector-valued
– Training examples may contain errors
– Fast evaluation times are necessary
When not?
– Fast training times are necessary
– Understandability of the function is required
![Page 49: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/49.jpg)
Summary of Neural Networks
Non-linear regression technique that is trained with gradient descent.
Question: How important is the biological metaphor?
![Page 50: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/50.jpg)
Advanced Topics in Neural Nets
• Batch Mode vs. incremental
• Auto-Encoders
• Deep Belief Nets (briefly)
• Neural Networks on Silicon
• Neural Network language models
![Page 51: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/51.jpg)
Incremental vs. Batch Mode
![Page 52: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/52.jpg)
Incremental vs. Batch Mode
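• In Batch Mode we minimize:
$$E_D(\vec{w}) = \frac{1}{2} \sum_{d \in D} (t_d - o_d)^2$$
• Same as computing:
$$\Delta \vec{w}_D = \sum_{d \in D} \Delta \vec{w}_d$$
• Then setting
$$\vec{w} \leftarrow \vec{w} + \Delta \vec{w}_D$$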
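The difference between the two modes can be sketched for a linear unit: batch mode sums the per-example changes computed at the same weights and applies them once, while incremental mode applies each change immediately. The data and learning rate below are illustrative assumptions.

```python
def delta_w(w, x, t, eta):
    """Per-example change for a linear unit: eta * (t - o) * x_i."""
    o = sum(wi * xi for wi, xi in zip(w, x))
    return [eta * (t - o) * xi for xi in x]

def batch_update(w, data, eta):
    """Batch mode: sum every example's change at the same w, then apply once."""
    total = [0.0] * len(w)
    for x, t in data:
        total = [a + b for a, b in zip(total, delta_w(w, x, t, eta))]
    return [wi + di for wi, di in zip(w, total)]

def incremental_update(w, data, eta):
    """Incremental mode: apply each example's change before seeing the next."""
    for x, t in data:
        w = [wi + di for wi, di in zip(w, delta_w(w, x, t, eta))]
    return w

DATA = [((1.0, 0.0), 1.0), ((1.0, 1.0), 3.0)]  # (x, t) with bias input x0 = 1
w_batch = batch_update([0.0, 0.0], DATA, eta=0.1)
w_incremental = incremental_update([0.0, 0.0], DATA, eta=0.1)
```

After one pass the two results differ slightly, because in incremental mode the second example is evaluated with weights that the first example has already changed.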
![Page 53: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/53.jpg)
Advanced Topics in Neural Nets
• Batch Mode vs. incremental
• Auto-Encoders
• Deep Belief Nets (briefly)
• Neural Networks on Silicon
• Neural Network language models
![Page 54: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/54.jpg)
Hidden Layer Representations
• Input->Hidden Layer mapping:
– representation of input vectors tailored to the task
• Can also be exploited for dimensionality reduction
– Form of unsupervised learning in which we output a “more compact” representation of input vectors
– <x1, …,xn> -> <x’1, …,x’m> where m < n
– Useful for visualization, problem simplification, data compression, etc.
![Page 55: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/55.jpg)
Dimensionality Reduction
Model: Function to learn:
![Page 56: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/56.jpg)
Dimensionality Reduction: Example
![Page 57: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/57.jpg)
Dimensionality Reduction: Example
![Page 58: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/58.jpg)
Dimensionality Reduction: Example
![Page 59: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/59.jpg)
Dimensionality Reduction: Example
![Page 60: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/60.jpg)
Advanced Topics in Neural Nets
• Batch Mode vs. incremental
• Auto-encoders
• Deep Belief Nets (briefly)
• Neural Networks on Silicon
• Neural Network language models
![Page 61: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/61.jpg)
Restricted Boltzmann Machine
![Page 62: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/62.jpg)
![Page 63: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/63.jpg)
Auto-encoders vs. RBMs?
• Similar
• Auto-encoder (AE) goal is to reconstruct input in two steps, input->hidden->output
• RBM defines a probability distribution P(x) over inputs
– Goal is to assign high likelihood to the observed training examples
– Determining likelihood of a given x actually requires summing over all possible settings of hidden nodes, rather than just computing a single activation as in AE
– Take EECS 395/495 Probabilistic Graphical Models to learn more
![Page 64: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/64.jpg)
Deep Belief Nets
![Page 65: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/65.jpg)
Advanced Topics in Neural Nets
• Batch Mode vs. incremental
• Auto-Encoders
• Deep Belief Nets (briefly)
• Neural Networks on Silicon
• Neural Network language models
![Page 66: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/66.jpg)
Neural Networks on Silicon
• Currently:
Simulation of continuous device physics (neural networks)
Digital computational model (thresholding)
Continuous device physics (voltage)
Why not skip this?
![Page 67: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/67.jpg)
Example: Silicon Retina
Simulates function of biological retina
Single-transistor synapses adapt to luminance, temporal contrast
Modeling retina directly on chip => requires 100x less power!
![Page 68: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/68.jpg)
Example: Silicon Retina
• Synapses modeled with single transistors
![Page 69: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/69.jpg)
Luminance Adaptation
![Page 70: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/70.jpg)
Comparison with Mammal Data
• Real:
• Artificial:
![Page 71: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/71.jpg)
• Graphics and results taken from:
![Page 72: Neural Networks - pdfs.semanticscholar.org€¦ · Neural Networks (slides from Domingos, Pardo, others) Human Brain. Neurons. Input-Output Transformation Input Spikes Output Spike](https://reader033.fdocuments.us/reader033/viewer/2022052719/5f07790f7e708231d41d27ab/html5/thumbnails/72.jpg)
General NN learning in silicon?
• People seem more excited about / satisfied with GPUs
Advanced Topics in Neural Nets
• Batch mode vs. incremental
• Hidden Layer Representations
• Hopfield Nets
• Neural Networks on Silicon
• Neural Network language models
Neural Network Language Models
• Statistical Language Modeling:
– Predict probability of next word in sequence
I was headed to Madrid , ____
P(___ = “Spain”) = 0.5,
P(___ = “but”) = 0.2, etc.
• Used in speech recognition, machine translation, (recently) information extraction
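As a rough illustration of what "predict probability of next word" means, the sketch below estimates next-word probabilities by simple counting. This is a count-based bigram model, not the neural model these slides go on to discuss; the toy corpus and the helper names `train_bigram` and `next_word_probs` are invented for the example.

```python
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Maximum-likelihood estimate of P(next | prev); empty if prev was never seen."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return {w: c / total for w, c in following.items()} if total else {}

# Toy corpus, invented purely for illustration.
corpus = [
    "I was headed to Madrid , Spain",
    "I was headed to Madrid , but I stayed",
    "I was headed to Madrid , Spain",
    "I was headed to Madrid , Spain",
]
model = train_bigram(corpus)
probs = next_word_probs(model, ",")
print(probs)
```

Counting like this assigns zero probability to any continuation it has not seen, which is one motivation for smoothing and for the learned models discussed next.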
Formally
• Estimate: $P(w_j \mid w_{j-n+1}, \ldots, w_{j-2}, w_{j-1}) = P(w_j \mid h_j)$
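One way a neural network can produce the estimate of P(w_j | h_j) is sketched below: look up a vector for each history word, concatenate them, pass them through a hidden layer, and apply a softmax over the vocabulary. This is a minimal forward pass only, under assumptions not stated on the slides: a tanh hidden layer, no bias terms, random untrained weights, and toy sizes. All names here are hypothetical.

```python
import math
import random

def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def init(rows, cols, rng):
    """Small random weights; a real model would learn these by backpropagation."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def next_word_distribution(history_ids, embed, w_hidden, w_out):
    """Estimate P(w_j | h_j) for every vocabulary word given history word ids."""
    # Look up and concatenate the vectors of the history words.
    x = [value for i in history_ids for value in embed[i]]
    hidden = [math.tanh(a) for a in matvec(w_hidden, x)]
    return softmax(matvec(w_out, hidden))

# Hypothetical toy sizes, chosen only to keep the example small.
vocab = ["I", "was", "headed", "to", "Madrid", ",", "Spain", "but"]
dim, n_hidden, context = 4, 5, 2
rng = random.Random(0)
embed = init(len(vocab), dim, rng)             # one vector per word
w_hidden = init(n_hidden, context * dim, rng)  # concatenated input -> hidden
w_out = init(len(vocab), n_hidden, rng)        # hidden -> one score per word

history = [vocab.index("Madrid"), vocab.index(",")]
probs = next_word_distribution(history, embed, w_hidden, w_out)
```

Because the weights are untrained, the resulting distribution is close to uniform; training would adjust `embed`, `w_hidden`, and `w_out` together.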
Optimizations
• Key idea – learn simultaneously:
– vector representations of each word (here 120 dim)
– predictor of next word, based on previous vectors
• Short-lists
– Much complexity in hidden->output layer
• Number of possible next words is large
– Only predict a subset of words
• Use a standard probabilistic model for the rest
– Can also bin words into fixed classes
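The short-list idea can be sketched as follows. The network predicts only the short-list words, and its output is rescaled by the probability mass the standard model assigns to the short-list, so the combined distribution still sums to 1. The slide only says a standard probabilistic model handles the rest; the exact rescaling shown here is an assumption, and the function name and numbers are made up.

```python
def shortlist_prob(word, nn_probs, backoff_probs, shortlist):
    """Combine a short-list neural model with a standard back-off model.

    nn_probs: network distribution over short-list words only (sums to 1).
    backoff_probs: standard model distribution over the full vocabulary.
    """
    if word in shortlist:
        # Mass the back-off model gives to all short-list words combined.
        mass = sum(backoff_probs[w] for w in shortlist)
        return nn_probs[word] * mass
    # Words outside the short-list fall back to the standard model.
    return backoff_probs[word]

# Made-up numbers for a single history, for illustration only.
backoff = {"Spain": 0.4, "but": 0.3, "and": 0.2, "zebra": 0.1}
shortlist = {"Spain", "but", "and"}
nn = {"Spain": 0.6, "but": 0.3, "and": 0.1}

combined = {w: shortlist_prob(w, nn, backoff, shortlist) for w in backoff}
```

The output layer then needs one unit per short-list word rather than one per vocabulary word, which is where the savings in the hidden->output layer come from.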
Design Decisions (1)
• Number of hidden units
Design Decisions (2)
• Word representation (# of dimensions)
• They chose 120 – more recent work uses up to thousands of dimensions
Comparison vs. state of the art
• Circa 2005
Schwenk, Holger, and Jean-Luc Gauvain. "Training neural network language
models on very large corpora." Proceedings of the conference on Human
Language Technology and Empirical Methods in Natural Language
Processing. Association for Computational Linguistics, 2005.
Latest Results
Chelba, Ciprian, et al. "One billion word benchmark for
measuring progress in statistical language modeling." arXiv
preprint arXiv:1312.3005 (2013).