Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial...
Transcript of Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial...
![Page 1: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/1.jpg)
Genome 559: Introduction to Statistical and Computational Genomics
Elhanan Borenstein
Artificial Neural Networks
Some slides adapted from Geoffrey Hinton and Igor Aizenberg
![Page 2: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/2.jpg)
Ab initio gene prediction Parameters:
Splice donor sequence model Splice acceptor sequence model Intron and exon length distribution Open reading frame More …
Markov chain States Transition probabilities
Hidden Markov Model (HMM)
A quick review
![Page 3: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/3.jpg)
Machine learning
“A field of study that gives computers the ability to learnwithout being explicitly programmed.”
Arthur Samuel (1959)
![Page 4: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/4.jpg)
Tasks best solved by learning algorithms Recognizing patterns:
Facial identities or facial expressions
Handwritten or spoken words
Recognizing anomalies: Unusual sequences of credit card transactions
Prediction: Future stock prices
Predict phenotype based on markers
Genetic association, diagnosis, etc.
![Page 5: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/5.jpg)
Why machine learning? It is very hard to write programs that solve problems
like recognizing a face. We don’t know what program to write.
Even if we had a good idea of how to do it, the program might be horrendously complicated.
Instead of writing a program by hand, we collect lots of examples for which we know the correct output A machine learning algorithm then takes these examples,
trains, and “produces a program” that does the job.
If we do it right, the program works for new cases as well as the ones we trained it on.
![Page 6: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/6.jpg)
Why neural networks? One of those things you always hear about but never
know exactly what they actually mean… A good example of a machine learning framework In and out of fashion … An important part of machine learning history A powerful
framework
![Page 7: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/7.jpg)
The goals of neural computation1. To understand how the brain actually works
Neuroscience is hard!
2. To develop a new style of computation Inspired by neurons and their adaptive connections
Very different style from sequential computation
3. To solve practical problems by developing novel learning algorithms Learning algorithms can be very useful even if they have
nothing to do with how the brain works
![Page 8: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/8.jpg)
How the brain works (sort of) Each neuron receives inputs from many other neurons
Cortical neurons use spikes to communicate
Neurons spike once they “aggregate enough stimuli” through input spikes
The effect of each input spike on the neuron is controlled by a synaptic weight. Weights can be positive or negative
Synaptic weights adapt so that the whole network learns to perform useful computations
A huge number of weights can affect the computation in a veryshort time. Much better bandwidththan a computer.
![Page 9: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/9.jpg)
A typical cortical neuron Physical structure:
There is one axon that branches
There is a dendritic tree that collects input from other neurons
Axons typically contact dendritic trees at synapses
A spike of activity in the axon causes a charge to be injected into the post-synaptic neuron
axon
body
dendritictree
![Page 10: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/10.jpg)
Idealized Neuron
Σ
X1w1
X2
X3
w2
w3
Y
Basically, a weighted sum!
ii
iwxy
![Page 11: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/11.jpg)
Adding bias
Σ,b
X1w1
X2
X3
w2
w3
Y
Function does not have to pass through the origin
bwxy ii
i
![Page 12: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/12.jpg)
Adding an “activation” function
Σ,b
X1w1
X2
X3
w2
w3
Y
The “field” of the neuron goes through an activation function
)( bwxy ii
i
φ
Z, (the field of the neuron)
![Page 13: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/13.jpg)
Σ,b,φ
X1w1
X2
X3
w2
w3
Y
Common activation functions
z z
Linear activation
Threshold activation Hyperbolic tangent activation
Logistic activation
u
u
e
eutanhu
2
2
1
1
1
1 zz
e
1, 0,sign( )
1, 0.
if zz z
if z
z
z
z
z
1
-1
1
0
0
1
-1
13
![Page 14: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/14.jpg)
McCulloch-Pitts neurons Introduced in 1943 (and influenced Von Neumann!)
Threshold activation function
Restricted to binary inputs and outputs
bwxz ii
i 1 if z>00 otherwisey=
Σ,b
X1 w1
X2
w2
Y
w1=1, w2=1, b=1.5
X1 X2 y
0 0 0
0 1 0
1 0 0
1 1 1
X1 AND X2
w1=1, w2=1, b=0.5
X1 X2 y
0 0 0
0 1 1
1 0 1
1 1 1
X1 OR X2
![Page 15: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/15.jpg)
Beyond binary neurons
bwxz ii
i 1 if z>00 otherwisey=
Σ,b
X1 w1
X2
w2
Y
w1=1, w2=1, b=1.5
X1 X2 y
0 0 0
0 1 0
1 0 0
1 1 1
X1 AND X2
w1=1, w2=1, b=0.5
X1 X2 y
0 0 0
0 1 1
1 0 1
1 1 1
X1 OR X2
X1
X2
(0,0)
(0,1)
(1,0)
(1,1)
X1
X2
(0,0)
(0,1)
(1,0)
(1,1)
![Page 16: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/16.jpg)
Beyond binary neurons
X1
X2
A general classifier The weights determine the slope
The bias determines the distance from the origin
But … how would we knowhow to set the weights and the bias?(note: the bias can be represented as an additional input)
![Page 17: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/17.jpg)
Perceptron learning Use a “training set” and let the perceptron learn from
its mistakes Training set: A set of input data for which we know the
correct answer/classification!
Learning principle: Whenever the perceptron is wrong, make a small correction to the weights in the right direction.
Note: Supervised learning
Training set vs. testing set
![Page 18: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/18.jpg)
Perceptron learning1. Initialize weights and threshold (e.g., use small
random values).
2. Use input X and desired output d from training set
3. Calculate the actual output, y
4. Adapt weights: wi(t+1) = wi (t) + α(d − y)xi for allweights. α is the learning rate (don’t overshoot)
Repeat 3 and 4 until the d−y is smaller than a user-specified error threshold, or a predetermined number of iterations have been completed.
If solution exists – guaranteed to converge!
![Page 19: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/19.jpg)
Linear separability What about the XOR function?
Or other non linear separable classification problems such as:
![Page 20: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/20.jpg)
Multi-layer feed-forward networks We can connect several neurons, where the output of
some is the input of others.
![Page 21: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/21.jpg)
Solving the XOR problem Only 3 neurons are required!!!
b=1.5X1
X2 +1
Y
b=0.5
b=0.5+1+1
+1 -1
+1
![Page 22: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/22.jpg)
In fact … With one hidden layer you can solve ANY classification
task!
But …. How do you find the right set of weights?
(note: we only have an error delta for the output neuron)
This problem caused this framework to fall out of favor … until …
![Page 23: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/23.jpg)
Back-propagation
Main idea: First propagate a training input data point forward to
get the calculated output
Compare the calculated output with the desired output to get the error (delta)
Now, propagate the error back in the network to get an error estimate for each neuron
Update weights accordingly
![Page 24: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/24.jpg)
Types of connectivity Feed-forward networks
Compute a series of transformations
Typically, the first layer is the input and the last layer is the output.
Recurrent networks Include directed cycles in their
connection graph.
Complicated dynamics.
Memory.
More biologically realistic?
hidden units
output units
input units
![Page 25: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/25.jpg)
Which is the most useful representation?
B
C
A
D
A B C DA 0 0 1 0B 0 0 0 0C 0 1 0 0D 0 1 1 0
Connectivity MatrixList of edges:(ordered) pairs of nodes
[ (A,C) , (C,B) , (D,B) , (D,C) ]
Object Oriented
Name:Angr:
p1Name:Bngr:
Name:Cngr:
p1
Name:Dngr:
p1 p2
Computational representationof networks
![Page 26: Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Artificial Neural Networks Some slides adapted from Geoffrey Hinton.](https://reader038.fdocuments.us/reader038/viewer/2022103015/5517c65c5503461b658b4959/html5/thumbnails/26.jpg)