Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and...
![Page 1: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/1.jpg)
Deep Learning and its applications to Speech
EE 225D - Audio Signal Processing in Humans and Machines
Oriol Vinyals, UC Berkeley
![Page 2: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/2.jpg)
●This is my biased view about deep learning and, more generally, machine learning past and current research!
Disclaimer
![Page 3: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/3.jpg)
●It’s a hot topic… isn’t it?
●http://deeplearning.net
Why this talk?
![Page 4: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/4.jpg)
●Let x be a signal (or features, in machine learning jargon); we want to find a function f that maps x to an output y:
  ●Waveform “x” to sentence “y” (ASR)
  ●Image “x” to face detection “y” (CV)
  ●Weather measurements “x” to forecast “y” (…)
●Machine learning approach:
  ●Get as many (x,y) pairs as possible, and find f minimizing some loss over the training pairs
  ●Supervised
  ●Unsupervised
Let’s step back to a ML formulation
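A minimal sketch of "find f minimizing some loss over the training pairs", assuming a linear f(x) = w*x + b, a mean squared loss, and plain gradient descent. The toy data (y = 2x + 1), the learning rate, and the function name `fit_linear` are illustrative assumptions, not taken from the talk.

```python
def fit_linear(pairs, lr=0.1, steps=2000):
    """Fit f(x) = w*x + b by gradient descent on the mean squared loss."""
    w, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in pairs) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in pairs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical training pairs (x, y) generated from y = 2x + 1.
pairs = [(k / 10, 2 * (k / 10) + 1) for k in range(11)]
w, b = fit_linear(pairs)
print(f"learned f(x) = {w:.3f} * x + {b:.3f}")
```

The same recipe carries over to richer function families such as neural networks; only the parameterization of f and the gradient computation change.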
![Page 5: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/5.jpg)
(slide credit: Eric Xing, CMU)
NN
![Page 6: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/6.jpg)
●Universal approximation thm.:
  ●We can approximate any (continuous) function on a compact set with a neural network that has a single hidden layer
Can’t we do everything with NNs?
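One way to get a feel for the theorem is to build a single-hidden-layer network by hand instead of training it. The sketch below assumes ReLU hidden units (the classical statements use sigmoidal units) and sets the weights so that the network linearly interpolates sin on the compact set [0, π]. The function names and the choice of 50 hidden units are illustrative assumptions.

```python
import math

def relu(z):
    return max(0.0, z)

def build_network(f, a, b, n_hidden):
    """Hand-pick weights so the network linearly interpolates f on [a, b]."""
    h = (b - a) / n_hidden
    knots = [a + k * h for k in range(n_hidden + 1)]
    slopes = [(f(knots[k + 1]) - f(knots[k])) / h for k in range(n_hidden)]
    # Hidden unit k computes relu(x - knots[k]); its output weight is the
    # change in slope at that knot.
    out_w = [slopes[0]] + [slopes[k] - slopes[k - 1] for k in range(1, n_hidden)]
    return out_w, knots[:-1], f(a)

def forward(x, out_w, thresholds, out_b):
    """Single hidden layer: output bias plus weighted sum of ReLU units."""
    return out_b + sum(c * relu(x - t) for c, t in zip(out_w, thresholds))

out_w, thresholds, out_b = build_network(math.sin, 0.0, math.pi, 50)
grid = [math.pi * i / 1000 for i in range(1001)]
max_err = max(abs(forward(x, out_w, thresholds, out_b) - math.sin(x)) for x in grid)
print(f"max error with 50 hidden units: {max_err:.2e}")
```

Adding hidden units shrinks the error further, but the theorem says nothing about how many units are needed or whether training will find good weights.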
![Page 7: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/7.jpg)
●It has two (possibly more) meanings:
  ●Use many layers in a NN
  ●Train each layer in an unsupervised fashion
●G. Hinton (U. of T.) et al. made these two ideas famous in their 2006 Science paper.
Deep Learning
![Page 8: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/8.jpg)
2006 Science paper (G. Hinton et al)
![Page 9: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/9.jpg)
Great results using Deep Learning
![Page 10: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/10.jpg)
Deep Learning in Speech
Feature extraction
Phone probabilities
HMM
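The slide shows only the three blocks. As a rough sketch of how the last two might connect, the code below assumes the network emits one probability per phone for each frame and that the HMM picks the most likely phone sequence with Viterbi decoding. The two-phone setup, all the numbers, and the function name `viterbi` are hypothetical, and using the probabilities directly as emission scores is a simplification.

```python
import math

def viterbi(frame_probs, trans, init):
    """Most likely state path given per-frame state probabilities."""
    n_states = len(init)
    score = [math.log(init[s]) + math.log(frame_probs[0][s]) for s in range(n_states)]
    back = []
    for probs in frame_probs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at this frame.
            best = max(range(n_states), key=lambda p: prev[p] + math.log(trans[p][s]))
            ptr.append(best)
            score.append(prev[best] + math.log(trans[best][s]) + math.log(probs[s]))
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical network outputs for 4 frames and 2 phones.
frame_probs = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
trans = [[0.7, 0.3], [0.3, 0.7]]  # HMM transition probabilities (assumed)
init = [0.5, 0.5]                 # initial state probabilities (assumed)
path = viterbi(frame_probs, trans, init)
print(path)
```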
![Page 11: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/11.jpg)
●Small scale (TIMIT)
  ●Many papers, most recent: [Deng et al, Interspeech11]
●Small scale (Aurora)
  ●50% rel. impr. [Vinyals et al, ICASSP11/12]
●~Med/Lg scale (Switchboard)
  ●30% rel. impr. [Seide et al, Interspeech11]
●… more to come
Some interesting ASR results
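"Rel. impr." is easy to misread as an absolute gain. The sketch below assumes the metric is an error rate such as word error rate, which the slide does not state. The two error values are hypothetical and chosen only to show the arithmetic; they are not results from the cited papers.

```python
def relative_improvement(baseline_err, new_err):
    """Fraction of the baseline error that the new system removes."""
    return (baseline_err - new_err) / baseline_err

# Hypothetical error rates in percent, for illustration only.
baseline_wer, new_wer = 20.0, 14.0
rel = relative_improvement(baseline_wer, new_wer)
abs_gain = baseline_wer - new_wer
print(f"relative: {rel:.0%}, absolute: {abs_gain:.1f} points")
```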
![Page 12: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/12.jpg)
●Model strength vs. generalization error
●Deep architectures: more parameters more efficiently… Why?
Why is deep better?
![Page 13: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/13.jpg)
●Most relevant work by B. Olshausen (1997!)
“Sparse Coding with an Overcomplete Basis Set: A Strategy Employed by V1?”
●Take a bunch of random natural images, do unsupervised learning, and you recover filters that closely resemble V1 receptive fields!
Is this how the brain really works?
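To make "sparse coding with an overcomplete basis" concrete, here is a sketch of the inference step only: given a fixed dictionary with more atoms than signal dimensions, find coefficients that reconstruct the signal while keeping most of them at zero. The L1 penalty, the ISTA solver, the hand-picked 2-D dictionary, and the names `sparse_code` and `soft_threshold` are assumptions made for illustration. This is not the paper's exact cost function or learning procedure, and it does not learn the filters themselves.

```python
import math

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within t of zero become zero."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def sparse_code(x, atoms, lam=0.1, step=0.5, iters=500):
    """ISTA for min_a 0.5*||x - sum_i a_i*atoms[i]||^2 + lam*sum_i |a_i|."""
    dims = range(len(x))
    a = [0.0] * len(atoms)
    for _ in range(iters):
        recon = [sum(a_i * atom[d] for a_i, atom in zip(a, atoms)) for d in dims]
        resid = [x[d] - recon[d] for d in dims]
        # Gradient step on the reconstruction term, then shrinkage.
        a = [soft_threshold(a_i + step * sum(atom[d] * resid[d] for d in dims), step * lam)
             for a_i, atom in zip(a, atoms)]
    return a

s = 1 / math.sqrt(2)
atoms = [(1.0, 0.0), (0.0, 1.0), (s, s)]  # 3 atoms for a 2-D signal: overcomplete
x = (2 * s, 2 * s)                        # exactly 2 * atoms[2]
# step=0.5 suits this particular dictionary; other dictionaries need their own step.
a = sparse_code(x, atoms)
print([round(a_i, 4) for a_i in a])
```

Although the first two atoms could also represent x, the penalty drives their coefficients to zero and keeps only the diagonal atom, slightly shrunk by the penalty.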
![Page 14: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/14.jpg)
●People have known about NNs for a very long time, so why the hype now?
  ●Computational power?
  ●More data available?
  ●Connection with neuroscience?
●Can we computationally emulate a brain?
  ●~10^11 neurons, ~10^15 connections
  ●Biggest NN: ~10^4 neurons, ~10^8 connections
  ●Many connections flow backwards
  ●Brain understanding is far from complete
Criticisms/open questions
![Page 15: Deep Learning and its applications to Speech EE 225D - Audio Signal Processing in Humans and Machines Oriol Vinyals UC Berkeley.](https://reader036.fdocuments.us/reader036/viewer/2022062516/56649dd85503460f94acdd45/html5/thumbnails/15.jpg)
Questions?