Deep learning Tutorial - Part 2
Location: QuantUniversity Meetup, January 19th 2017, Boston MA
Deep Learning: An Introduction, Part II
2016 Copyright QuantUniversity LLC.
Presented By: Sri Krishnamurthy, CFA
2
Introduction
Slides and Code will be available at: http://www.analyticscertificate.com/DeepLearning
- Analytics Advisory services
- Custom training programs
- Architecture assessments, advice and audits
4
• Founder of QuantUniversity LLC. and www.analyticscertificate.com
• Advisory and Consultancy for Financial Analytics
• Prior experience at MathWorks, Citigroup and Endeca, and with 25+ financial services and energy customers
• Regular columnist for the Wilmott Magazine
• Author of the forthcoming book “Financial Modeling: A case study approach”, published by Wiley
• Chartered Financial Analyst and Certified Analytics Professional
• Teaches Analytics in the Babson College MBA program and at Northeastern University, Boston
Sri Krishnamurthy, Founder and CEO
5
Quantitative Analytics and Big Data Analytics Onboarding
• Trained more than 500 students in Quantitative methods, Data Science and Big Data Technologies using MATLAB, Python and R
• Launched the Analytics Certificate Program in September
  ▫ New Cohort in March 2017
• Coming soon: Deep Learning and Cognitive Computing Certificate!
6
• February 2017
  ▫ Apache Spark Lecture – Feb 3rd
  ▫ Deep Learning Workshop – Boston – March 27-28
  ▫ Anomaly Detection Workshop – Boston – April 24-25
• March 2017
  ▫ Deep Learning Workshop – New York (Date TBD)
Events of Interest
7
•Neural Networks 101
•Multi-Layer Perceptron
•Convolutional Neural Networks
Recap
8
• AutoEncoders
• Recurrent Neural Networks
  ▫ LSTM
Agenda for today
9
• Unsupervised Algorithms
  ▫ Given a dataset with variables x_i, build a model that captures the similarities in different observations and assigns them to different buckets => Clustering, etc.
  ▫ Create a transformed representation of the original data => PCA
Machine Learning
Obs1, Obs2, Obs3, etc. -> Model -> Obs1: Class 1, Obs2: Class 2, Obs3: Class 1
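To make the "assign observations to buckets" idea concrete, here is a minimal sketch of a one-dimensional k-means loop in plain Python. The slides do not name a specific clustering algorithm, so the choice of k-means, the function name `kmeans_1d`, the starting centers, and the sample values are all assumptions made for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move each center
    to the mean of the values assigned to it. Repeat a fixed number of times."""
    centers = list(centers)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: index of the nearest center for every value.
        labels = [
            min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            for v in values
        ]
        # Update step: each center becomes the mean of its bucket.
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:  # keep the old center if the bucket is empty
                centers[k] = sum(members) / len(members)
    return labels, centers


observations = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]  # hypothetical observations
labels, centers = kmeans_1d(observations, centers=[0.0, 5.0])
print(labels)   # bucket index for each observation
print(centers)  # bucket centers after the final update
```

No labels are supplied anywhere: the buckets come only from how close the observations are to one another, which is what separates this from the supervised setting.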
10
• Supervised Algorithms
  ▫ Given a set of variables x_i, predict the value of another variable y in a given data set such that y = F(X)
  ▫ If y is numeric => Prediction
  ▫ If y is categorical => Classification
Machine Learning
x1, x2, x3, … -> Model F(X) -> y
11
•Motivation1:
Autoencoders
1. http://ai.stanford.edu/~quocle/tutorial2.pdf
12
https://blog.google/products/google-plus/saving-you-bandwidth-through-machine-learning/
13
• Goal is to have the reconstructed output x̂ approximate the input x
• Interesting applications such as
  ▫ Data compression
  ▫ Visualization
  ▫ Pre-training neural networks
Autoencoder
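The following is a minimal sketch of the autoencoder idea in plain Python, not the Keras demo from the talk. It compresses 2-D points to a single number and decodes them back, training both halves by gradient descent on the reconstruction error. The linear encoder and decoder, the absence of biases, the toy data, the learning rate, and all names are assumptions chosen to keep the example small.

```python
def reconstruct(x, w_enc, w_dec):
    """Encode a 2-D input into one number (the code), then decode it back to 2-D."""
    code = w_enc[0] * x[0] + w_enc[1] * x[1]
    return code, [w_dec[0] * code, w_dec[1] * code]


def mean_squared_loss(data, w_enc, w_dec):
    """Average squared reconstruction error over the dataset."""
    total = 0.0
    for x in data:
        _, x_hat = reconstruct(x, w_enc, w_dec)
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / len(data)


def train(data, w_enc, w_dec, lr=0.05, steps=500):
    """Plain batch gradient descent on the reconstruction loss."""
    w_enc, w_dec = list(w_enc), list(w_dec)
    n = len(data)
    for _ in range(steps):
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            code, x_hat = reconstruct(x, w_enc, w_dec)
            err = [x_hat[0] - x[0], x_hat[1] - x[1]]
            # Error signal reaching the code through the decoder weights.
            back = err[0] * w_dec[0] + err[1] * w_dec[1]
            for j in range(2):
                g_dec[j] += 2 * err[j] * code / n
                g_enc[j] += 2 * back * x[j] / n
        for j in range(2):
            w_enc[j] -= lr * g_enc[j]
            w_dec[j] -= lr * g_dec[j]
    return w_enc, w_dec


# Hypothetical data: points on the line y = 2x, so one number per point is enough.
data = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w_enc0, w_dec0 = [0.5, 0.5], [0.5, 0.5]
initial_loss = mean_squared_loss(data, w_enc0, w_dec0)
w_enc, w_dec = train(data, w_enc0, w_dec0)
final_loss = mean_squared_loss(data, w_enc, w_dec)
print(initial_loss, final_loss)
```

The target of training is the input itself, so no labels are needed. The one-number code in the middle is the compressed representation that the data compression and visualization applications rely on.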
14
Demo in Keras1
1. https://blog.keras.io/building-autoencoders-in-keras.html
2. https://keras.io/models/model/
15
• Pretraining step: train a sequence of shallow autoencoders, greedily one layer at a time, using unsupervised data.
• Fine-tuning step 1: train the last layer using supervised data.
• Fine-tuning step 2: use backpropagation to fine-tune the entire network using supervised data.
Autoencoders1
1. http://ai.stanford.edu/~quocle/tutorial2.pdf
Supervised learning
Cross-sectional
  ▫ Observations are independent
  ▫ Given X1, …, Xi, predict Y
  ▫ CNNs
Supervised learning
Sequential
  ▫ Sequentially ordered
  ▫ Given O1, …, OT, predict OT+1
1 Normal
2 Normal
3 Abnormal
4 Normal
5 Abnormal
18
• Given: X1, X2, X3, …, XN
• Convert the univariate time series dataset to a cross-sectional dataset
Time series modeling in Keras using MLPs
Series: X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13, X14, X15
X      Y
X1     X2
X2     X3
X3     X4
X4     X5
X5     X6
X6     X7
X7     X8
X8     X9
X9     X10
X10    X11
X11    X12
X12    X13
X13    X14
X14    X15
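The conversion from a univariate series to (X, Y) rows can be sketched as a sliding window. The function name `create_dataset` and the `lookback` parameter are assumptions made for this sketch; with `lookback=1` it produces the pairing X1 -> X2, X2 -> X3, and so on.

```python
def create_dataset(series, lookback=1):
    """Turn a univariate series into cross-sectional rows:
    each row of X holds `lookback` consecutive values, and y is the value that follows."""
    X, y = [], []
    for i in range(len(series) - lookback):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback])
    return X, y


series = ["X%d" % i for i in range(1, 16)]  # X1 ... X15, labels instead of numbers
X, y = create_dataset(series, lookback=1)
for features, target in zip(X, y):
    print(features, "->", target)
```

A series of N points gives N minus `lookback` rows, so the 15-point example gives 14 pairs at lookback 1. A longer lookback gives fewer rows, each with more input columns.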
19
• Monthly data
• Computational Intelligence in Forecasting
• Source: http://irafm.osu.cz/cif/main.php?c=Static&page=download
Sample data
[Figure: line plot of the sample series; x-axis: observation index, 1 to 105; y-axis: value, 0 to 1800]
20
•Keras is a high-level neural networks library, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation.
•Allows for easy and fast prototyping (through total modularity, minimalism, and extensibility).
•Supports both convolutional networks and recurrent networks, as well as combinations of the two.
•Supports arbitrary connectivity schemes (including multi-input and multi-output training).
•Runs seamlessly on CPU and GPU.
Keras
21
• Use 72 for training and 36 for testing
• Lookback 1, 10
• The longer the lookback, the larger the network
Multi-Layer Perceptron
Size 8
Size 1
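A small sketch of the two points on this slide follows: the split keeps time order, and the network grows with the lookback. It assumes that "Size 8" and "Size 1" refer to a hidden dense layer of 8 units followed by a single output unit, each with a bias. That reading, the function names, and the placeholder series are assumptions.

```python
def chronological_split(series, n_train):
    """Keep time order: the first n_train points train the model, the rest test it."""
    return series[:n_train], series[n_train:]


def mlp_parameter_count(lookback, hidden=8, outputs=1):
    """Weights plus biases of a dense lookback -> hidden -> outputs network."""
    return (lookback * hidden + hidden) + (hidden * outputs + outputs)


series = list(range(108))  # placeholder standing in for the 108 monthly values
train, test = chronological_split(series, 72)
print(len(train), len(test))    # 72 36
print(mlp_parameter_count(1))   # 25
print(mlp_parameter_count(10))  # 97
```

Under these assumptions, going from lookback 1 to lookback 10 roughly quadruples the number of trainable parameters while the number of training rows shrinks slightly.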
22
Demo
Lookback    Train Score                   Test Score
1           1972.20 MSE (44.41 RMSE)      3001.77 MSE (54.79 RMSE)
10          2631.49 MSE (51.30 RMSE)      4166.64 MSE (64.55 RMSE)
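Each score above is reported as an MSE with its RMSE in parentheses. The RMSE is the square root of the MSE, which puts the error back in the units of the series. A minimal sketch, with hypothetical function names and toy numbers:

```python
import math


def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


print(mse([3.0, 5.0], [1.0, 5.0]))   # 2.0
print(rmse([3.0, 5.0], [1.0, 5.0]))  # square root of 2.0

# The reported pairs are consistent with this relation:
print(round(math.sqrt(1972.20), 2))  # 44.41
print(round(math.sqrt(3001.77), 2))  # 54.79
```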
23
• Has 3 types of parameters
  ▫ W – Input to Hidden weights
  ▫ U – Hidden to Hidden weights
  ▫ V – Hidden to Label weights
• All W, U, V are shared across time steps
Recurrent Neural Networks1
1. http://ai.stanford.edu/~quocle/tutorial2.pdf
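The role of the three shared parameter sets can be sketched with a scalar recurrent network in plain Python. Scalars stand in for the weight matrices W, U and V. The tanh activation, the missing biases, the zero initial state, and the function name are assumptions made to keep the sketch short.

```python
import math


def rnn_forward(xs, w, u, v, h0=0.0):
    """Scalar vanilla RNN.
    w: input-to-hidden weight, u: hidden-to-hidden weight, v: hidden-to-label weight.
    The same w, u and v are reused at every time step."""
    h = h0
    hidden_states = []
    for x in xs:
        h = math.tanh(w * x + u * h)  # new state mixes the input with the previous state
        hidden_states.append(h)
    return hidden_states, v * h       # prediction is read from the last hidden state


states, prediction = rnn_forward([0.5, -0.1, 0.3], w=0.7, u=-0.4, v=1.2)
print(states)
print(prediction)
```

Because the weights are shared, the number of parameters does not depend on the sequence length. The same function handles 3 inputs or 300.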
24
Where can Recurrent Neural Networks be used?1
1. http://karpathy.github.io/2015/05/21/rnn-effectiveness/
1. Vanilla mode of processing without RNN, from fixed-sized input to fixed-sized output (e.g. image classification).
2. Sequence output (e.g. image captioning takes an image and outputs a sentence of words).
3. Sequence input (e.g. sentiment analysis where a given sentence is classified as expressing positive or negative sentiment).
4. Sequence input and sequence output (e.g. Machine Translation: an RNN reads a sentence in English and then outputs a sentence in French).
5. Synced sequence input and output (e.g. video classification where we wish to label each frame of the video).
25
• Andrej Karpathy’s article
  ▫ http://karpathy.github.io/2015/05/21/rnn-effectiveness/
• Handwriting generation demo
  ▫ http://www.cs.toronto.edu/~graves/handwriting.html
Sample applications
26
Recurrent Neural Networks
• A recurrent neural network can be thought of as multiple copies of the same network, each passing a message to a successor. 1
• Backpropagation (computing the gradient with respect to all parameters of the network), the process used to propagate errors back and update the weights, needs to be modified for RNNs because of the loops in the network.
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
27
•BPTT begins by unfolding a recurrent neural network through time as shown in the figure.
•Training then proceeds in a manner similar to training a feed-forward neural network with backpropagation, except that the training patterns are visited in sequential order.
Back Propagation through time (BPTT)1
1. https://en.wikipedia.org/wiki/Backpropagation_through_time
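The unfold-then-backpropagate idea can be sketched for a scalar RNN and checked against finite differences. This is an illustrative sketch under stated assumptions (tanh activation, no biases, squared-error loss on the final output only, hypothetical names and values), not the procedure of any particular library.

```python
import math


def forward(xs, w, u, v):
    """Unrolled forward pass: h_t = tanh(w*x_t + u*h_{t-1}), prediction = v*h_T."""
    hs = [0.0]  # hs[0] is the initial state h_0
    for x in xs:
        hs.append(math.tanh(w * x + u * hs[-1]))
    return hs, v * hs[-1]


def loss(xs, target, w, u, v):
    _, y = forward(xs, w, u, v)
    return 0.5 * (y - target) ** 2


def bptt(xs, target, w, u, v):
    """Gradients of the loss with respect to the shared weights w, u, v."""
    hs, y = forward(xs, w, u, v)
    dy = y - target
    dv = dy * hs[-1]
    dh = dy * v                          # gradient reaching the last hidden state
    dw = du = 0.0
    for t in range(len(xs), 0, -1):      # walk the unrolled network backwards
        da = dh * (1.0 - hs[t] ** 2)     # back through tanh
        dw += da * xs[t - 1]             # shared w: contributions add up over time
        du += da * hs[t - 1]             # shared u: contributions add up over time
        dh = da * u                      # hand the gradient to the previous step
    return dw, du, dv


def numeric_grad(f, value, eps=1e-6):
    """Central finite difference, used only to check the analytic gradients."""
    return (f(value + eps) - f(value - eps)) / (2 * eps)


xs, target = [0.5, -0.1, 0.3], 0.2   # hypothetical sequence and label
w, u, v = 0.7, -0.4, 1.2             # hypothetical weights
dw, du, dv = bptt(xs, target, w, u, v)
dw_num = numeric_grad(lambda p: loss(xs, target, p, u, v), w)
du_num = numeric_grad(lambda p: loss(xs, target, w, p, v), u)
dv_num = numeric_grad(lambda p: loss(xs, target, w, u, p), v)
print(dw, dw_num)
print(du, du_num)
print(dv, dv_num)
```

The backward loop visits the time steps in reverse order and adds every step's contribution into the same `dw` and `du`, which is the only change relative to backpropagation in a feed-forward network.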
28
• Backpropagation through time (BPTT) for RNNs is difficult due to a problem known as the vanishing/exploding gradient: as the error is propagated back through many time steps, the gradient can become extremely small or extremely large.
• This is addressed by LSTM RNNs. Instead of neurons, LSTMs use memory cells. 1
Addressing the problem of Vanishing/Exploding gradient
http://deeplearning.net/tutorial/lstm.html
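A rough way to see why gradients vanish or explode: in a linear scalar recurrence h_t = u * h_{t-1}, the derivative of the final state with respect to the initial state is u multiplied by itself once per time step. The sketch below uses that simplified linear case, which is an assumption. A real RNN also multiplies in the activation's derivative at each step.

```python
def gradient_through_time(u, steps):
    """d h_T / d h_0 for the linear recurrence h_t = u * h_{t-1}: one factor of u per step."""
    grad = 1.0
    for _ in range(steps):
        grad *= u
    return grad


for u in (0.5, 1.0, 1.5):
    print(u, gradient_through_time(u, 50))
```

With a recurrent weight of 0.5 the gradient after 50 steps is below 1e-15, and with 1.5 it is above 1e8. Only a factor very close to 1 keeps the signal usable over long sequences.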
29
• Dataset of 25,000 movie reviews from IMDB, labeled by sentiment (positive/negative).
• Reviews have been preprocessed, and each review is encoded as a sequence of word indexes (integers).
• For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer "3" encodes the 3rd most frequent word in the data.
• The 2011 paper (see below) had approximately 88% accuracy.
• See
  ▫ https://github.com/fchollet/keras/blob/master/examples/imdb_lstm.py
  ▫ http://machinelearningmastery.com/sequence-classification-lstm-recurrent-neural-networks-python-keras/
  ▫ http://ai.stanford.edu/~amaas/papers/wvSent_acl2011.pdf
Demo – IMDB Dataset
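The frequency-based encoding described above can be sketched in a few lines. This is not the preprocessing code that produced the packaged dataset. The whitespace tokenization, the alphabetical tie-break, the function names, and the toy reviews are assumptions for illustration.

```python
from collections import Counter


def build_word_index(reviews):
    """Rank words by overall frequency: the most frequent word gets index 1."""
    counts = Counter(word for review in reviews for word in review.lower().split())
    # Ties are broken alphabetically so the ranking is deterministic.
    ranked = sorted(counts, key=lambda word: (-counts[word], word))
    return {word: rank for rank, word in enumerate(ranked, start=1)}


def encode(review, word_index):
    """Replace each known word by its frequency rank; unknown words are dropped."""
    return [word_index[w] for w in review.lower().split() if w in word_index]


reviews = ["good good movie", "bad movie", "good plot"]  # hypothetical reviews
word_index = build_word_index(reviews)
print(word_index)
print(encode("bad good movie", word_index))
```

With indexes ordered by frequency, keeping only the most common words reduces to a threshold on the integer: any index above the vocabulary size is dropped.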
30
Network
The 5000 most frequent words are kept, and each is mapped to a vector of length 32
Sequences are restricted to 500 words: longer reviews are cut off, shorter ones are padded
LSTM layer with 100 output dimensions
Accuracy: 84.08%
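Forcing every review to the same length can be sketched as follows. The slide does not say whether padding and cutting happen at the front or the back of a sequence, so padding on the left and cutting the tail are assumptions here, as are the pad value 0 and the function name.

```python
def pad_or_truncate(sequence, max_len=500, pad_value=0):
    """Return a list of exactly max_len items.
    Longer sequences lose their tail; shorter ones are left-padded with pad_value."""
    if len(sequence) >= max_len:
        return list(sequence[:max_len])
    return [pad_value] * (max_len - len(sequence)) + list(sequence)


print(pad_or_truncate([7, 3, 9], max_len=5))        # [0, 0, 7, 3, 9]
print(pad_or_truncate(list(range(10)), max_len=5))  # [0, 1, 2, 3, 4]
print(len(pad_or_truncate([1] * 1200)))             # 500
```

Fixed-length inputs let a batch of reviews be stacked into one rectangular array before it reaches the embedding and LSTM layers.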
31
• Use 72 for training and 36 for testing
• Lookback 1
Using RNNs for the CIF forecasting problem
[Figure: line plot of the CIF series; x-axis: observation index, 1 to 106; y-axis: value, 0 to 1800]
32
Result
Lookback    Train Score    Test Score
1           50.54 RMSE     65.34 RMSE
10          41.65 RMSE     90.68 RMSE
33
• Approach using Microsoft’s Cognitive Toolkit
  ▫ https://gallery.cortanaintelligence.com/Tutorial/Forecasting-Short-Time-Series-with-LSTM-Neural-Networks-2
  ▫ https://www.microsoft.com/en-us/research/product/cognitive-toolkit/model-gallery/
34
Q&A
35
Thank you!
Members & Sponsors!
Sri Krishnamurthy, CFA, CAP
Founder and CEO
QuantUniversity LLC.
srikrishnamurthy
www.QuantUniversity.com
Contact
Information, data and drawings embodied in this presentation are strictly a property of QuantUniversity LLC. and shall not be distributed or used in any other publication without the prior written consent of QuantUniversity LLC.