Deep Learning with Framework

Deep Learning with Framework by Nervana Systems Paramita Mirza 15.10.2015


Page 2: Deep Learning with Framework

http://neon.nervanasys.com/

Page 4: Deep Learning with Framework

GPU vs CPU

• Multilayer perceptron, 1 hidden layer (100 nodes)

• MNIST dataset (handwritten digits), 60,000 training instances, 10,000 test instances
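The network described on this slide can be sketched as a plain NumPy forward pass. This is only an illustration of the architecture, not Neon code: the function names, the random weights, and the batch size of 128 are assumptions (128 is merely consistent with the 469 batches per epoch reported for 60,000 training instances), and bias terms are left out because the layer listing shows only linear and activation layers.

```python
import numpy as np

def rectlin(x):
    # Rectified linear activation: max(x, 0) element-wise.
    return np.maximum(x, 0.0)

def logistic(x):
    # Logistic (sigmoid) activation, squashing values into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(x, w_hidden, w_output):
    # 784 inputs -> 100 hidden units (Rectlin) -> 10 outputs (Logistic).
    hidden = rectlin(x @ w_hidden)
    return logistic(hidden @ w_output)

rng = np.random.default_rng(0)
w_hidden = rng.normal(0.0, 0.01, size=(784, 100))  # assumed random init
w_output = rng.normal(0.0, 0.01, size=(100, 10))   # assumed random init

batch = rng.random((128, 784))  # stand-in for 128 flattened 28x28 images
outputs = mlp_forward(batch, w_hidden, w_output)
n_weights = w_hidden.size + w_output.size

print(outputs.shape, n_weights)
```

With these sizes the model has 784 x 100 + 100 x 10 weights to learn.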

Network Layers:

Linear Layer 'LinearLayer': 784 inputs, 100 outputs

Activation Layer 'ActivationLayer': Rectlin

Linear Layer 'LinearLayer': 100 inputs, 10 outputs

Activation Layer 'ActivationLayer': Logistic

Epoch 0 [Train |████████████████████| 469/469 batches, 0.70 cost, 18.90s]

Epoch 1 [Train |████████████████████| 469/469 batches, 0.27 cost, 18.57s]

Epoch 2 [Train |████████████████████| 469/469 batches, 0.21 cost, 18.08s]

Epoch 3 [Train |████████████████████| 468/468 batches, 0.17 cost, 18.75s]

Epoch 4 [Train |████████████████████| 469/469 batches, 0.15 cost, 17.78s]

Epoch 5 [Train |████████████████████| 469/469 batches, 0.13 cost, 19.43s]

Epoch 6 [Train |████████████████████| 469/469 batches, 0.11 cost, 19.08s]

Epoch 7 [Train |████████████████████| 468/468 batches, 0.10 cost, 18.05s]

Epoch 8 [Train |████████████████████| 469/469 batches, 0.09 cost, 18.12s]

Epoch 9 [Train |████████████████████| 469/469 batches, 0.08 cost, 18.35s]

Misclassification error = 2.6%

2015-10-14 18:01:37,804 INFO:__init__ - Cudanet backend, RNG seed: None, numerr: None

2015-10-14 18:01:37,804 INFO:mlp - Layers:

DataLayer d0: 784 nodes

FCLayer h0: 784 inputs, 100 nodes, RectLin act_fn

FCLayer output: 100 inputs, 10 nodes, Logistic act_fn

CostLayer cost: 10 nodes, CrossEntropy cost_fn

2015-10-14 18:01:56,206 INFO:mlp - commencing model fitting

2015-10-14 18:02:01,799 INFO:mlp - epoch: 0, training error: 0.70664

2015-10-14 18:02:07,829 INFO:mlp - epoch: 1, training error: 0.27303

2015-10-14 18:02:13,771 INFO:mlp - epoch: 2, training error: 0.21024

2015-10-14 18:02:19,800 INFO:mlp - epoch: 3, training error: 0.17499

2015-10-14 18:02:25,457 INFO:mlp - epoch: 4, training error: 0.15191

2015-10-14 18:02:31,440 INFO:mlp - epoch: 5, training error: 0.13190

2015-10-14 18:02:37,046 INFO:mlp - epoch: 6, training error: 0.11669

2015-10-14 18:02:43,039 INFO:mlp - epoch: 7, training error: 0.10395

2015-10-14 18:02:49,045 INFO:mlp - epoch: 8, training error: 0.09465

2015-10-14 18:02:55,084 INFO:mlp - epoch: 9, training error: 0.08586

2015-10-14 18:02:55,626 INFO:fit_predict_err - test set MisclassPercentage_TOP_1 2.57412

2015-10-14 18:02:58,733 INFO:fit_predict_err - train set MisclassPercentage_TOP_1 1.11846

GPU (Cudanet backend): 58.88 sec vs CPU: 185.11 sec for 10 training epochs; the GPU run takes 68.2% less time (about 3.1x faster)
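The timing comparison can be re-derived from the two logs. The sketch below assumes the CPU total is the sum of the ten per-epoch times from the progress-bar log, and the GPU total is the span between the "commencing model fitting" timestamp and the last epoch timestamp; the variable names are illustrative.

```python
from datetime import datetime

# Per-epoch training times (seconds) copied from the progress-bar log.
cpu_epoch_seconds = [18.90, 18.57, 18.08, 18.75, 17.78,
                     19.43, 19.08, 18.05, 18.12, 18.35]
cpu_total = sum(cpu_epoch_seconds)

# Start of fitting and end of the last epoch, copied from the Cudanet log.
log_format = "%Y-%m-%d %H:%M:%S,%f"
gpu_start = datetime.strptime("2015-10-14 18:01:56,206", log_format)
gpu_end = datetime.strptime("2015-10-14 18:02:55,084", log_format)
gpu_total = (gpu_end - gpu_start).total_seconds()

time_saved = 1.0 - gpu_total / cpu_total  # fraction of CPU time saved
speedup = cpu_total / gpu_total           # how many times faster

print("CPU %.2f s, GPU %.2f s" % (cpu_total, gpu_total))
print("%.1f%% less time, %.1fx speedup" % (100 * time_saved, speedup))
```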

Page 5: Deep Learning with Framework

Getting Started

or

GPU support (Maxwell-based architecture) requires installing the CUDA SDK and drivers

Page 6: Deep Learning with Framework

…or with Docker

docker pull kaixhin/neon         # CPU

docker pull kaixhin/cuda-neon    # GPU

Page 8: Deep Learning with Framework

Datasets

• MNIST, a dataset of handwritten digits (28x28 grayscale), 60,000 training samples, 10,000 test samples

• CIFAR10, an image dataset (32x32 color), 50,000 training samples, 10,000 test samples, 10 categories

• ImageCaption, an image and caption dataset (flickr8k, flickr30k, and COCO), 5 reference sentences per image

Page 9: Deep Learning with Framework

Datasets (2)

• Text: Penn Treebank, Hutter Prize, and Shakespeare

• Speech: no dataset is provided, and a handler is not yet implemented

• Adding a new dataset?

…or modifying NEON_HOME/neon/data/loader.py and NEON_HOME/neon/data/__init__.py (continued…)

Page 10: Deep Learning with Framework

Datasets (3)

NEON_HOME/neon/data/loader.py

• Update dataset_meta = { … … ,

    'tempeval3': {    # the name of the dataset to call in the YAML file
        'size': 0,
        'file': '',
        'url': '',
        'func': load_tempeval3
    }

• Update the load_tempeval3() function, e.g. to read CSV files into NumPy arrays

def load_tempeval3(path):
    tempeval3_meta = dataset_meta['tempeval3']

    # Resolve the feature and label files inside the dataset directory
    train_path = _valid_path_append(path, "te3-ee-token-embedding-no-label.csv")
    train_label_path = _valid_path_append(path, "te3-ee-train-label.csv")
    # NOTE: same file name as train_path; check that this is the intended test file
    test_path = _valid_path_append(path, "te3-ee-token-embedding-no-label.csv")
    test_label_path = _valid_path_append(path, "te3-ee-eval-label.csv")

    # Read the comma-separated files into NumPy arrays
    X_train = np.loadtxt(train_path, delimiter=",")
    y_train = np.loadtxt(train_label_path, delimiter=",")
    X_test = np.loadtxt(test_path, delimiter=",")
    y_test = np.loadtxt(test_label_path, delimiter=",")

    nclass = 14  # number of target classes

    return (X_train, y_train), (X_test, y_test), nclass
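The CSV-loading idea behind load_tempeval3() can be tried outside Neon with a small self-contained sketch. Everything here is hypothetical and for illustration only: the function name load_csv_dataset, the four file names, and the toy values are not part of Neon or of the TempEval-3 data; only the return shape (train pair, test pair, number of classes) follows the slide.

```python
import os
import tempfile

import numpy as np

def load_csv_dataset(path, nclass):
    """Read comma-separated feature and label files into NumPy arrays."""
    X_train = np.loadtxt(os.path.join(path, "train-features.csv"), delimiter=",")
    y_train = np.loadtxt(os.path.join(path, "train-labels.csv"), delimiter=",")
    X_test = np.loadtxt(os.path.join(path, "test-features.csv"), delimiter=",")
    y_test = np.loadtxt(os.path.join(path, "test-labels.csv"), delimiter=",")
    return (X_train, y_train), (X_test, y_test), nclass

# Write tiny toy files so the example runs without any external data.
with tempfile.TemporaryDirectory() as data_dir:
    np.savetxt(os.path.join(data_dir, "train-features.csv"),
               [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], delimiter=",")
    np.savetxt(os.path.join(data_dir, "train-labels.csv"), [0, 1, 1])
    np.savetxt(os.path.join(data_dir, "test-features.csv"),
               [[0.7, 0.8], [0.9, 1.0]], delimiter=",")
    np.savetxt(os.path.join(data_dir, "test-labels.csv"), [1, 0])

    (X_train, y_train), (X_test, y_test), nclass = load_csv_dataset(data_dir, nclass=2)

print(X_train.shape, y_train.shape, X_test.shape, y_test.shape, nclass)
```

The number of rows in each feature file has to match the number of rows in its label file, which is worth checking before plugging a loader like this into a training run.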

Page 11: Deep Learning with Framework

Datasets (4)

NEON_HOME/neon/data/__init__.py

• Update from neon.data.loader import ( … … , load_tempeval3)

Page 12: Deep Learning with Framework

[Diagram: network training loop with the labels Input, Output, Calculate error, Learning]

Page 13: Deep Learning with Framework

Network Functions (in Neon)

• Initializers: Constant, Uniform, Gaussian, GlorotUniform

• Activations: Identity, RectifiedLinear, Softmax, Tanh, Logistic

• Optimizers: GradientDescentMomentum, RMSProp, Adadelta, Adam

• Costs: Binary Cross Entropy, Multiclass Cross Entropy, Sum Squared Error
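To make the activation and cost names above concrete, here is a minimal NumPy sketch of their usual textbook definitions. These are not Neon's implementations: the function names are illustrative, the clipping constant eps is an assumption added for numerical safety, and the factor 0.5 in the sum squared error is one common convention.

```python
import numpy as np

def rectified_linear(x):
    return np.maximum(x, 0.0)

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Subtract the row maximum first so the exponentials cannot overflow.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

def binary_cross_entropy(y, t, eps=1e-12):
    # y: predicted probabilities, t: binary targets.
    y = np.clip(y, eps, 1.0 - eps)
    return -np.sum(t * np.log(y) + (1.0 - t) * np.log(1.0 - y))

def multiclass_cross_entropy(y, t, eps=1e-12):
    # y: predicted class probabilities, t: one-hot targets.
    return -np.sum(t * np.log(np.clip(y, eps, 1.0)))

def sum_squared_error(y, t):
    return 0.5 * np.sum((y - t) ** 2)

logits = np.array([2.0, 1.0, 0.1])
target = np.array([1.0, 0.0, 0.0])
probs = softmax(logits)
loss = multiclass_cross_entropy(probs, target)
print(probs, loss)
```

Tanh is available directly as np.tanh, and the identity activation simply returns its input.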

Page 14: Deep Learning with Framework

Convolutional Neural Network

• The best-known network architecture for image recognition, e.g. GoogLeNet (a 22-layer deep network), the winner of the ImageNet Large Scale Visual Recognition Challenge 2014

• Unlike a network of fully connected layers, it reduces the number of parameters to be learned while retaining high expressiveness, through:

– Local connectivity

– Weight sharing

– Pooling
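The three ideas listed above can be illustrated with a naive NumPy sketch. It is an assumption-laden toy, not how Neon implements its layers: a single 28x28 input, a single 5x5 filter, stride 1, no padding, and 2x2 non-overlapping max pooling. As in most deep learning frameworks, the "convolution" is computed without flipping the kernel.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (stride 1, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Local connectivity: each output unit sees only a kh x kw patch.
            # Weight sharing: the same kernel is reused at every position.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.ones((28, 28))
kernel = np.ones((5, 5))
feature_map = conv2d_valid(image, kernel)
pooled = max_pool(feature_map)

conv_weights = kernel.size                      # weights in the shared filter
dense_weights = image.size * feature_map.size   # same mapping, fully connected

print(feature_map.shape, pooled.shape, conv_weights, dense_weights)
```

Producing the same 24x24 output with a fully connected layer would need one weight per input-output pair, which is where the parameter saving comes from.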

Page 15: Deep Learning with Framework

Convolutional Neural Network (2)

More info: • http://ufldl.stanford.edu/tutorial/

• http://deeplearning.net/tutorial/lenet.html

Page 16: Deep Learning with Framework

Convolutional Neural Network (3)

• Related Neon layers:

– Pooling

– Convolutional and Deconv (composite deconvolution layer)

– Conv (convolutional layer with a learned bias and activation, implemented as a list composing separate Convolution, Bias and Activation layers)

Page 17: Deep Learning with Framework

Recurrent Neural Network

• What makes recurrent networks so special? Sequences!

1. Image classification

2. Image captioning

3. Sentiment analysis

4. Machine translation

5. Video classification
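What lets a recurrent network handle sequences is that the same weights are applied at every time step while a hidden state carries information forward. The sketch below is a minimal vanilla (tanh) recurrence in NumPy, used in a many-to-one fashion such as sentiment analysis; the sizes, the random weights, and the function name are assumptions for illustration, and it is not one of Neon's recurrent layers.

```python
import numpy as np

def rnn_last_state(inputs, w_xh, w_hh, b_h):
    """Run a vanilla RNN over a sequence and return the final hidden state."""
    h = np.zeros(w_hh.shape[0])
    for x_t in inputs:
        # The same weights are reused at every step, so any length works.
        h = np.tanh(x_t @ w_xh + h @ w_hh + b_h)
    return h

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3
w_xh = rng.normal(0.0, 0.5, size=(input_size, hidden_size))
w_hh = rng.normal(0.0, 0.5, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

short_seq = rng.normal(size=(2, input_size))  # sequence of 2 steps
long_seq = rng.normal(size=(7, input_size))   # sequence of 7 steps

h_short = rnn_last_state(short_seq, w_xh, w_hh, b_h)
h_long = rnn_last_state(long_seq, w_xh, w_hh, b_h)
print(h_short.shape, h_long.shape)
```

Both sequences end up as a fixed-size vector, which a classifier layer could then consume.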


Page 19: Deep Learning with Framework

Dropout Training

• A simple way to prevent neural networks from overfitting

• Random “dropout” gives big improvements on many benchmark tasks and sets new records for object recognition and molecular activity prediction

• Related Neon layers:

– Dropout

More info:

• http://videolectures.net/nips2012_hinton_networks/

• https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf
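As a rough illustration of the idea, the sketch below implements "inverted" dropout in NumPy: units are zeroed at random during training and the survivors are rescaled so the expected activation stays the same. This is a common variant chosen here for simplicity (the paper linked above instead rescales the weights at test time), and it is not Neon's Dropout layer; the function name and the keep probability of 0.5 are assumptions.

```python
import numpy as np

def dropout(x, keep_prob, rng, training=True):
    """Inverted dropout: drop units at random and rescale the survivors."""
    if not training:
        return x  # at test time the layer passes activations through unchanged
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones((1000, 100))

train_out = dropout(activations, keep_prob=0.5, rng=rng, training=True)
test_out = dropout(activations, keep_prob=0.5, rng=rng, training=False)

print(train_out.mean(), test_out.mean())
```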

Page 20: Deep Learning with Framework

Unsupervised Learning

• Autoencoder

In Neon:

# Load dataset
(X_train, y_train), (X_test, y_test), nclass = load_mnist(path=args.data_dir)

# Set input and target to X_train
train = DataIterator(X_train, lshape=(1, 28, 28))
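The key point of the snippet above is that an autoencoder uses the input itself as the training target. The following is a minimal sketch of that idea with a tiny linear autoencoder in plain NumPy, trained by gradient descent on synthetic data. The data, layer sizes, learning rate, and step count are all assumptions chosen for the illustration, and none of it is Neon code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples of 8 features lying on a 2-dimensional subspace.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 8))
X = latent @ mixing

# Encoder compresses 8 features to a 2-unit code; decoder maps back to 8.
w_enc = rng.normal(0.0, 0.1, size=(8, 2))
w_dec = rng.normal(0.0, 0.1, size=(2, 8))
learning_rate = 0.01

def reconstruction_error(X, w_enc, w_dec):
    return float(np.mean((X @ w_enc @ w_dec - X) ** 2))

initial_error = reconstruction_error(X, w_enc, w_dec)

for _ in range(1000):
    code = X @ w_enc
    reconstruction = code @ w_dec
    error = reconstruction - X  # the target is the input itself
    grad_dec = code.T @ error / len(X)
    grad_enc = X.T @ (error @ w_dec.T) / len(X)
    w_dec -= learning_rate * grad_dec
    w_enc -= learning_rate * grad_enc

final_error = reconstruction_error(X, w_enc, w_dec)
print(initial_error, final_error)
```

No labels are used anywhere in the loop, which is what makes this an unsupervised setup.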

Page 22: Deep Learning with Framework

Back to model.py

• Model

# setup model layers
layers = []
layers.append( … )
layers.append( … )

# initialize model object
mlp = Model(layers=layers)

• Train, Evaluate, Output

# train
mlp.fit(train_set, optimizer=optimizer, num_epochs=num_epochs, cost=cost, callbacks=callbacks)

# evaluate
print('Misclassification error = %.1f%%' % (mlp.eval(valid_set, metric=Misclassification())*100))

# output
output = mlp.get_outputs(valid_set)
np.savetxt("output.csv", output, delimiter=",")

• Run ./examples/mnist_mlp.py or neon examples/mnist_mlp.yaml
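The evaluate and output steps can be mimicked without Neon to show what they compute. In the sketch below, the outputs array is a hand-written stand-in for what mlp.get_outputs(valid_set) might return (one row of class scores per sample), and the misclassification function is an illustrative re-implementation, not Neon's Misclassification metric.

```python
import os
import tempfile

import numpy as np

def misclassification(outputs, labels):
    """Fraction of samples whose highest-scoring class differs from the label."""
    predictions = np.argmax(outputs, axis=1)
    return float(np.mean(predictions != labels))

# Hypothetical model outputs for 4 samples and 3 classes.
outputs = np.array([[0.1, 0.8, 0.1],
                    [0.7, 0.2, 0.1],
                    [0.2, 0.3, 0.5],
                    [0.6, 0.3, 0.1]])
labels = np.array([1, 0, 2, 1])

error = misclassification(outputs, labels)
print('Misclassification error = %.1f%%' % (error * 100))

# Save the raw outputs as CSV, then read them back to check the round trip.
with tempfile.TemporaryDirectory() as out_dir:
    out_path = os.path.join(out_dir, "output.csv")
    np.savetxt(out_path, outputs, delimiter=",")
    reloaded = np.loadtxt(out_path, delimiter=",")
```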

Page 23: Deep Learning with Framework

Hyperparameter optimization

• Finding good hyperparameters for deep networks is quite tedious to do manually

• Spearmint has been forked and slightly extended to work with Neon*)

*) This seems to work only for Neon versions < 1.0, and only with the CPU backend

• Run: hyperopt init -y examples/mnist_mlp.yaml
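To give a feel for what automated hyperparameter search does, here is a minimal random-search sketch. It is deliberately not Spearmint (which uses Bayesian optimization) and not the hyperopt command above: the search space, the function names, and especially toy_validation_error, a made-up stand-in for actually training a network, are all assumptions for illustration.

```python
import math
import random

def toy_validation_error(learning_rate, hidden_units):
    # Hypothetical objective: pretends the best setting is lr=0.01, 100 units.
    return (math.log10(learning_rate) + 2.0) ** 2 + abs(hidden_units - 100) / 1000.0

def random_search(objective, n_trials, seed=0):
    """Sample hyperparameters at random and keep the best-scoring trial."""
    rng = random.Random(seed)
    best_score, best_params = None, None
    for _ in range(n_trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-4, 0),  # log-uniform sample
            "hidden_units": rng.choice([50, 100, 200, 400]),
        }
        score = objective(**params)
        if best_score is None or score < best_score:
            best_score, best_params = score, params
    return best_score, best_params

few_score, few_params = random_search(toy_validation_error, n_trials=5)
many_score, many_params = random_search(toy_validation_error, n_trials=50)
print(few_score, many_score, many_params)
```

In a real setting each call to the objective would train and validate a model, which is why smarter strategies than random sampling are attractive.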