Deep Learning, Microsoft Cognitive Toolkit (CNTK) and Azure Machine Learning Services


https://github.com/Microsoft/CNTK

Benchmark results (lower is better; values in parentheses are from the v7 benchmark):

                 Caffe        Cognitive Toolkit      MXNet                   TensorFlow        Torch
FCN-5 (1024)     55.329ms     51.038ms               60.448ms                62.044ms          52.154ms
AlexNet (256)    36.815ms     27.215ms               28.994ms                103.960ms         37.462ms
ResNet (32)      143.987ms    81.470ms               84.545ms                181.404ms         90.935ms
LSTM (256)       -            43.581ms (44.917ms)    288.142ms (284.898ms)   - (223.547ms)     1130.606ms (906.958ms)

http://dlbench.comp.hkbu.edu.hk/

Benchmarking by HKBU, Version 8

Single Tesla K80 GPU, CUDA: 8.0 CUDNN: v5.1

Caffe: 1.0rc5(39f28e4)

CNTK: 2.0 Beta10(1ae666d)

MXNet: 0.93(32dc3a2)

TensorFlow: 1.0(4ac9c09)

Torch: 7(748f5e3)

Note: the benchmark uses only 1 GPU.

Achieved with the 1-bit gradient quantization algorithm

[Chart: speed comparison across frameworks (samples/second), higher = better; note: December 2015]

MICROSOFT COGNITIVE TOOLKIT: First Deep Learning Framework Fully Optimized for Pascal

[Chart: Toolkit Delivering Near-Linear Multi-GPU Scaling, AlexNet Performance — images/sec: 78, 2,400, 3,500, 7,600, 13,000; 170x faster vs. CPU server]

AlexNet training, batch size 128, Grad Bit = 32, dual-socket E5-2699v4 CPUs (total 44 cores). CNTK 2.0b3 (to be released) includes cuDNN 5.1.8, NCCL 1.6.1, NVLink enabled.

$ pip install <url>

https://docs.microsoft.com/en-us/cognitive-toolkit/Setup-CNTK-on-your-machine

https://notebooks.azure.com/cntk/libraries/tutorials
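After installation (see the setup page above), a quick sanity check is to import the package and print its version — a minimal sketch using only the public cntk module:

# confirm CNTK is importable and see which version was installed
import cntk
print(cntk.__version__)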

Example: 2-hidden layer feed-forward NN

h1 = σ(W1 x + b1)              h1 = sigmoid (x @ W1 + b1)
h2 = σ(W2 h1 + b2)             h2 = sigmoid (h1 @ W2 + b2)
P  = softmax(Wout h2 + bout)   P  = softmax (h2 @ Wout + bout)

with input x ∈ R^M and one-hot label y ∈ R^J
and cross-entropy training criterion

ce = y^T log P                 ce = cross_entropy (P, y)

Σ_corpus ce = max

h1 = sigmoid (x @ W1 + b1)
h2 = sigmoid (h1 @ W2 + b2)
P  = softmax (h2 @ Wout + bout)
ce = cross_entropy (P, y)

[Figure: computation graph of the network — inputs x and y; parameters W1, b1, W2, b2, Wout, bout; (+, sigmoid) nodes producing h1 and h2, a (+, softmax) node producing P, and cross_entropy(P, y) producing ce]
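As a concrete illustration, the same graph can be written against the CNTK Python API roughly as follows — a minimal sketch; the dimensions (M = 784, J = 10) and hidden sizes are illustrative assumptions, not taken from the slide:

# 2-hidden-layer feed-forward network as a CNTK computation graph
# (dimensions and hidden sizes below are illustrative assumptions)
import cntk as C

M, J = 784, 10                      # input and label dimensions (assumed)
x = C.input_variable(M)
y = C.input_variable(J)

h1 = C.layers.Dense(400, activation=C.sigmoid)(x)
h2 = C.layers.Dense(200, activation=C.sigmoid)(h1)
z  = C.layers.Dense(J, activation=None)(h2)      # un-normalized scores

# cross_entropy_with_softmax applies the softmax internally
ce = C.cross_entropy_with_softmax(z, y)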

LEGO-like composability allows CNTK to support a wide range of networks & applications

Script configures and executes through CNTK Python APIs…

corpus → reader → network → trainer → model

reader
• minibatch source
• task-specific deserializer
• automatic randomization
• distributed reading

network
• model function
• criterion function
• CPU/GPU execution engine
• packing, padding

trainer
• SGD (momentum, Adam, …)
• minibatching

from cntk import *

# reader
def create_reader(path, is_training):
    ...

# network
def create_model_function():
    ...
def create_criterion_function(model):
    ...

# trainer (and evaluator)
def train(reader, model):
    ...
def evaluate(reader, model):
    ...

# main function
model = create_model_function()

reader = create_reader(..., is_training=True)
train(reader, model)

reader = create_reader(..., is_training=False)
evaluate(reader, model)
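For the criterion piece of that skeleton, a common pattern is to pair the loss with an error metric for reporting — a sketch, assuming a classification task where `labels` is the label input variable:

# sketch of a criterion function: loss plus an error metric
# (`labels` is an assumed input variable for this example)
import cntk as C

def create_criterion_function(model, labels):
    loss = C.cross_entropy_with_softmax(model, labels)
    errs = C.classification_error(model, labels)
    return loss, errs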

def create_reader(map_file, mean_file, is_training):
    # image preprocessing pipeline
    transforms = [
        ImageDeserializer.crop(crop_type='Random', ratio=0.8, jitter_type='uniRatio'),
        ImageDeserializer.scale(width=image_width, height=image_height,
                                channels=num_channels, interpolations='linear'),
        ImageDeserializer.mean(mean_file)
    ]
    # deserializer
    return MinibatchSource(ImageDeserializer(map_file, StreamDefs(
        features = StreamDef(field='image', transforms=transforms),
        labels   = StreamDef(field='label', shape=num_classes)
    )), randomize=is_training,
        epoch_size=INFINITELY_REPEAT if is_training else FULL_DATA_SWEEP)
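Once created, the reader hands out minibatches that can be fed to a trainer. A minimal usage sketch — the file names, the 64-sample minibatch size, and the `features_var`/`labels_var` input variables are illustrative assumptions:

# pull one minibatch from the reader (file names, sizes, and variable names are assumptions)
reader = create_reader('train_map.txt', 'mean.xml', is_training=True)
minibatch = reader.next_minibatch(64, input_map={
    features_var: reader.streams.features,
    labels_var:   reader.streams.labels
})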

Model

z = model(x):
    h1 = Dense(400, act=relu)(x)
    h2 = Dense(200, act=relu)(h1)
    r  = Dense(10, act=None)(h2)
    return r

Loss
cross_entropy_with_softmax(z, Y)

[Figure: 28 x 28 pixel input image]

Model

z = model(x):
    h = Convolution2D((5,5), filt=8, …)(x)
    h = MaxPooling(…)(h)
    h = Convolution2D((5,5), filt=16, …)(h)
    h = MaxPooling(…)(h)
    r = Dense(output_classes, act=None)(h)
    return r
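The slide pseudocode maps fairly directly onto the CNTK layers library. A runnable sketch of the convolutional variant — filter counts follow the slide, while padding, strides, and the number of output classes are assumptions:

# convolutional model with the CNTK layers library
# (the slide's pseudocode made concrete; pad/strides/output_classes are assumptions)
import cntk as C

def create_model(features, output_classes=10):
    with C.layers.default_options(activation=C.relu):
        h = C.layers.Convolution2D((5, 5), num_filters=8, pad=True)(features)
        h = C.layers.MaxPooling((2, 2), strides=(2, 2))(h)
        h = C.layers.Convolution2D((5, 5), num_filters=16, pad=True)(h)
        h = C.layers.MaxPooling((2, 2), strides=(2, 2))(h)
        return C.layers.Dense(output_classes, activation=None)(h)

x = C.input_variable((1, 28, 28))   # single-channel 28 x 28 image, as on the slide
z = create_model(x)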

Problem: Tagging entities in Air Traffic Controller (ATIS) data

[Figure: one recurrent cell per token —
  show     → o
  burbank  → From_city
  to       → o
  seattle  → To_city
  flights  → o
  tomorrow → Date]

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

[Figure: per-token network — one-hot text token x(t) (1 x 943) → Embedding E (i = 943, o = 150) → LSTM L (i = 150, o = 300, recurrent state h(t-1) → h(t)) → Dense D (i = 300, o = 129, a = sigmoid) → class label y(t)]

z = model():
    return Sequential([
        Embedding(emb_dim=150),
        Recurrence(LSTM(hidden_dim=300), go_backwards=False),
        Dense(num_labels=129)
    ])

# per-minibatch learning rates: 0.05 for the first 3 x 100 samples,
# then 0.025 for the next 2 x 100, then 0.0125 thereafter
lr_schedule = C.learning_rate_schedule([0.05]*3 + [0.025]*2 + [0.0125], C.UnitType.minibatch, epoch_size=100)

sgd_learner = C.sgd(z.parameters, lr_schedule)

[Figure: an ATIS training minibatch of 96 samples (#1 … #96), each a token sequence of varying length (e.g. t1 … t23); input features: 96 x x(t); labels one-hot encoded (Y: 96 x 129 per sample, or per word in the sequence)]

z = model():
    return Sequential([
        Embedding(emb_dim=150),
        Recurrence(LSTM(hidden_dim=300), go_backwards=False),
        Dense(num_labels=129)
    ])

Loss:  cross_entropy_with_softmax(z, Y)
Error: classification_error(z, Y)

Choose a learner (SGD, Adam, AdaGrad, etc.)

Trainer(model, (loss, error), learner)
Trainer.train_minibatch({X, Y})
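Putting those pieces together in the Python API, a training step looks roughly like this — a sketch under the slide's dimensions; the input declarations and the source of X_batch/Y_batch are assumptions:

# end-to-end sketch: sequence-tagging model, criterion, learner, and one training step
import cntk as C

num_words, emb_dim, hidden_dim, num_labels = 943, 150, 300, 129

# sparse one-hot word sequence in, label sequence out (declaration style is an assumption)
x = C.sequence.input_variable(num_words, is_sparse=True)
y = C.sequence.input_variable(num_labels, is_sparse=True)

z = C.layers.Sequential([
    C.layers.Embedding(emb_dim),
    C.layers.Recurrence(C.layers.LSTM(hidden_dim), go_backwards=False),
    C.layers.Dense(num_labels)
])(x)

loss  = C.cross_entropy_with_softmax(z, y)
error = C.classification_error(z, y)

lr_schedule = C.learning_rate_schedule([0.05]*3 + [0.025]*2 + [0.0125],
                                       C.UnitType.minibatch, epoch_size=100)
learner = C.sgd(z.parameters, lr_schedule)
trainer = C.Trainer(z, (loss, error), [learner])

# one update; X_batch / Y_batch would come from a minibatch source (assumed here)
# trainer.train_minibatch({x: X_batch, y: Y_batch})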

Function z = CNTKLib.Times(weightParam, input) + biasParam;
Function loss = CNTKLib.CrossEntropyWithSoftmax(z, labelVariable);

Function conv = CNTKLib.Pooling(CNTKLib.Convolution(convParam, input), PoolingType.Average, poolingWindowShape);

Function resNetNode = CNTKLib.ReLU(CNTKLib.Plus(conv, input));

var parameterLearners = new List<Learner>() { Learner.AdamLearner(classifierOutput.Parameters(), learningRate, momentum) };

var trainer = Trainer.CreateTrainer(classifierOutput, trainingLoss, prediction, parameterLearners);

Function conv = CNTKLib.ReLU(CNTKLib.Convolution(convParams, features, strides ));

Function pooling = CNTKLib.Pooling(conv, PoolingType.Max, poolingWindow, stride, padding);

Function classifier = TestHelper.Dense(pooling, numClasses, device, Activation.None);

var minibatchSource = MinibatchSource.TextFormatMinibatchSource("Train_cntk_text.txt", streamConfigurations, MinibatchSource.InfinitelyRepeat);

var minibatchData = minibatchSource.GetNextMinibatch(minibatchSize, device);

var arguments = new Dictionary<Variable, MinibatchData> {

{ input, minibatchData[featureStreamInfo] },

{ labels, minibatchData[labelStreamInfo] }

};

trainer.TrainMinibatch(arguments, device);

https://github.com/Microsoft/CNTK/tree/master/Examples/TrainingCSharp

DATA SCIENCE & AI

KEY TRENDS
• Accelerating adoption of AI by developers (consuming models)
• Rise of hybrid training and scoring scenarios
• Push scoring/inference to the event (edge, cloud, on-prem)
• Some developers moving into deep learning as a non-traditional path to DS / AI dev
• Growth of a diverse hardware arms race across all form factors (CPU / GPU / FPGA / ASIC / device)

CHALLENGES
• Data prep
• Model deployment & management
• Model lineage & auditing
• Explainability

Drone-based electric grid inspector powered by deep learning

Challenge
• Traditional power line inspection services are costly
• Demand for low-cost image scoring and support for multiple concurrent customers
• Needed powerful AI to execute on a drone solution

Solution
• Deep learning to analyze multiple streaming data feeds
• Azure GPUs support Single Shot MultiBox detectors
• Reliable, consistent, and highly elastic scalability with Azure Batch Shipyard

Identifying Snow Leopards: computer vision and classification on Spark

[Figure: is this a snow leopard? Image → deep neural network (image features) → Spark ML classifier (decision tree or logistic regression) → class]

THE AI DEVELOPMENT LIFECYCLE

[Figure: data sources (apps + insights, social, LOB, graph, IoT, image, CRM) flow through INGEST → STORE → PREP & TRAIN → MODEL & SERVE, supported by data orchestration and monitoring, data lake and storage, Hadoop/Spark/SQL and ML, IoT, and Azure Machine Learning]

Azure Machine Learning Studio

Platform for data scientists to graphically build and deploy experiments
• Rapid experiment composition
• > 100 easily configured modules for data prep, training, evaluation
• Extensibility through R & Python
• Serverless training and deployment

Some numbers:
• 100's of thousands of deployed models serving billions of requests

AZURE MACHINE LEARNING SERVICES — NEW CAPABILITIES

Begin building now with the tools and platforms you know
• Build, deploy, and manage models at scale
• Boost productivity with agile development
• Notebooks, IDEs, Azure Machine Learning Workbench, VS Code Tools for AI
• Experimentation and Model Management Services

TRAIN & DEPLOY OPTIONS

AZURE: Spark, SQL Server, virtual machines, GPUs, container services
ON-PREMISES: SQL Server, Machine Learning Server
EDGE: Azure IoT Edge

• Local machine
• Scale up to DSVM
• Scale out with Spark on HDInsight
• Azure Batch AI (coming soon)
• ML Server

Experiment Everywhere

AZURE ML EXPERIMENTATION
• Command line tools
• IDEs
• Notebooks in Workbench
• VS Code Tools for AI

• Manage project dependencies
• Manage training jobs locally, scaled-up or scaled-out
• Git-based checkpointing and version control
• Service-side capture of run metrics, output logs and models
• Use your favorite IDE, and any framework

Experimentation service

USE THE MOST POPULAR INNOVATIONS — USE ANY TOOL — USE ANY FRAMEWORK OR LIBRARY

Deployment targets:
• Docker
• Single node deployment (cloud/on-prem)
• Azure Container Service
• Azure IoT Edge
• Microsoft ML Server
• Spark clusters
• SQL Server

Deploy Everywhere

AZURE ML MODEL MANAGEMENT

• Deployment and management of models as HTTP services
• Container-based hosting of real-time and batch processing
• Management and monitoring through Azure Application Insights
• First-class support for Spark ML, Python, Cognitive Toolkit, TF, R; extensible to support others (Caffe, MXNet)
• Service authoring in Python
• Manage models

AI Powered Spreadsheets

VS Code Tools for AI
• VS Code extension with deep integration to Azure ML
• End-to-end development environment, from new project through training
• Support for remote training
• Job management
• On top of all of the goodness of VS Code (Python, Jupyter, Git, etc.)

Azure Machine Learning Workbench - What Is It?
• Windows- and Mac-based companion for AI development
• Full environment set-up (Python, Jupyter, etc.)
• Embedded notebooks
• Run History and Comparison experience
• New data wrangling tools

AI Powered Data Wrangling
• Rapidly sample, understand, and prep data
• Leverage PROSE and more for intelligent data prep by example
• Extend/customize transforms and featurization through Python
• Generate Python and PySpark for execution at scale

Machine Learning & AI Portfolio — when to use what?

Questions to ask:
• Build your own or consume pre-trained models?
• Which experience do you want?
• What engine(s) do you want to use?
• Deployment target?

Microsoft ML & AI products:
• Consume → Cognitive Services, bots
• Build your own → Azure Machine Learning
  - Code first (on-prem): ML Server — on-prem Hadoop, SQL Server
  - Code first (cloud): AML services (Preview) — SQL Server, Spark, Hadoop, Azure Batch, DSVM, Azure Container Service
  - Visual tooling (cloud): AML Studio

http://aka.ms/aml_deep_dive

https://aischool.microsoft.com/learning-paths/SPYpcLhRMyEAa2maw6YoU

https://channel9.msdn.com/events/Ignite/Microsoft-Ignite-Orlando-2017/BRK4033

https://channel9.msdn.com/events/Ignite/Microsoft-Ignite-Orlando-2017/BRK2270

https://www.microsoft.com/en-us/cognitive-toolkit/

https://azure.microsoft.com/services/machine-learning-services/

https://azure.microsoft.com/services/virtual-machines/data-science-virtual-machines/