Chainer Update v1.8.0 -> v1.10.0+


Transcript of Chainer Update v1.8.0 -> v1.10.0+

Page 1: Chainer Update v1.8.0 -> v1.10.0+

Chainer Update v1.8.0 -> v1.10.0+

Chainer Meetup #03 @ Dwango

Seiya Tokui, Preferred Networks, Inc.

2016/07/02

Page 2: Chainer Update v1.8.0 -> v1.10.0+

Updates v1.8.0 -> v1.10.0


Page 3: Chainer Update v1.8.0 -> v1.10.0+

Many updates since the last meetup

v1.8.0 (April 12)

v1.9.0 (May 31)

v1.10.0 (June 28)

Many contributions from the community. Thank you very much!!!!!

(30 PRs from non-PFI/PFNers have been merged since the last meetup in March)


Page 4: Chainer Update v1.8.0 -> v1.10.0+

New core features

CaffeFunction improved: supports Python 3 and the ResNet models

Weight initializer

– Most links now accept an initializer to initialize their parameters

– Sample:

import chainer.links as L, chainer.initializers as I
L.Linear(784, 1000, initialW=I.Normal(0.01), initialb=I.Constant(0.1))

– Many built-in initializers

– You can also write your own initializer easily: it is just a function/callable that initializes the elements of a given CPU array (see the sketch below)

Support float16 and float64 in many Functions

CuPy Profiling API
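A minimal sketch of a custom initializer, assuming (as described above) that initialW accepts any callable that fills a given CPU array in place; the layer sizes are arbitrary:

import numpy as np
import chainer.links as L

# A custom initializer is just a callable that fills the given array
# in place; here, uniform values in [-0.05, 0.05).
def my_uniform(array):
    array[...] = np.random.uniform(-0.05, 0.05, size=array.shape)

layer = L.Linear(784, 1000, initialW=my_uniform)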


Page 5: Chainer Update v1.8.0 -> v1.10.0+

New core features (cont.)

ndarray arguments for Functions

– Function now accepts NumPy/CuPy ndarrays as arguments

– It automatically wraps the arrays in Variable

– Users do not have to wrap arrays in Variable manually
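A minimal sketch of what this looks like in practice (F.relu is just an example):

import numpy as np
import chainer.functions as F

x = np.random.randn(4, 3).astype(np.float32)
y = F.relu(x)  # x is wrapped into a Variable automatically; y is a Variable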


Page 6: Chainer Update v1.8.0 -> v1.10.0+

Many Functions/Links have been added

Variable.__getitem__ (F.get_item)

– Variable now supports basic indexing (as in NumPy) with backprop

– It supports integer indexing, slice indexing, newaxis, and Ellipsis

– E.g., slice indexing can be used to crop feature maps (see the sketch below)

Array manipulation: cast, clip, log1p, expm1, logsumexp, minimum, permutate

NN elements: huber_loss, hard_sigmoid, roi_pooling_2d, StatelessLSTM, Bias, Scale

There are also many updates to existing Functions/Links (new options, bug fixes, etc.)

See the recent release notes for the full list of new Functions/Links: https://github.com/pfnet/chainer/releases
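As an example of the new indexing, cropping feature maps with slices might look like this sketch (the shapes and values are arbitrary):

import numpy as np
import chainer

# Feature maps of shape (batch, channels, height, width)
x = chainer.Variable(np.random.randn(1, 3, 32, 32).astype(np.float32))

# Crop a 24x24 region; gradients flow through the slice on backprop
y = x[:, :, 4:28, 4:28]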


Page 7: Chainer Update v1.8.0 -> v1.10.0+

New CuPy functions

cupy.nonzero

– Enumerates the indices of non-zero elements

– It is implemented with an inclusive scan kernel in CUDA

– We plan to support a wider range of routines that require the scan kernel (like cumsum)

In parallel, some routines using nonzero have been added: cupy.ix_, cupy.flatnonzero

Profiling API (cupy.cuda.profile, cupy.cuda.profiler) to enable CUDA profile collection only in a specified range of code
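A small sketch of these routines (the profiler calls assume the cupy.cuda.profiler start/stop API named above):

import cupy as cp

x = cp.array([0, 2, 0, 3])
idx = cp.nonzero(x)               # (array([1, 3]),): indices of non-zero elements
vals = x.take(cp.flatnonzero(x))  # array([2, 3])

# Collect a CUDA profile only around the code of interest.
cp.cuda.profiler.start()
y = cp.nonzero(x)
cp.cuda.profiler.stop()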


Page 8: Chainer Update v1.8.0 -> v1.10.0+

Issue: slow merge of PRs. Solution: more minor releases

As I noted, there are now many PRs coming to the Chainer repository

The current release cycle of one minor update every 6 weeks is too slow

We decided to make minor releases more frequently

Until v1.9.0:

Revision release (without new features) every 2 weeks

Minor release (with new features) every 6 weeks

From v1.10.0:

A release every 2 weeks

Any release can contain new features (we increment the minor version in that case)


Page 9: Chainer Update v1.8.0 -> v1.10.0+

Planned updates after v1.10.0


Page 10: Chainer Update v1.8.0 -> v1.10.0+

Planned big features for upcoming releases (v1.11-12)

Dataset and Trainer (explained below)

cuDNN RNN support

Theano function support (use a Theano function as a Chainer Function)

Asynchronous to_gpu

Page 11: Chainer Update v1.8.0 -> v1.10.0+


Dataset and Trainer

Dataset: Abstraction of iterations over datasets

Trainer: Abstraction of training loops


[Diagram: a Trainer contains an Updater and Extensions; the Updater drives one or more Optimizers (each holding a target Link) and Iterators (each reading a Dataset). We often use only one optimizer and one dataset; the diagram shows the general case.]

Page 12: Chainer Update v1.8.0 -> v1.10.0+

Trainer

Calls the Updater and Extensions on every iteration

Updater

– Fetches a mini-batch using an Iterator, and updates parameters using an Optimizer

– You can customize the update routine

– Built-in updaters: StandardUpdater, ParallelUpdater (under review; ParallelUpdater provides an easy way to do data-parallel training)

Extension

– It adds an extra routine to the training loop

– Basic extensions are built-in: Evaluator, LogReport, PrintReport, ProgressBar, snapshot, snapshot_object, ExponentialDecay, LinearShift, dump_graph

– You can write your own extensions (see the sketch below)
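A hedged sketch of a custom extension, based on the design described on this slide (the released API may differ): any callable that takes the trainer can be registered.

# 'trainer' is a Trainer instance built as in the MNIST example below.
def report_epoch(trainer):
    print('epoch:', trainer.updater.epoch)

trainer.extend(report_epoch, trigger=(1, 'epoch'))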


Page 13: Chainer Update v1.8.0 -> v1.10.0+

Dataset / Iterator

Dataset is just a sequence of data points (a.k.a. examples)

Iterator defines how to iterate over the dataset

Built-in iterators:

– SequentialIterator

– ShuffledIterator

– MultiprocessIterator (you can easily support multiprocess preprocessing with it)
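Since a dataset is just a sequence, a plain Python list of examples is enough. A minimal sketch (the data values are arbitrary):

import numpy as np

# A dataset: a sequence of (input, label) examples.
dataset = [(np.random.randn(784).astype(np.float32), np.int32(i % 10))
           for i in range(1000)]

# An iterator (e.g. ShuffledIterator above) then yields mini-batches
# from such a sequence.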


Page 14: Chainer Update v1.8.0 -> v1.10.0+

Reporter: an easy way to report observations

Trainer uses Reporter to collect observations (e.g. loss value, accuracy, activation statistics, etc.)

Example (simple Classifier):

import chainer
import chainer.functions as F

class Classifier(chainer.Chain):

    def __init__(self, predictor):
        super(Classifier, self).__init__(predictor=predictor)

    def __call__(self, x, t):
        y = self.predictor(x)
        loss = F.softmax_cross_entropy(y, t)
        accuracy = F.accuracy(y, t)
        chainer.report({'loss': loss, 'accuracy': accuracy},
                       observer=self)
        return loss


Page 15: Chainer Update v1.8.0 -> v1.10.0+

MNIST Example


[Code screenshot: a Trainer-based MNIST script, annotated with its parts: model and optimizer, dataset and iterators, updater and trainer, extensions, and launching the training loop]
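To give a rough idea of the script in the screenshot, here is a sketch of a Trainer-based MNIST setup following the design and names on these slides (Classifier is the one from the previous page). The Trainer API had not been released at the time of this talk, so the dataset loader, module paths, and iterator names used here are assumptions and may differ in the released version.

import chainer
import chainer.links as L
from chainer import training
from chainer.training import extensions

# Model and Optimizer (Classifier as defined on the previous slide)
model = Classifier(L.Linear(784, 10))
optimizer = chainer.optimizers.SGD()
optimizer.setup(model)

# Dataset and Iterators (loader and iterator names are assumptions)
train, test = chainer.datasets.get_mnist()
train_iter = chainer.iterators.ShuffledIterator(train, batch_size=100)
test_iter = chainer.iterators.SequentialIterator(test, batch_size=100)

# Updater and Trainer
updater = training.StandardUpdater(train_iter, optimizer)
trainer = training.Trainer(updater, (20, 'epoch'), out='result')

# Extensions
trainer.extend(extensions.Evaluator(test_iter, model))
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(
    ['epoch', 'main/loss', 'main/accuracy',
     'validation/main/loss', 'validation/main/accuracy']))
trainer.extend(extensions.ProgressBar())

# Launch training loop
trainer.run()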

Page 16: Chainer Update v1.8.0 -> v1.10.0+

Note on Trainer

If your training workflow is very different from standard ones, you can still write your own training loop

We recommend using Trainer for newly written training scripts

– Most use cases are covered by Trainer

– You can flexibly customize each component of Trainer


Page 17: Chainer Update v1.8.0 -> v1.10.0+

We are planning the first major version upgrade!

We plan to release it this autumn (Oct.–Nov.)

It will break backward compatibility

Planned features

Split CuPy into a separate repository/package

Plugin system and service (we need discussions with you!): share Functions/Links/etc. easily with other users without sending PRs

Backprop as a graph (i.e., gradients of expressions that themselves include gradients)

Parameter shape inference (without specifying "input size")

Make CPU/GPU transfers asynchronous by default

Parameter annotation
