Frameworks - UMIACS

27
Frameworks Natural Language Processing: Jordan Boyd-Graber University of Maryland INTRODUCTION Slides adapted from Chris Dyer, Yoav Goldberg, Graham Neubig Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 1/1

Transcript of Frameworks - UMIACS

Page 1: Frameworks - UMIACS

Frameworks

Natural Language Processing: JordanBoyd-GraberUniversity of MarylandINTRODUCTION

Slides adapted from Chris Dyer, Yoav Goldberg, Graham Neubig

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 1 / 1

Page 2: Frameworks - UMIACS

Neural Nets and Language

Language

Discrete, structured (graphs, trees)

Neural-Nets

Continuous: poor native support forstructure

Big challenge: writing code that translates between the{discrete-structured, continuous} regimes

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 2 / 1

Page 3: Frameworks - UMIACS

Why not do it yourself?

� Hard to compare with exting models

� Obscures difference between model and optimization

� Debugging has to be custom-built

� Hard to tweak model

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 3 / 1

Page 4: Frameworks - UMIACS

Outline

� Computation graphs (general)

� Neural Nets in PyTorch

� Full example

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 4 / 1

Page 5: Frameworks - UMIACS

Computation Graphs

Expression

~x

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1

Page 6: Frameworks - UMIACS

Computation Graphs

Expression

~x>

� Edge: function argument / data dependency

� A node with an incoming edge is a function F ≡ f (u ) edge’s tail node

� A node computes its value and the value of its derivative w.r.t eachargument (edge) times a derivative ∂ f

∂ u

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1

Page 7: Frameworks - UMIACS

Computation Graphs

Expression

~x>A

Functions can be nullary, unary, binary, . . . n-ary. Often they are unary orbinary.

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1

Page 8: Frameworks - UMIACS

Computation Graphs

Expression

~x>Ax

Computation graphs are (usually) directed and acyclic

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1

Page 9: Frameworks - UMIACS

Computation Graphs

Expression

~x>Ax

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1

Page 10: Frameworks - UMIACS

Computation Graphs

Expression

~x>Ax + b · ~x + c

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1

Page 11: Frameworks - UMIACS

Computation Graphs

Expression

y = ~x>Ax + b · ~x + c

Variable names label nodesNatural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1

Page 12: Frameworks - UMIACS

Algorithms

� Graph construction� Forward propagation� Loop over nodes in topological order� Compute the value of the node given its inputs� Given my inputs, make a prediction (i.e. “error” vs. “target output”)

� Backward propagation� Loop over the nodes in reverse topological order, starting with goal node� Compute derivatives of final goal node value wrt each edge’s tail node� How does the output change with small change to inputs?

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 6 / 1

Page 13: Frameworks - UMIACS

Forward Propagation

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1

Page 14: Frameworks - UMIACS

Forward Propagation

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1

Page 15: Frameworks - UMIACS

Forward Propagation

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1

Page 16: Frameworks - UMIACS

Forward Propagation

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1

Page 17: Frameworks - UMIACS

Forward Propagation

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1

Page 18: Frameworks - UMIACS

Forward Propagation

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1

Page 19: Frameworks - UMIACS

Forward Propagation

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1

Page 20: Frameworks - UMIACS

Forward Propagation

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1

Page 21: Frameworks - UMIACS

Constructing Graphs

Static declaration

� Define architecture, run datathrough

� PROS: Optimization, hardwaresupport

� CONS: Structured data ugly,graph language

Theano, Tensorflow

Dynamic declaration

� Graph implicit with data

� PROS: Native language,interleave construction/evaluation

� CONS: Slower, computation canbe wasted

Chainer, Dynet, PyTorch

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 8 / 1

Page 22: Frameworks - UMIACS

Constructing Graphs

Static declaration

� Define architecture, run datathrough

� PROS: Optimization, hardwaresupport

� CONS: Structured data ugly,graph language

Theano, Tensorflow

Dynamic declaration

� Graph implicit with data

� PROS: Native language,interleave construction/evaluation

� CONS: Slower, computation canbe wasted

Chainer, Dynet, PyTorch

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 8 / 1

Page 23: Frameworks - UMIACS

Language is Hierarchical

Page 24: Frameworks - UMIACS

Dynamic Hierarchy in Language

� Language is hierarchical� Graph should reflect this reality� Traditional flow-control best for processing

� Combinatorial algorithms (e.g., dynamic programming)

� Exploit independencies to compute over a large space of operationstractably

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 10 / 1

Page 25: Frameworks - UMIACS

PyTorch

� Torch: Facebook’s deep learning framework

� Nice, but written in Lua (C backend)

� Optimized to run computations on GPU

� Mature, industry-supported framework

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 11 / 1

Page 26: Frameworks - UMIACS

Why GPU?

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 12 / 1

Page 27: Frameworks - UMIACS

Why GPU?

CPU

GPUHDD

Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 12 / 1