Analogs of Linguistic Structure in Deep Representations

Jacob Andreas and Dan Klein
Computer Science Division
University of California, Berkeley
jda,[email protected]

Abstract

We investigate the compositional structure of message vectors computed by a deep network trained on a communication game. By comparing truth-conditional representations of encoder-produced message vectors to human-produced referring expressions, we are able to identify aligned (vector, utterance) pairs with the same meaning. We then search for structured relationships among these aligned pairs to discover simple vector space transformations corresponding to negation, conjunction, and disjunction. Our results suggest that neural representations are capable of spontaneously developing a “syntax” with functional analogues to qualitative properties of natural language.¹

1 Introduction

The past year has seen a renewal of interest in end-to-end learning of communication strategies between pairs of agents represented with deep networks (Wagner et al., 2003). Approaches of this kind make it possible to learn decentralized policies from scratch (Foerster et al., 2016; Sukhbaatar et al., 2016), with multiple agents coordinating via a learned communication protocol. More generally, any encoder–decoder model (Sutskever et al., 2014) can be viewed as implementing an analogous communication protocol, with the input encoding playing the role of a message in an artificial “language” shared by the encoder and decoder (Yu et al., 2016). Earlier work has found that under suitable conditions, these protocols acquire simple interpretable lexical (Dircks and Stoness, 1999; Lazaridou et al., 2016) and sequential structure (Mordatch and Abbeel, 2017), even without natural language training data.

1 Code and data are available at http://github.com/jacobandreas/rnn-syn.

[Figure 1 (diagram). Component labels: Human Speaker, Logical Evaluator, RNN Encoder, MLP Decoder, Truth-conditional repr. Example expression e: “everything but the blue squares”, with logical form λx.¬(square(x) ∧ blue(x)). Quantities shown: $\llbracket e \rrbracket_W$, $\llbracket f \rrbracket_W$, $\llbracket e \rrbracket_{W'}$, $\llbracket f \rrbracket_{W'}$ for worlds W and W′.]

Figure 1: Overview of our task. Given a dataset of referring expression games, example human expressions, and their associated logical forms, we compute explicit denotations both for the original task and in other possible tasks, giving rise to a truth-conditional representation of the natural language. We train a recurrent encoder–decoder model to solve the same tasks directly, and use the decoder to generate comparable truth-conditional representations of neural encodings.

One of the distinguishing features of natural language is compositionality: the existence of operations like negation and coordination that can be applied to utterances with predictable effects on meaning. RNN models trained for natural language processing tasks have been found to learn representations that encode some of this compositional structure. For example, sentence representations for machine translation encode explicit features for certain syntactic phenomena (Shi et al., 2016) and represent some semantic relationships translationally (Levy and Goldberg, 2014). It is thus natural to ask whether these “language-like” structures also arise spontaneously in models trained directly from an environment signal. Rather than using language as a form of supervision, we propose to use it as a probe, exploiting post-hoc statistical correspondences between natural language descriptions and neural encodings to discover regular structure in representation space.

arXiv:1707.08139v1 [cs.CL] 25 Jul 2017

To do this, we need to find (vector, string) pairs with matching semantics, which requires first aligning unpaired examples of human–human communication with network hidden states. This is similar to the problem of “translating” RNN representations recently investigated in Andreas et al. (2017). Here we build on that approach in order to perform a detailed analysis of compositional structure in learned “languages”. We investigate a communication game previously studied by FitzGerald et al. (2013), and make two discoveries: in a model trained without any access to language data,

1. The strategies employed by human speakers in a given communicative context are surprisingly good predictors of RNN behavior in the same context: humans and RNNs send messages whose interpretations agree on nearly 90% of object-level decisions, even outside the contexts in which they were produced.

2. Interpretable language-like structure naturally arises in the space of representations. We identify geometric regularities corresponding to negation, conjunction, and disjunction, and show that it is possible to linearly transform representations in ways that approximately correspond to these logical operations.

2 Task

We focus our evaluation on a communication game due to FitzGerald et al. (2013) (Figure 1, top). In this game, the speaker observes (1) a world W of 1–20 objects labeled with attributes and (2) a designated target subset X of objects in the world. The listener observes only W, and the speaker’s goal is to communicate a representation of X that enables the listener to accurately reconstruct it. The GENX dataset collected for this purpose contains 4170 human-generated natural-language referring expressions and corresponding logical forms for 273 instances of this game. Because these human-generated expressions have all been pre-annotated, we treat language and logic interchangeably and refer to both with the symbol e. We write e(W) for the expression generated by a human for a particular world W, and $\llbracket e \rrbracket_W$ for the result of evaluating the logical form e against W.
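To make the notation concrete, the following is a minimal sketch of evaluating a logical form against a world. The encoding is an assumption made for illustration and is not the GENX format: objects are dictionaries with hypothetical `shape` and `color` attributes, and a logical form is an ordinary Python predicate.

```python
# Hypothetical encoding (not the GENX format): an object is a dict of
# attribute values, a world is a list of objects, and a logical form is
# a Python predicate over a single object.

def denotation(logical_form, world):
    """Evaluate a logical form against a world: one boolean per object."""
    return [bool(logical_form(obj)) for obj in world]


def everything_but_blue_squares(obj):
    """The logical form λx.¬(square(x) ∧ blue(x))."""
    return not (obj["shape"] == "square" and obj["color"] == "blue")


world = [
    {"shape": "square", "color": "blue"},
    {"shape": "square", "color": "red"},
    {"shape": "circle", "color": "blue"},
]

# The denotation picks out every object except the blue square.
target = denotation(everything_but_blue_squares, world)
print(target)  # [False, True, True]
```

The returned list plays the role of the target subset X, written as one membership decision per object in W.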

We are interested in using language data of this kind to analyze the behavior of a deep model trained to play the same game. We focus our analysis on a standard RNN encoder–decoder, with the encoder playing the role of the speaker and the decoder playing the role of the listener. The encoder is a single-layer RNN with GRU cells (Cho et al., 2014) that consumes both the input world and target labeling and outputs a 64-dimensional hidden representation. We write f(W) for the output of this encoder model on a world W. To make predictions, this representation is passed to a decoder implemented as a multilayer perceptron. The decoder makes an independent labeling decision about every object in W (taking as input both f and a feature representation of a particular object $W_i$). We write $\llbracket f \rrbracket_W$ for the full vector of decoder outputs on W. We train the model to maximize classification accuracy on randomly-generated scenes and target sets of the same form as in the GENX dataset.
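The listener side of this architecture can be sketched as follows. This is only an illustration of the per-object decision structure: the weights are random and untrained, and the feature size, hidden width, and tanh activation are assumptions rather than details given in the text. Only the 64-dimensional message size comes from the description above.

```python
import math
import random

MESSAGE_DIM = 64  # from the text: the encoder outputs a 64-dimensional vector
FEATURE_DIM = 4   # assumed size of the per-object feature vector
HIDDEN_DIM = 8    # assumed hidden width of the MLP


def init_params(seed=0):
    """Random, untrained weights; a trained model would learn these."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "w1": matrix(HIDDEN_DIM, MESSAGE_DIM + FEATURE_DIM),
        "w2": [rng.gauss(0.0, 0.1) for _ in range(HIDDEN_DIM)],
    }


def decode_object(params, message, features):
    """Label one object from the message f and that object's features."""
    x = list(message) + list(features)
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in params["w1"]]
    logit = sum(w * h for w, h in zip(params["w2"], hidden))
    return logit > 0.0


def decode_world(params, message, world_features):
    """Make an independent decision for every object in the world."""
    return [decode_object(params, message, feats) for feats in world_features]


params = init_params()
message = [0.0] * MESSAGE_DIM
world_features = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
labels = decode_world(params, message, world_features)
```

Because each object is scored separately, the same message can be decoded against any world, which is what the analysis in the following sections relies on.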

3 Approach

We are not concerned with the RNN model’s raw performance on this task (it achieves nearly perfect accuracy). Instead, our goal is to explore what kinds of messages the model computes in order to achieve this accuracy, and specifically whether these messages contain high-level semantics and low-level structure similar to the referring expressions produced by humans. But how do we judge semantic equivalence between natural language and vector representations? Here, as in Andreas et al. (2017), we adopt an approach inspired by formal semantics, and represent the meaning of messages via their truth conditions (Figure 1).

For every problem instance W in the dataset, we have access to one or more human messages e(W) as well as the RNN encoding f(W). The truth-conditional account of meaning suggests that we should judge e and f to be equivalent if they designate the same set of objects in the world (Davidson, 1967). But it is not enough to compare their predictions solely in the context where they were generated (testing whether $\llbracket e \rrbracket_W = \llbracket f \rrbracket_W$), because any pair of models that achieve perfect accuracy on the referring expression task will make the same predictions in this initial context, regardless of the meaning conveyed.

Instead, we sample a collection of alternative worlds $\{W_i\}$ observed elsewhere in the dataset, and compute a tabular meaning representation $\mathrm{rep}(e) = \{\llbracket e \rrbracket_{W_i}\}$ by evaluating e in each world $W_i$. We similarly compute $\mathrm{rep}(f) = \{\llbracket f \rrbracket_{W_i}\}$, allowing the learned decoder model to play the role of logical evaluation for message vectors. For logically equivalent messages, these tabular representations are guaranteed to be identical, so the sampling procedure can be viewed as an approximate test of equivalence. It additionally allows us to compute softer notions of equivalence by measuring agreement on individual worlds or objects.

Subset  Theory   Objects  Worlds  Tables
All     Random   0.50     0.00    0.00
All     Literal  0.74     0.27    0.05
All     Human    0.92     0.63    0.35

Table 1: Agreement with predicted model behavior for the high-level semantic correspondence task, computed for objects (single entries in tabular representation), worlds (rows), and full tables. Referring expressions e generated by humans in a single communicative context are highly predictive of how learned representations f will be interpreted by the decoder across multiple contexts.
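The tabular representation and the three levels of agreement described above can be sketched as follows. This is a minimal illustration under assumed names (`rep`, `agreement`) and an assumed toy encoding in which a world is a list of color strings; it scores a single pair of tables, whereas the figures reported in the paper are aggregated over many messages.

```python
def rep(interpret, worlds):
    """Tabular meaning representation: one row of labels per sampled world.

    `interpret` maps a world to a list of booleans. It stands in for either
    logical evaluation of e or the decoder applied to a fixed message f.
    """
    return [list(interpret(world)) for world in worlds]


def agreement(table_a, table_b):
    """Agreement of two same-shaped tables at object, world, and table level."""
    object_hits = 0
    object_total = 0
    world_hits = 0
    for row_a, row_b in zip(table_a, table_b):
        object_hits += sum(a == b for a, b in zip(row_a, row_b))
        object_total += len(row_a)
        world_hits += int(row_a == row_b)
    return {
        "objects": object_hits / object_total,
        "worlds": world_hits / len(table_a),
        "table": float(table_a == table_b),
    }


# Toy worlds: each object is represented only by its color.
worlds = [["red", "blue"], ["red", "green"], ["blue", "blue"]]


def is_red(world):
    return [color == "red" for color in world]


def is_not_blue(world):
    return [color != "blue" for color in world]


scores = agreement(rep(is_red, worlds), rep(is_not_blue, worlds))
```

In this toy case the two interpretations differ only on the green object, so they agree on 5 of 6 objects and 2 of 3 worlds but not on the full table.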

4 Interpreting the meaning of messages

We begin with the simplest question we can answer with this tool: how often do the messages generated by the encoder model have the same meaning as messages generated by humans for the same context? Again, our goal is not to evaluate the performance of the RNN model, but instead our ability to understand its behavior. Does it send messages with human-like semantics? Is it more explicit? Or does it behave in a way indistinguishable from a random classifier?

For each scene in the GENX test set, we compute the model-generated message f and its tabular representation rep(f), and measure the extent to which this agrees with representations produced by three “theories” of model behavior (Figure 2): (1) a random theory that accepts or rejects objects with uniform probability, (2) a literal theory that predicts membership only for objects that exactly match some object in the original target set, and (3) a human theory that predicts according to the most frequent logical form associated with natural language descriptions of the target set (as described in the preceding section). We evaluate agreement at the level of individual objects, worlds, and full tabular meaning representations.
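The three theories can be sketched as simple predictors over a new world. The object encoding as (color, shape) tuples and all function names are assumptions made for illustration. Selecting the most frequent logical form among several human descriptions is omitted, so the human theory here takes a single predicate as given.

```python
import random


def random_theory(new_world, rng):
    """Accept or reject every object with uniform probability."""
    return [rng.random() < 0.5 for _ in new_world]


def literal_theory(initial_world, initial_targets, new_world):
    """Accept only objects that exactly match an object in the original target set."""
    targets = [obj for obj, is_target in zip(initial_world, initial_targets) if is_target]
    return [obj in targets for obj in new_world]


def human_theory(logical_form, new_world):
    """Accept objects according to the logical form of the human description."""
    return [bool(logical_form(obj)) for obj in new_world]


# Initial observation: the targets are "everything but the blue squares".
initial_world = [("red", "circle"), ("blue", "square"), ("green", "triangle")]
initial_targets = [True, False, True]
new_world = [("red", "circle"), ("red", "square"), ("blue", "square")]


def not_blue_square(obj):
    color, shape = obj
    return not (color == "blue" and shape == "square")


random_labels = random_theory(new_world, random.Random(0))
literal_labels = literal_theory(initial_world, initial_targets, new_world)
human_labels = human_theory(not_blue_square, new_world)
```

The red square separates the two non-random theories: it never appeared in the initial target set, so the literal theory rejects it, while the human theory accepts it because it is not a blue square.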

Results are shown in Table 1. Model behavior is well explained by human decisions in the same context: object-level decisions can be predicted with close to 90% accuracy based on human judgments alone, and a third of message pairs agree exactly in every sampled scene, providing strong evidence that they carry the same semantics.

These results suggest that the model has learned a communication strategy that is at least superficially language-like: it admits representations of the same kinds of communicative abstractions that humans use, and makes use of these abstractions with some frequency. But this is purely a statement about the high-level behavior of the model, and not about the structure of the space of representations. Our primary goal is to determine whether this behavior is achieved using low-level structural regularities in vector space that can themselves be associated with aspects of natural language communication.

5 Interpreting the structure of messages

For this we turn to a focused investigation of three specific logical constructions used in natural language: a unary operation (negation) and two binary operations (conjunction and disjunction). All are used in the training data, with a variety of scopes (e.g. “all green objects that are not a triangle”, “all the pieces that are not tan arches”).

[Figure 2 (diagram). Example utterance: “everything but the blue squares”. Panels (a)–(f), with labels: initial obs., alternative, decoder pred., random theory, literal theory, human theory.]

Figure 2: Evaluating theories of model behavior. First, the encoder is run on an initial world (a), producing a representation whose meaning we would like to understand (see Figure 1). We then observe the behavior of the decoder holding this representation fixed but replacing the underlying world representation with alternatives like (b). We compare the true decoder output to a number of theories of its behavior. The random theory (d) outputs a random decision for every object. The literal theory (e) predicts that the decoder will output a positive label only on those objects that exactly match some object in the initial observation. The human theory (f) assigns labels according to the logical semantics of the utterance produced by a human presented with the initial observation.

Task   Theory       Objects  Worlds  Tables
Neg.   Random       0.50     0.00    0.00
Neg.   Literal      0.50     0.12    0.03
Neg.   Negation     0.97     0.81    0.45
Disj.  Random       0.50     0.00    0.00
Disj.  Literal      0.58     0.09    0.01
Disj.  Disjunction  0.92     0.54    0.19
Conj.  Random       0.50     0.00    0.00
Conj.  Literal      0.81     0.19    0.01
Conj.  Conjunction  0.90     0.56    0.37

Table 2: Agreement with predicted model behavior for negation, disjunction, and conjunction tasks (top to bottom). Evaluation is performed on transformed message vectors as described in Section 5. We discover a robust linear transformation of message vectors corresponding to negation, as well as evidence of structured representations of binary operations.

Because humans often find it useful to specify the target set by exclusion rather than inclusion, we first hypothesize that the RNN language might find it useful to incorporate some mechanism corresponding to negation, and that messages can be predictably “negated” in vector space. To test this hypothesis, we first collect examples of the form (e, f, e′, f′), where e′ = ¬e, rep(e) = rep(f), and rep(e′) = rep(f′). In other words, we find pairs of RNN representations f and f′ for which the natural language messages (e, e′) serve as a denotational certificate that f′ behaves as a negation of f. If the learned model does not have any kind of primitive notion of negation, we expect that it will not be possible to find any kind of predictable relationship between pairs (f, f′). (As an extreme example, we could imagine every possible prediction rule being associated with a different point in the representation space, with the correspondence between position and behavior essentially random.) Conversely, if there is a first-class notion of negation, we should be able to select an arbitrary representation vector f with an associated referring expression e, apply some transformation N to f, and be able to predict a priori how the decoder model will interpret the representation Nf, i.e. in correspondence with ¬e.
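The collection of (e, f, e′, f′) examples described above can be sketched as a lookup on tabular representations. The data layout here is an assumption for illustration: expressions are identified by name, the negation relation between them is given explicitly, and exact table equality is used as the matching criterion.

```python
from collections import defaultdict


def freeze(table):
    """Make a table hashable so that it can be used as a dictionary key."""
    return tuple(tuple(row) for row in table)


def collect_negation_examples(expression_tables, negations, vector_tables):
    """Return (e, f, e', f') tuples with e' = ¬e, rep(e) = rep(f), rep(e') = rep(f').

    expression_tables: dict from expression name to its table rep(e)
    negations: iterable of (e, e') name pairs, with e' the negation of e
    vector_tables: list of (vector, table) pairs, where table is rep(f)
    """
    by_table = defaultdict(list)
    for vector, table in vector_tables:
        by_table[freeze(table)].append(vector)

    examples = []
    for e, e_neg in negations:
        for f in by_table.get(freeze(expression_tables[e]), []):
            for f_neg in by_table.get(freeze(expression_tables[e_neg]), []):
                examples.append((e, f, e_neg, f_neg))
    return examples


# Toy data: one sampled world with two objects, 2-dimensional messages.
expression_tables = {"red": [[True, False]], "not red": [[False, True]]}
negations = [("red", "not red")]
vector_tables = [
    ((0.9, 0.1), [[True, False]]),
    ((-1.0, 0.0), [[False, True]]),
    ((0.3, 0.3), [[True, True]]),  # matches neither expression
]

examples = collect_negation_examples(expression_tables, negations, vector_tables)
```

Only the vector pairs (f, f′) from the returned tuples are needed to fit an operator. The expressions act purely as the certificate that f′ behaves as a negation of f.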

[Figure 3: two scatter plots of message vectors projected onto principal components. Panel (a) cluster labels: λx.red(x), λx.¬red(x), λx.green(x) ∨ blue(x), λx.¬(green(x) ∨ blue(x)). Panel (b) cluster labels: λx.red(x), λx.yellow(x), λx.blue(x), λx.red(x) ∨ yellow(x), λx.blue(x) ∨ yellow(x), λx.red(x) ∨ blue(x).]

Figure 3: Principal components of structured message transformations discovered by our experiments. (a) Negation: black and white dots show raw message vectors denotationally equivalent to the provided logical cluster label (Section 3). Red dots show the result of transforming black dots with the estimated negation operation N. (b) The corresponding experiment for disjunction using the transformation M.

Here we make the strong assumption that the negation operation is not only predictable but linear. Previous work has found that linear operators are powerful enough to capture many hierarchical and relational structures (Paccanaro and Hinton, 2002; Bordes et al., 2014). Using examples (f, f′) collected from the training set as described above, we compute the least-squares estimate $N = \arg\min_N \sum \lVert N f - f' \rVert_2^2$. To evaluate, we collect example representations from the test set that are equivalent to known logical forms, and measure how frequently model behaviors rep(Nf) agree with the logical predictions rep(¬e); in other words, how often the linear operator N actually corresponds to logical negation. Results are shown in the top portion of Table 2. Correspondence with the logical form is quite high, resulting in 97% agreement at the level of individual objects and 45% agreement on full representations. We conclude that the estimated linear operator N is analogous to negation in natural language. Indeed, the behavior of this operator is readily visible in Figure 3: predicted negated forms (in red) lie close in vector space to their true values, and negation corresponds roughly to mirroring across a central point.
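The least-squares estimate of N can be sketched in closed form through the normal equations, N (Σ f fᵀ) = Σ f′ fᵀ. The sketch below uses plain Python lists and a small Gauss-Jordan solver so that it runs without dependencies. The function names and the synthetic data (where f′ = −f, a mirroring through the origin) are assumptions for illustration and are not the paper's data or code.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.

    `a` is n x n and `b` is n x m, both as lists of rows; returns x as n x m.
    """
    n = len(a)
    aug = [list(row_a) + list(row_b) for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: the examples do not span the space")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_operator(inputs, targets):
    """Least-squares N minimizing the sum of ||N f - f'||^2 over (f, f') pairs."""
    d = len(inputs[0])
    # Normal equations: N A = B with A = sum f f^T and B = sum f' f^T.
    a = [[sum(f[i] * f[j] for f in inputs) for j in range(d)] for i in range(d)]
    b = [[sum(t[i] * f[j] for f, t in zip(inputs, targets)) for j in range(d)]
         for i in range(d)]
    # A is symmetric, so N^T is the solution of A X = B^T.
    b_t = [[b[j][i] for j in range(d)] for i in range(d)]
    n_t = solve(a, b_t)
    return [[n_t[j][i] for j in range(d)] for i in range(d)]


def apply_operator(operator, vector):
    """Matrix-vector product N f."""
    return [sum(w * v for w, v in zip(row, vector)) for row in operator]


# Synthetic pairs in which each f' is f mirrored through the origin.
inputs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
targets = [(-1.0, 0.0), (0.0, -1.0), (-1.0, -1.0)]

N = fit_operator(inputs, targets)
predicted = apply_operator(N, (2.0, -3.0))
```

On this synthetic data the fitted operator is the negative identity, so an unseen vector is mapped to its mirror image. With real 64-dimensional messages the system is only solvable when the collected examples span the representation space.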

In our final experiment, we explore whether the same kinds of linear maps can be learned for the binary operations of conjunction and disjunction. As in the negation experiment, we collect examples from the training data of representations whose denotations are known to correspond to groups of logical forms in the desired relationship: in this case tuples (e, f, e′, f′, e′′, f′′), where rep(e) = rep(f), rep(e′) = rep(f′), rep(e′′) = rep(f′′) and either e′′ = e ∧ e′ (conjunction) or e′′ = e ∨ e′ (disjunction). Since we expect that our operator will be symmetric in its arguments, we solve for $M = \arg\min_M \sum \lVert M f + M f' - f'' \rVert_2^2$.
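Because M is applied to both arguments, the objective depends on f and f′ only through their sum, so the estimate reduces to an ordinary least-squares problem. The following is a sketch of that reduction. The example index k and the shorthand $s_k$ are notation introduced here, and the closed form assumes that $\sum_k s_k s_k^\top$ is invertible.

```latex
\begin{aligned}
s_k &= f_k + f'_k, \\
J(M) &= \sum_k \lVert M f_k + M f'_k - f''_k \rVert_2^2
      = \sum_k \lVert M s_k - f''_k \rVert_2^2, \\
\nabla_M J &= 2 \sum_k \left( M s_k - f''_k \right) s_k^\top = 0
\;\Longrightarrow\;
M = \Big( \sum_k f''_k s_k^\top \Big) \Big( \sum_k s_k s_k^\top \Big)^{-1}.
\end{aligned}
```

This form makes the symmetry explicit: swapping f and f′ leaves $s_k$, and therefore the estimate, unchanged.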

Results are shown in the bottom portions of Table 2. Correspondence between the behavior predicted by the contextual logical form and the model’s actual behavior is less tight than for negation. At the same time, the estimated operators are clearly capturing some structure: in the case of disjunction, for example, model interpretations are correctly modeled by the logical form 92% of the time at the object level and 19% of the time at the denotation level. This suggests that the operations of conjunction and disjunction do have some functional counterparts in the RNN language, but that these functions are not everywhere well approximated as linear.

6 Conclusions

Building on earlier tools for identifying neural codes with natural language strings, we have presented a technique for exploring compositional structure in a space of vector-valued representations. Our analysis of an encoder–decoder model trained on a reference game identified a number of language-like properties in the model’s representation space, including transformations corresponding to negation, disjunction, and conjunction. One major question left open by this analysis is what happens when multiple transformations are applied hierarchically, and future work might focus on extending the techniques in this paper to explore recursive structure. We believe our experiments so far highlight the usefulness of a denotational perspective from formal semantics when interpreting the behavior of deep models.

References

Jacob Andreas, Anca Dragan, and Dan Klein. 2017. Translating neuralese. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Antoine Bordes, Sumit Chopra, and Jason Weston. 2014. Question answering with subgraph embeddings. arXiv preprint arXiv:1406.3676.

Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259.

Donald Davidson. 1967. Truth and meaning. Synthese 17(1):304–323.

Christopher Dircks and Scott Stoness. 1999. Effective lexicon change in the absence of population flux. Advances in Artificial Life pages 720–724.

Nicholas FitzGerald, Yoav Artzi, and Luke Zettlemoyer. 2013. Learning distributions over logical forms for referring expression generation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Jakob Foerster, Yannis M Assael, Nando de Freitas, and Shimon Whiteson. 2016. Learning to communicate with deep multi-agent reinforcement learning. In Advances in Neural Information Processing Systems. pages 2137–2145.

Angeliki Lazaridou, Nghia The Pham, and Marco Baroni. 2016. Towards multi-agent communication-based language learning. arXiv preprint arXiv:1605.07133.

Omer Levy and Yoav Goldberg. 2014. Linguistic regularities in sparse and explicit word representations. In Proceedings of the Conference on Computational Natural Language Learning. pages 171–180.

Igor Mordatch and Pieter Abbeel. 2017. Emergence of grounded compositional language in multi-agent populations. arXiv preprint arXiv:1703.04908.

Alberto Paccanaro and Geoffrey Hinton. 2002. Learning hierarchical structures with linear relational embedding. In Advances in Neural Information Processing Systems. Vancouver, BC, Canada, volume 14, page 857.

Xing Shi, Inkit Padhi, and Kevin Knight. 2016. Does string-based neural MT learn source syntax? In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Sainbayar Sukhbaatar, Rob Fergus, et al. 2016. Learning multiagent communication with backpropagation. In Advances in Neural Information Processing Systems. pages 2244–2252.

Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems. pages 3104–3112.

Kyle Wagner, James A Reggia, Juan Uriagereka, and Gerald S Wilkinson. 2003. Progress in the simulation of emergent communication and language. Adaptive Behavior 11(1):37–69.

Licheng Yu, Hao Tan, Mohit Bansal, and Tamara L Berg. 2016. A joint speaker-listener-reinforcer model for referring expressions. arXiv preprint arXiv:1612.09542.