The Future is not One-dimensional: Complex Event Schema Induction by Graph Modeling for Event Prediction

Manling Li¹, Sha Li¹, Zhenhailong Wang¹, Lifu Huang², Kyunghyun Cho³, Heng Ji¹, Jiawei Han¹, Clare Voss⁴

¹UIUC  ²Virginia Tech  ³NYU  ⁴US ARL
{manling2,shal2,wangz3,hengji,hanj}@illinois.edu,

[email protected], [email protected], [email protected]

Abstract

Event schemas encode knowledge of stereotypical structures of events and their connections. As events unfold, schemas are crucial for acting as a scaffolding. Previous work on event schema induction focuses either on atomic events or linear temporal event sequences, ignoring the interplay between events via arguments and argument relations. We introduce a new concept of Temporal Complex Event Schema: a graph-based schema representation that encompasses events, arguments, temporal connections and argument relations. In addition, we propose a Temporal Event Graph Model that predicts event instances following the temporal complex event schema. To build and evaluate such schemas, we release a new schema learning corpus containing 6,399 documents accompanied with event graphs, and we have manually constructed gold-standard schemas. Intrinsic evaluations by schema matching and instance graph perplexity prove the superior quality of our probabilistic graph schema library compared to linear representations. Extrinsic evaluation on schema-guided future event prediction further demonstrates the predictive power of our event graph model, significantly outperforming human schemas and baselines by more than 23.8% on HITS@1.¹

1 Introduction

The current automated event understanding task has been overly simplified to be local and sequential. Real-world events, such as disease outbreaks and terrorist attacks, have multiple actors, complex timelines, intertwined relations and multiple possible outcomes. Understanding such events requires knowledge in the form of a library of event schemas, capturing the progress of time, and performing global inference for event prediction.

¹The programs, data and resources are made publicly available for research purposes at https://github.com/limanling/temporal-graph-schema.

For example, regarding the 2019 protest at Hong Kong International Airport, a typical question from analysts would be "How long will the flights be canceled?" This requires an event understanding system to match events to schema representations and reason about what might happen next. The airport protest schema would be triggered by "protest" and "flight cancellation", and evidence about the protesters (e.g., the number of protesters, the instruments being used, etc.) will suggest a CEO resignation event, a flight rescheduling event, or continuous flight cancellation events, each with respective probabilities.

Figure 1: The example schema for the complex event type car-bombing. A person learned to make bombs and bought materials as well as a vehicle. Then the bomb was assembled onto the vehicle, and the attacker drove it to attack people. People can be hurt by the vehicle, by the explosion of the bomb, or by the crash of the vehicle.

Comprehending such a news story requires following a timeline, identifying key events and tracking characters. We refer to such a "story" as a complex event, e.g., the Kabul ambulance bombing event. Its complexity comes from the inclusion of multiple atomic events (and their arguments), relations and temporal order. A complex event schema can be used to define the typical structure of a particular type of complex event, e.g., car-bombing. This leads us to the new task that we address in this paper: temporal complex event schema induction.

Figure 1 shows an example schema about car-bombing with multiple temporal dependencies between events. Namely, the occurrence of one event may depend on multiple events. For example, the ASSEMBLE event happens after buying both the bomb materials and the vehicle. Also, there may be multiple events following an event, such as the multiple consequences of the ATTACK event in Figure 1. That is to say, "the future is not one-dimensional". Our automatically induced probabilistic complex event schema can be used to forecast event abstractions into the future and thus provide a comprehensive understanding of evolving situations, events, and trends.

For each type of complex event, we aim to induce a schema library that is probabilistic, temporally organized and semantically coherent. Low-level atomic event schemas are abundant, and can be part of multiple, sparsely occurring, higher-level schemas. We propose a Temporal Event Graph Model, an auto-regressive graph generation model, to reach this goal. Given a currently extracted event graph, we generate the next event type node with its potential arguments, such as the ARREST event in Figure 2, and then propagate edge-aware information following temporal orders. After that, we employ a copy mechanism to generate coreferential arguments, such as the DETAINEE argument being the ATTACKER of the previous ATTACK event, and build relation edges for them, e.g., the PART_WHOLE relation between the PLACE arguments. Finally, temporal dependencies are determined with argument connections taken into account, such as the temporal edge showing that ARREST is after ATTACK.

Our generative model serves as both a schema library and a predictive model. Specifically, we can probe the model to generate event graphs unconditionally to obtain a set of schemas. We can also pass partially instantiated graphs to the model and "grow" the graph either forward or backward in time to predict missing events, arguments or relations, both from the past and in the future. We propose a set of schema matching metrics to evaluate the induced schemas by comparing with human-created schemas, and show the power of the probabilistic schema in the task of future event prediction as an extrinsic evaluation, to predict event types that are likely to happen next.

We make the following novel contributions:

• This is the first work to induce probabilistic temporal graph schemas for complex events across documents, which capture temporal dynamics and connections among individual events through their coreferential or related arguments.
• This is the first application of graph generation methods to induce event schemas.
• This is the first work to use complex event schemas for event type prediction, and also to produce multiple hypotheses with probabilities.
• We have proposed a comprehensive set of metrics for both intrinsic and extrinsic evaluations.
• We release a new data set of 6,399 documents with gold-standard schemas annotated manually.

Symbol          Meaning
G ∈ 𝒢          Instance graph of a complex event
S ∈ 𝒮          Schema graph of a complex event type
e ∈ E           Event node in an instance graph
v ∈ V           Entity node in an instance graph
⟨e_i, e_l⟩      Temporal ordering edge between events e_i and e_l, indicating e_i is before e_l
⟨e_i, a, v_j⟩   Argument edge, indicating v_j plays argument role a in the event e_i
⟨v_j, r, v_k⟩   Relation edge between entities v_j and v_k, where r is the relation type
A(e)            Argument role set of event e, defined by the IE ontology
Φ_E             The type set of events
Φ_V             The type set of entities
φ(·)            A mapping function from a node to its type
G_{<i}          Subgraph of G containing events before e_i and their arguments

Table 1: List of symbols

2 Problem Formulation

From a set of documents describing a complex event, we construct an instance graph G which contains event nodes E and entity nodes (argument nodes) V. There are three types of edges in this graph: (1) event-event edges ⟨e_i, e_l⟩ connecting events that have direct temporal relations; (2) event-entity edges ⟨e_i, a, v_j⟩ connecting arguments to the event; and (3) entity-entity edges ⟨v_j, r, v_k⟩ indicating relations between entities. We can construct instance graphs by applying Information Extraction (IE) techniques to an input text corpus. In these graphs, the relation edges do not have directions, but temporal edges between events are directional, going from the event before to the event after.
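To make the graph definition concrete, the following is a minimal sketch of such an instance graph as a data structure; the class and field names are ours, not the released code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class InstanceGraph:
    """Instance graph G of one complex event (hypothetical container, for illustration only)."""
    event_types: Dict[int, str] = field(default_factory=dict)     # event node id -> event type
    entity_types: Dict[int, str] = field(default_factory=dict)    # entity node id -> entity type
    temporal_edges: List[Tuple[int, int]] = field(default_factory=list)       # <e_i, e_l>: e_i before e_l (directed)
    argument_edges: List[Tuple[int, str, int]] = field(default_factory=list)  # <e_i, a, v_j>: v_j plays role a in e_i
    relation_edges: List[Tuple[int, str, int]] = field(default_factory=list)  # <v_j, r, v_k>: undirected relation r

# A tiny fragment of a car-bombing instance graph.
g = InstanceGraph()
g.event_types = {0: "Attack", 1: "Die"}
g.entity_types = {0: "PER", 1: "PER", 2: "GPE"}
g.temporal_edges = [(0, 1)]                                        # Attack happens before Die
g.argument_edges = [(0, "attacker", 0), (0, "target", 1), (0, "place", 2), (1, "victim", 1)]
```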

For each complex event type, given a set of instance graphs 𝒢, the goal of schema induction is to generate a schema library 𝒮. In each schema graph S, the nodes are abstracted to the types of events and entities. Figure 1 is an example schema² for the complex event type car-bombing.

Figure 2: The generation process of the Temporal Event Graph Model: (1) event generation, (2) edge-aware graph neural network, (3) coreferential argument generation, (4) entity relation edge generation, and (5) event temporal ordering prediction. Starting from the existing graph (an ATTACK event and a DIE event with their PER, LOC and GPE arguments), a new ARREST event is generated with DETAINEE, JAILOR and PLACE arguments; the DETAINEE is copied from the existing ATTACKER node, PART_WHOLE relations are added between PLACE arguments, and a temporal ordering edge is added from ATTACK to ARREST.

Schema graphs can be regarded as a summary abstraction of instance graphs, capturing the recurring structures.

3 Our Approach

3.1 Instance Graph Construction

To induce schemas for a complex event type, such as car-bombing, we construct a set of instance graphs, where each instance graph is about one complex event, such as the Kabul ambulance bombing.

We first identify a cluster of documents that describe the same complex event. In this paper, we treat all documents linked to a single Wikipedia page as belonging to the same complex event, as detailed in §4.1.

We use OneIE, a state-of-the-art Information Extraction system (Lin et al., 2020), to extract entities, relations and events, and then perform cross-document entity (Pan et al., 2015, 2017) and event coreference resolution (Lai et al., 2021) over the document cluster of each complex event. We further conduct event-event temporal relation extraction (Ning et al., 2019; Wen et al., 2021b) to determine the order of event pairs. We run the entire pipeline following (Wen et al., 2021a),³ and the detailed extraction performance is reported in that paper.

²For simplification purposes, we refer to "schema graphs" simply as "schemas", and "events" in schemas are only "event types".

After extraction, we construct one instance graph for each complex event, where coreferential events or entities are merged. We consider isolated events as irrelevant nodes for schema induction, so they are excluded from the instance graphs during graph construction. Since schema graphs focus on type-level abstraction, we use the type label and node index to represent each node, ignoring mention-level information in these instance graphs.

3.2 Temporal Event Graph Model Overview

Given a set of instance graphs 𝒢, we regard the schema as the hidden knowledge guiding the generation of these graphs. To this end, we propose a temporal event graph model that maximizes the probability of the instance graphs, ∏_{G∈𝒢} p(G). At each step, based on the previous graph G_{<i}, we predict one event node e_i with its arguments to generate the next graph G_i:

p(G) = \prod_{i=0}^{|E|} p(G_i \mid G_{<i}).

³https://github.com/RESIN-KAIROS/RESIN-pipeline-public

We factorize the probability of generating new nodes and edges as:

p(G_i \mid G_{<i}) = p(e_i \mid G_{<i}) \prod_{a_j \in A(e_i)} p(\langle e_i, a_j, v_j \rangle \mid e_i, a_j) \prod_{v_k \in G_{<i}} p(\langle v_j, r, v_k \rangle \mid v_j, v_k) \prod_{e_l \in G_{<i}} p(\langle e_i, e_l \rangle \mid e_i, e_l).   (1)

As shown in Figure 2, an event node e_i is generated first according to the probability p(e_i | G_{<i}). We then add argument nodes based on the IE ontology. We also predict the relation ⟨v_j, r, v_k⟩ between each newly generated node v_j and the existing nodes v_k ∈ G_{<i}. After knowing the shared and related arguments, we add a final step to predict the temporal relations between the new event e_i and the existing events e_l ∈ G_{<i}.

In the traditional graph generation setting, the order of node generation can be arbitrary. However, in our instance graphs, event nodes are connected through temporal relations. We order events as a directed acyclic graph (DAG). Considering that each event may have multiple events both "before" and "after" it, we obtain the generation order by traversing the graph using Breadth-First Search.
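As an illustration of this ordering step, here is a minimal sketch (reusing the hypothetical InstanceGraph above) that obtains a generation order by a breadth-first traversal of the temporal DAG, starting from events with no incoming temporal edge; the exact traversal used in the released code may differ.

```python
from collections import deque

def generation_order(g):
    """Return event ids in a BFS-style topological order over the temporal DAG (illustrative sketch)."""
    succ = {e: [] for e in g.event_types}
    indeg = {e: 0 for e in g.event_types}
    for before, after in g.temporal_edges:
        succ[before].append(after)
        indeg[after] += 1
    # start from events that have no earlier event ("roots" of the DAG)
    queue = deque(sorted(e for e, d in indeg.items() if d == 0))
    order = []
    while queue:
        e = queue.popleft()
        order.append(e)
        for nxt in succ[e]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:          # all of nxt's "before" events have been generated
                queue.append(nxt)
    return order

print(generation_order(g))  # e.g. [0, 1]: Attack before Die
```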

We also add dummy START/END event nodes to indicate the start and end of the graph generation. At the beginning of the generation process, the graph G_0 has a single start event node e_[SOG]. We generate e_[EOG] to signal the end of the graph.

3.3 Event Generation

To determine the event type of the newly generated event node e_i, we apply graph pooling over all events to get the current graph representation g_i:

\mathbf{g}_i = \mathrm{Pooling}(\{\mathbf{e}_0, \cdots, \mathbf{e}_{i-1}\}).

We use bold to denote the latent representations of nodes and edges, which are initialized as zeros and updated at each generation step via message passing in §3.4. We adopt a mean-pooling operation in this paper. After that, the event type is predicted through a fully connected layer:

p(e_i \mid G_{<i}) = \frac{\exp(W_{\phi(e_i)} \mathbf{g}_i)}{\sum_{\phi' \in \Phi_E \cup [\mathrm{EOG}]} \exp(W_{\phi'} \mathbf{g}_i)}.

Once we know the event type of e_i, we add all of its arguments in A(e_i), as defined in the IE ontology, as new entity nodes. For example, in Figure 2, the new event e_i is an ARREST event, so we add three argument nodes for DETAINEE, JAILOR, and PLACE, respectively. The edges between these arguments and the event e_i are also added to the graph.
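A minimal PyTorch sketch of this step, with mean pooling over the existing event embeddings followed by a type classifier; the module name and dimensions are ours and only illustrate the formulas above.

```python
import torch
import torch.nn as nn

class EventTypeGenerator(nn.Module):
    """Predict the type of the next event node from the current graph state (illustrative sketch)."""
    def __init__(self, hidden_dim=128, num_event_types=67):
        super().__init__()
        # one extra output class for the end-of-graph symbol [EOG]
        self.classifier = nn.Linear(hidden_dim, num_event_types + 1)

    def forward(self, event_embeddings):
        # event_embeddings: (num_existing_events, hidden_dim)
        g_i = event_embeddings.mean(dim=0)        # graph representation g_i via mean pooling
        logits = self.classifier(g_i)             # W_phi g_i for every event type and [EOG]
        return torch.softmax(logits, dim=-1)      # p(e_i | G_<i)
```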

3.4 Edge-Aware Graph Neural Network

We use a Graph Neural Network (GNN) (Kipf and Welling, 2017) to update node embeddings following the graph structure. Before we run the GNN on the graph, we first add virtual edges between the newly generated event and all previous events, and between new entities and previous entities, shown as dashed lines in Figure 2. The virtual edges enable the representations of new nodes to aggregate messages from previous nodes, which has been proven effective by (Liao et al., 2019).

To capture the rich semantics of edge types, we pass edge-aware messages during graph propagation. An intuitive way is to encode different edge types with different convolutional filters, similar to RGCN (Schlichtkrull et al., 2018). However, the number of RGCN parameters grows rapidly with the number of edge types and easily becomes unmanageable given the large number of relation types and argument roles in the IE ontology.⁴ Instead, we learn a vector representation for each relation type r and argument role a. The message passed through each argument edge ⟨e_i, a, v_j⟩ is:

\mathbf{m}_{i,j} = \mathrm{ReLU}\left(W_a \left((\mathbf{e}_i - \mathbf{v}_j) \,\|\, \mathbf{a}\right)\right),

where ‖ denotes the concatenation operation. Similarly, the message between two entities v_j and v_k is:

\mathbf{m}_{j,k} = \mathrm{ReLU}\left(W_r \left((\mathbf{v}_j - \mathbf{v}_k) \,\|\, \mathbf{r}\right)\right).

Considering that the direction of the temporal edge is important, we parameterize the message over this edge by assigning two separate weight matrices to the outgoing and incoming vertices:

\mathbf{m}_{i,l} = \mathrm{ReLU}\left(W_{\mathrm{bfr}} \mathbf{e}_i - W_{\mathrm{aft}} \mathbf{e}_l\right).

We aggregate the messages using edge-aware attention following (Liao et al., 2019):⁵

\alpha_{i,j} = \sigma\left(\mathrm{MLP}(\mathbf{e}_i - \mathbf{e}_j)\right),

where σ is the sigmoid function, and the MLP contains two hidden layers with ReLU nonlinearities.

The event node representation e_i is then updated using the messages from its local neighbors N(e_i), similar to entity node representations:

\mathbf{e}_i \leftarrow \mathrm{GRU}\left(\mathbf{e}_i \,\|\, \sum_{j \in \mathcal{N}(e_i)} \alpha_{i,j} \mathbf{m}_{i,j}\right).
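The following sketch shows one reading of the argument-edge message and the attention-weighted GRU update; the module and variable names are ours, and the choice of the previous node embedding as the GRU hidden state is an assumption rather than a detail confirmed by the paper.

```python
import torch
import torch.nn as nn

class EdgeAwareMessagePassing(nn.Module):
    """One round of edge-aware message passing over argument edges (illustrative sketch)."""
    def __init__(self, hidden_dim=128, num_roles=85):
        super().__init__()
        self.role_emb = nn.Embedding(num_roles, hidden_dim)            # vector a for each argument role
        self.W_a = nn.Linear(2 * hidden_dim, hidden_dim)               # message transform for argument edges
        self.att = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, 1))             # 2-hidden-layer MLP for attention
        self.gru = nn.GRUCell(2 * hidden_dim, hidden_dim)              # input is [node ; aggregated message]

    def forward(self, e_i, neighbors, roles):
        # e_i: (hidden,); neighbors: (N, hidden) argument nodes v_j; roles: (N,) role ids
        a = self.role_emb(roles)                                                   # (N, hidden)
        m = torch.relu(self.W_a(torch.cat([e_i - neighbors, a], dim=-1)))          # messages m_{i,j}
        alpha = torch.sigmoid(self.att(e_i - neighbors))                           # attention alpha_{i,j}, (N, 1)
        agg = (alpha * m).sum(dim=0)                                               # sum_j alpha_{i,j} m_{i,j}
        # GRU update; hidden state initialized with the current node embedding (assumption)
        return self.gru(torch.cat([e_i, agg], dim=-1).unsqueeze(0),
                        e_i.unsqueeze(0)).squeeze(0)
```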

⁴There are 131 edge types according to the fine-grained LDC Schema Learning Ontology.
⁵Compared to (Liao et al., 2019), we do not use the positional embedding mask because the newly generated nodes have distinct roles.

3.5 Coreferential Argument Generation

After updating the node representations, we detect the entity type of each argument, and also predict whether the argument is coreferential with existing entities. Inspired by the copy mechanism (Gu et al., 2016), we classify each argument node v_j either as a new entity with entity type φ(v_j), or as an existing entity node in the previous graph G_{<i}. For example, in Figure 2, the DETAINEE should be classified as the existing ATTACKER node, while the JAILOR node is classified as PERSON. Namely,

p(\langle e_i, a_j, v_j \rangle \mid e_i, a_j) =
\begin{cases}
p(\langle e_i, a_j, v_j \rangle, g \mid e_i, a_j) & \text{if } v_j \text{ is new}, \\
p(\langle e_i, a_j, v_j \rangle, c \mid e_i, a_j) & \text{otherwise},
\end{cases}

where p(⟨e_i, a_j, v_j⟩, g | e_i, a_j) is the generation probability, classifying the new node to its entity type φ(v_j):

p(\langle e_i, a_j, v_j \rangle, g \mid e_i, a_j) = \exp(W_{\phi(v_j)} \mathbf{v}_j) / Z.

The copy probability p(⟨e_i, a_j, v_j⟩, c | e_i, a_j) selects the coreferential entity v from the entities in the existing graph, denoted by V_{<i}:

p(\langle e_i, a_j, v_j \rangle, c \mid e_i, a_j) = \exp(W_{v} \mathbf{v}_j) / Z.

Here, Z is the shared normalization term,

Z = \sum_{\phi' \in \Phi_V} \exp(W_{\phi'} \mathbf{v}_j) + \sum_{v' \in V_{<i}} \exp(W_{v'} \mathbf{v}_j).

If determined to copy, we merge coreferential entities in the graph.
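A sketch of the shared-softmax scoring over entity types (generation mode) and existing entities (copy mode); this is our own simplified formulation, in which the copy score uses the existing entity embedding itself in place of a learned W_{v'}.

```python
import torch
import torch.nn as nn

class ArgumentCopyGenerator(nn.Module):
    """Score each new argument node against entity types (generate) and existing entities (copy); sketch only."""
    def __init__(self, hidden_dim=128, num_entity_types=24):
        super().__init__()
        self.type_proj = nn.Linear(hidden_dim, num_entity_types)    # one row W_phi per entity type

    def forward(self, v_j, existing_entities):
        # v_j: (hidden,) new argument node; existing_entities: (M, hidden) nodes in V_<i
        gen_scores = self.type_proj(v_j)                             # (num_entity_types,)
        copy_scores = existing_entities @ v_j                        # (M,) simplified W_v' v_j
        probs = torch.softmax(torch.cat([gen_scores, copy_scores]), dim=0)  # shared normalization Z
        p_generate = probs[: gen_scores.numel()]                     # distribution over new entity types
        p_copy = probs[gen_scores.numel():]                          # distribution over coreferential candidates
        return p_generate, p_copy
```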

3.6 Entity Relational Edge Generation

In this phase, we determine which virtual edges to keep and assign relation types to them, such as the PART_WHOLE relation in Figure 2. We model the relation edge generation probability as a categorical distribution over relation types, and add [O] (OTHER) to the type set R to represent that there is no relation edge:

p(\langle v_j, r, v_k \rangle \mid v_j, v_k) = \frac{\exp(\mathrm{MLP}_r(\mathbf{v}_j - \mathbf{v}_k))}{\sum_{r' \in \mathcal{R} \cup [O]} \exp(\mathrm{MLP}_{r'}(\mathbf{v}_j - \mathbf{v}_k))}.

We use two hidden layers with ReLU activation functions to implement the MLP.
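For instance, the relation typing over a candidate entity pair could be implemented as follows (illustrative only; names are ours):

```python
import torch.nn as nn

class RelationEdgeClassifier(nn.Module):
    """Categorical distribution over relation types plus [O] (no edge) for an entity pair; sketch only."""
    def __init__(self, hidden_dim=128, num_relation_types=46):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, num_relation_types + 1))  # +1 for [O]

    def forward(self, v_j, v_k):
        return self.mlp(v_j - v_k).softmax(dim=-1)   # p(<v_j, r, v_k> | v_j, v_k)
```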

3.7 Event Temporal Ordering Prediction

To predict the temporal dependencies between the new event and existing events, we connect them through temporal edges, as shown in Figure 2. These edges are critical for message passing when predicting the next event. We build temporal edges in the last phase of generation, since this step relies on the shared and related arguments. Considering that temporal edges are interdependent, we model the generation probability as a mixture of Bernoulli distributions following (Liao et al., 2019):

p(\langle e_i, e_l \rangle \mid e_i, e_l) = \sum_{b} \gamma_b \, \theta_{b,i,l},
\gamma_1, \cdots, \gamma_B = \mathrm{Softmax}\left(\sum_{i,l} \mathrm{MLP}(\mathbf{e}_i - \mathbf{e}_l)\right),
\theta_{1,i,l}, \cdots, \theta_{B,i,l} = \sigma\left(\mathrm{MLP}_\theta(\mathbf{e}_i - \mathbf{e}_l)\right),

where B is the number of mixture components. When B = 1, the distribution degenerates to a factorized Bernoulli, which assumes the independence of each potential temporal edge conditioned on the existing graph.
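A sketch of the mixture-of-Bernoulli scoring for the candidate temporal edges between the new event and all existing events; this is our own simplified module mirroring the formulas above, not the released implementation.

```python
import torch
import torch.nn as nn

class TemporalEdgePredictor(nn.Module):
    """Mixture-of-Bernoulli probabilities for temporal edges <e_i, e_l>; illustrative sketch."""
    def __init__(self, hidden_dim=128, num_mixtures=2):
        super().__init__()
        self.mlp_gamma = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                                       nn.Linear(hidden_dim, num_mixtures))
        self.mlp_theta = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                                       nn.Linear(hidden_dim, num_mixtures))

    def forward(self, e_i, existing_events):
        # e_i: (hidden,); existing_events: (L, hidden) embeddings of events e_l in G_<i
        diff = e_i - existing_events                                        # (L, hidden)
        gamma = torch.softmax(self.mlp_gamma(diff).sum(dim=0), dim=-1)      # (B,) mixture weights, pooled over pairs
        theta = torch.sigmoid(self.mlp_theta(diff))                         # (L, B) per-edge Bernoulli parameters
        return (theta * gamma).sum(dim=-1)                                  # (L,) p(<e_i, e_l> | e_i, e_l)
```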

3.8 Training and Schema Decoding

We train the model by minimizing the negative log-likelihood loss,

\mathcal{L} = -\sum_{G \in \mathcal{G}_{\mathrm{train}}} \log_2 p(G).

To compose the schema library for each complex event scenario, we construct instance graphs from related documents to learn a graph model, and then obtain the schema using greedy decoding.
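A rough sketch of the training objective and of greedy schema decoding; `log_prob_graph`, `empty_graph`, `most_likely_next_event` and `extend` are hypothetical methods standing in for the stepwise factorization of Equation (1), not functions of the released code.

```python
import torch

def nll_loss(model, instance_graphs):
    """Negative log-likelihood over the training instance graphs (sketch)."""
    loss = torch.tensor(0.0)
    for g in instance_graphs:
        # log p(G) = sum_i log p(G_i | G_<i), accumulated while replaying the generation order of g
        loss = loss - model.log_prob_graph(g)
    return loss

def greedy_decode_schema(model, max_events=50):
    """Greedily unroll the model from the start node to obtain one schema graph (sketch)."""
    graph = model.empty_graph()                            # contains only the dummy start event [SOG]
    for _ in range(max_events):
        event_type = model.most_likely_next_event(graph)   # argmax over p(e_i | G_<i)
        if event_type == "[EOG]":
            break
        graph = model.extend(graph, event_type)            # add arguments, relations, temporal edges greedily
    return graph
```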

4 Evaluation Benchmark

4.1 Dataset

We conduct experiments on two datasets: one for a general scenario and one for a more specific scenario. We adopt the DARPA KAIROS⁶ ontology, a newly defined fine-grained ontology for schema learning, with 24 entity types, 46 relation types, 67 event types, and 85 argument roles.⁷ Our schema induction method does not rely on any specific ontology; only the IE system is trained on a given ontology to create the instance event graphs.

General Schema Learning Corpus: The Schema Learning Corpus, released by LDC (LDC2020E25), includes 82 types of complex events, such as Disease Outbreak, Presentations and Shop Online. Each complex event is associated with a set of source documents. This data set also includes ground-truth schemas created by LDC annotators, which were used for our intrinsic evaluation.

⁶https://github.com/NextCenturyCorporation/kairos-pub/tree/master/data-format/ontology
⁷The ontology has been released in LDC2020E25.

Dataset   Split  #doc   #graph  #event  #arg     #rel
General   Train  451    451     6,040   10,720   6,858
          Dev    83     83      1,044   1,762    1,112
          Test   83     83      1,211   2,112    1,363
IED       Train  5,247  343     41,672  136,894  122,846
          Dev    575    42      4,661   15,404   13,320
          Test   577    45      5,089   16,721   14,054

Table 2: Data statistics. Each instance graph is about one complex event.

IED Schema Learning Corpus: The same type of complex event may have many variants, depending on the different types of conditions and participants. In order to evaluate our model's capability at capturing uncertainty and multiple hypotheses, we decided to dive deeper into one scenario and chose the improvised explosive device (IED) as our case study. We first collected Wikipedia articles that describe 4 types of complex events, i.e., Car-bombing IED, Drone Strikes IED, Suicide IED and General IED. Then we followed (Li et al., 2021) to exploit the external links to collect additional news documents for the corresponding complex event type.

The ground-truth schemas for this IED corpus are created manually, through a schema curation tool (Mishra et al., 2021). Only one human schema graph was created for each complex event type, resulting in 4 schemas. In detail, for each complex event type, we presented example instance graphs and the ranked event sequences to annotators to create human (ground truth) schemas. The event sequences are generated by traversing the instance graphs, and then sorted by frequency and the number of arguments. Initially we assigned three annotators (IE experts) to each create a version of the schema, and then the final schema was merged through discussion. After that, two annotators (linguists) performed a two-pass revision. Human curation focuses on merging and trimming steps by validating them using the reference instance graphs. Also, temporal dependencies between steps were further refined, and coreferential entities and their relations were added during the curation process. To avoid bias from the event sequences, linguists in the second round of revision were not presented with the event sequences. All annotators were trained, and disagreements were resolved through discussion.

4.2 Schema Matching Evaluation

We compare the generated schemas with the ground-truth schemas based on the overlap between them. The following evaluation metrics are employed:⁸

Event Match: A good schema must contain the events crucial to the complex event scenario. F-score is used to compute the overlap of event nodes.

Event Sequence Match: A good schema is able to track events through a timeline. We therefore obtain event sequences following temporal order, and evaluate F-score on the overlapping sequences of lengths l = 2 and l = 3.

Event Argument Connection Match: Our complex event graph schema includes entities and their relations, and captures how events are connected through arguments, in addition to their temporal order. We categorize these connections into three categories: (1) two events are connected by shared arguments; (2) two events have related arguments, i.e., their arguments are connected through entity relations; (3) there are no direct connections between the two events. For every pair of overlapping events, we calculate F-score based on whether these connections are predicted correctly. The human schemas of the General dataset do not contain arguments or relations between arguments, so we only compute this metric for the IED dataset.

⁸We cannot use graph matching to compare between baselines and our approach due to the difference in the graph structures being modeled.
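As an illustration of the sequence-level metric, the F-score over overlapping event-type subsequences of length l could be computed roughly as follows; this is a sketch of one reasonable reading of the metric, not the exact evaluation script.

```python
def sequence_match_f1(pred_seqs, gold_seqs, l=2):
    """F-score over overlapping event-type subsequences of length l (illustrative sketch)."""
    def ngrams(seqs):
        grams = set()
        for s in seqs:
            grams.update(tuple(s[i:i + l]) for i in range(len(s) - l + 1))
        return grams
    pred, gold = ngrams(pred_seqs), ngrams(gold_seqs)
    if not pred or not gold:
        return 0.0
    overlap = len(pred & gold)
    p, r = overlap / len(pred), overlap / len(gold)
    return 2 * p * r / (p + r) if p + r else 0.0

# Example: one predicted sequence vs. one gold sequence, l = 2.
print(sequence_match_f1([["Attack", "Die", "Arrest"]], [["Attack", "Die", "TrialHearing"]], l=2))
```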

4.3 Instance Graph Perplexity Evaluation

To evaluate our temporal event graph model, we compute the instance graph perplexity by predicting the instance graphs in the test set:

PP = 2^{-\frac{1}{|\mathcal{G}_{\mathrm{test}}|} \sum_{G \in \mathcal{G}_{\mathrm{test}}} \log_2 p(G)}.   (2)

We calculate the full perplexity for the entire graph using Equation (2), and event perplexity using only event nodes, emphasizing the importance of correctly predicting events.
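Concretely, the graph perplexity can be computed from per-graph log-probabilities as follows (a sketch, assuming a list of log_2 p(G) values for the test graphs):

```python
def instance_graph_perplexity(log2_probs):
    """Perplexity over test instance graphs, given log_2 p(G) for each graph (sketch)."""
    avg_log2 = sum(log2_probs) / len(log2_probs)
    return 2 ** (-avg_log2)

# Example with two (hypothetical) test graphs.
print(instance_graph_perplexity([-120.5, -98.3]))
```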

4.4 Schema-Guided Event Prediction

To explore schema-guided probabilistic reasoning and prediction, we perform an extrinsic evaluation on event prediction. Different from traditional event prediction tasks, the temporal event graphs contain arguments with relations, and type labels are assigned to nodes and edges. We create a graph-based event prediction dataset using our test graphs. The task is to predict the ending events of each graph, i.e., events that have no future events after them. An event is predicted correctly if its event type matches one of the ending events in the graph. Considering that there can be multiple ending events in one instance graph, we rank event type prediction scores and adopt MRR (Mean Reciprocal Rank) and HITS@1 as evaluation metrics.
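The two metrics can be computed as follows, treating any of the gold ending event types as correct; this is a sketch of one reasonable reading of the setup.

```python
def mrr_and_hits1(ranked_event_types, gold_ending_types):
    """ranked_event_types: event types sorted by predicted score; gold_ending_types: set of correct types."""
    rank = next((i + 1 for i, t in enumerate(ranked_event_types) if t in gold_ending_types), None)
    mrr = 1.0 / rank if rank else 0.0
    hits1 = 1.0 if rank == 1 else 0.0
    return mrr, hits1

# Example: gold ending events are {"Die", "Injure"} and "Die" is ranked second.
print(mrr_and_hits1(["Arrest", "Die", "Injure"], {"Die", "Injure"}))  # (0.5, 0.0)
```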

5 Experiments

5.1 Experiment Setting

Baseline 1: Event Language Model (Rudinger et al., 2015; Pichotta and Mooney, 2016) is the state-of-the-art event schema induction method. It learns the probability of temporal event sequences, and the event sequences generated by the event language model are treated as schemas.

Baseline 2: Sequential Pattern Mining (Pei et al., 2001) is a classic algorithm for discovering common sequences. We also attach arguments and their relations as extensions to the patterns. Since the event language model baseline cannot handle multiple arguments and relations, we add sequential pattern mining for comparison. The frequent patterns mined are treated as schemas.

Reference: Human Schema is added as a baseline in the extrinsic task of event prediction. Since human-created schemas are highly accurate but not probabilistic, we want to evaluate their limits at predicting events in the extrinsic task. We match schemas to instances and fill in the matched type.

Ablation Study: Event Graph Model w/o Argument Generation is included as a variant of our model in which we remove argument generation (§3.5 and §3.6). It learns to generate a graph containing only event nodes and their temporal relations, aiming to verify whether incorporating argument information helps event modeling.

5.2 Implementation Details

Training Details. For our event graph model, the representation dimension is 128, and we use a 2-layer GNN. The number of mixture components B in the temporal classifier is 2. The learning rate is 1e-4. To train the event language model baseline, instead of using an LSTM-based architecture following (Pichotta and Mooney, 2016), we adopt the state-of-the-art autoregressive language model XLNet (Yang et al., 2019). In detail, we first linearize the graph using topological sort, and then train XLNet⁹ with a hidden dimension of 128 (the same as our temporal event graph model) and 3 layers. The learning rate is 1e-4. We select the best model on the validation set. Both our model and the event language model baseline are trained on one Tesla V100 GPU with 16GB DRAM. For sequential pattern mining, we perform random walks, starting from every node in the instance graphs and ending at sink nodes, to obtain event type sequences, and then apply PrefixSpan (Pei et al., 2001)¹⁰ to rank sequential patterns.

Evaluation Details. To compose the schema library, we use the top-ranked sequence as the schema for these two baselines. To perform event prediction using the baselines, we traverse the input graph to obtain event type sequences, and conduct prediction on all sequences to produce an averaged score. For human schemas, we first linearize them and the input graphs, and find the longest common subsequence between them.
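For the sequential pattern mining baseline, the random-walk extraction of event type sequences could look like the sketch below (again reusing the hypothetical InstanceGraph from §2); the resulting sequences are then fed to the PrefixSpan implementation cited above.

```python
import random

def random_walk_sequences(g, walks_per_node=5, max_len=10):
    """Sample event-type sequences by walking the temporal DAG from every node to a sink (sketch)."""
    succ = {e: [] for e in g.event_types}
    for before, after in g.temporal_edges:
        succ[before].append(after)
    sequences = []
    for start in g.event_types:
        for _ in range(walks_per_node):
            node, walk = start, [g.event_types[start]]
            while succ[node] and len(walk) < max_len:
                node = random.choice(succ[node])       # follow a random outgoing temporal edge
                walk.append(g.event_types[node])
            sequences.append(walk)                     # walk ends at a sink node or at max_len
    return sequences
```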

5.3 Results and Analysis

Intrinsic Evaluation. In Table 3, the significant gain on event match demonstrates the ability of our graph model to keep salient events. On sequence match, our approach achieves a larger performance gain over the baselines when the path length l is longer, implying that the proposed model is capable of capturing longer and wider temporal dependencies. For connection match, sequential pattern mining is the only baseline that can predict connections between events. Compared against sequential pattern mining, our generation model performs significantly better since it considers the inter-dependency of arguments and encodes them with graph structures.

Extrinsic Evaluation. On the task of schema-guided event prediction, our graph model obtains significant improvement (see Table 4). The low performance of human schemas demonstrates the importance of probabilistically modeling schemas to support downstream tasks. Take Figure 3 as an example. Human schemas produce incorrect event types such as TRIALHEARING, since they match the sequence ATTACK→DIE→TRIALHEARING and are incapable of capturing the inter-dependencies between sequences. However, our model is able to customize the prediction to the global context of the input graph, and takes into account that there are no ARREST or justice-related events in the input graph.

⁹https://github.com/huggingface
¹⁰https://github.com/chuanconggao/PrefixSpan-py

                                        Event   Sequence Match   Connection  Event       Full
Dataset   Model                         Match   l = 2   l = 3    Match       Perplexity  Perplexity
General   Event Language Model          54.76   22.87   8.61     -           -           -
          Sequential Pattern Mining     49.18   20.31   7.37     -           -           -
          Event Graph Model             58.15   24.79   9.18     -           24.25       137.18
          w/o Argument Generation       56.96   22.47   8.21     -           68.59       -
IED       Event Language Model          49.15   17.77   5.32     -           -           -
          Sequential Pattern Mining     47.91   18.39   4.79     5.41        -           -
          Event Graph Model             59.73   21.51   7.81     10.67       39.39       168.89
          w/o Argument Generation       55.01   18.24   6.67     -           51.98       -

Table 3: Intrinsic evaluation results, including schema matching F1 score (%) and instance graph perplexity.

Dataset   Model                         MRR     HITS@1
General   Event Language Model          0.367   0.497
          Sequential Pattern Mining     0.330   0.478
          Human Schema                  0.173   0.205
          Event Graph Model             0.401   0.520
          w/o Argument Generation       0.392   0.509
IED       Event Language Model          0.169   0.513
          Sequential Pattern Mining     0.138   0.378
          Human Schema                  0.072   0.222
          Event Graph Model             0.224   0.741
          w/o Argument Generation       0.210   0.734

Table 4: Schema-guided event prediction performance.

Also, the human schema fails to predict INJURE and ATTACK, because it relies on exact matches of event sequences of length l ≥ 2, and cannot handle variants of the sequences. This problem is solved by our probabilistic schema, which models the prediction probability conditioned on the existing graph. For example, even though ATTACK mostly happens before DIE, we learn that ATTACK might repeat after a DIE event if there are multiple ATTACK and DETONATE events in the existing graph, which means the complex event is about a series of conflict events.

Ablation Study. Removing argument generation ("w/o Argument Generation") generally lowers the performance on all evaluation tasks, since the ablated model ignores the coreferential arguments and their relations and relies solely on the overly simplistic temporal order to connect events. This is especially apparent from the instance graph perplexity in Table 3.

Learning Corpus Size. An average of 113 instance graphs is used for each complex event type

Figure 3: An event prediction example (IED scenario). Given an input graph of existing events (ATTACK, DETONATE, EXPLODE, INJURE, DIE, CONTACT, BROADCAST, IMPEDEINTERFEREWITH), the human schema predicts FIREEXPLOSION, DIE, TRIALHEARING, BROADCAST, TRANSPORTATION and SENTENCE as the events to be predicted, while our graph temporal schema predicts DIE, INJURE, ATTACK, BROADCAST and ARREST.

in the IED scenario, and 383 instance graphs are used to learn the schema model in the General scenario. The better performance on the IED dataset in Table 3 shows that a larger number of instance graphs improves schema induction performance.

Effect of Information Extraction Errors. Based on an error analysis of the induced schemas evaluated in Table 3, the effects of extraction errors can be categorized into: (1) temporal ordering errors: 43.3%; (2) missing events: 34.4%; (3) missing coreferential events: 8.8%; (4) incorrect event type: 7.7%; (5) missing coreferential arguments: 5.5%. However, even on automatically extracted event graphs with extraction errors, our model performs significantly better on event prediction compared to human-constructed schemas, as shown in Table 4. This demonstrates that our schema induction method is robust and effective in supporting downstream tasks, even when only provided with noisy data containing extraction errors.

6 Related Work

The definition of a complex event schema separates us from related lines of work, namely schema induction and script learning. Previous work on schema induction aims to characterize event triggers and participants of individual atomic events (Chambers, 2013; Cheung et al., 2013; Nguyen et al., 2015; Sha et al., 2016; Yuan et al., 2018), ignoring inter-event relations. Work on script learning, on the other hand, originally limited attention to event chains with a single protagonist (Chambers and Jurafsky, 2008, 2009; Rudinger et al., 2015; Jans et al., 2012; Granroth-Wilding and Clark, 2016) and was later extended to multiple participants (Pichotta and Mooney, 2014, 2016; Weber et al., 2018). Recent efforts rely on distributed representations encoded from the compositional nature of events (Modi, 2016; Granroth-Wilding and Clark, 2016; Weber et al., 2018, 2020; Zhang et al., 2020), and on language modeling (Rudinger et al., 2015; Pichotta and Mooney, 2016; Peng and Roth, 2016). All of these methods still assume that events follow a linear order in a single chain. They also overlook the relations between participants, which are critical for understanding the complex event. In contrast, we induce a comprehensive event graph schema, capturing both the temporal dependency and the multi-hop argument dependency across events.

Our recent work on event graph schema induction (Li et al., 2020) only considers the connections between a pair of events. Similarly, existing event prediction tasks are designed to automatically generate a missing event (e.g., a word sequence) given a single or a sequence of prerequisite events (Nguyen et al., 2017; Hu et al., 2017; Li et al., 2018b; Kiyomaru et al., 2019; Lv et al., 2019), or to predict a pre-condition event given the current events (Kwon et al., 2020). In contrast, we leverage the automatically discovered temporal event schema as guidance to forecast future events.

Existing script annotations (Chambers and Jurafsky, 2008, 2010; Modi et al., 2016; Wanzare et al., 2016; Mostafazadeh et al., 2016a,b; Kwon et al., 2020) cannot support comprehensive graph schema induction due to the lack of critical event graph structures, such as argument relations. Furthermore, in real-world applications, complex event schemas are expected to be induced from large-scale historical data, which is not feasible to annotate manually. We propose a data-driven schema induction approach, and choose to use IE systems instead of manual annotation, to induce schemas that are robust and can tolerate extraction errors.

Our work is also related to recent advances in the modeling and generation of graphs (Li et al., 2018a; Jin et al., 2018; Grover et al., 2019; Simonovsky and Komodakis, 2018; Liu et al., 2019; Fu et al., 2020; Dai et al., 2020; You et al., 2018; Liao et al., 2019; Yoo et al., 2020; Shi et al., 2020). We are the first to perform graph generation on event graphs.

7 Conclusions and Future Work

We propose a new task to induce temporal complex event schemas, which are capable of representing multiple temporal dependencies between events and their connected arguments. We induce such schemas by learning an event graph model, a deep auto-regressive model, from automatically extracted instance graphs. Experiments demonstrate the model's effectiveness on both intrinsic evaluation and the downstream task of schema-guided event prediction. These schemas can guide our understanding of, and ability to make predictions about, what might happen next, along with background knowledge including location- and participant-specific information and temporally ordered events. In the future, we plan to extend our framework to hierarchical event schema induction, as well as event and argument instance prediction.

Acknowledgement

This research is based upon work supported by U.S. DARPA KAIROS Program Nos. FA8750-19-2-1004 and Air Force No. FA8650-17-C-7715. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of DARPA or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.

References

Nathanael Chambers. 2013. Event schema induction with a probabilistic entity-driven model. In Proceedings of EMNLP 2013, pages 1797-1807.

Nathanael Chambers and Dan Jurafsky. 2008. Unsupervised learning of narrative event chains. In Proceedings of ACL-08: HLT, pages 789-797.

Nathanael Chambers and Dan Jurafsky. 2009. Unsupervised learning of narrative schemas and their participants. In Proceedings of ACL-IJCNLP 2009, pages 602-610.

Nathanael Chambers and Dan Jurafsky. 2010. A database of narrative schemas. In Proceedings of LREC'10.

Jackie Chi Kit Cheung, Hoifung Poon, and Lucy Vanderwende. 2013. Probabilistic frame induction. In Proceedings of NAACL-HLT 2013, pages 837-846.

Hanjun Dai, Azade Nazi, Yujia Li, Bo Dai, and Dale Schuurmans. 2020. Scalable deep generative modeling for sparse graphs. In Proceedings of ICML 2020, pages 2302-2312.

Dongqi Fu, Dawei Zhou, and Jingrui He. 2020. Local motif clustering on time-evolving graphs. In Proceedings of KDD 2020, pages 390-400.

Mark Granroth-Wilding and Stephen Clark. 2016. What happens next? Event prediction using a compositional neural network model. In Proceedings of AAAI 2016, pages 2727-2733.

Aditya Grover, Aaron Zweig, and Stefano Ermon. 2019. Graphite: Iterative generative modeling of graphs. In Proceedings of ICML 2019, pages 2434-2444.

Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O.K. Li. 2016. Incorporating copying mechanism in sequence-to-sequence learning. In Proceedings of ACL 2016, pages 1631-1640.

Linmei Hu, Juanzi Li, Liqiang Nie, Xiaoli Li, and Chao Shao. 2017. What happens next? Future subevent prediction using contextual hierarchical LSTM. In Proceedings of AAAI 2017, pages 3450-3456.

Bram Jans, Steven Bethard, Ivan Vulic, and Marie Francine Moens. 2012. Skip n-grams and ranking functions for predicting script events. In Proceedings of EACL 2012, pages 336-344.

Wengong Jin, Regina Barzilay, and Tommi S. Jaakkola. 2018. Junction tree variational autoencoder for molecular graph generation. In Proceedings of ICML 2018, pages 2328-2337.

Thomas N. Kipf and Max Welling. 2017. Semi-supervised classification with graph convolutional networks. In Proceedings of ICLR 2017.

Hirokazu Kiyomaru, Kazumasa Omura, Yugo Murawaki, Daisuke Kawahara, and Sadao Kurohashi. 2019. Diversity-aware event prediction based on a conditional variational autoencoder with reconstruction. In Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing, pages 113-122.

Heeyoung Kwon, Mahnaz Koupaee, Pratyush Singh, Gargi Sawhney, Anmol Shukla, Keerthi Kumar Kallur, Nathanael Chambers, and Niranjan Balasubramanian. 2020. Modeling preconditions in text with a crowd-sourced dataset. In Findings of EMNLP 2020, pages 3818-3828.

Tuan Lai, Heng Ji, Trung Bui, Quan Hung Tran, Franck Dernoncourt, and Walter Chang. 2021. A context-dependent gated module for incorporating symbolic semantics into event coreference resolution. In Proceedings of NAACL-HLT 2021.

Manling Li, Qi Zeng, Ying Lin, Kyunghyun Cho, Heng Ji, Jonathan May, Nathanael Chambers, and Clare Voss. 2020. Connecting the dots: Event graph schema induction with path language modeling. In Proceedings of EMNLP 2020, pages 684-695.

Sha Li, Heng Ji, and Jiawei Han. 2021. Document-level event argument extraction by conditional generation. In Proceedings of NAACL-HLT 2021, pages 894-908.

Yujia Li, Oriol Vinyals, Chris Dyer, Razvan Pascanu, and Peter Battaglia. 2018a. Learning deep generative models of graphs. In Proceedings of ICML 2018.

Zhongyang Li, Xiao Ding, and Ting Liu. 2018b. Constructing narrative event evolutionary graph for script event prediction. In Proceedings of IJCAI 2018, pages 4201-4207.

Renjie Liao, Yujia Li, Yang Song, Shenlong Wang, William L. Hamilton, David Duvenaud, Raquel Urtasun, and Richard S. Zemel. 2019. Efficient graph generation with graph recurrent attention networks. In Proceedings of NeurIPS 2019, pages 4257-4267.

Ying Lin, Heng Ji, Fei Huang, and Lingfei Wu. 2020. A joint neural model for information extraction with global features. In Proceedings of ACL 2020, pages 7999-8009.

Jenny Liu, Aviral Kumar, Jimmy Ba, Jamie Kiros, and Kevin Swersky. 2019. Graph normalizing flows. In Proceedings of NeurIPS 2019, pages 13556-13566.

Shangwen Lv, Wanhui Qian, Longtao Huang, Jizhong Han, and Songlin Hu. 2019. SAM-Net: Integrating event-level and chain-level attentions to predict what happens next. In Proceedings of AAAI 2019, pages 6802-6809.

Piyush Mishra, Akanksha Malhotra, Susan Windisch Brown, Martha Palmer, and Ghazaleh Kazeminejad. 2021. Schema curation interface: Enhancing user experience in curation tasks. In Proceedings of ACL-IJCNLP 2021: System Demonstrations.

Ashutosh Modi. 2016. Event embeddings for semantic script modeling. In Proceedings of CoNLL 2016, pages 75-83.

Ashutosh Modi, Tatjana Anikina, Simon Ostermann, and Manfred Pinkal. 2016. InScript: Narrative texts annotated with script information. In Proceedings of LREC'16, pages 3485-3493.

Nasrin Mostafazadeh, Nathanael Chambers, Xiaodong He, Devi Parikh, Dhruv Batra, Lucy Vanderwende, Pushmeet Kohli, and James Allen. 2016a. A corpus and cloze evaluation for deeper understanding of commonsense stories. In Proceedings of NAACL-HLT 2016, pages 839-849.

Nasrin Mostafazadeh, Alyson Grealish, Nathanael Chambers, James Allen, and Lucy Vanderwende. 2016b. CaTeRS: Causal and temporal relation scheme for semantic annotation of event structures. In Proceedings of the Fourth Workshop on Events, pages 51-61.

Dai Quoc Nguyen, Dat Quoc Nguyen, Cuong Xuan Chu, Stefan Thater, and Manfred Pinkal. 2017. Sequence to sequence learning for event prediction. In Proceedings of IJCNLP 2017 (Volume 2: Short Papers), pages 37-42.

Kiem-Hieu Nguyen, Xavier Tannier, Olivier Ferret, and Romaric Besancon. 2015. Generative event schema induction with entity disambiguation. In Proceedings of ACL-IJCNLP 2015 (Volume 1: Long Papers), pages 188-197.

Qiang Ning, Sanjay Subramanian, and Dan Roth. 2019. An improved neural baseline for temporal relation extraction. In Proceedings of EMNLP-IJCNLP 2019, pages 6203-6209.

Xiaoman Pan, Taylor Cassidy, Ulf Hermjakob, Heng Ji, and Kevin Knight. 2015. Unsupervised entity linking with abstract meaning representation. In Proceedings of NAACL-HLT 2015.

Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight, and Heng Ji. 2017. Cross-lingual name tagging and linking for 282 languages. In Proceedings of ACL 2017 (Volume 1: Long Papers), pages 1946-1958.

Jian Pei, Jiawei Han, Behzad Mortazavi-Asl, Helen Pinto, Qiming Chen, Umeshwar Dayal, and Meichun Hsu. 2001. PrefixSpan: Mining sequential patterns by prefix-projected growth. In Proceedings of the 17th International Conference on Data Engineering, pages 215-224.

Haoruo Peng and Dan Roth. 2016. Two discourse driven language models for semantics. In Proceedings of ACL 2016 (Volume 1: Long Papers), pages 290-300.

Karl Pichotta and Raymond Mooney. 2014. Statistical script learning with multi-argument events. In Proceedings of EACL 2014, pages 220-229.

Karl Pichotta and Raymond J. Mooney. 2016. Learning statistical scripts with LSTM recurrent neural networks. In Proceedings of AAAI 2016, pages 2800-2806.

Rachel Rudinger, Pushpendre Rastogi, Francis Ferraro, and Benjamin Van Durme. 2015. Script induction as language modeling. In Proceedings of EMNLP 2015, pages 1681-1686.

Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, and Max Welling. 2018. Modeling relational data with graph convolutional networks. In European Semantic Web Conference, pages 593-607.

Lei Sha, Sujian Li, Baobao Chang, and Zhifang Sui. 2016. Joint learning templates and slots for event schema induction. In Proceedings of NAACL-HLT 2016, pages 428-434.

Chence Shi, Minkai Xu, Zhaocheng Zhu, Weinan Zhang, Ming Zhang, and Jian Tang. 2020. GraphAF: A flow-based autoregressive model for molecular graph generation. In Proceedings of ICLR 2020.

Martin Simonovsky and Nikos Komodakis. 2018. GraphVAE: Towards generation of small graphs using variational autoencoders. In International Conference on Artificial Neural Networks, pages 412-422.

Lilian D. A. Wanzare, Alessandra Zarcone, Stefan Thater, and Manfred Pinkal. 2016. A crowdsourced database of event sequence descriptions for the acquisition of high-quality script knowledge.

Noah Weber, Niranjan Balasubramanian, and Nathanael Chambers. 2018. Event representations with tensor-based compositions. In Proceedings of AAAI 2018, pages 4946-4953.

Noah Weber, Rachel Rudinger, and Benjamin Van Durme. 2020. Causal inference of script knowledge. In Proceedings of EMNLP 2020, pages 7583-7596.

Haoyang Wen, Ying Lin, Tuan Lai, Xiaoman Pan, Sha Li, Xudong Lin, Ben Zhou, Manling Li, Haoyu Wang, Hongming Zhang, et al. 2021a. RESIN: A dockerized schema-guided cross-document cross-lingual cross-media information extraction and event tracking system. In Proceedings of NAACL-HLT 2021: Demonstrations, pages 133-143.

Haoyang Wen, Yanru Qu, Heng Ji, Qiang Ning, Jiawei Han, Avi Sil, Hanghang Tong, and Dan Roth. 2021b. Event time extraction and propagation via graph attention networks. In Proceedings of NAACL-HLT 2021.

Zhilin Yang, Zihang Dai, Yiming Yang, Jaime G. Carbonell, Ruslan Salakhutdinov, and Quoc V. Le. 2019. XLNet: Generalized autoregressive pretraining for language understanding. In Proceedings of NeurIPS 2019, pages 5754-5764.

Sanghyun Yoo, Young-Seok Kim, Kang Hyun Lee, Kuhwan Jeong, Junhwi Choi, Hoshik Lee, and Young Sang Choi. 2020. Graph-aware transformer: Is attention all graphs need? arXiv preprint arXiv:2006.05213.

Jiaxuan You, Rex Ying, Xiang Ren, William L. Hamilton, and Jure Leskovec. 2018. GraphRNN: Generating realistic graphs with deep auto-regressive models. In Proceedings of ICML 2018, pages 5694-5703.

Quan Yuan, Xiang Ren, Wenqi He, Chao Zhang, Xinhe Geng, Lifu Huang, Heng Ji, Chin-Yew Lin, and Jiawei Han. 2018. Open-schema event profiling for massive news corpora. In Proceedings of CIKM 2018, pages 587-596.

Li Zhang, Qing Lyu, and Chris Callison-Burch. 2020. Reasoning about goals, steps, and temporal ordering with WikiHow. In Proceedings of EMNLP 2020, pages 4630-4639.