Propbank Instance Annotation Guidelines Using a Dedicated Editor, Jubilee

Jinho Choi, Claire Bonial, Martha Palmer
Institute of Cognitive Science, University of Colorado at Boulder

Propbank

• A corpus in which the arguments of each verb predicate are annotated with their semantic roles.
• Each predicate is also annotated with its sense ID.
• Annotations are done over syntactic trees.

Propbank Annotation Procedure

• Each task is claimed, double-annotated, and adjudicated.
• In the past, three different tools were used:
  1. to claim tasks.
  2. to annotate arguments.
  3. to annotate verb senses.

Sense: open.01

ARG0 (agent)

ARG1 (theme)

ARG2 (instrument)
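The example above (sense open.01 with three numbered arguments) can be pictured as a small record. The following is only a minimal sketch: the class names, the `text` field, and the sample sentence are hypothetical and are not Jubilee's data format. Real Propbank annotations attach to nodes of a syntactic tree rather than to plain strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Argument:
    label: str  # Propbank argument label, e.g. "ARG0"
    role: str   # role description from the frameset, e.g. "agent"
    text: str   # hypothetical stand-in for the annotated tree node


@dataclass
class Instance:
    roleset_id: str  # sense ID of the predicate, e.g. "open.01"
    arguments: List[Argument] = field(default_factory=list)

    def labels(self) -> List[str]:
        """Return the argument labels in annotation order."""
        return [arg.label for arg in self.arguments]


# Hypothetical sentence: "John opened the door with a key."
instance = Instance(
    roleset_id="open.01",
    arguments=[
        Argument("ARG0", "agent", "John"),
        Argument("ARG1", "theme", "the door"),
        Argument("ARG2", "instrument", "with a key"),
    ],
)
print(instance.roleset_id, instance.labels())
```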

Figure: task workflow. Claim (Task 1, Task 2); Double-annotate (Annotation 2-1, Annotation 2-2); Adjudicate (Adjudication 2).
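The claim, double-annotate, adjudicate sequence can be sketched as a small state tracker. This is an illustration under stated assumptions, not Jubilee's implementation: the `Task` class, its method names, and the rule that adjudication waits for two finished annotations are all hypothetical and follow the three-step procedure strictly.

```python
from dataclasses import dataclass, field
from typing import List

REQUIRED_ANNOTATIONS = 2  # assumption: "double-annotated" means two annotators


@dataclass
class Task:
    name: str
    claimed_by: List[str] = field(default_factory=list)
    annotated_by: List[str] = field(default_factory=list)
    adjudicated: bool = False

    def claim(self, annotator: str) -> None:
        if annotator in self.claimed_by:
            raise ValueError(f"{annotator} already claimed {self.name}")
        if len(self.claimed_by) >= REQUIRED_ANNOTATIONS:
            raise ValueError(f"{self.name} is fully claimed")
        self.claimed_by.append(annotator)

    def submit(self, annotator: str) -> None:
        # Only an annotator who claimed the task may submit an annotation.
        if annotator not in self.claimed_by:
            raise ValueError(f"{annotator} has not claimed {self.name}")
        if annotator not in self.annotated_by:
            self.annotated_by.append(annotator)

    def adjudicate(self) -> None:
        # Assumption of this sketch: adjudication needs both annotations.
        if len(self.annotated_by) < REQUIRED_ANNOTATIONS:
            raise ValueError(f"{self.name} is not double-annotated yet")
        self.adjudicated = True

    def stage(self) -> str:
        if self.adjudicated:
            return "adjudicated"
        if len(self.annotated_by) >= REQUIRED_ANNOTATIONS:
            return "double-annotated"
        if self.claimed_by:
            return "claimed"
        return "unclaimed"


task = Task("Task 2")
for annotator in ("annotator_1", "annotator_2"):  # hypothetical IDs
    task.claim(annotator)
    task.submit(annotator)
task.adjudicate()
print(task.name, task.stage())
```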

Jubilee

Jubilee main window

Frameset view

• Displays and allows annotators to choose the sense (roleset) of the predicate with respect to the current tree.

Lemma of the predicate for the selected roleset

List of roleset IDs for the predicate

View examples of the selected roleset

A definition and a generalized argument structure of the selected roleset

Argument view

• Contains buttons representing Propbank argument labels.

Claiming tasks

• Choose a Propbank project.
• Choose a task from either:
  - New tasks: claimed by one or fewer annotators.
  - My tasks: claimed by the current annotator.
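The two task lists can be read as simple filters over a record of who has claimed what. The sketch below is hypothetical: the `claims` dictionary and the function names are illustrative and do not reflect how Jubilee stores claims. It applies the "one or fewer" rule literally, so a task with a single claimant appears under New tasks even when that claimant is the current annotator.

```python
from typing import Dict, List


def new_tasks(claims: Dict[str, List[str]]) -> List[str]:
    """Tasks claimed by one or fewer annotators."""
    return sorted(task for task, who in claims.items() if len(who) <= 1)


def my_tasks(claims: Dict[str, List[str]], annotator: str) -> List[str]:
    """Tasks claimed by the given annotator."""
    return sorted(task for task, who in claims.items() if annotator in who)


# Hypothetical claim records: task name -> annotator IDs
claims = {
    "task1": [],
    "task2": ["ann_a"],
    "task3": ["ann_a", "ann_b"],
}
print(new_tasks(claims))          # ['task1', 'task2']
print(my_tasks(claims, "ann_a"))  # ['task2', 'task3']
```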

Treebank view in adjudication mode

• Displays and allows adjudicators to choose or edit from multiple annotations.

Adjudicator ID

Multiple annotations

Treebank view in annotation mode

• Displays syntactic trees in the selected task.

List of tree IDs

Annotator ID

Navigation buttons

Raw sentence of the tree

Annotation vs. adjudication mode

• In annotation mode, annotators are allowed to view and edit only tasks claimed by themselves or one other annotator.
• In adjudication mode, adjudicators are allowed to view and edit all tasks that have undergone at least single-annotation.
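The two modes amount to different access checks on a task. The function below is a hedged sketch of one reading of those rules: its name, its parameters, and the decision to let annotators open unclaimed tasks are assumptions made for illustration, not behaviour confirmed by the tool.

```python
from typing import List


def can_open(mode: str, user: str, claimed_by: List[str],
             completed_annotations: int) -> bool:
    """Return True if `user` may view and edit the task in `mode`."""
    if mode == "annotation":
        # Own tasks, or tasks with at most one claimant so far
        # (assumption: unclaimed tasks are also open to annotators).
        return user in claimed_by or len(claimed_by) <= 1
    if mode == "adjudication":
        # Any task that has at least one finished annotation.
        return completed_annotations >= 1
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical users and tasks
print(can_open("annotation", "ann_c", ["ann_a", "ann_b"], 2))   # False
print(can_open("annotation", "ann_b", ["ann_a"], 0))            # True
print(can_open("adjudication", "adj_1", ["ann_a"], 1))          # True
```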

Advantages and Features

• Speed up: argument and sense annotations are simultaneous.

• Unified format: the use of one tool simplifies data maintenance.

• Syntax visualization: syntax is easily understandable to annotators.

• Semantic supply: frameset information is provided for annotators to consult.

• Multilingual: accommodates Arabic, Chinese, English, Hindi and Korean.

• Platform independent: runs on any platform with a JVM (Java 6.0).

• Runs on X11: annotators can make updates remotely.

Operators

• In the absence of Treebank co-indexing, annotators can provide semantic information about a null element by manually linking it to its overt referent using the ‘★’ operator.
• In cases where an argument is discontinuous such that it cannot be captured in the annotation of one node, the ‘,’ operator is used.
• The ‘&’ operator is used to link the object trace after a passive verb to its referent in the subject position in reduced relative clauses.
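To make the three operators concrete, the sketch below splits an argument pointer into tree nodes and the operators that join them. Several things here are assumptions not stated in this document: that nodes are written in a `terminal:height` form, that the linking operator is typed as an ASCII `*`, and the function name itself. The example pointers are invented for illustration.

```python
import re
from typing import List, Tuple

# Operator meanings as described in the text; "*" as the ASCII form of the
# linking operator is an assumption of this sketch.
OPERATORS = {
    "*": "link a null element to its overt referent",
    ",": "concatenate the parts of a discontinuous argument",
    "&": "link a passive object trace to its subject referent",
}

NODE = re.compile(r"\d+:\d+")  # assumed "terminal:height" node notation


def split_pointer(pointer: str) -> Tuple[List[str], List[str]]:
    """Split a pointer such as '2:1*5:0' into (nodes, operators)."""
    tokens = re.split(r"([*,&])", pointer)
    nodes, operators = tokens[0::2], tokens[1::2]
    for node in nodes:
        if not NODE.fullmatch(node):
            raise ValueError(f"malformed node: {node!r}")
    return nodes, operators


# Hypothetical pointers
for pointer in ("2:1*5:0", "3:0,6:1", "4:1&1:1"):
    nodes, operators = split_pointer(pointer)
    print(nodes, [OPERATORS[op] for op in operators])
```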

How to obtain Jubilee

• Available as an open source project on Google Code (http://code.google.com/p/propbank).
• Contact: [email protected]

More about Jubilee

Acknowledgements

• Special thanks are due to Professor Nianwen Xue of Brandeis University for his very helpful insights, as well as Scott Cotton, the developer of RATS, and Tom Morton, the developer of WordFreak, both previously used for Propbank annotation.
• We also gratefully acknowledge the support of the National Science Foundation Grants CISE-CRI-0551615, Towards a Comprehensive Linguistic Annotation, and CISE-CRI-0709167, Collaborative: A Multi-Representational and Multi-Layered Treebank for Hindi/Urdu, and a grant from the Defense Advanced Research Projects Agency (DARPA/IPTO) under the GALE program, DARPA/CMO Contract No. HR0011-06-C-0022, subcontract from BBN, Inc. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.