Propbank Frameset Annotation Guidelines Using a Dedicated Editor, Cornerstone
Jinho Choi, Claire Bonial, Martha Palmer
Institute of Cognitive Science, University of Colorado at Boulder
Propbank
• A corpus in which the arguments of each verb predicate are annotated with their semantic roles.
• Each predicate is also annotated with its sense ID, which is supplied through frameset files.
Frameset Files
• Frameset files outline the argument structure for each sense of every predicate in the Propbank.
• Annotators use the semantic and syntactic information provided in frameset files to make consistent annotations efficiently.
Cornerstone
Advantages and Features
• Free of errors: Frameset files created by Cornerstone are guaranteed to be free of errors.
• Easy customization: allows users to easily customize tags required for frameset annotations.
• Free of XML: frameset authors do not need to know any XML.
• Multilingual: accommodates Arabic, Chinese, English, Hindi and Korean.
• Platform independent: runs on any platform with a JVM (Java 6.0).
• Runs on X11: annotators can make updates remotely.
How to obtain Cornerstone
• Available as an open-source project on Google Code (http://code.google.com/p/propbank).
• The project also provides a Propbank instance editor, Jubilee.
• Both Cornerstone and Jubilee have been used at several universities.
• Contact: [email protected]
More about Cornerstone
Acknowledgements
• We gratefully acknowledge the support of the National Science Foundation Grants CISE-CRI-0551615, Towards a Comprehensive Linguistic Annotation and CISE-CRI 0709167, Collaborative: A Multi-Representational and Multi-Layered Treebank for Hindi/Urdu, and a grant from the Defense Advanced Research Projects Agency (DARPA/IPTO) under the GALE program, DARPA/CMO Contract No. HR0011-06-C-0022, subcontract from BBN, Inc. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
E.g. John opened the door with his foot.
Sense ID open.01 (‘open’)
ARG0 John (agent: ‘opener’)
ARG1 the door (theme: ‘thing opening’)
ARG2 with his foot (instrument)
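The annotated example above can be written down as a small data structure. The Python sketch below is only an illustration: the class and field names (`Argument`, `Instance`, `label`, `span`, `role`) are hypothetical and do not reflect the data model used by Cornerstone or Jubilee.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Argument:
    label: str  # numbered argument label, e.g. "ARG0"
    span: str   # text of the annotated constituent
    role: str   # role description, as given in the example


@dataclass
class Instance:
    sentence: str
    predicate: str  # surface form of the verb in the sentence
    sense_id: str   # sense ID supplied through the frameset file
    arguments: List[Argument] = field(default_factory=list)


# The example sentence from the poster, encoded with the hypothetical classes.
instance = Instance(
    sentence="John opened the door with his foot.",
    predicate="opened",
    sense_id="open.01",
    arguments=[
        Argument("ARG0", "John", "agent: opener"),
        Argument("ARG1", "the door", "theme: thing opening"),
        Argument("ARG2", "with his foot", "instrument"),
    ],
)

for arg in instance.arguments:
    print(f"{arg.label}\t{arg.span}\t({arg.role})")
```

Keeping the sense ID on the instance and the role descriptions on each argument mirrors the two layers of annotation described above: one sense per predicate, one semantic role per argument.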
Propbank annotations are based on the predicate-argument structure outlined in frameset files.
It is important to keep the frameset files consistent, simple to read, and easy to update.
opened: open.01
John: ARG0 (agent)
the door: ARG1 (theme)
with his foot: ARG2 (instrument)
Including verb-particle constructions
One per verb
All verbs in Propbank
Multi-lemma mode
• A predicate can have multiple lemmas (e.g., open, open up).
• Languages: English, Hindi
Uni-lemma mode
• A predicate can have only one lemma.
• Languages: Arabic, Chinese
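The difference between the two modes can be sketched as a simple check on the number of lemmas attached to a predicate. The following Python snippet is a hypothetical illustration, not Cornerstone's implementation: the names `LEMMA_MODE` and `check_lemmas` are invented here, and the language-to-mode table lists only the four languages named above.

```python
# Language-to-mode table, taken from the two lists above.
LEMMA_MODE = {
    "English": "multi",
    "Hindi": "multi",
    "Arabic": "uni",
    "Chinese": "uni",
}


def check_lemmas(language, lemmas):
    """Return True if the lemma list is allowed for the language's mode."""
    mode = LEMMA_MODE[language]  # raises KeyError for languages not listed
    if not lemmas:
        return False  # a predicate needs at least one lemma
    if mode == "uni":
        return len(lemmas) == 1  # uni-lemma mode: exactly one lemma
    return True  # multi-lemma mode: one or more lemmas


print(check_lemmas("English", ["open", "open up"]))
print(check_lemmas("Chinese", ["lemma_a", "lemma_b"]))
```

The placeholder strings `lemma_a` and `lemma_b` stand in for real lemmas and carry no linguistic meaning.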
Frameset files in XML
• All frameset files are written in XML, which is complicated for those who are not familiar with it and can lead to errors.
<frameset>
  <predicate lemma="open">
    <roleset id="open.01" name="open" vncls="40.3.2 45.4 47.6">
      <roles>
        <role descr="opener" n="0">
          <vnrole vncls="47.6" vntheta="Agent"/>
          <vnrole vncls="40.3.2" vntheta="Agent"/>
          <vnrole vncls="45.4" vntheta="Agent"/>
        </role>
        <role descr="thing opening" n="1">
          <vnrole vncls="47.6" vntheta="Theme"/>
          <vnrole vncls="40.3.2" vntheta="Patient"/>
          <vnrole vncls="45.4" vntheta="Patient"/>
        </role>
        …
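To show how the argument structure can be read back out of such a file, here is a minimal Python sketch using the standard `xml.etree.ElementTree` module. It makes two assumptions: the fragment above is closed off after the two roles that are shown (the elided remainder is simply left out, so this is not the complete `open.01` roleset), and the helper name `list_roles` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The fragment shown above, with its open tags closed after the two visible
# roles so that it parses. The elided part of the original file is omitted.
FRAMESET_XML = """
<frameset>
  <predicate lemma="open">
    <roleset id="open.01" name="open" vncls="40.3.2 45.4 47.6">
      <roles>
        <role descr="opener" n="0">
          <vnrole vncls="47.6" vntheta="Agent"/>
          <vnrole vncls="40.3.2" vntheta="Agent"/>
          <vnrole vncls="45.4" vntheta="Agent"/>
        </role>
        <role descr="thing opening" n="1">
          <vnrole vncls="47.6" vntheta="Theme"/>
          <vnrole vncls="40.3.2" vntheta="Patient"/>
          <vnrole vncls="45.4" vntheta="Patient"/>
        </role>
      </roles>
    </roleset>
  </predicate>
</frameset>
"""


def list_roles(xml_text):
    """Map each roleset ID to a list of (label, description, thematic roles)."""
    root = ET.fromstring(xml_text)
    result = {}
    for roleset in root.iter("roleset"):
        roles = []
        for role in roleset.iter("role"):
            # Collect the distinct VerbNet thematic roles linked to this role.
            thetas = sorted({vn.get("vntheta") for vn in role.iter("vnrole")})
            roles.append(("ARG" + role.get("n"), role.get("descr"), thetas))
        result[roleset.get("id")] = roles
    return result


for sense_id, roles in list_roles(FRAMESET_XML).items():
    print(sense_id)
    for label, descr, thetas in roles:
        print(f"  {label}: {descr} ({', '.join(thetas)})")
```

Even this short fragment needs nested loops and attribute lookups to read, which illustrates why editing the XML by hand is error-prone for authors unfamiliar with it.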
Lemma(s)
Senses (multi-lemma: roleset; uni-lemma: frameset)
Argument structure for the selected sense
Examples for the selected sense
Mappings between syntactic and semantic arguments