ATLAS Demystified: A Practical Introduction Christophe Laprun, Jonathan Fiscus, John Garofolo,...

  • ATLAS Demystified: A Practical Introduction
    Christophe Laprun, Jonathan Fiscus, John Garofolo, Sylvain Pajot
    National Institute of Standards and Technology
    May 31, 2002
    Annotation Frameworks and Tools, LREC 2002

  • Overview
    Goal: To make the ATLAS-describable Universe == the Linguistic Annotation Universe. We need to examine more annotation tasks.

    The ATLAS-describable Universe is generated by the ATLAS ontology: an annotation is the fundamental act of associating some content with a region in a signal.

    ATLAS = Architecture and Tools for Linguistic Analysis Systems

  • Brief History
    Started with Bird and Liberman's Annotation Graphs (AGs)
    ATLAS working group formed to explore the AG concept
        LDC, MITRE and NIST
    Introduced at LREC 2000
    Since LREC 2000:
        LDC pursued Annotation Graph implementation
            To satisfy immediate annotation needs for speech and text
            Developed AGTK
            Optimized for annotation of linear signals
        NIST pursued generalized ATLAS model
            2001 - Multidimensional signals
            2002 - Type support, explicit support for hierarchies

  • Motivation for Generalization
    A long-term solution was needed
    Linguistic research is rapidly moving beyond linear signals
        Multi-modal complex signals with varying dimensionality
        NIST Meeting Room data includes speech, video, and whiteboard interaction
        Automatic Content Extraction (ACE) program includes extraction from speech, text and image data
        Gesture annotation ideally involves 3-dimensional space over time

  • Additional Needs Addressed During Generalization
    Type definition support
        Define the content, structure and relationships between annotations
        Dual use: provides corpus design definition to framework and users
    Hierarchical dependencies abound
        Sentences are composed of words which are composed of phones, co-reference, parse trees, etc.
        AGs do not explicitly express dependencies
    Ubiquitous annotation validation
        Happens at every stage of data manipulation: creation, modification and filtering
        Syntax checking is only the first step

  • What We Have Accomplished
    The core ATLAS annotation ontology
    Type definition infrastructure
    Developer framework

  • The Core ATLAS Ontology (Simple Speech Use Case)
    Task: Annotate sentences which are composed of words

    [Diagram: a Sentence annotation over the "audio" signal whose children are Word annotations ("She", "had"); Interval Regions are delimited by Offset Anchors. Element labels: Signal, Anchor, Region, Content, Annotation, Children.]
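
    The elements named on this slide (Signal, Anchor, Region, Content, Annotation, Children) can be sketched as a small data model. This is only an illustration of the ontology as described here, not the jATLAS API: the class names, field names, and offsets in seconds are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Signal:
    name: str  # hypothetical identifier for the annotated signal


@dataclass(frozen=True)
class OffsetAnchor:
    signal: Signal
    offset: float  # position in the signal; seconds is an assumption


@dataclass(frozen=True)
class IntervalRegion:
    start: OffsetAnchor
    end: OffsetAnchor

    def __post_init__(self):
        # A region is only meaningful if both anchors point into one signal.
        if self.start.signal != self.end.signal:
            raise ValueError("anchors must point into the same signal")
        if self.start.offset > self.end.offset:
            raise ValueError("start anchor must not come after end anchor")


@dataclass
class Annotation:
    type_name: str
    region: IntervalRegion
    content: str = ""
    children: List["Annotation"] = field(default_factory=list)


audio = Signal("audio")


def word(text, start, end):
    """Build a Word annotation whose content is the word itself."""
    region = IntervalRegion(OffsetAnchor(audio, start), OffsetAnchor(audio, end))
    return Annotation("Word", region, content=text)


# Illustrative offsets only; the slide gives no timing values.
she = word("She", 0.0, 0.3)
had = word("had", 0.3, 0.6)

# The sentence spans from the first word's start to the last word's end.
sentence = Annotation(
    "Sentence",
    IntervalRegion(she.region.start, had.region.end),
    children=[she, had],
)

print([w.content for w in sentence.children])
```

    Keeping anchors as separate objects lets a parent annotation reuse the anchors of its children, so the sentence boundaries cannot drift away from the word boundaries.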

  • The Core ATLAS Ontology (Simple Gesture Use Case)
    [Diagram: a Forearm Annotation with two XYZ Anchors and two Frame Anchors.]
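
    The same pattern extends to a multidimensional signal. The sketch below assumes that the two XYZ anchors mark the spatial extent of the forearm and the two frame anchors mark a span of video frames; the slide does not spell this out, and every name and value here is hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class XYZAnchor:
    x: float
    y: float
    z: float


@dataclass(frozen=True)
class FrameAnchor:
    frame: int  # video frame index


@dataclass(frozen=True)
class GestureRegion:
    # Assumed reading of the diagram: two points in space, two points in time.
    space: Tuple[XYZAnchor, XYZAnchor]
    time: Tuple[FrameAnchor, FrameAnchor]

    def duration_in_frames(self) -> int:
        return self.time[1].frame - self.time[0].frame


@dataclass(frozen=True)
class Annotation:
    type_name: str
    region: GestureRegion
    content: str


# Made-up coordinates, frame numbers, and content, for illustration only.
forearm = Annotation(
    "Forearm",
    GestureRegion(
        space=(XYZAnchor(0.10, 1.20, 0.50), XYZAnchor(0.35, 1.25, 0.55)),
        time=(FrameAnchor(120), FrameAnchor(180)),
    ),
    content="raised",
)

print(forearm.region.duration_in_frames())
```

    Only the anchor and region types change between the speech and gesture cases; the annotation itself still pairs content with a region.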

  • Type Definition Infrastructure
    Meta Annotation Infrastructure for ATLAS (MAIA)
    Provides mechanism for the definition and use of annotations at the semantic level
        Specifies content, structure and relationships between annotations
        Sufficiently expressive for validation
    Users declare their types via XML
        No coding required
    Framework generates and uses type constructs from the definition dynamically
        Validation occurs automatically
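
    To make the idea of declaring types in XML and validating against them concrete, here is a minimal sketch. The XML vocabulary (annotationType, children, hasContent) is invented for this example and is not MAIA's actual syntax; annotations are plain dictionaries, and only two structural rules are checked.

```python
import xml.etree.ElementTree as ET

# Hypothetical type declaration; element and attribute names are assumptions.
TYPE_XML = """
<types>
  <annotationType name="Sentence" children="Word" hasContent="false"/>
  <annotationType name="Word" children="" hasContent="true"/>
</types>
"""


def load_types(xml_text):
    """Build type constructs from the XML declaration at run time."""
    types = {}
    for el in ET.fromstring(xml_text).iter("annotationType"):
        types[el.get("name")] = {
            "children": set(el.get("children", "").split()),
            "has_content": el.get("hasContent") == "true",
        }
    return types


def validate(annotation, types):
    """Return a list of error messages; an empty list means valid."""
    spec = types.get(annotation["type"])
    if spec is None:
        return [f"unknown type: {annotation['type']}"]
    errors = []
    if spec["has_content"] and not annotation.get("content"):
        errors.append(f"{annotation['type']} requires content")
    for child in annotation.get("children", []):
        if child["type"] not in spec["children"]:
            errors.append(f"{annotation['type']} may not contain {child['type']}")
        errors.extend(validate(child, types))
    return errors


types = load_types(TYPE_XML)

good = {
    "type": "Sentence",
    "children": [
        {"type": "Word", "content": "She"},
        {"type": "Word", "content": "had"},
    ],
}
bad = {"type": "Word", "content": "She", "children": [{"type": "Sentence"}]}

print(validate(good, types))
print(validate(bad, types))
```

    Because the rules live in the declaration rather than in code, the same validator can run at creation, modification, and filtering time without change.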

  • Type Definition Excerpt
    [Diagram labels: Sentence Annot., Children, Interval Region, Word Annot., WordContent.]

  • Developer Framework
    jATLAS: a Java implementation
    Core suite of objects:
        Implements ATLAS generic annotation ontology
        Defines an Application Programming Interface (API)
    Low-level services:
        Data import/export, management utilities
        Defines a Service Provider Interface (SPI) to allow advanced framework extensions
            Additional persistence forms
        Automatic validation services via MAIA

  • ATLAS Status
    Stable ontology
    Basic typing services via MAIA
    Developer framework: jATLAS in Beta version
        Has dramatically reduced development times for NIST prototype applications
    Persistence format: ATLAS Interchange Format (AIF)
        ACE format import
        AG format import partially supported
    Active development
    Public domain source code, freely available

  • ATLAS Future Work
    MAIA extensions
        Type inheritance
        Increased structural validation
        Content-based validation
    Framework extensions
        Currently developing a GUI component framework
    Tool development
        Annotation and evaluation tools at NIST
        Collaboration with other sites
        Contributed tools repository

  • More information?
    http://www.nist.gov/speech/atlas
    We welcome feedback, comments and suggestions