A Common Ontology for Linguistic Concepts Scott Farrar University of Arizona.
A Common Ontology for Linguistic Concepts
Scott Farrar
University of Arizona
Endangered Languages
• As many as half of the world’s languages are in danger of disappearing (LaPolla 1998).
• These include many languages in the Americas (Hopi), Africa, Australia (), and Southeast Asia (Biao Min).
EMELD
• EMELD (Electronic Metastructure for Endangered Languages Data)
• One application of EMELD: make endangered languages available on the Semantic Web
Linguistic Field Work
• Linguists collect data
• Datasets (grammars, dictionaries, or glossed corpora)
• Hopi example of kachina: sivu-’ikwiw-ta-qa [vessel-carry:on:back-DUR-REL]
Problems Concerning Data Interoperability
• Datasets can vary according to:
  – markup
  – theoretical style
  – natural language semantics
Az épület-be mégy-ek.
the building-IllativeCase go-1P/SING
I am going into the building.
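The glossed example above pairs each morpheme with a label, and that pairing is what a dataset's markup has to encode. A minimal Python sketch of the alignment follows, assuming hyphens separate morphemes and spaces separate words; the function name `align_gloss` is hypothetical and is not part of EMELD.

```python
def align_gloss(source: str, gloss: str) -> list[list[tuple[str, str]]]:
    """Pair each morpheme with its gloss label, word by word.

    Assumes hyphen-separated morphemes and space-separated words.
    """
    src_words = source.rstrip(".").split()
    gloss_words = gloss.split()
    if len(src_words) != len(gloss_words):
        raise ValueError("source and gloss have different word counts")

    aligned = []
    for src_word, gloss_word in zip(src_words, gloss_words):
        morphs = src_word.split("-")
        labels = gloss_word.split("-")
        if len(morphs) != len(labels):
            raise ValueError(f"cannot align {src_word!r} with {gloss_word!r}")
        aligned.append(list(zip(morphs, labels)))
    return aligned


# The Hungarian example from the slide, copied as given.
aligned = align_gloss(
    "Az épület-be mégy-ek.",
    "the building-IllativeCase go-1P/SING",
)
for word in aligned:
    print(word)
```

Two datasets that gloss the same suffix with different labels would produce different pairs here, which is the interoperability problem the slide describes.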
Problems Concerning Data Interoperability
• Linguistic data is dynamic:
  – New data is collected.
  – Datasets are revised.
  – Theory changes.
Standardization is not Viable
• Text Encoding Initiative (TEI) (Sperberg-McQueen and Burnard 1994)
• Corpus Encoding Standard (CES) (Ide and Romary 2000)
Towards a Solution
• Data Storage and Distribution—local or distributed?
• Data model for linguistic datasets
• Linguistic ontology
EMELD Architecture
[Diagram: GUI; EMELD Search Engine; Linguistic Ontology; language datasets for Hopi, Mocovi, and Biao Min; Semantic Web]
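One way to picture the ontology's role in this architecture is as a shared vocabulary that each dataset's local tags are mapped onto, so that a single query can reach all of them. The Python sketch below is only an illustration under that assumption: the dataset names, local tags, and placeholder forms are hypothetical, and only the concept names (IllativeCase, InessiveCase, GenitiveCase) come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    dataset: str            # hypothetical dataset identifier
    form: str               # placeholder form, not real language data
    tags: tuple[str, ...]   # tags as written in that dataset's own markup


# Hypothetical per-dataset tag vocabularies mapped to shared concept names.
TAG_TO_CONCEPT = {
    "dataset_a": {"ILL": "IllativeCase", "GEN": "GenitiveCase"},
    "dataset_b": {"illat": "IllativeCase", "iness": "InessiveCase"},
}

RECORDS = [
    Record("dataset_a", "stem1-affix1", ("ILL",)),
    Record("dataset_a", "stem2-affix2", ("GEN",)),
    Record("dataset_b", "stem3-affix3", ("illat",)),
    Record("dataset_b", "stem4-affix4", ("iness",)),
]


def concepts_of(record: Record) -> set[str]:
    """Translate a record's local tags into shared ontology concepts."""
    mapping = TAG_TO_CONCEPT[record.dataset]
    return {mapping[tag] for tag in record.tags if tag in mapping}


def search(concept: str, records: list[Record] = RECORDS) -> list[Record]:
    """Return every record annotated with the concept, whatever its local tag."""
    return [r for r in records if concept in concepts_of(r)]


hits = search("IllativeCase")
for hit in hits:
    print(hit.dataset, hit.form)
```

The query names the concept once, and the per-dataset mappings absorb the markup differences, so revising one dataset's tags only touches that dataset's mapping.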
Linguistic Ontology
• Conceptual model for the linguistics domain (special focus on morpho-syntax)
• Built on top of the Suggested Upper Merged Ontology (SUMO) (Niles and Pease 2001)
  – already includes a number of concepts relating to semiotics and linguistics
  – incorporates concepts from a number of top-level ontologies
  – peer-reviewed and freely available
Backbone Taxonomy
• Entity
  Physical
    Object
      ContentBearingObject
        Icon
        SymbolicString
        LinguisticExpression
          WrittenLinguisticExpression
            Text
            Sentence
            Phrase
            Word
            Morpheme
          SpokenLinguisticExpression
            Dialogue
            Sentence
            Phrase
            Word
            Morpheme
Backbone Taxonomy (continued)
  Abstract
    Class
      Relation
        Predicate
          GrammaticalRelation
            Aspect
            Tense
            Case
            Agreement
    Attribute
      GrammaticalAttribute
        Gender
        Person
        Number
Morphosyntactic Case
Case
  InherentCase
    Spatio-KineticCase
      PositionalCase
        InessiveCase
      DirectionalCase
        IllativeCase
    ExistentialCase
      AbessiveCase
      PartitiveCase
    InstrumentalCase
  StructuralCase
    GenitiveCase
    ErgativeCase
    NominativeCase
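A taxonomy like this supports subsumption queries: a search for a general case category can also match its more specific subtypes. The Python sketch below encodes one plausible reading of the case hierarchy as a child-to-parent map. The exact nesting is an assumption, since the slide layout does not fully determine it, and the helper names `ancestors` and `is_a` are hypothetical.

```python
# Child -> parent links. The nesting is an assumed reading of the slide.
PARENT = {
    "InherentCase": "Case",
    "StructuralCase": "Case",
    "Spatio-KineticCase": "InherentCase",
    "ExistentialCase": "InherentCase",
    "InstrumentalCase": "InherentCase",
    "PositionalCase": "Spatio-KineticCase",
    "DirectionalCase": "Spatio-KineticCase",
    "InessiveCase": "PositionalCase",
    "IllativeCase": "DirectionalCase",
    "AbessiveCase": "ExistentialCase",
    "PartitiveCase": "ExistentialCase",
    "GenitiveCase": "StructuralCase",
    "ErgativeCase": "StructuralCase",
    "NominativeCase": "StructuralCase",
}


def ancestors(concept: str) -> list[str]:
    """Walk parent links upward and return the chain, nearest parent first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def is_a(concept: str, ancestor: str) -> bool:
    """True if `concept` equals `ancestor` or falls anywhere beneath it."""
    return concept == ancestor or ancestor in ancestors(concept)


print(ancestors("IllativeCase"))
print(is_a("IllativeCase", "Spatio-KineticCase"))
print(is_a("GenitiveCase", "InherentCase"))
```

Under this reading, a query for Spatio-KineticCase would retrieve data tagged IllativeCase or InessiveCase without either dataset having to use the broader label.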
Future directions
• Include the domains of phonology and discourse analysis.
• The linguistics ontology has applications beyond the immediate EMELD project:
  – as part of an expert system for reasoning about language data
  – as part of an interlingua designed for machine translation systems
Contact Info
• Scott Farrar
• Will Lewis
• Terry Langendoen
• {farrar, wlewis, langendoen}@u.arizona.edu