What do cats have to do with explicit semantics?

Page 1: What do cats have to do with explicit semantics?

www.isocat.org

What do cats have to do with explicit semantics?

Menzo Windhouwer, MPI for Psycholinguistics

[email protected]

Ineke Schuurman, KU Leuven & Utrecht University

[email protected]

Page 2: What do cats have to do with explicit semantics?

TTNWW and ISOcat

• TTNWW: TST Tools voor het Nederlands als Web services in een Workflow (language and speech technology tools for Dutch as web services in a workflow)

• CLARIN-NL and VL pilot project

• Goal: to enable researchers in the humanities to use our tools and resources easily, even when a whole series of tools and resources is involved.

20 January 2012 CLIN22 - TTNWW Project 2

Page 3: What do cats have to do with explicit semantics?

TTNWW and ISOcat

• Issues when making use of such a ‘chain’:

– Is the meaning of notion X in resource/tool A the same as that in resource/tool B?

– Is the meaning of notion X in resource/tool A and that of Y in resource/tool B the same?

– Or, if not the same, are they related? If so, how?

= ISOcat and friends to the rescue!

Page 4: What do cats have to do with explicit semantics?

Explicit semantics

• Language resources are valuable assets

– store them in an archive to assure persistency!

– later generations can research material that only now can still be collected

• Problem: used terminology might ‘rot’

– terms get a (slightly) different meaning over (long) periods of time

– later generations need to know what a term means today

• Solution: make semantics explicit

Page 5: What do cats have to do with explicit semantics?

The ISOcat Data Category Registry

http://www.isocat.org/

• An ISOcat data category is “an elementary descriptor in a linguistic structure or an annotation scheme” (ISO 12620:2009)

• ISOcat data categories have unique and persistent identifiers, which can be resolved over the web

http://www.isocat.org/datcat/DC-78
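A minimal sketch of how a tool might handle these identifiers, assuming only the URL form shown above (http://www.isocat.org/datcat/DC- followed by a number). The function names `datcat_pid` and `datcat_number` are made up for illustration and are not part of any ISOcat API.

```python
import re

# Assumption: every data category PID has the form shown on the slide.
PID_PREFIX = "http://www.isocat.org/datcat/DC-"
_PID_PATTERN = re.compile(r"^http://www\.isocat\.org/datcat/DC-(\d+)$")


def datcat_pid(number: int) -> str:
    """Build the persistent identifier for a data category number."""
    if number <= 0:
        raise ValueError("data category numbers are assumed to be positive")
    return f"{PID_PREFIX}{number}"


def datcat_number(pid: str) -> int:
    """Extract the data category number from a PID, or raise ValueError."""
    match = _PID_PATTERN.match(pid.strip())
    if match is None:
        raise ValueError(f"not a recognised data category PID: {pid!r}")
    return int(match.group(1))


pid = datcat_pid(78)
print(pid, datcat_number(pid))
```

Storing the full PID rather than the bare number keeps an annotation resolvable without knowing which registry it came from.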

Page 6: What do cats have to do with explicit semantics?

Annotate all elements in a linguistic resource

[Figure: a lexicon structure annotated with data categories. Labels shown: /lexicon/, /entry/, /lemma/, /writtenForm/, /language/, /alphabet/, /japanese/, /ipa/]

Page 7: What do cats have to do with explicit semantics?

Sharing structure

• Using ISOcat data category references, specifications of elementary descriptors can be shared between structures

• How to share (annotated) structures?

• A companion registry for ISOcat is under development: SCHEMAcat

• This registry should persistently store any kind of schema, e.g., XML schemata, EBNF grammars

Page 8: What do cats have to do with explicit semantics?

Annotated CGN/DCOI grammar

tag = pos '(' feat* ')'

# @dcr:datcat 'WW' http://www.isocat.org/datcat/DC-1424

# @dcr:datcat 'TW' http://www.isocat.org/datcat/DC-1334

# @dcr:datcat 'VG' http://www.isocat.org/datcat/DC-1226

# @dcr:datcat 'TSW' http://www.isocat.org/datcat/DC-2717

pos = 'N' | 'ADJ' | 'WW' | 'TW' | 'VNW' | 'LID' | 'VZ' | 'VG' | 'BW' | 'TSW'

feat = 'NTYPE' | 'GETAL' | 'GRAAD' | 'GENUS' | 'NAAMVAL' | 'POSITIE' | 'BUIGING' | 'GETAL-N' | 'WVORM' | 'PVTIJD' | 'PVAGR' | 'NUMTYPE' | 'VWTYPE' | 'PDTYPE' | 'PERSOON' | 'STATUS' | 'NPAGR' | 'LWTYPE' | 'VZTYPE' | 'CONJTYPE' | 'SPECTYPE'

NTYPE = 'soortnaam' | 'eigennaam'

GETAL = 'enkelvoud' | 'meervoud' | 'getal'

GRAAD = 'basis' | 'comparatief' | 'superlatief' | 'diminutief'

GENUS = 'genus' | 'zijdig' | 'masculien' | 'feminien' | 'onzijdig'

NAAMVAL = 'standaard' | 'nominatief' | 'oblique' | 'bijzonder' | 'genitief' | 'datief'

POSITIE = 'prenominaal' | 'nominaal' | 'postnominaal' | 'vrij'

BUIGING = 'zonder' | 'met-e' | 'met-s'

GETAL-N = 'zonder-n' | 'meervoud-n'

WVORM = 'persoonsvorm' | 'buigbaar' | 'infinitief' | 'onvdw' | 'voltdw'

# @dcr:datcat PVTIJD http://www.isocat.org/datcat/DC-1286

# @dcr:datcat 'verleden' http://www.isocat.org/datcat/DC-1347

# @dcr:datcat 'conjunctief' http://www.isocat.org/datcat/DC-1843

PVTIJD = 'tegenwoordig' | 'verleden' | 'conjunctief'

PVAGR = 'enkelvoud' | 'meervoud' | 'met-t'

NUMTYPE = 'hoofdtelwoord' | 'rangtelwoord'

VWTYPE = 'pr' | 'persoonlijk' | 'reflexief' | 'reciprook' | 'bezittelijk' | 'vb' | 'vragend' | 'betrekkelijk' | 'exclamatief' | 'aanwijzend' | 'onbepaald'

PDTYPE = 'pronomen' | 'adv-pronomen' | 'determiner' | 'gradeerbaar'

PERSOON = 'persoon' | '1' | '2' | '2v' | '2b' | '3' | '3p' | '3m' | '3v' | '3o'

STATUS = 'vol' | 'gereduceerd' | 'nadruk'

NPAGR = 'agr' | 'evon' | 'rest' | 'evz' | 'mv' | 'agr3' | 'evmo' | 'rest3' | 'evf' | 'mv3'

LWTYPE = 'bepaald' | 'onbepaald'

VZTYPE = 'initieel' | 'versmolten' | 'finaal'

CONJTYPE = 'nevenschikkend' | 'onderschikkend'

SPECTYPE = 'afgebroken' | 'onverstaanbaar' | 'vreemd' | 'deeleigen' | 'meta' | 'commentaar' | 'achtergrond' | 'afkorting' | 'symbool' | 'dialect'
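The `# @dcr:datcat` comments in the grammar above can be harvested mechanically. The following is a rough sketch, assuming the comment layout used on this slide (a symbol, optionally quoted, followed by a PID of the form http://www.isocat.org/datcat/DC-n given earlier). The function name `collect_datcat_annotations` and the shortened `SAMPLE` excerpt are illustrative only; this is not how SCHEMAcat itself processes schemata.

```python
import re

# Assumed comment layout: "# @dcr:datcat <symbol> <PID>", where the symbol
# may be wrapped in straight or typographic single quotes.
ANNOTATION = re.compile(
    r"^#\s*@dcr:datcat\s+['‘’]?([^'‘’\s]+)['‘’]?\s+(\S+)$"
)


def collect_datcat_annotations(grammar_text: str) -> dict[str, str]:
    """Map each annotated grammar symbol to its data category PID."""
    mapping: dict[str, str] = {}
    for line in grammar_text.splitlines():
        match = ANNOTATION.match(line.strip())
        if match:
            symbol, pid = match.groups()
            mapping[symbol] = pid
    return mapping


# Shortened excerpt of the annotated grammar, for illustration only.
SAMPLE = """\
# @dcr:datcat 'WW' http://www.isocat.org/datcat/DC-1424
# @dcr:datcat 'TW' http://www.isocat.org/datcat/DC-1334
pos = 'N' | 'ADJ' | 'WW' | 'TW'
# @dcr:datcat PVTIJD http://www.isocat.org/datcat/DC-1286
PVTIJD = 'tegenwoordig' | 'verleden' | 'conjunctief'
"""

annotations = collect_datcat_annotations(SAMPLE)
for symbol, pid in annotations.items():
    print(symbol, "->", pid)
```

Because the annotations live in comments, the grammar stays valid for any parser that ignores `#` lines, while tools that understand the convention can still recover the semantics.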

Page 9: What do cats have to do with explicit semantics?

Sharing relations

• Ontological relationships can be defined among data categories and (other) concepts

• These relationships allow crosswalks between various resource models

– discover related resources which use (different levels of) semantically close data categories

• RELcat is a companion registry which will allow storing (and sharing) a linguist's individual view on these relationships

http://lux13.mpi.nl/relcat/ (alpha)
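To illustrate what a crosswalk over such relations could look like, here is a small sketch. Everything in it is hypothetical: the identifiers (`toolA:verb`, `toolB:WW`, ...), the relation type names `sameAs` and `almostSameAs`, and the choice to treat relations as symmetric are assumptions for the example, not RELcat's actual data model.

```python
from collections import deque

# One linguist's (made-up) view: (subject, relation type, object) triples.
RELATIONS = [
    ("toolA:verb", "sameAs", "toolB:WW"),
    ("toolB:WW", "almostSameAs", "toolC:V"),
    ("toolA:noun", "sameAs", "toolB:N"),
]


def related(start, relations, allowed=frozenset({"sameAs", "almostSameAs"})):
    """Return every category reachable from `start` via allowed relations."""
    # Assumption: the allowed relation types can be followed in both directions.
    neighbours = {}
    for subject, relation, obj in relations:
        if relation in allowed:
            neighbours.setdefault(subject, set()).add(obj)
            neighbours.setdefault(obj, set()).add(subject)

    # Breadth-first search over the relation graph.
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in neighbours.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)
    return seen


print(sorted(related("toolA:verb", RELATIONS)))
print(sorted(related("toolA:verb", RELATIONS, allowed=frozenset({"sameAs"}))))
```

Restricting `allowed` to stricter relation types narrows the crosswalk, which is one way a user's individual view could change which resources count as related.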

Page 10: What do cats have to do with explicit semantics?

Semantic network

[Figure: semantic network connecting a linguistic resource (schema) and a linguistic knowledge base to the registries. Labels shown: Data Category Registry - ISOcat; Data categories; Containers; Concepts; Concept Registry; Relation; Relation Registry - RELcat; Schema Registry - SCHEMAcat]

Page 11: What do cats have to do with explicit semantics?

Conclusion

• CLARIN(-NL/-VL), including TTNWW, is working towards a set of registries that enable the community to collaboratively make semantics explicit by:

– sharing elementary descriptors: data categories

• persistently

– sharing structure: schemata

• persistently

– sharing ontological relations

• individual world views

Page 12: What do cats have to do with explicit semantics?

What do cats have to do with explicit semantics?

Page 13: What do cats have to do with explicit semantics?

Thank you for your attention!

Visit www.isocat.org

Questions? www.isocat.org/forum/

[email protected]