TMF - a tutorial Part 3: Designing (schemas and) filters TMF - Terminological Markup Framework Laurent Romary - Laboratoire Loria

General principles

Terminological information interchange
– Three components:
  • Source TDB1
  • Target TDB2
  • Terminological interchange format
    – A specific TML (DXLT, Geneter)

[Figure: TDB1, TDB2 and the TML between them]

Important notice

– GMT is not a TML
  • It is too abstract a format
    – Uncontrolled recursion (‘struct’ element)
    – Uncontrolled content (‘feat’ and ‘annot’)
  • A schema is needed to check interchanged data
    – Precise list of data categories
    – Precise definition of the format
– GMT is here to provide conceptual simplicity
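
The kind of check a schema adds on top of GMT can be sketched in a few lines of Python. This is only an illustration: the two sets below are hypothetical stand-ins for a real TML schema, and the sketch checks the `type` values only, not the nesting of ‘struct’ elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical constraints standing in for a real TML schema.
ALLOWED_STRUCTS = {"TE", "LS", "TS"}
ALLOWED_FEATS = {"term", "definition", "language"}

def check(root):
    """Return a list of problems found in a GMT fragment."""
    problems = []
    for elem in root.iter():
        kind = elem.get("type")
        if elem.tag == "struct" and kind not in ALLOWED_STRUCTS:
            problems.append(f"unknown struct type: {kind}")
        elif elem.tag == "feat" and kind not in ALLOWED_FEATS:
            problems.append(f"unknown data category: {kind}")
    return problems

fragment = ET.fromstring(
    '<struct type="TS"><feat type="term">key money</feat>'
    '<feat type="colour">green</feat></struct>'
)
print(check(fragment))
# prints: ['unknown data category: colour']
```

Plain GMT would accept the ‘colour’ feature without complaint, which is why the interchanged data needs a precise list of data categories.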

Designing filters

TML to GMT

General principles

Just for your information
– The creation of the filters can be automated

Basic processes
– Reduction of expansion trees
– Mapping elements and attributes to the corresponding data categories

Reducing expansion trees

Example

• DXLT (Martif) sub-tree

<ntig>
  <!-- some general information associated with the term -->
  <termGrp>
    <!-- term related information -->
  </termGrp>
</ntig>

• GMT

<struct type="TS">
  <!-- some features -->
</struct>
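
The reduction shown above can be sketched in Python as follows. This is a minimal illustration, not the tutorial's filter: the function name `reduce_ntig` and the child elements `note` and `term` in the sample input are assumptions, and the children are lifted unchanged (mapping them to ‘feat’ elements is a separate step).

```python
import xml.etree.ElementTree as ET

def reduce_ntig(ntig):
    """Collapse an <ntig>/<termGrp> expansion tree into one GMT <struct type="TS">."""
    struct = ET.Element("struct", {"type": "TS"})
    for child in ntig:
        if child.tag == "termGrp":
            # The grouping element carries no information of its own:
            # lift its children directly into the TS struct.
            struct.extend(list(child))
        else:
            struct.append(child)
    return struct

# Illustrative input; the child element names are assumptions.
source = ET.fromstring(
    "<ntig><note>general</note><termGrp><term>key money</term></termGrp></ntig>"
)
reduced = reduce_ntig(source)
print(ET.tostring(reduced, encoding="unicode"))
# prints: <struct type="TS"><note>general</note><term>key money</term></struct>
```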

Element mapping

Example

• DXLT (Martif)

<definition>Bla, bla, bla etc.</definition>

• GMT

<feat type="definition">Bla, bla, bla etc.</feat>
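
As a rough Python equivalent of this mapping, the sketch below turns one source element into a GMT ‘feat’. The helper name `element_to_feat` is hypothetical, and only the text content is copied; nested markup inside the source element is ignored.

```python
import xml.etree.ElementTree as ET

def element_to_feat(elem, data_category):
    """Build a GMT <feat> carrying the text of a source element."""
    feat = ET.Element("feat", {"type": data_category})
    feat.text = elem.text
    return feat

definition = ET.fromstring("<definition>Bla, bla, bla etc.</definition>")
feat = element_to_feat(definition, "definition")
print(ET.tostring(feat, encoding="unicode"))
# prints: <feat type="definition">Bla, bla, bla etc.</feat>
```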

Structural elements

Generating a GMT ‘struct’ element

<xsl:template match="termEntry">
  <xsl:element name="struct">
    <xsl:attribute name="type">TE</xsl:attribute>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:element>
</xsl:template>

Features

Generating a GMT ‘feat’ element (style=Attribute)

<xsl:template match="@id">
  <xsl:element name="feat">
    <xsl:attribute name="type">iso12620-identifier</xsl:attribute>
    <xsl:value-of select="."/>
  </xsl:element>
</xsl:template>

Features

Generating a GMT ‘feat’ element (style=Element)

<xsl:template match="term">
  <xsl:element name="feat">
    <xsl:attribute name="type">iso12620-term</xsl:attribute>
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>

Features

Generating a GMT ‘feat’ element (style=TypedElement)

<xsl:template match="descrip[@type='subjectField']">
  <xsl:element name="feat">
    <xsl:attribute name="type">SubjectField</xsl:attribute>
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>
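
The three ‘feat’ styles (Attribute, Element, TypedElement) can be put side by side in one small Python sketch. The rule tables and function names below are hypothetical; they reuse the data category names from the templates, and the `id` attribute on `term` in the sample input is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule tables, one per style.
ATTRIBUTE_RULES = {"id": "iso12620-identifier"}               # style=Attribute
ELEMENT_RULES = {"term": "iso12620-term"}                     # style=Element
TYPED_RULES = {("descrip", "subjectField"): "SubjectField"}   # style=TypedElement

def make_feat(data_category, text):
    feat = ET.Element("feat", {"type": data_category})
    feat.text = text
    return feat

def to_feats(elem):
    """Return the GMT <feat> elements produced by one source element."""
    feats = []
    # style=Attribute: each mapped attribute becomes its own feat.
    for name, value in elem.attrib.items():
        if name in ATTRIBUTE_RULES:
            feats.append(make_feat(ATTRIBUTE_RULES[name], value))
    # style=TypedElement: element name and @type together select the category.
    typed_key = (elem.tag, elem.get("type"))
    if typed_key in TYPED_RULES:
        feats.append(make_feat(TYPED_RULES[typed_key], elem.text))
    # style=Element: the element name alone selects the category.
    elif elem.tag in ELEMENT_RULES:
        feats.append(make_feat(ELEMENT_RULES[elem.tag], elem.text))
    return feats

term = ET.fromstring('<term id="t1">key money</term>')
descrip = ET.fromstring('<descrip type="subjectField">agriculture</descrip>')
for feat in to_feats(term) + to_feats(descrip):
    print(ET.tostring(feat, encoding="unicode"))
```

Keeping the rules in tables rather than in code is one way to read the earlier remark that the creation of the filters can be automated: the tables could in principle be generated from a description of the TML.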

XML Schemas for TMLs

…work ahead…

Analysing existing TDBs

Towards a generic methodology

General Architecture

[Figure: processing chain TDB → Flat XML → GMT → TML]
– TDB to Flat XML: simple DB dumper
– Flat XML to GMT: format-specific XSL stylesheet
– GMT to TML: automatic GMT2TML stylesheet

A two-phase process

List the various data categories used in the TDB
– Relate them to existing registries (e.g. ISO 12620), cf. http://salt.loria.fr/public/salt/DCQuery.html

Identify the underlying organization of the TDB
– Relate it to the meta-model
– Anchor the data categories where they actually occur
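
A first pass at the listing phase can be as simple as counting the element and attribute names that occur in a flat XML dump of the TDB. The sketch below is an assumption about how one might start, not part of the methodology itself; the function name `inventory` and the shortened dump are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def inventory(root):
    """Count element and attribute names used in a flat XML dump."""
    counts = Counter()
    for elem in root.iter():
        counts[elem.tag] += 1
        for name in elem.attrib:
            counts[f"{elem.tag}/@{name}"] += 1
    return counts

# Shortened, hypothetical dump of one entry.
dump = ET.fromstring(
    "<entry><CM>AG1</CM><CM>JUA</CM>"
    "<FR><VE>pas-de-porte</VE><NT type='NTE'>droit rural</NT></FR></entry>"
)
for name, count in sorted(inventory(dump).items()):
    print(name, count)
```

Each name in the resulting list is a candidate data category to look up in a registry; where it occurs in the tree gives a first hint about the level at which it should be anchored.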

Analysis of an existing TDB

Going through an example

Eurodicautom sample

<entry>
  <BE>BTB</BE>
  <TY>DAG77</TY>
  <NI>398</NI>
  <CF>3</CF>
  <CM>AG1</CM>
  <CM>JUA</CM>
  <EN>
    <VE>key money</VE>
    <RF>CILF,Dict.Agriculture,ACCT,1977</RF>
  </EN>
  <FR>
    <VE>pas-de-porte</VE>
    <DF>prix payé au précédent occupant pour le droit d'entrer dans une exploitation agricole</DF>
    <RF target="DF">TNC(1997)</RF>
    <RF>CILF,Dict.Agriculture,ACCT,1977</RF>
    <NT type="NTE">droit rural;pratique prohibée par la loi</NT>
  </FR>
</entry>

[Figure callouts: data category and anchoring level]
– DF: definition-12620A.5.1 (TS)
– VE: term-12620A.1 (TS)
– EN, FR: language-12620A.10.7 (LS)
– NT: note-12620A.8 (TS)
– CM: classificationCode-12620A.4.2 (TE)

Result in GMT (1/2)

<tmf>
  <struct type="TE">
    <feat type="entryIdentifier-12620A.10.15">BTB-TY-398</feat>
    <feat type="originatingInstitution-12620A.10.22.2">BTB</feat>
    <feat type="projectSubset">DAG77</feat>
    <feat type="NI">398</feat>
    <feat type="reliabilityCode">3</feat>
    <feat type="classificationCode-12620A.4.2">AG1</feat>
    <feat type="classificationCode-12620A.4.2">JUA</feat>
    <struct type="LS">
      <feat type="language-12620A.10.7">EN</feat>
      <struct type="TS">
        <feat type="term-12620A.1">key money</feat>
      </struct>
      <feat type="sourceIdentifier-12620A.10.20">CILF,Dict.Agriculture,ACCT,1977</feat>
    </struct>

Result in GMT (2/2)

    <struct type="LS">
      <feat type="language-12620A.10.7">fr</feat>
      <struct type="TS">
        <feat type="term-12620A.1">pas-de-porte</feat>
      </struct>
      <brack>
        <feat type="definition-12620A.5.1">prix payé au précédent occupant pour le droit d'entrer dans une exploitation agricole</feat>
        <feat type="sourceIdentifier-12620A.10.20">TNC(1997)</feat>
      </brack>
      <feat type="sourceIdentifier-12620A.10.20">CILF,Dict.Agriculture,ACCT,1977</feat>
      <feat type="note-12620A.8">droit rural;pratique prohibée par la loi</feat>
    </struct>
  </struct>
</tmf>

Simple rules

Using XSL locality

<xsl:template match="CM">
  <feat type="classificationCode-12620A.4.2">
    <xsl:apply-templates/>
  </feat>
</xsl:template>

Introducing specific levels

Necessity to combine structure and content

<xsl:template match="VE">
  <struct type="TS">
    <feat type="term-12620A.1">
      <xsl:apply-templates/>
    </feat>
  </struct>
</xsl:template>

Default rule

Useful for keeping track of unmapped data categories

<xsl:template match="*">
  <feat>
    <xsl:attribute name="type">
      <xsl:value-of select="name()"/>
    </xsl:attribute>
    <xsl:apply-templates/>
  </feat>
</xsl:template>
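
To see how the simple rule, the rule introducing a specific level and the default rule interact, here is a Python sketch that mirrors the three templates for CM, VE and `*`. It is an approximation under stated assumptions: the function name `convert` is hypothetical, the input is a shortened entry, and attributes on source elements are ignored.

```python
import xml.etree.ElementTree as ET

def convert(elem):
    """Map one flat-XML element to GMT, mirroring the CM, VE and default rules."""
    if elem.tag == "CM":
        # Simple rule: one source element becomes one feat.
        feat = ET.Element("feat", {"type": "classificationCode-12620A.4.2"})
        feat.text = elem.text
        return feat
    if elem.tag == "VE":
        # Specific level: the term is wrapped in its own TS struct.
        struct = ET.Element("struct", {"type": "TS"})
        feat = ET.SubElement(struct, "feat", {"type": "term-12620A.1"})
        feat.text = elem.text
        return struct
    # Default rule: keep the source name as the type, so that unmapped
    # data categories stay visible in the output.
    feat = ET.Element("feat", {"type": elem.tag})
    feat.text = elem.text
    feat.extend([convert(child) for child in elem])
    return feat

# Shortened, hypothetical entry.
entry = ET.fromstring(
    "<entry><NI>398</NI><CM>AG1</CM><FR><VE>pas-de-porte</VE></FR></entry>"
)
gmt = convert(entry)
print(ET.tostring(gmt, encoding="unicode"))
```

In the output, `entry`, `NI` and `FR` appear as ‘feat’ elements typed with their source names, which makes it easy to spot what still needs a dedicated rule.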

Useful pointers

TMF page:
– http://www.loria.fr/projets/TMF

HLT/Salt project page:
– http://www.loria.fr/projets/SALT

Data category query tool:
– http://salt.loria.fr/public/salt/DCQuery.html