TMF - a tutorial Part 3: Designing (schemas and) filters


Page 1: TMF - a tutorial Part 3:  Designing (schemas and) filters

TMF - a tutorial

Part 3: Designing (schemas and) filters

TMF - Terminological Markup Framework

Laurent Romary - Laboratoire Loria

Page 2: TMF - a tutorial Part 3:  Designing (schemas and) filters

General principles

Terminological information interchange
– Three components:
  • Source TDB1
  • Target TDB2
  • Terminological interchange format
    – A specific TML (DXLT, Geneter)

[Diagram: TDB1 -> TML -> TDB2]

Page 3: TMF - a tutorial Part 3:  Designing (schemas and) filters

Important notice

– GMT is not a TML
  • Too abstract a format
    – Uncontrolled recursion (‘struct’ element)
    – Uncontrolled content (‘feat’ and ‘annot’)
  • A schema is needed to check interchanged data
    – Precise list of data categories
    – Precise definition of the format
– GMT is here to provide conceptual simplicity

Page 4: TMF - a tutorial Part 3:  Designing (schemas and) filters

Designing filters

TML to GMT

Page 5: TMF - a tutorial Part 3:  Designing (schemas and) filters

General principles

Just for your information
– The creation of the filters can be automated

Basic processes
– Reduction of expansion trees
– Mapping elements and attributes to the corresponding data categories

Page 6: TMF - a tutorial Part 3:  Designing (schemas and) filters

Reducing expansion trees

Example
• DXLT (Martif) sub-tree

    <ntig>
      <!-- some general information associated with the term -->
      <termGrp>
        <!-- term related information -->
      </termGrp>
    </ntig>

• GMT

    <struct type="TS">
      <!-- some features -->
    </struct>
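The reduction shown above can be sketched as a pair of XSLT rules. This is only an illustrative sketch, assuming an XSLT 1.0 processor: the outer `ntig` element produces the single GMT `struct`, while the inner `termGrp` wrapper emits nothing of its own and simply passes its children on, so the two-level expansion tree collapses into one level.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- Drop whitespace-only text nodes so they do not leak into the output -->
  <xsl:strip-space elements="*"/>

  <!-- The outer element of the expansion tree becomes the term section -->
  <xsl:template match="ntig">
    <struct type="TS">
      <xsl:apply-templates/>
    </struct>
  </xsl:template>

  <!-- The inner wrapper is reduced: no output element, children processed in place -->
  <xsl:template match="termGrp">
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```

Elements inside `ntig` and `termGrp` still need their own mapping rules; without them, the built-in XSLT rules only copy their text content.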

Page 7: TMF - a tutorial Part 3:  Designing (schemas and) filters

Element mapping

Example
• DXLT (Martif)

    <definition>Bla, bla, bla etc.</definition>

• GMT

    <feat type="definition">Bla, bla, bla etc.</feat>

Page 8: TMF - a tutorial Part 3:  Designing (schemas and) filters

Structural elements

Generating a GMT ‘struct’ element

    <xsl:template match="termEntry">
      <xsl:element name="struct">
        <xsl:attribute name="type">TE</xsl:attribute>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:element>
    </xsl:template>

Page 9: TMF - a tutorial Part 3:  Designing (schemas and) filters

Features

Generating a GMT ‘feat’ element
» (style=Attribute)

    <xsl:template match="@id">
      <xsl:element name="feat">
        <xsl:attribute name="type">iso12620-identifier</xsl:attribute>
        <xsl:value-of select="."/>
      </xsl:element>
    </xsl:template>

Page 10: TMF - a tutorial Part 3:  Designing (schemas and) filters

Features

Generating a GMT ‘feat’ element
» (style=Element)

    <xsl:template match="term">
      <xsl:element name="feat">
        <xsl:attribute name="type">iso12620-term</xsl:attribute>
        <xsl:apply-templates/>
      </xsl:element>
    </xsl:template>

Page 11: TMF - a tutorial Part 3:  Designing (schemas and) filters

Features

Generating a GMT ‘feat’ element
» (style=TypedElement)

    <xsl:template match="descrip[@type='subjectField']">
      <xsl:element name="feat">
        <xsl:attribute name="type">SubjectField</xsl:attribute>
        <xsl:apply-templates/>
      </xsl:element>
    </xsl:template>
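The structural rule and the three feature styles can be combined into one filter. The following is a minimal sketch rather than the tutorial's actual filter: it assumes an XSLT 1.0 processor and a simplified, hypothetical input in which `term` and `descrip` sit directly under `termEntry` (real DXLT nests them inside language and term sections).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
  Hypothetical input (simplified for illustration):
    <termEntry id="e1">
      <descrip type="subjectField">agriculture</descrip>
      <term>key money</term>
    </termEntry>

  Expected result, whitespace aside:
    <struct type="TE">
      <feat type="iso12620-identifier">e1</feat>
      <feat type="SubjectField">agriculture</feat>
      <feat type="iso12620-term">key money</feat>
    </struct>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Structural element: termEntry becomes a TE struct.
       Attributes are selected too, so that @id reaches its own rule. -->
  <xsl:template match="termEntry">
    <struct type="TE">
      <xsl:apply-templates select="@*|node()"/>
    </struct>
  </xsl:template>

  <!-- style=Attribute: the attribute value becomes the feature content -->
  <xsl:template match="@id">
    <feat type="iso12620-identifier">
      <xsl:value-of select="."/>
    </feat>
  </xsl:template>

  <!-- style=Element: the element name alone identifies the data category -->
  <xsl:template match="term">
    <feat type="iso12620-term">
      <xsl:apply-templates/>
    </feat>
  </xsl:template>

  <!-- style=TypedElement: the data category is carried by the type attribute -->
  <xsl:template match="descrip[@type='subjectField']">
    <feat type="SubjectField">
      <xsl:apply-templates/>
    </feat>
  </xsl:template>
</xsl:stylesheet>
```

The sketch uses literal result elements (`<feat type="...">`) in place of `xsl:element` and `xsl:attribute`; the two notations produce the same output when element and attribute names are fixed.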

Page 12: TMF - a tutorial Part 3:  Designing (schemas and) filters

XML Schemas for TMLs

…work ahead…

Page 13: TMF - a tutorial Part 3:  Designing (schemas and) filters

Analysing existing TDBs

Towards a generic methodology

Page 14: TMF - a tutorial Part 3:  Designing (schemas and) filters

General Architecture

[Diagram: TDB -> Flat XML -> GMT -> TML. The three steps are labelled, in order: simple DB dumper, format-specific XSL stylesheet, automatic GMT2TML stylesheet.]

Page 15: TMF - a tutorial Part 3:  Designing (schemas and) filters

A two-phase process

List the various data categories used in the TDB
– Relate them to existing registries (e.g. ISO 12620), cf. http://salt.loria.fr/public/salt/DCQuery.html

Identify the underlying organization of the TDB
– Relate it to the meta-model
– Anchor the data categories where they actually occur
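The first phase can be partly mechanized once a flat XML dump of the TDB is available. As a hedged sketch (assuming an XSLT 1.0 processor, and assuming that each element name in the dump corresponds to one candidate data category), the stylesheet below lists every distinct element name with its number of occurrences. The key name `by-name` is an arbitrary choice made for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every element by its name (Muenchian grouping) -->
  <xsl:key name="by-name" match="*" use="name()"/>

  <xsl:template match="/">
    <!-- Keep only the first element of each name group, in document order -->
    <xsl:for-each select="//*[generate-id() = generate-id(key('by-name', name())[1])]">
      <xsl:value-of select="name()"/>
      <xsl:text> (</xsl:text>
      <!-- Number of elements sharing this name in the whole dump -->
      <xsl:value-of select="count(key('by-name', name()))"/>
      <xsl:text>)&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The resulting list is only a starting point: relating each name to a registry entry and deciding the level of the meta-model at which it is anchored remain manual steps.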

Page 16: TMF - a tutorial Part 3:  Designing (schemas and) filters

Analysis of an existing TDB

Going through an example

Page 17: TMF - a tutorial Part 3:  Designing (schemas and) filters

Eurodicautom sample

    <entry>
      <BE>BTB</BE>
      <TY>DAG77</TY>
      <NI>398</NI>
      <CF>3</CF>
      <CM>AG1</CM>
      <CM>JUA</CM>
      <EN>
        <VE>key money</VE>
        <RF>CILF,Dict.Agriculture,ACCT,1977</RF>
      </EN>
      <FR>
        <VE>pas-de-porte</VE>
        <DF>prix payé au précédent occupant pour le droit d'entrer dans une exploitation agricole</DF>
        <RF target="DF">TNC(1997)</RF>
        <RF>CILF,Dict.Agriculture,ACCT,1977</RF>
        <NT type="NTE">droit rural;pratique prohibée par la loi</NT>
      </FR>
    </entry>

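Data categories annotated on the sample, with the level at which each is anchored:
• DF: definition-12620A.5.1 (TS)
• VE: term-12620A.1 (TS)
• EN, FR: language-12620A.10.7 (LS)
• NT: note-12620A.8 (TS)
• CM: classificationCode-12620A.4.2 (TE)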

Page 18: TMF - a tutorial Part 3:  Designing (schemas and) filters

Result in GMT (1/2)

    <tmf>
      <struct type="TE">
        <feat type="entryIdentifier-12620A.10.15">BTB-TY-398</feat>
        <feat type="originatingInstitution-12620A.10.22.2">BTB</feat>
        <feat type="projectSubset">DAG77</feat>
        <feat type="NI">398</feat>
        <feat type="reliabilityCode">3</feat>
        <feat type="classificationCode-12620A.4.2">AG1</feat>
        <feat type="classificationCode-12620A.4.2">JUA</feat>
        <struct type="LS">
          <feat type="language-12620A.10.7">EN</feat>
          <struct type="TS">
            <feat type="term-12620A.1">key money</feat>
          </struct>
          <feat type="sourceIdentifier-12620A.10.20">CILF,Dict.Agriculture,ACCT,1977</feat>
        </struct>

Page 19: TMF - a tutorial Part 3:  Designing (schemas and) filters

Result in GMT (2/2)

        <struct type="LS">
          <feat type="language-12620A.10.7">fr</feat>
          <struct type="TS">
            <feat type="term-12620A.1">pas-de-porte</feat>
          </struct>
          <brack>
            <feat type="definition-12620A.5.1">prix payé au précédent occupant pour le droit d'entrer dans une exploitation agricole</feat>
            <feat type="sourceIdentifier-12620A.10.20">TNC(1997)</feat>
          </brack>
          <feat type="sourceIdentifier-12620A.10.20">CILF,Dict.Agriculture,ACCT,1977</feat>
          <feat type="note-12620A.8">droit rural;pratique prohibée par la loi</feat>
        </struct>
      </struct>
    </tmf>

Page 20: TMF - a tutorial Part 3:  Designing (schemas and) filters

Simple rules

Using XSL locality

    <xsl:template match="CM">
      <feat type="classificationCode-12620A.4.2">
        <xsl:apply-templates/>
      </feat>
    </xsl:template>

Page 21: TMF - a tutorial Part 3:  Designing (schemas and) filters

Introducing specific levels

Necessity to combine structure and content

    <xsl:template match="VE">
      <struct type="TS">
        <feat type="term-12620A.1">
          <xsl:apply-templates/>
        </feat>
      </struct>
    </xsl:template>

Page 22: TMF - a tutorial Part 3:  Designing (schemas and) filters

Default rule

Useful for keeping track of unmapped data categories

    <xsl:template match="*">
      <feat>
        <xsl:attribute name="type">
          <xsl:value-of select="name()"/>
        </xsl:attribute>
        <xsl:apply-templates/>
      </feat>
    </xsl:template>
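To show how the simple rule, the specific-level rule and the default rule interact, they can be assembled into one stylesheet for the Eurodicautom sample. This is a sketch under stated assumptions, not the filter that produced the GMT result shown earlier: the `entry` and `EN | FR` rules are additions made for this example, the language code is assumed to be the lower-cased element name, and no `brack` grouping or entry identifier is generated.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Assumption: the document element is a single <entry>, mapped to one TE -->
  <xsl:template match="entry">
    <tmf>
      <struct type="TE">
        <xsl:apply-templates/>
      </struct>
    </tmf>
  </xsl:template>

  <!-- Assumption: EN and FR are the language sections; the code is the
       lower-cased element name -->
  <xsl:template match="EN | FR">
    <struct type="LS">
      <feat type="language-12620A.10.7">
        <xsl:value-of select="translate(name(),
                              'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                              'abcdefghijklmnopqrstuvwxyz')"/>
      </feat>
      <xsl:apply-templates/>
    </struct>
  </xsl:template>

  <!-- Simple rule: one element, one data category -->
  <xsl:template match="CM">
    <feat type="classificationCode-12620A.4.2">
      <xsl:apply-templates/>
    </feat>
  </xsl:template>

  <!-- Specific level: the term introduces its own TS struct -->
  <xsl:template match="VE">
    <struct type="TS">
      <feat type="term-12620A.1">
        <xsl:apply-templates/>
      </feat>
    </struct>
  </xsl:template>

  <!-- Default rule: match="*" has default priority -0.5, so the named
       rules above (priority 0) take precedence. Unmapped elements such as
       BE, TY, DF, RF or NT surface as feat elements typed by their name. -->
  <xsl:template match="*">
    <feat type="{name()}">
      <xsl:apply-templates/>
    </feat>
  </xsl:template>
</xsl:stylesheet>
```

Any `feat` whose type is still a raw source name (for instance `DF` or `NT`) marks a data category that has not been mapped yet. Attributes of the source elements, such as `target` on `RF`, are not carried over by this sketch.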

Page 23: TMF - a tutorial Part 3:  Designing (schemas and) filters

Useful pointers

TMF page:
– http://www.loria.fr/projets/TMF

HLT/Salt project page:
– http://www.loria.fr/projets/SALT

Data category query tool:
– http://salt.loria.fr/public/salt/DCQuery.html