Leveraging XLT: (Web-Enabled) Validation of Terminology Collections
Lee Gillam, University of Surrey
SALT Workshop, Antwerp
31 January 2001
Surrey-EU project history
Terminology Extraction and Management Projects: TWB, TWBII
Management of Text Collections: TRANSTERM
Term Resources: POINTER
Terminology Validation: INTERVAL
Convergence in SALT?
XLT ‘opportunities’
Complete terminology collections available in XML – enhancement/reuse of other collections
Large number of (multilingual) terms – difficult for humans to appraise
Terminology relates to usage – document collections highly relevant
Quantity of terms – no guarantee of quality
(Web-Enabled) Validation
Relevant documents on the web – contextual information
Relevant documents on the ‘corporate internet’ – contextual information
Term usage in other organisations (glossaries)/as understood by Joe E.C. Taxpayer
Resource enrichment
System Description
For a given (D)XLT collection of terminology:
– Partition collection by specific criteria
– Collect documents relevant to criteria
– Analyse documents against the partitioned collection
– Report results
System Description
Partition collection by specific criteria:
– Use of XPath
– “Give me all terms in English”
//dxlt/text/body/termEntry/langSet[@lang='en']/ntig/termGrp/term/text()
– Alternative example: “Give me all subjectFields”
//dxlt/text/body/termEntry/descrip[@type='subjectField']/text() [check!]
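A minimal sketch of the partitioning step, assuming Python's standard `xml.etree.ElementTree` rather than whatever XPath engine the prototype used. The miniature collection is invented for illustration: its element names follow the two XPaths on this slide, and the French term is a placeholder. ElementTree supports only a subset of XPath and has no `text()` step, so the text is read from the `.text` attribute instead. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented miniature collection; only the element names come from the slide.
SAMPLE = """\
<dxlt>
  <text>
    <body>
      <termEntry id="HR-7">
        <descrip type="subjectField">200</descrip>
        <langSet lang="fr">
          <ntig><termGrp><term>placeholder-fr</term></termGrp></ntig>
        </langSet>
        <langSet lang="en">
          <ntig><termGrp><term>aberration of light</term></termGrp></ntig>
        </langSet>
      </termEntry>
    </body>
  </text>
</dxlt>
"""


def terms_for_language(root, lang):
    """Return the text of every term in langSets with the given language code."""
    # Paths are relative to the <dxlt> root element.
    path = f"./text/body/termEntry/langSet[@lang='{lang}']/ntig/termGrp/term"
    return [term.text for term in root.findall(path)]


def subject_fields(root):
    """Return the text of every subjectField descrip placed directly under a termEntry."""
    path = "./text/body/termEntry/descrip[@type='subjectField']"
    return [descrip.text for descrip in root.findall(path)]


root = ET.fromstring(SAMPLE)
english_terms = terms_for_language(root, "en")
fields = subject_fields(root)
print(english_terms)
print(fields)
```

Each call yields one partition of the collection (terms of one language, or subject field codes) that the later steps can check against collected documents.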
System Description
Collect documents relevant to criteria:
– For terms, try internet/intranet searching
– For subject field classifications, classification documents will be relevant
– For definitions, comparisons with other glossaries may provide useful validating information
– …
System Description
Analyse documents against the partitioned collection:
– Are the terms contained in the documents?
– Are the terms in the documents now used as parts of compounds?
– What are the contexts in which the terms are used?
– Are there a number of potential other definitions for a particular term?
– Does this fit in with a specific classification?
– …
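The first three questions above can be approximated with plain string matching. The sketch below is an assumption about how such checks might look, not the prototype's method: it counts whole-word occurrences, proposes compound candidates from the word immediately before or after the term (filtered by a tiny, hypothetical stop list), and extracts fixed-width contexts. The sample documents are invented.

```python
import re

# Hypothetical stop list; a real system would need a far larger one.
STOP = {"a", "an", "the", "is", "are", "of", "in", "and", "for", "to"}


def _core(term):
    # Escape each word and allow any run of whitespace between words.
    return r"\s+".join(re.escape(word) for word in term.split())


def frequency(term, documents):
    """Count whole-word, case-insensitive occurrences across all documents."""
    pattern = re.compile(r"\b" + _core(term) + r"\b", re.IGNORECASE)
    return sum(len(pattern.findall(doc)) for doc in documents)


def candidate_compounds(term, documents):
    """Propose longer terms formed by one extra word before or after the term."""
    after = re.compile(r"\b(" + _core(term) + r")\s+(\w+)", re.IGNORECASE)
    before = re.compile(r"\b(\w+)\s+(" + _core(term) + r")\b", re.IGNORECASE)
    found = set()
    for doc in documents:
        for head, nxt in after.findall(doc):
            if nxt.lower() not in STOP:
                found.add((" ".join(head.split()) + " " + nxt).lower())
        for prev, head in before.findall(doc):
            if prev.lower() not in STOP:
                found.add((prev + " " + " ".join(head.split())).lower())
    return sorted(found)


def contexts(term, documents, width=30):
    """Return each occurrence with up to `width` characters on either side."""
    pattern = re.compile(r"\b" + _core(term) + r"\b", re.IGNORECASE)
    snippets = []
    for doc in documents:
        for match in pattern.finditer(doc):
            start = max(0, match.start() - width)
            snippets.append(doc[start:match.end() + width].strip())
    return snippets


# Invented sample documents.
DOCS = [
    "Circuit switching is used here. A circuit switching network is described later.",
    "This text mentions neither term.",
]

print(frequency("circuit switching", DOCS))
print(candidate_compounds("circuit switching", DOCS))
print(contexts("circuit switching", DOCS))
```

Adjacent-word matching over-generates (any content word next to the term becomes a candidate), so its output is a list for a terminologist to review rather than a decision.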
System Description
Report results:
– Term frequency – Zero?
– Potential compounds
– Contexts
– Definitions
– Correctly classified
– …
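One way the report items could be assembled is sketched below. The `TermResult` record, the `format_report` function, and the sample values are all hypothetical; the slides do not describe the prototype's report format. Terms with zero frequency are listed first, since those are the ones the "Zero?" question singles out for review.

```python
from dataclasses import dataclass, field


@dataclass
class TermResult:
    """Hypothetical per-term record of analysis results."""
    term: str
    frequency: int
    compounds: list = field(default_factory=list)
    contexts: list = field(default_factory=list)


def format_report(results):
    """Render one block per term, lowest frequency first."""
    lines = []
    for result in sorted(results, key=lambda r: r.frequency):
        if result.frequency == 0:
            status = "NOT FOUND, review"
        else:
            status = f"found {result.frequency} times"
        lines.append(f"{result.term.upper()}: {status}")
        for compound in result.compounds:
            lines.append(f"  potential compound: {compound}")
        for context in result.contexts:
            lines.append(f"  context: {context}")
    return "\n".join(lines)


# Illustrative values only.
results = [
    TermResult("circuit switching", 43, compounds=["circuit switching network"]),
    TermResult("unattested term", 0),
]
report = format_report(results)
print(report)
```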
Prototype prototype
[Screenshot of the prototype; callouts: ‘XML’, XML attributes, ‘Results Area’, Indicative Actions]
Prototype prototype
[Screenshot of the prototype; callout: Indicative XPaths]
Prototype prototype
Recall this term…
Prototype prototype
CIRCUIT SWITCHING
Found in collected texts 43 times. Valid term?
PACKET SWITCHING also exists in this resource.
DHydro Sample
<termEntry id="HR-7">
  + <transacGrp>
  <descrip type="subjectField">200</descrip>
  + <langSet lang="fr">
  <langSet lang="en">
    <descripGrp>
      <descrip type="definition">The apparent displacement in position of a heavenly body caused by the combination of the velocity of light and that of an observer on the surface of the earth. Aberration of light due to the rotation of the earth on its axis is termed diurnal aberration. That due to the revolution of the earth around the sun is termed annual aberration.</descrip>
    </descripGrp>
    <ntig>
      <termGrp>
        <term id="HR-7-en-1">aberration of light</term>
        <termNote type="termType">main entry</termNote>
        <termNote type="partOfSpeech">n</termNote>
      </termGrp>
    </ntig>
  </langSet>
  + <langSet lang="es">
</termEntry>
Lenoch (GMT)
<struct type="classification">
  <feat type="name">AD2</feat>
  <feat type="documentation">public and private organisations</feat>
  <feat type="subclass-of">AD</feat>
</struct>
<struct type="classification">
  <feat type="name">AD3</feat>
  <feat type="documentation">publications and documentary search</feat>
  <feat type="subclass-of">AD</feat>
</struct>
<struct type="classification">
  <feat type="name">AD31</feat>
  <feat type="documentation">documentation and information systems</feat>
  <feat type="subclass-of">AD3</feat>
</struct>
Lenoch (XOL)
<class>
  <name>AD2</name>
  <documentation>public and private organisations</documentation>
  <subclass-of>AD</subclass-of>
</class>
<class>
  <name>AD3</name>
  <documentation>publications and documentary search</documentation>
  <subclass-of>AD</subclass-of>
</class>
<class>
  <name>AD31</name>
  <documentation>documentation and information systems</documentation>
  <subclass-of>AD3</subclass-of>
</class>
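The GMT and XOL samples carry the same three fields per class, so one encoding can be derived mechanically from the other. The sketch below shows one possible mapping, in which each `feat` element's `type` attribute becomes the child element name. The slides do not say how the two versions were produced, and the wrapper elements `classifications` and `classes` are assumptions added only so that each document has a single root.

```python
import xml.etree.ElementTree as ET

# Two of the Lenoch classes from the GMT slide, under an assumed wrapper element.
GMT = """\
<classifications>
  <struct type="classification">
    <feat type="name">AD3</feat>
    <feat type="documentation">publications and documentary search</feat>
    <feat type="subclass-of">AD</feat>
  </struct>
  <struct type="classification">
    <feat type="name">AD31</feat>
    <feat type="documentation">documentation and information systems</feat>
    <feat type="subclass-of">AD3</feat>
  </struct>
</classifications>
"""


def gmt_to_xol(gmt_root):
    """Turn each classification struct into a <class> with one child per feat."""
    xol_root = ET.Element("classes")  # assumed wrapper element
    for struct in gmt_root.findall("./struct[@type='classification']"):
        cls = ET.SubElement(xol_root, "class")
        for feat in struct.findall("feat"):
            child = ET.SubElement(cls, feat.get("type"))
            child.text = feat.text
    return xol_root


def class_hierarchy(xol_root):
    """Map each class name to the name of its parent class."""
    parents = {}
    for cls in xol_root.findall("class"):
        fields = {child.tag: child.text for child in cls}
        parents[fields["name"]] = fields["subclass-of"]
    return parents


xol_root = gmt_to_xol(ET.fromstring(GMT))
xol_text = ET.tostring(xol_root, encoding="unicode")
hierarchy = class_hierarchy(xol_root)
print(xol_text)
print(hierarchy)
```

The resulting parent map is the kind of structure a "correctly classified" check could walk when testing whether a term's subject field sits under the expected class.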
Outlook
Initial Results show promise for Validation of Terminological Resources
– significant development work is still required.
– XPath generation needs tailoring to specific formats (DXLT), but provides useful power
– Development to merge ‘Web glossaries’ – pre-terminological validation stage
Provide a powerful prototype of the capabilities for the (Web-Enabled) Validation of Terminology Collections – with DXLT-related formats.
DXLT as the de facto standard format for Terminology Validation?