An Introduction to XLIFF
Tony Jewtushenko
Oracle Corporation - Principal Product Manager
Chair – OASIS XLIFF TC
The XML Localisation Interchange File Format
Slide 2
Agenda
• Overview of XLIFF: definition, goals, and benefits of XLIFF; brief history of XLIFF
• Architecture: main features of XLIFF
• The Real World: use cases and tools support for XLIFF
• Current State of Affairs: post XLIFF 1.1, what's next
Slide 3
XLIFF Overview
A glance at the definitions, goals and benefits of the XML Localisation Interchange File Format.
Slide 4
What is XLIFF?
A specification
for the lossless interchange of localizable data and its related information,
which is tool-neutral,
has been formalized as an XML vocabulary,
and features an extensibility mechanism.
Slide 5
XLIFF TC’s Charter
“The purpose of the OASIS XLIFF TC is to define, through XML vocabularies, an extensible specification for the interchange of localization information. The specification will provide the ability to mark up and capture localizable data and interoperate with different processes or phases without loss of information. The vocabularies will be tool-neutral, support the localization-related aspects of internationalization and the entire localization process. The vocabularies will support common software and content data formats. The specification will provide an extensibility mechanism to allow the development of tools compatible with an implementer's own proprietary data formats and workflow requirements.”
Slide 6
Why Is XLIFF Needed?
Localization poses the following challenges:
• Insufficient interoperability between tools.
• Lack of support for overall localization workflow.
• Localization tool developers must deal with many formats.
• Large number of proprietary intermediate formats.
Slide 7
Advantages – Localization Customer
• Single format for adjunct processing (e.g. quality control in terms of spell checking).
• Less dependence on the few vendors that are able to work with special formats.
• Tighter control on what goes to localization (Pre-filtering of what to translate or not).
• Controlled information flow (author/developer notes, item properties, etc.).
• ID-based leveraging.
• All advantages of XML-based processing.
Slide 8
Advantages – Tools Vendor
• Focus on development of core functionality rather than on handling of source formats.
• Allow usage of tools in new contexts.
• All advantages of XML-based processing.
Slide 9
Advantages – Service Provider
• Single format for adjunct processing (e.g. quality control in terms of spell checking).
• Less dependency on specific localization tools.
• Controlled information flow (author/developer notes, item properties, etc.).
• Allow usage of tools in new contexts.
• All advantages of XML-based processing.
• Open and standard solution for proprietary formats.
Slide 10
Advantages – Technology (1/2)
• For a given utility, only one implementation is necessary (e.g. not one spell checker for RTF, and another one for HTML).
• Increases usability of utilities (i.e. all formats with XLIFF filters can be used with XLIFF-enabled utilities).
Slide 11
Advantages – Technology (2/2)
• All advantages of XML-based processing:
– Use of its internationalization features.
– Better interoperability and cross-platform support.
– Powerful rendering options (XSL-FO, CSS).
– Powerful transformation options (XSLT).
– Greater integration with Web services.
• Access to existing, and often open-source, XML implementations (lower costs).
Slide 12
Genesis of XLIFF
• Founded: Sept 2000
• Founding Members: Novell, Oracle and Sun
• Initially named “DataDefinition” group
Slide 13
XLIFF Timeline
• September 2000 - DataDefinition Kickoff
• December 2000 - first face to face
• March 2001 - second face to face
• End March 2001 - draft 1.0 spec and DTD published
• June 2001 - White Paper published
• December 2001 - OASIS XLIFF Technical Committee Proposal submitted
• April 2002 – XLIFF 1.0 Specification approved by formal vote as an OASIS Committee Specification
• May 2003 – XLIFF 1.1 Specification approved by formal vote as an OASIS Committee Specification
• August/Sept 2003 – XLIFF 1.1 Peer Review
• November 2003 – Revised XLIFF 1.1 Specification approved as OASIS Committee Specification
• November 2003 – XLIFF 1.1 Specification submitted to OASIS Standards Review Process
Slide 14
OASIS: Standards Body Home of XLIFF
• OASIS: Organization for the Advancement of Structured Information Standards
• World’s largest independent, non-profit organization dedicated to the standardisation of XML applications and Web Services
• More than 150 member companies plus individuals
• Operates XML.ORG Registry, the open community clearinghouse of XML application schemas
• Technical work on XML interoperability includes XML conformance and XML Registries/Repositories
• General XML technical resource
Slide 15
Drivers Behind XLIFF
• Alchemy Software
• Bowne Global Solutions
• Convey Software
• Ektron, Inc
• Globalsight
• HP
• Lotus/IBM
• Lionbridge
• LRC
• Moravia IT
• Novell
• Oracle
• Microsoft
• RWS Group
• SAP
• SDL International
• Sun Microsystems
• Tektronix
Slide 16
Present OASIS XLIFF TC
• TC Officers:
– TC Chair: Tony Jewtushenko, Oracle Corporation
– TC Secretary: Peter Reynolds, Bowne Global Solutions
– TC Editor: Yves Savourel
• Current Members of TC:
– Gérard Cattin des Bois, Microsoft
– Doug Domeny
– Milan Karásek, Moravia-IT
– Mark Levins, IBM/Lotus
– Christian Lieske, SAP
– Mat Lovatt, Oracle
– Enda McDonnell
– David Pooley, SDL
– John Reid, Novell
– Reinhard Schaler, LRC
– Bryan Schnabel, Tektronix
– Shigemichi Yazawa
Slide 17
XLIFF TC in the Community
• Shared interests with the OASIS Translation Web Services Technical Committee:
– XLIFF may be used as the data container for Web services.
• Shared interests with the OSCAR SIG at LISA:
– Segmentation and word count.
– Content markup (inline codes).
• Shared interests with the W3C i18n WG:
– Localization directives.
– Best practices.
– Localization aspects of W3C recommendations.
– Web services.
Slide 18
Architecture
A look at XLIFF’s main features and how they work together.
Slide 19
Extract-Localize-Merge Paradigm
• Separate data related to localization from parts not related to localization.
• Merge translated data with codes at the end of the process to create the final document.
• Skeleton file is optional, so this paradigm is also optional
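The extract-localize-merge cycle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a tiny Java-properties-style resource, the `%%key%%` placeholder scheme for the skeleton is invented for this sketch, and the generated XLIFF is simplified (no namespace, no header). Real filters and skeleton formats are tool-specific.

```python
import xml.etree.ElementTree as ET

# Hypothetical source material: a tiny Java-properties-style resource.
ORIGINAL = "menu.file=File\nmenu.exit=Exit\n"


def extract(properties_text):
    """Separate localizable text (XLIFF tree) from everything else (skeleton)."""
    xliff = ET.Element("xliff", version="1.1")
    file_el = ET.SubElement(xliff, "file", {
        "original": "menu.properties",  # assumed file name
        "source-language": "en",
        "datatype": "plaintext",
    })
    body = ET.SubElement(file_el, "body")
    skeleton_lines = []
    for line in properties_text.splitlines():
        key, _, value = line.partition("=")
        unit = ET.SubElement(body, "trans-unit", id=key)
        ET.SubElement(unit, "source").text = value
        # The skeleton keeps the key and a placeholder (invented scheme).
        skeleton_lines.append(f"{key}=%%{key}%%")
    return xliff, "\n".join(skeleton_lines) + "\n"


def localize(xliff, translations):
    """Stand-in for the translation step: add a <target> to every unit."""
    for unit in xliff.iter("trans-unit"):
        ET.SubElement(unit, "target").text = translations[unit.get("id")]


def merge(xliff, skeleton):
    """Put the text back into the skeleton to rebuild the native file."""
    result = skeleton
    for unit in xliff.iter("trans-unit"):
        target = unit.find("target")
        text = target.text if target is not None else unit.find("source").text
        result = result.replace(f"%%{unit.get('id')}%%", text)
    return result


xliff, skeleton = extract(ORIGINAL)
localize(xliff, {"menu.file": "Fichier", "menu.exit": "Quitter"})
print(merge(xliff, skeleton))
```

Merging an untranslated extraction reproduces the original file, which is a convenient round-trip check for a filter.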
Slide 20
A Bird's-Eye View
An XLIFF document can capture anything needed for a localization project:
1. Localizable objects (e.g. text strings) in source and target languages.
2. Supplementary information (e.g. glossaries, or material to recreate the original format).
3. Administrative information (e.g. workflow data).
4. Custom data (e.g. initialization information for tools).
Slide 21
The XLIFF Document
• An XLIFF document is designed to store the extracted data related to localization.
• Each given source container (e.g. a file, a database table, and so forth) corresponds to a <file> element in XLIFF.
• Each XLIFF document can include several <file> elements.
• A whole localization project can possibly be stored in a single XLIFF document.
Slide 22
Bilingual Model
• Each <file> element is designed to store one source language and one target language.
• The rationale is that translations into different target languages are usually done by different people.
• However, the languages in <alt-trans> elements can differ from the target language. For example, proposed matches in European Portuguese can be offered when translating into Brazilian Portuguese.
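As a rough illustration of the document structure and the bilingual model, the following Python sketch parses a small XLIFF 1.1 document with two <file> elements and reports each file's language pair and the languages used in <alt-trans> proposals. The file names and strings are made up for this example.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.1"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document: two source containers, one language pair each.
DOC = """<xliff version='1.1' xmlns='urn:oasis:names:tc:xliff:document:1.1'>
  <file original='menu.properties' source-language='en'
        target-language='pt-BR' datatype='plaintext'>
    <body>
      <trans-unit id='1'>
        <source>File</source>
        <target>Arquivo</target>
        <alt-trans>
          <target xml:lang='pt-PT'>Ficheiro</target>
        </alt-trans>
      </trans-unit>
    </body>
  </file>
  <file original='help.htm' source-language='en'
        target-language='pt-BR' datatype='html'>
    <body>
      <trans-unit id='1'>
        <source>Help</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""

root = ET.fromstring(DOC)

# One (original, source language, target language) triple per <file>.
pairs = [
    (f.get("original"), f.get("source-language"), f.get("target-language"))
    for f in root.findall("x:file", NS)
]

# Languages of proposed matches may differ from the file's target language.
alt_langs = [
    t.get(XML_LANG) for t in root.findall(".//x:alt-trans/x:target", NS)
]

print(pairs)
print(alt_langs)
```

Note that the same trans-unit id appears in both files; ids only need to be unique within one <file>.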
Slide 23
Localizable Objects
• XLIFF allows not only text strings as localizable objects but also other object types, such as graphics.
• Supplementary information can be represented in a generic way through inline codes (e.g. formatting of text).
• Relationships between objects can be captured (e.g. all items in a menu).
Slide 24
An XLIFF Snippet…
A simple menu represented as XLIFF
Slide 25
Supplementary Info
• XLIFF provides “hooks” for storing supplementary information (for example, references to the glossaries or translation memories that should be used).
• The supplementary information can be referenced (i.e. reside outside of the document), or embedded within the document.
Slide 26
Administrative Info
XLIFF provides mechanisms for capturing administrative information:
• For relating source material to XLIFF documents.
• For storing workflow data.
• For providing pre-translation entries.
• For keeping track of changes.
Slide 27
Administrative Info – Pre-Leveraging
A set of proposed translations can be included for each <trans-unit> element, using the <alt-trans> element.
<trans-unit id='1'>
  <source xml:lang='en'>The text</source>
  <alt-trans match-quality='high' origin='MTsystem'>
    <target xml:lang='fr'>Le texte</target>
  </alt-trans>
</trans-unit>
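A tool could consume such proposals roughly as follows. This Python sketch lists the proposals of a unit and copies the first one into <target> when the unit has no translation yet. The second proposal and the "first proposal wins" policy are assumptions made for the illustration, not behaviour defined by XLIFF.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The unit from the slide, plus a second, hypothetical proposal.
UNIT = """<trans-unit id='1'>
  <source xml:lang='en'>The text</source>
  <alt-trans origin='MTsystem'>
    <target xml:lang='fr'>Le texte</target>
  </alt-trans>
  <alt-trans origin='TM'>
    <target xml:lang='fr'>Ce texte</target>
  </alt-trans>
</trans-unit>"""


def proposals(unit):
    """Return (origin, language, text) for every <alt-trans> proposal."""
    result = []
    for alt in unit.findall("alt-trans"):
        target = alt.find("target")
        result.append((alt.get("origin"), target.get(XML_LANG), target.text))
    return result


def pre_translate(unit):
    """Copy the first proposal into <target> if the unit has none yet."""
    first = unit.find("alt-trans/target")
    if unit.find("target") is None and first is not None:
        target = ET.Element("target", {XML_LANG: first.get(XML_LANG)})
        target.text = first.text
        # Place the new <target> directly after <source>.
        position = list(unit).index(unit.find("source")) + 1
        unit.insert(position, target)
    return unit


unit = pre_translate(ET.fromstring(UNIT))
print(proposals(unit))
print(unit.find("target").text)
```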
Slide 28
Custom Data in XLIFF 1.0
In XLIFF 1.0, we use the <prop> element and the ts attribute to store user-defined information (*note: these features are deprecated in XLIFF 1.1)
<trans-unit id='1' ts='ctx:23a7'>
  <prop-group>
    <prop prop-type='myType'>Some property data</prop>
  </prop-group>
  <source>Text</source>
</trans-unit>
Slide 29
XLIFF 1.1 Custom Data
In XLIFF 1.1, we have the ability to customise XLIFF by extending:
– Elements
– Attributes
– Attribute Values
Slide 30
Extending Elements
– Extension points exist in the following elements: <header>, <group>, <tool>, <trans-unit>, <alt-trans>, and <bin-unit>.
– The content of each custom element can be any valid XML content: empty content, PCDATA, mixed content, and so forth
– Custom elements defined in private namespace schema
Slide 31
Example of Extending Elements in XLIFF 1.1
<xliff version='1.1'
       xmlns='urn:oasis:names:tc:xliff:document:1.1'
       xmlns:sup='http://www.ChaucerState.ac.pg/Frm/XLFSup-v1'>
  <file original='passus-1.doc' source-language='enm' datatype='plaintext'>
    <group>
      <sup:SourceInfo>
        <sup:Book>Piers Plowman, Passus 1</sup:Book>
        <sup:Author>William Langland</sup:Author>
      </sup:SourceInfo>
      <sup:WorkInfo Task='transcription' Context='Middle-English:1360'/>
      <trans-unit id='1'>
        <source xml:lang='enm'>What this mountaigne bymeneth</source>
        <target xml:lang='en'>What this mountain means</target>
        <sup:Reference Type='strophe'>1-a</sup:Reference>
      </trans-unit>
    </group>
  </file>
</xliff>
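Because custom elements live in their own namespace, a consumer can tell them apart from core XLIFF elements by namespace alone. The Python sketch below, run on a shortened copy of the example, is one possible way to do that; the helper name `split_tag` is invented for this illustration.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"

# Shortened copy of the slide's example.
DOC = """<xliff version='1.1'
    xmlns='urn:oasis:names:tc:xliff:document:1.1'
    xmlns:sup='http://www.ChaucerState.ac.pg/Frm/XLFSup-v1'>
  <file original='passus-1.doc' source-language='enm' datatype='plaintext'>
    <group>
      <sup:SourceInfo>
        <sup:Author>William Langland</sup:Author>
      </sup:SourceInfo>
      <trans-unit id='1'>
        <source xml:lang='enm'>What this mountaigne bymeneth</source>
        <target xml:lang='en'>What this mountain means</target>
        <sup:Reference Type='strophe'>1-a</sup:Reference>
      </trans-unit>
    </group>
  </file>
</xliff>"""


def split_tag(tag):
    """Split an ElementTree tag of the form '{namespace}name'."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


root = ET.fromstring(DOC)

# Local names of all elements that are not in the XLIFF namespace.
custom = [
    split_tag(el.tag)[1]
    for el in root.iter()
    if split_tag(el.tag)[0] != XLIFF_NS
]
print(custom)
```

A tool that does not understand the custom vocabulary can use the same test to skip those elements and still process the <source> and <target> content.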
Slide 32
Extending Attributes
• Attributes of a namespace other than XLIFF's can be included in these XLIFF elements: <file>, <group>, <trans-unit>, <source>, <target>, <tool>, <bin-unit>, <bin-source>, <bin-target>, <alt-trans>, <mrk>, <g>, <x/>, <bx/>, <ex/>, <bpt>, <ept>, <ph>, and <it>
• No specific position is prescribed for the non-XLIFF attributes
• No limit to the number of non-XLIFF attributes that can be used in an XLIFF document
Slide 33
Example of Extending Attributes
Attributes from the HTML vocabulary extend the <group> and <trans-unit> elements.
<xliff version='1.1'
       xmlns='urn:oasis:names:tc:xliff:document:1.1'
       xmlns:htm='http://www.w3.org/TR/REC-html40'>
  <file original='table.htm' source-language='en' datatype='html'>
    <group restype='table' htm:border='1' htm:cellpadding='5' htm:cellspacing='0' htm:width='100%'>
      <group restype='row'>
        <trans-unit id='1' htm:valign='top' htm:width='30%'>
          <source>Text of row 1 column 1</source>
        </trans-unit>
        <trans-unit id='2' htm:valign='top' htm:width='30%'>
          <source>Text of row 1 column 2</source>
        </trans-unit>
      </group>
      <group restype='row'>
        <trans-unit id='3' htm:valign='top' htm:width='30%'>
          <source>Text of row 2 column 1</source>
        </trans-unit>
        <trans-unit id='4' htm:valign='top' htm:width='30%'>
          <source>Text of row 2 column 2</source>
        </trans-unit>
      </group>
    </group>
  </file>
</xliff>
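Reading such foreign attributes back is straightforward with a namespace-aware parser. The Python sketch below works on a shortened copy of the example; the function name `foreign_attributes` is made up, and treating xml:lang and xml:space as non-foreign is a choice of this sketch.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.1"}
HTM = "http://www.w3.org/TR/REC-html40"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Shortened copy of the slide's example.
DOC = """<xliff version='1.1'
    xmlns='urn:oasis:names:tc:xliff:document:1.1'
    xmlns:htm='http://www.w3.org/TR/REC-html40'>
  <file original='table.htm' source-language='en' datatype='html'>
    <group restype='table' htm:border='1' htm:cellpadding='5'>
      <trans-unit id='1' htm:valign='top' htm:width='30%'>
        <source xml:lang='en'>Text of row 1 column 1</source>
      </trans-unit>
    </group>
  </file>
</xliff>"""


def foreign_attributes(element):
    """Return the namespaced attributes of an element, excluding xml:*.

    ElementTree stores a namespaced attribute under the key '{namespace}name';
    plain XLIFF attributes such as restype or id have no namespace.
    """
    return {
        key: value
        for key, value in element.attrib.items()
        if key.startswith("{") and not key.startswith("{" + XML_NS + "}")
    }


root = ET.fromstring(DOC)
group = root.find(".//x:group", NS)
unit = root.find(".//x:trans-unit", NS)
source = root.find(".//x:source", NS)

print(foreign_attributes(group))
print(foreign_attributes(unit))
```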
Slide 34
Extending Attribute Values
• Attributes where the list of values can be extended are the following: context-type, count-type, ctype, datatype, mtype, restype, size-unit, state, unit, priority, and purpose
• User-defined values must start with an “x-” prefix
• There is no specified mechanism to validate individual user-defined values, beyond starting with “x-”
Slide 35
Example of Extending Attribute Values
• The following excerpt shows how the user-defined value x-for-engineers can be used in a document:
...
<group>
  <context-group name='EngineersData'>
    <context context-type='x-for-engineers'>Data...</context>
  </context-group>
</group>
...
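Since the specification offers no validation of user-defined values beyond the prefix, a tool has to apply the rule itself. A minimal Python sketch of such a check follows; the set of predefined context-type values shown is only a small subset chosen for illustration, and the function name is hypothetical.

```python
# Illustrative subset of predefined context-type values; the complete
# list is given in the XLIFF specification.
PREDEFINED_CONTEXT_TYPES = {"database", "record", "sourcefile", "linenumber"}


def is_acceptable(value, predefined):
    """Accept predefined values and user-defined values of the form 'x-...'."""
    if value in predefined:
        return True
    # A user-defined value needs the prefix plus at least one more character.
    return value.startswith("x-") and len(value) > len("x-")


for candidate in ("sourcefile", "x-for-engineers", "for-engineers", "x-"):
    print(candidate, is_acceptable(candidate, PREDEFINED_CONTEXT_TYPES))
```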
Slide 36
Embedding XLIFF (XLIFF 1.1)
• An entire XLIFF document, or part of one, can be embedded in another XML document
• The host XML vocabulary must be defined by an XML Schema (XSD) that includes an <any> element in the definition of the element where the XLIFF data can be inserted
Slide 37
Deprecated or Changed Since 1.0
• reformat – feature changed
• tool attribute becomes tool element
• new tool-id attribute
• ts, prop / prop-group – deprecated
• header was required, now optional
• default – default values can be specified for a given scope
Slide 38
Data Validation
• In 1.0, validation by DTD
• In 1.1, validation by XML Schema – XSD
• XSD provides better control over XML documents:
– Structure: structured order can be specified
– Content: support for standard datatypes like date
– Semantics: can specify range of valid values or pattern
– Support for namespaces
Slide 39
The Real World
A look at some concrete examples on how XLIFF can be used in localization projects.
Slide 40
Streamlining L10n Files Exchanges
[Diagram, three panels, each showing the localization customer's many native file formats (for example .INI, .TXT, HTML, XML, .XSL, .JAVA, C++, RC, HLP, .EXE, .DLL):
1. Localization Customer → Localization Preprocessor → Pre-translated Proprietary Format File → Localization Vendor, using a Customer Supported Localization Tool.
2. Localization Customer → Localization Vendor, using the Vendor Localization Process.
3. Localization Customer → Localization Preprocessor → XLIFF → Localization Vendor, using any tools based on the XLIFF industry standard.]
Slide 41
Basic Use Case – without XLIFF
[Diagram: in the Publisher/Customer Domain, Developer Applications produce native files (Native File 1, e.g. HTML; Native File 2, e.g. Java files; Native File 3, e.g. Java Properties; ... Native File n). In the Localisation Domain, the Translator handles them with Customer Specific Tool(s) and Tool Resource Filters.]
Slide 42
Basic Use Case – with XLIFF
[Diagram: in the Publisher/Customer Domain, either XLIFF compliant Developer Applications author directly to XLIFF, or non XLIFF compliant Developer Applications produce HTML, Java Properties and RC Data that go through pre-processing. Both routes yield XLIFF file(s) containing HTML, Java, Properties, etc. translatable resources. In the Localisation Domain, the Translator uses an XLIFF Compliant Editor.]
Slide 43
Simple Automated Localisation Use Case
[Workflow diagram. Roles: Developer, Localization Engineer, Translator. Labelled steps and artefacts: Generate XLIFF; Pseudo Translate / Test; Defect Report; Leverage from the Translation Repository; XLIFF Translation Kit (0% Translated, Requires Translation); Translate in an XLIFF Editor; XLIFF Translation Kit (100% Translated); Update of the Translation Repository.]
Slide 44
Automated Localisation with CAT Use Case
[Workflow diagram. Roles: Developer, Localization Engineer, Translator. Labelled steps and artefacts: Generate XLIFF; Pseudo Translate / Test; Defect Report; 100% match from the Translation Repository; Fuzzy match from the Translation Memory; Machine Translate with Machine Translation; XLIFF Translation Kit (0% Translated, Requires Translation); Translate in an XLIFF Editor; XLIFF Translation Kit (100% Translated); Update of the Translation Repository.]
Slide 45
Benefits: Use of XML Technologies
• XSL can be used to perform many tasks on XLIFF documents, for example:
– Display translatable content in a Web browser.
– Generate statistics (e.g. number of localizable objects).
• Availability of many XML engines makes using XLIFF easy.
– Content-related checks (e.g. that certain characters do not appear as textual contents) can be performed with ordinary Web browsers.
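The statistics task is not tied to XSL; any XML engine will do. As a sketch, the Python below (standard library only, swapped in for XSL here) counts units, translated units and source words in a small made-up XLIFF 1.1 document. Counting whitespace-separated tokens as words, and treating a unit as translated when its <target> has non-blank text, are simplifications of this sketch.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.1"}

# Hypothetical sample document.
SAMPLE = """<xliff version='1.1' xmlns='urn:oasis:names:tc:xliff:document:1.1'>
  <file original='app.properties' source-language='en'
        target-language='fr' datatype='plaintext'>
    <body>
      <trans-unit id='1'>
        <source>Open file</source>
        <target>Ouvrir le fichier</target>
      </trans-unit>
      <trans-unit id='2'><source>Save</source><target/></trans-unit>
      <trans-unit id='3'><source>Close all windows</source></trans-unit>
    </body>
  </file>
</xliff>"""


def statistics(xliff_text):
    """Count units, translated units and source words."""
    root = ET.fromstring(xliff_text)
    units = root.findall(".//x:trans-unit", NS)
    translated = 0
    source_words = 0
    for unit in units:
        source = unit.find("x:source", NS)
        # itertext() also collects text nested inside inline elements.
        source_words += len("".join(source.itertext()).split())
        if unit.findtext("x:target", default="", namespaces=NS).strip():
            translated += 1
    return {
        "units": len(units),
        "translated": translated,
        "source_words": source_words,
    }


print(statistics(SAMPLE))
```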
Slide 46
XML-Enabled Translation Tools
• Any XML-enabled translation tool can work with an XLIFF document, as long as the text to translate is first copied into the <target> elements. This does not mean the tool supports all XLIFF features; it only permits translation of the <target> content.
• Many tools cannot handle conditional translation (for example: <trans-unit translate="no">). In that case, extra elements must be added temporarily.
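The preparation step for a generic XML tool can be sketched as follows: copy each <source> into a new <target>, and leave units marked translate="no" alone. This Python example uses a simplified, namespace-free document invented for the illustration; how a particular tool wants protected units marked is tool-specific and not covered here.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical, simplified document (no namespace declaration).
DOC = """<xliff version='1.1'>
  <file original='app.properties' source-language='en' datatype='plaintext'>
    <body>
      <trans-unit id='1'><source xml:lang='en'>Open <g id='1'>file</g></source></trans-unit>
      <trans-unit id='2' translate='no'><source>XLIFF</source></trans-unit>
    </body>
  </file>
</xliff>"""


def prepare_targets(root):
    """Give every translatable unit a <target> pre-filled with the source."""
    for unit in root.iter("trans-unit"):
        if unit.get("translate") == "no":
            continue  # protected unit: leave it without a target
        if unit.find("target") is not None:
            continue  # already has a target, do not overwrite it
        source = unit.find("source")
        # Deep copy keeps inline codes such as <g>; attributes like
        # xml:lang describe the source language, so they are dropped.
        target = copy.deepcopy(source)
        target.tag = "target"
        target.attrib.clear()
        unit.insert(list(unit).index(source) + 1, target)
    return root


root = prepare_targets(ET.fromstring(DOC))
first, second = root.iter("trans-unit")
print("".join(first.find("target").itertext()))
print(second.find("target"))
```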
Slide 47
3rd Party Tools Support for XLIFF
• RWS Group: Extraction Utility for RC Data and Java Properties to XLIFF 1.1 http://dotnet.goglobalnow.net/ Various Utilities: http://www.translate.com/shared/tools
• Alchemy Software - Catalyst 5.0 – Visual XLIFF 1.1 Editor http://www.alchemysoftware.ie
• XML-Intl : XLIFF Editor http://www.xml-intl.com
• Heartsome XLIFF Editor: http://www.heartsome.net
• SDL International: SDLX support for XLIFF currently in development. See http://www.sdlx.com for more information.
• Trados: No direct XLIFF support, but can edit XLIFF files using modified INI
• PASS: Passolo: XML Editor can be configured for XLIFF, http://www.passolo.com
Slide 48
More Tools Support for XLIFF
• Bowne Global Solutions: Elcano, Online Translation Service, has a web service based connector for XLIFF files http://elcano.bowneglobal.com
• Oracle: HyperHub: Internal Tool for editing Oracle based data contained in XLIFF archives
• IBM: Domino Global Workbench Version 6 (http://www6.software.ibm.com/devcon/devcon/docs/dwkbbet6.htm)
• Sun : Internal XLIFF Editor as described in this article: http://www.sun.com/developers/gadc/technicalpublications/articles/xliff.html
• Open Source XSLT Tools: http://sourceforge.net/project/showfiles.php?group_id=42949&release_id=67485
Slide 49
Current State of Affairs
A look at the work under way at the OASIS XLIFF TC, the future, etc.
Slide 50
Current State of Affairs – To Do
• Specification of a canonical representation in XLIFF of common formats (e.g. Windows resources, Java properties), so that all XLIFF representations are the same regardless of which tool created the document.
• Translation/Localization tools that support XLIFF out-of-the-box (not just as another XML format).
• Open Source filters (e.g. to convert from Windows message catalogues to XLIFF).
Slide 51
More Information
• The XLIFF TC Web Site: http://www.xliff.org
• Presenter:
– XLIFF TC Chair: Tony Jewtushenko (Oracle)
• Significant Contributors to this Presentation:
– Christian Lieske (SAP)
– Yves Savourel (RWS Group)
Slide 52
Thank You...
Questions?