San José, CA – September, 2004

Localizing with XLIFF and ICU

Markus Scherer
markus.scherer@us.ibm.com

Raghuram (Ram) Viswanadha
ramv@us.ibm.com

IBM San José Globalization Center of Competency
Copyright © 2003-2004 IBM Corporation

26th Internationalization & Unicode Conference, San José, CA – September, 2004

Agenda

• What is software globalization?
• Localizable content storage formats
• Problems faced by translators
• What is XLIFF?
• XLIFF Benefits
• Integration of XLIFF in localization process
• ICU and XLIFF
• Localization process demo
• Q & A

Globalization of a Software Product

• Localization (L10N): The process of modifying products or services to account for differences in distinct markets. 

• Internationalization (I18N): The process of ensuring at a technical/design level that a product can be easily localized.

• Globalization (G11N): The proper design and execution of systems, software, services, and procedures so that one instance of software, executing on a single server or end user machine, can process multilingual data, and present data culturally correctly in a multicultural environment such as the Internet.

Localizable Content Storage Formats

• Plethora of formats
  – VC++ .rc and .mc files
  – Java resource bundles
  – ICU resource bundles
  – .NET resource files
  – POSIX Message Catalogs
  – GNU gettext
  – ... etc.

Note: Refer to the conference proceedings for more information.

Problems for Translators

• Large number of proprietary formats
  – Very different capabilities

• Lack of tools which understand the different formats and interoperate

• Formats in programming languages

• Lack of a well-defined process for managing the localization workflow

What is ICU?

• Internationalization libraries for C, C++, Java*
  – Open source – non-viral
  – Sponsored by IBM
  * Sun’s Java licenses an earlier ICU version; ICU4J updates it.
• Unicode standard compliant
  – Full supplementary support
• Cross-platform; extensible and customizable
• High performance and thread-safe
  – Multiple locales in same thread – simultaneously

• http://oss.software.ibm.com/icu/

ICU Services

• Unicode string handling, sets of Unicode characters, and character properties
• Character conversion (>200 conversion tables)
• Language-sensitive collation (UCA) and text searching
• Unicode regular expressions and text boundary analysis
• Locale-sensitive formatting and parsing (>200 locales)
• Timezone and currency handling
• Complex text layout
• Script transliteration and flexible text-text transformations
• Resource bundles for storing localizable content

ICU ResourceBundle Features

• Comparable features for content storage except for meta data
• Support for some XLIFF meta data features
• Key-Value pairs + Nested Structures
• External text and binary files can be imported
• Bundle fallback mechanism
  – root bundle ← ‘fr’ bundle for French ← ‘fr_CA’ bundle for French, Canada (a lookup that misses in ‘fr_CA’ falls back to ‘fr’, then to root)
• Alias mechanism for conserving space
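
The bundle fallback mechanism can be illustrated with a minimal Python sketch. This is not the ICU API: the `BUNDLES` dictionary, its keys and sample strings, and the helper names `fallback_chain` and `lookup` are all hypothetical, chosen only to show how a lookup walks from the most specific locale to the root bundle.

```python
# Hypothetical in-memory bundles keyed by locale ID (illustrative data only;
# a real application would load compiled bundles through ICU).
BUNDLES = {
    "root": {"greeting": "Hello", "farewell": "Goodbye", "color": "Color"},
    "fr": {"greeting": "Bonjour", "farewell": "Au revoir"},
    "fr_CA": {"greeting": "Allô"},
}


def fallback_chain(locale):
    """Return the lookup order, most specific first, always ending at root."""
    chain = []
    while locale and locale != "root":
        chain.append(locale)
        # Drop the last "_"-separated subtag: "fr_CA" -> "fr" -> "".
        locale = locale.rpartition("_")[0]
    chain.append("root")
    return chain


def lookup(locale, key):
    """Return (value, bundle_name) from the first bundle that defines key."""
    for name in fallback_chain(locale):
        bundle = BUNDLES.get(name, {})
        if key in bundle:
            return bundle[key], name
    raise KeyError(key)


if __name__ == "__main__":
    for key in ("greeting", "farewell", "color"):
        value, found_in = lookup("fr_CA", key)
        print(f"{key}: {value!r} (from {found_in})")
```

Because missing keys are resolved further up the chain, a regional bundle such as ‘fr_CA’ only needs to store the strings that differ from ‘fr’.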

What is XLIFF?

• XML Localization Interchange File Format (XLIFF) is an emerging industry standard for exchanging content for localization.

• Defines XML vocabulary for expressing localizable data.

• Designed by localization industry experts to address the problems faced by translators.

XLIFF Benefits

• XML-based
  – Problems of encoding and character sets avoided
  – Rich internationalization features
• Superset of other formats
  – Lossless conversion to and from different formats
  – Binary objects (e.g. icons) can be imported or represented inline
• Not proprietary
  – Tool and vendor neutral
• Key-value pairs + Nested structures
• Extensible

XLIFF Benefits (contd.)

• Translation can be turned off.
• Meta data for communication between developer and translator.
• Validation through XML Schema or Document Type Definition (DTD).
• Support for Translation Memory, Machine Translation (MT) and Computer Aided Translation.
• Meta data for administrative info and version management.
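
Two of these benefits, turning translation off and passing notes from developer to translator, can be sketched in Python with the standard `xml.etree.ElementTree` module. The sample units, their ids, and the helper name `translatable_units` are hypothetical; the sketch assumes the XLIFF 1.1 conventions that `translate` defaults to "yes" and that a `<note>` element carries comments, and it omits the XLIFF namespace for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; ids and strings are made up for illustration.
XLIFF_BODY = """\
<body>
  <trans-unit id="app_title" translate="no">
    <source xml:lang="en">ufortune</source>
  </trans-unit>
  <trans-unit id="usage_hint" translate="yes">
    <source xml:lang="en">Press any key to continue.</source>
    <note>Shown after each fortune.</note>
  </trans-unit>
  <trans-unit id="goodbye">
    <source xml:lang="en">Goodbye!</source>
  </trans-unit>
</body>
"""


def translatable_units(xml_text):
    """Return the trans-unit elements a translator should work on."""
    root = ET.fromstring(xml_text)
    return [
        unit
        for unit in root.iter("trans-unit")
        # A missing translate attribute is treated as "yes".
        if unit.get("translate", "yes") == "yes"
    ]


if __name__ == "__main__":
    for unit in translatable_units(XLIFF_BODY):
        note = unit.findtext("note", default="(no note)")
        print(unit.get("id"), "|", unit.findtext("source"), "|", note)
```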

XLIFF Snippet

<trans-unit xml:space="preserve" id="root_fortunes_0" translate="yes">
  <source xml:lang="en">
    A child of five could understand this! Fetch me a child of five.
  </source>
  <target xml:lang="te">
    ఒక ఐదు యెళ్ళ పిల్లవాడు దీనిని అర్థం చేసుకొగల్లడు, ఐదు యెళ్ళ పిల్లవాడిని తీసుక ర్థండి
  </target>
</trans-unit>
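
A trans-unit like the one above can be read with Python's standard `xml.etree.ElementTree`. The following is a minimal sketch, not part of ICU or of any XLIFF toolkit: the helper name `read_unit` is hypothetical, and the target text is replaced by a placeholder string rather than a real translation.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

# Same shape as the slide's snippet; the target text is a placeholder.
SNIPPET = """\
<trans-unit xml:space="preserve" id="root_fortunes_0" translate="yes">
  <source xml:lang="en">A child of five could understand this! Fetch me a child of five.</source>
  <target xml:lang="te">[Telugu translation goes here]</target>
</trans-unit>
"""


def read_unit(xml_text):
    """Extract id, languages and text from a single trans-unit."""
    unit = ET.fromstring(xml_text)
    source = unit.find("source")
    target = unit.find("target")  # None while the unit is untranslated
    return {
        "id": unit.get("id"),
        "source_lang": source.get(XML_NS + "lang"),
        "source": source.text,
        "target_lang": target.get(XML_NS + "lang") if target is not None else None,
        "target": target.text if target is not None else None,
    }


if __name__ == "__main__":
    for field, value in read_unit(SNIPPET).items():
        print(f"{field}: {value}")
```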

How Applications Localize

[Diagram: Product Developers, Repository, Template, Tools that understand multiple formats, Translators, Localized files]

Using XLIFF

Two ways to use XLIFF:

• Transient: XLIFF is used only for interchange of content with translators.

• Source: XLIFF is used as the source format of localizable content.

Transient XLIFF Files

[Diagram: Product Developers, Repository, Template, XLIFF Filter, Template XLIFF File, Translators, Localized XLIFF Files]
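
The role of an XLIFF filter in this flow, turning a template in a native format into an XLIFF file for translators, can be sketched in Python. This is an illustrative assumption, not the filter used in the talk: the function name `to_xliff`, the file name `messages.txt`, and the sample keys are hypothetical, and the XLIFF namespace declaration is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def to_xliff(resources, original, source_lang="en", datatype="plaintext"):
    """Wrap key-value string resources in a minimal XLIFF 1.1 skeleton."""
    xliff = ET.Element("xliff", {"version": "1.1"})
    file_el = ET.SubElement(
        xliff,
        "file",
        {
            "original": original,            # name of the native template file
            "source-language": source_lang,
            "datatype": datatype,
        },
    )
    body = ET.SubElement(file_el, "body")
    for key, text in resources.items():
        unit = ET.SubElement(body, "trans-unit", {"id": key})
        source = ET.SubElement(unit, "source")
        source.text = text                    # ElementTree escapes &, <, >
    return ET.tostring(xliff, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical template content.
    template = {"greeting": "Hello", "prompt": "Save changes & exit?"}
    print(to_xliff(template, original="messages.txt"))
```

Translators then add a `<target>` element to each unit, and the localized XLIFF file is converted back to the native format.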

Source XLIFF Files

[Diagram: Product Developers, Repository, Template, XLIFF Filter in build process, Template XLIFF File, Translators, Localized XLIFF Files]

XLIFF for Localization Engineers

Q. What does XLIFF do for me?

• Single format for quality control

• NO errors due to 3rd party tools: you control the converters

• NO data interchange errors

• Control what needs to be translated

• Communicate with translator

Localization with XLIFF and ICU

[Diagram: Product Developers, Repository, Template in ICU format, XLIFF2ICU Converter and genrb Filters in build process, Template XLIFF File, Translators, Localized XLIFF Files]
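
The conversion step from localized XLIFF back to ICU resource bundle source can be sketched in Python. This is a simplified stand-in, not the actual XLIFF2ICU converter: the function name `xliff_to_icu_text` and the sample data are hypothetical, only flat string resources are handled, and the backslash escaping of quotes is an assumption about the ICU text format that should be checked against the ICU documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical localized XLIFF file; one unit is still untranslated.
SAMPLE = """\
<xliff version="1.1">
  <file original="messages.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="greeting">
        <source>Hello</source>
        <target>Bonjour</target>
      </trans-unit>
      <trans-unit id="farewell">
        <source>Goodbye</source>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def xliff_to_icu_text(xml_text, locale):
    """Emit 'locale { key { "value" } }' text for every translated unit."""
    root = ET.fromstring(xml_text)
    lines = [f"{locale} {{"]
    for unit in root.iter("trans-unit"):
        target = unit.find("target")
        if target is None or not (target.text or "").strip():
            continue  # untranslated: leave the key to bundle fallback
        key = unit.get("id")
        # Assumed escaping: backslash-escape backslashes and double quotes.
        text = target.text.strip().replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'    {key} {{ "{text}" }}')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(xliff_to_icu_text(SAMPLE, "fr"))
```

Skipping untranslated units means the missing keys resolve through the fallback chain at run time instead of shipping source-language text in the localized bundle. The resulting text file would then be compiled with genrb as part of the build.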

Let's Localize!

http://oss.software.ibm.com/cvs/icu/icuapps/ufortune

Conclusion

• XLIFF is the best interchange format available

• ICU provides tools for using XLIFF in the localization process

References and Resources

• International Components for Unicode (ICU): http://oss.software.ibm.com/icu/
• XLIFF Specification: http://www.oasis-open.org/committees/xliff/documents/cs-xliff-core-1.1-20031031.htm
• XLIFF Overview: http://xml.coverpages.org/xliff.html
• White Paper on XLIFF: http://www.oasis-open.org/apps/group_public/download.php/3110/XLIFF-core-whitepaper_1.1-cs.pdf
• Domino Global Workbench Version 6: http://www6.software.ibm.com/devcon/devcon/docs/dwkbbet6.htm
• XLIFF at IBM: http://www.ibm.com/software/globalization/highlights/xliff.jsp
• XLIFF at Sun: http://www.sun.com/developers/gadc/technicalpublications/articles/xliff.html

Questions

Thank you for listening.

Are there any questions?