DITA for Localization

Transcript of DITA for Localization

Better Translation Technology

Andrzej Zydron, CTO XTM International

DITA Localization

In the beginning

Technical documentation was without form, and darkness was upon the face of the page:

– Manual typesetting

– RTF

– WordPerfect

– MS Word

– FrameMaker

– Ventura Publisher

– PageMaker

– SGML

In the beginning

Lack of standards

•Proprietary solutions

•Problems with character encoding

•Expensive to design

•Expensive to build

•Expensive to maintain

•Expensive to localize

Along came XML

Let there be light:

– XML born from SGML/HTML (XML 1.0 became a W3C Recommendation in 1998)

– Review of lessons learned from SGML

– Easier to implement

– Removed unnecessary complexity

– Declared a standard character set: Unicode

DITA: Standards, Standards, Standards

DITA: the advent of standards in technical documentation

DITA is not perfect!

DITA - the good

Extremely well thought out XML document architecture:

– modularity

– fine level of granularity

– reuse

– bookmap

– standardized elements

– Write once, translate once, reuse many times

– Multiple output formats, multiple places, multiple docs:

• PDF, HTML, mobile, web, paper etc.

DITA Localization

Practical considerations:

– Controlled Authoring:

• Consistency

• Terminology

– Delivery for localization:

• All at once in one big heap

• JIT (just in time): individual topics when ready

– Translation Consistency:

• Translation Memory

• Terminology

DITA Localization - the good

Modularity:

– Translate a topic once

– Reuse many times!

• No need to retranslate

– Just in time translation

• Translate as soon as source is ready

• Dramatic improvement in time to market

• All documentation in all languages is ready concurrently
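The "translate once, reuse many times" idea can be sketched as a store keyed by topic content and target language. This is only an illustration under stated assumptions: `TopicTranslationStore` and the `translate` callable are hypothetical names, not part of DITA or of any particular TMS, and a real system would match at segment level with fuzzy matching rather than on exact topic text.

```python
import hashlib


def topic_key(source_text: str, target_lang: str) -> str:
    """Build a key from whitespace-normalized source text and target language."""
    normalized = " ".join(source_text.split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"{target_lang}:{digest}"


class TopicTranslationStore:
    """Hypothetical store: each topic is translated once per language."""

    def __init__(self, translate):
        self._translate = translate  # assumed callable: (text, lang) -> text
        self._store = {}
        self.calls = 0  # how many times a real translation was requested

    def get(self, source_text: str, target_lang: str) -> str:
        key = topic_key(source_text, target_lang)
        if key not in self._store:
            self.calls += 1
            self._store[key] = self._translate(source_text, target_lang)
        return self._store[key]
```

If the same topic appears in several bookmaps, only the first request for a given language triggers translation; later requests reuse the stored result.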

DITA Localization - the good

• Decide how you want to translate:

– Whole document as one using bookmap

– Individual topics navigated according to bookmap

– Individual topics as and when ready

• Handling last minute engineering changes

– JIT translation

– Many TMS systems not good at handling this

– Automatically update already translated segments
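A rough sketch of what handling a last-minute change might involve: when a revised topic arrives, segments that already have a translation are carried over and only new or changed segments go back to the translator. The function name `plan_update` and the exact-match lookup are assumptions made for illustration; they do not describe how any specific TMS works.

```python
def plan_update(new_segments, translated):
    """Split revised source segments into reusable and to-translate lists.

    new_segments: list of source segments from the revised topic.
    translated:   dict mapping a source segment to its existing translation.
    """
    reuse, todo = [], []
    for segment in new_segments:
        if segment in translated:
            # Unchanged source segment: keep the existing translation.
            reuse.append((segment, translated[segment]))
        else:
            # New or edited segment: needs translation.
            todo.append(segment)
    return reuse, todo
```

With this approach an engineering change to one sentence costs one segment of translation work, not a retranslation of the whole topic.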

DITA Localization - the <bad/><ugly/>

The bad and downright ugly (the three villains!):

– Word Substitution

• CONREF

• KEYREF

• DITAVAL

– Specialization

– Conditional processing

DITA: square peg, round hole

• Do not try to force DITA to do what it is not designed for!

• DITA = Modular technical documentation

• Small, discrete topics

• No more than one page of text per topic

• Use the Open Toolkit

• Do not get overambitious with substitutions

– What works for English and Mandarin will not work for other languages

DITA: Object Oriented Documentation

• DITA is an attempt to use OO design for XML documentation

• Very tempting for computer scientists

• We did it for computer programming

• Why not documentation?

• Problems arise with the nature of documentation

• Problems arise with the nature of human language

Language – why humans mess things up!

What language is this?

What is he saying?

Understanding the nature of English

• Why is English different from most other languages?

• English is a fusion language: a creole

– 60% Old Chaucerian English + 40% French

• Other Creoles with a high number of speakers:

– French (Vulgar Latin + Frankish)

– Swahili (Bantu + Arabic)

– Urdu (Hindi + Arabic)

– Mandarin

• (Many Sino-Tibetan languages)

Understanding the nature of English

• Primitive morphology

– Nouns:

• Singular, plural, possessive

– ship, ships, ship’s, ships’

– No Gender

• a ship, the ship, the ships

– No adjectival agreement

• green ship, green ships

• We can substitute nouns and noun phrases without causing grammatical errors

• This is not true of most other languages

• English does not work like most other languages

• Your documentation WILL be translated sooner or later

DITA Localization

Avoid word substitution (CONREF, KEYREF, DITAVAL):

– Linguistic issues

– Adjectival agreement

– Grammatical case

• Presenting the new Ford <keyword keyref="model"/> for 2014.

– very bad idea!

• Focus, Fiesta, Mondeo

• Nowy Focus, Nowa Fiesta, Nowe Mondeo

• Akin to saying ‘Presenting the Ford new Focus’

• Nowym Focus’em, Nową Fiestą, Nowym Mondeo

– May work for alphanumeric words
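The agreement problem above can be made concrete with a small sketch. The `{model}` placeholder is a hypothetical stand-in for the keyref, and the Polish forms are the nominative ones quoted above; the point is only that a template translated once, with the adjective fixed for one model, comes out wrong when other models are substituted in.

```python
# Correct nominative forms from the list above: the adjective "new"
# must agree with the grammatical gender of the model name.
CORRECT = {
    "Focus": "Nowy Focus",
    "Fiesta": "Nowa Fiesta",
    "Mondeo": "Nowe Mondeo",
}

# A template translated once, with the adjective chosen for "Focus".
TEMPLATE_PL = "Nowy {model}"


def naive_substitute(template: str, model: str) -> str:
    """Mimic word substitution: drop the model name into a fixed template."""
    return template.replace("{model}", model)


def agreement_errors(template: str, correct: dict) -> list:
    """Return the models for which naive substitution gives the wrong form."""
    return [
        model
        for model, expected in correct.items()
        if naive_substitute(template, model) != expected
    ]


print(agreement_errors(TEMPLATE_PL, CORRECT))  # ['Fiesta', 'Mondeo']
```

Only one of the three models comes out right, and that is before grammatical case is taken into account.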

DITA Localization

Only use substitution for linguistically complete sentences

– Warnings

– Cautions

– Notes

Avoid substitution for individual words or noun phrases
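One way to apply this guideline in practice is a simple check that flags substitutions on inline elements while leaving block-level reuse (such as a whole note) alone. The sketch below is an assumption-laden illustration: the `INLINE` set is a short, non-exhaustive list chosen for the example, `risky_references` is a hypothetical helper, and the sample topic is simplified and not validated against the DITA DTDs.

```python
import xml.etree.ElementTree as ET

# Assumption: a small, non-exhaustive set of inline DITA elements.
INLINE = {"ph", "keyword", "term"}


def risky_references(xml_text: str) -> list:
    """List (element, reference) pairs where an inline element is substituted."""
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter():
        ref = elem.get("conref") or elem.get("keyref")
        if ref and elem.tag in INLINE:
            found.append((elem.tag, ref))
    return found


TOPIC = (
    '<topic id="t1"><title>Launch</title><body>'
    '<note conref="warnings.dita#w/hot"/>'
    '<p>Presenting the new Ford <keyword keyref="model"/> for 2014.</p>'
    "</body></topic>"
)

print(risky_references(TOPIC))  # [('keyword', 'model')]
```

The reused note is a complete unit and passes; the keyword inside a sentence is reported for review.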

Specialization

• Specialize at your peril!

– A double edged sword

• Exponentially increases the difficulty of:

– Authoring

– Publishing

– Localization

• New elements/attributes

– How are they to be treated?

– For localization: completely new document type

DITA and OAXAL

• OAXAL - Open Architecture for XML Authoring and Localization

• DITA Authoring and Localization in a Standards context:

– DITA is an Open Standard

– Why use proprietary software for Authoring and Localization of DITA?

OAXAL

http://wiki.oasis-open.org/oaxal/FrontPage

OAXAL Stack

OAXAL Interaction

OAXAL Source Lifecycle

OAXAL Translation Lifecycle

DITA Localization - considerations

• Choosing the right TMS/CAT System

– Can it handle XML properly:

• Entity references e.g. ‘&amp;’

• Encoding

• Validation

– Does it understand DITA

– Does it understand ditamap/bookmap

– Can you navigate using the bookmap

– Can it handle specialization

– Does it handle JIT

– Can it handle last minute changes
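A quick way to see what "handling XML properly" means is a round trip: parse a topic, change the text, serialize it, and confirm that the result is still well-formed with entity references intact. The sketch below uses Python's standard `xml.etree.ElementTree`; the `roundtrip` helper, the sample topic, and the upper-casing "translation" are stand-ins chosen for illustration, and the sketch ignores mixed content (text that follows an inline element).

```python
import xml.etree.ElementTree as ET

SOURCE = (
    '<topic id="t1"><title>Care &amp; maintenance</title>'
    "<body><p>Nową Fiestą</p></body></topic>"
)


def roundtrip(xml_text: str, translate) -> str:
    """Parse, replace each element's leading text, and serialize again."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.text and elem.text.strip():
            # The parser has already decoded '&amp;' to '&' at this point.
            elem.text = translate(elem.text)
    # The serializer re-escapes '&', '<' and '>' in text content.
    return ET.tostring(root, encoding="unicode")


result = roundtrip(SOURCE, str.upper)
ET.fromstring(result)  # raises ParseError if the output is not well-formed
```

A tool that treats the file as plain text can easily break the `&amp;` reference or the non-ASCII characters; a tool that works on the parsed document avoids both problems.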

How to reduce your translation costs

• Write less!

– Ford of Europe reduced translation costs by 50% in 2005

– It costs as much to translate into one language as it does to write the original

• Use more graphics

– Integrate with CAD/CAM systems

– But beware text in graphics – use callouts

• People may actually start using your documentation

• KISS

• Manage your own translation assets: e.g. invest in your own TMS

– Save an additional 20% on average on cost and 50% on turnaround

Less is More

Contact Details

• Postal address:

– PO Box 2167

– Gerrards Cross

– Bucks SL9 8XF

– United Kingdom

• Phone: +44 1753 480 467

• Fax: +44 1753 480 465

• Andrzej Zydroń – [email protected]