DITA for Localization
Better Translation Technology
Andrzej Zydron, CTO XTM International
DITA Localization
In the beginning
Technical documentation was without form, and darkness was upon the face of the page:
– Manual typesetting
– RTF
– WordPerfect
– MS Word
– FrameMaker
– Ventura Publisher
– PageMaker
– SGML
In the beginning
Lack of standards
•Proprietary solutions
•Problems with character encoding
•Expensive to design
•Expensive to build
•Expensive to maintain
•Expensive to localize
Along came XML
Let there be light:
– XML born in 1997 from SGML/HTML
– Review of lessons learned from SGML
– Easier to implement
– Removed unnecessary complexity
– Declared standard encoding - Unicode
DITA: Standards, Standards, Standards
DITA:
The advent of standards in technical documentation
DITA - the good
Extremely well thought out XML document architecture:
– modularity
– fine level of granularity
– reuse
– bookmap
– standardized elements
– Write once, translate once, reuse many times
– Multiple output formats, multiple places, multiple docs:
• PDF, HTML, mobile, web, paper etc.
DITA Localization
Practical considerations:
– Controlled Authoring:
• Consistency
• Terminology
– Delivery for localization:
• All at once in one big heap
• JIT - individual topics when ready
– Translation Consistency:
• Translation Memory
• Terminology
DITA Localization - the good
Modularity:
– Translate a topic once
– Reuse many times!
• No need to retranslate
– Just in time translation
• Translate as soon as source is ready
• Dramatic improvement in time to market
• All documentation in all languages is ready concurrently
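The "translate once, reuse many times" point can be sketched in a few lines. This is only an illustration: the two maps, their file names, and the helper `topics_to_translate` are invented here, and a real workflow would read map files from disk and resolve nested maps.

```python
import xml.etree.ElementTree as ET

# Two hypothetical maps (invented for illustration) that share topics.
MAP_A = """<map>
  <topicref href="safety.dita"/>
  <topicref href="install.dita">
    <topicref href="wiring.dita"/>
  </topicref>
</map>"""

MAP_B = """<map>
  <topicref href="safety.dita"/>
  <topicref href="wiring.dita"/>
  <topicref href="service.dita"/>
</map>"""

def topic_hrefs(map_xml):
    """Return the topic hrefs of a map in document (reading) order."""
    root = ET.fromstring(map_xml)
    return [ref.get("href") for ref in root.iter("topicref") if ref.get("href")]

def topics_to_translate(maps, already_translated=()):
    """Each topic is queued once, however many maps reference it."""
    seen = set(already_translated)
    pending = []
    for map_xml in maps:
        for href in topic_hrefs(map_xml):
            if href not in seen:
                seen.add(href)
                pending.append(href)
    return pending

references = sum(len(topic_hrefs(m)) for m in (MAP_A, MAP_B))
unique = topics_to_translate([MAP_A, MAP_B])
print(references, "topic references,", len(unique), "topics to translate")
```

Passing the topics that are already translated as `already_translated` gives the just-in-time case: only topics that are new since the last delivery are queued.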
DITA Localization - the good
• Decide how you want to translate:
– Whole document as one using bookmap
– Individual topics navigated according to bookmap
– Individual topics as and when ready
• Handling last minute engineering changes
– JIT translation
– Many TMS systems not good at handling this
– Automatically update already translated segments
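A minimal sketch of how a last-minute change might be handled, assuming an exact-match translation memory held as a plain dictionary. The segments, the Polish translations, and the function `pretranslate` are invented for illustration; real TMS products also use fuzzy matching and segment context.

```python
def pretranslate(segments, memory):
    """Reuse stored translations and collect the segments that still need work.

    `memory` maps a source segment to its stored translation (exact match only).
    Untranslated positions are returned as None.
    """
    result, pending = [], []
    for segment in segments:
        if segment in memory:
            result.append(memory[segment])
        else:
            result.append(None)
            pending.append(segment)
    return result, pending

# Hypothetical memory built from the previous version of a topic.
memory = {
    "Switch off the engine.": "Wyłącz silnik.",
    "Open the bonnet.": "Otwórz maskę.",
}

# The engineering change inserts one new sentence into the topic.
revised = ["Switch off the engine.", "Wait five minutes.", "Open the bonnet."]

translated, pending = pretranslate(revised, memory)
print(len(pending), "of", len(revised), "segments need translation:", pending)
```

Only the inserted sentence goes to the translator; the unchanged segments are filled in automatically.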
DITA Localization - the <bad/><ugly/>
The bad and downright ugly (the three villains!):
– Word Substitution
• CONREF
• KEYREF
• DITAVAL
– Specialization
– Conditional processing
DITA: square peg, round hole
• Do not try to force DITA to do what it is not designed for!
• DITA = Modular technical documentation
• Small, discrete topics
• No more than one page of text per topic
• Use the Open Toolkit
• Do not get overambitious with substitutions
– What works for English and Mandarin will not work for other languages
DITA: Object Oriented Documentation
• DITA is an attempt to use OO design for XML documentation
• Very tempting for computer scientists
• We did it for computer programming
• Why not documentation?
• Problems arise with the nature of documentation
• Problems arise with the nature of human language
Language – why humans mess things up!
What language is this?
What is he saying?
Understanding the nature of English
• Why is English different from most other languages?
• English is a fusion language: a creole
– 60% Old Chaucerian English + 40% French
• Other Creoles with a high number of speakers:
– French (Vulgar Latin + Frankish)
– Swahili (Bantu + Arabic)
– Urdu (Hindi + Arabic)
– Mandarin
• (Many Sino-Tibetan languages)
Understanding the nature of English
• Primitive morphology
– Nouns:
• Singular, plural, possessive
– ship, ships, ship’s, ships’
– No Gender
• a ship, the ship, the ships
– No adjectival agreement
• green ship, green ships
• We can substitute nouns and noun phrases without causing grammatical errors
• This is not true of most other languages
• English does not work like most other languages
• Your documentation WILL be translated sooner or later
DITA Localization
Avoid word substitution (CONREF, KEYREF, DITAVAL):
– Linguistic issues
– Adjectival agreement
– Grammatical case
• Presenting the new Ford <keyword keyref="model"/> for 2014.
– very bad idea!
• Focus, Fiesta, Mondeo
• Nowy Focus, Nowa Fiesta, Nowe Mondeo
• Akin to saying ‘Presenting the Ford new Focus’
• Nowym Focus’em, Nową Fiestą, Nowym Mondeo
– May work for alphanumeric words
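The agreement problem can be made concrete with a small sketch. It uses only the nominative forms shown above (Nowy, Nowa, Nowe); the gender table and the function names `naive` and `agreed` are invented for illustration and do not model any real DITA processor.

```python
# Gender of each model name in Polish, as implied by the examples above.
GENDER = {"Focus": "masculine", "Fiesta": "feminine", "Mondeo": "neuter"}

# Nominative forms of "new" taken from the examples above.
NEW = {"masculine": "Nowy", "feminine": "Nowa", "neuter": "Nowe"}

def naive(model):
    """Keyref-style substitution: the sentence is translated once,
    so the adjective is frozen in whatever form the translator chose."""
    return f"Nowy {model}"

def agreed(model):
    """What the language requires: the adjective agrees with the noun."""
    return f"{NEW[GENDER[model]]} {model}"

broken = [model for model in GENDER if naive(model) != agreed(model)]
for model in GENDER:
    print(f"{naive(model):14} should be {agreed(model)}")
print("Ungrammatical after substitution:", broken)
```

The substituted sentence is correct for only one of the three models, and this sketch covers a single grammatical case. Each further case (such as the instrumental forms above) multiplies the number of forms the fixed sentence would need.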
DITA Localization
Only use substitution for linguistically complete sentences
– Warnings
– Cautions
– Notes
Avoid substitution for individual words or noun phrases
Specialization
• Specialize at your peril!
– A double edged sword
• Exponentially increases the difficulty of:
– Authoring
– Publishing
– Localization
• New elements/attributes
– How are they to be treated?
– For localization: completely new document type
DITA and OAXAL
• OAXAL - Open Architecture for XML Authoring and Localization
• DITA Authoring and Localization in a Standards context:
– DITA is an Open Standard
– Why use proprietary software for Authoring and Localization of DITA?
DITA Localization - considerations
• Choosing the right TMS/CAT System
– Can it handle XML properly?
• Entity references, e.g. ‘&amp;’
• Encoding
• Validation
– Does it understand DITA?
– Does it understand ditamap/bookmap?
– Can you navigate using the bookmap?
– Can it handle specialization?
– Does it handle JIT?
– Can it handle last minute changes?
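The "handles XML properly" checks can be tried out with a short round-trip test. This is a sketch using Python's standard `xml.etree.ElementTree`; the sample paragraph and the helper names are invented, and it checks well-formedness only, not validity against the DITA DTDs or schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical source segment with two entity references and a character reference.
SOURCE = "<p>Fish &amp; chips cost &lt; 5 &#8364;</p>"

def round_trip(xml_text):
    """Parse a fragment and serialize it again.

    Returns the text a translator should see and the re-serialized XML.
    Raises ET.ParseError if the input is not well-formed.
    """
    element = ET.fromstring(xml_text)
    return element.text, ET.tostring(element, encoding="unicode")

def is_well_formed(xml_text):
    """True if the fragment parses as XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

text, serialized = round_trip(SOURCE)
print("Translator sees:", text)
print("Written back as:", serialized)
print("Bare ampersand accepted?", is_well_formed("<p>Fish & chips</p>"))
```

A tool that shows the translator `&amp;` instead of `&`, or that writes a bare `&` back into the target file, fails the first check in the list above.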
How to reduce your translation costs
• Write less!
– Ford of Europe reduced translation costs by 50% in 2005
– It costs as much to translate into one language as it does to write the original
• Use more graphics
– Integrate with CAD/CAM systems
– But beware text in graphics – use callouts
• People may actually start using your documentation
• KISS
• Manage your own translation assets: e.g. invest in your own TMS
– Save an additional 20% on average on cost and 50% on turnaround
Contact Details
• Postal address:
– PO Box 2167
– Gerrards Cross
– Bucks SL9 8XF
– United Kingdom
• Phone: +44 1753 480 467
• Fax: +44 1753 480 465
• Andrzej Zydroń – [email protected]