Tools of the Trade for Translators

Post on 01-Oct-2015


  • Tools of the Trade

    Angelika Zerfass

    Consultant / Trainer for Translation Tools

    zerfass@zaac.de

  • Zerfass@zaac.de2

    Agenda

    - What tools are used in translation / localization
    - Overview of the main functionalities of the tools in each category
    - Special topics such as word counts and TMX
    - How to evaluate which tool is right for you

  • Zerfass@zaac.de3

    Tools in Translation / Localization

    - Translation Memory Tools
    - Software Localization Tools
    - Terminology Management Tools
    - Project Management Tools
    - Workflow Management Tools
    - Additional Utilities

  • Zerfass@zaac.de4

    Translation Memory Tools


  • Zerfass@zaac.de5

    Same idea different approach

    - Re-use of already translated segments
    - Database / list for terminology
    - Alignment of translated material where no TM tool was used
    - Statistics on number of words and number of re-usable segments

  • Zerfass@zaac.de6

    Same idea different approach

    Components
    - Repository for text pairs / sentence pairs
    - Editor for translation
    - Repository for terminology
    - Alignment

    Editor
    - Word, separate editor, or editor within the TM tool

    Repository
    - Segment pair database per language pair
    - Segment pair database per project (multilingual)
    - Segment pairs in separate reference files
    - Paragraph pairs / parallel documents
    - Segment pairs by ID number (Software Localization)

    Project setup
    - All settings within one tool
    - Separate tools for separate tasks (terminology, conversion, translation)

  • Zerfass@zaac.de7

    Same idea different approach

    Project setup
    - One file format / language pair per project
    - Several file formats / languages per project

    Statistics
    - Word count in source / target language
    - Recycling of segments in the source language
    - Repetitions

    Alignment
    - Re-use of previously translated files (source and target language files)
    - One-to-one alignment / many-to-one alignment

    Terminology management
    - Term list / term database
    - Terminology extraction (monolingual / bilingual)

    QA features
    - Check for file structure integrity
    - Check for missed translations, numbers
    - Check for correct usage of terminology

  • Zerfass@zaac.de8

    Macros within Word

    Wordfast

    Metatexis

    Trados

  • Zerfass@zaac.de9

    TM + Editor

    - 2 applications open for translation
    - Connection needs to be established
    - Setup of translation memory independent of file for translation
    - Working on single files in the editor
    - Working on batches of files in the TM system (pre-translation)

  • Zerfass@zaac.de10

    Translation Memory Tool

    Terminology Tool

    Editor

    Terminology Extraction Tool

    Source language files

    Target language files

    Alignment Tool

    Separate tools

  • Zerfass@zaac.de11

    Trados with Word

  • Zerfass@zaac.de12

    Trados with TagEditor

  • Zerfass@zaac.de13

    Integrated tools

    - Setup of a project before translation can start
    - Selection of file format, languages, files, location for log files with every project
    - Re-usable settings from previous projects
    - Files are imported into the tool, translations are exported from the tool

  • Zerfass@zaac.de14

    Integrated tools

    Translation Repository

    Old source file

    Old target file

    Read in reference material

    Target files

    Generation of target files

    Files to translate

    Read in new files

    Alignment Component

    Translation Editor

    Terminology Component

  • Zerfass@zaac.de15

    Heartsome Editor

  • Zerfass@zaac.de16

    SDLX

    switchboard

  • Zerfass@zaac.de17

    SDLX

    Project setup

  • Zerfass@zaac.de18

    SDLX

    target window

    source window

    terminology window

  • Zerfass@zaac.de19

    MemoQ

  • Zerfass@zaac.de20

    MemoQ

  • Zerfass@zaac.de21

    Transit

    target window

    status window

    terminology window

    source window

  • Zerfass@zaac.de22

    Déjà Vu

  • Zerfass@zaac.de23

    Word counts in Translation Tools


  • Zerfass@zaac.de24 Zerfass@zaac.de

    What is counted in an analysis?

    - Chars/Word: average number of characters per word
    - Chars Total: number of characters in counted words, excluding spaces and stand-alone numbers

  • Zerfass@zaac.de25 Zerfass@zaac.de

    Word Count

    What is a word?
    - A word (for Trados) contains at least one letter (or language character for Asian languages)
    - Words are often delimited by spaces (exceptions: Chinese, Japanese, Thai)
    - Stand-alone numbers (for example in a table column) or symbols (like 1) are NOT counted as translatable words in Trados, but are counted in other tools

    Different TM tools count words differently (sometimes even different versions of the same tool)

    Word count tools: Word, Trados, Transit, Déjà Vu, special word count tools like AnyCount, PractiCount...
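The two counting philosophies above can be sketched in a few lines of Python. This is only an illustration under simple assumptions (whitespace-delimited text, a hypothetical sample sentence); it is not the counting algorithm of Word, Trados or any other tool, and it would not work for languages that do not separate words with spaces.

```python
def count_all_tokens(text: str) -> int:
    """Count every whitespace-delimited token, including stand-alone numbers."""
    return len(text.split())


def count_tokens_with_letters(text: str) -> int:
    """Count only tokens that contain at least one letter."""
    return sum(1 for token in text.split() if any(ch.isalpha() for ch in token))


# Hypothetical sample sentence with two stand-alone numbers.
sample = "Order 200 units of item 4711 today"
print(count_all_tokens(sample))           # 7
print(count_tokens_with_letters(sample))  # 5
```

The same text yields two different totals, which is why a quote based on one tool's count cannot simply be compared with a count from another tool.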

  • Zerfass@zaac.de26

    Word Count Examples

    Text element                 | Word                                | Trados
    200                          | 2 words                             | Not counted
    company/name                 | 1 word                              | 2 words
    Segment with an index field  | Contents of index field not counted | Contents of index field counted
    Segment with automatic field | Field contents is a word            | Field contents not counted

  • Zerfass@zaac.de27 Zerfass@zaac.de

    Calculation of Match Rates

    - Segments from the documents are compared with the source language segments in the TM
    - The similarity is given as a percentage
    - The algorithm for the calculation is not published, but it takes the number of words, the number of other elements (numbers) and the length of the segment into account
    - Different tools calculate matches differently, so the match rates are not really comparable
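Since the vendors' algorithms are not published, the following Python sketch only illustrates the general idea of a percentage similarity between two segments. It uses the standard library's `difflib.SequenceMatcher` on word tokens as a stand-in; the function name `match_rate` is hypothetical, and the numbers it produces are not the ones any TM tool would report.

```python
from difflib import SequenceMatcher


def match_rate(tm_segment: str, new_segment: str) -> int:
    """Return a rough word-level similarity between two segments, in percent."""
    tm_words = tm_segment.lower().split()
    new_words = new_segment.lower().split()
    return round(SequenceMatcher(None, tm_words, new_words).ratio() * 100)


tm = "There is information on a new tool."
print(match_rate(tm, tm))                                     # identical: 100
print(match_rate(tm, "There is new information on a tool."))  # one word moved
```

A different similarity measure (character based, tag aware, length weighted) would give a different percentage for the same pair, which is the reason match rates from different tools cannot be compared directly.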

  • Zerfass@zaac.de28

    Match Rate Examples

    Test segment: There is information on a new tool.

    Compared segment                    | SDLX | Trados | Transit | Info
    There is new information on a tool. | 85%  | 92%    | 99%     | 1 word moved
    There is information on a new tool. | 97%  | 99%    | 98%     | Same segment but different formatting

  • Zerfass@zaac.de29 Zerfass@zaac.de

    Repetitions

    A segment that appears in the analyzed documents more than once and does not have a 100% match from the TM is counted as:
    - a no match or a match (if the match rate is above the minimum match value in the TM options) at the first occurrence
    - a repetition from the second occurrence on

    During translation, the first occurrence needs to be translated from scratch or adapted from the match. From the second occurrence on, it will be a 100% match from the TM.

    With large projects, where the documents are split up between different translators, you can extract and translate the repetitions (also called frequent segments) first.

    But the extracted list shows the segments out of context, so it might not be easy to translate all of them first, especially if they are quite short.
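The counting rule for repetitions can be sketched as follows. This is a simplified illustration, assuming exact string comparison and leaving out fuzzy matches; the function name `analyse` and the sample segments are hypothetical.

```python
from collections import Counter


def analyse(segments, tm_sources):
    """Classify each segment as 100% match, repetition or no match."""
    stats = Counter()
    seen = set()
    for segment in segments:
        if segment in tm_sources:
            stats["100% match"] += 1
        elif segment in seen:
            # Second or later occurrence of a segment that is not in the TM.
            stats["repetition"] += 1
        else:
            # First occurrence: translated from scratch (fuzzy matching omitted).
            stats["no match"] += 1
            seen.add(segment)
    return stats


document = ["Click OK.", "Click Cancel.", "Click OK.", "Save the file.", "Click OK."]
print(analyse(document, tm_sources={"Save the file."}))
```

Here "Click OK." is counted once as a no match and twice as a repetition.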

  • Zerfass@zaac.de30

    What influences the statistics?

    Word count
    - Version of the (Trados) software
    - INI file (settings for making text between tags or within tags translatable or non-translatable)
    - Analysis of TTX versus RTF/DOC
    - Sub-segments, like footnotes and index entries
    - Settings in the Filter Settings dialog for translating different file formats in TagEditor (e.g. making hidden layers translatable)
    - Settings for non-translatable text via Word styles

    Segment count
    - Settings for penalties (from alignment)
    - Filter settings to prefer matches with a certain additional field
    - Segmentation rules (like abbreviation lists or different segment end symbols)

  • Zerfass@zaac.de31

    TMX (Translation Memory Exchange)


  • Zerfass@zaac.de32

    Localization Standards

    - TMX: Translation Memory Exchange
    - SRX: Segmentation Rules Exchange
    - OLIF: Open Lexicon Interchange Format
    - TBX: Term Base Exchange
    - XLIFF: XML Localization Interchange File Format (predecessor: OpenTag)

  • Zerfass@zaac.de33

    TMX Translation Memory Exchange

    Developed by OSCAR (Open Standards for Container/Content Allowing Re-use), a group within LISA (Localization Industry Standards Association)

    From the TMX specification: "The purpose of the TMX format is to provide a standard method to describe translation memory data that is being exchanged among tools and/or translation vendors, while introducing little or no loss of critical data during the process."

  • Zerfass@zaac.de34

    What is TMX

    It is an XML representation of translation memory data
    - Header
    - Body

  • Zerfass@zaac.de35

    What is TMX

    Body

    This is the first sentence.

    Dies ist der erste Satz

    tu = Translation Unit, tuv lang = translation unit variant (language), seg = segment

  • Zerfass@zaac.de36

    What is TMX

    - Depending on the tool that created the TMX file, it can be bilingual or multilingual.
    - Importing a multilingual TMX file into a bilingual project will only import the relevant languages.
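The tu / tuv / seg structure and the "only the relevant languages" behaviour can be sketched with Python's standard XML parser. The sample file below is hypothetical (tool name, header values and the French variant are made up for illustration), and the function `extract_pairs` is a minimal sketch, not the import routine of any TM tool.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this namespaced key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical multilingual TMX sample with one translation unit.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="ExampleTool" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en-US" srclang="en-US" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en-US"><seg>This is the first sentence.</seg></tuv>
      <tuv xml:lang="de-DE"><seg>Dies ist der erste Satz.</seg></tuv>
      <tuv xml:lang="fr-FR"><seg>Ceci est la première phrase.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def extract_pairs(tmx_text, source_lang, target_lang):
    """Return (source, target) segment pairs for one language pair only."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                segments[lang] = "".join(seg.itertext())
        src, tgt = source_lang.lower(), target_lang.lower()
        if src in segments and tgt in segments:
            pairs.append((segments[src], segments[tgt]))
    return pairs


print(extract_pairs(SAMPLE_TMX, "en-US", "de-DE"))
```

The French variant is present in the file but is simply skipped when the English/German pair is requested.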

  • Zerfass@zaac.de37

    Levels of TMX

    Level 1: Plain text only (sufficient for data coming from software localization tools)

    Level 2: Text plus formatting (data coming from translation memory tools used for translation of documentation)

    To move formatting and text from one tool to the other, both tools need to be level 2 compliant!

  • Zerfass@zaac.de38

    Level 1

    Formatting that is applied to the source and target text of a translation unit is not exported to the TMX file, only pure text.

    Original: This sentence has some formatting.

    In TMX: This sentence has some formatting.

  • Zerfass@zaac.de39

    Level 2

    - Formatting that is applied to the source and target text of a translation unit is exported to the TMX file.
    - Different tools use different ways of encoding that information.

  • Zerfass@zaac.de40

    TMX from Déjà Vu (Atril)

    Original: This sentence contains different formatting information.

    In TMX from Déjà Vu:

    {1}This {2} sentence {3} contains {4}different {5}{6}formatting information {7}.

    Déjà Vu puts placeholders (ph) where the formatting will go, not the formatting information itself; the formatting information is stored in a separate file.

  • Zerfass@zaac.de41

    TMX from Trados

    Original: This sentence contains different formatting information.

    In TMX from Translator's Workbench:

    This {\b /ut>sentence} contains

    {\i different} {\ul formatting information}.

    This {\b sentence} contains {\i different} {\ul formatting information}.

    Example 1 is from Version 6.5, example 2 from version 7

  • Zerfass@zaac.de42

    TMX from Transit (Star)

    Original: This sentence contains different formatting information.

    In TMX from Transit:

    This sentence contains different formatting information.

    Transit uses the begin paired tag (bpt), the end paired tag (ept) and the information for bold (b), italics (i) and underlined (u).

  • Zerfass@zaac.de43

    TMX from SDLX (SDL)

    Original: This sentence contains different formatting information.

    In TMX from SDLX:

    This sentence 1> contains different 3>.

    SDLX uses placeholders for formatting information that is stored in a different file

  • Zerfass@zaac.de44

    Implications of different tags for formatting

    - Tools that use placeholder tags do not include the actual formatting information in the TMX file
      - Other tools can only re-use the text
      - The result of the exchange is the same as with TMX level 1 (text only)
    - TMX files which carry the actual formatting information will yield better matches in other tools that can read this information
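What "only the text can be re-used" means in practice can be sketched like this: a receiving tool that cannot interpret the inline elements of a level 2 segment keeps only the text around them. The segment below is a hypothetical level 2 example using bpt/ept elements, and `plain_text` is an illustrative helper that ignores the content of all inline elements (it does not handle `hi` elements, which wrap translatable text).

```python
import xml.etree.ElementTree as ET

# Hypothetical level 2 segment: "sentence" is wrapped in a bold tag pair.
LEVEL2_SEG = (
    '<seg>This <bpt i="1">&lt;b&gt;</bpt>sentence<ept i="1">&lt;/b&gt;</ept>'
    " contains formatting.</seg>"
)


def plain_text(seg_xml: str) -> str:
    """Return the segment text with all inline elements dropped."""
    seg = ET.fromstring(seg_xml)
    parts = [seg.text or ""]
    for inline in seg:
        # The element content is a native formatting code; keep only the text after it.
        parts.append(inline.tail or "")
    return "".join(parts)


print(plain_text(LEVEL2_SEG))  # This sentence contains formatting.
```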

  • Zerfass@zaac.de45

    When is it useful?

    - A company moves from tool A to tool B, where tool B cannot import the proprietary TM format of tool A
    - Translators of one project use different tools, or a TM needs to be reused in software localization tools as well as regular translation memory tools
    - Export format after alignment with one tool, to import into another tool
    - TM maintenance, when the TM tool does not offer all functionalities that are needed
    - Bilingual terminology extraction

  • Zerfass@zaac.de46

    Does it work?

    - With the current versions of translation tools on the market it works quite well
    - Previous versions sometimes created their own flavor of TMX, which other tools could not readily import; the export files had to be changed before import (for example, language codes written as en-us versus EN_US)
    - Yes, it does what it was developed for: it makes the exchange of data between tools possible
    - BUT this is only half of the story
    - The question is how well the exchanged data can be used

  • Zerfass@zaac.de47

    Reusing TMX data

    - Although translation memory tools share the same basic idea (storing source-target language pairs and recycling translations), they implement it in different ways.
    - The main issue here is the segmentation rules.

  • Zerfass@zaac.de48

    SRX Segmentation Rules Exchange

    From the SRX specification: "The purpose of the SRX format is to provide a standard method to describe segmentation rules that are being exchanged among tools and/or translation vendors..."

    SRX is intended to enhance the TMX standard.

  • Zerfass@zaac.de49

    Why SRX?

    Tool A: semicolon is end of segment
    - This is a sentence; this is another sentence.
    - TM system sees two separate segments

    Tool B: semicolon is NOT end of segment
    - This is a sentence; this is another sentence.
    - TM system sees one segment
    - No match from the TMX data! (The match rate is around 50%, while the usual minimum match setting is around 70%)
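The effect of the two settings can be reproduced with a small Python sketch. The function `segment` is a hypothetical, deliberately naive segmenter (it splits at whitespace that follows one of the given end characters); real tools apply far more rules.

```python
import re


def segment(text: str, end_chars: str) -> list:
    """Split text at whitespace that follows one of the end-of-segment characters."""
    pattern = r"(?<=[" + re.escape(end_chars) + r"])\s+"
    return [part for part in re.split(pattern, text) if part]


sentence = "This is a sentence; this is another sentence."
tool_a = segment(sentence, ".;")  # semicolon ends a segment
tool_b = segment(sentence, ".")   # semicolon does not end a segment

print(tool_a)  # two segments
print(tool_b)  # one segment
print(set(tool_a) & set(tool_b))  # no segment in common, so no exact match
```

A TM filled by tool A contains the two short segments, while tool B looks up the long one, so the exchanged data produces at best a low fuzzy match.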

  • Zerfass@zaac.de50

    Segmentation rules

    Rules that the tool applies to the text to translate to split it up into segments
    - paragraph
    - sentence
    - phrase
    - incomplete sentences in bulleted lists
    - single words (headings, Note, Attention)

  • Zerfass@zaac.de51

    Segmentation rules

    End of segment rules (common to the default settings of all tools)
    - Dot at the end of a sentence (not after known abbreviations)
    - Question mark, exclamation mark
    - Paragraph mark
    - Colon

    End of segment rules (different for different tools)
    - Semicolon
    - Tab character
    - Sub-segments (index entries, footnotes, graphics)

  • Comparison of default rules

    Character   | Workbench | Transit | DV                         | SDLX                       | Across
    Colon       | end       | end     | end                        | no end                     | no end
    Semicolon   | no end    | end     | end                        | no end                     | no end
    Tab         | end       | no end  | no end                     | no end                     | no end
    Soft return | no end    | no end  | end in Word, no end in PPT | end in Word, no end in PPT | no end

  • Zerfass@zaac.de53

    Settings for better reuse

    - Check the segmentation settings of the source tool, if possible
    - Re-create these settings in the target tool, as far as possible
    - Lower the minimum match value from the default 75% to about 50%
    - For TM data that does not yield useful results, you may have to run an alignment of the original material on the target system

  • Zerfass@zaac.de54

    When is SRX useful?

    - Moving TM data from one tool to another, when the segmentation rules have always been the same and the receiving tool is able to recreate the rules of the exporting tool
      - Because: SRX only transports the rules that are defined at the time of the export from the TM
    - Not many tools can write/read SRX at the moment, so there is only limited experience as of now

  • Zerfass@zaac.de55

    SRX

    SRX is under development at the moment. The SRX file will contain the following information:

    - Definition of the rules of a specific language

    - Definition, how those rules were set at the time of the TMX export

  • Zerfass@zaac.de56

    End rules and exceptions

    Rule: A dot followed by a space is the end of a segment.
    - This is the first sentence. This is the second sentence.

    Exception: A dot preceded by a number is not the end of a segment.
    - Dies ist der 1. Satz. (German: "This is the 1st sentence.")

  • Zerfass@zaac.de57

    End rules and exceptions

    Rule:

    [\.] \s

    Exception for numbers, abbreviations...

    [0-9]+\. \s
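The interplay of the rule and the exception can be sketched in Python. The two patterns below are close equivalents of the ones on the slide, written without the literal space between the dot and `\s`; the function `split_segments` is an illustrative assumption and not how SRX-aware tools are actually implemented.

```python
import re

BREAK_RULE = re.compile(r"\.\s")       # a dot followed by whitespace ends a segment
EXCEPTION = re.compile(r"[0-9]+\.\s")  # ... unless the dot is preceded by a number


def split_segments(text: str) -> list:
    """Split text at every break-rule match that is not covered by the exception."""
    exception_ends = {m.end() for m in EXCEPTION.finditer(text)}
    segments, start = [], 0
    for match in BREAK_RULE.finditer(text):
        if match.end() in exception_ends:
            continue  # the exception wins: no segment break here
        segments.append(text[start:match.start() + 1])
        start = match.end()
    if start < len(text):
        segments.append(text[start:])
    return segments


print(split_segments("This is the first sentence. This is the second sentence."))
print(split_segments("Dies ist der 1. Satz."))
```

The first text is split into two segments, while the German sentence with the ordinal number stays in one piece.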

  • Zerfass@zaac.de58

    What can SRX not do?

    - It can only show the segmentation rule settings at the time of export.
    - It cannot show any changes that have been applied to the segmentation rules during the use of the TM.
    - Sometimes the rules from system 1 cannot be re-created in system 2; then the rule will be ignored.

  • Zerfass@zaac.de59

    Issues with data exchange via TMX


  • Zerfass@zaac.de60

    Initial Situation

    - Translation memory in TMX format (Translation Memory Exchange), created by a translation project in Star Transit with FrameMaker files
    - Import of the TMX file into SDL Trados Translator's Workbench and a Star Transit project
    - Comparison of statistics (word count, match rates) during an analysis (Trados) / import (Transit)

  • Zerfass@zaac.de61

    Issues

    - Word counts differed
    - Match rates differed
    - Prices differed quite a lot

  • Zerfass@zaac.de62 Zerfass@zaac.de

    Goal of the test

    Comparison of
    - Word counts
    - Segment counts
    - Match rates

  • Zerfass@zaac.de63

    Comparison of Match Rates

    - Trados creates an Ancillary file that contains elements from master pages, headers/footers and variables, so that these elements only have to be translated (and counted) once
    - Different segmentation rules (especially for abbreviations) lead to different numbers of segments in Transit and Trados
    - Different representations of the FrameMaker tags in TMX lead to low match rates when a TMX file from Transit is used in the analysis in Trados
    - Different ways of counting words lead to different word counts

  • Zerfass@zaac.de64 Zerfass@zaac.de

    Why are the results so different?

    - Overview of the word count rules and segmentation rules in Transit and Trados
    - Comparison of the segments during Analysis/Import
    - Differences in handling of texts from master pages, headers, footers and variables (Trados Ancillary file)
    - Comparison of the TMX files

  • Zerfass@zaac.de65

    Word Count / Segment Count

    Word count comparison


  • Zerfass@zaac.de66

    Segmentation Rules

    Character   | Trados Workbench | Star Transit | DV                         | SDLX                       | Across
    Colon       | end              | end          | end                        | no end                     | no end
    Semicolon   | no end           | end          | end                        | no end                     | no end
    Tab         | end              | no end       | no end                     | no end                     | no end
    Soft return | no end           | no end       | end in Word, no end in PPT | end in Word, no end in PPT | no end

  • Zerfass@zaac.de67

    Segmentation Rules

    Defining abbreviations
    - During import of files in Transit
    - Separately in the Trados TM

  • Zerfass@zaac.de68

    Segment Comparison


    When reusing a TMX file from Transit (FrameMaker project) Trados very often does not show a match. Only a concordance search shows that the segment is in the TM.

  • Zerfass@zaac.de69

    Comparison of segments within the TM tools

    - Only after changing the minimum match value to a lower value is Trados able to show matches during translation. Match values are often below 50%.
    - The low match rates result from different handling of the tags.
    - As an example: for the same segment, Transit counts 25 tags and Trados counts 36 tags.
    - For every tag difference Trados subtracts a 2% penalty.
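The arithmetic behind this example can be written out as a small sketch. It assumes that the penalty is simply the tag difference multiplied by 2 percentage points and that it is subtracted from an otherwise perfect text match; the function name `tag_penalty` is hypothetical.

```python
def tag_penalty(tags_in_tm: int, tags_in_document: int, penalty_per_tag: int = 2) -> int:
    """Return the penalty in percentage points for a difference in tag counts."""
    return abs(tags_in_tm - tags_in_document) * penalty_per_tag


penalty = tag_penalty(tags_in_tm=25, tags_in_document=36)
print(penalty)        # 22 (11 tag differences at 2% each)
print(100 - penalty)  # 78, before any differences in the text itself
```

Under these assumptions even an identical text cannot score above 78%, and any textual difference pushes the value down further.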

  • Zerfass@zaac.de70

    Comparison of segments within the TM tools

    Match rate 80%

    Transit adds placeholders to the tags themselves.

    Trados shows two tags and the placeholder as a separate letter.

    - Different handling of tags changes the segment structure.
    - Trados shows additional words (gray) and therefore also words that were moved in comparison to the match from the TMX TM.

  • Zerfass@zaac.de71

    Trados Ancillary

    Variables, text from headers and footers, and text from the master pages are saved to a separate file in Trados so that the content only has to be translated once.

  • Zerfass@zaac.de72

    Comparison of TMX DATA

    TMX from Transit

    TMX from Trados

  • Segment in Transit Editor (small tag view)

    Segment in Transit Editor (full tag view)

    Segment in Trados Editor (full tag view)

    Segment in Transit TMX

    Segment in Trados TMX

  • Zerfass@zaac.de74

    Software Localization Tools


  • Zerfass@zaac.de75

    Is there a difference between localization and translation?

  • Zerfass@zaac.de76 Zerfass@zaac.de

    Is there a difference between localization and translation?

    Translation
    - Transfer of text-based information from language A to language B
    - Where necessary, this includes adaptations in
      - contents (to comply with legal standards)
      - form (page layout, page format)
      - address (general public, subject matter experts)
      - non-textual elements (pictures)

    Localization
    - Translation and adaptation of
      - text (software interface and accompanying documentation)
      - software contents (tax calculation for different countries)
      - resizing (so that translated text fits into the button)
      - non-textual elements (icons, graphics, pictures, symbols)
      - (software testing)

  • Zerfass@zaac.de77 Zerfass@zaac.de

    The history of SW L10N

    In the beginning, each software application was developed for one specific country. No thought was given to the possibility of one day selling the software in another country.

    When these ideas came up, it was assumed that users would have to cope with the original user interface language, which was mostly English.

  • Zerfass@zaac.de78 Zerfass@zaac.de

    The history of SW L10N

    - If a different language version was needed, the software was re-created in the target language
      - Disadvantage: different language versions of the software had to be maintained
    - The text that showed on the user interface (UI, GUI) was written into the code of the program (hard-coded)
    - Translating hard-coded text requires a lot of programming knowledge on the side of the translator, so that the program does not break during translation


  • Zerfass@zaac.de80 Zerfass@zaac.de

    CGI code snippet in Perl

    sub print_form {
        my ($content);
        my ($template, $HTML) = @_;
        open (FILE, "

  • Zerfass@zaac.de81

    Hard-coded strings in C Code: Not Internationalized

    source.c (strings are directly in the code)

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        int n;
        char y[5];
        printf("This program converts decimal numbers to hexadecimal\n\n");
        while (1) {
            printf("\nEnter decimal number: ");
            scanf("%d", &n);
            printf("\nNumber entered is %d decimal and %x hexa", n, (unsigned)n);
            printf("\nDo you want to continue? ");
            scanf("%4s", y);
            /* strcmp returns non-zero when the answer is not "yes" */
            if (strcmp(y, "yes")) {
                printf("\n exiting ..\n");
                exit(0);
            }
        }
    }

  • Zerfass@zaac.de82 Zerfass@zaac.de

    source.c

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Message lookup functions: catalog name and message number select the string */
    extern unsigned char *intl_m_msg(), *intl_f_msg();

    int main(void) {
        int n;
        char y[5];
        printf(intl_m_msg("", "mypg", 1));
        while (1) {
            printf(intl_m_msg("", "mypg", 2));
            scanf("%d", &n);
            printf(intl_m_msg("", "mypg", 3), n, n);
            printf(intl_m_msg("", "mypg", 4));
            scanf("%4s", y);
            if (strcmp(y, intl_m_msg("", "mypg", 6))) {
                printf(intl_m_msg("", "mypg", 5));
                exit(0);
            }
        }
    }

    mypg.en

    1  "This program converts decimal numbers to hexadecimal\n\n"
    2  "\nEnter decimal number:"
    3  "\nNumber entered is %d decimal and %x hexa"
    4  "\nDo you want to continue?"
    5  "\nexiting ..\n"
    6  "yes"

    Extract strings to a message file using an extract tool

  • Zerfass@zaac.de83

    mypg.fr

    1  "Ce programme convertit les nombres décimaux en hexadécimal\n\n"
    2  "\nEntrer le nombre décimal:"
    3  "\nLe nombre entré est %d décimal et %x hexadécimal"
    4  "\nVoulez vous continuer?"
    5  "\nSortie ..\n"
    6  "oui"

    source.c

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Message lookup functions: catalog name and message number select the string */
    extern unsigned char *intl_m_msg(), *intl_f_msg();

    int main(void) {
        int n;
        char y[5];
        printf(intl_m_msg("", "mypg", 1));
        while (1) {
            printf(intl_m_msg("", "mypg", 2));
            scanf("%d", &n);
            printf(intl_m_msg("", "mypg", 3), n, n);
            printf(intl_m_msg("", "mypg", 4));
            scanf("%4s", y);
            if (strcmp(y, intl_m_msg("", "mypg", 6))) {
                printf(intl_m_msg("", "mypg", 5));
                exit(0);
            }
        }
    }

    Translate the extracted text and place it back into the code
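The principle of looking up strings by number instead of hard-coding them can be sketched in Python as well. The catalog layout, the function name `get_message` and the fallback to English are assumptions made for this illustration; they do not describe the behaviour of the `intl_m_msg` function used in the C example.

```python
# Hypothetical message catalogs, keyed by language and then by message number.
CATALOGS = {
    "en": {
        2: "\nEnter decimal number:",
        4: "\nDo you want to continue?",
        6: "yes",
    },
    "fr": {
        2: "\nEntrer le nombre décimal:",
        4: "\nVoulez vous continuer?",
        6: "oui",
    },
}


def get_message(lang: str, msg_id: int) -> str:
    """Return the message for a language, falling back to English if it is missing."""
    catalog = CATALOGS.get(lang, {})
    return catalog.get(msg_id, CATALOGS["en"][msg_id])


print(get_message("fr", 6))  # oui
print(get_message("de", 6))  # yes (no German catalog, English fallback)
```

The program code stays identical for every language; only the catalog that is consulted changes.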

  • Zerfass@zaac.de84 Zerfass@zaac.de

    Visual Basic: Example

  • Zerfass@zaac.de85 Zerfass@zaac.de

    User Interface Examples: Copying a File: Four Different Ways

    Action: Copy a file called afile.txt to the floppy drive A: from the hard disk C:

    Command Line/DOS: COPY C:afile.txt A:afile.txt

    Function Key / DOS:

    Menu:

    WIMP:

    F5 menu or dialog

    dialog

  • Zerfass@zaac.de86

    Origin and Evolution of the GUI (Graphical User Interface)

    The software user interfaces we use today are called GUIs (Graphical User Interfaces). Windows, the Macintosh operating system and the Unix operating system all use GUIs. This specific type of GUI is called a WIMP interface:

    Windows

    Icons

    Mouse

    Pointer

  • Zerfass@zaac.de87 Zerfass@zaac.de

    Extracted text for translation

    Excel list, without context information
    - Sequence of occurrence
    - Sorted alphabetically
    - Sorted alphabetically, only one occurrence for each term

  • Zerfass@zaac.de88 Zerfass@zaac.de

    The history of SW L10N

    Next came globally enabled software (internationalized software), where developers took into account that their program would be sold in many different countries and therefore would need to be able to show many different languages on the user interface.
    - Support for a wide range of fonts
    - Support for different code pages (characters)
    - Separation of translatable text from programming code
      - Resource files containing translatable data
      - Binary files containing programming code

  • Zerfass@zaac.de89

    Software Localization Tool

    Editor

    Extraction of translatable text

    Software Localization Tool

    Old source file

    Old target file

    Read in reference material

    Target file

    Generation of target file

    File to translate

    Read in new file

    Translation

  • Zerfass@zaac.de90

    Software Localization: Passolo

  • Zerfass@zaac.de91

    Software Localization: Catalyst

  • Zerfass@zaac.de92

    Features

    For the developer / project manager
    - Preparing files for translation
    - Quality assurance: pseudo-translation
    - Customization
    - Managing translation projects

    For the translator
    - Translation environment
    - Use of translation memory and terminology
    - Quality assurance features
    - Proofreading

    Data exchange

  • Zerfass@zaac.de93

    Preparing Files for Translation

    - Hiding text that was extracted but should not be translated
    - Locking text that should not be translated, but which is important for understanding the context
    - Adding comments or status flags to text

    Screenshot columns: Source, Target, Comment, Status (example entry: &File / &Datei, comment "Changed from Version 1.5!")

  • QA Pseudo Translation

    Checking the original application for:
    - ability to display characters of the target language
    - controls or text fields that are too small to hold translated text

    Check stability of translated software
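A pseudo-translation can be sketched in a few lines of Python. The function `pseudo_translate`, the choice of accented vowels and the 30% padding are assumptions for this illustration; localization tools offer their own, configurable variants.

```python
import math

# Replace plain vowels with accented ones to test character display.
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")


def pseudo_translate(text: str, expansion: float = 0.3) -> str:
    """Return an accented, padded and bracketed version of a UI string."""
    padding = "~" * math.ceil(len(text) * expansion)
    # Brackets make truncated strings easy to spot in the running application.
    return "[" + text.translate(ACCENTED) + padding + "]"


print(pseudo_translate("Open"))   # [Ópén~~]
print(pseudo_translate("&File"))  # the access key marker & is left untouched
```

If the closing bracket is missing on screen, the control is too small; if the accented characters are garbled, the application cannot display the target character set.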


  • Zerfass@zaac.de96 Zerfass@zaac.de

    Text expansion (averages)

  • Zerfass@zaac.de97

    Text expansion by language

    Several sources on the internet show charts for average text expansion.

  • Zerfass@zaac.de98

    Text expansion by language

    http://www.omnilingua.com/resourcecenter/textexpansion.aspx

  • Zerfass@zaac.de99 Zerfass@zaac.de

    Comments

    Bookmarks

  • Zerfass@zaac.de100

    Customization

    - Creating parsers to extract text from customer file formats
    - Ability to call other applications via API
    - Creating macros to facilitate repetitive tasks
    - Using the command line to process files and create projects

  • Zerfass@zaac.de101

    Managing Translation Projects

    Creating translation projects for external translators
    - Creating a package of all files needed for translation, including the original files, reference material, glossaries, macros

    Creating statistics
    - Amount of text to be translated
    - Amount of text that was pre-translated with material from a previous version
    - Repetitions

  • Zerfass@zaac.de102

    Managing Translation Projects

    Data exchange
    - Export of translation lists (source segment / target segment pairs)
    - Import of translation memory data from other translation tools
    - Export of terminology lists
    - Import of terminology lists
    - Recycling of older software versions via alignment

    Update management
    - Updating a project with changed source language files

  • Zerfass@zaac.de103

    Translation Environment

    Editor
    - Translation window
    - Navigation window

    WYSIWYG view of dialogs and menus

    Resizing of controls

  • Zerfass@zaac.de104

    Translation Environment

    Localization of bitmaps, icons and cursors
    - External editor
    - Internal editor

    Automatic pre-translation
    - Recycling translations from previous projects or translation memory data

    Filling terminology lists
    - Adding new terms to terminology lists during translation

  • Zerfass@zaac.de105

    Quality Assurance Features

    Checking features
    - Spell-checking
    - Formatting (e.g. tabs between access key and text)
    - Access keys
      - Same number of access keys in source and target
      - Do all access keys exist?
      - Are all access keys in a dialog unique?
    - Translation status (for review, signed off)
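The access key checks can be sketched as follows. The sketch assumes the Windows resource convention that an ampersand marks the access key (as in "&File" / "&Datei") and that a doubled ampersand stands for a literal "&"; the function names are hypothetical.

```python
import re


def access_keys(text: str) -> list:
    """Return the access key letters of a UI string, in lower case."""
    without_literals = text.replace("&&", "")  # '&&' is a literal ampersand
    return [key.lower() for key in re.findall(r"&(\w)", without_literals)]


def same_number_of_keys(source: str, target: str) -> bool:
    """Check that source and target define the same number of access keys."""
    return len(access_keys(source)) == len(access_keys(target))


def duplicate_keys(dialog_texts: list) -> set:
    """Return the access keys that are used more than once within one dialog."""
    seen, duplicates = set(), set()
    for text in dialog_texts:
        for key in access_keys(text):
            if key in seen:
                duplicates.add(key)
            seen.add(key)
    return duplicates


print(same_number_of_keys("&File", "&Datei"))                # True
print(same_number_of_keys("&File", "Datei"))                 # False
print(duplicate_keys(["&Datei", "&Drucken", "&Bearbeiten"]))  # {'d'}
```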

  • Zerfass@zaac.de106

    Proofreading

    - Proofreading the translated software is made easier by the ability of the software localization tool to show the menus and dialogs as they will appear at runtime.
    - No more proofreading of Excel lists where the context of the text is not apparent.

  • Zerfass@zaac.de107

    Data exchange

    - Most tools have their own export/import format
    - Most tools support the export/import format of standard tools (Trados, Star)
    - Most tools support TMX (translation memory exchange format) for importing and/or exporting data
    - Some tools differentiate between export of terminology data and export of translation memory data

  • Zerfass@zaac.de108

    Help/Manual FR

    Documentation

    First localize the software Then localize help and documentation

    SW Development

    Build FR

    Project Workflow

  • Zerfass@zaac.de109

    Localization Processes

    Pseudo translation

    Translation Memory Tool

    Export of Terminology

    Export of segment pairs for TM

    Import into TM

    Import into termbank

    Translation of Online-help, readme files,

    manuals, webpages, packaging...

    Translation of software

    Extract translatable segments

    File preparation

    Software Localization Tool

  • Zerfass@zaac.de110

    Reuse of translations

    Build FR Version 2

    Database

    Software Updates

  • Zerfass@zaac.de111

    Differences between Localization Tools and

    Translation Tools

  • Zerfass@zaac.de112

    Localization Tools

    - Extract translatable text by ID numbers
    - Segmentation of the file is predefined by the file format itself. Segment pairs are identified by their ID numbers
    - As text in software tends to consist of single words or short phrases, there is no separate management of terms and sentences
    - Translation of update files means that already translated IDs are not touched. Only new or changed text will be touched.
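The ID-based update logic can be sketched with plain dictionaries. The function `update_project`, the status labels and the sample strings are assumptions for this illustration and do not reflect the data model of any particular localization tool.

```python
def update_project(old_source: dict, old_target: dict, new_source: dict) -> dict:
    """Carry over translations for unchanged IDs; flag new or changed text."""
    result = {}
    for string_id, text in new_source.items():
        unchanged = old_source.get(string_id) == text
        if unchanged and string_id in old_target:
            result[string_id] = (old_target[string_id], "translated")
        else:
            result[string_id] = (text, "needs translation")
    return result


old_source = {1: "File", 2: "Open", 3: "Save"}
old_target = {1: "Datei", 2: "Öffnen", 3: "Speichern"}
new_source = {1: "File", 2: "Open...", 4: "Print"}

for string_id, entry in update_project(old_source, old_target, new_source).items():
    print(string_id, entry)
```

ID 1 keeps its translation, ID 2 is flagged because the source text changed, ID 4 is new, and ID 3 disappears because it is no longer part of the software.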

  • Zerfass@zaac.de113

    Terminology Management Tools


  • Zerfass@zaac.de114

    Terminology management

    Term extraction (monolingual and bilingual)
    - creation of term lists from source documents or translation memories

    Term lists / term bases
    - connect to an editor (source document creation) or to the TM systems and localization tools during translation

    Term check
    - ensure the consistent use of terms over the whole project
    - check for the use of forbidden terms

  • Zerfass@zaac.de115

    Terminology Management Tools

    Extraction


  • Zerfass@zaac.de116

    Monolingual Extraction

    Extraction of terms from documents in one language.

    Creation of term lists
        important terms
            Who defines what is important? How can a tool know what is important?
        frequent terms
            What is frequent? 3 times / 10 times? Are frequent terms also important?
        new terms
            According to whose level of subject matter knowledge? Compared to which term list / term database?

  • Zerfass@zaac.de117

    Bilingual Extraction

    Term extraction from bilingual sources like translation memory files or bilingual translation files
    Creation of parallel lists of terms and their translation(s)
        All forms of the term and all its translations
        Only basic form
        Most frequent translation of source term

  • Zerfass@zaac.de118

    Term extraction issues

    Terminology extraction is a highly individual process
        Goal of extraction, subject matter expertise, available time
    Tools use different methods for terminology extraction
        Concordance, statistics, linguistics
    Tools support different file formats for extraction and export
        Monolingual, bilingual, export formats
    Tools sometimes don't show the context from which the term was taken

  • Zerfass@zaac.de119

    Term Extraction Tools

    Assistance for manual extraction
    Concordance tools
        Extraction of all term combinations
    Statistical extraction tools
        Frequent terms
        All languages
    Linguistic extraction tools
        Extraction of noun phrases
        Supported languages only

  • Zerfass@zaac.de120

    Manual Extraction

    Human reads the text, understands the meaning and selects terms (or term pairs) according to previous knowledge of the subject matter and/or the goal for the extraction.
        List of standard terms
        List of company terms
        List of new terms
        Additional information like source, context

    example

  • Zerfass@zaac.de121

    Tools assisting manual extraction

    Tools that connect to an editor and allow the collection of terms or term pairs

    Translation memory tools that save terms and term pairs directly into the term database component

    Term checking tools that report missing terms / translations

  • Zerfass@zaac.de122

    Manual extraction

    Time consuming
    Resource intensive
    Subject matter and language expertise required
    Most accurate regarding the goal
    Individual goals can be set

  • Zerfass@zaac.de123

    Concordance Tools

    Automatic creation of a list of all terms and term combinations from a document
    No term is missed
    Long list of terms
    Manual selection process necessary
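    A minimal sketch of what "all terms and term combinations" means in practice. The function name `term_combinations` and the limit `max_words` are assumptions made for this illustration; real concordance tools add context display, sorting and filtering on top of this.

```python
import re


def term_combinations(text, max_words=3):
    """Return every word and every run of up to max_words adjacent words."""
    # Letters only; numbers and punctuation are ignored in this sketch.
    words = re.findall(r"[^\W\d_]+", text.lower())
    combos = set()
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            combos.add(" ".join(words[i:i + n]))
    return sorted(combos)


for candidate in term_combinations("Connect the power cord", max_words=2):
    print(candidate)
```

    Even this four-word sentence yields seven candidates, which shows why the resulting list is long and why a manual selection step is necessary.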

  • Zerfass@zaac.de124

    Concordance results (Simple Concordance Program SCP)

  • Zerfass@zaac.de125

    TM tool

    Term can consist of up to X words
    Terms that already exist in the database are not extracted
    Extraction from all files of a project (various file formats)

  • Zerfass@zaac.de126

    Extraction with TM tool

    Export of term list for translation
    Export of term list to term database

  • Zerfass@zaac.de127

    Statistical Extraction Tool

    Monolingual and bilingual extraction
    Terms that occur more than X times are extracted
    List of frequent terms: frequent terms are seen as important
    Important terms / new terms that appear in this document less than X times are not extracted
    Can be used for any language
    List of term candidates must be checked by a human with subject matter and language expertise
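    The frequency rule above can be sketched in a few lines. The function name `frequent_terms` and its parameters are hypothetical; the sample is deliberately nonsense text, to show that a purely statistical approach needs no knowledge of the language.

```python
import re
from collections import Counter


def frequent_terms(text, threshold=2, max_words=2):
    """Return candidates (up to max_words long) occurring more than threshold times."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter()
    # Simplification: word combinations are also counted across sentence ends.
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return {term: count for term, count in counts.items() if count > threshold}


sample = (
    "Vot gnig harengoga fuor tok gnig nor shewerginhatz. "
    "Mirhon bortup tip trewshu gnig batbo loqtet. "
    "Bortup ter, bortup nofdas, semsel nih furpo ayano bliktreptat."
)
print(frequent_terms(sample, threshold=2))
```

    Whether the two frequent candidates are actually important terms is something only a human with subject matter and language expertise can decide.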

  • Zerfass@zaac.de128

    Terminology Tool of TM Suite (Déjà Vu, Lexicon)

    Settings for number of words per term
    Settings for frequency

  • Zerfass@zaac.de129

    Bilingual Extraction Results

  • Zerfass@zaac.de130

    SDL MultiTerm Extract

  • Zerfass@zaac.de131

    This is how a text looks to a statistical extraction tool

    Vot gnig harengoga fuor tok gnig nor shewerginhatz. Mirhon bortup tip trewshu gnig batbo loqtet. Bortup ter, bortup nofdas, semsel nih furpo ayano bliktreptat. Mirhon granbevtrov driktopret grig go wasbrekit mut mirkep taptro gnig suf. Aktrep zitpek nitnit bortup mil. Setrimb ak troptan bur metlatkento.

  • Zerfass@zaac.de132

    Linguistic Extraction Tool

    Tool knows about the structure of the language
    Extracted terms can be reduced to their base form with the help of dictionaries and rules
    User can define the rules used for extraction
    Extraction limited to supported languages

  • Zerfass@zaac.de133

    Linguistic Settings

    Extraction according to specific rules of the language
    Frequency settings

  • Zerfass@zaac.de134

    Results of Extraction with Context Window (TerminologyWizard)

  • Zerfass@zaac.de135

    Linguistic Extraction Tools

    Translations of terms come from the extraction files and internal dictionaries
    Each term is shown with its context and a grammatical analysis
    Results of extraction
        List of one-word terms
        List of multi-word terms
        List of context sentences

    Export and view can be filtered

  • Zerfass@zaac.de136

    SDL PhraseFinder

  • Zerfass@zaac.de137

    File Formats

    TM tools extract from every file format they support
    Concordance tools are usually limited to text or Word files
    Bilingual extraction can be produced from bilingual file formats like translation memories, project files of a TM tool or bilingual translation files, but not from two separate files
    Export usually in Excel, tab-delimited TXT or directly into the terminology component

  • Zerfass@zaac.de138

    Conclusion

    No one tool can do what a human can do, but depending on the goal, the tools can help to automate repetitive tasks and comparisons with stop lists and/or term bases
        Concordance tools extract all words and provide filter and search settings for the view of the term list
        Statistical tools offer settings for frequencies, term length and comparison with stop word lists or existing term lists / term databases
        Linguistic tools can be customized by rules for the extraction, which could be different for various languages and use language-specific dictionaries

  • Zerfass@zaac.de139

    Terminology Management Tools

    Databases

  • Zerfass@zaac.de140

    Term Base Entry (MultiTerm)

    Text information

    Synonyms

    Category

    Category

  • Zerfass@zaac.de141

    Term Base Entry (Termstar)

  • Zerfass@zaac.de142

    Setup of term lists / term bases

    Subject matter or company specific terms
        ProductName, Company - Name
        deactivate checkbox YY, activate checkbox YY
        Click inside checkbox YY twice
    Collect terms, synonyms, abbreviations
        screen, scr., monitor
    Base form of the term
    Decide on term status field
        forbidden, deprecated, pending, confirmed by, used by us, used by competitor

  • Zerfass@zaac.de143

    Setup of term lists / term bases

    Add as many information fields as needed
        Note, definition, context example, source
        Information should help authors and translators to decide if and when to use a term / translation
    Not all terminology management tools allow user-defined fields
    Keep in mind that the more fields you have, the more time you need to schedule for filling and maintaining the information in those fields

  • Zerfass@zaac.de144

    Terminology Management Tools

    Retrieval

  • Zerfass@zaac.de145

    Terminology Retrieval

    Term components of translation memory tools
        Standalone tools or integrated term modules
    Search sentences to translate for terms from the term base / term list
    Pasting translations of terms into the translation / sending new term pairs to the term base or term list
    Fuzzy matching of terminology
    Filtering terms (ex: per product)
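    Fuzzy matching of terminology can be approximated with Python's standard `difflib` module. This is a sketch under assumptions: the term base is a plain dictionary, only single-word terms are handled, and the similarity cutoff of 0.8 is an arbitrary choice. Commercial tools use their own matching methods.

```python
import difflib


def lookup_terms(segment, term_base, cutoff=0.8):
    """Find term base entries that exactly or nearly match words in a segment."""
    lowered = {source.lower(): source for source in term_base}
    hits = {}
    for word in segment.lower().split():
        word = word.strip(".,;:!?()\"'")
        # get_close_matches returns the best candidates above the cutoff.
        for match in difflib.get_close_matches(word, list(lowered), n=1, cutoff=cutoff):
            source = lowered[match]
            hits[source] = term_base[source]
    return hits


term_base = {"printer": "Drucker", "adapter": "Adapter"}  # hypothetical entries
print(lookup_terms("Connect the printers to the adaptor.", term_base))
```

    The plural "printers" and the spelling variant "adaptor" are both found, which an exact lookup would miss.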

  • Zerfass@zaac.de146

    Retrieving terms during translation

    Sending terms to the term base

  • Zerfass@zaac.de147

    Term retrieval

  • Zerfass@zaac.de148

    Terminology Management Tools

    Checking

  • Zerfass@zaac.de149

    Source Language Terminology Check

    Check the source language documentation for consistent use of terms in the term list, use of forbidden terms, use of synonyms

    Checking tools with an interface to your authoring system for terminology checks and grammar checks. (checks are customizable to your own rules)

    Example sentence: Connect the AC power cord to the AC power adapter, then to the back of the photo printer

    Terminology
        AC power cord / AC power cable
        power cord / power cable
        cord / cable
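    A source language check of this kind can be sketched as a lookup of deprecated variants. The slide does not say which variant is the approved one, so the mapping below (the "cable" variants are deprecated, the "cord" variants approved) is purely an assumption for illustration, as is the function name `check_source_terms`.

```python
import re


def check_source_terms(text, deprecated):
    """Report deprecated variants found in text together with the approved term."""
    findings = []
    taken = [False] * len(text)  # characters already covered by a longer match
    # Longest variants first, so "AC power cable" wins over plain "cable".
    for variant in sorted(deprecated, key=len, reverse=True):
        pattern = r"\b" + re.escape(variant) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            if any(taken[match.start():match.end()]):
                continue
            for i in range(match.start(), match.end()):
                taken[i] = True
            findings.append((match.start(), variant, deprecated[variant]))
    return [(variant, approved) for _, variant, approved in sorted(findings)]


# Hypothetical term list: deprecated variant -> approved term
deprecated = {
    "AC power cable": "AC power cord",
    "power cable": "power cord",
    "cable": "cord",
}
text = "Connect the AC power cable to the adapter, then plug the cable into the printer."
for variant, approved in check_source_terms(text, deprecated):
    print(f"use '{approved}' instead of '{variant}'")
```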

  • Zerfass@zaac.de150

    Source Language Terminology Check

    Authoring Memory Systems to check consistent use of standard sentences
        Place the DVD in the.

  • Zerfass@zaac.de151

    Target Language Terminology Check

    Checking the usage of the terms in the target language (in Translation Memories, bilingual files)

    Is the target language term from the term base used?

    Are there synonyms for the target language term?

    Is there a target language term in the translation where the source term from the term base does not appear in the source segment of the text?

    Has a forbidden term been used?
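    The questions on this slide translate into three simple checks per segment pair. The sketch below uses plain substring comparison, so it ignores inflection and compound words; the function name `check_segment`, the term base dictionary and the issue labels are hypothetical.

```python
def check_segment(source, target, term_base, forbidden=()):
    """Run a term check on one source/target segment pair."""
    issues = []
    src, tgt = source.lower(), target.lower()
    for source_term, target_term in term_base.items():
        in_source = source_term.lower() in src
        in_target = target_term.lower() in tgt
        if in_source and not in_target:
            # Term base translation was not used in the target segment.
            issues.append(("term not used", source_term, target_term))
        elif in_target and not in_source:
            # Reverse check: target term appears without its source term.
            issues.append(("reverse check", source_term, target_term))
    for bad in forbidden:
        if bad.lower() in tgt:
            issues.append(("forbidden term", bad, None))
    return issues


term_base = {"building": "Gebäude", "screen": "Bildschirm"}  # hypothetical entries
print(check_segment("The building is closed.", "Das Haus ist geschlossen.", term_base))
print(check_segment("Switch on the screen.", "Schalten Sie den Monitor ein.",
                    term_base, forbidden=["Monitor"]))
```

    The check for synonyms of the target term is left out here, because it needs synonym information in the term base.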

  • Zerfass@zaac.de152

    Example (Wordfast)

    Translation from term list not found

  • Zerfass@zaac.de153

    Example (Wordfast)

    Blacklisted term found

  • Zerfass@zaac.de154

    Example (Quintilian, Add-in for Word)

    Hit or Miss for terms from Excel list

  • Zerfass@zaac.de155

    Example (across)

    Wrong translation (building = Gebäude)

    Missing translation

  • Zerfass@zaac.de156

    Example (Transit)

    Wrong translation (building = Gebäude)

    Several translations

    Missing translation in term database

  • Zerfass@zaac.de157

    Examples (Déjà Vu)

    Wrong translation (building = Gebäude)

    Examples (SDLX)

    Several translations for one word

  • Zerfass@zaac.de158

    Examples (Trados)

    - Forbidden term (Monitor)
    - Wrong term (Auswahlschalter)
    - No target (menu)

  • Zerfass@zaac.de159

    Setup of term list / term base

    In order to check for forbidden terms, these terms need to be marked with an attribute.

  • Zerfass@zaac.de160

    Term check in QA tools

    File formats
        Bilingual files
            TRADOS DOC/RTF and TTX
            STAR language files
        TM in TMX or TRADOS TXT format
    Checks
        Terms from term list / term base not used
        Terms without target equivalent in term list / term base
        Target term present but source term not present (reverse check)

  • Zerfass@zaac.de161

    Example (ErrorSpy)

    Excel term list / suffix and prefix lists or MultiTerm term base

  • Zerfass@zaac.de162

    Example (QA Distiller)

    Excel term list converted to internal dictionary

  • Zerfass@zaac.de163

    Summary

    Each checking routine only checks some possibilities, none checks the whole range
        Missing translation in term list / term base
        Term with several translations
        Term with wrong translation
        Missing source term (reverse check)

  • Zerfass@zaac.de164

    Terminology Management Tools

    Exchange

  • Zerfass@zaac.de165

    Terminology Exchange

    Most tools can export and import tab-delimited text files
    Some tools offer the TBX (TermBase eXchange) format or a similar XML format for data exchange
        Transit, Heartsome, across
        MultiTerm (XML format, similar to TMX)
    TBX from term databases could also be useful for:
        Extracting stop word lists for extraction tools
        Re-use as dictionary in machine translation systems
        Displaying terminology information online
        Optimizing search engines (keywords) and text mining tools

  • Zerfass@zaac.de170

    TBX TermBase Exchange

    From the TBX Specification
        TBX is an open XML-based standard format for terminological data
        TBX is designed to support the analysis, representation, dissemination, and exchange of information from human-oriented terminological databases (term bases)
    TBX is built on the basis of ISO 12620 (data categories) and ISO 12200 (MARTIF - Machine-readable Terminology Interchange Format, core structure)

  • Zerfass@zaac.de171

    Structure of a TBX file

  • Zerfass@zaac.de172

    Structure of a TBX file

  • Zerfass@zaac.de173

    TMX TBX

  • English term

    French term

    Global information in the entry head

    Term level information

    Administrative data of a language

    Language ID

    Language ID
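    The levels labelled on this slide (entry-level information, language sections with a language ID, term-level information) can be illustrated with a small entry built in Python. The element names used here (`termEntry`, `langSet`, `tig`, `term`, `descrip`) follow the MARTIF-based TBX core structure as far as known to the editor; treat them as an assumption and check them against the TBX specification before relying on them. The function name and sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this attribute name as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_term_entry(entry_id, subject, terms):
    """Build one concept entry with a language section per language."""
    entry = ET.Element("termEntry", {"id": entry_id})
    # Global information in the entry head (assumed data category).
    ET.SubElement(entry, "descrip", {"type": "subjectField"}).text = subject
    for lang, term in terms.items():
        lang_set = ET.SubElement(entry, "langSet", {XML_LANG: lang})  # language ID
        tig = ET.SubElement(lang_set, "tig")  # term-level information goes here
        ET.SubElement(tig, "term").text = term
    return entry


entry = build_term_entry("c1", "hardware", {"en": "screen", "fr": "écran"})
print(ET.tostring(entry, encoding="unicode"))
```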

  • Zerfass@zaac.de175

    Conversion from MultiTerm to TBX (Medtronic)

  • Zerfass@zaac.de176

    Medtronic Mapping Table

  • Zerfass@zaac.de177

    Terminology Processes

    [Process diagram labels: Authoring; Term check during authoring; Terminology extraction; Term list; Terminology approval; Import; Terminology Database; Translation and Terminology Check; Term list (term translations, new terms, change requests); Import of translations; Term list (new terms, change requests); Online publication of term database (intranet/internet)]

  • Zerfass@zaac.de178

    Terminology Processes

    Be aware that terminology work requires a lot of resources
    Always include everybody who has to deal with terminology (authoring, production, development, marketing, sales, translation)
    There needs to be one person responsible for terminology for each language

  • Zerfass@zaac.de179

    Terminology Costs

  • Zerfass@zaac.de180

    Terminology Cost

    About 10 to 15 Euro per source term including definition
    About 20 terms a day
    Between about 1 and 1.5 Euro per translation into one language

    Changing a term during a translation project will cost about $1000 per term per language (JDEdwards)

  • Zerfass@zaac.de181

    Sample calculation

    1000 source terms
    5 target languages
    Rate per hour: 50 Euro
    Initial corpus of terms: 100 terms per hour
    Term maintenance (hours per year and language with 1000 base terms): 12
    (study by tekom regional group Saxony, Germany)
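    The slide gives the input figures but not the resulting totals, so the arithmetic below is one possible reading, not the result of the tekom study. It assumes that the rate of 100 terms per hour applies once per target language and that maintenance is counted for the target languages only; the function name and the result keys are hypothetical.

```python
def terminology_costs(source_terms, target_languages, hourly_rate,
                      terms_per_hour, maintenance_hours_per_language):
    """Estimate initial and yearly terminology costs in Euro (assumed model)."""
    # Assumption: the initial corpus is processed once per target language.
    initial_hours = source_terms / terms_per_hour * target_languages
    # Assumption: maintenance hours apply per target language and year.
    maintenance_hours = maintenance_hours_per_language * target_languages
    return {
        "initial_cost": initial_hours * hourly_rate,
        "yearly_maintenance_cost": maintenance_hours * hourly_rate,
    }


costs = terminology_costs(source_terms=1000, target_languages=5, hourly_rate=50,
                          terms_per_hour=100, maintenance_hours_per_language=12)
print(costs)
```

    Under these assumptions the initial corpus takes 50 hours and maintenance takes 60 hours per year; a different reading of the figures (for example, including the source language) changes the totals.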

  • Zerfass@zaac.de182

    Alignment

  • Zerfass@zaac.de183

    Alignment

    Old source and target language documents are read into the alignment component of the TM tool
    The tool segments the files and tries to connect the segments that belong together, thus creating segment pairs
    A translator checks the alignment
    Results are imported into a TM system for reuse with new translations
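    The segment-and-connect step can be sketched very naively as follows. Real alignment components use far more elaborate methods (formatting, numbers, many-to-one matches); this sketch simply pairs segments by position and flags pairs whose lengths differ suspiciously. The function names and the ratio of 3.0 are assumptions.

```python
import re


def split_segments(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def align(source_text, target_text, max_ratio=3.0):
    """Pair segments by position; mark pairs with plausible length ratios."""
    src = split_segments(source_text)
    tgt = split_segments(target_text)
    pairs = []
    for s, t in zip(src, tgt):
        ratio = max(len(s), len(t)) / max(1, min(len(s), len(t)))
        pairs.append((s, t, ratio <= max_ratio))  # False = check by hand
    same_count = len(src) == len(tgt)  # differing counts hint at merged segments
    return pairs, same_count


pairs, same_count = align("Open the file. Save it.",
                          "Öffnen Sie die Datei. Speichern Sie sie.")
for source, target, plausible in pairs:
    print(source, "|", target, "|", plausible)
```

    The flags only guide the translator who checks the alignment; they do not replace that check.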

  • Zerfass@zaac.de184

    Example: Déjà Vu

  • Zerfass@zaac.de185

    SDLX

  • Zerfass@zaac.de186

    Alignment

    Before translation, to add new segment pairs to a TM
    After translation, to get the really final segment pairs into the TM
        As target language documents tend to get corrected after translation as well
    Export of alignment to TMX for terminology extraction

  • Zerfass@zaac.de187

    Translation Tools

  • Tools in the localization process

    [Diagram labels: Source language files, Target language files, Alignment, Terminology Extraction, Translation Memory, Term base / term list, Editor of TM tool, Software Localization Tool, Possible file preparation (possible text extraction, file conversion), Creation of target language file, DTP]

  • Zerfass@zaac.de189

    Translation Tools

    Utilities

  • Zerfass@zaac.de190

    Utilities

    Word count tools
    Conversion tools (from PDF to Word)
    Extraction tools for text extraction from certain file formats
    QA tools for bilingual material (TMs)
    Macros, self-developed tools

  • Zerfass@zaac.de191

    Word count tools

    Different tools count differently, because they have different definitions of what a "word" is.

    Some count stand-alone numbers and symbols (like Word), others don't (like Trados)

    Always agree on the tool (and better still on the tool version) that both sides use for counting words
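    How much the definition of a "word" matters can be shown with two simple counters. Neither reproduces the actual counting rules of Word or Trados; they are hypothetical definitions that only illustrate the effect of counting or skipping stand-alone numbers and symbols.

```python
import re


def count_all_tokens(text):
    """Count everything separated by whitespace, including numbers and symbols."""
    return len(text.split())


def count_words_only(text):
    """Count only tokens that contain at least one letter."""
    return sum(1 for token in text.split() if re.search(r"[^\W\d_]", token))


text = "Press F1 or wait 30 seconds - see page 12 ."
print(count_all_tokens(text), count_words_only(text))
```

    The same sentence yields two different totals, which is why both sides should agree on the counting tool before a quote is made.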

  • Zerfass@zaac.de192

    Word count tools

    Freebudget, http://www.webbudget.com/fb4dload.htm
    TextCount, http://www.textcount.com/html/textcount-en.html
    AnyCount, http://www.anycount.com/
    TotalAssistant, http://www.surefiresoftware.com/totalassistant/
    PractiCount, http://www.practiline.com/
    Online Word Count, http://allworldphone.com/count-words-characters.htm
    WordCalc, Syllable and word count, http://www.wordcalc.com/
    Online Word Count Tool, http://www.wordcounttool.com/

  • Zerfass@zaac.de193

    Conversion / Text Extraction Tools

    Converting PDF to
    Batch converting files to a specific format (FrameMaker to MIF, InDesign to INX, Word to RTF)
    Text extraction for translation (Copyflow for QuarkXPress, Software strings to Excel/XML)

  • Zerfass@zaac.de194

    QA and TM Maintenance Tools

    Checking for consistent punctuation (same as in source or target language specific)
    Checking numbers
    Checking for missing translations
    Checking for same source / target
    Checking for same source / different target
    Search / Replace functions
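    Several of these checks are simple enough to sketch on a list of segment pairs. The function name `qa_check`, the issue labels and the sample data are hypothetical; the punctuation check is left out because its rules are language specific.

```python
import re
from collections import defaultdict


def qa_check(pairs):
    """Run basic QA checks on (source, target) segment pairs."""
    issues = []
    translations = defaultdict(set)
    for index, (source, target) in enumerate(pairs):
        if not target.strip():
            issues.append((index, "missing translation"))
            continue
        if source.strip() == target.strip():
            issues.append((index, "same source / target"))
        # Numbers should appear in both segments (order may differ).
        if sorted(re.findall(r"\d+", source)) != sorted(re.findall(r"\d+", target)):
            issues.append((index, "number mismatch"))
        translations[source].add(target)
    for source, targets in translations.items():
        if len(targets) > 1:
            issues.append((source, "same source / different target"))
    return issues


pairs = [
    ("Wait 30 seconds.", "Warten Sie 20 Sekunden."),
    ("OK", "OK"),
    ("Cancel", ""),
    ("Save", "Speichern"),
    ("Save", "Sichern"),
]
for issue in qa_check(pairs):
    print(issue)
```

    Not every hit is an error: "OK" may legitimately stay untranslated, so the results are a list for a human to review.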

  • Zerfass@zaac.de195

    QA and TM Maintenance Tools

    ErrorSpy, DOG, http://dog-gmbh.com/index.php?id=44&L=1

    QA Distiller, Yamagata, http://www.qa-distiller.com/

    Quintilian, TerminologyMatters, http://www.terminologymatters.com/quintilian.html

    BlackJack, ITR, http://www.itr.co.uk/en/translation-quality-assurance.html

    Olifant, Enlaso Tools, http://www.translate.com/technology/tools/Olifant.html

  • Zerfass@zaac.de196

    Translation Tools

    Workflow Management / Project Management Tools

  • Zerfass@zaac.de197

    Project Management and Workflow Tools

    Project Creation in TM tool
        packaging of project files
        inserting translated project files
    Project Management Tools
        Offers and invoicing
        Data on customers and vendors
    Workflow Tool
        Automation of processes (file conversion, pre-translation, packaging, sending out package to assigned translator)

  • Zerfass@zaac.de198

    Project / Workflow Management

    Batch processing
    Multiple TMs / Term databases
    Online tracking of project status
    Automating sequences of steps
        Word count
        pre-translation
        copying files to translate to different language folders
    Rights and roles

  • Zerfass@zaac.de199

    What is a Workflow?

    A workflow defines the order (rules) in which specific processes (consisting of separate tasks) are performed to achieve a defined result.
    Each process consists of individual tasks.
    Each task is associated with a specific resource.
    Every player in the workflow has a specific role with certain rights.
    In order to describe a workflow
        Define your processes that make up the workflow
        Standardize processes
        Break down each process into individual tasks
        Assign the task to a player in the process
        Assign a role to the player (defining the rights)

    This means that you need to take a very detailed look into the processes involved to be able to define a standard workflow.
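    The hierarchy described here (workflow, processes, tasks, players with roles) maps naturally onto a small data structure. The class names, fields and sample entries below are hypothetical and only meant to show how such a description could be written down before any workflow system is chosen.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    player: str  # who performs the task
    role: str    # role that defines the player's rights


@dataclass
class Process:
    name: str
    tasks: list = field(default_factory=list)


@dataclass
class Workflow:
    name: str
    processes: list = field(default_factory=list)

    def ordered_steps(self):
        """List (process, task) pairs in the defined order."""
        return [(p.name, t.name) for p in self.processes for t in p.tasks]

    def tasks_for(self, player):
        """List the tasks assigned to one player."""
        return [t.name for p in self.processes for t in p.tasks if t.player == player]


workflow = Workflow("Localization project", [
    Process("File preparation", [
        Task("Get file list", "project manager", "manager"),
        Task("Pre-translate files", "engineer", "engineer"),
    ]),
    Process("Translation", [Task("Translate", "translator", "translator")]),
])
print(workflow.ordered_steps())
print(workflow.tasks_for("translator"))
```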

  • Zerfass@zaac.de200

    Breaking Down a Project

    Localization Project
    Define the processes
        Terminology questions
        File preparation
        File handling
        Translation
        Proofreading / editing
        Quality assurance

  • Zerfass@zaac.de201

    Process: File Preparation

    Define the tasks
    Get file list with file formats
    Check files for translatability
        Simulated translation for software files
        Check for text on graphics in documentation
        Check if there is enough space for text expansion
        Check for consistent use of terminology
    Define what file format needs what kind of preparation
        Convert files / extract text / mark translatable and untranslatable text
        Create additional files like settings for use in certain translation tools
        Description for the translators on how to handle the files
    Get statistics from the files (number of words, number of repeated sentences, number of sentences that will come from the translation memory system as a match / translation suggestion)
    Pre-translate files with existing translation memories

  • Zerfass@zaac.de202

    Automation?

    Define the tasks
    Get file list with file formats
    Check files for translatability
        Simulated translation for software files
        Check for text on graphics in documentation
        Check if there is enough space for text expansion
        Check for consistent use of terminology
    Define what file format needs what kind of preparation
        Create additional files like settings for use in certain translation tools
        Description for the translators on how to handle the files
        Convert files / extract text / mark translatable and untranslatable text (automate)
    Get statistics from the files (number of words, number of repeated sentences, number of sentences that will come from the translation memory system as a match / translation suggestion) (automate)
    Pre-translate files with existing translation memories (automate)

  • Zerfass@zaac.de203

    How to Automate?

    Simple macros
    Create scripts or small tools
    Use the API of translation tools to automate several steps in one go
    Set up a workflow management system
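    A "script or small tool" can be as small as the following sketch, which automates one repetitive step: copying the files to translate into one folder per target language. The function name, folder layout and language codes are assumptions for illustration.

```python
import shutil
from pathlib import Path


def prepare_language_folders(source_dir, target_root, languages):
    """Copy every file in source_dir into target_root/<language>/."""
    source_dir, target_root = Path(source_dir), Path(target_root)
    copied = []
    for language in languages:
        language_dir = target_root / language
        language_dir.mkdir(parents=True, exist_ok=True)
        for path in sorted(source_dir.iterdir()):
            if path.is_file():
                destination = language_dir / path.name
                shutil.copy2(path, destination)  # copy2 keeps file timestamps
                copied.append(destination)
    return copied


# Example call with hypothetical folder names:
# prepare_language_folders("project/source", "project/target", ["fr", "de", "ja"])
```

    Steps like word counts or pre-translation need the translation tool itself, which is where the API of the tool or a workflow management system comes in.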

  • Zerfass@zaac.de204

    When to use Workflow (Automation)?

    Whenever there is a set of processes that will be used over and over again, this can be thought of as a workflow.
        the rules are set
        the process sequence is always the same
        each process has its pre-defined set of tasks
        for each task you can decide who is going to do it
        it can be documented and learned

  • Translator has a question on a term

    Agency: Translator asks the project manager of the translation agency
    Client: Agency PM asks the PM at the customer
        Client authoring / engineering department: for terminology questions in the source language
        Client market center: for terminology questions in the target language

  • Automation through a central connection point and roles for each user

    Translator has a question on a term
        Translator adds term questions to an online list.
        Translator sends questions to agency; Agency adds term questions to an online list.
    Central connection point
        Client PM
        Client authoring department: source language questions
        Client market center: target language questions

  • Zerfass@zaac.de207

    Summary

    Workflow management is work.
    Many workflow management systems provide a platform for describing and automating parts of a workflow, not a ready-to-fly automated system.
    Get the process right, then think about automation.
    Break down bigger elements into smaller units (processes, tasks, steps) and describe their conditions and interdependencies.
    When necessary, there should be a way to break the workflow rules.

  • Zerfass@zaac.de208

    Machine Translation

  • Zerfass@zaac.de209

    Machine Translation

    Requires source documentation that is optimized for machine translation (controlled source language)
    Requires post-editing, not so much post-translation
    Can be used in combination with TM systems (to pre-translate large amounts), but is often not appreciated as very helpful by translators

  • Zerfass@zaac.de210

    TM - MT

    Interactive translation
        interactive process
        almost all language pairs possible
        creation of a repository
        recycling of translations independent of the format of the source document
    Machine translation
        fully automated process
        only works for the language pair the system was created for
        text is usually pre-edited and/or post-edited
        good systems are relatively costly
        very fast

  • Zerfass@zaac.de211

    Translation Tools

    Evaluation

  • Zerfass@zaac.de212

    Evaluation

    Collect your requirements
    Get a demo with your files from different tools vendors
    Test the tools yourself with the evaluation matrix

  • Zerfass@zaac.de213

    Tools, Tools, Tools

  • Zerfass@zaac.de214

    TM tools information

    Software reviews
        http://www.localizationworks.com/DRTOM/index.html
    Comparisons
        DOG, Translation Memory Systeme im Vergleich (German), 2005, ISBN 978-3-9810595-1-9
            Details functionalities, test setup, result matrix
        MD, Translation Memory Systeme im Vergleich (German), issue 4/5 2005
    Newsletters
        http://www.internationalwriters.com/toolkit/
    Magazines
        www.multilingual.com
    Training material and courses on tools
        http://ecolotrain.uni-saarland.de/index.php?id=2525&L=1
    Survey
        LISA (www.lisa.org) Translation Memory Survey 2002 and 2004
        Imperial College London, Translation Memories Survey 2006, http://www3.imperial.ac.uk/portal/pls/portallive/docs/1/7307707.PDF

  • Zerfass@zaac.de215

    Some software localization tools

    Passolo, www.passolo.com (XLIFF)
    RC-WinTrans, www.schaudin.com
    Catalyst, www.alchemysoftware.ie (XLIFF)
    Multilizer, www.multilizer.com
    SDL Insight, www.sdl.com/products/sdlinsight.htm (XLIFF)
    Language Studio, ls.atia.com
    Lingobit Localizer, www.lingobit.com
    RapidTranslation, www.rapidtranslation.net
    Visual Localize, www.visloc.com
    AppleGlot, http://developer.apple.com/intl/localization/tools.html

    Some TM tools also offer the ability to translate EXE files and the like, but usually they do not offer a visual representation of the dialogs and menus.

  • Zerfass@zaac.de216

    Some Terminology Extraction Tools

    Concordance tools
        Simple Concordance Program (SCP), http://www.textworld.com/scp/
        ExtPhr32, http://publish.uwo.ca/~craven/freeware.htm
    Term extraction tools / components of translation memory tools
    Statistical Extraction
        MultiTerm Extract, Déjà Vu Lexicon, Heartsome Dictionary Editor, across
        TermiDOG (www.dog-gmbh.de), Chamblon Terminology Extractor (http://www.chamblon.com/terminologyextractor.htm)
    Linguistic Extraction
        Synthema Terminology Wizard (http://www.synthema.it/english/servizi/traduzioni.html), SDL PhraseFinder

  • Zerfass@zaac.de217

    Some Translation Memory Tools

    Word based
        Wordfast, www.wordfast.net
        Metatexis, www.metatexis.com
        SDL Trados Workbench with Word, www.sdl.com
        TinyTM (open source), tinytm.sourceforge.net
        JiveFusion (for Office files), http://www.jivefusiontech.com/products_FT.html
    Integrated TM systems
        Déjà Vu, www.atril.com
        Transit, www.star-group.net
        MemoQ, http://en.kilgray.com/?q=node/products/memoq/memoq4free
        Across, www.across.net
        SDLX, www.sdl.com (integrated into SDL Trados installation)
        Heartsome, http://www.heartsome.net/EN/home.html

  • Zerfass@zaac.de218

    Some Online Translation Memory Tools

    Online TM tools
        Ontram, http://www.andrae-ag.de/EN/products/ontram.htm
        TinyTM, tinytm.sourceforge.net
    Server-based translation environment from SDL Trados, across

  • Zerfass@zaac.de219

    Some Tools for Source Language Checking

    Acrolinx IQ Suite, acrolinx, http://www.acrolinx.com/iq_suite_overview_de.html (German)
    HyperSTE, Tedopres, http://www.tedopres.com/en/products-services/hypervision/
    SDL Author Assistant, SDL Writer's Workbench, SDL Trados Authoring Coach, Sajan, http://marketing.sajan.com/marketing/solutions/authoring.aspx
    Author-it Xtend, http://www.author-it.com/index.php?page=xtendauthoringmemory
    across Plug-in

  • Zerfass@zaac.de220

    Some Tools for Target Language Checking

    ErrorSpy, DOG, http://dog-gmbh.com/index.php?id=44&L=1

    QA Distiller, Yamagata, http://www.qa-distiller.com/

    Quintilian, TerminologyMatters, http://www.terminologymatters.com/quintilian.html

    BlackJack, ITR, http://www.itr.co.uk/en/translation-quality-assurance.html

  • Zerfass@zaac.de221

    Some Workflow/Project Management Tools

    SDL Trados Teamworks
    SDL TMS
    Idiom Worldserver
    Across workflow module
    Business Manager, Plunet
    Project open
    LTC Organiser

  • Zerfass@zaac.de222

    TMX specification

    TMX is a recommendation by OSCAR
        OSCAR: LISA special interest group
        Open Standards for Container/Content Allowing Re-use
    The latest specification can be downloaded from http://www.lisa.org/tmx/tmx.htm
    For comments: tmx@lisa.org
    List of TMX certified tools

    The purpose of the TMX format is to provide a standard method to describe translation memory data that is being exchanged among tools and/or translation vendors, while introducing little or no loss of critical data during the process.
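    To make the exchange idea concrete, here is a minimal sketch that reads segment pairs from a small TMX document with Python's standard XML parser. The sample file content and the function name `read_segment_pairs` are made up for illustration. Inline formatting elements inside a segment are flattened to plain text, which is exactly where tools differ and where data can get lost.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample; header attribute values are placeholders.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Open the file.</seg></tuv>
      <tuv xml:lang="fr"><seg>Ouvrez le fichier.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_segment_pairs(tmx_text, source_lang, target_lang):
    """Return (source, target) text pairs for one language pair."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for unit in root.iter("tu"):
        segments = {}
        for variant in unit.findall("tuv"):
            seg = variant.find("seg")
            if seg is not None:
                # itertext() flattens any inline elements into plain text.
                segments[variant.get(XML_LANG)] = "".join(seg.itertext())
        if source_lang in segments and target_lang in segments:
            pairs.append((segments[source_lang], segments[target_lang]))
    return pairs


print(read_segment_pairs(SAMPLE_TMX, "en", "fr"))
```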

  • Zerfass@zaac.de223

    SRX Specification

    Latest version www.lisa.org/srx/srx.htm

    www.lisa.org/srx/srx10-20040420.htm
