Topic 4: The Magician's Hat: Turning Data into Business Intelligence (2)
Uploaded by: taus-enabling-better-translation
Transcript of Topic 4: The Magician's Hat: Turning Data into Business Intelligence (2)
1. Separate language data from metadata
Visualisation tools, API, 3rd party BI
Memsource: translation platform
Elastic Search: data warehouse
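As a rough illustration of the separation named on this slide, the sketch below splits one segment record into its language data (the actual text) and its metadata (everything else), so that only the metadata would be passed on to a data warehouse. The field names and the record layout are hypothetical; the slides do not describe the real schema or how the platform performs this step.

```python
from typing import Any

# Hypothetical field names: the slides do not specify a schema.
LANGUAGE_FIELDS = {"source_text", "target_text"}


def split_segment(record: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return (language_data, metadata) for one segment record."""
    language_data = {k: v for k, v in record.items() if k in LANGUAGE_FIELDS}
    metadata = {k: v for k, v in record.items() if k not in LANGUAGE_FIELDS}
    return language_data, metadata


# Illustrative record only; values are made up for the example.
segment = {
    "segment_id": 17,
    "language_pair": "en:es",
    "word_count": 6,
    "source_text": "Turning data into business intelligence",
    "target_text": "Convertir datos en inteligencia empresarial",
}

language_data, metadata = split_segment(segment)
print(metadata)  # contains no source or target text
```

Keeping the split in a single function makes it easy to check that no text field leaks into the analytics side.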
4. Higher data resolution over time
1. Project data
2. Job data
3. Segment data
4. Time tracking
5. Subsegment data
Benchmarks: translation memory
Data for jobs where post-editing analysis has been performed, December 2015 - May 2016
Sample: 9 bn words
Benchmarks: machine translation
Sample size 20 million words, December 2015 - May 2016
[Chart: Edit distance for major language pairs]
Y-axis: % of words in segments from MT (0% to 100%)
X-axis: sample language pairs (en:es, pt:en, en:pt, es:en, en:ru, ru:en, en:de, pt:es, en:fr, es:pt)
Legend: mt.match100, mt.match95, mt.match85, mt.match75, mt.match50, mt.match0
Annotations: MT not used, Raw MT, Moderate edits, Heavily edited
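The legend bands mt.match100 to mt.match0 can be read as buckets of similarity between the raw MT output and the final post-edited text. The sketch below shows one way such a chart could be computed, under loud assumptions: the similarity score here is Python's difflib ratio, used only as a stand-in (the slides do not say how the score is actually calculated), and each band is treated as a lower bound on the score.

```python
from difflib import SequenceMatcher

# Band thresholds taken from the chart legend; treating them as
# lower bounds is an assumption.
BANDS = (100, 95, 85, 75, 50, 0)


def similarity(mt_output: str, final_text: str) -> float:
    """Stand-in similarity score in [0, 100]; not the platform's own metric."""
    return 100.0 * SequenceMatcher(None, mt_output, final_text).ratio()


def band(score: float) -> str:
    """Map a score to the highest band whose threshold it reaches."""
    for threshold in BANDS:
        if score >= threshold:
            return f"mt.match{threshold}"
    raise ValueError(f"score must be non-negative, got {score}")


def word_share_by_band(segments: list[tuple[str, str, int]]) -> dict[str, float]:
    """Percentage of words per band for (mt_output, final_text, word_count) tuples."""
    totals = {f"mt.match{t}": 0 for t in BANDS}
    for mt_output, final_text, words in segments:
        totals[band(similarity(mt_output, final_text))] += words
    all_words = sum(totals.values())
    return {name: 100.0 * count / all_words for name, count in totals.items()}


# Made-up segments: one left as raw MT, one rewritten completely.
shares = word_share_by_band([
    ("la casa roja", "la casa roja", 3),
    ("abc", "xyz", 1),
])
print(shares)
```

Weighting by word count rather than by segment count matches the chart's y-axis label, which is expressed in words.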
Benchmarks: rising languages

% increase   Language                     Volume
67,481.6     Marathi                      *
3,087.7      Hindi                        *
1,006.0      Latvian                      **
784.3        Kazakh                       *
689.1        Serbian                      *
659.5        Finnish                      **
657.5        Romanian                     *
613.9        Vietnamese                   *
592.5        Indonesian                   *
585.7        Turkish                      **
543.6        Hungarian                    *
467.4        Italian                      ***
457.2        Ukrainian                    *
448.9        Chinese                      ****
440.7        Swedish                      **
428.8        Dutch                        **
390.0        Norwegian                    *
378.9        English                      *****
335.2        Danish                       **
331.3        Estonian                     *
319.2        Polish                       **
318.5        Norwegian Bokmål             *
307.8        German                       ***
306.9        Thai                         *
299.7        Overall increase in volume
295.8        Korean                       **
269.4        Czech                        **
268.3        Russian                      ****
245.5        French                       ****
241.4        Slovak                       *
237.8        Portuguese                   ***
237.4        Bulgarian                    *
230.0        Arabic                       *
200.9        Japanese                     ****
194.8        Spanish                      ****
137.8        Bulgarian                    *
121.4        Greek                        ***
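For reading the "% increase" figures, the sketch below uses the standard percentage-increase formula. The slides do not state which two periods are being compared, so the volumes in the example are invented purely to show the arithmetic.

```python
def pct_increase(old_volume: float, new_volume: float) -> float:
    """Standard percentage increase from old_volume to new_volume."""
    if old_volume <= 0:
        raise ValueError("old_volume must be positive")
    return (new_volume - old_volume) / old_volume * 100.0


# Invented volumes: growing from 100 to 400 units is a 300% increase,
# i.e. the new volume is four times the old one.
print(pct_increase(100, 400))
```

On this reading, the overall figure of 299.7 would mean total volume roughly quadrupled, and any language listed below that row grew more slowly than the overall volume.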