Combining Crowd and AI to scale professional-quality ...External NLP Services Spell Check Syntax...

Combining Crowd and AI to scale professional-quality

translation

João GraçaCTO

80% English

The Internet, 1997

5% Portuguese

8% Spanish

20% Chinese

4% German

6% Japanese

3% Arabic

30% English

The Internet, 2017

“All translation firms together are able to translate far less than 1% of relevant content

produced everyday”

CSA – MT Is Unavoidable to Keep Up with Content Volumes

Is Machine Translation already here?

* Is Neural Machine Translation Ready for Deployment? A Case Study on 30 Translation Directions Marcin Junczys-Dowmunt, Tomasz Dwojak, Hieu Hoang

Everyone agrees that NMT is here to stay and much better than SMT

Moses(generic)

Neural MT(generic)

Moses(adapted)

Neural MT(adapted)

49,943,5

35,729,6

Unbabel Experiments with Customer Service Tickets

Is NMT really enough?

* A Neural Network for Machine Translation, at Production Scale. Google Blog

0 6 12 18 24 30

Good Not sure Bad

Quality per Job

Will AI solve translation?

MACHINE ONLYTIME

MQM 95

QUALITY

Will AI solve translation?H

MQM 95

0 6 12 18 24 30

Good Not sure Bad

Quality per Job

Unbabel Pipeline

JobMachine

Translation

Unbabel Pipeline

JobMachine

Translation

Unbabel Pipeline

JobMachine

Translation

Unbabel Pipeline

QualityEstimation

JobMachine

Translation

Customer

High Q.E.

Unbabel Pipeline

QualityEstimation

JobMachine

Translation

Customer

High Q.E.

Unbabel Pipeline

QualityEstimation

Low Q.E.

Re-Eval CommunityTranslators

JobMachine

Translation

Customer

QualityEstimation Community

Translators

Customer

QualityEstimation

Data Generation Engine

Initial text

Submitted text

Initial text

Submitted text

Before

NODATA POINTS

All changes in between:Mouse clicksKey pressesTimestamps

Data Generation Engine

Raw data Processed information

At 18:03:30:In nugget 3mouseClick Cursor at 16Selected: 0 At 18:03:31:In nugget 3Pressed Backspace Cursor at 16Selected: 0 At 18:03:31:In nugget 3Pressed Backspace Cursor at 15Selected: 0 At 18:03:31:In nugget 3Pressed Backspace

At 18:03:35:In nugget 3Pressed Shift Cursor at 25Selected: 0 At 18:03:35:In nugget 3Pressed s Cursor at 25Selected: 0 At 18:03:35:In nugget 3Pressed i Cursor at 26Selected: 0 At 18:03:35:In nugget 3Pressed e

At 18:03:30:In nugget 3mouseClick Cursor at 16Selected: 0 At 18:03:31:In nugget 3Pressed Backspace Cursor at 16Selected: 0 At 18:03:31:In nugget 3Pressed Backspace Cursor at 15Selected: 0 At 18:03:31:In nugget 3Pressed Backspace

Initial text

“Espero que esto es útil”

• Deleted word “es”• Inserted word “sea”Submitted text

“Espero que esto sea útil”

Keystroke Analysis

JobMachine

Translation

Customer

High Q.E.

QualityEstimation

Low Q.E.

Re-Eval Crowd

MACHINE QE COMMUNITY

Unbabel Pipeline

TranslationMemory

Job ResultMT Router

Customer MT Customer APE

Machine Translation Pipeline

Phrase-based MT Neural MT

Machine Translation Models

Customer Support Tickets

Google NMT

1K 5K 10K

51,748,8

47,246,9

Customer Adaptation

Word-Level QE Which words are translated

correctly/incorrectly?

Sentence-Level QE How good is the entire

translation?

Quality Estimation

Word-level QE example

Hey lá , eu sou pesaroso sobre aquele !OK OK OK OKBA

Quality Estimation

Unbabel Ticket

Bad translation

Good translation

source MT final

QE Training

QE Word Level

Ugent System

LinearQE

NeuralQE

StackedQE

APE-QE

FullStackedQE0 15 30 45 60

F1_MULT

QE Word Level

Ugent System

LinearQE

NeuralQE

StackedQE

APE-QE

F1_MULT41,1

QE Word Level

Ugent System

LinearQE

NeuralQE

StackedQE

APE-QE

F1_MULT

QE Word Level

Ugent System

LinearQE

NeuralQE

StackedQE

APE-QE

F1_MULT

QE Word Level

Ugent System

LinearQE

NeuralQE

StackedQE

APE-QE

F1_MULT

WMT 16WL QE Winner

QE Word Level

Ugent System

LinearQE

NeuralQE

StackedQE

APE-QE

F1_MULT

WMT 16WL QE Winner

QE Word Level

Ugent System

LinearQE

NeuralQE

StackedQE

APE-QE

F1_MULT

Andre F. T. Martins, Marcin Junczys-Dowmunt, Fabio Kepler, Ramon Astudillo, Chris Hokamp, Roman Grundkiewicz.“Pushing the Limits of Translation Quality Estimation.” TACL 2017 (To Appear)

WMT 16WL QE Winner

QE Word Level

QE Sentence Level

YANDEX

StackedQE

APE-QE

FullStackedQE

Pearson Correlation0 17,5 35 52,5 70

QE Sentence Level

YANDEX

StackedQE

APE-QE

FullStackedQE

QE Sentence Level

YANDEX

StackedQE

APE-QE

FullStackedQE

QE Sentence Level

YANDEX

StackedQE

APE-QE

FullStackedQE

QE Sentence Level

YANDEX

StackedQE

APE-QE

FullStackedQE

QE Sentence Level

Andre F. T. Martins, Marcin Junczys-Dowmunt, Fabio Kepler, Ramon Astudillo, Chris Hokamp, Roman Grundkiewicz.“Pushing the Limits of Translation Quality Estimation.” TACL 2017 (To Appear)

QE in the Pipeline

Job MachineTranslation

Customer

High Q.E.

QualityEstimation

QE in the Pipeline

Customer

High Q.E.

QualityEstimation

Document-Level QE how good is the entire document?

QE in the Pipeline

Customer

High Q.E.

QualityEstimation

Low Q.E.

Re-EvalCommunityTranslators

QE in the Pipeline

Customer

High Q.E.

QualityEstimation

Low Q.E.

Re-EvalCommunityTranslators

Human QE Can we evaluate post-edit output?

Interesting numbers

coming soon

QE in the Pipeline

Quality

Professional Translation

Quality

Pillars

Quality

Pillars

Quality

Pillars

• Editors Pool

Quality

Pillars

• Editors Pool• Initial Text (MT)

Quality

Pillars

• Editors Pool• Initial Text (MT)• Editor Assignment

Quality

Pillars

• Editors Pool• Initial Text (MT)• Editor Assignment• Interfaces

Quality

Pillars

• Editors Pool• Initial Text (MT)• Editor Assignment• Interfaces• Quality Evaluation

Unbabel Community

50 000 Users

Unbabel Community

Quality Estimation

Distributed Pipeline

Expert Specialisation layers will grow with time

Testing Phase Editors are tested when they sign up

Training Content Editors get ratings for the tasks

Paid Content The best editors have access to paid content

Editors Pool

Evaluators

Editors Pool

Evaluators

Annotators

Editors Pool

How Editors are Evaluated

Editors Profiling

Editor Assignment

Tasks/time

Editor Assignment

EditorsTasks/time

Editor Assignment

EditorsTasks/time

Editor Assignment

Pull30 m

SLA EditorsTasks/time

Editor Assignment

Pull30 m

Priority EditorsTasks/time

Editor Assignment

Pull30 m

Priority EditorsQueue

Tasks/time

Editor Assignment

Pull30 m

Priority Editors Rating

Tasks/time

Editor Assignment

Pull30 m

NativeQueue

Tasks/time

Editor Assignment

Pull30 m

NativeQueue

Tasks/time

Topics

Editor Assignment

Pull30 m

Native TopicsQueue

Tasks/time

Topics

Editor Assignment

Regular distribution

old rating

Editor Assignment

Regular distribution

old rating

Smart distribution

Improved rating

Editor Assignment

Post-Editing Interfaces

Time Spent on Job

DELIVERYMT

Time Spent on Job

WAITING

Translator 1

DELIVERYMT

Time Spent on Job

WAITING

Translator 1

DELIVERYMT

Translator 2

WAITING

Time Spent on Job: Mobile

WAITING

Translator 1

DELIVERYMT

Translator 2

WAITING

20% less

Time Spent on Job

Pos-Editing Interfaces

Smartcheck

Spelling

Formality

Consistency

External NLP Services

Spell Check

Syntax Parser

Word AlignerClient Rule

Smartcheck

Spelling

Formality

Consistency

Spell Check

Syntax Parser

Smartcheck

Spelling

Formality

Consistency

Spell Check

Syntax Parser

Smartcheck

Annotation Tool

Annotated

Spelling

Formality

Consistency

Spell Check

Syntax Parser

Smartcheck

Annotation Tool

Annotated Learn

M A C H I N E + H U M A N

CORE API

CHAT API

CYRANO API

Customer Service Conversational

MESSAGING API

Language OS

Language Engine

Tickets can come in many languages.

In Customer Service

Unbabel for Zendesk

Unbabel’s Domain Adapted

Machine Translation

Distributed Human Translation

English-speakingagent

Zendesk appconnected with Unbabel

Unbabel can offer the same Customer Satisfaction as native agents

SPEED QUALITY

9420minutes

Customer Replies: Speed & Quality

Editors train the

MT engineCommunity of 50K+ translatorsSkips Humans

Chat Translation Flow

Chat Messages: Speed & Quality

SPEED QUALITY

902Minutes

What is the future?H

MQM 95

Thank Youjoao@unbabel.com

Combining Crowd and AI to scale professional-quality ...External NLP Services Spell Check Syntax...

Documents

Transcript of Combining Crowd and AI to scale professional-quality ...External NLP Services Spell Check Syntax...

ClearPath-A Please always do IPR first before inserting the aligner. Getting aligners You will receive the aligner box containing aligner pouches,

Topdown parser

Schaeffler SmartCheck - User manual

INMA N ALIGNER - doc.vortala.com

Simulation Tools for Advanced Mask Aligner Lithography ... · Simulation Tools for Advanced Mask Aligner Lithography Arianna ... Mask Aligner from the light source to the final pattern

Wheel Aligner/Alignment Lift - Snap-on Wheel Aligner/Alignment Lift Wheel Aligner Accessories geolinerTM 680 Imaging Alignment System EEWA714AL22 All imaging aligners (excluding the

FAG SmartCheck

B211 Parser

Protocolo Clear Aligner

FAG SmartCheck - Network basics · PDF file2 Network basics FAG SmartCheck ... needs an identification number. In TCP/IP networks this identification number is called IP address. A

SCANIA WHEEL ALIGNMENT - AES UK LTD€¦ · SCANIA WHEEL ALIGNMENT - CAM-ALIGNER UPGRADE A, 74320 SCANIA WHEEL ALIGNMENT - CAM-ALIGNER UPGRADE B, 74321 1 1 laser AM cam-aligner cam-aligner

Lecture 9 Bottom-up parsers Datalog , CYK parser, Earley parser

Software Factory, Nprotobuf, Nhibernate, Renci SSH · SmartCheck DHCP IP 192.168.1.100 IP 192.168.1.x FAG SmartCheck ... 2407 9149-66 +49 (0) 2407 9149-59 +49 (0) 2407 9149-99

Parser combinators

FAG SmartCheck - Network basics - Schaeffler

Efficient Aligner Treatment from Start ... - TP Orthodontics · Efficient Aligner Treatment from Start to Finish The Refine™ Complete Aligner System from TPO® is the first aligner

MASK ALIGNER MA6 - Coltronics

3Z RF Aligner Brochure

Data Parser

Project Colorado - LinkedIn · 2020-03-25 · 0.5” Project Colorado Aligner Aligner Aligner 0.5” The vision for LinkedIn’s work in Colorado is to create economic opportunity