TTS Evaluation Julia Hirschberg CS 4706. TTS Evaluation Intelligibility Tests Mean Opinion Scores...
-
Upload
martin-tucker -
Category
Documents
-
view
213 -
download
0
Transcript of TTS Evaluation Julia Hirschberg CS 4706. TTS Evaluation Intelligibility Tests Mean Opinion Scores...
![Page 1: TTS Evaluation Julia Hirschberg CS 4706. TTS Evaluation Intelligibility Tests Mean Opinion Scores Preference Tests 9/7/20152 Speech and Language Processing.](https://reader036.fdocuments.us/reader036/viewer/2022082820/56649e1c5503460f94b0a6a3/html5/thumbnails/1.jpg)
TTS Evaluation
Julia Hirschberg
CS 4706
![Page 2: TTS Evaluation Julia Hirschberg CS 4706. TTS Evaluation Intelligibility Tests Mean Opinion Scores Preference Tests 9/7/20152 Speech and Language Processing.](https://reader036.fdocuments.us/reader036/viewer/2022082820/56649e1c5503460f94b0a6a3/html5/thumbnails/2.jpg)
TTS Evaluation
• Intelligibility Tests• Mean Opinion Scores• Preference Tests
04/21/23 2Speech and Language Processing Jurafsky and Martin
![Page 3: TTS Evaluation Julia Hirschberg CS 4706. TTS Evaluation Intelligibility Tests Mean Opinion Scores Preference Tests 9/7/20152 Speech and Language Processing.](https://reader036.fdocuments.us/reader036/viewer/2022082820/56649e1c5503460f94b0a6a3/html5/thumbnails/3.jpg)
Intelligibility Tests
• Diagnostic Rhyme Test (DRT) – Listening test– Listeners choose between two words differing
by a single phonetic feature (voicing, nasality, sustenation, sibilation)
– DRT: 96 rhyming pairs• Dense/tense, bond/pond, …
– Subject hears dense, chooses either dense or tense– % of correct answers is intelligibility score
– Problem: Only tests single word synthesis
![Page 4: TTS Evaluation Julia Hirschberg CS 4706. TTS Evaluation Intelligibility Tests Mean Opinion Scores Preference Tests 9/7/20152 Speech and Language Processing.](https://reader036.fdocuments.us/reader036/viewer/2022082820/56649e1c5503460f94b0a6a3/html5/thumbnails/4.jpg)
• Modified DRT: – 300 words, 50 sets of 6 words (went, sent,
bent, tent, dent, rent)– Embedded in carrier phrases:
• Now we will say dense again
• Mean Opinion Score– Have listeners rate output on a scale from 1
(bad) to 5 (excellent)• Preference tests:
– Reading addresses out loud, reading news text, using two different systems or systems against human voice
– Do a preference test (prefer A, prefer B)