Meaningful Intonational Variation

03/16/22 1 Meaningful Intonational Variation


04/19/23 1

Meaningful Intonational Variation

04/19/23 2

Today

Assigning variation for TTS, CTS
  Contours
  Accent
  Phrasing
  Pitch range
  Amplitude and timing

04/19/23 3

TTS Production Pipeline

Orthographic input: Dr. Smith lives on Elm Dr.
Text normalization: abbreviation expansion…
Pronunciation modeling: POS id, WS disambiguation
Intonation assignment: parsing, POS id, robust semantics…
Phonetic/phonological realization: phonological parsing, phonetic analysis
Unit selection: acoustic analysis
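A minimal sketch of the abbreviation-expansion step for the example sentence. The rule used here (title before a capitalized word, street suffix otherwise) and the function name `expand_dr` are illustrative assumptions, not the method of any particular TTS system.

```python
import re

def expand_dr(text: str) -> str:
    """Expand 'Dr.' to 'Doctor' before a capitalized word, otherwise to 'Drive'.

    Assumed heuristic for illustration only; it misreads cases such as
    'Elm Dr. Then ...', where a street suffix precedes a capitalized word.
    """
    # 'Dr.' followed by whitespace and a capital letter: treat as a title.
    text = re.sub(r"\bDr\.(?=\s+[A-Z])", "Doctor", text)
    # 'Dr.' at the very end: street suffix, keep the sentence-final period.
    text = re.sub(r"\bDr\.(?=\s*$)", "Drive.", text)
    # Any remaining 'Dr.': street suffix inside the sentence.
    text = re.sub(r"\bDr\.", "Drive", text)
    return text

print(expand_dr("Dr. Smith lives on Elm Dr."))
# prints: Doctor Smith lives on Elm Drive.
```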

04/19/23 4

Intonation Assignment: Phrasing

Traditional: hand-built rules
  Punctuation: 234-5682
  Context/function word: no breaks after function word
    He went to dinner
  Parse?
    She favors the nuts and bolts approach
Current: statistical analysis of large labeled corpus
  Punctuation, pos window, utt length,…
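The hand-built rules above can be sketched roughly as follows. The function-word list, the `max_len` length limit, and the name `assign_breaks` are assumptions made for illustration; the slides only name punctuation and the "no break after a function word" constraint.

```python
# Illustrative, deliberately tiny function-word list (an assumption).
FUNCTION_WORDS = {"a", "an", "the", "to", "of", "in", "on", "and", "but", "he", "she"}
PUNCT = ",.;:?!"

def assign_breaks(tokens, max_len=5):
    """Return indices of tokens after which a phrase break is placed.

    Rules: break at punctuation and at the end of the utterance; also break
    when the current phrase reaches max_len words (assumed extra rule),
    but never place a length-based break right after a function word.
    """
    breaks = []
    since_break = 0
    for i, tok in enumerate(tokens):
        since_break += 1
        word = tok.rstrip(PUNCT).lower()
        has_punct = tok[-1] in PUNCT
        is_last = i == len(tokens) - 1
        too_long = since_break >= max_len
        if is_last or has_punct or (too_long and word not in FUNCTION_WORDS):
            breaks.append(i)
            since_break = 0
    return breaks

print(assign_breaks("He went to dinner, and then she left.".split(), max_len=3))
```

With `max_len=3` the length rule would fire at "to" and at "she", but both are function words, so the only breaks fall at the comma and at the end of the utterance.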

04/19/23 5

Functions of Phrasing

Disambiguates syntactic constructions, e.g. PP attachment:
  S: You should buy the ticket with the discount coupon.
Disambiguates scope ambiguities, e.g. negation:
  S: You aren’t booked through Rome because of the fare.
Or modifier scope:
  S: This fare is restricted to retired politicians and civil servants.

04/19/23 6

Intonation Assignment: Accent

Hand-built rules
  Function/content distinction: He went out the back door / He threw out the trash
  Complex nominals: Main Street / Park Avenue; city hall parking lot

Statistical procedures trained on large corpora

Contrastive stress, given/new distinction?
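A rough sketch of a function/content accent rule with optional deaccenting of given items. It assumes hand-supplied Penn Treebank-style POS tags, a hypothetical `CONTENT_TAGS` set, and a caller-provided list of given words; none of these come from the slides. Tagging "out" as a preposition (IN) or a particle (RP) is what separates the two example sentences.

```python
# Tags treated as accentable content categories (an illustrative assumption).
CONTENT_TAGS = {"NN", "NNS", "NNP", "VB", "VBD", "VBP", "JJ", "RB", "RP"}

def assign_accents(tagged, given=()):
    """Return (word, accented) pairs.

    A word is accented if its tag is a content category and the word has
    not already been mentioned (is not in `given`).
    """
    given_set = {w.lower() for w in given}
    return [
        (word, tag in CONTENT_TAGS and word.lower() not in given_set)
        for word, tag in tagged
    ]

door = [("He", "PRP"), ("went", "VBD"), ("out", "IN"),
        ("the", "DT"), ("back", "JJ"), ("door", "NN")]
trash = [("He", "PRP"), ("threw", "VBD"), ("out", "RP"),
         ("the", "DT"), ("trash", "NN")]

print(assign_accents(door))   # 'out' is a preposition here: unaccented
print(assign_accents(trash))  # 'out' is a particle here: accented
```

Contrastive stress is not covered by this rule; it would need information beyond the tags and the given list.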

04/19/23 7

Functions of Pitch Accent

Given/new information
  S: Do you need a return ticket.
  U: No, thanks, I don’t need a return.
Contrast (narrow focus)
  U: No, thanks, I don’t need a RETURN…. (I need a time schedule, receipt,…)
Disambiguation of discourse markers
  S: Now let me get you the train information.
  U: Okay (thanks) vs. Okay….(but I really want…)

04/19/23 8

Intonation Assignment: Contours

Simple rules
  ‘.’ = declarative contour
  ‘?’ = yes-no-question contour unless wh-word present at/near front of sentence
    Well, how did he do it? And what do you know?
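The simple punctuation rules can be sketched as below. The wh-word list, the `window` size standing in for "at/near front", and the returned label strings are assumptions for illustration; which contour a wh-question should actually receive is left open here.

```python
WH_WORDS = {"who", "whom", "whose", "what", "when", "where", "why", "which", "how"}

def choose_contour(sentence, window=3):
    """Pick a contour label from final punctuation and sentence-initial wh-words."""
    words = [w.strip(",.;:?!\"'").lower() for w in sentence.split()]
    if sentence.rstrip().endswith("?"):
        # A wh-word among the first `window` words blocks the yes-no contour.
        if any(w in WH_WORDS for w in words[:window]):
            return "wh-question"
        return "yes-no-question"
    return "declarative"

for s in ["Well, how did he do it?",
          "And what do you know?",
          "Do you want to fly first class?",
          "Dr. Smith lives on Elm Dr."]:
    print(s, "->", choose_contour(s))
```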

What else might we do?

04/19/23 9

Contours: Accent + Phrasing

What do intonational contours ‘mean’ (Ladd ‘80, Bolinger ‘89)?
  Speech acts (statements, questions, requests)
    S: That’ll be credit card? (L* H- H%)
  Propositional attitude (uncertainty, incredulity)
    S: You’d like an evening flight. (L*+H L- H%)
  Speaker affect (anger, happiness, love)
    U: I said four SEVEN one! (L+H* L- L%)
  “Personality”
    S: Welcome to the Sunshine Travel System.

04/19/23 10

Propositional attitude (uncertainty)

Did you feed the animals?

I fed the L*+H goldfish L-H%

Distinguish direct/indirect speech acts

Can you open the door?

04/19/23 11

The TTS Front End Today

Corpus-based statistical methods instead of hand-built rule-sets

Dictionaries instead of rules (but fall-back to rules)

Modest attempts to infer contrast, given/new

Text analysis tools: pos tagger, morphological analyzer, little parsing

04/19/23 12

TTS: Where are we now?

Natural sounding speech for some utterances
  Where good match between input and database
Still…hard to vary prosodic features and retain naturalness
  Yes-no questions: Do you want to fly first class?
Context-dependent variation still hard to infer from text and hard to realize naturally:

04/19/23 13

Appropriate contours from text

Emphasis, de-emphasis to convey focus, given/new distinction: I own a cat. Or, rather, my cat owns me.

Variation in pitch range, rate, pausal duration to convey topic structure

Characteristics of ‘emotional speech’ little understood, so hard to convey: …a voice that sounds friendly, sympathetic, authoritative….

How to mimic real voices?

04/19/23 14

TTS vs. CTS

Decisions in Text-to-Speech (TTS) depend on syntax, information status, topic structure,… information explicitly available to NLG

Concept-to-Speech (CTS) systems should be able to specify “better” prosody: the system knows what it wants to say and can specify how

But….generating prosody for CTS isn’t so easy

04/19/23 15

To(nes and) B(reak) I(ndices)

Developed by prosody researchers in four meetings over 1991-94

Goals:

devise common labeling scheme for Standard American English that is robust and reliable

promote collection of large, prosodically labeled, shareable corpora

ToBI standards also proposed for Japanese, German, Italian, Spanish, British and Australian English,....

04/19/23 16

Minimal ToBI transcription:

recording of speech
f0 contour
ToBI tiers:
  orthographic tier: words
  break-index tier: degrees of junction (Price et al ‘89)
  tonal tier: pitch accents, phrase accents, boundary tones (Pierrehumbert ‘80)
  miscellaneous tier: disfluencies, non-speech sounds, etc.

04/19/23 17

Sample ToBI Labeling

04/19/23 18

Online training material, available at: http://www.ling.ohio-state.edu/phonetics/ToBI/

Evaluation

Good inter-labeler reliability for expert and naive labelers: 88% agreement on presence/absence of tonal category, 81% agreement on category label, 91% agreement on break indices to within 1 level (Silverman et al. ‘92, Pitrelli et al. ‘94)
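To make "agreement on break indices to within 1 level" concrete, here is a small sketch for two labelers. The helper name `agreement_within` and the break-index sequences are invented for illustration, and the cited studies may have computed agreement differently (for example over many labeler pairs).

```python
def agreement_within(a, b, tolerance=1):
    """Fraction of words where two labelers' break indices differ by <= tolerance."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and of equal length")
    hits = sum(abs(x - y) <= tolerance for x, y in zip(a, b))
    return hits / len(a)

# Invented break indices (0-4) for a six-word utterance.
labeler_1 = [1, 1, 3, 4, 1, 4]
labeler_2 = [1, 2, 4, 4, 0, 2]

print(agreement_within(labeler_1, labeler_2))               # within 1 level
print(agreement_within(labeler_1, labeler_2, tolerance=0))  # exact match only
```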

04/19/23 19

Pitch Accent/Prominence in ToBI

Which items are made intonationally prominent and how?

Accent type:

H* simple high (declarative)
L* simple low (ynq)
L*+H scooped, late rise (uncertainty/incredulity)
L+H* early rise to stress (contrastive focus)
H+!H* fall onto stress (implied familiarity)

04/19/23 20

•Downstepped accents:

•!H*,

•L+!H*,

•L*+!H

•Degree of prominence:
  within a phrase: HiF0
  across phrases

04/19/23 21

Prosodic Phrasing in ToBI

‘Levels’ of phrasing:

intermediate phrase: one or more pitch accents plus a phrase accent (H- or L- )

intonational phrase: 1 or more intermediate phrases + boundary tone (H% or L% )

ToBI break-index tier

0 no word boundary
1 word boundary
2 strong juncture with no tonal markings
3 intermediate phrase boundary
4 intonational phrase boundary
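The two levels of phrasing amount to a small grammar over tonal labels, which can be checked as in the sketch below. The label sets are taken from the accent, phrase-accent, and boundary-tone inventories on these slides; the function name is hypothetical, and labels not listed on the slides are not handled.

```python
PITCH_ACCENTS = {"H*", "L*", "L*+H", "L+H*", "H+!H*", "!H*", "L+!H*", "L*+!H"}
PHRASE_ACCENTS = {"H-", "L-"}
BOUNDARY_TONES = {"H%", "L%"}

def is_intonational_phrase(labels):
    """Check: one or more intermediate phrases, then a boundary tone.

    An intermediate phrase is one or more pitch accents followed by a
    phrase accent (H- or L-).
    """
    if len(labels) < 3 or labels[-1] not in BOUNDARY_TONES:
        return False
    accents_in_phrase = 0
    for label in labels[:-1]:
        if label in PITCH_ACCENTS:
            accents_in_phrase += 1
        elif label in PHRASE_ACCENTS:
            if accents_in_phrase == 0:
                return False  # phrase accent with no pitch accent before it
            accents_in_phrase = 0  # intermediate phrase closed
        else:
            return False  # unknown label
    # Zero here means the last label before the boundary tone was a phrase accent.
    return accents_in_phrase == 0

print(is_intonational_phrase(["L*", "H-", "H%"]))
print(is_intonational_phrase(["H*", "L%"]))
```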

04/19/23 22

L*+H

L*

H*

H-H%  H-L%  L-H%  L-L%

04/19/23 23

H* !H*

H+!H*

L+H*

H-H%  H-L%  L-H%  L-L%

04/19/23 24

Contour Examples

http://www.cs.columbia.edu/~julia/cs6998/cards/examples.html

04/19/23 25

And Other Things Contribute: Pitch Range and Timing (Rate, Pause)

Level of speaker engagement

Hello vs. HELLO

Contour interpretation

Rise/fall/rise (L*+H L-H%): Elephantiasis isn’t incurable

Discourse/topic structure: paratones

04/19/23 26

Corpus-Based Research

Predicting accent, phrasing, contours from large ToBI-labeled corpora

Features:
  Word position, p.o.s. window, word co-occurrence, punctuation, capitalization, sentence length, paragraph position, …

Results:
  ~80-85% correct accent prediction
  ~92-96% correct phrase boundary prediction
  Contours????

Reality…
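A sketch of how a few of the listed features might be extracted for one word before being passed to a statistical classifier. The feature names, the `PAD` value, and the hand-tagged input are assumptions for illustration; word co-occurrence and paragraph position are left out.

```python
PUNCT_TOKENS = {",", ".", ";", ":", "?", "!"}

def word_features(tagged, i, window=1):
    """Build a feature dict for the word at index i of a (word, pos) list."""
    word, _ = tagged[i]
    feats = {
        "position": i,
        "sentence_length": len(tagged),
        "capitalized": word[:1].isupper(),
        "followed_by_punct": i + 1 < len(tagged) and tagged[i + 1][0] in PUNCT_TOKENS,
    }
    # Part-of-speech window around the word, padded at the sentence edges.
    for offset in range(-window, window + 1):
        j = i + offset
        feats[f"pos[{offset:+d}]"] = tagged[j][1] if 0 <= j < len(tagged) else "PAD"
    return feats

sentence = [("I", "PRP"), ("own", "VBP"), ("a", "DT"), ("cat", "NN"), (".", ".")]
print(word_features(sentence, 3))
```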

04/19/23 27

This is my version of a rather long sentence which ideally should be broken into several phrases automatically by a smart system but we don't know if this will actually happen do we?

Is a yes-no question uttered with falling intonation? Does that sound delightful? Mellifluous?

I don’t want cereal I want toast.….

04/19/23 28

Next:

Story analysis and generation (readings will be available later this week – we’ll send mail)