Automatic Disfluency Detection in Multi-party Conversations...

rgGerman Research Center forArtificial Intelligence GmbH

FeastFeast, 30th September 2009, 30th September 2009Sebastian Sebastian GermesinGermesin

Automatic DisfluencyAutomatic DisfluencyDetection in Multi-partyDetection in Multi-party

ConversationsConversations

Sebastian Germesin September 092

German Research Center forArtificial Intelligence GmbH

OutlineOutline• Motivation• Theoretical Background• Data (AMI Corpus)• Disfluency Detection System

• Hybrid Classification Approach• Self-arranging Modules• Experimental Results

• Conclusions & Outlook

MotivationMotivationExampleExample

MotivationMotivation• Have to detect (and clean) disfluencies

in the transcribed speech• Readability

• Transcription• Extractive Summarization

• Post-Processing• NLP-systems’ performance drop when faced with

disfluent speech

• Human detector?• Too expensive!• Too slow!

⇒Automatic Detection System!

Theoretical BackgroundTheoretical Background

“Disfluencies are syntactical and grammatical[speech] errors that occur in spoken but notin written language.” [Besser, 2006]

DefinitionDefinition

“The cat uh the dog sneaks around the corner.”

TerminologyTerminology

Reparandum

Interregnum

Reparandum Reparans

Interregnumcomplex

“The d dog sneaks around the corner.”

Reparandum

simple

Theoretical BackgroundTheoretical BackgroundAll TypesAll Types

Simple disfluencies

DataData

• AMI meeting corpus• 135 meetings (~ 100 hours speech)• 4 participants• task: design a remote control• freely interaction• Many annotations, e.g.:

• Transcribed speech• Dialogue acts• Gestures• ...

quantitativequantitative

DataData

• 45 meeting enriched with disfluencyannotation• 31,000 Disfluencies• 15.8% erroneous words• 41.5% disfluent Dialogue Acts• 80% (33) for training• 20% (12) for evaluation

quantitativequantitative

DataData

• Discovered a heterogeneity towards thestrictness of different disfluency types1. Some disfluencies have strict structure

• ex.: Repetition : “The cat the cat plays “2. Some other disfluencies have also strict structure but

this structure is very common in natural language• ex.: Replacement : “The dog the cat plays“• ex.: Fluent : “The dog the cat and the bird play”

3. Some other disfluencies have no obvious structure• ex.: Disruptions : “The dog the cat and“• ex.: Order : “The plays cat”

qualitativequalitative

Automatic SystemAutomatic SystemDesign QuestionDesign Question

• Can we leverage the heterogeneityof disfluencies for their detection?→Yes!

→ Use modules for subsets of disfluencies→ Use different feature-sets for each module

(depending on the disfluency types)→ Find “optimal” classifier for each module

Automatic SystemAutomatic SystemHybrid ModulesHybrid Modules

• SHS:• Stuttering, Hesitation, Slip-of-the-Tongue

• REP:• Repetition

• DNE:• Discourse Marker, Explicit Editing Term

• DEL:• Deletion

• REV:• Insertion, Replacement, Restart, Other

How toHow to combine the modules?combine the modules?

Training ProcessTraining ProcessSelf-arranging ModulesSelf-arranging Modules

• Immense search space• #(modules) * #(classifier) * placeInSystem

• Solution(s):• Old system:

• Choosen manually• Current system:

• Automatically trained1.Use greedy hill-climbing

– Use weight for errors to improve Precision!2.Reduce classifier library

– Take 10% results in maximal performance lossof 2.3% (depending on the module)

GroDiGroDiGreedy Hill-ClimbingGreedy Hill-Climbing

Training ProcessTraining ProcessSelf-arranging ModulesSelf-arranging Modules

• Immense search space• #(modules) * #(classifier) * placeInSystem

• Solution(s):• Old system:

• Choosen manually• Current system:

• Automatically trained1.Use greedy hill-climbing

– Use weight for errors to improve Precision!2.Reduce classifier library

– Take 10% results in maximal performance lossof 2.3% (depending on the module)

GroDiGroDiPerformance-Curve of J48Performance-Curve of J48

Best: J48 "-L -U -M 2 -A"

Experimental ResultsExperimental Results

93.5 %94.5 %12 m.

94.7 %95.1 %6 m.33 m.

new 0.11

94.8 %95.3 %6 m.22 m.

0.4290.5 %92.9 %6 m.22 m.old

83.3 %88.6 %12 m.0.00

85.7 %90.3 %6 m.--baseline

RT-factoravg. F1AccuracyEval.data

Train.dataSystem

ConclusionsConclusions• Aims:

• Development of a system that automaticallydetects a broad set of disfluencies

• Fully automatic learning process• Robust and Fast

• Achievements:• Stand-alone tool for detection of disfluencies:

GroDi - Get rid of Disfluencies• Self-arranging modules• Detection rate: 95% Accuracy• Real-time factor of 0.11

OutlookOutlook

• Develop module(s) for the detection ofMistake, Order, Omission

• Embed other learning approaches, e.g.:• Conditional Random Fields• HMMs

• Use other corpus like, e.g., Switchboard

Thank you!Thank you!

Demo?Demo?

GroDiGroDiDiff. Module ArrangementsDiff. Module Arrangements

GroDiGroDi

Used technologies WEKA toolkit for machine learning Maximum Entropy classifier from Stanford NLP group CRF Tagger from http://crftagger.sourceforge.net/

Features for machine learning: Lexical: words, lexical parallelism, (POS-Tags) Prosodic: duration, pauses, pitch, energy Dynamic: disfluency types of surrounding words Speaker: age, role in meeting, native language

Automatic Disfluency Detection in Multi-party Conversations...

Documents

Transcript of Automatic Disfluency Detection in Multi-party Conversations...

Measuring Financial Covenant Strictness in Private … · Measuring Financial Covenant Strictness in Private Debt Contracts ... If the actual ratio of earnings to interest is six,

SILENT PAUSES AND DISFLUENCIES IN SIMULTANEOUS ... · SILENT PAUSES AND DISFLUENCIES IN SIMULTANEOUS INTERPRETATION: A DESCRIPTIVE ANALYSIS By Benedetta Tissi Freelance Conference

Quantitative analysis of disfluency in children with ... · Quantitative analysis of disfluency in children with autism spectrum disorder or language impairment Heather MacFarlane1,

Intonation Structure and Disfluency Detection in Stuttering · patterns of disfluency types are constrained by prosodic phrase structure. The prominent role of prosodic structure

“Disfluencies in 3 to 5 Years Old Telugu Speaking Normal ... · Disfluencies in 3 to 5 Years Old Telugu Speaking Normal Preschool Children Continuity - i.e., the relatedness of

Remedial Instruction in Language Disfluencies in the Non-Psycho-Expert Lens

speech recognition and removal of disfluencies

DISFLUENCY IN SECOND LANGUAGE: A … · disfluency in second language: a quantitative study on turkish learners of english a thesis submitted to the graduate school of informatics

Speaker-Versus Listener-Oriented Disfluency: A Re-examination … · 2017-07-01 · that if certain types of disfluency are solely (or primar-ily) for the benefit of the listener

The Relationship between Gun Control Strictness and Mass ...

Speech disfluencies in bilingual Yiddish-Dutch speaking ...

Normal Disfluency in Pre-schoolers: Silences, Pauses, and ...

Disfluencies in American Sign Language and English · 2015-05-20 · Disfluencies in American Sign Language and English Christiana David, Karen Emmorey Ph.D., ... We analyzed language

Strictness and Subsidiarity: An Institutional Perspective ...

NORMAL DISFLUENCY VS. THE NSA STUTTERING SUPPORT … · NORMAL DISFLUENCY VS. STUTTERING Fluency is generally described as the smooth, rhythmic, forward flow of speech. For most speakers,

CAMOUFLUO Collection by Sweet Years: strictness & craziness for your smartphone!

On the Relationship Between Laziness and Strictness

DISFLUENCY CLUSTERS OF CHILDREN WHO - …ainslab/readings/Anna/LaSalle_1995_Disfluency...Disfluency clusters of children who stutter: Relation of stutterings to self-repairs. By: LaSalle,

Psycholinguistic Sources of Variation in Disfluency - Scott Fraundorf

Proceedings of Disfluency in Spontaneous Speech (DiSS ...