The Evolution of Speech Segmentation: A Computer Simulation

Post on 25-May-2015


These are the slides for my undergraduate dissertation on word segmentation.


The Evolution of Speech Segmentation

A Computational Simulation

Richard Littauer (Edinburgh)

Outline

• The Problem

• The Possible Solution

• Conclusions and Implications

The Research Problem

• Word Segmentation

The Problem

• Fluent listeners hear speech as a sequence of discrete words.

• But there are no pauses in the wave form…

The Problem

• The Listener's Problem:

• jakɑrəmnə (or thereishope)

• Solution!

– Find all boundaries

– Don't find any boundaries

The Problem

• Suggestions:

– Allophonic variation

– Coarticulation

– Prosody

– Phonotactics

– Combining any of these

– Or…

The Problem

• Recent studies have shown that 8-month-olds can segment continuous strings of speech syllables into word-like units using only statistical computations over syllables (Aslin et al. 1997, 1998; Mattys et al., 1999)

The Problem

• These studies looked at syllable transition probability, but didn’t look at the possibility that the children may simply be counting the syllables.
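The statistical computation these studies tested is usually stated as a forward transitional probability over adjacent syllables, TP(a→b) = count(a,b) / count(a). A minimal sketch, where the syllable stream and the two "words" are made up for illustration:

```python
from collections import Counter

def transition_probs(syllables):
    """Estimate forward transitional probabilities TP(a->b) = count(a,b) / count(a)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

# Toy stream built from two "words", ba-bu and di-ro:
stream = ["ba", "bu", "di", "ro", "ba", "bu", "ba", "bu", "di", "ro"]
tps = transition_probs(stream)
# Within-word TP(ba->bu) stays at 1.0, while across-word TP(bu->di) and
# TP(bu->ba) dip below it, so a learner could posit a boundary after "bu".
```

The idea is that transitional probability dips mark candidate word boundaries, which is exactly what the infants in these studies appear to exploit.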

The Problem

• Furthermore, while Aslin, Saffran, & Newport (1996; 1998) did show that children can use statistical probability, they didn't examine how that type of analysis would influence language over time.

The Problem

• No one has done this (as far as I am aware).

The Problem

• So, why does this matter? Because the child starts with no lexicon to fall back on, so the information the child is exposed to must itself be what is used to learn how to segment properly.

My Simulation

• Code four different possible transitional segmentation strategies, then use an Iterated Learning Model to see how well they do when culturally replicated.

My Simulation

• Coded four different types of methods:

– If you have seen one of the two test words before and not the other, choose the one you have seen before.

– If one of the test words has occurred more frequently than the other, choose the more frequent one.

– If one of the test words contains more frequent transitions, choose that one.

– If one of the test words contains more probable transitions, choose that one.
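The four forced-choice rules can be sketched as scoring functions over a two-word test pair. The data structures and names below are my own illustration, not the dissertation's actual code; the learner state is a hypothetical example:

```python
import random
from collections import Counter

def pairs(word):
    """Adjacent syllable pairs (digrams) of a word, given as a tuple of syllables."""
    return list(zip(word, word[1:]))

def pick(a, b, score):
    """Choose the higher-scoring test word; break ties at random."""
    if score(a) != score(b):
        return a if score(a) > score(b) else b
    return random.choice([a, b])

# Hypothetical learner state accumulated from exposure:
seen = {("ba", "bu")}
word_freq = Counter({("ba", "bu"): 3, ("di", "ro"): 1})
trans_count = Counter({("ba", "bu"): 3, ("di", "ro"): 1})
trans_prob = {("ba", "bu"): 1.0, ("di", "ro"): 0.5}

strategies = {
    "recognition":      lambda w: w in seen,
    "frequency":        lambda w: word_freq[w],
    "transition_count": lambda w: sum(trans_count[p] for p in pairs(w)),
    "transition_prob":  lambda w: sum(trans_prob.get(p, 0.0) for p in pairs(w)),
}

a, b = ("ba", "bu"), ("di", "ro")
winners = {name: pick(a, b, score) for name, score in strategies.items()}
# On this toy state, all four strategies prefer ("ba", "bu").
```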

My Simulation

• Variables:

– word recognition

– word frequency

– syllable transition count

– syllable transition probability

My Simulation

• Variables:

– Word length

– Number of 'syllables'

– Number of words

– Number of words used

– Fixed lexicons

My Simulation

• Types of Pairings:

– 2 random words

– 1 lexical word, 1 random word with the same phonemes

– 1 scrambled word

– 1 chopped-up word

My Simulation

• The ILM

– All of this was run through an Iterated Learning Model, i.e. a generational model.
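The generational idea can be sketched as a learn/produce chain. The function names and the degenerate example below are my own, assuming each agent learns from the previous agent's output; this is not the actual simulation code:

```python
def iterated_learning(initial_corpus, learn, produce, generations=10):
    """Run a chain of agents: each learns from the previous agent's
    output, and its own output becomes the next agent's input."""
    corpus = initial_corpus
    history = [corpus]
    for _ in range(generations):
        grammar = learn(corpus)    # e.g. segment input with one of the four strategies
        corpus = produce(grammar)  # generate the data the next learner sees
        history.append(corpus)
    return history

# Degenerate example: a perfect learner transmits the corpus unchanged.
history = iterated_learning(
    ["ba", "bu", "di", "ro"],
    learn=lambda corpus: corpus,
    produce=lambda grammar: grammar,
    generations=3,
)
```

With a real learner, the interest is in how the corpus drifts (or stabilises) across generations under each segmentation strategy.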

My Simulation

• What I judged the output on:

– The original generation

– The first generation

My Simulation

• What I measured:

– Lexical retention

– Lexical size

– Hamming distance

– Levenshtein distance

– Phonotactic Development

– Transitional Probability
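Two of these metrics are standard string distances; straightforward textbook implementations (not the dissertation's code) look like this:

```python
def hamming(a, b):
    """Number of positions that differ; defined only for equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))

def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    and substitutions turning a into b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # delete x
                           cur[j - 1] + 1,       # insert y
                           prev[j - 1] + (x != y)))  # substitute
        prev = cur
    return prev[-1]

# hamming("karin", "kerin") -> 1; levenshtein("kitten", "sitting") -> 3
```

Hamming distance only compares words of equal length, which is presumably why Levenshtein distance is tracked alongside it.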

Results

• Word Recognition: Pretty unsuccessful.

• Word Frequency: Wildly successful (100%)

• Transitional Probability: Alright.

• Transitional Counting: Better than alright, and each generation got better.

Results

• Controls:

– The WPT was very influential.

– Fixed original corpus: word recognition and frequency did too well, while the transitional processes looked most like real language.

– Random original corpus: none of them did well.

Results

• Controls:

– More words are better.

– Shorter words are better.

– Longer runs aren't needed but are useful.

Disclaimers

• Online processing

• Memory constraints

• The WPT is unrealistic

• Words aren’t isolated

• Abstraction

• The digram analysis

Future Work?

• What about a Bayesian analysis?

• How exactly would transition count and probability be used in sequence?

• And anything you might raise now that shows I need to redo this?

• thatisitiamdonenowthanks