Perceptual Categories:
Old and gradient, young and sparse.
Bob McMurray, University of Iowa
Dept. of Psychology
Collaborators
Richard Aslin, Michael Tanenhaus, David Gow
Joe Toscano, Cheyenne Munson, Meghan Clayards, Dana Subik, Julie Markant, Jennifer Williams
The students of the MACLab
Categorization
Categorization occurs when:
1) discriminably different stimuli…
2) …are treated equivalently for some purposes…
3) …and stimuli in other categories are treated differently.
Categorization
Perceptual Categorization
• Continuous input maps to discrete categories.
• Semantic knowledge plays minor role.
• Bottom-up learning processes important.
Categorization
Perceptual Categorization
• Continuous inputs map to discrete categories.
• Semantic knowledge plays less of a role.
Categories include:
• Faces
• Shapes
• Words
• Colors
Exemplars include:
• A specific view of a specific face
• A variant of a shape
• A particular word in a particular utterance
• Variation in hue, saturation, lightness
Categorization occurs when:
1) Discriminably different stimuli…
2) …are treated equivalently for some purposes…
3) …and stimuli in other categories are treated differently.
Approach: Walk through work on speech and category development.
Assess this definition along the way.
Premise: For perceptual categories this definition largely falls short,
and this may be a good thing.
Overview
1) Speech perception: Discriminably different and categorical perception.
2) Word recognition: exemplars of the same word are not treated equivalently. (+Benefits)
3) Speech Development: phonemes are not treated equivalently.
4) Speech Development (model): challenging other categories treated differently. (+Benefits)
5) Development of Visual Categories: challenging other categories treated differently.
Categorical Perception
Subphonemic variation in VOT is discarded in favor of a discrete symbol (phoneme).
• Sharp identification of tokens on a continuum.
• Discrimination poor within a phonetic category.
[Figure: identification (% /pa/) and discrimination plotted against VOT along a /b/ to /p/ continuum]
Categorical Perception
Categorical Perception: Demonstrated across wide swaths of perceptual categorization.
• Line Orientation (Quinn, 2005)
• Basic Level Objects (Newell & Bülthoff, 2002)
• Facial Identity (Beale & Keil, 1995)
• Musical Chords (Howard, Rosen & Broad, 1992)
• Signs (Emmorey, McCollough & Brentari, 2003)
• Color (Bornstein & Korda, 1984)
• Vocal Emotion (Laukka, 2005)
• Facial Emotion (Pollak & Kistler, 2002)
What’s going on?
Categorical Perception
Across a category boundary, CP:
• enhances contrast.
Within a category, CP yields
• a loss of sensitivity
• a down-weighting of the importance of within-category variation.
• discarding continuous detail.
Categorical Perception
Categorization occurs when:
1) discriminably different stimuli…
2) …are treated equivalently for some purposes…
3) …and stimuli in other categories are treated differently.
Stimuli are not discriminably different.
CP: Categorization affects perception.
Definition: Categorization independent of perception.
Need a more integrated view…
Categorization occurs when:
1) discriminably different stimuli
Perceptual Categorization
CP: perception not independent of categorization.
2) are treated equivalently for some purposes…
3) and stimuli in other categories are treated differently.
Categorical Perception
Is continuous detail really discarded?
Across a category boundary, CP:
• enhances contrast.
Within a category, CP yields
• a loss of sensitivity
• a down-weighting of the importance of within-category variation.
• discarding continuous detail.
Evidence against the strong form of Categorical Perception from psychophysical-type tasks:
Discrimination tasks: Pisoni & Tash (1974); Pisoni & Lazarus (1974); Carney, Widin & Viemeister (1977)
Training: Samuel (1977); Pisoni, Aslin, Perey & Hennessy (1982)
Goodness ratings: Miller (1994, 1997…); Massaro & Cohen (1983)
Is continuous detail really discarded?
Sidebar: This has never been examined with non-speech stimuli…
Is continuous detail really discarded? No.
Why not?
Is it useful?
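[Figure: as the input “ba…” unfolds, the candidates bakery, basic, barrier, barricade, bait and baby are all active; the later “…kery” rules out every candidate except bakery]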
Online Word Recognition
• Information arrives sequentially• At early points in time, signal is temporarily ambiguous.
• Later arriving information disambiguates the word.
[Figure: candidate words (beach, bump, putter, dog, butter) over time for the input b… u… tt… e… r]
These processes have been well defined for a phonemic representation of the input.
But considerably less ambiguity if we consider within-category (subphonemic) information.
Example: subphonemic effects of motor processes.
Coarticulation
Sensitivity to these perceptual details might yield earlier disambiguation.
Example: Coarticulation
Articulation (lips, tongue…) reflects current, future and past events.
Subtle subphonemic variation in speech reflects temporal organization.
[Figure: coarticulation example, “net” vs. “neck”]
Any action reflects future actions as it unfolds.
What does sensitivity to within-category detail do?
Does within-category acoustic detail systematically affect higher level language?
Is there a gradient effect of subphonemic detail on lexical activation?
Experiment 1
Gradient relationship: systematic effects of subphonemic information on lexical activation.
If this gradiency is used it must be preserved over time.
Need a design sensitive to both systematic acoustic detail and detailed temporal dynamics of lexical activation.
Experiment 1
McMurray, Tanenhaus & Aslin (2002)
Use a speech continuum: more steps yield a better picture of the acoustic mapping.
KlattWorks: generate synthetic continua from natural speech.
Acoustic Detail
9-step VOT continua (0-40 ms)
6 pairs of words: beach/peach, bale/pale, bear/pear, bump/pump, bomb/palm, butter/putter
6 fillers:
lamp, leg, lock, ladder, lip, leaf
shark, shell, shoe, ship, sheep, shirt
How do we tap on-line recognition?
With an on-line task: Eye-movements
Subjects hear spoken language and manipulate objects in a visual world.
Visual world includes set of objects with interesting linguistic properties.
a beach, a peach and some unrelated items.
Eye-movements to each object are monitored throughout the task.
Temporal Dynamics
Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy, 1995
Why use eye-movements and the visual world paradigm?
• Relatively natural task.
• Eye-movements generated very fast (within 200 ms of first bit of information).
• Eye movements time-locked to speech.
• Subjects aren’t aware of eye-movements.
• Fixation probability maps onto lexical activation.
A moment to view the items
Task
Bear
Repeat 1080 times
Identification Results
[Figure: proportion of /p/ responses (0 to 1) as a function of VOT (0 to 40 ms), from B to P]
Category boundary by subject: 17.25 +/- 1.33 ms. By item: 17.24 +/- 1.24 ms.
High agreement across subjects and items for category boundary.
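One common way to read a category boundary off an identification function is to find the VOT at which /p/ responses cross 50%. The sketch below does this by linear interpolation; the slides do not say how the boundary was actually estimated, so both the method and the data values here are illustrative assumptions, and `category_boundary` is a hypothetical helper name.

```python
import numpy as np

def category_boundary(vot_ms, prop_p):
    """Estimate the VOT at which /p/ responses cross 50% (linear interpolation).

    Assumes the identification function rises strictly with VOT,
    because np.interp needs increasing x-coordinates.
    """
    vot_ms = np.asarray(vot_ms, dtype=float)
    prop_p = np.asarray(prop_p, dtype=float)
    if np.any(np.diff(prop_p) <= 0):
        raise ValueError("identification function must be strictly increasing")
    # Invert the function: treat proportion /p/ as x and VOT as y.
    return float(np.interp(0.5, prop_p, vot_ms))

# Hypothetical identification data for a 9-step continuum (NOT the study's data).
vot_steps = np.arange(0, 45, 5)  # 0, 5, ..., 40 ms
prop_p = [0.02, 0.03, 0.05, 0.25, 0.80, 0.95, 0.97, 0.98, 0.99]
boundary = category_boundary(vot_steps, prop_p)
```

Running this per subject and per item would give the two sets of boundaries whose means and spreads are compared on the slide.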
Task
Target = Bear
Competitor = Pear
Unrelated = Lamp, Ship
[Figure: fixations on trials 1 to 5 plotted over time (200 ms scale) and converted to % fixations]
Task
[Figure: fixation proportion (0 to 0.9) over time (0 to 2000 ms) for VOT = 0 and VOT = 40 trials]
More looks to competitor than unrelated items.
Task
Given that
• the subject heard bear
• clicked on “bear”…
How often was the subject looking at the “pear”?
[Figure: schematic fixation proportions to target and competitor over time, contrasting categorical results with a gradient effect]
Results
[Figure: competitor fixations (0 to 0.16) as a function of time since word onset (0 to 2000 ms), one curve per VOT step (0 to 40 ms in 5 ms steps), split by response]
Long-lasting gradient effect: seen throughout the timecourse of processing.
Area under the curve:
[Figure: competitor fixations (0.02 to 0.08) as a function of VOT (0 to 40 ms), split by response, with the category boundary marked]
Clear effects of VOT: B: p=.017*, P: p<.001***
Linear trend: B: p=.023*, P: p=.002***
Unambiguous Stimuli Only
[Figure: competitor fixations (0.02 to 0.08) as a function of VOT (0 to 40 ms), split by response, with the category boundary marked]
Clear effects of VOT: B: p=.014*, P: p=.001***
Linear trend: B: p=.009**, P: p=.007**
Summary
Subphonemic acoustic differences in VOT have gradient effect on lexical activation.
• Gradient effect of VOT on looks to the competitor.
• Seems to be long-lasting.
• Effect holds even for unambiguous stimuli.
Consistent with growing body of work using priming (Andruski, Blumstein & Burton, 1994; Utman, Blumstein & Burton, 2000; Gow, 2001, 2002).
Variants from the same category are not treated equivalently: Gradations in interpretation are related to gradations in stimulus.
Extensions
Word recognition is systematically sensitive to subphonemic acoustic detail.
Voicing; Laterality, Manner, Place; Natural Speech; Vowel Quality
[Figure: competitor fixations (0 to 0.1) as a function of VOT (0 to 40 ms) for Response=B and Response=P trials (looks to B), with the category boundary marked]
Metalinguistic Tasks
Categorical Perception
[Figure: identification (% /pa/) and discrimination functions across the /b/ to /p/ VOT continuum]
Within-category detail surviving to lexical level.
Abnormally sharp categories may be due to meta-linguistic tasks.
There is a middle ground: warping of perceptual space (e.g. Goldstone, 2002)
Retain: non-independence of perception and categorization.
Perceptual Categorization
Categorization occurs when:
1) discriminably different stimuli…
CP: perception not independent of categorization.
Exp 1: Lexical variants not treated equivalently (gradiency).
2) …are treated equivalently for some purposes…
3) …and stimuli in other categories are treated differently.
Perceptual Categorization
Categorization occurs when:
1) discriminably different stimuli…
CP: perception not independent of categorization.
Exp 1: Lexical variants not treated equivalently (gradiency).
2) …are treated equivalently for some purposes…
WHY?
3) …and stimuli in other categories are treated differently.
Progressive Expectation Formation
Can within-category detail be used to predict future acoustic/phonetic events?
Yes: Phonological regularities create systematic within-category variation.
• Predicts future events.
Any action reflects future actions as it unfolds.
[Figure: candidate words (maroon, goose, goat, duck) over time for the input m… a… r… oo… ng… g… oo… s…]
Word-final coronal consonants (n, t, d) assimilate the place of the following segment.
Place assimilation -> ambiguous segments that anticipate upcoming material.
Experiment 2: Anticipation
Subject hears:
“select the maroon duck”
“select the maroon goose”
“select the maroong goose”
“select the maroong duck” *
We should see faster eye-movements to “goose” after assimilated consonants.
Results
[Figure: fixation proportion to “goose” (0 to 0.9) as a function of time (0 to 600 ms), assimilated vs. non-assimilated; onset of “goose” + oculomotor delay marked]
Anticipatory effect on looks to non-coronal.
[Figure: fixation proportion to “duck” (0 to 0.3) as a function of time (0 to 600 ms), assimilated vs. non-assimilated; onset of “goose” + oculomotor delay marked]
Inhibitory effect on looks to coronal (duck, p=.024)
Experiment 2: Extensions
[Figure: looks to labial (0 to 0.8) as a function of time (200 to 800 ms); one panel compares Assim-Labials with Labials, the other Assimilated with Neutral]
Possible lexical locus
Green/m Boat
Eight/Ape Babies
Assimilation creates competition
Sensitivity to subphonemic detail:
• Increase priors on likely upcoming events.
• Decrease priors on unlikely upcoming events.
• Active Temporal Integration Process.
Possible lexical mechanism…
NOT treating stimuli equivalently allows within-category detail to be used for temporal integration.
Lexical activation is exquisitely sensitive to within-category detail: Gradiency.
This sensitivity is useful to integrate material over time.
• Progressive Facilitation
• Regressive Ambiguity resolution (ask me about this)
Adult Summary
Perceptual Categorization
Categorization occurs when:
1) discriminably different stimuli…
CP: perception not independent of categorization.
Exp 1: Lexical variants not treated equivalently (gradiency).
2) …are treated equivalently for some purposes…
Exp 2: Non-equivalence enables temporal integration.
3) …and stimuli in other categories are treated differently.
Historically, work in speech perception has been linked to development.
Sensitivity to subphonemic detail must revise our view of development.
Development
Use: Infants face additional temporal integration problems
No lexicon available to clean up noisy input: rely on acoustic regularities.
Extracting a phonology from the series of utterances.
Sensitivity to subphonemic detail:
For 30 years, virtually all attempts to address this question have yielded categorical discrimination (e.g. Eimas, Siqueland, Jusczyk & Vigorito, 1971).
Exception: Miller & Eimas (1996).
• Only at extreme VOTs.
• Only when habituated to non-prototypical token.
Nonetheless, infants possess abilities that would require within-category sensitivity.
• Infants can use allophonic differences at word boundaries for segmentation (Jusczyk, Hohne & Bauman, 1999; Hohne & Jusczyk, 1994)
• Infants can learn phonetic categories from distributional statistics (Maye, Werker & Gerken, 2002; Maye & Weiss, 2004).
Use?
Speech production causes clustering along contrastive phonetic dimensions.
E.g. Voicing / Voice Onset Time
B: VOT ~ 0
P: VOT ~ 40-50
Result: Bimodal distribution
Within a category, VOT forms a Gaussian distribution.
[Figure: bimodal distribution of VOT with modes near 0 ms and 40 ms]
Statistical Category Learning
To statistically learn speech categories, infants must:
• Record frequencies of tokens at each value along a stimulus dimension.
• Extract categories from the distribution.
[Figure: frequency of tokens as a function of VOT (0 to 50 ms), with a +voice mode and a -voice mode]
• This requires ability to track specific VOTs.
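The two steps above (record token frequencies along the dimension, then extract categories from the distribution) can be sketched as a histogram followed by a search for its modes. This is only an illustration, not the infants' mechanism or the authors' analysis: the category means (0 and 45 ms), the standard deviation (7 ms), the bin width and the helper names are all assumptions chosen for the example.

```python
import numpy as np

def vot_histogram(vots_ms, bin_width=5.0, lo=-30.0, hi=80.0):
    """Record how often tokens occur in each VOT bin."""
    edges = np.arange(lo, hi + bin_width, bin_width)
    counts, _ = np.histogram(vots_ms, bins=edges)
    centers = (edges[:-1] + edges[1:]) / 2
    return centers, counts

def find_modes(centers, counts, min_rel_height=0.1):
    """Return bin centers that are local maxima of the smoothed histogram."""
    smooth = np.convolve(counts, np.ones(3) / 3, mode="same")
    floor = min_rel_height * smooth.max()  # ignore tiny bumps in the tails
    modes = []
    for i in range(1, len(smooth) - 1):
        is_peak = smooth[i] >= smooth[i - 1] and smooth[i] > smooth[i + 1]
        if is_peak and smooth[i] >= floor:
            modes.append(float(centers[i]))
    return modes

# Simulated input: equal numbers of /b/-like and /p/-like tokens (assumed values).
rng = np.random.default_rng(0)
vots = np.concatenate([rng.normal(0, 7, 10000), rng.normal(45, 7, 10000)])
centers, counts = vot_histogram(vots)
modes = find_modes(centers, counts)  # one mode near 0 ms, one near 45 ms
```

The point of the sketch is that the learner needs the per-value counts: if VOT were collapsed to a category label before counting, the bimodal shape could not be recovered.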
Why no demonstrations of sensitivity?
• Habituation
- Discrimination, not ID.
- Possible selective adaptation.
- Possible attenuation of sensitivity.
• Synthetic speech
- Not ideal for infants.
• Single exemplar/continuum
- Not necessarily a category representation.
Experiment 3: Reassess issue with improved methods.
Experiment 3
Head-Turn Preference Procedure (Jusczyk & Aslin, 1995)
Infants exposed to a chunk of language:
• Words in running speech.
• Stream of continuous speech (à la statistical learning paradigm).
• Word list.
Memory for exposed items (or abstractions) assessed:
• Compare listening time between consistent and inconsistent items.
HTPP
Test trials start with all lights off.
Center Light blinks.
Brings infant’s attention to center.
One of the side-lights blinks.
When infant looks at side-light, he hears a word (“Beach… Beach… Beach…”) for as long as he keeps looking.
7.5 month old infants exposed to either 4 b-, or 4 p-words.
80 repetitions total.
Form a category of the exposed class of words.
Peach / Beach
Pail / Bail
Pear / Bear
Palm / Bomb
Measure listening time on…
VOT closer to boundary: Pear* / Bear*
Competitors: Bear / Pear
Original words: Pear / Bear
Methods
Stimuli constructed by cross-splicing naturally produced tokens of each end point.
B: M = 3.6 ms VOT
P: M = 40.7 ms VOT
B*: M = 11.9 ms VOT
P*: M = 30.2 ms VOT
B* and P* were judged /b/ or /p/ at least 90% consistently by adult listeners.
B*: 97%
P*: 96%
Novelty or Familiarity?
Novelty/Familiarity preference varies across infants and experiments.
We’re only interested in the middle stimuli (b*, p*).
Infants were classified as novelty or familiarity preferring by performance on the endpoints.

Exposed to   Novelty   Familiarity
B            36        16
P            21        12

Within each group will we see evidence for gradiency?
After being exposed to bear… beach… bail… bomb…
Infants who show a novelty effect will look longer for pear than bear.
What about in between? Categorical or gradient?
[Figure: schematic listening time for Bear, Bear* and Pear]
Experiment 3: Results
[Figure: listening time (4000 to 10000 ms) for Target, Target* and Competitor, by exposure group (B, P)]
Novelty infants (B: 36, P: 21)
Target vs. Target*: p<.001
Competitor vs. Target*: p=.017
Familiarity infants (B: 16, P: 12)
Target vs. Target*: p=.003
Competitor vs. Target*: p=.012
[Figure: listening time (4000 to 10000 ms) for Target, Target* and Competitor, in four panels by exposure group and preference; p-values as marked on each panel]
Infants exposed to /p/ (P, P*, B):
Novelty (N=21): .024*, .009**
Familiarity (N=12): .018*, .028*
Infants exposed to /b/ (B, B*, P):
Novelty (N=36): <.001**, >.1 (also shown as >.2)
Familiarity (N=16): .06, .15
Experiment 3 Conclusions
Contrary to all previous work:
7.5 month old infants show gradient sensitivity to subphonemic detail.
• Clear effect for /p/
• Effect attenuated for /b/.
Reduced effect for /b/… But:
[Figure: schematic listening times for Bear, Bear* and Pear, in three panels: null effect, expected result, actual result]
• Bear* Pear
• Category boundary lies between Bear & Bear* (between 3 ms and 11 ms) [??]
• Within-category sensitivity in a different range?
Same design as Experiment 3.
VOTs shifted away from hypothesized boundary.
Train
Palm, Pear, Peach, Pail: 40.7 ms
Bomb*, Bear*, Beach*, Bale*: 3.6 ms
Bomb, Bear, Beach, Bale: -9.7 ms
Test:
Bomb, Bear, Beach, Bale: -9.7 ms
Experiment 4
[Figure: listening time (4000 to 9000 ms) for B-, B and P, one panel per preference group; p-values as marked on each panel]
Familiarity infants (34 infants): =.05*, =.01**
Novelty infants (25 infants): =.02*, =.002**
Experiment 4 Conclusions
• Within-category sensitivity in /b/ as well as /p/.
Infants do NOT treat stimuli from the same category equivalently: Gradient.
Perceptual Categorization
Categorization occurs when:
1) discriminably different stimuli…
CP: perception not independent of categorization.
Exp 1: Lexical variants not treated equivalently (gradiency).
2) …are treated equivalently for some purposes…
Exp 2: Non-equivalence enables temporal integration.
Exp 3/4: Infants do not treat category members equivalently.
3) …and stimuli in other categories are treated differently.
Remaining questions:
1) Why the strange category boundary?
2) Where does this gradiency come from?
Where does this gradiency come from?
[Figure: listening time across VOT for B-, B, B*, P* and P]
Results resemble half a Gaussian…
And the distribution of VOTs is Gaussian (Lisker & Abramson, 1964).
Statistical Learning Mechanisms?
Why the strange category boundary?
/b/ results consistent with (at least) two mappings.
1) Shifted boundary
• Inconsistent with prior literature.
[Figure: category mapping strength across VOT for /b/ and /p/, with the boundary shifted]
2) Sparse categories
• HTPP is a one-alternative task. It asks “B or not-B?”, not “B or P?”
• Hypothesis: sparse categories are a by-product of efficient learning.
[Figure: category mapping strength across VOT for /b/ and /p/, with the adult boundary and unmapped space between the categories]
Remaining questions:
1) Why the strange category boundary?
2) Where does this gradiency come from?
Are both a by-product of statistical learning?
Can a computational approach contribute?
Can a computational approach contribute?
Mixture of Gaussian model of speech categories
1) Models distribution of tokens as a mixture of Gaussian distributions over a phonetic dimension (e.g. VOT).
2) Each Gaussian represents a category. Posterior probability of VOT ~ activation.
3) Each Gaussian has three parameters: mean (μ), standard deviation (σ) and mixing weight.
[Figure: category mapping strength across VOT for /b/ and /p/, with the adult boundary and unmapped space]
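As a rough illustration of point 2, the posterior probability of each category for a given VOT can be computed from a weighted Gaussian per category. The parameter values below (equal weights, means of 0 and 45 ms, standard deviations of 8 ms) and the function names are assumptions made for this example, not values from the model in the talk.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and SD sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posteriors(vot, categories):
    """Posterior probability of each category given a VOT value.

    categories: list of (weight, mu, sigma) tuples, one per Gaussian.
    """
    likes = [w * gaussian_pdf(vot, mu, sigma) for w, mu, sigma in categories]
    total = sum(likes)
    return [like / total for like in likes]

# Assumed two-category system: /b/ near 0 ms, /p/ near 45 ms.
categories = [(0.5, 0.0, 8.0), (0.5, 45.0, 8.0)]

at_b = posteriors(0.0, categories)        # almost entirely /b/
at_middle = posteriors(22.5, categories)  # evenly split between /b/ and /p/
p_curve = [posteriors(v, categories)[1] for v in range(0, 45, 5)]
```

Because the posterior changes smoothly with VOT, activation read off this model is graded within a category rather than all-or-none.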
Computational Model
Statistical Category Learning
1) Start with a set of randomly selected Gaussians.
2) After each input, adjust each parameter to find best description of the input.
3) Start with more Gaussians than necessary: the model doesn’t innately know how many categories there are. The mixing weight -> 0 for unneeded categories.
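A minimal sketch of step 2 (adjust every parameter a little after each input) is given below, using an incremental EM-style update. This is a stand-in, not the learning rule used in the talk, which the slides do not specify. For simplicity it starts from two fixed Gaussians rather than a larger random set, and it has no mechanism that drives unused weights to zero. The learning rate, starting values, input statistics and dictionary keys are all assumptions.

```python
import math
import random

def gaussian_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def update(categories, x, lr=0.01, min_sigma=0.5):
    """Nudge every Gaussian toward a better description of one token x.

    Each category is a dict with hypothetical keys 'weight', 'mu', 'sigma'.
    """
    likes = [c["weight"] * gaussian_pdf(x, c["mu"], c["sigma"]) for c in categories]
    total = sum(likes)
    if total == 0.0:
        return  # token too far from every category to assign credit
    for c, like in zip(categories, likes):
        r = like / total              # share of the token credited to this category
        err = x - c["mu"]
        var = c["sigma"] ** 2
        c["mu"] += lr * r * err                    # move the mean toward the token
        var += lr * r * (err * err - var)          # widen or narrow the category
        c["sigma"] = max(math.sqrt(var), min_sigma)
        c["weight"] += lr * (r - c["weight"])      # track how often it is used

# Assumed starting state and input statistics (illustrative only).
rng = random.Random(0)
cats = [{"weight": 0.5, "mu": 10.0, "sigma": 10.0},
        {"weight": 0.5, "mu": 35.0, "sigma": 10.0}]
for _ in range(5000):
    mean = 0.0 if rng.random() < 0.5 else 45.0
    update(cats, rng.gauss(mean, 7.0))

learned_means = sorted(c["mu"] for c in cats)
total_weight = sum(c["weight"] for c in cats)
```

Each update uses the graded share `r`, so the rule only works if the learner keeps track of where in VOT space each token fell.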
VOT VOT
Overgeneralization
• large σ
• costly: lose phonetic distinctions…
Undergeneralization
• small σ
• not as costly: maintain distinctiveness.
To increase likelihood of successful learning:
• err on the side of caution.
• start with small σ.
[Figure: P(Success) (0 to 1) as a function of starting σ (0 to 60) for the 2-category and 3-category models; 39,900 models run]
Sparseness coefficient: % of space not strongly mapped to any category.
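One way to make this definition concrete is sketched below: sample the dimension on a grid, take each point's strongest mapping to any category, and report the fraction of points that fall below a threshold. The slides do not say how "strongly mapped" was operationalised, so the mapping-strength measure (each Gaussian scaled to peak at 1), the 0.1 threshold, the VOT range and the function name are all assumptions.

```python
import numpy as np

def sparseness_coefficient(categories, lo=-20.0, hi=70.0,
                           threshold=0.1, n_points=901):
    """Fraction of the VOT range not strongly mapped to any category.

    categories: list of (mu, sigma) pairs. Mapping strength is taken to be
    each Gaussian scaled so that its peak equals 1 (an assumption).
    """
    grid = np.linspace(lo, hi, n_points)
    strength = np.zeros_like(grid)
    for mu, sigma in categories:
        strength = np.maximum(strength, np.exp(-0.5 * ((grid - mu) / sigma) ** 2))
    return float(np.mean(strength < threshold))

# Narrow categories leave most of the space unmapped; broad ones cover it.
narrow = sparseness_coefficient([(0.0, 3.0), (45.0, 3.0)])
broad = sparseness_coefficient([(0.0, 15.0), (45.0, 15.0)])
```

Under these assumptions, tracking this value across training epochs gives the kind of curve summarised for each starting σ.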
[Figure: average sparseness coefficient (0 to 0.4) over training epochs (0 to 12000), one curve per starting σ; unmapped space marked along VOT]
Small starting σ (.5-1).
Start with large σ (20-40).
Intermediate starting σ (3-11, 12-17).
Small or even medium starting σ => sparse category structure during infancy: much of phonetic space is unmapped.
To avoid overgeneralization, it is better to start with small estimates for σ.
Model Conclusions
Tokens that are treated differently may not be in different categories.
Continuous sensitivity required for statistical learning.
Statistical learning enhances gradient category structure.
Perceptual Categorization
Categorization occurs when:
1) discriminably different stimuli…
CP: perception not independent of categorization.
Exp 1: Lexical variants not treated equivalently (gradiency).
2) …are treated equivalently for some purposes…
Exp 2: Non-equivalence enables temporal integration.
Exp 3/4: Infants do not treat category members equivalently.
Model: Tokens treated differently are not in different categories (sparseness).
Model: Gradiency arises from statistical learning.
Model: Sparseness is a by-product of optimal learning.
3) …and stimuli in other categories are treated differently.
Examination of sparseness/completeness of categories needs a two alternative task.
AEM Paradigm
Treating stimuli equivalently
Treating stimuli differently
Identification, not discrimination.
Existing infant methods:
• Habituation
• Head-Turn Preference
• Preferential Looking
Mostly test discrimination.
AEM Paradigm
Exception: Conditioned Head Turn (Kuhl, 1979)
• After training generalization can be assessed.
• Approximates Go/No-Go task.
• Infant hears constant stream of distractor stimuli.
a a a a…
• Conditioned to turn head in response to a target stimulus using visual reinforcer.
i
AEM Paradigm
When detection occurs this could be because
• Stimulus is perceptually equivalent to target.
• Stimulus is perceptually different but member of same category as target.
When no detection, this could be because
• Stimuli are perceptually different.
• Stimuli are in different categories.
A solution: the multiple exemplar approach
AEM Paradigm
Multiple exemplar methods (Kuhl, 1979; 1983)
• Training: single distinction i/a.
• Irrelevant variation gradually added (speaker & pitch).
• Good generalization.
This exposure may mask natural biases:
• Infants trained on irrelevant dimension(s).
• Infants exposed to expected variation along irrelevant dimension.
Infants trained on a single exemplar did not generalize.
AEM Paradigm
HTPP, Habituation and Conditioned Head-Turn methods all rely on a single response: criterion effects.
Is [picture] a member of [picture]’s category?
Yes:
• Both dogs
• Both mammals
• Both 4-legged animals
No:
• Different breeds
• Different physical properties
How does experimenter establish the decision criterion?
AEM Paradigm
Multiple responses:
Is [picture] a member of [picture] or [picture]?
Two-alternative tasks specify criteria without explicitly teaching:
• What the irrelevant cues are
• Their statistical properties (expected variance).
Pug vs. poodle: Decision criteria will be based on species-specific properties (hair-type, body-shape).
AEM Paradigm
Conditioned-Head-Turn provides right sort of response, but cannot be adapted to two-alternatives (Aslin & Pisoni, 1980).
• Large metabolic cost in making head-movement.
• Requires 180º shift in attention.
Could we use a different behavioral response in a similar conditioning paradigm?
AEM Paradigm
Eye movements may provide ideal response.
• Smaller angular displacements detectable with computer-based eye-tracking.
• Metabolically cheap: quick and easy to generate.
How can we train infants to make eye movements to target locations?
AEM Paradigm
Infants readily make anticipatory eye movements to regularly occurring visual events:
Visual Expectation Paradigm (Haith, Wentworth & Canfield, 1990; Canfield, Smith, Breznyak & Snow, 1997)
Movement under an occluder (Johnson, Amso & Slemmer, 2003)
AEM Paradigm
Anticipatory Eye-Movements (AEM):
Train infants to use anticipatory eye movements as a behavioral label for category identity.
• Two-alternative response (left-right).
• Arbitrary, identification response.
• Response to a single stimulus.
• Many repeated measures.
AEM Paradigm
Each category is associated with the left or right side of the screen.
Categorization stimuli followed by visual reinforcer.
AEM Paradigm
Delay between stimulus and reward gradually increases throughout experiment.
[Figure: timeline of STIMULUS and REINFORCER on trial 1 and on trial 30]
Delay provides opportunity for infants to make anticipatory eye-movements to expected location.
AEM Paradigm
After training on original stimuli, infants are tested on a mixture of:
• new, generalization stimuli (unreinforced)
- Examine category structure/similarity relative to trained stimuli.
• original, trained stimuli (reinforced)
- Maintain interest in experiment.
- Provide objective criterion for inclusion.
AEM Paradigm
[Figure: apparatus. Baby facing a TV with a remote eye-tracker and infrared video camera; MHT transmitter, MHT receiver and MHT control unit; eye-tracker control unit connected to the eye-tracking computer]
Gaze position assessed with automated, remote eye-tracker.
Gaze position recorded on standard video for analysis.
?
Experiment 5
Multidimensional visual categories
Can infants learn to make anticipatory eye movements in response to visual category identity?
What is the relationship between basic visual features in forming perceptual categories?
• Shape• Color• Orientation
Experiment 5
Train: Shape (yellow square and yellow cross)
Test: Variation in color and orientation.
• Color: Yellow (training value), Orange, Red
• Orientation: 0º (training value), 10º, 20º
If infants ignore irrelevant variation in color or orientation, performance should be good for generalization stimuli.
If infants’ shape categories are sensitive to this variation, performance will degrade.
Experiment 5: Results
[Figure: percent correct (0 to 80) for training stimuli (yellow, 0°) and for generalization stimuli by color (yellow, orange, red; n.s.) and by angle (0°, 10°, 20°; p<.05)]
9/10 scored better than chance on original stimuli. M = 68.7% correct.
No effect of color (p>.2).
Significant performance deficit due to orientation (p=.002).
Some stimuli are uncategorized (despite very reasonable responses): sparseness.
Sparse region of input space.
Perceptual Categorization
Categorization occurs when:
1) discriminably different stimuli…
CP: perception not independent of categorization.
Exp 1: Lexical variants not treated equivalently (gradiency).
2) …are treated equivalently for some purposes…
Exp 2: Non-equivalence enables temporal integration.
Exp 3/4: Infants do not treat category members equivalently.
Model: Tokens treated differently are not in different categories (sparseness).
Model: Gradiency arises from statistical learning.
Model: Sparseness is a by-product of optimal learning.
3) …and stimuli in other categories are treated differently.
Exp 5: Shape categories show similar sparse structure.
Occlusion-Based AEM
AEM is based on an arbitrary mapping.
• Unnatural mechanism drives anticipation.
• Requires slowly changing duration of delay-period.
Infants do make eye-movements to anticipate objects’ trajectories under an occluder. (Johnson, Amso & Slemmer, 2003)
Can infants associate anticipated trajectories (under the occluder) with target identity?
Red Square
Yellow Cross
Yellow Square
Can AEM assess auditory categorization?
Can infants “normalize” for variations in pitch and duration?
or…
Are infants sensitive to acoustic detail during a lexical identification task?
Experiment 6
Training:
“Teak” -> rightward trajectory.
“Lamb” -> leftward trajectory.
Test: Lamb & Teak with changes in:
Duration: 33% and 66% longer.
Pitch: 20% and 40% higher.
If infants ignore irrelevant variation in pitch or duration, performance should be good for generalization stimuli.
If infants’ lexical representations are sensitive to this variation, performance will degrade.
Experiment 6: Results
[Figure: proportion correct trials (0 to 0.9) for training stimuli and for generalization stimuli D1 / P1 and D2 / P2, by duration and pitch; training stimulus shown: lamb]
Duration: p=.002
Pitch: p>.1
20 training trials. 11 of 29 infants performed better than chance.
Again, some stimuli are uncategorized (despite very reasonable responses): sparseness.
Variation in pitch is tolerated for word-categories.
Variation in duration is not.
- Takes a gradient form.
Perceptual Categorization
Categorization occurs when:
1) discriminably different stimuli…
CP: perception not independent of categorization.
Exp 1: Lexical variants not treated equivalently (gradiency).
2) …are treated equivalently for some purposes…
Exp 2: Non-equivalence enables temporal integration.
Exp 3/4: Infants do not treat category members equivalently.
Model: Tokens treated differently are not in different categories (sparseness).
Model: Gradiency arises from statistical learning.
Model: Sparseness is a by-product of optimal learning.
3) …and stimuli in other categories are treated differently.
Exp 5,6: Shape, Word categories show similar sparse structure.
Exp 6: Gradiency in infant response to duration.
Can AEM help understand face categorization?
Are facial variants treated equivalently?
Train: two arbitrary faces
Test: same faces at 0°, 45°, 90°, 180°
Facial inversion effect.
Exp 7: Face Categorization
Experiment 7: Results
[Figure: percent correct (0 to 1) for vertical, 45º, 90º and 180º faces]
22/33 successfully categorized vertical faces.
• 45º, 180º: chance (p>.2).
• 90º: p=.111
• 90º vs. Vertical: p<.001
• 90º vs. 45º & 180º: p<.001.
Experiment 7
AEM useful with faces.
Facial Inversion effect replicated.
Generalization is not simple similarity (90º vs. 45º): infants’ own category knowledge is reflected.
Resembles VOT (b/p) results: within a dimension, some portions are categorized, others are not.
Again, some stimuli are uncategorized (despite very reasonable responses): sparseness.
Perceptual Categorization
Categorization occurs when:
1) discriminably different stimuli…
CP: perception not independent of categorization.
Exp 1: Lexical variants not treated equivalently (gradiency).
2) …are treated equivalently for some purposes…
Exp 2: Non-equivalence enables temporal integration.
Exp 3/4: Infants do not treat category members equivalently.
Model: Tokens treated differently are not in different categories (sparseness).
Model: Gradiency arises from statistical learning.
Model: Sparseness is a by-product of optimal learning.
3) …and stimuli in other categories are treated differently.
Exp 6: Gradiency in infant response to duration.
Exp 5,6,7: Shape, Word, Face categories show similar sparse structure.
Again, some stimuli are uncategorized (despite very reasonable responses): sparseness.
Experiment   Category   Variation tolerated   Variation not tolerated
Exp 5        Shapes     Color                 Orientation
Exp 6        Words      Pitch                 Duration
Exp 7        Faces      90°                   Orientation
Evidence for complex, but sparse categories: some dimensions (or regions of a dimension) are included in the category, others are not.
Infant Summary
• Infants show graded sensitivity to continuous speech cues.
• /b/-results: regions of unmapped phonetic space.
• Statistical approach provides support for sparseness.
- Given current learning theories, sparseness results from optimal starting parameters.
• Empirical test will require a two-alternative task: AEM.
• Test of AEM paradigm also shows evidence for sparseness in shapes, words, and faces.
Audience Specific Conclusions
For speech people
Gradiency: continuous information in the signal is not discarded and is useful during recognition.
Gradiency: Infant speech categories are also gradient, a result of statistical learning.
For infant people
Methodology: AEM is a useful technique for measuring categorization in infants (bonus: works with undergrads too).
Sparseness: Through the lens of a 2AFC task (or interactions of categories), categories look more complex.
Perceptual Categorization
1) discriminably different stimuli…
CP: discrimination not distinct from categorization. Continuous feedback relationship between perception and categorization.
2) …are treated equivalently for some purposes…
Gradiency: Infants and adults do not treat stimuli equivalently. This property arises from learning processes as well as the demands of the task.
3) …and stimuli in other categories are treated differently.
Sparseness: Infants’ categories do not fully encompass the input. Many tokens are not categorized at all…
Conclusions
Categorization is an approximation of an underlyingly continuous system.
Clumps of similarity in stimulus-space.
Reflect underlying learning processes and demands of online processing.
During development, categorization is not common (across the complete perceptual space)—small, specific clusters may grow to larger representations.
This is useful: avoid overgeneralization.
Take Home Message
Early, sparse, regions of graded similarity space
…
grow, gain structure
…
but retain their fundamental gradiency.
[Figure: eye-tracking setup. Labels: head-tracker camera and monitor, IR head-tracker emitters, head with 2 eye cameras, eyetracker computer, subject computer; computers connected via Ethernet.]
Misperception: Additional Results
10 pairs of b/p items.
• 0 – 35 ms VOT continua.
20 Filler items (lemonade, restaurant, saxophone…)
Option to click “X” (Mispronounced).
26 Subjects
1240 Trials over two days.
Identification Results
[Figure: response rate (0.00 to 1.00) by VOT (0 to 35 ms) for Voiced, Voiceless, and NW responses; panels: Barricade / Parricade and Barakeet / Parakeet.]
Significant target responses even at extreme.
Graded effects of VOT on correct response rate.
Phonetic “Garden-Path”
“Garden-path” effect: difference between looks to each target (b vs. p) at the same VOT.
[Figure: fixations to target (0 to 1) over time (ms) for Barricade and Parakeet; panels: VOT = 0 (/b/) and VOT = 35 (/p/).]
[Figure: garden-path effect (Barricade - Parakeet) by VOT (0 to 35 ms); panels: Target and Competitor.]
GP effect: gradient effect of VOT.
Target: p<.0001
Competitor: p<.0001
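The garden-path effect defined above is a per-VOT difference score, which can be sketched in a few lines. This is only an illustration: the function name `garden_path_effect` and the fixation proportions are hypothetical, not data or analysis code from the experiment.

```python
def garden_path_effect(looks_b_item, looks_p_item):
    """Difference in looks (b-item minus p-item) at each shared VOT step.

    Both arguments map VOT (ms) to a mean fixation proportion.
    """
    shared_vots = sorted(set(looks_b_item) & set(looks_p_item))
    return {vot: looks_b_item[vot] - looks_p_item[vot] for vot in shared_vots}


# Made-up fixation proportions, for illustration only (not results from the talk).
looks_barricade = {0: 0.75, 15: 0.5, 35: 0.25}
looks_parakeet = {0: 0.25, 15: 0.5, 35: 0.75}

effect = garden_path_effect(looks_barricade, looks_parakeet)
for vot, diff in effect.items():
    print(f"VOT {vot:2d} ms: {diff:+.2f}")
```

A gradient effect of VOT would show up as a difference score that changes steadily across the continuum rather than flipping at a single boundary.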
Assimilation: Additional Results
runm picks
runm takes ***
When /p/ is heard, the bilabial feature can be assumed to come from assimilation (not an underlying /m/).
When /t/ is heard, the bilabial feature is likely to be from an underlying /m/.
Within-category detail used in recovering from assimilation: temporal integration.
• Anticipate upcoming material
• Bias activations based on context
- Like Exp 2: within-category detail retained to resolve ambiguity.
Phonological variation is a source of information.
Exp 3 & 4: Conclusions
Subject hears:
“select the mud drinker”
“select the mudg gear”
“select the mudg drinker”
Critical Pair
[Figure: fixation proportion over time (0 to 2000 ms) for initial coronal (Mud Gear) vs. initial non-coronal (Mug Gear); marked: onset of “gear” and avg. offset of “gear” (402 ms).]
Mudg Gear is initially ambiguous with a late bias towards “Mud”.
[Figure: fixation proportion over time (0 to 2000 ms) for initial coronal (Mud Drinker) vs. initial non-coronal (Mug Drinker); marked: onset of “drinker” and avg. offset of “drinker” (408 ms).]
Mudg Drinker is also ambiguous with a late bias towards “Mug” (the /g/ has to come from somewhere).
[Figure: fixation proportion over time (0 to 600 ms) for assimilated vs. non-assimilated consonants; marked: onset of “gear”.]
Looks to non-coronal (gear) following assimilated or non-assimilated consonant.
In the same stimuli/experiment there is also a progressive effect!
Non-parametric approach?
• Competitive Hebbian Learning (Rumelhart & Zipser, 1986).
• Not constrained by a particular equation: can fill space better.
• Similar properties in terms of starting and sparseness.
[Figure: categories along the VOT dimension.]
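As a rough illustration of competitive learning along a single VOT dimension, here is a minimal winner-take-all sketch. It is a simplified one-dimensional variant, not the Rumelhart & Zipser formulation and not the model from this talk. The VOT distribution (modes at 0 and 50 ms, SD 5 ms), the starting positions, the learning rate, and the name `train_competitive` are all assumptions made for the example.

```python
import random


def train_competitive(inputs, weights, learning_rate=0.05):
    """Winner-take-all learning: only the unit closest to each input moves toward it."""
    weights = list(weights)
    for x in inputs:
        winner = min(range(len(weights)), key=lambda i: abs(weights[i] - x))
        weights[winner] += learning_rate * (x - weights[winner])
    return weights


rng = random.Random(0)
# Hypothetical bimodal VOT input (ms); the modes and SD are assumptions.
vots = [rng.gauss(0.0 if rng.random() < 0.5 else 50.0, 5.0) for _ in range(2000)]

# Two units start near the data; two start far away and never win a competition.
initial = [-100.0, 10.0, 40.0, 200.0]
trained = train_competitive(vots, initial)
print([round(w, 1) for w in trained])
```

In this sketch the two nearby units settle on the two clumps of input, while the distant units are never recruited and keep their starting values. Whether that maps onto sparseness in the sense used here is left as an open question.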