Björn Lindblom: Developmental origins of adult phonology
Developmental origins of adult phonology
The interplay between phonetic emergents and the evolutionary adaptations of
sound patterns
Björn Lindblom
Stockholm University, Sweden
1. Introduction
Fundamental to linguistic methodology is the distinction between the abstract
structure of an utterance, its form, and its behavioral expression, its substance. The
traditional division of labor between phonology and phonetics derives from that
distinction (Fischer-Jørgensen 1975). A crucial step in the history of the discipline
was taken by stipulating that form (la langue) take precedence over substance (la
parole) (Saussure 1916).
However, as students of speech sounds endeavor to increase the explanatory
adequacy of their descriptions, it is becoming increasingly clear that the assumption
of the ‘logical priority of linguistic form’, left essentially intact since de Saussure, is
counter-productive to that goal. One of the aims of this introduction is to reexamine
this time-honored assumption.
An area where the consequences of the doctrine are particularly evident is
language acquisition where, logically, the priority of linguistic form cannot be applied
in any plausible way, since at the onset of development there is no form. We shall end
our introductory remarks by concluding that, if accounting for how children learn
their native sound systems is to be part of explanatory linguistics, the doctrine of
‘form first, then substance’ must be rejected and replaced by another paradigm. The
question of what that framework should be will be considered in the second part of
the paper.
1.1 The ‘inescapable’ dogma of 20th century linguistics
The roles of phonology and phonetics are schematically diagrammed in Figure 1.
The starting point is spoken samples from a given language observed by ear and
specified in terms of the elements of a universal phonetic alphabet. This provides raw
materials for functional analyses in which judgements of contrast by native
informants play a crucial role.
Experimental phonetics contributes physical descriptions of the perceptually
relevant correlates of the phonological units. In principle, these specifications can be
translated into audible, and therefore perceptually testable, form by means of speech
synthesis. The key notion is that speech performance is to be analyzed as a realization
of underlying, grammatically determined aspects of sound. Phonology aims at
discovering those aspects, whereas phonetics describes how, once defined, linguistic
structure is actualized by the speaker and how it is recovered from the signal by the
listener.
The significant phrase here is ‘once defined’, since, traditionally, physical
observations cannot precede functional analyses: Linguistic form must take
precedence over phonetic substance.
natural speech patterns
⇓ functional analyses: phonology (form)
⇓ formal units, rules
⇓ instrumental analyses: phonetics (substance)
⇓ physical correlates of perceptually relevant attributes
Figure 1. The traditional division of labor between phonetics and phonology.
A glimpse of the origins of the form-substance distinction can be obtained by
considering the difficulties that the founders of the International Phonetic Association
must have had in their attempts to create a phonetic alphabet, e.g., the problems of
‘phonetic variability’ and ‘phonetic detail’.
What speaking style should phonological analyses be based on? Suppose numerous
instances are recorded of the ‘same’ German utterance, e.g. ‘mit dem Wagen’, ranging
from clear […] to more casual forms such as […] (Kohler
1990). A standard move has been to exclude such style-dependent variations from the
domain of phonology proper. As stated by Jakobson & Halle:
"When analyzing the pattern of phonemes or distinctive features composing them, one mustrecur to the fullest, optimal code at the command of the given speakers." Jakobson and Halle(1968:413-414).
Second consider the problem of ‘phonetic detail’. Should tenth be represented as
������, �����������, or with even more detail? Sweet concluded:
"It is necessary to have an alphabet which indicates only those broader distinctions of soundwhich actually correspond to distinctions of meaning in a given language." (Sweet (1877, 103-104)).
It has been claimed that one of de Saussure’s major contributions was:
"... to focus the attention of the linguist on the system of regularities and relations which support the differences among signs, rather than on the details of individual sound and meaning in and of themselves ... For Saussure, the detailed information accumulated by phoneticians is of only limited utility for the linguist, since he is primarily interested in the ways in which sound images differ, and thus does not need to know everything the phonetician can tell him ... By this move, then, linguists could be emancipated from their growing obsession with phonetic detail." (Anderson 1985:41-42, italics ours).
The form-substance distinction ‘solves’ the problems of phonetic detail and
phonetic variability by invoking a process of abstraction and idealization. It replaces
variable and context-dependent behavioral data by invariant and context-free entities,
such as phonemes and allophones. Phonetic substance is stripped away as
linguistically irrelevant so as to uncover the phonologically significant structure
assumed to be embedded in that substance. In other words, progress is achieved by
making phonological structure independent of its behavioral use.
As mentioned, physical observation can never precede functional analysis. For an
illustration consider the following observations. When the Swedish word ‘nolla’
(‘zero’) is played backwards, native Swedes hear ‘hallon’ (‘raspberry’) rather than the
non-word ‘allon’ (Lindblom 1980)¹. Spectrograms indicate that, when spoken as a
citation form, ‘nolla’ has expiration noise at the end of the final vowel. This noise is
heard as a speech sound when the tape runs backwards, but not when the word is
presented in the forward direction. Another finding is that, when the name ‘Anna’ is
played backwards, native Swedes hear ‘Hanna’ rather than ‘Anna’, which indicates
that the perceptual asymmetry of the ‘nolla-hallon’ example is likely to originate in
auditory processing (e.g., differences between forward and backward masking) rather
than in language-specific lexical access. The point of these examples is that the same
physical pattern (the utterance-final noise) is a linguistically significant event in one
situation, but not in the other. To the phonologist,
“... nothing in the physical event ... tells us what is worth measuring and what is not.” (Anderson 1985:41)
The ‘nolla-hallon’ demonstration² conforms with the widespread conviction that
making phonetic measurements, no matter how comprehensive, would not help the
phonologist, since only the ear and the brain of the native speaker can determine what
is of linguistic relevance in a speech signal. It is observations of this sort that have led
linguists to stipulate that form must come first, then substance.
¹ These words have accent II, the ‘grave’ accent. Produced as citation forms their F0 contours are fall-rise patterns which will remain fall-rise patterns also when reversed.
² The full account of the nolla-hallon effect (Lindblom 1980) has a second part which relates the observed perceptual asymmetry to the fact that the world’s languages seem to prefer using the glottal aspirated [h] in syllable-initial over syllable-final position. If, as we suggest, the nolla-hallon asymmetry is linked to universal characteristics of human hearing, this speech-independent auditory property could be invoked to explain the typological observations as well. However, predicting such patterns would require a non-traditional framework that puts ‘substance’ first and ‘form’ second (see further discussion below).
This distinction has been central for all of 20th century linguistics. Linguists have
left it intact assuming their primary concern to be with the individual native speaker's
competence (mental grammar, tacit knowledge), not with performance (its behavioral
instantiations):
"It seems natural to suppose that the study of actual linguistic performance can be seriously pursued only to the extent that we have a good understanding of the generative grammars that are acquired by the learner and put to use by the speaker or hearer. The classical Saussurean assumption of the logical priority of the study of langue (and the generative grammars that describe it) seems quite inescapable." (Chomsky 1964:52, italics ours).
1.2 The focus of phonetics: “Given the units, what are the phonetic correlates?”
As these remarks suggest, sound structure is postulated, not observed in the
laboratory. Nonetheless, experimental phoneticians have accepted the ‘logical priority
of form’ since, without an analysis of utterances into some kind of abstract units, it
would be difficult to make sense of laboratory records. Therefore the following 30-
year-old handbook statement on the relationship between phonetics and phonology
continues to be a valid description of how speech sounds are analyzed.
"... a combination of a strictly structural approach on the form level with an auditorily based description on the substance level will be the best basis for a scientific analysis of the expression when manifested as sound. This description has to start by the functional analysis, then it must establish in auditory terms the distinctions used for separating phonemic units, and finally, by means of appropriate instruments, find out which acoustic and physiological events correspond to these different units. The interplay between the different sets of phenomena will probably for a long time remain a basic problem in phonetic research." (Malmberg 1968:15)
We conclude that, in keeping with the ‘inescapable’ dogma, the focus of phonetics
is placed on describing how postulated phonological units are realized in production
and how they are recovered from the signal in perception. In short, “given the units,
what are the phonetic (behavioral) correlates?”
In suggesting that progress was made by defining phonological structure as
independent of on-line use and by relegating the study of individual, situational and
style-dependent variations to phonetics and other performance-oriented disciplines,
the preceding account is unlikely to be controversial. However, as soon as issues of
behavioral realism and explanatory adequacy are raised, problems arise and consensus
tends to disappear.
2. Why a new paradigm is needed
2.1 The behavioral realism of linguistic form
Let us begin by mentioning two classical issues that remain unresolved despite
decades of experimental work. They are closely linked to accepting the priority of
linguistic form: (i) the question of the ‘psychological reality’ of linguistic units and
rules, and (ii) the issue of ‘phonetic invariance’.
The ‘psychological reality’ issue derives from the fact that the formal constructs of
linguistic analyses are postulated rather than observed. Data on, e.g., the alphabet,
speech errors (Fromkin 1973), word games, synchronic and diachronic phonology
(Halle 1964)³ have been used as evidence for segmental organization and discrete
units as psychologically genuine phenomena. However, this evidence is only indirect
and that leaves room for alternative interpretations. Therefore, it is not surprising that
phoneticians and psycho-linguists differ as to how compelling that evidence really is
(Ladefoged 1984).
The ‘phonetic invariance’ issue has a similar source. It arises from the fact that
natural speech patterns exhibit extensive and complex individual, situational and stylistic
variations, and from the assumption that formal linguistic units, stripped of variability, can
by hypothesis be upgraded from ‘operationally defined’ to ‘behaviorally real’. On the
one hand, a variable phonetic reality - on the other, context-free invariant linguistic
representations. The mismatch between the two generates the invariance issue. Again
there is indirect evidence but so far there has never been a direct demonstration of
phonetic invariance as a physical observable (Perkell & Klatt 1986).
Are these two long-standing problems indications of significant, but reparable
cracks in the theoretical edifice of linguistic science? Or are they irremediable
consequences of the ‘priority of linguistic form’ portending a paradigm shift? Lacking
solutions, these difficulties have given some speech researchers second thoughts
concerning the behavioral status of phonological units as invariant and context-free. A
case in point is the recent interest in ‘exemplar models’ of speech perception (more
anon).
2.2 Explanatory adequacy in phonology and the form-substance distinction
Few linguists would currently interpret the form-substance distinction so
rigorously as to claim that phonological units and processes are totally arbitrary,
empty logical phenomena⁴. Recall the following afterthought in chapter nine of The
sound pattern of English:
“The entire discussion in this book suffers from a fundamental theoretical inadequacy. ... The problem is that our approach to features, to rules, and to evaluation has been overly formal. ... In particular, we have not made any use of the fact that features have intrinsic content.” (Chomsky and Halle 1968:400; italics ours).
Contemporary phonology presents numerous developments (e.g., ‘Optimality
Theory’, ‘Grounded Phonology’, ‘Laboratory Phonology’) indicating that attempts are
being made to link the description of sound patterns more tightly to the production
and perception of speech. It appears clear that phonetics and phonology are
undergoing a rapprochement. Such ‘second thoughts’ imply a softening of the
condition that, like the rest of grammar, sound structure be autonomous and
independent of language use.
³ “Almost every insight gained by modern linguistics from Grimm’s law to Jakobson’s distinctive features depends crucially on the assumption that speech is a sequence of discrete entities.” (Halle 1964).
⁴ In “Why phonology isn't ‘natural’”, Anderson (1981) acknowledges a role for performance-based accounts, but places them outside linguistics proper, since they fail to deal with the aspects that ought to interest the linguist the most, viz., the formal idiosyncrasies of Language per se. On that view, what counts as a ‘real’ linguistic explanation is one that deals with the functionally inexplicable.
Are these developments indicative that a more productive fine-tuning of the
phonetics/phonology division of labor is under way? Or are they signs of a growing
realization that accepting the priority of form creates an impasse that unnecessarily
deprives linguistics of explanatory power?
2.3 The logical priority of ‘la parole’ in the study of language acquisition
The perspective of child phonology further underscores how real and serious these
questions are.
On the one hand, it appears reasonable to expect a phonological theory with
explanatory ambitions to aim at accounting for how children develop the sound
structure of their native languages. In response to the question, ‘Where does
phonology come from?’ linguistics would provide a developmental answer instead of
claiming that it is largely determined by our genetic endowment (=nativism)⁵, or that
it has to be postulated given analysis methods and the observations themselves
(=curve-fitting). On the other hand, honoring the ‘priority of linguistic form’ does not
make sense in the case of phonetic learning because at the onset of development there
is no ‘form’.
We are forced to conclude that research in this area does not conform to the game
plan of Figure 1 and that the focus question of traditional phonetics, ‘given the units
what are the phonetic correlates’, is utterly problematic. It would seem preferable to
rephrase it as: “Given the child’s behavior what are the units?”. Consequently, the
‘inescapable’ dogma of 20th century linguistics does not apply to language
acquisition.
⁵ “This analysis into features could not plausibly be said to have been learned, for there are surely few experiences in the life of a normal individual who is not a professional linguist or a phonetician that would lead her/him to develop a system of features for classifying speech sounds. One is, therefore, led to assume that the speech-analysing system is part of our genetic endowment...” (Halle & Stevens 1979:339-340). “And similar correlations between articulatory activity and acoustic signal are genetically provided for each of the nineteen or so features that make up the universal set of phonetic features” (Halle & Stevens 1991:10).
If accounting for how children learn their native sound systems is to be part of
explanatory linguistics, the priority of form must be rejected and another paradigm
must be found. How could such a framework be developed?
3. Modeling phonological development as emergent computation
Methodological conditions, long-term goals and hypotheses.
The present section programmatically sketches fragments of a theory of emergent
phonology. The following ground rules provide the key to escaping the explanatory
impasse imposed by the priority of linguistic form:
1. Phonological structure must not prematurely be assumed to be genetically pre-specified. Rather it should be deduced from the child’s experience and minimal assumptions about ‘initial knowledge’. In the technical sense of the term, it should be derived as emergent behavior.
2. Phonological structure should not be postulated simply because the entities or processes are suggested by the data to be explained. Always seek independent motivations in ‘first principles’ and avoid mere ‘curve fitting’.
Restated, the first rule says that nativism should be replaced by emergent
computation. The second is an anti-circularity condition. In summary, the common
message is ‘deduce rather than postulate’!
The following presentation is made up of several hypotheses. (i) Cumulative
perceptual experience is complex but shows lawful effects of emergent categorization.
(ii) Motor learning unfolds according to a criterion of minimum energy consumption.
This universal physiological constraint puts the child within closer reach of the
articulatory patterns of its native phonology. (iii) Reuse of perceptual and motor
patterns is favored owing to a metabolic constraint on memory formation. These ideas
exemplify possible roles that listening, speaking and learning might have in the
shaping of sound systems and suggest domains where behaviorally motivated ‘first
principles’ might be sought. Ambient input interacts with all three. (iv) Languages
exhibit (a tangled fabric of) adaptations to the proposed processes, e.g., patterns of
perceptual contrast, articulatory ease and combinatorial coding of submorphemic
elements such as ‘features’ and ‘segments’. (v) Furthermore, it is assumed that,
although socio-cultural evolution sometimes opposes the effects of the above-
mentioned factors, linguistic systems nevertheless retain those adaptations owing to
the blind phonetic ‘editing’ unwittingly performed by speakers, listeners and learners
during on-line language use (Lindblom, Guion, Hura, Moon & Willerman 1995).
Conceivably it might be objected that the present suggestions drastically
overestimate the role of functional constraints at the expense of formal factors. As a
brief response to that objection we note that obviously the complexity of the
functionalism vs. formalism issue is considerable. Therefore a computational
approach will be necessary. Our position is that any hypothesis propounded - whether
formal or functional, articulatory or perceptual - must eventually be evaluated using
‘first principles’ simulations in an integrated manner that give all the component
hypotheses a fair chance to compete and be numerically evaluated. Accordingly,
whether we favor a functionalist or a formalist stance the agenda becomes identical. A
framework of emergent computation will be needed in either case.
4. Listening
Emergent effects of cumulative perceptual experience
Studies of unscripted speech are beginning to draw attention to the drastic
modifications that phonetic forms frequently suffer under natural, non-laboratory
conditions (Kohler this volume). The variability of infant-directed speech with its
emotive coloring and lively prosody (Fónagy 1983, Fernald 1984), appears to be
similar to what is found in adult-to-adult styles (Kuhl et al 1997, Davis & Lindblom
1994, Ulla Sundberg 1998). It is a near certainty that the invariance issue would need
to be resolved also for Baby Talk. How do children segment the legato flow of signal
information into words? How do they factor out emotive and stylistic transforms
(Johan Sundberg this volume)? In short, how do they manage to build their
phonologies from a complex input?
Machine learning including neural network theory appears relevant to that
question, particularly the ‘unsupervised’ approaches (Hinton & Sejnowski 1999). For
instance, work has been done to study how ‘structure’ can be derived from complex
inputs such as human faces. There is neuro-physiological evidence indicating that the
brain represents whole objects in terms of component parts (Wachsmuth, Oram &
Perrett 1994, Logothetis & Sheinberg 1996). Lee & Seung (1999) describe an attempt
to simulate such behavior computationally. They designed an algorithm that learned
to analyze faces non-holistically, i.e., it automatically parsed them into parts
resembling “several versions of mouths, noses and other facial parts” (p 789). The
term ‘parts’ is here used to refer to “entities that allow objects to be reassembled using
purely additive combinations” (Mel 1999).
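The notion of parts that combine in a purely additive way can be made concrete with a small non-negative matrix factorization. The sketch below is not Lee & Seung's implementation. It assumes a generic squared-error objective with multiplicative updates, replaces face images by a hypothetical 4 x 4 toy matrix, and uses function names (`matmul`, `squared_error`, `nmf`) invented for illustration.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]

def squared_error(V, W, H):
    """Sum of squared differences between V and the product W x H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2
               for v_row, wh_row in zip(V, WH)
               for v, wh in zip(v_row, wh_row))

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor non-negative V into non-negative W (parts) and H (weights)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    H = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    errors = [squared_error(V, W, H)]
    for _ in range(n_iter):
        # Multiplicative update of the weights: H <- H * (W'V) / (W'WH)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # Multiplicative update of the parts: W <- W * (VH') / (WHH')
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
        errors.append(squared_error(V, W, H))
    return W, H, errors

# Hypothetical toy data: each column is an "object" built by adding
# together two parts, (1, 1, 0, 0) and (0, 0, 1, 1), in varying amounts.
V = [[1, 0, 1, 2],
     [1, 0, 1, 2],
     [0, 1, 1, 1],
     [0, 1, 1, 1]]
W, H, errors = nmf(V, rank=2)
```

Because every entry of `W` and `H` stays non-negative, an object can only be rebuilt by adding parts, never by cancelling one part against another. That restriction is what pushes such a factorization toward part-like components.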
The field of automatic speech recognition also offers results rich in implications
for phonetics and phonology. It turns out that currently the best-performing systems
are not based on extensive a priori knowledge about phonetic structure. They do
surprisingly well simply by exploiting statistical regularities in the speech signal.
‘Units’ are derived automatically as the stored data form clusters and as patterns of
‘tied states’ become defined (Young & Woodland 1993).
These and other findings raise the question whether, in processing speech,
children’s brains come up with units-based representations in a similarly unsupervised
fashion. If so, perceptual phonological units would not represent abstract ‘form’ but
simply the emergent patterning of the phonetic information.
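How unit-like categories might fall out of the statistics of the input without supervision can be illustrated with a deliberately simple clustering sketch. It is not the tied-state procedure of Young & Woodland (1993). It assumes plain k-means over synthetic two-dimensional tokens, and the (F1, F2) centers, spreads and function names are hypothetical values chosen only for illustration.

```python
import math
import random

# Hypothetical (F1, F2) centers in Hz for three vowel-like categories.
TRUE_CENTERS = [(300.0, 2300.0), (750.0, 1300.0), (320.0, 800.0)]

def make_tokens(n_per_category=50, seed=1):
    """Draw unlabeled synthetic tokens around each hypothetical center."""
    rng = random.Random(seed)
    return [(rng.gauss(f1, 25.0), rng.gauss(f2, 60.0))
            for f1, f2 in TRUE_CENTERS
            for _ in range(n_per_category)]

def farthest_point_init(points, k):
    """Deterministic seeding: start with the first token, then keep adding
    the token that lies farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, n_iter=20):
    """Alternate between assigning tokens to the nearest centroid and
    moving each centroid to the mean of its assigned tokens."""
    centroids = farthest_point_init(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids

tokens = make_tokens()          # no category labels are passed on
learned = kmeans(tokens, k=3)   # category-like centroids emerge from the data
```

The number of categories is fixed in advance here and distances are computed in raw Hz. Both are simplifications that a serious model of perceptual learning would need to replace.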
The philosophy of the work just exemplified is reminiscent of so-called exemplar-
based models of perception and learning, a paradigm explored for some time by
psychologists (Estes 1993). Exemplar models make minimal assumptions about
‘initial conditions’ and are therefore not guilty of ‘resolving’ issues such as the
variability problem by postulating unknown, innate mechanisms. They make the most
of the signal and its complex, but lawfully structured, variability before positing
abstract hypothetical decoding mechanisms.
Johnson & Mullenix (1997) compare traditional and exemplar-based approaches to
speech perception. They point out that classical accounts assume representations (e.g.,
phoneme-sized units) to be simple (context-free invariants). The task of deriving such
units from the speech signal calls for complex processes capable of extracting
invariants. Mechanisms of this type have been proposed – e.g., the ‘phonetic module’
of the motor theory (Liberman & Mattingly 1985), the ‘smart mechanisms’ of direct
realism (Fowler 1986, 1994) and the ‘top-down’ processes (reconstructive rules,
inference making and hypothesis testing) of cognitively oriented approaches. The
details of how they operate still need to be spelled out.
Exemplar accounts adopt the opposite perspective. They assume representation to
be complex and mapping to be simple. Categories form as emergent products of
cumulative phonetic experience. A key point is that, although the variability of speech
signals is extensive, it is highly systematic. Exemplar models capitalize on this fact
by storing stimulus information along with its immediate signal context. As more data
accumulate, systematic co-variations among stimulus dimensions gradually appear.
The system can be said to use context to sort and disambiguate the variability. As a
result, speech sounds have complex and contextually embedded representations unlike
abstract phonetic segments. However, with sound systems shaped by distinctiveness
constraints (Diehl & Lindblom in press, Johnson this volume) and a perceptual space
of sufficiently large dimensionality, the integrity of sound-meaning relationships
should have a fair chance of being maintained.⁶
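The claim that stored context can sort out variability may be easier to follow with a toy exemplar memory. The sketch below is an illustration under stated assumptions, not any published model. The exemplar values, the exponential similarity function, the `sensitivity` parameter and all names are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical exemplar store: ((cue, context), category) pairs.
# A cue value near 2.0 goes with category "B" in context 0
# but with category "A" in context 1.
EXEMPLARS = [
    ((0.9, 0.0), "A"), ((1.0, 0.0), "A"), ((1.1, 0.0), "A"),
    ((1.9, 0.0), "B"), ((2.0, 0.0), "B"), ((2.1, 0.0), "B"),
    ((1.9, 1.0), "A"), ((2.0, 1.0), "A"), ((2.1, 1.0), "A"),
    ((2.9, 1.0), "B"), ((3.0, 1.0), "B"), ((3.1, 1.0), "B"),
]

def category_evidence(probe, exemplars, dims, sensitivity=3.0):
    """Sum the similarity of the probe to every stored exemplar, per category.

    Similarity decays exponentially with distance, computed only over the
    dimensions listed in `dims`. Evidence is normalized to sum to 1.
    """
    totals = defaultdict(float)
    for features, label in exemplars:
        distance = math.sqrt(sum((probe[d] - features[d]) ** 2 for d in dims))
        totals[label] += math.exp(-sensitivity * distance)
    grand_total = sum(totals.values())
    return {label: value / grand_total for label, value in totals.items()}

def classify(probe, exemplars, dims):
    """Return the category with the most summed similarity."""
    evidence = category_evidence(probe, exemplars, dims)
    return max(evidence, key=evidence.get)

# Using the cue alone (dimension 0), a probe at 2.0 is ambiguous.
cue_only = category_evidence((2.0, 0.0), EXEMPLARS, dims=(0,))
# Using cue and context together, the same cue value is disambiguated.
in_context_0 = classify((2.0, 0.0), EXEMPLARS, dims=(0, 1))
in_context_1 = classify((2.0, 1.0), EXEMPLARS, dims=(0, 1))
```

With these made-up numbers the cue alone leaves the two categories tied, whereas the context-embedded representation assigns the probe to "B" in context 0 and to "A" in context 1. No invariant cue value is extracted at any point. The mapping stays simple because the stored representation is rich.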
5. Speaking
5.1 Clues from the study of non-speech motor processes
‘Articulatory ease’ is sometimes informally invoked to explain both phonetic and
phonological observations. It has a certain common-sense appeal but admittedly its
current status is controversial. Ladefoged’s position (1990) is (i) that it is language-
dependent, (ii) that it cannot be measured and (iii) that therefore appeals to it are
unscientific. In a paper on assimilation, usually seen as an articulatory process, Ohala
(1990) rejects articulatory factors in favor of a perceptual account. He argues that
articulatory ease is likely to play a marginal role in shaping sound patterns and that
invoking it makes explanations teleological.
⁶ For an illustration of this reasoning see the simplified exemplar-based account (Lindblom 1990, 1996) of phonetic learning in Japanese quail (Kluender et al 1997). For attempts to implement exemplar learning computationally see Lacerda (1995).
As warnings against uncritical use of articulatory ease such statements are well
taken, but, in the broader context of experimental biology, they appear overly
pessimistic.
This field presents a large literature on the energetics of locomotion in various
species. Quantitative data are available on how humans and dogs walk and run, birds
and bumblebees fly and how fish swim. A standard way of presenting results is to plot
the amount of energy that the subject expends against traveling speed. The energy
used is inferred from measurements of oxygen consumption made for subjects under
steady-state conditions and therefore in an aerobic mode of oxygen uptake (McNeill
Alexander 1992). A typical example of this research is the study by Hoyt & Taylor
(1981) who measured energy consumption for horses walking, trotting and galloping.
The subjects were observed as they moved freely and at speeds controlled by a
treadmill. The energy used expressed per unit distance traveled and plotted against
traveling speed formed U-shaped curves with distinct minima. Significantly, these
minima were found to occur at speeds that subjects spontaneously adopted when
moving freely and unconstrained by the speed of the treadmill. Such findings rest
solidly on a large body of physiological studies (McArdle, Katch & Katch 1996) and
have been reported for a number of species. Experimental biologists interpret them to
suggest that locomotion is shaped by a criterion of ‘minimum energy expenditure’.
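Why energy per unit distance should have a minimum at an intermediate speed can be seen from a back-of-the-envelope calculation. The functional form used below is an assumption made for illustration and is not taken from Hoyt & Taylor (1981): metabolic power is taken to be a constant baseline plus a term that grows with the square of speed, and both parameter values are hypothetical.

```python
import math

BASELINE_POWER = 2.0   # hypothetical speed-independent cost (energy per second)
SPEED_COEFF = 0.5      # hypothetical coefficient of the speed-squared term

def metabolic_power(speed):
    """Assumed power demand at a given speed (not an empirical fit)."""
    return BASELINE_POWER + SPEED_COEFF * speed ** 2

def cost_per_distance(speed):
    """Energy spent per unit distance: power divided by speed."""
    return metabolic_power(speed) / speed

# Scan a grid of speeds from 0.5 to 5.0 and pick the cheapest numerically.
speeds = [0.5 + 0.01 * i for i in range(451)]
numeric_optimum = min(speeds, key=cost_per_distance)

# Cost per distance is p0/v + k*v; setting its derivative to zero
# gives the optimum v* = sqrt(p0 / k).
analytic_optimum = math.sqrt(BASELINE_POWER / SPEED_COEFF)
```

At low speeds the baseline cost is spread over little distance, and at high speeds the speed-dependent term dominates, so the curve is U-shaped under these assumptions. The prediction that can be tested against behavior is that the freely chosen speed lies near the bottom of the curve.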
5.2 Why should speech movements be different?
Are speech movements and whole body movements similarly organized? Since
energy costs for speech are likely to be small in comparison with those of locomotion,
it might be argued that they play no major role at all in shaping phonetic movements.
It is true that, until speech energy costs can be reliably measured, we have no basis for
settling that issue satisfactorily. However, evolution’s tendency towards parsimony
would make us expect the same rules to apply for small as for big movements.
Among phoneticians, it is widely believed that both speech and sound patterns
have many characteristics that are most readily accounted for in terms of production
constraints. Conceivably, we will ultimately be able to show that many of them derive
from a minimum energy expenditure condition. For instance, in running speech,
prosodic modulations and speaking styles produce both strong elaborated and weak
reduced forms. In our opinion, this segmental dynamics is an obvious candidate for an
analysis based on energetics. Similarly, looking typologically at phonological
systems, we observe a clear preference for low-cost motor patterns (Lindblom 1983).
We hypothesize that minimization of energy expenditure plays a causal role in:
• the absence of vegetative movements and mouth sounds;
• determining the feature composition of phonetic segments (e.g., why are /i/ and /u/ universally ‘close’ vowels?);
• constraining the universal organization of syllabic & phonotactic structure;
• the patterning of diachronic and synchronic lenition and fortition processes;
• shaping the system-dependent selection of phonetic values in segment inventories;
• ….
5.3 Recognizing the ‘DOF problem’ for speech production
Motor systems offer their users an extremely rich set of possibilities for executing
a given task. In principle, there is an infinite number of trajectories that a movement
from one point to another could take. This motoric embarrassment of riches is
technically known as the Degrees of Freedom (DOF) problem. Solving the DOF
problem means selecting a unique movement from a very large search space. As the
following example will show, speech production offers talkers countless possibilities
for any given task which makes DOF a very real issue also for the phonetician.
Articulatory modeling (Lindblom & Sundberg 1971, Maeda 1991) has shown that
there is a continuous trade-off between jaw opening and tongue raising in producing a
given vowel, e.g., an /i/. In principle there is an infinite number of ways in which a
given /i/ formant pattern could be produced.
The normal way of making this vowel is to raise the jaw and adopt a moderately
palatal tongue shape. However, it has been experimentally demonstrated that, when
speakers are asked to produce a normal-sounding /i/ with an atypically large jaw
opening maintained by a ‘bite-block’ (Lindblom et al 1979), their output does not
approach the /[…]/-like quality predicted by area-function-to-formant calculations. In
fact, subjects are able to match the normal quality and the formant pattern of the
vowel quite closely, a result that clearly indicates a compensatory mode of
articulation. X-ray data (Gay et al 1981) have confirmed this interpretation showing
that, for bite-block /i/ vowels, subjects compensate by raising the tongue higher than normal
into a super-palatal position. This case is typical of many situations arising in
articulatory modeling. It shows that the DOF problem definitely also applies to
speech.
5.4 From walking to talking
In order to see how articulatory models might handle DOF let us briefly return to
some recent computational research on human walking (Anderson & Pandy 1999,
Pandy this volume). It offers an interesting solution to this problem.
The human body is represented as a 3-D model of the musculo-skeletal system.
The upper part consists of a rigid torso without arms. The lower part has 23 degrees of
freedom controlled by a set of 54 muscles. Attempts were made to simulate the
normal human gait cycle. The findings indicate that the model walks at a forward
velocity of 81 m/min, a value typical of human subjects (Ralston 1976). Predicted
displacements of anatomical structures were quantitatively similar to experimental
observations. Muscle coordination patterns were consistent with EMG data from
human subjects. Metabolic energy was expended at a rate comparable to that for
human walking. A compelling impression of the model’s realism is obtained from a
video demonstration of the performance of the model. The normal gait of the model
skeleton is presented and compared with average measurements from human subjects.
It shows the model walking in an extremely human-like fashion.
With so many muscles and mechanical dimensions, this model has a significant
DOF problem. Particularly relevant to discussions of articulatory ease in phonetics is
the fact that the results were obtained when a performance criterion of least metabolic
cost (= minimum heat production) was used. We can interpret the success of the
simulations as implying that the optimization criterion drastically reduces the search
space and makes it possible for the algorithm to identify unique and optimal
movement trajectories for each subtask during the gait cycle.
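How a cost criterion turns a redundant task into one with a unique solution can be sketched in a toy setting with two additive articulators. The quadratic effort function, the weights and the brute-force search below are assumptions made for illustration. They are not the dynamic optimization procedure used in the walking simulations.

```python
# Toy cost-based resolution of redundancy (hypothetical weights and target).
def cheapest_setting(target, jaw_weight=1.0, tongue_weight=3.0, steps=1200):
    """Search all jaw/tongue splits that sum to `target` and return the
    (cost, jaw, tongue) triple with the lowest quadratic effort."""
    best = None
    for i in range(steps + 1):
        jaw = target * i / steps
        tongue = target - jaw          # constraint: the task must be achieved
        cost = jaw_weight * jaw ** 2 + tongue_weight * tongue ** 2
        if best is None or cost < best[0]:
            best = (cost, jaw, tongue)
    return best


cost, jaw, tongue = cheapest_setting(12.0)
print(cost, jaw, tongue)
```

With the tongue assumed to be three times as costly to move as the jaw, the search settles on a single split in which the jaw does most of the work. Without the cost term, every split along the constraint would be equally acceptable.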
Summarizing so far, we note (i) that minimum energy consumption evidently helps
solve the DOF problem for non-speech movements and (ii) that this problem also
arises for speech modeling. In the light of those observations it seems justified to
assume that energy costs ought to play an important role in shaping both on-line
speech and sound systems.
In order to get a preliminary idea of what energy costs might be for speech
movements, a simplified model of the mandible was constructed (Lindblom et al
1999). The jaw was represented by a system defined by its mass (m), damping (b) and
elasticity (k). The mass was equal to 250 g. Critical damping and a resonance
frequency of 5 Hz were assumed. The energy required to drive the system was
calculated as a function of frequency for sinusoidal jaw movement of 10 mm
amplitude. The results plotted with energy per distance on the ordinate and frequency
of jaw movement on the abscissa indicated a function with a U-shape and a distinct
minimum similar to an ‘upside-down’ resonance curve and not unlike the locomotion
findings reviewed earlier.
Figure 2. The energy required to drive a biomechanical model of the jaw as a
function of frequency for a sinusoidal movement of 10 mm amplitude. Abscissa:
frequency of jaw movement (Hz), 0 to 15. Ordinate: energy cost per distance
traveled (joules/meter), 0 to 50.
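A curve of this general shape can be obtained from a very small calculation. The sketch below uses the stated mass (250 g), critical damping, a 5 Hz resonance and a 10 mm amplitude. The effort measure is an assumption: the text does not specify how energy was computed, so the code uses the time integral of the absolute driving force per unit distance traveled as a stand-in proxy. Its units (N·s/m) and values are therefore not those of Figure 2.

```python
import math

MASS_KG = 0.25        # jaw mass given in the text
RESONANCE_HZ = 5.0    # assumed resonance frequency given in the text
AMPLITUDE_M = 0.010   # 10 mm sinusoidal movement


def effort_per_distance(freq_hz, mass=MASS_KG, f0=RESONANCE_HZ, amp=AMPLITUDE_M):
    """Proxy effort (N*s per meter) for driving x(t) = amp * sin(w t)
    through a critically damped mass-spring-damper system."""
    w = 2 * math.pi * freq_hz
    w0 = 2 * math.pi * f0
    k = mass * w0 ** 2            # stiffness implied by mass and resonance
    b = 2 * mass * w0             # critical damping
    # Amplitude of the driving force F = m x'' + b x' + k x.
    force_amp = amp * math.hypot(k - mass * w ** 2, b * w)
    impulse_per_cycle = 4 * force_amp / w   # integral of |F| over one period
    distance_per_cycle = 4 * amp            # open-close-open excursion
    return impulse_per_cycle / distance_per_cycle


freqs = [0.5 * i for i in range(1, 31)]     # 0.5 Hz to 15 Hz
costs = [effort_per_distance(f) for f in freqs]
best_freq = freqs[costs.index(min(costs))]
print(best_freq, min(costs))
```

For this proxy the cost reduces to m(w0^2/w + w), which is U-shaped with its minimum at the resonance frequency. A different definition of energy would change the values and possibly the location of the minimum.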
6. Learning to speak
6.1 Articulatory boot-strapping: ‘Easy-way-sounds-OK’.
The work of MacNeilage (1998) draws attention to the prominent role of the
mandible in babbling and early speech. He argues convincingly that speech did not
have to develop a new rhythm generator for the production of syllables. By the
evolutionary process of continuity and tinkering it made conservative use of existing
central pattern generators, namely those already developed for vegetative purposes.
“ … speech makes use of the same brainstem pattern generator that ingestive cyclicities do, and … control structures for speech purposes are, in part at least, shared with those of ingestion.” (MacNeilage 1998, p 503)
This helps explain the universal fact that virtually every utterance of every speaker
of every one of the world’s languages exhibits syllabic organization - that is, involves
a mandibular open-close movement. It also sheds light on why, both motorically and
sensorily, the jaw and the area around the mouth opening are particularly salient
regions of the vocal tract (Lindblom & Lubker 1985) and are therefore likely to be
explored early on.
Let us supplement this scenario with a few remarks based on energetics. Suppose
that talking is like walking. In other words, young children vocalizing behave exactly
like subjects walking and running in preferring energetically low-cost movements. If
so, their vocal systems would tend to be activated at the minimum points of the U-
shaped curves of their articulators. To further simplify this view of early vocal
behavior, let us limit the degrees of freedom of the production mechanism to the jaw
because of its vegetative salience. What would the articulatory and acoustic
characteristics of opening and closing the jaw at minimum energy cost be like?
Metaphorically, it would be given by the minimum value of the U-shaped curve and
correspond to an open-close alternation near the jaw’s resonance frequency.
Combining this movement with phonation would produce a quasi-syllabic acoustic
output resembling [bababa]. In other words, least effort applied to the jaw would
produce an utterance not unlike canonical babbling.
MacNeilage is right in making us wonder why open-close alternations should be so
ubiquitous in spoken language. We can restate the facts and interpretations presented
so far: (1) The low-energy articulatory search (start pianissimo!) is limited to only a
fragment of the child’s phonetic space (= mandibular oscillation). (2) It helps the child
spontaneously bump into many articulatory patterns used by the ambient phonology
(= ‘proto-syllables’) by significantly narrowing the alternative possibilities.
Could steps (1) and (2) be generalized and incorporated into a more comprehensive
model of phonetic learning? Do they constitute a general boot-strapping strategy for
discovering native articulatory patterns? An affirmative answer would be possible if it
could be shown that:
(a) The DOF problem for speech is solved in the same way as it is solved for non-speech movements. That would produce a strong statistical bias in favor of low-cost motor patterns.
(b) Many aspects of the world’s phonologies are low-cost motor patterns.
(c) By cultural evolution the world’s phonologies could in principle have developed biologically less optimal motor patterns than they use now, but have done so only to a limited extent.
In our opinion there is a strong probability that all three claims are correct. The
reason is, we suggest, that sound patterns are adapted for phonetic development. Low-
cost motor patterns are retained so as to accommodate the child’s energy-efficient
search by providing ambient reinforcement of the child’s efforts (Davis &
MacNeilage this volume). The phrase ‘easy-way-sounds-OK’ captures the nature of
this boot-strapping. Phonologically organized speech presupposes the specialized
ability of vocal imitation (Studdert-Kennedy 1998, this volume). The present account
suggests that imitating is supplemented in important ways by mechanisms of motor
emergence. As articulations are fortuitously discovered the ‘easy way’ and confirmed
by the ambient input, perceptuo-motor links get established to budding perceptual
categories.
6.2 Where do phonological units come from?
The preceding discussion has concentrated on substantive aspects. In this final
section we address the possibility of behavioral origins (as opposed to the pre-
specification) of a formal universal of linguistic structure, e.g. the combinatorial
coding of discrete units.
To address this topic, we will describe a game based on a simple algorithm that
automatically analyzes holistic patterns into smaller elements and then re-uses those
elements. The phenomenon of re-use implies combinatorial organization. In keeping
with the spirit of the proposed phonetics/phonology program, the point is that the
derived units are emergent consequences of system growth and that they do not come
pre-specified. We suggest that this mechanism is formally similar to what goes on in
lexical development.
Phonetically the holistic patterns can be pictured as articulatory and/or auditory
patterns. The segmentation into smaller elements defines the ‘units’. Re-use of those
units is promoted by the fact that memory storage is associated with a biochemical
cost. This cost is hypothesized to derive from the energy metabolism of memory
formation (Gonzales-Lima 1992) and is an increasing function of the novelty of the
stored materials. Since novelty is expensive, holistic coding is disfavored whereas
parts-based re-use is not.
At this point a short summary is needed of some simplified neurobiological facts
about how memories are encoded. Learning causes the brain to change physically.
This change is activity-dependent. Active neural tissue contains more energy-rich
substances. Hence, learning costs metabolic energy. Such conclusions have been
drawn from histochemical analyses of brain tissue. Cytochrome oxidase is a substance
used as a marker of metabolic capacity. The mitochondrial amount of this enzyme is
assumed to reflect the functional level of activity in the neuron. More active neurons
have more cytochrome oxidase and more active regions within a neuron have more
mitochondria (Wong-Riley 1989).
Gonzales-Lima (1992) reports experiments in which rats were trained to associate
reward with an auditory stimulus (FM signal 1-2 kHz). After training for eleven days
the brains of experimental and control animals were examined for cytochrome
oxidase contents in their auditory neostriatum. The experimental group showed
significantly increased amounts of cytochrome oxidase. The proposed interpretation is
that the memory of the conditioning stimulus changes the neurons activated by the
task. This change takes the form of an increase in their metabolic capacity. Fuel
(‘potential energy’) is available should a demand arise for their activation (e.g.,
recall). This is reminiscent of other familiar examples of activity-dependent change,
e.g., calloused hands and bigger muscles.
These results suggest that a principle of ‘minimal incremental storage’ may be
embodied in the neural metabolism of memory formation. If so, it would mean that
patterns containing more information (more ‘bits’ in the information theory sense) are
energetically more costly and therefore take longer to commit to memory. Here
we do not confidently claim that this is the process underlying phonetic learning. Our
objective is rather to demonstrate that a formal-looking property such as
combinatorial coding could in principle readily arise for functional reasons. The mere
possibility of such an account should make us wary of ‘inescapable’ conclusions
about arbitrary formal idiosyncrasies.
It is of course important that an account of emerging structure be completely non-
teleological. To say that children acquire large vocabularies because of the advantages
of combinatorial coding is to make a teleological argument. What has to be argued
instead is that combinatorial coding comes into existence owing to the fortuitous
coincidence of several factors. Once that happens, that mode of organization is
reinforced by its functional advantages. We hypothesize that one of the causal factors
behind the ability to code up to 100,000 words or more (Miller 1977) is a metabolic
constraint on memory formation.
6.3 The nepotism game: ‘close relatives get promoted’.
Imagine a 10-by-10 matrix with 100 cells. The point of the game is to choose a
sequence of n points located in the matrix so that a ‘cost criterion’ is minimized. We
consider two alternative definitions of ‘cost’:
(1) For every new cell we pay 1 unit!
(2) For every new coordinate specification (row or column) we pay 0.5 units!
A single item costs one unit on either measure. For the first criterion, the cost is
equal to n units regardless of the cells selected. In the case of rule (2), costs can be cut
by selecting a cell in a previously activated row and/or column. As n (system size)
increases, numerous opportunities for re-use arise. Figure 3 below shows a situation
with six points sequentially chosen according to the second measure. Selected cells
are marked in black. When a choice is made, the other cells of that row and column
become available at half price (0.5 units). This is indicated by the shading. Zero cost
is associated with cells at intersections of already committed rows and columns. The
example in Figure 3 costs 6 units when we pay per cell (first measure), but only 2.5
units when selections are priced by coordinate specifications as in (2).
Figure 3. Selecting a sequence of n matrix cells in accordance with a cost criterion.
We conclude that rule (1) corresponds to Gestalt coding and that, in conjunction
with cost minimization, rule (2) forces the system to go combinatorial.
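The two pricing rules can be written out in a few lines. The cell coordinates below are hypothetical, since the cells of the figure are not reproduced here. They are chosen so that six cells share two rows and three columns, which is one arrangement that yields the totals quoted above.

```python
def cell_cost(cells):
    """Rule (1): every new cell costs 1 unit."""
    return float(len(set(cells)))


def coordinate_cost(cells):
    """Rule (2): every new row or column specification costs 0.5 units."""
    rows, cols = set(), set()
    total = 0.0
    for row, col in cells:
        if row not in rows:      # first use of this row
            rows.add(row)
            total += 0.5
        if col not in cols:      # first use of this column
            cols.add(col)
            total += 0.5
    return total


# Hypothetical selection: two rows (2, 6) crossed with three columns (3, 5, 8).
reuse_cells = [(2, 3), (2, 5), (6, 3), (6, 5), (2, 8), (6, 8)]
# Hypothetical selection with no shared rows or columns.
gestalt_cells = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]

print(cell_cost(reuse_cells), coordinate_cost(reuse_cells))
print(cell_cost(gestalt_cells), coordinate_cost(gestalt_cells))
```

Under rule (2) the selection that re-uses rows and columns costs 2.5 units, whereas six cells with no shared coordinates cost the full 6 units. This is why cost minimization under that rule favors re-use.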
6.4 Self-segmentation and the emergence of articulatory ‘re-use’.
To explore what this exercise might tell us about speech, let us interpret the matrix
as a crude articulatory space and replace rows and columns by continuous parameters,
say the phase and amplitude of elementary oscillatory movement. Along a third
dimension we specify the articulator performing the movement. A given point in this
3-D space represents a Gestalt motor score.
Further suppose that a given child consistently uses forms sounding like [�����,
������ and ������. In the articulatory space these forms are represented by three
points whose coordinates specify the movement parameters: e.g., three amplitude
values for the open-close movement of the jaw, two positions (front and back) for the
rest/target alternation of the tongue etc. In standard notation (but without implying
any segmental organization), the jaw-tongue parameters form the following matrix:
tongue positions
_�_�
jaw openings _�_�
_�_�
These specifications are each linked to its own type of anatomically distinct
oscillatory closure movement: d_d_, m_m_, and b_b_.
The nepotism principle (NEP) literally states that a re-combination of all these
hidden “component” movements is favored by the memory constraint. If NEP were
consistently and mechanically implemented, it would yield the following additional
potential re-use patterns for jaw-tongue movement:
tongue positions
_�_�
jaw openings _�_�
_�_�
Moreover, it would put a number of forms in a state of ‘readiness’, e.g., [�����,
[�����, [�����, [�����, [�����, ������, ������, ������, ������, ������,
������, ������, ������, ������, ������. Again no segmental organization is implied.
How does this re-use come about? How are the “component movements” identified?
The quotation marks around “component” are important, since so far we have little
reason to treat phonetic forms as anything but Gestalts.
As a first step, we note that the vocal tract consists of several independently
controllable structures. In other words, although early vocalizations do not arise from
phoneme-like control signals, the system producing them is in fact anatomically
‘segmented’.
Second, we observe that, in many cases, neural representations are somatotopically
organized (Kandel et al 1991), which means that the brain stores individual
motor and sensory activities in specific locations with anatomical identity preserved
(cf notion of homunculus). Both of these circumstances play a crucial role in the
proposed self-segmentation process.
Faced with the task of producing ambient forms not yet acquired, the child must
solve the problem of assembling new motor programs. NEP predicts that the speed
and accuracy of imitation, spontaneous use and recall will depend significantly on
whether or not the new form shares “component” movements with old forms.
Assembling a new motor score is assisted by overlap with previously encoded
patterns even if those patterns are part of unanalyzed wholes and have not yet been
‘defined’ as separate motor entities. For developmental and typological evidence
supporting this suggestion see review in Lindblom (1998).
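The mechanical side of NEP can be sketched as a recombination over articulator-specific dimensions. In the sketch below each form is a Gestalt described by three values: closure articulator, jaw opening and tongue position. The labels ("small", "front" and so on), the three starting forms and the helper names are hypothetical placeholders and do not reproduce the transcriptions in the text.

```python
from itertools import product

# Hypothetical Gestalt forms: (closure articulator, jaw opening, tongue position).
known_forms = [
    ("d", "small", "front"),
    ("m", "medium", "back"),
    ("b", "large", "back"),
]


def recombinations(forms):
    """All combinations of values already in use that are not yet stored."""
    dimensions = [sorted({form[i] for form in forms}) for i in range(3)]
    return [combo for combo in product(*dimensions) if combo not in forms]


def shared_components(candidate, forms):
    """Number of dimensions on which the candidate matches a stored form."""
    return sum(any(candidate[i] == form[i] for form in forms) for i in range(3))


ready = recombinations(known_forms)
print(len(ready))
print(shared_components(("d", "large", "back"), known_forms))
print(shared_components(("g", "large", "central"), known_forms))
```

With three closures, three jaw openings and two tongue positions there are eighteen combinations, so three stored forms leave fifteen recombinations in a state of readiness. A candidate built entirely from old components overlaps on all three dimensions, whereas one with a new articulator and a new tongue position overlaps on only one.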
We propose that in part the NEP bias makes the child engage in spontaneous
articulatory re-use, in part the native language favors forms that match the output of
NEP. Learners can thus use NEP to find ‘hidden’ structure. Behavioral conditions
make certain patterns more functional than others. Languages are molded by those
functional constraints. They adapt to them incorporating fossils of naturalness in their
architecture and by so doing they become more learnable and easier to use.
7. Summary
How do children find the ‘hidden’ structure of speech? This question presupposes
that ‘structure’ is something disembodied. In other words, it is seen as embedded in an
incomplete, degraded, noisy and infinitely variable signal. That is the traditional, but,
in our view, not necessarily correct view. Instead the following approach is
advocated.
Phonetic variations are far from random. They are patterned in principled ways
because of perceptual distinctiveness, articulatory dynamics and VT acoustics (Fant
1960, Stevens 1998). A cumulatively growing, exemplar-based phonetic memory
should go a long way towards revealing that patterning to the child. In such a model
‘categories’ do not resemble the neat, operationally defined units of classical
phonemic analysis, since their correlates are likely to be strongly contextually
embedded, in a sense ‘hidden’. However, over time, variability would get sorted and
disambiguated by context and by the cues providing semantic and situational labeling.
‘Mapping simple, representation complex!’
One source of information for perceptual labeling is articulatory. Research on non-
speech offers the phonetician valuable clues as to how motor processes operate. The
role of metabolic cost in solving the DOF problem is a case in point. We have made
the parsimonious assumption that speech movements are organized like other
movements. Therefore energetics should be relevant. From that conclusion we were
led to propose a two-part hypothesis: Easy-way-sounds-OK! It says (1) that children
initially explore their vocal resources in an energetically low-cost mode and (2) that
sound patterns have adapted to reward that behavior. This is a kind of ‘conspiracy’ that
makes children stumble on motorically motivated phenomena in the ambient language
such as syllabic organization. It also establishes motor links to perceptual forms
(together with imitation).
A related scenario was sketched for the development of the phonemically coded
lexicon. We suggested that a linguistic system with featural and phonemic
recombination humors learners whose memories charge a metabolic fee for storage. If
that fee increases with the number of bits (amount of information) to be stored, it
follows that patterns that do not share materials (Gestalts) are costly, whereas patterns
with overlap are cheaper. Somatotopic organization and VT anatomy were found to
impose an unsupervised segmentation of this overlap into articulator-specific
parameters. This is the process that leads the child to the ‘phonetic gesture’ (Studdert-
Kennedy this volume, Carré this volume). Metabolically controlled re-use is thus
launched and paves the way for cognitively driven and combinatorial vocabulary
growth. These considerations favor the view that phonemic coding is an adaptive
emergent rather than a formal idiosyncrasy of our genetic endowment for Language.
Emergent phonology is proposed to promote a new vision of the relationship
between phonetics and phonology. By substituting it for the traditional division of
labor, we would get away from Chomsky’s ‘inescapable dogma’.
The distinctions between form/substance and competence/performance should be
abandoned having served their historical purpose. There is no split between phonetics
and phonology because, from the developmental point of view, phonology remains
behavior. Phonology differs qualitatively from phonetics in that it represents a new,
more complex and higher level of organization of that behavior. For the child,
phonology is not abstract. Its foundation is an emergent patterning of phonetic
content. The starting point is the behavior. ‘Structure’ unfolds from it. Therefore the
issue of ‘psychological reality’ does not arise. Similarly, explanations need not be
limited to post-hoc experimental justifications for postulated formal phenomena but
are integrated into the theory’s predictions. Behavioral realism and explanatory
adequacy are given free rein.
8. References
Anderson S R (1981): "Why phonology isn't 'natural'", Linguistic Inquiry 12:493-539.
Anderson S R (1985): Phonology in the twentieth century, Chicago:Chicago
University Press.
Anderson F C & Pandy M G (1999): “A dynamic optimization solution for vertical
jumping in three dimensions”, Computer Methods in Biomechanics and
Biomedical Engineering, 1-31.
Carré R (2000): "", Phonetica, this volume.
Chomsky N (1964): "Current trends in linguistic theory" 50-118 in Fodor J A & Katz
J J (eds): The structure of language, New York:Prentice-Hall.
Chomsky N & Halle M (1968): The sound pattern of English, New
York:Harper&Row.
Davis B L & Lindblom B (1994). “Prototype formation in speech development and
phonetic variability in Baby Talk”, in Lacerda F, von Hofsten C & Heiman M
(in press): Emerging Cognitive Abilities in Early Infancy, LEA:Hillsdale, NJ.
Diehl R L and Lindblom B (in press): “Explaining the structure of feature and
phoneme inventories”, chapter to appear in Greenberg S and Ainsworth W
(eds): Speech Processing in the Auditory System, Springer Handbook of
Auditory Research (SHAR).
Estes W K (1993): “Concepts, categories, and psychological science”, Psychological
Science 4: 143-153.
Fernald A (1984): “The perceptual and affective salience of mothers' speech to
infants”, 5-29 in Feagans L, Garvey C & Golinkoff R (eds): The origins and
growth of communication, New Brunswick:Ablex.
Fant G (1960): The acoustic theory of speech production, The Hague:Mouton.
Fischer-Jørgensen E (1975): Trends in phonological theory: A historical introduction,
Copenhagen:Akademisk forlag.
Fónagy I (1983): La Vive Voix, Paris:Payot.
Fowler C A (1986): "An event approach to the study of speech perception from a
direct-realist perspective", Journal of Phonetics 14(1): 3-28.
Fowler C A (1994): "Speech perception: Direct realist theory", 4199-4203 in Asher R
E (ed): Encyclopedia of Language and Linguistics, Pergamon:New York.
Fromkin V (1973): Speech errors as linguistic evidence, The Hague:Mouton.
Gay T, Lindblom B & Lubker J (1981): "Production of bite-block vowels: Acoustic
equivalence by selective compensation", J Acoust Soc Am 69:802-810.
Gonzales-Lima F (1992): “Brain imaging of auditory learning functions in rats:
Studies with fluorodeoxyglucose autoradiography and cytochrome oxidase
histochemistry”, 39-109 in Gonzales-Lima F, Finkenstädt T & Scheich H
(eds): Advances in metabolic mapping techniques for brain imaging of
behavioral and learning functions, NATO ASI Series D:68, Dordrecht:
Kluwer.
Halle M (1964): “On the bases of phonology”, 604-612 in Fodor J A & Katz J J (eds):
The structure of language, New York:Prentice-Hall.
Halle M & Stevens K N (1979): "Some reflections on the theoretical bases of
phonetics", 335-353 in Lindblom B & Öhman S (eds): Frontiers of speech
communication research, London:Academic Press.
Halle M & Stevens K N (1991): "Knowledge of language and the sounds of speech",
1-19 in Sundberg J, Nord L & Carlson R (eds): Music, language, speech and
brain, Houndmills, Basingstoke, England:MacMillan.
Hinton G E & Sejnowski T J (1999): Unsupervised learning: Foundations of neural
computation, MIT Press:Cambridge, MA.
Hoyt D F & Taylor C R (1981): “Gait and the energetics of locomotion in horses”,
Nature 292: 239.
Jakobson R & Halle M (1968): "Phonology in relation to phonetics", 411-449 in
Malmberg B (ed): Manual of phonetics, Amsterdam:North-Holland.
Jakobson R, Fant G & Halle M (1952/69): Preliminaries to speech analysis,
Cambridge, MA: MIT Press.
Johnson K & Mullenix J (1997): “Complex representations used in speech processing:
Overview of the book”, 1-8 in Johnson K & Mullenix J (eds): Talker
variability in speech processing, Academic Press.
Johnson K (2000): “Adaptive dispersion in vowel perception”, Phonetica, this
volume.
Kandel E, Schwartz J & Jessel T (1991): Principles of neural science (3rd edition),
New York:Elsevier.
Kluender K R, Diehl R & Killeen P (1987): "Japanese quail can learn phonetic
categories", Science 237: 1195-1197.
Kohler K (1990): "Segmental reduction in connected speech in German: Phonological
facts and phonetic explanations", 69-92 in Hardcastle W J & Marchal A (eds):
Speech production and speech modeling, Dordrecht:Kluwer.
Kohler K (2000): "Investigating unscripted speech: Implications for phonetics and
phonology", Phonetica, this volume.
Kuhl P K, Andruski J E, Chistovich I A, Chistovich L A, Koshevnikova E V, Ryskina
V L, Stolyarova E I, Sundberg U & Lacerda F (1997): “Cross-language
analysis of phonetic units in language addressed to infants”, Science 277: 684-
686.
Lacerda F (1995): “The perceptual magnet effect: An emergent consequence of
exemplar-based phonetic memory”, Proceedings ICPhS Stockholm, vol 2, 140-147.
Ladefoged P (1984): "’Out of chaos comes order’: Physical, biological and structural
patterns in phonetics", Van den Broecke M P R & Cohen A (eds):
Proceedings of the Xth International Congress of Phonetic Sciences, Vol IIB,
83-95.
Ladefoged P (1990): “Some reflections on the IPA”, Journal of Phonetics 18:335-
346.
Lee D D & Seung H S (1999): “Learning the parts of objects by non-negative matrix
factorization”, Nature 401, 788-791.
Liberman A & Mattingly I (1985): "The motor theory of speech perception revised,"
Cognition 21: 1-36.
Lindblom B (1980): "The goals of phonetics, its unification and application",
Phonetica 37:7-26.
Lindblom B (1983): “Economy of speech gestures”, in P F MacNeilage (ed): Speech
Production, 217-246, Springer Verlag:New York.
Lindblom B (1990): "Explaining phonetic variation: A sketch of the H&H theory",
403-439 in Hardcastle W & Marchal A (eds): Speech Production and Speech
Modeling, Dordrecht:Kluwer.
Lindblom B (1996): “Role of articulation in speech perception: Clues from
production”, J Acoust Soc Am 99(3):1683-1692.
Lindblom B (1998): “Systemic constraints and adaptive change in the formation of
sound structure”, 242-264 in Hurford J, Studdert-Kennedy M & Knight C
(eds): Approaches to the evolution of language, Cambridge:CUP.
Lindblom B & Sundberg J (1971): "Acoustical consequences of lip, tongue, jaw and
larynx movement", J Acoust Soc Am 50:1166-1179.
Lindblom B, Lubker J & Gay T (1979): "Formant frequencies of some fixed-mandible
vowels and a model of speech programming by predictive simulation", J
Phonetics 7: 147-162.
Lindblom B & Lubker J (1985): “The speech homunculus and a problem of phonetic
linguistics”, 169-192 in Fromkin V A (ed): Phonetic Linguistics, Academic
Press: London.
Lindblom B, Guion S, Hura S, Moon S-J & Willerman R (1995): "Is sound change
adaptive?", Rivista di Linguistica 7.1, 5-37.
Lindblom B, Davis J, Brownlee S, Moon S-J & Simpson Z (1999): “Energetics in
phonetics and phonology”, to appear in Fujimura O et al (eds): Linguistics and
Phonetics, Ohio State University.
Logothetis N K & Sheinberg D J (1996): “Visual object recognition”, Annu Rev
Neurosci 19, 577-621.
Maeda S (1991): "On articulatory and acoustic variabilities", J of Phonetics 19:321-
331.
Malmberg B (1968): “The linguistic bases of phonetics”, Manual of phonetics,
Amsterdam:North-Holland.
McNeill Alexander R (1992): The Human Machine, New York:Columbia University
Press.
MacNeilage P F (1998): “The frame/content theory of evolution of speech
production”, Behavioral and Brain Sciences 21, 499-546.
MacNeilage P F & Davis B L (2000): “Deriving speech from non-speech: A view
from ontogeny”, Phonetica, this volume.
McArdle W D, Katch F I & Katch V L (1996): Exercise physiology, 4th ed,
Baltimore:Williams&Wilkins.
Mel B W (1999): “Think positive to find parts”, Nature 401, 759-760.
Miller G A (1977): Spontaneous apprentices, Seabury Press: New York.
Ohala J J (1990): “The phonetics and phonology of aspects of assimilation”, 258-275
in Kingston J & Beckman M (eds): Papers in laboratory phonology: Vol 1.
Between grammar and the physics of speech, Cambridge:CUP.
Pandy M (2000): "", Phonetica, this volume.
Perkell J & Klatt D (1986). Invariance and variability of speech processes,
LEA:Hillsdale, NJ.
Ralston H J (1976): “Energetics of human walking”, 77-98 in Herman R M, Grillner
S, Stein P S G & Stuart D G (eds): Neural control of locomotion, Plenum
Press:New York.
de Saussure F (1916): Cours de linguistique générale, Paris:Payot.
Stevens K N (1998): Acoustic phonetics, Cambridge:M.I.T. Press.
Studdert-Kennedy M (1998): “Evolutionary implications of the particulate principle:
Imitation and the dissociation of phonetic form from semantic function”, in
Knight C, Studdert-Kennedy M & Hurford J R (eds): The emergence of
language: Social function and the origins of linguistic form, Cambridge:CUP.
Studdert-Kennedy M (2000): "Imitation and the emergence of segments", Phonetica,
this volume.
Sundberg J (2000): "Emotive transforms", Phonetica, this volume.
Sundberg U (1998): Mother tongue – Phonetic aspects of infant-directed speech, Ph
D dissertation, Stockholm University.
Sweet H (1877): Handbook of phonetics, Oxford:Henry Frowde.
Young S J & Woodland P C (1993): “The use of tying in continuous speech
recognition”, Proc Eurospeech 93, 2203-2206.
Wachsmuth E, Oram M W & Perrett D J (1994): “Recognition of objects and their
components: responses of single units in the temporal cortex of the macaque”,
Cereb Cortex 4, 509-522.
Wong-Riley M T T (1989): “Cytochrome oxidase: An endogenous metabolic marker
for neuronal activity”, Trends Neurosci 12(3):94-101.
9. Acknowledgements
This research is supported by grant number BCS-9901021 from the National
Science Foundation, Washington, D.C.