Using strange attractors to model sound


USING STRANGE ATTRACTORS

TO MODEL SOUND

Submitted to

The University of London

for the Degree of

Doctor of Philosophy

Jonathan Mackenzie

King's College

April 1994



Abstract

This thesis investigates the possibility of applying nonlinear dynamical systems theory to the problem of modelling sound with a computer. The particular interest is in the creative use of sound, where its representation, generation and manipulation are important issues. A specific application, for example, is the modelling of environmental sound for film sound-tracks.

Recently, there have been a number of major advances in the field of nonlinear dynamical systems which include chaos theory and fractal geometry. It is argued that these provide a rich source of ideas and techniques relevant to the issues of modelling sound. One such idea is that complex behaviour may be generated from simple systems. Such behaviour can often replicate a wide range of natural phenomena, or is of interest in its own right because of its aesthetic appeal. This has often been demonstrated through computer-generated images, and so an equivalent is sought in the audio domain. This work is believed to be the first substantial attempt at this.

The investigation begins with a consideration of fractal and chaotic properties of sound and with a comparison between established approaches to modelling and the alternatives suggested by the new theory. Then, the inquiry concentrates on strange attractors, which are the mathematical objects central to chaos theory, and on two ways in which they may be used to model sound.

The first of these involves using static fractal functions to represent sound time series. A technique is developed for synthesising complex abstract sounds from a small number of parameters. A class of these sounds has the novel property that they are simultaneously rhythms and timbres. It is believed these have potential for use in computer music composition. Also considered is the problem of modelling a given time series with a fractal function. An algorithm for doing this is taken from the literature, shown to be of limited ability, and then improved. The results indicate that data compression may be achieved for certain types of sound.

The second approach focuses on modelling the dynamics of a sound via the embedded reconstruction of an attractor from a time series. Two models are presented, one deterministic, the other stochastic. It is demonstrated that with the first of these, certain sounds may be modelled such that their perceived qualities are preserved. For some other signals, although the sound is not so well preserved, many statistical aspects are. The second model is shown to provide a solution to the film sound-track problem.

It is concluded that this investigation shows strange attractors to have considerable potential as a basis for modelling sound and that there are many areas for continued research.



To

Valerie Duff



Acknowledgements

I would very much like to thank my supervisor, Dr. Mark Sandler, for encouraging me to begin this research project, for finding the funding for it, and for everything he has done towards making it such a stimulating and enjoyable experience. I am also indebted to Solid State Logic for providing the sponsorship and to Chris Jenkins for arranging it. I doubt whether I would have had the opportunity to pursue the project of my choice otherwise.

I am enormously grateful to my colleagues at King's College who have always been helpful, supportive and inspiring. These include Maaruf Ali, Julian Bean, Victor Bocharov, Rob Bowman, Ian Clark, Chris Dunn, Jason Goldberg, Anthony Hare, Rod Hiorns, Simon Kershaw, Panos Kudumakis, Anthony Macgrath, Phillipa Parmiter, Allan Paul, Marc Price, Mark Townsend, Mike Waters, and Jie Yu.

For sharing their knowledge and for always being helpful I would like to thank Dr. Bill Chambers, Prof. Tony Davies, and Dr. Luke Hodgkin. I am also deeply grateful to Peter King, Mustaq Mohammed and Talat Malik for their generous technical support.

Finally, special thanks to Val, my family and friends for their support, enthusiasm, patience and inspiration and for knowing never to ask "when are you going to finish?"



Contents

Abstract ............................................................................................................... 2

Acknowledgements ....................................................................................................... 4

Contents ............................................................................................................... 5

List of Figures ............................................................................................................... 8

List of Tables ............................................................................................................. 14

List of Sound Examples .............................................................................................. 16

List of Acronyms......................................................................................................... 19

1. Introduction ........................................................................................20

2. Modelling Sound.................................................................................24

2.1. Sound and its Representation................................................................... 24

2.2. Music composition................................................................................... 25

2.3. The Roomtone Problem ........................................................................... 26

2.4. Digital Audio............................................................................................ 27

2.5. The Modelling Framework....................................................................... 28

2.6. Conventional Models ............................................................................... 29

2.6.1. Physical Modelling.................................................................... 29

2.6.2. Additive and Subtractive Synthesis........................................... 29

2.6.3. Frequency Modulation and Waveshaping................................. 32

2.7. Summary.................................................................................................. 33

3. Chaos Theory and Fractal Geometry ..............................................34

3.1. Introduction.............................................................................................. 34

3.2. The Significance of Chaos ....................................................................... 35

3.3. Dynamical Systems and State Space........................................................ 36

3.4. Stability .................................................................................................... 37

3.5. Attractors.................................................................................................. 39

3.6. Chaos........................................................................................................ 40

3.7. Visualisation............................................................................................. 42

3.8. Bifurcation................................................................................................ 44

3.9. Statistical Descriptions of Dynamics ....................................................... 47

3.10. Fractal Geometry.................................................................................... 48

3.11. Iterated Function Systems ...................................................................... 53

3.11.1. Contraction Mappings............................................................. 54



3.11.2. The Random Iteration Algorithm............................................ 56

3.11.3. The Shift Dynamical System................................................... 58

3.11.4. The Collage Theorem.............................................................. 59

3.11.5. The Continuous Dependence of the Attractor on the IFS Parameters.............................................................................. 60

3.12. Summary................................................................................................ 60

4. Applying Chaos and Fractals to the Problem of Modelling Sound.............................................................62

4.1. The Reasons for Using Chaos Theory................................................. 62

4.2. Diagnosis of Chaotic Behaviour ......................................................... 64

4.2.1. Chaos and Woodwind Instruments ........................................... 65

4.2.2. Chaos and Gongs....................................................................... 66

4.2.3. Fractal Time Waveforms........................................................... 66

4.2.4. 1/f Noise.................................................................................... 67

4.3. Representing Sound Using Chaos and Fractals................................... 71

4.4. Summary ............................................................................................. 73

5. Fractal Interpolation Functions........................................................75

5.1. Theory ................................................................................................. 75

5.2. The Synthesis Algorithm..................................................................... 78

5.3. Experiments with the Synthesis Algorithm......................................... 80

5.4. Rhythm/Timbres ................................................................................. 85

5.5. Generating Time-Varying FIF Sounds................................................ 87

5.6. A Genetic Parameter Control Interface............................................... 90

5.6.1. Implementation ......................................................................... 91

5.6.2. Experiments .............................................................................. 95

5.7. Conclusions....................................................................................... 101

6. Modelling Sound with FIFs.............................................................103

6.1. Deriving Interpolation Points from Naturally Occurring Sound ........... 103

6.2. Mazel's Time Series Models ............................................................. 107

6.3. Comparison with Requantisation ...................................................... 109

6.4. Mazel's Inverse Algorithm for the Self-Affine Model ...................... 114

6.4.1. Initial Results .......................................................................... 118

6.4.2. Error Weighting ...................................................................... 121

6.4.3. Interpolation Point Range Restriction..................................... 124

6.5. Conclusions....................................................................................... 128



7. Chaotic Predictive Modelling..........................................................131

7.1. Chaotic Time Series .......................................................................... 131

7.2. Embedding ........................................................................................ 133

7.3. The Analysis/Synthesis Model.......................................................... 135

7.4. The Inverse Problem ......................................................................... 138

7.5. A Solution to the Inverse Problem................................................... 140

7.6. Experimental Technique ................................................................... 143

7.7. Experiments with a Lorenz Time Series ........................................... 148

7.8. Experiments with Sound Time Series............................................... 155

7.8.1. Air Noises................................................................................ 155

7.8.2. Gong Sounds ........................................................................... 162

7.8.3. Musical Tones ......................................................................... 164

7.9. Conclusions....................................................................................... 167

7.10. Further Work..................................................................................... 172

7.10.1. Using the Same Model with More Sounds ........................... 172

7.10.2. Optimising the Synthetic Mapping ....................................... 173

7.10.3. Stability Analysis .................................................................. 174

7.10.4. Connections with IFS............................................................ 174

7.10.5. Time Varying Sounds............................................................ 177

8. The Poetry Generation Algorithm..................................................178

8.1. Introduction ....................................................................................... 178

8.2. Description of the Algorithm............................................................ 179

8.3. Analysis of the PGA.......................................................................... 184

8.4. Implementation of the PGA for Sound ............................................. 187

8.5. Results............................................................................................... 191

8.6. Conclusions....................................................................................... 197

9. Summary and Conclusions..............................................................200

Appendix A. Previously Published Work ..........................................209

AES Preprint ................................................................................................. 210

ISCAS '94...................................................................................................... 221

References ............................................................................................225



List of Figures

Figure 1.1 A synthetic cloud, fern and a Julia set [frac90]. ........................................ 20

Figure 1 The analysis-synthesis scheme. .................... 25

Figure 2 The sound modelling framework. .................... 28

Figure 3 A schematic diagram for additive synthesis. .................... 30

Figure 4 Karplus-Strong algorithm. Top, simplified recursive linear filter and bottom, general delay-line view. .................... 31

Figure 5 The basic units used within the FM (top) and waveshaping (bottom) synthesis techniques. .................... 32

Figure 6 State space representation of a dynamical system. .................... 37

Figure 7 Illustration of the three regular attractor types. .................... 40

Figure 8 Sequence of magnifications of the Lorenz attractor showing its fractal, self-similar property. .................... 42

Figure 9 Two simulations of the Lorenz system for similar initial conditions showing sensitive dependence on initial conditions. .................... 42

Figure 10 Three phase portraits constructed from a time series of observations of the Lorenz chaotic system. Delay values are: (a) 1, (b) 10, (c) 100. .................... 43

Figure 11 The logistic mapping for a parameter value of 0.9. .................... 45

Figure 12 Bifurcation diagram for the logistic mapping with corresponding time series plots. .................... 46

Figure 13 The exactly self-similar, triadic Koch curve. .................... 49

Figure 14 General formula for similarity dimension derived by inspection of standard Euclidean shapes. .................... 50

Figure 15 Iterative construction of the triadic Koch curve. .................... 52



Figure 16 Area of closed Koch curve (dark grey) is within area of circle (light grey) showing that it is finite. .................... 52

Figure 17 Three affine contraction mappings on X=R^2 and their single combination, W. .................... 55

Figure 18 The repeated application of a contractive mapping, W, to some initial set B, tending to the limit set, or attractor, A. .................... 55

Figure 19 Example of Random Iteration Algorithm (RIA) in operation. The three images show the results of iterating the Markov process, (a)~100, (b)~300, (c)~1000 times. .................... 57

Figure 20 Examples of RIA attractors where the mappings are weighted with different associated probabilities. .................... 58

Figure 21 Example of an IFS attractor partitioned into three disjoint subsets according to the effect of the three individual contraction mappings on the attractor. .................... 59

Figure 22 Bifurcation diagram showing a Hopf bifurcation occurring at the threshold of oscillation in a wind instrument as the blowing pressure is increased. .................... 65

Figure 23 Time series plots and spectral density forms for 1/f noise compared with white noise and Brown noise. .................... 69

Figure 24 Power spectral densities of wind noise (left) and an industrial roomtone (right) showing 1/f characteristic over the audible range of frequencies. .................... 70

Figure 25 A demonstration of the property of continuous dependence of IFS attractors on the parameters that define them. This also illustrates the power of manipulation capable with chaotic models [frac90]. .................... 73

Figure 26 An example of the effect of three shear maps, w1, w2 and w3, on the area A and an illustration of one of the vertical scaling factors, d1. .................... 77

Figure 27 The initial arbitrary set, B, and a sequence of five iterations of the deterministic algorithm. .................... 81

Figure 28 FIF for equally spaced interpolation points derived from a single cycle of a sinewave, but where the vertical scaling factors increase for the mappings from left to right. .................... 82



Figure 29 FIF where x values are spaced according to a square law. Sequence of magnifications of windows is shown in (a)-(d). .................... 83

Figure 30 Same interpolation points as Figure 29, but with 6 iterations showing the cumulative effect of errors in the algorithm. The bottom plot is a magnification of the middle ~1000 points of the top plot. .................... 84

Figure 31 FIF generated from random x, y and d values for the interpolation points. .................... 84

Figure 32 (a) (left) FIF generated with random y values, but evenly spaced x. All d = 0.9. (b) (right) FIF generated with random y, but square law x values. All d = 0.9. .................... 85

Figure 33 - see Table 1 .................... 86

Figure 34 Development of two rhythm/timbres from rhythmic design, top, through interpolation points, middle, to final waveform, bottom. .................... 87

Figure 35 Control rule for time-varying FIF sound. Left, pseudocode where (x_i^j, y_i^j) is the ith interpolation point of the jth FIF and d_i^j is the vertical scaling factor for the ith map of the jth FIF. Right, graphical depiction of the effect on the interpolation points through time. .................... 88

Figure 36 Left, time plot of the whole waveform generated with the control rule shown in Figure 35 with selected magnifications of individual FIFs to show how the sound develops through time. Right, spectrogram of the first half of the sound showing how it contains complex, time varying partials similar to those found in naturally occurring musical sounds. .................... 89

Figure 37 Pictorial representation of the FIF parameter control used to generate the second example of a time-varying FIF sound. .................... 90

Figure 38 Schematic diagram of the model for biological evolution. .................... 92

Figure 39 Schematic diagram of hardware used for GEN program. .................... 92

Figure 40 Example of mutation, (a), and recombination, (b), of FIF parameters. .................... 94



Figure 41 A single screen-shot from the program GEN. .................... 96

Figure 42 A sequence of populations generated with the program GEN. In this case, the FIFs are produced from 6 interpolation points. At the start (waveform A - top left) all interpolation points and vertical scaling factors are zeroed. At each stage, 7 mutations are produced and then a single survivor is chosen by the operator (starred waveform), which reappears as waveform A in the next generation. .................... 98

Figure 43 Starting point (top left) and sequence of starred waveforms from Figure 42 shown in more detail. .................... 99

Figure 44 Mutated variants of an FIF that is defined by a relatively large number of parameters. It can be seen (and heard) that when this is the case, low factor mutations are found not to be distinctive from one another. .................... 100

Figure 45 Results of an experiment to extract interpolation points by decimating a wind sound waveform and then constructing an FIF with them. .................... 103

Figure 46 Original wind sound waveform (top), interpolation of peak points (bottom left), and reconstructed waveform (bottom right). .................... 105

Figure 47 Section of original wind sound (left) and part of the composite FIF (right) constructed using groups of peak points. .................... 106

Figure 48 Mapping of amplitudes in requantisation process. .................... 110

Figure 49 Degradation against compression performance of Mazel's inverse algorithms for a variety of data and model types compared with the theoretically expected performance of requantisation. .................... 113

Figure 50 First trial pair of interpolation points on the original time series graph. .................... 115

Figure 51 Mapping of whole time series to in between the first pair of interpolation points. .................... 115

Figure 52 Maximum vertical extent of part of the original time series between a pair of consecutive interpolation points and the maximum vertical extent of the mapped original time series. The vertical scaling factor is calculated so as to make these two extents equal. .................... 117



Figure [number lost] Error weighting function parameterised by [symbol lost]. .................... 122

Figure 53 Graph of the results shown in Table 9. .................... 123

Figure 54 Comparison of performance between requantisation and error-weighted version of Mazel's algorithm. The original is 1000 samples of wind noise which is processed as 10x100 sample sections. .................... 124

Figure 12 Comparison of performance of the window restricted inverse algorithm with that of requantisation. The original time series is wind noise and processed as 10x100 sample sections. .................... 126

Figure 13 Waveform plot of original wind noise (left) and compressed FIF version (right) using the modified inverse algorithm. The compression ratio in this case is 8.1:1, and the SNR is 22.6 dB. .................... 127

Figure 14 Column chart showing the performance figures given in Table 11 for a variety of different original sound time series. .................... 128

Figure 55 The proposed analysis/synthesis model based upon the embedded attractor and measure representation of a sound time series. .................... 136

Figure 56 Left, an example recursive partition for m=2 and right, the associated search tree. .................... 142

Figure 57 Lorenz input, N=10,000, Q=256 and a variety of embedding dimensions, m. .................... 149

Figure 58 Lorenz input, N=10,000, m=7, and a variety of number of domains, Q. .................... 150

Figure 59 Lorenz input, Q=64, m=7 and a variety of original time series lengths, N. .................... 151

Figure 60 Time series plots from original Lorenz system (left) and the synthetic one shown as phase portrait 58(f) (right). .................... 152

Figure 61 Estimates of amplitude probability distributions for original, left, and synthetic, right, time series shown in Figure 60. .................... 153



Figure 62 Time series plots and phase portraits for: left, original fan rumble sound and right, best synthetic output, rc127. .................... 157

Figure 63 Time series plots and phase portraits for some more outputs from the sound model using the fan rumble as input. Note that, for the sake of clarity, the phase portraits show only about a third of the output length that appears in the time series plots. .................... 159

Figure 64 Time series plots (first fifth of top plot shown magnified as second plot), power spectra and phase portraits for original wind noise, left, and synthetic version, right. .................... 161

Figure 65 Time series plots, phase portraits and amplitude histograms for original, left, and synthetic, right, lightly-struck gong sound. Both amplitude histograms were computed with 10,000 samples and 100 bins. .................... 163

Figure 66 Time series plots, phase portraits and amplitude histograms for original, left, and synthetic, right, hard-strike gong sound. Both amplitude histograms were computed with 10,000 samples and 100 bins. .................... 164

Figure 67 Time series plots, power spectra and phase portraits for original, left, and synthetic, right, tuba tones. .................... 166

Figure 68 Time series and phase portraits for original, left, and synthetic, right, saxophone tones. .................... 166

Figure 69 Relative one-step prediction errors for the best results found for each of the time series. .................... 168

Figure 70 Autocorrelation functions for original, left, and synthetic, right, gently struck gong sound. The upper plot shows the function up to 8,000 delays, and the lower up to 100 delays. Both were calculated by convolving 10,000 samples of the time series with itself for different delays. .................... 171

Figure: The top line shows the interdependence of the components of the RIA version of an IFS. The bottom line shows a suggested path to obtain a solution to the inverse problem.

Figure: Input to the algorithm treated as a circular sequence.

Figure: Part of the state space, X, corresponding to an example PGA showing some of the possible states and their associated transitions.



Figure: Crossfade envelopes applied to beginning and end of original time series which are then added together to form modified time series. This is then stored in the circular register so that there is no amplitude discontinuity between its end and its beginning.

Figure: Time domain plots of the original roomtone showing 300 (left) and 3000 (right) samples.

Figure: Time domain plots of output time series when (a) I=300, L=1, (b) I=3000, L=3, and (c) I=300, L=4.

Figure: Comparison between original (left) and synthetic time series (right) showing: (a)&(b) time domain plots, (c)&(d) power spectral densities calculated by averaging eleven 4096 point FFTs, and (e)&(f) amplitude histograms calculated from 30,000 samples.



List of Tables

Table 2.1 A summary of possible sound types. After [ross82].

Table: (left) example set of interpolation points and vertical scaling factors that define the FIF shown in the corresponding figure.

Table: (right) vertical scaling factors used in generating the corresponding figure.

Table and Figure: Input data and waveform plot of the resulting FIF that is a rhythm/timbre.

Table: Summary of the results obtained by Mazel for his four FIF based models/inverse algorithms.

Table: Summary of results for reimplementation of Mazel's algorithm for the self-affine model. Each original time series of length Ttot has been processed as m=10 sections of length T=100.

Table: Running algorithm with wind noise as original time series for a variety of section lengths T.

Table: Results of error weighting the inverse algorithm for a range of weighting function gradients, . The original time series is wind noise and is processed as 10x100 sample sections.

Table: Performance of modified FIF inverse algorithm with a specified window restricting the range of the trial interpolation point.

Table: Table of performance figures for window restricted inverse algorithm using a variety of sound time series. Each original time series is processed as 10x100 sample sections and the restriction window is set at l=15 and r=25 samples.

Table: Summary of results using fan rumble sound as input to the dynamic model.

Table: Summary of analysis parameters for best results using gong sounds.

Table: Analysis details for the musical tones.



Table: Example of the PGA acting on a short paragraph of text for a variety of values of the seed length parameter.

Table: Example sequence of iterations of the PGA.

Table: Simple example showing how the preprocessing reorders the original input sequence.

Table: Summary of results obtained with PGA and industrial roomtone as original time series. (Numbers in brackets are experiment identification.)

Table: Summary of results for PGA used with other roomtones having different qualities.

Table: Summary of results obtained with PGA and a variety of other background sounds.



List of Sound Examples

All sounds are created by playing 16-bit sound files at 48kHz or 44.1kHz sample-rate unless otherwise stated. The sample-rate is indicated by the suffix of the sound file name given in brackets after each description. For example, '.441' indicates an original sound recording made with a sample-rate of 44.1kHz or a synthetic version played back at that rate. The suffix '.mbi' is used to indicate an abstract waveform with no intrinsic sample-rate. These files are played at 48kHz.

Playback is via a Digital Audio Labs 'CardD Plus' system connected to an IBM

compatible P.C. This allows an AES/EBU compatible, serial digital audio data-stream

to be generated from the sound file. This is then passed to a Sony TCD-D10 digital

audio tape (DAT) recorder which is used as the digital-to-analogue device.

Chapter 5

1. FIF derived from 17 equally x-spaced interpolation points taken from a single sinewave cycle, 5 iterations. (sine_5.mbi)

2. Same as Sound 1, but with increasing vertical scaling factors. (sine3.mbi)

3. FIF derived from 129, square-law x-spaced interpolation points taken from a single sinewave cycle, 3 iterations. (sine9_3.mbi)

4. Same waveform used in Sound 3, but played as a sequence where the speed of playback is halved at each stage. (sine9_3.mbi)

5. FIF derived from randomised interpolation points and vertical scaling factors. (rand4.mbi)

6. FIF derived from interpolation points whose y-values are randomised, but that are regularly x-spaced. (rand2.mbi)

7. Same as Sound 6, but with square-law x-spacing. (rand3.mbi)

8. Original FIF rhythm/timbre. (fif1.mbi)

9. Same waveform used in Sound 8, but played as a sequence where the speed of playback is halved at each stage. (fif1.mbi)

10. First designed FIF rhythm/timbre. (rhy2_1_x.mbi)

11. Second designed FIF rhythm/timbre. (rhy4_4.mbi)

12. Percussive sounding, time-varying FIF. (tv1.mbi)



13. Second example of a time-varying FIF. (tv2.mbi)

14. Audio output from the program GEN which accompanies the corresponding figure. Each of the 8 sounds is a member of a single evolved population of FIFs. Played at 48kHz.

15. Sounds to accompany Figure 5.17. Each of the 8 sounds is the chosen survivor of a sequence of generations produced with GEN. Played at 48kHz.

16. Concatenated sequence of ~15 short, evolved FIFs. (mbi1log.mbi)

17. Concatenated sequence of 4 related FIF rhythm/timbres. (goodone.mbi)

18. Audio output from GEN which accompanies Figure 5.19. Each sound is a member of one generation evolved from FIF parameters similar to those used in Sound 3. It can be heard how there is little to distinguish the mutated offspring. Played at 48kHz.

Chapter 6

19. FIF whose interpolation points are the peak-points of a wind noise waveform. (wp1.mbi)

20. As Sound 19, but using groups of peak-points. (wp2.mbi)

Chapter 7

All the examples from Chapter 7 are presented as pairs of the original sound and

the synthetic version produced with the chaotic predictive model.

21. Original fan rumble air-noise. (fan_rmb5.48)

22. Synthetic version of above. (rc127b.48)

23. Original wind noise. (wind6.48)

25. Original lightly-struck gong sound. (gong4.48)

27. Original hard-strike gong sound. (gong6.48)

29. Original tuba tone. (tuba2.48)

30. Synthetic version of above. (rc175x.48)



31. Original saxophone tone. (sax9.48)

32. Synthetic version of above. (rc108x.48)

Chapter 8

33. Original industrial roomtone. (rmt4.441)

34. Synthetic version of Sound 33 produced by PGA where I=300 and L=1. (rmt4_111.441)

35. As 34, but I=3,000 and L=2. (rmt4_212.441)

36. As 34, but I=3,000 and L=3. (rmt4_213.441)

37. As 34, but I=3,000 and L=4. (rmt4_214.441)

38. As 34, but I=30,000 and L=2. (rmt4_312.441)

39. As 34, but I=3,000 and L=5. (rmt4_315.441)

40. Original laboratory roomtone, played at 48kHz. (lab_rmt.48)

41. Synthetic version of above produced with PGA, played at 48kHz. (lab_313.48)

42. Original 'rumble-like' industrial roomtone. (rmt11.441)

43. Synthetic version of above produced with PGA. (rt11_314.441)

44. Original industrial roomtone with drone. (rmt15.441)

45. Synthetic version of above produced with PGA. (rt15_314.441)

46. Original river sound. (river.48)

47. Synthetic version of above produced with PGA. (rive_313.48)

48. Original wind noise. (wind1.48)

49. Synthetic version of above produced with PGA. (wind_313.48)

50. Original audience applause sound. (applause.48)

51. Synthetic version of above produced with PGA. (appl_312.48)

52. Original rainforest ambience. (ecuador.48)

53. Synthetic version of above produced with PGA. (ecua_314.48)

54. Original speech extract. (speech.48)

55. Synthetic version of above produced with PGA. (sp_pga.48)



Summary of Acronyms

DAT Digital Audio Tape

DSP Digital Signal Processor

FFT Fast Fourier Transform

FIF Fractal Interpolation Function

FM Frequency Modulation

IFS Iterated Function System

jpdf joint probability density function

LPC Linear Predictive Coding

pdf probability density function

PGA Poetry Generation Algorithm

RIA Random Iteration Algorithm

rms root mean square

SDS Shift Dynamical System

SNR signal to noise ratio



Chapter 1

Introduction

This thesis is about applying science and technology to the arts. In particular, the

science is that of chaos theory, which includes fractal geometry, the technology is the

computer, and the medium of interest, sound. Fractals and chaos are recent

developments which are revolutionising our understanding of the complex and

irregular nature of the world. Chaos theory is concerned specifically with the

behaviour of nonlinear dynamical systems. It is about the realisation that simple,

deterministic systems can exhibit complex, unpredictable behaviour. Fractal geometry

deals with a class of forms that are not accounted for by conventional, Euclidean

geometry. The two overlap with the concept of a strange attractor which both

embodies the nature of chaotic systems and is itself a fractal object. The relevance and

use of chaos and fractals is currently spreading through a diverse range of subjects. A

number of developing areas of interest are characterised by the overlap of both

scientific and artistic concerns. In particular, two subjects have emerged that have

considerable popularity: visual art and music. Both combine fractal and chaotic

models with computer technology to provide powerful tools for artistic

experimentation. The aim of this work is to seek a parallel to this, but involving

sound.

Consider the images shown in Figure 1.1. These

are examples of the power of fractals and chaos. Using only very simple models it is

possible to create images that can be either complex abstract forms or realistic replicas

of natural objects. The question is, can the same be found in the acoustic domain? For

example, could a complex, naturally occurring sound be represented with a simple

model? Does there exist an aural equivalent of the Julia set?

Figure 1.1 A synthetic cloud, fern and a Julia set [frac90].



Interest in fractal music has concentrated on the arrangement of sequences of notes

with reference to fractal or chaotic models. Although the end product is audio, the

actual sounds used are conventional natural or synthetic ones (for example see

[pres88, gogi91 and jone90]). The time scale on which fractals and chaos are being

used for music, then, is different to that of the sounds themselves. Musical

fluctuations range from thousandths of Hertz up to several Hertz. Audio fluctuations,

however, range from hundreds of Hertz to tens of thousands. An important discovery

that supports the use of fractals and chaos for music composition is that, when

analysed, music from a wide range of cultures and historical periods is found to have

fractal properties [voss78, hsu90 and hsu91]. It has been suggested, however, by

Benoit Mandelbrot, the inventor of the term fractal, that such properties should not

extend beyond the musical structure to the sounds themselves as these are governed

by different mechanisms [mand83].

But why should this necessarily be the case? What about the complex and

irregular side of musical sound, for example the hiss of a breathy saxophone, or the

crash of a cymbal? Also, what about non-musical sound? All around us there are

complex and irregular sounds generated by our environments: a burbling brook,

splashing water, the roaring of the wind, the rumble of thunder and the variety of

screeching, scraping, buzzing and humming noises made by machinery. Is it, perhaps,

that these sounds represent an aural equivalent to the shapes found in nature that have

been neglected by Euclidean geometry and then rediscovered as fractals? Criticising

the conventional Fourier approach to modelling musical sound, the contemporary

composer Iannis Xenakis has said:

"It is as though we wanted to express a sinuous mountain silhouette by portions of

circles." [xena71]

Compare this to what Mandelbrot says in the introduction to his 'The Fractal

Geometry of Nature':

"Clouds are not spheres, mountains are not cones, coastlines are not circles, and

bark is not smooth, nor does lightning travel in straight lines." [mand83]

This thesis, then, presents an exploratory study into the idea of using chaos theory

and fractal geometry to model sound. Apart from the interest in this as a research

topic, the work is practically motivated with the aim of developing computerised tools

that would allow control over complex and irregular sounds for creative uses. The

potential applications for such tools include computer music composition and the

generation of sound effects for film and television.



The overall design of this thesis is as follows: Chapters 2, 3 and 4 present the

background to this thesis and develop specific problems on which to work. Then

Chapters 5, 6, 7 and 8 present original contributions towards the solution of these

problems. Each of these chapters contains its own conclusions and a discussion of

further work where relevant. Chapter 9 contains a summary of the thesis and some

general conclusions. An appendix is included which contains copies of previously

published papers on this work and the thesis ends with a full list of references.

Throughout the thesis, references are made to sound examples which are presented on

an accompanying cassette tape. The sound examples are listed, along with all figures

and tables, after the contents pages. Also included is a summary of acronyms for

reference. The content of each chapter is previewed below.

Chapter 2 defines what is meant by a sound model. It considers what sound is, and

the general concept of its representation via the procedures of analysis and synthesis.

Some specific applications are described, including 'the roomtone problem', which

allows a functional description of a model to be developed. Brief reviews of some

well known models fitting this description are given including some of their

advantages and limitations.

Chapter 3 presents a review of chaos theory and fractal geometry. This includes an

outline of some main features and their significance. The emphasis is on

understanding how complex behaviour arises from simple systems, the importance of

strange attractors, and the introduction of Iterated Function Systems (IFS), which

provide a useful practical framework for manipulating strange attractors.
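As a minimal illustration of complex behaviour arising from a simple system, the sketch below iterates the logistic map. This is a standard textbook example and is not necessarily one used in this thesis; the function name `logistic_orbit`, the parameter value r = 4 and the two starting values are assumptions chosen only for illustration.

```python
def logistic_orbit(x0, r=4.0, steps=100):
    """Iterate x -> r * x * (1 - x) and return the orbit, including x0."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


# Two orbits whose starting points differ by one part in a billion.
a = logistic_orbit(0.3)
b = logistic_orbit(0.3 + 1e-9)

# Distance between the two orbits at each step.
gaps = [abs(p - q) for p, q in zip(a, b)]
print("initial gap:", gaps[0])
print("largest gap over 100 steps:", max(gaps))
```

Although the rule is a single deterministic line, the two orbits separate until they bear no obvious relation to each other, which is the sensitivity to initial conditions associated with chaotic systems.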

In Chapter 4 the issue of applying the ideas of chaos theory and fractal geometry

to the problem of modelling sound is considered. It is argued that both appear to have

potential use, but that two main questions are raised. Firstly, on a diagnostic level: are

sounds chaotic or fractal? Positive evidence is collected both from the literature and

from original work. The second question is then a practical one: in what way can

sound be represented with chaos or fractals? The conclusion is to concentrate on using

strange attractors in two different ways with an emphasis on involving IFS.

Chapter 5 is concerned with using IFS strange attractors to produce synthetic

sound by generating waveforms with Fractal Interpolation Functions (FIF), a class of

IFS. A basic technique is designed that is then advanced in several ways. The most

important result is the discovery of a new class of sounds that are simultaneously

rhythms and timbres. With these techniques complex sounds may be generated with

small amounts of data and are demonstrated to have potential for musical applications.
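The idea of generating a waveform from a few interpolation points can be sketched with the standard affine FIF construction due to Barnsley. This is an illustrative reading of the technique rather than the implementation developed in Chapter 5: the function name `fif_waveform`, the deterministic point-doubling scheme and the constant vertical scaling factor of 0.3 are assumptions.

```python
import numpy as np


def fif_waveform(xs, ys, d, iterations):
    """Deterministic iteration of an affine fractal interpolation function.

    xs, ys : interpolation points with xs strictly increasing.
    d      : one vertical scaling factor per interval, each with |d| < 1.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    d = np.asarray(d, dtype=float)
    x0, xn, y0, yn = xs[0], xs[-1], ys[0], ys[-1]
    span = xn - x0
    # Coefficients of w_i(x, y) = (a_i x + e_i, c_i x + d_i y + f_i), chosen so
    # that w_i maps the end points of the data onto interval i.
    a = (xs[1:] - xs[:-1]) / span
    e = (xn * xs[:-1] - x0 * xs[1:]) / span
    c = (ys[1:] - ys[:-1]) / span - d * (yn - y0) / span
    f = (xn * ys[:-1] - x0 * ys[1:]) / span - d * (xn * y0 - x0 * yn) / span

    px, py = xs.copy(), ys.copy()
    for _ in range(iterations):
        seg_x = [a[i] * px + e[i] for i in range(len(a))]
        seg_y = [c[i] * px + d[i] * py + f[i] for i in range(len(a))]
        # Neighbouring segments share an end point, so drop the duplicates.
        px = np.concatenate([seg_x[0]] + [s[1:] for s in seg_x[1:]])
        py = np.concatenate([seg_y[0]] + [s[1:] for s in seg_y[1:]])
    return px, py


# 17 equally spaced points from one sine cycle, as in Sound 1 (fewer iterations).
xs = np.linspace(0.0, 1.0, 17)
ys = np.sin(2.0 * np.pi * xs)
px, py = fif_waveform(xs, ys, d=np.full(16, 0.3), iterations=2)
```

Here 17 points and 16 scaling factors expand to 4097 waveform samples after two iterations, and setting every scaling factor to zero reduces the result to ordinary linear interpolation.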

Chapter 6 keeps the theme of FIF, but considers the analysis and synthesis of a

given sound. An algorithm is taken from the literature which appears suitable for this



task. It is shown, however, to be inadequate; a reason is found and the algorithm is

improved. Results indicate that some degree of data compression may be obtained for

certain sounds.

Chapter 7 is concerned with the problem of modelling the dynamics of a sound via

a strange attractor. The assumption is made that a chaotic system is responsible for a

digital audio time series. The system may then be reconstructed from the time series

with a technique known as embedding. Because of the properties preserved by

embedding, the construction of another chaotic system that approximates the

embedded one should produce a time series that is statistically similar to the original.

An approach to this problem is considered which combines techniques taken from

work on the nonlinear prediction of time series with an original method inspired by

the Shift Dynamical System (SDS) version of an IFS. An analysis/synthesis algorithm

is developed and a number of experiments performed. The algorithm is shown to be

capable of modelling known chaotic systems from their time series. Also, despite

some difficulties, the algorithm is capable of successfully reproducing some natural

sound so that it is perceptually similar to the original.
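The embedding step mentioned above is commonly carried out with the method of delays, which can be sketched as follows. This is a generic illustration and not the Chapter 7 algorithm: the function name `delay_embed`, the test signal and the values chosen for the embedding dimension and delay are assumptions.

```python
import numpy as np


def delay_embed(series, dim, delay):
    """Return the delay vectors [x(t), x(t + delay), ..., x(t + (dim-1)*delay)].

    Each row of the result is one point in the reconstructed state space.
    """
    series = np.asarray(series, dtype=float)
    n_vectors = len(series) - (dim - 1) * delay
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and delay")
    return np.column_stack(
        [series[k * delay : k * delay + n_vectors] for k in range(dim)]
    )


# Stand-in for a digital audio time series: a sampled sinusoid.
t = np.arange(2000)
x = np.sin(2.0 * np.pi * t / 100.0)
states = delay_embed(x, dim=3, delay=25)
```

Plotting the columns of `states` against each other gives a phase portrait of the kind referred to in the list of figures; for a sinusoid the points trace a closed loop, whereas a chaotic signal would be expected to fill out a more intricate object.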

Chapter 8 is also concerned with the problem of modelling the dynamics of a

sound in an embedded state space setting. The model considered, however, is the

Random Iteration Algorithm (RIA) version of an IFS where a Markov chain is used to

model the embedded invariant measure. In the course of this investigation, an

algorithm is developed which solves the roomtone problem for certain ambient

sounds.
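The random iteration idea can be sketched in a few lines. The example below drives a set of affine maps with a Markov chain, using the well-known Sierpinski triangle maps as a stand-in; it is not the sound model of Chapter 8. The function name `random_iteration`, the choice of maps and the uniform transition matrix are assumptions made for illustration.

```python
import numpy as np


def random_iteration(maps, transition, n_points, seed=0):
    """Random Iteration Algorithm with map selection governed by a Markov chain.

    maps       : list of (A, b) pairs defining the affine maps x -> A @ x + b.
    transition : row-stochastic matrix; row i gives the probabilities of the
                 next map given that map i was applied last.
    """
    rng = np.random.default_rng(seed)
    transition = np.asarray(transition, dtype=float)
    state = 0                      # index of the map applied last
    point = np.zeros(2)
    out = np.empty((n_points, 2))
    for k in range(n_points):
        state = rng.choice(len(maps), p=transition[state])
        A, b = maps[state]
        point = A @ point + b
        out[k] = point
    return out


# Three contractions towards the corners (0, 0), (1, 0) and (0.5, 1).
half = 0.5 * np.eye(2)
sierpinski = [
    (half, np.array([0.0, 0.0])),
    (half, np.array([0.5, 0.0])),
    (half, np.array([0.25, 0.5])),
]
uniform = np.full((3, 3), 1.0 / 3.0)
points = random_iteration(sierpinski, uniform, 5000)
```

With a uniform transition matrix every map is equally likely at each step; making the rows differ changes how densely each part of the attractor is visited, which is the sense in which a Markov chain can shape the invariant measure.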

Chapter 9 presents a summary of the thesis and some general conclusions on the

subjects of inverse problems, algorithmic complexity and developments of the work.



Chapter 2

Modelling Sound

This chapter develops a working definition of a sound model. It will consider what

sound is and its representation within an analysis/synthesis framework. Some possible

applications of such a model will be discussed including a specific one concerning

film sound-track editing, known as 'the roomtone problem'. This leads to a set of

useful functions that define the model. Also, a brief review of established modelling

techniques, their advantages and limitations is included.

2.1. Sound and its Representation

What is sound? It can be defined as either an auditory sensation perceived by the

mind, or as the physical disturbance that gives rise to such a sensation [ross82]. A

practical model for sound has, in some way, to represent it in an appropriate form.

Starting from this definition of sound there are a number of levels on which this

representation could take place. Consider these as ordered from the outside in: on the

outside level, a model could be made of the complete physical system that is

responsible for the sound. This might include the source of the disturbance and its

reverberant environment. A list of possible disturbances is shown in Table 2.1.

Secondly, this model may be simplified to include only that which is relevant to

describing the pressure fluctuations in the air at a single point; for example at the ear

or a microphone. Next, a model could be made for the time waveform created by

recording those pressure fluctuations at a single point without any, or little,

consideration of the physical system that created it. The waveform is then an abstract

pattern which is to be modelled. Finally, the model may account for just the

perception of the sound, so that an accurate representation of the time waveform is not

necessary, but a representation is needed that just contains the relevant information to

capture the essential characteristics of the sound.

At whatever level the representation is made, a useful framework within which to

test its validity is provided by the analysis-synthesis scheme shown in Figure 2.1

[riss82]. The important feature is that a listener judges how good the representation is

at capturing the characteristics of the sound. In order to refine this modelling

framework, it will be useful to consider some of the applications where sound models

are, or might be used.



Figure 2.1 The analysis-synthesis scheme. [Diagram labels: sound, analysis, representation, synthesis, sound, listener.]

Physical Disturbance                                      Example

vibrating solid bodies                                    metal bar, speaker cone, violin body

vibrating air column                                      pipe organ, woodwind instrument

flow noise in fluids due to turbulence                    jet engines, air leaking under pressure, wind noise

interaction of moving solid with fluid                    rotating propeller or fan blade

interaction of moving fluid with solid                    air flow in duct or through grill, water in pipe, waves breaking on sea shore

rapid changes in temperature or pressure                  thunder and other sounds caused by electrical discharge, chemical explosion

shock waves caused by motion or flow at supersonic speed  supersonic boom caused by jet aircraft

Table 2.1 A summary of possible sound types. After [ross82].

2.2. Music Composition

An important aspect of music composition is, obviously, the control over the type

and quality of sound used. This century has seen the use of electronic and, more

recently, computer based techniques grow from the experimental to the mainstream.

Typically, such techniques involve obtaining musical sound and processing it to

modify it, or generating it entirely synthetically. Of importance are the degrees of



musical usefulness and flexibility that are offered by a technique coupled with the

ease and efficiency with which it can be executed.

Imagine the example of a drum synthesiser. What might be its attractive features

for a composer? It might be able to take the recording of an original drum sound and

reproduce it so as to retain its relevant characteristics, discarding any perceptually

unimportant information in the process. It might then allow the sound to be modified

in a way related to its physical attributes, for example, to be able to change the sound

as if it came from a larger version of the same drum, or one that had a tighter skin and

had been struck with a different beater. Furthermore, the synthesiser might allow drum

sounds to be generated that it would not be possible to create with real instruments.

A more detailed discussion of sound modelling techniques used for music

composition is given in the forthcoming sections 2.5 - 2.8.

2.3. The Roomtone Problem

Another area of creative sound use is film sound-track editing. This, as with music

composition, generally involves manipulating sound in a number of ways except that

often the sound is non-musical. A good example of this is the use of sound effects.

Here, the desire is to add certain sounds to a film to enhance or complement what is

taking place visually. Traditionally, this is done by simulating the appropriate sounds

with a variety of acoustic devices or making use of large reference libraries of

recordings. It is, however, often problematic and time consuming to get exactly the

desired sound. A specific example of this is the roomtone problem which was posed

by the company that sponsored this research.

The roomtone problem arises during post-production editing of a film sound-track.

Often, due to problems that have occurred with the location filming, it is necessary to

replace sections of the original sound-track at a later date. For example, this can

involve having them dubbed by the original actors in an acoustically dry sound studio.

The problem occurs when the new pieces of sound track are inserted into the original

as there is often a noticeable lack of background sound. As these background sounds

tend to be characteristics of internal locations, they are known as roomtones. One

traditional solution to this problem involves referring to libraries of roomtone

recordings to find a matching sound. It is often difficult, however, to find exactly the

right sound and the process can also be time consuming. Another solution is to make

use of small snippets of the roomtone found in places on the original recording, for

example between lines of dialogue. These may be spliced together, or looped to form

as long a piece as is necessary. As with the other solution, this can be an intricate and



time-consuming process, and the results are often not good enough because the splices and

loops are audible.
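The looping approach described above can be sketched as follows. This is a minimal illustration and not a procedure taken from this chapter: the function names `crossfade_loop` and `loop_to_length`, the linear fade shape and the fade length are all assumptions, and the sinusoid merely stands in for a short roomtone snippet.

```python
import numpy as np


def crossfade_loop(snippet, fade_len):
    """Fold the tail of a snippet into its head so that it loops without a jump.

    The last fade_len samples fade out while the first fade_len samples fade
    in; the result is fade_len samples shorter than the input.
    """
    x = np.asarray(snippet, dtype=float)
    if not 0 < fade_len <= len(x) // 2:
        raise ValueError("fade_len must be between 1 and half the snippet length")
    ramp = np.linspace(0.0, 1.0, fade_len, endpoint=False)
    loop = x[: len(x) - fade_len].copy()
    loop[:fade_len] = ramp * x[:fade_len] + (1.0 - ramp) * x[-fade_len:]
    return loop


def loop_to_length(loop, n_samples):
    """Repeat the loop until n_samples samples have been produced."""
    repeats = -(-n_samples // len(loop))  # ceiling division
    return np.tile(loop, repeats)[:n_samples]


snippet = np.sin(2.0 * np.pi * np.arange(4800) / 480.0)  # stand-in snippet
loop = crossfade_loop(snippet, fade_len=480)
background = loop_to_length(loop, 48000)
```

A crossfade of this kind removes the amplitude discontinuity at the loop point, but the repetition itself may still be heard when the snippet is short, which is the limitation noted in the text.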

An ideal solution to this problem, then, would be some form of sound model that

is able to capture certain essential characteristics of the roomtone from a small

original sample and then produce greater quantities of a synthetic version.

Both the examples of the drum synthesiser and the roomtone problem illustrate a

certain type of creative application for sound models. Generally, the need is for the

model to capture essential characteristics of the sound; for it to allow useful

manipulation of the sound; and/or for it to generate synthetic sound. An important

aspect of such models is that the representation involves a set of parameters. These are

the variables of the model that, with the particular representation, form all the

information extracted by the analysis, and/or used by the synthesis. So for the drum

model, the parameters might include the physical attributes of the drum, or for the

roomtone model, the extract of original sound.

2.4. Digital Audio

Being more specific about the sound model, it is assumed that it will operate

within a computer and therefore rely on digital audio as an intermediate

representation. This brings the enormous advantage that the modelling process may be

implemented as a computer program, which makes it highly flexible, and convenient

to develop [math82]. Digital audio satisfies the definition of a representation for

sound that has been given already. It is a discrete time, discrete amplitude model for

the time waveform generated from recording sound at a single point in space. It

preserves perceived information in the form of all frequencies contained within the

sound up to one half of the sampling frequency. This is guaranteed by Nyquist's

sampling theorem [nyqu28]. It is, however, unwieldy, in that a large amount of data is required for good quality representation. For example, the industry standard of a

48kHz sampling rate and 16 bits per sample [aes85] means that approximately one

million bytes of data are required to represent ten seconds of sound; this data not

being in a form that is obviously related to the perceived characteristics of the sound.

This is therefore another reason for further representation of the sound waveform: so

as to reduce the amount of parameter data. Assuming the use of digital audio and

therefore computers also means that the model has to perform its desired functions

within the constraints imposed by the processing ability of the computing devices

used.
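
The storage figure quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a single channel of linear PCM (in keeping with the description of sound recorded at a single point in space); the function name `pcm_bytes` is hypothetical and introduced only for illustration.

```python
def pcm_bytes(duration_s, sample_rate_hz=48_000, bits_per_sample=16, channels=1):
    """Return the number of bytes of linear PCM needed for the given duration."""
    bytes_per_sample = bits_per_sample // 8
    return int(round(duration_s * sample_rate_hz)) * bytes_per_sample * channels


# Ten seconds at 48 kHz and 16 bits per sample, one channel (assumed).
ten_seconds = pcm_bytes(10)

# Highest frequency preserved: one half of the sampling frequency.
nyquist_hz = 48_000 / 2

print(ten_seconds, "bytes for ten seconds; frequencies preserved up to", nyquist_hz, "Hz")
```

Under these assumptions the result is 960,000 bytes, which agrees with the figure of approximately one million bytes given in the text.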


2.5. The Modelling Framework

Following the discussion developed within this chapter, then, a working functional

description of a sound model is summarised as follows. A sound model is of use if:

1) it can represent the essential perceived characteristics of the sound;

2) there is less parameter data than there is original sound data;

3) the parameter data is of a form such that its manipulation has a useful or

interesting effect on the sound;

4) it can generate new sounds, or replicas of naturally occurring ones, from a little

data and/or a simple model.

Although much is known for particular situations, it is very difficult to say, in

general, what physical attributes of the sound it is sufficient to preserve in the

representation so as to satisfy 1). This is still an open question in psychoacoustics [see

deut82]. Point 2) on its own may also be described as data compression. Although this

tends to be an attractive feature of a model in terms of reducing the amount of storage

required, it is considered here also in combination with 3) in the sense that the

parameters are more manageable if there are fewer of them. The synthesis capability of

the model, 4), may be derived from the analysis model and used by supplying it with

modified or artificial parameters, or it may exist on its own as a synthesis-only

technique.

It has also been assumed that the model will operate on a digital audio

representation so that it can operate within a computer. A more detailed diagram of

the sound modelling framework, then, is shown in Figure 2.2.

[Figure 2.2 diagram. Signal path: sound, microphone, sample and quantise, digital audio time waveform, analysis, representation parameters (which the operator may modify), synthesis, digital audio, reconstruct and amplify, loudspeaker.]

Figure 2.2 The sound modelling framework.


Now that a general modelling framework has been defined, the next section gives

some brief reviews of particular, well known representations that fit this description.

These serve to illustrate the points made so far, and act as a reference when the issue

of modelling sound using chaos theory is discussed in Chapter 4.

2.6. Conventional Models

2.6.1. Physical Modelling

Physical modelling is a synthesis-only technique that is used to generate musical

sound from a computer representation of the physical system responsible for that

sound. The system can include the action of the musician on the instrument, and the

instrument itself. The system is usually partitioned according to physical, functional or computational criteria which in fact often coincide. So for example, a violin may be

divided into the bow, strings, bridge and soundboard as separate coupled physical

systems; or into an excitation part (bow on string) that feeds a resonator (string,

bridge, sound board); or into a nonlinear oscillator (exciter) that is input to a linear

filter (resonator).

The appeal of physical modelling is that sounds may be created from a purely

theoretical basis and that the models and parameters are in a form that can be

intuitively understood by the user. The main disadvantage is that, despite much basic theory being known about the physics of musical sound generation, the models

resulting from a direct implementation of the equations often produce sounds that are flat

and lifeless [riss82]. This suggests that there are many subtle aspects of

sound production that are important to the highly sensitive perceptual mechanisms of

the ear and brain that are not included in the basic theory. This is an area of current

research [cmj92].

2.6.2. Additive and Subtractive Synthesis

Additive and subtractive synthesis are terms used to cover a range of analysis-

synthesis techniques used for modelling musical instrument and voice sounds and

which rely on spectral representations of the time waveform. As mentioned above, a

number of such sounds can be presumed to be the product of some form of excitation

feeding a resonator. A time-varying spectral analysis of the sound can reveal these

components in a form that then suggests suitable further representations. For example,

such an analysis shows a bowed violin sound to consist of an approximately periodic

excitation, revealed as a set of harmonically related spectral lines, or partials, within


an overall spectral envelope, which is attributed to the resonances of the violin body.

A similar result can be found for voiced speech sounds, where the resonances, also

called formants, vary in time. Unvoiced speech sounds, however, show a broad-band

spectrum modulated by the formant envelope.

Additive synthesis seeks to regenerate the sound by adding together a set of

sinewaves whose frequency and amplitude 'trajectories' vary in time [serr90, riss82].

A diagram of this is shown in Figure 2.3. The trajectories are extracted from the

spectral analysis using a variety of methods. In this form, a large amount of

parameter data can be generated. It has been shown, however, that it is the overall

trend of the trajectories that is of greatest perceptual importance and their

approximation with simple piece-wise linear functions allows a considerable degree of

data reduction while maintaining the quality of the reproduced sound [grey75].

Modification of these functions then also allows musically interesting transformations

of the sound.
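
The idea of summing sinewaves under piece-wise linear control can be sketched as follows. This is an illustrative sketch only, not the analysis-synthesis systems cited above: the breakpoint format, the function names and the example partials are all assumptions made for the illustration.

```python
import math


def piecewise_linear(breakpoints, t):
    """Interpolate (time, value) breakpoints at time t.

    Breakpoint times are assumed strictly increasing; the end values are
    held outside the covered range.
    """
    if t <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return breakpoints[-1][1]


def additive_synth(partials, duration_s, sample_rate_hz=48_000):
    """Sum sinewave partials whose amplitude and frequency follow trajectories.

    Each partial is a dict with 'amp' and 'freq' breakpoint lists.
    """
    n_samples = int(round(duration_s * sample_rate_hz))
    phases = [0.0] * len(partials)
    out = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        sample = 0.0
        for k, p in enumerate(partials):
            amp = piecewise_linear(p["amp"], t)
            freq = piecewise_linear(p["freq"], t)
            sample += amp * math.sin(phases[k])
            # Accumulate phase so that a changing frequency stays continuous.
            phases[k] += 2.0 * math.pi * freq / sample_rate_hz
        out.append(sample)
    return out


# Hypothetical example: three harmonics of 220 Hz, each with a simple
# attack/decay amplitude envelope described by three breakpoints.
partials = [
    {"amp": [(0.0, 0.0), (0.01, 0.5 / h), (0.05, 0.0)],
     "freq": [(0.0, 220.0 * h), (0.05, 220.0 * h)]}
    for h in (1, 2, 3)
]
signal = additive_synth(partials, 0.05)
```

Here each partial is described by a handful of breakpoints rather than by a value per analysis frame, which is the sense in which the piece-wise linear approximation reduces the parameter data.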

[Figure 2.3 diagram: control trajectories (amp 1 to amp 6, freq 1 to freq 6) drive a bank of sinewave generators whose outputs are summed (+) to form the output.]

Figure 2.3 A schematic diagram for additive synthesis.

Additive synthesis works well at representing certain sounds to a high degree of

perceptual accuracy. These are ones with a well defined partial structure arising from

periodic excitation and/or systems with simple vibrational modes. It is, however,

limited in its capability to represent complex or noisy sounds, i.e. ones with broad-

band spectral structures.

Subtractive synthesis also seeks to regenerate the sound using the spectral

information. It does this in the opposite sense to additive synthesis by starting with a

spectrally rich input that is then refined with a time varying filter. The excitation may


be periodic or noise-like, to give harmonic or wide-band spectral structure

respectively. The filter then shapes this to provide the formant envelope.

A powerful method for estimating suitable filters is linear prediction [makh75,

moor90]. This encompasses a number of techniques that allow the estimate of parameters for a digital, recursive linear filter from the original time series. These

filters are of the form,

$$y_n = x_n + \sum_{i=1}^{M} b_i \, y_{n-i}$$

where x is the excitation input, y the output, b the filter coefficients, and M the filter

order, which corresponds to twice the number of formant peaks, since each peak requires a complex-conjugate pair of poles.
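
A recursive filter of this kind can be sketched directly from its difference equation, y[n] = x[n] + b[1] y[n-1] + ... + b[M] y[n-M]. The sketch below is illustrative only: the sign convention (feedback terms added) and the helper `resonator_coeffs`, which places a single resonance using a second-order section, are assumptions and are not taken from the linear prediction methods cited above.

```python
import math
import random


def recursive_filter(x, b):
    """All-pole recursive filter: y[n] = x[n] + sum_{i=1..M} b[i-1] * y[n-i]."""
    order = len(b)
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for i in range(1, order + 1):
            if n - i >= 0:  # outputs before the start are taken as zero
                acc += b[i - 1] * y[n - i]
        y.append(acc)
    return y


def resonator_coeffs(freq_hz, radius, sample_rate_hz=48_000):
    """Coefficients of a second-order section with poles at radius * exp(+/- j*theta)."""
    theta = 2.0 * math.pi * freq_hz / sample_rate_hz
    return [2.0 * radius * math.cos(theta), -radius * radius]


# Subtractive synthesis in miniature: a noise-like excitation shaped by one
# resonance at 1 kHz (hypothetical values chosen only for illustration).
rng = random.Random(0)
excitation = [rng.uniform(-1.0, 1.0) for _ in range(4800)]
shaped = recursive_filter(excitation, resonator_coeffs(1000.0, 0.99))
```

Replacing the noise with a periodic pulse train would give the harmonic case described above, with the same filter supplying the spectral envelope.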

This technique is used widely for speech modelling where between 3 and 7 formants

are required to adequately represent the sound, and so provides a considerable degree of data reduction. Attempts at modelling drum sounds suggest that approximately 100

are necessary [sand89]. This technique offers the potential for modification of the

individual resonances or the excitation so as to transform the sound in an intuitive

way. There are difficulties, however, associated with the numerical manipulation and

implementation of the high order filters required [sand92].

A much simplified synthesis-only derivative of the recursive filter model, known

as the Karplus-Strong algorithm, has been found to generate certain sounds very

effectively. These include plucked string, drum and electric guitar timbres [karp83, jaff83, sull90]. The simplification is in having high order filter models, but with all

the coefficients set to zero except the higher index ones. Variants include the insertion

of other elements, for example randomly controlled switches and nonlinearities, in the

feedback path. It is therefore equivalently described as a delay-line with feedback via

some kind of modifier. Both these views are shown in Figure 2.4. Typically, the sound

is generated by inputting a burst of noise, or a simple periodic waveform to the delay

line.
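
The delay-line view can be sketched in a few lines. This is a minimal sketch of the basic plucked-string form, assuming the commonly described modifier (a two-point average of adjacent delayed samples) and a noise burst as the initial contents of the delay line; the function name and default values are hypothetical.

```python
import random


def karplus_strong(freq_hz, duration_s, sample_rate_hz=48_000, seed=0):
    """Delay line with feedback through a two-point averaging modifier."""
    rng = random.Random(seed)
    delay = max(2, int(round(sample_rate_hz / freq_hz)))  # D samples
    # The noise burst is loaded as the initial contents of the delay line.
    line = [rng.uniform(-1.0, 1.0) for _ in range(delay)]
    n_samples = int(round(duration_s * sample_rate_hz))
    out = []
    prev = 0.0  # holds y[n - D - 1]
    for n in range(n_samples):
        idx = n % delay
        delayed = line[idx]          # y[n - D]
        y = 0.5 * (delayed + prev)   # modifier: average of y[n-D] and y[n-D-1]
        prev = delayed
        line[idx] = y                # feed the result back into the delay line
        out.append(y)
    return out


pluck = karplus_strong(440.0, 0.05)
```

In terms of the recursive filter view, this corresponds to a filter of order D + 1 in which only the two highest-index coefficients are non-zero (both equal to 0.5 under the assumption made here).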

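[Figure 2.4 diagram. Top: the input is summed (+) with the coefficient-weighted taps of a chain of unit delays (z-1) to give the output. Bottom: the input feeds a delay of D samples, with feedback through a modifier, to give the output.]

Figure 2.4 Karplus-Strong algorithm. Top, simplified recursive linear filter and

bottom, general delay-line view.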


Finally, a technique for combining both additive and subtractive synthesis has also

been proposed [serr90].

2.6.3. Frequency Modulation and Waveshaping

Frequency modulation (FM) and waveshaping are related synthesis-only

techniques that allow the generation of sounds with complex line spectra using simple

models [chow73 and lebr79]. A basic unit of each technique is shown in Figure 2.5.

The units are then combined either by adding several outputs together, or by nesting them so

that the output of one forms the input to another. The parameters input to the

model are accessed directly by the user, and/or controlled by simple functions to

generate time-varying sounds.

To their advantage, the sounds produced by these models are often approximate

replicas of musical ones. Both harmonic and inharmonic sounds may be simulated that

are like those generated from string or wind, and percussive instruments, respectively.

It is also possible to generate a wide range of abstract sounds. The relatively small

number of parameters involved allows for easy experimentation by the user and the

simplicity of the models enables them to be easily implemented.
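
The two basic units can be sketched as follows. This is an illustrative sketch under stated assumptions: the FM unit uses the simple form in which a sinusoidal modulator is added to the phase of the carrier, and the waveshaping example uses a Chebyshev polynomial as the nonlinear function, a standard choice because T_k(cos θ) = cos(kθ). The function names and parameter values are hypothetical.

```python
import math


def fm_unit(carrier_hz, mod_hz, index, amp, duration_s, sample_rate_hz=48_000):
    """Simple FM unit: amp * sin(2*pi*fc*t + index * sin(2*pi*fm*t))."""
    n_samples = int(round(duration_s * sample_rate_hz))
    out = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        modulator = index * math.sin(2.0 * math.pi * mod_hz * t)
        out.append(amp * math.sin(2.0 * math.pi * carrier_hz * t + modulator))
    return out


def waveshape(x, f):
    """Waveshaping unit: pass each input sample through the nonlinear function f."""
    return [f(v) for v in x]


def chebyshev_t3(v):
    """Third Chebyshev polynomial; maps cos(theta) to cos(3*theta)."""
    return 4.0 * v ** 3 - 3.0 * v


# A harmonic FM tone (carrier and modulator in a 1:1 ratio, assumed values).
tone = fm_unit(carrier_hz=220.0, mod_hz=220.0, index=2.0, amp=0.5, duration_s=0.05)

# A cosine at 220 Hz shaped into its third harmonic.
cosine = [math.cos(2.0 * math.pi * 220.0 * n / 48_000) for n in range(2400)]
shaped = waveshape(cosine, chebyshev_t3)
```

In both units a small number of parameters (the modulation index, or the shape of the nonlinear function) controls the richness of the resulting line spectrum.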

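[Figure 2.5 diagram. Top (FM): an oscillator set by the modulation frequency and intensity is added (+) to the carrier frequency to drive a second oscillator, whose amplitude input sets the output amplitude. Bottom (waveshaping): an input x(t) is passed through a nonlinear function f to give the output f[x(t)].]

Figure 2.5 The basic units used within the FM (top) and waveshaping (bottom) synthesis techniques.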

The disadvantages of these models are that no analysis methods exist that can

produce a set of parameters from a given sound and that, as with physical modelling,

the sounds can lack certain 'natural' qualities [moor90].


2.7. Summary

This chapter has developed the concept of a model for sound with which to work.

The principal idea is that of representation. There are many levels on which a

representation for sound can take place, from the physical to the perceptual. Also, several representations may be used together. An example is the chain of

representations that exists within the additive synthesis model: physical system;

pressure fluctuations at microphone; time waveform; digital audio time series; time-

varying spectrum; set of variable amplitude and frequency sinewaves; set of piece-

wise linear functions.

From a consideration of the types of creative applications where such a model

might be used, a functional description has been advanced. Central to this description

is the idea of a parameterised representation, where the parameters consist of less data than the modelled sound and are of a form that facilitates manipulation of the sound in

useful ways.

Finally, several well known models fitting this description have been reviewed. These

models are primarily for music and speech sounds and, consequently, focus on

representing those elements that characterise such sounds, both physically and

perceptually, for example spectral lines and formant envelopes. The models, therefore,

concentrate mainly on the top two categories of Table 2.1. No models fitting the

description given in this chapter have been found in the literature for sounds that are outside these categories.


Chapter 3

Chaos Theory and Fractal Geometry

3.1. Introduction

This chapter presents an overview of chaos theory and fractal geometry. The

intention is to present a theoretical basis for the forthcoming chapters. Theory relevant

to each experimental chapter is then presented in that chapter. The emphasis is

therefore on the following subjects: the significance of chaos and fractals; strange

attractors; Iterated Function Systems; and several other relevant ideas and tools. The

chapter may be read in its entirety as a concise introduction to chaos and fractals, or

referred to as and when needed during later chapters. Sources for the general theory of

chaos and fractals include [stew92, farm90, glei87, laut88, deva89, schr91, peit88,

moon87, hao84, mand83, barn88].

Chaos theory is about a new understanding of dynamics, the way in which systems

behave through time. It concerns the realisation that deterministic systems which obey

fixed laws, can exhibit unpredictable behaviour. This runs contrary to the established

viewpoint, dating back to Newton, that the behaviour of deterministic systems can be

predicted for all future time. Also, chaotic behaviour, characterised by being irregular and complex, may be found in very simple systems. This, again, apparently

contradicts the traditional scientific expectation that complex behaviour arises only in

complex systems.
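
Both points can be illustrated with a very small example. The sketch below uses the logistic map, x[n+1] = r x[n] (1 - x[n]) with r = 4, which is a standard textbook example chosen here purely for illustration; it is an assumption of this sketch rather than a system discussed at this point in the text.

```python
def logistic_orbit(x0, n_steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x) starting from x0."""
    orbit = [x0]
    for _ in range(n_steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


# Two orbits of the same deterministic rule, started a tiny distance apart.
a = logistic_orbit(0.3, 100)
b = logistic_orbit(0.3 + 1e-10, 100)
separation = [abs(p - q) for p, q in zip(a, b)]

print("initial separation:", separation[0])
print("largest separation over 100 steps:", max(separation))
```

The rule is a single line of arithmetic and entirely deterministic, yet the two orbits soon become unrelated, so any uncertainty in the initial state limits how far ahead the behaviour can be predicted.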

The theory of fractals, however, provides a new understanding of geometry. It is

based on a realisation that there exists a large class of geometric objects not

encompassed by the traditional Euclidean geometry of points, lines and circles, or the

forms of differential calculus, for example smooth curves. Fractal objects have

properties unlike those of their traditional counterparts because of the way they fill

space. For example, they typically have dimensions which are not integers and curves

with infinite length can be contained within a finite volume. Many fractals have the

same form when viewed on different scales, a property known as self-similarity. As

with chaos, it is also possible to construct complex fractal forms using only simple rules.
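
These properties can be made concrete with a short calculation. The sketch below uses the Koch curve, a standard example chosen here for illustration and not taken from the text: each construction step replaces every line segment with four segments one third as long, so the length grows without bound while the curve stays within a bounded region, and the similarity dimension is log 4 / log 3.

```python
import math


def koch_length(n_steps, base_length=1.0):
    """Length of the Koch curve after n_steps: each step multiplies it by 4/3."""
    return base_length * (4.0 / 3.0) ** n_steps


def similarity_dimension(copies, scale_factor):
    """Dimension of a self-similar set made of `copies` parts, each scaled by 1/scale_factor."""
    return math.log(copies) / math.log(scale_factor)


for n in (0, 1, 10, 50):
    print(n, koch_length(n))

# Four copies, each one third of the size: a dimension between 1 and 2.
print(similarity_dimension(4, 3))
```

The construction rule is a single substitution applied repeatedly, which is the sense in which a complex fractal form follows from a simple rule.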

Of greatest importance, perhaps, is that both chaos and fractals can accurately

represent naturally occurring phenomena. Advances in abstract theory have been

paralleled with discoveries of real-world phenomena which confirm the relevance and

usefulness of chaos and fractals. A selection of the subjects in which this has taken

place are: architecture, art, astrophysics, biology, chemistry, communications,


computing, data compression, economics, electronics, fluid dynamics, geology,

geophysics, linguistics, meteorology, music, physics, signal processing. See [glei87,

pick90, schr91, peit88, cril91, stew90 and moon87] and references therein.

3.2. The Significance of Chaos

Chaos theory concerns the dynamic behaviour of simple nonlinear systems.

Traditionally, the problem of dynamics has been approached in two different ways -

deterministic dynamics and stochastic processes. The deterministic approach assumes

that fixed laws govern the behaviour of a system. These laws may be written down

with linear differential equations, a solution found, and so the behaviour of the system

is known for all time. Such an approach applies to systems with a few degrees of

freedom and where linear relationships, or approximations, exist between the component parts. The advantage to this approach is that the resulting solution gives

complete, predictive knowledge about the behaviour of the system. The main

disadvantage, however, lies also with the solution - it is not always possible to find

one. Analytic techniques do not provide a universal means of solution to systems of

differential equations, especially if they contain nonlinearities.

The alternative, stochastic, approach makes the assumption that the system under

investigation is too complex to be able to describe explicitly with fixed laws. This is

either because there are too many degrees of freedom, or it is not possible to measure

all the relevant aspects of the system. In this case, a partial description of the system

may be given using probability. That is, the degree of uncertainty about a system's

present state, or future behaviour may be quantified. Instead of describing the dynamic

behaviour of every degree of freedom with an explicit solution, only the likelihoods of

expected behaviour are known. These correspond to the average or typical behaviour

found by empirically accumulating information about the system. This is also a

powerful approach as, for example in thermodynamics, the average properties of

particles in a gas provides a useful desc
