Niemitalo_DSP.For.The.Braindead


AUDIO DSP FOR THE BRAINDEAD

INTERNAL DEVELOPMENT VERSION 2000.4.30

Bleeding-edge version at: http://www.student.oulu.fi/~oniemita/DSP/INDEX.HTM

I, as the author and copyright holder, allow you to do anything you wish with this book free of charge, including copying, printing and republishing. In return, you must preserve this notification and the book’s website URL on the title page.

Olli Niemitalo


Contents

About this book

1 Sampling basics
  1.1 What is sound?
  1.2 From air pressure to analog
  1.3 From analog to digital
  1.4 Quantization error
  1.5 Frequencies
  1.6 Angular frequency
  1.7 Frequency range and aliasing
  1.8 Nyquist, we have a problem!

2 Sinusoids
  2.1 Amplitude
  2.2 Phase
  2.3 Addition

3 Processing
  3.1 Mathematical model of sampling
  3.2 Discrete processing

Collection of filter formulae
  1 IIR filters
    1.1 Fastest and simplest “lowpass” ever!
  2 FIR filters

Internet references

Symbol chart


About this book

The purpose of the book is to be a tutorial for people who want to learn audio digital signal processing, but find the academic books too cryptic and impractical. Softsynth and audio software makers, game programmers, computer musicians etc. could fall into this league. You must know how to program and have some basic math knowledge, that’s all.

Printing should be done on both sides of the paper, preferably with a color printer. If your printer is not capable of printing on both sides, first print the odd pages on one side of the paper, re-insert the papers (check you got the right order and position), and print the even pages on the blank sides. Finally, check that there is no lonely odd page left in the paper tray. If you can’t trust your printer, do the printing chapter by chapter.

Don’t forget to check out the book’s website for the latest version!

I started this project because my older, similar text (DSPSTUFF.TXT) began to seem a bit naive, and I wanted to rewrite the whole thing. ASCII art is not that accurate, so I chose to use graphics, specifically vector graphics. The quest for the right software led me to LaTeX (MiKTeX), Adobe products (Illustrator) and GnuPlot.

My motivation is sharing knowledge, and probably a tiny bit of that built-in desire for 15 minutes of fame. For me, this book also works as an answer to all those “How was it again...?” questions that hit me every now and then.

I’d like to thank Timo Tossavainen for teaching me stuff, and my big brother Kalle Niemitalo for helping with the math and writing. Thanks to all the people I’ve got feedback from. The coolest thing yet is that I have received free software, documents and even job offers in return for my work! :-) Please don’t stop! It’s great hearing this is of use. Also, I’d like to know if you have found errors or have suggestions or questions - updating is easy, as this is published electronically.

Olli Niemitalo a.k.a. Yehar / Sublevel 3
Student in information technology at the University of Oulu

http://www.student.oulu.fi/~oniemita (English homepage)
http://www.sublevel3.org (Music from our label)

[email protected]

1. Sampling basics

1.1 What is sound?

Sound is pressure changes traveling in the air, or in some other medium like water. It can be caused by vibrating objects like guitar strings stirring the air, or by air turbulence. A nuclear explosion also makes a loud bang.

An increase in air pressure practically means an increased number of air molecules in a volume. Low pressure means a lack of air molecules. Whenever there’s a thinner (local low pressure) spot in the air, surrounding air molecules are pushed there to fill it, but as they move, they create another thin layer which is again filled by surrounding molecules. And the sound travels. In air, at about 330 m/s.

Hey, it isn’t really that simple, but you don’t necessarily need to know more! It’s the huge number of molecules that turns it all into statistics...

1.2 From air pressure to analog

A microphone converts the instantaneous pressure levels into instantaneous electric voltage levels. If one farts into the microphone – don’t mind me not telling how the sound is constructed – the voltage(time) plot could look (did look) something like this:

[Figure: Analog, continuous fart sound. Horizontal axis: time; vertical axis: voltage.]

This voltage(time) signal is called an analog signal, because the voltage is analogous to air pressure. In this form, the sound can be recorded, for example mechanically on a vinyl disk or magnetically on a tape, or, after amplification (the voltage is scaled by multiplication), sent to speakers that convert the voltage changes back to pressure changes, sound.

1.3 From analog to digital

The computer’s memory cannot store an infinite amount of data. The memory is not continuous like the curve on a vinyl disk. Instead it is divided into a finite number of memory slots, bits, and they have only two states, 0 and 1 – black or white, no greys, one could say – this is called digital.

Therefore it is not possible to save the original sound in digital form in all its detail. Luckily (in this context!) our hearing is limited and we cannot hear very quiet sounds or very high frequencies, so the amount of information needed to store an accurate-sounding representation of a sound is finite, and can be easily reached using today’s equipment.

How it is done is called sampling. Here’s a sampled version of the fart sound:

[Figure: Digital, discrete fart sound. Horizontal axis: time; vertical axis: amplitude.]

The vertical axis is now titled amplitude, since we are no longer dealing with a real quantity like voltage. Sometimes people say things like: “The amplitude of this signal is 5 volts”. In that case, they are talking of the total height of the waveform from the zero level to the top. Here, instead, we mean instantaneous values. Just try to grasp the concepts, and you’ll be all right with the twisted terminology. You may even become friends! (Hope not too good ones)

The sampled sound is not a continuous curve. Instead it is a set of peaks of different amplitudes, spread in time at equal intervals (meaning the time between adjacent peaks is constant, the same everywhere). This kind of a signal, where time is quantized, is called a discrete signal. The amplitudes of the peaks are taken from the instantaneous voltage levels of the original sound. Hence the name sampling. Another name for a single peak is samplepoint, or sample for short. Samplepoint is preferred, since sample could mean a longer piece of sound too.
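Here is a minimal Python sketch of the sampling idea, assuming the continuous sound is available as a function of time. The names `sample`, `signal` and `sample_rate_hz` are made up for this illustration and do not come from the book.

```python
import math

def sample(signal, sample_rate_hz, count):
    """Read `count` instantaneous values of signal(t) at equal time intervals."""
    interval = 1.0 / sample_rate_hz          # time between adjacent samplepoints
    return [signal(n * interval) for n in range(count)]

# Hypothetical "analog" input: a 1 Hz cosine, sampled 4 times per second.
samplepoints = sample(lambda t: math.cos(2 * math.pi * t), 4, 4)
```

Each list element is one samplepoint; the time axis has turned into a plain list index.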

To limit the amount of memory required to save the amplitude of a single samplepoint, amplitude is also quantized, meaning it can only have values that are multiples of a constant. The relation of quantized amplitude to unquantized amplitude is a staircase function, from which the closest step is always taken:

[Figure: Amplitude quantization. A staircase curve relating unquantized amplitude to quantized amplitude, with the quantization step marked.]
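The staircase can be sketched in a couple of lines of Python. This is only an illustration under the assumption that "closest step" means rounding to the nearest multiple of the step; `quantize` and the example step size 0.25 are made-up names and values.

```python
def quantize(amplitude, step):
    """Snap an amplitude to the nearest multiple of the quantization step."""
    # Note: Python's round() sends exact halfway cases to the even neighbour.
    return step * round(amplitude / step)

values = [0.12, 0.49, -0.26, 0.75]
quantized = [quantize(v, 0.25) for v in values]
```

With a step of 0.25, every output is one of ..., -0.25, 0.0, 0.25, 0.5, ... and never anything in between.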

At this point, some would say we have a PCM (pulse code modulation) signal.

In the computer’s memory, the amplitudes of the samplepoints are saved in an array. The most used data format is stereo 16-bit at 44100Hz, which gives you a precision of $2^{16} = 65536$ different amplitude levels on two separate channels (left and right). 44100Hz is the frequency of the peaks, the samplepoints. That means 44100 samplepoints are recorded during a single second. This frequency is called the sampling frequency, sampling rate or samplerate.

Other commonly used bit depths are 8-bit and 24-bit, and other common sampling frequencies are 22050Hz, 48000Hz, 88200Hz and 96000Hz. Floating point formats are also possible.
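To get a feel for these numbers, here is a small Python calculation for the stereo 16-bit 44100Hz format. The variable names are invented for this sketch, and the storage figure assumes plain uncompressed samples with no file headers.

```python
bits_per_sample = 16
channels = 2
sample_rate_hz = 44100

# Number of distinct amplitude levels one samplepoint can take.
amplitude_levels = 2 ** bits_per_sample

# Raw storage needed for one second of sound, in bytes.
bytes_per_second = sample_rate_hz * channels * bits_per_sample // 8
```

That is 65536 levels per channel and 176400 bytes for every second of sound.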

In the rest of this book, continuous signals and their sampled representations are written in two distinct notations.

1.4 Quantization error

It would seem logical that the better the amplitude precision, the better the quality. Right! Adding one more bit to the bit depth doubles the number of available amplitude levels, and halves the quantization error. Quantization error is the unwanted addition to your signal due to quantization, and it can be calculated through:

Quantization error = Quantized signal − Original signal

This is a common procedure. To extract the error from a spoiled signal for closer investigation, you subtract the original from it.

Let’s try the formula visually and see what we get from quantizing a sinusoid, one of the most basic waveforms:

[Figure: Extraction of quantization error. Original is subtracted from quantized. Panels: Quantized − Original = Error.]

We can promptly see that the amplitude of the quantization error is strictly limited to a range. The error never reaches further from zero than half the quantization step.
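The half-step limit can be checked numerically. The following Python sketch quantizes one cycle of a full-scale cosine and records the largest error; it assumes rounding to the nearest step and an amplitude range of -1..+1 split into 2^bits steps, and the names `quantize` and `worst_error` are made up for the example.

```python
import math

def quantize(amplitude, step):
    """Snap an amplitude to the nearest multiple of the quantization step."""
    return step * round(amplitude / step)

def worst_error(bits, n_points=1000):
    """Return (largest |quantized - original|, step) for one cosine cycle."""
    step = 2.0 / 2 ** bits                    # assumed layout: range -1..+1
    worst = 0.0
    for n in range(n_points):
        original = math.cos(2 * math.pi * n / n_points)
        error = quantize(original, step) - original   # the formula from the text
        worst = max(worst, abs(error))
    return worst, step

worst_8, step_8 = worst_error(8)
worst_9, step_9 = worst_error(9)
```

Going from 8 to 9 bits halves the step, and with it the limit that the error has to stay under.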

1.5 Frequencies

By frequencies, we mean sinusoids, such as $\sin(\text{time})$, present in sound. We’ll use the following abbreviations:

$f_s$ = Sampling frequency
$f$ = Frequency of the sinusoid
$t$ = Time
$\alpha_0$ = Initial phase, “alpha 0”
$A$ = Amplitude

Sinusoidal frequencies are of the general form:

$$\text{amplitude} \cdot \cos(2\pi \cdot \text{frequency} \cdot \text{time} + \text{phase})$$

And the same using the abbreviations:

$$A \cos(2\pi f t + \alpha_0)$$

Amplitude defines the height of the sinusoid, measured from the zero level to the top. Initial phase defines the phase of the sinusoid at $t = 0$, a cosine being the result from $\alpha_0 = 0$. Commonly, the time unit is seconds, the frequency unit is Hz and no unit is used with the amplitude.
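As a quick sketch, the general form translates directly into Python. The function name `sinusoid` and the argument names are invented for this example; time is in seconds and frequency in Hz, as in the text.

```python
import math

def sinusoid(amplitude, frequency_hz, time_s, initial_phase=0.0):
    """Instantaneous value of amplitude * cos(2*pi*frequency*time + phase)."""
    return amplitude * math.cos(2 * math.pi * frequency_hz * time_s + initial_phase)

at_start = sinusoid(1.0, 440.0, 0.0)                   # initial phase 0: a cosine
as_sine = sinusoid(1.0, 440.0, 0.0, -math.pi / 2)      # phase -pi/2: starts like a sine
```

With initial phase 0 the value at time 0 is the full amplitude; shifting the phase by -pi/2 makes the same formula start from zero, like a sine.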

1.6 Angular frequency

A more convenient way to express the frequency in a discrete signal is angular frequency, which we denote by $\omega$ (omega, a Greek letter looking quite like double-u).


We also introduce a new letter to express discrete time, $T$ (upper case!). These new variables are related to $f$ and $t$ by:

$$\omega = \frac{2\pi f}{f_s}, \qquad T = f_s t$$

There are no units for these quantities, and the general sinusoid formula is simplified into:

$$A \cos(\omega T + \alpha_0)$$

The convenience gained is not only the simplified formula, but also that now $T$ can be used as the samplepoint number and the frequency is expressed as parts of the sampling frequency, kicking the sampling frequency out of the calculations. For example, if we are assigned to create a sampled sinusoid of some angular frequency, we don’t need to know the sampling frequency to be able to start typing in the samplepoint values.
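The point about not needing the sampling frequency shows up nicely in code. In this Python sketch (the name `sampled_sinusoid` and its arguments are made up for the illustration), the discrete time is simply the loop counter and no sampling frequency appears anywhere.

```python
import math

def sampled_sinusoid(amplitude, omega, initial_phase, count):
    """Samplepoints of amplitude * cos(omega * T + phase) for T = 0, 1, 2, ..."""
    return [amplitude * math.cos(omega * T + initial_phase) for T in range(count)]

# A quarter of the sampling frequency, whatever the sampling frequency is.
samples = sampled_sinusoid(1.0, math.pi / 2, 0.0, 8)
```

The result repeats the pattern 1, 0, -1, 0 (up to floating point rounding), no matter whether the data is later played back at 22050Hz or 96000Hz.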

Here are some possibilities for $\omega$ and the corresponding real frequencies:

$\omega = 0 \Leftrightarrow f = 0$
$\omega = \pi/2 \Leftrightarrow f = f_s/4$
$\omega = \pi \Leftrightarrow f = f_s/2$
$\omega = 2\pi \Leftrightarrow f = f_s$
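Converting an angular frequency back to Hz is just the defining relation solved for the frequency. A small Python sketch, where `omega_to_hz` is a made-up helper name and 44100Hz is picked only as an example sampling frequency:

```python
import math

def omega_to_hz(omega, sample_rate_hz):
    """Real frequency in Hz for an angular frequency in radians per samplepoint."""
    return omega * sample_rate_hz / (2 * math.pi)

sample_rate_hz = 44100
real_freqs = [omega_to_hz(w, sample_rate_hz)
              for w in (0.0, math.pi / 2, math.pi, 2 * math.pi)]
```

At 44100Hz the four angular frequencies come out as roughly 0, 11025, 22050 and 44100 Hz.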

Let’s visualize a sinusoid (cosine) with the following constants:

Amplitude = $A = 1$
Angular frequency = $\omega = \pi/2$
Initial phase = $\alpha_0 = 0$

[Figure: $\cos(\frac{\pi}{2} T)$. Example sinusoid, $A = 1$, $\omega = \pi/2$, $\alpha_0 = 0$. Horizontal axis: time (sample number, 0 to 8); vertical axis: amplitude (−1.0 to +1.0).]

The markers are the samplepoints. Since the angular frequency is $\pi/2$, a quarter of the sampling frequency, the sinusoid goes a full cycle every 4 samples – $\pi/2$ is the fourth of a full circle ($2\pi$). Starting to understand angular frequency? You can also consider $\omega$ as a phase increase that is added to the phase of the sampled sinusoid at every sampling step.

A good visualization aid is a marker going counterclockwise around a unit circle (radius 1, centered at the origin). The circumference (the length of the circle straightened) of a unit circle is $2\pi$. At every sampling step the arc traveled by the marker is of length $\omega$, and so is the angle the marker rotates around the origin. If we are creating a sampled cosine wave, taking samples of the horizontal coordinate of the marker and starting from coordinates (1,0) at time 0 does the job:

[Figure: The example sinusoid with unit circle illustration, $A = 1$, $\omega = \pi/2$, $\alpha_0 = 0$. Horizontal axis: time (sample number, 0 to 8); vertical axis: amplitude (−1.0 to +1.0).]
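The rotating marker can be written down almost literally by using a complex number as the marker position. This Python sketch is one possible way to do it, not necessarily how the author would; the name `cosine_by_rotation` is invented here.

```python
import cmath
import math

def cosine_by_rotation(omega, count):
    """Sampled cosine read off a marker rotating around the unit circle."""
    marker = complex(1.0, 0.0)            # start at coordinates (1, 0)
    turn = cmath.exp(1j * omega)          # counterclockwise rotation by omega radians
    samples = []
    for _ in range(count):
        samples.append(marker.real)       # horizontal coordinate of the marker
        marker *= turn                    # advance one sampling step
    return samples

rotated = cosine_by_rotation(math.pi / 2, 8)
direct = [math.cos(math.pi / 2 * T) for T in range(8)]
```

Both lists agree up to rounding. The rotation version needs only one complex multiplication per samplepoint, although rounding errors slowly pile up in a very long run.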

1.7 Frequency range and aliasing

Increasing the sampling frequency allows representing higher frequencies in the sampled sound. This is specified by the Nyquist criterion: “A sampled representation of a signal is exact if the highest frequency in the signal is less than half the sampling frequency”, or a version even further tailored for our purposes: “Only frequencies smaller than half the sampling frequency can be represented.”

So if you use 44100Hz (CD quality) as sampling frequency, the frequencies you can have must stay below 22050Hz, consequently called the Nyquist frequency, $f_N$. Expressed in terms of angular frequency, the Nyquist frequency is always $\pi$, since it is half the sampling frequency, $2\pi$.

A cheesy proof follows. You need to store at least two samplepoints per wave cycle, the top and the bottom, to be able to represent a sinusoid:

[Figure: $\cos(\pi T)$. Nyquist frequency sinusoid, $\omega = \pi$, the maximum freq! Horizontal axis: time (sample number, 0 to 8); vertical axis: amplitude (−1.0 to +1.0).]

If we have a higher frequency than this Nyquist frequency, and use the same sampling freq, shit will happen. Here we have a frequency that is 4/3 of the Nyquist frequency:

[Figure: $\cos(\frac{4}{3}\pi T)$. Example sinusoid, $\omega = \frac{4}{3}\pi$, too high a freq! Horizontal axis: time (sample number, 0 to 8); vertical axis: amplitude (−1.0 to +1.0).]

Now we wipe out the continuous waveform and store only the discrete samples:

[Figure: Discrete samples of the example sinusoid. Horizontal axis: time (sample number, 0 to 8); vertical axis: amplitude (−1.0 to +1.0).]

Based on this data, it is impossible to retrieve the original sinusoid, because there’s another, lower frequency sinusoid that has the same discrete representation, and at resynthesis it is brought up instead of the original higher-than-Nyquist frequency sinusoid. Here you see it happen:

[Figure: $\cos(\frac{2}{3}\pi T)$. Reconstructed, aliased sinusoid, $\omega = \frac{2}{3}\pi$. (Notice the identical positions of the samplepoint markers.) Horizontal axis: time (sample number, 0 to 8); vertical axis: amplitude (−1.0 to +1.0).]

This process of higher-than-Nyquist frequencies transforming into lower frequencies is called aliasing. Reconstruction only produces Nyquist-range frequencies, and due to sampling, the out-of-range frequencies cannot be discriminated from the corresponding in-range frequencies having identical discrete representations. You should note that sinusoids stay as sinusoids through aliasing, even though changes in frequency take place.
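The identical discrete representations are easy to verify. This Python sketch samples the too-high sinusoid and its lower-frequency alias from the example at the same integer sample numbers; the variable names are made up for the illustration.

```python
import math

sample_numbers = range(9)

# Above Nyquist: angular frequency 4/3 * pi.
too_high = [math.cos(4 / 3 * math.pi * T) for T in sample_numbers]

# Within the Nyquist range: angular frequency 2/3 * pi.
alias = [math.cos(2 / 3 * math.pi * T) for T in sample_numbers]

largest_difference = max(abs(a - b) for a, b in zip(too_high, alias))
```

The two lists differ only by floating point rounding, so once the continuous waveform is gone there is no way to tell which sinusoid was sampled.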

A graph will give a general rule to aliasing. On the horizontal axis we have the

original unaliased frequency, and on the vertical the (possibly) aliased one:

[Figure: Dependence of aliased frequency on unaliased frequency. Horizontal axis: unaliased frequency (0Hz, 0.5 $f_s$, 1.0 $f_s$, 1.5 $f_s$, 2.0 $f_s$, 2.5 $f_s$); vertical axis: aliased frequency (0Hz to 0.5 $f_s$).]

Now this explains why it is called aliasing! First, as we increase the frequency above $f_N$, it bounces off and aliases over the already used range, and when increased more, it bounces off the 0Hz. And so on, infinitely.
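The bouncing back and forth can be captured in a tiny Python function. This is a sketch of the zigzag described above, assuming non-negative frequencies; the name `aliased_frequency` is invented for the example.

```python
def aliased_frequency(frequency, sample_rate):
    """Frequency that a sampled sinusoid appears at, folded into 0..sample_rate/2."""
    folded = frequency % sample_rate              # the pattern repeats every sample_rate
    return min(folded, sample_rate - folded)      # bounce off the Nyquist frequency

in_range = aliased_frequency(10000, 44100)            # below Nyquist: unchanged
bounced = aliased_frequency(29400, 44100)             # 2/3 of the sampling frequency
wrapped = aliased_frequency(44100 + 10000, 44100)     # one full sampling frequency higher
```

A frequency at 2/3 of the sampling frequency lands on 1/3 of it, and adding a whole sampling frequency to any input changes nothing in the result.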

The amplitude of a sinusoid is preserved in aliasing. What happens to the phase is usually unimportant, and will not be discussed here.

In our example, the unaliased frequency is in the range $0.5 f_s$ – $1.0 f_s$, so by reading from the graph, we can write a mathematical equation for the aliasing relation (applicable in this specific frequency range only):

Aliased frequency = Sampling frequency − Unaliased frequency

A quick review on the symbols:

$f$ = Unaliased, original frequency
$f'$ = Aliased frequency
$f_s$ = Sampling frequency
$f_N$ = Nyquist critical frequency = $f_s / 2$


We just declared: $f' = f_s - f$

And we had $\frac{4}{3} f_N$ as the unaliased frequency $f$, so in our example:

$$f' = f_s - \frac{4}{3} f_N = f_s - \frac{2}{3} f_s = \frac{1}{3} f_s$$

... Meaning that the aliased frequency is a third of the sampling frequency, so it goes one full cycle every three samplepoints, as you correctly see happening a couple of figures back.

Aliasing is one of the most annoying artifacts of sampling. The first situation where it is encountered is the analog → digital conversion. If there are frequencies above $f_N$ in the analog signal, they will be aliased into unpleasant distortion (that’s what any unwanted frequencies are called) in the digital signal. This is why there has to be a hardware analog filter to remove above-Nyquist frequencies before the ADC (Analog to Digital Converter).

Some non-audio designs use the property of aliasing in transforming higher frequencies to the Nyquist range, where they can be analyzed digitally.

1.8 Nyquist, we have a problem!

Perfectly accurate reconstruction is theoretically possible if the Nyquist criterion is satisfied. Still, it is very common to hear claims, mostly from audiophiles, that having double the highest existing frequency as sampling frequency is not enough. They mostly go like this: “If you sample a sine that is of half the sampling frequency, you sample the zero crossings. That gives you nothing but silence!” Yes, that’s true, as you can see for yourself:

[Figure: $\sin(\pi T)$, Silence. Sampled $\sin(\pi T)$ equals silence! Horizontal axis: time (sample number, 0 to 8); vertical axis: amplitude (−1.0 to +1.0).]

The Nyquist frequency could be thought of as a special case, where the phase information of the sinusoid is lost, as it is always reconstructed as $A \cos(\pi T)$.

Here’s an example showing how the phase disappears:


[Figure: An original Nyquist frequency sinusoid with a nonzero initial phase, and the cosine reconstructed from its samples. The phase information of a Nyquist frequency sinusoid is lost in sampling along with some of its amplitude. Horizontal axis: time (sample number, 0 to 8); vertical axis: amplitude (−1.0 to +1.0).]

The amplitude of the reconstructed cosine is not that of the original sinusoid. It is, as can easily be read from the figure, the same as the value of the original sinusoid at $T = 0$. In short, Nyquist frequency sinusoids are attenuated or even muted depending on their initial phases.
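Here is a short Python check of that claim, as a sketch with made-up names. It samples a Nyquist frequency sinusoid with some initial phase and compares the samplepoints to a plain cosine scaled by the value the original had at sample number 0.

```python
import math

def nyquist_samples(initial_phase, count=8):
    """Samplepoints of cos(pi * T + phase), a unit-amplitude Nyquist frequency sinusoid."""
    return [math.cos(math.pi * T + initial_phase) for T in range(count)]

phase = math.pi / 4
samples = nyquist_samples(phase)

# What the samples look like: a cosine with a smaller amplitude and no phase.
value_at_zero = math.cos(phase)
scaled_cosine = [value_at_zero * math.cos(math.pi * T) for T in range(8)]

# The audiophile's case: a sine (phase -pi/2) sampled at its zero crossings.
silence = nyquist_samples(-math.pi / 2)
```

The samples alternate between plus and minus cos(phase), so the original amplitude 1 and the phase cannot be told apart from a smaller cosine, and with a phase of -pi/2 everything collapses to (nearly) zero.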

The good thing is that this special problem is limited to the Nyquist frequency only. A frequency a tiny bit less does not have the problem. Therefore, the audiophile’s intuitive argument loses its point: sampling at 40001Hz is enough for representing any 20000Hz sinusoid.
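A rough numerical hint of why, in Python. The sketch below samples a 20000Hz sine (the worst-case phase from the audiophile argument) at 40001Hz for one second. It only shows that the samplepoints no longer all sit on zero crossings; it does not perform an actual reconstruction.

```python
import math

sample_rate_hz = 40001
frequency_hz = 20000

# One second of samplepoints of a unit-amplitude sine.
samples = [math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
           for n in range(sample_rate_hz)]

peak = max(abs(s) for s in samples)
```

The first samples are still close to zero, but they drift along the waveform from one samplepoint to the next, and within the second they reach practically the full amplitude.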

Perfect reconstruction is an extremely heavy process, and practical reconstructors are far from perfect. Still, there is no similar phase-selective attenuation below the Nyquist frequency. Other kinds of problems, mostly aliasing-related, exist.