
AES Journal of the Audio Engineering Society (ISSN 0004-7554), Volume 50, Number 11, 2002 November. Published monthly, except January/February and July/August.


JOURNAL OF THE AUDIO ENGINEERING SOCIETY
AUDIO / ACOUSTICS / APPLICATIONS
Volume 50 Number 11 2002 November

In this issue…

Noisy Decay Parameters

Improved Time-Frequency Analysis

Expanding Surround Sound

Synthesizing Surround Sound

Features…

113th Convention Report, Los Angeles

Bluetooth and Wireless Networking

11th Regional Convention, Tokyo—Call for Papers

The Audio Engineering Society recognizes with gratitude the financial support given by its sustaining members, which enables the work of the Society to be extended. Addresses and brief descriptions of the business activities of the sustaining members appear in the October issue of the Journal.

The Society invites applications for sustaining membership. Information may be obtained from the Chair, Sustaining Memberships Committee, Audio Engineering Society, 60 East 42nd St., Room 2520, New York, New York 10165-2520, USA, tel: 212-661-8528. Fax: 212-682-0477.

ACO Pacific, Inc.
Air Studios Ltd.
AKG Acoustics GmbH
AKM Semiconductor, Inc.
Amber Technology Limited
AMS Neve plc
ATC Loudspeaker Technology Ltd.
Audio Limited
Audiomatica S.r.l.
Audio Media/IMAS Publishing Ltd.
Audio Precision, Inc.
AudioScience, Inc.
Audio-Technica U.S., Inc.
AudioTrack Corporation
Autograph Sound Recording Ltd.
B & W Loudspeakers Limited
BMP Recording
British Broadcasting Corporation
BSS Audio
Cadac Electronics PLC
Calrec Audio
Canford Audio plc
CEDAR Audio Ltd.
Celestion International Limited
Cerwin-Vega, Incorporated
ClearOne Communications Corp.
Community Professional Loudspeakers, Inc.
Crystal Audio Products/Cirrus Logic Inc.
D.A.S. Audio, S.A.
D.A.T. Ltd.
dCS Ltd.
Deltron Emcon Limited
Digidesign
Digigram
Digital Audio Disc Corporation
Dolby Laboratories, Inc.
DRA Laboratories
DTS, Inc.
DYNACORD, EVI Audio GmbH
Eastern Acoustic Works, Inc.
Eminence Speaker LLC
Event Electronics, LLC
Ferrotec (USA) Corporation
Focusrite Audio Engineering Ltd.
Fostex America, a division of Foster Electric U.S.A., Inc.
Fraunhofer IIS-A
FreeSystems Private Limited
FTG Sandar TeleCast AS
Harman Becker
HHB Communications Ltd.
Innova SON
Innovative Electronic Designs (IED), Inc.
International Federation of the Phonographic Industry
JBL Professional
Jensen Transformers Inc.
Kawamura Electrical Laboratory
KEF Audio (UK) Limited
Kenwood U.S.A. Corporation
Klark Teknik Group (UK) Plc
Klipsch L.L.C.
Laboratories for Information
L-Acoustics US
Leitch Technology Corporation
Lindos Electronics
Magnetic Reference Laboratory (MRL) Inc.
Martin Audio Ltd.
Meridian Audio Limited
Metropolis Group
Middle Atlantic Products Inc.
Mosses & Mitchell
M2 Gauss Corp.
Music Plaza Pte. Ltd.
Georg Neumann GmbH
Neutrik AG
NVision
NXT (New Transducers Ltd.)
1 Limited
Ontario Institute of Audio Recording Technology
Outline snc
PRIMEDIA Business Magazines & Media Inc.
Prism Sound
Pro-Bel Limited
Pro-Sound News
Radio Free Asia
Rane Corporation
Recording Connection
Rocket Network
Royal National Institute for the Blind
RTI Tech Pte. Ltd.
Rycote Microphone Windshields Ltd.
SADiE
Sanctuary Studios Ltd.
Sekaku Electron Ind. Co., Ltd.
Sennheiser Electronic Corporation
Shure Inc.
Snell & Wilcox Ltd.
Solid State Logic, Ltd.
Sony Broadcast & Professional Europe
Sound Devices LLC
Sound On Sound Ltd.
Soundcraft Electronics Ltd.
Soundtracs plc
Sowter Audio Transformers
SRS Labs, Inc.
Stage Accompany
Sterling Sound, Inc.
Studer North America Inc.
Studer Professional Audio AG
Tannoy Limited
TASCAM
THAT Corporation
TOA Electronics, Inc.
Touchtunes Music Corp.
United Entertainment Media, Inc.
Uniton AG
University of Derby
University of Salford
University of Surrey, Dept. of Sound Recording
VidiPax
Wenger Corporation
J. M. Woodgate and Associates
Yamaha Research and Development


AUDIO ENGINEERING SOCIETY, INC. INTERNATIONAL HEADQUARTERS

60 East 42nd Street, Room 2520, New York, NY 10165-2520, USA
Tel: +1 212 661 8528. Fax: +1 212 682 0477
E-mail: HQ@aes. Internet: http://www.aes.org

Roger K. Furness Executive Director
Sandra J. Requa Executive Assistant to the Executive Director

ADMINISTRATION

STANDARDS COMMITTEE

GOVERNORS

OFFICERS 2002/2003

Karl-Otto Bäder
Curtis Hoyt
Roy Pritts
Don Puluse
David Robinson
Annemarie Staepelaere
Roland Tan
Kunimara Tanaka

Ted Sheldon Chair Dietrich Schüller Vice Chair

Mendel Kleiner Chair Mark Ureda Vice Chair

SC-04-01 Acoustics and Sound Source Modeling Richard H. Campbell, Wolfgang Ahnert

SC-04-02 Characterization of Acoustical Materials: Peter D’Antonio, Trevor J. Cox

SC-04-03 Loudspeaker Modeling and Measurement David Prince, Steve Hutt

SC-04-04 Microphone Measurement and Characterization: David Josephson, Jackie Green

SC-04-07 Listening Tests: David Clark, T. Nousaine

SC-06-01 Audio-File Transfer and Exchange Mark Yonge, Brooks Harris

SC-06-02 Audio Applications Using the High Performance Serial Bus (IEEE 1394): John Strawn, Bob Moses

SC-06-04 Internet Audio Delivery System: Karlheinz Brandenburg

SC-06-06 Audio Metadata: C. Chambers

Kees A. Immink President

Ronald Streicher President-Elect

Garry Margolis Past President

Jim Anderson Vice President Eastern Region, USA/Canada

James A. Kaiser Vice President, Central Region, USA/Canada

Bob Moses Vice President, Western Region, USA/Canada

Søren Bech Vice President, Northern Region, Europe

Markus Erne Vice President, Central Region, Europe

Daniel Zalay Vice President, Southern Region, Europe

Mercedes Onorato Vice President, Latin American Region

Neville Thiele Vice President, International Region

Han Tendeloo Secretary

Marshall Buck Treasurer

TECHNICAL COUNCIL

Wieslaw V. Woszczyk Chair
Jürgen Herre and Robert Schulein Vice Chairs

COMMITTEES

SC-02-01 Digital Audio Measurement Techniques Richard C. Cabot, I. Dennis, M. Keyhl

SC-02-02 Digital Input-Output Interfacing: Julian Dunn, Robert A. Finger, John Grant

SC-02-05 Synchronization: Robin Caine

John P. Nunn Chair Robert A. Finger Vice Chair

Robin Caine Chair Steve Harris Vice Chair

John P. Nunn Chair

Bruce Olson Vice Chair, Western Hemisphere

Mark Yonge Secretary, Standards Manager

Yoshizo Sohma Vice Chair, International

SC-02 SUBCOMMITTEE ON DIGITAL AUDIO

Working Groups

SC-03 SUBCOMMITTEE ON THE PRESERVATION AND RESTORATION OF AUDIO RECORDING

Working Groups

SC-04 SUBCOMMITTEE ON ACOUSTICS

Working Groups

SC-06 SUBCOMMITTEE ON NETWORK AND FILE TRANSFER OF AUDIO

Working Groups

TECHNICAL COMMITTEES

SC-03-01 Analog Recording: J. G. McKnight

SC-03-02 Transfer Technologies: Lars Gaustad, Greg Faris

SC-03-04 Storage and Handling of Media: Ted Sheldon, Gerd Cyrener

SC-03-06 Digital Library and Archives Systems: William Storm, Joe Bean, Werner Deutsch

SC-03-12 Forensic Audio: Tom Owen, M. McDermott, Eddy Bogh Brixen

TELLERS
Christopher V. Freitag Chair

WOMEN IN AUDIO
Kees A. Immink Chair

Correspondence to AES officers and committee chairs should be addressed to them at the society’s international headquarters.

Ray Rayburn Chair John Woodgate Vice Chair

SC-05-02 Audio Connectors: Ray Rayburn, Werner Bachmann

SC-05-03 Audio Connector Documentation: Dave Tosti-Lane, J. Chester

SC-05-05 Grounding and EMC Practices Bruce Olson, Jim Brown

SC-05 SUBCOMMITTEE ON INTERCONNECTIONS

Working Groups

ACOUSTICS & SOUND REINFORCEMENT
Mendel Kleiner Chair
Kurt Graffy Vice Chair

ARCHIVING, RESTORATION AND DIGITAL LIBRARIES
David Ackerman Chair

AUDIO FOR TELECOMMUNICATIONS
Bob Zurek Chair
Andrew Bright Vice Chair

CODING OF AUDIO SIGNALS
James Johnston and Jürgen Herre Cochairs

AUTOMOTIVE AUDIO
Richard S. Stroud Chair
Tim Nind Vice Chair

HIGH-RESOLUTION AUDIO
Malcolm Hawksford Chair
Vicki R. Melchior and Takeo Yamamoto Vice Chairs

LOUDSPEAKERS & HEADPHONES
David Clark Chair
Juha Backman Vice Chair

MICROPHONES & APPLICATIONS
David Josephson Chair
Wolfgang Niehoff Vice Chair

MULTICHANNEL & BINAURAL AUDIO TECHNOLOGIES
Francis Rumsey Chair
Gunther Theile Vice Chair

NETWORK AUDIO SYSTEMS
Jeremy Cooperstock Chair
Robert Rowe and Thomas Sporer Vice Chairs

AUDIO RECORDING & STORAGE SYSTEMS
Derk Reefman Chair

PERCEPTION & SUBJECTIVE EVALUATION OF AUDIO SIGNALS
Durand Begault Chair
Søren Bech and Eiichi Miyasaka Vice Chairs

SIGNAL PROCESSING
Ronald Aarts Chair
James Johnston and Christoph M. Musialik Vice Chairs

STUDIO PRACTICES & PRODUCTION
George Massenburg Chair
Alan Parsons, David Smith and Mick Sawaguchi Vice Chairs

TRANSMISSION & BROADCASTING
Stephen Lyman Chair
Neville Thiele Vice Chair

AWARDS
Roy Pritts Chair

CONFERENCE POLICY
Søren Bech Chair

CONVENTION POLICY & FINANCE
Marshall Buck Chair

EDUCATION
Don Puluse Chair

FUTURE DIRECTIONS
Kees A. Immink Chair

HISTORICAL
J. G. (Jay) McKnight Chair
Ted Sheldon Vice Chair
Donald J. Plunkett Chair Emeritus

LAWS & RESOLUTIONS
Ron Streicher Chair

MEMBERSHIP/ADMISSIONS
Francis Rumsey Chair

NOMINATIONS
Garry Margolis Chair

PUBLICATIONS POLICY
Richard H. Small Chair

REGIONS AND SECTIONS
Subir Pramanik Chair

STANDARDS
John P. Nunn Chair


AES Journal of the Audio Engineering Society (ISSN 0004-7554), Volume 50, Number 11, 2002 November. Published monthly, except January/February and July/August when published bi-monthly, by the Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA. Telephone: +1 212 661 8528. Fax: +1 212 682 0477. Periodical postage paid at New York, New York, and at an additional mailing office. Postmaster: Send address corrections to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520.

The Audio Engineering Society is not responsible for statements made by its contributors.

COPYRIGHT
Copyright © 2002 by the Audio Engineering Society, Inc. It is permitted to quote from this Journal with customary credit to the source.

COPIES
Individual readers are permitted to photocopy isolated articles for research or other noncommercial use. Permission to photocopy for internal or personal use of specific clients is granted by the Audio Engineering Society to libraries and other users registered with the Copyright Clearance Center (CCC), provided that the base fee of $1.00 per copy plus $0.50 per page is paid directly to CCC, 222 Rosewood Dr., Danvers, MA 01923, USA. 0004-7554/95. Photocopies of individual articles may be ordered from the AES Headquarters office at $5.00 per article.

REPRINTS AND REPUBLICATION
Multiple reproduction or republication of any material in this Journal requires the permission of the Audio Engineering Society. Permission may also be required from the author(s). Send inquiries to AES Editorial office.

SUBSCRIPTIONS
The Journal is available by subscription. Annual rates are $160.00 surface mail, $205.00 air mail. For information, contact AES Headquarters.

BACK ISSUES
Selected back issues are available: From Vol. 1 (1953) through Vol. 12 (1964), $10.00 per issue (members), $15.00 (nonmembers); Vol. 13 (1965) to present, $6.00 per issue (members), $11.00 (nonmembers). For information, contact AES Headquarters office.

MICROFILM
Copies of Vol. 19, No. 1 (1971 January) to the present edition are available on microfilm from University Microfilms International, 300 North Zeeb Rd., Ann Arbor, MI 48106, USA.

ADVERTISING
Call the AES Editorial office or send e-mail to: JournalAdvertising@aes.

MANUSCRIPTS
For information on the presentation and processing of manuscripts, see Information for Authors.

Patricia M. Macdonald Executive Editor
William T. McQuaide Managing Editor
Gerri M. Calamusa Senior Editor
Abbie J. Cohen Senior Editor
Mary Ellen Ilich Associate Editor
Patricia L. Sarch Art Director

EDITORIAL STAFF

Europe Conventions
Zevenbunderslaan 142/9, BE-1190 Brussels, Belgium, Tel: +32 2 345 7971, Fax: +32 2 345 3419, E-mail for convention information: [email protected]

Europe Services
B.P. 50, FR-94364 Bry Sur Marne Cedex, France, Tel: +33 1 4881 4632, Fax: +33 1 4706 0648, E-mail for membership and publication sales: [email protected]

United Kingdom
British Section, Audio Engineering Society Ltd., P. O. Box 645, Slough, SL1 8BJ UK, Tel: +44 1628 663725, Fax: +44 1628 667002, E-mail: [email protected]

Japan
Japan Section, 1-38-2 Yoyogi, Room 703, Shibuya-ku, Tokyo 151-0053, Japan, Tel: +81 3 5358 7320, Fax: +81 3 5358 7328, E-mail: aes_japan@aes.

PURPOSE: The Audio Engineering Society is organized for the purpose of: uniting persons performing professional services in the audio engineering field and its allied arts; collecting, collating, and disseminating scientific knowledge in the field of audio engineering and its allied arts; advancing such science in both theoretical and practical applications; and preparing, publishing, and distributing literature and periodicals relative to the foregoing purposes and policies.

MEMBERSHIP: Individuals who are interested in audio engineering may become members of the Society. Applications are considered by the Admissions Committee. Grades and annual dues are: Full member, $80.00; Associate member, $80.00; Student member, $40.00. A membership application form may be obtained from headquarters. Sustaining memberships are available to persons, corporations, or organizations who wish to support the Society. A subscription to the Journal is included with all memberships.

Ronald M. Aarts, James A. S. Angus, George L. Augspurger, Jeffrey Barish, Jerry Bauck, James W. Beauchamp, Søren Bech, Durand Begault, Barry A. Blesser, John S. Bradley, Robert Bristow-Johnson, John J. Bubbers, Marshall Buck, Mahlon D. Burkhard, Richard C. Cabot, Edward M. Cherry, Robert R. Cordell, Andrew Duncan, John M. Eargle, Louis D. Fielder, Edward J. Foster, Mark R. Gander, Earl R. Geddes, David Griesinger, Malcolm O. J. Hawksford, Jürgen Herre, Tomlinson Holman, Andrew Horner, James D. Johnston, Arie J. M. Kaizer, James M. Kates, D. B. Keele, Jr., Mendel Kleiner, David L. Klepper, W. Marshall Leach, Jr., Stanley P. Lipshitz, Robert C. Maher, Dan Mapes-Riordan, J. G. (Jay) McKnight, Guy W. McNally, D. J. Meares, Robert A. Moog, James A. Moorer, Dick Pierce, Martin Polon, D. Preis, Daniel Queen, Francis Rumsey, Kees A. Schouhamer Immink, Manfred R. Schroeder, Robert B. Schulein, Richard H. Small, Julius O. Smith III, Gilbert Soulodre, Herman J. M. Steeneken, John Strawn, G. R. (Bob) Thurmond, Jiri Tichy, Floyd E. Toole, Emil L. Torick, John Vanderkooy, Daniel R. von Recklinghausen, Rhonda Wilson, John M. Woodgate, Wieslaw V. Woszczyk

REVIEW BOARD

Flávia Elzinga Advertising
Ingeborg M. Stochmal Copy Editor
Barry A. Blesser Consulting Technical Editor
Stephanie Paynes Writer
Daniel R. von Recklinghausen Editor

Eastern Region, USA/Canada
Sections: Atlanta, Boston, District of Columbia, New York, Philadelphia, Toronto
Student Sections: American University, Berklee College of Music, Carnegie Mellon University, Duquesne University, Fredonia, Full Sail Real World Education, Hampton University, Institute of Audio Research, McGill University, Pennsylvania State University, University of Hartford, University of Massachusetts-Lowell, University of Miami, University of North Carolina at Asheville, William Patterson University, Worcester Polytechnic University

Central Region, USA/Canada
Sections: Central Indiana, Chicago, Detroit, Kansas City, Nashville, New Orleans, St. Louis, Upper Midwest, West Michigan
Student Sections: Ball State University, Belmont University, Columbia College, Michigan Technological University, Middle Tennessee State University, Music Tech College, SAE Nashville, Northeast Community College, Ohio University, Ridgewater College, Hutchinson Campus, Southwest Texas State University, University of Arkansas-Pine Bluff, University of Cincinnati, University of Illinois-Urbana-Champaign

Western Region, USA/Canada
Sections: Alberta, Colorado, Los Angeles, Pacific Northwest, Portland, San Diego, San Francisco, Utah, Vancouver
Student Sections: American River College, Brigham Young University, California State University–Chico, Citrus College, Cogswell Polytechnical College, Conservatory of Recording Arts and Sciences, Denver, Expression Center for New Media, Long Beach City College, San Diego State University, San Francisco State University, Cal Poly San Luis Obispo, Stanford University, The Art Institute of Seattle, University of Southern California, Vancouver

Northern Region, Europe
Sections: Belgian, British, Danish, Finnish, Moscow, Netherlands, Norwegian, St. Petersburg, Swedish
Student Sections: All-Russian State Institute of Cinematography, Danish, Netherlands, St. Petersburg, University of Lulea-Pitea

Central Region, Europe
Sections: Austrian, Belarus, Czech, Central German, North German, South German, Hungarian, Lithuanian, Polish, Slovakian Republic, Swiss, Ukrainian
Student Sections: Berlin, Czech Republic, Darmstadt, Detmold, Düsseldorf, Graz, Ilmenau, Technical University of Gdansk (Poland), Vienna, Wroclaw University of Technology

Southern Region, Europe
Sections: Bosnia-Herzegovina, Bulgarian, Croatian, French, Greek, Israel, Italian, Portugal, Romanian, Slovenian, Spanish, Turkish, Yugoslavian
Student Sections: Croatian, Conservatoire de Paris, Italian, Louis-Lumière School

Latin American Region
Sections: Argentina, Brazil, Chile, Colombia (Medellin), Mexico, Uruguay, Venezuela
Student Sections: Taller de Arte Sonoro (Caracas)

International Region
Sections: Adelaide, Brisbane, Hong Kong, India, Japan, Korea, Malaysia, Melbourne, Philippines, Singapore, Sydney

AES REGIONAL OFFICES

AES REGIONS AND SECTIONS


AES JOURNAL OF THE

AUDIO ENGINEERING SOCIETY

AUDIO/ACOUSTICS/APPLICATIONS

VOLUME 50 NUMBER 11 2002 NOVEMBER

CONTENTS

PAPERS

Estimation of Modal Decay Parameters from Noisy Response Measurements
Matti Karjalainen, Poju Antsalo, Aki Mäkivirta, Timo Peltonen, and Vesa Välimäki 867
Estimating decay parameters, such as room reverberation or string decay of musical instruments, becomes more inaccurate as the noise level increases if that noise is not also included in the parametric model. A new method based on nonlinear optimization of a linear model with additive noise is demonstrated as being more accurate than the traditional approaches, especially at extreme noise conditions. Several practical examples illustrate the utility of the proposed method.

On the Use of Time–Frequency Reassignment in Additive Sound Modeling
Kelly Fitz and Lippold Haken 879
The conventional short-time spectral analysis is unable to parameterize signals that have both narrow-band components and transients because of the nature of the time–frequency definition. Moreover, a single musical partial, which often has time-varying amplitude and frequency envelopes, does not appear as a single component in the classical linear analysis approach. By reassigning energy components, temporal smear is greatly reduced because unreliable data points can be removed from the representation. Several examples illustrate the ability to reconstruct abrupt onset square waves.

Scalable Multichannel Coding with HRTF Enhancement for DVD and Virtual Sound Systems
M. O. J. Hawksford 894
Using the perceptual properties embedded in the head-related transfer functions of a normal listener, a standard five- or six-channel audio source can be transformed to feed additional loudspeakers, perhaps as many as 18. The proposed system enhances image stability by optimizing the correct ear signals at the listener’s location. The approach improves spatial resolution over a wider area while retaining backward compatibility with an unprocessed reproduction environment.

Two-to-Five Channel Sound Processing
R. Irwan and Ronald M. Aarts 914
In order to reproduce two-channel audio sources in a surround listening environment, the authors evaluated a proposed processing system based on decoding the principal directional component using correlation techniques. By parameterizing direction, a stable center channel was created. Moreover, the synthesized ambiance for the sides does not degrade image stability.

STANDARDS AND INFORMATION DOCUMENTS

AES Standards Committee News........................................................................................................... 927New Internet facilities; preservation of recordings; storage and handling; audio connectors; grounding and shielding; file exchange

FEATURES

113th Convention Report, Los Angeles  934
Exhibitors  950
Program  954

Bluetooth and Wireless Networking—A Primer for Audio Engineers  979
11th Tokyo Regional Convention, Call for Papers  1001

DEPARTMENTS

News of the Sections  985
Sound Track  989
Upcoming Meetings  990
New Products and Developments  990
Available Literature  992
Membership Information  993
Advertiser Internet Directory  999
In Memoriam  1000
Sections Contacts Directory  1002
AES Conventions and Conferences  1008


PAPERS

0 INTRODUCTION

Parametric analysis, modeling, and equalization (inverse modeling) of reverberant and resonating systems find many applications in audio and acoustics. These include room and concert hall acoustics, resonators in musical instruments, and resonant behavior in audio reproduction systems. Estimating the reverberation time or the modal decay rate is an important measurement problem in room and concert hall acoustics [1], where signal-to-noise ratios (SNRs) of only 30–50 dB are common. The same problems can be found, for example, in the estimation of parameters in model-based sound synthesis of musical instruments, such as vibrating strings or body modes of string instruments [2]. Reliable methods to estimate parameters from noisy measurements are thus needed.

In an ideal case of modal behavior, after a possible initial transient, the decay is exponential until a steady-state noise floor is encountered. The parameters of primary interest to be estimated are:

• Initial level of decay L_I
• Decay rate or reverberation time T_D
• Noise floor level L_N.

In a more complex case there can be two or more modal frequencies, whereby the decay is no longer simple, but shows additional fluctuation (beating) or a two-stage (or multiple-stage) decay behavior. In a diffuse field (room acoustics) the decay of a noiselike response is approximately exponential in rooms with compact geometry. The noise floor may also be nonstationary. In this paper we primarily discuss a simple mode (that is, a complex conjugate pole pair in the transfer function) or a dense set of modes with exponential reverberant decay, together with a stationary noise floor.

Methods presented in the literature and common-sense or ad hoc methods will first be reviewed. Techniques based on energy–time curve analysis of the signal envelope are known as methods where the noise floor can be found and estimated explicitly. Backward integration of energy, so-called Schroeder integration [3], [4], is often applied first to obtain a smoothed envelope for decay rate estimation.
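Schroeder integration is simple to express in code. The following is a minimal NumPy sketch (the function name and the normalization to 0 dB at the start are my choices, not from the paper): the energy decay curve is the backward integral of the squared impulse response.

```python
import numpy as np

def schroeder_edc_db(h):
    """Energy decay curve via Schroeder backward integration,
    normalized to 0 dB at the start of the response."""
    h = np.asarray(h, dtype=float)
    # Integral of h^2 from each time instant to the end of the response.
    energy = np.cumsum(h[::-1] ** 2)[::-1]
    return 10.0 * np.log10(energy / energy[0])
```

On a clean exponential decay the resulting curve is essentially a straight line in dB, which is what makes slope-based decay-rate estimation convenient.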

The effect of the background noise floor is known to be problematic, and techniques have been developed to compensate for the effect of envelope flattening when the noise floor in a measured response is reached, including limiting the period of integration [5], subtracting an estimated noise floor energy level from a response [6], or using two separate measurements to reduce the effect of noise [7]. The iterative method by Lundeby et al. [8] is of particular interest since it addresses the case of noisy data with care.
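The noise-subtraction idea can be sketched as follows. This is a generic illustration, not the exact procedure of [6]: it assumes, hypothetically, that the final 10% of the measured response contains only the noise floor, and both the helper name and the tail fraction are illustrative.

```python
import numpy as np

def edc_noise_subtracted_db(h, tail_fraction=0.1):
    """Schroeder integration after subtracting an estimated noise-floor
    energy; the estimate is the mean squared value of the response tail
    (assumed here to contain noise only)."""
    e = np.asarray(h, dtype=float) ** 2
    noise_energy = e[int(len(e) * (1.0 - tail_fraction)):].mean()
    e_comp = np.maximum(e - noise_energy, 0.0)   # clip negative energies
    edc = np.cumsum(e_comp[::-1])[::-1]
    edc /= edc[0]
    return 10.0 * np.log10(np.maximum(edc, np.finfo(float).tiny))
```

Compared with plain backward integration, the compensated curve keeps falling instead of flattening out at the noise-floor level.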

J. Audio Eng. Soc., Vol. 50, No. 11, 2002 November 867

Estimation of Modal Decay Parameters from Noisy Response Measurements*

MATTI KARJALAINEN,1 AES Fellow, POJU ANTSALO,1 AKI MÄKIVIRTA,2 AES Member, TIMO PELTONEN,3 AES Member, AND VESA VÄLIMÄKI,1,4 AES Member

1 Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing, Espoo, Finland
2 Genelec Oy, Iisalmi, Finland
3 Akukon Oy, Helsinki, Finland
4 Pori School of Technology and Economics, Tampere University of Technology, Pori, Finland

The estimation of modal decay parameters from noisy measurements of reverberant and resonating systems is a common problem in audio and acoustics, such as in room and concert hall measurements or musical instrument modeling. Reliable methods to estimate the initial response level, decay rate, and noise floor level from noisy measurement data are studied and compared. A new method, based on the nonlinear optimization of a model for exponential decay plus stationary noise floor, is presented. A comparison with traditional decay parameter estimation techniques using simulated measurement data shows that the proposed method outperforms them in accuracy and robustness, especially in extreme SNR conditions. Three cases of practical applications of the method are demonstrated.

* Presented at the 110th Convention of the Audio Engineering Society, Amsterdam, The Netherlands, 2001 May 12–15; revised 2001 November 8 and 2002 September 6.



This technique, as most other methods, analyzes the initial level L_I, the decay time T_D, and the noise floor L_N separately, typically starting from a noise floor estimate. Iterative procedures are common in accurate estimation.

A different approach was taken by Xiang [9], where a parameterized signal-plus-noise model is fitted to Schroeder-integrated measurement data by searching for a least-squares (LS) optimal solution. In this study we have elaborated a similar method of nonlinear LS optimization further to make it applicable to a wide range of situations, showing good convergence properties. A specific parameter and/or a weighting function can be used to fine-tune the method further for specific problems. The technique is compared with the Lundeby et al. method by applying it to simulated cases of exponential decay plus a stationary noise floor where the exact parameters are known. The improved nonlinear optimization technique is found to outperform traditional methods in accuracy and robustness, particularly in difficult conditions with extreme SNRs.
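The flavor of fitting a decay-plus-noise model by least squares can be conveyed with a simplified sketch. This is not the authors' algorithm: instead of full nonlinear optimization, it grid-searches the single nonlinear parameter τ and, for each candidate, solves the two amplitude parameters of the squared-envelope model a·exp(-2n/τ) + b (with a = A², b = A_n²) by ordinary linear least squares.

```python
import numpy as np

def fit_decay_plus_noise(h, taus):
    """Fit  E(n) ~ a*exp(-2n/tau) + b  to the squared response, where
    a = A^2 and b = An^2.  For each candidate tau the model is linear
    in (a, b), solved by least squares; the tau giving the smallest
    residual wins.  Returns (A, tau, An)."""
    e = np.asarray(h, dtype=float) ** 2
    n = np.arange(e.size)
    best = (np.inf, None, None)
    for tau in taus:
        X = np.column_stack([np.exp(-2.0 * n / tau), np.ones(e.size)])
        coef = np.linalg.lstsq(X, e, rcond=None)[0]
        resid = np.sum((X @ coef - e) ** 2)
        if resid < best[0]:
            best = (resid, tau, coef)
    _, tau, (a, b) = best
    return np.sqrt(max(a, 0.0)), tau, np.sqrt(max(b, 0.0))
```

The paper's method replaces the crude grid with proper nonlinear optimization and can add weighting, but the decay-plus-noise model structure is the same.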

Finally, the applicability of the improved method is demonstrated by three examples of real measurement data: 1) the reverberation time of a concert hall, 2) the low-frequency mode analysis of a room, and 3) the parametric analysis of guitar string behavior for model-based sound synthesis. Possibilities for further generalization of the technique to more complex problems, such as two-stage decay, are discussed briefly.

1 DEFINITION OF PROBLEM DOMAIN

A typical property of resonant acoustic systems is that their impulse response is a decaying function after a possible initial delay and the onset. In the simplest case the response of a single-mode resonator system is

h(t) = A e^{-(t - t_0)/τ} sin[ω(t - t_0) + φ] u(t - t_0)        (1)

where u(t - t_0) is a step function with value 1 for t ≥ t_0 and 0 elsewhere, A is the initial response level, t_0 the response latency, for example, due to the propagation delay of sound, τ the decay rate parameter, ω = 2πf the angular frequency, and φ the initial phase of the sinusoidal response. In practical measurements, when there are multiple modes in the system and noise (acoustic noise plus measurement system noise), a measured impulse response is of the form5

h(t) = Σ_{i=1}^{N} A_i e^{-(t - t_0)/τ_i} sin[ω_i(t - t_0) + φ_i] + A_n n(t)        (2)

where A_n is the rms value of background noise and n(t) is the unity-level noise signal. Fig. 1 illustrates a single delayed mode response corrupted by additive noise.
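Eq. (2) with a single mode (N = 1) is straightforward to simulate, which is useful for generating test data with known ground truth. A sketch (parameter values in the usage below are arbitrary):

```python
import numpy as np

def modal_response(fs, dur, A, tau, f, phi, t0, An, seed=0):
    """Single decaying mode per Eq. (1), delayed by t0 and buried in a
    stationary noise floor of rms level An, i.e. Eq. (2) with N = 1."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * dur)) / fs
    u = (t >= t0).astype(float)   # unit step at the onset
    decay = A * np.exp(-(t - t0) / tau) * np.sin(2 * np.pi * f * (t - t0) + phi)
    return decay * u + An * rng.standard_normal(t.size)
```

For example, `modal_response(fs=8000, dur=1.0, A=1.0, tau=0.1, f=100.0, phi=0.0, t0=0.05, An=0.001)` produces a 100-Hz mode with 50 ms of pure noise floor before the onset, much like Fig. 1.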

The task of this study is defined as finding reliable estimates for the parameter set A_i, τ_i, t_0, ω_i, φ_i, A_n, given a noisy measured impulse response of the form of Eq. (2).

The main interest here is focused on systems of 1) separable single modes of type (1), including additive noise floor, or 2) the dense (diffuse, noiselike) set of modes resulting also in exponential decay similar to Fig. 1. In both cases the parameters of primary interest are A, τ, A_n, and t_0.

Often the decay time T_D is of main interest, for example, in room acoustics where the reverberation time [10], [11] of 60-dB decay6 T_60 is related to τ by

T_60 = ln(10^3) τ ≈ 6.908 τ.        (3)
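As a quick numerical check of Eq. (3): amplitude decay e^{-t/τ} has fallen by 60 dB when 20 log10(e) · t/τ = 60, i.e. at t = ln(10^3) τ.

```python
import math

def t60_from_tau(tau):
    """Eq. (3): the time at which exp(-t/tau) has decayed by 60 dB,
    T60 = ln(10**3) * tau, approximately 6.908 * tau."""
    return math.log(10.0 ** 3) * tau
```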

Modern measurement and analysis techniques of system responses are carried out by digital signal processing, whereby the discrete-time formulation for modal decay (without initial delay) with sampling rate f_s and sample period T_s = 1/f_s becomes

h(n) = A e^{-n/τ_d} sin(Ω n + φ)        (4)

with n being the sample index, τ_d = τ/T_s the decay constant expressed in samples, and Ω = 2πT_s f.

2 DECAY PARAMETER ESTIMATION

In this section an overview of the known techniques for decay parameter estimation will be presented. Initial delay


5 In a more general case the initial delays of signal components may differ and there can be simple nonresonant exponential terms, but these cases are of less importance here.

6 In practice, the reverberation time is often determined from the slope of a decay curve using only the first 25 or 35 dB of decay and extrapolating the result to 60 dB. For a recommended practice of reverberation time determination, see [1].

Fig. 1. (a) Single-mode impulse response (sinusoidal decay) with initial delay and additive measurement noise. (b) Absolute value of response on dB scale to illustrate decay envelope. (c) Hilbert envelope; otherwise same as (b).

[Fig. 1 appears here: panels (a)–(c) plotted against time in samples (0–1200), with annotations for the delay, the exponential decay, the initial level (IL), and the noise floor level (LN).]


PAPERS NOISE RESPONSE MEASUREMENTS

and level estimation are first discussed briefly. The main problem, decay rate estimation, is the second topic. Methods to smooth the decay envelope from a measured impulse response are presented. Noise floor estimation, an important subproblem, is discussed next. Finally, techniques for combined noise floor and decay rate estimation are reviewed.

2.1 Initial Delay and Initial Level Estimation

In most cases the initial delay and the initial level parameters are relatively easy to estimate. The initial delay may be short, not needing any attention, or the initial bulk delay can be cut off easily up to the edge of the response onset. Only when the onset is relatively irregular or the SNR is low can the detection of onset time be difficult.

A simple technique to eliminate initial delay is to compute the minimum-phase component hmphase(t) of the measured response [12]. An impulse response can be decomposed as a convolution of minimum-phase and excess-phase components, h(t) = hmphase(t) * hephase(t). Since the excess-phase component has all-pass properties manifested as a delay, computation of the minimum-phase part will remove the initial delay.

The initial level in the beginning of the decay can be detected directly from the peak value of the onset. For improved robustness, however, it may be better to estimate it from the matched decay curve, particularly its value at the onset time.

In the case of a room impulse response, the onset corresponds to direct sound from the sound source. It may be of special interest for the computation of the source-to-receiver distance or in estimating the impulse response of the sound source itself by windowing the response prior to the first room reflection.

2.2 Decay Rate Estimation

Decay rate or time estimation is in practice based on fitting a straight line to the decay envelope, such as the energy–time curve, mapped on a logarithmic (dB) scale. Before the computerized age this was done graphically on paper. The advantage of manual inspection is that an expert can avoid data interpretation errors in pathological cases. In practice, however, automatic determination of the decay rate or time is highly desirable.

2.2.1 Straight-Line Fit to Log Envelope

Fitting a line to a logarithmic decay curve is a conceptually and computationally simple way of decay rate estimation. The decay envelope y(t) can be computed simply as a dB-scaled energy–time curve,

y(t) = 10 log10 [x²(t)] (5)

where x(t) is the measured impulse response or a bandpass-filtered part of it, such as an octave or one-third-octave band. It is common to apply techniques such as Schroeder integration and Hilbert envelope computation (to be described later) in order to smooth the decay curve before line fitting. Least-squares line fitting (linear regression) is done by finding the optimal decay rate k and the initial level a,

min_{k,a} ∫_{t1}^{t2} [y(t) − (k t + a)]² dt (6)

using, for example, the Matlab function polyfit [13].

Practical problems with line fitting are related to the selection of the interval [t1, t2] and to cases where the decay of the measured response is inherently nonlinear. The first problem is avoided by excluding onset transients in the beginning and the noise floor at the end of the measurement interval. The second problem is related to such cases as two-stage decay (initial decay rate or early reverberation and late decay rate or reverberation) or beating (fluctuation) of the envelope because of two modes close in frequency [see Fig. 9(b)].

2.2.2 Nonlinear Regression (Xiang's Method)

Xiang [9] formulated a method where a measured and Schroeder-integrated energy–time curve is fitted to a parametric model of a linear decay plus a constant noise floor. Since the model is not linear in its parameters, nonlinear curve fitting (nonlinear regression) is needed. Mathematically, this is done by iterative means, such as starting from a set of initial values for the model parameters and applying gradient descent to search for a least-squares optimum,

min_{x1,x2,x3} ∫_{t1}^{t2} { y_sch(t) − [x1 e^(−x2 t) + x3 (L − t)] }² dt (7)

where y_sch(t) is the Schroeder-integrated energy envelope, x1 the initial level, x2 the decay rate parameter, x3 a noise-floor-related parameter, L the length of the response, and [t1, t2] the time interval of the nonlinear regression. Notice that the last term for the noise floor effect is a descending line, instead of a constant level, due to the backward integration of noise energy [9].

Nonlinear optimization is mathematically more complex than linear fitting, and care should be taken to guarantee convergence. Even when converging, the result may be only a local optimum, and generally the only way to know that a global optimum is found is to apply an exhaustive search over possible value combinations of model parameters, which, in a multiparameter case, is often computationally too expensive.

Nonlinear optimization techniques will be studied in more detail later in this paper by introducing generalizations to the method of Xiang and by comparing the performance of different techniques in decay parameter estimation.
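To make Eq. (7) concrete, here is a small sketch (our own code, not the paper's) that fits the decay-plus-noise-line model to a synthetic Schroeder curve with scipy's iterative least-squares solver. The parameterization x1 e^(−x2 t) + x3 (L − t) follows the equation above; the variable names and test values are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import least_squares

L = 1.0                                   # response length in seconds
t = np.arange(1000) / 1000.0

# Synthetic Schroeder curve built from the model itself:
# x1 = 0.1 (initial level), x2 = 10 (decay rate), x3 = 1e-4 (noise term)
ysch = 0.1 * np.exp(-10.0 * t) + 1e-4 * (L - t)

def residual(x):
    x1, x2, x3 = x
    return x1 * np.exp(-x2 * t) + x3 * (L - t) - ysch

# Iterative least-squares search from rough initial values, cf. Eq. (7)
fit = least_squares(residual, x0=[1.0, 5.0, 1e-3])
x1, x2, x3 = fit.x
```

On clean data the true parameters are recovered almost exactly; the convergence caveats discussed above apply once real, noisy curves are fitted.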

2.2.3 AR and ARMA Modeling

For a single mode of Eq. (1) the response can be modeled as the impulse response of a resonating second-order all-pole or pole–zero filter. More generally, a combination of N modes can be modeled as a 2N-order filter. AR (autoregressive) and ARMA (autoregressive moving average) modeling [14] are ways to derive parameters for such models. In many technical applications the AR method is called linear prediction [15]. For example, the function lpc in Matlab [16] processes a signal frame



KARJALAINEN ET AL. PAPERS

through autocorrelation coefficient computation and solving the normal equations by Levinson recursion, resulting in the Nth-order z-domain transfer function 1/(1 + Σ_{i=1}^{N} αi z^{−i}). Poles are obtained by solving the roots of the denominator polynomial. Each modal resonance appears as a complex conjugate pole pair (zi, zi*) in the complex z-plane with pole angle φ = arg(zi) = 2π f/fs and pole radius r = |zi| = e^(−τ/fs), where f is the modal frequency, fs the sampling rate, and τ the decay parameter of the mode in Eq. (1). ARMA modeling requires an iterative solution for a pole–zero filter.

Decay parameter analysis by AR and ARMA modeling is an important and attractive technique, for example, in cases where modes overlapping or very close to each other have to be modeled, which is often difficult by other means. For reverberation with high modal density the order of AR modeling may become too high for accurate modeling. Such accuracy is also not necessary for analyzing the overall decay rate (reverberation time) only. AR and ARMA modeling of modal behavior in acoustic systems is discussed in detail, for example, in [17].
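A minimal sketch of the AR route for a single mode (our own example, using a covariance-method least-squares predictor rather than Matlab's lpc): the pole pair of a second-order predictor gives the modal frequency from the pole angle and the decay parameter from the pole radius, following the relations quoted above.

```python
import numpy as np

fs = 8000.0
f, tau = 440.0, 20.0            # modal frequency (Hz), decay rate (1/s)
n = np.arange(2000)
h = np.exp(-tau * n / fs) * np.sin(2 * np.pi * f / fs * n)

# Order-2 linear prediction (one mode = one complex conjugate pole pair):
# solve h[n] ~ a1*h[n-1] + a2*h[n-2] in the least-squares sense, then
# root the prediction polynomial z^2 - a1*z - a2.
X = np.column_stack([h[1:-1], h[:-2]])
a1, a2 = np.linalg.lstsq(X, h[2:], rcond=None)[0]
z = np.roots([1.0, -a1, -a2])
z = z[np.argmax(z.imag)]                 # upper-half-plane pole of the pair

f_est = np.angle(z) * fs / (2 * np.pi)   # pole angle:  phi = 2*pi*f/fs
tau_est = -fs * np.log(np.abs(z))        # pole radius: r = exp(-tau/fs)
```

For a noiseless single mode the recursion is exact and both parameters are recovered to machine precision; with N overlapping modes the same idea applies with order 2N.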

2.2.4 Group Delay Analysis

A complementary method to AR modeling is to use the group delay, that is, the phase derivative Tg(ω) = −dϕ(ω)/dω, as an estimate of the decay time for separable modes of an impulse response. While AR modeling is sensitive to the power spectrum only, the group delay is based on phase properties only. For a minimum-phase single-mode response the group delay at the modal frequency is inversely proportional to the decay parameter, that is, Tg ∝ 1/τ. Group delay computation is somewhat critical due to the phase unwrapping needed, and the method can be sensitive to measurement noise.

2.3 Decay Envelope Smoothing Techniques

In the methods of linear or nonlinear curve fitting it is desirable to obtain a smooth decay envelope prior to the fitting operation. The following techniques are often used to improve the regularity of the decay ramp.

2.3.1 Hilbert Envelope Computation

In this method the signal x(t) is first converted to an analytic signal xa(t) so that x(t) is the real part of xa(t) and the Hilbert transform (90° phase shift) [12] of x(t) is the imaginary part of xa(t). For a single sinusoid this results in an entirely smooth energy–time envelope. An example of a Hilbert envelope for a noisy modal response is shown in Fig. 1(c).

2.3.2 Schroeder Integration

A monotonic and smoothed decay curve can be produced by backward integration of the impulse response h(t) over the measurement interval [0, T] and conversion to a logarithmic scale,

L_dB(t) = 10 log10 [ ∫_t^T h²(τ) dτ / ∫_0^T h²(τ) dτ ]. (8)

This process is commonly known as Schroeder integration [3], [4]. Based on its superior smoothing properties it is used routinely in modern reverberation time measurements. A known problem with it is that if the background noise floor is included within the integration interval, the process produces a raised ramp that biases the late part of the decay upward. This is shown in Fig. 2 for the case of noisy single-mode decay [curve (a)] with full response integration [curve (d)].

The tail problem of Schroeder integration has been addressed by many authors (for example, in [18], [8], [5], [6]), and techniques to reduce slope biasing have been proposed. In order to apply these improvements, a good estimate of the noise floor level is needed first.
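A discrete sketch of Eq. (8) (our code): backward cumulative summation of the squared response, normalized to the total energy.

```python
import numpy as np

def schroeder_db(h):
    """Backward-integrated energy decay curve, Eq. (8), in dB."""
    e = np.cumsum((h ** 2)[::-1])[::-1]   # integral of h^2 from t to T
    return 10.0 * np.log10(e / e[0])      # normalized by total energy

fs = 1000
t = np.arange(fs) / fs
h = np.exp(-10.0 * t)          # envelope-only test response, no noise
L = schroeder_db(h)

# For this pure exponential the curve is a straight line: the energy
# e^(-20 t) decays 10*log10(e)*20 = 200/ln(10) ~ 86.9 dB per second.
slope = (L[100] - L[0]) * fs / 100.0
```

With a noise floor included, the same routine produces the raised tail ramp discussed above, which is why the integration limit or a noise compensation step matters.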

2.4 Noise Floor Level Estimation

The limited SNR inherent in practically all acoustical measurements, and especially in measurements performed under field conditions, calls for attention concerning the upper time limit of decay curve fitting or Schroeder integration. Theoretically this limit is set to infinity, but in practical measurements it is naturally limited to the length of the measured impulse response data. In practice, measured impulse responses must be long enough to accommodate a large enough dynamic range or the whole system decay down to the background noise level.7

Thus the measured impulse response typically contains not only the decay curve under analysis, but also a steady level of background noise, which dominates at the end of


7 This is needed to avoid time aliasing in MLS and other cyclic impulse response measurement methods.

Fig. 2. Results of Schroeder integration applied to noisy decay of a mode. Curve (a) measured noisy response including initial delay; curve (b) true decay of noiseless mode (dashed straight line); curve (c) noise floor (−26 dB); curve (d) Schroeder integration of total measured interval; curve (e) integration over short interval (0, 900 ms); curve (f) integration over interval (0, 1100 ms); curve (g) integration after subtracting noise floor from energy–time curve; curve (h) a few decay curves integrated by Hirata's method.




the response. Fitting the decay line over this part of the envelope, or Schroeder-integrating this steady energy level along with the exponential decay curve, causes an error both in the resulting decay rate (see Fig. 2) and in the time-windowed energies (energy parameters).

To avoid bias by noise, an analysis must be performed on the impulse response data to find the level of background noise and the point where the room decay meets the noise level. This way it is possible to effectively truncate the impulse response at the noise level, minimizing the noise energy mixed with the actual decay.

Determination of the noise floor level is difficult without using iterative techniques. The method by Lundeby et al., which will be outlined later, is a good example of iterative techniques integrated with decay rate estimation.

A simple way to obtain a reasonable estimate of the background noise floor is to average a selected part of the measured response tail or to fit a regression line to it [19]. The level is certainly overestimated if the noise floor is not reached, but this is not necessarily problematic, as opposed to underestimating it. Another technique is to look at the background level before the onset of the main response. This works if there is enough initial latency in the system response under study.
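A sketch of the tail-averaging approach (our own example values): the mean energy of the last 10% of the response gives the noise floor estimate in dB.

```python
import numpy as np

def noise_floor_db(h, tail_fraction=0.1):
    """Noise floor estimate from the mean energy of the response tail."""
    tail = h[int(len(h) * (1.0 - tail_fraction)):]
    return 10.0 * np.log10(np.mean(tail ** 2))

rng = np.random.default_rng(0)
fs = 1000
t = np.arange(2 * fs) / fs
# fast modal decay, fully buried in a -30 dB noise floor well before t = 1.8 s
h = np.exp(-40.0 * t) * np.sin(2 * np.pi * 100 * t) \
    + 10.0 ** (-30.0 / 20.0) * rng.standard_normal(t.size)
Ln = noise_floor_db(h)     # close to -30 dB
```

If the decay has not yet reached the floor within the tail segment, this estimate is biased upward, which, as noted above, is the safer direction of error.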

2.5 Decay Estimation with Noise Floor Reduction

In addition to determining the response starting point, it is thus essential to find an end point where the decay curve meets the background noise, and to truncate the noise from the end of the response. Fig. 2 illustrates the effect of limiting the Schroeder integration interval. If the interval is too short, as in curve (e), the curve is biased downward. Curve (f) shows a case where the bias due to noise is minimized by considering the decay only down to 10 dB above the noise floor.

There are no standardized exact methods for determining the limits for Schroeder integration and decay fitting or for noise compensation techniques. These methods are discussed next.

2.5.1 Limited Integration or Decay Matching Interval

There are several recommendations for dealing with the noise floor and the point where the decay meets the noise. For example, according to ISO 3382 [1], to determine room reverberation the noise floor must be 10 dB below the lowest decay level used for the calculation of the decay slope. Morgan [5] recommends truncating at the knee point and then measuring the decay slope of the backward-integrated response down to a level 5 dB above the noise floor.

Faiget et al. [19] propose a simple but systematic method for postprocessing noisy impulse responses. The latter part of a response is used for the estimation of the background noise level by means of a regression line. Another regression line is used for the decay, and the end of the useful response is determined at the crossing point of the decay and background noise regression lines. The decay parameter fitting interval ends 5 dB above the noise floor.

2.5.2 Lundeby's Method

Lundeby et al. [8] presented an algorithm for automatically determining the background noise level, the decay-noise truncation point, and the late decay slope of an impulse response. The steps of the algorithm are as follows.

1) The squared impulse response is averaged into local time intervals in the range of 10–50 ms to yield a smooth curve without losing short decays.

2) A first estimate for the background noise level is determined from a time segment containing the last 10% of the impulse response. This gives a reasonable statistical selection without a large systematic error if the decay continues to the end of the response.

3) The decay slope is estimated using linear regression between the time interval containing the response 0-dB peak and the first interval 5–10 dB above the background noise level.

4) A preliminary crosspoint is determined at the intersection of the decay slope and the background noise level.

5) A new time interval length is calculated according to the calculated slope, so that there are 3–10 intervals per 10 dB of decay.

6) The squared impulse response is averaged into the new local time intervals.

7) The background noise level is determined again. The evaluated noise segment should start from a point corresponding to 5–10 dB of decay after the crosspoint, or a minimum of 10% of the total response length.

8) The late decay slope is estimated for a dynamic range of 10–20 dB, starting from a point 5–10 dB above the noise level.

9) A new crosspoint is found.

Steps 7–9 are iterated until the crosspoint is found to converge (maximum five iterations).

The response analysis may be further enhanced by estimating the amount of energy under the decay curve after the truncation point. The measured decay curve is artificially extended beyond the point of truncation by extrapolating the regression line on the late decay curve to infinity. The total compensation energy is formed as an ideal exponential decay process, the parameters of which are calculated from the late decay slope.

2.5.3 Subtraction of Noise Floor Level

Chu [18] proposed a subtraction method in which the mean-square value of the background noise is subtracted from the original squared impulse response before the backward integration. Curve (g) in Fig. 2 illustrates this case. If the noise floor estimate is accurate and the noise is stationary, the resulting backward-integrated curve is close to the ideal decay curve.

2.5.4 Hirata's Method

Hirata [7] has proposed a simple method for improving the signal-to-noise ratio by replacing the squared single impulse response h²(t) with the product of two impulse




The measured impulse responses consist of the decay terms h1(t), h2(t) and the noise terms n1(t), n2(t). The highly correlated decay terms h1(t) and h2(t) yield positive values corresponding to the squared response h²(t), whereas the mutually uncorrelated noise terms n1(t) and n2(t) are seen as a random fluctuation K(t) superposed on the first term. Hirata's method relies on the impulse response at large time values being stationary. This condition is often not met in practical concert hall measurements.

Curves (h) in Fig. 2 illustrate a few decay curves obtained by backward integration with Hirata's method. In this simulated case they correspond approximately to the case of curve (g), the noise floor subtraction technique.
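A sketch of the idea (our own simulation): backward integration of the product of two independently noisy measurements suppresses the noise-energy bias that plain squaring leaves in the tail.

```python
import numpy as np

rng = np.random.default_rng(3)
fs = 1000
t = np.arange(fs) / fs
clean = np.exp(-10.0 * t) * np.sin(2 * np.pi * 80 * t)
sigma = 0.03
# two measurements of the same decay with independent noise
m1 = clean + sigma * rng.standard_normal(t.size)
m2 = clean + sigma * rng.standard_normal(t.size)

sq = np.cumsum((m1 ** 2)[::-1])[::-1]     # plain Schroeder tail energy
prod = np.cumsum((m1 * m2)[::-1])[::-1]   # Hirata: uncorrelated noise cancels

# At t = 0.5 s the true tail energy is tiny; m1**2 carries a large noise
# bias (about sigma**2 per sample), while the product fluctuates near zero.
bias_ratio = abs(prod[500]) / sq[500]
```

The remaining fluctuation of the product term is the K(t) of Eq. (9); it has zero mean but does not vanish for any single pair of measurements.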

2.5.5 Other Methods

Under adverse noise conditions, a direct determination of the T30 decay curve from the squared and time-averaged impulse response has been noted to be more robust than the backward integration method (Satoh et al. [20]).

3 NONLINEAR OPTIMIZATION OF A DECAY-PLUS-NOISE MODEL

The nonlinear regression (optimization) method proposed by Xiang [9] was briefly described earlier. In the present study we worked along similar ideas, using nonlinear optimization for improved robustness and accuracy. In the following we introduce the nonlinear decay-plus-noise model and its application in several cases.

Let us assume that in noiseless conditions the system under study results in a simple exponential decay of the response envelope, corrupted by additive stationary background noise. We will study two cases that fit into the same modeling category. In the first case there is a single mode (a complex conjugate pole pair in the transfer function) that in the time domain corresponds to an exponential decay function,

h(t) = Am e^(−τm t) sin(ωm t + φm). (10)

Here Am is the initial envelope amplitude of the decaying sinusoid, τm is a coefficient that defines the decay rate, ωm is the angular frequency of the mode, and φm is the initial phase of the modal oscillation.

The second case that leads to a similar formulation is where we have a high density of modes (diffuse sound field) with exponential decay, resulting in an exponentially decaying noise signal,

h(t) = Ad n(t) e^(−τd t) (11)

where Ad is the initial rms level of the response, τd is a decay rate parameter, and n(t) is stationary Gaussian noise with an rms level of 1 (= 0 dB).

In both Eqs. (10) and (11) we assume that a practical measurement of the system impulse response is corrupted with additive stationary noise,

nb(t) = An n(t) (12)

where An is the rms level of the Gaussian measurement noise in the analysis bandwidth of interest, and it is assumed to be uncorrelated with the decaying system response. Statistically the rms envelope of the measured response is then

a(t) = sqrt[ h²(t) + nb²(t) ] = sqrt[ A² e^(−2τt) + An² ]. (13)

This is a simple decay model that can be used for parametric analysis of noise-corrupted measurements. If the amplitude envelope of a specific measurement is y(t), then an optimized least-squares (LS) error estimate for the parameters A, τ, An can be achieved by minimizing the following expression over a time span [t0, t1] of interest:

min_{A,τ,An} ∫_{t0}^{t1} [a(t) − y(t)]² dt. (14)

Since the model of Eq. (13) is nonlinear in the parameters A, τ, An, nonlinear LS optimization is needed to search for the minimum LS error.

By numerical experimentation with real measurement data it is easy to observe that LS fitting of the model of Eq. (14) places emphasis on large magnitude values, whereby noise floors well below the signal starting level are estimated poorly. In order to improve the optimization, a generalized form of model fitting can be formulated as

min_{A,τ,An} ∫_{t0}^{t1} [ f(a(t), t) − f(y(t), t) ]² dt (15)

where f(y, t) is a mapping with balanced weight for different envelope level values and time moments.

The choice of f(y, t) = 20 log10[y(t)] results in fitting


responses measured separately at the same position,

∫_t^∞ h²(τ) dτ ≈ ∫_t^∞ [h1(τ) + n1(τ)][h2(τ) + n2(τ)] dτ
= ∫_t^∞ [h1(τ)h2(τ) + h1(τ)n2(τ) + h2(τ)n1(τ) + n1(τ)n2(τ)] dτ
= ∫_t^∞ h1(τ)h2(τ) [1 + n1(τ)/h1(τ) + n2(τ)/h2(τ) + n1(τ)n2(τ)/(h1(τ)h2(τ))] dτ
≈ ∫_t^∞ h²(τ) dτ [1 + K(t)]. (9)



on the dB scale. It turns out that low-level noise easily has a dominating role in this formulation. A better result in model fitting can be achieved by using a power-law scaling f(y, t) = y^s(t) with exponent s < 1, which is a compromise between amplitude and logarithmic scaling. A value of s = 0.5 has been found to be a useful default value.8

A time-dependent part of the mapping f(y, t), if needed, can be separated as a temporal weighting function w(t). A generalized form of the entire optimization is now to find

min_{A,τ,An} ∫_{t0}^{t1} [ w(t) a^s(t) − w(t) y^s(t) ]² dt. (16)

There is no clear physical motivation for the magnitude compression exponent s. A specific temporal weighting function w(t) can be applied case by case, based on extra knowledge of the behavior of the system under study and the goals of the analysis, such as focusing on the early decay time (early reverberation) of a room response.

The strengths of the nonlinear optimization method are apparent especially under extreme SNR conditions, where all three parameters A, τ, An are needed with greatest accuracy. This occurs both at very low SNRs, where the signal is practically buried in background noise, and at the other extreme, where the noise floor is not reached within the measured impulse response but an estimate of the noise level is nevertheless desired. A necessary assumption for the method to work in such cases is that the decay model is valid, implying an exponential decay and a stationary noise floor.

Experiments show that the model is useful for both single-mode decay and reverberant acoustic field decay models. Fig. 3 depicts three illustrative examples of decay model fitting to a single mode plus noise at an initial level of 0 dB and different noise floor levels. Because the noisy responses are simulated, it is easy to evaluate the estimation accuracy of each parameter. White curves show the estimated behavior of the decay-plus-noise model. In Fig. 3(a) the SNR is only 6 dB. Errors in the parameters in this case are a 0.5-dB underestimate of A, a 3.5% underestimate in the decay time related to parameter τ, and a 1.8-dB overestimate of the noise floor An. In Fig. 3(b) a similar case is shown with a moderate 30-dB SNR. Estimation errors of the parameters are 0.2 dB for A, 2.8% for the decay time, and 1.2 dB for An. In the third case [Fig. 3(c)] the SNR is 60 dB, so that the noise floor is barely reached within the analysis window. In this case the estimation errors are 0.002 dB for A, 0.07% for the decay time, and 1.0 dB for An. This shows that the noise floor is estimated with high accuracy, even in this extreme case.

The nonlinear optimization used in this study is based on the Matlab function curvefit,9 and the functions that implement the weighting by parameter s and the weighting function w(t) can be found at http://www.acoustics.hut.fi/software/decay.

The optimization routines are found to converge robustly in most cases, including such extreme cases as Fig. 3(a) and (c), and the initial values of the parameters for iteration are not critical. However, it is possible that in rare cases the optimization diverges and no (not even a local) optimum is found.10 It would be worth working out a dedicated optimization routine guaranteeing a result in minimal computation time.

Our experience with the nonlinear decay parameter fitting described here is that it still needs some extra information or top-level iteration for the very best results. It is advantageous to select the analysis frame so that the noise floor is reached neither too early nor too late. If the noise floor is reached in the very beginning of the frame, the decay may be missed. Not reaching the noise floor in the frame is a problem only if the estimate of this level is important. A rule for an optimal value of the scaling parameter s is to use s ≈ 1.0 for very low SNRs, such as in Fig. 3(a), and let it approach a value of 0.4–0.5 when the noise floor is low, as in Fig. 3(c) (see also Fig. 4).
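As a sketch of the fitting procedure of Eqs. (13)-(16) (our own code using scipy's least-squares solver rather than the Matlab curvefit mentioned above; w(t) = 1, s = 0.5, and all parameter values are illustrative):

```python
import numpy as np
from scipy.optimize import least_squares

def envelope_model(p, t):
    """Decay-plus-noise rms envelope, Eq. (13)."""
    A, tau, An = p
    return np.sqrt(A ** 2 * np.exp(-2.0 * tau * t) + An ** 2)

def residual(p, t, y, s=0.5):
    """Magnitude-compressed error of Eq. (16) with w(t) = 1."""
    return envelope_model(p, t) ** s - y ** s

t = np.arange(1000) / 1000.0
y = envelope_model([1.0, 8.0, 0.003], t)   # synthetic measured envelope

fit = least_squares(residual, x0=[0.5, 3.0, 0.03], args=(t, y),
                    bounds=([1e-6, 1e-3, 1e-9], [10.0, 100.0, 1.0]))
A, tau, An = fit.x   # recovers A = 1, tau = 8, An = 0.003 (about -50 dB)
```

The compression exponent s is passed through the residual; setting s = 1 reproduces the plain LS fit of Eq. (14), which weights the loud early part much more heavily than the noise floor region.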

4 COMPARISON OF DECAY PARAMETER ESTIMATION METHODS

The accuracy and robustness of the methods for decay parameter estimation can be evaluated by using synthetic


8 Interestingly enough, this resembles the loudness scaling in auditory perception known from psychoacoustics [21].

9 In new versions of Matlab, the function curvefit is recommended to be replaced by the function lsqcurvefit.

10 The function curvefit also prints warnings of computational precision problems even when optimization results are excellent.

Fig. 3. Nonlinear optimization of decay-plus-noise model for three synthetic noisy responses with initial level of 0 dB and various noise levels. (a) −6 dB. (b) −30 dB. (c) −60 dB. Black curves––Hilbert envelopes of simulated responses; white curves––estimated decay behavior.




decay signals or envelope curves, computed for sets of the parameters A, τ, An. By repeating the same for different methods, their relative performances can be compared. In this section we present results from a comparison of the proposed nonlinear optimization and the method of Lundeby et al. [8].

The accuracy of the two methods was analyzed in the following setting. A decaying sinusoid of 1 kHz with a 60-dB decay time (reverberation time) of 1 second was contaminated with white noise of Gaussian distribution and zero mean. The ratio of the initial sinusoid level to the background noise was varied from 0 to 80 dB in steps of 10 dB. Each method under study was applied to analyze the decay parameters, and the error with respect to the "true" value was computed in dB for the initial and noise floor levels and as a percentage for the decay time.
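This test setup is easy to reproduce; a sketch of the synthetic signal generator (our own code, using the T60 = ln(10^3)/τ relation of Eq. (3); the function name and defaults are illustrative):

```python
import numpy as np

def make_test_signal(snr_db, t60=1.0, f=1000.0, fs=8000.0, dur=1.5, seed=0):
    """Decaying sinusoid with a 60-dB decay time of t60 seconds, plus
    zero-mean white Gaussian noise at the given initial-level SNR."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(dur * fs)) / fs
    tau = np.log(10.0 ** 3) / t60            # Eq. (3): T60 = ln(10^3)/tau
    clean = np.exp(-tau * t) * np.sin(2 * np.pi * f * t)
    noise = 10.0 ** (-snr_db / 20.0) * rng.standard_normal(t.size)
    return t, clean + noise, tau

t, x, tau = make_test_signal(snr_db=40.0)
# by t = 1.2 s the decay is far below the -40 dB noise floor, so the
# tail rms measures the noise level directly
tail_db = 20.0 * np.log10(np.sqrt(np.mean(x[int(1.2 * 8000):] ** 2)))
```

Sweeping snr_db from 0 to 80 dB and applying each estimator to the output reproduces the kind of error-versus-SNR curves shown in Figs. 4 and 5.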

Fig. 4 depicts the results of the evaluation for the nonlinear optimization proposed in this paper. The accuracy of the decay time estimation in Fig. 4(a) is excellent for SNRs above 30 dB and useful (typically below 10% error) even for SNRs of 0–10 dB. The initial level is accurate within 0.1 dB for an SNR above 20 dB and within about 1 dB for an SNR of 0 dB. The noise floor estimate is within approximately 1–2 dB up to an SNR of 60 dB and gives better than a guess up to 70–80 dB of SNR. (Notice that the SNR alone is not important here but rather whether or not the noise floor is reached in the analysis window.)

Fig. 5 plots the same information for decay parameter estimation using the method of Lundeby et al. without noise compensation, implemented by us in Matlab. Since this iterative technique is not developed for extreme SNRs, such as 0 dB, it cannot deal with these cases without extra tricks, and even then it may have severe problems. We used safety settings whereby we did not try to obtain decay time values for SNRs below 20 dB, and the low-SNR parts of the decay parameter estimate curves are omitted.

For moderate SNRs the results of the method are fairly good and robust. The decay time shows a positive bias of a few percent, except for SNRs below 30 dB. The noise floor estimate is reliable in this case only up to about 50 dB SNR. Notice that the method is designed for practical reverberation time measurements rather than for this test case, where it could be tuned to perform better.

5 EXAMPLES OF DECAY PARAMETER ESTIMATION BY NONLINEAR OPTIMIZATION

In this section we present examples of applying the nonlinear estimation of a decay-plus-noise model to typical acoustic and audio applications, including reverberation time estimation, analysis and modeling of low-frequency modes of a room response, and decay rate analysis of plucked string vibration for model-based synthesis applications.

5.1 Reverberation Time Estimation

Estimating the reverberation time of a room or a hall is relatively easy if the decay curve behaves regularly and the noise floor is low enough. In practice the case is often quite different. Here we demonstrate the behavior of the nonlinear optimization method in an example where the measured impulse response includes an initial delay, an irregular initial part, and a relatively high measurement noise floor.

Fig. 6 depicts three different cases of fitting the decay-plus-noise model to this case of a control room with a short reverberation time. In Fig. 6(a) the fitting is applied


Fig. 4. Sine-plus-noise decay parameter estimation errors (average of 20 trials) for the proposed nonlinear optimization method as a function of SNR. (a) Decay time estimation error in %. (b) Initial level estimation error in dB. (c) Noise floor estimation error in dB. Solid line: s = 0.5; dashed line: s = 1.0.

Fig. 5. Decay parameter estimation errors for the Lundeby et al. method as a function of SNR. (a) Decay time estimation error in % with truncated Schroeder integration but without noise compensation. (b) Initial level estimation error in dB. (c) Noise floor estimation error in dB.



PAPERS NOISE RESPONSE MEASUREMENTS

to the entire decay curve, including the initial delay, and the resulting model is clearly biased toward too long a reverberation time. In Fig. 6(b) the initial delay is excluded from model fitting, and the result is better. However, after the direct sound there is a period of only little energy during the first reflections prior to the range of dense reflections and diffuse response. If the reverberation time estimate is to describe the decay of this diffuse part, the case of Fig. 6(c), with a fitting starting from about 30 ms, yields the best match to reverberation decay,11 and the approaching noise floor is also estimated well.
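The fitting described above can be sketched as a nonlinear least-squares problem. The following is a minimal illustration, not the authors' Matlab implementation: the function name and the three-parameter model form (initial level, decay time, noise floor, all fitted in the dB domain) are assumptions of this sketch rather than the paper's exact Eq. (16).

```python
import numpy as np
from scipy.optimize import least_squares

def fit_decay_plus_noise(t, env_db, x0=(0.0, 0.5, -60.0)):
    """Fit initial level (dB), decay time tau (s), and noise floor (dB)
    to a dB-scaled energy envelope, using the model
        p(t) = 10*log10(10^(L/10) * exp(-2 t / tau) + 10^(N/10)).
    A sketch of the decay-plus-noise idea, not the paper's exact formulation."""
    def resid(p):
        level_db, tau, noise_db = p
        lin = 10.0**(level_db/10.0) * np.exp(-2.0*t/tau) + 10.0**(noise_db/10.0)
        return 10.0*np.log10(lin) - env_db
    sol = least_squares(resid, x0=np.asarray(x0),
                        bounds=([-np.inf, 1e-4, -150.0], [np.inf, 10.0, 30.0]))
    level_db, tau, noise_db = sol.x
    t60 = 3.0 * np.log(10.0) * tau   # exp(-t/tau) falls 60 dB at t = 3 ln(10) tau
    return level_db, tau, noise_db, t60
```

Fitting over different time ranges, as in Fig. 6(a)–(c), simply means passing different slices of `t` and `env_db` to the same routine.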

5.2 Modeling of Low-Frequency Room Modes

The next case deals with the modeling of the low-frequency modes of a room. Below a critical frequency (the so-called Schroeder frequency) the mode density is low and individual modes can be decomposed from the measured room impulse response. The task here was to find the most prominent modes and to analyze their modal parameters fm and τm, the frequency and decay parameters, respectively. The case studied was a hard-walled, partially damped room with moderate reverberation time (≈ 1 second) at mid and high frequencies, but much longer decay times at the lowest modal frequencies. The following procedure was applied:

• A short-time Fourier analysis of the measured impulse response was computed to yield the time–frequency representation shown in Fig. 7 as a waterfall plot.

• At each frequency bin (1.3-Hz spacing is used) the dB-scaled energy–time decay trajectory was fitted to the decay-plus-noise model with the nonlinear optimization technique to obtain the optimal decay parameter τ.

• Based on decay parameter values and spectral levels, a rule was written to pick up the most prominent modal frequencies and the related decay parameter values.
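A simplified version of this per-bin procedure can be sketched as follows. Here a straight-line fit in dB (rather than the paper's nonlinear decay-plus-noise fit) estimates a decay time for each STFT bin above an assumed noise floor; `modal_decay_map` and the threshold value are illustrative names and choices, not from the paper.

```python
import numpy as np
from scipy.signal import stft

def modal_decay_map(ir, fs, nperseg=1024, floor_db=-80.0):
    """Per-bin T60 estimates from an STFT of an impulse response.
    For each frequency bin, fit a line (in dB) to the energy-time
    trajectory above floor_db and convert the slope to a decay time."""
    f, t, Z = stft(ir, fs=fs, nperseg=nperseg, noverlap=nperseg//2,
                   boundary=None, padded=False)
    mag_db = 20.0*np.log10(np.abs(Z) + 1e-12)
    t60 = np.full(len(f), np.nan)
    for i, traj in enumerate(mag_db):
        use = traj > floor_db              # ignore the noise-floor region
        if use.sum() < 3:
            continue
        slope = np.polyfit(t[use], traj[use], 1)[0]   # dB per second
        if slope < 0:
            t60[i] = -60.0 / slope
    return f, t60
```

Prominent modes can then be picked as bins whose spectral level is a local maximum and whose decay time clearly exceeds that of neighboring bins, in the spirit of the rule described above.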

In this context we are interested in how well the decay parameter estimation worked with noisy measurements. Application of the nonlinear optimization resulted in decay curve fits, some of which are illustrated in Fig. 8, by comparing the original decay and the decay-plus-noise model behavior. For all frequencies in the vicinity of a mode the model fits robustly and accurately.12

5.3 Analysis of Decay Rate of Plucked String Tones

A model-based synthesis of string tones can produce realistic guitar-like tones if the parameter values of the synthesis model are calibrated based on recordings [2]. The main properties of tones that need to be analyzed are their fundamental frequency and the decay time of all harmonic partials that are audible. While estimating the fundamental frequency is quite easy, measurement of the decay times of harmonics (modes of the string) is complicated by the fact that they all have a different rate of decay and also the initial level can vary within a range of 20–30 dB. There may also be no information about the noise floor level for all harmonics.

One method used for measuring the decay times is based on the short-time Fourier analysis. A recorded single guitar tone is sliced into frames with a window function in the time domain. Each windowed frame is then Fourier transformed with the fast Fourier transform using zero padding to increase the spectral resolution, and harmonic peaks are hunted from the magnitude spectrum using a peak-picking algorithm. The peak values from the consecutive frames are organized as tracks, which correspond to the temporal envelopes of the harmonics. Then it becomes possible to estimate the decay rate of each harmonic mode. In the following we show how this works with the proposed decay parameter estimation algorithm. Finally the decay rate of each harmonic is converted into a corresponding target response, which is used for designing the magnitude response of a digital filter that controls the decay of harmonics in the synthesis model.
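The frame-slicing, zero-padding, and peak-picking steps can be sketched as follows; the function name, the parameter values, and the ±0.4·f0 search band are illustrative choices for this sketch, not taken from the paper.

```python
import numpy as np
from scipy.signal import stft

def harmonic_envelope(x, fs, f0, m, nperseg=2048, zero_pad=4):
    """dB envelope of harmonic m of a tone with fundamental f0 (Hz):
    windowed frames, zero-padded FFTs, and a peak pick in a band
    around the expected harmonic frequency m*f0."""
    nfft = zero_pad * nperseg                 # zero padding -> finer bin spacing
    f, t, Z = stft(x, fs=fs, nperseg=nperseg, noverlap=nperseg//2,
                   nfft=nfft, boundary=None, padded=False)
    df = fs / nfft
    center = int(round(m * f0 / df))
    half = int(round(0.4 * f0 / df))          # search band: +/- 0.4*f0
    band = np.abs(Z[max(center - half, 0):center + half + 1, :])
    peak = band.max(axis=0)                   # peak magnitude per frame
    return t, 20.0 * np.log10(peak + 1e-12)
```

The slope of the returned trajectory (in dB per second) then gives the decay rate of harmonic m, to which a decay model can be fitted as in the previous subsections.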

Fig. 9 plots three examples of modal decay analysis of guitar string harmonics (string 5, open string). Harmonic envelope trajectories were analyzed as described. The decay-plus-noise model was fitted in a time window that started from the maximum value position of the envelope curve. In Fig. 9(a) the second harmonic shows a highly regular decay after an initial transient of plucking, whereby decay fitting is almost perfect. Fig. 9(b), harmonic 24,


[Fig. 6 plots not reproduced. Estimated reverberation times: (a) T = 0.52 s; (b) T = 0.33 s; (c) T = 0.26 s.]
11 In this example, decay parameter analysis is applied to the entire frequency range of the impulse response. In practice it is computed as a function of frequency, that is, applied to octave or one-third-octave band decay curves.

12 To obtain the best frequency resolution it may be desirable to replace short-time spectral analysis with energy-decay curves obtained by backward integration and the model of Eq. (16) with a corresponding formulation derived from Xiang's formula, Eq. (7).

Fig. 6. Decay-plus-noise model fitting by nonlinear optimization to a room impulse response. (a) Fitting range includes initial delay, transient phase, and decay. (b) Fitting includes transient phase and decay. (c) Fitting includes only decay phase. Estimated T60 values are given.


KARJALAINEN ET AL. PAPERS

depicts a strongly beating decay, where probably the horizontal and vertical polarizations have a frequency difference that after summation results in beating. Fig. 9(c), harmonic 54, shows a trajectory where the noise floor is reached within the analysis window. In all cases shown the nonlinear optimization works as perfectly as a simple decay model can do.

As can be concluded from Fig. 9(b), a string can exhibit more complicated behavior than simple exponential decay. Even more complex is the case of piano tones because there are two to three strings slightly off tune, and the envelope fluctuation can be more irregular. Two-stage decay is also common, where the initial decay is faster than later decay [22].

In all such cases a more complex decay model is needed to achieve a good match with measured data. Such techniques are studied in [17].

6 SUMMARY AND CONCLUSIONS

An overview of modal decay analysis methods for noisy impulse response measurements of reverberant acoustic systems has been presented, and further improvements were introduced. The problem of decay time determination is important, for example, in room acoustics for characterizing the reverberation time. Another application where a similar problem is encountered is the estimation of string model parameters for model-based synthesis of


Fig. 7. Time–frequency plot of room response at low frequencies. Lowest modes (especially 40 Hz) show long decay times.

Fig. 8. Fitting of decay-plus-noise model to low-frequency modal data of room (see Fig. 7). (a) At 40 Hz. (b) At 104 Hz. (c) At 117 Hz (off-mode fast decay). – – – measured; –––– modeled.

Fig. 9. Examples of modal decay matching for harmonic components of guitar string. (a) Regular decay after initial transient. (b) Strongly beating decay (double mode). (c) Fast decay that reaches noise floor. – – – measured envelope; –––– optimized model fit.


THE AUTHORS



plucked string instruments.

It is shown that the developed decay-plus-noise model yields highly accurate decay parameter estimates, outperforming traditional methods, especially under extreme SNR conditions.

There exist other methods, such as AR modeling, that show potential in specific applications. Challenges for further research are to make modal decay methods (with an increased number of parameters) able to analyze complex decay characteristics, such as double decay behavior and strongly fluctuating responses due to two or more modes very close in frequency.

A Matlab code for the nonlinear optimization of decay parameters, including data examples, can be found at http://www.acoustics.hut.fi/software/decay.

7 ACKNOWLEDGMENT

This study is part of the VÄRE technology program, project TAKU (Control of Closed Space Acoustics), funded by Tekes (National Technology Agency). The work of Vesa Välimäki has been financed by the Academy of Finland.

8 REFERENCES

[1] ISO 3382-1997, “Acoustics––Measurement of the Reverberation Time of Rooms with Reference to Other Acoustical Parameters,” International Standards Organization, Geneva, Switzerland (1997), 21 pp.

[2] V. Välimäki, J. Huopaniemi, M. Karjalainen, and Z. Jánosy, “Physical Modeling of Plucked String Instruments with Application to Real-Time Sound Synthesis,” J. Audio Eng. Soc., vol. 44, pp. 331–353 (1996 May).

[3] M. R. Schroeder, “New Method of Measuring Reverberation Time,” J. Acoust. Soc. Am., vol. 37, pp. 409–412 (1965).

[4] M. R. Schroeder, “Integrated-Impulse Method Measuring Sound Decay without Using Impulses,” J. Acoust. Soc. Am., vol. 66, pp. 497–500 (1979 Aug.).

[5] D. Morgan, “A Parametric Error Analysis of the Backward Integration Method for Reverberation Time Estimation,” J. Acoust. Soc. Am., vol. 101, pp. 2686–2693 (1997 May).

[6] L. Faiget, C. Legros, and R. Ruiz, “Optimization of the Impulse Response Length: Application to Noisy and Highly Reverberant Rooms,” J. Audio Eng. Soc., vol. 46, pp. 741–750 (1998 Sept.).

[7] Y. Hirata, “A Method of Eliminating Noise in Power Responses,” J. Sound Vib., vol. 82, pp. 593–595 (1982).

[8] A. Lundeby, T. E. Vigran, H. Bietz, and M. Vorländer, “Uncertainties of Measurements in Room Acoustics,” Acustica, vol. 81, pp. 344–355 (1995).

[9] N. Xiang, “Evaluation of Reverberation Times Using a Nonlinear Regression Approach,” J. Acoust. Soc. Am., vol. 98, pp. 2112–2121 (1995 Oct.).

[10] W. C. Sabine, Architectural Acoustics (1900; reprinted by Dover, New York, 1964).

[11] L. Beranek, Acoustic Measurements (1949; reprinted by Acoustical Society of America, 1988).

[12] A. V. Oppenheim and R. W. Schafer, Digital Signal Processing (Prentice-Hall, Englewood Cliffs, NJ, 1975), chap. 10, pp. 480–531.

[13] MathWorks, Inc., MATLAB User’s Guide (2001).

[14] S. M. Kay, Fundamentals of Statistical Signal Processing, vol. I, Estimation Theory (Prentice-Hall, Englewood Cliffs, NJ, 1993).

[15] J. D. Markel and J. A. H. Gray, Linear Prediction of Speech (Springer, Berlin, 1976).

[16] MathWorks, Inc., MATLAB Signal Processing Toolbox, User’s Guide (2001).

[17] M. Karjalainen, P. A. A. Esquef, P. Antsalo, A. Mäkivirta, and V. Välimäki, “Frequency-Zooming ARMA Modeling of Resonant and Reverberant Systems,” to be published in vol. 50 (2002 Dec.).

[18] W. T. Chu, “Comparison of Reverberation Measurements Using Schroeder’s Impulse Method and Decay-Curve Averaging Method,” J. Acoust. Soc. Am., vol. 63, pp. 1444–1450 (1978 May).

[19] L. Faiget, R. Ruiz, and C. Legros, “Estimation of Impulse Response Length to Compute Room Acoustical Criteria,” Acustica, vol. 82, suppl. 1, p. S148 (1996 Sept.).

[20] F. Satoh, Y. Hidaka, and H. Tachibana, “Reverberation Time Directly Obtained from Squared Impulse Response Envelope,” in Proc. Int. Congr. on Acoustics, vol. 4 (Seattle, WA, 1998 June), pp. 2755–2756.

[21] E. Zwicker and H. Fastl, Psychoacoustics––Facts and Models (Springer, Berlin, 1990).

[22] G. Weinreich, “Coupled Piano Strings,” J. Acoust. Soc. Am., vol. 62, pp. 1474–1484 (1977 Dec.).





Poju Pietari Antsalo was born in Helsinki, Finland, in 1971. He studied electrical and communications engineering at the Helsinki University of Technology, and obtained a master’s degree in acoustics and audio signal processing in 1999. He has carried out research on room acoustics and audio reproduction, particularly at low frequencies.

Timo Peltonen was born in Espoo, Finland, in 1972. He received an M.Sc. degree in electrical engineering from the Helsinki University of Technology in 2000. His master's thesis addressed the measurement and room acoustic analysis of multichannel impulse responses.

He is the author of IRMA, the Matlab-based measurement and analysis software. Currently he is working as an acoustical consultant at Akukon Consulting Engineers Ltd., Helsinki, Finland.

Mr. Peltonen’s interests include acoustical measurement techniques and signal processing methods for room acoustical analysis.

He is a member of the AES, a board member of the AES Finnish Section, and a member of the Acoustical Society of Finland. He is the author of several AES conference papers.

The biographies of Matti Karjalainen and Vesa Välimäki were published in the 2002 April issue of the Journal.

The biography of Aki Mäkivirta was published in the 2002 July/August issue of the Journal.



0 INTRODUCTION

The method of reassignment has been used to sharpen spectrograms in order to make them more readable [1], [2], to measure sinusoidality, and to ensure optimal window alignment in the analysis of musical signals [3]. We use time–frequency reassignment to improve our bandwidth-enhanced additive sound model. The bandwidth-enhanced additive representation is in some way similar to traditional sinusoidal models [4]–[6] in that a waveform is modeled as a collection of components, called partials, having time-varying amplitude and frequency envelopes. Our partials are not strictly sinusoidal, however. We employ a technique of bandwidth enhancement to combine sinusoidal energy and noise energy into a single partial having time-varying amplitude, frequency, and bandwidth parameters [7], [8].

Additive sound models applicable to polyphonic and nonharmonic sounds employ long analysis windows, which can compromise the time resolution and phase accuracy needed to preserve the temporal shape of transients. Various methods have been proposed for representing transient waveforms in additive sound models. Verma and Meng [9] introduce new component types specifically for modeling transients, but this method sacrifices the homogeneity of the model. A homogeneous model, that is, a model having a single component type, such as the breakpoint parameter envelopes in our reassigned bandwidth-enhanced additive model [10], is critical for many kinds of manipulations [11], [12]. Peeters and Rodet [3] have developed a hybrid analysis/synthesis system that eschews high-level transient models and retains unabridged OLA (overlap–add) frame data at transient positions. This hybrid representation represents unmodified transients perfectly, but also sacrifices homogeneity. Quatieri et al. [13] propose a method for preserving the temporal envelope of short-duration complex acoustic signals using a homogeneous sinusoidal model, but it is inapplicable to sounds of longer duration, or sounds having multiple transient events.

We use the method of reassignment to improve the time and frequency estimates used to define our partial parameter envelopes, thereby enhancing the time–frequency resolution of our representation, and improving its phase accuracy. The combination of time–frequency reassignment and bandwidth enhancement yields a homogeneous model (that is, a model having a single component type) that is capable of representing at high fidelity a wide variety of sounds, including nonharmonic, polyphonic, impulsive, and noisy sounds. The reassigned bandwidth-enhanced sound model is robust under transformation, and the fidelity of the representation is preserved even under time dilation and other model-domain modifications. The homogeneity and robustness of the reassigned bandwidth-enhanced model make it particularly well suited for such manipulations as cross synthesis and sound morphing.

Reassigned bandwidth-enhanced modeling and rendering and many kinds of manipulations, including morphing, have been implemented in the open-source C++ class library Loris [14], and a stream-based, real-time implementation of bandwidth-enhanced synthesis is available in the Symbolic Sound Kyma environment [15].


On the Use of Time–Frequency Reassignment in Additive Sound Modeling*

KELLY FITZ, AES Member, AND LIPPOLD HAKEN, AES Member

Department of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164

A method of reassignment in sound modeling to produce a sharper, more robust additive representation is introduced. The reassigned bandwidth-enhanced additive model follows ridges in a time–frequency analysis to construct partials having both sinusoidal and noise characteristics. This model yields greater resolution in time and frequency than is possible using conventional additive techniques, and better preserves the temporal envelope of transient signals, even in modified reconstruction, without introducing new component types or cumbersome phase interpolation algorithms.

* Manuscript received 2001 December 20; revised 2002 July 23 and 2002 September 11.


FITZ AND HAKEN PAPERS

1 TIME–FREQUENCY REASSIGNMENT

The discrete short-time Fourier transform is often used as the basis for a time–frequency representation of time-varying signals, and is defined as a function of time index n and frequency index k as

X_n(k) = \sum_{l=-\infty}^{\infty} h(l - n)\, x(l)\, e^{-j 2\pi k l / N}    (1)

       = e^{-j 2\pi k n / N} \sum_{l=-\infty}^{\infty} h(l)\, x(n + l)\, e^{-j 2\pi k l / N}    (2)

where h(n) is a sliding window function equal to 0 for n < −(N − 1)/2 and n > (N − 1)/2 (for N odd), so that X_n(k) is the N-point discrete Fourier transform of a short-time waveform centered at time n.
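As a concrete reading of Eq. (1) (whose typesetting was corrupted in this transcript, so treat the exact form as an assumption), a direct-evaluation sketch for one frame:

```python
import numpy as np

def stft_frame(x, h, n):
    """Direct evaluation of one STFT frame,
        X_n(k) = sum_l h(l - n) x(l) exp(-j 2 pi k l / N),
    for an odd-length window h stored on offsets -(N-1)/2 .. (N-1)/2.
    A sketch of one consistent reading of Eq. (1), not a fast implementation."""
    N = len(h)
    half = (N - 1) // 2
    l = np.arange(n - half, n + half + 1)            # absolute sample indices
    inside = (l >= 0) & (l < len(x))
    seg = np.where(inside, x[np.clip(l, 0, len(x) - 1)], 0.0)
    k = np.arange(N)
    return np.exp(-2j*np.pi*np.outer(k, l)/N) @ (h * seg)
```

Equivalently, per Eq. (2), this equals the N-point FFT of the windowed segment, circularly shifted so the window center sits at index 0, times the phase factor exp(−j2πkn/N) — a useful consistency check and a much faster implementation.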

Short-time Fourier transform data are sampled at a rate equal to the analysis hop size, so data in derivative time–frequency representations are reported on a regular temporal grid, corresponding to the centers of the short-time analysis windows. The sampling of these so-called frame-based representations can be made as dense as desired by an appropriate choice of hop size. However, temporal smearing due to long analysis windows needed to achieve high-frequency resolution cannot be relieved by denser sampling.

Though the short-time phase spectrum is known to contain important temporal information, typically only the short-time magnitude spectrum is considered in the time–frequency representation. The short-time phase spectrum is sometimes used to improve the frequency estimates in the time–frequency representation of quasi-harmonic sounds [16], but it is often omitted entirely, or used only in unmodified reconstruction, as in the basic sinusoidal model, described by McAulay and Quatieri [4].

The so-called method of reassignment computes sharpened time and frequency estimates for each spectral component from partial derivatives of the short-time phase spectrum. Instead of locating time–frequency components at the geometrical center of the analysis window (t_n, ω_k), as in traditional short-time spectral analysis, the components are reassigned to the center of gravity of their complex spectral energy distribution, computed from the short-time phase spectrum according to the principle of stationary phase [17, ch. 7.3]. This method was first developed in the context of the spectrogram and called the modified moving window method [18], but it has since been applied to a variety of time–frequency and time-scale transforms [1].

The principle of stationary phase states that the variation of the Fourier phase spectrum not attributable to periodic oscillation is slow with respect to frequency in certain spectral regions, and in surrounding regions the variation is relatively rapid. In Fourier reconstruction, positive and negative contributions to the waveform cancel in frequency regions of rapid phase variation. Only regions of slow phase variation (stationary phase) will contribute significantly to the reconstruction, and the maximum contribution (center of gravity) occurs at the point where the phase is changing most slowly with respect to time and frequency.

In the vicinity of t = τ (that is, for an analysis window centered at time t = τ), the point of maximum spectral energy contribution has time–frequency coordinates that satisfy the stationarity conditions

\frac{\partial}{\partial \omega} \big[ \phi(\tau, \omega) + \omega (t - \tau) \big] = 0    (3)

\frac{\partial}{\partial \tau} \big[ \phi(\tau, \omega) + \omega (t - \tau) \big] = 0    (4)

where φ(τ, ω) is the continuous short-time phase spectrum and ω(t − τ) is the phase travel due to periodic oscillation [18]. The stationarity conditions are satisfied at the coordinates

\hat{t}(\tau, \omega) = \tau - \frac{\partial \phi(\tau, \omega)}{\partial \omega}    (5)

\hat{\omega}(\tau, \omega) = \frac{\partial \phi(\tau, \omega)}{\partial \tau}    (6)

representing group delay and instantaneous frequency, respectively.

Discretizing Eqs. (5) and (6) to compute the time and frequency coordinates numerically is difficult and unreliable, because the partial derivatives must be approximated. These formulas can be rewritten in the form of ratios of discrete Fourier transforms [1]. Time and frequency coordinates can be computed using two additional short-time Fourier transforms, one employing a time-weighted window function and one a frequency-weighted window function.

Since time estimates correspond to the temporal center of the short-time analysis window, the time-weighted window is computed by scaling the analysis window function by a time ramp from −(N − 1)/2 to (N − 1)/2 for a window of length N. The frequency-weighted window is computed by wrapping the Fourier transform of the analysis window to the frequency range [−π, π], scaling the transform by a frequency ramp from −(N − 1)/2 to (N − 1)/2, and inverting the scaled transform to obtain a (real) frequency-scaled window. Using these weighted windows, the method of reassignment computes corrections to the time and frequency estimates in fractional sample units between −(N − 1)/2 and (N − 1)/2. The three analysis windows employed in reassigned short-time Fourier analysis are shown in Fig. 1.
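The window constructions just described can be sketched as follows. Sign and scaling conventions vary in the literature; for a symmetric window the spectrum-ramp product inverts to a purely imaginary sequence, so the real-valued window of Fig. 1(c) is taken here as its imaginary part. That choice, like the function name, is an assumption of this sketch rather than the paper's implementation.

```python
import numpy as np

def reassignment_windows(h):
    """Time-weighted and frequency-weighted windows for an odd-length,
    symmetric analysis window h (cf. Fig. 1)."""
    N = len(h)
    ramp = np.arange(N) - (N - 1)//2             # -(N-1)/2 .. (N-1)/2
    ht = ramp * h                                # time ramp times window
    H = np.fft.fftshift(np.fft.fft(h))           # spectrum ordered over [-pi, pi)
    hf = np.fft.ifft(np.fft.ifftshift(ramp * H)) # frequency ramp, then invert
    return ht, np.imag(hf)                       # imaginary for symmetric h
```

For a symmetric window, `ht` is odd about the window center and so is the returned frequency-weighted window, matching the shapes shown in Fig. 1(b) and (c).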

The reassigned time t̂_{k,n} for the kth spectral component from the short-time analysis window centered at time n (in samples, assuming odd-length analysis windows) is [1]

\hat{t}_{k,n} = n + \Re\!\left[ \frac{X_{t;n}(k)\, X_n^{*}(k)}{|X_n(k)|^2} \right]    (7)

where X_{t;n}(k) denotes the short-time transform computed using the time-weighted window function and ℜ[·] denotes the real part of the bracketed ratio.



PAPERS TIME-FREQUENCY REASSIGNMENT IN SOUND MODELING

The corrected frequency ω̂_{k,n} corresponding to the same component is [1]

\hat{\omega}_{k,n} = k + \Im\!\left[ \frac{X_{f;n}(k)\, X_n^{*}(k)}{|X_n(k)|^2} \right]    (8)

where X_{f;n}(k) denotes the short-time transform computed using the frequency-weighted window function and ℑ[·] denotes the imaginary part of the bracketed ratio. Both t̂_{k,n} and ω̂_{k,n} have units of fractional samples.
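Per-frame reassignment along the lines of Eqs. (7) and (8) can then be sketched as below. The sign conventions follow the reconstructed formulas, which — like the helper names — is an assumption of this sketch.

```python
import numpy as np

def reassigned_coords(x, h, ht, hf, n):
    """Reassigned time (fractional samples) and frequency (fractional
    bins) for each bin of the frame centered at sample n, using the
    plain (h), time-weighted (ht), and frequency-weighted (hf) windows."""
    N = len(h)
    half = (N - 1)//2
    seg = x[n - half : n + half + 1]
    def frame_dft(win):
        # circular shift puts the window center at index 0, so the
        # DFT phase origin sits at sample n
        return np.fft.fft(np.roll(win * seg, -half))
    X, Xt, Xf = frame_dft(h), frame_dft(ht), frame_dft(hf)
    denom = np.abs(X)**2 + 1e-12
    t_hat = n + np.real(Xt * np.conj(X) / denom)             # cf. Eq. (7)
    w_hat = np.arange(N) + np.imag(Xf * np.conj(X) / denom)  # cf. Eq. (8)
    return t_hat, w_hat
```

For an impulse at sample n + d, every bin with appreciable window gain is reassigned to time n + d, illustrating the relocation of off-center transients discussed in Section 3.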

Time and frequency shifts are preserved in the reassignment operation, and energy is conserved in the reassigned time–frequency data. Moreover, chirps and impulses are perfectly localized in time and frequency in any reassigned time–frequency or time-scale representation [1]. Reassignment sacrifices the bilinearity of time–frequency transformations such as the squared magnitude of the short-time Fourier transform, since every data point in the representation is relocated by a process that is highly signal dependent. This is not an issue in our representation, since the bandwidth-enhanced additive model, like the basic sinusoidal model [4], retains data only at time–frequency ridges (peaks in the short-time magnitude spectra), and thus is not bilinear.

Note that since the short-time Fourier transform is invertible, and the original waveform can be exactly reconstructed from an adequately sampled short-time Fourier representation, all the information needed to precisely locate a spectral component within an analysis window is present in the short-time coefficients X_n(k). Temporal information is encoded in the short-time phase spectrum, which is very difficult to interpret. The method of reassignment is a technique for extracting information from the phase spectrum.

2 REASSIGNED BANDWIDTH-ENHANCED ANALYSIS

The reassigned bandwidth-enhanced additive model [10] employs time–frequency reassignment to improve the time and frequency estimates used to define partial parameter envelopes, thereby improving the time–frequency resolution and the phase accuracy of the representation. Reassignment transforms our analysis from a frame-based analysis into a “true” time–frequency analysis. Whereas the discrete short-time Fourier transform defined by Eq. (2) orients data according to the analysis frame rate and the length of the transform, the time and frequency orientation of reassigned spectral data is solely a function of the data themselves.

The method of analysis we use in our research models a sampled audio waveform as a collection of bandwidth-enhanced partials having sinusoidal and noiselike characteristics. Other methods for capturing noise in additive sound models [5], [19] have represented noise energy in fixed frequency bands using more than one component type. By contrast, bandwidth-enhanced partials are defined by a trio of synchronized breakpoint envelopes specifying the time-varying amplitude, center frequency, and noise content for each component. Each partial is rendered by a bandwidth-enhanced oscillator, described by

y(n) = \big[ A(n) + \beta(n)\, \zeta(n) \big] \cos\!\big[ \theta(n) \big]    (9)

where A(n) and β(n) are the time-varying sinusoidal and noise amplitudes, respectively, and ζ(n) is an energy-normalized low-pass noise sequence, generated by exciting a low-pass filter with white noise and scaling the filter gain such that the noise sequence has the same total spectral energy as a full-amplitude sinusoid. The oscillator phase θ(n) is initialized to some starting value, obtained from the reassigned short-time phase spectrum, and updated according to the time-varying radian frequency ω(n) by

\theta(n) = \theta(n - 1) + \omega(n), \qquad n > 0 .    (10)

The bandwidth-enhanced oscillator is depicted in Fig. 2.

We define the time-varying bandwidth coefficient κ(n) as the fraction of total instantaneous partial energy that is attributable to noise. This bandwidth (or noisiness) coefficient assumes values between 0 for a pure sinusoid and 1 for a partial that is entirely narrow-band noise, and varies over time according to the noisiness of the partial. If we represent the total (sinusoidal and noise) instantaneous partial energy as Ã²(n), then the output of the bandwidth-enhanced oscillator is described by

y(n) = \tilde{A}(n) \left[ \sqrt{1 - \kappa(n)} + \sqrt{2 \kappa(n)}\, \zeta(n) \right] \cos\!\big[ \theta(n) \big] .    (11)

The envelopes for the time-varying partial amplitudes and frequencies are constructed by identifying and following


Fig. 1. Analysis windows employed in three short-time transforms used to compute reassigned times and frequencies. (a) Original window function h(n) (a 501-point Kaiser window with shaping parameter 12.0 in this case). (b) Time-weighted window function ht(n) = nh(n). (c) Frequency-weighted window function hf(n).




the ridges on the time–frequency surface. The time-varying partial bandwidth coefficients are computed and assigned by a process of bandwidth association [7].

We use the method of reassignment to improve the time and frequency estimates for our partial parameter envelope breakpoints by computing reassigned times and frequencies that are not constrained to lie on the time–frequency grid defined by the short-time Fourier analysis parameters. Our algorithm shares with traditional sinusoidal methods the notion of temporally connected partial parameter estimates, but by contrast, our estimates are nonuniformly distributed in both time and frequency.

Short-time analysis windows normally overlap in both time and frequency, so time–frequency reassignment often yields time corrections greater than the length of the short-time hop size and frequency corrections greater than the width of a frequency bin. Large time corrections are common in analysis windows containing strong transients that are far from the temporal center of the window. Since we retain data only at time–frequency ridges, that is, at frequencies of spectral energy concentration, we generally observe large frequency corrections only in the presence of strong noise components, where phase stationarity is a weaker effect.

3 SHARPENING TRANSIENTS

Time–frequency representations based on traditional magnitude-only short-time Fourier analysis techniques (such as the spectrogram and the basic sinusoidal model [4]) fail to distinguish transient components from sustaining components. A strong transient waveform, as shown in Fig. 3(a), is represented by a collection of low-amplitude spectral components in early short-time analysis frames, that is, frames corresponding to analysis windows centered earlier than the time of the transient. A low-amplitude periodic waveform, as shown in Fig. 3(b), is also represented by a collection of low-amplitude spectral components. The information needed to distinguish these two critically different waveforms is encoded in the short-time phase spectrum, and is extracted by the method of reassignment.

Time–frequency reassignment allows us to preserve the temporal envelope shape without sacrificing the homogeneity of the bandwidth-enhanced additive model. Components extracted from early or late short-time analysis windows are relocated nearer to the times of transient events, yielding clusters of time–frequency data points, as depicted in Fig. 4. In this way, time reassignment greatly reduces the temporal smearing introduced through the use of long analysis windows. Moreover, since reassignment sharpens our frequency estimates, it is possible to achieve good frequency resolution with shorter (in time) analysis windows than would be possible with traditional methods. The use of shorter analysis windows further improves our time resolution and reduces temporal smearing.

The effect of time–frequency reassignment on the transient response can be demonstrated using a square wave that turns on abruptly, such as the waveform shown in Fig. 5. This waveform, while aurally uninteresting and uninformative, is useful for visualizing the performance of various analysis methods. Its abrupt onset makes temporal smearing obvious, its simple harmonic partial amplitude relationship makes it easy to predict the necessary data for a good time–frequency representation, and its simple waveshape makes phase errors and temporal distortion easy to identify. Note, however, that this waveform is pathological for Fourier-based additive models, and exaggerates all of these problems with such methods. We use it only for the comparison of various methods.

Fig. 6 shows two reconstructions of the onset of a square wave from time–frequency data obtained using overlapping 54-ms analysis windows, with temporal centers separated by 10 ms. This analysis window is long compared to the period of the square wave, but realistic for the case of a polyphonic sound (a sound having multiple simultaneous voices), in which the square wave is one voice. For clarity, only the square wave is presented in this example, and other simultaneous voices are omitted. The square wave

882 J. Audio Eng. Soc., Vol. 50, No. 11, 2002 November

Fig. 2. Block diagram of the bandwidth-enhanced oscillator. Time-varying sinusoidal and noise amplitudes are controlled by A(n) and β(n), respectively; the time-varying center (sinusoidal) frequency is ω(n).

Fig. 3. Windowed short-time waveforms (dashed lines), not readily distinguished in the basic sinusoidal model [4]. Both waveforms are represented by low-amplitude spectral components. (a) Strong transient yields off-center components, having large time corrections (positive in this case because the transient is near the right tail of the window). (b) Sustained quasi-periodic waveform yields time corrections near zero.


PAPERS TIME-FREQUENCY REASSIGNMENT IN SOUND MODELING

has an abrupt onset. The silence before the onset is not shown. Only the first (lowest frequency) five harmonic partials were used in the reconstruction, and consequently the ringing due to the Gibbs phenomenon is evident.

Fig. 6(a) is a reconstruction from traditional, nonreassigned time–frequency data. The reconstructed square wave amplitude rises very gradually and reaches full amplitude approximately 40 ms after the first nonzero sample. Clearly, the instantaneous turn-on has been smeared out by the long analysis window. Fig. 6(b) shows a reconstruction from reassigned time–frequency data. The transient response has been greatly improved by relocating components extracted from early analysis windows (like the one on the left in Fig. 5) to their spectral centers of gravity, closer to the observed turn-on time. The synthesized onset time has been reduced to approximately 10 ms. The corresponding time–frequency analysis data are shown in Fig. 7. The nonreassigned data are evenly distributed in time, so data from early windows (that is, windows centered before the onset time) smear the onset, whereas the reassigned data from early analysis windows are clumped near the correct onset time.

4 CROPPING

Off-center components are short-time spectral components having large time reassignments. Since they represent transient events that are far from the center of the analysis window, and are therefore poorly represented in the windowed short-time waveform, these off-center components introduce unreliable spectral parameter estimates that corrupt our representation, making the model data difficult to interpret and manipulate.

Fortunately, large time corrections make off-center components easy to identify and remove from our model. By removing the unreliable data embodied by off-center components, we make our model cleaner and more robust. Moreover, thanks to the redundancy inherent in short-time analysis with overlapping analysis windows, we do not sacrifice information by removing the unreliable data points. The information represented poorly in off-center components is more reliably represented in well-centered components, extracted from analysis windows centered nearer the time of the transient event. Typically, data having time corrections


Fig. 5. Two long analysis windows superimposed at different times on a square wave signal with abrupt turn-on. The short-time transform corresponding to the earlier window generates unreliable parameter estimates and smears the sharp onset of the square wave.


Fig. 4. Comparison of time–frequency data included in common representations. Only the time–frequency orientation of data points is shown. (a) Short-time Fourier transform retains data at every time tn and frequency ωk. (b) Basic sinusoidal model [4] retains data at selected time and frequency samples. (c) Reassigned bandwidth-enhanced analysis data are distributed continuously in time and frequency, and retained only at time–frequency ridges. Arrows indicate the mapping of short-time spectral samples onto time–frequency ridges due to the method of reassignment.



FITZ AND HAKEN PAPERS

greater than the time between consecutive analysis window centers are considered to be unreliable and are removed, or cropped.
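The cropping rule just stated can be expressed directly: discard any component whose time correction exceeds the spacing between analysis window centers. A minimal sketch (the dictionary representation and key names are illustrative, not the authors' data format):

```python
def crop_off_center(components, hop_time):
    """Discard off-center spectral components: those whose time
    correction (in seconds) exceeds the time between consecutive
    analysis window centers (hop_time, in seconds)."""
    return [c for c in components if abs(c["time_correction"]) <= hop_time]
```

Because overlapping windows represent the same event redundantly, the cropped points carry no information that is not better represented by well-centered components.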

Cropping partials to remove off-center components allows us to localize transient events reliably. Fig. 7(c) shows reassigned time–frequency data from the abrupt square wave onset with off-center components removed. The abrupt square wave onset synthesized from the cropped reassigned data, seen in Fig. 6(c), is much sharper than the uncropped reassigned reconstruction, because the taper of the analysis window makes even the time correction data unreliable in components that are very far off center.

Fig. 8 shows reassigned bandwidth-enhanced model data from the onset of a bowed cello tone before and after the removal of off-center components. In this case, components with time corrections greater than 10 ms (the time between consecutive analysis windows) were deemed to be too far off center to deliver reliable parameter estimates. As in Fig. 7(c), the unreliable data clustered at the time of the onset are removed, leaving a cleaner, more robust representation.


Fig. 7. Time–frequency analysis data points for abrupt square wave onset. (a) Traditional nonreassigned data are evenly distributed in time. (b) Reassigned data are clumped at the onset time. (c) Reassigned analysis data after far off-center components have been removed, or cropped. Only time and frequency information is plotted; amplitude information is not displayed.


Fig. 6. Abrupt square wave onset reconstructed from five sinusoidal partials corresponding to the first five harmonics. (a) Reconstruction from nonreassigned analysis data. (b) Reconstruction from reassigned analysis data. (c) Reconstruction from reassigned analysis data with unreliable partial parameter estimates removed, or cropped.




5 PHASE MAINTENANCE

Preserving phase is important for reproducing some classes of sounds, in particular transients and short-duration complex audio events having significant information in the temporal envelope [13]. The basic sinusoidal model proposed by McAulay and Quatieri [4] is phase correct, that is, it preserves phase at all times in unmodified reconstruction. In order to match short-time spectral frequency and phase estimates at frame boundaries, McAulay and Quatieri employ cubic interpolation of the instantaneous partial phase.

Cubic phase envelopes have many undesirable properties. They are difficult to manipulate and maintain under time- and frequency-scale transformation compared to linear frequency envelopes. However, in unmodified reconstruction, cubic interpolation prevents the propagation of phase errors introduced by unreliable parameter estimates, maintaining phase accuracy in transients, where the temporal envelope is important, and throughout the reconstructed waveform. The effect of phase errors in the unmodified reconstruction of a square wave is illustrated in Fig. 9. If not corrected using a technique such as cubic phase interpolation, partial parameter errors introduced by off-center components render the waveshape visually unrecognizable. Fig. 9(b) shows that cubic phase can be used to correct these errors in unmodified reconstruction.
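For reference, the McAulay–Quatieri interpolation fits θ(t) = θ0 + ω0 t + α t² + β t³ over each frame, matching phase and frequency at both boundaries and choosing the 2π-unwrapping multiple M that keeps the phase track maximally smooth. The sketch below follows those published definitions (frequencies in radians per sample); the function name is ours.

```python
import numpy as np

def cubic_phase(theta0, omega0, theta1, omega1, T):
    """McAulay-Quatieri cubic phase interpolation over a frame of T samples.
    Returns instantaneous phase theta[t] for t = 0..T that matches phase
    and frequency (rad/sample) at both frame boundaries."""
    # unwrapping multiple that makes the phase track maximally smooth
    M = round(((theta0 + omega0 * T - theta1) + (omega1 - omega0) * T / 2)
              / (2 * np.pi))
    d = theta1 + 2 * np.pi * M - theta0 - omega0 * T
    alpha = 3 * d / T**2 - (omega1 - omega0) / T
    beta = -2 * d / T**3 + (omega1 - omega0) / T**2
    t = np.arange(T + 1)
    return theta0 + omega0 * t + alpha * t**2 + beta * t**3
```

By construction θ(0) = θ0 and θ(T) equals θ1 up to a multiple of 2π, so successive frames join without phase discontinuities.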

It should be noted that, in this particular case, the phase errors appear dramatic, but do not affect the sound of the reconstructed steady-state waveforms appreciably. In many sounds, particularly transient sounds, preservation of the temporal envelope is critical [9], [13], but since they lack audible onset transients, the square waves in Fig. 9(a)–(c) sound identical. It should also be noted that cubic phase interpolation can be used to preserve phase accuracy, but does not reduce temporal smearing due to off-center components in long analysis windows.

It is not desirable to preserve phase at all times in modified reconstruction. Because frequency is the time derivative of phase, any change in the time or frequency scale of a partial must correspond to a change in the phase values at the parameter envelope breakpoints. In general, preserving phase using the cubic phase method in the presence of modifications (or estimation errors) introduces wild frequency excursions [20]. Phase can be preserved at one time, however, and that time is typically chosen to be the onset of each partial, although any single time could be chosen. The partial phase at all other times is modified to reflect the new time–frequency characteristic of the modified partial.

Off-center components with unreliable parameter estimates introduce phase errors in modified reconstruction. If the phase is maintained at the partial onset, even the cubic interpolation scheme cannot prevent phase errors from propagating in modified syntheses. This effect is illustrated in Fig. 9(c), in which the square wave time–frequency data have been shifted in frequency by 10% and reconstructed using cubic phase curves modified to reflect the frequency shift.

By removing the off-center components at the onset of a partial, we not only remove the primary source of phase errors, we also improve the shape of the temporal envelope in the modified reconstruction of transients by preserving a more reliable phase estimate at a time closer to the time of the transient event. We can therefore maintain phase accuracy at critical parts of the audio waveform even under transformation, and even using linear frequency envelopes, which are much simpler to compute, interpret, edit, and maintain than cubic phase curves. Fig. 9(d) shows a square wave reconstruction from cropped reassigned time–frequency data, and Fig. 9(e) shows a frequency-shifted reconstruction, both using linear frequency interpolation. Removing components with large time corrections preserves phase in modified and unmodified reconstruction, and thus obviates cubic phase interpolation.
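Recomputing partial phases from linear frequency envelopes then reduces to a simple integration: keep the measured phase at the partial onset and integrate the (possibly modified) piecewise-linear frequency envelope forward. A hypothetical sketch, with names and units of our choosing:

```python
import numpy as np

def rebuild_phase(times, freqs, theta_onset):
    """Recompute a partial's phases after modification: preserve the
    measured phase at the partial onset, then integrate the (possibly
    modified) piecewise-linear frequency envelope forward using the
    trapezoidal rule.  times in seconds, freqs in radians per second."""
    times = np.asarray(times, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    theta = np.empty_like(freqs)
    theta[0] = theta_onset
    for i in range(1, len(times)):
        theta[i] = theta[i - 1] + 0.5 * (freqs[i - 1] + freqs[i]) * (times[i] - times[i - 1])
    return theta
```

Under a frequency shift, only the later phases change; the onset phase, the most reliable estimate near the transient, is preserved exactly.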

Moreover, since we do not rely on frequent cubic phase corrections to our frequency estimates to preserve the shape of the temporal envelope (which would otherwise be corrupted by errors introduced by unreliable data), we have found that we can obtain very good-quality reconstruction, even under modification, with regularly sampled partial parameter envelopes. That is, we can sample the frequency, amplitude, and bandwidth envelopes of our


Fig. 8. Time–frequency coordinates of data from reassigned bandwidth-enhanced analysis. (a) Before cropping. (b) After cropping of off-center components clumped together at partial onsets. Source waveform is a bowed cello tone.




reassigned bandwidth-enhanced partials at regular intervals (of, for example, 10 ms) without sacrificing the fidelity of the model. We thereby achieve the data regularity of frame-based additive model data and the fidelity of reassigned spectral data. Resampling of the partial parameter envelopes is especially useful in real-time synthesis applications [11], [12].
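Envelope resampling of this kind can be sketched with ordinary linear interpolation onto a uniform grid; the function name and the 10-ms default are our illustrative choices, not the authors' API.

```python
import numpy as np

def resample_envelope(times, values, interval=0.010, duration=None):
    """Resample one partial's breakpoint envelope (times in seconds)
    onto a uniform grid, e.g. every 10 ms, by linear interpolation."""
    duration = times[-1] if duration is None else duration
    grid = np.arange(0.0, duration + 1e-12, interval)
    return grid, np.interp(grid, times, values)
```

The same routine applies to the frequency, amplitude, and bandwidth envelopes, yielding regularly sampled data suitable for a frame-based real-time synthesizer.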

6 BREAKING PARTIALS AT TRANSIENT EVENTS

Transients corresponding to the onset of all associated partials are preserved in our model by removing off-center components at the ends of partials. If transients always correspond to the onset of associated partials, then that method will preserve the temporal envelope of multiple transient events. In fact, however, partials often span transients. Fig. 10 shows a partial that extends over transient boundaries in a representation of a bongo roll, a sequence of very short transient events. The approximate attack times are indicated by dashed vertical lines. In such cases it is not possible to preserve the phase at the locations of multiple transients, since under modification the phase can only be preserved at one time in the life of a partial.

Strong transients are identified by the large time corrections they introduce. By breaking partials at components having large time corrections, we cause all associated partials to be born at the time of the transient, and thereby enhance our ability to maintain phase accuracy. In Fig. 11 the partial that spanned several transients in Fig. 10 has been broken at components having time corrections greater than the time between successive analysis window centers (about 1.3 ms in this case), allowing us to maintain the partial phases at each bongo strike. By breaking partials at the locations of transients, we can preserve the temporal envelope of multiple transient events, even under transformation.

Fig. 9. Reconstruction of a square wave having abrupt onset from five sinusoidal partials corresponding to the first five harmonics. The 24-ms plot spans slightly less than five periods of the 200-Hz waveform. (a) Waveform reconstructed from nonreassigned analysis data using linear interpolation of partial frequencies. (b) Waveform reconstructed from nonreassigned analysis data using cubic phase interpolation, as proposed by McAulay and Quatieri [4]. (c) Waveform reconstructed from nonreassigned analysis data using cubic phase interpolation, with partial frequencies shifted by 10%. Notice that more periods of the (distorted) waveform are spanned by the 24-ms plot than by the plots of unmodified reconstructions, due to the frequency shift. (d) Waveform reconstructed from time–frequency reassigned analysis data using linear interpolation of partial frequencies, and having off-center components removed, or cropped. (e) Waveform reconstructed from reassigned analysis data using linear interpolation of partial frequencies and cropping of off-center components, with partial frequencies shifted by 10%. Notice that more periods of the waveform are spanned by the 24-ms plot than by the plots of unmodified reconstructions, and that no distortion of the waveform is evident.
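This partial-breaking rule can be sketched over an illustrative breakpoint representation (the dictionary keys are ours): whenever a breakpoint's time correction exceeds the hop between analysis window centers, a new partial segment begins there.

```python
def break_partial(breakpoints, hop_time):
    """Split one partial's breakpoint list wherever a breakpoint's time
    correction (seconds) exceeds the hop between analysis window centers,
    so that each strong transient starts a new partial."""
    segments, current = [], []
    for bp in breakpoints:
        if abs(bp["time_correction"]) > hop_time and current:
            segments.append(current)   # close the segment before the transient
            current = []
        current.append(bp)
    if current:
        segments.append(current)
    return segments
```

Each resulting segment has its own onset, so phase can be preserved at every transient rather than only at the first.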

Fig. 12(b) shows the waveform for two strikes in a bongo roll reconstructed from reassigned bandwidth-enhanced data. The same two bongo strikes reconstructed from nonreassigned data are shown in Fig. 12(c). A comparison with the source waveform shown in Fig. 12(a) reveals that the reconstruction from reassigned data is better able to preserve the temporal envelope than the reconstruction from nonreassigned data and suffers less from temporal smearing.

7 REAL-TIME SYNTHESIS

Together with Kurt Hebel of Symbolic Sound Corporation we have implemented a real-time reassigned bandwidth-



Fig. 10. Time–frequency plot of reassigned bandwidth-enhanced analysis data for one strike in a bongo roll. Dashed vertical lines show approximate locations of attack transients. A partial extends across transient boundaries. Only time–frequency coordinates of partial data are shown; partial amplitudes are not indicated.


Fig. 11. Time–frequency plot of reassigned bandwidth-enhanced analysis data for one strike in a bongo roll, with partials broken at components having large time corrections and far off-center components removed. Dashed vertical lines show approximate locations of attack transients. Partials break at transient boundaries. Only time–frequency coordinates of partial data are shown; partial amplitudes are not indicated.

Fig. 12. Waveform plots for two strikes in a bongo roll. (a) Source waveform. (b) Reconstructed from reassigned bandwidth-enhanced data. (c) Reconstructed from nonreassigned bandwidth-enhanced data, synthesized using cubic phase interpolation to maintain phase accuracy.

Page 26: AES organizations AES · AES Journal of the Audio Engineering Society (ISSN 0004-7554), Volume 50, Number 11, 2002 November Published monthly, except January/February and July/August


enhanced synthesizer using the Kyma Sound Design Workstation [15].

Many real-time synthesis systems allow the sound designer to manipulate streams of samples. In our real-time reassigned bandwidth-enhanced implementation, we work with streams of data that are not time-domain samples. Rather, our envelope parameter streams encode frequency, amplitude, and bandwidth envelope parameters for each bandwidth-enhanced partial [11], [12].

Much of the strength of systems that operate on sample streams is derived from the uniformity of the data. This homogeneity gives the sound designer great flexibility with a few general-purpose processing elements. In our encoding of envelope parameter streams, data homogeneity is also of prime importance. The envelope parameters for all the partials in a sound are encoded sequentially. Typically, the stream has a block size of 128 samples, which means the parameters for each partial are updated every 128 samples, or 2.9 ms at a 44.1-kHz sampling rate. Sample streams generally do not have block sizes associated with them, but this structure is necessary in our envelope parameter stream implementation. The envelope parameter stream encodes envelope information for a single partial at each sample time, and a block of samples provides updated envelope information for all the partials.

Envelope parameter streams are usually created by traversing a file containing frame-based data from an analysis of a source recording. Such a file can be derived from a reassigned bandwidth-enhanced analysis by resampling the envelopes at intervals of 128 samples at 44.1 kHz. The parameter streams may also be generated by real-time analysis, or by real-time algorithms, but that process is beyond the scope of this discussion. A parameter stream typically passes through several processing elements. These processing elements can combine multiple streams in a variety of ways, and can modify values within a stream. Finally a synthesis element computes an audio sample stream from the envelope parameter stream.

Our real-time synthesis element implements bandwidth-enhanced oscillators [8] with the sum

y(n) = Σ_{k=1}^{K} [A_k(n) + N_k(n) b(n)] sin θ_k(n)        (12)

θ_k(n) = θ_k(n − 1) + 2^{F_k(n)}        (13)

where

y      time-domain waveform for synthesized sound
n      sample number
k      partial number in sound
K      total number of partials in sound (usually between 20 and 160)
A_k    amplitude envelope of partial k
N_k    noise envelope of partial k
b      zero-mean noise modulator with bell-shaped spectrum
F_k    log2 frequency envelope of partial k, radians per sample
θ_k    running phase for kth partial.

Values for the envelopes A_k, N_k, and F_k are updated from the parameter stream every 128 samples. The synthesis element performs sample-level linear interpolation between updates, so that A_k, N_k, and F_k are piecewise linear envelopes with segments 128 samples in length [21]. The θ_k values are initialized at partial onsets (when A_k and N_k are zero) from the phase envelope in the partial's parameter stream.
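A sketch of one such oscillator in the spirit of Eqs. (12) and (13), with per-sample linear interpolation of block-rate envelope updates. White noise stands in for the bell-shaped-spectrum modulator b(n), and all names are illustrative rather than the Kyma implementation.

```python
import numpy as np

def synthesize_partial(A_upd, N_upd, F_upd, theta0, block=128, rng=None):
    """One bandwidth-enhanced oscillator: y(n) = [A(n) + N(n) b(n)] sin(theta(n)),
    with theta(n) = theta(n-1) + 2**F(n).  A_upd, N_upd, F_upd hold one
    envelope value per block; they are linearly interpolated per sample."""
    if rng is None:
        rng = np.random.default_rng(0)
    n_samples = (len(A_upd) - 1) * block
    grid = np.arange(len(A_upd)) * block       # block boundaries in samples
    t = np.arange(n_samples)
    A = np.interp(t, grid, A_upd)              # piecewise linear envelopes
    N = np.interp(t, grid, N_upd)
    F = np.interp(t, grid, F_upd)
    omega = 2.0 ** F                           # radians per sample, Eq. (13)
    theta = theta0 + np.cumsum(omega)          # running phase
    b = rng.standard_normal(n_samples)         # stand-in for bell-spectrum noise
    return (A + N * b) * np.sin(theta)
```

With the noise envelope held at zero this reduces to an ordinary sinusoidal oscillator with a log2 frequency envelope, matching the homogeneous structure described above.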

Rather than using a separate model to represent noise in our sounds, we use the envelope N_k (in addition to the traditional A_k and F_k envelopes) and retain a homogeneous data stream. Quasi-harmonic sounds, even those with noisy attacks, have one partial per harmonic in our representation. The noise envelopes allow a sound designer to manipulate noiselike components of sound in an intuitive way, using a familiar set of controls. We have implemented a wide variety of real-time manipulations on envelope parameter streams, including frequency shifting, formant shifting, time dilation, cross synthesis, and sound morphing.

Our new MIDI controller, the Continuum Fingerboard, allows continuous control over each note in a performance. It resembles a traditional keyboard in that it is approximately the same size and is played with ten fingers [12]. Like keyboards supporting MIDI's polyphonic aftertouch, it continually measures each finger's pressure. The Continuum Fingerboard also resembles a fretless string instrument in that it has no discrete pitches; any pitch may be played, and smooth glissandi are possible. It tracks, in three dimensions (left to right, front to back, and downward pressure), the position of each finger pressing on the playing surface. These continuous three-dimensional outputs are a convenient source of control parameters for real-time manipulations on envelope parameter streams.

8 CONCLUSIONS

The reassigned bandwidth-enhanced additive sound model [10] combines bandwidth-enhanced analysis and synthesis techniques [7], [8] with the time–frequency reassignment technique described in this paper.

We found that the method of reassignment strengthens our bandwidth-enhanced additive sound model dramatically. Temporal smearing is greatly reduced because the time–frequency orientation of the model data is waveform dependent, rather than analysis dependent as in traditional short-time analysis methods. Moreover, time–frequency reassignment allows us to identify unreliable data points (having bad parameter estimates) and remove them from the representation. This not only sharpens the representation and makes it more robust, but it also allows us to maintain phase accuracy at transients, even under transformation, while avoiding the problems associated with cubic phase interpolation.

9 REFERENCES

[1] F. Auger and P. Flandrin, "Improving the Readability of Time–Frequency and Time-Scale Representations by the Reassignment Method," IEEE Trans. Signal Process., vol. 43, pp. 1068–1089 (1995 May).

[2] F. Plante, G. Meyer, and W. A. Ainsworth, "Improvement of Speech Spectrogram Accuracy by the Method of Spectral Reassignment," IEEE Trans. Speech Audio Process., vol. 6, pp. 282–287 (1998 May).

[3] G. Peeters and X. Rodet, "SINOLA: A New Analysis/Synthesis Method Using Spectrum Peak Shape Distortion, Phase and Reassigned Spectrum," in Proc. Int. Computer Music Conf. (1999), pp. 153–156.

[4] R. J. McAulay and T. F. Quatieri, "Speech Analysis/Synthesis Based on a Sinusoidal Representation," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-34, pp. 744–754 (1986 Aug.).

[5] X. Serra and J. O. Smith, "Spectral Modeling Synthesis: A Sound Analysis/Synthesis System Based on a Deterministic Plus Stochastic Decomposition," Computer Music J., vol. 14, no. 4, pp. 12–24 (1990).

[6] K. Fitz and L. Haken, "Sinusoidal Modeling and Manipulation Using Lemur," Computer Music J., vol. 20, no. 4, pp. 44–59 (1996).

[7] K. Fitz, L. Haken, and P. Christensen, "A New Algorithm for Bandwidth Association in Bandwidth-Enhanced Additive Sound Modeling," in Proc. Int. Computer Music Conf. (2000).

[8] K. Fitz and L. Haken, "Bandwidth Enhanced Sinusoidal Modeling in Lemur," in Proc. Int. Computer Music Conf. (1995), pp. 154–157.

[9] T. S. Verma and T. H. Y. Meng, "An Analysis/Synthesis Tool for Transient Signals," in Proc. 16th Int. Congr. on Acoustics/135th Mtg. of the Acoust. Soc. Am. (1998 June), vol. 1, pp. 77–78.

[10] K. Fitz, L. Haken, and P. Christensen, "Transient Preservation under Transformation in an Additive Sound Model," in Proc. Int. Computer Music Conf. (2000).

[11] L. Haken, K. Fitz, and P. Christensen, "Beyond Traditional Sampling Synthesis: Real-Time Timbre Morphing Using Additive Synthesis," in Sound of Music: Analysis, Synthesis, and Perception, J. W. Beauchamp, Ed. (Springer, New York, to be published).

[12] L. Haken, E. Tellman, and P. Wolfe, "An Indiscrete Music Keyboard," Computer Music J., vol. 22, no. 1, pp. 30–48 (1998).

[13] T. F. Quatieri, R. B. Dunn, and T. E. Hanna, "Time-Scale Modification of Complex Acoustic Signals," in Proc. Int. Conf. on Acoustics, Speech, and Signal Processing (IEEE, 1993), pp. I-213–I-216.

[14] K. Fitz and L. Haken, "The Loris C++ Class Library," available at http://www.cerlsoundgroup.org/Loris.

[15] K. J. Hebel and C. Scaletti, "A Framework for the Design, Development, and Delivery of Real-Time Software-Based Sound Synthesis and Processing Algorithms," presented at the 97th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 42, p. 1050 (1994 Dec.), preprint 3874.

[16] M. Dolson, "The Phase Vocoder: A Tutorial," Computer Music J., vol. 10, no. 4, pp. 14–27 (1986).

[17] A. Papoulis, Systems and Transforms with Applications to Optics (McGraw-Hill, New York, 1968), chap. 7.3, p. 234.

[18] K. Kodera, R. Gendrin, and C. de Villedary, "Analysis of Time-Varying Signals with Small BT Values," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-26, pp. 64–76 (1978 Feb.).

[19] D. W. Griffin and J. S. Lim, "Multiband Excitation Vocoder," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-36, pp. 1223–1235 (1988 Aug.).

[20] Y. Ding and X. Qian, "Processing of Musical Tones Using a Combined Quadratic Polynomial-Phase Sinusoidal and Residual (QUASAR) Signal Model," J. Audio Eng. Soc., vol. 45, pp. 571–584 (1997 July/Aug.).

[21] L. Haken, "Computational Methods for Real-Time Fourier Synthesis," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-40, pp. 2327–2329 (1992 Sept.).

[22] A. Ricci, "SoundMaker 1.0.3," MicroMat Computer Systems (1996–1997).

[23] F. Opolko and J. Wapnick, "McGill University Master Samples," McGill University, Montreal, Que., Canada (1987).

[24] E. Tellman, cello tones recorded by P. Wolfe at Pogo Studios, Champaign, IL (1997 Jan.).

APPENDIX
RESULTS

The reassigned bandwidth-enhanced additive model is implemented in the open source C++ class library Loris [14], and is the basis of the sound manipulation and morphing algorithms implemented therein.

We have attempted to use a wide variety of sounds in the experiments we conducted during the development of the reassigned bandwidth-enhanced additive sound model. The results from a few of those experiments are presented in this appendix. Data and waveform plots are not intended to constitute proof of the efficacy of our algorithms, or the utility of our representation. They are intended only to illustrate the features of some of the sounds used and generated in our experiments. The results of our work can only be judged by auditory evaluation, and to that end, these sounds and many others are available for audition at the Loris web site [14].

All sounds used in these experiments were sampled at 44.1 kHz (CD quality), so time–frequency analysis data are available at frequencies as high as 22.05 kHz. However, for clarity, only a limited frequency range is plotted in most cases. The spectrogram plots all have high gain so that low-amplitude high-frequency partials are visible. Consequently strong low-frequency partials are very often clipped, and appear to have unnaturally flat amplitude envelopes.

The waveform and spectrogram plots were produced using Ricci's SoundMaker software application [22].

A.1 Flute Tone

A flute tone, played at pitch D4 (D above middle C), having a fundamental frequency of approximately 293 Hz and no vibrato, taken from the McGill University Master Samples compact discs [23, disc 2, track 1, index




3], is shown in the three-dimensional spectrogram plot in Fig. 13. This sound was modeled by reassigned bandwidth-enhanced analysis data produced using a 53-ms Kaiser analysis window with 90-dB sidelobe rejection. The partials were constrained to be separated by at least 250 Hz, slightly greater than 85% of the harmonic partial separation.

Breath noise is a significant component of this sound. This noise is visible between the strong harmonic components in the spectrogram plot, particularly at frequencies above 3 kHz. The breath noise is faithfully represented in the reassigned bandwidth-enhanced analysis data, and reproduced in the reconstructions from those analysis data. A three-dimensional spectrogram plot of the reconstruction is shown in Fig. 14. The audible absence of the breath noise is apparent in the spectral plot for the sinusoidal reconstruction from non-bandwidth-enhanced analysis data, shown in Fig. 15.

A.2 Cello Tone

A cello tone, played at pitch D#3 (D sharp below middle C), having a fundamental frequency of approximately 156 Hz, played by Edwin Tellman and recorded by Patrick Wolfe [24], was modeled by reassigned bandwidth-


Fig. 13. Three-dimensional spectrogram plot for breathy flute tone, pitch D4 (D above middle C). Audible low-frequency noise and rumble from the recording are visible. Strong low-frequency components are clipped and appear to have unnaturally flat amplitude envelopes due to the high gain used to make low-amplitude high-frequency partials visible.

Fig. 14. Three-dimensional spectrogram plot for breathy flute tone, pitch D4 (D above middle C), reconstructed from reassigned bandwidth-enhanced analysis data.



enhanced analysis data produced using a 71-ms Kaiser analysis window with 80-dB sidelobe rejection. The partials were constrained to be separated by at least 135 Hz, slightly greater than 85% of the harmonic partial separation.

Bow noise is a strong component of the cello tone, especially in the attack portion. As with the flute tone, the noise is visible between the strong harmonic components in spectral plots, and was preserved in the reconstructions from reassigned bandwidth-enhanced analysis data and absent from sinusoidal (non-bandwidth-enhanced) reconstructions. Unlike the flute tone, the cello tone has an abrupt attack, which is smeared out in nonreassigned sinusoidal analyses (data from reassigned and nonreassigned cello analyses are plotted in Fig. 8), causing the reconstructed cello tone to have weak-sounding articulation. The characteristic “grunt” is much better preserved in reassigned model data.

A.3 Flutter-Tongued Flute Tone

A flutter-tongued flute tone, played at pitch E4 (E above middle C), having a fundamental frequency of approximately 330 Hz, taken from the McGill University Master Samples compact discs [23, disc 2, track 2, index 5], was represented by reassigned bandwidth-enhanced analysis data produced using a 17.8-ms Kaiser analysis window with 80-dB sidelobe rejection. The partials were constrained to be separated by at least 300 Hz, slightly greater than 90% of the harmonic partial separation. The flutter-tongue effect introduces a modulation with a period of approximately 35 ms, and gives the appearance of vertical stripes on the strong harmonic partials in the spectrogram shown in Fig. 16.

With careful choice of the window parameters, reconstruction from reassigned bandwidth-enhanced analysis data preserves the flutter-tongue effect, even under time dilation, and is difficult to distinguish from the original.

Fig. 17 shows how a poor choice of analysis window, a 71-ms Kaiser window in this case, can degrade the representation. The reconstructed tone plotted in Fig. 17 is recognizable, but completely lacks the flutter effect, which has been smeared out by the window duration. In this case multiple transient events are spanned by a single analysis window, and the temporal center of gravity for that window lies somewhere between the transient events. Time-frequency reassignment allows us to identify multiple transient events in a single sound, but not within a single short-time analysis window.

A.4 Bongo Roll

Fig. 18 shows the waveform and spectrogram for an 18-strike bongo roll taken from the McGill University Master Samples compact discs [23, disc 3, track 11, index 31]. This sound was modeled by reassigned bandwidth-enhanced analysis data produced using a 10-ms Kaiser analysis window with 90-dB sidelobe rejection. The partials were constrained to be separated by at least 300 Hz.

The sharp attacks in this sound were preserved using reassigned analysis data, but smeared in nonreassigned reconstruction, as discussed in Section 6. The waveforms for two bongo strikes are shown in reassigned and nonreassigned reconstruction in Fig. 12(b) and (c). Inspection of the waveforms reveals that the attacks in the nonreassigned reconstruction are not as sharp as in the original or the reassigned reconstruction, a clearly audible difference.

Transient smearing is particularly apparent in time-dilated synthesis, where the nonreassigned reconstruction loses the percussive character of the bongo strikes. The reassigned data provide a much more robust representation of the attack transients, retaining the percussive character of the bongo roll under a variety of transformations, including time dilation.


Fig. 15. Three-dimensional spectrogram plot for breathy flute tone, pitch D4 (D above middle C), reconstructed from reassigned non-bandwidth-enhanced analysis data.

[Fig. 15 axes: frequency 0–5000 Hz; time 0–2.8 s.]


FITZ AND HAKEN PAPERS


Fig. 17. Waveform and spectrogram plots for reconstruction of flutter-tongued flute tone plotted in Fig. 16, analyzed using long window, which smears out flutter effect.

[Figs. 16 and 17 axes: amplitude vs. time (0–6.7 s); frequency (0–6 kHz) vs. time (0–6.7 s).]

Fig. 16. Waveform and spectrogram plots for flutter-tongued flute tone, pitch E4 (E above middle C). Vertical stripes on strong harmonic partials indicate modulation due to flutter-tongue effect. Strong low-frequency components are clipped and appear to have unnaturally flat amplitude envelopes due to high gain used to make low-amplitude high-frequency partials visible.




[Fig. 18 axes: amplitude vs. time (0–1.25 s); frequency (0–20 kHz) vs. time (0–1.25 s).]

Fig. 18. Waveform and spectrogram plots for bongo roll.

THE AUTHORS

Kelly Fitz received B.S., M.S., and Ph.D. degrees in electrical engineering from the University of Illinois at Urbana-Champaign, in 1990, 1992, and 1999, respectively. There he studied digital signal processing as well as sound analysis and synthesis with Dr. James Beauchamp, and sound design and electroacoustic music composition with Scott Wyatt, using a variety of analog and digital systems in the experimental music studios.

Dr. Fitz is currently an assistant professor in the Department of Electrical Engineering and Computer Science at Washington State University.

Lippold Haken has an adjunct professorship in electrical and computer engineering at the University of Illinois, and he is senior computer engineer at Prairie City Computing in Urbana, Illinois. He is leader of the CERL Sound Group, and together with his graduate students developed new software algorithms and signal processing hardware for computer music. He is inventor of the Continuum Fingerboard, a MIDI controller that allows continuous control over each note in a performance. He is a contributor of optimized real-time algorithms for the Symbolic Sound Corporation Kyma sound design workstation. He is also the author of a sophisticated music notation editor, Lime.

He is currently teaching a computer music survey course for seniors and graduate students in electrical and computer engineering.

K. Fitz L. Haken


PAPERS

0 INTRODUCTION

The purpose of this paper is to investigate how transaural processing can enhance conventional multichannel audio, both by embedding perceptually relevant information and by improving image stability using additional loudspeakers integrated with supplementary digital processing and coding. The key objective is to achieve scalability in spatial performance while retaining full compatibility with conventional multichannel formats. This enables the system in its most basic form, with unprocessed loudspeaker feeds, to be used in a conventional multichannel installation. However, by appropriate signal processing additional loudspeaker feeds can be derived, together with the option of exploiting buried data to extract more signals in order to improve spatial resolution. The system is therefore hierarchical in terms of number of loudspeakers, channels, and ultimately spatial resolution, while in its simplest incarnation it remains fully compatible with the system configurations used with multichannel DVD-A and SACD replay equipment.

The multichannel capabilities of DVD1 technology [1], [2] were designed to enhance stereo2 sound reproduction by offering surround image and improved envelopment capabilities. Normally multichannel audio encoded onto DVD assumes the ITU standard of a five-loudspeaker configuration driven by five discrete wide-band “loudspeaker feeds.” However, a limitation of this system is the lack of a methodology to synthesize virtual images capable of three-dimensional audio (that is, a perception of direction, distance, and height together with acoustic envelopment)

rather than just “sound effects” often (although not exclusively) associated with surround sound in a home theater context. The ITU five-channel loudspeaker configuration can also be poor at side image localization, although this deficiency is closely allied to a sensitivity to room acoustics. Nevertheless, DVD formats still offer only six discrete channels, which if mapped directly into loudspeaker feeds remain deficient in terms of image precision, especially if height and depth information is to be encoded.

The techniques described in this paper support scalable spatial audio that can remain compatible with conventional multichannel systems. It is shown that in this class of system, under anechoic conditions, signal processing can be used to match theoretically the ear signals to either a real or an equivalent spatially synthesized sound source. Also, in order to improve image robustness, directional sound-field encoding is retained as exploited in conventional surround sound to match the image synthesized through transaural processing. It may be argued that as the number of channels is increased, there is convergence toward wavefront synthesis [3], where by default optimum ear-signal reconstruction is achieved. However, the proposed system is positioned well into the middle territory3

and is far removed from the array sizes required for broadband wavefront synthesis. Consequently, from the perspective of wavefront synthesis the transition frequency above which spatial aliasing occurs is located at a relatively low frequency, implying that for the proposed system the core concepts of wavefront synthesis do not apply


Scalable Multichannel Coding with HRTF Enhancement for DVD and Virtual Sound Systems*

M. O. J. HAWKSFORD

Centre for Audio Research and Engineering, University of Essex, UK CO4 3SQ

A scalable and reverse compatible multichannel method of spatial audio using transaural coding designed for multiple-loudspeaker feeds is described, with a focus on attaining optimum ear signals. A Fourier transform method for computing HRTF matrices is employed, including the generation of a subset of band-limited reproduction channels. Applications considered embrace multichannel audio, DVD, virtual reality, and telepresence.

* Presented at the 108th Convention of the Audio Engineering Society, Paris, France, 2000 February 19–22; revised 2001 October 10 and 2002 July 15.

1 Includes both DVD-A and SACD formats.
2 Of Greek origin, meaning solid, stereo is applicable universally to multichannel audio.
3 A range of 5 to 32 loudspeakers is suggested.


PAPERS SCALABLE MULTICHANNEL CODING WITH HRTF ENHANCEMENT

over much of the audio band.

It is emphasized that an n-channel system does not necessarily imply n loudspeakers. Indeed, as is well known, it is possible for a two-loudspeaker system to reproduce virtual sound sources [4], while using more than n loudspeakers can help create a more robust and stable illusion. Also, the mature technology of Ambisonics [5]–[7] is scalable and can accommodate both additional loudspeakers and information channels. However, here the encoding is hierarchical in terms of spatial spherical harmonics, although no attempt is made to reconstruct the ear signal directly at the listener. Consequently the approach taken in this paper differs in a number of fundamental aspects from that of Ambisonics, especially since there is no attempt to transform a sound field directly into a spherical harmonic set. Thus it remains for future work to establish the relative merits of these approaches, although, because similar loudspeaker arrays are used, there is no fundamental compromise should the system be used either for Ambisonics or for conventional surround sound encoded audio.

The method of spatial audio described in this paper uses a conventional loudspeaker array to surround the listener and to reproduce a directional sound field. In addition, ear signals are simultaneously synthesized using head-related transfer functions (HRTFs) matched to the source image, where it is assumed in all cases that loudspeaker transfer functions have been equalized or otherwise taken into account. A number of examples illustrate the computational methods, which include pairwise transaural image synthesis4 reported in earlier work, where some preliminary experimental results were also discussed [8]–[10] to establish the efficacy of the method. This technique is especially well matched to multichannel multiloudspeaker installations, where transaural coding can be applied during encoding and recording, while processing within the decoder located within the reproduction system can accommodate both additional loudspeakers and loudspeaker positions that differ from those assumed at the encoder. Consequently for an n-channel system it is straightforward to employ only n loudspeakers, although additional loudspeaker feeds can be derived, while still retaining correct ear signals, either by using matrix techniques or deterministically within the DVD-A format using additional embedded code. However, it is emphasized that in the simplest configuration, using only direct loudspeaker feeds and provided the loudspeakers are correctly located, there are no additional decoding requirements and the system remains fully compatible with all existing recordings.

Alternative technologies such as Ambisonics [11] have used fewer loudspeakers together with sophisticated matrix encoding. Also, there has been substantial research into perceptually based processing to reconstruct a three-dimensional environment using only two channels and two loudspeakers. More recently Dolby EX5 has been introduced as a means of synthesizing a center rear channel using nonlinear Prologic6 processing applied to the rear two channels of a five-channel system. However, this technology is aimed principally at surround sound as conceived for cinema and home theater, with a bias toward sound effects and ambience creation. Nevertheless there

exists a grey area between cinema applications, music reproduction, gaming applications, and the synthesis of virtual acoustics, especially as at their core the same multichannel carriers can link all systems. It is therefore not unreasonable to anticipate some degree of convergence, as similar theoretical models apply. Also, with conventional multichannel technology it is often the listening environment and the methods used to craft the audio signals that impose the greatest performance limitations.

Multichannel stereo on DVD allows for improved methods of spatial encoding that can transcend the common studio practice of using just pairwise amplitude panning with blending to mono. It is conjectured that by including perceptually motivated processing, three-dimensional “soundscapes” can be rendered rather than just peripheral surround sound. Complex HRTF data by default encapsulate all relevant spatial information [that is, interaural amplitude difference (IAD), interaural time difference (ITD), and directional spectral weighting] and form a generalized approach. However, to reduce signal coloration, a method of HRTF equalization is proposed, with an emphasis on characterizing the interear difference signal computed in the lateral plane. The extension to height information in the equalized HRTFs is also discussed briefly.

In Section 5.2.1 a special case is presented for narrow subtended angle, two-channel stereo, where it is shown that ear signals derived from a real acoustic source located on the arc of the loudspeakers can be closely synthesized using a mono source with amplitude-only panning. Critically, in this example the HRTFs are defined by the actual locations of the loudspeakers, and so are matched automatically to an individual listener’s HRTF characteristics. This is an important aspect of the proposal, which is directly extendable to the method of pairwise association, where mismatch sensitivity between listener HRTFs and target HRTFs is reduced. This approach also encapsulates succinctly the principles of two-channel amplitude-only panning stereo while exposing inherent errors as the angle between the loudspeakers is increased.

To summarize, four core elements constitute the proposed scalable and reverse compatible spatial audio system:

• A vector component of the sound field is produced as a loudspeaker array surrounds the listener, following conventional multichannel audio practice.

• Pairwise transaural techniques are used to code directional information and to create ear signals matched to the required source signal.

• Matrix processing can increase the number of loudspeakers used in the array while simultaneously preserving the ear signals resulting from transaural processing and nonoptimum loudspeaker placements.

• Embedded digital code7 [12] is used to create additional channels for enhanced resolution while remaining compatible with the basic system already enhanced by pairwise transaural processing.

4 Subject to a British Telecommunications patent application.
5 Dolby Laboratories channel extension technology for AC-3 perceptual coding.
6 Registered trademark of Dolby Laboratories.
7 Applicable only to the DVD-A format.

1 HRTF NOTATION

A set of HRTFs is unique to an individual and describes a continuum of acoustic transfer functions linking every point in space to the listener’s ears. HRTFs depend on the relative position of the source to the listener and are influenced by distance, reflection, and diffraction around the head, pinna, and torso. In this paper HRTFs were derived from measurements taken at BT Laboratories (BTL) utilizing an artificial head and small microphones mounted at the entrance of each ear canal. Measurements of head-related impulse responses (HRIRs) were performed in an anechoic chamber at 10° intervals using a maximum-length-sequence (MLS) excitation, and the corresponding HRTFs were computed using a time window and the Fourier transform.

To define the nomenclature used for various HRTF subfunctions, consider the arrangement shown in Fig. 1, where the listener’s ears are labeled A (left) and B (right) when viewed from above. In a sound reproduction system all sound sources and loudspeakers have associated pairs of HRTFs uniquely linking them to the listener; in this paper these transfer functions are called the HRTF coordinates of each given object. In Fig. 1 the single sound source X has the HRTF coordinates hxa, hxb, while the three loudspeakers 1, 2, and n, with arbitrary positions, have the coordinates ha(1), hb(1), ha(2), hb(2), and ha(n), hb(n). In specifying the loudspeaker HRTF coordinates, a left–right designation can be included when the loudspeaker array is known to be symmetrical about the centerline. Consequently hla(r) denotes the HRTF between the left-hand loudspeaker r and the left-hand ear A. However, for arrays having only three symmetrically positioned loudspeakers (left, center, and right) a simpler notation is used in Section 3, namely, hla, hlb, hca, hcb, and hra, hrb. It should be observed that in typical HRTF calculations, such as the evaluation of the positional transfer functions GR and GL used in transaural signal processing (see Section 3.1), ear-canal equalization need not be incorporated, provided the same set of HRTFs is used for both loudspeaker and image locations. Then the ear-canal transfer functions cancel, assuming they are not directionally encoded. For example, in Section 4 Eqs. (23a) and (23b) describe typical transaural processing to derive the positional transfer functions GR and GL, where any transfer function components common to all HRTFs cancel.

2 HRTF EQUALIZATION

The HRTFs used in transaural processing reveal frequency response variations that may contribute tonal coloration when sound is reproduced. In this section a strategy for equalization is studied that reduces the overall spectral variation, yet retains the key attributes deemed essential for localization. A simplified form of HRTF is also defined, which can prove useful in multiloudspeaker systems.

2.1 Methods of Equalization

When sound is reproduced over a conventional multiloudspeaker array, where for example a signal is spatialized using pairwise amplitude panning, equalization as a function of direction is not normally employed. In such a system sound is perceived generally as uncolored, even though the ears, head, and torso impose direction-specific spectral weighting. However, although HRTFs used in transaural processing take account of both the source location and the loudspeakers, reducing frequency response variations can ameliorate tonal variance, which may become accentuated as the phantom image moves away from the loudspeaker locations and when the listener turns away from the optimum forward orientation. Also, systems are rarely optimally aligned and exhibit sensitivity to small head motions, both of which map into frequency response errors in the reconstructed ear signals. Consequently the aim is to introduce minimum spectral modifications commensurate with achieving spatialization.

It is proposed that for image localization within the lateral plane the relationship between the complex interaural difference signal and the signal components common to both ears is the critical factor. This conjecture is based on the premise that for lateral images, spectral components common to both ears relate closely to the source spectrum, whereas the interaural spectrum is strongly influenced by source direction. Consequently it is argued that modification to the common spectrum causes principally tonal coloration, whereas the relationship between common spectrum and interaural spectrum is more critical to localization, even though spectral cues embedded in the source can induce an illusion of height. This approach may be extendable to include height localization, although it is recognized that additional spectral weighting of the monaural component can be required following, for example, the boosted-band experiments performed by Blauert [13].


Fig. 1. Definition of HRTF notation for n loudspeakers positioned arbitrarily around the listener.



2.1.1 Lateral-Plane HRTF Equalization

The proposed method of lateral-plane HRTF equalization first transforms each HRTF pair into sum and difference (M-S) coordinates and then performs equalization on the corresponding pair by dividing by the corresponding sum spectrum. It is proposed that all HRTFs in a set should be equalized using this technique in order to maintain relative group delay and, with appropriate weighting, relative level.

As defined in Section 1, let the HRTFs for a given source location X be hxa and hxb, and let the corresponding complex sum and difference transforms be HSUMx and HDIFFx. Thus

HSUMx = hxa + hxb    (1)
HDIFFx = hxa − hxb.    (2)

Four methods of HRTF equalization that match this objective are identified, where hxea, hxeb are the resulting HRTFs after equalization.

Method 1: Equalization by the modulus of the complex sum spectrum,

hxea = Wnx hxa / |hxa + hxb|    (3a)
hxeb = Wnx hxb / |hxa + hxb|.    (3b)

Method 2: Equalization by the complex sum spectrum,

hxea = Wnx hxa / (hxa + hxb)    (4a)
hxeb = Wnx hxb / (hxa + hxb).    (4b)

Method 3: Equalization by the derived minimum-phase spectrum of the complex sum spectrum,

hxea = Wnx hxa / exp(conj(hilbert(log(abs(hxa + hxb)))))    (5a)
hxeb = Wnx hxb / exp(conj(hilbert(log(abs(hxa + hxb))))).    (5b)

Here Wnx are the normalization coefficients calculated to maintain the relative levels after equalization of all HRTF coordinates in the set. Each form of equalization delivers identical magnitude spectra in the HRTFs, although there are variations in the time-domain waveforms resulting from phase response differences. To illustrate these variations, consider an example HRTF pair corresponding to a nominal 30° off-axis image source. Fig. 2(a) shows the measured HRIRs, whereas Fig. 2(b)–(d) presents the impulse responses resulting from each form of equalization in order that both pre- and postring can be compared. Fig. 3 shows the corresponding amplitude spectra before and after equalization, and Fig. 4 illustrates the sum and difference spectra, again before and after equalization.

In selecting a potential equalization strategy, it is a necessary condition that the relative time difference between HRTF pairs be maintained. Also, the time-domain waveforms should not accentuate or exhibit excessive prering or postring, as this can produce unnatural sound coloration. Although each equalization method meets the principal objective, the technique of forging the denominator from the minimum phase of the sum spectrum yields results with minimum prering. In essence, the minimum-phase information common to both ear signals is removed, leaving mainly excess phase components to carry the essential time-delay information. Equalization using the complex sum spectrum (method 2) also yields results close to the requirements. However, inspection of Fig. 2(c) shows that the right-ear response, which in this case has the greater delay, exhibits prering extending back in time to the commencement of the left HRIR.

However, experience gained with equalization has revealed that certain image locations, particularly toward the center rear, can yield excessive ringing after equalization. Consequently a further equalization variant is proposed. This is similar to method 3, but it differs in the way the sum spectrum is computed and is defined as follows.

Method 4: Equalization by the derived minimum-phase sum of the moduli of each complex spectrum,

hxea = Wnx hxa / exp(conj(hilbert(log(abs(hxa) + abs(hxb)))))    (6a)
hxeb = Wnx hxb / exp(conj(hilbert(log(abs(hxa) + abs(hxb))))).    (6b)

In the denominator this algorithm factors out the interaural time difference between left and right signals, which otherwise maps into artificial amplitude response variations in the complex sum spectrum. As such this procedure could be argued to be a better estimator of the common spectrum, as human auditory processing does not sum ear signals directly. Overall the effect on HRTFs is minor. Fig. 5 presents results that should be compared directly with those in Fig. 3.
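Methods 1–4 can be prototyped in a few lines. The sketch below is an illustration under stated assumptions, not the paper's code: NumPy/SciPy are assumed, `hilbert` is the analytic-signal operator applied along the frequency axis (mirroring the minimum-phase construction in Eqs. (5) and (6)), and `w` stands in for the normalization coefficient Wnx:

```python
import numpy as np
from scipy.signal import hilbert

def min_phase_spectrum(mag):
    # Minimum-phase spectrum from a magnitude spectrum via the Hilbert
    # transform of the log magnitude, as in Eqs. (5) and (6).
    return np.exp(np.conj(hilbert(np.log(np.maximum(mag, 1e-12)))))

def equalize_pair(hxa, hxb, w=1.0, method=3):
    # Equalize one complex HRTF pair by its sum spectrum (methods 1-4).
    if method == 1:      # modulus of the complex sum
        d = np.abs(hxa + hxb)
    elif method == 2:    # complex sum
        d = hxa + hxb
    elif method == 3:    # minimum phase of the complex sum
        d = min_phase_spectrum(np.abs(hxa + hxb))
    else:                # method 4: minimum phase of the sum of moduli
        d = min_phase_spectrum(np.abs(hxa) + np.abs(hxb))
    return w * hxa / d, w * hxb / d
```

For method 2 the equalized pair sums to exactly Wnx; methods 1 and 3 also flatten the sum magnitude (the constant-level sum spectrum of Fig. 4(b)) while distributing phase differently, whereas method 4 leaves a small residual because the moduli are summed before the minimum-phase fit.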

Finally, a further variant of equalization is where an average of all sum spectra is formed and the HRTFs are modified following procedures similar to those reported in this section, but with particular emphasis on the minimum-phase and sum-of-moduli techniques. However, in this case, since all HRTFs are modified by a common equalization function in a way similar to ear-canal equalization, when positional transfer functions are calculated, their form is unchanged.

2.1.2 Equalization with the Addition of Height Cues

Research by Blauert [13] has shown that by introducing specific frequency-dependent characteristics into the HRTFs a sensation of height is achievable. However, the





Fig. 2. Normalized time-domain left–right HRTFs at 30°. Top––left ear; bottom––right ear. (a) Measured. Observe relative time displacement revealing ITD and lack of prering in natural responses. (b) Method 1 equalized. Observe excessive prering that blurs commencement of the two HRIRs. (c) Method 2 equalized. HRIRs exhibit mirror images except for initial impulse. (d) Method 3 equalized. Prering reduced and initial ITD of HRIRs maintained.





Fig. 3. Magnitude HRTF pair at 30°. Top––left ear; bottom––right ear. (a) Measured. (b) Equalized. Results are identical for methods 1, 2, and 3.


Fig. 2. Continued





Fig. 5. Equalized HRTF pair at 30°, applicable to method 4 only. Top––left ear; bottom––right ear.

Fig. 4. Magnitude HRTF sum and difference spectra at 30°. Sum––“diamond” line; difference––continuous line. (a) Unequalized. (b) Equalized, applicable to methods 1, 2, and 3. (Note constant-level sum spectrum following equalization.)




question arises as to whether this modification is compatible with the equalization strategies presented in Section 2.1.1. For example, the following questions need to be considered:

• Is it sufficient to measure the HRTF coordinates only at the required location above the lateral plane and then apply equalization, and will sufficient information then remain buried in the interear difference signal, with a unique characterization, to discriminate against lateral images with equivalent interaural time differences?

• Does the absolute amplitude response variation, rather than just the difference response variation inherent in the HRTFs, represent a major factor in producing height cues?

• Are there secondary factors, such as ground reflections, which introduce additional cues to aid height localization? Effectively this would require at least two interfering sets of HRTFs to be summed.

A full investigation of these points relating to height is beyond the scope of the present study. However, if the ground reflection model were responsible, then the equalization methods could be applied individually to the direct source and to the ground reflection, with the results combined by taking the path difference into account.

2.2 Simplified HRTF Models

In applications where phantom images are positioned close to the locality of the loudspeakers, it may be sufficient to use a simple form of HRTF. This is particularly applicable with multiloudspeaker arrays, where a vector component already forms a strong localization cue. Fig. 6 shows a source image at angle θ defined by hxa and hxb with respect to a human head of diameter d meters.

2.2.1 Simple HRTF Model 1

The model ignores head shadowing and assumes that only the interaural time difference is significant. Hence for a source at angle θ from the forward position, the respective HRTF coordinates are approximately

$$h_{xa} = \exp\!\left[-\mathrm{j}2\pi f\left(T - \frac{d}{2c}\sin\theta\right)\right] \tag{7a}$$

$$h_{xb} = \exp\!\left[-\mathrm{j}2\pi f\left(T + \frac{d}{2c}\sin\theta\right)\right] \tag{7b}$$

where the velocity of sound is c m/s and T s is the time delay from the source to the center of the head. In this model the advantage of equalization method 3 is evident, as no equalization need be applied.
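The delay-only model of Eqs. (7a) and (7b) can be sketched numerically; the head diameter, sound velocity, and source delay used below are illustrative assumptions, not values taken from the text:

```python
import math
import cmath

def hrtf_model_1(f, theta, T=0.003, d=0.18, c=343.0):
    """Simple HRTF model 1 (Eqs. (7a), (7b)): head shadowing is ignored and
    only the interaural time difference is modelled. f in Hz, theta in
    radians; T is the source-to-head-centre delay in seconds (assumed)."""
    dt = (d / (2.0 * c)) * math.sin(theta)  # per-ear deviation from T
    h_xa = cmath.exp(-1j * 2 * math.pi * f * (T - dt))  # nearer ear
    h_xb = cmath.exp(-1j * 2 * math.pi * f * (T + dt))  # farther ear
    return h_xa, h_xb
```

Both coordinates are all-pass (unit magnitude at every frequency), which is why no equalization is needed under method 3.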

2.2.2 Simple HRTF Model 2

In this second model both the front and the back waves are considered, where the back wave results from head diffraction. In this representation a wave incident on ear A produces, by head diffraction, a secondary signal at ear B. The head diffraction transfer function from ears A to B, DHA–B(r, θ, φ), is a function of the direction of the incident wave defined by the spherical coordinates r, θ, φ. A similar function linking ears B to A is defined, DHB–A(r, θ, φ). Hence for the source HRTF coordinates hxa, hxb,

$$h_{xa} = \exp\!\left[-\mathrm{j}2\pi f\left(T - \frac{d}{2c}\sin\theta\right)\right] + DH_{B\text{--}A}(r,\theta,\phi)\,\exp\!\left[-\mathrm{j}2\pi f\left(T + \frac{d}{2c}\sin\theta\right)\right] \tag{8a}$$

$$h_{xb} = \exp\!\left[-\mathrm{j}2\pi f\left(T + \frac{d}{2c}\sin\theta\right)\right] + DH_{A\text{--}B}(r,\theta,\phi)\,\exp\!\left[-\mathrm{j}2\pi f\left(T - \frac{d}{2c}\sin\theta\right)\right] \tag{8b}$$

In a simple model the diffraction transfer functions could be represented as an attenuation DHk with a time delay of approximately the interaural time delay ΔTA–B,

$$DH_{A\text{--}B}(r,\theta,\phi) \approx DH_{B\text{--}A}(r,\theta,\phi) \approx DH_k\,\exp\!\left(-\mathrm{j}2\pi f\,\Delta T_{A\text{--}B}\right). \tag{9}$$

3 MULTILOUDSPEAKER ARRAYS IN TWO-CHANNEL STEREO

This section introduces variants to two-channel, two-loudspeaker transaural processing to demonstrate how a two-channel signal format can be mapped into n feeds to drive a multiloudspeaker array [14]. It is assumed that more than two loudspeakers are driven simultaneously by signals derived from a single-point sound source, while formal methods show that the correct ear signals can be retained. Besides supporting stand-alone applications, these transformations are relevant in the development of multichannel transaural stereo, as described in Section 5.

J. Audio Eng. Soc., Vol. 50, No. 11, 2002 November 901

Fig. 6. Sound source X and simplified head model used to derive approximate HRTFs.


HAWKSFORD PAPERS

The outputs of an n-array of loudspeakers combine by acoustical superposition at the entrance to each ear canal. The principal condition for accurate sound localization is that these signals match the signals that would have been generated by a real sound source, both in the static case and in the case of small head rotations. Also, by using several loudspeakers placed to surround the listener, the sound-field directional component can make the system more tolerant to head motion. Consequently changes in ear signals with head motion match more closely those of a real image.

A static sound source, whatever its size and physical location, produces two ear signals that fully define the event, provided the head position relative to the source is fixed. In practice it is possible to generate the correct ear signals from two or more noncoincident loudspeakers that can take any arbitrary position around the head. However, if the position of the head moves, then a change in the ear signals results, which no longer match the phantom image correctly, and a localization error is perceived. In a system of wavefront reconstruction this distortion is minimized, although the penalty is a large number of loudspeakers and channels. However, if a limited number n of loudspeakers is used (for example, n = 12), then although image position distortion still occurs, the effect is reduced, as there is a robust directional component. Also, when a phantom image coincides with a loudspeaker position, the positional distortion as a function of head position tends to zero, although it is debatable whether this is a desirable situation, as images from other locations are represented differently in terms of their radiated cones of sound.

3.1 Three-Loudspeaker Transaural Processing

To illustrate how to accommodate more than two loudspeakers in an array while retaining the requirements for precise HRTF formulation, consider a three-loudspeaker array as illustrated in Fig. 7. In this system a mono source signal X is filtered by the positional transfer functions GR and GL to form LT and RT, which in turn form inputs to the matrix [a]. By way of example a Trifield⁸ matrix (after Gerzon [15]) is selected, which is defined as

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} = \begin{bmatrix} 0.8850 & -0.1150 \\ 0.4511 & 0.4511 \\ -0.1150 & 0.8850 \end{bmatrix}. \tag{10}$$

By applying the coefficients defined in matrix [a], the three loudspeaker signals L, C, and R (left, center, right) can be derived. However, to reproduce optimum localization, the system requires that the ear signals produced by the three loudspeakers match the ear signals that would be produced by the real source.
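As a minimal numeric sketch, the Trifield mapping of Eq. (10) can be applied directly to a two-channel pair. The signs of the 0.1150 terms follow the antiphase form reconstructed here and should be checked against Gerzon [15]:

```python
# Trifield two-to-three mapping (after Eq. (10)).
# Coefficient signs are an assumption based on the reconstructed matrix.
A = [[0.8850, -0.1150],
     [0.4511, 0.4511],
     [-0.1150, 0.8850]]

def trifield(lt, rt):
    """Map the two-channel pair (LT, RT) to loudspeaker feeds (L, C, R)."""
    return tuple(row[0] * lt + row[1] * rt for row in A)
```

For a hard-left input, `trifield(1.0, 0.0)` returns `(0.8850, 0.4511, -0.1150)`: the right loudspeaker carries a small antiphase component.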

3.1.1 Analysis

The positional transfer function matrix [G] converts the mono signal X to LT and RT as

$$\begin{bmatrix} L_T \\ R_T \end{bmatrix} = \begin{bmatrix} G_L \\ G_R \end{bmatrix} X. \tag{11}$$

Using matrix [a], the loudspeaker feeds L, C, and R are then derived from [G]X,

$$\begin{bmatrix} L \\ C \\ R \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \begin{bmatrix} G_L \\ G_R \end{bmatrix} X. \tag{12}$$

However, recalling the HRTFs as defined in Fig. 7, where hxa and hxb are the HRTF coordinates of source image X, then

$$\begin{bmatrix} h_{xa} \\ h_{xb} \end{bmatrix} X = \begin{bmatrix} h_{la} & h_{ca} & h_{ra} \\ h_{lb} & h_{cb} & h_{rb} \end{bmatrix} \begin{bmatrix} L \\ C \\ R \end{bmatrix} \tag{13}$$

where, substituting for L, C, and R,

$$\begin{bmatrix} h_{xa} \\ h_{xb} \end{bmatrix} = \begin{bmatrix} h_{la} & h_{ca} & h_{ra} \\ h_{lb} & h_{cb} & h_{rb} \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \begin{bmatrix} G_L \\ G_R \end{bmatrix}. \tag{14}$$

The positional transfer functions GR and GL then follow by matrix inversion,

$$\begin{bmatrix} G_L \\ G_R \end{bmatrix} = \left\{ \begin{bmatrix} h_{la} & h_{ca} & h_{ra} \\ h_{lb} & h_{cb} & h_{rb} \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \right\}^{-1} \begin{bmatrix} h_{xa} \\ h_{xb} \end{bmatrix} \tag{15}$$


Fig. 7. Transaural processing using symmetrical three-loudspeaker array.

⁸ Registered trademark describing a two-channel to three-loudspeaker mapping proposed by Gerzon [15].


from which the loudspeaker feeds L, C, and R are calculated,

$$\begin{bmatrix} L \\ C \\ R \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \left\{ \begin{bmatrix} h_{la} & h_{ca} & h_{ra} \\ h_{lb} & h_{cb} & h_{rb} \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \right\}^{-1} \begin{bmatrix} h_{xa} \\ h_{xb} \end{bmatrix} X. \tag{16}$$

In practice L, C, and R are calculated directly using matrix inversion. However, because the transfer functions can have several thousand elements, the solution can be decomposed as follows to avoid large-dimension matrices. Define

$$\begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix} = \begin{bmatrix} h_{la} & h_{ca} & h_{ra} \\ h_{lb} & h_{cb} & h_{rb} \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \tag{17}$$

giving

$$\begin{bmatrix} h_{xa} \\ h_{xb} \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix} \begin{bmatrix} G_L \\ G_R \end{bmatrix} \tag{18}$$

where, using matrix inversion, the positional transfer functions are

$$G_L = \frac{h_{22}\,h_{xa} - h_{12}\,h_{xb}}{h_{11}h_{22} - h_{12}h_{21}} \tag{19a}$$

$$G_R = \frac{h_{11}\,h_{xb} - h_{21}\,h_{xa}}{h_{11}h_{22} - h_{12}h_{21}} \tag{19b}$$

which enable L, C, and R to be calculated. Fig. 8 shows example transfer functions linking the system input to the three loudspeaker inputs (L, C, and R) located at −45°, 0°, and +45°, with a source location at 30°. Simulations confirm that the correct ear signals are produced, as shown in Fig. 9, while Figs. 10 and 11 present the magnitudes of the positional transfer functions GL and GR and their differential phase response, respectively.
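At a single frequency bin the chain of Eqs. (17)–(19) reduces to a 2 × 2 complex solve. The sketch below folds the loudspeaker HRTFs through [a] and inverts; it operates on single complex samples, so a full implementation would repeat it per frequency bin (that repetition is omitted here as an assumption of the sketch):

```python
def positional_filters(h_l, h_c, h_r, h_x, a):
    """Eqs. (17)-(19) at one frequency: h_l, h_c, h_r, h_x are (ear-a, ear-b)
    complex pairs for the left, centre, right loudspeakers and the image;
    a is the 3x2 matrix [a]. Returns the positional gains (GL, GR)."""
    # Eq. (17): fold [h_la h_ca h_ra; h_lb h_cb h_rb] through [a]
    h11 = h_l[0] * a[0][0] + h_c[0] * a[1][0] + h_r[0] * a[2][0]
    h12 = h_l[0] * a[0][1] + h_c[0] * a[1][1] + h_r[0] * a[2][1]
    h21 = h_l[1] * a[0][0] + h_c[1] * a[1][0] + h_r[1] * a[2][0]
    h22 = h_l[1] * a[0][1] + h_c[1] * a[1][1] + h_r[1] * a[2][1]
    det = h11 * h22 - h12 * h21
    # Eq. (19): invert the 2x2 system of Eq. (18)
    GL = (h22 * h_x[0] - h12 * h_x[1]) / det
    GR = (h11 * h_x[1] - h21 * h_x[0]) / det
    return GL, GR
```

Feeding GL and GR back through [a] reproduces the loudspeaker signals of Eq. (16).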

3.2 Three-Loudspeaker Matrix with Band-Limited Center Channel

This section extends multiloudspeaker transaural processing by considering a case where the center channel is band-limited by a low-pass filter with a transfer function λ(f). For example, λ(f) could constrain the center channel to operate only in the band where the ear and brain employ interaural time differences for localization. Alternatively the center channel may be used mainly for low-frequency reproduction.

The inclusion of λ(f) yields effective HRTF center-channel coordinates hca.*λ(f), hcb.*λ(f), where .* implies element-by-element vector multiplication. Hence, from the equations derived in Section 3.1, L, C, and R follow,

$$\begin{bmatrix} L \\ C \\ R \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \left\{ \begin{bmatrix} h_{la} & h_{ca}.\!*\lambda(f) & h_{ra} \\ h_{lb} & h_{cb}.\!*\lambda(f) & h_{rb} \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \right\}^{-1} \begin{bmatrix} h_{xa} \\ h_{xb} \end{bmatrix} X. \tag{20}$$

To illustrate this system with band-limited center channel, Fig. 12 shows again the system input to the loudspeaker transfer functions for the three loudspeakers located at −45°, 0°, and +45°, with a phantom source location at 30°. The low-pass filter λ(f) in the center channel has a cutoff frequency of 100 Hz with an asymptotic attenuation slope of 40 dB per octave. The ear signals are formed correctly and are identical to those presented in Fig. 9.
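The .* weighting of Eq. (20) is simply an element-by-element multiply of the centre-channel HRTF samples by λ(f). The low-pass form below is purely illustrative: a magnitude-only weight of order 6 (roughly 36 dB per octave, not the exact 40 dB-per-octave filter of the text); only the 100 Hz cutoff follows the text:

```python
# Illustrative lambda(f): crude low-pass magnitude weight, applied
# element-by-element (the .* of Eq. (20)) to centre-channel HRTF samples.
def lam(f, fc=100.0, order=6):
    return 1.0 / (1.0 + (f / fc) ** order)

freqs = [25.0, 50.0, 100.0, 200.0, 400.0]            # sample frequency grid
h_ca = [1.0 + 0.0j] * len(freqs)                     # placeholder HRTF samples
h_ca_bl = [h * lam(f) for h, f in zip(h_ca, freqs)]  # h_ca .* lambda(f)
```

Below 100 Hz the centre-channel coordinates are essentially untouched; well above it they are suppressed, so the centre loudspeaker contributes only low-frequency content.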

Fig. 9. Input-to-ear transfer functions for three-loudspeaker system with Trifield matrix, corresponding to functions shown in Fig. 8.

Fig. 8. Loudspeaker feed transfer functions for Trifield matrix linking input to three loudspeakers located at −45°, 0°, and +45°, image at 30°.


An attraction of this configuration is that the center channel can have a limited bandwidth while offering improvements in bass quality, both in terms of power handling and by improving modal dispersion in the listening room. Alternatively, if a loss of spatial resolution at low frequency is permitted, then the center channel could function as a subwoofer with an upper response that extends only into the lower midband frequency range. The left- and right-hand loudspeakers would extend to high frequencies, although with restricted low-frequency performance.

3.3 n-Loudspeaker Array with Two-Channel Transaural Processing

The method of using more than two loudspeakers can be generalized to n loudspeakers while retaining only two information channels, where for example the loudspeakers surround the listener in a symmetrical array. The left- and right-hand loudspeakers in the array are fed by one of two information signals, and each loudspeaker has an individual weighting defined by a coefficient matrix [a], as given in Eq. (21).

Although this matrix equation cannot be solved in general, as there are too many independent variables, solutions can be achieved when the matrix [a] is specified. For general two-channel stereophonic reproduction this system offers little advantage. However, in a telepresence and teleconference environment the coefficient matrix [a] may be transmitted for a given talker alongside the two information channels. A transaural reproduction system can then be conceived where the coefficients are updated dynamically to enhance directional coding. This becomes particularly attractive where there are a number of talkers, as the coefficient matrix could be adjusted dynamically to enhance localization.
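Once [a] is specified, Eq. (21) again reduces to a 2 × 2 solve per frequency. A sketch with single-frequency complex gains follows; the n = 4 geometry and the weights used in any trial are invented for illustration:

```python
def n_speaker_feeds(h_pairs, a, h_x):
    """Eq. (21) at one frequency. h_pairs[i] = (h_a(i), h_b(i)) for each of
    the n loudspeakers, a is the n x 2 matrix [a], h_x = (h_xa, h_xb) are
    the image HRTFs. Returns the n loudspeaker feed gains."""
    n = len(h_pairs)
    # fold the 2 x n HRTF matrix through the n x 2 matrix [a] -> 2 x 2
    m11 = sum(h_pairs[i][0] * a[i][0] for i in range(n))
    m12 = sum(h_pairs[i][0] * a[i][1] for i in range(n))
    m21 = sum(h_pairs[i][1] * a[i][0] for i in range(n))
    m22 = sum(h_pairs[i][1] * a[i][1] for i in range(n))
    det = m11 * m22 - m12 * m21
    # two information channels from the inner inverse of Eq. (21)
    g1 = (m22 * h_x[0] - m12 * h_x[1]) / det
    g2 = (m11 * h_x[1] - m21 * h_x[0]) / det
    return [a[i][0] * g1 + a[i][1] * g2 for i in range(n)]
```

Summing h_a(i) (or h_b(i)) against the returned feeds reproduces h_xa (or h_xb) exactly, which is the "correct ear signals" condition of Section 3.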

4 MULTICHANNEL PARADIGM EXPLOITING PAIRWISE TRANSAURAL STEREO

This section reviews a spatial audio paradigm that links multichannel audio and transaural processing. The technique augments the directional cues inherent in multichannel stereo reproduction by embedding HRTF data such that the ear signals are matched more accurately to those produced by the source image. By default, such processing includes both frequency (interaural amplitude response) and time (interaural time response) information and therefore forms an elegant method of virtual image manipulation. Also, because HRTFs vary with both angular position and distance, sound sources can be synthesized and manipulated in three-dimensional space, together with reflections spatialized using their HRTF coordinates, which further enhances this process. In a multichannel system this processing is performed during source coding and is therefore compatible with all DVD formats.

The proposal operates at six principal levels:

• Selecting a pair of loudspeakers whose subtended angle includes the position of the phantom image helps reinforce the sound direction and matches conventional mixing practice for localization in multichannel systems.

• Encoded amplitude differences in signals above about 2 kHz support localization using interaural amplitude differences.

• Encoded time differences in signals below about 2 kHz support localization using interaural time differences, where an extended bass performance is desirable.

• The addition of transaural processing based on HRTF data enables the construction of ear signals that match the original event and aids localization.

• Closer spacing of loudspeakers in a multiloudspeaker array reduces sensitivity to the precise form of HRTF characteristics, thus making an averaged HRTF set more applicable to a wide range of listeners.

• The effect of moderate head motion, which is a desirable attribute for improving localization, is supported. For relatively small loudspeaker subtended angles the error in ear-signal reconstruction is reduced when the head is moved by a small angle such that the vector component reinforces localization.


Fig. 10. Positional transfer functions GL and GR, corresponding to functions shown in Fig. 8.

$$\begin{bmatrix} LS_1 \\ LS_2 \\ \vdots \\ LS_n \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \vdots & \vdots \\ a_{n1} & a_{n2} \end{bmatrix} \left\{ \begin{bmatrix} h_a(1) & h_a(2) & \cdots & h_a(n) \\ h_b(1) & h_b(2) & \cdots & h_b(n) \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \vdots & \vdots \\ a_{n1} & a_{n2} \end{bmatrix} \right\}^{-1} \begin{bmatrix} h_{xa} \\ h_{xb} \end{bmatrix} X. \tag{21}$$


As an example, consider a circular array of n loudspeakers, as shown in Fig. 13. In this system pairwise coding (PWC) selects the two closest loudspeakers such that an image X falls within the subtended angle at the listening position. Two-channel HRTF synthesis is then used to form the optimum ear signals. For example, if loudspeakers r and r + 1 are selected from the n-array of loudspeakers, then loudspeaker feeds LS(r) and LS(r + 1) are computed,

$$\begin{bmatrix} LS(r) \\ LS(r+1) \end{bmatrix} = \begin{bmatrix} G_r \\ G_{r+1} \end{bmatrix} X = \begin{bmatrix} h_a(r) & h_a(r+1) \\ h_b(r) & h_b(r+1) \end{bmatrix}^{-1} \begin{bmatrix} h_{xa} \\ h_{xb} \end{bmatrix} X. \tag{22}$$

Matrix [G] defines a set of positional transfer functions, which are effectively filters located between source and loudspeaker feed, where the HRTF notation was defined in Section 1 with reference to Fig. 1.

Solving for the positional transfer function matrix [G],

$$G_r = \frac{h_{xa}\,h_b(r+1) - h_{xb}\,h_a(r+1)}{h_a(r)\,h_b(r+1) - h_b(r)\,h_a(r+1)} \tag{23a}$$

$$G_{r+1} = \frac{h_{xb}\,h_a(r) - h_{xa}\,h_b(r)}{h_a(r)\,h_b(r+1) - h_b(r)\,h_a(r+1)}. \tag{23b}$$

This result can be generalized for an n-array of loudspeakers, as given in Eq. (24).

If a sound source falls between loudspeakers r and r + 1, then the coefficients selecting that pair are set to unity (a_r1 = a_(r+1)2 = 1), while all remaining coefficients in matrix [a] are set to zero. Consequently, for a sound source to circumnavigate the head, the HRTF coordinates hxa and hxb must change dynamically, whereas as the source moves between loudspeaker pairs, the coefficient matrix is switched to redirect the sound.

A four-channel, four-loudspeaker PWC scheme is shown in Fig. 14. The positional transfer functions G1, G2, G3, G4 are calculated for each source location and then filter the source signal X to form the loudspeaker feeds. Because transaural processing is performed at the encoder, a simple replay system is supported. Consequently complete compatibility with conventional multichannel audio is retained.
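A minimal sketch of the PWC selection together with the pairwise solve of Eq. (23); the circular-array geometry helper is an assumption about how "the two closest loudspeakers" are chosen:

```python
def pwc_pair(theta_img, speaker_angles):
    """Select the adjacent pair (r, r+1 mod n) whose subtended angle
    contains the image direction. Angles in degrees on a full circle."""
    n = len(speaker_angles)
    for r in range(n):
        span = (speaker_angles[(r + 1) % n] - speaker_angles[r]) % 360
        if (theta_img - speaker_angles[r]) % 360 < span:
            return r, (r + 1) % n
    return 0, 1  # unreachable for a well-formed circular array

def pwc_filters(h_r, h_r1, h_x):
    """Eq. (23): positional transfer functions for the selected pair.
    Arguments are (ear-a, ear-b) complex pairs at one frequency."""
    det = h_r[0] * h_r1[1] - h_r[1] * h_r1[0]  # ha(r)hb(r+1) - hb(r)ha(r+1)
    g_r = (h_x[0] * h_r1[1] - h_x[1] * h_r1[0]) / det
    g_r1 = (h_x[1] * h_r[0] - h_x[0] * h_r[1]) / det
    return g_r, g_r1
```

As the image crosses a loudspeaker angle, `pwc_pair` switches pairs, which corresponds to switching the coefficient matrix [a] described above.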

5 INCREASED NUMBERS OF LOUDSPEAKERS

An increase in the number of loudspeakers can achieve a more even sound distribution, distribute power handling, and possibly lower the sensitivity to room acoustics. This section considers methods by which the number of loudspeakers in an array can be increased.

In the basis system, where loudspeakers are linked directly to information channels located at coordinates compatible with the encoding HRTF coordinates, the loudspeakers are designated nodal loudspeakers.


Fig. 11. Differential phase response between positional transfer functions GL and GR, corresponding to functions shown in Fig. 8.

Fig. 12. Transfer functions linking input to three-loudspeaker feeds for Trifield matrix, but with center channel band-limited to 100 Hz, corresponding to functions shown in Fig. 8.

$$\begin{bmatrix} LS_1 \\ LS_2 \\ \vdots \\ LS_n \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \vdots & \vdots \\ a_{n1} & a_{n2} \end{bmatrix} \begin{bmatrix} G_r \\ G_{r+1} \end{bmatrix} X = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \vdots & \vdots \\ a_{n1} & a_{n2} \end{bmatrix} \left\{ \begin{bmatrix} h_a(r) & h_a(r+1) \\ h_b(r) & h_b(r+1) \end{bmatrix} \right\}^{-1} \begin{bmatrix} h_{xa} \\ h_{xb} \end{bmatrix} X. \tag{24}$$


An n-channel system therefore has n nodal loudspeakers, which constitute the basis array, with the corresponding drive signals, or primary signals, collectively forming the primary signal set. Loudspeakers in addition to the nodal loudspeakers are termed secondary loudspeakers.

5.1 Compensation for Inclusion of Secondary Loudspeakers

The objective is to derive additional signals within the decoder to drive secondary loudspeakers located between the nodal loudspeakers. However, the ear signals must be conserved and theoretically remain identical to the case where only the nodal loudspeakers are present. It is assumed here that all loudspeakers in the array have identical transfer functions and therefore do not affect the decoder process. If this is not the case, then they require individual correction. A sector of such an array is shown in Fig. 15. In this array nodal loudspeakers r and r + 1 have the respective HRTF coordinates [ha(r), hb(r)] and [ha(r + 1), hb(r + 1)] at the listener, whereas the single secondary loudspeaker p has the HRTF coordinates [ha(p), hb(p)].

The synthesis of the drive signal for secondary loudspeaker p employs two weighting functions λp1 and λp2 applied to the respective primary signals LSr and LSr+1 such that LSp, the drive signal to the secondary loudspeaker p, is

$$LS_p = \lambda_{p1}\,LS_r + \lambda_{p2}\,LS_{r+1}. \tag{25}$$

However, when the secondary loudspeaker enters the array, it is necessary to compensate the output from the two adjacent nodal loudspeakers in order that the ear signals remain unchanged. Ideally this needs to be achieved without knowledge of the encoding parameters; otherwise the modified array cannot be used universally in a multichannel audio system. A scheme capable of meeting this objective is shown in Fig. 16, where the compensation transfer functions γp1 and γp2 filter the signal LSp to yield signals that are added to LSr and LSr+1.


Fig. 14. Four-loudspeaker array with pairwise transaural synthesis.

Fig. 16. Decoder processing to compensate for secondary loudspeaker p.

Fig. 13. n-loudspeaker array, suitable for transaural pairwise stereo.

Fig. 15. Nodal loudspeakers with additional secondary loudspeaker.


5.1.1 Analysis

Consider initially the case when the secondary loudspeaker p is absent from the array. The left- and right-ear signals ea and eb are expressed in terms of the HRTFs corresponding to the two adjacent nodal loudspeakers r and r + 1,

$$e_a = h_a(r)\,LS_r + h_a(r+1)\,LS_{r+1} \tag{26a}$$

$$e_b = h_b(r)\,LS_r + h_b(r+1)\,LS_{r+1}. \tag{26b}$$

When the secondary loudspeaker p is introduced into the array, the modified ear signals e′a and e′b become those given in Eqs. (27a) and (27b).

Forcing e′a = ea and e′b = eb, then

$$\begin{bmatrix} h_a(r) & h_a(r+1) \\ h_b(r) & h_b(r+1) \end{bmatrix} \begin{bmatrix} \gamma_{p1} \\ \gamma_{p2} \end{bmatrix} = -\begin{bmatrix} h_a(p) \\ h_b(p) \end{bmatrix} \tag{28}$$

from which the correction functions γp1 and γp2 follow,

$$\gamma_{p1} = \frac{h_a(p)\,h_b(r+1) - h_b(p)\,h_a(r+1)}{h_b(r)\,h_a(r+1) - h_a(r)\,h_b(r+1)} \tag{29a}$$

$$\gamma_{p2} = \frac{h_b(p)\,h_a(r) - h_a(p)\,h_b(r)}{h_b(r)\,h_a(r+1) - h_a(r)\,h_b(r+1)}. \tag{29b}$$

These equations reveal that the correction functions γp1 and γp2 depend only on the HRTFs corresponding to the loudspeaker array. Consequently they are purely a function of the replay system and its loudspeaker layout, and thus can be computed within the replay decoder as part of the installation procedure. However, the weighting functions λp1 and λp2 can be selected independently, provided the system is stable.
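A single-frequency sketch of Eq. (29), with a check that the ear signals really are conserved; the HRTF values and primary signals used in any trial are invented for illustration:

```python
def gamma_compensation(h_r, h_r1, h_p):
    """Eq. (29): correction filters gamma_p1, gamma_p2 applied to the nodal
    feeds so the ear signals are unchanged when secondary loudspeaker p
    joins the array. Arguments are (ear-a, ear-b) pairs at one frequency."""
    det = h_r[1] * h_r1[0] - h_r[0] * h_r1[1]  # hb(r)ha(r+1) - ha(r)hb(r+1)
    g_p1 = (h_p[0] * h_r1[1] - h_p[1] * h_r1[0]) / det
    g_p2 = (h_p[1] * h_r[0] - h_p[0] * h_r[1]) / det
    return g_p1, g_p2
```

When h(p) is the exact midpoint average of the two nodal HRTFs, both corrections come out at −0.5, matching Eq. (33).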

5.2 Relationship of System Parameters to Loudspeaker HRTFs

Section 5.1 presented an analysis of decoder parameters where precise HRTF measurement data are known for each loudspeaker position. However, there are some interesting observations and simplifications that can be made, which are considered in this section.

5.2.1 HRTFs Derived Using Linear Interpolation

Consider a pair of nodal loudspeakers within an n-array PWC system where the proximity of loudspeakers r and r + 1 is such that the image source HRTFs can be approximated by linear interpolation, such that

$$h_{xa}(m_r) = \beta(m_r)\left[m_r\,h_a(r) + (1-m_r)\,h_a(r+1)\right] \tag{30a}$$

$$h_{xb}(m_r) = \beta(m_r)\left[m_r\,h_b(r) + (1-m_r)\,h_b(r+1)\right] \tag{30b}$$

where mr is the panning variable with a range of 1 to 0, which corresponds to an image pan from loudspeaker r to r + 1, and the function β(mr) is a moderator chosen to achieve a constant subjective loudness with variations in mr. By substitution, the positional transfer functions Gr


$$e'_a = h_a(r)\left[LS_r + \gamma_{p1}\left(\lambda_{p1}LS_r + \lambda_{p2}LS_{r+1}\right)\right] + h_a(r+1)\left[LS_{r+1} + \gamma_{p2}\left(\lambda_{p1}LS_r + \lambda_{p2}LS_{r+1}\right)\right] + h_a(p)\left(\lambda_{p1}LS_r + \lambda_{p2}LS_{r+1}\right) \tag{27a}$$

$$e'_b = h_b(r)\left[LS_r + \gamma_{p1}\left(\lambda_{p1}LS_r + \lambda_{p2}LS_{r+1}\right)\right] + h_b(r+1)\left[LS_{r+1} + \gamma_{p2}\left(\lambda_{p1}LS_r + \lambda_{p2}LS_{r+1}\right)\right] + h_b(p)\left(\lambda_{p1}LS_r + \lambda_{p2}LS_{r+1}\right) \tag{27b}$$

and Gr+1 then simplify to

$$G_r = \beta(m_r)\,m_r \tag{31a}$$

$$G_{r+1} = \beta(m_r)\,(1-m_r). \tag{31b}$$

Consequently, for an image source that lies on a radial arc between the two nodal loudspeakers, simple amplitude panning yields the optimum panning algorithm. Effectively, HRTF coding information is derived directly from the loudspeaker locations and therefore is matched precisely to the listener. This assumes that intermediate HRTFs are derived by linear interpolation. It should also be noted that as the image source moves away from the radial arc containing the loudspeaker array, changes in HRTFs occur, making the positional transfer functions complex. Nevertheless, even when more exact HRTF data are available, there remains a strong desensitization to the exact form of the HRTFs when the positional transfer functions are calculated, because of the relatively close proximity of loudspeakers in an n-array.

Consider next a secondary loudspeaker p that is added to the array, again assuming a linear interpolation model. It is assumed that the secondary loudspeaker is located midway along the same radial arc as the nodal loudspeakers and that its HRTFs ha(p) and hb(p) with respect to the listener can be determined by linear interpolation. Thus for the midpoint location,

$$h_a(p) = 0.5\left[h_a(r) + h_a(r+1)\right] \tag{32a}$$

$$h_b(p) = 0.5\left[h_b(r) + h_b(r+1)\right]. \tag{32b}$$

From these data the compensation gamma functions γp1


and γp2 follow,

$$\gamma_{p1} = \gamma_{p2} = -0.5 \tag{33}$$

revealing once more a simple form for this special case. The modified positional transfer functions GMr, GMr+1, and GMp for the respective nodal and secondary loudspeakers r, r + 1, and p are calculated as

$$GM_p = \lambda_{p1}\,G_r + \lambda_{p2}\,G_{r+1} \tag{34a}$$

$$GM_r = G_r + \gamma_{p1}\left(\lambda_{p1}\,G_r + \lambda_{p2}\,G_{r+1}\right) \tag{34b}$$

$$GM_{r+1} = G_{r+1} + \gamma_{p2}\left(\lambda_{p1}\,G_r + \lambda_{p2}\,G_{r+1}\right). \tag{34c}$$

(34c)

Assuming symmetry, let λp1 λp2 0.5 and consider thefollowing examples:

1) Gr = 1 and Gr+1 = 0, yielding GMr = 0.75, GMr+1 = −0.25, and GMp = 0.5.

2) Gr = 0.5 and Gr+1 = 0.5, yielding GMr = 0.25, GMr+1 = 0.25, and GMp = 0.5.

For an image located coincident with the secondary loudspeaker there is a 6-dB level difference between secondary and nodal loudspeaker input signals. Also the signal processing is simple and uses only real coefficients in the matrices.
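The two examples can be checked numerically from Eqs. (34a)–(34c) with the symmetric weights λp1 = λp2 = 0.5 and γp1 = γp2 = −0.5:

```python
# Numeric check of the linear-interpolation special case (Eqs. (31)-(34)).
LAM1 = LAM2 = 0.5    # lambda_p1, lambda_p2 (symmetric weighting)
GAM1 = GAM2 = -0.5   # gamma_p1, gamma_p2 from Eq. (33)

def modified_gains(g_r, g_r1):
    gm_p = LAM1 * g_r + LAM2 * g_r1   # Eq. (34a)
    gm_r = g_r + GAM1 * gm_p          # Eq. (34b)
    gm_r1 = g_r1 + GAM2 * gm_p        # Eq. (34c)
    return gm_r, gm_r1, gm_p

print(modified_gains(1.0, 0.0))  # example 1 -> (0.75, -0.25, 0.5)
print(modified_gains(0.5, 0.5))  # example 2 -> (0.25, 0.25, 0.5)
```

In the second example the image sits at the secondary loudspeaker: GMp = 0.5 against GMr = GMr+1 = 0.25, the 6-dB difference noted in the text.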

5.2.2 Compensation and Incorporation of Exact HRTF Data

It is instructive to compare three other options for incorporating secondary loudspeaker and image HRTF coordinates. In each case the correction functions γp1 and γp2 are calculated to match the selected HRTFs, and the corresponding transfer functions from system input to loudspeaker inputs are evaluated for loudspeakers r, p, and r + 1 located at 20°, 30°, and 40°, respectively, with an image at 30°. The four cases are as follows.

Case 1:
  Image: measured HRTF data.
  Secondary loudspeaker: measured HRTF data.
  Results: see Fig. 17(a).

Case 2:
  Image: measured HRTF data.
  Secondary loudspeaker: HRTF derived by linear interpolation.
  Results: see Fig. 17(b).

Case 3:
  Image: HRTF derived by linear interpolation.
  Secondary loudspeaker: measured HRTF data.
  Results: see Fig. 17(c).


Fig. 17. Transfer functions linking input to three-loudspeaker feeds at 20°, 30°, and 40°, image at 20°. (a) Case 1. (b) Case 2. (c) Case 3. (d) Case 4.



Case 4:
  Image: HRTF derived by linear interpolation.
  Secondary loudspeaker: HRTF derived by linear interpolation.
  Results: see Fig. 17(d) and discussion in Section 5.2.1.

Fig. 17(c) shows that when the HRTF coordinates for the image are derived by linear interpolation, the center channel response is constant with frequency, even though the secondary loudspeaker HRTF coordinates are derived by measurement. Also, when both sets of coordinates are derived by linear interpolation, all three responses are constant, as demonstrated in Section 5.2.1. However, for cases 1 and 2 all responses vary with frequency, although case 2 maintains a greater high-frequency content in the center channel response, which is desirable for an image located coincident with the secondary loudspeaker.

5.2.3 Compensation for Encoder–Decoder Loudspeaker Displacement Error

The encoder assumes a nominal loudspeaker array location when determining the positional transfer functions. If the nodal loudspeakers are placed in the listening space in equivalent positions, then no additional processing is required at the decoder. However, in circumstances where the loudspeaker locations are displaced, positional compensation is required. It is important that positional compensation be independent of source coding and can be applied at the decoder without knowledge of the encoding algorithm. A pairwise positional correction scheme is shown in Fig. 18, where the compensation functions GC11(r), GC12(r), GC11(r + 1), and GC12(r + 1) are used to derive modified loudspeaker feeds. The form of this compensation is not unique, as correction signals can be applied to other loudspeakers in the array via appropriate filters. However, it is suggested that the loudspeaker selected to process the correction signal be the one closest to the loudspeaker that has been displaced. Hence by way of example, consider the scheme shown in Fig. 18. When the only active primary feed is LSr, equating the ear signals for both optimum and displaced loudspeaker locations gives the positional compensation filters:

$$\begin{bmatrix} GC_{11}(r) \\ GC_{12}(r) \end{bmatrix} = \begin{bmatrix} h'_a(r) & h'_a(r+1) \\ h'_b(r) & h'_b(r+1) \end{bmatrix}^{-1} \begin{bmatrix} h_a(r) \\ h_b(r) \end{bmatrix} \tag{35}$$

where the primed HRTF coordinates correspond to the displaced loudspeaker locations.

Similarly, when only LSr+1 is active, then

$$\begin{bmatrix} GC_{11}(r+1) \\ GC_{12}(r+1) \end{bmatrix} = \begin{bmatrix} h'_a(r) & h'_a(r+1) \\ h'_b(r) & h'_b(r+1) \end{bmatrix}^{-1} \begin{bmatrix} h_a(r+1) \\ h_b(r+1) \end{bmatrix}. \tag{36}$$

5.3 Standardization of HRTF Grid and Nodal Loudspeaker Locations

The techniques described in the preceding require knowledge of the HRTF coordinates for each nodal loudspeaker. Establishing a standardized encoding grid, where each grid point is assigned nominal HRTF coordinates and where nodal loudspeakers are assigned nominal grid locations, can satisfy this requirement. All encoder and decoder users then universally know this information. An example grid proposal, as shown in Fig. 19, is based on a 60° subtended angle for nodal loudspeakers with two additional


Fig. 18. Compensation process for nodal loudspeaker positional errors. (a) Optimum loudspeaker location. (b) Suboptimum loudspeaker location.



secondary loudspeakers. Three layers of nodes are suggested at radii of 1.5 m, 3 m, and 6 m. It is recognized that HRTFs are not unique, being listener specific, but when a multiloudspeaker array is formed using a set of HRTFs that are shared with image synthesis, errors are reduced.

HRTF coordinates that are noncoincident with the nodalpoints can be inferred by interpolation. For example,assume that an image X is located at the cylindrical coor-dinates r, θ, where the four nearest nodes are r1, θ1,r1, θ2, r2, θ1, and r2, θ2. The interpolated HRTFsha(r, θ) and hb(r, θ) are then

multichannel system configurations. However, inevitably there is a limit to spatial resolution arising from the use of matrixing only, which imposes crosstalk between nodal and secondary loudspeaker feeds. Some advantage may be gained by using nonlinear decoding with dynamic parameterization, although for high-resolution music reproduction linear decoding should be retained.

Because DVD-A is capable of six channels at 24 bit 96


Fig. 19. Proposed 18-segment constellation map to define a standard set of HRTFs.

kHz, some of the lower bits in the LPCM stream can be sacrificed [12] while still retaining an exemplary dynamic range by using standard methods of psychoacoustically motivated noise shaping and equalization [16]. The least significant bits in the LPCM streams together with a randomization function can then be used to encode additional audio channels using perceptual coders such as AC-3,9 DTS,10 or MPEG.11 For example, 4 bit per sample per LPCM channel

where mθ and mr are the angular and radial linear interpolation parameters defining the image X. For images that lie either within the inner radius or beyond the outer radius, angular interpolation is performed first, followed by an appropriate adjustment to the amplitude and time delays based on the radial distance from the head.
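As a concrete illustration, the bilinear interpolation of Eqs. (37a) and (37b) can be sketched as follows; the dictionary-based HRTF storage and the function name are illustrative assumptions, not part of the proposal:

```python
import numpy as np

def interpolate_hrtf(h, r1, r2, th1, th2, r, th):
    """Bilinearly interpolate an HRTF between four grid nodes.

    h[(radius, theta)] holds the impulse response measured at that node;
    (r, th) is the desired image position lying inside the cell.  m_r and
    m_th are the radial and angular interpolation parameters of the text."""
    m_r = (r - r1) / (r2 - r1)
    m_th = (th - th1) / (th2 - th1)
    return ((1 - m_r) * ((1 - m_th) * h[(r1, th1)] + m_th * h[(r1, th2)])
            + m_r * ((1 - m_th) * h[(r2, th1)] + m_th * h[(r2, th2)]))
```

The same routine serves both ear responses ha and hb; outside the grid's radial range, angular interpolation alone would be applied, followed by the amplitude and delay adjustment described above.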

6 PERCEPTUALLY BASED CODING EXPLOITING EMBEDDED CODE IN PRIMARY SIGNALS TO ENHANCE SPATIAL RESOLUTION

The techniques described in Section 5 can be extended to a system with any number of nodal and secondary loudspeakers and thus can be matched to a wide variety of

ha(r, θ) = (1 − mr)[(1 − mθ) ha(r1, θ1) + mθ ha(r1, θ2)] + mr[(1 − mθ) ha(r2, θ1) + mθ ha(r2, θ2)]  (37a)

hb(r, θ) = (1 − mr)[(1 − mθ) hb(r1, θ1) + mθ hb(r1, θ2)] + mr[(1 − mθ) hb(r2, θ1) + mθ hb(r2, θ2)]  (37b)

9 Proprietary perceptual coding developed by Dolby Laboratories.

10 Digital Theatre Systems.


PAPERS SCALABLE MULTICHANNEL CODING WITH HRTF ENHANCEMENT

at 96 kHz yields a serial bit rate of 384 kbit/s.

The proposal retains the primary signals in high-resolution LPCM and uses the matrix methods described in Section 5 to estimate the secondary loudspeaker feeds. Spatially related difference signals are then calculated from the discrete secondary loudspeaker signals available at the encoder and the matrix-derived signals. Also, because of the close spatial clustering of the additional channels there is a high degree of interchannel correlation with both the primary and the secondary loudspeaker signals, a factor that bodes well for accurate perceptual coding. Close clustering also implies that perceptual coding errors are not widely dispersed in space, yielding improved masking performance. Given that DVD-A already supports six LPCM channels, it is suggested that two extra encoded signals per primary signal are a realistic compromise, yielding a total of 18 channels, as proposed in the standardized HRTF constellation illustrated in Fig. 19. In the grand plan there would be n perceptual coders in operation, one per nodal loudspeaker feed. In such a scheme further gains are possible by integrating dynamic bit allocation across all coders as well as using a perceptual model designed specifically for multichannel stereo encoding. In difficult encoding situations dynamic spatial blending can be used to reduce the difference signals prior to perceptual encoding. Fig. 20 shows the basic encoder architecture, where the error signals D1 and D2 are indicated. In Fig. 21 a decoder is shown where identical estimates are made of the secondary loudspeaker input signals, but with the addition of the difference signals to yield discrete loudspeaker feeds. Of course, if the embedded perceptually coded difference signals are not used, estimates can still be made for the secondary loudspeakers as shown in Fig. 16. Alternatively, for a basic scheme an array of nodal loudspeakers only can be used.
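The bit-burial idea can be sketched as below: the N least significant bits of each 24-bit LPCM sample are replaced by payload bits XORed with a shared pseudorandom whitener (the randomization function). The seed sharing, field widths, and function names are illustrative assumptions; at 4 bits per sample and 96 kHz this carries 4 × 96 000 = 384 kbit/s per channel, as noted above.

```python
import numpy as np

N_BITS = 4             # LSBs per 24-bit sample given over to embedded data
SHARED_SEED = 1234     # assumed known to both encoder and decoder

def bury(pcm24, payload, seed=SHARED_SEED):
    """Replace the N_BITS LSBs of each sample with randomized payload bits."""
    whitener = np.random.default_rng(seed).integers(0, 1 << N_BITS, size=pcm24.shape)
    return (pcm24 & ~((1 << N_BITS) - 1)) | (payload ^ whitener)

def unbury(pcm24, seed=SHARED_SEED):
    """Recover the payload bits from the randomized LSBs."""
    whitener = np.random.default_rng(seed).integers(0, 1 << N_BITS, size=pcm24.shape)
    return (pcm24 & ((1 << N_BITS) - 1)) ^ whitener
```

The whitening keeps the buried bits noiselike, so an unaware decoder simply reproduces them as low-level dither.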

7 CONCLUSIONS

This paper has presented a method for multichannel audio that is fully compatible with DVD (DVD-A and SACD) multichannel formats, which have the capability of six high-resolution signals. The key to this technology is the exploitation of spatial coding using HRTF data to enhance the positional representation of sound sources. As such it forms a link between two-channel transaural techniques and conventional multichannel audio using many loudspeakers. This technique has already been demonstrated in telepresence and teleconferencing applications [9], [10] to be effective in representing spatial audio. However, the methods described here show how a particular loudspeaker array can be configured, and issues of positional calibration were discussed for loudspeakers displaced from the locations assumed during coding.

The method is scalable and fully backward compatible. In a simple system there is no additional processing at the decoder, where, for example, the outputs of a DVD player are routed directly to an array of loudspeakers. However, if additional loudspeakers are used, as might be envisaged with tiled walls of flat-panel NXT12 loudspeakers (see, for example, Fig. 22), then formal methods exist, enabling the correct ear signals at the listening position to be maintained. Also for DVD-A, a method was suggested where perceptually coded information is embedded within the LPCM code to enable discrete loudspeaker signals to be derived. It was proposed that an upper limit of 18 channels should be accommodated, although full compatibility with systems down to the basic array is maintained.

An interesting observation for systems using a large number of closely spaced loudspeakers is that image positioning on the arc of the array can use simple linear amplitude panning applied between pairs of adjacent loudspeakers. This approximation assumes that the image HRTF coordinates can be estimated to sufficient accuracy using linear interpolation between adjacent loudspeaker HRTF coordinates. This expedience effectively embeds HRTF data matched exactly to the listener simply because of the physical location of the loudspeakers. However, as the loudspeaker spacing increases, this approximation fails, requiring the use of more accurate HRTF image coordinates together with transaural PWC, as described. This is particularly important where an image is located away from the arc of the loudspeaker array and where reflections are to be rendered to craft a more accurate virtual acoustic.

11 Perceptually based audio coding proposed by the Moving Picture Experts Group.

12 Registered trademark of New Transducers plc, UK.

Fig. 20. High-level processor architecture for encoding discrete secondary loudspeaker feeds.

Fig. 21. High-level decoder architecture to derive discrete secondary loudspeaker feeds.

This work is also targeted at new communication formats for virtual reality, telepresence, and video conferencing [17], [18], where future research should investigate its application. Such schemes are not constrained by the normal paradigms of multichannel stereophonic reproduction, nor is compatibility necessarily sought. The approach is to form an optimum methodology for constructing phantom images and to consider coding paradigms appropriate for communication. For example, one possible communication format assigns a discrete channel to each phantom image. The channel then conveys the auditory signals together with the spatial coordinates, updated at a rate compatible with motion tracking of the sound source. At the receiver, a processor carries a downloaded program with knowledge of the positional data and source acoustics, from which the required reflections and reverberation are computed. These data would then be formatted to match the selected loudspeaker array. Such a scheme has great flexibility and can allow many mono sources to contribute to the final soundscape.
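One way to picture such a per-source channel is the frame layout below; every field name and the frame granularity are hypothetical, since the text deliberately leaves the format open:

```python
from dataclasses import dataclass

@dataclass
class PhantomSourceFrame:
    """One block of a hypothetical per-phantom-image channel: the mono
    audio payload plus spatial coordinates refreshed at a rate matched
    to motion tracking of the source."""
    source_id: int   # which phantom image this channel carries
    audio: bytes     # one block of the source's audio signal
    r: float         # source position in cylindrical coordinates
    theta: float
    z: float
    t: float         # time stamp of this positional update
```

A receiver-side renderer would read the coordinates from each frame, compute the reflections and reverberation from its downloaded room model, and format the result for the local loudspeaker array.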

In conclusion, the techniques presented describe a means by which spatial resolution and image coding performance can transcend the six-channel limitation of the current DVD formats, yet without requiring additional storage capacity. Also, by basing signal processing on a perceptual model of hearing, it is revealed how sound images can be rendered and, in particular, how interaural amplitude differences and interaural time differences can be accommodated without seeking tradeoffs between time and amplitude cues. Essentially the work has presented a scalable and backward compatible solution to multichannel audio that is particularly well matched to an LPCM format on DVD-A.

8 REFERENCES

[1] HFN/RR, “Digital Frontiers,” vol. 40, pp. 58–59, 106 (1995 Feb.).

[2] M. O. J. Hawksford, “High-Definition Digital Audio in 3-Dimensional Sound Reproduction,” presented at the 103rd Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 45, p. 1016 (1997 Nov.), preprint 4560.

[3] A. J. Berkhout, D. de Vries, and P. Vogel, “Acoustic Control by Wave Field Synthesis,” J. Acoust. Soc. Am., vol. 93, pp. 2764–2778 (1993).

[4] D. R. Begault, 3-D Sound for Virtual Reality and Multimedia (AP Professional, 1994).

[5] M. A. Gerzon, “Periphony: With-Height Sound Reproduction,” J. Audio Eng. Soc., vol. 21, pp. 2–10 (1973 Jan./Feb.).

[6] M. A. Gerzon, “Ambisonics in Multichannel Broadcasting and Video,” J. Audio Eng. Soc., vol. 33, pp. 859–871 (1985 Nov.).

[7] M. A. Gerzon, “Hierarchical Transmission Systems for Multispeaker Stereo,” J. Audio Eng. Soc., vol. 40, pp. 692–705 (1992 Sept.).

[8] K. C. K. Foo and M. O. J. Hawksford, “HRTF Sensitivity Analysis for Three-Dimensional Spatial Audio Using the Pairwise Loudspeaker Association Paradigm,” presented at the 103rd Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 45, p. 1018 (1997 Nov.), preprint 4572.

[9] K. C. K. Foo, M. O. J. Hawksford, and M. P. Hollier, “Three-Dimensional Sound Localization with Multiple Loudspeakers Using a Pairwise Association Paradigm and Embedded HRTFs,” presented at the 104th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 46, p. 572 (1998 June), preprint 4745.

[10] K. C. K. Foo, M. O. J. Hawksford, and M. P. Hollier, “Pairwise Loudspeaker Paradigms for Multichannel Audio in Home Theatre and Virtual Reality,” presented at the 105th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 46, p. 1035 (1998 Nov.), preprint 4796.

[11] M. Gerzon, “Practical Periphony: The Reproduction of Full-Sphere Sound,” presented at the 65th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 28, p. 364 (1980 May), preprint 1571.

[12] M. A. Gerzon and P. G. Craven, “A High-Rate Buried Data Channel for Audio CD,” J. Audio Eng. Soc., vol. 43, pp. 3–22 (1995 Jan./Feb.).

Fig. 22. Multichannel configuration using array of wall-mounted flat-panel diffuse loudspeakers.

[13] J. Blauert, Spatial Hearing, rev. ed. (MIT Press, Cambridge, MA, 1997).

[14] J. Bauck and D. H. Cooper, “Generalized Transaural Stereo and Applications,” J. Audio Eng. Soc., vol. 44, pp. 683–705 (1996 Sept.).

[15] M. A. Gerzon, “Optimum Reproduction Matrices for Multispeaker Stereo,” J. Audio Eng. Soc., vol. 40, pp. 571–589 (1992 July/Aug.).

[16] J. R. Stuart and R. J. Wilson, “Dynamic Range Enhancement Using Noise-Shaped Dither Applied to Signals with and without Preemphasis,” presented at the 96th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 42, p. 400 (1994 May), preprint 3871.

[17] “Telepresence Theme,” BT Technol. J., vol. 15 (1997 Oct.).

[18] D. M. Burraston, M. P. Hollier, and M. O. J. Hawksford, “Limitations of Dynamically Controlling the Listening Position in a 3-D Ambisonic Environment,” presented at the 102nd Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 45, p. 413 (1997 May), preprint 4460.


THE AUTHOR

Malcolm Hawksford received a B.Sc. degree with First Class Honors in 1968 and a Ph.D. degree in 1972, both from the University of Aston in Birmingham, UK. His Ph.D. research program was sponsored by a BBC Research Scholarship and investigated delta modulation and sigma–delta modulation (SDM, now known as bitstream coding) for color television, and produced a digital time-compression/time-multiplex technique for combining luminance and chrominance signals, a forerunner of the MAC/DMAC video system.

Dr. Hawksford is director of the Centre for Audio Research and Engineering and a professor in the Department of Electronic Systems Engineering at Essex University, where his research and teaching interests include audio engineering, electronic circuit design, and signal processing. His research encompasses both analog and digital systems with a strong emphasis on audio systems including loudspeaker technology. Since 1982, research into digital crossover networks and equalization for loudspeakers has resulted in an advanced digital and active loudspeaker system being designed at Essex University. A first in 1986 was for a prototype system to be demonstrated at the Canon Research Centre in Tokyo, work sponsored by a research contract from Canon. Much of this work has appeared in the JAES, together with a substantial number of contributions at AES conventions.

His research has also encompassed oversampling and noise-shaping techniques applied to analog-to-digital and digital-to-analog conversion with a special emphasis on SDM. Other research has included the linearization of PWM encoders, diffuse loudspeaker technology, and three-dimensional spatial audio and telepresence including multichannel sound reproduction.

Dr. Hawksford is a recipient of the 1997/1998 AES Publications Award for his paper, “Digital Signal Processing Tools for Loudspeaker Evaluation and Discrete-Time Crossover.” He is a chartered engineer as well as a fellow of the AES, IEE, and IOA. He is currently chair of the AES Technical Committee on High-Resolution Audio and is a founder member of the Acoustic Renaissance for Audio (ARA). He is also a technical consultant for NXT, UK and LFD Audio, UK.


PAPERS

0 INTRODUCTION

Since the introduction of the digital versatile disk (DVD) and the super audio CD (SACD), multichannel audio has seen a revival in consumer sound systems. It is, however, desirable to maintain compatibility with existing two-channel stereo recordings and/or broadcasting. Therefore the conversion of two-channel stereo to the multichannel format has been studied extensively over the decades, and a considerable number of publications exist [1]–[8]. Among these, Gerzon and Barton's work is particularly notable, in which many schemes have been proposed (see [6] and references therein).

Although many authors have introduced multichannel sound systems with a large number of channels, we restrict ourselves to a home cinema setup, for which it has been shown that five channels are sufficient for creating ambience effects [9]. Hence in this paper we focus on the signal format conversion from two-channel stereo to five channels (the two-to-five sound processing algorithm).

The desired setup is shown in Fig. 1, in which the channels are labeled L (left), C (center), R (right), SL (left surround), and SR (right surround) according to convention. This setting is adopted from the ITU multichannel configuration [10], with three loudspeakers placed in front of the listener and the other two at the back.

The front channels are used to provide a high degree of directional accuracy over a wide listening area for front-stage sounds, particularly dialogue, and the rear channels produce diffuse surround sounds, providing ambience and

environmental effects. An additional loudspeaker (subwoofer) may be used to augment bass reproduction, which is often called a 5.1 system, with the .1 referring to the low-frequency enhancement (LFE) channel. In this paper, however, we do not use a subwoofer, since the system can easily be extended when necessary without affecting the algorithm.

The algorithm presented in this paper offers two improvements over existing two-to-five channel sound systems. First, a problem associated with channel crosstalk is reduced, and therefore sound localization is better. Listening tests have confirmed that good sound localization without the need to listen at the sweet spot gives the listener more freedom to enjoy the program rather than being restricted to the sweet spot.

Second, a better sound distribution to the surround channels is achieved by using a cross-correlation technique. Surround channels are crucial in creating the ambience effects, which is one of the main goals of multichannel audio. At the same time, the energy preservation criterion is an important constraint that has been used to design multichannel matrices [7]. The main reason for this is to maintain backward and forward stereo compatibility. Furthermore, the preservation criterion ensures that all signals present in the two-channel transmitted signals are reproduced at the correct power level, so that the balance between the different signal sounds in the recording is not disturbed.

This paper is organized as follows. In Section 1 a technique for deriving a robust center channel is outlined. A three-dimensional mapping to derive the surround channels is discussed in Section 2. The rest of the paper discusses subjective assessments of listening tests that were performed in order to compare the present method with other existing two-to-five channel sound systems. Concluding remarks are presented in Section 4.

Two-to-Five Channel Sound Processing*

R. IRWAN AND RONALD M. AARTS, AES Fellow

Philips Research Laboratories, 5656 AA Eindhoven, The Netherlands

While stereo music reproduction was a dramatic advance over mono, recently a transition to multichannel audio has created a more involving experience for listeners. An algorithm to convert stereo to five-channel sound reproduction is presented. An effective sound distribution to the surround channels is achieved by using a cross-correlation technique, and a robust stereo image is obtained using principal component analysis. Informal listening tests comparing this scheme with other methods revealed that the proposed algorithm is preferred both on and off the reference listening position (sweet spot).

*Partly presented at the AES 19th International Conference, Germany, 2001 June 21–24; revised 2001 October 5 and 2002 May 31.

PAPERS SOUND PROCESSING

1 CENTER LOUDSPEAKER

We consider the three-channel approach first. It is known that the sound quality of stereo sound reproduction can be improved by adding an additional loudspeaker between each adjacent pair of loudspeakers. For example, as proposed by Klipsch [1], an additional center loudspeaker C can be fed with the sum signal (xL + xR)/√2, where xL and xR represent signals from left and right, respectively. The 1/√2 factor was introduced to preserve the total energy from the three loudspeakers, assuming incoherent addition of the left, center, and right sounds recorded by two widely spaced microphones. A major drawback of this approach is that crosstalk with the left and right channels is inevitable, and therefore it narrows the stereo image considerably.

We propose an algorithm to derive the center channel without these drawbacks, using principal component analysis (PCA) [11], which produces two vectors indicating the direction of both the dominant signal y and the remaining signal q, as shown in Fig. 2 by dashed lines. Note that these two directions are perpendicular to each other, creating a new coordinate system. These two signals are then used as basis signals in the matrix decoding, a point that is different from other existing two-to-five channel sound systems.

To derive the center channel's gain using the direction of a stereo image, we process the audio signal coming from a CD (sampling frequency Fs = 44.1 kHz) on a sample-by-sample basis. Each sample of a stereo pair at a time index k


Fig. 2. Lissajous plot of stereo signal recorded from the fragment “The Great Pretender” by Freddie Mercury. Dashed lines represent a new coordinate system based on both the dominant signal y and the remaining signal q, forming the direction of a stereo image α.


Fig. 1. ITU reference configuration [10]. × marks the reference listening position (sweet spot). Left and right channels are placed at angles ±30° from C; two surround channels are placed at angles ±110° from C.



IRWAN AND AARTS PAPERS

can be expressed as

x(k) = [xL(k)  xR(k)]T  (1)

where k is an integer. Let us now define y(k) to be a linear combination of the

input signals,

y(k) = wT(k) x(k)  (2)

where

w(k) = [wL(k)  wR(k)]T  (3)

is a weight vector corresponding to the left and the right channels, respectively.

In order to find the optimum weighting vectors, we maximize the energy of Eq. (2) with respect to w, that is,

∂E[y²(k)]/∂w = 0  (4)

where E denotes the expected value. Using a method presented by Haykin [12], we obtain by means of the steepest descent method

w(k) = w(k − 1) + (µ/2) ∂E[y²(k − 1)]/∂w(k − 1) = w(k − 1) + µ E[x(k − 1) y(k − 1)]  (5)

where µ is a step size. Since E[y(k − 1)] and E[x(k − 1)] are both scalars, Eq. (5) can be estimated as

w(k) = w(k − 1) + µ y(k − 1) x(k − 1).  (6)

Normalizing Eq. (6) such that ‖w(k)‖2 = 1 gives the desired sample estimate of w,

w(k) = [w(k − 1) + µ y(k − 1) x(k − 1)] / ‖w(k − 1) + µ y(k − 1) x(k − 1)‖2 .  (7)

Assuming that the step size µ is small, Eq. (7) can be expanded as a power series in µ, yielding

w(k) = w(k − 1) + µ y(k − 1)[x(k − 1) − y(k − 1) w(k − 1)]  (8)

which is a least-mean-square (LMS) algorithm with y(k − 1) as input. Writing out Eq. (8) for left and right channels, respectively, produces

wL(k) = wL(k − 1) + µ y(k − 1)[xL(k − 1) − wL(k − 1) y(k − 1)]
wR(k) = wR(k − 1) + µ y(k − 1)[xR(k − 1) − wR(k − 1) y(k − 1)].  (9)

Karhunen [13] has shown that the algorithm is stable if and only if

0 < µ xT(k) x(k) < 2  (10)

or the step size must satisfy the following constraint:

0 < µ < 2 / [xT(k) x(k)]  (11)

and therefore it is input signal dependent.

The direction of a stereo image in terms of an angle, in

radians, can easily be computed as

α(k) = arctan[wL(k) / wR(k)].  (12)

Fig. 3 shows the values of α when it is calculated for a CD stereo music recording. Recalling Fig. 2, with the left channel corresponding to α = π/2 and the right channel to α = 0, we can see that α fluctuates around π/4, creating a phantom source almost equidistant between the left and right channels.

Fig. 4 shows the same response of the angle α, but now measured from a DVD movie fragment, where abrupt changes from one channel to the other are present. We intentionally take a shorter fragment in order to demonstrate that the algorithm is still able to detect abrupt changes in localization within a short period of time.
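The weight update of Eqs. (6), (7) and the angle of Eq. (12) can be sketched per sample as follows; the step size and the initial weight vector are illustrative assumptions:

```python
import numpy as np

def track_alpha(x_lr, mu=1e-3):
    """Track the stereo-image direction alpha, sample by sample.

    x_lr: (N, 2) array of [xL, xR] samples.  Returns alpha(k) in radians,
    using the normalized Hebbian weight update and alpha = arctan(wL/wR)."""
    w = np.array([1.0, 1.0]) / np.sqrt(2.0)   # initial weights (assumption)
    alpha = np.empty(len(x_lr))
    for k, x in enumerate(x_lr):
        y = w @ x                          # dominant signal y(k), Eq. (2)
        w = w + mu * y * x                 # Hebbian step, Eq. (6)
        w /= np.linalg.norm(w)             # unit-norm constraint, Eq. (7)
        alpha[k] = np.arctan2(w[0], w[1])  # wL over wR, Eq. (12)
    return alpha
```

For a mono-like input (xL = xR) the estimate settles at π/4, while a hard-left input drives it toward π/2, matching the conventions of Fig. 2.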

Now we can represent a pair of stereo signals using a vector given by Eq. (3). This is a vector of unit length having the right channel gain on the horizontal axis, and the left channel gain on the vertical axis, as shown in Fig. 5(a). To map this stereo vector onto a three-channel vector, we double the angle α and produce a new mapping, as depicted in Fig. 5(b). We can then find the projections of the vector onto the LR axis and the C axis using sine and cosine rules,

cLR = wR² − wL²
cC = 2 wL wR.  (13)

It should be pointed out that the transformation illustrated in Fig. 5 works only for nonnegative α. This is because for negative α, multiplication by a factor of 2 results in the vector being in a lower quadrant, and therefore no gain can be derived for the center channel. To overcome this problem, extra information should be used, as described in the next section.

2 SURROUND LOUDSPEAKERS

The surround channels are generally used to create ambience effects for music. For applications in the film industry the surround channels are used for sound effects. A common technique for ambience reconstruction is the use of delayed front channel information for the surround channels. Dolby Pro Logic, for instance, delays the surround sounds so that they arrive at the listeners' ears at least 10 ms later than the front sounds [7].

Environmental and ambience effects can be computed by considering left and right channel variations (xL − xR)


Fig. 5. (a) Direction vector plots of stereo signals. (b) Corresponding three-channel representation by doubling the angle α.


Fig. 4. Fluctuation of the direction α computed from a DVD fragment containing sounds of a car passing at high speed from one channel to the other. Total duration of the fragment is about 20 seconds.


Fig. 3. Typical example of fluctuation of α, computed from a CD stereo music fragment with a stable phantom source.


in the original signals. This variation is usually referred to as the antiphase components, the amount of which can be represented by the remaining signal q (see Fig. 2). However, it can be expected that when the amount of the dominant signal equals or almost equals that of the remaining signal, an ambiguity appears, since there is no way of determining the direction vector uniquely. In this situation the distribution in Fig. 2 is no longer an ellipse but has a circlelike form (y ≈ q), as illustrated in Fig. 6, causing α to be not well defined.

Obviously extra information is necessary when dealing with this sort of ambiguity. In this paper we propose to use a known technique to measure the amount of antiphase components, namely, the correlation coefficient, which is given in any textbook on statistics as

ρ = Σ(xL − x̄L)(xR − x̄R) / √[Σ(xL − x̄L)² Σ(xR − x̄R)²]  (14)

where x̄L and x̄R are the mean values of xL and xR, respectively. Aarts et al. [14] have shown that Eq. (14) can be computed recursively by using only a few arithmetic operations,

ρ̂(k) = ρ̂(k − 1) + γ {2 xL(k) xR(k) − [xL²(k) + xR²(k)] ρ̂(k − 1)}  (15)

where γ is the step size determining the time constant, and the caret (^) is used to denote that it is an estimate of the true ρ. A summary of the mathematical derivations of Eq. (15) can be found in Appendix 3.

To give some idea of how this tracking algorithm works, we present two examples of measurements using Eq. (15), which are shown in Figs. 7 and 8. The measurements are performed within a time frame of 50 seconds, with the step size γ set to 10⁻³ at Fs = 44.1 kHz. Fig. 7 is a typical example of a modest stereo recording, for which the correlation varies around 0.70, and it is thus neither too strong (mono sound) nor too weak (diffuse sound). On the other hand, Fig. 8 shows an example of an uncorrelated stereo signal with many antiphase components, for which α is difficult to detect (see Fig. 6).

It is worth mentioning here that there are three other variants of Eq. (15) which are equally robust. For further information the reader is referred to [14].
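The recursion of Eq. (15) is cheap enough to run per sample; a minimal sketch (the initial estimate and default step size are assumptions):

```python
def track_rho(x_lr, gamma=1e-3):
    """Recursively estimate the L/R correlation coefficient, Eq. (15).

    x_lr: iterable of (xL, xR) sample pairs; gamma sets the time constant.
    Returns the list of per-sample estimates rho_hat(k)."""
    rho = 0.0                   # initial estimate (assumption)
    out = []
    for xl, xr in x_lr:
        rho += gamma * (2.0 * xl * xr - (xl * xl + xr * xr) * rho)
        out.append(rho)
    return out
```

Feeding a mono-like pair drives the estimate toward +1, and an antiphase pair toward −1, which is exactly the behavior traced in Figs. 7 and 8.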

Since

−1 ≤ ρ ≤ 1  (16)

it is possible that the antiphase components exceed the dominant signal (y < q). In this case we treat the input signals as uncorrelated, and therefore

ρ̂ = ρ̂ if 0 ≤ ρ̂ ≤ 1; 0 otherwise.  (17)

It can be shown (see Appendix 1) that a relationship exists between this cross-correlation method and the PCA described in the previous section.


Fig. 6. Lissajous plot of the first 23 seconds of the stereo signal recorded from the fragment “Holiday” by Madonna, where the amount of a dominant signal is almost equal to that of a remaining signal, forming a circlelike distribution.



2.1 Three-Dimensional Mapping

To avoid ambiguity when the amount of the dominant

signal approaches that of the remaining signal, the use of both the direction of the stereo image and the correlation coefficient is necessary. The latter is included in the mapping (see Fig. 5) by, for example, placing the surround channels in the vertical plane, as shown in Fig. 9.

The angle β can be defined to represent the actual surround information by means of the adaptive correlation coefficient, for example, by using the expression

β(k) = arcsin[1 − ρ̂(k)]  (18)

and hence,

0 ≤ β(k) ≤ π/2.  (19)

Thus as the amount of the remaining signal increases (input signals become weakly correlated), the angle β also increases, which reduces the total distribution to the front channels. On the other hand, when the input signals are strongly correlated (quasi mono), β approaches zero, producing a larger contribution to the front channels. This principle satisfies the energy preservation criterion, which is discussed in more detail in Section 2.2.

Since the direction vector on the horizontal plane is now lifted by an angle β, recalculation of the projections is necessary. Using straightforward trigonometry and keeping in mind that the vector is of unit length, we obtain

c′LR = cLR cos β
c′C = cC cos β
cS = sin β.  (20)
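Combining the angle doubling of Eq. (13), the surround angle of Eq. (18), and the projections of Eq. (20) gives one gain computation per sample; this sketch assumes ρ̂ has already been clipped to [0, 1] as in Eq. (17):

```python
import math

def spatial_gains(wl, wr, rho):
    """Front/surround gains from unit-norm PCA weights and correlation rho."""
    c_lr = wr * wr - wl * wl      # cos 2*alpha, projection on the LR axis, Eq. (13)
    c_c = 2.0 * wl * wr           # sin 2*alpha, projection on the C axis, Eq. (13)
    beta = math.asin(1.0 - rho)   # surround elevation angle, Eq. (18)
    cos_b = math.cos(beta)
    return c_lr * cos_b, c_c * cos_b, math.sin(beta)   # c'LR, c'C, cS, Eq. (20)
```

Because (wL, wR) has unit norm, the squares of the three returned gains always sum to one, consistent with the energy preservation criterion.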

2.2 Matrixing

The system described so far reproduces four channel signals as L, C, R, and S from two input signals. Therefore we have a 4 × 2 reproduction matrix.

We now discuss the objective requirement of energy preservation emphasized in Section 2.1. A matrix preserves energy if and only if its columns are of unit length and pairwise orthogonal. Since the product of any two orthogonal matrices is also orthogonal, backward and forward compatibility between stereo and multichannel can also be achieved.
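The energy-preservation condition, unit-length and pairwise-orthogonal columns, can be verified numerically; the 4 × 2 matrix in the example below is hypothetical:

```python
import math

def preserves_energy(m, tol=1e-9):
    """Return True if the columns of matrix m (a list of rows) are of unit
    length and pairwise orthogonal, i.e. m^T m = I, which guarantees
    ||m v|| = ||v|| for every input vector v."""
    cols = list(zip(*m))
    for i, ci in enumerate(cols):
        for j, cj in enumerate(cols):
            dot = sum(a * b for a, b in zip(ci, cj))
            target = 1.0 if i == j else 0.0
            if abs(dot - target) > tol:
                return False
    return True

# Hypothetical 4 x 2 matrix with two orthonormal columns:
b = math.pi / 6
m = [[math.cos(b), 0.0],
     [0.0, math.cos(b)],
     [0.0, math.sin(b)],
     [math.sin(b), 0.0]]
```

A matrix passing this check maps the two input signals to four channels without changing the total signal energy.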

Following this energy criterion, we design the matrix as follows:

⎡uL(k)⎤   ⎡ cL(k)   g wL(k) ⎤
⎢uR(k)⎥ = ⎢ cR(k)   g wR(k) ⎥ ⎡ y(k) ⎤   (21)
⎢uC(k)⎥   ⎢ cC(k)     0     ⎥ ⎣ q(k) ⎦
⎣uS(k)⎦   ⎣   0      cS(k)  ⎦

The components of the left-hand side of Eq. (21) denote the signals for the left, right, and center loudspeakers, and uS denotes the mono surround signal. The basis signals are obtained by rotating the coordinate system of xL and xR,

y(k) = wL(k) xL(k) + wR(k) xR(k)
q(k) = wR(k) xL(k) − wL(k) xR(k)   (22)

and

cL(k) = cLR(k) if cLR(k) < 0, and 0 otherwise
cR(k) = cLR(k) if cLR(k) ≥ 0, and 0 otherwise   (23)

and g is a gain to control the energy preservation.
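One sample of the matrixing stage can be sketched as follows (our own illustration of Eqs. (21)–(23); the gains g = cos β and cS = sin β are assumptions consistent with the energy discussion in Section 2.2, and cLR, cC, wL, wR are passed in precomputed):

```python
import math

def upmix_sample(y, q, c_lr, c_c, w_l, w_r, beta):
    """One sample of the 4 x 2 reproduction matrix (Eq. (21)).
    y, q     : dominant and remaining signals from the rotated (xL, xR) pair
    c_lr     : signed left/right gain; c_c : center gain
    w_l, w_r : rotation weights; beta : elevation angle
    Assumptions: g = cos(beta) and c_s = sin(beta)."""
    # Eq. (23): only one of cL, cR is active, depending on the sign of cLR
    c_left = c_lr if c_lr < 0 else 0.0
    c_right = c_lr if c_lr >= 0 else 0.0
    g = math.cos(beta)    # assumed energy-preserving gain
    c_s = math.sin(beta)  # surround gain from the vertical mapping
    u_left = c_left * y + g * w_l * q
    u_right = c_right * y + g * w_r * q
    u_center = c_c * y
    u_surround = c_s * q
    return u_left, u_right, u_center, u_surround
```

Feeding a fully right-panned dominant signal (c_lr ≥ 0, q = 0) reproduces it in the right channel only, matching the first rows of Table 1.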

J. Audio Eng. Soc., Vol. 50, No. 11, 2002 November 919

Fig. 8. Tracked cross-correlation coefficients obtained from the fragment “Holiday” by Madonna. See Fig. 6 for the corresponding Lissajous plot.

Fig. 7. Tracked cross-correlation coefficients obtained from the fragment “The Great Pretender” by Freddie Mercury. See Fig. 2 for the stereo image, which is a typical example of a modest stereo recording.

Fig. 9. Three-dimensional mapping showing front (horizontal plane) and surround channels (vertical plane). Parameter β determines the level of surround information with respect to the front channel sounds.

[Fig. 9: mapping geometry with angles α and β and gains cS, cLR, cC, c′. Figs. 7 and 8: ρ(t) plotted over t = 0–50 s, vertical range −1.00 to 1.00.]


IRWAN AND AARTS PAPERS

Since cL and cR can only produce one value, depending on the condition in Eq. (23), the squared length of the first column of the matrix given in Eq. (21) is equal to c²LR + c²C, which is unity. The second column of Eq. (21) contains mainly a projection of the vector onto the horizontal plane (see Fig. 9). The squared length of this column is equal to g²(w²L + w²R) + c²S = 1. The two columns are thus of unit length and pairwise orthogonal if g² = cos² β, and therefore the matrix preserves the total energy.

Finally, the Lauridsen [15] decorrelator is used to obtain stereo surround because of its simplicity. This decorrelator can be viewed as two FIR comb filters (hL and hR) with two taps each for surround left and surround right. The impulse responses of these filters are illustrated in Fig. 10. A time delay of δ ≈ 10 ms (440 samples) is used between the taps, which was determined experimentally.

The choice of the time delay δ is a subtle compromise between the amount of widening and the sound diffuseness. The greater δ is, the more diffuse the sounds will be, and at some point it will lead to confusion.

Note that there are other decorrelator filters available, such as complementary comb filters, in which the “teeth” are distributed equally on a logarithmic frequency scale. Informal listening tests, however, revealed that the Lauridsen decorrelator is better appreciated when it is applied to the surround channels. Furthermore, its efficiency in the implementation also plays an important role in our application.
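The Lauridsen decorrelator can be sketched as a pair of two-tap comb filters (the 440-sample delay corresponds to ≈ 10 ms at 44.1 kHz; the 0.5 output scaling is our addition to keep the outputs in range):

```python
def lauridsen_decorrelate(mono, delay=440):
    """Split a mono surround signal into a decorrelated stereo pair with two
    two-tap FIR comb filters, hL = [1, +1] and hR = [1, -1], separated by
    `delay` samples (Fig. 10); 440 samples is about 10 ms at 44.1 kHz."""
    left, right = [], []
    for n, x in enumerate(mono):
        delayed = mono[n - delay] if n >= delay else 0.0
        left.append(0.5 * (x + delayed))   # sum comb (peak at dc)
        right.append(0.5 * (x - delayed))  # difference comb (notch at dc)
    return left, right
```

The complementary frequency responses of the two combs are what widen the surround image while keeping the mono sum intact.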

2.3 Analysis of Each Discrete Channel

The behavior of the proposed method can be analyzed in each channel and gives useful information for validating an implementation. In addition, such an analysis can be used to demonstrate the channel separation of our method. Such analyses are listed in Table 1.

First we feed the system with a sine wave in the left channel only and set the right channel to zero,

xL(k) = sin(ωk)
xR(k) = 0.   (24)

In this situation we have wL = 1 and wR = 0. Therefore,

y(k) = sin(ωk)
q(k) = 0.   (25)

Substituting Eqs. (24) and (25) into Eq. (21), and keeping in mind that cR = cC = 0, we obtain

uL(k) = cL(k) sin(ωk)
uR(k) = 0
uC(k) = 0   (26)
uS(k) = 0.



Fig. 10. Impulse responses of the left and right Lauridsen decorrelation filters. Time delay δ ≈ 10 ms (440 samples) was experimentally chosen to produce the most pleasant stereo sounds for the application concerned.

Table 1. Summary of extreme cases described in the text.*

Input                        uL    uR    uC   uS    wL     wR     ρ0   β
xL = f, xR = 0               f     0     0    0     1      0      0    π/2
xL = 0, xR = f               0     f     0    0     0      1      0    π/2
xL = xR = f                  0     0     κf   0     ½√2    ½√2    1    0
xL = f, xR = −f              κf    −κf   0    κf    ½√2    ½√2    0    π/2
xL, xR ≠ 0, uncorrelated     0     0     0    κf    ––     ––     0    π/2

* A time signal f is used to represent any input signal fed into the left, right, or a combination of left and right channels. κ represents a scalar due to the mapping (Fig. 9). Note that when uncorrelated signals are fed into the system, wL and wR become undefined, meaning they can assume any value.



This is to be expected, as any signal fed into one particular channel should be kept the same in the output.

Second, feeding the input signal into the right channel, some other combinations can be analyzed similarly. The outputs of these combinations are summarized in Table 1.

From the table it can be seen that fully correlated input signals will be reproduced in the center channel while all other channels are zero. This explains the strong sound localization that is achieved during the listening test, which is discussed in the next section.

Furthermore, when we feed the left and right channels with antiphase signals, no sound will be reproduced in the center channel, the left and right outputs are in antiphase, and some sound goes to the surround channels. This extreme case demonstrates how ambience effects are created when many antiphase signals are present in the original signals.
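These extreme cases can be checked with a plain batch correlation coefficient (a simplified, non-tracked stand-in for the ρ0 used in the paper; the test signals are our own):

```python
import math
import random

def corr(x, y):
    """Batch cross-correlation coefficient of two zero-mean signals;
    a simplified, non-tracked stand-in for rho0."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return sxy / math.sqrt(sxx * syy)

s = [math.sin(0.1 * k) for k in range(4096)]
random.seed(0)
noise = [random.uniform(-1.0, 1.0) for _ in range(4096)]

# Fully correlated (quasi mono) input gives corr near +1 (center channel);
# antiphase input gives corr near -1; uncorrelated input gives corr near 0,
# which corresponds to beta near pi/2 (surround channels).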

3 LISTENING TEST

In order to investigate the appreciation of the discussed conversion method, the system was tested together with a few other systems.

3.1 Method

The method of paired comparisons [16] was used to gather personal preference data. During each trial, subjects heard a music excerpt encoded by a certain method Mi, then the same excerpt encoded by Mj. The subject could repeat the pairs as often as desired, and finally had to indicate whether Mi was preferred above Mj or vice versa. There were four systems under test, so six pairs per repertoire per listening position were offered to the subjects. If Mi was preferred above Mj, a 1 was placed in a preference matrix X at element xij, or otherwise at xji. This matrix was scaled with Thurstone's decision model [17] (see Appendix 2 for more details).

3.2 Technical Equipment and Repertoire

The subjects were either at the position advised by the ITU [10], referred to as the sweet spot, or 1 m aside of that spot, referred to as “off the sweet spot.” The loudspeakers used were Philips DSS940 (digital) loudspeakers. The listening room was a rather dry listening room, which enabled a critical judgment of localization and crosstalk between the channels.

We compared four different systems: the system proposed in this paper (system 1); its variant, which puts more low-frequency content to the surrounds (system 2); and two other commercially available systems (systems 3 and 4, respectively).

Four different music excerpts were chosen: three music fragments (Fish: “The Company,” The Corrs: “What Can I Do,” and Melanie C: “Never Be the Same Again”) and a sound track from the motion picture “The Titanic” (track 23 from the DVD). In total a subject had to give 6 pairs × 2 positions × 4 fragments = 48 assessments.

The number of different tracks was somewhat limited. To derive more general conclusions, more tracks would be necessary, such as those used in [19]. However, our primary aim was to focus on the theory, while more elaborate listening tests can still be done in the future.

3.3 Subjects

There were 17 subjects. Most were experienced listeners, and all had no reported hearing loss.

3.4 Results

For each type of repertoire and each subject the scaled results were plotted. An example is given in Fig. 11. A high scale value means high appreciation. The value itself is of no importance; it is the value with respect to the others that counts. The sum of the scale values equals zero.

The number on top of Figs. 11–13 denotes the coefficient of consistence ξ [16]. A value of ξ = 1 means fully consistent; in such a case there is never a violation of the triangular inequalities (Xi > Xj > Xk > Xi). ξ = 0 means fully inconsistent. This coefficient is important for various reasons. First, it reveals how consistently a subject judges the stimuli; in this case subject SP appeared to be a very consistent judge. Second, if the subjects have different preferences with respect to each other, or for different repertoires, then summing their preference matrices will lower the ξ value.
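The coefficient of consistence can be computed by counting circular triads (a sketch following Kendall's formulation for a single subject's complete 0/1 preference matrix; the example matrix is hypothetical):

```python
from math import comb

def consistence(pref):
    """Coefficient of consistence xi for a complete 0/1 preference matrix:
    xi = 1 means no circular triads (Xi > Xj > Xk > Xi never occurs),
    xi = 0 means fully inconsistent (Kendall's formulation)."""
    n = len(pref)
    wins = [sum(row) for row in pref]                # a_i: times i preferred
    d = comb(n, 3) - sum(comb(a, 2) for a in wins)   # number of circular triads
    d_max = (n**3 - n) / 24 if n % 2 else (n**3 - 4 * n) / 24
    return 1.0 - d / d_max

# Four systems ranked consistently 0 > 1 > 2 > 3:
fully_consistent = [[0, 1, 1, 1],
                    [0, 0, 1, 1],
                    [0, 0, 0, 1],
                    [0, 0, 0, 0]]
```

Any cycle in the preferences lowers ξ below 1, which is why summing matrices of subjects with differing tastes lowers the pooled value.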

For both positions, on the sweet spot and off the sweet spot, the results for all subjects and repertoires are scaled and plotted in Figs. 12 and 13, respectively.

It is clear that system 1, the system discussed in this paper, performs very well, both on the sweet spot and off the sweet spot. In Figs. 12 and 13 we see values of ξ = 0.4 and ξ = 0.6, respectively, revealing that not all the subjects act as one single fully consistent subject. However, it appeared that the individual subjects acted rather consistently. Furthermore, the figures show that system 1 is rather robust in the face of a displacement from the


Fig. 11. Results of subject SP, four music excerpts, off the sweet spot. Scale values on the vertical axis are arbitrary.

[Fig. 11: scale value (arbitrary, −8 to 8) vs. stereo reproduction system; ξ = 1.0000.]



sweet spot, which is a desirable property.

Another analysis of the results of the listening test, using the Bradley–Terry model [16], was performed [18]. This model was fitted by the maximum-likelihood method, and the goodness of fit was tested. It revealed that the four processing methods differed significantly from one another.

4 CONCLUSION

A new method to convert two-channel stereo to multichannel sound has been presented. A three-dimensional representation has been used to produce each channel's gain, which is time varying. PCA proved to be a powerful tool to detect the direction of a stereo image, which is then used to derive the center channel's gain. Furthermore, a robust tracking algorithm for computing the cross correlation between left and right channels has been used to improve the sound quality of the surround channels.

A listening test comparing four different systems has been carried out using both music recordings and DVD movie tracks, and the results have been analyzed using the Thurstone scaling technique as well as the Bradley–Terry model. The preliminary listening test has shown that the proposed method performs very well, both on and off the sweet spot. Moreover, it has been shown that the four processing methods differed significantly from one another.

5 ACKNOWLEDGMENT

The authors thank D. Roovers for his work in the initial phase of the project, and the statisticians T. J. J. Denteneer and J. Engel for interpreting the results of the listening tests.

6 REFERENCES

[1] P. W. Klipsch, “Stereophonic Sound with Two Tracks, Three Channels by Means of a Phantom Circuit (2PH3),” J. Audio Eng. Soc., vol. 6, p. 118 (1958).

[2] P. Scheiber, “Four Channels and Compatibility,” J. Audio Eng. Soc., vol. 19, pp. 267–279 (1971 Apr.).

[3] J. Eargle, “4–2–4 Matrix Systems: Standards, Practice, and Interchangeability,” J. Audio Eng. Soc., vol. 20, pp. 809–815 (1972 Dec.).

[4] M. E. G. Willcocks, “Transformations of the Energy Sphere,” J. Audio Eng. Soc., vol. 31, pp. 29–36 (1983 Jan./Feb.).

[5] S. Julstrom, “A High-Performance Surround Sound Process for Home Video,” J. Audio Eng. Soc., vol. 35, pp. 536–549 (1987 July/Aug.).

[6] M. A. Gerzon and G. J. Barton, “Ambisonic Decoders for HDTV,” presented at the 92nd Convention of the Audio Engineering Society, J. Audio Eng. Soc., vol. 40, p. 438 (1992 May), preprint 3345.

[7] R. Dressler, “Pro Logic Surround Decoder, Principles of Operation,” Dolby Laboratories, http://www.dolby.com (1997).

[8] R. Irwan and R. M. Aarts, “A Method to Convert Stereo to Multichannel Sound,” presented at the AES 19th International Conference, Germany (2001).

[9] G. Theile, “HDTV Sound System: How Many Channels?” in Proc. 10th AES Conf. on Images and Audio (1991), pp. 147–162.

[10] ITU-R BS.1116, “Methods for the Subjective Assessment of Small Impairments in Audio Systems Including Multichannel Sound Systems,” International Telecommunications Union, Geneva, Switzerland (1994).

[11] T. W. Lee, Independent Component Analysis:


Fig. 12. Total scaling results of all 17 subjects, four music excerpts, on the sweet spot. Scale values on the vertical axis are arbitrary.

Fig. 13. Total scaling results of all 17 subjects, four music excerpts, off the sweet spot. Scale values on the vertical axis are arbitrary.

[Fig. 12: scale value vs. stereo reproduction system; ξ = 0.3739. Fig. 13: scale value vs. stereo reproduction system; ξ = 0.5695.]



Theory and Applications (Kluwer, Boston, MA, 1998).

[12] S. Haykin, Neural Networks, 2nd ed. (Prentice-Hall, Englewood Cliffs, NJ, 1999).

[13] J. Karhunen, “Stability of Oja's PCA Subspace Rule,” Neural Comput., vol. 6, pp. 739–747 (1994).

[14] R. M. Aarts, R. Irwan, and A. J. E. M. Janssen, “Efficient Tracking of the Cross-Correlation Coefficient,” IEEE Trans. Speech Audio Process., vol. 10, pp. 391–402 (2002 Sept.).

[15] H. Lauridsen, “Experiments Concerning Different Kinds of Room-Acoustics Recording” (in Danish), Ingenioren, vol. 47 (1954).

[16] H. A. David, The Method of Paired Comparisons, 2nd ed. (Griffin, London, 1988).

[17] W. S. Torgerson, Theory and Methods of Scaling (Wiley, New York, 1958).

[18] J. Engel, “Paired Comparisons on Audio: Modelling the System Differences,” Intern. Rep. E1624-20, CQM (Centre for Quantitative Methods), Eindhoven, The Netherlands (2001 July).

[19] G. A. Soulodre, T. Grusec, M. Lavoie, and L. Thibault, “Subjective Evaluation of State-of-the-Art Two-Channel Audio Codecs,” J. Audio Eng. Soc. (Engineering Reports), vol. 46, pp. 164–177 (1998 Mar.).

[20] R. M. Aarts, “A New Method for the Design of Crossover Filters,” J. Audio Eng. Soc., vol. 37, pp. 445–454 (1989 June).

[21] L. L. Thurstone, “A Law of Comparative Judgement,” Psychol. Rev., vol. 34, pp. 273–286 (1927).

[22] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions (Dover, New York, 1972).

APPENDIX 1
RELATION BETWEEN CORRELATION COEFFICIENT ρ AND PCA

As was presented in Section 2, the cross-correlation technique is very useful in determining the surround channel distribution. In this appendix we discuss the relationship between the cross-correlation technique and the principal component analysis (PCA), also known as the Karhunen–Loève transformation. In general, PCA maximizes the rate of decrease in variance for each of its components. The solution lies in the eigenstructure of the covariance matrix C.

We may decompose C with the singular-value decom-position (SVD) [12] as

C = V Λ V^−1   (27)

where V is an orthogonal (unitary) matrix of eigenvectors vj and Λ is a diagonal matrix with the eigenvalues

Λ = diag [λ1, λ2, …, λj, …, λm]   (28)

arranged in decreasing order,

λ1 > λ2 > … > λj > … > λm   (29)

so that λ1 = λmax.

It appears that the PCA and the SVD of C are basically the same, just viewing the problem in different ways. From the theory of SVD it follows that the eigenvectors of the covariance matrix C define the unit vectors vj representing the principal directions along which the variance is maximized for each component. The associated eigenvalues define the total variance of the m elements,

Σ_{j=1}^{m} σj² = Σ_{j=1}^{m} λj.   (30)

In the present case C is constructed in the following way. Consider a segment of left and right audio samples xL = [xL(1) … xL(B)] and xR = [xR(1) … xR(B)]; hence m = 2. With their correlation coefficient ρ, the covariance matrix C can be written as

    ⎡ σ²xL         ρ σxL σxR ⎤
C = ⎢                        ⎥   (31)
    ⎣ ρ σxL σxR    σ²xR      ⎦

Now we see the relation between the correlation coefficient ρ on the left-hand side of Eq. (27) and the principal components on the right-hand side of Eq. (27). The latter we will develop more explicitly in the following.

Since C is positive definite, the eigenvalues of C are both real and positive and can be calculated as

λ1,2 = ½ (σ²xL + σ²xR ± s)   (32)

where

s = √[ (σ²xL − σ²xR)² + (2ρ σxL σxR)² ].   (33)

The eigenvectors of C corresponding to the eigenvalues λ1,2 are

v1,2 = γ ( 2ρ σxL σxR , σ²xR − σ²xL ± s )ᵀ   (34)

where γ is such that ‖v1,2‖ = 1.

If we consider V as a rotation matrix over the angle α, as used in Eq. (12), then we can derive

tan (2α) = 2ρ σxL σxR / (σ²xL − σ²xR).   (35)

As a special case we consider ρ = 0. Then the left and right channels are uncorrelated and the eigenvectors are just (1, 0) and (0, 1), which coincide with the original left and right axes. As corresponding eigenvalues we have λ1 = σ²xL and λ2 = σ²xR, which are the powers of the left and right channels, respectively. A similar case occurs if σ²xL = σ²xR = σ²; then λ1,2 = σ² and α = π/4 and/or ρ = 0.

Since PCA can be seen as finding a decomposition of the covariance matrix C, the transformation into dominant and remaining signals using PCA described in Section 1 can also be carried out by computing v1 and v2, respectively. It has, however, limited practical usefulness, since it requires more computational effort than the efficient LMS algorithm given in Eq. (6).
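The closed-form eigenstructure of the 2 × 2 covariance matrix can be sketched directly (a transcription of the expressions discussed in this appendix; atan2 is our choice for numerical robustness when the channel powers are equal):

```python
import math

def principal_angle(var_l, var_r, rho):
    """Eigenvalues of the 2 x 2 covariance matrix of a stereo pair and the
    rotation angle alpha, in closed form.
    var_l, var_r : channel powers (variances); rho : correlation coefficient."""
    cross = 2.0 * rho * math.sqrt(var_l * var_r)
    s = math.hypot(var_l - var_r, cross)       # discriminant term
    lam1 = 0.5 * (var_l + var_r + s)           # dominant-signal variance
    lam2 = 0.5 * (var_l + var_r - s)           # remaining-signal variance
    alpha = 0.5 * math.atan2(cross, var_l - var_r)
    return lam1, lam2, alpha
```

For ρ = 0 the eigenvalues reduce to the channel powers, and for equal powers with full correlation alpha becomes π/4, matching the special cases above.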




APPENDIX 2
SCALING

A2.1 Introduction

A problem encountered in many disciplines is how to measure and interpret the relationships between objects. A second problem is the lack, in general, of a mathematical relationship between the perceived response and the actual physical measure. With regard to this paper, how does the appreciation of our 2-to-5-channel system differ from others? How do we measure, and what scale do we need? In the following we discuss some scales and techniques and give two examples.

A2.2 Scaling

The purpose of scaling is to quantify the qualitative relationships between objects by scaling data. Scaling procedures attempt to do this by using rules that assign numbers to qualities of things or events. There are two types of scaling: univariate scaling, which is explained hereafter, and multidimensional scaling (MDS), which is an extension of univariate scaling (see, for example, [20]). Univariate scaling is usually based on the law of comparative judgment [17], [21]. It is a set of equations relating the proportion of times any stimulus i is judged greater, or is more highly appreciated relative to a given attribute (in our case the appreciation), than any other stimulus j. The set of equations is derived from the postulates presented in [17]. In brief, these postulates are as follows.

1) Each stimulus when presented to an observer gives rise to a discriminal process, which has some value on the psychological continuum of interest (in our case the appreciation).

2) Because of momentary fluctuations in the organism, a given stimulus does not always excite the same discriminal process. This can be considered as noise in the process. It is postulated that the values of the discriminal process are such that the frequency distribution is normal on the psychological continuum.

3) The mean and the standard deviation of the distribution associated with a stimulus are taken as its scale value and discriminal dispersion, respectively.

Consider the theoretical distributions Sj and Sk of the discriminal process for any two stimuli j and k, respectively, as shown in Fig. 14(a). Let Sj and Sk correspond to the scale values of the two stimuli and σj and σk to their discriminal dispersions caused by noise.

Now we assume that the standard deviations of the distributions are all equal and constant (as in Fig. 14), and that the correlation between the pairs of discriminal processes is constant. This is called “condition C” in Torgerson [17]. Since the distribution of the difference of two normal distributions is normal, we get

Sk − Sj = c xjk   (36)

where c is a constant and xjk is the transformed [see Eq. (39)] proportion of the number of times stimulus k is more highly appreciated than stimulus j. Eq. (36) is also known as Thurstone's case V. The distribution of the discriminal differences is plotted in Fig. 14(b). Eq. (36) is a set of n(n − 1) equations with n + 1 unknowns: n scale values and c. This can be solved with the least-squares method. Setting c = 1 and the origin of the scale to the mean of the estimated scale values, that is,

(1/n) Σ_{j=1}^{n} sj = 0   (37)

we get

sk = (1/n) Σ_{j=1}^{n} xjk.   (38)

Thus the least-squares solution for the scale values can be obtained simply by averaging the columns of matrix X. However, the elements xjk of X are not directly available. With paired comparisons we measure the proportion pjk that stimulus k was judged greater than stimulus j. This proportion can be considered a probability that stimulus k was judged greater than stimulus j. This probability is equal to the shaded area in Fig. 14(b), or

pjk = erf (xjk)   (39)

where erf is the error function [22, §7, 26.2], which can easily be approximated (see, for example, [22, §26.2.23]). A problem may arise if pjk ≈ 1, since xjk can then be very large. In this case one can replace xjk by a large value.

It may be noted that this type of transformation is also known as the Gaussian transform, where instead of the symbol x, z is used, known as the z score. Instead of using Eq. (39), other models can be used, such as the Bradley–Terry model (see [16]). All forms of the law of comparative


Fig. 14. (a) Probability distributions Sj and Sk of stimuli j and k on the psychological continuum, with mean values Sj and Sk. (b) Probability distribution of the difference of the random variables. The shaded portion gives the proportion of times stimulus k was judged greater than stimulus j. Sk − Sj is proportional to the difference in scale value for the two stimuli.




judgment assume that each stimulus has been compared with the other stimuli a large number of times. The direct method of obtaining the values of pjk is known as the method of paired comparisons (see, for example, [16]). As an example, the measured probabilities pjk for a subject are listed in Table 2. The upper triangle is calculated as pjk = 1 − pkj. Using Eq. (39) the xjk values are obtained. Using Eq. (38) the final scale values are determined and plotted in Fig. 11.
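The scaling procedure, from the proportion matrix to the final scale values, can be sketched as follows (the inverse normal CDF replaces the erf transform, an equivalent z-score mapping; the clipping threshold eps stands in for "replacing xjk by a large value" and is our own choice):

```python
from statistics import NormalDist

def thurstone_scale(p, eps=0.01):
    """Thurstone case-V scale values from proportions p[j][k] (probability
    that stimulus k was preferred over stimulus j): z-transform each
    proportion, then average the columns (Eq. (38)). Extreme proportions
    are clipped to [eps, 1 - eps] to keep the z scores finite."""
    nd = NormalDist()
    n = len(p)
    # z-transformed matrix X; diagonal entries correspond to p = 0.5 -> 0
    x = [[0.0 if j == k else nd.inv_cdf(min(max(p[j][k], eps), 1.0 - eps))
          for k in range(n)] for j in range(n)]
    return [sum(x[j][k] for j in range(n)) / n for k in range(n)]
```

The resulting scale values sum to zero, as noted in Section 3.4, so only their relative spacing carries information.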

APPENDIX 3
EFFICIENT COMPUTATION OF THE CROSS-CORRELATION COEFFICIENT

In this appendix we summarize the mathematical derivations of the tracked cross-correlation coefficient, as fully reported in [14]. For generality we use x and y for the two signals being correlated instead of xL and xR, which specifically represent stereo audio signals.

We show that ρ satisfies to a good approximation (when η is small) the recursion in Eq. (15), with γ given by

γ = c e^η / (2 xrms yrms)   (40)

where c = 1 − e^−η, and the subscripts rms refer to the root-mean-square values of x and y.

Using an exponential window, we can redefine the correlation of x and y at time instant k as

ρ(k) = Sxy(k) / [Sxx(k) Syy(k)]^{1/2}   (41)

for k an integer and where

Sxy(k) = c Σ_{l=0}^{∞} e^{−ηl} x(k−l) y(k−l) = e^{−η} Sxy(k−1) + c x(k) y(k)   (42)

and Sxx and Syy are defined similarly. Hence,

ρ(k) = [e^{−η} Sxy(k−1) + c x(k) y(k)] / { [e^{−η} Sxx(k−1) + c x²(k)] [e^{−η} Syy(k−1) + c y²(k)] }^{1/2}.   (43)

Since we consider small values of η, we have that c = 1 − e^−η is small as well. Expanding the right-hand side of Eq. (43) in powers of c and retaining only the constant and the linear terms, we get, after some calculations,

ρ(k) = ρ(k−1) + c e^η { x(k) y(k) / [Sxx(k−1) Syy(k−1)]^{1/2} − [ρ(k−1)/2] [x²(k)/Sxx(k−1) + y²(k)/Syy(k−1)] } + O(c²).   (44)

Then, deleting the O(c²) term, we obtain the recursion in Eq. (15), with γ given by Eq. (40), when we identify

x²rms = Sxx(k)
y²rms = Syy(k)   (45)

for a sufficiently large k, and assuming that x²rms ≈ y²rms.

One may ask how to handle signals x and y that have nonzero, and actually time-varying, mean values. In those cases we still define ρ(k) as in Eq. (41), however, with Sxy replaced by

S̄xy(k) = c Σ_{l=0}^{∞} e^{−ηl} [x(k−l) − x̄(k)] [y(k−l) − ȳ(k)]   (46)

where

x̄(k) = c Σ_{l=0}^{∞} e^{−ηl} x(k−l)
ȳ(k) = c Σ_{l=0}^{∞} e^{−ηl} y(k−l)   (47)

and Sxx and Syy changed accordingly. It can then be shown that

x̄(k) = e^{−η} x̄(k−1) + c x(k)
ȳ(k) = e^{−η} ȳ(k−1) + c y(k)   (48)

and

S̄xy(k) = e^{−η} [S̄xy(k−1) + c pk qk]   (49)

where pk = x(k) − x̄(k − 1) and qk = y(k) − ȳ(k − 1), while similar recursions hold for Sxx and Syy. This then yields

ρ(k) = [S̄xy(k−1) + c pk qk] / { [S̄xx(k−1) + c pk²] [S̄yy(k−1) + c qk²] }^{1/2}

and, expanding in powers of c as before,

ρ(k) = ρ(k−1) + c { pk qk / [S̄xx(k−1) S̄yy(k−1)]^{1/2} − [ρ(k−1)/2] [pk²/S̄xx(k−1) + qk²/S̄yy(k−1)] } + O(c²).   (50)

From this point onward, comparing with Eq. (44), one can proceed to apply many, if not all, of the developments presented in this appendix to this more general situation.

Table 2. Proportion pjk of times that stimulus k was judged higher than stimulus j, obtained via paired comparisons.*

j ↓ \ k →    1     2     3     4
1            ––
2            0.0   ––
3            0.0   0.0   ––
4            0.5   1.0   1.0   ––

* For subject SP, averaged over the four fragments off the sweet spot.
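The exponential-window estimate of Eqs. (41) and (42) can be sketched as a sample-by-sample tracker (the zero-mean form; the small positive initial values that avoid division by zero are our own choice):

```python
import math

def track_correlation(x, y, eta=0.01):
    """Track rho(k) = Sxy / sqrt(Sxx * Syy) with the exponential-window
    recursion S(k) = exp(-eta) * S(k-1) + c * (new product), where
    c = 1 - exp(-eta). Returns the list of tracked coefficients."""
    w = math.exp(-eta)
    c = 1.0 - w
    sxx = syy = 1e-12  # tiny positive start values avoid division by zero
    sxy = 0.0
    rho = []
    for xk, yk in zip(x, y):
        sxy = w * sxy + c * xk * yk
        sxx = w * sxx + c * xk * xk
        syy = w * syy + c * yk * yk
        rho.append(sxy / math.sqrt(sxx * syy))
    return rho
```

For identical inputs the tracked coefficient converges to +1 and for antiphase inputs to −1, the two extremes visible in the tracked curves of Figs. 7 and 8.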

THE AUTHORS

Roy Irwan received an M.Sc. degree in electrical engineering from the Delft University of Technology, Delft, The Netherlands, in 1992, and a Ph.D. degree in electrical engineering from the University of Canterbury, Christchurch, New Zealand, in 1999.

From 1993 to 1995 he was employed as a system engineer at NKF B.V. After obtaining his Ph.D. degree in 1999, he joined the Digital Signal Processing group at Philips Research Laboratories, Eindhoven, The Netherlands. Since 2002 he has been working with the State University Groningen, Faculty of Medical Sciences.

Dr. Irwan has published a number of refereed papers in international journals and has more than 14 patent applications. His research interests include (medical) image and signal processing.

Ronald Aarts was born in 1956 in Amsterdam, The Netherlands. He received a degree in electrical engineering in 1977, and a Ph.D. degree from the Delft University of Technology in 1994.

In 1977 he joined the Optics group of Philips Research Laboratories, Eindhoven, The Netherlands, where he was involved in research into servos and signal processing for use in both Video Long Play players and Compact Disc players. In 1984 he joined the Acoustics group of the Philips Research Laboratories and was engaged in the development of CAD tools and signal processing for loudspeaker systems. In 1994 he became a member of the Digital Signal Processing group of the Philips Research Laboratories. There he has been engaged in the improvement of sound reproduction by exploiting DSP and psychoacoustical phenomena.

Dr. Aarts has published over one hundred technical papers and reports and is the holder of more than a dozen U.S. patents in his fields. He was a member of the organizing committee and chair for various conventions. He is a senior member of the IEEE, a fellow of the AES, and a member of the Dutch Acoustical Society and the Acoustical Society of America. He is past chair of the Dutch section of the AES.



In Memoriam

Shortly before this Journal went to press, we received some sad news.

Daniel Queen, secretary of the AES Standards Committee 1980–2001, died on October 17 in Providence, Rhode Island, after a long illness. He had been living in Athens, Greece, for the past year, returning to the United States only two days before he died.

We are deeply saddened to hear this news. Dan's contribution to AES standards over many years has been immense and, personally, I don't know what I would have done without his help over the past year.

A fuller appreciation of Dan's life will appear in the next issue of this Journal.

MARK YONGE

The New AES Standards Internet Facilities, Mentioned in the September Issue, Are Now in Service

E-MAIL REFLECTORS

A total of 67 new e-mail reflectors, one for each of the currently active AES standards groups, have been in service since mid-September. Messages sent to a single group e-mail address are distributed to all members of that group to provide a fast and fluent medium for discussion of standards business. Additionally, the service operates more quickly and securely than before. Unsolicited e-mail and inappropriate content are filtered appropriately. All traffic is virus-checked.

WEB SITE

The new Web site, http://www.aes.org/standards/, was made public on 2002-10-02 and has two functions.

A public area provides general information, as previously, including information on standards projects and groups as well as the facility to download copies of published standards. The style and presentation of the site have been substantially updated to make access faster and more direct.

The AES Standards Web site will continue to be active as a responsive publication medium for news and public notices such as Calls for Comment. A Participate button provides general access to an on-line membership request form, in addition to our established methods for membership requests.

In addition to these public pages, there is a new Web area which provides services for AES Standards working groups.

A Log-In button allows members to access all of their registered groups, and their working documents, within a new secure Web area. Members will already have received their initial username and password. There is an automated option for members to request a password reminder and to change their username or password.

The new working area also allows members to check the status of their working groups and update their contact details directly.

Importantly, the new site allows us to exchange working documents more easily than before, replacing the FTP sites we have used previously.

Each group area has a link to a page showing all the current working documents for that group. The page is arranged so that subdirectories—for Task Groups, for example—appear at the top of the page. Documents for the main group are then shown. The list may be sorted by name, size, or date; if you need to see the latest documents, sort by date to bring the newest to the top of the list, then click on the

J. Audio Eng. Soc., Vol. 50, No. 11, 2002 November 927

AES STANDARDS COMMITTEE NEWS

Information regarding Standards Committee activities including meetings, structure, procedures, reports, and membership may be obtained via http://www.aes.org/standards/. For its published documents and reports, including this column, the AESSC is guided by International Electrotechnical Commission (IEC) style as described in the ISO-IEC Directives, Part 3. IEC style differs in some respects from the style of the AES as used elsewhere in this Journal. For current project schedules, see the project-status document on the Web site. AESSC document stages referenced are proposed task-group draft (PTD), proposed working-group draft (PWD), proposed call for comment (PCFC), and call for comment (CFC).


documents you need to download them. Please enjoy exploring this new facility. We now need to

shut down the old e-mail reflectors and document FTP sites, which have served us so well over so many years.

MARK YONGE

AES STANDARDS SECRETARY

Report of the SC-03-02 Working Group on Transfer Technologies of the SC-03 Subcommittee on the Preservation and Restoration of Audio Recording meeting, held in conjunction with the AES 112th Convention in Munich, Germany, 2002-05-12
In the absence of the Chair and Vice Chairs, the meeting was convened by D. Schueller.

The agenda and report from the previous meeting were accepted as written.

Open projects

AES-X47 Minimum Set of Calibration Tones for Archival Transfer
D. Wickstrom published a working group draft in January with certain questions; none has been commented on by the group.

Current goal: a PWD is anticipated by 2002-09.

AES-X64 Test Methods and Materials for Archival Mechanical Media
The intent of the project is to re-press a metal master originating in the EMI factory, which complies with the IEC-98 standard for coarse groove recordings, with bands of frequencies starting with a 1 kHz reference level signal and then in decreasing steps (18, 16, 14, 12, 10, 8, 6, 5, 4, 3, 2, 1 kHz; 700, 400, 200, 110, 50, 30 Hz).

It is further our intent to cut and press another test disc. The specifications for this second disc are:

—A sinusoidal sweep from 20 Hz to 20 kHz with constant velocity above 250 Hz. This would start with a 1 kHz tone and take 50 seconds for the sweep, thus allowing the use of the standard Bruel & Kjaer test equipment.

—A band of 1 kHz at a reference level: 8 cm/sec 20 mm light band width as used on Ortofon Test Record OR1001.

—A band of 1 kHz at a reference level: 16 cm/sec postwar standard (DIN 45533 1953).

—A repeat of the track 1 sweep. This would be cut at a smaller diameter so that the effect of playback tracing losses for a particular stylus/arm combination could be measured.

—The same programme would be pressed on both sides.
The set of discs will be accompanied by a manual describing their use.
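As a rough numerical illustration of the first specification, the sweep band can be sketched as a logarithmic sine sweep whose level models a constant-velocity cut above the turnover and a constant-amplitude cut (velocity falling 6 dB/octave) below it. This is a hypothetical sketch only; the function name, the sample rate, and the roll-off behavior below 250 Hz are assumptions, not part of the disc specification.

```python
import math

def velocity_sweep(duration_s=50.0, f_start=20.0, f_end=20000.0,
                   turnover_hz=250.0, sample_rate=48000):
    """Logarithmic sine sweep; amplitude is constant (constant velocity)
    above the turnover frequency and scaled by f/turnover below it,
    approximating a constant-amplitude cut at low frequencies."""
    k = math.log(f_end / f_start)
    n = int(duration_s * sample_rate)
    phase = 0.0
    samples = []
    for i in range(n):
        f = f_start * math.exp(k * (i / n))        # instantaneous frequency
        phase += 2.0 * math.pi * f / sample_rate   # accumulate phase sample by sample
        level = min(1.0, f / turnover_hz)          # reduced level below the turnover
        samples.append(level * math.sin(phase))
    return samples

sweep = velocity_sweep(duration_s=1.0)             # short sweep for illustration
```

A real cutting signal would also need the initial 1 kHz reference band and the exact level calibration described above; this sketch covers only the frequency/level trajectory.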

The working group will produce a PWD describing these test materials with the intent to publish a call for comment.

AES-X65 Rosetta Tone for Transfer of Historical Mechanical Media
The current goal is review of status. This has been on the agenda since 1997, and yet no progress has been made since G. Brock-Nannestad posted his paper in July 1998. Unless proposals on the contents of such a Rosetta Tone are put forward for discussion, the meeting felt that the project should be withdrawn.

AES-X90 Analog Transfer of Audio Program Material
There has been no communication on this project. Unless a dedicated working chair for Task Group SC-03-02-A can be nominated, the meeting recommends that the project should be withdrawn.

AES-X106 Styli Shape and Size for Transfer of Records
Brock-Nannestad and F. Lechleitner have said they will provide input for the PTD.

AES-X107 Compilation of Technical Archives for Mechanical Media
There has been no input on this project, but relevant liaison bodies have been contacted describing the project and requesting cooperation in putting together a reference document.

New projects
No new projects were received or proposed.

New business
There was no new business.

The next meeting will be held in conjunction with the AES 113th Convention in Los Angeles, CA, US.

Report of the SC-03-04 Working Group on Storage and Handling of Media of the SC-03 Subcommittee on the Preservation and Restoration of Audio Recording meeting, held in conjunction with the AES 112th Convention in Munich, Germany, 2001-11-30

The meeting was convened by T. Sheldon. The agenda was approved with the addition of a note to

discuss the future of the working group under the topic of AES-X80.

The report of the previous meeting was approved with two changes. 1) D. Schueller asked that his name be spelled correctly, with the umlaut. 2) In the notes for AES35-R, the second sentence should be changed to read, “D. Schueller indicated that he would like to see included in AES38-R a plan to investigate the light sensitivity issue in the next revision.”

Open projects

AES22-R Review of AES22-1997 AES recommended practice for audio preservation and restoration—Storage


of polyester-based magnetic tape
Schueller will lead the review of this document.

AES28-R Review of AES28-1997 AES standard for audio preservation and restoration—Method for estimating life expectancy of compact discs (CD-ROM), based on effects of temperature and relative humidity
No one at the meeting felt able to conduct the review of this document. Sheldon agreed to refer it to the Joint Technical Commission (JTC), where the document was authored, for consideration at the next meeting of SC-03-04.

AES35-R Review of AES35-2000 AES standard for audio preservation and restoration—Method for estimating life expectancy of magneto-optical (M-O) disks, based on effects of temperature and relative humidity
No action was taken.

AES38-R Review of AES38-2000 AES standard for audio preservation and restoration—Life expectancy of information stored in recordable compact disc systems—Method for estimating, based on effects of temperature and relative humidity
Schueller said that it was of vital importance to include a section on light sensitivity of these media as soon as possible. It was noted that J. M. Fontaine and D. Kunej were knowledgeable and could draft a new section. G. Cyrener agreed to investigate possible changes to include the issue of light sensitivity.

AES-X51 Procedures for the Storage of Optical Discs, Including Read Only, Write-once, and Re-writable
Sheldon noted that the draft document is a standard in the imaging area, but he has not brought it to the working group for consideration because of questions about the AES-X80 liaison relationship within the AES Standards Committee.

AES-X54 Magnetic Tape Care and Handling
Sheldon explained the status of this document within the JTC, where it is being authored by AES members and others as a part of AES-X80. It is being submitted by JTC within ISO TC42 to form a second Draft International Standard, ISO 18933 DIS. Sheldon will commence the discussion of the text of this second DIS ballot document by July 2002. Further discussion will be conducted by e-mail.

AES-X55 Projection of the Life Expectancy of Magnetic Tape
The investigation to identify a suite of tests for determining the life expectancy of various formulations of magnetic tape continues. It was noted that the International Association of Sound & Audiovisual Archives (IASA), with UNESCO support, has initiated a project to encourage tape manufacturers to take an interest in preserving the 200 million hours of information stored on magnetic tape. One goal is to find ways to measure the life expectancy of magnetic tape. This project should be continued as these initiatives promise to develop new methods to determine life expectancies. However, it was noted that the tape manufacturers currently

are not thriving, so actions need to happen quickly if they are to participate. Also creating a sense of urgency, the signs of tape instability in the world’s archives are growing. This subject also is being discussed at the JTC; the minutes of the JTC in Montreal on the subject of magnetic tape life expectancy testing were read. The search continues for a breakthrough that will open possibilities for success.

AES-X80 Liaison with ANSI/PIMA IT9-5
The current goal is a review of the liaison relationship. The liaison arrangement is now with the International Imaging Industry Association (I3A), which was created by the merger of the Photographic Image Manufacturers Association (PIMA) and two other imaging associations. The title of the liaison should be changed to “AES-X80 Liaison with I3A IT9-5.”

Sheldon asked whether the AES-X80 liaison arrangement should be continued with I3A. In reality, virtually all research and preparation of standards documents published by AES comes from the JTC. The JTC has many AES members on its membership list, and many AES SC-03-04 members participate in the development of standards as a part of the JTC. AES SC-03-04 has not undertaken in recent years any work of its own apart from the work being conducted by the Joint Technical Commission. The question also should be asked, “Should the AES SC-03-04 working group continue?”

New projects
No project requests were received or introduced.

Sheldon noted, however, that at the AES-X80 Joint Technical Commission meeting of April 19-20, 2002, the topic of the care and handling of transportable magnetic discs was raised and discussed. The JTC received a report from J. Lindner surveying the current state of these media. The JTC also is beginning consideration of the care and handling of optical discs. E. Zwanafelt agreed to prepare an outline for consideration at the next meeting.

New business
There was no new business.

The next meeting is scheduled to be held in conjunction with the AES 113th Convention in Los Angeles, CA, US.

Report of the SC-05-02 Working Group on Single-Programme Connections of the SC-05 Subcommittee on Interconnections meeting, held in conjunction with the AES 112th Convention in Munich, Germany, 2002-05-09
J. Brown convened the meeting in the absence of Working Group Chair R. Rayburn and Vice Chair W. Bachmann.

The agenda and the report from the previous meeting were approved as written.

Current development projects

AES-X11 Fiber-Optic Audio Connections: Connectors


and Cables Being Used and Considered for Audio
J. Woodgate reported on the meeting of Task Group SC-05-02-F held on 2002-05-09.

A proposed draft, AES32-TU, has been published as a Trial-Use Publication. The current goal is to monitor the trial use. Members present considered that a revision or amendment of the document is now necessary.

The draft specifies a high-quality connector, and there were suggestions that others should be included. It was suggested that the insistence on including only the SC connector had caused a loss of interest in the project. IEC SC 86C has standardized several new connectors, and it is not known whether any of these are likely to be used for audio. R. Caine mentioned a connector designated as “ST1,” recommended for use with MADI and FCDI, of interest to SC-02-02 as a connector already in use for AES3. A dual version has been adopted by an ISO committee. Caine was invited to submit a document on the subject, preferably with proposed texts for an amendment or a new document.

J. Gaunt considered that the type of connector was less important than the signal protocol. The chair suggested a set of tables showing which connectors, fiber types, and signal protocols worked together and which did not.

It appears unacceptable to let the existing AES32-TU draft go forward to publication as a full standard; it needs to be reviewed for further improvements. It was agreed that, at the present stage of technology, the document should be a report of what is actually being used, rather than a standard specifying what shall be used.

It was agreed that a user survey should be carried out to determine which connectors are actually in use. A form was designed and some responses were obtained during the Convention. Responses would be collected by the Standards Secretariat and sent to the Task Group. The closing date for the submission of responses was set at 31 July 2002. The draft was then reviewed in detail.

The following text was agreed: “This Report mainly deals with connectors for use with IR of wavelengths 1300 nm and 850 nm. 1300 nm is most widely used, while 850 nm is occasionally used for graded-index applications.”

It is possible that more terms should be defined, if really necessary. The texts in the Glossary need not be as formal as that of a definition.

Additional connectors, ST1 and MTRJ, should be mentioned, together with any others found to be in significant use according to the results of the survey. More information should be submitted on the ST1 (Caine) and the MTRJ (Gaunt) connectors.

It was stated that IEC MT 61806 looks to AES for input on professional applications of optical fiber.

AES-X40 Compatibility of Tip-Ring-Sleeve Connectors Conforming to Different Standards
AES-R3 has been issued as a project report. The current goal is review of this document with a target date of 2005-10. No further action is required.


AES-X105 Modified XLR-3 Connector for Digital Microphones
The work done by members of SC-04-04D on a digital microphone connector will be used by M. Natter to develop mechanical specifications for the connector. It is our hope to have that material prepared for inclusion in the IEC XLR standard as an amendment during the next maintenance cycle (roughly three years). The Secretary of IEC SC 48B has been notified. The digital connector should include the option of a capacitor for the concentric connection of the shield to the shell. The current goal is a standard. The target date is 2003-05.

AES-X113 Universal Female Phone Jack
The project intent is to develop a specification for a jack that will accept both international IEC 60603-11 plugs and B-gauge plugs. The original project initiation form or a new one will be produced as soon as possible. A. Eckhart has previously indicated that such a connector exists and volunteered to document it.

AES-X123 XL Connectors to Improve Electromagnetic Compatibility
The new intent is a Performance Standard for both male and female connectors that includes performance limits, defines test fixtures and methods, and defines an objective for each generic type. The goal is a PTD. The target date is 2003-10.

At least six connector types were discussed, as set out in the draft X13 document. Two are cable-mounted types, both male and female, intended for the termination of both microphone and line level circuits. These connectors should contain a concentric capacitor to terminate the shield to the shell at radio frequencies. Two connector types, male and female, are intended for use within equipment and, according to the recommendations of SC-05-05, should offer greatly improved contact between the shells of mating connectors and greatly improved contact between pin 1, the shell, and the outside of the chassis (or shielding enclosure). It was expected that small radio frequency bypass capacitors would be fitted between pins 2 and 3 and the shell. Two connector types, male and female, are intended for use on wiring panels external to equipment.

M. Natter reported on research work on several pre-production prototypes of a male cable-mount connector incorporating a capacitor of concentric construction. B. Whitlock reported, via e-mail, on preliminary measurements of concentric capacitors. Measurement techniques, including suitable test fixtures, were discussed. The test jig should allow determination of the impedance of the capacitor over a wide frequency range by the measurement of the voltage divider ratio when a generator of known impedance drives the capacitor.
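The divider arithmetic such a jig relies on can be sketched numerically. This is a minimal illustration under stated assumptions (the function names and the 50-ohm/1 nF example values are ours, not from the meeting): the capacitor is treated as purely reactive and strays are ignored, so the measured ratio r = |V_cap|/|V_gen| satisfies r = X_c / sqrt(Z_s^2 + X_c^2).

```python
import math

def reactance_from_ratio(r, z_source):
    """Invert the divider relation r = X_c / sqrt(Z_s**2 + X_c**2)
    for a purely reactive capacitor driven from source impedance Z_s:
    X_c = Z_s * r / sqrt(1 - r**2)."""
    return z_source * r / math.sqrt(1.0 - r * r)

def capacitance_from_ratio(r, z_source, freq_hz):
    """Convert the recovered reactance magnitude to capacitance at one frequency."""
    return 1.0 / (2.0 * math.pi * freq_hz * reactance_from_ratio(r, z_source))

# Round trip with an ideal 1 nF capacitor measured at 1 MHz from a 50-ohm source
z_s, f = 50.0, 1.0e6
x_c = 1.0 / (2.0 * math.pi * f * 1.0e-9)   # ideal reactance, about 159 ohms
r = x_c / math.hypot(z_s, x_c)             # divider ratio the RF voltmeter would read
c_recovered = capacitance_from_ratio(r, z_s, f)
```

In practice the jig measures the ratio at many frequencies, and departures from the ideal 1/(2*pi*f*C) curve reveal the capacitor's parasitic inductance and resistance.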

Woodgate described a test jig he has constructed that connects the center conductor of a coaxial connector from the generator to the shield connection of the concentric capacitor. The shell of the coaxial connector is connected nearly concentrically to the shell of the connector in which the concentric capacitor is mounted. The cable shield also connects to pin 1 through a suitable ferrite bead. The center

conductor of a coaxial feed to an RF voltmeter (spectrum analyzer, network analyzer, receiver) connects to pin 1 of a mating connector, and the shell of that coaxial connector is connected to the shell of the connector holding the capacitor being tested. The effectiveness of the test jig could be determined by replacing the capacitor with a copper disc that short circuits the generator but makes contact with pin 1 in the same manner as it would with the capacitor, and by measuring at the same point as before. Under the short circuit condition a very high value of attenuation should be measured over the frequency range of interest. Brown noted the need for measurements that test for detection of radio frequency energy by a differential input stage when a signal is injected in the same manner as in the Woodgate tests.

Natter and Woodgate will continue work on one or more prototypes of connectors that include a concentric capacitor and measurements of their performance.

Commonly used panel-mount connectors with mounting flanges that contact the outside of the chassis or panel were seen as nearly ideal for contact between the shell and the panel. Some redesign is needed to reduce the impedance (principally the inductance) of the connection between pin 1 and the chassis by shortening the signal path. This may be accomplished by connecting pin 1 to the shell within the connector.

Male and female connectors were described to meet the requirement of insulating XL connector shells from a mounting panel external to equipment. These connectors are expected to use a capacitor between pin 1 and the panel and another between the shell and the panel. Because of the way SC-05-05 expects these connectors to be used, these capacitors can be conventional types having good high frequency properties. It was noted that the capacitor between the shell and the chassis can be subjected to considerable stress from ESD, and should be of a type suitably rated for that condition.

AES-X130 Category-6 Data Connector in an XL Connector Shell
The intent of the standard to be developed needs clarification. J. Woodgate will put a summary of IEC SC 48B documents on CAT6 connectors on the reflector. The current goal is a PTD. The target date is 2003-05.

New projects
No project requests were received or introduced.

New business
There was a discussion of the changes in scope for SC-05-02 and SC-05-03 expressed in the minutes of the November 2001 meeting. It was felt that, although accurately reported in those minutes, the wording should be revised slightly for clarity.

Secretariat note: At the subsequent meeting of Subcommittee SC-05, the proposed clarifications were adopted with minor amendments so that the scopes now read:

“The scope of the SC-05-02 Working Group on Audio


Connections shall include, within the bounds of the scope of SC-05, new usage, description, and contact designation for connectors for audio and ancillary functions.

“The scope of the SC-05-03 Working Group on Audio Connectors shall include, within the bounds of the scope of SC-05, documentation of established connector usages for audio and ancillary functions.”

The next meeting is scheduled to be held in conjunction with the AES 113th Convention in Los Angeles, CA, US.

Report of the SC-05-05 Working Group on Grounding and EMC Practices of the SC-05 Subcommittee on Interconnections meeting, held in conjunction with the AES 112th Convention in Munich, Germany, 2002-05-10
The meeting was convened by Chair B. Olson.

The agenda order was revised by the chair to have the AES-X13 item appear at the end of Current Projects. The revised agenda was approved. The report of the meeting in November 2001 in New York City was accepted as written.

Current projects

AES-X27 Test Methods for Measuring Electromagnetic Interference
This document is intended to be an Engineering Report outlining useful procedures for measuring the electromagnetic interference created by real-world conditions.

A report called “Informal immunity testing of small microphone pre-amplifiers and mix consoles” was presented by J. Brown. This showed the general direction that we will need to go to produce a set of guidelines allowing users to determine the susceptibility of equipment to electromagnetic interference.

R. Caine pointed out that this testing could be much more stringent than the legal requirements and is intended to identify equipment that is most suitable for the highest-quality professional audio systems.

It is hoped that manufacturers will be encouraged to produce a low-cost test generator that will allow suitable testing of equipment using a variety of test fixtures.

Alternatively, the document will describe procedures that can be used with radio transmitters or cellular telephones as the interference source. These radios and cellular telephones would need to be used in their normal operation by licensed operators as required by relevant statute.

The following observations were made from initial experience of testing with radios and cellular telephones:

1) As frequency increased, the immunity improved.
2) As the distance from the device increased past a wavelength of the RF carrier, cable attenuation reduced the level of the interference, so the immunity also increased.
3) The headphone output stages of some mixers were often the most susceptible to interference.

This report on informal immunity testing was not directed only at AES-X27, but rather at demonstrating the

general level of immunity of some typical low cost products to a real-world interference source. Inclusion of such a test into AES-X27 was discussed and considered useful by those members present, but the need for cautions regarding the avoidance of interference to radio communications and conformance with national regulations was emphasized.

Caine pointed out that a coupling device would be useful for the radiated immunity tests.

J. Woodgate pointed out that this project originally started as a set of tests for low-frequency interference. He also pointed out that it should reference IEC 61004 as the primary test method.

Olson proposed a foreword for AES-X27 that clearly presents why these tests are different from either EN 55103-2 or IEC 61004. AES-X27 should address testing for pin 1 problems at both audio and radio frequencies, testing of excessive input and output bandwidth, and general RF immunity testing.

Brown offered to lead the writing effort for AES-X27. The expectation is for a PWD for the Standards Project Report by 2002-09.

AES-X35 Installation Wiring Practices
Olson proposed an outline to guide further development of the document.
I. Cable Grouping
II. Grounding
   A. Mesh
   B. Star
   C. Hybrid
   D. Wrong
III. Safety
   A. Fire and Smoke
IV. Shielding
V. Cable Types
   A. Conduit
   B. Plenum requirements
VI. Wiring Practices
   A. Permanent installations
   B. Temporary installations
VII. Connectors

Woodgate pointed out the need for Normative References to the relevant safety standards and documents. This will be a substantial amount of work.

The chair will find help to merge the existing X35 document into the new outline.

AES-X112 Insulating Cable-Mount XL Connectors
A new, clearer title was proposed: “XLR free connectors with non-conducting shells.”

The intent is to create an Information Document on applications of connectors for facilities. This should include an explanation of stopgap measures using nonconductive covers to prevent inadvertent ground connections to the shielding contact. This work should appear as part of the X35 document. Woodgate offered to write a section for X35 regarding this.

AES-X125 Input Filtering for Electromagnetic Compatibility


Waldron indicated that K. Armstrong would write an information document for this project.

Brown proposed that the scope be expanded to include output and control port filtering. This need not be specific but should list the appropriate issues to be considered. The WG decided that just input and output filtering would be included at this stage.

AES-X13 Guidelines for Grounding
The scope of the document has been greatly simplified. Specific wording was discussed to remove performance requirements and clarify definitions. The meeting agreed on all the wording changes and decided that the PWD was ready to be progressed to a PCFC. The revised document will be posted to the reflector before being forwarded to the Secretariat to be formatted as a PCFC.

M. Natter presented preliminary results of measurements made on a connector prototype that incorporates some ideas for coaxial connection through a quasi-discoid capacitor to the shell of an XLR connector. The results look promising but need further investigation and testing.

Much discussion ensued about the need for protection from surges across the capacitor. More testing and measurement are required to show whether this is a problem. R. Cabot pointed out that, for the value of capacitance under consideration, it did not appear that a damaging voltage would appear across the capacitor in any configuration.

New projects
There were no new projects.

New business
There was no new business.

The next meeting will be held in conjunction with the AES 113th Convention in Los Angeles, CA, US.

Report of the SC-06-01 Working Group on Audio-File Transfer and Exchange of the SC-06 Subcommittee on Network and File Transfer of Audio meeting, held in conjunction with the AES 112th Convention in Munich, Germany, 2002-05-09

The meeting was convened by Chair M. Yonge. The agenda was approved with the addition of a report

on the progress of AES46-xxxx. The report of the previous meeting at the 111th Convention in New York was approved as written.

Open projects

AES46-xxxx Radio Traffic Data Extension to Broadcast Wave Files
A Call for Comment was issued on 2002-03-07. M. Gerhardt had sent a comment requesting clarification of the order of RIFF chunks within a Broadcast Wave File (BWF), which had been satisfactorily addressed by an editorial change. There had been no other comments so far within the comment period, which was due to close on 2002-06-08.

AES31-1-R AES Standard for Network and File Transfer of Audio, Part 1: Disk Format
No action was taken in this maintenance project.

AES31-3-R AES Standard for Network and File Transfer of Audio, Part 3: Simple Project Interchange
It was observed that the sampling frequencies supported within AES31-3-1999 did not include some of the higher frequencies anticipated in a revision of AES5, specifically 96 kHz and 192 kHz. These frequencies are important for record companies who wish to archive music recordings, for instance. An amendment was proposed to include a multiplier parameter in the header chunk that would indicate multiples of the sampling frequencies already defined.
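The proposed amendment might work along the following lines. This is a hypothetical sketch only; the field names, the set of base rates, and the allowed multiplier values are illustrative and do not reproduce the wording of the amendment.

```python
# Base sampling frequencies assumed already defined (illustrative set only)
BASE_RATES_HZ = (44100, 48000)

def effective_sampling_rate(base_hz, multiplier):
    """Combine a defined base rate with the proposed header-chunk
    multiplier so that higher rates such as 96 kHz and 192 kHz can be
    expressed as integer multiples of the rates already defined."""
    if base_hz not in BASE_RATES_HZ:
        raise ValueError(f"base rate {base_hz} Hz not in the defined set")
    if multiplier not in (1, 2, 4):
        raise ValueError(f"unsupported multiplier {multiplier}")
    return base_hz * multiplier

print(effective_sampling_rate(48000, 2))   # 96000
print(effective_sampling_rate(48000, 4))   # 192000
```

The appeal of a multiplier is backward compatibility: older readers that ignore the new parameter still see a valid base rate in the header.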

U. Henry is preparing a proposal document for Edit Automation.

The question of supporting video in AES31 was raised. There was a general feeling that this would be too complex for this body to handle, although there was no objection to working with some other body to assist the development of a complementary standard for simple video project interchange.

AES-X66 File Format for Transferring Digital Audio Data Between Systems of Different Type and Manufacture
There was a discussion of the filename extension to be used with Broadcast Wave Files. Some had assumed that “.bwf” would be used. It was pointed out that this was not part of the EBU specification and that the “.wav” extension was intended.

There was some concern that the usage of the Unique Material Identifier (UMID) was under renewed discussion within SMPTE, with the potential to impact its use in AES-X66. More information will be sought.

The issue of higher sampling frequencies will be discussed with the EBU to avoid the risk of divergent specifications.

The document for AES-X66 still needs to be re-drafted in IEC style; the secretariat will arrange for this to happen.

AES-X68 A Format for Passing Edited Digital Audio Between Systems of Different Type and Manufacture that Is Based on Object Oriented Computer Techniques
No action was taken.

AES-X71 Liaison with SMPTE Registration Authority
No action was reported. It was felt that the AES should investigate registering a SMPTE label for the AES31-1 data format.

AES-X128 Liaison with AAF Association
No action was taken.

New projects
No new projects were received or proposed.

New business
There was no new business.

The next meeting is scheduled to be held in conjunction with the AES 113th Convention in Los Angeles, CA, US.


AES 113th Convention
2002 October 5–8
Los Angeles Convention Center, Los Angeles, California, USA


To no one’s surprise, the sun was shining brightly throughout the days of the AES 113th Convention held October 5–8 in Los Angeles, illuminating the positive outlook of the audio industry after a difficult year. The combination of a large number of exhibitors and delegates, together with a wide range of new products and excellent technical events, indicates an industry in good shape and growing stronger.

OPENING CEREMONIES
The opening of the convention commenced with a welcome from Roger Furness, executive director of the AES, who was glad to see that so many people had come to attend the convention in spite of the sunny weather outside. He introduced AES President Garry Margolis, who said that it was an honor and a pleasure to be the president of the Society during a challenging and rewarding year. He proclaimed the AES to be a forward-looking society and hinted that exciting new developments are in the pipeline.

Floyd Toole, 113th Convention chair, discussed the convention theme: “Science in the Service of Art.” He praised the audio industry as a marvelous enterprise and explained that this was shown perfectly by the diversity of activities within the convention: from the events hosted by the Historical Committee to the innovative products and software in the exhibition. But he warned that without art the business would die, and he saluted the artists who would be performing in the Songwriters Showcase throughout the convention.

Next the presentation of awards was overseen by David Robinson, the Awards Committee chair. Citations were awarded to Elmar Leal, for many years of outstanding work in South America culminating in the formation of the Latin America Region, and Roland Tan, for his work in Southeast Asia and the formation of the Singapore Section. Board of Governors Awards were presented to Roy Pritts and Ron Streicher for cochairing the 109th Convention in Los Angeles in September 2000. AES Fellowships were awarded to Durand Begault for his contributions to our understanding of spatial hearing and its applications, Gilbert Soulodre for his significant contributions to procedures for subjective testing of audio systems, and Carson Taylor in recognition of lifelong contributions to the art and science of music recording techniques.

The keynote speech was given by Leonardo Chiariglione of the Telecom Italia Lab. He offered further examination of the convention theme by discussing the effect of new distribution technology on the industry and the art. He compared the repercussions of sharing MP3 files on the Internet with the changes brought about by the advent of radio and recording technologies. He outlined the tension currently created by the differing opinions of the consumers, who usually want free access to any content in a form that can be manipulated and transferred to any device, and the distributors, who want to retain control of the publishing mechanism and need a financial return on content. Encouraging both sides in the dispute to take a step back from confrontation, he proposed a solution based on open access to protected content. This solution is


OPENING CEREMONIES AND AWARDS

Roger Furness, AES executive director

Garry Margolis, AES president

Floyd Toole, 113th Convention chair

Leonardo Chiariglione, 113th keynote speaker

Garry Margolis presents Citation Awards to Elmar Leal, left photo, and Roland Tan.

David Robinson, Awards Committee chair, announces award recipients.

based on interoperable technology making readily accessible content easy to distribute and requiring that the consumer obtain the right to play the content. He expanded on this by telling of his dream that any person has the opportunity to create art, distribute it, and gain just reward for it. He believes that the technology to achieve this is available and it is time to make technology and content friends of mankind. He concluded by saying that “unless people get value from their art, they can’t make it. All artists should have means to produce their art and receive remuneration.”

Toole concluded the opening ceremony by thanking his convention committee for their dedicated efforts that made the convention possible, and asked the audience to go out and enjoy the results of that hard work.

EXHIBITION
There was a very positive atmosphere on the exhibition floor throughout the convention, with the quality and quantity of the attendees beating the expectations of exhibitors. Of prime importance was the fact that the audience included a large number of the industry’s key decision makers. The format of the exhibition also allowed for the attendees to talk at length


line array products. JBL introduced three new VerTec models aimed at medium and small sound reinforcement applications, designed to give high output power and quality within a lightweight package. Meyer Sound also exhibited compact line array units, with their M1D and M2D products. Electro-Voice used the convention to launch its XLC range of compact line-array loudspeakers designed to meet the challenges of difficult acoustical environments. SLS Loudspeakers also introduced its new compact line array product, the RLA/2, which includes dual 8-inch woofers and a high-frequency ribbon driver.

Renkus-Heinz demonstrated an interesting combination of technologies for the live-sound market with the incorporation of the Ethernet-based CobraNet audio distribution system into their ST-STX loudspeaker range. They claim that this allows improved signal quality, minimal data loss, and precision remote control over long distances, together with networking and interoperability with a range of other products incorporating CobraNet. Continuing this theme, Neutrik showed a range of professional EtherCon Ethernet RJ45 connectors built within a standard XLR casing in order to better cope with the stresses of the professional audio environment.

At the other end of the live sound signal chain, Audix demonstrated its new D6 kick drum microphone, which was developed from detailed analysis of the sound of a large number of drums. Audio-Technica launched a range of live-sound instrument microphones, including the

AE5100 and the AE3000, which are large-diaphragm condensers, and the AE2500, which combines condenser and dynamic elements within one microphone.

A large number of significant new software releases were announced at the convention. Steinberg launched version 2.0 of Nuendo, featuring a reengineered 32-bit floating-point mixer, enhanced routing options (including surround sound up to 10.2 channels), and new networking capabilities over TCP/IP. Digidesign announced version 6.0 of Pro Tools for Mac OS X, including a cleaner user interface, improved MIDI facilities with support for the OS X CoreMIDI en-


Roy Pritts, left photo, and Ron Streicher receiving Board of Governors Awards.

Fellowship Awards recipients: counterclockwise from left, Durand Begault, Gilbert Soulodre, and Carson Taylor.

with the product manufacturers and give valuable feedback. Numerous exciting new products were displayed at the convention. A small number of these are summarized here, and a full list of exhibitors begins on page 950 of this issue.

One focus of the convention was on live sound, with two workshops on line arrays, a roundtable of live sound engineers discussing the latest trends, techniques, and tools in modern sound reinforcement, and a large number of new products from the major manufacturers. In addition to the workshops on line array loudspeakers, Meyer Sound, JBL, SLS, McCauley Sound, and Electro-Voice all showed their


gine, and the ability to take advantage of dual-processor Macintosh G4 computers. SADiE unveiled its Series 5 products, based on a newly designed hardware platform. This includes four new products that encompass authoring and editing for either PCM or DSD technologies, with the option of up to 8 channels.

Plugzilla is a unique product that combines the traditionally separate realms of computer-based and outboard processing. This is a stand-alone 2U unit which runs VST software plug-ins. It contains two independent machines that will run up to eight plug-ins simultaneously and has enough processing power for up to 16 channels of reverberation. The parameters of the plug-ins are accessed via the front-panel rotary controllers, MIDI, USB, and assignable foot switches.

The Virtual Mixer from the Virtual Mixing Company was another unique product. Acting as a front end to a MIDI-controlled system, it allows tracks or MIDI instruments to be mixed in a visual environment where the instruments are positioned in space according to their pitch, level, and panned position. The mix can then be manipulated visually using a mouse or a touch screen.

Continuing the popular trend for multichannel surround sound production, a number of innovative new products were released at the convention. Fostex showed a 6-channel location recorder that records directly onto DVD-RAM disks and includes microphone preamplifiers and time-code features. Lexicon demonstrated updates to the 960L processor, including an automation option and LOGIC7 up-mixing algorithms to create 5-channel surround sound from 2-channel stereo. TASCAM exhibited a surround sound monitor controller, the DS-M7.1, which is designed to add multiloudspeaker monitoring control to consoles with limited output busses.
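Lexicon’s LOGIC7 algorithm itself is proprietary, but the general idea behind deriving a 5-channel feed from 2-channel stereo can be sketched with a naive sum/difference matrix. Everything below, including the function name and the gain values, is an illustrative assumption and not Lexicon’s method:

```python
import numpy as np

def naive_upmix_2_to_5(left, right, surround_gain=0.5):
    """Illustrative 2-to-5 channel upmix sketch (not LOGIC7).

    The center channel is derived from the in-phase (sum) component,
    and the surrounds from the out-of-phase (difference) component,
    fed to the rear pair in antiphase so that correlated direct sound
    stays in front while decorrelated ambience moves behind.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    center = 0.5 * (left + right)      # in-phase (direct) content
    diff = 0.5 * (left - right)        # out-of-phase (ambient) content
    ls = surround_gain * diff          # left surround
    rs = -surround_gain * diff         # right surround, antiphase
    return left, right, center, ls, rs
```

A matrix this simple leaks direct sound into the surrounds; practical up-mixers add frequency-dependent steering and decorrelation on top of the same basic sum/difference decomposition.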

The exhibition floor also contained a large number of small stands exhibiting specialist products. These included manufacturers of outboard recording equipment, such as compressors and microphone preamplifiers. Anthony DeMaria Labs showed a number of tube compressors that are based on classic designs. Universal Audio unveiled a new channel strip, the 6176, which combines a microphone preamplifier with a compressor in a single unit, providing vintage character with modern specifications.

Finally, a number of microphone products were shown at the convention, though not all were new designs. Most notable among these was the Telefunken Ela-M 251, a reissue of the classic tube microphone that is hand built to original specifications using the same methods used to make the original version 40 years ago. With the mixture of cutting-edge and classic, the exhibition contained something for every audio professional.

PAPERS SESSIONS
Papers cochairs Eric Benjamin and John Strawn organized sessions at the 113th Convention that covered the advanced audio research being performed across a wide range of topics. Based on the number of papers presented in each area, it appears that the themes of psychoacoustics, low bit-rate coding, signal processing, and the design and measurement of transducers are those currently of greatest research interest.

A number of papers were related to the measurement of loudspeaker drivers. One such paper, by Siegfried Linkwitz, investigated the loudspeaker parameters that are of greatest importance. He argued that volume displacement, intermodulation distortion, stored energy, and off-axis frequency response all need to be tightly controlled, and that the importance of phase linearity and cabinet diffraction is


Speaking at AES annual business meeting: clockwise from above left, Ron Streicher, secretary; Garry Margolis, president; Marshall Buck, treasurer; and Chris Freitag, Board of Tellers chair.

Multichannel demo room at 113th allowed presenters to demonstrate new theories with 5.1-channel samples.
Author Siegfried Linkwitz presenting paper.


sometimes exaggerated. Other papers in this field were on topics such as techniques for measuring the amplitude response of loudspeaker systems in domestic environments by Allan Devantier and methods for interpreting nonlinearity measurements of transducers by Alexander Voishvillo, Alex Terekhov, Gene Czerwinski, and Sergei Alexandrov.

The sessions on signal processing included papers on filter morphing by Rob Clark, Emmanuel Ifeachor, and Glenn Rogers, sound synthesis by genetic programming by Ricardo Garcia, and noise shaping in test-signal generation by Stanley Lipshitz, John Vanderkooy, and Edward Semyonov. There was also a paper by Frank Siebenhaar, Christian Neubauer, Robert Bäuml, and Jürgen Herre on audio watermarking based on a Scalar Costa Scheme, which allows the inclusion of higher data rates within the watermark for em-


Ron Streicher (right), special events cochair

Papers cochairs John Strawn (left) and Eric Benjamin

From left, Theresa Leonard, education events chair, Elliot Scheiner, renowned producer, and Scott Cannon, education events assistant

Marshall Buck (center), workshops chair, flanked by workshops assistants David Scheirman (left) and Bejan Amini

Van Webster (left), facilities chair, and Floyd Toole, convention chair

From left, historical events cochairs Dale Manquen and Wes Dooley, and Eric Gaskell, who assisted with When Vinyl Ruled exhibit.

From left, Mel Lambert, special events cochair and technical tours cochair, Lisa Roy, Platinum Series chair, and Roger Furness, AES executive director


bedding such things as song lyrics or pictures.

Multichannel sound and spatial audio appeared frequently within the sessions mentioned above, as well as in a separate session on multichannel sound. Listener envelopment in surround sound was investigated in a paper by Gilbert Soulodre, Michel Lavoie, and Scott Norcross, based on applying measurements developed for concert hall acoustics. Other papers in the session on multichannel sound included an investigation of the interchannel interference from multiple loudspeakers at the listening position and its relationship to microphone technique by Geoff Martin, the effect of early reflections on localization by Jason Corey and Wieslaw Woszczyk, an approach for synthesizing surround sound from mono and two-channel stereo by Ching-Shun Lin and Chris Kyriakakis, and an investigation of loudspeaker ar-

rangements for creating a diffuse sound field by Koichiro Hiyama, Setsu Komiyama, and Kimio Hamasaki.

Further papers sessions included Room Acoustics and Sound Reinforcement, High Resolution Audio, Recording and Reproduction of Audio, and Audio Networking and Automotive Audio. Individual papers presented at the convention can be purchased at www.aes.org/publications/preprints, and a CD-ROM with all the 113th Convention papers is also available. A full list of paper titles and abstracts begins on page 954 of this issue.

WORKSHOPS
The workshops, organized by Marshall Buck with the assistance of Bejan Amini and David Scheirman, provided an interesting mix of the current issues facing the industry.


We Thank…
Floyd Toole, chair
Eric Benjamin and John Strawn, papers cochairs
Marshall Buck, workshops chair
Bejan Amini and David Scheirman, workshops assistants
Mel Lambert and Ron Streicher, special events cochairs
Lisa Roy, Platinum Series chair
Theresa Leonard, education events chair
Scott Cannon, education events assistant
Wes Dooley and Dale Manquen, historical events cochairs
Van Webster, facilities chair
Peter Chaikin and Mel Lambert, technical tours cochairs
Claudia Koal, music chair
Shelley Herman, chair, The Jazz Singer
Bob Lee, signs chair
Han Tendeloo, program coordinator
Ken Lopez, volunteers coordinator

Ken Lopez (center), volunteers coordinator, with 113th volunteers

Floyd Toole (left) and Bob Lee, signs chair

Shelley Herman (left), chair of The Jazz Singer, and Ken Lopez, volunteers coordinator

Han Tendeloo, program coordinator

Peter Chaikin (right), technical tours cochair, with Rich Tozzoli

Claudia Koal, music chair, with Gordon van Ekstrom

Marla Egan (right), producer for Platinum Series, with Lisa Roy


the summary that it is most important for engineers to trust their ears and adapt the available techniques to the particular situation. John Eargle then gave a historical view of stereo and surround microphone techniques, before Doug Botnick, Michael Bishop, Richard King, and Mick Sawaguchi discussed how they apply the available techniques to their own productions.

There was a standing-room-only crowd for Mixing and Mastering in Multichannel Surround, chaired by Michael Bishop. The first presentation was by Bob Ludwig, who explained the challenges of mastering in surround sound, including advice on the use of the LFE channel. John Eargle described the way in which he simultaneously mixes for surround and stereo, referring to the problems of producing stereo from the surround mix by downmixing. Frank Filipetti discussed the problems of creating a multichannel product from a stereo mix, and encouraged record companies to specify how multichannel mixes were created. Finally, the opportunities provided by multichannel surround sound were explained by George Massenburg and Elliot Scheiner. Massenburg described it as a “sandbox” as he talked about why he enjoyed surround mixing, and Scheiner explained that “there are no rules, but you have to maintain the integrity of the music when you redo it in surround.”
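The downmix problem Eargle referred to can be made concrete. One common equal-power fold-down, in the style of ITU-R BS.775, adds the center and surround channels into the front pair at roughly -3 dB; the function below is only a sketch of that coefficient scheme, not of any particular engineer’s practice:

```python
import numpy as np

def downmix_5_to_2(L, R, C, Ls, Rs, k=0.7071):
    """Equal-power 5-to-2 fold-down with ITU-R BS.775-style coefficients.

    Lo = L + k*C + k*Ls,  Ro = R + k*C + k*Rs,  k ~= 1/sqrt(2) (-3 dB).
    """
    L, R, C, Ls, Rs = (np.asarray(x, dtype=float) for x in (L, R, C, Ls, Rs))
    Lo = L + k * C + k * Ls
    Ro = R + k * C + k * Rs
    return Lo, Ro
```

Because correlated center content lands in both outputs, an automatic fold-down can shift front/rear and dialog balances by several dB, which is one reason a dedicated stereo mix is often preferred over relying on the downmix.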

In addition to these two workshops on multichannel surround sound, there was also The Application of Multichannel Sound Formats

In addition to the usual lively discussion sessions, this convention continued the recent trend of an increasing number of tutorial workshops featuring recognized experts in their respective fields.

Surround sound and multichannel audio continued to be a popular topic at this convention. The first workshop was Stereo and Surround Microphone Techniques, chaired by Geoff Martin. Despite the early start of this workshop, a large crowd attended to hear the panel discuss microphone techniques that are applicable to a wide range of genres. The session opened with an informative presentation by Martin on the basic theories behind simple microphone arrays, with


Jim West (second from left), who gave Richard C. Heyser Memorial Lecture at 113th, receives citation from Wieslaw Woszczyk (left), Technical Council chair, and vice chairs, Robert Schulein and Jürgen Herre (right).
Quantity and quality of attendees exceeded exhibitors’ expectations.

Stereo and Surround Microphone Techniques was one of 15 well-attended workshops.

Attendees were able to sample advanced automotive audio systems.

Vintage equipment displayed at When Vinyl Ruled exhibit.


in Vehicles, chaired by Richard Stroud, and Coding of Spatial Audio, chaired by Christof Faller. The surround sound workshops were enhanced by the availability of a multichannel reproduction system in a small room, which enabled the presenters to demonstrate their recordings in a more acoustically suitable environment. Other demonstrations relating to the workshops program included some automobiles with multichannel audio systems.

As part of the promotion of new areas of audio technology within the AES, the 113th Convention included the workshop Game Audio, with presentations by experts from a range of sections of the computer game industry. Topics covered the use of surround sound in video games, issues in programming audio for games, and the audio capabilities of the Xbox. Also included was a discussion of

how to enter the game audio industry.

Loudspeaker technology received significant attention, with Large Signal Loudspeaker Parameter Measurement, chaired by John Stewart, and the two-part Loudspeaker Line Arrays, chaired by Jim Brown and John Murray.

The other workshops covered a wide range of topics: New Media for Music, chaired by Dave Davis; What Audio Engineers Should Know About Human Sound Perception, presented by Durand Begault and William Martens; AES 42-2001, chaired by Stephan Peus; Studio Production and Practices, chaired by George Massenburg and David Smith; Perceptual Issues Related to Cascaded Audio Codecs, chaired by Thomas Sporer; Protecting Your Hearing Against Loss, chaired by Dilys Jones and Sigfrid Soli; and Recent Developments in MPEG-4 Audio, chaired by Jürgen Herre.


113TH SPECIAL EVENTS
1. Platinum Producers Panel 1: from left, Ben Grosse, Tal Herzberg, Rob Cavallo, moderator Mitch Gallagher, Ron Fair, and Mike Elizondo.
2. Platinum Producers Panel 2: from left, Larry Levine, Michael Bradford, moderator Howard Massey, Phil Ramone, Patrick Leonard, and Bob Ezrin.
3. Road Warriors Panel: from left, Kirk Kelsey, David Morgan, Greg Dean, moderators Steve Harvey and Clive Young, and Paul Gallo.
4. GRAMMY Recording Soundtable: AES President Garry Margolis welcoming audience and panelists, from left, Ken Jordan, George Massenburg, Jack Joseph Puig, moderator Howard Massey, Elliot Scheiner, Brian Garten, and Al Schmitt.
5. An Afternoon with…Tom Holman: Tom Holman (left) and moderator George Petersen.



HEYSER LECTURE
The Richard C. Heyser Memorial Lecture was presented at this convention by James E. West, coinventor of the modern electret microphone. His lecture focused on its development and the unique applications of this technology to a wide range of technical areas. He started by reviewing the history of the technology, from the first experiments with electrets in the 15th century to the early electret microphones used by the Japanese in World War II. He went on to explain that the problem with these electrets was that they were wax-based, which caused their characteristics to be dependent on the atmospheric conditions and limited their lifespan to about six months.

The main achievement of West and his colleagues at Bell Labs was the discovery of stable charge storage in thin polymers and the use of these materials in microphone technology. West explained that whereas condenser microphones might be the best choice for some applications, electret microphones can be used in a wider range of situations, such as compact arrays.

West demonstrated the use of electret microphones together with signal processing for use in telecommunications. He showed an audio and video clip of the use of four microphones to obtain a second-order directional response to reduce background noise. This was followed by a demonstration of the use of signal processing to adapt for the distance between the mouth of the talker and the microphone array. In this case, the variations in the level and the frequency response caused by varying the talker-microphone distance were compensated based on analysis of the audio signals reaching each of the microphones in the array. West also demonstrated how this technology was adapted for special applications, such as communications in motor racing to overcome the high ambient noise levels in race cars.
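The directional processing West described can be illustrated in its simplest form: a first-order differential pair of omnidirectional capsules, the building block that is cascaded to obtain second-order responses like the four-microphone demonstration. The spacing, frequency, and delay values below are arbitrary illustrations, not parameters of the Bell Labs arrays:

```python
import numpy as np

def differential_pair_response(theta, spacing=0.01, freq=1000.0, c=343.0):
    """Far-field magnitude response of two omni capsules `spacing` meters
    apart, where the rear capsule is subtracted after an internal delay
    equal to the acoustic travel time across the pair.

    That delay choice places the null at theta = 180 degrees, giving a
    cardioid-family pattern that rejects sound arriving from the rear.
    """
    theta = np.asarray(theta, dtype=float)
    delay = spacing / c                    # internal delay: cardioid condition
    omega = 2.0 * np.pi * freq
    tau = spacing * np.cos(theta) / c      # external path-length delay
    return np.abs(1.0 - np.exp(-1j * omega * (tau + delay)))
```

Sweeping `theta` from 0 to 2π traces the polar pattern; changing the internal delay steers the null, which is the basic mechanism behind the noise-rejecting arrays described in the lecture.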

In conclusion, West described the use of electret microphones in arrays of up to three dimensions. This enables their use in a wide range of applications, from reducing background noise for speech recognition, through steered directivity for amplifying audience questions, to recording and reproducing 3D sound.

In addition to the lecture, the Technical Council was busy throughout the convention with committee meetings that covered a wide range of topics. The AES Technical Committees review trends in audio technology and practice and provide expertise to the Society, such as suggesting topics for workshops and conferences, publishing technical documents, and collating written papers and audio demonstrations on important topics. More information on the Technical Council and the Committees can be found at www.aes.org/technical.


Among many artists who performed at Songwriters Showcase: top to bottom, Severin Browne and James Coberly Smith, Deni Bonet, Vinx, and Tim Janis.


SPECIAL EVENTS
The program of special events at the convention included the 14th annual GRAMMY Recording Soundtable, two Platinum Producers sessions, and a Road Warriors Panel. Those who attended the Recording Soundtable heard Phil Ramone, Ken Jordan, George Massenburg, Jack Joseph Puig, Elliot Scheiner, and Al Schmitt discuss how they cope with the challenges of the recording industry. In the first Platinum Producers session Rob Cavallo, Mike Elizondo, Ron Fair, Ben Grosse, and Tal Herzberg, with moderator Mitch Gallagher, discussed trends in engineering and producing, such as how the traditionally separate roles of producer, engineer, and studio technician are starting to blend into one.

In the second Platinum Producers session Howard Massey moderated a discussion with panelists Michael Bradford, Bob Ezrin, Patrick Leonard, Larry Levine, and Phil Ramone on the past, present, and future of the recording industry. Continuing the theme of the keynote speech at the opening ceremony, they discussed the difficulties of creating and distributing new and possibly commercially risky music. They noted that fewer risks are being taken with new artists in the current business climate, meaning that the songs produced are becoming more and more similar. They briefly touched on the possibility that new methods of distribution could help this situation, but might also lead to the problem of not getting an adequate financial return.

On Tuesday Paul Gallo introduced the Road Warriors Panel of Kirk Kelsey, David Morgan, and Greg Dean, with moderators Steve Harvey and Clive Young, who discussed the latest trends, techniques, and tools in the live-sound industry.

To commemorate the 75th anniversary of the first successful talking picture, on Sunday evening the convention was host to an old-time radio re-creation of The Jazz Singer. The show featured Richard Halpern in the starring role made famous by Al Jolson, along with a supporting cast that included AES President Garry Margolis. Convention Chair Floyd Toole commented that this event exemplified the convention theme, “Science in the Service of Art,” as the production was art at its highest.

Throughout the convention attendees had the opportunity to take advantage of free hearing screening cosponsored by the AES and the House Ear Institute, in response to a growing interest in hearing conservation and to heighten awareness of the need for hearing protection and the safe management of sound.

The Historical Committee again put on a number of demonstrations and talks, as well as an exhibition of production technology during the age of vinyl, organized by Wes Dooley and Dale Manquen. The presentations included Carson Taylor discussing location recording from the 1950s to the 1970s and a review of 12 landmark microphones by Jim Webb. Other presentations covered topics such as vinyl disk mastering, record manufacturing, the development of powered monitors, and the development of mixing console technology.

The Art part of the convention theme was on display at the Songwriters Showcase, held in the main entrance area of the convention center. Attendees had the chance to listen to a wide range of live music performed throughout the four days of the convention. The music continued each evening at mixer events, which allowed the convention attendees to meet in a relaxed social atmosphere.

Other special events at the convention were: “An Afternoon with…Tom Holman,” in which moderator George Petersen interviewed Holman about his career in audio; the SPARS Business Panel; and sessions on technical areas currently of interest, such as Audio Post Production in 24p HDTV and The Virtual Studio: DAW-Based Recording.

EDUCATION EVENTS
Numerous education events also took place, arranged by Theresa Leonard and Don Puluse with the assistance of Scott Cannon and members of the student delegate assembly. The program gave students the opportunity to present their own work, including paper and poster sessions for technical work and a recording competition for practical work. For the first time a number of the educational events were included in the main convention workshop program,


113TH MEETINGS
1. Technical Council: from left, Jürgen Herre, vice chair, Wieslaw Woszczyk, chair, Bob Schulein, vice chair.
2. Publications Policy Committee: from left, Daniel von Recklinghausen, Emil Torick, and Garry Margolis.
3. Standards Committee: Mark Yonge (left) and John Nunn.
4. Historical Committee: Jay McKnight and Dale Manquen.
5. Regions and Sections Committee: Subir Pramanik.



these being the workshops on multichannel microphone techniques and mixing and mastering in surround sound mentioned above. It turned out that these events were also popular with nonstudents, resulting in some of the most highly attended events of the convention.

There was also a strong focus on mentoring. This enabled students to meet and talk to a number of respected audio engineers from a wide range of backgrounds, both in a group situation and in one-to-one sessions. The Education Forum hosted a discussion about the issues facing education in audio engineering, and enabled those involved in education to meet and share opinions. At the Education Fair attendees were able to learn more about the course offerings in audio at numerous colleges and universities.

TECHNICAL TOURS
Peter Chaikin and Mel Lambert organized a fascinating range of technical tours to complement the other convention activities. There were tours of a number of the leading recording studios in the Los Angeles area, including Sunset Sound Studios, Glenwood Place Studios, Capitol Recording Studios, and Cello Recording Studios. Tours were also organized to a number of performance venues, including the Kodak Theater, the Hollywood Bowl, and Platinum Live. A final event included a tour of the newly completed Cathedral of Our Lady of the Angels. This event was held in combination with the American Institute of Organ Builders and included lectures on the new 105-rank, 6,019-pipe organ. The visit concluded with a recital by AES organist-in-residence Graham Blyth, who performed works by Soler, Franck, Messiaen, Bach, and Vierne on the new organ to an enthusiastic audience.

AES COMMITTEE MEETINGS
The various committee meetings that help to keep the Society running were held in the days before, during, and after the four days of the convention.

The meetings of the Standards Committee commenced under the administration of Standards Manager Mark Yonge and Standards Committee Chair John Nunn two days before the convention began. As part of the committee's mission to develop standards relating to current practice and technology in professional audio, there were discussions on a wide range of topics, including digital audio, forensics, listening tests, file transfer, and measurement. The convention also saw the launch of the new website and email facilities for the Standards Committee, which will help develop the electronic communication of the working groups. For more information see www.aes.org/standards.

Other Society meetings were held throughout the convention. Following the convention the AES Board of Governors (see next page) met to discuss the issues facing the Society and to consider aspects of policy and planning.

Attendees left the convention with insights, possibilities, ideas, and inspiration in their chosen field and about the industry in general. Now they can look forward to the next European convention, the AES 114th, to be held March 22–25, 2003, in Amsterdam. For more information on this and other Society activities, visit the AES website at www.aes.org.


113th Education events, counterclockwise from top right: Education Fair, Mentoring Panel, students anxiously await the start of the Recording Competition, and Stereo and Surround Microphone Techniques workshop.


AES 113TH CONVENTION



Board of Governors Meets

Kunimara Tanaka, incoming governor

Daniel von Recklinghausen, editor; Marina Bosi, governor; Roger Furness, executive director

Theresa Leonard, governor and 24th International Conference chair; John Nunn, Standards Committee chair

Annemarie Staepelaere, governor; Roland Tan, governor; Markus Erne, Europe Central Region vice president

Jim Anderson, USA/Canada Eastern Region vice president; Curtis Hoyt, incoming governor

Marshall Buck, treasurer, Convention Policy Committee chair, and Finance Committee chair; Karl-Otto Bäder, governor

Mercedes Onorato, Latin American Region vice president; Daniel Zalay, Europe Southern Region vice president

Juergen Wahl, governor; Neville Thiele, International Region vice president


Meeting on October 9, members of the AES Board of Governors gather from around the world to hear reports from AES officials and standing committees:





Scott Cannon, student representative; Roy Pritts, past president and Nominations Committee chair

Han Tendeloo, incoming secretary; Wieslaw Woszczyk, Technical Council chair

Jay Fouts, legal counsel; Richard Small, Publications Policy Committee chair

Jay McKnight, Historical Committee chair; Subir Pramanik, Regions and Sections Committee chair

Bob Sherwood, financial advisor; Nick Zacharov, governor

Bob Moses, USA/Canada Western Region vice president; James Kaiser, USA/Canada Central Region vice president

Søren Bech, Europe Northern Region vice president and Conference Policy Committee chair

Garry Margolis, president, Future Directions Committee chair, and Women in Audio Committee acting chair; Ron Streicher, secretary and incoming president-elect

Don Puluse, Education Committee chair and incoming governor; David Robinson, governor and Awards Committee chair



A.D.A.M. Audio GmbH, Berlin, Germany
A Designs, West Hills, CA, USA
Aardvark, Ann Arbor, MI, USA
Ableton, Arcadia, CA, USA
*ACO Pacific, Inc., Belmont, CA, USA
Acoustic Systems, Austin, TX, USA
Acoustical Solutions, Inc., Richmond, VA, USA
Adamson Systems Engineering, Ajax, Ontario, Canada
ADK Microphones, Ridgefield, WA, USA
Advanced Audio, North Hollywood, CA, USA
AEA, Pasadena, CA, USA
AEQ, S. A., Leganes, Madrid, Spain
Akai Musical Instrument Corporation, Ft. Worth, TX, USA
*AKG Acoustics, US, Nashville, TN, USA
*AKM Semiconductor, Inc., San Jose, CA, USA
Alcorn McBride, Inc., Orlando, FL, USA
Alesis Distribution LLC, Los Angeles, CA, USA
Allen & Heath USA, Agoura Hills, CA, USA
AlterMedia, Burbank, CA, USA
Amek, Nashville, TN, USA
Ametron Audio/Video, Hollywood, CA, USA
*AMS Neve PLC, Burnley, Lancashire, UK
Analog Devices, Inc., Norwood, MA, USA
Anthony DeMaria Labs, New Paltz, NY, USA
Apex N.V., Genk, Belgium
Aphex Systems, Sun Valley, CA, USA
Apogee Electronics, Inc., Santa Monica, CA, USA
Applied Microphone Technology, Berkeley Heights, NJ, USA
ART-Applied Research & Technology, Rochester, NY, USA
APRS, Totnes, UK
Arboretum Systems, Inc., Pacifica, CA, USA
Arkaos, Arcadia, CA, USA
*ATC Loudspeaker Technology, Stroud, Gloucestershire, UK

ATC / Transamerica Audio Group, Inc., Las Vegas, NV, USA
ATI - Audio Technologies, Inc., Horsham, PA, USA
The ATI Group, Jessup, MD, USA
ATR Service Co., York, PA, USA
Audio Accessories, Inc., Marlow, NH, USA
Audio Developments, Portland, ME, USA
Audio Engineering Associates, Pasadena, CA, USA
*Audio Precision, Beaverton, OR, USA
Audio Specialties, Herndon, VA, USA
*Audio-Technica U.S., Inc., Stow, OH, USA
Audio Technology Magazine, Dee Why, NSW, Australia
Audix Corporation, Wilsonville, OR, USA
Auralex Acoustics, Indianapolis, IN, USA
Avalon Design, Inc., San Clemente, CA, USA
Bader LLC, Danville, KY, USA

Baltic Latvian Universal Electronics “B.L.U.E.”, Westlake Village, CA, USA
Belden Electronics Division, Richmond, IN, USA
Benchmark Media Systems, Inc., Syracuse, NY, USA
Berklee College of Music, Boston, MA, USA
BIAS (Berkley Integrated Audio Software), Petaluma, CA, USA
BitHeadz, Inc., East Greenwich, RI, USA
Blue Sky / Group One, Farmingdale, NY, USA
Boss, Los Angeles, CA, USA
Brainstorm Electronics, Inc., North Hollywood, CA, USA
Brauner / Transamerica Audio, Las Vegas, NV, USA
Brill Electronics, Oakland, CA, USA
Bruel & Kjaer North America, Norcross, GA, USA
Bryston Ltd., Arcadia, CA, USA
*BSS Audio, Nashville, TN, USA
Cable Factory, Burnaby, B.C., Canada

*Cadac Electronics Ltd., Luton, Bedfordshire, UK
Cakewalk, Boston, MA, USA
Caldwell Bennett Inc., Oriskany, NY, USA
*Calrec Audio Ltd., Hebden Bridge, West Yorkshire, UK
CB Electronics, Charvil, Berks., UK
*Cedar Audio Limited, Fulbourn, Cambridge, UK
Celestion / Group One Ltd., Farmingdale, NY, USA
Cherry Lane Magazines, LLC - Home Recording Magazine, New York, NY, USA
Chevin Research, Old Lyme, CT, USA
Church Production Magazine, Cary, NC, USA
Cirrus Logic Inc., Austin, TX, USA
Cliff Electronic Components, Inc., Benicia, CA, USA
Club Systems, Port Washington, NY, USA
Coding Technologies, Nürnberg, Germany
Coles, Portland, ME, USA
Coles / AEA, Pasadena, CA, USA


* Sustaining Member of the Audio Engineering Society

113th Exhibitors
Los Angeles • California • 2002 October 5–8


*Community Professional Loudspeakers, Chester, PA, USA
Compuplus Inc. / Mac Hollywood, Hollywood, CA, USA
Convention TV @ AES, Port Washington, NY, USA
Cooper Sound Systems, San Luis Obispo, CA, USA
Countryman Associates, Inc., Redwood City, CA, USA
Crane Song Ltd., Superior, WI, USA
Creative Network Design, Alameda, CA, USA
Crest Audio, Fair Lawn, NJ, USA
Crown International, Elkhart, IN, USA
Cycling '74, San Francisco, CA, USA
*D.A.S. Audio of America, Inc., Valencia, Spain
D.W. Fearn, West Chester, PA, USA
D2Audio Corporation, Austin, TX, USA
DACS Ltd., Portland, ME, USA
Dan Dugan Sound Design, San Francisco, CA, USA
Data Linked Products, Orange, CA, USA
dbx Pro, Sandy, UT, USA
*dCS Ltd., Portland, ME, USA
Demeter Tube Amplification, Van Nuys, CA, USA
Denon Electronics, Pine Brook, NJ, USA
Design FX Audio, Burbank, CA, USA
DiGiCo / Soundtracs, Epsom, Surrey, UK
*Digidesign, Daly City, CA, USA
Digidesign Development Partners, Daly City, CA, USA
*Digigram, Arlington, VA, USA
Digital Music Technologies, Burbank, CA, USA
Disc Makers, Pennsauken, NJ, USA
DJ Times, Port Washington, NY, USA
DK Audio A/S, Herlev, Denmark
*Dolby Laboratories, Inc., San Francisco, CA, USA
Doremi Labs, Inc., Burbank, CA, USA
Dorrough Electronics, Woodland Hills, CA, USA

DPA Microphones / TGI North America Inc., Kitchener, Ontario, Canada
Drawmer (USA) / Transamerica Audio Group, Inc., Las Vegas, NV, USA
Digital Theater Systems, Inc., Agoura Hills, CA, USA
Dupco, Salem, NH, USA
Dynaudio Acoustics, Westlake Village, CA, USA
Earth Works Audio Products, Milford, NH, USA
*Eastern Acoustic Works, Inc., Whitinsville, MA, USA
ESI (EGO SYS Inc.), San Jose, CA, USA
Electro-Voice, Burnsville, MN, USA
Electronic Musician, Emeryville, CA, USA
Elsevier Science, New York, NY, USA
Embracing Sound Experience, Stockholm, Sweden
Equi=Tech Corporation, Selma, OR, USA
ETA Systems, Twinsburg, OH, USA
Euphonix, Inc., Palo Alto, CA, USA
Eventide, Inc., Little Ferry, NJ, USA
Expression Center for New Media, Emeryville, CA, USA
Fairlight USA, Hollywood, CA, USA
FAR Fundamental Acoustic Research, Ougree, Belgium
*Fostex America, Norwalk, CA, USA
*Fraunhofer Institut Fuer Integrierte Schaltungen, Erlangen, Germany
FriendChip America, West Hollywood, CA, USA
Furman Sound, Petaluma, CA, USA
Gefen Systems, Woodland Hills, CA, USA
Genelec, Natick, MA, USA
Genex Audio, Santa Monica, CA, USA
Geoffrey Daking & Co., Inc., Treasure Island, FL, USA
Gepco International, Inc., Des Plaines, IL, USA
Gibson Labs, Redondo Beach, CA, USA
GML, LLC, Franklin, TN, USA

Gold Line, West Redding, CT, USA
Gordon Instruments, Nashville, TN, USA
Gotham Audio Cables, Regensdorf, Switzerland
Grace Design, Boulder, CO, USA
Great River Electronics Inc., Inver Grove Heights, MN, USA
Groove Tubes, San Fernando, CA, USA
Group One Ltd., Farmingdale, NY, USA
H.E.A.R. — Hearing Education & Awareness for Rockers, San Francisco, CA, USA
Hacousto International / Sonic Systems, Inc., Stratford, CT, USA
Harrison by GLW, Inc., Lavergne, TN, USA
Hear Technologies, Huntsville, AL, USA
*HHB Communications USA, Simi Valley, CA, USA
Hosa Technology, Inc., Buena Park, CA, USA
House Ear Research Institute, Los Angeles, CA, USA
Ibiquity Digital Corp., Warren, NJ, USA
ILIO Entertainments, Malibu, CA, USA
Imas Publishing / Audio Media / Pro Audio Review, Falls Church, VA, USA
Independent Audio, Portland, ME, USA
Industrial Acoustics Co., Bronx, NY, USA
*Innova-SON, Old Lyme, CT, USA
Inter-M / Algorithmix (European Office, Germany), Karlsruhe, Germany
International DJ Expo / The Club Show, Port Washington, NY, USA
IZ Technology Corp., Burnaby, B. C., Canada
*JBL Professional, Northridge, CA, USA
*Jensen Transformers, Van Nuys, CA, USA
JLCooper Electronics, El Segundo, CA, USA
The John Hardy Company, Evanston, IL, USA
Josephson Engineering, Santa Cruz, CA, USA
JRF Magnetic Sciences, Greendell, NJ, USA
*Klark-Teknik, Burnsville, MN, USA

Klippel GmbH, Dresden, Germany
KRK Systems, LLC, Hollywood, FL, USA
Kurzweil Music Systems, Lakewood, WA, USA
*L-Acoustics US, Oxnard, CA, USA
Lavry Engineering (formerly dB Technologies), Seattle, WA, USA
Lectrosonics, Inc., Rio Rancho, NM, USA
Level Control Systems, Sierra Madre, CA, USA
Lexicon, Inc., Bedford, MA, USA
Line 6, Agoura Hills, CA, USA
Linn Products Ltd., Glasgow, Scotland, UK
Listen, Inc., Boston, MA, USA
Little Labs, Los Angeles, CA, USA
Live Sound! International, San Francisco, CA, USA
Logitek Electronic Systems, Inc., Houston, TX, USA
Lundahl Transformers AB, Norrtalje, Sweden
Lynx Studio Technology, Inc., Newport Beach, CA, USA
M-Audio Inc., Arcadia, CA, USA
MAGMA, San Diego, CA, USA
Magtrax Surround, Portland, ME, USA
Manley Laboratories, Inc., Chino, CA, USA
Marian, West Hollywood, CA, USA
Marquette Audio Labs / Mercury, Hayward, CA, USA
Marshall Electronics, Inc. (Mogami Cables), El Segundo, CA, USA
Marshall Electronics, Inc. (MXL Microphones), El Segundo, CA, USA
*Martin Audio, Kitchener, Ontario, Canada
Martinsound, Inc., Alhambra, CA, USA

Master Recording Supply, Newport Beach, CA, USA
MBHO — Haun Microphones Germany, Brooklyn, NY, USA
MC2 Audio Ltd./Group One, Farmingdale, NY, USA
McCauley Sound, Inc., Puyallup, WA, USA
Mercury Recording Equipment Co., Hayward, CA, USA
Merging Technologies, Northbrook, IL, USA
Metric Halo Distribution, Inc., Castle Point, NY, USA
Meyer Sound Laboratories, Inc., Berkeley, CA, USA
Microboards Technology, LLC, Chanhassen, MN, USA
Microtech Gefell GmbH, Thuringia, Germany
Midas, Burnsville, MN, USA
Midiman, Arcadia, CA, USA
Millennia Media, Inc., Placerville, CA, USA
Miller & Kreisel Sound Corporation, Chatsworth, CA, USA
Mix, Emeryville, CA, USA
Mixed Logic Studio Electronics, Brook Park, OH, USA
Motorola, Inc., Austin, TX, USA
mSoft Inc., Woodland Hills, CA, USA
The Museum of Sound Recording, New York, NY, USA
Music And More (MAM), West Hollywood, CA, USA
Music Connection Magazine, Studio City, CA, USA
Music Maker Publications / Recording Magazine, Boulder, CO, USA
Mytek Digital, New York, NY, USA
Nagra USA, Inc., Nashville, TN, USA




National Academy of Recording Arts and Sciences, Santa Monica, CA, USA
Native Instruments GmbH, Berlin, Germany
*Georg Neumann, Old Lyme, CT, USA
*Neutrik USA, Inc., Lakewood, NJ, USA
Neutrik Test Instrument (NTI), St. Laurent, QC, Canada
*New Transducers Ltd. (NXT), Huntingdon, Cambs., UK
Nexo USA, San Rafael, CA, USA
NHT Pro, Benicia, CA, USA
Noren Products, Inc., Menlo Park, CA, USA
Norris-Whitney Communications, St. Catharines, Ontario, Canada
OKM Microphones, Portland, ME, USA
Opticom, Erlangen, Germany
Otari Corporation, Canoga Park, CA, USA
Pacific Radio Electronics, Los Angeles, CA, USA
Pearl Lab, Portland, ME, USA
Peavey Electronics Corp., Meridian, MS, USA
Pelonis Sound and Acoustics, Inc., Gaviota, CA, USA
Pendulum Audio, Inc., Berkeley Heights, NJ, USA
Penn Fabrication, Inc., Oxnard, CA, USA
Penny and Giles Control Inc., Schaumburg, IL, USA
Philips Electronics — Super Audio CD, Eindhoven, The Netherlands
Phoenix Audio International (UK), Stevenage, Herts., UK
Pilchner Schoustal Inc., Toronto, Ontario, Canada
Plugzilla, New York, NY, USA
Plus24, West Hollywood, CA, USA
PMC Monitors, Welwyn Garden City, Herts., UK
PMI Audio Group, Torrance, CA, USA
Post Magazine (AdvanStar Communications), New York, NY, USA
Powerphysics, Newport Beach, CA, USA
Precision Laboratories, Altamonte Springs, FL, USA
PreSonus Audio Electronics, Baton Rouge, LA, USA

Prime LED, San Francisco, CA, USA
*Primedia Business, New York, NY, USA
Primera Technology Inc., Plymouth, MN, USA
Princeton Digital, Princeton, NJ, USA
Prism Media Products, Inc., Rockaway, NJ, USA
Professional Audio Design, Inc., Rockland, MA, USA
Propellerhead, Arcadia, CA, USA
QSC Audio Products, Inc., Costa Mesa, CA, USA
RADIAN Audio Engineering, Inc., Orange, CA, USA
Radikal Technologies, Teaneck, NJ, USA
RDL Radio Design Labs, Carpinteria, CA, USA
*Rane Corporation, Mukilteo, WA, USA
RCS Enterprises, White Plains, NY, USA
Renkus-Heinz, Inc., Foothill Ranch, CA, USA
Resolution (S2 Publications Ltd.), London, UK
RODE Microphones, Torrance, CA, USA
Rohde & Schwarz GmbH & Co. KG, Munich, Germany
Roland Corporation, Los Angeles, CA, USA
Rolls Corporation, Murray, UT, USA
Royer Labs, Burbank, CA, USA
RTS, Burnsville, MN, USA
Sabra-som Ltd. (K-IV Enterprises), Mahwah, NJ, USA
*SADIE, Inc., Nashville, TN, USA
SAE Institute of Technology, North Miami Beach, FL, USA
Sanken Microphones, W. Hollywood, CA, USA
Schoeps/Redding Audio, Inc., Newtown, CT, USA
SE Electronics, Rohnert Park, CA, USA
SEK’D, North Hollywood, CA, USA
Seltron Components Ltd., Leadgate, Co Durham, Suffolk, UK
*Sennheiser Electronics Corp., Old Lyme, CT, USA
The Serafine Collection, Venice, CA, USA

*Shure Incorporated, Evanston, IL, USA
SLS Loudspeakers, Springfield, MO, USA
Solid State Logic, Begbroke, Oxford, UK
SommerCable America, West Hollywood, CA, USA
Sonic Studio LLC, Plymouth, MN, USA
Sonifex, Portland, ME, USA
Sonomic, New York, NY, USA
*Sony Electronics, Inc., Park Ridge, NJ, USA
Sony Super Audio CD, New York, NY, USA
*Sound Devices, LLC, Reedsburg, WI, USA
Sound Ideas, Richmond Hill, Ontario, Canada
*Sound on Sound Magazine, Bar Hill, Cambridge, UK
Sound & Video Contractor Magazine, Emeryville, CA, USA
*Soundcraft, Nashville, TN, USA
Soundelux — Hollywood Edge, Hollywood, CA, USA
Soundelux Microphones, Hollywood, CA, USA
Soundfield Research / Transamerica Audio Group, Las Vegas, NV, USA
SPARS, Memphis, TN, USA
Speck Electronics, Fallbrook, CA, USA
Spectrasonics, Burbank, CA, USA
SPL Electronics GmbH, Niederkruechten, Germany
*SRS Labs, Inc., Santa Ana, CA, USA
Steinberg N. America, Chatsworth, CA, USA
StorCase Technology, Fountain Valley, CA, USA
*STUDER, Toronto, Ontario, Canada
Studio Network Solutions, St. Louis, MO, USA
Studio Technologies, Inc., Skokie, IL, USA
Summit Audio, Inc., Watsonville, CA, USA
Sunrise E.& E. Inc., Diamond Bar, CA, USA
Swissonic America, West Hollywood, CA, USA
Switchcraft, Inc., Chicago, IL, USA
Symetrix, Inc., Lynwood, WA, USA

Syntrillium Software Corp., Scottsdale, AZ, USA
TAD — Technical Audio Devices / Pioneer, Long Beach, CA, USA
Tamura Corporation, Tokyo, Japan
*Tannoy North America Inc., Kitchener, Ontario, Canada
*TASCAM, Montebello, CA, USA
Taytrix, Inc., Jersey City, NJ, USA
TC Electronic Inc., Westlake Village, CA, USA
TC Helicon, Westlake Village, CA, USA
TC Works, Westlake Village, CA, USA
TECA Spa, Bresua, Italy
Telefunken North America, LLC, South Windsor, CT, USA
Telex Communications, Inc., Burnsville, MN, USA
TerraSonde, Boulder, CO, USA
Sound & Communication Magazine, Port Washington, NY, USA
Testa Communications, Port Washington, NY, USA
Texas Instruments, Dallas, TX, USA
*THAT Corporation, Milford, MA, USA
The Music & Sound Retailer, Port Washington, NY, USA
THE — Taylor Hohendahl Engineering, S. Woodstock, CT, USA
THX Ltd., San Rafael, CA, USA
Tonian Laboratories, Burbank, CA, USA
ToteVision, Seattle, WA, USA
Tour Guide / Session Guide Magazines, Nashville, TN, USA
Trian Electronics/Quested Monitoring Systems/Shep Associates, Waunakee, WI, USA
Trident Audio, Hook Green, Meopham, Kent, UK
Tube Tech, Westlake Village, CA, USA
Ultrasone AG, Walnut Creek, CA, USA
Under Cover, New York, NY, USA

*United Entertainment Media, New York, NY, USA
Universal Audio, Santa Cruz, CA, USA
*VidiPax, New York, NY, USA
Vienna Symphonic Library GmbH, Vienna, Austria
Vintage Studio Rentals, North Hollywood, CA, USA
Vintech Audio, Seffner, FL, USA
Virtual Mixing Company, Daly City, CA, USA
Voyager Sound Inc., Weston, MA, USA
Wacom Technology, Vancouver, WA, USA
Wave Distribution, Ringwood, NJ, USA
Wave:Space, Inc., Foothill Ranch, CA, USA
Wave Arts, Inc., Arlington, MA, USA
Waves, Inc., Knoxville, TN, USA
*Wenger Corporation, Owatonna, MN, USA
West Penn Wire / CDT, Washington, PA, USA
Westlake Audio, Newbury Park, CA, USA
Whirlwind, Rochester, NY, USA
WhisperRoom, Inc., Morristown, TN, USA
Wireworks Corporation, Hillside, NJ, USA
Wohler Technologies, Inc., S. San Francisco, CA, USA
World Link Digital, Burbank, CA, USA
X-Vision Audio US Ltd., Youngstown, OH, USA
XTA Electronics / Group One Ltd., Farmingdale, NY, USA
*Yamaha Corporation of America, Buena Park, CA, USA
Yamaha mLAN Licensing Office, Buena Park, CA, USA
Z-Systems, Inc., Gainesville, FL, USA
Zaxcom Audio, Midland Park, NJ, USA





Session A, Saturday, October 5, 9:00 am–11:30 am, Room 404AB

TRANSDUCERS, PART 1

Chair: Marshall Buck, Gibson Labs, Redondo Beach, CA, USA; Psychotechnology Inc., Los Angeles, CA, USA

9:00 am

A-1 Which Loudspeaker Parameters Are Important to Create the Illusion of a Live Performance in the Living Room?—Siegfried Linkwitz, Linkwitz Lab, Corte Madera, CA, USA

The preference in loudspeaker product design is for a small size, while preserving maximum low-frequency extension and output volume. If the goal is to create a realistic reproduction of a live event, then certain speaker parameters must be adequately controlled, such as volume displacement, intermodulation distortion, stored energy, and off-axis frequency response. Components must be carefully selected for low distortion performance. Parameters like phase linearity and cabinet diffraction are sometimes overrated. Multichannel loudspeaker setups require propagation delay correction and bass management if not all of the loudspeakers cover the full frequency range. These issues are reviewed at the advent of high-resolution surround sound. The new technology can only fulfill its promise and expand into more than a niche market if capable loudspeakers are widely available.
Convention Paper 5637

9:30 am

A-2 Characterizing the Amplitude Response of Loudspeaker Systems—Allan Devantier, Harman International Industries, Inc., Northridge, CA, USA

The amplitude response of a loudspeaker system is characterized by a series of spatially averaged measurements. The proposed approach recognizes that the listener hears three acoustical events in a typical domestic environment: the direct sound, the early arrivals, and the reverberant sound field. A survey of fifteen domestic multichannel installations was used to determine the typical angle of the direct sound and the early arrivals. The reflected sound that arrives at the listener after encountering only one room boundary is used to approximate the early arrivals, and the total sound power is used to approximate the reverberant sound field. Two unique directivity indices are also defined, and the in-room response of the loudspeaker is predicted from anechoic data.
Convention Paper 5638
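The core computation behind spatially averaged curves of this kind is simple to sketch. The snippet below is an illustration of the general idea only, not Devantier's actual procedure (the paper's angle windows, weighting, and its two specific directivity indices are more detailed); it averages magnitude responses over angles on a power basis and forms a directivity index as the on-axis level minus the spatially averaged level.

```python
import numpy as np

def spatial_average_db(responses):
    """Spatially averaged level (dB) from linear magnitude responses.

    responses: array of shape (n_angles, n_freqs). Averaging is done on
    power (magnitude squared), as is usual for sound-power-style curves.
    """
    mean_power = np.mean(np.abs(responses) ** 2, axis=0)
    return 10.0 * np.log10(mean_power)

def directivity_index_db(on_axis, all_angles):
    """DI = on-axis level minus spatially averaged level, per frequency."""
    on_axis_db = 20.0 * np.log10(np.abs(on_axis))
    return on_axis_db - spatial_average_db(all_angles)

# Toy data: a source that is omnidirectional at low frequencies and
# beams (relatively louder on axis) at high frequencies.
on_axis = np.array([1.0, 1.0, 1.0])          # levels at 3 frequencies
off_axis = np.array([[1.0, 0.7, 0.3],
                     [1.0, 0.8, 0.4]])       # two off-axis positions
all_angles = np.vstack([on_axis, off_axis])
di = directivity_index_db(on_axis, all_angles)
# DI grows with frequency as the off-axis response falls
assert di[0] < 0.1 and di[2] > di[0]
```

The same averaged curves can then be combined (with suitable weights) to predict an in-room response from anechoic data, which is the spirit of the paper's approach.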

10:00 am

A-3 Graphing, Interpretation, and Comparison of Results of Loudspeaker Nonlinearity Measurement—Alexander Voishvillo, Alex Terekhov, Gene Czerwinski, Sergei Alexandrov, Cerwin-Vega Inc., Simi Valley, CA, USA

Harmonic distortion and THD do not convey sufficient information about nonlinearity in loudspeakers and horn drivers. Multitone stimulus and Gaussian noise produce a more informative nonlinear response. Reaction to Gaussian noise can be transformed into a coherence or incoherence function. These provide information about nonlinearity in the form of easy-to-grasp frequency-dependent curves. The results of multitone measurement, in contrast, are difficult to interpret, compare, and overlay. A new method of depicting the results of multitone measurements has been developed. The distortion products are averaged in a moving frequency window. The result of the measurement is a single, continuous, frequency-dependent curve that takes into account the level of distortion products and their density. The curves can be easily overlaid and compared. Future development of the new method may lead to correlation between the level of distortion curves and the audibility of nonlinear distortion.
Convention Paper 5639
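The moving-window idea can be sketched in a few lines. This is an illustrative reconstruction only, not Voishvillo et al.'s actual algorithm (notably, their method also accounts for the density of distortion products, which this sketch omits): distortion-product levels scattered across frequency are pooled on a power basis inside a sliding fractional-octave window, yielding one continuous curve that can be overlaid and compared.

```python
import numpy as np

def smoothed_distortion_curve(freqs, levels_db, centers, width_oct=0.5):
    """Average distortion products falling in a moving fractional-octave
    window, producing one continuous frequency-dependent curve (dB).

    freqs: frequencies (Hz) of the individual distortion products
    levels_db: their levels in dB
    centers: window center frequencies for the output curve
    width_oct: total window width in octaves
    """
    freqs = np.asarray(freqs, float)
    power = 10.0 ** (np.asarray(levels_db, float) / 10.0)
    curve = np.full(len(centers), -np.inf)
    for i, fc in enumerate(centers):
        lo, hi = fc * 2.0 ** (-width_oct / 2), fc * 2.0 ** (width_oct / 2)
        mask = (freqs >= lo) & (freqs <= hi)
        if mask.any():
            # mean power of the products inside the window, back to dB
            curve[i] = 10.0 * np.log10(power[mask].mean())
    return curve

# Products scattered around 1 kHz at -40 dB and around 4 kHz at -30 dB;
# the smoothed curve recovers those two levels.
f = [950, 1000, 1050, 3800, 4000, 4200]
l = [-40, -40, -40, -30, -30, -30]
curve = smoothed_distortion_curve(f, l, centers=[1000, 4000])
assert abs(curve[0] + 40) < 0.5 and abs(curve[1] + 30) < 0.5
```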

10:30 am

A-4 The Effects of Voice-Coil Axial Rest Position on Amplitude Modulation Distortion in Loudspeakers—Ryan J. Mihelich, Harman/Becker Automotive Systems, Martinsville, IN, USA

The magnetic field in the air gap of a conventional loudspeaker motor is often an asymmetric nonlinear function of axial position. Placement of the voice-coil into this asymmetrical field yields an asymmetric nonlinear force factor, Bl, which is a primary cause of amplitude modulation distortion in loudspeakers. Adjustment of the rest position of the voice-coil in this field can alter the nature of this modulation distortion. Common practice is to nominally set the voice-coil at the geometric center of the gap or at the position generating maximum Bl. A time-domain nonlinear simulator has been used to investigate effects of voice-coil placement in an asymmetric flux field on amplitude modulation distortion.
Convention Paper 5640

AES 113th Convention Program
2002 October 5–8
Los Angeles Convention Center
Los Angeles, California, USA

11:00 am

A-5 Nonlinearity in Horn Drivers—Where the Distortion Comes From?—Alexander Voishvillo, Cerwin-Vega Inc., Simi Valley, CA, USA

Compared to other components of professional sound systems (omitting free propagation distortion), horn drivers have the worst nonlinear distortion. Some of the driver's distortions can be mitigated by proper mechanical measures. However, distortions caused by nonlinear air compression and propagation are inherent to any horn driver. In this paper the comparison of nonlinear distortions caused by different sources is carried out through measurements and modeling. The new dynamic model of the compression driver is based on a system of nonlinear differential and algebraic equations. The complex impedance of an arbitrary horn is considered by turning the impedance into a system of differential equations describing the pressure and velocity at the horn's throat. The comparison is carried out using harmonic distortion and the reaction to a multitone stimulus.
Convention Paper 5641

Session B, Saturday, October 5, 9:00 am–11:00 am, Room 406AB

SIGNAL PROCESSING, PART 1

Chair: Rob Maher, Montana State University, Bozeman, MT, USA

9:00 am

B-1 A Talker-Tracking Microphone Array for Teleconferencing Systems—Kazunori Kobayashi, Ken’ichi Furuya, Akitoshi Kataoka, NTT, Musashino-shi, Tokyo, Japan

We propose a beamforming method that is applicable to near sound fields, where a filter-and-sum microphone array maintains better quality for the target sound than the conventional delay-and-sum array. We also describe a real-time implementation that includes steering of the beam to detected talker locations. With the use of a microphone array, our system also cuts levels of noise to achieve high-quality sound acquisition. Furthermore, it allows the talker to be in any position. Computer simulation and experiments show that our method is effective in teleconferencing systems.
Convention Paper 5642
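For readers unfamiliar with the baseline the authors compare against, a minimal near-field delay-and-sum beamformer can be sketched as follows. This is an illustration only, not the paper's filter-and-sum system, and the function names and geometry are invented for the example: each microphone signal is advanced by its own spherical-wave propagation delay from the assumed talker position, then the channels are averaged, so the target adds coherently while uncorrelated noise is attenuated.

```python
import numpy as np

C = 343.0  # speed of sound, m/s

def delay_and_sum(signals, mic_pos, source_pos, fs):
    """Delay-and-sum beamformer steered to a known talker position.

    Near-field steering: each microphone's delay comes from its own
    distance to the source (spherical wavefront), not from a single
    plane-wave direction. signals: (n_mics, n_samples) array.
    """
    dists = np.linalg.norm(mic_pos - source_pos, axis=1)
    delays = np.round((dists - dists.min()) / C * fs).astype(int)
    # advance each channel so the target arrivals line up, then average
    aligned = np.stack([np.roll(s, -d) for s, d in zip(signals, delays)])
    return aligned.mean(axis=0)

# Simulate a 1 kHz talker at (0, 1) m captured by two mics; each mic
# sees the tone with its own (integer-sample) propagation delay.
fs = 48000
mic_pos = np.array([[0.0, 0.0], [0.5, 0.0]])
source_pos = np.array([0.0, 1.0])
dists = np.linalg.norm(mic_pos - source_pos, axis=1)
true_delays = np.round(dists / C * fs).astype(int)
n = np.arange(4096)
src = np.sin(2 * np.pi * 1000 * n / fs)
signals = np.stack([np.roll(src, d) for d in true_delays])

out = delay_and_sum(signals, mic_pos, source_pos, fs)
# Coherent sum: the steered output stays at (nearly) full amplitude.
assert np.max(np.abs(out)) > 0.99
```

A filter-and-sum array generalizes this by replacing each pure delay with a per-microphone filter, which is what gives it an advantage in the near field.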

9:30 am

B-2 An Alternative Model for Sound Signals Encountered in Reverberant Environments: Robust Maximum Likelihood Localization and Parameter Estimation Based on a Sub-Gaussian Model—Panayiotis G. Georgiou, Chris Kyriakakis, University of Southern California, Los Angeles, CA, USA

In this paper we investigate an alternative to the Gaussian density for modeling signals encountered in audio environments. The observation that sound signals are impulsive in nature, combined with the reverberation effects commonly encountered in audio, motivates the use of the sub-Gaussian density. The new sub-Gaussian statistical model and the separable solution of its maximum likelihood estimator are derived. These are used in an array scenario to demonstrate, with both simulations and two different microphone arrays, the achievable performance gains. The simulations exhibit the robustness of the sub-Gaussian-based method, while the real-world experiments reveal a significant performance gain, supporting the claim that the sub-Gaussian model is better suited for sound signals.
Convention Paper 5643

10:00 am

B-3 Imperceptible Echo for Robust Audio Watermarking—Hyen-O Oh,1 Jong Won Seok,2 Jin Woo Hong,2 Dae-Hee Youn1

1Yonsei University, Seoul, Korea
2Electronics & Telecommunications Research Institute (ETRI), Daejon, Korea

In echo watermarking, the effort to improve robustness often conflicts with the requirement of imperceptibility. There have been inherent trade-offs in general audio watermarking techniques. In this paper we take up the challenge of developing imperceptible but detectable echo kernels that are directly embedded into the high-quality audio signal. Mathematical and perceptual characteristics of echo kernels are analyzed in the frequency domain. Finally, we obtain a flatter frequency response in perceptually significant bands by combining closely located positive and negative echoes. The proposed echo makes it possible to improve the robustness of an echo watermark without breaking imperceptibility.
(Paper not presented at convention, but Convention Paper 5644 is available.)
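The effect of pairing positive and negative echoes is easy to verify numerically. The sketch below is a toy illustration of the principle, not the kernel design of the paper: a single positive echo produces the familiar comb-filter ripple, while a closely spaced positive/negative pair acts like a differentiated echo whose ripple largely cancels at low and mid frequencies (at the cost of more ripple near the upper band edge).

```python
import numpy as np

def kernel_response_db(taps, n_fft=4096):
    """Magnitude response (dB) of an echo kernel given as {lag: gain}."""
    h = np.zeros(n_fft)
    for lag, gain in taps.items():
        h[lag] += gain
    H = np.abs(np.fft.rfft(h))
    return 20.0 * np.log10(np.maximum(H, 1e-12))

# Single positive echo: strong comb-filter ripple across the band.
single = kernel_response_db({0: 1.0, 100: 0.2})
# Closely spaced positive + negative echoes: ripples largely cancel
# in the lower part of the band.
paired = kernel_response_db({0: 1.0, 100: 0.2, 101: -0.2})

band = slice(0, 256)  # low-frequency portion of the spectrum
ripple_single = single[band].max() - single[band].min()
ripple_paired = paired[band].max() - paired[band].min()
assert ripple_paired < ripple_single
```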

10:30 am

B-4 New High Data Rate Audio Watermarking Based on SCS (Scalar Costa Scheme)—Frank Siebenhaar,1 Christian Neubauer,1 Robert Bäuml,2 Jürgen Herre1

1Fraunhofer Institute for Integrated Circuits, Erlangen, Germany
2Friedrich-Alexander University, Erlangen, Germany

Presently, distribution of audio material is no longer limited to physical media. Instead, distribution via the Internet is of increasing importance. In order to attach additional information to the audio content, either for forensic or digital rights management purposes or for annotation purposes, watermarking is a promising technique, since it is independent of the audio format and transmission technology.

State-of-the-art spread spectrum watermarking systems can offer high robustness against unintentional and intentional signal modifications. However, their data rate is typically comparatively low, often below 100 bit/s. This paper describes the adaptation of a new watermarking scheme called the Scalar Costa Scheme (SCS), which is based on dithered quantization of audio signals. In order to fulfill the demands of high-quality audio signal processing, modifications of the basic SCS, such as the introduction of a psychoacoustic model and new algorithms to determine quantization intervals, are required. Simulation figures and results of a sample implementation, which show the potential of this new watermarking scheme, are presented in this paper along with a short theoretical introduction to the SCS watermarking scheme.
Convention Paper 5645




Workshop 1 Saturday, October 5 9:00 am–11:00 am
Room 408A

STEREO AND SURROUND MICROPHONE TECHNIQUES (TUTORIAL)

Chair: Geoff Martin, McGill University, Montreal, Quebec, Canada, and Bang & Olufsen a/s, Struer, Denmark

Presenters: Michael Bishop, Telarc International; Doug Botnick, Independent Engineer/Producer; John Eargle, JME/JBL Professional; Richard King, Sony Music Studios; Mick Sawaguchi, NHK Broadcasting Center, Programme Production Operations, Sound Division

This tutorial workshop includes a panel of leading industry professionals from the areas of pop, classical, and film music, as well as radio drama. Issues to be discussed include the characteristics of various microphone configurations as well as upward- and downward-compatibility considerations. This workshop, which has been organized by the AES Education Committee, will be of benefit to audio engineers of all backgrounds, including students.

11:10 am

Technical Committee Meeting on Microphones

Special Event
WHEN VINYL RULED!
Saturday, October 5 12:00 noon–6:00 pm
Sunday, October 6 12:00 noon–6:00 pm
Monday, October 7 10:00 am–6:00 pm
Tuesday, October 8 10:00 am–4:00 pm
Room 308A

Organizers: Wes Dooley, Audio Engineering Associates, Pasadena, CA, USA; Dale Manquen, Consultant, Thousand Oaks, CA, USA

Vinyl records ushered in an age of consumer high fidelity. Magnetic tape recording was the technology that made practical the production of long-playing records. Assembling a component high-fidelity system became a widespread hobby for many of us, and for decades fueled the development of loudspeakers, electronics, and microphones. Listening to recorded music became a part of many people's daily life.

Our historical exhibit will offer attendees an overview of production technology during the age of vinyl and spotlight its relevance to current music production.

The analog audio presentation will take the visitor from Ampex's first machine, the 30 in/s, one-quarter-inch, full-track recordings, to the leading-edge technology of Michael Spitz's contemporary 30 in/s, one-inch, two-track mastering decks. Capitol Records will have the spotlight on Saturday afternoon when Carson Taylor talks about and plays examples of Tower and location recordings of the fifties, sixties, and seventies. Other key highlights will include Jim Webb's presentation of “12 Landmark Microphones” that made history; Kevin Gray's and Stan Ricker's presentation of vinyl disc mastering and record manufacturing; Paul McManus on the development and history of powered monitors from the fifties; and Ken Hirsch and David Gordon on the development of mixing console technology.

Saturday, Sunday, Monday, and Tuesday come up to Demo Room Row and hear good sounds, vintage and contemporary.

Be sure to check the daily schedule outside Room 308A for exact presentation times.

Special Event
FREE HEARING SCREENINGS CO-SPONSORED BY THE AES AND HOUSE EAR INSTITUTE
Saturday, October 5 12:00 noon–6:00 pm
Sunday, October 6 10:00 am–6:00 pm
Monday, October 7 10:00 am–6:00 pm
Tuesday, October 8 10:00 am–4:00 pm
Booth 1881

Attendees are invited to take advantage of a free hearing screening co-sponsored by the AES and House Ear Institute. Four people can be screened simultaneously in the mobile audiological screening unit located on the exhibit floor. A daily sign-up sheet at the unit will allow individuals to reserve a screening time for that day. This hearing screening service has been developed in response to a growing interest in hearing conservation and to heighten awareness of the need for hearing protection and the safe management of sound.

Special Event
SONGWRITERS SHOWCASE: “IT'S ALL ABOUT THE SONG”
October 5–October 7 1:00 pm–8:00 pm
October 8 12:00 noon–3:00 pm
South Hall Lobby

Produced by: Claudia Koal

The Audio Engineering Society will present the Songwriters Showcase, “It's All About the Song.” This event marks the third time the AES will incorporate live original music into its convention schedule. The Songwriters Showcase is a means of acknowledging the song as a key element in the motivation for technological advancements in audio. The roster of performers includes international composers and songwriters. The event features a wide spectrum of musical genres. Members of performing rights organizations ASCAP, BMI, and SESAC will perform original works at daily performances. A complete schedule of performers will be posted in the registration, press, and exhibition areas throughout the four-day convention. Enjoy special performances during the AES Mixers each evening.

Education Event
STUDENT DELEGATE ASSEMBLY 1
Saturday, October 5, 1:00 pm–2:30 pm
Room 402A

Chair: Scott Cannon, Stanford University Student Section, Stanford, CA, USA

Vice Chair: Dell Harris, Hampton University Student Section, Hampton, VA, USA

All students and educators are invited to participate in a discussion of opportunities in the audio field and issues of importance to audio education. A descriptive overview of conference events for students will also be given. This opening meeting of the Student Delegate Assembly will introduce the candidates for the coming year's election for chair and vice chair of the North and Latin America Regions. Each AES regional vice president may present two candidates for the election to be held at the closing meeting of the SDA. Election results and Recording Competition and Poster Awards will be given at the Student Delegate Assembly 2 on Tuesday, October 8, at 10:00 am, in Room 402A.

Session C Saturday, October 5 2:00 pm–6:00 pm
Room 404AB

TRANSDUCERS, PART 2

Chair: Steve Hutt, Harman/Becker Automotive Systems, Martinsville, IN, USA

2:00 pm

C-1 The Bidirectional Microphone: A Forgotten Patriarch—Ron Streicher1, Wes Dooley2

1Pacific Audio-Visual Enterprises, Pasadena, CA, USA
2Audio Engineering Associates, Pasadena, CA, USA

Despite being one of the progenitors of all modern microphones and recording techniques, the bidirectional pattern is still not very well understood. Its proper and effective use remains somewhat of a mystery to many recording and sound reinforcement engineers. In this paper the bidirectional microphone is examined from historical, technical, and operational perspectives. We review how it developed and exists as a fundamental element of almost all other single-order microphone patterns. In the course of describing how this unique pattern responds to sound waves arriving from different angles of incidence, we show that it very often can be successfully employed where other more commonly used microphones cannot.
Convention Paper 5646
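The claim that the bidirectional element underlies the other single-order patterns can be written down directly: every first-order pattern is a weighted sum of an omnidirectional (pressure) term and the bidirectional cosine (pressure-gradient) term. A small sketch:

```python
import numpy as np

# All first-order polar patterns have the form p(theta) = A + B*cos(theta):
# a weighted mix of a pressure (omni) term and the bidirectional
# pressure-gradient term.
theta = np.linspace(0.0, 2.0 * np.pi, 361)    # 1-degree steps
omni = np.ones_like(theta)
figure8 = np.cos(theta)                       # the bidirectional pattern
cardioid = 0.5 * omni + 0.5 * figure8         # equal mix: null at the rear
hypercardioid = 0.25 * omni + 0.75 * figure8  # gradient-heavy mix
```

Setting B = 0 recovers the omni; setting A = 0 recovers the figure-8 itself, which is why it can be viewed as a building block of the whole family.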

2:30 pm

C-2 Gaussian Mixture Model-Based Methods for Virtual Microphone Signal Synthesis—Athanasios Mouchtaris, Shrikanth S. Narayanan, Chris Kyriakakis, University of Southern California, Los Angeles, CA, USA

Multichannel audio can immerse a group of listeners in a seamless aural environment. However, several issues must be addressed, such as the excessive transmission requirements of multichannel audio, as well as the fact that to date only a handful of music recordings have been made with multiple channels. Previously, we proposed a system capable of synthesizing the multiple channels of a virtual multichannel recording from a smaller set of reference recordings. In this paper these methods are extended to provide a more general coverage of the problem. The emphasis here is on time-varying filtering techniques that can be used to enhance particular instruments in the recording, which is desired in order to simulate virtual microphones in several locations close to and around the sound source.
Convention Paper 5647

3:00 pm

C-3 Driver Directivity Control by Sound Redistribution—Jan Abildgaard Pedersen, Gert Munch, Bang & Olufsen a/s, Struer, Denmark

The directivity of a single loudspeaker driver is controlled by adding an acoustic reflector to an ordinary driver. The driver radiates upward and the sound is redistributed by being reflected off the acoustic reflector. The shape of the acoustic reflector is nontrivial and yields an interesting and useful directivity in both the vertical and horizontal planes. Two-dimensional FEM simulations and 3-D BEM simulations are compared to free-field measurements performed on a loudspeaker using the acoustic reflector. The resulting directivity is related to results of previously reported psychoacoustic experiments.
Convention Paper 5648

3:30 pm

C-4 Pressure Response of Line Sources—Mark S. Ureda, JBL Professional, Northridge, CA, USA

The on-axis pressure response of a vertical line source is known to decrease at 3 dB per doubling of distance in the near field and at 6 dB in the far field. This paper shows that the conventional mathematics used to achieve this result understates the distance at which the -3 dB to -6 dB transition occurs. An examination of the pressure field of a line source reveals that the near field extends to a greater distance at positions laterally displaced from the centerline, normal to the source. The paper introduces the endpoint convention for the pressure response and compares the on-axis response of straight and hybrid line sources.
Convention Paper 5649
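The 3 dB / 6 dB per-doubling behavior the abstract starts from can be reproduced by summing many in-phase point sources along a line. The sketch below uses arbitrary illustrative parameters (not the paper's), and the near-field levels ripple around the -3 dB trend, which is part of what motivates the paper's closer look at the transition distance.

```python
import numpy as np

L = 2.0                      # line length, m (illustrative)
f = 2000.0                   # frequency, Hz
c = 343.0                    # speed of sound, m/s
k = 2.0 * np.pi * f / c
y = np.linspace(-L / 2, L / 2, 2001)   # point-source positions on the line

def p_axis(x):
    """On-axis pressure magnitude at distance x from the line's center."""
    r = np.sqrt(x**2 + y**2)
    return abs(np.sum(np.exp(-1j * k * r) / r))

# Level drop per doubling of distance, near field vs. far field.  The
# classical near-field extent for this source is on the order of
# L**2 * f / (2 * c), here about 12 m, so 0.5 -> 1 m is "near" and
# 50 -> 100 m is comfortably "far".
drop_near = 20.0 * np.log10(p_axis(0.5) / p_axis(1.0))   # ~3 dB, with ripple
drop_far = 20.0 * np.log10(p_axis(50.0) / p_axis(100.0)) # ~6 dB
```

In the far field the source behaves as a point and the drop is essentially exactly 6.02 dB; in the near field the coherent interference of the line elements makes the trend line noisier.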

4:00 pm

C-5 High-Frequency Components for High-Output Articulated Line Arrays—Doug Button, JBL Professional, Northridge, CA, USA

The narrow vertical pattern achieved by line arrays has prompted much interest in the method for many forms of sound reinforcement in recent years. The live sound segment of the audio community has used horns and compression drivers for sound reinforcement for several decades. Adopting a line array philosophy to meet the demands of high-level sound reinforcement requires an approach that allows for the creation of a line source from the output of compression drivers. Additionally, it is desired that the line array take on different vertical patterns depending upon use. This requires the solution to allow for the array to be articulated. Outlined in this paper is a waveguide/compression driver combination that is compact and simple in approach and highly suited for articulated arrays.
Convention Paper 5650

4:30 pm

C-6 High-Efficiency Direct-Radiator Loudspeaker Systems—John Vanderkooy1, Paul M. Boers2

1University of Waterloo, Waterloo, Ontario, Canada
2Philips Research Labs, Eindhoven, The Netherlands

Direct-radiator loudspeakers become more efficient as the total magnetic flux is increased, but the accompanying equalization and amplifier modify the gains thus made. We study the combination of an efficient high-Bl driver with several amplifier types, including a highly efficient class-D amplifier. Comparison is made of a typical simulated driver, excited with a few different amplifier types, using various audio signals. The comparison is quite striking as the Bl value of the driver increases, significantly favoring the class-D amplifier.
Convention Paper 5651

5:00 pm

C-7 Audio Application of the Parametric Array—Implementation through a Numerical Model—Wontak Kim1, Victor W. Sparrow2


1Bose Corporation, Framingham, MA, USA
2Pennsylvania State University, University Park, PA, USA

Implementing the parametric array for audio applications is examined through numerical modeling and analytical approximation. The analytical solution of the nonlinear wave equation is used to provide guidelines on the design parameters of the parametric array. The solution relates the source size, input pressure level, and the carrier frequency to the audible signal response, including the output level, beam width, and length of the interaction region. A time-domain finite difference code that accurately solves the KZK nonlinear parabolic wave equation is used to predict the response of the parametric array. The accuracy of the numerical model is established by a simple parametric array experiment. In considering the implementation issues for audio applications of the parametric array, the emphasis is given to the poor frequency response and the harmonic distortion. Signal processing techniques to improve the frequency response and the harmonic distortion are suggested and implemented through the numerical simulation.
Convention Paper 5652
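The self-demodulation at the heart of the parametric array can be shown with a toy model: applying a quadratic nonlinearity (a crude stand-in for the nonlinear term the KZK equation captures) to two ultrasonic primaries produces an audible tone at their difference frequency. Frequencies and rates below are illustrative.

```python
import numpy as np

fs = 192000                                  # sample rate high enough for ultrasound
t = np.arange(fs) / fs                       # 1 s of signal
f1, f2 = 40000.0, 41000.0                    # ultrasonic primaries, Hz
primary = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)
demod = primary ** 2                         # quadratic "air" nonlinearity

spec = np.abs(np.fft.rfft(demod)) / len(t)
freqs = np.fft.rfftfreq(len(t), 1.0 / fs)
aud = (freqs > 0) & (freqs < 20000)          # audible band only
peak = freqs[aud][np.argmax(spec[aud])]      # strongest audible component, Hz
```

The product term 2·sin(2πf1t)·sin(2πf2t) contains cos(2π(f2-f1)t), so the strongest audible component lands at f2 - f1 = 1 kHz; the sum and doubled frequencies stay in the ultrasonic range.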

5:30 pm

C-8 Implementation of Straight-Line and Flat-Panel Constant Beamwidth Transducer (CBT) Loudspeaker Arrays Using Signal Delays—D. B. (Don) Keele, Jr., Harman/Becker Automotive Systems, Martinsville, IN, USA

Conventional CBT arrays require a driver configuration that conforms to either a spherical-cap curved surface or a circular arc. CBT arrays can also be implemented in flat-panel or straight-line array configurations using signal delays and Legendre function shading of the driver amplitudes. Conventional CBT arrays do not require any signal processing except for simple frequency-independent shifts in loudspeaker level. However, the signal processing for the delay-derived CBT configurations, although more complex, is still frequency independent. This is in contrast with conventional constant-beamwidth flat-panel and straight-line designs, which require strongly frequency-dependent signal processing. Additionally, the power response roll-off of the delay-derived CBT arrays is one-half the roll-off rate of the conventional designs, i.e., 3- or 6-dB/octave (line or flat) for the CBT array versus 6- or 12-dB/octave for the conventional designs. Delay-derived straight-line CBT arrays also provide superior horizontal off-axis response because they do not exhibit the ±90-degree right-left off-axis sound pressure buildup or bulge of conventional circular-arc CBT arrays. In comparison to conventional CBT arrays, the two main disadvantages of delay-derived straight-line or flat-panel CBT arrays are 1) the more complicated processing required, which includes multiple power amplifiers and delay elements; and 2) the widening of the polar response at extreme off-axis angles, particularly for arrays that provide wide coverage with beamwidths greater than 60 degrees. This paper illustrates its findings using numerical simulation and modeling.
Convention Paper 5653
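A rough sketch of the delay-derived idea: drivers on a straight line are delayed so their wavefront mimics a circular arc of radius R, and their amplitudes are tapered toward the ends. Here a raised-cosine taper stands in for the Legendre-function shading, and all values are illustrative, not taken from the paper.

```python
import numpy as np

c = 343.0                       # speed of sound, m/s
R = 1.0                         # virtual arc radius, m (illustrative)
half_angle = np.deg2rad(30.0)   # arc half-angle
n = 15                          # drivers (odd, so one sits at the center)
x = R * np.sin(np.linspace(-half_angle, half_angle, n))  # line positions

theta = np.arcsin(x / R)                   # each driver's angle on the arc
delay = (R - R * np.cos(theta)) / c        # s: "push" the flat line onto the arc
shade = np.cos(np.pi * theta / (2 * half_angle)) ** 2   # taper, 1 at center

# Note the delays depend only on geometry, not on frequency: this is the
# frequency-independent processing the abstract contrasts with conventional
# flat-panel constant-beamwidth designs.
```

The center driver gets zero delay and full level; the end drivers get the largest delay and are shaded toward zero, approximating the curved, shaded arc of a conventional CBT.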

Session D Saturday, October 5 2:00 pm–4:30 pm
Room 406AB

SIGNAL PROCESSING, PART 2

Chair: Robert Bristow-Johnson, Wave Mechanics, Burlington, VT, USA

2:00 pm

D-1 Automatic Design of Sound Synthesis Techniques by Means of Genetic Programming—Ricardo A. Garcia, Chaoticom, Hampton Falls, NH, USA

Design of sound synthesis techniques (SST) is a difficult problem. It is usually assumed that it requires human ingenuity to find a suitable solution. Many of the SSTs commonly used are the fruit of experimentation and long refinement processes. An automated approach for the design of SSTs is proposed. The problem is stated as a search in the multidimensional SST space. It uses genetic programming (GP) to suggest valid functional forms and standard optimization techniques to fit their internal parameters. A psychoacoustic auditory model is used to compute the perceptual distance between the target and test sounds. The developed AGeSS (automatic generator of sound synthesizers) system is introduced, and a simple example of the evolved SSTs is shown.
Convention Paper 5654

2:30 pm

D-2 A Consumer-Adjustable Dynamic Range Control System—Keith A. McMillen, Octiv, Inc., Berkeley, CA, USA

Advances in technology have afforded listeners an available dynamic range in excess of 120 dB. While impressive in proper concert halls and listening rooms, large dynamic ranges are not always realistic for all environments and musical styles. This paper describes a practical multiband dynamics processor software object that can reside in low-cost consumer products and allow the user to adjust dynamic range to fit his or her taste and listening environment.
Convention Paper 5655
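The per-band building block of such a processor is an ordinary static compressor gain curve with a user-adjustable threshold and ratio. A minimal sketch (parameter names and defaults are illustrative, not from the paper):

```python
import numpy as np

def comp_gain_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Static downward-compressor gain (in dB, <= 0) for one band.

    Levels above the threshold are reduced so that `ratio` dB of input
    change produces 1 dB of output change.
    """
    over = np.maximum(0.0, level_db - threshold_db)
    return -over * (1.0 - 1.0 / ratio)

# Example: a band sitting 12 dB over threshold at 4:1 receives 9 dB of
# gain reduction, leaving it 3 dB over threshold at the output.
```

Exposing `threshold_db` and `ratio` per band as user controls is the kind of adjustment a consumer-facing processor would surface.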

3:00 pm

D-3 A Simple, Efficient Algorithm for Reduction of Hiss Amplification Under High Dynamics Compression—Guillermo García, CCRMA, Stanford, CA, USA

We present a very simple, effective, and computationally efficient algorithm to reduce the typical hiss amplification (or breathing) artifact of dynamics compressors working under high compression ratios. The algorithm works in the time domain, is very easy to implement, has a very low computational cost, and requires little program memory, therefore being of special interest for consumer-audio applications.
(Paper not presented at convention, but Convention Paper 5656 is available.)
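One plausible low-cost, time-domain way to counteract compressor breathing is a downward expander that pulls gain back down when the signal falls near the noise floor. The abstract does not reproduce the paper's actual algorithm, so the sketch below is only a generic stand-in; threshold, slope, and time constant are arbitrary.

```python
import numpy as np

def expander(x, fs, thresh_db=-50.0, slope=1.0, tau_ms=20.0):
    """Generic time-domain downward expander (anti-breathing stand-in)."""
    a = np.exp(-1.0 / (fs * tau_ms / 1000.0))    # one-pole envelope coeff
    env = 0.0
    y = np.empty_like(x)
    for n, s in enumerate(x):
        env = a * env + (1.0 - a) * abs(s)       # envelope follower
        env_db = 20.0 * np.log10(max(env, 1e-9))
        # below threshold, attenuate by `slope` dB per dB of shortfall
        gain_db = min(0.0, (env_db - thresh_db) * slope)
        y[n] = s * 10.0 ** (gain_db / 20.0)
    return y
```

Loud program material sits above the threshold and passes unchanged, while low-level hiss that the compressor's make-up gain would otherwise pump up is attenuated in proportion to how far it falls below the threshold.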

3:30 pm

D-4 Adaptive Predistortion Filter for Linearization of Digital PWM Power Amplifier Using Neural Networks—Minki Yang1, Jong-Hoon Oh2

1Pulsus Technologies, Inc., Seoul, Korea
2Pohang University of Science and Technology, Pohang, Korea

The paper presents a method to compensate for nonlinear distortion of digital pulse width modulation (PWM) power amplifiers by prefiltering the input signals using artificial neural networks. We first construct a model of the digital amplifier using artificial neural networks. Using this model, the artificial neural network model of a predistortion filter is trained such that the combined system, the digital amplifier and predistortion filter, produces an output that is linearly proportional to the input. The simulation results show that the artificial neural networks of the predistortion filter can effectively correct the nonlinear distortion of the digital amplifier.
Convention Paper 5657
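The train-an-inverse-through-a-model idea can be demonstrated with a toy stand-in: here the "amplifier" is simply a memoryless tanh() nonlinearity and the predistorter a cubic fitted by gradient descent so that amp(pre(x)) ≈ x. The paper's actual system uses neural-network models of a PWM amplifier; everything below is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-0.8, 0.8, 2000)       # training signal
w = np.array([1.0, 0.0])               # predistorter: pre(v) = w[0]*v + w[1]*v**3

for _ in range(2000):
    v = w[0] * x + w[1] * x**3         # predistorted drive signal
    err = np.tanh(v) - x               # combined-system error vs. identity
    damp = 1.0 - np.tanh(v) ** 2       # amplifier slope (for the chain rule)
    g = err * damp
    # gradient step on the two predistorter coefficients
    w -= 0.5 * np.array([np.mean(g * x), np.mean(g * x**3)])

mse_before = np.mean((np.tanh(x) - x) ** 2)               # no predistortion
mse_after = np.mean((np.tanh(w[0] * x + w[1] * x**3) - x) ** 2)
```

The training loop backpropagates the output error through the amplifier model's slope, exactly the structure described in the abstract, just with a trivial model and a two-parameter "network."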

4:00 pm

D-5 Objective Measures of the Quality of Speech Transmission in a Real Mobile Network—Measuring, Estimate, and Prediction Method—Dragana Sagovnovic, Telekom Srbija, Belgrade, Yugoslavia

Contemporary means of communication (e.g., mobile telephony) have brought new limitations that telecom operators have to take into consideration. One of them is the fact that the types of deterioration of speech quality perceived in mobile telephony are different from the degradations noted in fixed telephony. In the process of estimating the quality of speech transmission, this paper discusses the comparative (objective) methods for a formed database of degradations of a real mobile communications system. The consideration of the results of the objective-method tests is based on the development of a new objective method of speech quality. The results gathered during the comparison tests have been displayed and interpreted for different types of Serbian vowels: front vowels, /e/; mid vowels, /a/; and back vowels, /u/.
(Paper not presented at convention, but Convention Paper 5658 is available.)

4:40 pm

Technical Committee Meeting on Signal Processing

Workshop 2 Saturday, October 5 2:00 pm–5:00 pm
Room 408A

NEW MEDIA FOR MUSIC—ADAPTIVE RESPONSE TO TECHNOLOGY

Chair: Dave Davis, QCA Mastering, Cincinnati, OH, USA

Panelists: Bob Ludwig, Gateway Mastering and DVD, Portland, ME; Bill McQuay, Radio Expeditions: National Public Radio and the National Geographic Society, Baltimore, MD; Bobby Owsinski, Surround Associates, Studio City, CA; Mike Sokol, JMS Productions & Fits and Starts Productions, Hagerstown, MD

In 2002 the audio industry is at a crossroad. The CD is seemingly vulnerable to piracy by file trading and easy replication. Studios and engineers are seeing their markets shrink and rates decline. New formats are available to consumers, but there are no budgets for creating content. While technology gets the blame, any solution requires new technologies to succeed. We will examine product-oriented solutions that are available today and how to create new kinds of products that expand both artistic expression and markets.

This workshop focuses on using existing technology and adapting existing models to expand the music industry with new products. It is not an esoteric romp through dot.bomb schemes or a complicated strategy requiring consumers to replace their stereo equipment and record collections. The current crisis requires us to adapt and evolve incrementally, so this workshop is about identifying successful models and applying them to what we collectively do. Many of the technologies and models that can save us have already matured and found their way into the homes of music fans. 2002 is our last, best hope to restore the health and vitality to the music industry through our products, rather than legislation or litigation.

Dave Davis (QCA Mastering/UltraInteractive) will introduce the qualities of new media and how they can be used to add value to music products. He will explain why new media strategies are essential to our industry's adaptation to current cultural and technological challenges and demonstrate a number of existing products that provide fans with a richer experience. Finally, he will present some works that were conceived for rich media environments and require a DVD or computer to experience; for many new artists these technologies not only address economic challenges, but broaden their vision and expand their palette of tools.

Bob Ludwig (Gateway Mastering and DVD) will discuss the role of the mastering studio regarding SACD, DVD-V, DVD-A, and enhanced CD. He will discuss how Gateway uses its ftp site for projects and various pro and consumer file transfer technologies (Liquid Audio, DigiProNet, Rocket, etc.) and explain how an integrated facility can support an artist's vision from soup to nuts in one building.

Bill McQuay (Radio Expeditions: National Public Radio and the National Geographic Society) is going to discuss the National Geographic Expeditions series as a forward-looking model to support rich, challenging production for programming in a radio format. The Radio Expeditions feature runs once a month on NPR's flagship news magazine Morning Edition. It lives on in that form for fans on NPR's web site, where it can be streamed at will. The material was expanded into a lecture/presentation series with surround sound for a live venue. A DVD is in the works, presenting an even richer version of the program. These additional modes of presentation collectively support the use of audio in a deeper, richer way than is traditionally possible on broadcast radio.

Bobby Owsinski (Surround Associates) is deeply involved in the development and exploration of surround mixing and will address the role of the creative engineer in production for multiple mix targets, as well as what studios and tracking engineers can do to facilitate surround work downstream.

Mike Sokol (JMS Productions & Fits and Starts Productions) will address the opportunities created by surround for audio professionals. Mike will be discussing a specific project in radio drama that was staged live in surround, and is working to broadcast it to home theaters via digital cable using Dolby Digital. This represents both a reborn market and the beginnings of surround-driven audio-only content in a digital broadcast environment. Mike will also discuss other alternative 5.1 audio markets, such as PowerPoint 5.1, as well as ways to do large music concerts with 5.1 surround elements. All of these markets offer additional production and revenue streams for studios trying to justify 5.1 production gear.

5:10 pm

Technical Committee Meeting on Archiving, Restoration, and Digital Libraries

Workshop 3 Saturday, October 5 2:00 pm–5:00 pm
Room 408B

WHAT AUDIO ENGINEERS SHOULD KNOW ABOUT HUMAN SOUND PERCEPTION

Presenters: Durand Begault, NASA Ames Research Center, Mountain View, CA, USA; William Martens, University of Aizu, Aizuwakamatsu-shi, Japan

Audio engineering professionals are intimately familiar with the art of producing acoustical phenomena to achieve specific psychoacoustic goals for applications such as music, multimedia, speech reinforcement, etc. However, they may be less familiar with the underlying mechanisms of hearing and perception that influence their everyday decisions.

This tutorial will review fundamental concepts of human sound perception, first from a monaural hearing perspective and then from a spatial hearing perspective. Topics include pitch, loudness, timbre, sensitivity to phase; temporal resolution and temporal integration; masking; spatial perception; perceptual effects of rooms; and differences between headphone and loudspeaker reproduction. Questions and comments from the audience will be encouraged.

5:10 pm

Technical Committee Meeting on Perception andSubjective Evaluation of Audio Signals

Education Event
MENTORING PANEL: STEPS TO THE FUTURE—EFFECTIVELY USING MENTORS TO HELP BUILD A CAREER
Saturday, October 5, 2:30 pm–4:00 pm
Room 402A

Chair: Theresa Leonard, The Banff Centre, Banff, Alberta, Canada

Panelists: Jim Anderson, Independent Recording Engineer, AES Regional VP; Edwin Dolinsky, Electronic Arts; Bill Dooley, SPARS, The Village Recorder; Lynn Fuston, Skywalker Sound; Richard King, Sony Music; David Moulton, Moulton Laboratories; Julie Perez, NBC TV, Music Mixer, “Late Night with Conan O'Brien”

Thriving in the audio industry takes more than technical aptitude—it helps to have guidance from industry professionals. Learn about the benefits of mentor relationships and how to develop your own network of industry connections via mentoring. Students have a chance to learn firsthand from industry veterans by attending a mentoring panel and then signing up for one-on-one time with one of the distinguished participants.

Special Event
14TH ANNUAL GRAMMY® RECORDING SOUNDTABLE: 21ST CENTURY REALITIES
Saturday, October 5, 4:00 pm–6:00 pm
Room 411 Theater

Moderator: Howard Massey, On The Right Wavelength Consulting

Panelists: Brian Garten; Ken Jordan; George Massenburg; Jack Joseph Puig; Elliot Scheiner; Al Schmitt

The National Academy of Recording Arts & Sciences, Inc. will present the 14th Annual GRAMMY® Recording SoundTable, moderated by Howard Massey. It's the 21st century: Recording budgets are shrinking, technologies are advancing, and public taste in music is changing. How does today's producer cope with these realities? Do 24 bits truly matter when most people are listening to music in MP3 format? Is there still room for analog in today's digital world? And, with entire hit records being made on laptop computers, is the professional recording studio becoming an endangered species? Don't miss the lively discussion as our panelists attack these topics head on.

Brian Garten is engineer for the Neptunes, and the man behind the console for two of 2002's biggest hits: Nelly's “Hot in Here” and “Dilemma,” featuring Kelly of Destiny's Child. He has also recorded Beyonce Knowles, Britney Spears, Mary J. Blige, Alicia Keys, Usher, No Doubt, Busta Rhymes, and Justin Timberlake.

Beatmaster Ken Jordan of the alternative-electronic duo The Crystal Method brings a variety of influences to their dance-based productions, merging hip-hop, rock, and breakbeats into a unique blend that has made them an integral part of club culture. Their debut LP, Vegas, was released to critical acclaim in 1997, and their latest album, Tweekend, was released in 2001. The two albums have sold over a million and a half copies. For the very latest on this duo, check out www.thecrystalmethod.com.

George Massenburg is known as an engineer's engineer. His pristine work with artists like Little Feat, Toto, Linda Ronstadt, Lyle Lovett, Emmylou Harris, Carly Simon, Mary Chapin Carpenter, and Aaron Neville constantly sets new standards in audio excellence, and he is a technical pioneer as well. Recipient of a Technical GRAMMY® Award for his development of innovative recording tools such as the parametric equalizer, Massenburg is actively involved in surround sound mixing and the standardization of archiving materials.

Jack Joseph Puig is one of the most sought-after engineer/producers crafting music today. He owns an astonishing collection of vintage and modern recording equipment that he uses to create a unique engineering and production sound that merges the styles of the last four decades. You can hear his work on releases by Sheryl Crow, Vanessa Carlton, John Mayer, No Doubt, Counting Crows, Green Day, Goo Goo Dolls, Hole, Shelby Lynne, Weezer, Stone Temple Pilots, Tonic, Stevie Nicks, and Robbie Williams.

Engineer extraordinaire Elliot Scheiner has been nominated for more than a dozen GRAMMY® Awards and has won five of them, including the 2000 GRAMMY® Award for Best Engineered Album, Non-Classical. He has worked with a wide variety of multiplatinum artists, including Faith Hill, Beck, Steely Dan, Fleetwood Mac, The Eagles, Van Morrison, The Doobie Brothers, America, John Fogerty, and Jimmy Buffett. He recently completed a surround sound remix of the classic Queen album A Night at the Opera, which won the 2002 DVD award for Best Audio.

There are few people truly deserving of the term legendary, but Al Schmitt is one of them. Not only has he won an astonishing nine GRAMMY® Awards, which include the 2001 Best Engineered Album, Non-Classical Award for Diana Krall's The Look of Love, he has also won two Latin GRAMMY® Awards. He has produced, engineered, and/or mixed more than 150 gold and platinum records for a diverse range of artists, including Frank Sinatra, Jefferson Airplane, Steely Dan, Barbra Streisand, Henry Mancini, and Duane Eddy.

Moderator Howard Massey is a noted industry consultant and veteran audio journalist. He has written for Home Recording, Surround Professional, EQ, Musician, Billboard, and many other publications, and is the author of eleven books, including Behind the Glass, a collection of interviews with the world's top record producers. Massey has also worked extensively as an audio engineer, producer, songwriter, and touring/session musician.

Special Event
AES MIXER
October 5–October 7 6:00 pm–8:00 pm
South Hall Lobby

A series of informal Mixers will be held each evening in the South Lobby entrance to the Los Angeles Convention Center, to enable convention attendees to meet in a social atmosphere after the day's activities and catch up with friends and colleagues from the industry. There will be music featuring performances from the “Songwriters Showcase,” a cash bar, and snacks.


Session E Sunday, October 6 9:00 am–12:00 noon
Room 406AB

SIGNAL PROCESSING, PART 3

Chair: Jayant Datta, Discrete Labs, Syracuse, NY, USA

9:00 am

E-1 Computationally Efficient Inversion of Mixed Phase Plants with IIR Filters—Timoleon Papadopoulos, Philip A. Nelson, University of Southampton, Southampton, UK

Inverse filtering in a single channel or in multiple channels arises as a problem in a number of applications in the areas of communications, active control, sound reproduction, and virtual acoustic imaging. In the single-channel case, when the plant C(z^-1) sought to be inverted has zeros outside the unit circle in the z-plane, an approximation to the inverse 1/C(z^-1) can be realized with an FIR filter if an appropriate amount of modeling delay is introduced to the system. But the closer the zeros of C(z^-1) are to the unit circle (either inside or outside it), the longer the FIR inverse has to be, typically several tens of times longer than the plant. An off-line implementation utilizing a variant of the backward-in-time filtering technique usually associated with zero-phase FIR filtering is presented. This forms the basis on which a single-channel mixed phase plant can be inverted with an IIR filter of order roughly double that of C(z^-1), thus drastically reducing the processing time required for the inverse filtering computation.
Convention Paper 5659
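A minimal single-channel illustration of the backward-in-time idea (the plant and signal below are made up: one zero inside the unit circle at a, one outside at b, so the plant is mixed phase):

```python
import numpy as np

a, b = 0.5, 1.6                              # |a| < 1 (min phase), |b| > 1 (max phase)
c = np.convolve([1.0, -a], [1.0, -b])        # plant C(z^-1)

rng = np.random.default_rng(3)
x = rng.normal(0.0, 1.0, 400)
y = np.convolve(x, c)[:len(x)]               # plant output, same length as input

# 1) Invert the maximum-phase factor (1 - b z^-1) anti-causally: run the
#    recursion u[n-1] = (u[n] - y[n]) / b backward in time.  Since |1/b| < 1
#    this is stable in the backward direction, and the error introduced by
#    truncating the tail decays as we move toward the start of the signal.
u = np.zeros_like(y)
for n in range(len(y) - 1, 0, -1):
    u[n - 1] = (u[n] - y[n]) / b

# 2) Invert the minimum-phase factor (1 - a z^-1) with an ordinary causal
#    all-pole recursion.
xh = np.zeros_like(u)
prev = 0.0
for n in range(len(u)):
    xh[n] = u[n] + a * prev
    prev = xh[n]
# Away from the truncated tail, the cascade recovers the input exactly.
```

Both inverse stages are first-order IIR recursions, one run forward and one run backward, which is the sense in which the inverse filter order stays comparable to the plant's rather than growing into a very long FIR approximation.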

9:30 am

E-2 Optimal Filter Partition for Efficient Convolution with Short Input/Output Delay—Guillermo García, CCRMA, Stanford, CA, USA

A new algorithm to find an optimal filter partition for efficient long convolution with low input/output delay is presented. For a specified input/output delay and filter length, our algorithm finds the nonuniform filter partition that minimizes the computational cost of the convolution. We perform a detailed cost analysis of different block convolution schemes and show that our optimal-partition finder algorithm allows for significant performance improvement. Furthermore, when several long convolutions are computed in parallel and their outputs are mixed down (as is the case in multiple-source 3-D audio rendering), the algorithm finds an optimal partition (common to all channels) that allows for further performance optimization.
Convention Paper 5660
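The kind of trade-off the abstract describes can be illustrated with a toy cost model for partitioned FFT convolution (the model and its constants are assumptions for illustration only, not the paper's cost analysis):

```python
import math

def partition_cost(partition):
    """Rough per-output-sample cost of nonuniform partitioned FFT
    convolution: each block of size B costs a length-2B FFT/IFFT pair
    plus a complex spectral product, amortized over B samples."""
    cost = 0.0
    for B in partition:
        fft_ops = 2 * (2 * B) * math.log2(2 * B)   # forward + inverse FFT
        mult_ops = 4 * (2 * B)                     # complex spectral multiply
        cost += (fft_ops + mult_ops) / B
    return cost

# A 4096-tap filter with a 128-sample I/O delay: uniform blocks versus
# a simple doubling (nonuniform) partition of the same total length.
uniform = [128] * 32
doubling = [128, 128, 256, 512, 1024, 2048]
assert sum(uniform) == sum(doubling) == 4096
print(partition_cost(uniform), partition_cost(doubling))
```

Under this model the nonuniform partition is several times cheaper while keeping the same first-block delay; the paper's contribution is finding the partition that is provably optimal under its cost analysis.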

10:00 am

E-3 Filter Morphing—Topologies, Signals, and Sampling Rates—Rob Clark1,2, Emmanuel Ifeachor2, Glenn Rogers1

1Allen & Heath Limited, Penryn, Cornwall, UK
2University of Plymouth, Plymouth, UK

Digital filter morphing techniques exist to reduce audible transient distortion during filter frequency response change. However, such distortions are heavily dependent on signal content, frequency response settings, filter topology, interpolation scheme, and sampling rates. This paper presents an investigation into these issues, implementing various filter topologies using different input stimuli and filter state change scenarios. The paper identifies the mechanisms causing these distortions, specifying worst case filter state change scenarios. The effects of existing interpolator schemes, finite word length, and system sampling rates on signal distortion are presented. The paper provides an understanding of filter state change, critical in the design of filter morphing algorithms.
Convention Paper 5661

10:30 am

E-4 Evaluation of Inverse Filtering Techniques for Room/Speaker Equalization—Scott G. Norcross, Gilbert A. Soulodre, Michel C. Lavoie, Communications Research Centre, Ottawa, Ontario, Canada

Inverse filtering has been proposed for numerous applications in audio and telecommunications, such as speaker equalization, virtual source creation, and room deconvolution. When an impulse response (IR) is nonminimum phase, its corresponding inverse can produce artifacts that become distinctly audible. These artifacts produced by the inverse filtering can actually degrade the signal rather than improve it. The severity of these artifacts is affected by the characteristics of the filter and the method (time or frequency domain) used to compute its inverse. In this paper objective and subjective tests were conducted to investigate and highlight the potential limitations associated with several different inverse-filtering techniques. The subjective tests were conducted in compliance with the ITU-R MUSHRA method.
Convention Paper 5662

11:00 am

E-5 Using Subband Filters to Reduce the Complexity of Real-Time Signal Processing—J. Michael Peterson, Chris Kyriakakis, University of Southern California, Los Angeles, CA, USA

Several high quality audio applications require the use of long finite impulse response (FIR) filters to model the acoustical properties of a room. Various structures for subband filtering are examined. These structures have the ability to divide long FIR filters into smaller FIR filters that are easier to use. Two structures for processing the signals in real time will be discussed: time-convolution of spectrograms and generalized filter banks. Filter estimation will also be discussed.
Convention Paper 5663

11:30 am

E-6 Noise Shaping in Digital Test-Signal Generation—Stanley P. Lipshitz1, John Vanderkooy1, Edward V. Semyonov2

1University of Waterloo, Waterloo, Ontario, Canada
2Tomsk State University of Control Systems and Radioelectronics, Tomsk, Russia

In an earlier paper we put forth an idea to use noise-shaping techniques in the generation of digital test signals. The previous paper proposed using noise shaping around an undithered quantizer to generate sinusoidal digital test signals with spectra having error nulls at the harmonics of the signal frequency, thus making digital distortion measurements of very great dynamic range possible. We extend this idea in the present paper in a number of ways. We show a) that dither is necessary in order to suppress spurious artifacts caused by the nonlinearity of an undithered noise shaper; b) that wider and deeper nulls at the harmonic frequencies can be achieved by using higher-order noise-shaper designs; c) that IIR filter designs can moderate the increased noise power that accompanies an increased FIR filter order; and d) some other novel uses of noise shaping in digital signal generation.
Convention Paper 5664
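Dithered noise shaping around a quantizer can be sketched in a few lines. This is a minimal first-order (highpass) shaper with TPDF dither, illustrating the general technique only; the paper's shapers instead place spectral nulls at the signal harmonics:

```python
import math
import random

def shaped_sine(freq, fs, n, bits=16, seed=1):
    """Sinusoid quantized with first-order error-feedback noise shaping
    and TPDF dither (illustrative sketch, not the authors' design)."""
    random.seed(seed)
    q = 2.0 / (2 ** bits)                     # step size for +/-1 full scale
    err = 0.0                                 # previous quantization error
    out = []
    for i in range(n):
        x = 0.5 * math.sin(2.0 * math.pi * freq * i / fs)
        u = x - err                           # feed previous error back
        d = (random.random() - random.random()) * q   # TPDF dither, +/-1 LSB
        y = q * round((u + d) / q)            # mid-tread quantizer
        err = y - u                           # error to be shaped by (1 - z^-1)
        out.append(y)
    return out

sig = shaped_sine(997.0, 48000, 2048)
print("peak:", max(abs(s) for s in sig))
```

The error feedback gives the quantization noise a (1 − z⁻¹) spectral shape, pushing it toward high frequencies; the dither linearizes the quantizer, which is point a) of the abstract.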

12:10 pm

Technical Committee Meeting on Multichannel and Binaural Audio Technologies

Session F Sunday, October 6 9:00 am–12:00 noon Room 404AB

ROOM ACOUSTICS AND SOUND REINFORCEMENT

Chair: Eric Benjamin, Dolby Laboratories, San Francisco, CA, USA

9:00 am

F-1 Factors Affecting Accuracy of Loudspeaker Measurements for Computational Predictions—Roger Schwenke, Meyer Sound Laboratories, Berkeley, CA, USA

The delay of a signal from the input terminals of a loudspeaker amplifier to the output terminals of a microphone can be represented as two parts: one from the electrical input to acoustical transmission, and an acoustical propagation delay from some point on the loudspeaker to the microphone. For computational models of mixtures of loudspeakers to be correct, these delays must be measured accurately. It will be shown that temperature differences as small as 1 degree Celsius between measurements of two models of loudspeakers can cause significant differences in the predicted sound field. Though sound speed is much less sensitive to changes in humidity, the difference between assuming a typical humidity and assuming zero humidity (which is the norm) can be significant.
Convention Paper 5665
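The temperature sensitivity described above follows from the standard linear approximation for the speed of sound in air (textbook background, not a formula taken from the paper; the path length is a hypothetical example):

```python
def speed_of_sound(temp_c):
    """Speed of sound in dry air (m/s) via the common linear
    approximation c = 331.3 + 0.606 * T (T in degrees Celsius)."""
    return 331.3 + 0.606 * temp_c

# Propagation delay over a hypothetical 10 m path at 20 C versus 21 C.
distance = 10.0
delay_20 = distance / speed_of_sound(20.0)
delay_21 = distance / speed_of_sound(21.0)
print("delay shift:", (delay_20 - delay_21) * 1e6, "microseconds")
```

A one-degree difference shifts the arrival by roughly 50 microseconds over this path, i.e., about half a period at 10 kHz, which is ample to change predicted interference patterns between loudspeakers.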

9:30 am

F-2 Systems for Stereophonic Sound Reinforcement: Performance Criteria, Design Techniques, and Practical Examples—Jim Brown, Audio Systems Group, Inc., Chicago, IL, USA

Although stereo systems for large rooms were pioneered in well-documented work at Bell Labs in the 1930s, most modern practitioners appear to be ignorant of the most important aspect of that work as applied to modern sound reinforcement. This paper draws on the author’s experience of over twenty years with both portable and permanent systems using two and three front-referenced channels. Design criteria and examples are presented to illustrate both good and bad design practices.
Convention Paper 5666

10:00 am

F-3 Cable Impedance and Digital Audio—Stephen H. Lampen, David A. DeSmidt, Belden Electronics Division, San Francisco, CA, USA

One of the key differences between cable designed for analog signals and cable designed for digital signals is the impedance of the cable. Why is impedance important for digital but not for analog? What effect do impedance variations or mismatching have on digital signals? Can you use Category 5e or Category 6 computer cable to run digital audio? Can you use coaxial cable to carry digital audio? This paper addresses all these questions and also outlines the limitations of digital cable designs.
Convention Paper 5667

10:30 am

F-4 Limitations of Current Sound System Intelligibility Verification Techniques—Peter Mapp, Peter Mapp Associates, Colchester, Essex, UK

The role of emergency sound and voice alarm systems in life safety management has never been so important. However, to be effective, it is essential that such systems are adequately intelligible. Verification of system intelligibility is therefore assuming ever-greater importance. While a number of verification techniques are available, each is not without its drawbacks. The paper reviews the available methods and, using the results of new research, highlights areas of weakness of the current techniques.
Convention Paper 5668

11:00 am

F-5 Robustness of Multiple Listener Equalization with Magnitude Response Averaging—Sunil Bharitkar, Philip Hilmes, Chris Kyriakakis, University of Southern California, Los Angeles, CA, USA

Traditionally, room response equalization is performed to improve sound quality at a given listener. However, room responses vary with source and listener positions. Hence, in a multiple listener environment, equalization may be performed through spatial averaging of magnitude responses at locations of interest. However, the performance of averaging-based equalization, at the listeners, may be affected when listener positions change. In this paper, we present a statistical approach to map variations in listener positions to a performance metric of equalization for magnitude response averaging. The results indicate that, for the analyzed listener configurations, the zone of equalization depends on the distance of microphones from a source and the frequencies in the sound.
Convention Paper 5669
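Magnitude-response averaging across listener positions can be sketched as follows (an illustrative construction with assumed responses and an added gain floor, not the authors' statistical approach):

```python
import numpy as np

def average_equalizer(responses, floor_db=-20.0):
    """EQ magnitude from spatially averaged magnitude responses.
    The gain floor, which keeps deep average nulls from being boosted
    without limit, is an assumption added for this sketch."""
    avg = np.mean(responses, axis=0)
    floor = 10.0 ** (floor_db / 20.0)
    return 1.0 / np.maximum(avg, floor)

# Two hypothetical listener positions with complementary responses.
f = np.linspace(0.0, 1.0, 8)
r1 = 1.0 + 0.5 * f                          # rising response at position 1
r2 = 1.5 - 0.5 * f                          # falling response at position 2
eq = average_equalizer(np.stack([r1, r2]))
flat = np.mean(np.stack([r1, r2]), axis=0) * eq
print(flat)                                  # the spatial average is equalized to 1
```

The averaged response is flattened exactly, while each individual position is only partially corrected; the paper's question is how far a listener can move before that compromise breaks down.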

11:30 am

F-6 Coax and Digital Audio—Stephen H. Lampen, David A. DeSmidt, Belden Electronics Division, San Francisco, CA, USA

Coaxial cables have been used to run digital audio signals for many years, and have been added to the AES specifications (AES3-id). How is coax different from twisted pairs? What are the distance limitations? What trade-offs are made going from digital twisted pairs to digital coax? These questions are all answered in this paper, including a discussion of baluns, which are used to convert from one format to the other.
Convention Paper 5670

12:10 pm

Technical Committee Meeting on Acoustics and Sound Reinforcement

Workshop 4 Sunday, October 6 9:00 am–12:00 noon Room 408B

LARGE SIGNAL LOUDSPEAKER PARAMETER MEASUREMENT


Chair: John Stewart, Harman/Becker Automotive Systems, Martinsville, IN, USA

Panelists: Doug Button, JBL Professional, Northridge, CA, USA
David Clark, DLC Design, Wixom, MI, USA
Earl Geddes, GedLee Associates, Northville, MI, USA
Steve Hutt, Harman/Becker Automotive Systems, Martinsville, IN, USA
Wolfgang Klippel, Klippel GmbH, Dresden, Germany
Steve Temme, Listen, Inc., Boston, MA, USA
Alexander Voishvillo, Cerwin-Vega Inc., Simi Valley, CA, USA

This workshop will present measurement methods used to characterize loudspeaker performance at large signal levels. A brief discussion of each technique—how it evolved, what benefit it brings, etc.—will be followed by a demonstration of the technique when possible. The discussion will include: high-power impedance measurement, Xmax, model parameter nonlinearities, flux modulation, harmonic distortion, maximum SPL, and power compression.

12:10 pm

Technical Committee Meeting on Audio for Telecommunication

Workshop 5 Sunday, October 6 9:00 am–12:00 noon Room 408A

MIXING AND MASTERING IN MULTICHANNEL SURROUND (TUTORIAL)

Chair: Michael Bishop, Telarc International Corporation, Cleveland, OH, USA

Panelists: John Eargle, JME/JBL Professional
Frank Filipetti, Independent Engineer/Producer
Bob Ludwig, Gateway Mastering & DVD
George Massenburg, GML, LLC
Elliott Scheiner, Independent Engineer/Producer

This tutorial workshop includes a panel of leading industry professionals who will discuss various formats for mixing and mastering in surround. Topics to be covered include:

• Mixing and mastering for DTS/Dolby Digital Surround vs. discrete multichannel surround

• Mixing for 5.1 vs. 6.0, 7.1 or 10.2

• Multichannel mix recording formats in most common use and compatibility with mastering needs

• Mixing and mastering monitor loudspeaker setups/formats and compatibility with common home theater setups

• Essential tools for mixing and mastering in multichannel surround

• Mix approaches and preferences: ambient surround vs. direct surround

• Challenges of mastering for DVD-A

• Challenges of mastering for SACD

Audio and music examples will be played. This workshop, which has been sponsored by the AES Education Committee, will be of benefit to audio engineers of all backgrounds, including students.

Regrettably, Dennis Purcel, who was scheduled to be a panelist in this workshop, has since passed away.

12:10 pm

Technical Committee Meeting on Transmission and Broadcasting

Special Event
PLATINUM PRODUCERS PANEL 1
PRODUCER, ENGINEER, STUDIO TECHNICIAN—BLURRING OF ROLES
Sunday, October 6, 12:30 pm–2:30 pm
Room 403A

Moderator: Mitch Gallagher, EQ Magazine

Panelists: Rob Cavallo
Mike Elizondo
Ron Fair
Ben Grosse
Taz Herzberg

It is no secret that the demands of creating audio and music in today's marketplace require a broad range of skills and flexibility. It is increasingly common for one person to handle what once were multiple dedicated jobs: producing, engineering, performing . . . and to do them all at the same time. We've gathered five of today's hottest producers and engineers; join them as they discuss the tricks, techniques, mindset, and technologies that allow them to cover multiple roles at once—and how you can apply this information to your own situation.

Mitch Gallagher, the Editor in Chief of EQ magazine, began working in music professionally over 20 years ago. His background includes a degree in music, as well as graduate studies in electronic music composition and classical guitar. He is an author, journalist, teacher, touring and studio musician, recording engineer, project studio/multimedia production company owner, and award-winning composer. His first book, Make Music Now!, was recently released by Backbeat Books.

Mike Elizondo has shared production credit with Dr. Dre on the Rolling Stones’ “Miss You” for the Austin Powers soundtrack. His songs and musicianship can be found on Eve’s Grammy-winning “Let Me Blow Ya Mind,” Mary J. Blige’s “Family Affair,” and Eminem’s “My Dad’s Gone Crazy.” Elizondo has formed a production company, Tone Down Productions, with its first release, Rebel, due through Columbia in early 2003.

Ron Fair is a veteran record man who is a rare combination of producer, A&R executive, musician, arranger, and engineer. He signed and produced multi-Grammy winner Christina Aguilera and also signed Lit. He is currently President of A&M Records and produced the vocal performances of Christina Aguilera, Mya, Pink, and Lil’ Kim.

Ben Grosse is a producer/engineer/mixer whose credits include Vertical Horizon, Sixpence None the Richer, Fuel, Filter, Guster, Ben Folds, Live, BT, Megadeth, 3rd Eye Blind, Red Hot Chili Peppers, Madonna, and Crystal Method. He owns The Mix Room, Burbank, CA, which features two SSL-equipped rooms.

Taz Herzberg is a Los Angeles-based programmer and engineer. His credits include Counting Crows, Vanessa Carlton, Christina Aguilera, and the Grammy-awarded “Lady Marmalade” remake.

Workshop 7 Sunday, October 6 1:00 pm–3:00 pm Room 406AB

AES42-2001: AES STANDARD FOR ACOUSTIC-DIGITAL INTERFACE FOR MICROPHONES

Chair: Stephan Peus, Georg Neumann GmbH, Berlin, Germany


Panelists: Manfred Bleichwehl, Steven Harris, Juergen Wahl, Sennheiser Electronic Corporation, Van Nuys, CA, USA

This workshop introduces the new AES42-2001 Standard for Acoustic-Digital Interface for Microphones. Among its most interesting potentials are the many options for remotely controlling a wide variety of microphone features, which are not possible with traditional analog microphones. Microphones built according to this new standard will have not only an internal analog/digital converter but also extended digital signal processing (DSP) capabilities. They can incorporate, for example, the equivalent of a high-gain, high-precision traditional microphone preamplifier inside the digital microphone. Presenters will demonstrate some of these features and advantages and show potential applications using an operational microphone system designed according to the new AES42-2001 Standard.

Session G Sunday, October 6 2:00 pm–6:00 pm Room 408A

MULTICHANNEL SOUND

Chair: Durand Begault, NASA Ames Research Center, Mountain View, CA, USA

2:00 pm

G-1 Interchannel Interference at the Listening Position in a Five-Channel Loudspeaker Configuration—Geoff Martin, McGill University, Montreal, Quebec, Canada, and Bang & Olufsen a/s, Struer, Denmark

It is commonly accepted thinking that the use of a five-channel surround sound reproduction system increases the size of the listening area over that for two-channel stereophonic systems. In actual fact, for many types of program material, the area of this so-called sweet spot is reduced due to interference between the channels at the listener’s ears. This effect is described and analyzed through theoretical evaluation and psychoacoustic listening tests.
Convention Paper 5677

2:30 pm

G-2 Breaking and Making the Rules: Sound Design in 6.1—Ambrose Field, University of York, York, UK

It is often colloquially said that there are no rules for the spatialization of multichannel content. This paper seeks to identify creative ways in which the multichannel systems of today and tomorrow can be harnessed to provide aesthetically convincing surround environments. The suitability of production techniques based on Ambisonic methods is re-evaluated for this task, and the discussion focuses on the role of multichannel systems in sound design for presentation in the movie theater.
Convention Paper 5672

3:00 pm

G-3 Localization of Lateral Phantom Images in a 5-Channel System With and Without Simulated Early Reflections—Jason Corey, Wieslaw Woszczyk, McGill University, Montreal, Quebec, Canada

Phantom images that rely on interchannel level differences can be produced easily for two-channel stereo. Yet one of the most difficult challenges in production for a five-channel environment is the creation of stable phantom images to the side of the listening position. The addition of simulated early reflection patterns from all five loudspeakers influences the localization of lateral phantom sources. Listening tests were conducted to compare participants’ abilities to localize lateral sources under three conditions: power-panned sources alone, sources with simulated early reflection patterns, and simulated early reflection patterns alone (without direct sound). Our results compare localization error for the three conditions at different locations and suggest that early reflection patterns alone can be sufficient for source localization.
Convention Paper 5673

3:30 pm

G-4 The Minimum Number of Loudspeakers and its Arrangement for Reproducing the Spatial Impression of Diffuse Sound Field—Koichiro Hiyama, Setsu Komiyama, Kimio Hamasaki, NHK, Setagaya, Tokyo, Japan

It is important to find out how many loudspeakers are necessary for multichannel sound systems to reproduce the spatial impression of a diffuse sound field. This paper discusses the issue in the case where loudspeakers are placed symmetrically along a concentric circle around the listener and at the same height as the listener’s ear. It becomes clear that the spatial impression of the diffuse sound field can be obtained by only two symmetrical pairs of loudspeakers (that is, four loudspeakers in all). In this arrangement, one pair of loudspeakers should be placed in the frontal area around the listener at an angle of about 60 degrees, and the other pair should be in the rear area with an angle of 120 degrees to 180 degrees.
Convention Paper 5674

4:00 pm

G-5 A Fuzzy Cerebellar Model Approach for Synthesizing Multichannel Recordings—Ching-Shun Lin, Chris Kyriakakis, University of Southern California, Los Angeles, CA, USA

New high capacity optical discs and high bandwidth networks provide the capability for delivering multichannel audio. Although there are many one- and two-channel recordings in existence, only a handful of multichannel recordings exist. In this paper we propose a neural network approach that can synthesize microphone signals with the correct acoustical characteristics of specific venues that have been characterized in advance. These signals can be used to generate a multichannel recording with the acoustical characteristics of the original venue. The complex semi-cepstrum technique is employed to extract features from musical signals recorded in a venue, and these signals are sent into the fuzzy cerebellar model articulation controller (FCMAC) for training.
Convention Paper 5675

4:30 pm

G-6 Investigation of Listener Envelopment in Multichannel Surround Systems—Gilbert A. Soulodre, Michel C. Lavoie, Scott G. Norcross, Communications Research Centre, Ottawa, Ontario, Canada

It is now well understood that listener envelopment (LEV) is an essential component of good concert hall acoustics. An objective measure based on the late-lateral energy has been shown to perform well at predicting LEV. One goal of multichannel surround systems is to improve the re-creation of the concert hall experience in a home listening environment. By varying the amount of late-lateral energy, such systems should allow the perception of LEV to be enhanced and controlled. In this paper the loudspeaker/listening room interactions are shown to limit the range of acoustical conditions that can be re-created. A series of formal subjective tests were conducted to determine if objective measures of late-lateral energy are suitable for predicting LEV in multichannel surround systems.
Convention Paper 5676

5:00 pm

G-7 The Significance of Interchannel Correlation, Phase, and Amplitude Differences on Multichannel Microphone Techniques—Geoff Martin, McGill University, Montreal, Quebec, Canada, and Bang & Olufsen a/s, Struer, Denmark

There is a measurable interference between correlated signals produced by multiple loudspeakers in a standard five-channel loudspeaker configuration, resulting in an audible comb filter effect. This is due to small individual differences in distances between the ears of the listener and the various loudspeakers. Although this effect is caused by the dimensions and characteristics of the monitoring environment, it can be minimized in the recording process, particularly through the relative placement of microphones and choice of their directional characteristics. In order to analyze this effect, the correlation of microphone signals and their amplitude differences in a recording environment are evaluated using theoretical models. This procedure is applied to coincident and spaced pairs of transducers for direct and reverberant sounds.
Convention Paper 5671
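The comb-filter mechanism described above can be illustrated with a two-path model (the geometry and numbers are assumptions chosen for illustration):

```python
import math

def comb_magnitude(freq, path_diff_m, c=343.0):
    """Magnitude at one ear when two loudspeakers radiate the same
    signal and their path lengths to that ear differ by path_diff_m:
    |1 + exp(-j*2*pi*f*dt)| = 2*|cos(pi*f*dt)|, where dt is the
    arrival-time difference."""
    dt = path_diff_m / c
    return abs(2.0 * math.cos(math.pi * freq * dt))

# A 10 cm path difference puts the first null at c / (2 * d) = 1715 Hz.
first_null = 343.0 / (2.0 * 0.1)
print(comb_magnitude(first_null, 0.1))        # deep cancellation
print(comb_magnitude(2.0 * first_null, 0.1))  # full reinforcement
```

Decorrelating the two signals, for example by microphone spacing or directivity as the abstract suggests, removes the fixed phase relationship that produces these nulls.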

5:30 pm

G-8 A Multichannel Surround Audio Format Applying 3-D Sound Technology—Myung-Soo Kang, Kyoo-Nyun Kim, University of Ulsan, Ulsan, Korea

As systems employ multichannel audio more and more, it is necessary to consider surround sound capability in the process of sound recording and playback. This paper presents a structured audio format and its application to the design of a more efficient surround sound system. Three-dimensional sound technology is used for localization of the sound source. We define a reusable sound object to clarify the audio format. A sound object is a unit of recorded sound samples that can be changed by various effect properties. Filter and 3-D properties are applied to change the sound objects in each track.
(Paper not presented at convention, but Convention Paper 5678 is available.)

Session H Sunday, October 6 2:00 pm–4:30 pm Room 404AB

LOW BIT-RATE CODING, PART 1

Chair: Gary Brown, Tensilica, Santa Clara, CA, USA

2:00 pm

H-1 Scalable Lossless Audio Coding Based on MPEG-4 BSAC—Doh-Hyung Kim, Jung-Hoe Kim, Sang-Wook Kim, Samsung Advanced Institute of Technology, Suwon, Korea

In this paper a new hybrid type of scalable lossless audio coding scheme based on MPEG-4 BSAC (bit-sliced arithmetic coding) is proposed. This method introduces two residual error signals, a lossy coding error signal and a prediction error signal, and utilizes Rice coding as a lossless coding tool. These processes enable an increase in the compression ratio. Experimental results show that the average total file size can be reduced to about 50 to 60 percent of the original size. Consequently, a slight modification of the conventional MPEG-4 general audio coding scheme can provide scalable lossless audio coding functionality between lossy and lossless bitstreams.
Convention Paper 5679
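Rice coding, named above as the lossless coding tool, splits each value into a unary-coded quotient and a k-bit remainder. A minimal sketch (nonnegative integers only; the mapping of signed residuals to nonnegative codes is omitted):

```python
def rice_encode(values, k):
    """Rice-code nonnegative integers with parameter k:
    quotient v >> k in unary, remainder in k plain bits."""
    bits = []
    for v in values:
        q, r = divmod(v, 1 << k)
        bits += [1] * q + [0]                          # unary quotient
        bits += [(r >> i) & 1 for i in range(k - 1, -1, -1)]
    return bits

def rice_decode(bits, count, k):
    """Inverse of rice_encode for a known number of values."""
    out, i = [], 0
    for _ in range(count):
        q = 0
        while bits[i] == 1:                            # read unary part
            q += 1
            i += 1
        i += 1                                         # skip terminating 0
        r = 0
        for _ in range(k):
            r = (r << 1) | bits[i]
            i += 1
        out.append(q * (1 << k) + r)
    return out

residuals = [3, 0, 7, 12, 1]
coded = rice_encode(residuals, k=2)
assert rice_decode(coded, len(residuals), k=2) == residuals
```

The parameter k is tuned to the residual statistics: small residuals, which good prediction produces, then cost only a few bits each.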

2:30 pm

H-2 Lossless Audio Coding Using Adaptive Multichannel Prediction—Tilman Liebchen, Technical University of Berlin, Berlin, Germany

Lossless audio coding enables the compression of digital audio data without any loss in quality due to a perfect reconstruction of the original signal. The compression is achieved by means of decorrelation methods such as linear prediction. However, since audio signals usually consist of at least two channels, which are often highly correlated with each other, it is worthwhile to make use of interchannel correlations as well. This paper shows how conventional (mono) prediction can be extended to stereo and multichannel prediction in order to improve compression efficiency. Results for stereo and multichannel recordings are given.
Convention Paper 5680
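Interchannel prediction can be illustrated in miniature: predict one channel from the other and code only the residual. The single-gain predictor below is a deliberately minimal stand-in for the paper's adaptive predictors, and real lossless codecs use integer-reversible prediction so the decoder reconstructs exactly:

```python
import numpy as np

def stereo_residual(left, right):
    """Predict the right channel from the left with one least-squares
    gain and return (gain, residual)."""
    a = np.dot(left, right) / np.dot(left, left)   # optimal scalar gain
    return a, right - a * left

# Hypothetical highly correlated channels: the residual carries far
# less energy than the channel itself, so the entropy coder that
# follows needs fewer bits.
t = np.arange(256)
left = np.sin(0.05 * t)
right = 0.8 * left + 0.01 * np.sin(0.3 * t)        # mostly a scaled copy
a, res = stereo_residual(left, right)
print("energy ratio:", np.sum(res ** 2) / np.sum(right ** 2))
```

The decoder recovers `right` from the transmitted gain, the residual, and the already-decoded `left`, which is exactly the structure the paper generalizes to adaptive multichannel predictors.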

3:00 pm

H-3 Design of an Error-Resilient Layered Audio Codec—Dai Yang, Hongmei Ai, Chris Kyriakakis, C.-C. Jay Kuo, University of Southern California, Los Angeles, CA, USA

Current high quality audio coding techniques mainly focus on coding efficiency, which makes them extremely sensitive to channel noise, especially in high error rate wireless channels. In this paper we propose an error-resilient layered audio codec (ERLAC) which provides functionalities of both fine-grain scalability and error resiliency. A progressive quantization, a dynamic segmentation scheme, a frequency interleaving technique, and an unequal error protection scheme are adopted in the proposed algorithm to construct the final error-robust layered audio bitstream. The performance of the proposed algorithm is tested under different error patterns of WCDMA channels with several test audio materials. Our experimental results show that the proposed approach achieves excellent error resilience at a regular user bit rate of 64 kb/s.
Convention Paper 5681

3:30 pm

H-4 Application of a Concatenated Coding System with Convolutional Codes and Reed-Solomon Codes to MPEG Advanced Audio Coding—Dong Yan Huang1, Say Wei Foo2, Weisi Lin3, Ju-Nia Al Lee4

1Institute of Microelectronics, Singapore, Singapore
2Nanyang Technological University, Singapore, Singapore


4:40 pm

Technical Committee Meeting on Coding of Audio Signals

Workshop 6 Sunday, October 6 2:00 pm–5:00 pm Room 408B

GAME AUDIO

Chair: Aaron Marks, On Your Mark Music Productions, CA, USA

Speakers: Keith Arem, PCB Productions (CD-ROM audio)
Jack Buser, Dolby Labs, San Francisco, CA, USA (Use of Surround Sound in Video Games)
Brian Schmidt, Microsoft (XBox audio)
Martin Wilde, Game Audio Programmer (Game audio programming issues)

This game audio workshop is designed to answer one of the most often asked questions: What does it take to make great game audio? Not only will we provide the answers, but we also will come armed with many specific audio examples designed to showcase the specific talents, techniques, and technologies used to help make the games business the billion dollar industry that it is today.

5:10 pm

Technical Committee Meeting on Audio for Games

Education Event
ONE-ON-ONE MENTORING SESSION 1
Sunday, October 6, 2:30 pm–4:30 pm
Room 402A

Students are invited to sign up for an individual meeting with audio mentors from the industry. The sign-up sheet will be located near the student center of the convention, and all students are invited to participate in this exciting and rewarding opportunity for focused discussion.

Education Event
EDUCATION FAIR
Sunday, October 6, 2:30 pm–4:30 pm
Room 403B

Institutions offering studies in audio—from short courses to graduate degrees—will be represented in a tabletop session. Information on each school’s respective programs will be made available through the display of literature and academic guidance sessions with representatives. There is no charge for schools to participate, and admission is free and open to everyone.

Education Event
STUDENT POSTER SESSION
Sunday, October 6, 2:30 pm–4:30 pm
Room 403B

The event will display the scholarly/research/creative works from AES student members. Since many institutions are engaged in both research and applied sciences of audio, this session will provide an opportunity to display and discuss these accomplishments with professionals, educators, and other students.

3Laboratories of Information Technology, Singapore, Singapore

4National University of Singapore, Singapore

Reliable delivery of the audio bitstream is vital to ensure acceptable audio quality as perceived by 3G network customers. In other words, the audio coding scheme that is employed must be fairly robust over error-prone channels. Various error-resilience techniques can be utilized for the purpose. Due to the fact that some parts of the audio bitstream are less sensitive to transmission errors than others, unequal error protection (UEP) is used to reduce the redundancy introduced by error resilience requirements. The current UEP scheme with convolutional codes and multistage interleaving has an unfortunate tendency to generate burst errors at the decoder output as the noise level is increased. A concatenated system combining Reed-Solomon codes with convolutional codes in the UEP scheme is investigated for MPEG advanced audio coding (AAC). Under severe channel conditions with random bit error rates of up to 5×10⁻², the proposed scheme achieved more than 50 percent improvement in residual bit error rate over the original scheme at a bit rate of 64 kb/s and sampling frequency of 48 kHz. Under burst error conditions with burst error lengths of up to 4 ms, the proposed scheme achieved more than 65 percent improvement in bit error rate over the original scheme. The average percentage overhead incurred by using the concatenated system is about 3.5 percent of the original UEP scheme. Further improvements are made by decreasing the puncturing rate of the convolutional codes. However, this can only be adopted when high protection is needed in extremely noisy conditions (e.g., channel BER significantly exceeds 10⁻²) since it incurs increased overheads.
Convention Paper 5682

4:00 pm

H-5 A Simple Method for Reproducing High Frequency Components at Low Bit-Rate Audio Coding—Jeongil Seo, Daeyoung Jang, Jinwoo Hong, Kyeoungok Kang, Electronics & Telecommunications Research Institute (ETRI), Yuseong-Gu, Daejeon, Korea

In this paper we describe a simple method for reproducing high frequency components at low bit-rate audio coding. To compress an audio signal at low bit rates (below 16 kb/s per channel) we can use a lower sampling frequency (below 16 kHz) or high performance audio coding technology. When an audio signal is sampled at a low frequency and coded at a low bit rate, high frequency components and reverberant sound are lost because of quantization noise between pitch pulses. In a short-term period, the harmonic characteristic of audio signals is stationary, so the replication of high-frequency bands with low-frequency bands can extend the frequency range of the resulting sound and enhance the sound quality. In addition, to reduce the number of bands to be reproduced we adapted this algorithm in the Bark scale domain. For compatibility with a conventional audio decoder, the additional bitstream is added at the end of each frame generated by a conventional audio coder. We applied this proposed algorithm to MPEG-2 AAC and increased the quality of audio in comparison with conventional MPEG-2 AAC coded audio at the same rate. The computational cost of the proposed algorithm is similar to or a little more than that of a conventional MPEG-2 AAC decoder.
Convention Paper 5683
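The band-replication idea can be sketched on a toy magnitude spectrum. This is a simplified illustration: the envelope gain below stands in for side information a real coder would transmit, and the paper operates per Bark band rather than on one block:

```python
import numpy as np

def replicate_high_band(spectrum, split):
    """Rebuild the high band of a magnitude spectrum by copying the
    band just below the split point upward, scaled so the high-band
    envelope (here, its sum) is preserved."""
    out = spectrum.copy()
    width = len(spectrum) - split
    donor = spectrum[split - width:split]               # band just below the split
    gain = np.sum(spectrum[split:]) / max(np.sum(donor), 1e-12)
    out[split:] = gain * donor
    return out

spec = np.array([8.0, 6.0, 4.0, 3.0, 2.0, 1.5, 1.0, 0.5])
rebuilt = replicate_high_band(spec, split=4)
print(rebuilt)
```

The low band is untouched and the replicated high band matches the original's overall energy, which is what makes the cheap decoder-side extension perceptually plausible for harmonically stationary signals.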

Page 105: AES organizations AES · AES Journal of the Audio Engineering Society (ISSN 0004-7554), Volume 50, Number 11, 2002 November Published monthly, except January/February and July/August

Special Event
AN AFTERNOON WITH . . . TOM HOLMAN
Sunday, October 6, 2:30 pm–5:00 pm
Room 411 Theater

Moderator: George Petersen, Mix Magazine, Emeryville, CA, USA

Tom Holman, whose distinguished career in audio, video, and film spans more than three decades, will participate in an interactive discussion with the audience. President of the TMH Corporation and Professor of Film Sound at the University of Southern California School of Cinema-Television, Tom Holman has the rare ability to push the state of the art and challenge entire industries to achieve new quality standards. He creates new markets and redefines existing markets by introducing new quality standards and designing products that perform to them.

Workshop 8 Sunday, October 6 3:00 pm–5:40 pm
Room 406AB

PROTECTING YOUR HEARING AGAINST LOSS: ASSESSMENT OF FUNCTIONAL HEARING ABILITY IN NOISE

Co-chairs: Dilys Jones, Sigfrid Soli, House Ear Institute, Los Angeles, CA, USA

These experts in the field of hearing will discuss methods by which hearing can be protected from loss and will assess functional hearing ability in noise. The workshop will feature a tutorial on the anatomy of the ear, a discussion of the factors that contribute to hearing loss, and hearing evaluations.

Sigfrid Soli will also discuss audiometric testing; use and interpretation of an audiogram; extended high-frequency testing; and how to protect and conserve one's hearing by following the Occupational Safety and Health Administration's (OSHA) guidelines and using hearing protection devices, in-ear monitors, and safe behavioral practices.

Education Event
EDUCATION FORUM
Sunday, October 6, 4:30 pm–6:00 pm
Room 402A

Co-hosts: Theresa Leonard, The Banff Centre, Banff, Alberta, Canada
Don Puluse

This event is a meeting of the AES Education Committee, teachers, authors, students, and members interested in the issues of primary and continuing education in the audio industry. It is an opportunity to discuss the programs of the Education Committee and to provide input for future projects of this committee.

Special Event
THE JAZZ SINGER
Sunday, October 6, 6:30 pm–7:30 pm
Room 403A

An old-time radio re-creation of the Lux Radio Theatre production of “The Jazz Singer,” the first successful talking picture, starring Al Jolson, will be staged on the 75th anniversary of the film's original debut. Several veteran radio actors have been assembled to commemorate both the film and the original 1947 broadcast. This radio re-creation will feature Richard Halpern in the starring role, with Herb Ellis directing.

Session I Monday, October 7 9:00 am–12:00 noon
Room 404AB

LOW BIT-RATE CODING, PART 2

Chair: Marina Bosi, MPEG LA, Denver, Colorado

9:00 am

I-1 Perceptually-Based Joint-Program Audio Coding—Christof Faller1, Raziel Haimi-Cohen2, Peter Kroon1, Joseph Rothweiler1

1Agere Systems, Murray Hill, NJ, USA
2Lucent Technologies, Murray Hill, NJ, USA

Some digital audio broadcasting systems, such as Satellite Digital Audio Radio Services (SDARS), transmit many audio programs over the same transmission channel. Instead of splitting the channel into fixed bit-rate subchannels, each carrying one audio program, one can dynamically distribute the channel capacity among the audio programs. We describe an algorithm that implements this concept, taking into account the bit-rate variation statistics of audio coders and perception. The result is a dynamic distribution of the channel capacity among the coders depending on the perceptual entropy of the individual programs. This solution provides improved audio quality compared with fixed bit-rate subchannels for the same total transmission capacity. The proposed scheme is noniterative and has low computational complexity.
Convention Paper 5684
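A minimal sketch of such joint allocation, assuming a simple proportional-to-perceptual-entropy rule (the paper's algorithm additionally models the coders' bit-rate statistics; the numbers below are invented):

```python
# Sketch of the joint-coding idea: instead of fixed per-program
# subchannels, distribute a shared bit pool in proportion to each
# program's current perceptual entropy (PE). The proportional rule
# and all numbers are illustrative assumptions, not the paper's method.

def allocate(total_kbps, perceptual_entropy, floor_kbps=8.0):
    """Give each program a floor rate plus a PE-proportional share of the rest."""
    pool = total_kbps - floor_kbps * len(perceptual_entropy)
    total_pe = sum(perceptual_entropy)
    return [floor_kbps + pool * pe / total_pe for pe in perceptual_entropy]

# Four programs sharing a 256-kb/s channel; PE in arbitrary units per frame.
rates = allocate(256.0, [1.0, 3.0, 2.0, 2.0])
print(rates)        # demanding programs get more of the shared capacity
print(sum(rates))   # the total always equals the channel capacity
```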

9:30 am

I-2 Incorporation of Inharmonicity Effects into Auditory Masking Models—Hossein Najaf-Zadeh, Hassan Lahdili, Louis Thibault, Communications Research Centre, Ottawa, Ontario, Canada

In this paper the effect of the inharmonic structure of audio maskers on the produced masking pattern is addressed. In most auditory models, the tonal structure of the masker is analyzed to determine the masking threshold. Based on psychoacoustic data, masking thresholds caused by tonal signals are lower than those produced by noise-like maskers. However, the relationship between spectral components has not been considered. It has been found that, for two different multitonal maskers with the same power, the one with a harmonic structure produces a lower masking threshold. This paper proposes a modification to the MPEG psychoacoustic model 2 to take into account the inharmonic structure of the input signal. Informal listening tests have shown that the bit rate required for transparent coding of inharmonic (multitonal) audio material can be reduced by 10 percent if the new modified psychoacoustic model 2 is used in the MPEG-1 Layer II encoder.
Convention Paper 5685

10:00 am

I-3 Binaural Cue Coding Applied to Audio Compression with Flexible Rendering—Christof Faller, Frank Baumgarte, Agere Systems, Murray Hill, NJ, USA

In this paper we describe an efficient scheme for compression and flexible spatial rendering of audio signals. The method is based on binaural cue coding (BCC), which was recently introduced for efficient compression of multichannel audio signals. The encoder input consists of separate signals without directional spatial cues, such as separate sound source signals, i.e., several monophonic signals. The signal transmitted to the decoder consists of the mono sum signal of all input signals plus a low bit-rate (e.g., 2 kb/s) set of BCC parameters. The mono signal can be encoded with any conventional audio or speech coder. Using the BCC parameters and the mono signal, the BCC synthesizer can flexibly render a spatial image by determining the perceived direction of the audio content of each of the encoder input signals. We provide the results of an audio quality assessment using headphones, which is a more critical scenario than loudspeaker playback.
Convention Paper 5686

J. Audio Eng. Soc., Vol. 50, No. 11, 2002 November 967
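The encoder-side split behind BCC can be caricatured as follows. The cue set here is reduced to per-band inter-channel level differences, which is only one of the spatial cues a real BCC analyzer extracts; the band split and signal are toy assumptions:

```python
# Toy sketch of the binaural cue coding (BCC) split: transmit one mono
# sum signal plus a small set of per-band spatial cues (here only
# inter-channel level differences), from which a synthesizer can
# re-render a spatial image. A simplified illustration, not Agere's code.
import numpy as np

def bcc_analyze(left, right, n_bands=4):
    """Return the mono sum plus a coarse per-band level-difference cue (dB)."""
    mono = 0.5 * (left + right)
    cues = []
    for l_band, r_band in zip(np.array_split(left, n_bands),
                              np.array_split(right, n_bands)):
        eps = 1e-12                  # guard against log of zero
        cues.append(10 * np.log10((np.sum(l_band ** 2) + eps) /
                                  (np.sum(r_band ** 2) + eps)))
    return mono, cues

t = np.linspace(0, 1, 1000)
left = np.sin(2 * np.pi * 5 * t)
right = 0.5 * left                   # the source is panned toward the left
mono, cues = bcc_analyze(left, right)
print([round(c, 2) for c in cues])   # every band reports about +6 dB
```

Only `mono` plus the handful of `cues` values per frame would be transmitted, which is how the side information stays at a few kb/s.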

10:30 am

I-4 Two-Pass Encoding of Audio Material Using MP3 Compression—Martin Weishart1, Ralf Göbel2, Jürgen Herre1

1Fraunhofer Institute for Integrated Circuits, Erlangen, Germany

2Fachhochschule Koblenz, Koblenz, Germany

Perceptual audio coding has become a widely used technique for economical transmission and storage of high-quality audio signals. Audio compression schemes, as known from MPEG-1, -2, and -4, allow encoding with either a constant or a variable bit rate over time. While many applications demand a constant bit rate due to channel characteristics, variable bit-rate encoding is becoming increasingly attractive, e.g., for Internet audio and portable audio players. Such an approach can lead to significant improvements in audio quality compared to traditional constant bit-rate encoding, but the consumed average bit rate will generally depend on the compressed audio material. This paper presents investigations into two-pass encoding, which combines the flexibility of variable bit-rate encoding with predictable target bit consumption.
Convention Paper 5687

11:00 am

I-5 Technical Aspects of Digital Rights Management Systems—Christian Neubauer1, Karlheinz Brandenburg2, Frank Siebenhaar1

1Fraunhofer Institute for Integrated Circuits IIS-A, Erlangen, Germany

2Fraunhofer Arbeitsgruppe Elektronische Medientechnologie AEMT and Ilmenau University, Ilmenau, Germany

In today’s multimedia world digital content is easily available to and widely used by end consumers. On the one hand, high quality, the ability to be copied without loss of quality, and the existence of portable players make digital content, in particular digital music, very attractive to consumers. On the other hand, the music industry is facing increasing revenue loss due to illegal copying. To cope with this problem, so-called Digital Rights Management (DRM) systems have been developed to control the usage of content. However, currently no vendor and no DRM system is widely accepted by the market, owing to the incompatibility of different systems, the lack of open standards, and other reasons. This paper analyzes the current situation of DRM systems, derives requirements for them, and presents technological building blocks to meet these requirements. Finally, an alternative approach to a DRM system is presented that better respects the rights of consumers.
Convention Paper 5688

11:30 am

I-6 Configurable Microprocessor Implementation of Low Bit-Rate Audio Decoding—Gary S. Brown, Tensilica, Inc., Santa Clara, CA, USA

Using a configurable microprocessor to implement low bit-rate audio applications by tailoring the instruction set reduces algorithm complexity and implementation cost. As an example, this paper describes a Dolby Digital (AC-3) decoder implementation that uses a commercially available configurable microprocessor to achieve 32-bit floating-point precision while minimizing the required processor clock rate and die size. The paper focuses on how the audio quality and features of the reference decoder algorithm dictate the customization of the microprocessor, and shows examples of audio-specific extensions to the processor's instruction set that create a family of AC-3 decoder implementations meeting multiple performance and cost points. How this approach benefits other audio applications is also discussed.
Convention Paper 5689

12:10 pm

Technical Committee Meeting on Network Audio Systems

Session J Monday, October 7 9:00 am–11:00 am
Room 406AB

HIGH-RESOLUTION AUDIO

Chair: Rhonda Wilson, Meridian Audio Limited, Huntingdon, Cambridgeshire, UK

9:00 am

J-1 DSD Processing Modules—Architecture and Application—Nathan Bentall, Gary Cook, Peter Eastty, Eamon Hughes, Michael Page, Christopher Sleight, Mike Smith, Sony Broadcast & Professional Research Labs, Oxford, UK

As demand continues to grow for production equipment targeting the high-resolution, multichannel capabilities of SACD (Super Audio CD), there is increasing interest in adding DSD capability to both new and existing systems. The prospect of researching and implementing the necessary algorithms from scratch can be daunting. The high data rates, coupled with the asymmetric multipliers often required by the algorithms, make conventional von Neumann-type DSP platforms (where most developers traditionally have their DSP expertise) seem suboptimal. Based on custom DSD audio processing engines and packaged in the very compact SODIMM form factor, the modules described in this paper can add high-quality, real-time, low-latency DSD audio processing functionality to a system with a minimum of development time.
Convention Paper 5690

9:30 am

J-2 Multichannel Audio Connection for Direct Stream Digital—Michael Page, Nathan Bentall, Gary Cook, Peter Eastty, Eamon Hughes, Christopher Sleight, Sony Broadcast & Professional Research Labs, Oxford, UK

The development of large-scale multitrack production equipment for direct stream digital (DSD) audio will be substantially aided by the availability of a convenient multichannel interface. DSD is the high-resolution one-bit audio coding system for the Super Audio CD consumer disc format. Existing multichannel audio interconnection and networking protocols cannot easily support the high-frequency sample clock required by DSD. A new interconnection type is introduced to provide reliable, low-latency, full-duplex transfer of 24 channels of DSD using a single conventional office networking cable. The interconnection transfers audio data using Ethernet physical-layer technology while conveying a DSD sample clock over the same cable.
Convention Paper 5691

10:00 am

J-3 The Effect of Idle Tone Structure on Effective Dither in Delta-Sigma Modulation Systems: Part 2—James A. S. Angus, University of Salford, Salford, UK

This paper clarifies some of the confusion that has arisen over the efficacy of dither in PCM and sigma-delta modulation (SDM) systems. It presents a means of analyzing in-band idle tone structure using chaos theory and describes a fair means of comparison between PCM and SDM. It presents results that show that dither can be effective in SDM systems.
Convention Paper 5692

10:30 am

J-4 DC Analysis of High Order Sigma-Delta Modulators—Derk Reefman, Erwin Janssen, Philips Research, Eindhoven, The Netherlands

A new method for the DC analysis of a sigma-delta modulator (SDM) is presented. The model used for the description of an SDM is adopted from Candy's model for a first-order SDM. However, where Candy's model is exact for a first-order SDM, it fails to be so in the higher-order case. In our model we deal with this by introducing stochastic behavior into the SDM, and we obtain the probability density function of some variables that determine many of the characteristics of the SDM in the time domain. Comparison with simulation results shows that the assumption of stochastic behavior is rather good for SDM orders greater than 3, which display significant noise shaping. For lower orders (or less aggressive noise shaping) the approximation is less good. As an aside, the new model of sigma-delta modulation also clarifies why the time-quantized dither approach presented by Hawksford is much better than standard quantizer dithering.
Convention Paper 5693
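For reference, the first-order loop of Candy's model, which the paper generalizes to higher orders, can be simulated in a few lines. This is the standard textbook setup (a one-bit ±1 quantizer fed by an error integrator), not code from the paper:

```python
# A first-order sigma-delta modulator in the spirit of Candy's model:
# the integrator accumulates the error between a DC input and the
# one-bit output, and the time average of the output bitstream
# converges to the input. Higher-order loops (the paper's subject)
# add more integrators and need the stochastic treatment described
# in the abstract; this sketch is only the exactly-solvable base case.

def first_order_sdm(dc_input, n_samples):
    """Run a first-order SDM on a constant input in (-1, 1); return ±1 bits."""
    integrator = 0.0
    bits = []
    for _ in range(n_samples):
        integrator += dc_input - (bits[-1] if bits else 0)  # accumulate error
        bits.append(1 if integrator >= 0 else -1)           # one-bit quantizer
    return bits

out = first_order_sdm(0.25, 4000)
print(sum(out) / len(out))   # the bitstream average approaches 0.25
```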

11:10 am

Technical Committee Meeting on High-Resolution Audio

Workshop 9 Monday, October 7 9:00 am–12:00 noon
Room 408A

STUDIO PRODUCTION AND PRACTICES

Chair: George Massenburg, GML, North Hollywood, CA, USA
with David Smith, Sony Music Studios, New York, NY, USA

This workshop will be an open discussion led by esteemed recording engineers George Massenburg and David Smith on delivery standards in digital and analog audio in the modern studio. Mr. Massenburg and Mr. Smith, who are respectively the Chair and a member of the AES Technical Committee on Studio Practices, will detail their proven techniques for establishing high-quality, archive-ready audio using today's state-of-the-art professional tools. They will concentrate on delivery formats, media requirements, and proper documentation, and will cover the myriad formats embedded in current digital audio workstations. In addition, they will share their practical secrets for everyday success in the studio environment.

12:10 pm

Technical Committee Meeting on Studio Practices and Production

Workshop 10 Monday, October 7 9:00 am–12:00 noon
Room 408B

LOUDSPEAKER LINE ARRAYS, PART 1: THEORY AND HISTORY

Chair: Jim Brown, Audio Systems Group, Inc., Chicago, IL, USA

Panelists: Wolfgang Ahnert, Acoustic Design Ahnert, Berlin, Germany
Stefan Feistel, Software Design Ahnert, Berlin, Germany
Kurt Graffy, Arup Acoustics, Los Angeles, CA, USA
David Gunness, Eastern Acoustic Works, Inc., Whitinsville, MA, USA
Don Keele, Harman/Becker Automotive Systems, Martinsville, IN, USA
Mike Klasco, Menlo Scientific, El Sobrante, CA, USA

This tutorial workshop is targeted at sound reinforcement professionals. The session includes development of the basic concepts underlying line arrays, both simple and complex, a historical survey of commercial line array products, and a variety of line array design techniques. Methods of predicting line array performance are described at both the conceptual and practical level. This workshop is a highly useful precursor to Workshop 12, Loudspeaker Line Arrays, Part 2: Practice and Applications, at 1:30 pm this afternoon.

Presentations:

Historical Review of Line Array Products, by Mike Klasco

Basic Line Array Concepts, by Don Keele

Electronically-Steered Arrays, by David Gunness

Measurement and Modeling of Multi-Element Loudspeakers and Modeling of Waveguide Elements, by David Gunness

Line Array System Design Considerations, by Wolfgang Ahnert

3-D Modeling of Line Arrays in Software, by Stefan Feistel

Sort-of Line Arrays? by Jim Brown

12:10 pm

Technical Committee Meeting on Loudspeakers

Education Event
ONE-ON-ONE MENTORING SESSION 2
Monday, October 7, 10:00 am–12:00 noon
Room 402A

Students are invited to sign up for an individual meeting with audio mentors from the industry. The sign-up sheet will be located near the student center of the convention, and all students are invited to participate in this exciting and rewarding opportunity for focused discussion.

Special Event
PLATINUM PRODUCERS PANEL 2:
PAST, PRESENT, AND FUTURE OF RECORDING
Monday, October 7, 12:30 pm–2:30 pm
Room 403A

Moderator: Howard Massey, On The Right Wavelength Consulting

Panelists: Michael Bradford
Bob Ezrin
Patrick Leonard
Larry Levine
Phil Ramone

How has the art of music recording changed over the past four decades? What is the state of the art today? Where is recording going? Join us for a lively, fascinating discussion as the world's top record producers talk about how far we have come and make predictions for the future.

Multi-instrumentalist Michael Bradford (a.k.a. Chunky Style) is one of today's hottest producers. Best known for his co-productions of Kid Rock's controversial The History of Rock and Uncle Kracker's hit single Follow Me, Bradford has also worked with Run-DMC, Terence Trent D'Arby, and Gregg Alexander, and has collaborated with renowned arranger Paul Buckmaster on a number of movie soundtracks.

Bob Ezrin has produced multiplatinum albums and live events for some of the world's leading artists, including Pink Floyd (The Wall), Alice Cooper, KISS, Rod Stewart, Peter Gabriel, Lou Reed, Roger Daltrey, and Nine Inch Nails. He has also established several successful production and publishing companies, a recording and mastering studio, and an independent label. He is active in community service, currently serving as Co-Chairman of the Board of Directors of The Los Angeles Mentoring Partnership and as VP of the Board of Directors of the Mr. Holland's Opus Foundation. He is currently producing the new Jane's Addiction album.

Keyboardist Patrick Leonard produced Elton John's critically acclaimed 2001 album Songs from the West Coast and recently completed the new album for Duncan Sheik. He has also produced Madonna, Jewel, Bryan Ferry, Roger Waters, and Rod Stewart, and has enjoyed fame both as a solo artist and as half of the nineties rock duo Toy Matinee.

Larry Levine is responsible for some of the greatest classic rock recordings of all time. As Phil Spector's longtime engineer, he helped craft the wall of sound that made records by The Ronettes and other Spector artists sonically unique. Levine also manned the board for numerous Beach Boys albums, including the legendary Pet Sounds, as well as Ike and Tina Turner's famed River Deep, Mountain High. He has also recorded Herb Alpert, The Righteous Brothers, Billy Eckstine, and The Ramones.

Phil Ramone is one of the most revered names in the music business. A former child prodigy and classically trained violinist, the Pope of Pop has produced an astonishing list of platinum-selling artists, including Billy Joel, Frank Sinatra, Paul Simon, Ray Charles, Elton John, Bono, Quincy Jones, Madonna, Tony Bennett, Carly Simon, Gloria Estefan, Luciano Pavarotti, Natalie Cole, B.B. King, Paul McCartney, Sinead O'Connor, George Michael, James Taylor, and Jon Secada. He has also produced numerous theater, television, and film soundtracks and is currently working on album projects with Rod Stewart and Liza Minnelli.

Moderator Howard Massey is a noted industry consultant and veteran audio journalist. He has written for Home Recording, Surround Professional, EQ, Musician, Billboard, and many other publications, and is the author of 11 books, including Behind the Glass, a collection of interviews with the world's top record producers. Massey has also worked extensively as an audio engineer, producer, songwriter, and touring/session musician.

Education Event
STUDENT RECORDING COMPETITION
Monday, October 7, 1:00 pm–6:00 pm
Room 309

Hosts: Theresa Leonard, The Banff Centre, Banff, Alberta, Canada
Don Puluse

Finalists selected by an elite panel of judges will give brief descriptions and play recordings in the Classical and Jazz/Pop categories. One submission per category per student section or school is allowed. Meritorious awards will be presented at the closing Student Delegate Assembly meeting on Tuesday, October 8, at 10:00 am in Room 402AB.

1:00 pm–2:00 pm Classical Category1

2:00 pm–3:00 pm Surround Classical Category1

3:00 pm–4:00 pm Jazz/Folk Category2

4:00 pm–5:00 pm Pop/Rock Category3

5:00 pm–6:00 pm Surround Non-Classical Category3

1Judges: Michael Bishop, John Eargle, Richard King
2Judges: Jim Anderson, Lynn Fuston, Bill Schnee
3Judges: Frank Filipetti, Bob Ludwig, Elliot Scheiner

Workshop 12 Monday, October 7 1:30 pm–5:30 pm
Room 408B

LOUDSPEAKER LINE ARRAYS, PART 2: PRACTICE AND APPLICATION

Chair: John Murray, Live Sound International Magazine, San Francisco, CA, USA

Panelists: François Deffarges, Nexo, San Rafael, CA, USA
Mark Engebretson, JBL Professional, Northridge, CA, USA
Christian Heil, L’Acoustics, Oxnard, CA, USA
Tom McCauley, McCauley Sound, Inc., Puyallup, WA, USA
Jeff Rocha, Eastern Acoustic Works, Whitinsville, MA, USA
Evert Start, Duran Audio, Zaltbommel, The Netherlands

This workshop focuses on commercially available modular line array systems for performance audio use. It will feature presenters from the engineering departments of several competing manufacturers who are actively working on the development of loudspeaker line array systems.

Presentations:

Fresnel Zone Analysis, Wave Sculpturing Technology (WST), and HF DOSC Wave Guide, by Christian Heil

Arithmetic Spiral Arrays, MF Diffraction-Slot Band-Pass Section, and High-Efficiency Driver Design, by Mark Engebretson

Divergence Shading, MF Parabolic Separator, and MF & LF Band-Pass Technology, by Jeff Rocha

Hyperboloid Reflective Wave Source, MF Phase Plug, and Hypercardioid Subbass, by François Deffarges


Adaptive Density Inverse Flat Lens, MF Driver Design, and Inter-Cell Summation Aperture, by Tom McCauley

Logarithmic Driver Spacing, Bessel FIR Filtering, and Steerable Lobes/2-Lobe Arrays, by Evert Start

Session K Monday, October 7 2:00 pm–4:00 pm
Room 404AB

RECORDING AND REPRODUCTION OF AUDIO

Chair: Bob Moses, Island Digital Media Group, Vashon, WA, USA

2:00 pm

K-1 Power Supply Regulation in Audio Power Amplifiers—Eric Mendenhall, Gibson Labs, Redondo Beach, CA, USA

Audio power amplifiers have typically been supplied power by the simplest possible means, usually an off-line supply with no line or load regulation, most commonly based on a line-frequency transformer. Even modern amplifiers utilizing switch-mode power supplies are usually designed without line or load regulation. The exception has been high-end audiophile amplifiers. The pros and cons of a regulated power supply are investigated.
Convention Paper 5694

2:30 pm

K-2 Audio Power Amplifier Output Stage Protection—Eric Mendenhall, Gibson Labs, Redondo Beach, CA, USA

This paper reviews a progression of circuits used for protecting bipolar power transistors in the output stages of audio power amplifiers. Design-oriented methods of determining the protection locus are shown in a mathematical and graphical procedure. The circuits are then expanded from their standard configurations to allow for transient excursion beyond steady-state limits and for thermally dependent protection limits, to better match the protection limits to the actual output stage capability. This allows the protection scheme to prevent output stage failure in the least restrictive way. A new method is shown for achieving a junction temperature estimation system without the use of a multiplier.
Convention Paper 5695

3:00 pm

K-3 Archiving Audio—Jim Wheeler, Tape Restoration & Forensics Company, Oceano, CA, USA

As hundreds of millions of tapes age, they begin to deteriorate. This paper describes how to recover these unplayable tapes as well as how to store them properly. It also covers the broader issues of archiving audio, including high-capacity, inexpensive hard disk drives, equipment obsolescence, and new media.
Convention Paper 5696

3:30 pm

K-4 Finding a Recording Audio Education Program that Suits Your Career Choice—Laurel Cash-Jones, Burbank, CA, USA

This paper discusses the past, present, and future of recording audio education. It describes how the job market and educational requirements have changed, and takes a look at how to plan for a successful career. It also provides valuable information on getting and keeping a job in today's fast-paced world of professional audio.
Convention Paper 5697

4:10 pm

Technical Committee Meeting on Audio Recording and Storage Systems

Session L Monday, October 7 2:00 pm–4:30 pm
Room 406AB

AUDIO NETWORKING AND AUTOMOTIVE AUDIO

Chair: John Strawn, S Systems, Larkspur, CA, USA

2:00 pm

L-1 Studio Exploring Using Universal Plug and Play—Rob Laubscher1, Richard Foss2

1Seattle, WA, USA
2Rhodes University, Grahamstown, South Africa

This paper explores the use of universal plug and play (UPnP) as a studio control technology. The architecture of a possible studio control technology is introduced, and the elements of this studio control architecture are related to the architecture of UPnP. A sample implementation demonstrates the key aspects of using UPnP as a studio control technology.
Convention Paper 5698

2:30 pm

L-2 mLAN: Current Status and Future Directions—Jun-ichi Fujimori1, Richard Foss2

1Yamaha Corporation, Hamamatsu, Japan
2Rhodes University, Grahamstown, South Africa

“mLAN” describes a network that allows for the transmission and receipt of audio and music control data by audio devices. IEEE 1394 was chosen as the specification upon which to implement mLAN. mLAN builds on IEEE 1394 and related standards, introducing formats, structures, and procedures that enable the deployment of IEEE 1394 within a music studio context. This paper discusses these standards and their implementations, and provides pointers to the future evolution of mLAN.
Convention Paper 5699

3:00 pm

L-3 Mutually-Immersive Audio Telepresence—Norman P. Jouppi1, Michael J. Pan2

1HP Labs, Palo Alto, CA, USA
2UCLA, Los Angeles, CA, USA

Mutually-immersive audio telepresence attempts to create for the user the audio perception of being in a remote location, and simultaneously to create the perception for people in the remote location that the user of the system is present there. The system provides bidirectional multichannel audio with relatively good fidelity, isolation from local sounds, and a reduction of local reflections. The system includes software for reducing unwanted feedback and joystick control of the audio environment. We are investigating the use of this telepresence system as a substitute for business travel. Initial user response has been positive, and future improvements in networking quality of service (QoS) will improve the interactivity of the system.
Convention Paper 5700

3:30 pm

L-4 Assessment of the Sound Field in a Car—Chulmin Choi1, Lae-Hoon Kim1, Sejin Doo2, Yangki Oh3, Koeng-Mo Sung1

1Seoul National University, Seoul, Korea
2Dong-Ah Broadcasting College, Ansung, Korea
3Mokpo National University, Mokpo, Korea

This paper describes a measurement and assessment method for the sound field in a car cabin, which is so small that conventional room-acoustical parameters cannot be employed directly. First, we measured the sound field using a multichannel microphone system and calculated some room-acoustical parameters to judge their validity in the car cabin. We concluded that many of the conventional parameters are not meaningful for a car and its audio system. By analyzing the impulse responses of many cars, we developed some parameters for a more profound assessment of the sound field in a car.
Convention Paper 5701

4:00 pm

L-5 Measurement of the Speech Intelligibility Inside Cars—Angelo Farina, Fabio Bozzoli, University of Parma, Parma, Italy

The paper describes a measurement system developed for assessing speech intelligibility inside car compartments. This relates directly to the intelligibility of voices reproduced through the radio receiving system (news, traffic information, etc.), but in the future it will also be used for assessing direct speech communication between passengers and the performance of hands-free communication devices. The system is based on two head-and-torso simulators, one equipped with an artificial mouth, the second with binaural microphones. Only the second is used when sound is reproduced through the car's sound system. The MLS-based version of the STI method is used for the measurements, taking into account the effect of background noise and the electroacoustic propagation inside the compartment.
Convention Paper 5702

4:40 pm

Technical Committee Meeting on Automotive Audio

Workshop 11 Monday, October 7 2:00 pm–5:00 pm
Room 408A

RECENT DEVELOPMENTS IN MPEG-4 AUDIO

Chair: Jürgen Herre, Fraunhofer Institute for Integrated Circuits, Erlangen, Germany

Panelists: Martin Dietz, Coding Technologies, Nürnberg, Germany
Ralf Geiger, Fraunhofer Institute for Integrated Circuits IIS-A, Erlangen, Germany
Leon van de Kerkhof, Philips Digital Systems Laboratory, Eindhoven, The Netherlands

For more than a decade MPEG audio standards have been defining the state of the art in perceptual coding of audio signals. Several phases of standardization (MPEG-1, MPEG-2, MPEG-4) have pushed borders beyond everyone's initial expectations. This workshop reports on three new extensions to the existing MPEG-4 audio standard that are currently under development. The work on compatible bandwidth extension of audio signals provides an additional performance boost for AAC coding at very low bit rates, permitting considerable signal quality at bit rates around 24 kbit/s per channel. A second extension augments the existing MPEG parametric audio coder with modes for representing high-quality audio signals. Finally, methods for lossless audio coding are under consideration, extending current MPEG-4 audio coders toward perfect representation at the word lengths and sampling rates typically associated with high-definition audio. The workshop will provide background and demonstrations of these technologies.

Special Event
SPARS BUSINESS PANEL: WHAT I'VE HAD TO DO TO SURVIVE SINCE 9/11, ENRON, “DESKTOP” AUDIO, AND STIFFER COMPETITION
Monday, October 7, 3:30 pm–5:30 pm
Room 403A

Moderator: David Amlen, SPARS President, Sound on Sound (NYC)

Panelists: Russ Berger, Russ Berger Design Group, Addison, TX, USA
Steve Davis, Crawford Audio, Atlanta, GA, USA
Kevin Mills, Larrabee Studios, Los Angeles, CA, USA
Joe Montarello, Capitol Region Insurance
David Porter, Annex Digital, San Francisco, CA, USA

The Society of Professional Audio Recording Services (SPARS) is a 23-year-old professional organization dedicated to sharing practical, hands-on business information about audio facility ownership, management, and operations. This event, which is co-hosted by the AES, features an elite panel of studio owners and managers who will explore strategies for adapting your business to changing times. Panelists will illustrate, with stories and solutions, innovative ways they have handled unique business challenges faced in successfully running today's audio production facility.

Special Event
OPEN HOUSE OF THE TECHNICAL COUNCIL AND THE RICHARD C. HEYSER MEMORIAL LECTURE
Monday, October 7, 6:00 pm–8:00 pm
Room 403A

Lecturer: James E. West

The Heyser Series is an endowment for lectures by eminent individuals with outstanding reputations in audio engineering and its related fields. The series is featured twice annually at both the United States and European AES conventions. Established in May 1999, The Richard C. Heyser Memorial Lecture honors the memory of Richard Heyser, a scientist at the Jet Propulsion Laboratory, who was awarded nine patents in audio and communication techniques and was widely known for his ability to clearly present new and complex technical ideas. Mr. Heyser was also an AES governor and AES Silver Medal recipient.

AES Journal of the Audio Engineering Society (ISSN 0004-7554), Volume 50, Number 11, 2002 November. Published monthly, except January/February and July/August.

The distinguished Richard C. Heyser lecturer is Professor James West of Avaya Labs and Johns Hopkins University, co-inventor of the electret condenser microphone. His lecture is titled "Modern Electret Microphones and Their Applications."

It is well known that condenser microphones are the transducer of choice when accuracy, stability, frequency characteristics, dynamic range, and phase are important. But conventional condenser microphones require critical and costly construction as well as a high dc bias for linearity. These disadvantages ruled out advanced practical microphone designs such as multielement arrays and the use of linear microphones in telephony. The combination of our discovery of stable charge storage in thin polymers and the need for improved linearity in communications encouraged the development of modern electret microphones in the early 1960s at Bell Labs.

Others had suggested the use of electrets (the electrical analog of a permanent magnet) in transducers in the late 1920s, but these and subsequent efforts all suffered from the insufficient stability of wax electrets under normal environmental conditions. Water molecules from atmospheric humidity were the main depolarizing factor in Carnauba and other wax electrets. The first broad application of wax electret microphones was discovered in captured World War II Japanese field equipment. Because of the decay of the polarization of the electret, these microphones had a lifetime of about six months.

Modern polymer electret transducers can be constructed in various sizes and shapes mainly because of the simplicity of the transducer. All that is needed in the mechanical system is a thin (25 micron) charged polymer, a small (irregular) air gap, and a back plate. An impedance converter is necessary and is provided by an FET to better match conventional electronic equipment. Applications of electret microphones range from very small hearing aid microphones (a few square millimeters) to very large single-element units (20 cm diameter) for underwater and airborne reception of very low frequencies. Because the frequency and phase response of electret microphones are relatively constant from unit to unit, multiple-element two-dimensional arrays can be constructed. We have constructed a two-dimensional, 400-element array for Arnold Auditorium at Bell Labs with electret elements that are available for under $1.00 each.

Telephone bandwidth and frequency characteristics have remained constant for the past 30 years, while entertainment has brought high fidelity, including surround sound, into most homes throughout the world. People are accustomed to good quality sound and expect it in communication systems. The Internet Protocol (IP) offers the needed bandwidth to improve audio quality for telephony, but it will require broadband microphones and loudspeakers to provide customers with voice presence and clarity. Directional microphones for both handheld and hands-free modes are necessary to improve signal-to-noise ratios and to enable automatic speech recognition. Arrays with dynamic beamforming properties are required, especially for conference rooms. Signal processing has made possible stereo acoustic echo cancellers and many other signal enhancements that improve audio quality. Dr. West will discuss some of the current work on broadband communications at Avaya Labs.

James E. West is a Bell Laboratories Fellow and former member of the Acoustics and Speech Research Department at Lucent Technologies, specializing in electroacoustics, physical acoustics, and architectural acoustics. He is now a research scientist in the Multimedia Technologies Research Lab of Avaya. His pioneering research on charge storage and transport in polymers led to the development of electret transducers for sound recording and voice communication. Almost 90 percent of all microphones built today are based on the principles first published by West in the early 1960s. This simple but rugged transducer is the heart of most new telephones manufactured by Lucent and other producers of communication equipment. He is the recipient of the Callinan Award (1970), sponsored by the Electrochemical Society of America; the Senior Award (1970), sponsored by the IEEE Group on Acoustics; the Lewis Howard Latimer Light Switch and Socket Award (1989), sponsored by the National Patent Law Association; the George R. Stibitz Trophy, sponsored by the Third Annual AT&T Patent Award (1993); New Jersey Inventor of the Year for 1995; the Acoustical Society of America's Silver Medal in Engineering Acoustics (1995); an honorary Doctor of Science degree from New Jersey Institute of Technology (1997); the Golden Torch Award (1998), sponsored by the National Society of Black Engineers; the Industrial Research Institute's 1998 Achievement Award; and the Ronald H. Brown American Innovator Award (1999). The Acoustical Society of America has chosen one of his early papers on electret microphones as a Benchmark publication.

The lecture will be followed by a reception hosted by the Technical Council.

Special Event
TOUR OF CATHEDRAL OF OUR LADY OF THE ANGELS AND ORGAN RECITAL
Tuesday, October 8, 8:30 am–3:00 pm
Cathedral of Our Lady of the Angels
555 W. Temple Street
Los Angeles

Join with the American Institute of Organbuilders for a tour and series of lectures in the newly completed Cathedral of Our Lady of the Angels. The prime focus of the tour will be "Opus 75"—the Cathedral's new 105-rank, 6,019-pipe organ by Dobson Pipe Organ Builders (Lake City, Iowa). The tour also will feature a recital by the AES' "resident organist" Graham Blyth.

Included in the series of lectures will be "The Cathedral Organ: The Truth Revealed," by Manuel Rosales, consultant to the Cathedral for the organ project; "Design Adventures: Collaboration with Architects," by Lynn Dobson, President, Dobson Pipe Organ Builders; "How to Hold Up 55 Tons of Organ," by Jon Thieszen, a member of the organ design team; and "Entering the Tonal Unknown," by John Panning and Samuel Soria, Cathedral Organists (this will be a lecture and demonstration of the features of "Opus 75").

At noon, lunch will be provided in the Archdiocesan Conference Center, accompanied by a lecture, "Organ Structure: Preparing for the Big One!" John Seest will offer a look into the support of the Dobson Organ in the massive new concrete Cathedral and the attention given to the high earthquake design loads that make this a unique organ structure. Following the lunch and lecture, Graham Blyth will perform a recital on the Dobson Organ.

To conclude the tour, Dennis Paoletti and John Prohs, acoustical consultants for the Cathedral, will discuss some of the unique design features and challenges they faced in realizing this massive and complex project.

Cost for the tour, including lunch, is $45 per person. Reservations are required and must be made at the Special Events Desk no later than 4:00 pm Sunday, October 6. Convention attendees wanting to attend just the organ recital by Graham Blyth, but not the entire tour, are encouraged to check the poster in the Convention Center lobby for the starting time and additional details.

Attendance at the recital is free.

Session M Tuesday, October 8 9:00 am–12:00 noon
Room 404AB

PSYCHOACOUSTICS, PART 1

Chair: Christopher Struck, Dolby Laboratories, San Francisco, CA, USA

J. Audio Eng. Soc., Vol. 50, No. 11, 2002 November 973


Technical Program


9:00 am

M-1 Comparison of Objective Measures of Loudness Using Audio Program Material—Eric Benjamin, Dolby Laboratories, San Francisco, CA, USA

Level measurements do not necessarily correlate well with subjective loudness. Several methods are described for making objective measurements that are designed to correlate better with actual loudness. Some of these measures are: A-weighting and B-weighting, the methods of Stevens and Zwicker as described in ISO 532A and B, and the method described by Moore and Glasberg. All of these measures are intended to describe (measure) the loudness of sounds with constant spectra. How well do these measures work with typical audio signals distributed via broadcast or recording?
Convention Paper 5703
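As a point of reference for the frequency weightings named above, the standard A-weighting magnitude response has a simple closed form (this is the familiar IEC 61672 curve, not any of the paper's program-material measurements); a minimal sketch:

```python
import numpy as np

def a_weight_db(f):
    """Standard A-weighting curve in dB, normalized to 0 dB at 1 kHz."""
    f2 = np.asarray(f, dtype=float) ** 2
    ra = (12194.0**2 * f2**2) / (
        (f2 + 20.6**2)
        * np.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194.0**2)
    )
    return 20.0 * np.log10(ra) + 2.00   # +2.00 dB normalization at 1 kHz

for f in (100.0, 1000.0, 10000.0):
    # approximately -19.1 dB, 0.0 dB, and -2.5 dB respectively
    print(f"{f:7.0f} Hz: {float(a_weight_db(f)):+5.1f} dB")
```

A single weighting of this kind captures spectral sensitivity but not the level- and bandwidth-dependent effects that the Stevens, Zwicker, and Moore–Glasberg models address, which is precisely the comparison the paper takes up.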

9:30 am

M-2 Descriptive Analysis and Ideal Point Modelling of Speech Quality in Mobile Communication—Ville-Veikko Mattila, Nokia Research Center, Tampere, Finland

Descriptive analysis of processed speech quality was carried out by semantic differentiation, and external preference mapping was used to relate the attributes to overall quality judgments. Clean and noisy speech samples from different speakers were processed by various processing chains representing mobile communications, resulting in a total of 170 samples. The perceptual characteristics of the test samples were described by 18 screened subjects, and the final descriptive language with 21 attributes was developed in panel discussions. The scaled attributes were mapped to overall quality evaluations collected from 30 screened subjects by partial least squares regression. Phase II ideal point modeling was used to predict the quality with an average error of about 6 percent and to study the linearity of the attributes.
Convention Paper 5704

10:00 am

M-3 Relating Multilingual Semantic Scales to a Common Timbre Space—William L. Martens, Charith N. W. Giragama, University of Aizu, Aizu-Wakamatsu, Fukushima-ken, Japan

A single, common perceptual space for a small set of processed guitar timbres was derived for two groups of listeners, one group comprising native speakers of the Japanese language, the other group comprising native speakers of Sinhala, a language of Sri Lanka. Subsets of these groups made ratings on 13 bipolar adjective scales for the same set of sounds, each of the two groups using anchoring adjectives taken from their native language. The adjectives were those freely chosen most often in a preliminary elicitation. The results showed that the Japanese and Sinhalese semantic scales related differently to the dimensions of their shared timbre space, which was derived using MDS analysis of the combined dissimilarity ratings of listeners from both groups.
Convention Paper 5705

10:30 am

M-4 Design and Evaluation of Binaural Cue Coding Schemes—Frank Baumgarte, Christof Faller, Agere Systems, Murray Hill, NJ, USA

Binaural cue coding (BCC) offers a compact parametric representation of auditory spatial information such as localization cues inherent in multichannel audio signals. BCC allows reconstruction of the spatial image given a mono signal and spatial cues that require a very low rate of a few kbit/s. This paper reviews relevant auditory perception phenomena exploited by BCC. The BCC core processing scheme design is discussed from a psychoacoustic point of view. This approach leads to a BCC implementation based on binaural perception models. The audio quality of this implementation is compared with low-complexity FFT-based BCC schemes presented earlier. Furthermore, spatial equalization schemes are introduced to optimize the auditory spatial image of loudspeaker or headphone presentation.
Convention Paper 5706
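To make the idea of low-rate spatial cues concrete, here is a toy, hypothetical BCC-style analysis, not the implementation described in the paper: the stereo input is reduced to a mono downmix plus per-frame, per-band interchannel level differences (ILDs). The frame size, the uniform band grouping, and the ILD-only cue set are simplifying assumptions; a real coder also transmits time-difference and coherence cues over perceptually motivated bands.

```python
import numpy as np

def bcc_analysis(left, right, frame=256, nbands=4):
    """Toy BCC-style analysis: mono downmix plus per-frame, per-band
    interchannel level differences (ILDs) in dB."""
    eps = 1e-12
    n = (len(left) // frame) * frame
    L = np.fft.rfft(left[:n].reshape(-1, frame), axis=1)
    R = np.fft.rfft(right[:n].reshape(-1, frame), axis=1)
    # group FFT bins into a few coarse bands (uniform here for simplicity)
    edges = np.linspace(0, L.shape[1], nbands + 1, dtype=int)
    ild = np.stack(
        [10.0 * np.log10(((np.abs(L[:, a:b]) ** 2).sum(axis=1) + eps)
                         / ((np.abs(R[:, a:b]) ** 2).sum(axis=1) + eps))
         for a, b in zip(edges[:-1], edges[1:])],
        axis=1,
    )
    mono = 0.5 * (left[:n] + right[:n])
    return mono, ild

# Right channel 6 dB below left: every band should report about +6 dB.
t = np.arange(4096) / 48000.0
left = np.sin(2.0 * np.pi * 440.0 * t)
right = 0.5 * left
mono, ild = bcc_analysis(left, right)
print(np.round(float(ild.mean()), 1))  # about 6.0
```

The cue array is what would be quantized and transmitted at a few kbit/s alongside the coded mono downmix; synthesis would reapply the level differences per band.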

11:00 am

M-5 Evaluating Influences of a Central Automotive Loudspeaker on Perceived Spatial Attributes Using a Graphical Assessment Language—Natanya Ford1, Francis Rumsey1, Tim Nind2

1University of Surrey, Guildford, Surrey, UK
2Harman/Becker Automotive Systems, Martinsville, IN, USA

An investigation is described which further develops a graphical assessment language (GAL) for subjectively evaluating spatial attributes of audio reproductions. Two groups of listeners, those with previous experience of using a GAL and listeners new to graphical elicitation, were involved in the study, which considered the influence of a central automotive loudspeaker on listeners' perception of ensemble width, instrument focus, and image skew. Listeners represented these attributes from both driver's and passenger's seats using their own graphical descriptors. Source material for the study consisted of simple instrumental and vocal sources chosen for their spectral and temporal characteristics. Sources ranged from a sustained cello melody to percussion and speech extracts. When analyzed using conventional statistical methods, responses highlighted differences in listeners' perceptions of width, focus, and skew for the various experimental conditions.
Convention Paper 5707

11:30 am

M-6 Multidimensional Perceptual Calibration for Distortion Effects Processing Software—Marui Atsushi, William L. Martens, University of Aizu, Aizuwakamatsu-shi, Fukushima-ken, Japan

Controlled nonlinear distortion effects processing produces a wide range of musically useful outputs, especially in the production of popular guitar sounds. But systematic control of distortion effects has been difficult to attain, due to the complex interaction of input gain, drive level, and tone controls. Rather than attempting to calibrate the output of commercial effects processing hardware, which typically employs proprietary distortion algorithms, a real-time software-based distortion effects processor was implemented and tested. Three distortion effect types were modeled using both waveshaping and a second-order filter to provide more complete control over the parameters typically manipulated in controlling effects for electric guitars. The motivation was to relate perceptual differences between effects processing outputs and the mathematical functions describing the nonlinear waveshaping producing variation in distortion. Perceptual calibration entailed listening sessions where listeners adjusted the tone of each of nine test outputs, and then made both pairwise dissimilarity ratings and attribute ratings for those nine stimuli. The results provide a basis for an effects-processing interface that is perceptually calibrated for system users.
Convention Paper 5708
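The waveshaper-plus-filter structure described in the abstract can be sketched in a few lines. This toy version, a tanh waveshaper with a drive control followed by a one-pole low-pass "tone" filter, is only an assumed stand-in: the paper's actual waveshaping functions and its second-order filter are not specified here. Because tanh is odd-symmetric, a pure input tone produces odd harmonics only.

```python
import numpy as np

def distort(x, drive=8.0, tone_hz=3000.0, fs=48000.0):
    """Toy guitar distortion: normalized tanh waveshaper (drive control)
    followed by a one-pole low-pass 'tone' filter."""
    y = np.tanh(drive * x) / np.tanh(drive)     # waveshaping nonlinearity
    a = np.exp(-2.0 * np.pi * tone_hz / fs)     # one-pole filter coefficient
    out = np.empty_like(y)
    state = 0.0
    for i, v in enumerate(y):                   # out[n] = (1-a)y[n] + a*out[n-1]
        state = (1.0 - a) * v + a * state
        out[i] = state
    return out

# Drive a 440-Hz tone and compare 2nd- vs 3rd-harmonic energy.
fs = 48000.0
t = np.arange(4800) / fs                        # exactly 44 cycles of 440 Hz
x = 0.5 * np.sin(2.0 * np.pi * 440.0 * t)
spec = np.abs(np.fft.rfft(distort(x)))
freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
h2 = spec[np.argmin(np.abs(freqs - 880.0))]
h3 = spec[np.argmin(np.abs(freqs - 1320.0))]
print(h3 > 100.0 * h2)  # True: odd-symmetric shaping yields odd harmonics
```

Varying `drive` and `tone_hz` traces out the same kind of two-parameter stimulus grid that a perceptual calibration experiment could sample.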

Workshop 13 Tuesday, October 8 9:00 am–12:00 noon
Room 408B

PERCEPTUAL ISSUES RELATED TO CASCADED AUDIO CODECS

Chair: Thomas Sporer, Fraunhofer Institute for Integrated Circuits, Ilmenau, Germany

Panelists: Louis Fielder, Dolby Laboratories, Inc., San Francisco, USA
Jürgen Herre, Fraunhofer IIS, Erlangen, Germany
Representatives from IRT, Munich, Germany, and the BBC, UK

Digital processing seemed to solve the problem of reduced audio quality due to copying forever, but today several stages of the transmission chain include perceptual audio coding schemes such as MPEG-2 Layer-2 and Layer-3, MPEG-4 AAC, AC-3, and ATRAC. This mixture of different coding formats leads to significant accumulation of coder distortions in each encoding step (so-called tandem coding) and associated serious loss of subjective audio quality. Additional problems can be caused by processing such as equalization and compression.

In this workshop typical usage scenarios will be presented and discussed. Results from listening tests conducted by the ITU and EBU over the last few years will be summarized. Different solutions to reduce tandem artifacts and to increase robustness will be presented.

Workshop 14 Tuesday, October 8 9:00 am–12:00 noon
Room 408A

THE APPLICATION OF MULTICHANNEL SOUND FORMATS IN VEHICLES

Chair: Richard Stroud, Stroud Audio, Inc., Kokomo, IN, USA

Panelists: David Clark
Neal House
Mark Ziemba

Multichannel audio has made its way into the car. Many more applications of this technology will doubtless appear in new vehicles. Workshop presenters will discuss vehicular multichannel audio installation design goals, system development, available source material, recording considerations, and evaluation methodologies. Demonstration vehicles for multichannel listening will be present.

Special Event
AUDIO POST PRODUCTION IN 24p HDTV AND RELATED FORMATS
Tuesday, October 8, 10:00 am–11:30 am
Room 403A

Moderator: Dennis Weinreich, Videosonics Cinema Sound, London, UK

Panelists: Colin Broad, CB Electronics
Doug Ford, Skywalker Sound
Robert Predovich, Soundmaster Group
Scott Wood, Digidesign/Avid Technologies

The challenge of new technologies has significant impact on our professional workflow. As cutting-edge sound people we must be able to understand how best to utilize these technologies to benefit most from them—both technically and creatively. We need to understand what the technology is attempting to address and how we should use it to our best advantage. Some of us will try to find where it fits into our existing workflow. Others will see new creative opportunities. Many advances require complete rethinking of how we approach our job in order to benefit from them.

HD 24p is one such advance that will have significant impact on how we do our jobs. Currently, the lighter and seemingly less complex production tools cause filmmakers to think that HD will give a better look and sound to production with less effort. In time, after the advantages are weighed against the disadvantages, we will probably find ourselves working in ways we currently cannot foresee.

Our panel will discuss the present position on working with HD and 24p, with particular emphasis on Audio Post Production. We will look at issues from the production floor to sound editorial to the dubbing stage. The panel will be made up of contributors from Post facilities, manufacturers' representatives, and professionals in the field who can share their HD 24p experiences and enlighten us about this new and interesting technology.

This event is being presented by the APRS (UK) and co-hosted by the AES.

Education Event
STUDENT DELEGATE ASSEMBLY 2
Tuesday, October 8, 10:00 am–11:30 pm
Room 402A

Chair: Scott Cannon, Stanford University Student Section, Stanford, CA, USA

Vice Chair: Dell Harris, Hampton University Student Section, Hampton, VA, USA

At this meeting the SDA will elect new officers. One vote will be cast by the designated representative from each recognized AES student section in the North and Latin America Regions. Judges' comments and awards will be presented for the Recording Competitions and the Student Poster Session. Plans for future student activities at local, regional, and international levels will be summarized.

Special Event
ROAD WARRIORS PANEL: TRUE STORIES FROM THE FRONT LINES OF SOUND REINFORCEMENT
Tuesday, October 8, 12:30 pm–2:00 pm
Room 403A

Introduction: Paul Gallo, founder of Pro Sound News, New York, NY, USA

Moderators: Steve Harvey, Clive Young, Pro Sound News, New York, NY, USA

Participants: Greg Dean
Kirk Kelsey
David Morgan

Introduced by Paul Gallo, publishing veteran and long-time friend to the live sound industry, this freewheeling panel of touring professionals will cover the latest trends, techniques, and tools that shape modern sound reinforcement. Moderators Steve Harvey and Clive Young will guide the all-star panel across subjects ranging from gear to gossip, in what promises to be an entertaining and educational 90 minutes, with the engineers on the business side of the microphone saying something besides "testing" and "check" for a change!

Special Event
ORGAN RECITAL
Tuesday, October 8, 1:00 pm–2:00 pm
Cathedral of Our Lady of the Angels
555 W. Temple Street
Los Angeles

Organist: Graham Blyth

Visit the newly completed Dobson Organ at the new Cathedral of Our Lady of the Angels in downtown Los Angeles. The specially organized event will feature a lunchtime performance by Graham Blyth, the AES' resident organ recitalist. His program will include: Antonio Soler's "The Emperor's Fanfare," César Franck's "Chorale No. 3 in A Minor," Olivier Messiaen's "Dieu Parmi Nous," plus works by J. S. Bach and Louis Vierne. Mr. Blyth's performance will be preceded by a series of presentations from the design and installation team for the Dobson Organ, and will be followed by a brief lecture from Dennis Paoletti and John Prohs, the Cathedral's acoustical consultants.

Graham Blyth received his early musical training as a Junior Exhibitioner at Trinity College of Music in London, England. Subsequently, at Bristol University, he took up conducting, performing Bach's St. Matthew Passion before he was 21. He holds diplomas in Organ Performance from the Royal College of Organists, the Royal College of Music, and the Trinity College of Music. In the late 1980s he renewed his studies with Sulemita Aronowsky for piano and with Robert Munns for organ.

Mr. Blyth made his international debut with an organ recital at St. Thomas Church, New York, in 1993, and since then has played in San Francisco (Grace Cathedral), Los Angeles, Amsterdam, Copenhagen, Munich, and Paris (Madeleine Church). He gives numerous concerts each year, principally as an organist and a pianist, but also as a conductor and a harpsichord player.

Mr. Blyth is founder and technical director of Soundcraft. He divides his time between his main career as a designer of professional audio equipment and organ-related activities. He has lived in Wantage, Oxfordshire, U.K., since 1984, where he is currently artistic director of the Wantage Chamber Concerts and director of the Wantage Festival of Arts. He is also founder and conductor of the Challow Chamber Singers & Players. He is involved with Musicom Ltd., a British company at the leading edge of pipe organ control system and digital pipe synthesis design. He also acts as tonal consultant to the Saville Organ Company and is recognized as one of the leading voicers of digital pipe modeling systems.

Session N Tuesday, October 8 1:00 pm–4:00 pm
Room 404AB

PSYCHOACOUSTICS, PART 2

Chair: Nick Zacharov, Nokia Research Center, Tampere, Finland

1:00 pm

N-1 Industry Evaluation of In-Band On-Channel Digital Audio Broadcast Systems—David Wilson, Consumer Electronics Association, Arlington, VA, USA

The National Radio Systems Committee's testing and evaluation program for in-band on-channel digital audio broadcast systems is described. The results of laboratory and field tests performed during 2001 on iBiquity Digital Corporation's AM-band and FM-band IBOC DAB systems are reported. The conclusions drawn from the laboratory and field test results are also reported, and implications for the future are discussed.
Convention Paper 5709

1:30 pm

N-2 Comparisons of De Facto and MPEG Standard Audio Codecs in Sound Quality—Eunmi L. Oh, JungHoe Kim, Samsung Advanced Institute of Technology, Suwon, Korea

The current paper is concerned with assessing the sound quality of various audio codecs, including ubiquitous de facto standards. Formal listening tests were conducted based on ITU-R Recommendation BS.1116 in order to provide an objective measure of sound quality. Codecs tested included de facto standards that were commercially and noncommercially available and the MPEG general audio codec. In addition, our recently updated codec was tested. Test items consisted of usual MPEG test sequences and other sensitive sound excerpts at bit rates of 64- and 96-kb/s stereo. Experimental results show that the sound quality of our newest codec outpaces that of most of the other codecs.
Convention Paper 5710

2:00 pm

N-3 Evaluating Digital Audio Artifacts with PEAQ—Eric Benjamin, Dolby Laboratories, San Francisco, CA, USA

Portions of the digital audio chain have been incrementally improved by development, such that objective specifications indicate a very high level of performance. Subjective reviews of these components often claim to observe substantial differences between products. This investigation uses the tool PEAQ (perceptual evaluation of audio quality) to measure the audio degradation caused by analog-to-digital converters, digital-to-analog converters, and sample rate conversion, and also to measure the minute incremental changes of codec audio quality that accompany very small changes in data rate.
Convention Paper 5711

2:30 pm

N-4 The Use of Head-and-Torso Models for Improved Spatial Sound Synthesis—Ralph Algazi, Richard O. Duda, Dennis M. Thompson, University of California, Davis, CA, USA

This paper concerns the use of a simple head-and-torso model to correct deficiencies in the low-frequency behavior of experimentally measured head-related transfer functions (HRTFs). This so-called snowman model consists of a spherical head located above a spherical torso. In addition to providing improved low-frequency response for music reproduction, the model provides the major low-frequency localization cues, including cues for low-elevation as well as high-elevation sources. The model HRTF and the measured HRTF can be easily combined by using the phase response of the model at all frequencies and by cross-fading between the dB magnitude responses of the model and the measurements. For efficient implementation, the exact snowman HRTF is approximated by two time delays and two first-order IIR filters. Because the poles are independent of the location of the virtual source, this supports a simple real-time implementation that allows for arbitrarily rapid head and source motion.
Convention Paper 5712

3:00 pm

N-5 Three-Dimensional Headphone Sound Reproduction Based on Active Noise Cancellation—Daniël Schobben, Ronald Aarts, Philips Research Laboratories, Eindhoven, The Netherlands

Headphone signal processing systems that are commercially available today are not optimized for the individual listener. This results in large localization errors for most listeners. In this paper a system is introduced that requires a one-time calibration procedure, which can be carried out conveniently by the listener. This system consists of conventional headphones into which small microphones have been mounted. An active noise cancellation method is used to achieve a sound reproduction via headphones which is as close as possible to a reference loudspeaker setup. The active noise cancellation system is based on adaptive filters that are implemented in the frequency domain.
Convention Paper 5713

3:30 pm

N-6 On the Design of Canonical Sound Localization Environments—Eric J. Angel, Ralph Algazi, Richard O. Duda, University of California, Davis, CA, USA

This paper addresses the design of virtual auditory spaces that optimize the localization of sound sources under engineering constraints. Such a design incorporates some critical cues commonly provided by rooms and by head motion. Different designs are evaluated by psychoacoustic tests with several subjects. Localization accuracy is measured by the azimuth and elevation errors and the front/back confusion rate. We present a methodology and results for some simple canonical environments that optimize the localization of sounds.
Convention Paper 5714

Workshop 15 Tuesday, October 8 1:00 pm–4:00 pm
Room 408A

CODING OF SPATIAL AUDIO—YESTERDAY, TODAY, TOMORROW

Chair: Christof Faller, Agere Systems, Murray Hill, NJ, USA

113th Convention Papers and CD-ROM

Convention Papers of many of the presentations given at the 113th Convention and a CD-ROM containing the 113th Convention Papers are available from the Audio Engineering Society. Prices follow:

113th CONVENTION PAPERS (single copy)
Member: $5 (USA) or £3.50 (UK)
Nonmember: $10 (USA) or £7.00 (UK)

113th COMPLETE SET (78 papers)
Member: $100 (USA) or £66 (UK)
Nonmember: $150 (USA) or £99 (UK)

113th CD-ROM (78 papers)
Member: $100 (USA) or £66 (UK)
Nonmember: $150 (USA) or £99 (UK)

113th CONVENTION PAPER SET AND CD-ROM (78 papers, each)
Member: $150 (USA) or £99 (UK)
Nonmember: $225 (USA) or £149 (UK)

A complete list of the 113th Convention Papers with a convenient order form is inserted in this issue. In addition, audio cassettes of the papers, workshops, and opening ceremonies are available from Mobiltape and may be purchased by using the enclosed order form.

113th Convention
2002 October 5–October 8
Los Angeles

Presented at the 113th Convention, 2002 October 5–October 8 • Los Angeles


Panelists: Frank Baumgarte, Agere Systems, Murray Hill, NJ, USA
Mark Davis, Dolby, San Francisco, CA, USA
Martin Dietz, Coding Technologies, Nürnberg, Germany
Gerald Schuller, Thomas Sporer, Fraunhofer Arbeitsgruppe Elektronische Medientechnologie AEMT, and Ilmenau University, Ilmenau, Germany

Low bit-rate audio coding has become ubiquitous in many of today's audio systems, most of which are able to handle stereo or multichannel audio signals. A closer examination of issues related to the compression of spatial audio reveals a number of complex perceptual and coding issues that need to be considered in order to achieve optimum coder performance. While a wealth of different approaches has evolved over the recent decade, there is no single technique serving all purposes equally well. This workshop will provide a review of the principles and practical approaches for coding of spatial audio, a discussion of the different dimensions of the bit rate vs. spatial quality tradeoff, and a characterization of commonly used coders with respect to these aspects.

Introduction, by Christof Faller

Spatial Perception, by Thomas Sporer

This talk will review spatial perception that is relevant for reproduction and coding of spatial audio.

Historical Development of Spatial Audio Reproduction and Transmission, by Mark Davis

The historical development of stereophonic and multichannel audio reproduction will be presented. It includes a description of early experiments for multiloudspeaker and two-loudspeaker stereophony, and the early techniques for pseudostereophony, etc. For the historical development of spatial audio transmission, sum/difference transmission for FM radio stereophony will be described, and new techniques based on perceptual audio coding will be briefly mentioned.

Coding of Stereophonic Signals, by Gerald Schuller

This talk will focus on traditional ways of encoding stereo audio signals, such as separate coding (which fails for certain signals), L/R coding, S/D coding and BMLD considerations, redundancy reduction/irrelevancy reduction, intensity stereo, and the matrices used in popular coders.
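The sum/difference (S/D, often called mid/side) transform mentioned in this abstract is simple enough to sketch. The fragment below is illustrative (the function names are mine, not from the talk): the transform itself is lossless, and for correlated L/R material it concentrates energy in the sum channel, letting a coder spend fewer bits on the difference channel.

```python
# Minimal sketch of sum/difference (mid/side) stereo coding.
# The transform is exactly invertible; the coding gain comes from
# the side channel carrying little energy for correlated material.

def ms_encode(left, right):
    """Transform L/R sample lists into sum (mid) and difference (side)."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Invert the transform exactly."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Round-trip check on a short correlated signal.
L = [0.5, 0.25, -0.1, 0.0]
R = [0.4, 0.25, -0.2, 0.1]
M, S = ms_encode(L, R)
L2, R2 = ms_decode(M, S)
assert all(abs(a - b) < 1e-12 for a, b in zip(L, L2))
assert all(abs(a - b) < 1e-12 for a, b in zip(R, R2))
```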

Parametric Stereo for Very Low Bit-Rate Stereo Coding, byMartin Dietz

This talk will describe the idea of parametric stereo coding: a mono signal, downmixed from the stereo source and compressed by powerful modern coding algorithms, is converted back into stereophonic sound by means of a coded parametric description of the spatial properties of the original signal.
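The parametric idea can be caricatured in a few lines. The sketch below is entirely illustrative (real parametric stereo coders transmit level, time, and coherence cues per frequency band, not one broadband ratio): the encoder keeps a mono downmix plus one coarse level-difference parameter per block, and the decoder pans the mono signal back out.

```python
# Toy model of parametric stereo: mono downmix + per-block level ratio.
# Ignores phase and coherence cues that real systems also carry.

def encode(left, right, block=4):
    mono, params = [], []
    for i in range(0, len(left), block):
        l, r = left[i:i + block], right[i:i + block]
        mono.extend((a + b) / 2.0 for a, b in zip(l, r))
        el = sum(a * a for a in l)        # left-channel energy
        er = sum(b * b for b in r)        # right-channel energy
        # level ratio in [0, 1]; 0.5 means centered
        params.append(el / (el + er) if (el + er) > 0 else 0.5)
    return mono, params

def decode(mono, params, block=4):
    left, right = [], []
    for i, m in enumerate(mono):
        p = params[i // block]
        left.append(m * 2.0 * p)          # crude amplitude panning
        right.append(m * 2.0 * (1.0 - p))
    return left, right
```

Fully panned material round-trips exactly in this toy model; partially correlated material is only approximated, which is the quality tradeoff the talk describes.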

Future Directions for Coding of Spatial Audio, by Frank Baumgarte

The technique and philosophy behind binaural cue coding (BCC) and other potential forward-looking technologies for coding of spatial audio will be described.

Education Event
STUDENT PAPER SESSION
Tuesday, October 8, 1:00 pm–3:00 pm
Room 406AB

The paper session provides an opportunity for students to present their research work on all subjects covered by the AES. Professionals from the audio industry will be invited to discuss the students' projects.

Special Event
THE VIRTUAL STUDIO: DAW-BASED RECORDING
Tuesday, October 8, 2:30 pm–5:30 pm
Room 403A

Moderator: Frank Wells, Pro Sound News, New York, NY, USA

Panelists: David Channing, Independent Engineer
Lynn Fuston, 3D Audio
Jim Kaiser, Mastermix
Nathaniel Kunkel, Independent Engineer
Ralf Schluezen, TC Works
Rich Tozzoli, Independent Engineer

Every day, more of the recording and post-production studio infrastructure collapses into the frame of a computer. This two-part event will focus on technical concerns unique to DAW-based recording, as well as present power-user application techniques.

Bits and Bytes and Bottlenecks

Inside a DAW, music exists as data, but a special kind of data that is sensitive to issues like word length, processor resolution, headroom, and delay compensation. These aren't the sexy parts of the feature set, to be sure, but attention to the details can noticeably and measurably improve the final product. Our presentations will focus on some of the core issues that require attention.

Beyond the Basics

While a DAW can function as a simple recorder or editor, with plug-ins and other DSP tricks at its disposal it can also manipulate audio in sophisticated ways that were hitherto impossible. Power users will present advanced, cross-platform techniques for the DAW user.

Moderator Frank Wells came to the recording industry after years of work as a radio broadcast engineer. Following nearly a decade as Chief of Technical Services for Masterfonics, Nashville, he defected to the world of trade journalism, first as the founding editor of Audio Media, USA, and currently editor of Pro Sound News and executive editor of Surround Professional. He is also the current chair of the AES Nashville Section.

David Channing has engineered and/or edited two Elton John records, the new Duncan Sheik record, Jewel's "Spirit," and projects with Enrique Iglesias, Melissa Etheridge, Shawn Colvin, and many more.

Lynn Fuston is owner/chief engineer of 3D Audio, with engineering credits that include Amy Grant, Mark O'Connor, DC Talk, Lee Greenwood, and the Newsboys. He has also produced a series of microphone and microphone-preamplifier comparison CDs, and will soon release an A/D comparison disc.

Jim Kaiser is VP Central Region for the AES and Director of Technology at Mastermix, Nashville. His background includes session engineering and technical consulting, including full-facility technical design.

Nathaniel Kunkel is known for a broad body of recording work that includes CSNY, Little Feat, Linda Ronstadt, and Lyle Lovett, including extensive work in 5.1 for DVD-A and Super Audio CD (the Graham Nash project Songs for Survivors and the 5.1 remix of James Taylor's J.T.).

Ralf Schluezen is CEO of TC Works, the software division of TC Electronic, based in Hamburg, Germany. He has extensive experience in plug-in design, including a stint with Steinberg before joining TC.

Rich Tozzoli is a New York-based engineer whose long list of credits includes extensive surround work with the likes of David Bowie, Foghat, and Average White Band, along with numerous live classical recordings. He is also a contributing editor with Surround Professional.



Wireless networking, using radio frequency links instead of wires, is becoming increasingly popular as a means of connecting digital devices and computers. It provides greater flexibility and mobility to the user and does away with those tangled strands of wires. And wireless networks can be integrated with wired systems, thereby acting as extensions of existing networks. Audio information is increasingly transferred over networks instead of dedicated digital interfaces or analog cabling, making it important that audio engineers have an appreciation of the issues involved.

Bluetooth is a technology for wireless personal area networks (WPANs) that has been widely publicized recently, but it is only one of a number of options for wireless data communication. This article puts Bluetooth into context and describes wireless data communications in relation to audio systems. This field is changing fast, like most aspects of computing, making it hard for standardization to keep pace with technology development. A number of wireless networking standards have been or are being standardized by the IEEE, but alternative solutions also exist that overlap with or differ from IEEE standards. As a rule, it seems that the IEEE has attempted where possible to adopt commercial technology in response to calls for suitable solutions.

WIRELESS PERSONAL AREA NETWORKS

WPANs connect desktop digital devices, usually over short distances within a so-called personal operating space of 10 m enveloping the person. WPANs are intended mainly for indoor and fixed outdoor applications and operate at data rates of up to about 1 Mbit/s. They are a useful next step from infrared communications. Bluetooth is the prime example of a WPAN. HomeRF is potentially similar, but it sits somewhere between a PAN and a LAN (local area network). Fig. 1 shows an example of the interrelationship between PANs, LANs, and WANs. WPANs are broadly covered in IEEE standards work by the 802.15 groups. The primary concerns when designing successful WPAN technology are power consumption, simplicity, and product size.

BLUETOOTH and wireless networking: a primer for audio engineers

Fig. 1. Example of WPAN, WLAN, and WWAN interacting.



WIRELESS LOCAL AREA NETWORKS

WLANs are used over larger distances, up to 500 m, in areas where people may move around with digital devices but usually remain stationary while using the devices. They typically operate around hot-spot access points, for example in offices, homes, and schools. Data rates are typically higher than WPANs, between 2 and about 50 Mbit/s. WLANs are covered in IEEE standards activity by the 802.11 groups. Typically above this data rate are WMANs (wireless metropolitan area networks), which interconnect LANs over large regions.

WIRELESS WIDE AREA NETWORKS

WWANs have more to do with global telecommunications systems, for example mobile phone operators using GPRS (General Packet Radio Service) equipment, and are mainly designed for applications in which people move around while using devices. They are not featured in this article, as they have little to do with potential audio applications in consumer and professional environments.

General radio frequency issues

Wireless networks typically use low-power, spread-spectrum, and frequency-hopping techniques to ensure robust communications in the face of interference and fading. They typically use coding techniques to make the data signal look like noise or very short-term impulsive interference to a narrow-band receiver in the radio frequency (RF) domain, as shown in Fig. 2. Most current systems operate in a license-free region of the RF spectrum between 2.4 and 5 GHz, relying principally on the low-power and interference-avoiding mechanisms of systems to operate satisfactorily in this free-for-all region.

The 2.4-GHz band is becoming increasingly crowded owing to Bluetooth devices, wireless networks, and telephones, so there can be significant RF interference in this band. The 5-GHz band is somewhat quieter in the United States, but signals have a much shorter range at this frequency and are more easily obstructed by walls, doors, and other objects. Communication therefore tends to be over line-of-sight paths or short distances. Some chipsets are appearing that will operate in both bands for IEEE 802.11 standards. The issue of interference is covered in greater detail later in this article.

EXAMPLES OF WPAN TECHNOLOGY

Bluetooth operates in the 2.4-GHz band at data rates up to 1 Mbit/s over distances of up to 10 m or 100 m depending on the implementation. A relatively simple binary FM modulation method is used because Bluetooth has to be implementable in basic devices. Communication between devices can be either point-to-point or point-to-multipoint. In the latter case one device acts as master, synchronizing, controlling, and sharing the channel with as many as seven slaves in a so-called piconet of up to eight devices. Frequency hopping between 23 or 79 RF channels is employed to increase robustness, and data packets are each transmitted on a different hop frequency. The IEEE 802.15.1 WPAN standard is based on Bluetooth.

The primary modes for data communication are ACL (asynchronous connectionless) links and SCO (synchronous connection-oriented) links. The latter are limited to a data rate of 64 kbit/s each, and up to three concurrent SCO links are allowed per Bluetooth channel. They are set up between a master and a single Bluetooth device at a time for time-critical purposes such as voice audio. SCO links are established as real-time links between two points for the duration of the connection. Packets are never retransmitted, owing to their real-time nature. In the time remaining within each transmission slot the master can set up an asynchronous communication with one or more slaves. ACL communications operate rather like conventional packet-switched networks, with the packet header being used to indicate the destination on the piconet.

It is possible to operate either one asynchronous and three synchronous voice channels (bidirectional) or one channel that handles both asynchronous data and synchronous voice


Fig. 2. Wireless networks use low-power, spread-spectrum, and frequency-hopping techniques for robust communications.



(this is essentially made up of single SCO packets with two parts). The maximum asynchronous data rate is 723.2 kbit/s in one direction, with a return rate of 57.6 kbit/s, or 433.3 kbit/s symmetrically.
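The asymmetric figures can be reproduced from slot arithmetic, assuming the payload sizes given in the Bluetooth 1.1 specification: a Bluetooth slot lasts 625 microseconds, a 5-slot DH5 packet carries 339 payload bytes, and the 1-slot DH1 return carries 27 bytes.

```python
# Quick check of the asymmetric ACL data rates quoted above.
# Payload sizes (DH5 = 339 bytes, DH1 = 27 bytes) are assumptions
# taken from the Bluetooth 1.1 baseband specification.

SLOT_S = 625e-6          # one Bluetooth time slot, in seconds
CYCLE_S = 6 * SLOT_S     # 5 forward slots + 1 return slot = 3.75 ms

def rate_kbits(payload_bytes, cycle_s=CYCLE_S):
    """Payload bits delivered per cycle, expressed in kbit/s."""
    return payload_bytes * 8 / cycle_s / 1000.0

forward = rate_kbits(339)   # one DH5 payload per cycle
reverse = rate_kbits(27)    # one DH1 payload per cycle
assert abs(forward - 723.2) < 1e-6
assert abs(reverse - 57.6) < 1e-6
```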

In addition to the Bluetooth Core Specification that describes the principles of communication, the Bluetooth SIG (special interest group) publishes a number of profile documents that describe specific approaches to data transfer for particular applications. The documents of primary interest to audio engineers are the Generic Audio/Video Distribution Profile (GAVDP) and the Advanced Audio Distribution Profile (A2DP). These are discussed later in this article.

The long-term future of Bluetooth is not certain, but the same can be said of virtually any data communications technology today. For example, Apple and a number of mobile-telephone manufacturers have significant investment in Bluetooth, and this may help to ensure its future.

Many commentators appear enthusiastic about the market potential for the technology. It is a relatively low-rate system but is intended for low-cost, low-power implementation in highly portable devices. Some commentators cite the relative advantages of broadband technology such as UWB (ultra-wideband) that may ultimately become more useful in high-rate multimedia environments. Some concern has also been expressed about the interoperability of Bluetooth devices, as peer-to-peer networking interoperability is apparently not mandatory for device certification. Success may therefore come down to a question of which Bluetooth devices actually work together and at what level of sophistication.

High-rate and ultra-wideband systems

The IEEE 802.15.3 group is studying high-rate applications above 20 Mbit/s. These could ultimately be up to 100 times faster than Bluetooth. UWB is also being developed for applications requiring very high data rates. UWB uses very wideband transmissions at extremely low power, requiring receivers and transmitters to be tightly synchronized. The U.S. Federal Communications Commission (FCC) has approved UWB within certain low-power limits, but other countries have not yet done the same. IEEE 802.15 Study Group 3a is studying alternative physical layer options for high-rate WPANs.

Low-rate, low-power-consumption systems

IEEE 802.15.4 is devising a lower rate standard for simple devices, such as joysticks, that do not need high data rates but need small size and long battery life. Data rates of 20, 40, and 250 kbit/s are targeted for use within RF bands at 2.4 GHz, 915 MHz, and 868 MHz. This clearly lies below the typical Bluetooth application range of data rates, whereas high-rate and UWB systems lie above the Bluetooth range, as shown in Fig. 3.

EXAMPLES OF WLAN TECHNOLOGY

Broadband WLAN technology is developing so fast that the regulatory and commercial situation changes almost weekly. However, the following is an attempt to summarize the current situation as clearly as possible.

IEEE 802.11

IEEE 802.11a and IEEE 802.11b are wireless LAN standards and the basis of wireless Ethernet or WiFi. WiFi is a mark for compatible products awarded by the Wireless Ethernet Compatibility Alliance (WECA). Apple uses this technology in its AirPort products, as do many of the consumer wireless network systems currently marketed in computer stores.

The two standards define different physical layers, where 802.11b transmits at 2.4 GHz and 802.11a transmits at 5 GHz. The standards do not interoperate directly with each other, although bridges between them are possible and chipsets are appearing that operate in either mode. 802.11 deals with the physical and MAC (medium access control) layers of the networking protocol, but the remaining layers are virtually identical to Ethernet, which is IEEE 802.3. In network terminology, layers are different levels in the


Fig. 3. Data rates of common wireless technologies.



somewhat arcane hierarchy of data communications, with the application at the top and the physical medium at the bottom. As with cell phones, users can roam between wireless access points (APs), using the beacon of the AP as a judge of signal strength. Collision avoidance and carrier detection are used for medium access, rather like Ethernet, and authentication or full encryption can be used for security purposes.

Both FHSS (frequency hopping spread spectrum) and DSSS (direct sequence spread spectrum) approaches are specified for 1- and 2-Mbit/s rates, but only DSSS is allowed for 11-Mbit/s communications. 5.5 Mbit/s is also allowed, and even higher bit-rate standards were approved in 1999. In fact, it appears that most developers actually went for DSSS even at the lower rates, for future compatibility with the higher rates. Generally, the higher the radio frequency (and therefore the greater the potential for high data rates), the shorter the range that may be covered. For example, the 802.11a standard can operate at 54 Mbit/s but only within a 50-meter range. Alternatively, 802.11b operates at a maximum speed of 11 Mbit/s but can cover 300 meters outdoors (100 meters indoors). As with most packet-switched RF networks, contention, interference, and overheads can reduce the realistic data rate closer to half the maximum.

Contrary to what may have been anticipated, 802.11b is already in wide use, but 802.11a products are only just beginning to appear. In the UK an internal agreement is being formulated to allow 802.11a to operate in a limited fashion in the region from 5.15 to 5.25 GHz. This allows four access points as opposed to the eight possible if the system occupies the full band from 5.15 to 5.35 GHz. Full approval from the European Telecommunications Standards Institute (ETSI) is not yet finalized because of military and government use of the 5-GHz band. But other countries are striking individual agreements rather like the UK. Dynamic frequency selection (DFS) and transmission power control (TPC) have been required for conformity with European requirements, and these are now virtually completed for 802.11a (this project was originally coded 802.11h). However, as discussed in the next section, full ETSI approval of wireless networking in the 5-GHz band does appear to be feasible, and this band is apparently being cleared in many countries for broadband data communications.

The range of 802.11 activities is wide and fast moving. See http://grouper.ieee.org/groups/802/11/ for the latest information.

HiperLAN

HiperLAN operates in the 5-GHz band up to 54 Mbit/s and requires approximately 330 MHz of bandwidth. These rates and frequencies are the same as 802.11a, and at the physical layer the two are almost identical. HiperLAN has been developed by ETSI and the Broadband Radio Access Network (BRAN), whose founding members are Tenovis (Bosch), Dell, Ericsson, Nokia, Telia, and Texas Instruments. Typically, the operating ranges are 30 m indoors and 150 m outdoors. In Europe specific bands from 5.15 to 5.35 GHz and 5.470 to 5.725 GHz seem to have been approved already for dedicated use by HiperLAN. In the U.S. the 5.15–5.35-GHz and 5.725–5.825-GHz bands are unlicensed and usable.

The main difference between HiperLAN and 802.11a is at the medium access control (MAC) level. Whereas 802.11a is an extension of wireless Ethernet, primarily based on contention mechanisms and packet switching, HiperLAN is claimed to offer connection-oriented communication and is compatible with other circuit-switched network standards such as ATM. As a result it offers quality-of-service (QoS) guarantees and therefore will be useful for real-time streaming applications.

HomeRF

HomeRF is yet another competing technology in the 2.4-GHz band. It offers a data rate of 10 Mbit/s and combines cordless telephone links (up to eight concurrently), wireless networking, and data streaming for home entertainment products.


USEFUL WEB SITES

Bluetooth specifications: www.bluetooth.com/dev/specifications.asp

HiperLAN: www.hiperlan2.com

HomeRF: www.homerf.org

IEEE 802.11: grouper.ieee.org/groups/802/11/

IEEE 802.15: grouper.ieee.org/groups/802/15/

Ultra wideband: www.uwb.org

Wireless networking links: www.wireless-nets.com/links.htm



Up to eight simultaneous real-time streaming sessions are possible, either two-way, multicast, or receive only. Priority is given to real-time voice communications over streaming applications as the default condition. Asynchronous packets take last place in the queue. The IEEE describes HomeRF as a sort of trimmed-down 802.11.

RF INTERFERENCE ISSUES

Interference is a particular problem in the crowded 2.4-GHz band, which is license free in most parts of the world. Industrial, scientific, and medical (ISM) applications use it, and microwave ovens are the primary culprits when it comes to interference.

In order to limit the effects of interference, frequency hopping spread spectrum (FHSS) techniques are used by Bluetooth and HomeRF. Hopping reduces channel efficiency in favor of robustness and is simpler to implement than the DSSS techniques described below. Such techniques usually employ at least 75 frequencies with a maximum dwell time per frequency of 400 ms. This ensures that communication does not spend too long on each frequency and that the signal looks like random impulsive interference to a narrow-band receiver. Both transmitter and receiver follow the same pseudorandom pattern of hops, and adaptive hopping can avoid frequencies that are known to be blocked. Typical FHSS products occupy about 1 MHz of bandwidth. They achieve higher power within each frequency band at any one time and may therefore result in a better instantaneous RF signal-to-noise ratio than other spread-spectrum techniques.
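The shared-pattern idea can be illustrated in a few lines. The sketch below is not the real Bluetooth hop-selection algorithm (which is derived from the master's address and clock); it only shows the principle that both ends generate the same pseudorandom sequence from shared state, and that adaptive hopping simply removes channels known to be blocked.

```python
# Toy frequency-hopping sequence generator. Transmitter and receiver
# share a seed, so they derive identical hop patterns; blocked
# channels are excluded up front (adaptive hopping, simplified).

import random

def hop_sequence(seed, n_hops, n_channels=79, blocked=()):
    """Return a reproducible pseudorandom sequence of channel numbers."""
    usable = [ch for ch in range(n_channels) if ch not in set(blocked)]
    rng = random.Random(seed)          # shared seed -> shared pattern
    return [rng.choice(usable) for _ in range(n_hops)]

# Both ends compute the same 1,600 hops (one second at Bluetooth's
# hop rate) and never land on the blocked channels.
tx = hop_sequence(seed=0x5EED, n_hops=1600, blocked={40, 41, 42})
rx = hop_sequence(seed=0x5EED, n_hops=1600, blocked={40, 41, 42})
assert tx == rx
assert not {40, 41, 42} & set(tx)
```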

The Bluetooth hopping rate is quite fast (1,600 times per second) compared with some other systems. Per second, in any one band, it is said that one is more likely to encounter an 802.11b transmission than a Bluetooth signal. There is some concern over the colocation of Bluetooth and 802.11b devices, as Bluetooth transmitters have been shown to reduce the performance of the other network considerably if placed close to a receiver. Bluetooth has limited error handling and retry mechanisms beyond the frequency-hopping physical layer. So it is not particularly sophisticated for robust wireless LANs, for which it was not designed, being intended instead as a simple and cheap short-distance cable-replacement technology.

The direct sequence spread spectrum (DSSS) approach, as used by 802.11b for example, uses more bandwidth than strictly required by the data rate. Data are coded onto chips (simply a pattern of data bits) that have redundant portions spread over the frequencies in the band. The data are exclusive-ORed with an 11-bit Barker code (a pseudorandom sequence), the bits of which make up the chips, and there are usually an integral number of chips per bit. These are modulated onto the carrier using differential phase-shift keying (DPSK), so a narrowband receiver perceives this signal as low-level noise. Even if some parts are lost during transmission owing to interference, error correction can be used to recover the data. The so-called spreading ratio is related to the degree of redundancy employed (number of chips per bit); the spreading ratio of most wireless LANs is around eight. This ratio makes the use of the band reasonably efficient at the expense of some robustness. Sometimes the band is split between more than one network; a maximum of three networks is possible in typical current implementations. Typical current DSSS products occupy 20 to 22 MHz of bandwidth no matter what the data rate. The power within any one band is relatively low, making the RF signal potentially less robust than with frequency-hopping techniques, but performance in practice depends on the coding scheme and spreading ratio employed.
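The spreading operation itself is tiny. The sketch below uses one common binary representation of the 11-chip Barker sequence; the majority-vote despreader is a simplification of the correlation receivers used in practice, but it shows why a few corrupted chips per bit can be outvoted.

```python
# Sketch of DSSS spreading: each data bit is XORed with an 11-chip
# Barker sequence; the receiver despreads by correlating each 11-chip
# word against the same sequence.

BARKER_11 = [1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0]  # one common representation

def spread(bits):
    """Expand each data bit into 11 chips (bit XOR chip)."""
    return [b ^ c for b in bits for c in BARKER_11]

def despread(chips):
    """Recover bits by majority correlation over each 11-chip word."""
    bits = []
    for i in range(0, len(chips), 11):
        word = chips[i:i + 11]
        matches = sum(r == c for r, c in zip(word, BARKER_11))
        bits.append(0 if matches > 5 else 1)
    return bits

# A few chip errors in transit are still outvoted at the receiver.
tx = spread([1, 0, 1, 1])
tx[3] ^= 1
tx[17] ^= 1
assert despread(tx) == [1, 0, 1, 1]
```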

BLUETOOTH AUDIO

The basic audio functionality in the Bluetooth Core Specification is really only suitable for telecommunications applications. The bandwidth is typically limited to 4 kHz for telephony, and the sampling rate is correspondingly 8 kHz. Relevant standards are ITU-T P.71, G.711, G.712, and G.714. The 64-kbit/s voice channels use µ-law or A-law logarithmic PCM coding or CVSDM (continuous variable slope delta modulation). CVSDM is said to be preferable with respect to quality and robustness to errors. Audio error correction depends on which packet type carries the audio transmission: HV3 (High-quality Voice 3) packets have no forward error correction (FEC) and contain 3.75 ms of audio at 64 kbit/s, whereas HV1 and HV2 packets have some error correction and contain shorter durations of audio, 1.25 and 2.5 ms respectively.
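The HV packet durations follow directly from payload size at the 64-kbit/s voice rate, assuming the 10-, 20-, and 30-byte HV1/HV2/HV3 information payloads given in the Bluetooth core specification.

```python
# Check of the HV packet audio durations quoted above. Payload sizes
# (HV1 = 10, HV2 = 20, HV3 = 30 information bytes) are assumptions
# taken from the Bluetooth core specification.

VOICE_RATE = 64_000  # bit/s

def audio_ms(payload_bytes, rate=VOICE_RATE):
    """Milliseconds of voice audio carried by one packet payload."""
    return payload_bytes * 8 * 1000 / rate

assert audio_ms(10) == 1.25   # HV1
assert audio_ms(20) == 2.5    # HV2
assert audio_ms(30) == 3.75   # HV3
```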

Bluetooth audio streaming using ACL packets

New draft profiles have been created for audio streaming using asynchronous (ACL) packets that can occupy the full remaining bit rate of Bluetooth (721 kbit/s, after the voice-quality streams are taken into account). These use the real-time transport protocol (RTP) for streaming. RTP was originally developed for managing streaming connections on asynchronous packet-switched networks such as parts of the Internet.

Either point-to-point or point-to-multipoint connections (such as one transmitter to multiple Bluetooth loudspeakers) are allowed. QoS is not guaranteed with ACL connections, although RTP does provide for some real-time requirements provided that buffering is used at the receiver. Better QoS provision for streaming has been requested by the A/V working group of the Bluetooth SIG in the next revision of the Bluetooth data-link specification. Although Bluetooth supports isochronous communication (the type required for clock-dependent processes) for streaming applications through the use of higher-level L2CAP (logical link control and adaptation protocol) connections, this is always only on a best-effort basis. The only truly synchronous reserved slots at the baseband level are for SCO packets, basically speech audio.
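The receiver-side buffering that makes best-effort delivery usable can be sketched as a playout buffer: every packet is held until its timestamp plus a fixed delay, so jittered or reordered arrivals are played back in order. The class and field names below are illustrative, not taken from any Bluetooth profile.

```python
# Minimal playout (jitter) buffer: hold each packet until its
# timestamp plus a fixed delay, releasing due packets in timestamp
# order regardless of arrival order.

import heapq

class PlayoutBuffer:
    def __init__(self, delay_ms):
        self.delay_ms = delay_ms
        self.heap = []  # (playout_deadline_ms, payload)

    def push(self, timestamp_ms, payload):
        heapq.heappush(self.heap, (timestamp_ms + self.delay_ms, payload))

    def pop_due(self, now_ms):
        """Return payloads whose playout deadline has arrived, in order."""
        due = []
        while self.heap and self.heap[0][0] <= now_ms:
            due.append(heapq.heappop(self.heap)[1])
        return due

buf = PlayoutBuffer(delay_ms=50)
buf.push(0, "pkt0")
buf.push(20, "pkt1")   # arrives before pkt2 despite a later timestamp
buf.push(10, "pkt2")
assert buf.pop_due(now_ms=40) == []                    # nothing due yet
assert buf.pop_due(now_ms=70) == ["pkt0", "pkt2", "pkt1"]
```

The fixed delay trades latency for robustness: a larger delay absorbs more jitter and leaves more time for retransmissions, at the cost of responsiveness.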

AVCTP is the Audio/Video Control Transport Protocol, which can be used for conveying messages intended for controlling Bluetooth A/V devices. The AVDTP (Audio/Video Distribution Transport Protocol) uses L2CAP packets to transfer audio with connections established between a transmitter and a receiver. This protocol provides a mechanism for reporting QoS, optimizing the use of bandwidth, minimizing delays, and attaching time-stamp data to packets for synchronization




purposes. The AVDTP document provides some advice in an appendix relating to the synchronization of devices either to each other or to a separate network clock. It also briefly mentions an approach to measuring timing jitter. Broadly, the approach relies on the timing mechanisms inherent in RTP and RTCP (real-time control protocol).

The A2DP (Advanced Audio Distribution Profile) describes a configuration of layers in the Bluetooth stack and higher application layers that can be used for the conveyance of audio. It was authored by the audio/video working group, consisting of members from Sony, Toshiba, Nokia, Philips, Ericsson, and Matsushita.

Considering the bit rate, there is really no way that uncompressed high-quality audio can be carried, requiring that some form of low bit-rate coding be employed. Specified in A2DP is a mandatory audio codec, a low-complexity subband codec (SBC), whereas other codec types (MPEG and ATRAC in the current version) are listed as optional. The SBC codec was developed for Bluetooth but is based upon an earlier Philips system described by de Bont et al. [1].

Non-A2DP codec types can be accommodated, although the rubric is rather confusing in relation to this eventuality. This is supposed to be achieved either by upgrading the codec to optional status within A2DP (this requires the manufacturer to submit clear definitions of certain required characteristics) or by transcoding the audio data to SBC if the receiver does not support the decoding of the data type; this way interoperability is maintained as much as possible.

The profile document does not define anything in relation to non-A2DP codecs except that the vendor is supposed to use a Bluetooth Assigned Number to identify itself, and its parameters must be signalled within the standard packet headers. Audio may be encrypted for content protection or not, but this is application dependent. The mandatory subband codec should use at least one of the 44.1- or 48-kHz sampling frequencies, and other lower rates can be specified. The encoder should be able to handle at least mono and one stereo mode (such as dual channel, stereo, joint stereo), and the receiver should be able to decode all of these. Similar requirements exist for MPEG-1 Layer I, II, or III, for MPEG-2 and MPEG-4 AAC, and for ATRAC codecs. MPEG codecs are also allowed variable bit-rate (VBR) encoding in the profile.

Quality and Robustness of Bluetooth Audio Streaming

For adequate audio quality the A2DP profile requires that the audio data rate be sufficiently lower than the available link data rate to allow for the retransmissions that will avoid packet loss. The margin allowed for this obviously depends on the robustness expected of the application in the face of interference or long distances. The profile limits the SBC bit rate to a maximum of 320 kbit/s for mono and 512 kbit/s for stereo. The maximum available bit rate is 721 kbit/s. The data overhead when one transmitter is communicating multiple ACL streams to different receivers can lower the overall bandwidth available for audio, hence the number of channels is tightly limited even with data compression. An alternative is to use broadcast mode, in which the full data stream for all audio channels is broadcast to all receivers, requiring them to separate the channels themselves (they therefore need to have a means of channel identification). The maximum number of audio channels in this mode is seven, that is, the maximum number of connections to a master device.
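The margin argument can be made concrete with a trivial calculation. The helper below is hypothetical (the profile specifies the bit-rate ceilings but not a particular margin formula); it simply expresses how much of the 721-kbit/s link is left for retransmissions at a given audio rate.

```python
# Illustrative headroom calculation for A2DP-style streaming.
# The margin policy is hypothetical; only the 721-kbit/s link ceiling
# and the 320/512-kbit/s SBC limits come from the profile text above.

LINK_KBITS = 721.0

def retransmission_headroom(audio_kbits, link_kbits=LINK_KBITS):
    """Fraction of link capacity left over for retransmissions."""
    return 1.0 - audio_kbits / link_kbits

# A 512-kbit/s stereo SBC stream (the profile maximum) leaves roughly
# 29% of the link for retries; 320-kbit/s mono leaves roughly 56%.
assert 0.28 < retransmission_headroom(512.0) < 0.30
assert 0.55 < retransmission_headroom(320.0) < 0.57
```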

There are some problems with point-to-multipoint connection for audio using ACL and RTP, because different retransmission rates will apply on each connection and possibly affect interchannel synchronization. The result of this is timing differences between the audio channels, or phase distortion. This can be ameliorated using adequate buffering and resynchronization. Broadcasting gets around this problem, but packet loss can be encountered and not recovered because there is no dynamic retransmission method for broadcast mode. However, there is a fixed retransmission option for broadcast mode, which appears to act rather like a form of permanent redundancy, whereby packets are always retransmitted at the expense of a reduction in available bit rate on the channel. Floros et al. [2] found broadcast mode with no retransmissions only just acceptable for audio streaming owing to packet loss on the wireless link. One retransmission reduced audio data loss to 2 to 3% compared with 7 to 8%, but reduced the effective bit rate from 551 to 325 kbit/s. Because the losses were in compressed audio data, the resulting uncompressed audio was badly affected. They claim that retrieval mechanisms in the Bluetooth lower levels and application layer could be adapted to minimize the effects of packet loss on audio quality, but have some reservations about the use of the approach for synchronous compressed multichannel audio. They concluded that one could easily transmit stereo audio at 256 kbit/s per channel, plus control information, within the bandwidth available using two separate ACL links. The best data rate was obtained with DH5 packets, getting close to the upper limit of 721 kbit/s, but all DH packets have no forward error correction and so are more prone to data loss on the link. DM packets have forward error correction (FEC) and a correspondingly lower overall data rate.

Whether Bluetooth stands the test of time remains to be seen, but wireless technology for audio transmission will undoubtedly continue to expand. See the box on p. 982 for useful Web sites with more information.

FURTHER READING

Specification of the Bluetooth System, version 1.1 (2001).

Bluetooth Audio/Video Working Group (2002), Advanced Audio Distribution Profile 0.95b.

Bluetooth Audio/Video Working Group (2002), Generic Audio/Video Distribution Profile 0.95b.


1 F. de Bont, M. Groenewegen, and W. Oomen, "A High Quality Audio-Coding System at 128 kb/s," presented at the 98th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 43, p. 388 (1995 May), preprint 3937.

2 A. Floros, M. Koutroubas, N. Alexander-Tatlas, and J. Mourjopoulos, "A Study of Wireless Compressed Digital Audio Transmission," presented at the 112th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 50, p. 498 (2002 June), preprint 5516.

Page 123: AES Journal of the Audio Engineering Society (ISSN 0004-7554), Volume 50, Number 11, 2002 November. Published monthly, except January/February and July/August.

NEWS OF THE SECTIONS

We appreciate the assistance of the section secretaries in providing the information for the following reports.

J. Audio Eng. Soc., Vol. 50, No. 11, 2002 November 985

Vocal Amplification

The Colorado Section held its first meeting of the season on September 25 at the King Center on the Auraria Higher Education Center campus in Denver. Some 24 guests attended a panel discussion on voice, stage and sound amplification that covered a broad range of interests, including audience expectations for vocal performance in coming years.

Patti Peterson, professor of vocal studies at the University of Colorado, began the session by describing the aspects of vocal pedagogy that are common to both amplified and unamplified performance practices. She also talked about how vocal technique and tone production vary, depending on whether or not sound amplification is employed.

Peter Russell, president and general director of Opera Colorado, which in its early days was one of the first companies to practice amplified opera-in-the-round, spoke forcefully in favor of proscenium-style presentations of unamplified opera. Russell characterized amplified vocal performance as appropriate for musical theater, but reiterated his position that conventionally produced opera must be unamplified to be true to the art form. Russell also said that since productions developed for in-the-round presentation have little value to other (more traditional) companies, co-production of such performances with other opera companies is not economically feasible.

Bob Burnham then spoke to the group from a sound technician's point of view on the importance of establishing a warm and supportive rapport with the performing artists. Drawing on his 33 years of experience as a recording and sound reinforcement engineer, Burnham described several methods of miking and reinforcement and fielded a number of very pertinent questions from the audience.

Japan Tours

With 53 members and guests, the Japan Section toured the new sound research facility at NHK Science & Technical Research Labs in Setagaya, Tokyo, on June 26.

Kimio Hamasaki of NHK welcomed visitors and talked briefly about NHK's objectives in undertaking the renovation, and about the sound research projects now being undertaken by the company. Members were able to visit the laboratories, talk with the department heads, and take a look at some of the company's latest products. These include an extended-dynamic-range silicon diaphragm condenser microphone; a tiny moving-coil microphone capable of picking up the sound of an insect chewing leaves, developed to support microscopic scenes taken by a high-resolution camera; a highly directional stereo gun microphone; and a sound image control system that uses loudspeaker arrays for a virtual studio.

Hamasaki concluded the tour with a demonstration of a 23-channel sound system developed for an experimental 4000-line HD-TV system, which impressed the group.

The section met again on August 22 for its annual review of the 2001 fiscal year. During the evening, all the section's activities throughout the year, as well as the settlement of accounts, were reported and approved. Other business included the election of new officers and the announcement of activity and budget projections for 2002.

Vic Goh

Colorado Section addressed voice, stage and sound amplification. Panelists (left to right): Bob Mahoney, Patti Peterson, Bob Burnham and Peter Russell.

Audio Tech Seminar

The Singapore Section presented an Audio Technology Seminar in conjunction with Broadcast Asia 2002, Singapore's premier broadcast exhibition and conference, on June 17, 2002. Entitled "Future Trends for Digital Audio: Principles, the Practice & the Possibilities," the seminar drew 48 professionals from various areas of the audio industry and included delegates from 12 countries, including the USA, Japan and China.

The full-day seminar featured seven speakers, who covered topics relating to digital TV, digital audio, and loudspeaker systems. The first lecture focused on the work being done in the area of compression schemes and digital television. Mary Ann Seidler, international director of sales for Telos Systems/Omnia USA, talked about some of the new schemes now available for audio compression. In particular, she focused on the advantages of MPEG-4, such as its more efficient use of bandwidth.

Henk Mensinga, sales manager of TC Electronic of Denmark, then spoke about audio for DTV, covering the requirements of broadcasting for digital TV. Mensinga discussed the assortment of digital TV formats that are now being adopted by governments in different countries.

After a 15-minute break for refreshments, the discussion moved on to the implementation of digital audio. John Wigglesworth, vice president and director of Creative Services for Four Media Company Asia of Singapore, and Danial Shimiaei, chief digital systems engineer of Todd-Soundelux USA, talked about how international program producers are currently choosing to complete their programs in high-definition video with Dolby 5.1 soundtracks. Even though many of the final delivery formats for TV are still analog and capable of supporting only Dolby Surround 2.0, the practice of exceeding current requirements permits a degree of future proofing that will ultimately prolong the shelf life of these products.

Wigglesworth then talked about digital audio from a management point of view. He deciphered budget breakdowns and compared major Hollywood movie budgets with typical Asian budgets. Shimiaei covered the topic from an operational angle. He talked about the responsibilities involved in the numerous mixing stages of major TV shows, including "The Sopranos," "Alias" and "Star Trek: The Next Generation." Local delegates were able to get a glimpse into the workings of the companies that produce so much of the prime-time programming in the region.

After a lunch break, two more seminars on digital audio focused on developments in network audio. Yves Ansade, head of Product Development of Digigram, France, gave a presentation on using Ethernet cables for sound distribution that included an in-depth explanation of the applications possible in a format that permits 24-bit/64-channel sound. Ansade also discussed EtherSound, a proprietary card system developed by Digigram that utilizes Ethernet cables for sound distribution.

The last presentation focused on a dual-loudspeaker project involving mLAN. Kunihiko Maeda, chief planner of the R&D Planning Section of the Otari Inc. System Engineering Division, USA, and Junichi Fujimori, project leader at Yamaha Corporation, Japan, described the efforts made by their respective companies to bring mLAN to fruition for sound applications. According to Maeda and Fujimori, using the FireWire/IEEE 1394 format, single-cable connections within studio and stage environments are now possible. This eliminates cable clutter and permits network connections that allow other settings and parameters to be transmitted via the single FireWire cable.

Lively Q&A sessions followed each of the three pairs of presentations, and AES mementos were given to the speakers by Robert Soo, section chair. The section is already preparing for its 2003 seminar, to be held in conjunction with Broadcast Asia 2003.

Kenneth J. Delbridge

Singapore discusses future trends for digital audio. photo/Singapore Exhibition Services

Seattle Symphony Center

Twenty-eight members and guests of the Pacific Northwest Section held a business meeting on June 25 at Soundbridge, the Seattle Symphony's Music Discovery Center, an educational outreach section of the Benaroya Hall complex.

Aurika Hays, section chair, opened the meeting with some remarks about the success of the preceding year and thanked members for all their hard work in helping to bring Soundbridge to life.

Ron Hyder and Rick Chinn (section committeemen) gave details of continuing improvements to the sound reinforcement system in Benaroya Hall's S. Mark Taper Auditorium. The designers purchased a JBL VerTec line array system for the left/right arrays and redid the built-in central cluster system, which is normally intended for basic use such as announcements, with JBL components to match the VerTec sound. They also installed a distributed loudspeaker array under the balcony and purchased a Soundcraft Series 4 board. Unfortunately, the Seattle Symphony was in closed rehearsal at the time, so the group could not see the items firsthand. However, attendees were able to view the rehearsal on the Soundbridge plasma TV.

Bryan Stratton, manager of Soundbridge, then talked about how Soundbridge strives to be a hub for the Seattle Symphony's education and community programs. The program offers high-tech, hands-on interactive exhibits such as "Be a Virtual Conductor"; real instruments visitors can try; and a "Music Bar" featuring hundreds of musical pieces that can be called up for listening by visitors. The organization also offers programs and classes to educate people of all ages about music.

Soundbridge was originally slated to occupy a small restaurant space. Stratton walked the group through the exhibits and demonstrated the Music Bar, which consisted of a row of PCs connected to a server. Each station has touchscreen controls, which allow the user to access information about hundreds of musical pieces, composers and styles, as well as the Seattle Symphony recordings.

There were also several video kiosks on myriad musical topics. One explained the forms and styles of various musical genres; another showed how music is used in films and cartoons. There was a "Meet the Conductor" kiosk that featured a video of symphony conductor Gerard Schwarz and offered the participant a view of the orchestra from the conductor's standpoint, and several kiosks on various instruments and musicians. One could even try playing the instruments on display.

After experiencing the array of delightful exhibits, attendees reassembled for the announcement of election results and a presentation of door prizes. For the list of new officers see the December issue of the Journal. Door prizes included a Silk Road poster won by Seth St. John, and two tickets to the Seattle Symphony's Bugs Bunny on Broadway concert, won by Gary Louie. Members then discussed several ideas for next season's meeting topics. Film scoring, IMAX theaters, acoustic analyzers, and grounding and shielding were some being considered.

Gary Louie

Bryan Stratton (in suit) explains a kiosk to Pacific Northwest members. Aurika Hays and Gary Louie (inset) look on. photos/Rick Smargiassi

IEEE 1394 in SF

The September meeting of the San Francisco Section, held at Dolby Labs, featured John Strawn of S Systems and Mike Overlin of Yamaha. They spoke to 65 audio professionals about IEEE 1394. The IEEE 1394 bus can be used to distribute audio and MIDI signals to practically any audio device, including personal computers, keyboards, mixers, samplers and signal processors. IEEE 1394 provides configurable, high-speed (400 Mbps) data transfer and can greatly reduce the complexity of cables commonly found in audio systems.

Strawn talked in general about how audio is distributed over IEEE 1394, which was originally developed to debug backplanes and has since evolved into a set of standards for a high-speed serial bus. Apple Computer developed FireWire, which in turn became the prototype for the IEEE 1394 standard.

According to Strawn, it is important to remember that IEEE 1394 is a bus, not a network, and it is quite versatile in carrying SMPTE code, MIDI, and 1-bit, floating-point or linear audio. Data transfer on IEEE 1394/FireWire can be isochronous (real-time) or asynchronous (not real-time).

Overlin described a family of products that incorporate IEEE 1394/FireWire into a practical system called mLAN by Yamaha and its licensees. The core of Yamaha's mLAN is a series of chipsets that can be embedded in audio devices to allow connection to the IEEE 1394 bus.

Overlin gave an example of mLAN applied to sound reinforcement at a sports stadium in Florida. In this situation, amplifier power totaling 350 000 W drives 250 loudspeakers. Two rooms contain power amplifiers, one 70 meters from the audio control room, the other 400 meters away. Interconnections were made using mLAN over glass optical fiber to avoid a wiring maze.

Strawn and Overlin concluded with a discussion of some of the major advancements in interconnections, an important and often overlooked part of the audio world.

Paul Howard

John Strawn (left) and Mike Overlin tell San Francisco members about IEEE 1394.

Waring Collection

Fourteen members of the Penn State Student Section held the first meeting of the semester on September 23 at the Fred Waring Collection of the Pattee Library. Waring, who is probably best known for the Waring blender, hailed from the local area of Tyrone and donated his collection of musical archives and memorabilia to Penn State after his death.

Curator Peter Keifer, who was a sound person and road manager for Waring, began with a short history of Waring's career as a choral conductor and showman. The Waring Collection houses more than 10 000 recordings, as well as sheet music, scrapbooks, photographs, original cartoon artwork, correspondence, and other items such as costumes and the renowned Waring blender. Keifer talked about the many changes in recording media and microphone techniques as reflected in Waring's archives and souvenirs from his Pennsylvania shows. Early shows were recorded using only one or two microphones for the soloists. Later, Keifer explained, the shows incorporated more microphones for the band.

The group gathered in the archive area, where Keifer transfers recordings from old formats, such as aluminum records and wire recordings, to digital formats like DAT and CD. He demonstrated the wire recorder as well as various types of records that require different-size styli.

Keifer concluded the session by talking a little about the Association for Recorded Sound Collections (ARSC), a group dedicated to the preservation of historical recordings and their importance in cultural heritage. For more information on this group, visit the Web site at www.arsc-audio.org.

Alexandra Loubeau

British Look at Pro Studios

British Section members gathered June 19 at the Department of Sound Recording at Surrey University to participate in a one-day course organized to give members an opportunity to gain hands-on experience in professional studio operations. Entitled Studio Appreciation Day, the day's events offered a practical introduction to five principal areas of studio operation: surround sound mixing on a Sony Oxford console; multitrack pop mixdown on a Neve Series V; music and dialog editing on the SADiE audio workstation; audio postproduction on Sonic Solutions; and live classical or jazz recording straight to stereo.

The range of activities covered the principal areas of work of professional studios and production facilities, as well as a wide variety of equipment, media and musical styles. During the course of the day members made and edited recordings under the guidance of experts. By using the equipment first hand, participants gained a deeper appreciation of the decisions and compromises involved in the design. The course organizer, Ken Blair of BMP Recording, assisted by other AES members as well as staff and students of the Tonmeister course, was on hand to guide participants through the studio process.

Virtual Acoustics

The Central Germany Section held a meeting June 24 at the Technical University of Aachen on virtual acoustics and the simulation of sound fields and multichannel reproduction. Present were 31 members and a large group of students, who were interested in finding out more about the work of the AES and its student sections all over the world.

The University of Aachen, under the leadership of Professor Vorländer, head of the Institute for Technical Acoustics, has for many years been a center of acoustics and electroacoustics. The school first became famous with Professor Kuttruff and his extended investigations in room acoustics and electroacoustics.

The meeting was full of interesting reports, information and demonstrations. Tobias Lentz reported on his investigations in the field of audio virtual reality. Simulation, in many cases, is much cheaper and faster than reality; consider flight simulators and simulations of a production line in the automobile industry. Lentz talked about how such models can be applied to the creation of ideal acoustics for new buildings and rooms. For new buildings, room acoustical conditions demonstrated in virtual reality enable decisions to be made faster and more easily. The reproduction of sound fields is based on the fast calculation of sound transmission and the use, in this case, of only two channels, with reproduction over loudspeakers instead of headphones.

Professor Völker then took over with a report on a listening test that compared the sound of 5.1 multichannel stereo reproduction in an Institute for Acoustics and Building Physics (IAB) control room with that in a normal living room. The living room contained furniture and had the acoustical features of a normal room, i.e., no specially tailored design.

Alexander Bob described the test results, which showed that a normal living room with additional sound reflections from the ceiling, walls and floor was not as acoustically bad as expected. Some listeners actually preferred these acoustic conditions to the normal non-reverberant IAB control room, where the reproduction was reportedly more precise and sharply directed.

Tobias Lentz tells Central Germany members about audio virtual reality.

SOUND TRACK

EMPIRE STATE ANTENNA

The Empire State Building has reaffirmed its pledge to provide the power, space and infrastructure necessary to permanently accommodate the television and FM broadcasters in the New York metropolitan area who previously transmitted from the World Trade Center. Adequate power and electrical capacity have already been added to the Empire State Building antenna to allow transmission at full power and full range, in cooperation with the Metropolitan Television Alliance.

Since the destruction of the World Trade Center, all television broadcasters have been transmitting from the Empire State Building. In the wake of the attacks, the building accommodated broadcasters by providing access to its antenna at no cost from September 11 through October 2001. The building immediately installed a temporary AC power supply circuit of 2000 amperes, which it replaced in March 2002 with a permanent AC power supply circuit of 9000 amperes that permits analog broadcast at full range and power. Plans to install an additional 16 700 amperes of capacity will permit full analog and digital transmission by all broadcasters by mid-2003. If the proposed 2000-ft antenna is built at an alternative site, the Empire State Building will remain the primary antenna for the three to four years required to complete such a structure, and could then serve as a permanent secondary antenna.

MUSIC AWARDS FOR 2004

The University of Louisville School of Music has announced that it is now accepting applications for the University of Louisville Grawemeyer Award for Music Composition 2004. The university will offer an international prize in recognition of outstanding achievement by a living composer in a musical genre: choral, orchestral, chamber, electronic, song-cycle, dance, opera, musical theater, extended solo work, etc. The $200,000 award will be granted to a composer for a work premiered during the five-year period from January 1, 1998 through December 31, 2002. For details regarding the submission of scores and the rules and procedures for the selection of the winning work, contact: Grawemeyer Music Award Committee, School of Music, University of Louisville, Louisville, Kentucky 40292, USA. The committee must receive all materials plus the application form by January 27, 2003.

DVB-T PROJECT

Broadcast specialist Rohde & Schwarz has been awarded an exclusive contract to supply the transmitter systems needed to convert the Berlin metropolitan area from terrestrial analog to digital signal transmission.

In the first phase, the Munich-based company will supply 13 new transmitters and modify the existing analog high-power transmitter so it can be used for digital transmission. DVB-T coverage of households is scheduled for completion by August 2003, when the existing conventional analog terrestrial TV transmissions will be fully replaced. The digitization of terrestrial frequencies will allow more TV program channels to be transmitted on fewer frequencies and provide viewers with additional services such as electronic program guides.

EDISON EXHIBIT AT VIRTUAL MUSEUM

The IEEE Virtual Museum (VM) has launched its newest exhibit: Thomas Edison: A Lifetime of Invention. The exhibit, funded by the Charles Edison Fund, explores the different stages of Edison's career, from his entrepreneurial youth to the disappointments of his later years.

Most are familiar with the works that marked the apex of Edison's career: the incandescent bulb, a lighting system, and the phonograph. These achievements made Edison famous and earned him the moniker "The Wizard of Menlo Park."

Lesser known, but also compelling, are the research and inventions that bracket Edison's most prolific years. Edison's first patented device, an automatic vote-recording machine (1865), worked well but did not meet with commercial success. From that, Edison learned to create things that people needed and would buy. For the next fifty years he followed that dictum and, in addition to his most famous works, introduced such things as the universal stock ticker, the carbon-button transmitter and the storage battery. Along the way, he also helped launch the modern electric utility industry, founded many companies (one of which became General Electric) and created the precursor to the modern research laboratory.

Unfortunately, as he grew older, Edison's ability to identify potentially commercial products declined. His later years were marked by ephemeral success in the concrete and movie businesses and complete failure in iron-ore production. Nonetheless, by the time of his death in 1931 he had close to 1100 patents, a number that remains unsurpassed by any inventor.

The exhibit does a thorough job of explaining to a pre-college audience how different technologies worked, how they were developed, and the impact they had on the people who used them. To find out more about Edison, his inventions and his era, visit www.ieee.org/museum.

NEW PRODUCTS AND DEVELOPMENTS

Product information is provided as a service to our readers. Contact manufacturers directly for additional information, and please refer to the Journal of the Audio Engineering Society.

CD AND DVD DUPLICATOR is an advanced, stand-alone, full-featured system appropriate for video, data or professional audio applications. The industrial ES-DVD8 features four user control buttons; a two-line backlit LCD display; a dedicated read-only DVD-ROM drive; support for DVD/CD video, data and professional audio formats; a burn-on-the-fly copy option; power-on self-diagnostics; an internal hard drive for multiple image storage; and firmware upgrade via CD-ROM. Echo Star Systems, 1223 Osborne Road, Downingtown, PA 19335, USA; tel. +1 610 518 2860; fax +1 267 200 0462; e-mail Joe@echostarsystems (Joe Alfonsi); Web site www.echostarsystems.com.

AES SUSTAINING MEMBER

COMPACT PRODUCTION MIXING CONSOLE is an entry-level desk ideally suited for repertory theater, corporate events, industrials, and rental purposes. The console is highly specified, with high-quality microphone inputs and full four-band parametric EQ with the ability to switch the EQ ahead of the insert point, as featured on the J-Type live production console. The standard S-Type configuration has a 25-way frame with 16 mono inputs, 8 group outputs, auxiliary outputs, matrix outputs and DC master faders, plus an oscillator/communications module. However, the configuration is flexible and modules can be slotted into any position. The desk is available in three frame sizes: 17-way, 25-way and 33-way, and may be linked via optional bus connectors. Cadac Electronics PLC, One New Street, Luton, Bedfordshire LU1 5DX, UK; tel. +44 1582 404202; fax +44 1582 412799; e-mail info@cadac-sound; Web site www.cadac-sound.com.

AES SUSTAINING MEMBER

LIGHTWEIGHT LOUDSPEAKER for small PA systems is operable between 70 Hz and 16 kHz. The MVP25 houses a single 12-in low-frequency driver and a proprietary LM20™ non-metallic, ferrofluid-cooled compression driver coupled to a 90-degree by 40-degree horn. With a maximum output of 116-dB SPL, the enclosure's sensitivity is rated at 94-dB SPL, while power handling is 150-W RMS, 375-W program. The unit stands 20.4-in high and measures 14.5-in wide by 13.8-in deep. Ruggedly constructed using an internally braced MDF board, the MVP25 is covered with black carpeting and has recessed handles. Community Professional Loudspeakers, 333 East Fifth Street, Chester, PA 19013, USA; tel. +1 610 876 3400 or 800 523 4934 (toll-free); fax +1 610 874 0190; Web site www.loudspeakers.net.

Upcoming Meetings

2003 March 22-25: 114th AES Convention, RAI Conference and Exhibition Centre, Amsterdam, the Netherlands. See p. 1008 for more information.

2003 April 28-May 2: 145th Meeting of the Acoustical Society of America, Nashville, TN, USA. For information call 516-576-2360; fax 516-576-2377.

2003 May 23-25: AES 23rd International Conference, "Signal Processing in Audio Recording and Reproduction," Marienlyst Hotel, Helsingør, Copenhagen, Denmark. For details see p. 1008.

2003 June 26-28: AES 24th International Conference, "Multichannel Audio: The New Reality," The Banff Centre, Banff, Alberta, Canada. For more information see: www.banffcentre.ca.

2003 July 7-10: Tenth International Congress on Sound and Vibration, Stockholm, Sweden.

2003 October 10-13: AES 115th Convention, Jacob K. Javits Convention Center, New York, NY, USA. See p. 1008 for details.

2003 October 20-23: NAB Europe Radio Conference, Prague, Czech Republic. Contact Mark Rebholz (202) 429-3191 or Web site: www.nab.org.

CLASSIC TUBE COMPRESSOR/LIMITER offers analog stereo tube compression and 24-bit/96-kHz digital conversion via a DC1 24/96 output module. The TS2 allows users to dial in the desired amount of tube warmth via a variable tube-drive control and A/B the results by means of a separate switch located on the front panel. The soft-knee compressor performs well in both mixing and tracking applications and has a stereo (link) mode for warming up digital mixdowns, and a dual-mono mode for tracking. A compress control makes best use of threshold and ratio settings, and attack and release controls are independent. The TS2 also includes a fixed-threshold (+16) limiter to prevent overload of the digital output and provide maximum output without distortion. This product is distributed by the Transamerica Audio Group, www.transaudiogroup.com. Drawmer, Charlotte Street Business Centre, Charlotte Street, Wakefield, West Yorkshire WF1 1UH, UK; tel. +44 1924 378669; fax +44 1924 290460; Web site www.drawmer.com.

AES SUSTAINING MEMBER

SUBWOOFER LINE ARRAY ELEMENT is an 18-in subwoofer that can be flown in an array with the VT4889 full-range unit, suspended as a dedicated subwoofer line array, or ground-stacked as desired. The new VT4880 weighs 59.9 kilograms (132 lbs), delivers a maximum peak output of 138 dB and has an input power rating of 4800 W. The subwoofer has two JBL 2258H 18-in dual-voice-coil, Direct Cooled™ cone transducers with Neodymium Differential Drive®. The rigid enclosure has a DuraFlex™ finish and weatherized component transducers. JBL Professional, 8500 Balboa Boulevard, Northridge, CA 91329, USA; tel. +1 818 894 8850; fax +1 818 894 3479; Web site www.jblpro.com/pressroom.

SOUND CALIBRATOR conforms to IEC 60942 Class 1 requirements and is suitable for calibrating high-precision sound pressure level meters. Nominal frequency is 1 kHz and sound pressure level is 94 dB. The unit accepts both 1-in and 1/2-in microphones and runs on two alkaline batteries for over 30 hours of continuous use. The small, handheld NC-74 automatically compensates for atmospheric pressure fluctuations, so manual compensation is not required. Rion Co., Ltd., 20-41, Higashimotomachi 3-chome, Kokubunji, Tokyo 185-8533, Japan; tel. +81 42 359 7888; fax +81 42 359 7442; e-mail info@rion; Web site www.rion.co.jp. Distributed by Scantek, Inc., 7060-L Oakland Mills Road, Columbia, MD 21046, USA; tel. +1 410 290 7726; fax +1 410 290 9167; e-mail info@scantekinc; Web site www.scantekinc.com.

AES SUSTAINING MEMBER

SUB-MINIATURE CLIP-ON MICROPHONE is practically invisible and built to exacting tolerances. The MKE Platinum features optimized treble response for improved headroom at the wireless body-pack transmitter. Bass response is optimized through the capsule ventilation design, and the microphone offers a smooth THD curve. The microphone will handle sound pressure levels up to 142 dB. A hermetically sealed capsule and improved embossed umbrella diaphragm protect the microphone against moisture and ensure a longer working life. Low-capacitance, ultrathin flexible cable allows for unobtrusive attachment of the microphone. Sennheiser Electronic Corporation, 1 Enterprise Drive, Old Lyme, CT 06371, USA; tel. +1 860 434 9190; fax +1 860 434 1759; Web site www.sennheiserusa.com.

SOFTWARE UPGRADE to the SymNet modular audio routing and digital signal processing system is now available. The latest version supports two new hardware add-ons to the SymNet line: BreakIn 12 and BreakOut 12, both of which extend the input and output capacity of the system. In addition, two new subclasses of modules called Programmable Filters have been added to the Filters and EQ class in the software's module view toolkit. They include mono and stereo programmable 6-dB/octave sloped low-pass and high-pass filters, and band-pass filters, which afford greater control than standard filters by adding controls based on filter type, resonance, and slope. In addition, the stereo and mono sound pressure level (SPL) modules now include SPL computers with gap-sensing technology. These new modules are located under the Dynamics class in the software's module view toolkit and control the volume of a signal as measured by a sensing microphone during gaps in the program material. Symetrix, Inc., 14926 35th Avenue West, Lynnwood, WA 98037, USA; tel. +1 425 787 3222; fax +1 425 787 3211; Web site www.symetrixaudio.com.

J. Audio Eng. Soc., Vol. 50, No. 11, 2002 November 991

NEW PRODUCTS AND DEVELOPMENTS


which many are in Russian. Furthermore, there are quite a number of misspellings of names, such as Hancel and Crammer for Hankel and Cramér, respectively, and there are errors in the bibliographic data in the references. Few audio engineers will fully benefit from the theories explained in this book.

Ronald M. Aarts
The Netherlands

IN BRIEF AND OF INTEREST…

The Life and Works of Alan Dower Blumlein: The Inventor of Stereo by Robert Charles Alexander (Focal Press), 2000, explores the fascinating career of the man known as “the inventor of stereo.”

Blumlein was born in 1903 and demonstrated an early proclivity for academics and engineering. He attended Imperial College in London and in 1924, after graduating, joined International Western Electric, launching his career as an engineer. Over the course of the next several years Blumlein worked on various projects related to telephony and telegraphy. When he joined the Gramophone Company, he helped devise one of his key inventions, a method of recording 2-channel audio in a single phonograph groove. Many of these features would later be used in the stereo records introduced in the late 1950s.

By the early 1930s, Blumlein was working with the British firm EMI toward establishing electronic television standards, and he helped in the creation of a new high-definition (for the time) TV system. With the coming of World War II, however, Blumlein and his colleagues at EMI were drawn toward war work. In 1942, during airborne experiments on a new type of radar set, Blumlein was killed with several others when their airplane crashed.

In this 398-page biography, Alexander carefully documents the life of this inventor, placing his numerous contributions into the broader context of research and development in the British communications industry. The book includes illustrations, references, and tables. Price: $29.99 (paperback). Focal Press, 225 Wildwood Avenue, Woburn, MA 01801, USA; tel: 800-366-2665; fax: 781-904-2620; Internet: www.focalpress.com.

Noise Reduction in Speech Applications by Gillian M. Davis (CRC Press), 2002, provides a comprehensive introduction to modern techniques for removing or reducing background noise from a range of speech-related applications.

Edited by Gillian Davis, the book begins with a tutorial that provides the background material necessary for understanding system issues, digital algorithms, and their implementation. The following chapters are written by international experts and offer a practical systems approach to help readers decide whether or not digital noise reduction will solve specific problems in their own systems. A closing section explores a variety of applications and demonstrates the satisfying results that can be achieved with various noise reduction techniques.

Other features of the book include a section on how to select the most appropriate technique for a given situation; background on digital signal processing techniques and noise in digital speech communication systems; and an examination of single-channel speech enhancers, microphone arrays, and echo-canceller noise reduction algorithms. Price: $129.95. CRC Press, www.crcpress.com.

LITERATURE AVAILABLE

The opinions expressed are those of the individual reviewers and are not necessarily endorsed by the Editors of the Journal.

SIGNAL PROCESSING NOISE by Vyacheslav P. Tuzlukov, CRC Press, Boca Raton, FL, 652 pages, 2002, $129.95, ISBN 0-8493-1025-3.

The author presents much of his own research in this book on the processing of signals with additive and multiplicative noise. This noise can significantly limit the potential of complex signal processing systems, especially when those systems use signals with complex phase structure.

The book sets forth a generalized approach to signal processing in the presence of multiplicative and additive noise that represents an advance in signal processing and detection theory. This approach extends the limits of the noise immunity set by classical and modern signal processing theories, and systems constructed on this basis achieve better detection performance than that of systems currently in use. Addressing a fundamental problem in complex signal processing systems, the book offers a theoretical development for raising noise immunity in various applications, mainly signal detection.

Readers who are not experts in advanced DSP will find this a difficult book. This is acknowledged by the author himself when he states: “To better understand the fundamental statements and concepts…, the reader should consult my two earlier books.” The novice DSP engineer should first acquire the necessary background to fully benefit from this text.

This work, while containing almost 700 pages, is definitely too concise to be a first text on this topic, but it is a good supplementary text for knowledgeable readers who are involved in the topic. A second difficulty with it is that the author sometimes refers in one sentence to 45 journal papers, of



In Memoriam

John Granville Humble was born December 31, 1922, in Lincoln, Nebraska. As a boy, he loved to take things apart to see how they worked, and then put them back together, trying carefully not to have any leftover parts. He often used his little brother to determine if certain “high voltage” areas of his disassembled radios were fully discharged and safe to touch.

John was always intrigued with all communications media. He loved newspapers and was knowledgeable about the history of all of the famous publications of the time. In the eighth grade, he enlisted the help of several of his friends and published his own newspaper, The News Recorder, which he sold every Friday for two cents a copy. It wasn't long before he and the other boys associated with the paper had completed a wired communication network to each of their houses spanning several city blocks.

John always excelled in his studies and was recommended by his chemistry and physics teachers to apply for the Midland Radio School Scholarship. Midland, located in Missouri, was one of the most prestigious radio training schools at that time. John took their advice and was accepted after graduating from high school in 1941. Midland Radio School was founded by Arthur B. Church and was affiliated with KMBC in Kansas City, MO. Many radio personalities began their careers at KMBC, including John Cameron Swayze, Ted Malone, Walter Cronkite, and Caroline Ellis. After completing his studies at Midland, he worked for a radio station in Waycross, GA, for about a year. He then returned to Kansas City and worked for radio station KMBC as a radio engineer.

In the early 1960s John came to California to work as the Northwest sales representative for Altec Lansing Corporation. It was at Altec that he began to develop his well-regarded reputation in the audio industry. As the years passed, he moved in the direction of the theater sound product division. In theater sound, the Altec name was held in high regard because most of its solid engineering practices were adopted from the early days at Western Electric.

Because John never took any endeavor lightly, he quickly became an authoritative figure and a bit of a historian for the theater sound division. He was quite a lot more than a salesman. He used his well-trained background in engineering and radio, and adapted it well to audio. Many distinguished engineers at Altec, like Bill Hayes, remember having technical conversations with him as often as three times a week.

John was never pretentious; he never came across as a “know-it-all.” He did, however, know quite a bit, and for questions a customer might have that he didn't know, he was always sure to find the answer. In this way, he would learn little extra pearls of audio wisdom and history along the way. During these years, and many to follow, he earned the respect of many leaders in the audio industry.

As the theater business started to decline, Altec began to focus more on engineered sound. John saw this as an opportunity to begin his own business. He became an independent sales representative for a number of products. Shortly after taking on a new, unknown line of commercial sound equipment manufactured in Japan by the TOA Electric Company, he moved to Southern California.

After talking to many people since John's death, the one word most often mentioned was integrity. His honesty and integrity, along with his sincere love of the business, were the key to his success. He was always on the side of the customer, and everyone enjoyed seeing him show up at their office.

During the 29,063 days he spent here on earth (he died July 27), John made many friends and took an interest in young people starting out in the sound industry, mentoring them to successful careers. He was also involved with neighborhood charity projects of his own and, living up to his name, never mentioned them.

Those who have worked with John will never hear the familiar words “John Humble” in the receiver again. A dear friend will be sorely missed.

Tim Clark, John Murray, and Thomas P. Leach


John G. Humble, 1922–2002


PROPOSED TOPICS FOR PAPERS

Please submit the proposed title, abstract, and précis via Email to the papers chair no later than 2003 February 28. If you have any questions, contact the papers chair:

High-Resolution Audio
Signal Processing for Audio
Architectural Acoustics and Room Acoustics
Audio Networking
Recording and Reproduction
Audio Games and Entertainment
Sound Reinforcement
Digital Audio Equipment and Media
Automotive Audio
Instrumentation and Measurement
Transmission

CONVENTION CHAIRS

SUBMISSION SCHEDULE

Authors whose contributions have been accepted for presentation will receive additional instructions for submission of their manuscripts.

Proposal deadline: 2003 February 28

Acceptance emailed: 2003 March 17

Preprint deadline: 2003 May 10

Convention chair: Kimio Hamasaki
NHK Science & Technical Research Laboratories
Telephone: +81 3 5494 3208
Fax: +81 3 5494 3219
Email: hamasaki.k-ku@nhk

Vice chair: Hiroaki Suzuki
Victor Company of Japan, Limited (JVC)
Telephone: +81 45 450 1779
Fax: +81 45 450 1639
Email: [email protected]

Transducers
Psychoacoustics, Perception, Listening Tests
Multichannel Sound
Program Production
Audio Coding
Audio Watermarking
Virtual Reality Audio
Digital Broadcasting
Audio Restoration and Enhancement
Analysis and Synthesis of Sound

AUDIO ENGINEERING SOCIETY CALL FOR PAPERS

AES 11th Regional Convention, 2003, Tokyo, Japan

The 11th Regional Convention committee invites you to submit technical papers for presentation at the 2003 July meeting in Tokyo, Japan. By 2003 February 28 a proposed title, 60- to 120-word abstract (English), and 500- to 750-word précis of the paper should be submitted via Email to the papers chair (see below). The précis should describe the work performed, methods employed, conclusion(s), and significance of the paper. Title, abstract, and précis should follow the guidelines in Information for Authors at www.aes.org/journal/con_infoauth.html. Authors without Email and Internet access should contact the papers chair for hardcopy forms and instructions. Acceptance of papers will be determined by a review committee based on an assessment of the précis. A preprint manuscript will be a condition for acceptance of the paper for presentation at the convention. Abstracts of accepted papers will be published in the convention program. Please choose your wording carefully.

Dates: 2003 July 7–9
Location: Science Museum, Chiyoda, Tokyo, Japan


Shinji Koyano
Email: [email protected]
Pioneer Corporation, Corporate R & D Lab.
6-1-1, Fujimi, Tsurugashima-shi
Saitama 350-2288, Japan
Telephone: +81 49 279 2627
Fax: +81 49 279 1513




EASTERN REGION, USA/CANADA

Vice President: Jim Anderson, 12 Garfield Place, Brooklyn, NY 11215, Tel. +1 718 369 7633, Fax +1 718 669 7631, E-mail

UNITED STATES OF AMERICA

CONNECTICUT

University of Hartford Section (Student), Howard A. Canistraro, Faculty Advisor, AES Student Section, University of Hartford, Ward College of Technology, 200 Bloomfield Ave., West Hartford, CT 06117-1599, Tel. +1 860 768 5358, Fax +1 860 768 5074, E-mail

FLORIDA

Full Sail Real World Education Section (Student), Bill Smith, Faculty Advisor, AES Student Section, Full Sail Real World Education, 3300 University Blvd., Suite 160, Winter Park, FL 32792, Tel. +1 800 679 0100, E-mail

University of Miami Section (Student), Ken Pohlmann, Faculty Advisor, AES Student Section, University of Miami, School of Music, PO Box 248165, Coral Gables, FL 33124-7610, Tel. +1 305 284 6252, Fax +1 305 284 4448, E-mail

GEORGIA

Atlanta Section, Robert Mason, 2712 Leslie Dr., Atlanta, GA 30345, Home Tel. +1 770 908 1833, E-mail

MASSACHUSETTS

Berklee College of Music Section (Student), Eric Reuter, Faculty Advisor, Berklee College of Music, Audio Engineering Society, c/o Student Activities, 1140 Boylston St., Box 82, Boston, MA 02215, Tel. +1 617 747 8251, Fax +1 617 747 2179, E-mail

Boston Section, J. Nelson Chadderdon, c/o Oceanwave Consulting, Inc., 21 Old Town Rd., Beverly, MA 01915, Tel. +1 978 232 9535 x201, Fax +1 978 232 9537, E-mail

University of Massachusetts–Lowell Section (Student), John Shirley, Faculty Advisor, AES Student Chapter, University of Massachusetts–Lowell, Dept. of Music, 35 Wilder St., Ste. 3, Lowell, MA 01854-3083, Tel. +1 978 934 3886, Fax +1 978 934 3034, E-mail

Worcester Polytechnic Institute Section (Student), William Michalson, Faculty Advisor, AES Student Section, Worcester Polytechnic Institute, 100 Institute Rd., Worcester, MA 01609, Tel. +1 508 831 5766, E-mail

NEW JERSEY

William Paterson University Section (Student), David Kerzner, Faculty Advisor, AES Student Section, William Paterson University, 300 Pompton Rd., Wayne, NJ 07470-2103, Tel. +1 973 720 3198, Fax +1 973 720 2217, E-mail

NEW YORK

Fredonia Section (Student), Bernd Gottinger, Faculty Advisor, AES Student Section, SUNY–Fredonia, 1146 Mason Hall, Fredonia, NY 14063, Tel. +1 716 673 4634, Fax +1 716 673 3154, E-mail

Institute of Audio Research Section (Student), Noel Smith, Faculty Advisor, AES Student Section, Institute of Audio Research, 64 University Pl., New York, NY 10003, Tel. +1 212 677 7580, Fax +1 212 677 6549, E-mail

New York Section, Robbin L. Gheesling, Broadness, LLC, 265 Madison Ave., Second Floor, New York, NY 10016, Tel. +1 212 818 1313, Fax +1 212 818 1330, E-mail

NORTH CAROLINA

University of North Carolina at Asheville Section (Student), Wayne J. Kirby, Faculty Advisor, AES Student Section, University of North Carolina at Asheville, Dept. of Music, One University Heights, Asheville, NC 28804, Tel. +1 828 251 6487, Fax +1 828 253 4573, E-mail

PENNSYLVANIA

Carnegie Mellon University Section (Student), Thomas Sullivan, Faculty Advisor, AES Student Section, Carnegie Mellon University, University Center Box 122, Pittsburgh, PA 15213, Tel. +1 412 268 3351, E-mail

Duquesne University Section (Student), Francisco Rodriguez, Faculty Advisor, AES Student Section, Duquesne University, School of Music, 600 Forbes Ave., Pittsburgh, PA 15282, Tel. +1 412 434 1630, Fax +1 412 396 5479, E-mail

Pennsylvania State University Section (Student), Brian Tuttle, AES Penn State Student Chapter, Graduate Program in Acoustics, 217 Applied Science Bldg., University Park, PA 16802, Home Tel. +1 814 863 8282, Fax +1 814 865 3119, E-mail

Philadelphia Section, Rebecca Mercuri, P.O. Box 1166, Philadelphia, PA 19105, Tel. +1 609 895 1375, E-mail

VIRGINIA

Hampton University Section (Student), Bob Ransom, Faculty Advisor, AES Student Section, Hampton University, Dept. of Music, Hampton, VA 23668, Office Tel. +1 757 727 5658, +1 757 727 5404, Home Tel. +1 757 826 0092, Fax +1 757 727 5084, E-mail

WASHINGTON, DC

American University Section (Student), Benjamin Tomassetti, Faculty Advisor, AES Student Section, American University, Physics Dept., 4400 Massachusetts Ave., N.W., Washington, DC 20016, Tel. +1 202 885 2746, Fax +1 202 885 2723, E-mail

District of Columbia Section, John W. Reiser, DC AES Section Secretary, P.O. Box 169, Mt. Vernon, VA 22121-0169, Tel. +1 703 780 4824, Fax +1 703 780 4214, E-mail

CANADA

McGill University Section (Student), John Klepko, Faculty Advisor, AES Student Section, McGill University, Sound Recording Studios, Strathcona Music Bldg., 555 Sherbrooke St. W., Montreal, Quebec H3A 1E3, Canada, Tel. +1 514 398 4535 ext. 0454, E-mail

SECTIONS CONTACTS DIRECTORY

The following is the latest information we have available for our sections contacts. If you wish to change the listing for your section, please mail, fax, or e-mail the new information to: Mary Ellen Ilich, AES Publications Office, Audio Engineering Society, Inc., 60 East 42nd Street, Suite 2520, New York, NY 10165-2520, USA. Telephone +1 212 661 8528. Fax +1 212 661 7829.

Updated information that is received by the first of the month will be published in the next month's Journal. Please help us to keep this information accurate and timely.


Toronto Section, Anne Reynolds, 606-50 Cosburn Ave., Toronto, Ontario M4K 2G8, Canada, Tel. +1 416 957 6204, Fax +1 416 364 1310, E-mail

CENTRAL REGION, USA/CANADA

Vice President: Jim Kaiser, Master Mix, 1921 Division St., Nashville, TN 37203, Tel. +1 615 321 5970, Fax +1 615 321 0764, E-mail

UNITED STATES OF AMERICA

ARKANSAS

University of Arkansas at Pine Bluff Section (Student), Robert Elliott, Faculty Advisor, AES Student Section, Music Dept., Univ. of Arkansas at Pine Bluff, 1200 N. University Drive, Pine Bluff, AR 71601, Tel. +1 870 575 8916, Fax +1 870 543 8108, E-mail

INDIANA

Ball State University Section (Student), Michael Pounds, Faculty Advisor, AES Student Section, Ball State University, MET Studios, 2520 W. Bethel, Muncie, IN 47306, Tel. +1 765 285 5537, Fax +1 765 285 8768, E-mail

Central Indiana Section, James Latta, Sound Around, 6349 Warren Ln., Brownsburg, IN 46112, Office Tel. +1 317 852 8379, Fax +1 317 858 8105, E-mail

ILLINOIS

Chicago Section, Robert Zurek, Motorola, 2001 N. Division St., Harvard, IL 60033, Tel. +1 815 884 1361, Fax +1 815 884 2519, E-mail

Columbia College Section (Student), Dominique J. Chéenne, Faculty Advisor, AES Student Section, 676 N. LaSalle, Ste. 300, Chicago, IL 60610, Tel. +1 312 344 7802, Fax +1 312 482 9083

University of Illinois at Urbana-Champaign Section (Student), David S. Petruncio Jr., AES Student Section, University of Illinois, Urbana-Champaign, Urbana, IL 61801, Tel. +1 217 621 7586, E-mail

KANSAS

Kansas City Section, Jim Mitchell, Custom Distribution Limited, 12301 Riggs Rd., Overland Park, KS 66209, Tel. +1 913 661 0131, Fax +1 913 663 5662

LOUISIANA

New Orleans Section, Joseph Doherty, 6015 Annunciation St., New Orleans, LA 70118, Tel. +1 504 891 4424, Fax +1 504 891 6075

MICHIGAN

Detroit Section, Tom Conlin, DaimlerChrysler, E-mail

Michigan Technological University Section (Student), Andre LaRouche, AES Student Section, Michigan Technological University, Electrical Engineering Dept., 1400 Townsend Dr., Houghton, MI 49931, Home Tel. +1 906 847 9324, E-mail

West Michigan Section, Carl Hordyk, Calvin College, 3201 Burton S.E., Grand Rapids, MI 49546, Tel. +1 616 957 6279, Fax +1 616 957 6469, E-mail

MINNESOTA

Music Tech College Section (Student), Michael McKern, Faculty Advisor, AES Student Section, Music Tech College, 19 Exchange Street East, Saint Paul, MN 55101, Tel. +1 651 291 0177, Fax +1 651 291 0366, E-mail

Ridgewater College, Hutchinson Campus Section (Student), Dave Igl, Faculty Advisor, AES Student Section, Ridgewater College, Hutchinson Campus, 2 Century Ave. S.E., Hutchinson, MN 55350, E-mail

Upper Midwest Section, Greg Reierson, Rare Form Mastering, 4624 34th Avenue South, Minneapolis, MN 55406, Tel. +1 612 327 8750, E-mail

MISSOURI

St. Louis Section, John Nolan, Jr., 693 Green Forest Dr., Fenton, MO 63026, Tel./Fax +1 636 343 4765, E-mail

NEBRASKA

Northeast Community College Section (Student), Anthony D. Beardslee, Faculty Advisor, AES Student Section, Northeast Community College, P.O. Box 469, Norfolk, NE 68702, Tel. +1 402 644 0581, Fax +1 402 644 0650, E-mail

OHIO

Ohio University Section (Student), Erin M. Dawes, AES Student Section, Ohio University, RTVC Bldg., 9 S. College St., Athens, OH 45701-2979, Home Tel. +1 740 597 6608, E-mail

University of Cincinnati Section (Student), Thomas A. Haines, Faculty Advisor, AES Student Section, University of Cincinnati, College-Conservatory of Music, M.L. 0003, Cincinnati, OH 45221, Tel. +1 513 556 9497, Fax +1 513 556 0202

TENNESSEE

Belmont University Section (Student), Wesley Bulla, Faculty Advisor, AES Student Section, Belmont University, Nashville, TN 37212

Middle Tennessee State University Section (Student), Phil Shullo, Faculty Advisor, AES Student Section, Middle Tennessee State University, 301 E. Main St., Box 21, Murfreesboro, TN 37132, Tel. +1 615 898 2553, E-mail

Nashville Section, Tom Edwards, MTV Networks, 330 Commerce St., Nashville, TN 37201, Tel. +1 615 335 8520, Fax +1 615 335 8608, E-mail

SAE Nashville Section (Student), Larry Sterling, Faculty Advisor, AES Student Section, 7 Music Circle N., Nashville, TN 37203, Tel. +1 615 244 5848, Fax +1 615 244 3192, E-mail

TEXAS

Southwest Texas State University Section (Student), Mark C. Erickson, Faculty Advisor, AES Student Section, Southwest Texas State University, 224 N. Guadalupe St., San Marcos, TX 78666, Tel. +1 512 245 8451, Fax +1 512 396 1169, E-mail

WESTERN REGION, USA/CANADA

Vice President: Bob Moses, Island Digital Media Group, LLC, 26510 Vashon Highway S.W., Vashon, WA 98070, Tel. +1 206 463 6667, Fax +1 810 454 5349, E-mail

UNITED STATES OF AMERICA

ARIZONA

Conservatory of The Recording Arts and Sciences Section (Student), Glen O'Hara, Faculty Advisor, AES Student Section, Conservatory of The Recording Arts and Sciences, 2300 E. Broadway Rd., Tempe, AZ 85282, Tel. +1 480 858 9400, 800 562 6383 (toll-free), Fax +1 480 829 1332, E-mail




CALIFORNIA

American River College Section (Student), Eric Chun, Faculty Advisor, AES Student Section, American River College Chapter, 4700 College Oak Dr., Sacramento, CA 95841, Tel. +1 916 484 8420, E-mail

Cal Poly San Luis Obispo State University Section (Student), Jerome R. Breitenbach, Faculty Advisor, AES Student Section, California Polytechnic State University, Dept. of Electrical Engineering, San Luis Obispo, CA 93407, Tel. +1 805 756 5710, Fax +1 805 756 1458, E-mail

California State University–Chico Section (Student), Keith Seppanen, Faculty Advisor, AES Student Section, California State University–Chico, 400 W. 1st St., Chico, CA 95929-0805, Tel. +1 530 898 5500, E-mail

Citrus College Section (Student), Gary Mraz, Faculty Advisor, AES Student Section, Citrus College, Recording Arts, 1000 W. Foothill Blvd., Glendora, CA 91741-1899, Fax +1 626 852 8063

Cogswell Polytechnical College Section (Student), Tim Duncan, Faculty Sponsor, AES Student Section, Cogswell Polytechnical College, Music Engineering Technology, 1175 Bordeaux Dr., Sunnyvale, CA 94089, Tel. +1 408 541 0100, ext. 130, Fax +1 408 747 0764, E-mail

Expression Center for New Media Section (Student), Scott Theakston, Faculty Advisor, AES Student Section, Ex'pression Center for New Media, 6601 Shellmound St., Emeryville, CA 94608, Tel. +1 510 654 2934, Fax +1 510 658 3414, E-mail

Long Beach City College Section (Student), Nancy Allen, Faculty Advisor, AES Student Section, Long Beach City College, 4901 E. Carson St., Long Beach, CA 90808, Tel. +1 562 938 4312, Fax +1 562 938 4409, E-mail

Los Angeles Section, Andrew Turner, 1733 Lucile Ave., #8, Los Angeles, CA 90026, Tel. +1 323 661 0390, E-mail

San Diego Section, J. Russell Lemon, 2031 Ladera Ct., Carlsbad, CA 92009-8521, Home Tel. +1 760 753 2949, E-mail

San Diego State University Section (Student), John Kennedy, Faculty Advisor, AES Student Section, San Diego State University, Electrical & Computer Engineering Dept., 5500 Campanile Dr., San Diego, CA 92182-1309, Tel. +1 619 594 1053, Fax +1 619 594 2654, E-mail

San Francisco Section, Bill Orner, 1513 Meadow Lane, Mountain View, CA 94040, Tel. +1 650 903 0301, Fax +1 650 903 0409, E-mail

San Francisco State University Section (Student), John Barsotti, Faculty Advisor, AES Student Section, San Francisco State University, Broadcast and Electronic Communication Arts Dept., 1600 Holloway Ave., San Francisco, CA 94132, Tel. +1 415 338 1507, E-mail

Stanford University Section (Student), Jay Kadis, Faculty Advisor, Stanford AES Student Section, Stanford University, CCRMA/Dept. of Music, Stanford, CA 94305-8180, Tel. +1 650 723 4971, Fax +1 650 723 8468, E-mail

University of Southern California Section (Student), Kenneth Lopez, Faculty Advisor, AES Student Section, University of Southern California, 840 W. 34th St., Los Angeles, CA 90089-0851, Tel. +1 213 740 3224, Fax +1 213 740 3217, E-mail

COLORADO

Colorado Section, Robert F. Mahoney, Robert F. Mahoney & Associates, 310 Balsam Ave., Boulder, CO 80304, Tel. +1 303 443 2213, Fax +1 303 443 6989, E-mail

Denver Section (Student), Roy Pritts, Faculty Advisor, AES Student Section, University of Colorado at Denver, Dept. of Professional Studies, Campus Box 162, P.O. Box 173364, Denver, CO 80217-3364, Tel. +1 303 556 2795, Fax +1 303 556 2335, E-mail

OREGON

Portland Section, Tony Dal Molin, Audio Precision, Inc., 5750 S.W. Arctic Dr., Portland, OR 97005, Tel. +1 503 627 0832, Fax +1 503 641 8906, E-mail

UTAH

Brigham Young University Section (Student), Jim Anglesey, Faculty Advisor, BYU-AES Student Section, School of Music, Brigham Young University, Provo, UT 84602, Tel. +1 801 378 1299, Fax +1 801 378 5973 (Music Office), E-mail

Utah Section, Deward Timothy, c/o Poll Sound, 4026 S. Main, Salt Lake City, UT 84107, Tel. +1 801 261 2500, Fax +1 801 262 7379

WASHINGTON

Pacific Northwest Section, Gary Louie, University of Washington, School of Music, PO Box 353450, Seattle, WA 98195, Office Tel. +1 206 543 1218, Fax +1 206 685 9499, E-mail

The Art Institute of Seattle Section (Student), David G. Christensen, Faculty Advisor, AES Student Section, The Art Institute of Seattle, 2323 Elliott Ave., Seattle, WA 98121-1622, Tel. +1 206 239 2397, E-mail

CANADA

Alberta Section, Frank Lockwood, AES Alberta Section, Suite 404, 815 - 50 Avenue S.W., Calgary, Alberta T2S 1H8, Canada, Home Tel. +1 403 703 5277, Fax +1 403 762 6665, E-mail

Vancouver Section, Peter L. Janis, C-Tec, #114, 1585 Broadway, Port Coquitlam, B.C. V3C 2M7, Canada, Tel. +1 604 942 1001, Fax +1 604 942 1010, E-mail

Vancouver Student Section, Gregg Gorrie, Faculty Advisor, AES Greater Vancouver Student Section, Centre for Digital Imaging and Sound, 3264 Beta Ave., Burnaby, B.C. V5G 4K4, Canada, Tel. +1 604 298 5400, E-mail

NORTHERN REGION, EUROPE

Vice President: Søren Bech, Bang & Olufsen a/s, CoreTech, Peter Bangs Vej 15, DK-7600 Struer, Denmark, Tel. +45 96 84 49 62, Fax +45 97 85 59 50, E-mail

BELGIUM

Belgian Section, Hermann A. O. Wilms, AES Europe Region Office, Zevenbunderslaan 142, #9, BE-1190 Vorst-Brussels, Belgium, Tel. +32 2 345 7971, Fax +32 2 345 3419

DENMARK

Danish Section, Knud Bank Christensen, Skovvej 2, DK-8550 Ryomgård, Denmark, Tel. +45 87 42 71 46, Fax +45 87 42 70 10, E-mail





Danish Student Section, Torben Poulsen, Faculty Advisor, AES Student Section, Technical University of Denmark, Ørsted-DTU, Acoustic Technology, DTU - Building 352, DK-2800 Kgs. Lyngby, Denmark, Tel. +45 45 25 39 40, Fax +45 45 88 05 77, E-mail

FINLAND

Finnish Section, Kalle Koivuniemi, Nokia Research Center, P.O. Box 100, FI-33721 Tampere, Finland, Tel. +358 7180 35452, Fax +358 7180 35897, E-mail

NETHERLANDS

Netherlands Section, Rinus Boone, Voorweg 105A, NL-2715 NG Zoetermeer, Netherlands, Tel. +31 15 278 14 71, +31 62 127 36 51, Fax +31 79 352 10 08, E-mail

Netherlands Student Section, Dirk Fischer, AES Student Section, Groenewegje 143a, Den Haag, Netherlands, Home Tel. +31 70 3885958, E-mail

NORWAY

Norwegian Section, Jan Erik Jensen, Nøklesvingen 74, NO-0689 Oslo, Norway, Office Tel. +47 22 24 07 52, Home Tel. +47 22 26 36 13, Fax +47 22 24 28 06, E-mail

RUSSIA

All-Russian State Institute of Cinematography Section (Student), Leonid Sheetov, Faculty Sponsor, AES Student Section, All-Russian State Institute of Cinematography (VGIK), W. Pieck St. 3, RU-129226 Moscow, Russia, Tel. +7 095 181 3868, Fax +7 095 187 7174, E-mail

Moscow Section, Michael Lannie, Research Institute for Television and Radio, Acoustic Laboratory, 12-79 Chernomorsky bulvar, RU-113452 Moscow, Russia, Tel. +7 095 2502161, +7 095 1929011, Fax +7 095 9430006, E-mail

St. Petersburg Section, Irina A. Aldoshina, St. Petersburg University of Telecommunications, Gangutskaya St. 16, #31, RU-191187 St. Petersburg, Russia, Tel. +7 812 272 4405, Fax +7 812 316 1559, E-mail

St. Petersburg Student Section, Natalia V. Tyurina, Faculty Advisor, Prosvescheniya pr., 41, 185, RU-194291 St. Petersburg, Russia, Tel. +7 812 595 1730, Fax +7 812 316 1559, E-mail

SWEDEN

Swedish Section, Mikael Olsson, Audio Data Lab, Katarinavägen 22, SE-116 45 Stockholm, Sweden, Tel. +46 8 30 29 98, Fax +46 8 641 67 91, E-mail

University of Luleå-Piteå Section (Student), Lars Hallberg, Faculty Sponsor, AES Student Section, University of Luleå-Piteå, School of Music, Box 744, S-94134 Piteå, Sweden, Tel. +46 911 726 27, Fax +46 911 727 10, E-mail

UNITED KINGDOM

British Section, Heather Lane, Audio Engineering Society, P.O. Box 645, Slough GB-SL1 8BJ, United Kingdom, Tel. +44 1628 663725, Fax +44 1628 667002, E-mail

CENTRAL REGION, EUROPE

Vice President: Markus Erne, Scopein Research, Sonnmattweg 6, CH-5000 Aarau, Switzerland, Tel. +41 62 825 09 19, Fax +41 62 825 09 15, E-mail

AUSTRIA

Austrian Section, Franz Lechleitner, Lainergasse 7-19/2/1, AT-1230 Vienna, Austria, Office Tel. +43 1 4277 29602, Fax +43 1 4277 9296, E-mail

Graz Section (Student), Robert Höldrich, Faculty Sponsor, Institut für Elektronische Musik und Akustik, Inffeldgasse 10, AT-8010 Graz, Austria, Tel. +43 316 389 3172, Fax +43 316 389 3171, E-mail

Vienna Section (Student), Jürg Jecklin, Faculty Sponsor, Vienna Student Section, Universität für Musik und Darstellende Kunst Wien, Institut für Elektroakustik und Experimentelle Musik, Rienösslgasse 12, AT-1040 Vienna, Austria, Tel. +43 1 587 3478, Fax +43 1 587 3478 20, E-mail

CZECH REPUBLIC

Czech Section, Jiri Ocenasek, Dejvicka 36, CZ-160 00 Prague 6, Czech Republic, Home Tel. +420 2 24324556, E-mail

Czech Republic Student Section, Libor Husník, Faculty Advisor, AES Student Section, Czech Technical University at Prague, Technická 2, CZ-116 27 Prague 6, Czech Republic, Tel. +420 2 2435 2115, E-mail

GERMANY

Berlin Section (Student)
Bernhard Güttler
Zionskirchstrasse 14
DE-10119 Berlin, Germany
Tel. +49 30 4404 72 19
Fax +49 30 4405 39 03
E-mail

Central German Section
Ernst-Joachim Völker
Institut für Akustik und Bauphysik
Kiesweg 22-24
DE-61440 Oberursel, Germany
Tel. +49 6171 75031
Fax +49 6171 85483
E-mail

Darmstadt Section (Student)
G. M. Sessler, Faculty Sponsor
AES Student Section
Technical University of Darmstadt
Institut für Übertragungstechnik
Merkstr. 25
DE-64283 Darmstadt, Germany
Tel. +49 6151 162869
E-mail darmstadt

Detmold Section (Student)
Andreas Meyer, Faculty Sponsor
AES Student Section
c/o Erich Thienhaus Institut
Tonmeisterausbildung, Hochschule für Musik Detmold
Neustadt 22, DE-32756 Detmold, Germany
Tel./Fax +49 5231 975639
E-mail

Düsseldorf Section (Student)
Ludwig Kugler
AES Student Section
Bilker Allee 126
DE-40217 Düsseldorf, Germany
Tel. +49 211 3 36 80 38
E-mail duesseldorf

Ilmenau Section (Student)
Karlheinz Brandenburg, Faculty Sponsor
AES Student Section
Institut für Medientechnik
PF 10 05 65
DE-98684 Ilmenau, Germany
Tel. +49 3677 69 2676
Fax +49 3677 69 1255
E-mail ilmenau

North German Section
Reinhard O. Sahr
Eickhopskamp 3
DE-30938 Burgwedel, Germany
Tel. +49 5139 4978
Fax +49 5139 5977
E-mail

South German Section
Gerhard E. Picklapp
Landshuter Allee 162
DE-80637 Munich, Germany
Tel. +49 89 15 16 17
Fax +49 89 157 10 31
E-mail

HUNGARY

Hungarian Section
István Matók
Rona u. 102. II. 10
HU-1149 Budapest, Hungary
Home Tel. +36 1 383 24 81
Fax +36 1 383 24 81
E-mail

SECTIONS CONTACTS DIRECTORY

Page 137: AES organizations AES · AES Journal of the Audio Engineering Society (ISSN 0004-7554), Volume 50, Number 11, 2002 November Published monthly, except January/February and July/August

LITHUANIA

Lithuanian Section
Vytautas J. Stauskis
Vilnius Gediminas Technical University
Sauletekio al. 11
LT-2040 Vilnius, Lithuania
Tel. +370 2 700 492
Fax +370 2 700 498
E-mail

POLAND

Polish Section
Jan A. Adamczyk
University of Mining and Metallurgy
Dept. of Mechanics and Vibroacoustics
al. Mickiewicza 30
PL-30 059 Cracow, Poland
Tel. +48 12 617 30 55
Fax +48 12 633 23 14
E-mail

Technical University of Gdansk Section (Student)
Pawel Zwan
AES Student Section
Technical University of Gdansk, Sound Engineering Dept.
ul. Narutowicza 11/12
PL-80 952 Gdansk, Poland
Home Tel. +48 58 347 23 98
Office Tel. +48 58 3471301
Fax +48 58 3471114
E-mail

Wroclaw University of Technology Section (Student)
Andrzej B. Dobrucki, Faculty Sponsor
AES Student Section
Institute of Telecommunications and Acoustics
Wroclaw University of Technology
Wybrzeze Wyspianskiego 27
PL-503 70 Wroclaw, Poland
Tel. +48 71 320 30 68
Fax +48 71 320 31 89
E-mail wroclaw

REPUBLIC OF BELARUS

Belarus Section
Valery Shalatonin
Belarusian State University of Informatics and Radioelectronics
vul. Petrusya Brouki 6
BY-220027 Minsk, Republic of Belarus
Tel. +375 17 239 80 95
Fax +375 17 231 09 14
E-mail

SLOVAK REPUBLIC

Slovakian Republic Section
Richard Varkonda
Centron Slovakia Ltd.
Podhaj 107
SK-841 03 Bratislava, Slovak Republic
Tel. +421 7 6478 0767
Fax +421 7 6478 0042
E-mail slovakian

SWITZERLAND

Swiss Section
Joel Godel
AES Swiss Section
Sonnmattweg 6
CH-5000 Aarau, Switzerland
E-mail

UKRAINE

Ukrainian Section
Valentin Abakumov
National Technical University of Ukraine
Kiev Politechnical Institute
Politechnical St. 16
Kiev UA-56, Ukraine
Tel./Fax +38 044 2366093

SOUTHERN REGION, EUROPE

Vice President:
Daniel Zalay
Conservatoire de Paris, Dept. Son
FR-75019 Paris, France
Office Tel. +33 1 40 40 46 14
Fax +33 1 40 40 47 68
E-mail southern

BOSNIA-HERZEGOVINA

Bosnia-Herzegovina Section
Jozo Talajic
Bulevar Mese Selimovica 12
BA-71000 Sarajevo, Bosnia-Herzegovina
Tel. +387 33 455 160
Fax +387 33 455 163
E-mail bosnia

BULGARIA

Bulgarian Section
Konstantin D. Kounov
Bulgarian National Radio, Technical Dept.
4 Dragan Tzankov Blvd.
BG-1040 Sofia, Bulgaria
Tel. +359 2 65 93 37, +359 2 9336 6 01
Fax +359 2 963 1003
E-mail bulgaria

CROATIA

Croatian Section
Silvije Stamac
Hrvatski Radio
Prisavlje 3
HR-10000 Zagreb, Croatia
Tel. +385 1 634 28 81
Fax +385 1 611 58 29
E-mail

Croatian Student Section
Hrvoje Domitrovic, Faculty Advisor
AES Student Section
Faculty of Electrical Engineering and Computing
Dept. of Electroacoustics (X. Fl.)
Unska 3
HR-10000 Zagreb, Croatia
Tel. +385 1 6129 640
Fax +385 1 6129 852
E-mail croatian

FRANCE

Conservatoire de Paris Section (Student)
Alessandra Galleron
36, Ave. Parmentier
FR-75011 Paris, France
Tel. +33 1 43 38 15 94

French Section
Michael Williams
Ile du Moulin
62 bis Quai de l'Artois
FR-94170 Le Perreux sur Marne, France
Tel. +33 1 48 81 46 32
Fax +33 1 47 06 06 48
E-mail

Louis Lumière Section (Student)
Alexandra Carr-Brown
AES Student Section
Ecole Nationale Supérieure Louis Lumière
7, allée du Promontoire, BP 22
FR-93161 Noisy Le Grand Cedex, France
Tel. +33 6 18 57 84 41
E-mail louis

GREECE

Greek Section
Vassilis Tsakiris
Crystal Audio
Aiantos 3a Vrillissia
GR-15235 Athens, Greece
Tel. +30 2 10 6134767
Fax +30 2 10 6137010
E-mail

ISRAEL

Israel Section
Ben Bernfeld Jr.
H. M. Acustica Ltd.
20G/5 Mashabim St.
IL-45201 Hod Hasharon, Israel
Tel./Fax +972 9 7444099
E-mail

ITALY

Italian Section
Carlo Perretta
c/o AES Italian Section
Piazza Cantore 10
IT-20134 Milan, Italy
Tel. +39 338 9108768
Fax +39 02 58440640
E-mail

Italian Student Section
Franco Grossi, Faculty Advisor
AES Student Section
Viale San Daniele 29
IT-33100 Udine, Italy
Tel. +39 0432227527
E-mail italian

PORTUGAL

Portugal Section
Rui Miguel Avelans Coelho
R. Paulo Renato 1, 2A
PT-2745-147 Linda-a-Velha, Portugal
Tel. +351 214145827
E-mail portugal

ROMANIA

Romanian Section
Marcia Taiachin
Radio Romania
60-62 Grl. Berthelot St.
RO-79756 Bucharest, Romania
Tel. +40 1 303 12 07
Fax +40 1 222 69 19

SLOVENIA

Slovenian Section
Tone Seliskar
RTV Slovenija
Kolodvorska 2
SI-1550 Ljubljana, Slovenia
Tel. +386 61 175 2708
Fax +386 61 175 2710
E-mail slovenian

SPAIN

Spanish Section
Juan Recio Morillas
C/Florencia 14 3oD
ES-28850 Torrejon de Ardoz (Madrid), Spain
Tel. +34 91 540 14 03
E-mail spanish

TURKEY

Turkish Section
Sorgun Akkor
STD
Gazeteciler Sitesi, Yazarlar Sok. 19/6
Esentepe 80300 Istanbul, Turkey
Tel. +90 212 2889825
Fax +90 212 2889831
E-mail

YUGOSLAVIA

Yugoslavian Section
Tomislav Stanojevic
Sava centre
M. Popovica 9
YU-11070 Belgrade, Yugoslavia
Tel. +381 11 311 1368

1006 J. Audio Eng. Soc., Vol. 50, No. 11, 2002 November




Fax +381 11 605 578
E-mail yugo

LATIN AMERICAN REGION

Vice President:
Mercedes Onorato
Talcahuano 141
Buenos Aires, Argentina
Tel./Fax +5411 4 375 0116
E-mail latin

ARGENTINA

Argentina Section
Hernan Ranucci
Talcahuano 141
Buenos Aires, Argentina
Tel./Fax +5411 4 375 0116
E-mail

BRAZIL

Brazil Section
Rosalfonso Bortoni
Rua Doutor Jesuíno Maciel, 1584/22
Campo Belo
São Paulo, SP, Brazil 04615-004
Tel. +55 11 5533-3970
Fax +55 21 2421 0112
E-mail

CHILE

Chile Section
Andres Pablo Schmidt Ilic
Tonmusik
Hernan Cortez 2768
Ñuñoa, Santiago de Chile
Tel. +56 2 3792064
Fax +56 2 2513283
E-mail

COLOMBIA

Colombia Section
Tony Penarredonda Caraballo
Carrera 51 #13-223
Medellin, Colombia
Tel. +57 4 265 7000
Fax +57 4 265 2772
E-mail

MEXICO

Mexican Section
Javier Posada
Div. Del Norte #1008
Col. Del Valle
Mexico, D.F. MX-03100, Mexico
Tel. +52 5 669 48 79
Fax +52 5 543 60 37
E-mail mexican

URUGUAY

Uruguay Section
Rafael Abal
Sondor S.A.
Calle Rio Branco 1530
C.P. UY-11100 Montevideo, Uruguay
Tel. +598 2 901 26 70, +598 2 90253 88
Fax +598 2 902 52 72
E-mail

VENEZUELA

Taller de Arte Sonoro, Caracas Section (Student)
Carmen Bell-Smythe de Leal, Faculty Advisor
AES Student Section
Taller de Arte Sonoro
Ave. Rio de Janeiro, Qta. Tres Pinos
Chuao, VE-1061 Caracas, Venezuela
Tel. +58 14 9292552
Tel./Fax +58 2 9937296
E-mail

Venezuela Section
Elmar Leal
Ave. Rio de Janeiro, Qta. Tres Pinos
Chuao, VE-1061 Caracas, Venezuela
Tel. +58 14 9292552
Tel./Fax +58 2 9937296
E-mail

INTERNATIONAL REGION

Vice President:
Neville Thiele
10 Wycombe St.
Epping, NSW AU-2121, Australia
Tel. +61 2 9876 2407
Fax +61 2 9876 2749
E-mail

AUSTRALIA

Adelaide Section
David Murphy
Krix Loudspeakers
14 Chapman Rd.
Hackham AU-5163, South Australia
Tel. +618 8 8384 3433
Fax +618 8 8384 3419
E-mail

Brisbane Section
David Ringrose
AES Brisbane Section
P.O. Box 642, Roma St. Post Office
Brisbane, Qld. AU-4003, Australia
Office Tel. +61 7 3364 6510
E-mail

Melbourne Section
Graham J. Haynes
P.O. Box 5266
Wantirna South, Victoria AU-3152, Australia
Tel. +61 3 9887 3765
Fax +61 3 9887 1688
E-mail mel

Sydney Section
Howard Jones
AES Sydney Section
P.O. Box 766
Crows Nest, NSW AU-2065, Australia
Tel. +61 2 9417 3200
Fax +61 2 9417 3714
E-mail

HONG KONG

Hong Kong Section
Henry Ma Chi Fai
HKAPA, School of Film and Television
1 Gloucester Rd., Wanchai, Hong Kong
Tel. +852 2584 8824
Fax +852 2588 1303
E-mail

INDIA

India Section
Avinash Oak
Western Outdoor Media Technologies Ltd.
16, Mumbai Samachar Marg
Mumbai 400023, India
Tel. +91 22 204 6181
Fax +91 22 660 8144
E-mail

JAPAN

Japan Section
Katsuya (Vic) Goh
2-15-4 Tenjin-cho, Fujisawa-shi
Kanagawa-ken 252-0814, Japan
Tel. +81 466 81 0681
Fax +81 466 81 0698
E-mail

KOREA

Korea Section
Seong-Hoon Kang
Taejeon Health Science College
Dept. of Broadcasting Technology
77-3 Gayang-dong, Dong-gu
Taejeon, Korea
Tel. +82 42 630 5990
Fax +82 42 628 1423
E-mail

MALAYSIA

Malaysia Section
C. K. Ng
King Musical Industries Sdn Bhd
Lot 5, Jalan 13/2
MY-46200 Kuala Lumpur, Malaysia
Tel. +603 7956 1668
Fax +603 7955 4926
E-mail

PHILIPPINES

Philippines Section
Dario (Dar) J. Quintos
125 Regalia Park Tower
P. Tuazon Blvd., Cubao
Quezon City, Philippines
Tel./Fax +63 2 4211790, +63 2 4211784
E-mail

SINGAPORE

Singapore Section
Cedric M. M. Tio
Apt. Block 237
Bishan Street 22, #02-174
Singapore 570237, Republic of Singapore
Tel. +65 6887 4382
Fax +65 6887 7481
E-mail

Chair:
Dell Harris
Hampton University Section (AES)
63 Litchfield Close
Hampton, VA 23669
Tel. +1 757 265 1033
E-mail

Vice Chair:
Scott Cannon
Stanford University Section (AES)
P.O. Box 15259
Stanford, CA 94309
Tel. +1 650 346 4556
Fax +1 650 723 8468
E-mail

Chair:
Blaise Chabanis
Conservatoire de Paris (CNSMDP) Student Section (AES)
14, rue de la Faisanderie
FR-77200 Torcy, France
Tel. +336 62 15 29 97
E-mail

Vice Chair:
Werner de Bruijn
The Netherlands Student Section (AES)
Korvezeestraat 541
NL-2628 CZ Delft, The Netherlands
Home Tel. +31 15 2622995
Office Tel. +31 15 2782021
E-mail werner

EUROPE/INTERNATIONAL REGIONS

NORTH/SOUTH AMERICA REGIONS

STUDENT DELEGATE ASSEMBLY




AES CONVENTIONS AND CONFERENCES

113th Convention
Los Angeles, California, USA
Date: 2002 October 5–8
Location: Los Angeles Convention Center, Los Angeles, California, USA

The latest details on the following events are posted on the AES Website: http://www.aes.org

Fax: +81 3 5494 3219
Email

Convention vice chair: Hiroaki Suzuki
Victor Company of Japan (JVC)
Telephone: +81 45 450 1779
Email victor.co.jp

Papers chair: Shinji Koyano
Pioneer Corporation
Telephone: +81 49 279 2627
Fax: +81 49 279 1513
Email pioneer.co.jp

Workshops chair: Toru Kamekawa
Tokyo National University of Fine Art & Music
Telephone: +81 3 297 73 8663
Fax: +81 297 73 8670
Email

11th Regional Convention
Tokyo, Japan
Date: 2003 July 7–9
Location: Science Museum, Chiyoda, Tokyo, Japan
Convention chair: Kimio Hamasaki
NHK Science & Technical Research Laboratories
Telephone: +81 3 5494 3208

Convention chair: Floyd Toole
Harman International
8500 Balboa Blvd.
Northridge, CA 91329, USA
Telephone: +1 818 895 5761
Fax: +1 818 893 7139
Email

Papers cochair: John Strawn
S Systems, Inc.
Telephone: +1 415 927 8856
Email

Papers cochair: Eric Benjamin
Dolby Laboratories, Inc.
Telephone: +1 415 558 0236
Email

Exhibit information: Chris Plunkett
Telephone: +1 212 661 8528

DSP-Acoustics & Sound Reproduction
Philips Research Labs, WY81
Prof. Hostlaan 4
5656 AA Eindhoven, The Netherlands
Telephone: +31 40 274 3149
Fax: +31 40 274 3230
Email

Convention chair: Peter A. Swarte
P.A.S. Electro-Acoustics
Graaf Adolfstraat 85
5616 BV Eindhoven, The Netherlands
Telephone: +31 40 255 0889
Email

Papers chair: Ronald M. Aarts
Vice chair: Erik Larsen

Conference chair: Per Rubak
Aalborg University
Fredrik Bajers Vej 7 A3-216
DK-9220 Aalborg Ø, Denmark
Telephone: +45 9635 8682
Email

Papers cochair: Jan Abildgaard Pedersen
Bang & Olufsen A/S
Peter Bangs Vej 15
P.O. Box 40, DK-7600 Struer
Phone: +45 9684 1122
Email

Papers cochair: Lars Gottfried Johansen

23rd International Conference
Copenhagen, Denmark
"Signal Processing in Audio Recording and Reproduction"
Date: 2003 May 23–25
Location: Marienlyst Hotel, Helsingør, Copenhagen, Denmark

114th Convention
Amsterdam, The Netherlands
Date: 2003 March 22–25
Location: RAI Conference and Exhibition Centre, Amsterdam, The Netherlands

Papers chair: Geoff Martin
Email

Conference chair: Theresa Leonard
The Banff Centre
Banff, Canada
Email

Conference vice chair: John Sorensen
The Banff Centre
Banff, Canada
Email

24th International Conference
Banff, Canada
"Multichannel Audio: The New Reality"
Date: 2003 June 26–28
Location: The Banff Centre, Banff, Alberta, Canada

115th Convention
New York, NY, USA
Date: 2003 October 10–13
Location: Jacob K. Javits Convention Center, New York, New York, USA



Exhibit chair: Tadahiko Nakaoki
Pioneer Business Systems Division
Telephone: +81 3 3763 9445
Fax: +81 3 3763 3138
Email

Section contact: Vic Goh
Email

Call for papers: This issue, p. 1001 (2002 November)

Presentation
Manuscripts submitted should be typewritten on one side of ISO size A4 (210 mm x 297 mm) or 216-mm x 280-mm (8.5-inch x 11-inch) paper with 40-mm (1.5-inch) margins. All copies, including abstract, text, references, figure captions, and tables, should be double-spaced. Pages should be numbered consecutively. Authors should submit an original plus two copies of text and illustrations.

Review
Manuscripts are reviewed anonymously by members of the review board. After the reviewers' analysis and recommendation to the editors, the author is advised of either acceptance or rejection. On the basis of the reviewers' comments, the editor may request that the author make certain revisions which will allow the paper to be accepted for publication.

Content
Technical articles should be informative and well organized. They should cite original work or review previous work, giving proper credit. Results of actual experiments or research should be included. The Journal cannot accept unsubstantiated or commercial statements.

Organization
An informative and self-contained abstract of about 60 words must be provided. The manuscript should develop the main point, beginning with an introduction and ending with a summary or conclusion. Illustrations must have informative captions and must be referred to in the text.

References should be cited numerically in brackets in order of appearance in the text. Footnotes should be avoided, when possible, by making parenthetical remarks in the text.

Mathematical symbols, abbreviations, acronyms, etc., which may not be familiar to readers must be spelled out or defined the first time they are cited in the text.

Subheads are appropriate and should be inserted where necessary. Paragraph division numbers should be of the form 0 (only for introduction), 1, 1.1, 1.1.1, 2, 2.1, 2.1.1, etc.

References should be typed on a manuscript page at the end of the text in order of appearance. References to periodicals should include the authors' names, title of article, periodical title, volume, page numbers, year and month of publication. Book references should contain the names of the authors, title of book, edition (if other than first), name and location of publisher, publication year, and page numbers. References to AES convention preprints should be replaced with Journal publication citations if the preprint has been published.

Illustrations
Figure captions should be typed on a separate sheet following the references. Captions should be concise. All figures should be labeled with the author's name and figure number.

Photographs should be black-and-white prints without a halftone screen, preferably 200 mm x 250 mm (8 inch by 10 inch).

Line drawings (graphs or sketches) can be original drawings on white paper, or high-quality photographic reproductions.

The size of illustrations when printed in the Journal is usually 82 mm (3.25 inches) wide, although 170 mm (6.75 inches) wide can be used if required. Letters on original illustrations (before reduction) must be large enough so that the smallest letters are at least 1.5 mm (1/16 inch) high when the illustrations are reduced to one of the above widths. If possible, letters on all original illustrations should be the same size.

Units and Symbols
Metric units according to the International System of Units (SI) should be used. For more details, see G. F. Montgomery, "Metric Review," JAES, Vol. 32, No. 11, pp. 890–893 (1984 Nov.) and J. G. McKnight, "Quantities, Units, Letter Symbols, and Abbreviations," JAES, Vol. 24, No. 1, pp. 40, 42, 44 (1976 Jan./Feb.). Following are some frequently used SI units and their symbols, some non-SI units that may be used with SI units (), and some non-SI units that are deprecated ().

Unit Name                        Unit Symbol
ampere                           A
bit or bits                      spell out
bytes                            spell out
decibel                          dB
degree (plane angle) ()          °
farad                            F
gauss ()                         Gs
gram                             g
henry                            H
hertz                            Hz
hour ()                          h
inch ()                          in
joule                            J
kelvin                           K
kilohertz                        kHz
kilohm                           kΩ
liter ()                         l, L
megahertz                        MHz
meter                            m
microfarad                       µF
micrometer                       µm
microsecond                      µs
milliampere                      mA
millihenry                       mH
millimeter                       mm
millivolt                        mV
minute (time) ()                 min
minute (plane angle) ()          ’
nanosecond                       ns
oersted ()                       Oe
ohm                              Ω
pascal                           Pa
picofarad                        pF
second (time)                    s
second (plane angle) ()          ”
siemens                          S
tesla                            T
volt                             V
watt                             W
weber                            Wb

INFORMATION FOR AUTHORS

Email

Call for papers: Vol. 50, No. 1/2, p. 100 (2002 January/February)

Call for workshops participants: Vol. 50, No. 3, p. 206 (2002 March)

Convention preview: Vol. 50, No. 7/8, pp. 606–628 (2002 July/August)

Convention report: This issue, pp. 934–978 (2002 November)

Exhibit information: Thierry Bergmans
Telephone: +32 2 345 7971
Fax: +32 2 345 3419
Email

Call for papers: Vol. 50, No. 6, p. 535 (2002 June)

Aalborg University
Niels Jernes Vej 14, 4
DK-9220 Aalborg Ø
Phone: +45 9635 9828
Email

Call for papers: Vol. 50, No. 9, p. 737 (2002 September)

Call for contributions: Vol. 50, No. 10, pp. 851–852 (2002 October)


SUSTAINING MEMBER ORGANIZATIONS


JOURNAL OF THE AUDIO ENGINEERING SOCIETY
AUDIO / ACOUSTICS / APPLICATIONS
Volume 50 Number 11 2002 November

In this issue…

Noisy Decay Parameters

Improved Time-Frequency Analysis

Expanding Surround Sound

Synthesizing Surround Sound

Features…

113th Convention Report, Los Angeles

Bluetooth and Wireless Networking

11th Regional Convention, Tokyo—Call for Papers

The Audio Engineering Society recognizes with gratitude the financial support given by its sustaining members, which enables the work of the Society to be extended. Addresses and brief descriptions of the business activities of the sustaining members appear in the October issue of the Journal.

The Society invites applications for sustaining membership. Information may be obtained from the Chair, Sustaining Memberships Committee, Audio Engineering Society, 60 East 42nd St., Room 2520, New York, New York 10165-2520, USA, tel: 212-661-8528, fax: 212-682-0477.

ACO Pacific, Inc.
Air Studios Ltd.
AKG Acoustics GmbH
AKM Semiconductor, Inc.
Amber Technology Limited
AMS Neve plc
ATC Loudspeaker Technology Ltd.
Audio Limited
Audiomatica S.r.l.
Audio Media/IMAS Publishing Ltd.
Audio Precision, Inc.
AudioScience, Inc.
Audio-Technica U.S., Inc.
AudioTrack Corporation
Autograph Sound Recording Ltd.
B & W Loudspeakers Limited
BMP Recording
British Broadcasting Corporation
BSS Audio
Cadac Electronics PLC
Calrec Audio
Canford Audio plc
CEDAR Audio Ltd.
Celestion International Limited
Cerwin-Vega, Incorporated
ClearOne Communications Corp.
Community Professional Loudspeakers, Inc.
Crystal Audio Products/Cirrus Logic Inc.
D.A.S. Audio, S.A.
D.A.T. Ltd.
dCS Ltd.
Deltron Emcon Limited
Digidesign
Digigram
Digital Audio Disc Corporation
Dolby Laboratories, Inc.
DRA Laboratories
DTS, Inc.
DYNACORD, EVI Audio GmbH
Eastern Acoustic Works, Inc.
Eminence Speaker LLC

Event Electronics, LLC
Ferrotec (USA) Corporation
Focusrite Audio Engineering Ltd.
Fostex America, a division of Foster Electric U.S.A., Inc.
Fraunhofer IIS-A
FreeSystems Private Limited
FTG Sandar TeleCast AS
Harman Becker
HHB Communications Ltd.
Innova SON
Innovative Electronic Designs (IED), Inc.
International Federation of the Phonographic Industry
JBL Professional
Jensen Transformers Inc.
Kawamura Electrical Laboratory
KEF Audio (UK) Limited
Kenwood U.S.A. Corporation
Klark Teknik Group (UK) Plc
Klipsch L.L.C.
Laboratories for Information
L-Acoustics US
Leitch Technology Corporation
Lindos Electronics
Magnetic Reference Laboratory (MRL) Inc.
Martin Audio Ltd.
Meridian Audio Limited
Metropolis Group
Middle Atlantic Products Inc.
Mosses & Mitchell
M2 Gauss Corp.
Music Plaza Pte. Ltd.
Georg Neumann GmbH
Neutrik AG
NVision
NXT (New Transducers Ltd.)
1 Limited
Ontario Institute of Audio Recording Technology
Outline snc
PRIMEDIA Business Magazines & Media Inc.
Prism Sound
Pro-Bel Limited
Pro-Sound News
Radio Free Asia
Rane Corporation
Recording Connection
Rocket Network
Royal National Institute for the Blind
RTI Tech Pte. Ltd.
Rycote Microphone Windshields Ltd.
SADiE
Sanctuary Studios Ltd.
Sekaku Electron Ind. Co., Ltd.
Sennheiser Electronic Corporation
Shure Inc.
Snell & Wilcox Ltd.
Solid State Logic, Ltd.
Sony Broadcast & Professional Europe
Sound Devices LLC
Sound On Sound Ltd.
Soundcraft Electronics Ltd.
Soundtracs plc
Sowter Audio Transformers
SRS Labs, Inc.
Stage Accompany
Sterling Sound, Inc.
Studer North America Inc.
Studer Professional Audio AG
Tannoy Limited
TASCAM
THAT Corporation
TOA Electronics, Inc.
Touchtunes Music Corp.
United Entertainment Media, Inc.
Uniton AG
University of Derby
University of Salford
University of Surrey, Dept. of Sound Recording
VidiPax
Wenger Corporation
J. M. Woodgate and Associates
Yamaha Research and Development