
Page 1: TablaNet: a Real-Time Online Musical Collaboration System ...alumni.media.mit.edu/~mihir/documents/mihir_smthesis_proposal.pdf · TablaNet: a Real-Time Online Musical Collaboration

TablaNet: a Real-Time Online Musical Collaboration System

for Indian Percussion

Mihir Sarkar

Thesis Proposal for the Degree of Master of Science

at the Massachusetts Institute of Technology

Fall 2006

Thesis Advisor
Barry L. Vercoe
Professor of Media Arts and Sciences
Massachusetts Institute of Technology

Thesis Reader
Tod Machover
Professor of Music and Media
Massachusetts Institute of Technology

Thesis Reader
Miller S. Puckette
Professor, Music
Associate Director, Center for Research in Computing and the Arts
University of California, San Diego


Abstract

Distance education in music stands to benefit from real-time interactions over the Internet. For instance, we can imagine an instructor living in a city teaching music to children in villages so as to enhance or help maintain their local traditions. At the same time, online music performance systems rely on real-time communication platforms over fast and robust data networks. In this context, I propose to develop TablaNet, a real-time online musical collaboration system for the tabla, a pair of North Indian hand drums. I selected the tabla not only because of my familiarity with it, but also because of its “intermediate complexity” as a percussion instrument: although tabla patterns are based only on rhythmic compositions, without melodic or harmonic structure, different strokes can produce more than 10 pitched and unpitched sounds, called bols, which contribute to the tabla’s expressive potential. Unlike other networked music performance projects, which attempt to optimize the audio stream in order to minimize the network latency, I plan to transmit symbolic information over the network. By listening to individual drum sounds and automatically recognizing them at the near-end, the system will be able, based on the prior events received, to predict and synthesize rhythmic phrases with the appropriate pitch and tempo at the far-end. The system will be evaluated on quantitative grounds, such as its latency tolerance and audio quality, as well as in terms of its “playability” by tabla players of various levels.


Table of Contents

1 Introduction

2 Background
    2.1 Motivation
    2.2 Related work

3 Proposed Approach
    3.1 Research Methodology
    3.2 Initial Study
    3.3 System Design
    3.4 Hardware Implementation
    3.5 Software Implementation

4 Evaluation
    4.1 Expected Contributions
    4.2 Quantitative Results
    4.3 Qualitative Results

5 Planning
    5.1 Deliverables
    5.2 Schedule
    5.3 Resources

References

Outside Reader Biography
    Miller S. Puckette


1 Introduction

Hand drums are essential to Indian music; they are used not only for rhythmic accompaniment but also in call-and-response “duels” and solo performances. However, it is sometimes difficult to find instruction for these instruments in areas with different musical traditions (e.g. between the North and the South of India, or between rural areas, where classical instruments may be difficult to come by, and cities, which may have limited access to folk culture). Moreover, with people being increasingly mobile and “connected”, communication services (in particular over data networks) are becoming ever more relevant, both socially (e.g. through “social networking”) and culturally, as a possible means to sustain indigenous artistic traditions. In this context, I propose to develop TablaNet, a real-time online musical collaboration system for Indian percussion involving machine listening.

The main challenge in this application is to overcome network latency. Musicians need to

be perceptually synchronized with one another while data travels on the network. I plan

to solve this problem by writing software that (i) recognizes individual drum strokes and

extracts higher-level rhythmic features from the input signal, (ii) transmits symbolic events

over the network instead of an audio stream, and (iii) synthesizes rhythmic phrases at the

output by using previous events to predict current patterns.

In this project, I will focus my attention on the tabla, the most popular percussion instrument in North India. I expect my results to generalize to other percussion instruments of a similar nature. This work will result in a playable prototype, a simulation environment for testing and demonstration, a video presentation, and my master’s thesis, which will document this study. After introducing the background to this project and mentioning previous work in this area, I will outline my approach to solving this problem. I will then present the evaluation criteria, and define the project plan and requirements.

2 Background

2.1 Motivation

While growing up in France, I missed being able to play with my musician friends in India and the US. To overcome this situation, we would mail each other multitrack cassettes on which we had recorded one or more tracks. The Internet made this process faster, if not easier, but we were still far from being able to “jam” together. This inspired me to devise a system to enable musicians to play together in real time over the Internet.


2.2 Related work

2.2.1 Tabla Analysis & Synthesis

I shall not describe the tabla in this document (the reader is invited to wait for my master’s thesis, where background information and references will be provided). Probably because it is one of the most popular Indian instruments, and possibly because of its timbral quality (its ability to produce both pitched and unpitched sounds), several researchers have investigated the questions of modeling and simulating the tabla. There have been a number of attempts to recognize tabla strokes using statistical pattern classification (from [Gillet and Richard, 2003] and [Chatwani, 2003] to [Samudravijaya et al., 2004] and [Chordia, 2005]). However, all these methods analyze recorded performances, and are not necessarily applicable to live performances, which may be affected by varying environmental conditions and captured with sensors other than microphones. There have been different types of electronic tabla controllers (see [Hun Roh and Wilcox, 1995] and [Kapur et al., 2003a]), some of which use tabla sounds that are generated with physical models [Kapur et al., 2004]. Moreover, substantial progress has been made in representing complex rhythms with a linguistic model [Kippen and Bel, 1992] that has been implemented on the Bol Processor [Kippen and Bel, 1994]. However, there has been no work, as far as I know, in the area of phrase prediction for percussion instruments (see [Chafe, 1997] on the prediction of solo piano performance).

2.2.2 Networked Musical Performance

Since the advent of the Internet, musicians have been looking at online music collaboration as the next “killer app”. In fact, the network music performance space has been and continues to be the source of several commercial endeavors (from the defunct Rocket Network to Ninjam, Audio Fabric, and Lightspeed Audio Labs, a new startup still in stealth mode). However, these efforts (e.g. [Cooperstock and Spackman, 2001], [Kapur et al., 2003b], [Sawchuk et al., 2003] and [Weinberg, 2005b]) are restricted by a hard theoretical constraint: the inherent latency of computer networks. This delay, which is bounded below by the speed of light, is undesirable for music traveling over long distances. In spite of that, most current projects still attempt to minimize latency either by sending MIDI commands, or by trying to optimize the trade-off between audio stream compression and algorithmic complexity (e.g. [Lazzaro and Wawrzynek, 2001], [Chatwani and Koren, 2004], [Gu et al., 2004]). Some projects even rely on improved and faster networks, such as the experimental Internet2 [Bargar et al., 1998]. More recently, studies have been conducted on the effects of time delay on musician synchronization (see [Chafe et al., 2004], [Chew et al., 2004] and [Maki-Patola, 2005]). Some researchers, notably Chris Chafe of CCRMA at Stanford University, have also found creative ways to turn network latency to their advantage by converting delays into reverberation [Chafe, 2003]. Thus, despite meeting with limited success, researchers are finding new ways to interact musically over the Internet, and several roadmaps have been proposed for networked musical performance (for instance [Weinberg, 2005a] and [Kapur et al., 2005]).
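The speed-of-light bound mentioned above can be made concrete with a short calculation. The following is a minimal sketch, not part of the proposed system: the 12,000 km path length and the 2/3 velocity factor for optical fiber are illustrative assumptions, and real networks add routing, queuing, and processing delays on top of this floor.

```python
# Lower bound on one-way network delay from signal propagation alone.
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
FIBER_VELOCITY_FACTOR = 2 / 3  # assumption: light in glass fiber travels at roughly 2/3 c


def min_one_way_delay_ms(distance_km: float, velocity_factor: float = 1.0) -> float:
    """Return the propagation delay in milliseconds over distance_km."""
    speed_m_per_s = SPEED_OF_LIGHT_M_PER_S * velocity_factor
    return distance_km * 1000.0 / speed_m_per_s * 1000.0


if __name__ == "__main__":
    distance_km = 12_000  # hypothetical path length between two distant players
    vacuum = min_one_way_delay_ms(distance_km)
    fiber = min_one_way_delay_ms(distance_km, FIBER_VELOCITY_FACTOR)
    print(f"vacuum: {vacuum:.1f} ms one way")
    print(f"fiber:  {fiber:.1f} ms one way")
```

Under these assumptions the one-way floor is about 40 ms in vacuum and about 60 ms in fiber, before any equipment delay is counted, which is why faster networks alone cannot remove the problem over long distances.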

3 Proposed Approach

3.1 Research Methodology

I propose to develop a computer system to enable real-time online musical collaboration between two tabla players. The principles of this application, although specific to Indian percussion, can be extended and generalized to other instruments and cultures. This system will be evaluated with human tabla players using the system in a live setting conducive to musical exchange. We shall also discuss the importance of interactions via other modalities such as speech or vision, which carry instructions, appreciative sounds and gestures, and an “excitement factor” among the musicians and the audiences at both ends. An initial assessment may be conducted on the importance of visual contact between musicians (in particular tabla players) playing together in order to evaluate the relevance of a networked music performance system offering only audio as a communication channel.

Several risks could impede progress on this project; for instance, there could be technical difficulties, such as a subsystem not attaining the expected quality (e.g. a low tabla stroke recognition rate). However, risk is inherent in research, and although I shall find ways to mitigate these risks as much as possible in the course of this study, I shall not detail them further in this document.

3.2 Initial Study

I conducted preliminary work in which I demonstrated the concept presented in this document by sensing vibrations on the tabla drumhead, analyzing stroke onsets, and transmitting tempo and quantized onset events over UDP (User Datagram Protocol), a connectionless transport protocol that does not guarantee delivery. The receiver triggered sampled tabla sounds on reception of the events. This application was prototyped in the Max/MSP environment.
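The preliminary prototype was built in Max/MSP; the sketch below only illustrates the same idea in Python, and every detail of it is an assumption: the packet layout (`EVENT_FORMAT`), the grid of four steps per beat, and the helper names are hypothetical and are not taken from the prototype. It quantizes an onset time to a tempo grid, packs the event into a small datagram, and sends it over UDP on the loopback interface.

```python
import socket
import struct

# Hypothetical wire format: sequence number, tempo in BPM, quantized grid step.
EVENT_FORMAT = "!IfH"  # network byte order: uint32, float32, uint16


def quantize_onset(onset_s: float, tempo_bpm: float, steps_per_beat: int = 4) -> int:
    """Snap an onset time (seconds) to the nearest grid step at the given tempo."""
    step_s = 60.0 / tempo_bpm / steps_per_beat
    return round(onset_s / step_s)


def pack_event(seq: int, tempo_bpm: float, step: int) -> bytes:
    """Serialize one symbolic event into a fixed-size payload."""
    return struct.pack(EVENT_FORMAT, seq, tempo_bpm, step)


def unpack_event(payload: bytes) -> tuple:
    """Recover (seq, tempo_bpm, step) from a payload."""
    return struct.unpack(EVENT_FORMAT, payload)


def send_over_loopback(payload: bytes) -> bytes:
    """Send one datagram to a local receiver and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
        receiver.settimeout(2.0)  # UDP gives no delivery guarantee, so do not wait forever
        sender.sendto(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(1024)
        return data


if __name__ == "__main__":
    step = quantize_onset(onset_s=1.02, tempo_bpm=120.0)
    payload = pack_event(seq=7, tempo_bpm=120.0, step=step)
    print(len(payload), unpack_event(send_over_loopback(payload)))
```

The sequence number is included because UDP may drop or reorder datagrams; a receiver can use it to detect gaps. Each event here occupies 10 bytes, far less than the audio it stands for.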


3.3 System Design

The proposed system architecture is described in the TablaNet system diagram (fig. 1). This proposal does not go into further detail about the network infrastructure, the computer system (a standard configuration, probably under Linux), or the audio speakers.

[Figure 1: The TablaNet System Diagram. Blocks shown at each end: Tabla, Sensors, Mixer / amplifier, Computer system, Speaker; the two ends are linked by the Network block.]

3.4 Hardware Implementation

The TablaNet system, although mostly software-based, relies on important pieces of hardware. A microphone would pick up the speakers, which play the audio signal generated by the far-end, and this feedback would generate false alarms. To avoid it, I plan to use vibration sensors (most probably piezo-electric films) placed directly on the tabla drumheads. The outputs of these sensors will be fed into a pre-amplified mixer, chosen with the frequency range of tabla sounds in mind, and will finally enter the A-to-D converter on the computer.

3.5 Software Implementation

The computer program at the near-end will contain code to extract features from the audio input, classify incoming tabla strokes based on those features, and perform higher-level operations, such as extracting the tempo. The application will then transmit the data to the far-end computer over the Internet. The receiver will reassemble the packets, and generate a tabla phrase in real time based on the events received up to that point in time. A major part of the work will be to design the tabla phrase prediction algorithm. The target software environment (i.e. language, IDE) has not been decided yet. Tabla sound synthesis at the far-end will be based either on a physical model, so as to offer maximum control over the sound quality (e.g. pitch slides), or on sample playback (e.g. wavetable synthesis or soundfonts, which sometimes offer only limited control over some sound parameters), in order to limit the additional work of designing a tabla sound synthesizer.
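The phrase prediction algorithm is still to be designed, so the following is only a stand-in that shows what "predicting from the events received so far" can look like in its simplest form. It is a fixed-order Markov (n-gram) model over stroke labels; the class name `BolPredictor`, the order of 2, and the stroke labels in the usage example are all illustrative assumptions, not the proposed method.

```python
from collections import Counter, defaultdict


class BolPredictor:
    """Order-n Markov predictor over stroke labels (a stand-in, not the proposed algorithm)."""

    def __init__(self, order: int = 2):
        self.order = order
        self.counts = defaultdict(Counter)  # context tuple -> Counter of following strokes
        self.history = []

    def observe(self, bol: str) -> None:
        """Record a received stroke and update the transition counts."""
        if len(self.history) >= self.order:
            context = tuple(self.history[-self.order:])
            self.counts[context][bol] += 1
        self.history.append(bol)

    def predict(self):
        """Return the most frequent continuation of the current context, or None."""
        if len(self.history) < self.order:
            return None
        context = tuple(self.history[-self.order:])
        followers = self.counts.get(context)
        if not followers:
            return None  # this context has never been seen before
        return followers.most_common(1)[0][0]


if __name__ == "__main__":
    predictor = BolPredictor(order=2)
    for bol in ["dha", "dhin", "dhin", "dha"] * 3:  # illustrative repeating pattern
        predictor.observe(bol)
    print(predictor.predict())
```

A far-end synthesizer could play the predicted stroke on the expected grid step and correct itself when the actual event arrives. A model this simple cannot capture long-range phrase structure, which is one reason the prediction engine is treated as a major part of the work.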

4 Evaluation

4.1 Expected Contributions

I expect that my research work will result in the following contributions:

• Design a networked tabla performance system

• Develop an extensible tabla phrase prediction engine

• Implement a real-time continuous tabla strokes recognizer

• Realize a sensor interface for percussion with no audio feedback based on an array of piezo-electric sensors placed on each tabla head and an appropriate amplifier interface

• Create a real-world musical interaction between two tabla musicians over a computer network

4.2 Quantitative Results

The system will be evaluated on the following criteria:

• Tabla strokes recognition rate, and comparison with existing systems

• One-way and round-trip time delay (network latency), and comparison with the maximum delay that is perceptually allowable

• Tabla phrase prediction error rate

• Output audio quality as rated by listeners (non-performers), based on a statistical perceptual assessment
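Two of the criteria above, the stroke recognition rate and the phrase prediction error rate, reduce to comparing a sequence of reference labels with a sequence of system outputs. The sketch below shows one straightforward way to compute them; it assumes the two sequences are already aligned stroke by stroke, and the function names and sample labels are hypothetical.

```python
from collections import Counter


def recognition_rate(reference, recognized):
    """Fraction of strokes whose recognized label equals the reference label."""
    if len(reference) != len(recognized):
        raise ValueError("sequences must be aligned and of equal length")
    if not reference:
        raise ValueError("sequences must not be empty")
    correct = sum(r == h for r, h in zip(reference, recognized))
    return correct / len(reference)


def confusions(reference, recognized):
    """Count (reference, recognized) pairs for the misclassified strokes only."""
    return Counter((r, h) for r, h in zip(reference, recognized) if r != h)


if __name__ == "__main__":
    reference = ["dha", "dhin", "ta", "tin"]   # illustrative hand-labeled strokes
    recognized = ["dha", "dhin", "tin", "tin"]  # illustrative system output
    rate = recognition_rate(reference, recognized)
    print(f"recognition rate: {rate:.2f}, error rate: {1 - rate:.2f}")
    print(confusions(reference, recognized))
```

The same function applied to predicted versus actually played strokes gives the prediction error rate as one minus the returned value. The confusion counts show which strokes are mistaken for which, a breakdown that is useful when comparing against other systems.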


4.3 Qualitative Results

In addition to the quantitative assessment, we will examine the system’s “playability” by tabla players of various levels (beginner = less than 1 year of experience; intermediate = from 1 to 3 years of experience; and expert = more than 3 years of experience). Experiments will involve activities in the areas of:

• Distance learning

• Rhythmic accompaniment

• Call and response (called Jugalbandi)

Network latency will be simulated using median and worst-case figures. After playing on the system for various periods of time, tabla players at both ends as well as the audience will be asked to comment on whether the system meets their expectations in terms of how “natural” the rhythmic patterns (variety, quantization, etc.) and audio output sound. Results will be collected in the form of a survey and evaluated with a formal quantitative coding system for qualitative data. I hope that the prototype will give musicians the impression of playing with a fellow musician, rather than just playing with (or against) a machine. Questionnaire responses will be included as an appendix to my master’s thesis.
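One simple way to simulate network latency in such experiments is to hold each event for a fixed time before delivering it. The sketch below is an assumption about how this could be done, not a description of the planned simulation environment; the class name `DelayLine` and the 80 ms and 250 ms figures are placeholders for measured median and worst-case values.

```python
import heapq


class DelayLine:
    """Hold timestamped events and release them after a fixed simulated delay."""

    def __init__(self, delay_ms: float):
        self.delay_ms = delay_ms
        self._queue = []  # heap of (delivery_time_ms, insertion_order, event)
        self._count = 0   # tie-breaker so events with equal times keep their order

    def send(self, now_ms: float, event) -> None:
        """Schedule an event for delivery delay_ms after now_ms."""
        heapq.heappush(self._queue, (now_ms + self.delay_ms, self._count, event))
        self._count += 1

    def receive(self, now_ms: float) -> list:
        """Return every event whose delivery time has passed, in delivery order."""
        ready = []
        while self._queue and self._queue[0][0] <= now_ms:
            ready.append(heapq.heappop(self._queue)[2])
        return ready


if __name__ == "__main__":
    # Placeholder figures; real values would come from network measurements.
    for label, delay_ms in [("median", 80.0), ("worst case", 250.0)]:
        line = DelayLine(delay_ms)
        line.send(0.0, "dha")
        line.send(10.0, "dhin")
        print(label, [line.receive(t) for t in (50.0, 100.0, 300.0)])
```

Running the same session once per delay setting lets players compare how the system feels under typical and unfavorable conditions without depending on the state of a live network.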

5 Planning

5.1 Deliverables

The deliverables for this project fall under two categories: a working prototype suitable

for live demonstration and simulation (i.e. one tabla player versus the computer), and a

technical description of my work in the form of my master’s thesis, which will document the

design choices, implementation details, and results of this study. In addition, I intend to

present the results of this research at appropriate venues (e.g. the Conference on Human

Factors in Computing Systems (CHI), the Audio Engineering Society (AES) Convention, the

International Conference on New Interfaces for Musical Expression (NIME), or the Sound

and Music Computing (SMC) Conference). I will also produce a short audio/video segment

to illustrate various usage scenarios of the system in action (e.g. rhythmic accompaniment,

call and response).


5.2 Schedule

January
• Background research
• Preliminary tabla strokes dataset collection
• Discrete tabla strokes identification (offline simulation)
• COUHES (MIT Committee on the Use of Humans as Experimental Subjects) application for data gathering and system testing

February
• Sensor interface design and development
• Complete tabla strokes dataset collection
• Continuous tabla strokes identification (real-time processing)
• Article on TablaNet system architecture

March
• User interface and system prototyping
• Networked musical collaboration environment
• Tabla sound synthesis (sample playback)
• Master’s thesis first draft

April
• Learning and prediction of tabla performance
• Tabla sound synthesis (physical model)
• System testing and evaluation
• Master’s thesis review and final draft

May
• Video footage and production
• Prototype demonstration
• Master’s thesis submission
• Article on tabla strokes identification and phrase prediction


5.3 Resources

The resources required to carry out this project are:

• A tabla set (available from Prof. Barry Vercoe)

• Microphone, pre-amplifier, audio cables (available)

• 2 audio speakers (to be procured through an internal channel)

• 2 computers for demonstration (to be procured through an internal channel)

• Development platforms (Mac OS X and Windows XP available, Linux to be installed)

• Audio software and development environment (partially available)

• Vibration (piezo) sensors (partially available)

• Electronic parts for sensor interface and pre-amplifier (to be purchased)

• Participation incentives (gift coupons or the like) for dataset gathering and system testing

As far as recording tabla strokes and testing the system are concerned, I have access to a

relatively large number of tabla players of various levels at the Media Lab, and through

Sangam (the MIT Indian students association) and the music school of Sangeet, a Harvard

University student-run organization dedicated to South Asian music. In addition, several

Media Lab students can help me with recording and editing the video footage.


References

R. Bargar, S. Church, A. Fukuda, J. Grunke, D. Keislar, B. Moses, B. Novak, B. Pennycook, Z. Settel, J. Strawn, et al. AES white paper: Networking audio and music using Internet2 and next-generation Internet capabilities. Technical report, AES: Audio Engineering Society, 1998.

C. Chafe. Statistical Pattern Recognition for Prediction of Solo Piano Performance. In Proc. ICMC, Thessaloniki, 1997.

C. Chafe. Distributed Internet Reverberation for Audio Collaboration. In AES (Audio Engineering Society) 24th Int’l Conf. on Multichannel Audio, 2003.

C. Chafe, M. Gurevich, G. Leslie, and S. Tyan. Effect of Time Delay on Ensemble Accuracy. In Proceedings of the International Symposium on Musical Acoustics, 2004.

A. Chatwani and A. Koren. Optimization of Audio Streaming for Wireless Networks. Technical report, Princeton University, 2004.

A.A. Chatwani. Real-Time Recognition of Tabla Bols. Princeton University, Senior Thesis, May 2003.

E. Chew, R. Zimmermann, A.A. Sawchuk, C. Kyriakakis, C. Papadopoulos, A.R.J. Francois, G. Kim, A. Rizzo, and A. Volk. Musical Interaction at a Distance: Distributed Immersive Performance. In Proceedings of the MusicNetwork Fourth Open Workshop on Integration of Music in Multimedia Applications, September, pages 15–16, 2004.

P. Chordia. Segmentation and Recognition of Tabla Strokes. In Proc. of ISMIR (International Conference on Music Information Retrieval), 2005.

J.R. Cooperstock and S.P. Spackman. The Recording Studio that Spanned a Continent. In Proc. of IEEE International Conference on Web Delivering of Music (WEDELMUSIC), 2001.

O.K. Gillet and G. Richard. Automatic Labelling of Tabla Signals. In Proc. of the 4th ISMIR Conf., 2003.

X. Gu, M. Dick, U. Noyer, and L. Wolf. NMP-a new networked music performance system. In Global Telecommunications Conference Workshops, IEEE, pages 176–185, 2004.

J. Hun Roh and L. Wilcox. Exploring Tabla Drumming Using Rhythmic Input. In CHI’95 proceedings, 1995.

A. Kapur, G. Essl, P. Davidson, and P.R. Cook. The Electronic Tabla Controller. Journal of New Music Research, 32(4):351–359, 2003a.

A. Kapur, G. Wang, P. Davidson, P.R. Cook, D. Trueman, T.H. Park, and M. Bhargava. The Gigapop Ritual: A Live Networked Performance Piece for Two Electronic Dholaks, Digital Spoon, DigitalDoo, 6 String Electric Violin, Rbow, Sitar, Tabla, and Bass Guitar. In Proceedings of the International Conference on New Interfaces for Musical Expression (NIME), Montreal, 2003b.


A. Kapur, P. Davidson, P.R. Cook, P. Driessen, and A. Schloss. Digitizing North Indian Performance. In Proceedings of the International Computer Music Conference, 2004.

A. Kapur, G.E. Wang, P. Davidson, and P.R. Cook. Interactive Network Performance: a dream worth dreaming? Organised Sound, 10(03):209–219, 2005.

J. Kippen and B. Bel. Modelling Music with Grammars: Formal Language Representation in the Bol Processor. Computer Representations and Models in Music, Ac. Press ltd, pages 207–232, 1992.

J. Kippen and B. Bel. Computers, Composition and the Challenge of “New Music” in Modern India. Leonardo Music Journal, 4:79–84, 1994.

J. Lazzaro and J. Wawrzynek. A case for network musical performance. In Proceedings of the 11th international workshop on Network and operating systems support for digital audio and video, pages 157–166. ACM Press New York, NY, USA, 2001.

T. Maki-Patola. Musical Effects of Latency. Suomen Musiikintutkijoiden, 9:82–85, 2005.

K. Samudravijaya, S. Shah, and P. Pandya. Computer Recognition of Tabla Bols. Technical report, Tata Institute of Fundamental Research, 2004.

A.A. Sawchuk, E. Chew, R. Zimmermann, C. Papadopoulos, and C. Kyriakakis. From remote media immersion to Distributed Immersive Performance. In Proceedings of the 2003 ACM SIGMM workshop on Experiential telepresence, pages 110–120. ACM Press New York, NY, USA, 2003.

G. Weinberg. Interconnected Musical Networks: Toward a Theoretical Framework. Computer Music Journal, 29(2):23–39, 2005a.

G. Weinberg. Local Performance Networks: musical interdependency through gestures and controllers. Organised Sound, 10(03):255–265, 2005b.


Outside Reader Biography

Miller S. Puckette

Miller Puckette obtained a B.S. in Mathematics from MIT (1980) and a Ph.D. in Mathematics from Harvard (1986). Puckette was a member of MIT’s Media Lab from its inception until 1987, and then a researcher at IRCAM (Institut de Recherche et de Coordination Acoustique/Musique, founded by composer and conductor Pierre Boulez). There he wrote the Max program for Macintosh computers, which was first distributed commercially by Opcode Systems in 1990 and is now available from Cycling ’74. In 1989 Puckette joined IRCAM’s “musical workstation” team and put together an enhanced version of Max, called Max/FTS, for the ISPW system, which was commercialized by Ariel, Inc. This system became a widely used platform in computer music research and production facilities. The IRCAM real-time development team has since reimplemented and extended this software under the name jMax, which is distributed free with source code.

Puckette joined the Music department of the University of California, San Diego in 1994, and is now Associate Director of the Center for Research in Computing and the Arts (CRCA). He is currently working on a new real-time software system for live musical and multimedia performances called Pure Data (“Pd”), in collaboration with many other artists, researchers, and programmers worldwide. Pd is free and runs on Linux, IRIX, and Windows systems. Since 1997 Puckette has also been part of the Global Visual Music project with Mark Danks, Rand Steiger, and Vibeke Sorensen, which has been generously supported by a grant from the Intel Research Council.
