
MAPS AND LEGENDS: FPS-BASED INTERFACES FOR COMPOSITION AND IMMERSIVE PERFORMANCE

Robert Hamilton

Center for Computer Research in Music and Acoustics (CCRMA)

Department of Music

Stanford University

[email protected]

ABSTRACT

This paper describes an interactive multi-channel multi-user networked system for real-time composition and immersive performance built using a modified version of the Quake III gaming engine. By tracking users' positional and action data within a virtual space, and by streaming that data over a network using OSC messages formatted as UDP packets to a multi-channel Pure Data patch, actions in virtual space are correlated to sonic output in a physical space. Virtual environments designed as abstract compositional maps or representative models of the users' actual physical space are investigated as means to guide and shape compositional and performance choices. This paper analyzes both the technological concerns for building and realizing the system as well as the compositional and perceptual issues inherent in the project itself.

1. INTRODUCTION

In the context of highly realistic 3-dimensional video games, sound is commonly utilized as a critical element in the communication of virtual spatial cues as well as in the enhancement and definition of actions by gamer-controlled avatars and other game entities alike. By presenting a user-centric sound field to the gamer – where the intended listening audience is one player alone within a front-focused stereo, 5.1 or other commercially standard sound field – designers create insular sound-worlds, reinforcing the immersive experience through their use of realistic 3D models and environments with auditory cues and behaviors based in the reality of the virtual world.

As platforms for creating, interacting with and acting within virtual environments, video game engines such as the open-source Quake III engine offer artists a powerful new paradigm to explore novel methodologies for the control of sound and music, with a low cost of entry and a flexible and extensible development environment. Used in conjunction with a data-transmission protocol like Open Sound Control (OSC) over UDP, in-game parameters controlled by multiple game users can be passed to any number of music and sound generation software environments, creating a seamless network-based transmission of data from the virtual realm to physical auditory space. In this manner, composers and performers alike can explore new relationships with composed and improvised musical materials by subverting traditional models of performer/composer/audience relationships and by distributing them in new contexts across networked physical and virtual space.

Towards this goal, from the standpoint of the composer, it becomes necessary to explore novel forms of compositional structure based in the virtual environment and designed to fully exploit the dynamic nature of both the control and sound-generation systems while at the same time contributing a satisfying and logical structure to the musical product. One solution – referred to henceforth as the "compositional map" – draws inspiration and structure from the topography of the virtual environment itself and aims to build upon this visual and musical structure by leveraging the inherently flexible and indeterminate system of player motions and actions. Within this composer-defined visual and musical space, in-game performers, reacting to their virtual environment as well as to one another via their avatars, collectively move the musical work forward from inception to completion in a quasi-improvisatory fashion.

2. SYSTEM OVERVIEW

maps and legends is an evolving integrated software system making use of the immersive environment of the Quake III engine as the user interface for a flexible composition and performance system. Pre-composed computer-generated musical cells are assigned to each in-game performer and are triggered, processed and controlled by the performers' interactions with and paths through the environment. By tracking multiple performers' coordinate locations within virtual space and by subsequently spatializing those locations across a multi-channel performance space, an auditory correlation can be formed between the physical and virtual environments, engaging the attentions of both performers and audience members alike within the musical soundscape.

Figure 1: Maps and Legends compositional map shown from both a top-down structural view (left) and a rendered "ground-level" view (right) in the GtkRadiant editor.


maps and legends was designed as a fully-rendered 3-dimensional virtual compositional map built using the GtkRadiant game-level editing software [5] (see Figure 1). While clearly-visible pathways, directional arrows, and active directional jump-pads are built into the map to encourage or force performer motion in certain pre-defined or composed directions, each performer retains a high level of independence and improvisatory flexibility, allowing for spontaneous new interpretations of the pre-composed materials.

Users running the game-client software and connected over a standard high-speed network control their avatars, and through them the musical work, using standard Quake III game control methodologies – typically a combination of computer-keyboard controls for motion and a mouse for view-angle. Game clients connect to a host game server which in turn streams OSC-formatted data reflecting their players' actions and coordinates to a sound server in the performance venue running Pure Data (PD). Sound generation and processing for each independent performer are handled by and spatialized within an 8-channel PD patch, circumventing the sound system of the game itself and substituting the composer's musical environment for Quake III's stock in-game sounds and music.

Figure 2: System diagram of client/server interactions.
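Although in performance the receiving end of this data flow is a Pure Data patch, the transport itself is ordinary OSC over UDP, and the listener can be sketched in a few lines. The following Python sketch uses the python-osc package; the "/playerN/..." address scheme and port number are assumptions made for illustration, not q3apd's documented layout.

```python
# Minimal OSC-over-UDP listener standing in for the PD sound server.
# Requires the python-osc package (pip install python-osc).
from pythonosc.dispatcher import Dispatcher
from pythonosc.osc_server import BlockingOSCUDPServer

def handle_message(address, *args):
    # q3apd tags each message with a user id so streams stay separable,
    # e.g. "/player1/origin 412.0 96.5 128.0" (address scheme assumed)
    print(address, args)

dispatcher = Dispatcher()
dispatcher.set_default_handler(handle_message)

# Listen on the IP/port the game server streams to (values assumed).
server = BlockingOSCUDPServer(("0.0.0.0", 7770), dispatcher)
server.serve_forever()
```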

At the heart of maps and legends is a set of software modules, currently Linux-only, created by multimedia artists Julian Oliver and Stephen Pickles entitled q3apd [6], which modify the open-source network code of the Quake III game engine to stream a variety of in-game data – including global XYZ player positioning coordinates, directional velocity and view-angle – formatted as OSC messages over a network as UDP packets. As users connect to the host game server, data from their characters' movements and actions are sent to a PD patch which parses incoming OSC messages and extracts multiple player-specific data points. In this manner, the global position of individual game players within the virtual game-space, and certain subsequent actions performed by each user, are mapped to a number of sound-generation and spatialization control parameters, creating a rich interactive system for musical control.

3. PRIOR WORK

The use of networked/multi-user video game paradigms for music and sound generation has become increasingly common as generations of musicians who have grown up with readily accessible home video game systems, internet access and personal computers seek to bring together visually immersive graphical game-worlds, wide-spanning networks and interactive control systems with musical systems. Though its graphical display is rendered in two dimensions, Small_Fish by Kiyoshi Furukawa, Masaki Fujihata and Wolfgang Muench [4] is a game-like musical interface which allows performers/players to create rich musical tapestries using a variety of control methods. Auracle [3], by Max Neuhaus, Phil Burk and their team from Akademie Schloss Solitude, allows networked users to collaborate and improvise using vocal gesture. Oliver and Pickles' own works, including q3apd [6] and Fijuu2 [7], a fully-rendered three-dimensional audio/visual installation controlled with a game-pad, tightly marry the video-game and musical worlds through the use of immersive graphics and familiar game control systems. And work on the Co-Audicle by Ge Wang, Perry Cook and the Princeton Soundlabs team is actively seeking to build a user-community of collaborative performers through networked extension of the ChucK language and its Audicle front-end [11].

Compositional precedents for modular composed forms allowing for performer control over a work's structure can be found in the polyvalent form of Karlheinz Stockhausen's Zyklus [9] for one percussionist, as well as in the variable form of his Klavierstück XI [10] for solo piano. In Zyklus, Stockhausen's strictly composed sectional materials are designed to be interpreted in a performer-chosen direction, reading through the score either forwards or backwards, starting performance on any given page. Similarly, in Klavierstück XI, nineteen composed and precisely notated musical fragments are ordered by the performer. These flexible structural concepts are also prevalent in John Cage's body of chance-based musical works, including his Music of Changes [1] and String Quartet in Four Parts [2], where pre-composed musical cells were selected through chance operations and formed into a definitive compositional structure.

4. MAPPINGS AND SOUND GENERATION

By mapping OSC data-streams from the q3apd mod to various sound processing and spatialization controls in PD, a virtually-enactive control system is created, allowing for musically expressive and flexible control through the virtual physicality of performers' avatar motion. The linking of virtual gesture to sound and spatialized auditory motion sets the stage for an exploration of musical control through a manner of virtual interactive choreography. Towards this end, sound-generating subsystems making use of sample-playback, active filters, delay and reverb parameters are all linked to various possible performer motions or actions.

4.1. q3apd Data Streams

Making use of the Quake III engine's native ability to run user-created software library modifications or "mods", q3apd streams a number of game-state and positioning parameters for each connected user from the game server to a specified IP address and port as OSC messages. The q3apd libraries export player-specific data, including XYZ positioning and view-angle, directional velocity, selected weapon, and player states such as jumping, crouching or falling. q3apd also formats each message with a prepended user-id tag, ensuring that multiple user data streams can easily be separated in PD and tracked independently.
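By way of illustration, the user-id separation described above might be modelled as below; the address layout, field names and argument ordering are assumptions made for the sketch, not q3apd's documented format.

```python
# Sketch: routing per-player q3apd messages into independent state records.
from dataclasses import dataclass

@dataclass
class PlayerState:
    position: tuple = (0.0, 0.0, 0.0)   # global XYZ coordinates
    view_angle: tuple = (0.0, 0.0)      # pitch and yaw
    velocity: tuple = (0.0, 0.0, 0.0)   # directional velocity
    state: str = "idle"                 # e.g. "jumping", "crouching", "falling"

players: dict[int, PlayerState] = {}

def handle_message(address: str, *args):
    # e.g. "/player2/origin" -> player id 2, field "origin" (scheme assumed)
    _, tag, field_name = address.split("/")
    state = players.setdefault(int(tag.removeprefix("player")), PlayerState())
    if field_name == "origin":
        state.position = tuple(args)
    elif field_name == "viewangle":
        state.view_angle = tuple(args)
    elif field_name == "velocity":
        state.velocity = tuple(args)
    elif field_name == "state":
        state.state = args[0]
```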

4.2. PD Mappings

Basic control values supplied by q3apd, such as player motion and XYZ position (Figure 3: section A), are used to calculate constantly changing horizontal-plane distance vectors from a performer's current position to pre-defined virtual speaker locations within the compositional map (Figure 3: section B). Similarly, distance between multiple performers is calculated and mapped to sound events – complementary or disruptive depending on composer-defined performance states – in an effort to encourage or discourage performers from moving too close to or too far from one another.
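These per-speaker distances reduce to plane geometry. A minimal sketch follows, assuming a hypothetical eight-speaker XY layout (in the actual system the speaker coordinates are defined to match the compositional map):

```python
# Sketch: horizontal-plane distances from performers to virtual speakers.
import math

# Assumed XY positions for eight virtual speakers (illustrative only).
SPEAKERS = [(0, 10), (7, 7), (10, 0), (7, -7),
            (0, -10), (-7, -7), (-10, 0), (-7, 7)]

def speaker_distances(px: float, py: float) -> list[float]:
    """Distance from a performer's XY position to each virtual speaker."""
    return [math.hypot(px - sx, py - sy) for sx, sy in SPEAKERS]

def performer_distance(a: tuple, b: tuple) -> float:
    """Horizontal-plane distance between two performers."""
    return math.hypot(a[0] - b[0], a[1] - b[1])
```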

At key points in the compositional map, circular trigger regions lie in the suggested path of motion. Pre-composed musical materials are triggered when performers move over these coordinate spaces, viewed in-game as bright yellow circles on the ground. In this manner, separate sets of mono sound files, all complementary parts of the composition, are assigned to individual performers and triggered in PD based on each performer's position (Figure 3: section C).
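Functionally, each trigger region amounts to a point-in-circle test with simple edge detection, so a region fires once on entry rather than continuously. A sketch, with hypothetical region parameters and a stand-in play_sample() hook:

```python
# Sketch: circular trigger regions firing pre-composed materials on entry.
import math

def play_sample(name: str):
    print("trigger:", name)  # stand-in for the PD sample player

class TriggerRegion:
    def __init__(self, x, y, radius, sample, performer_id):
        self.x, self.y, self.radius = x, y, radius
        self.sample = sample              # pre-composed sound file to fire
        self.performer_id = performer_id  # region assigned to one performer
        self.inside = False               # edge detection: fire once per entry

    def update(self, px: float, py: float):
        hit = math.hypot(px - self.x, py - self.y) <= self.radius
        if hit and not self.inside:
            play_sample(self.sample)
        self.inside = hit
```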

Figure 3: Pure Data patch with highlighted sections for A) player coordinates, directional velocity and view-angle tracking, B) player-to-speaker distance values, and C) player-specific map triggers.

Figure 4: Multi-channel amplitude is calculated as a factor of virtual distance from performer to each speaker. Shown also are yellow circular trigger-locations, directional arrows and suggested pathways.

These sets of sound files, a mixture of synthesized and sampled sound generated using custom software systems built in Max/MSP and PD, make up the bulk of pre-composed materials for maps and legends. Other basic mappings include a light chorus and reverb processing applied to a performer's current sound when moving over a highlighted pathway – a bonus of sorts for moving in pre-composed patterns – as well as a longer reverb and longer delay applied when the user's Z coordinate indicates that they are fairly "high" in the map's vertical dimension – a state that can be triggered more often by lowering game-server-side parameters like virtual gravity.
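The height mapping amounts to normalizing the Z coordinate into effect parameters. A sketch with illustrative value ranges follows; the actual reverb and delay settings used in the piece are not specified here.

```python
# Sketch: mapping a performer's vertical position to effect parameters.
def height_effects(z: float, z_ground: float = 0.0, z_max: float = 1024.0):
    """Map a Z coordinate to a reverb time (s) and delay time (ms)."""
    t = max(0.0, min(1.0, (z - z_ground) / (z_max - z_ground)))
    reverb_s = 0.5 + 4.5 * t   # from 0.5 s at ground level up to 5 s aloft
    delay_ms = 20 + 480 * t    # from 20 ms up to 500 ms (ranges illustrative)
    return reverb_s, delay_ms
```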

By coordinating the speed of various "weapon" projectiles in the game, it has also been possible to create a satisfactory illusion of a "shooting-sound" which travels independently across the map, tracking the trajectory and speed of a visible fired projectile. One particularly effective mapping matches the speed of a large glowing orb moving relatively slowly across the map with the panning for a particular sound event.
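Since such projectiles travel at an effectively constant velocity, the travelling sound's virtual position can be derived from launch position, velocity and elapsed time, then re-panned each control tick. A sketch under those assumptions, with placeholder launch values:

```python
# Sketch: a sound position that tracks a constant-velocity projectile.
def projectile_position(start, velocity, t):
    """Position of a constant-velocity projectile t seconds after firing."""
    return tuple(s + v * t for s, v in zip(start, velocity))

# Each control tick, re-pan the sound event to the projectile's location;
# launch values here are placeholders, not the game's actual parameters.
x, y, z = projectile_position((0.0, 0.0, 64.0), (120.0, 40.0, 0.0), 0.5)
# A distance-based gain function (see section 4.3) then sets channel levels.
```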

4.3. Multi-channel Output and Spatialization

To clarify the relationship between individual performers and their respective sound sets, triggered sound files are spatialized based on a performer's distance from each of eight defined speaker locations. In short, sounds follow the performers' motions through space, spatialized between multiple speakers at any given time. Speaker locations are defined in PD as XY coordinate pairs, with independent gain levels for each performer at each speaker set by a simple distance function. In this manner, multiple speaker configurations for multi-channel output can be easily configured without any changes to the compositional map itself. At this time, additional reverb or delay-based panning cues, and more accurate multi-planar spatialization methods such as vector base amplitude panning (VBAP) [8] or Ambisonic encoding, are not used to simulate the acoustics of the virtual space or to provide more realistic panning effects, though such approaches are being investigated.
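As a sketch of one such simple distance function, assuming an inverse-distance falloff and a hypothetical speaker layout (the exact curve used in the PD patch is not specified here):

```python
# Sketch: distance-based gains for one performer across eight speakers.
import math

# Assumed XY positions for the eight speakers (illustrative only).
SPEAKERS = [(0, 10), (7, 7), (10, 0), (7, -7),
            (0, -10), (-7, -7), (-10, 0), (-7, 7)]

def speaker_gains(px: float, py: float) -> list[float]:
    """Inverse-distance gain per speaker, normalized to a constant total."""
    raw = [1.0 / (1.0 + math.hypot(px - sx, py - sy)) for sx, sy in SPEAKERS]
    total = sum(raw)
    return [g / total for g in raw]
```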


5. DISCUSSION AND CONCLUSIONS

As an immersive environment for interactive networked performance and modular composition, the system designed for maps and legends affords composers and performers alike an extremely powerful and novel musical experience. As the tools for generating virtual environments are flexible and robust, and through the use of q3apd and OSC can interface with a variety of software-based musical environments, there exist many compositional and performance methodologies which can be explored using such a system.

During development of maps and legends it became clear that the relationship between virtual environment and physical listening space played a key role in an audience's ability to become immersed in the projected audio-visual landscape. After presenting different visual displays to an audience during performance – either an in-game global vantage point or a performer's standard first-person view – early usage indicates that the perceptual illusion of superimposed spaces succeeds or breaks down based at least in part on the perceived alignment of the virtual viewpoint with the audience's physical viewpoint. When no visual environment is projected to an audience, the resulting musical experience changes dramatically, offering less sensory confusion and a seemingly greater ability for audience members to focus on the musical work at hand. However, without visual cues to help define the virtual environment, the musical soundscape lacks clear indications of which performer gestures are deterministic and which are improvised.

A key goal in building maps and legends was the development of a system capable of immersing an audience in a virtual sound world, creating a perceptual superimposition of virtual and physical environments (see Figure 5). Towards the expansion of this concept, a realistic, slightly larger-than-scale virtual model of CCRMA's new heptagonal 16-channel Listening Room was created, complete with accurate speaker placements (see Figure 6). Game players sitting in the actual Listening Room could enter the model and move their avatars through the virtual room, controlling sound across an 8-channel sound field mirroring corresponding real-world physical speaker locations.

Figure 5: Superimposition of a virtual game-space on top of a physical performance space.

For individuals controlling the system as well as individuals watching on a projected video screen, effects ranging from sensory confusion to mild discomfort were reported. Though not intended as a full-blown perceptual study, the immediate reactions of those involved hint at deeper issues with users' cognitive abilities to separate virtual environments from physical environments.

Figure 6: Schematic, in-game and real-world images of CCRMA's Listening Room.

6. REFERENCES

[1] Cage, J. Music of Changes. (Score) New York: Henmar Press, C.F. Peters, 1951.

[2] Cage, J. String Quartet in Four Parts. (Score) New York: Henmar Press, C.F. Peters, 1950.

[3] Freeman, J. et al. "The Architecture of Auracle: A Voice-Controlled, Networked Sound Instrument", Proceedings of the International Computer Music Conference, Barcelona, Spain, 2005.

[4] Furukawa, K., Fujihata, M., and Muench, W. Small_Fish, http://hosting.zkm.de/wmuench/small_fish.

[5] GtkRadiant, http://www.qeradiant.com/

[6] Oliver, J. and Pickles, S. q3apd, http://www.selectparks.net/archive/q3apd.htm, as viewed 4/2007.

[7] Pickles, S. fijuu2, http://www.fijuu.com, as viewed 4/2007.

[8] Pulkki, V. "Virtual sound source positioning using vector base amplitude panning", Journal of the Audio Engineering Society, 45(6), pp. 456-466, June 1997.

[9] Stockhausen, K. Zyklus, Universal Edition, London, 1960.

[10] Stockhausen, K. Klavierstück XI, Universal Edition, London, 1957.

[11] Wang, G., Misra, A., Davidson, P. and Cook, P. "Co-Audicle: A Collaborative Audio Programming Space", Proceedings of the International Computer Music Conference, Barcelona, Spain, 2005.