DeviceCycle: rapid and reusable prototyping of gesturalinterfaces, applied to audio browsing by similarity

Christian Frisson, Benoît Macq
Université catholique de Louvain, TELE Lab
Place du Levant, 2, B-1348 Louvain-la-Neuve
[email protected]

Stéphane Dupont, Xavier Siebert, Damien Tardieu, Thierry Dutoit
Université de Mons, TCTS and MathRo Labs
Boulevard Dolez, 31, B-7000 Mons
[email protected]

ABSTRACT
This paper presents the development of rapid and reusable gestural interface prototypes for navigation by similarity in an audio database and for sound manipulation, using the AudioCycle application. For this purpose, we propose and follow guidelines for rapid prototyping that we apply using the PureData visual programming environment. We have mainly developed three prototypes of manual control: one combining a 3D mouse and a jog wheel, a second featuring a force-feedback 3D mouse, and a third taking advantage of the multitouch trackpad. We discuss benefits and shortcomings we experienced while prototyping using this approach.

Keywords
Human-computer interaction, gestural interfaces, rapid prototyping, browsing by similarity, audio database

1. INTRODUCTION
Human-computer interaction gathers multiple, interdisciplinary fields of expertise: interaction design, software development, cognition, ergonomics, and so on. There is a trend among researchers in engineering sciences to put forward new technologies before finding applications for them, while designers would rather first understand user requirements before tailoring interfaces dedicated to their needs. In practice, it is convenient to use both methods side by side.

1.1 Our Approach
We started conducting research from an engineering point of view: trying to improve methods and algorithms for, in our case, hypermedia navigation or browsing by similarity in multimedia databases. This is an emerging field covering several application domains and use cases. Browsing an audio database by similarity is not yet a common practice; it aims at overcoming current limitations in sound search, retrieval and discovery, and is used in domains such as sound design, soundtrack composition, DJ/VJ'ing, electroacoustic music composition and personal audio library listening, among others. The knowledge and experience regarding interactive systems dedicated to these tasks vary amongst practitioners and disciplines. Addressing their various needs thus requires a disciplined methodology.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
NIME2010, Sydney, Australia
Copyright 2010, Copyright remains with the author(s).

1.1.1 General user-centered method, parallelized usability tasks

We extended our initial approach by borrowing a methodology from the field of usability. The recent book by Bernsen and Dybkjaer [3] addresses this topic to a great extent. The knowledge acquired by gathering literature regarding specific use cases is greatly enriched by undertaking contextual inquiries, which consist of interviewing experts, possibly with questionnaires, so as to understand their habits and needs regarding their practices. Brainstorming with these experts to produce paper mockups of the desired user interfaces is cost-effective, especially when it prevents the unnecessary development of would-be inadequate prototypes. We have started investigating these axes. After collecting this information, several cycles of user interface prototypes need to be designed, built, tested with users, and refined.

The time-consuming task of software and hardware development should be run in parallel with these usability tasks, so as to ensure on-time delivery and to ease the assembly of prototypes.

1.1.2 Focus on rapid prototyping
In this paper, we focus on a single task pertaining to this approach: the rapid prototyping of software and hardware interfaces, following the design phase but before running the mandatory usability tests that validate the prototypes.

Frameworks or toolboxes for rapid prototyping of multimodal applications should meet the following requirements:

• rapid: quicker to develop than finalized products;

• modular: proposing a component-based software and hardware architecture;

• realtime: seamless human-computer interaction with no perceivable latency;

• reusable: modules or components developed should fit well into new projects with little adaptation;

• understandable: visual representations of the underlying pipelines or mappings should be obvious;

• flexible: using libraries released under open licenses;

• generic: should work and fit well with most platforms and peripherals;

• sustainable: leaning towards the choice of environmentally friendly hardware.

Proceedings of the 2010 Conference on New Interfaces for Musical Expression (NIME 2010), Sydney, Australia


Several frameworks, toolboxes and workbenches for rapid prototyping of multimodal interfaces have been proposed in the past, among which the following are still available and maintained: Amico1, HephaisTK2, OpenInterface3 [10], Squidy Lib4. Some of these are surveyed in [6]. However, we opted for dataflow environments, initially pushed forward by the signal processing communities, because they offer simple interfacing (audio and video input/output, gestural controller and sensor interfacing, network communication) and because we had previously acquired knowledge and training in using them.

For the prototyping of user interfaces, we chose to use PureData5, as it is an opensource (free), cross-platform, modular environment for visual programming. One convenient feature of PureData for interaction designers is the ability to edit mappings at runtime, on the fly, without recompiling. Jamoma6 similarly simplifies the prototyping of dataflow applications, but inside the commercial Max/MSP environment, and it is less dedicated to multimodal interaction. Lawson's paper [10] compares the OpenInterface multimodal workbench with the PureData dataflow environment.

1.2 Our context of application: browsing audio loops by similarity and manipulation

1.2.1 AudioCycle, our test application
The AudioCycle application has already been presented [13, 7]: it offers an interactive visualization of a database of audio loops, with distances estimated from extracted signal-based features. A two-dimensional visualization has already been proposed, as illustrated in figure 1. This application has since been extended into the MediaCycle framework, supporting other types of media such as images and videos. The framework relies on cross-platform opensource libraries, for instance, for the AudioCycle application: OpenAL for the audio feedback, OSG for the OpenGL scene-based view, and Qt for the GUI.

Figure 1: Screenshot of our test prototype for audio browsing by similarity: AudioCycle.

1.2.2 Other applications featuring dedicated gestural interfaces

1http://amico.sourceforge.net
2http://sourceforge.net/projects/hephaistk/
3http://www.openinterface.org
4http://www.squidy-lib.de
5http://www.puredata.info
6http://www.jamoma.org

A consistent survey of emerging applications for browsing audio databases by similarity, notably Musicream, MusicRainbow, SmartMusicKiosk and SoundSpotter, is available in [4]. The type of media in the audio database varies amongst these applications: samples, loops, and music libraries. As we are focusing on gestural control, we note that among these, MusicRainbow allows one to dial music genres for browsing a song database intuitively using a rotary controller. Alternatively, applications such as SongExplorer [9] have been proposed for interacting with sound content, for instance a music collection, using multitouch tables and/or "tangible" objects. In SoundTorch [8], the user can browse through a song library using a Wii remote controller.

Other inspiring dedicated controllers have been associated with audio manipulation applications, extending beyond browsing by similarity. Villar et al. designed a low-cost system for DJs using repurposed hard drives [14]. Beamish et al. [1] proposed force-feedback versions of the elementary DJ controllers: turntables and crossfaders. Chu [5] designed a force-feedback rotary knob for sound editing purposes.

1.2.3 Restriction to manual controllers
Expert users who would use a similarity-based approach to browse a collection of sounds for professional purposes require small movements, considering their day-long sessions. We thus chose to address only manual controllers that can sit on top of desks or tables, requiring fewer bodily movements or aerial gestures. Relatedly, Dan Saffer opposes "touch" to "free-form" modalities in his taxonomy of gestural input modalities [11].

2. METHOD AND ACHIEVEMENTS

2.1 OpenSoundControl support
We have added OpenSoundControl (OSC) support to the MediaCycle framework with a dedicated namespace, so as to allow the external control of MediaCycle applications by more modalities, particularly gestural input.

We have identified two modes of interaction with AudioCycle: navigation (such as moving, hovering and zooming the database view) and manipulation (notably the playback of sound, with optional pitch-independent time stretching or beat-matching). For prospective collaborative uses of the application, with multiple control agents, we have chosen to specify numbers identifying the user and the possible agents tied to the modes of interaction. For example, a single user using one browser to move the 2D view space may send: /audiocycle/1/browser/1/move <x> <y>.
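For illustration, such a message could be assembled by hand following the OSC 1.0 binary encoding (address pattern and type-tag string null-terminated and padded to 4-byte boundaries, followed by big-endian 32-bit floats). This is only a sketch: in practice PureData's OSC objects and the MediaCycle framework take care of the encoding.

```python
import struct

def osc_message(address, *args):
    """Build a raw OSC 1.0 message whose arguments are all floats."""
    def pad(b):
        # null-terminate and pad to the next 4-byte boundary
        return b + b"\x00" * (4 - len(b) % 4)
    msg = pad(address.encode("ascii"))
    msg += pad(("," + "f" * len(args)).encode("ascii"))  # type-tag string
    for a in args:
        msg += struct.pack(">f", float(a))  # big-endian 32-bit float
    return msg

# a single user moving the 2D view, as in the namespace above:
packet = osc_message("/audiocycle/1/browser/1/move", 0.5, -0.25)
```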

2.2 Discovering off-the-shelf devices
Before creating new dedicated gestural interfaces, we wanted to explore the limitations of off-the-shelf devices. We have investigated three categories of devices:

1. USB Human Interface Devices (HID), particularly Contour Design Shuttle jog wheels and 3Dconnexion Space Navigator 3D mice, using the [hidio] object by Steiner et al. [12], which offers several improvements over [hid] (notably hotplugging of devices);

2. force-feedback devices, notably the Novint Falcon 3DOF force-feedback mouse, operated with the [np nifalcon] object using a reverse-engineered driver library7, together with the HSP set of abstractions by Berdahl et al. [2], which eases getting started with basic physical effects;

7http://sourceforge.net/projects/libnifalcon


3. multitouch trackpads, using the [fingerpinger] object that we ported from its initial Max/MSP implementation8 to PureData, which gives access to information on the blobs detected from fingers hovering over the Apple MacBook multitouch trackpad, currently only available on Apple OSX.
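As an illustration of what the raw HID reports handled by objects like [hidio] look like, the sketch below decodes a 3D-mouse translation report. The layout assumed here (one report ID byte, then three little-endian signed 16-bit axes) is commonly reported for the Space Navigator, but should be verified against the actual device descriptor:

```python
import struct

def decode_translation(report):
    """Decode a 3D-mouse translation HID report.

    Assumed layout (verify against the device): report ID byte
    (1 = translation), then X, Y, Z as little-endian signed 16-bit ints.
    """
    if report[0] != 1:
        raise ValueError("not a translation report")
    return struct.unpack_from("<hhh", report, 1)

# e.g. a raw 7-byte report as it could be captured from the device:
x, y, z = decode_translation(bytes([1, 0x10, 0x00, 0xF0, 0xFF, 0x05, 0x00]))
```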

For instant reusability of the controllers and associated objects, we created a set of abstractions following the same scheme for each (for instance, a toggle connected to the left inlet of each enables turning the probing of the device on or off), and we ensured that the layout of the data visualization widgets in the patch remains coarsely equivalent to how the related controls are laid out on the devices, as illustrated in figure 2.

Figure 2: Abstraction for rapid interfacing of a Contour Design Shuttle Pro2 jog wheel.

For most USB HID devices, we had to uninstall the official drivers, which circumvent implementation issues by converting raw events into mouse/keyboard events.

2.3 Rapid prototyping: producing mappings
We opted for three test prototypes:

1. bimanual control, featuring a 3D mouse for navigation in the audio database and a jog wheel for the manipulation of audio loops, illustrated in figure 3;

2. a force-feedback version of the navigation, replacing the 3D mouse with a 3DOF force-feedback mouse;

3. a keyboard and multitouch trackpad combination.

Figure 4 illustrates that mappings can be easily edited. Basic control widgets, such as toggles and sliders, are provided with the default PureData installation and allow one to visualize values and test mappings offline. Objects and abstractions from the PureData extended release, such as [scale] from the maxlib library, increase the speed of development and help reduce the visual clutter in the patch. For less straightforward event management, the [expr] object helps to define conditional sequences. OSC objects make it easy to send filtered and conditioned events to the AudioCycle application, and to receive information for force-feedback.

8http://www.anyma.ch/2009/research/multitouch-external-for-maxmsp/

Figure 3: Bimanual audio browsing (left hand, 3D mouse) and manipulation (right hand, jog wheel) in action.
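For reference, the linear range mapping performed by a [scale]-like object can be sketched in a few lines (an illustrative re-implementation, not maxlib's actual code):

```python
def scale(x, in_lo, in_hi, out_lo, out_hi):
    """Map x linearly from [in_lo, in_hi] to [out_lo, out_hi],
    analogous to patching a [scale in_lo in_hi out_lo out_hi] object."""
    return out_lo + (x - in_lo) * (out_hi - out_lo) / (in_hi - in_lo)

# e.g. map a MIDI-style 0..127 controller value to a zoom factor 0.5..2.0
zoom = scale(127, 0, 127, 0.5, 2.0)
```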

3. DISCUSSION

3.1 Prototype diversity
The first prototype reduces movement down to the wrists only. As each device is assigned to only one interaction mode (navigation vs manipulation), the devices can be placed on a desk or table at locations convenient for the hands, so as to prevent long-term injuries.

Regarding the second prototype, even though usability settings are provided with the device, a suitable position is hard to find due to the size of the device, especially if it is coupled with another device. One strength of this prototype is the added value of force-feedback; we believe that it might increase the speed of finding a given target sound. For instance, as represented in figure 4, the friction activated when passing over loops might decrease the speed of reaching a given target, but increase the accuracy of positioning. A gravitational-system analogy, attracting the force-feedback pointer towards the barycenters of clusters, can also be considered. Such systems can also benefit users with disabilities, such as blind people.
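The gravitational analogy above could, for instance, compute a spring-like pull towards a cluster barycenter, clamped so the device is never driven at full force. This is a sketch of the idea only; the constants are illustrative, not values from the actual prototype.

```python
def attraction_force(pointer, barycenter, k=0.5, max_force=1.0):
    """Spring-like pull of the force-feedback pointer towards a
    cluster barycenter, with the magnitude clamped at max_force."""
    fx = k * (barycenter[0] - pointer[0])
    fy = k * (barycenter[1] - pointer[1])
    mag = (fx * fx + fy * fy) ** 0.5
    if mag > max_force:  # clamp so the device stays within safe limits
        fx, fy = fx * max_force / mag, fy * max_force / mag
    return fx, fy
```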

For multi-purpose applications, the mouse and keyboard combination should remain usable, as it is still the standard setup; it was the first supported user interface of the AudioCycle application. The third prototype augments this combination with a multitouch trackpad. This prototype can alternatively be used for prototyping small-form-factor multitouch applications.

3.2 Visual versus script-based prototyping
Some simple event mapping transformations were less straightforward than expected, for instance changing rotation from absolute to relative. This could have been accomplished more easily with script-based programming, which is more efficient for conditional and procedural statements, for recording data at given time stamps (storing it in a variable), and for handling large datasets using dedicated types. OpenInterface addresses this issue by allowing the execution of Matlab scripts. Dataflow patches remain efficient for giving an overall summary of the executed pipeline.

Figure 4: Rapid mapping example using a Novint Falcon force-feedback 3D mouse: samples are hovered with friction using the X and Y axes, while the Z axis, forced back to its initial position, controls the zoom.
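To make the absolute-to-relative rotation example concrete, here is how it could be written in a script. This is a sketch: the counter range of 256 is an assumption, not necessarily the jog wheel's actual range.

```python
def relative_step(prev, curr, modulo=256):
    """Convert two successive absolute rotary positions into a signed
    relative step, taking counter wrap-around into account."""
    delta = (curr - prev) % modulo
    if delta > modulo // 2:
        delta -= modulo  # the shorter way around is backwards
    return delta
```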

3.3 Initial requirements assessment
Most of our initial requirements for rapid prototyping are met, with the following exceptions:

• visual programming prevents easy reuse between the prototype and the final product, as visual code needs to be converted into textual code;

• the GPL licence of PureData prevents using it in commercial projects.

4. CONCLUSIONS AND PERSPECTIVES
We have set up a working environment for rapid and reusable prototyping of gestural interfaces. The next step consists in defining a usability protocol so as to determine which gestural interfaces provide the best results in a given application scenario, with specific users and application domains. Since we have so far tested off-the-shelf components, we will try to repurpose some of these devices as in [14], for instance to create a force-feedback jog wheel.

5. ACKNOWLEDGMENTS
All authors are supported by numediart9, a long-term research program centered on Digital Media Arts, funded by Région Wallonne, Belgium (grant n°716631).

We wish to express our gratitude to our collaborators who contributed to the development of AudioCycle: Nicolas d'Alessandro, Thomas Dubuisson, Raphaël Sebbe, Jérôme Urbain.

9http://www.numediart.org

6. REFERENCES
[1] T. Beamish, K. van den Doel, K. MacLean, and S. Fels. D'groove: A haptic turntable for digital audio control. In Proceedings of the 2003 International Conference on Auditory Display (ICAD), 2003.

[2] E. Berdahl, G. Niemeyer, and J. O. Smith III. HSP: A simple and effective open-source platform for implementing haptic musical instruments. In Proceedings of NIME, 2009.

[3] N. O. Bernsen and L. Dybkjaer. Multimodal Usability. Human-Computer Interaction Series. Springer, 2009.

[4] M. A. Casey, R. Veltkamp, M. Goto, M. Leman, C. Rhodes, and M. Slaney. Content-based music information retrieval: Current directions and future challenges. Proceedings of the IEEE, 96(4), 2008.

[5] L. L. Chu. Using haptics for digital audio navigation. In Proceedings of the International Computer Music Conference (ICMC), 2002.

[6] B. Dumas, D. Lalanne, and S. Oviatt. Multimodal interfaces: A survey of principles, models and frameworks. In Human Machine Interaction: Research Results of the MMI Program, pages 3–26. Springer-Verlag, Berlin, Heidelberg, 2009.

[7] S. Dupont, T. Dubuisson, J. Urbain, C. Frisson, R. Sebbe, and N. d'Alessandro. AudioCycle: Browsing musical loop libraries. In 7th International Workshop on Content-Based Multimedia Indexing (CBMI), pages 73–80, Chania, Crete, June 3-5 2009.

[8] S. Heise, M. Hlatky, and J. Loviscach. SoundTorch: Quick browsing in large audio collections. In 125th Audio Engineering Society Convention, number 7544, 2008.

[9] C. F. Julia and S. Jorda. SongExplorer: A tabletop application for exploring large collections of songs. In 10th International Society for Music Information Retrieval Conference (ISMIR 2009), 2009.

[10] J.-Y. L. Lawson, A.-A. Al-Akkad, J. Vanderdonckt, and B. Macq. An open source workbench for prototyping multimodal interactions based on off-the-shelf heterogeneous components. In Proceedings of the 1st ACM SIGCHI Symposium on Engineering Interactive Computing Systems (EICS'09), 2009.

[11] D. Saffer. Designing Gestural Interfaces. O'Reilly Media, Inc., 2009.

[12] H.-C. Steiner, D. Merrill, and O. Matthes. A unified toolkit for accessing human interface devices in Pure Data and Max/MSP. In Proceedings of the 2007 Conference on New Interfaces for Musical Expression (NIME07), 2007.

[13] J. Urbain, T. Dubuisson, S. Dupont, C. Frisson, R. Sebbe, and N. d'Alessandro. AudioCycle: A similarity-based visualization of musical libraries. In IEEE International Conference on Multimedia and Expo (ICME), pages 1847–1848, Cancun, Mexico, June 28 - July 3 2009.

[14] N. Villar, M. Jervis, A. Lang, and H. Gellersen. The ColorDex DJ system: A novel interface for music mixing. In Proc. New Interfaces for Musical Expression (NIME'07), 2007.
