
8th Annual CSIS Research Conference

Client Server Browsing of Sound Resources: Classification and Browsing

E. Brazil

Interaction Design Centre

University of Limerick

Ireland

Introduction

? - how to classify sound resources and how to provide an interface to browse these resources.

! - provide a browsable sound database for users via intranet / Internet environments

Overview of Research Areas

• Sound Classification

• Sound Representation

• Sound Browsing

Sound Classification

• Two levels of classification

• Coarse level

– Distinguish between the Speech, Music, Environmental, Silence and Other categories

• Fine level

– Use human perceptual features

Coarse-level classification of audio (1)

– Audio signals are classified into basic types, including speech, music, several types of environmental sounds, and silence

– Apply morphological and statistical analyses to short-time feature curves (energy function, average zero-crossing rate, fundamental frequency), followed by a rule-based heuristic classification procedure

Coarse-level classification of audio (2)

• Short-time energy function

– Short-time energy of the audio signal reflects the amplitude variations over time

• Short-time average zero-crossing rate

– ZCR is the number of times the signal passes through zero in a given time interval

• Spectral Centroid
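
The three features above can be sketched per frame as follows. This is a minimal illustration, not the classifier used in the project: the function names, the mean-square definition of energy, and the naive DFT used for the spectral centroid are all assumptions chosen to keep the example dependency-free.

```python
import cmath
import math


def short_time_energy(frame):
    """Mean squared amplitude of one frame (assumed definition of energy)."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Number of sign changes between consecutive samples in the frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, using a naive O(n^2) DFT."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        magnitudes.append(abs(s))
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    return sum(k * sample_rate / n * m for k, m in enumerate(magnitudes)) / total


def frames(signal, size, hop):
    """Yield successive frames of `size` samples, advancing by `hop` samples."""
    for start in range(0, len(signal) - size + 1, hop):
        yield signal[start:start + size]
```

Computing these values for every frame yields the short-time feature curves; a real implementation would replace the naive DFT with an FFT.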

Fine-level classification of audio

• Further classification will be conducted within each basic type:

– music: classify music played by different instruments, different types of music, singing, plain song

– speech: differentiate the voices of men, women, and children, and speech with a music background

– environmental sound: divide into classes such as applause, bell ringing, footsteps, windstorm, laughter, birdsong, and so on

Sound Representation

• Previous work has concentrated on

– Visual star-field type display

• Novel visual representations

– Visualisations on spheres (non-Euclidean spaces)

– Hyper tree

– Excentric labeling

Star-field Display

Virtual University - Uni. Vienna

Visualisations on Spheres

H3: Laying Out Large Directed Graphs in 3D Hyperbolic Space - Munzner

Hyper Tree

www.inxight.com

Excentric Labeling

HCIL – Uni. Maryland

Sound Browsing

• Iterative & Interactive Activity:

– Opportunistic & Serendipitous

• Enable users to explore a data set

• External & internal properties of objects:

– Context & Content

• Evaluate and revise understanding of relationships

The Sonic Browser Application

Audio: Direct representation of tunes (exploiting the cocktail party effect)

• Sounds are panned out in a stereo field controlled by the visual location of the tunes nearest to the cursor.

• The volume of the tunes playing concurrently is proportional to the visual distance between the objects and the cursor
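
One way to turn the two bullets above into numbers is sketched below. It is only an illustration under stated assumptions: `audible_radius` is a hypothetical parameter, the volume is assumed to fall off linearly so that tunes nearer the cursor sound louder (one reading of the volume bullet), and constant-power panning is a swapped-in standard technique rather than anything the slides specify.

```python
import math


def stereo_gains(cursor, tune, audible_radius):
    """Return (left_gain, right_gain) for one tune relative to the cursor.

    cursor and tune are (x, y) screen positions; audible_radius is a
    hypothetical cut-off beyond which a tune is silent.
    """
    dx = tune[0] - cursor[0]
    dy = tune[1] - cursor[1]
    distance = math.hypot(dx, dy)
    if distance >= audible_radius:
        return (0.0, 0.0)

    # Assumption: linear falloff, full volume when the tune is under the cursor.
    volume = 1.0 - distance / audible_radius

    # Horizontal offset drives the pan: -1.0 is hard left, +1.0 is hard right.
    pan = max(-1.0, min(1.0, dx / audible_radius))

    # Constant-power pan law: map pan to an angle between 0 and pi/2.
    angle = (pan + 1.0) * math.pi / 4.0
    return (volume * math.cos(angle), volume * math.sin(angle))
```

Calling `stereo_gains` once per nearby tune gives the per-channel gains to apply before the tunes are mixed together.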

The Sonic Browser Application

Client – Server Issues

• let the server do the mixing and spatialisation

• analysis and classification on server

• lightweight client - Java.

• different network topologies and protocols.

– Latency issues

– Use of a floating ‘Aura’
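
The split between a lightweight client and a server that does the mixing could look roughly like the sketch below. Everything here is hypothetical: the JSON message shape, the field names `x`, `y` and `aura_radius`, and the function `handle_cursor_update` are illustrative assumptions, not the protocol used by the application.

```python
import json
import math


def handle_cursor_update(message, sound_positions):
    """Server side: pick the sounds inside the aura, nearest first.

    message is a JSON string sent by the client (hypothetical format);
    sound_positions maps a sound id to its (x, y) position in the display.
    """
    request = json.loads(message)
    cx, cy, radius = request["x"], request["y"], request["aura_radius"]

    inside = []
    for sound_id, (sx, sy) in sound_positions.items():
        distance = math.hypot(sx - cx, sy - cy)
        if distance <= radius:
            inside.append((distance, sound_id))

    inside.sort()  # nearest sounds first
    # The server would mix and spatialise these sounds before streaming.
    return json.dumps({"mix": [sound_id for _, sound_id in inside]})
```

With this division of labour the client only sends small cursor updates, so the latency of each round trip, rather than bandwidth, is the main cost to watch.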

Cue Points

• Use Cue Points as Marker Points

– Mark a specific point or section of a sound

• Play only significant portion of sound while browsing

• Reduce time to identify sound by playing characteristic or significant part

• Found in many common sound file formats*

* Technical Report UL-IDC-01-02
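
As one concrete case, RIFF/WAVE files can carry a `cue ` chunk holding a list of 24-byte cue point records. The sketch below reads the sample offsets from such a chunk and uses one to pick a short preview. It assumes the common case of uncompressed PCM with a single data chunk; the function names and the `preview` helper are illustrative, not taken from the technical report.

```python
import struct


def read_wav_cue_points(data):
    """Return the sample offsets stored in the 'cue ' chunk of WAVE bytes."""
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    offsets = []
    pos = 12  # first chunk follows the 12-byte RIFF header
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if chunk_id == b"cue ":
            (count,) = struct.unpack_from("<I", data, body)
            for i in range(count):
                # Record: id, position, data chunk id, chunk start,
                # block start, sample offset (six 4-byte fields).
                fields = struct.unpack_from("<II4sIII", data, body + 4 + 24 * i)
                offsets.append(fields[5])
        pos = body + size + (size & 1)  # chunks are padded to even length
    return offsets


def preview(samples, cue_offset, length):
    """Return only the portion of the sound starting at a cue point."""
    return samples[cue_offset:cue_offset + length]
```

While browsing, the player would then audition `preview(...)` from a cue point instead of starting every sound from its first sample.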

Application Platform: HW & OS

• Normal Multimedia PC (Pentium II/III with SB Live, etc.)

• Server – MS Windows 98/2000

• Client – Any OS with Java Runtime

Conclusion

• Facilitate different visualisation tools, e.g. for non-Euclidean space.

• Address payment and copyright issues

• Investigate other file types, e.g. MPEG-7.

References (1)

• Brazil, E. (2001). Cue Points: An Examination Of Common Sound File Formats. Limerick, University of Limerick.

• Fekete, J. D., Plaisant, C. (1999). Excentric Labeling: Dynamic Neighborhood Labeling for Data Visualization. Conference on Human Factors in Computing Systems, New York, ACM.

• Fernström, M., Brazil, E. (2001). Sonic Browsing: An Auditory Tool For Multimedia Asset Management. International Conference on Auditory Display, Espoo, Finland.

• Ó Maidín, D. and M. Fernström (2000). The Best of Two Worlds: Retrieving and Browsing. COST-G6 Conference on Digital Audio Effects DAFx-00, Verona, Universita degli Studi Verona.

References (2)

• Shneiderman, B. (1996). The eyes have it: A task by data type taxonomy for information visualizations. IEEE, Visual Languages, Boulder, CO, USA.

• Zhang, T., Kuo, C.C. (1998). Content-based Classification and Retrieval of Audio. SPIE's 43rd Annual Meeting - Conference on Advanced Signal Processing Algorithms, Architectures, and Implementations VIII, San Diego.

• Zhang, T., Kuo, C.C. (1998). Hierarchical System for Content-Based Audio Classification and Retrieval. SPIE's Conference on Multimedia Storage and Archiving Systems III, Boston.