Works by Masataka Goto Dr. Masataka Goto (* The photo is taken from Goto’s Home page) The National Institute of Advanced Industrial Science and Technology (AIST) Home page: http://staff.aist.go.jp/m.goto/ Presented by Beinan Li, Music Tech @ McGill, 2005-2-10

Works by Masataka Goto

Dr. Masataka Goto (* The photo is taken from Goto’s Home page)

The National Institute of Advanced Industrial Science and Technology (AIST)

Home page: http://staff.aist.go.jp/m.goto/ Presented by Beinan Li, Music Tech @ McGill, 2005-2-10

Content

Goto’s personal info

MIR / Music understanding
  Real-time Beat Tracking System for Musical Acoustic Signals
  Real-time F0 Estimation of Melody and Bass Lines in Musical Audio Signals
  SmartMusicKIOSK: Music Listening Station with Chorus-Search Function

Speech Interface
  Speech Completion
  Speech Spotter

Interactive music system
  A Distributed Cooperative System to Play MIDI Instruments
  Interactive Performance of a Music-controlled CG Dancer
  VirJa Session (A Virtual Jazz Session System)

Music database

Masataka Goto

A researcher working at the National Institute of Advanced Industrial Science and Technology (AIST), a newborn Japanese public research organization (formed from 15 former research institutes)

A researcher of Precursory Research for Embryonic Science and Technology (PRESTO) ("Information and Human Activity" research area), Japan Science and Technology Corporation (JST)

Doctoral degree from Waseda University, 1998. Research interests.

Real-time Beat Tracking System (2001)

Can recognize a hierarchical beat structure (quarter-note, half-note, and measure levels) in real-world audio signals sampled from popular-music compact discs.
  Works with or without drums
  Assumes a 4/4 time signature and a roughly constant tempo
  Uses selected musical knowledge (heuristics)
  Succeeded in 43 out of 45 songs

Real-time Beat Tracking System

Main issues of beat tracking from acoustic signals:
  detecting beat-tracking cues in audio signals
  interpreting the cues to infer the beat structure
  dealing with the ambiguity of interpretation

Cues:
  Onset times in different frequency ranges
  Chord-change possibilities based on provisional time strips
  Drum patterns for bass and snare drums

Quantitative rhythmic difficulty: power transition

Multi-agent based hypothesis evaluation

Real-time Beat Tracking System

Chord-change possibility: estimated from the dominant frequencies, which are found as histogram peaks within a period of time. (Picture taken from Goto, 2001)
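
As a rough illustration of the idea on this slide, the sketch below sums a spectrogram strip along time to get a frequency histogram, takes the largest peaks as dominant frequencies, and scores how much of the current strip's peak power sits at frequencies that were not dominant in the previous strip. The function names, the `top_k` parameter, and the scoring ratio are assumptions made for illustration; they are not taken from Goto (2001).

```python
def dominant_frequencies(strip, top_k=3):
    """Sum a strip (list of frames, each a list of power per frequency bin)
    along time and return {bin: summed power} for the top_k local peaks."""
    n_bins = len(strip[0])
    hist = [sum(frame[b] for frame in strip) for b in range(n_bins)]
    # A bin is a peak if it rises above its left neighbour and does not
    # fall below its right neighbour (edge bins are ignored).
    peaks = [b for b in range(1, n_bins - 1)
             if hist[b] > hist[b - 1] and hist[b] >= hist[b + 1]]
    peaks.sort(key=lambda b: hist[b], reverse=True)
    return {b: hist[b] for b in peaks[:top_k]}


def chord_change_possibility(prev_strip, cur_strip, top_k=3):
    """Share of the current strip's peak power that lies at bins which were
    not dominant in the previous strip (0.0 = no change, 1.0 = all new)."""
    prev = dominant_frequencies(prev_strip, top_k)
    cur = dominant_frequencies(cur_strip, top_k)
    total = sum(cur.values())
    if total == 0:
        return 0.0
    new_power = sum(power for b, power in cur.items() if b not in prev)
    return new_power / total
```

Two strips with the same dominant bins score 0.0, and strips with disjoint dominant bins score 1.0. A real system would take the strip boundaries from the provisional beat times mentioned under "Cues".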

Selectively Used Musical Knowledge

Onset time:
  (a-1) “A frequent inter-onset interval is likely to be the inter-beat interval.”
  (a-2) “Onset times tend to coincide with beat times (i.e., sounds are likely to occur on beats).”

Chord change:
  (b-1) “Chords are more likely to change on beat times than on other positions.”
  (b-2) “….on half-note times than on other positions of beat times.”
  (b-3) “….at the beginnings of measures than at other positions of half-note times.”

Drum pattern (used to re-evaluate hypotheses):
  (c-1) “The beginning of the input drum pattern indicates a half-note time.”
  (c-2) “The input drum pattern has the appropriate inter-beat interval.”
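
Heuristics (a-1) and (a-2) can be made concrete with a small sketch. It histograms inter-onset intervals to guess the inter-beat interval, then measures how many onsets fall near a candidate beat grid. This is only an illustration under stated assumptions: the function names, the bin width, the allowed interval range, and the tolerance are all hypothetical and are not the parameters of Goto's system.

```python
from collections import Counter


def estimate_inter_beat_interval(onset_times, bin_width=0.02,
                                 min_ibi=0.3, max_ibi=1.0):
    """(a-1): return the most frequent inter-onset interval in seconds.
    onset_times must be sorted in ascending order."""
    counts = Counter()
    for i, t0 in enumerate(onset_times):
        for t1 in onset_times[i + 1:]:
            ioi = t1 - t0
            if ioi > max_ibi:
                break  # later onsets are even further away
            if ioi >= min_ibi:
                counts[round(ioi / bin_width)] += 1
    if not counts:
        return None
    # Highest count wins; ties go to the shorter interval.
    best_bin, _ = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    return best_bin * bin_width


def beat_alignment_score(onset_times, ibi, phase, tolerance=0.03):
    """(a-2): fraction of onsets lying within `tolerance` seconds of a
    beat grid with period `ibi` and first beat at `phase`."""
    if not onset_times:
        return 0.0
    hits = 0
    for t in onset_times:
        offset = (t - phase) % ibi
        if min(offset, ibi - offset) <= tolerance:
            hits += 1
    return hits / len(onset_times)
```

In a multi-hypothesis setting, each candidate pair of interval and phase could be scored this way and the best-scoring hypothesis kept; the chord-change and drum-pattern knowledge would then refine or re-evaluate that choice.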

Real-time F0 Estimation of Melody and Bass Lines (2004)

Music Scene Description based on subsymbolic representation

Find a predominant harmonic structure instead of a single fundamental frequency (within a restricted range).

Melody lines: played by a voice or a single-tone mid-range instrument
Bass lines: played by a bass guitar or contrabass

Average detection rate: 88.4% for the melody line and 79.9% for the bass line

Real-time F0 Estimation of Melody and Bass Lines

Main problem: which F0 in a polyphonic mixture belongs to the melody or the bass line?
  The number of sound sources is unknown.
  The F0 has to be selected from several candidates.

Assumptions:
  Melody and bass sounds have a harmonic structure, regardless of their F0.
  Melody and bass lines each have a frequency range in which theirs is the most predominant harmonic structure (“MPHS”).
  Melody and bass lines have temporally continuous F0 trajectories during a musical note.

Real-time F0 Estimation of Melody and Bass Lines

Method:

Limit the frequency range
  Melody: middle- and high-frequency regions
  Bass: low-frequency region
  Applied regardless of whether the F0 is within the limited range or not

Find the MPHS and its F0
  View the observed frequency components as a weighted mixture of all possible harmonic-structure tone models, without assuming the number of sound sources

Deal with ambiguity
  Consider the candidates’ temporal continuity and select the most dominant and stable trajectory of the F0
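
The "weighted mixture of tone models" idea can be sketched as follows. The observed spectrum is treated as a probability distribution over a log-frequency grid (in cents), each candidate F0 contributes a fixed harmonic tone model, and an EM-style iteration estimates the mixture weights; the candidate with the largest weight is taken as the predominant F0 for that frame. This is a simplified sketch, not the procedure from the paper: the Gaussian tone-model shape, the 1/h harmonic amplitudes, `sigma`, the number of harmonics, and all function names are assumptions.

```python
import math


def tone_model(f0_cents, grid, n_harmonics=4, sigma=20.0):
    """Assumed tone model: Gaussians at the harmonics of f0_cents with
    amplitude 1/h, normalized to sum to 1 over the grid."""
    values = []
    for x in grid:
        v = 0.0
        for h in range(1, n_harmonics + 1):
            center = f0_cents + 1200.0 * math.log2(h)
            v += (1.0 / h) * math.exp(-0.5 * ((x - center) / sigma) ** 2)
        values.append(v)
    total = sum(values)
    return [v / total for v in values]


def estimate_f0_weights(observed, grid, candidates, n_iter=30):
    """Estimate mixture weights over candidate F0s by EM with fixed tone
    models. `candidates` is the intentionally limited F0 range."""
    total = sum(observed)
    p_obs = [v / total for v in observed]
    models = [tone_model(f0, grid) for f0 in candidates]
    weights = [1.0 / len(candidates)] * len(candidates)
    for _ in range(n_iter):
        new_weights = [0.0] * len(candidates)
        for i, p in enumerate(p_obs):
            if p == 0.0:
                continue
            mix = sum(w * m[i] for w, m in zip(weights, models))
            if mix <= 0.0:
                continue
            # Distribute the observed mass at bin i over the candidates
            # in proportion to how well each one explains it.
            for k, (w, m) in enumerate(zip(weights, models)):
                new_weights[k] += p * w * m[i] / mix
        norm = sum(new_weights)
        weights = [w / norm for w in new_weights]
    return weights
```

Because no source count is fixed in advance, several candidates can keep non-zero weight at once. Selecting the final melody or bass F0 would still need the temporal-continuity step described above, which this sketch omits.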

Music Listening Station with Chorus-Search Function (2004)

Music-playback interface for trial listening and general music selecting / sampling.

Function for jumping to the chorus section
Visualizing song structure
(Picture taken from Goto’s home page)

Speech Completion (2002)

Helps the user recall uncertain phrases and saves labor when the input phrase is long.

Based on a phenomenon of human speech: a speaker hesitates by lengthening a vowel (a filled pause is uttered), e.g. “Er…”
Displays completion candidates that acoustically resemble the uttered fragment for the user to choose from.
Filled pause: small fundamental frequency (voice pitch) transitions and small spectral envelope deformations.
Vocabulary tree, HMM-based speech recognizer.
English with Japanese accent? (vowel -> consonant)
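
The filled-pause criterion above (small F0 transitions and small spectral envelope deformations) can be sketched as a frame-by-frame stability test. The sketch assumes an F0 track in Hz (0 for unvoiced frames) and one spectral envelope vector per frame are already available; the thresholds, the minimum duration, and the function name are hypothetical values chosen for illustration, not those of Goto's detector.

```python
import math


def detect_filled_pauses(f0_track, envelopes, max_f0_step=0.5,
                         max_env_dist=0.1, min_frames=20):
    """Return (start, end) frame ranges, end exclusive, where consecutive
    voiced frames change by at most max_f0_step semitones in F0 and by at
    most max_env_dist (Euclidean distance) in spectral envelope, for at
    least min_frames frames."""
    segments = []
    start = None
    for i in range(1, len(f0_track)):
        stable = False
        if f0_track[i] > 0 and f0_track[i - 1] > 0:
            f0_step = abs(12.0 * math.log2(f0_track[i] / f0_track[i - 1]))
            env_dist = math.dist(envelopes[i], envelopes[i - 1])
            stable = f0_step <= max_f0_step and env_dist <= max_env_dist
        if stable:
            if start is None:
                start = i - 1  # the run begins at the previous frame
        else:
            if start is not None and i - start >= min_frames:
                segments.append((start, i))
            start = None
    if start is not None and len(f0_track) - start >= min_frames:
        segments.append((start, len(f0_track)))
    return segments
```

A detected segment would act as the trigger for showing completion candidates; the recognizer and vocabulary tree are outside the scope of this sketch.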

Speech Spotter (2004)

Allows the user to enter voice commands into a speech recognizer in the midst of natural human-human conversation.

Based on filled-pause / high-pitch detection (voice cue).
On-demand information system for assisting human-human conversation (e.g. weather inquiry during a talk)
Music-playback system for enriching telephone conversation (i.e. BGM jukebox)

A Distributed Cooperative System to Play MIDI Instruments (2002)

Remote Music Control Protocol (RMCP)
  Extension of MIDI
  Transmission of symbolized multimedia information over a network
  UDP/IP, client-server communication
  Ethernet / Internet
  Information sharing by broadcast, and time scheduling using time stamps
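
Time scheduling with time stamps can be illustrated with a small receiver-side sketch: each packet carries the time at which its MIDI message should sound, and the receiver queues packets and releases them in time-stamp order once they are due. The packet layout below is hypothetical and is NOT the real RMCP wire format; the class and function names are likewise assumptions, and the UDP broadcast itself is left out so the example stays self-contained.

```python
import heapq
import struct

# Hypothetical layout (not the actual RMCP format): an 8-byte big-endian
# double time stamp in seconds, followed by a 3-byte MIDI message.
PACKET = struct.Struct("!d3B")


def pack_event(timestamp, status, data1, data2):
    """Serialize one time-stamped MIDI message into an 11-byte payload."""
    return PACKET.pack(timestamp, status, data1, data2)


def unpack_event(payload):
    """Inverse of pack_event: return (timestamp, (status, data1, data2))."""
    timestamp, status, data1, data2 = PACKET.unpack(payload)
    return timestamp, (status, data1, data2)


class Scheduler:
    """Queues received packets and releases them in time-stamp order."""

    def __init__(self):
        self._queue = []
        self._count = 0  # preserves arrival order for equal time stamps

    def receive(self, payload):
        timestamp, message = unpack_event(payload)
        heapq.heappush(self._queue, (timestamp, self._count, message))
        self._count += 1

    def due(self, now):
        """Pop and return every queued event whose time stamp is <= now."""
        ready = []
        while self._queue and self._queue[0][0] <= now:
            timestamp, _, message = heapq.heappop(self._queue)
            ready.append((timestamp, message))
        return ready
```

Because UDP does not guarantee ordering, scheduling by time stamp lets every receiver play the same events at the same intended times even when packets arrive out of order. This assumes the machines share a common clock.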

Interactive Performance of a Music-controlled CG Dancer (1997)

A CG character (CGC) enhances communication among musicians in a real jam session through visual attention.

A successful CG dance depends on the interactions between each musician and the CG character.
E.g. if the guitarist plays, the CGC does not move unless the drummer determines the motion timing.

VirJa Session (A Virtual Jazz Session System) (1999)

Enables distributed computer players to listen to other computer players' performances as well as a human's performance, and to interact with each other.

Built on top of the previous two techniques (the distributed cooperative system and the music-controlled CG dancer).

RWC Music Database (2002- )

RWC (Real World Computing) Music Database
  Copyright-cleared
  Common foundation for research; benchmark
  Built by the RWC Music Database Sub-Working Group (Goto as the chair) of the Real World Computing Partnership (RWCP) of Japan

World's first large-scale music database specifically for research purposes.

Six original collections, 315 pieces, with:
  original audio signals
  individual sounds at half-tone intervals
  MIDI files, variations of playing styles, dynamics, etc.
  text files of lyrics

References

Goto’s home page: http://staff.aist.go.jp/m.goto/

Masataka Goto: A Real-time Music-scene-description System: Predominant-F0 Estimation for Detecting Melody and Bass Lines in Real-world Audio Signals, Speech Communication (ISCA Journal), Vol.43, No.4, pp.311-329, 2004.

Masataka Goto: An Audio-based Real-time Beat Tracking System for Music With or Without Drum-sounds, Journal of New Music Research, Vol.30, No.2, pp.159-171, June 2001.