Demos for QBSH
J.-S. Roger Jang (張智星)
http://mirlab.org/jang
CSIE Dept, National Taiwan University
Intro. to QBSH
QBSH: Query by Singing/Humming
Challenges: robust pitch tracking; key transposition; collection of song databases; efficient comparison
Karaoke box: ~10,000 songs. Internet: 500M songs, 12M albums (www.jogli.com)
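As an illustration of the key transposition challenge, here is a minimal Python sketch. It assumes pitch is expressed in semitones (MIDI-style note numbers) and that subtracting the mean of each pitch vector is an acceptable way to cancel a constant key offset. The function names are hypothetical and are not taken from MIRACLE.

```python
import math


def to_semitones(freq_hz):
    """Convert a frequency in Hz to a MIDI-style semitone value (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def remove_key(pitch):
    """Shift a pitch vector to zero mean so that a constant key offset cancels out."""
    mean = sum(pitch) / len(pitch)
    return [p - mean for p in pitch]


def mean_abs_distance(a, b):
    """Average absolute difference over the overlapping part of two pitch vectors."""
    n = min(len(a), len(b))
    return sum(abs(x - y) for x, y in zip(a, b)) / n


# Toy example: the same melody sung five semitones higher than the reference.
melody = [60, 62, 64, 65, 67]
query = [p + 5 for p in melody]

raw = mean_abs_distance(melody, query)
shifted = mean_abs_distance(remove_key(melody), remove_key(query))
```

In this toy case the raw distance reflects only the key offset, while the zero-mean distance is essentially zero, so the two vectors are recognized as the same melody.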
Efficient Retrieval in QBSH
Methods for efficient retrieval: multi-stage progressive filtering; indexing for different comparison methods; music phrase identification; repeating pattern identification; distributed and parallel computing
Our focus: parallel computing via GPU
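The multi-stage progressive filtering idea listed above can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say which filters are used, so both distance functions, the stage sizes, and the toy database are hypothetical.

```python
def coarse_distance(query, song, step=4):
    """Cheap first-stage distance (assumption): compare every `step`-th sample only."""
    q = query[::step]
    s = song[::step]
    n = min(len(q), len(s))
    return sum(abs(a - b) for a, b in zip(q, s)) / n


def fine_distance(query, song):
    """More expensive second-stage distance (assumption): compare every sample."""
    n = min(len(query), len(song))
    return sum(abs(a - b) for a, b in zip(query, song)) / n


def progressive_filter(query, database, stages):
    """Run the stages in order, cheapest first.

    `stages` is a list of (distance_fn, keep_count) pairs. Each stage ranks the
    surviving candidates and keeps only the best `keep_count` for the next one.
    """
    candidates = list(database.items())
    for distance_fn, keep_count in stages:
        candidates.sort(key=lambda item: distance_fn(query, item[1]))
        candidates = candidates[:keep_count]
    return [name for name, _ in candidates]


# Toy database of pitch vectors (semitones); the names are made up.
database = {
    "a": [60, 60, 62, 62, 64, 64, 65, 65],
    "b": [60, 61, 62, 63, 64, 65, 66, 67],
    "c": [70, 70, 70, 70, 70, 70, 70, 70],
    "d": [50, 50, 50, 50, 50, 50, 50, 50],
}
query = [60, 60, 62, 62, 64, 64, 65, 65]

result = progressive_filter(
    query, database, [(coarse_distance, 2), (fine_distance, 1)]
)
```

The design point is that the expensive comparison only runs on the few candidates that survive the cheap stage, which matters when the database holds thousands of songs.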
MIRACLE
MIRACLE: Music Information Retrieval Acoustically via Clustered and paralleL Engines
Database (~20K songs): MIDI files; solo vocals (<100); melody extracted from polyphonic music (<100)
Comparison methods: linear scaling; dynamic time warping
Top-10 accuracy: ~75%
Platform: single CPU + GPU
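The two comparison methods named above, linear scaling and dynamic time warping, can be sketched in plain Python as follows. This is a textbook-style illustration, not the MIRACLE implementation: the scaling factors, the zero-mean key normalization, the match-from-song-start assumption, and all function names are assumptions made for the example.

```python
def zero_mean(seq):
    """Remove the mean so that a constant key offset does not affect the distance."""
    m = sum(seq) / len(seq)
    return [x - m for x in seq]


def resample(seq, new_len):
    """Stretch or compress seq to new_len points by linear interpolation."""
    if new_len == 1:
        return [seq[0]]
    out = []
    for i in range(new_len):
        pos = i * (len(seq) - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append(seq[lo] * (1 - frac) + seq[hi] * frac)
    return out


def linear_scaling_distance(query, song, factors=(0.5, 0.75, 1.0, 1.25, 1.5, 2.0)):
    """Try several tempo factors and return the smallest mean absolute distance.

    The scaled query is compared against the beginning of the song.
    """
    best = float("inf")
    for f in factors:
        n = int(round(len(query) * f))
        if n < 2 or n > len(song):
            continue  # skip scalings that are too short or longer than the song
        scaled = zero_mean(resample(query, n))
        target = zero_mean(song[:n])
        d = sum(abs(a - b) for a, b in zip(scaled, target)) / n
        best = min(best, d)
    return best


def dtw_distance(a, b):
    """Classic dynamic time warping with absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy data: the query is the song sung twice as fast and five semitones higher.
song = [60, 60, 62, 62, 64, 64, 65, 65, 67, 67, 69, 69]
query = [65, 67, 69, 70, 72, 74]
other = [60] * 12  # an unrelated, flat "song"

ls_match = linear_scaling_distance(query, song)
ls_other = linear_scaling_distance(query, other)
dtw_match = dtw_distance(zero_mean(query), zero_mean(song))
```

Linear scaling only handles a uniform tempo change but costs one pass per factor, whereas DTW tolerates local tempo variation at a cost proportional to the product of the two lengths. That cost is one reason GPU acceleration is attractive for large databases.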
MIRACLE (II)
References (full list):
J.-S. Roger Jang and Ming-Yang Gao, "A Query-by-Singing System based on Dynamic Programming", International Workshop on Intelligent Systems Resolutions (the 8th Bellman Continuum), pp. 85-89, Hsinchu, Taiwan, Dec. 2000.
Jyh-Shing Roger Jang, Jiang-Chun Chen, Ming-Yang Kao, "MIRACLE: A Music Information Retrieval System with Clustered Computing Engines", International Symposium on Music Information Retrieval (ISMIR), 2001.
…
Chung-Che Wang and Jyh-Shing Roger Jang, "Acceleration of Query by Singing/Humming Systems on GPU: Compare from Anywhere", International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012.
MIRACLE Before Oct. 2011
Client-server distributed computing
Cloud computing via clustered PCs
[Diagram: clients (PC, PDA/smartphone, cellular) connected to clustered servers (one master server, three slave servers). Request: pitch vector. Response: search result.]
Database size: ~12,000
Current MIRACLE
Single server with GPU
NVIDIA 560 Ti, 384 cores (speedup factor = 10)
[Diagram: clients (PC, PDA/smartphone, cellular) connected to a single master server. Request: pitch vector. Response: search result.]
Database size: ~13,000
MIRACLE in the Future
Multi-modal retrieval
Singing, humming, speech, audio, tapping…
[Diagram: clients (PC, PDA/smartphone, cellular) connected to clustered servers (one master server, three slave servers). Request: feature vector. Response: search result.]
QBSH for Various Platforms
PC: web version
Embedded systems: karaoke machines
Smartphones: iPhone/Android
Toys: 16-bit microcontroller
QBSH Prototype in MATLAB
To create a QBSH prototype in MATLAB:
Get familiar with audio processing in MATLAB (see audio signal processing)
Try the programming contests on pitch tracking and QBSH
• Run exampleProgram/goDemo.m to test drive the QBSH prototype in MATLAB!
QBSH Demos
QBSH demos by our lab: QBSH on the web (MIRACLE); QBSH on toys
Existing commercial QBSH systems: www.midomi.com; www.soundhound.com
Returned Results
Typical results of MIRACLE
Online Karaoke
Synchronized lyrics
Calorie consumption
Real-time score
Recording
Live broadcast
Real-time pitch display
Automatic key adjustment
Future Work
Multi-modal music retrieval: query by user's inputs (singing, humming, whistling, speech, tapping, beatboxing); query by exact examples (audio clips)
Speedup schemes: repeating pattern identification; DTW indexing
Database preparation: polyphonic audio music as database. The ultimate challenge!