Overview: What (stroke type), Transformation (timbre, rhythm), When (stroke timing), Resynthesis

Post on 17-Dec-2015


Tabla Gyan: Realtime tabla recognition and resynthesis

Parag Chordia (GTCMT), Alex Rae (GTCMT)

Overview

• What: stroke type
• Transformation: timbre, rhythm
• When: stroke timing
• Resynthesis

Video Demo

The Drum

• Dayan – treble drum

• Bayan – bass drum

Tabla Language

Recognition Architecture

[Diagram. Training data: labeled strokes (ke, tun, dhe, gedha, te). Statistical model: SVM, Bayesian, neural net. Input music → onset detection → rhythm. Input music → statistical model → stroke label.]

Build Model: Training Data

• Several datasets
    • Professional musician
    • Home recording
• Audio recordings manually edited and labeled

Build Model: Target Mapping

• Standardize idiosyncratic traditional naming conventions

• Map timbrally similar (or identical) strokes to the same category
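A minimal sketch of such a target mapping, assuming a plain lookup table. The bol spellings and the groupings below are illustrative assumptions, not the mapping used in the system.

```python
# Hypothetical mapping from traditional stroke names (bols) to
# standardized categories. Spellings and groupings are illustrative only.
BOL_TO_CATEGORY = {
    "na": "na",
    "ta": "na",   # assumed here to be timbrally equivalent to "na"
    "te": "te",
    "ti": "te",
    "ke": "ke",
    "ka": "ke",
    "ge": "ge",
    "ga": "ge",
}


def standardize(bol: str) -> str:
    """Return the category for a bol, ignoring case and surrounding whitespace."""
    key = bol.strip().lower()
    if key not in BOL_TO_CATEGORY:
        raise KeyError(f"no category defined for bol {bol!r}")
    return BOL_TO_CATEGORY[key]


def standardize_labels(labels):
    """Map a sequence of hand-written labels to standardized categories."""
    return [standardize(b) for b in labels]
```

Raising on an unknown name, rather than passing it through, makes gaps in the table visible while the training data is being labeled.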

Build Model: Feature Extraction

Spectral features
• MFCCs (24)
• Centroid
• Variance
• Skewness
• Kurtosis
• Slope
• Roll-off

[Diagram: spectral descriptors (variance, spectral centroid, kurtosis, ...) are collected into a feature vector F1, F2, F3, ..., Fn.]
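As a rough illustration of the spectral shape descriptors listed on this slide, the sketch below treats one magnitude spectrum as a distribution over frequency and computes its moments, slope and roll-off. The function name, the 85% roll-off fraction and the output ordering are assumptions. MFCCs are omitted for brevity.

```python
import math


def spectral_features(freqs, mags, rolloff_fraction=0.85):
    """Describe the shape of one magnitude spectrum.

    freqs: bin centre frequencies in Hz; mags: non-negative magnitudes.
    Returns [centroid, variance, skewness, kurtosis, slope, rolloff].
    """
    total = sum(mags)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    weights = [m / total for m in mags]

    # Moments of the spectrum treated as a distribution over frequency.
    centroid = sum(f * w for f, w in zip(freqs, weights))
    variance = sum((f - centroid) ** 2 * w for f, w in zip(freqs, weights))
    std = math.sqrt(variance)
    if std > 0:
        skewness = sum((f - centroid) ** 3 * w for f, w in zip(freqs, weights)) / std ** 3
        kurtosis = sum((f - centroid) ** 4 * w for f, w in zip(freqs, weights)) / std ** 4
    else:
        skewness = kurtosis = 0.0

    # Roll-off: lowest bin frequency below which the given fraction of magnitude lies.
    cumulative = 0.0
    rolloff = freqs[-1]
    for f, m in zip(freqs, mags):
        cumulative += m
        if cumulative >= rolloff_fraction * total:
            rolloff = f
            break

    # Slope: least-squares line fitted to magnitude against frequency.
    n = len(freqs)
    mean_f = sum(freqs) / n
    mean_m = total / n
    denom = sum((f - mean_f) ** 2 for f in freqs)
    if denom:
        slope = sum((f - mean_f) * (m - mean_m) for f, m in zip(freqs, mags)) / denom
    else:
        slope = 0.0

    return [centroid, variance, skewness, kurtosis, slope, rolloff]
```

In a full pipeline these values would be concatenated with the MFCCs to form the feature vector F1 ... Fn.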

Build Model: Trained Model

• WEKA machine learning package
• Support Vector Machine
• Models trained on different datasets can be saved for future use

Audio: Input

• Live audio is taken from a close-mic’d tabla

• Stereo signal provides partial separation of drums

Audio: Segmentation

• Onset detection done in Max using bonk~
• More recent parallel project uses spectral flux algorithm in Java
• End of stroke marked by next onset (1 sec buffer size)
• Onset times stored
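The segmentation steps above can be sketched as follows. This follows the spectral flux idea rather than bonk~, and it is written in Python for brevity, not Java. The fixed threshold, the simple peak picking, and reading the "1 sec buffer size" as a cap on stroke length are all assumptions.

```python
def spectral_flux(frames):
    """Sum of positive magnitude increases between consecutive spectral frames."""
    flux = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(frames, frames[1:]):
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux


def pick_onsets(flux, hop_seconds, threshold):
    """Return times (in seconds) of local flux maxima above a fixed threshold."""
    onsets = []
    for i in range(1, len(flux) - 1):
        if flux[i] > threshold and flux[i] > flux[i - 1] and flux[i] >= flux[i + 1]:
            onsets.append(i * hop_seconds)
    return onsets


def segment_strokes(onsets, audio_end, max_length=1.0):
    """Each stroke runs from its onset to the next onset, capped at max_length seconds."""
    segments = []
    for i, start in enumerate(onsets):
        next_onset = onsets[i + 1] if i + 1 < len(onsets) else audio_end
        segments.append((start, min(next_onset, start + max_length)))
    return segments
```

The stored onset times double as the timing information later attached to each stroke label.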

Audio: Feature Extraction

[Diagram: spectral descriptors (variance, spectral centroid, kurtosis, ...) are collected into a feature vector F1, F2, F3, ..., Fn.]

Output: Classification

• Feature vector is fed to previously trained model

• Single category label returned

[Diagram: feature vector → SVM → label]
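The trained SVM itself lives in WEKA. To illustrate what "feature vector in, single label out" means, the sketch below scores a vector against a hypothetical set of linear decision functions and returns the best-scoring category. The weights, biases and labels are invented for the example, and WEKA's SVM may combine classes differently.

```python
def classify(feature_vector, models):
    """Return the label whose linear decision function scores highest.

    models maps label -> (weights, bias). All values here are hypothetical
    stand-ins for a trained model.
    """
    best_label, best_score = None, float("-inf")
    for label, (weights, bias) in models.items():
        if len(weights) != len(feature_vector):
            raise ValueError("feature vector length does not match model")
        score = sum(w * x for w, x in zip(weights, feature_vector)) + bias
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy two-dimensional model with made-up parameters.
TOY_MODELS = {
    "na": ([1.0, -1.0], 0.0),
    "ge": ([-1.0, 1.0], 0.0),
    "te": ([0.5, 0.5], -2.0),
}
```

For example, `classify([3.0, 1.0], TOY_MODELS)` scores "na" highest and returns that single label.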

Output: Symbolic Score

• Stroke label combined with timing and amplitude information

• Score stored in temporary buffer in Max patch

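Example score buffer contents, one stroke per line. The columns appear to be timing, amplitude and stroke label; the slide does not state the column order.

.3204 .9665 2
.3527 .5715 6
.3031 .3648 6
.3325 .9827 6
.2970 .4762 2
.3865 .5928 1
.3496 .6603 8
.7046 .4621 1
.3144 .5024 6
.7152 .2990 6
.3387 .8891 2
.2902 .7342 6
.3868 .9051 7
.3049 .5727 1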
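A small sketch of how such a score could be read into structured events. The field names and the column order (timing, amplitude, label) are assumptions inferred from the slide text. The real buffer lives inside the Max patch.

```python
from dataclasses import dataclass


@dataclass
class ScoreEvent:
    timing: float     # assumed: first column, in seconds
    amplitude: float  # assumed: second column, normalized
    label: int        # third column: stroke category index


def parse_score(text):
    """Parse whitespace-separated score lines, skipping blank lines."""
    events = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
        events.append(ScoreEvent(float(fields[0]), float(fields[1]), int(fields[2])))
    return events
```

Keeping timing and amplitude next to the label is what allows later stages to transform rhythm and timbre independently.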

Output: Timbre Remapping

Stroke labels can be flexibly remapped
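One way to picture remapping is a lookup applied to the label of each score event before resynthesis. This is a sketch under that assumption. The tuple layout and the example mapping are hypothetical.

```python
def remap_labels(score, mapping):
    """Replace stroke labels according to mapping; unmapped labels pass through.

    score: list of (timing, amplitude, label) tuples.
    Each label is looked up once, so a mapping such as {2: 6, 6: 1}
    swaps categories without chaining 2 -> 6 -> 1.
    """
    return [(t, a, mapping.get(label, label)) for t, a, label in score]
```

Because timing and amplitude are left untouched, the rhythm of the performance is preserved while the timbre of each stroke changes.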

Output: Conditional Repetition

Output: User Interface

Dangum

Future Directions

• Beat tracking
• Modeling specific types of improvisational forms (e.g. qaida, tihai …)
• Automate transformations
• Improve interface so it can be “played”
• Tracking of expressive parameters (e.g. bayan pitch modulation)

Conclusions

• Shown a realtime tabla interaction system
• Implemented as a Max Java external using machine learning to identify strokes
• Supports flexible transformations
• Foundation for a more general improvisation system