I-VECTORS FOR TIMBRE-BASED MUSIC SIMILARITY AND MUSIC ARTIST CLASSIFICATION
-
Author
hamid-eghbal-zadeh -
Category
Technology
-
view
747 -
download
0
Embed Size (px)
Transcript of I-VECTORS FOR TIMBRE-BASED MUSIC SIMILARITY AND MUSIC ARTIST CLASSIFICATION

I-VECTORS FOR TIMBRE-BASED MUSIC SIMILARITY AND MUSIC
ARTIST CLASSIFICATION
Hamid Eghbal-zadeh, Bernhard Lehner
Markus Schedl, Gerhard Widmer
ISMIR 2015

Outline • Introduction
o Timbral features
o Song-level features
o Factor analysis for song-level features
• I-vector Frontend o Related work
o Overview
o I-vector factor analysis
o Feature spaces
o Total variability
o Extraction procedure
• Experiments o Setup
o Evaluation
o Results
• Conclusion

INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
Timbral features

Introduction – Timbral features
Timbral features (e.g. MFCCs) are very important in content-based MIR tasks • Music similarity [Sereylehner2010,Pohle2007] • Genre classification [Mirex,Sereylehner2010,Pohle2007] • Artist classification [Ellis2007,Su2013,Kuksa2014] • etc
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Introduction – Timbral features
Difference in level of features: • Frame-level
Song 1 features
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Introduction – Timbral features
Difference in level of features: • Frame-level
• Block-level
Song 1 features
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Introduction – Timbral features
Difference in level of features: • Frame-level
• Block-level
• Song-level processing
Song 1
Song 2
Song 3
vector 2
vector 1 features
features
features
vector 3
Song 1 features
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Introduction – Timbral features
Difference in level of features: • Frame-level
• Block-level
• Song-level processing
Song 1
Song 2
Song 3
vector 2
vector 1 features
features
features
vector 3
Song 1 features
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
Song-level features

Introduction – Song-level Timbral features
Difference in level of features:
• Song-level
1. Using single Gaussians [Sereylehner2010,Pohle2007]
processing Song 1
Song 2
Song 3
vector 2
vector 1 features
features
features
vector 3
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Introduction – Song-level Timbral features
Difference in level of features:
• Song-level
1. Using single Gaussians [Sereylehner2010,Pohle2007] 2. Using GMMs [Charbuillet2011, Kruspe2014]
a) Train a GMM on a song
processing Song 1
Song 2
Song 3
vector 2
vector 1 features
features
features
vector 3
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Introduction – Song-level Timbral features
Difference in level of features:
• Song-level
1. Using single Gaussians [Sereylehner2010,Pohle2007] 2. Using GMMs [Charbuillet2011, Kruspe2014]
a) Train a GMM on a song b) Adapt a pre-trained GMM on a song
processing Song 1
Song 2
Song 3
vector 2
vector 1 features
features
features
vector 3
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Introduction – Song-level Timbral features
Difference in level of features:
• Song-level
1. Using single Gaussians [Sereylehner2010,Pohle2007] 2. Using GMMs [Charbuillet2011, Kruspe2014]
a) Train a GMM on a song b) Adapt a pre-trained GMM on a song c) Use a pre-trained GMM as a prior
processing Song 1
Song 2
Song 3
vector 2
vector 1 features
features
features
vector 3
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Introduction – Song-level Timbral features
Difference in level of features:
• Song-level
1. Using single Gaussians [Sereylehner2010,Pohle2007] 2. Using GMMs [Charbuillet2011, Kruspe2014]
a) Train a GMM on a song b) Adapt a pre-trained GMM on a song c) Use a pre-trained GMM as a prior
processing Song 1
Song 2
Song 3
vector 2
vector 1 features
features
features
vector 3
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Introduction – Song-level Timbral features
Difference in level of features:
• Song-level
1. Using single Gaussians [Sereylehner2010,Pohle2007] 2. Using GMMs [Charbuillet2011, Kruspe2014]
a) Train a GMM on a song b) Adapt a pre-trained GMM on a song c) Use a pre-trained GMM as a prior
3. Using GMMs+Factor Analysis [Kruspe2014, Eghbal-zadeh2015]
processing Song 1
Song 2
Song 3
vector 2
vector 1 features
features
features
vector 3
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Introduction – Song-level Timbral features
Difference in level of features:
• Song-level
1. Using single Gaussians [Sereylehner2010,Pohle2007] 2. Using GMMs [Charbuillet2011, Kruspe2014]
a) Train a GMM on a song b) Adapt a pre-trained GMM on a song c) Use a pre-trained GMM as a prior
3. Using GMMs+Factor Analysis [Kruspe2014, Eghbal-zadeh2015]
processing Song 1
Song 2
Song 3
vector 2
vector 1 features
features
features
vector 3
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
Why Factor Analysis?

Introduction – Advantages of GMM+FA
• Having a probabilistic representation for a song:
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

• Having a probabilistic representation for a song: regarding a specific variability
Introduction – Advantages of GMM+FA
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Introduction – Advantages of GMM+FA
• Having a probabilistic representation for a song: regarding a specific variability
• Lower dimensionality: With a 1024 components GMM from 20 dim MFCCs, we will have 1024 x 20 dimensions
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Introduction – Advantages of GMM+FA
• Having a probabilistic representation for a song: regarding a specific variability
• Lower dimensionality: With a 1024 components GMM from 20 dim MFCCs, we will have 1024 x 20 dimensions • Better discrimination:
GMM GMM+FA
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
• Samples are taken form GTZAN dataset, projected using PCA

INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
Related work

I-vectors – Related work
• The FA model we use with GMMs is called I-vector FA
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Related work
• The FA model we use with GMMs is called I-vector FA • Introduced in speaker verification [Dehak2010]
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Related work
• The FA model we use with GMMs is called I-vector FA • Introduced in speaker verification [Dehak2010] • Speech Emotion Recognition [Chen2011] • Speech Language Recognition [Martinez2011] • Speech Accent Recognition [Bahari2013] • Speech Audio Scene Detection [Elizalde2013] • Singing Language Recognition [Kruspe2014] • Music Artist Recognition [Eghbal-zadeh2015]
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
How it works?

I-vectors – Overview
Song 1
Song 2
Song 3
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Overview
Song 1
Song 2
Song 3
features
features
features
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Overview
I-vector extractor Song 1
Song 2
Song 3
features
features
features
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Overview
i-vector 3
I-vector extractor Song 1
Song 2
Song 3
i-vector 2
i-vector 1
features
features
features
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
I-vector Factor Analysis

I-vectors – Factor Analysis
M = m + T y
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Factor Analysis
Song GMM supervector: assumed to be normally distributed
with mean vector m and
covariance matrix 𝑻𝑻𝒕
M = m + T y
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Factor Analysis
Song GMM supervector: assumed to be normally distributed
with mean vector m and
covariance matrix 𝑻𝑻𝒕
class- and song-independent supervector (comes from a Universal Background Model - UBM)
M = m + T y
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Factor Analysis
Song GMM supervector: assumed to be normally distributed
with mean vector m and
covariance matrix 𝑻𝑻𝒕
class- and song-independent supervector (comes from a Universal Background Model - UBM)
low rank matrix, learned from training
M = m + T y
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Factor Analysis
Song GMM supervector: assumed to be normally distributed
with mean vector m and
covariance matrix 𝑻𝑻𝒕
class- and song-independent supervector (comes from a Universal Background Model - UBM)
low rank matrix, learned from training
i-vector M = m + T y
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Factor Analysis
Song GMM supervector: assumed to be normally distributed
with mean vector m and
covariance matrix 𝑻𝑻𝒕
class- and song-independent supervector (comes from a Universal Background Model - UBM)
low rank matrix, learned from training
i-vector M = m + T y
• I-vectors are assumed to be normally distributed
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Factor Analysis
Song GMM supervector: assumed to be normally distributed
with mean vector m and
covariance matrix 𝑻𝑻𝒕
class- and song-independent supervector (comes from a Universal Background Model - UBM)
low rank matrix, learned from training
i-vector M = m + T y
• I-vectors are assumed to be normally distributed • Unlike GMM supervectors, i-vectors are from a single-Gaussian space
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
Different feature spaces

I-vectors – Feature space
Frame-level feature space
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Feature space
Frame-level feature space GMM feature space
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Feature space
Frame-level feature space GMM feature space I-vector feature space
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Feature space
Frame-level feature space GMM feature space I-vector feature space [Total Variability Space (TVS)]
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Feature space
Frame-level feature space GMM feature space I-vector feature space [Total Variability Space (TVS)]
20 dim 1024 x 20 dim 400 dim
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Feature space
Frame-level feature space GMM feature space I-vector feature space [Total Variability Space (TVS)]
20 dim 1024 x 20 dim 400 dim [number of total factors]
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
What is total variability?

I-vectors – Total variability
In genre classification:
• Genre variability : the variability appears between different
genres.
• Song variability : the variability appears within songs of a genre.
• Total variability : genre + song variability
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Total variability
In genre classification:
• Genre variability : the variability appears between different
genres.
• Song variability : the variability appears within songs of a genre.
• Total variability : genre + song variability
In artist recognition:
• Artist variability : the variability appears between different artists.
• Song variability : the variability appears within songs of an artist.
• Total variability : artist + song variability
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
How to extract i-vectors?
1, 2, 3 …

I-vectors – Procedure
step1
i-vector GMM supervector
Factor Analysis
Statistics calculation
Feature extraction
Audio Frame-level features (MFCCs)
Frame-level features, 20 dim MFCCs
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Procedure
step1
i-vector GMM supervector
Factor Analysis
Statistics calculation
Feature extraction
Audio Frame-level features (MFCCs)
step2
Using a universal background model (UBM)
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Procedure
step1
i-vector GMM supervector
Factor Analysis
Statistics calculation
Feature extraction
Audio Frame-level features (MFCCs)
step2
Using a universal background model (UBM)
UBM is a GMM trained on frame-level features extracted from a set of songs
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Procedure
step1
i-vector GMM supervector
Factor Analysis
Statistics calculation
Feature extraction
Audio Frame-level features (MFCCs)
step2
Using a universal background model (UBM)
𝑁𝑐 = 𝛾𝑡(𝑐)𝑡∈𝑠
𝐹𝑐 = 𝛾𝑡 𝑐 𝑡∈𝑠 Y𝑡
𝛾𝑡:The posterior probability of component c for observation t, given UBM Y𝑡:Acoustic feature vectors N and F are referred to as GMM supervectors.
UBM is a GMM trained on frame-level features extracted from a set of songs
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Procedure
step1
i-vector GMM supervector
Factor Analysis
Statistics calculation
Feature extraction
Audio Frame-level features (MFCCs)
step2 step3
Assuming GMM supervector 𝑀
𝑀 ~Normal(𝑚, 𝑇 ∗ 𝑇𝑡) Where 𝑚 is artist and
song independent (from UBM)
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Procedure
step1
i-vector GMM supervector
Factor Analysis
Statistics calculation
Feature extraction
Audio Frame-level features (MFCCs)
step2 step3
Assuming GMM supervector 𝑀
𝑀 ~Normal(𝑚, 𝑇 ∗ 𝑇𝑡) Where 𝑚 is artist and
song independent (from UBM)
I-vector 𝑦 is estimated via: 𝑦 = (𝐼 + 𝑇𝑡 Σ−1𝑁 𝑇)−1. 𝑇−1Σ−1𝐹
T: is learned from training data 𝛴: is a diagonal covariance matrix learned in Factor Analysis procedure 𝐹 and 𝑁: GMM supervector
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

I-vectors – Procedure
step1
i-vector GMM supervector
Factor Analysis
Statistics calculation
Feature extraction
Audio Frame-level features (MFCCs)
step2 step3
Assuming GMM supervector 𝑀
𝑀 ~Normal(𝑚, 𝑇 ∗ 𝑇𝑡) Where 𝑚 is artist and
song independent (from UBM)
I-vector 𝑦 is estimated via: 𝑦 = (𝐼 + 𝑇𝑡 Σ−1𝑁 𝑇)−1. 𝑇−1Σ−1𝐹
T: is learned from training data 𝛴: is a diagonal covariance matrix learned in Factor Analysis procedure 𝐹 and 𝑁: GMM supervector UNSUPERVISED
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Tasks:
1.Music artist classification
2.Music similarity
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
Tasks:
1.Music artist classification
2.Music similarity

Experiments – Evaluation
In music artist classification:
• Artist20 dataset, 1,413 songs
• 20 artist, mostly rock and pop, 6 albums from each artist
• 6-fold cross validation
• 5 albums for train, 1 album for test
• UBM and T are trained on training set of each fold
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Setup
In music artist classification:
• Frame-level features:
a) 20 dim MFCCs
b) Spectrum Derivatives
• 400 total factors
• LDA for dimensionality reduction
• 3 back-ends: PLDA, KNN, DA {I-vector extraction}
{PLDA,…} {MFCC+SD}
Extract features
Extract GMM
supervectors
Front end
Compensation/ Normalization
{LDA/Length norm}
Fram
es
Backend
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Results
In music artist classification:
Baselines
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Results
In music artist classification:
Baselines
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
MFCCs

Experiments – Results
In music artist classification:
Baselines
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
MFCCs
MFCCs+SD

Experiments – Tasks INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
Tasks:
1.Music artist classification
2.Music similarity

Experiments – Evaluation
In music similarity:
• 1517Artists dataset
used for training T and UBM
• GTZAN dataset
used for music similarity estimation experiments
• KNN [k=1:20]
to evaluate similarity matrix with genre labels
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Evaluation INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
Blues, Classic, Country, Disco, Hip-Hop, Jazz, Metal, Pop, Reggae, Rock GTZAN:
1,000 songs x 30 sec excerpts
10 genres
Alternative & Punk, Blues, Children, Classical, Comedy & Spoken Word, Country, Easy Listening & Vocals, Electronic & Dance, Folk, Hip-Hop, Jazz, Latin, New Age, R&B & Soul, Reggae, Religious, Rock & Pop, Soundtracks & More, World
1517Artists:
3,180 songs x full songs
19 genres
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Setup
In music similarity:
• Frame-level features:
a) 25 dim MFCCs + delta + delta2
b) Spectrum Derivatives
• 400 total factors
• Cosine distance for pairwise distance matrix
• Distance normalization, distance combination {I-vector extraction}
{Similarity matrix} {MFCC+SD}
Extract features
Extract GMM
supervectors
Front end
Cosine
{Distance}
Fram
es
Backend
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Results
Baselines
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
In music similarity:

Experiments – Results
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
Baselines
MFCC-IVEC
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
In music similarity:

Experiments – Results
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
Baselines
[MFCC+SD] IVEC
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
In music similarity:

Experiments – Results
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
Baselines
[MFCC+ SD] IVEC + RTBOF
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
In music similarity:

Experiments – Results
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
Baselines
[MFCC+ SD] IVEC + CMB
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
In music similarity:

Experiments – Tasks INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
Tasks:
1.Music artist classification
2.Music similarity
• Extended results

Experiments – Extended results
In music similarity:
• 1517Artists
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Extended results
In music similarity:
• 1517Artists
• Not in the paper
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Extended results
In music similarity:
• 1517Artists
• Not in the paper
• Only MFCCs
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Extended results
In music similarity:
• 1517Artists
• Not in the paper
• Only MFCCs
• More results:
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
http://bird.cp.jku.at/ivectors/ I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Extended results
In music similarity:
• 1517Artists
• Not in the paper
• Only MFCCs
• More results:
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
Baselines
http://bird.cp.jku.at/ivectors/ I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Extended results
In music similarity:
• 1517Artists
• Not in the paper
• Only MFCCs
• More results:
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
Baselines
BLS
http://bird.cp.jku.at/ivectors/ I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Extended results
In music similarity:
• 1517Artists
• Not in the paper
• Only MFCCs
• More results:
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
Baselines
RTBOF
http://bird.cp.jku.at/ivectors/ I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Extended results
In music similarity:
• 1517Artists
• Not in the paper
• Only MFCCs
• More results:
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
Baselines
MFCC-IVEC
http://bird.cp.jku.at/ivectors/ I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Extended results
In music similarity:
• 1517Artists
• Not in the paper
• Only MFCCs
• More results:
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
Baselines
MFCC-IVEC+RTBOF
http://bird.cp.jku.at/ivectors/ I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Experiments – Extended results
In music similarity:
• 1517Artists
• Not in the paper
• Only MFCCs
• More results:
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
Baselines
MFCC-IVEC+BLS
http://bird.cp.jku.at/ivectors/ I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain
Conclusion

Conclusion – Advantages
• Song-level timbral features
• Can capture specific variability
• Low dimensional features
• Can be used for supervised and unsupervised tasks
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Conclusion – Advantages
• Song-level timbral features
• Can capture specific variability
• Low dimensional features
• Can be used for supervised and unsupervised tasks
• Very good timbral features, please use it!
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
• Contact: [email protected] www.cp.jku.at/people/eghbal-zadeh/
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Conclusion – Drawbacks
• Can be used with features can be modeled via GMMs
• Relies on a UBM
• Unreliable for very short segments
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Thank you!
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain

Acknowledgement
• We would like to acknowledge the tremendous help by Dan Ellis from Columbia
University, who shared the details of his work, which enabled us to reproduce his
experiment results.
• Thank to Pavel Kuksa from University of Pennsylvania for sharing the details of
his work with us.
• Also thank to Jan Schlüter from OFAI for his help with music similarity baselines.
• And at the end, we appreciate helpful suggestions of Rainer Kelz and Filip
Korzeniowski from Johannes Kepler University of Linz to this work.
• This work was supported by the EU-FP7 project no.601166 (PHENICX), and by the
Austrian Science Fund (FWF) under grants TRP307-N23 and Z159.
INTRODUCTION I-VECTORS EXPERIMENTS CONCLUSION
I-vectors for timbre-based music similarity and music artist classification ISMIR 2015 Malaga, Spain