A comprehensive framework for audio and music analysis

Olivier Lartillot, Aalborg University, Denmark

MIRtoolbox, MiningSuite


Outline

Audio and musical feature extraction from recordings: MIRtoolbox

Music analysis from symbolic representation (grouping, reduction, motivic patterns): MiningSuite

MIRtoolbox

Comprehensive set of audio and musical features

Modular syntactic layer on top of Matlab: a language both simple and powerful

Reference framework in Music Information Retrieval

Tens of thousands of downloads, 500+ citations in papers

[Overview diagram. Sound (audio level): Dynamics.]

Dynamics: Global energy

Dvořák, Symphony No. 8 in G major, Allegretto grazioso - Molto vivace

[Figure: RMS energy, son13. x-axis: temporal location of events (in s.); y-axis: coefficient value. Annotated dynamics: pp, ff, and decrescendo marks (>>>, >>, >>>).]
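
An RMS energy curve of the kind plotted on this slide can be approximated by hand. The following is a minimal sketch in plain Matlab, not MIRtoolbox's own implementation: the synthetic two-level signal, the 50 ms frame length and the half-overlapping frames are all assumptions chosen for illustration.

```matlab
% Frame-wise RMS energy of a synthetic signal (plain Matlab, no toolbox).
fs = 8000;                              % sampling rate in Hz (assumed)
t = (0:2*fs-1)' / fs;                   % 2 seconds of time stamps
x = sin(2*pi*440*t);                    % 440 Hz tone
x(1:fs) = 0.1 * x(1:fs);                % first second soft (pp), second one loud (ff)

frameLen = round(0.05 * fs);            % 50 ms frames (assumed)
hop = frameLen / 2;                     % half-overlapping frames (assumed)
nFrames = floor((length(x) - frameLen) / hop) + 1;

rmsCurve = zeros(nFrames, 1);
frameTime = zeros(nFrames, 1);
for k = 1:nFrames
    idx = (k-1)*hop + (1:frameLen);                  % samples of frame k
    rmsCurve(k) = sqrt(mean(x(idx).^2));             % root mean square of the frame
    frameTime(k) = (idx(1) - 1 + frameLen/2) / fs;   % frame centre, in s
end

plot(frameTime, rmsCurve);
xlabel('Temporal location of events (in s.)');
ylabel('coefficient value');
title('RMS energy (synthetic example)');
```

The curve steps from a low plateau to a high one at the one-second mark, which is how a passage going from pp to ff would appear. In MIRtoolbox itself, the corresponding operator is presumably mirrms applied frame by frame, for instance mirrms('mysong', 'Frame').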

[Overview diagram. Sound (audio level): Dynamics, Timbre.]

Timbre: Energy decomposed into frequencies (spectrum)

[Figure: Spectrum. x-axis: frequency (Hz); y-axis: magnitude.]

Low frequency energy: dark sound

High frequency energy: bright sound

Energy in close frequency range: rough sound

Harmonic sound: regular frequency series vs. noise
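
The dark versus bright distinction above can be turned into a number by measuring how much of the spectrum lies above a cutoff frequency. The sketch below is one possible reading of that idea in plain Matlab, not MIRtoolbox's implementation: the 1500 Hz cutoff, the use of spectral magnitude rather than power, and the two synthetic tones are assumptions.

```matlab
% Brightness as the share of spectral magnitude above a cutoff (plain Matlab).
fs = 16000;                             % sampling rate in Hz (assumed)
N = fs;                                 % 1 second of signal
t = (0:N-1)' / fs;
cutoff = 1500;                          % cutoff frequency in Hz (assumed)

% Two synthetic tones: mostly low-frequency energy vs. mostly high-frequency energy
darkTone   = sin(2*pi*200*t) + 0.1 * sin(2*pi*3000*t);
brightTone = 0.1 * sin(2*pi*200*t) + sin(2*pi*3000*t);

f = (0:N/2)' * fs / N;                  % frequency of each bin, 0 Hz to fs/2
signals = {darkTone, brightTone};
brightness = zeros(1, 2);
for k = 1:2
    mag = abs(fft(signals{k}));         % magnitude spectrum
    mag = mag(1:N/2+1);                 % keep the non-negative frequencies
    brightness(k) = sum(mag(f >= cutoff)) / sum(mag);
end

fprintf('dark tone: %.3f, bright tone: %.3f\n', brightness(1), brightness(2));
```

The value lies between 0 (all energy below the cutoff, dark) and 1 (all energy above it, bright). Computing it frame by frame gives a curve comparable to the brightness plots in this talk.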

Timbral gesture: brightness

Beethoven, 9th Symphony, Scherzo

[Figure: Brightness, son5. x-axis: temporal location of events (in s.); y-axis: coefficient value, from dark (low) to bright (high).]

Beethoven, 7th Symphony, Allegretto

[Figure: Brightness, son6. x-axis: temporal location of events (in s.); y-axis: coefficient value, from dark (low) to bright (high).]

[Overview diagram. Sound (audio level): Dynamics, Timbre, Pitch.]

Pitch gesture

[Figure: Pitch, ANemo1.wav. x-axis: temporal location of events (in s.); y-axis: frequencies (in Hz), from low to high.]

Van Zijl, Toiviainen, Lartillot, Luck. The sound of emotion: The effect of performers’ experienced emotions on auditory performance characteristics. Music Perception (2014)

Harty, Miniature for oboe and piano, Orientale

[Overview diagram. Sound (audio level): Dynamics, Timbre, Pitch. Notes (“symbolic” level).]

Dynamics: Note attacks

CPE Bach, Cello Concerto in A major, WQ172, 3rd movement

[Figure: Onset curve (Envelope). x-axis: time (s); y-axis: amplitude. Annotated dynamics: sf>, >>.]

[Overview diagram. Sound (audio level): Dynamics, Timbre, Pitch. Notes (“symbolic” level): Metre.]

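o = mironsets('mysong', 'Detect', 'No')   % onset curve (amplitude envelope), without peak detection

do = mironsets(o, 'Diff')   % differentiated envelope

ac = mirautocor(do)   % autocorrelation of the differentiated envelope

pa = mirpeaks(ac, 'Total', 1)   % keep only the highest peak

In short:

[t, pa] = mirtempo('mysong')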
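
The four steps above (envelope, differentiation, autocorrelation, peak picking) can be reproduced outside MIRtoolbox on a toy signal. This is only a sketch of the principle in plain Matlab, assuming a synthetic envelope sampled at 100 Hz with one decaying bump per beat at 120 bpm. It does not reproduce the filtering and normalisation that MIRtoolbox applies, and the variable dOnset stands for the do of the slide.

```matlab
% Tempo estimation on a synthetic envelope (plain Matlab, no toolbox).
fsEnv = 100;                            % envelope sampling rate in Hz (assumed)
n = 10 * fsEnv;                         % 10 seconds of envelope
period = round(60 / 120 * fsEnv);       % 120 bpm: one beat every 50 samples
decay = exp(-(0:19)' / 5);              % decaying shape of one note (assumed)

% o: onset curve (amplitude envelope), one decaying bump per beat
o = zeros(n, 1);
for start = 1:period:(n - 20)
    o(start:start+19) = o(start:start+19) + decay;
end

% dOnset: differentiated envelope, which emphasises the attacks
dOnset = diff([0; o]);

% ac: autocorrelation for lags between 0.2 s and 1.6 s
lags = (round(0.2 * fsEnv):round(1.6 * fsEnv))';
ac = zeros(size(lags));
for k = 1:numel(lags)
    lag = lags(k);
    ac(k) = sum(dOnset(1:end-lag) .* dOnset(1+lag:end));
end

% pa: highest autocorrelation peak, converted into a tempo in bpm
[~, best] = max(ac);
beatPeriod = lags(best) / fsEnv;        % beat period in seconds
t = 60 / beatPeriod;
fprintf('Estimated tempo: %g bpm\n', t);
```

The strongest autocorrelation value falls at a lag of 0.5 s, so the sketch recovers the 120 bpm used to build the envelope. Restricting the lag range plays the same role as bounding the admissible tempi: here 0.2 s to 1.6 s corresponds to 300 down to 37.5 bpm.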

[Figure o: Onset curve (Envelope). x-axis: time (s); y-axis: amplitude.]

[Figure do: Onset curve (Differentiated envelope). x-axis: time (s); y-axis: amplitude.]

[Figure ac: Envelope autocorrelation. x-axis: lag (s); y-axis: coefficients.]

[Figure pa: Envelope autocorrelation, with the selected peak. x-axis: lag (s); y-axis: coefficients.]

Tempo estimation: t = 129.6333 bpm

Tempo estimation

o = mironsets('mysong', 'Detect', 'No')

do = mironsets(o, 'Diff')

f = mirframe(do)   % decompose the differentiated envelope into frames

ac = mirautocor(f)   % autocorrelation of each frame

pa = mirpeaks(ac, 'Total', 1)

In short:

[t, pa] = mirtempo('mysong', 'Frame')

[Figure f: Onset curve (Differentiated envelope). x-axis: time (s); y-axis: amplitude.]

[Figure t: Tempo. x-axis: temporal location of events (in s.); y-axis: coefficient value (in bpm).]

[Overview diagram. Sound (audio level): Dynamics, Timbre, Pitch. Notes (“symbolic” level): Mode, Metre.]

Tonal gesture

Monteverdi, Hor che’l ciel e la terra, 1st part *

[Figure: Key, Umile01. x-axis: temporal location of events (in s.); y-axis: keys (coefficient value); legend: maj, min.]

Beethoven, 9th Symphony, Scherzo

[Figure: Key, son5.wav. x-axis: temporal location of events (in s.); y-axis: keys (coefficient value).]

Schönberg, Verklärte Nacht, Sehr Ruhig

[Figure: Key, son27.wav. x-axis: temporal location of events (in s.); y-axis: keys (coefficient value).]

Tiersen, Comptine d’un autre été : L’après-midi

[Figure: Key, son32.wav. x-axis: temporal location of events (in s.); y-axis: keys (coefficient value).]

* Sykes, Glowinski, Grandjean, Lartillot, Eliard. Is a contemporary listener able to distinguish between the musical emotional figures created by Monteverdi? SysMus 2013

[Overview diagram. Sound (audio level): Dynamics, Timbre, Pitch. Notes (“symbolic” level): Mode, Metre, Motifs, Patterns. Also shown: Segments.]

Timbral structure

Brahms, Symphony No.3 in F major, Poco Allegretto

[Figure: MFCC representation (timbre). x-axis: time axis (in s.); y-axis: coefficient ranks.]

[Figure: Similarity matrix. Both axes: temporal location of frame centers (in s.).]

[Figure: Novelty curve (Novelty, son8). x-axis: temporal location of events (in s.); y-axis: coefficient value.]

Lartillot, Cereghetti, Eliard, Grandjean. A simple, high-yield method for assessing structural novelty. International Conference on Music & Emotion (2013)
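
To make the chain from features to similarity matrix to novelty curve concrete, here is a minimal sketch in plain Matlab. It uses the classic checkerboard-kernel approach along the diagonal of the similarity matrix, which is not the method proposed in the paper cited above. The two-segment synthetic feature sequence, the cosine similarity and the kernel half-size are assumptions made for the example.

```matlab
% Novelty curve from a similarity matrix (plain Matlab, no toolbox).
% Synthetic feature sequence: 40 frames, with a change of "timbre" after frame 20.
F = [repmat([1; 0; 0], 1, 20), repmat([0; 1; 0], 1, 20)];
nFrames = size(F, 2);

% Cosine similarity between all pairs of frames
Fn = bsxfun(@rdivide, F, sqrt(sum(F.^2, 1)));
S = Fn' * Fn;

% Checkerboard kernel: rewards similarity within each side of a boundary
% and penalises similarity across it
h = 4;                                   % kernel half-size in frames (assumed)
K = [ones(h), -ones(h); -ones(h), ones(h)];

% Slide the kernel along the diagonal of the similarity matrix
novelty = zeros(nFrames, 1);
for c = h:(nFrames - h)
    w = (c - h + 1):(c + h);             % frames around the boundary c | c+1
    novelty(c) = sum(sum(K .* S(w, w)));
end

[~, boundary] = max(novelty);
fprintf('Strongest boundary after frame %d\n', boundary);
```

Inside a homogeneous block the positive and negative quadrants of the kernel cancel out and the novelty is zero. It peaks where two blocks of the similarity matrix meet, here between frames 20 and 21. A known limitation of this approach is that the kernel size fixes the time scale of the boundaries that can be detected.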

Tonal structure

Bazzini, Dance of the Goblins

[Figure: Key strength. x-axis: time axis (in s.); y-axis: tonal center (CM to BM, Cm to Bm).]

[Figure: Similarity matrix. Both axes: temporal location of frame centers (in s.).]

[Figure: Novelty curve (Novelty, son4). x-axis: temporal location of events (in s.); y-axis: coefficient value.]

Sequential repetition

[Overview diagram. Sound (audio level): Dynamics, Timbre, Pitch. Notes (“symbolic” level): Mode, Metre, Motifs, Patterns. Also shown: Groups, Segments. Tools: MIRtoolbox, MiningSuite.]

Local grouping

Mozart, Variation XI on “Ah, vous dirai-je maman”, K.265/300e

[Score excerpt, engraved with LilyPond 2.18.2 (www.lilypond.org).]

[Overview diagram. Sound (audio level): Dynamics, Timbre, Pitch. Notes (“symbolic” level): Ornament reduction. Also shown: Groups, Segments. Tool: MiningSuite.]

Ornamentation reduction

Mozart, Variation XI on “Ah, vous dirai-je maman”, K.265/300e

[Score excerpt, engraved with LilyPond 2.18.2 (www.lilypond.org). Annotation labels: syntagmatic network, head.]

[Overview diagram. Sound (audio level): Dynamics, Timbre, Pitch. Notes (“symbolic” level): Ornament reduction, Motifs, Patterns. Also shown: Groups, Segments. Tool: MiningSuite.]

J.S. Bach, Well-Tempered Clavier, Book II, Fugue XX: Detected subject entries

[Score excerpt, engraved with LilyPond 2.18.2 (www.lilypond.org). Themes labelled Barlow & Morgenstern A and Barlow & Morgenstern B; detected entries labelled L1, M1, U1, L2, U2, M2, U3, L3.]

Lartillot, ISMIR 2014

[Overview diagram. Sound (audio level): Dynamics, Timbre, Pitch. Notes (“symbolic” level): Mode, Metre, Motifs, Patterns, Ornament reduction. Also shown: Groups, Segments, Emotion, Form, Style. Tool and project: MiningSuite, Lrn2Cre8.]

MiningSuite

code.google.com/p/miningsuite

SIGMINR: signal processing

AUDMINR: auditory modelling

SEQMINR: sequence processing

PATMINR: pattern mining

MUSMINR: music analysis

MiningSuite

• New syntactic layer in the source code itself

• Matlab optimisation & code readability

• Significantly optimised (in speed and memory)

• Fully rewritten using recent Matlab object-oriented programming capabilities

• Open source. Developers’ community

• Tutorial at ISMIR 2014 conference

Lrn2Cre8