BLUES from Music · Microsoft PowerPoint - BLUES from Music.ppt Author: str_msp Created Date:...
![Page 1: BLUES from Music · Microsoft PowerPoint - BLUES from Music.ppt Author: str_msp Created Date: 3/14/2006 10:58:18 AM ...](https://reader036.fdocuments.us/reader036/viewer/2022071113/5fea5cec96e0c6358658cee0/html5/thumbnails/1.jpg)
BLUES from Music: BLind Underdetermined Extraction of Sources from Music
Michael Syskind Pedersen, Tue Lehn-Schiøler, Jan Larsen
IMM, Technical University of Denmark
ICA2006, Charleston, SC, USA
Separating music into basic components
Michael Syskind Pedersen, IMM, Technical University of Denmark
Motivation: Why separate music?
• Music transcription
• Identifying instruments
• Identifying the vocalist
Assumptions
• A stereo recording of the music piece is available.
• The instruments are separated to some extent in time and in frequency, i.e. the instruments are sparse in the time-frequency (T-F) domain.
• The different instruments originate from spatially different directions.
Separation principle 1: T-F masking
[Figure: stereo channel 1, stereo channel 2, and the gain difference between the channels]
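The idea behind the figure can be sketched in a few lines, assuming the two stereo channels are already available as time-frequency arrays. The helper name `gain_difference_db`, the toy values, and the 6 dB threshold are illustrative assumptions and do not come from the slides.

```python
import numpy as np

def gain_difference_db(X1, X2, eps=1e-12):
    """Per-bin level difference in dB (positive: channel 1 is louder)."""
    return 20.0 * np.log10((np.abs(X1) + eps) / (np.abs(X2) + eps))

# Toy T-F magnitudes: rows are frequency bins, columns are time frames.
X1 = np.array([[1.0, 0.1], [0.5, 0.5]])
X2 = np.array([[0.1, 1.0], [0.5, 0.5]])

diff = gain_difference_db(X1, X2)
mask_left = diff > 6.0    # bins dominated by channel 1 (hypothetical threshold)
mask_right = diff < -6.0  # bins dominated by channel 2
```

Bins where both channels have equal level fall into neither mask, which is why a source panned to the center cannot be isolated by gain difference alone.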
Separation principle 2: ICA
[Diagram: sources → mixing → mixed signals → separation (ICA) → recovered source signals]
Mixing: x = As
Separation (ICA): y = Wx
What happens if a 2-by-2 separation matrix W is applied to a 2-by-N mixing system?
ICA on stereo signals
• We assume that the mixture can be modeled as an instantaneous mixture, i.e.

$$x = A(\theta_1, \ldots, \theta_N)\, s$$

$$A(\theta) = \begin{bmatrix} r_1(\theta_1) & \cdots & r_1(\theta_N) \\ r_2(\theta_1) & \cdots & r_2(\theta_N) \end{bmatrix}$$

• The ratio between the gains in each column in the mixing matrix corresponds to a certain direction.
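A minimal sketch of the instantaneous mixing model with N = 3 sources follows. The slides do not specify the gain functions, so the choice r1 = cos(θ), r2 = sin(θ), the source directions, and the Laplacian toy sources are all assumptions made for illustration. The sketch also shows what a 2-by-2 matrix W can and cannot do against a 2-by-N system.

```python
import numpy as np

def mixing_matrix(thetas):
    """2-by-N matrix A; column n holds the two channel gains for direction theta_n.
    Assumption: r1 = cos(theta), r2 = sin(theta) (not specified in the slides)."""
    thetas = np.asarray(thetas, dtype=float)
    return np.vstack([np.cos(thetas), np.sin(thetas)])

rng = np.random.default_rng(0)
thetas = np.deg2rad([10.0, 45.0, 80.0])  # hypothetical source directions
A = mixing_matrix(thetas)                # shape (2, 3)
s = rng.laplace(size=(3, 1000))          # toy source signals
x = A @ s                                # stereo mixture, shape (2, 1000)

# A 2-by-2 W can exactly unmix at most two directions (here sources 0 and 2).
W = np.linalg.inv(A[:, [0, 2]])
WA = W @ A                               # overall 2-by-3 system: y = (W A) s
y = W @ x
```

Columns 0 and 2 of `WA` form the identity, but column 1 is nonzero, so the source at 45 degrees leaks into both outputs.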
Direction dependent gain

$$r(\theta) = 20 \log |W A(\theta)|$$

When W is applied, the two separated channels each contain a group of sources that is as independent as possible from the group in the other channel.
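The direction dependent gain can be evaluated numerically for a given W. This is a sketch under assumptions: the direction vector is taken as [cos(θ), sin(θ)], the logarithm is base 10 (decibels), and W is a hypothetical matrix chosen so that each row has a null toward one direction.

```python
import numpy as np

def direction_gain_db(W, thetas, eps=1e-12):
    """Evaluate 20*log10|W a(theta)| for each output row of W.
    Assumption: a(theta) = [cos(theta), sin(theta)] (hypothetical panning law)."""
    thetas = np.asarray(thetas, dtype=float)
    A = np.vstack([np.cos(thetas), np.sin(thetas)])  # shape (2, len(thetas))
    return 20.0 * np.log10(np.abs(W @ A) + eps)      # shape (2, len(thetas))

# Hypothetical W: row 0 has a null at 80 degrees, row 1 has a null at 10 degrees.
t0, t1 = np.deg2rad(10.0), np.deg2rad(80.0)
W = np.linalg.inv(np.array([[np.cos(t0), np.cos(t1)],
                            [np.sin(t0), np.sin(t1)]]))

thetas = np.deg2rad(np.linspace(0.0, 90.0, 91))      # 0, 1, ..., 90 degrees
gain = direction_gain_db(W, thetas)
```

Sources close to a row's null are strongly attenuated in that output, so each output retains the sources that lie away from its null.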
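Combining ICA and T-F masking
[Diagram: ICA+BM separator]
• The stereo input x1, x2 is passed through ICA, giving y1 and y2.
• An STFT of each ICA output gives Y1(t, f) and Y2(t, f).
• Two binary masks are computed from the ICA outputs:

$$BM_1 = \begin{cases} 1 & \text{when } |Y_1| / |Y_2| > c \\ 0 & \text{otherwise} \end{cases} \qquad BM_2 = \begin{cases} 1 & \text{when } |Y_2| / |Y_1| > c \\ 0 & \text{otherwise} \end{cases}$$

• Each mask is applied to the STFTs of the stereo input, X1(t, f) and X2(t, f), and an ISTFT of each masked channel gives one stereo output per mask (the diagram labels an output pair $\hat{x}_1^{(2)}, \hat{x}_2^{(2)}$).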
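The masking step of the ICA+BM separator can be sketched as follows. The ICA, STFT, and ISTFT stages are left out, so the arrays below are toy stand-ins for STFT data. The function names and the value c = 1 are assumptions, since the slides do not give a threshold.

```python
import numpy as np

def binary_masks(Y1, Y2, c=1.0, eps=1e-12):
    """BM1 = 1 where |Y1|/|Y2| > c, BM2 = 1 where |Y2|/|Y1| > c, else 0."""
    m1, m2 = np.abs(Y1) + eps, np.abs(Y2) + eps
    return (m1 / m2 > c), (m2 / m1 > c)

def apply_mask(mask, X1, X2):
    """Apply one binary mask to both input channels, so the output stays stereo."""
    return mask * X1, mask * X2

# Toy complex T-F data standing in for the STFTs of the ICA outputs and the input.
Y1 = np.array([[2.0 + 0j, 0.1], [1.0, 0.5j]])
Y2 = np.array([[0.5, 1.0], [1.0, 2.0]])
X1 = np.ones((2, 2), dtype=complex)
X2 = 2.0 * np.ones((2, 2), dtype=complex)

BM1, BM2 = binary_masks(Y1, Y2, c=1.0)
out1 = apply_mask(BM1, X1, X2)  # stereo T-F output for mask 1
out2 = apply_mask(BM2, X1, X2)  # stereo T-F output for mask 2
```

With c at least 1 the two masks never overlap, and bins where neither ICA output dominates are assigned to neither mask.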
Method applied iteratively
[Diagram: the stereo input x1, x2 enters an ICA+BM separator, and its outputs are passed on to further ICA+BM separators, forming a tree.]
Improving method
• The assumption of instantaneous mixing may not always hold.
• The assumption can be relaxed.
• The separation procedure is continued until very sparse masks are obtained.
• Masks that mainly contain the same source are afterwards merged.
[Diagram: a tree of ICA+BM separators that grows wider at each level.]
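The recursive structure of the tree can be sketched as below. This is an abstraction, not the authors' implementation: a real stage would re-run ICA and masking on the stereo signal, whereas here a hypothetical `split` callable stands in for one ICA+BM stage and works on masks only. The stopping values `min_fraction` and `max_depth` are likewise assumptions.

```python
import numpy as np

def segregate(mask, split, min_fraction=0.05, max_depth=6, depth=0):
    """Recursively split a boolean T-F mask until the pieces are sparse.

    split: callable returning two boolean sub-masks for a given mask
           (a stand-in for one ICA+BM stage).
    """
    if depth >= max_depth or mask.mean() <= min_fraction:
        return [mask]
    leaves = []
    for sub in split(mask):
        if sub.any():  # skip empty branches
            leaves += segregate(sub, split, min_fraction, max_depth, depth + 1)
    return leaves

def toy_split(mask):
    """Hypothetical splitter: halves the active bins in index order."""
    idx = np.flatnonzero(mask)
    half = len(idx) // 2
    m1 = np.zeros(mask.shape, dtype=bool)
    m2 = np.zeros(mask.shape, dtype=bool)
    m1.flat[idx[:half]] = True
    m2.flat[idx[half:]] = True
    return m1, m2

full = np.ones((8, 8), dtype=bool)
leaves = segregate(full, toy_split, min_fraction=0.1)
```

Passing the separator in as a callable keeps the stopping rule and the tree traversal independent of how a single stage is implemented.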
Mask merging
If the signals in the time domain are correlated, their corresponding masks are merged.
The resulting signal from the merged mask is of higher quality.
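A minimal sketch of correlation-based merging follows. The greedy grouping strategy, the function name `merge_masks`, and the threshold of 0.5 are assumptions, since the slides do not describe how the correlation test is applied.

```python
import numpy as np

def merge_masks(masks, signals, threshold=0.5):
    """Greedily merge masks whose time-domain signals are correlated.
    threshold is a hypothetical value; the slides do not state one."""
    groups = []  # list of (merged_mask, summed_signal)
    for mask, sig in zip(masks, signals):
        for i, (g_mask, g_sig) in enumerate(groups):
            r = np.corrcoef(g_sig, sig)[0, 1]
            if r > threshold:
                groups[i] = (g_mask | mask, g_sig + sig)
                break
        else:
            # No existing group is correlated with this signal: start a new one.
            groups.append((mask.copy(), sig.copy()))
    return [g_mask for g_mask, _ in groups]

t = np.arange(200) / 200.0
a = np.sin(2 * np.pi * 5 * t)
b = np.sin(2 * np.pi * 13 * t)
signals = [a, 0.8 * a, b]        # the first two share a source
masks = [np.array([[True, False], [False, False]]),
         np.array([[False, True], [False, False]]),
         np.array([[False, False], [True, False]])]

merged = merge_masks(masks, signals)
```

Here the first two masks are merged because their signals are perfectly correlated, while the third stays separate.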
Results
• Evaluation on real stereo music recordings, with the stereo recording of each instrument available before mixing.
• We compute the correlation between the segregated sources and the sources obtained with the ideal binary mask.
• Other segregated music examples are available online.
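The evaluation idea can be sketched as follows. The slides do not define the ideal binary mask, so the common definition (the target dominates the rest of the mix in a bin) is assumed here, and the flattened T-F values stand in for the time-domain signals that an ISTFT would produce. All values are toy data.

```python
import numpy as np

def ideal_binary_mask(S_target, S_rest):
    """1 where the target instrument dominates the remaining mix in a T-F bin.
    Assumption: a common definition; the slides do not define the mask."""
    return np.abs(S_target) > np.abs(S_rest)

def correlation(a, b):
    """Pearson correlation coefficient between two equal-length signals."""
    return float(np.corrcoef(np.ravel(a), np.ravel(b))[0, 1])

# Toy T-F magnitudes of one instrument and of everything else.
S_target = np.array([[1.0, 0.2], [0.1, 0.9]])
S_rest = np.array([[0.3, 0.8], [0.5, 0.1]])
mix = S_target + S_rest

ibm = ideal_binary_mask(S_target, S_rest)
est_mask = np.array([[True, False], [False, False]])  # hypothetical estimated mask

reference = ibm * mix       # what the ideal mask would extract
estimate = est_mask * mix   # what the estimated mask extracts
score = correlation(reference, estimate)
```

A score close to 1 indicates that the estimated mask recovers nearly the same signal as the ideal mask would.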
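Results

| | Bass | Bass Drum | Guitar d | Guitar f | Snare Drum |
|---|---|---|---|---|---|
| Output1 | 72% | 92% | 3% | 1% | 17% |
| Output2 | 5% | 1% | 55% | 4% | 14% |
| Output3 | 9% | 4% | 9% | 72% | 21% |
| Remaining | 14% | 3% | 32% | 23% | 48% |
| % of power | 46% | 27% | 1% | 7% | 7% |
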
• The segregated outputs are dominated by individual instruments
• Some instruments cannot be segregated by this method, because they are not spatially different.
Conclusion and future work
• We have presented an unsupervised method for segregation of single instruments or vocal sound from stereo music.
• Our method is based on combining ICA and T-F masking.
• The segregated signals are maintained in stereo.
• Only spatially different signals can be segregated from each other.
• The proposed framework may be improved by combining the method with single channel separation methods.