Uncovering affinity of artists to multiple genres from social behaviour data

IIIA - CSIC Claudio Baccigalupo – October 2008 Uncovering affinity of artists to multiple genres from social behaviour data


Slides for the seminar at the Artificial Intelligence Research Institute (IIIA), Barcelona, October 2008


Page 1: Uncovering affinity of artists to multiple genres from social behaviour data

IIIA - CSIC

Claudio Baccigalupo – October 2008

Uncovering affinity of artists to multiple genres from social behaviour data

Page 2: Uncovering affinity of artists to multiple genres from social behaviour data

Claudio Baccigalupo – Uncovering affinity of artists to multiple genres from social behaviour data – October 2008

1 THE ISSUE

In most organisational schemas, such as those used by online music stores, artists have a unique genre label attached

Such a Boolean approach cannot address questions such as:

• Which Country artist is the ‘most Country’?

• Which artist has the ‘most genre affinity’ with Madonna?

• Which genres are ‘socially related’?

Browse by Genre:

Alternative Rock (Britpop, Hardcore & Punk, Indie…)

Blues (Regional, Blues Rock, Modern, Traditional, …)

Christian & Gospel (CCM, Praise, Christian Rock…)

Country (Classic, Alt-Country, Roadhouse, Bluegrass…)

Dance & DJ (Techno-House, Dance-Pop, Trance, …)

Folk (Contemporary, Traditional, British, Folk-Rock, …)

Page 3: Uncovering affinity of artists to multiple genres from social behaviour data

2 THE IDEA

Each artist has a degree of affinity to each genre, depending on how people use that artist

When two artists/genres occur together and closely (in magazines, radios, Web sites, playlists, …), they share some cultural affinity.

Our goal is to model relationships from artists to genres as Fuzzy Sets, describing each artist $x$ as a vector $[M_x(g_1), M_x(g_2), \ldots]$, where each value $M_x(g_i)$ indicates how much affinity the artist $x$ has to the genre $g_i$.

Madonna, 'Holiday' (1983): appears with songs … mostly by Rock/Pop artists

Madonna, 'Secret' (1994): appears with songs … mostly by R&B artists

Page 4: Uncovering affinity of artists to multiple genres from social behaviour data

Many social Web communities provide collections of user-compiled music playlists

MusicStrands members share playlists from the plug-in or Web site

3 THE TECHNIQUE

Co-occurrence analysis of 4,000 artists and their genres in a set of playlists from the Web

To calculate the genre-affinity degree $M_x(g)$ of an artist $x$ to a genre $g$:

• Retrieve 1,030,068 playlists compiled by members of MusicStrands

• Measure the normalised association from $x$ to any other artist, based on how many times they occur together in playlists and how closely

• Aggregate and normalise these associations by genre

• Combine artist-to-artist and artist-to-genre associations
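The second bullet, measuring how often and how closely two artists occur together, can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the function name `association`, the three weight values, the reading of "closely" as zero, one or two tracks in between, and the toy playlists.

```python
from collections import defaultdict

# Hypothetical weights for pairs with 0, 1 or 2 tracks in between (assumed values).
WEIGHTS = (1.0, 0.8, 0.64)

def association(playlists, weights=WEIGHTS):
    """Return assoc[x][y], a symmetric, distance-weighted co-occurrence count."""
    assoc = defaultdict(lambda: defaultdict(float))
    for playlist in playlists:
        for pos, x in enumerate(playlist):
            for gap, weight in enumerate(weights):
                later = pos + gap + 1
                if later >= len(playlist):
                    break
                y = playlist[later]
                if y != x:
                    # Count the pair in both directions so the result is symmetric.
                    assoc[x][y] += weight
                    assoc[y][x] += weight
    return assoc

# Toy playlists, invented for this example.
playlists = [
    ["Madonna", "Shakira", "Whitney Houston"],
    ["Whitney Houston", "Madonna"],
]
assoc = association(playlists)
print(round(assoc["Madonna"]["Whitney Houston"], 2))  # 0.8 + 1.0 -> 1.8
```

Adjacent pairs count more than pairs separated by one or two tracks, which is one simple way to encode "how closely".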

Page 5: Uncovering affinity of artists to multiple genres from social behaviour data

3 THE TECHNIQUE

Combining and normalising artist-to-artist and artist-to-genre associations

1. Aggregate the number of artist-to-artist co-occurrences in the playlists:

$$A_x(y) = \alpha \cdot [d_0(x, y) + d_0(y, x)] + \beta \cdot [d_1(x, y) + d_1(y, x)] + \gamma \cdot [d_2(x, y) + d_2(y, x)]$$

2. Normalise the association $A_x(y)$ with respect to artist popularity:

$$\hat{A}_x(y) = \frac{A_x(y) - \bar{A}_x}{\left|\max\left(A_x(y) - \bar{A}_x\right)\right|}$$

3. Cumulate the association $A_x(y)$ from artist $x$ to any artist of genre $g$:

$$P_x(g) = \sum_{y \in X :\, \mathrm{genre}(y) = g} A_x(y)$$

4. Normalise the association $P_x(g)$ with respect to genre popularity:

$$\hat{P}_x(g) = \frac{\sum_{y \in X :\, \mathrm{genre}(y) = g} A_x(y)}{\sum_{y \in X} A_x(y)}$$

5. Weight the association $\hat{P}_x(g)$ with the association $\hat{A}_x(y)$ and normalise to $[0, 1]$:

$$M_x(g) = \frac{1}{2}\left(\frac{\sum_{y \in A} \hat{A}_x(y)\,\hat{P}_y(g)}{n} + 1\right)$$
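Steps 2 to 5 can be sketched in plain Python, starting from an association table that has already been computed. This is only an illustration under stated assumptions, not the authors' code: `genre_affinity` is a hypothetical name, n is taken to be the number of artists, an artist is not compared with itself, step 2 divides by the largest absolute deviation from the mean so that values stay within [-1, 1], and the association values are invented.

```python
def genre_affinity(assoc, genre_of):
    """Return M[x][g] in [0, 1] from raw associations and single genre labels."""
    artists = sorted(genre_of)
    genres = sorted(set(genre_of.values()))
    n = len(artists)  # assumption: n is the number of artists (needs at least 2)
    a_hat, p_hat = {}, {}
    for x in artists:
        row = {y: assoc.get(x, {}).get(y, 0.0) for y in artists if y != x}
        # Step 2: centre on the mean, scale by the largest absolute deviation.
        mean = sum(row.values()) / len(row)
        scale = max(abs(v - mean) for v in row.values()) or 1.0
        a_hat[x] = {y: (v - mean) / scale for y, v in row.items()}
        # Steps 3 and 4: share of the total association of x that goes to each genre.
        total = sum(row.values()) or 1.0
        p_hat[x] = {
            g: sum(v for y, v in row.items() if genre_of[y] == g) / total
            for g in genres
        }
    # Step 5: weight, average over n, and map from [-1, 1] to [0, 1].
    return {
        x: {
            g: 0.5 * (sum(a_hat[x][y] * p_hat[y][g] for y in a_hat[x]) / n + 1)
            for g in genres
        }
        for x in artists
    }

# Toy labels and association values, invented for this example.
genre_of = {"Madonna": "Rock/Pop", "Shakira": "Rock/Pop",
            "Whitney Houston": "R&B", "Aaliyah": "R&B"}
pairs = {("Madonna", "Shakira"): 5.0, ("Whitney Houston", "Aaliyah"): 5.0,
         ("Madonna", "Whitney Houston"): 1.0}
assoc = {}
for (x, y), value in pairs.items():
    assoc.setdefault(x, {})[y] = value
    assoc.setdefault(y, {})[x] = value

affinity = genre_affinity(assoc, genre_of)
print(affinity["Madonna"])
```

In this toy data Madonna co-occurs mostly with a Rock/Pop artist, so her Rock/Pop degree ends up above her R&B degree, while both remain inside [0, 1].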

Page 6: Uncovering affinity of artists to multiple genres from social behaviour data

4 THE RESULTS

Artists can be described as genre-affinity vectors

The genre-affinity degree Mx(g) is high when artists that often co-occur with x belong to genre g and artists that rarely co-occur with x do not belong to genre g.

$$M_x(g) = \frac{1}{2}\left(\frac{\sum_{y \in A} \hat{A}_x(y)\,\hat{P}_y(g)}{n} + 1\right), \qquad M_x(g) \in [0, 1]$$

[Figure: radar charts of genre-affinity degrees over Rock/Pop, R&B, Country, Jazz and Rap, with axis ticks at 0.500.]

From a ‘Boolean’ approach: Madonna is Rock/Pop

To a ‘Fuzzy’ approach: membership degrees $M_x(g) \in [0, 1]$

Page 7: Uncovering affinity of artists to multiple genres from social behaviour data

4 THE RESULTS

Artists can be compared in terms of centrality to different genres

The genre-centrality of an artist $x$ to a genre $g$ is the percentage of artists whose genre affinity to $g$ is $\le M_x(g)$; it takes values in $[0, 1]$.

[Figure: genre-centrality comparison of two artists, both originally labelled as ‘Rock/Pop’. Radar chart over Rock/Pop, R&B, Rap and Country, with rings at 25%, 50%, 75% and 100%.]

[Figure: genre-centrality comparison of two artists, one labelled ‘Rock/Pop’, the other ‘R&B’. Radar chart over the same four genres and rings.]
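The genre-centrality definition above amounts to a percentile rank, which a few lines of Python can illustrate. The function name, the choice to count the artist itself among the artists compared, and the affinity values are assumptions made for this sketch.

```python
def genre_centrality(affinity, x, g):
    """Fraction of artists whose affinity to g is <= the affinity of x to g."""
    threshold = affinity[x][g]
    at_or_below = sum(1 for vector in affinity.values() if vector[g] <= threshold)
    return at_or_below / len(affinity)

# Toy affinity degrees, invented for this example.
affinity = {
    "artist_a": {"Rap": 0.4998},
    "artist_b": {"Rap": 0.5001},
    "artist_c": {"Rap": 0.5003},
    "artist_d": {"Rap": 0.5010},
}
print(genre_centrality(affinity, "artist_c", "Rap"))  # 3 of 4 artists -> 0.75
```

Because centrality is a rank, it spreads artists over the whole 0% to 100% range even when the raw affinity degrees differ only in the fourth decimal place.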

Page 8: Uncovering affinity of artists to multiple genres from social behaviour data

4 THE RESULTS

Hidden relationships are uncovered in the domain of artists

Artists with the highest genre affinity to a genre $g$ are called core artists, and are good representatives of $g$.

Metric distances can be used to compare artists or to visualise them using a dimensionality reduction method.

[Figure: first core artists of 3 different genres. Country: Tanya Tucker, Diamond Rio, Confederate Railroad. R&B: Loose Ends, Gerald Levert, Zhane. Rap: Too Short, Westside Connection, Masta Ace Inc.]

[Figure: 2D reduction of the Euclidean distance among artists (as 28-dimensional vectors). Labelled artists: Madonna, The Jacksons, Metallica, Johnny Cash, Bruce Springsteen, Wu-Tang Clan, Chick Corea, Nelly, Patti Smith, Anthrax, Whitney Houston, The Stone Roses, Shakira, Shirley Bassey, The Streets, Aaliyah.]
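Both ideas on this slide, picking core artists and comparing artists by a metric distance, are short to express in Python. The sketch below assumes each artist is stored as a dictionary of affinity degrees; the function names, the cut-off `k` and the two-genre toy vectors are hypothetical (the slides use 28-dimensional vectors).

```python
import math

def core_artists(affinity, g, k=3):
    """Return the k artists with the highest affinity to genre g."""
    return sorted(affinity, key=lambda x: affinity[x][g], reverse=True)[:k]

def euclidean_distance(affinity, x, y):
    """Euclidean distance between the genre-affinity vectors of x and y."""
    return math.sqrt(sum((affinity[x][g] - affinity[y][g]) ** 2 for g in affinity[x]))

# Toy vectors, invented for this example.
affinity = {
    "artist_a": {"Country": 0.9, "Rap": 0.1},
    "artist_b": {"Country": 0.5, "Rap": 0.5},
    "artist_c": {"Country": 0.1, "Rap": 0.9},
}
print(core_artists(affinity, "Country", k=2))
print(euclidean_distance(affinity, "artist_a", "artist_c"))
```

A full pairwise distance matrix built this way is the kind of input a 2D reduction method would take to produce a map like the one in the figure.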

Page 9: Uncovering affinity of artists to multiple genres from social behaviour data

4 THE RESULTS

Hidden relationships are uncovered in the domain of genres

Two genres $g$ and $h$ are correlated when artists with a high genre-affinity degree $M_x(g)$ also have a high value for $M_x(h)$.

[Figure: scatter plot of the genre affinity of artists to two independent genres, ‘Rap’ and ‘Country’. Axis ticks run from 0.4995 to 0.5015.]

[Figure: scatter plot of the genre affinity of artists to two correlated genres, ‘Rap’ and ‘R&B’. Axis ticks run from 0.4990 to 0.5020.]

Page 10: Uncovering affinity of artists to multiple genres from social behaviour data

4 THE RESULTS

Hidden relationships are uncovered in the domain of genres

Two genres $g$ and $h$ are correlated when artists with a high genre-affinity degree $M_x(g)$ also have a high value for $M_x(h)$.

Pearson coefficient $\rho_{g,h}$ for 6 genre-centrality vectors:

| $\rho_{g,h}$ | Country | Blues | Jazz | Reggae | R&B | Rap |
|---|---|---|---|---|---|---|
| Country | 1 | 0.2 | 0.1 | 0 | 0.1 | -0.1 |
| Blues | 0.2 | 1 | 0.4 | 0.1 | 0 | -0.2 |
| Jazz | 0.1 | 0.4 | 1 | 0.1 | 0.2 | -0.1 |
| Reggae | 0 | 0.1 | 0.1 | 1 | 0.4 | 0.4 |
| R&B | 0.1 | 0 | 0.2 | 0.4 | 1 | 0.6 |
| Rap | -0.1 | -0.2 | -0.1 | 0.4 | 0.6 | 1 |
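A Pearson coefficient between two genres can be computed by treating each genre as a column of values, one per artist. The sketch below writes the standard formula out by hand; the function names and the three-artist toy values are assumptions, and it assumes neither column is constant (otherwise the denominator is zero).

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys))
    return cov / (spread_x * spread_y)

def genre_correlation(vectors, g, h):
    """Correlate the values of genres g and h across all artists."""
    artists = sorted(vectors)
    return pearson([vectors[x][g] for x in artists],
                   [vectors[x][h] for x in artists])

# Toy per-artist values, invented for this example.
vectors = {
    "artist_a": {"Rap": 0.1, "R&B": 0.2, "Country": 0.3},
    "artist_b": {"Rap": 0.2, "R&B": 0.4, "Country": 0.2},
    "artist_c": {"Rap": 0.3, "R&B": 0.6, "Country": 0.1},
}
print(genre_correlation(vectors, "Rap", "R&B"))
print(genre_correlation(vectors, "Rap", "Country"))
```

Evaluating `genre_correlation` for every pair of genres would fill a symmetric matrix with ones on the diagonal, the same shape as the table on this slide.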

Page 11: Uncovering affinity of artists to multiple genres from social behaviour data

5 OTHER DOMAINS

Co-occurrence analysis of 4,000 artists and their tags in MusicStrands playlists

Two tags $g$ and $h$ are correlated when artists with a high affinity degree $M_x(g)$ also have a high value for $M_x(h)$.

[Figure: tag-centrality comparison of two artists, one with the tag list (pop, rock, electronic, rock & pop, spoken word, interview), the other with (rock, heavy metal, rock & pop, speed thrash metal, gold disc, spoken word, interview). Radar chart over rock, heavy metal, hardcore punk, electronic, dance and r&b, with rings at 25%, 50%, 75% and 100%.]

Pearson coefficient $\rho_{g,h}$ for 7 tags:

| $\rho_{g,h}$ | Indie | Rock | Awesome | Brazilian | Salsa | Trumpet | Jazz |
|---|---|---|---|---|---|---|---|
| Indie | 1 | 0.7 | 0.8 | -0.3 | -0.3 | -0.1 | -0.2 |
| Rock | 0.7 | 1 | 0.9 | -0.7 | -0.4 | -0.4 | -0.5 |
| Awesome | 0.8 | 0.9 | 1 | -0.5 | -0.3 | -0.3 | -0.4 |
| Brazilian | -0.3 | -0.7 | -0.5 | 1 | 0.3 | 0.2 | 0.2 |
| Salsa | -0.3 | -0.4 | -0.3 | 0.3 | 1 | 0.2 | 0.2 |
| Trumpet | -0.1 | -0.4 | -0.3 | 0.2 | 0.2 | 1 | 1 |
| Jazz | -0.2 | -0.5 | -0.4 | 0.2 | 0.2 | 1 | 1 |

Page 12: Uncovering affinity of artists to multiple genres from social behaviour data

5 OTHER DOMAINS

Co-rating analysis of 8,865 movies and their IMDB genres in the Netflix competition dataset

Two movies are socially related when some Netflix customer watched both movies and assigned them the maximum user rating.

[Figure: a movie represented as a genre-centrality vector (8 genres shown: Drama, Romance, Fantasy, Action, Thriller, Sci-Fi, Comedy, Crime), with rings at 25%, 50%, 75% and 100%. Listed genres of the movie: (Romance, Action, Thriller, Adventure, Sci-Fi).]

[Figure: scatter plot of the genre affinity of movies to two exclusive genres, ‘Romance’ and ‘Sci-Fi’. Axis ticks run from 0.4998 to 0.5002 and from 0.49995 to 0.50015.]
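The "socially related" rule for movies can be sketched as a pair count over customers' top-rated titles. This is a minimal illustration only: the function name, the `(customer, movie, rating)` tuple layout, the `max_rating=5` default and the toy ratings are all assumptions, not a description of the dataset's actual file format.

```python
from collections import Counter
from itertools import combinations

def co_rating_counts(ratings, max_rating=5):
    """Count, per movie pair, the customers who gave both the maximum rating."""
    top_rated = {}
    for customer, movie, rating in ratings:
        if rating == max_rating:
            top_rated.setdefault(customer, set()).add(movie)
    counts = Counter()
    for movies in top_rated.values():
        # Sort so each unordered pair is always stored under the same key.
        for pair in combinations(sorted(movies), 2):
            counts[pair] += 1
    return counts

# Toy ratings, invented for this example.
ratings = [
    ("c1", "m1", 5), ("c1", "m2", 5), ("c1", "m3", 3),
    ("c2", "m1", 5), ("c2", "m2", 5), ("c2", "m3", 5),
]
counts = co_rating_counts(ratings)
print(counts[("m1", "m2")])  # both customers gave m1 and m2 the top rating -> 2
```

These pair counts play the role that playlist co-occurrences play for artists, so the later steps of the technique can be reused unchanged.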

Page 13: Uncovering affinity of artists to multiple genres from social behaviour data

6 CONTRIBUTIONS

A new genre ontology, a social-based analysis method, a public real-world dataset

• To uncover richer relationships between artists and genres, exploiting real user behaviour data from social Web sites

• To provide an ontology to describe these relationships (genre affinity degree, genre centrality, core/bridge artists, correlated genres)

• To propose a content-agnostic technique, applicable to any domain where objects have categories and co-occurrences

• To make public both the analysed dataset (artists, genres, and tags from MusicStrands) and the code to reproduce the analysis at: http://labs.strands.com/music/affinity

Baccigalupo C., Donaldson J., Plaza E. “Uncovering Affinity of Artists to Multiple Genres From Social Behaviour Data” International Symposium on Music Information Retrieval (ISMIR), 2008.

Page 14: Uncovering affinity of artists to multiple genres from social behaviour data

IIIA - CSIC

Claudio Baccigalupo – October 2008

Questions?

http://www.iiia.csic.es/~claudio