The Echo Nest at Music and Bits, October 21 2009

Brian Whitman talks about The Echo Nest's "machine listening" platform and the pitfalls and promise of music data.

Transcript of The Echo Nest at Music and Bits, October 21 2009

Thursday, October 22, 2009

I am losing my voice. I am sorry.

I am normally louder than this. I also added text to the pictures.

A Short (Personal) History of Computers Listening to Music

1999-2009

I was a musician for a while. Electronic music.

“Intelligent dance music” (worst genre name ever)

“Fish / Cut bait”

Handheld-music (1998-2001)
I made my own software to make music.
Did it make me a better musician? Definitely not.

It was 1999. Lots of stuff was happening.

I learned about music from reading web sites. Forums, mailing lists.

You could now download a song faster than real time. I figured things would change quickly.

So I went to grad school. I studied information retrieval, language processing.

Columbia University, NYC

MIT Media Lab, finishing my dissertation

People were starting to apply IR techniques to music. Audio files are treated like text.

FFT frames became words. Songs became “documents.”
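A minimal sketch of the "frames as words" idea, under stated assumptions: the frame size, the quantizer (each frame's word is simply the index of its loudest frequency bin), the naive DFT and the synthetic test signal are all illustrative choices, not anything described in the talk. A real system would use an FFT library and a learned codebook.

```python
import cmath
import math
from collections import Counter

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for one frame (fine for a tiny demo; real code would use an FFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def frames_to_words(signal, frame_size=32):
    """Turn each non-overlapping frame into a 'word': the index of its loudest frequency bin."""
    words = []
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        spectrum = magnitude_spectrum(signal[start:start + frame_size])
        words.append(max(range(len(spectrum)), key=spectrum.__getitem__))
    return words

def song_to_document(signal, frame_size=32):
    """Bag of words: how often each frame-word occurs in the song."""
    return Counter(frames_to_words(signal, frame_size))

def tone(bin_index, length, frame_size=32):
    """Synthetic sine wave that completes bin_index cycles per frame."""
    return [math.sin(2 * math.pi * bin_index * i / frame_size) for i in range(length)]

# Synthetic "song": four frames of one tone followed by four frames of another.
song = tone(2, 4 * 32) + tone(5, 4 * 32)
document = song_to_document(song, frame_size=32)
print(document)
```

Once a song is a word-count vector like this, any text retrieval machinery can be pointed at it, which is exactly why converting audio to numbers says so little about understanding it.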

There’s a problem with that. Just because you can convert an mp3 to numbers doesn’t mean you understand it.

“Music IR” was born. The applications are varied, but most have nothing to do with music.

Retrieving Music by Rhythmic Similarity

cal excerpt. (The effect of varying the truncated regions was not examined, and it is not unlikely that other values may result in better retrieval performance.)

4.1.1 Euclidean Distance
Three different distance measures were used. The first was straightforward squared Euclidean distance measure, or the sum of the squares of the element-by-element differences between the values, as used in Experiment 1. For evaluation, each excerpt was used as a query. Each of the 15 corpus documents was then ranked by similarity to each of the 15 queries using the squared Euclidean distance. (For the purposes of ranking, the squared distance serves as well as the distance, as the square root function is monotonic.) Each query had 2 relevant documents in the corpus, so this was chosen as the cutoff point for measuring retrieval precision. Thus there were 30 relevant documents for this query set. For each query, documents were ranked by increasing Euclidean distance from the query. Using this measure, 24 of the 30 possible documents were relevant (i.e. from the same relevance class), giving a retrieval precision of 80%. (More sophisticated analyses such as ROC curves, are probably not warranted due to the small corpus size.)

4.1.2 Cosine Distance
The second measure used is a cosine metric, similar to that described in the previous section. This distance measure may be preferable because it is less sensitive to the actual magnitudes of the vectors involved. This measure proved to perform significantly better than the Euclidean distance. Using this measure, 29 of the 30 documents retrieved were relevant, giving a retrieval precision of 96.7% at this cutoff.
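The difference between the two measures in the excerpt can be sketched as follows. This assumes the standard definition of cosine distance (one minus the cosine of the angle between vectors); the excerpt's own definition sits in a section not shown here. The vectors, labels and the helper name `precision_at_k` are made up for illustration and are not the paper's data.

```python
import math

def squared_euclidean(a, b):
    """Sum of squared element-by-element differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_distance(a, b):
    """One minus the cosine of the angle between a and b (ignores overall magnitude)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def precision_at_k(query_id, corpus, labels, distance, k):
    """Fraction of the k nearest other documents that share the query's label."""
    others = [doc_id for doc_id in corpus if doc_id != query_id]
    ranked = sorted(others, key=lambda doc_id: distance(corpus[query_id], corpus[doc_id]))
    return sum(labels[doc_id] == labels[query_id] for doc_id in ranked[:k]) / k

# Toy vectors (invented): "a2" has the same shape as "a1" but is three times larger;
# "b1" has a different shape but a similar magnitude.
corpus = {
    "a1": [1.0, 0.0, 1.0, 0.0],
    "a2": [3.0, 0.0, 3.0, 0.0],
    "b1": [1.0, 1.0, 1.0, 1.0],
}
labels = {"a1": "a", "a2": "a", "b1": "b"}

print(precision_at_k("a1", corpus, labels, squared_euclidean, k=1))
print(precision_at_k("a1", corpus, labels, cosine_distance, k=1))
```

In this toy case the Euclidean ranking is pulled toward the vector with a similar magnitude, while the cosine ranking follows shape, which is the sensitivity the excerpt describes.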

4.1.3 Fourier Beat Spectral Coefficients
The final distance measure is based on the Fourier coefficients of the beat spectrum, because they can represent the rough spectral shape with many fewer parameters. A more compact representation is valuable for a number of reasons: for example, fewer elements speeds distance comparisons and also reduces the amount of data that must be stored to represent each file. To this effect, the fast Fourier transform was computed for each beat spectral vector. The log of the magnitude was then determined, and the mean subtracted from each coefficient. Because high “frequencies” in the beat spectra are not rhythmically significant, the transform results were truncated to the 25 lowest coefficients. Additionally the zeroth coefficient was ignored, as the DC component is insignificant for zero-mean data. The cosine distance metric was computed for the 24 zero-mean Fourier coefficients, which served as the final distance metric. Experimentally, this measure performed identically to the cosine metric, yielding 29 of 30 relevant documents or 96.7% precision. Note that this performance was achieved using an order of magnitude fewer parameters.

Though this corpus is admittedly very small, there is no reason that the methods presented here could not be scaled to thousands or even millions of works. Computing the beat spectrum is computationally quite reasonable and can be done several times faster than real time, and even more rapidly if spectral parameters can be derived directly from MP3 compressed data as in [12] and [13]. Additionally, well-known database organization methods can dra-

[Figure 5. Euclidean Distance vs. Tempo. Axes: squared Euclidean distance (0 to 1.5) against tempo (110 to 130 bpm).]

The worst offender: “Genre Identification”
Countless PhDs on this useless task.

Trying to teach a computer a marketing construct.

Show of hands:

Is Bjork “electronic, pop, jazz”?


At MIT I convinced someone to buy lots of computers

And tried to figure out how to get music into music analysis.

Simple things like detecting holiday music are very hard.

I decided if I could get a computer to make holiday music, we could claim we understand it.

Music Acquisition (2001-)
This is automatically generated holiday music based on listening to 1,000 Christmas songs.

It should be a funny joke that you can run statistics on millions of things and “understand it.”

I built Eigenradio in 2003 to show people what computers hear when they hear music.

There’s obviously so much more to music than the audio signal, and that other stuff is probably more important.

My brother makes music with sine waves and nothing else and gets a 9.7 on Pitchfork. This is fascinating!

Were the sine waves that good?

Review Regression (2004)

It turns out if you understand language and audio at the same time you start learning a lot more.

Here we predict ratings on All Music Guide and Pitchfork by listening to the audio and reading about the artist.

Audio alone was terrible. Text alone was better than audio. Both together were the best.

[Figure: scatter plots of Pitchfork ratings vs. AMG ratings, Pitchfork ratings vs. randomly selected AMG ratings, audio-derived ratings vs. AMG ratings, and audio-derived ratings vs. Pitchfork ratings. Plot annotations: .147 [.080] and .127 [.082].]

I became interested in more ridiculous questions:
“Can we find the saddest song in the world?”

So I started a company in 2005 with my co-founder Tristan, also at the Lab.

Tristan is a DSP “machine listening” expert and I handled the text side.

MAGIC


Why does the Echo Nest exist?

The best music experience is still very manual. I am still reading about music, not using a recommender.

And the act of listening to music is easier than ever.

But data is hard. Most designers make very bad decisions because their tools are inefficient.

Collaborative filtering (X who did Y also did Z) is so easy to make, but it’s also so terrible.

The SQL join is destroying music.
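To make the "X who did Y also did Z" pattern concrete, here is a rough sketch of the kind of self-join it amounts to, written as plain Python over (listener, artist) pairs. The play data and artist names are invented placeholders, and this is only the simplest co-occurrence form of collaborative filtering, not any particular product's method.

```python
from collections import Counter, defaultdict

def also_listened(plays):
    """For each artist, count listeners shared with every other artist (a self-join on listener)."""
    by_listener = defaultdict(set)
    for listener, artist in plays:
        by_listener[listener].add(artist)
    co_counts = defaultdict(Counter)
    for artists in by_listener.values():
        for a in artists:
            for b in artists:
                if a != b:
                    co_counts[a][b] += 1
    return co_counts

# Invented listening data: everyone plays the popular artist, few play the small ones.
plays = [
    ("u1", "Popular"), ("u1", "Tiny A"),
    ("u2", "Popular"), ("u2", "Tiny B"),
    ("u3", "Popular"), ("u3", "Tiny A"),
    ("u4", "Popular"),
]
recs = also_listened(plays)
top = {artist: counts.most_common(1)[0][0] for artist, counts in recs.items()}
print(top)
```

Even in this tiny example both small artists point at the popular one and never at each other, which is the feedback loop the talk objects to.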


In 2005 we modeled the worst-case scenario, in which collaborative filtering was the only way for an artist to get noticed.

The popular ones would eat the unknown ones alive.

3 sets of 3 artists each remained.

Set A: Britney Spears, Backstreet Boys, Christina Aguilera
Set B: Alice in Chains, Korn, Faith No More
Set C: Chris Isaak, Bob Dylan, Crowded House

So the Echo Nest gives everyone great data. They can decide on their own how to show it.

The Echo Nest 2005

Somerville, MA USA
2 people
2 computers
Lots of ideas
1m documents
10,000 artists
100,000 songs
0 public facing sites

The Echo Nest 2009

Somerville, MA USA
20 people
200 computers
Lots of products
5bn documents
1,000,000 artists
many millions of songs
0 public facing sites

What We Do

“Know everything about music and listeners.”

“Give (and sell) great data to everyone.”

“Do it automatically with no bias, on everything.”

Code, Customers, Crawling, Machine Learning, NLP, DSP

Artist Data
• Tag Clouds
• Similar Artists
• Analytics
• Familiarity
• Hotttnesss
• Blogs
• News
• Reviews
• Audio
• Video
• Profile Sites
• Misspellings
• Aliases

Song Data
• Similar Songs
• Tempo
• Key
• Mode
• Time Signature
• Beats
• Downbeats
• Segments
• Timbre
• Pitch
• Loudness
• Sections

Listener Data
• demographics - age, gender, location
• psychographics - preferences, lifestyle
• music preference
• listening patterns
• tastemaker profiling - writers, bloggers

We have a lot of data and we have a lot of products.

We sell mostly to social networks, labels, video games, PR firms, musicians.

Similarity

Acoustic analysis

Artist metrics

Feeds

Remix

Recommendation

Search / Tags

Metadata

Predictive analytics

The reason we are special is 2 things:

Scale and Platform

Our scale is limitless. We have hundreds of computers.

We always do our computation on everything. We can learn about new music very quickly.

Scale

                               All Music Guide   Pandora   The Echo Nest
known artists                  280,000           80,000    1,000,000
years to get there             18                8         1
time to understand one album   1 week            1 day     <1 minute
cost to understand one album   $400              $40       $0.001

Our platform is huge. We have thousands of “free” developers using our API.

Our customers use the same platform. So do we.

Platform

We sell two main products:

Fanalytics is a predictive analytics toolset for artists.

The Knowledge is a dynamic metadata service (recommendation, feeds, data) for web sites.

Fanalytics lets artists and labels get a view into the world of online music.

We recommend blogs for artists. We show predicted analytics on activity.

Predictive analytics

Artist metrics

We also maintain a popular open source remixing community and code base so people can make awesome free mashups, remixes, web sites using our tech.

Not much of a business but we love it.

Remix

“DonkDJ.com” was made using Remix. It automatically “donks” (ask someone what this means) any song you upload.

Morecowbell.dj adds cowbell to any song.

This Is My Jam was a pre-Muxtape (by one day) mixtape sharing site that only let you use 30s samples and made a total mess of the output.

Like I said, not much of a business.

We also have artists using Remix. Our data is now powering some next-generation electronic music.

I’ve always wanted to hear Michael Jackson trying to sing Amerie’s “One Thing” automatically by comparing timbre, pitch and loudness distances.

-B.L.

James Brown... FOREVER.

Remix also works on video


Let’s hear Daft Punk’s “Revolution 909” played by a fight scene from Undefeatable!

-Y.A.

Our analysis data powers a lot of visualizers and video games (rhythm games on your own MP3s).

Acoustic analysis

The Knowledge is a much better music data service. Customers can subscribe to constantly-updated similarity, metadata, feeds, recommendations, etc.

Our similarity and recommendation data is some of the best, because we use so many sources and we know about all artists even if they are tiny.

Similarity

Feeds

Since our similarity is based on so many features (popularity, audio analysis, text analysis, structured metadata, influences, ...), we provide our customers with the knobs and let them decide what is important for the task.

We do not give a “single answer.”

There is no single answer.
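One way to picture the "knobs" is as per-feature weights on a blended score. The sketch below is only an illustration: the function name, the weighted-average formula, the feature names chosen and all the numbers are assumptions, not The Echo Nest's actual API or scoring.

```python
def blended_similarity(per_feature, weights):
    """Weighted average of per-feature similarity scores (each assumed to lie in 0..1)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(weights[name] * per_feature[name] for name in weights) / total

# Hypothetical per-feature scores for one candidate artist against a seed artist.
scores = {"popularity": 0.9, "audio": 0.2, "text": 0.6}

# Two hypothetical knob settings for two different tasks.
playlist_knobs = {"popularity": 0.0, "audio": 3.0, "text": 1.0}   # favour how it sounds
discovery_knobs = {"popularity": 1.0, "audio": 1.0, "text": 2.0}  # favour how it is described

print(blended_similarity(scores, playlist_knobs))
print(blended_similarity(scores, discovery_knobs))
```

The same pair of artists gets a different score under each setting, which is the sense in which there is no single answer.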


Similarity


We can build paths between artists on any vector
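A path between artists can be sketched as a shortest-path search over a "similar to" graph built from whichever feature is chosen. The talk does not say how paths are computed, so the breadth-first search, the function name and the placeholder graph below are all assumptions for illustration.

```python
from collections import deque

def artist_path(neighbors, start, goal):
    """Breadth-first search: the shortest chain of similar artists from start to goal."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            # Walk the predecessor links back to the start, then reverse.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for nxt in neighbors.get(current, []):
            if nxt not in previous:
                previous[nxt] = current
                queue.append(nxt)
    return None  # no chain of similar artists connects the two

# Placeholder graph: an edge means "similar on the chosen feature" (invented links).
similar_on_timbre = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E"],
}
print(artist_path(similar_on_timbre, "A", "E"))
```

Swapping in a graph built from a different feature (tempo, tags, influences) gives a different path between the same two artists.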

Similarity

Acoustic analysis

Search / Tags

Our future:


1. Listener analytics

We’ve been running large scale data mining on millions of listeners to help with analytics, for example a gender predictor based on your music taste.

Here are the basis vectors, the strongest correlators of gender:

Male            Female
Pet Shop Boys   Eternal
Fort Minor      Metro Station
Justice         Gackt
Mike Oldfield   Paolo Nutini
U2              London After Midnight

2. More musicians to use our remix tools

(I’ve noticed the better you are with computers, the worse your music is. This may just be me.)

[Figure: music goodness (0% to 100%) vs. computers know-how (nothing, not much, a little, somewhat, pretty good, expert, dork prime)]

3. Search anything APIs

We will soon make all of our acoustic data available for searching and browsing (right now it has to be your content):

“Find me a drum hit in this collection that sounds like the break in ‘Single Ladies’”

Combined with Remix this will allow anyone to compose music that uses all music in the world.

>>> import random
>>> from echonest import search
>>> segments = search.query("voice", soundsLike="bjork", pitch="F#")
>>> len(segments)
65706
>>> random.shuffle(segments)  # shuffles in place and returns None
>>> segments.write("bjork2009.mp3")

To wrap up:

1. Don’t trust computers
2. But trust us, really
3. Sorry I can’t speak very well