ChucK! @ Harvestworks
Part 3: Audio analysis & machine learning

Rebecca Fiebrink
Princeton University

Real-time audio analysis

• Goal: Analyze audio within the same sample-synchronous framework as synthesis & interaction.

The Unit Analyzer

[Diagram labels: Impulse generator, BiQuadFilter (center freq, radius), DAC, "Send impulse"; FFT, spectral feature extractors, IFFT, time-domain feature extractors. Legend: UGen = Unit Generator (old); UAna = Unit Analyzer (new).]

The Unit Analyzer

• Like a unit generator
  – Black box for computation
  – Plug into a directed graph/network/patch

• Unlike a unit generator
  – Input is samples, data, and/or metadata
  – Output is samples, data, and/or metadata
  – Not tied to sample rate; computed on demand

=>  chuck
=^  upchuck

See upchuck_operator.ck, upchuck_function.ck, continuous_feature_extraction.ck

The UAnaBlob

• Upchucked by UAna
• Generic representation for metadata
  – Real and complex arrays
  – Spectra, feature values, or user-defined
  – Timestamped
• One associated with each UAna

FFT/IFFT

• Takes care of:
  – Buffering input / overlap-adding output
  – Maintaining window and FFT sizes
  – Mediating audio rate and analysis “rate”

• FFT outputs complex spectrum as well as magnitude spectrum
  – Low-level: access/modify contents manually
  – High-level: connect FFT to spectral processing UAnae

See ifft.ck, ifft_transformation.ck
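
The path from a frame of samples to a magnitude spectrum, and from there to a spectral feature, can be illustrated outside ChucK. The following is a rough Python sketch, not ChucK code and not how ChucK's FFT UAna is implemented. It uses a naive DFT for clarity, and the sample rate, frame size, test tone, and all function names are assumptions chosen for the illustration.

```python
import cmath
import math


def hann_window(n):
    """Return an n-point (periodic) Hann window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft(frame):
    """Naive O(n^2) discrete Fourier transform; adequate for a small demo frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def magnitude_spectrum(frame):
    """Window the frame and return magnitudes for bins 0..n/2."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hann_window(n))]
    return [abs(c) for c in dft(windowed)[: n // 2 + 1]]


def spectral_centroid(magnitudes, sample_rate, frame_size):
    """Magnitude-weighted mean frequency in Hz (0.0 for a silent frame)."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    bin_hz = sample_rate / frame_size
    return sum(k * bin_hz * m for k, m in enumerate(magnitudes)) / total


# Assumed demo values: a 1000 Hz sine sampled at 8000 Hz, one 64-sample frame.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
frame = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(FRAME_SIZE)]

mags = magnitude_spectrum(frame)
centroid = spectral_centroid(mags, SAMPLE_RATE, FRAME_SIZE)
print(f"peak bin: {mags.index(max(mags))}, centroid: {centroid:.1f} Hz")
```

In a real-time setting the same computation would run once per analysis frame rather than once per sample, which is the gap between audio rate and analysis "rate" that the FFT UAna mediates.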

Example: Cross-synthesis

• Apply the spectral envelope of one sound to another sound
  – Ex: xsynth_robot123.ck, xsynth_guitar123.ck
  – Voice spectrum taken from:

Machine learning for live performance

• Problem: How do we use audio and gestural features?
  – There is a semantic gap between the raw data that computers use and the musical, cultural, aesthetic meanings that humans perceive and assign.

One solution: A lot of code

• What algorithm would you design to tell a computer whether a picture contains a human face?

The problem

• If your algorithm doesn’t work, how can you fix it?

• You can’t easily reuse it to do a similar task (e.g., recognizing non-human faces, such as monkey faces)

• There’s no “theory” for how to write a good algorithm

• It’s a lot of work!

Another solution: Machine learning (Classification)

• Classification is a data-driven approach for applying labels to data. Once a classifier has been trained on a training set that includes the true labels, it will predict labels for new data it hasn’t seen before.

Train the classifier on a labeled dataset

[Diagram: a data set, with a feature vector and class for every data point, is fed to the classifier.]

Run the trained classifier on new data

[Diagram: a new data point is fed to the classifier, which outputs the label "NO!"]

Candidates for classification

• Which gesture did the performer just make with the iCube?
• Which instruments are playing right now?
• Who is singing? What language are they singing in?
• Is this chord major or minor?
• Is this dancer moving quickly or slowly?
• Is this music happy or sad?
• Is anyone standing near the camera?

An example algorithm: kNN

• The features of an example are treated as its coordinates in n-dimensional space

• To classify a new example, the algorithm looks for its k (maybe 10) nearest neighbors in that space and chooses the most common class among them.
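
The two bullets above can be sketched in a few lines. This is a minimal Python illustration of kNN under stated assumptions, not the workshop's ChucK code: the weight/height numbers are invented, Euclidean distance is assumed as the distance measure, and the function name is hypothetical.

```python
import math
from collections import Counter


def knn_classify(training_set, query, k=3):
    """Return the most common label among the k training points nearest to query.

    training_set is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    by_distance = sorted(training_set, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Invented examples: (weight in kg, height in cm) with a class label.
TRAINING_SET = [
    ((95.0, 201.0), "basketball"),
    ((100.0, 206.0), "basketball"),
    ((90.0, 198.0), "basketball"),
    ((150.0, 180.0), "sumo"),
    ((160.0, 185.0), "sumo"),
    ((145.0, 178.0), "sumo"),
]

prediction = knn_classify(TRAINING_SET, (155.0, 182.0), k=3)
print(prediction)
```

Because the distance mixes both features directly, a feature with a larger numeric range dominates the result; scaling features to comparable ranges before classification is a common precaution.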

kNN space: Basketball or Sumo?

[Figure, shown in four steps: labeled examples plotted with Feature 1 (weight) on the horizontal axis and Feature 2 (height) on the vertical axis. An unlabeled point "?" is added, its K=3 nearest neighbors are found, and the point is labeled "S".]

SMIRK (small music information retrieval toolkit)

• For real-time application of machine learning
  – Learning in ChucK
  – E.g., kNN gesture classification, musical audio genre/artist classification

Interaction & on-the-fly learning

• Can we make the process of training a classifier interactive? Performative?

Another technique: Neural networks

• Very early method
• Inspired by the brain
• Results in highly non-linear functions from input to output
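
To make "non-linear functions from input to output" concrete, here is a small Python sketch of a network with one hidden layer that computes XOR, a mapping no single linear function of the inputs can produce. The weights are hand-picked for the illustration rather than learned, and the function names are hypothetical; a real network would obtain its weights by training on examples.

```python
import math


def sigmoid(x):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tiny_network(x1, x2):
    """Two hidden units and one output unit, with hand-picked (not trained) weights."""
    h_or = sigmoid(20 * x1 + 20 * x2 - 10)   # close to 1 if either input is 1
    h_and = sigmoid(20 * x1 + 20 * x2 - 30)  # close to 1 only if both inputs are 1
    return sigmoid(20 * h_or - 20 * h_and - 10)  # "or, but not and" = XOR


# Evaluate the network on all four binary input pairs and round to 0 or 1.
outputs = {(a, b): round(tiny_network(a, b)) for a in (0, 1) for b in (0, 1)}
for inputs, value in outputs.items():
    print(inputs, "->", value)
```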

Combining Techniques with Wekinator

[Diagram: two components connected via OSC.]
• ChucK: Pass features to Java, receive results back and use them to make sound
• Java: Train a neural network to map features to sounds

Example: Wekinator

See performance video at http://wekinator.cs.princeton.edu/video/nets0.mov
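
For a sense of what passing features over OSC involves, the sketch below builds an OSC 1.0 message carrying a list of floats and shows how it could be sent over UDP. It is a Python illustration only: the address "/features", the host, and the port are hypothetical placeholders, not the addresses Wekinator actually uses, and the function names are invented for this example.

```python
import socket
import struct


def osc_pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC strings require."""
    padded_len = (len(data) // 4 + 1) * 4
    return data.ljust(padded_len, b"\x00")


def encode_osc_floats(address: str, values) -> bytes:
    """Build an OSC message: address, type tags (one 'f' per value), big-endian floats."""
    type_tags = "," + "f" * len(values)
    payload = struct.pack(">%df" % len(values), *values)
    return osc_pad(address.encode("ascii")) + osc_pad(type_tags.encode("ascii")) + payload


def send_features(values, host="127.0.0.1", port=9000):
    """Send one feature vector over UDP. Host, port, and address are placeholders."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_osc_floats("/features", values), (host, port))


# Encode two feature values without sending them anywhere.
message = encode_osc_floats("/features", [0.5, 0.25])
print(len(message), message)
```

The receiving side reads the address and type tags to know how many floats follow, which is what lets two programs in different languages exchange feature vectors and results.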

Review

• Machine learning can be used to:
  – Apply meaningful labels (classification)
  – Learn (& re-learn) functions from inputs to outputs (e.g., neural networks)
• Appropriate for camera, audio, sensors, and many other types of data
• Live, interactive performance is a very interesting application area
• http://wekinator.cs.princeton.edu

Wrap-up

• Thanks for coming, thanks to Harvestworks!
• See resources on handout; workshop webpage with slides & code
• Please fill out evaluation forms!