Welcome to Computer Audition
(ECE 277/477, AME 277/477, CSC 264/464, TEE 477)
Zhiyao Duan
Assistant Professor of ECE and CS
University of Rochester
Human Audition
• Understanding the environment
• Communication
• Entertainment
ECE 477 - Computer Audition, Zhiyao Duan 2019
Computer Audition
• Understanding the environment
• Communication
• Entertainment – entertain humans
Some Key Problems
• Sound source identification
• Source localization
• Content understanding
– Speech, event, melody, rhythm
• Source separation
Tools for Sound Interaction
• Modify: Delphi Theater (300 B.C.)
• Create: Bone Flutes (7000 B.C.)
• Record: Cylinder Phonograph (1899)
• Transmit: Crystal Radio (1914)
Impact on Many Fields
Computer Audition
Psycho-acoustics
Information Retrieval
Music Cognition
Signal Processing
Speech Science
Machine Learning
Many Applications
Violy
Some Demos
• Automatic music accompaniment
– http://www.music.informatics.indiana.edu/~craphael/music_plus_one/movies/movies.html
• Multimedia synchronization
– https://www.audiolabs-erlangen.de/fau/professor/mueller/demos
Some Demos
• Source Separation
– Pop music separation [Takahashi, 2018]
• https://sisec18.unmix.app/#/unmix/AM%20Contra%20-%20Heart%20Peripheral/TAU1
– Violin/piano separation [Li, 2019]
• Mixture: violin: piano:
– Speech/noise separation (speech enhancement) [Eskimez, 2018]
• Mixture: enhanced speech:
– Speech separation [Hershey, 2016]
• Mixture: female #1: female #2:
– Audio-visual speech separation [Afouras, 2018]
• http://www.robots.ox.ac.uk/~vgg/demo/theconversation/demos/vox/0/demo.html
Some Demos
• Soundprism
– J. Brahms, Clarinet Quintet in B minor, Op. 115, 3rd movement
– Single-channel polyphonic music → Source 1, Source 2, …, Source N
Some Demos
• Automatic music transcription
– Multi-instrument transcription
– Context-dependent piano transcription [Cogliati, 2017]
• Input: transcribed:
– Deep learning based: “Onsets and Frames” [Hawthorne, 2017]
• Input: transcribed:
Algorithm Transcription Ground-truth Transcription
Some Demos
• Acoustic event detection and localization
– https://www.youtube.com/watch?v=iImkV6oKG_8
• Voice conversion
– https://www.youtube.com/watch?v=RB7upq8nzIU
• Audio morphing
– https://www.audiolabs-erlangen.de/resources/MIR/2015-ISMIR-LetItBee
Some Demos
• Automatic song writing
– http://www.youtube.com/watch?v=3oGFogwcx-E
• Music Generation
– https://www.youtube.com/watch?v=BfrNiqvKbLQ
• Music harmonization [Yan, 2018]
• Music generation [Yan, under review]
– String trio:
Course Topics
• Fundamentals of human audition
• Auditory models
• Audio features (pitch, timbre, etc.)
• Audio modeling techniques
• State-of-the-art research topics
– Polyphonic pitch analysis
– Source separation
– Sound identification
– ……
Course Objectives
• General understanding of the field
• Deep understanding and hands-on research experience in a sub-field
• Gain experience with the full cycle of research
• Develop the ability to think critically
• Improve presentation and writing skills
Assignments
• Total (110 points)
– Homework (50 points)
– Class paper review (14 points)
– Presentation of research (10 points)
– Course project (30 points)
– Peer feedback (6 points)
• No exams
Grading
• No extra credit
• No curve
• 200-level students get a 10-point boost
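• Grading scale (points)
– A: 93 to 110
– A-: 90 to 93
– B+: 87 to 90
– B: 83 to 87
– B-: 80 to 83
– C+: 77 to 80
– C: 73 to 77
– C-: 70 to 73
– …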
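As a rough illustration of how the point totals and the grading scale fit together, the sketch below sums the assignment categories and maps a total to a letter. The point values and cutoffs come from the slides; the function name `letter_grade`, the dictionary keys, and the choice to treat each cutoff as an inclusive lower bound are assumptions made for illustration, since the slides do not say how boundary scores are handled or whether the boost is capped.

```python
# Maximum points per category, as listed on the Assignments slide.
POINTS = {
    "homework": 50,
    "paper_review": 14,
    "presentation": 10,
    "project": 30,
    "peer_feedback": 6,
}

# Lower cutoffs read off the grading scale, highest first.
# Assumption: a score equal to a cutoff earns the higher letter.
CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"),
    (80, "B-"), (77, "C+"), (73, "C"), (70, "C-"),
]

BOOST_200_LEVEL = 10  # boost for 200-level students, per the Grading slide


def letter_grade(earned, is_200_level=False):
    """Map earned points per category to a letter grade (hypothetical helper)."""
    total = sum(earned.values())
    if is_200_level:
        total += BOOST_200_LEVEL
    for cutoff, letter in CUTOFFS:
        if total >= cutoff:
            return letter
    return None  # below 70: the slide elides the rest of the scale


if __name__ == "__main__":
    earned = {"homework": 40, "paper_review": 11, "presentation": 8,
              "project": 22, "peer_feedback": 4}
    print(sum(POINTS.values()), letter_grade(earned), letter_grade(earned, True))
```

The example student earns 85 points, so the same work lands in a different band depending on whether the 200-level boost applies.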
Important Policies
• Late homework penalty
– 20% deduction each day
• Do your own work
– Discussions are encouraged
– No exchange of code
– No copying of five or more consecutive words
– Cite external sources
• Attendance is not taken, but class discussions are very important for learning
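The late-homework rule can be read in more than one way, so the following is only a sketch under stated assumptions: the 20% is taken as a share of the score the submission would otherwise earn, it accumulates linearly per whole day late, and the result never drops below zero. The function name `late_score` is hypothetical, and none of these details are confirmed by the slide.

```python
def late_score(raw_score, days_late, percent_per_day=20):
    """Score after the late penalty (hypothetical helper).

    Assumptions: linear deduction of `percent_per_day` percent of the raw
    score for each whole day late, floored at zero.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    remaining_percent = max(0, 100 - percent_per_day * days_late)
    return raw_score * remaining_percent / 100


if __name__ == "__main__":
    # A 10-point homework submitted 0 to 6 days late.
    for days in range(7):
        print(days, late_score(10, days))
```

Under this reading, a submission five or more days late earns no credit.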
Prerequisites
• Signal Processing
– ECE 246/446 or ECE 272/472 or equivalent
• MATLAB or Python programming
• Preferred but not required
– Machine learning such as SVM, Markov models, neural networks, clustering, etc.
Three Websites
• Course website
– All materials (lecture notes, readings, assignments, etc.)
– http://www.ece.rochester.edu/~zduan/teaching/ece477
• Blackboard
– Only for announcements and homework submissions
• Piazza
– Only for discussions