Posted on 19-Apr-2021
CMSE 820
Mathematical Foundations of Data Science
Instructor: Matthew Hirn
Data science
• Process data
• Extract information from data
• Make predictions using data
• Large amounts of data (“Big Data”)
• Often high dimensional (“Curse of Dimensionality”)
Data science
Signal processing: Processing, extracting, and transferring information contained in a multitude of different formats, broadly referred to as signals.
Some examples of data science in use
From data to knowledge
• Recommend movies on Netflix or products on Amazon
• Object recognition in images or automatic image tagging
• Community detection in social networks (e.g., Facebook)
• Automatic medical diagnosis and treatment recommendation
Object recognition
Guang-Tong Zhou, Tian Lan, Weilong Yang, and Greg Mori
Predictive vs descriptive
Supervised vs unsupervised machine learning
Classification
Training phase:
$$\underbrace{\{(x_1, y_1), \ldots, (x_n, y_n)\}}_{\text{labeled data}} \subset X \times Y \;\longmapsto\; \underbrace{f : X \to Y,\ f(x_i) = y_i}_{\text{classification model}}, \qquad |Y| < \infty$$
Testing phase:
$$\underbrace{x \in X}_{\text{new data}} \;\longmapsto\; \text{classification model} \;\Longrightarrow\; \underbrace{f(x) = y \in Y}_{\text{classification result}}$$
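The two phases can be illustrated with a minimal sketch. It assumes a 1-nearest-neighbor rule as the classification model, which is only one possible choice and is not prescribed by the slides. The function name `train_nearest_neighbor` and the toy data are hypothetical.

```python
import math

def train_nearest_neighbor(labeled_data):
    """Training phase: build a classifier f from labeled pairs (x_i, y_i).

    Assumption: a 1-nearest-neighbor model, which memorizes the training set.
    """
    points = [(tuple(x), y) for x, y in labeled_data]

    def f(x):
        # Testing phase: return the label of the closest training point.
        nearest = min(points, key=lambda p: math.dist(p[0], x))
        return nearest[1]

    return f

# Hypothetical labeled data in X = R^2 with finite label set Y = {"a", "b"}.
training_set = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"),
                ((1.0, 1.0), "b"), ((0.9, 1.1), "b")]
classifier = train_nearest_neighbor(training_set)

# With distinct training points, the model satisfies f(x_i) = y_i.
assert all(classifier(x) == y for x, y in training_set)

# Classify a new data point x in X.
prediction = classifier((0.8, 0.8))
```

Here the label set is finite, matching the condition $|Y| < \infty$. The exact-fit property $f(x_i) = y_i$ holds for this model only when no two training points coincide with different labels.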
Example: MNIST
Example: CalTech 101
Regression
Similar to classification, but the model f can have an infinite range!
For example, $Y = \mathbb{R}$ or $Y = [0, 1]$
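As a small illustration of a model with range $Y = \mathbb{R}$, the sketch below fits a line by ordinary least squares. Least squares is an assumed choice of method, and the helper name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of f(x) = slope * x + intercept to pairs (x_i, y_i)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance of x and y, and variance of x (both up to a common factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Hypothetical training data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
regressor = fit_line(xs, ys)

# The model returns a real number for any new input, not one of finitely many labels.
value = regressor(1.5)
```

Unlike the classification setting, the output of `regressor` can be any real number.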
Clustering
Example: Bickley jet
Ralf Banisch
Dimensionality reduction
Principal Component Analysis
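A minimal sketch of the idea, assuming the standard formulation in which the first principal component is the top eigenvector of the sample covariance matrix. Power iteration is used here only to keep the example dependency-free. The function name and the data are hypothetical.

```python
import math

def first_principal_component(points, iterations=200):
    """Return (mean, direction) of the first principal component.

    Assumption: power iteration on the sample covariance matrix, started
    from the all-ones vector (this start fails if it is orthogonal to the
    top eigenvector).
    """
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    cov = [[sum(row[i] * row[j] for row in centered) / (n - 1)
            for j in range(d)] for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return mean, v

# Hypothetical 2D data lying close to the line y = x.
data = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]
mean, direction = first_principal_component(data)

# Reduce each point from two coordinates to one: its coordinate along the component.
scores = [sum((p[j] - mean[j]) * direction[j] for j in range(2)) for p in data]
```

Each 2D point is replaced by a single number in `scores`, which is the dimensionality reduction step.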
Manifold learning
Example: Lip motions in speech
Stéphane Lafon, Yosi Keller, and Ronald R. Coifman
Example: Chemistry
Sandip De, Albert P. Bartók, Gábor Csanyi and Michele Ceriotti
Compressed sensing
Example: Single pixel camera
Digital Signal Processing Group
Kelly Lab
Department of Electrical and Computer Engineering
Rice University
Syllabus
My information
• Instructor: Matthew Hirn
• Office: 2507F, Engineering Building
• Email: mhirn@msu.edu
• Phone: (517) 432-0611
• Course webpage: MSU Desire2Learn (D2L) course page
Office hours
• Tuesday, 3:00 - 4:00 PM
• Friday, 3:00 - 4:00 PM
• By appointment
Grading
• Homework exercises: 35%
• Midterm: 15%
• Project: 15%
• Final Exam: 35%
Exam dates
• Midterm: Thursday, March 2 (in class)
• Final: Thursday, May 4, 7:45 AM - 9:45 AM (same place)
• These are cumulative, closed book exams
Exercises
• Will be posted on D2L on a rolling basis
• After each class, anywhere from zero to a few exercises
• Generally due one week after they are posted
• Some will be programming (MATLAB)
• Others will be mathematical proofs
• All solutions must be typed and submitted online through D2L
Project
• Opportunity to explore an application of the mathematical theory we will develop
• Will be developed in stages throughout the semester