CMSE 820 Mathematical Foundations of Data Science · 2017. 1. 11. · CMSE 820 Mathematical...

Post on 19-Apr-2021

CMSE 820

Mathematical Foundations of Data Science

Instructor: Matthew Hirn

Data science

• Process data

• Extract information from data

• Make predictions using data

• Large amounts of data (“Big Data”)

• Often high dimensional (“Curse of Dimensionality”)

Data science

Signal processing: Processing, extracting, and transferring information contained in a multitude of different formats, broadly referred to as signals.

Some examples of data science in use

From data to knowledge

• Recommend movies on Netflix or products on Amazon

• Object recognition in images or automatic image tagging

• Community detection in social networks (e.g., Facebook)

• Automatic medical diagnosis and treatment recommendation

Object recognition

Guang-Tong Zhou, Tian Lan, Weilong Yang, and Greg Mori

Predictive vs descriptive

Supervised vs unsupervised machine learning

Classification

Training phase:

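$$
\underbrace{\{(x_1, y_1), \ldots, (x_n, y_n)\}}_{\text{labeled data}} \subset \mathcal{X} \times \mathcal{Y}
\;\longmapsto\;
\underbrace{f : \mathcal{X} \to \mathcal{Y}, \quad f(x_i) = y_i}_{\text{classification model}},
\qquad |\mathcal{Y}| < \infty
$$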

Testing phase:

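$$
\underbrace{x \in \mathcal{X}}_{\text{new data}}
\;\longmapsto\; \text{classification model} \;\Longrightarrow\;
\underbrace{f(x) = y \in \mathcal{Y}}_{\text{classification result}}
$$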
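The two phases can be illustrated with a small sketch. It is not taken from the slides: the choice of a 1-nearest-neighbor rule, the function names, and the toy data are all assumptions made for illustration. The rule is convenient here because, when the training points are distinct, the resulting model reproduces every training label, matching the condition f(x_i) = y_i.

```python
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def train_nearest_neighbor(data: List[Tuple[Point, str]]) -> Callable[[Point], str]:
    """Training phase: turn labeled data into a model f : X -> Y."""

    def f(x: Point) -> str:
        # Return the label of the training point closest to x.
        _, nearest_label = min(data, key=lambda pair: squared_distance(pair[0], x))
        return nearest_label

    return f


# Hypothetical labeled data with a finite label set Y = {"a", "b"}.
labeled_data = [
    ((0.0, 0.0), "a"),
    ((0.0, 1.0), "a"),
    ((5.0, 5.0), "b"),
    ((6.0, 5.0), "b"),
]

f = train_nearest_neighbor(labeled_data)

# Testing phase: apply the model to a new point x in X.
prediction = f((4.5, 5.5))
print(prediction)
```

Most classifiers used in practice do not fit the training labels exactly; the nearest-neighbor rule is used here only because it makes the training condition easy to check.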

Classification

Example: MNIST

Example: CalTech 101

Regression

Similar to classification, but the model f can have an infinite range!

For example, $\mathcal{Y} = \mathbb{R}$ or $\mathcal{Y} = [0, 1]$.
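As a minimal sketch of a regression model with real-valued output, the following fits a line by ordinary least squares. The method, the function name `fit_line`, and the toy data are assumptions chosen for illustration, not something specified on the slides. Unlike the classification setup, a least-squares fit does not in general satisfy f(x_i) = y_i exactly; the toy data below happen to lie on a line, so the fit is exact here.

```python
from typing import Callable, Sequence


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Callable[[float], float]:
    """Fit f(x) = slope * x + intercept by ordinary least squares.

    Assumes the xs are not all equal, so the denominator below is nonzero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

f = fit_line(xs, ys)

# The model takes values in all of R, not in a finite label set.
prediction = f(4.0)
print(prediction)
```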

Clustering

Example: Bickley jet

Ralf Banisch

Dimensionality reduction

Principal Component Analysis
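The slides only name the method, so the following is a hedged sketch rather than the course's formulation. It assumes the usual definition of the first principal component as the top eigenvector of the sample covariance matrix and approximates it by power iteration. The function name, the iteration count, and the toy data are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple


def first_principal_component(
    points: Sequence[Sequence[float]], iterations: int = 100
) -> Tuple[List[float], List[float]]:
    """Return (mean, unit direction of largest variance).

    Assumes the data are not all identical and that the all-ones starting
    vector is not orthogonal to the top eigenvector of the covariance.
    """
    n = len(points)
    d = len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]

    # Sample covariance matrix (normalized by n).
    cov = [
        [sum(c[i] * c[j] for c in centered) / n for j in range(d)]
        for i in range(d)
    ]

    # Power iteration: repeatedly apply the covariance and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return mean, v


# Hypothetical 2-D data lying on the line y = 2x.
points = [(float(x), 2.0 * x) for x in range(-2, 3)]
mean, direction = first_principal_component(points)

# Reduce each 2-D point to one coordinate along the principal direction.
scores = [
    sum((p[j] - mean[j]) * direction[j] for j in range(2)) for p in points
]
print(direction, scores)
```

Because the toy data lie on a single line, one coordinate per point is enough to recover them exactly; for general data the projection discards the variance in the remaining directions.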

Manifold learning

Example: Lip motions in speech

Stéphane Lafon, Yosi Keller, and Ronald R. Coifman

Example: Chemistry

Sandip De, Albert P. Bartók, Gábor Csányi, and Michele Ceriotti

Compressed sensing

Example: Single pixel camera

Digital Signal Processing Group

Kelly Lab

Department of Electrical and Computer Engineering

Rice University

Syllabus

My information

• Instructor: Matthew Hirn

• Office: 2507F, Engineering Building

• Email: mhirn@msu.edu

• Phone: (517) 432-0611

• Course webpage: MSU Desire2Learn (D2L) course page

Office hours

• Tuesday, 3:00 - 4:00 PM

• Friday, 3:00 - 4:00 PM

• By appointment

Grading

• Homework exercises: 35%

• Midterm: 15%

• Project: 15%

• Final Exam: 35%

Exam dates

• Midterm: Thursday, March 2 (in class)

• Final: Thursday, May 4, 7:45 AM - 9:45 AM (same place)

• These are cumulative, closed book exams

Exercises

• Will be posted on D2L on a rolling basis

• After each class, anywhere from zero to a few exercises

• Generally due one week after they are posted

• Some will be programming (MATLAB)

• Others will be mathematical proofs

• All solutions must be typed and submitted online through D2L

Project

• Opportunity to explore an application of the mathematical theory we will develop

• Will be developed in stages throughout the semester