Machine Learning
Queens College
Lecture 1: Introduction
Today
• Welcome
• Overview of Machine Learning
• Class Mechanics
• Syllabus Review
My research and background
• Speech
  – Analysis of Intonation
  – Segmentation
• Natural Language Processing
  – Computational Linguistics
• Evaluation Measures
• All of this research relies heavily on Machine Learning
You
• Why are you taking this class?
• What is your background and comfort with
  – Calculus
  – Linear Algebra
  – Probability and Statistics
• What is your programming language of preference?
  – C++, Java, or Python are preferred
Machine Learning
• Automatically identifying patterns in data
• Automatically making decisions based on data
• Hypothesis:
  (Data → Learning Algorithm → Behavior) ≥ (Data → Programmer or Expert → Behavior)
Machine Learning in Computer Science
• Biomedical/Chemical Informatics
• Financial Modeling
• Natural Language Processing
• Speech/Audio Processing
• Planning
• Locomotion
• Vision/Image Processing
• Robotics
• Human Computer Interaction
• Analytics
Major Tasks
• Regression
  – Predict a numerical value from “other information”
• Classification
  – Predict a categorical value
• Clustering
  – Identify groups of similar entities
• Evaluation
Feature Representations
• How do we view data?
Entity in the World → Feature Extraction → Feature Representation → Machine Learning Algorithm
• Entities in the world: Web Page, User Behavior, Speech or Audio Data, Vision, Wine, People, etc.
Our Focus
Feature Representations
Height  Weight  Eye Color  Gender
66      170     Blue       Male
73      210     Brown      Male
72      165     Green      Male
70      180     Blue       Male
74      185     Brown      Male
68      155     Green      Male
65      150     Blue       Female
64      120     Brown      Female
63      125     Green      Female
67      140     Blue       Female
68      165     Brown      Female
66      130     Green      Female
Classification
• Identify which of N classes a data point, x, belongs to.
• x is a column vector of features.
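One common way to write such a feature vector is shown here; the symbol D for the number of features and the indexing from 1 are notational assumptions, not taken from the slide. The vector can be written as a column, or more compactly as the transpose of a row:

```latex
\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_D \end{pmatrix}
\qquad \text{or} \qquad
\mathbf{x}^{T} = (x_1, x_2, \ldots, x_D)
```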
Target Values
• In supervised approaches, in addition to a data point, x, we will also have access to a target value, t.
Goal of Classification
Identify a function y, such that y(x) = t
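To make "identify a function y such that y(x) = t" concrete, here is a hedged sketch using a nearest-centroid classifier on the height and weight columns of the table in these slides. The slides do not name a particular classifier; nearest-centroid is chosen here only because it is short, and `fit_centroids` is a hypothetical helper name.

```python
import math

# Training pairs (x, t): x = (height, weight), t = gender, copied from the table.
train = [
    ((66, 170), "Male"), ((73, 210), "Male"), ((72, 165), "Male"),
    ((70, 180), "Male"), ((74, 185), "Male"), ((68, 155), "Male"),
    ((65, 150), "Female"), ((64, 120), "Female"), ((63, 125), "Female"),
    ((67, 140), "Female"), ((68, 165), "Female"), ((66, 130), "Female"),
]

def fit_centroids(data):
    """Average the feature vectors of each class."""
    sums, counts = {}, {}
    for x, t in data:
        s = sums.setdefault(t, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[t] = counts.get(t, 0) + 1
    return {t: [v / counts[t] for v in s] for t, s in sums.items()}

def y(x, centroids):
    """Predict the class whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda t: math.dist(x, centroids[t]))

centroids = fit_centroids(train)
print(y((72, 190), centroids))
print(y((63, 118), centroids))
```

A learned y will not necessarily reproduce t on every training point: with these centroids, the training point (68, 165) lies closer to the Male centroid even though its target is Female.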
Graphical Example of Classification
Decision Boundaries
Regression
• Regression is a supervised machine learning task.
  – So a target value, t, is given.
• Classification: nominal t
• Regression: continuous t
Goal of Regression
Identify a function y, such that y(x) = t
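For a continuous target, one simple choice of y is a straight line fitted by least squares. The sketch below predicts weight from height using the values in the table from these slides; treating weight as the target and restricting y to a line are assumptions made for illustration.

```python
# Heights (x) and weights (t) copied from the table in these slides.
xs = [66, 73, 72, 70, 74, 68, 65, 64, 63, 67, 68, 66]
ts = [170, 210, 165, 180, 185, 155, 150, 120, 125, 140, 165, 130]

def fit_line(xs, ts):
    """Closed-form least-squares fit of y(x) = w0 + w1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    cov = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    var = sum((x - mean_x) ** 2 for x in xs)
    w1 = cov / var              # slope
    w0 = mean_t - w1 * mean_x   # intercept
    return w0, w1

w0, w1 = fit_line(xs, ts)

def y(x):
    """Predicted weight for a given height."""
    return w0 + w1 * x

print(f"y(x) = {w0:.2f} + {w1:.2f} * x")
print(f"y(69) = {y(69):.1f}")
```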
Differences between Classification and Regression
• Similar goals: Identify y(x) = t.
• What are the differences?
  – The form of the function, y (naturally).
  – Evaluation
    • Root Mean Squared Error
    • Absolute Value Error
    • Classification Error
    • Maximum Likelihood
  – Evaluation drives the optimization operation that learns the function, y.
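Three of these evaluation measures can be sketched in a few lines of Python. The function names are illustrative, and "Absolute Value Error" is read here as mean absolute error, which is an assumption; maximum likelihood is left out because it depends on the chosen probabilistic model.

```python
import math

def rmse(predictions, targets):
    """Root mean squared error, typically used for regression."""
    n = len(targets)
    return math.sqrt(sum((y - t) ** 2 for y, t in zip(predictions, targets)) / n)

def mean_absolute_error(predictions, targets):
    """Average of |y - t|, less sensitive to large errors than RMSE."""
    n = len(targets)
    return sum(abs(y - t) for y, t in zip(predictions, targets)) / n

def classification_error(predictions, targets):
    """Fraction of predictions that differ from the target class."""
    n = len(targets)
    return sum(y != t for y, t in zip(predictions, targets)) / n

print(rmse([3.0, 5.0], [0.0, 1.0]))
print(mean_absolute_error([3.0, 5.0], [0.0, 1.0]))
print(classification_error(["a", "b", "a", "a"], ["a", "b", "b", "a"]))
```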
Graphical Example of Regression
Clustering
• Clustering is an unsupervised learning task.
  – There is no target value to shoot for.
• Identify groups of “similar” data points that are “dissimilar” from others.
• Partition the data into groups (clusters) that satisfy these constraints:
  1. Points in the same cluster should be similar.
  2. Points in different clusters should be dissimilar.
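One way to pursue both constraints is k-means, which these slides do not name; it is used here only as an illustrative stand-in, with Euclidean distance as the assumed notion of similarity. The sample points and the choice of the first k points as initial centers are also assumptions made to keep the sketch deterministic.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain k-means; the first k points serve as initial centers (a simplification)."""
    centers = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        assignment = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Update step: each center moves to the mean of its points.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return assignment, centers

# Made-up 2-D points forming two loose groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
assignment, centers = kmeans(points, k=2)
print(assignment)
print(centers)
```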
Graphical Example of Clustering
Mechanisms of Machine Learning
• Statistical Estimation
  – Numerical Optimization
  – Theoretical Optimization
• Feature Manipulation
• Similarity Measures
Mathematical Necessities
• Probability
• Statistics
• Calculus
  – Vector Calculus
• Linear Algebra
• Is this a Math course in disguise?
Why do we need so much math?
• Probability Density Functions allow the evaluation of how likely a data point is under a model.
  – Want to identify good PDFs. (calculus)
  – Want to evaluate against a known PDF. (algebra)
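Both steps can be sketched with a one-dimensional Gaussian, chosen here as an assumed example model. The fitting step uses the maximum-likelihood mean and variance, which come from setting derivatives of the log-likelihood to zero (the calculus part); the evaluation step plugs a data point into the fitted density (the algebra part). The height values are taken from the table in these slides.

```python
import math

def fit_gaussian(data):
    """Maximum-likelihood mean and variance of a 1-D Gaussian."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return mu, var

def gaussian_pdf(x, mu, var):
    """Density of a Gaussian with mean mu and variance var, evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

heights = [66, 73, 72, 70, 74, 68, 65, 64, 63, 67, 68, 66]
mu, var = fit_gaussian(heights)

# A height near the mean is far more likely under the model than an extreme one.
print(gaussian_pdf(68, mu, var))
print(gaussian_pdf(80, mu, var))
```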
Gaussian Distributions
• We use Gaussian Distributions all over the place.
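For reference, the standard Gaussian densities are given below in the usual textbook notation, which is assumed here rather than taken from the slides: the univariate form with mean μ and variance σ², and the multivariate form for a D-dimensional x with mean vector μ and covariance matrix Σ.

```latex
p(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)

p(\mathbf{x} \mid \boldsymbol{\mu}, \Sigma) = \frac{1}{(2\pi)^{D/2} \, |\Sigma|^{1/2}}
  \exp\!\left( -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right)
```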
Data Data Data
• “There’s no data like more data”
• All machine learning techniques rely on the availability of data to learn from.
• There is an ever increasing amount of data being generated, but it’s not always easy to process.
  – UCI
    • http://archive.ics.uci.edu/ml/
  – LDC (Linguistic Data Consortium)
    • http://www.ldc.upenn.edu/
  – Contact me for speech data.
• Is all data equal?
Class Structure and Policies
• Course website:
  – http://eniac.cs.qc.cuny.edu/andrew/ml/syllabus.html
• Email list
  – CUNY First has an email function, but most students do not use the associated email address…
  – Put your email address on the sign up sheet.