Supervised Learning & Classification, part I
Reading: DH&S, Ch 1
Administrivia...
•Pretest answers back today
•Today’s lecture notes online after class
•Apple Keynote, PDF, PowerPoint
•PDF & PPT are auto-converted; may be flaky
Your place in history
•Yesterday:
•Course administrivia
•Fun & fluffy philosophy
•Today:
•The basic ML problem
•Branches of ML: the 20,000 foot view
•Intro to supervised learning
•Definitions and stuff
Pretest results: trends
•Courses dominated by math, stat; followed by algorithms; followed by CS530; followed by AI & CS500
•Proficiencies: probability > algorithms > linear algebra
•μ=56%
•σ=28%
The basic ML problem
[Figure: the world produces an observation (a medical image), and the supervised function f(⋅) maps it to the label “Emphysema”]
•Our job: Reconstruct f() from observations
•Knowing f() tells us:
•Can recognize new (previously unseen) instances
•Classification or discrimination
[Figure: a new, previously unseen image labeled “Hashimoto-Pritzker”]
The basic ML problem
•Our job: Reconstruct f() from observations
•Knowing f() tells us:
•Can synthesize new data (e.g., speech or images)
•Generation
The basic ML problem
[Figure: a random source passed through f(⋅) synthesizes a new “Emphysema” image]
The basic ML problem
•Our job: Reconstruct f() from observations
•Knowing f() tells us:
•Can help us understand the process that generated data
•Description or analysis
•Can tell us/find things we never knew
•Discovery or data mining
[Figure: unlabeled data summarized by f(⋅)]
•How many clusters (“blobs”) are there?
•Taxonomy of data?
•Networks of relationships?
•Unusual/unexpected things?
•Most important characteristics?
The basic ML problem
•Our job: Reconstruct f() from observations
•Knowing f() tells us:
•Can help us act or perform better
•Control
•Turn left?
•Turn right?
•Accelerate?
•Brake?
•Don’t ride in the rain?
A brief taxonomy
(highly abbreviated)
All ML:
•Supervised: have “inputs”, have “outputs”; find “best” f()
•Unsupervised: have “inputs”, no “outputs”; find “best” f()
•Reinforcement Learning: have “inputs”, have “controls”, have “reward”; find “best” f()
A brief taxonomy
(highly abbreviated)
All ML: Supervised, Unsupervised, Reinforcement Learning
Supervised learning splits by output type:
•Classification: discrete outputs
•Regression: continuous outputs
A classic example: digits
The post office wants to be able to auto-scan envelopes, recognize addresses, etc.
[Figure: a handwritten ZIP code image, “87131” — which digits are these?]
Digits to bits
[Figure: each digit image is digitized (sensors) into a feature vector of pixel values, e.g.:
255, 255, 127, 35, 0, 0 ...
255, 0, 93, 11, 45, 6 ...]
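The digitize-then-flatten step on this slide can be sketched in a few lines of Python; the 4×4 image and its pixel values below are made up for illustration, not actual scanned digits.

```python
# Toy sketch of the "digits to bits" pipeline: a grayscale image
# (a nested list of pixel intensities in 0-255, as if read off a
# sensor) is flattened into a single feature vector.

image = [
    [255, 255, 127,  35],
    [255,   0,  93,  11],
    [ 45,   6, 200, 180],
    [  0,  64, 128, 255],
]

def to_feature_vector(img):
    """Flatten a 2-D pixel grid into a 1-D feature vector."""
    return [pixel for row in img for pixel in row]

x = to_feature_vector(image)
print(len(x))   # dimension d = 16
print(x[:6])    # first few features: [255, 255, 127, 35, 255, 0]
```

A real OCR system would also normalize and deskew the image first; the point here is only that the learner sees a flat list of numbers, not a picture.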
Measurements & features
•The collection of numbers from the sensors:
255, 0, 93, 11, 45, 6 ...
•... is called a feature vector, a.k.a.,
•attribute vector
•measurement vector
•instance
•Written x = [x_1, x_2, …, x_d]
•where d is the dimension of the vector
•Each x_i is drawn from some range
•E.g., x_i ∈ {0, 1}, x_i ∈ {0, 1, …, 255}, or x_i ∈ ℝ
Measurements & features
•Features (attributes, independent variables) can come in different flavors:
•Continuous
•Discrete
•Categorical or nominal
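The three flavors above can be illustrated with made-up example values (the specific numbers and names here are mine, chosen to echo the pixel and iris examples in these slides):

```python
# Made-up example values, one per feature flavor.
continuous_feature  = 5.1           # e.g., a length in cm: any real value
discrete_feature    = 127           # e.g., a pixel intensity in {0, ..., 255}
categorical_feature = "I. setosa"   # a name: no numeric order between values

# Continuous and discrete features support arithmetic and ordering...
assert 0 <= discrete_feature <= 255
assert continuous_feature < 6.0
# ...categorical ones only support equality/membership tests.
assert categorical_feature in {"I. setosa", "I. versicolor", "I. virginica"}
```

The practical consequence: averaging or ordering categorical values is meaningless, which is why many learners encode them differently from numeric features.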
More on features
•We (almost always) assume that the set of features is fixed & of finite dimension, d
•Sometimes quite large, though (d≥100,000 not uncommon)
•The set of all possible instances is the instance space or feature space, 𝒳
•E.g., 𝒳 = ℝ^d for d continuous features
•or 𝒳 = {0, 1, …, 255}^d for d pixel features
More on features
•Every example comes w/ a class
•A.k.a., label, prediction, dependent variable, etc.
•For classification problems, class label is categorical
•For regression problems, it’s continuous
•Usually called dependent or regressed variable
•We’ll write y for the class label
•E.g., y = “7” or y = “8”
Classes
[Figure: labeled examples — feature vectors such as 255, 255, 127, 35, 0, 0 ... and 255, 0, 93, 11, 45, 6 ... paired with class labels “7” and “8”]
Classes, cont’d
•The possible values of the class variable are called the class set, class space, or range
•Book writes indiv classes as ω_i
•Presumably whole class set is: Ω = {ω_1, ω_2, …, ω_c}
•So y ∈ Ω
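As a concrete sketch of a class space: for the digit example, Ω has c = 10 categorical values. The variable names below are my own, not the book's notation.

```python
# Class space (Omega) for the digit-recognition example:
# ten categorical labels, omega_1 ... omega_c with c = 10.
class_space = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9"}

c = len(class_space)   # number of classes
y = "7"                # a class label attached to one example

assert c == 10
assert y in class_space   # every label must come from the class space
```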
A very simple example
[Figure: three iris flowers — I. setosa, I. versicolor, I. virginica]
•Features: sepal length, sepal width, petal length, petal width
•Feature space, 𝒳 = ℝ^4
•Class space, Ω = {I. setosa, I. versicolor, I. virginica}
Training data
•Set of all available data for learning == training data
•A.k.a., parameterization set, fitting set, etc.
•Denoted 𝒟 = {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)}
•Can write as a matrix, w/ a corresponding class vector: X is the N × d matrix whose i-th row is x_i, and y = [y_1, …, y_N]
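The matrix-plus-class-vector form can be sketched as follows; the three instances below are made-up four-measurement rows echoing the iris example, not real data.

```python
# A tiny made-up training set in matrix form: N = 3 instances
# (rows of X), d = 4 features per instance, plus a class vector y.

X = [
    [5.1, 3.5, 1.4, 0.2],   # instance 1
    [7.0, 3.2, 4.7, 1.4],   # instance 2
    [6.3, 3.3, 6.0, 2.5],   # instance 3
]
y = ["I. setosa", "I. versicolor", "I. virginica"]

N = len(X)      # number of training examples
d = len(X[0])   # dimension of the feature space
assert N == len(y)   # row i of X pairs with label y[i]
print(N, d)          # 3 4
```

The key invariant is the pairing: the i-th row of X and the i-th entry of y describe the same example.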
Finally, goals
•Now that we have 𝒟 and Ω, we have a (mostly) well-defined job:
The supervised learning problem:
Find the function f̂ : 𝒳 → Ω that most closely approximates the “true” function f()
Goals?
•Key Questions:
•What candidate functions do we consider?
•What does “most closely approximates” mean?
•How do you find the one you’re looking for?
•How do you know you’ve found the “right” one?
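One concrete (and deliberately naive) answer to the first three questions: take the candidate functions to be nearest-neighbor rules, let "most closely approximates" mean "agrees with the training examples", and label each new instance by the class of its closest training point. A minimal sketch with made-up 2-D data:

```python
import math

# Made-up training data: 2-D feature vectors from two classes.
train_X = [[1.0, 1.0], [1.2, 0.8], [8.0, 9.0], [7.5, 8.5]]
train_y = ["A", "A", "B", "B"]

def f_hat(x):
    """1-nearest-neighbor approximation to the unknown true f()."""
    dists = [math.dist(x, xi) for xi in train_X]
    return train_y[dists.index(min(dists))]

print(f_hat([1.1, 0.9]))   # -> A (nearest the first cluster)
print(f_hat([8.2, 8.8]))   # -> B (nearest the second cluster)
```

The last question, how you know you've found the "right" f̂, is exactly what this rule dodges: it fits the training data perfectly, which says nothing yet about unseen instances.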