Tommi S. Jaakkola MIT AI Lab

Machine learning: lecture 1

Tommi S. Jaakkola

MIT AI Lab

6.867 Machine learning: administrivia

• Instructor: Prof. Tommi Jaakkola

• General info

– lectures TR 2.30-4pm

– tutorials/recitations (time/location tba)

• Grading

– midterm (15%), final (25%)

– 6 (≈ bi-weekly) problem sets (30%)

– final project (30%)

Broader context

• What is learning anyway?

• Principles of learning are “universal”

– society (e.g., scientific community)

– animal (e.g., human)

– machine

• Prediction is the key...

Prediction

• We make predictions all the time but rarely investigate the

processes underlying our predictions

• In carrying out scientific research we are also governed by

how theories are evaluated

• To automate the process of making predictions we need to

understand in addition how we search and refine “theories”

Machine learning

• Statistical machine learning

– principles, methods, and algorithms for learning and

prediction on the basis of past experience

– already everywhere: speech recognition, hand-written

character recognition, information retrieval, operating

systems, compilers, fraud detection, security, defense

applications, ...

Example

• A classification problem: predict the grades for students taking this course

• Key steps:

1. data: what “past experience” can we rely on?

2. assumptions: what can we assume about the students or the course?

3. representation: how do we “summarize” a student?

4. estimation: how do we construct a map from students to grades?

5. evaluation: how well are we predicting?

6. model selection: perhaps we can do even better?

• The data we have available (in principle):

– names and grades of students in past years' ML courses

– academic record of past and current students

• “training” data:

Student   ML   course 1   course 2   ...
Peter     A    B          A          ...
David     B    A          A          ...

• “test” data:

Student   ML   course 1   course 2   ...
Jack      ?    C          A          ...
Kate      ?    A          A          ...

• Anything else we could use?

Assumptions

• There are many assumptions we can make to facilitate

predictions

1. the course has remained roughly the same over the years

2. each student performs independently from others

Representation

• Academic records are rather diverse so we might limit the

summaries to a select few courses

• For example, we can summarize the i-th student (say Pete) with a vector

x_i = [A C B]

where the grades correspond to (say) 18.06, 6.041, and 6.034.
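
A minimal sketch of this summary in Python. The slides do not say how letter grades are turned into numbers, so the grade-point mapping `GRADE_POINTS` and the helper `encode_student` are assumptions made for illustration only.

```python
# Hypothetical grade-point mapping; the slides do not specify a numeric encoding.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

# Courses used for the summary, in the order given on the slide.
COURSES = ("18.06", "6.041", "6.034")


def encode_student(record):
    """Map a dict of course -> letter grade to a numeric vector over COURSES."""
    return [GRADE_POINTS[record[course]] for course in COURSES]


# Pete's summary [A C B]; courses outside COURSES are ignored.
pete = {"18.06": "A", "6.041": "C", "6.034": "B", "6.001": "A"}
x_i = encode_student(pete)
print(x_i)  # [4, 2, 3]
```

Restricting the record to a fixed, ordered set of courses gives every student a vector of the same length, which is what makes students comparable later on.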

• The available data in this representation

Training                   Test
Student   ML grade         Student   ML grade
x_1       A                x′_1      ?
x_2       B                x′_2      ?
...       ...              ...       ...

Estimation

• Given the training data

Student   ML grade
x_1       A
x_2       B
...       ...

we need to find a mapping from “input vectors” x to “labels” y encoding the grades for the ML course.

• Possible solution (nearest neighbor classifier):

1. For any student x find the “closest” student x_i in the training set

2. Predict y_i, the grade of the closest student
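
The two steps can be sketched in Python as follows. The slides leave the meaning of “closest” open; squared Euclidean distance over numeric grade points is an assumption here, and the training data is made up for illustration.

```python
def distance(a, b):
    """Squared Euclidean distance between two equal-length numeric vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_neighbor_predict(x, train_x, train_y):
    """Return the label of the training vector closest to x (ties: first found)."""
    best = min(range(len(train_x)), key=lambda i: distance(x, train_x[i]))
    return train_y[best]


# Made-up training data: grade points in three earlier courses, and ML grades.
train_x = [[4, 2, 3], [3, 4, 4], [2, 2, 1]]
train_y = ["A", "B", "C"]

print(nearest_neighbor_predict([4, 3, 3], train_x, train_y))  # A
```

The classifier stores the training set as is and does all of its work at prediction time, so the choice of distance is where the assumptions about students enter.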

Evaluation

• How can we tell how good our predictions are?

– we can wait till the end of this course...

– we can try to assess the accuracy based on the data we

already have (training data)

• Possible solution:

– divide the training set further into training and test sets

– evaluate the classifier constructed on the basis of only the

smaller training set on the new test set
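
A rough sketch of this split-and-evaluate procedure in Python, using a nearest neighbor classifier with squared Euclidean distance. The function names, the 25% test fraction, the fixed random seed, and the data are all assumptions made for illustration.

```python
import random


def nn_predict(x, train_x, train_y):
    """1-nearest-neighbor prediction under squared Euclidean distance."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    best = min(range(len(train_x)), key=lambda i: dist(x, train_x[i]))
    return train_y[best]


def holdout_accuracy(xs, ys, test_fraction=0.25, seed=0):
    """Split (xs, ys) into a smaller training set and a test set; return test accuracy."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(xs) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    # The classifier only ever sees the smaller training set.
    train_x = [xs[i] for i in train_idx]
    train_y = [ys[i] for i in train_idx]
    correct = sum(nn_predict(xs[i], train_x, train_y) == ys[i] for i in test_idx)
    return correct / n_test


# Made-up labeled data: grade-point vectors and ML grades.
xs = [[4, 4, 4], [4, 3, 4], [3, 4, 4], [4, 4, 3],
      [1, 2, 1], [2, 1, 1], [1, 1, 2], [2, 2, 1]]
ys = ["A", "A", "A", "A", "C", "C", "C", "C"]

print(holdout_accuracy(xs, ys))
```

The held-out examples play the role of students whose grades we pretend not to know, so the resulting accuracy is an estimate rather than a guarantee of how well the classifier will do on the real test set.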

Model selection

• We can refine

– the estimation algorithm (e.g., using a classifier other than

the nearest neighbor classifier)

– the representation (e.g., base the summaries on a different

set of courses)

– the assumptions (e.g., perhaps students work in groups)

• We have to rely on the method of evaluating the accuracy

of our predictions to select among the possible refinements
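
As an illustration of selecting among refinements of the representation, the sketch below scores each candidate subset of courses by the accuracy of a nearest neighbor classifier on held-out data and keeps the best one. The function names, the exhaustive search over subsets, and the data are assumptions, not something the slides prescribe.

```python
from itertools import combinations


def nn_predict(x, train_x, train_y):
    """1-nearest-neighbor prediction under squared Euclidean distance."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    best = min(range(len(train_x)), key=lambda i: dist(x, train_x[i]))
    return train_y[best]


def validation_accuracy(columns, train, valid):
    """Accuracy on `valid` of 1-NN that uses only the given feature columns."""
    def project(x):
        return [x[c] for c in columns]

    train_x = [project(x) for x, _ in train]
    train_y = [y for _, y in train]
    hits = sum(nn_predict(project(x), train_x, train_y) == y for x, y in valid)
    return hits / len(valid)


def select_representation(n_features, train, valid):
    """Return the non-empty column subset with the best validation accuracy."""
    candidates = [cols for k in range(1, n_features + 1)
                  for cols in combinations(range(n_features), k)]
    # max() keeps the first best candidate, so smaller subsets win ties.
    return max(candidates, key=lambda cols: validation_accuracy(cols, train, valid))


# Made-up data: column 0 is informative about the grade, column 1 is not.
train = [([4, 1], "A"), ([4, 4], "A"), ([1, 1], "C"), ([1, 4], "C")]
valid = [([3, 4], "A"), ([2, 1], "C"), ([4, 2], "A"), ([1, 3], "C")]

print(select_representation(2, train, valid))  # (0,)
```

The same loop could compare estimation algorithms or assumptions instead of course subsets; only the list of candidates and the scoring function would change.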

Types of learning problems

A rough (and somewhat outdated) classification of learning

problems:

• Supervised learning, where we get a set of training inputs

and outputs

– classification, regression

• Unsupervised learning, where we are interested in capturing

inherent organization in the data

– clustering, density estimation

• Reinforcement learning, where we only get feedback in the

form of how well we are doing (not what we should be doing)

– planning

Supervised learning: classification (again)

Example: digit recognition (8x8 binary digits)

[Figure: example binary digit images paired with their target labels, e.g. “2”, “1”, ...]

• We wish to learn the mapping from digits to labels

Representation/assumptions

• Even a change in the representation that preserves the relevant information can still preclude learning

Supervised learning: regression

• Given a set of training examples {(x_1, y_1), . . . , (x_n, y_n)}, we want to learn a mapping f : X → Y such that

y_i ≈ f(x_i), i = 1, . . . , n
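
One simple instance of such a mapping is a straight line fitted by least squares. The slides do not commit to a particular function class or fitting criterion, so the linear form, the squared-error criterion, the name `fit_line`, and the data below are assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of f(x) = w0 + w1 * x; returns (w0, w1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w1 = sxy / sxx             # slope
    w0 = mean_y - w1 * mean_x  # intercept
    return w0, w1


# Made-up training examples lying exactly on y = 1 + 2x.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0]

w0, w1 = fit_line(xs, ys)
print(w0, w1)  # 1.0 2.0
```

With noisy data the fitted line would no longer pass through every point, which is exactly the sense of the approximation sign in the condition above.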

Unsupervised learning: data organization

The digits again...

• We’d like to understand the generation process of examples

(digits in this case)
