Machine Learning - uwyo.educlan/teach/ai19/ml_a.pdf · What is machine learning? 0.4*δ{lottery} -...

Machine Learning

Chao Lan

Background

Can we build a machine that can automatically filter spams?

Which words imply spam?

Does this word imply spam?

Does this combination of words imply spam?

Manually designing patterns for spam is hard.

Can we let the machine learn patterns of spam?

Computers learn from examples to improve its generalizable (classification) performance. - without being explicitly programmed

What is machine learning?

0.4*δ{lottery} - 0.7*δ{lottery} + 0.18*δ{account} - 0.32*δ{birth} > 0.5

A hypothetical pattern of spam learned by the machine.

Other Examples

Concepts

Computers learn from examples to improve its generalizable (classification) performance. - without being explicitly programmed

Revisit: What is machine learning?

Instance, Label

instance x

label y instance x

label y

instance x

predicted label f(x)model f

Prediction Error (or, Generalization Error)

err(f) = 0.3

Training, Training Set

model(training) instances train a model

Supervised Learning versus Unsupervised Learning Tasks

know instances and their labels in the training set

Supervised Learning versus Unsupervised Learning Tasks

know instances, not their labels, in the training set

Testing, Testing Set

(testing) instance predict predicted label

Classification versus Regression

label is discrete

Classification versus Regression

minutes for the survey

label is continuous

[E1] Build a model to classify article topic (sports, politics, etc)

1. what is an instance, what is the label?

2. what are the model input and output?

3. If we have a set of documents with on sports, politics, education and academic, is it a supervised or unsupervised learning task?

4. Is it a classification or regression task?

3. If we have a set of documents with on sports, politics, education and academic, is it a supervised or unsupervised learning task?

3. If we have a set of documents with known topics on sports, politics and academic, is it a supervised or unsupervised learning task?

[E2] Build a model to predict student GPA.

3. If we have a set of students whose GPAs will be known by the end of this semester, is it a supervised or unsupervised learning task?

An instance is often represented as a feature vector x.

lie,cheat

behavior

peer rej

low ac

Q: How to represent a text document?

Example

google lotterycatemailtransportpandamillion ..

1101001..

Q: How to represent an image?

Example

Q: how to represent a user in a graph?

Example

Q: better ways to build vector? (feature engineering)

A model is a function governed by unknown parameters.

Example: model f is a linear function of features xi with unknown parameters θi’s.

f(x) = θ1x1 + θ2x2 + … + θpxp

- training f means estimating θ’s from training instances

- once θ’s are fixed, model f is fixed and can be applied

Example: use a hyper-parameter λ to control the domain of θ’s.

f(x) = θ1x1 + θ2x2 + … + θpxp

- if λ = 10, then θ ∈ [-1,1] — larger domain, f is complex

- if λ = 1, then θ ∈ {0, 1} — smaller domain, f is simple

A model’s complexity is governed by hyper-parameters.

1. f(x) = θ1x1 + θ2x2 + … + θpxp, θ ∈ [0,1]

2. f(x) = θ1x1 + θ2x2 + … + θpxp, θ ∈ {0,1}

Q: which model is has higher complexity?

1. f(x) = θ1x1 + θ2x2 + … + θpxp, θ ∈ [0,1]

2. f(x) = θ1x1 + θ2x2 + … + θpxp, θ ∈ {0,1}

A model with larger domain is often more complex.

Q: what is the hyper-parameter?