Welcome to Machine Learning
Konstantin Tretyakov
http://kt.era.ee
AACIMP Summer School 2015
Data mining, Data analysis, Statistical analysis, Pattern discovery, Statistical learning, Machine learning, Predictive analytics, Business intelligence, Data-driven methods, Inductive reasoning, Pattern analysis, Knowledge discovery from databases, Neural networks, …
“Unformalizable” problems
[Figure: handwritten samples of the letters A and B; recognizing them is a task we cannot easily write down as explicit rules]
General rule:
IF (X is made of plastic), THEN (X is not edible)
Application in the particular case: X = Rubik’s cube
⇒ Rubik’s cube is not edible
General rule:
is_solution(x) = function { … }
Application in the particular case: is_solution(?) = true
General rule:
add(x, y) = function { … }
Application in the particular case: add(2, 4)
Deduction
General rule:
add(x, y) = function {
    ???
}
Particular cases: add(2, 4) = 6
add(5, 3) = 8
add(1, 2) = 3
…
Induction
Deduction and Induction
Deduction: given a general rule, make a decision in a particular case.
Induction: given particular cases, find a general rule.
Machine learning = a set of techniques for dealing with inductive problems.
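As a tiny illustration of induction in code, here is a sketch (my example, assuming NumPy; the linear-model form is an assumption, not something from the lecture) that recovers the rule behind add(x, y) from the particular cases shown above:

import numpy as np

# Particular cases of the unknown rule: (x, y) -> result
X = np.array([[2, 4], [5, 3], [1, 2]])
targets = np.array([6, 8, 3])

# Assume the rule has the linear form add(x, y) = a*x + b*y
# and induce a and b from the examples by least squares.
(a, b), *_ = np.linalg.lstsq(X, targets, rcond=None)

def add(x, y):
    return a * x + b * y

print(round(a, 2), round(b, 2))  # 1.0 1.0, i.e. add(x, y) = x + y
print(round(add(7, 5), 2))       # 12.0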
Classification by analogy
MNIST dataset
http://yann.lecun.com/exdb/mnist/
Handwritten digits, 28 × 28
…
MNIST dataset
images = load_images()   # load_images / load_labels: helpers assumed defined elsewhere
labels = load_labels()
# Let us just use 1000 images
training_set = images[0:1000]
training_labels = labels[0:1000]
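If you want to reproduce this without the raw MNIST files, one possible stand-in for the two loaders (my sketch, using scikit-learn's fetch_openml; not what the slides used) is:

from sklearn.datasets import fetch_openml

def load_dataset():
    # Fetches the 70000 MNIST digits as flat 784-pixel vectors
    # plus their string labels ('0'..'9')
    mnist = fetch_openml('mnist_784', version=1, as_frame=False)
    return mnist.data, mnist.target

images, labels = load_dataset()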
MNIST dataset
> training_set[0]
array([ 0, 0, 0, ...,
254, 241, 198, ...])
> training_labels[0]
'7'
Nearest neighbor method
[Figure: a query digit “?” is compared with training images labeled 4, 7, 9, 7, 4; the most similar one is a 4, so the query is classified as 4]
Nearest neighbor method
def classify(img):
    similarities = [similarity(img, p) for p in training_set]
    i = similarities.index(max(similarities))
    return training_labels[i]
def similarity(img1, img2):
    # Negated sum of absolute pixel differences:
    # identical images score 0, anything else scores lower
    return -sum(abs(img1 - img2))
Testing the algorithm
from numpy import array

test_set = images[1000:2000]
test_labels = labels[1000:2000]
predicted_class = [classify(p) for p in test_set]
n_successes = sum(array(predicted_class) == array(test_labels))
=> 843/1000
9 or 4?
from sklearn.neighbors import KNeighborsClassifier
clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(training_set, training_labels)
predicted_class = clf.predict(test_set)
=> 855/1000
Scikit-learn
http://scikit-learn.org/
Voronoi diagram
… not the only way
How can we modify the method?
[Figure: the training set, digits labeled 4, 7, 9, 7, 4]
Some samples may be thrown away
[Figure: the same training set with a redundant sample crossed out]
…or we can add “weights”
[Figure: training samples 4, 7, 9, 7, 4 with weights 0.0, 1.0, 1.0, 1.0, 0.0; a zero weight discards a sample]
[Figure: the same samples with real-valued weights 0.0, 0.9, 1.1, 2.0, 0.0]
… or we can SUM instead of MAX
[Figure: training samples 4, 7, 9, 7, 4 with weights 0.0, 0.9, 1.1, 2.0, 0.0]
evidence(img, 4) = 0.9 * similarity(img, tr[1]) + 0.0 * similarity(img, tr[4]) = 34.0
evidence(img, 7) = 0.0 * similarity(img, tr[0]) + 2.0 * similarity(img, tr[3]) = 20.0
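A minimal sketch of this sum-of-weighted-similarities rule (my code; the function names and the weights argument are illustrative, with similarity as defined earlier):

def evidence(img, digit, training_set, training_labels, weights):
    # Total weighted similarity of img to all training samples of this class
    return sum(w * similarity(img, x)
               for x, y, w in zip(training_set, training_labels, weights)
               if y == digit)

def classify_weighted(img, training_set, training_labels, weights):
    # Pick the class with the largest total evidence
    return max(set(training_labels),
               key=lambda c: evidence(img, c, training_set,
                                      training_labels, weights))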
Formally…
Similarity of a query $\boldsymbol{z}$ to one training sample: $K(\boldsymbol{x}_1, \boldsymbol{z})$
Similarities to the samples of one class and of the other:
$K(\boldsymbol{x}_1, \boldsymbol{z}),\; K(\boldsymbol{x}_2, \boldsymbol{z}),\; K(\boldsymbol{x}_3, \boldsymbol{z}),\; \dots$
$K(\boldsymbol{x}_{101}, \boldsymbol{z}),\; K(\boldsymbol{x}_{212}, \boldsymbol{z}),\; \dots$
Weighted sums of the similarities:
$w_1 K(\boldsymbol{x}_1, \boldsymbol{z}) + w_2 K(\boldsymbol{x}_2, \boldsymbol{z}) + w_3 K(\boldsymbol{x}_3, \boldsymbol{z}) + \dots$
$w_{101} K(\boldsymbol{x}_{101}, \boldsymbol{z}) + w_{212} K(\boldsymbol{x}_{212}, \boldsymbol{z}) + \dots$
Compare the evidence for the two classes:
$\sum_i w_i K(\boldsymbol{x}_i, \boldsymbol{z}) \quad\text{vs}\quad \sum_j w_j K(\boldsymbol{x}_j, \boldsymbol{z})$
or, equivalently, look at the sign of the difference:
$\sum_i w_i K(\boldsymbol{x}_i, \boldsymbol{z}) - \sum_j w_j K(\boldsymbol{x}_j, \boldsymbol{z})$
Folding the class labels $y_i = \pm 1$ into the sums:
$\sum_i w_i y_i K(\boldsymbol{x}_i, \boldsymbol{z}) + \sum_j w_j y_j K(\boldsymbol{x}_j, \boldsymbol{z})$
Kernel-based classifier
$f_{\boldsymbol{w}}(\boldsymbol{z}) = \sum_i w_i y_i K(\boldsymbol{x}_i, \boldsymbol{z})$
How to find the weights?
$f_{\boldsymbol{w}}(\boldsymbol{z}) = \sum_i w_i y_i K(\boldsymbol{x}_i, \boldsymbol{z})$
Find weights such that the misclassification error rate on the training set is the smallest:
$\boldsymbol{w} = \operatorname{argmin}_{\boldsymbol{w}} \mathrm{ErrorRate}(f_{\boldsymbol{w}}, \mathrm{Data})$
Find weights such that an approximation to the misclassification error on the training set is the smallest:
$\boldsymbol{w} = \operatorname{argmin}_{\boldsymbol{w}} \mathrm{Error}(f_{\boldsymbol{w}}, \mathrm{Data})$
Find weights such that the error on the training set is the smallest and there are many zero weights:
$\boldsymbol{w} = \operatorname{argmin}_{\boldsymbol{w}} \mathrm{Error}(f_{\boldsymbol{w}}, \mathrm{Data}) + \mathrm{Complexity}(\boldsymbol{w})$
Support Vector Machine
from sklearn.svm import SVC
clf = SVC(kernel='linear')
clf.fit(training_set, training_labels)
predicted_class = clf.predict(test_set)
=> 865/1000
Classification by analogy: k-NN, SVM, RBF
Search for the nearest neighbor
def classify(img):
    # Linear scan: compares the query against every training sample
    similarities = [similarity(img, p) for p in training_set]
    i = similarities.index(max(similarities))
    return training_labels[i]
Inefficient
Search for the nearest neighbor
def classify(img):
    nearest_neighbour = training_set.find_nearest_neighbour(img)
    return nearest_neighbour.label
Indexing!
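One concrete way to get such an index (my sketch, assuming scikit-learn's KDTree; the lecture does not prescribe a specific index structure):

from sklearn.neighbors import KDTree

# Build the index over the training images once...
tree = KDTree(training_set)

def classify_indexed(img):
    # ...then each query becomes a fast tree lookup instead of a full scan
    dist, ind = tree.query([img], k=1)
    return training_labels[ind[0][0]]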
find_nearest_neighbour
if pixel[10,13] > 4:
    if pixel[3,24] < 0:
        nearest_neighbour = A
    else:
        nearest_neighbour = B
else:
    nearest_neighbour = C
Classification Tree
if pixel[10,13] > 4:
    if pixel[3,24] < 0:
        label = '1'   # the predicted class ('class' itself is a reserved word in Python)
    else:
        label = '2'
else:
    label = '3'
[Figure: classification trees inferred from the data]
Orange
http://orange.biolab.si/
Trees
ID3
C4.5
RegTree
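A quick sketch of fitting such a tree on the same data (assuming scikit-learn's DecisionTreeClassifier; the lecture itself demonstrates trees in Orange):

from sklearn.tree import DecisionTreeClassifier

# Each internal node learns a threshold test on a single pixel,
# much like the if/else cascade above
clf = DecisionTreeClassifier()
clf.fit(training_set, training_labels)
predicted_class = clf.predict(test_set)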
Search for optimal model parameters
General form of a model:
$f_{\boldsymbol{w}}(\boldsymbol{z}) = \sum_i w_i y_i K(\boldsymbol{x}_i, \boldsymbol{z})$
The same holds for the other methods: search for the optimal tree, or for the optimal separation line, once the general form of the model is fixed.
Modeling: choosing the general form of a model.
Optimization: searching for the optimal parameters. This step is what is called “learning”, “training”, or “estimation”.
Linear model
f(image) = pixel1*w1 + pixel2*w2 + … + pixel784*w784
Linear Classification
from sklearn.linear_model import (
    LinearRegression,
    LogisticRegression,
    RidgeClassifier,
    LARS,
    ElasticNet,
    SGDClassifier,
    ...
)
=> 809/1000
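For instance, with logistic regression (my pick from the list above; the slides do not say which linear model produced the 809/1000 score):

from sklearn.linear_model import LogisticRegression

clf = LogisticRegression()
clf.fit(training_set, training_labels)
predicted_class = clf.predict(test_set)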
Modeling
Linear model:
f(image) = pixel1*w1 + pixel2*w2 + … + pixel784*w784
[Figure: the same model drawn as a single neuron; inputs pixel1 … pixel784 are multiplied by weights w1 … w784 and summed (∑) to produce f(image)]
Neural network
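Stacking several such neurons into layers gives a neural network. A minimal sketch (my stand-in, assuming scikit-learn's MLPClassifier, which is not a tool named in the lecture):

from sklearn.neural_network import MLPClassifier

# One hidden layer of 64 neurons between the 784 pixel inputs
# and the 10 digit classes (the layer size is an arbitrary choice)
clf = MLPClassifier(hidden_layer_sizes=(64,))
clf.fit(training_set, training_labels)
predicted_class = clf.predict(test_set)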
Why should a model, trained on one set of data, work well on future data?
When is it not the case?
How to formalize a new model?
How to find parameters?
How to create efficient learning algorithms?
How to handle structured data?
Unsupervised learning
Semi-supervised learning
On-line learning
Active learning
Multi-instance learning
Reinforcement learning
Probabilistic models
Graphical models
Ensemble learners
Data fusion
HPC
Tools: R, Weka, RapidMiner, Orange, scikit-learn, MLPy, MDP, PyBrain, Theano, Torch, Caffe, Matlab…
“Unformalizable” problems
Model-based and instance-based
Deduction and Induction
Theory and Practice
Quiz
The OCR problem is unusual in that it is ____.
The two important perspectives on machine
learning are ______-based and _____-based.
The “soul” of machine learning is the minimization task
$\operatorname{argmin}_{\boldsymbol{w}} \;\_\_\_\_\_\_\_\_ \;+\; \lambda\; \_\_\_\_\_\_\_\_$
Quiz
The visualization of a decision rule inferred by
a single-nearest-neighbor algorithm is called
_____________________
Quiz
Two important components of machine
learning:
?
?
Quiz
The machine learning algorithms mentioned in
this lecture were:
K-nearest neighbor classifier
SVM
Classification trees
Linear models
Neural networks
Questions?