Machine learning workshop @DYP Pune

Introduction to Data Science, Understanding Machine Learning and Embracing it within IoT Solution

Meet Ganesh Raskar | @geekwhocodes

• Intern at RapidCircle India
• Microsoft Student Partner
• Microsoft Certified Professional
• Microsoft Specialist (HTML5, CSS3 & JavaScript, Azure Web Services)
• Periodic Blogger (http://geekwhocodes.me)
• Lifelong learner

Email [email protected]

Twitter https://twitter.com/geekwhocodes

About https://about.me/geekwhocodes

LinkedIn https://in.linkedin.com/in/geekwhocodes


Module 01 | Introduction to Data Science

It has its own jargon

What is Data Science?

• Evolving subject, no single definition
• Requires a range of skills

Data science is the exploration and quantitative analysis of all available structured and unstructured data to develop understanding, extract knowledge, and formulate actionable results.

[Figure: from Data to Decision to Action, with Value increasing along the way. Questions shown: What happened? Why did it happen? What will happen? What should I do? Labels shown: manual process, decision support, decision automation.]

Types of Analytics

• Retrospective analytics
• Real-time analytics
• Predictive analytics
• Intelligent SaaS apps

Predictive vs Prescriptive Analytics

• Predictive analytics, calibrated on past data, tells us what to expect
• Prescriptive analytics tells us what actions to take

Module 02 | Understanding Machine Learning

What is Machine Learning?

What Does Machine Learning Do?

• Finds patterns in data
• Uses those patterns to predict the future

Examples:
• Detecting credit card fraud
• Determining whether a customer is likely to switch to a competitor
• Detecting machine failure
• Lots more

What Does It Mean to Learn?

How did you learn to read?

Learning requires:
• Identifying patterns
• Recognizing those patterns when you see them again
• Theory -> Simulation -> Try to understand things

This is what machine learning does

Finding Patterns

Name     Amount      Fraudulent
Omkar    ₹ 10,000    No
Amit     ₹ 17,000    Yes
Ankit    ₹ 20,000    Yes
Ganesh   ₹ 19,000    No

A simple example

Name      Amount      Issued   Used        Age   Fraudulent
Omkar     ₹ 10,000    India    India       27    No
Ganesh    ₹ 23,000    India    India       21    No
Ankit     ₹ 12,000    India    USA         25    Yes
Amit      ₹ 2,000     USA      India       27    Yes
Avani     ₹ 14,000    India    Amsterdam   26    No
Vinit     ₹ 69,000    India    Holland     25    Yes
Aditi     ₹ 70,000    USA      USA         26    No
Swapnil   ₹ 9,000     India    India       21    No
Gayatri   ₹ 30,000    India    London      20    Yes

A slightly more complex example

What’s the pattern for fraudulent transactions?
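One way to explore that question is to write down a candidate rule and count how often it agrees with the table. The sketch below is only an illustration: the rule `used_abroad` (flag a card used somewhere other than where it was issued) is a hypothetical guess, not the answer given in the workshop, and the helper names are made up for this example.

```python
# Rows copied from the table above:
# (name, amount in rupees, issued, used, age, fraudulent)
TRANSACTIONS = [
    ("Omkar", 10000, "India", "India", 27, False),
    ("Ganesh", 23000, "India", "India", 21, False),
    ("Ankit", 12000, "India", "USA", 25, True),
    ("Amit", 2000, "USA", "India", 27, True),
    ("Avani", 14000, "India", "Amsterdam", 26, False),
    ("Vinit", 69000, "India", "Holland", 25, True),
    ("Aditi", 70000, "USA", "USA", 26, False),
    ("Swapnil", 9000, "India", "India", 21, False),
    ("Gayatri", 30000, "India", "London", 20, True),
]


def used_abroad(row):
    """Hypothetical candidate rule: the card was used outside the place of issue."""
    _, _, issued, used, _, _ = row
    return issued != used


def evaluate(rule, rows):
    """Return (number of rows the rule gets right, names of rows it gets wrong)."""
    wrong = [row[0] for row in rows if rule(row) != row[5]]
    return len(rows) - len(wrong), wrong


correct, exceptions = evaluate(used_abroad, TRANSACTIONS)
print(f"{correct} of {len(TRANSACTIONS)} rows match; exceptions: {exceptions}")
```

The rule agrees with most rows but not all of them, which is the point of the slide: with more columns, patterns become harder to spot by eye, and a learning algorithm searches over many such rules for us.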

Machine Learning in a Nutshell

• Data: contains patterns
• Machine learning algorithm: finds patterns
• Model: recognizes patterns
• Application: supplies new data to see if it matches known patterns

Why Is Machine Learning So Hot Right Now?

Doing machine learning well requires:
• Lots of data
• Lots of compute power
• Effective machine learning algorithms

All of those things are now more available than ever

Who’s Interested in Machine Learning?

• Business Leaders: want solutions to business problems
• Data Scientists: want powerful, easy-to-use tools
• Software Developers: want to create better applications

Who Is a Data Scientist?

Someone who knows about:
• Statistics
• Machine learning software
• Some problem domain (ideally)

Key facts about data scientists:
• Good ones are scarce
• Good ones are expensive

The Role of R

R is an open source programming language
• Supports machine learning, statistical computing, and more
• Has many available packages
• R is very popular
• Many commercial machine learning offerings support R

But it’s not the only choice:
• Python is also increasingly popular

Summary

• Machine learning lets us find patterns in existing data, then create and use a model that recognizes those patterns in new data
• Machine learning has gone mainstream
• Big vendors think there’s big money in this market
• Machine learning can probably help your organization

Module 03 | Machine Learning Process

The ML process is:

• Iterative: in both big and small ways
• Challenging: it’s rarely easy
• Often rewarding: but not always

First Step: Asking the Right Question

Choosing what question to ask is the most important part of the ML process

Ask yourself: Do you have the right data to answer the question?

Ask yourself: Do you know how you’ll evaluate the result?

Machine Learning Flow

[Figure: the machine learning flow, from raw data to applications]

1) Choose data from the raw data sources
2) Apply pre-processing to the data (data processing modules) to get prepared data; iterate until the data is ready
3) Apply a learning algorithm to the prepared data (ML algorithms) to get a candidate model; iterate to find the best model
4) Deploy the chosen model to applications

Repeating The Process

[Figure: the same flow (raw data, pre-processing, prepared data, learning algorithm, candidate model, chosen model, deployment), with one addition: re-create the model regularly.]

Scenario: Predicting Customer Churn

[Figure: data flow for the churn scenario. Components shown: Customers, Call Center Staff, Call Center Application, CRM Data, Detailed Call Data, Aggregation Application, Aggregated Call Data, Hadoop, Spark, etc., ML Prep Application, Data for ML, Machine Learning, Model.]

Summary

• Choose the right question
• Data Transformation
• Iterate until you have a model that makes good predictions
• Periodically rebuild the model
• Deploy the solution

A Closer Look at Machine Learning

ML has its own jargon

Terminology

Training Data
• The prepared data used to create a model
• Creating a model is called training a model

Supervised Learning (most common)
• The value you want to predict is in the training data
• The data is labeled

Unsupervised Learning
• The value you want to predict is not in the training data
• The data is unlabeled

* We’ll focus on Supervised ML

Data Processing for Supervised Machine Learning

[Figure: raw data from Data Source 1, Data Source 2, ..., Data Source N passes through the available data preprocessing modules and becomes training (or prepared) data, made up of features and a target value.]

1) Read raw data
2) Create training data

Categorizing Machine Learning Problems

• Regression
• Classification
• Clustering
• Recommenders

Regression: for predicting real-valued outcomes
• How many customers will visit our site next week?
• How many TVs will sell next year?
• Can we predict someone’s income from their click-through information?
• If the question is "How many?", it’s a regression problem

Classification: for predicting truth-valued outcomes
• Will I pass next semester?
• Is this transaction fraudulent?
• Is this a spam e-mail?

Clustering: for solving unsupervised learning problems
• Identifying a chair among a bunch of different objects
• Handwriting recognition
• Is this Ganesh's voice?

Recommenders: for recommending products based on history
• Building recommender engines

Summary

• Machine learning has come of age
• Machine learning isn’t hard to understand
• Although it can be hard to do well
• Machine learning can probably help your organization

Module 03-01 | Regression

How many?

Regression

• Introduction to Regression
• Simple Linear Regression (1 feature)
• Ridge Regression
• SVM Regression
• Cross-Validation

Introduction to Regression

• Each observation is represented by a set of numbers.

A person is represented as:
• A single feature, called x
• A label, called y

Name     Clicks (x)   Income (y)
Ganesh   [10]         53
Ankit    [7]          -15
[…]      […]          …

Need a function that estimates y for a new x.

Simple Linear Regression

• Formally, given a training set $(x_i, y_i)$ for $i = 1, \dots, n$, we want to create a regression model $f$ that can predict the label $y$ for a new $x$.

f(x) = function(Number of Businessweek clicks)

[Figure: income (y-axis, up to 1,000K) plotted against the number of Businessweek clicks (x-axis, up to 2000), with the fitted line f(x).]

f(x) = 100K + 5K*Number of Businessweek clicks(x)

• We want the model to be as close to the data as possible. We want these to be small: $y_i - f(x_i)$. Equivalently, we want these to be small: $(y_i - f(x_i))^2$
• The sum of squared errors adds these up over the training set: $\mathrm{SSE}(f) = \sum_{i=1}^{n} (y_i - f(x_i))^2$
• You do not need to solve the minimization problem yourself: the machine learning algorithm will do it for you.
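To make the minimization less abstract, here is a minimal sketch of what an algorithm does for the one-feature case: the textbook closed-form least-squares fit of f(x) = b0 + b1 * x, followed by the SSE of the fitted line. The function names and the data are made up for illustration; the data was chosen to lie exactly on the slide's example line, with income in thousands.

```python
def fit_line(xs, ys):
    """Least-squares fit of f(x) = b0 + b1 * x for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx                # slope
    b0 = mean_y - b1 * mean_x     # intercept
    return b0, b1


def sse(b0, b1, xs, ys):
    """Sum of squared errors between the labels and the line's predictions."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Made-up training data (not from the slides): clicks and income in thousands.
clicks = [0, 10, 20, 30, 40]
income = [100, 150, 200, 250, 300]

b0, b1 = fit_line(clicks, income)
print(b0, b1, sse(b0, b1, clicks, income))
```

Because the made-up points sit exactly on 100 + 5 * clicks, the fit recovers that line and the SSE is zero; with real, noisy data the SSE would be positive and the line would be the one that makes it smallest.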

Ridge Regression

• Extension of Simple Linear Regression
• Formally, given a training set $(x_i, y_i)$ for $i = 1, \dots, n$, we want to create a regression model $f$ that can predict the label $y$ for a new $x$.

Estimated income:

f(x) = function(feature1, feature2, feature3, feature4, feature5, … etc.)

For instance,
f(x) = 3 * Number of visits
     + 10 * Number of Businessweek clicks
     + 100 * Number of people emailed per day
     + 2 * Number of purchases of over 5K within the last month
     + 10 * Number of visits to airlines

But f(x) could be much more complicated

Ridge Regression

Over-fitting model:
• Multiple features
• Wrong ML algorithm
• It just remembers the data
• Worst

• We could choose b0, b1, b2, etc., to minimize the total error on the training set plus a regularization term that keeps the model simple
• The constant C, which weights the regularization term, is chosen using Cross Validation
• This is called “Ridge Regression”
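The effect of the regularization term can be seen in a one-feature sketch. The slide's exact formula is not preserved here, so this assumes the common ridge objective: sum of squared errors plus C * b1^2, with the intercept b0 left unpenalized. Under that assumption the minimizer has a simple closed form. The function name and data are made up for illustration.

```python
def fit_ridge_line(xs, ys, C):
    """Minimize sum((y - b0 - b1*x)^2) + C * b1^2 for a single feature.

    Assumption: only the slope b1 is penalized, not the intercept b0.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / (sxx + C)          # C > 0 shrinks the slope toward zero
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Made-up data (not from the slides): clicks and income in thousands.
clicks = [0, 10, 20, 30, 40]
income = [100, 150, 200, 250, 300]

for C in (0, 1000):
    print(C, fit_ridge_line(clicks, income, C))
```

With C = 0 this is ordinary least squares; a larger C gives a flatter, simpler line at the cost of a worse fit to the training set. That trade-off is why C is picked by cross-validation rather than on the training data alone.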

Support Vector Machine Regression

• The difference between Ridge and SVM regression is how they measure the difference between the prediction and the truth
• Epsilon (ε): as long as f(x) is within ε of y on either side, the error counted for [ y - f(x) ] is 0
• You don’t need to do this yourself; the ML algorithm takes care of it
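The difference between the two error measures fits in a few lines. This sketch compares the squared error used in the regression slides with the epsilon-insensitive error described above; the function names and the numbers are made up for illustration.

```python
def squared_loss(y, prediction):
    """Error measure used by least squares and ridge regression."""
    return (y - prediction) ** 2


def epsilon_insensitive_loss(y, prediction, epsilon):
    """Error measure used by SVM regression.

    Zero while the prediction is within epsilon of the truth,
    then growing linearly with the distance beyond epsilon.
    """
    return max(0.0, abs(y - prediction) - epsilon)


# Made-up numbers: truth 10.0, two predictions, epsilon 0.5.
for prediction in (10.4, 12.0):
    print(prediction,
          squared_loss(10.0, prediction),
          epsilon_insensitive_loss(10.0, prediction, 0.5))
```

A prediction inside the epsilon band costs nothing under the SVM measure, while the squared error penalizes every miss, however small.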

Cross Validation

• Cross Validation (CV) is the most popular way to evaluate a machine learning algorithm on a dataset.
• You will need a dataset, an algorithm, and an evaluation measure for the quality of the result. The evaluation measure might be the squared error between the predictions and the truth.
• Divide the data into 10 approximately equal-sized “folds”
• Train the algorithm on 9 folds, compute the evaluation measure on the last fold.
• Repeat this 10 times, using each fold in turn as the test fold.
• Report the mean and standard deviation of the evaluation measure over the 10 folds.

[Figure: the data split into Train and Test folds]
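The procedure above can be sketched with the standard library alone. Everything named here is an assumption for illustration: the "algorithm" is a deliberately trivial one that always predicts the mean training label, the data is made up, and the folds are taken in a fixed interleaved order instead of being shuffled so that the result is reproducible.

```python
import statistics


def k_fold_indices(n, k=10):
    """Split indices 0..n-1 into k folds of approximately equal size."""
    indices = list(range(n))
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, train, evaluate, k=10):
    """Train on k-1 folds, evaluate on the held-out fold, repeat k times."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = train(train_x, train_y)
        test_x = [xs[i] for i in test_idx]
        test_y = [ys[i] for i in test_idx]
        scores.append(evaluate(model, test_x, test_y))
    return statistics.mean(scores), statistics.stdev(scores)


def train_mean(train_x, train_y):
    """Toy 'algorithm': the model is just the mean training label."""
    return statistics.mean(train_y)


def mean_squared_error(model, test_x, test_y):
    """Evaluation measure: mean squared error of the constant prediction."""
    return statistics.mean((y - model) ** 2 for y in test_y)


# Made-up data: 20 observations.
xs = list(range(20))
ys = [3.0 * x + 1.0 for x in xs]

mean_score, std_score = cross_validate(xs, ys, train_mean, mean_squared_error)
print(f"MSE over 10 folds: mean={mean_score:.1f}, std={std_score:.1f}")
```

Swapping `train_mean` for a real learning algorithm, or the error function for another evaluation measure, leaves the cross-validation loop unchanged.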

Module 03-02 | Classification

True or False?
Class 1 or Class 2 or Class N?

Classification

• What is classification?
• Loss functions for classification
• Logistic regression
• SVM
• AdaBoost
• Decision trees
• Multiclass classification
• Imbalanced learning
• ROC curves and the AUC

Introduction to Classification

• Formally, given a training set $(x_i, y_i)$ for $i = 1, \dots, n$, we want to create a classification model $f$ that can predict the label $y$ for a new $x$.

A person is represented as:
• Features, called x: [5][10][7][…], [12][14][47][…], [51][15][8][…], [25][30][9][…]
• Labels, called y: 1, -1, 1, …

Need a function that estimates y for a new x.

Example features for one student (values shown on the slide: 1 2 0 1):
• Did s/he pass last semester?
• Number of backlogs last year
• Is s/he active on a social network?
• Does s/he have a smartphone? 1

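Introduction to Classification

[Figure: students plotted by study hours per day (x-axis) and last year's backlogs (y-axis); tick marks 0, 3 and 8 appear on the axes. The line f(x)=0 separates the region where f(x)>0 from the region where f(x)<0, and the points are labeled Pass or Fail.]

f(x) = function(Last Year Back log, Study No. of hr/day)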

The machine learning algorithm will create the function f for you. It might be very complicated, but using it is simple:

The predicted value of y for a new x is the sign of f(x).
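As a minimal sketch of that last step, suppose the learned f happened to be a linear function of the two features from the plot. The weights and bias below are made up, not learned from any data, and treating +1 as "pass" and -1 as "fail" is an assumption for this example.

```python
def f(backlogs, study_hours, weights=(-1.0, 0.5), bias=-1.0):
    """Hypothetical scoring function; the weights are invented for illustration."""
    w_backlogs, w_hours = weights
    return w_backlogs * backlogs + w_hours * study_hours + bias


def predict(backlogs, study_hours):
    """The predicted label is the sign of f(x): +1 if f(x) > 0, otherwise -1."""
    return 1 if f(backlogs, study_hours) > 0 else -1


# Two made-up students: (backlogs last year, study hours per day).
for student in [(0, 8), (3, 2)]:
    print(student, f(*student), predict(*student))
```

Points with f(x) exactly 0 lie on the decision boundary; this sketch arbitrarily assigns them to the -1 class.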

Module 03-03 | Clustering

Clustering
• Clustering is a key unsupervised problem.
• “Unsupervised” means that the training data has no ground truth labels to learn from.
• This means unsupervised methods are much harder to evaluate.

[Figure: Supervised: each object carries a label, "(chair)" or "(not a chair)", and the question for a new object is "chair?". Unsupervised: objects are shown without labels.]

Clustering

Applications include:
• Automatically grouping documents/webpages into topics, for instance grouping today's news stories into categories
• Clustering large numbers of products, e.g. on online shopping sites (search)
• Clustering customers into groups with similar purchase behavior
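To show what grouping without labels looks like in code, here is a minimal sketch of k-means on a single feature. The slides do not name a clustering algorithm, so k-means is my choice for illustration, and the customer spend values and starting centers are made up.

```python
def k_means_1d(values, centers, iterations=10):
    """Very small k-means for one-dimensional data.

    Repeats two steps: assign each value to its nearest center,
    then move each center to the mean of the values assigned to it.
    """
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers, clusters


# Made-up monthly spend (in thousands of rupees) for six customers.
spend = [10, 12, 11, 50, 52, 49]
centers, clusters = k_means_1d(spend, centers=(0, 60))
print(centers, clusters)
```

No labels were supplied: the two groups emerge from the values alone. Judging whether two groups is the right number is the hard part the earlier slide refers to.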

Module 03-04 | Recommenders

Introduction to Recommenders

• Self-explanatory
• Market Basket Analysis
• Customer purchasing behaviour
• Increase sales and maintain inventory

• Collaborative filtering: K-NN & Pearson, matrix factorisation (Facebook, LinkedIn)
• Content-based: Bayesian classifiers, cluster analysis, decision trees, artificial neural networks (used in: Netflix)

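Recommenders Terminology:

• Items: [1,2,3,4,5,6,7,8,9]
• Itemset: any subset, e.g. {3,5}, {5,8}, {1,3}, etc.
• Transactions: {2,3}, {4,9}, {7,2}, {9,3}
• Rule: e.g. {7 -> 2}
• Support of an itemset: the proportion of transactions containing the itemset
• For a rule such as {7 -> 2}, the question "if a user buys 7, what are the chances they buy 2 as well?" is answered by the rule's confidence: the proportion of transactions containing 7 that also contain 2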
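These definitions translate directly into code. The sketch below uses the four example transactions from the slide; the function names are made up, and confidence is computed with the standard market-basket definition, support of the combined itemset divided by support of the rule's left-hand side.

```python
def support(itemset, transactions):
    """Proportion of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """For a rule antecedent -> consequent: of the transactions that contain
    the antecedent, the proportion that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# The example transactions from the terminology slide.
transactions = [{2, 3}, {4, 9}, {7, 2}, {9, 3}]

print(support({3}, transactions))            # item 3 appears in 2 of 4 transactions
print(support({7, 2}, transactions))         # {7, 2} appears in 1 of 4 transactions
print(confidence({7}, {2}, transactions))    # every transaction with 7 also has 2
```

With only four transactions these numbers are not meaningful evidence; real market basket analysis computes them over many thousands of baskets.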

• Collaborative Filtering
• Content-Based Filtering: works on the metadata of the item
• Hybrid Approach

Recommenders

• User-Movie matrix
• The goal is to predict a user’s rating for a movie that they have not watched yet
• The intuition behind using matrix factorization to solve this problem is that there should be some latent features that determine how a user rates an item.

[Figure: user-movie rating matrix with rows User 1, User 2, User 3, ..., User n]
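The latent-feature idea can be sketched as follows: give every user and every movie a short vector of numbers, and adjust those vectors until their dot products reproduce the known ratings. This is a minimal illustration using plain stochastic gradient descent; the ratings, the number of latent features, the learning rate and the regularization strength are all made-up values, not settings from the workshop.

```python
import random


def factorize(ratings, n_users, n_items, n_factors=2,
              steps=2000, lr=0.01, reg=0.02, seed=0):
    """Learn user vectors P and item vectors Q so that P[u] . Q[i] ~ rating."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.5) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.5) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(steps):
        for (u, i), r in ratings.items():
            err = r - sum(P[u][k] * Q[i][k] for k in range(n_factors))
            for k in range(n_factors):
                p, q = P[u][k], Q[i][k]
                # Gradient step on the squared error, with a small penalty
                # (reg) that keeps the latent features from growing large.
                P[u][k] += lr * (err * q - reg * p)
                Q[i][k] += lr * (err * p - reg * q)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating of user u for item i."""
    return sum(p * q for p, q in zip(P[u], Q[i]))


def mean_abs_error(P, Q, ratings):
    """Average absolute error on the known ratings."""
    return sum(abs(r - predict(P, Q, u, i))
               for (u, i), r in ratings.items()) / len(ratings)


# Made-up ratings: (user, movie) -> rating; missing pairs are unwatched movies.
ratings = {(0, 0): 5, (0, 1): 3, (1, 0): 4, (1, 2): 1, (2, 1): 1, (2, 2): 5}

P0, Q0 = factorize(ratings, 3, 3, steps=0)   # untrained starting point
P, Q = factorize(ratings, 3, 3)
print("error before:", mean_abs_error(P0, Q0, ratings))
print("error after: ", mean_abs_error(P, Q, ratings))
print("predicted rating, user 0 / movie 2:", predict(P, Q, 0, 2))
```

The final line is the recommender's job: user 0 never rated movie 2, yet the learned vectors still produce an estimate for that pair.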

Module 04 | Demo

Will I pass next semester?

Module 05 | Demo

How can we use ML in IoT?

Information

Intel Edison

• Dual-core, dual-threaded Intel® Atom™ CPU at 500 MHz
• 32-bit Intel® Quark™ microcontroller at 100 MHz
• 1 GB LPDDR3 memory
• 20 digital input/output pins, including 4 pins as PWM (pulse width modulation) outputs
• 6 analog inputs
• 1 I2C
• 1 ICSP (In-Circuit Serial Programming)
• Micro USB device connector
• SD Card connector
• BLE 4.0
• Yocto Linux 1.6*

Water Flow sensor (1-30L/min) – My experiment specific

Thank you
