ARTIFICIAL INTELLIGENCE A I PARADIGM FOR SMART … · artificial intelligence an innovative...

Post on 15-Jul-2020


ARTIFICIAL INTELLIGENCE

AN INNOVATIVE PARADIGM FOR SMART COMPUTING

DR. P. SHANMUGAVADIVU

PROFESSOR, DEPT. OF COMPUTER SCIENCE & APPLICATIONS

GANDHIGRAM RURAL INSTITUTE (DEEMED TO BE UNIVERSITY)

GANDHIGRAM, DINDIGUL, TAMIL NADU, INDIA.

psvadivu67@gmail.com

COURSE OUTLINE

Day 1: Artificial Intelligence – An Overview

Day 2: Machine Learning Algorithms – Part 1

Day 3: Machine Learning Algorithms – Part 2

Day 4: Neural Networks & Deep Learning

Day 5: Convolutional Neural Networks

DAY 2: MACHINE LEARNING ALGORITHMS – PART 1

AGENDA

1. Machine Learning – An Overview

2. Applications of Machine Learning

3. Types of Machine Learning

4. Machine Learning Algorithms

5. Q & A

MACHINE LEARNING

Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed.

- [Arthur Samuel, 1959]

A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

- [Tom Mitchell, Carnegie Mellon University, 1997]

WHY MACHINE LEARNING?

No human experts

• Industrial/Manufacturing control

• Mass spectrometer analysis, drug design, astronomical discovery

Black-box human expertise

• Face/Handwriting/Speech recognition

• Driving a car, flying a plane

Rapidly changing phenomena

• Credit scoring, financial modeling

• Diagnosis, fraud detection

Need for customization/personalization

• Personalized news reader

• Movie/Book recommendation

MACHINE LEARNING - CONCEPT

ML employs a variety of statistical, probabilistic, and optimization techniques.

It comprises algorithms that can learn from observational data and make predictions based on it.

It is used to make decisions based on learned patterns and to create an analytical model for future predictions.

Workflow: data → find patterns → train itself → produce an output.

The accuracy of classification is highly influenced by the distribution and diversity of the data.

APPLICATIONS OF MACHINE LEARNING

Healthcare Services

Language Translation

Online Fraud Detection

Online Customer Support

Email Spam and Malware Filtering

Social Media Services (Face Recognition, Similar Pins)

Virtual Personal Assistants (Smart Speakers, Smart Phones, Mobile Apps)

Prediction while commuting (Traffic Prediction, Online Transportation Networks)

https://medium.com/app-affairs/9-applications-of-machine-learning-from-day-to-day-life-112a47a429d0

THE POTENTIALS OF ML

DETECTION

• Text, Speech & Image Interpretation

• Human Behaviour & Identity

• Abuse & Fraud Detection

PREDICTION

• Recommendation on Individual Behaviour & Condition

• Collective Behaviour

GENERATION

• Visual Art

• Music

• Text

• Design

ML FOR PREDICTIVE ANALYTICS

The Mechanics

TYPES OF MACHINE LEARNING

Machine Learning

Supervised

• Regression

• Classification

• Recommendation

Unsupervised

• Clustering

• Association Rules

SUPERVISED LEARNING

Labelled data are used to train the algorithms.

Algorithms are trained using annotated data, where both the input and the output are known.

The learned data patterns are used to predict the output labels for new data.

It is mainly used in Predictive Modelling.
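As a rough illustration of the supervised learning points above, the following Python sketch trains on a handful of labelled examples and then predicts the label of new inputs. The 1-nearest-neighbour rule, the function name `predict`, and the study-hours data are assumptions made for this example; they are not taken from the slides.

```python
# Toy supervised learning: a 1-nearest-neighbour classifier on one feature.
# The rule, the names, and the data are illustrative assumptions.

def predict(train_x, train_y, query):
    """Return the label of the training input closest to `query`."""
    distances = [abs(x - query) for x in train_x]
    nearest = distances.index(min(distances))
    return train_y[nearest]

# Labelled (annotated) training data: both input and output are known.
hours = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]                      # input: hours studied
result = ["fail", "fail", "fail", "pass", "pass", "pass"]   # output: exam label

# Predict labels for inputs that were not in the training data.
for new_hours in (2.5, 6.5):
    print(new_hours, "->", predict(hours, result, new_hours))
```

The labelled pairs play the role of the "annotated data"; the prediction step only works because each training input carries a known output.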

UNSUPERVISED LEARNING

Unlabelled data are used to train the algorithm; that is, it is applied to data that has no historical labels.

The purpose is to explore the data and find some structure within it.

This learning technique works well on transactional data.

It is mainly used in Descriptive Modelling.
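To make "finding structure in unlabelled data" concrete, here is a minimal sketch of K-Means clustering on one-dimensional data. The function name `kmeans_1d`, the starting centres, and the sample points are assumptions chosen for illustration; a practical implementation would handle multi-dimensional data and choose starting centres more carefully.

```python
# Minimal K-Means (Lloyd's algorithm) on 1-D data, with no labels involved.
# Names, starting centres, and data are illustrative assumptions.

def kmeans_1d(points, centers, iterations=10):
    """Alternate between assigning points to the nearest centre
    and moving each centre to the mean of its assigned points."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old centre if a cluster happens to be empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]   # unlabelled observations
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

No output labels are supplied anywhere; the two groups emerge only from the distances between the points.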

Supervised Learning

Regression

• Linear

• Logistic

Classification

• Decision Trees

• Random Forest

• Naïve Bayes

• Support Vector Machine

• Neural Networks

• K-Nearest Neighbors

Recommendation

• User-based

• Item-based

Unsupervised Learning

Clustering

• K-Means

Association Rules

• Market Basket Analysis

SAMPLE ILLUSTRATION


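REGRESSION

A regression problem is one in which the output variable is a real or continuous value.

Example:

• Predicting the age of a person

Counter-examples (these are classification problems, because the output is a category rather than a continuous value):

• Predicting the nationality of a person

• Predicting whether the stock price of a company will increase tomorrow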

CLASSIFICATION

Classification is the process of categorizing a given set of data into classes.

It learns a mapping function from the input variables to discrete output variables.

It can be applied to structured or unstructured data.

The main goal is to identify the category or class into which new data will fall.

MACHINE LEARNING ALGORITHMS

I. LINEAR REGRESSION

It is a kind of predictive modelling in which the output (Y) for a given input (X) is predicted based on previous data or values.

The main aim is to find the best-fit line, which minimizes the error.

It is used to predict values within a continuous range rather than to classify them into categories.

The known data are used to fit a straight line with a constant slope, which is then used to predict the unknown result.

Types:

• Simple Linear Regression: It is characterized by one independent variable.

• Multiple Linear Regression: It is characterized by multiple independent variables.

LINEAR REGRESSION…

While training the model:

• x: input training data (univariate: one input variable or parameter)

• y: labels for the data (supervised learning)

During training, the model fits the best line to predict the value of y for a given value of x.

The model obtains the best regression fit line by finding the best a and b values:

• a: coefficient of x

• b: intercept

Once the best a and b values are found, the best-fit line is obtained. When the model is then used for prediction, it predicts the value of y for an input value of x.

Equation: y = a x + b

LINEAR REGRESSION…

The values a and b must be chosen so that the error is minimum.

If the sum of squared errors is taken as the metric to evaluate the model, then the goal is to obtain the line that minimizes this error.
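The choice of a and b described above can be sketched in a few lines of Python. This example uses the standard closed-form least-squares solution for a single input variable; the function names `fit_line` and `sum_squared_error` and the sample data are assumptions for illustration, and the slides do not say which fitting procedure the author had in mind.

```python
# Simple linear regression y = a*x + b fitted by ordinary least squares.
# Function names and data are illustrative assumptions.

def fit_line(xs, ys):
    """Return (a, b) that minimise the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx              # coefficient of x (slope)
    b = mean_y - a * mean_x    # intercept
    return a, b

def sum_squared_error(xs, ys, a, b):
    """Sum of squared differences between labels and predictions."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

xs = [1.0, 2.0, 3.0, 4.0]      # x: input training data
ys = [3.0, 5.0, 7.0, 9.0]      # y: labels (generated here from y = 2x + 1)

a, b = fit_line(xs, ys)
print("a =", a, "b =", b)
print("SSE =", sum_squared_error(xs, ys, a, b))
print("prediction for x = 5:", a * 5 + b)
```

Because the sample labels lie exactly on a line, the fitted line reproduces them and the squared error is zero; with noisy data the same code returns the line with the smallest squared error.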

II. LOGISTIC REGRESSION

Logistic regression is the appropriate regression analysis to conduct when the dependent variable is binary, that is, the output belongs to one of two classes (1 or 0).

It is a classification algorithm that uses one or more independent variables to determine an outcome.

Goal – to find the best-fitting relationship between the dependent variable and a set of independent variables.

Merits – it helps in understanding the influence of the independent variables on the outcome of the dependent variable.

Demerits – in its basic (binomial) form, it works only if the predicted variable is binary.

LOGISTIC REGRESSION...

Logistic Regression | Target Variables | Examples

Binomial | 2 possible types | “win” vs “loss”, “pass” vs “fail”

Multinomial | 3 or more possible types (not ordered) | “disease A” vs “disease B” vs “disease C”

Ordinal | Ordered categories | Category: “very poor”, “poor”, “good”, “very good”; Score: 0, 1, 2, 3

Two important parts of logistic regression are the Hypothesis and the Sigmoid Curve.

The hypothesis gives the likelihood of the event.

Hypothesis Expectation: 0 ≤ h(x) ≤ 1, where h(x) denotes the output of the hypothesis.

The generated data are fitted to a logistic function, which produces an S-shaped curve known as the “sigmoid”. (It converts any value from -∞ to +∞ to a value between 0 and 1, which can then be thresholded into a discrete class.)

LOGISTIC REGRESSION...

The hypothesis of logistic regression limits its output to values between 0 and 1. A linear function cannot represent it, because a linear function can take a value greater than 1 or less than 0, which is not possible under the hypothesis of logistic regression.

The sigmoid function maps any real value to a value between 0 and 1. In machine learning, it is used to map predictions to probabilities, using the formula:

f(x) = 1 / (1 + e^(-x))

where

• f(x) = output between 0 and 1 (probability estimate)

• x = input to the function

• e = base of the natural logarithm
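The sigmoid mapping described above can be tried out directly. The sketch below is one possible Python rendering; the function names `sigmoid` and `predict_class` and the 0.5 decision threshold are assumptions for illustration (the slides do not specify a threshold).

```python
import math

def sigmoid(x):
    """Map any real x to a value between 0 and 1.

    The two branches are algebraically identical; splitting on the sign
    of x avoids overflow in math.exp for large negative inputs.
    """
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def predict_class(x, threshold=0.5):
    """Turn the probability estimate into a class label (1 or 0).
    The threshold of 0.5 is an assumed, commonly used default."""
    return 1 if sigmoid(x) >= threshold else 0

for value in (-5.0, 0.0, 5.0):
    print(value, sigmoid(value), predict_class(value))
```

An input of 0 maps to a probability of exactly 0.5, large positive inputs approach 1, and large negative inputs approach 0, which is the S-shaped behaviour the slide refers to.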

III. DECISION TREE

It is a tree that is developed from decisions taken by the algorithm in accordance with the data on which it has been trained.

A Decision Tree uses the features in the given data to perform Supervised Learning and develops a tree-like data structure whose branches are built in such a way that, given the feature set, the tree can predict the expected output relatively accurately.

It breaks a data set down into smaller and smaller subsets while the associated decision tree is developed.

A decision node has two or more branches; a leaf node represents a classification or decision.

The topmost decision node, which corresponds to the best predictor, is called the root node.

DECISION TREE…

Entropy is a measure of the impurity (disorder) of an arbitrary collection of examples.

Information Gain is the amount of relevant information gained about the outcome when the data are split on a given attribute.

Entropy (E) is used to calculate Information Gain, which identifies the attribute of a given dataset that provides the highest amount of information.

The attribute that provides the highest amount of information for the given dataset is considered to contribute most to the outcome of the classifier and is therefore given higher priority in the tree.
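The entropy and information gain calculation can be sketched as follows. The code uses the standard Shannon entropy in bits; the function names, the attribute names ("outlook", "windy"), and the four-row dataset are hypothetical and serve only to show how the attribute with the highest gain would be picked for the root node.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after
    splitting the rows on the value of `attribute`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy dataset: the label depends only on "outlook".
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

gains = {attr: information_gain(rows, labels, attr) for attr in ("outlook", "windy")}
best = max(gains, key=gains.get)   # attribute given the highest priority
print(gains)
print("root attribute:", best)
```

In this toy data, splitting on "outlook" separates the classes completely, while splitting on "windy" leaves each subset as mixed as before, so "outlook" would be placed at the root.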

DECISION TREE…

Advantages:

• Specific

• Easy to use

• Versatile

• Resistant to data abnormalities

• Visualization of the decision taken

Applications:

• Selecting a flight to travel

• Selecting alternative products

• Sentiment Analysis

• Energy Consumption

• Fault Diagnosis

Limitations:

• Sensitivity to hyperparameter tuning

• Overfitting

• Underfitting

https://www.knowledgehut.com/blog/data-science/classification-and-regression-trees-in-mach

TRAINING, VALIDATION, AND TESTING/HOLD OUT DATASETS

The first step in developing a machine learning model is to partition the dataset for training and validation, which involves choosing what percentage of the data to use for the training, validation, and holdout sets.

Training Set (60-80%)

• The sample of data used to fit the model.

• It uncovers or learns relationships between the features and the target variable.

Validation Set (15-25%)

• The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters.

• It is used to find how accurately the model identifies relationships between the known outcomes for the target variable and the dataset’s other features.

Testing Set (15-25%)

• The sample of data used to provide an unbiased evaluation of a final model fit on the training dataset.

• It provides a final estimate of the model’s performance after it has been trained and validated.
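A partition of this kind can be sketched with the Python standard library alone. The function name `split_dataset`, the 70/15/15 proportions (one choice within the ranges given above), and the fixed random seed are assumptions made for this example.

```python
import random

def split_dataset(rows, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle the rows and cut them into training, validation,
    and testing (holdout) sets. The remainder goes to the test set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)   # shuffle a copy, reproducibly
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]
    return train, val, test

# Stand-in dataset: 100 record identifiers.
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Shuffling before cutting keeps each subset representative of the whole dataset; the three subsets never overlap, so the test set stays unseen until the final evaluation.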

OVERFITTING AND UNDERFITTING

Underfit Model: A model that fails to learn the problem sufficiently; it performs poorly on the training dataset and also on a holdout sample.

Overfit Model: A model that learns the training dataset too well; it performs well on the training dataset but poorly on a holdout sample.

Good Fit Model: A model that suitably learns the training dataset and generalizes well to the holdout dataset.

BIAS-VARIANCE TRADEOFF

Bias is the difference between the model’s average prediction and the correct value it is trying to predict.

Variance is the variability of the model’s predictions when different training data are used.

Characteristics of a model with high bias:

• Underfitting

• Low Training Accuracy

• Inability to solve complex problems

Characteristics of a model with high variance:

• Overfitting

• Low Testing Accuracy

• Overcomplicating simpler problems

BIAS-VARIANCE TRADEOFF

Detection and Solution to High Bias problem - if the training error is high:

• Train longer

• Train a more complex model

• Obtain more features

• Decrease regularization

• New model architecture

Detection and Solution to High Variance problem - if the validation error is high:

• Obtain more data

• Decrease number of features

• Increase regularization

• New model architecture
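The detection rule above can be expressed as a small helper that compares the training and validation errors. This is only a sketch: the function name `diagnose` and the numeric thresholds `acceptable_error` and `max_gap` are hypothetical values chosen for illustration, since what counts as a "high" error depends on the problem. The remedies are copied from the lists above.

```python
# Remedies as listed on the slide; thresholds below are illustrative assumptions.
REMEDIES = {
    "high bias": ["train longer", "train a more complex model",
                  "obtain more features", "decrease regularization",
                  "new model architecture"],
    "high variance": ["obtain more data", "decrease number of features",
                      "increase regularization", "new model architecture"],
    "good fit": [],
}

def diagnose(train_error, val_error, acceptable_error=0.10, max_gap=0.05):
    """Classify a model from its training and validation error rates."""
    if train_error > acceptable_error:
        return "high bias"        # model underfits even the training data
    if val_error - train_error > max_gap:
        return "high variance"    # model fits training data but not new data
    return "good fit"

for train_err, val_err in [(0.30, 0.32), (0.02, 0.25), (0.04, 0.06)]:
    verdict = diagnose(train_err, val_err)
    print(train_err, val_err, verdict, REMEDIES[verdict])
```

Checking the training error first mirrors the order of the two rules: a model that cannot fit its own training data needs more capacity before the gap to the validation error becomes informative.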

TAKE-AWAY POINTS

Two variants of Machine Learning algorithms are Supervised Learning and Unsupervised Learning.

Machine Learning is used for Prediction, Decision-making & Generation.

Regression techniques are used for Predictive & Descriptive Analysis.

Decision trees are used for Classification.


Q&A


CONTACT:

DR. P. SHANMUGAVADIVU

psvadivu67@gmail.com

9443736780