Getting started with data analytics with azure machine learning

29
Getting started with Data Analytics with Azure Machine Learning Bhakthi Liyanage Northern VA CodeCamp Spring 2016 30 April 2016

Transcript of Getting started with data analytics with azure machine learning

Getting started with Data Analytics with Azure Machine Learning

Bhakthi LiyanageNorthern VA CodeCamp Spring 201630 April 2016

Bhakthi LiyanageBank of America Merrill Lynch

Who am I?Sr. SharePoint Architect16+ years in the IT industry10+ years in SharePoint

[email protected]

@bhakthil

https://www.linkedin.com/pub/bhakthi-liyanage/14/15/912

https://github.com/bhakthil

Thanks to our sponsors

Platinum

Gold

Bronze

Northern VA CodeCamp Spring 2016

Thanks to host sponsor

Northern VA CodeCamp Spring 2016

Agenda• Introducing machine learning• Introducing Azure Machine Learning• Machine Learning Lifecycle• Demo• Summary• Q & A

Introducing Machine Learning

What is machine learning?Academic DefinitionMachine learning is a subfield of computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. Machine learning explores the study and construction of algorithms that can learn from and make predictions on data.

Simple DefinitionComputing systems that become smarter with learning and experienceExperience = Past data + human input

Why machine learning?• Need to know of the future

• Being able to predict the future with a reasonable accuracy

ReportsYesterday Today Tomorrow

Business Intelligence

Predictive Analytics

Pred

ictab

ility

Time

Why machine learning?

Roles in machine learningData scientist

A highly educated and skilled person who can solve complex data problems by employing deep expertise in scientific disciplines (mathematics, statistics or computer science)

Data professionalA skilled person who creates or maintains data systems, data solutions, or implements predictive modellingRoles: Database Administrator, Database Developer, or BI Developer

Software developerA skilled person who designs and develops programming logic, and can apply machine learning to integrate predictive functionality into applications

Problem Identification What problems are we trying

to solve?◦ Anomaly detection◦ Customer churn◦ Predictive maintenance◦ Recommendations system

What data do we have or do we have any data at all?◦ Data already available via sensory

systems, transactional databases, customer sales databases, etc.

Predictive maintenance

Vision Analytics

Recommenda-tion engines

Advertising analysis

Weather forecasting for business planning

Social network analysis

Legal discovery and document archiving

Pricing analysis

Fraud detection

Churn analysis

Equipment monitoring

Location-based tracking and services

Personalized Insurance

Data Data Consist of

◦ Features (aka input parameters) : The data that is fed in to the model

◦ Identify which features relevant for the problem

◦ Labels : Historical result of each observation Training Data

◦ Pairing of features and label◦ Historical

Data Validation◦ Used to verify the trained model

LearningSupervised

◦ Machine learning task of inferring a function/model from labeled training data or examples

◦ Training data consist of both features and labelsUn-supervised

◦ Machine learning task of inferring a function to describe hidden structure from unlabeled data

◦ Data contains only features

Introducing Azure Machine Learning

Azure Machine LearningOne solution for machine learning Enables powerful cloud-based predictive analytics Professionals can easily build, deploy and share

advanced analytics solutions Browser based, Rapid Deployment Connects seamlessly with other Azure data-related

services, including: Azure HDInsight (Big Data) Azure SQL Database, and Virtual Machines

Models are consumed via ML API service

Machine learning lifecycleDefine

Objective

Collect Data

Prepare Data

Train Models

Evaluate

Models

Deploy

Manage

Integrate

It is important to start a machine learning project with a clearly defined objective

I need to predict customer churn rate for next 6 months…

Define Objective

I need to suggest relevant products to

the customers

I need to know when my manufacturing equipment will fail

Collecting complete data is critical◦ Garbage in ► Garbage out

Datasets can be sourced from:◦ Internal sources, i.e. operational systems, data warehouse, etc.◦ External sources◦ Different formats, i.e. relational, multidimensional, text, map-

reduce Combining datasets can enrich data

◦ E.g., integrate internal data to external data like weather, or market intelligence data

◦ Weather data with flight delay data◦ Population data with energy consumption data

Collect Data

Prepare data for machine learning◦ Transform to cleanse, reduce or reformat◦ Isolate and flag abnormal data◦ Appropriately substitute missing values◦ Categorize continuous values into ranges◦ Normalize continuous values between 0 and 1

Of course, having the required data to begin with is important◦ When designing systems, give consideration to attributes that

may be required as inputs for future modeling, e.g. demographic data: Birth date, gender, etc.

Prepare Data

This stage is iterative, and experimentation involves:◦ Selecting a machine learning algorithm◦ Defining inputs and outputs◦ Optimizing by configuring algorithm parameters

Model evaluation is critical to determine:◦ Accuracy, Reliability, Usefulness

Train Models

Evaluate

Models

First, add a scoring experiment– Training logic is replaced with a trained model– Inputs and output end-points are added– Module properties can be parameterized

Publish the experiment to the gallery– Learn from others by discovering experiments– Contribute and showcase your experiments

Deploy

Integrate

Integrate the experiment with external applications– Integration offers REST web service end points– Each web service offers two methods:

• Request/Response Service (RRS) ► Low latency, highly scalable web service

• Batch Execution Service (BES) ► High volume, asynchronous scoring of many records

Azure Machine LearningOne solution for machine learning

Stream analytics, blob storage, Azure SQL, HDInsight

Azure ML Services

Clients

Azure ML Studio

ML web service end-points

Data Model Development Model Deployment Operationalize

Power BI/DashboardsMobile AppsWeb Apps

Azure Portal

Azure Ops Team

ML Studio

Data Scientist

HDInsight

Azure Storage

Desktop Data

Azure Portal & ML API service

Azure Ops Team

ML API service Developer

ML Studio and the Data Professional• Access and prepare data• Create, test and train models• Collaborate • One click to stage for production

via the API service

Azure Portal & ML API serviceand the Azure Ops Team• Create ML Studio workspace• Assign storage account(s)• Monitor ML consumption• See alerts when model is ready• Deploy models to web service

ML API service and the Application Developer• Tested models available as a URL that can be called from any endpoint

Business users easily access results from anywhere, on any device

Azure Machine LearningOne solution for machine learning

Faster towards solutions

Mashup of powerful algorithms

Global scaling of solutions via cloud API

Elastic, pay-as-you-go model with low operative costs

Quick and easy extensibility with cloud functions such asPower BI, Hadoop (Azure HDInsight) and cloud storage

Azure Machine LearningOne solution for machine learning

DemoAutomotive Pricing Prediction

SummaryMachine Learning is a subfield of computer science and statistics that deals with the construction and study of systems that can learn from data.

Azure Machine Learning key attributes:Fully managed ► No hardware or software to buyIntegrated ► Drag, drop, connect and configure

Best-in-class algorithms ► Proven solutions from Xbox and BingR built in ► Use over 400 R packages, or bring your own R or Python code

Deploy in minutes ► Operationalize with a clickFlexible consumption ► Any device capable of consuming REST API

Machine Learning is now approachable to developers

Q & A