Analytics Life Cycle: Pangea is Panacea! · Analytics Lifecycle –Enterprise View Data Ingestion...

Page 1

Analytics Life Cycle: Pangea is Panacea!

Page 2

Terms of Use, IP and Liability Disclaimer

The accompanying material and any related oral or written discussion (the “Materials”) are governed by the limitations detailed below:

Licensed Content and Ownership - HCL, PANGEA and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. Content distributed within an HCL client organization must display HCL copyright notices and attributions of authorship.

IP & Patent Liability - This Solution/Proposition is covered by a pending patent. Any refactoring or subsequent re-use is an unlicensed use and therefore constitutes patent infringement. If any further detailed information is required, please contact [email protected]

Liability Disclaimer - The information herein is for informational purposes only and represents the current view of HCL Technologies Ltd as of the date of this presentation. Because HCL must respond to changing market conditions, it should not be interpreted to be a commitment on the part of HCL, and HCL cannot guarantee the accuracy of any information provided after the date of this presentation. HCL MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Page 3

Analytics Lifecycle – Enterprise View

Business Problem Description: Extract the problem definition from business owners. This is often tricky, but important.

Data Ingestion: Ingesting data from diverse data sources is a common ask.

Data Preparation: Ontologies for schema preparation and transformation should be defined.

Algorithm Selection: Select appropriate algorithms suitable for the business requirement(s). Automated recommendation is a definite ask.

Model Building & Tuning: Less coding and more effort on model execution and tuning improve productivity.

Model Deployment: Ease of deployment and (near) real-time scoring are crucial for enterprise acceptance and success.

Model Monitoring: Unlike traditional programs, ML models need regular monitoring and updating as needed.

Business Insights & Visualizations: Business insight wrappers are crucial for the successful adoption of analytics/ML.

Page 4

ML/DL Model Lifecycle

Deployed model

Monitor Inputs

Input Analysis: Drift/Newness

• Numerical data: distribution analysis
• Categorical data: obsolete/new categories
• Text data: obsolete/new keywords
• Estimate data shift at regular intervals
• Check for new/deleted categories/words

Output Analysis: Error/Variance/FP/FN

• Error/variance for time/state models
• FP/FN for feedback-based models
• Boolean/Categorical/Labels (Clusters)

Drift Analysis

Automatic Output Variance Analysis

Manual/Supervised Analysis

Identify impacted parameters

Revise Model Parameters

Update model
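The input and output checks named on this slide can be illustrated with a minimal Python sketch. It is not Pangea's implementation: the function names `psi`, `category_changes` and `fp_fn_rates` are hypothetical, and the Population Stability Index is only one possible choice for numerical "distribution analysis". A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift, but that threshold is an assumption to tune per model.

```python
import math


def psi(baseline, current, bins=10):
    """Population Stability Index between two numeric samples (illustrative)."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def shares(values):
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the baseline range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    b, c = shares(baseline), shares(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


def category_changes(baseline, current):
    """Return (new, obsolete) categories or keywords between two samples."""
    old, new = set(baseline), set(current)
    return new - old, old - new


def fp_fn_rates(actual, predicted):
    """False-positive and false-negative rates for boolean labels."""
    fp = sum(1 for a, p in zip(actual, predicted) if p and not a)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    negatives = sum(1 for a in actual if not a)
    positives = sum(1 for a in actual if a)
    return (fp / negatives if negatives else 0.0,
            fn / positives if positives else 0.0)
```

In use, `psi` and `category_changes` would run on each monitoring interval against a stored training-time sample, while `fp_fn_rates` needs ground-truth feedback and so only applies to feedback-based models.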

Page 5

Analytics Model Monitoring – Heuristics to Watch

• A burst or patch of data causes an abrupt transition

• Production data causes the model outcome to shift/change incrementally

• At times, data influences a gradual change in the outcomes over a period of time

• Some data sets yield recurring change states in the output

• Stray incidents occur when an occasional input results in unexpected output
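As a rough illustration of how some of these heuristics could be told apart in a monitored output series, the sketch below compares the means of consecutive windows. The names `window_means`, `classify_shift` and `stray_points`, the window size and the thresholds are all assumptions made for this example; it does not attempt to detect recurring change states.

```python
from statistics import mean, pstdev


def window_means(series, size):
    """Mean of each consecutive, non-overlapping window of `size` points."""
    return [mean(series[i:i + size])
            for i in range(0, len(series) - size + 1, size)]


def classify_shift(series, size=10, threshold=1.0):
    """Label the change in a monitored series as 'stable', 'abrupt' or 'gradual'."""
    means = window_means(series, size)
    jumps = [abs(b - a) for a, b in zip(means, means[1:])]
    total = abs(means[-1] - means[0])
    if jumps and max(jumps) >= threshold:
        return "abrupt"   # one window-to-window jump explains the change
    if total >= threshold:
        return "gradual"  # the change accumulates in small steps
    return "stable"


def stray_points(series, k=3.0):
    """Indices of isolated values more than k standard deviations from the mean."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(series) if abs(v - mu) > k * sigma]
```

The split between "abrupt" and "gradual" depends entirely on the chosen window size and threshold, so both would need calibrating against the model's normal output variance.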

Page 6

Types of Analytical Models - Recap

Classification: Supervised models that label datasets

Regression: Linear, non-linear and logistic regression models

Forecast: Time series based forecast models

Clustering: Unsupervised model that groups similar data/objects into k clusters

Optimization: Heuristic and OR models for optimization

Survival: Preventive and proactive alerts and lifetime estimates

Page 7

Best Practices – Analytics Adoption

• Data ingestion with diverse connectors

• Ontologies and schema preparation

• Data analysis for duplicates, missing values, etc.

• Model building, tuning, deployment

• Model monitoring at regular intervals

• Business logic wrappers and insights + visualizations

Reduced Time to Value

* These views are not to be attributed, expressly or by implication, to the affiliated organization. They are entirely the speaker’s opinions, based on his experience and understanding.
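The "data analysis for duplicates, missing values" practice listed on this slide could look like the following minimal sketch. The function name `profile_rows` and the dict-per-row representation are assumptions for illustration, not part of the presentation or of Pangea.

```python
def profile_rows(rows, required):
    """Count exact duplicate rows and missing values per required field."""
    seen = set()
    duplicates = 0
    missing = {field: 0 for field in required}
    for row in rows:
        # Sorting by field name makes the key independent of insertion order.
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        seen.add(key)
        for field in required:
            # Treat absent fields, None and empty strings as missing.
            if row.get(field) in (None, ""):
                missing[field] += 1
    return duplicates, missing
```

Running such a profile before model building gives an early, cheap signal of whether the ingested data needs cleaning.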

Page 8

Pangea* - Overview

Pangea is a distributed analytics workbench that provides an end-to-end platform for building and operationalizing analytics more quickly

Delivers end-to-end analytics with an intuitive drag and drop of data and models/algorithms

Reduces model deployment time from several months to days

Data and code distribution on virtual nodes ensures scalability

Actionable Insights

Customizable solution to fit client needs

Zero Coding Approach

Single Click Deployment

Distributed Analytics at Scale

Modular and Flexible

Pangea brings in automation to achieve speed, scale and collaboration, and enforces best-practice implementation across the analytics life cycle to reduce the total cost of ownership

Drastic time-to-insight reduction

Data ingestion from divergent data sources

Modelling and tuning without coding

Inbuilt and 3rd party UI for reports and charts

Deployment through clicks and configuration

* HCL Internal IP/Tool

Page 9

Summary – Pangea Best Practices

• Data ingestion is key, without too much emphasis on the ‘outcome’ at that time

• Data preparation goes hand in glove with business problem descriptions

• Ontology and/or schema preparation is an invisible yet inevitable step in the enterprise analytics life cycle

• Analytics/ML modelling without ease of deployment and monitoring is short-lived

• Analytical models without business wrappers serve only as PoCs