Cognitive Computing Solutions from Scry Analytics...Hadoop 2015 – Google’s “Search” uses...

Cognitive Computing Solutions from Scry Analytics Dr. Alok Aggarwal Founder and CEO May 2017


Page 1: Cognitive Computing Solutions from Scry Analytics...Hadoop 2015 – Google’s “Search” uses MapReduce & 110,000 computers Improved Algorithms & Open Source Software 1989 - Free

Cognitive Computing Solutions from Scry Analytics

Dr. Alok Aggarwal

Founder and CEO

May 2017

Page 2:

AGENDA

• Brief Overview of Artificial Intelligence
• Collatio: Creating a Single Source of Truth System
• Jidoka: AI solutions to drive Automated Actionable Insights

Page 3:

Artificial Intelligence (AI): Brief History

In 1950, Alan Turing suggested the “Imitation Game” to define AI: imagine three rooms, each connected via computer screen & keyboard; a judge in one room decides which of the two participants he is talking to is a man and which is a computer. This is the Turing Test.

AI has generated lots of excitement & optimism, but also massive setbacks:

• Over 250,000 research articles related to AI
• Research is progressing to capture full human intelligence, which includes language, audio, and video processing; knowledge & learning; perception; and the ability to move & manipulate objects
• Still out of reach: intuition; reasoning; creativity; planning; emotions; intelligent actions (e.g., sense of humor)
• We rarely use intuition, creativity or intelligent actions in “routine work”

[Figure: a judge deciding which of two unseen participants is the computer and which is the man]

Page 4:

Summary of Key Terms

Category | Brief Description
Services Automation | Fixed & Static Rules; Structured Data
Classical Quant. Modeling | Fixed Rules with Probability-Statistics
Supervised Machine Learning | Pattern Recognition by Training
Unsupervised Mach. Learning | Self Learnt Pattern Recognition (in time)
Reinforcement Learning | Self Learnt Pattern Recognition (by doing)
Natural Language Processing | Extracting Intelligence from Text
Speech & Audio Processing | Extracting Intelligence from Speech & Audio
Video & Image Processing | Extracting Intelligence from Video & Images
Other Intelligent Actions | Modeling Intuition, Creativity, Reasoning, etc.

Side labels grouping the categories on the slide: Cognitive Computing; Artificial Intelligence; Intelligent Services Automation; Machine Learning

Page 5:

Advances in Technology 1951 – 2015

“Democratization” of Hardware

• 1965 – Moore’s Law: For a given price, processing power (or memory) doubles every 18 months
• 1965 – 2015: Improvement by 100 Million in CPU & Memory
• 2006 – Power & Memory are available (& scalable) as a commodity; 100 CPUs now rentable for only an hour from Amazon & others
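A quick check of the compounding arithmetic behind these figures, as a rough sketch: the quoted factor of 100 million over 1965 – 2015 corresponds to a doubling period of roughly 22 to 23 months, while a strict 18-month doubling would compound to about 10^10 over the same 50 years. The function names below are illustrative only.

```python
import math

def improvement_factor(years: float, doubling_months: float) -> float:
    """Growth factor after `years` if capacity doubles every `doubling_months`."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings

def implied_doubling_months(years: float, factor: float) -> float:
    """Doubling period (in months) that would yield `factor` over `years`."""
    return years * 12 / math.log2(factor)

# 50 years at one doubling every 18 months
print(f"{improvement_factor(50, 18):.2e}")
# Doubling period implied by a 100-million-fold improvement in 50 years
print(f"{implied_doubling_months(50, 1e8):.1f} months")
```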

Parallel & Distributed Computing

• 2004 – Parallel & distributed computing “reduced to practice” by Google
• 2005 – By modifying MapReduce, Yahoo! introduced Open Source Hadoop
• 2015 – Google’s “Search” uses MapReduce & 110,000 computers
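To make the MapReduce idea concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps using the classic word-count example. It is not Google's or Hadoop's implementation; in a real cluster the map and reduce calls run on many machines, and the function names here are illustrative.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values of each key (here, by summing)."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(documents):
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle(mapped))

print(word_count(["big data", "big compute"]))
```

Because each document is mapped independently and each key is reduced independently, both phases can be spread across machines, which is what makes the pattern scale.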

Improved Algorithms & Open Source Software

• 1989 - Free & Open Source Software Libraries contain many algorithms, leading to ‘democratization’
• 2005 – Artificial Intelligence Algorithms have advanced substantially
• 2010 - Real time analysis of disparate data in large volume now a reality

Page 6:

Exponential Increase in Data Security & Anonymization

• Need to bring varied data sources together (e.g., structured, unstructured, audio, video) so as to get a unified view with respect to volume, variety, veracity, and often, velocity
• Databases may be in different locations or countries, and rules may prohibit bringing them together; we may need data virtualization & determination of attributes that connect them
• Data may need to be anonymized; need to avoid Target & Netflix type issues
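One common building block for the anonymization point is keyed hashing of direct identifiers, sketched below under stated assumptions: the field names and the secret key are hypothetical, and this is not described in the slides as Scry's method. The same key produces the same token in every database, so records can still be connected on the tokenized attribute without exchanging raw identifiers. This is pseudonymization only; other attributes left in the data can still allow re-identification, so it is not sufficient on its own.

```python
import hashlib
import hmac

def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 token (HMAC)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def anonymize_record(record: dict, id_fields: set, secret_key: bytes) -> dict:
    """Tokenize the identifying fields and leave the other fields unchanged."""
    return {
        field: pseudonymize(str(value), secret_key) if field in id_fields else value
        for field, value in record.items()
    }

# Hypothetical records from two separate databases
key = b"example-secret-key"
bank_row = anonymize_record({"customer_id": "C-1", "balance": 10}, {"customer_id"}, key)
crm_row = anonymize_record({"customer_id": "C-1", "city": "NY"}, {"customer_id"}, key)
print(bank_row["customer_id"] == crm_row["customer_id"])
```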

Page 7:

About Scry Analytics

Founder - Evalueserve; Dir. of Res. - IBM Watson Res. Ctr.; Founder - IBM India Research

• Machine Learning (Paper Mills), Advanced algorithms for Hierarchical Memories & IBM Watson, etc. (1984 - 97)
• Sabbatical at MIT (1990-1991); went back and deployed ML-genetic algorithms for Paper Mill Scheduling (1992-95)
• Founded IBM India Research Lab. (1998 - 2000); grew it to 30 PhDs and 30 Masters
• Co-founded Evalueserve (2000 - 13); research & analytics services co. with 3,200 employees worldwide
• Founded Scry Analytics in February 2014; “Scry” means crystal ball gazing or fortune telling

What We Do and How

• Create single source of truth (~ 97% accurate) data systems so as to provide data relevance, reliability and quality across full range of data (e.g., structured, unstructured, machine logs)
• Use Cognitive Computing and Subject Matter Expertise to build library of automated solutions to provide actionable insights for improving risk, complaints, regulatory compliance, revenue and profits
• Do a quick proof of concept (3-4 months) and then scale for production

Page 8:

AGENDA

• Brief Overview of Artificial Intelligence
• Collatio: Creating a Single Source of Truth System
• Jidoka: AI solutions to drive Automated Actionable Insights

Page 9:

Prospecting & Marketing SMB Loans

• Small & Medium Business (SMB) Lending Group of a Bank wants to market actively to new customers
• Our solution uses the base of current customers, i.e., those who have taken loans & have a checking account
• It creates “knowledge graphs” and uses Machine Learning to provide predictive & prescriptive insights to determine prospects for giving loans

• Checking Account: (a) is the SMB paying loans to some other lender, (b) given the average balance in the account, can this SMB use more loan, (c) which other SMBs are working with this SMB, etc.
• Creating a “Knowledge Graph”: Machine Learning Algorithms become more accurate when the number of SMBs who are customers of the bank is very large, because then most prospect SMBs are already “working with these customers” & their payment behavior becomes more evident
• Credit History of SMBs: For obtaining past credit history
• External Data: E.g., Glassdoor, Yelp, Privco, DnB, Social Media data, industry & geography data
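The idea of finding prospects that are already “working with” current customers can be sketched as a small graph built from checking-account payments. This is a simplified illustration, not Scry's solution: it only counts connections and involves no machine learning, and the SMB names and function names are hypothetical.

```python
from collections import defaultdict

def build_graph(payments):
    """payments: iterable of (payer, payee) pairs taken from checking-account records."""
    graph = defaultdict(set)
    for payer, payee in payments:
        graph[payer].add(payee)
        graph[payee].add(payer)
    return graph

def rank_prospects(graph, customers):
    """Rank non-customer SMBs by how many existing customers they transact with."""
    scores = {
        smb: len(neighbors & customers)
        for smb, neighbors in graph.items()
        if smb not in customers
    }
    return sorted(
        ((smb, n) for smb, n in scores.items() if n > 0),
        key=lambda item: (-item[1], item[0]),
    )

# Hypothetical data: A and B are current customers of the bank
payments = [("A", "X"), ("B", "X"), ("A", "Y"), ("C", "Z")]
print(rank_prospects(build_graph(payments), {"A", "B"}))
```

The more customers the bank has, the more prospects appear in the graph with several customer connections, which matches the accuracy argument made on the slide.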

Page 10:

Complaint Categorization & Resolution System

Scry Analytics’ proprietary Natural Language Processing & Deep Learning solution provides:

• Automatic classification, review and categorization of complaints
• Identification w.r.t. sub-categories & product types
• Historical trends w.r.t. issues, their categories and sub-categories
• Comparison & benchmarking of complaints w.r.t. peer firms
• Decision Support for “Next Best Action” so as to resolve a new complaint

Page 11:

Categorization of Complaints w.r.t. Firms (e.g., Wells Fargo)

If a complaint is resolved with a customer directly, CFPB’s website may say “closed with explanation”

Page 12:

Benchmarking Complaints Among Peers (e.g., Citibank vs. Bank of America)

[Figure: complaint comparison, with panels labeled Citibank and Bank of America]

Page 13:

Next Best Action Decision Support for Complaint Resolution

New Complaint:

I HAD A CREDIT BALANCE ON MY ACCOUNT OF XXXX CENTS. THIS MONTH MY ACCOUNT WAS CHARGED TO REMOVE THE XXXX CENTS. I CONTACTED CHASE AND WAS TOLD ANY CREDIT BALANCE UNDER XXXX (XXXX - XXXX) WOULD BE ADJUSTED AND RETAINED BY CHASE AS INTEREST IF NOT USED WITH 60 DAYS. I WAS NOT OFFERED A CREDIT BY CHASE OR ANY SUPPORTING DOCUMENTATION. I REVIEWED MY ACCOUNT OPENING MATERIALS AND COULD NOT FIND ANYWHERE THAT STATED THEY CAN KEEP MY CREDIT IF I DON'T USE IT WITH 60 DAYS. XXXX CENTS IS NOT A LOT OF MONEY, HOWEVER WHEN YOU HAVE XXXX CUSTOMERS IN XXXX STATES XXXX - XXXX CENTS CAN ADD UP TO XXXX DOLLARS IN UNDESERVED PROFITS TO THE BANK THAT DOES NOT BELONG TO THEM. TO ME THIS APPEARS TO BE THEFT BY CHASE BANK AND XXXX CUSTOMERS ARE BEING DECEIVED; BECAUSE TO MOST PEOPLE PENNIES DON'T MATTER. PLEASE INVESTIGATE

Output of the Decision Support System:

• Category – Communication Issues
• Next Best Action using similar issues in the past: Firm has responded to consumer & chooses not to respond publicly
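A “next best action using similar issues in the past” can be illustrated with a nearest-neighbor lookup over past complaints. The sketch below uses plain bag-of-words cosine similarity from the standard library; Scry's proprietary NLP & Deep Learning solution is not described in the slides and is presumably more sophisticated. The sample history, categories and actions are made up for illustration.

```python
import math
import re
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def next_best_action(new_complaint: str, history):
    """history: list of (past_complaint_text, category, action_taken) tuples."""
    query = bag_of_words(new_complaint)
    _, category, action = max(history, key=lambda row: cosine(query, bag_of_words(row[0])))
    return category, action

# Hypothetical past complaints with the category and action recorded for each
history = [
    ("bank kept my credit balance without telling me", "Communication Issues", "Closed with explanation"),
    ("my mortgage payment was applied to the wrong account", "Payment Processing", "Closed with monetary relief"),
]
print(next_best_action("I had a credit balance and the bank kept it", history))
```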

Page 14:

Detecting Customers Likely to Re-Finance

Mortgage Lending Group in a Bank wants to proactively determine which of its customers are likely to re-finance their current mortgage; “customers” are those with a checking account & mortgage with the Bank. A similar use case arises in figuring out who will pay off auto loans, credit cards and/or renew term deposits.

• Checking Account: (a) is the customer paying his/her bills on time, (b) account & credit card balances, (c) spending behavior, (d) loss of job, (e) application fee paid to another lender
• Credit History: FICO and credit-related attributes; other aspects of credit history
• Current Loan: (a) has the asset value gone up, (b) will the Adjustable Rate Mortgage go up soon, (c) have mortgage rates gone down, (d) is the monthly payment high w.r.t. avg. balance in checking/saving account
• Social Media & Ads: (a) current lender getting bad reviews in social media, on CFPB website, or is sued for deceptive lending, (b) another lender giving huge incentives, etc.
• Other Attributes: (a) does the customer re-finance often, (b) is it a construction loan, (c) does the customer own other homes, (d) payment to an architect, (e) demographics, e.g., education, age

Page 15:

Improving Compliance & Efficiency in Contact Centers

• Manual and time intensive process – CCAs often do not write correct reasons for delinquency (e.g., inability to pay vs. unwilling to pay)
• Ambiguity - Language ambiguity particularly while talking via phone
• Determining resolution w.r.t. delinquency is not easy
• Determining Compliance of CCAs is time consuming and laborious; hence, only 2% are usually checked

• Automated categorization of reasons for delinquency (e.g., inability to pay may be because of job-loss, alimony or medical reasons); hence recommendations based on the degree of similarity by Scry’s Cognitive Computing solution achieve 90%+ accuracy
• Solution checks all CCA discussions and provides “percentage of compliance” for each discussion
• Runs in REAL TIME by doing limited speech-to-text, ML, and NLP etc.

Agent Name | Total Calls | Scry Compliance | Complete Greeting | Company Identification | Agent Name Identification | Ask Caller Identification
Avg | 720.71 | 56.12 | 56.48 | 50.81 | 52.47 | 54.74
Kam Mortenson | 622 | 48.8 | 54.8 | 43.5 | 57.5 | 48.6
Tiny Mcelwee | 892 | 59.9 | 54 | 42 | 40.3 | 53.7
Lyman Broadus | 549 | 65.9 | 62.1 | 51.4 | 44.5 | 47
Kasie Mullen | 1056 | 40 | 63.3 | 51.4 | 41.7 | 68.9
Charis Guice | 490 | 57 | 57.3 | 58.4 | 59.6 | 48.4
Maragret Mani | 840 | 61.9 | 52 | 60.9 | 66.5 | 61.6
Linnea Cuthbertson | 596 | 68.8 | 50.3 | 49.6 | 66.2 | 40.3
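The Avg row of the table above appears to be weighted by each agent's call volume rather than a simple mean of the seven percentages. The sketch below recomputes it from the listed rows; the weighting scheme is inferred from the numbers, not stated on the slide, and the variable names are illustrative.

```python
# (agent, total_calls, [scry_compliance, complete_greeting, company_id,
#                       agent_name_id, ask_caller_id]) copied from the table
ROWS = [
    ("Kam Mortenson", 622, [48.8, 54.8, 43.5, 57.5, 48.6]),
    ("Tiny Mcelwee", 892, [59.9, 54.0, 42.0, 40.3, 53.7]),
    ("Lyman Broadus", 549, [65.9, 62.1, 51.4, 44.5, 47.0]),
    ("Kasie Mullen", 1056, [40.0, 63.3, 51.4, 41.7, 68.9]),
    ("Charis Guice", 490, [57.0, 57.3, 58.4, 59.6, 48.4]),
    ("Maragret Mani", 840, [61.9, 52.0, 60.9, 66.5, 61.6]),
    ("Linnea Cuthbertson", 596, [68.8, 50.3, 49.6, 66.2, 40.3]),
]

def weighted_averages(rows):
    """Average each compliance metric, weighting every agent by total calls."""
    total_calls = sum(calls for _, calls, _ in rows)
    n_metrics = len(rows[0][2])
    return [
        round(sum(calls * scores[i] for _, calls, scores in rows) / total_calls, 2)
        for i in range(n_metrics)
    ]

def mean_calls(rows):
    """Plain mean of the Total Calls column."""
    return round(sum(calls for _, calls, _ in rows) / len(rows), 2)

print(mean_calls(ROWS), weighted_averages(ROWS))
```

Weighting by calls means an agent with 1,056 calls moves the average about twice as much as one with 490 calls, which is usually what a compliance team wants when the unit of interest is the call rather than the agent.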

Page 16:

AGENDA

• Brief Overview of Artificial Intelligence
• Collatio: Creating a Single Source of Truth System
• Jidoka: AI solutions to drive Automated Actionable Insights

Page 17:

Scry Collatio Cognitive Computing – Creating Single Source of Truth

[Architecture diagram; labels recovered from the slide:]

• Data sources: On Premise RDBMS; On Premise Unstructured Data; External RDBMS; External Unstructured Data; Online databases
• Connectors & Scrapers Library; Data Lake
• Pre-Built SSoT Containers; Ontology & Live Manual; Data Governance; Data Exploration; Role Based Access & Security
• Data Collation Engine; Data Quality (DQ) Engine; Business Rules (BR) Engine; Business Dashboards
• Graphical User Interface; User Groups

Page 18:

Connectors, Scrapers & Data Ingestion

Scry Collatio Components

• Connectors
  • Relational DBs, e.g., MSSQL, MySQL, PostgreSQL, Oracle
  • Many online databases
  • External systems such as Salesforce, ServiceNow, etc.
• Scrapers for various websites etc.
• Schema Checks and Alerts
• Full & Incremental Loads on hourly/daily/weekly basis
• Persistent Data Structure and Data Lineage

[Diagram labels: On Premise RDBMS; On Premise Unstructured Data; External RDBMS; External Unstructured Data; Online databases; Connectors/Scrapers Library; Data Lake]
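A schema check followed by an incremental load can be sketched with a “high-water mark”: only rows with an id above the last one already loaded are copied. The example uses Python's built-in sqlite3 module as a stand-in for a source RDBMS and the data lake; the table layout (`id`, `name`) and function names are hypothetical, and Collatio's actual connectors are not described in the slides.

```python
import sqlite3

def schema_matches(conn, table, expected_columns):
    """Schema check: compare the table's column names with the expected list."""
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    return columns == list(expected_columns)

def incremental_load(source, target, table, last_seen_id):
    """Copy rows with id > last_seen_id and return the new high-water mark.

    The table name is interpolated into SQL, so it must come from trusted
    configuration, never from user input.
    """
    rows = source.execute(
        f"SELECT id, name FROM {table} WHERE id > ? ORDER BY id", (last_seen_id,)
    ).fetchall()
    target.executemany(f"INSERT INTO {table} (id, name) VALUES (?, ?)", rows)
    target.commit()
    return rows[-1][0] if rows else last_seen_id

source = sqlite3.connect(":memory:")
lake = sqlite3.connect(":memory:")
for conn in (source, lake):
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "a"), (2, "b")])

if schema_matches(source, "customers", ["id", "name"]):
    mark = incremental_load(source, lake, "customers", 0)   # full load on first run
    source.execute("INSERT INTO customers VALUES (3, 'c')")
    mark = incremental_load(source, lake, "customers", mark)  # copies only the new row
```

A full load is simply the same call with the mark reset to its starting value; a failed schema check is where an alert would be raised instead of loading.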

Page 19:

Role Based Access, Encryption & Security

Scry Collatio Components

• Two level encryption for data (if required)
• Authentication
• User Administration
  • Ability to manage users
  • Ability to manage roles and Role Hierarchy
• Authorization
  • Pages - Restrict access to Pages and specific functions on various pages
  • Data Set – Restrict access down to the granularity of specific data rows, columns, tables, datasets, data groups & DBs
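A role hierarchy with page-level and data-level authorization might look like the following sketch, in which a role inherits every grant of its parent role. The role names, resources and grant table are invented for illustration and do not come from the product.

```python
# Hypothetical hierarchy: admin inherits from analyst, analyst from viewer
ROLE_PARENTS = {"analyst": "viewer", "admin": "analyst"}

# Hypothetical grants as (resource_type, resource_name) pairs
GRANTS = {
    "viewer": {("page", "dashboard")},
    "analyst": {("table", "loans"), ("column", "loans.balance")},
    "admin": {("page", "user_admin")},
}

def effective_grants(role):
    """Collect the grants of a role and of all its ancestors in the hierarchy."""
    grants = set()
    while role is not None:
        grants |= GRANTS.get(role, set())
        role = ROLE_PARENTS.get(role)
    return grants

def is_allowed(role, resource_type, resource_name):
    """Authorization check for a page, table, column or other resource."""
    return (resource_type, resource_name) in effective_grants(role)

print(is_allowed("admin", "page", "dashboard"))   # inherited from viewer
print(is_allowed("viewer", "table", "loans"))     # not granted to viewer
```

Row-level restrictions would need a predicate per grant (for example, a filter on a region column) instead of a plain resource name; that is left out to keep the sketch short.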

Page 20:

Data Exploration & Preparation

Scry Collatio Components

• Process structured or unstructured data

• Ability to upload files or connect to RDBMS to create projects and datasets

• Explore data, extract metadata & related statistics

• Identify row & column similarities & correlated attributes among datasets

• Create derived datasets – join/merge tables

• Identify patterns, charts, graphs in the data

• Keyword matching, topic modelling, & summarization of text documents

• Learns new connections and deploys them for all new, incremental data
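One simple way to identify similar columns across datasets, and hence candidate join attributes, is the Jaccard overlap of their value sets. The sketch below is an assumption about how such a check could work, not a description of Collatio's algorithm; the column names and threshold are hypothetical.

```python
def jaccard(a, b):
    """Share of distinct values that two columns have in common."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 0.0

def similar_columns(left, right, threshold=0.5):
    """left/right: dicts mapping column name -> list of values.

    Returns (left_column, right_column, score) for pairs at or above the
    threshold, best match first.
    """
    matches = [
        (lcol, rcol, jaccard(lvals, rvals))
        for lcol, lvals in left.items()
        for rcol, rvals in right.items()
    ]
    return sorted((m for m in matches if m[2] >= threshold), key=lambda m: -m[2])

# Hypothetical datasets with differently named but overlapping key columns
loans = {"cust_id": [1, 2, 3, 4], "city": ["NY", "SF"]}
accounts = {"customer": [2, 3, 4, 5], "state": ["NY", "CA"]}
print(similar_columns(loans, accounts))
```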

Page 21:

Automated Data Quality (DQ) Engine

Scry Collatio Components

• Automated DQ Configuration Management

• Automated cleansing using proprietary machine learning & NLP algorithms

• Maintain history & audit of the execution & changes to improve DQ

• Dashboards to visualize overall DQ and DQ at table/column level

• Ability to extract data that failed DQ rules

• Allows for cleansing data manually

• Learns how DQ exceptions were fixed by Data Governance Team in the past & applies it

• Provides data quality statistics regularly

• Outliers

• Ranges

• Duplicate Values

• Duplicate Rows

• Address validation

• Numeric Conversion

• Date Conversion

• Orphans

• Email

• Nulls

• Empty Strings

• Sets

• Candidate Key

• Length check

• Date Range

• Phone formats

• Address formats
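A few of the check types listed above (nulls, empty strings, duplicate key values, ranges, email format) can be sketched as rule functions that return the failing cells, which is also the basis for extracting data that failed DQ rules. The column names, range limits and the deliberately loose email pattern are hypothetical.

```python
import re

# Loose pattern: something@something.something, no spaces
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def run_dq_checks(rows, key, ranges):
    """rows: list of dicts; key: candidate-key column; ranges: {column: (low, high)}.

    Returns a list of (row_index, column, failure_reason) tuples.
    """
    failures = []
    seen_keys = set()
    for i, row in enumerate(rows):
        for col, value in row.items():
            if value is None:
                failures.append((i, col, "null"))
            elif value == "":
                failures.append((i, col, "empty string"))
        if row.get(key) in seen_keys:
            failures.append((i, key, "duplicate value"))
        seen_keys.add(row.get(key))
        for col, (low, high) in ranges.items():
            value = row.get(col)
            if isinstance(value, (int, float)) and not (low <= value <= high):
                failures.append((i, col, "out of range"))
        email = row.get("email")
        if email and not EMAIL_RE.match(email):
            failures.append((i, "email", "bad email"))
    return failures

rows = [
    {"id": 1, "age": 30, "email": "a@x.com"},
    {"id": 1, "age": 150, "email": "bad"},
    {"id": 2, "age": None, "email": ""},
]
for failure in run_dq_checks(rows, key="id", ranges={"age": (0, 120)}):
    print(failure)
```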

Page 22:

Business Rules (BR) Engine

Scry Collatio Components

• Allows configuring, scheduling & executing of business rules using the User Interface on the fly

• Dashboards to visualize overall BR, table/column specific BR results

• History/Audit of the execution and changes to Business Rules

• Extract data that failed Business Rules

• Provides statistics regarding Business Rules on a regular basis
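Configurable business rules can be modeled as named predicates evaluated over rows, with the failing rows and a pass rate kept for each rule. This is a minimal sketch; the rule name and the loan/collateral fields are hypothetical, and the product's scheduling and user-interface layers are not represented.

```python
def execute_rules(rows, rules):
    """rules: list of (name, predicate); predicate(row) is True when the row passes.

    Returns {rule_name: {"failed_rows": [...], "pass_rate": float}}.
    """
    report = {}
    for name, predicate in rules:
        failed = [row for row in rows if not predicate(row)]
        pass_rate = 1 - len(failed) / len(rows) if rows else 1.0
        report[name] = {"failed_rows": failed, "pass_rate": pass_rate}
    return report

# Hypothetical rule: collateral must cover the loan amount
rules = [("collateral_covers_loan", lambda row: row["collateral"] >= row["loan"])]
rows = [
    {"loan": 100, "collateral": 150},
    {"loan": 200, "collateral": 50},
]
report = execute_rules(rows, rules)
print(report["collateral_covers_loan"]["pass_rate"])
```

Storing each run's report with a timestamp would give the history and statistics mentioned in the bullets above.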

Page 23:

End to End Automated Process

Scry Collatio End to End Process

[Process flow diagram; steps and durations as labeled on the slide:]

• Schema Checks, Data Ingestion & hourly/daily snapshots (5 mins)
• Data Quality Execution (15-30 mins)
• Business Rules (BR) Execution (5-15 mins)
• Data Collation (Source to FACT): Business Group 1 (15-30 mins); Business Group 2 (15-30 mins); Business Group 3 (15-30 mins)
• Refresh DQ Dashboards
• Refresh BR Dashboards (5 mins)
• Refresh Business Dashboards and Data Objects for further analysis (5-10 mins)

Page 24:

Thank You

Alok Aggarwal
+1 914 980 4717
+1 408 872 1078
alok.aggarwal@scryanalytics.com
ScryAnalytics.com

SCRY ANALYTICS INNOVATIVE SOLUTIONS & SERVICES

• Math and Statistical Algorithms
• Machine Learning, Natural Language Processing & Information Retrieval Algorithms
• Custom Creation of User Interfaces
• Build, Maintain and Upgrade Integrated Solutions
• Automatic Fill-In & Decision Support
• Munge, Harmonize and Cleanse Data
• Process Management & Rules-Based Frameworks