Data Science for Retail Broking

19
Data Science for Retail Broking

Transcript of Data Science for Retail Broking

Page 1: Data Science for Retail Broking

Data Science for Retail Broking

Page 2: Data Science for Retail Broking

Page 2 © AlgoAnalytics All rights reserved

Overview

About Algoanalytics

Applications of Analytics in Retail Broking

Retail Broking Case Studies

Page 3: Data Science for Retail Broking

Page 3 © AlgoAnalytics All rights reserved

CEO and Company ProfileAniruddha PantCEO and Founder of AlgoAnalytics

PhD, Control systems, University of California at Berkeley, USA 2001

• 20+ years in application of advanced mathematical techniques to academic and enterprise problems.

• Experience in application of machine learning to various business problems.

• Experience in financial markets trading; Indian as well as global markets.

Highlights

• Experience in cross-domain application of basic scientific process.

• Research in areas ranging from biology to financial markets to military applications.

• Close collaboration with premier educational institutes in India, USA & Europe.

• Active involvement in startup ecosystem in India.

Expertise

• Vice President, Capital Metrics and Risk Solutions• Head of Analytics Competency Center, Persistent Systems• Scientist and Group Leader, Tata Consultancy Services

Prior Experience

• Work at the intersection of mathematics and other domains

• Harness data to provide insight and solutions to our clients

Analytics Consultancy

• +30 data scientists with experience in mathematics and engineering

• Team strengths include ability to deal with structured/ unstructured data, classical ML as well as deep learning using cutting edge methodologies

Led by Aniruddha Pant

• Develop advanced mathematical models or solutions for a wide range of industries:

• Financial services, Retail, economics, healthcare, BFSI, telecom, …

Expertise in Mathematics and Computer Science

• Work closely with domain experts – either from the clients side or our own – to effectively model the problem to be solved

Working with Domain Specialists

About AlgoAnalytics

Page 4: Data Science for Retail Broking

AlgoAnalytics - One Stop AI ShopAniruddha PantCEO and Founder of AlgoAnalytics•We use structured data to design

our predictive analytics solutions like churn, recommender sys

•We use techniques like clustering, Recurrent Neural Networks

Structured Data

•We used text data analytics for designing solutions like sentiment analysis, news summarization and much more

•We use techniques like natural language processing, word2vec, deep learning, TF-IDF

Text Data

•Image data is used for predicting existence of particular pathology, image recognition and many others

•We use techniques like deep learning – convolutional neural network, artificial neural networks and technologies like TensorFlow

Image Data

•We use sound data to design factory solutions like air leakage detection, identification of empty and loaded strokes from press data, engine-compressor fault detection

•We use techniques like deep learning

Sound Data

Retail•Churn Analysis•Recommender System• Image Analytics – image recognition,

tagging, original and substitute differentiation

Healthcare•Medical Image Diagnostics •Work flow optimization•Cash flow forecasting

Legal•Contracts Management•Structured Document decomposition•Document similarity in text analytics

Internet of Things•Assisted Living•Predictive in ovens•Air leakage detection•Engine/compressor fault detection

Others•Algorithmic trading strategies•Risk sensing – network theory•Network failure model•Multilanguage sentiment analytics

Page 5: Data Science for Retail Broking

Page 5 © AlgoAnalytics All rights reserved

Technology:

Page 6: Data Science for Retail Broking

Page 6 © AlgoAnalytics All rights reserved

Data Science for Broking

Predict Dormancy – Finding which clients are unlikely to trade and take action

Stock Recommender System – Suggesting stocks that will increase probability of trading for a particular customer

Dormancy visualization Identifying KPIs for visualizing Dormancy

RM change – Assessing impact of change of relationship manger on trading activity

Mobile Brokerage – Origination, pricing and valuation of loans

Dynamic pricing models - Predicting brokerage slabs and sensitivities of trading volumes to broking charges

Channel adoption and preference – Use demographics and trading data to build a classification model

Portfolio Analytics – Analysis of client portfolio and suggestions for changes as per risk profile

Risk profiling – margin and default risks

Document similarity in text analytics – Automatic email classification, determining the topic of complaint etc.

Automated news download – downloading news that is relevant for customers

Sentiment analysis – Using text analytics to decide customer response to various offerings

News Summarization – Automatic download of relevant news items, News summarization

Page 7: Data Science for Retail Broking

Page 7 © AlgoAnalytics All rights reserved

Customer churn

Customer churn prediction : take customers past activity data and use distributed file systems and cluster computing to predict churn

Process Data

Compute Features

Train the classificatio

n model

Predict / Score

Process Data

Compute Features

Train the classification

model

Predict / Score

Page 8: Data Science for Retail Broking

Page 8 © AlgoAnalytics All rights reserved

Q5 Customers with no tradeswere marked as DORMANT

Test data

Label

Q1 Q2 Q3 Q4

Q2 Q3 Q4 Q5

Modelling(Machine learning

Algorithms) Result Evaluation

Prediction

• Train data• Data aggregation

quarter-wise

Trades data

Roughly 6% of the clientsresponsible for~80% of the loss

Past Brokerage

Number of Trades

Margin Amount Exchange Ledger

Amount

Examples of features used

Client profiles in terms of attributes computed from past trading.- Active clients = 1.03Mn- Active clients for which trade data is available = 346K- Average count of active clients who traded at least once during train

period = 254K

Prediction for quarter Jul – Sep 2015 Oct – Dec 2015

Accuracy 81.10% 78.30%

Sensitivity 88.35% 75.42%

Specificity 72.78% 81.90%

Prevalence 53.4% 55.57%

AUC 89.56% 88.21%

Total clients 252845 255873

% Growth in Nifty -5.01% -0.03%

Dormancy Prediction: predicts customers likely to stop trading

• INR 1.6 M brokerage from 2,200 (11% of 20K - CRM assigned ) reactivated clients.

• INR 309 K brokerage from 1,881 (4.8% of 39K – CRM not assigned ) reactivated clients

Page 9: Data Science for Retail Broking

Page 9 © AlgoAnalytics All rights reserved

Recommender System

What is RecSys? Value of Recommendation

RecSys Modeling and Applications

Aims to predict user preferences based on historical activity and implicit / explicit feedback

Helps in presenting the most relevant information (e.g. list of products / services)

Collaborative filtering: User’s behavior, similar users

Content-based filtering: using discrete characteristic of items

*Movies, music, news, books, search queries, social tags, etc.

* Financial services, insuranceIntel business unites (BUs), sales and marketing

Nearest Neighbor modeling

Matrix factorization and factorization machines

Classification learning model

Page 10: Data Science for Retail Broking

Page 10 © AlgoAnalytics All rights reserved

Stock Based Recommender System

Data Filtering• Discard Short Lived Sessions• Remove Rare Items• Consider only top ‘n’ most

popular items

Training and Testing• Training, Validation and

Testing set• Deep learning• Final Recommendations

Evaluation• Recall: Number of times actual

next item in the sequence is in top ‘k’ recommendations

Observed Recall@5 – 30.39%

Last 5 stocks bought Recommended Stock Probability

Intraday INFOSYS_TECHNOLOGIES

Intraday State Bank of India 0.1629

Intraday MOTHERSON_SUMI_SYST

EMIntraday L&T 0.137

Intraday UTI BANK LTD Longterm ICICI Banking Copora 0.124

Intraday ASIAN PAINT Intraday ICICI Banking Copora 0.0709

Intraday LIC HOUSING FIN Intraday Bharat Forge 0.061

Page 11: Data Science for Retail Broking

Page 11 © AlgoAnalytics All rights reserved

RM Change Risk Evaluation

RM’s Customer Ranking

• Rank clients under an RM in terms of their net worth and brokerage

Change in Ranking

• Determine changes in ranking of net worth and brokerage

• Evaluating the impact of the RM change by identifying the customers at risk who are associated with/managed by the same RM leaving the organization :

Take 90 days before and after an RM

change

Sum pre and post brokerage and net

worth

Sort clients according to brokerage and net

worth ranking

Develop model to predict change in

ranking on RM change

• Results based on changes in net worth and brokerage revenue:

Accuracy Sensitivity Specificity Prevalence Kappa AUC

Brokerage 74.53% 70.92% 77.02% 40.88% 0.4764 83.09%

Net Worth 74.88% 73.53% 76.07% 46.76% 0.4957 82.28%

• Results:

Page 12: Data Science for Retail Broking

Page 12 © AlgoAnalytics All rights reserved

Cluster Discovery, Description & Visualization using Predicted Labels

Cluster Discovery Cluster analysis can help identify groups or clusters of similar clients. Functional description of each cluster can help make inferences on client dormancy. Cluster analysis can be performed for predicted dormants. Features such as traded days, net worth, turnover (TO) may group clients into separate

clusters.

Cluster visualization

Prominent 2D View of Clusters

3D Visualization for Cluster Separation

Cluster Size

Q4 Traded Days

Q4 Min Net Worth (Lacs)

M3 Traded Days

M3 Peak Net Worth

(Lacs)

Q4 NSE Traded Days

13736 1.48 1.43 0.00 0.00 1.27

373 17.97 0.20 0.76 0.19 17.68

7952 1.22 0.60 1.07 0.82 1.09

Cluster DescriptionEach cluster can be used as a business action item for reaching out to clients Represent each cluster by a single point in the feature space. Make inferences on client behavior from these representation.

Page 13: Data Science for Retail Broking

Page 13 © AlgoAnalytics All rights reserved

Mobile Brokerage Analysis

For example: In June there are 8693 clients on mobile, 5390 clients remained till Jan 2016. We computed different brokerage features based on the client present in June.

Take the number of mobile and non-mobile

users in June 2016

Trace the same set clients in the

following months

Analyze the difference in their trade behavior

over the months

Page 14: Data Science for Retail Broking

Page 14 © AlgoAnalytics All rights reserved

Non-Mobile

For Clients with Mobile Brokerage<=0 in June' 15

(# of Traded Clients on Mobile

Total # of Traded Clients Total Brokerage Average Total

brokerage Drop in average

total brokerage in non-mobile

Ratio of total clients retained in non-mobile

customersJune 2015 0 48669 121861024.8 2503.873612 1 1July 2015 1878 32408 128207743 3956.052303 1.57997284 0.6659August 2015 3625 31692 124188548.5 3918.608751 1.56501859 0.6512September 2015 3475 29102 95712270.28 3288.855415 1.313506959 0.5980October 2015 2968 24533 94536048.35 3853.423892 1.538984984 0.5041November 2015 2970 23876 84523884.11 3540.11912 1.413856954 0.4906December 2015 3575 24383 75354303.71 3090.444314 1.234265299 0.5010January 2016 3179 23220 63526022.15 2735.832134 1.092639868 0.4771

Mobile

For Clients with Mobile Brokerage>0 in June' 15

# of Traded Clients on Mobile

Total # of Traded Clients Total Brokerage Average Total brokerage

Drop in average total brokerage in

mobile

Ratio of total clients retained

in mobile customers

June 2015 8693 8693 31603303.92 3635.488775 1 1July 2015 5412 7244 31868831.43 4399.341721 1.21011011 0.8333August 2015 5474 7089 31138534.24 4392.514351 1.208232131 0.8155September 2015 4606 6417 23532889.71 3667.272824 1.008742717 0.7382October 2015 4036 5693 20816096.02 3656.437031 1.005762157 0.6549November 2015 3874 5565 19891240.14 3574.346835 0.98318192 0.6402December 2015 3961 5556 17288433.75 3111.669141 0.855914936 0.6391January 2016 3823 5390 14874570.71 2759.660613 0.759089296 0.6200

Mobile Brokerage Analysis: Results

Jun-15 Jul-15 Aug-15 Sep-15 Oct-15 Nov-15 Dec-15 Jan-160

0.2

0.4

0.6

0.8

1

1.2

Comparison of clients retained

Ratio of total clients retained in mobile customersRatio of total clients retained in non-mobile customers

Jun-15 Jul-15 Aug-15 Sep-15 Oct-15 Nov-15 Dec-15 Jan-160

0.20.40.60.8

11.21.41.61.8

Comparing average drop in total brokerage in mobile and non-mobile

Drop in average total brokerage in mobileDrop in average total brokerage in non-mobile

Total mobile clients retained = 0.62 Total non-mobile clients retained =0.48 More mobile clients are retained than the non-mobile clients.

Page 15: Data Science for Retail Broking

Page 15 © AlgoAnalytics All rights reserved

Text Analytics for Retail broking

Page 16: Data Science for Retail Broking

Page 16 © AlgoAnalytics All rights reserved

Text analyticsIdentifying Important News Item News Summarization

Automatic summarization is the process of reducing a text document with computer program in order to create summary that retains the most important points of the original document

Creation of algorithm that separates out market/ stock relevant news and filters out noise using machine learning

Supervised and unsupervised machine learning approaches can be tried to enhance the performance of the model.

Latest research from Natural Language Processing(NLP) and text analytics can yield interesting results.

Extraction• Includes topic detection, scoring and extraction of most relevant text segment

ML Techniques•Techniques like RNNLM (recurrent neural network language modeling and recurrent convolutional neural networks will be applied

Page 17: Data Science for Retail Broking

Page 17 © AlgoAnalytics All rights reserved

Document Similarity in Text Analytics

Problem Statement - Finding semantically similar clause from standard clause library for each clause from input document

Page 18: Data Science for Retail Broking

Page 18 © AlgoAnalytics All rights reserved

Algorithmic Trading StrategyMethodology Technology Infrastructure

Page 19: Data Science for Retail Broking

Interested?Contact us: [email protected]