Data Science for Retail Broking
-
Upload
algoanalytics-financial-consultancy-pvt-ltd -
Category
Data & Analytics
-
view
261 -
download
1
Transcript of Data Science for Retail Broking
Data Science for Retail Broking
Page 2 © AlgoAnalytics All rights reserved
Overview
About Algoanalytics
Applications of Analytics in Retail Broking
Retail Broking Case Studies
Page 3 © AlgoAnalytics All rights reserved
CEO and Company ProfileAniruddha PantCEO and Founder of AlgoAnalytics
PhD, Control systems, University of California at Berkeley, USA 2001
• 20+ years in application of advanced mathematical techniques to academic and enterprise problems.
• Experience in application of machine learning to various business problems.
• Experience in financial markets trading; Indian as well as global markets.
Highlights
• Experience in cross-domain application of basic scientific process.
• Research in areas ranging from biology to financial markets to military applications.
• Close collaboration with premier educational institutes in India, USA & Europe.
• Active involvement in startup ecosystem in India.
Expertise
• Vice President, Capital Metrics and Risk Solutions• Head of Analytics Competency Center, Persistent Systems• Scientist and Group Leader, Tata Consultancy Services
Prior Experience
• Work at the intersection of mathematics and other domains
• Harness data to provide insight and solutions to our clients
Analytics Consultancy
• +30 data scientists with experience in mathematics and engineering
• Team strengths include ability to deal with structured/ unstructured data, classical ML as well as deep learning using cutting edge methodologies
Led by Aniruddha Pant
• Develop advanced mathematical models or solutions for a wide range of industries:
• Financial services, Retail, economics, healthcare, BFSI, telecom, …
Expertise in Mathematics and Computer Science
• Work closely with domain experts – either from the clients side or our own – to effectively model the problem to be solved
Working with Domain Specialists
About AlgoAnalytics
AlgoAnalytics - One Stop AI ShopAniruddha PantCEO and Founder of AlgoAnalytics•We use structured data to design
our predictive analytics solutions like churn, recommender sys
•We use techniques like clustering, Recurrent Neural Networks
Structured Data
•We used text data analytics for designing solutions like sentiment analysis, news summarization and much more
•We use techniques like natural language processing, word2vec, deep learning, TF-IDF
Text Data
•Image data is used for predicting existence of particular pathology, image recognition and many others
•We use techniques like deep learning – convolutional neural network, artificial neural networks and technologies like TensorFlow
Image Data
•We use sound data to design factory solutions like air leakage detection, identification of empty and loaded strokes from press data, engine-compressor fault detection
•We use techniques like deep learning
Sound Data
Retail•Churn Analysis•Recommender System• Image Analytics – image recognition,
tagging, original and substitute differentiation
Healthcare•Medical Image Diagnostics •Work flow optimization•Cash flow forecasting
Legal•Contracts Management•Structured Document decomposition•Document similarity in text analytics
Internet of Things•Assisted Living•Predictive in ovens•Air leakage detection•Engine/compressor fault detection
Others•Algorithmic trading strategies•Risk sensing – network theory•Network failure model•Multilanguage sentiment analytics
Page 5 © AlgoAnalytics All rights reserved
Technology:
Page 6 © AlgoAnalytics All rights reserved
Data Science for Broking
Predict Dormancy – Finding which clients are unlikely to trade and take action
Stock Recommender System – Suggesting stocks that will increase probability of trading for a particular customer
Dormancy visualization Identifying KPIs for visualizing Dormancy
RM change – Assessing impact of change of relationship manger on trading activity
Mobile Brokerage – Origination, pricing and valuation of loans
Dynamic pricing models - Predicting brokerage slabs and sensitivities of trading volumes to broking charges
Channel adoption and preference – Use demographics and trading data to build a classification model
Portfolio Analytics – Analysis of client portfolio and suggestions for changes as per risk profile
Risk profiling – margin and default risks
Document similarity in text analytics – Automatic email classification, determining the topic of complaint etc.
Automated news download – downloading news that is relevant for customers
Sentiment analysis – Using text analytics to decide customer response to various offerings
News Summarization – Automatic download of relevant news items, News summarization
Page 7 © AlgoAnalytics All rights reserved
Customer churn
Customer churn prediction : take customers past activity data and use distributed file systems and cluster computing to predict churn
Process Data
Compute Features
Train the classificatio
n model
Predict / Score
Process Data
Compute Features
Train the classification
model
Predict / Score
Page 8 © AlgoAnalytics All rights reserved
Q5 Customers with no tradeswere marked as DORMANT
Test data
Label
Q1 Q2 Q3 Q4
Q2 Q3 Q4 Q5
Modelling(Machine learning
Algorithms) Result Evaluation
Prediction
• Train data• Data aggregation
quarter-wise
Trades data
Roughly 6% of the clientsresponsible for~80% of the loss
Past Brokerage
Number of Trades
Margin Amount Exchange Ledger
Amount
Examples of features used
Client profiles in terms of attributes computed from past trading.- Active clients = 1.03Mn- Active clients for which trade data is available = 346K- Average count of active clients who traded at least once during train
period = 254K
Prediction for quarter Jul – Sep 2015 Oct – Dec 2015
Accuracy 81.10% 78.30%
Sensitivity 88.35% 75.42%
Specificity 72.78% 81.90%
Prevalence 53.4% 55.57%
AUC 89.56% 88.21%
Total clients 252845 255873
% Growth in Nifty -5.01% -0.03%
Dormancy Prediction: predicts customers likely to stop trading
• INR 1.6 M brokerage from 2,200 (11% of 20K - CRM assigned ) reactivated clients.
• INR 309 K brokerage from 1,881 (4.8% of 39K – CRM not assigned ) reactivated clients
Page 9 © AlgoAnalytics All rights reserved
Recommender System
What is RecSys? Value of Recommendation
RecSys Modeling and Applications
Aims to predict user preferences based on historical activity and implicit / explicit feedback
Helps in presenting the most relevant information (e.g. list of products / services)
Collaborative filtering: User’s behavior, similar users
Content-based filtering: using discrete characteristic of items
*Movies, music, news, books, search queries, social tags, etc.
* Financial services, insuranceIntel business unites (BUs), sales and marketing
Nearest Neighbor modeling
Matrix factorization and factorization machines
Classification learning model
Page 10 © AlgoAnalytics All rights reserved
Stock Based Recommender System
Data Filtering• Discard Short Lived Sessions• Remove Rare Items• Consider only top ‘n’ most
popular items
Training and Testing• Training, Validation and
Testing set• Deep learning• Final Recommendations
Evaluation• Recall: Number of times actual
next item in the sequence is in top ‘k’ recommendations
Observed Recall@5 – 30.39%
Last 5 stocks bought Recommended Stock Probability
Intraday INFOSYS_TECHNOLOGIES
Intraday State Bank of India 0.1629
Intraday MOTHERSON_SUMI_SYST
EMIntraday L&T 0.137
Intraday UTI BANK LTD Longterm ICICI Banking Copora 0.124
Intraday ASIAN PAINT Intraday ICICI Banking Copora 0.0709
Intraday LIC HOUSING FIN Intraday Bharat Forge 0.061
Page 11 © AlgoAnalytics All rights reserved
RM Change Risk Evaluation
RM’s Customer Ranking
• Rank clients under an RM in terms of their net worth and brokerage
Change in Ranking
• Determine changes in ranking of net worth and brokerage
• Evaluating the impact of the RM change by identifying the customers at risk who are associated with/managed by the same RM leaving the organization :
Take 90 days before and after an RM
change
Sum pre and post brokerage and net
worth
Sort clients according to brokerage and net
worth ranking
Develop model to predict change in
ranking on RM change
• Results based on changes in net worth and brokerage revenue:
Accuracy Sensitivity Specificity Prevalence Kappa AUC
Brokerage 74.53% 70.92% 77.02% 40.88% 0.4764 83.09%
Net Worth 74.88% 73.53% 76.07% 46.76% 0.4957 82.28%
• Results:
Page 12 © AlgoAnalytics All rights reserved
Cluster Discovery, Description & Visualization using Predicted Labels
Cluster Discovery Cluster analysis can help identify groups or clusters of similar clients. Functional description of each cluster can help make inferences on client dormancy. Cluster analysis can be performed for predicted dormants. Features such as traded days, net worth, turnover (TO) may group clients into separate
clusters.
Cluster visualization
Prominent 2D View of Clusters
3D Visualization for Cluster Separation
Cluster Size
Q4 Traded Days
Q4 Min Net Worth (Lacs)
M3 Traded Days
M3 Peak Net Worth
(Lacs)
Q4 NSE Traded Days
13736 1.48 1.43 0.00 0.00 1.27
373 17.97 0.20 0.76 0.19 17.68
7952 1.22 0.60 1.07 0.82 1.09
Cluster DescriptionEach cluster can be used as a business action item for reaching out to clients Represent each cluster by a single point in the feature space. Make inferences on client behavior from these representation.
Page 13 © AlgoAnalytics All rights reserved
Mobile Brokerage Analysis
For example: In June there are 8693 clients on mobile, 5390 clients remained till Jan 2016. We computed different brokerage features based on the client present in June.
Take the number of mobile and non-mobile
users in June 2016
Trace the same set clients in the
following months
Analyze the difference in their trade behavior
over the months
Page 14 © AlgoAnalytics All rights reserved
Non-Mobile
For Clients with Mobile Brokerage<=0 in June' 15
(# of Traded Clients on Mobile
Total # of Traded Clients Total Brokerage Average Total
brokerage Drop in average
total brokerage in non-mobile
Ratio of total clients retained in non-mobile
customersJune 2015 0 48669 121861024.8 2503.873612 1 1July 2015 1878 32408 128207743 3956.052303 1.57997284 0.6659August 2015 3625 31692 124188548.5 3918.608751 1.56501859 0.6512September 2015 3475 29102 95712270.28 3288.855415 1.313506959 0.5980October 2015 2968 24533 94536048.35 3853.423892 1.538984984 0.5041November 2015 2970 23876 84523884.11 3540.11912 1.413856954 0.4906December 2015 3575 24383 75354303.71 3090.444314 1.234265299 0.5010January 2016 3179 23220 63526022.15 2735.832134 1.092639868 0.4771
Mobile
For Clients with Mobile Brokerage>0 in June' 15
# of Traded Clients on Mobile
Total # of Traded Clients Total Brokerage Average Total brokerage
Drop in average total brokerage in
mobile
Ratio of total clients retained
in mobile customers
June 2015 8693 8693 31603303.92 3635.488775 1 1July 2015 5412 7244 31868831.43 4399.341721 1.21011011 0.8333August 2015 5474 7089 31138534.24 4392.514351 1.208232131 0.8155September 2015 4606 6417 23532889.71 3667.272824 1.008742717 0.7382October 2015 4036 5693 20816096.02 3656.437031 1.005762157 0.6549November 2015 3874 5565 19891240.14 3574.346835 0.98318192 0.6402December 2015 3961 5556 17288433.75 3111.669141 0.855914936 0.6391January 2016 3823 5390 14874570.71 2759.660613 0.759089296 0.6200
Mobile Brokerage Analysis: Results
Jun-15 Jul-15 Aug-15 Sep-15 Oct-15 Nov-15 Dec-15 Jan-160
0.2
0.4
0.6
0.8
1
1.2
Comparison of clients retained
Ratio of total clients retained in mobile customersRatio of total clients retained in non-mobile customers
Jun-15 Jul-15 Aug-15 Sep-15 Oct-15 Nov-15 Dec-15 Jan-160
0.20.40.60.8
11.21.41.61.8
Comparing average drop in total brokerage in mobile and non-mobile
Drop in average total brokerage in mobileDrop in average total brokerage in non-mobile
Total mobile clients retained = 0.62 Total non-mobile clients retained =0.48 More mobile clients are retained than the non-mobile clients.
Page 15 © AlgoAnalytics All rights reserved
Text Analytics for Retail broking
Page 16 © AlgoAnalytics All rights reserved
Text analyticsIdentifying Important News Item News Summarization
Automatic summarization is the process of reducing a text document with computer program in order to create summary that retains the most important points of the original document
Creation of algorithm that separates out market/ stock relevant news and filters out noise using machine learning
Supervised and unsupervised machine learning approaches can be tried to enhance the performance of the model.
Latest research from Natural Language Processing(NLP) and text analytics can yield interesting results.
Extraction• Includes topic detection, scoring and extraction of most relevant text segment
ML Techniques•Techniques like RNNLM (recurrent neural network language modeling and recurrent convolutional neural networks will be applied
Page 17 © AlgoAnalytics All rights reserved
Document Similarity in Text Analytics
Problem Statement - Finding semantically similar clause from standard clause library for each clause from input document
Page 18 © AlgoAnalytics All rights reserved
Algorithmic Trading StrategyMethodology Technology Infrastructure
Interested?Contact us: [email protected]