Risk Analysis in the Financial Services Industry
Revolution Confidential
R in the Financial Services Industry
June 6, 2013
Karl-Kuno Kunze, Neil Miller, Andrie de Vries
Breakfast Briefing
R in Financial Services
Welcome & Revolution
Neil Miller, Managing Director, International
Andrie de Vries, Business Services Director, Europe
Revolution Analytics
R in Financial Institutions
Karl-Kuno Kunze, Managing Director, Nagler & Company
Revolution Analytics Corporate Overview & Quick Facts
Founded: 2007
Office locations: Palo Alto (HQ), Seattle (Engineering), Singapore, London
CEO: David Rich
Number of customers: 200+
Investors: Northbridge Venture Partners, Intel Capital, Platform Vendor
Web site: www.revolutionanalytics.com
Revolution – “Contender” in The Forrester Wave™: Big Data Predictive Analytics Solutions, Q1 2013
In the big data analytics context, speed and scale are critical drivers of success, and Revolution R delivers on both
Revolution R Enterprise is the leading commercial analytics platform based on the open source R statistical computing language
Incredible graphics, visualization and flexible statistical analytics capabilities
4500+ packages
R has some constraints for enterprise use
Innovate for breakthroughs
Scale & power your analytics
Deploy widely with confidence
Revolution R Enterprise: High Performance, Multi-Platform Enterprise Analytics Platform
• ScaleR: Distributed High Performance Architecture + High Performance Big Data Analytics packages
• RevoR: Performance Enhanced Open Source R + Open Source R packages
• ConnectR: High Speed Connectors
• PlatformR: Distributed Compute Contexts
• DevelopR: Integrated Development Environment
• DeployR: Web Services
DistributedR and ScaleR processing handles big data and/or big analytics.
Integration Layer: DeployR makes R accessible
• Seamless: Bring the power of R to any web-enabled application
• Simple: Leverage common APIs including JS, Java, .NET
• Scalable: Robustly scale user and compute workloads
• Secure: Manage enterprise security with LDAP & SSO
[Diagram: R / Statistical Modeling Expert, DeployR, Deployment Expert; target applications: Data Analysis, Business Intelligence, Mobile Web Apps, Cloud / SaaS]
On-Demand Analytics with DeployR
Market Basket Analysis using JavaScript and R, enabled by DeployR
• User selection drives JavaScript…
• …which drives an R script…
• …which drives JavaScript to return the data and graphics the user needs…
• …enabled by DeployR APIs
Example: Allstate performance assessment of SAS, R, Hadoop, Revolution (October 2012)
• Steve Yun, Principal Predictive Modeller at Allstate Research and Planning Centre, benchmarked SAS, R and Hadoop. “Data is our competitive advantage”.
• Generalised Linear Model for 150 million observations of insurance data and 70 degrees of freedom.
Conclusion:
• SAS works, but is slow.
• The data is too big for open-source R, even on a very large server.
• Hadoop is not the right fit.
• Revolution ScaleR gets the same results as SAS, but much faster and on cheaper kit.
Software | Platform | Comments | Time to fit
SAS (current tool) | 16-core Sun server | Proc GENMOD | 5 hours
rmr / map-reduce | 10-node (8 cores / node) Hadoop cluster | Lot of coding, prep and error investigation. Possible to improve time? | > 9 hours processing
Open source R | 250-GB server | Full data set and sampling. Sampling quicker but not acceptable to business. | Impossible (> 3 days)
Revolution ScaleR | 5-node (4 cores / node) LSF cluster | 90 minutes to load full data set | 5.7 minutes
Economic Capital Modeling
“As things become more and more extreme, I need a model that can estimate my risk in a way that enhances our confidence in our pricing and reserving. Modeling with Revolution R Enterprise gives me that.”
VP and Pricing Actuary, Jamie Botelho
• 1 day to 15 minutes
• 100,000 years of simulations
• Pricing optimization increases financial health
Profile: 10-year-old reinsurer’s Actuarial Group systematically makes sound financial and pricing decisions in production system and completes ad hoc analysis.
Key Technology: Revolution R Enterprise replaced Excel; drives business rules in company production system
Outcomes: Ability to compensate for lack of historical data by simulating a wide variety and quantity of events and using advanced correlation techniques. Complete full day of work in 15 minutes
Bottom line: Improved financial health by managing risk and increasing pricing optimization
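As a rough illustration of what "100,000 years of simulations" can mean, the sketch below simulates total annual losses with a frequency-severity model in Python. The slides do not describe the reinsurer's actual model: the Poisson event count, the lognormal severity, every parameter value, and the function names are assumptions chosen only for illustration, and this is not Revolution R Enterprise code.

```python
import math
import random

def poisson(rng, lam):
    """Draw a Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_annual_losses(n_years, freq, mu, sigma, seed=0):
    """Total loss per simulated year: Poisson event count, lognormal severities."""
    rng = random.Random(seed)
    return [
        sum(rng.lognormvariate(mu, sigma) for _ in range(poisson(rng, freq)))
        for _ in range(n_years)
    ]

def quantile(sorted_values, q):
    """Empirical quantile by nearest rank on pre-sorted data."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]

# Assumed, illustrative parameters: 2 events per year on average,
# severities lognormal with mu=0 and sigma=1 (arbitrary loss units).
losses = sorted(simulate_annual_losses(n_years=100_000, freq=2.0, mu=0.0, sigma=1.0))
mean_loss = sum(losses) / len(losses)
tail_loss = quantile(losses, 0.995)
print(f"mean annual loss: {mean_loss:.3f}, 99.5% quantile: {tail_loss:.3f}")
```

Simulating many synthetic years is one way to estimate tail quantiles when historical data are sparse, which is the situation the Outcomes line describes.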
F100 Investment Co. Outlier & Error Detection
Profile: Full-service global investment and securities management firm proved effectiveness of Revolution R Enterprise to detect potentially costly outliers and errors
Key Technology: Revolution R Enterprise using ScaleR Big Data Analytics capabilities
Analytic Approach – Exchange Rate Error Detection: ARIMA and VAR models used to define acceptable value changes using the prediction for the next value in a time series. Models trained using historical data.
“The models’ performance were impressive and few errors were missed.” VP, IT
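The error-detection idea above (forecast the next value, then accept only changes close to the forecast) can be sketched in Python with a deliberately simplified stand-in. The firm used ARIMA and VAR models; the AR(1) model here is a substitute chosen for brevity. The threshold `k`, the synthetic rate series, and all function names are assumptions, not details from the slides.

```python
import random
import statistics

def fit_ar1(series):
    """Least-squares fit of x[t] = intercept + phi * x[t-1] + noise."""
    prev, curr = series[:-1], series[1:]
    mean_prev, mean_curr = statistics.fmean(prev), statistics.fmean(curr)
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (x - mean_curr) for p, x in zip(prev, curr))
    phi = sxy / sxx
    intercept = mean_curr - phi * mean_prev
    residuals = [x - (intercept + phi * p) for p, x in zip(prev, curr)]
    return intercept, phi, statistics.pstdev(residuals)

def is_suspect(last_value, new_value, model, k=4.0):
    """Flag new_value if it is more than k residual std devs from the forecast."""
    intercept, phi, sigma = model
    return abs(new_value - (intercept + phi * last_value)) > k * sigma

# Synthetic "historical" rate series (illustrative only, not real market data).
rng = random.Random(42)
history = [1.30]
for _ in range(500):
    history.append(1.30 + 0.9 * (history[-1] - 1.30) + rng.gauss(0.0, 0.005))

model = fit_ar1(history)
unchanged = history[-1]          # rate repeats the previous value
decimal_slip = history[-1] * 10  # e.g. a misplaced decimal point
print(is_suspect(history[-1], unchanged, model),
      is_suspect(history[-1], decimal_slip, model))
```

Retraining amounts to calling `fit_ar1` again on a more recent window of history.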
Bottom line: new analytics paradigm for existing processes introduced, with potential for millions of dollars in cost avoidance
• >65M end-of-day trades
• >8,500 variables
• Weekly model re-training
Analytic Approach – Outlier Detection: Use historical data for each customer (>65M end-of-day trades and >8,500 variables) to build and train linear regression model to establish range of predicted values for customers’ trades so that actual trades can be analyzed for outliers.
“Using statistical analysis by customer delivers superior accuracy compared to rules-based analysis (such as analyzing largest 10% of trades), which fail over time as volumes or client behavior changes. Statistical models that can be retrained (e.g weekly) will account for changes and not fail over time.” VP, IT
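A minimal Python sketch of the per-customer approach follows. It assumes a single hypothetical predictor per trade (the real models drew on more than 8,500 variables), a band of `k` residual standard deviations, and synthetic data for two made-up customers. None of these specifics come from the slides.

```python
from collections import defaultdict
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus residual std dev."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return slope, intercept, statistics.pstdev(residuals)

def train_per_customer(history):
    """history: iterable of (customer, x, y) rows -> {customer: fitted model}."""
    grouped = defaultdict(lambda: ([], []))
    for customer, x, y in history:
        grouped[customer][0].append(x)
        grouped[customer][1].append(y)
    return {customer: fit_line(xs, ys) for customer, (xs, ys) in grouped.items()}

def is_outlier(models, customer, x, y, k=3.0):
    """True if y lies outside the customer's predicted range at x."""
    slope, intercept, resid_sd = models[customer]
    return abs(y - (intercept + slope * x)) > k * resid_sd

# Synthetic history: customer "A" trades large values, customer "B" small ones.
rng = random.Random(7)
history = []
for x in range(1, 51):
    history.append(("A", float(x), 100.0 * x + rng.gauss(0.0, 20.0)))
    history.append(("B", float(x), 5.0 * x + rng.gauss(0.0, 2.0)))

models = train_per_customer(history)
print(is_outlier(models, "A", 10.0, 1000.0), is_outlier(models, "B", 10.0, 1000.0))
```

The same trade value is ordinary for one customer and anomalous for the other, which is the case a single global rule such as "largest 10% of trades" handles poorly.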
Quantitative Research @ Global Investment Co.
Profile: Full-service global investment and securities management firm’s IT team proved effectiveness of Revolution R Enterprise to detect potentially costly outliers and errors
Key Technology: Revolution R Enterprise using ScaleR Big Data and DeployR integrated with Siteminder, which provides a secure, transparent, centralized analytics center.
Analytic Approach – develop models that can be applied to real-world data to exploit market opportunities and successfully develop, back-test, and deploy quantitative and event-based trading and investment strategies to effectively manage risk.
Quants’ daily model updates deployed to 100’s of traders
Challenge - Quantitative Research Group had a decentralized modeling practice where quants used Excel, Python, Java, open source R, and other tools to develop models that informed daily trading. This environment posed risk to IP protection, model versioning, transparency.
Bottom Line - Powerful statistical analytics platform provides a centralized, secure model repository that guides hundreds of millions of dollars of transactions made by 100’s of traders.
Innovates to Outperform
“One of the first R-based production deployments we rolled out tracks revenue flows among manufacturers and their suppliers. We combine public and proprietary data and apply graph analyses to get a clearer understanding of the likely performance of suppliers. These forecasts are more accurate than what could be developed with quarters-old public financial reports.”
- Sr. Quantitative Researcher, Tal Sasani
Profile: Publicly-traded, investment management company that includes the Livestrong family of funds. Revolution R Enterprise optimizes $8.5B portfolio of 22 funds.
Key Technology: Revolution R Enterprise replacing proprietary industry applications. Tableau front end for production analytics.
Outcomes: Battery of custom analytics now run overnight to inform morning work
Put R-based analytics into production
Bottom Line: Custom-built simulations, scenario analyses & financial stress tests improve confidence in forecasts and analysis, lifting the business
• New data, more lift
• Strategy simulation & portfolio optimization
• Days to overnight
Other Financial Services examples
• Op Risk: Conducting Monte Carlo simulations on 100,000 years of simulated data to measure aggregate operational risk from 7 types of operational risk in accordance with Basel II requirements
• Mortgage loan default analysis and prediction in a Hadoop environment. Moved from SAS = lower cost, better model uplift, better Hadoop integration
• Credit Scoring in Database with Netezza: Increased Speed
• Model Governance Issues: Model management through DeployR – changing analyst community and business user access via Qlikview, Excel, Python
• Using Revolution to support SAS to analyse foreign trade transactions to identify anomalies: Better data exploration and visualisation
• Control – “1600 SAS programmers and all the new guys coming in know R – now is the time to get my hands around R before it spins out of control with all these new R zealots coming on board”
• IT Innovation – starting to use Hadoop. SAS too hard to write map reduce jobs
• Cross Platform – 500 Teradata appliances and 10 Netezza. Seamlessly deploy analysis across their infrastructure
High Performance R & Big Data Analytics: Parallel External Memory Algorithms
Data Prep, Distillation & Descriptive Analytics
R Data Step
• Data import – Delimited, Fixed, SAS, SPSS, ODBC
• Variable creation & transformation
• Recode variables
• Factor variables
• Missing value handling
• Sort
• Merge
• Split
• Aggregate by category (means, sums)
Descriptive Statistics
• Min / Max
• Mean
• Median (approx.)
• Quantiles (approx.)
• Standard Deviation
• Variance
• Correlation
• Covariance
• Sum of Squares (cross product matrix for set variables)
• Pairwise Cross tabs
• Risk Ratio & Odds Ratio
• Cross-Tabulation of Data (standard tables & long form)
• Marginal Summaries of Cross Tabulations
Statistical Tests
• Chi Square Test
• Kendall Rank Correlation
• Fisher’s Exact Test
• Student’s t-Test
Sampling
• Subsample (observations & variables)
• Random Sampling
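To make "parallel external memory algorithm" concrete, here is a minimal Python sketch of how count, mean, variance, min and max can be computed one chunk at a time and then merged, so the full data set never has to sit in memory. It uses the standard pairwise merge formula for mean and variance. The `Summary` class, the chunk size and the toy data are hypothetical, and this is not the ScaleR implementation.

```python
import statistics
from dataclasses import dataclass

@dataclass
class Summary:
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0            # sum of squared deviations from the mean
    lo: float = float("inf")
    hi: float = float("-inf")

    @classmethod
    def of_chunk(cls, chunk):
        """Summarize one in-memory chunk; chunks are independent of each other."""
        mean = statistics.fmean(chunk)
        m2 = sum((x - mean) ** 2 for x in chunk)
        return cls(len(chunk), mean, m2, min(chunk), max(chunk))

    def merge(self, other):
        """Combine two partial summaries without revisiting the raw data."""
        n = self.n + other.n
        if n == 0:
            return Summary()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta ** 2 * self.n * other.n / n
        return Summary(n, mean, m2, min(self.lo, other.lo), max(self.hi, other.hi))

    @property
    def variance(self):
        """Sample variance (n - 1 denominator)."""
        return self.m2 / (self.n - 1)

def chunks(values, size):
    """Stand-in for reading successive blocks of rows from disk."""
    for start in range(0, len(values), size):
        yield values[start:start + size]

data = [float(i % 97) for i in range(10_000)]
total = Summary()
for chunk in chunks(data, 1_000):
    total = total.merge(Summary.of_chunk(chunk))
print(total.n, total.mean, total.variance, total.lo, total.hi)
```

Because `of_chunk` needs only its own chunk, the per-chunk work could be spread over cores or nodes, with `merge` combining the partial results at the end.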
High Performance R & Big Data Analytics: Parallel External Memory Algorithms (continued)
Statistical Modeling
Predictive Models
• Sum of Squares (cross product matrix for set variables)
• Multiple Linear Regression
• Generalized Linear Models (GLM): all exponential family distributions (binomial, Gaussian, inverse Gaussian, Poisson, Tweedie); standard link functions including cauchit, identity, log, logit, probit; user-defined distributions & link functions
• Covariance & Correlation Matrices
• Logistic Regression
• Classification & Regression Trees
• Predictions/scoring for models
• Residuals for all models
Data Visualization
• Histogram
• Line Plot
• Scatter Plot
• Lorenz Curve
• ROC Curves (actual data and predicted values)
Cluster Analysis
• K-Means
Machine Learning
Classification
• Decision Trees
Simulation
• Monte Carlo
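The cross-product matrix and multiple linear regression entries are linked: a regression can be fitted out of core by accumulating X'X and X'y chunk by chunk and solving the normal equations once at the end. The Python sketch below illustrates that idea under stated assumptions. The function names, the chunk size and the noise-free toy data are hypothetical, and the slides do not say how ScaleR implements its regression.

```python
def accumulate(chunk, xtx, xty):
    """Add one chunk's contribution to X'X and X'y (rows are (features, y))."""
    for features, y in chunk:
        row = [1.0, *features]  # leading 1.0 is the intercept column
        for i, a in enumerate(row):
            xty[i] += a * y
            for j, b in enumerate(row):
                xtx[i][j] += a * b

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

# Toy data generated from y = 2 + 3*x1 - 1*x2 with no noise (illustrative only).
rows = []
for i in range(1_000):
    x1, x2 = float(i % 7), float(i * 3 % 11)
    rows.append(((x1, x2), 2.0 + 3.0 * x1 - 1.0 * x2))

p = 3  # intercept plus two features
xtx = [[0.0] * p for _ in range(p)]
xty = [0.0] * p
for start in range(0, len(rows), 100):  # each slice stands in for a block read from disk
    accumulate(rows[start:start + 100], xtx, xty)

coefficients = solve(xtx, xty)
print([round(c, 6) for c in coefficients])
```

Only the small p-by-p matrix and p-vector persist between chunks, so memory use depends on the number of variables rather than the number of observations.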
www.revolutionanalytics.com Twitter: @RevolutionR
The leading commercial provider of software and support for the popular open source R statistics language.
Thank you