Massively Scalable Computational Finance with SciDB
![Page 1: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/1.jpg)
Massively Scalable Computational Finance with SciDB
Bryan Lewis, Chief Data Scientist
Frank Smietana, Solutions Architect
![Page 2: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/2.jpg)
© Paradigm4
GoToWebinar
• Ask questions using the Q&A window
• This webinar is being recorded
• Replays will be available from paradigm4.com
![Page 3: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/3.jpg)
Common issues
• Expensive data ETL
• Lack of horizontal scalability
• Hard to program
• Hard to extend
• Difficulty with data JOINS
![Page 4: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/4.jpg)
What is SciDB?
Massively scalable distributed array database
![Page 5: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/5.jpg)
What is SciDB?
Open source
![Page 6: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/6.jpg)
What is SciDB?
Mike Stonebraker, CTO
![Page 7: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/7.jpg)
© Paradigm4 Inc.
Big Science and SciDB
• Lawrence Berkeley
• NASA Goddard: projects using satellite image data
• Institute for Geoinformatics: global land change analysis on remote sensing data (LANDSAT, MODIS, SENTINEL)
![Page 8: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/8.jpg)
Commercial applications
• Pharma, Biotech, Healthcare
• Quantitative Finance
• Image & Sensor Analytics
• E-commerce
![Page 9: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/9.jpg)
Arrays for finance
[Figure: array with Symbol and Time axes]
![Page 10: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/10.jpg)
Fast multidimensional SELECTs
![Page 11: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/11.jpg)
Table model
i  j  data
1  1   0.5
1  2   0.3
1  3   0.1
1  4  -0.5
2  1   0.9
2  2   0.0
2  3  -0.8
2  4  -0.8
3  1   1.1
3  2   1.0
3  3   1.2
3  4   1.5
4  1   0.9
4  2   1.0
4  3   1.2
4  4   1.5
![Page 12: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/12.jpg)
Array model (i indexes rows, j indexes columns; the first cell is (1,1))
 0.5   0.3   0.1  -0.5
 0.9   0.0  -0.8  -0.8
 1.1   1.0   1.2   1.5
 0.9   1.0   1.2   1.5
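The contrast between the two models can be sketched in a few lines of Python. This is only an illustration of the idea, not SciDB code: the function names `to_array`, `select_table`, and `select_array` are made up for this sketch. The table model has to test the coordinates of every row, while the array model reaches a rectangular region directly by position.

```python
# Table model: one (i, j, value) row per cell, using the values from the slides.
rows = [
    (1, 1, 0.5), (1, 2, 0.3), (1, 3, 0.1), (1, 4, -0.5),
    (2, 1, 0.9), (2, 2, 0.0), (2, 3, -0.8), (2, 4, -0.8),
    (3, 1, 1.1), (3, 2, 1.0), (3, 3, 1.2), (3, 4, 1.5),
    (4, 1, 0.9), (4, 2, 1.0), (4, 3, 1.2), (4, 4, 1.5),
]


def to_array(rows, n_i, n_j):
    """Place each value at its (i, j) coordinate (1-based, as on the slides)."""
    grid = [[None] * n_j for _ in range(n_i)]
    for i, j, value in rows:
        grid[i - 1][j - 1] = value
    return grid


def select_table(rows, i_lo, i_hi, j_lo, j_hi):
    """Range SELECT on the table model: scan every row and test its coordinates."""
    return [v for i, j, v in rows if i_lo <= i <= i_hi and j_lo <= j <= j_hi]


def select_array(grid, i_lo, i_hi, j_lo, j_hi):
    """Range SELECT on the array model: slice directly by position."""
    return [v for row in grid[i_lo - 1:i_hi] for v in row[j_lo - 1:j_hi]]


array = to_array(rows, 4, 4)
```

Both selects return the same cells for the same bounds; the difference is that the array version never looks at cells outside the requested window.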
![Page 13: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/13.jpg)
Our approach
• Less data movement
• Spatial data clustering
• Leverage popular languages
• Extensibility
![Page 14: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/14.jpg)
Use Popular Languages
C++
Julia
Java/JVM
JavaScript
Array SQL
JDBC
Protocol buffers
C/C++ API
HTTP
![Page 15: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/15.jpg)
Shared-nothing architecture
[Diagram: SciDB instances 0, 1, 2, …]
![Page 16: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/16.jpg)
Common issues
• Expensive data ETL
• Lack of horizontal scalability
• Hard to program
• Hard to extend
• Difficulty with data JOINS
![Page 17: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/17.jpg)
SciDB
• Minimize ETL
• Massively scalable
• Program from many languages
• Open-source extensibility
• Fast parallel JOIN
![Page 18: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/18.jpg)
Poll
![Page 19: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/19.jpg)
Examples
• Order books
• Network analysis
![Page 20: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/20.jpg)
Order book challenges
• Lots of exchanges
• Regulatory compliance
• Margins are shrinking
• Want more alpha
![Page 21: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/21.jpg)
Create order book
• Load raw data into array
• Dimension along symbol and time coordinate axes
• Create order book entries with custom aggregation function ORDERBOOK
https://github.com/Paradigm4/orderbook-example
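A rough Python sketch of the aggregation idea follows. It is not the ORDERBOOK function from the linked repository, whose interface is not shown in the slides; the tuple layout `(symbol, time, side, price, size)` and the name `build_order_books` are assumptions made for illustration. Quotes are grouped into (symbol, time) cells, and each cell keeps the best `depth` price levels per side.

```python
from collections import defaultdict


def build_order_books(quotes, depth=10):
    """Group quotes by (symbol, time) and keep the best `depth` levels per side.

    `quotes` is an iterable of (symbol, time, side, price, size) tuples,
    a hypothetical raw-data layout chosen for this sketch.
    """
    levels = defaultdict(lambda: {"bid": defaultdict(int), "ask": defaultdict(int)})
    for symbol, time, side, price, size in quotes:
        levels[(symbol, time)][side][price] += size  # total size per price level

    books = {}
    for cell, sides in levels.items():
        bids = sorted(sides["bid"].items(), reverse=True)[:depth]  # highest bids first
        asks = sorted(sides["ask"].items())[:depth]                # lowest asks first
        books[cell] = {"bid": bids, "ask": asks}
    return books


quotes = [
    ("ABC", 1, "bid", 10.00, 100),
    ("ABC", 1, "bid", 10.01, 200),
    ("ABC", 1, "bid", 10.01, 50),
    ("ABC", 1, "ask", 10.03, 300),
    ("ABC", 1, "ask", 10.02, 150),
]
books = build_order_books(quotes, depth=2)
```

Because each (symbol, time) cell is aggregated independently, this kind of computation parallelizes naturally across the cells of an array.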
![Page 22: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/22.jpg)
Consolidate order books
• Load as arrays
• Merge into single array
• Impute missing values (inexact temporal join)
• Aggregate by time and symbol
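The merge-and-impute steps above can be sketched in Python for a single symbol. This is an illustrative stand-in, not the SciDB query used in the webinar: the names `as_of` and `consolidate`, the exchange labels, and the choice of "best bid" as the aggregate are all assumptions. Each exchange reports at different times, so the last known value is carried forward onto a shared time axis before aggregating.

```python
from bisect import bisect_right


def as_of(series, t):
    """Return the last value observed at or before time t, or None if there is none.

    `series` is a list of (time, value) pairs sorted by time.
    """
    times = [ts for ts, _ in series]
    k = bisect_right(times, t)
    return series[k - 1][1] if k else None


def consolidate(books_by_exchange):
    """Merge per-exchange best-bid series onto one shared time axis.

    Missing cells are imputed by carrying the last observation forward
    (an inexact temporal join), then aggregated to the best bid overall.
    """
    all_times = sorted({t for s in books_by_exchange.values() for t, _ in s})
    merged = {}
    for t in all_times:
        bids = [as_of(s, t) for s in books_by_exchange.values()]
        known = [b for b in bids if b is not None]  # exchanges with no data yet are skipped
        merged[t] = max(known)
    return merged


# Hypothetical best-bid series for one symbol on two exchanges.
books_by_exchange = {
    "X": [(1, 10.00), (4, 10.02)],
    "Y": [(2, 10.01), (3, 9.99)],
}
consolidated = consolidate(books_by_exchange)
```

At time 3, for example, exchange X has no new quote, so its value from time 1 is carried forward and compared against Y's fresh quote.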
![Page 23: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/23.jpg)
Example Order Books
![Page 24: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/24.jpg)
Merge and impute
![Page 25: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/25.jpg)
Consolidated Order Book
![Page 26: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/26.jpg)
Benchmark Results
• 9 exchanges; 358,000,000 events; 8,000 symbols
• Order book depth: 10
![Page 27: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/27.jpg)
Financial network analysis
![Page 28: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/28.jpg)
A graph
![Page 29: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/29.jpg)
Sparse matrix representation
![Page 30: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/30.jpg)
Bitcoin transactions
A directed graph, represented as a nonsymmetric sparse matrix
Axes: From address × To address
Cell contents: Date, Amount, Transaction ID
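A small Python sketch of this representation, using made-up addresses and transactions: only the (from, to) cells that actually contain transactions are stored, and the matrix is nonsymmetric because a payment from one address to another says nothing about the reverse direction. The name `total_sent` is hypothetical.

```python
from collections import defaultdict

# Hypothetical transactions: (from_address, to_address, date, amount, tx_id).
transactions = [
    ("a1", "b2", "2013-01-05", 0.50, "tx001"),
    ("a1", "c3", "2013-01-06", 1.25, "tx002"),
    ("b2", "a1", "2013-01-07", 0.10, "tx003"),
    ("a1", "b2", "2013-01-09", 2.00, "tx004"),
]

# Sparse matrix: only non-empty (from, to) cells are stored.
cells = defaultdict(list)
for src, dst, date, amount, tx_id in transactions:
    cells[(src, dst)].append({"date": date, "amount": amount, "tx_id": tx_id})


def total_sent(cells, src, dst):
    """Sum the amounts in one cell; an absent cell means there is no edge."""
    return sum(tx["amount"] for tx in cells.get((src, dst), []))
```

Here `total_sent(cells, "a1", "b2")` and `total_sent(cells, "b2", "a1")` differ, which is the nonsymmetry the slide refers to.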
![Page 31: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/31.jpg)
Bitcoin network schema
(using the Reid/Harrigan user ID method)
![Page 32: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/32.jpg)
Identify important nodes
• Kleinberg HITS method
• Subgraph centrality
• Fiedler clustering
• Other methods...
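As an illustration of the first method in the list, here is a minimal Python sketch of Kleinberg's HITS computed by power iteration over a sparse edge list. It is a generic textbook version run on a tiny made-up graph, not the in-database implementation discussed in the webinar; the function name `hits` and the fixed iteration count are choices made for this sketch.

```python
import math


def hits(edges, iterations=50):
    """Kleinberg's HITS by power iteration on a sparse edge list.

    `edges` is a list of (source, target) pairs. Returns (hub, authority)
    score dictionaries, each normalized to unit Euclidean length.
    """
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the nodes pointing at it.
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        # Hub score: sum of authority scores of the nodes it points at.
        hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            hub[src] += auth[dst]
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm
    return hub, auth


# Tiny made-up directed graph.
edges = [("a", "c"), ("a", "d"), ("b", "d"), ("c", "d")]
hub, auth = hits(edges)
top_hub = max(hub, key=hub.get)
top_auth = max(auth, key=auth.get)
```

Each iteration touches only the stored edges, so the cost scales with the number of nonzero matrix entries rather than with the square of the node count.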
![Page 33: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/33.jpg)
Bitcoin subgraph centrality
• Identify top 5 most central hub and authority nodes
• 16.3M nodes
• 6.3M x 6.3M sparse matrix
• 8-instance SciDB cluster on a single workstation (8 cores)
• 20 seconds
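For readers unfamiliar with the measure, subgraph centrality scores node i by the i-th diagonal entry of the matrix exponential of the adjacency matrix, which weights closed walks through the node by 1/k! for walks of length k. The Python sketch below computes it with a truncated series on a tiny undirected graph. It is a dense toy version under those assumptions; it does not reproduce the hub and authority variant for directed graphs or the sparse, distributed computation behind the benchmark above.

```python
def subgraph_centrality(adj, terms=30):
    """Diagonal of exp(A), computed with the truncated series sum_k A^k / k!.

    `adj` is a small dense adjacency matrix given as a list of lists.
    """
    n = len(adj)
    # Start from A^0 / 0!, the identity matrix.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    diag = [1.0] * n
    for k in range(1, terms):
        # power <- power @ adj / k, which equals A^k / k!
        power = [
            [sum(power[i][m] * adj[m][j] for m in range(n)) / k for j in range(n)]
            for i in range(n)
        ]
        for i in range(n):
            diag[i] += power[i][i]
    return diag


# Path graph 0 - 1 - 2: the middle node lies on the most closed walks.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
scores = subgraph_centrality(adj)
```

For this path graph the middle node receives the highest score, and the two end nodes score the same by symmetry.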
![Page 34: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/34.jpg)
Correlation network
1. Compute bar data closing prices from TAQ trades
2. na.locf imputation
3. Correlation matrix across all instruments
4. Regularize
5. Precision matrix
6. Threshold
7. Plot clusters
All inside SciDB up to plot
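Steps 2 through 6 can be sketched in plain Python on a toy data set. Several details here are assumptions, since the slide does not specify them: the regularization is shrinkage toward the identity matrix, the threshold is applied to partial correlations derived from the precision matrix, correlations are taken on the filled series directly, and the instrument names, prices, `shrink`, and `cutoff` values are all made up. The bar computation (step 1) and plotting (step 7) are omitted.

```python
import math


def locf(series):
    """Carry the last observation forward over None gaps (like R's na.locf).

    A leading None has nothing to carry forward and stays None.
    """
    filled, last = [], None
    for x in series:
        last = x if x is not None else last
        filled.append(last)
    return filled


def correlation(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def invert(m):
    """Gauss-Jordan inverse with partial pivoting for a small dense matrix."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def correlation_network(prices, shrink=0.1, cutoff=0.5):
    """Return the edges whose partial correlation magnitude reaches `cutoff`."""
    names = sorted(prices)
    filled = [locf(prices[s]) for s in names]
    n = len(names)
    corr = [[correlation(filled[i], filled[j]) for j in range(n)] for i in range(n)]
    # Regularize (assumed method): shrink toward the identity so the matrix inverts safely.
    reg = [[(1 - shrink) * corr[i][j] + shrink * (i == j) for j in range(n)] for i in range(n)]
    precision = invert(reg)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            partial = -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
            if abs(partial) >= cutoff:
                edges.append((names[i], names[j]))
    return edges


# Made-up closing prices with gaps (None) for three hypothetical instruments.
prices = {
    "A": [10.0, 10.2, None, 10.1, 10.4, 10.3],
    "B": [20.0, 20.5, 20.4, None, 20.9, 20.6],
    "C": [5.0, 4.9, 5.1, 5.0, None, 5.1],
}
edges = correlation_network(prices)
```

Thresholding the precision matrix rather than the raw correlation matrix keeps an edge only when two instruments remain related after accounting for the others, which tends to give sparser, more interpretable clusters.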
![Page 35: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/35.jpg)
Take away
• Bringing the analysis to the data
• In-database complex math
• Parallel time series analysis
• Programmable from C++, R, Python ...
• MPP on commodity clusters, clouds
• Extensible, open-source
www.paradigm4.com
![Page 36: Massively Scalable Computational Finance with SciDB](https://reader034.fdocuments.us/reader034/viewer/2022052618/5495a0e7b47959654d8b4ddf/html5/thumbnails/36.jpg)
Questions?
Tell us about your application • [email protected]
Try our Quick Start • scidb.org/forum
• Download a VM or EC2 AMI
www.paradigm4.com