#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode
Andre Langevin March 9th 2016
Wall Street Derivative Risk Solutions Using Apache Geode (Incubating)
Design Whiteboards for Solution Architects
Design Pattern
Event-based cross-product risk system using Geode
A Crash Course in Wall Street Trading
• Big Wall Street firms have “FICC” trading businesses organized by market:
  • Each business will trade “cash” and derivative products, but traders specialize in one or the other
  • There may be a team of traders working to trade a single product
• Trading systems are product specific and often highly specialized:
  • May have up to 50 different booking points for transactions
  • Multiple instances of the same trading system, deployed in different regions
  • Electronic markets mean that there are often external booking points to consolidate
• Managing these businesses requires a consolidated view of risk:
  • Risk factors span products and markets; it is not sufficient to just look at the risk by trade or book
  • Risk measures must be both fast and detailed to be relevant on the trading floor
  • Desk heads aggregate risk from individual trades to stay within desk limits for risk
  • Business heads aggregate risk across desks to stay within the firm’s risk appetite and regulatory limits
FICC: “Fixed Income, Commodities & Currencies”
Calculating Risk
• What is the “risk” that we are trying to measure?
  • Trades are valued by discounting their estimated future cash flows
  • Discount factors are based on observable market data
  • Movement in markets can change the value of your trades
  • “Trade Risk” is the sensitivity of each trade to changes in market data
• Markets are represented using curves:
  • A curve is defined using observable rates and prices and then “built” into a smooth, consistent “zero curve” using interpolation
• Consistency is paramount:
  • Most firms have a proprietary math library used for valuation and risk
  • Use the same market data in all calculations to avoid basis differences
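The valuation and risk ideas on this slide can be sketched in a few lines. This is a minimal illustration only, assuming linear interpolation between curve pillars, continuous compounding, and a parallel one-basis-point bump; the function names are hypothetical, and a firm's proprietary math library would build curves and measure risk quite differently.

```python
import math
from bisect import bisect_left


def zero_rate(curve, t):
    """Linearly interpolate a zero rate at time t (in years).

    curve is a list of (tenor_in_years, rate) pillars sorted by tenor.
    Rates are held flat outside the first and last pillar (an assumption).
    """
    tenors = [tenor for tenor, _ in curve]
    if t <= tenors[0]:
        return curve[0][1]
    if t >= tenors[-1]:
        return curve[-1][1]
    i = bisect_left(tenors, t)
    (t0, r0), (t1, r1) = curve[i - 1], curve[i]
    return r0 + (r1 - r0) * (t - t0) / (t1 - t0)


def present_value(cash_flows, curve):
    """Discount (time, amount) cash flows using continuously compounded zero rates."""
    return sum(amount * math.exp(-zero_rate(curve, t) * t) for t, amount in cash_flows)


def parallel_bump(curve, bump=0.0001):
    """Return a copy of the curve with every pillar shifted by `bump` (1bp by default)."""
    return [(tenor, rate + bump) for tenor, rate in curve]


def rate_sensitivity(cash_flows, curve, bump=0.0001):
    """Trade risk as the change in value when the curve moves by `bump`."""
    return present_value(cash_flows, parallel_bump(curve, bump)) - present_value(cash_flows, curve)
```

Using the same `curve` object for both the base and bumped valuations is the point of the consistency bullet: any difference in the result comes from the market move, not from mismatched inputs.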
Technology Solutions that Work Badly
• The easiest thing to do is just book all of your trades using one trading system!
  • Trading systems are product specific for many very good reasons, so this idea is a non-starter
• How about booking all of the hedges into the primary trading system?
  • Cash product systems can’t price derivatives, so you have to invent simple “proxies” for them
  • Have to build live feeds from one trading system into another, or book duplicates by hand
  • The back office has to remove the duplicates before settlement and accounting
• How about adding up all of the risk from each trading system into a report?
  • Almost impossible to make the valuations consistent across systems:
    • Different yield curve definitions, and different market data sources feeding curves
    • Different math libraries, and often a technology challenge to make them run fast enough
    • Different calculation methodologies (is relative risk up or down?)
  • Difficult to achieve speed needed to accurately compute hedge requirements
Cash Products: Cash products are securities that are settled immediately for a cash payment, such as stocks and bonds.
Filling in the Details of the Design
Event-based cross-product risk system using Geode
PDX Integration Glue
• PDX serialization is an effective cross-language bridge:
  • PDX data objects bridge solution components in Java, C# and C++
  • Avoid language-specific data types (e.g. C# date types) that don’t have equivalents in other languages
• Structure PDX objects to optimize performance:
  • May want to externalize sub-objects or lists into separate objects
  • Balance speed of lookup with memory consumption
  • Need to consider cluster locality
• JSON is a good message format:
  • PDX natively handles JSON, but not XML
  • C# works well with JSON, so the calculation engine and the downstream consumers should consume it easily
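One way to follow the advice about language-specific types is to reduce every field to a JSON string or number before it is published. The sketch below is an illustration under assumptions: the field names (`tradeId`, `maturity`, `notional`) are hypothetical, dates travel as ISO 8601 text, and exact decimals travel as strings.

```python
import json
from datetime import date
from decimal import Decimal


def _neutral(value):
    """Convert Python-only types into forms every consumer language can read."""
    if isinstance(value, date):
        return value.isoformat()   # ISO 8601 text instead of a native date type
    if isinstance(value, Decimal):
        return str(value)          # exact decimal carried as text
    raise TypeError(f"no language-neutral form for {type(value).__name__}")


def to_message(trade):
    """Serialize a trade dict to a JSON message."""
    return json.dumps(trade, default=_neutral, sort_keys=True)


def from_message(text):
    """Parse the JSON message back into native types on the consuming side."""
    trade = json.loads(text)
    trade["maturity"] = date.fromisoformat(trade["maturity"])
    trade["notional"] = Decimal(trade["notional"])
    return trade
```

A C# or C++ consumer would apply its own equivalent of `from_message`; only the text on the wire is shared.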
Designing and Naming Data Objects
• The trade data model serves two distinct purposes:
  • Descriptive data is only used for aggregation and viewing
  • Model parameters are only needed to calculate risk
  • Can be split into two regions to optimize performance
• Market data should follow the calculation design:
  • Model data to align to the calculation engine’s math library to reduce format conversions downstream
• Use “dot” notation to give business-friendly keys to objects:
  • Create compound keys like “USD.LIBOR.3M” and “USD.LIBOR.6M” to allow business users to “guess” a key easily, which promotes use of Geode data in secondary applications and spreadsheets
  • Values in the “dot” name are repeated as attributes of the object
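The "dot" key convention can be sketched as a small value class whose key is derived from its own attributes, so the two can never disagree. The class name `RateQuote`, the plain dict standing in for a region, and the rate value are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateQuote:
    """A market data object whose attributes repeat the parts of its key."""
    currency: str
    index: str
    tenor: str
    value: float

    @property
    def key(self):
        # Compound, business-friendly key such as "USD.LIBOR.3M".
        return ".".join((self.currency, self.index, self.tenor))


def parse_key(key):
    """Split a three-part dot key back into its named attributes."""
    currency, index, tenor = key.split(".")
    return {"currency": currency, "index": index, "tenor": tenor}


# A plain dict stands in for a region here; the rate value is illustrative.
region = {}
quote = RateQuote("USD", "LIBOR", "3M", 0.0063)
region[quote.key] = quote
```

A spreadsheet user who knows the currency, index, and tenor can then construct the key without looking anything up.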
Region Design
• Trade and market data regions:
  • Both may be high velocity, but with a low number of contributors
  • Curve definitions are updated slowly but used constantly
  • Typically a curve embeds a list of rates; leave it denormalized if rates are updated slowly
  • If calculation engine supports it, create a second region to cache built interest rate and credit curves (building a credit curve is 80% of the valuation time for a credit default swap)
  • Consider splitting model parameters from descriptive data to reduce amount of data flowing to compute grid
  • Foreign exchange quotes are typically small and updated daily
  • Interest rates change slowly and are referenced constantly
• Computational results and aggregation:
  • Risk results will be the largest and highest traffic region
  • Pre-aggregate risk inside Geode to support lower powered consumers (e.g. web pages)
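Pre-aggregation amounts to rolling trade-level results up along business dimensions so that a light consumer reads a handful of totals rather than every trade. A minimal sketch, assuming each risk result is a dict with hypothetical `desk`, `currency`, and `dv01` fields:

```python
from collections import defaultdict


def aggregate(risk_results, *dimensions):
    """Sum the dv01 of trade-level results, grouped by the given dimensions."""
    totals = defaultdict(float)
    for row in risk_results:
        group = tuple(row[d] for d in dimensions)
        totals[group] += row["dv01"]
    return dict(totals)


# Illustrative trade-level results; the values are made up for the example.
results = [
    {"trade": "T1", "desk": "Rates-NY", "currency": "USD", "dv01": 100.0},
    {"trade": "T2", "desk": "Rates-NY", "currency": "USD", "dv01": -30.0},
    {"trade": "T3", "desk": "Rates-NY", "currency": "EUR", "dv01": 50.0},
    {"trade": "T4", "desk": "Credit-LN", "currency": "USD", "dv01": 20.0},
]

by_desk = aggregate(results, "desk")
by_desk_currency = aggregate(results, "desk", "currency")
```

The same roll-up serves both levels described earlier: desk heads read the per-desk totals, and business heads sum across desks.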
Region Placement On the Geode Cluster
• Region placement optimizes the solution’s performance:
  • Consider placement of market data and trades holistically to make the risk calculation efficient; keep all the data a calculation needs on one machine
• Partition the trades regions to balance the cluster:
  • Partition trade region to maximize parallel execution during compute
  • Use a business term (e.g. valuation curve, currency, industry) that can be used to partition both the trade and market data sets
• Partition or replicate market data to optimize computations:
  • Replicate interest rates and foreign exchange rates to all nodes
  • Replicate or partition curve data to maximize collocation of trades with their market data to minimize cross-member network traffic
  • When using an external compute grid, this technique should also be applied to the local Geode cache on the compute grid
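The idea of partitioning trades and market data on a shared business term can be modeled without any cluster at all. The sketch below assumes the term is the currency and that it is the first field of every dot-notation key; the member names and key layouts are hypothetical. Geode provides custom partitioning and colocated regions for this purpose, which this sketch does not use.

```python
import zlib

MEMBERS = ["server-1", "server-2", "server-3"]  # hypothetical cluster members


def routing_term(key):
    """Extract the business term used for routing (assumed: leading currency field)."""
    return key.split(".")[0]


def member_for(key, members=MEMBERS):
    """Pick a member from a stable hash of the routing term, not of the whole key."""
    bucket = zlib.crc32(routing_term(key).encode("utf-8")) % len(members)
    return members[bucket]


# A trade key and a curve key that share a currency land on the same member.
trade_home = member_for("USD.SWAP.T1001")
curve_home = member_for("USD.LIBOR.3M")
```

Because only the routing term is hashed, a valuation running where the trade lives finds its curve locally instead of fetching it across the network.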
Getting Trade Data into Geode
• Message formats vary by product type:
  • OTC derivatives are typically captured in XML documents
  • Bond trading systems use FIX or similar (e.g. TOMS)
  • Proprietary formats from legacy trading systems
• Broker messages in an application server:
  • Transactional message consumer is best pattern
  • XML-to-object parsing tools readily available
• Trade data capture is transactional:
  • Best practice is to make end-to-end process a transaction, but may need to split into two legs based on source of messages
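As a rough illustration of XML-to-object parsing, the sketch below reads a trade document and separates descriptive fields from model parameters so they can be stored separately. The XML layout is invented for the example and is not any real trading system's or standard's schema; the element and field names are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical trade document; real OTC formats are far richer.
SAMPLE = """<trade id="T1001">
  <book>NY-RATES-1</book>
  <trader>jdoe</trader>
  <product>IRS</product>
  <notional currency="USD">10000000</notional>
  <fixedRate>0.0175</fixedRate>
  <maturity>2021-03-09</maturity>
</trade>"""


def parse_trade(xml_text):
    """Return (trade_id, descriptive_data, model_parameters) from a trade document."""
    root = ET.fromstring(xml_text)
    notional = root.find("notional")
    descriptive = {
        "book": root.findtext("book"),
        "trader": root.findtext("trader"),
        "product": root.findtext("product"),
    }
    model = {
        "currency": notional.get("currency"),
        "notional": float(notional.text),
        "fixedRate": float(root.findtext("fixedRate")),
        "maturity": root.findtext("maturity"),  # kept as ISO 8601 text
    }
    return root.get("id"), descriptive, model


trade_id, descriptive, model = parse_trade(SAMPLE)
```

In a transactional consumer, both resulting objects would be written inside the same transaction that acknowledges the inbound message.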
Getting Market Data into Geode
• Market data feeds have many proprietary formats
• Market data is often exceptionally fast moving:
  • Foreign exchange quotes for the major currency pairs can reach 70,000 messages/second
• Market data can also be very slow moving:
  • Rate fixings like LIBOR are once per day
  • Illiquid securities may not be quoted daily
• Conflate fast market data by sampling:
  • Discard inbound ticks that don’t move the market sufficiently
  • Sample down to a rate that your compute farm can accommodate
  • External client required to conflate within message queue
• Gate market data into batches:
  • Push complete update of all market data at pre-determined intervals
  • Day open and close by trading location (NY, London, Hong Kong)
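Discarding ticks that do not move the market sufficiently can be sketched as a filter that remembers the last published price per symbol. The threshold `min_move` and the tick layout are assumptions; a production conflator would also need time-based sampling, which is left out here.

```python
def conflate(ticks, min_move):
    """Yield only ticks that moved at least min_move from the last published price.

    ticks is an iterable of (symbol, price) pairs in arrival order.
    The first tick seen for a symbol is always published.
    """
    last_published = {}
    for symbol, price in ticks:
        previous = last_published.get(symbol)
        if previous is None or abs(price - previous) >= min_move:
            last_published[symbol] = price
            yield symbol, price


# Illustrative ticks; the prices are made up for the example.
inbound = [
    ("EUR/USD", 1.1000),
    ("EUR/USD", 1.10004),  # too small a move, dropped
    ("EUR/USD", 1.1002),
    ("USD/JPY", 113.50),
    ("EUR/USD", 1.10021),  # too small a move, dropped
]
published = list(conflate(inbound, min_move=0.0001))
```

Comparing against the last published price, rather than the last received one, stops a slow drift of small ticks from being suppressed indefinitely.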
Crunching Numbers on a Shared Grid
• Most trading firms have a proprietary math library:
  • Developed by internal quantitative teams to ensure consistency
  • Usually coded in C++ or C# to take advantage of Intel compute grid
• Pushing Geode events to an external compute grid:
  • Typical compute grid has a “head node” or “broker”
  • Use client-side Asynchronous Event Queue (“AEQ”) to collect events for grid’s broker to process
  • Stateless grid nodes can synchronously put results back to Geode regions to ensure results are captured
• Caching locally on the grid to accelerate performance:
  • Grid nodes can use Geode client-side caching proxies
  • Use client-side region interest registration to ensure updates are pushed to grid nodes
  • Can use wildcards on keys (see dot notation)
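Dot-notation keys make wildcard interest easy to express as a regular expression. The sketch below only shows which keys a given pattern would select from a set of hypothetical keys; it does not call any Geode API, although Geode's client API does offer regex-based interest registration.

```python
import re


def matching_keys(keys, pattern):
    """Return the keys that a regex interest pattern would select, in sorted order."""
    regex = re.compile(pattern)
    return sorted(key for key in keys if regex.fullmatch(key))


# Hypothetical keys following the dot notation convention.
keys = ["USD.LIBOR.3M", "USD.LIBOR.6M", "USD.OIS.1D", "EUR.EURIBOR.3M"]

usd_libor = matching_keys(keys, r"USD\.LIBOR\..*")
all_three_month = matching_keys(keys, r".*\.3M")
```

Note that the dots in the key must be escaped in the pattern; an unescaped `.` matches any character and can select more keys than intended.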
Crunching Numbers Inside Geode
• Running the math inside Geode is dramatically faster:
  • STAC Report Issue 0.99 in 2010 found that trade valuations running inside GemFire 6.3 were 76 times faster than a traditional grid
• Using the Geode grid as a compute grid:
  • Math library must be coded in Java (most are C++ or C#)
  • Try to use function parameters to define data model
  • Opportunities to cache frequently used derived results
• Using cache listeners to propagate valuation events:
  • Use cache listener to detect data updates in regions that contain valuation inputs (e.g. new trade, market data updates)
  • Do not listen to “jittery” regions, such as exchange rates
  • Encapsulate math into functions that cache listener can execute
  • Ensure regions are partitioned in order to get parallel execution across the grid
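The listener pattern above can be modeled with a tiny in-memory stand-in for regions. Everything here is an assumption made for illustration: the `Region` class is not Geode's, the valuation formula is a toy placeholder, and a real deployment would execute the math as functions on the partitioned cluster.

```python
class Region:
    """Minimal in-memory stand-in for a region that notifies listeners on put."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.listeners = []

    def put(self, key, value):
        self.data[key] = value
        for listener in self.listeners:
            listener(self.name, key, value)


trades, curves, fx, results = Region("trades"), Region("curves"), Region("fx"), Region("results")


def value_trade(trade, rate):
    """Toy placeholder for the math library call."""
    return trade["notional"] * rate


def revalue(trade_id):
    trade = trades.data[trade_id]
    rate = curves.data.get(trade["curve"])
    if rate is not None:
        results.put(trade_id, value_trade(trade, rate))


def on_update(region_name, key, value):
    """Revalue one trade on a trade update, or every dependent trade on a curve update."""
    if region_name == "trades":
        revalue(key)
    elif region_name == "curves":
        for trade_id, trade in trades.data.items():
            if trade["curve"] == key:
                revalue(trade_id)


# Listen only to valuation inputs; the "jittery" fx region gets no listener.
trades.listeners.append(on_update)
curves.listeners.append(on_update)
```

Leaving `fx` without a listener means exchange rate ticks update the cache but never trigger a revaluation on their own.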
Ticking Risk Views
• Roll-your-own client applications to view ticking risk:
  • Desktop applications can use the client libraries to receive events from the cluster using Continuous Queries, which can then be displayed in real time
  • Server hosted applications can use Continuous Queries or Asynchronous Event Queues
• Integrating packaged products:
  • Some specialty products handle streaming risk:
    • Armanta Theatre™
    • ION Enterprise Risk™
  • Integrate using custom Java components
• The traders will always want spreadsheets:
  • Write an Excel plug-in
Join the Apache Geode Community!
• Check out: http://geode.incubator.apache.org
• Subscribe: [email protected]
• Download: http://geode.incubator.apache.org/releases/
Thank you!