
Using In-Memory, Data-Parallel Computing for Operational Intelligence

Copyright © 2014 by ScaleOut Software, Inc.

Portland Big Data Users Group, October 23, 2014

Bill Bain, CEO


• What Is Operational Intelligence?

• Example: Tracking Cable Viewers

• Data-Parallel Computation

• Implementing OI Using an In-Memory Data Grid:

• Distributing the Data Across a Cluster

• Running the Computation using “Parallel Method Invocation”

• An Example in Financial Services

• Implementing In-Memory Hadoop MapReduce

• Video Demo

• More Examples of Operational Intelligence

Agenda


• Develops and markets In-Memory Data Grids, software middleware for:

• Scaling application performance and

• Providing operational intelligence using

• In-memory data storage and computing

• Dr. William Bain, Founder & CEO

• Career focused on parallel computing – Bell Labs, Intel, Microsoft

• 3 prior start-ups, last acquired by Microsoft and product now ships as Network Load Balancing in Windows Server

• Nine years in the market; 400 customers, 10,000 servers

• Sample customers:

About ScaleOut Software



Goal: Provide immediate feedback to a system handling live data. A few examples:

• Ecommerce: for personalized, real-time recommendations

• Equity trading: to minimize risk during a trading day

• Reservations systems: to identify issues, reroute, etc.

• Credit cards & wire transfers: to detect fraud in real time

• Smart grids: to optimize power distribution & detect issues

Online Systems Need Operational Intelligence


• Goals:

• Make real-time, personalized upsell offers.

• Immediately respond to service issues.

• Track aggregate behavior to identify patterns, e.g.:

• Total instantaneous incoming event rate

• Most popular programs and # viewers by zip code

• Requirements:

• Track events from 10M cable boxes with 25K events/sec (2.2B/day).

• Correlate, cleanse, and enrich events per rules (e.g. ignore fast channel switches, match channels to programs).

• Be able to feed enriched events to recommendation engine within 5 sec.

• Immediately examine any cable box (e.g., box status) & track statistics.

Example: Track Cable TV Viewers



Based on a simulated workload for the San Diego metropolitan area:

• Continuously correlates and enriches telemetry from 10M simulated set-top boxes (from synthetic load generator).

• Processes more than 30K events/second.

• Enriches events with program information every second.

• Tracks aggregate statistics (e.g., top 10 programs by zip code) every 10 secs.

The Result: An OI Platform

Real-Time Dashboard


Big Data Analytics

Real-Time vs. Batch Analytics

Real-time (“Operational Intelligence”):

• Live data sets

• Gigabytes to terabytes

• In-memory storage

• Seconds to minutes

• Best uses:

• Tracking live data

• Immediately identifying trends and capturing opportunities

• Providing immediate feedback

• Examples: Analytics Server, hServer

Batch (“Business Intelligence”):

• Static data sets

• Petabytes

• Disk storage

• Minutes to hours

• Best uses:

• Analyzing warehoused data

• Mining for long-term trends

• Examples: Hadoop, IBM, Teradata, SAS, SAP


• Operational intelligence can co-exist with business intelligence:

• Processes streaming data close to its sources.

• Provides real-time, “tactical” feedback (e.g., recommendations, alerts).

• Transforms data for storage in the data warehouse (ETL).

• Data warehouse provides “strategic” guidance.

• Using the same tool set (e.g., Hadoop MapReduce) lowers TCO:

• Leverages common skill set.

• Simplifies design (e.g., loading data into HDFS).

Integrated View of Analytics


• To keep up with fast-growing “live” workloads & maintain fast response times:

• Track state of entities within a live system.

• Reliably process updates to data set in real time.

• To identify and respond to trends in fast-changing data:

• Enrich & evaluate “live” data set in real time.

• Respond to identified patterns within seconds.

Challenges for Operational Intelligence

Chart: Growth in Web Servers (millions, 1995 to 2010). Source: Netcraft

Chart: Growth in “Big Data” (exabytes, 2000 to 2013)

“More data has been created in the past three years than in the past 40,000.”


The Solution: Data-Parallel Computing

• Straightforward, well-understood model of parallel computation

• An alternative to task-parallel computation (e.g., Storm)

• Simple: runs the same code on multiple, in-memory data items.

• Powerful: maintains a “live,” in-memory model of a real-world system.

• Fast: avoids data motion, which lowers speedup.

Analyze Data (Eval)

Combine Results (Merge)

Server Cluster
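The eval/merge pattern named on this slide can be sketched in a few lines. This is a minimal illustration only: it assumes a Java parallel stream on one machine stands in for the server cluster, and the class and method names (EvalMergeSketch, evalAndMerge) are hypothetical, not part of any ScaleOut API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;

public class EvalMergeSketch {
    /**
     * Runs the same eval function on every item in parallel, then combines
     * the partial results pairwise with merge. The merge function must be
     * associative and identity must be its neutral element.
     */
    public static <T, R> R evalAndMerge(List<T> items, Function<T, R> eval,
                                        BinaryOperator<R> merge, R identity) {
        return items.parallelStream()
                    .map(eval)                 // Analyze Data (Eval)
                    .reduce(identity, merge);  // Combine Results (Merge)
    }

    public static void main(String[] args) {
        // Hypothetical data: viewing minutes reported by four set-top boxes.
        List<Integer> minutes = Arrays.asList(30, 45, 0, 120);
        int total = evalAndMerge(minutes, m -> m, Integer::sum, 0);
        System.out.println("Total viewing minutes: " + total);
    }
}
```

In a real grid the eval step would run on each server against the objects it already holds, so only the small merged results cross the network.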


Model & track “live” system’s state in memory; analyze in parallel:

Implementing OI

In-Memory State in “IMDG”

NoSQL Storage

Real-Time Data-Parallel Analysis


• Each cable box is represented as an object in the IMDG:

• Object holds raw & enriched event streams, viewer parameters, and statistics.

• IMDG captures incoming events by updating objects.

• IMDG uses data-parallel computation to:

• immediately enrich box objects to generate alerts to the recommendation engine, and

• continuously collect and report global statistics.

Example: Tracking Cable Viewers


• Storm implements pipelined execution of tasks by “bolts” on incoming data streams.

• Streams can be distributed to bolts with configurable mappings.

• Developer controls the number of tasks per bolt.

• Storm uses a centralized master node and Zookeeper for fault-tolerance.

• Key strength: continuous processing of input streams

• Issues:

• Complexity / tuning

• Minimizing data motion

• Managing global state

Quick Comparison to Storm


Data-Parallel Enables Linear Speedup

Avoids data motion (network or disk I/O) which limits throughput:


Data-Parallel Computing Is Not New

• 1980’s: Special Purpose Hardware: “SIMD”

Thinking Machines Connection Machine 5

• 1990’s: General Purpose Parallel Supercomputers:“Domain Decomposition”, “SPMD”

Intel iPSC/2

IBM SP1


Data-Parallel Computing Is Not New

• 1990’s – early 2000’s: HPC on Clusters: “MPI”

• Since 2003: Clusters, the Cloud, and IMDGs: “MapReduce”

HP Blade Servers

Amazon EC2, Windows Azure


• In-memory data grid (IMDG) holds active entities undergoing state changes in memory.

• Backing store optionally holds large population of entities.

• IMDG processes incoming stream of state changes.

• Analytics engine examines entities in real time and generates alerts within seconds as needed.

A Data-Parallel Architecture for Operational Intelligence


In-Memory Data Grid (IMDG) stores “live” data in a cluster:

• Fits in the business logic layer:

• Follows object-oriented view of data (vs. relational view).

• Stores collections of Java/.NET/C++ objects shared by multiple clients.

• Uses create/read/update/delete and query APIs to access data.

• Implemented across a cluster of servers or VMs:

• Scales storage and throughput by adding servers.

• Provides high availability in case a server fails.

In-Memory Data Grid for Live Data


• IMDG’s collections of objects act like process collections:

• Unstructured, typically instances of a class (stored as serialized blobs)

• Individually accessible / update-able

• IMDG adds attributes:

• Accessible by global key

• Query-able by properties

• Highly available

• Optional timeouts

• Distributed locking

• Integration with a backing store

• Optional dependency relationships

• Asynchronous event handling

IMDGs Store “Live” Data

Basic “CRUD” APIs:

• Create(key, obj, tout)

• Read(key)

• Update(key, obj)

• Delete(key)

and…

• Lock(key)

• Unlock(key)

Object

key
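To make the shape of these calls concrete, here is a minimal single-process sketch built on ConcurrentHashMap. It is an assumption-laden stand-in, not the product's API: the class name CrudSketch is hypothetical, the timeout argument (tout) is omitted, and the locks are local to one JVM rather than distributed across a grid.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class CrudSketch {
    private final ConcurrentHashMap<String, Object> store = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Create: stores obj under key only if the key is not already present. */
    public boolean create(String key, Object obj) {
        return store.putIfAbsent(key, obj) == null;
    }

    /** Read: returns the stored object, or null if the key is absent. */
    public Object read(String key) {
        return store.get(key);
    }

    /** Update: replaces the stored object only if the key already exists. */
    public boolean update(String key, Object obj) {
        return store.replace(key, obj) != null;
    }

    /** Delete: removes the key and reports whether it was present. */
    public boolean delete(String key) {
        return store.remove(key) != null;
    }

    /** Lock: blocks until the calling thread holds the per-key lock. */
    public void lock(String key) {
        locks.computeIfAbsent(key, k -> new ReentrantLock()).lock();
    }

    /** Unlock: releases the per-key lock if the calling thread holds it. */
    public void unlock(String key) {
        ReentrantLock l = locks.get(key);
        if (l != null && l.isHeldByCurrentThread()) {
            l.unlock();
        }
    }

    public static void main(String[] args) {
        CrudSketch grid = new CrudSketch();
        grid.create("box-1", "on");          // hypothetical cable-box status
        grid.lock("box-1");
        grid.update("box-1", "standby");
        grid.unlock("box-1");
        System.out.println(grid.read("box-1"));
        grid.delete("box-1");
    }
}
```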


Data-Parallel Computing Using PMI

“Parallel Method Invocation” (PMI): an object-oriented version of data-parallel computing from the HPC community:

• Serves as a platform for MapReduce and other data-parallel operators.

• Selects objects using a parallel query on data hosted in the IMDG.

• Runs user-defined methods in parallel across the cluster.

Analyze Data (Eval)

Combine Results (Merge)

In-Memory Data Grid Runs Data-Parallel Computation.


Spark / Spark Streaming from U.C. Berkeley AMPLab:

• In-memory computing to accelerate and extend Hadoop MapReduce using data-parallel operators in Scala.

• Stores data as “resilient distributed datasets” (RDDs):

• Distributed across cluster

• Immutable

• Hold data from/output to HDFS.

• Manages data stream as a sequence of RDDs.

• Comparison to IMDG:

• Not designed for operational systems:

• Lacks high availability (uses lineage).

• Intended for data-parallel operations:

• Lacks CRUD APIs on individual objects.

Comparison: IMDGs to Spark


Integrate analysis into a stock trading platform:

• The IMDG holds market data and hedging strategies.

• Updates to market data continuously flow through the IMDG.

• The IMDG performs repeated data-parallel analysis on hedging strategies and alerts traders in real time.

• IMDG automatically and dynamically scales its throughput to handle new hedging strategies by adding servers.

Example in Financial Services


Selects all relevant objects in a distributed collection.

• Query spec matches data’s object-oriented properties.

• Selected objects are fed to the analysis engine on each local server.

Step 1: Select with Parallel Query


Java Example: Parallel Query

public class Portfolio {
    private long id;
    private Set<Stock> longPositions;
    private Set<Stock> shortPositions;
    private double totalValue;
    private Region region;
    private boolean alerted; // alert for trading

    @SossIndexAttribute // query-able property
    public double getTotalValue() {…}

    @SossIndexAttribute // query-able property
    public Region getRegion() {…}

    public Set<Long> evalPositions(MarketSnapshot ms) {…}
}

NamedCache pset = CacheFactory.getCache("portfolios");

Set<Portfolio> res = pset.queryObjects(Portfolio.class,
    and(greaterThan("totalValue", 1000000),
        equals("region", Region.US)));


• Create method to analyze a queried portfolio and another method to pair-wise merge the result sets of alerted portfolios:

Java Example: Parallel Method Invocation

public class PortfolioAnalysis
        implements Invokable<Portfolio, MarketSnapshot, Set<Long>> {

    public Set<Long> eval(Portfolio p, MarketSnapshot ms)
            throws InvokeException {
        // update portfolio and return id if alerted:
        return p.evalPositions(ms);
    }

    public Set<Long> merge(Set<Long> set1, Set<Long> set2)
            throws InvokeException {
        set1.addAll(set2);
        return set1; // merged set of alerted portfolio ids
    }
}


• Run a parallel method invocation on a queried set of portfolios and return set of ids for alerted portfolios:

Java Example: Parallel Method Invocation

NamedCache pset = CacheFactory.getCache("portfolios");

InvokeResult alertedPortfolios = pset.invoke(
    PortfolioAnalysis.class,
    Portfolio.class,
    and(greaterThan("totalValue", 1000000), // query spec
        equals("region", Region.US)),
    marketSnapshot, // parameters
    ...
);

System.out.println("The alerted portfolios are " + alertedPortfolios.getResult());


• IMDG ships user’s code and libraries to its servers.

• IMDG automatically schedules analysis operations across all grid servers and cores:

• The analysis runs on all objects selected by the parallel query.

• Each grid server analyzes its locally stored objects to minimize data motion.

• Parallel execution ensures fast completion time:

• IMDG automatically distributes workload across servers/cores.

• Scaling the IMDG automatically handles larger data sets.

Running the Analysis


• The IMDG automatically merges all analysis results:

• The IMDG first merges all results within each grid server in parallel.

• It then merges results across all grid servers to create one combined result.

• Efficient parallel merge minimizes the delay in combining all results.

• The IMDG delivers the combined result to the invoking application as one object.

Merging the Results
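The two-level merge described on this slide can be sketched as follows. This is an illustration under stated assumptions, not the grid's implementation: the names TwoLevelMergeSketch, treeMerge, and mergeAcrossServers are hypothetical, the per-server result lists are simulated in one process, and the rounds run sequentially here even though the merges within a round are independent and could run in parallel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

public class TwoLevelMergeSketch {
    /** Pairwise merge in rounds: N values are combined in about log2(N) rounds. */
    public static <R> R treeMerge(List<R> results, BinaryOperator<R> merge) {
        if (results.isEmpty()) {
            throw new IllegalArgumentException("no results to merge");
        }
        List<R> current = new ArrayList<>(results);
        while (current.size() > 1) {
            List<R> next = new ArrayList<>();
            for (int i = 0; i + 1 < current.size(); i += 2) {
                next.add(merge.apply(current.get(i), current.get(i + 1)));
            }
            if (current.size() % 2 == 1) {
                next.add(current.get(current.size() - 1)); // odd one carries over
            }
            current = next;
        }
        return current.get(0);
    }

    /** Level 1 merges results within each server; level 2 merges the per-server results. */
    public static <R> R mergeAcrossServers(List<List<R>> perServerResults,
                                           BinaryOperator<R> merge) {
        List<R> serverResults = new ArrayList<>();
        for (List<R> local : perServerResults) {
            serverResults.add(treeMerge(local, merge));
        }
        return treeMerge(serverResults, merge);
    }

    public static void main(String[] args) {
        // Hypothetical partial results held on three grid servers.
        List<List<Integer>> perServer = Arrays.asList(
            Arrays.asList(1, 2, 3), Arrays.asList(4, 5), Arrays.asList(6));
        System.out.println(mergeAcrossServers(perServer, Integer::sum));
    }
}
```

Because each round halves the number of partial results, the merge function needs to be associative for the combined result to be independent of how results are paired.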


• Measured a similar financial services application (back testing stock trading strategies on stock histories)

• Hosted IMDG in Amazon EC2 using 75 servers holding 1 TB of stock history data in memory

• IMDG handled a continuous stream of updates (1.1 GB/s)

• Results: analyzed 1 TB in 4.1 seconds (250 GB/s) with linear scaling

Sample Performance Results for PMI
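As a quick check of the quoted figures, the arithmetic can be written out. This assumes 1 TB = 1024 GB and an even spread of work across the 75 servers; the per-server number is derived here and is not stated in the original results.

```latex
\[
\text{aggregate throughput} \approx \frac{1024\ \text{GB}}{4.1\ \text{s}} \approx 250\ \text{GB/s},
\qquad
\text{per-server throughput} \approx \frac{250\ \text{GB/s}}{75\ \text{servers}} \approx 3.3\ \text{GB/s}
\]
```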


Benefits:

• Enables use of Hadoop MapReduce for operational intelligence.

• Accelerates data access by holding data in memory.

• Analyzes and updates “live” data.

• Reduces overheads of standard Hadoop distributions:

• Batch scheduling

• Disk access

• Data shuffling

• Mandatory key sorting

• Enables new features, e.g.:

• Global combining, optional sorting

Using PMI to Implement “In-Memory” Hadoop MapReduce


• A Hadoop distribution does not have to be installed unless HDFS is used.

• The developer starts MapReduce applications from a remote workstation.

• The IMDG automatically builds a reusable “invocation grid” of JVMs on the grid’s servers for PMI and ships the application’s jars.

• Results are stored in the IMDG, HDFS, or optionally globally merged and returned to the remote workstation.

Running MapReduce on the IMDG


Run In-Memory MR with YARN

• YARN transparently integrates batch and in-memory MapReduce into a single execution framework with shared access to HDFS.

• For example, hServer can transparently run Apache Hive in-memory.

Example of ScaleOut hServer with Hortonworks

Example of Hive Running on hServer


Run MapReduce as two PMI phases:

• Data can be input from either the IMDG or an external data source.

• Works with any input/output format compatible with the Apache distribution.

• IMDG uses its data-parallel execution engine (PMI) to invoke the mappers and the reducers.

• Eliminates batch scheduling overhead.

• Intermediate results are stored within the IMDG.

• Minimizes data motion between the mappers and reducers.

• Allows optional sorting.

• Output of a single reducer/combiner optionally can be globally merged.

Implementing MapReduce
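The two-phase structure can be illustrated with a word count written in plain Java. This is a sketch under assumptions, not hServer code: the class TwoPhaseWordCount and its methods are hypothetical, parallel streams stand in for the grid's data-parallel engine, and a ConcurrentHashMap stands in for intermediate results held in the IMDG. Keys are deliberately left unsorted, matching the point that sorting is optional.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

public class TwoPhaseWordCount {
    /** Phase 1 (map): emit (word, 1) pairs into an in-memory store grouped by key. */
    public static Map<String, Queue<Integer>> mapPhase(List<String> lines) {
        Map<String, Queue<Integer>> intermediate = new ConcurrentHashMap<>();
        lines.parallelStream().forEach(line -> {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    intermediate
                        .computeIfAbsent(word, k -> new ConcurrentLinkedQueue<>())
                        .add(1);
                }
            }
        });
        return intermediate;
    }

    /** Phase 2 (reduce): sum the values for every key, processing keys in parallel. */
    public static Map<String, Integer> reducePhase(Map<String, Queue<Integer>> intermediate) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        intermediate.entrySet().parallelStream().forEach(entry -> {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            counts.put(entry.getKey(), sum);
        });
        return counts;
    }

    /** Runs both phases back to back with no batch scheduling step in between. */
    public static Map<String, Integer> wordCount(List<String> lines) {
        return reducePhase(mapPhase(lines));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("live data", "live analysis of live data");
        System.out.println(wordCount(lines));
    }
}
```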


• IMDG adds grid input format for accessing key/value pairs held in the IMDG.

• MapReduce programs optionally can output results to IMDG with grid output format.

• Grid Record Reader optimizes access to key/value pairs to eliminate network overhead.

• Applications can access and update key/value pairs as operational data during analysis.

Accessing IMDG Data for M/R


• IMDG adds Dataset Record Reader (wrapper) to cache HDFS data during program execution.

• Hadoop automatically retrieves data from ScaleOut IMDG on subsequent runs.

• Dataset Record Reader stores and retrieves data with minimum network and memory overheads.

• Tests with the TeraSort benchmark have demonstrated 11X lower access latency compared with HDFS without the IMDG.

Optional Caching of HDFS Data


IMDG needs multiple in-memory storage models:

• Named cache, optimized for rich semantics on large objects:

• Property-based query

• Distributed locking

• Access from remote grids

• Named map, optimized for efficient storage and bulk analysis (e.g., MapReduce):

• Highly efficient object storage

• Pipelined, bulk-access mechanisms

Optimized In-Memory Storage


In-Memory Named Map:

• Stores key/value pairs in chunks.

• Allows CRUD operations on key/value pairs.

• Automatically organizes chunks into splits.

• Uses per-split hash table to access keys and manage multi-valued keys.

• Stores shuffled data set between mappers and reducers.

• Pipelines chunks to mappers and from reducers.

• Optionally uses memory mapped files to reduce access latency.

• Provides support for sorting keys.

Named Map Optimizations


• Measured performance:

• Startup times reduced to a few milliseconds

• Word count benchmark shows 20X speedup.

• Real-world example shows >40X speedup.

• MapReduce optimizations:

• Optional sorting

• Optional multicast of parameters to mappers

• Optional O(logN) global combining (avoids single, sequential reducer)

• Optional HDFS caching

• Optional reuse of JVMs across jobs

• Current limitations:

• No specific security for multi-tenancy

• Intermediate data must fit in the IMDG

Performance & Optimizations


• Invocation grids can be re-used across MapReduce jobs:

Accelerating Start-Up Times

public static void main(String argv[]) throws Exception {
    // Configure and load the invocation grid
    InvocationGrid grid = HServerJob.getInvocationGridBuilder("myGrid")
        // Add JAR files as IG dependencies
        .addJar("main-job.jar")
        .addJar("first-library.jar")
        // Add classes as IG dependencies
        .addClass(MyMapper.class)
        .addClass(MyReducer.class)
        // Define custom JVM parameters
        .setJVMParameters("-Xms512M -Xmx1024M")
        .load();

    // Run 10 jobs on the same invocation grid
    for (int i = 0; i < 10; i++) {
        Configuration conf = new Configuration();
        // The preloaded invocation grid is passed as the parameter to the job
        Job job = new HServerJob(conf, "Job number " + i, false, grid);
        // ......Configure the job here.........
        // Run the job
        job.waitForCompletion(true);
    }

    // Unload the invocation grid when we are done
    grid.unload();
}


• IMDG can run Apache Hive distribution unchanged.

• Accelerates queries for datasets hosted in HDFS or the IMDG:

• Intermediate data must fit within the IMDG.

• Challenges we faced:

• Requires YARN to transparently invoke MapReduce on IMDG.

• IMDG must use multiple JVMs per server since Hive tasks are not thread-safe.

• IMDG must support Hadoop’s distributed cache (required by Hive).

Running Hive on In-Memory Data


• Assume we have a named map called “customers” of customer objects:

Example: Querying a Named Map

public class Customer implements Serializable {
    private int customerId;
    private String firstName;
    private String lastName;
    private String login;

    public int getCustomerId() { return customerId; }

    public String getFirstName() { return firstName; }
    ...
}


• Create a table view of a named map:

• Associates class properties with columns.

• Allows properties to be omitted.

• Allows use of custom serialization.


hive> CREATE TABLE customers (customerid int, firstname string, lastname string, login string)
    STORED BY 'com.scaleoutsoftware.soss.hserver.hive.HServerHiveStorageHandler'
    TBLPROPERTIES ("hserver.map.name" = "customers");
OK
Time taken: 0.508 seconds


• Now query the named map:


hive> SELECT * FROM customers;
..............................
1 Eduardo Hazelrigg ehazelrigg
13 Serena Sadberry ssadberry
9 Ermelinda Manganaro emanganaro
5 Edda Speir espeir
17 Tomeka Stovall tstovall
21 Luciano Perkinson lperkinson
25 Jacob Garrow jgarrow
33 Quincy Kreutzer qkreutzer
37 Iona Speir ispeir
41 Ermelinda Thielen ethielen
Time taken: 0.475 seconds, Fetched: 100 row(s)


The Challenge: Operational intelligence to quickly evaluate and respond to sub-second market changes:

• Hedge fund tracks a set of hedging strategies:

• Strategies can cover various market sectors, such as high-tech, automotive, energy, consumer, real estate, etc.

• Each strategy contains list of holdings and rules for managing the holdings (such as target allocations).

• Updates to market data continuously arrive during the trading day.

• Challenge: The hedge fund must be able to quickly update and analyze its hedging strategies and provide alerts to traders.

Demo of the Finserv App. Using M/R


• MapReduce delivers a set of alerts to traders within 300 msec.

• Enables the trader to examine strategy details in real time:

Output: Real-Time Alerts


Fast map/reduce reconciles inventory and order systems for an online retailer:

• Challenge: Inventory and online order management are handled by different applications.

• Reconciled once per day.

• Inaccurate orders reduce margins.

• Solution:

• Host SKUs in IMDG updated in real time by order & inventory systems.

• Use MapReduce to reconcile in two minutes.

• Results: Real-time reconciliation ensures accurate orders.

Example in Ecommerce: Inventory Management


• IMDG holds customer information for active Web users.

• IMDG saves/retrieves customer information from backing store.

• Web browsers send activity information to analytics engine.

• IMDG updates customer history and preferences.

• Analytics engine identifies browsing and buying patterns.

• Analytics engine makes suggestions in real-time. Also sends email follow-ups.

Example: Web Shopping


• Online systems need operational intelligence on “live” data for immediate feedback.

• Operational intelligence can be implemented using standard data-parallel computing techniques, such as M/R.

• In-memory data grids provide an excellent platform for operational intelligence:

• Track the state of a “live” system.

• Implement high availability.

• Offer fast, data-parallel computation for immediate feedback.

Recap


Additional Information


• ScaleOut StateServer®

• In-Memory Data Grid for Windows and Linux

• Scales application performance.

• Industry-leading performance and ease of use

• ScaleOut GeoServer® adds

• WAN-based data replication for DR

• Breakthrough technology for global data access

• ScaleOut Analytics Server® adds

• Real-time data analysis for “live” data

• Comprehensive management tools

• ScaleOut hServer®

• Full Hadoop Map/Reduce engine (>20X faster*)

• Hadoop Map/Reduce on live, in-memory data

ScaleOut Software Products

Figure: ScaleOut StateServer In-Memory Data Grid (four Grid Service instances)

*in benchmark testing


Many Use Cases:

• Authorizations / Payment Processing / Mobile Payments

• Service Activation

• Inventory Management

• Sensor Data / SCADA

• Real Time Tracking

• Fraud Detection

• Situational Awareness

• Churn Management

• Market Feed / Event Handlers

• Execution Rules

• Financial: Risk, P&L, Pricing

• Operational Risk Compliance

The Need for Real-Time Analytics

Across Key Industries:

• CPG

• Financial

• Telco

• Retail

• Utilities

• Manufacturing

• Logistics

• IC / DoD

• Life Sciences

• Government

• Health Care

• Law enforcement


• Brick and mortar stores need to compete with online experience.

• Point-of-sale identifies opt-in customers to analytics engine.

• RFID tags identify product selection and availability in showroom.

• Analytics engine sends real-time advisories to sales staff via tablet.

Example: Retail Shopping


• Typically used for very large, static, offline datasets

• Data must be copied from disk-based storage (e.g., HDFS) into memory for analysis.

• Hadoop Map/Reduce adds lengthy batch scheduling and data shuffling overhead.

Problem: Hadoop Cannot Efficiently Perform Real-Time Analytics


// This job will run using the Hadoop
// job tracker:
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "wordcount");
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
}

// This job will run using ScaleOut hServer:
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new HServerJob(conf, "wordcount"); // the only changed line
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
}

Configuring MapReduce for the IMDG

• Without YARN, subclass the Hadoop Job class with a one-line change (below).

• With YARN, just replace the MapReduce execution framework.


• Mark class properties as indexes for query:

• Define a query using these properties:

Parallel Query Example (C#)

class Stock {
    [SossIndex]
    public string Ticker { get; set; }
    public decimal TotalShares { get; set; }
    public decimal Price { get; set; }
}

NamedCache cache = CacheFactory.GetCache("Stocks");
var q = from s in cache.QueryObjects<Stock>()
        where s.Ticker == "GOOG" || s.Ticker == "ORCL"
        select s;
Console.WriteLine("{0} Stocks found", q.Count());


• Create method to analyze each queried stock object:

• Create method to pair-wise merge the results:

Example of Analysis Code (C#)

// "params" is a reserved word in C#, so the parameter is named calcParams.
static decimal eval(Stock stock, StockCalcParams calcParams)
{
    return stock.Price * stock.TotalShares;
}

static decimal merge(decimal r1, decimal r2)
{
    return r1 + r2;
}


• Run a parallel method invocation:

Invoking the Parallel Analysis (C#)

NamedCache cache = CacheFactory.GetCache("Stocks");

decimal valueOfSelectedStocks =
    (from s in cache.QueryObjects<Stock>()
     where s.Ticker == "GOOG" || s.Ticker == "ORCL"
     select s)
    .Invoke(new StockCalcParams(…), new Func<Stock, StockCalcParams, decimal>(eval))
    .Merge(new Func<decimal, decimal, decimal>(merge));

Console.WriteLine("The value of selected stocks is {0}", valueOfSelectedStocks);
