Accelerating Anomaly Detection Algorithms on FPGA-Based...
Transcript of Accelerating Anomaly Detection Algorithms on FPGA-Based...
Accelerating Anomaly Detection Algorithms on
FPGA-Based High-Speed NICsHiroki Matsutani
Dept. of ICS, Keio Universityhttp://www.arc.ics.keio.ac.jp/~matutani
August 2nd, 2018 International Forum on MPSoC for Software-Defined Hardware (MPSoC'18) 1
Accelerator design for big data
2
Input stream data Message queuing
RealtimeView
Database layer(Polyglot persistence)
DB queries
Data exchange (serialization)
Customer analysis Topic prediction Blockchain records Geolocation query …
Big data (Surveillance, Network service, SNS,UAV, IoT)
BatchView
BatchLearning
Inference
Batchprocessing
Streamprocessing
Today’s talk: Online learning
3
Input stream data Message queuing
RealtimeView
Database layer(Polyglot persistence)
DB queries
Data exchange (serialization)
Customer analysis Topic prediction Blockchain records Geolocation query …
Big data (Surveillance, Network service, SNS,UAV, IoT)
BatchView
BatchLearning
I/O intensive Compute intensive
Message queuing middleware
(Apache Kafka)
Stream processing (Apache Spark
Streaming)
Batch processing(Apache Spark)
Batch learning (Apache Spark
MLlib)Serialization
(Apache Thrift)KVS / Column DB(Redis, HBase)
Document DB(MongoDB)
Graph DB (Neo4j), graph
processing
Inference(DNN, CNN, RNN)
Tight integration of I/O and compute FPGA
Massive parallelism Networked GPU cluster
GPUsHostSwitchFour
10GbE FPGA
Inference,Online
learning
Batchprocessing
Streamprocessing
Online learning(SGD, ChangeFinder,
OS-ELM)
Offline vs. Online learning
4
All training data
New training
data+
Predictor
OK or NG?Test data
GPU-based batch processing
Examples: DNN, CNN, … Learning cost is high Predictor updated infrequently
Learning cost is low Predictor updated frequently Not very versatile
Predictor Test data ≒Training data
Sequential learning+ Inference
FPGA NIC/Switch
10GbEx4
FPGA-based stream processing
OK or NG?
Offline learning Online learning
Online learning approaches
5
Next value Xt is predictedbased on recent p values
Online sequential learning for SLFN (input, hidden, and output layers)
Time t
Xt-1
Xt-2
Xt-3Xt-4
Xt-5
?
Xt= β1Xt-1 + β2Xt-2 + …
ChangeFinder:Outlier and change point detections on time-series data
AR-model based Neural network
ChangeFinder on 10GbE FPGA• ChangeFinder algorithm
6
[J.Takeuchi, IEEE TKDE'06]
Elapsed time
Input data Xt
Change-point score
Step 1 (Outlier score):Receive input data Xt at time tCalculate outlier score of Xt based on past dataInfluence of past data controlled by discount rate r
Step 2 (Smoothing):Calculate moving average Yt of the outlier scoreSmoothing is controlled by window size S
Step 3 (Change-point score):Step 1 is performed for YtThe result is change-point score
ChangeFinder on 10GbE FPGA• ChangeFinder algorithm
7
[J.Takeuchi, IEEE TKDE'06]
Elapsed time
Input data Xt
Change-point score
Threshold
Step 1 (Outlier score):Receive input data Xt at time tCalculate outlier score of Xt based on past dataInfluence of past data controlled by discount rate r
Step 2 (Smoothing):Calculate moving average Yt of the outlier scoreSmoothing is controlled by window size S
Step 3 (Change-point score):Step 1 is performed for YtThe result is change-point score
ChangeFinder on 10GbE FPGA• 10GbE NIC datapath by Verilog HDL• Application logic in wrapper in HLS
8
10GMACRX 0
DMARX
Input Arbiter
Output PortLookup
BRAM OutputQueues
10GMACRX 3
10GMACTX 0
10GMACTX 3
DMATX
Wrapper
ChangeFinder.c
AXI4-Stream
AXI4-Stream
[T.Iwata, HeteroPar‘18]
ChangeFinder on 10GbE FPGA• Throughput: 83.4% of 10GbE line rate
9
Youtube Video:https://www.youtube.com/watch?v=wgTcBfkE5hY
Online learning approaches
10
Time t
Xt-1
Xt-2
Xt-3Xt-4
Xt-5
?
Xt= β1Xt-1 + β2Xt-2 + …
Next value Xt is predictedbased on recent p values
Online sequential learning for SLFN (input, hidden, and output layers)
n N m
Weight vector β00~β(N-1)(m-1)
ChangeFinder:Outlier and change point detections on time-series data
OS-ELM:Single hidden layer neural network (SLFN)
AR-model based Neural network
[N. Liang, TNN 2006]
Online learning + unsupervised
Pre-trained predictor
OK or NG?Test data Learning
predictorTest data ≒Training data
Sequential learning+ Inference
Offline learning Online learning
Inference only
Normal values (incl. noise) are learned after the deployment Anomaly detection adapted to a given environment
[M.Tsukada, HeteroPar’18]*Collaboration with Prof. M.Kondo (UTokyo)
+Unsupervised anomaly detection
(No training data needed)
11
Online learning + unsupervised• Learn vibration pattern of fan + noise
12
Youtube Video:https://www.youtube.com/watch?v=tCw7p7bjwTs
Summary: Online learning FPGA
14
Input stream data Message queuing
RealtimeView
Database layer(Polyglot persistence)
DB queries
Data exchange (serialization)
Customer analysis Topic prediction Blockchain records Geolocation query …
Big data (Surveillance, Network service, SNS,UAV, IoT)
BatchView
BatchLearning
I/O intensive Compute intensive
Message queuing middleware
(Apache Kafka)
Stream processing (Apache Spark
Streaming)
Batch processing(Apache Spark)
Batch learning (Apache Spark
MLlib)Serialization
(Apache Thrift)KVS / Column DB(Redis, HBase)
Document DB(MongoDB)
Graph DB (Neo4j), graph
processing
Inference(DNN, CNN, RNN)
Tight integration of I/O and compute FPGA
Massive parallelism Networked GPU cluster
GPUsHostSwitchFour
10GbE FPGA
Inference,Online
learning
Batchprocessing
Stream processing
Online learning(SGD, ChangeFinder,
OS-ELM)
References (1/2)• Outlier detection on 10GbE FPGA NIC
– Ami Hayashi, et.al., "An FPGA-Based In-NIC Cache Approach for Lazy Learning Outlier Filtering", PDP 2017.
– Ami Hayashi, et.al., "A Line Rate Outlier Filtering FPGA NIC using 10GbE Interface", ACM Comp Arch News (2015).
• Change-point detection on FPGA NIC– Takuma Iwata, et.al., "Accelerating Online
Change-Point Detection Algorithm using 10GbE FPGA NIC", HeteroPar 2018.
15
References (2/2)• Online sequential unsupervised anomaly
detector on FPGA– Mineto Tsukada, et.al., "OS-ELM-FPGA: An
FPGA-Based Online Sequential Unsupervised Anomaly Detector", HeteroPar 2018.
16
17
Thank you !
Acknowledgement:This work is supported by JST CREST JPMJCR1785
Thank you for listening