Parallel Implementation of Support Vector Machines (SVMs)
PRESENTATION CLASS PROJECT, CMPE-655
FALL 2014
CH Vijaya Naga Jyoth Sumanth | 12/16/14
OUTLINE
§ INTRODUCTION
§ SUPPORT VECTOR MACHINES
§ SEQUENTIAL VS. PARALLEL MINIMIZATION
§ PARALLEL COMPUTING ARCHITECTURES
§ CONCLUSION
§ INTRODUCTION
Why SVMs?
• Because of their strong performance on a wide range of machine learning problems.
• They perform binary classification with superior results.
• They can also be used for multi-class classification and regression.
Why Parallel SVMs?
• The computations behind SVMs are theoretically involved and computationally expensive.
• The SVM training process involves optimizing a large-size quadratic problem.
§ INTRODUCTION
What do SVMs do?
• Like other machine learning techniques, SVMs include a training phase
• that evaluates information from an input dataset to build a model.
• The model is then used for testing on unseen samples.
How is performance measured?
• By the ability to predict the correct labels of the test samples,
• with no overfitting.
http://groups.csail.mit.edu/ddmg/drupal/content/non-linear-svm-separation
§ INTRODUCTION
Any issues with SVMs?
• The computationally expensive nature of training SVMs restricts their use on large datasets.
• Using cross-validation and tuning other parameters adds extra expense to the training phase.
How do we deal with it?
• The parallel computing power of GPUs and other parallel processing resources can be exploited for this issue,
• so training becomes much faster.
§ INTRODUCTION
Size of datasets used by researchers?
PASCAL Challenge on Large-Scale Learning datasets
§ INTRODUCTION
SVM Theory
$y^{(i)} = +1$ when $w^T x^{(i)} + b > 0$
$y^{(i)} = -1$ when $w^T x^{(i)} + b < 0$
Separating hyperplane: $w^T x + b = 0$
• The points touching the separating hyperplane are called support vectors.
• The distance between $x_{sv}$ (the support vectors) and the hyperplane is called the margin.
• Support Vector Machines try to maximize the margin while building the model, as formalized below.
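Formally (standard SVM theory, not spelled out on the slide), maximizing the margin $2/\|w\|$ is equivalent to solving the quadratic program

$$\min_{w,\,b}\ \tfrac{1}{2}\|w\|^2 \quad \text{subject to} \quad y^{(i)}\big(w^T x^{(i)} + b\big) \ge 1,\quad i = 1,\dots,N.$$

The points that satisfy the constraint with equality are exactly the support vectors, and this quadratic program is the large-size quadratic problem that makes training expensive.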
§ SEQUENTIAL IMPLEMENTATION
• The optimization is broken down into iteration steps in which two αi weights are calculated at a time.
• After this, the optimality conditions are updated for the remaining data points.
• The next two weights are then calculated.
• Iterations are repeated until convergence.
• After convergence, the final w and b values are used for SVM classification. (A sketch of one such iteration follows below.)
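The slides show no code, so as an illustration only, here is a minimal sketch of the simplified two-weight (SMO-style) update loop described above. All names and the random choice of the second weight are my own simplifications, not the presenter's implementation:

```python
# Minimal, illustrative simplified-SMO loop (linear kernel).
# y must contain labels in {-1.0, +1.0}.
import numpy as np

def simplified_smo(X, y, C=1.0, tol=1e-3, max_passes=5):
    n = X.shape[0]
    K = X @ X.T                     # precomputed kernel matrix (the costly part)
    alpha = np.zeros(n)
    b, passes = 0.0, 0
    while passes < max_passes:
        changed = 0
        for i in range(n):
            # Prediction error on point i: f(x_i) - y_i
            Ei = (alpha * y) @ K[:, i] + b - y[i]
            if (y[i] * Ei < -tol and alpha[i] < C) or (y[i] * Ei > tol and alpha[i] > 0):
                # Pick the second weight j at random (real SMO uses heuristics)
                j = np.random.choice([k for k in range(n) if k != i])
                Ej = (alpha * y) @ K[:, j] + b - y[j]
                ai, aj = alpha[i], alpha[j]
                # Box constraints for alpha[j]
                if y[i] != y[j]:
                    L, H = max(0.0, aj - ai), min(C, C + aj - ai)
                else:
                    L, H = max(0.0, ai + aj - C), min(C, ai + aj)
                eta = 2 * K[i, j] - K[i, i] - K[j, j]
                if L == H or eta >= 0:
                    continue
                # Analytically optimize the two chosen weights
                alpha[j] = np.clip(aj - y[j] * (Ei - Ej) / eta, L, H)
                if abs(alpha[j] - aj) < 1e-5:
                    continue
                alpha[i] = ai + y[i] * y[j] * (aj - alpha[j])
                # Update the bias from the optimality (KKT) conditions
                b1 = b - Ei - y[i] * (alpha[i] - ai) * K[i, i] - y[j] * (alpha[j] - aj) * K[i, j]
                b2 = b - Ej - y[i] * (alpha[i] - ai) * K[i, j] - y[j] * (alpha[j] - aj) * K[j, j]
                b = b1 if 0 < alpha[i] < C else b2 if 0 < alpha[j] < C else (b1 + b2) / 2
                changed += 1
        passes = passes + 1 if changed == 0 else 0   # repeat until convergence
    w = (alpha * y) @ X             # recover w for the linear kernel
    return w, b, alpha
```

Note how every error Ei is computed against a column of the kernel matrix K; this is why kernel evaluations dominate the sequential running time, as the next slides point out.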
§ PARALLEL IMPLEMENTATION
Why parallel?
• The program runs in single program, multiple data (SPMD) mode.
• In the sequential algorithm, a large portion of processing time is spent updating the kernel matrix,
• often in excess of 90% of the aggregate execution time.
• The kernel matrix update can be evaluated independently for each training data point,
• so we create a parallel system for updating the kernel matrix.
§ PARALLEL IMPLEMENTATION
• The whole training dataset is initially subdivided into smaller tasks.
• Each of the divided data partitions is disseminated to one CPU processor.
• Every processor updates a different partition of the kernel matrix based on the data assigned to it.
• With the kernel matrix update spread across all $p$ processors, the ideal parallel execution time would be $t_{\mathrm{parallel}} = t_{\mathrm{seq}} / p$. (See the sketch after this list.)
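As an illustration only (the slides show no code), here is a minimal SPMD sketch of the row-partitioned kernel matrix update using mpi4py; the toy dataset, its size, and the contiguous block partitioning are my own assumptions:

```python
# SPMD sketch of a row-partitioned kernel matrix update with mpi4py.
# Run with e.g.: mpiexec -n 4 python parallel_kernel.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, p = comm.Get_rank(), comm.Get_size()

# Toy dataset, replicated on every rank for simplicity.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 20))
n = X.shape[0]

# Each processor owns a contiguous block of rows of the kernel matrix.
rows = np.array_split(np.arange(n), p)[rank]
K_local = X[rows] @ X.T            # this rank's partition of K (linear kernel)

# Gather the row blocks on the master (rank 0) to form the full matrix.
blocks = comm.gather(K_local, root=0)
if rank == 0:
    K = np.vstack(blocks)
    print("kernel matrix shape:", K.shape)   # (1000, 1000)
```

The final gather is pure communication overhead, which is why the measured speedup stays below the ideal $t_{\mathrm{seq}}/p$ as more processors are added, as the conclusion later notes.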
§ PARALLEL IMPLEMENTATION
• Every processor calculates the kernel matrix from its assigned data
• and solves for the local support vectors and local values of w and b.
• The program is executed with an iterative MapReduce algorithm.
• The local attributes are then sent to the next stage, and after the final stage they are sent to the master.
• At the master, the global support vectors and the global values of w and b are generated. (A sketch of this merge scheme follows below.)
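As a rough illustration of this train-locally-then-merge idea (my own sketch, not the cited paper's MapReduce implementation), here is a cascade-style merge using scikit-learn's SVC as a stand-in local solver; it assumes every partition contains examples of both classes:

```python
# Sketch: train per-partition SVMs, keep only support vectors, merge
# pairwise stage by stage, and train the master's global model last.
import numpy as np
from sklearn.svm import SVC

def local_support_vectors(X, y, C=1.0):
    """Map step: train on one partition and keep only its support vectors."""
    clf = SVC(kernel="linear", C=C).fit(X, y)
    return X[clf.support_], y[clf.support_]

def cascade_svm(partitions, C=1.0):
    """Reduce steps: merge support-vector sets pairwise until one remains,
    then train the master model on the surviving support vectors."""
    sets = [local_support_vectors(X, y, C) for X, y in partitions]
    while len(sets) > 1:
        merged = []
        for k in range(0, len(sets) - 1, 2):
            Xm = np.vstack([sets[k][0], sets[k + 1][0]])
            ym = np.concatenate([sets[k][1], sets[k + 1][1]])
            merged.append(local_support_vectors(Xm, ym, C))
        if len(sets) % 2:                 # odd set passes through unchanged
            merged.append(sets[-1])
        sets = merged
    Xf, yf = sets[0]
    return SVC(kernel="linear", C=C).fit(Xf, yf)   # master's global model
```

The design intuition is that non-support-vectors can be discarded early, so each later stage trains on a much smaller set than the full data.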
Xu, Ke, Cui Wen, Qiong Yuan, Xiangzhu He, and Jun Tie. "A MapReduce Based Parallel SVM for Email Classification." Journal of Networks 9, no. 6 (2014): 1640–1647.
§ PARALLEL COMPUTING ARCHITECTURES
• GPU architectures are particularly suited for
1. compute-intensive,
2. memory-intensive,
3. highly parallel computations.
• GPU performance depends on discovering high degrees of parallelism.
• Machine learning algorithms need to expose such parallelism to make use of the many available cores.
§ PARALLEL COMPUTING ARCHITECTURES
A Few Architectures
• CUDA (Compute Unified Device Architecture)
o Developed by NVIDIA to support general-purpose computing on their GPUs.
o It primarily uses C/C++ syntax.
o Many toolboxes and plug-ins are available to help increase productivity. (A small illustration follows below.)
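For instance (my own illustration; CuPy is not mentioned in the slides), the kernel matrix maps naturally onto a CUDA GPU because all n×n entries are independent:

```python
# Sketch: offload the RBF kernel matrix to the GPU via CuPy,
# a CUDA-backed NumPy-like library.
import cupy as cp

def rbf_kernel_gpu(X, gamma=0.1):
    """Compute K[i, j] = exp(-gamma * ||x_i - x_j||^2) on the GPU."""
    Xg = cp.asarray(X)                    # host -> device copy
    sq = cp.sum(Xg * Xg, axis=1)          # squared norms, shape (n,)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Xg @ Xg.T)
    return cp.exp(-gamma * d2)            # result stays on the device
```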
§ RESULTS
Some parallelization results from "Parallel Software for Training Large Scale Support Vector Machines on Multiprocessor Systems"
§ RESULTS
Carpenter, Austin. "cuSVM: A CUDA Implementation of Support Vector Classification and Regression." patternsonscreen.net/cuSVMDesc.pdf (2009).
Dataset sizes (N = training examples, M = features):
• Adult: N = 32,561, M = 123
• Web: N = 49,749, M = 100
• MNIST: N = 60,000, M = 784
• Forest: N = 561,012, M = 54
§ RESULTS (without Regularization)
Source: Carpenter, "cuSVM" (2009).
| DATASET | # SVs (cuSVM) | # SVs (LIBSVM) | DIFFERENCE in b | TRAINING TIME (cuSVM) | TRAINING TIME (LIBSVM) | SPEEDUP |
|---------|---------------|----------------|-----------------|-----------------------|------------------------|---------|
| Adult   | 18,676        | 19,059         | 2.8×10⁻⁶        | 31.6 s                | 541.2 s                | 17.1    |
| Web     | 35,220        | 35,231         | 2.6×10⁻⁴        | 228.3 s               | 2,906.8 s              | 12.7    |
| MNIST   | 43,751        | 43,754         | 2.0×10⁻⁷        | 498.9 s               | 17,267 s               | 34.6    |
| Forest  | 270,305       | 270,304        | 8.0×10⁻³        | 2,016.4 s             | 29,494.3 s             | 14.1    |
§ RESULTS (with Regularization)
Source: Carpenter, "cuSVM" (2009).
| DATASET | # SVs (cuSVM) | # SVs (LIBSVM) | DIFFERENCE in b | TRAINING TIME (cuSVM) | TRAINING TIME (LIBSVM) | SPEEDUP |
|---------|---------------|----------------|-----------------|-----------------------|------------------------|---------|
| Adult   | 18,670        | 19,079         | 8.0×10⁻⁷        | 31.6 s                | 548.8 s                | 17.4    |
| Web     | 35,220        | 35,307         | 3.8×10⁻⁴        | 230.8 s               | 3,380.9 s              | 14.2    |
| MNIST   | 43,729        | 43,732         | 8.6×10⁻⁵        | 465.9 s               | 16,499 s               | 35.4    |
| Forest  | 42,284        | 42,104         | 4.2×10⁻⁴        | 254.9 s               | 18,519.2 s             | 72.6    |
§ CONCLUSION
• The parallel SVM utilizes multiple CPU processors to manage the computation of the kernel matrix
• by subdividing the whole training dataset into smaller datasets,
• with each of the divided datasets assigned to one CPU processor.
• Parallel SVMs improve overall performance by reducing the computation time at each stage of the parallel program.
• The parallel mode used is SPMD in MPI.
§ CONCLUSION
• The efficiency of the parallel SVM diminishes as the number of processors increases,
• because communication time grows with the utilization of more processors.
• In parallel SVMs, multiclass classification problems are performed in parallel by treating the classes one at a time.
• Even a dataset with 2 samples performs better with the parallel SVM than with the sequential SVM.
§ FUTURE WORK
• Performing multiclass classification problems in parallel,
• considering all the classes together.
• Extending parallel SVMs from classification to regression estimation
• by applying the same technique to SVM regression.
THANK YOU
QUESTIONS??