Presenter: Hung-Fu Li HPDS Lab. NKUAS 2009-12-31 1 vCUDA: GPU Accelerated High Performance Computing...

Presenter: Hung-Fu Li HPDS Lab. NKUAS vCUDA: GPU Accelerated High Performance Computing in Virtual Machines Lin Shi, Hao Chen and Jianhua Sun IEEE 2009

Page 1

Presenter: Hung-Fu Li

HPDS Lab. NKUAS

vCUDA: GPU Accelerated High Performance Computing in Virtual Machines

Lin Shi, Hao Chen and Jianhua Sun

IEEE 2009

Page 2

Lecture Outline

Abstract 3

Background 4

Motivation 5

CUDA Architecture 7

vCUDA Architecture 8

Experiment Result 13

Conclusion 19

Page 3

Abstract

This paper describes vCUDA, a GPGPU computing solution for virtual machines. The authors claim that API interception and redirection give applications transparent, high-performance access to the GPU. The paper also evaluates the performance overhead of the framework.

Page 4

Background

VM (Virtual Machine)

CUDA (Compute Unified Device Architecture)

API (Application Programming Interface)

API Interception, Redirection

RPC (Remote Procedure Call)

Page 5

Motivation

Virtualization may be the simplest solution for a heterogeneous computing environment.

Hardware varies by vendor, so VM developers should not have to implement a hardware driver for each device. (Because of licensing, vendors do not publish their driver source code or core techniques.)

Page 6

Motivation ( cont. )

Currently, virtualization supports only accelerated graphics APIs such as OpenGL (through VMGL), which are not intended for general-purpose computation.

Page 7

CUDA Architecture

Component Stack (top to bottom)

User Application

<<CUDA Extensions to C>>

CUDA Runtime API

CUDA Driver API

CUDA Driver

CUDA Enabled Device

Page 8

vCUDA Architecture

Split the stack into hard (hardware) and soft (software) bindings.

User Application

<<CUDA Extensions to C>>

CUDA Runtime API

CUDA Driver API

CUDA Driver

CUDA Enabled Device

hard binding

soft binding

Direct communication

Part of SDK

Page 9

vCUDA Architecture ( cont. )

Regroup the stack into a host side and a remote side.

CUDA Enabled Device

[v]CUDA Driver API

[v]CUDA Runtime API

CUDA Driver

User Application

<< CUDA Extensions to C>>

CUDA Driver API

Host binding

Remote binding (guest OS)

Part of SDK

[v]CUDA Enabled Device (vGPU)

Page 10

vCUDA Architecture ( cont. )

Use a fake API as an adapter between the instant driver and the virtual driver.

API Interception

Parameters passed

Order Semantics

Hardware State

Communication

Use Lazy-RPC transmission.

Use XML-RPC for high-level communication (to meet the cross-platform requirement).

[v]CUDA Driver API

[v]CUDA Runtime API

Remote binding (guest OS)

[v]CUDA Enabled Device (vGPU)
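The interception idea on this slide can be illustrated with a minimal Python sketch. It is not the vCUDA implementation: the class names FakeApi and HostStub, the call names mem_alloc and mem_copy, and the returned status dictionary are all hypothetical, and the host stub only records calls instead of invoking a real driver. It shows a guest-side fake API that captures each call's name and parameters, marshals them as XML-RPC, and lets the host side decode them in the order they were issued.

```python
import xmlrpc.client


class HostStub:
    """Host side (hypothetical): decodes requests in arrival order."""

    def __init__(self):
        self.log = []  # (method, params) pairs, in the order received

    def handle(self, request_xml):
        params, method = xmlrpc.client.loads(request_xml)
        self.log.append((method, params))
        # Assumption: a real stub would call the native library here.
        result = {"method": method, "status": 0}
        return xmlrpc.client.dumps((result,), methodresponse=True)


class FakeApi:
    """Guest side (hypothetical): every unknown attribute becomes a remote call."""

    def __init__(self, stub):
        self._stub = stub
        self.state = {"calls": 0}  # tiny stand-in for tracked hardware state

    def __getattr__(self, name):
        def intercepted(*args):
            # Marshal the call name and parameters as an XML-RPC request.
            request = xmlrpc.client.dumps(args, methodname=name)
            response = self._stub.handle(request)
            (result,), _ = xmlrpc.client.loads(response)
            self.state["calls"] += 1
            return result

        return intercepted


stub = HostStub()
api = FakeApi(stub)
api.mem_alloc(1024)
api.mem_copy(0, 1024, "host_to_device")
```

In this sketch the application keeps calling ordinary-looking functions, while the parameters, the call order, and a small piece of state are handled by the adapter.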

Page 11

vCUDA Architecture ( cont. )

Virtual Machine OS

Host OS

lazyRPC

Non-instant API

Instant API

Page 12

vCUDA Architecture ( cont. )

vCUDA API with virtual GPU

Lazy RPC

Reduce the overhead of switching between host OS and guest OS.

AP LazyRPC

vGPU

Hardware states

API Invocation

GPU

Instant API call

Non-instant API call

Non-instant package

Stub

vStub
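A minimal Python sketch of the lazy RPC idea on this slide, under stated assumptions: non-instant calls are queued on the guest side and sent as one package when an instant call arrives, so several API calls share a single guest/host switch. The class LazyRpcChannel, the function host_execute, and the call names are hypothetical, and which calls count as instant is chosen arbitrarily for illustration.

```python
class LazyRpcChannel:
    """Guest-side queue that batches non-instant calls (illustrative only)."""

    def __init__(self, send_batch, instant_calls):
        self._send_batch = send_batch       # one invocation = one guest/host switch
        self._instant = set(instant_calls)  # assumption: supplied by the caller
        self._pending = []
        self.round_trips = 0

    def call(self, name, *args):
        self._pending.append((name, args))
        if name not in self._instant:
            return None  # deferred: packaged with later calls
        # Instant call: flush everything queued so far, in the original order.
        batch, self._pending = self._pending, []
        self.round_trips += 1
        results = self._send_batch(batch)
        return results[-1]


def host_execute(batch):
    """Stand-in for the host stub; just reports each call as done."""
    return [f"{name} done" for name, _ in batch]


channel = LazyRpcChannel(host_execute, instant_calls={"read_back"})
channel.call("configure", 16)
channel.call("set_arg", 0, 42)
result = channel.call("read_back", 0)
```

Here three API calls cost one round trip instead of three, which is the saving the slide attributes to lazy RPC.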

Page 13

Experiment Result

Criteria

Performance

Lazy RPC and Concurrency

Suspend & Resume

Compatibility

Page 14

Experiment Result ( cont. )

Page 15

Experiment Result ( cont. )

Page 16

Experiment Result ( cont. )

Page 17

Experiment Result ( cont. )

Page 18

Experiment Result ( cont. )

MV: Matrix-Vector Multiplication Algorithm

StoreGPU: Exploiting Graphics Processing Units to Accelerate Distributed Storage Systems

MRRR: Multiple Relatively Robust Representations

GPUmg: Molecular Dynamics Simulation with GPU

Page 19

Conclusion

The authors have developed a CUDA interface for virtual machines that is compatible with the native interface. Data transmission is a significant bottleneck because of XML parsing in the RPC layer. This presentation has briefly covered the main architecture of vCUDA and the idea behind it. The architecture could be extended as a component or solution that lets cloud computing support GPUs.
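As a rough illustration of why XML-RPC can make data transmission expensive, the Python sketch below measures how much larger a binary buffer becomes once it is wrapped in an XML-RPC request. It assumes buffers travel as XML-RPC binary values, which are base64-encoded; the function name xmlrpc_size_ratio and the method name copy_to_device are hypothetical, and the sketch measures message size only, not parsing time or anything from the paper's experiments.

```python
import xmlrpc.client


def xmlrpc_size_ratio(payload, method="copy_to_device"):
    """Return encoded request length divided by raw payload length."""
    request = xmlrpc.client.dumps(
        (xmlrpc.client.Binary(payload),), methodname=method
    )
    return len(request) / len(payload)


# 4096 bytes standing in for a device buffer (arbitrary test data).
buffer = bytes(range(256)) * 16
ratio = xmlrpc_size_ratio(buffer)
```

Base64 alone expands the data by about one third, and the receiver must still parse the XML and decode the payload before the driver sees it.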

Page 20

End of Presentation

Thank you for listening.