High Performance User-Level Sockets over Gigabit Ethernet Pavan Balaji Ohio State University...



Page 1: High Performance User-Level Sockets over Gigabit Ethernet Pavan Balaji Ohio State University balaji@cis.ohio-state.edu Piyush Shivam Ohio State University.

High Performance User-Level Sockets over Gigabit Ethernet

Pavan Balaji
Ohio State University
balaji@cis.ohio-state.edu

Piyush Shivam
Ohio State University
[email protected]

D.K. Panda
Ohio State University
[email protected]

Pete Wyckoff
Ohio Supercomputer Center
[email protected]

Presentation Overview

Background and Motivation

Design Challenges

Performance Enhancement Techniques

Performance Results

Conclusions

Background and Motivation

• Sockets
  • Frequently used API
• Traditional Kernel-Based Implementation
  • Unable to exploit High Performance Networks
  • Earlier Solutions
    • Interrupt Coalescing
    • Checksum Offload
    • Insufficient
  • It gets worse with 10 Gigabit Networks
• Can we do better?
  • User-level support

Kernel Based Implementation of Sockets

Layers, top to bottom:
User Space: Application or Library
Kernel: Sockets, TCP, IP
Hardware: NIC

Pros
• High Compatibility

Cons
• Kernel Context Switches
• Multiple Copies
• CPU Resources

Alternative Implementations of Sockets (GigaNet cLAN)

Layers, top to bottom:
User Space: Application or Library
Kernel: Sockets, TCP, IP, IP-to-VI layer
Hardware: “VI aware” NIC

Pros
• High Compatibility

Cons
• Kernel Context Switches
• Multiple Copies
• CPU Resources

Sockets over User-Level Protocols

• Sockets is a generalized protocol
• Sockets over VIA
  • Developed by Intel Corporation [shah98] and ET Research Institute [sovia01]
  • GigaNet cLAN platform
• Most networks in the world are Ethernet
• Gigabit Ethernet
  • Backward compatible
  • Gigabit Network over the existing installation base
• MVIA: Version of VIA on Gigabit Ethernet
  • Kernel Based
• A need for a High Performance Sockets layer over Gigabit Ethernet

User-Level Protocol over Gigabit Ethernet

• Ethernet Message Passing (EMP) Protocol
  • Zero-Copy OS-Bypass NIC-driven User-Level protocol over Gigabit Ethernet
  • Developed over the Dual-processor Alteon NICs
  • Complete Offload of message passing functionality to the NIC

• Piyush Shivam, Pete Wyckoff, D.K. Panda, “EMP: Zero-Copy OS-bypass NIC-driven Gigabit Ethernet Message Passing”, Supercomputing, November '01

• Piyush Shivam, Pete Wyckoff, D.K. Panda, “Can User-Level Protocols take advantage of Multi-CPU NICs?”, IPDPS, April '02

EMP: Latency

[Figure: latency (us) versus message size (4 bytes to 4K bytes) for TCP and EMP]

A base latency of 28 us compared to ~120 us for TCP for 4-byte messages

EMP: Bandwidth

[Figure: bandwidth (Mbps) versus message size (4 bytes to 8K bytes) for EMP and TCP]

Saturated the Gigabit Ethernet network with a peak bandwidth of 964Mbps

Proposed Solution

Layers, top to bottom:
User Space: Application or Library, Sockets over EMP, EMP Library
Kernel: OS Agent
Hardware: Gigabit Ethernet NIC

• Kernel Context Switches
• Multiple Copies
• CPU Resources
• High Performance

Presentation Overview

Background and Motivation

Design Challenges

Performance Enhancement Techniques

Performance Results

Conclusions

Design Challenges

Functionality Mismatches

Connection Management

Message Passing

Resource Management

UNIX Sockets

Functionality Mismatches and Connection Management

• Functionality Mismatches
  • No API for buffer advertising in TCP
• Connection Management
  • Data Message Exchange
  • Descriptors required for connection management

Message Passing

• Data Streaming
  • Parts of the same message can be read potentially to different buffers
• Unexpected Message Arrivals
  • Separate Communication Thread
    • Keeps track of used descriptors and re-posts
    • Polling Threads have high Synchronization cost
    • Sleeping Threads involve OS scheduling granularity
  • Rendezvous Approach
  • Eager with Flow Control
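
The data streaming property (parts of one message read into different buffers) can be observed with ordinary stream sockets. The following is a minimal sketch in Python using the standard `socket.socketpair()`, not the EMP substrate described in these slides; the helper name `recv_exact` and the buffer sizes are illustrative assumptions.

```python
import socket


def recv_exact(sock, buf):
    """Fill buf completely from a stream socket, looping over short reads."""
    view = memoryview(buf)
    got = 0
    while got < len(buf):
        n = sock.recv_into(view[got:])
        if n == 0:
            raise ConnectionError("peer closed before the buffer was filled")
        got += n
    return got


sender, receiver = socket.socketpair()   # a connected pair of stream sockets
sender.sendall(b"HEADERpayload-bytes")   # one send call, 19 bytes

# The receiver reads the same message into two different buffers.
header = bytearray(6)
body = bytearray(13)
recv_exact(receiver, header)
recv_exact(receiver, body)

sender.close()
receiver.close()
```

A message-based substrate delivers each message into a single posted buffer, which is why a sockets layer built on top of it has to handle this splitting itself.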

Rendezvous Approach

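[Diagram: Sender and Receiver, each with an SQ and an RQ. send() on the Sender issues a Request; receive() on the Receiver returns an ACK; the Sender then transmits the Data.]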
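
The Request, ACK, Data ordering of the rendezvous approach can be traced with a small in-memory model. This is only a sketch in Python: the function name is hypothetical, and the idea that the Request carries the message size is an assumption made for illustration, not something stated in the slides.

```python
def rendezvous_transfer(payload):
    """Model one rendezvous exchange; returns (wire trace, delivered bytes)."""
    trace = []

    # 1. send(): the sender issues a Request (assumed here to carry the size).
    trace.append(("sender -> receiver", "Request", len(payload)))

    # 2. receive(): the receiver now knows the user buffer, posts it, and ACKs.
    user_buffer = bytearray(len(payload))
    trace.append(("receiver -> sender", "ACK", 0))

    # 3. Only after the ACK does the sender transmit the Data,
    #    which can then land directly in the posted user buffer.
    user_buffer[:] = payload
    trace.append(("sender -> receiver", "Data", len(payload)))

    return trace, bytes(user_buffer)


trace, delivered = rendezvous_transfer(b"hello, EMP")
for direction, kind, size in trace:
    print(f"{direction}: {kind} ({size} bytes)")
```

The model makes the trade-off visible: no intermediate copy is needed, but every message pays for two extra control messages before any data moves.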

Eager with Flow Control

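[Diagram: Sender and Receiver, each with an SQ and an RQ. Labels, in order: send(), Data, ACK, Data, receive(). Data is sent eagerly, without waiting for receive() to be called.]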

Resource Management and UNIX Sockets

• Resource Management
  • Clean up unused descriptors (connection management)
  • Free registered memory
• UNIX Sockets
  • Function Overriding
  • Application Changes
  • File Descriptor Tracking

Presentation Overview

Background and Motivation

Design Challenges

Performance Enhancement Techniques

Performance Results

Conclusions

Performance Enhancement Techniques

Credit Based Flow Control

Disabling Data Streaming

Delayed Acknowledgments

EMP Unexpected Queue

Credit Based Flow Control

[Diagram: Sender and Receiver, each with an SQ and an RQ. The credit count steps through Credits Left: 4, 3, 2, 1, 0 as messages are sent, then back to 4.]

• Multiple Outstanding Credits
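
The credit counter in the diagram can be modelled in a few lines. This is a minimal sketch in Python, assuming four credits as shown on the slide; the class and method names are hypothetical and are not taken from the EMP or sockets-over-EMP code.

```python
class CreditSender:
    """Sender side of credit-based flow control (illustrative model)."""

    def __init__(self, credits=4):
        self.max_credits = credits
        self.credits = credits  # one credit per buffer posted at the receiver

    def can_send(self):
        return self.credits > 0

    def send(self, message):
        if not self.can_send():
            raise BlockingIOError("no credits left: wait for an acknowledgment")
        self.credits -= 1       # the message consumes one receiver buffer
        return message

    def on_ack(self, returned):
        # The receiver has re-posted `returned` buffers; never exceed the maximum.
        self.credits = min(self.max_credits, self.credits + returned)


sender = CreditSender(credits=4)
history = [sender.credits]
for i in range(4):
    sender.send(f"msg{i}".encode())
    history.append(sender.credits)
sender.on_ack(4)                # one acknowledgment returns all four credits
history.append(sender.credits)
print(history)                  # [4, 3, 2, 1, 0, 4]
```

With several credits outstanding, the sender can keep transmitting without waiting on each message individually, and it stalls only when the counter reaches zero.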

Non-Data Streaming and Delayed Acknowledgments

• Disabling Data Streaming
  • Intermediate copy required for Data Streaming
  • Place data directly into user buffer
• Delayed Acknowledgments
  • Increase in Bandwidth
    • Less Network Traffic
    • NIC has less work to do
  • Decrease in Latency
    • Fewer descriptors posted
    • Less Tag Matching at the NIC
      • 550ns per descriptor
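
A rough way to see why delaying acknowledgments helps is to count them. The sketch below is plain arithmetic in Python, not a measurement: the credit size of 32, the policy of acknowledging once per half window, and the idea that each acknowledgment needs its own descriptor are all assumptions chosen for illustration. Only the 550ns per-descriptor figure comes from the slide.

```python
TAG_MATCH_NS = 550  # per-descriptor tag-matching cost quoted on the slide


def acks_sent(messages, ack_every):
    """Acknowledgments generated when one ACK covers `ack_every` messages."""
    return messages // ack_every


def tag_match_ns(descriptors):
    """Tag-matching time for `descriptors` descriptors at 550ns each."""
    return descriptors * TAG_MATCH_NS


CREDITS = 32                                  # assumed credit size
immediate = acks_sent(CREDITS, 1)             # one ACK per message
delayed = acks_sent(CREDITS, CREDITS // 2)    # one ACK per half window (assumed policy)
saved_ns = tag_match_ns(immediate - delayed)  # assumes one descriptor per ACK
```

Under these assumptions the acknowledgment count per window drops from 32 to 2. The actual saving depends on how many descriptors the NIC walks for each arrival, so the time figure should be read as an order-of-magnitude estimate only.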

EMP Unexpected Queue

• EMP Unexpected Queue
  • EMP features unexpected message queue
  • Advantage: Last to be checked
  • Disadvantage: Data Copy
• Acknowledgments in the Unexpected Queue
  • No copy, since acknowledgments carry no data
  • Acknowledgments pushed out of the critical path
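
The interaction between posted descriptors and the unexpected queue can be sketched with a toy tag matcher. This is an illustrative Python model, not EMP's interface: the class, its methods, and the use of strings as tags are all assumptions.

```python
from collections import deque


class TagMatcher:
    """Toy model of NIC tag matching with an unexpected-message queue."""

    def __init__(self):
        self.posted = []           # (tag, buffer) receive descriptors, in posting order
        self.unexpected = deque()  # arrivals that matched no posted descriptor

    def post_receive(self, tag, size):
        buf = bytearray(size)
        self.posted.append((tag, buf))
        return buf

    def arrive(self, tag, payload):
        for i, (want, buf) in enumerate(self.posted):
            if want == tag:
                n = min(len(buf), len(payload))
                buf[:n] = payload[:n]  # lands directly in the posted buffer
                del self.posted[i]
                return "posted"
        # No match: hold the message in a temporary buffer (a copy for data).
        self.unexpected.append((tag, bytes(payload)))
        return "unexpected"

    def poll_unexpected(self, tag):
        for i, (have, payload) in enumerate(self.unexpected):
            if have == tag:
                del self.unexpected[i]
                return payload
        return None


nic = TagMatcher()
data_buf = nic.post_receive(tag="data", size=5)
nic.arrive("data", b"hello")  # matches the posted descriptor
nic.arrive("ack", b"")        # no descriptor is posted for acknowledgments
```

Since an acknowledgment has an empty payload, parking it in the unexpected queue costs no data copy, and the list of posted descriptors that data messages are matched against stays short.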

Presentation Overview

Background and Motivation

Design Challenges

Performance Enhancement Techniques

Performance Results

Conclusions

Performance Results

• Micro-benchmarks
  • Latency (ping-pong)
  • Bandwidth
• FTP Application
• Web Server
  • HTTP/1.0 Specifications
  • HTTP/1.1 Specifications

Experimental Test-bed

Four Pentium III 700MHz Quads

1GB Main Memory

Alteon NICs

Packet Engine Switch

Linux version 2.4.18

Micro-benchmarks: Latency

[Figure: latency (us) versus message size (bytes) for TCP, Data Streaming, Non-Data Streaming and EMP]

Up to 4 times improvement compared to TCP

Overhead of 0.5us compared to EMP

Micro-benchmarks: Bandwidth

[Figure: bandwidth (Mbps) versus message size (bytes) for Data Streaming, Non-Data Streaming, TCP and Enhanced TCP]

An improvement of 53% compared to enhanced TCP

FTP Application

[Figure: transfer time (secs) versus file size (1 to 256 Mbytes) for Data Streaming, Non-Data Streaming and TCP]

Up to 2 times improvement compared to TCP

Web Server (HTTP/1.0)

[Figure: transactions per second versus response size (bytes) for TCP, Data Streaming and Non-Data Streaming]

Up to 6 times improvement compared to TCP

Web Server (HTTP/1.1)

[Figure: transactions per second versus response size (bytes) for TCP, Data Streaming and Non-Data Streaming]

Up to 3 times improvement compared to TCP

Conclusions

• Developed a High Performance User-Level Sockets implementation over Gigabit Ethernet
• Latency close to base EMP (28 us)
  • 28.5 us for Non-Data Streaming
  • 37 us for Data Streaming sockets
  • 4 times improvement in latency compared to TCP
• Peak Bandwidth of 840Mbps
  • 550Mbps obtained by TCP with increased Registered space for the kernel (up to 2MB)
  • Default case is 340Mbps with 32KB
  • Improvement of 53%
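
The headline ratios can be checked directly from the numbers quoted in the slides. The short Python calculation below uses only those figures (the ~120 us TCP latency is the approximate 4-byte value from the EMP latency slide); the variable names are illustrative.

```python
emp_latency_us = 28.0
sockets_latency_us = 28.5    # Non-Data Streaming sockets over EMP
tcp_latency_us = 120.0       # approximate TCP latency for 4-byte messages

sockets_bw_mbps = 840        # peak bandwidth of sockets over EMP
enhanced_tcp_bw_mbps = 550   # TCP with increased kernel space (up to 2MB)

overhead_us = sockets_latency_us - emp_latency_us
latency_ratio = tcp_latency_us / sockets_latency_us
bw_gain_pct = 100 * (sockets_bw_mbps - enhanced_tcp_bw_mbps) / enhanced_tcp_bw_mbps

print(f"overhead over EMP: {overhead_us:.1f} us")
print(f"latency vs TCP:    {latency_ratio:.1f}x")
print(f"bandwidth gain:    {bw_gain_pct:.0f}%")
```

The results are 0.5 us of overhead, roughly 4.2x lower latency than TCP, and a bandwidth gain of about 53%, which agrees with the figures stated in the conclusions.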

Conclusions (contd.)

• FTP Application shows an improvement of nearly 2 times
• Web Server shows tremendous performance improvement
  • HTTP/1.0 shows an improvement of up to 6 times
  • HTTP/1.1 shows an improvement of up to 3 times

Future Work

• Dynamic Credit Allocation
• NIC: The trusted component
  • Integrated QoS
  • Currently on Myrinet Clusters
• Commercial applications in the Data Center environment
• Extend the idea to next generation interconnects
  • InfiniBand
  • 10 Gigabit Ethernet

Thank You

For more information, please visit the NBC Home Page:
http://nowlab.cis.ohio-state.edu

Network Based Computing Laboratory,
The Ohio State University