Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

38
Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense) (Venkat)anathan Varadarajan, Thawan Kooburat, Benjamin Farley, Thomas Ristenpart, and Michael Swift 1 DEPARTMENT OF COMPUTER SCIENCES

description

Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense). (Venkat)anathan Varadarajan, Thawan Kooburat, Benjamin Farley, Thomas Ristenpart, and Michael Swift. Department of Computer Sciences. Public Clouds (EC2, Azure, Rackspace, …). Multi-tenancy - PowerPoint PPT Presentation

Transcript of Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

Page 1: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

1

Resource-Freeing Attacks:Improve Your Cloud Performance

(at Your Neighbor's Expense)

(Venkat)anathan Varadarajan, Thawan Kooburat,

Benjamin Farley, Thomas Ristenpart,

and Michael Swift

DEPARTMENT OF COMPUTER SCIENCES

Page 2: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

2

Public Clouds (EC2, Azure, Rackspace, …)

VM

Multi-tenancyDifferent customers’ virtual machines (VMs) share same server

Why multi-tenancy?Improved resource utilization

VM

VM

VM

VM

VM

VM

Page 3: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

Implications of Multi-tenancy

• VMs share many resources– CPU, cache, memory, disk, network, etc.

• Virtual Machine Managers (VMM) – Goal: Provide Isolation

• Deployed VMMs don’t perfectly isolate VMs– Side-channels [Ristenpart et al. ’09, Zhang et al. ’12]

3

Today: Performance degraded by other customers

VM

VM

VMM

Page 4: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

4

CPU Net Disk Cache0

100

200

300

400

500

600

Perfo

rman

ce D

egra

datio

n (%

)

Contention in Xen

Local Xen TestbedMachine Intel Xeon E5430,

2.66 GhzCPU 2 packages each

with 2 coresCache Size 6MB per package

VM

VM

Non-work-conserving CPU scheduling

Work-conservingscheduling

3x-6x Performance loss Higher cost

Page 5: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

This work: Greedy customer can recover performance by interfering with other tenants

Resource-Freeing Attack

What can a tenant do?

5

Pack up VM and move(See our SOCC 2012 paper)… but, not all workloads cheap to move

VM

VM

Ask provider for better isolation… requires overhaul of the cloud

Page 6: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

6

Resource-freeing attacks (RFAs)

• What is an RFA? • RFA case studies

1. Two highly loaded web server VMs2. Last Level Cache (LLC) bound VM and

highly loaded webserver VM• Demonstration on Amazon EC2

Page 7: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

7

The Setting

Victim:– One or more VMs– Public interface (eg, http)

Beneficiary:– VM whose performance we want

to improve

Helper:– Mounts the attack

Beneficiary and victim fighting over a target resource

Helper

VM

VM

Victim

Beneficiary

Page 8: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

Example: Network Contention

• Beneficiary & Victim– Apache webservers hosting static and

dynamic (CGI) web pages.• Target Resource: Network Bandwidth• Work-conserving scheduler– network bandwidth

8

Net

Clients

What can you do?

VictimBeneficiary

Loca

l Xen

Test

be

d

Page 9: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

Ways to Reduce Contention?

Break into victim VM and disable it

9

Net

Clients

Loca

l Xen

Test

be

d

But:• Requires knowledge of

vulnerability• Drastic• Easy to detect

Helper

VictimBeneficiary

The good:frees up resources used by victim

Page 10: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

Ways to Reduce Contention?

Do a simple DoS attack?

This may NOT free up target resources

10

Net

Clients

Loca

l Xen

Test

be

dBackfires: May increase the contention

Helper

SYN

floo

d

VictimBeneficiary

Page 11: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

11

Recipe for a Successful RFAShift resource away from the target resource towards the bottleneck resource

Shift resource usage via public interface

Proportion of Network usage

CPU intensive dynamic pages

Static pagesProp

ortio

n of

CPU

usa

ge

Push

tow

ards

CPU

bott

lene

ck

Reduce target resource usage

Limits

Page 12: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

12

An RFA in Our Example

Net

Helper

CGI R

eque

st

CPU Utilization

Clients

Result in our testbed:Increases beneficiary’s share of bandwidth

No RFA: 1800 page requests/secW/ RFA: 3026 page requests/sec

50% 85%share of

bandwidth

Page 13: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

13

Shared CPU Cache:– Ubiquitous: Almost all workloads need cache– Hardware controlled: Not easily isolated via

software– Performance Sensitive: High performance cost!

Resource-freeing attacks 1) Send targeted requests to victim 2) Shift resources use from target to a bottleneck

Can we mount RFAs when targetresource is CPU cache?

Page 14: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

14

Cache Contention

1000 2000 30000

50

100

150

200

250

Webserver Request Rate

Cach

e Pe

rform

ance

Deg

rada

tion

(%)

RFA Goal

Page 15: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

Case Study: Cache vs. Network

• Victim : Apache webserver hosting static and dynamic (CGI) web pages

• Beneficiary: Synthetic cache bound workload (LLCProbe)

• Target Resource: Cache• No cache isolation:– ~3x slower when sharing

cache with webserver

15

Net

Cache

$$$ Clients

Loca

l Xen

Test

be

d

VictimBeneficiary

CoreCore

Page 16: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

16

Net

Cache vs. NetworkVictim webserver frequently interrupts, pollutes the cache– Reason: Xen gives higher

priority to VM consuming less CPU time

Cache

Clients$$$

Cache state time line

Beneficiary starts to run

Core Core

decreased cache efficiency

Webserver receives a

request

Heavily loaded web server

cache state

Page 17: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

17

Net

Cache vs. Network w/ RFARFA helps in two ways:1. Webserver loses its

priority.2. Reducing the capacity

of webserver.Cache

Clients$$$

Cache state time line

Core Core

HelperHeavily loaded webserver requests under RFA

CGI R

eque

stBeneficiary starts to run

Webserver receives a

request

Heavily loaded web server

cache state

Page 18: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

18

RFA: Performance ImprovementRFA intensities – time in ms per second

196% slowdown

86% slowdown

60%Performance Improvement

Page 19: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

19

RFA Effect on InterruptionsBeneficiary: LLCProbe

40%

85%

x+

Page 20: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

20

RFA Effect on Victim’s capacity

Decreases with increasing RFA intensity

Page 21: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

21

Instance type m1.small

# of co-resident pairs 9 (23 total instances)

Machine type Intel Xeon E5507 with 4MB LLC

Experiments on Amazon EC2

VM

VM

VM

VM

VM

Multiple Accounts

Co-resident VMs from our accounts:Stand-ins for victim and beneficiary

Separate instances for helper and web clients

No direct interact with any other customersIndirect interaction akin to normal usage cases

VM

Page 22: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

LLCProbe Synthetic Benchmark

RFA improved performance of LLCProbe on all experimental EC2 instances!

Highest performance improvement of 13%,

recovering 33% of performance lost.

22

Average performance improvement: 6%

Page 23: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

23

mcf from SPEC-CPU

10% slowdown

6% slowdown

3% performance improvement = 35% reduction in performance loss

On average RFA improved performance across all SPEC workloads!

Page 24: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

24

Discussion: Practical Aspects

RFA case studies used CPU intensive CGI requests– Alternative: DoS vulnerabilities

(Eg. hash-collision attacks)Identifying co-resident victims– Easy on most clouds

(Co-resident VMs have predictable internal IP addresses)

No public interface? – Paper discusses possibilities for RFAs

VM

VM

Page 25: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

25

Conclusion

Resource-Freeing Attacks– Interfere with victim to shift

resource use – Proof-of-concept of efficacy in

public clouds

Open questions: – Other RFAs? – Countermeasures: Detection, stricter

isolation, smarter scheduling?

VM

VM

Page 26: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

26

References[MMSys10] Sean K. Barker and Prashant Shenoy. “Empirical evaluation of latency-sensitive application performance in the cloud.” In MMSys, 2010.

[Security10] Thomas Moscibroda and Onur Mutlu. “Memory performance attacks: Denial of memory service in multi-core systems.” In Usenix Security Symposium, 2007.

[CCS09] T. Ristenpart, E. Tromer, H. Shacham, and S. Savage. “Hey, you, get off my cloud: exploring information leakage in third party compute clouds.” In CCS, 2009.

Page 27: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

27

Backup Slides

Page 28: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

28

Discussion: Countermeasures

Detection?– May be hard to differentiate RFA from legitimate

Stricter Isolation?– Works but expensive

Contention-aware scheduling– Not yet used in public IaaS

Page 29: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

29

Discussion: Economies• Cost of RFA

– Helper instance, and– RFA traffic.

• Co-resident helper– An efficient implementation of helper can run inside the

attacker’s VM.– Current helper implementation consumes 15 Kbps of network

bandwidth and a CPU utilization of 0.7%.• Multiplex Singe Helper Instance for many beneficiaries.• Note: Currently, internal EC2 network traffic is free-of-

cost.

Page 30: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

30

Identifying Co-resident VMs

• Identifying the public interface:– Predictable numerical distance between internal

IP addresses in public clouds.– Identifying port used by the victim application

(standard ports like http(s), etc.).

Page 31: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

31

Experiment: Measuring Resource Contention

• Synthetic workloads

Page 32: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

32

Other RFAs

• RFAs are not limited to the presented case studies.

• LLC vs. Disk– Sending spurious, random disk requests

asynchronously to create a bottleneck for the shared disk resource.

• Memory vs. Disk– Similarly to the above RFA

Page 33: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

33

Discussion: More on Practical Aspects

• Work-conserving vs. Non-work-conserving schedulers– It is expected that public cloud environment manage

resources in a non-work-conserving fashion.– Eg. Net vs. Net RFA won’t work on Amazon EC2.

• Simulated client workload– What is the effect of RFA in the presence of multiple

independent client requests originating from numerous clients?

Page 34: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

34

N/WCore Core Core Core

cache

memory

Disk

Hypervisor

Dom0 Dom0 Dom0 Dom0

VM VM VM VM

VM VM VM VM

Xen Internals

• Domain-0– Privileged Domain, direct

access to I/O devices.– All I/O requests goes

through Dom-0• Xen scheduler internal– Boost priority for

interactive workloads

Inco

min

g re

ques

t

Page 35: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

35

Experiment: Measuring Resource Contention

• On a local Xen test bed

Loca

l Xen

Test

be

d

VM

N/WCore Core Core Core

VM

LLC

memory

Disk

VM VM VM

VMVM VM VM

Machine Intel Xeon E5430, 2.66 Ghz

Packages 2, 2 cores per package

LLC Size 6MB per package LLC

CPU Net Disk Memory Cache

0

100

200

300

400

500

600

CPUNetDiskMemoryCache

Conflicting Workloads

Perf

orm

ance

Deg

rada

tion

(%)

Observed Workloads:

Not all resources

conflict

Some have huge performance degradation

Page 36: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

36

Boost Priority and Interruptions

Victim: Webserver Beneficiary: LLCProbe

95%

< 30%

40%

85%

Fewer interruptions Higher cache efficiency

Page 37: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

37

Demonstration on EC2

• Problem #1: Achieving Co-residence– Launching multiple instances simultaneously from

two or more accounts.• Problem #2: Verifying Co-residency– Numerical distance between internal IP addresses

[CCS09].– Faster packet round-trip times.– Using resource contention experiments.

Page 38: Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)

38

Normalized Performance on EC2

Baseline

Higher is better

Aggregate performance

degradation is within 5 performance points

6%

On an average all SPEC workloads

benefitted from RFA