Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Non-Control-Data Attacks and Securing software by enforcing data-flow integrity Zhiqiang Lin Mar 28, 2007 CS590 paper presentation


Transcript of Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Page 1: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Zhiqiang Lin

Mar 28, 2007

CS590 paper presentation

Page 2: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Non-Control-Data Attacks Are Realistic Threats

Overview

Examples

Discussions

Data flow Integrity

Conclusions

Shuo Chen, Jun Xu, Emre C. Sezer, Prachi Gauriar, and Ravishankar K. Iyer

USENIX Security’05

Credit: most slides of this presentation come from Shuo Chen’s

Page 3: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Control Data Attack: Well-Known, Dominant

• Control data attack: corrupt function pointers, jump targets, and return addresses to run malicious code
– E.g., code injection, mimicry attack, and return-to-libc

• Currently the most dominant form of memory corruption attack [CERT and Microsoft Security Bulletin]
– Exploits many kinds of vulnerabilities, such as buffer overflow, format string bug, integer overflow, double free, etc.

Page 4: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Current defense techniques

• Enforce control data integrity to provide security.

[Figure: legal control flow]

Page 5: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Non-Control-Data Attack

• Non-control-data attacks: attacks that do not corrupt any control data
– I.e., attacks that preserve the control-flow integrity of the victim process

• Currently very rare in reality
– Very few instances documented in the literature.
– Several papers: theoretically possible to construct non-control-data attacks against synthetic programs.
– Not yet considered a serious threat.

• How applicable are such attacks against real-world software?
– Why are they rare: attackers’ incapability or lack of incentive?
– No focused investigation yet.

Page 6: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Motivating Facts

• Random hardware memory errors could subvert the security of real-world systems.
– Boneh and DeMillo: random errors allow deriving secret keys in CRT-based RSA implementations. [Eurocrypt’97]
– Our previous work: authentication of SSH and FTP servers and packet filtering of Linux firewalls can be compromised. [DSN’01 and DSN’02]
– Govindavajhala and Appel: the Java type system can be subverted. [S&P’03]
– None of them is a control-data attack. A wide range of real-world software is susceptible.

• Software vulnerabilities are more deterministic and more amenable to attacks.

• Many software vulnerabilities are essentially “memory fault injectors”: they allow overwriting an arbitrary memory location
– Heap overflow
– Double free
– Format string bug
– Integer overflow

Page 7: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

General Applicability of Non-Control-Data Attacks

• The claim:
– Many real-world software applications are susceptible to non-control-data attacks.
– The severity of the attack consequences is equivalent to that of control-data attacks.

• Goal of their paper
– Experimentally validate the claim
  • Construct non-control-data attacks to compromise the security of “representative” applications
– Discuss the implications of the claim for current defensive techniques
– Call for comprehensive defensive techniques

Page 8: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Realistic Non-Control-Data Attacks

Overview

Examples

Discussions

Data flow Integrity

Conclusions

Page 9: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Selection of Target Applications

• Real-world applications, not synthetic applications.

• Leading application categories
– CERT advisories (2000 – 2004)
  • 84% are server vulnerabilities
  • HTTP service (18%), database service (10%), remote login service (8%), mail service (5%), FTP service (4%).

• Selection criteria
– Different types of vulnerabilities should be covered
– Different types of server applications should be studied

• Practical constraints on the selection
– Uncertainties in many vulnerability reports: are they really exploitable?
– Proprietary source code
– Limited information about the details of many vulnerabilities

• Eventually, they selected
– Open-source FTP, SSH, Telnet, and HTTP servers
– Stack buffer overflow, format string, heap corruption, and integer overflow vulnerabilities.

Page 10: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

1. Non-Control-Data Attack against WU-FTPD Server (via a format string bug)

int x;

FTP_service(...) {
    authenticate();
    x = user ID of the authenticated user;
    seteuid(x);
    while (1) {
        get_FTP_command(...);
        if (a data command?)
            getdatasock(...);
    }
}

getdatasock( ... ) {
    seteuid(0);
    setsockopt( ... );
    seteuid(x);
}

Execution trace:

– authenticate(): x uninitialized, run as EUID 0
– x = user ID of the authenticated user: x=109, run as EUID 0
– seteuid(x): x=109, run as EUID 109. Lose the root privilege!
– Get a special SITE EXEC command. Exploit a format string vulnerability. x=0, still run as EUID 109.
– Get a data command (e.g., PUT)
– seteuid(0): x=0, run as EUID 0
– seteuid(x): x=0, run as EUID 0
– When returning to the service loop, the process still runs as EUID 0 (root). This allows us to upload /etc/passwd. We can grant ourselves the root privilege!

Only an integer is corrupted, so this is not a control-data attack.
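The WU-FTPD sequence on this slide can be replayed in a small simulation. The sketch below is not WU-FTPD code: `sim_euid`, `sim_seteuid`, and `run_session` are hypothetical names, the effective UID is modeled as a plain variable instead of real process state, and the format string exploit is reduced to a direct assignment to `x`.

```c
#include <assert.h>

/* Simulated process state; a real server would call seteuid(2). */
static int sim_euid = 0;   /* the daemon starts as root */
static int x;              /* cached user ID, alive for the whole session */

static void sim_seteuid(int uid) { sim_euid = uid; }

/* Mirrors getdatasock() on the slide: raise privilege, then restore from x. */
static void getdatasock(void) {
    sim_seteuid(0);        /* temporarily become root */
    /* setsockopt(...) would run here */
    sim_seteuid(x);        /* restore: trusts whatever x holds now */
}

/* Runs one session for user 109 and returns the EUID seen by the service
   loop after a data command. corrupt_x != 0 models the SITE EXEC overwrite. */
static int run_session(int corrupt_x) {
    sim_euid = 0;
    x = 109;               /* authenticated user ID */
    sim_seteuid(x);        /* drop root */
    assert(sim_euid == 109);
    if (corrupt_x)
        x = 0;             /* attacker overwrites the integer; EUID is still 109 */
    getdatasock();         /* data command such as PUT */
    return sim_euid;
}
```

In this model `run_session(0)` leaves the process at EUID 109, while `run_session(1)` leaves it at EUID 0 even though no return address or function pointer was touched.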

Page 11: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

2. Non-Control-Data Attack against NULL-HTTP Server (via a heap overflow bug)

• Attack the configuration string of the CGI-BIN path.

• Mechanism of CGI
– Suppose server name = www.foo.com and CGI-BIN = /usr/local/httpd/exe
– Requested URL = http://www.foo.com/cgi-bin/bar
– The server executes /usr/local/httpd/exe/bar

• Our attack
– Exploit the vulnerability to overwrite CGI-BIN to /bin
– Request URL http://www.foo.com/cgi-bin/sh
– The server executes /bin/sh

The server gives me a root shell! Only four characters in the CGI-BIN string are overwritten.
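The CGI mechanism on this slide amounts to joining a configuration string with the requested program name. The following is a minimal sketch of that idea, not NULL-HTTP's actual code: `cgi_bin`, `resolve_cgi`, and `corrupt_cgi_bin` are hypothetical names, and the heap overflow is replaced by a plain `strcpy` into the configuration buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Configuration string that the heap overflow targets (hypothetical layout). */
static char cgi_bin[32] = "/usr/local/httpd/exe";

/* Builds the program path for a URL path of the form "/cgi-bin/<name>".
   Returns 0 on success, -1 if the URL is not a CGI request or out is too small. */
static int resolve_cgi(const char *url_path, char *out, size_t out_len) {
    const char *prefix = "/cgi-bin/";
    size_t n = strlen(prefix);
    if (strncmp(url_path, prefix, n) != 0)
        return -1;
    int written = snprintf(out, out_len, "%s/%s", cgi_bin, url_path + n);
    return (written < 0 || (size_t)written >= out_len) ? -1 : 0;
}

/* Stand-in for the heap overflow: the attacker rewrites the config string. */
static void corrupt_cgi_bin(void) {
    strcpy(cgi_bin, "/bin");
}
```

Before the overwrite, `/cgi-bin/bar` resolves to `/usr/local/httpd/exe/bar`. After `corrupt_cgi_bin()`, `/cgi-bin/sh` resolves to `/bin/sh`, and the request itself still looks like an ordinary CGI URL.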

Page 12: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

3. Non-Control-Data Attack against SSH Communications SSH Server (via an integer overflow bug)

void do_authentication(char *user, ...) {
    int auth = 0;
    ...
    while (!auth) {
        /* Get a packet from the client */
        type = packet_read();
        switch (type) {
        ...
        case SSH_CMSG_AUTH_PASSWORD:
            if (auth_password(user, password))
                auth = 1;
        case ...
        }
        if (auth) break;
    }
    /* Perform session preparation. */
    do_authenticated(…);
}

auth = 0

auth = 0

Password incorrect, but auth = 1

auth = 1

Logged in without correct password

auth = 1

Page 13: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

4. More Non-Control-Data Attacks

• Against NetKit Telnet server (default Telnet server of Red Hat Linux)
– Exploit a heap overflow bug
– Overwrite two strings:
    /bin/login -h foo.com -p (normal scenario)
    /bin/sh -h -p -p (attack scenario)
– The server runs /bin/sh when it tries to authenticate the user.

• Against GazTek HTTP server
– Exploit a stack buffer overflow bug
  • Send a legitimate URL http://www.foo.com/cgi-bin/bar
  • The server checks that “/..” is not embedded in the URL
  • Exploit the bug to change the URL to http://www.foo.com/cgi-bin/../../../../bin/sh
  • The server executes /bin/sh
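The GazTek attack works because the URL is validated once and then used later, after the overflow has rewritten it. The sketch below models only that ordering. It is not GazTek's code: `is_safe_path` and `handle_request` are hypothetical names, and the stack overflow is simulated by copying a replacement path into the buffer after the check.

```c
#include <assert.h>
#include <string.h>

/* The validation step from the slide: reject any path containing "/..". */
static int is_safe_path(const char *path) {
    return strstr(path, "/..") == NULL;
}

/* Models the request handler. The check runs on the original bytes; if
   tampered_path is non-NULL it stands in for the overflow that rewrites
   the buffer after validation. Returns -1 if the request is rejected,
   0 if the path finally used stays inside cgi-bin, 1 if it escapes. */
static int handle_request(char *buf, size_t buf_len, const char *tampered_path) {
    if (!is_safe_path(buf))
        return -1;                       /* rejected at check time */
    if (tampered_path != NULL) {
        assert(strlen(tampered_path) < buf_len);
        strcpy(buf, tampered_path);      /* overwrite happens after the check */
    }
    return is_safe_path(buf) ? 0 : 1;    /* state of the path actually used */
}
```

A request that contains "/.." from the start is rejected, but a legitimate path that is rewritten after validation passes the check and still ends up pointing outside cgi-bin.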

Page 14: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

What Do Non-Control-Data Attacks Imply?

• Control-flow integrity is not a sufficiently accurate approximation of software security.

• Many types of non-control data are critical to security
– User identity data
– Configuration data
– User input data
– Decision-making data

• Once attackers have the incentive, they are likely to succeed in non-control-data attacks.

Page 15: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Securing software by enforcing data-flow integrity

Overview

Examples

Discussions

Data flow Integrity

Conclusions

Miguel Castro, Microsoft Research; Manuel Costa, Microsoft Research Cambridge; Tim Harris, Microsoft Research

OSDI’06

Page 16: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Motivation

• Most of the software in use today is written in C or C++. This body of software has a large number of defects, and there are many ways to exploit them, such as corrupting control data.

• Removing or avoiding all defects is hard. Although it is possible to prevent attacks based on control-data exploits, certain attacks can succeed without compromising control flow, in particular non-control-data attacks.

Page 17: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Basic Idea – Data Flow Integrity (DFI)

• A technique that computes a data-flow graph for a vulnerable program and instruments the program to ensure that the flow of data at runtime is allowed by the data-flow graph.

• It can be applied to existing C and C++ programs automatically, because it requires no modifications to them and does not generate false positives.

Page 18: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

DFI – High level Overview (1/2)

• Analysis Part
– Use reaching definitions analysis to compute a data-flow graph at compile time.
– For every load, compute the set of stores that may produce the loaded data.
– An ID is assigned to every store operation, and for each load the set of allowed IDs is computed.

In compiler theory, a reaching definition for a given instruction is an earlier instruction whose target variable may reach the given instruction without an intervening assignment.

Example 1: d1 is a reaching definition for d2.

d1 : y := 3
d2 : x := y

Example 2: d1 is not a reaching definition for d3, because d2 reassigns y first; only d2 reaches d3.

d1 : y := 3
d2 : y := 4
d3 : x := y
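Reaching definitions are usually computed with an iterative gen/kill data-flow analysis over the control-flow graph. The sketch below shows that textbook algorithm on bitmasks. It is not the Phoenix-based analysis from the paper, and `struct block`, `reaching_defs`, and the encoding (bit i stands for definition d(i+1)) are assumptions made for illustration.

```c
#include <assert.h>

#define MAX_BLOCKS 8

/* One basic block: bit i of gen/kill refers to definition d(i+1). */
struct block {
    unsigned gen;             /* definitions created in this block */
    unsigned kill;            /* other definitions of the same variable */
    int preds[MAX_BLOCKS];    /* indices of predecessor blocks */
    int npreds;
};

/* Iterates to a fixed point. in[b] receives the set of definitions that
   may reach the start of block b. */
static void reaching_defs(const struct block *blocks, int n, unsigned *in) {
    unsigned out[MAX_BLOCKS] = {0};
    int changed = 1;
    assert(n <= MAX_BLOCKS);
    for (int b = 0; b < n; b++)
        in[b] = 0;
    while (changed) {
        changed = 0;
        for (int b = 0; b < n; b++) {
            unsigned new_in = 0;
            for (int p = 0; p < blocks[b].npreds; p++)
                new_in |= out[blocks[b].preds[p]];   /* union over predecessors */
            unsigned new_out = blocks[b].gen | (new_in & ~blocks[b].kill);
            if (new_in != in[b] || new_out != out[b])
                changed = 1;
            in[b] = new_in;
            out[b] = new_out;
        }
    }
}
```

With d1 (y := 3) and d2 (y := 4) in consecutive blocks followed by a block that reads y, only d2 reaches the read. If d2 sits on just one branch of a conditional, both d1 and d2 reach the read after the join, which is the kind of set DFI would later allow for that load.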

Page 19: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

DFI – High level Overview (2/2)

• Enforcing Part (the results of the analysis are used to add run-time checks that enforce data-flow integrity)
– Stores are instrumented to write their ID into the runtime definition table (RDT). The RDT keeps track of the last store to write to each memory location.
– Loads are instrumented to check whether the store recorded in the RDT is in their set of allowed writes. If the store ID is not in the set, an exception is raised.

Page 20: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Example vulnerable code in C and its high-level intermediate representation

Phoenix compiler infrastructure

Page 21: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Static Analysis

• Compute reaching definitions using a combination of two analyses:
– flow-sensitive intra-procedural analysis
– flow-insensitive and context-insensitive inter-procedural analysis.

• Both operate on Phoenix's high-level intermediate representation.

The set of reaching definitions is {1,8} for both uses of authenticated (in lines 2 and 10).

Page 22: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Instrumentation

• SETDEF opnd id
• CHECKDEF opnd setName
– The first instruction sets the RDT entry for opnd to id.
– The second retrieves the runtime definition identifier for opnd from the RDT and checks if the identifier is in the reaching definitions set with name setName.
– The compiler maintains a map from set names to set values that is used when lowering CHECKDEF instructions to the assembly of the target machine.
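The behavior of SETDEF and CHECKDEF can be illustrated with a toy runtime definition table. This is a sketch under stated assumptions, not the paper's implementation: `memory`, `rdt`, `setdef`, `checkdef`, and `store` are hypothetical names, the table covers a fixed 16-slot array, identifiers are stored as 16-bit values, and a failed check returns 0 where real DFI would raise an exception.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_WORDS 16

static uint32_t memory[MEM_WORDS];  /* toy data memory, one slot per word */
static uint16_t rdt[MEM_WORDS];     /* last writer ID recorded for each slot */

/* SETDEF opnd id: record that definition `id` was the last to write `opnd`. */
static void setdef(size_t opnd, uint16_t id) {
    assert(opnd < MEM_WORDS);
    rdt[opnd] = id;
}

/* CHECKDEF opnd setName: return 1 if the last writer of `opnd` is in the
   statically computed set, 0 otherwise. */
static int checkdef(size_t opnd, const uint16_t *set, size_t set_len) {
    assert(opnd < MEM_WORDS);
    for (size_t i = 0; i < set_len; i++)
        if (set[i] == rdt[opnd])
            return 1;
    return 0;
}

/* Instrumented store: SETDEF followed by the write itself. */
static void store(size_t opnd, uint32_t value, uint16_t id) {
    setdef(opnd, id);
    memory[opnd] = value;
}
```

Taking the set {1,8} from the static analysis slide as the allowed writers of `authenticated`, a store tagged 1 or 8 passes the check on the next load, while a store tagged with any other ID (for example an overflowing copy elsewhere in the program) fails it even though the value written looks valid.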

Page 23: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Instrumented Example code

SETDEF opnd id
CHECKDEF opnd setName

Note: Every Store is instrumented for the check

Page 24: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Optimizations

• Renaming equivalent definitions
• Removing bounds checks on writes
• Removing SETDEFs and CHECKDEFs
• Optimizing membership checks
• Removing SETDEFs for safe definitions

Page 25: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity


Evaluation - Performance

Page 26: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity


Evaluation – space overhead

Page 27: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity


Evaluation - Performance

Page 28: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Evaluation – effectiveness against attacks

• Synthetic attacks
– Wilander’s buffer overflow testbed

• NullHttpd
– Corrupting cgi-bin configuration string

• SSH
– Overwrite a stack variable

• Stunnel
– A format string attack == control data attack

• No false positives

Page 29: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Overview

Examples

Discussions

Data flow Integrity

Conclusions

Page 30: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Discussions on Current Defensive Techniques

• Defenses based on control flow integrity
– Monitor system call sequences
– Protect control data
– Non-executable stack and heap

• Pointer encryption (PointGuard)
– Identifying pointers in low-level code is really challenging

• Address space randomization
– Challenge: need to randomize every program segment
– Limitation: a 32-bit address space cannot provide sufficient entropy

• Memory safety enforcement
– Promising direction, e.g., CCured, Cyclone, CRED
– Currently difficult to migrate existing large code bases to a memory-safe version. Incurs runtime overhead. Difficult to ensure memory safety for low-level code.

• Data flow integrity
– Efficient
– High performance overhead 1.5X-2.7
– Points-to analysis in inter-procedural analysis?

• Still open: how to design a generic and secure defense?

Page 31: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Mitigating Factors

• Requiring application-specific semantic knowledge
– Control-data attacks are unrelated to the semantics of the victim process (hijack the control flow, do whatever you like)
– Non-control-data attacks rely on the semantics of the victim process
– Not a fundamental constraint
  • Semantics of widely used applications will be well understood if attackers have strong incentives
  • The more instances attackers see, the more easily they can clone new ones. A matter of experience.

• Lifetime of security-critical data
– Attacks are not possible if the vulnerabilities exist outside the lifetime of the target data.
– Programs can be modified to reduce data lifetime to enhance security.

Page 32: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Reducing Data Lifetime for Security

Original WU-FTPD (lifetime of x is global)

siteexec() {
}

getdatasock() {
    seteuid(0);
    setsockopt( ... );
    seteuid(x);
}

Modified WU-FTPD

siteexec() {
}

getdatasock() {
    tmp = geteuid();    /* read the current EUID just before it is needed */
    seteuid(0);
    setsockopt( ... );
    seteuid(tmp);       /* restore from the short-lived copy, not from x */
}

Lifetime of seteuid() argument

Page 33: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Reducing Data Lifetime for Security

Original SSHD

do_authentication() {
    int auth = 0;
    while (!auth) {
        type = packet_read();
        switch (type) {
        case CMSG_AUTH_PASSWORD:
            if (auth_password(passwd))
                auth = 1;
        case ...
        }
        if (auth) break;
    }
    do_authenticated(pw);
}

Modified SSHD

do_authentication() {
    int auth = 0;
    while (!auth) {
        type = packet_read();
        auth = 0;    /* reset after the read, so an earlier overwrite is discarded */
        switch (type) {
        case CMSG_AUTH_PASSWORD:
            if (auth_password(passwd))
                auth = 1;
        case ...
        }
        if (auth) break;
    }
    do_authenticated(pw);
}

Lifetime of auth flag
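The effect of shortening the lifetime of the auth flag can be checked with a small model of both loops. This is a sketch and not SSHD code: `packet_read` here is a hypothetical stand-in that corrupts the flag while reading the first packet, `auth_password` always fails because the attacker does not know the password, and the loop is bounded by `max_packets` so that the modified version terminates.

```c
#include <assert.h>

enum { PKT_PASSWORD = 1 };

/* The attacker never supplies a correct password in this model. */
static int auth_password(void) { return 0; }

/* Stand-in for packet_read(): processing the first packet triggers the
   memory corruption and sets the victim flag to 1. */
static int packet_read(int *victim, int packet_index) {
    if (packet_index == 0)
        *victim = 1;
    return PKT_PASSWORD;
}

/* Returns 1 if the client ends up authenticated, 0 otherwise.
   reset_after_read selects the modified loop from the slide. */
static int do_authentication(int reset_after_read, int max_packets) {
    int auth = 0;
    assert(max_packets > 0);
    for (int i = 0; i < max_packets && !auth; i++) {
        int type = packet_read(&auth, i);
        if (reset_after_read)
            auth = 0;       /* modified SSHD: discard anything written earlier */
        if (type == PKT_PASSWORD && auth_password())
            auth = 1;
    }
    return auth;
}
```

With the original loop, `do_authentication(0, 3)` reports a successful login although every password check failed. With the reset in place, `do_authentication(1, 3)` does not, because the corrupted value never survives until the `if (auth)` test.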

Page 34: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Overview

Examples

Discussions

Data flow Integrity

Conclusions

Page 35: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Conclusions

• Many real-world software applications are susceptible to attacks that do not hijack program control flow.

• Constructing a generic and secure defensive technique to defeat both control-data attacks and non-control-data attacks is still an open problem. (DFI is the best so far?)

Page 36: Non-Control-Data Attacks and Securing software by enforcing data-flow integrity

Conclusions

• Other possible methods:
– “Reducing data lifetime is a secure programming practice to increase software resilience to attacks.”
– …