The Interface Definition Language for Fail-Safe C
Kohei Suenaga, Yutaka Oiwa, Eijiro Sumii, Akinori Yonezawa
University of Tokyo
International Symposium on Software Security
In this presentation…
We introduce the IDL for Fail-Safe C
With our IDL, we can…
  Easily generate wrappers for external functions
  Safely interface Fail-Safe C with external functions
Our approach can be applied to other safe languages
Background
Fail-Safe C
Safe implementation of C
Translates C sources to fail-safe ones
Inserts safety checks such as boundary checks
Ensures safety focusing on types of objects
Prevents programs from performing unsafe operations
Problems of Fail-Safe C
Cooperation with external functions
  Data representation problem
    Fail-Safe C uses its own data representation
    Cannot call external functions directly
  Safety problem
    Many external functions require preconditions for safety
Solution
Prepare a wrapper for each function
  Checks preconditions, converts representation, …
We want to generate such wrappers automatically
Approach
Interface Definition Language (IDL)
  Describe preconditions and behavior of external functions with the IDL
[Figure: the IDL processor generates wrappers from an interface definition. A call to memcpy(…) in main goes through the wrapper, which checks preconditions, converts arguments, calls memcpy(…), and converts the return value.]
Outline of the Presentation
The safety that Fail-Safe C and our IDL guarantee
Internal data representation of Fail-Safe C
Wrappers’ behavior
Experiment
Related work
Future work
The Safety Fail-Safe C Guarantees
If a program attempts to perform undefined behavior, Fail-Safe C aborts the program
before the operation is performed
Not fully formal, but sufficient for our aim
The Safety Fail-Safe C Guarantees
Usual C compiler
void strcpy(char *s1, char *s2) {
    while (*s1++ = *s2++)
        ;
}
Out-of-bound access may occur here (in the while loop)
The result of out-of-bound access is undefined
The Safety Fail-Safe C Guarantees
Fail-Safe C compiler
void strcpy(char *s1, char *s2) {
    while (*s1++ = *s2++)
        ;
}
The program attempts to perform out-of-bound access
Fail-Safe C aborts the program
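The effect of the inserted checks can be sketched as follows. This is only an illustration: the `checked_buf` type and the helper names are hypothetical, and the code is not what the Fail-Safe C compiler actually generates. It shows a bounds check placed in front of every read and write, so that the program aborts before an out-of-bound access is performed.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical: a buffer that carries its own size (not Fail-Safe C's real layout). */
typedef struct {
    char  *data;
    size_t size;
} checked_buf;

/* Returns 1 if index i lies inside the buffer, 0 otherwise. */
int in_bounds(const checked_buf *b, size_t i) {
    return i < b->size;
}

/* Aborts the program before an out-of-bound access would be performed. */
void check_or_abort(const checked_buf *b, size_t i) {
    if (!in_bounds(b, i)) {
        fprintf(stderr, "out-of-bound access at offset %zu\n", i);
        abort();
    }
}

/* strcpy with a check in front of every read and every write. */
void checked_strcpy(checked_buf *dst, const checked_buf *src) {
    size_t i = 0;
    for (;;) {
        check_or_abort(src, i);
        check_or_abort(dst, i);
        dst->data[i] = src->data[i];
        if (src->data[i] == '\0')
            break;
        i++;
    }
}
```

Copying a 4-byte string into an 8-byte buffer succeeds; copying it into a 2-byte buffer aborts at offset 2 instead of writing past the end.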
The Safety Fail-Safe C Guarantees (again)
If a program attempts to perform undefined behavior, Fail-Safe C aborts the program
before the operation is performed
Not fully formal, but sufficient for our aim
The Safety our IDL Guarantees
Two assumptions
  1. Fail-Safe C does not contain bugs
       The safety of Fail-Safe C holds just before wrappers are called
  2. Interface definitions correctly reflect the implementation of external functions
If these two assumptions hold, our IDL guarantees that Fail-Safe C’s safety holds after external functions return
Internal Data Representation of Fail-Safe C
Fail-Safe C differs from usual C in representation of…
  Memory blocks
  Pointers
  Integers
Internal Data Representation of Fail-Safe C
Every memory block has a header
[Figure: a memory block made of a Header (TypeInfo, size) followed by Data]
Internal Data Representation of Fail-Safe C
Every pointer is represented in 2 words
[Figure: a pointer as a (base, offset) pair referring to a memory block with TypeInfo, size and Data]
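The header and the 2-word pointer can be pictured with the C declarations below. This is a sketch under assumptions: the struct and field names are hypothetical, the data area is assumed to follow the header directly, and the real TypeInfo holds more than a name. It shows why a (base, offset) pair is enough to decide whether an access stays inside its block.

```c
#include <stddef.h>

/* Hypothetical type descriptor; the real TypeInfo holds more than a name. */
typedef struct type_info {
    const char *type_name;
} type_info;

/* Every memory block starts with a header, followed by its data. */
typedef struct block_header {
    const type_info *tinfo;
    size_t           size;    /* size of the data area in bytes */
} block_header;

/* A pointer is two words: the block it refers to and an offset into it. */
typedef struct fat_ptr {
    block_header *base;
    size_t        offset;
} fat_ptr;

/* Assumption of this sketch: the data area follows the header directly. */
char *block_data(block_header *h) {
    return (char *)(h + 1);
}

/* True if accessing len bytes through p stays inside its block. */
int fat_ptr_accessible(fat_ptr p, size_t len) {
    return p.base != NULL
        && p.offset <= p.base->size
        && len <= p.base->size - p.offset;
}

/* Pointer arithmetic changes only the offset; the base is kept. */
fat_ptr fat_ptr_add(fat_ptr p, size_t delta) {
    p.offset += delta;
    return p;
}
```

Because the base never changes under pointer arithmetic, the size stored in the header is always reachable for the bounds check, however far the offset has moved.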
Internal Data Representation of Fail-Safe C
Every integer is also represented in 2 words (pointers may be cast to integers)
[Figure: a (base, offset) pointer cast to an integer]
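One way to see why integers need 2 words is the sketch below. It rests on assumptions that the slides do not state: a plain integer is stored with base 0, and the numeric value of a 2-word value is base + offset. The type and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical 2-word value shared by pointers and integers. */
typedef struct fat_value {
    uintptr_t base;     /* assumed to be 0 for a plain integer */
    uintptr_t offset;
} fat_value;

/* A plain integer carries no base. */
fat_value fat_from_int(uintptr_t n) {
    return (fat_value){ 0, n };
}

/* Casting a pointer to an integer keeps both words, so nothing is lost. */
fat_value ptr_to_int(fat_value p) {
    return p;
}

/* The numeric value the program observes (assumed: base + offset). */
uintptr_t fat_int_value(fat_value v) {
    return v.base + v.offset;
}

/* A cast back to a pointer is meaningful only if a base was kept. */
int fat_is_pointer(fat_value v) {
    return v.base != 0;
}
```

With this layout a pointer that has been cast to an integer and back still knows which block it belongs to, which is what the bounds check needs.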
Summary of Data Representation
Pointers
  Represented in 2 words
Integers
  Represented in 2 words
Memory blocks
  Have metadata
  Contents are in Fail-Safe C’s representation
Wrappers have to convert the representation of arguments
Behavior of the Wrapper of memcpy
char *memcpy(char *s1, char *s2, int n);
Behavior of the Wrapper of memcpy
char *memcpy(char *s1, char *s2, int n);
Preconditions
  n > 0
  1. s1 != NULL, s2 != NULL
  2. The first n bytes of each memory block have to be accessible
  The n bytes from s1 and the n bytes from s2 must not overlap
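The preconditions listed above can be written as one predicate. This is a minimal sketch, not generated wrapper code: the `arg_view` type (a native address plus the number of bytes accessible from it) and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a pointer argument. */
typedef struct arg_view {
    const char *addr;    /* NULL if the pointer is null */
    size_t      avail;   /* bytes accessible starting at addr */
} arg_view;

/* Do the ranges [a, a+n) and [b, b+n) share at least one byte? */
int ranges_overlap(const char *a, const char *b, size_t n) {
    uintptr_t x = (uintptr_t)a, y = (uintptr_t)b;
    return x < y ? (y - x < n) : (x - y < n);
}

/* Returns 1 if every precondition of memcpy from the slide holds. */
int memcpy_preconditions_hold(arg_view s1, arg_view s2, int n) {
    if (n <= 0)
        return 0;                                    /* n > 0 */
    if (s1.addr == NULL || s2.addr == NULL)
        return 0;                                    /* never null */
    if (s1.avail < (size_t)n || s2.avail < (size_t)n)
        return 0;                                    /* first n bytes accessible */
    if (ranges_overlap(s1.addr, s2.addr, (size_t)n))
        return 0;                                    /* no overlap */
    return 1;
}
```

A wrapper would abort the program when this predicate returns 0, before memcpy is called.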
Behavior of the Wrapper of memcpy
char *memcpy(char *s1, char *s2, int n);
Converting representation (before call)
  2-word repr. → 1-word repr.
  1. Allocates a new memory block in C’s image
  2. Copies the contents to it, converting the representation
Behavior of the Wrapper of memcpy
char *memcpy(char *s1, char *s2, int n);
Converting representation (after return)
  1. Writes back updates of the memory block
  2. Deallocates the newly allocated memory block
Encodes the return value, maintaining the distance from s1
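Put together, the behavior of a memcpy wrapper might look like the sketch below. The `block` and `fat_ptr` types are simplified, hypothetical stand-ins for Fail-Safe C's representation, and the function returns -1 where a real wrapper would abort the program. For character data the conversion between representations is assumed to be a plain byte copy.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified memory block and 2-word pointer. */
typedef struct block   { size_t size; char *data; } block;
typedef struct fat_ptr { block *base; size_t offset; } fat_ptr;

/* Returns 0 and stores the encoded result in *ret, or -1 on a violation. */
int wrapped_memcpy(fat_ptr s1, fat_ptr s2, int n, fat_ptr *ret) {
    /* Precondition checking. */
    if (n <= 0 || s1.base == NULL || s2.base == NULL)
        return -1;
    size_t len = (size_t)n;
    if (s1.offset > s1.base->size || len > s1.base->size - s1.offset)
        return -1;
    if (s2.offset > s2.base->size || len > s2.base->size - s2.offset)
        return -1;
    if (s1.base == s2.base) {
        size_t d = s1.offset > s2.offset ? s1.offset - s2.offset
                                         : s2.offset - s1.offset;
        if (d < len)
            return -1;                       /* ranges overlap */
    }

    /* Before the call: allocate blocks in C's image and copy contents. */
    char *n1 = malloc(len);
    char *n2 = malloc(len);
    if (n1 == NULL || n2 == NULL) {
        free(n1);
        free(n2);
        return -1;
    }
    memcpy(n1, s1.base->data + s1.offset, len);
    memcpy(n2, s2.base->data + s2.offset, len);

    /* Call the external function on the 1-word representation. */
    char *r = memcpy(n1, n2, len);

    /* After the return: write back the update of s1's block. */
    memcpy(s1.base->data + s1.offset, n1, len);

    /* Encode the return value, keeping its distance from s1. */
    ret->base   = s1.base;
    ret->offset = s1.offset + (size_t)(r - n1);

    free(n1);
    free(n2);
    return 0;
}
```

The two extra allocations and copies in this sketch correspond to the Decoding and Allocation phase whose cost is measured in the experiments.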
Behavior of Wrappers
1. Precondition Checking
2. Decoding and Allocation
     Integers are converted to 1-word repr.
     Allocates a memory block and copies to it for each pointer-type argument
3. Call
     Safe if the assumptions stated earlier hold
Behavior of Wrappers
4. Encoding and Deallocation
     Converts the return value to Fail-Safe C’s repr.
     Reflects updates of passed memory blocks
     Reflects updates of global variables (Fail-Safe C allocates two regions for each global variable)
     Deallocates memory blocks allocated in the wrapper
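For a function that takes and returns only integers, the four phases reduce to very little work. The sketch below assumes a hypothetical 2-word integer whose value is base + offset, and uses a local `succ` as a stand-in for an external function compiled by a usual C compiler; neither is taken from the Fail-Safe C implementation.

```c
#include <stdint.h>

/* Hypothetical 2-word integer; its value is assumed to be base + offset. */
typedef struct fat_int {
    uintptr_t base;
    uintptr_t offset;
} fat_int;

/* Stand-in for an external function; overflow at INT_MAX is not handled here. */
int succ(int x) {
    return x + 1;
}

fat_int wrapped_succ(fat_int arg) {
    /* 1. Precondition checking: none is declared for succ in this sketch. */

    /* 2. Decoding: 2-word repr. to 1-word repr. No pointer arguments,
          so no memory block has to be allocated. */
    int native = (int)(arg.base + arg.offset);

    /* 3. Call. */
    int result = succ(native);

    /* 4. Encoding: the result is a plain integer, so the base is 0.
          Nothing was allocated, so nothing is deallocated. */
    fat_int out = { 0, (uintptr_t)result };
    return out;
}
```

Only phases 2 and 4 do any conversion here, and neither touches the heap.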
An Example of Interface Definition
Add supplemental information to C’s declaration in the form of attributes
[points_in(s1)]
char *memcpy([never_null, can_access(0, n-1), write(true, 0, n-1)] char *s1,
             [never_null, can_access(0, n-1)] const char *s2,
             int n)
    [precond(n > 0), no_overlap(s1, 0, n-1, s2, 0, n-1)];
Experiments
Measured overhead with four micro-benchmarks
In each benchmark, we used…
  Programs compiled with Fail-Safe C that used wrappers to call external functions
  Programs compiled with gcc that called wrappers directly
Environment: UltraSPARC-II 400 MHz CPU, 13.0 GB RAM
Benchmark 1: succ
Takes one integer, adds 1 to it and returns it
Measured time spent in calling this function 10^7 times as an external function
Gives information about the overhead of converting integers
Benchmark 2: arraysucc
Takes an array of 10^7 characters
Measured time spent in calling this function once as an external function
Gives information about the overhead of converting pointer-type arguments
Benchmark 3: cp
File-copying program that uses open, read and write system calls
Measured time spent in copying a 100K-byte file
Gives information about the overhead of fairly practical programs
Benchmark 4: echo
A simple echo server that uses socket, bind, listen, accept, recv and send system calls (with 1K-byte buffer)
Measured time spent in sending/receiving 100K-byte data between two machines connected with 100BASE-T
Gives information about the overhead in the presence of network delay
Overall Result
The overhead of memory allocation is large
Execution time (msec)
                    succ   arraysucc   cp    echo
With wrappers       234    597         144   2498
Without wrappers    220    200         91    2494
Overhead (%)        6      199         58    0.16
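The overhead row follows from the two rows of execution times. A short sketch of the arithmetic, with a hypothetical helper name:

```c
/* Relative overhead in percent: (with - without) / without * 100. */
double overhead_percent(double with_wrappers, double without_wrappers) {
    return (with_wrappers - without_wrappers) / without_wrappers * 100.0;
}
```

For arraysucc, (597 - 200) / 200 * 100 = 198.5, which the table reports as 199; for echo, (2498 - 2494) / 2494 * 100 is about 0.16.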
Breakdown of arraysucc
Time spent in each phase of arraysucc
                  Pre   Decode   Call   Encode   Total
Time (msec)       0     326      110    160      596
Proportion (%)    0     54.7     18.5   26.8
More than half of the wrapper’s execution time is spent in Decoding and Allocation
Breakdown of cp
Execution time of each phase of read/write’s wrapper
         Pre   Decode   Call   Encode   Total
read     1     16       46     14       77
write    1     16       73     4        94
Call phase takes most of the time due to file access.
However, the Decoding and Allocation phase takes a large share of the wrapper’s execution time
Overall Result (again)
Execution time (msec)
                    succ   arraysucc   cp    echo
With wrappers       234    597         144   2498
Without wrappers    220    200         91    2494
Overhead (%)        6      199         58    0.16
The overhead of echo is very small
Breakdown of echo
Time spent in each phase of recv/send system calls
        Pre/Decode   Call   Encode   Total
recv    1            2485   1        2487
send    1            4      0        5
In the presence of network delay, wrappers’ overhead is relatively small
Discussion
The overhead of Decoding and Allocation is dominant
To reduce this overhead…
  Omit copying contents of a memory block if the block is not read
  Omit allocating a new memory block if the block has the same image as usual C’s one
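The two proposed optimizations could take the following shape in the decoding step. This is a speculative sketch: the `arg_info` flags and the `decode_block` function are hypothetical, and how an IDL processor would derive the flags is not specified here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-argument facts known when the wrapper is generated. */
typedef struct arg_info {
    int is_read;       /* the external function reads the block */
    int same_image;    /* the block already has usual C's image */
} arg_info;

/* Returns the buffer to pass to the external function.
   *owned is set to 1 if the caller must free the result. */
char *decode_block(char *fs_data, size_t len, arg_info info, int *owned) {
    *owned = 0;
    if (info.same_image)
        return fs_data;                 /* no allocation, no copy */

    char *native = malloc(len);
    if (native == NULL)
        return NULL;
    *owned = 1;
    if (info.is_read)
        memcpy(native, fs_data, len);   /* copy only if contents are read */
    return native;
}
```

In the write-only case the allocation remains but the copy is skipped; in the same-image case both are skipped.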
Related Work
CamlIDL [Leroy 01], H/Direct [Finne et al. 98]
  IDL for OCaml and Haskell
  The syntax of our IDL is based on CamlIDL
  Pay less attention to safety
    External function call is not always safe if preconditions do not hold
Related Work
CCured [Necula et al. 02, Condit et al. 03]
  Analyses pointer usage statically and cuts off unnecessary safety checks
  Two ways to call external functions
    Tell the compiler which functions are external ones
      Cannot check preconditions, cannot deal with memory blocks allocated in external functions
    Provide wrappers for each external function
      Have to write wrappers manually
Future Work
Implementing optimizations
Applying our approach to existing programs
  As soon as the implementation of Fail-Safe C is completed
  Target: sendmail
Conclusion
We designed an IDL for Fail-Safe C to call external functions safely
Fin
Internal Data Representation of Fail-Safe C
TypeInfo contains type name and handler methods
[Figure: a memory block with size, TypeInfo and Data; TypeInfo holds the type name and the handler methods read_word() and write_word()]
Handler methods: functions that access memory according to the memory block’s type
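A TypeInfo with handler methods can be modeled as a struct of function pointers. The sketch below is an assumption-laden illustration: the signatures of `read_word` and `write_word` and the `raw` type are hypothetical, and only the two method names come from the slide.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical TypeInfo: a type name plus handler methods. */
typedef struct type_info {
    const char *type_name;
    uintptr_t (*read_word)(const char *data, size_t offset);
    void      (*write_word)(char *data, size_t offset, uintptr_t value);
} type_info;

/* Handlers for a block that stores plain words. */
uintptr_t raw_read_word(const char *data, size_t offset) {
    uintptr_t v;
    memcpy(&v, data + offset, sizeof v);   /* avoids unaligned access */
    return v;
}

void raw_write_word(char *data, size_t offset, uintptr_t value) {
    memcpy(data + offset, &value, sizeof value);
}

/* One TypeInfo instance is shared by every block of this type. */
const type_info raw_type = { "raw", raw_read_word, raw_write_word };
```

Code that holds only a block's TypeInfo can then read and write the block through `read_word` and `write_word` without knowing its concrete type.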
Why don’t we need to check postconditions?