Finding software bugs with the Clang Static Analyzer

Ted Kremenek, Apple Inc.

Finding Bugs with Compiler Techniques

Compile-time warnings

% clang t.c
t.c:38:13: warning: invalid conversion '%lb'
  printf("%s%lb%d", "unix", 10, 20);
         ~~~~^~~~~

Static Analysis

• Checking performed by compiler warnings is inherently limited
• Find path-specific bugs
• Deeper bugs: memory leaks, buffer overruns, logic errors
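
As an illustration of a path-specific bug, here is a minimal sketch (not from the talk) of a leak that happens only on an early-return path. The names tracked_alloc, tracked_free, live_blocks and copy_if_short are hypothetical, and the counter exists only so the leak can be observed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping: number of blocks allocated but not freed. */
static int live_blocks = 0;

static void *tracked_alloc(size_t n) {
  void *p = malloc(n);
  if (p)
    ++live_blocks;
  return p;
}

static void tracked_free(void *p) {
  if (p) {
    assert(live_blocks > 0);
    --live_blocks;
    free(p);
  }
}

/* Copies src into a temporary buffer and returns its length, or -1 if
   src is longer than limit (or allocation fails). The "too long" return
   forgets to free buf, so the leak exists only on that path. */
int copy_if_short(const char *src, size_t limit) {
  size_t len = strlen(src);
  char *buf = tracked_alloc(limit + 1);
  if (!buf)
    return -1;
  if (len > limit)
    return -1;                 /* BUG: buf is leaked on this path */
  memcpy(buf, src, len + 1);
  tracked_free(buf);
  return (int)len;
}
```

Calling copy_if_short with a short string leaves live_blocks unchanged. A string longer than limit leaves one block behind, which a per-statement warning has no way to notice.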

Benefits of Static Analysis

Early discovery of bugs

• Find bugs early, while the developer is hacking on their code
• Bugs caught early are cheaper to fix

Systematic checking of all code

• Static analysis reasons about all corner cases

Find bugs without test cases

• Useful for finding bugs in hard-to-test code
• Not a replacement for testing

This Talk: Clang “Static Analyzer”

Clang-based static analysis tool for finding bugs

• Supports C and Objective-C (C++ in the future)

Outline

• Demo
• How it works
• Design and implementation
• Looking forward

http://clang.llvm.org

Demo

How does static analysis work?

• Can catch bugs with different degrees of analysis sophistication
• Per-statement, per-function, whole-program all important

compiler warnings (simple checks)

% gcc -Wall -O1 -c t.c
t.c: In function ‘f’:
t.c:5: warning: ‘x’ may be used uninitialized in this function

% clang -warn-uninit-values t.c
t.c:13:12: warning: use of uninitialized variable
  return x;
         ^

int f(int y) {
  int x;
  if (y)
    x = 1;
  printf("%d\n", y);
  return x;
}

control-flow graph

[int x; if (y)] --true--> [x = 1;] --> [printf("%d\n", y); return x;]
[int x; if (y)] --false--> [printf("%d\n", y); return x;]

The bug occurs on this feasible path: the false edge, where x = 1 is skipped before return x.

int f(int y) {
  int x;
  if (y)
    x = 1;
  printf("%d\n", y);
  if (y)
    return x;
  return y;
}

control-flow graph

[int x; if (y)] --true--> [x = 1;] --> [printf("%d\n", y); if (y)]
[int x; if (y)] --false--> [printf("%d\n", y); if (y)]
[printf("%d\n", y); if (y)] --true--> [return x;]
[printf("%d\n", y); if (y)] --false--> [return y;]

% gcc -Wall -O1 -c t.c
t.c: In function ‘f’:
t.c:5: warning: ‘x’ may be used uninitialized in this function

% clang -warn-uninit-values t.c
t.c:13:12: warning: use of uninitialized variable
  return x;
         ^

Two feasible paths:

• Neither branch taken (y == 0)
• Both branches taken (y != 0)

Bogus warning occurs on infeasible path:

• Don’t take first branch (y == 0)
• Take second branch (y != 0)
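
The feasible and infeasible paths through the version of f with two if (y) tests can be enumerated mechanically. The following is a minimal sketch, not the analyzer's implementation. The struct and function names are hypothetical, and it assumes, as in the example, that y is not modified between the two tests.

```c
#include <assert.h>
#include <stdbool.h>

/* One syntactic path: which way each of the two `if (y)` tests goes. */
struct path {
  bool first_taken;    /* first  if (y): executes x = 1    */
  bool second_taken;   /* second if (y): executes return x */
};

/* Decodes path number 0..3 into its two branch decisions. */
struct path path_from_index(int i) {
  assert(i >= 0 && i < 4);
  struct path p = { (i & 1) != 0, (i & 2) != 0 };
  return p;
}

/* Both tests read the same unmodified y, so they must agree. */
bool path_is_feasible(struct path p) {
  return p.first_taken == p.second_taken;
}

/* x is read uninitialized when `return x` runs without `x = 1`. */
bool path_reads_uninit_x(struct path p) {
  return !p.first_taken && p.second_taken;
}

/* Counts warnings over all four syntactic paths. With path_sensitive
   set, infeasible paths are discarded before checking. */
int count_warnings(bool path_sensitive) {
  int warnings = 0;
  for (int i = 0; i < 4; ++i) {
    struct path p = path_from_index(i);
    if (path_sensitive && !path_is_feasible(p))
      continue;
    if (path_reads_uninit_x(p))
      ++warnings;
  }
  return warnings;
}
```

Without path sensitivity the single bad path is reported even though it cannot execute. With it, the two feasible paths are checked and no warning remains.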

False Positives (Bogus Errors)

• False positives can occur due to analysis imprecision
  ■ False paths
  ■ Insufficient knowledge about the program
• Many ways to reduce false positives
  ■ More precise analysis
  ■ Difficult to eliminate false positives completely

Flow-Sensitive Analyses

• Flow-sensitive analyses reason about flow of values
• No path-specific information
• LLVM’s SSA form designed for flow-sensitive algorithms
• Linear-time algorithms
  ■ Used by optimization algorithms and compiler warnings

y = 1;
x = y + 2;   // x == 3

if (x == 0)
  ++x;       // x == ?
else
  x = 2;     // x == 2
y = x;       // x == ?, y == ?
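
The "?" results above come from merging values where control flow joins. Here is a minimal sketch of that idea as constant propagation, assuming a two-level abstract value (known constant or unknown). The absval type and av_* helpers are hypothetical names, not part of Clang or LLVM.

```c
#include <assert.h>
#include <stdbool.h>

/* Abstract value: either a known constant or "?" (unknown). */
struct absval {
  bool known;
  int value;     /* meaningful only when known is true */
};

struct absval av_const(int v) {
  struct absval a = { true, v };
  return a;
}

struct absval av_unknown(void) {
  struct absval a = { false, 0 };
  return a;
}

/* Transfer function for `a + c` with a constant c. */
struct absval av_add(struct absval a, int c) {
  return a.known ? av_const(a.value + c) : av_unknown();
}

/* Merge at a control-flow join: keep the constant only if both
   incoming values agree. */
struct absval av_join(struct absval a, struct absval b) {
  if (a.known && b.known && a.value == b.value)
    return a;
  return av_unknown();
}

/* The if/else snippet: x is unknown on entry and the guard x == 0 is
   ignored, so the merged x (and therefore y) is unknown. */
struct absval analyze_branch_example(void) {
  struct absval x = av_unknown();
  struct absval x_then = av_add(x, 1);    /* ++x   -> ? */
  struct absval x_else = av_const(2);     /* x = 2 -> 2 */
  assert(!x_then.known && x_else.known);
  return av_join(x_then, x_else);         /* y = x -> ? */
}
```

For the straight-line snippet, av_add(av_const(1), 2) yields the known constant 3, matching the x == 3 comment.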

Path-Sensitive Analyses

• Reason about individual paths and guards on branches
• Uninitialized variables example:
  ■ Path-sensitive analysis picks up only 2 paths
  ■ No false positive
• Worst-case exponential-time
  ■ Complexity explodes with branches and loops
  ■ Lots of clever tricks to reduce complexity in practice
• Clang static analyzer uses flow- and path-sensitive analyses

if (x == 0)
  ++x;       // x == 1
else
  x = 2;     // x == 2
y = x;       // (x == 1, y == 1) or (x == 2, y == 2)
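
The per-path results in the comments above can be reproduced by keeping one state per path instead of merging. This is a hand-written sketch for this one snippet, under the assumption that the guard x == 0 pins x to 0 on the true branch. The names path_state and explore_paths are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Values known along one path after `y = x`. */
struct path_state {
  int x;
  int y;
};

/* Explores both branches of
     if (x == 0) ++x; else x = 2;
     y = x;
   and records one state per path. Returns the number of states
   written to out. */
size_t explore_paths(struct path_state *out, size_t capacity) {
  assert(out != NULL && capacity >= 2);
  size_t n = 0;

  /* Path 1: the guard holds, so x == 0 before the increment. */
  int x_true = 0;
  ++x_true;                /* ++x   -> x == 1 */
  out[n].x = x_true;
  out[n].y = x_true;       /* y = x -> y == 1 */
  ++n;

  /* Path 2: the guard fails; the assignment fixes x regardless. */
  int x_false = 2;         /* x = 2 -> x == 2 */
  out[n].x = x_false;
  out[n].y = x_false;      /* y = x -> y == 2 */
  ++n;

  return n;
}
```

Each extra branch can double the number of states, which is where the worst-case exponential cost comes from.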

Finding leaks in Objective-C code

Memory Management in Objective-C

Objective-C in a Nutshell

• Used to develop Mac/iPhone apps
• C with object-oriented programming extensions

Memory management

• Objective-C objects have embedded reference counts
• Reference counts obey strict ownership idiom
• Garbage collection also available... but there are subtle rules

Ownership Idiom

// Allocate an NSString. Since the object is newly allocated,
// ‘str’ is an owning reference (+1 retain count).
NSString* str = [[NSString alloc] initWithCString:"hello world"
                                         encoding:NSASCIIStringEncoding];

// Pass ‘str’ to ‘foo’. ‘foo’ may increment the retain
// count, but we are still obligated to decrement the +1
// count we have because ‘str’ is an owning reference.
foo(str);

// We’re done using str. Decrement our ownership count.
[str release];

The same code without the final release:

// We’re done using str. Decrement our ownership count.
// LEAK!
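
The same idiom can be modelled in plain C with an explicit count. This is a rough sketch, not how the Objective-C runtime works. The object type, the object_* functions and the live_objects counter are hypothetical, and the counter is there only to make the leak visible.

```c
#include <assert.h>
#include <stdlib.h>

struct object {
  int retain_count;
};

/* Hypothetical bookkeeping: objects created but not yet freed. */
static int live_objects = 0;

/* Like alloc/init: the caller receives an owning (+1) reference. */
struct object *object_create(void) {
  struct object *obj = malloc(sizeof *obj);
  if (obj) {
    obj->retain_count = 1;
    ++live_objects;
  }
  return obj;
}

void object_retain(struct object *obj) {
  assert(obj && obj->retain_count > 0);
  ++obj->retain_count;
}

/* Drops one reference and frees the object when the count reaches 0. */
void object_release(struct object *obj) {
  assert(obj && obj->retain_count > 0);
  if (--obj->retain_count == 0) {
    --live_objects;
    free(obj);
  }
}

/* A callee that balances its own retain, like a well-behaved foo(). */
void use_object(struct object *obj) {
  object_retain(obj);
  object_release(obj);
}

/* Follows the idiom; returns the number of objects still alive. */
int balanced_caller(void) {
  struct object *obj = object_create();
  if (!obj)
    return live_objects;
  use_object(obj);
  object_release(obj);       /* give up our +1 */
  return live_objects;
}

/* Same code without the final release: the object is leaked. */
int leaky_caller(void) {
  struct object *obj = object_create();
  if (obj)
    use_object(obj);
  return live_objects;       /* obj goes out of scope still owned */
}
```

balanced_caller leaves no live objects. leaky_caller leaves one, and nothing in the callee's behaviour changes that obligation.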

Memory Leak: Colloquy

[1] Method returns an object with a +1 retain count (owning reference).
[2] Taking true branch.
[3] Object allocated on line 34 and stored into 'newArray' is no longer referenced after this point and has a retain count of +1 (object leaked).

Bug Summary

File: Views/MVTextView.m
Location: line 39, column 3
Description: Memory Leak
Code is compiled without garbage collection.

Annotated Source Code

 1 #import "MVTextView.h"
 2 #import "JVTranscriptFindWindowController.h"
 3
 4 @interface MVTextView (MVTextViewPrivate)
 5 - (BOOL) checkKeyEvent:(NSEvent *) event;
 6 - (BOOL) triggerKeyEvent:(NSEvent *) event;
 7 @end
 8
 9 #pragma mark -
10
11 @implementation MVTextView
12 - (id)initWithFrame:(NSRect)frameRect textContainer:(NSTextContainer *)aTextContainer {
13   if( (self = [super initWithFrame:frameRect textContainer:aTextContainer] ) )
14     defaultTypingAttributes = [[NSDictionary allocWithZone:nil] init];
15   return self;
16 }
17
18 - (void) dealloc {
19   [defaultTypingAttributes release];
20   defaultTypingAttributes = nil;
21
22   [_lastCompletionMatch release];
23   _lastCompletionMatch = nil;
24
25   [_lastCompletionPrefix release];
26   _lastCompletionPrefix = nil;
27
28   [super dealloc];
29 }
30
31 #pragma mark -
32
33 - (void) interpretKeyEvents:(NSArray *) eventArray {
34   NSMutableArray *newArray = [[NSMutableArray allocWithZone:nil] init];
35   NSEnumerator *e = [eventArray objectEnumerator];
36   NSEvent *anEvent = nil;
37
38   if( ! [self isEditable] ) {
39     [super interpretKeyEvents:eventArray];
40     return;
41   }
42
43   while( ( anEvent = [e nextObject] ) ) {
44     if( [self checkKeyEvent:anEvent] ) {
45       if( [newArray count] > 0 ) {
46         [super interpretKeyEvents:newArray];

Ownership DFA

Owned states:

• Owned (+1) --retain--> Owned (+2) --retain--> Owned (+3) --retain--> ...
• ... --release--> Owned (+3) --release--> Owned (+2) --release--> Owned (+1)
• Owned (+1) --release--> Released
• Released --any use--> Use after Release

A memory leak occurs when we no longer reference an Owned pointer

¬Owned states:

• ¬Owned --retain--> ¬Owned (+1) --retain--> ¬Owned (+2) --retain--> ...
• ... --release--> ¬Owned (+2) --release--> ¬Owned (+1) --release--> ¬Owned
• ¬Owned --release--> Invalid Release

A leak occurs when we no longer reference a ¬Owned pointer with an excess retain count
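
The automaton can be written down as a small transition function. The sketch below is one reading of the slides, not the checker's actual code. The type and function names are hypothetical, and it assumes that every event on a Released object counts as "any use".

```c
#include <assert.h>
#include <stdbool.h>

enum kind {
  OWNED,               /* Owned (+count), count >= 1     */
  NOT_OWNED,           /* not-owned (+count), count >= 0 */
  RELEASED,
  USE_AFTER_RELEASE,   /* error state */
  INVALID_RELEASE      /* error state */
};

enum event { EV_RETAIN, EV_RELEASE, EV_USE };

struct ref_state {
  enum kind kind;
  int count;           /* retain count tracked for this reference */
};

/* One transition of the ownership automaton. */
struct ref_state step(struct ref_state s, enum event ev) {
  switch (s.kind) {
  case OWNED:
    assert(s.count >= 1);
    if (ev == EV_RETAIN) {
      s.count++;
    } else if (ev == EV_RELEASE) {
      s.count--;
      if (s.count == 0)
        s.kind = RELEASED;         /* Owned (+1) --release--> Released */
    }
    break;
  case NOT_OWNED:
    assert(s.count >= 0);
    if (ev == EV_RETAIN) {
      s.count++;
    } else if (ev == EV_RELEASE) {
      if (s.count == 0)
        s.kind = INVALID_RELEASE;  /* releasing what we never owned */
      else
        s.count--;
    }
    break;
  case RELEASED:
    s.kind = USE_AFTER_RELEASE;    /* assumption: every event is a use */
    break;
  case USE_AFTER_RELEASE:
  case INVALID_RELEASE:
    break;                         /* error states are absorbing */
  }
  return s;
}

/* True if losing the last reference in state s leaks the object. */
bool leaks_when_unreferenced(struct ref_state s) {
  return s.kind == OWNED || (s.kind == NOT_OWNED && s.count > 0);
}
```

A checker built this way would run step along each explored path and call leaks_when_unreferenced when a pointer dies.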

Miscellanea

Checker-specific issues

• Autorelease pools
• Objective-C 2.0 Garbage Collection
• API-specific ownership rules
• Educational diagnostics

Analysis issues

• Aliasing
• Plenty of room for improvement

Checker Results

• Used internally at Apple
• Announced in June 2008 (WWDC)
  ■ Hundreds of downloads of the static analyzer
  ■ Thousands of bugs found

Some Implementation Details

Why Analyze Source Code?

Bug-finding requires excellent diagnostics

• Tool must explain a bug to the user
• Users cannot fix bugs they don’t understand
• Need rich source and type information

What about analyzing LLVM IR?

• Loss of source information
• High-level types discarded
• Compiler lowers language constructs
• Compiler makes assumptions (e.g., order of evaluation)

Clang Libraries

[Figure: Clang library diagram with the groups "App Libraries" and "Core Libraries". Libraries shown: Basic, Lex, Parse, AST, Sema, Analysis, Rewrite, LLVMGen. Also labeled: CLI Driver, IDE, ?]

libAnalysis

Intra-Procedural Analysis

• Source-level Control-Flow Graphs (CFGs)
• Flow-sensitive dataflow solver
  ■ Live Variables
  ■ Uninitialized Values
• Path-sensitive dataflow engine
  ■ Retain/Release checker
  ■ Logic bugs (e.g., null dereferences)
• Various checks and analyses
  ■ Dead stores
  ■ API checks
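
To show how live variables and dead stores relate, here is a minimal sketch of a backward liveness pass. It handles only a single straight-line block, so it is much simpler than a CFG-based dataflow solver. The stmt encoding and find_dead_stores are hypothetical, not libAnalysis APIs.

```c
#include <assert.h>
#include <stddef.h>

/* One straight-line statement: the variable it writes (or -1 for
   none) and a bitmask of the variables it reads. Variables are
   numbered 0..31. */
struct stmt {
  int def;
  unsigned uses;
};

/* Backward live-variable pass over a block with no branches.
   live_out is the set of variables still needed after the block.
   Sets dead[i] to 1 when statement i stores a value that is never
   read afterwards, and returns the number of such dead stores. */
size_t find_dead_stores(const struct stmt *code, size_t n,
                        unsigned live_out, int *dead) {
  unsigned live = live_out;
  size_t count = 0;
  for (size_t i = n; i-- > 0;) {
    int d = code[i].def;
    assert(d >= -1 && d < 32);
    dead[i] = 0;
    if (d >= 0) {
      unsigned bit = 1u << d;
      if (!(live & bit)) {
        dead[i] = 1;           /* nobody reads this value */
        ++count;
      }
      live &= ~bit;            /* the store kills the variable */
    }
    live |= code[i].uses;      /* reads make variables live */
  }
  return count;
}
```

For the block x = 1; x = 2; y = x; with only y live afterwards, the first store to x is reported as dead.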

Path Diagnostics (Bug-Reporting)

• PathDiagnosticClient
  ■ Abstract interface to implement a “view” of bug reports
  ■ Separates report visualization from generation
  ■ HTMLDiagnostics (renders HTML, uses libRewrite)
• BugReporter
  ■ Helper class to generate diagnostics for PathDiagnosticClient
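
The separation of report generation from visualization can be sketched with a function-pointer interface. Clang's own classes are C++ and are not reproduced here. Every name below (diagnostic_client, report_bug, text_emit and so on) is hypothetical and only illustrates the design.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One step along the path that leads to a bug. */
struct path_piece {
  int line;
  const char *message;
};

/* Hypothetical "view" interface: generation calls emit and does not
   care how the report is rendered. */
struct diagnostic_client {
  void (*emit)(void *ctx, const struct path_piece *pieces, size_t n);
  void *ctx;
};

/* Generation side: hands the finished path to the installed client. */
void report_bug(const struct diagnostic_client *client,
                const struct path_piece *pieces, size_t n) {
  assert(client && client->emit);
  client->emit(client->ctx, pieces, n);
}

/* One possible view: plain text in a caller-owned buffer. */
struct text_buffer {
  char data[256];
  size_t used;
};

void text_buffer_init(struct text_buffer *buf) {
  memset(buf, 0, sizeof *buf);
}

void text_emit(void *ctx, const struct path_piece *pieces, size_t n) {
  struct text_buffer *buf = ctx;
  for (size_t i = 0; i < n; ++i) {
    size_t room = sizeof buf->data - buf->used;
    int written = snprintf(buf->data + buf->used, room,
                           "[%zu] line %d: %s\n",
                           i + 1, pieces[i].line, pieces[i].message);
    if (written < 0 || (size_t)written >= room)
      return;                  /* stop on error or truncation */
    buf->used += (size_t)written;
  }
}
```

An HTML view would be a second emit function with its own context. report_bug would not need to change.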

Looking Forward

• Richer Diagnostics
• Inter-procedural Analysis (IPA)
• Lots of Checks
• Scriptability
• Multiple Analysis Engines

http://clang.llvm.org