Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf ·...
Transcript of Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf ·...
![Page 1: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/1.jpg)
PROGRAMMING LANGUAGES LABORATORY!Universidade Federal de Minas Gerais -‐ Department of Computer Science
Fernando Magno Quintão [email protected]
PROGRAM ANALYSIS AND OPTIMIZATION – DCC888!
TAINTED FLOW ANALYSIS!
The material in these slides have been taken from Chapter 19 – StaAc Single-‐Assignment Form – of "Modern Compiler ImplementaAon in Java – Second EdiAon", by Andrew Appel and Jens Palsberg. RESEARCH SCHOOLS OF THE ÉCOLE NORMALE SUPÉRIEURE DE LYON!
![Page 2: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/2.jpg)
An Example is Worth Many Words
1) What is the person behind the phone complaining about?
2) Have you ever seen such a problem before?
![Page 3: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/3.jpg)
Bobby Tables has got a Car
What is the problem with the plate of this car?
![Page 4: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/4.jpg)
InformaAon Flow
• Programs manipulate informaAon. • Some informaAon should not leave the program.
– Example: an uncriptographed password.
• Other informaAon should not reach sensiAve parts of the program. – Example: a string too large to fit into an array.
In the cartoon we just saw, what is the problem: informaAon entering the program, or informaAon leaving the program?
![Page 5: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/5.jpg)
InformaAon flow vulnerabiliAes
• If the user can read sensiAve informaAon from the program, we say that the program has an informa(on disclosure vulnerability.
• If the user can send harmful informaAon to the program, we say that the program has a tainted flow vulnerability.
1) Which tainted flow vulnerabiliAes can you think about?
2) Can you think about informaAon disclosure vulnerabiliAes?
3) Can you think about a way to find out if a program has such a vulnerability. Try to be creaAve!
![Page 6: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/6.jpg)
Tainted Flow VulnerabiliAes
• An adversary can compromise the
program by sending malicious
information to it.
• This type of attack is very common in web servers.
![Page 7: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/7.jpg)
An example of security bug
1 <?php 2 init_session(); 3 echo “Hello ” . $_GET[‘name’]; 4 ?>
http://localhost/xss.php?name=Fernando
What is the vulnerability of this program?
![Page 8: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/8.jpg)
An example of security bug
1 <?php 2 init_session(); 3 echo “Hello ” . $_GET[‘name’]; 4 ?>
http://localhost/xss.php?name=Fernando How to steal a cookie:
http://localhost/xss.php?
name=<script>alert(docume
nt.cookie);</script>
![Page 9: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/9.jpg)
An example of security bug
1 <?php 2 init_session(); 3 echo “Hello ” . $_GET[‘name’]; 4 ?>
![Page 10: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/10.jpg)
An example of security bug
1 <?php 2 init_session(); 3 echo “Hello ” . $_GET[‘name’]; 4 ?>
How to clean this program?
![Page 11: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/11.jpg)
An example of security bug
1 <?php 3 init_session(); 4 echo “Hello ” . htmlentities($_GET[‘name’]); 4 ?>
Clean!
![Page 12: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/12.jpg)
There are several ways to break a program
$id = $_GET[”user”];
if ($id == '') { echo "Invalid user: $id" } else { $getuser = $DB->query (”SELECT * FROM `table` WHERE id=’$id’”); echo $getuser; }
What is the vulnerability of this program?
![Page 13: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/13.jpg)
There are several ways to break a program
$id = $_GET[”user”];
if ($id == '') { echo "Invalid user: $id" } else { $getuser = $DB->query (”SELECT * FROM `table` WHERE id=’$id’”); echo $getuser; }
What if the name of the student is Robert’); drop table Students ; --!
![Page 14: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/14.jpg)
The Tainted Flow Problem
• Instance: a tuple T = (P, SO, SI, SA) – P → Program
– SO → Sources – SI → Sinks
– SA → SaniAzers
<?php echo htmlentities($_GET[‘name’]); ?>
![Page 15: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/15.jpg)
The Tainted Flow Problem
• Instance: a tuple T = (P, SO, SI, SA) – P → Program
– SO → Sources – SI → Sinks
– SA → SaniAzers
<?php echo htmlentities($_GET[‘name’]); ?>
![Page 16: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/16.jpg)
The Tainted Flow Problem
• Instance: a tuple T = (P, SO, SI, SA) – P → Program
– SO → Sources – SI → Sinks
– SA → SaniAzers
<?php echo htmlentities($_GET[‘name’]); ?>
![Page 17: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/17.jpg)
The Tainted Flow Problem
• Instance: a tuple T = (P, SO, SI, SA) – P → Program
– SO → Sources – SI → Sinks
– SA → Sani4zers
<?php echo htmlentities($_GET[‘name’]); ?>
![Page 18: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/18.jpg)
The Tainted Flow Problem
• Instance: a tuple T = (P, SO, SI, SA) – P → Program
– SO → Sources – SI → Sinks
– SA → SaniAzers
• Problem: find a path from a source to a sink that does not go across a saniAzer
What do you think I mean by path?!
![Page 19: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/19.jpg)
Example: cross-‐site scripAng (XSS)
• Instance: a tuple T = (P, SO, SI, SA) – P → Program
– SO → Sources – SI → Sinks
– SA → SaniAzers
Cross-site Scripting (XSS) SO: $_GET, $_POST, ... SI: echo, print, printf, ... SA: htmlentities, strip_tags,...
![Page 20: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/20.jpg)
Example: XSS
<?php $name = $_GET['name']; echo $name; ?>
<?php $name = htmlenAAes($_GET['name']); echo $name; ?>
![Page 21: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/21.jpg)
Example: SQL injecAon
• Instance: a tuple T = (P, SO, SI, SA) – P → Program
– SO → Sources – SI → Sinks
– SA → SaniAzers
SQL injection: SO: $_GET, $_POST, ... SI: mysql_query, pg_query, ... SA: addslashes, pg_escape_string,...
![Page 22: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/22.jpg)
Example: SQL InjecAon
<?php $userid = $_GET['userid']; $passwd = $_GET['passwd']; ... $result = mysql_query("SELECT userid FROM users WHERE userid=$userid AND passwd='$passwd'"); ?>
<?php $userid = (int) $_GET['userid']; $passwd = addslashes($_GET['passwd']); ... $result = mysql_query("SELECT userid FROM users WHERE userid=$userid AND passwd='$passwd'"); ?>
Can you find a string that goes around the password guard?!
![Page 23: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/23.jpg)
Example: Command ExecuAon
• Instance: a tuple T = (P, SO, SI, SA) – P → Program
– SO → Sources – SI → Sinks
– SA → SaniAzers
Command Execution: SO: $_GET, $_POST, ... SI: exec, system, passthru, ... SA: escapeshellcmd, escapeshellarg,..
![Page 24: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/24.jpg)
Example: Command ExecuAon
<?php $filename = $_GET['filename']; system("/usr/bin/file $filename"); ?>
<?php $filename = escapecmdshell ($_GET['filename']); system("/usr/bin/file $filename"); ?>
1) Do you know what the file command does?!
2) Can you do anything evil with this program?!
![Page 25: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/25.jpg)
Example: Remote File Inclusion
• Instance: a tuple T = (P, SO, SI, SA) – P → Program
– SO → Sources – SI → Sinks
– SA → SaniAzers
Remote File Inclusion: SO: $_GET, $_POST, ... SI: include, include_once, require, require_once, ... SA: ?
![Page 26: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/26.jpg)
Example: Remote File Inclusion
<?php $file = $_GET['filename']; include($file); ?>
<?php $file_incs = array("file1", ..., "fileN"); $file_id = $_GET['file_id'];
$inc = $file_incs[$file_id]; if (isset($inc)) include($inc); else echo "Error..."; ?>
![Page 27: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/27.jpg)
Example: File System Access
• Instance: a tuple T = (P, SO, SI, SA) – P → Program
– SO → Sources – SI → Sinks
– SA → SaniAzers
File System Access: SO: $_GET, $_POST, ... SI: chdir, mkdir, rmdir, rename, copy, chgrp, chown, chmod, unlink, ... SA: ?
![Page 28: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/28.jpg)
Example: File System Access
<?php $filename = $_GET['filename']; unlink($filename); ?>
The developer can use the same methods used to guard against remote file inclusion aoacks.
![Page 29: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/29.jpg)
Example: Malicious EvaluaAon
• Instance: a tuple T = (P, SO, SI, SA) – P → Program
– SO → Sources – SI → Sinks
– SA → SaniAzers
Malicious Evaluation: SO: $_GET, $_POST, ... SI: eval, preg_replace, ... SA: ?
![Page 30: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/30.jpg)
Example: Malicious EvaluaAon
<?php $code = $_GET['code']; eval($code); ?>
There is not a systemaAc way of checking staAcally that dynamic code is safe…
![Page 31: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/31.jpg)
It is a serious problem…
• The Annual SANS’s Report esAmates that SQL injecAon aoacks have happened 19 million Ames in July 2009.
• CVE1 2006 statistics: • #1: Cross-site scripting: 18.5% • #2: SQL injection: 13.6% • #3: remote file inclusion: 13.1% • #17: command execution: 0.4% • #24: Eval injection: 0.3%
1: Common VulnerabiliAes and Exposures
![Page 32: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/32.jpg)
Discovering Tainted Flow VulnerabiliAes
• How can we find if a program contains a tainted flow vulnerability? – Is your algorithm decidable?
– Is it fast? – Is it too conserva4ve? In other words, can it output a false posiAve?
– Is it sound? In other words, can it produce false negaAves?
– Does it work for every kind of tainted flow vulnerability?
![Page 33: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/33.jpg)
DCC 888
Universidade Federal de Minas Gerais – Department of Computer Science – Programming Languages Laboratory
DENSE TAINTED FLOW ANALYSIS
![Page 34: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/34.jpg)
How to detect tainted flow vulnerabiliAes?
• Problem: find a path from a source to a sink that does not go across a saniAzer.
The very old quesAon: what is a path inside the program?!
![Page 35: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/35.jpg)
The very old quesAon: what is a path inside the program?!
A path is the flow of control that exists in the program
How to detect tainted flow vulnerabiliAes?
• Problem: find a path from a source to a sink that does not go across a saniAzer.
A path is a
chain of data
dependences
in a program
![Page 36: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/36.jpg)
Nano-‐PHP: our toy language
• The toy language should be simple, so that the enAre exposiAon of ideas is not complex.
• Yet, it should be complex enough to include every aspect of the algorithm that will be important in the real world.
• Oten it is useful to define a staAc analysis on a small programming language.
• We can prove properAes for this programming language.
• These proofs not only enhance confidence on the algorithm, but they also help the reader to understand how the algorithm works.
![Page 37: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/37.jpg)
Nano-‐PHP: our toy language
Name Instruc4on Example
Assignment from source v = ❍ $x = $_POST[‘content’]
Assignment to sink ● = v echo($v)
Simple assignment x = ⊗(x1, …, xn) $a = $t1 * $t2
Branch bra l1, …, ln if (…) {…} else {…}
Filter x = filter $a = htmlentities($t1)
Validator validate x, lc, lt if(is_num($t1)) {…}
l0: v = ❍ l1: x = filter l2: validate(v, l3, l9) l3: bra l4, l8 l4: ● = x l5: v = ❍ l6: v = ⊗(v) l7: bra l4 l8: ● = x l9: bra l4
$v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
![Page 38: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/38.jpg)
Name Instruc4on Example
Assignment from source v = ❍ $x = $_POST[‘content’]
Assignment to sink ● = v echo($v)
Simple assignment x = ⊗(x1, …, xn) $a = $t1 * $t2
Branch bra l1, …, ln if (…) {…} else {…}
Filter x = filter $a = htmlentities($t1)
Validator validate x, lc, lt if(is_num($t1)) {…}
l0: v = ❍ l1: x = filter l2: validate(v, l3, l9) l3: bra l4, l8 l4: ● = x l5: v = ❍ l6: v = ⊗(v) l7: bra l3 l8: ● = x l9: bra l9
$v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
Nano-‐PHP: our toy language
![Page 39: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/39.jpg)
Name Instruc4on Example
Assignment from source v = ❍ $x = $_POST[‘content’]
Assignment to sink ● = v echo($v)
Simple assignment x = ⊗(x1, …, xn) $a = $t1 * $t2
Branch bra l1, …, ln if (…) {…} else {…}
Filter x = filter $a = htmlentities($t1)
Validator validate x, lc, lt if(is_num($t1)) {…}
l0: v = ❍ l1: x = filter l2: validate(v, l3, l9) l3: bra l4, l8 l4: ● = x l5: v = ❍ l6: v = ⊗(v) l7: bra l3 l8: ● = x l9: bra l9
$v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
Nano-‐PHP: our toy language
![Page 40: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/40.jpg)
Name Instruc4on Example
Assignment from source v = ❍ $x = $_POST[‘content’]
Assignment to sink ● = v echo($v)
Simple assignment x = ⊗(x1, …, xn) $a = $t1 * $t2
Branch bra l1, …, ln if (…) {…} else {…}
Filter x = filter $a = htmlentities($t1)
Validator validate x, lc, lt if(is_num($t1)) {…}
l0: v = ❍ l1: x = filter l2: validate(v, l3, l9) l3: bra l4, l8 l4: ● = x l5: v = ❍ l6: v = ⊗(v) l7: bra l3 l8: ● = x l9: bra l9
$v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
Nano-‐PHP: our toy language
![Page 41: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/41.jpg)
Name Instruc4on Example
Assignment from source v = ❍ $x = $_POST[‘content’]
Assignment to sink ● = v echo($v)
Simple assignment x = ⊗(x1, …, xn) $a = $t1 * $t2
Branch bra l1, …, ln if (…) {…} else {…}
Filter x = filter $a = htmlentities($t1)
Validator validate x, lc, lt if(is_num($t1)) {…}
l0: v = ❍ l1: x = filter l2: validate(v, l3, l9) l3: bra l4, l8 l4: ● = x l5: v = ❍ l6: v = ⊗(v) l7: bra l3 l8: ● = x l9: bra l9
$v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
Nano-‐PHP: our toy language
![Page 42: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/42.jpg)
Name Instruc4on Example
Assignment from source v = ❍ $x = $_POST[‘content’]
Assignment to sink ● = v echo($v)
Simple assignment x = ⊗(x1, …, xn) $a = $t1 * $t2
Branch bra l1, …, ln if (…) {…} else {…}
Filter x = filter $a = htmlentities($t1)
Validator validate x, lc, lt if(is_num($t1)) {…}
$v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
l0: v = ❍ l1: x = filter l2: validate(v, l3, l9) l3: bra l4, l8 l4: ● = x l5: v = ❍ l6: v = ⊗(v) l7: bra l3 l8: ● = v l9: bra l9
Nano-‐PHP: our toy language
![Page 43: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/43.jpg)
Path as Control Flow
A path is the flow of control that exists in the program
$v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
1) Does this program contain a vulnerability?
2) What is the control flow graph of this program?
![Page 44: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/44.jpg)
l0: v = !l1: x = filter
l2: validate(v, l6, l8)
l3: • = x
l6: bra l3, l7
l4: x = !l7: • = v
l8: bra l8
l5: v =!(v)
Path as Control Flow $v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
l0: v = ❍ l1: x = filter l2: validate(v, l3, l9) l6: bra l3, l7 l3: ● = x l4: x = ❍ l5: v = ⊗(v) l7: ● = v l8: bra l8
Can you find a vulnerable path along the control flow graph of this program?
![Page 45: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/45.jpg)
l0: v = !l1: x = filter
l2: validate(v, l6, l8)
l3: • = x
l6: bra l3, l7
l4: x = !l7: • = v
l8: bra l8
l5: v =!(v)
Path as Control Flow $v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
![Page 46: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/46.jpg)
Tracking the Program Data Flow
l0: v = !l1: x = filter
l2: validate(v, l6, l8)
l3: • = x
l6: bra l3, l7
l4: x = !l7: • = v
l8: bra l8
l5: v =!(v)
$v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
The mission: Provide an algorithm that determines if the program contains a vulnerability. * Does it terminate? * What is the complexity? * Is it sound?
![Page 47: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/47.jpg)
Data Flow Analysis
• A program point is any point between two consecuAve instrucAons.
• Let’s associate with each program point a funcAon that maps variables to either clean or tainted.
• This funcAon is, indeed, a point in the product lazce of the two lazces below, assuming two variables, a and b:
Which lazce do we obtain from the product of these two lazces?
{}
{a} {b}
{a, b}
clean
tainted
?
![Page 48: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/48.jpg)
Data Flow Analysis
• We will use this product lazce, whose diagram is seen below:
{}
{(a, clean)} {(b, clean)}
{(a, tainted)} {(b, tainted)}
{(a, clean), (b, clean)}
{(a, clean), (b, tainted)}{(a, tainted), (b, clean)}
{(a, tainted), (b, tainted)}
We did not draw the associaAons between variables and the undefined value (?), to avoid clogging the figure too much.
![Page 49: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/49.jpg)
Dense Data Flow Analysis
• NoAce that our analysis will be dense: – We are associaAng each pair (p, v) with an abstract state, where p is a program point, and v is a variable.
So, given a program with V variables, and P points, how many iteraAons we will need, in the worst case, to determine the abstract state of all the pairs (vars × points)?
{}
{(a, clean)} {(b, clean)}
{(a, tainted)} {(b, tainted)}
{(a, clean), (b, clean)}
{(a, clean), (b, tainted)}{(a, tainted), (b, clean)}
{(a, tainted), (b, tainted)}
![Page 50: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/50.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?} {v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?}
{v → ? x → ?}
{v → ?, x → ?}
{v → ?, x → ?} {v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?}
![Page 51: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/51.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?} {v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?}
{v → ? x → ?}
{v → ?, x → ?}
{v → ?, x → ?} {v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?}
Drafting an algorithm: Every function starts mapping every variable to an unknown value.
* How can you evolve these functions, so that they “learn” information about the variables?
![Page 52: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/52.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?} {v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?}
{v → ? x → ?}
{v → ?, x → ?}
{v → ?, x → ?} {v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?}
Worklist: l0
![Page 53: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/53.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → ?, x → ?}
{v → ?, x → ?} {v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?}
{v → ? x → ?}
{v → ?, x → ?}
{v → ?, x → ?} {v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?}
Worklist: l1
![Page 54: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/54.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → ?, x → ?} {v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?}
{v → ? x → ?}
{v → ?, x → ?}
{v → ?, x → ?} {v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?}
Worklist: l2
![Page 55: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/55.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → ?, x → ?}
{v → ?, x → ?}
{v → ? x → ?}
{v → ?, x → ?}
{v → ?, x → ?} {v → ?, x → ?}
{v → ?, x → ?}
{v → ?, x → ?}
Worklist: l3, l9
![Page 56: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/56.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → ?, x → ?}
{v → ?, x → ?}
{v → ? x → ?}
{v → ?, x → ?}
{v → C, x → C} {v → C, x → C}
{v → ?, x → ?}
{v → ?, x → ?}
Worklist: l9, l4, l8
![Page 57: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/57.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → ?, x → ?}
{v → ?, x → ?}
{v → ? x → ?}
{v → ?, x → ?}
{v → C, x → C} {v → C, x → C}
{v → ?, x → ?}
{v → T, x → C}
l4,
Worklist: l4, l8, l9
![Page 58: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/58.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → ?, x → ?}
{v → ?, x → ?}
{v → ? x → ?}
{v → C, x → C}
{v → C, x → C} {v → C, x → C}
{v → ?, x → ?}
{v → T, x → C}
Worklist: l8, l9, l5
![Page 59: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/59.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → ?, x → ?}
{v → ?, x → ?}
{v → ? x → ?}
{v → C, x → C}
{v → C, x → C} {v → C, x → C}
{v → C, x → C}
{v → T, x → C}
Worklist: l9, l5
![Page 60: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/60.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → ?, x → ?}
{v → ?, x → ?}
{v → ? x → ?}
{v → C, x → C}
{v → C, x → C} {v → C, x → C}
{v → C, x → C}
{v → T, x → C}
Worklist: l9, l5
What happens when information collides?:
At l9 we have two different states of variable v colliding: v is clean through l8, and tainted through l2.
What is going to be the state of v at l9?
![Page 61: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/61.jpg)
The Meet Operator
• The meet operator defines what happens at the joining points in the program.
• We can easily extends its definiAon to our product lazce, e.g., {(a, tainted), (b, clean)} ∧ {(a, clean), (b, undefined)} = {(a, tainted), (b, clean)}
∧ Undefined Clean Tainted
Undefined Undefined Clean Tainted
Clean Clean Clean Tainted
Tainted Tainted Tainted Tainted
Is this analysis may or must?
![Page 62: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/62.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → ?, x → ?}
{v → ?, x → ?}
{v → ? x → ?}
{v → C, x → C}
{v → C, x → C} {v → C, x → C}
{v → C, x → C}
{v → T, x → C}
Worklist: l5
Remains the same!
![Page 63: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/63.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → ?, x → ?}
{v → ?, x → ?}
{v → C x → T}
{v → C, x → C}
{v → C, x → C} {v → C, x → C}
{v → C, x → C}
{v → T, x → C}
Worklist: l6
![Page 64: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/64.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → ?, x → ?}
{v → C, x → T}
{v → C x → T}
{v → C, x → C}
{v → C, x → C} {v → C, x → C}
{v → C, x → C}
{v → T, x → C}
Worklist: l7
![Page 65: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/65.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → C, x → T}
{v → C, x → T}
{v → C x → T}
{v → C, x → C}
{v → C, x → C} {v → C, x → C}
{v → C, x → C}
{v → T, x → C}
Worklist: l3
![Page 66: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/66.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → C, x → T}
{v → C, x → T}
{v → C x → T}
{v → C, x → C}
{v → C, x → C} {v → C, x → C}
{v → C, x → C}
{v → T, x → C}
Worklist: l3
Again: What happens when two functions “meet”?
![Page 67: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/67.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → C, x → T}
{v → C, x → T}
{v → C x → T}
{v → C, x → C}
{v → C, x → T} {v → C, x → T}
{v → C, x → C}
{v → T, x → C}
Worklist: l4, l8
![Page 68: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/68.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → C, x → T}
{v → C, x → T}
{v → C x → T}
{v → C, x → T}
{v → C, x → T} {v → C, x → T}
{v → C, x → C}
{v → T, x → C}
Worklist: l8, l5
![Page 69: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/69.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → C, x → T}
{v → C, x → T}
{v → C x → T}
{v → C, x → T}
{v → C, x → T} {v → C, x → T}
{v → C, x → T}
{v → T, x → C}
Worklist: l5, l9
![Page 70: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/70.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → C, x → T}
{v → C, x → T}
{v → C x → T}
{v → C, x → T}
{v → C, x → T} {v → C, x → T}
{v → C, x → T}
{v → T, x → C}
Worklist: l9
Remains the same!
![Page 71: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/71.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → C, x → T}
{v → C, x → T}
{v → C x → T}
{v → C, x → T}
{v → C, x → T} {v → C, x → T}
{v → C, x → T}
{v → T, x → T}
Worklist: l9
![Page 72: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/72.jpg)
Example of dense approach
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
{v → T, x → ?}
{v → T, x → C}
{v → T, x → C} {v → C, x → C}
{v → C, x → T}
{v → C, x → T}
{v → C x → T}
{v → C, x → T}
{v → C, x → T} {v → C, x → T}
{v → C, x → T}
{v → T, x → T}
Worklist:
Remains the same!
and we are done!
![Page 73: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/73.jpg)
Ensuring TerminaAon
• Does this algorithm always terminates? – It does, because each variable can be mapped to either ?, clean or tainted. Once it reaches tainted, it does not change anymore. If every variable in a program point becomes tainted, that program point will not be added to the work list again.
• So, what is the complexity of this algorithm? – It is O(P×V×M), where P is the number of program poitns, V is the number of variables, and M is the number of predecessors of a program point. Usually, O(P) = O(M) = O(V); thus, we have O(V3).
![Page 74: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/74.jpg)
Ensuring TerminaAon
• Does this algorithm always terminates? – It does, because each variable can be mapped to either ?, clean or tainted. Once it reaches tainted, it does not change anymore. If every variable in a program point becomes tainted, that program point will not be added to the work list again.
• So, what is the complexity of this algorithm? – It is O(P×V×M), where P is the number of program points, V is the number of variables, and M is the number of predecessors of a program point. Usually, O(P) = O(V); and O(M) = O(1). Thus, we have O(V2).
![Page 75: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/75.jpg)
DCC 888
Universidade Federal de Minas Gerais – Department of Computer Science – Programming Languages Laboratory
SPARSE TAINTED FLOW ANALYSIS
![Page 76: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/76.jpg)
Problems with dense analyses
• Lots of redundant informaAon – Some nodes just pass forward the informaAon that they receive.
– These nodes are bound to idenAty transfer funcAons:
l3: bra l4, l8 {v → C, x → T}
{v → C, x → T}
{v → C, x → T}
![Page 77: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/77.jpg)
Problems with dense analyses
• Lots of redundant informaAon – Some nodes just pass forward the informaAon that they receive.
– These nodes are bound to idenAty transfer funcAons:
l3: bra l4, l8 {v → C, x → T}
{v → C, x → T}
{v → C, x → T}
We had two noAons of "path". Do you remember the second?
![Page 78: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/78.jpg)
Another noAon of path
$v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
$_GET[‘child’]
$v $x
A path is a
chain of data
dependences
in a program
![Page 79: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/79.jpg)
Sparse Analysis
• If the “state” of a variable is always the same, we can bind this state, e.g., clean or tainted, to the variable itself.
$v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
$_GET[‘child’]
$v $x
Is the state of $x always the same in the program above?
![Page 80: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/80.jpg)
Sparse Analysis
$v = DB.get($_GET[‘child’]); $x = “”; if (DB.isMember($v)) { while (DB.hasParent($v)) { echo($x); $x = $_POST[‘$v’]; $v = DB.getParent($v); } echo($v); }
$_GET[‘child’]
$v $x
1) How can we ensure that the abstract state of a variable is always the same?
2) This property will give us a sparse analysis. Do you remember what is a sparse analysis?
![Page 81: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/81.jpg)
StaAc Single Assignment
• The key is to ensure that each variable is defined in at most one site inside the program text.
l9: bra l9
l0: v = ❍ l1: x = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
• The name x is defined twice. We need to rename it!
![Page 82: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/82.jpg)
StaAc Single Assignment
• The key is to ensure that each variable is defined in at most one site inside the program text.
l9: bra l9
l0: v = ❍ l1: x1 = filter
l2: validate(v, l3, l9)
l3: bra l4, l8
l4: ● = x
l5: x6 = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
• The name x is defined twice. We need to rename it!
• But what is x at l4?
![Page 83: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/83.jpg)
StaAc Single Assignment
• The key is to ensure that each variable is defined in at most one site inside the program text.
l9: bra l9
l0: v = ❍ l1: x1 = filter
l2: validate(v, l3, l9)
l3: x5 ← φ (x1, x6) l3: bra l4, l8
l4: ● = x5
l5: x6 = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
• The name x is defined twice. We need to rename it!
• But what is x at l4? • We use phi-‐func4ons to
merge different variables into a single name.
![Page 84: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/84.jpg)
StaAc Single Assignment
• The variable v defined at l0 is clean at some parts of the program, and tainted at others. Can you idenAfy them?
l9: bra l9
l0: v = ❍ l1: x1 = filter
l2: validate(v, l3, l9)
l3: x5 ← φ (x1, x6) l3: bra l4, l8
l4: ● = x5
l5: x6 = ❍ l6: v = ⊗(v)
l7: bra l4
l8: ● = v
• We can “learn” informaAon about v from the validator.
• But we sAll would like to associate v with a single abstract state, clean or tainted. How?
![Page 85: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/85.jpg)
Validators define new variable names
• The variable v defined at l0 is clean in some parts of the program, and tainted at others. Can you idenAfy them?
l9: bra l9
l0: v = ❍ l1: x1 = filter
l2: validate(v, v2, l3, v3, l9)
l3: x5 ← φ (x1, x6) l3: bra l4, l8
l4: ● = x5
l5: x6 = ❍ l6: v = ⊗(v2)
l7: bra l4
l8: ● = v2
• We can “learn” informaAon about v from the validator.
• Let’s rename each variable that is validated. In this way, each validator defines two new names.
![Page 86: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/86.jpg)
Validators define new variable names
• The variable v defined at l0 is clean in some parts of the program, and tainted at others. Can you idenAfy them?
l9: bra l9
l0: v = ❍ l1: x1 = filter
l2: validate(v, v2, l3, v3, l9)
l3: x5 ← φ (x1, x6) l3: bra l4, l8
l4: ● = x5
l5: x6 = ❍ l6: v = ⊗(v2)
l7: bra l4
l8: ● = v2
• We can “learn” informaAon about v from the validator.
And what should we do about the two definiAons of variable v? Remember: we can have only one definiAon per variable.
![Page 87: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/87.jpg)
Example of Extended-‐SSA form program
l11: bra l11
l0: v0 = ❍ l1: x1 = filter
l2: validate(v, v2, l3, v3, l11)
l3: v4 ← φ (v2, v7) l4: x5 ← φ (x1, x6) l5: bra l6, l10
l6: ● = x5
l7: x6 = ❍ l8: v7 = ⊗(v4)
l9: bra l3
l10: ● = v4
What is the complexity of building this program representaAon?
![Page 88: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/88.jpg)
Sparse Analysis
• Algorithm in three steps: – Convert the program to e-‐SSA form → O(V2)
• In pracAce, it is → O(V)
– Build the constraint graph → O(V2) • In pracAce, it is → O(V)
– Traverse the constraint graph → O(E) → O(V2) • In pracAce, it is → O(V)
• We are assuming that V is the number of variables in the program.
![Page 89: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/89.jpg)
$i $i2
$i1 is_num
$a
stripslashes
Building the Constraint Graph
$v$_POST['id']
$v echo
$t1
$t2$a
$t1
$t2$a
Instruc4on Example Node
v = ❍ $x = $_POST[‘content’]
● = v echo($v)
x = ⊗(x1, …, xn) $a = $t1 * $t2
x = filter $a = htmlentities($t1)
validate(v, vc, lc, vt, lt) if(is_num($t1)) {…}
v ← φ (v1, v2) $v = phi($v1, $v2)
![Page 90: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/90.jpg)
l11: bra l11
l0: v0 = ❍ l1: x1 = filter l2: validate(v, v2, l3, v3, l11)
l3: v4 ← φ (v2, v7) l4: x5 ← φ (x1, x6) l5: bra l6, l10
l6: ● = x5 l7: x6 = ❍ l8: v7 = ⊗(v4) l9: bra l3
l10: ● = v4
v3
v0
v2
v7
v4
x1
x6
x5
Example of constraint graph
![Page 91: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/91.jpg)
l11: bra l11
l0: v0 = ❍ l1: x1 = filter l2: validate(v, v2, l3, v3, l11)
l3: v4 ← φ (v2, v7) l4: x5 ← φ (x1, x6) l5: bra l6, l10
l6: ● = x5 l7: x6 = ❍ l8: v7 = ⊗(v4) l9: bra l3
l10: ● = v4
v3
v0
v2
v7
v4
x1
x6
x5
Example of constraint graph
![Page 92: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/92.jpg)
l11: bra l11
l0: v0 = ❍ l1: x1 = filter l2: validate(v, v2, l3, v3, l11)
l3: v4 ← φ (v2, v7) l4: x5 ← φ (x1, x6) l5: bra l6, l10
l6: ● = x5 l7: x6 = ❍ l8: v7 = ⊗(v4) l9: bra l3
l10: ● = v4
v3
v0
v2
v7
v4
x1
x6
x5
Example of constraint graph
![Page 93: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/93.jpg)
l11: bra l11
l0: v0 = ❍ l1: x1 = filter l2: validate(v, v2, l3, v3, l11)
l3: v4 ← φ (v2, v7) l4: x5 ← φ (x1, x6) l5: bra l6, l10
l6: ● = x5 l7: x6 = ❍ l8: v7 = ⊗(v4) l9: bra l3
l10: ● = v4
v3
v0
v2
v7
v4
x1
x6
x5
Example of constraint graph
![Page 94: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/94.jpg)
l11: bra l11
l0: v0 = ❍ l1: x1 = filter l2: validate(v, v2, l3, v3, l11)
l3: v4 ← φ (v2, v7) l4: x5 ← φ (x1, x6) l5: bra l6, l10
l6: ● = x5 l7: x6 = ❍ l8: v7 = ⊗(v4) l9: bra l3
l10: ● = v4
v3
v0
v2
v7
v4
x1
x6
x5
Example of constraint graph
![Page 95: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/95.jpg)
Find a path from to without passing through
v3
v0
v2
v7
v4
x1
x6
x5
Traverse the Constraint Graph
![Page 96: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/96.jpg)
Find a path from to without passing through
v3
v0
v2
v7
v4
x1
x6
x5
Traverse the Constraint Graph
![Page 97: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/97.jpg)
Find a path from to without passing through
v3
v0
v2
v7
v4
x1
x6
x5
Traverse the Constraint Graph
![Page 98: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/98.jpg)
Find a path from to without passing through
v3
v0
v2
v7
v4
x1
x6
x5
Report:
this program
slice is bug
free
Traverse the Constraint Graph
![Page 99: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/99.jpg)
Find a path from to without passing through
v3
v0
v2
v7
v4
x1
x6
x5
Traverse the Constraint Graph
![Page 100: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/100.jpg)
Find a path from to without passing through
v3
v0
v2
v7
v4
x1
x6
x5
Traverse the Constraint Graph
![Page 101: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/101.jpg)
Data Flow analysis as Graph Reachability
• This data flow analysis reduces to a simple graph reachability problem because the lazce that is associated with each variable has height two: either a variable is clean, or it is tainted.
• In our formalism, any direct dependence from a variable v to a variable u transmits the abstract state from u to v.
• Therefore, we just want to know if there is a path from a source of malicious informaAon, the original tainted data, to sensiAve operaAons. – This path must not cross saniAzers, because saniAzers propagate clean informaAon.
![Page 102: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/102.jpg)
Program Slices
• Any subset of the dependence graph is called a program slice.
• The part of the dependence graph that depends on a variable x0 is called the forward slice of x0.
• The part of the dependence graph on which a variable x1 depends on is called the backward slice of x1.
x0
x1
e
e e
x0
x1
e
e e
Image that you must report a "buggy" part of the program back to the user. How can you use this noAon of slices to determine which part of the program is buggy?
![Page 103: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/103.jpg)
Program Slices
• Slices are useful to decrease the amount of informaAon that we must report to the user, in case the program is vulnerable: – The dangerous part of the program is given by the intersecAon of
the backward slice of a sink with the forward slice of a source, as long as there is a path from this source to this sink.
x0
x1
e
e e
x0
x1
e
e e
x0
x1
e
e e
![Page 104: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/104.jpg)
The Importance of Pointer Analysis
• In order to truly represent the dependencies in the program, the dependence graph must take pointers into consideraAon.
int main(int argc, char** argv){ int a = 0; a++; int* p = (int*)malloc(sizeof(int)); int* p2 = (int*)malloc(sizeof(int)); *p = a; if (argc%2) { free(p2); p2 = p; } else { *p2 = *p; } return 0; } This example is taken from an actual LLVM bytecode.
![Page 105: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/105.jpg)
The Importance of Pointer Analysis
This example is taken from an actual LLVM bytecode.
![Page 106: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/106.jpg)
<?php $host = $_POST['host']; $uid = $_POST['uid']; $pwd = $_POST['pwd']; $database_colla4on = $_POST['database_collaAon']; $output = '<select id="database_collaAon" name="database_collaAon”> <opAon value="'.$database_colla4on.'" selected >’.$database_colla4on.'</opAon></select>'; if ($conn = @ mysql_connect($host, $uid, $pwd)) { $getCol = mysql_query("SHOW COLLATION"); if (@mysql_num_rows($getCol) > 0) { $output = '<select id="database_collaAonse_collaAon” name="database_collaAon">'; while ($row = mysql_fetch_row($getCol)) { $selected = ( $row[0]==$database_collaAon ? ' selected' : '' ); $output .= '<opAon value="'.$row[0].'"'.$selected.'>'.$row[0].'</opAon>'; } $output .= '</select>'; } } echo $output; ?> MODx CMS version 1.0.3
Real World Example
Can you idenAfy the vulnerability of this program?
![Page 107: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/107.jpg)
Real World Example
l4: database_collation4 = ol5: output5 = !(database_collation4)
l9: output9 = "()
l#: output# = #(output5, output14) l17: • = output#
l3: bra l9, l# l3: bra l11, l3
l11: select11 = "(database_collation4)
l12: output12 = "(select11)l14: output14 = "()
Where is the bug?
![Page 108: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/108.jpg)
l4: database_collation4 = ol5: output5 = !(database_collation4)
l9: output9 = "()
l#: output# = #(output5, output14) l17: • = output#
l3: bra l9, l# l3: bra l11, l3
l11: select11 = "(database_collation4)
l12: output12 = "(select11)l14: output14 = "()
Real World Example
![Page 109: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/109.jpg)
Real World Example
$database_collation4
$_POST['...']
echo$output5
$output!
![Page 110: Tainted Flow Analysis - Laure Gonnordlaure.gonnord.org/.../ER03_2015/TaintedFlowAnalysis.pdf · 2015. 1. 29. · Nano/PHP:*our*toy*language* • The*toy*language*should*be*simple,*](https://reader035.fdocuments.us/reader035/viewer/2022081623/61403edc1664f1518558a1ba/html5/thumbnails/110.jpg)
A Bit of History
• The foundaAons of informaAon flow were introduced by Denning and Denning in 1977
• The noAon of program slice was introduced by Weiser.
• The dense algorithm that we saw in these slides is an adaptaAon of Orbaek and Palsberg's analysis.
• These slides follow the exposiAon given by Rimsa et al.
• Denning, D., and Denning, P., "CerAficaAon of programs for secure informaAon flow", commun. ACM 20 p 504-‐513 (1977)
• Weiser, M., "Program Slicing", ICSE, p 439-‐449 (1981)
• Orbaek, P., and Palsberg, J., "Trust in the lambda-‐calculus", Journal of Func. Programming 7(6), p 557-‐591 (1997)
• Rimsa, A., Amorim, M., and Pereira, F., "Tainted Flow Analysis on e-‐SSA-‐form Programs", CC, p 124-‐143 (2011)