Mining SQL Injection and Cross Site Scripting Vulnerabilities using Hybrid Program Analysis

Shar Lwin Khin, Tan Hee Beng Kuan, Information Engineering, Nanyang Technological University, Singapore

Lionel Briand, Interdisciplinary Centre for ICT Security, Reliability, and Trust, University of Luxembourg, Luxembourg


Motivation

• Increasing number of vulnerabilities
• Developers lack security awareness
• Manual vulnerability audit is effort-intensive

Related Work

Method                      Granularity  Accuracy  Scalability
Vuln. prediction            ×            √         √
Static taint analysis       √            ×         √
Static & dynamic analysis   √            √         ×
???                         √            √         √

Problem Definition 1/2

• Input validation and sanitization are two common defense methods used in web applications
• Static attributes have been shown to be indicators of vulnerabilities, though not accurate enough
• Can static and dynamic attributes that characterize the implementations of these defense methods be used together as indicators?
• Use machine learning to predict vulnerability based on these attributes

Problem Definition 2/2

• Typical prediction models are classification-based
   - Being supervised learning, their effectiveness depends on the availability of sufficient training data tagged with class labels
• Cluster analysis (CA) is a type of unsupervised learning method
• CA may be used if vulnerable instances can be distinguished from non-vulnerable instances based on the proposed attributes

Vulnerability Distributions

[Figure: vulnerability distribution chart. © Web Hacking Incident Database]

SQL Injection

• Cause: Inadequate validation and sanitization of user inputs used in queries

[Figure: Hacker -> login.php -> Database]

Query built in login.php:

$q = "select * from user where name='".$name."' and pw='".$pw."'"

Input supplied by the hacker:

$name = ' or 1=1 --

Resulting query:

$q = "select * from user where name='' or 1=1--' and pw=''"

Unauthorized user information: SQLI!
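The injection above can be reproduced in a few lines. This is a minimal sketch in Python with an in-memory SQLite database, not the PHP code from the slide; the table contents and the function names `login_concat` and `login_param` are made up for illustration. The parameterized variant is shown only as a contrast and is not discussed in the slides.

```python
import sqlite3

# In-memory database with two made-up users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (name TEXT, pw TEXT)")
conn.executemany("INSERT INTO user VALUES (?, ?)",
                 [("alice", "secret1"), ("bob", "secret2")])

def login_concat(name, pw):
    # Vulnerable: user input is concatenated into the query text.
    q = "select * from user where name='" + name + "' and pw='" + pw + "'"
    return conn.execute(q).fetchall()

def login_param(name, pw):
    # Input is passed as bound parameters, so it is treated as data only.
    q = "select * from user where name=? and pw=?"
    return conn.execute(q, (name, pw)).fetchall()

attack = "' or 1=1 --"
leaked = login_concat(attack, "")   # the comment marker drops the pw check
safe = login_param(attack, "")      # no user is literally named "' or 1=1 --"
print(len(leaked), len(safe))
```

With the concatenated query, the `--` turns the password check into a comment and `or 1=1` matches every row, so all users are returned.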

Cross Site Scripting

• Cause: No sanity check of input before it is used in HTML documents

[Figure: Hacker -> travelerTip.php -> Victim]

Inject script: <script>alert('xss!');</script>

Visit:

http://travelingForum/travelerTip.php?Action=Post&Place=Greece&Tip=<Script>document.location='http://hackerSite/stealCookie.jsp?cookie='+document.cookie; </Script>

Injected script executed on victim's browser

XSS!
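The missing sanity check can be illustrated with a small sketch. It is written in Python rather than PHP, and `render_tip` and `render_tip_escaped` are hypothetical stand-ins for the page logic of travelerTip.php, which the slides do not show.

```python
import html

def render_tip(tip):
    # Vulnerable: the input is written into the HTML page unchanged.
    return "<p>Tip: " + tip + "</p>"

def render_tip_escaped(tip):
    # html.escape turns <, >, &, and quotes into HTML entities.
    return "<p>Tip: " + html.escape(tip) + "</p>"

payload = "<script>alert('xss!');</script>"
unsafe_page = render_tip(payload)          # script tag survives intact
safe_page = render_tip_escaped(payload)    # script tag becomes inert text
```

In the unescaped page the browser would parse the payload as a script element; in the escaped page it is displayed as plain text.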

Vulnerability Prediction Principles 1/2

• Using hybrid code attributes to predict vulnerabilities
   - Based on both static and dynamic program analyses
• Input validation checks and sanitization operations are mainly based on string operations
   - e.g., preg_replace("/<script/", "", $data)
• Classify the types of string operations applied according to their potential effects on the inputs before their use in security-sensitive statements (sinks)
   - e.g., echo $data; mysql_query($data)
• Such validation checks and operations can be identified by analyzing data dependence graphs

Vulnerability Prediction Principles 2/2

• Given the data dependence graph of a sink, can we predict the sink's vulnerability by extracting from the graph the number of inputs and the numbers and types of validation and sanitization functions?
• E.g., if a sink uses five different inputs, there should be at least five input validation or sanitization functions.

[Figure: data dependence graph of a sink]
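The idea of counting inputs and sanitization functions in a sink's data dependence graph can be sketched as follows. The graph, the node names, and the helper functions are all hypothetical; the node labels only borrow attribute names used in this talk, and the real extraction performed by the authors' tool is not shown in the slides.

```python
from collections import Counter

# Hypothetical data dependence graph: each node lists the nodes it reads from.
depends_on = {
    "sink": ["concat"],
    "concat": ["clean_name", "pw"],
    "clean_name": ["name"],
    "name": [],
    "pw": [],
}
# Hypothetical node labels, borrowing attribute names from the talk.
kind = {
    "sink": "sink",
    "concat": "Propagate",
    "clean_name": "SQLI-sanitization",
    "name": "Client",
    "pw": "Client",
}

def backward_slice(graph, start):
    """Return every node the start node depends on, directly or indirectly."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for pred in graph[node]:
            if pred not in seen:
                seen.add(pred)
                stack.append(pred)
    return seen

def count_attributes(graph, kinds, sink):
    """Count node kinds in the sink's backward slice."""
    return Counter(kinds[n] for n in backward_slice(graph, sink))

counts = count_attributes(depends_on, kind, "sink")
# Rule of thumb from the slide: fewer sanitizers than inputs is suspicious.
suspicious = counts["SQLI-sanitization"] < counts["Client"]
```

Here the sink reads two client inputs but only one of them passes through a sanitization node, so the rule of thumb flags it.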

Static and Dynamic Classification

• A node is classified statically based on the language built-in functions that have specific security purposes, the language operators, and the predefined language parameters it uses.
   - e.g., addslashes($input), $_GET, $a = $b . $c
• A node is classified dynamically if it invokes user-defined functions or some built-in functions such as string replacement.
   - e.g., $sanitized = preg_replace("/<+/", "", $input)
• For dynamic classification, the function code is executed using a set of predefined test inputs, and the final values of the test input variables are searched for malicious characters.
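The dynamic step described above can be sketched as: run the function on predefined test inputs, then search the results for malicious characters. The test inputs, the character sets, and the function names below are assumptions chosen for illustration; the slides do not list the actual test inputs used.

```python
import re

# Hypothetical test inputs, keyed by the attribute each one probes.
TEST_INPUTS = {
    "HTMLTag": "<script>x</script>",
    "Delimiter": "a'b\"c",
}
# Characters treated as malicious for each probe (an assumption).
MALICIOUS = {
    "HTMLTag": "<>",
    "Delimiter": "'\"",
}

def classify_dynamically(func):
    """Run func on each test input and report which character classes it removes."""
    filtered = []
    for attr, test_input in TEST_INPUTS.items():
        result = func(test_input)
        if not any(ch in result for ch in MALICIOUS[attr]):
            filtered.append(attr)
    return filtered

def strip_angle_brackets(data):
    # Stand-in for a user-defined sanitizer found in the analyzed program.
    return re.sub(r"[<>]", "", data)

def identity(data):
    # Stand-in for a function that propagates its input unchanged.
    return data
```

Calling `classify_dynamically(strip_angle_brackets)` reports the function as an HTML-tag filter but not a delimiter filter, while `identity` is reported as filtering nothing.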

Hybrid Code Attributes

ID  Attribute name      Description

Static attributes
1   Client              The number of nodes that access data from HTTP request parameters
2   File                The number of nodes that access data from files
3   Database            The number of nodes that access data from a database
4   Text-database       Boolean value 'TRUE' if there is any text-based data accessed from a database; 'FALSE' otherwise
5   Other-database      Boolean value 'TRUE' if there is any data except text-based data accessed from a database; 'FALSE' otherwise
6   Session             The number of nodes that access data from persistent data objects
7   Uninit              The number of nodes that reference uninitialized program variables
8   SQLI-sanitization   The number of nodes that apply standard sanitization functions for preventing SQLI issues
9   XSS-sanitization    The number of nodes that apply standard sanitization functions for preventing XSS issues
10  Numeric-casting     The number of nodes that type-cast data into a numeric type
11  Numeric-type-check  The number of nodes that perform a numeric data type check
12  Encoding            The number of nodes that encode data into a certain format
13  Un-taint            The number of nodes that return predefined information or information not influenced by external users
14  Boolean             The number of nodes that invoke functions that return a Boolean value
15  Propagate           The number of nodes that propagate the partial or complete value of an input

Dynamic attributes
16  Numeric             The number of nodes that invoke functions that return only numeric, mathematical, or dash characters
17  LimitLength         The number of nodes that invoke string-length limiting functions
18  URL                 The number of nodes that invoke path-filtering functions
19  EventHandler        The number of nodes that invoke event-handler filtering functions
20  HTMLTag             The number of nodes that invoke HTML-tag filtering functions
21  Delimiter           The number of nodes that invoke delimiter filtering functions
22  AlternateEncode     The number of nodes that invoke alternate-character-encoding filtering functions

Target attribute
23  Vulnerable?         Indicates a class label: Vulnerable or Not-Vulnerable

Sample Attribute Vectors

•  Each sink is represented by a 23-dimensional attribute vector.

•  Sample attribute vectors (Session, XSS-sanit, Un-taint, Delimiter, Propagate, …, Vulnerable?):
   (2, 4, 0, 0, 2, …, Not-Vulnerable)
   (1, 0, 1, 1, 7, …, Vulnerable)

Supervised Vulnerability Prediction

• Data preprocessing
   - Normalization
   - Principal Component Analysis
• Classifiers
   - Logistic Regression (LR): regression analysis
   - Multi-Layer Perceptron (MLP): neural network analysis
• Training & testing: 10-fold cross validation
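Two of the steps listed above, normalization and the 10-fold split, are simple enough to sketch without a library. The sketch assumes min-max scaling (the slides say only "Normalization") and a plain unshuffled partition; the function names and sample numbers are made up. Principal Component Analysis and classifier training are left out.

```python
def min_max_normalize(rows):
    """Rescale each column to [0, 1]; constant columns become 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]

def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; each index is tested exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold]
        yield train, test

# Made-up attribute counts for four sinks (three attributes each).
vectors = [[2, 4, 0], [1, 0, 1], [0, 2, 1], [4, 1, 1]]
normalized = min_max_normalize(vectors)

# Ten train/test splits over 20 hypothetical sinks.
splits = list(k_fold_indices(20, k=10))
```

Each of the ten splits holds out two of the 20 sinks for testing and trains on the remaining 18, so every sink is used for testing exactly once.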

Unsupervised Vulnerability Prediction

• Use the same data preprocessing activities as the supervised models
• K-means cluster analysis, based on two assumptions:
   - Non-vulnerable sinks are much more frequent than vulnerable sinks
   - Vulnerable sinks have different characteristics from non-vulnerable sinks
• Label clusters as Vulnerable or Non-Vulnerable:
   - K=4: maximum number of clusters
   - %Normal=12: minimum size of a non-vulnerable cluster
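The labeling step can be sketched once cluster assignments are available. The sketch assumes that %Normal is a percentage of all sinks and that any cluster below it is labeled Vulnerable; the slides name the parameter but do not spell out the rule. The k-means step itself is omitted, and the assignments below are made up.

```python
from collections import Counter

def label_clusters(assignments, pct_normal=12):
    """Label a cluster Non-Vulnerable if it holds at least pct_normal percent of sinks."""
    sizes = Counter(assignments)
    total = len(assignments)
    return {
        cluster: "Non-Vulnerable" if 100 * size / total >= pct_normal else "Vulnerable"
        for cluster, size in sizes.items()
    }

# Made-up k-means output for 20 sinks: cluster 0 is large, cluster 1 is small.
assignments = [0] * 18 + [1] * 2
labels = label_clusters(assignments)
predictions = [labels[c] for c in assignments]
```

Cluster 1 holds 10% of the sinks, below the 12% threshold, so its two sinks are predicted Vulnerable. This only works when vulnerable sinks are rare and well separated, which is exactly what the two assumptions require.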

Case Study

• Six open-source web applications (PHP):
   - Known to be vulnerable
   - Functionalities: school admin, forum, news, content, database management
   - Sizes: 2k to 44k LOC
   - Vulnerability identification: manual & vuln. databases (Bugtraq, CVE)

Prototype Tool

[Figure: Architecture of PhpMiner; the diagram includes Weka]

Experiment & Result 1/2

Classification results of predictors built from hybrid attributes.

• LR performs better than MLP
• Maximum analysis time: 2 hours; average: ½ hour
• Accuracy (pf = false alarm rate):
   - Shin et al. (TSE'11) [3] achieved recall>80 and pf<25
   - Pixy (S&P'06) [1] reported pf>20. Too many false positives!
   - Ardilla (ICSE'09) [4] reported up to 50% of paths left unexplored. False negatives?
   - Our result: recall=90, pf=5

Measures in %:

Data             Classifier  Recall  False alarm  Precision
schmate-html     LR          99      3            98
                 MLP         99      0            100
faqforge-html    LR          89      5            94
                 MLP         91      5            94
utopia-html      LR          94      1            94
                 MLP         94      2            89
phorum-html      LR          78      1            70
                 MLP         33      0            100
cutesite-html    LR          68      9            61
                 MLP         78      8            67
myadmin-html     LR          85      1            89
                 MLP         75      1            83
Average (XSS)    LR          86      3            84
                 MLP         78      3            89
schmate-sql      LR          97      8            98
                 MLP         96      35           92
faqforge-sql     LR          88      4            94
                 MLP         88      4            94
phorum-sql       LR          100     3            63
                 MLP         0       1            0
cutesite-sql     LR          91      14           89
                 MLP         89      18           86
Average (SQLI)   LR          94      7            86
                 MLP         68      15           68
Overall average  LR          90      5            85
                 MLP         74      8            81
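For reference, the three measures can be computed from a confusion matrix as sketched below, using the standard definitions of recall, false alarm rate (pf), and precision. The function name and the counts are made up and are not taken from the case study.

```python
def prediction_measures(tp, fp, tn, fn):
    """Return recall, false alarm rate (pf), and precision as percentages."""
    def pct(num, den):
        # A zero denominator leaves the measure undefined.
        return None if den == 0 else 100.0 * num / den
    return {
        "recall": pct(tp, tp + fn),      # vulnerable sinks that were flagged
        "pf": pct(fp, fp + tn),          # non-vulnerable sinks wrongly flagged
        "precision": pct(tp, tp + fp),   # flagged sinks that are vulnerable
    }

# Made-up confusion counts, not taken from the case study.
m = prediction_measures(tp=9, fp=1, tn=19, fn=1)
# A predictor that flags nothing: precision has a zero denominator.
none_flagged = prediction_measures(tp=0, fp=0, tn=20, fn=10)
```

When a predictor flags no sink at all, recall and pf are both 0 and precision is undefined, which the sketch reports as `None`.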

Experiment & Result 2/2

k-means clustering analysis results on the datasets which have < 40% vulnerable sinks (measures in %):

Data           Recall  False alarm  Precision
utopia-html    100     13           65
phorum-html    56      11           16
cutesite-html  70      20           41
myadmin-html   55      8            33
phorum-sql     100     7            38
Average        76      12           39

k-means clustering analysis results on the datasets which have ≥ 40% vulnerable sinks (measures in %):

Data           Recall  False alarm  Precision
schmate-html   9       0            100
faqforge-html  26      0            100
schmate-sql    3       32           29
faqforge-sql   0       0            undefined
cutesite-sql   0       0            undefined
Average        8       6            undefined

When assumptions are not met, clustering does not work!

Limitations

• Supervised learning requires sufficient labeled data for training
• Unsupervised learning relies on assumptions that are not always true. Is it applicable to most commercial systems?
• For unsupervised learning, tuning the parameters is required:
   - K: maximum number of clusters
   - %Normal: minimum size of a non-vulnerable cluster

Conclusion

• Supports security auditing by providing probabilistic alerts about vulnerable code statements
• Proposes hybrid (static and dynamic) code attributes for vulnerability prediction using machine learning
• Attributes characterize common input validation and sanitization code patterns, without expensive analysis
• Scalability: < 2 hours on a regular PC
• Both supervised and unsupervised learning methods were used
   - Supervised learning accuracy: 90% recall, 85% precision
   - Unsupervised learning: lower accuracy; applicability?

Future Work

• Semi-supervised learning
• Combining data dependency information with control dependency information
• Address other types of similar vulnerabilities by considering other types of code patterns

The End!

http://sharlwinkhin.com

Thank You!

Questions?

References

1.  N. Jovanovic, C. Kruegel, and E. Kirda, "Pixy: a static analysis tool for detecting web application vulnerabilities," in IEEE Symposium on Security and Privacy, 2006, pp. 258-263.

2.  D. Balzarotti et al., "Saner: composing static and dynamic analysis to validate sanitization in web applications," in IEEE Symposium on Security and Privacy, 2008, pp. 387-401.

3.  Y. Shin, A. Meneely, L. Williams, and J. A. Osborne, "Evaluating complexity, code churn, and developer activity metrics as indicators of software vulnerabilities," IEEE Transactions on Software Engineering, vol. 37 (6), pp. 772-787, 2011.

4.  A. Kieżun, P. J. Guo, K. Jayaraman, and M. D. Ernst, "Automatic creation of SQL injection and cross-site scripting attacks," in Proceedings of the 31st International Conference on Software Engineering, Vancouver, BC, 2009, pp. 199-209.

5.  RSnake. http://ha.ckers.org, accessed March 2012.

6.  I. H. Witten and E. Frank, Data Mining, 2nd ed., Morgan Kaufmann, 2005.
