Dynamic Tainting for Deployed Java Programs
Du Li
Advisor: Witawas Srisa-an
University of Nebraska-Lincoln
Example
Example (cont.)
Dynamic Tainting Analysis
• Mark and track certain data at runtime
• Widely applied to:
  o Attack prevention
  o Data lifespan/scope analysis
  o Generation of test cases
  o etc.
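The "mark and track" idea can be sketched in plain Java. This is only an illustration, not the framework described in these slides: the `Tainted` type, its methods, and the `"network"` label are hypothetical names chosen for the example. A value from a source is marked with a label, and every derived value carries the union of its operands' labels.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** A value paired with the taint labels it has picked up (hypothetical type). */
final class Tainted {
    final int value;
    final Set<String> labels;

    private Tainted(int value, Set<String> labels) {
        this.value = value;
        this.labels = Collections.unmodifiableSet(labels);
    }

    /** An untainted value, e.g. a program constant. */
    static Tainted clean(int value) {
        return new Tainted(value, new HashSet<>());
    }

    /** Mark: a value that enters the program from a taint source. */
    static Tainted source(int value, String label) {
        Set<String> labels = new HashSet<>();
        labels.add(label);
        return new Tainted(value, labels);
    }

    /** Track: the result carries the union of both operands' labels. */
    Tainted plus(Tainted other) {
        Set<String> merged = new HashSet<>(labels);
        merged.addAll(other.labels);
        return new Tainted(value + other.value, merged);
    }

    boolean isTainted() {
        return !labels.isEmpty();
    }
}

final class TaintDemo {
    public static void main(String[] args) {
        Tainted userInput = Tainted.source(40, "network"); // marked at the source
        Tainted constant = Tainted.clean(2);
        Tainted sum = userInput.plus(constant);            // taint follows the data
        System.out.println(sum.value + " tainted=" + sum.isTainted());
    }
}
```

An attack-prevention check would then test `isTainted()` before a sensitive operation uses the value.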
Dynamic Tainting Analysis (cont.)
• Powerful but expensive
  o Significant overhead
  o Needs third-party tool support
  o Suitable for debugging and maintenance, but too much overhead for deployed systems
Goal
• Make dynamic tainting analysis feasible for deployed systems
  o Fast: high performance and low overhead
  o Easy to use: no need for extra HW or SW support
  o Flexible: users control what and when to monitor
Idea
• Java Virtual Machine is a good platform for dynamic tainting analysis
  o Useful runtime information
    - CFG, data access, object life span, ...
  o Existing components
    - Barriers
    - Optimizing compilers
    - Garbage collectors
Outline
• Motivation
• A proposed solution
• Implementation plan
• Conclusion
Motivation
• Existing dynamic tainting tools have high overhead
  o Dytan: 30-50 times overhead (Clause et al.)
  o TaintCheck: 20 times overhead in the worst case (Seward et al.)
  o Effective memory protector: 25% overhead but needs special hardware (Clause et al.)
Motivation (cont.)
• Existing solutions need extra SW support
  o Dytan: on top of PIN (Luk et al.)
  o TaintCheck: on top of Valgrind (Nethercote et al.)
  o Require extra user effort to set up the execution environment on third-party tools (not always feasible)
Motivation (cont.)
• Existing work is based on binary instrumentation
• Managed languages (like Java) are popular in many deployed environments
• It is time to investigate tainting analysis in virtual machines
Solution
• JVM-based dynamic tainting framework
  o Easy to use: a built-in JVM feature, no need for extra tool support, enabled by setting a flag
  o High performance: utilizes the existing runtime system to generate information
  o Customizable: can be configured to monitor only a subset of the data flow
Implementation
• JVM has useful infrastructure for dynamic tainting analysis
  o Read/write barriers → data access tracing
  o Garbage collector → marking information process
  o JIT compilers → optimization
Implementation (cont.)
• Data tracing
  o Write/read barriers can monitor all data access efficiently
  o Garbage collector can help to identify data references
  o Our experiment and existing work (Blackburn et al.) show that the average overhead of barriers ranges from 6.49% to 21.24%
  o Data tracing is the dominant part of the overhead
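How a barrier can carry taint along with data can be sketched as a simulation in plain Java. In a real JVM the barrier code is emitted by the compiler around heap loads and stores. The toy `BarrierHeap` class below, with its string-named fields and per-field taint bit, is a hypothetical stand-in for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy heap whose loads and stores go through explicit barriers (simulation only). */
final class BarrierHeap {
    private final Map<String, Integer> fields = new HashMap<>();
    private final Map<String, Boolean> shadow = new HashMap<>(); // one taint bit per field
    private long tracedAccesses = 0;

    /** Write barrier: runs on every store and records the taint bit of the stored value. */
    void store(String field, int value, boolean tainted) {
        tracedAccesses++;
        shadow.put(field, tainted);
        fields.put(field, value);
    }

    /** Read barrier: runs on a load and reports whether the loaded data is tainted. */
    boolean loadTaint(String field) {
        tracedAccesses++;
        return shadow.getOrDefault(field, false);
    }

    /** Plain data load; unset fields read as 0 in this toy model. */
    int load(String field) {
        return fields.getOrDefault(field, 0);
    }

    /** A field-to-field copy is a load followed by a store, so the taint bit moves too. */
    void copy(String from, String to) {
        store(to, load(from), loadTaint(from));
    }

    /** Number of accesses seen by the barriers so far. */
    long tracedAccesses() {
        return tracedAccesses;
    }
}
```

Because every access passes through a barrier, the cost of tracing scales with the number of loads and stores, which is consistent with data tracing dominating the overhead.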
Implementation (cont.)
• Achieving high performance
  o JIT compiler is a great place to improve performance
    - All tainting-analysis-related code will be optimized by the JIT compiler
    - Data dependency information can be generated by the JIT compiler
    - Make the tainting process more accurate and smart
Implementation (cont.)
• Achieving customization
  o JIT compiler can replace code at runtime
    - Our framework makes use of the JIT compiler to customize the tainting analysis sampling rate and granularity, or even to turn tainting analysis on or off
  o Tradeoff between accuracy and performance
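The accuracy/performance tradeoff can be illustrated with a small sampling policy. This sketch uses an ordinary runtime check in place of JIT code replacement, so it shows only the control knobs, not the mechanism the framework proposes. `TaintPolicy`, `samplePeriod`, and the method names are hypothetical.

```java
/** User-controlled tainting policy: an on/off switch plus a sampling period (hypothetical). */
final class TaintPolicy {
    private boolean enabled;
    private int samplePeriod; // track 1 out of every samplePeriod accesses
    private long seen = 0;
    private long tracked = 0;

    TaintPolicy(boolean enabled, int samplePeriod) {
        this.enabled = enabled;
        setSamplePeriod(samplePeriod);
    }

    synchronized void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }

    synchronized void setSamplePeriod(int samplePeriod) {
        if (samplePeriod < 1) {
            throw new IllegalArgumentException("samplePeriod must be >= 1");
        }
        this.samplePeriod = samplePeriod;
    }

    /** Decides whether the current data access should be tracked. */
    synchronized boolean shouldTrack() {
        if (!enabled) {
            return false; // analysis switched off: nothing is tracked
        }
        seen++;
        if (seen % samplePeriod != 0) {
            return false; // skipped access: cheaper, but taint may be missed
        }
        tracked++;
        return true;
    }

    synchronized long tracked() {
        return tracked;
    }
}
```

A larger `samplePeriod` lowers the tracking cost and lowers accuracy, while a period of 1 tracks every access.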
Status
1. Working on Maxine Virtual Machine to build a data flow analysis framework
2. Basic components are close to done
3. Plan to build tainting analysis on top of the framework
Conclusion
• Existing tainting analysis solutions are powerful but heavyweight
• A JVM-based tainting framework is easier to use, efficient, and flexible
• Existing JVM infrastructure can help to improve the tainting process