CSTalks-Visualizing Software Behavior-14Sep
![Page 1: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/1.jpg)
Visualizing Software Behavior Wu Yongzheng
14/Sep/2011 NUS SoC CSTalks 1
![Page 2: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/2.jpg)
Problems
• Software is complex
  – Large codebase
  – Interaction between components
  – Components from different vendors
  – Closed source, closed API
• Why understand software?
  – As a developer => fewer bugs
  – As an administrator => diagnosis
  – Curiosity?
• An execution trace contains software behavior information, but it is huge.
![Page 3: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/3.jpg)
Software Traces
• Types of traces
  – Instruction trace: records machine instructions
  – Call trace: records function calls
  – System call trace: records system calls
  – Software logs: important events
• System trace
  – System call trace from all processes
  – Mainly resource usage, system & process interaction
![Page 4: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/4.jpg)
WinResMon
• WinResMon: our trace recorder
• Works on Windows
• Types of events:
  – File: open, read, write, close, rename, …
  – Registry: open, get value, set value, delete, …
  – Network: connect, listen, send, receive, …
  – Process/thread: create, terminate
![Page 5: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/5.jpg)
Information (fields) in an Event
• PID/TID: process/thread ID
• Program name: path of the program’s EXE
• User name/group: the process’ owner
• Start/end time: event timing in CPU ticks
• Operation type: e.g. file open
• Parameter: type dependent, e.g.
  – file path, system call flags, registry path
  – IP address
• Call stack trace: call stack in the user process
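The event fields listed on this slide could be held in a record like the one sketched below. This is only an illustration: the class name, field names and types are assumptions, not WinResMon's actual format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TraceEvent:
    """One recorded event; every name here is hypothetical."""
    pid: int                      # process ID
    tid: int                      # thread ID
    program: str                  # path of the program's EXE
    user: str                     # process owner (user name/group)
    start_ticks: int              # start time in CPU ticks
    end_ticks: int                # end time in CPU ticks
    operation: str                # operation type, e.g. "file_open"
    # Type-dependent parameters, e.g. {"path": ...} or {"ip": ...}
    params: Dict[str, str] = field(default_factory=dict)
    # Return addresses of the user-process call stack, innermost first
    call_stack: List[int] = field(default_factory=list)

    @property
    def duration_ticks(self) -> int:
        """Time the event took, in CPU ticks."""
        return self.end_ticks - self.start_ticks
```

A record such as `TraceEvent(4, 8, r"c:\windows\notepad.exe", "alice", 1000, 1250, "file_open", {"path": r"c:\a.txt"})` then reports `duration_ticks` of 250.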
![Page 6: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/6.jpg)
Why visualize System Traces
• Software is complex
  – Interaction between modules and with other software
• Software can be closed source, but its interaction is open
• Humans are good at detecting
  – Repeated patterns
  – Anomalies
![Page 7: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/7.jpg)
What is DotPlot?
[Figure: DotPlot of Trace X against Trace Y. One trace reads E A C B E E E D C and the other reads A C B C D E B C E.]
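In its usual form, a DotPlot puts one sequence on each axis and marks cell (i, j) when item i of one sequence matches item j of the other. The sketch below builds that matrix for the two sequences on this slide. Which sequence is Trace X and which is Trace Y is an assumption, and so is matching by plain equality.

```python
def dotplot(xs, ys, match=lambda a, b: a == b):
    """Return a matrix with one row per item of ys and one column per item of xs.

    A cell is True when the two items match under the given rule.
    """
    return [[match(x, y) for x in xs] for y in ys]


def render(matrix):
    """Draw the matrix as text: '#' for a match, '.' otherwise."""
    return "\n".join("".join("#" if cell else "." for cell in row)
                     for row in matrix)


# Sequences taken from the slide; the axis assignment is assumed.
trace_x = "EACBEEEDC"
trace_y = "ACBCDEBCE"

if __name__ == "__main__":
    print(render(dotplot(trace_x, trace_y)))
```

Repeated subsequences show up as diagonal runs of marks, and a block of identical events (the three consecutive E's) shows up as a filled rectangle.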
![Page 8: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/8.jpg)
![Page 9: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/9.jpg)
An Example
Visualization comparing: MS PowerPoint, MS Word, OO Word, and OO PowerPoint.
![Page 10: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/10.jpg)
Elements of VDP
• 1: Extended DotPlot
• 2, 3: Axis Histogram
• 4, 5: Barcode
![Page 11: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/11.jpg)
Extended DotPlot
• Matching Rule
  – Defines whether two events match
  – By fields: e.g. “if PIDs and resource paths are the same”, “if program names are the same”
• DP Coloring Rule
  – Defines the color for matched events
  – Traditional DP uses black only
  – Use the RGB model on a black background, CMY on a white background
  – Use regular expressions to specify events
  – E.g. “.*file_open.*” → blue, “.*reg_.*” → cyan
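A matching rule and a coloring rule could be expressed roughly as follows. This is a sketch under assumptions: events are plain dictionaries, the field names are made up, and the regular expressions are tested against a simple "key=value" rendering of the event. The tool's real event syntax is not given in the slides.

```python
import re


def match_on(*fields):
    """Build a matching rule: two events match when all named fields are equal."""
    def rule(a, b):
        return all(a.get(f) == b.get(f) for f in fields)
    return rule


# Coloring rules from the slide: first matching pattern wins.
COLOR_RULES = [
    (re.compile(r".*file_open.*"), "blue"),
    (re.compile(r".*reg_.*"), "cyan"),
]


def color_for(event, rules=COLOR_RULES, default="black"):
    """Pick the color of a matched event from its textual form (assumed format)."""
    text = " ".join(f"{key}={value}" for key, value in sorted(event.items()))
    for pattern, color in rules:
        if pattern.fullmatch(text):
            return color
    return default


same_pid_and_path = match_on("pid", "path")   # "if PIDs and resource paths are the same"
same_program = match_on("program")            # "if program names are the same"
```

For example, two events with the same `pid` and `path` but different operations match under `same_pid_and_path`, and an event whose operation is `file_open` is colored blue.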
![Page 12: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/12.jpg)
Event-ordered and Time-ordered
• Each event takes a different amount of time
• The meaning/unit of each axis differs between the two orderings
[Figure panels: Event-ordered; Time-ordered]
![Page 13: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/13.jpg)
Axis Histogram
• Ticks mark unit time (e.g. 1 second)
• Histogram
  – Event density (time-ordered)
  – Time spent (event-ordered)
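The two histogram variants can be sketched as below: in a time-ordered plot each bin covers a fixed time span and counts events, while in an event-ordered plot each bin covers a fixed number of events and sums their durations. The function names and the binning scheme are assumptions for illustration.

```python
def event_density(start_times, bin_width):
    """Time-ordered axis: number of events starting in each time bin."""
    if not start_times:
        return []
    t0 = min(start_times)
    nbins = int((max(start_times) - t0) // bin_width) + 1
    counts = [0] * nbins
    for t in start_times:
        counts[int((t - t0) // bin_width)] += 1
    return counts


def time_spent(durations, events_per_bin):
    """Event-ordered axis: total duration of each run of consecutive events."""
    return [sum(durations[i:i + events_per_bin])
            for i in range(0, len(durations), events_per_bin)]
```

With start times `[0, 1, 2, 10, 11]` and a bin width of 5, the density histogram is `[3, 0, 2]`; with durations `[1, 2, 3, 4, 5]` and two events per bin, the time-spent histogram is `[3, 7, 5]`.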
![Page 14: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/14.jpg)
Barcode
• One dimensional
• Highlights user-chosen events
  – E.g. file_open → red
• One or more barcodes (e.g. three below)
• Barcode coloring rules
![Page 15: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/15.jpg)
Example 1: File Copying
• Self-comparison, event-ordered
• xcopy copying 8 files: 1MB, 10KB, 10MB, 100KB, 1MB, 10KB, 10MB and 100KB
• DP match: operation + parameter (pathname)
• DP color: magenta → source; cyan → destination; black → other
[Figure labels: File Operation; Source/Dst File Operation; Registry Operation]
![Page 16: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/16.jpg)
File Size
• File size is visible
• The two 1MB and two 10MB files are shown
• The two 10KB and two 100KB files are visible only when zoomed in
![Page 17: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/17.jpg)
Zooming in
DP color: magenta → source; cyan → destination; black → other
![Page 18: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/18.jpg)
A Surprise: Registry Operations
So many registry operations for a console application
[Figure label: Registry Operation]
![Page 19: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/19.jpg)
Another Surprise: DLLs
• These are file operations, but not on the source or destination files.
• More time is spent on DLLs than on a 1MB file.
[Figure labels: File Operation; Source/Dst File Operation; DLLs]
![Page 20: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/20.jpg)
Example 2: Software Build
• X: build succeeded; Y: build failed due to a missing .c file
• DP match: program + operation + value (pathname)
• DP color: black → any
• Bar1 color: black → nmake.exe
• Bar2 color: cyan → cl.exe; magenta → link.exe
• Bar3 color: cyan → reading .c files; magenta → reading .h files
![Page 21: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/21.jpg)
Number of Executions
• X: 4 compiles (cl.exe), 1 link (link.exe)
• Y: 3 compiles, 0 links
• Y: the third compile doesn’t read any .c or .h file
• Bar2 color: cyan → cl.exe; magenta → link.exe
• Bar3 color: cyan → reading .c files; magenta → reading .h files
![Page 22: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/22.jpg)
Similarity & Difference
• The two traces are similar.
• The Y (failed) trace terminates earlier, right before reading a .c file.
![Page 23: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/23.jpg)
Different Matching Rule
[Figure panels: Operation Type; Program Name]
![Page 24: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/24.jpg)
Example 3: Two Idle Windows Machines
• Time-ordered
• 1 hour each
• Recorded at different times
• About 750K events each
![Page 25: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/25.jpg)
Anomaly & Repeated Pattern
• Periodic pattern
• Most events in R1
• Most time in R2 alike
• Easy to spot anomalies & regular patterns
[Figure labels: R1; R2]
![Page 26: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/26.jpg)
Zoom In
[Figure labels: R1; R2]
![Page 27: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/27.jpg)
R1: Windows Update
• Similar events (darker area) are by the Windows Auto Updater
• More file operations, fewer registry operations
• Color: magenta → wuauclt.exe (Windows Update)
[Figure labels: File Operation; Registry Operation]
![Page 28: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/28.jpg)
![Page 29: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/29.jpg)
Visualizing Module Dependencies
• The problem
  – There’s a vulnerability in X. Which software uses X?
  – Why does my software use X? I never call it.
  – Is it safe to uninstall X?
• Software modules
  – Windows DLLs
  – UNIX .so files
  – Java classes, packages
![Page 30: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/30.jpg)
Examples of dependencies (1)
• Binaries used by notepad
  – c:\windows\apppatch\acgenral.dll
  – c:\windows\system32\avgrsstx.dll
  – c:\windows\system32\imm32.dll
  – c:\windows\system32\lpk.dll
  – c:\windows\system32\msacm32.dll
  – c:\windows\system32\msctf.dll
  – c:\windows\system32\msctfime.ime
  – c:\windows\system32\shimeng.dll
  – c:\windows\system32\usp10.dll
  – c:\windows\system32\uxtheme.dll
  – c:\windows\system32\winmm.dll
  – c:\windows\system32\winspool.drv
  – c:\windows\winsxs\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.2600.5512_x-ww_35d4ce83\comctl32.dll
![Page 31: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/31.jpg)
Examples of dependencies (2)
• Simple boot (only Windows installed)
  – DLLs: 154
  – EXEs: 10
  – Drivers: 1
  – IME: 1
• Typical boot (Windows + applications)
  – DLLs: 274
  – EXEs: 15
  – Telephony/Modem: 6
  – Drivers: 3
  – ActiveX: 2
  – IME: 1
![Page 32: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/32.jpg)
Visualization (1)
• Basic dependency graph
• The graph is too dense
![Page 33: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/33.jpg)
Binary Dependency Visualization
• Two types of nodes: EXE, DLL + etc
• Three types of directed edges
  1. EXE X launches another EXE Y
  2. EXE X loads a DLL Y
  3. A function in binary X calls a function in binary Y
• How are binaries shared among programs?
  – EXE Dependency Graph
  – Only Type 1 and 2 edges
  – Group DLLs by loader
• How do binaries interact?
  – DLL Dependency Graph
  – Only Type 2 and 3 edges
  – Group DLLs manually by functionality or software vendor
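The three edge types and the two derived graphs can be sketched as a filter over typed edges. The names below are illustrative. Reading "group DLLs by loader" as "DLLs loaded by exactly the same set of EXEs share a group" is an assumption, not something the slides spell out.

```python
from enum import Enum


class EdgeType(Enum):
    LAUNCHES = 1   # EXE X launches another EXE Y
    LOADS = 2      # EXE X loads a DLL Y
    CALLS = 3      # a function in binary X calls a function in binary Y


def filter_edges(edges, keep):
    """Keep only (source, destination, type) edges whose type is in `keep`."""
    return [(src, dst, kind) for src, dst, kind in edges if kind in keep]


def exe_dependency_graph(edges):
    """Only Type 1 and 2 edges."""
    return filter_edges(edges, {EdgeType.LAUNCHES, EdgeType.LOADS})


def dll_dependency_graph(edges):
    """Only Type 2 and 3 edges."""
    return filter_edges(edges, {EdgeType.LOADS, EdgeType.CALLS})


def group_dlls_by_loader(edges):
    """Map each set of loading EXEs to the DLLs loaded by exactly that set."""
    loaders = {}
    for src, dst, kind in edges:
        if kind is EdgeType.LOADS:
            loaders.setdefault(dst, set()).add(src)
    groups = {}
    for dll, exes in loaders.items():
        groups.setdefault(frozenset(exes), set()).add(dll)
    return groups
```

Grouping this way turns many per-DLL nodes into a few nodes such as "DLLs used only by program P" and "DLLs shared by P and Q", which is what makes the dense graph readable.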
![Page 34: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/34.jpg)
![Page 35: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/35.jpg)
A more usable Visualization: EXE Dependency Graph
• Grouped dependency graph
![Page 36: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/36.jpg)
Comparing Microsoft Word and Open Office Writer
![Page 37: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/37.jpg)
DLL Dependency Graph: actual binary usage
• Some definitions:
  – An EXE-DLL dependency in a DLL Dependency Graph exists when there is a control transfer from code in executable x to code in DLL y. We say that x has an EXE-DLL dependency on y.
  – A DLL-DLL dependency in a DLL Dependency Graph exists when there is a control transfer from code in DLL x to code in DLL y. We say that x has a DLL-DLL dependency on y.
![Page 38: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/38.jpg)
wget: DLL dependency without grouping
![Page 39: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/39.jpg)
wget: DLL dependency grouped by functionality
![Page 40: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/40.jpg)
Examples of grouping By functionality (GIMP)
![Page 41: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/41.jpg)
Examples of grouping By software vendor (GIMP)
![Page 42: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/42.jpg)
Two Operations
• Diff
  – Compare two graphs
    • E.g. from the same program but a different environment/input
    • E.g. from two related programs
  – Diff graphs G1 and G2 to get G3
• Projection
  – Focus on a particular module X
  – Only show modules that call X or are called by X (recursive definition)
  – Project graph G1 on module M to get G2
  – Not a simple subgraph problem
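One simple way to realize the Diff operation is to compare the edge sets of the two graphs. The sketch below assumes each graph is a set of (caller, callee) pairs and returns the result as three labelled edge sets. How the tool actually encodes G3 is not stated in the slides, and the module names in the example are invented.

```python
def diff_graphs(g1, g2):
    """Split the edges of two dependency graphs into shared and unique parts."""
    g1, g2 = set(g1), set(g2)
    return {
        "only_in_g1": g1 - g2,   # dependencies that disappear in g2
        "only_in_g2": g2 - g1,   # dependencies that appear in g2
        "common": g1 & g2,       # dependencies present in both runs
    }


# Hypothetical example: the same program run without and with a plugin.
without_plugin = {("app.exe", "core.dll")}
with_plugin = {("app.exe", "core.dll"), ("core.dll", "plugin.dll")}
g3 = diff_graphs(without_plugin, with_plugin)
```

Here `g3["only_in_g2"]` holds the single edge into `plugin.dll`, which is the part a diff view would highlight.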
![Page 43: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/43.jpg)
Diff of DLL dependency graph of Internet Explorer with Flash and without
![Page 44: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/44.jpg)
Projection of the DLL dependency graph of Internet Explorer on Flash
![Page 45: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/45.jpg)
Firefox using tortoisesvn
![Page 46: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/46.jpg)
Questions?
![Page 47: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/47.jpg)
Visualizing binaries executed
• The call graph is large.
• Group functions into images => DLL dependency graph.
• The DLL dependency graph is still large.
• Group DLLs by properties:
  – By functionality: graphics, audio, network…
  – By vendor: microsoft, adobe…
  – By path: C:\windows\system32\*.dll, D:\vmware\*.dll…
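Grouping by path can be sketched as a list of wildcard rules that map each binary to a group, after which edges between binaries are collapsed into edges between groups. The rule list, group names and the "other" fallback are assumptions. Note that with Python's `fnmatch`, `*` also matches path separators, so a rule covers subdirectories too.

```python
from fnmatch import fnmatchcase


def group_of(path, rules, default="other"):
    """Return the group of the first rule whose pattern matches the path."""
    lowered = path.lower()               # Windows paths are case-insensitive
    for pattern, group in rules:
        if fnmatchcase(lowered, pattern.lower()):
            return group
    return default


def collapse(edges, rules):
    """Replace binary-to-binary edges with group-to-group edges."""
    grouped = set()
    for src, dst in edges:
        gs, gd = group_of(src, rules), group_of(dst, rules)
        if gs != gd:                     # drop edges inside a single group
            grouped.add((gs, gd))
    return grouped


# Patterns from the slide; the group names are made up.
RULES = [
    (r"C:\windows\system32\*.dll", "system32"),
    (r"D:\vmware\*.dll", "vmware"),
]
```

Grouping by functionality or by vendor would use the same `collapse` step with a different `group_of`, for example a hand-written table from DLL name to category.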
![Page 48: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/48.jpg)
Visualizing binaries executed (1)
• Generate call tree, call graph, DLL dependency graph
• Pin tool to collect the execution trace
  – The trace includes call, return, thread, context and system call events
  – Call and return events record the stack pointer, PC and target address
• Not trivial to maintain the call stack by tracking calls and returns
  – Non-returning functions (long jump)
  – Threads, fibers
  – Context
  – Kernel callbacks
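One common workaround for non-returning functions is to keep a shadow call stack and use the recorded stack pointer to discard frames that can no longer be live. The sketch below illustrates that idea and is not necessarily what the authors implemented. It assumes a downward-growing stack, a stack pointer sampled at the call instruction, a single thread, and a simplified event tuple of (kind, stack pointer, target).

```python
def call_edges(events, root="main"):
    """Rebuild caller -> callee edges from call/return events.

    Each shadow-stack frame stores the function name and the stack pointer
    seen at the call that entered it.
    """
    stack = [(root, float("inf"))]       # the root frame is never discarded
    edges = []
    for kind, sp, target in events:
        if kind == "call":
            # A frame entered at a stack pointer <= the current one must have
            # exited already (e.g. via a long jump) even without a return event.
            while len(stack) > 1 and stack[-1][1] <= sp:
                stack.pop()
            edges.append((stack[-1][0], target))
            stack.append((target, sp))
        elif kind == "ret" and len(stack) > 1:
            stack.pop()
    return edges


# main calls A, A calls B, B long-jumps back into main, then main calls C.
events = [("call", 100, "A"), ("call", 90, "B"), ("call", 100, "C"), ("ret", 96, "C")]
```

Without the stack-pointer check, the call to C would be attributed to B. With it, the stale frames for B and A are dropped and the edge is recorded as main to C. Threads and fibers would need one shadow stack each.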
![Page 49: CSTalks-Visualizing Software Behavior-14Sep](https://reader034.fdocuments.us/reader034/viewer/2022051323/547cf2d4b37959492b8b512f/html5/thumbnails/49.jpg)
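Projection
    /* Forward declarations so each call sees a prototype. */
    void A(void); void B(int i); void C(void); void D(void);

    int main(void) { A(); B(1); return 0; }
    void A(void) { B(0); }
    void B(int i) { if (i) D(); else C(); }
    void C(void) {}
    void D(void) {}
[Figure: two call graphs. Full Graph has nodes main, A, B, C, D; Project on A has nodes main, A, B, C.]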
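The example is consistent with projecting on the dynamic call tree rather than on the merged call graph: D is reachable from B in the full graph, but never on a call path that passes through A. The sketch below implements that reading, which is an interpretation of the slide and not a confirmed description of the tool. The tree encoding and function names are assumptions.

```python
def call_paths(tree, prefix=()):
    """Yield every root-to-node path of a call tree given as (name, children)."""
    name, children = tree
    path = prefix + (name,)
    yield path
    for child in children:
        yield from call_paths(child, path)


def edges_of(paths):
    """Collect all caller -> callee edges that appear along the given paths."""
    return {(p[i], p[i + 1]) for p in paths for i in range(len(p) - 1)}


def project(tree, module):
    """Keep only the call paths that pass through `module`."""
    return edges_of(p for p in call_paths(tree) if module in p)


# Call tree of the C program above: main calls A (which calls B(0), which
# calls C) and then B(1) (which calls D).
TREE = ("main", [("A", [("B", [("C", [])])]),
                 ("B", [("D", [])])])

full_graph = edges_of(call_paths(TREE))
on_a = project(TREE, "A")
```

`full_graph` contains the five edges among main, A, B, C and D, while `on_a` keeps only main to A, A to B and B to C, which matches the node sets shown for the two graphs.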