Page 1: STAAF, An Efficient Distributed Framework for Performing Large-Scale Android Application Analysis

1 Entire contents © 2011 Praetorian. All rights reserved. Your World, Secured

STAAF An Efficient Distributed Framework for Performing Large-Scale Android Application Analysis

OWASP AppSec USA

Thursday, September 22, 2011


Allow Me to Introduce Myself

Ryan W Smith VP Engineering at Praetorian

OWASP DFW Chapter Leader (2011)

Active member of The Honeynet Project (2002- )

8+ years of work with DoD, Intelligence Community, Federal/State/Local governments, and Fortune 500 companies


Presentation Roadmap

• STAAF (Overview)

• Background

• STAAF (Deep Dive)

• Results

• Future Work

• Conclusions


What can STAAF do for you?

Observation #1:

There are a lot of Android app analysis tools freely available

BUT:

They’re typically designed for single-app analysis

STAAF leverages the power of these tools as modules and adds efficiency, scalability, data management, and sharing


What can STAAF do for you?

Goal: Analyze 50k apps in less than 2 days and make the extracted data readily available to analysts

Observation #2:

Higher-value analysis can be attained by analyzing large numbers of applications over long periods of time

SOLUTION:

Reduce the time and complexity for an analyst to process large numbers of apps


What can STAAF do for you?

Goal: Minimize analysts’ effort to extract meaningful results from a large number of applications

[Diagram: Process Apps (Mod #1, Mod #2, Mod #3, ... Mod #N) → Analytic Data → Process Data (Mod A, Mod B, Mod C, ... Mod Z) → Meaningful Results]


What is STAAF


What STAAF is NOT

• STAAF is not a standalone application

• STAAF is not only a malware detection or anti-virus engine

• STAAF is not an application collection tool

STAAF is a problem-agnostic app analysis framework


Presentation Roadmap

• STAAF (Overview)

• Background

• STAAF (Deep Dive)

• Results

• Future Work

• Conclusions


Android’s Open App Model

• Low barrier to entry

• Apps hosted and installed from anywhere

• All apps are created equal

• No distinction between core apps and 3rd party apps

• Accept apps based on:

1. Trust of the source

2. Permissions requested


“Legitimate” Monitoring Apps

Ad/Marketing Networks

Social Gaming Networks


“Not-So-Legitimate” Permission Use

SMS Trojan

– Link to site hosting rogue app for “free movie player”

– Sends 2 premium SMS messages to a Kazakhstan number (about $5 per message)

Geinimi

– Repackaged apps in Chinese markets

– Sex Positions and MonkeyJump2 are known examples

– Central C&C

– Exfiltrates unique device identifiers

– Downloads and installs new apps (with permission)

DroidDream

– Approx. 50 malicious apps in official market

– Central C&C

– Exfiltrates unique device identifiers

– Downloads additional code modules


Presentation Roadmap

• STAAF (Overview)

• Background

• STAAF (Deep Dive)

• Results

• Future Work

• Conclusions

STAAF Workflow

Step 0: STAAF components initialized

Step 1: Users send APKs to be processed

Step 2: Coordinator checks database for previous results and logs new instance data for each APK

Step 3: Coordinator sends new APKs to the file repository service

Step 4: Coordinator sends tasking orders to the task queue

Step 5: Elastic computing nodes pull tasks from their designated task queue

Step 6: Elastic computing nodes pull in the APK and related information

Step 7: After processing, the elastic computing nodes push out processed files and analysis results

Step 8: When all tasking is complete, elastic computing nodes notify the coordinator
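The workflow above can be pictured with a small in-memory sketch. This is not STAAF's code: the `Coordinator` class, its field names, the `worker` function, and the placeholder "analysis" are all hypothetical, and plain Python dictionaries and a `queue.Queue` stand in for the real database, file repository service, and task queue.

```python
import hashlib
import queue
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Hypothetical stand-in for the coordinator plus the services it talks to."""
    results_db: dict = field(default_factory=dict)    # digest -> analysis results
    file_repo: dict = field(default_factory=dict)     # digest -> raw APK bytes
    task_queue: queue.Queue = field(default_factory=queue.Queue)
    finished_nodes: list = field(default_factory=list)

    def submit(self, apk_bytes: bytes) -> str:
        """Accept one APK from a user and schedule it if it is new."""
        digest = hashlib.sha256(apk_bytes).hexdigest()
        # Check for previous results: an APK seen before is not re-queued.
        if digest in self.file_repo:
            return digest
        self.file_repo[digest] = apk_bytes   # send the new APK to the file repository
        self.task_queue.put(digest)          # send a tasking order to the task queue
        return digest

    def notify_done(self, node_id: str) -> None:
        """Called by a node once its tasking is complete."""
        self.finished_nodes.append(node_id)


def worker(coord: Coordinator, node_id: str) -> int:
    """One compute node: pull tasks until the queue is empty, then notify."""
    processed = 0
    while True:
        try:
            digest = coord.task_queue.get_nowait()   # pull a task
        except queue.Empty:
            break
        apk = coord.file_repo[digest]                # pull in the APK
        # Placeholder analysis (an assumption): record only the file size.
        coord.results_db[digest] = {"size_bytes": len(apk)}
        processed += 1
    coord.notify_done(node_id)
    return processed
```

In a real deployment each of these pieces would be a separate network service and many workers would run concurrently; the sketch only shows the order of the hand-offs.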


Task Modules

• Can be registered dynamically

• Task-Oriented

– High level

• What % of apps use permission X

• What are the most commonly used libraries

– Mid level

• Extract Permissions

• Extract static URLs

• Extract Methods Called

– Low level

• Extract manifest

• Extract Dex bytecode

[Side labels on the level hierarchy: PRIORITY, COMPLEXITY]
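As an illustration of dynamic registration and of one mid-level task, the sketch below registers an "extract permissions" module in a small registry. The `MODULES` dictionary, the `register` decorator, and `SAMPLE_MANIFEST` are hypothetical names, not STAAF's interface, and the module assumes the manifest has already been converted from Android's binary XML to text by a lower-level task.

```python
import xml.etree.ElementTree as ET

# Namespace used by android:* attributes in AndroidManifest.xml.
ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical registry: module name -> level and callable.
MODULES = {}


def register(name, level):
    """Decorator that adds a task module to the registry at import or run time."""
    def decorator(func):
        MODULES[name] = {"level": level, "run": func}
        return func
    return decorator


@register("extract_permissions", level="mid")
def extract_permissions(manifest_xml: str) -> list:
    """Return the sorted permission names requested in a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    name_attr = f"{{{ANDROID_NS}}}name"
    return sorted(
        elem.get(name_attr)
        for elem in root.iter("uses-permission")
        if elem.get(name_attr)
    )


# Made-up manifest used only to exercise the module.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.app">
  <uses-permission android:name="android.permission.SEND_SMS"/>
  <uses-permission android:name="android.permission.INTERNET"/>
</manifest>"""

permissions = MODULES["extract_permissions"]["run"](SAMPLE_MANIFEST)
```

A high-level question such as "what % of apps use permission X" could then be answered by aggregating the output of this mid-level module across all apps.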


Deduplication of Effort

• All Intermediate data are cached for later use

– Extract and convert manifest to ASCII

– Extract Dex and convert to Smali and Java

– Compute the control flow graph from the Dex

• Libraries and shared resources must only be processed once

• Apps must only be processed once by each module, ever

Small savings matter at large scales
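One way to picture "processed once by each module, ever" is a cache keyed on the content hash and the module name. This is a minimal sketch under that assumption; `ResultCache` and `get_or_compute` are hypothetical names, and an in-memory dictionary stands in for whatever persistent store a real system would use.

```python
import hashlib


class ResultCache:
    """Hypothetical cache: (content hash, module name) -> module output."""

    def __init__(self):
        self._store = {}
        self.computed = 0   # how many times a module actually ran

    def get_or_compute(self, content: bytes, module_name: str, compute):
        """Run `compute(content)` only if this content/module pair is new."""
        key = (hashlib.sha256(content).hexdigest(), module_name)
        if key not in self._store:
            self._store[key] = compute(content)
            self.computed += 1
        return self._store[key]


cache = ResultCache()
shared_library = b"bytes of a library bundled in many apps"

# The same library seen in two different apps is processed only once per module.
first = cache.get_or_compute(shared_library, "size", len)
second = cache.get_or_compute(shared_library, "size", len)
```

Because the key is derived from the content rather than the file name, identical libraries and resources shared between apps hit the same cache entry.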


Distributed Data Sharing

• Sharing app samples is just the beginning

• Share the entire process:

– Raw Application

– Extracted Resources

– Raw Data

– Processed Data

• Or set specific limits on what data is shared


Presentation Roadmap

• STAAF (Overview)

• Background

• STAAF (Deep Dive)

• Results

• Future Work

• Conclusions


Time Trials STAAF Performance Tests

#   Time    Apps   ECUs   Nodes   Database
1   2h25m    500   1       1      Central
2   2h00m    500   1       2      Central
3   1h56m    500   1       4      Central
4   0h36m    500   1       4      Local
5   0h36m    500   5       1      Central
6   0h28m    500   5       4      Central
7   0h10m    500   5       4      Local
8   0h27m   1722   5       5      Local
9   1h19m   9349   5      10      Local

Achieved 50k apps in ~7 hours*

*Extrapolated from shorter tests
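The slides do not say which test the 50k figure was extrapolated from. As a rough check, the sketch below assumes a linear extrapolation from test 9 (9349 apps in 1h19m on 10 nodes with local databases); the function name and that choice of baseline are assumptions.

```python
TEST_9_APPS = 9349
TEST_9_MINUTES = 1 * 60 + 19   # 1h19m


def extrapolate_hours(target_apps: int, apps: int, minutes: int) -> float:
    """Linear extrapolation: same throughput, more apps."""
    return target_apps * minutes / apps / 60


estimate_hours = extrapolate_hours(50_000, TEST_9_APPS, TEST_9_MINUTES)
print(f"{estimate_hours:.2f} hours")
```

Under that assumption the estimate lands just over 7 hours, in line with the "~7 hours" on the slide.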


“One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.” -Amazon

1 ECU, 1 Node, Central DB: Compute Time 2h25m

5 ECUs, 1 Node, Central DB: Compute Time 0h36m


STAAF is bound by both CPU and database throughput

1 ECU, 1 Node, Central DB: Compute Time 2h25m

1 ECU, 4 Nodes, Central DB: Compute Time 1h56m


By using distributed, local databases STAAF achieves a significant time performance increase

1 ECU, 4 Nodes, Central DB: Compute Time 1h56m

1 ECU, 4 Nodes, Local DB: Compute Time 0h36m


By adding multiple processors with local databases, we achieve near-linear scalability

1 ECU, 1 Node, Central DB: Compute Time 2h25m

1 ECU, 4 Nodes, Local DB: Compute Time 0h36m


By simply increasing the CPU capacity to 5 ECUs, we achieve the same performance as four 1 ECU nodes

1 ECU, 4 Nodes, Local DB: Compute Time 0h36m

5 ECUs, 1 Node, Central DB: Compute Time 0h36m


Once again, using a central database fails to achieve linear performance gains

5 ECUs, 1 Node, Central DB: Compute Time 0h36m

5 ECUs, 4 Nodes, Central DB: Compute Time 0h28m


By using distributed, local databases we once again achieve near linear performance gains

5 ECUs, 1 Node, Central DB: Compute Time 0h36m

5 ECUs, 4 Nodes, Local DB: Compute Time 0h10m


By increasing CPU capacity, number of processing nodes, and number of databases, we decreased processing time by 14.5x

1 ECU, 1 Node, Central DB: Compute Time 2h25m

5 ECUs, 4 Nodes, Local DB: Compute Time 0h10m
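The speedup figures quoted across these comparisons follow directly from the table. The sketch below recomputes them from the 500-app runs; the dictionary and function names are illustrative only.

```python
# Compute time in minutes for the 500-app tests, taken from the table.
TRIAL_MINUTES = {
    1: 2 * 60 + 25,   # 1 ECU,  1 node,  central DB
    2: 2 * 60 + 0,    # 1 ECU,  2 nodes, central DB
    3: 1 * 60 + 56,   # 1 ECU,  4 nodes, central DB
    4: 36,            # 1 ECU,  4 nodes, local DB
    5: 36,            # 5 ECUs, 1 node,  central DB
    6: 28,            # 5 ECUs, 4 nodes, central DB
    7: 10,            # 5 ECUs, 4 nodes, local DB
}


def speedup(baseline_test: int, improved_test: int) -> float:
    """How many times faster the improved test ran than the baseline."""
    return TRIAL_MINUTES[baseline_test] / TRIAL_MINUTES[improved_test]


overall = speedup(1, 7)          # all three changes combined
nodes_local_db = speedup(1, 4)   # 4 nodes with local DBs vs 1 node
nodes_central_db = speedup(1, 3) # 4 nodes sharing a central DB vs 1 node
```

With four nodes, the local-database configuration comes out at roughly 4x, while the central-database configuration gains only about 1.25x, which is the contrast the slides draw.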


Larger tests confirm that STAAF continues to scale linearly

5 ECUs, 5 Nodes, Local DB: 1722 Apps, Compute Time 0h27m

5 ECUs, 10 Nodes, Local DB: 9349 Apps, Compute Time 1h19m


Initial Results :: Permission Requests

53,000 Applications Analyzed

Android Market: ~48,000

3rd Party Markets: ~5,000

Permissions Requested

Average: 3

Most Requested: 117

Location Data 11,929 (24%)

Read Contacts 3,636 (8%)

Send SMS 1,693 (4%)

Receive SMS 1,262 (4%)

Record Audio 1,100 (2%)

Read SMS 832 (2%)

Process Outgoing Calls 323 (1%)
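Counts and percentages like these could be produced by a high-level module that aggregates per-app permission sets. The sketch below is an assumption about how such an aggregation might look, not the code behind the slide; the function name and the four sample apps are made up.

```python
from collections import Counter


def permission_usage(apps: dict) -> dict:
    """Map each permission to (number of apps requesting it, percent of apps).

    `apps` maps an app identifier to the permissions that app requests.
    """
    total = len(apps)
    if total == 0:
        return {}
    # set() guards against a permission being listed twice in one app.
    counts = Counter(p for perms in apps.values() for p in set(perms))
    return {p: (n, round(100 * n / total)) for p, n in counts.items()}


# Hypothetical input: four apps and the permissions each one requests.
SAMPLE_APPS = {
    "app-a": ["ACCESS_FINE_LOCATION", "INTERNET"],
    "app-b": ["INTERNET"],
    "app-c": ["SEND_SMS", "INTERNET"],
    "app-d": ["ACCESS_FINE_LOCATION"],
}

usage = permission_usage(SAMPLE_APPS)
```

The per-app permission lists would come from a mid-level "extract permissions" task, so this step only has to count.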


Additional Results :: Shared Libraries

53,000 Applications Analyzed

Android Market: ~48,000

3rd Party Markets: ~5,000

com.admob 38% (18,426 apps)

org.apache 8% (3,684 apps)

com.google.android 6% (2,838 apps)

com.google.ads 6% (2,779 apps)

com.flurry 6% (2,762 apps)

com.mobclix 4% (2,055 apps)

com.millennialmedia 4% (1,758 apps)


Permissions Are Not a Good Indicator

• SMS Trojan

• SMS Replicator

• DroidDream

• zsones & Fake Security Tool

• Geinimi

Malware only needs a single permission


Presentation Roadmap

• STAAF (Overview)

• Background

• STAAF (Deep Dive)

• Results

• Future Work

• Conclusions


STAAF’s Future

• Build a publicly available user interface

• Provide a dashboard with global stats

• Further tune database performance

• Build more complex analysis modules

– Static data flow analysis

– Dynamic sandbox analysis

• Expose a public module interface through UI


Presentation Roadmap

• STAAF (Overview)

• Background

• STAAF (Deep Dive)

• Results

• Future Work

• Conclusions


Final Thoughts

• STAAF is a system of systems and services, not an application

• STAAF enables large-scale Android application analysis

• STAAF is problem-agnostic and can be tailored to answer many analytic questions

• STAAF augments the analyst’s capabilities; it does not replace the analyst

• STAAF achieves scalable performance increases by adding compute nodes and CPU capacity


Q&A