Transcript
Page 1: Examining the Landscape and Impact of Android App Plagiarism


Examining the Landscape and Impact of Android App Plagiarism

Clint Gibler, Ryan Stevens, Jonathan Crussell, Hao Chen

Hui Zang, Heesook Choi

Page 2

Smartphones Abound

Page 4

Plagiarism Harms the App Ecosystem

• Developers
  – Lose revenue and incentive to make apps
• Markets
  – Polluted search results
• Users
  – Difficult to find useful, high-quality apps

Page 5

Investigation Goals

• Characteristics of cloned apps
  – Market
  – App category
  – Ad provider
• Impact on developers
  – Ad revenue
  – User base

Page 6

Definitions

• Cloning
  – Apps with significant code sharing
• Plagiarism
  – Cloned apps by different authors
• Owner
  – Signed and uploaded a given app

Page 7

Dataset – Android Apps

• 265,000 apps from 17 markets
  – 9 English
  – 6 Chinese
  – 2 Russian

[Figure: markets in the dataset; surviving labels: Apps, Play]

Page 8

Dataset – Clone Clusters

• 265,000 apps from 17 markets

Page 10

Dataset – Clone Clusters

• [Crussell ESORICS 2013]
• >5,000 clusters of similar apps
• >44,000 unique apps

Likely clones

Page 11

Characteristics of Cloned Apps

Page 12

Cloning between Markets

[Figure: cloning between markets; surviving labels: play, androidonline]

Page 13

How do Plagiarized Apps Impact Developers?

Page 14

Determining Impact

• Naïve approach
  – How many times has this app been cloned?
• Our approach
  – How many users run plagiarized apps instead of the original?

Page 15

Determining Impact

• Measuring users running a given app
• Determining app ownership
• Identifying original app from plagiarized

Page 16

(we’re not Google)

Page 17

So… what apps are you running?

Page 18

Advertising Background

[Figure: the ad library inside an app sends an ad request carrying Client ID = “bob” to the ad server, which returns an ad URL.]

Page 19

Number of Users Running an App

“bob” Aha! Bob’s app is being run.
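
The counting idea on this slide can be sketched in Python. This is only an illustration under stated assumptions, not the authors' tooling: the host `ads.example.com` and the query parameter name `client_id` are hypothetical, since each real ad library uses its own request format.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs


def count_ad_requests(urls, id_param="client_id"):
    """Count observed ad requests per client ID.

    `id_param` is a hypothetical parameter name; a real ad library
    would need its own extraction rule.
    """
    counts = Counter()
    for url in urls:
        query = parse_qs(urlparse(url).query)
        # parse_qs maps each parameter name to a list of values.
        for client_id in query.get(id_param, []):
            counts[client_id] += 1
    return counts


# Hypothetical captured ad request URLs.
observed = [
    "http://ads.example.com/getad?client_id=bob&fmt=banner",
    "http://ads.example.com/getad?client_id=alice&fmt=banner",
    "http://ads.example.com/getad?fmt=banner&client_id=bob",
    "http://ads.example.com/status",
]
requests_per_client = count_ad_requests(observed)
```

Each request carrying “bob” is evidence that one of Bob's apps was running, so the per-client counts act as a proxy for usage and ad impressions.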

Page 20

Dataset – Network Traffic

• Major U.S. cellular provider
• 2.6 billion packets in 12 days
• All user-identifying info removed

Page 21

Determine Ownership of Apps

• Owners may have multiple dev accounts
  – On one or multiple markets
• Apps that share an owner should not be considered plagiarized

Page 22

Determine Ownership of Apps

Phase 1 – Market/Dev Account

Page 24

Determine Ownership of Apps

Phase 2 – Signature

Page 27

Determine Ownership of Apps

Phase 3 – Client IDs
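
One way to picture the three phases is as successive merges of apps into owner groups. The sketch below assumes that each phase links apps sharing the named attribute (same market and developer account, same signing signature, or a shared ad client ID) and uses a union-find structure to do the merging. The record fields (`name`, `market`, `account`, `signature`, `client_ids`) are hypothetical, and the slides do not confirm that the authors implemented it this way.

```python
from collections import defaultdict


def group_by_owner(apps):
    """Group apps that share an account, a signature, or a client ID."""
    parent = {}

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    for app in apps:
        node = ("app", app["name"])
        parent.setdefault(node, node)
        # Phase 1: market/dev account. Phase 2: signature. Phase 3: client IDs.
        keys = [
            ("account", app["market"], app["account"]),
            ("signature", app["signature"]),
        ]
        keys += [("client_id", cid) for cid in app["client_ids"]]
        for key in keys:
            parent.setdefault(key, key)
            union(node, key)

    groups = defaultdict(set)
    for app in apps:
        groups[find(("app", app["name"]))].add(app["name"])
    return sorted(sorted(names) for names in groups.values())


# Hypothetical records: a and b share a signature, a and c share a client ID.
sample_apps = [
    {"name": "a", "market": "m1", "account": "acct1", "signature": "sigA", "client_ids": ["bob"]},
    {"name": "b", "market": "m2", "account": "acct2", "signature": "sigA", "client_ids": []},
    {"name": "c", "market": "m3", "account": "acct3", "signature": "sigC", "client_ids": ["bob"]},
    {"name": "d", "market": "m1", "account": "acct4", "signature": "sigD", "client_ids": ["eve"]},
]
owner_groups = group_by_owner(sample_apps)
```

Here apps a, b, and c end up under one owner even though no single attribute links all three, so none of them would be counted as plagiarizing the others.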

Page 31

Identifying Original Apps: Naïve Approaches

• Date first uploaded to the market
• Popularity
  – Installs
  – Rating
• Code size

Page 32

Determining Original vs Clones

• Goal: give lower bound

An Example Cluster

• Alice: 20 impressions
• Bob: 30 impressions
• Charlie: 50 impressions

[Figure: pie chart of impressions. Alice 20%, Bob 30%, Charlie 50%.]

Page 33

Determining Original vs Clones

• Goal: give lower bound

[Figure: two pie charts over Alice, Bob, and Charlie. Impressions: 50%, 20%, 30%. Estimated Loss: 50%, 50%.]

Page 34

Determining Original vs Clones

• Goal: give lower bound

[Figure: pie charts over Alice, Bob, and Charlie, labeled Estimated Loss and Real Loss. Surviving values: 50%, 20%, 30% and 50%, 50%.]

Page 35

Determining Original vs Clones

• Goal: give lower bound

[Figure: two pie charts over Alice, Bob, and Charlie. Estimated Loss: 50%, 50%. Real Loss: 70%, 30%.]
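
The Alice/Bob/Charlie example can be worked through in a few lines of Python. The sketch rests on one reading of the slides, which is an assumption: the estimated loss treats the owner with the largest share of impressions as the original, so the estimate can never exceed the true loss, whoever the real original is. The function names are hypothetical.

```python
def lower_bound_loss(impressions):
    """Fraction of impressions lost if the largest owner is the original."""
    total = sum(impressions.values())
    if total == 0:
        return 0.0
    return (total - max(impressions.values())) / total


def loss_if_original(impressions, original):
    """Fraction of impressions lost if `original` is the true original owner."""
    total = sum(impressions.values())
    if total == 0:
        return 0.0
    return (total - impressions[original]) / total


# Impressions per owner in the example cluster from the slides.
cluster = {"Alice": 20, "Bob": 30, "Charlie": 50}

estimated = lower_bound_loss(cluster)      # Charlie assumed original
real = loss_if_original(cluster, "Bob")    # suppose Bob is the true original
```

With these numbers the estimate is 50%, while the loss if Bob is the true original is 70%, which matches the 50%/50% and 70%/30% charts and shows why the estimate is a lower bound.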

Page 36

Percent Revenue/Users Lost

Page 37

Suggestions for Reducing Cloning

• Developers
  – ProGuard, License Verification Library (LVL)
• Markets
  – Use tools to detect cloned apps
  – Adjust market registration fee
• Ad providers
  – Vet developers

Page 38

Conclusion

• First large-scale study on impact of Android application plagiarism
• Combine
  – Static analysis for clone detection
  – Network analysis for revenue loss measurement
  – Use client IDs to link both analyses
• Coming soon: sherlockdroid.com