
AppWatcher: Unveiling the Underground Market of Trading Mobile App Reviews

Zhen Xie
Dept of Computer Science and Engineering

The Pennsylvania State University
[email protected]

Sencun Zhu
Dept of Computer Science and Engineering &

College of Information Sciences and Technology
The Pennsylvania State University

[email protected]

ABSTRACT

Driven by huge monetary rewards, some mobile application (app) developers turn to the underground market to buy positive reviews instead of doing legal advertising. These promotion reviews are either posted directly in app stores like iTunes and Google Play, or published on popular websites that have many app users. Until now, a clear understanding of this app promotion underground market has been lacking. In this work, we focus on unveiling this underground market and statistically analyzing the promotion incentives, the characteristics of promoted apps, and suspicious reviewers. To collect promoted apps, we built an automatic data collection system, AppWatcher, which monitored 52 paid review service providers for four months and crawled all the app metadata from their corresponding app stores. AppWatcher exposes 645 apps promoted in app stores and 29,680 apps promoted on popular websites. The current underground market is then reported from various perspectives (e.g., service price, app volume). We identified some interesting features of both promoted apps and suspicious reviewers, which are significantly different from those of randomly chosen apps. Finally, we built a simple tracer to narrow down the suspect list of promoted apps in the underground market.

Categories and Subject Descriptors

K.6.5 [Management of Computing and Information Systems]: Security and Protection; D.2.8 [Software Engineering]: Metrics

Keywords

Underground market, Mobile app reviews, Opinion Mining, App Stores, Fake Reviews

1. INTRODUCTION

The great popularity of smart devices (e.g., tablets and smartphones) has attracted more and more software developers to engage in mobile application (app) development.


As of September 2014, the Apple App Store [1] and Google Play [2] each offered about 1.3 million apps. The entire app market reached 35 billion dollars in 2013 and could hit 70 billion dollars by 2017 [3]. Meanwhile, the competition to attract customers has become fiercer than ever. To promote their apps, some developers choose to buy reviews from companies like bestreviewapp [4] and dailyappshow [5]. This review trading market causes great damage to app market players. On one hand, these paid reviews may mislead honest customers and cost them both time and money. On the other hand, they demoralize developers who try their best to improve app quality in order to receive more and better reviews. Therefore, paid reviews are forbidden by app store vendors. The Apple App Store warns developers not to manipulate user reviews with fake or paid reviews in Clause 3.10 of its "App Store Review Guidelines" [6]; Google Play requires developers not to manipulate product ratings or reviews through unauthorized means such as fake or paid reviews in its "Google Play Developer Program Policies" [7]. Moreover, according to "The FTC's Revised Endorsement Guides" [8], paid reviews are illegal if endorsers fail to disclose that they were paid to write them (currently almost no reviewers state whether their reviews are paid).

Based on where a paid review is posted, the app review market can be grouped into two types. The first type posts reviews directly in app stores (referred to as app store promotion). Unlike traditional product stores, app stores are highly centralized; for example, iTunes is the only store that distributes apps for non-jailbroken iPod Touch, iPhone, and iPad devices. Therefore, app reviews posted in these stores can be read by all app users. However, as one user is only allowed to write one review per app, non-conforming developers have to find many reviewers to promote their apps, which is often difficult. Thus, some service providers have established a business that charges developers who seek reviews and pays app users who want to make money by writing app reviews. The second promotion type posts reviews on popular websites (referred to as web promotion), which have large populations of readers and followers on social networks like Facebook and Twitter. These websites can quickly broadcast app reviews to their readers and followers, who are potential app users, and in return get paid by app developers. Until now, this app review trading market has remained underground and largely unknown to the community.


In this paper, we aim to unveil the underground market of trading mobile app reviews and to collect a set of promoted apps as ground truth for studying their statistical characteristics. We joined the promotion communities as a passive reviewer or follower and collected the apps assigned to us. We built an automatic data collection system, called AppWatcher, to monitor 52 service providers every day. In four months, AppWatcher gathered 645 apps reviewed via app store promotion and 29,680 apps reviewed via web promotion. It also collected app metadata such as ratings, text comments, and reviewer information, and randomly chose 186,263 apps from the Apple App Store for a comparison study. We statistically studied the service providers, promotion incentives, promoted apps, and suspicious reviewers. Moreover, we compared their features with those of randomly chosen apps and tested the significance of the differences.

Our key contributions can be summarized as follows.

1. Market Study: To the best of our knowledge, we are the first to study the underground market of mobile app review promotion. Our results are very helpful for understanding the current state and growth of this market.

2. New Discoveries: We have found some interesting features of both promoted apps and suspicious reviewers. These features are significantly different from those of randomly chosen apps, and they are critical for designing advanced algorithms to detect promoted apps or reviewers.

3. Promoted Apps: Apart from the service providers themselves, it is very difficult to obtain ground truth on whether an app is promoted. All the promoted apps we collected by joining as a recruited reviewer are definitely promoted. This app dataset is important for measuring the accuracy of promoted-app detection algorithms [9][10].

4. Promoted App Tracer: We design a simple app tracer, which can be extended with the heuristics we discovered, to narrow down the suspect list of promoted apps in the underground market.

2. DATA COLLECTION METHODOLOGY

2.1 Service Provider Collection

Our basic way to find app promotion service providers is to search for their websites with the help of search engines. To look for app store review providers, we search with various combinations of keywords such as "paid", "buy", "app", "review", "rating", and "app stores". For website review providers, which charge for app reviews, we use combinations of "paid", "buy", "app", "review", "evaluation", and "rating". In addition, we found one blog [11] that keeps an updated list of such websites.

2.2 Service Provider's Meta Data Collection

To collect a service provider's metadata, we read its pages such as FAQ, About, and Pricing. Specifically, we collected data such as the service price, the reward to reviewers, and the service launch time. The service launch time is the time when a service provider started to review apps. Normally, a website shows its creation time (i.e., year) at the bottom of its front page. If there is no such information, we refer to the whois information of the domain and use the domain creation time instead.
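The paper does not detail how this whois lookup is performed; below is a minimal sketch of the fallback step, assuming the standard Unix whois client is installed and that the registry output contains a creation-date field (label spellings vary by registrar, so the pattern is only a best-effort guess and the example domain is purely illustrative).

import re
import subprocess
from typing import Optional

def domain_creation_year(domain: str) -> Optional[str]:
    """Best-effort lookup of a domain's creation year via the whois CLI."""
    out = subprocess.run(["whois", domain], capture_output=True, text=True).stdout
    # Registries label this field differently; try a few common spellings.
    match = re.search(r"(?:Creation Date|created|Registered on):\s*(\d{4})", out, re.IGNORECASE)
    return match.group(1) if match else None

if __name__ == "__main__":
    print(domain_creation_year("example.com"))  # prints the creation year, or None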

2.3 Promoted App Collection

We have built an automated data collection system, named AppWatcher, to gather promoted apps from the service providers. AppWatcher is implemented in Java and Python and has 12,908 lines of source code. As illustrated in Fig. 1, the system includes four major modules: website monitor, assignment collector, app meta data crawler, and database.

Figure 1: The interactions between AppWatcher and the review trading market. The detailed actions between them are as follows. A1: App developers upload their apps to app stores; A2: Some developers buy reviews from service providers; A3: The "website monitor" module monitors target service providers to collect paid reviews; if a service provider is recruiting reviewers online, the "assignment collector" module joins as a passive reviewer to collect assigned app information; A4: The "app meta data crawler" module crawls paid reviews along with their dates; A5: The "app meta data crawler" retrieves the promoted apps' metadata from their corresponding app stores.

Website monitor collects promoted apps from the websites (i.e., service providers) where the reviews are published. It retrieves information such as the app id, review publishing date, author, review content, and editor's rating, and saves this structured data to the database.

Assignment collector gathers information about promoted apps from service providers that recruit reviewers from the wild. We registered an account with each service provider and waited for assignments (i.e., apps to be reviewed). We then parsed the assignment page and retrieved the app identities. This module was run twice a day to monitor changes in the assignments and to record the time when a new app was added and the time when it disappeared.

App meta data crawler retrieves an app's detailed information from its app store. For each app collected by the website monitor and the assignment collector, this module fetches the app's metadata (e.g., app description and reviewers) from its app store. The structured data is stored in the database for further analysis. For iTunes, the crawler is able to collect all the reviews of an app. However, Google Play only allows viewing up to the 5,000 latest reviews of an app, so our crawler collects the latest 5,000 reviews, or all reviews when the total is below 5,000.
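The crawler's internals are not shown in the paper; as one illustration of the iOS metadata-retrieval step, the sketch below queries Apple's public iTunes Lookup endpoint (not necessarily the interface AppWatcher used). Field names follow that public API, and the app id in the usage example is only illustrative.

import json
import urllib.request

ITUNES_LOOKUP = "https://itunes.apple.com/lookup?id={app_id}&country=us"

def fetch_ios_app_metadata(app_id: int) -> dict:
    """Fetch basic store metadata for one iOS app from the iTunes Lookup API."""
    with urllib.request.urlopen(ITUNES_LOOKUP.format(app_id=app_id)) as resp:
        payload = json.load(resp)
    if not payload.get("results"):
        return {}
    app = payload["results"][0]
    # Keep only the fields referred to in the analysis (name, category, ratings).
    return {
        "name": app.get("trackName"),
        "category": app.get("primaryGenreName"),
        "average_rating": app.get("averageUserRating"),
        "rating_count": app.get("userRatingCount"),
    }

if __name__ == "__main__":
    print(fetch_ios_app_metadata(284882215))  # 284882215 is an arbitrary well-known app id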

2.4 Ethical Consideration

Our study of the underground market for review/ranking fraud is ultimately related to the real app market. One important rule in our experiments is not to influence the real market. Therefore, we chose to be a passive observer. Specifically, when crawling apps from a website, we slowed down the crawling rate (i.e., twice a day) to avoid burdening its servers. In addition, when joining a reviewer-recruiting website as a reviewer, we only passively received assignments without posting any review for any app. When crawling app metadata from app stores, we also limited the download speed.

3. APP STORE PROMOTION

3.1 Dataset

We have found 26 websites involved in app store promotion. Among them, five were closed very recently after being exposed. These websites are used to advertise their services. Ten of them are even used to recruit reviewers online, and two of these ten require reviewing an app or paying a registration fee before joining. To avoid violating our ethical rules, we deployed AppWatcher to monitor only the remaining eight websites. We registered an account as a reviewer on each website and recorded the apps assigned to us. We logged into these websites twice a day and observed how the assignments changed. In total, we collected 645 apps, mostly promoted from July 1st to October 7th, 2014.¹

¹ "reviewfordev" displays all the apps it has promoted, and some of these apps were promoted before our monitoring began.

Among these promoted apps, 618 are iOS apps and only 27 are Android apps. Since the number of promoted Android apps is not sufficient for statistical study, we chose to focus on iOS apps. Fifty-four iOS apps, all promoted by "reviewfordev", had been removed from the Apple App Store, so we collected the metadata of 564 iOS apps, with 497,259 text comments and 462,896 reviewers.

For a comparative study, we also randomly selected 179,353 apps from the Apple App Store: 10% of the apps from each subcategory with over 100 apps, and all apps from each subcategory with fewer than 100 apps.² We collected their metadata, including 9,399,014 text comments and 6,722,558 reviewers.

² We acquired the entire app ID list from Apple's website, https://itunes.apple.com/us/genre/ios-books/id6018?mt=8.

3.2 Analysis of Service Providers

For all the service providers engaged in app-store-based promotion, we mainly study when their services started, how much they charge customers, how much they reward reviewers, and how many apps they have reviewed.

3.2.1 Service launch time

Service launch time is the time when a service provider starts to review apps. The first service started in 2010, and more and more service providers emerged in 2012 and 2013. On their websites, many of them cite "1 million apps in iTunes" as the incentive for using their services. With over a million apps in an app store, it is extremely difficult for a new app to acquire initial reviews, and with no or few reviews an app is ranked far behind and looks much less attractive to app users. Their review promotion services therefore help app developers boost their apps through this difficult period.

Figure 2: Service Providers Statistics. (a) Service price; (b) App volume.

3.2.2 Service price and reward

Service price is the price of writing one text review along with a rating score. In Fig. 2(a), we depict the lowest price for one review from each service provider. The highest price is $12.5, from "buyklout" [12], and the lowest is around $0.5, from "androidappreviewsource" [13]. Buying more reviews generally earns larger discounts. For example, "appiness" [14] charges $9 (US dollars) per review when buying 5 reviews, $8 for 10, $7 for 50, and $6 for 100 reviews.

The reward for writing a review varies from $0.5 to $5. For each qualified review, "appiness" and "reviewfordev" pay $5 and $4, respectively, while most others pay $0.5. Some service providers use other types of rewards. For example, "appredeem" pays 1000 points, which can be cashed out; "appwin" runs a lottery for app reviewers.

3.2.3 Promoted app volume

Except for "reviewfordev" [15], none of the service providers shows the total number of promoted apps on its website. However, some of them recruit reviewers online, so we could join as a passive reviewer and monitor app review assignments for more than four months. As shown in Fig. 2(b), the most popular service provider is "bestreviewapp" [4], which has accepted 251 apps. Other providers, such as "apprebates" [16] (84), "getappreviews" [17] (38), "appredeem" [18] (34), "promodispenser" [19] (32), and "giftmeapps" [20] (30), have fewer apps. For the remaining three, only 3 or fewer apps were found. In total, we found 473 apps promoted within the monitoring period.

Apart from the service providers themselves, probably no one knows the accurate count of paid reviewers (or accounts). However, some providers offer review packages, which state the number of reviews they are able to deliver for one app. For app stores like iTunes, each user is only allowed to write one review per app. Therefore, the maximum number of reviews a provider advertises for one app can reflect the number of reviewers it can hire or the number of accounts under its control. From Fig. 3, we can see that "4xn" [21] has the largest number of reviewers (i.e., 1,000). The second largest is "getappreviews" [17], which can deliver 500 reviews at a time. The others can deliver fewer than 200 reviews.


Figure 3: The maximum number of reviews that can be delivered in a single package.

3.3 Analysis of Promotion Incentive

3.3.1 Promotion Incentives

To find out when and why a developer decides to start app promotion, we analyzed app events, such as version updates and rating changes, that happened before the promotion.

First, we retrieved the entire version update history from iTunes³ and identified the most recent version update before the promotion. We then calculated the intervals (in days) between that version update and the promotion. The average interval is 39 days. Fig. 4(a) shows the distribution of all the intervals. Specifically, 47% of promotions were launched within one week after a new version update, 62% within two weeks, 80% within 60 days, and 92% within 120 days.

³ Google Play does not provide version update events on its website, so we only focus on iTunes.
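A minimal sketch of this interval computation, with hypothetical update_dates (from the crawled version history) and a hypothetical promotion_date; the real inputs come from the iTunes data described above.

from datetime import date
from typing import List, Optional

def days_since_last_update(update_dates: List[date], promotion_date: date) -> Optional[int]:
    """Days between the most recent version update preceding the promotion and the promotion."""
    prior = [d for d in update_dates if d <= promotion_date]
    if not prior:
        return None  # the app was promoted before any recorded update
    return (promotion_date - max(prior)).days

# Illustrative values only; each promoted app contributes one interval to Fig. 4(a).
print(days_since_last_update([date(2014, 5, 2), date(2014, 7, 20)], date(2014, 8, 1)))  # 12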

Second, we studied apps' weekly average ratings for the four weeks before the promotion. As shown in Fig. 4(b), nearly half of the average ratings are one star, 80% are below three stars, and 90% are below four stars. That is, most apps had received low average ratings before promotion.

Third, we looked into the distribution of weekly reviewer quantity. As illustrated in Fig. 4(c), more than 80% of apps have fewer than 10 reviewers in a week, and more than 90% have fewer than 50 reviewers. That is, most promoted apps had few reviewers in the weeks before promotion.

Statistically, low average ratings and few reviewers are the most likely motivations for promotion.

3.3.2 Promotion Completion Time

To avoid the notice of app store vendors, most service providers try to post reviews gradually rather than posting reviews in bulk over a very short time. Another reason could be that promotion reviews are written by humans, which limits the speed. Nevertheless, most reviews are promised to be completed within one week (e.g., appiness, bestreviewapp, appreviews4sale) or two weeks (e.g., appreviewsource). For example, as shown below, appiness describes its strategy on its FAQ page.

Q: How long does it take to complete the reviews? A: It's not a precise science because people are involved. Reviewers agree to review an app within 72 hours of receiving the request. Some may do it immediately, some may do it after a day and some may push it to the limit. However, we like this 72 hour window because this ensures that the downloads and reviews looks natural.

Q: I'd like to order 100 or more reviews but I want it to be spread out over time? A: No problem. First, reviewers can take anywhere from 0 to 72 hours to review the app, so you will have a natural trickling in of reviews.

To confirm this, we studied the number of days needed to complete a review task. The quickest task was completed within one day, while the slowest took up to 115 days. The average completion time was 21 days. As illustrated in Fig. 5, nearly 77% of promotions were completed within 30 days and nearly 90% within 60 days.

Figure 5: The number of days needed to complete reviews for an app.

A few service providers state that they can deliver the reviews within a very short time, such as one day (e.g., 4xn⁴, madapp [22]). madapp [22] states that it can deliver 1,000 organic reviews (i.e., written by real users) within one day. Given such a big number, our conjecture is that it may leverage bots to automatically generate and post some reviews.

⁴ This website has been closed very recently.

3.4 Analysis of Promoted Apps

For all the promoted apps, we collected their metadata and then studied their platforms, average ratings, total numbers of reviewers, category distributions, and developers. For the purpose of comparison, we also randomly chose the same number of iOS apps that have at least one text comment.

All the promoted apps are either iOS (96%) or Android (4%) apps. Even though both app stores have more than 1.3 million apps, iOS apps were reported to generate 85% more revenue than Android apps [23]. This could motivate more developers to promote their iOS apps than their Android apps. As promoted Android apps are too few, we focus only on iOS apps in the following.

Figure 4: Promotion incentives. (a) Intervals between promotion time and the last version update time; (b) Weekly average ratings; (c) Weekly average raters.

The average ratings from reviewers who posted text comments are shown in Fig. 6. Over 90% of promoted apps have positive (i.e., >3 stars) average ratings, and 80% of the average ratings are above 4 stars. Compared to randomly chosen apps, promoted apps have much higher average ratings. To test whether the difference is statistically significant, we conducted a Welch's t-test with the null hypothesis that the mean of the promoted apps' average ratings is identical to that of the random apps' average ratings. The test returned a small p-value (p-value < 0.05), which rejected the null hypothesis and supported the alternative hypothesis that promoted apps have significantly higher average ratings than random apps.
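The paper reports only the outcome of the test. For illustration, such a comparison can be run with SciPy's ttest_ind using equal_var=False, which gives Welch's variant; the rating arrays below are made-up placeholders for the real per-app average ratings.

import numpy as np
from scipy import stats

# Placeholder arrays of per-app average ratings; the real values come from the crawled metadata.
promoted = np.array([4.6, 4.8, 4.2, 5.0, 4.4, 4.7, 3.9, 4.9])
random_apps = np.array([3.1, 4.0, 2.5, 3.8, 4.5, 2.9, 3.3, 4.1])

# equal_var=False selects Welch's t-test (no equal-variance assumption).
t_stat, p_value = stats.ttest_ind(promoted, random_apps, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")  # p < 0.05 rejects the equal-means null hypothesis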

We also depict the total number of reviewers. From Fig. 7 we can see that promoted apps have more reviewers than random apps. This difference was also found to be significant by a Welch's t-test.

Figure 6: Average ratings from reviewers who have provided both ratings and text comments to iOS apps.

Figure 7: Total number of reviewers who have provided both ratings and text comments to iOS apps.

To study the category distribution, we count the number of apps in each category. As an iOS app can be assigned multiple categories, an app may be counted in more than one category. As shown in Fig. 8(a), the top category is "Games", followed by "Newsstand", "Photo & Video", "Lifestyle", and "Productivity". Compared to the random apps in Fig. 8(b), the category distribution of promoted apps is nearly the same. However, in each category (except "Social Networking", "Medical", and "Navigation"), promoted apps include a higher proportion of paid apps than random apps do.


Figure 8: App category distribution. (a) Promoted apps; (b) Random apps. The percentage beside each category is the fraction of paid apps.

As for app developers, all the promoted apps belong to 373 developers. Among them, 79 developers (21%) have at least two apps, while the remaining 294 developers (79%) own only one app. The top three developers have 15, 13, and 13 promoted apps, respectively.

3.5 Analysis of Recruited Reviewers


For promoted iOS apps, we collected all the reviewers who posted text comments. In this section, we study their behaviors in terms of co-rating and average rating. For a comparative study, we randomly chose the same number of reviewers from the Apple App Store.

Even though we cannot tell which reviewers were recruited, those who have reviewed multiple promoted apps after the promotion started are suspicious. Given the set of 564 promoted apps, we first collect the reviewers who reviewed them, and then calculate the number of reviewers (y-axis) who have reviewed a certain number of promoted apps (x-axis), as shown in Fig. 9. We observe that while most reviewers reviewed only one promoted app, many reviewers rated tens of promoted apps. There are even two reviewers who reviewed 246 promoted apps. In contrast, for random (i.e., randomly chosen) apps, none of their reviewers reviewed more than three apps in the chosen set (we do not show a figure here). Therefore, active reviewers of promoted apps are likely recruited reviewers.
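A minimal sketch of the counting behind Fig. 9, assuming a hypothetical reviewer_to_apps mapping (reviewer id to the set of app ids that reviewer commented on) built from the crawled reviews, and the set of known promoted app ids.

from collections import Counter
from typing import Dict, Set

def promoted_review_histogram(reviewer_to_apps: Dict[str, Set[str]],
                              promoted: Set[str]) -> Counter:
    """For each k >= 1, count how many reviewers reviewed exactly k promoted apps."""
    per_reviewer = [len(apps & promoted) for apps in reviewer_to_apps.values()]
    # Histogram over k: number of promoted apps reviewed -> number of reviewers (Fig. 9).
    return Counter(k for k in per_reviewer if k > 0)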

Figure 9: Number of reviewers (y-axis) who have reviewed a certain number of promoted apps (x-axis).

Further, we collect those reviewers (also called suspicious reviewers) who reviewed at least two promoted apps during the monitoring time. In total, we find 1,363 such reviewers. We then count the total number of apps in the app store that they have reviewed, including apps that are not in our known promoted app set. In Fig. 10, we list all such reviewers on the x-axis, ordered by the number of reviewed apps (in decreasing order). On average, each suspicious reviewer reviewed 9 apps, while the highest number is close to 800. We can also observe that the first 200 suspicious reviewers all reviewed at least 100 apps in the app store. In contrast, fewer than 30 random reviewers reviewed at least 100 apps in the app store. Therefore, overall, suspicious reviewers have reviewed significantly more apps than random reviewers.

Figure 10: Number of apps reviewed by one reviewer.

Furthermore, we study the average ratings provided by reviewers. For suspicious reviewers, we chose those who had reviewed at least two and at least six promoted apps; for random reviewers, we randomly chose those who had reviewed at least two and at least six apps in the app store. As illustrated in Fig. 11, among those who reviewed at least two apps, nearly 1,200 (88%) suspicious reviewers, but only 800 (59%) random reviewers, have average ratings higher than 4 stars. In the case of six apps, nearly 400 (67%) suspicious reviewers, but only 200 (33%) random reviewers, have average ratings higher than 4.5. Therefore, the average ratings from suspicious reviewers are significantly higher than those from random reviewers in both cases. This observation is also supported by a Welch's t-test.

Figure 11: The average ratings to all apps from reviewers who rated different numbers of apps.

3.6 Promoted App Tracer

By joining as a recruited reviewer to monitor review assignments, we have collected many promoted apps from several service providers. However, these apps are probably only a small part of all promoted apps, because such promotion services have been running for years while AppWatcher monitored them for only four months. That is, we were unable to directly discover apps promoted before our monitoring. Besides, some service providers may recruit their reviewers through other channels, such as invitation-only membership and phone call groups. Nevertheless, the good news is that the entire review history of each reviewer is recorded and stored in the app stores. Next, we present a simple approach to demonstrate how to leverage our knowledge of promoted apps as ground truth to find unknown promoted apps in app stores.

Intuitively, if a reviewer has reviewed many apps flagged as abused, he is probably recruited, and his reviews of the unflagged apps also become suspicious. Further, if an app has received many such suspicious reviews, it becomes highly suspicious. Specifically, on one hand, given a set of promoted apps, reviewers who have reviewed more known promoted apps are more likely to be recruited reviewers. In our previous comparison between reviewers of promoted apps and reviewers of randomly chosen apps, we observed that no reviewer of the randomly chosen apps had reviewed more than three apps from that set, whereas reviewers of promoted apps had reviewed far more (up to 246). Those reviewers who have reviewed many promoted apps and given high rating scores are highly likely to be recruited reviewers. On the other hand, given a set of recruited reviewers, the apps rated by many of them are also highly likely to be promoted apps. By following the chain from known promoted apps to suspicious reviewers, and then to other promoted apps, we can expand the set of potentially promoted apps and finally obtain a suspect list. This suspect list can be further examined and verified by app store vendors with additional evidence, such as the credit card numbers bound to each review account and the IP addresses from which reviews were submitted.

Based on these observations, we design an iterative algorithm to trace apps promoted by recruited reviewers (a code sketch of the loop is given after Fig. 12). As illustrated in Fig. 12, our algorithm starts from a known promoted app set (e.g., App Set I), which consists of the promoted apps we previously collected. It then inspects each reviewer of these promoted apps and labels reviewers who have reviewed at least Na promoted apps (e.g., Na = 5) as recruited reviewers (e.g., Attacker Set I). Among all the apps reviewed by these reviewers, those reviewed by at least Nr recruited reviewers (e.g., Nr = 25) are then tagged as promoted apps, and they form a new app set (e.g., App Set II). This process continues iteratively until no new suspicious reviewers or apps can be found.

Figure 12: Our promoted app tracer that starts from a set of promoted apps.
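The paper implements the tracer in Java; below is a minimal Python sketch of the same expansion loop, assuming a reviews mapping from reviewer id to the set of app ids that reviewer has rated (n_a and n_r correspond to the thresholds Na and Nr above).

from collections import defaultdict
from typing import Dict, Set, Tuple

def trace_promoted_apps(reviews: Dict[str, Set[str]],
                        seed_apps: Set[str],
                        n_a: int = 5,
                        n_r: int = 25) -> Tuple[Set[str], Set[str]]:
    """Iteratively expand the promoted-app set through suspicious reviewers."""
    promoted = set(seed_apps)
    recruited: Set[str] = set()
    while True:
        # Step 1: reviewers who rated at least n_a currently known promoted apps.
        new_recruited = {r for r, apps in reviews.items()
                         if r not in recruited and len(apps & promoted) >= n_a}
        recruited |= new_recruited
        # Step 2: apps (not yet flagged) rated by at least n_r recruited reviewers.
        votes = defaultdict(int)
        for r in recruited:
            for app in reviews[r] - promoted:
                votes[app] += 1
        new_apps = {app for app, v in votes.items() if v >= n_r}
        promoted |= new_apps
        if not new_recruited and not new_apps:
            break  # fixed point: nothing new to flag
    return promoted, recruited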

We implemented this algorithm in Java and applied it to search for suspicious apps in the Apple App Store. Setting Na and Nr to 5 and 25, respectively, in our experiments, our algorithm quickly reported an additional 2,410 suspicious apps (not including the 546 iOS apps previously collected by AppWatcher), along with 157,502 potential recruited reviewers.

Lacking ground truth and additional information (e.g., reviewer account information, IP addresses) for validation, we are unable to verify the above results (i.e., suspicious apps and reviewers) one by one. Instead, we study the overall distributions of these suspicious apps and reviewers and their differences from randomly chosen apps and reviewers. If they have distributions similar to the known promoted apps (Section 3.4) and recruited reviewers (Section 3.5), they are very likely to be promoted apps and recruited reviewers.

We randomly chose 500 apps reported by our tracer and studied their average rating distribution. As illustrated in Fig. 13, the average ratings of these suspicious apps are much higher than those of random apps. The difference is significant, as supported by a Welch's t-test (p-value < 0.05). On the other hand, compared to Fig. 6, the average rating distributions of known promoted apps and suspicious apps are nearly identical. That is, these suspicious apps have rating characteristics similar to known promoted apps, which indicates that they are highly likely also promoted apps.

Figure 13: Average rating of suspicious apps.

Likewise, as shown in Fig. 14, the average ratings of suspicious reviewers are also much higher than those of random reviewers. Compared to Fig. 11, the average rating distributions of suspicious reviewers are similar in both scenarios. In other words, the suspicious reviewers reported by our tracer have rating characteristics similar to recruited reviewers. Hence, they are likely recruited reviewers.

Figure 14: Average ratings provided by suspicious reviewers.

Note that the purpose of this basic app tracer is to provide a generic framework and demonstrate how to utilize known promoted apps and basic heuristics to detect other, unknown promoted apps. The criteria and suggested heuristics are not fixed or perfect, and they can be extended with other evidence such as the correlation between an app's average ratings and review quantities [10]. We also note that during the traversal process some popular apps might be accidentally added to the app set. As a result, some honest reviewers would be mistaken for suspicious reviewers, which would in turn bring more popular apps into the suspicious app set. Clearly, by excluding popular apps (e.g., those in top-200 rank charts for at least a few weeks) in our algorithm, one may filter out such noise. One may also increase the values of Na and Nr when running our algorithm to reduce the chance of normal apps being included. After all, compared with over 1.2 million apps (as of June 2014) in the iTunes store, with the current parameter setting our basic tracer outputs 2,410 apps for further investigation, which accounts for only 0.2% of all apps. With additional information such as reviewer credit card numbers, IP addresses, purchasing history, and country of origin, app store vendors may accurately pinpoint promoted apps within this short list.

4. WEB PROMOTION

Some service providers help developers advertise apps on their websites and charge for writing and publishing reviews. However, such paid reviews are forbidden by app store vendors like Google and Apple. Moreover, according to the FTC [8], paid reviews that do not explicitly state they are paid are illegal. In this section, we study these paid web promotion service providers, the apps they have promoted, the objectiveness of their reviews, and developers' promotion incentives.

4.1 Dataset

We have found 31 service providers that charge developers for app reviews. For each service provider, we manually collected its metadata, such as the service launch time, app volume, and service price, from its website.

AppWatcher was then started to monitor these websites every day. It had found 29,680 promoted apps by July 15, 2014. For those apps whose app ids are shown on the websites, it further collected their metadata from the corresponding app stores. The metadata includes information such as the total number of raters, the distribution of ratings, the average rating, category, version changes, and developers. It also collected all of the reviews along with the review date, reviewer identity, rating score, targeted app version, and review content. In total, we collected metadata for 7,077 apps and 5,349 developers, along with 5,290,281 reviews.

4.2 Analysis of Service Providers

For all the 31 websites we have monitored, we analyze when they started, what their service prices are, and what the market revenue is.

Figure 15: App volume and service price of each service provider. (a) App volume; (b) Service price.

4.2.1 Service launch time

Service launch time is the time when a service provider started. Most websites were launched in 2008, 2010, and 2011. In July 2008, Apple opened the iTunes App Store [1]; on March 6, 2012, GooglePlay⁵ launched its Android app store. Right after these openings, six service providers started their services. In September 2010, GooglePlay expanded its service to 29 additional countries [2], and the Apple App Store reached 10 billion downloads on January 22, 2011 [1]. Clearly, the rapid growth of the app market and its extremely large user base have not only attracted many developers, but also quickly motivated some people to provide mobile app review services.

⁵ Its first name was Android Market, which was rebranded as GooglePlay on March 6, 2012.

4.2.2 App volume

The volumes of promoted apps on different websites are displayed in Fig. 15(a). The first four websites have each reviewed more than 3,000 apps, while the other websites have reviewed fewer than 2,000 apps each. The apps on the first four websites account for 55% of the total number of promoted apps.

4.2.3 Service price

Service price is the price of a single app review posted on a service provider's website. As illustrated in Fig. 15(b), the highest service price is from "androidheadlines" [24], which charges $250 for one app review. Five websites charge customers about $150, and the others charge less than $100. Their prices may be determined by factors such as pageviews and social media followers. For example, "androidheadlines" claims to have 350,000+ followers on social networks. The second most expensive one (i.e., smartappsforkids) has 120,000 followers.

4.2.4 Market revenue and market trend

Based on the minimum service price and app volume of each website, the market revenue for web promotion is estimated to be at least $2,562,021.
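The estimate is a simple lower bound: each website's app volume multiplied by its minimum price, summed over websites. A sketch with illustrative (not actual) per-provider figures follows.

# Illustrative excerpt only; the real per-provider figures come from Section 4.1's metadata.
providers = {
    "dailyappshow": {"min_price": 60, "app_volume": 4800},
    "theiphoneappreview": {"min_price": 75, "app_volume": 4100},
    "smartappsforkids": {"min_price": 150, "app_volume": 3900},
}

# Lower bound: assume every reviewed app paid at least the provider's minimum price.
revenue_lower_bound = sum(p["min_price"] * p["app_volume"] for p in providers.values())
print(f"Estimated web-promotion revenue >= ${revenue_lower_bound:,}")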

The growth trend of the entire market is illustrated in Fig. 16. The monthly app volume saw a great increase in early 2011. Around May 2012, it suddenly increased to over 900 apps. As for the monthly average rating, it first gradually increased from 3 stars to 4 stars from August 2008 to December 2010, then varied between 3 and 4 stars for some time, and since 2012 has been stable at around 4.5 stars. Overall, both the monthly app volume in the web promotion market and the average rating of promoted apps have increased in recent years.

Figure 16: Monthly changes of app volume and average ratings.

4.3 Analysis of Promoted Apps

The promoted apps are mostly iOS apps, which account for 93.4%. Android apps come second, accounting for 6.4%. Other platforms, including Windows Phone, BlackBerry, and Kindle Fire, account for only 0.16%. The most likely reason is that iOS apps generate the most revenue, leading Android apps by 85% [23].


Figure 17: Total number of reviewers of an app.

As the total number of apps on Windows, Kindle Fire, and BlackBerry is much smaller, we only analyze promoted apps on iOS and Android. For all the promoted apps, we count the number of reviewers who provided both ratings and text comments. As illustrated in Fig. 17, over 70% of the apps have fewer than 500 raters and reviewers, and over 80% have fewer than 1,000. Note that for random apps (i.e., randomly selected iOS apps), shown in the same figure, over 95% have fewer than 500 raters. This percentage is much higher than that of promoted apps. Therefore, in general, promoted apps have more reviewers than random apps.

Figure 18: App category distribution (iOS).

The category distribution of promoted apps is depicted in Fig. 18. Here, the right y-axis shows the percentage of paid apps among all apps in each category. The top five categories of promoted iOS apps are Games, Newsstand, Books, Productivity, and Lifestyle. In all categories, more paid apps than free apps have been promoted. While the categories of promoted apps (Fig. 18) are very similar to those of random apps (Fig. 8(b)), promoted apps have higher percentages of paid apps than random apps in each category. This is not surprising: since paid apps generate revenue for their developers directly, developers of paid apps are more willing to pay for web promotion.

The number of promoted apps from the same developer is also studied. Of all 5,349 developers, 1,016 (19%) have promoted at least two of their apps, while the remaining 4,333 (81%) promoted one app. The top three developers have promoted 177, 84, and 69 apps, respectively.

4.4 Analysis of Review Objectiveness

Nearly all the service providers claim that their reviews will be objective and that they cannot guarantee a positive review. Only a few of them mention that they will send negative reviews back to the developers. To understand the objectiveness of their ratings, we collected the editors' ratings and further studied their distribution.

From Table 1, we can see that 92% of the ratings by website service providers are positive (i.e., 4 or 5 stars⁶), 5% are neutral (i.e., 3 stars), and 3% are negative (i.e., 1 or 2 stars). Compared to the ratings of random apps (68% positive and 19% negative), editors' ratings include far more positive ratings and fewer negative ones. Moreover, among editors' ratings, 5-star ratings account for 60%, whereas only 9% of random ratings are 5 stars. Overall, editors' ratings are markedly more positive and less negative.

⁶ We have converted different rating scales to a five-star scale.

Table 1: Rating comparison between random apps and promoted apps.

Rating   Random Apps   Promoted Apps
5 star   9%            60%
4 star   58%           32%
3 star   14%           5%
2 star   18%           2%
1 star   1%            1%

Further, in Fig. 19, we depict the proportion of positive ratings from each website.⁷ For the first few websites, nearly all the ratings are positive. Only three websites have positive ratings below 50%.

⁷ Only those websites providing ratings along with text comments are included.

Figure 19: The proportion of positive ratings in each website.

4.5 Time for Promotions

For developers, the ultimate goal of web promotion is to make more revenue, but when do they decide to use it? In this section, by studying the intervals between web promotion and app version updates, as well as the weekly average rating and the weekly rater quantity before the promotion, we answer this question and further infer the stimulus for promotion.

From Fig. 20(a), we can see that 50% of promotions happened within one month of a new release and over 80% happened within 100 days. Therefore, one to three months after a new version release has been the common time for web promotion.

Figure 20: Promotion incentives. (a) Intervals between promotion and version updates; (b) Weekly average rating; (c) Weekly number of raters.

Further, in Fig. 20(b), we depict the CDF of the average ratings in the four weeks⁸ just before the web promotion. As we can see, over 14% of apps have an average rating of 1 star, 52% have less than 2 stars, and up to 73% have less than 3 stars.

⁸ The cases of 2-, 6-, and 8-week average ratings are very similar.

We also studied the weekly number of raters in the four weeks before the promotion. As illustrated in Fig. 20(c), nearly 85% of apps have fewer than 3 raters each week, 90% have fewer than 7 raters, and 95% have fewer than 21 raters.

The above study shows that these promoted apps experienced low average ratings or few raters in the weeks before the promotion. Our conjecture is that after releasing a new version of his app, a developer normally waits for one to three months, hoping the app will become popular and get good reviews. However, upon observing both a low rating and few reviewers, he decides to seek help from web promotion.

4.6 Cross Promotion

As the main channels of promotion on the underground market, app store promotion and web promotion share many common features. For example, both charge developers and write positive reviews; a few service providers dominate the market and promote far more apps than the others; and the promoted apps usually experienced few reviewers or low average ratings before promotion. Nevertheless, there are also essential differences. First, the sources of app reviewers differ: for app store promotion, service providers either manipulate many accounts or recruit reviewers online, whereas for web promotion, each service provider hires a few editors. Second, after releasing a new app, developers often wait longer before choosing web promotion than developers who choose app store promotion. One possible explanation is that app store promotion is useful for getting initial reviews for a new release, while web promotion is for app advertisement; the latter does not necessarily result in more reviews.

Some developers might resort to both ways of app promotion. We cross-checked all the apps promoted via app store promotion from July 1st to October 7th, 2014, against those promoted via web promotion up to October 7th, 2014. In total, we found 98 such apps out of 546 apps (18%). As an app often has multiple versions, developers might promote different app versions in different ways. Among these cross-promoted apps, 12 were promoted for the same version, while the other 77 were promoted for different versions. Moreover, 11 developers (owners) have cross-promoted more than one app. The top three developers have cross-promoted 8, 4, and 3 apps, respectively.

5. RELATED WORK

Underground Market Study. Underground markets connect illegal merchants with specialized service providers, including CAPTCHA-solving systems [25], Twitter account spam [26], fake reviews, etc. These markets usually rely on popular crowd-sourcing websites like Amazon's Mechanical Turk and Freelancer. Amazon's Mechanical Turk has been studied from various perspectives, including worker demographics [27] [28], behavioral research [29], and the marketplace itself [30]. Other crowd-sourcing websites like Zhubajie [31], Sandaha [32], and MicroBlogs [33] were also studied [34] from the perspective of the campaigns run on them. Freelancer was studied [25] regarding the role of freelance labor in web service abuse such as account registration and verification and social network linking. Some forums were also studied for trading Twitter spam accounts [26]. Unlike the above underground markets, the app review market consists of service providers whose services focus only on writing mobile app reviews. They have tried many sophisticated strategies to elude detection algorithms, and some of them have built reward systems to recruit reviewers from the wild.

Opinion Spam Detection. Opinion spam was first studied by Jindal [35], and various approaches have been proposed [36] [37] [38] [39] [40] to detect fake reviewers in traditional markets like Amazon. Blog comment spam has also been studied using spam structure [41], content-based algorithms [42], language models [43], etc. Moreover, collusive reviewer groups have been studied [44] [45].

In the mobile app review market, opinion spam is even more serious because of the huge marketplace and centralized app distribution systems like iTunes and GooglePlay. However, due to the inherent differences between mobile app markets and traditional markets in terms of review content, reviewer volume, reviewer behavior, etc., previous approaches do not apply directly to the mobile app market. So far, only a few approaches [10] [9] have been proposed to detect collusive reviewers and fake reviews, respectively. One challenging issue with existing detection approaches is the lack of ground truth to verify their detection results. The outcome of this work provides a dataset that can serve as such ground truth.

6. CONCLUSION

As the size of an app market (e.g., Google Play and iTunes) has exceeded a million apps, it is very hard to make a new app known. This makes the underground market of trading app reviews increasingly popular. Review-based promotions in mobile app markets could discourage benign developers who try hard to improve app quality. In this paper, we first built a system called AppWatcher to monitor the app promotion market, collect promoted apps, and analyze facts such as service launch time, service price, app volume, and possible promotion incentives. We then revealed the characteristics of promoted apps and suspicious attackers, which can serve as important heuristics for designing automated and effective algorithms to identify promoted apps in an entire app store such as iTunes. The dataset collected by AppWatcher can also serve as ground truth to evaluate the power of any app promotion detection algorithm.

7. ACKNOWLEDGEMENT

This work was supported in part by NSF grant CCF-1320605 and a Google gift. We also thank the reviewers for helpful comments.

8. REFERENCES

[1] Wikipedia, http://en.wikipedia.org/wiki/App_Store_(iOS).

[2] Wikipedia, http://en.wikipedia.org/wiki/Google_Play.

[3] VentureBeat, http://venturebeat.com/2014/04/29/mobile-apps-could-hit-70b-in-revenues-by-2017-as-non-game-categories-take-off/.

[4] bestreviewapp, http://www.bestreviewapp.com/.

[5] dailyappshow, http://www.dailyappshow.com/.

[6] Apple, https://developer.apple.com/app-store/review/guidelines/.

[7] Google, https://play.google.com/about/developer-content-policy.html.

[8] FTC, https://www.ftc.gov/tips-advice/business-center/guidance/ftcs-revised-endorsement-guides-what-people-are-asking.

[9] R. Chandy and H. Gu, "Identifying spam in the iOS app store," WICOW/AIRWeb '12.

[10] Z. Xie and S. Zhu, "GroupTie: Toward hidden collusion group discovery in app stores," WiSec '14.

[11] A. J. Smith, http://www.appynation.com/hall-of-infamy/.

[12] buyklout, http://www.buyklout.com/.

[13] androidappreviewsource, http://www.androidappreviewsource.com/.

[14] appiness, http://www.appiness.io/.

[15] reviewfordev, http://www.reviewfordev.com/.

[16] apprebates, http://www.apprebates.com/.

[17] getappreviews, http://www.getappreviews.com/.

[18] appredeem, http://m.appredeem.com/.

[19] promodispenser, http://www.promodispenser.com/.

[20] giftmeapps, http://www.giftmeapps.com/.

[21] 4xn, http://www.4xn.com/.

[22] madapp, http://madapp.me/index.html#services.

[23] App Annie, http://blog.appannie.com/app-annie-index-market-q1-2014/.

[24] androidheadlines, http://www.androidheadlines.com/.

[25] M. Motoyama, K. Levchenko, C. Kanich, D. McCoy, G. M. Voelker, and S. Savage, "Re: CAPTCHAs - understanding CAPTCHA-solving services in an economic context," USENIX '10.

[26] K. Thomas, D. McCoy, C. Grier, A. Kolcz, and V. Paxson, "Trafficking fraudulent accounts: The role of the underground market in Twitter spam and abuse," USENIX '13.

[27] P. G. Ipeirotis, "Demographics of Mechanical Turk," 2010.

[28] J. Ross, L. Irani, M. Silberman, A. Zaldivar, and B. Tomlinson, "Who are the crowdworkers?: Shifting demographics in Mechanical Turk," CHI '10.

[29] W. Mason and S. Suri, "Conducting behavioral research on Amazon's Mechanical Turk," Behavior Research Methods, vol. 44, no. 1, pp. 1-23, 2012.

[30] P. G. Ipeirotis, "Analyzing the Amazon Mechanical Turk marketplace," XRDS '10.

[31] zhubajie, http://www.zhubajie.com.

[32] sandaha, http://www.sandaha.com.

[33] weibo, http://www.weibo.com.

[34] G. Wang, C. Wilson, X. Zhao, Y. Zhu, M. Mohanlal, H. Zheng, and B. Y. Zhao, "Serf and turf: Crowdturfing for fun and profit," WWW '12.

[35] N. Jindal and B. Liu, "Opinion spam and analysis," WSDM '08.

[36] G. Fei, A. Mukherjee, B. Liu, M. Hsu, M. Castellanos, and R. Ghosh, "Exploiting burstiness in reviews for review spammer detection," ICWSM '13.

[37] F. Li, M. Huang, Y. Yang, and X. Zhu, "Learning to identify review spam," IJCAI '11.

[38] G. Wang, S. Xie, B. Liu, and P. S. Yu, "Review graph based online store review spammer detection," ICDM '11.

[39] M. Ott, Y. Choi, C. Cardie, and J. T. Hancock, "Finding deceptive opinion spam by any stretch of the imagination," HLT '11.

[40] N. Jindal, B. Liu, and E.-P. Lim, "Finding unusual review patterns using unexpected rules," CIKM '10.

[41] J. Zhang and G. Gu, "NeighborWatcher: A content-agnostic comment spam inference system," NDSS '13.

[42] P. Kolari, T. Finin, and A. Joshi, "SVMs for the blogosphere: Blog identification and splog detection," AAAI '06.

[43] G. Mishne, D. Carmel, and R. Lempel, "Blocking blog spam with language model disagreement," AIRWeb '05.

[44] A. Mukherjee, B. Liu, and N. Glance, "Spotting fake reviewer groups in consumer reviews," WWW '12.

[45] A. Mukherjee, B. Liu, J. Wang, N. Glance, and N. Jindal, "Detecting group review spam," WWW '11.