Fighting Fire With Fire : Crowdsourcing Security Threats and Solutions on the Social Web

Fighting Fire With Fire:Crowdsourcing Security Threats and Solutions on the Social Web

Gang Wang, Christo Wilson, Manish Mohanlal, Ben Y. ZhaoComputer Science Department, UC Santa Barbara.

[email protected]

A Little Bit About Me 3nd Year PhD @ UCSB Intern at MSR Redmond 2011 Intern at LinkedIn (Security

Team) 2012

2

Research Interests: Security and Privacy Online Social Networks Crowdsourcing

Data Driven Analysis and Modling

3

Recap: Threats on the Social Web Social spam is a serious problem

10% of wall posts with URLs on Facebook are spam

70% phishing Sybils underlie many attacks on Online Social

Networks Spam, spear phishing, malware distribution Sybils blend completely into the social graph

Existing countermeasures are ineffective Blacklists only catch 28% of spam Sybil detectors from the literature do not work

4

Sybil Accounts on Facebook In-house estimates

Early 2012: 54 million August 2012: 83 million 8.7% of the user base

Fake likes VirtualBagel: useless site, 3,000 likes

in 1 week 75% from Cairo, age 13-17• Sybils attacks in large scale

• Advertisers are fleeing Facebook

5

Sybil Accounts on Twitter

92% of Newt Gingritch’s followers are Sybils Russian political protests on Twitter

25,000 Sybils sent 440,000 tweets 1 million Sybils controlled overall

Follo

wers

4,000 new followers/d

ay100,000 new followers in 1

day• Twitter is vital infrastructure• Sybils usurping Twitter for political ends

6

Talk Outline1. Malicious crowdsourcing sites – crowdturfing

[WWW’12] Spam and Sybils generated by real people Huge threat in China Growing threat in the US

2. Crowdsourced Sybil detection [NDSS’13] If attackers can do it, why not defenders? Can humans detect Sybils? Is this cost effective? Design a crowdsourced Sybil detection system

User Study

7 Outline Intro Crowdturfing

Crowdsourcing Overview What is Crowdturfing How bad is it? Crowdturfing in the US

Crowdsourced Sybil Detection Conclusion

8 We tend to think of spam as “low quality” What about high quality spam and Sybils? Open questions

What is the scope of this problem? Generated manually or mechanically? What are the economics?

High Quality Sybils and Spam

Gang WangMaxGentleman is the bestest male enhancement system avalable. http://cid-ce6ec5.space.live.com/

FAKEStock Photographs

http://cid-ce6ec5.space.live.com/

9

Black Market Crowdsourcing Amazon’s Mechanical Turk

Admins remove spammy jobs Black market crowdsourcing websites

Spam and fake accounts, generated by real people Major force in China, expanding in the US and IndiaCrowdturfing = Crowdsourcing + Astroturfing

11

Crowdturfing Workflow

Customers

Initiate campaigns

May be legitimate businesses

Agents Manage

campaign and workers

Verify completed tasks

Workers Complete

tasks for money

Control Sybils on other websites

Campaign

Tasks

Reports

12

Crowdturfing in China

Site

ActiveSince

TotalCampaigns

Workers

Reports

$ forWorkers

$ forSite

Zhubajie

Nov. 2006 76K 169K 6.3M $2.4M $595K

1

10

10

100

1000

10000

100000

1000000Site Growth Over Time

Cam

paig

ns p

er

Mon

th

Dol

lars

per

Mon

th

Jan. 08 Jan. 09 Jan. 10 Jan. 11

Zhubajie

Sandaha

Campaigns

$

Campaigns

$

13

Spreading Spam on Weibo

100 1000 10000 100000 1000000 100000000

102030405060708090

100

Approximate Audience Size per Campaign

CDF

50% of campaigns reach

>100000 users 8% reach>1 million

users• Campaigns reach huge audiences• How effective are these campaigns?

14

Travel agency reported sales statistics 2 sales/month before our campaign 11 sales within 24 hours after our campaign Each trip sells for $1500!

Initiate our own campaigns as a customer 4 benign ad campaigns promoting real e-

commerce sites All clicks route through our measurement

server

How Effective is Crowdturfing?

Campaign About Targ

etCos

tTask

sRepor

tsClicks

Cost Per

ClickVacation Advertise for

a discount vacation through a

travel agent

Weibo

$15 100

108 28 $0.21

QQ 118 187 $0.09Forums 123 3 $0.90

Web Display Ads CPC =

$0.01

15

Crowdturfing in America

Other studies support these findings Freelancer

28% spam jobs Bulk OSN accounts, likes, spam Connections to botnet operators

US Sites % Crowdturfing

Legit

Mechanical Turk 12%Bl

ack Market

MinuteWorkers

70%

MyEasyTasks 83%Microworkers 89%ShortTasks 95%

Poultry Markets $20 for 1000

followers Ponzi scheme

16

Takeaways Identified a new threat: Crowdturfing

Growing exponentially in size and revenue in China

$1 million per month on just one site Cost effective: $0.21 per click

Starting to grow in US and other countries Mechanical Turk, Freelancer Twitter Follower Markets

Huge problem for existing security systems Little to no automation to detect Turing tests fail

17 Outline Intro Crowdturfing Crowdsourced Sybil Detection

Open Questions User Study Accuracy Analysis System Design

Conclusion

18

Crowdsourcing Sybil Defense Defenders are losing the battle against OSN

Sybils Idea: build a crowdsourced Sybil detector

Leverage human intelligence Scalable

Open Questions How accurate are users? What factors affect detection accuracy? Is crowdsourced Sybil detection cost effective?

19

User Study Two groups of users

Experts – CS professors, masters, and PhD students Turkers – crowdworkers from Mechanical Turk and

Zhubajie Three ground-truth datasets of full user profiles

Renren – given to us by Renren Inc. Facebook US and India

Crawled Legitimate profiles – 2-hops from our own profiles Suspicious profiles – stock profile images Banned suspicious profiles = Sybils

Stock Picture

Crowdturfing Site

20

Progress

Classifying Profiles

BrowsingProfiles

Screenshot of Profile(Links Cannot be

Clicked)

Real or fake?

Why?

Navigation Buttons

Testers may skip around and revisit

profiles

21

Experiment Overview

Dataset

# of Profiles

Test Group

# of Teste

rs

Profile per

TesterSybil Legit.

Renren 100 100Chinese Expert

24 100

Chinese Turker

418 10

Facebook US 32 50 US Expert 40 50

US Turker 299 12Facebook India 50 49 India Expert 20 100

India Turker 342 12

Crawled Data

Data from Renren

Fewer Experts

More Profiles per Experts

22

Individual Tester Accuracy

0 10 20 30 40 50 60 70 80 90 1000

20

40

60

80

100 Chinese TurkerUS TurkerUS Expert

Accuracy per Tester (%)

CDF

(%)

Not so

good :(

• Experts prove that humans can be accurate• Turkers need extra help…

Awesome!80% of experts

have >90% accuracy!

23

Accuracy of the Crowd Treat each classification by each tester

as a vote Majority makes final decision

Dataset Test Group False Positives

False Negatives

Renren Chinese Expert 0% 3%Chinese Turker 0% 63%

Facebook US

US Expert 0% 10%US Turker 2% 19%

Facebook India

India Expert 0% 16%India Turker 0% 50%

Almost Zero False Positives

Experts Perform

OkayTurkers Miss

Lots of Sybils

• False positive rates are excellent• Turkers need extra help against false negatives• What can be done to improve accuracy?

24

Eliminating Inaccurate Turkers

0 10 20 30 40 50 60 700

20

40

60

80

100ChinaIndiaUS

Turker Accuracy Threshold (%)

Fals

e N

egat

ive

Rate

(%) Dramatic

Improvement

Most workers are >40% accurate From 60% to

10% False Negatives• Only a subset of workers are removed (<50%)

• Getting rid of inaccurate turkers is a no-brainer

25

How Many Classifications Do You Need?

2 4 6 8 10 12 14 16 18 20 22 240

20

40

60

80

100

Classifications per Profile

Erro

r Ra

te (%

)

ChinaIndia

US

False Negatives

False Positives

• Only need a 4-5 classifications to converge• Few classifications = less cost

26

How to turn our results into a system?1. Scalability

OSNs with millions of users2. Performance

Improve turker accuracy Reduce costs

3. Preserve user privacy when giving data to turkers

27

Social NetworkHeuristics

User ReportsSuspicious Profiles

All Turkers

OSN employee

TurkerSelection Accurate Turkers

Very Accurate Turkers

Sybils

System Architecture

Filtering Layer

Crowdsourcing Layer

Filter Out Inaccurate

TurkersMaximize Usefulness

of High Accuracy Turkers

Rejected!

• Leverage Existing Techniques

• Help the System Scale

?

• Continuous Quality Control

• Locate Malicious Workers

Trace Driven Simulations Simulate 2000 profiles Error rates drawn from survey

data Vary 4 parameters

28

Accurate Turkers

Very Accurate Turkers

Classifications

Classifications

Threshold

Controversial Range

Results• Average 6 classifications per profile• <1% false positives• <1% false negatives

2

5

90%

20-50%

Results++• Average 8 classifications per profile• <0.1% false positives• <0.1% false negatives

29

Estimating Cost Estimated cost in a real-world social networks: Tuenti

12,000 profiles to verify daily 14 full-time employees Annual salary 30,000 EUR (~$20 per hour) $2240 per day

Crowdsourced Sybil Detection 20sec/profile, 8 hour day 50 turkers Facebook wage ($1 per hour) $400 per day

Cost with malicious turkers Estimate that 25% of turkers are malicious 63 turkers $1 per hour $504 per day

30

Takeaways Humans can differentiate between real and

fake profiles Crowdsourced Sybil detection is feasible Designed a crowdsourced Sybil detection

system False positives and negatives <1% Resistant to infiltration by malicious workers Sensitive to user privacy Low cost

Augments existing security systems

31 Outline Intro Crowdturfing Crowdsourced Sybil Detection Conclusion

Summary of My Work Future Work

32

Key Contributions1. Identified novel threat: crowdturfing

End-to-end spam measurements from customers to the web

Insider knowledge of social spam2. Novel defense: crowdsourced Sybil

detection User study proves feasibility of this approach Build an accurate, scalable system Possible deployment in real OSNs – LinkedIn

and RenRen

33

Ongoing Works1. Twitter follower markets

Locate customers who purchase bulk of Twitter followers

Study the un-follow dynamics of customers Develop systems to detect customers in the wild

2. Sybil detection using server-side click streams Build click models based on clickstream logs Extract click patterns of Sybil and normal users Develop systems to detect Sybil

34 Questions?

Thank you!

35

Potential Project Ideas Malware distribution in cellular networks

Identify malware related cellular network traffic Coordinated malware distribution campaigns Feature based detection

Advertising traffic analysis on mobile Apps Characterize ads traffic How effective for app-displayed ads to get click-

through? Are there malware delivered through ads?

36

Preserving User Privacy Showing profiles to crowdworkers raises

privacy issues Solution: reveal profile information in

context!

Crowdsourced

Evaluation!

Crowdsourced

Evaluation

Public Profile

Information

Friend-Only

Profile Informatio

nFriends

37

Clickstream Sybil Detection

Sybil Clickstream

Friend

Invite

Share

Browse

Profiles

Initial Final

96%

9%

68%

15% 2%

27%64%

20% 55%31%

Photo

Initial Final22% 3%

Share

MessageFrien

d Invite

Browse

Profiles

9% 4%

5%5%14%

9%

21%56%

56%

29%

86%87%

10%43%

14%93%

Normal Clickstream

Clickstream detection of Sybils1. Absolute number of

clicks2. Time between clicks3. Page traversal order

Challenges Real-time Massive scalability Low-overhead

38

Are Workers Real People?

0 5 10 15 200123456789

ZhubajieSandaha

Hours in the Day

% o

f Rep

orts

from

W

orke

rs

Late Night/Early Morning Work Day/Evening

Lunch Dinn

erZBJSDH

39

Crowdsourced Sybil Detection How to detect crowdturfed Sybils?

Blur the line between real and fake Difficult to detect algorithmically

Anecdotal evidence that people can spot Sybils 75% of friend requests from Sybils are rejected Can people distinguish in real/fake general?

User studies: experts, turkers, undergrads What features give Sybils away? Are certain Sybils tougher than others?

Integration of human and machine intelligence

40

Survey Fatigue US Experts US Turkers

0 3 6 90

20

40

60

80

100

0

20

40

60

80

100

Profile OrderTi

me

per

Profi

le (s

)

Accu

racy

(%)

No fatigue

0 8 16 24 32 40 480

20

40

60

80

100

0

20

40

60

80

100

AccuracyProfile Order

Tim

e pe

r Pr

ofile

(s)

Accu

racy

(%)

Fatigue matters

All testers speed up over time

41

Sybil Profile Difficulty

0 5 10 15 20 25 30 350

102030405060708090

100

Turker

Sybil Profiles Ordered By Turker Accuracy

Aver

age

Accu

racy

per

Sy

bil (

%)

Experts perform well on most difficult Sybils

Really difficult profiles

• Some Sybils are more stealthy• Experts catch more tough Sybils than turkers

Fighting Fire With Fire : Crowdsourcing Security Threats and Solutions on the Social Web

Documents

Transcript of Fighting Fire With Fire : Crowdsourcing Security Threats and Solutions on the Social Web