Fighting Fire With Fire : Crowdsourcing Security Threats and Solutions on the Social Web

Fighting Fire With Fire: Crowdsourcing Security Threats and Solutions on the Social Web Gang Wang, Christo Wilson, Manish Mohanlal, Ben Y. Zhao Computer Science Department, UC Santa Barbara. [email protected]


Transcript of Fighting Fire With Fire : Crowdsourcing Security Threats and Solutions on the Social Web

Page 1: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Fighting Fire With Fire: Crowdsourcing Security Threats and Solutions on the Social Web

Gang Wang, Christo Wilson, Manish Mohanlal, Ben Y. Zhao
Computer Science Department, UC Santa Barbara

[email protected]

Page 2: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

A Little Bit About Me
• 3rd Year PhD @ UCSB
• Intern at MSR Redmond 2011
• Intern at LinkedIn (Security Team) 2012
Research Interests:
• Security and Privacy
• Online Social Networks
• Crowdsourcing
• Data Driven Analysis and Modeling

Page 3: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Recap: Threats on the Social Web
• Social spam is a serious problem
  • 10% of wall posts with URLs on Facebook are spam
  • 70% phishing
• Sybils underlie many attacks on Online Social Networks
  • Spam, spear phishing, malware distribution
  • Sybils blend completely into the social graph
• Existing countermeasures are ineffective
  • Blacklists only catch 28% of spam
  • Sybil detectors from the literature do not work

Page 4: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Sybil Accounts on Facebook
• In-house estimates
  • Early 2012: 54 million
  • August 2012: 83 million
  • 8.7% of the user base
• Fake likes
  • VirtualBagel: useless site, 3,000 likes in 1 week
  • 75% from Cairo, age 13-17
• Sybils attack at large scale
• Advertisers are fleeing Facebook

Page 5: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Sybil Accounts on Twitter
• 92% of Newt Gingrich's followers are Sybils
• Russian political protests on Twitter
  • 25,000 Sybils sent 440,000 tweets
  • 1 million Sybils controlled overall
[Figure: follower count over time, annotated "4,000 new followers/day" and "100,000 new followers in 1 day"]
• Twitter is vital infrastructure
• Sybils usurping Twitter for political ends

Page 6: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Talk Outline
1. Malicious crowdsourcing sites – crowdturfing [WWW'12]
  • Spam and Sybils generated by real people
  • Huge threat in China
  • Growing threat in the US
2. Crowdsourced Sybil detection [NDSS'13]
  • If attackers can do it, why not defenders?
  • User Study
    • Can humans detect Sybils?
    • Is this cost effective?
  • Design a crowdsourced Sybil detection system

Page 7: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Outline
• Intro
• Crowdturfing
  • Crowdsourcing Overview
  • What is Crowdturfing
  • How bad is it?
  • Crowdturfing in the US
• Crowdsourced Sybil Detection
• Conclusion

Page 8: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

High Quality Sybils and Spam
• We tend to think of spam as "low quality"
• What about high quality spam and Sybils?
• Open questions
  • What is the scope of this problem?
  • Generated manually or mechanically?
  • What are the economics?
[Example spam post] Gang Wang: MaxGentleman is the bestest male enhancement system avalable. http://cid-ce6ec5.space.live.com/
[Image labels: FAKE; Stock Photographs]

Page 9: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Black Market Crowdsourcing
• Amazon's Mechanical Turk
  • Admins remove spammy jobs
• Black market crowdsourcing websites
  • Spam and fake accounts, generated by real people
  • Major force in China, expanding in the US and India
Crowdturfing = Crowdsourcing + Astroturfing

Page 10: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web


Page 11: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Crowdturfing Workflow
• Customers
  • Initiate campaigns
  • May be legitimate businesses
• Agents
  • Manage campaign and workers
  • Verify completed tasks
• Workers
  • Complete tasks for money
  • Control Sybils on other websites
[Diagram labels: Campaign, Tasks, Reports]

Page 12: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Crowdturfing in China

Site | Active Since | Total Campaigns | Workers | Reports | $ for Workers | $ for Site
Zhubajie | Nov. 2006 | 76K | 169K | 6.3M | $2.4M | $595K

[Figure: Site Growth Over Time. Campaigns per month and dollars per month (log scale) for Zhubajie and Sandaha, Jan. 08 to Jan. 11]

Page 13: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Spreading Spam on Weibo
[Figure: CDF of approximate audience size per campaign]
• 50% of campaigns reach >100,000 users
• 8% reach >1 million users
• Campaigns reach huge audiences
• How effective are these campaigns?

Page 14: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

How Effective is Crowdturfing?
• Initiate our own campaigns as a customer
  • 4 benign ad campaigns promoting real e-commerce sites
  • All clicks route through our measurement server

Campaign: Vacation (advertise for a discount vacation through a travel agent)

Target | Cost | Tasks | Reports | Clicks | Cost Per Click
Weibo | $15 | 100 | 108 | 28 | $0.21
QQ | | | 118 | 187 | $0.09
Forums | | | 123 | 3 | $0.90

Web Display Ads CPC = $0.01

• Travel agency reported sales statistics
  • 2 sales/month before our campaign
  • 11 sales within 24 hours after our campaign
  • Each trip sells for $1500!

Page 15: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Crowdturfing in America

Category | US Site | % Crowdturfing
Legit | Mechanical Turk | 12%
Black Market | MinuteWorkers | 70%
Black Market | MyEasyTasks | 83%
Black Market | Microworkers | 89%
Black Market | ShortTasks | 95%

• Other studies support these findings
  • Freelancer
    • 28% spam jobs
    • Bulk OSN accounts, likes, spam
    • Connections to botnet operators
  • Poultry Markets
    • $20 for 1000 followers
    • Ponzi scheme

Page 16: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Takeaways
• Identified a new threat: Crowdturfing
  • Growing exponentially in size and revenue in China
  • $1 million per month on just one site
  • Cost effective: $0.21 per click
  • Starting to grow in US and other countries
    • Mechanical Turk, Freelancer
    • Twitter Follower Markets
• Huge problem for existing security systems
  • Little to no automation to detect
  • Turing tests fail

Page 17: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Outline
• Intro
• Crowdturfing
• Crowdsourced Sybil Detection
  • Open Questions
  • User Study
  • Accuracy Analysis
  • System Design
• Conclusion

Page 18: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Crowdsourcing Sybil Defense
• Defenders are losing the battle against OSN Sybils
• Idea: build a crowdsourced Sybil detector
  • Leverage human intelligence
  • Scalable
• Open Questions
  • How accurate are users?
  • What factors affect detection accuracy?
  • Is crowdsourced Sybil detection cost effective?

Page 19: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

User Study
• Two groups of users
  • Experts – CS professors, masters, and PhD students
  • Turkers – crowdworkers from Mechanical Turk and Zhubajie
• Three ground-truth datasets of full user profiles
  • Renren – given to us by Renren Inc.
  • Facebook US and India
    • Crawled
    • Legitimate profiles – 2-hops from our own profiles
    • Suspicious profiles – stock profile images
    • Banned suspicious profiles = Sybils
[Image labels: Stock Picture; Crowdturfing Site]

Page 20: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

[Screenshot of the survey interface, with callouts]
• Progress
• Classifying Profiles: Real or fake? Why?
• Browsing Profiles: Screenshot of Profile (Links Cannot be Clicked)
• Navigation Buttons: Testers may skip around and revisit profiles

Page 21: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Experiment Overview

Dataset | Sybil Profiles | Legit. Profiles | Test Group | # of Testers | Profiles per Tester
Renren | 100 | 100 | Chinese Expert | 24 | 100
Renren | 100 | 100 | Chinese Turker | 418 | 10
Facebook US | 32 | 50 | US Expert | 40 | 50
Facebook US | 32 | 50 | US Turker | 299 | 12
Facebook India | 50 | 49 | India Expert | 20 | 100
Facebook India | 50 | 49 | India Turker | 342 | 12

• Data from Renren
• Crawled Data
• Fewer Experts
• More Profiles per Expert

Page 22: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Individual Tester Accuracy
[Figure: CDF (%) of accuracy per tester (%) for Chinese Turker, US Turker, and US Expert. Annotations: "Not so good :(" and "Awesome! 80% of experts have >90% accuracy!"]
• Experts prove that humans can be accurate
• Turkers need extra help…

Page 23: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Accuracy of the Crowd
• Treat each classification by each tester as a vote
• Majority makes final decision

Dataset | Test Group | False Positives | False Negatives
Renren | Chinese Expert | 0% | 3%
Renren | Chinese Turker | 0% | 63%
Facebook US | US Expert | 0% | 10%
Facebook US | US Turker | 2% | 19%
Facebook India | India Expert | 0% | 16%
Facebook India | India Turker | 0% | 50%

Annotations: Almost Zero False Positives; Experts Perform Okay; Turkers Miss Lots of Sybils

• False positive rates are excellent
• Turkers need extra help against false negatives
• What can be done to improve accuracy?
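
The voting rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the label strings, the toy data, and the tie-handling choice (a tie counts as legitimate, which favors low false positives) are all assumptions.

```python
from collections import Counter

def majority_vote(votes, tie_label="legit"):
    """Return the label chosen by most testers for one profile.

    votes: list of "sybil" / "legit" strings, one per tester.
    tie_label: label returned on a tie (an assumption, not from the talk).
    """
    counts = Counter(votes)
    sybil, legit = counts["sybil"], counts["legit"]
    if sybil == legit:
        return tie_label
    return "sybil" if sybil > legit else "legit"

def error_rates(decisions, truth):
    """Return (false_positive_rate, false_negative_rate).

    False positive: a legitimate profile labeled Sybil.
    False negative: a Sybil labeled legitimate.
    """
    fp = sum(1 for p, d in decisions.items() if truth[p] == "legit" and d == "sybil")
    fn = sum(1 for p, d in decisions.items() if truth[p] == "sybil" and d == "legit")
    n_legit = sum(1 for t in truth.values() if t == "legit")
    n_sybil = sum(1 for t in truth.values() if t == "sybil")
    return fp / n_legit, fn / n_sybil

# Hypothetical toy data: three profiles, three votes each.
votes_by_profile = {
    "p1": ["sybil", "sybil", "legit"],
    "p2": ["legit", "legit", "legit"],
    "p3": ["legit", "sybil", "legit"],
}
truth = {"p1": "sybil", "p2": "legit", "p3": "sybil"}

decisions = {p: majority_vote(v) for p, v in votes_by_profile.items()}
fp_rate, fn_rate = error_rates(decisions, truth)
print(decisions, fp_rate, fn_rate)
```

In the toy data the crowd misses one of the two Sybils (p3) and flags no legitimate profile, the same pattern the table shows for turkers: few false positives, many false negatives.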

Page 24: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Eliminating Inaccurate Turkers
[Figure: False Negative Rate (%) vs. Turker Accuracy Threshold (%) for China, India, and US. Annotations: "Dramatic Improvement", "Most workers are >40% accurate", "From 60% to 10% False Negatives"]
• Only a subset of workers are removed (<50%)
• Getting rid of inaccurate turkers is a no-brainer

Page 25: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

How Many Classifications Do You Need?
[Figure: Error Rate (%) vs. Classifications per Profile, showing False Negatives and False Positives for China, India, and US]
• Only need 4-5 classifications to converge
• Fewer classifications = less cost
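
One way to build intuition for why a handful of votes is enough: if every vote were independently wrong with the same probability (a simplifying assumption that is not from the talk, whose curves are empirical), the error of a majority vote is a binomial tail. The per-vote error of 0.2 used below is a hypothetical value.

```python
from math import comb

def majority_error(p_wrong, n_votes):
    """Probability that a majority vote over n_votes independent votes is wrong.

    Ties (possible only for even n_votes) are counted as half an error,
    as if broken by a fair coin flip.
    """
    total = 0.0
    for k in range(n_votes + 1):
        # Probability that exactly k of the n_votes votes are wrong.
        prob_k = comb(n_votes, k) * p_wrong**k * (1 - p_wrong) ** (n_votes - k)
        if 2 * k > n_votes:
            total += prob_k
        elif 2 * k == n_votes:
            total += 0.5 * prob_k
    return total

# Hypothetical per-vote error rate of 20%.
for n in (1, 3, 5, 7, 9):
    print(n, round(majority_error(0.2, n), 4))
```

Under this model the error drops quickly over the first few votes and then flattens, which is the qualitative shape the slide describes. Real turkers are not independent or equally accurate, so the model is only a guide.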

Page 26: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

How to turn our results into a system?
1. Scalability
  • OSNs with millions of users
2. Performance
  • Improve turker accuracy
  • Reduce costs
3. Preserve user privacy when giving data to turkers

Page 27: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

System Architecture
[Diagram labels: Social Network, Heuristics, User Reports, Suspicious Profiles, All Turkers, Turker Selection, Accurate Turkers, Very Accurate Turkers, Rejected!, Sybils, OSN employee]
Filtering Layer
• Leverage Existing Techniques
• Help the System Scale
Crowdsourcing Layer
• Filter Out Inaccurate Turkers
• Maximize Usefulness of High Accuracy Turkers
• Continuous Quality Control
• Locate Malicious Workers

Page 28: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Trace Driven Simulations
• Simulate 2000 profiles
• Error rates drawn from survey data
• Vary 4 parameters
  • Classifications by Accurate Turkers
  • Classifications by Very Accurate Turkers
  • Threshold
  • Controversial Range
• Parameter values shown on the slide, in order: 2, 5, 90%, 20-50%
Results
• Average 6 classifications per profile
• <1% false positives
• <1% false negatives
Results++
• Average 8 classifications per profile
• <0.1% false positives
• <0.1% false negatives
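
A minimal sketch of a two-layer simulation in this spirit: a few accurate turkers vote first, and a profile is escalated to very accurate turkers only when the share of Sybil votes falls in a controversial range. Everything here is an assumption made for illustration: the function names, the per-vote error rates (the talk draws them from survey data, which is not reproduced here), the reading of 2, 5, and 20-50% as lower-layer votes, upper-layer votes, and controversial range, and the final decision rule (Sybil if more than half of the deciding votes say so).

```python
import random

def count_sybil_votes(is_sybil, n_votes, p_wrong, rng):
    """Simulate n_votes independent votes; return how many say 'Sybil'."""
    sybil_votes = 0
    for _ in range(n_votes):
        vote_is_correct = rng.random() >= p_wrong
        says_sybil = is_sybil if vote_is_correct else not is_sybil
        sybil_votes += int(says_sybil)
    return sybil_votes

def simulate(n_profiles=2000, sybil_fraction=0.5,
             lower_votes=2, upper_votes=5, controversial=(0.2, 0.5),
             p_wrong_lower=0.15, p_wrong_upper=0.05, seed=0):
    """Run the two-layer vote over synthetic profiles (all defaults hypothetical)."""
    rng = random.Random(seed)
    n_sybil = false_pos = false_neg = total_votes = 0
    low, high = controversial
    for _ in range(n_profiles):
        is_sybil = rng.random() < sybil_fraction
        n_sybil += int(is_sybil)
        # Lower layer: accurate turkers vote first.
        frac = count_sybil_votes(is_sybil, lower_votes, p_wrong_lower, rng) / lower_votes
        total_votes += lower_votes
        if low <= frac <= high:
            # Controversial result: escalate to very accurate turkers.
            frac = count_sybil_votes(is_sybil, upper_votes, p_wrong_upper, rng) / upper_votes
            total_votes += upper_votes
        decided_sybil = frac > 0.5
        if decided_sybil and not is_sybil:
            false_pos += 1
        elif is_sybil and not decided_sybil:
            false_neg += 1
    n_legit = n_profiles - n_sybil
    return {
        "avg_votes": total_votes / n_profiles,
        "false_positive_rate": false_pos / max(n_legit, 1),
        "false_negative_rate": false_neg / max(n_sybil, 1),
    }

result = simulate()
print(result)
```

The point of the two layers is cost: clear-cut profiles are settled with a couple of cheap votes, and the more expensive, more accurate votes are spent only on disputed profiles.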

Page 29: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Estimating Cost
• Estimated cost in a real-world social network: Tuenti
  • 12,000 profiles to verify daily
  • 14 full-time employees
  • Annual salary 30,000 EUR (~$20 per hour)
  • $2240 per day
• Crowdsourced Sybil Detection
  • 20sec/profile, 8 hour day
  • 50 turkers
  • Facebook wage ($1 per hour)
  • $400 per day
• Cost with malicious turkers
  • Estimate that 25% of turkers are malicious
  • 63 turkers
  • $1 per hour
  • $504 per day
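
The figures on this slide can be reproduced with simple arithmetic. Two inputs below are assumptions rather than statements from the slide: 6 classifications per profile (borrowed from the "Average 6 classifications per profile" simulation result), and modeling the malicious-turker overhead as 25% extra workers. With those assumptions the numbers match the slide.

```python
import math

SECONDS_PER_HOUR = 3600
HOURS_PER_DAY = 8

def daily_cost(workers, hourly_wage, hours=HOURS_PER_DAY):
    """Total daily wage bill in dollars."""
    return workers * hourly_wage * hours

def turkers_needed(profiles, votes_per_profile, seconds_per_vote, hours=HOURS_PER_DAY):
    """Workers required to cast all votes within one working day."""
    total_hours = profiles * votes_per_profile * seconds_per_vote / SECONDS_PER_HOUR
    return math.ceil(total_hours / hours)

# In-house review: 14 employees at roughly $20 per hour.
employee_cost = daily_cost(14, 20)

# Crowdsourced review: 12,000 profiles, 20 s per vote,
# 6 votes per profile (assumed), $1 per hour.
turkers = turkers_needed(12_000, 6, 20)
turker_cost = daily_cost(turkers, 1)

# Assume 25% extra turkers to absorb malicious workers.
padded_turkers = math.ceil(turkers * 1.25)
padded_cost = daily_cost(padded_turkers, 1)

print(employee_cost, turkers, turker_cost, padded_turkers, padded_cost)
```

Even with the extra workers, the crowdsourced estimate stays well under a quarter of the in-house cost.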

Page 30: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Takeaways
• Humans can differentiate between real and fake profiles
• Crowdsourced Sybil detection is feasible
• Designed a crowdsourced Sybil detection system
  • False positives and negatives <1%
  • Resistant to infiltration by malicious workers
  • Sensitive to user privacy
  • Low cost
• Augments existing security systems

Page 31: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Outline
• Intro
• Crowdturfing
• Crowdsourced Sybil Detection
• Conclusion
  • Summary of My Work
  • Future Work

Page 32: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Key Contributions
1. Identified novel threat: crowdturfing
  • End-to-end spam measurements from customers to the web
  • Insider knowledge of social spam
2. Novel defense: crowdsourced Sybil detection
  • User study proves feasibility of this approach
  • Build an accurate, scalable system
  • Possible deployment in real OSNs – LinkedIn and RenRen

Page 33: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Ongoing Work
1. Twitter follower markets
  • Locate customers who purchase Twitter followers in bulk
  • Study the un-follow dynamics of customers
  • Develop systems to detect customers in the wild
2. Sybil detection using server-side click streams
  • Build click models based on clickstream logs
  • Extract click patterns of Sybil and normal users
  • Develop systems to detect Sybils

Page 34: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Questions?

Thank you!

Page 35: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Potential Project Ideas
• Malware distribution in cellular networks
  • Identify malware related cellular network traffic
  • Coordinated malware distribution campaigns
  • Feature based detection
• Advertising traffic analysis on mobile Apps
  • Characterize ads traffic
  • How effective are app-displayed ads at getting click-throughs?
  • Is malware delivered through ads?

Page 36: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Preserving User Privacy
• Showing profiles to crowdworkers raises privacy issues
• Solution: reveal profile information in context!
[Diagram labels: Public Profile Information, Friend-Only Profile Information, Friends, Crowdsourced Evaluation]

Page 37: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Clickstream Sybil Detection
[Figure: state-transition diagrams with transition percentages on the edges. Sybil Clickstream states: Initial, Friend Invite, Share, Browse Profiles, Final. Normal Clickstream states: Initial, Photo, Share, Message, Friend Invite, Browse Profiles, Final]
Clickstream detection of Sybils
1. Absolute number of clicks
2. Time between clicks
3. Page traversal order
Challenges
• Real-time
• Massive scalability
• Low-overhead
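
The three signals listed on this slide (number of clicks, time between clicks, page traversal order) can be extracted from a session log roughly as follows. This is a sketch under assumptions: the event format, the page-type names, and the function name are hypothetical, and the talk does not say how the click models are actually built.

```python
from collections import Counter

def clickstream_features(events):
    """Summarize one session.

    events: list of (timestamp_seconds, page_type) tuples, sorted by time.
    Returns the click count, the mean gap between clicks, and the
    empirical probability of each page-to-page transition.
    """
    times = [t for t, _ in events]
    pages = [page for _, page in events]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    transitions = Counter(zip(pages, pages[1:]))
    outgoing = Counter(pages[:-1])
    transition_probs = {
        (src, dst): count / outgoing[src]
        for (src, dst), count in transitions.items()
    }
    return {
        "num_clicks": len(events),
        "mean_gap": sum(gaps) / len(gaps) if gaps else None,
        "transition_probs": transition_probs,
    }

# Hypothetical session: repeated friend invites at a steady 2-second pace.
session = [
    (0, "initial"),
    (2, "friend_invite"),
    (4, "friend_invite"),
    (6, "friend_invite"),
    (8, "final"),
]
features = clickstream_features(session)
print(features)
```

A detector would compare such features across accounts; for example, a session dominated by one transition at a near-constant pace looks different from a session that wanders between photos, messages, and profiles.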

Page 38: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Are Workers Real People?
[Figure: % of Reports from Workers vs. Hours in the Day for Zhubajie (ZBJ) and Sandaha (SDH). Annotations: Late Night/Early Morning, Work Day/Evening, Lunch, Dinner]

Page 39: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Crowdsourced Sybil Detection
• How to detect crowdturfed Sybils?
  • Blur the line between real and fake
  • Difficult to detect algorithmically
• Anecdotal evidence that people can spot Sybils
  • 75% of friend requests from Sybils are rejected
  • Can people distinguish real from fake in general?
• User studies: experts, turkers, undergrads
  • What features give Sybils away?
  • Are certain Sybils tougher than others?
• Integration of human and machine intelligence

Page 40: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Survey Fatigue
[Figure: two panels, US Experts and US Turkers, each plotting Time per Profile (s) and Accuracy (%) against Profile Order. Annotations: "No fatigue", "Fatigue matters"]
• All testers speed up over time

Page 41: Fighting  Fire  With  Fire : Crowdsourcing Security  Threats  and  Solutions  on the Social Web

Sybil Profile Difficulty
[Figure: Average Accuracy per Sybil (%) vs. Sybil Profiles Ordered By Turker Accuracy. Annotations: "Experts perform well on most difficult Sybils", "Really difficult profiles"]
• Some Sybils are more stealthy
• Experts catch more tough Sybils than turkers