PhishAri: Automatic Realtime Phishing Detection on Twitter

Anupama Aggarwal, Ashwin Rajadesingan, Ponnurangam Kumaraguru

With the advent of online social media, phishers have started using social networks such as Twitter, Facebook, and Foursquare to spread phishing scams. Twitter is an immensely popular micro-blogging network where people post short messages of up to 140 characters called tweets. It has over 100 million active users who post about 200 million tweets every day. Because of this vast information dissemination, phishers have started using Twitter as a medium to spread phishing. Phishing is also harder to detect on Twitter than in email because phishing links spread quickly through the network, the content is short, and URLs are shortened (and thereby obfuscated) to fit the 140-character tweet limit.

Our technique, PhishAri, detects phishing on Twitter in realtime. We use Twitter-specific features along with URL features to detect whether a tweet posted with a URL is phishing or not. The Twitter-specific features include the tweet content and its characteristics, such as length, hashtags, and mentions, as well as characteristics of the user posting the tweet, such as the age of the account, the number of tweets, and the follower-followee ratio. Coupled with URL-based features, these Twitter-specific features prove to be a strong mechanism for detecting phishing tweets. We use machine learning classification techniques and detect phishing tweets with an accuracy of 92.52%. We have deployed our system for end users as an easy-to-use Chrome browser extension. The extension works in realtime and classifies a tweet as phishing or safe when it appears in a user's Twitter timeline. In this research, we show that we are able to detect phishing tweets at zero hour with high accuracy, which is much faster than public blacklists as well as Twitter's own defense mechanism for detecting malicious content.

We also performed a quick user evaluation of PhishAri in a laboratory study, which shows that users like PhishAri and are happy to use it in the real world. To the best of our knowledge, this is the first realtime, comprehensive, and usable system to detect phishing on Twitter.

Transcript of PhishAri: Automatic Realtime Phishing Detection on Twitter

Page 1: PhishAri: Automatic Realtime Phishing Detection on Twitter

Automatic Realtime Phishing Detection on Twitter

Anupama Aggarwal, Ashwin Rajadesingan, Ponnurangam Kumaraguru

Page 2: PhishAri: Automatic Realtime Phishing Detection on Twitter

Motivation: Some Statistics

• $520 million were lost worldwide from phishing attacks in 2011 alone. (RSA Report)

• In 2012, around 20% of all phishing attacks targeted Facebook

• Social network phishing attacks jumped 221% during Q1 of 2012

Page 3: PhishAri: Automatic Realtime Phishing Detection on Twitter

Phishing Detection on OSM: Current State of the Art

• Offline Spam Characterization & Detection Studies

• No characterization of phishing on OSM

• Lack of Realtime detection mechanisms

• Absence of end-user deployed systems

• Dependence on Spam/Phishing Blacklists

Page 4: PhishAri: Automatic Realtime Phishing Detection on Twitter

What Did We Do to Fill the Gap?

• Built a mechanism to Automatically detect phishing on Twitter in Realtime

• No dependency on Blacklists

• Deployed end-user system for Twitter users - Chrome Extension

Page 5: PhishAri: Automatic Realtime Phishing Detection on Twitter

Twitter 101

Hey, I am in Puerto Rico

attending @APWG eCrime research

Talking about #phishing on OSN

Tweets: up to 140 characters

Earn Money #help #money http://bit.ly/Pw637z

Page 6: PhishAri: Automatic Realtime Phishing Detection on Twitter

Twitter 101

Hey, I am in Puerto Rico

attending @APWG eCrime research

Talking about #phishing on OSN

Earn Money #help #money http://bit.ly/Pw637z

@Tag: To mention/reply to a Twitter user

#Tag: To mention a topic

URL in Tweet: To link external media

Page 7: PhishAri: Automatic Realtime Phishing Detection on Twitter

Twitter 101

attending @APWG eCrime research

I’ll follow Grey1!

I’ll follow Grey2!

We’ll follow Blue!

Followers

Followees

attending @APWG eCrime research

Retweet (RT)

Nice! I’ll share this tweet in my network!

Page 8: PhishAri: Automatic Realtime Phishing Detection on Twitter

Twitter 101

attending @APWG eCrime research

I’ll follow Grey1!

I’ll follow Grey2!

We’ll follow Blue!

Nice! I’ll share this tweet in my network!

Followers

Followees

attending @APWG eCrime research

Retweet (RT)

Twitter Timeline

Tweets by Followees

Retweets by Followees

Tweets by Self

Retweets by Self

Tweets with @Blue

@Blue

Page 9: PhishAri: Automatic Realtime Phishing Detection on Twitter

Challenges of Phishing Detection on Twitter

• Only 140 characters - very little information

• Use of short URLs in tweets

• 100,000 Tweets per minute - quick spread

• Phishing Blacklists are slow - not reliable

Page 10: PhishAri: Automatic Realtime Phishing Detection on Twitter

Our Contribution

• PhishAri: Automatic realtime phishing detection mechanism for Twitter

• More efficient than plain blacklisting method

• Better than Twitter’s own phishing detection mechanism

• Real-world implementation of the system - Chrome Extension for Twitter

Page 11: PhishAri: Automatic Realtime Phishing Detection on Twitter

Methodology

• Step 1: Classification Model for Phishing Detection

• Data Collection

• Feature Extraction

• Classification

• Step 2: Realtime end-user Interface

• Using pre-trained classification model

• Chrome Browser Extension

Page 12: PhishAri: Automatic Realtime Phishing Detection on Twitter

Data Collection

Wait for 3 days

• 1,589 Phishing Tweets

• 903 Unique phishing URLs

Page 13: PhishAri: Automatic Realtime Phishing Detection on Twitter

Features Used

• URL Features - Length, number of dots, characters, redirections

• WHOIS Features - Domain name, ownership period

• Tweet Features - Number of #tags, @mentions, length, trending topics

• Network Features - Follower/followee ratio, age of account, number of tweets
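As a rough illustration of the feature groups on this slide, the sketch below computes a few of the URL, tweet, and network features using only the Python standard library. The function names, dictionary keys, and the zero-followee fallback are assumptions made for this example and are not taken from PhishAri. WHOIS lookups, redirection counting, and trending-topic checks need network access and are left out.

```python
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Lexical URL features: total length, number of dots, host length."""
    parsed = urlparse(url)
    return {
        "url_length": len(url),
        "num_dots": url.count("."),
        "host_length": len(parsed.netloc),
    }


def tweet_features(text: str) -> dict:
    """Content features: tweet length, number of #tags and @mentions."""
    return {
        "tweet_length": len(text),
        "num_hashtags": len(re.findall(r"#\w+", text)),
        "num_mentions": len(re.findall(r"@\w+", text)),
    }


def network_features(followers: int, followees: int,
                     num_tweets: int, account_age_days: int) -> dict:
    """Features of the posting account."""
    # Assumption: when the account follows nobody, fall back to the
    # follower count to avoid division by zero (an arbitrary choice).
    ratio = followers / followees if followees else float(followers)
    return {
        "follower_followee_ratio": ratio,
        "num_tweets": num_tweets,
        "account_age_days": account_age_days,
    }


if __name__ == "__main__":
    tweet = "Earn Money #help #money http://bit.ly/Pw637z"
    print(tweet_features(tweet))
    print(url_features("http://bit.ly/Pw637z"))
```

The resulting dictionaries could be merged into a single feature vector per tweet before being passed to a classifier.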

Page 14: PhishAri: Automatic Realtime Phishing Detection on Twitter

Classification Results

Evaluation Metric      Naive Bayes   Decision Tree   Random Forest
Accuracy               87.02%        89.28%          92.52%
Precision (Phishing)   89.21%        88.05%          95.24%
Precision (Safe)       92.12%        94.15%          97.23%
Recall (Phishing)      68.32%        74.51%          92.21%
Recall (Safe)          85.68%        89.20%          95.54%
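For readers unfamiliar with these metrics, the following sketch shows how accuracy and per-class precision and recall are commonly derived from a binary confusion matrix, treating "phishing" as the positive class. The function name and the counts in the usage example are made up for illustration; they are not the counts behind the results on this slide.

```python
def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is 0."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute metrics from confusion-matrix counts.

    tp: phishing tweets classified as phishing
    fp: safe tweets classified as phishing
    tn: safe tweets classified as safe
    fn: phishing tweets classified as safe
    """
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision_phishing": _ratio(tp, tp + fp),
        "recall_phishing": _ratio(tp, tp + fn),
        "precision_safe": _ratio(tn, tn + fn),
        "recall_safe": _ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow.
    for name, value in classification_metrics(tp=90, fp=10, tn=80, fn=20).items():
        print(f"{name}: {value:.2%}")
```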

Page 15: PhishAri: Automatic Realtime Phishing Detection on Twitter

Evaluation

• Comparison with Blacklists

• 80.6% more phishing tweets detected by PhishAri at zero hour which were caught by blacklists after 3 days.

• Comparison with Twitter’s defense mechanism

• 84.6% more phishing tweets detected by PhishAri at zero hour which were marked as suspicious by Twitter after 3 days

Page 16: PhishAri: Automatic Realtime Phishing Detection on Twitter

Time Evaluation

• Used Intel Xeon 16 core Ubuntu server with 2.67 GHz processor and 32 GB RAM

• Multiprocessing Modules for faster processing

• Time required for the feature extraction & classification of a tweet is a maximum of 0.522 seconds (Min: 0.167 sec, Avg: 0.425 sec, Median 0.384 sec)
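Per-tweet timing statistics of this kind (minimum, maximum, average, median) can be gathered with a small harness like the one below. This is only a sketch: `time_calls` and the stand-in workload are hypothetical, and the actual PhishAri measurement setup is not described beyond what the slide states.

```python
import statistics
import time


def time_calls(func, inputs) -> dict:
    """Call func once per input and summarise the wall-clock durations in seconds."""
    durations = []
    for item in inputs:
        start = time.perf_counter()
        func(item)
        durations.append(time.perf_counter() - start)
    return {
        "min": min(durations),
        "max": max(durations),
        "avg": statistics.mean(durations),
        "median": statistics.median(durations),
    }


if __name__ == "__main__":
    # Stand-in workload; a real run would time feature extraction plus
    # classification for each tweet instead.
    tweets = ["tweet one", "tweet two #tag", "tweet three @user"]
    print(time_calls(lambda text: text.split(), tweets))
```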

Page 17: PhishAri: Automatic Realtime Phishing Detection on Twitter

Text Analysis

[Figure panels: Legitimate Tweets, Phishing Tweets]

Page 18: PhishAri: Automatic Realtime Phishing Detection on Twitter

PhishAri: RESTful API

• The classification model above is exposed through a RESTful API

• POST requests can be made to the API to query a tweet

• A pre-trained classifier model is used to classify new tweets
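The slides do not give the API's endpoint or payload format, so the sketch below only illustrates what a client-side POST query might look like. The URL, the `tweet_id` field, and the JSON encoding are all assumptions for this example. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical endpoint; the real PhishAri API URL is not given in the slides.
API_URL = "https://phishari.example/api/classify"


def build_classify_request(tweet_id: str,
                           api_url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the API to classify a tweet."""
    # Assumed payload shape: a JSON object carrying the tweet identifier.
    payload = json.dumps({"tweet_id": tweet_id}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_classify_request("12345")
    print(request.get_method(), request.full_url, request.data)
    # Sending would be: urllib.request.urlopen(request), which requires
    # a real, reachable endpoint.
```

A client such as a browser extension would then map the response label (phishing or safe) to the indicator shown next to the tweet.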

Page 19: PhishAri: Automatic Realtime Phishing Detection on Twitter

PhishAri Chrome Extension

Page 20: PhishAri: Automatic Realtime Phishing Detection on Twitter

PhishAri Chrome Extension

• Red / Green indicators in front of tweets with URLs

• Detects phishing tweets on

• User timeline

• Twitter search results

• Profiles of other users

• DMs (limited for now)

Page 21: PhishAri: Automatic Realtime Phishing Detection on Twitter

Demo

Page 22: PhishAri: Automatic Realtime Phishing Detection on Twitter

How Does the Extension Work?

• Integration of API with the Browser Extension

Page 23: PhishAri: Automatic Realtime Phishing Detection on Twitter

PhishAri Extension: User Experience and Statistics

• 78 Active Users

• User study shows that:

• users want support for other browsers and mobile apps

• users found the extension useful

• users want more robustness

Page 24: PhishAri: Automatic Realtime Phishing Detection on Twitter

Conclusion

• “Phish” + “Ari” = Realtime Automatic Detection

• 92.52% accuracy with Random Forest classifier

• Efficient - takes at most 0.522 seconds for the indicator to appear

• No dependency on blacklists

• Faster than blacklists

• Faster than Twitter’s own detection mechanism

Page 25: PhishAri: Automatic Realtime Phishing Detection on Twitter

Future Work

• Backend database for faster lookup

• Extend the scope of PhishAri from public tweets to all tweets

• Improve the response time of PhishAri so that indicators appear sooner

• Support for other browsers and mobile apps

Page 26: PhishAri: Automatic Realtime Phishing Detection on Twitter

Thank You!

Questions? Suggestions?

Page 27: PhishAri: Automatic Realtime Phishing Detection on Twitter

For any further information, please write to [email protected]

precog.iiitd.edu.in
