Evil Twins: Modeling Power Users in Attacks on Recommender Systems

7/9/2014 1

David C. Wilson and Carlos E. Seminario

University of North Carolina Charlotte

Have you ever wondered whether these are REAL ratings & REAL reviews entered by REAL people??

Do you always TRUST this information??

Research Problem

Attacks on Recommender Systems

“Push”

“Nuke”

“Disrupt”

Example of a Push Attack

User Profiles (columns: Avengers, Titanic, Avatar, Alien, Psycho, Twilight; blank cells omitted)
Bob: 5 2 3 3 ?
Ted: 2 4 4 1
Fred: 3 1 3 1 2
Ginger: 4 2 3 1 1
Jodie: 3 3 2 1 3 1
Jill: 3 1 2
Tom: 4 3 3 3 2
Corey: 5 1 5 1

Source: Mobasher et al, 2007


Attack profiles added to the user profiles:
Alice: 5 3 2 5
Axel: 5 1 4 2 5
Alvin: 5 2 2 2 5

Filler: Ratings to correlate with regular users

Target Item, Attack Intent



Filler Size: Avg # Ratings per Profile

Attack Size: # of Attack Profiles



“Twilight” predicted rating (User-based): before attack 2, after attack 5


Generating Attack User Profiles

Filler Item Selection and Ratings are key to a successful attack!!


Attack Models using Statistical “Average” Users


Random: Filler @ normal distribution around avg of all item ratings
Average: Filler @ normal distribution around avg of each item’s rating
Bandwagon: Popular @ max rating, filler @ Random model
Segment: Popular @ max rating, filler @ min rating
Obfuscated: Noise injection, User shifting, Target shifting
Average-over-popular: Average model with % Popular filler

Source: O’Mahony et al, 2002; Lam & Riedl 2004; Mobasher et al, 2007; Williams et al 2006; Hurley et al, 2009
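The Random and Average models listed above can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation used in the cited work: the function name `make_attack_profiles`, the dense matrix layout with NaN for unrated items, the 1 to 5 rating scale, and the rounding and clipping of sampled ratings are all hypothetical choices.

```python
import numpy as np

def make_attack_profiles(ratings, target_item, n_profiles, filler_size,
                         model="random", r_min=1, r_max=5, seed=0):
    """Build push-attack profiles for a users x items matrix (NaN = unrated).

    model="random":  filler ratings drawn around the mean of all ratings
    model="average": filler ratings drawn around each item's own mean
    Assumes every item has at least one rating.
    """
    rng = np.random.default_rng(seed)
    n_items = ratings.shape[1]
    global_mean, global_std = np.nanmean(ratings), np.nanstd(ratings)
    item_mean = np.nanmean(ratings, axis=0)
    item_std = np.nanstd(ratings, axis=0)

    # Filler items are drawn from everything except the target item.
    candidates = np.array([i for i in range(n_items) if i != target_item])
    profiles = np.full((n_profiles, n_items), np.nan)
    for p in range(n_profiles):
        filler = rng.choice(candidates, size=filler_size, replace=False)
        if model == "random":
            values = rng.normal(global_mean, global_std, size=filler_size)
        elif model == "average":
            values = rng.normal(item_mean[filler], item_std[filler])
        else:
            raise ValueError(f"unknown attack model: {model}")
        # Assumption: ratings are integers on the r_min..r_max scale.
        profiles[p, filler] = np.clip(np.rint(values), r_min, r_max)
        # Push intent: the target item always gets the maximum rating.
        profiles[p, target_item] = r_max
    return profiles
```

Here `n_profiles` corresponds to the attack size and `filler_size` to the number of filler ratings per profile; a nuke attack would assign `r_min` to the target item instead.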


Research Gap

Attack Models and Attack Detection research have focused on “average” attackers and related user models

Source: Mobasher et al, 2007; Burke, et al, 2011; Chirita, et al, 2005; Mehta and Nejdl, 2009; Hurley et al, 2009; Williams et al 2007; Sandvig et al, 2007; Cheng and Hurley, 2010, Bhaumik et al, 2006

However, attackers continue to find new and more powerful strategies to attack Recommender Systems

User-User Similarity Matrix as a Social Graph

Social Network Analysis: central users are influential

Viral Marketing: connected users exert influence

Source: Palau et al 2004; Wasserman and Faust, 1994; Domingos and Richardson, 2001; Anand and Griffiths, 2011

Influential “Power” Users vs “Average” Users

Influence: impact a user has on recommendations

Selecting Power Users from a Dataset

Finding the optimal set of Power Users in a social network is complex, so heuristic approaches for Power User selection are used:

Number of Ratings in the user profile

Aggregated Similarity: sum of similarities between users

In-Degree Centrality: number of neighborhoods a user is in

Source: Kempe et al, 2003; Rashid et al, 2005; Goyal and Lakshmanan, 2012; Herlocker et al, 2004; Lathia et al, 2008; Wilson and Seminario, 2013
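The three selection heuristics above can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names, the NaN-for-unrated matrix layout, the unweighted Pearson similarity over co-rated items, and the choice to sum similarity over each user's top-k neighbors (suggested by the three similarity values per row in the example slides) are assumptions.

```python
import numpy as np

def pearson_similarity(ratings):
    """User-user Pearson correlation over co-rated items (NaN = unrated)."""
    n_users = ratings.shape[0]
    sim = np.zeros((n_users, n_users))
    for u in range(n_users):
        for v in range(u + 1, n_users):
            both = ~np.isnan(ratings[u]) & ~np.isnan(ratings[v])
            if both.sum() < 2:
                continue  # too few co-rated items to correlate
            a = ratings[u, both] - ratings[u, both].mean()
            b = ratings[v, both] - ratings[v, both].mean()
            denom = np.sqrt((a @ a) * (b @ b))
            if denom > 0:
                sim[u, v] = sim[v, u] = (a @ b) / denom
    return sim

def power_user_scores(ratings, k=3):
    """Return (num_ratings, agg_sim, in_degree), one score per user."""
    n_users = ratings.shape[0]
    sim = pearson_similarity(ratings)
    num_ratings = np.sum(~np.isnan(ratings), axis=1)
    agg_sim = np.zeros(n_users)
    in_degree = np.zeros(n_users, dtype=int)
    for u in range(n_users):
        others = np.array([v for v in range(n_users) if v != u])
        # Top-k most similar other users form u's neighborhood.
        neighbors = others[np.argsort(-sim[u, others], kind="stable")][:k]
        agg_sim[u] = sim[u, neighbors].sum()
        in_degree[neighbors] += 1  # each neighbor appears in one more neighborhood
    return num_ratings, agg_sim, in_degree

def top_power_users(scores, n):
    """Indices of the n highest-scoring users (ties keep original order)."""
    return np.argsort(-scores, kind="stable")[:n]
```

With k neighbors per user, the in-degree values always sum to k times the number of users. The eight-user InDegree example in these slides sums to 24, which is consistent with three neighbors per user.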

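User Profiles (columns: Avengers, Titanic, Avatar, Alien, Psycho, Twilight, NumRatings; blank cells omitted)
Bob: 5 2 3 3 (NumRatings 4)
Ted: 2 4 4 1 (NumRatings 4)
Fred: 3 1 3 1 2 (NumRatings 5)
Ginger: 4 2 3 1 1 (NumRatings 5)
Jodie: 3 3 2 1 3 1 (NumRatings 6)
Jill: 3 1 2 (NumRatings 3)
Tom: 4 3 3 3 2 (NumRatings 5)
Corey: 5 1 5 1 (NumRatings 4)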

Number of Ratings Power Users

User-Item Matrix

Similarity (columns: Bob, Ted, Fred, Ginger, Jodie, Jill, Tom, Corey, AggSim; blank cells omitted)
Bob: 0.756 0.718 0.945 (AggSim 2.419)
Ted: 0.500 0.522 1.000 (AggSim 2.022)
Fred: 0.756 0.674 0.426 (AggSim 1.856)
Ginger: 1.000 0.866 1.000 (AggSim 2.866)
Jodie: 0.767 0.866 1.000 (AggSim 2.633)
Jill: 1.000 0.866 0.866 (AggSim 2.732)
Tom: 0.945 0.866 0.645 (AggSim 2.456)
Corey: 1.000 1.000 1.000 (AggSim 3.000)

Aggregated Similarity Power Users

User-User Similarity Matrix (Pearson Correlation, no weighting)

Similarity (columns: Bob, Ted, Fred, Ginger, Jodie, Jill, Tom, Corey; blank cells omitted)
Bob: 0.378 0.479 0.472
Ted: 0.250 0.348 0.333
Fred: 0.378 0.449 0.284
Ginger: 0.639 0.577 0.500
Jodie: 0.639 0.538 0.667
Jill: 0.333 0.433 0.433
Tom: 0.472 0.577 0.538
Corey: 0.500 0.667 0.433
InDegree (per column): Bob 2, Ted 0, Fred 1, Ginger 7, Jodie 5, Jill 1, Tom 4, Corey 4

In Degree Power Users

User-User Similarity Matrix (Pearson Correlation, weighted)

Social Graph based on User-User Similarities

Degree Centrality (in-degree of each node in the graph): Ginger 7, Jodie 5, Corey 4, Tom 4, Bob 2, Fred 1, Jill 1, Ted 0

Real Power Users Have Impact

Investigated feasibility of Power User Attack (PUA):
Select top Power Users from dataset
Use Power User profiles as “filler”
Select target items (“new” items)
Attack parameters: size, intent
CF Algorithms and Datasets

Source: Wilson and Seminario, 2013; Seminario and Wilson, 2014

Small number of Power Users (< 5% of dataset users) can have significant effects on recommendations

PUA effective against User-based and SVD-based RS

Source: Wilson and Seminario, 2013; Seminario and Wilson, 2014

Power User Model

Synthetic Power Users (SPU’s) based on Real Power Users (RPU’s)

1. Select top RPU’s from dataset: InDegree, Number of Ratings, Aggregated Similarity

2. Generate SPU’s for Attack: SPU profiles based on RPU’s (the “evil twins”)

3. Evaluate the Model Before Attack

4. Select Attack Parameters

5. Evaluate the Model After Attack

Generate Synthetic Power Users (SPU’s)

Filler Size based on RPU’s profile size
Item Selection based on RPU’s item popularity
Item Ratings based on RPU’s average item ratings

Objective was to emulate (not duplicate) Real Power Users
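One way to read the three generation rules is sketched below. This is an assumption-laden illustration, not the Power User Model itself: the function name `generate_spus`, the NaN-for-unrated layout, sampling a profile size from the observed RPU sizes, sampling items in proportion to how many RPUs rated them, and rounding the RPU mean rating onto a 1 to 5 scale are all hypothetical choices.

```python
import numpy as np

def generate_spus(rpu_ratings, n_spus, r_min=1, r_max=5, seed=0):
    """Generate synthetic profiles from an RPU x items matrix (NaN = unrated)."""
    rng = np.random.default_rng(seed)
    rated = ~np.isnan(rpu_ratings)
    n_items = rpu_ratings.shape[1]

    # Filler size: drawn from the observed RPU profile sizes.
    profile_sizes = rated.sum(axis=1)

    # Item selection: probability proportional to popularity among RPUs.
    popularity = rated.sum(axis=0).astype(float)
    item_probs = popularity / popularity.sum()
    n_candidates = int((popularity > 0).sum())

    # Item ratings: mean RPU rating per item (items no RPU rated are never sampled).
    item_mean = np.nansum(rpu_ratings, axis=0) / np.maximum(popularity, 1)

    spus = np.full((n_spus, n_items), np.nan)
    for s in range(n_spus):
        size = min(int(rng.choice(profile_sizes)), n_candidates)
        items = rng.choice(n_items, size=size, replace=False, p=item_probs)
        spus[s, items] = np.clip(np.rint(item_mean[items]), r_min, r_max)
    return spus
```

Because items and sizes are sampled rather than copied, each synthetic profile resembles the RPUs in aggregate without duplicating any single one of them.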

Evaluating the Power User Model

Evaluation Metrics

Before Attack: How well do the SPU’s match the RPU’s?

Precision and Recall, Mean Absolute Error, Statistical differences

Power User Model – Evaluation Before the Attack

Selecting SPU’s from ML100K dataset

We were able to find the majority of NumRatings and InDegree SPU’s in the top-50

Good SPU emulation of RPU’s

Source for ML100K: grouplens.org


SPU vs RPU MAE Differences


InDegree and NumRatings SPU’s have better ablation results than AggSim

InDegree SPU’s indicate a strong level of influence


SPU vs RPU Statistical Characteristics

No differences across all power user selection methods:
Average number of ratings per power user
Average user rating (across all power users)
Average item rating (across all power user items)

The Power User Model generates SPU’s that match key statistical measures of RPU’s

Good SPU emulation of RPU’s

Power User Model

Synthetic Power Users (SPU’s) based on Real Power Users (RPU’s)

1. Select top RPU’s from dataset: InDegree, Number of Ratings, Aggregated Similarity

2. Generate SPU’s for Attack: SPU profiles based on RPU’s (the “evil twins”)

3. Evaluate the Model Before Attack

4. Select Attack Parameters
Attack Size: 5% (= 50 power users)
Attack Intent: Push (promote)
Target Items: “New”, injected at run time

5. Evaluate the Model After Attack

Evaluating the Power User Model

After Attack: How effective is the Power User Attack with SPU’s?

Robustness: Hit Ratio, Rank, Prediction Shift

Source: O’Mahony et al, 2002, Lam and Riedl, 2004; Mobasher et al, 2007; Burke et al, 2011

Hit Ratio: % of users with the target item in their top-N list

Rank: position of the target item in the top-N list

Prediction Shift: change in predicted rating for the target item

High Hit Ratio, low Rank, and high Prediction Shift indicate more impact

Source: Lam and Riedl, 2004; Mobasher et al, 2007; Burke et al, 2011; Seminario and Wilson 2014
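The three robustness metrics can be computed from predictions made before and after the attack. The sketch below assumes hypothetical inputs, two dense users x items arrays of predicted ratings for the same users and items, and a hypothetical function name `attack_metrics`. Averaging Rank only over users whose top-N list contains the target, and breaking ties in the target's favor, are assumptions as well.

```python
import numpy as np

def attack_metrics(pred_before, pred_after, target_item, top_n=10):
    """Return (hit_ratio, avg_rank, prediction_shift) for one target item.

    hit_ratio is a fraction in [0, 1]; multiply by 100 for a percentage.
    """
    # Prediction Shift: mean change in the target's predicted rating.
    shift = pred_after[:, target_item] - pred_before[:, target_item]
    prediction_shift = float(np.mean(shift))

    # Rank of the target per user after the attack (1 = top of the list);
    # only strictly higher predictions push the target down.
    target_scores = pred_after[:, [target_item]]
    ranks = 1 + np.sum(pred_after > target_scores, axis=1)

    # Hit Ratio: share of users whose top-N list contains the target.
    in_top_n = ranks <= top_n
    hit_ratio = float(np.mean(in_top_n))

    # Rank: averaged over the users who actually have the target in top-N.
    avg_rank = float(np.mean(ranks[in_top_n])) if in_top_n.any() else float("nan")
    return hit_ratio, avg_rank, prediction_shift
```

For a push attack, a higher `hit_ratio`, a lower `avg_rank`, and a larger positive `prediction_shift` all point to a more effective attack.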

Power User Model – Evaluation After the Attack

Attacks with InDegree and Number of Ratings SPU’s have high impacts on User-based and SVD-based recommenders

Summary & Future Work

Power User Model produces InDegree and NumRatings SPU’s that effectively emulate RPU’s

Power User Attack with SPU’s is effective against User-based and SVD-based CF systems

Small number of SPU’s (< 5%) can have significant effects on recommender predictions

Future:
Explore other Power User selection methods
Extend Power User Model (item selection)
Evaluate other CF algorithms and domains
Mitigate Power User attacks

Power User Attacks using InDegree and NumRatings methods can impact recommender systems

System operators should be aware of, and able to defend against, Power User Attacks

Thank You!!

Carlos Seminario

David Wilson
