Trust-aware Recommender Systems

TRUST-AWARE RECOMMENDER SYSTEMS: A RESEARCH TARGETING THE SPARSITY PROBLEM IN CF

description

Trust-aware recommender systems based on collaborative filtering, with trust metrics as an additional input

Transcript of Trust-aware Recommender Systems

Page 1: Trust-aware Recommender Systems

TRUST-AWARE RECOMMENDER SYSTEMS: A RESEARCH TARGETING THE SPARSITY PROBLEM IN CF

Page 2: Trust-aware Recommender Systems

GROUP MEMBERS

MUHAMMAD YOUSAF (10-SE-18)

MUHAMMAD JAHANGEER SHAMS (10-SE-144)

MUHAMMAD ALI RAFIQUE (10-SE-88)

Page 3: Trust-aware Recommender Systems

COLLABORATIVE FILTERING

• WHAT IS COLLABORATIVE FILTERING?

• OPINIONS EXPRESSED BY THE OTHER SIMILAR USERS.

• COMPUTE THE PEARSON CORRELATION COEFFICIENT.

• APPLY FORMULA TO FIND PREDICTION

P(A,I) IS THE PREDICTION, RA IS THE AVERAGE RATING OF THE ACTIVE USER, W(A,U) IS THE PEARSON COEFFICIENT, R(U,I) IS THE RATING PROVIDED BY ANOTHER USER U, RU IS THE AVERAGE OF THE RATINGS PROVIDED BY USER U, AND K IS THE NUMBER OF SIMILAR USERS OR NEIGHBORS.
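The formula itself did not survive in this transcript. The symbols listed above match the standard mean-centered weighted sum used in collaborative filtering, so the slide's formula was presumably of the following form (a reconstruction from the definitions, not a copy of the slide; some formulations use the absolute value of the weights in the denominator):

```latex
P(a,i) = \bar{R}_a + \frac{\sum_{u=1}^{K} W(a,u)\,\bigl(R(u,i) - \bar{R}_u\bigr)}{\sum_{u=1}^{K} W(a,u)}
```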

• EFFECTIVE IN GENERATING RECOMMENDATIONS AND WIDELY USED.
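The two steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' implementation: the function names, the toy ratings, and the choice to keep only positively correlated neighbors are assumptions.

```python
from math import sqrt

def pearson(ratings_a, ratings_u):
    """Pearson correlation W(a,u) over the items both users have rated."""
    common = set(ratings_a) & set(ratings_u)
    if len(common) < 2:
        return None  # not computable with fewer than two co-rated items
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_u[i] - mean_u) for i in common)
    den = (sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
           * sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common)))
    return num / den if den else None

def predict(active, item, ratings):
    """P(a,i) = Ra + sum(W(a,u) * (R(u,i) - Ru)) / sum(W(a,u)) over neighbors u."""
    r_a = ratings[active]
    mean_a = sum(r_a.values()) / len(r_a)
    num = den = 0.0
    for user, r_u in ratings.items():
        if user == active or item not in r_u:
            continue
        w = pearson(r_a, r_u)
        if w is None or w <= 0:
            continue  # assumption: only positively correlated users are neighbors
        mean_u = sum(r_u.values()) / len(r_u)
        num += w * (r_u[item] - mean_u)
        den += w
    return mean_a + num / den if den else None

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"i1": 5, "i2": 3, "i3": 4},
    "bob":   {"i1": 4, "i2": 2, "i3": 3, "i4": 4},
    "carol": {"i1": 1, "i2": 5, "i4": 2},
}
prediction = predict("alice", "i4", ratings)
```

In this toy data only bob correlates positively with alice, so the prediction is alice's mean rating shifted by bob's deviation from his own mean on item i4.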


Page 4: Trust-aware Recommender Systems

PROBLEMS WITH TRADITIONAL CF

• RECOMMENDER SYSTEMS (RSs) BASED ON CF SUFFER FROM SOME INHERENT WEAKNESSES

• DATA SPARSITY CAUSES THE FIRST SERIOUS WEAKNESS OF COLLABORATIVE FILTERING

• COLD START USERS

• NEW USERS

• WHAT IF BLACK HAT USERS BECOME SIMILAR?

• AN ATTACKER CAN COPY THE RATINGS OF A TARGET USER AND FOOL THE SYSTEM INTO THINKING THAT THE ATTACKER IS IN FACT THE MOST SIMILAR USER TO THE TARGET USER.
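The copy attack described above is easy to reproduce with the Pearson coefficient. The sketch below uses invented profiles and a hypothetical item name ("pushed_item"); it only illustrates why a copied profile looks like a perfect neighbor.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation over the items rated by both profiles."""
    common = set(x) & set(y)
    if len(common) < 2:
        return None
    mx = sum(x[i] for i in common) / len(common)
    my = sum(y[i] for i in common) / len(common)
    num = sum((x[i] - mx) * (y[i] - my) for i in common)
    den = (sqrt(sum((x[i] - mx) ** 2 for i in common))
           * sqrt(sum((y[i] - my) ** 2 for i in common)))
    return num / den if den else None

target = {"i1": 5, "i2": 2, "i3": 4, "i4": 1}
honest = {"i1": 4, "i2": 3, "i3": 5, "i4": 2}

# The attacker copies the target's profile and adds a rating for the item to push.
attacker = dict(target)
attacker["pushed_item"] = 5

sim_honest = pearson(target, honest)
sim_attacker = pearson(target, attacker)
```

Because the copied ratings are identical on every co-rated item, the attacker's similarity is (up to rounding) 1, higher than that of any honest user whose taste merely resembles the target's.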


Page 5: Trust-aware Recommender Systems

TRUST-AWARE RECOMMENDER SYSTEMS

• QUALITY ASSESSMENT BY USERS.

• USERS RATE OTHER USERS TO EXPRESS THEIR LEVEL OF TRUST; THE SYSTEM CAN THEN AGGREGATE ALL THE TRUST STATEMENTS INTO A SINGLE TRUST NETWORK REPRESENTING THE RELATIONSHIPS BETWEEN USERS.

• TRUST METRICS PREDICT, BASED ON THE TRUST NETWORK, THE TRUSTWORTHINESS OF “UNKNOWN” USERS, I.E. USERS ABOUT WHOM A CERTAIN USER HAS NOT EXPRESSED A TRUST STATEMENT.
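As a rough illustration of the aggregation step, trust statements can be collected into a nested dictionary that acts as the trust network. The statement format `(source, target, value)`, the helper names, and the sample users are assumptions made for this sketch.

```python
from collections import defaultdict

def build_trust_network(statements):
    """Aggregate (source, target, value) trust statements into one network."""
    network = defaultdict(dict)
    for source, target, value in statements:
        network[source][target] = value
    return dict(network)

def unknown_users(network, user):
    """Users about whom `user` has expressed no trust statement."""
    everyone = set(network) | {t for targets in network.values() for t in targets}
    return everyone - set(network.get(user, {})) - {user}

statements = [
    ("alice", "bob", 1.0),
    ("bob", "carol", 1.0),
    ("alice", "dave", 0.6),
]
network = build_trust_network(statements)
to_predict = unknown_users(network, "alice")
```

A trust metric would then be asked to estimate alice's trust in each user in `to_predict`, here only carol.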


Page 6: Trust-aware Recommender Systems

TYPES OF TRUST METRICS

1. LOCAL TRUST METRICS

• TAKE INTO ACCOUNT THE VERY PERSONAL AND SUBJECTIVE VIEWS OF THE USERS AND PREDICT DIFFERENT VALUES OF TRUST IN OTHER USERS FOR EVERY SINGLE USER.

2. GLOBAL TRUST METRICS

• A GLOBAL “REPUTATION” VALUE THAT APPROXIMATES HOW THE COMMUNITY AS A WHOLE CONSIDERS A CERTAIN USER.

• PAGERANK, FOR EXAMPLE, IS A GLOBAL TRUST METRIC.
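To make the "global" idea concrete, here is a simplified PageRank-style power iteration over a trust network. It is a sketch under stated assumptions: unweighted trust edges, the commonly used damping factor of 0.85, and an invented four-user network. It returns one reputation value per user, independent of who is asking, which is exactly what distinguishes a global metric from a local one.

```python
def pagerank(network, damping=0.85, iterations=100):
    """Simplified PageRank: network maps each user to the users they trust."""
    users = sorted(set(network) | {t for ts in network.values() for t in ts})
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in users}
        for u in users:
            targets = network.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A user who trusts nobody spreads their rank evenly.
                for t in users:
                    new_rank[t] += damping * rank[u] / n
        rank = new_rank
    return rank

# Hypothetical network: three users trust carol, carol trusts alice.
network = {"alice": ["carol"], "bob": ["carol"], "carol": ["alice"], "dave": ["carol"]}
reputation = pagerank(network)
most_reputed = max(reputation, key=reputation.get)
```

A local metric, by contrast, would take the asking user as an extra argument and could return a different value for carol depending on whose point of view is used.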


Page 7: Trust-aware Recommender Systems

ARCHITECTURE OF TARS

[Figure: TARS architecture. Input: Trust[N*N], Rating[N*M]. First step: Trust Metric (giving Estimated Trust) and Similarity Metric (giving User Similarity). Second step: Rating Predictor. Output: Predicted Ratings[N*M].]

Page 8: Trust-aware Recommender Systems

TARS ARCHITECTURE

• TWO INPUTS

• THE TRUST MATRIX

• THE RATINGS MATRIX

• OUTPUT IS A MATRIX OF PREDICTED RATINGS

• THE DIFFERENCE WITH RESPECT TO TRADITIONAL CF SYSTEMS IS THE ADDITIONAL INPUT MATRIX OF TRUST STATEMENTS.

• FIRST STEP FINDS NEIGHBORS

• SECOND STEP PREDICTS RATINGS BASED ON A WEIGHTED SUM OF THE RATINGS GIVEN BY NEIGHBORS TO ITEMS
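The second step can be written so that it does not care where the neighbor weights come from. In the sketch below, `weights` stands for the output of the first step (estimated trust or user similarity); the function name, the mean-centering, and the toy data are assumptions rather than details taken from the slides.

```python
def predict_rating(active, item, ratings, weights):
    """Second step: weighted, mean-centered sum over the active user's neighbors.

    ratings: user -> {item: rating}
    weights: user -> {neighbor: weight}, produced by the first step
    """
    r_a = ratings.get(active, {})
    if not r_a:
        return None  # no ratings at all, so no mean to start from
    mean_a = sum(r_a.values()) / len(r_a)
    num = den = 0.0
    for neighbor, w in weights.get(active, {}).items():
        r_n = ratings.get(neighbor, {})
        if item in r_n and w > 0:
            mean_n = sum(r_n.values()) / len(r_n)
            num += w * (r_n[item] - mean_n)
            den += w
    return mean_a + num / den if den else None

ratings = {
    "alice": {"i1": 4, "i2": 2},
    "bob":   {"i1": 5, "i3": 3},
    "carol": {"i3": 5, "i2": 3},
}
# Hypothetical first-step output: alice's estimated trust in her neighbors.
trust_weights = {"alice": {"bob": 1.0, "carol": 0.5}}
predicted = predict_rating("alice", "i3", ratings, trust_weights)
```

Swapping `trust_weights` for a similarity-based dictionary of the same shape turns the same predictor into traditional CF.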


Page 9: Trust-aware Recommender Systems

TARS ARCHITECTURE

• THE KEY DIFFERENCE IS IN HOW NEIGHBORS ARE IDENTIFIED AND HOW THEIR WEIGHTS ARE COMPUTED

• BOTH TRUST METRICS AND SIMILARITY METRICS CAN BE USED TO FIND NEIGHBORS

• IN BOTH OUTPUT MATRICES, ROW I CONTAINS THE NEIGHBORS OF USER I, AND THE VALUE IN COLUMN J IS THE WEIGHT (SIMILARITY OR TRUST) OF USER J AS A NEIGHBOR.


Page 10: Trust-aware Recommender Systems

HOW PROBLEMS ARE SOLVED

• SPARSITY REDUCED

• ESPECIALLY FOR COLD START USERS

• JUST ONE TRUST EXPRESSION NEEDED

• WITH FEW TRUST STATEMENTS, THE TRUST MODULE IS MORE ACCURATE THAN THE SIMILARITY MODULE IS WITH FEW RATINGS IN COMMON

• ATTACKERS

• ATTACKS ARE ADDRESSED BY A TRUST-AWARE TECHNIQUE GIVEN THAT THE FAKE IDENTITIES USED FOR THE ATTACKS ARE NOT TRUSTED EXPLICITLY BY THE ACTIVE USERS

• RATINGS THEY HAVE INTRODUCED CAN’T GAME THE SYSTEM.
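A small sketch of why untrusted fake identities have no influence: if only explicitly trusted users are consulted, ratings injected by other accounts are never read. The item, the user names, and the plain (unweighted) average are invented for illustration.

```python
def trusted_ratings(active, item, ratings, trust):
    """Ratings on `item` from users the active user trusts explicitly."""
    return {u: ratings[u][item]
            for u in trust.get(active, ())
            if item in ratings.get(u, {})}

ratings = {
    "bob":   {"camera": 2},
    "carol": {"camera": 3},
    "fake1": {"camera": 5},  # fake identities pushing the item
    "fake2": {"camera": 5},
}
trust = {"alice": {"bob", "carol"}}

used = trusted_ratings("alice", "camera", ratings, trust)
trust_based = sum(used.values()) / len(used)
naive = sum(r["camera"] for r in ratings.values()) / len(ratings)
```

The naive average over all accounts is pulled up by the fake profiles, while the trust-based one depends only on bob and carol.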


Page 11: Trust-aware Recommender Systems

RELATED WORK

• “TRUST IN RECOMMENDER SYSTEMS” BY O’DONOVAN AND SMYTH PROPOSES ALGORITHMS FOR COMPUTING PROFILE-LEVEL TRUST AND ITEM-LEVEL TRUST

• TRUST VALUES ARE DERIVED FROM RATINGS (OF THE MOVIELENS DATASET) RATHER THAN TAKEN FROM USERS

• GOLBECK DESIGNED A TRUST METRIC CALLED TIDALTRUST [2]

• ALTHOUGH HER EVALUATION USED A DATASET OF JUST 300 MEMBERS, IT IS INTERESTING TO NOTE THAT HER FINDINGS ARE SIMILAR TO OURS

• GOLBECK’S PHD THESIS [2] FOCUSES ON TRUST IN WEB-BASED SOCIAL NETWORKS

• RECOMMENDER SYSTEM, FILMTRUST

• USERS CAN RATE FILMS AND WRITE REVIEWS AND THEY CAN ALSO EXPRESS TRUST STATEMENTS


Page 12: Trust-aware Recommender Systems

EMPIRICAL VALIDATION

• DATASET FROM EPINIONS.COM

• EPINIONS IS A CONSUMER OPINION SITE WHERE USERS CAN REVIEW ITEMS

• USERS CAN ALSO EXPRESS THEIR WEB OF TRUST, LISTING THE USERS THEY FIND VALUABLE, AND FLAG THE USERS THEY FIND OFFENSIVE

• A CRAWLER RECORDED THE RATINGS AND TRUST STATEMENTS ISSUED BY A USER, THEN MOVED TO THE USERS TRUSTED BY THAT USER AND RECURSIVELY DID THE SAME.

• 49,290 USERS, 139,738 DIFFERENT ITEMS RATED AT LEAST ONCE, AND 664,824 REVIEWS

• SPARSITY IS 99.99135%

• SPARSITY IS MUCH HIGHER IN EPINIONS THAN IN MOVIELENS (THE MOST COMMONLY USED DATASET FOR TESTING)

• THE LARGE MAJORITY OF USERS WERE COLD START USERS

• 52.82% GAVE FEWER THAN 5 REVIEWS

• 45% OF THE RATINGS ARE 5 (BEST), 29% ARE 4, 11% ARE 3, 8% ARE 2 AND 7% ARE 1 (WORST)
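For readers unfamiliar with the terms, the sketch below shows one common way to compute sparsity (the share of empty cells in the user-by-item ratings matrix) and the share of cold start users. The definitions, function names, and toy data are assumptions for illustration and are not claimed to reproduce the figures on this slide.

```python
def sparsity(ratings, n_items):
    """Share of (user, item) cells that hold no rating."""
    n_users = len(ratings)
    n_ratings = sum(len(r) for r in ratings.values())
    return 1.0 - n_ratings / (n_users * n_items)

def cold_start_share(ratings, threshold=5):
    """Share of users with fewer than `threshold` ratings."""
    cold = sum(1 for r in ratings.values() if len(r) < threshold)
    return cold / len(ratings)

# Toy data: 3 users, 5 items, 8 ratings in total.
ratings = {
    "u1": {"i1": 5},
    "u2": {"i1": 4, "i2": 3},
    "u3": {"i1": 5, "i2": 4, "i3": 3, "i4": 4, "i5": 5},
}
toy_sparsity = sparsity(ratings, n_items=5)
toy_cold_share = cold_start_share(ratings)
```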


Page 13: Trust-aware Recommender Systems

EVALUATION MEASURES

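• LEAVE-ONE-OUT TECHNIQUE TO EVALUATE THE RECOMMENDER

• PROBLEM WITH MAE: EVERY PREDICTION COUNTS EQUALLY, SO USERS WITH MANY RATINGS DOMINATE THE ERROR

• SOLUTION: MEAN ABSOLUTE USER ERROR (MAUE), THE AVERAGE OF THE PER-USER MEAN ERRORS

• COVERAGE PROBLEM: RATINGS COVERAGE IS DOMINATED BY HEAVY RATERS IN THE SAME WAY

• SOLUTION: USERS COVERAGE, THE SHARE OF USERS FOR WHOM AT LEAST ONE PREDICTION CAN BE MADE

• WE REPORT RESULTS FOR COLD START USERS, WHO PROVIDED FROM 1 TO 4 RATINGS; HEAVY RATERS, WHO PROVIDED MORE THAN ... RATINGS; AND BLACK SHEEP, USERS WHO PROVIDED MORE THAN 4 RATINGS AND FOR WHOM THE AVERAGE DISTANCE OF THEIR RATING ON ITEM I FROM THE MEAN RATING OF ITEM I IS GREATER THAN 1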
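The difference between the per-rating and per-user measures is easiest to see on a tiny example. The sketch below assumes evaluation results arrive as `(user, true_rating, predicted_rating)` triples, with `None` when no prediction could be made; that format and the sample values are illustrative assumptions.

```python
def evaluate(results):
    """Compute MAE, MAUE, ratings coverage and users coverage."""
    errors_by_user = {}
    users = set()
    for user, truth, pred in results:
        users.add(user)
        if pred is not None:
            errors_by_user.setdefault(user, []).append(abs(truth - pred))
    all_errors = [e for errs in errors_by_user.values() for e in errs]
    if not all_errors:
        return None  # nothing was predicted
    per_user = [sum(errs) / len(errs) for errs in errors_by_user.values()]
    return {
        "mae": sum(all_errors) / len(all_errors),       # every rating counts once
        "maue": sum(per_user) / len(per_user),          # every user counts once
        "ratings_coverage": len(all_errors) / len(results),
        "users_coverage": len(errors_by_user) / len(users),
    }

results = [
    ("heavy", 4, 4.0), ("heavy", 5, 5.0), ("heavy", 3, 3.0), ("heavy", 4, 4.0),
    ("cold", 5, 3.0),   # one badly predicted rating for a cold start user
    ("new", 2, None),   # no prediction possible
]
scores = evaluate(results)
```

Here MAE is 0.4 because the heavy rater's four perfect predictions dominate, while MAUE is 1.0 because the heavy rater and the cold start user each count once.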


Page 14: Trust-aware Recommender Systems

RESULTS OF THE EXPERIMENTS

• STANDARD COLLABORATIVE FILTERING IS OUTPERFORMED BY A SIMPLE AVERAGE (THE TRUSTALL ALGORITHM)

• TRUSTALL USES NO WEIGHTED AVERAGE; EQUIVALENTLY, IT USES A SIMILARITY FACTOR OF 1 FOR ALL USERS

• MAE

• 0.821 FOR TRUSTALL WHILE 0.843 FOR STANDARD CF

• COVERAGE

• ON EPINIONS, COVERAGE IS 51.28% FOR CF AND 88.20% FOR TRUSTALL

• ON COLD START USERS

• THE COVERAGE OF CF IS 3.22% WHILE THE COVERAGE OF TRUSTALL IS 92.92%

• THE MAE OF CF IS 1.094 WHILE THE MAE OF TRUSTALL IS 0.856

• NOTE THAT IN THE REAL WORLD, COLD START USERS MAKE UP MORE THAN 50% OF TOTAL USERS
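A minimal sketch of the TrustAll baseline together with a leave-one-out loop, assuming TrustAll simply averages every other user's rating of the item (weight 1 for everyone). The function names and the three-user data set are invented for illustration.

```python
def trustall_predict(active, item, ratings):
    """Unweighted average of all other users' ratings for the item."""
    others = [r[item] for u, r in ratings.items() if u != active and item in r]
    return sum(others) / len(others) if others else None

def leave_one_out(ratings, predict):
    """Hide each rating in turn and try to predict it from the rest."""
    results = []
    for user, user_ratings in ratings.items():
        for item, truth in user_ratings.items():
            hidden = {u: {i: r for i, r in rs.items() if (u, i) != (user, item)}
                      for u, rs in ratings.items()}
            results.append((user, item, truth, predict(user, item, hidden)))
    return results

ratings = {
    "a": {"x": 5},
    "b": {"x": 3, "y": 4},
    "c": {"x": 4},
}
results = leave_one_out(ratings, trustall_predict)
```

Item y has a single rating, so hiding it leaves nothing to average and the prediction is `None`; such cases are what lower coverage.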


Page 15: Trust-aware Recommender Systems

RESULTS


Page 16: Trust-aware Recommender Systems

ANOTHER VARIATION MT1

• MT1 USES ONLY THE USERS EXPLICITLY TRUSTED BY THE ACTIVE USER

• THIS MEANS SETTING THE PROPAGATION HORIZON AT 1 FOR THE LOCAL TRUST METRIC MOLETRUST

• MORE ACCURATE FOR COLD START USERS THAN CF

• NOW CONSIDER ALL RATINGS

• MAUE ACHIEVED BY MT1 AND CF IS RESPECTIVELY 0.790 AND 0.938

• CF IS ABLE TO PREDICT MORE RATINGS THAN MT1 (RATINGS COVERAGE IS 51.28% VS. 28.33%), BUT

• MT1 IS ABLE TO GENERATE AT LEAST ONE PREDICTION FOR MORE USERS (USERS COVERAGE IS 46.64% VS. 40.78%).


Page 17: Trust-aware Recommender Systems

CONTINUED COMPARISON OF MT1 AND CF

• CF PERFORMS MUCH WORSE THAN MT1 WHEN WE CONSIDER THE ERROR ACHIEVED OVER EVERY SINGLE USER

• CF WORKS WELL, FOR COVERAGE AND IN TERMS OF ERROR, FOR HEAVY RATERS BUT POORLY FOR COLD START USERS

• FOR CONTROVERSIAL ITEMS AND OPINIONATED USERS MT1 OUTPERFORMS BOTH CF AND TRUSTALL


Page 18: Trust-aware Recommender Systems

COMPARISON BY PROPAGATING TRUST

• HERE WE ANALYZE MT2, MT3 AND MT4, THE ALGORITHMS WHICH PROPAGATE TRUST UP TO DISTANCE 2, 3 AND 4 RESPECTIVELY

• PROPAGATION ALLOWS THE SYSTEM TO REACH MORE USERS AND PREDICT A TRUST SCORE FOR MORE OF THEM

• PREDICTION COVERAGE OF THE RS ALGORITHM INCREASES (SEE GRAPH)

• INCREASES FROM 28.33% FOR MT1, TO 60.47% FOR MT2, TO 74.37% FOR MT3

• THE DOWNSIDE OF THIS IS THAT THE ERROR INCREASES AS WELL

• MAUE IS 0.674 FOR MT1, 0.820 FOR MT2 AND 0.854 FOR MT3

• THE TRUST PROPAGATION HORIZON BASICALLY REPRESENTS A TRADEOFF BETWEEN ACCURACY AND COVERAGE

• MOREOVER, GLOBAL TRUST METRICS ARE NOT APPROPRIATE FOR RECOMMENDER SYSTEMS
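The effect of the propagation horizon can be sketched with a breadth-first walk of the trust network. This is a simplified stand-in for a local trust metric, not the MoleTrust algorithm itself: it ignores the values of trust statements, and the linear decay `(horizon - n + 1) / horizon` for a user at distance `n` is an assumption made for the illustration.

```python
from collections import deque

def propagate_trust(network, source, horizon):
    """Estimate trust from `source` in every user reachable within `horizon` steps."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        user = queue.popleft()
        if distance[user] == horizon:
            continue  # do not walk past the propagation horizon
        for trusted in network.get(user, ()):
            if trusted not in distance:
                distance[trusted] = distance[user] + 1
                queue.append(trusted)
    # Assumed linear decay: closer users receive higher estimated trust.
    return {u: (horizon - n + 1) / horizon
            for u, n in distance.items() if u != source}

# Hypothetical chain of trust statements.
network = {"alice": ["bob"], "bob": ["carol"], "carol": ["dave"]}

mt1 = propagate_trust(network, "alice", horizon=1)
mt2 = propagate_trust(network, "alice", horizon=2)
mt3 = propagate_trust(network, "alice", horizon=3)
```

With horizon 1 only bob is reached; each extra step adds neighbors (better coverage) but with lower estimated trust, which mirrors the accuracy and coverage tradeoff stated above.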


Page 19: Trust-aware Recommender Systems

COMBINING ESTIMATED TRUST AND USER SIMILARITY

• POSSIBLE IN OUR PROPOSED SYSTEM BUT NOT EFFECTIVE

• BEST COVERAGE

• IT IS BETTER THAN CF IN TERMS OF ERROR

• BUT WORSE THAN MTX

• RESULTS ARE NOT SATISFACTORY EVEN IF WE USE TRUST WHEN BOTH ARE AVAILABLE
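One way to picture the combination is as a merge of the two neighbor-weight dictionaries for a single active user. Both merge rules below ("prefer_trust" and "average") and all names are hypothetical; the slides do not specify how the weights were combined beyond using trust when both are available.

```python
def combine_weights(trust_w, similarity_w, mode="prefer_trust"):
    """Merge trust-based and similarity-based neighbor weights."""
    combined = {}
    for user in set(trust_w) | set(similarity_w):
        t, s = trust_w.get(user), similarity_w.get(user)
        if t is None:
            combined[user] = s          # only similarity is available
        elif s is None or mode == "prefer_trust":
            combined[user] = t          # only trust, or trust takes priority
        else:
            combined[user] = (t + s) / 2  # mode == "average"
    return combined

trust_w = {"bob": 1.0, "dave": 0.5}
similarity_w = {"bob": 0.2, "carol": 0.7}

preferred = combine_weights(trust_w, similarity_w)
averaged = combine_weights(trust_w, similarity_w, mode="average")
```

The merged dictionary covers every user reachable by either method, which is consistent with the combination having the best coverage.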


Page 20: Trust-aware Recommender Systems

DISCUSSION OF RESULTS

• CONSIDERING ONLY DIRECTLY TRUSTED USERS GIVES THE MINIMUM ERROR WITH ACCEPTABLE COVERAGE, ESPECIALLY IN THE CASE OF BLACK SHEEP AND CONTROVERSIAL ITEMS

• TRUST-BASED RSs ARE MOST IMPORTANT FOR COLD START USERS

• WITH TRUST PROPAGATION, COVERAGE INCREASES BUT ERROR ALSO INCREASES

• CF PERFORMS VERY POORLY FOR COLD START USERS.


Page 21: Trust-aware Recommender Systems

CONCLUSIONS

• RECOMMENDER SYSTEMS SHOULD BE IMPROVED USING TRUST INFO

• RESULTS INDICATE THAT TRUST IS VERY EFFECTIVE IN ALLEVIATING THE WEAKNESSES OF RSs

• THE TRUST PROPAGATION HORIZON REPRESENTS A TRADEOFF BETWEEN ACCURACY AND COVERAGE


Page 22: Trust-aware Recommender Systems

WHAT DID WE LEARN

• TRUST INFO IS IMPORTANT FOR RECOMMENDATION

• THE PRESENTATION APPROACH IN THE PAPER WAS GOOD

• IT WAS VERY GOOD WORK

• THE PROPOSED SYSTEM SOLVES THE CONVENTIONAL PROBLEMS IN CF
