Linking Organizational Social Networking Profiles
PROJECT ID: H0791030
JEROME CHENG ZHI KAI (A0080860H)
Example: Holiday Inn (Twitter, Facebook)
Motivation: Individuals
• Want to find profiles, but no single place lists them
• Sometimes listed on company websites, but:
  • No standardized location
  • Not all companies bother
Motivation: Organizations
• Track competitors’ use of social media
• Find imposter profiles
Problem Definition
[Diagram: Organization Name → System → Social Profiles, each labelled Official, Affiliate, or Unrelated]
Related Work
• Prior work focused on deduplicating profiles of individuals
• Relevant here: the profile characteristics that work focused on
Related Work: Usernames
• Connecting Corresponding Identities across Communities (Zafarani & Liu, 2009)
• Connecting users across social media sites: a behavioral-modeling approach (Zafarani & Liu, 2013)
• Studying User Footprints in Different Online Social Networks (Malhotra et al., 2012)
Related Work: Created Content
• Identifying Users Across Social Tagging Systems (Iofciu, Fankhauser, Abel & Bischoff, 2011)
Methodology: System Design
1. Input: organization’s name (query)
2. Search Facebook/Twitter APIs, retrieve profiles
3. Convert profiles into feature vectors
4. Classify the resulting feature vectors
Classifier Choice
• Evaluated scikit-learn’s:
  • Decision Tree
  • Naïve Bayes
  • Support Vector Machine
  • Logistic Regression
  • Random Forest
• Features aren’t independent, so tree-based classifiers are well-suited
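
A comparison like the one listed above could be run roughly as follows. This is only a sketch: the data is synthetic (`make_classification` stands in for the real profile feature vectors), `GaussianNB` is an assumed choice of Naïve Bayes variant, and macro-averaged F1 is an assumed scoring metric.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the profile feature vectors (assumption): three
# classes playing the role of official / affiliate / unrelated.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=6, n_redundant=2,
    n_classes=3, random_state=0,
)

candidates = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "Support Vector": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean macro-averaged F1 over 10 stratified folds for each candidate.
scores = {
    name: cross_val_score(clf, X, y, cv=10, scoring="f1_macro").mean()
    for name, clf in candidates.items()
}

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {score:.3f}")
```

The ranking on synthetic data says nothing about the ranking on real profiles; the point is only the shape of the comparison.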
Feature Breakdown: Name-based
• Normalized Edit Distance
  • Query to Username
  • Query to Display Name
• Edit Distance
  • Query to Username
  • Query to Display Name
  • Length of Query
  • Length of Username
  • Length of Display Name
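
The name-based features above can be sketched in plain Python. The slides do not say how the edit distance is normalized or whether case is folded; dividing by the longer string's length and lowercasing first are assumptions, and `name_features` is a hypothetical helper name.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def name_features(query: str, username: str, display_name: str) -> list:
    """Seven name-based features in the order listed on the slide."""
    q, u, d = query.lower(), username.lower(), display_name.lower()
    return [
        normalized_edit_distance(q, u),
        normalized_edit_distance(q, d),
        edit_distance(q, u),
        edit_distance(q, d),
        len(q), len(u), len(d),
    ]


print(name_features("Holiday Inn", "HolidayInn", "Holiday Inn"))
```

Supplying the raw distance and the three lengths alongside the normalized distance lets the classifier learn its own notion of "close enough" for short versus long names.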
Feature Breakdown: Name-based Quirks
• Need to handle abbreviations and stopwords
  • Citigroup versus Citi, General Motors versus GM
• Compute two edit distances: one on the original strings, one on the processed strings
• Use the better-scoring of the two
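
One way to read the two-distance idea is sketched below. The processing steps are assumptions, since the slides do not specify them: here the processed comparison removes a small illustrative stopword list and also tries the query's initials, and the lower (better) normalized distance wins. `STOPWORDS`, `initials`, and `best_name_distance` are hypothetical names.

```python
STOPWORDS = {"the", "of", "and", "inc", "corp", "co", "ltd"}  # illustrative list


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, single-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[-1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def normalized(a: str, b: str) -> float:
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def strip_stopwords(name: str) -> str:
    return " ".join(w for w in name.lower().split() if w not in STOPWORDS)


def initials(name: str) -> str:
    return "".join(w[0] for w in name.split())


def best_name_distance(query: str, candidate: str) -> float:
    """Lower is better: min over the original and processed comparisons."""
    q, c = query.lower(), candidate.lower()
    scores = [normalized(q, c)]                      # original strings
    pq, pc = strip_stopwords(q), strip_stopwords(c)
    if pq and pc:
        scores.append(normalized(pq, pc))            # stopwords removed
        scores.append(normalized(initials(pq), pc))  # query abbreviated
    return min(scores)


print(best_name_distance("General Motors", "GM"))
print(best_name_distance("Citigroup", "Citi"))
```

Initials cover the General Motors/GM case but not truncations such as Citigroup/Citi, which would need a prefix rule or an abbreviation dictionary.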
Feature Breakdown: Description
• Occurrences of Query
• Cosine Similarity
  • Query and Description
  • DuckDuckGo Description and Profile Description
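
A minimal sketch of the description features, assuming a bag-of-words representation with raw term counts (the slides do not say whether any weighting such as TF-IDF was used). The tokenizer and function names are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase alphanumeric tokens (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between two term-count vectors."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def query_occurrences(query: str, description: str) -> int:
    """Case-insensitive count of non-overlapping query matches."""
    return description.lower().count(query.lower())


description = "Welcome to Holiday Inn. Holiday Inn hotels worldwide."
print(query_occurrences("Holiday Inn", description))
print(cosine_similarity("Holiday Inn", description))
```

The same `cosine_similarity` function serves both comparisons on the slide; only the first argument changes (the query, or a reference description fetched from DuckDuckGo).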
Feature Breakdown: Language Models
• Construct a bigram language model for each of:
  • Official profile descriptions
  • Affiliate profile descriptions
  • Unrelated profile descriptions
• Features: probability that the candidate description belongs to each model
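
The language-model features can be illustrated with a small bigram model per class. This is a sketch under assumptions: add-one (Laplace) smoothing, whitespace tokenization, and log-probabilities as the feature values are choices made here, not details from the slides, and the training descriptions are invented.

```python
import math
from collections import Counter


class BigramModel:
    """Bigram language model with add-one smoothing (assumed smoothing)."""

    def __init__(self, descriptions):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for text in descriptions:
            tokens = ["<s>"] + text.lower().split() + ["</s>"]
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams) + 1  # +1 slot for unseen words

    def log_prob(self, text: str) -> float:
        """Log-probability of the description under this model."""
        tokens = ["<s>"] + text.lower().split() + ["</s>"]
        total = 0.0
        for prev, word in zip(tokens, tokens[1:]):
            numerator = self.bigrams[(prev, word)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


# Invented training descriptions, one small corpus per class.
training = {
    "official": ["the official twitter account of acme hotels",
                 "official page of acme hotels worldwide"],
    "affiliate": ["acme hotels downtown branch news and offers",
                  "acme hotels careers and job openings"],
    "unrelated": ["i love coffee and travel",
                  "fan of football and music"],
}
models = {label: BigramModel(texts) for label, texts in training.items()}

candidate = "the official account of acme hotels"
features = {label: model.log_prob(candidate) for label, model in models.items()}
print(features)
```

All three scores go into the feature vector rather than only the best one, so the classifier can weigh how strongly a description favours one class over the others.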
Evaluation: Ground Truth Creation
1. Retrieved organizations from Freebase
2. Searched for profiles on Twitter/Facebook
3. Manually labelled as official/affiliate/unrelated
Evaluation: Ground Truth Breakdown
TWITTER CLASSES (3381 labels)
Official; 232; 7%
Affiliate; 675; 20%
Unrelated; 2474; 73%
FACEBOOK CLASSES (3413 labels)
Official; 146; 4%
Affiliate; 491; 14%
Unrelated; 2776; 81%
Evaluation: Process
• Mainly concerned with official and affiliate classes
  • Not interested in unrelated class
• Modified 10-fold Cross Validation
Evaluation: Modified Cross Validation
1. Generate folds as per normal
2. Train classifier on training set as per normal
3. For each affiliate/official profile in test set:
   1. Input organization’s name to system
   2. Count number of correct results
4. Calculate precision/recall/F1 from counts
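
The counting in steps 3 and 4 might look like the sketch below. How the authors defined a "correct result" is not stated; here it is assumed that, for each query, the profiles the system returns for a class are compared against the labelled profiles of that class, and counts are pooled across queries before computing the metrics. The profile names are invented.

```python
def count_results(predicted: set, relevant: set) -> tuple:
    """Return (true positives, false positives, false negatives) for one query."""
    tp = len(predicted & relevant)
    return tp, len(predicted - relevant), len(relevant - predicted)


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple:
    """Precision, recall, and F1 from pooled counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented results for one test fold: (profiles returned, profiles labelled)
# per organization query, for a single class such as "official".
fold = [
    ({"acme", "acme_hotels", "acme_fans"}, {"acme", "acme_hotels"}),
    ({"globex"}, {"globex", "globex_careers"}),
]

totals = [0, 0, 0]
for predicted, relevant in fold:
    for i, n in enumerate(count_results(predicted, relevant)):
        totals[i] += n

precision, recall, f1 = precision_recall_f1(*totals)
print(totals, precision, recall, f1)
```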
Evaluation: Baseline
• Normalized edit distance between the query and the username/display name
• Emulates searching the networks manually, without examining each profile in detail
Results & Discussion: Twitter
Official (bar chart: Baseline vs. Final)
Baseline: F1 0.559; Precision 0.716; Recall 0.458
Final: F1 0.862; Precision 0.947; Recall 0.791
Affiliate (bar chart: Baseline vs. Final; F1, Precision, Recall)
Data labels as extracted: 0.713, 0.750, 0.559, 0.905, 0.884, 0.862
Results & Discussion: Facebook
Official (bar chart: Baseline vs. Final)
Baseline: F1 0.750; Precision 0.792; Recall 0.711
Final: F1 0.884; Precision 0.945; Recall 0.830
Affiliate (bar chart: Baseline vs. Final; F1, Precision, Recall)
Data labels as extracted: 0.559, 0.744, 0.480, 0.862, 0.816, 0.639
Discussion
• Baseline performs well for official class on Facebook
• Username and display name alone are good indicators for this class
  • Other features still help, but not as much
Discussion: Facebook Characteristics
• Many profile types: people, pages, places, etc.
• Finding official pages is simplified
• But: finding affiliates requires more effort
Discussion: Facebook Characteristics
• Facebook doesn’t require that a “username” be specified for pages
  • It will just use an ID instead
• Auto-generated pages also have only IDs, and take their names from Wikipedia or other sources
Limitations
• Ground truth proportions: expand and/or balance
• Limited number of profiles retrieved for classification
Future Work
• Support additional networks
• Examine post content
• “Preferential” classification