Conversation Level Constraints on Pedophile Detection in Chat Rooms

Conversation Level Constraints on Pedophile Detection in Chat RoomsPAN 2012 — Sexual Predator Identification

Claudia Peersman, Frederik Vaassen, Vincent Van Asch and Walter Daelemans

Overview• Task 1: Sexual Predator Identification• Preprocessing• Experimental Setup and Results• Test Run Results

• Task 2: Identifying Grooming Posts• Grooming Dictionary• Test Run Results

• Discussion

Task 1: Preprocessing of the Data• Data: PAN 2012 competition training set• predator vs. non-predator• info on the conversation, user and post level

• Two splits: training and validation set

• No user was present in more than one cluster

prevent overfitting of user-specific features

Experimental Setup

• Features: token unigrams

• LiBSVM

• Probability output

• Parameter optimization

• Experiments on 3 levels

• data resampling

Level 1: the Post Classifier

• Resample the number of posts

Equal distribution of posts per class

• About 40,000 posts per class in training

• No resampling in the validation sets

Level 1: the Post Classifier (2)• Only output on the post level

• Aggregate the post level predictions to the user level:• LiBSVM’s probability outputs• Predators = average of the 10 highest

predator class probabilities ≥ 0.85

Results for the Predator Class

Scores Post ClassifierRecall 0.93Precision 0.36F-score 0.52

Level 2: the User Classifier• Resampling on the user level

exclude users with no suspicious posts

• Filter: dictionary of grooming vocabulary see Task 2

• Why? • reduce the amount of data • “hard” classification higher precision?

Update Results (1)

Scores Post Classifier User ClassifierRecall 0.93 0.82Precision 0.36 0.88F-score 0.52 0.84

Combine systems?

Data reduction: up to 48.4%

Combining the systems• Weighted voting using LiBSVM’s

probability outputs

• 70% of the weight on the high precision User Classifier

Update Results (2)

Scores Post Classifier

User Classifier

Combined Results

Recall 0.93 0.82 0.85Precision 0.36 0.88 0.84F-score 0.52 0.84 0.84

Level 3: Conversation Level Constraints

• Both users in a conversation labeled as predators

• Our approach: • go back to predator probability output • use the high precision user classifier• Predator probability ≥ 0.75

System Overview

Combined Prediction

Apply Conversatio

n Level Constraints

Final Predator

ID List

Post Classifier

User Classifier

Update Results (3)

Scores Post Classifier

User Classifier

Combined Results

Combined +

ConstraintsRecall 0.93 0.82 0.85 0.85Precision 0.36 0.88 0.84 0.94F-score 0.52 0.84 0.84 0.89

Results on the PAN 2012 Test Set

Scores Combined + Constraints PAN Test Set

Recall 0.85 0.60Precision 0.94 0.89F-score (β = 1) 0.89 0.72

• Future research: • more splits• investigate ensembles

Task 2: Identifying Grooming Posts • From the final predator ID list detect posts

expressing typical grooming behavior

• No gold standard labels What is grooming?

• Predator conversations have predictable stages (e.g. Lanning, 2010; McGhee et al., 2011)

Task 2: Identifying Grooming Posts (2)

• Dictionary containing references to 6 stages:• sexual topic• reframing• approach • data requests• isolation from adult supervision• age (difference)

Task 2: Identifying Grooming Posts (3)

• Resources:• McGhee et al. (2011)• English Urban Dictionary website

http://www.urbandictionary.com/• English Synonyms

http://www.synonym.net/

• cf. user classifier filter

Results on the PAN 2012 Test Set

• Precision = 0.36

• Recall = 0.26

• F-score (β = 1) = 0.30

Discussion

• Use of β-factors to calculate the F-score:• Task 1: focus on precision (β = 0.5)• Task 2: focus on recall (β = 3.0)

• However, in practice:• find all predators (recall in Task 1)• find the most striking posts (precision in

Task 2)

Questions?

Contact: claudia.peersman@ua.ac.be

Conversation Level Constraints on Pedophile Detection in Chat Rooms

Documents

Transcript of Conversation Level Constraints on Pedophile Detection in Chat Rooms

555 chat chat chat

Crimen Sollicitiationis: Vatican Top Secret document to protect Pedophile priests

Web Chat Software Light Chat

Was Mark Twain a Satanic Pedophile?

Pedophile Gary Blair Walker denied parole, 2009

Introduction to Skype · chat. Tells you about the folks you’re speaking to. H. Conversation window –Displayed when you select a contact or set up a group chat. Shows the instant

AUTISM SPECTRUM DISORDERS - lumsa.it l2 chat m-chat q-chat pddst-ii esat itc fyi cesdd sacs m-chat q-chat fyi chat-23 biscuit stat

Introduction · Web viewThe Chat icon lists your recent messages or add a private chat. You can browse the messages list, open a message to write a reply, read a whole conversation

A Conversation with the Executive Director 2020 Conference ... · A Conversation with the Executive Director 2020 Conference Update Friday, June 12th, 2020 Please use the CHAT feature

Pedophile Reporter Outlook plugin

Reve Chat - Web Chat Software

Pedophile Worked at Cadet Camps

Welcome to the December TLC Chat! A Conversation on Qualitative Research Your Hosts: Kelly and Roberta A Conversation on Qualitative Research Your Hosts:

Jim Rogers - SMart · A Conversation with Legendary Investor Jim Rogers truewealthpublishing.asia 3 successful investors in history. Introduction I recently had a chat with fellow

Profile of a Pedophile Deegan Malone, EdS, LPC-S, JSOCC.

PowerPoint Presentation...My Health at Vanderbilt Coordinated Clinics Adam Huggins Patient—centered Join the conversation #StrategyShare19 ONLINE CLICK TO CHAT LIVE CHAT SCHEDULE

· Web view#mHealth Tweet Chat #2: May 9, 2012. To read the chat, scroll to the bottom first. The conversation moves up. dlschermd Thank YOU! #mhealth -9:00 PM May 9th, 2012

Nelson Mandela University · Web viewHow to Post in Chat Channels You can type directly in the conversation Channel (Posts tab), press Enter or Send

"Two Homosexual Pedophile Sadistic Serial Killers:Jürgen ...benecke.com/pdf/serial_killer_bartsch_mark_benecke.pdf · MINERVA MED LEG Two homosexual pedophile sadistic serial killers:

Amnesty International Director Colm O'Gorman: Catholic Church Protects Pedophile Priests