School of Engineering and Informatics
SiocLog: Providing IRC Discussion Logs as Linked Data
Tuukka Hastrup1, Uldis Bojars2 and John G. Breslin2, 3
1 University of Jyväskylä, Finland
2 DERI, NUI Galway, Ireland
3 School of Engineering and Informatics, NUI Galway, Ireland
Motivation
• IRC conversations are quite disconnected from the Web and even from other IRC channels and networks
• Often there is valuable and needed information in an IRC chat that cannot be linked to people, topics or events, or in general referenced from elsewhere
• This information may be useful to people who do not use IRC, to those on other networks, or simply to people who leave and rejoin a channel
Motivation (2)
• SIOC provides a framework for linking social media contributions to other content and Linked Data resources, and IRC can become part of that framework
• We also need mechanisms to link the IRC contributions to the people who made them, hence the use of Web ID
Background
• We will begin by introducing the various areas relevant to this system:
– IRC
– Linked Data
– SIOC
– Web ID
Internet Relay Chat (IRC)
• Instant messaging / internet chat is a major form of social interaction online
• It is often disconnected from the Web:
– Due to the different protocols involved
– Due to its real-time nature / lack of persistent storage
• IRC was one of the earliest chat systems
• It has an important role amongst open-source communities, web communities, and even geeks!
– Hundreds of thousands of users online at any time
Linked Data
• Building a “Web of Data” to enhance the current Web
• Exposing, sharing and connecting data about things via dereferenceable URIs
• Linking datasets together that were not previously connected, for example:
– Music and people
– Real-world things and places
• The Linking Open Data (LOD) effort aims to link various open datasets together (DBpedia, GeoNames, etc.)
Semantically-Interlinked Online Communities (SIOC)
• An effort from DERI, NUI Galway to discover how we can create / establish ontologies on the Semantic Web
• Goal of the SIOC ontology is to address interoperability issues on the (Social) Web
• http://sioc-project.org/
• SIOC has been adopted in a framework of 50 applications or modules deployed on over 400 sites
• Various domains: Web 2.0, enterprise information integration, HCLS, e-government
Some of the SIOC core ontology classes and properties
Some examples of where SIOC is already in use (about 50 implementations / applications)
Web ID
• A Web ID is a web address that identifies a person as a Linked Data item
• A Web ID should also lead to a document with more information about that person (e.g. FOAF, other RDF)
• For more information, see the definition in this paper:
– Ching-Man Au Yeung, Ilaria Liccardi, Kanghao Lu, Oshani Seneviratne, Tim Berners-Lee, “Decentralization: The Future of Online Social Networking”, W3C Workshop on Future of Social Networking
Design
Mapping IRC identifiers to URIs on the Web
• IRC network
– IRC identifier: irc://freenode
– Web URI: http://irc.sioc-project.org/#freenode
• Channel
– IRC identifier: irc://freenode/%23channel
– Web URI: http://irc.sioc-project.org/channel#channel
• Message
– IRC identifier: none
– Web URI: http://irc.sioc-project.org/channel/0000-00-00#00:00:00.00
• Chat persona
– IRC identifier: irc://freenode/persona,isuser
– Web URI: http://irc.sioc-project.org/users/persona#user
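The mapping above can be sketched as a few small functions. This is only an illustration of the URI patterns listed on this slide, not the SiocLog source code: the function names are hypothetical, and it assumes that the leading "#" of a channel name is dropped from the path and that "#channel" and "#user" are literal fragments.

```python
from datetime import datetime
from urllib.parse import quote

# Base URI taken from the slide; everything else below is an assumption.
BASE = "http://irc.sioc-project.org"


def network_uri(network: str) -> str:
    """Web URI for an IRC network, e.g. 'freenode'."""
    return f"{BASE}/#{quote(network)}"


def channel_uri(channel: str) -> str:
    """Web URI for a channel; the leading '#' is assumed to be dropped."""
    return f"{BASE}/{quote(channel.lstrip('#'))}#channel"


def message_uri(channel: str, timestamp: datetime) -> str:
    """Web URI for a message: date in the path, time of day in the fragment."""
    day = timestamp.strftime("%Y-%m-%d")
    # Hundredths of a second, matching the 00:00:00.00 pattern on the slide.
    time = timestamp.strftime("%H:%M:%S") + ".%02d" % (timestamp.microsecond // 10000)
    return f"{BASE}/{quote(channel.lstrip('#'))}/{day}#{time}"


def user_uri(nick: str) -> str:
    """Web URI for a chat persona (an IRC nickname)."""
    return f"{BASE}/users/{quote(nick)}#user"
```

Because IRC messages have no identifier of their own, the message URI is derived entirely from the channel and the time the message was logged.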
Some of the internal and external links
Browsing the Linked Data
Creating a link between a user account on IRC and a personal profile
• Claiming a Web ID creates a link [black] between a user account (a sioc:User that created a sioc:Post in a sioct:ChatChannel) and a person (foaf:Person)
• The person can manually verify this:
– By pointing back to the sioc:User from their foaf:Person definition [grey]
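The two-way link described above can be sketched with plain tuples standing in for RDF triples. This is a minimal illustration, not the SiocLog implementation: the function names and the example Web ID are hypothetical, and the use of sioc:account_of for the claim is an assumption. foaf:holdsAccount is the back-link property used in the SPARQL example later in these slides.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
SIOC = "http://rdfs.org/sioc/ns#"


def claim_triple(user_uri: str, webid: str) -> tuple:
    """Link from the sioc:User to the claimed foaf:Person.

    Assumption: the claim is expressed with sioc:account_of.
    """
    return (user_uri, SIOC + "account_of", webid)


def is_verified(user_uri: str, webid: str, profile_triples: set) -> bool:
    """A claim counts as verified if the person's own profile points back."""
    return (webid, FOAF + "holdsAccount", user_uri) in profile_triples


# Hypothetical example data.
user = "http://irc.sioc-project.org/users/alice#user"
webid = "http://example.org/alice#me"

# Triples as they might be read from the person's own FOAF document.
profile = {
    (webid, FOAF + "name", "Alice"),
    (webid, FOAF + "holdsAccount", user),
}

claim = claim_triple(user, webid)
verified = is_verified(user, webid, profile)
```

The check only trusts statements found in the document behind the Web ID, since anyone on IRC can claim any Web ID.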
Web IDs in SiocLog
• A Web ID can be claimed using mttlbot
• A Web ID can also be claimed using standard IRC services:
/msg nickserv set property webid SomeWebID
Implementation
• 2000 lines of Python source code
• 1000 lines of Zope/TAL HTML templates
• Twisted, SimpleTAL and Redland libraries
• Four major components:
– IRC interface, data analysis, data integration, Web
Implementation (2)
• IRC interface:
– Discussion logger / persona monitor on Twisted
• Data analysis:
– Logs are processed through a pipeline of filters, with sinks for statistics and output
• Data integration:
– Queries for external Linked Data (personal profiles)
• Web interface:
– Handles requests via CGI and publishes the data as HTML and RDF
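The data analysis component (logs, then filters, then sinks) can be pictured as a chain of generators. The sketch below is only an illustration of that pattern: the log line format, the function names and the sample data are all assumptions, and the real SiocLog log format may differ.

```python
import re
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class Message(NamedTuple):
    time: str
    nick: str
    text: str


# Assumed log line format: "[HH:MM:SS] <nick> text"
LINE = re.compile(r"^\[(?P<time>\d{2}:\d{2}:\d{2})\] <(?P<nick>[^>]+)> (?P<text>.*)$")


def parse(lines: Iterable[str]) -> Iterator[Message]:
    """Source stage: turn raw log lines into messages, skipping other lines."""
    for line in lines:
        match = LINE.match(line.rstrip("\n"))
        if match:
            yield Message(**match.groupdict())


def drop_bots(messages: Iterable[Message], bots: set) -> Iterator[Message]:
    """Filter stage: remove messages written by known bots."""
    return (m for m in messages if m.nick not in bots)


def stats_sink(messages: Iterable[Message]) -> Counter:
    """Sink stage: count messages per chat persona."""
    return Counter(m.nick for m in messages)


# Hypothetical sample log.
log = [
    "[12:00:01] <alice> hello",
    "[12:00:05] <mttlbot> Web ID claimed",
    "[12:00:09] <alice> anyone here?",
    "-!- bob has joined",
    "[12:01:00] <bob> hi",
]

counts = stats_sink(drop_bots(parse(log), bots={"mttlbot"}))
```

Each stage consumes an iterator and yields another, so filters can be added or reordered without touching the source or the sinks.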
Finding the names of friends of an IRC persona with SPARQL
semwebquery –sparql "SELECT ?name WHERE {
?person foaf:holdsAccount
<http://irc.sioc-project.org/users/melvster#user> .
?person foaf:knows ?friend .
?friend foaf:name ?name . }"
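To make the three triple patterns in this query easier to follow, here is a plain Python sketch that evaluates the same joins over an in-memory set of triples. It is not how semwebquery works internally, the example data is invented, and unlike SPARQL it returns each friend's name once and sorted rather than one row per solution.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def friend_names(triples: set, account: str) -> list:
    """Names of friends of whoever holds the given IRC account."""
    # ?person foaf:holdsAccount <account>
    persons = {s for s, p, o in triples if p == FOAF + "holdsAccount" and o == account}
    # ?person foaf:knows ?friend
    friends = {o for s, p, o in triples if p == FOAF + "knows" and s in persons}
    # ?friend foaf:name ?name
    return sorted(o for s, p, o in triples if p == FOAF + "name" and s in friends)


# Hypothetical data: only the account URI pattern comes from the slides.
account = "http://irc.sioc-project.org/users/alice#user"
alice = "http://example.org/alice#me"
bob = "http://example.org/bob#me"
carol = "http://example.org/carol#me"

triples = {
    (alice, FOAF + "holdsAccount", account),
    (alice, FOAF + "knows", bob),
    (alice, FOAF + "knows", carol),
    (bob, FOAF + "name", "Bob"),
    (carol, FOAF + "name", "Carol"),
    (alice, FOAF + "name", "Alice"),
}

names = friend_names(triples, account)
```

The first pattern is the back-link from a foaf:Person to the sioc:User, so the query only returns results for personas whose owners have confirmed the link in their own profile.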
Validation
• 291 chat personas on five channels
• 22,418 chat messages
• 51 chat personas have associated Web IDs claimed using mttlbot (2/3) or nickserv (1/3)
– 44 of those have a valid associated RDF document
• Scalable (projected 4 million triples in 10 years)
• SiocLog data being consumed by the “Towards linked sensor data for Hackystat” project
• SiocLog interfaces to FOAF Me for new profile creation
Future work
• Extend to instant messaging and private messaging
• Study of IRC communities where users and content are distributed across channels and networks
Acknowledgements
• We would like to thank Science Foundation Ireland for their support under grant SFI/08/CE/I1380 (Líon 2)
• Thanks also to Benja Fallenstein and Dan Brickley for their insights
Summary
• IRC conversations are quite disconnected from the Web and even from other IRC channels and networks
• Often there is valuable and needed information in an IRC chat that cannot be linked to people, topics or events, or in general referenced from elsewhere
• SIOC provides a framework for interlinking social media to other content and Linked Data, and IRC has been integrated as a part of that framework
• We also used mechanisms to link IRC contributions to the people who made them via Web ID and FOAF