Generating collections for stories and events
Alexander Nwala, Computer Science Ph.D. student, Old Dominion University
Summer Fellow, Harvard Law School Library Innovation Lab
Dr. Michael Nelson, Department of Computer Science, Old Dominion University
Archives Unleashed 2.0: Web Archive Datathon, Washington, DC
June 14, 2016
Previous Projects
Carbon Date
• Carbon Date is a tool for estimating the creation date of a website
• Returns a machine-readable structure
Website: http://cd.cs.odu.edu/
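One way to picture such an estimate is to gather the earliest date that each source reports for a URL and keep the minimum. The sketch below assumes that approach. The source names, the function name estimate_creation_date, and the JSON field names are hypothetical and are not Carbon Date's actual output format.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def estimate_creation_date(candidates: dict[str, Optional[datetime]]) -> str:
    """Return a JSON string holding the earliest known date among the sources.

    `candidates` maps a (hypothetical) source name to the earliest datetime
    that source reports for the URL, or None if it reports nothing.
    """
    known = {name: dt for name, dt in candidates.items() if dt is not None}
    earliest = min(known.values()) if known else None
    result = {
        "estimated-creation-date": earliest.isoformat() if earliest else None,
        "sources": {
            name: (dt.isoformat() if dt else None)
            for name, dt in candidates.items()
        },
    }
    return json.dumps(result, indent=2)


# Hypothetical evidence for one URL; the second source is the earliest.
example = estimate_creation_date({
    "web-archive": datetime(2011, 3, 2, tzinfo=timezone.utc),
    "first-tweet": datetime(2010, 11, 17, tzinfo=timezone.utc),
    "last-modified-header": None,
})
print(example)
```

Keeping the per-source dates next to the estimate lets a consumer see which evidence produced it.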
#WhatDidItLookLike
• A Tumblr blog that shows what a website looked like across multiple years
• Nominate websites to What Did It Look Like? by tweeting: “#whatdiditlooklike URL”
Website: http://whatdiditlooklike.mementoweb.org/
Previous Projects
#ICanHazMemento
• Returns an archived version of the page closest to the time of the tweet, or
• Returns a newly archived version of the page, if the page was not archived in the past 24 hours
Website: https://twitter.com/icanhazmemento/
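The two cases above can be sketched as a small decision function. This is an illustration under stated assumptions only: the function name choose_reply, the return values, the placeholder capture names, and the reading of the 24-hour rule (request a new capture when no existing capture lies within 24 hours of the tweet) are hypothetical and are not the bot's actual code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

Memento = tuple[datetime, str]  # (capture time, archived-copy URI)


def choose_reply(tweet_time: datetime,
                 mementos: list[Memento],
                 window: timedelta = timedelta(hours=24)
                 ) -> tuple[str, Optional[str]]:
    """Pick the capture closest to the tweet, or ask for a new capture."""
    if mementos:
        captured, uri = min(mementos, key=lambda m: abs(m[0] - tweet_time))
        if abs(captured - tweet_time) <= window:
            return ("memento", uri)
    # Assumed rule: no capture is close enough to the tweet, so archive now.
    return ("archive-now", None)


tweet = datetime(2016, 6, 14, 12, 0, tzinfo=timezone.utc)
captures = [
    (datetime(2016, 6, 1, tzinfo=timezone.utc), "capture-a"),
    (datetime(2016, 6, 14, 9, 0, tzinfo=timezone.utc), "capture-b"),
]
print(choose_reply(tweet, captures))      # closest capture is within 24 hours
print(choose_reply(tweet, captures[:1]))  # only an old capture exists
```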
Web Query Classifier
• Classifies a web query as scholar or non-scholar
• Routes the query to a local digital library that search engines cannot crawl
Tech report: https://arxiv.org/abs/1605.00184
Generating collections for stories and events
• It’s not difficult to collect resources (links) for a story, but can we build a good collection?
• What does good mean?
• What are some properties of good stories?
Is this story from Storify good?
Where to begin
• Understand the potential sources: social media (Storify, Twitter, etc.) and news.
Storify's native search does not find stories.
There is no neat way to collect all the tweets in a conversation, because the subset of the reply graph that is visible depends on which tweet is selected.
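To illustrate why the selected tweet matters, the sketch below models a conversation as a reply tree and assumes that selecting a tweet reveals only that tweet, its ancestors, and its descendants. The tweet IDs, the REPLIES_TO mapping, and the visibility rule are all hypothetical; they are not Twitter's API or its documented behavior.

```python
# child tweet -> the tweet it replies to (None marks the root)
REPLIES_TO = {"t1": None, "t2": "t1", "t3": "t1", "t4": "t2", "t5": "t3"}


def visible_from(selected: str, replies_to: dict) -> set:
    """Tweets seen when `selected` is opened: itself, ancestors, descendants."""
    seen = {selected}
    # Walk up the reply chain to the root.
    parent = replies_to[selected]
    while parent is not None:
        seen.add(parent)
        parent = replies_to[parent]
    # Walk down through every reply beneath the selected tweet.
    frontier = [selected]
    while frontier:
        current = frontier.pop()
        children = [t for t, p in replies_to.items() if p == current]
        seen.update(children)
        frontier.extend(children)
    return seen


print(sorted(visible_from("t4", REPLIES_TO)))  # ['t1', 't2', 't4']
print(sorted(visible_from("t1", REPLIES_TO)))  # ['t1', 't2', 't3', 't4', 't5']
```

Under this model, starting from t4 misses t3 and t5, so collecting the whole conversation means starting from the root or merging the views from several tweets.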