Building Real Time, Open-Source Tools for Wikipedia
Uploaded by: fitc
Category: Technology
WikiWash: Ideation to Product
A tool for uncovering spin on Wikipedia
Rob Kenedi
Entrepreneur-In-Residence, @twg
@rkenedi
Table of contents
1. Context
2. The problem
3. TWG’s approach
4. The solution: WikiWash
5. Lessons learned
6. What’s next
Context
Techraking
Connecting journalists with technologists and designers to address problems around the world
The Problem
“Today, some of the most relevant stories can only be told by poring over datasets and crunching numbers in Excel. It’s imperative reporters have tools to find the stories hidden in the data.”
- Luke Simcoe, Data Journalist, Metro News Canada
Currently, English Wikipedia includes 4,852,854 articles. More than 800 new articles are added every single day.*
*Source: Wikipedia
“There is no political power without control of the archive, if not of memory.”
- Jacques Derrida (1998)
TWG’s Approach
Problem
• Spin is introduced into Wiki pages by biased edits
• Can’t connect edits with users, or uncover agendas / story angles
• Can’t get the data out of the system
• Hard to visualize data to find patterns
• Can’t track changes to pages (relating to branded entities)
• Can’t find all brand references on Wikipedia
• Wikipedia perceived to be susceptible to biased revisions
• Very hard to track revisions on Wikipedia, either historically or as they occur
Solution
• Associate page edits with users, and download the data
• Ability to compare multiple pages to uncover patterns in edits, and download the data
• Ability to track activity and alert to edit activity / trends (that may indicate bias intent)
Unique Value Propositions
• Clearly demonstrate connections between Wikipedia page edits and the users making those edits
• Ability to track and uncover spin and malicious edits
• Track page edits in near-real-time, and offer alerts that uncover trends and emerging stories
Unfair Advantages
• Developed by and for working reporters
Customer Segments (canvas labels: High Use, $)
• Reporters
• Activists
• Academics & Students
• Citizen Journalists
• PR & Media
• Brand Stakeholders
• Wikipedia
Key Metrics
• No. pages ‘un-washed’
• Number of connections / biased edits uncovered
• Number of edits to Wikipedia pages caused by uncovered biases
• Number of stories published citing data from the site
Channels
• Viral, word of mouth
• Partnerships with print / online media organizations - cross promotion
• Social media referrals
• PPC, Display, Email, SEO
Competitors / Comparables
• Existing Wikipedia revision history page
• Wikistats
• Wikiwatchdog
• Article Revision Stats, Wiki Blame, etc.
Cost Structure
• IT Infrastructure
• Continuous reporting / scraping (unless partnering with Wikipedia)
• Marketing & Promotion
Revenue Streams
• Free for single use on historic data/edits
• Subscription model for activity alerts and real-time tracking (uncover breaking stories / bias)
The Solution: WikiWash
WikiWash decreases spin and bias on Wikipedia by holding those making changes accountable
How does it do that?
• Realtime
• Open source
• Export your data
• Free!
• Works with Wikipedia’s API
• Built in JavaScript
• Uses Node.js, Express.js, Angular.js, Socket.IO to facilitate involvement from others
http://blog.twg.ca/2014/11/building-wikiwash/
wikiwash.org
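The slides don't show how WikiWash talks to Wikipedia's API, so the following is only a minimal sketch of the idea of connecting page edits with the users who made them. It assumes the public MediaWiki Action API on English Wikipedia (`action=query` with `prop=revisions`); the helper names `buildRevisionsUrl` and `countEditsByUser` are hypothetical and not taken from the WikiWash codebase.

```javascript
// Assumption: the standard MediaWiki Action API endpoint for English Wikipedia.
const API_ENDPOINT = 'https://en.wikipedia.org/w/api.php';

// Hypothetical helper: build a URL that asks for the latest revisions of a page,
// including who made each edit and when.
function buildRevisionsUrl(title, limit = 50) {
  const url = new URL(API_ENDPOINT);
  url.search = new URLSearchParams({
    action: 'query',
    prop: 'revisions',
    titles: title,
    rvprop: 'ids|timestamp|user|comment',
    rvlimit: String(limit),
    format: 'json',
  }).toString();
  return url.toString();
}

// Hypothetical helper: given revision objects that carry a `user` field
// (as requested via rvprop above), count edits per user, most active first.
function countEditsByUser(revisions) {
  const counts = new Map();
  for (const rev of revisions) {
    counts.set(rev.user, (counts.get(rev.user) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}
```

A caller would fetch the URL with any HTTP client, pull the revisions array out of the JSON response, and pass it to `countEditsByUser` to see which accounts are editing a page most often.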
Lessons Learned
• Limited by API; caching data
• Focus on realtime changes
• Trending articles aid understanding
• Focus on product first, aesthetics second
Wikitext
Wikipedia API Load Times
Preemptive Caching
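One way to picture preemptive caching as an answer to slow Wikipedia API load times is a small time-limited cache that is warmed for likely pages (for example, trending articles) before anyone requests them. This is a sketch under stated assumptions, not WikiWash's actual code: `createRevisionCache`, `loader`, `ttlMs`, and `warm` are hypothetical names.

```javascript
// Hypothetical helper: a small time-limited cache keyed by article title.
// `loader` is any function that returns the data for a title (in practice it
// would likely return a Promise from an HTTP call). `now` is injectable so the
// expiry logic can be tested without waiting.
function createRevisionCache(loader, ttlMs, now = Date.now) {
  const entries = new Map(); // title -> { value, expiresAt }

  function get(title) {
    const hit = entries.get(title);
    if (hit && hit.expiresAt > now()) {
      return hit.value; // still fresh: skip the slow API call
    }
    const value = loader(title);
    entries.set(title, { value, expiresAt: now() + ttlMs });
    return value;
  }

  // Preemptive step: load titles that are likely to be requested soon.
  function warm(titles) {
    titles.forEach((title) => get(title));
  }

  return { get, warm };
}
```

Calling `warm` on a schedule with a list of popular titles means the first real visitor hits the cache instead of waiting on the API. A production version would also need to handle failed loads, which this sketch leaves cached until they expire.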
What’s Next?
Feature Roadmap
• Notifications via email
• Website embed capability
• Access to Wikipedia’s firehose
• UX improvements
• Language support
• Next / previous navigation
How TWG Decides Next Steps
• Clear product ownership
• Product / market fit
• Pirate Metrics as a guide
• Qualitative & quantitative feedback
• Incrementally invest until inflection point
Fork the project on GitHub to improve it
github.com/twg/wikiwash
Send us your feedback, ideas and feature requests
Let’s talk about how to bring digital products to market, or set up a live demo of WikiWash