Building Real Time, Open-Source Tools for Wikipedia

WikiWash: Ideation to Product A tool for uncovering spin on Wikipedia

Transcript of Building Real Time, Open-Source Tools for Wikipedia

Page 1: Building Real Time, Open-Source Tools for Wikipedia

WikiWash: Ideation to Product

A tool for uncovering spin on Wikipedia

Page 2: Building Real Time, Open-Source Tools for Wikipedia

Rob Kenedi

Entrepreneur-In-Residence, @twg

@rkenedi

Page 3: Building Real Time, Open-Source Tools for Wikipedia

Table of contents

1. Context

2. The problem

3. TWG’s approach

4. The solution: WikiWash

5. Lessons learned

6. What’s next

Page 4: Building Real Time, Open-Source Tools for Wikipedia

Context

Page 5: Building Real Time, Open-Source Tools for Wikipedia

Techraking

Connecting journalists with technologists and designers to address problems around the world

Page 6: Building Real Time, Open-Source Tools for Wikipedia

The Problem

Page 7: Building Real Time, Open-Source Tools for Wikipedia

Today, some of the most relevant stories can only be told by poring over datasets and crunching numbers in Excel. It’s imperative reporters have tools to find the stories hidden in the data.

-Luke Simcoe, Data Journalist, Metro News Canada

Page 8: Building Real Time, Open-Source Tools for Wikipedia

Currently, English Wikipedia includes 4,852,854 articles. More than 800 new articles are added every single day. *

*Source: Wikipedia

Page 11: Building Real Time, Open-Source Tools for Wikipedia

There is no political power without control of the archive, if not of memory.

-Jacques Derrida (1998)

Page 12: Building Real Time, Open-Source Tools for Wikipedia

TWG’s Approach

Page 13: Building Real Time, Open-Source Tools for Wikipedia

Problem

• Spin is introduced into Wiki pages by biased edits
• Can’t connect edits with users, or uncover agendas / story angles
• Can’t get the data out of the system
• Hard to visualize data to find patterns
• Can’t track changes to pages (relating to branded entities)
• Can’t find all brand references on Wikipedia
• Wikipedia is perceived to be susceptible to biased revisions
• Very hard to track revisions on Wikipedia, either historically or as they occur

Solution

• Associate page edits with users, and download the data
• Ability to compare multiple pages to uncover patterns in edits, and download the data
• Ability to track activity and alert on edit activity / trends (that may indicate biased intent)

Unique Value Propositions

• Clearly demonstrate connections between Wikipedia page edits and the users making those edits
• Ability to track and uncover spin and malicious edits
• Track page edits in near-real-time, and offer alerts that uncover trends and emerging stories

Unfair Advantages

• Developed by and for working reporters

Customer Segments (slide labels: High Use, $)

• Reporters
• Activists
• Academics & Students
• Citizen Journalists
• PR & Media
• Brand Stakeholders
• Wikipedia

Key Metrics

• Number of pages ‘un-washed’
• Number of connections / biased edits uncovered
• Number of edits to Wikipedia pages caused by uncovered biases
• Number of stories published citing data from the site

Channels

• Viral, word of mouth
• Partnerships with print / online media organizations (cross-promotion)
• Social media referrals
• PPC, Display, Email, SEO

Competitors / Comparables

• Existing Wikipedia revision history page
• Wikistats
• Wikiwatchdog
• Article Revision Stats, Wiki Blame, etc.

Cost Structure

• IT Infrastructure
• Continuous reporting / scraping (unless partnered with Wikipedia)
• Marketing & Promotion

Revenue Streams

• Free for single use on historic data / edits
• Subscription model for activity alerts and real-time tracking (uncover breaking stories / bias)

Page 16: Building Real Time, Open-Source Tools for Wikipedia

The Solution: WikiWash

Page 17: Building Real Time, Open-Source Tools for Wikipedia

WikiWash decreases spin and bias on Wikipedia by holding those making changes accountable.

Page 18: Building Real Time, Open-Source Tools for Wikipedia

How does it do that?

• Real-time

• Open source

• Export your data

• Free!

• Works with Wikipedia’s API

• Built in JavaScript

• Uses Node.js, Express.js, Angular.js, Socket.IO to facilitate involvement from others

http://blog.twg.ca/2014/11/building-wikiwash/
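
As a rough illustration of what working with Wikipedia’s API can look like, the sketch below builds a revision-history query for the public MediaWiki API and tallies edits per user from a response. The helper names (buildRevisionsUrl, editsByUser) and the sample data are hypothetical and are not taken from WikiWash’s source; only the endpoint and the query parameters are standard MediaWiki API features.

```javascript
// Illustrative sketch only: these helper names are hypothetical,
// not taken from the WikiWash codebase.
const API_ENDPOINT = 'https://en.wikipedia.org/w/api.php';

// Build a MediaWiki API URL that lists the latest revisions of one page.
function buildRevisionsUrl(title, limit = 50) {
  const params = new URLSearchParams({
    action: 'query',
    prop: 'revisions',
    titles: title,
    rvprop: 'ids|timestamp|user|comment',
    rvlimit: String(limit),
    format: 'json',
  });
  return `${API_ENDPOINT}?${params}`;
}

// Tally edits per user from a parsed response (default JSON format,
// where query.pages is an object keyed by page id).
function editsByUser(response) {
  const counts = new Map();
  const pages = (response.query && response.query.pages) || {};
  for (const page of Object.values(pages)) {
    for (const rev of page.revisions || []) {
      const user = rev.user || '(hidden)';
      counts.set(user, (counts.get(user) || 0) + 1);
    }
  }
  return counts;
}

// Made-up sample data in the shape the API returns.
const sample = {
  query: { pages: { '1': { title: 'Example', revisions: [
    { revid: 3, user: 'Alice', timestamp: '2015-01-03T00:00:00Z' },
    { revid: 2, user: 'Bob', timestamp: '2015-01-02T00:00:00Z' },
    { revid: 1, user: 'Alice', timestamp: '2015-01-01T00:00:00Z' },
  ] } } },
};
const tally = editsByUser(sample);
```

In a real tool the URL would be fetched over HTTP and the parsed JSON passed to editsByUser; the sample object stands in for that response so the tallying step can be read on its own.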

Page 19: Building Real Time, Open-Source Tools for Wikipedia

wikiwash.org

Page 23: Building Real Time, Open-Source Tools for Wikipedia

Lessons Learned

• Limited by API; caching data

• Focus on realtime changes

• Trending articles aid understanding

• Focus on product first, aesthetics second

Page 24: Building Real Time, Open-Source Tools for Wikipedia

Wikitext

Page 25: Building Real Time, Open-Source Tools for Wikipedia

Wikipedia API Load Times

Page 26: Building Real Time, Open-Source Tools for Wikipedia

Preemptive Caching
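
One possible reading of preemptive caching is to start fetching the revisions a reader is likely to open next while they are still viewing the current one, so that slow API calls are hidden behind reading time. The sketch below shows that idea under stated assumptions: RevisionCache and its methods are hypothetical names, the fetch function is supplied by the caller, and none of it is taken from WikiWash’s source.

```javascript
// Illustrative sketch only: RevisionCache is a hypothetical helper,
// not part of the WikiWash codebase.
class RevisionCache {
  // fetchRevision(id) is supplied by the caller and returns a Promise
  // (for example, a call to the Wikipedia API).
  constructor(fetchRevision) {
    this.fetchRevision = fetchRevision;
    this.entries = new Map(); // revision id -> Promise of revision data
  }

  // Return the cached promise for `id`, starting a fetch only on a miss.
  load(id) {
    if (!this.entries.has(id)) {
      const pending = Promise.resolve(this.fetchRevision(id));
      // Evict failed fetches so a later request can retry.
      pending.catch(() => this.entries.delete(id));
      this.entries.set(id, pending);
    }
    return this.entries.get(id);
  }

  // Serve `id` and warm the cache for the revisions likely to be opened next.
  view(id, neighbourIds = []) {
    const result = this.load(id);
    for (const neighbour of neighbourIds) {
      this.load(neighbour);
    }
    return result;
  }
}

// Usage with a stand-in fetcher that records which ids were requested.
const requested = [];
const cache = new RevisionCache((id) => {
  requested.push(id);
  return Promise.resolve({ id });
});
cache.view(2, [1, 3]); // fetches 2, then prefetches 1 and 3
cache.view(3, [2, 4]); // 3 and 2 are already cached; only 4 is fetched
```

Storing promises rather than resolved values means a second request for a revision that is still in flight reuses the pending fetch instead of issuing a duplicate API call.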

Page 31: Building Real Time, Open-Source Tools for Wikipedia

What’s Next?

Page 32: Building Real Time, Open-Source Tools for Wikipedia

What’s Next: Feature Roadmap

• Notifications via email

• Website embed capability

• Access to Wikipedia’s firehose

• UX improvements

• Language support

• Next / previous navigation

Page 33: Building Real Time, Open-Source Tools for Wikipedia

What’s Next: How TWG Decides Next Steps

• Clear product ownership

• Product / market fit

• Pirate Metrics as a guide

• Qualitative & quantitative feedback

• Incrementally invest until inflection point

Page 34: Building Real Time, Open-Source Tools for Wikipedia

Fork the project on GitHub to improve it

Page 35: Building Real Time, Open-Source Tools for Wikipedia

github.com/twg/wikiwash

Page 36: Building Real Time, Open-Source Tools for Wikipedia

Send us your feedback, ideas and feature requests

Page 38: Building Real Time, Open-Source Tools for Wikipedia

Let’s talk about how to bring digital products to market, or arrange a live demo of WikiWash