Wikisource technical infrastructure - Wikimedia · Rewrite editing interfaces in PHP Try to have...

Post on 02-Aug-2020

4 views 0 download

Transcript of Wikisource technical infrastructure - Wikimedia · Rewrite editing interfaces in PHP Try to have...

Wikisource technical infrastructureWhat we have done and what we could do

II

Thomas Pellissier TanonUser:Tpt@Tpt93

Wikisource Conference 2015

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 1 / 20

Wikisource ?

Current state:

4 millions of pages2.1 millions of proofread pages600 active editors (> 5 edits)

Strong issues:

books not easily accessibleno real bibliographic databasecontributing is quite difficult

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 2 / 20

Wikisource ?

Current state:

4 millions of pages2.1 millions of proofread pages600 active editors (> 5 edits)

Strong issues:

books not easily accessibleno real bibliographic databasecontributing is quite difficult

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 2 / 20

Wikisource technical infrastructure

MediaWiki

but with custom extensions like ProofreadPage

developed and maintained by volunteer contributors and a few GSoCprojects

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 3 / 20

Wikisource technical infrastructure

MediaWiki

but with custom extensions like ProofreadPage

developed and maintained by volunteer contributors and a few GSoCprojects

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 3 / 20

Wikisource technical infrastructure

MediaWiki

but with custom extensions like ProofreadPage

developed and maintained by volunteer contributors and a few GSoCprojects

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 3 / 20

Outline

1 What we have done

2 What we could do

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 4 / 20

What we have done

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 5 / 20

Wsexport

Is a ”magic” export tool

adapted to Wikisource needs

ePub is the base format

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 6 / 20

Wsexport

Is a ”magic” export tool

adapted to Wikisource needs

ePub is the base format

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 6 / 20

Wsexport

Is a ”magic” export tool

adapted to Wikisource needs

ePub is the base format

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 6 / 20

Wsexport

Migrated on Wikimedia Toolslabs at https://tools.wmflabs.org/wsexport

Integrated in the UI of most ofWikisources

48, 000 exports in October 2015

Supports PDF, mobi...

ePub 3 is the default format

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 7 / 20

Wsexport

Migrated on Wikimedia Toolslabs at https://tools.wmflabs.org/wsexport

Integrated in the UI of most ofWikisources

48, 000 exports in October 2015

Supports PDF, mobi...

ePub 3 is the default format

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 7 / 20

Wsexport

Migrated on Wikimedia Toolslabs at https://tools.wmflabs.org/wsexport

Integrated in the UI of most ofWikisources

48, 000 exports in October 2015

Supports PDF, mobi...

ePub 3 is the default format

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 7 / 20

Wsexport

Migrated on Wikimedia Toolslabs at https://tools.wmflabs.org/wsexport

Integrated in the UI of most ofWikisources

48, 000 exports in October 2015

Supports PDF, mobi...

ePub 3 is the default format

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 7 / 20

Wsexport

Migrated on Wikimedia Toolslabs at https://tools.wmflabs.org/wsexport

Integrated in the UI of most ofWikisources

48, 000 exports in October 2015

Supports PDF, mobi...

ePub 3 is the default format

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 7 / 20

Refactoring of ProofreadPage

Goals:

More maintainable code

Use new MediaWiki features (ContentHandler...)

Better performances

Less breakages

Content validation

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 8 / 20

Refactoring of ProofreadPage

Goals:

More maintainable code

Use new MediaWiki features (ContentHandler...)

Better performances

Less breakages

Content validation

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 8 / 20

Refactoring of ProofreadPage

Goals:

More maintainable code

Use new MediaWiki features (ContentHandler...)

Better performances

Less breakages

Content validation

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 8 / 20

Refactoring of ProofreadPage

Done:

Rewrite editing interfaces in PHP

Try to have not too badly architectured codeAutomated testsJSON encoding of Page: pages in API

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 9 / 20

Refactoring of ProofreadPage

Done:

Rewrite editing interfaces in PHPTry to have not too badly architectured code

Automated testsJSON encoding of Page: pages in API

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 9 / 20

Refactoring of ProofreadPage

Done:

Rewrite editing interfaces in PHPTry to have not too badly architectured codeAutomated tests

JSON encoding of Page: pages in API

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 9 / 20

Refactoring of ProofreadPage

Done:

Rewrite editing interfaces in PHPTry to have not too badly architectured codeAutomated testsJSON encoding of Page: pages in API

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 9 / 20

Interproject sidebar

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 10 / 20

IA-Upload

An Internet Archive to Commons import tool(http://tools.wmflabs.org/ia-upload)

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 11 / 20

BUB

An import tool for Internet Archive from Google Books and other sources(https://tools.wmflabs.org/bub/)

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 12 / 20

What we could do

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 13 / 20

Some ideas...

A Wikisource contributors survey done in Fall 2013 with 251 answers

Figure: What do you think are the core priorities for the Wikisource community?

17 %

Integrated ePub exporter15 %

Localized OCR

15 %

Better and easier workflow

11 %Wikidata integration 9 %

Visual Editor (Page: namespace)

9 %

Visual Editor (main namespace)

14 %

Metadata management system

10 %

Import-export systems

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 14 / 20

VisualEditor support

Improve Parsoid rendering of Wikisource content

Support tags used on Wikisource (pages, poem, section...)

Custom interface for Page: pages

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 15 / 20

VisualEditor support

Improve Parsoid rendering of Wikisource content

Support tags used on Wikisource (pages, poem, section...)

Custom interface for Page: pages

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 15 / 20

VisualEditor support

Improve Parsoid rendering of Wikisource content

Support tags used on Wikisource (pages, poem, section...)

Custom interface for Page: pages

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 15 / 20

Mobile support

Custom edit interface for Page:and Index: pages

We should have a nice UI forboth browsing and editing

Future: gamification?

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 16 / 20

Mobile support

Custom edit interface for Page:and Index: pages

We should have a nice UI forboth browsing and editing

Future: gamification?

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 16 / 20

Mobile support

Custom edit interface for Page:and Index: pages

We should have a nice UI forboth browsing and editing

Future: gamification?

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 16 / 20

Exportation of content

Improve Wsexport performances

Use Parsoid

Nice book browsing interface + OPDS?

Integrated inside of Wikimedia infrastructure?

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 17 / 20

Exportation of content

Improve Wsexport performances

Use Parsoid

Nice book browsing interface + OPDS?

Integrated inside of Wikimedia infrastructure?

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 17 / 20

Exportation of content

Improve Wsexport performances

Use Parsoid

Nice book browsing interface + OPDS?

Integrated inside of Wikimedia infrastructure?

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 17 / 20

Exportation of content

Improve Wsexport performances

Use Parsoid

Nice book browsing interface + OPDS?

Integrated inside of Wikimedia infrastructure?

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 17 / 20

Some other ideas

Use Wikidata as much as possible

Gamification (capcha...)

. . .

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 18 / 20

Some other ideas

Use Wikidata as much as possible

Gamification (capcha...)

. . .

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 18 / 20

Some other ideas

Use Wikidata as much as possible

Gamification (capcha...)

. . .

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 18 / 20

Conclusion

What we need now:

Stronger interwiki collaboration

People to build the Wikisource of tomorrow

Stronger support from the Wikimedia movement

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 19 / 20

Conclusion

What we need now:

Stronger interwiki collaboration

People to build the Wikisource of tomorrow

Stronger support from the Wikimedia movement

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 19 / 20

Conclusion

What we need now:

Stronger interwiki collaboration

People to build the Wikisource of tomorrow

Stronger support from the Wikimedia movement

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 19 / 20

Thanks a lot for your attention!

Thomas Pellissier Tanon Wikisource technical infrastructure 2 Wikisource Conference 2015 20 / 20