Two Examples of Open Source Software Developed at CERN: Invenio and Indico

RMLL visits at CERN – July 2012

Invenio: Digital Library Software

http://invenio-software.org/

What is it used for?
• Depositing
• Archiving
• Organizing
• Disseminating
• Any type of document

~350 GB of PDFs at CERN
~20 TB of images and videos
1M records

What is Invenio?

‣ Integrated Digital Library / Repository software

‣ A platform of choice for managing documents in HEP

‣ Also adopted in other fields (medium to large repositories)

‣ Web application

‣ Open-source GPL-2 project

‣ LAMP stack: Python (mostly), MySQL and Apache

‣ Based on open standards: MARCXML, OAI-PMH, OpenURL, OpenSearch, etc.

‣ Flexible, scriptable
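
As an illustration of two of the open standards listed above, here is a minimal sketch that builds an OAI-PMH `ListRecords` request URL and reads the title out of a MARCXML record. The endpoint `https://repository.example.org/oai2d`, the `marcxml` metadata prefix, and the sample record are assumptions made for the example (OAI-PMH itself only mandates the `oai_dc` prefix; other prefixes depend on the repository). The function names are hypothetical and are not part of Invenio.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by MARCXML (MARC 21 "slim" schema).
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://repository.example.org/oai2d"


def build_list_records_url(base_url, metadata_prefix="marcxml", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def marc_title(record_xml):
    """Return the title (datafield 245, subfield a) of a MARCXML record."""
    root = ET.fromstring(record_xml)
    node = root.find(
        "marc:datafield[@tag='245']/marc:subfield[@code='a']", MARC_NS
    )
    return node.text if node is not None else None


# Made-up record used only to exercise the parser.
SAMPLE_RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">1</controlfield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">An example preprint</subfield>
  </datafield>
</record>"""

if __name__ == "__main__":
    print(build_list_records_url(BASE_URL))
    print(marc_title(SAMPLE_RECORD))
```

Because both the request and the record format are standardized, a harvester written this way should work against any repository that exposes the same metadata prefix.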

Invenio’s gears
• Lots of Python, with a sprinkle of C and Lisp(!)
• 630K lines of Python code
• MySQL (MyISAM) for storing data
• Native indexing engine
• Apache + mod_wsgi + mod_xsendfile
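
To give an idea of what an indexing engine does, here is a toy inverted index in Python. It is only a sketch of the general technique and not Invenio's actual indexer; the class name `TinyIndex` and its methods are hypothetical.

```python
from collections import defaultdict
import re


class TinyIndex:
    """Toy inverted index: maps each word to the set of record IDs containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _tokenize(text):
        # Lowercase the text and split it into alphanumeric words.
        return re.findall(r"\w+", text.lower())

    def add(self, record_id, text):
        """Index every word of `text` under `record_id`."""
        for word in self._tokenize(text):
            self._postings[word].add(record_id)

    def search(self, query):
        """Return the IDs of records containing every word of the query (AND search)."""
        words = self._tokenize(query)
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result


if __name__ == "__main__":
    index = TinyIndex()
    index.add(1, "Higgs boson search at the LHC")
    index.add(2, "Open access publishing in HEP")
    index.add(3, "Higgs decay to two photons")
    print(index.search("higgs"))
    print(index.search("higgs photons"))
```

A query only touches the posting sets of its own words, which is why lookups stay fast as the number of records grows.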

Invenio’s History
• 1954: CERN library starts paper dissemination of preprints (early Open Access)
• 1965: First computers at CERN library to help with cataloging
• 1990: Electronic distribution of preprints via FTP
• 1993: CERN Preprint Server, web front-end of electronic preprint catalogue. Institutional repository
• 1996: CERN Library Server (weblib): added books, periodicals and "other material".
• 2000: CERN Document Server: multimedia material, internal notes
• 2002: First public release of the software under GNU-GPL. Worldwide installations and collaborations

Open Access at CERN
• “Consistent with the stated position of the Collaborations and the General Conditions applicable to Experiments at CERN, every effort will be made to publish papers under Open Access conditions, as defined by the SCOAP3 initiative. As at the date of this document, the Creative Commons Attribution ("cc by") license meets these conditions.”
• OA at CERN has a long history. The CERN Convention of 1953 states: "...the results of its experimental and theoretical work shall be published or otherwise made generally available".

Our Development Environment
• Git distributed version control system
• Trac for ticket tracking
• VirtualBox + Vagrant for testing deployment
• We develop on SLC5/6 (based on RHEL5/6), on Ubuntu, on Debian…

Quality Assurance
• Coding standards
  • E.g. PEP8 (Style Guide for Python), etc.
• Documentation
  • "If the code and the comments disagree, then both are probably wrong." – attributed to Norm Schryer
• Test suite
  • ~1,000 unit/regression/web tests
• Security
  • XSS, CSRF, SQL injection, etc.
• Code review
• Kwalitee check: "measuring" quality
  • "It looks like quality, it sounds like quality, but it’s not quite quality." – CPAN Testing Service (quoting Michael Schwern)
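
Two of the security concerns named above can be illustrated with a short sketch: a parameterized query against SQL injection, and output escaping against XSS. This is not Invenio code. It uses the standard-library `sqlite3` module in place of MySQL so that it runs without a server, and the `records` table and both function names are hypothetical.

```python
import html
import sqlite3


def find_records_by_author(conn, author):
    """Look up records with a parameterized query.

    The driver passes `author` as data, never as SQL text, which guards
    against SQL injection.
    """
    cur = conn.execute(
        "SELECT id, title FROM records WHERE author = ?", (author,)
    )
    return cur.fetchall()


def render_title(title):
    """Escape user-supplied text before placing it in HTML (guards against XSS)."""
    return "<h1>" + html.escape(title) + "</h1>"


# In-memory database with one made-up record, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, title TEXT, author TEXT)"
)
conn.execute(
    "INSERT INTO records (title, author) VALUES (?, ?)",
    ("A <b>bold</b> title", "Doe, J"),
)

if __name__ == "__main__":
    print(find_records_by_author(conn, "Doe, J"))
    # A classic injection attempt is treated as a literal author name.
    print(find_records_by_author(conn, "x' OR '1'='1"))
    print(render_title("<script>alert(1)</script>"))
```

Checks like these are natural candidates for the regression part of a test suite, since a single string-concatenated query can silently reintroduce the problem.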

Our community

• 30 institutions worldwide
• CERN + DESY + Fermilab + SLAC
• EPFL …
• ADS and arXiv joining forces
• Translated so far into 26 languages
• 45 committers (in the last year)
• Free + Paid support

An example installation
• 1 Load balancer (HAProxy + Apache mod_proxy + mod_evasive)
• 5 Worker nodes:
  • 2 VMs for static files
  • 3 Real machines for Python handled requests
• 2 DB nodes (MySQL master + MySQL replica)
• AFS distributed FS for backups and file storage
• Sustained recent Higgs announcement load (230 requests per second with peaks of 800 req/s)
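
A rough back-of-the-envelope reading of these figures, as a sketch: if every request were dynamic and the load balancer spread them evenly over the 3 Python nodes, the per-node rates would be as computed below. Both conditions are assumptions. The slide does not say how the traffic split between static and dynamic requests, so the results are upper bounds for the Python nodes rather than measurements.

```python
PYTHON_NODES = 3       # real machines handling Python requests (from the slide)
SUSTAINED_RPS = 230    # sustained requests per second (from the slide)
PEAK_RPS = 800         # peak requests per second (from the slide)


def per_node_rate(total_rps, nodes):
    """Requests per second per node, assuming a perfectly even spread."""
    return total_rps / nodes


if __name__ == "__main__":
    print(round(per_node_rate(SUSTAINED_RPS, PYTHON_NODES), 1))  # 76.7
    print(round(per_node_rate(PEAK_RPS, PYTHON_NODES), 1))       # 266.7
```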

What’s next?
• Werkzeug/Flask + Jinja2 + WTForms for the web framework
• SQLAlchemy for DB abstraction
• Twitter Bootstrap + jQuery for the style
• Optional Solr indexing

Indico: Conference Management Software

http://indico-software.org

• History and Features
• Technologies
• Development

What is Indico?
• Web-based event organization
• Archive of events metadata and related documents (minutes, slides, etc.)
• Booking service and collaboration hub
  • Rooms
  • Videoconference
  • Webcast

What is Indico?
• Started as a European project in 2002
• First used in 2004
• In production at CERN: http://indico.cern.ch
• And in >100 institutions around the world
  • GSI, DESY, Fermilab,…
  • http://indico-software.org/wiki/IndicoWorldWide
• Free and Open Source

Indico @ CERN
• > 170,000 events
• > 700,000 presentations
• > 900,000 files

Event Management with Indico
• All kinds of events

Managing Simple Events

Managing Meetings

Managing Conferences

Managing Conferences
• Full Lifecycle

Managing Conferences

Collaboration Hub
• Room Booking
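
The core check behind any room booking service is whether a requested time slot collides with an existing reservation. The sketch below shows that check in isolation. It is not taken from Indico; the function names and the sample bookings are hypothetical.

```python
from datetime import datetime


def overlaps(start_a, end_a, start_b, end_b):
    """Two half-open intervals [start, end) overlap if each starts before the other ends."""
    return start_a < end_b and start_b < end_a


def can_book(existing, start, end):
    """Return True if [start, end) collides with none of the existing bookings."""
    return all(not overlaps(start, end, s, e) for s, e in existing)


# Made-up bookings for a single room on one day.
BOOKINGS = [
    (datetime(2012, 7, 10, 9, 0), datetime(2012, 7, 10, 10, 0)),
    (datetime(2012, 7, 10, 14, 0), datetime(2012, 7, 10, 16, 0)),
]

if __name__ == "__main__":
    # Starts exactly when the first booking ends: allowed.
    print(can_book(BOOKINGS, datetime(2012, 7, 10, 10, 0), datetime(2012, 7, 10, 11, 0)))
    # Runs into the afternoon booking: rejected.
    print(can_book(BOOKINGS, datetime(2012, 7, 10, 13, 30), datetime(2012, 7, 10, 14, 30)))
```

Treating the intervals as half-open lets back-to-back meetings share a boundary without being flagged as a conflict.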

Collaboration Hub
• Collaboration service requests: videoconference, webcast, recording

Technology
• Python >2.6 + WSGI
  • babel, webassets, pytz, zope.index, zope.interface, simplejson, suds, lxml, zc.queue, python-dateutil, pypdf, pyatom, reportlab, etc.
• Mako 0.4.1+ as template engine
• ZODB as underlying database (http://www.zodb.org/)
• Web frameworks:
  • jQuery
  • Backbone.js
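
For readers unfamiliar with WSGI, the interface between the web server and the Python application, here is a minimal sketch of a WSGI application and a helper that calls it in-process. It is written for Python 3 and uses only the standard library. It is not Indico code, and the `call_app` helper is a hypothetical name introduced for the example.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI application returning a plain-text response."""
    body = ("Requested path: " + environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path):
    """Call a WSGI app with a synthetic environ and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys a server would provide
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_app(application, "/event/1"))
```

Since the application is a plain callable, it can be exercised like this in tests without starting a web server, and the same callable can then be handed to any WSGI-capable server.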

Infrastructure

Compatibility
• Compatible with many browsers: IE8+, Firefox 3.6+, Google Chrome, Safari, etc.
• Working on a mobile version

Development Tools
• Git as version control system
• ~ Eclipse + PyDev
• Unit and Selenium tests + Jenkins (continuous integration server)
• Sphinx for documentation
• Trac as project site
• GitHub: http://github.com/indico
• Transifex for i18n: https://www.transifex.com/projects/p/indico/

What’s Next?
• Enhance the software: v1.0 end of 2012
• Enlarge the community: more advertising

Questions?
