OWF14 - Project and Community Driving :

21 October 2014 | Working with FOSS communities at CEA | Philippe DENIEL ([email protected]), Henri DOREAU ([email protected])

description

Henri DOREAU - Taking part in the Lustre Filesystem community

The open-source Lustre distributed filesystem is the cornerstone of numerous world-class High Performance Computing (HPC) sites. The CEA/DAM has been a user and contributor for years, working with major tech companies and other leading HPC sites all over the world. Beyond this fruitful technical collaboration, CEA/DAM has also participated in building community organizations such as EOFS (European Open FileSystem). Despite major organizational changes throughout its history, the Lustre project has always shown remarkable sustainability in its community of users and developers. This presentation describes the multiple interactions between the CEA/DAM and the Lustre community in its broadest sense, how they are managed, and some of their direct outcomes.

Philippe DENIEL - NFS-Ganesha: an Opensource NFS server in the User Space

This paper describes why CEA developed its own NFS server and why the project was released as open-source software. It shows how this gave birth to an "NFS-Ganesha developers community" and how that community organized itself. It also shows how the community has grown year after year, and the benefits of such a collaboration between an institutional scientific research center like CEA and the industry.

Transcript of OWF14 - Project and Community Driving :

Page 1: OWF14 - Project and Community Driving :

21 October 2014

Working with FOSS communities at CEA

Philippe DENIEL ([email protected])

Henri DOREAU ([email protected])

Page 2: OWF14 - Project and Community Driving :

CEA in a few words

- Teaching and dissemination of knowledge
- Technology transfer and dissemination
- Low-carbon energy
- Very large research infrastructures
- Defense and deterrence
- Technologies for information and health
- Fundamental research (30% subsidy)

Page 3: OWF14 - Project and Community Driving :

10 CENTERS IN FRANCE

Research themes by region:
- Micro-nanotechnologies, nanobiotechnologies, new technologies: Rhône-Alpes
- Lasers and plasmas: Aquitaine
- Materials: Centre, Bourgogne
- Vulnerability assessment, detonics: Midi-Pyrénées
- Nuclear (nuclear fuel life cycle and waste management): Vallée du Rhône
- Nuclear (fusion, fission): Provence Alpes Côte d’Azur
- Materials sciences, software technologies, high performance computing, biomedical: Ile-de-France

Centers: Cesta, Grenoble, Marcoule, Valduc, Le Ripault, Bruyères-le-Châtel, Saclay, Fontenay-aux-Roses, Gramat, Cadarache

Page 4: OWF14 - Project and Community Driving :

Paris

Saclay

CEA/DIF (Bruyères-Le-Châtel)

www-hpc.cea.fr


Page 5: OWF14 - Project and Community Driving :

TERA = defense

CCRT = CEA + industry partners + France Génomique

CURIE @ TGCC = France/Europe HPC

TER@TEC Campus: hosts industrial companies, software companies, and labs (Intel, Bull, DISTENE, ESI, SILKAN…)

Contribution to a French and European industrial ecosystem

CEA: “From research to industry”

Page 6: OWF14 - Project and Community Driving :


Shared TERA/TGCC tools

CEA TERA/TGCC teams have expertise in HPC:
- Managing very large clusters
- Managing high performance parallel file systems
- Managing high-capacity (~100 PB) storage systems

Those teams have developed their own tools:
- ClusterShell: a Python-based parallel shell capable of dealing with large clusters (http://cea-hpc.github.io/clustershell/)
- Collaboration on the development of the Lustre file system (http://lustre-shine.sourceforge.net/)
- Shine: a ClusterShell-based utility to administer large Lustre configurations (http://lustre-shine.sourceforge.net/)
- NFS-Ganesha: a generic NFS server running in user space (https://github.com/nfs-ganesha/nfs-ganesha/wiki)
- Robinhood: advanced FS audit and monitoring software (http://robinhood.sf.net)
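To give a feel for what a "parallel shell" does, here is a minimal sketch using only the Python standard library. It is not ClusterShell's API or implementation: the function names (`run_on_node`, `fan_out`) and the node names are hypothetical, and the command runs locally instead of over ssh so that the example stays self-contained. It shows the two basic ideas: run one command for many nodes concurrently, then group the nodes that returned the same output.

```python
import subprocess
import sys
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_on_node(node, command):
    """Run `command` on behalf of one node and return (node, output).

    Assumption: a real tool would reach the node remotely (for example
    over ssh); here the command simply runs on the local machine.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return node, result.stdout.strip()


def fan_out(nodes, command, max_workers=8):
    """Run `command` for every node concurrently, grouping nodes by output."""
    groups = defaultdict(list)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `nodes`.
        for node, output in pool.map(lambda n: run_on_node(n, command), nodes):
            groups[output].append(node)
    return dict(groups)


nodes = [f"node{i}" for i in range(1, 5)]  # hypothetical node names
report = fan_out(nodes, [sys.executable, "-c", "print('ok')"])
for output, members in report.items():
    print(f"{','.join(members)}: {output}")
```

Grouping by output matters on large clusters: an administrator usually wants to see which few nodes answered differently, not thousands of identical lines.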

The rest of the talk focuses on Lustre and NFS-Ganesha

Page 7: OWF14 - Project and Community Driving :


OpenSource products at TERA and TGCC

HPC is a “niche market”:
- The HPC market brings a good image to the companies involved in it...
- ...but it brings in less money than the enterprise market
- Companies will put most of their effort into enterprise customers

Enterprise products do not fit HPC needs:
- Lack of scalability
- Weak interoperability with HPC simulation codes
- HPC puts a “pressure” on software that is beyond compare

CEA chose to develop its own tools:
- We have something that fits our needs perfectly
- The estimated cost, in man-years, is smaller than the cost of maintaining a badly adapted solution in production

CEA policy is to collaborate and share knowledge:
- Sharing home-made tools as open source is natural behavior
- All other HPC sites will behave the same

Page 8: OWF14 - Project and Community Driving :


Ganesha : a community born at CEA

Ganesha was born because of TERA's needs:
- We needed a server to export a proprietary HSM's namespace via NFSv3
- We had to develop something of our own
- We chose to make it generic and capable of dealing with various protocols and backends

Ganesha has been an open-source product since its design:
- The backend-specific parts of the product were isolated in dedicated libraries called FSALs (File System Abstraction Layers). Today FSALs exist for XFS, VFS, LUSTRE, CEPH, GPFS, GLUSTERFS, ZFS
- Ganesha became the first “multi-usage” NFSv3+NFSv4 server in user space for Linux
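The idea of isolating backend-specific code behind an abstraction layer can be sketched in a few lines. This is only an illustration of the design principle, not the FSAL API: the real FSALs are libraries inside NFS-Ganesha, and every class and method name below (`Backend`, `DictBackend`, `DirectoryBackend`, `Exporter`) is hypothetical.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class Backend(ABC):
    """Hypothetical interface: all the protocol layer knows about storage."""

    @abstractmethod
    def read(self, name):
        """Return the content of the object called `name` as bytes."""


class DictBackend(Backend):
    """Toy backend keeping objects in memory."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def read(self, name):
        return self._objects[name]


class DirectoryBackend(Backend):
    """Toy backend reading objects from a local directory."""

    def __init__(self, root):
        self._root = Path(root)

    def read(self, name):
        return (self._root / name).read_bytes()


class Exporter:
    """Stand-in for the protocol layer: it only calls the Backend interface."""

    def __init__(self, backend):
        self._backend = backend

    def handle_read(self, name):
        return self._backend.read(name)


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "hello.txt").write_bytes(b"from disk")
    exporters = [
        Exporter(DictBackend({"hello.txt": b"from memory"})),
        Exporter(DirectoryBackend(tmp)),
    ]
    # The same request is served by two different backends.
    replies = [exporter.handle_read("hello.txt") for exporter in exporters]

print(replies)
```

Because `Exporter` never looks inside a backend, supporting a new storage system means writing one more `Backend` subclass and leaving the protocol code untouched.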

The industry is in love with the open-source model:
- Ganesha has been open source since 7/21/2007 (first release on SourceForge)
- IBM became an active contributor in 2009
- LinuxBox/CohortFS came in late 2009
- Panasas joined the community in early 2011
- RedHat joined in 2013 (Ganesha will be part of Fedora21)
- The community now involves more than 35 steady committers from about 10 companies

Page 9: OWF14 - Project and Community Driving :


Bringing up the Ganesha Community

Creating a community = communicating:
- Publish the project's releases on SourceForge
- Create mailing lists for the project (at least one dedicated to users and one dedicated to developers). SourceForge can host such lists
- Expose source repositories to encourage people to download development versions and compile/modify them
  - Manage sources with Git: managing remote committers is easy
  - Expose the git tree on the web (for example on github.com)
- Have a website and/or a wiki to give information
- A centralized bug repository is critical
  - The Ganesha bug tracker is hosted by RedHat's Bugzilla

There is nothing like verbal communication:
- Submit abstracts and papers to conferences
  - A 30' talk is really cool: people will attend your talk and read the proceedings
- Do not underestimate “lesser” sessions
  - BOF sessions: technically skilled people attend them; some may take an interest in your project and start collaborating. At the least, they can report positively to their bosses
  - WiP sessions: very short talks (about 5'), but people involved in technology watch often attend them
  - Poster sessions: make it possible to have long talks with potential contributors

Page 10: OWF14 - Project and Community Driving :


The community in action (1/2)

Main issue: dealing with remote people
- Contributors are spread across different countries and time zones
  - India is 4h30 “later”
  - USA Central Time is 7 hours “earlier”
- The main problem is to keep people in sync.
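The offsets quoted above (4h30 for India, 7 hours for US Central Time) correspond to winter time as seen from France. The sketch below shows the arithmetic for a single meeting slot. It is an illustration only: the meeting date and time are hypothetical, and fixed UTC offsets are used so the example needs nothing beyond the standard library. A real scheduler would use named time zones to follow daylight-saving changes.

```python
from datetime import datetime, timedelta, timezone

# Fixed winter-time UTC offsets (assumption: daylight saving is ignored).
PARIS = timezone(timedelta(hours=1), "Paris")
INDIA = timezone(timedelta(hours=5, minutes=30), "India")
US_CENTRAL = timezone(timedelta(hours=-6), "US Central")


def local_times(meeting, zones):
    """Return the wall-clock time of `meeting` in each zone, keyed by zone name."""
    return {
        zone.tzname(None): meeting.astimezone(zone).strftime("%H:%M")
        for zone in zones
    }


# Hypothetical slot: 15:00 in Paris on a winter day.
call = datetime(2014, 12, 2, 15, 0, tzinfo=PARIS)
schedule = local_times(call, [PARIS, INDIA, US_CENTRAL])
for place, clock in schedule.items():
    print(f"{place}: {clock}")
```

With these offsets, a mid-afternoon slot in France falls in the evening in India and early in the morning in the central USA, which leaves only a narrow window for synchronous meetings.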

Information channels:
- Use the mailing list as much as possible
  - It's easy to follow a discussion thread
  - Majordomo keeps an archive of the messages posted on the list
- People ask for patch reviews on the list
  - Currently, reviews are done via the github.com website
- For “synchronous” discussion, people prefer talking on IRC

After several years, the project finally has a logo !!!

Page 11: OWF14 - Project and Community Driving :


The community in action (2/2)

Checkpoints:
- Weekly concalls (phone conferences)
  - New features and patches are discussed
  - The status of branches in the official source repository is addressed
  - Decisions are taken during the concall
  - Attendees can introduce “open topics” to talk about possible new features or bugs
- IRL meetings
  - The Ganesha community meets once a year during Connectathon, a larger conference dedicated to NFS interoperability
  - Part of the community attends the “bake-a-thon” (unofficial Connectathon) twice a year

Industrial contributors: good or bad?
- 90% of the Ganesha contributors belong to the industry
  - Ganesha is part of a future product (we use LGPL)
  - People from the industry have very strict test suites and QA
  - The open-source economic model is recognized as a valuable one by everyone
- BUT people from different companies are competitors
  - They play the open-source game but do not forget the rules of the market
- The balance is quite positive: the project gains almost 10 man-years each year through FOSS collaboration

Page 12: OWF14 - Project and Community Driving :


Lustre, the galactic filesystem

Scalable clustered filesystem:
- Powers the world's most powerful supercomputers
  - Tens of thousands of clients
  - Hundreds of petabytes of storage
  - Terabytes per second of I/O throughput

Fully software solution:
- Kernel-land (Linux)
- Distributed under the terms of the GNU GPLv2

Actively developed (~100 contributors per major release)

Drives an entire ecosystem (Robinhood policy engine, Hadoop adapter...)

Page 13: OWF14 - Project and Community Driving :


Project history

- Started in 1999 by P. Braam at Carnegie Mellon University
- Cluster File Systems (CFS) company founded in 2001
- Acquired by Sun Microsystems in 2007
- Acquired by Oracle in 2010, which dropped it less than a year later
- Creation of Whamcloud
- Acquired by Intel in 2012
- Xyratex Ltd. bought the IP in Feb. 2013 and gave it back to the community

The core developers mostly remained the same

The community organized itself to cope with these changes (OpenSFS, EOFS)

Page 14: OWF14 - Project and Community Driving :


Lustre community today

Diverse backgrounds, same goals

Major HPC actors:
- Intel
- Seagate
- Cray
- Bull

Large computing centers:
- USA: LLNL, ORNL, NASA, NCSA...
- France: CEA, Total, EDF, MeteoFrance...
- Germany: FZJ, HLRS
- Italy: Cineca
- Asia: RIKEN
- Australia: NCI
- ...

Universities:
- University of Indiana
- University of Reims
- University of Dresden
- Stanford University
- ...

Page 15: OWF14 - Project and Community Driving :


Working together

Sharing ideas, sharing code, sharing benefits

Continuous integration techniques:
- Each and every patch involves several developers
- Improve code quality
- Improve communication within the project

Regular, major events:
- Lustre User Group (OpenSFS, USA)
- Lustre Admin & Dev workshop (EOFS, France)
- China Lustre User Group
- Japan Lustre User Group

Strong links between administrators and developers:
- Sysdevs get feedback from sysadmins...
- ...sometimes they are the same people!
- Product architects are quite active on the mailing lists

Shared best practices:
- Code reviews and “official” branches hosted in Gerrit
- Unified coding style and documentation writing

Page 16: OWF14 - Project and Community Driving :


As a conclusion

Collaboration with the FOSS community is good:
- A way to bring more man-years to the project
- Contributors with different use cases will highlight bugs

Sharing and communicating:
- A community is a good place to implement and share good practices
- The community is structured by common tools and common “virtuous” ways of using them
- Contributors' “Tables of the Law” will provide a strong and reliable backbone to the project

Do not hesitate to work with the industry:
- Open-source software is a valuable economic model
- The industry invests a lot in open source
- Industrial players won't have the same goals as research institutions, but common “root needs” are easy to find to start collaborating
- Choose a license that is compatible with such collaboration (LGPLv3, CeCILL-C,...)

Page 17: OWF14 - Project and Community Driving :


Questions ?

Page 18: OWF14 - Project and Community Driving :

Direction des applications militaires

Commissariat à l’énergie atomique et aux énergies alternatives

Centre DAM-Ile de France | 91297 Bruyères-le-Châtel Cedex

T. +33 (0)1 69 26 40 00 | F. +33 (0)1 69 26 70 86

Public establishment of an industrial and commercial nature | RCS Paris B 775 685 019