Presentation distro recipes-2013

20
Intoduction Use case My efforts Package / Project meta-data Fin Using RDF metadata for traceability among projects and distributions Olivier Berger <mailto:[email protected]> Debian + Télécom SudParis Thursday 04/04/2013 Distro Recipes - Paris

Transcript of Presentation distro recipes-2013

Page 1: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Using RDF metadata for traceability amongprojects and distributions

Olivier Berger <mailto:[email protected]>Debian + Télécom SudParis

Thursday 04/04/2013Distro Recipes - Paris

Page 2: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Quick IntroductionShort bio

Olivier BERGER<mailto:[email protected]><mailto:[email protected]>Research Engineer at TELECOM SudParis, expert on softwaredevelopment forges, and interoperability in Libre Softwaredevelopment projects. Contributor to FusionForge, Debian, etc.

Page 3: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

How much duplication ?

• Think about all the duplicate bug reports, and the number ofpeople involved. . . usually not the ones who can help, btw

• Can you navigate between all of these ?• Does google search help ?

Page 4: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Some issues

• Universal Free Software Project description format ; (nay)• Universal distribution Package description format ; (nay)• Common Semantics ? (probably)• How to inter-link documents about the same program :

• Bug reports (upstream and in all distros)• Launchpad

• Security advisories• Debian’s Homepage: control field

• Are we maintaining Free Software distributions in close silos ?• External directories (FSF, Freshmeat/Freecode, JoinUp, etc.)• What’s usually wrong with [ XML | JSON | YAML | RFC822 ]format ?

Page 5: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

An approachLinked Open Development Meta-Data

• Let’s try and make as much distro facts as possible availableto humans + machines ?

• Adopting the 5 ? Open Data principles using RDF for distrometa-data :

? make your stuff available on the web? ? make it available as structured data? ? ? non-proprietary format? ? ? ? use URLs to identify things, so that people can point

at your stuff (RDF)? ? ? ? ? link your data to other people’s data to provide context

(Linked RDF)

Page 6: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Project/program/package traceability over the FLOSSecosystem

• Assembling a graph of descriptions of packages/projectspublished as Linked Data (DOAP or ADMS.SW) on theirforges / project portals.For instance :

• For Debian, from the Debian PTS (already Linked Data proof)• For Apache, Gnome, Pypi, from DOAP files (not yet all

Linked Data, but close)• . . . Add your preferred upstream . . .

• Consumed by developer/maintainer/packager tools : followinglinks between packages, (and their bugs, security alerts), all insemantic interoperable meta-data formats (RDF) !

Page 7: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Matching project/package descriptions

Example (SPARQL query to match packages by theirhomepages)PREFIX doap : <ht tp :// u s e f u l i n c . com/ns /doap>

SELECT ∗ WHERE{

GRAPH <ht tp :// packages . qa . deb i an . org/>{?dp doap : homepage ?h

}GRAPH <ht tp :// p r o j e c t s . apache . org/>{?ap doap : homepage ?h

}}

“Semantic query” : Trying to match source packages in Debianwhose upstream project’s homepages match those of the Apacheproject’s DOAP descriptors.

Page 8: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Matching packages

Results : 62 matching Apache projects packaged in Debian (forwhich maintainers did set the Homepage Control field consistently).

Example (Matching upstream Apache project homepages withDebian source packages’)

dp h apivy ant.a.o/ivy/ ant.a.o/ivy/apr apr.a.o/ apr.a.o/apr-util apr.a.o/ apr.a.o/libcommons-cli-java commons.a.o/cli/ commons.a.o/cli/libcommons-codec-java commons.a.o/codec/ commons.a.o/codec/libcommons-collections3-java commons.a.o/collections/ commons.a.o/collections/libcommons-collections-java commons.a.o/collections/ commons.a.o/collections/commons-daemon commons.a.o/daemon/ commons.a.o/daemon/libcommons-discovery-java commons.a.o/discovery/ commons.a.o/discovery/libcommons-el-java commons.a.o/el/ commons.a.o/el/libcommons-fileupload-java commons.a.o/fileupload/ commons.a.o/fileupload/commons-io commons.a.o/io/ commons.a.o/io/commons-jci commons.a.o/jci/ commons.a.o/jci/libcommons-launcher-java commons.a.o/launcher/ commons.a.o/launcher/. . . . . . . . .

Matching program names gives more results but is ambiguous

Page 9: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

My current experimentmining project descriptions

• Running on my laptop ATM• Currently use Python to harvest meta-data from DOAP filesfor :

• Gnome• Apache• Pypi.python.org• Debian• (add you ?)

• Down to a virtuoso Triple store : > 2.5 M triples ATM• A python app to perform queries• May be published as a public service some day

Page 10: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Adding RDF to the Debian Package Tracking Systemhttp ://packages.qa.debian.org/

Page 11: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Adding ADMS.SW for FusionForge

• Adding ADMS.SW support in FusionForge• Projects• Releases• Trove categories

• Expected deployment on :• Cenatic• Adullact• Debian’s Alioth

Page 12: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Model : Graph of RDF resourcesReference Linked Data resources with canonical URI like<http://packages.qa.debian.org/PACKAGE#RESOURCE_ID>

The greyed resources correspond to upstream components

Page 13: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Forget about RDF/XML

Yes, RDF can be expressed :

• as XML• as Turtle (PREFERRED) : text (close to YAML / RFC 822)• as JSON

Turtle example :@p r e f i x r d f : <h t t p : //www.w3 . org /1999/02/22− rd f−syntax−ns#> .@p r e f i x f o a f : <h t t p : // xmlns . com/ f o a f /0 .1/> .@p r e f i x ow l : <h t t p : //www.w3 . org /2002/07/ owl#> .

<h t t p : // peop l e . deb i an . org /~ ob e r g i x / f o a f . t t l#me>a f o a f : P e r s o n ;f oa f : name " O l i v i e r ␣ Berge r " ;f o a f : n i c k " ob e r g i x " ;f oa f :mbox " ma i l t o : o b e r g i x@d e b i a n . org " ;f oa f :homepage <h t t p : // peop l e . deb i an . org /~ ob e r g i x /> ;owl :sameAs <h t t p : //www−p u b l i c . te lecom−s u d p a r i s . eu/~berger_o/ f o a f . r d f

#me> .

See also RDF Primer — Turtle version

Page 14: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Apache2 Debian packaging as RDF

http://packages.qa.debian.org/a/apache2.ttl@p r e f i x doap : <h t t p : // u s e f u l i n c . com/ns /doap#> .@p r e f i x admssw: <h t t p : // p u r l . o rg /adms/sw/> .

<h t t p : //p . qa . d . o/ apache2#p r o j e c t>a admssw :So f twa r ePro j e c t ;doap:name " apache2 " ;d o a p : d e s c r i p t i o n "Debian ␣ apache2 ␣ sou r c e ␣ packag ing " ;doap:homepage <h t t p : // packages . d . o/ s r c : a p a c h e 2> ;doap:homepage <h t t p : //p . qa . d . o/ apache2> ;d o a p : r e l e a s e <h t t p : //p . qa . d . o/ apache2#apache2_2 .2.22−11> ;s c h ema : c o n t r i b u t o r [

a f o a f :On l i n eAc c oun t ;foa f :accountName "Debian ␣Apache␣Ma i n t a i n e r s " ;f o a f : a c coun tSe r v i c eHomepage <h t t p : //qa . d . o/ d e v e l o p e r . php? l o g i n=

debian−a p a c h e@ l i s t s . d . o>] .

<h t t p : //p . qa . d . o/ apache2#apache2_2 .2.22−11>a admssw :So f twa reRe l ea se ;r d f s : l a b e l " apache2 ␣2.2.22−11" ;d o a p : r e v i s i o n "2.2.22−11" ;admssw:package <h t t p : //p . qa . d . o/ apache2#apache2_2 .2 .22−11. dsc> ;admssw : i n c l udedAs s e t <h t t p : //p . qa . d . o/ apache2#upstreamsrc_2 . 2 . 2 2> ;admssw : i n c l udedAs s e t <h t t p : //p . qa . d . o/ apache2#deb ians rc_2 .2.22−11> .

Page 15: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

The ADMS.SW ontology

Asset Description Metadata Schema for Software (ADMS.SW)

• Pilot : EC / Interoperability Solutions forEuropean Public Administrations (ISA) -cf. Joinup site

• Exchanging project / packages / releasesdescriptions across development platformsand directories

Page 16: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Open specifications

• Not too much NIH syndrom : reuses :• ADMS / RADion (generic meta-data for semantic assets

indexing)• DOAP (Description of a project)• SPDX™ ( Software Package Data Exchange ®)• W3C Government Linked Data (GLD) Working Group

Version 1.0 issued 2012/06/29• RDF Validator available• RDF is extensible : ADMS.SW / DOAP core + distro-specificextensions

Page 17: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

ADMS.SW main concepts

Project (= Program), Release, Package

Modeling Debian will require other complements

Page 18: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Related initiatives

Related initiatives about package meta-data :• AppStream (Software Center, DEP11, etc.)• Umegaya (upstream meta-data, links with researchpublications, etc.)

• DistroMatch (match package names across distributions)

Page 19: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Recommendations

• Upstream authors : please create DOAP descriptions for yourprojectshttps ://github.com/edumbill/doap/wiki

• Distributions : join the ADMS.SW bandwagon to documentspackage releases

• Followup-to : <[email protected]> ?

Page 20: Presentation distro recipes-2013

Intoduction Use case My efforts Package / Project meta-data Fin

Fin

More details at :• http ://wiki.debian.org/qa.debian.org/pts/RdfInterface• Linked Data descriptions of Debian source packages usingADMS.SW

• Authoritative Linked Data descriptions of Debian sourcepackages using ADMS.SW, to appear at OSS 2013 (pre-printavailable on demand)

Contact :Micro-blogging : @oberger http://identi.ca/oberger/Email : mailto:[email protected] : http://www-public.telecom-sudparis.eu/~berger_o/weblog/

Copyright 2013 Institut Mines Telecom + Olivier BergerLicense of this presentation : Creative Commons Share Alike (except illustrations which are under

copyright of their respective owners)