How e-infrastructure can contribute to Linked Germplasm Data

How e-infrastructure can contribute to Linked Germplasm Data. Giannis Stoitsis, Agro-Know [email protected]. e-conference on Germplasm Data Interoperability


Transcript of How e-infrastructure can contribute to Linked Germplasm Data

Page 1: How e-infrastructure can contribute to Linked Germplasm Data

How e-infrastructure can contribute to Linked Germplasm Data

Giannis Stoitsis, Agro-Know [email protected]

e-conference on Germplasm Data Interoperability

Page 2: How e-infrastructure can contribute to Linked Germplasm Data

Contents

• Why we need e-infrastructure
• What e-infrastructure can provide
• The agINFRA approach
• agINFRA powered services for Germplasm data
• What is next

Page 3: How e-infrastructure can contribute to Linked Germplasm Data

WHY WE NEED E-INFRASTRUCTURE

Page 4: How e-infrastructure can contribute to Linked Germplasm Data

agricultural data

• publications, theses, reports, other grey literature
• educational material and content, courseware
• primary data, such as measurements & observations
  – structured, e.g. datasets as tables
  – digitized, e.g. images, videos
• secondary data, such as processed elaborations
  – e.g. dendrograms, pie charts, models
• provenance information, incl. authors, their organizations and projects
• experimental protocols & methods
• social data, tags, ratings, etc.
• …

Page 5: How e-infrastructure can contribute to Linked Germplasm Data

educators' view

• stats
• gene banks
• GIS data
• blogs
• journals
• open archives
• raw data
• technologies
• learning objects
• …

Page 6: How e-infrastructure can contribute to Linked Germplasm Data

researchers' view

• stats
• gene banks
• GIS data
• blogs
• journals
• open archives
• raw data
• technologies
• learning objects
• …

Page 7: How e-infrastructure can contribute to Linked Germplasm Data

practitioners' view

• stats
• gene banks
• GIS data
• blogs
• journals
• open archives
• raw data
• technologies
• learning objects
• …

Page 8: How e-infrastructure can contribute to Linked Germplasm Data

• stats
• gene banks
• GIS data
• blogs
• journals
• open archives
• raw data
• technologies
• learning objects
• …

Page 9: How e-infrastructure can contribute to Linked Germplasm Data

we still have data silos

• Many metadata standards (e.g. DC, IEEE LOM, Dw, local schemas)
• Diversity of web interfaces (e.g. REST, OAI-PMH, SOAP, SPI, SQI)
• Different exchange formats (e.g. XML, RDF, JSON)
• Fragmented use of taxonomies

[Figure: overview of approaches for Linked Data (LD) in educational data/resource sharing. One approach is on-the-fly/automated integration of heterogeneous APIs and data (http://www.meducator.net); the other is dataset (transformation and) cataloging (http://linkedup-project.eu). Annotations on the figure: "We are still here …" and "… and not here …".]

Page 10: How e-infrastructure can contribute to Linked Germplasm Data

we need ontologies published online and aligned

• stats
• gene banks
• blogs
• journals
• open archives
• raw data
• learning objects

Page 11: How e-infrastructure can contribute to Linked Germplasm Data

we need tools to share data

Page 12: How e-infrastructure can contribute to Linked Germplasm Data

we need tools to semantically annotate data

Page 13: How e-infrastructure can contribute to Linked Germplasm Data

and for all this we need

Page 14: How e-infrastructure can contribute to Linked Germplasm Data
Page 15: How e-infrastructure can contribute to Linked Germplasm Data

• aim is: promoting data sharing and consumption related to any research activity aimed at improving productivity and quality of crops

ICT for computing, connectivity, storage, instrumentation

data infrastructure for agriculture

Page 16: How e-infrastructure can contribute to Linked Germplasm Data

what researchers need in agINFRA

… only a browser and an internet connection

Page 17: How e-infrastructure can contribute to Linked Germplasm Data

typical problem: computing

Page 18: How e-infrastructure can contribute to Linked Germplasm Data

typical problem: hosting

Page 19: How e-infrastructure can contribute to Linked Germplasm Data

what can be hosted and executed on agINFRA

• Data storage & management tools
  – APIs for content dissemination in large networks
• Processing & visualisation tools
• Metadata aggregation infrastructure
• Search engines and apps for institutions or communities
• Environments for running experiments, e.g. comparing different content recommendation algorithms

Page 20: How e-infrastructure can contribute to Linked Germplasm Data

http://aginfra.eu/en/our-solution/api

Page 21: How e-infrastructure can contribute to Linked Germplasm Data

HOW AGINFRA CAN SOLVE DATA INTEROPERABILITY PROBLEMS

Page 22: How e-infrastructure can contribute to Linked Germplasm Data

WORKFLOW FOR METADATA AGGREGATION

Page 23: How e-infrastructure can contribute to Linked Germplasm Data

metadata aggregations

• concerns viewing merged collections of metadata records from different sources
• useful when access is needed to specific supersets or subsets of networked collections
  – records actually stored at the aggregator
  – or queries distributed over virtually aggregated collections
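The two options on this slide, records stored at the aggregator versus queries distributed over virtually aggregated collections, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two source collections, their records, and the class and function names are invented, and in-memory lists stand in for remote repositories.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]

# Two hypothetical source collections; the records are invented for illustration.
SOURCE_A: List[Record] = [
    {"id": "a1", "title": "Wheat accession passport data"},
    {"id": "a2", "title": "Barley field trial observations"},
]
SOURCE_B: List[Record] = [
    {"id": "b1", "title": "Wheat landrace images"},
]


def matches(record: Record, term: str) -> bool:
    """Case-insensitive substring match on the record title."""
    return term.lower() in record["title"].lower()


class StoredAggregator:
    """Option 1: records are copied to and stored at the aggregator."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def harvest(self, source: List[Record]) -> None:
        # Copy each record so the aggregator holds its own merged collection.
        self.records.extend(dict(r) for r in source)

    def search(self, term: str) -> List[Record]:
        return [r for r in self.records if matches(r, term)]


class VirtualAggregator:
    """Option 2: nothing is copied; each query is distributed to the sources."""

    def __init__(self, sources: List[Callable[[str], List[Record]]]) -> None:
        self.sources = sources

    def search(self, term: str) -> List[Record]:
        results: List[Record] = []
        for query_source in self.sources:
            results.extend(query_source(term))
        return results


def make_endpoint(source: List[Record]) -> Callable[[str], List[Record]]:
    """Wrap a collection as a search function, standing in for a remote API."""
    return lambda term: [r for r in source if matches(r, term)]


stored = StoredAggregator()
stored.harvest(SOURCE_A)
stored.harvest(SOURCE_B)
virtual = VirtualAggregator([make_endpoint(SOURCE_A), make_endpoint(SOURCE_B)])

print([r["id"] for r in stored.search("wheat")])
print([r["id"] for r in virtual.search("wheat")])
```

Both aggregators return the same merged view here. The practical difference is where the records live: the stored variant answers from its own copy, while the virtual variant depends on every source answering at query time.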

Page 24: How e-infrastructure can contribute to Linked Germplasm Data

typically look like this

Ternier et al., 2010

Page 25: How e-infrastructure can contribute to Linked Germplasm Data

metadata aggregation tools

More than a harvester:

• Validation Service
• Repository Software
• Registry Service
• Harvester

Powered by

Page 26: How e-infrastructure can contribute to Linked Germplasm Data

a metadata aggregation workflow that can be ported to agINFRA

Harvesting, Validating, Transforming

OAI target - XMLs

Triplification, Storing and indexing
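The stages named on this slide can be sketched as a chain of small functions. This is only an illustration under stated assumptions, not the agINFRA implementation: the sample XML is a simplified stand-in for what an OAI target returns (not a real OAI-PMH envelope), the records are invented, the validation rule (identifier and title required) is an assumption, and the "index" is a plain dictionary.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Dublin Core element namespace in ElementTree's {uri}tag notation.
DC = "{http://purl.org/dc/elements/1.1/}"

# Simplified stand-in for XML returned by an OAI target; records are invented.
SAMPLE_XML = """<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record>
    <dc:identifier>http://example.org/record/1</dc:identifier>
    <dc:title>Wheat accession passport data</dc:title>
  </record>
  <record>
    <dc:title>Record without an identifier</dc:title>
  </record>
</records>"""


def harvest(xml_text: str) -> List[ET.Element]:
    """Harvesting: parse the XML and return the record elements."""
    return ET.fromstring(xml_text).findall("record")


def validate(records: List[ET.Element]) -> List[ET.Element]:
    """Validating: keep only records that have an identifier and a title."""
    return [r for r in records
            if r.findtext(DC + "identifier") and r.findtext(DC + "title")]


def transform(records: List[ET.Element]) -> List[Dict[str, str]]:
    """Transforming: map each XML record to a flat internal dictionary."""
    return [{"id": r.findtext(DC + "identifier").strip(),
             "title": r.findtext(DC + "title").strip()} for r in records]


def triplify(items: List[Dict[str, str]]) -> List[Tuple[str, str, str]]:
    """Triplification: one (subject, predicate, object) triple per title."""
    predicate = "http://purl.org/dc/elements/1.1/title"
    return [(item["id"], predicate, item["title"]) for item in items]


def store_and_index(triples: List[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Storing and indexing: map each lowercase title word to its subjects."""
    index: Dict[str, List[str]] = {}
    for subject, _, title in triples:
        for word in title.lower().split():
            index.setdefault(word, []).append(subject)
    return index


index = store_and_index(triplify(transform(validate(harvest(SAMPLE_XML)))))
print(index["wheat"])
```

Keeping each stage as a separate function mirrors the slide: a stage can be replaced (for example, a stricter validator) without touching the others.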

Page 27: How e-infrastructure can contribute to Linked Germplasm Data

TOOLS FOR PUBLISHING AND LINKING VOCABULARIES

Page 28: How e-infrastructure can contribute to Linked Germplasm Data
Page 29: How e-infrastructure can contribute to Linked Germplasm Data

AGRICULTURAL DATA DISCOVERY SERVICE/PORTAL OVER THE CLOUD

Page 30: How e-infrastructure can contribute to Linked Germplasm Data

agricultural data discovery modules for open source CMS

http://www.youtube.com/watch?v=OYlxWlyag04&feature=youtu.be

Page 31: How e-infrastructure can contribute to Linked Germplasm Data

LINKING GERMPLASM DATABASES AND EXPOSING DESCRIPTIONS AS LINKED DATA

Page 32: How e-infrastructure can contribute to Linked Germplasm Data

agINFRA contribution in germplasm data interoperability

• Define recommendations for describing germplasm data

• Define mappings between different metadata formats

• Provide APIs for transformation – triplification of germplasm descriptions
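To make "triplification of germplasm descriptions" concrete, here is a minimal sketch that serializes one flat description as N-Triples lines. It is not the agINFRA API: the namespaces, the field names, and the sample record are hypothetical, and every value is emitted as a plain string literal.

```python
from typing import Dict, List

# Hypothetical namespaces and field names, chosen only for this illustration.
VOCAB = "http://example.org/germplasm/vocab#"
BASE = "http://example.org/germplasm/accession/"


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n"))


def triplify(record: Dict[str, str]) -> List[str]:
    """Turn one flat germplasm description into N-Triples lines.

    The accession identifier becomes the subject URI; every other
    non-empty field becomes one triple with a string literal object.
    """
    subject = f"<{BASE}{record['accession_id']}>"
    lines = []
    for field, value in sorted(record.items()):
        if field == "accession_id" or not value:
            continue
        lines.append(f'{subject} <{VOCAB}{field}> "{escape_literal(value)}" .')
    return lines


# Invented sample description.
record = {"accession_id": "ACC-001", "genus": "Triticum", "country": "GR"}
for line in triplify(record):
    print(line)
```

In a real deployment the predicates would come from a published, aligned ontology rather than an ad hoc vocabulary, which is the point of the recommendations listed above.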

Page 33: How e-infrastructure can contribute to Linked Germplasm Data

mapping between different metadata formats powered by agINFRA
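A mapping between metadata formats can be expressed as a crosswalk table that renames source fields to a common set of fields. The sketch below assumes two invented source formats ("format_a" and "format_b") and invented field names; it shows only the renaming step, not value conversion.

```python
from typing import Dict

# Hypothetical crosswalks: source field name -> common field name.
CROSSWALKS: Dict[str, Dict[str, str]] = {
    "format_a": {"acc_no": "accession_id", "genus_name": "genus", "origin": "country"},
    "format_b": {"identifier": "accession_id", "genus": "genus", "countryCode": "country"},
}


def map_record(record: Dict[str, str], source_format: str) -> Dict[str, str]:
    """Rename the fields of a record using the crosswalk for its format.

    Fields that have no entry in the crosswalk are dropped.
    """
    crosswalk = CROSSWALKS[source_format]
    return {crosswalk[key]: value for key, value in record.items() if key in crosswalk}


# The same invented accession described in two different source formats.
from_a = map_record({"acc_no": "ACC-001", "genus_name": "Triticum", "origin": "GR"}, "format_a")
from_b = map_record({"identifier": "ACC-001", "genus": "Triticum", "countryCode": "GR"}, "format_b")
print(from_a == from_b)
```

Once both descriptions share field names, they can be compared, merged, or passed to a single transformation step.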

Page 34: How e-infrastructure can contribute to Linked Germplasm Data

publishing germplasm data as linked data in agINFRA

services

Page 35: How e-infrastructure can contribute to Linked Germplasm Data

next steps in the context of agINFRA

• Develop the recommendations for publishing germplasm data

• Deploy transformers and make them available in agINFRA

• Deploy API for triplification

Page 36: How e-infrastructure can contribute to Linked Germplasm Data

   

thank you! [email protected] www.agroknow.gr www.aginfra.eu