NIH BD2K bioCADDIE DataMed: Data Discovery Index
![Page 1: NIH BD2K bioCADDIE DataMed: Data Discovery Index](https://reader031.fdocuments.us/reader031/viewer/2022022413/58ef25b91a28ab40598b4623/html5/thumbnails/1.jpg)
Consultant, Honorary Academic Editor
Associate Director, Principal Investigator
Susanna-Assunta Sansone, PhD
Alan Turing Institute Symposium Oxford, 6-7 April, 2016
![Page 2: NIH BD2K bioCADDIE DataMed: Data Discovery Index](https://reader031.fdocuments.us/reader031/viewer/2022022413/58ef25b91a28ab40598b4623/html5/thumbnails/2.jpg)
A Data Discovery Index prototype that:
• Helps users find and access shared data
• Interoperates in the NIH Commons (biomedical digital assets)
![Page 3: NIH BD2K bioCADDIE DataMed: Data Discovery Index](https://reader031.fdocuments.us/reader031/viewer/2022022413/58ef25b91a28ab40598b4623/html5/thumbnails/3.jpg)
![Page 4: NIH BD2K bioCADDIE DataMed: Data Discovery Index](https://reader031.fdocuments.us/reader031/viewer/2022022413/58ef25b91a28ab40598b4623/html5/thumbnails/4.jpg)
Prototype architecture (diagram labels):
Data Sources: Repositories, Online datasets, Publishers, Funding Agencies, Data producers
Components: Metadata Ingestion, ElasticSearch, Terminology server, User Interface
Stages: Ingestion, Indexing, Searching
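The three stages named on this slide (ingestion, indexing, searching) can be illustrated with a minimal sketch. This is not the DataMed implementation: the prototype uses ElasticSearch, whereas the sketch below swaps in a plain in-memory inverted index so it runs with no dependencies. The record fields, repository names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical metadata records standing in for what ingestion would
# pull from repositories; the field names are illustrative only.
RECORDS = [
    {"id": "ds1", "title": "Mouse liver RNA-seq", "source": "RepositoryA"},
    {"id": "ds2", "title": "Human liver proteomics", "source": "RepositoryB"},
    {"id": "ds3", "title": "Zebrafish heart imaging", "source": "RepositoryA"},
]


def ingest(records):
    """Ingestion: keep only records that carry both an id and a title."""
    return [r for r in records if r.get("id") and r.get("title")]


def build_index(records):
    """Indexing: map each lowercased title token to the ids containing it."""
    index = defaultdict(set)
    for record in records:
        for token in record["title"].lower().split():
            index[token].add(record["id"])
    return index


def search(index, query):
    """Searching: return sorted ids that match every query token."""
    tokens = query.lower().split()
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)


index = build_index(ingest(RECORDS))
print(search(index, "liver"))
print(search(index, "human liver"))
```

A real deployment would replace `build_index` and `search` with calls to a search engine and would normalise terms through the terminology server; the sketch only shows how the stages hand data to one another.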
![Page 5: NIH BD2K bioCADDIE DataMed: Data Discovery Index](https://reader031.fdocuments.us/reader031/viewer/2022022413/58ef25b91a28ab40598b4623/html5/thumbnails/5.jpg)
Diagram labels: Data Discovery Index; aggregators A, B, C; data
Organizing framework and portal for data
Dashed lines: mapping of metadata standards, links to aggregators, data
Aggregators: repositories or various indices
Data: digital research objects
Pilot projects* Core development team
* There is work for everyone (and more)
Designed as an element of the ecosystem
![Page 6: NIH BD2K bioCADDIE DataMed: Data Discovery Index](https://reader031.fdocuments.us/reader031/viewer/2022022413/58ef25b91a28ab40598b4623/html5/thumbnails/6.jpg)
Use cases: a community-driven effort
![Page 7: NIH BD2K bioCADDIE DataMed: Data Discovery Index](https://reader031.fdocuments.us/reader031/viewer/2022022413/58ef25b91a28ab40598b4623/html5/thumbnails/7.jpg)
The ‘right’ level of metadata elements
Examples of competency questions, derived from the use cases
![Page 8: NIH BD2K bioCADDIE DataMed: Data Discovery Index](https://reader031.fdocuments.us/reader031/viewer/2022022413/58ef25b91a28ab40598b4623/html5/thumbnails/8.jpg)
The ‘appropriate’ metadata standards
Mapping the landscape of standards and databases in the life sciences
![Page 9: NIH BD2K bioCADDIE DataMed: Data Discovery Index](https://reader031.fdocuments.us/reader031/viewer/2022022413/58ef25b91a28ab40598b4623/html5/thumbnails/9.jpg)
We have mapped a variety of metadata standards and database schemas.
Generic schemas:
• schema.org
• DataCite
• RIF-CS
• DCAT
• PROV
• VoID
• Dublin Core
• etc.
Life/biomedical schemas:
• BioProject
• BioSample
• MINiML
• PRIDE-ml
• GA4GH metadata schema
• SRA XML
• CDISC SDM / BRIDG model
• etc.
We have aimed for maximum coverage of the use cases with a minimal number of data elements.
We do foresee that not all questions can be answered in full.
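The idea of mapping several schemas onto a small shared set of data elements can be sketched as follows. The core element names and the flat field maps are assumptions made for illustration; they are not the actual DataMed model or its mappings, and real schema.org and Dublin Core records are richer than the flat dictionaries used here.

```python
# Hypothetical core elements; NOT the actual DataMed metadata model.
CORE_ELEMENTS = ("identifier", "title", "creator", "description")

# Simplified, flat mapping from each schema's field name to a core element.
FIELD_MAPS = {
    "schema.org": {
        "identifier": "identifier",
        "name": "title",
        "creator": "creator",
        "description": "description",
    },
    "dublin_core": {
        "identifier": "identifier",
        "title": "title",
        "creator": "creator",
        "description": "description",
    },
}


def to_core(record, schema):
    """Project a source record onto the core elements; unmapped fields are dropped."""
    mapping = FIELD_MAPS[schema]
    core = {element: None for element in CORE_ELEMENTS}
    for source_field, value in record.items():
        element = mapping.get(source_field)
        if element is not None:
            core[element] = value
    return core


def missing_elements(core):
    """List the core elements the source record could not fill."""
    return [element for element in CORE_ELEMENTS if core[element] is None]


# Made-up example record using schema.org-style field names.
example = {"name": "Liver RNA-seq", "identifier": "example-001", "keywords": "liver"}
core = to_core(example, "schema.org")
print(core)
print(missing_elements(core))
```

The `missing_elements` helper mirrors the caveat above: with a deliberately small element set, some records will leave elements empty, so some questions can only be answered in part.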
![Page 10: NIH BD2K bioCADDIE DataMed: Data Discovery Index](https://reader031.fdocuments.us/reader031/viewer/2022022413/58ef25b91a28ab40598b4623/html5/thumbnails/10.jpg)
Prototype, model, mappings, documentation and more at https://biocaddie.org and https://github.com/biocaddie
Supported by the NIH grant 1U24 AI117966-01 to the University of California, San Diego