Data dialogue - Human Genomic Data Discovery

Human Genomic Data Discoverability
Fiona Nielsen – Data Dialogue, Cambridge – July 28th 2016

Transcript of Data dialogue - Human Genomic Data Discovery

Page 1: Data dialogue - Human Genomic Data Discovery

Human Genomic Data Discoverability
Fiona Nielsen – Data Dialogue, Cambridge – July 28th 2016

Page 2: Data dialogue - Human Genomic Data Discovery

The surge of genomics data

• High throughput technologies – biology is moving from the lab to the computer

[Figure: genomes sequenced per year, 2006 to 2015. 80+ PB sequenced every year.]

Page 3: Data dialogue - Human Genomic Data Discovery

Population sequencing projects

• For example, the 100,000 Genomes Project in the UK

Page 4: Data dialogue - Human Genomic Data Discovery

Where is the data?

• A researcher in human genomics knows of 4-5 data sources on average

The need to redefine data sharing: http://www.sciencedirect.com/science/article/pii/S2212066114000386

Page 5: Data dialogue - Human Genomic Data Discovery

Hundreds of data sources

• Content overview of 163 data sources

Assay Types

Dedicated to…

Page 6: Data dialogue - Human Genomic Data Discovery

Hundreds of data sources

• Sizes vary from tens to 100s of thousands of samples

[Figure: number of samples per data source, log10 scale]

Top 5:
• GEO (1.8M)
• PMI Cohort Program (1M)
• Auria Biopankki (1M)
• EGA (~0.6M)
• SRA (~0.5M)
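The chart on this slide uses a log10 axis because source sizes span several orders of magnitude. A minimal sketch of that scaling, using only the approximate top-five counts quoted on this slide (the helper name `log10_position` and the 20-sample comparison point are illustrative assumptions, not part of the talk):

```python
import math

# Approximate sample counts from the "Top 5" list on this slide.
top_sources = {
    "GEO": 1_800_000,
    "PMI Cohort Program": 1_000_000,
    "Auria Biopankki": 1_000_000,
    "EGA": 600_000,
    "SRA": 500_000,
}


def log10_position(samples: int) -> float:
    """Return where a sample count falls on a log10 axis (hypothetical helper)."""
    return math.log10(samples)


for name, samples in top_sources.items():
    print(f"{name:20s} {samples:>9,d}  log10 = {log10_position(samples):.2f}")

# Assumed comparison point: a small source with 20 samples.
# On a log10 axis it sits roughly five units below the largest source.
span = log10_position(top_sources["GEO"]) - log10_position(20)
print(f"axis span from 20 samples to GEO: {span:.2f}")
```

On a linear axis the small sources would be invisible next to GEO; on the log10 axis each step is a tenfold change, so both ends of the range stay readable.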

Page 7: Data dialogue - Human Genomic Data Discovery

Which populations are represented?

Aboriginals

African Americans

Africans

Australians

Chinese

Malays

Indians

Danish

Dutch

Estonian

Russian

European Ancestry

Finnish

Icelandic

Japanese

Korean

Latin Americans

Saudi

Swedish

Page 8: Data dialogue - Human Genomic Data Discovery

Where does the data come from?

[Figure: map of where the data comes from, with counts per region and an "International" category]

Interesting site to look at: http://omicsmaps.com/stats

Page 9: Data dialogue - Human Genomic Data Discovery

Why is some data not shared?

• Challenges for the international research community: How to work across borders and silos?

Page 10: Data dialogue - Human Genomic Data Discovery

Why is some data not shared?

• Additional challenges for biomedical research: data privacy, data governance, patient consent, medical legislation

Page 11: Data dialogue - Human Genomic Data Discovery

Also consider: Community-led resources

• patient groups, academia, the general public

Page 12: Data dialogue - Human Genomic Data Discovery

What needs to change?

• Increased data visibility and accessibility benefit both researchers and patients

Page 13: Data dialogue - Human Genomic Data Discovery

Pain points

FRAGMENTED: Poor visibility of available genomic data

ADMIN BURDEN: Huge overhead to manage data access

BAD CULTURE: Lack of data sharing habits in research culture

Page 14: Data dialogue - Human Genomic Data Discovery

Best practices

MAKE DATA DISCOVERABLE

SIMPLIFY WORKFLOWS

CONTRIBUTE TO COMMUNITY

DNAdigest and Repositive – Connecting the world of genomic data: http://journals.plos.org/plosbiology/article?id=10.1371%2Fjournal.pbio.1002418

Page 15: Data dialogue - Human Genomic Data Discovery

Panel discussion

• What are best practices for sharing difficult data?

FAIR data: Findable, Accessible, Interoperable, Reusable

Page 16: Data dialogue - Human Genomic Data Discovery

Translating and Commercialising Genomic Research
7-9 December 2016 | Wellcome Genome Campus, Hinxton, Cambridge, UK

Applications open soon!

Scientific programme committee:
• Emmanuelle Astoul, Wellcome Trust Sanger Institute, UK
• Fiona Nielsen, Repositive/DNAdigest, UK
• Abel Ureta-Vidal, Eagle Genomics, UK
• Ross Rounsevell, Wellcome Trust Sanger Institute, UK

Full details at: www.wellcomegenomecampus.org/coursesandconferences

Topics will include:
• Commercial opportunities arising from data aggregation
• Exploiting bioinformatics tools
• Externalising bioinformatics pipelines
• Translating biomarkers, genetic signatures or gene panels

Page 17: Data dialogue - Human Genomic Data Discovery

CEO Fiona Nielsen, [email protected]

Try our free platform for discovering human genomic data: http://repositive.io
Follow us on Twitter: @repositiveio