SNPs and Human Diseases XV

November 14th, 2018

Microbiome

Robert Kraaij, PhD

Erasmus MC, Internal Medicine

r.kraaij@erasmusmc.nl

Metagenomics - terminology

the study of metagenomes

genetic material recovered from environmental samples

ecological community of microorganisms

symbiosis

commensal

mutualistic

parasitic

OPTION 1:

microbiota community of microorganisms

microbiome genomes of the microbiota

OPTION 2:

microbiota collection of microorganisms

metagenome genomes of the microbiota

microbiome community of microorganisms and host

Microbiota: more than just bacteria…

Archaea

Bacteria

Protozoa

Viruses

human viruses

bacteriophages

Fungi

molds

yeasts

The human gut microbiota

- the forgotten organ

10^13 bacterial cells = 10^13 body cells

~10^6 bacterial genes vs ~20,000 human genes

many unique functions

involved in health and disease!

Density of microbiota increases along GI tract

stool (10^11 cells/ml)

Walter and Ley (2011) Annu Rev Microbiol.

Stool as ‘proxy’ of gut (distal colon) microbiota

collection

storage

profiling

metadata - Bristol stool scale (types 1 to 7)

- Rotterdam Study RS-IV

- n = 836

Human microbiota: more than just the gut…

urine stool

nose tooth

eye

skin

PHENOTYPE

GENOTYPE

ENVIRONMENT

DIET

LIFE-STYLE

MICROBIOME

Microbiota

Gut microbiome and disease associations

obesity

Crohn’s disease

ulcerative colitis

eczema

asthma

diabetes

depression

etc

hype cycle

Overview

Microbiota profiling

Data analysis

Microbiota profiling

WHO ARE THEY?

WHAT DO THEY DO?

culture-based techniques

culturomics

16S rRNA marker gene

arrays (HITChip)

IS-pro

sequencing

microbiome array

shotgun sequencing (metagenomics)

Microbiota profiling

IS-pro™ profiling

16S-23S interspace (IS region)

taxonomy based on size differences

prokaryotic

rRNA operon 16S 23S 5S

Bacteroidetes

Firmicutes, Actinobacteria,

Fusobacteria, Verrucomicrobia

Proteobacteria

IS region

FAFV

NCBI database - 16S – 23S rRNA

- 8990 entries

Budding et al. (2010) FASEB J.

Figure axes: abundance vs fragment size (nt)

16S ribosomal RNA gene amplicon

highly conserved in bacteria and archaea

species-independent PCR amplification

variable regions

taxonomic classification

16S rRNA

16S rRNA amplicons

prokaryotic

rRNA operon 16S 23S 5S

1500 bp: Oxford Nanopore long-read sequencing

IS region

~400 bp: Illumina MiSeq short-read sequencing

16S rRNA amplicon and sequencing

Fadrosh et al. (2014)

Illumina MiSeq

QIIME-based analysis pipeline

Silva database, version 128 (September 2016)

Max Planck Institute for Marine Microbiology and Jacobs University, Bremen, Germany

8,430,487 entries

Read-pair merging

Q-score > 19

Chimera filtering

Sample QC

Reads > mean – 2 SD

OTU calling

Taxonomy

Phylogeny

OTU table (anonymous)

BIOM table (taxonomy)

Phylogenetic tree

OTU abundance filtering

> 0.005% of total reads

Caporaso et al. (2010)
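The two numeric thresholds in the pipeline (samples with reads > mean minus 2 SD, OTUs with > 0.005% of total reads) can be illustrated with a minimal sketch. This is not the QIIME implementation. The count matrix, the function names, the use of the sample standard deviation (ddof=1), and the choice to compute the 0.005% over all reads in the dataset are assumptions made for illustration.

```python
import numpy as np

def sample_qc_mask(counts):
    """Boolean mask of samples (columns) whose total reads exceed mean - 2 SD."""
    totals = counts.sum(axis=0)
    # Assumption: sample standard deviation (ddof=1); the slides do not specify.
    return totals > totals.mean() - 2 * totals.std(ddof=1)

def otu_abundance_mask(counts, min_fraction=0.00005):
    """Boolean mask of OTUs (rows) holding more than 0.005% of all reads."""
    return counts.sum(axis=1) / counts.sum() > min_fraction

# Hypothetical OTU table: 3 OTUs (rows) x 10 samples (columns).
counts = np.array([
    [3000] * 9 + [30],   # abundant OTU; last sample is under-sequenced
    [2000] * 9 + [20],   # abundant OTU
    [1] + [0] * 9,       # OTU seen once in the whole dataset
])

kept_samples = sample_qc_mask(counts)   # drops the under-sequenced sample
after_qc = counts[:, kept_samples]
kept_otus = otu_abundance_mask(after_qc)  # drops the single-read OTU
filtered = after_qc[kept_otus, :]
print(filtered.shape)  # (2, 9)
```

In this toy table the last sample (50 reads) falls below the mean minus 2 SD cut-off, and the single-read OTU falls below 0.005% of all reads, so both are removed.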

Read-pair merging

Chimera filtering

- chimeras are PCR artifacts

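- the query read is split into 4 chunks

- each chunk is searched against the reference database (Ref DB)

- normal: the chunk hits all point to the same reference (A)

- chimera: the chunk hits point to different references (A and B)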
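The chunk-based idea in the chimera diagram (split the query, search each chunk against the reference database, compare the hits) can be sketched as a toy example. It is a simplification and not the algorithm of any specific chimera filter: the references are assumed to be pre-aligned and as long as the query, the hit is chosen by simple position-wise identity, and all names and sequences are hypothetical.

```python
def identity(a, b):
    """Fraction of positions in `a` that match `b` (position by position)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def chunk_hits(query, references, n_chunks=4):
    """Best-matching reference name for each chunk of the query."""
    size = len(query) // n_chunks
    hits = []
    for i in range(n_chunks):
        start = i * size
        end = len(query) if i == n_chunks - 1 else start + size
        chunk = query[start:end]
        # Assumption: references are aligned to the query, so each chunk is
        # compared with the reference segment at the same offset.
        best = max(references,
                   key=lambda name: identity(chunk, references[name][start:end]))
        hits.append(best)
    return hits

def looks_chimeric(query, references, n_chunks=4):
    """Flag the query when its chunks do not all hit the same reference."""
    return len(set(chunk_hits(query, references, n_chunks))) > 1

# Hypothetical reference database with two parent sequences.
references = {
    "A": "ACGTACGTACGTACGTACGT",
    "B": "TTGGCCAATTGGCCAATTGG",
}
normal = references["A"]
chimera = references["A"][:10] + references["B"][10:]  # A head, B tail

print(chunk_hits(normal, references))   # ['A', 'A', 'A', 'A']
print(chunk_hits(chimera, references))  # ['A', 'A', 'B', 'B']
```

A real filter also scores how well the two-parent model explains the query compared with the best single parent; that step is left out here.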

OTU clustering

Operational taxonomic units (OTUs)

clustering of reads on the basis of sequence similarity (97% identity)

OTUs can be aligned to reference databases

unknown OTUs can still be used in analyses
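Clustering reads at 97% identity can be illustrated with a greedy, centroid-based toy sketch. It assumes equal-length reads compared position by position, which real OTU pickers do not require (they align the reads first); the function names and example reads are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length reads."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cluster_otus(reads, threshold=0.97):
    """Greedy clustering: a read joins the first centroid it matches at
    >= threshold identity, otherwise it founds a new OTU."""
    centroids = []
    assignment = []
    for read in reads:
        for idx, centroid in enumerate(centroids):
            if identity(read, centroid) >= threshold:
                assignment.append(idx)
                break
        else:
            # No existing centroid is similar enough: start a new OTU.
            centroids.append(read)
            assignment.append(len(centroids) - 1)
    return centroids, assignment

# Hypothetical 100-nt reads.
base = "A" * 100
reads = [
    base,                   # founds OTU 0
    "C" + "A" * 99,         # 99% identical to OTU 0 centroid
    "C" * 4 + "A" * 96,     # 96% identical to OTU 0 centroid: founds OTU 1
    "C" * 5 + "A" * 95,     # 99% identical to OTU 1 centroid
]
centroids, assignment = cluster_otus(reads)
print(assignment)  # [0, 0, 1, 1]
```

Because the result of greedy clustering depends on read order, real pipelines typically sort reads (for example by abundance) before clustering.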

Closed reference calling

Each read is compared directly to the database

Database determines phylogenetic tree

Standardized taxonomy, which allows for collaboration

Affymetrix Axiom Microbiome array

Shotgun metagenomics

Flaws of 16S rRNA profiling

selection introduced by PCR amplification

no eukaryotic species such as fungi

phylotyping will not give insights into

the gene functions of unknown species

Shotgun metagenomics

Direct sequencing of DNA

High output sequencing

2 x 100 bp

reads are too short for proper annotation

de novo assembly is preferred

need for compute power

2 x 100 bp

paired-reads de novo assembly ~1 kbp contigs

Metagenomics technology push

MetaHIT

European FP7 project

Human Microbiome Project (HMP)

NIH-sponsored project

Profiling of shotgun data

phylotyping databases

metagenomic species (MGS)

~7000 MGS specified

gene catalogue

8.1 million genes from 760 samples

functional databases

Phylotyping of shotgun data

Arumugam et al., 2011 MetaHIT

Functional analysis of shotgun data

Arumugam et al., 2011 MetaHIT

Taxonomic vs functional profiling

large taxonomic differences are not reflected in functional profiles

The Human Microbiome Project Consortium (2012)

Samples ordered by taxonomic profiles

Samples ordered by functional profiles

Profiling the gut microbiome

WHO ARE THEY?

16S TAXONOMY

METAGENOMICS

WHAT CAN THEY DO?

METAGENOMICS

WHAT ARE THEY DOING?

METATRANSCRIPTOMICS

METAPROTEOMICS

WHAT HAVE THEY DONE?

METABOLOMICS

Overview

Microbiota profiling

Data analysis

Complex multi-dimensional data

no normal or mean profile

enterotypes?

sparse data

many zero abundances

limited by technique

count data

dependent on technique

how to normalize?

compositional data

relative abundances add up to 1

Diversities

α-diversity

diversity within a sample

biological metric

number of species * evenness

β-diversity

diversity (distance or dissimilarity) between samples

UniFrac distances
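Both kinds of diversity can be computed directly from OTU counts. The sketch below uses the Shannon index for α-diversity and, as a stand-in for β-diversity, the Bray-Curtis dissimilarity. Bray-Curtis is swapped in here because UniFrac additionally needs the phylogenetic tree. The natural logarithm and the example counts are assumptions.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples' OTU count vectors."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / (sum(a) + sum(b))

# Hypothetical samples with the same number of OTUs but different evenness.
even_sample = [25, 25, 25, 25]
dominated_sample = [97, 1, 1, 1]

print(shannon_index(even_sample) > shannon_index(dominated_sample))  # True
print(bray_curtis(even_sample, dominated_sample))
```

For a perfectly even sample with S OTUs the Shannon index equals ln(S); uneven samples score lower, which is one way to read "number of species * evenness".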

OTU table

OTU id    sample_01  sample_02  sample_03  sample_04  …
OTU_12            3          0        456        343
OTU_318          34         45          3          2
OTU_37          567       2134        478        675
…                 …          …          …          …
Total         5,975      4,952      6,735      5,374
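Because sequencing depth differs per sample, counts are usually compared as relative abundances (count divided by the sample total). A minimal sketch using only the three OTUs and the totals visible in the table above; the variable names are hypothetical, and the elided rows are not filled in.

```python
# Per-sample totals and the three visible OTU rows from the table.
samples = ["sample_01", "sample_02", "sample_03", "sample_04"]
totals = [5975, 4952, 6735, 5374]
visible_counts = {
    "OTU_12": [3, 0, 456, 343],
    "OTU_318": [34, 45, 3, 2],
    "OTU_37": [567, 2134, 478, 675],
}

# Relative abundance = OTU count / total reads in that sample.
relative = {
    otu: [count / total for count, total in zip(counts, totals)]
    for otu, counts in visible_counts.items()
}

for otu, values in relative.items():
    print(otu, [round(v, 4) for v in values])
```

The three visible rows do not add up to 1 per sample, because the totals also include the OTUs elided from the table.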

Rotterdam 16S rRNA datasets

Domain (2), Phylum (11), Class (18), Order (24), Family (43), Genus (183), OTUs (777)

Domain (1), Phylum (7), Class (15), Order (19), Family (36), Genus (152), OTUs (661)

Shannon Diversity Index

5 major phyla

N=2,111

N=156

N=1,427

N=1,135

N=1,106

Generation R Study

9-11 year-olds

Rotterdam Study

adults

Radjabzadeh et al. (2018) in preparation

Children vs adults - Generation R Study vs Rotterdam Study

GenR (N=2,111) vs RS (N=1,427)

Figure: Shannon diversity index (axis range 2 to 8), GenR vs RS, difference marked ***

average phylum-level profiles

Radjabzadeh et al. (2018) in preparation

Shannon alpha diversity

MiBioGen consortium

Meta-analyses of gut microbiome GWAS

> 20 cohorts (inclusion ongoing)

> 20,000 samples

16S rRNA profiling (Illumina)

226 genera

8M HRC1.1 imputed SNPs

NGRC

Traits

Shannon alpha-diversity

Binary trait (presence/absence)

Quantitative trait (abundance)

Beta-diversity
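The binary and quantitative traits can both be derived from one taxon's counts. The sketch below is only an illustration: testing the quantitative trait on the natural log of relative abundance in carriers only (zeros set to missing) is an assumption, not a description of the consortium's protocol, and the counts are hypothetical.

```python
import math

def binary_trait(counts):
    """Presence/absence: 1 if the taxon was detected in the sample, else 0."""
    return [1 if c > 0 else 0 for c in counts]

def quantitative_trait(counts, totals):
    """Log relative abundance where the taxon is present, None (missing) where absent.

    Assumption: zeros are excluded rather than replaced by a pseudocount.
    """
    return [math.log(c / t) if c > 0 else None for c, t in zip(counts, totals)]

# Hypothetical counts of one genus in four samples, with per-sample read totals.
genus_counts = [3, 0, 456, 343]
sample_totals = [5975, 4952, 6735, 5374]

presence = binary_trait(genus_counts)
abundance = quantitative_trait(genus_counts, sample_totals)
print(presence)   # [1, 0, 1, 1]
print(abundance)
```

Splitting the trait this way keeps the many zero abundances out of the quantitative model while still using them in the presence/absence analysis.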

MiBioGen consortium

# | Cohort name | Population | 16S domain | Genotyping method | N | Description
1 | LLD | Netherlands (Caucasian) | V4 | Illumina Immunochip, Cytochip | 1089 | Representative of population
2 | NGRC | Netherlands (Caucasian) | V1-V2 | PsychChip (Broad Institute, Boston, USA) | 153 | Healthy group + ADHD group
3 | RS | Netherlands (Caucasian) | V3-V4 | Illumina 550k | 1427 | Representative of population
4 | GENR | Netherlands (multi-ethnic) | V3-V4 | Illumina 610k | 2111 | Representative of population
5 | NTR | Netherlands (Caucasian) | V4 | Affymetrix 6.0 | 499 | Twins
6 | MIBS_Co | Netherlands (Caucasian) | V4 | Illumina OmniExpressExome | 111 | Healthy volunteers
7 | FGFP | Belgium (Caucasian) | V4 | Illumina OmniExpress | 2482 | Representative of population
8 | SHIP | Germany (Caucasian) | V1-V2 | Affymetrix 6.0, Illumina OmniExpressExome, Exomechip | 1904 | Representative of population
9 | SHIP-TREND | Germany (Caucasian) | V1-V2 | Affymetrix 6.0, Illumina OmniExpressExome, Exomechip | (-) | Representative of population
10 | FOCUS | Germany (Caucasian) | V1-V2 | Illumina Immunochip, Exome | 1555 | Representative of population
11 | BSPSPC | Germany (Caucasian) | V1-V2 | Illumina 550K, Immunochip, Metabochip, Affymetrix 6.0, Axiom | 912 | Representative of population
12 | TwinsUK | UK (Caucasian) | V4 | HumanHap300, Hap610Q, 1M-Duo, 1.2M-Duo | 1793 | Twins
13 | CHRIS | Italy (Caucasian) | ? | ? | ? | ?
14 | COPSAC | Denmark (Caucasian) | V4 | Illumina OmniExpress | 424 | Representative of population
15 | POPCOL | Sweden (Caucasian) | V1-V2 | Illumina MiSeq | 250 | Representative of population
16 | METSIM | Finland (Caucasian) | V4 | Illumina OmniExpressExome | 531 | Representative of population
17 | PNP | Israel (Israeli) | V3-V4 | Metabolochip | 1066 | Healthy volunteers
18 | GEM_HCE_v12 | Canada, USA, Israel (Caucasian, Israeli) | V4 | Illumina HumanCoreExome, Immunochip | (-) | Healthy individuals
19 | GEM_HCE_v24 | Canada, USA, Israel (Caucasian, Israeli) | V4 | Illumina HumanCoreExome, Immunochip | (-) | Healthy individuals
20 | GEM_ICHIP_HCE | Canada, USA, Israel (Caucasian, Israeli) | V4 | Illumina HumanCoreExome, Immunochip | 1543 | Healthy individuals
21 | CARDIA | USA (Caucasian and African-American) | V3-V4 | Illumina Exome, Affymetrix 6.0 | 282 | Representative of population
22 | HCHS/SOL | USA (Hispanics/Latinos) | V4 | Illumina Omni Chip + custom array | 1778 | Representative of population
23 | KSCS | Korea (Asian) | V3-V4 | Illumina HumanCore BeadChips 12v | 833 | Representative of population

23 cohorts, > 21,000 samples

MiBioGen consortium

55 bacterial taxa (1,232 SNPs)

GWAS quantitative trait

226 genera

8M SNPs

Meta-analysis

MiBioGen consortium (2018) unpublished

LCT locus

Questions…
