SNPs and Human Diseases XV November 14th, 2018 slides K… · 22 HCHS/SOL USA (Hispanics/Latinos)...

SNPs and Human Diseases XV, November 14th, 2018. Microbiome. Robert Kraaij, PhD, Erasmus MC, Internal Medicine

Page 1:

SNPs and Human Diseases XV

November 14th, 2018

Microbiome

Robert Kraaij, PhD

Erasmus MC, Internal Medicine

Page 2:

Metagenomics - terminology

the study of metagenomes

genetic material recovered from environmental samples

ecological community of microorganisms

symbiosis

commensal

mutualistic

parasitic

OPTION 1:

microbiota: community of microorganisms

microbiome: genomes of the microbiota

Page 3:

Metagenomics - terminology

the study of metagenomes

genetic material recovered from environmental samples

ecological community of microorganisms

symbiosis

commensal

mutualistic

parasitic

OPTION 2:

microbiota: collection of microorganisms

metagenome: genomes of the microbiota

microbiome: community of microorganisms and host

Page 4:

Microbiota: more than just bacteria…

Archaea

Bacteria

Protozoa

Viruses

human viruses

bacteriophages

Fungi

molds

yeasts

Page 6:

The human gut microbiota

- the forgotten organ

10^13 bacterial cells = 10^13 body cells

~10^6 bacterial genes vs ~20,000 human genes

many unique functions

involved in health and disease!

Page 7:

Density of microbiota increases along GI tract

stool (10^11 cells/ml)

Walter and Ley (2011) Annu Rev Microbiol.

Page 8:

Stool as ‘proxy’ of gut (distal colon) microbiota

collection

storage

[Figure: Bristol stool scale, type 1 to type 7]

profiling

metadata - Bristol stool scale

- Rotterdam Study RS-IV

- n = 836

Page 9:

Human microbiota: more than just the gut…

urine, stool, nose, tooth, eye, skin

Page 10:

PHENOTYPE

GENOTYPE

ENVIRONMENT

DIET

LIFE-STYLE

MICROBIOME

Microbiota

Page 11:

Gut microbiome and disease associations

obesity

Crohn’s disease

ulcerative colitis

eczema

asthma

diabetes

depression

etc

hype cycle

Page 12:

Overview

Microbiota profiling

Data analysis

Page 13:

Microbiota profiling

WHO ARE THEY?

WHAT DO THEY DO?

Page 14:

culture-based techniques

culturomics

16S rRNA marker gene

arrays (HITChip)

IS-pro

sequencing

microbiome array

shotgun sequencing (metagenomics)

Microbiota profiling

Page 15:

IS-pro™ profiling

16S-23S interspace (IS region)

taxonomy based on size differences

prokaryotic rRNA operon: 16S, 23S, 5S

Bacteroidetes

Firmicutes, Actinobacteria, Fusobacteria, Verrucomicrobia

Proteobacteria

IS region

FAFV

NCBI database - 16S – 23S rRNA

- 8990 entries

Budding et al. (2010) FASEB J.

[Figure: abundance vs fragment size (nt)]

Page 16:

culture-based techniques

culturomics

16S rRNA marker gene

arrays (HITChip)

IS-pro

sequencing

microbiome array

shotgun sequencing (metagenomics)

Microbiota profiling

Page 17:

16S ribosomal RNA gene amplicon

highly conserved in bacteria and archaea

species-independent PCR amplification

variable regions

taxonomic classification

16S rRNA

Page 18:

16S rRNA amplicons

prokaryotic rRNA operon: 16S, 23S, 5S

1500 bp: Oxford Nanopore long-read sequencing

IS region

~400 bp: Illumina MiSeq short-read sequencing

Page 20:

16S rRNA amplicon and sequencing

Fadrosh et al. (2014)

Illumina MiSeq

Page 21:

QIIME-based analysis pipeline

SILVA database, version 128 (Max Planck Institute for Marine Microbiology and Jacobs University, Bremen, Germany)

September 2016

8,430,487 entries

Read-pair merging

Q-score > 19

Chimera filtering

Sample QC

Reads > mean – 2 SD

OTU calling

Taxonomy

Phylogeny

OTU table (anonymous)

BIOM table (taxonomy)

Phylogenetic tree

OTU abundance filtering

> 0.005% of total reads

Caporaso et al. (2010)
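
A minimal sketch of the two count-based thresholds on this slide (sample QC at reads > mean – 2 SD, OTU abundance filtering at > 0.005% of total reads). This is not the QIIME code: the function names are hypothetical, the data layout is a toy one, and using the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

def filter_samples(reads_per_sample):
    """Sample QC: keep samples whose read count exceeds mean - 2 SD."""
    counts = list(reads_per_sample.values())
    cutoff = mean(counts) - 2 * stdev(counts)
    return {s: n for s, n in reads_per_sample.items() if n > cutoff}

def filter_otus(otu_counts, min_fraction=0.00005):
    """Abundance filter: keep OTUs with more than 0.005% of all reads.

    otu_counts maps an OTU id to its list of per-sample counts.
    """
    total = sum(sum(row) for row in otu_counts.values())
    return {otu: row for otu, row in otu_counts.items()
            if sum(row) > min_fraction * total}
```

Both functions return a filtered copy, so the unfiltered table stays available for comparison.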

Page 22:

Read-pair merging
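
A toy sketch of the idea: the reverse read is reverse-complemented and joined to the forward read at their overlap. Real mergers tolerate mismatches and use quality scores; exact matching, the `min_overlap` default and the function names here are assumptions for illustration only.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))

def merge_pair(forward, reverse, min_overlap=10):
    """Merge a read pair at the longest exact overlap; None if there is none."""
    rev_rc = reverse_complement(reverse)
    longest = min(len(forward), len(rev_rc))
    for k in range(longest, min_overlap - 1, -1):
        # Compare the last k bases of the forward read with the
        # first k bases of the reverse-complemented reverse read.
        if forward[-k:] == rev_rc[:k]:
            return forward + rev_rc[k:]
    return None
```

Pairs that return `None` would be dropped before the later pipeline steps.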

Page 23:

Chimera filtering

- chimeras are PCR artifacts

Page 24:

Chimera filtering

Query

Chunk Chunk Chunk Chunk

Ref DB

Hits

Query

A

Query

A

B

normal chimera

4x
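
The chunk-based check in this diagram can be sketched as follows: the query is cut into four chunks, each chunk is matched against the reference database, and a query whose chunks point to different references (A and B) is flagged. This is a toy under stated assumptions: references and query have equal length and are compared position by position, whereas real chimera checkers align the chunks. All names are hypothetical.

```python
def best_hit(chunk, start, references):
    """Name of the reference whose segment at `start` best matches the chunk."""
    def matches(ref_seq):
        segment = ref_seq[start:start + len(chunk)]
        return sum(a == b for a, b in zip(chunk, segment))
    return max(references, key=lambda name: matches(references[name]))

def classify_query(query, references, n_chunks=4):
    """Return ("normal" or "chimera", list of best hits per chunk)."""
    size = len(query) // n_chunks
    hits = []
    for i in range(n_chunks):
        start = i * size
        # The last chunk takes any leftover bases.
        end = len(query) if i == n_chunks - 1 else start + size
        hits.append(best_hit(query[start:end], start, references))
    label = "chimera" if len(set(hits)) > 1 else "normal"
    return label, hits
```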

Page 25:

OTU clustering

Operational taxonomic units (OTUs)

clustering of reads on the basis of sequence similarity (97%)

OTUs can be aligned to reference databases

unknown OTUs can still be used in analyses
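
A minimal sketch of greedy clustering at the 97% threshold: each read joins the first OTU whose representative it matches closely enough, otherwise it starts a new OTU. The identity measure is a toy (position-by-position on equal-length reads, no alignment) and the function names are hypothetical; this is not the clustering code of any particular tool.

```python
def identity(a, b):
    """Fraction of matching positions (toy measure, no alignment)."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))

def cluster_otus(reads, threshold=0.97):
    """Greedy clustering: returns (representatives, members per OTU)."""
    centroids = []  # one representative sequence per OTU
    members = []    # reads assigned to each OTU
    for read in reads:
        for i, centroid in enumerate(centroids):
            if identity(read, centroid) >= threshold:
                members[i].append(read)
                break
        else:
            # No existing OTU is similar enough: start a new one.
            centroids.append(read)
            members.append([read])
    return centroids, members
```

Because no reference database is involved, OTUs without a known match are still counted, which is the point of the last bullet on this slide.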

Page 26:

Closed reference calling

Each read is compared directly to the database

Database determines phylogenetic tree

Standardized taxonomy > allows for collaboration
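
A toy sketch of closed-reference calling: every read is compared directly to the database and counted under its best-matching reference, so all cohorts using the same database end up with the same OTU identifiers. The 97% threshold, the "unassigned" bucket, the position-by-position identity and all names are assumptions for illustration.

```python
from collections import Counter

def identity(a, b):
    """Fraction of matching positions (toy measure, no alignment)."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))

def closed_reference_counts(reads, database, threshold=0.97):
    """Count reads per reference id; reads without a good hit are 'unassigned'."""
    counts = Counter()
    for read in reads:
        best = max(database, key=lambda ref_id: identity(read, database[ref_id]))
        if identity(read, database[best]) >= threshold:
            counts[best] += 1
        else:
            counts["unassigned"] += 1
    return counts
```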

Page 27:

culture-based techniques

culturomics

16S rRNA marker gene

arrays (HITChip)

IS-pro

sequencing

microbiome array

shotgun sequencing (metagenomics)

Microbiota profiling

Page 28:

Affymetrix Axiom Microbiome array

Page 29:

culture-based techniques

culturomics

16S rRNA marker gene

arrays (HITChip)

IS-pro

sequencing

microbiome array

shotgun sequencing (metagenomics)

Microbiota profiling

Page 30:

Shotgun metagenomics

Flaws of 16S rRNA profiling

selection introduced by PCR amplification

no eukaryotic species such as fungi

phylotyping will not give insights into the gene functions of unknown species

Page 31:

Shotgun metagenomics

Direct sequencing of DNA

Page 32:

High output sequencing

2 x 100 bp

reads are too short for proper annotation

de novo assembly is preferred

need for compute power

2 x 100 bp paired reads → de novo assembly → ~1 kbp contigs

Page 33:

Metagenomics technology push

MetaHIT

European FP7 project

Human Microbiome Project (HMP)

NIH-sponsored project

Page 34:

Profiling of shotgun data

phylotyping databases

metagenomic species (MGS)

~7000 MGS specified

gene catalogue

8.1 million genes from 760 samples

functional databases

Page 35:

Phylotyping of shotgun data

Arumugam et al., 2011 MetaHIT

Page 36:

Functional analysis of shotgun data

Arumugam et al., 2011 MetaHIT

Page 37:

Taxonomic vs functional profiling

large taxonomic differences are not reflected in functional profiles

The Human Microbiome Project Consortium (2012)

Samples ordered by taxonomic profiles

Samples ordered by functional profiles

Page 38:

Profiling the gut microbiome

WHO ARE THEY?

16S TAXONOMY

METAGENOMICS

WHAT CAN THEY DO?

METAGENOMICS

WHAT ARE THEY DOING?

METATRANSCRIPTOMICS

METAPROTEOMICS

WHAT HAVE THEY DONE?

METABOLOMICS

Page 39:

Profiling the gut microbiome

Page 40:

Overview

Microbiota profiling

Data analysis

Page 41:

Complex multi-dimensional data

no normal or mean profile

enterotypes?

sparse data

many zero abundances

limited by technique

count data

dependent on technique

how to normalize?

compositional data

relative abundances add up to 1
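
A short sketch of why compositional data need care: dividing counts by the sample total forces the values to sum to 1, so they cannot vary independently. The centered log-ratio (CLR) transform shown here is one common answer to "how to normalize?"; it is not named on the slide, and the pseudocount of 0.5 used to handle zeros is an arbitrary assumption.

```python
import math

def relative_abundance(counts):
    """Scale counts so that they sum to 1 (closure)."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(counts, pseudocount=0.5):
    """Centered log-ratio: log of each value minus the mean log per sample."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]
```

CLR values sum to 0 within a sample, which removes the sum-to-1 constraint at the cost of making every value relative to the sample's geometric mean.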

Page 42:

Diversities

α-diversity

diversity within a sample

biological metric

number of species * evenness

β-diversity

diversity (distance or dissimilarity) between samples

UniFrac distances
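
A minimal sketch of both diversity types. For α-diversity it uses the Shannon index H = -Σ p·ln(p), which factors as H = J · ln(S) with S the number of observed species and J Pielou's evenness, in line with "number of species * evenness". For β-diversity it uses Bray-Curtis dissimilarity as a stand-in, because UniFrac needs a phylogenetic tree; that substitution and the function names are assumptions.

```python
import math

def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)

def pielou_evenness(counts):
    """Evenness J = H / ln(S), with S the number of observed taxa."""
    richness = sum(1 for c in counts if c > 0)
    return shannon(counts) / math.log(richness) if richness > 1 else 0.0

def bray_curtis(a, b):
    """Dissimilarity between two samples: 0 = identical, 1 = nothing shared."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / (sum(a) + sum(b))
```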

Page 43:

OTU table

OTU id     sample_01  sample_02  sample_03  sample_04  …
OTU_12     3          0          456        343
OTU_318    34         45         3          2
OTU_37     567        2134       478        675
…          …          …          …          …          …
Total      5,975      4,952      6,735      5,374

Page 44:

Rotterdam 16S rRNA datasets

Domain  Phylum  Class  Order  Family  Genus  OTUs
(2)     (11)    (18)   (24)   (43)    (183)  (777)
(1)     (7)     (15)   (19)   (36)    (152)  (661)

Shannon Diversity Index (two panels)

5 major phyla

N=2,111

N=156

N=1,427

N=1,135

N=1,106

Generation R Study

9-11 year-olds

Rotterdam Study

adults

Radjabzadeh et al. (2018) in preparation

Page 45:

Children vs adults - Generation R Study vs Rotterdam Study

N=2,111 N=1,427

GenR RS

[Figure: Shannon diversity index (axis range 2 to 8), GenR vs RS; difference marked ***]

average phylum-level profiles

Radjabzadeh et al. (2018) in preparation

Shannon alpha diversity

Page 46:

MiBioGen consortium

Meta-analyses of gut microbiome GWAS

> 20 cohorts (more still being included)

> 20,000 samples

16S rRNA profiling (Illumina)

226 genera

8M HRC1.1 imputed SNPs

NGRC

Traits

Shannon alpha-diversity

Binary trait (presence/absence)

Quantitative trait (abundance)

Beta-diversity
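
A hedged sketch of how a binary and a quantitative trait could be derived for one taxon. It uses the OTU_12 counts and sample totals from the OTU table slide; the slides do not give the consortium's exact trait definitions, so the rule "present if count > 0" and the log-transformed relative abundance are assumptions, as are the function names.

```python
import math

def binary_trait(counts):
    """Presence/absence per sample: 1 if the taxon was observed, else 0."""
    return [1 if c > 0 else 0 for c in counts]

def quantitative_trait(counts, totals):
    """Natural log of relative abundance; None where the taxon is absent."""
    return [math.log(c / t) if c > 0 else None
            for c, t in zip(counts, totals)]

otu_12 = [3, 0, 456, 343]            # counts from the OTU table slide
totals = [5975, 4952, 6735, 5374]    # sample totals from the same slide
presence = binary_trait(otu_12)
abundance = quantitative_trait(otu_12, totals)
```

Samples with `None` would simply be left out of the quantitative analysis for that taxon.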

Page 47:

MiBioGen consortium

# | Cohort name | Population | 16S region | Genotyping method | N | Description
1 | LLD | Netherlands (Caucasian) | V4 | Illumina Immunochip, Cytochip | 1089 | Representative of population
2 | NGRC | Netherlands (Caucasian) | V1-V2 | PsychChip (Broad Institute, Boston, USA) | 153 | Healthy group + ADHD group
3 | RS | Netherlands (Caucasian) | V3-V4 | Illumina 550k | 1427 | Representative of population
4 | GENR | Netherlands (multi-ethnic) | V3-V4 | Illumina 610k | 2111 | Representative of population
5 | NTR | Netherlands (Caucasian) | V4 | Affymetrix 6.0 | 499 | Twins
6 | MIBS_Co | Netherlands (Caucasian) | V4 | Illumina OmniExpressExome | 111 | Healthy volunteers
7 | FGFP | Belgium (Caucasian) | V4 | Illumina OmniExpress | 2482 | Representative of population
8 | SHIP | Germany (Caucasian) | V1-V2 | Affymetrix 6.0, Illumina OmniExpressExome, Exomechip | 1904 | Representative of population
9 | SHIP-TREND | Germany (Caucasian) | V1-V2 | Affymetrix 6.0, Illumina OmniExpressExome, Exomechip | (-) | Representative of population
10 | FOCUS | Germany (Caucasian) | V1-V2 | Illumina Immunochip, Exome | 1555 | Representative of population
11 | BSPSPC | Germany (Caucasian) | V1-V2 | Illumina 550K, Immunochip, Metabochip, Affymetrix 6.0, Axiom | 912 | Representative of population
12 | TwinsUK | UK (Caucasian) | V4 | HumanHap300, Hap610Q, 1M-Duo, 1.2M-Duo | 1793 | Twins
13 | CHRIS | Italy (Caucasian) | ? | ? | ? | ?
14 | COPSAC | Denmark (Caucasian) | V4 | Illumina OmniExpress | 424 | Representative of population
15 | POPCOL | Sweden (Caucasian) | V1-V2 | Illumina MiSeq | 250 | Representative of population
16 | METSIM | Finland (Caucasian) | V4 | Illumina OmniExpressExome | 531 | Representative of population
17 | PNP | Israel (Israeli) | V3-V4 | Metabolochip | 1066 | Healthy volunteers
18 | GEM_HCE_v12 | Canada, USA, Israel (Caucasian, Israeli) | V4 | Illumina HumanCoreExome, Immunochip | (-) | Healthy individuals
19 | GEM_HCE_v24 | Canada, USA, Israel (Caucasian, Israeli) | V4 | Illumina HumanCoreExome, Immunochip | (-) | Healthy individuals
20 | GEM_ICHIP_HCE | Canada, USA, Israel (Caucasian, Israeli) | V4 | Illumina HumanCoreExome, Immunochip | 1543 | Healthy individuals
21 | CARDIA | USA (Caucasian and African-American) | V3-V4 | Illumina Exome, Affymetrix 6.0 | 282 | Representative of population
22 | HCHS/SOL | USA (Hispanics/Latinos) | V4 | Illumina Omni Chip + custom array | 1778 | Representative of population
23 | KSCS | Korea (Asian) | V3-V4 | Illumina HumanCore BeadChips 12v | 833 | Representative of population

23 cohorts >21,000 samples

Page 48:

MiBioGen consortium

55 bacterial taxa (1,232 SNPs)

GWAS quantitative trait

226 genera

8M SNPs

Meta-analysis

MiBioGen consortium (2018) unpublished

LCT locus
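
The slide does not say which meta-analysis model was used. As an illustration only, the sketch below shows a standard inverse-variance fixed-effect meta-analysis, which combines per-cohort effect sizes for one SNP-taxon pair; the function name and the inputs are hypothetical.

```python
import math

def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effect estimate.

    Returns (pooled beta, pooled standard error, z-score).
    """
    weights = [1 / se ** 2 for se in standard_errors]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled_beta, pooled_se, pooled_beta / pooled_se
```

Cohorts with smaller standard errors (usually the larger cohorts) get more weight in the pooled estimate.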

Page 49:

Questions…
