Making Sense of Microbes - Jack Simpson

Making Sense of Microbes: Metagenomic Analysis Workflow. CSIRO COMPUTATIONAL INFORMATICS. Jack Simpson, 10 February 2014

Transcript of Making Sense of Microbes - Jack Simpson

Making Sense of Microbes: Metagenomic Analysis Workflow

CSIRO COMPUTATIONAL INFORMATICS

Jack Simpson

10 February 2014

Cells in the human body

Microbial: 100 trillion

H: 10 trillion

Human Microbiome Project

• Investigate impact on human health

• ~ 240 healthy individuals

• 18 microbial habitats (e.g. airways)

• > 5000 samples

• Taxonomic marker: 16S rRNA gene

OTU counts

[Figure: OTU counts across samples]

Metagenomics workflow

• Three areas I found interesting over the summer

• Background of the data

• Working with count data

• Finding associations

Where does metagenomic data start?

• Where did our counts and OTUs come from?

• Count data is not raw data: many processing decisions

• 16S rRNA gene resolution and primers

• Multiple variable regions

• Lab protocols and comparing projects

• Biological data starts in the real world

Working with count data

Zeroes: Much Ado About Nothing…?

• Does absence of evidence == evidence of absence?

• What do we do with zeroes?

• Remove them or replace them with pseudocounts?

• When to remove/replace?

• Merged at the class level: visualise and replace zeroes
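The zero-handling steps in the bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy table, the class labels, the helper names and the pseudocount of 0.5 are all hypothetical, and the talk does not say which tools or values were actually used.

```python
# Hypothetical toy table: OTU id -> (class label, counts per sample).
otu_table = {
    "otu1": ("Bacilli", [12, 0, 7]),
    "otu2": ("Bacilli", [0, 0, 3]),
    "otu3": ("Clostridia", [0, 5, 0]),
    "otu4": ("Unobserved", [0, 0, 0]),
}


def merge_by_class(table):
    """Sum the counts of all OTUs that share a class label."""
    merged = {}
    for label, counts in table.values():
        if label not in merged:
            merged[label] = [0] * len(counts)
        merged[label] = [a + b for a, b in zip(merged[label], counts)]
    return merged


def drop_all_zero(rows):
    """Remove rows that were never observed in any sample."""
    return {name: counts for name, counts in rows.items() if any(counts)}


def replace_zeroes(rows, pseudocount=0.5):
    """Replace the remaining zeroes with a small pseudocount (assumed value)."""
    return {
        name: [c if c > 0 else pseudocount for c in counts]
        for name, counts in rows.items()
    }


merged = merge_by_class(otu_table)   # fewer rows, fewer zeroes
kept = drop_all_zero(merged)         # rows with no evidence at all are dropped
replaced = replace_zeroes(kept)      # the rest become safe to log-transform

for name, counts in replaced.items():
    print(name, counts)
```

Merging first reduces the number of zeroes that need a decision at all; whether a remaining zero is a true absence or an undersampled presence is still a judgment call that the code cannot make.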

Heatmap of all counts

Beware hidden complexity

Heatmap of grouped counts

Zoom in on the heatmap

Processing the Data

• OTU count data analysis

• Dealt with zeroes

• Visualised the data

• Normalization and transformation: log or Aitchison’s CLR?

• What do different transformations do to the data with different sample numbers?

• See artefacts related to discretization and zeroes
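To make the choice between a plain log and Aitchison's centred log-ratio (CLR) concrete, here is a minimal sketch on a single made-up sample. The counts are hypothetical and assume zeroes have already been replaced; the function names are illustrative and not taken from the talk.

```python
import math


def closure(counts):
    """Rescale counts to proportions that sum to 1 (compositional form)."""
    total = sum(counts)
    return [c / total for c in counts]


def log_transform(values):
    """Plain natural-log transform; fails on zeroes, hence the earlier step."""
    return [math.log(v) for v in values]


def clr(values):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical sample with zeroes already replaced by a pseudocount.
sample = [12, 0.5, 10, 5]
proportions = closure(sample)

log_raw = log_transform(sample)
log_comp = log_transform(proportions)
clr_raw = clr(sample)
clr_comp = clr(proportions)

# Log of raw counts and log of proportions differ by one constant per sample.
print([round(r - c, 6) for r, c in zip(log_raw, log_comp)])
# CLR is unchanged by the rescaling, and its values sum to zero.
print([round(v, 6) for v in clr_raw])
print([round(v, 6) for v in clr_comp])
```

The per-sample constant is the log of the sample total, so log raw and log compositional plots differ whenever sequencing depth varies between samples, while CLR removes that dependence.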

Gut log compositional and log raw data

Gut log compositional & clr compositional

Finding associations

• Warning: compositional data!

• Be careful with correlation

• Fractions are not independent: the fixed total induces spurious negative correlation

• What can be done?

• Proportionality
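The warning about correlation can be illustrated with a small sketch. The abundances below are made up, and using the variance of the log-ratio as the measure of proportionality is an assumption about what the slide means; the talk does not give the exact statistic.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def log_ratio_variance(a, b):
    """Sample variance of log(a/b); near zero when a and b are proportional."""
    ratios = [math.log(u / v) for u, v in zip(a, b)]
    mean = sum(ratios) / len(ratios)
    return sum((r - mean) ** 2 for r in ratios) / (len(ratios) - 1)


def to_proportions(*parts):
    """Convert absolute abundances per sample into fractions of the total."""
    totals = [sum(sample) for sample in zip(*parts)]
    return [[value / t for value, t in zip(part, totals)] for part in parts]


# Two hypothetical taxa measured in four samples (absolute abundances).
a = [10, 40, 25, 70]
b = [33, 29, 35, 31]
frac_a, frac_b = to_proportions(a, b)
# With only two parts, one fraction is 1 minus the other, so the correlation
# of the fractions is forced to -1 whatever the absolute abundances did.
r_fractions = pearson(frac_a, frac_b)

# Three hypothetical taxa: y is always twice x, z is unrelated.
x = [10, 20, 40, 80]
y = [20, 40, 80, 160]
z = [300, 50, 120, 10]
px, py, pz = to_proportions(x, y, z)
# Ratios survive the conversion to fractions, so proportional pairs can
# still be detected from relative data alone.
lrv_xy = log_ratio_variance(px, py)
lrv_xz = log_ratio_variance(px, pz)

print(r_fractions, lrv_xy, lrv_xz)
```

The two-part case is the extreme version of the problem; with more parts the induced negative bias is weaker but still present, which is why ratio-based measures are attractive for relative data.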

Summary

• Metagenomic data background

• Processing our data

• Looking for associations the right way

Thank you!
