Microbiome 16S Analysis: A Quick-Start Guide
Amanda Birmingham
Center for Computational Biology & Bioinformatics
University of California at San Diego

Page 1: Microbiome 16S Analysis: A Quick-Start Guidecompbio.ucsd.edu/wp-content/uploads/...tutorial_non-interactive.pdf · Microbiome 16S Analysis: A Quick-Start Guide ... Alternating lecture


Agenda:
• Rapid introduction to 16S microbiome studies
• Summary of analysis steps and software tools
• Minimal instruction on compute environment
• Practicum on 16S analysis with QIIME 2
  ◦ Alternating lecture and tutorial
    § Goal: any topic I’ve lectured about, you will get to test live (even if we don’t finish all topics)
• Notice an emphasis on speed :)
  ◦ Red dot on slide means I won’t be covering it in depth
  ◦ Considerable additional material is described in the Supplemental Slides

Marker Gene Metagenomics Basics
• Approach: PCR amplicons of a conserved constitutive gene (a "marker gene") to determine identity and abundance of microbes present
  ◦ Usually the “conserved constitutive gene” of choice is an rRNA
    § For bacteria/archaea, usually the 16S: small sub-unit (SSU) of ribosome
      − Excludes eukaryotic DNA, as eukaryotes’ SSU is 18S
• 16S rRNA is widely conserved across bacteria/archaea (so shared primer sites)
  ◦ But also has 9 hypervariable regions
    § Can be used to ID different “species” and build phylogenetic trees
• Can’t study fungi with 16S (they don’t have it) nor 18S (evolves too slowly)
  ◦ Internal transcribed spacer (ITS) is the standard fungal marker gene; 28S also used
[Figure: map of the 16S gene with axis ticks at 100, 500, 1000, 1500 and hypervariable regions V1 through V9 marked]

When to Use Marker Gene Metagenomics
• When your sample is MOSTLY made up of host DNA, e.g. tumor samples
  ◦ Shotgun reads will also be mostly host DNA, with few left over for the microbes
  ◦ Use 16S rRNA instead, as the primers exclude eukaryotic DNA from amplification
• When you’re cheap :)
• The good news:
  ◦ Target gene studies are slightly cheaper to prep and sequence than shotgun ones
  ◦ Analysis software is mature, and many studies can be analyzed on a laptop
  ◦ Known taxa can be detected with very low (100s of reads) sequence depth
• The bad news:
  ◦ No target gene distinguishes all microbes well
    § And, for a given gene, no primer pair distinguishes all microbes well
  ◦ No other genome information (outside target gene) is captured
Image modified from Morgan & Huttenhower (2012). PLoS Comput Biol 8(12):e1002808.

Common Issues in Marker Gene Studies
• Neglecting metadata
  ◦ Analysis cannot test for effects of, or discard bias from, features you didn’t record!
• Picking novel 16S primers: not all created equal
  ◦ Earth Microbiome Project recommends 515f-806r primers, error-correcting barcodes
• Not taking precautions to support amplicon sequencing
  ◦ Some Illumina machines require high PhiX, low cluster density

Marker Gene Analysis Workflow
• Most critical analysis choices:
  ◦ Whether to do OTU picking or error correction
  ◦ What α and β metrics to pick
    § Some are phylogenetically aware, some aren’t
[Figure: workflow diagram. Inputs: sequence file(s), metadata spreadsheet. Steps: pre-process (remove primers, demultiplex, quality filter); pick features & representative sequences; assign taxonomy; align sequences & infer phylogeny; calculate α diversity; calculate β diversity; visualize; test differences for significance. Products: cleaned sequence file, feature table (taxon abundance), phylogenetic tree, figures, p-values.]

Software Selection
• Google “16S analysis <program name>”
  ◦ Main contenders are Mothur and QIIME
  ◦ Both widely used
  ◦ Both pride themselves on quality of support
• Mothur
  ◦ Name: not an acronym (play on DOTUR, SONS)
  ◦ Philosophy: single piece of re-implemented software
  ◦ Top pro: easy to install
  ◦ Top con: re-implementations could be buggy
  ◦ Language: C++
  ◦ Model: open-source
  ◦ License: GPL
  ◦ Published: 2009
  ◦ Developed: at U Michigan
• QIIME
  ◦ Name: Quantitative Insights Into Microbial Ecology
  ◦ Philosophy: wrapper of best-in-class software
  ◦ Top pro: extremely flexible
  ◦ Top con: QIIME 2 not yet feature-complete
  ◦ Language: Python (wrapper)
  ◦ Model: open-source
  ◦ License: mixed
  ◦ Published: 2010
  ◦ Developed: at UCSD, NAU
• Will discuss only QIIME in this tutorial
• QIIME 1 vs QIIME 2
  ◦ QIIME 1 won’t be supported after end of 2017
  ◦ QIIME 2 not yet feature-complete
    § But already much easier to use!
  ◦ This tutorial uses QIIME 2 only
• I’m not a QIIME 2 developer
  ◦ I’m not taking credit for this tool, just demonstrating it!

Getting the Software & Data
• Not covered in this tutorial, for sake of time
• QIIME 2 is very easy to install with the Conda environment- and package-manager
  ◦ Conda is also very easy to install, in either the Miniconda or Anaconda version
  ◦ Once Conda is installed, QIIME 2 install is one line, e.g. for Linux 64-bit:

    conda create -n qiime2-2017.6 --file https://data.qiime2.org/distro/core/qiime2-2017.6-conda-linux-64.txt

• Data acquisition method is project-specific
  ◦ Public data can often be pulled down from internet with wget or curl commands
  ◦ Sequencing data from a core usually available by ftp
  ◦ If all else fails, use a flash drive :)

Get Ready To Practice!
• “Why are you making me type?!”
  ◦ QIIME 2 has a GUI, but it is still very much under development
  ◦ QIIME 2 command-line interface is easy to install and ready to run
  ◦ Emphasize typing rather than copy/pasting commands because in your real analyses, you will need to type in the appropriate commands for your data
    § Need to make realistic typing mistakes now so you know how to correct them later!
• Will be working in shell on the Ubuntu Linux operating system
  ◦ Terminal on Mac OS X is very similar, Windows ISN’T
    § Most bioinformatics software doesn’t support Windows
    § Use virtual machine or Cygwin
• Note that you will be training on unusually tractable data
  ◦ Beautiful, clear clustering, significant p-values, etc.
  ◦ If your own data don’t give such clear results, that doesn't mean the analysis is wrong

Tips to Help
• When typing a file name or directory path, you can use tab completion
  ◦ Start typing file/directory path, then hit tab: if only one file/directory matches what you already typed, shell fills that in
    § Very helpful for correctly entering long file names
    § If >1 matches, shell fills in as much as it can
• Press up arrow to get back previous commands you typed
• If you type a command, press enter, and “nothing happens”, don’t just run it again
  ◦ Many unix commands produce no visible output to shell; you just get back the command prompt
  ◦ That doesn’t mean they do nothing, so running them *again* can screw up results
  ◦ Do not store commands in a word processing program (or PowerPoint, etc)
    § E.g., MS Word changes hyphens to em dashes, which the command line can’t understand
  ◦ Shell commands are case-sensitive
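As an illustration of the word-processor tip, here is a minimal Python sketch that flags characters a word processor may have substituted into a command. The function name and the list of suspect characters are assumptions made for this example; they are not part of QIIME 2 or the shell.

```python
# Characters that word processors commonly substitute for plain ASCII ones.
# This list is illustrative, not exhaustive.
SUSPECT_CHARS = {
    "\u2013": "en dash",
    "\u2014": "em dash",
    "\u2018": "left single quote",
    "\u2019": "right single quote",
    "\u201c": "left double quote",
    "\u201d": "right double quote",
}


def find_suspect_chars(command):
    """Return (position, description) for each suspect character in command."""
    return [
        (i, SUSPECT_CHARS[ch])
        for i, ch in enumerate(command)
        if ch in SUSPECT_CHARS
    ]


# A command pasted from a word processor (en dash before "n") vs. a typed one.
pasted = "conda create \u2013n qiime2-2017.6"
typed = "conda create -n qiime2-2017.6"

print(find_suspect_chars(pasted))  # [(13, 'en dash')]
print(find_suspect_chars(typed))   # []
```

Keeping commands in a plain-text editor avoids the problem in the first place; a check like this is only a safety net.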

Pre-Acknowledgment
• Please join me in thanking Pedro Fernandes
• Without his impeccably managed training computers, resources, and room, these tutorials would not be possible!

Making a Mapping File
• “Mapping file” contains metadata for study
  ◦ Must contain info needed to process sequences and test YOUR hypotheses
• QIIME 1 required certain columns in certain order, but QIIME 2 is more flexible
  ◦ Tab-separated text file with column labels in first line + at least one data line
    § Column label values must be unique (i.e. no duplicate values)
  ◦ First column is the “identifier” column (sample ID)
    § All values in the first column must be unique (i.e. no duplicate values)
  ◦ See https://docs.qiime2.org/2017.6/tutorials/metadata/
• The easiest way to make a mapping file is with a spreadsheet
  ◦ But Excel is not your friend!
    § Routinely corrupts gene symbols, anything interpreted as a date, etc, & the change isn’t reversible
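The three mapping-file rules listed on this slide (a header line plus at least one data line, unique column labels, unique identifiers in the first column) can be checked mechanically. The following Python sketch is one hypothetical way to do so; the function name, the messages, and the example column names other than BarcodeSequence are assumptions for illustration, and this is not how QIIME 2 or Keemei validates metadata.

```python
import csv
import io


def check_mapping_file(text):
    """Return a list of problems found in tab-separated mapping file text."""
    # Parse tab-separated lines, skipping completely empty ones.
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    problems = []
    if len(rows) < 2:
        problems.append("need a header line plus at least one data line")
        return problems
    header = rows[0]
    if len(set(header)) != len(header):
        problems.append("duplicate column labels")
    # The first column is treated as the identifier (sample ID) column.
    ids = [row[0] for row in rows[1:]]
    if len(set(ids)) != len(ids):
        problems.append("duplicate sample IDs")
    return problems


good = "SampleID\tBarcodeSequence\tBodySite\nS1\tAGCT\tgut\nS2\tTTGA\tpalm\n"
bad = "SampleID\tBodySite\tBodySite\nS1\tgut\tgut\nS1\tpalm\tpalm\n"

print(check_mapping_file(good))  # []
print(check_mapping_file(bad))   # ['duplicate column labels', 'duplicate sample IDs']
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the text in.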

Practicum: Viewing A Mapping File
• Open Terminal
  ◦ For below, remember to try tab completion!

    source activate qiime2-2017.6
    cd tutorial-qiime2
    ls
    nano sample-metadata.tsv

• Turn on your green light when the mapping file opens for you
[Figure: Mapping File View]
• Stretch the window so you can look at the contents; then, to close, type Ctrl + x
• Mapping file errors can lead to QIIME 2 errors or, worse, garbage results!
  ◦ Keemei (pronounced ‘key may’) tool checks for errors in Google Sheets
    § Chrome only, and must have Google account to use
    § See http://keemei.qiime.org/

Importing Data
• After sequence data is on your machine, it must be imported to a QIIME 2 “artifact”
  ◦ Artifact = data + metadata
  ◦ QIIME 2 artifacts have extension .qza
• Different input commands for
  ◦ Different kinds of input data (e.g., single-end vs paired-end)
  ◦ Different formats of input data (e.g., sequences & barcodes in same or different file)
• See “Importing data” tutorial at https://docs.qiime2.org/
https://docs.qiime2.org/2017.6/concepts/#data-files-qiime-2-artifacts

Practicum: Importing Data
• A backslash \ is used to break up a command onto multiple lines
  ◦ If you prefer to type the whole command as one run-on line, you can leave it out

    qiime tools import \
      --type EMPSingleEndSequences \
      --input-path emp-single-end-sequences \
      --output-path emp-single-end-sequences.qza

• What does this command actually do?
  ◦ Tells qiime to look into the folder emp-single-end-sequences …
  ◦ For the kind of sequence files expected for EMPSingleEndSequences ...
  ◦ And load them into a new qiime artifact named emp-single-end-sequences.qza
• Note structure of arguments to qiime command
  ◦ Plugin name then method name then arguments
    § Order matters

Demultiplexing
• Must assign resulting sequences to samples to analyze
• You may not need to do this!
  ◦ If sequencing done by a core, results may be demultiplexed before returned to you
QIIME 2, https://qiime2.org.
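Conceptually, demultiplexing means looking up each read's barcode in the mapping file and filing the read under the matching sample. The Python sketch below shows that idea with exact barcode matching only. The function and variable names are hypothetical, and real tools also handle error-correcting barcodes, which this sketch does not attempt.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group (barcode, sequence) pairs by sample.

    Reads whose barcode is not in the mapping are returned separately.
    """
    per_sample = defaultdict(list)
    unassigned = []
    for barcode, sequence in reads:
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned.append(sequence)
        else:
            per_sample[sample].append(sequence)
    return dict(per_sample), unassigned


# Toy barcodes and reads, made up for illustration.
barcodes = {"AAGG": "gut-1", "CCTT": "palm-1"}
reads = [("AAGG", "ACGTAC"), ("CCTT", "TTGACA"), ("AAGG", "ACGTAA"), ("GGGG", "NNNNNN")]

assigned, unassigned = demultiplex(reads, barcodes)
print(assigned)    # {'gut-1': ['ACGTAC', 'ACGTAA'], 'palm-1': ['TTGACA']}
print(unassigned)  # ['NNNNNN']
```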

Practicum: Demultiplexing

    qiime demux emp-single \
      --i-seqs emp-single-end-sequences.qza \
      --m-barcodes-file sample-metadata.tsv \
      --m-barcodes-category BarcodeSequence \
      --o-per-sample-sequences demux.qza

• Arguments have a naming convention
  ◦ Inputs (--i-<whatever>), metadata (--m-<whatever>), parameter (--p-<whatever>), output (--o-<whatever>)
  ◦ Order doesn’t matter

Practicum: Demultiplexing (cont.)
• Presumably you’d like to know how your demultiplexing worked
• QIIME 2 artifact files can’t be viewed directly (e.g., in nano)
• New concept: QIIME 2 visualization file
  ◦ Has .qzv extension
  ◦ Is intended for human (rather than computer) use
  ◦ Generally provides info via a web browser

Practicum: Demultiplexing (cont.)

    qiime demux summarize \
      --i-data demux.qza \
      --o-visualization demux.qzv

• Now view the visualization, locally

    qiime tools view demux.qzv

• When done examining, in Terminal, type JUST q
  ◦ Don’t need to hit Enter afterwards
  ◦ Beware: quitting visualization doesn’t close web page (but page becomes unreliable)

Quality Control
QIIME defaults:
• r = 3
• q = 3
• p = 0.75
• n = 0
• c = 0.005% or 2
Bokulich, N. et al. (2013). Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing. Nat Methods, 10(1), 57–59.

Practicum: Quality Control

    qiime quality-filter q-score \
      --i-demux demux.qza \
      --o-filtered-sequences demux-filtered.qza \
      --o-filter-stats demux-filter-stats.qza

• This runs the command with default values for all the tuneable parameters
  ◦ To see the optional parameters, their descriptions, and their defaults, run just

    qiime quality-filter q-score

    qiime quality-filter visualize-stats \
      --i-filter-stats demux-filter-stats.qza \
      --o-visualization demux-filter-stats.qzv

• Remember: for the remainder of this tutorial, any time you create a visualization file, you will need to run an additional command to view it!

    qiime tools view yourvisualizationfilename.qzv

Feature Table Creation: The Past
• Last year: OTU (Operational Taxonomic Unit)
  ◦ “an operational definition of a species used when only DNA sequence data is available”
  ◦ Sequences at/above a given similarity threshold considered part of the same OTU
    § 97% is the usual “species-level” threshold
      − Similarity determined using alignment (time-consuming)
    § Purpose is to minimize impact of sequencing errors
      − But also masks fine (sub-OTU) variation in real biological sequences
  ◦ Results very difficult to compare across studies if done de novo
    § “Closed reference”, “open reference” methods increase comparability but require a reference database
• Output is a “feature table”:
  ◦ Rows are samples
  ◦ Columns are OTUs (arbitrary identifiers if de novo, from reference database if closed reference)
  ◦ Values are frequency of reads from that OTU in that sample

Feature Table Creation: The Present
• This year: sOTU (sub-OTU) methods
  ◦ Use error modeling to correct sequencing mistakes in silico
    § Sounds impossible but is actually quite accurate, with right error model
      − Error model is specific to the sequencing type (e.g., 454, Illumina Hi/MiSeq)
  ◦ Result: only sequences likely to have been input to the sequencer
  ◦ Options include (NOT a complete list):
    § DADA2 (2016)
    § Deblur (2017)
• Output is STILL a feature table:
  ◦ Rows are samples
  ◦ Columns are SEQUENCES
  ◦ Values are frequency of reads from that SEQUENCE in that sample
QIIME 2, https://qiime2.org.
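To make the shape of a sequence-based feature table concrete, the Python sketch below counts exact sequences per sample and lays them out with samples as rows and sequences as columns. It assumes the reads have already been error-corrected and skips the error modeling entirely. The function name and toy reads are made up for illustration.

```python
from collections import Counter


def build_feature_table(sample_reads):
    """Return (features, table) where table[sample][i] counts reads equal to features[i]."""
    counts = {sample: Counter(reads) for sample, reads in sample_reads.items()}
    # Columns: every distinct sequence seen in any sample, in sorted order.
    features = sorted({seq for c in counts.values() for seq in c})
    # Rows: one list of counts per sample (a Counter returns 0 for unseen sequences).
    table = {sample: [c[f] for f in features] for sample, c in counts.items()}
    return features, table


sample_reads = {
    "gut": ["ACGT", "ACGT", "TTGA"],
    "palm": ["TTGA", "GGCC"],
}
features, table = build_feature_table(sample_reads)
print(features)  # ['ACGT', 'GGCC', 'TTGA']
print(table)     # {'gut': [2, 0, 1], 'palm': [0, 1, 1]}
```

Even in this toy case the table contains zeros; real tables have many more features than any one sample contains, which is why they are sparse.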

Practicum: Feature Table Creation
• This command can take a few minutes to run
  ◦ So don’t worry if the command prompt doesn’t immediately return after you hit enter

    qiime deblur denoise-16S \
      --i-demultiplexed-seqs demux-filtered.qza \
      --p-trim-length 120 \
      --o-representative-sequences rep-seqs.qza \
      --o-table table.qza \
      --o-stats deblur-stats.qza

• Where do you guess the number 120 came from?
  ◦ It is the length to which all sequences will be trimmed
  ◦ It was chosen by viewing the Interactive Quality Plot in demux.qzv
  ◦ You might even choose a more conservative length, like 110
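The reasoning behind a trim length can be sketched in Python: find how far into the read quality stays acceptable, then cut every read to that length. Both functions, the quality threshold, and the toy numbers are assumptions for illustration. In particular, dropping reads shorter than the trim length is an assumed behavior in this sketch, not a statement about what Deblur does internally.

```python
def choose_trim_length(median_quality, min_quality):
    """Count leading positions whose median quality stays at or above min_quality."""
    length = 0
    for q in median_quality:
        if q < min_quality:
            break
        length += 1
    return length


def trim_reads(reads, trim_length):
    """Keep the first trim_length bases of each read; drop reads that are shorter."""
    return [r[:trim_length] for r in reads if len(r) >= trim_length]


# Toy per-position median quality scores (made up), falling off toward the 3' end.
median_quality = [38, 38, 37, 36, 30, 20, 12]
trim_length = choose_trim_length(median_quality, min_quality=30)
print(trim_length)  # 5

print(trim_reads(["ACGTACG", "ACG", "ACGTA"], trim_length))  # ['ACGTA', 'ACGTA']
```

A more conservative choice corresponds to raising `min_quality` or simply subtracting a few positions from the result.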

Practicum: Feature Table Creation (cont.)

    qiime feature-table tabulate-seqs \
      --i-data rep-seqs.qza \
      --o-visualization rep-seqs.qzv

[Figure: Feature Table Tabulation View]

    qiime feature-table summarize \
      --i-table table.qza \
      --o-visualization table.qzv \
      --m-sample-metadata-file sample-metadata.tsv

[Figure: Feature Table Summary View]

Phylogenetic Tree Creation
• Evolution is the core concept of biology
  ◦ There’s only so much you can learn from microbes while ignoring evolution!
• Evolution-aware analyses of a dataset need a phylogenetic tree of its sequences
  ◦ De novo: infer tree using only sequences from dataset
  ◦ Reference-based: insert sequences from dataset into an existing phylogenetic tree
    § Not all existing phylogenies are created equal: they have strengths and weaknesses based on intended purpose when developed
• Phylogenetically based analyses in QIIME 2 need a rooted tree
[Figure: two example trees, labeled "Unrooted" and "Rooted"]
Geer, R.C., Messersmith, D.J., Alpi, K., Bhagwat, M., Chattopadhyay, A., Gaedeke, N., Lyon, J., Minie, M.E., Morris, R.C., Ohles, J.A., Osterbur, D.L. & Tennant, M.R. 2002. NCBI Advanced Workshop for Bioinformatics Information Specialists. [Online] http://www.ncbi.nlm.nih.gov/Class/NAWBIS/.

Practicum: Phylogenetic Tree Creation
• Note: here we are doing de novo phylogenetic tree creation
  ◦ Not necessarily the BEST approach, but an easy one to show you :)
• No visualizations will be produced

    qiime alignment mafft \
      --i-sequences rep-seqs.qza \
      --o-alignment aligned-rep-seqs.qza

    qiime alignment mask \
      --i-alignment aligned-rep-seqs.qza \
      --o-masked-alignment masked-aligned-rep-seqs.qza

    qiime phylogeny fasttree \
      --i-alignment masked-aligned-rep-seqs.qza \
      --o-tree unrooted-tree.qza

    qiime phylogeny midpoint-root \
      --i-tree unrooted-tree.qza \
      --o-rooted-tree rooted-tree.qza

Core Metrics
• So how do you actually compare microbial communities?
  ◦ Can’t just eyeball the (gigantic, sparse) feature tables and look for differences
  ◦ Instead, calculate metrics that compress a lot of info into a single number
  ◦ Then do statistical tests on metrics to look for significant differences
    § BE CAREFUL: microbiome data is sparse, compositional, etc, so requires unusual tests
    § QIIME 2 uses appropriate tests; if doing your own, MUST check the literature first
• These metrics are lossy!
  ◦ No metric exposes all the information in the full feature table
    § If it did, it would BE the feature table
  ◦ Different metrics capture different aspects of the communities
• Thus ...
  ◦ Don’t ask, “Which metric should I use?” UNTIL you know what you’re looking for!

Core Metrics (cont.)
• QIIME 2 calculates a smorgasbord of metrics for you with one command
• Alpha diversity
    § Shannon’s diversity index (a quantitative measure of community richness)
    § Observed OTUs (a qualitative measure of community richness)
    § Faith’s Phylogenetic Diversity (a qualitative measure of community richness that incorporates phylogenetic relationships between the features)
    § Evenness (or Pielou’s Evenness; a measure of community evenness)
• Beta diversity
    § Jaccard distance (a qualitative measure of community dissimilarity)
    § Bray-Curtis distance (a quantitative measure of community dissimilarity)
    § unweighted UniFrac distance (a qualitative measure of community dissimilarity that incorporates phylogenetic relationships between the features)
    § weighted UniFrac distance (a quantitative measure of community dissimilarity that incorporates phylogenetic relationships between the features)
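The non-phylogenetic metrics in this list have short textbook definitions, sketched below in plain Python. These are standard formulas written for illustration and are not QIIME 2's implementation. The choice of log base 2 for Shannon is an assumption (other bases are also in use), and the count vectors are made up.

```python
import math


def observed_features(counts):
    """Number of features with a non-zero count (qualitative richness)."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index, using log base 2 (an assumed convention here)."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Shannon index divided by its maximum; assumes at least two observed features."""
    return shannon(counts) / math.log2(observed_features(counts))


def jaccard_distance(a, b):
    """1 - (shared features / features present in either sample); ignores abundance."""
    present_a = {i for i, c in enumerate(a) if c > 0}
    present_b = {i for i, c in enumerate(b) if c > 0}
    return 1 - len(present_a & present_b) / len(present_a | present_b)


def bray_curtis(a, b):
    """Sum of absolute count differences over total counts; uses abundance."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


# Two toy samples over the same four features.
sample_a = [10, 10, 0, 0]
sample_b = [5, 0, 5, 10]

print(observed_features(sample_a))           # 2
print(shannon(sample_a))                     # 1.0
print(jaccard_distance(sample_a, sample_b))  # 0.75
print(bray_curtis(sample_a, sample_b))       # 0.75
```

The alpha metrics take one sample at a time, while the beta metrics take a pair, which is why beta diversity results come out as a distance matrix.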

Normalization for Core Metrics
• Normalization is necessary for valid comparisons of abundance/diversity
  ◦ “But how?!”
    § Longstanding approach: rarefaction (reduce all samples to uniform sampling depth)
    § Recent publication caused concern
      − Waste not, want not: why rarefying microbiome data is inadmissible. McMurdie PJ, Holmes S. PLoS Comput Biol. 2014;10(4).
    § Further work demonstrated concern is excessive
      − Normalization and microbial differential abundance strategies depend upon data characteristics. Weiss S, et al. Microbiome. 2017 Mar 3;5(1):27. (Note: I’m an author, so not objective)
• Calculated metric values depend on sampling depth
• Ex: circled column has more non-zero counts than others
  ◦ Is its community really more diverse, or do we just SEE more?
  ◦ Samples with more sequences (greater sampling depth) show more diversity

Rarefaction
• What is rarefaction?
  ◦ Randomly subsampling the same number of sequences from each sample
  ◦ NB: samples without that number of sequences are discarded
• Concerns:
  ◦ Depth too low: ignore a lot of samples’ information
  ◦ Depth too high: ignore a lot of samples
  ◦ Still a good choice for normalization (Weiss S, et al. Microbiome. 2017):
    § “Rarefying more clearly clusters samples according to biological origin than other normalization techniques do for ordination metrics based on presence or absence”
    § “Alternate normalization measures are potentially vulnerable to artifacts due to library size”
• Researcher must choose sampling depth, but how?
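Rarefaction as defined on this slide can be sketched in a few lines of Python: each sample's reads are subsampled without replacement to a common depth, and samples with fewer reads than that depth are dropped. The function name, the fixed seed, and the toy table are assumptions for illustration; this is not QIIME 2's implementation.

```python
import random


def rarefy(table, depth, seed=0):
    """Subsample each sample to `depth` reads without replacement.

    table maps sample name -> list of per-feature counts.
    Samples with fewer than `depth` reads are discarded.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    rarefied = {}
    for sample, counts in table.items():
        if sum(counts) < depth:
            continue  # not enough reads: the sample is dropped
        # Expand counts into one entry per read, labelled by feature index.
        pool = [i for i, c in enumerate(counts) for _ in range(c)]
        picked = rng.sample(pool, depth)
        rarefied[sample] = [picked.count(i) for i in range(len(counts))]
    return rarefied


# Toy feature table: three samples, three features.
table = {"A": [50, 30, 20], "B": [5, 3, 2], "C": [0, 40, 0]}
rarefied = rarefy(table, depth=40)

print(sorted(rarefied))  # ['A', 'C']  (B has only 10 reads, so it is dropped)
print(rarefied["C"])     # [0, 40, 0]
print(sum(rarefied["A"]))  # 40
```

The exact counts for sample A depend on the random draw, which is why rarefied results are only reproducible when the seed is recorded.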

Sampling Depth Selection
• Don’t sweat it too much
  ◦ “Low” depths (10-1000 sequences per sample) capture all but very subtle variations
  ◦ Retaining samples is usually more important than retaining sequences
    § May care not just how many samples are left out but WHICH samples are left out
[Figure: two panels. Full dataset (approximately 1,500 sequences per sample); dataset sampled at only 10 sequences per sample, showing the same pattern.]
Fig. 2, Kuczynski, J. et al., "Direct sequencing of the human microbiome readily reveals community differences", Genome Biology, 2010

Exercise: Core Metrics Sampling Depth
• Do NOT start typing yet!
• Note that the core metrics command requires a sampling depth

    qiime diversity core-metrics \
      --i-phylogeny rooted-tree.qza \
      --i-table table.qza \
      --p-sampling-depth ??? \
      --output-dir metrics

• Which sampling depth should we use?
  ◦ How can we decide?

    qiime tools view table.qzv

  ◦ Work with your partner to choose a sampling depth, then answer:
    § Why did you choose this value?
    § How many samples will be excluded from your analysis based on this choice?
    § How many total sequences will you be analyzing in the core-metrics command?

Answers: Core Metrics
• My answers:
  ◦ Why did you choose this value?
    § Anything higher excludes >= half of right palm samples
  ◦ How many samples will be excluded from your analysis based on this choice?
    § 4, all from right palm of subject 1
  ◦ How many total sequences will you be analyzing in the core-metrics command?
    § 24,000 (23.40%)

    qiime diversity core-metrics \
      --i-phylogeny rooted-tree.qza \
      --i-table table.qza \
      --p-sampling-depth 800 \
      --output-dir metrics
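The three exercise questions come down to simple arithmetic on the per-sample read counts: samples below the depth are excluded, and the sequences analyzed equal the depth times the number of retained samples. The Python sketch below shows that calculation on made-up counts (not the tutorial's data); the function name and result keys are hypothetical.

```python
def depth_tradeoff(reads_per_sample, depth):
    """Summarize what a given sampling depth keeps and discards."""
    kept = [s for s, n in reads_per_sample.items() if n >= depth]
    dropped = sorted(s for s, n in reads_per_sample.items() if n < depth)
    total_reads = sum(reads_per_sample.values())
    retained_reads = depth * len(kept)  # every kept sample contributes exactly `depth`
    return {
        "kept": len(kept),
        "dropped": dropped,
        "retained_reads": retained_reads,
        "fraction_retained": retained_reads / total_reads,
    }


# Hypothetical read counts per sample, for illustration only.
reads_per_sample = {"s1": 5000, "s2": 1200, "s3": 800, "s4": 300}

summary = depth_tradeoff(reads_per_sample, depth=800)
print(summary["kept"])            # 3
print(summary["dropped"])         # ['s4']
print(summary["retained_reads"])  # 2400
```

By the same arithmetic, the 24,000 sequences in the answer above would correspond to 30 retained samples at a depth of 800.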

Practicum: Core Metrics
• Note: there is no single visualization for core metrics
  ◦ We will examine a few different visualizations later

    qiime diversity core-metrics \
      --i-phylogeny rooted-tree.qza \
      --i-table table.qza \
      --p-sampling-depth 800 \
      --output-dir metrics

Page 59: Microbiome 16S Analysis: A Quick-Start Guidecompbio.ucsd.edu/wp-content/uploads/...tutorial_non-interactive.pdf · Microbiome 16S Analysis: A Quick-Start Guide ... Alternating lecture

Beta Diversity
• “Between-sample” diversity
  ◦ Has similar categories and caveats as alpha diversity
• A popular phylogenetic option is “UniFrac”:
  ◦ Measures how different two samples' component sequences are
  ◦ Weighted UniFrac: takes the abundance of each sequence into account

Illustration courtesy of Dr. Rob Knight
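To make the unweighted UniFrac idea concrete, here is a minimal sketch on a toy tree. It assumes the usual definition (branch length leading to tips from only one of the two samples, divided by branch length leading to tips from either sample). The tree, tip names, and the flat `branches` representation are hypothetical simplifications, not how QIIME 2 stores a phylogeny.

```python
def unweighted_unifrac(branches, sample_a, sample_b):
    """Toy unweighted UniFrac.

    branches: list of (branch_length, set_of_tips_below_that_branch)
    sample_a, sample_b: sets of tips (sequences) observed in each sample
    """
    unique_length = 0.0
    observed_length = 0.0
    for length, tips in branches:
        in_a = bool(tips & sample_a)
        in_b = bool(tips & sample_b)
        if in_a or in_b:
            observed_length += length
            if in_a != in_b:              # branch leads to only one of the samples
                unique_length += length
    if observed_length == 0:
        raise ValueError("neither sample contains any tip of the tree")
    return unique_length / observed_length


# Hypothetical tree ((t1:1,t2:1):1,(t3:1,t4:1):1), flattened into branches.
branches = [
    (1.0, {"t1"}), (1.0, {"t2"}), (1.0, {"t1", "t2"}),
    (1.0, {"t3"}), (1.0, {"t4"}), (1.0, {"t3", "t4"}),
]

print(unweighted_unifrac(branches, {"t1", "t2"}, {"t1", "t2"}))  # identical samples
print(unweighted_unifrac(branches, {"t1", "t2"}, {"t1", "t3"}))  # partially shared
print(unweighted_unifrac(branches, {"t1", "t2"}, {"t3", "t4"}))  # nothing shared
```

Weighted UniFrac would additionally weight each branch by the difference in relative abundance below it; that is left out of this sketch.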


Beta Diversity Ordination
• Ordination: multivariate techniques that arrange samples along axes on the basis of composition
• Principal Coordinates Analysis: a way to map non-Euclidean distances into a Euclidean space to enable further investigation
  ◦ Abbreviated as PCoA, not to be confused with PCA (Principal Component Analysis)
  ◦ Starting point is a distance matrix
    § NOT the full set of independent variables for each sample
  ◦ Pairwise distances among n samples are projected into n-1 dimensions
  ◦ PCA is then performed to reduce the dimensionality back down
• PCoA axes can't be decomposed into independent variable contributions
  ◦ But results can be compared to metadata to identify patterns
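The steps above can be sketched with the textbook "classical" PCoA recipe: square the distances, double-center them, and take the leading eigenvector. This is a stdlib-only illustration that recovers just the first axis by power iteration; it is an assumption that this matches what any particular tool does internally, and the function names and toy distance matrix are hypothetical.

```python
import math


def double_center(dist):
    """Turn a distance matrix into the centered Gram matrix B = -1/2 * J * D^2 * J."""
    n = len(dist)
    sq = [[-0.5 * d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    col_mean = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_mean) / n
    return [[sq[i][j] - row_mean[i] - col_mean[j] + grand for j in range(n)]
            for i in range(n)]


def first_pcoa_axis(dist, iterations=500):
    """Coordinates of each sample on the first principal coordinate.

    Uses power iteration, which assumes the largest-magnitude eigenvalue of B
    is positive. The sign of the returned axis is arbitrary.
    """
    b = double_center(dist)
    n = len(b)
    v = [float(i + 1) for i in range(n)]      # arbitrary non-constant start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    return [x * math.sqrt(eigenvalue) for x in v]


# Hypothetical distances between three samples lying on a line at 0, 1 and 3.
dist = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
coords = first_pcoa_axis(dist)
print(coords)
```

Note that the input is only the distance matrix: the original per-sample variables never enter the calculation, which is why the axes cannot be traced back to them.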


Practicum: Beta Diversity Ordination

qiime emperor plot \
  --i-pcoa metrics/unweighted_unifrac_pcoa_results.qza \
  --m-metadata-file sample-metadata.tsv \
  --o-visualization metrics/unweighted-unifrac-emperor.qzv

• This is only showing the PCoA visualization of ONE beta diversity metric
  ◦ Not necessarily “the correct one”!
  ◦ Remember that 3 others are calculated by core-metrics alone
• To view the ordination of a different metric, just input a different PCoA results file
  ◦ To find them:

cd metrics/
ls *_pcoa_results.qza


Beta Diversity Ordination View


Exercise: Beta Diversity Ordination
• Initial PCoA view (see previous slide) is completely independent of metadata
  ◦ Clusters/gradients/etc. seen in PCoA are produced by unsupervised learning, based on the feature table information without any awareness of metadata
• It's great to see clear, distinct clusters as in this dataset, but even greater if they can be explained by a known metadata category
• Work with your partner to answer the following question:
  ◦ Can you find a metadata category that appears associated with the observed clusters?
    § Hint: Experiment with coloring points by different metadata


Answers: Beta Diversity Ordination
• My answer:
  ◦ Can you find a metadata category that appears associated with the observed clusters?
    § Yep: BodySite. Note that left and right palm aren't distinct from each other, unsurprisingly


Practicum: Beta Diversity Group Significance

• Standard caveats apply: not the “one true metric”, etc.

qiime diversity beta-group-significance \
  --i-distance-matrix metrics/unweighted_unifrac_distance_matrix.qza \
  --m-metadata-file sample-metadata.tsv \
  --m-metadata-category BodySite \
  --p-pairwise \
  --o-visualization metrics/unweighted-unifrac-bodysite-significance.qzv


Beta Diversity Group Significance View


Exercise: Beta Diversity Group Significance
• Work with your partner to answer these questions:
  ◦ Does the group significance analysis bear out your intuition from the ordination?
    § If so, are the differences statistically significant?
    § Are there specific pairs of BodySite values that are significantly different from each other?
  ◦ How about Subject?
    § Hint: you will need to run a new command!


Answers: Beta Diversity Group Significance
• My answers:
  ◦ Does the group significance analysis bear out your intuition from the ordination?
    § Yes
    § If so, are the differences statistically significant?
      − Yes, with p <= 0.001 (bonus: why do I say “less than or equal to”?)
    § Are there specific pairs of BodySite values that are significantly different from each other?
      − Yes, all of the pairs except left palm/right palm
  ◦ How about Subject?

qiime diversity beta-group-significance \
  --i-distance-matrix metrics/unweighted_unifrac_distance_matrix.qza \
  --m-metadata-file sample-metadata.tsv \
  --m-metadata-category Subject \
  --p-pairwise \
  --o-visualization metrics/unweighted-unifrac-subject-significance.qzv

    § Nope: significance of the difference between distributions of the unweighted UniFrac metric grouped by subject has p = 0.442
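One plausible reading of the "less than or equal to" bonus question is that the test is permutation-based, so its p-value has a floor set by the number of permutations. The sketch below assumes 999 permutations and the common estimate p = (hits + 1) / (permutations + 1). The statistic here is a deliberately simple stand-in (mean between-group distance minus mean within-group distance), not the statistic the QIIME 2 command computes, and all data are hypothetical.

```python
import random


def permutation_p_value(observed, permuted_stats):
    """p = (number of permuted statistics >= observed, plus 1) / (permutations + 1)."""
    hits = sum(1 for s in permuted_stats if s >= observed)
    return (hits + 1) / (len(permuted_stats) + 1)


def between_minus_within(dist, labels):
    """Toy statistic: mean between-group distance minus mean within-group distance."""
    between, within = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            (within if labels[i] == labels[j] else between).append(dist[i][j])
    return sum(between) / len(between) - sum(within) / len(within)


def permutation_test(dist, labels, permutations=999, seed=0):
    """Shuffle group labels and see how often the shuffled statistic matches or beats the real one."""
    rng = random.Random(seed)
    observed = between_minus_within(dist, labels)
    shuffled = list(labels)
    stats = []
    for _ in range(permutations):
        rng.shuffle(shuffled)
        stats.append(between_minus_within(dist, shuffled))
    return permutation_p_value(observed, stats)


# Even if no permutation ever beats the observed value, p cannot go below 1/1000.
print(permutation_p_value(10.0, [0.0] * 999))

# Hypothetical example: two well-separated groups of four samples on a line.
positions = [0, 1, 2, 3, 10, 11, 12, 13]
dist = [[abs(a - b) for b in positions] for a in positions]
labels = ["a"] * 4 + ["b"] * 4
p = permutation_test(dist, labels)
print(p)
```

Under these assumptions a reported 0.001 means "no permutation did as well as the real grouping", so the true p-value could be much smaller.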


Acknowledgements
• Center for Computational Biology & Bioinformatics, University of California at San Diego
• Caporaso lab, Northern Arizona University
• Knight lab, UCSD
• QIIME 2 development team!
  ◦ Especially for the excellent “Moving Pictures” tutorial on which this one is based


Practicum: Beta Diversity Ordination
• But wait, this is time-series data! Maybe we'd like to view it on a time axis:

qiime emperor plot \
  --i-pcoa metrics/unweighted_unifrac_pcoa_results.qza \
  --m-metadata-file sample-metadata.tsv \
  --p-custom-axis DaysSinceExperimentStart \
  --o-visualization metrics/unweighted-unifrac-emperor-bydayssince.qzv

• Standard caveats apply


Supplemental Slides


Beta Diversity Ordination View (cont.)


Practicum: Beta Diversity Correlation

• Standard caveats apply

qiime diversity beta-correlation \
  --i-distance-matrix metrics/unweighted_unifrac_distance_matrix.qza \
  --m-metadata-file sample-metadata.tsv \
  --m-metadata-category DaysSinceExperimentStart \
  --o-visualization metrics/unweighted-unifrac-dayssinceexperimentstart-beta-correlation.qzv


Beta Diversity Correlation View


Alpha Diversity
• “Within-sample” diversity
  ◦ Many different metrics exist
    § Taxonomy-based (e.g., number of observed OTUs)
      − Assume everything is equally dissimilar
      − More likely to see differences based on close relatives
    § Phylogeny-based (e.g., phylogenetic diversity over whole tree)
      − Treat less related items as more dissimilar
      − Better at scaling the observed differences
  ◦ The “correct” metric(s) are those relevant to your hypothesis
    § Please do HAVE a hypothesis!
• Testing approach:
  ◦ Examine alpha diversity metric by metadata values
  ◦ Test whether the metric's distribution differs between groups (if metadata is categorical) or is correlated with metadata (if metadata is continuous)
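A few alpha diversity metrics are simple enough to sketch directly. The block below shows one count-based family (observed features, Shannon entropy, Pielou evenness) next to a toy phylogeny-based one (summed branch length covering the observed tips, in the spirit of Faith's PD). The counts, the tree, and the flat `branches` representation are hypothetical, and real implementations may differ in details such as logarithm base.

```python
import math


def observed_features(counts):
    """Number of features (e.g., OTUs) with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon entropy in bits (log base 2 is an assumption of this sketch)."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Shannon entropy divided by its maximum for the observed richness."""
    s = observed_features(counts)
    if s < 2:
        return 0.0                      # evenness is not meaningful with fewer than 2 features
    return shannon(counts) / math.log2(s)


def toy_phylogenetic_diversity(branches, observed_tips):
    """Sum of branch lengths that lead to at least one observed tip."""
    return sum(length for length, tips in branches if tips & observed_tips)


even = [10, 10, 10, 10]                 # hypothetical sample with equal abundances
skewed = [37, 1, 1, 1]                  # same richness, dominated by one feature
print(observed_features(even), shannon(even), pielou_evenness(even))
print(observed_features(skewed), shannon(skewed), pielou_evenness(skewed))

# Hypothetical tree ((t1:1,t2:1):1,(t3:1,t4:1):1), flattened into branches.
branches = [
    (1.0, {"t1"}), (1.0, {"t2"}), (1.0, {"t1", "t2"}),
    (1.0, {"t3"}), (1.0, {"t4"}), (1.0, {"t3", "t4"}),
]
print(toy_phylogenetic_diversity(branches, {"t1", "t2"}))  # two close relatives
print(toy_phylogenetic_diversity(branches, {"t1", "t3"}))  # two distant relatives
```

Both two-tip samples have the same richness, but the phylogeny-based value is larger for the distant pair, which is the contrast the slide draws between the two families.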


Alpha Diversity
[Figure: number of OTUs by sampling site]
High within-sample diversity: why?


Practicum: Alpha Diversity Group Significance

qiime diversity alpha-group-significance \
  --i-alpha-diversity metrics/faith_pd_vector.qza \
  --m-metadata-file sample-metadata.tsv \
  --o-visualization metrics/faith-pd-group-significance.qzv

• Note: only showing you the group significance visualization of ONE alpha diversity metric
  ◦ Remember that 3 others are calculated by core-metrics alone
  ◦ The one I am showing is not “the correct one”; pick the one that fits your hypothesis
• To check the group significance of a different metric, just input a different vector file
  ◦ To find them:

cd metrics/
ls *_vector.qza


Alpha Diversity Group Significance View


Exercise: Alpha Diversity Group Significance

• Work with your partner to answer these questions:
  ◦ Is BodySite value associated with significant differences in phylogenetic diversity?
  ◦ Which two sites have the most significant difference in phylogenetic diversity distributions?
    § Note the difference between p-value and q-value
  ◦ Is Subject value associated with significant differences in phylogenetic diversity?

qiime diversity alpha-group-significance \
  --i-alpha-diversity metrics/faith_pd_vector.qza \
  --m-metadata-file sample-metadata.tsv \
  --o-visualization metrics/faith-pd-group-significance.qzv


Answers: Alpha Diversity Group Significance
• My answers:
  ◦ Is BodySite value associated with significant differences in phylogenetic diversity?
    § Yes, with p < 1E-3
  ◦ Which two sites have the most significant difference in phylogenetic diversity distributions?
    § Left palm is (equally) most significantly different from gut and tongue
      − Consider: any idea why perhaps left palm but not right?
  ◦ Is Subject value associated with significant differences in phylogenetic diversity?
    § No, p-value = 0.24
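On the p-value versus q-value distinction: a q-value is a p-value adjusted for the fact that several pairwise comparisons were tested at once. A minimal sketch of one common adjustment, the Benjamini-Hochberg procedure, follows. That this is the exact correction behind the visualization is an assumption here, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # indices from smallest to largest p
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing that q-values never increase
    # as p-values get smaller.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p-values from four pairwise comparisons.
pairwise_p = [0.01, 0.04, 0.03, 0.005]
pairwise_q = benjamini_hochberg(pairwise_p)
for p, q in zip(pairwise_p, pairwise_q):
    print(f"p={p:.3f}  q={q:.3f}")
```

A q-value is never smaller than the p-value it came from, so a comparison can look significant by p-value yet fail after correction.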


Practicum: Alpha Diversity Correlation

• Same caveat as before:
  ◦ Only showing the correlation visualization of ONE alpha diversity metric
    § Not necessarily “the correct one”!

qiime diversity alpha-correlation \
  --i-alpha-diversity metrics/evenness_vector.qza \
  --m-metadata-file sample-metadata.tsv \
  --o-visualization metrics/evenness-alpha-correlation.qzv


Alpha Diversity Correlation View


Taxonomic Assignment
• Sequence features or OTUs have limited utility
  ◦ At some point, you'll want to link your findings to published work
  ◦ That requires identifying the taxonomy of each sequence feature
• Steps:
  ◦ Pick reference database
    § I hear you cry, “Which one should I use?”
  ◦ Train a classifier algorithm to assign taxonomies to sequences
    § Use the reference database as the training set
  ◦ Run the classifier algorithm on your sequence features
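The train-then-classify steps above can be sketched with a tiny k-mer naive Bayes model. This is an illustration of the general idea only: the choice of naive Bayes, the k-mer size, the uniform prior, and the Laplace smoothing are assumptions of this sketch, and the reference sequences and taxon labels are made up.

```python
import math
from collections import Counter


def kmers(seq, k=4):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train(reference, k=4):
    """reference: {taxon_label: [sequences]} -> (per-taxon k-mer counts, vocabulary)."""
    model = {
        taxon: Counter(km for seq in seqs for km in kmers(seq, k))
        for taxon, seqs in reference.items()
    }
    vocabulary = {km for counts in model.values() for km in counts}
    return model, vocabulary


def classify(seq, model, vocabulary, k=4):
    """Pick the taxon with the highest smoothed log-likelihood (uniform prior assumed)."""
    best_taxon, best_score = None, -math.inf
    for taxon, counts in model.items():
        total = sum(counts.values())
        score = sum(
            math.log((counts[km] + 1) / (total + len(vocabulary)))  # Laplace smoothing
            for km in kmers(seq, k)
        )
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon


# Hypothetical "reference database" with made-up sequences and labels.
reference = {
    "taxon_A": ["ACGTACGTACGTAAAC"],
    "taxon_B": ["TTGGCCTTGGCCTTGA"],
}
model, vocabulary = train(reference)
print(classify("ACGTACGTAC", model, vocabulary))
print(classify("GGCCTTGGCC", model, vocabulary))
```

A real classifier is trained on thousands of reference sequences trimmed to the amplified region, which is why the reference database choice matters so much.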


Marker Gene Reference Databases
◦ NOT a complete list:
  § Greengenes: 16S
  § Silva: 16S/18S
  § RDP: 16S/18S/28S
  § UNITE: ITS
◦ Another not-complete list at eukref.org/databases (not just eukaryotic)
◦ At the very least, choose a database that includes your marker gene!
  § Beyond that, formal guidance is hard to find
  § But off the record you might get some informal guidance :)


Common Issues in Marker Gene Studies
• Neglecting metadata
  ◦ Analysis cannot test for effects of, or discard bias from, features you didn't record!
• Picking novel 16S primers: not all created equal
  ◦ Earth Microbiome Project recommends 515f-806r primers, error-correcting barcodes
• Not taking precautions to support amplicon sequencing
  ◦ Some Illumina machines require high PhiX, low cluster density
• Selecting an inappropriate reference database
  ◦ E.g., Greengenes (16S) reference database when sequencing ITS
• Expecting species-level taxonomy calls
  ◦ Most OTUs/features only specified to family or genus level


Taxonomy: Expectation vs. Reality

Rank      Ideal Result           Real Result
Kingdom   Bacteria               Bacteria
Phylum    Proteobacteria         Proteobacteria
Class     Gammaproteobacteria    Gammaproteobacteria
Order     Enterobacteriales      Enterobacteriales
Family    Enterobacteriaceae     Enterobacteriaceae
Genus     Escherichia            ---
Species   coli                   OTU 2445338
Strain    O157:H7                ---


Practicum: Taxonomic Assignment

qiime taxa tabulate \
  --i-data taxonomy.qza \
  --o-visualization taxonomy.qzv


Taxonomic Assignment Tabulation View


Practicum: Taxonomic Assignment

qiime feature-classifier classify-sklearn \
  --i-classifier gg-13-8-99-515-806-nb-classifier.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy.qza

qiime taxa barplot \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --m-metadata-file sample-metadata.tsv \
  --o-visualization taxa-bar-plots.qzv


Taxonomic Assignment Bar Plot View


Exercise: Taxonomic Assignment
• “Level 1” = kingdom, “Level 2” = phylum, etc.
• Work with your partner to:
  ◦ Visualize the taxa at level 2
  ◦ Sort the samples by BodySite
  ◦ Do you see anything suggestive?


Answers: Taxonomic Assignment

• Gut sure seems to have a lot more Bacteroidetes than the other sites
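What "viewing the taxa at level 2" means can be sketched as truncating each feature's taxonomy string to its first two ranks and summing the counts that land on the same label. The semicolon-separated, Greengenes-style strings, feature IDs, and counts below are hypothetical, and the helper names are made up for this illustration.

```python
from collections import defaultdict


def truncate_taxonomy(taxon, level):
    """Keep only the first `level` ranks of a semicolon-separated taxonomy string."""
    ranks = [rank.strip() for rank in taxon.split(";")]
    return ";".join(ranks[:level])


def collapse(feature_counts, taxonomy, level):
    """Sum feature counts that share the same taxonomy down to `level`."""
    collapsed = defaultdict(int)
    for feature, count in feature_counts.items():
        collapsed[truncate_taxonomy(taxonomy[feature], level)] += count
    return dict(collapsed)


# Hypothetical taxonomy assignments and counts for one sample.
taxonomy = {
    "feat1": "k__Bacteria; p__Bacteroidetes; c__Bacteroidia",
    "feat2": "k__Bacteria; p__Bacteroidetes; c__Flavobacteriia",
    "feat3": "k__Bacteria; p__Firmicutes; c__Bacilli",
}
gut_sample = {"feat1": 40, "feat2": 10, "feat3": 25}

level2 = collapse(gut_sample, taxonomy, level=2)   # level 2 = phylum
print(level2)
```

Collapsing merges features that differ only below the chosen rank, so the bar plot at level 2 shows one bar segment per phylum.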


Differential Abundance Analysis
• Why go to the trouble of assigning taxonomies?
  ◦ Probably you want to know whether any particular taxa are differentially abundant
    § In different individuals, environments, timepoints, etc.
• How to test for differential abundance?
  ◦ Remember: microbiome datasets are “compositional” (fixed sum)
  ◦ Watch out: “traditional” statistical methods perform badly for this sort of data!
    § E.g., 95% false positives when you expect an FDR of 5%
  ◦ What to use instead?
    § Balance trees (borrowed from geology) are currently the best known option (as of 2017)
      − But they aren't implemented in QIIME 2 yet
      − So until they are, use previous best known option (as of 2015)
    § ANCOM (ANalysis of Composition Of Microbiomes)
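Why fixed-sum data misleads traditional tests can be shown with a three-taxon toy example: only one taxon truly changes, yet every relative abundance moves. The counts are hypothetical, and `closure` is just this sketch's name for converting counts to proportions.

```python
import math


def closure(counts):
    """Convert absolute counts to relative abundances that sum to 1."""
    total = sum(counts.values())
    return {taxon: value / total for taxon, value in counts.items()}


# Hypothetical absolute abundances: only taxon A truly changes.
before = {"A": 100, "B": 100, "C": 100}
after = {"A": 400, "B": 100, "C": 100}

rel_before = closure(before)
rel_after = closure(after)

# B and C look "depleted" in relative terms although their absolute counts are unchanged.
print(rel_before["B"], rel_after["B"])

# The log-ratio between two unchanged taxa is not affected by what A does.
log_ratio_before = math.log(rel_before["B"] / rel_before["C"])
log_ratio_after = math.log(rel_after["B"] / rel_after["C"])
print(log_ratio_before, log_ratio_after)
```

This is the motivation for methods built on ratios between taxa rather than on each taxon's proportion by itself.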


Practicum: Differential Abundance Analysis

qiime taxa collapse \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --p-level 2 \
  --o-collapsed-table table-level2.qza

qiime composition add-pseudocount \
  --i-table table-level2.qza \
  --o-composition-table comp-table-level2.qza

qiime composition ancom \
  --i-table comp-table-level2.qza \
  --m-metadata-file sample-metadata.tsv \
  --m-metadata-category BodySite \
  --o-visualization ancom-bodysite-level2.qzv


Differential Abundance Analysis View


Details: Differential Abundance Analysis
• W-statistic
  ◦ # of other items from which a single item is found to be significantly different
    § With alpha = 0.05 by default (can be changed)
• Percentile abundance table:
  ◦ A table of items and their percentile abundances in each group
  ◦ Rows are items
  ◦ Columns are percentiles within a group
  ◦ Values are abundance of reads for given percentile for that group
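The W-statistic idea can be sketched as follows: for each item, test its log-ratio against every other item across the two groups and count how many of those tests are significant at alpha. This is a rough illustration under stated assumptions, not the ANCOM implementation: the significance test here is a simple exact permutation test on mean log-ratios, there is no multiple-testing correction, only two groups are handled, and the table is hypothetical (all values positive, as after adding a pseudocount).

```python
import math
from itertools import combinations


def exact_permutation_p(values_a, values_b):
    """Two-sided exact permutation test on the absolute difference of group means."""
    pooled = values_a + values_b
    n_a = len(values_a)

    def mean_diff(idx_a):
        a = [pooled[i] for i in idx_a]
        b = [pooled[i] for i in range(len(pooled)) if i not in idx_a]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = mean_diff(tuple(range(n_a)))
    splits = list(combinations(range(len(pooled)), n_a))
    extreme = sum(1 for s in splits if mean_diff(s) >= observed - 1e-12)
    return extreme / len(splits)


def w_statistics(table, groups, alpha=0.05):
    """table: {item: [abundance per sample]}; groups: one of two labels per sample."""
    first, second = sorted(set(groups))
    w = {}
    for item in table:
        w[item] = 0
        for other in table:
            if other == item:
                continue
            ratios = [math.log(x / y) for x, y in zip(table[item], table[other])]
            a = [r for r, g in zip(ratios, groups) if g == first]
            b = [r for r, g in zip(ratios, groups) if g == second]
            if exact_permutation_p(a, b) < alpha:
                w[item] += 1
    return w


# Hypothetical table: X differs strongly between groups, Y and Z do not.
table = {
    "X": [10, 12, 11, 9, 100, 110, 95, 105],
    "Y": [50, 52, 48, 51, 49, 50, 52, 51],
    "Z": [30, 29, 31, 30, 31, 30, 29, 30],
}
groups = ["gut"] * 4 + ["palm"] * 4
w = w_statistics(table, groups)
print(w)
```

In this toy table X differs from both other items while Y and Z each differ only from X, so X stands out with the highest W.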