Viewing & Getting GO
COST Functional Modeling Workshop
22-24 April, Helsinki
Slide 2
Summary
1. Ontology Browsers
   QuickGO, AmiGO: searching for GO Terms, getting GO
   Plant Ontology Browser
2. Finding/Adding GO for Functional Modeling
   GOProfiler: summary of GO for your species
   GORetriever: gets existing GO
   GOanna: adds GO (BLAST)
   Adding GO for large datasets
3. Array annotation
4. Using added GO in GO Enrichment Analysis
Slide 3
GO Browsers
QuickGO Browser (EBI GOA Project): http://www.ebi.ac.uk/ego/
  protein annotations
  search by GO Term or by UniProt ID
AmiGO Browser (GO Consortium Project): http://amigo.geneontology.org/cgi-bin/amigo/go.cgi
  search by GO Term or by accession
QuickGO Features
Searching for gene products:
  Can use gene/protein names, but better to use accessions.
  Works off UniProt accessions/IDs.
  Can enter multiple accessions (separated by a space).
Searching for GO Terms:
  Has autocomplete.
  Provides a ranked list of matches.
  Matches are also grouped by BP, CC, MF.
GO Term pages show definitions, annotations, and parent and child terms.
Advanced filtering options.
Download as protein lists or in gene association file format.
Slide 6
record information; annotation data; number of annotations
Slide 7
Add taxon ID for horse. (Use to find taxon ID for your
species.)
AmiGO Features
Need to select either a gene product or a GO Term search.
Searching for gene products:
  Can use gene/protein names, but better to use accessions.
  Works off multiple types of accessions/IDs.
  Only accepts a single accession, not a list.
  View information about the gene product & about annotations for that gene product.
Searching for GO Terms:
  Large numbers of GO annotations are truncated.
Some filtering options:
  Filter by ontology or evidence code.
  Filter by database or species.
Download as sequences or gene association file.
Slide 12
Plant Ontology (PO) Browser
Describes plant anatomy and morphology, and stages of development, for all plants.
Plant Anatomy:
  e.g., plant structures (PO:0009011) such as plant organ (PO:0009008), plant cell (PO:0009002), whole plant (PO:0000003), portion of plant tissue (PO:0009007), and vascular system (PO:0000034), etc.
Plant Structure Development Stage:
  e.g., plant tissue development stage (PO:0025423), leaf development stage (PO:0001050), whole plant development stage (PO:0007033), seed development stage (PO:0001170), and sporophyte development stage (PO:0028002), etc.
Slide 13
http://www.plantontology.org/
Slide 14
Ontology Browsers
Use to identify specific ontology terms of interest.
Use to download annotation files:
  for specific gene lists
  for species
Use these files as input for GO or PO expression analysis.
Slide 15
Tutorial 1. Familiarizing yourself with ontology browsers.
OR
Use browsers to look for GO/PO for accessions from your own data set.
Slide 16
2. Finding/Adding GO for Functional Modeling
How much GO is available for your species? GOProfiler
How much GO is available for your data set? GORetriever
How much of this is in the tool(s) you want to use? Last update? Source?
Do you need to add GO? GOanna, Blast2GO, etc.
Slide 17
GOProfiler
GOProfiler allows you to get an overview of what GO annotation exists for the species you are interested in.
Slide 18
Slide 19
Number of proteins is based upon GO Consortium records for these species.
Species with only IEA annotations do not have an active GO annotation project; their GO is provided automatically by the EBI GOA Project.
Slide 20
GORetriever
Allows you to get existing GO annotations for a specific set of gene products.
Accepts a text file of accessions or IDs.
Returns GO annotations, a list of accessions that have no GO, and a GO Summary file.
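The three kinds of output listed for GORetriever can be pictured with a small local sketch. It assumes the existing annotations are already in hand as (accession, GO ID) pairs; the function name and the sample accessions are hypothetical, and this is an illustration of the idea rather than how GORetriever itself works.

```python
def split_by_annotation(accessions, annotations):
    """Split a query list into annotated and unannotated accessions.

    accessions:  list of accession strings (the query list).
    annotations: iterable of (accession, go_id) pairs (existing GO).
    Returns (annotated, no_go, summary):
      annotated: dict of accession -> sorted list of GO IDs
      no_go:     accessions from the query list with no GO
      summary:   dict of accession -> number of GO terms found
    """
    found = {}
    for acc, go_id in annotations:
        found.setdefault(acc, set()).add(go_id)
    annotated = {acc: sorted(found[acc]) for acc in accessions if acc in found}
    no_go = [acc for acc in accessions if acc not in found]
    summary = {acc: len(terms) for acc, terms in annotated.items()}
    return annotated, no_go, summary


# Hypothetical query list and annotation pairs, for illustration only.
accessions = ["P12345", "Q99999", "A0A000"]
annotations = [
    ("P12345", "GO:0005524"),
    ("P12345", "GO:0005634"),
    ("Q99999", "GO:0016020"),
]
annotated, no_go, summary = split_by_annotation(accessions, annotations)
print("No GO found for:", no_go)
```

The `no_go` list is the part that would go on to a similarity-based tool such as GOanna or Blast2GO.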
Slide 21
Input file: a text file of return-separated accessions.
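A return-separated input file is simply one accession per line. The sketch below builds that text from a messy list; trimming whitespace and dropping blanks and repeats are assumptions about what makes a clean input, not documented GORetriever requirements, and the function name and file name are hypothetical.

```python
def make_accession_input(raw_accessions):
    """Return text with one accession per line (return-separated)."""
    seen = set()
    cleaned = []
    for acc in raw_accessions:
        acc = acc.strip()
        # Skip blank entries and accessions already listed.
        if acc and acc not in seen:
            seen.add(acc)
            cleaned.append(acc)
    return "\n".join(cleaned) + "\n"


# Hypothetical accessions, including a blank and a repeat.
text = make_accession_input(["P12345 ", "Q99999", "", "P12345"])
print(text, end="")

# To save the result for upload (hypothetical file name):
# with open("accessions.txt", "w") as handle:
#     handle.write(text)
```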
Slide 22
GORetriever Results
Slide 23
Slide 24
add GO to this list using GOanna or Blast2GO
Slide 25
GORetriever Results
Do functional grouping using GOSlimViewer.
Slide 26
GORetriever:
  only returns existing GO
  only accepts limited accession types
GOanna:
  does a BLAST search against existing GO-annotated products
  allows you to quickly transfer GO to gene products where they have similar sequences
  accepts fasta files
Slide 27
If you enter an incorrect email address you will not receive your results!
Contact AgBase if you have not received results after 24-48h.
Slide 28
GOanna Results If you enter an incorrect email address you will
not receive your results! Contact AgBase if you have not received
results after 24-48h.
Slide 29
query IDs are hyperlinked to BLAST data (files must be in the
same directory)
Slide 30
1. Manually inspect alignments and delete any lines where there is not a good alignment*.
2. Add this additional annotation to the annotations from GORetriever.
*WHAT IS A GOOD ALIGNMENT?
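What counts as a good alignment depends on the data set, so any automatic cutoff is only a first pass before manual inspection. The sketch below pre-screens hits on E-value, percent identity and query coverage. The field names and all three thresholds are hypothetical placeholders, not AgBase recommendations or the GOanna output format; they would need tuning on a sample set.

```python
def is_good_alignment(hit, max_evalue=1e-20, min_identity=70.0, min_coverage=0.7):
    """Return True if a hit passes all three (placeholder) cutoffs.

    hit is a dict with hypothetical keys:
      evalue, pct_identity, align_len, query_len
    """
    coverage = hit["align_len"] / hit["query_len"]
    return (
        hit["evalue"] <= max_evalue
        and hit["pct_identity"] >= min_identity
        and coverage >= min_coverage
    )


# Hypothetical hits, for illustration only.
hits = [
    {"query": "seq1", "evalue": 1e-80, "pct_identity": 92.0,
     "align_len": 300, "query_len": 310},
    {"query": "seq2", "evalue": 1e-5, "pct_identity": 35.0,
     "align_len": 60, "query_len": 400},
]
kept = [h["query"] for h in hits if is_good_alignment(h)]
print(kept)
```

Hits that fail the screen are candidates for deletion; hits that pass still deserve a look at the actual alignment.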
Slide 31
GOanna2ga
New to AgBase: an online script to convert your GOanna file to gene association file format.
Use it to add manually checked GOanna annotations to a GORetriever file.
Slide 32
Tutorial 2. Getting GO.
  GOProfiler: check what is available
  GORetriever: get existing GO
  GOanna: add GO annotations
Note: you will use Blast2GO to add additional GO annotations to your data sets tomorrow.
OR
Getting existing GO & adding additional GO to your own data set.
Slide 33
Some limitations of GOanna:
  BLAST analysis is slow; results are emailed
  limit of 5,000 inputs or an overall file size of 6Mb
  limit of 3 jobs submitted/user at one time
How do I get GO for my 50,000-sequence RNA-Seq dataset?
  50 x GOanna submissions + manual interpretation of results: impractical and slow!!
ALTERNATIVELY: Contact AgBase
  we use internal GO annotation pipelines/queuing
  we can help customize databases
  GO can be kept private and released after publication
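To see why many submissions are needed, it helps to sketch how a large fasta file would have to be cut into batches that respect both limits. This is a minimal sketch, not an AgBase tool: the function names are hypothetical, "6Mb" is read conservatively as 6,000,000 bytes, and the limit of 3 simultaneous jobs is not handled.

```python
def read_fasta_records(text):
    """Split fasta text into one string per record (header + sequence lines)."""
    records, current = [], []
    for line in text.splitlines():
        if line.startswith(">") and current:
            records.append("\n".join(current) + "\n")
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current) + "\n")
    return records


def split_fasta(records, max_records=5000, max_bytes=6_000_000):
    """Group records into batches that stay under both limits.

    A single record larger than max_bytes still gets a batch of its own.
    """
    batches, current, size = [], [], 0
    for rec in records:
        rec_size = len(rec.encode("utf-8"))
        if current and (len(current) >= max_records or size + rec_size > max_bytes):
            batches.append(current)
            current, size = [], 0
        current.append(rec)
        size += rec_size
    if current:
        batches.append(current)
    return batches


# Tiny hypothetical example, with max_records lowered to 2 for demonstration.
fasta = ">a\nMKT\n>b\nGGA\nTTC\n>c\nAAA\n"
records = read_fasta_records(fasta)
batches = split_fasta(records, max_records=2)
print([len(batch) for batch in batches])
```

Each batch can be written out with `"".join(batch)` and submitted separately, which is exactly the repetitive work the slide describes as impractical.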
Slide 34
How do I get GO for my 50,000-sequence RNA-Seq dataset?
GOanna is being deployed on the iPlant discovery environment:
  increased computing capacity
  faster BLAST searches
  no limitations on file number or size
http://www.iplantcollaborative.org/
Slide 35
GO annotation of RNA-Seq data
1. Retrieve any existing GO annotation for gene products.
   Genome2Seq: rapidly retrieves a fasta file of sequences and GO based on genome co-ordinates generated from RNA-Seq data.
2. InterProScan identifies functional motifs and domains.
   Can be mapped to GO terms (IEA).
   VERY computationally intensive: do this on HPC resources; being implemented on iPlant.
   Improved results if transcripts are translated (e.g. EMBOSS).
3. BLAST-based similarity transfer (ISA), e.g. Blast2GO, GOanna.
   Should only transfer GO annotations based upon direct experimental evidence codes.
   Need to test a sample set to determine good matches/E-values.
4. Combine GO annotations into a single file.
   Remove duplicates.
Slide 37
Adding GO Annotation
GO annotations are usually added as gene association files.
Check the number of columns.
Can check file format against the GO guide: http://www.geneontology.org/GO.format.annotation.shtml
Check your analysis tool:
  does it accept additional GO annotations?
  what format is required?
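Checking the number of columns is easy to automate. Gene association files are tab-separated, comment lines start with "!", and GAF 2.x has 17 columns where the older GAF 1.0 had 15. The sketch below reports data lines whose column count differs from what is expected; the function name is hypothetical, and `expected` should be set to whatever the analysis tool requires.

```python
def check_gaf_columns(lines, expected=17):
    """Return (line_number, n_columns) for data lines with the wrong column count."""
    problems = []
    for number, line in enumerate(lines, start=1):
        # Skip comment/header lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        n_cols = len(line.rstrip("\n").split("\t"))
        if n_cols != expected:
            problems.append((number, n_cols))
    return problems


# Hypothetical file content: one 17-column line and one 15-column line.
lines = [
    "!gaf-version: 2.2",
    "\t".join(["x"] * 17),
    "\t".join(["x"] * 15),
]
problems = check_gaf_columns(lines)
print(problems)
```

With a real file, pass the open file handle directly, for example `check_gaf_columns(open(path))`, where `path` is the gene association file to check.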
Slide 38
GO Enrichment tools that support agricultural species.