Prioritization of Avian GO Annotation


Structural Annotation

Species   Genome Build [2]   No. Entrez Genes   No. Proteins (NRPD)   % predicted proteins   Proteins/gene
Human     36.3               36,437             415,830               4.91                   11.41
Mouse     37.1               64,018             228,696               9.28                   3.57
Rat [1]   3.4                49,516             108,069               29.99                  2.18
Chicken   2.1                19,979 [3]         31,819 [3]            46.62 [4]              1.59 [5]

Bracketed numbers refer to the numbered notes that follow.

1. The rat genome was published only 8 months before the chicken genome, yet rat has 2x as many genes in Entrez Gene and 3x as many proteins.

2. After two genome builds, 5% of the chicken genomic sequence has still not been assigned to a chromosome, and the mini-chromosomes have not been sequenced.

3. Chicken genes and proteins are under-represented in public databases.

4. Of the chicken proteins available from NRPD, almost half are predicted based upon computational analysis.

5. On average, chicken has only about 1.6 proteins per gene, so very little is known about isoforms and alternate transcripts in the chicken gene products.

NRPD: Non-redundant Protein Database
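The proteins/gene column and the rat-versus-chicken comparison in note 1 follow directly from the gene and protein counts. A minimal sketch of that arithmetic, using only the counts from the table; the dictionary layout and the function names `proteins_per_gene` and `fold_vs_chicken` are illustrative and not part of the original slides:

```python
# Counts copied from the Structural Annotation table (Entrez Gene and NRPD).
counts = {
    "Human":   {"genes": 36_437, "proteins": 415_830},
    "Mouse":   {"genes": 64_018, "proteins": 228_696},
    "Rat":     {"genes": 49_516, "proteins": 108_069},
    "Chicken": {"genes": 19_979, "proteins": 31_819},
}


def proteins_per_gene(species: str) -> float:
    """Average number of NRPD proteins per Entrez gene for one species."""
    c = counts[species]
    return c["proteins"] / c["genes"]


def fold_vs_chicken(species: str, key: str) -> float:
    """How many times larger a count ("genes" or "proteins") is than chicken's."""
    return counts[species][key] / counts["Chicken"][key]


for name in counts:
    print(f"{name:8s} {proteins_per_gene(name):6.2f} proteins/gene")

print(f"Rat vs chicken genes:    {fold_vs_chicken('Rat', 'genes'):.1f}x")
print(f"Rat vs chicken proteins: {fold_vs_chicken('Rat', 'proteins'):.1f}x")
```

Rounded to two decimals, the ratios reproduce the proteins/gene column of the table.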


Phase 1: “Breadth”

7,478 chicken entries in UniProtKB.

GOA provides IEA mapping for UniProtKB entries.

The initial strategy for AgBase biocurators was to add GO to chicken gene products that had none.

Since 46% of the chicken proteins in NRPD were predicted, they would have no GO: IEA, ISS, ISO….
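The "breadth" strategy amounts to finding gene products with no GO annotation at all. The sketch below shows one way that triage could be expressed. It is not AgBase's actual pipeline: the accessions, the record layout, and the `classify` function are hypothetical, and treating IEA as the only "computational" evidence code is an assumption made for illustration.

```python
from collections import defaultdict

# Hypothetical annotation records: (gene product ID, GO term ID, evidence code).
annotations = [
    ("P0001", "GO:0005515", "IDA"),
    ("P0001", "GO:0005634", "IEA"),
    ("P0002", "GO:0016301", "IEA"),
]

# Hypothetical list of all gene products under consideration.
gene_products = ["P0001", "P0002", "P0003"]

# Assumption: IEA is the only code counted as purely computational here.
COMPUTATIONAL = {"IEA"}


def classify(gene_products, annotations):
    """Label each gene product as 'no GO', 'computational GO', or 'manual GO'."""
    codes = defaultdict(set)
    for product, _go_id, evidence in annotations:
        codes[product].add(evidence)

    status = {}
    for product in gene_products:
        evidence = codes.get(product, set())
        if not evidence:
            status[product] = "no GO"
        elif evidence <= COMPUTATIONAL:
            status[product] = "computational GO"
        else:
            status[product] = "manual GO"
    return status


status = classify(gene_products, annotations)
# Gene products that a "breadth" pass would target first.
breadth_targets = [p for p, s in status.items() if s == "no GO"]
print(status)
print(breadth_targets)
```

The three labels mirror the chart categories (no GO, computational GO, manual GO); real input would more likely come from a gene association file than from an in-memory list.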


Functional Annotation

[Figure: bar chart of the % of gene products annotated for Human, Mouse, Rat, and Chicken. Legend: no GO, AgBase, computational GO, manual GO.]

The proportion of chicken gene products with GO is over-represented because chicken gene products are under-represented in public databases.


Phase 2: “Depth”


What are the community needs?


GO Annotation of Arrays

DelMar14K, FHCRC, Tgu array

44K Agilent oligo array

AIIM array, Affymetrix

Should we be focusing on arrays? Which arrays should we annotate?


GO Annotation Priorities?

Provide “breadth” of coverage

Annotate products represented on arrays

Reference Genome targets

Subject areas (immunity, nutrition/metabolism, development)

Ad hoc as requested