Future Plans for the HGNC
Elspeth Bruford
HGNC Team
Elspeth Bruford
Susan Tweedie*
Ruth Seal
Kris Gray
Beth Yates
Welcome to Susan
* starting 8.12.2014
Funded Aims (2012-2017)
1. continue naming of human protein-coding genes, pseudogenes & RNA genes
2. continue reassignment of uninformative symbols based on functional data
3. coordinate gene naming across vertebrates
4. assign gene names within complex families across vertebrate species (olfactory receptors, cytochrome P450s)
Human Gene Naming
• Locus types: complex cases
segregating pseudogenes: pseudogene/protein coding
pseudogenes/lncRNAs
smORFs: lncRNA/protein coding
MT-RNR2: rRNA/protein coding
• Continue to display only one locus type per symbol?
Proteogenomics
• Proposed workshop with neXtProt, UniProt, Havana, RefSeq, HPA, PeptideAtlas etc. (Tress, Pandey, Rinn, Ponting, smORFs)
• Discuss complex cases and genes encoding UniProt PE3/4/5
NB we currently have 316 entries of locus type “unknown”
• Protein nomenclature – separate field? e.g. TACR3, “tachykinin receptor 3” vs “neuromedin K receptor”
Bidirectional promoters
Grzechnik et al. 2014, TiBS
• lncRNA lies “head to head” (TSSs < 1 kb) with protein coding gene on antisense strand
• implications for transcription of both loci
• currently denoted in gene name, e.g. FOXG1-AS1, “FOXG1 antisense RNA 1 (head to head)”
• have proposed naming these as GENE1-AU1, “gene 1 antisense upstream RNA 1”
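The head-to-head criterion and the proposed -AU naming pattern can be sketched in a few lines of Python. This is only an illustration, not an HGNC tool: the `Locus` class, the function names, the coordinates and the requirement that the two loci are transcribed away from each other without overlapping are all assumptions made for the example; only the 1 kb threshold and the "GENE1-AU1" pattern come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    symbol: str
    chrom: str
    strand: str  # "+" or "-"
    tss: int     # transcription start site (hypothetical coordinate)


def is_head_to_head(lncrna: Locus, gene: Locus, max_gap: int = 1000) -> bool:
    """Opposite strands, divergent orientation, TSSs less than max_gap bp apart."""
    if lncrna.chrom != gene.chrom or lncrna.strand == gene.strand:
        return False
    # Assumption: "head to head" means the minus-strand TSS sits at or before
    # the plus-strand TSS, so the two loci are transcribed away from each other.
    plus, minus = (lncrna, gene) if lncrna.strand == "+" else (gene, lncrna)
    gap = plus.tss - minus.tss
    return 0 <= gap < max_gap


def proposed_symbol(gene: Locus, number: int = 1) -> tuple[str, str]:
    """Build a symbol and name following the proposed GENE1-AU1 pattern."""
    return (
        f"{gene.symbol}-AU{number}",
        f"{gene.symbol} antisense upstream RNA {number}",
    )


# Hypothetical loci: a plus-strand gene and a minus-strand lncRNA 500 bp upstream.
gene = Locus("GENE1", "chr1", "+", 10_500)
lnc = Locus("unnamed_lncRNA", "chr1", "-", 10_000)
if is_head_to_head(lnc, gene):
    print(proposed_symbol(gene))
```

Whether overlapping 5' ends should also count as head to head is a curation decision that this sketch deliberately leaves out.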
Human Phenotype Naming
• Historically HGNC maintained symbols for mapped phenotypes where causative gene was unknown
• Many of these fall into series, e.g. MRX#, DFN#, JBTS#
• Now only approve new phenotypic symbols upon request, usually via direct contact from researchers
• Allows researchers to reserve provisional symbols that can be confirmed upon acceptance of ms
• In the last decade OMIM have started assigning new members to these series without notification/consultation
• Many have been added retrospectively
• Some confusion caused, attempts made to minimise this but…
• …OMIM have told us they want to take over assignment
Human Phenotype Ontology
• Being developed by Peter Robinson et al in Berlin
• Standardized vocabulary of phenotypic abnormalities encountered in human disease
• ~10,000 terms, over 50,000 annotations to inherited diseases
• Created using information from literature, Orphanet, Decipher & OMIM
• CEP290: mutations can cause multiple phenotypes
• CEP290 has 143 HPO terms
• HPO PhenExplorer
• How/where to display 143 terms?
• Likewise, should we include GO terms?
Renaming
• Literature searches
• Comparison with UniProt names
• CFAP# genes example – working with expert: 19 genes renamed, 19 aliased
Gene Naming Across Vertebrates
• What species? chimp > dog > cow > pig > horse > ?
• GeneFam – website branding
Database ID format: e.g. HGNCGF:#, GeneFam:#, HGNC:PTRO# ?
Semi-automated naming of consensus 1:1 orthologs as identified by 4 comprehensive orthology resources:
• Ensembl Compara
• Homologene
• OMA
• Panther
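One way to read "consensus 1:1 orthologs" is: keep a human gene only when every one of the four resources predicts exactly one ortholog for it, that ortholog maps back to no other human gene, and all four resources agree. The Python sketch below illustrates that reading. It is not the HGNC pipeline; the function names, the input layout and the gene identifiers are hypothetical.

```python
from collections import defaultdict

RESOURCES = ("Ensembl Compara", "Homologene", "OMA", "Panther")


def one_to_one_pairs(pairs):
    """Return the (human, ortholog) pairs that are 1:1 within one resource."""
    human_to_orth = defaultdict(set)
    orth_to_human = defaultdict(set)
    for human, orth in pairs:
        human_to_orth[human].add(orth)
        orth_to_human[orth].add(human)
    result = set()
    for human, orths in human_to_orth.items():
        if len(orths) != 1:
            continue  # one-to-many: not a 1:1 ortholog
        (orth,) = orths
        if len(orth_to_human[orth]) == 1:
            result.add((human, orth))
    return result


def consensus_one_to_one(predictions):
    """Keep only the 1:1 pairs on which all four resources agree."""
    per_resource = [one_to_one_pairs(predictions.get(r, ())) for r in RESOURCES]
    return dict(sorted(set.intersection(*per_resource)))


# Hypothetical predictions: GENE_B is one-to-many in one resource, so it drops out.
predictions = {
    "Ensembl Compara": [("GENE_A", "chimp_1"), ("GENE_B", "chimp_2"), ("GENE_B", "chimp_3")],
    "Homologene": [("GENE_A", "chimp_1"), ("GENE_B", "chimp_2")],
    "OMA": [("GENE_A", "chimp_1"), ("GENE_B", "chimp_2")],
    "Panther": [("GENE_A", "chimp_1"), ("GENE_B", "chimp_2")],
}
print(consensus_one_to_one(predictions))
```

Genes that fail the consensus would be left for manual curation, which is where the "semi" in semi-automated comes in.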
HCOP – HGNC Comparison of Orthology Predictions tool
compares human to 17 species: chimp, macaque, mouse, rat, dog, horse, cow, pig, opossum, platypus, chicken, Anole lizard, Xenopus, zebrafish, C. elegans, Drosophila and S. cerevisiae
using data from 12 resources:
Text mining
• Recent study by EBI Literature Services team
• Looked at gene symbol usage in full text articles in EuropePMC
• In 2006 Tamames & Valencia article estimated usage at 30% HGNC symbols vs 70% synonyms
• In 2014 – 70% HGNC symbols vs 30% synonyms
• Awaiting data on the recalcitrant 30%
can we update to what is being used?
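The kind of tally behind those percentages can be illustrated with a small Python sketch that counts mentions of approved symbols versus other identifiers in a piece of text. This is not the method used by the EBI Literature Services team: the function name, the matching rule and the sample sentence are assumptions made for the example. The symbol pairs are the ones quoted in the Journals slide.

```python
import re
from collections import Counter


def count_symbol_usage(text, synonyms_by_symbol):
    """Count whole-token mentions of approved symbols and of their synonyms."""
    counts = Counter(approved=0, synonym=0)
    for approved, synonyms in synonyms_by_symbol.items():
        terms = [(approved, "approved")] + [(s, "synonym") for s in synonyms]
        for term, kind in terms:
            # Do not match inside longer tokens such as "lincRNA-ENST00000515084".
            pattern = r"(?<![\w-])" + re.escape(term) + r"(?![\w-])"
            counts[kind] += len(re.findall(pattern, text))
    return counts


# Approved symbol -> earlier identifiers, taken from the Journals slide.
synonyms_by_symbol = {
    "LBX2-AS1": ["AK096725"],
    "LINC01373": ["lincRNA-ENST00000515084"],
}

# Invented sentence, for illustration only.
text = (
    "AK096725 and LBX2-AS1 were measured; LBX2-AS1 was higher. "
    "lincRNA-ENST00000515084 was unchanged."
)
counts = count_symbol_usage(text, synonyms_by_symbol)
share = counts["approved"] / sum(counts.values())
print(counts, f"approved share = {share:.0%}")
```

Real full-text mining also has to handle symbols that collide with ordinary words or with symbols from other species, which plain string matching like this does not attempt.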
Journals
• HGNC have checked manuscripts for Genomics since 1990s (or earlier?)
• 113 mss in the last year
• impact factor in 1994: 5.037 > now: 2.793
• possible alternatives: PLOS One? (IF 2013 = 3.534)
publishes ncRNA papers
e.g. in June, PMID 24905231 cited AK096725 (LBX2-AS1) and ENST00000453068 (now CYP51A1-AS1)
in May, PMID 24879036 cited lincRNA-ENST00000515084 (now LINC01373) - NB ENST00000515084 is a retired ID
BUT huge no. of submissions, how to filter/identify, speed, logistics etc
• other suggestions?
Creative Commons?
• Public copyright licence, now on version 4.0
• Different types
• allow adaptations of your work to be shared?
  • yes
  • no
  • yes, as long as others share alike
• CC0 – public domain dedication
Computing
http://hgnc.github.io/reveal-talks/it-future-plans-sab2014.html
Complex Gene Families
• Olfactory receptors – Tsviya
• Cytochrome P450s – Jed & David