Searching for selenoproteins in the genomes of eukaryotic organisms
![Page 1: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/1.jpg)
Searching for selenoproteins in the genomes of eukaryotic organisms
Bioinformatics
Didac Santesmasses
PhD student
Bioinformatics and genomics programme
Roderic Guigó's group
Centre for Genomic Regulation, Barcelona
![Page 2: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/2.jpg)
GPx6 (selenoprotein)
GPx5 (cysteine homologue)
Genes on human chromosome 6
The only selenoprotein gene on chromosome 6
Dark Blue: coding
![Page 3: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/3.jpg)
Structure of a selenoprotein gene
![Page 4: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/4.jpg)
Selenoprotein families include selenoproteins and cysteine homologues (= orthologues or paralogues)
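As a rough illustration of the Sec/Cys distinction, the sketch below labels a homologue by the residue aligned to the selenocysteine (U) of a reference selenoprotein. The function name, the labels, and the toy sequences are hypothetical. It assumes both strings are rows of the same alignment, with gaps written as `-`.

```python
def classify_homologue(aligned_ref: str, aligned_hit: str) -> str:
    """Label a hit by the residue aligned to the reference Sec (U).

    Both strings must be rows of the same alignment (equal length,
    gaps as '-'). The labels are illustrative, not official nomenclature.
    """
    if len(aligned_ref) != len(aligned_hit):
        raise ValueError("aligned sequences must have the same length")
    sec_columns = [i for i, aa in enumerate(aligned_ref) if aa == "U"]
    if not sec_columns:
        raise ValueError("reference has no selenocysteine (U)")
    residues = {aligned_hit[i] for i in sec_columns}
    if residues == {"U"}:
        return "selenoprotein"
    if residues == {"C"}:
        return "cysteine homologue"
    return "other"


# Toy alignment, invented for illustration only.
ref = "MAGU-KLT"
print(classify_homologue(ref, "MAGU-KLT"))  # selenoprotein
print(classify_homologue(ref, "MSGCAKLT"))  # cysteine homologue
print(classify_homologue(ref, "MSG--KLT"))  # other
```

A hit labelled "other" may simply be a truncated or misaligned prediction, so it deserves manual inspection rather than automatic rejection.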
![Page 5: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/5.jpg)
Selenoproteins are generally misannotated
Percentages are computed by comparing Selenoprofiles and Ensembl annotations; see Mariotti and Guigó (2010), Bioinformatics.
![Page 6: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/6.jpg)
Bioinformatics methods for selenoproteins
• De novo: Selenogeneid (Castellano et al. 2001)
• Homology-based approaches:
  - UGA / Sec or UGA / Cys alignments (e.g. Kryukov et al. 2003)
  - Selenoprofiles (Mariotti and Guigó 2010)
  - Seblastian (Mariotti et al. 2013)
• SECIS prediction:
  - SECISearch (Kryukov et al. 2003)
  - SECISearch3 (Mariotti et al. 2013), explained later
![Page 7: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/7.jpg)
Why selenocysteine?
• What is cysteine used for?
Mainly for disulfide bonds: inter- or intramolecular, important for the folding and stability of proteins, and in many cases necessary for catalysis.
Thioredoxins are a large class of proteins that perform redox reactions using thiol/disulfide switches (catalytic cysteines).
![Page 8: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/8.jpg)
Why selenocysteine?
The term “Thioredoxin” generally designates small oxidoreductase proteins that constitute a pool of enzymes for antioxidant defense. The thioredoxin system includes these proteins and those operating on them, either using them as electron donors or keeping them reduced.
Plenty of other proteins possess a thioredoxin-like fold, relying on the same local structure that includes a thiol/disulfide switch, and performing redox functions.
Selenocysteine is found almost always replacing a catalytic cysteine in redox proteins. Many selenoproteins possess a thioredoxin-like fold. Some are involved in the thioredoxin system. Consistently, selenocysteine is more reactive than cysteine, and it is a better reductant.
However, some selenoproteins have totally unrelated functions (e.g. SelJ = eye crystallin, SelP = selenium storage and transport).
![Page 9: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/9.jpg)
Examples: Thioredoxin Reductases (TR)
$$\text{Trx-S}_2 + \text{NADPH} + \text{H}^+ \xrightarrow{\text{TR}} \text{Trx-(SH)}_2 + \text{NADP}^+$$
• TR3: mitochondrial. Found in all vertebrates
• TGR (TR2): can reduce glutathione disulfide. Contains a glutaredoxin (Grx) domain. Found only in tetrapods
• TR1: cytosolic. Found in all vertebrates
Three paralogous genes of this family are present in human, all of which encode selenoproteins. Sec is the penultimate residue, and it is catalytic.
TRs are the only enzymes responsible for the reduction of thioredoxins in the cell.
![Page 10: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/10.jpg)
Glutathione Peroxidases (GPx)
3 groups: GPx1/GPx2, GPx3/GPx5/GPx6, GPx4/GPx7/GPx8
GPx6 converted to Cys independently in 3 species
• GPx1: cytosolic, abundant in liver and red blood cells
• GPx2: cytosolic, found in liver and gastrointestinal system
• GPx3: secreted, found in plasma and intestine
• GPx4: cytosolic / mitochondrial, abundant in testis. Can reduce phospholipid hydroperoxides
• GPx5: secreted / membrane bound. Found only in epididymis
• GPx6: secreted (?). Found in olfactory epithelium and embryonic tissues
• GPx7: secreted (?)
• GPx8: membrane bound (?)
$$\text{R-OOH} + 2\,\text{GSH} \xrightarrow{\text{GPx}} \text{R-OH} + \text{H}_2\text{O} + \text{GSSG}$$
Reduces hydroperoxides at the expense of glutathione (GSH).
8 paralogues in human, 5 with Sec
![Page 11: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/11.jpg)
![Page 12: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/12.jpg)
Selenoproteins in mammals / vertebrates
28
20 duplications
9 gene losses
13 Sec → Cys conversions
![Page 13: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/13.jpg)
Among vertebrates, selenoproteins are quite conserved: most of them are found in all species, and few transformations have occurred. This is very different from the situation in insects:
(Chapple and Guigó, 2008)
![Page 14: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/14.jpg)
Selenoproteins as test case
• Selenoproteins have the peculiar characteristic of containing an in-frame UGA codon, which is recoded to selenocysteine owing to the presence of the SECIS element.
• If you learn how to predict selenoproteins, you will be able to do the same with any “standard” protein family.
• Bioinformatics project: find all selenoproteins in a given genome
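To make the UGA recoding concrete, here is a minimal sketch of a translation routine that writes in-frame internal TGA codons as U (Sec) instead of stopping. It relies on a strong simplifying assumption, stated in the code: every TGA that is not the final codon is treated as selenocysteine. The function name is hypothetical, and real predictions need SECIS and homology evidence.

```python
from itertools import product

# Standard genetic code in the usual T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(p) for p in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate_with_sec(cds: str) -> str:
    """Translate a CDS, writing in-frame internal TGA codons as U (Sec).

    Assumption: every TGA that is not the final codon is a recoded
    selenocysteine. This is a simplification for illustration.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA or DNA input
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    protein = []
    for index, codon in enumerate(codons):
        is_last = index == len(codons) - 1
        if codon == "TGA" and not is_last:
            protein.append("U")
            continue
        aa = CODON_TABLE.get(codon, "X")  # X for codons with ambiguous bases
        if aa == "*":
            break  # ordinary stop codon: end of translation
        protein.append(aa)
    return "".join(protein)


print(translate_with_sec("ATGTGATGCTAA"))  # MUC
print(translate_with_sec("AUGUGCUGA"))     # MC
```

A standard translator would stop at the first TGA, which is one way a selenoprotein can end up truncated or misannotated.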
![Page 15: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/15.jpg)
UPF Biology. Academic years 2007-15
2007/08 – 2008/09: find all selenoproteins in a given protist genome
2009/10 – 2011/12: find a given selenoprotein family in all protist genomes
2012/13 – 2014/15: find all selenoproteins in a given vertebrate genome
![Page 16: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/16.jpg)
![Page 17: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/17.jpg)
Objective: find all selenoproteins in a given vertebrate genome
Project 2014-15: selenoproteins in vertebrates
http://bioinformatica.upf.edu/
![Page 18: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/18.jpg)
Useful resources
Sequences of your assigned species:
• Ensembl: collection of genomes (and annotations)
Your assigned genomes will be available on the UPF computers when you start the project.
![Page 19: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/19.jpg)
Useful resources
Sequences of your assigned species:
• NCBI nucleotide: collection of all sequences (genomes, ESTs, etc.)
If you do not find something that you expect to be there, look for other sources of sequences. Most genomes today are of low or medium quality.
![Page 20: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/20.jpg)
Useful resources
Selenoprotein sequences:
• SelenoDB 2.0 (and 1.0): database containing manual annotations for human, and automated annotations (Selenoprofiles) for other vertebrate species.
• NCBI protein: NCBI hosts the set of all known proteins. Noisy, but comprehensive. Here you can find more selenoprotein sequences by searching by homology (BLAST) or by keywords.
![Page 21: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/21.jpg)
Useful resources
Tools:
• Blast - typically tblastn
• Exonerate - protein2genome mode
• Genewise
• Web server with SECISearch3 and Seblastian: http://seblastian.crg.es/
S14. Genome annotation (I)
S15. Genome annotation (II)
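As a hedged sketch of how the command-line tools above could be scripted, the snippet below only builds and prints the commands; it does not run them. The file names (`genome.fa`, `selenoproteins.fa`, `hits.tsv`) and the e-value cutoff are placeholders chosen for illustration, not values prescribed by the course. Genewise is left out.

```python
import shlex


def makeblastdb_command(genome_fasta):
    """Command to index a genome FASTA as a nucleotide BLAST database."""
    return ["makeblastdb", "-in", genome_fasta, "-dbtype", "nucl"]


def tblastn_command(proteins_fasta, genome_db, output_path, evalue=1e-5):
    """Command to search protein queries against a nucleotide database.

    The e-value cutoff is an arbitrary placeholder; -outfmt 6 asks for
    tabular output.
    """
    return ["tblastn", "-query", proteins_fasta, "-db", genome_db,
            "-evalue", str(evalue), "-outfmt", "6", "-out", output_path]


def exonerate_command(protein_fasta, genomic_fasta):
    """Command to align one protein to a genomic region, allowing introns."""
    return ["exonerate", "--model", "protein2genome",
            "--query", protein_fasta, "--target", genomic_fasta,
            "--showtargetgff", "yes"]


# Hypothetical file names, for illustration only.
for command in (
    makeblastdb_command("genome.fa"),
    tblastn_command("selenoproteins.fa", "genome.fa", "hits.tsv"),
    exonerate_command("selenoproteins.fa", "genome.fa"),
):
    print(shlex.join(command))
```

Once the tools are installed, each list can be passed to `subprocess.run(command, check=True)`. In practice exonerate is usually run on a region extracted around a tblastn hit rather than on the whole genome, since protein2genome alignment is slow.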
![Page 22: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/22.jpg)
SECISearch 3
Mariotti et al. 2013
Based on a manually curated secondary structure alignment
Combines up to 3 methods to ensure maximum sensitivity
Filter and grading procedure based on manual inspection of hundreds of SECIS elements
![Page 23: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/23.jpg)
Seblastian
Mariotti et al. 2013
Assumptions:
• the presence of a detectable SECIS within an acceptable genomic distance from the Sec-UGA
• annotated homologue(s) (Sec/Cys) in the reference protein database
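The distance assumption can be pictured with a small check like the one below. This is not Seblastian's code: the function name is hypothetical, and the `max_distance` default is a placeholder rather than the cutoff the tool actually uses. It assumes genomic coordinates and that a SECIS must lie downstream of the UGA on the same strand.

```python
def secis_in_range(uga_pos, uga_strand, secis_pos, secis_strand,
                   max_distance=10_000):
    """Return True if the SECIS is downstream of the Sec-UGA, on the same
    strand, and no more than max_distance nucleotides away.

    Positions are genomic coordinates. max_distance is a placeholder
    value, not the threshold used by Seblastian.
    """
    if uga_strand not in "+-" or len(uga_strand) != 1:
        raise ValueError("strand must be '+' or '-'")
    if uga_strand != secis_strand:
        return False
    if uga_strand == "+":
        distance = secis_pos - uga_pos   # downstream = higher coordinate
    else:
        distance = uga_pos - secis_pos   # downstream = lower coordinate
    return 0 < distance <= max_distance


print(secis_in_range(1000, "+", 3500, "+"))   # True
print(secis_in_range(1000, "+", 500, "+"))    # False: SECIS is upstream
print(secis_in_range(1000, "-", 400, "-"))    # True
print(secis_in_range(1000, "+", 3500, "-"))   # False: opposite strand
```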
![Page 24: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/24.jpg)
![Page 25: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/25.jpg)
Notes for the project
• Results must be presented in a web page with the structure of a scientific paper
• A file with the amino acid sequences of all selenoproteins identified must be provided, plus another file with all SECIS elements identified
• All genes should be as complete as possible: starting with an AUG, ending with a stop codon, and with an identified SECIS element downstream
• Ignore alternative isoforms (if any); just choose one
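The completeness criteria above could be checked with a small helper along these lines. It is a sketch with hypothetical names: `secis_offset` is assumed to be the number of nucleotides between the stop codon and the predicted SECIS in transcript orientation, or `None` when no SECIS was found.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_gene_model(cds: str, secis_offset: Optional[int]) -> dict:
    """Report which completeness criteria a predicted gene satisfies.

    cds: predicted coding sequence (DNA or RNA), start to stop codon.
    secis_offset: distance from the stop codon to the SECIS in transcript
    orientation, or None if no SECIS was predicted (hypothetical input).
    """
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    return {
        "starts_with_atg": cds.startswith("ATG"),
        "ends_with_stop": len(cds) % 3 == 0 and cds[-3:] in STOP_CODONS,
        "has_inframe_tga": "TGA" in codons[:-1],
        "secis_downstream": secis_offset is not None and secis_offset > 0,
    }


print(check_gene_model("ATGTGATGCTAA", 250))
print(check_gene_model("TGCTGATGC", None))
```

A gene that fails one of these checks is not necessarily wrong: the assembly may be incomplete at that locus, which is worth stating in the discussion.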
![Page 26: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/26.jpg)
Notes for the project
• Also report the genes of the selenoprotein machinery: SecS, eEFsec, pstk, secp43, SBP2, SPS1, (SPS2).
Ignore tRNAsec, for technical reasons
![Page 27: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/27.jpg)
Common pitfalls
• Zero, one or many genes? Be careful with superfamilies and gene duplications
• Know what to expect
• Genome assemblies are not perfect!
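One way to approach the "zero, one or many genes" question is to merge overlapping hits before counting, since several queries from the same superfamily often hit the same locus. The sketch below assumes hits are given as `(seq_id, strand, start, end)` tuples; the function name and the example coordinates are invented for illustration.

```python
from collections import defaultdict


def cluster_hits(hits):
    """Merge overlapping hits on the same sequence and strand into loci.

    hits: iterable of (seq_id, strand, start, end) with start <= end.
    Returns a sorted list of (seq_id, strand, start, end) merged loci.
    """
    by_location = defaultdict(list)
    for seq_id, strand, start, end in hits:
        by_location[(seq_id, strand)].append((start, end))

    loci = []
    for (seq_id, strand), intervals in sorted(by_location.items()):
        intervals.sort()
        current_start, current_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= current_end:
                # Overlaps the current locus: extend it.
                current_end = max(current_end, end)
            else:
                loci.append((seq_id, strand, current_start, current_end))
                current_start, current_end = start, end
        loci.append((seq_id, strand, current_start, current_end))
    return loci


# Invented coordinates, for illustration only.
hits = [
    ("chr6", "+", 100, 400),
    ("chr6", "+", 350, 600),
    ("chr6", "+", 5000, 5300),
    ("chr1", "-", 10, 90),
]
print(cluster_hits(hits))
```

Merging only handles overlapping hits. Exons of a single gene are separated by introns and may still appear as distinct loci, while tandem duplicates can sit very close together, so each locus still needs manual inspection.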
![Page 28: Recerca de selenoprote ïnes en el genoma d’organimes eucariotes](https://reader035.fdocuments.us/reader035/viewer/2022062520/5681651c550346895dd79b91/html5/thumbnails/28.jpg)
Evaluation
The projects will be evaluated based on:
- results: you are expected to find all selenoproteins in your assembly
- discussion: interpret your results logically
- methods: scripting is encouraged (but not compulsory)
- presentation: the web page should present the work as clearly as possible