Ensembl Tools - European Bioinformatics Institute
EBI is an Outstation of the European Molecular Biology Laboratory.
Ensembl Tools
Questions?
• We’ve muted all the mics
• Ask questions in the Chat box in the webinar interface
• I will check the Chat box periodically for questions
• There’s no threading so please respond with @name
Objectives
• What is Ensembl?
• What tools are available in Ensembl?
• How to use the online tools in Ensembl.
• Where to go for help and documentation.
Overview
• Introduction to Ensembl
• BLAST/BLAT: sequence searching
• Assembly Converter: convert files between genome assemblies
• Data Slicer: pull out sections of VCF and BAM files
• File Chameleon: custom download of reference files for NGS analysis
• Variant Effect Predictor (VEP): analyse your own variants
Introduction
Why do we need genome browsers?
1977: 1st genome to be sequenced (5 kb)
2004: finished human sequence (3 Gb)
CGGCCTTTGGGCTCCGCCTTCAGCTCAAGACTTAACTTCCCTCCCAGCTGTCCCAGATGACGCCATCTGAAATTTCTTGGAAACACGATCACTTTAACGGAATATTGCTGTTTTGGGGAAGTGTTTTACAGCTGCTGGGCACGCTGTATTTGCCTTACTTAAGCCCCTGGTAATTGCTGTATTCCGAAGACATGCTGATGGGAATTACCAGGCGGCGTTGGTCTCTAACTGGAGCCCTCTGTCCCCACTAGCCACGCGTCACTGGTTAGCGTGATTGAAACTAAATCGTATGAAAATCCTCTTCTCTAGTCGCACTAGCCACGTTTCGAGTGCTTAATGTGGCTAGTGGCACCGGTTTGGACAGCACAGCTGTAAAATGTTCCCATCCTCACAGTAAGCTGTTACCGTTCCAGGAGATGGGACTGAATTAGAATTCAAACAAATTTTCCAGCGCTTCTGAGTTTTACCTCAGTCACATAATAAGGAATGCATCCCTGTGTAAGTGCATTTTGGTCTTCTGTTTTGCAGACTTATTTACCAAGCATTGGAGGAATATCGTAGGTAAAAATGCCTATTGGATCCAAAGAGAGGCCAACATTTTTTGAAATTTTTAAGACACGCTGCAACAAAGCAGGTATTGACAAATTTTATATAACTTTATAAATTACACCGAGAAAGTGTTTTCTAAAAAATGCTTGCTAAAAACCCAGTACGTCACAGTGTTGCTTAGAACCATAAACTGTTCCTTATGTGTGTATAAATCCAGTTAACAACATAATCATCGTTTGCAGGTTAACCACATGATAAATATAGAACGTCTAGTGGATAAAGAGGAAACTGGCCCCTTGACTAGCAGTAGGAACAATTACTAACAAATCAGAAGCATTAATGTTACTTTATGGCAGAAGTTGTCCAACTTTTTGGTTTCAGTACTCCTTATACTCTTAAAAATGATCTAGGACCCCCGGAGTGCTTTTGTTTATGTAGCTTACCATATTAGAAATTTAAAACTAAGAATTTAAGGCTGGGCGTGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGTGGGCGGATCACTTGAGGCCAGAAGTTTGAGACCAGCCTGGCCAACATGGTGAAACCCTATCTCTACTAAAAATACAAAAAATGTGCTGCGTGTGGTGGTGCGTGCCTGTAATCCCAGCTACACGGGAGGTGGAGGCAGGAGAATCGCTTGAACCCTGGAGGCAGAGGTTGCAGTGAGCCAAGATCATGCCACTGCACTCTAGCCTGGGCCACATAGCATGACTCTGTCTCAAAACAAACAAACAAACAAAAAACTAAGAATTTAAAGTTAATTTACTTAAAAATAATGAAAGCTAACCCATTGCATATTATCACAACATTCTTAGGAAAAATAACTTTTTGAAAACAAGTGAGTGGAATAGTTTTTACATTTTTGCAGTTCTCTTTAATGTCTGGCTAAATAGAGATAGCTGGATTCACTTATCTGTGTCTAATCTGTTATTTTGGTAGAAGTATGTGAAAAAAAATTAACCTCACGTTGAAAAAAGGAATATTTTAATAGTTTTCAGTTACTTTTTGGTATTTTTCCTTGTACTTTGCATAGATTTTTCAAAGATCTAATAGATATACCATAGGTCTTTCCCATGTCGCAACATCATGCAGTGATTATTTGGAAGATAGTGGTGTTCTGAATTATACAAAGTTTCCAAATATTGATAAATTGCATTAAACTATTTTAAAAATCTCATTCATTAATACCACCATGGATGTCAGAAAAGTCTTTTAAGATTGGGTAGAAATGAGCCACTGGAAATTCTAATTTTCATTTGAAAGTTCACATTTTGTCATTGACAACAAACTGTTTTCCTTGCAGCAACAAGATCACTTCATTGATTTGTGAGAAAATGTCTACCAAATTATTTAAGTTGAAATAACTTTGTCAGCTGTTCTTTCAAGTAAAAATGACTTTTCATTGAAAAAATTGC
TTGTTCAGATCACAGCTCAACATGAGTGCTTTTCTAGGCAGTATTGTACTTCAGTATGCAGAAGTGCTTTATGTATGCTTCCTATTTTGTCAGAGATTATTAAAAGAAGTGCTAAAGCATTGAGCTTCGAAATTAATTTTTACTGCTTCATTAGGACATTCTTACATTAAACTGGCATTATTATTACTATTATTTTTAACAAGGACACTCAGTGGTAAGGAATATAATGGCTACTAGTATTAGTTTGGTGCCACTGCCATAACTCATGCAAATGTGCCAGCAGTTTTACCCAGCATCATCTTTGCACTGTTGATACAAATGTCAACATCATGAAAAAGGGTTGAAAAAAGGAATATTTTAATAGTTTTCAGTTACTTT
We need to make the data mean something…
http://www.ensembl.org
http://www.ncbi.nlm.nih.gov/mapview
http://genome.ucsc.edu
Ensembl Features
• Gene builds for ~70 species
• Gene trees
• Regulatory build (ENCODE)
• Variation display and VEP
• Display of user data
• BioMart (data export)
• Programmatic access via the APIs
• Completely Open Source
Access scales
• One by one: Main browser, Mobile site
• Groups: BioMart, REST API, VEP
• Whole genome: Perl API, MySQL, FTP
Vertebrate species on Ensembl
Image obtained using Dendroscope:
Dendroscope 3: An interactive tool for rooted phylogenetic trees and networks
D.H. Huson and C. Scornavacca, Systematic Biology, 2012
Non-vertebrates on Ensembl Genomes
Fungi
Bacteria
Plants
Protists
Metazoa
www.ensemblgenomes.org
Ensembl and Ensembl Genomes
• Released: Ensembl in 2000; Ensembl Genomes in 2009
• Species: Ensembl covers vertebrates (fly, worm and yeast as outgroups); Ensembl Genomes covers non-vertebrates (protists, plants, fungi, metazoa, bacteria)
• Annotation: Ensembl is annotated by Ensembl; Ensembl Genomes is annotated in collaboration with the scientific communities
• URL: www.ensembl.org (Ensembl); www.ensemblgenomes.org (Ensembl Genomes)
Release cycle
• Release 89: May 2017
• Release 90: July 2017
• Time between releases: 2-3 months
• New genome assemblies
• Updated variation data
• Updated regulation data
• New/updated interfaces
• Updated gene sets
• Compara on new genes and genomes
• Underlying software updates
Ensembl Tools
Tools allow:
• Interpretation and processing of your own data
• Custom download of Ensembl data for further analysis
BLAST/BLAT for sequence searching
• Find Ensembl sequences that match your sequence using BLAST/BLAT
• Search with:
  • Nucleotide sequences
  • Protein sequences
  • Short sequences (e.g. primers, morpholinos, siRNAs)
• Search against:
  • Genomic sequences
  • cDNA sequences
  • Protein sequences
Hands on – BLAST/BLAT
• I’ve designed a pair of primers for RT-PCR against human BRCA2
• I want to make sure they don’t have any non-specific hits that will mess up my RT-PCR results
• The sequences are:
>fwd
GAGGACTCCTTATGTCCAAATTT
>rev
GAGAATCAGCTTCTGGGGTAATAA
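Before pasting the primers into BLAST/BLAT it can help to check them locally. The sketch below is not part of any Ensembl tool; it is a small Python illustration, with helper names made up for this example, that reads the two FASTA records from the slide and reports each primer's length, GC fraction and reverse complement.

```python
# Minimal sketch: inspect the two RT-PCR primers before a BLAST/BLAT search.
# The helper names are illustrative and not part of any Ensembl tool.

PRIMERS_FASTA = """>fwd
GAGGACTCCTTATGTCCAAATTT
>rev
GAGAATCAGCTTCTGGGGTAATAA
"""

# Translation table that swaps each base for its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_fasta(text):
    """Return a dict mapping each record name to its sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


primers = parse_fasta(PRIMERS_FASTA)
for name, seq in primers.items():
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```

The FASTA text itself is what gets pasted into the BLAST/BLAT search box; the script only gives a quick sanity check of what is being searched.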
Assembly converter
• You have data mapped to an old genome assembly
• You want to update your data to map it to a new one
What is a genome assembly?
Sequence reads:
CGGCCTTTGGGCTCCGCCTTCAGCTCAAGA
TCCGCCTTCAGCTCAAGACTTAACTTC
GGGCTCCGCCTTCAGCTC
ACTTAACTTCCCTCCCAGCTGTCC
AACTTCCCTCCCAGCT
TCCCAGCTGTCCCAGATGACGCCATC
CAGATGACGCC
CAGCTGTCCCAGATGAC
CGGCCTTTGGGCTCC
Match up overlaps to give the genome assembly:
CGGCCTTTGGGCTCCGCCTTCAGCTCAAGACTTAACTTCCCTCCCAGCTGTCCCAGATGACGCCATC
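The idea of matching up overlaps can be sketched in a few lines of Python. This is a toy illustration only: it assumes the reads are error-free and already given in left-to-right order, which real assemblers cannot assume, and the function names are invented for this example. It uses four of the reads from the slide.

```python
# Toy sketch of "match up overlaps": merge reads that are already in
# left-to-right order. Real genome assemblers are far more involved.


def overlap(left, right):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_in_order(reads):
    """Extend the assembly with the non-overlapping tail of each read in turn."""
    assembly = reads[0]
    for read in reads[1:]:
        assembly += read[overlap(assembly, read):]
    return assembly


# Four of the reads shown on the slide, ordered left to right.
reads = [
    "CGGCCTTTGGGCTCCGCCTTCAGCTCAAGA",
    "TCCGCCTTCAGCTCAAGACTTAACTTC",
    "ACTTAACTTCCCTCCCAGCTGTCC",
    "TCCCAGCTGTCCCAGATGACGCCATC",
]
assembly = merge_in_order(reads)
print(assembly)
```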
Genome contigs
[Figure: a genome region made up of contigs labelled BL102, AL476, CM553 and IM768]
Reference alleles
[Figure: contigs BL102, AL476, CM553 and IM768; contig BL102 is shown with the sequence AGTCGTAGCTAGCTAGGCCATAGGCGA]
• Frequency T = 0.05, frequency G = 0.95
• G is the allele in all primates
• T causes disease susceptibility
• Perhaps G should be the reference allele?
• We can replace the region with a new contig
Genome Gaps
[Figure: contigs BL102, AL476, CM553 and IM768, with a gap in the genome sequence]
Gap in the genome caused by:
● Poor sequencing at this region
● No contig was ever cloned
We can fill in the gap with a new contig
Incorrectly assembled contigs
[Figure: contigs BL102, AL476, CM553 and IM768 shown in two different arrangements, illustrating a contig that was placed incorrectly]
New genome assemblies
• Fixing errors in the genome produces a new genome assembly
• New genome assemblies mean re-mapping of all genome features
• Ensembl will stop updating the old assembly when a new one is brought in
• You’ve got data mapped to the old assembly and you want to compare to the up-to-date Ensembl annotation
Assembly converter
• Converts genome coordinates to a different genome assembly.
• Works with:
  • BED (simple coordinates)
  • GFF (gene, transcript and exon coordinates)
  • GTF (gene, transcript and exon coordinates)
  • WIG (values plotted against the genome)
  • VCF (variants)
Hands-on – Assembly converter
• We’re going to convert a small BED file from the human genome assembly GRCh37 to the more recent GRCh38
• BED is a simple features format which lists the start and end coordinate of the feature.
5 36821734 37091336 P1
5 36731578 36978408 P2
5 36908654 37108773 P3
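The hands-on uses the web tool, but the same conversion can be approached programmatically. As a hedged sketch: the Ensembl REST API has a `map` endpoint of the form `/map/:species/:asm_one/:region/:asm_two`, and the Python below builds one such URL per BED line. The helper names are invented for this example, and the `+ 1` on the start assumes the file follows the usual BED convention of 0-based starts; if your coordinates are already 1-based, drop it. The network call is defined but not run.

```python
# Sketch: turn the three BED lines into Ensembl REST "map" requests
# (GRCh37 -> GRCh38). Helper names are illustrative only.
import json
from urllib.request import Request, urlopen

REST_SERVER = "https://rest.ensembl.org"

BED_TEXT = """\
5 36821734 37091336 P1
5 36731578 36978408 P2
5 36908654 37108773 P3
"""


def parse_bed(text):
    """Return (chrom, start, end, name) tuples from whitespace-separated BED."""
    features = []
    for line in text.splitlines():
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, *rest = line.split()
        features.append((chrom, int(start), int(end), rest[0] if rest else ""))
    return features


def map_url(chrom, bed_start, bed_end, source="GRCh37", target="GRCh38",
            species="human"):
    """Build the REST URL; assumes BED starts are 0-based, REST regions 1-based."""
    region = f"{chrom}:{bed_start + 1}..{bed_end}:1"
    return (f"{REST_SERVER}/map/{species}/{source}/{region}/{target}"
            "?content-type=application/json")


def fetch_mappings(url):
    """Network call (not run in this sketch); returns the 'mappings' list."""
    with urlopen(Request(url), timeout=30) as response:
        return json.load(response)["mappings"]


features = parse_bed(BED_TEXT)
urls = [map_url(chrom, start, end) for chrom, start, end, _ in features]
for (_, _, _, name), url in zip(features, urls):
    print(name, url)
```

For whole files the web Assembly Converter is the more practical route; a per-region request like this suits a handful of features.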
Data Slicer for variants
• Whole genome VCF files are unwieldy
• They contain all variants in the genome
• They contain all genotypes from all individuals studied
• Sometimes you just want to analyse a small region and one population
• The Data Slicer allows you to take a slice of a VCF and narrow down to only individuals and populations of interest
• Data Slicer currently only accesses the 1000 Genomes data
• It is only available for human and only on GRCh37
Hands on – Data Slicer
• I want to get a VCF of the region containing the MC1R gene for the British population
• MC1R is found at 16:89978527-89987385 in GRCh37
• The three-letter code for the British population in 1000 Genomes is GBR
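To make the slicing idea concrete, here is a rough Python sketch of what taking a slice of a VCF involves: keep the header, keep only records inside the region, and keep only the genotype columns of the chosen samples. It is not how the Data Slicer is implemented; the toy records and the sample names (`GBR_1`, `GBR_2`, `OTHER_1`) are made up for illustration.

```python
# Sketch of slicing a VCF by region and by sample. Toy data only; the
# sample names and variants below are invented for this example.


def slice_vcf(lines, chrom, start, end, keep_samples):
    """Return VCF lines limited to one region and a set of sample columns."""
    keep_idx = None
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            out.append(line)  # meta-information lines pass through unchanged
        elif line.startswith("#CHROM"):
            cols = line.split("\t")
            # Columns 0-8 are the fixed VCF fields; samples start at column 9.
            keep_idx = list(range(9)) + [
                i for i, col in enumerate(cols) if i >= 9 and col in keep_samples
            ]
            out.append("\t".join(cols[i] for i in keep_idx))
        elif line:
            if keep_idx is None:
                raise ValueError("record found before the #CHROM header line")
            cols = line.split("\t")
            if cols[0] == chrom and start <= int(cols[1]) <= end:
                out.append("\t".join(cols[i] for i in keep_idx))
    return out


HEADER = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "GBR_1", "GBR_2", "OTHER_1"]
TOY_VCF = [
    "##fileformat=VCFv4.1",
    "\t".join(HEADER),
    "\t".join(["16", "89978000", ".", "A", "G", ".", "PASS", ".", "GT", "0|0", "0|1", "1|1"]),
    "\t".join(["16", "89980000", ".", "C", "T", ".", "PASS", ".", "GT", "0|1", "0|0", "1|0"]),
    "\t".join(["16", "89990000", ".", "G", "A", ".", "PASS", ".", "GT", "1|1", "0|0", "0|0"]),
]

# The MC1R region from the slide, restricted to the two hypothetical GBR samples.
sliced = slice_vcf(TOY_VCF, "16", 89978527, 89987385, {"GBR_1", "GBR_2"})
print("\n".join(sliced))
```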
FTP
• Files of our complete database:
• Genomic, cDNA, CDS, ncRNA and protein sequence (FASTA)
• Annotated sequence (EMBL, GenBank)
• Gene sets (GTF, GFF)
• Whole-genome multiple and gene-based multiple alignments (MAF)
• Variants (VCF, GVF)
• Constrained elements (BED)
• Regulatory features (BED, BigWig)
• RNA-Seq files (BAM, BigWig)
• MySQL database
Access FTP
Your favourite FTP client
FTP downloads page: http://www.ensembl.org/info/data/ftp/index.html
FTP site: ftp://ftp.ensembl.org/pub/
FTP files are big
• Multiple Mb/Gb
• Lots of time to download/unzip
• Do you really need this data?
• Make sure it’s the right file before you download.
File chameleon for NGS analysis
• Although files on the Ensembl FTP site are in a standard format, different tools define the standards differently (sigh!)
• Your NGS analysis tool might need files that are slightly different to the Ensembl formats
• File chameleon allows you to download files with these adjustments
Hands on – File Chameleon
• I need a GFF3 file of cat for my RNA-seq analysis.
• My tool requires:
  • UCSC-style chromosome naming like chr1
  • Only genes shorter than 4 Mb
  • Transcript IDs in every line
• We will use File Chameleon to download this customised file.
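To show what such adjustments amount to, here is a hedged Python sketch that applies two of the three requirements to GFF3 lines: it prefixes sequence names with `chr` and drops genes of 4 Mb or longer together with their child features. This is not File Chameleon's code. It assumes each parent feature appears before its children and that `Parent` holds a single ID; the toy records and IDs are invented. Adding transcript IDs to every line is left out.

```python
# Sketch: two File Chameleon-style adjustments on GFF3 lines.
# Assumptions: parents precede children; one Parent per feature; toy data.

MAX_GENE_LENGTH = 4_000_000


def parse_attributes(field):
    """Parse the GFF3 attributes column ('key=value;key=value') into a dict."""
    return dict(item.split("=", 1) for item in field.split(";") if "=" in item)


def customise_gff3(lines, max_gene_length=MAX_GENE_LENGTH):
    dropped_ids = set()
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            out.append(line)  # comments and directives are left untouched
            continue
        cols = line.split("\t")
        attrs = parse_attributes(cols[8])
        length = int(cols[4]) - int(cols[3]) + 1
        too_long = cols[2] == "gene" and length >= max_gene_length
        if too_long or attrs.get("Parent") in dropped_ids:
            # Remember the ID so that this feature's children are dropped too.
            if "ID" in attrs:
                dropped_ids.add(attrs["ID"])
            continue
        cols[0] = "chr" + cols[0]  # e.g. A1 -> chrA1
        out.append("\t".join(cols))
    return out


TOY_GFF3 = [
    "##gff-version 3",
    "\t".join(["A1", "toy", "gene", "1000", "5000", ".", "+", ".", "ID=gene:G1"]),
    "\t".join(["A1", "toy", "mRNA", "1000", "5000", ".", "+", ".", "ID=transcript:T1;Parent=gene:G1"]),
    "\t".join(["A1", "toy", "gene", "10000", "5000000", ".", "-", ".", "ID=gene:G2"]),
    "\t".join(["A1", "toy", "mRNA", "10000", "5000000", ".", "-", ".", "ID=transcript:T2;Parent=gene:G2"]),
    "\t".join(["A1", "toy", "exon", "10000", "10500", ".", "-", ".", "Parent=transcript:T2"]),
]

customised = customise_gff3(TOY_GFF3)
print("\n".join(customised))
```

A plain `chr` prefix is a simplification: names such as the mitochondrial chromosome differ between naming schemes, so check the result against what your tool expects.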
Analyse your own variants with the VEP
• Find out the effects of your own variants on Ensembl genes
• Analyse whole genome variant calls
• Filter variants to find those that might be interesting
Your own variant data
Variant coordinates:
1 881907 881906 -/C +
5 140532 140532 T/C +
12 1017956 1017956 T/A +
2 946507 946507 G/C +
14 19584687 19584687 C/T -
HGVS notation:
ENST00000285667.3:c.1047_1048insC
5:g.140532T>C
NM_153681.2:c.7C>T
ENSP00000439902.1:p.Ala2233Asp
NP_000050.2:p.Ile2285Val
VCF:
#CHROM POS ID REF ALT
20 14370 rs6054257 G A
20 17330 . T A
20 1110696 rs6040355 A G,T
20 1230237 . T .
Variant IDs:
rs41293501
COSM327779
rs146120136
FANCD1:c.475G>A
rs373400041
Variation types
1) Small scale in one or few nucleotides of a gene
• Small insertions and deletions (DIPs or indels)
• Single nucleotide polymorphism (SNP)
A G A C T T G A C C T G T C T - A A C T G G A
T G A C T T G A C - T G T C T G A A C G G G A
2) Large scale in chromosomal structure (structural variation)
• Copy number variations (CNV)
• Large deletions/duplications, insertions, translocations
[Figure: deletion, duplication, insertion, translocation]
Variation consequences
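[Figure: a gene model running from the ATG start codon to the poly-A tail (AAAAAAA), labelled with variant consequence locations: regulatory, 5’ upstream, 5’ UTR, coding (synonymous and non-synonymous), splice site, intronic, 3’ UTR and 3’ downstream]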
Consequence terms
http://www.ensembl.org/info/docs/variation/predicted_data.html
Predicting missense effects – SIFT and PolyPhen
SIFT and PolyPhen score changes in amino acid sequence based on:
• How well conserved the protein is
• The chemical change in the amino acid
• 3D structure and domains (PolyPhen only)
• SIFT and PolyPhen are predictions, not facts
• A prediction will never be as good as experimental validation
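SIFT and PolyPhen scores
[Figure: two score scales running from 0 to 1. SIFT: a cut-off at 0.05 separates deleterious from tolerated. PolyPhen: categories benign, possibly damaging and probably damaging, with 0.1 and 0.2 marked on the scale]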
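As a small worked example of the SIFT cut-off, the Python sketch below labels a score using the 0.05 threshold from the slide. Low SIFT scores indicate a deleterious change. How a score of exactly 0.05 is treated is an assumption here (counted as tolerated), and the function name is invented for illustration. PolyPhen is left out because its cut-offs cannot be read reliably from the slide.

```python
# Sketch: label a SIFT score using the 0.05 cut-off shown on the slide.
# Boundary handling (exactly 0.05 -> tolerated) is an assumption.

SIFT_THRESHOLD = 0.05


def classify_sift(score, threshold=SIFT_THRESHOLD):
    """Return 'deleterious' for scores below the threshold, else 'tolerated'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT scores lie between 0 and 1")
    return "deleterious" if score < threshold else "tolerated"


for score in (0.0, 0.01, 0.2, 1.0):
    print(score, classify_sift(score))
```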
Use the VEP
http://www.ensembl.org/info/docs/tools/vep/index.html
Species that work with the VEP
+ everything in Plants, Fungi, Metazoa, Protists and Bacteria
Set up a cache
- Speed up your VEP script with an offline cache.
- Use prebuilt caches for Ensembl species.
- Or make your own from GTF and FASTA files, even for genomes not in Ensembl.
http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html
VEP plugins
• Plugins add extra functionality to the VEP
• They may extend, filter or manipulate the output of the VEP.
• Plugins may make use of external data or code.
• Available on the web tool and with the script.
Hands on
• We’re going to look at a set of four variants to find out what genes they hit and what effect they have on them.
9 128328461 128328461 A/- + var1
9 128322349 128322349 C/A + var2
9 128323079 128323079 C/G + var3
9 128322917 128322917 G/A + var4
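The four lines above are in the whitespace-separated coordinate format shown earlier (chromosome, start, end, alleles, strand, name). As a hedged illustration of how that format reads, the Python sketch below parses each line and labels the variant type from its alleles. The class and function names are invented for this example; it does not predict consequences, which is the VEP's job.

```python
# Sketch: parse the hands-on variants and label each one by allele pattern.
# Names are illustrative; this does not call or replace the VEP.
from dataclasses import dataclass

VEP_INPUT = """\
9 128328461 128328461 A/- + var1
9 128322349 128322349 C/A + var2
9 128323079 128323079 C/G + var3
9 128322917 128322917 G/A + var4
"""


@dataclass
class Variant:
    chrom: str
    start: int
    end: int
    ref: str
    alt: str
    strand: str
    name: str

    @property
    def kind(self):
        """Rough variant type based only on the allele strings."""
        if self.ref == "-":
            return "insertion"
        if self.alt == "-":
            return "deletion"
        if len(self.ref) == 1 and len(self.alt) == 1:
            return "SNV"
        return "other"


def parse_vep_line(line):
    chrom, start, end, alleles, strand, name = line.split()
    ref, alt = alleles.split("/", 1)
    return Variant(chrom, int(start), int(end), ref, alt, strand, name)


variants = [parse_vep_line(line) for line in VEP_INPUT.splitlines() if line.strip()]
for variant in variants:
    print(variant.name, f"{variant.chrom}:{variant.start}", variant.kind)
```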
Questions?
• We’ve muted all the mics
• Ask questions in the Chat box in the webinar interface
• I will check the Chat interface
• There’s no threading so please respond with @name
Host an Ensembl course
Browser course
½-2 day course on the Ensembl browser, aimed at wet-lab scientists.
One trainer.
REST API course
1-2 day course on the Ensembl Perl API, aimed at bioinformaticians.
1-2 trainers.
http://training.ensembl.org/
We can teach an Ensembl course at your institute for free (except trainers’ expenses).
Email us: [email protected]
Help and documentation
Course online: http://www.ebi.ac.uk/training/online/subjects/11
Tutorials: www.ensembl.org/info/website/tutorials
Flash animations: www.youtube.com/user/EnsemblHelpdesk and http://u.youku.com/Ensemblhelpdesk
Email us: [email protected]
Ensembl public mailing lists: [email protected], [email protected]
Follow us
www.facebook.com/Ensembl.org
@Ensembl
www.ensembl.info
Publications
Aken, B. et al. Ensembl 2017. Nucleic Acids Research. http://europepmc.org/articles/PMC5210575
Xosé M. Fernández-Suárez and Michael K. Schuster. Using the Ensembl Genome Server to Browse Genomic Sequence Data. Current Protocols in Bioinformatics 1.15.1-1.15.48 (2010). www.ncbi.nlm.nih.gov/pubmed/20521244
Giulietta M Spudich and Xosé M Fernández-Suárez. Touring Ensembl: A practical guide to genome browsing. BMC Genomics 11:295 (2010). www.biomedcentral.com/1471-2164/11/295
http://www.ensembl.org/info/about/publications.html
Ensembl Acknowledgements
The Entire Ensembl Team
Funding
Co-funded by the European Union