Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona...

Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012


Page 1: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Understanding Human Variation

Fiona Cunningham European Bioinformatics Institute

November 2012

Page 2: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Talk outline

• Genetic variation
  – Different types
  – Origins

• Why are all those variants important?
  – Importance and practical applications

• How is variation data discovered?
  – Investigating genetic variation and progress over time

• Ensembl and modern bioinformatics
  – Building infrastructure for research
  – Interpreting variants

Page 3: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

The Reference Human Genome

• Published 2001
• Finished in 2004
• Still incomplete

Page 4: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012


Every individual has a unique genome

Page 5: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012


ACCCAATAGCAGAACAGCTACTGGAACTAAAATCCTCTGATTTCAAATAACAGCCCCGCCCACTACCACTAAGTGAAGTCATCCACAACCACACACCGACCACTCTAAGCTTTTGTAAGATCGGCTCGCTTTGGGGAACAGGTCTTGAGAGAACATCCCTTTTAAGGTCAGAACAAAGGTATTTCATAGGTCCCAGGTCGTGTCCCGAGGGCGCCCACCCAAACATGAGCTGGAGCAAAAAGAAAGGGATGGGGGACTTGGAGTAGGCATAGGGGCGGCCCCTCCAAGCAGGGTGGCCTGGGACTCTTAAGGGTCAGCGAGAAGAGAACACACACTCCAGCTCCCGCTTTATTCGGTCAGATACTGACGGTTGGGATGCCTGACAAGGAATTTCCTTTCGCCACACTGAGAAATACCCGCAGCGGCCCACCCAGGCCTGACTTCCGGGTGGTGCGTGTGCTGCGTGTCGCGTCACGGCGTCACGTGGCCAGCGCGGGCTTGTGGCGCGAGCTTCTGAAACTAGGCGGCAGAGGCGGAGCCGCTGTGGCACTGCTGCGCCTCTGCTGCGCCTCGGGTGTCTTTTGCGGCGGTGGGTCGCCGCCGGGAGAAGCGTGAGGGGACAGATTTGTGACCGGCGCGGTTTTTGTCAGCTTACTCCGGCCAAAAAAGAACTGCACCTCTGGAGCGGGTTAGTGGTGGTGGTAGTGGGTTGGGACGAGCGCGTCTTCCGCAGTCCCAGTCCAGCGTGGCGGGGGAGCGCCTCACGCCCCGGGTCGCTGCCGCGGCTTCTTGCCCTTTTGTCTCTGCCAACCCCCACCCATGCCTGAGAGAAAGGTCCTTGCCCGAAGGCAGATTTTCGCCAAGCAAATTCGAGCCCCGCCCCTTCCCTGGGTCTCCATTTCCCGCCTCCGGCCCGGCCTTTGGGCTCCGCCTTCAGCTCAAGACTTAACTTCCCTCCCAGCTGTCCCAGATGACGCCATCTGAAATTTCTTGGAAACACGATCACTTTAACGGAATATTGCTGTTTTGGGGAAGTGTTTTACAGCTGCTGGGCACGCTGTATTTGCCTTACTTAAGCCCCTGGTAATTGCTGTATTCCGAAGACATGCTGATGGGAATTACCAGGCGGCGTTGGTCTCTAACTGGAGCCCTCTGTCCCCACTAGCCACGCGTCACTGGTTAGCGTGATTGAAACTAAATCGTATGAAAATCCTCTTCTCTAGTCGCACTAGCCACGTTTCGAGTGCTTAATGTGGCTAGTGGCACCGGTTTGGACAGCACAGCTGTAAAATGTTCCCATCCTCACAGTAAGCTGTTACCGTTCCAGGAGATGGGACTGAATTAGAATTCAAACAAATTTTCCAGCGCTTCTGAGTTTTACCTCAGTCACATAATAAGGAATGCATCCCTGTGTAAGTGCATTTTGGTCTTCTGTTTTGCAGACTTATTTACCAAGCATTGGAGGAATATCGTAGGTAAAAATGCCTATTGGATCCAAAGAGAGGCCAACATTTTTTGAAATTTTTAAGACACGCTGCAACAAAGCAGGTATTGACAAATTTTATATAACTTTATAAATTACACCGAGAAAGTGTTTTCTAAAAAATGCTTGCTAAAAACCCAGTACGTCACAGTGTTGCTTAGAACCATAAACTGTTCCTTATGTGTGTATAAATCCAGTTAACAACATAATCATCGTTTGCAGGTTAACCACATGATAAATATAGAACGTCTAGTGGATAAAGAGGAAACTGGCCCCTTGACTAGCAGTAGGAACAATTACTAACAAATCAGAAGCATTAATGTTACTTTATGGCAGAAGTTGTCCAACTTTTTGGTTTCAGTACTCCTTATACTCTTAAAAATGATCTAGGACCCCCGGAGTGCTTTTGTTTATGTAGCTTACCATATTAGAAATTTAAAACTAAGAATTTAAGGCTGGGCGTGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGTGGGCGGATCAC
TTGAGGCCAGAAGTTTGAGACCAGCCTGGCCAACATGGTGAAACCCTATCTCTACTAAAAATACAAAAAATGTGCTGCGTGTGGTGGTGCGTGCCTGTAATCCCAGCTACACGGGAGGTGGAGGCAGGAGAATCGCTTGAACCCTGGAGGCAGAGGTTGCAGTGAGCCAAGATCATGCCACTGCACTCTAGCCTGGGCCACATAGCATGACTCTGTCTCAAAACAAACAAACAAACAAAAAACTAAGAATTTAAAGTTAATTTACTTAAAAATAATGAAAGCTAACCCATTGCATATTATCACAACATTCTTAGGAAAAATAACTTTTTGAAAACAAGTGAGTGGAATAGTTTTTACATTTTTGCAGTTCTCTTTAATGTCTGGCTAAATAGAGATAGCTGGATTCACTTATCTGTGTCTAATCTGTTATTTTGGTAGAAGTATGTGAAAAAAAATTAACCTCACGTTGAAAAAAGGAATATTTTAATAGTTTTCAGTTACTTTTTGGTATTTTTCCTTGTACTTTGCATAGATTTTTCAAAGATCTAATAGATATACCATAGGTCTTTCCCATGTCGCAACATCATGCAGTGATTATTTGGAAGATAGTGGTGTTCTGAATTATACAAAGTTTCCAAATATTGATAAATTGCATTAAACTATTTTAAAAATCTCATTCATTAATACCACCATGGATGTCAGAAAAGTCTTTTAAGATTGGGTAGAAATGAGCCACTGGAAATTCTAATTTTCATTTGAAAGTTCACATTTTGTCATTGACAACAAACTGTTTTCCTTGCAGCAACAAGATCACTTCATTGATTTGTGAGAAAATGTCTACCAAATTATTTAAGTTGAAATAACTTTGTCAGCTGTTCTTTCAAGTAAAAATGACTTTTCATTGAAAAAATTGCTTGTTCAGATCACAGCTCAACATGAGTGCTTTTCTAGGCAGTATTGTACTTCAGTATGCAGAAGTGCTTTATGTATGCTTCCTATTTTGTCAGAGATTATTAAAAGAAGTGCTAAAGCATTGAGCTTCGAAATTAATTTTTACTGCTTCATTAGGACATTCTTACATTAAACTGGCATTATTATTACTATTATTTTTAACAAGGACACTCAGTGGTAAGGAATATAATGGCTACTAGTATTAGTTTGGTGCCACTGCCATAACTCATGCAAATGTGCCAGCAGTTTTACCCAGCATCATCTTTGCACTGTTGATACAAATGTCAACATCATGAAAAAGGGAAATGATTCCATAGCGTTATTATGAAAGTAGTTTTGAACTGTAATGGTAGAGGATGAATAGCTCACAATACAAATTTGTCATTTCCCTTTAAGAGAGAATTCCCATTTTATGTGAGAGTCCACATGTTCCTCATACCCATAGTTTGCCACATCTTGAGTACTCTTCAGAATTATTTGAATTTTTTGAATTTTATCTGTGGAATGTATTTTTTTTTTTTTCTTTTTTGAGACACAGTCTTGCT

[Figure: the sequence above with individual bases highlighted (T, T, A, T, G, T, C, C, C, C)]

Page 6: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Single nucleotide variants

A single nucleotide variant is a change that happens at one position in the DNA sequence.

A single nucleotide polymorphism (SNP):

Person 1. TTCCCTA
Person 2. TTCCTTA

(In double-stranded DNA, this changes a base pair.)
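As a minimal sketch of the comparison on this slide, the two sequences can be scanned position by position for mismatches. The function name `snv_positions` is hypothetical, and the sketch assumes both sequences are already aligned and of equal length (no insertions or deletions).

```python
def snv_positions(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in seq_a, base in seq_b) for every mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length (indels are not handled)")
    return [
        (index + 1, base_a, base_b)
        for index, (base_a, base_b) in enumerate(zip(seq_a, seq_b))
        if base_a != base_b
    ]


person_1 = "TTCCCTA"
person_2 = "TTCCTTA"
print(snv_positions(person_1, person_2))  # [(5, 'C', 'T')]
```

For the two people on the slide, the only difference is C versus T at position 5.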

Page 7: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Other short variants

[Figure: a short sequence illustrating an insertion, a dinucleotide substitution and a deletion]

Page 8: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Large scale: >50 base pairs to megabases

• Structural variants: large
  – deletions
  – duplications
  – insertions
  – translocations

• Copy number variants (CNVs): sequence repeated ‘n’ times in an individual

[Figure: schematic of a deletion, duplication, translocation and insertion]
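The variant classes from the last few slides can be told apart, roughly, by comparing the lengths of the reference and alternate alleles. The sketch below is an illustration only: `classify_variant` is a hypothetical helper, the 50 bp cut-off is taken from the slide, and real variant callers use richer rules (translocations, for example, cannot be detected from allele lengths).

```python
def classify_variant(ref: str, alt: str, sv_threshold: int = 50) -> str:
    """Classify a variant from its reference and alternate allele strings."""
    longest = max(len(ref), len(alt))
    if longest > sv_threshold:
        # Anything larger than the threshold counts as large scale here.
        return "structural variant"
    if len(ref) == len(alt):
        # Same length: one base changed (SNV) or several (e.g. dinucleotide substitution).
        return "SNV" if len(ref) == 1 else "substitution"
    return "insertion" if len(alt) > len(ref) else "deletion"


for ref, alt in [("C", "T"), ("CT", "TG"), ("A", "AT"), ("AT", "A"), ("A" * 61, "A")]:
    print(len(ref), len(alt), classify_variant(ref, alt))
```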

Page 9: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Origin of Variants

• Appearance of new variants by mutation
• Survival of alleles through early generations against the odds
• Increase of the allele to a substantial population frequency
• Fixation of the allele in populations

E.g. more copies of CCL3L1: HIV resistance

Germline variation: passed to descendants.
Somatic mutation: not passed to descendants.
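The four stages above (appearance, early survival, increase, fixation) can be illustrated with a toy simulation of random genetic drift. This is a sketch under simplifying assumptions that are not from the talk: a constant-size diploid population, no selection, and random sampling of allele copies each generation (a Wright-Fisher style model). The function name and parameters are hypothetical.

```python
import random


def simulate_new_allele(pop_size: int, seed: int, max_generations: int = 10_000) -> list[int]:
    """Track the copy number of a new allele until it is lost or fixed."""
    rng = random.Random(seed)
    total_copies = 2 * pop_size  # diploid: two allele copies per individual
    copies = 1                   # a new mutation appears as a single copy
    trajectory = [copies]
    for _ in range(max_generations):
        if copies == 0 or copies == total_copies:
            break  # allele lost or fixed
        frequency = copies / total_copies
        # Each copy in the next generation is drawn at random from the current pool.
        copies = sum(1 for _ in range(total_copies) if rng.random() < frequency)
        trajectory.append(copies)
    return trajectory


outcomes = [simulate_new_allele(pop_size=50, seed=s)[-1] for s in range(200)]
print("lost:", outcomes.count(0), "fixed:", outcomes.count(100))
```

In this neutral model most new alleles disappear within a few generations, which is the sense in which the survivors persist "against the odds".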

Page 10: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Talk outline

• Genetic variation
  – Different types
  – Origins

• Why are all those variants important?
  – Importance and practical applications

• Where did they all come from?
  – Investigating genetic variation and progress over time

• Ensembl and modern bioinformatics
  – Building infrastructure for research
  – Interpreting variants

Page 11: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Disease and differences

• Variation: interesting for evolution, population migration and adaptation
  – Differences in phenotype: height, intelligence, body mass
  – Single variant disorders: sickle cell anaemia, cystic fibrosis
  – Complex disease: bipolar disorder, schizophrenia, Alzheimer’s
  – Norovirus protection (homozygous for the alt allele of rs601338)

• SV, copy number variation: gene dosage (too few or too many copies)
  – Lupus, autoimmune disease: too few copies of FCGR3B
  – HIV infection resistance: more copies of CCL3L1
  – Intellectual disorders
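"Homozygous for the alt allele" can be made concrete with a small genotype check. The sketch below assumes genotypes written in the VCF `GT` style (`0` for the reference allele, `1` and above for alternate alleles, `/` or `|` as separator); the function name `zygosity` is hypothetical and nothing here is specific to rs601338.

```python
def zygosity(genotype: str) -> str:
    """Describe a VCF-style genotype such as '0/1' or '1|1'."""
    alleles = genotype.replace("|", "/").split("/")
    if "." in alleles:
        return "missing"
    if len(set(alleles)) > 1:
        return "heterozygous"  # two different alleles, e.g. 0/1 or 1/2
    return "homozygous reference" if alleles[0] == "0" else "homozygous alternate"


for gt in ["0/0", "0|1", "1/1"]:
    print(gt, "->", zygosity(gt))
```

Only the `1/1` case would correspond to the protective genotype described on the slide.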

Page 12: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Practical applications of variation

Risk assessment
• Of radiation exposure, mutagenic chemicals and cancer-causing toxins

Anthropology, evolution, and human migration
• Mutation lineages, mitochondrial inheritance and Y chromosomes
• Comparative genomics: for understanding diseases and traits

Molecular and clinical medicine
• Diagnosis, detection and treatment:
  – e.g. myotonic dystrophy, fragile X syndrome, inherited colon cancer, Alzheimer's disease, and familial breast cancer
• Pharmacogenomics: "custom drugs"

Page 13: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Practical applications of variation

DNA forensics
• Identification of
  – suspects
  – exonerate innocents
  – catastrophe victims
  – endangered species (against poachers)

Agriculture, livestock breeding
• Disease-, insect-, and drought-resistant crops
• Healthier, more productive, disease-resistant farm animals
• More nutritious produce
• Reducing the costs of agriculture

Page 14: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Talk outline

• Genetic variation
  – Different types
  – Origins

• Why are all those variants important?
  – Importance and practical applications

• How is variation data discovered?
  – Investigating genetic variation and progress over time

• Ensembl and modern bioinformatics
  – Building infrastructure for research
  – Interpreting variants

Page 15: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Mendel (1822 – 1884)

• "Father of genetics" for his study of the inheritance of traits in pea plants
• 1866: published the results of the inheritance of "factors" in pea plants
• Patterns in pea traits explained by inherited factors

Page 16: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

SNP Consortium (TSC)

• 1999: private/public collaboration
• Share costs to produce a public resource of single nucleotide polymorphisms (SNPs)
• Goal: discover 300 000 SNPs in two years
• Result: 1.4 million SNPs by 2001
• 24 people representing several races

Page 17: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Location by mapping flanking sequence
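The idea of locating a SNP by its flanking sequence can be sketched as an exact string search: find where the left flank, one variable base and the right flank occur together in the reference. This is a toy illustration with a hypothetical function and a made-up reference string; real pipelines use alignment tools that tolerate mismatches and handle both strands.

```python
def locate_by_flanks(reference: str, left_flank: str, right_flank: str) -> list[int]:
    """Return 1-based reference positions of the base between the two flanks."""
    hits = []
    span = len(left_flank) + 1 + len(right_flank)
    for start in range(len(reference) - span + 1):
        window = reference[start:start + span]
        if window.startswith(left_flank) and window.endswith(right_flank):
            hits.append(start + len(left_flank) + 1)
    return hits


reference = "GGATTCCCTAGG"  # made-up reference fragment
positions = locate_by_flanks(reference, left_flank="TTCC", right_flank="TA")
print(positions)  # [8]
```

A single hit means the flanks map uniquely; several hits would mean the variant cannot be placed unambiguously with flanks of this length.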

Page 18: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Genome Sequencing

Human Genome Project

• 13-year project
• 2001: Human genome working drafts
• Data unit of approximately 10x coverage of the human genome
• 10 years and cost about $3 billion
• Olympics 2012: $19 billion

Page 19: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Finding all human SNPs

HapMap Project, 2002
Goal: find all SNPs present across different populations (“all” means present at a frequency of at least 5%)

• 3 major populations
• Alleles and frequencies
• Tag variants

http://hapmap.ncbi.nlm.nih.gov/
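The 5% threshold refers to allele frequency in a population sample. As a rough sketch, the minor allele frequency can be computed by counting allele copies across genotypes. The genotype encoding (two-letter strings such as "CT") and the function names are assumptions made for this illustration, and the sample data are invented.

```python
from collections import Counter


def minor_allele_frequency(genotypes: list[str]) -> float:
    """Frequency of the second most common allele among all allele copies."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    if len(counts) < 2:
        return 0.0  # only one allele observed: the site is not variable here
    total = sum(counts.values())
    return sorted(counts.values(), reverse=True)[1] / total


def is_common(genotypes: list[str], threshold: float = 0.05) -> bool:
    """True if the minor allele reaches the frequency threshold."""
    return minor_allele_frequency(genotypes) >= threshold


sample = ["CC"] * 16 + ["CT"] * 4  # invented data: 20 people, 40 allele copies
print(minor_allele_frequency(sample), is_common(sample))
```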

Page 20: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Haplotypes and LD

• A haplotype can be thought of as a collection of alleles.
• ‘LD’ (Linkage Disequilibrium): a measure of how likely two alleles are to be inherited together

Important project. Still very highly regarded today.
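Linkage disequilibrium between two biallelic sites is commonly summarised by D and r². The sketch below computes both from counts of the four possible haplotypes; the function name, argument names and counts are hypothetical, and the talk itself does not specify which LD statistic is meant.

```python
def linkage_disequilibrium(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> tuple[float, float]:
    """Return (D, r squared) from counts of the four two-site haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at the first site
    p_B = (n_AB + n_aB) / total  # frequency of allele B at the second site
    d = n_AB / total - p_A * p_B  # observed minus expected AB frequency
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("both sites must be polymorphic")
    return d, d * d / denominator


print(linkage_disequilibrium(50, 0, 0, 50))    # alleles always travel together
print(linkage_disequilibrium(25, 25, 25, 25))  # alleles are independent
```

High r² between two sites is what allows one "tag variant" to stand in for its neighbours.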

Page 21: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Association studies

• Genome Wide Association Studies (GWAS)
• E.g. WTCCC 2005
• Gather phenotypes

Use common SNPs to understand common disease.
Diseases: diabetes, Crohn’s disease, breast cancer, coronary artery disease, bipolar disorder, hypertension, multiple sclerosis,…
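At a single SNP, an association study boils down to asking whether an allele is more frequent in cases than in controls. One simple way to test this, shown here as a sketch, is a chi-square test on the 2x2 table of allele counts. The counts are invented, the function name is hypothetical, and real GWAS analyses add quality control, covariates and multiple-testing correction.

```python
import math


def allelic_chi_square(case_alt: int, case_ref: int,
                       control_alt: int, control_ref: int) -> tuple[float, float]:
    """Chi-square statistic (1 degree of freedom) and p-value for a 2x2 allele table."""
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = allelic_chi_square(60, 40, 40, 60)  # invented allele counts
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```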

Page 22: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

1000 Genomes Project - Genome Sequencing

Finding all human SNPs

• 2008: World-wide capacity dramatically increasing
• Goal: Find genetic variants with frequencies of >1%
• In 3 weeks: double the data of the past 13 years

[Figure: disk storage (TB, 0 to 6000) by year, 1996 to 2009]

Lactose tolerance

Page 23: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

1000 Genomes Populations

YRI Yoruba
MKK Maasai
LWK Luhya
ASW African
TSI Toscani
CHS Han (South)
CHD Chinese
JPT Japanese
MEX Mexican
GIH Gujarati
CEU Northern and Western European
GBR British
IBS Spain
FIN Finnish
CHB Han Chinese
CDX Chinese Dai
KHV Vietnam
GWD The Gambia
ACB Barbados
AJM African
PUR Puerto Rican
CLM Colombian
PEL Peruvian
PJL Pakistani

Page 24: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Today…

• 2012: Every 14 minutes (£4000)
  – £600 exome

• Rare disease: 1 in 17 people in the UK
  – There are over 6,000 recognised rare diseases.
  – DDD: Deciphering Developmental Disorders

• Ongoing projects:
  – UK10K: 6000 cases, 4000 controls

 

Page 25: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Challenges: for EBI and our users

Sequencing machine

Scientist

Timothy K. Stanton

Page 26: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Talk outline

• Genetic variation
  – Different types
  – Origins

• Why are all those variants important?
  – Importance and practical applications

• Where did they all come from?
  – Investigating genetic variation and progress over time

• Ensembl and modern bioinformatics
  – Building infrastructure for research
  – Interpreting variants

Page 27: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

EBI

• EBI’s mission: To provide freely available data and bioinformatics services to all facets of the scientific community to promote scientific progress

• The world’s most comprehensive collection of molecular databases: from DNA and protein sequence to complex pathways and networks
  – Integration and community engagement is at the heart of these efforts

• European node for globally coordinated data collection and dissemination projects

Page 28: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012


Genome-wide data from Ensembl

• Ensembl’s mission: to enable genomic science

[Figure: pick a genome, then explore data across species and within species; labels: synteny, orthology, genomic alignments, SNPs, genes, chromosomes, gene regulation]

Page 29: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Species with variation data in Ensembl

Page 30: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Data access - variants on the genome

Page 31: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Data access - variants per protein

Page 32: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Ensembl Variation

Page 33: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Variation annotation – phenotype data

Page 34: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Consequence Types

[Figure: transcript (ENST) diagram with variant locations labelled: regulatory, 5’ upstream, 5’ UTR, ATG, coding synonymous, coding non-synonymous, intronic, splice sites, 3’ UTR, AAAAAAA, 3’ downstream]

• A SNP can be in an exon in some transcripts and in an intron in others.

Page 35: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Consequences of variants in the protein-coding sequence

GAG > GAA   Glu > Glu    Synonymous (silent): no change in amino acid
GAG > GGG   Glu > Gly    Non-synonymous (missense): change in amino acid
GAG > TAG   Glu > STOP   Stop gain (nonsense): introduces a stop codon
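The three examples can be reproduced by translating the reference and variant codons with the standard genetic code and comparing the results. This is a minimal sketch: the function name is hypothetical, only single codons are handled, and the labels are a simplification of the consequence terms used by annotation tools.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def coding_consequence(ref_codon: str, alt_codon: str) -> str:
    """Compare the amino acids encoded by two codons ('*' marks a stop codon)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"  # also returned when a stop codon stays a stop codon
    if alt_aa == "*":
        return "stop gained"
    if ref_aa == "*":
        return "stop lost"
    return "missense"


for alt in ["GAA", "GGG", "TAG"]:
    print("GAG >", alt, "->", coding_consequence("GAG", alt))
```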

Page 36: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Added more detailed terms

Listed from 5’ to 3’:

• regulatory region
• TF binding site
• intergenic
• upstream
• 5 prime UTR
• initiator codon
• synonymous variant
• missense variant
• inframe insertion
• inframe deletion
• stop gained
• frameshift variant
• coding sequence variant
• splice donor
• splice acceptor
• splice region
• intron variant
• stop lost
• stop retained variant
• incomplete terminal codon
• 3 prime UTR
• downstream

Page 37: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Data access - variation annotation

[Figure: reference base A with aligned reads showing A, C, A, C, A at the same position: SNP?]

• In-dels
• Structural variants
• GWAS
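The question posed by the figure (reference A, reads A, C, A, C, A: is this a SNP?) can be approached by counting the bases observed at that position. The sketch below is a deliberately naive illustration: the function name and the 20% threshold are assumptions, and real variant callers also model base quality, mapping quality and sequencing error.

```python
from collections import Counter


def call_site(ref_base: str, read_bases: str, min_fraction: float = 0.2) -> str:
    """Naive genotype call from the bases that reads show at one position."""
    ref = ref_base.upper()
    counts = Counter(read_bases.upper())
    depth = sum(counts.values())
    if depth == 0:
        return "no coverage"
    alt_counts = {base: n for base, n in counts.items() if base != ref}
    if not alt_counts:
        return "matches reference"
    alt_base, alt_n = max(alt_counts.items(), key=lambda item: item[1])
    fraction = alt_n / depth
    if fraction < min_fraction:
        return "matches reference"  # too few non-reference reads; likely errors
    if fraction > 1 - min_fraction:
        return f"homozygous {alt_base}"
    return f"heterozygous {ref}/{alt_base}"


print(call_site("A", "ACACA"))  # heterozygous A/C
```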

Page 38: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Interpretation of variants

• Interpretation of variants is key
• Ensembl is well placed for doing this with contributions from all:
  – High-quality evidence-based gene build
  – Multiple alignments
  – Regulatory information
  – Variation and phenotype information
  – VEP for all types of variation
    • Good support
    • Fast script version
    • REST API
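As a hedged illustration of the REST route, the sketch below builds a request to the `vep/:species/id/:id` endpoint of the public Ensembl REST service and reads the `most_severe_consequence` field from the JSON response. The server address and response field reflect the current service as far as known and may differ from what was available at the time of the talk; the function names are hypothetical, and the network call is defined but not executed here.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

ENSEMBL_REST = "https://rest.ensembl.org"  # assumption: current public server


def vep_url(variant_id: str, species: str = "human") -> str:
    """Build the URL for a VEP lookup by variant identifier (e.g. an rsID)."""
    return f"{ENSEMBL_REST}/vep/{quote(species)}/id/{quote(variant_id)}"


def fetch_most_severe_consequences(variant_id: str, species: str = "human") -> list:
    """Query the service and return the most severe consequence per record."""
    request = Request(vep_url(variant_id, species),
                      headers={"Content-Type": "application/json"})
    with urlopen(request, timeout=30) as response:  # requires network access
        records = json.load(response)
    return [record.get("most_severe_consequence") for record in records]


print(vep_url("rs601338"))
```

Calling `fetch_most_severe_consequences("rs601338")` would query the variant mentioned earlier in the talk; the result depends on the Ensembl release being served.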

Page 39: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012


Variant  Effect  Predictor  

Page 40: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012


Variant  Effect  Predictor  

Page 41: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012

Summary

• Importance of variants: their roles in disease and phenotype differences

• Classes of variants
  – Short (single nucleotide) variants: SNPs, indels
  – Structural variants

• Effects of variants: non-synonymous, stop lost etc.

• Sources of variants: dbSNP, mutation databases
  – Big projects: 1000 Genomes, HapMap

• Bioinformatics infrastructure projects: Ensembl

Page 42: Understanding Human Variation - EMBL-EBI · 2013-04-16 · Understanding Human Variation Fiona Cunningham European Bioinformatics Institute November 2012