Genome representation and variant identification
![Page 1: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/1.jpg)
Genome representation and variant identification
Deanna M. Church, NCBI
![Page 2: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/2.jpg)
![Page 3: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/3.jpg)
The Reference Assembly is NOT Static
NCBI35 (hg17)
NCBI36 (hg18)
GRCh37 (hg19)
GRCh37.p9
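Each build on this slide has both an NCBI/GRC name and a UCSC name, so a coordinate is only meaningful alongside the assembly it was reported on. A minimal sketch of a lookup that normalizes either name follows; the dictionaries and the helper `normalize_assembly` are illustrative assumptions, not part of any NCBI tool.

```python
# Name pairs as listed on the slide. GRCh37.p9 is a patch release of GRCh37;
# patch releases leave the chromosome coordinates unchanged, so it maps to GRCh37.
NCBI_TO_UCSC = {
    "NCBI35": "hg17",
    "NCBI36": "hg18",
    "GRCh37": "hg19",
}
UCSC_TO_NCBI = {ucsc: ncbi for ncbi, ucsc in NCBI_TO_UCSC.items()}


def normalize_assembly(name: str) -> str:
    """Return the NCBI/GRC-style name for either naming scheme (hypothetical helper)."""
    base = name.split(".p")[0]  # drop a patch suffix such as ".p9"
    if base in NCBI_TO_UCSC:
        return base
    if base in UCSC_TO_NCBI:
        return UCSC_TO_NCBI[base]
    raise ValueError(f"unknown assembly name: {name}")
```

Calling `normalize_assembly("hg18")` gives `"NCBI36"`, and `normalize_assembly("GRCh37.p9")` gives `"GRCh37"`.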
![Page 4: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/4.jpg)
Image credit: http://www.tohlejokes.com
![Page 5: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/5.jpg)
http://genomereference.org
![Page 6: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/6.jpg)
Resolved: 716
Open: 697
![Page 7: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/7.jpg)
http://www.ncbi.nlm.nih.gov/dbvar
![Page 8: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/8.jpg)
Studies
Variant Regions
Variant Calls
Variant Region nsv531833 type: CNV
Variant Calls: nssv577112 type: copy number gain Method: Oligo aCGH Analysis: Probe signal intensity phenotype: Autism; etc. Clinical: Pathogenic Copy Number: 3
Variant Calls: nssv580124 type: copy number loss Method: Oligo aCGH Analysis: Probe signal intensity phenotype: Autism. Clinical: Pathogenic Copy Number: 1
Methods
Analysis
Publications
Samples
Submitted assembly
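The relationship between variant regions and the variant calls that support them can be sketched as a small data structure. The class and field names below are assumptions chosen for illustration and do not reflect the actual dbVar schema; the values are the ones shown on the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VariantCall:
    accession: str            # nssv accession
    call_type: str
    method: str
    analysis: str
    phenotype: str
    clinical: str
    copy_number: Optional[int] = None


@dataclass
class VariantRegion:
    accession: str            # nsv accession
    region_type: str
    calls: List[VariantCall] = field(default_factory=list)


# The example region from the slide, with its two supporting calls.
region = VariantRegion(
    accession="nsv531833",
    region_type="CNV",
    calls=[
        VariantCall("nssv577112", "copy number gain", "Oligo aCGH",
                    "Probe signal intensity", "Autism; etc.", "Pathogenic", 3),
        VariantCall("nssv580124", "copy number loss", "Oligo aCGH",
                    "Probe signal intensity", "Autism", "Pathogenic", 1),
    ],
)

# A region typed as CNV can group both gains and losses.
call_types = sorted(call.call_type for call in region.calls)
```

Keeping calls nested under the region mirrors the slide: one CNV region can be backed by calls of different types (here a gain and a loss).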
![Page 9: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/9.jpg)
Variant Call Ambiguity
start, stop
Inner start, Inner stop
Outer start, Outer stop
Probes with decreased signal intensity
Probes with expected signal intensity
breakpoint, breakpoint
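For array data, the true breakpoints lie somewhere between the last probe with expected signal and the first probe with decreased signal, which is why a call carries inner and outer coordinates rather than a single start and stop. A minimal sketch of deriving those bounds from probe calls follows; the function name, the input format, and the assumption of a single contiguous run of decreased probes are all illustrative.

```python
from typing import Dict, List, Optional, Tuple


def deletion_bounds(probes: List[Tuple[int, bool]]) -> Optional[Dict[str, Optional[int]]]:
    """Derive inner/outer bounds for a copy number loss.

    probes: (position, decreased_signal) pairs; assumes the decreased probes
    form one contiguous run. Returns None if no probe is decreased.
    """
    ordered = sorted(probes)
    decreased = [i for i, (_, low) in enumerate(ordered) if low]
    if not decreased:
        return None
    first, last = decreased[0], decreased[-1]
    return {
        # Outer bounds: nearest flanking probes with expected signal.
        "outer_start": ordered[first - 1][0] if first > 0 else None,
        # Inner bounds: first and last probes with decreased signal.
        "inner_start": ordered[first][0],
        "inner_stop": ordered[last][0],
        "outer_stop": ordered[last + 1][0] if last + 1 < len(ordered) else None,
    }


example = [(1000, False), (2000, False), (3000, True),
           (4000, True), (5000, True), (6000, False)]
bounds = deletion_bounds(example)
# bounds == {'outer_start': 2000, 'inner_start': 3000,
#            'inner_stop': 5000, 'outer_stop': 6000}
```

In this toy example each breakpoint is only known to within the probe spacing: the left one falls between 2000 and 3000, the right one between 5000 and 6000.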
![Page 10: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/10.jpg)
Variant Call Ambiguity
Outer start, Outer stop
Fosmid clone (40 Kb +/- 1 Kb)
20 Kb: Clone has an insertion relative to the genome
60 Kb: Clone has a deletion relative to the genome
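The fosmid logic can be sketched as a comparison between the distance separating the mapped clone ends on the reference and the expected insert size. The function name and the default tolerance are assumptions for illustration; real pipelines typically set thresholds from the observed insert-size distribution.

```python
def classify_clone(mapped_span_bp: int,
                   expected_bp: int = 40_000,
                   tolerance_bp: int = 1_000) -> str:
    """Classify a clone from the reference distance between its mapped ends."""
    if mapped_span_bp < expected_bp - tolerance_bp:
        # Ends map too close together: the clone carries sequence
        # that the reference lacks.
        return "insertion"
    if mapped_span_bp > expected_bp + tolerance_bp:
        # Ends map too far apart: the clone lacks sequence
        # that the reference has.
        return "deletion"
    return "concordant"
```

With the slide's numbers, `classify_clone(20_000)` returns `"insertion"` and `classify_clone(60_000)` returns `"deletion"`. Because only the clone ends are placed, this kind of evidence yields outer coordinates for the event, not exact breakpoints.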
![Page 11: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/11.jpg)
Assembly, Mis-assembly, Biology and Variant Interpretation
![Page 12: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/12.jpg)
BAC insert
BAC vector
Shotgun sequence
Assemble
GAPS
“finishers” go in to manually fill the gaps, often by PCR
![Page 13: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/13.jpg)
NCBI36 (hg18)
GRCh37 (hg19)
![Page 14: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/14.jpg)
NCBI35 (hg17)
GRCh37 (hg19)
AL139246.20
AL139246.21
![Page 15: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/15.jpg)
Build sequence contigs based on contigs defined in TPF (Tiling Path File).
Check for orientation consistencies
Select switch points
Instantiate sequence for further analysis
Switch point
Consensus sequence
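A toy sketch of what selecting a switch point means when two adjacent components in the tiling path overlap: the contig uses the left component up to the switch point and the right component after it. The function and the example sequences are hypothetical and far simpler than the real assembly process.

```python
def join_at_switch_point(left: str, right: str, overlap: int, switch: int) -> str:
    """Join two overlapping components.

    The last `overlap` bases of `left` align to the first `overlap` bases of
    `right`; `switch` is the offset within the overlap (0..overlap) at which
    the contig stops using `left` and starts using `right`.
    """
    if overlap > min(len(left), len(right)):
        raise ValueError("overlap longer than a component")
    if not 0 <= switch <= overlap:
        raise ValueError("switch point must fall inside the overlap")
    left_end = len(left) - overlap + switch
    return left[:left_end] + right[switch:]


# The two components differ at one base inside their 7-base overlap
# (CGTACGT versus CGTTCGT), so the switch point decides which allele is kept.
left = "AAAACGTACGT"
right = "CGTTCGTGGGG"
early = join_at_switch_point(left, right, overlap=7, switch=0)  # "AAAACGTTCGTGGGG"
late = join_at_switch_point(left, right, overlap=7, switch=7)   # "AAAACGTACGTGGGG"
```

When the overlapping sequences are identical, every switch point gives the same contig; when they differ, the choice determines which version ends up in the reference.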
![Page 16: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/16.jpg)
NCBI36
![Page 17: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/17.jpg)
nsv832911 (nstd68) Submitted on NCBI35 (hg17)
![Page 18: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/18.jpg)
NCBI35 (hg17) Tiling Path
GRCh37 (hg19) Tiling Path
Gap Inserted
Moved approximately 2 Mb distal on chr15
NC_000015.8 (chr15)
NC_000015.9 (chr15)
Removed from assembly
Added to assembly
HG-24
![Page 19: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/19.jpg)
Sequences from haplotype 1
Sequences from haplotype 2
Old Assembly model: compress into a consensus
New Assembly model: represent both haplotypes
![Page 20: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/20.jpg)
AC074378.4, AC079749.5, AC134921.2, AC147055.2, AC140484.1, AC019173.4, AC093720.2, AC021146.7
NCBI36 NC_000004.10 (chr4) Tiling Path
Xue Y et al, 2008
TMPRSS11E TMPRSS11E2
GRCh37 NC_000004.11 (chr4) Tiling Path
AC074378.4, AC079749.5, AC134921.1, AC147055.2, AC093720.2, AC021146.7
TMPRSS11E
GRCh37: NT_167250.1 (UGT2B17 alternate locus)
AC074378.4, AC140484.1, AC019173.4, AC226496.2, AC021146.7
TMPRSS11E2
nsv532126 (nstd37)
![Page 21: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/21.jpg)
GRCh37
![Page 22: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/22.jpg)
81 FIX Patches
71 NOVEL Patches
GRCh37.p9
![Page 23: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/23.jpg)
Dennis et al., 2012
1q32 1q21 1p21
1p21 patch alignment to chromosome 1
![Page 24: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/24.jpg)
Finding the data
![Page 25: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/25.jpg)
How dbVar* manages data
*and most other NCBI databases too
| Object | Method | Analysis | Clinical assertion | NCBI36 location | Etc… |
| --- | --- | --- | --- | --- | --- |
| nsv1000 | Oligo aCGH | Probe signal intensity | None | Location | Etc… |
| nsv2000 | Sequencing | Paired end analysis | None | Location | Etc… |
| nsv3000 | Sequencing | Read Depth | Benign | Location | Etc… |
| … | … | … | … | … | … |
Search Term
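The slide's point is that each object is stored with a set of fields, and searches match terms against those fields. A minimal in-memory sketch of fielded matching follows, using the placeholder rows from the slide; the record layout and the `search` helper are assumptions and do not describe how the NCBI search system is actually implemented.

```python
# Placeholder rows from the slide (nsv1000 etc. are illustrative accessions).
records = [
    {"object": "nsv1000", "method": "Oligo aCGH",
     "analysis": "Probe signal intensity", "clinical_assertion": "None"},
    {"object": "nsv2000", "method": "Sequencing",
     "analysis": "Paired end analysis", "clinical_assertion": "None"},
    {"object": "nsv3000", "method": "Sequencing",
     "analysis": "Read Depth", "clinical_assertion": "Benign"},
]


def search(rows, **terms):
    """Return rows whose fields equal every given term, ignoring case."""
    def matches(row):
        return all(
            str(row.get(field_name, "")).lower() == str(value).lower()
            for field_name, value in terms.items()
        )
    return [row for row in rows if matches(row)]


sequencing_hits = [r["object"] for r in search(records, method="sequencing")]
benign_hits = [r["object"] for r in search(records, method="Sequencing",
                                           clinical_assertion="Benign")]
```

Here `sequencing_hits` is `["nsv2000", "nsv3000"]` and `benign_hits` is `["nsv3000"]`. A query that names no stored field value, such as a bare chromosome position, would match nothing under this scheme.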
![Page 26: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/26.jpg)
![Page 27: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/27.jpg)
![Page 28: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/28.jpg)
Variant submitted on NCBI35 (hg17)
Failed to remap to NCBI36 (hg18)
Successful remap to GRCh37 (hg19)
![Page 29: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/29.jpg)
![Page 30: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/30.jpg)
No results in ‘normal’ dbVar search
Genome Sensor predicts this is a location -> points to dbVar Genome Browser
![Page 31: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/31.jpg)
![Page 32: Genome representation and variant identification](https://reader036.fdocuments.us/reader036/viewer/2022081513/56816222550346895dd24cae/html5/thumbnails/32.jpg)
Acknowledgements
dbVar
John Lopez, Tim Hefferon, John Garner, Chao Chen, George Zhou, Victor Ananiev
NCBI
Collaborators
DGVa, DGV
GRC
NCBI
Valerie Schneider, Nathan Bouk, Hsiu-Chuan Chen
Collaborators
TGI-WU, WTSI, EBI
ISCA
NCBI Genomes, Viewers and Variation groups