Detection and analysis of SNP polymorphisms
Alexis DEREEPER
21 November 2013 Training course in Bioinformatics, Montpellier, 18-22 November 2013
What is a SNP?
• A SNP (single nucleotide polymorphism) is a single-base change in a DNA sequence that occurs in a significant proportion of a large population (conventionally at least 1%)
• SNPs in coding regions may (or may not) alter the protein structure and function
Why study SNPs?
• Population genetics: population stratification, linkage disequilibrium…
• Define markers for genetic maps
• Analysis of genome structure, genome evolution
• Genome-wide association studies: SNPs can be used for estimating predisposition to disease and for predicting specific genetic traits
• Functional analysis: alteration of protein structure and function
Re-sequencing projects: a deluge of SNPs
Using NGS technologies: resequencing → mapping → SNPs
SNPs from RNASeq: example of Arcad project
Strategy: comparative population genomics with transcriptomics data
[Flowchart]
• Available reference (genome/transcriptome)?
• No: 454 sequencing → de novo reference assembly
• Yes: use the available reference
• Solexa sequencing → mapping on reference
• Ortholog/paralog assignation
• Polymorphism database in adapted format: redundancy, open reading frame, CDS/UTR
• Diversity study: comparative domestication, life history trait impact, functional evolution
• Crop breeding SNP database: functional annotation, selection footprint
Strategy for SNP discovery from NGS
[Pipeline diagram]
For each individual (ind1, ind2, ind3, ind4, …):
Fastq → FastQ Groomer → Mapping BWA → Add or Replace Groups (PicardTools) → BAM with read group
For all individuals together:
BAM files with read group → mergeSam (PicardTools) → Global BAM with read group
Global BAM with read group → IndelRealigner (GATK) → UnifiedGenotyper (GATK) → VCF file
DepthOfCoverage (GATK) → Depth file
FASTQ Format
• Standard format of short reads from NGS data
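A FASTQ file is a plain text file.
STRUCTURE (four lines per read: identifier starting with @, sequence, separator starting with +, quality string):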
@HWUSI-EAS454_0006:1:112:14105:5498#CTTGTA
CGCCAAGAAGTGTAGCAAAACGGCAGAGCTCGTGGATTAAACAAACAGAGGATTTCGGTGAGGATTGAGGGGGAGT
+
cfffcfeffdeefefffcffffffffcffeffffdffffafcfffffdffffdfefeddf^eececfffdfcbffb
@HWUSI-EAS454_0006:1:37:16314:3410#CTTGTA
AGTGTAGCAAAACGGCAGAGCTCGTGGATTAAACAAACAGAGGATTTCGGTGAGGATTGAGGGGGAGTGGTGGCCG
+
`bTbbccccceeeeeceeeecccYeedded`ceec]dddde^a`deeeec\`dddcbaadadYd`]]Jc_^bc^^\
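The four-line structure shown above can be read with a few lines of Python. This is only a minimal sketch: the names `FastqRecord` and `read_fastq` are illustrative and not part of any tool mentioned in the course, and the sketch assumes that each sequence and each quality string sits on a single line, as in the example.

```python
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    name: str       # identifier line without the leading "@"
    sequence: str   # base calls
    quality: str    # one quality character per base


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield one record per group of four lines (hypothetical helper)."""
    stripped = (line.rstrip("\n") for line in lines)
    for header in stripped:
        if not header:
            continue  # tolerate blank lines between records
        sequence = next(stripped, None)
        separator = next(stripped, None)
        quality = next(stripped, None)
        if quality is None:
            raise ValueError(f"truncated record: {header!r}")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed record: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch: {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


# First read of the slide, copied as-is.
example = [
    "@HWUSI-EAS454_0006:1:112:14105:5498#CTTGTA",
    "CGCCAAGAAGTGTAGCAAAACGGCAGAGCTCGTGGATTAAACAAACAGAGGATTTCGGTGAGGATTGAGGGGGAGT",
    "+",
    "cfffcfeffdeefefffcffffffffcffeffffdffffafcfffffdffffdfefeddf^eececfffdfcbffb",
]
records = list(read_fastq(example))
print(records[0].name, len(records[0].sequence))
```

With a real file, `read_fastq(open(path))` would work the same way, since a file object iterates over its lines.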
FASTQ Format
f → Quality = 38 (ASCII code 102 - offset 64)
@HWUSI-EAS454_0006:1:112:14105:5498#CTTGTA
CGCCAAGAAGTGTAGCAAAACGGCAGAGCTCGTGGATTAAACAAACAGAGGATTTCGGTGAGGATTGAGGGGGAGT
+
cfffcfeffdeefefffcffffffffcffeffffdffffafcfffffdffffdfefeddf^eececfffdfcbffb
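The decoding rule on this slide (quality = ASCII code - 64) can be sketched as follows. The offset of 64 is the one used in the slide; the offset of 33 for the Sanger encoding is standard. The function names are illustrative only, and the sketch simply shifts the offset, which is roughly the kind of conversion a step such as FastQ Groomer performs.

```python
from typing import List


def phred_scores(quality: str, offset: int = 64) -> List[int]:
    """Decode a quality string into integer Phred scores."""
    return [ord(char) - offset for char in quality]


def to_sanger(quality: str, offset: int = 64) -> str:
    """Re-encode a quality string with the Sanger offset (33)."""
    return "".join(chr(ord(char) - offset + 33) for char in quality)


quality = "cfffcfeffdeefefffcffffffffcffeffffdffffafcfffffdffffdfefeddf^eececfffdfcbffb"
scores = phred_scores(quality)
print(scores[:5])          # first five scores of the read
print(to_sanger(quality))  # same qualities, Sanger-encoded
```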
SAM Format
• Standard format for mapping of NGS data
• Sequence Alignment/Map (SAM) is the text format; Binary Alignment/Map (BAM) is its compressed binary equivalent
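To make the SAM layout concrete, here is a small sketch that splits one alignment line into its 11 mandatory tab-separated fields. The alignment line itself is invented for illustration (read name, reference name and position are hypothetical), and `parse_sam_line` is not a function of SAMtools or any other package.

```python
SAM_FIELDS = ("QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL")
INTEGER_FIELDS = ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN")


def parse_sam_line(line: str) -> dict:
    """Split one SAM alignment line into named fields (illustrative helper)."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) < len(SAM_FIELDS):
        raise ValueError("a SAM alignment line needs at least 11 fields")
    record = dict(zip(SAM_FIELDS, columns[:11]))
    for key in INTEGER_FIELDS:
        record[key] = int(record[key])
    record["TAGS"] = columns[11:]                  # optional fields, e.g. the read group
    record["unmapped"] = bool(record["FLAG"] & 4)  # bit 0x4 marks an unmapped read
    return record


# Hypothetical alignment line, for illustration only.
line = "read1\t0\tchr1\t100\t37\t10M\t*\t0\t0\tATTGTGTCGT\tIIIIIIIIII\tRG:Z:ind1"
record = parse_sam_line(line)
print(record["RNAME"], record["POS"], record["CIGAR"], record["TAGS"])
```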
Visualization of SNPs in mapping alignment
Tablet
• Graphical viewer for assembly of NGS data
• Accepts different formats: ACE, SAM, BAM
SNP discovery from NGS data
GATK (Genome Analysis Toolkit)
• Package for analysis of NGS data.
• Developed for the analysis of Human medical resequencing projects (1000 Genomes, The Cancer Genome Atlas)
• Includes tools for depth analysis, quality score recalibration, SNP/InDel discovery
• Complementary to two other packages: SAMtools and PicardTools
PREPROCESS:
• Index human genome (Picard); we used HG18 from UCSC
• Convert Illumina reads to Fastq format
• Convert Illumina 1.6 read quality scores to standard Sanger scores
FOR EACH SAMPLE:
1. Align samples to genome (BWA); generates SAI files
2. Convert SAI to SAM (BWA)
3. Convert SAM to BAM binary format (SAM Tools)
4. Sort BAM (SAM Tools)
5. Index BAM (SAM Tools)
6. Identify target regions for realignment (Genome Analysis Toolkit)
7. Realign BAM to get better Indel calling (Genome Analysis Toolkit)
8. Reindex the realigned BAM (SAM Tools)
9. Call Indels (Genome Analysis Toolkit)
10. Call SNPs (Genome Analysis Toolkit)
11. View aligned reads in BAM/BAI (Integrative Genomics Viewer)
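Steps 1 to 10 of the per-sample workflow can be written down as a list of shell commands. The sketch below only builds the command strings; it does not run anything. It rests on several assumptions that are not in the slides: file names are hypothetical, the GATK calls use the `-T <walker>` syntax of GATK 2.x/3.x with a jar named `GenomeAnalysisTK.jar`, indel and SNP calling both go through UnifiedGenotyper with `-glm`, and `samtools sort` uses the `-o` form of recent releases. GATK also expects read groups in the BAM, which are added in a separate step in the pipeline diagram and are not shown here.

```python
from typing import List


def build_commands(sample: str, reference: str) -> List[str]:
    """Return the shell commands for one sample, in order (illustrative only)."""
    gatk = "java -jar GenomeAnalysisTK.jar"  # assumed jar name and location
    return [
        # 1-2: align with BWA, then convert SAI to SAM
        f"bwa aln {reference} {sample}.fastq > {sample}.sai",
        f"bwa samse {reference} {sample}.sai {sample}.fastq > {sample}.sam",
        # 3-5: SAM to BAM, sort, index
        f"samtools view -bS {sample}.sam > {sample}.bam",
        f"samtools sort -o {sample}.sorted.bam {sample}.bam",
        f"samtools index {sample}.sorted.bam",
        # 6-8: local realignment around indels, then reindex
        f"{gatk} -T RealignerTargetCreator -R {reference} "
        f"-I {sample}.sorted.bam -o {sample}.intervals",
        f"{gatk} -T IndelRealigner -R {reference} -I {sample}.sorted.bam "
        f"-targetIntervals {sample}.intervals -o {sample}.realigned.bam",
        f"samtools index {sample}.realigned.bam",
        # 9-10: call indels, then SNPs
        f"{gatk} -T UnifiedGenotyper -R {reference} -I {sample}.realigned.bam "
        f"-glm INDEL -o {sample}.indels.vcf",
        f"{gatk} -T UnifiedGenotyper -R {reference} -I {sample}.realigned.bam "
        f"-glm SNP -o {sample}.snps.vcf",
    ]


commands = build_commands("ind1", "reference.fasta")
for number, command in enumerate(commands, start=1):
    print(number, command)
```

Keeping the commands as data makes it easy to print and review them before handing them to a shell or a workflow manager such as Galaxy.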
VCF Format (Variant Call Format)
##fileformat=VCFv4.0
##fileDate=20090805
##source=myImputationProgramV3.1
##reference=1000GenomesPilot-NCBI36
##phasing=partial
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency">
##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele">
##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP membership, build 129">
##INFO=<ID=H2,Number=0,Type=Flag,Description="HapMap2 membership">
##FILTER=<ID=q10,Description="Quality below 10">
##FILTER=<ID=s50,Description="Less than 50% of samples have data">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=HQ,Number=2,Type=Integer,Description="Haplotype Quality">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA00001 NA00002
20 14370 rs6054257 G A 29 PASS NS=3;DP=14;AF=0.5;DB;H2 GT:GQ:DP:HQ 0|0:48:1:51,51 1|0:48:8:51,51
20 17330 . T A 3 q10 NS=3;DP=11;AF=0.017 GT:GQ:DP:HQ 0|0:49:3:58,50 0|1:3:5:65,3
Advantage: describes the variation at each position together with the genotype assigned to each sample
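As an illustration of how a data line of this file is read, the sketch below extracts the genotype of each sample from the first variant of the example. `parse_vcf_line` is an illustrative helper, not part of GATK. Real VCF files are tab-separated; the sketch splits on any whitespace because the tabs were lost in this transcript, which only works as long as no field contains a space.

```python
from typing import Dict, List


def parse_vcf_line(line: str, samples: List[str]) -> Dict:
    """Decode one VCF data line into alleles per sample (illustrative helper)."""
    columns = line.split()
    chrom, pos, snp_id, ref, alt, qual, status, info, fmt = columns[:9]
    alleles = [ref] + alt.split(",")  # index 0 = REF, 1.. = ALT alleles
    keys = fmt.split(":")
    genotypes = {}
    for name, field in zip(samples, columns[9:]):
        values = dict(zip(keys, field.split(":")))
        gt = values["GT"]
        indexes = gt.replace("|", "/").split("/")
        genotypes[name] = {
            "alleles": [alleles[int(i)] if i != "." else "." for i in indexes],
            "phased": "|" in gt,  # "|" = phased, "/" = unphased
            "depth": int(values["DP"]) if "DP" in values else None,
        }
    return {"chrom": chrom, "pos": int(pos), "id": snp_id,
            "filter": status, "genotypes": genotypes}


line = ("20 14370 rs6054257 G A 29 PASS NS=3;DP=14;AF=0.5;DB;H2 "
        "GT:GQ:DP:HQ 0|0:48:1:51,51 1|0:48:8:51,51")
variant = parse_vcf_line(line, ["NA00001", "NA00002"])
print(variant["pos"], variant["genotypes"]["NA00002"])
```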
GATK (other functionalities)
• DepthOfCoverage module: reports the sequencing depth of coverage for each gene, each position and each individual
• ReadBackedPhasing module: determines, where possible, how alleles at heterozygous positions are associated (phase or haplotype)
GATK (other functionalities)
[Pipeline diagram]
For each individual (ind1, ind2, ind3, ind4, …):
Fastq → FastQ Groomer → Mapping BWA → Add or Replace Groups → BAM with read group
For all individuals together:
BAM files with read group → mergeSam → Global BAM with read group
Global BAM with read group → IndelRealigner → UnifiedGenotyper → VCF file
DepthOfCoverage → Depth file
VCF file → ReadBackedPhasing → Phased VCF
VCF file → VariantFiltration → Filtered VCF
The SNiPlay project
SNiPlay: Web-based application for polymorphism analysis
http://sniplay.cirad.fr
[Pipeline diagram]
For each individual (RC1, RC2, RC3, RC4, …):
Fastq → FastQ Groomer → Mapping BWA → Add or Replace Groups → BAM with read group
For all individuals together:
BAM files with read group → mergeSam → Global BAM with read group
Global BAM with read group → IndelRealigner → UnifiedGenotyper → VCF file
DepthOfCoverage → Depth file
SNiPlay : parameters and options
Select the VCF format
Load the VCF file, the reference and the depth file
Indicate groups of individuals for SNP comparison
SNiPlay : parameters and options
Filter SNPs by minimum depth of coverage
Select the banana genome
Check the steps to be performed
SNiPlay: SNP statistics
SNiPlay and Illumina genotyping chip
Cartesian coordinates
Genotyping file
Submission file for Illumina
Analysis with the BeadStudio software
Design of Illumina chip
SNiPlay : Allelic files
Provides various formats of allelic files:
• DARwin format
@DARwin 5.0 - ALLELIC - 2
33 20
N° 50 50 122 122 218 218 245 245 261 261 290 290 356
1 1 1 1 1 3 3 3 3 4 4 2 2 2
2 1 1 1 1 3 3 1 3 4 4 2 2 2
3 1 1 1 1 3 3 3 3 4 4 2 2 2
4 1 1 1 1 3 3 3 3 4 4 2 2 2
• .inp format for Phase
33
10
P 49 121 217 244 260 289
SSSSSSSSSS
#cARB
A A G G T C C A T T
A A G G T C C A T T
#cSYR
A A G A T C C A T C
A A G G T C C A T T
• PED format
cARB 1 0 0 1 0 1 1 1 1 3 3 3 3 4 4 2 2 2 2 1 1 4 4 4 4
cSYR 2 0 0 1 0 1 1 1 1 3 3 1 3 4 4 2 2 2 2 1 1 4 4 2 4
cARA 3 0 0 1 0 1 1 1 1 3 3 3 3 4 4 2 2 2 2 1 1 4 4 4 4
• Format for TASSEL (association studies)
33 10:2
50 122 218 245 261 290 356 461 467 560
cARB A:A A:A G:G G:G T:T C:C C:C A:A T:T T:T
cSYR A:A A:A G:G A:G T:T C:C C:C A:A T:T C:T
cARA A:A A:A G:G G:G T:T C:C C:C A:A T:T T:T
cORL A:A A:A G:G G:G T:T C:C C:C A:A T:T T:T
cLAR A:G A:G A:G A:G C:T C:C C:C A:A T:T C:T
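The PED lines on this slide carry the same genotypes as the TASSEL-style rows, with alleles recoded as A=1, C=2, G=3, T=4. The sketch below shows that conversion for one row. It is not SNiPlay's code: the function name is illustrative, and the six leading columns (family ID, individual ID, father, mother, sex, phenotype) follow the usual PED convention, with the sex and phenotype values simply copied from the example.

```python
ALLELE_CODES = {"A": "1", "C": "2", "G": "3", "T": "4"}


def genotype_row_to_ped(row: str, index: int,
                        sex: str = "1", phenotype: str = "0") -> str:
    """Convert 'name A:A G:T ...' into one PED line (illustrative helper)."""
    name, *genotypes = row.split()
    # Family ID, individual ID, father, mother, sex, phenotype
    fields = [name, str(index), "0", "0", sex, phenotype]
    for genotype in genotypes:
        first, second = genotype.split(":")
        # Unknown alleles are written as 0 (missing)
        fields.append(ALLELE_CODES.get(first, "0"))
        fields.append(ALLELE_CODES.get(second, "0"))
    return " ".join(fields)


rows = [
    "cARB A:A A:A G:G G:G T:T C:C C:C A:A T:T T:T",
    "cSYR A:A A:A G:G A:G T:T C:C C:C A:A T:T C:T",
]
ped_lines = [genotype_row_to_ped(row, i) for i, row in enumerate(rows, start=1)]
for ped_line in ped_lines:
    print(ped_line)
```

The two printed lines reproduce the cARB and cSYR lines of the PED example.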
SNiPlay : Annotation of SNPs
1) Locate SNPs on a genome
• using Blast
• or using GFF if the reference corresponds to genes/CDS
2) Annotate SNPs with the SnpEff program
SNiPlay : Diversity analysis
SeqLib library
SNiPlay : Haplotype network
High frequency haplotypes
Low frequency haplotype
Group distribution within this haplotype
Distance between 2 haplotypes (number of mutations)
Haplophyle
SNiPlay : Comparison of SNPs between groups
Individual, group
Ind1, Table
Ind2, Table
Ind3, Table
Ind4, East
Ind5, East
Ind6, East
Ind7, East
Ind8, West
External file (optional)
SNiPlay: Population structure analysis
Admixture
• Test different values of K (estimates of probability that samples are structured in K populations)
• For the best value of K, the application shows Q estimates for each individual (probability that the individual belongs to each population)
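Reading the Q estimates can be sketched as follows: for each individual, take the population with the highest Q value. The Q values below are invented for illustration (K = 2), the individual names are placeholders, and `assign_populations` is an illustrative helper, not part of Admixture or SNiPlay.

```python
from typing import Dict, List, Sequence, Tuple


def assign_populations(names: Sequence[str],
                       q_matrix: Sequence[Sequence[float]]
                       ) -> Dict[str, Tuple[int, float]]:
    """Map each individual to (best population, its Q value); populations count from 1."""
    assignments = {}
    for name, q_values in zip(names, q_matrix):
        best = max(range(len(q_values)), key=lambda k: q_values[k])
        assignments[name] = (best + 1, q_values[best])
    return assignments


# Hypothetical Q estimates for K = 2; each row sums to 1.
names = ["Ind1", "Ind2", "Ind3"]
q_matrix: List[List[float]] = [
    [0.90, 0.10],
    [0.20, 0.80],
    [0.55, 0.45],
]
assignments = assign_populations(names, q_matrix)
print(assignments)
```

An individual such as Ind3 in this made-up table, with Q values close to each other, is better described as admixed than forced into one population.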
SNiPlay : GWAS analysis
Tassel, MLMM…
• Genome Wide Association Studies (GWAS)
• Estimate the association between SNPs and a phenotypic trait
• Display Manhattan plots: GWAS statistical tests (-log10 of the p-value) along the chromosomes
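The quantity plotted on the vertical axis of a Manhattan plot is obtained by a simple transformation of the p-values. A minimal sketch, with invented p-values and positions and an illustrative function name:

```python
import math
from typing import Iterable, List, Tuple


def manhattan_points(results: Iterable[Tuple[int, int, float]]
                     ) -> List[Tuple[int, int, float]]:
    """Turn (chromosome, position, p-value) into (chromosome, position, -log10 p)."""
    points = []
    for chromosome, position, p_value in results:
        if not 0.0 < p_value <= 1.0:
            raise ValueError(f"invalid p-value: {p_value}")
        points.append((chromosome, position, -math.log10(p_value)))
    # Order along the genome: by chromosome, then by position
    return sorted(points, key=lambda point: (point[0], point[1]))


# Hypothetical association results, for illustration only.
results = [(2, 2341, 0.5), (1, 1998, 1e-6), (1, 867, 0.001)]
points = manhattan_points(results)
for chromosome, position, score in points:
    print(chromosome, position, round(score, 2))
```

Small p-values become tall points, so the strongest associations stand out as peaks along the chromosomes.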
SNP analysis in an allopolyploid
SNiPloid
• Compares SNPs observed in the allotetraploid to those observed between the parental genomes
• Categorizes SNPs into different evolutionary scenarios
• Attempts to assign alleles to subgenomes
• Estimates the subgenomic contribution to the transcriptome
SNPs in the Banana Genome Hub
Objectives of the exercises
• Become familiar with, and use, the available packages/tools for SNP and INDEL detection from NGS data (assembly of NGS data)
• Think about the difficulties encountered when analysing next-generation sequencing data (distinguishing sequencing errors, paralogs and allelic variation)
• Detect SNPs and assign genotypes at every polymorphic position
• Easily exploit polymorphism data via a Web-based application (genetic diversity, LD)
• Obtain an exploitable dataset to send for the design of a high-throughput SNP chip (Illumina VeraCode technology)
Short reads Solexa
Mapping SAM
Exploitation of polymorphism data
Design of an Illumina SNP chip
Assignation of genotypes
Ind1 ATTGTGTCGTAACGTATGTCATGTCGT
Ind2 ATTGTGTCGGAACGTATGTCATGTCGT
Ind3 ATTGTGTCGKAACGTATGTCATGTCGT
Allelic variations
List of SNPs
867 A/G
1998 T/C
2341 T/G
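The genotype assignment illustrated by Ind1, Ind2 and Ind3 can be sketched in a few lines: compare the sequences column by column and expand IUPAC ambiguity codes (K = G or T) into heterozygous genotypes. This is a toy sketch on aligned consensus sequences of equal length, not the method used by GATK, and the function name is illustrative.

```python
from typing import Dict, List, Tuple

# Two-allele IUPAC codes; plain bases are treated as homozygous.
IUPAC = {"A": "AA", "C": "CC", "G": "GG", "T": "TT",
         "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC"}


def find_snps(sequences: Dict[str, str]) -> List[Tuple[int, str, Dict[str, str]]]:
    """Return (1-based position, alleles, genotype per individual) for each SNP."""
    lengths = {len(sequence) for sequence in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned and of equal length")
    snps = []
    for index in range(lengths.pop()):
        genotypes = {name: IUPAC[sequence[index]]  # KeyError on N or gaps
                     for name, sequence in sequences.items()}
        alleles = sorted(set("".join(genotypes.values())))
        if len(alleles) > 1:
            snps.append((index + 1, "/".join(alleles), genotypes))
    return snps


sequences = {
    "Ind1": "ATTGTGTCGTAACGTATGTCATGTCGT",
    "Ind2": "ATTGTGTCGGAACGTATGTCATGTCGT",
    "Ind3": "ATTGTGTCGKAACGTATGTCATGTCGT",
}
snps = find_snps(sequences)
print(snps)
```

On these three sequences the sketch reports a single G/T polymorphism at position 10, with Ind1 homozygous T, Ind2 homozygous G and Ind3 heterozygous.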