Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for...
Transcript of Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for...
![Page 1: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/1.jpg)
INTRODUCING NVBIO: HIGH PERFORMANCE PRIMITIVES FOR COMPUTATIONAL GENOMICS
Jonathan Cohen, NVIDIA
Nuno Subtil, NVIDIA
Jacopo Pantaleoni, NVIDIA
![Page 2: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/2.jpg)
SEQUENCING AND MOORE’S LAW
Slide courtesy Illumina
![Page 3: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/3.jpg)
DR
AM
I/F
HO
ST
I/F
Gig
a T
hre
ad
DR
AM
I/F
DR
AM
I/F
DR
AM
I/F
DR
AM
I/F
DR
AM
I/F
L2
Tesla Fermi Kepler
GPUS AND MOORE’S LAW
![Page 4: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/4.jpg)
ARE GPUS A FIT FOR BIOINFORMATICS? Con
Branchy code, integer-based
IO intensive
Many bottlenecks, many pipelines
Pro
High throughput
Inner-loops are hungry for bandwidth & compute
Few common computational patterns (Dynamic programming, text index traversal, branch-and-bound search)
![Page 5: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/5.jpg)
WHY NVBIO?
2 year internal NVIDIA effort to answer conclusively yes or no
Goals:
— Map core computational building blocks to CUDA
— Find programming idioms to express parallelism appropriately
— Understand performance at socket, node, and system level
— Deliver high-quality open source library for community to build on
— Produce usable tools for non-developers, comparable with current state-of-the-art
![Page 6: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/6.jpg)
OVERVIEW
What is NVBIO
Core concepts
NVBowtie2
Tutorial
![Page 7: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/7.jpg)
INTRODUCING NVBIO
Open source (GPL v2) C++ framework for sequence analysis
Designed for heterogeneous computing: CPU + GPU
Scalable computational building blocks for large datasets
Allow programmer to express and exploit parallelism easily
Ready-to-use tools (nvBowtie2, nvSetBwt)
![Page 8: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/8.jpg)
HETEROGENEOUS LIBRARY
FM-index
Suffix Trie
Radix Tree
Sorted Dictionary
Edit Distance
Smith-Waterman
Needleman-Wunsch
Gotoh
Banded/Full DP
DP Alignment Text Indices
Exact Search
Backtracking
Text Search
FASTQ
FASTA
BWT Index
Sequence I/O
SAM
BAM
CRAM (wip)
Alignment I/O
HTML report
generators
Support Tools
GPU
CPU
O( 1k-10k) threads
O( 10-100) threads
![Page 9: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/9.jpg)
BUILDING BLOCK: FM-INDEX
CAACATGTGAC
AGATTTAGACCA
CAGATTGAGTT
GTGTACAACCA
GATTAGACACAA
AGTTGTGA$
Original text FM-Index
Q=CTTAGAAC
Found at location 213321
Fast O(|Q|) query
GACCTAGCGAA
GCTTTCCCC$G
CGGACAAACAA
AAATAAATAATAA
TATTGTTTTTGA
GGGTAGGA
Burrows-Wheeler
Transform
O(1) Incremental Search:
Given set of substrings S,
find subset which start with X
![Page 10: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/10.jpg)
BWT CONSTRUCTION BENCHMARK (hg19 – GRCh37)
0:15:54
0:43:17
0:01:53
0:00:00
0:10:00
0:20:00
0:30:00
0:40:00
0:50:00
BWT Construction
bwt-sw
bowtie2
nvBWT
hh:mm:ss
CPU: i7 3930K 3.2Ghz
GPU: K40
![Page 11: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/11.jpg)
STRING SEARCH BENCHMARK
19.80
1.45
123.30
6.33
1
10
100
1000
Exact Search Approximate Search
i7 3930K 3.2Ghz
K40
M queries/s
Reads: SRR493095
Genome: hg19
![Page 12: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/12.jpg)
BUILDING BLOCK: DYNAMIC PROGRAMMING (SMITH WATERMAN, ETC.)
_ C C A T T G
_ 0 -1 -2 -3 -4 -5 -6
C -1 2 1 0 -1 -2 -3
A -2 1 1 3 2 1 0
T -3 0 0 2 5 4 3
T -4 -1 -1 1 4 7 6
G -5 -2 -2 0 3 6 9
G -6 -3 -3 -1 2 5 8
Match = +2
Mismatch = -1
Delete = -1
Insert = -1
CCATTG_
C_ATTGG
![Page 13: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/13.jpg)
SMITH-WATERMAN BENCHMARK
19.6
71.9
113.5
82.2
135
102.3
162
0
20
40
60
80
100
120
140
160
180
Smith-Waterman Edit Distance
i7 3930K 3.2 Ghz *
K20
K40 (750 Mhz)
K40 (850 Mhz)
GCUPS
* SSW library –
12 threads
![Page 14: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/14.jpg)
BUILDING BLOCK: PIPELINE PARALLELISM
void process(int i)
{
State state(i);
while (state.is_done() == false) // taken 30% of the times
{
if (state.A_flag) // taken 50% of the times
{ // => 15% utilization!
if (state.B_flag) // taken 33% of the times
AB( state ); // => 5% utilization!
else
A( state ); // => 10% utilization!
}
else if (state.B_flag) // taken 50% of the times
B( state ); // => 15% utilization!
if (state.C_flag) // taken 33% of the times
C( state ); // => 10% utilization!
}
}
![Page 15: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/15.jpg)
while (...)
AB()
A_flag C()
B()
A()
A_flag &
B_flag
B_flag
i
![Page 16: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/16.jpg)
BUILDING BLOCK: PIPELINE PARALLELISM
__host__ bool pipeline() { // pipeline scheduler - CPU
if (while_q.size() > thresh) while_stage<<<while_q.size()>>>();
if (AB_q.size() > thresh) AB_stage<<<AB_q.size()>>>(); // etc.
return while_q.empty() && AB.empty() && ...;
}
__global__ void while_stage() { // primary stage - GPU
const int tid = thread_id(); // thread id
if (tid >= in_queue.size()) return;
const State state = in_queue[tid]; // fetch work from input queue
if (state.A_flag)
{
if (state.B_flag) AB_queue.push( state );
else A_queue.push( state );
}
else if (state.B_flag) B_queue.push( state );
else if (state.C_flag) C_queue.push( state );
}
![Page 17: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/17.jpg)
NVBOWTIE2: RE-ENGINEERING BOWTIE2 ON NVBIO Implement Bowtie2 algorithm from scratch on NVBIO
Supports full spectrum of features
First achieve same accuracy as Bowtie2, then worry about speed
Single End
Paired Ends Local
End-to-End Best Mapping
All Mapping
![Page 18: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/18.jpg)
NVBOWTIE2 - Architecture Overview
seed
map score
no
traceback
CPU background
input thread
prefetch
read
batches
CPU background
output thread
consume
alignment
batches
CPU CPU
CPU CPU
reformat /
compress
yes
no
reseed?
CPU CPU
CPU CPU
decompress /
reformat yes
finished?
SAM /
BAM
FASTQ
(GZIP)
FM-INDEX
DP ALIGN
Runs on GPU
![Page 19: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/19.jpg)
Illumina HiSeq 2000 10M x 100bp x 2 end-to-end
Ion Proton 100M x 175bp (8-350) local
Illumina HiSeq 2000 10M x 100bp x 2 local
ERR161544
- ERR161544
NVBOWTIE2 – RESULTS ON REAL DATASETS
speedup 3.4x
alignment rate =
disagreement 0.006%
speedup 8.4x
alignment rate -0.6%
disagreement 0.03%
speedup 2.8x
alignment rate =
disagreement 0.022%
CPU: Core i7 3930K 3.2Ghz
GPU: K40
Alignment rate: % difference in
number of aligned reads (MAPQ >=32)
Disagreement: % reads aligned to
different location (MAPQ>=32)
![Page 20: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/20.jpg)
NVBIO TUTORIAL
Nuno Subtil
![Page 21: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/21.jpg)
OUTLINE
Problem statement
Architecting an aligner for the GPU
— Batching
— Lessons learned from nvBowtie
How to implement using nvbio
— I/O
— FM index lookups
— DP alignment
![Page 22: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/22.jpg)
PROBLEM STATEMENT
Implement a simple proto-alignment pipeline
— Input: human genome, set of short (~300bp) reads
— Output: a single genome coordinate per read
Two-stage approach:
— Seed using the initial 20bps via FM-index lookups
— Use dynamic programming to extend the first seed we find
![Page 23: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/23.jpg)
in.fastq
FM index
Parser Seeding Select +
Coordinate Xform
Extension Output
Genome
![Page 24: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/24.jpg)
BATCHING FOR GPUS
Classic CPU approach: drive one read through the entire pipeline, move on to the next
#pragma omp parallel for
for(c = 0; c < num_reads; c++) {
uint32 bwt_loc = fmi(fmindex, reads[c]);
uint32 genome_loc = locate(fmindex, bwt_loc);
uint32 score = sw(genome_loc, width, reads[c]);
}
in.fastq
FM index
Parser Seeding Select +
Coordinate Xform
Extension Output
Genome
![Page 25: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/25.jpg)
BATCHING FOR GPUS
GPUs benefit from a batched approach
for(c = 0; c < num_reads; c++)
bwt_loc[c] = fmi(fmindex, reads[c]);
for(c = 0; c < num_reads; c++)
genome_loc[c] = locate(fmindex, bwt_loc[c]);
for(c = 0; c < num_reads; c++)
score = sw(genome_loc[c], width, reads[c]);
in.fastq
FM index
Parser Seeding Select +
Coordinate Xform
Extension Output
Genome
![Page 26: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/26.jpg)
BATCHING FOR GPUS
GPUs benefit from a batched approach
for(c = 0; c < num_reads; c++)
bwt_loc[c] = fmi(fmindex, reads[c]);
for(c = 0; c < num_reads; c++)
genome_loc[c] = locate(fmindex, bwt_loc[c]);
for(c = 0; c < num_reads; c++)
score = sw(genome_loc[c], width, reads[c]);
in.fastq
FM index
Parser Seeding Select +
Coordinate Xform
Extension Output
Genome
![Page 27: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/27.jpg)
BATCHING FOR GPUS
GPUs benefit from a batched approach
fmi<<<T,B>>>(bwt_loc, fmindex, reads);
genome<<<T,B>>>(genome_loc, fmindex, bwt_loc);
sw<<<T,B>>>(score, genome_loc, width, reads);
in.fastq
FM index
Parser Seeding Select +
Coordinate Xform
Extension Output
Genome
![Page 28: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/28.jpg)
ARCHITECTURE Consume reads in large batches
Keep track of a “pipeline context”
— Intermediate buffers go here
— Memory requirements determined by batch size
— Avoid malloc/free during alignment (allocate memory up front)
Isolate each stage clearly
— Input: well-defined subset of pipeline state (+ read data from fastq)
— Well-defined output
— Allow for memory to be reused between stages
![Page 29: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/29.jpg)
Parser
Seeding Select +
Coordinate Xform
Extension
Output in.fastq
FM index Genome
context
Intermediate buffers
GPU
CPU
![Page 30: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/30.jpg)
PIPELINE SKELETON
![Page 31: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/31.jpg)
PARSING FASTQ FILES
![Page 32: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/32.jpg)
ACCESSING READ DATA
![Page 33: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/33.jpg)
CREATING AND LOADING THE FM-INDEX
NVbio includes offline utilities to generate the FM-index
— Same code is included in the library if you want to do it online
— FM-index also encapsulates packed genome data and SSA
FM-index interface can load the FM-index data from disk
Also supported: placing the index in system shared memory
— Index stays in memory across runs of your application
— Eliminates most startup costs
— Different class, same interface --- easy!
![Page 34: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/34.jpg)
NVBIO: LOADING FM-INDEX DATA
![Page 35: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/35.jpg)
QUERYING THE FM-INDEX
uint2 nvbio::match(fmindex, pattern, length)
— Queries fmindex for pattern
— Returns a BWT range (NOT genome coordinates)
![Page 36: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/36.jpg)
QUERYING THE FM-INDEX: GPU VERSION
![Page 37: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/37.jpg)
GENOME COORDINATE TRANSFORM
FM-index returns BWT coordinates
nvbio::locate(fmindex, bwt_coord)
— Input: FM-index, BWT coordinate
— Output: corresponding genome coordinate
![Page 38: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/38.jpg)
GENOME COORDINATE TRANSFORM
![Page 39: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/39.jpg)
DYNAMIC PROGRAMMING ALIGNMENT
nvbio supports several different DP aligners
— { Edit-distance, Smith-Waterman, Gotoh } x { local, semi-global, global } x { banded, full matrix } x { score, backtrack }
Template library for DP alignment
— Configured entirely at compile time
![Page 40: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/40.jpg)
WORKFLOW FOR DP ALIGNMENT
Configure alignment via typedefs
— Alignment parameters are template parameters
Create temporary storage
Call nvbio::aln::alignment_score()
![Page 41: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/41.jpg)
SETTING UP LOCAL GOTOH ALIGNMENT
Accumulates alignment results (score, position, …)
Can be implemented by client application
Alignment object type configured
for local Gotoh alignment
![Page 42: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/42.jpg)
STRING ALIGNMENT FUNCTION
![Page 43: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/43.jpg)
IT WORKS!!
![Page 44: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/44.jpg)
WHAT ELSE?
DP back-tracing
Fast BWT builder
Data movement between CPU and GPU
— Views vs containers
GPU work queues
— Flexible scheduling: per thread, per warp, per block
Many useful primitives
— Queues, heaps, string sets, …
![Page 45: Introducing NVBIO: High Performance Primitives for ... · —Produce usable tools for non-developers, comparable with current state-of-the-art . OVERVIEW What is NVBIO Core concepts](https://reader036.fdocuments.us/reader036/viewer/2022071215/6046873f05958e062c6597d8/html5/thumbnails/45.jpg)
WHERE TO GET http://nvlabs.github.io/nvbio/index.html
Google Group: nvbio-users