Feb 29, 2016 Online Tracking for Mu3e 1
Online Track Reconstruction on GPUs for the Mu3e Experiment
Dorothea vom Bruch for the Mu3e Collaboration
DPG Frühjahrstagung 2016, T42: Trigger und DAQ II
The Mu3e Experiment
Search for the charged lepton flavour-violating decay $\mu^+ \to e^+ e^- e^+$ with a sensitivity in the branching ratio better than $10^{-16}$
● Branching ratio suppressed in the Standard Model to below $10^{-54}$
● Any hint of a signal indicates new physics:
● Supersymmetry
● Grand unified models
● Extended Higgs sector
● ...
● Current limit on branching ratio: $10^{-12}$ (SINDRUM, 1988)
Signal versus Background
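Signal
● Coincident in time
● Single vertex
● $\sum \vec{p}_i = 0$
● $E = m_\mu$
[Figure: e+, e+, e− tracks from a single vertex]
Random Combinations
● Not coincident in time
● No single vertex
● $\sum \vec{p}_i \neq 0$
● $E \neq m_\mu$
[Figure: e+, e+, e− tracks from different vertices]
Internal Conversion
● Coincident in time
● Single vertex
● $\sum \vec{p}_i \neq 0$
● $E \neq m_\mu$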
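The kinematic criteria above can be illustrated with a minimal Python sketch. It assumes three reconstructed momenta in MeV/c and uses hypothetical tolerances `p_tol` and `e_tol`; these values and the function name are illustrative and not taken from the talk.

```python
import math

M_MU = 105.658  # muon mass in MeV/c^2
M_E = 0.511     # electron mass in MeV/c^2

def is_signal_like(momenta, p_tol=1.0, e_tol=1.0):
    """Check sum(p) = 0 and E = m_mu within tolerances (hypothetical values).

    momenta: three (px, py, pz) tuples in MeV/c for the e+, e-, e+ candidates.
    """
    # Vector sum of the three momenta, component by component.
    sum_p = [sum(p[i] for p in momenta) for i in range(3)]
    total_p = math.sqrt(sum(c * c for c in sum_p))
    # Total energy, treating every track as an electron or positron.
    total_e = sum(math.sqrt(sum(c * c for c in p) + M_E ** 2) for p in momenta)
    return total_p < p_tol and abs(total_e - M_MU) < e_tol

# Symmetric decay at rest: three equal momenta, 120 degrees apart.
p = math.sqrt((M_MU / 3) ** 2 - M_E ** 2)
signal = [(p * math.cos(a), p * math.sin(a), 0.0)
          for a in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]
# Three unrelated tracks: momenta do not balance.
background = [(30.0, 0.0, 0.0), (0.0, 25.0, 0.0), (0.0, 0.0, 20.0)]

print(is_signal_like(signal), is_signal_like(background))
```

The timing and vertex criteria are not modelled here; only the momentum and energy balance are checked.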
The Mu3e Detector
Requirements
● Excellent momentum resolution: < 0.5 MeV/c
● Good timing resolution: 100 ps for tiles, 1 ns for fibres, < 20 ns for pixels
● Good vertex resolution: 300 μm
● High rates: $10^8$ to $10^9$ μ/s (Paul Scherrer Institute, Switzerland)
Readout Scheme
● Triggerless readout
50 Gbit/s data rate
● Online data reduction
● Track reconstruction and vertex fitting on Graphics Processing Units (GPUs)
● Reduction factor of ~1000
[Figure: readout chain. ~1100 pixel sensors → up to 45 1.25 Gbit/s links → 38 FPGAs → 1 6.4 Gbit/s link each → switching boards → 12 6.4 Gbit/s links per board → 12 PCs, each with FPGA and GPU → Gbit Ethernet → data collection server → mass storage]
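As a rough cross-check of the numbers on this slide, the following sketch applies the quoted reduction factor of ~1000 to the quoted 50 Gbit/s input rate. The function name is hypothetical, and the result is only an order-of-magnitude estimate.

```python
def reduced_rate_mbyte_s(input_gbit_s, reduction_factor):
    """Output data rate in MB/s after online reduction."""
    output_mbit_s = input_gbit_s * 1000.0 / reduction_factor
    return output_mbit_s / 8.0

# 50 Gbit/s reduced by ~1000 gives 50 Mbit/s, i.e. 6.25 MB/s.
print(reduced_rate_mbyte_s(50.0, 1000.0))
```

Under these numbers the reduced stream is well below the capacity of a Gbit Ethernet link.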
Fast Data Transfer
[Figure: FPGA, CPU with RAM, and GPU connected via PCIe]
● Direct Memory Access to main memory
● Copy to GPU memory
● At 1.5 GB/s: measured bit error rate < $4 \times 10^{-16}$
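To give a feeling for what such a limit implies, the sketch below estimates how long an error-free transfer must run to quote it. It assumes the limit is stated as one over the number of transferred bits with zero observed errors; the talk does not say how the limit was derived, so this is an assumption.

```python
def hours_for_ber_limit(ber_limit, rate_bytes_per_s):
    """Error-free transfer time needed for a limit of 1 / N_bits (assumed definition)."""
    bits_needed = 1.0 / ber_limit
    seconds = bits_needed / (rate_bytes_per_s * 8.0)
    return seconds / 3600.0

# 2.5e15 bits at 1.5 GB/s: roughly 58 hours of continuous transfer.
print(round(hours_for_ber_limit(4e-16, 1.5e9), 1))
```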
Online Reconstruction
● Number of possible track candidates ~ $n^3$
● At $10^8$ μ/s: ~10 hits / layer / 50 ns → $O(10^3)$ combinations / 50 ns
[Figure: data flow. FPGA: geometrical selection → DMA transfer → main memory as buffer → DMA transfer → GPU RAM: multiple scattering fit, matching layer 4, track combinations (e+ e−), vertex fit, selection decision → main memory: selected frames]
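The combinatorics quoted above can be reproduced with a short sketch. It assumes that $n$ is the number of hits per layer and that a candidate is built from one hit in each of three layers; the function name is illustrative.

```python
def triplet_rate(hits_per_layer, frame_length_s, n_layers=3):
    """Return (combinations per frame, combinations per second)."""
    combinations = hits_per_layer ** n_layers
    return combinations, combinations / frame_length_s

per_frame, per_second = triplet_rate(10, 50e-9)
print(per_frame, f"{per_second:.1e}")
```

With 10 hits per layer and 50 ns frames this gives 1000 combinations per frame, or about $2 \times 10^{10}$ per second, which agrees with the total triplet rate quoted on the performance slide.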
Geometrical Selection
[Figure, shown in several build steps: hits in layers 0, 1 and 2 in the r-z and x-y views. Selection variables between layers 0 and 1: $z_1 - z_0$ and $\Phi_1 - \Phi_0$; between layers 1 and 2: $z_2 - z_1$ and $\Phi_2 - \Phi_1$]
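A minimal sketch of such a geometrical selection is given below. It assumes each hit is a `(z, phi)` tuple and applies the same cut to both layer pairs; the cut values `max_dz` and `max_dphi` are hypothetical and not taken from the talk.

```python
import math
from itertools import product

def delta_phi(phi_a, phi_b):
    """Signed difference phi_b - phi_a, wrapped into [-pi, pi)."""
    return (phi_b - phi_a + math.pi) % (2.0 * math.pi) - math.pi

def passes_cuts(hit_a, hit_b, max_dz, max_dphi):
    """Cut on the z and phi differences between hits in adjacent layers."""
    dz = hit_b[0] - hit_a[0]
    dphi = delta_phi(hit_a[1], hit_b[1])
    return abs(dz) <= max_dz and abs(dphi) <= max_dphi

def select_triplets(layer0, layer1, layer2, max_dz=30.0, max_dphi=0.8):
    """Return index triplets (i, j, k) whose hits pass both pairwise cuts."""
    selected = []
    for i, j, k in product(range(len(layer0)), range(len(layer1)), range(len(layer2))):
        if (passes_cuts(layer0[i], layer1[j], max_dz, max_dphi)
                and passes_cuts(layer1[j], layer2[k], max_dz, max_dphi)):
            selected.append((i, j, k))
    return selected

layer0 = [(0.0, 0.1)]
layer1 = [(5.0, 0.2), (200.0, 3.0)]  # second hit is far away in z and phi
layer2 = [(12.0, 0.4)]
print(select_triplets(layer0, layer1, layer2))
```

The wrap-around in `delta_phi` matters for hits on either side of $\Phi = \pm\pi$, which would otherwise fail the cut although they are close.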
Multiple Scattering Fit
● Electrons: 12 – 53 MeV/c
● Resolution dominated by multiple Coulomb scattering
● Ignore hit uncertainty
● Describe track as sequence of hit triplets
● Multiple scattering at middle hit of triplet
● Minimize multiple scattering
[Figure: hit triplet]
$$\chi^2 = \frac{\Phi_{MS}^2}{\sigma_{MS,\Phi}^2} + \frac{\theta_{MS}^2}{\sigma_{MS,\theta}^2}$$
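The $\chi^2$ above can be evaluated as in the following sketch. The function `triplet_chi2` follows the formula on the slide. The helper `highland_sigma` uses the commonly quoted Highland parametrization for the scattering width as an assumption; the talk does not state which expression is used, and the material thickness `x_over_x0 = 0.001` is an illustrative value.

```python
import math

def highland_sigma(p_mev, x_over_x0, beta=1.0):
    """Multiple scattering width in rad (Highland parametrization, charge 1)."""
    return (13.6 / (beta * p_mev)) * math.sqrt(x_over_x0) * (
        1.0 + 0.038 * math.log(x_over_x0))

def triplet_chi2(phi_ms, theta_ms, sigma_phi, sigma_theta):
    """chi^2 of the scattering angles at the middle hit of a triplet."""
    return (phi_ms / sigma_phi) ** 2 + (theta_ms / sigma_theta) ** 2

# Scattering width over the electron momentum range quoted above.
sigma_low = highland_sigma(12.0, 0.001)
sigma_high = highland_sigma(53.0, 0.001)
print(sigma_low > sigma_high, triplet_chi2(0.01, 0.02, 0.01, 0.01))
```

The width scales as 1/p, so low-momentum electrons scatter more, which is why multiple scattering dominates the resolution in this momentum range.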
Propagation to 4th Layer
● Position of 4th layer known
● : propagate in xy-plane
● : propagate in z direction
After all selections:
● 98 % of true 4-hit tracks selected
● 65 % random combinations of 3 hits
[Figure: propagation geometry with labels α, β and R]
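One way to picture the propagation in the xy-plane is as the intersection of the track circle with the cylinder of the 4th layer. The sketch below is a generic circle-circle intersection under that assumption; it is not the method from the talk, and the parameter names (`cx`, `cy`, `track_radius`, `layer_radius`) are hypothetical.

```python
import math

def propagate_xy(cx, cy, track_radius, layer_radius):
    """Intersections of a track circle with a layer cylinder centred on the beam axis.

    (cx, cy) is the centre of the track circle in the transverse plane.
    Returns a list of (x, y) points; empty if the track does not reach the layer.
    """
    d = math.hypot(cx, cy)
    if d == 0.0 or d > track_radius + layer_radius or d < abs(track_radius - layer_radius):
        return []
    # Distance from the origin to the chord, along the line towards the circle centre.
    a = (layer_radius ** 2 - track_radius ** 2 + d ** 2) / (2.0 * d)
    h = math.sqrt(max(layer_radius ** 2 - a ** 2, 0.0))
    ux, uy = cx / d, cy / d
    return [(a * ux - h * uy, a * uy + h * ux),
            (a * ux + h * uy, a * uy - h * ux)]

points = propagate_xy(50.0, 0.0, 50.0, 60.0)
print(points)
```

The z position would then follow from the arc length travelled along the circle and the track's polar angle, which is omitted here.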
Parallelization
[Figure: grid of parallel GPU threads]
~2000 compute cores on GPU
● Fit for one combination of three hits
● Cut on χ2
● Propagation to 4th layer
● Loop over hits in 4th layer: check if hit exists in proximity of propagated track
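The four steps above can be sketched as a sequential Python emulation of the work for one triplet. The callbacks `fit` and `propagate`, the cut `chi2_max` and the `match_window` are hypothetical stand-ins; on the GPU each such call would run in its own thread.

```python
def process_triplet(triplet, layer4_hits, fit, propagate, chi2_max, match_window):
    """Return the index of a matching 4th-layer hit, or None if the triplet is rejected."""
    # Step 1: fit one combination of three hits.
    chi2, track = fit(triplet)
    # Step 2: cut on chi^2.
    if track is None or chi2 > chi2_max:
        return None
    # Step 3: propagate the fitted track to the 4th layer.
    predicted = propagate(track)
    # Step 4: loop over 4th-layer hits and look for one close to the prediction.
    for index, hit in enumerate(layer4_hits):
        if all(abs(h - p) <= w for h, p, w in zip(hit, predicted, match_window)):
            return index
    return None

# Toy stand-ins: the "fit" returns a fixed chi^2 and a predicted (phi, z) position.
def toy_fit(triplet):
    return triplet[0], (1.0, 2.0)

def toy_propagate(track):
    return track

hits4 = [(5.0, 5.0), (1.05, 2.1)]
print(process_triplet((0.5,), hits4, toy_fit, toy_propagate, 1.0, (0.2, 0.2)))
print(process_triplet((2.0,), hits4, toy_fit, toy_propagate, 1.0, (0.2, 0.2)))
```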
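Performance

| Muon rate | Fits / s (GTX680) | Fits / s (GTX980) |
| --- | --- | --- |
| $10^8$ muons / s | $2 \times 10^7$ | $3 \times 10^7$ |
| $10^9$ muons / s | $9.7 \times 10^9$ | $1.6 \times 10^{10}$ |

Pictures: pcmag.com, nvidia.com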
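Selection steps at $10^8$ muons / s:

| Step | Reduction factor | Triplets / s |
| --- | --- | --- |
| Total | | $2 \times 10^{10}$ |
| After geometrical selection | 50 | $4 \times 10^8$ |
| After multiple scattering fit | 2 | $2 \times 10^8$ |
| After propagation to 4th layer | 2.5 | $8 \times 10^7$ |

At $10^8$ μ/s: O(10) DAQ computers are sufficient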
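The reduction chain can be checked with a few lines of arithmetic. The GPU estimate below assumes that the fit has to process every triplet surviving the geometrical selection and that the quoted fits per second apply to one GPU; both are readings of the slide, not statements from the talk.

```python
import math

def apply_reductions(total_rate, factors):
    """Return the rate after each successive reduction factor."""
    rates = []
    rate = total_rate
    for factor in factors:
        rate = rate / factor
        rates.append(rate)
    return rates

def gpus_needed(fit_input_rate, fits_per_gpu):
    """Number of GPUs needed to keep up with the fit input rate."""
    return math.ceil(fit_input_rate / fits_per_gpu)

# Geometrical selection, multiple scattering fit, propagation to 4th layer.
rates = apply_reductions(2e10, [50, 2, 2.5])
print(rates)
print(gpus_needed(rates[0], 2e7), gpus_needed(rates[0], 3e7))
```

Under these assumptions the result is 14 to 20 GPUs, the same order of magnitude as the O(10) DAQ computers quoted on the slide.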
Next Steps
● Study, optimize vertex fit performance
● Simplify for GPU implementation
● Implement geometrical selection on FPGA
● Test whole chain of online selection
More Mu3e talks:
● Mu3e Experiment: T22.4&5, T42.7, T75.7, T98.1&5
● MuPix Telescope: T42.6, T99.5
● HV-MAPS / MuPix: T72.1-3
Backup Slides
Multiple Scattering Fit
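Reduce by factor 2
[Figure: hit triplet in the x-y and s-z views, with segments $S_{01}$ and $S_{12}$ and scattering angles $\Phi_{MS}$ and $\theta_{MS}$]
$$\chi^2 = \frac{\Phi_{MS}^2}{\sigma_{MS,\Phi}^2} + \frac{\theta_{MS}^2}{\sigma_{MS,\theta}^2}$$
● $R_{3D}$ from fit
● Sign of $R_{3D}$ → track curvature
● Cut on fit success and $\chi^2$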
Required Momentum Resolution
Graph: R. M. Djilkibaev, R. V. Konoplich, Phys.Rev.D79(2009)073004
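Performance @ $10^9$ muons/s

| Step | Reduction factor | Triplets / s |
| --- | --- | --- |
| Total | | $2 \times 10^{13}$ |
| After geometrical selection | 50 | $4 \times 10^{11}$ |
| After multiple scattering fit | 2 | $2 \times 10^{11}$ |
| After propagation to 4th layer | 2.6 | $8 \times 10^{10}$ |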
GPU Properties
● Highly parallel structure
● Process large blocks of data
● Nvidia: API extension to C: CUDA (Compute Unified Device Architecture)
[Figure: host (CPU with memory) and device (GPU card with DRAM, cache and streaming multiprocessors, SM). Host code steps: allocate, launch kernel, copy back]
GPU Architecture
[Figure: device with streaming multiprocessors SM 0, SM 1, SM 2, SM 3, ...; blocks 0 to 7, ... distributed over the SMs; the threads of a block are grouped into warps: warp 0 = threads 0 to 31, warp 1 = threads 32 to 63, warp 2 = threads 64 to 95]
Specs for GTX680:
● 8 SMs
● Max. 2048 threads per SM
● Max. 1024 threads per block
● Number of blocks per SM is limited
● 1 kernel per thread, all threads execute same kernel
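The limits quoted for the GTX680 translate into simple bookkeeping, sketched below. Only the thread-count limits from the slide are used; real devices impose further limits (for example registers and shared memory) that are ignored here, and the function names are illustrative.

```python
import math

WARP_SIZE = 32
MAX_THREADS_PER_BLOCK = 1024
MAX_THREADS_PER_SM = 2048

def warps_per_block(block_size):
    """Number of 32-thread warps needed for one block."""
    return math.ceil(block_size / WARP_SIZE)

def resident_blocks_per_sm(block_size):
    """Blocks that fit on one SM, considering the thread limit only."""
    if not 0 < block_size <= MAX_THREADS_PER_BLOCK:
        raise ValueError("block size outside the allowed range")
    return MAX_THREADS_PER_SM // block_size

print(warps_per_block(128), resident_blocks_per_sm(128), resident_blocks_per_sm(1024))
```

A block size that is a multiple of 32 avoids partially filled warps.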
Fitting Kernel
[Figure: launch grid of blocks, Block (0,0), Block (0,1), ..., Block (0,N); each block holds threads Thread (0,0), Thread (0,1), ..., Thread (0,128)]
● Grid dimension N = # selected triplets / 128
● Block dimension x = 128 (or other multiple of 32)
Preceding selection steps:
● Launch grid with all possible hit combinations
● Apply selection cuts
● Store indices of selected triplets
FPGA in final implementation
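The mapping from blocks and threads to triplets can be emulated sequentially in Python, as in the sketch below. It assumes a one-dimensional launch with 128 threads per block and a bounds check for the last, partially filled block; the function names are hypothetical and the loops stand in for what the GPU executes in parallel.

```python
import math

BLOCK_DIM = 128

def launch_config(n_triplets, block_dim=BLOCK_DIM):
    """Number of blocks needed so that every triplet gets one thread."""
    return math.ceil(n_triplets / block_dim), block_dim

def global_index(block_idx, thread_idx, block_dim=BLOCK_DIM):
    """Triplet index handled by a given thread of a given block."""
    return block_idx * block_dim + thread_idx

def run_kernel(triplets, kernel, block_dim=BLOCK_DIM):
    """Sequential emulation: every (block, thread) pair processes one triplet."""
    n_blocks, _ = launch_config(len(triplets), block_dim)
    results = [None] * len(triplets)
    for block_idx in range(n_blocks):
        for thread_idx in range(block_dim):
            i = global_index(block_idx, thread_idx, block_dim)
            if i < len(triplets):  # threads beyond the last triplet do nothing
                results[i] = kernel(triplets[i])
    return results

print(launch_config(300))
print(run_kernel(list(range(300)), lambda t: 2 * t)[299])
```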
DMA: Implementation
● Stratix V / IV development board: DMA engine, PCIe interface
● Kernel module for communication with FPGA
– Mapping of memory addresses
– Read, write functions
– Interrupt handling
● CUDA API: memory allocation of page-locked memory, usable for DMA from FPGA to RAM and from RAM to GPU memory
● Use DMA with scatter / gather mapping
– Large (GB) memory buffers possible
DMA: Implementation
[Figure: memory allocated via the CUDA API; contiguous virtual memory maps to physical memory segments of length 1, length 2 and length 3; addresses and lengths are written to the FPGA address memory; FPGA data memory: 256 kB; GPU]
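The idea of a scatter / gather mapping can be sketched as a list of (address, length) entries describing where a virtually contiguous buffer lives in physical memory. The example below merges physically adjacent pages into single entries; the page size of 4096 bytes, the function name and the example addresses are assumptions for illustration, not details from the talk.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

def build_sg_list(physical_pages, page_size=PAGE_SIZE):
    """Merge physically adjacent pages into (address, length) entries.

    physical_pages: physical start addresses of the buffer's pages,
    in the order in which they appear in virtual memory.
    """
    entries = []
    for address in physical_pages:
        if entries and entries[-1][0] + entries[-1][1] == address:
            # Page directly follows the previous entry: extend it.
            start, length = entries[-1]
            entries[-1] = (start, length + page_size)
        else:
            entries.append((address, page_size))
    return entries

# Two adjacent pages followed by one page elsewhere in physical memory.
print(build_sg_list([0x1000, 0x2000, 0x8000]))
```

A list of this kind is what would be written to the FPGA as addresses and lengths, so that the DMA engine can fill a large buffer without requiring it to be physically contiguous.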
Segmentation, Interrupt messages
[Figure: segmentation of a transfer into 64 DMA blocks and 16 PCIe blocks, with interrupt messages; label: 4 kB each]