Weekly Report-Reduction
Ph.D. Student: Leo Lee
Date: Oct. 30, 2009
Outline
7 Reduction implementations
Matrix multiplication
Protein Identification
Work plan
Five implementations discussed before
Interleaved addressing with “%”
Interleaved addressing with bank conflicts
Sequential addressing
Perform the first add when loading
Unroll the last warp
Implementation 6: Complete Unrolling
Specify block size as a function template parameter
Invoking template kernels
Could block size be a parameter?
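The idea on these slides can be sketched on the host in plain C++. This is not the CUDA kernel from the report: `reduceBlock` and `reduceDispatch` are hypothetical names, and the inner loop over `tid` stands in for the threads of one block. The sketch only illustrates how a block size known at compile time can still be selected at run time through a switch over a fixed set of template instantiations, assuming power-of-two block sizes up to 512.

```cpp
// Host-side analogue of one block's tree reduction (hypothetical helper).
// BlockSize is a compile-time constant, so the compiler is free to unroll
// the stride loop completely for each instantiation.
template <unsigned BlockSize>
float reduceBlock(const float* in) {
    static_assert(BlockSize > 0 && (BlockSize & (BlockSize - 1)) == 0,
                  "BlockSize must be a power of two");
    float buf[BlockSize];  // stands in for shared memory
    for (unsigned i = 0; i < BlockSize; ++i) buf[i] = in[i];
    // Sequential addressing: "threads" 0..s-1 are active at each step.
    for (unsigned s = BlockSize / 2; s > 0; s >>= 1)
        for (unsigned tid = 0; tid < s; ++tid)
            buf[tid] += buf[tid + s];
    return buf[0];
}

// Runtime dispatch: the run-time block size picks one instantiation.
// Returns false for sizes that have no instantiation.
bool reduceDispatch(const float* in, unsigned blockSize, float* out) {
    switch (blockSize) {
        case 512: *out = reduceBlock<512>(in); return true;
        case 256: *out = reduceBlock<256>(in); return true;
        case 128: *out = reduceBlock<128>(in); return true;
        case 64:  *out = reduceBlock<64>(in);  return true;
        case 32:  *out = reduceBlock<32>(in);  return true;
        case 16:  *out = reduceBlock<16>(in);  return true;
        case 8:   *out = reduceBlock<8>(in);   return true;
        case 4:   *out = reduceBlock<4>(in);   return true;
        case 2:   *out = reduceBlock<2>(in);   return true;
        case 1:   *out = reduceBlock<1>(in);   return true;
        default:  return false;
    }
}
```

The switch keeps the number of instantiations small (ten here) while letting the caller choose the block size at run time.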
Results
Implementation 7:
Results
Brief summary
Optimization
Use efficient operators, not %;
Avoid divergent branches within a warp;
Minimize the time spent accessing global memory;
Avoid bank conflicts in shared memory;
Unroll loops as much as possible.
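As a small illustration of the first point, here is one common substitution for `%` when the divisor is a power of two. The predicate shape (`tid` is a multiple of `2 * s`) is an assumption about what an interleaved-addressing kernel might test; the function names are hypothetical and the report's kernels may remove `%` differently, for example by restructuring the index.

```cpp
// For unsigned x and a power-of-two m, x % m equals x & (m - 1).
inline unsigned modPow2(unsigned x, unsigned m) { return x & (m - 1u); }

// Assumed interleaved-addressing test: is tid a multiple of 2 * s?
// s must be a power of two for the mask version to be equivalent.
inline bool isActiveMod(unsigned tid, unsigned s) {
    return tid % (2u * s) == 0u;
}
inline bool isActiveMask(unsigned tid, unsigned s) {
    return (tid & (2u * s - 1u)) == 0u;
}
```

Both predicates select the same threads for power-of-two strides; the mask form avoids the integer modulo.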
Matrix multiplication: time calculation
Total time, in the main function:
    double lf = clock();
    for (int i = 0; i < nMultipTimes; ++i)
    {
        RunTest(argc, argv, i, lfTotalTime, true);
    }
    double lfElapsed = clock() - lf;  // total time, in clock ticks
Compute time, in the RunTest function:
    // create and start timer
    unsigned int nTimer = 0;
    cutilCheckError(cutCreateTimer(&nTimer));
    cutilCheckError(cutStartTimer(nTimer));
    matrixMul<<<grid, thread>>>(pDC, pDA, pDB, WA, WB);
    // copy result from device to host
    cutilSafeCall(cudaMemcpy(pHC, pDC, nMemSizeC, cudaMemcpyDeviceToHost));
    // stop and destroy timer
    cutilCheckError(cutStopTimer(nTimer));
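The two measurements can be separated with a small host-side helper. This is a sketch using `std::chrono` rather than the cutil timers in the report; `Timings` and `timeRun` are hypothetical names, `setup` stands for everything in a run that is not the timed region, and `compute` stands for the timed region itself.

```cpp
#include <chrono>

struct Timings {
    double computeMs;  // sum of the timed regions
    double totalMs;    // wall time of the whole loop
};

// Runs setup() then compute() `repeats` times, timing only compute()
// for computeMs and the entire loop for totalMs.
template <typename Setup, typename Compute>
Timings timeRun(Setup setup, Compute compute, int repeats) {
    using Clock = std::chrono::steady_clock;
    using Ms = std::chrono::duration<double, std::milli>;
    double computeMs = 0.0;
    const auto totalStart = Clock::now();
    for (int i = 0; i < repeats; ++i) {
        setup();
        const auto t0 = Clock::now();
        compute();
        computeMs += Ms(Clock::now() - t0).count();
    }
    const double totalMs = Ms(Clock::now() - totalStart).count();
    return {computeMs, totalMs};
}
```

Note that a CUDA kernel launch returns to the host before the kernel has finished, so a host timer that is stopped without a blocking call (such as the cudaMemcpy above) would mostly capture launch overhead.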
Experiments
WA, HA, WB          Compute time (ms)       Total time (ms)
                    GPU        CPU          GPU        CPU
16, 16, 16          45         15           24678      78
32, 32, 32          60         62           27250      203
48, 80, 128         225        861          26625      1203
128, 256, 512       4249       45829        35531      49328
512, 512, 512       27441      364232       70359      382062
2048, 2048, 2048    1697967    -            2020968    -
For 2048, 2048, 2048 the GPU compute time goes from 1697967 to 20 when the transfer of the results to the host is excluded.
Mass Spectrometry Based Protein Identification
Mixed proteins (e.g. >ipi|IPI00243451|IPI00243451.6 MDQHQHLNKTAESASSEKKKTRRCNGFKMFLAALSFSYIAKALGGIIMKISITQIERRFD…)
-> Digest -> Mixed peptides (e.g. TAESASSEK, MFLAALSFSYIAK, …)
-> LC-MS/MS -> Data
-> Analyze -> Peptide sequence
-> Merge -> Protein sequence
[Figure: tandem MS spectrum. Scan header: 19-21-08 FT 893 MS2 9 avg #1, RT: 0.63, AV: 1, NL: 1.04E4, FTMS + p NSI Full ms2, range 500.00-1600.00. x-axis: m/z; y-axis: relative abundance (0 to 100). Labeled peaks include 928.6396, 929.9735, 720.3784, 823.9249, 916.4733 and 1008.5148.]
Protein Identification
Computationally intensive
An experiment usually identifies around 50,000 mass spectra;
Each mass spectrum is scored against at least 10,000 peptides, and sometimes more than 100,000;
Some experiments take more than a year, so parallel software is commonly used. Papers on parallel identification have been published in JRP, RCM and Bioinformatics, but no research so far is based on GPUs.
Protein Identification
Mass spectrum [Figure: tandem MS spectrum; x-axis: m/z, y-axis: relative abundance] -> mass + tolerance: M1, M2, M3, …, M200
Peptide TAESASSEK -> mass: P1, P2, P3, …, P40
Score
For each Mi in MS, check whether it is in P; this gives a vector MSB = (0, 1, 0, 0, …).
For each Pi in P, check whether it is in MS; this gives a vector PB = (1, 1, 0, …).
Compute the cosine…
Keep the top-k scores.
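The scoring steps can be sketched as follows. The slide does not give the cosine formula, so the score here is an assumption: the geometric mean of the matched fractions of MS and P, which equals the cosine of two binary indicator vectors when every match is one-to-one. All names (`matchVector`, `cosineScore`, `Scored`, `topK`) are hypothetical, and both mass lists are assumed to be sorted in ascending order.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Mark each mass in `a` with 1 if some mass in `b` lies within `tol`.
// `b` must be sorted ascending.
std::vector<int> matchVector(const std::vector<double>& a,
                             const std::vector<double>& b, double tol) {
    std::vector<int> hit(a.size(), 0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        auto it = std::lower_bound(b.begin(), b.end(), a[i] - tol);
        if (it != b.end() && *it <= a[i] + tol) hit[i] = 1;
    }
    return hit;
}

// Assumed cosine-style score built from the two match vectors.
double cosineScore(const std::vector<int>& msb, const std::vector<int>& pb) {
    if (msb.empty() || pb.empty()) return 0.0;
    const double hitsMs = std::accumulate(msb.begin(), msb.end(), 0);
    const double hitsP = std::accumulate(pb.begin(), pb.end(), 0);
    return std::sqrt(hitsMs * hitsP) /
           std::sqrt(static_cast<double>(msb.size()) *
                     static_cast<double>(pb.size()));
}

struct Scored {
    double score;
    std::size_t peptide;  // index of the peptide that was scored
};

// Keep the k highest-scoring entries, best first.
std::vector<Scored> topK(std::vector<Scored> all, std::size_t k) {
    k = std::min(k, all.size());
    const auto mid = all.begin() + static_cast<std::ptrdiff_t>(k);
    std::partial_sort(all.begin(), mid, all.end(),
                      [](const Scored& x, const Scored& y) {
                          return x.score > y.score;
                      });
    all.erase(mid, all.end());
    return all;
}
```

Each spectrum/peptide pair is scored independently of the others, which is what makes this step a candidate for parallel execution.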
Peptide (listed by mass)
400 EVDG
400 AAEE
400 PSTD
631 EMSVPS
699 TLKHLK
699 WDRDL
……
Protein
>IQPSKANMETEPDQ…
>DEAVPPPALQLQFN…
>RQRAILKVMNTIGGE……
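A mass-sorted peptide list like the one above lends itself to a range lookup: given a target mass and a tolerance, two binary searches find all candidate peptides. The slides do not say how candidates are selected, so this is only one plausible reading; `PeptideEntry` and `candidateRange` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct PeptideEntry {
    double mass;
    std::string sequence;
};

// Returns the half-open index range [first, last) of peptides whose mass
// lies in [target - tol, target + tol]. `index` must be sorted by mass.
std::pair<std::size_t, std::size_t> candidateRange(
        const std::vector<PeptideEntry>& index, double target, double tol) {
    auto lo = std::lower_bound(
        index.begin(), index.end(), target - tol,
        [](const PeptideEntry& e, double m) { return e.mass < m; });
    auto hi = std::upper_bound(
        index.begin(), index.end(), target + tol,
        [](double m, const PeptideEntry& e) { return m < e.mass; });
    return {static_cast<std::size_t>(lo - index.begin()),
            static_cast<std::size_t>(hi - index.begin())};
}
```

With the six peptides listed above, a target of 699.2 and a tolerance of 0.5 would select the two entries of mass 699.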
Protein Identification
X! Tandem
The most efficient open source software;
Has a parallel version;
Not too large!
Other work
Read parallel computing books;
Learn MLP, data mining.
Work plan
Data mining homework;
K-means debugging and testing.
Thanks