SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for...
![Page 1: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/1.jpg)
SMASH: Co-Designing Software Compression
and Hardware-Accelerated Indexing
for Efficient Sparse Matrix Operations
Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina Giannoula,
Roknoddin Azizi, Skanda Koppula, Nika Mansouri Ghiasi,
Taha Shahroodi, Juan Gomez Luna, Onur Mutlu
![Page 2: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/2.jpg)
Executive Summary
• Many important workloads make heavy use of sparse matrices
  • Matrices are extremely sparse
  • Compression is essential to avoid storage/computational overheads
• Shortcomings of existing compression formats
  • Expensive discovery of the positions of non-zero elements, or
  • Narrow applicability
• SMASH: hardware/software cooperative mechanism for efficient sparse matrix storage and computation
  • Software: efficient compression scheme using a hierarchy of bitmaps
  • Hardware: hardware unit interprets the bitmap hierarchy and accelerates indexing
• Performance improvement: 38% and 44% for SpMV and SpMM over the widely used CSR format
• SMASH is highly efficient, low-cost and widely applicable
![Page 3: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/3.jpg)
Presentation Outline
1. Sparse Matrix Basics
2. Compression Formats & Shortcomings
3. SMASH
4. Evaluation
5. Conclusion
![Page 4: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/4.jpg)
Sparse Matrix Operations are Widespread Today
Recommender Systems
• Collaborative Filtering
Graph Analytics
• PageRank
• Breadth First Search
• Betweenness Centrality
Neural Networks
• Sparse DNNs
• Graph Neural Networks
![Page 5: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/5.jpg)
Real World Matrices Have High Sparsity
[Figure: two example real-world matrices, one with 0.0003% non-zero elements and one with 2.31% non-zero elements]
Sparse matrix compression is essential to enable efficient storage and computation
![Page 6: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/6.jpg)
![Page 7: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/7.jpg)
Limitations of Existing Compression Formats
1. General formats optimize for storage
   → Expensive discovery of the positions of non-zero elements
![Page 8: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/8.jpg)
Widely Used Format: Compressed Sparse Row
• Used in multiple libraries & frameworks: Intel MKL, TACO, Ligra, Polymer
• Stores only the non-zero elements of the sparse matrix
• Provides high compression ratio
![Page 9: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/9.jpg)
Basics of CSR: Indexing Overhead
MATRIX
A =
5 0 0 0
0 0 3 4
0 8 0 0
0 0 0 3
CSR
row_ptr: 0 1 3 4 5
col_ind: 0 2 3 1 3
values:  5 3 4 8 3
Requires multiple data-dependent memory accesses
Multiple instructions are needed to discover the position of a non-zero element
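A minimal Python sketch of SpMV over the CSR arrays of this slide, to make the chain of data-dependent accesses concrete. The function name `csr_spmv` and the input vector `x` are illustrative assumptions, not part of the slides.

```python
def csr_spmv(row_ptr, col_ind, values, x):
    """Multiply a CSR matrix by a dense vector x (illustrative sketch)."""
    y = [0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        # Load 1: row_ptr gives the range of non-zeros that belong to this row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            col = col_ind[k]              # Load 2: depends on k from row_ptr
            y[row] += values[k] * x[col]  # Load 3: x[col] depends on col_ind
    return y


# CSR arrays of the 4x4 matrix A from the slide; x is an assumed input vector.
row_ptr = [0, 1, 3, 4, 5]
col_ind = [0, 2, 3, 1, 3]
values = [5, 3, 4, 8, 3]
x = [1, 2, 3, 4]
print(csr_spmv(row_ptr, col_ind, values, x))  # prints [5, 25, 16, 12]
```

Each multiply needs `row_ptr`, then `col_ind`, then `x[col]`, and each of those loads can only be issued once the previous one has returned.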
![Page 10: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/10.jpg)
Indexing Overhead in Sparse Kernels
Sparse Matrix Vector Multiplication (SpMV)
• Indexing for every non-zero element of the sparse matrix to multiply it with the corresponding element of the vector
Sparse Matrix Matrix Multiplication (SpMM)
• Index matching for every inner product between the two sparse matrices
Indexing is expensive for major sparse matrix kernels
![Page 11: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/11.jpg)
Performing Indexing with Zero Cost
[Figure: speedup of zero-cost indexing relative to CSR: SpMatAdd 2.21, SpMV 2.13, SpMM 2.81]
Reducing the cost of indexing can accelerate sparse matrix operations
![Page 12: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/12.jpg)
Limitations of Existing Compression Formats
1. General formats optimize for storage
   → Expensive discovery of the positions of non-zero elements
2. Specialized formats assume specific matrix structures and patterns (e.g., diagonals)
   → Narrow applicability
![Page 13: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/13.jpg)
Our Goal
Design a sparse matrix compression mechanism that:
• Minimizes the indexing overheads
• Can be used across a wide range of sparse matrices and sparse matrix operations
• Enables high compression ratio
![Page 14: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/14.jpg)
Presentation Outline
1. Sparse Matrix Basics
2. Compression Formats & Shortcomings
3. SMASH
   • Software Compression Scheme
   • Hardware Acceleration Unit
   • Cross-Layer Interface
4. Evaluation
5. Conclusion
![Page 15: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/15.jpg)
SMASH: Key Idea
Hardware/Software cooperative mechanism:
• Enables highly-efficient sparse matrix compression and computation
• General across a diverse set of sparse matrices and sparse matrix operations
Software: efficient compression using a Hierarchy of Bitmaps
Hardware: unit that scans bitmaps to accelerate indexing
Interface: SMASH ISA
![Page 16: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/16.jpg)
![Page 17: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/17.jpg)
SMASH: Software Compression Scheme
Software: efficient compression using a Hierarchy of Bitmaps
• Encodes the positions of non-zero elements using bits to maintain low storage overhead
![Page 18: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/18.jpg)
SMASH: Software Compression Scheme
BITMAP: encodes if a block of the matrix contains any non-zero element
MATRIX
NZ    ZEROS ZEROS ZEROS
ZEROS NZ    ZEROS ZEROS
ZEROS ZEROS ZEROS ZEROS
ZEROS ZEROS NZ    NZ
BITMAP
1 0 0 0
0 1 0 0
0 0 0 0
0 0 1 1
Might contain a high number of zero bits
Idea: Apply the same encoding recursively to compress more effectively
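As a rough illustration of the bitmap encoding, the sketch below sets a bit to 1 when the block it covers holds any non-zero element. It assumes a block is `block` consecutive elements in row-major order and that `block` evenly divides the number of elements. The function name `encode_bitmap` and the sample matrix are illustrative and are not taken from the SMASH implementation.

```python
def encode_bitmap(matrix, block):
    """Return one bit per block: 1 if the block has any non-zero element."""
    flat = [v for row in matrix for v in row]  # row-major order (assumption)
    return [
        int(any(v != 0 for v in flat[i:i + block]))
        for i in range(0, len(flat), block)
    ]


# Sample 4x4 matrix (illustrative), encoded with blocks of 2 elements.
A = [
    [5, 0, 0, 0],
    [0, 0, 3, 4],
    [0, 8, 0, 0],
    [0, 0, 0, 3],
]
print(encode_bitmap(A, 2))  # prints [1, 0, 0, 1, 1, 0, 0, 1]
```

A larger `block` gives a shorter bitmap, at the price of storing more zeros inside the blocks that are marked non-zero.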
![Page 19: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/19.jpg)
Hierarchy of Bitmaps
MATRIX (NZ = Non-Zero Region, Z = Zero Region)
Bitmap 2: 1 1 (Compression Ratio 2:1)
Bitmap 1: 1 0 0 1 (Compression Ratio 4:1)
Bitmap 0: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
NON-ZERO VALUES ARRAY: NZ NZ NZ
• Storage Efficient
• Fast Indexing
• Hardware Friendly
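The recursive encoding can be sketched in a few lines of Python, using the bitmap values shown on this slide. Each upper-level bit is the OR of `ratio` bits of the level below. The `prune` step, which keeps only the groups whose parent bit is 1, reflects one reading of the Zero Region / Non-Zero Region labels and should be treated as an assumption. Both function names are hypothetical.

```python
def build_hierarchy(bitmap0, ratios):
    """Build upper bitmap levels; ratios[i] bits of level i map to 1 bit of level i+1."""
    levels = [list(bitmap0)]
    for ratio in ratios:
        below = levels[-1]
        levels.append(
            [int(any(below[i:i + ratio])) for i in range(0, len(below), ratio)]
        )
    return levels


def prune(levels, ratios):
    """Keep, below the top level, only groups whose parent bit is 1 (assumption)."""
    stored = []
    for lvl, ratio in enumerate(ratios):
        below, above = levels[lvl], levels[lvl + 1]
        kept = []
        for i, bit in enumerate(above):
            if bit:
                kept.extend(below[i * ratio:(i + 1) * ratio])
        stored.append(kept)
    stored.append(levels[-1])  # the top level is always stored in full
    return stored


bitmap0 = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
levels = build_hierarchy(bitmap0, [4, 2])
print(levels[1], levels[2])        # prints [1, 0, 0, 1] [1, 1]
print(prune(levels, [4, 2])[0])    # prints [1, 0, 0, 0, 0, 0, 1, 1]
```

With pruning, the 16-bit Bitmap 0 shrinks to the 8 bits that sit under non-zero regions of Bitmap 1.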
![Page 20: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/20.jpg)
![Page 21: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/21.jpg)
SMASH: Hardware Acceleration Unit
Hardware: unit that scans bitmaps to accelerate indexing
• Interprets the Bitmap Hierarchy used in software
• Buffers the Bitmap Hierarchy and scans it using bitwise operations
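A software analogue of scanning a bitmap with bitwise operations, offered only as a sketch: the slides do not show the actual hardware logic. The bitmap is packed into a Python integer (bit i stands for block i, an assumed layout) and the lowest set bit is isolated with `word & -word`. The names `pack` and `set_bit_positions` are hypothetical.

```python
def pack(bits):
    """Pack a list of 0/1 values into an int, with bit i holding bits[i]."""
    return sum(bit << i for i, bit in enumerate(bits))


def set_bit_positions(word):
    """Yield the positions of the 1-bits in word, lowest position first."""
    while word:
        lowest = word & -word          # isolate the lowest set bit
        yield lowest.bit_length() - 1  # convert it to a bit position
        word ^= lowest                 # clear that bit and keep scanning


bitmap0 = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
print(list(set_bit_positions(pack(bitmap0))))  # prints [0, 14, 15]
```

The loop runs once per set bit rather than once per bit, which is the property that makes bitmap scanning cheap on sparse inputs.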
![Page 22: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/22.jpg)
Bitmap Management Unit (BMU)
[Figure: block diagram of the BMU. Labels: CPU, SMASH ISA, BMU, Bitmap Buffers (SRAM Buffer 0, SRAM Buffer 1, SRAM Buffer 2), Group 2, Group 3, Matrix Parameters, Bitmap Parameters, Hardware Logic, Scan, Read, Update, Non-Zero Index]
![Page 23: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/23.jpg)
![Page 24: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/24.jpg)
SMASH: Cross-Layer Interface
Need for a cross-layer interface that enables software to control the BMU
SMASH ISA: MATINFO, BMAPINFO, RDBMAP, RDIND, PBMAP
• Communicate the parameters needed to calculate the index
• Query the BMU to retrieve the index of the next non-zero element
Enables SMASH to flexibly accelerate a diverse range of operations on any sparse matrix
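The two roles of the interface (pass in parameters, then query for the next index) can be mimicked with a toy software model. This is a sketch under assumptions: the class `ToyBMU` and its methods `set_params`, `load_bitmap`, and `next_index` are hypothetical names and do not reproduce the operands or semantics of the SMASH ISA instructions. It also assumes a single bitmap level whose bits each cover `block` consecutive row-major elements.

```python
class ToyBMU:
    """Toy software stand-in for a unit that returns non-zero block indices."""

    def __init__(self):
        self.cols = None
        self.block = None
        self.bits = []
        self.pos = 0

    def set_params(self, cols, block):
        # Role 1: software communicates what is needed to compute an index.
        self.cols, self.block = cols, block

    def load_bitmap(self, bits):
        # Buffer the bitmap and restart the scan from its first bit.
        self.bits, self.pos = list(bits), 0

    def next_index(self):
        # Role 2: return (row, col) of the next non-zero block, or None at the end.
        while self.pos < len(self.bits):
            i, self.pos = self.pos, self.pos + 1
            if self.bits[i]:
                return divmod(i * self.block, self.cols)
        return None


bmu = ToyBMU()
bmu.set_params(cols=4, block=2)
bmu.load_bitmap([1, 0, 0, 1, 1, 0, 0, 1])
index = bmu.next_index()
while index is not None:
    print(index)  # prints (0, 0), (1, 2), (2, 0), (3, 2) on successive lines
    index = bmu.next_index()
```

Keeping the scan state inside the unit lets the calling loop stay the same for any kernel that only needs "the next non-zero position".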
![Page 25: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/25.jpg)
![Page 26: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/26.jpg)
Methodology
Simulator: ZSim Simulator [1]
Workloads:
• Sparse Matrix Kernels: SpMV & SpMM from TACO [2]
• Graph Applications: PageRank & Betweenness Centrality from Ligra [3]
Input datasets: 15 diverse sparse matrices & 4 graphs from the SuiteSparse Matrix Collection [4]
[1] Sanchez et al. "ZSim: Fast and Accurate Microarchitectural Simulation of Thousand-Core Systems" ISCA'13
[2] Kjolstad et al. "taco: A Tool to Generate Tensor Algebra Kernels" ASE'17
[3] Shun et al. "Ligra: A Lightweight Graph Processing Framework for Shared Memory" PPoPP'13
[4] Davis et al. "The University of Florida Sparse Matrix Collection" TOMS'11
![Page 27: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/27.jpg)
Evaluated Sparse Matrices
![Page 28: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/28.jpg)
Performance Improvement Using SMASH
[Figure: speedup over TACO-CSR]
• SpMV: TACO-CSR 1.00, TACO-BCSR 1.04, SMASH 1.38
• SpMM: TACO-CSR 1.00, TACO-BCSR 1.10, SMASH 1.44
• PageRank: TACO-CSR 1.00, TACO-BCSR 1.19, SMASH 1.27
• BC: TACO-CSR 1.00, TACO-BCSR 1.26, SMASH 1.31
SMASH provides significant performance improvements over state-of-the-art formats
![Page 29: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/29.jpg)
Number of Executed Instructions
[Figure: number of executed instructions, normalized to TACO-CSR]
• SpMV: TACO-CSR 1.00, TACO-BCSR 0.89, SMASH 0.63
• SpMM: TACO-CSR 1.00, TACO-BCSR 0.81, SMASH 0.54
• PageRank: TACO-CSR 1.00, TACO-BCSR 0.82, SMASH 0.75
• BC: TACO-CSR 1.00, TACO-BCSR 0.78, SMASH 0.70
SMASH significantly reduces the number of executed instructions
![Page 30: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/30.jpg)
Sparsity Sweep for SpMV
[Figure: SpMV speedup of TACO-CSR, TACO-BCSR, and SMASH across matrices ordered by increasing sparsity]
SMASH provides speedups regardless of the sparsity of the matrix
![Page 31: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/31.jpg)
Hardware Area Overhead
SMASH configuration
• Support for 4 matrices in the BMU
• 256 bytes / SRAM buffer
• 140 bytes for registers & counters
0.076% area overhead over an Intel Xeon CPU
SMASH incurs negligible area overhead
![Page 32: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/32.jpg)
Other Results in the Paper
• Compression ratio sensitivity analysis
• Distribution of non-zero elements
• Detailed results for SpMM
• Overhead of converting from CSR to SMASH
• Software-only approaches executed on a real machine
![Page 33: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/33.jpg)
![Page 34: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/34.jpg)
Summary & Conclusion
• Many important workloads make heavy use of sparse matrices
  • Matrices are extremely sparse
  • Compression is essential to avoid storage/computational overheads
• Shortcomings of existing compression formats
  • Expensive discovery of the positions of non-zero elements, or
  • Narrow applicability
• SMASH: hardware/software cooperative mechanism for efficient sparse matrix storage and computation
  • Software: efficient compression scheme using a hierarchy of bitmaps
  • Hardware: hardware unit interprets the bitmap hierarchy and accelerates indexing
• Performance improvement: 38% and 44% for SpMV and SpMM over the widely used CSR format
• SMASH is highly efficient, low-cost and widely applicable
![Page 35: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/35.jpg)
![Page 36: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/36.jpg)
SMALL BITMAPS VS ZERO COMPUTATION
[Figure: two matrices, each shown with its Bitmap-0 and its non-zero values array (NZA) holding elements A and B; one NZA stores two 0's and the other six 0's. Axis: Compression Ratio]
![Page 37: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/37.jpg)
Sparse Matrix Matrix Multiplication with CSR
MATRIX A (row): values A B C, col_ind 0 1 2
MATRIX B (column): values D E F, row_ind 0 1 3
INDEX MATCHING FOR EVERY INNER PRODUCT
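Index matching for one inner product can be sketched as a two-pointer merge over the index arrays, here with `col_ind = 0 1 2` for the row of matrix A and `row_ind = 0 1 3` for the column of matrix B as on this slide. The sketch assumes both index arrays are sorted. The function name `sparse_dot` and the numeric values standing in for A to F are illustrative.

```python
def sparse_dot(idx_a, val_a, idx_b, val_b):
    """Inner product of two sparse vectors given as sorted (index, value) arrays."""
    i = j = 0
    total = 0
    while i < len(idx_a) and j < len(idx_b):
        if idx_a[i] == idx_b[j]:
            total += val_a[i] * val_b[j]  # indices match: multiply and advance both
            i += 1
            j += 1
        elif idx_a[i] < idx_b[j]:
            i += 1                        # no partner for this element of A
        else:
            j += 1                        # no partner for this element of B
    return total


col_ind = [0, 1, 2]   # positions of A, B, C in the row of matrix A
row_ind = [0, 1, 3]   # positions of D, E, F in the column of matrix B
a_vals = [2, 3, 4]    # stand-ins for A, B, C
b_vals = [5, 6, 7]    # stand-ins for D, E, F
print(sparse_dot(col_ind, a_vals, row_ind, b_vals))  # prints 28
```

Only indices 0 and 1 match, so the result is A*D + B*E. The comparisons and branches in the loop are pure indexing work that must be repeated for every inner product.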
![Page 38: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/38.jpg)
CSR-based SpMV
row_ptr: 0 1 3 4 5
col_ind: 0 2 3 1 3
VECTOR: 0 1 2 3 4 5
INDIRECT ACCESS TO LOAD THE VECTOR
![Page 39: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/39.jpg)
Use Case: SpMV
[Figure: the matrix's non-zero (NZ) blocks, the NZA, and the VECTOR; the BMU holds the matrix parameters, the bitmap parameters, and the bitmaps (0 1 0 0 1 0 / 0 1 0 1 0 0 / 0 1 0 1 0 0), and outputs the ROW INDEX and COLUMN INDEX]
• STREAM THROUGH THE NON-ZERO BLOCKS
• INDEX THE VECTOR USING THE BMU
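The two steps of this use case (stream through the non-zero blocks, index the vector from the bitmap position) can be sketched in software. The sketch assumes a single bitmap level, blocks of `block` consecutive row-major elements, a `block` that evenly divides the element count, and a values array that stores every marked block in full, zeros included. In SMASH the index computation is done by the BMU; here a plain `divmod` stands in for it. The names `compress` and `bitmap_spmv` are hypothetical.

```python
def compress(matrix, block):
    """Return (bitmap, nza): one bit per block plus the contents of non-zero blocks."""
    flat = [v for row in matrix for v in row]
    bitmap, nza = [], []
    for i in range(0, len(flat), block):
        chunk = flat[i:i + block]
        nonzero = any(v != 0 for v in chunk)
        bitmap.append(int(nonzero))
        if nonzero:
            nza.extend(chunk)  # the whole block is stored, zeros included
    return bitmap, nza


def bitmap_spmv(bitmap, nza, block, rows, cols, x):
    """Multiply the compressed matrix by the dense vector x."""
    y = [0] * rows
    k = 0  # number of non-zero blocks consumed from nza so far
    for i, bit in enumerate(bitmap):
        if not bit:
            continue  # zero block: nothing stored, nothing to compute
        for j in range(block):
            row, col = divmod(i * block + j, cols)  # index from bitmap position
            y[row] += nza[k * block + j] * x[col]
        k += 1
    return y


A = [
    [5, 0, 0, 0],
    [0, 0, 3, 4],
    [0, 8, 0, 0],
    [0, 0, 0, 3],
]
bitmap, nza = compress(A, 2)
print(bitmap_spmv(bitmap, nza, 2, 4, 4, [1, 2, 3, 4]))  # prints [5, 25, 16, 12]
```

No `col_ind` array is read: the position of each element follows from where its block sits in the bitmap, at the cost of a few multiplications by stored zeros.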
![Page 40: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/40.jpg)
Effect of Compression Ratio
![Page 41: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/41.jpg)
Storage Efficiency
[Figure: two regions, one where CSR compresses more efficiently and one where SMASH > CSR]
![Page 42: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/42.jpg)
Distribution of non-zero elements
[Figure: speedup (axis from 0.8 to 1.3) for MATRIX 13 as the distribution of non-zero elements goes from highly scattered to highly gathered]
![Page 43: SMASH - ETH Z...SMASH Co-Designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations Konstantinos Kanellopoulos, Nandita Vijaykumar, Christina](https://reader033.fdocuments.us/reader033/viewer/2022042206/5ea8ce23a50904224e1a61db/html5/thumbnails/43.jpg)
Storage Efficiency
[Figure: total compression ratio (log scale, 1 to 10000) of SMASH and CSR across matrices ordered by increasing density, plus GMEAN]
SMASH enables highly-efficient storage of sparse matrices