QCon SF 2013: Top 10 Performance Gotchas for Scaling In-Memory Algorithms – SriSatish
Avoiding Communication in Linear Algebra – Jim Demmel, UC Berkeley (bebop.cs.berkeley.edu)
Automatic Performance Tuning of Sparse Matrix Kernels: Observations and Experience – Richard Vuduc
CS 267 Applications of Parallel Computers, Lecture 19: Dense Linear Algebra I – James Demmel, Spring 1999 (demmel/cs267_Spr99)
CS 267 Sparse Matrices: Sparse Matrix-Vector Multiply for Iterative Solvers – Kathy Yelick, Lecture 16, 03/09/2007 (yelick/cs267_sp07)
CSE 260 – Introduction to Parallel Computation, Topic 8: Benchmarks & Applications – October 25, 2001
CS 267 Dense Linear Algebra: Possible Class Projects – James Demmel, Lecture 12a, 03/04/2009 (demmel/cs267_Spr09)
Research in Parallel Routines Optimization – Domingo Giménez (Dpto. de Informática y Sistemas) and Javier Cuenca, Universidad de Murcia, 16 December 2005
The Price of Cache-obliviousness – Keshav Pingali (University of Texas, Austin), Kamen Yotov (Goldman Sachs), Tom Roeder (Cornell University), John Gunnels (IBM)