News

DBCSR is a library designed to efficiently perform sparse matrix-matrix multiplication, among other operations. It is MPI and OpenMP parallel and can exploit Nvidia and AMD GPUs via CUDA and HIP. To ...
We call such families of moduli low-weight polynomial form integers (LWPFIs). We show an efficient modular multiplication method using LWPFI moduli. LWPFIs allow general implementation and there exist ...
We present here the miss rate comparison of cache oblivious matrix multiplication using the sequential access recursive technique and normal multiplication program. Varying the cache size the ...