Search Results for author: Aydın Buluç

Found 4 papers, 3 papers with code

Distributed-Memory Sparse Kernels for Machine Learning

1 code implementation • 15 Mar 2022 • Vivek Bharadwaj, Aydın Buluç, James Demmel

We also give two communication-eliding strategies to further reduce costs for FusedMM kernels: either reusing the replication of an input dense matrix for the SDDMM and SpMM in sequence, or fusing the local SDDMM and SpMM kernels.

BIG-bench Machine Learning Collaborative Filtering +1
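
To make the second strategy in the snippet above concrete, here is a minimal single-process sketch of what fusing a local SDDMM with the SpMM that follows it can look like. It is not the paper's distributed implementation: the COO layout (`rows`, `cols`, `vals`), the function names `sddmm_then_spmm` and `fused_mm`, and the choice of computing `(S * (A @ B.T)) @ B` are all assumptions made for illustration.

```python
import numpy as np

def sddmm_then_spmm(rows, cols, vals, A, B):
    """Unfused reference: materialize the SDDMM result, then run SpMM."""
    # SDDMM: one sampled value per nonzero (i, j) of S, s_ij * <A[i], B[j]>.
    sampled = vals * np.einsum("ij,ij->i", A[rows], B[cols])
    # SpMM: scatter-add each sampled value times row B[j] into output row i.
    out = np.zeros_like(A)
    np.add.at(out, rows, sampled[:, None] * B[cols])
    return out

def fused_mm(rows, cols, vals, A, B):
    """Fused sketch: consume each sampled value immediately, never storing it."""
    out = np.zeros_like(A)
    for i, j, s in zip(rows, cols, vals):
        w = s * np.dot(A[i], B[j])   # SDDMM contribution for nonzero (i, j)
        out[i] += w * B[j]           # SpMM contribution, applied right away
    return out
```

Both functions return the same dense matrix; the fused version simply avoids holding the intermediate sparse result, which is the idea the snippet describes at the level of local kernels.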

LOGAN: High-Performance GPU-Based X-Drop Long-Read Alignment

1 code implementation • 12 Feb 2020 • Alberto Zeni, Giulia Guidi, Marquita Ellis, Nan Ding, Marco D. Santambrogio, Steven Hofmeyr, Aydın Buluç, Leonid Oliker, Katherine Yelick

To highlight the impact of our work on a real-world application, we couple LOGAN with a many-to-many long-read alignment software called BELLA, and demonstrate that our implementation improves the overall BELLA runtime by up to 10.6x.

Vocal Bursts Intensity Prediction

A High-Throughput Solver for Marginalized Graph Kernels on GPU

no code implementations • 14 Oct 2019 • Yu-Hang Tang, Oguz Selvitopi, Doru Popovici, Aydın Buluç

To cope with the gap between the instruction throughput and the memory bandwidth of current generation GPUs, our solver forms the tensor product linear system on-the-fly without storing it in memory when performing matrix-vector dot product operations in PCG.

Vocal Bursts Intensity Prediction
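
The idea in the snippet above, applying a tensor product operator inside PCG without ever storing it, can be sketched in a few lines. This is a generic CPU illustration, not the paper's GPU solver or its actual graph-kernel system: the operator `I + A kron B`, the Jacobi preconditioner, and the names `kron_matvec` and `pcg` are assumptions chosen to keep the example small.

```python
import numpy as np

def kron_matvec(A, B, x):
    """Return (A kron B) @ x without forming the Kronecker product."""
    # With row-major flattening, (A kron B) vec(X) equals vec(A X B^T).
    X = x.reshape(A.shape[1], B.shape[1])
    return (A @ X @ B.T).ravel()

def pcg(matvec, b, diag, tol=1e-10, max_iter=200):
    """Jacobi-preconditioned conjugate gradient for an SPD operator."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    z = r / diag                     # apply the diagonal preconditioner
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = matvec(p)               # the only place the operator is touched
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = r / diag
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

For symmetric positive definite `A` and `B`, the system `(I + A kron B) x = b` can be solved by passing `lambda v: v + kron_matvec(A, B, v)` as `matvec` and `1 + np.outer(np.diag(A), np.diag(B)).ravel()` as `diag`. Only the two small factors are kept in memory, never the full product matrix.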

High-performance sparse matrix-matrix products on Intel KNL and multicore architectures

1 code implementation • 5 Apr 2018 • Yusuke Nagasaka, Satoshi Matsuoka, Ariful Azad, Aydın Buluç

Our hash-table and heap-based algorithms show significant speedups over libraries in the majority of cases, while different algorithms dominate the other scenarios, depending on matrix size, sparsity, compression factor, and operation type.

Distributed, Parallel, and Cluster Computing
