Search Results for author: Mohammad Sadrosadati

Found 7 papers, 4 papers with code

ORIGAMI: A Heterogeneous Split Architecture for In-Memory Acceleration of Learning

no code implementations • 30 Dec 2018 • Hajar Falahati, Pejman Lotfi-Kamran, Mohammad Sadrosadati, Hamid Sarbazi-Azad

To utilize available bandwidth without violating the area and power budgets of the logic layer, ORIGAMI comes with a computation-splitting compiler that divides an ML algorithm between in-memory accelerators and an out-of-the-memory platform in a balanced way and with minimum inter-communications.

Hermes: Accelerating Long-Latency Load Requests via Perceptron-Based Off-Chip Load Prediction

1 code implementation • 1 Sep 2022 • Rahul Bera, Konstantinos Kanellopoulos, Shankar Balachandran, David Novo, Ataberk Olgun, Mohammad Sadrosadati, Onur Mutlu

To this end, we propose a new technique called Hermes, whose key idea is to: 1) accurately predict which load requests might go off-chip, and 2) speculatively fetch the data required by the predicted off-chip loads directly from the main memory, while also concurrently accessing the cache hierarchy for such loads.

TargetCall: Eliminating the Wasted Computation in Basecalling via Pre-Basecalling Filtering

1 code implementation • 9 Dec 2022 • Meryem Banu Cavlak, Gagandeep Singh, Mohammed Alser, Can Firtina, Joël Lindegger, Mohammad Sadrosadati, Nika Mansouri Ghiasi, Can Alkan, Onur Mutlu

However, for many applications, the majority of reads do not match the reference genome of interest (i.e., target reference) and thus are discarded in later steps in the genomics pipeline, wasting the basecalling computation.

TransPimLib: A Library for Efficient Transcendental Functions on Processing-in-Memory Systems

1 code implementation • 3 Apr 2023 • Maurus Item, Juan Gómez-Luna, Yuxin Guo, Geraldo F. Oliveira, Mohammad Sadrosadati, Onur Mutlu

In order to provide support for transcendental (and other hard-to-calculate) functions in general-purpose PIM systems, we present TransPimLib, a library that provides CORDIC-based and LUT-based methods for trigonometric functions, hyperbolic functions, exponentiation, logarithm, square root, etc.

RawAlign: Accurate, Fast, and Scalable Raw Nanopore Signal Mapping via Combining Seeding and Alignment

1 code implementation • 8 Oct 2023 • Joël Lindegger, Can Firtina, Nika Mansouri Ghiasi, Mohammad Sadrosadati, Mohammed Alser, Onur Mutlu

... while improving accuracy by 1.35× (1.34×) in terms of F-1 score on average (geo. mean).

Accelerating Graph Neural Networks on Real Processing-In-Memory Systems

no code implementations • 26 Feb 2024 • Christina Giannoula, Peiming Yang, Ivan Fernandez Vega, Jiacheng Yang, Yu Xin Li, Juan Gomez Luna, Mohammad Sadrosadati, Onur Mutlu, Gennady Pekhimenko

Graph Neural Network (GNN) execution involves both compute-intensive and memory-intensive kernels; the latter dominate the total time and are significantly bottlenecked by data movement between memory and processors.

Analysis of Distributed Optimization Algorithms on a Real Processing-In-Memory System

no code implementations • 10 Apr 2024 • Steve Rhyner, Haocong Luo, Juan Gómez-Luna, Mohammad Sadrosadati, Jiawei Jiang, Ataberk Olgun, Harshita Gupta, Ce Zhang, Onur Mutlu

Processor-centric architectures (e.g., CPU, GPU) commonly used for modern ML training workloads are limited by the data movement bottleneck, i.e., due to repeatedly accessing the training dataset.

Distributed Optimization
