Search Results for author: Ahmet Inci

Found 6 papers, 0 papers with code

QUIDAM: A Framework for Quantization-Aware DNN Accelerator and Model Co-Exploration

no code implementations30 Jun 2022 Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Ting-Wu Chin, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

As the machine learning and systems communities strive to achieve higher energy efficiency through custom deep neural network (DNN) accelerators, varied precision or quantization levels, and model compression techniques, there is a need for design space exploration frameworks that incorporate quantization-aware processing elements into the accelerator design space while providing accurate and fast power, performance, and area models.

Model Compression, Quantization
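
The exploration loop that QUIDAM describes can be pictured as sweeping quantization-aware processing-element configurations against analytical power, performance, and area models. Below is a minimal Python sketch of that idea; the bitwidths, PE counts, and the `estimate` cost formulas are illustrative assumptions, not the framework's actual models.

```python
# Hypothetical sketch of quantization-aware design space exploration.
# All parameters and cost formulas are illustrative assumptions,
# not QUIDAM's actual models.
from itertools import product

BITWIDTHS = [4, 8, 16]       # candidate PE precisions (assumed)
PE_COUNTS = [64, 128, 256]   # candidate PE array sizes (assumed)

def estimate(bits, pes):
    """Toy analytical model: power and area grow with precision and PE
    count, latency shrinks with throughput. A real framework would fit
    these to characterized hardware."""
    power = pes * (bits / 8) ** 2 * 0.5   # mW, assumed quadratic in bits
    area = pes * (bits / 8) * 0.01        # mm^2, assumed linear in bits
    latency = 1e6 / (pes * bits)          # cycles, assumed throughput model
    return power, area, latency

# Enumerate the design space and report each point's estimated costs.
for bits, pes in product(BITWIDTHS, PE_COUNTS):
    p, a, l = estimate(bits, pes)
    print(f"{bits:>2}-bit x {pes:>3} PEs: "
          f"{p:7.1f} mW, {a:5.2f} mm^2, {l:8.1f} cycles")
```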

Efficient Deep Learning Using Non-Volatile Memory Technology

no code implementations27 Jun 2022 Ahmet Inci, Mehmet Meric Isgenc, Diana Marculescu

DeepNVM++ is demonstrated on STT-/SOT-MRAM technologies and can be used for the characterization, modeling, and analysis of any NVM technology for last-level caches in GPUs for DL applications.

QADAM: Quantization-Aware DNN Accelerator Modeling for Pareto-Optimality

no code implementations20 May 2022 Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

We also show that the proposed lightweight processing elements (LightPEs) consistently achieve Pareto-optimal results in terms of accuracy and hardware efficiency.

Quantization
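
The Pareto-optimality claim refers to design points that no other point beats in both accuracy and hardware efficiency at once. Below is a minimal sketch of such a Pareto-front filter; the `pareto_front` helper and the data points are invented for illustration.

```python
# Minimal sketch of Pareto-front extraction over (accuracy, efficiency)
# design points, in the spirit of QADAM's trade-off analysis.
# The data points below are made up for illustration.
def pareto_front(points):
    """Return the points not dominated by any other point: q dominates p
    if q is at least as good in both objectives (both maximized here)."""
    front = []
    for p in points:
        if not any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in points):
            front.append(p)
    return front

# (accuracy, hardware efficiency) pairs for five hypothetical designs.
designs = [(0.92, 1.0), (0.90, 1.8), (0.88, 2.5), (0.89, 1.5), (0.85, 2.4)]
print(pareto_front(designs))  # -> [(0.92, 1.0), (0.90, 1.8), (0.88, 2.5)]
```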

QAPPA: Quantization-Aware Power, Performance, and Area Modeling of DNN Accelerators

no code implementations17 May 2022 Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

As the machine learning and systems community strives to achieve higher energy efficiency through custom DNN accelerators and model compression techniques, there is a need for a design space exploration framework that incorporates quantization-aware processing elements into the accelerator design space while providing accurate and fast power, performance, and area models.

Model Compression, Quantization

The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems

no code implementations8 Dec 2020 Ahmet Inci, Evgeny Bolotin, Yaosheng Fu, Gal Dalal, Shie Mannor, David Nellans, Diana Marculescu

With deep reinforcement learning (RL) methods achieving results that exceed human capabilities in games, robotics, and simulated environments, continued scaling of RL training is crucial to its deployment in solving complex real-world problems.

Reinforcement Learning (RL)

DeepNVM++: Cross-Layer Modeling and Optimization Framework of Non-Volatile Memories for Deep Learning

no code implementations8 Dec 2020 Ahmet Inci, Mehmet Meric Isgenc, Diana Marculescu

Under iso-area assumptions, STT-MRAM and SOT-MRAM provide up to 2x and 2.3x EDP reduction and accommodate 2.3x and 3.3x cache capacity when compared to SRAM, respectively.
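
The reported figures use the energy-delay product (EDP = energy × delay), so a 2x reduction means the product of cache energy and access delay halves relative to the SRAM baseline. A worked sketch with invented numbers; only the metric itself comes from the paper.

```python
# Worked example of the energy-delay product (EDP) metric behind the
# reported 2x / 2.3x reductions. The SRAM and STT-MRAM operating points
# below are invented for illustration.
def edp(energy_nj, delay_ns):
    """EDP = energy * delay; lower is better."""
    return energy_nj * delay_ns

sram = edp(energy_nj=10.0, delay_ns=4.0)  # assumed baseline: 40 nJ*ns
stt = edp(energy_nj=5.0, delay_ns=4.0)    # assumed STT-MRAM point: 20 nJ*ns
print(f"EDP reduction vs. SRAM: {sram / stt:.1f}x")  # -> 2.0x
```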
