Search Results for author: Ahmet Inci

Found 6 papers, 0 papers with code

QUIDAM: A Framework for Quantization-Aware DNN Accelerator and Model Co-Exploration

no code implementations30 Jun 2022 Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Ting-Wu Chin, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

As the machine learning and systems communities strive to achieve higher energy efficiency through custom deep neural network (DNN) accelerators, varied precision or quantization levels, and model compression techniques, there is a need for design space exploration frameworks that incorporate quantization-aware processing elements into the accelerator design space while providing accurate and fast power, performance, and area models.

Model Compression, Quantization
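
The exploration loop that QUIDAM describes can be pictured as sweeping quantization-aware processing-element configurations against analytical power, performance, and area models. Below is a minimal Python sketch of that idea; the bitwidths, PE counts, and the `estimate` cost formulas are illustrative assumptions, not the framework's actual models.

```python
# Hypothetical sketch of quantization-aware design space exploration.
# All parameters and cost formulas are illustrative assumptions,
# not QUIDAM's actual models.
from itertools import product

BITWIDTHS = [4, 8, 16]       # candidate PE precisions (assumed)
PE_COUNTS = [64, 128, 256]   # candidate PE array sizes (assumed)

def estimate(bits, pes):
    """Toy analytical model: power and area grow with precision and PE
    count, latency shrinks with throughput. A real framework would fit
    these to characterized hardware."""
    power = pes * (bits / 8) ** 2 * 0.5   # mW, assumed quadratic in bits
    area = pes * (bits / 8) * 0.01        # mm^2, assumed linear in bits
    latency = 1e6 / (pes * bits)          # cycles, assumed throughput model
    return power, area, latency

# Enumerate the design space and report each point's estimated costs.
for bits, pes in product(BITWIDTHS, PE_COUNTS):
    p, a, l = estimate(bits, pes)
    print(f"{bits:>2}-bit x {pes:>3} PEs: "
          f"{p:7.1f} mW, {a:5.2f} mm^2, {l:8.1f} cycles")
```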

Efficient Deep Learning Using Non-Volatile Memory Technology

no code implementations27 Jun 2022 Ahmet Inci, Mehmet Meric Isgenc, Diana Marculescu

DeepNVM++ is demonstrated on STT-/SOT-MRAM technologies and can be used for the characterization, modeling, and analysis of any NVM technology for last-level caches in GPUs for DL applications.

QADAM: Quantization-Aware DNN Accelerator Modeling for Pareto-Optimality

no code implementations20 May 2022 Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

We also show that the proposed lightweight processing elements (LightPEs) consistently achieve Pareto-optimal results in terms of accuracy and hardware efficiency.

Quantization
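
The Pareto-optimality claim refers to design points that no other point beats in both accuracy and hardware efficiency at once. Below is a minimal sketch of such a Pareto-front filter; the `pareto_front` helper and the data points are invented for illustration.

```python
# Minimal sketch of Pareto-front extraction over (accuracy, efficiency)
# design points, in the spirit of QADAM's trade-off analysis.
# The data points below are made up for illustration.
def pareto_front(points):
    """Return the points not dominated by any other point: q dominates p
    if q is at least as good in both objectives (both maximized here)."""
    front = []
    for p in points:
        if not any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in points):
            front.append(p)
    return front

# (accuracy, hardware efficiency) pairs for five hypothetical designs.
designs = [(0.92, 1.0), (0.90, 1.8), (0.88, 2.5), (0.89, 1.5), (0.85, 2.4)]
print(pareto_front(designs))  # -> [(0.92, 1.0), (0.90, 1.8), (0.88, 2.5)]
```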

QAPPA: Quantization-Aware Power, Performance, and Area Modeling of DNN Accelerators

no code implementations17 May 2022 Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

As the machine learning and systems community strives to achieve higher energy efficiency through custom DNN accelerators and model compression techniques, there is a need for a design space exploration framework that incorporates quantization-aware processing elements into the accelerator design space while providing accurate and fast power, performance, and area models.

Model Compression, Quantization

The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems

no code implementations8 Dec 2020 Ahmet Inci, Evgeny Bolotin, Yaosheng Fu, Gal Dalal, Shie Mannor, David Nellans, Diana Marculescu

With deep reinforcement learning (RL) methods achieving results that exceed human capabilities in games, robotics, and simulated environments, continued scaling of RL training is crucial to its deployment in solving complex real-world problems.

Reinforcement Learning (RL)

DeepNVM++: Cross-Layer Modeling and Optimization Framework of Non-Volatile Memories for Deep Learning

no code implementations8 Dec 2020 Ahmet Inci, Mehmet Meric Isgenc, Diana Marculescu

Under iso-area assumptions, STT-MRAM and SOT-MRAM provide up to 2x and 2.3x EDP reduction and accommodate 2.3x and 3.3x cache capacity when compared to SRAM, respectively.
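
The reported figures use the energy-delay product (EDP = energy × delay), so a 2x reduction means the product of cache energy and access delay halves relative to the SRAM baseline. A worked sketch with invented numbers; only the metric itself comes from the paper.

```python
# Worked example of the energy-delay product (EDP) metric behind the
# reported 2x / 2.3x reductions. The SRAM and STT-MRAM operating points
# below are invented for illustration.
def edp(energy_nj, delay_ns):
    """EDP = energy * delay; lower is better."""
    return energy_nj * delay_ns

sram = edp(energy_nj=10.0, delay_ns=4.0)  # assumed baseline: 40 nJ*ns
stt = edp(energy_nj=5.0, delay_ns=4.0)    # assumed STT-MRAM point: 20 nJ*ns
print(f"EDP reduction vs. SRAM: {sram / stt:.1f}x")  # -> 2.0x
```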
