no code implementations • 21 Apr 2024 • Wei Niu, Md Musfiqur Rahman Sanim, Zhihao Shu, Jiexiong Guan, Xipeng Shen, Miao Yin, Gagan Agrawal, Bin Ren
Focusing on emerging transformers (specifically the ones with computationally efficient Swin-like architectures) and large models (e.g., Stable Diffusion and LLMs) based on transformers, we observe that layout transformations between the computational operators cause a significant slowdown in these applications.
no code implementations • 29 Feb 2024 • Wei Niu, Gagan Agrawal, Bin Ren
Though many compilation and runtime systems have been developed for DNNs in recent years, the focus has largely been on static DNNs.
no code implementations • 31 Dec 2021 • Xiang Li, Dong Li, Ruoming Jin, Gagan Agrawal, Rajiv Ramnath
Though other methods (particularly those based on Laplacian Smoothing) have reported better accuracy, a fundamental limitation of all the work is a lack of scalability.
no code implementations • 30 Aug 2021 • Wei Niu, Jiexiong Guan, Yanzhi Wang, Gagan Agrawal, Bin Ren
Deep Neural Networks (DNNs) have emerged as the core enabler of many major applications on mobile devices.
no code implementations • 13 Jul 2020 • Peng Jiang, Gagan Agrawal
Compared with full-communication SGD, our ADPSGD achieves 1.14x to 1.27x speedups with a 100Gbps connection among computing nodes, and the speedups increase to 1.46x to 1.95x with a 10Gbps connection.
no code implementations • 28 Oct 2019 • Renhao Cui, Gagan Agrawal, Rajiv Ramnath
Businesses communicate using Twitter for a variety of reasons -- to raise awareness of their brands, to market new products, to respond to community comments, and to connect with their customers and potential customers in a targeted manner.
no code implementations • 10 Jul 2019 • Renhao Cui, Gagan Agrawal, Rajiv Ramnath
This paper presents techniques to detect the "offline" activity a person is engaged in when she is tweeting (such as dining, shopping or entertainment), in order to create a dynamic profile of the user, for uses such as better targeting of advertisements.
no code implementations • NeurIPS 2018 • Peng Jiang, Gagan Agrawal
The large communication overhead has imposed a bottleneck on the performance of distributed Stochastic Gradient Descent (SGD) for training deep neural networks.