no code implementations • 24 May 2023 • Zirui Liu, Guanchu Wang, Shaochen Zhong, Zhaozhuo Xu, Daochen Zha, Ruixiang Tang, Zhimeng Jiang, Kaixiong Zhou, Vipin Chaudhary, Shuai Xu, Xia Hu
While model parameters do contribute to memory usage, the primary memory bottleneck during training is storing the feature maps (also known as activations) that must be kept alive for gradient computation in the backward pass.
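The effect is easy to observe directly. Below is a minimal sketch (assuming PyTorch with a CUDA device; the toy MLP and tensor sizes are illustrative, not the paper's setup) comparing parameter memory against peak training memory, which is dominated by the cached activations:

```python
import torch
import torch.nn as nn

device = "cuda"
# A toy MLP; real workloads (CNNs, Transformers) show the same effect.
model = nn.Sequential(
    *[layer for _ in range(8) for layer in (nn.Linear(4096, 4096), nn.ReLU())]
).to(device)

param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
print(f"parameter memory: {param_bytes / 2**20:.1f} MiB")

x = torch.randn(1024, 4096, device=device)  # large batch -> large activations
torch.cuda.reset_peak_memory_stats()
loss = model(x).sum()
loss.backward()
# Peak usage far exceeds parameter memory because every intermediate
# feature map is kept until backward() consumes it for gradient calculation.
print(f"peak training memory: {torch.cuda.max_memory_allocated() / 2**20:.1f} MiB")
```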
no code implementations • 24 May 2023 • Zirui Liu, Zhimeng Jiang, Shaochen Zhong, Kaixiong Zhou, Li Li, Rui Chen, Soo-Hyun Choi, Xia Hu
However, model editing for graph neural networks (GNNs) is rarely explored, despite GNNs' widespread applicability.
10 code implementations • 17 Mar 2023 • Daochen Zha, Zaid Pervaiz Bhat, Kwei-Herng Lai, Fan Yang, Zhimeng Jiang, Shaochen Zhong, Xia Hu
Artificial Intelligence (AI) is making a profound impact in almost every domain.
1 code implementation • ICLR 2022 • Shaochen Zhong, Guanqun Zhang, Ningjia Huang, Shuai Xu
In this paper, we revisit the idea of kernel pruning (pruning only one or several $k \times k$ kernels out of a 3D filter), an approach heavily overlooked in the context of structured pruning because it naturally introduces sparsity among filters within the same convolutional layer, thus making the remaining network no longer dense.
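To make the granularity concrete, here is a minimal sketch of generic magnitude-based kernel pruning via masking; the `prune_kernels` helper and the L1 criterion are illustrative assumptions, not the paper's method. Each (out-channel, in-channel) pair of a `Conv2d` weight indexes one $k \times k$ kernel inside a 3D filter, and zeroing some of them leaves the surviving filters irregularly sparse:

```python
import torch
import torch.nn as nn

def prune_kernels(conv: nn.Conv2d, ratio: float = 0.5) -> torch.Tensor:
    """Zero out the k x k kernels with the smallest L1 norms.

    conv.weight has shape (out_channels, in_channels, k, k); each
    (out, in) pair indexes one 2D kernel inside a 3D filter.
    """
    w = conv.weight.data
    norms = w.abs().sum(dim=(2, 3))              # (out, in) kernel L1 norms
    k = int(norms.numel() * ratio)
    thresh = norms.flatten().kthvalue(k).values  # threshold at the k-th smallest
    mask = (norms > thresh).float()              # (out, in) keep-mask
    w *= mask[:, :, None, None]                  # sparsify the filters in place
    return mask

conv = nn.Conv2d(16, 32, kernel_size=3, bias=False)
mask = prune_kernels(conv, ratio=0.5)
print(f"kept {int(mask.sum())} of {mask.numel()} kernels")
```

Because whole kernels (rather than whole filters) are removed, the pruned layer keeps its output channel count but is no longer dense, which is exactly why this granularity is awkward for standard structured-pruning pipelines.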