1 code implementation • 2 Mar 2024 • Tianyi Zhang, Jonah Wonkyu Yi, Bowen Yao, Zhaozhuo Xu, Anshumali Shrivastava
Large language model inference on Central Processing Units (CPUs) is challenging due to the vast quantities of expensive Multiply-Add (MAD) matrix operations in the attention computation.
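For illustration, here is a minimal NumPy sketch of standard scaled dot-product attention (shapes and sizes are illustrative, not taken from the paper) showing where the MAD-heavy matrix products arise:

# Hedged sketch: plain scaled dot-product attention for one head, to show that
# the QK^T and weights@V products dominate the multiply-add (MAD) count.
import numpy as np

def attention(Q, K, V):
    # Q, K, V: (n, d)
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                     # ~n*n*d multiply-adds
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V                                # another ~n*n*d multiply-adds

n, d = 2048, 128  # illustrative sequence length and head dimension
Q, K, V = (np.random.randn(n, d) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # (2048, 128); roughly 2*n*n*d ~ 1.1e9 MADs for a single head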
1 code implementation • 5 Feb 2024 • Zirui Liu, Jiayi Yuan, Hongye Jin, Shaochen Zhong, Zhaozhuo Xu, Vladimir Braverman, Beidi Chen, Xia Hu
The memory demand of the key-value (KV) cache increases with larger batch sizes and longer context lengths.
no code implementations • 5 Feb 2024 • Shanshan Han, Qifan Zhang, Yuhang Yao, Weizhao Jin, Zhaozhuo Xu, Chaoyang He
This paper surveys existing work on multi-agent systems and identifies challenges that remain inadequately addressed.
no code implementations • 23 Dec 2023 • Guanchu Wang, Yu-Neng Chuang, Fan Yang, Mengnan Du, Chia-Yuan Chang, Shaochen Zhong, Zirui Liu, Zhaozhuo Xu, Kaixiong Zhou, Xuanting Cai, Xia Hu
To address this problem, we develop a pre-trained, DNN-based, generic explainer on large-scale image datasets, and leverage its transferability to explain various vision models for downstream tasks.
no code implementations • 23 Sep 2023 • Zhuang Wang, Zhaozhuo Xu, Anshumali Shrivastava, T. S. Eugene Ng
We then systematically explore the design space of communication schemes for sparse tensors and find the optimal one.
1 code implementation • NeurIPS 2023 • Zirui Liu, Guanchu Wang, Shaochen Zhong, Zhaozhuo Xu, Daochen Zha, Ruixiang Tang, Zhimeng Jiang, Kaixiong Zhou, Vipin Chaudhary, Shuai Xu, Xia Hu
While the model parameters do contribute to memory usage, the primary memory bottleneck during training arises from storing feature maps, also known as activations, as they are crucial for gradient calculation.
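A rough back-of-the-envelope sketch of this point (all sizes and the per-block activation multiplier are assumptions for illustration, not the paper's numbers): parameter memory is fixed, while activation memory grows with batch size and sequence length because layer inputs must be kept for the backward pass.

# Hedged estimate: parameters vs. cached activations for a transformer-style model.
batch, seq_len, hidden, layers = 16, 2048, 4096, 32
bytes_per_value = 2  # fp16

param_count = layers * 12 * hidden ** 2               # rough per-block parameter count
act_count = layers * batch * seq_len * hidden * 10    # assume ~10 cached tensors per block

print(f"parameters : {param_count * bytes_per_value / 1e9:.1f} GB")
print(f"activations: {act_count * bytes_per_value / 1e9:.1f} GB")  # scales with batch and context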
no code implementations • 17 May 2023 • Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava
Thus, optimizing this accuracy-efficiency trade-off is crucial for LLM deployment on commodity hardware.
no code implementations • 10 Mar 2023 • Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu
The current theoretical literature focuses on greedy search over the exact near neighbor graph, while practitioners use approximate near neighbor graphs (ANN-Graphs) to reduce preprocessing time.
no code implementations • 21 Dec 2022 • Lianke Qin, Aravind Reddy, Zhao Song, Zhaozhuo Xu, Danyang Zhuo
In this paper, we propose Adam-Hash: an adaptive and dynamic multi-resolution hashing data structure for fast pairwise summation estimation.
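For context, a minimal sketch of the pairwise summation problem that hashing-based estimators approximate (the kernel and sizes below are illustrative assumptions, not the paper's setup): for a query q, sum a pairwise function f(q, x) over every point x in the dataset, which costs O(n) per query when done naively.

# Hedged sketch: naive O(n) pairwise summation that a hashing-based estimator would approximate.
import numpy as np

def pairwise_sum(q, X, f=lambda q, x: np.exp(-np.linalg.norm(q - x) ** 2)):
    # Sum the kernel value between the query and every dataset point.
    return sum(f(q, x) for x in X)

X = np.random.randn(1000, 8)
q = np.random.randn(8)
print(pairwise_sum(q, X))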
no code implementations • 8 Aug 2022 • Jiehao Liang, Zhao Song, Zhaozhuo Xu, Junze Yin, Danyang Zhuo
In this work, we focus on the dynamic maintenance of KDE data structures with robustness to adversarial queries.
no code implementations • 5 Aug 2022 • Hang Hu, Zhao Song, Runzhou Tao, Zhaozhuo Xu, Junze Yin, Danyang Zhuo
Online bipartite matching is a fundamental problem in online algorithms.
no code implementations • 22 Jun 2022 • Zhaozhuo Xu, Weijie Zhao, Shulong Tan, Zhixin Zhou, Ping Li
Given a vertex deletion request, we thoroughly investigate solutions to update the connections of the vertex.
no code implementations • NeurIPS 2021 • Zhaozhuo Xu, Beidi Chen, Chaojian Li, Weiyang Liu, Le Song, Yingyan Lin, Anshumali Shrivastava
However, as one of the most influential and practical MT paradigms, iterative machine teaching (IMT) is impractical on IoT devices due to its inefficient and unscalable algorithms.
no code implementations • NeurIPS 2021 • Aditya Desai, Zhaozhuo Xu, Menal Gupta, Anu Chandran, Antoine Vial-Aussavy, Anshumali Shrivastava
This paradigm breaks SI into local inversion tasks, each of which predicts a small chunk of subsurface properties from the surrounding seismic data.
no code implementations • NeurIPS 2021 • Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu
In this work, we focus on improving the per-iteration cost of CGM.
no code implementations • 15 Jun 2021 • Zhaozhuo Xu, Minghao Yan, Junyan Zhang, Anshumali Shrivastava
The dot-product self-attention in Transformers allows us to model interactions between words.
no code implementations • 18 May 2021 • Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu
We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions.
no code implementations • 26 Feb 2021 • Zhaozhuo Xu, Aditya Desai, Menal Gupta, Anu Chandran, Antoine Vial-Aussavy, Anshumali Shrivastava
We propose a fundamental shift away from convolutions and introduce SESDI, a Set Embedding-based SDI approach.
no code implementations • ICLR 2021 • Beidi Chen, Zichang Liu, Binghui Peng, Zhaozhuo Xu, Jonathan Lingjie Li, Tri Dao, Zhao Song, Anshumali Shrivastava, Christopher Re
Recent advances by practitioners in the deep learning community have breathed new life into Locality Sensitive Hashing (LSH), using it to reduce memory and time bottlenecks in neural network (NN) training.
no code implementations • 2 Jul 2020 • Zichang Liu, Zhaozhuo Xu, Alan Ji, Jonathan Li, Beidi Chen, Anshumali Shrivastava
Efficient inference for wide output layers (WOLs) is an essential yet challenging task in large scale machine learning.
no code implementations • NeurIPS 2019 • Zhixin Zhou, Shulong Tan, Zhaozhuo Xu, Ping Li
We present a fast search-on-graph algorithm for Maximum Inner Product Search (MIPS).
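For reference, a brute-force formulation of MIPS (this is the baseline the graph-based search avoids, not the paper's method; sizes are illustrative): return the database vector with the largest inner product with the query.

# Hedged reference implementation: exhaustive MIPS over the whole database.
import numpy as np

def mips_bruteforce(query, database):
    # database: (n, d), query: (d,)
    scores = database @ query            # inner product with every vector
    return int(np.argmax(scores)), float(scores.max())

db = np.random.randn(10000, 64)
q = np.random.randn(64)
idx, score = mips_bruteforce(q, db)
print(idx, score)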
no code implementations • IJCNLP 2019 • Shulong Tan, Zhixin Zhou, Zhaozhuo Xu, Ping Li
Retrieval of relevant vectors produced by representation learning critically influences the efficiency of natural language processing (NLP) tasks.
no code implementations • 27 Sep 2018 • Shulong Tan, Zhixin Zhou, Zhaozhuo Xu, Ping Li
Because Approximate Nearest Neighbor Search (ANNS) techniques are designed for metric distances, efficient search under more advanced similarity measures remains an open question.
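A small worked example of why inner-product similarity falls outside the metric assumptions that standard ANNS indexes rely on (the vectors are illustrative): under inner product, a point need not be its own nearest neighbor, which breaks the identity axiom of a metric.

# Hedged illustration: inner-product "distance" violates metric axioms.
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([2.0, 0.0])
print(x @ x)   # 1.0
print(x @ y)   # 2.0 -> y is "closer" to x than x is to itself under inner product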