Search Results for author: Zhaozhuo Xu

Found 23 papers, 3 papers with code

NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention

1 code implementation • 2 Mar 2024 • Tianyi Zhang, Jonah Wonkyu Yi, Bowen Yao, Zhaozhuo Xu, Anshumali Shrivastava

Large language model inference on Central Processing Units (CPUs) is challenging due to the vast quantities of expensive Multiply-Add (MAD) matrix operations in the attention computations.

16k · Language Modelling +1
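
The idea of trading multiply-adds for lookups can be illustrated with product quantization: keys are pre-quantized against small codebooks, and each query builds per-sub-space lookup tables so that query-key dot products reduce to table lookups and additions. The sketch below is illustrative only (all names are invented here, and the codebooks are random rather than learned); it omits the paper's central contribution of fitting these lookups into SIMD registers:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_sub, n_codes = 64, 8, 16        # dim, sub-quantizers, codes per sub-quantizer
sub_d = d // n_sub

# Codebooks: per sub-space, n_codes centroids (learned offline in practice).
codebooks = rng.normal(size=(n_sub, n_codes, sub_d))

def encode(key):
    """Quantize a key to one code id per sub-space (nearest centroid)."""
    parts = key.reshape(n_sub, sub_d)
    return np.array([np.argmin(((codebooks[s] - parts[s]) ** 2).sum(-1))
                     for s in range(n_sub)])

def build_luts(query):
    """Per sub-space lookup tables of <query_part, centroid> dot products."""
    parts = query.reshape(n_sub, sub_d)
    return np.einsum('sd,skd->sk', parts, codebooks)   # (n_sub, n_codes)

def lut_dot(luts, codes):
    """Approximate query·key using only table lookups and additions."""
    return sum(luts[s, codes[s]] for s in range(n_sub))

key, query = rng.normal(size=d), rng.normal(size=d)
print(query @ key, lut_dot(build_luts(query), encode(key)))  # exact vs. estimate
```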

LLM Multi-Agent Systems: Challenges and Open Problems

no code implementations • 5 Feb 2024 • Shanshan Han, Qifan Zhang, Yuhang Yao, Weizhao Jin, Zhaozhuo Xu, Chaoyang He

This paper explores existing work on multi-agent systems and identifies challenges that remain inadequately addressed.

Management

LETA: Learning Transferable Attribution for Generic Vision Explainer

no code implementations • 23 Dec 2023 • Guanchu Wang, Yu-Neng Chuang, Fan Yang, Mengnan Du, Chia-Yuan Chang, Shaochen Zhong, Zirui Liu, Zhaozhuo Xu, Kaixiong Zhou, Xuanting Cai, Xia Hu

To address this problem, we develop a pre-trained, DNN-based, generic explainer on large-scale image datasets, and leverage its transferability to explain various vision models for downstream tasks.

Zen: Near-Optimal Sparse Tensor Synchronization for Distributed DNN Training

no code implementations • 23 Sep 2023 • Zhuang Wang, Zhaozhuo Xu, Anshumali Shrivastava, T. S. Eugene Ng

We then systematically explore the design space of communication schemes for sparse tensors and find the optimal one.
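
For context, the baseline object these communication schemes operate on is a coordinate-style encoding of a sparse gradient: shipping (index, value) pairs instead of the dense tensor. A minimal sketch of that baseline encoding (not Zen's near-optimal scheme, which the paper derives):

```python
import numpy as np

rng = np.random.default_rng(0)
grad = rng.normal(size=1_000_000).astype(np.float32)
grad[rng.random(grad.size) > 0.01] = 0.0      # ~1% non-zeros, as in sparse gradients

# Coordinate encoding: send (index, value) pairs instead of the dense tensor.
idx = np.flatnonzero(grad).astype(np.int32)
val = grad[idx]
print(f"dense: {grad.nbytes} B, sparse: {idx.nbytes + val.nbytes} B "
      f"({grad.nbytes / (idx.nbytes + val.nbytes):.0f}x smaller)")

# A receiver reconstructs the tensor before (or after) aggregation.
recon = np.zeros_like(grad)
recon[idx] = val
assert np.array_equal(recon, grad)
```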

Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model

1 code implementation • NeurIPS 2023 • Zirui Liu, Guanchu Wang, Shaochen Zhong, Zhaozhuo Xu, Daochen Zha, Ruixiang Tang, Zhimeng Jiang, Kaixiong Zhou, Vipin Chaudhary, Shuai Xu, Xia Hu

While the model parameters do contribute to memory usage, the primary memory bottleneck during training arises from storing feature maps, also known as activations, as they are crucial for gradient calculation.

Language Modelling · Stochastic Optimization
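
The column-row sampling idea behind this line of work can be sketched directly: A @ B equals the sum of outer products of A's columns with B's rows, so sampling a few column-row pairs with importance weights gives an unbiased, memory-cheap estimate. The snippet below is the standard CRS baseline, not the paper's winner-take-all variant, which is designed to reduce this estimator's variance:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.normal(size=(128, 512)), rng.normal(size=(512, 64))

def crs_matmul(A, B, k, rng):
    """Unbiased column-row sampling estimate of A @ B using k of A.shape[1] pairs."""
    p = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p /= p.sum()                                 # norm-proportional sampling
    idx = rng.choice(A.shape[1], size=k, p=p)
    scale = 1.0 / (k * p[idx])                   # importance-sampling correction
    return (A[:, idx] * scale) @ B[idx, :]

exact, approx = A @ B, crs_matmul(A, B, k=128, rng=rng)
err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(f"relative error with 128/512 pairs: {err:.2f}")
```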

A Theoretical Analysis Of Nearest Neighbor Search On Approximate Near Neighbor Graph

no code implementations • 10 Mar 2023 • Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu

The current theoretical literature focuses on greedy search on the exact near neighbor graph, while practitioners use approximate near neighbor graphs (ANN-Graphs) to reduce preprocessing time.
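
The greedy search being analyzed is easy to state: starting from some node, repeatedly hop to the neighbor closest to the query until no neighbor improves. A minimal sketch, built here on an exact kNN graph for simplicity (the paper's contribution is the analysis for the approximate case):

```python
import numpy as np

rng = np.random.default_rng(0)
vecs = rng.normal(size=(200, 16))          # toy database

# Build an exact kNN graph; the paper analyzes its approximate counterpart.
d2 = ((vecs[:, None] - vecs[None]) ** 2).sum(-1)
np.fill_diagonal(d2, np.inf)
graph = {i: list(np.argsort(d2[i])[:8]) for i in range(len(vecs))}

def greedy_search(graph, vecs, query, start=0):
    """Hop to the neighbor closest to the query until no neighbor improves."""
    cur = start
    while True:
        neighbors = graph[cur]
        dists = np.linalg.norm(vecs[neighbors] - query, axis=1)
        if dists.min() >= np.linalg.norm(vecs[cur] - query):
            return cur                     # local minimum reached
        cur = neighbors[int(np.argmin(dists))]

print(greedy_search(graph, vecs, query=rng.normal(size=16)))
```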

Adaptive and Dynamic Multi-Resolution Hashing for Pairwise Summations

no code implementations • 21 Dec 2022 • Lianke Qin, Aravind Reddy, Zhao Song, Zhaozhuo Xu, Danyang Zhuo

In this paper, we propose Adam-Hash: an adaptive and dynamic multi-resolution hashing data-structure for fast pairwise summation estimation.
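
The problem being accelerated is estimating a pairwise sum such as a kernel density, f(q) = Σ_i w(q, x_i), faster than an O(n) scan over the dataset. A naive unbiased baseline is uniform subsampling, sketched below; hashing-based estimators like Adam-Hash aim for a better accuracy/time trade-off and, crucially, correctness under adaptive queries and dynamic updates:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 8))          # dataset
q = rng.normal(size=8)                     # query

def kernel(q, X, sigma=1.0):
    """Gaussian kernel weights w(q, x_i)."""
    return np.exp(-((X - q) ** 2).sum(-1) / (2 * sigma ** 2))

exact = kernel(q, X).sum()                 # O(n) per query

# Naive unbiased estimator: uniform subsample of m points, rescaled by n/m.
m = 2_000
sub = X[rng.choice(len(X), size=m, replace=False)]
estimate = kernel(q, sub).sum() * len(X) / m
print(exact, estimate)
```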

Dynamic Maintenance of Kernel Density Estimation Data Structure: From Practice to Theory

no code implementations • 8 Aug 2022 • Jiehao Liang, Zhao Song, Zhaozhuo Xu, Junze Yin, Danyang Zhuo

In this work, we focus on the dynamic maintenance of KDE data structures with robustness to adversarial queries.

Density Estimation

Proximity Graph Maintenance for Fast Online Nearest Neighbor Search

no code implementations • 22 Jun 2022 • Zhaozhuo Xu, Weijie Zhao, Shulong Tan, Zhixin Zhou, Ping Li

Given a vertex deletion request, we thoroughly investigate solutions to update the connections of the vertex.

Quantization · Recommendation Systems
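
A toy version of the deletion problem: removing a vertex from an undirected proximity graph leaves a hole in the search paths, and one simple policy is to reconnect the deleted vertex's former neighbors pairwise. The sketch below shows only this naive policy; the paper investigates and compares such update strategies in depth:

```python
def delete_vertex(graph, v):
    """Remove v and patch the hole by linking its former neighbors.

    graph: dict mapping node -> set of neighbor nodes (undirected).
    This is one simple 'local reconnect' policy, not the paper's method.
    """
    neighbors = graph.pop(v)
    for u in neighbors:
        graph[u].discard(v)
    # Reconnect former neighbors pairwise so searches can still pass through.
    for u in neighbors:
        for w in neighbors:
            if u != w:
                graph[u].add(w)

g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
delete_vertex(g, 2)
print(g)   # {0: {1, 3}, 1: {0, 3}, 3: {0, 1}}
```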

Locality Sensitive Teaching

no code implementations • NeurIPS 2021 • Zhaozhuo Xu, Beidi Chen, Chaojian Li, Weiyang Liu, Le Song, Yingyan Lin, Anshumali Shrivastava

However, as one of the most influential and practical MT paradigms, iterative machine teaching (IMT) is prohibitively expensive on IoT devices due to its inefficient and unscalable algorithms.

Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing

no code implementations • 18 May 2021 • Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu

We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions.

Reinforcement Learning (RL)
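
With linear function approximation, Q(s, a) = φ(s, a)·θ, so the per-state max over actions is exactly a maximum inner product search (MIPS). The sketch below shows the brute-force reduction with made-up shapes; the paper's result comes from replacing the linear scan with an LSH-based MIPS index:

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, dim = 10_000, 32
Phi = rng.normal(size=(n_actions, dim))   # feature rows phi(s, a) for a fixed state s
theta = rng.normal(size=dim)              # LSVI weight vector

# Q(s, a) = phi(s, a) . theta, so max_a Q(s, a) is a maximum inner product search.
scores = Phi @ theta                      # exact MIPS: one linear scan over actions
best = int(np.argmax(scores))
print(best, scores[best])
# The paper replaces this O(n_actions) scan with an LSH-based MIPS index,
# making the per-iteration argmax sublinear in the number of actions.
```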

MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training

no code implementations • ICLR 2021 • Beidi Chen, Zichang Liu, Binghui Peng, Zhaozhuo Xu, Jonathan Lingjie Li, Tri Dao, Zhao Song, Anshumali Shrivastava, Christopher Re

Recent advances by practitioners in the deep learning community have breathed new life into Locality Sensitive Hashing (LSH), using it to reduce memory and time bottlenecks in neural network (NN) training.

Efficient Neural Network · Language Modelling +2
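
The SLIDE-style idea MONGOOSE builds on can be sketched with a static SimHash: hash the layer's weight rows into buckets, and for each input evaluate only the neurons whose hash collides with the input's, since collisions correlate with large inner products. Everything below is a simplified, static illustration; MONGOOSE's contribution is making the hash functions learnable and scheduling when to re-hash:

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, dim, n_bits = 4096, 256, 8
W = rng.normal(size=(n_neurons, dim))      # one weight row per neuron
planes = rng.normal(size=(n_bits, dim))    # random hyperplanes for SimHash

def simhash(vs):
    """Sign-pattern hash codes; similar directions tend to collide."""
    bits = (vs @ planes.T > 0).astype(np.uint32)          # (n, n_bits)
    return bits @ (1 << np.arange(n_bits, dtype=np.uint32))

# Pre-bucket all neurons by the hash of their weight vector.
buckets = {}
for i, code in enumerate(simhash(W)):
    buckets.setdefault(int(code), []).append(i)

x = rng.normal(size=dim)
active = buckets.get(int(simhash(x[None])[0]), [])
out = W[active] @ x                        # evaluate only colliding neurons
print(f"evaluated {len(active)} of {n_neurons} neurons")
```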

Climbing the WOL: Training for Cheaper Inference

no code implementations • 2 Jul 2020 • Zichang Liu, Zhaozhuo Xu, Alan Ji, Jonathan Li, Beidi Chen, Anshumali Shrivastava

Efficient inference for wide output layers (WOLs) is an essential yet challenging task in large-scale machine learning.

Retrieval

On Efficient Retrieval of Top Similarity Vectors

no code implementations • IJCNLP 2019 • Shulong Tan, Zhixin Zhou, Zhaozhuo Xu, Ping Li

Retrieval of relevant vectors produced by representation learning critically influences the efficiency in natural language processing (NLP) tasks.

BIG-bench Machine Learning · Representation Learning +1

Fast Binary Functional Search on Graph

no code implementations • 27 Sep 2018 • Shulong Tan, Zhixin Zhou, Zhaozhuo Xu, Ping Li

Because Approximate Nearest Neighbor Search (ANNS) techniques are designed around metric distances, efficient search under more advanced measures remains an open question.

Open-Ended Question Answering
