no code implementations • 15 Oct 2024 • Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao Song, Yufa Zhou
Large Language Models (LLMs) have shown immense potential in enhancing various aspects of our daily lives, from conversational AI to search and AI assistants.
no code implementations • 12 Oct 2024 • Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou
For small cache sizes, we provide an algorithm that improves over existing methods and achieves tight bounds.
no code implementations • 12 Oct 2024 • Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Yufa Zhou
In contrast, the multi-layer perceptron with $\mathsf{ReLU}$ activation ($\mathsf{ReLU}$-$\mathsf{MLP}$), one of the most fundamental components of neural networks, is known to be expressive; specifically, a two-layer neural network is a universal approximator given an exponentially large number of hidden neurons.
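For reference, a two-layer $\mathsf{ReLU}$-$\mathsf{MLP}$ of the kind mentioned above can be written in a few lines; the hidden width is the quantity that must grow, in the worst case exponentially, for universal approximation. The shapes and names below are illustrative and not taken from the paper.

    import numpy as np

    def relu_mlp(x, W1, b1, W2, b2):
        # Two-layer ReLU network: f(x) = W2 @ relu(W1 @ x + b1) + b2.
        # Universal approximation holds as the hidden width (rows of W1) grows.
        hidden = np.maximum(W1 @ x + b1, 0.0)  # ReLU activation
        return W2 @ hidden + b2

    # Illustrative sizes: input dimension 4, hidden width 64, scalar output.
    rng = np.random.default_rng(0)
    d, m = 4, 64
    W1, b1 = rng.standard_normal((m, d)), rng.standard_normal(m)
    W2, b2 = rng.standard_normal((1, m)), rng.standard_normal(1)
    y = relu_mlp(rng.standard_normal(d), W1, b1, W2, b2)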
no code implementations • 23 Aug 2024 • Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Yufa Zhou
The computational complexity of the self-attention mechanism in popular transformer architectures poses significant challenges for training and inference, and becomes the bottleneck for long inputs.
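As a reminder of where this cost comes from, a naive single-head self-attention forward pass materializes an $n \times n$ score matrix, so time and memory grow quadratically in the sequence length $n$. The sketch below is a generic illustration of that bottleneck, not the algorithm studied in the paper.

    import numpy as np

    def naive_attention(Q, K, V):
        # Q, K, V have shape (n, d). The score matrix S has shape (n, n),
        # so time and memory scale as O(n^2) in the sequence length n.
        d = Q.shape[1]
        S = Q @ K.T / np.sqrt(d)
        P = np.exp(S - S.max(axis=1, keepdims=True))  # row-wise softmax
        P /= P.sum(axis=1, keepdims=True)
        return P @ V

    n, d = 1024, 64
    rng = np.random.default_rng(0)
    out = naive_attention(rng.standard_normal((n, d)),
                          rng.standard_normal((n, d)),
                          rng.standard_normal((n, d)))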
no code implementations • 20 Jul 2024 • Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou
In addition, our data structure can guarantee that the process of answering a user query satisfies $(\epsilon, \delta)$-DP with $\widetilde{O}(n^{-1} \epsilon^{-1} \alpha^{-1/2} R^{2s} R_w r^2)$ additive error and $n^{-1} (\alpha + \epsilon_s)$ relative error between our output and the true answer.
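For context, the $(\epsilon, \delta)$-DP guarantee above is the standard one (the notation here is generic, not the paper's): a randomized mechanism $\mathcal{M}$ is $(\epsilon, \delta)$-differentially private if for any two neighboring datasets $D, D'$ and any set of outputs $S$, $\Pr[\mathcal{M}(D) \in S] \le e^{\epsilon} \Pr[\mathcal{M}(D') \in S] + \delta$; the additive and relative error terms then quantify how far the privatized answer may deviate from the exact one.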
no code implementations • 26 May 2024 • Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou
We prove that if the target distribution is a $k$-mixture of Gaussians, the density of the entire diffusion process will also be a $k$-mixture of Gaussians.
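A one-line calculation illustrates the forward direction under a standard variance-preserving forward process (the notation below is generic, not the paper's): if $x_0 \sim \sum_{i=1}^{k} w_i \, \mathcal{N}(\mu_i, \Sigma_i)$ and $x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1-\bar\alpha_t}\, \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$ independent of $x_0$, then conditioning on the mixture component and using the fact that an affine map of a Gaussian plus independent Gaussian noise is again Gaussian gives $x_t \sim \sum_{i=1}^{k} w_i \, \mathcal{N}\big(\sqrt{\bar\alpha_t}\, \mu_i,\ \bar\alpha_t \Sigma_i + (1-\bar\alpha_t) I\big)$, i.e. again a $k$-mixture of Gaussians.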
no code implementations • 26 May 2024 • Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou
Tensor Attention, a multi-view attention mechanism that captures high-order correlations among multiple modalities, can overcome the representational limitations of classical matrix attention.
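One way to make the "high-order correlation" point concrete is a third-order score tensor that couples one query with a pair of keys at once. The sketch below is a generic trilinear formulation for illustration only; it is not claimed to be the exact definition used in the paper, and all names and shapes are assumptions.

    import numpy as np

    def tensor_attention(Q, K1, K2, V1, V2):
        # Third-order scores S[i, j, k] = sum_d Q[i, d] * K1[j, d] * K2[k, d]
        # couple one query with a *pair* of keys, unlike matrix attention's
        # pairwise scores S[i, j].
        S = np.einsum('id,jd,kd->ijk', Q, K1, K2)
        n = Q.shape[0]
        P = S.reshape(n, n * n)
        P = np.exp(P - P.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)        # softmax jointly over (j, k)
        P = P.reshape(n, n, n)
        # Value pairs (j, k) are combined elementwise here for simplicity.
        return np.einsum('ijk,jd,kd->id', P, V1, V2)

    n, d = 8, 16
    rng = np.random.default_rng(0)
    Q, K1, K2, V1, V2 = (rng.standard_normal((n, d)) for _ in range(5))
    out = tensor_attention(Q, K1, K2, V1, V2)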
no code implementations • 8 May 2023 • Yeqi Gao, Zhao Song, Xin Yang, Yufa Zhou
Large language models (LLMs), especially those based on the Transformer architecture, have had a profound impact on various aspects of daily life, such as natural language processing, content generation, research methodologies, and more.