Search Results for author: Jing Nathan Yan

Found 6 papers, 2 papers with code

MambaByte: Token-free Selective State Space Model

no code implementations • 24 Jan 2024 • Junxiong Wang, Tushaar Gangavarapu, Jing Nathan Yan, Alexander M. Rush

We propose MambaByte, a token-free adaptation of the Mamba SSM trained autoregressively on byte sequences.

Computational Efficiency, Inductive Bias, +1

Diffusion Models Without Attention

no code implementations • 30 Nov 2023 • Jing Nathan Yan, Jiatao Gu, Alexander M. Rush

In recent advancements in high-fidelity image generation, Denoising Diffusion Probabilistic Models (DDPMs) have emerged as a key player.

Denoising, Image Generation

On What Basis? Predicting Text Preference Via Structured Comparative Reasoning

no code implementations • 14 Nov 2023 • Jing Nathan Yan, Tianqi Liu, Justin T Chiu, Jiaming Shen, Zhen Qin, Yue Yu, Yao Zhao, Charu Lakshmanan, Yair Kurzion, Alexander M. Rush, Jialu Liu, Michael Bendersky

Comparative reasoning plays a crucial role in text preference prediction; however, large language models (LLMs) often demonstrate inconsistencies in their reasoning.

Hallucination, Retrieval

Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning

no code implementations • 13 Nov 2023 • Yue Yu, Jiaming Shen, Tianqi Liu, Zhen Qin, Jing Nathan Yan, Jialu Liu, Chao Zhang, Michael Bendersky

To fully unleash the power of explanations, we propose EASE, an Explanation-Aware Soft Ensemble framework to empower in-context learning with LLMs.

In-Context Learning, Language Modelling, +2

Pretraining Without Attention

1 code implementation • 20 Dec 2022 • Junxiong Wang, Jing Nathan Yan, Albert Gu, Alexander M. Rush

Even so, BiGS is able to match BERT pretraining accuracy on GLUE and can be extended to long-form pretraining of 4096 tokens without approximation.
