Search Results for author: Tongyao Zhu

Found 4 papers, 3 papers with code

When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training

1 code implementation · 20 Nov 2024 · Haonan Wang, Qian Liu, Chao Du, Tongyao Zhu, Cunxiao Du, Kenji Kawaguchi, Tianyu Pang

To address the breakdown of RoPE under BFloat16 in long-context training, we develop AnchorAttention, a plug-and-play attention method that alleviates numerical issues caused by BFloat16, improves long-context capabilities, and speeds up training.

Computational Efficiency, Position

CCSBench: Evaluating Compositional Controllability in LLMs for Scientific Document Summarization

no code implementations · 16 Oct 2024 · Yixi Ding, Jiaying Wu, Tongyao Zhu, Yanxia Qin, Qian Liu, Min-Yen Kan

To broaden the dissemination of scientific knowledge to diverse audiences, scientific document summarization must simultaneously control multiple attributes such as length and empirical focus.

Document Summarization, Scientific Document Summarization

Beyond Memorization: The Challenge of Random Memory Access in Language Models

1 code implementation · 12 Mar 2024 · Tongyao Zhu, Qian Liu, Liang Pang, Zhengbao Jiang, Min-Yen Kan, Min Lin

Through carefully designed synthetic tasks, covering the scenarios of full recitation, selective recitation, and grounded question answering, we reveal that LMs manage to sequentially access their memory while encountering challenges in randomly accessing memorized content.

Memorization, Open-Domain Question Answering

Translating Natural Language to Planning Goals with Large-Language Models

1 code implementation · 10 Feb 2023 · Yaqi Xie, Chen Yu, Tongyao Zhu, Jinbin Bai, Ze Gong, Harold Soh

Recent large language models (LLMs) have demonstrated remarkable performance on a variety of natural language processing (NLP) tasks, leading to intense excitement about their applicability across various domains.

Spatial Reasoning, Translation
