Search Results for author: Suyuchen Wang

Found 3 papers, 2 papers with code

Resonance RoPE: Improving Context Length Generalization of Large Language Models

1 code implementation • 29 Feb 2024 • Suyuchen Wang, Ivan Kobyzev, Peng Lu, Mehdi Rezagholizadeh, Bang Liu

This paper addresses the challenge of train-short-test-long (TSTL) scenarios in Large Language Models (LLMs) equipped with Rotary Position Embedding (RoPE), where models pre-trained on shorter sequences face difficulty with out-of-distribution (OOD) token positions in longer sequences.
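The OOD-position problem described above can be illustrated with a minimal sketch of RoPE's rotation angles (the dimension size, base, and sequence lengths below are illustrative choices, not values from the paper):

```python
import math

def rope_angles(pos, dim=8, base=10000.0):
    # RoPE rotates each 2-D feature pair by pos * theta_i,
    # where theta_i = base^(-2i/dim) decreases with i.
    return [pos * base ** (-2 * i / dim) for i in range(dim // 2)]

train_len = 16
# Largest rotation angles the model could have seen in pre-training
# (positions 0 .. train_len - 1).
seen_max = rope_angles(train_len - 1)

# A test sequence 4x longer yields rotation angles outside the range
# observed during training -- the out-of-distribution (OOD) positions
# that cause the train-short-test-long (TSTL) failure.
ood = rope_angles(4 * train_len)
```

Because the angle grows linearly with position, every feature pair at position `4 * train_len` is rotated further than at any training position, which is what makes long-context extrapolation hard for RoPE-based models.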

Language Modelling • Position
