no code implementations • 16 Jan 2024 • Yachao Li, Junhui Li, Jing Jiang, Min Zhang
Our proposed translation mixed-instructions enable LLMs (Llama-2 7B and 13B) to maintain consistent translation performance from the sentence level up to documents of 2048 tokens.
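Below is a minimal sketch of how such mixed-granularity instruction data might be assembled from aligned sentence pairs. The prompt template, function name, and random chunking policy are illustrative assumptions, not the paper's exact recipe.

```python
import random

def build_mixed_instructions(pairs, max_doc_sents=8, seed=0):
    """Build translation instruction examples that mix sentence-level
    and multi-sentence (document-level) granularities.

    `pairs` is an ordered list of (source, target) sentence pairs from
    one document. The instruction wording is a placeholder.
    """
    rng = random.Random(seed)
    examples = []
    i = 0
    while i < len(pairs):
        # Randomly pick a granularity: 1 sentence up to max_doc_sents.
        n = rng.randint(1, max_doc_sents)
        chunk = pairs[i:i + n]
        examples.append({
            "instruction": "Translate the following text into English.",
            "input": " ".join(s for s, _ in chunk),
            "output": " ".join(t for _, t in chunk),
        })
        i += n
    return examples

if __name__ == "__main__":
    doc = [("Guten Morgen.", "Good morning."),
           ("Wie geht es dir?", "How are you?"),
           ("Bis später.", "See you later.")]
    for ex in build_mixed_instructions(doc, max_doc_sents=2):
        print(ex)
```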
no code implementations • 12 Dec 2022 • Yachao Li, Junhui Li, Jing Jiang, Shimin Tao, Hao Yang, Min Zhang
To alleviate this problem, we propose a position-aware Transformer (P-Transformer) that enhances absolute and relative position information in both self-attention and cross-attention.
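A generic PyTorch sketch of the idea: a learned, distance-clipped relative-position bias is added to the attention logits, so position information reaches both self-attention (query and memory are the same tensor) and cross-attention. The module name, single-head design, and bias scheme are assumptions, not the paper's exact P-Transformer; absolute position handling (e.g., position embeddings added to the inputs) is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PositionAwareAttention(nn.Module):
    """Single-head attention with a learned relative-position bias
    added to the logits (a sketch, not the paper's exact model)."""
    def __init__(self, d_model, max_rel_dist=32):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.max_rel_dist = max_rel_dist
        # One bias per clipped relative distance in [-max, +max].
        self.rel_bias = nn.Embedding(2 * max_rel_dist + 1, 1)

    def forward(self, query, memory):
        # query: (B, Lq, D); memory: (B, Lk, D).
        # For self-attention, pass the same tensor for both.
        q, k, v = self.q(query), self.k(memory), self.v(memory)
        logits = q @ k.transpose(-1, -2) / q.size(-1) ** 0.5  # (B, Lq, Lk)
        # Clipped relative distances between query and key positions.
        pos_q = torch.arange(query.size(1), device=query.device).unsqueeze(1)
        pos_k = torch.arange(memory.size(1), device=query.device).unsqueeze(0)
        rel = (pos_k - pos_q).clamp(-self.max_rel_dist, self.max_rel_dist)
        logits = logits + self.rel_bias(rel + self.max_rel_dist).squeeze(-1)
        return F.softmax(logits, dim=-1) @ v
```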
1 code implementation • COLING 2018 • Yachao Li, Junhui Li, Min Zhang
In the popular sequence-to-sequence (seq2seq) neural machine translation (NMT), there exist many weighted sum models (WSMs), each of which takes a set of inputs and generates one output.
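As a concrete instance, attention itself is a WSM: it scores each input against the current state, normalizes the scores, and returns one output as the weighted sum of the inputs. A minimal sketch follows; dot-product scoring is an illustrative choice, not the paper's specific formulation.

```python
import torch
import torch.nn.functional as F

def weighted_sum(decoder_state, encoder_states):
    """A minimal weighted sum model (WSM): score each input,
    normalize, and return one output as the weighted sum."""
    # decoder_state: (D,); encoder_states: (L, D)
    scores = encoder_states @ decoder_state   # (L,) one score per input
    weights = F.softmax(scores, dim=0)        # normalized weights
    return weights @ encoder_states           # (D,) single output

if __name__ == "__main__":
    h = torch.randn(4)
    enc = torch.randn(6, 4)
    print(weighted_sum(h, enc).shape)  # torch.Size([4])
```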