no code implementations • NAACL (AutoSimTrans) 2022 • Mengge Liu, Xiang Li, Bao Chen, Yanzhi Tian, Tianwei Lan, Silin Li, Yuhang Guo, Jian Luan, Bin Wang
This system paper describes the BIT-Xiaomi simultaneous translation system for the AutoSimTrans 2022 simultaneous translation challenge.
no code implementations • 4 Feb 2025 • Menglong Cui, Pengzhi Gao, Wei Liu, Jian Luan, Bin Wang
Large language models (LLMs) have shown continuously improving multilingual capabilities, and even small-scale open-source models have demonstrated rapid performance enhancement.
no code implementations • 30 Dec 2024 • Xiaolin Hu, Xiang Cheng, Peiyu Liu, Wei Liu, Jian Luan, Bin Wang, Yong Liu
To address this, we propose Weight-Decomposed Tensor Adaptation (DoTA), which leverages the Matrix Product Operator (MPO) decomposition of pre-trained weights for effective initialization in fine-tuning LLMs.
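As a rough illustration of decomposition-based adapter initialization — using truncated SVD as a simplified stand-in for the paper's MPO decomposition — the adapter factors can be seeded from the pre-trained weight rather than from random noise:

```python
# Hypothetical sketch: seed low-rank adapter factors from a decomposition of the
# pre-trained weight (truncated SVD here stands in for the MPO decomposition).
import torch

def decomposition_init(weight: torch.Tensor, rank: int):
    """Return factors A (out x r) and B (r x in) approximating the pre-trained weight."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    A = U[:, :rank] * S[:rank].sqrt()               # (out_features, rank)
    B = S[:rank].sqrt().unsqueeze(1) * Vh[:rank]    # (rank, in_features)
    return A, B

W = torch.randn(1024, 1024)          # stand-in for a frozen pre-trained weight
A, B = decomposition_init(W, rank=16)
print(torch.linalg.matrix_norm(W - A @ B))  # residual left to the frozen base weight
```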
no code implementations • 7 Dec 2024 • WeiJie Chen, Ting Bai, Jinbo Su, Jian Luan, Wei Liu, Chuan Shi
Large language models with retrieval-augmented generation encounter a pivotal challenge in intricate retrieval tasks, e.g., multi-hop question answering, which requires the model to navigate across multiple documents and generate comprehensive responses based on fragmented information.
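A minimal sketch of the iterative retrieve-then-generate loop that multi-hop question answering typically requires; `retrieve` and `generate` are hypothetical placeholders, not the interface of the system described here:

```python
# Hypothetical sketch of iterative retrieval for multi-hop QA; `retrieve` and
# `generate` are placeholder callables, not the paper's actual components.
def multi_hop_answer(question, retrieve, generate, max_hops=3):
    evidence = []
    query = question
    for _ in range(max_hops):
        docs = retrieve(query, k=5)             # fetch passages for the current sub-query
        evidence.extend(docs)
        step = generate(question, evidence)     # model returns an answer or a follow-up query
        if step["final"]:
            return step["answer"]
        query = step["next_query"]              # refine the query from partial evidence
    return generate(question, evidence)["answer"]
```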
no code implementations • 28 Oct 2024 • Yuhan Chen, Ang Lv, Jian Luan, Bin Wang, Wei Liu
Furthermore, we conduct a detailed analysis of rotary position encoding (RoPE, a prevalent relative positional encoding in LLMs), and find that the U-shaped attention is caused by some learned components, which are also the key factors limiting RoPE's expressiveness and extrapolation. Inspired by these insights, we propose High-frequency rotary Position Encoding (HoPE).
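For reference, standard RoPE rotates each two-dimensional pair of query/key features by a position-dependent angle whose frequency decays across dimensions; HoPE modifies which frequency components are kept, as detailed in the paper. A minimal sketch of the standard encoding:

```python
import torch

def rotary_embedding(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply standard RoPE to x of shape (seq_len, dim); dim must be even."""
    seq_len, dim = x.shape
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)  # per-pair frequencies
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * inv_freq   # (seq_len, dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin   # rotate each 2-D pair by its position-dependent angle
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```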
no code implementations • 25 Sep 2024 • Qibin Wang, Xiaolin Hu, Weikai Xu, Wei Liu, Jian Luan, Bin Wang
Low-rank adaptation (LoRA) and its variants have recently gained much interest due to their ability to avoid excessive inference costs.
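For context, a minimal LoRA-style layer freezes the base weight and adds a trainable low-rank update, so only the small factors are updated during fine-tuning (a generic sketch, not this paper's particular variant):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight plus a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                                  # base weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))     # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```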
1 code implementation • 23 Sep 2024 • Qinzhuo Wu, Wei Liu, Jian Luan, Bin Wang
Recently, tool-augmented LLMs have gained increasing attention.
1 code implementation • 23 Sep 2024 • Qinzhuo Wu, Weikai Xu, Wei Liu, Tao Tan, Jianfeng Liu, Ang Li, Jian Luan, Bin Wang, Shuo Shang
These fine-tuned VLMs may still ignore the relationships between UI pages, neglect the roles of elements in page transitions, and lack inter-UI understanding.
no code implementations • 18 Sep 2024 • Manxi Sun, Wei Liu, Jian Luan, Pengzhi Gao, Bin Wang
The Sparsely-Activated Mixture-of-Experts (MoE) has gained increasing popularity for scaling up large language models (LLMs) without exploding computational costs.
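A compact sketch of the sparse-activation idea: a router sends each token to only its top-k experts, so compute grows far more slowly than parameter count (illustrative only; the routing used in specific MoE LLMs varies):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Sparse MoE layer: each token is processed by only its top-k experts."""
    def __init__(self, dim: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.k = k

    def forward(self, x):                               # x: (tokens, dim)
        gate = F.softmax(self.router(x), dim=-1)        # (tokens, n_experts)
        weights, idx = gate.topk(self.k, dim=-1)        # keep only the top-k experts per token
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out
```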
no code implementations • 29 Aug 2024 • Jingyi Wang, Jianzhong Ju, Jian Luan, Zhidong Deng
Recent advances in large vision-language models (VLMs) typically employ vision encoders based on the Vision Transformer (ViT) architecture.
1 code implementation • 8 Jul 2024 • Bowen Shen, Zheng Lin, Daren Zha, Wei Liu, Jian Luan, Bin Wang, Weiping Wang
However, because coarse-grained structured pruning heavily damages the highly interconnected model, achieving a high compression ratio for scaled-up LLMs remains a challenge.
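To make "coarse-grained structured pruning" concrete, a simple magnitude-based variant removes whole output channels of a linear layer (a generic sketch, not the method proposed here):

```python
import torch
import torch.nn as nn

def prune_output_channels(layer: nn.Linear, keep_ratio: float = 0.5) -> nn.Linear:
    """Coarse-grained structured pruning: drop output channels with the smallest L2 norm."""
    scores = layer.weight.norm(dim=1)                      # one score per output channel
    n_keep = max(1, int(layer.out_features * keep_ratio))
    keep = scores.topk(n_keep).indices.sort().values       # indices of surviving channels
    pruned = nn.Linear(layer.in_features, n_keep, bias=layer.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(layer.weight[keep])
        if layer.bias is not None:
            pruned.bias.copy_(layer.bias[keep])
    return pruned
```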
1 code implementation • 1 Jul 2024 • Shihan Deng, Weikai Xu, Hongda Sun, Wei Liu, Tao Tan, Jianfeng Liu, Ang Li, Jian Luan, Bin Wang, Rui Yan, Shuo Shang
With the remarkable advancements of large language models (LLMs), LLM-based agents have become a research hotspot in human-computer interaction.
1 code implementation • 3 Jun 2024 • Quandong Wang, Yuxuan Yuan, Xiaoyu Yang, Ruike Zhang, Kang Zhao, Wei Liu, Jian Luan, Daniel Povey, Bin Wang
At inference time, it boosts speed by up to 37% and reduces memory usage by 1 GB per GPU.
no code implementations • 11 Mar 2024 • Yuanhang Zheng, Peng Li, Wei Liu, Yang Liu, Jian Luan, Bin Wang
Specifically, our proposed ToolRerank includes Adaptive Truncation, which truncates the retrieval results related to seen and unseen tools at different positions, and Hierarchy-Aware Reranking, which makes retrieval results more concentrated for single-tool queries and more diverse for multi-tool queries.
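A hypothetical illustration of the adaptive-truncation idea: keep a shorter candidate list when the top retrieved tools were seen in training and a longer one otherwise; the cutoffs and the seen/unseen test below are placeholders, not the paper's settings:

```python
# Hypothetical sketch of adaptive truncation; thresholds are illustrative only.
def adaptive_truncate(ranked_tools, seen_tools, short_cut=5, long_cut=20):
    top_is_seen = ranked_tools[0] in seen_tools   # placeholder test for seen vs. unseen tools
    cutoff = short_cut if top_is_seen else long_cut
    return ranked_tools[:cutoff]
```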
1 code implementation • 26 Feb 2024 • Renren Jin, Jiangcun Du, Wuwei Huang, Wei Liu, Jian Luan, Bin Wang, Deyi Xiong
Our experimental results indicate that LLMs with 4-bit quantization can retain performance comparable to their non-quantized counterparts, and perplexity can serve as a proxy metric for quantized LLMs on most benchmarks.
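Perplexity here is the exponentiated next-token cross-entropy; a minimal sketch, assuming a Hugging-Face-style model whose forward pass returns `logits`:

```python
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def perplexity(model, input_ids: torch.Tensor) -> float:
    """Corpus perplexity from next-token cross-entropy; `model` returns logits of shape (1, T, vocab)."""
    logits = model(input_ids).logits
    loss = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),   # predictions for positions 1..T-1
        input_ids[:, 1:].reshape(-1),                  # shifted targets
    )
    return math.exp(loss.item())
```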
2 code implementations • 10 Jan 2024 • Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu
Next, we discuss several key challenges to achieve intelligent, efficient and secure Personal LLM Agents, followed by a comprehensive survey of representative solutions to address these challenges.
no code implementations • 7 Nov 2023 • Mengge Liu, Wen Zhang, Xiang Li, Yanzhi Tian, Yuhang Guo, Jian Luan, Bin Wang, Shuoying Chen
Simultaneous machine translation (SiMT) is a challenging task that requires starting translation before the full source sentence is available.
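One widely used baseline policy for this read/write trade-off is wait-k, which reads k source tokens before emitting each target token; a generic sketch is shown below, where `translate_prefix` is a placeholder and this is not necessarily the policy adopted in the paper:

```python
# Generic wait-k decoding sketch; `translate_prefix` is a placeholder for a model
# that predicts the next target token from the source prefix and target prefix.
def wait_k_decode(source_stream, translate_prefix, k=3, max_len=128):
    src, hyp = [], []
    for token in source_stream:
        src.append(token)                               # READ one source token
        if len(src) >= k and len(hyp) < max_len:
            hyp.append(translate_prefix(src, hyp))      # WRITE one target token
    while len(hyp) < max_len and (not hyp or hyp[-1] != "</s>"):
        hyp.append(translate_prefix(src, hyp))          # drain remaining target tokens
    return hyp
```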
1 code implementation • 28 Oct 2023 • Hongda Sun, Weikai Xu, Wei Liu, Jian Luan, Bin Wang, Shuo Shang, Ji-Rong Wen, Rui Yan
Recent advances in large language models (LLMs) have revolutionized the landscape of reasoning tasks.
no code implementations • 29 Jun 2023 • Tianwen Wei, Jian Luan, Wei Liu, Shuang Dong, Bin Wang
We present the Chinese Elementary School Math Word Problems (CMATH) dataset, comprising 1.7k elementary-school-level math word problems with detailed annotations, sourced from actual Chinese workbooks and exams.
no code implementations • 18 Jun 2023 • Kang Zhao, Wei Liu, Jian Luan, Minglei Gao, Li Qian, Hanlin Teng, Bin Wang
In this paper, we propose a Unified framework for Long-term Memory Conversations (UniMC), which increases the connection between different stages by learning relevance representation.
1 code implementation • 27 May 2023 • Zhibin Lan, Jiawei Yu, Xiang Li, Wen Zhang, Jian Luan, Bin Wang, Degen Huang, Jinsong Su
Text image translation (TIT) aims to translate source texts embedded in images into target-language translations; it has a wide range of applications and thus important research value.
1 code implementation • 2 Mar 2023 • Mengge Liu, Wen Zhang, Xiang Li, Jian Luan, Bin Wang, Yuhang Guo, Shuoying Chen
Simultaneous machine translation (SimulMT) models start translation before the end of the source sentence, making the translation monotonically aligned with the source sentence.
no code implementations • 17 Jan 2023 • Xiangyu Qin, Zhiyu Wu, Jinshi Cui, Tingting Zhang, Yanran Li, Jian Luan, Bin Wang, Li Wang
Accordingly, we propose a novel paradigm, i.e., exploring contextual information and dialogue structure information in the fine-tuning step, and adapting the PLM to the ERC task in terms of input text, classification structure, and training strategy.
no code implementations • 7 Dec 2022 • Fengyu Yang, Jian Luan, Yujun Wang
We introduce a phonology embedding to capture the differences between English phonologies.
no code implementations • 9 Oct 2021 • Yunchao He, Jian Luan, Yujun Wang
Sequence expansion between encoder and decoder is a critical challenge in sequence-to-sequence TTS.
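One common solution to this expansion problem is FastSpeech-style length regulation, which repeats each encoder state according to a predicted duration (a generic sketch, not necessarily the mechanism proposed here):

```python
import torch

def length_regulate(hidden: torch.Tensor, durations: torch.Tensor) -> torch.Tensor:
    """Expand encoder states to frame level by repeating each state durations[i] times."""
    return torch.repeat_interleave(hidden, durations, dim=0)

phones = torch.randn(4, 8)              # 4 phoneme-level states, hidden size 8
durs = torch.tensor([3, 5, 2, 4])       # predicted number of frames per phoneme
frames = length_regulate(phones, durs)  # (14, 8) frame-level sequence for the decoder
print(frames.shape)
```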
1 code implementation • 3 Sep 2020 • Jiawei Chen, Xu Tan, Jian Luan, Tao Qin, Tie-Yan Liu
To tackle the difficulty of singing modeling caused by high sampling rate (wider frequency band and longer waveform), we introduce multi-scale adversarial training in both the acoustic model and vocoder to improve singing modeling.
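A minimal sketch of the multi-scale adversarial idea: identical discriminators score the waveform at progressively downsampled rates, so both fine and coarse structure are judged (layer sizes are placeholders, not the paper's configuration):

```python
import torch.nn as nn

class MultiScaleDiscriminator(nn.Module):
    """Sketch of multi-scale adversarial training: the same discriminator architecture
    is applied to the waveform at several downsampled resolutions."""
    def __init__(self, n_scales=3):
        super().__init__()
        self.pool = nn.AvgPool1d(4, stride=2, padding=1)   # halves the sampling rate per scale
        self.discriminators = nn.ModuleList(
            nn.Sequential(nn.Conv1d(1, 16, 15, stride=4, padding=7), nn.LeakyReLU(0.2),
                          nn.Conv1d(16, 1, 3, padding=1))
            for _ in range(n_scales)
        )

    def forward(self, wav):                      # wav: (batch, 1, samples)
        scores = []
        for d in self.discriminators:
            scores.append(d(wav))                # real/fake score map at this scale
            wav = self.pool(wav)                 # next discriminator sees a coarser waveform
        return scores
```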
no code implementations • 9 Jul 2020 • Yi Ren, Xu Tan, Tao Qin, Jian Luan, Zhou Zhao, Tie-Yan Liu
DeepSinger has several advantages over previous SVS systems: 1) to the best of our knowledge, it is the first SVS system that directly mines training data from music websites, 2) the lyrics-to-singing alignment model further avoids any human efforts for alignment labeling and greatly reduces labeling cost, 3) the singing model based on a feed-forward Transformer is simple and efficient, by removing the complicated acoustic feature modeling in parametric synthesis and leveraging a reference encoder to capture the timbre of a singer from noisy singing data, and 4) it can synthesize singing voices in multiple languages and multiple singers.
no code implementations • 18 Jun 2020 • Jie Wu, Jian Luan
This paper presents a high-quality singing synthesizer that is able to model a voice with limited available recordings.
no code implementations • 11 Jun 2020 • Peiling Lu, Jie Wu, Jian Luan, Xu Tan, Li Zhou
This paper presents XiaoiceSing, a high-quality singing voice synthesis system which employs an integrated network for spectrum, F0 and duration modeling.
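A hypothetical sketch of such an integrated network: a shared encoder feeding separate heads for spectrum, F0, and duration (for simplicity, all three heads read the same shared representation; layer sizes are placeholders, not XiaoiceSing's actual configuration):

```python
import torch.nn as nn

class IntegratedSVSModel(nn.Module):
    """Illustrative integrated SVS network: shared encoder, three prediction heads."""
    def __init__(self, in_dim=256, hidden=256, n_mels=80):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, hidden), nn.ReLU())
        self.spectrum_head = nn.Linear(hidden, n_mels)   # spectral features per step
        self.f0_head = nn.Linear(hidden, 1)              # fundamental frequency per step
        self.duration_head = nn.Linear(hidden, 1)        # (log) duration per input step

    def forward(self, x):
        h = self.encoder(x)
        return self.spectrum_head(h), self.f0_head(h), self.duration_head(h)
```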