1 code implementation • 24 Aug 2023 • Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie zhou
The experimental results demonstrate significant improvements in translation performance with SWIE based on BLOOMZ-3b, particularly in zero-shot and long text translations due to reduced instruction forgetting risk.
1 code implementation • 23 Aug 2023 • Yijin Liu, Xianfeng Zeng, Fandong Meng, Jie zhou
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization, through instruction fine-tuning.
1 code implementation • 6 Aug 2023 • Xianfeng Zeng, Yijin Liu, Fandong Meng, Jie zhou
To address this issue, we propose to utilize \textit{multiple references} to enhance the consistency between these metrics and human evaluations.
no code implementations • 4 May 2023 • Yijin Liu, Xianfeng Zeng, Fandong Meng, Jie zhou
Recently, DeepNorm scales Transformers into extremely deep (i. e., 1000 layers) and reveals the promising potential of deep scaling.
no code implementations • 7 Mar 2022 • Leyang Cui, Fandong Meng, Yijin Liu, Jie zhou, Yue Zhang
Although pre-trained sequence-to-sequence models have achieved great success in dialogue response generation, chatbots still suffer from generating inconsistent responses in real-world practice, especially in multi-turn settings.
1 code implementation • ACL 2022 • Songming Zhang, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jian Liu, Jie zhou
Token-level adaptive training approaches can alleviate the token imbalance problem and thus improve neural machine translation, through re-weighting the losses of different target tokens based on specific statistical metrics (e. g., token frequency or mutual information).
no code implementations • 1 Jan 2022 • Jizhou Li, Bin Chen, Guibin Zan, Guannan Qian, Piero Pianetta, Yijin Liu
Resolving morphological chemical phase transformations at the nanoscale is of vital importance to many scientific and industrial applications across various disciplines.
1 code implementation • EMNLP 2021 • Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie zhou
Its core motivation is to simulate the inference scene during training by replacing ground-truth tokens with predicted tokens, thus bridging the gap between training and inference.
no code implementations • WMT (EMNLP) 2021 • Xianfeng Zeng, Yijin Liu, Ernan Li, Qiu Ran, Fandong Meng, Peng Li, Jinan Xu, Jie zhou
This paper introduces WeChat AI's participation in WMT 2021 shared news translation task on English->Chinese, English->Japanese, Japanese->English and English->German.
1 code implementation • Findings (ACL) 2021 • Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie zhou
In this way, the model is exactly exposed to predicted tokens for high-confidence positions and still ground-truth tokens for low-confidence positions.
1 code implementation • ACL 2021 • Yangyifan Xu, Yijin Liu, Fandong Meng, Jiajun Zhang, Jinan Xu, Jie zhou
Recently, token-level adaptive training has achieved promising improvement in machine translation, where the cross-entropy loss function is adjusted by assigning different training weights to different tokens, in order to alleviate the token imbalance problem.
1 code implementation • ACL 2021 • Mengqi Miao, Fandong Meng, Yijin Liu, Xiao-Hua Zhou, Jie zhou
The Neural Machine Translation (NMT) model is essentially a joint language model conditioned on both the source sentence and partial translation.
no code implementations • WMT (EMNLP) 2020 • Fandong Meng, Jianhao Yan, Yijin Liu, Yuan Gao, Xianfeng Zeng, Qinsong Zeng, Peng Li, Ming Chen, Jie zhou, Sifan Liu, Hao Zhou
We participate in the WMT 2020 shared news translation task on Chinese to English.
no code implementations • 27 Apr 2020 • Yijin Liu, Fandong Meng, Jie zhou, Yufeng Chen, Jinan Xu
Depth-adaptive neural networks can dynamically adjust depths according to the hardness of input words, and thus improve efficiency.
1 code implementation • 29 Feb 2020 • Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie zhou
The Sentence-State LSTM (S-LSTM) is a powerful and high efficient graph recurrent network, which views words as nodes and performs layer-wise recurrent steps between them simultaneously.
2 code implementations • IJCNLP 2019 • Yijin Liu, Fandong Meng, Jinchao Zhang, Jie zhou, Yufeng Chen, Jinan Xu
Spoken Language Understanding (SLU) mainly involves two tasks, intent detection and slot filling, which are generally modeled jointly in existing works.
Ranked #1 on Slot Filling on CAIS
1 code implementation • ACL 2019 • Yijin Liu, Fandong Meng, Jinchao Zhang, Jinan Xu, Yufeng Chen, Jie zhou
Current state-of-the-art systems for sequence labeling are typically based on the family of Recurrent Neural Networks (RNNs).
Ranked #17 on Named Entity Recognition (NER) on CoNLL 2003 (English) (using extra training data)