1 code implementation • 23 Jul 2024 • Yijie Chen, Yijin Liu, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou
This study presents AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words), a benchmark that assesses gender bias beyond binary gender.
1 code implementation • 5 Jun 2024 • Zengkui Sun, Yijin Liu, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou
Multilingual neural machine translation models generally distinguish translation directions by the language tag (LT) in front of the source or target sentences.
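To make the mechanism concrete, below is a minimal sketch of prepending a target-language tag to a source sentence; the `<2xx>` tag format and the helper name are illustrative assumptions, not necessarily this paper's exact convention.

```python
def add_language_tag(source: str, target_lang: str) -> str:
    """Prepend a target-language tag so the model knows the translation direction.

    NOTE: the '<2xx>' format is an illustrative convention; real multilingual
    NMT systems differ in tag placement and format.
    """
    return f"<2{target_lang}> {source}"

# Route the same source sentence to two different target languages.
print(add_language_tag("How are you?", "de"))  # "<2de> How are you?"
print(add_language_tag("How are you?", "ja"))  # "<2ja> How are you?"
```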
1 code implementation • 5 Jun 2024 • Zengkui Sun, Yijin Liu, Jiaan Wang, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou
Consequently, on reasoning questions, we find that existing methods struggle to use the edited knowledge to reason toward the new answer and instead tend to retain outdated responses, which the original models generate from their original knowledge.
1 code implementation • 29 May 2024 • Chenze Shao, Fandong Meng, Yijin Liu, Jie Zhou
Leveraging this strategy, we train language generation models using two classic strictly proper scoring rules, the Brier score and the Spherical score, as alternatives to the logarithmic score.
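For intuition, here is a rough PyTorch sketch of per-token Brier and spherical losses over a softmax output; the exact formulations, scaling, and training strategy in the paper may differ.

```python
import torch
import torch.nn.functional as F

def brier_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Brier score loss: squared error between the predicted distribution
    and the one-hot target, summed over the vocabulary."""
    probs = F.softmax(logits, dim=-1)
    one_hot = F.one_hot(targets, num_classes=probs.size(-1)).float()
    return ((probs - one_hot) ** 2).sum(dim=-1).mean()

def spherical_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Negative spherical score: -p_y / ||p||_2, so minimizing the loss
    maximizes the (strictly proper) spherical score."""
    probs = F.softmax(logits, dim=-1)
    p_y = probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    return (-p_y / probs.norm(p=2, dim=-1)).mean()

# Toy usage: a batch of 4 target tokens over a vocabulary of 10.
logits = torch.randn(4, 10, requires_grad=True)
targets = torch.randint(0, 10, (4,))
print(brier_loss(logits, targets).item(), spherical_loss(logits, targets).item())
```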
1 code implementation • 11 Apr 2024 • Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
In this paper, we suggest that code comments are the natural logic pivot between natural language and code language and propose using comments to boost the code generation ability of code LLMs.
1 code implementation • 10 Apr 2024 • Yijin Liu, Fandong Meng, Jie Zhou
Recently, dynamic computation methods have shown notable acceleration for Large Language Models (LLMs) by skipping several layers of computations through elaborate heuristics or additional predictors.
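As a loose illustration of the general layer-skipping idea (not this paper's specific method), the toy PyTorch sketch below skips a layer whenever a simple learned gate score falls under a fixed threshold; the gating heuristic and threshold are assumptions made here for illustration.

```python
import torch
import torch.nn as nn

class SkippableStack(nn.Module):
    """Toy stack of transformer-style layers where each layer can be skipped
    based on a cheap per-layer gate. The gate (pooled hidden state through a
    linear layer vs. a fixed threshold) is purely illustrative."""

    def __init__(self, d_model: int = 64, num_layers: int = 6, threshold: float = 0.5):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            for _ in range(num_layers)
        )
        self.gates = nn.ModuleList(nn.Linear(d_model, 1) for _ in range(num_layers))
        self.threshold = threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer, gate in zip(self.layers, self.gates):
            score = torch.sigmoid(gate(x.mean(dim=1))).mean()
            if score < self.threshold:   # cheap heuristic: skip this layer entirely
                continue
            x = layer(x)
        return x

x = torch.randn(2, 8, 64)                # (batch, seq_len, d_model)
print(SkippableStack()(x).shape)
```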
1 code implementation • 24 Aug 2023 • Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
The experimental results demonstrate significant improvements in translation performance with SWIE based on BLOOMZ-3b, particularly in zero-shot and long-text translation, owing to the reduced risk of instruction forgetting.
1 code implementation • 23 Aug 2023 • Yijin Liu, Xianfeng Zeng, Fandong Meng, Jie Zhou
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization, through instruction fine-tuning.
1 code implementation • 6 Aug 2023 • Xianfeng Zeng, Yijin Liu, Fandong Meng, Jie Zhou
To address this issue, we propose to utilize multiple references to enhance the consistency between these metrics and human evaluations.
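For illustration, a small sketch of multi-reference scoring with sacreBLEU on toy data; this is not the paper's evaluation pipeline, and the sentences are invented.

```python
import sacrebleu

hypotheses = ["the cat sat on the mat"]
# Two reference translations per hypothesis; each inner list is one reference stream,
# aligned index-by-index with the hypotheses.
references = [
    ["the cat sat on the mat"],
    ["a cat was sitting on the mat"],
]

# With multiple references, n-gram matches against any reference are credited,
# which tends to track human judgments more closely than a single reference.
score = sacrebleu.corpus_bleu(hypotheses, references)
print(score.score)
```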
no code implementations • 4 May 2023 • Yijin Liu, Xianfeng Zeng, Fandong Meng, Jie Zhou
Recently, DeepNorm has scaled Transformers to extreme depths (i.e., 1000 layers), revealing the promising potential of deep scaling.
no code implementations • 7 Mar 2022 • Leyang Cui, Fandong Meng, Yijin Liu, Jie Zhou, Yue Zhang
Although pre-trained sequence-to-sequence models have achieved great success in dialogue response generation, chatbots still suffer from generating inconsistent responses in real-world practice, especially in multi-turn settings.
1 code implementation • ACL 2022 • Songming Zhang, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jian Liu, Jie Zhou
Token-level adaptive training approaches can alleviate the token imbalance problem and thus improve neural machine translation by re-weighting the losses of different target tokens based on specific statistical metrics (e.g., token frequency or mutual information).
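A hedged PyTorch sketch of token-level loss re-weighting driven by corpus token frequency follows; the inverse log-frequency weighting is an illustrative stand-in, not the specific metric proposed in this paper.

```python
import torch
import torch.nn.functional as F

def frequency_weighted_ce(logits: torch.Tensor,
                          targets: torch.Tensor,
                          token_counts: torch.Tensor) -> torch.Tensor:
    """Cross-entropy where each target token's loss is re-weighted.

    Rare tokens get larger weights via an inverse log-frequency heuristic;
    actual adaptive-training methods use more principled statistics
    (frequency- or mutual-information-based weights).
    """
    per_token_loss = F.cross_entropy(logits, targets, reduction="none")
    weights = 1.0 / torch.log(token_counts[targets].float() + 2.0)
    weights = weights / weights.mean()        # keep the overall loss scale stable
    return (weights * per_token_loss).mean()

# Toy usage: vocabulary of 10, 5 target tokens in the batch.
logits = torch.randn(5, 10)
targets = torch.randint(0, 10, (5,))
token_counts = torch.randint(1, 1000, (10,))  # corpus frequency per vocab id
print(frequency_weighted_ce(logits, targets, token_counts).item())
```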
no code implementations • 1 Jan 2022 • Jizhou Li, Bin Chen, Guibin Zan, Guannan Qian, Piero Pianetta, Yijin Liu
Resolving morphological chemical phase transformations at the nanoscale is of vital importance to many scientific and industrial applications across various disciplines.
1 code implementation • EMNLP 2021 • Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
Its core motivation is to simulate the inference scene during training by replacing ground-truth tokens with predicted tokens, thus bridging the gap between training and inference.
no code implementations • WMT (EMNLP) 2021 • Xianfeng Zeng, Yijin Liu, Ernan Li, Qiu Ran, Fandong Meng, Peng Li, Jinan Xu, Jie Zhou
This paper introduces WeChat AI's participation in the WMT 2021 shared news translation task on English->Chinese, English->Japanese, Japanese->English, and English->German.
1 code implementation • Findings (ACL) 2021 • Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
In this way, the model is exactly exposed to predicted tokens for high-confidence positions and still ground-truth tokens for low-confidence positions.
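A rough sketch of that confidence-based mixing, assuming a simple fixed threshold; the actual confidence estimate and sampling schedule in the paper may differ.

```python
import torch
import torch.nn.functional as F

def mix_inputs(gold_tokens: torch.Tensor,
               logits: torch.Tensor,
               threshold: float = 0.9) -> torch.Tensor:
    """Build the next decoder input by keeping the model's own prediction at
    positions where it is confident, and falling back to the ground-truth token
    elsewhere. The threshold and rule here are illustrative assumptions."""
    probs = F.softmax(logits, dim=-1)
    confidence, predicted = probs.max(dim=-1)
    return torch.where(confidence >= threshold, predicted, gold_tokens)

# Toy usage: 6 positions, vocabulary of 10.
gold = torch.randint(0, 10, (6,))
logits = torch.randn(6, 10)
print(mix_inputs(gold, logits))
```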
1 code implementation • ACL 2021 • Yangyifan Xu, Yijin Liu, Fandong Meng, Jiajun Zhang, Jinan Xu, Jie Zhou
Recently, token-level adaptive training has achieved promising improvement in machine translation, where the cross-entropy loss function is adjusted by assigning different training weights to different tokens, in order to alleviate the token imbalance problem.
1 code implementation • ACL 2021 • Mengqi Miao, Fandong Meng, Yijin Liu, Xiao-Hua Zhou, Jie Zhou
The Neural Machine Translation (NMT) model is essentially a joint language model conditioned on both the source sentence and partial translation.
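In this view, the translation probability factorizes autoregressively as P(y | x) = ∏_{t=1}^{T} P(y_t | y_{<t}, x), so every decoding step is a language-model prediction conditioned on the source sentence and the partial translation produced so far.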
no code implementations • WMT (EMNLP) 2020 • Fandong Meng, Jianhao Yan, Yijin Liu, Yuan Gao, Xianfeng Zeng, Qinsong Zeng, Peng Li, Ming Chen, Jie Zhou, Sifan Liu, Hao Zhou
We participate in the WMT 2020 shared news translation task on Chinese to English.
no code implementations • 27 Apr 2020 • Yijin Liu, Fandong Meng, Jie Zhou, Yufeng Chen, Jinan Xu
Depth-adaptive neural networks can dynamically adjust depths according to the hardness of input words, and thus improve efficiency.
1 code implementation • 29 Feb 2020 • Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
The Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network, which views words as nodes and performs layer-wise recurrent steps between them simultaneously.
2 code implementations • IJCNLP 2019 • Yijin Liu, Fandong Meng, Jinchao Zhang, Jie Zhou, Yufeng Chen, Jinan Xu
Spoken Language Understanding (SLU) mainly involves two tasks, intent detection and slot filling, which are generally modeled jointly in existing works.
Ranked #1 on Slot Filling on CAIS
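As a rough sketch of the joint setup described above (not this paper's architecture), here is a minimal model with a shared BiLSTM encoder, an utterance-level intent head, and a token-level slot head; all sizes are arbitrary assumptions.

```python
import torch
import torch.nn as nn

class JointSLU(nn.Module):
    """Toy joint model: a shared BiLSTM encoder feeds both a sentence-level
    intent classifier and a token-level slot tagger."""

    def __init__(self, vocab_size=1000, num_intents=7, num_slots=20, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.intent_head = nn.Linear(2 * dim, num_intents)
        self.slot_head = nn.Linear(2 * dim, num_slots)

    def forward(self, token_ids: torch.Tensor):
        hidden, _ = self.encoder(self.embed(token_ids))        # (B, T, 2*dim)
        intent_logits = self.intent_head(hidden.mean(dim=1))   # pooled utterance
        slot_logits = self.slot_head(hidden)                   # per-token tags
        return intent_logits, slot_logits

tokens = torch.randint(0, 1000, (2, 12))       # batch of 2 utterances, 12 tokens each
intent_logits, slot_logits = JointSLU()(tokens)
print(intent_logits.shape, slot_logits.shape)  # (2, 7) and (2, 12, 20)
```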
1 code implementation • ACL 2019 • Yijin Liu, Fandong Meng, Jinchao Zhang, Jinan Xu, Yufeng Chen, Jie Zhou
Current state-of-the-art systems for sequence labeling are typically based on the family of Recurrent Neural Networks (RNNs).
Ranked #20 on Named Entity Recognition (NER) on CoNLL 2003 (English) (using extra training data)