1 code implementation • COLING 2022 • Chenxu Yang, Zheng Lin, Jiangnan Li, Fandong Meng, Weiping Wang, Lanrui Wang, Jie Zhou
The knowledge selector generally constructs a query based on the dialogue context and selects the most appropriate knowledge to help response generation.
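As an illustrative sketch of this selection step (function and variable names are ours, not the paper's), candidate knowledge can be scored against a context-derived query by embedding similarity:

    import numpy as np

    def select_knowledge(context_vec, candidate_vecs):
        # Normalize the context query and the candidate knowledge embeddings,
        # then pick the candidate with the highest cosine similarity.
        q = context_vec / np.linalg.norm(context_vec)
        K = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
        scores = K @ q
        return int(np.argmax(scores)), scores

    # Toy usage: three knowledge candidates in a 4-dimensional embedding space.
    best, scores = select_knowledge(np.array([1.0, 0.0, 0.5, 0.2]),
                                    np.random.rand(3, 4))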
no code implementations • EMNLP 2020 • Xiuyi Chen, Fandong Meng, Peng Li, Feilong Chen, Shuang Xu, Bo Xu, Jie Zhou
Here, we deal with these issues from two aspects: (1) we enhance the prior selection module with the necessary posterior information obtained from the specially designed Posterior Information Prediction Module (PIPM); (2) we propose a Knowledge Distillation Based Training Strategy (KDBTS) to train the decoder with the knowledge selected from the prior distribution, removing the exposure bias of knowledge selection.
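One common way to realize such prior-to-posterior distillation is a KL term over the knowledge candidates; the following is a minimal PyTorch-style sketch under our own naming, not the authors' released code:

    import torch
    import torch.nn.functional as F

    def knowledge_distillation_loss(prior_logits, posterior_logits):
        # KL(posterior || prior) over knowledge candidates: the prior selector,
        # which cannot see the response, learns to mimic the posterior selector,
        # which can. F.kl_div expects log-probs as input and probs as target.
        log_prior = F.log_softmax(prior_logits, dim=-1)
        posterior = F.softmax(posterior_logits.detach(), dim=-1)
        return F.kl_div(log_prior, posterior, reduction="batchmean")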
no code implementations • 28 Oct 2024 • Meiqi Chen, Fandong Meng, Yingxue Zhang, Yan Zhang, Jie Zhou
In this paper, we propose CRAT, a novel multi-agent translation framework that leverages RAG and causality-enhanced self-reflection to address these challenges.
1 code implementation • 22 Oct 2024 • Yuxian Gu, Hao Zhou, Fandong Meng, Jie Zhou, Minlie Huang
For effectiveness, MiniPLM leverages the differences between large and small LMs to enhance the difficulty and diversity of the training data, helping student LMs acquire versatile and sophisticated knowledge.
no code implementations • 11 Oct 2024 • Xiangyu Hong, Che Jiang, Biqing Qi, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou
We further demonstrate the correlation between the efficiency of length extrapolation and the extension of the high-dimensional attention allocation of these heads.
1 code implementation • 10 Oct 2024 • Yutong Wang, Jiali Zeng, Xuebo Liu, Derek F. Wong, Fandong Meng, Jie Zhou, Min Zhang
Large language models (LLMs) have achieved reasonable quality improvements in machine translation (MT).
no code implementations • 30 Sep 2024 • Wenchao Chen, Liqiang Niu, Ziyao Lu, Fandong Meng, Jie Zhou
Image generation models have encountered challenges related to scalability and quadratic complexity, primarily due to the reliance on Transformer-based backbones.
1 code implementation • 20 Sep 2024 • Zhibin Lan, Liqiang Niu, Fandong Meng, Wenbo Li, Jie Zhou, Jinsong Su
Recently, when dealing with high-resolution images, dominant LMMs usually divide them into multiple local images and one global image, which leads to a large number of visual tokens.
1 code implementation • 23 Jul 2024 • Yijie Chen, Yijin Liu, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou
This study presents a benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words), which assesses gender bias beyond binary gender.
1 code implementation • 17 Jul 2024 • Chenze Shao, Fandong Meng, Jie Zhou
As Large Language Models (LLMs) achieve remarkable progress in language understanding and generation, their training efficiency has become a critical concern.
1 code implementation • 3 Jul 2024 • Zhibin Lan, Liqiang Niu, Fandong Meng, Jie Zhou, Min Zhang, Jinsong Su
Among them, the target text decoder is used to alleviate the language alignment burden, and the image tokenizer converts long sequences of pixels into shorter sequences of visual tokens, preventing the model from focusing on low-level visual features.
no code implementations • 24 Jun 2024 • Xue Zhang, Yunlong Liang, Fandong Meng, Songming Zhang, Yufeng Chen, Jinan Xu, Jie Zhou
Multilingual knowledge editing (MKE) aims to simultaneously revise factual knowledge across multiple languages within large language models (LLMs).
1 code implementation • 24 Jun 2024 • Kunting Li, Yong Hu, Liang He, Fandong Meng, Jie Zhou
To address this issue, we propose C-LLM, a Large Language Model-based Chinese Spell Checking method that learns to check errors Character by Character.
1 code implementation • 12 Jun 2024 • Yutong Wang, Jiali Zeng, Xuebo Liu, Fandong Meng, Jie Zhou, Min Zhang
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
1 code implementation • 5 Jun 2024 • Zengkui Sun, Yijin Liu, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou
Multilingual neural machine translation models generally distinguish translation directions by the language tag (LT) in front of the source or target sentences.
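To make the LT mechanism concrete (the tag format shown is illustrative; real systems vary), a target-language tag is simply prepended before encoding:

    def add_language_tag(src_sentence, tgt_lang):
        # Prepend a target-language tag so one multilingual model knows
        # which translation direction is requested.
        return f"<2{tgt_lang}> {src_sentence}"

    print(add_language_tag("Guten Morgen", "en"))  # -> "<2en> Guten Morgen"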
1 code implementation • 5 Jun 2024 • Zengkui Sun, Yijin Liu, Jiaan Wang, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou
Consequently, on the reasoning questions, we find that existing methods struggle to utilize the edited knowledge to reason out the new answer, and tend to retain outdated responses generated by the original models from the original knowledge.
1 code implementation • 3 Jun 2024 • Yongjing Yin, Jiali Zeng, Yafu Li, Fandong Meng, Yue Zhang
The fine-tuning of open-source large language models (LLMs) for machine translation has recently received considerable attention, marking a shift towards data-centric research from traditional neural machine translation.
1 code implementation • 29 May 2024 • Chenze Shao, Fandong Meng, Yijin Liu, Jie Zhou
Leveraging this strategy, we train language generation models using two classic strictly proper scoring rules, the Brier score and the Spherical score, as alternatives to the logarithmic score.
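For concreteness, the three scoring rules evaluate a predicted distribution p at the gold class y as follows (a worked sketch in our notation; training maximizes the expected score):

    import numpy as np

    def proper_scores(p, y):
        # Logarithmic score: log p_y (the usual cross-entropy objective).
        log_score = np.log(p[y])
        # Brier score (as a reward): 2*p_y - sum_i p_i^2 - 1.
        brier = 2 * p[y] - np.sum(p ** 2) - 1
        # Spherical score: p_y / ||p||_2.
        spherical = p[y] / np.linalg.norm(p)
        return log_score, brier, spherical

    print(proper_scores(np.array([0.7, 0.2, 0.1]), y=0))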
no code implementations • 29 May 2024 • Chenze Shao, Fandong Meng, Jiali Zeng, Jie Zhou
Building upon this analysis, we propose employing the confidence of predicting EOS as a detector for under-translation, and strengthening the confidence-based penalty to penalize candidates with a high risk of under-translation.
1 code implementation • 11 Apr 2024 • Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
In this paper, we suggest that code comments are the natural logic pivot between natural language and code language and propose using comments to boost the code generation ability of code LLMs.
1 code implementation • 10 Apr 2024 • Yijin Liu, Fandong Meng, Jie Zhou
Recently, dynamic computation methods have shown notable acceleration for Large Language Models (LLMs) by skipping several layers of computations through elaborate heuristics or additional predictors.
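A rough sketch of the early-exit family such methods belong to (interfaces hypothetical; this paper's own criterion differs in detail): stop running layers once an auxiliary head is confident enough:

    import torch

    def forward_with_early_exit(h, layers, exit_heads, threshold=0.9):
        # Run decoder layers sequentially; after each, an auxiliary exit head
        # scores the last position, and we skip the rest once it is confident.
        for layer, head in zip(layers, exit_heads):
            h = layer(h)
            probs = torch.softmax(head(h[:, -1]), dim=-1)
            if probs.max().item() >= threshold:
                break
        return h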
1 code implementation • 29 Mar 2024 • Che Jiang, Biqing Qi, Xiangyu Hong, Dayuan Fu, Yang Cheng, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou
We reveal the different dynamics of the output token probabilities along the depths of layers between the correct and hallucinated cases.
1 code implementation • 28 Feb 2024 • Shicheng Xu, Liang Pang, Mo Yu, Fandong Meng, Huawei Shen, Xueqi Cheng, Jie Zhou
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating additional information from retrieval.
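As a minimal illustration of the RAG pattern under study (the prompt template is ours), retrieved passages are placed in the prompt ahead of the question:

    def build_rag_prompt(question, passages):
        # Number the retrieved passages and prepend them so the LLM can
        # ground its answer in the external evidence.
        context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
        return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"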
2 code implementations • 31 Jan 2024 • Chujie Zheng, Fan Yin, Hao Zhou, Fandong Meng, Jie Zhou, Kai-Wei Chang, Minlie Huang, Nanyun Peng
In this work, we investigate how LLMs' behavior (i.e., complying with or refusing user queries) is affected by safety prompts from the perspective of model representation.
1 code implementation • 16 Jan 2024 • Xinwei Long, Jiali Zeng, Fandong Meng, Zhiyuan Ma, Kaiyan Zhang, Bowen Zhou, Jie Zhou
Knowledge retrieval with multi-modal queries plays a crucial role in supporting knowledge-intensive multi-modal applications.
no code implementations • 14 Nov 2023 • Yi Liu, Lianzhe Huang, Shicheng Li, Sishuo Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
Therefore, to evaluate the ability of LLMs to discern the reliability of external knowledge, we create a benchmark from existing knowledge bases.
1 code implementation • 14 Nov 2023 • Kunting Li, Yong Hu, Shaolei Wang, Hanhan Ma, Liang He, Fandong Meng, Jie Zhou
However, in the Chinese Spelling Correction (CSC) task, we observe a discrepancy: while ChatGPT performs well under human evaluation, it scores poorly according to traditional metrics.
no code implementations • 8 Nov 2023 • Zhen Yang, Yingxue Zhang, Fandong Meng, Jie Zhou
Specifically, for the input from any modality, TEAL first discretizes it into a token sequence with an off-the-shelf tokenizer and embeds the token sequence into a joint embedding space with a learnable embedding matrix.
1 code implementation • 6 Nov 2023 • Jiali Zeng, Fandong Meng, Yongjing Yin, Jie Zhou
Contemporary translation engines based on the encoder-decoder framework have made significant strides in development.
no code implementations • 3 Nov 2023 • Shicheng Xu, Liang Pang, Jiangnan Li, Mo Yu, Fandong Meng, Huawei Shen, Xueqi Cheng, Jie Zhou
Readers usually only give an abstract and vague description as the query based on their own understanding, summaries, or speculations of the plot, which requires the retrieval model to have a strong ability to estimate the abstract semantic associations between the query and candidate plots.
1 code implementation • 9 Oct 2023 • Yun Luo, Zhen Yang, Fandong Meng, Yingjie Li, Fang Guo, Qinglin Qi, Jie Zhou, Yue Zhang
Active learning (AL), which aims to construct an effective training set by iteratively curating the most informative unlabeled data for annotation, has been widely used in low-resource tasks.
1 code implementation • 8 Oct 2023 • Yun Luo, Zhen Yang, Fandong Meng, Yingjie Li, Jie Zhou, Yue Zhang
However, we observe that merely concatenating sentences in a contextual window does not fully utilize contextual information and can sometimes lead to excessive attention on less informative sentences.
2 code implementations • 16 Sep 2023 • Jiaan Wang, Yunlong Liang, Zengkui Sun, Yuxuan Cao, Jiarong Xu, Fandong Meng
With the recent advancements in large language models (LLMs), knowledge editing has been shown as a promising technique to adapt LLMs to new knowledge without retraining from scratch.
1 code implementation • 9 Sep 2023 • Yifan Dong, Suhang Wu, Fandong Meng, Jie Zhou, Xiaoli Wang, Jianxin Lin, Jinsong Su
2) the input text and image are often not perfectly matched, and thus the image may introduce noise into the model.
1 code implementation • 7 Sep 2023 • Chujie Zheng, Hao Zhou, Fandong Meng, Jie Zhou, Minlie Huang
This work shows that modern LLMs are vulnerable to option position changes in MCQs due to their inherent "selection bias", namely, they prefer to select specific option IDs as answers (like "Option A").
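One illustrative way to measure and cancel such positional preference (a sketch of the general idea, not necessarily the paper's exact debiasing procedure) is to average each option's score over cyclic orderings:

    def debias_by_rotation(score_fn, question, options):
        # score_fn(question, ordering) is a hypothetical callable returning one
        # score per option slot; rotating the options and averaging removes any
        # preference the model has for fixed option IDs.
        n = len(options)
        totals = [0.0] * n
        for shift in range(n):
            ordering = options[shift:] + options[:shift]
            slot_scores = score_fn(question, ordering)
            for slot in range(n):
                totals[(shift + slot) % n] += slot_scores[slot]
        return [t / n for t in totals]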
1 code implementation • 24 Aug 2023 • Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
The experimental results demonstrate significant improvements in translation performance with SWIE based on BLOOMZ-3b, particularly in zero-shot and long text translations due to reduced instruction forgetting risk.
1 code implementation • 23 Aug 2023 • Yijin Liu, Xianfeng Zeng, Fandong Meng, Jie Zhou
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization, through instruction fine-tuning.
1 code implementation • 17 Aug 2023 • Yun Luo, Zhen Yang, Fandong Meng, Yafu Li, Jie Zhou, Yue Zhang
Catastrophic forgetting (CF) is a phenomenon that occurs in machine learning when a model forgets previously learned information while acquiring new knowledge.
1 code implementation • 6 Aug 2023 • Xianfeng Zeng, Yijin Liu, Fandong Meng, Jie Zhou
To address this issue, we propose to utilize multiple references to enhance the consistency between these metrics and human evaluations.
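Metric toolkits already accept multiple references; for example, with sacrebleu (a usage sketch), each reference stream is a parallel list:

    import sacrebleu

    hyps = ["the cat sat on the mat"]
    refs_a = ["the cat sat on the mat"]          # first reference per segment
    refs_b = ["a cat was sitting on the mat"]    # second reference per segment

    # corpus_bleu takes the hypotheses plus a list of reference streams.
    bleu = sacrebleu.corpus_bleu(hyps, [refs_a, refs_b])
    print(bleu.score)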
1 code implementation • 29 Jul 2023 • Lean Wang, Wenkai Yang, Deli Chen, Hao Zhou, Yankai Lin, Fandong Meng, Jie Zhou, Xu Sun
As large language models (LLMs) generate texts with increasing fluency and realism, there is a growing need to identify the source of texts to prevent the abuse of LLMs.
1 code implementation • 10 Jul 2023 • Jiali Zeng, Fandong Meng, Yongjing Yin, Jie Zhou
Open-sourced large language models (LLMs) have demonstrated remarkable efficacy in various tasks with instruction tuning.
no code implementations • 13 Jun 2023 • Jiali Zeng, Yufan Jiang, Yongjing Yin, Yi Jing, Fandong Meng, Binghuai Lin, Yunbo Cao, Jie Zhou
Multilingual pre-trained language models have demonstrated impressive (zero-shot) cross-lingual transfer abilities; however, their performance is hindered when the target language is typologically distant from the source languages or when pre-training data is limited in size.
2 code implementations • 23 May 2023 • Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
In-context learning (ICL) emerges as a promising capability of large language models (LLMs) by providing them with demonstration examples to perform diverse tasks.
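As a minimal illustration of the ICL setup being analyzed, demonstrations are concatenated ahead of the query and the model continues the pattern:

    def build_icl_prompt(demos, query):
        # demos is a list of (input, label) pairs; the LLM is expected to
        # emit the label for the final, unanswered input.
        lines = [f"Input: {x}\nLabel: {y}" for x, y in demos]
        lines.append(f"Input: {query}\nLabel:")
        return "\n".join(lines)

    print(build_icl_prompt([("great movie", "positive"),
                            ("dull plot", "negative")], "a wonderful surprise"))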
1 code implementation • 22 May 2023 • Yunlong Liang, Fandong Meng, Jiaan Wang, Jinan Xu, Yufeng Chen, Jie Zhou
Further, we propose a dual knowledge distillation and target-oriented vision modeling framework for the M^3S task.
no code implementations • 20 May 2023 • Yun Luo, Xiaotian Lin, Zhen Yang, Fandong Meng, Jie Zhou, Yue Zhang
Adapting the decision boundary to new representations is seldom considered, so in this paper we propose a Supervised Contrastive learning framework with an adaptive classification criterion for Continual Learning (SCCL). In our method, a contrastive loss is used to directly learn representations for different tasks, and a limited number of data samples are saved as the classification criterion.
1 code implementation • 17 May 2023 • Mo Yu, Jiangnan Li, Shunyu Yao, Wenjie Pang, Xiaochen Zhou, Zhou Xiao, Fandong Meng, Jie Zhou
As readers engage with a story, their understanding of a character evolves based on new events and information; and multiple fine-grained aspects of personalities can be perceived.
no code implementations • 16 May 2023 • Jiaan Wang, Fandong Meng, Duo Zheng, Yunlong Liang, Zhixu Li, Jianfeng Qu, Jie Zhou
In this paper, we aim to unify MLS and CLS into a more general setting, i.e., many-to-many summarization (M2MS), where a single model could process documents in any language and generate their summaries also in any language.
no code implementations • 13 May 2023 • Chulun Zhou, Yunlong Liang, Fandong Meng, Jinan Xu, Jinsong Su, Jie Zhou
In this paper, we propose Regularized Contrastive Cross-lingual Cross-modal (RC^3) pre-training, which further exploits more abundant weakly-aligned multilingual image-text pairs.
no code implementations • 11 May 2023 • Mingliang Zhang, Zhen Cao, Juntao Liu, Liqiang Niu, Fandong Meng, Jie Zhou
Our approach effectively demonstrates the benefits of combining query-based and anchor-free models for achieving robust layout segmentation in corporate documents.
no code implementations • 10 May 2023 • Yun Luo, Zhen Yang, Xuefeng Bai, Fandong Meng, Jie Zhou, Yue Zhang
Intuitively, the representation forgetting can influence the general knowledge stored in pre-trained language models (LMs), but the concrete effect is still unclear.
no code implementations • 8 May 2023 • Zhiyuan Zhang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
To address this issue, we propose the Fine-purifying approach, which utilizes diffusion theory to study the dynamic process of fine-tuning and find potentially poisonous dimensions.
no code implementations • 4 May 2023 • Yijin Liu, Xianfeng Zeng, Fandong Meng, Jie Zhou
Recently, DeepNorm has scaled Transformers to extreme depths (i.e., 1000 layers), revealing the promising potential of deep scaling.
no code implementations • 4 May 2023 • Yunlong Liang, Fandong Meng, Jinan Xu, Jiaan Wang, Yufeng Chen, Jie Zhou
Specifically, we propose a "versatile" model, i.e., Unified Model Learning for NMT (UMLNMT), which works with data from different tasks and can translate well in multiple settings simultaneously, theoretically in as many settings as desired.
1 code implementation • 7 Mar 2023 • Jiaan Wang, Yunlong Liang, Fandong Meng, Zengkui Sun, Haoxiang Shi, Zhixu Li, Jinan Xu, Jianfeng Qu, Jie Zhou
In detail, we regard ChatGPT as a human evaluator and give task-specific (e.g., summarization) and aspect-specific (e.g., relevance) instructions to prompt ChatGPT to evaluate the generated results of NLG models.
no code implementations • 28 Feb 2023 • Jiaan Wang, Yunlong Liang, Fandong Meng, Beiqi Zou, Zhixu Li, Jianfeng Qu, Jie Zhou
Given a document in a source language, cross-lingual summarization (CLS) aims to generate a summary in a different target language.
no code implementations • 27 Jan 2023 • Chulun Zhou, Yunlong Liang, Fandong Meng, Jie Zhou, Jinan Xu, Hongji Wang, Min Zhang, Jinsong Su
To address these issues, in this paper, we propose a multi-task multi-stage transitional (MMT) training framework, where an NCT model is trained using the bilingual chat translation dataset and additional monolingual dialogues.
no code implementations • 25 Jan 2023 • Wenkai Yang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively in a data privacy-preserving manner.
1 code implementation • 15 Dec 2022 • Yunlong Liang, Fandong Meng, Jinan Xu, Jiaan Wang, Yufeng Chen, Jie Zhou
However, less attention has been paid to the visual features from the perspective of the summary, which may limit the model performance, especially in the low- and zero-resource scenarios.
no code implementations • 14 Dec 2022 • Jiaan Wang, Fandong Meng, Yunlong Liang, Tingyi Zhang, Jiarong Xu, Zhixu Li, Jie Zhou
In detail, we find that (1) the translationese in documents or summaries of test sets might lead to the discrepancy between human judgment and automatic evaluation; (2) the translationese in training sets would harm model performance in real-world applications; (3) though machine-translated documents involve translationese, they are very useful for building CLS systems on low-resource languages under specific training strategies.
no code implementations • 8 Dec 2022 • Jianhao Yan, Jin Xu, Fandong Meng, Jie Zhou, Yue Zhang
In this work, we show that the issue arises from the inconsistency of label smoothing between the token-level and sequence-level distributions.
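For reference, the token-level label-smoothed objective in question has the standard form below (our rendering; eps and tensor shapes are illustrative):

    import torch
    import torch.nn.functional as F

    def label_smoothed_nll(logits, target, eps=0.1):
        # (1 - eps) weight on the gold token plus eps spread uniformly over
        # the vocabulary, applied independently at every target position.
        log_probs = F.log_softmax(logits, dim=-1)
        nll = -log_probs.gather(-1, target.unsqueeze(-1)).squeeze(-1)
        uniform = -log_probs.mean(dim=-1)
        return ((1 - eps) * nll + eps * uniform).mean()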
no code implementations • 30 Nov 2022 • Zhen Yang, Fandong Meng, Yingxue Zhang, Ernan Li, Jie Zhou
We report the result of the first edition of the WMT shared task on Translation Suggestion (TS).
no code implementations • 28 Nov 2022 • Yunlong Liang, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou
Our systems achieve 0.810 and 0.946 COMET scores.
no code implementations • 28 Nov 2022 • Ernan Li, Fandong Meng, Jie Zhou
This paper introduces WeChat's participation in WMT 2022 shared biomedical translation task on Chinese to English.
1 code implementation • 16 Nov 2022 • Yong Hu, Fandong Meng, Jie Zhou
In this paper, we present CSCD-NS, the first Chinese spelling check (CSC) dataset designed for native speakers, containing 40,000 samples from a Chinese social platform.
no code implementations • 26 Oct 2022 • Jiangnan Li, Mo Yu, Fandong Meng, Zheng Lin, Peng Fu, Weiping Wang, Jie Zhou
Although these tasks are effective, there are still pressing problems: (1) randomly masking speakers regardless of the question cannot map the speaker mentioned in the question to the corresponding speaker in the dialogue, and ignores the speaker-centric nature of utterances.
1 code implementation • 21 Oct 2022 • Lanrui Wang, Jiangnan Li, Zheng Lin, Fandong Meng, Chenxu Yang, Weiping Wang, Jie Zhou
We use a fine-grained encoding strategy which is more sensitive to the emotion dynamics (emotion flow) in the conversations to predict the emotion-intent characteristic of response.
3 code implementations • 17 Oct 2022 • Hui Jiang, Ziyao Lu, Fandong Meng, Chulun Zhou, Jie Zhou, Degen Huang, Jinsong Su
Meanwhile, we inject two types of perturbations into the retrieved pairs for robust training.
no code implementations • COLING 2022 • Yongjing Yin, Yafu Li, Fandong Meng, Jie Zhou, Yue Zhang
Modern neural machine translation (NMT) models have achieved competitive performance in standard benchmarks.
1 code implementation • 11 Oct 2022 • Yuanxin Liu, Fandong Meng, Zheng Lin, Jiangnan Li, Peng Fu, Yanan Cao, Weiping Wang, Jie Zhou
In response to the efficiency problem, recent studies show that dense PLMs can be replaced with sparse subnetworks without hurting the performance.
1 code implementation • 10 Oct 2022 • Qingyi Si, Fandong Meng, Mingyu Zheng, Zheng Lin, Yuanxin Liu, Peng Fu, Yanan Cao, Weiping Wang, Jie Zhou
To overcome this limitation, we propose a new dataset that considers varying types of shortcuts by constructing different distribution shifts in multiple OOD test sets.
1 code implementation • 10 Oct 2022 • Qingyi Si, Yuanxin Liu, Fandong Meng, Zheng Lin, Peng Fu, Yanan Cao, Weiping Wang, Jie Zhou
However, these models reveal a trade-off that the improvements on OOD data severely sacrifice the performance on the in-distribution (ID) data (which is dominated by the biased samples).
1 code implementation • 9 Oct 2022 • Siyu Lai, Zhen Yang, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
Word alignment, which aims to extract lexical translation equivalents between source and target sentences, serves as a fundamental tool for natural language processing.
1 code implementation • 13 Sep 2022 • Zhen Yang, Fandong Meng, Yuanmeng Yan, Jie Zhou
While the post-editing effort can be used to measure the translation quality to some extent, we find it usually conflicts with the human judgement on whether the word is well or poorly translated.
no code implementations • 25 Jun 2022 • Jianhao Yan, Fandong Meng, Jie Zhou
Hallucination, a kind of pathological translation that plagues Neural Machine Translation, has recently drawn much attention.
1 code implementation • ACL 2022 • Yunlong Liang, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou
Neural Chat Translation (NCT) aims to translate conversational text into different languages.
1 code implementation • 2 May 2022 • Jiangnan Li, Fandong Meng, Zheng Lin, Rui Liu, Peng Fu, Yanan Cao, Weiping Wang, Jie Zhou
Conversational Causal Emotion Entailment aims to detect causal utterances for a non-neutral targeted utterance from a conversation.
Ranked #1 on Causal Emotion Entailment on RECCON
1 code implementation • NAACL 2022 • Yuanxin Liu, Fandong Meng, Zheng Lin, Peng Fu, Yanan Cao, Weiping Wang, Jie Zhou
Firstly, we discover that the success of magnitude pruning can be attributed to the preserved pre-training performance, which correlates with the downstream transferability.
1 code implementation • NAACL 2022 • Siyu Lai, Zhen Yang, Fandong Meng, Xue Zhang, Yufeng Chen, Jinan Xu, Jie Zhou
Generating adversarial examples for Neural Machine Translation (NMT) with single Round-Trip Translation (RTT) has achieved promising results by releasing the meaning-preserving restriction.
no code implementations • 23 Mar 2022 • Jiaan Wang, Fandong Meng, Duo Zheng, Yunlong Liang, Zhixu Li, Jianfeng Qu, Jie Zhou
Cross-lingual summarization is the task of generating a summary in one language (e.g., English) for the given document(s) in a different language (e.g., Chinese).
1 code implementation • 16 Mar 2022 • Duo Zheng, Fandong Meng, Qingyi Si, Hairun Fan, Zipeng Xu, Jie Zhou, Fangxiang Feng, Xiaojie Wang
Visual dialog has witnessed great progress since various vision-oriented goals were introduced into the conversation, notably GuessWhich and GuessWhat, where the single image is visible to only one of, or to both of, the questioner and the answerer, respectively.
1 code implementation • ACL 2022 • Yunlong Liang, Fandong Meng, Chulun Zhou, Jinan Xu, Yufeng Chen, Jinsong Su, Jie Zhou
The goal of cross-lingual summarization (CLS) is to convert a document in one language (e.g., English) to a summary in another one (e.g., Chinese).
no code implementations • 7 Mar 2022 • Leyang Cui, Fandong Meng, Yijin Liu, Jie Zhou, Yue Zhang
Although pre-trained sequence-to-sequence models have achieved great success in dialogue response generation, chatbots still suffer from generating inconsistent responses in real-world practice, especially in multi-turn settings.
1 code implementation • ACL 2022 • Songming Zhang, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jian Liu, Jie Zhou
Token-level adaptive training approaches can alleviate the token imbalance problem and thus improve neural machine translation, through re-weighting the losses of different target tokens based on specific statistical metrics (e.g., token frequency or mutual information).
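A hedged sketch of the frequency-based variant of such re-weighting (the exponent and functional form are illustrative, not the paper's exact formula):

    from collections import Counter

    def frequency_weights(target_corpus_tokens, alpha=0.3):
        # Rarer tokens get larger loss weights, counteracting the dominance
        # of frequent tokens in the cross-entropy objective.
        counts = Counter(target_corpus_tokens)
        total = sum(counts.values())
        return {tok: (total / c) ** alpha for tok, c in counts.items()}

    print(frequency_weights("the cat the mat the".split()))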
no code implementations • ACL 2022 • Yulin Xu, Zhen Yang, Fandong Meng, Jie Zhou
Complete Multi-lingual Neural Machine Translation (C-MNMT) achieves superior performance over conventional MNMT by constructing a multi-way aligned corpus, i.e., aligning bilingual training examples from different language pairs when either their source or target sides are identical.
1 code implementation • COLING 2022 • Duzhen Zhang, Zhen Yang, Fandong Meng, Xiuyi Chen, Jie Zhou
Causal Emotion Entailment (CEE) aims to discover the potential causes behind an emotion in a conversational utterance.
Ranked #4 on Causal Emotion Entailment on RECCON
no code implementations • ACL 2022 • Chulun Zhou, Fandong Meng, Jie Zhou, Min Zhang, Hongji Wang, Jinsong Su
Most dominant neural machine translation (NMT) models are restricted to make predictions only according to the local context of preceding words in a left-to-right manner.
1 code implementation • ACL 2022 • Yunlong Liang, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou
In this work, we introduce a new task named Multimodal Chat Translation (MCT), aiming to generate more accurate translations with the help of the associated dialogue history and visual context.
2 code implementations • 11 Feb 2022 • Jiaan Wang, Fandong Meng, Ziyao Lu, Duo Zheng, Zhixu Li, Jianfeng Qu, Jie Zhou
We present ClidSum, a benchmark dataset for building cross-lingual summarization systems on dialogue documents.
1 code implementation • EMNLP 2021 • Shaopeng Lai, Ante Wang, Fandong Meng, Jie Zhou, Yubin Ge, Jiali Zeng, Junfeng Yao, Degen Huang, Jinsong Su
Dominant sentence ordering models can be classified into pairwise ordering models and set-to-sequence models.
1 code implementation • 11 Oct 2021 • Zhen Yang, Fandong Meng, Yingxue Zhang, Ernan Li, Jie Zhou
To break this limitation, we create a benchmark data set for TS, called WeTS, which contains golden corpus annotated by expert translators on four translation directions.
no code implementations • 1 Oct 2021 • Xianggen Liu, Pengyong Li, Fandong Meng, Hao Zhou, Huasong Zhong, Jie Zhou, Lili Mou, Sen Song
The key idea is to integrate powerful neural networks into metaheuristics (e.g., simulated annealing, SA) to restrict the search space in discrete optimization.
no code implementations • Findings (ACL) 2021 • Feilong Chen, Xiuyi Chen, Fandong Meng, Peng Li, Jie Zhou
Specifically, GoG consists of three sequential graphs: 1) H-Graph, which aims to capture coreference relations among the dialog history; 2) History-aware Q-Graph, which aims to fully understand the question by capturing dependency relations between words based on coreference resolution over the dialog history; and 3) Question-aware I-Graph, which aims to capture the relations between objects in an image based on the full question representation.
1 code implementation • Findings (ACL) 2021 • Feilong Chen, Fandong Meng, Xiuyi Chen, Peng Li, Jie Zhou
Visual dialogue is a challenging task since it needs to answer a series of coherent questions on the basis of understanding the visual environment.
no code implementations • Findings (EMNLP) 2021 • Mingliang Zhang, Fandong Meng, Yunhai Tong, Jie Zhou
Therefore, we focus on balancing the learning competencies of different languages and propose Competence-based Curriculum Learning for Multilingual Machine Translation, named CCL-M.
1 code implementation • Findings (EMNLP) 2021 • Duo Zheng, Zipeng Xu, Fandong Meng, Xiaojie Wang, Jiaan Wang, Jie Zhou
To enhance the VD Questioner: 1) we propose a Related entity enhanced Questioner (ReeQ) that generates questions under the guidance of related entities and learns entity-based questioning strategies from human dialogs; 2) we propose an Augmented Guesser (AugG) that is strong and specifically optimized for the VD setting.
1 code implementation • EMNLP 2021 • Yunlong Liang, Chulun Zhou, Fandong Meng, Jinan Xu, Yufeng Chen, Jinsong Su, Jie Zhou
Neural Chat Translation (NCT) aims to translate conversational text between speakers of different languages.
1 code implementation • EMNLP 2021 • Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
Its core motivation is to simulate the inference scene during training by replacing ground-truth tokens with predicted tokens, thus bridging the gap between training and inference.
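The core replacement step can be sketched as follows (scheduled sampling in its simplest form; the paper's schedule and confidence handling are more elaborate):

    import random

    def mix_decoder_inputs(gold_tokens, predicted_tokens, p_replace):
        # With probability p_replace, feed the model its own prediction at a
        # position instead of the gold token, simulating inference conditions.
        return [pred if random.random() < p_replace else gold
                for gold, pred in zip(gold_tokens, predicted_tokens)]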
no code implementations • WMT (EMNLP) 2021 • Xianfeng Zeng, Yijin Liu, Ernan Li, Qiu Ran, Fandong Meng, Peng Li, Jinan Xu, Jie Zhou
This paper introduces WeChat AI's participation in WMT 2021 shared news translation task on English->Chinese, English->Japanese, Japanese->English and English->German.
1 code implementation • ACL 2021 • Yunlong Liang, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
Despite the impressive performance of sentence-level and context-aware Neural Machine Translation (NMT), there still remain challenges to translate bilingual conversational text due to its inherent characteristics such as role preference, dialogue coherence, and translation consistency.
1 code implementation • Findings (ACL) 2021 • Ying Zhang, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
In this paper, we tackle the problem by transferring knowledge from three aspects, i.e., domain, language and task, and strengthening connections among them.
1 code implementation • Findings (ACL) 2021 • Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
In this way, the model is exactly exposed to predicted tokens for high-confidence positions and still ground-truth tokens for low-confidence positions.
1 code implementation • 12 Jul 2021 • Zipeng Xu, Fandong Meng, Xiaojie Wang, Duo Zheng, Chenxu Lv, Jie Zhou
In Reinforcement Learning, it is crucial to represent states and assign rewards based on the action-caused transitions of states.
no code implementations • 29 Jun 2021 • Jianhao Yan, Chenming Wu, Fandong Meng, Jie Zhou
Current evaluation of an NMT system is usually built upon a heuristic decoding algorithm (e.g., beam search) and an evaluation metric assessing similarity between the translation and the golden reference.
1 code implementation • CL (ACL) 2021 • Chenze Shao, Yang Feng, Jinchao Zhang, Fandong Meng, Jie Zhou
Non-Autoregressive Neural Machine Translation (NAT) removes the autoregressive mechanism and achieves significant decoding speedup through generating target words independently and simultaneously.
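The source of the speedup can be seen in a minimal decoding sketch (the model interface here is hypothetical): all target positions are predicted in one parallel pass:

    import torch

    def nat_decode(model, src, tgt_len):
        # One forward pass produces logits of shape [batch, tgt_len, vocab];
        # every position is decoded independently by argmax, with no
        # left-to-right loop over previously generated tokens.
        logits = model(src, tgt_len)
        return logits.argmax(dim=-1)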
1 code implementation • ACL 2021 • Yuanxin Liu, Fandong Meng, Zheng Lin, Weiping Wang, Jie Zhou
In this paper, however, we observe that although distilling the teacher's hidden state knowledge (HSK) is helpful, the performance gain (marginal utility) diminishes quickly as more HSK is distilled.
no code implementations • ACL 2021 • Lei Shen, Fandong Meng, Jinchao Zhang, Yang Feng, Jie Zhou
Generating some appealing questions in open-domain conversations is an effective way to improve human-machine interactions and lead the topic to a broader or deeper direction.
1 code implementation • ACL 2021 • Hui Jiang, Chulun Zhou, Fandong Meng, Biao Zhang, Jie Zhou, Degen Huang, Qingqiang Wu, Jinsong Su
Due to the great potential in facilitating software development, code generation has attracted increasing attention recently.
no code implementations • NAACL 2021 • Yingxue Zhang, Fandong Meng, Peng Li, Ping Jian, Jie Zhou
Implicit discourse relation recognition (IDRR) aims to identify logical relations between two adjacent sentences in the discourse.
1 code implementation • ACL 2021 • Fusheng Wang, Jianhao Yan, Fandong Meng, Jie Zhou
As an active research field in NMT, knowledge distillation is widely applied to enhance the model's performance by transferring the teacher model's knowledge on each training sample.
1 code implementation • ACL 2021 • Yangyifan Xu, Yijin Liu, Fandong Meng, Jiajun Zhang, Jinan Xu, Jie Zhou
Recently, token-level adaptive training has achieved promising improvement in machine translation, where the cross-entropy loss function is adjusted by assigning different training weights to different tokens, in order to alleviate the token imbalance problem.
1 code implementation • ACL 2021 • Mengqi Miao, Fandong Meng, Yijin Liu, Xiao-Hua Zhou, Jie Zhou
The Neural Machine Translation (NMT) model is essentially a joint language model conditioned on both the source sentence and partial translation.
no code implementations • 4 Mar 2021 • Zhekun Shi, Di Tan, Quan Liu, Fandong Meng, Bo Zhu, Longjian Xue
Bioinspired structured adhesives have received increasing interest for many applications, such as climbing robots and medical devices.
Soft Condensed Matter
1 code implementation • 9 Dec 2020 • Yunlong Liang, Fandong Meng, Ying Zhang, Jinan Xu, Yufeng Chen, Jie Zhou
Firstly, we design a Heterogeneous Graph-Based Encoder to represent the conversation content (i.e., the dialogue history, its emotion flow, facial expressions, audio, and speakers' personalities) with a heterogeneous graph neural network, and then predict suitable emotions for feedback.
1 code implementation • EMNLP 2020 • Jianhao Yan, Fandong Meng, Jie Zhou
Transformer models achieve remarkable success in Neural Machine Translation.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Lin Qiao, Jianhao Yan, Fandong Meng, Zhendong Yang, Jie Zhou
Therefore, we propose a novel Sentiment-Controllable topic-to-essay generator with a Topic Knowledge Graph enhanced decoder, named SCTKG, which is based on the conditional variational autoencoder (CVAE) framework.
no code implementations • 10 Oct 2020 • Yingxue Zhang, Fandong Meng, Peng Li, Ping Jian, Jie Zhou
As conventional answer selection (AS) methods generally match the question with each candidate answer independently, they suffer from the lack of matching information between the question and the candidate.
1 code implementation • EMNLP 2020 • Shuhao Gu, Jinchao Zhang, Fandong Meng, Yang Feng, Wanying Xie, Jie Zhou, Dong Yu
The vanilla NMT model usually adopts trivial equal-weighted objectives for target tokens with different frequencies and tends to generate more high-frequency tokens and less low-frequency tokens compared with the golden token distribution.
no code implementations • WMT (EMNLP) 2020 • Fandong Meng, Jianhao Yan, Yijin Liu, Yuan Gao, Xianfeng Zeng, Qinsong Zeng, Peng Li, Ming Chen, Jie Zhou, Sifan Liu, Hao Zhou
We participate in the WMT 2020 shared news translation task on Chinese to English.
1 code implementation • 4 Sep 2020 • Huan Lin, Fandong Meng, Jinsong Su, Yongjing Yin, Zhengyuan Yang, Yubin Ge, Jie Zhou, Jiebo Luo
Particularly, we represent the input image with global and regional visual features, and introduce two parallel DCCNs to model multimodal context vectors with visual features at different granularities.
Ranked #3 on Multimodal Machine Translation on Multi30K
no code implementations • 12 Aug 2020 • Yunlong Liang, Fandong Meng, Jinchao Zhang, Yufeng Chen, Jinan Xu, Jie Zhou
For multiple aspects scenario of aspect-based sentiment analysis (ABSA), existing approaches typically ignore inter-aspect relations or rely on temporal dependencies to process aspect-aware representations of all aspects in a sentence.
Aspect-Based Sentiment Analysis (ABSA) +1
1 code implementation • ACL 2020 • Yongjing Yin, Fandong Meng, Jinsong Su, Chulun Zhou, Zhengyuan Yang, Jie Zhou, Jiebo Luo
Multi-modal neural machine translation (NMT) aims to translate source sentences into a target language paired with images.
no code implementations • 15 Jul 2020 • Jianhao Yan, Fandong Meng, Jie Zhou
Though remarkable successes have been achieved by Neural Machine Translation (NMT) in recent years, it still suffers from the inadequate-translation problem.
no code implementations • ACL 2020 • Yong Shan, Zekang Li, Jinchao Zhang, Fandong Meng, Yang Feng, Cheng Niu, Jie Zhou
Recent studies in dialogue state tracking (DST) leverage historical information to determine states which are generally represented as slot-value pairs.
Ranked #6 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.1
Dialogue State Tracking • Multi-domain Dialogue State Tracking
no code implementations • 27 Apr 2020 • Yijin Liu, Fandong Meng, Jie Zhou, Yufeng Chen, Jinan Xu
Depth-adaptive neural networks can dynamically adjust depths according to the hardness of input words, and thus improve efficiency.
no code implementations • 26 Apr 2020 • Zeyang Lei, Zekang Li, Jinchao Zhang, Fandong Meng, Yang Feng, Yujiu Yang, Cheng Niu, Jie Zhou
Furthermore, to facilitate the convergence of Gaussian mixture prior and posterior distributions, we devise a curriculum optimization strategy to progressively train the model under multiple training criteria from easy to hard.
3 code implementations • 4 Apr 2020 • Yunlong Liang, Fandong Meng, Jinchao Zhang, Jinan Xu, Yufeng Chen, Jie Zhou
The aspect-based sentiment analysis (ABSA) task remains a long-standing challenge: it aims to extract the aspect term and then identify its sentiment orientation. In previous approaches, the explicit syntactic structure of a sentence, which reflects the syntactic properties of natural language and hence is intuitively crucial for aspect term extraction and sentiment recognition, is typically neglected or insufficiently modeled.
Aspect-Based Sentiment Analysis (ABSA) +3
2 code implementations • Findings (EMNLP) 2021 • Yunlong Liang, Fandong Meng, Jinchao Zhang, Yufeng Chen, Jinan Xu, Jie Zhou
Aspect-based sentiment analysis (ABSA) mainly involves three subtasks: aspect term extraction, opinion term extraction, and aspect-level sentiment classification, which are typically handled in a separate or joint manner.
Aspect-Based Sentiment Analysis (ABSA) +3
1 code implementation • 29 Feb 2020 • Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
The Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network, which views words as nodes and performs layer-wise recurrent steps between them simultaneously.
1 code implementation • 18 Dec 2019 • Feilong Chen, Fandong Meng, Jiaming Xu, Peng Li, Bo Xu, Jie Zhou
Visual Dialog is a vision-language task that requires an AI agent to engage in a conversation with humans grounded in an image.
1 code implementation • 21 Nov 2019 • Chenze Shao, Jinchao Zhang, Yang Feng, Fandong Meng, Jie Zhou
Non-Autoregressive Neural Machine Translation (NAT) achieves significant decoding speedup through generating target words independently and simultaneously.
no code implementations • 17 Nov 2019 • Fandong Meng, Jinchao Zhang, Yang Liu, Jie Zhou
Recurrent neural networks (RNNs) have been widely used to deal with sequence learning problems.
Aspect-Based Sentiment Analysis (ABSA) +1
no code implementations • 5 Nov 2019 • Yong Shan, Yang Feng, Jinchao Zhang, Fandong Meng, Wen Zhang
Generally, Neural Machine Translation models generate target words in a left-to-right (L2R) manner and fail to exploit any future (right) semantic information, which usually produces unbalanced translations.
no code implementations • 21 Oct 2019 • Yingxue Zhang, Ping Jian, Fandong Meng, Ruiying Geng, Wei Cheng, Jie Zhou
Implicit discourse relation classification is of great importance for discourse parsing, but remains a challenging problem due to the absence of explicit discourse connectives communicating these relations.
2 code implementations • IJCNLP 2019 • Yijin Liu, Fandong Meng, Jinchao Zhang, Jie Zhou, Yufeng Chen, Jinan Xu
Spoken Language Understanding (SLU) mainly involves two tasks, intent detection and slot filling, which are generally modeled jointly in existing works.
Ranked #1 on Slot Filling on CAIS
no code implementations • ACL 2020 • Xianggen Liu, Lili Mou, Fandong Meng, Hao Zhou, Jie Zhou, Sen Song
Unsupervised paraphrase generation is a promising and important research topic in natural language processing.
no code implementations • IJCNLP 2019 • Zhengxin Yang, Jinchao Zhang, Fandong Meng, Shuhao Gu, Yang Feng, Jie Zhou
Context modeling is essential for generating coherent and consistent translations in document-level Neural Machine Translation.
1 code implementation • IJCNLP 2019 • Yunlong Liang, Fandong Meng, Jinchao Zhang, Jinan Xu, Yufeng Chen, Jie Zhou
Aspect based sentiment analysis (ABSA) aims to identify the sentiment polarity towards the given aspect in a sentence, while previous models typically exploit an aspect-independent (weakly associative) encoder for sentence representation generation.
Aspect-Based Sentiment Analysis (ABSA) +1
2 code implementations • ACL 2019 • Zekang Li, Cheng Niu, Fandong Meng, Yang Feng, Qian Li, Jie Zhou
Document Grounded Conversations is a task to generate dialogue responses when chatting about the content of a given document.
2 code implementations • ACL 2019 • Chenze Shao, Yang Feng, Jinchao Zhang, Fandong Meng, Xilin Chen, Jie Zhou
Non-Autoregressive Transformer (NAT) aims to accelerate the Transformer model through discarding the autoregressive mechanism and generating target words independently, which fails to exploit the target sequential information.
1 code implementation • ACL 2019 • Yijin Liu, Fandong Meng, Jinchao Zhang, Jinan Xu, Yufeng Chen, Jie Zhou
Current state-of-the-art systems for sequence labeling are typically based on the family of Recurrent Neural Networks (RNNs).
Ranked #17 on Named Entity Recognition (NER) on CoNLL 2003 (English) (using extra training data)
no code implementations • ACL 2019 • Wen Zhang, Yang Feng, Fandong Meng, Di You, Qun Liu
Neural Machine Translation (NMT) generates target words sequentially in the way of predicting the next word conditioned on the context words.
1 code implementation • 19 Dec 2018 • Fandong Meng, Jinchao Zhang
In this paper, we further enhance RNN-based NMT by increasing the transition depth between consecutive hidden states, building a novel Deep Transition RNN-based Architecture for Neural Machine Translation, named DTMT.
no code implementations • EMNLP 2018 • Baosong Yang, Zhaopeng Tu, Derek F. Wong, Fandong Meng, Lidia S. Chao, Tong Zhang
Self-attention networks have proven to be of profound value for their strength in capturing global dependencies.
Ranked #29 on Machine Translation on WMT2014 English-German
no code implementations • 29 Jun 2018 • Fandong Meng, Zhaopeng Tu, Yong Cheng, Haiyang Wu, Junjie Zhai, Yuekui Yang, Di Wang
Although attention-based Neural Machine Translation (NMT) has achieved remarkable progress in recent years, it still suffers from issues of repeating and dropping translations.
no code implementations • ACL 2018 • Yong Cheng, Zhaopeng Tu, Fandong Meng, Junjie Zhai, Yang Liu
Small perturbations in the input can severely distort intermediate representations and thus impact translation quality of neural machine translation (NMT) models.
no code implementations • COLING 2016 • Fandong Meng, Zhengdong Lu, Hang Li, Qun Liu
Conventional attention-based Neural Machine Translation (NMT) conducts dynamic alignment in generating the target sentence.
no code implementations • 6 Jun 2016 • Yaohua Tang, Fandong Meng, Zhengdong Lu, Hang Li, Philip L. H. Yu
In this paper, we propose phraseNet, a neural machine translator with a phrase memory which stores phrase pairs in symbolic form, mined from corpus or specified by human experts.