Search Results for author: Fandong Meng

Found 150 papers, 88 papers with code

TAKE: Topic-shift Aware Knowledge sElection for Dialogue Generation

1 code implementation COLING 2022 Chenxu Yang, Zheng Lin, Jiangnan Li, Fandong Meng, Weiping Wang, Lanrui Wang, Jie Zhou

The knowledge selector generally constructs a query based on the dialogue context and selects the most appropriate knowledge to help response generation.

Dialogue Generation Knowledge Distillation +1

Bridging the Gap between Prior and Posterior Knowledge Selection for Knowledge-Grounded Dialogue Generation

no code implementations EMNLP 2020 Xiuyi Chen, Fandong Meng, Peng Li, Feilong Chen, Shuang Xu, Bo Xu, Jie Zhou

Here, we deal with these issues on two aspects: (1) We enhance the prior selection module with the necessary posterior information obtained from the specially designed Posterior Information Prediction Module (PIPM); (2) We propose a Knowledge Distillation Based Training Strategy (KDBTS) to train the decoder with the knowledge selected from the prior distribution, removing the exposure bias of knowledge selection.

Decoder Dialogue Generation +1

CRAT: A Multi-Agent Framework for Causality-Enhanced Reflective and Retrieval-Augmented Translation with Large Language Models

no code implementations 28 Oct 2024 Meiqi Chen, Fandong Meng, Yingxue Zhang, Yan Zhang, Jie Zhou

In this paper, we propose CRAT, a novel multi-agent translation framework that leverages RAG and causality-enhanced self-reflection to address these challenges.

Machine Translation RAG +1

MiniPLM: Knowledge Distillation for Pre-Training Language Models

1 code implementation 22 Oct 2024 Yuxian Gu, Hao Zhou, Fandong Meng, Jie Zhou, Minlie Huang

For effectiveness, MiniPLM leverages the differences between large and small LMs to enhance the difficulty and diversity of the training data, helping student LMs acquire versatile and sophisticated knowledge.

Diversity Knowledge Distillation +1

On the token distance modeling ability of higher RoPE attention dimension

no code implementations 11 Oct 2024 Xiangyu Hong, Che Jiang, Biqing Qi, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou

We further demonstrate the correlation between the efficiency of length extrapolation and the extension of the high-dimensional attention allocation of these heads.

Position Reading Comprehension

MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation

no code implementations 30 Sep 2024 Wenchao Chen, Liqiang Niu, Ziyao Lu, Fandong Meng, Jie Zhou

Image generation models have encountered challenges related to scalability and quadratic complexity, primarily due to the reliance on Transformer-based backbones.

Mamba Text-to-Image Generation

AVG-LLaVA: A Large Multimodal Model with Adaptive Visual Granularity

1 code implementation 20 Sep 2024 Zhibin Lan, Liqiang Niu, Fandong Meng, Wenbo Li, Jie Zhou, Jinsong Su

Recently, when dealing with high-resolution images, dominant LMMs usually divide them into multiple local images and one global image, which will lead to a large number of visual tokens.

Avg

Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words

1 code implementation 23 Jul 2024 Yijie Chen, Yijin Liu, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou

This study presents a benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words), which assesses gender bias beyond binary gender.

Machine Translation Translation

Patch-Level Training for Large Language Models

1 code implementation 17 Jul 2024 Chenze Shao, Fandong Meng, Jie Zhou

As Large Language Models (LLMs) achieve remarkable progress in language understanding and generation, their training efficiency has become a critical concern.

Language Modelling

Translatotron-V(ison): An End-to-End Model for In-Image Machine Translation

1 code implementation 3 Jul 2024 Zhibin Lan, Liqiang Niu, Fandong Meng, Jie Zhou, Min Zhang, Jinsong Su

Among them, the target text decoder is used to alleviate the language alignment burden, and the image tokenizer converts long sequences of pixels into shorter sequences of visual tokens, preventing the model from focusing on low-level visual features.

Decoder Machine Translation

Multilingual Knowledge Editing with Language-Agnostic Factual Neurons

no code implementations 24 Jun 2024 Xue Zhang, Yunlong Liang, Fandong Meng, Songming Zhang, Yufeng Chen, Jinan Xu, Jie Zhou

Multilingual knowledge editing (MKE) aims to simultaneously revise factual knowledge across multilingual languages within large language models (LLMs).

knowledge editing

C-LLM: Learn to Check Chinese Spelling Errors Character by Character

1 code implementation 24 Jun 2024 Kunting Li, Yong Hu, Liang He, Fandong Meng, Jie Zhou

To address this issue, we propose C-LLM, a Large Language Model-based Chinese Spell Checking method that learns to check errors Character by Character.

Chinese Spell Checking Language Modelling +1

TasTe: Teaching Large Language Models to Translate through Self-Reflection

1 code implementation 12 Jun 2024 Yutong Wang, Jiali Zeng, Xuebo Liu, Fandong Meng, Jie Zhou, Min Zhang

The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.

Instruction Following Machine Translation +2

LCS: A Language Converter Strategy for Zero-Shot Neural Machine Translation

1 code implementation 5 Jun 2024 Zengkui Sun, Yijin Liu, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou

Multilingual neural machine translation models generally distinguish translation directions by the language tag (LT) in front of the source or target sentences.

Decoder Machine Translation +2

Outdated Issue Aware Decoding for Reasoning Questions on Edited Knowledge

1 code implementation 5 Jun 2024 Zengkui Sun, Yijin Liu, Jiaan Wang, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou

Consequently, on the reasoning questions, we discover that existing methods struggle to utilize the edited knowledge to reason the new answer, and tend to retain outdated responses, which are generated by the original models utilizing original knowledge.

knowledge editing

LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation

1 code implementation 3 Jun 2024 Yongjing Yin, Jiali Zeng, Yafu Li, Fandong Meng, Yue Zhang

The fine-tuning of open-source large language models (LLMs) for machine translation has recently received considerable attention, marking a shift towards data-centric research from traditional neural machine translation.

Data Augmentation Machine Translation +2

Language Generation with Strictly Proper Scoring Rules

1 code implementation 29 May 2024 Chenze Shao, Fandong Meng, Yijin Liu, Jie Zhou

Leveraging this strategy, we train language generation models using two classic strictly proper scoring rules, the Brier score and the Spherical score, as alternatives to the logarithmic score.

Language Modelling scoring rule +1

Understanding and Addressing the Under-Translation Problem from the Perspective of Decoding Objective

no code implementations 29 May 2024 Chenze Shao, Fandong Meng, Jiali Zeng, Jie Zhou

Building upon this analysis, we propose employing the confidence of predicting EOS as a detector for under-translation, and strengthening the confidence-based penalty to penalize candidates with a high risk of under-translation.

Machine Translation NMT +2

Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective

1 code implementation 11 Apr 2024 Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou

In this paper, we suggest that code comments are the natural logic pivot between natural language and code language and propose using comments to boost the code generation ability of code LLMs.

Code Generation HumanEval

Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy

1 code implementation 10 Apr 2024 Yijin Liu, Fandong Meng, Jie Zhou

Recently, dynamic computation methods have shown notable acceleration for Large Language Models (LLMs) by skipping several layers of computations through elaborate heuristics or additional predictors.

Machine Translation Text Summarization

On Large Language Models' Hallucination with Regard to Known Facts

1 code implementation 29 Mar 2024 Che Jiang, Biqing Qi, Xiangyu Hong, Dayuan Fu, Yang Cheng, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou

We reveal the different dynamics of the output token probabilities along the depths of layers between the correct and hallucinated cases.

Hallucination Triplet

On Prompt-Driven Safeguarding for Large Language Models

2 code implementations 31 Jan 2024 Chujie Zheng, Fan Yin, Hao Zhou, Fandong Meng, Jie Zhou, Kai-Wei Chang, Minlie Huang, Nanyun Peng

In this work, we investigate how LLMs' behavior (i.e., complying with or refusing user queries) is affected by safety prompts from the perspective of model representation.

Generative Multi-Modal Knowledge Retrieval with Large Language Models

1 code implementation 16 Jan 2024 Xinwei Long, Jiali Zeng, Fandong Meng, Zhiyuan Ma, Kaiyan Zhang, Bowen Zhou, Jie Zhou

Knowledge retrieval with multi-modal queries plays a crucial role in supporting knowledge-intensive multi-modal applications.

Retrieval

RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge

no code implementations 14 Nov 2023 Yi Liu, Lianzhe Huang, Shicheng Li, Sishuo Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun

Therefore, to evaluate the ability of LLMs to discern the reliability of external knowledge, we create a benchmark from existing knowledge bases.

counterfactual Knowledge Graphs +2

Eval-GCSC: A New Metric for Evaluating ChatGPT's Performance in Chinese Spelling Correction

1 code implementation 14 Nov 2023 Kunting Li, Yong Hu, Shaolei Wang, Hanhan Ma, Liang He, Fandong Meng, Jie Zhou

However, in the Chinese Spelling Correction (CSC) task, we observe a discrepancy: while ChatGPT performs well under human evaluation, it scores poorly according to traditional metrics.

Semantic Similarity Semantic Textual Similarity +1

TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models

no code implementations 8 Nov 2023 Zhen Yang, Yingxue Zhang, Fandong Meng, Jie Zhou

Specifically, for the input from any modality, TEAL first discretizes it into a token sequence with the off-the-shelf tokenizer and embeds the token sequence into a joint embedding space with a learnable embedding matrix.

Improving Machine Translation with Large Language Models: A Preliminary Study with Cooperative Decoding

1 code implementation 6 Nov 2023 Jiali Zeng, Fandong Meng, Yongjing Yin, Jie Zhou

Contemporary translation engines based on the encoder-decoder framework have made significant strides in development.

Decoder Machine Translation +2

Plot Retrieval as an Assessment of Abstract Semantic Association

no code implementations 3 Nov 2023 Shicheng Xu, Liang Pang, Jiangnan Li, Mo Yu, Fandong Meng, Huawei Shen, Xueqi Cheng, Jie Zhou

Readers usually only give an abstract and vague description as the query based on their own understanding, summaries, or speculations of the plot, which requires the retrieval model to have a strong ability to estimate the abstract semantic associations between the query and candidate plots.

Information Retrieval Retrieval

XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners

1 code implementation 9 Oct 2023 Yun Luo, Zhen Yang, Fandong Meng, Yingjie Li, Fang Guo, Qinglin Qi, Jie Zhou, Yue Zhang

Active learning (AL), which aims to construct an effective training set by iteratively curating the most formative unlabeled data for annotation, has been widely used in low-resource tasks.

Active Learning Decoder +2

Enhancing Argument Structure Extraction with Efficient Leverage of Contextual Information

1 code implementation 8 Oct 2023 Yun Luo, Zhen Yang, Fandong Meng, Yingjie Li, Jie Zhou, Yue Zhang

However, we observe that merely concatenating sentences in a contextual window does not fully utilize contextual information and can sometimes lead to excessive attention on less informative sentences.

Cross-Lingual Knowledge Editing in Large Language Models

2 code implementations 16 Sep 2023 Jiaan Wang, Yunlong Liang, Zengkui Sun, Yuxuan Cao, Jiarong Xu, Fandong Meng

With the recent advancements in large language models (LLMs), knowledge editing has been shown as a promising technique to adapt LLMs to new knowledge without retraining from scratch.

knowledge editing

Large Language Models Are Not Robust Multiple Choice Selectors

1 code implementation 7 Sep 2023 Chujie Zheng, Hao Zhou, Fandong Meng, Jie Zhou, Minlie Huang

This work shows that modern LLMs are vulnerable to option position changes in MCQs due to their inherent "selection bias", namely, they prefer to select specific option IDs as answers (like "Option A").

Computational Efficiency Multiple-choice +1

Improving Translation Faithfulness of Large Language Models via Augmenting Instructions

1 code implementation 24 Aug 2023 Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou

The experimental results demonstrate significant improvements in translation performance with SWIE based on BLOOMZ-3b, particularly in zero-shot and long text translations due to reduced instruction forgetting risk.

Instruction Following Machine Translation +2

Instruction Position Matters in Sequence Generation with Large Language Models

1 code implementation 23 Aug 2023 Yijin Liu, Xianfeng Zeng, Fandong Meng, Jie Zhou

Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization, through instruction fine-tuning.

Instruction Following Position +2

An Empirical Study of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning

1 code implementation 17 Aug 2023 Yun Luo, Zhen Yang, Fandong Meng, Yafu Li, Jie Zhou, Yue Zhang

Catastrophic forgetting (CF) is a phenomenon that occurs in machine learning when a model forgets previously learned information while acquiring new knowledge.

Decoder Reading Comprehension

Towards Multiple References Era -- Addressing Data Leakage and Limited Reference Diversity in NLG Evaluation

1 code implementation 6 Aug 2023 Xianfeng Zeng, Yijin Liu, Fandong Meng, Jie Zhou

To address this issue, we propose to utilize multiple references to enhance the consistency between these metrics and human evaluations.

Diversity nlg evaluation +1

Towards Codable Watermarking for Injecting Multi-bits Information to LLMs

1 code implementation 29 Jul 2023 Lean Wang, Wenkai Yang, Deli Chen, Hao Zhou, Yankai Lin, Fandong Meng, Jie Zhou, Xu Sun

As large language models (LLMs) generate texts with increasing fluency and realism, there is a growing need to identify the source of texts to prevent the abuse of LLMs.

Language Modelling

TIM: Teaching Large Language Models to Translate with Comparison

1 code implementation 10 Jul 2023 Jiali Zeng, Fandong Meng, Yongjing Yin, Jie Zhou

Open-sourced large language models (LLMs) have demonstrated remarkable efficacy in various tasks with instruction tuning.

Translation

Soft Language Clustering for Multilingual Model Pre-training

no code implementations 13 Jun 2023 Jiali Zeng, Yufan Jiang, Yongjing Yin, Yi Jing, Fandong Meng, Binghuai Lin, Yunbo Cao, Jie Zhou

Multilingual pre-trained language models have demonstrated impressive (zero-shot) cross-lingual transfer abilities; however, their performance is hindered when the target language has distant typology from source languages or when pre-training data is limited in size.

Clustering Question Answering +6

Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning

2 code implementations 23 May 2023 Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun

In-context learning (ICL) emerges as a promising capability of large language models (LLMs) by providing them with demonstration examples to perform diverse tasks.

In-Context Learning

Mitigating Catastrophic Forgetting in Task-Incremental Continual Learning with Adaptive Classification Criterion

no code implementations 20 May 2023 Yun Luo, Xiaotian Lin, Zhen Yang, Fandong Meng, Jie Zhou, Yue Zhang

It is seldom considered to adapt the decision boundary for new representations, and in this paper we propose a Supervised Contrastive learning framework with adaptive classification criterion for Continual Learning (SCCL). In our method, a contrastive loss is used to directly learn representations for different tasks, and a limited number of data samples are saved as the classification criterion.

Classification Continual Learning +1

Personality Understanding of Fictional Characters during Book Reading

1 code implementation 17 May 2023 Mo Yu, Jiangnan Li, Shunyu Yao, Wenjie Pang, Xiaochen Zhou, Zhou Xiao, Fandong Meng, Jie Zhou

As readers engage with a story, their understanding of a character evolves based on new events and information; and multiple fine-grained aspects of personalities can be perceived.

Towards Unifying Multi-Lingual and Cross-Lingual Summarization

no code implementations 16 May 2023 Jiaan Wang, Fandong Meng, Duo Zheng, Yunlong Liang, Zhixu Li, Jianfeng Qu, Jie Zhou

In this paper, we aim to unify MLS and CLS into a more general setting, i.e., many-to-many summarization (M2MS), where a single model could process documents in any language and generate their summaries also in any language.

Language Modelling Text Summarization

RC3: Regularized Contrastive Cross-lingual Cross-modal Pre-training

no code implementations 13 May 2023 Chulun Zhou, Yunlong Liang, Fandong Meng, Jinan Xu, Jinsong Su, Jie Zhou

In this paper, we propose Regularized Contrastive Cross-lingual Cross-modal (RC^3) pre-training, which further exploits more abundant weakly-aligned multilingual image-text pairs.

Contrastive Learning Machine Translation

WeLayout: WeChat Layout Analysis System for the ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents

no code implementations 11 May 2023 Mingliang Zhang, Zhen Cao, Juntao Liu, Liqiang Niu, Fandong Meng, Jie Zhou

Our approach effectively demonstrates the benefits of combining query-based and anchor-free models for achieving robust layout segmentation in corporate documents.

Bayesian Optimization Segmentation

Investigating Forgetting in Pre-Trained Representations Through Continual Learning

no code implementations 10 May 2023 Yun Luo, Zhen Yang, Xuefeng Bai, Fandong Meng, Jie Zhou, Yue Zhang

Intuitively, the representation forgetting can influence the general knowledge stored in pre-trained language models (LMs), but the concrete effect is still unclear.

Continual Learning General Knowledge

Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias

no code implementations 8 May 2023 Zhiyuan Zhang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun

To settle this issue, we propose the Fine-purifying approach, which utilizes the diffusion theory to study the dynamic process of fine-tuning for finding potentially poisonous dimensions.

BranchNorm: Robustly Scaling Extremely Deep Transformers

no code implementations 4 May 2023 Yijin Liu, Xianfeng Zeng, Fandong Meng, Jie Zhou

Recently, DeepNorm scales Transformers into extremely deep (i.e., 1000 layers) and reveals the promising potential of deep scaling.

Unified Model Learning for Various Neural Machine Translation

no code implementations 4 May 2023 Yunlong Liang, Fandong Meng, Jinan Xu, Jiaan Wang, Yufeng Chen, Jie Zhou

Specifically, we propose a "versatile" model, i.e., the Unified Model Learning for NMT (UMLNMT) that works with data from different tasks, and can translate well in multiple settings simultaneously, and theoretically it can be as many as possible.

Document Translation Machine Translation +3

Is ChatGPT a Good NLG Evaluator? A Preliminary Study

1 code implementation 7 Mar 2023 Jiaan Wang, Yunlong Liang, Fandong Meng, Zengkui Sun, Haoxiang Shi, Zhixu Li, Jinan Xu, Jianfeng Qu, Jie Zhou

In detail, we regard ChatGPT as a human evaluator and give task-specific (e.g., summarization) and aspect-specific (e.g., relevance) instruction to prompt ChatGPT to evaluate the generated results of NLG models.

nlg evaluation Story Generation

Zero-Shot Cross-Lingual Summarization via Large Language Models

no code implementations 28 Feb 2023 Jiaan Wang, Yunlong Liang, Fandong Meng, Beiqi Zou, Zhixu Li, Jianfeng Qu, Jie Zhou

Given a document in a source language, cross-lingual summarization (CLS) aims to generate a summary in a different target language.

Informativeness

A Multi-task Multi-stage Transitional Training Framework for Neural Chat Translation

no code implementations 27 Jan 2023 Chulun Zhou, Yunlong Liang, Fandong Meng, Jie Zhou, Jinan Xu, Hongji Wang, Min Zhang, Jinsong Su

To address these issues, in this paper, we propose a multi-task multi-stage transitional (MMT) training framework, where an NCT model is trained using the bilingual chat translation dataset and additional monolingual dialogues.

NMT Sentence +1

Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning

no code implementations 25 Jan 2023 Wenkai Yang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun

Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively in a data privacy-preserving manner.

Federated Learning Privacy Preserving

Summary-Oriented Vision Modeling for Multimodal Abstractive Summarization

1 code implementation 15 Dec 2022 Yunlong Liang, Fandong Meng, Jinan Xu, Jiaan Wang, Yufeng Chen, Jie Zhou

However, less attention has been paid to the visual features from the perspective of the summary, which may limit the model performance, especially in the low- and zero-resource scenarios.

Abstractive Text Summarization

Understanding Translationese in Cross-Lingual Summarization

no code implementations 14 Dec 2022 Jiaan Wang, Fandong Meng, Yunlong Liang, Tingyi Zhang, Jiarong Xu, Zhixu Li, Jie Zhou

In detail, we find that (1) the translationese in documents or summaries of test sets might lead to the discrepancy between human judgment and automatic evaluation; (2) the translationese in training sets would harm model performance in real-world applications; (3) though machine-translated documents involve translationese, they are very useful for building CLS systems on low-resource languages under specific training strategies.

DC-MBR: Distributional Cooling for Minimum Bayesian Risk Decoding

no code implementations 8 Dec 2022 Jianhao Yan, Jin Xu, Fandong Meng, Jie Zhou, Yue Zhang

In this work, we show that the issue arises from the inconsistency of label smoothing on the token-level and sequence-level distributions.

Machine Translation NMT

Findings of the WMT 2022 Shared Task on Translation Suggestion

no code implementations 30 Nov 2022 Zhen Yang, Fandong Meng, Yingxue Zhang, Ernan Li, Jie Zhou

We report the result of the first edition of the WMT shared task on Translation Suggestion (TS).

Machine Translation Task 2 +1

Summer: WeChat Neural Machine Translation Systems for the WMT22 Biomedical Translation Task

no code implementations 28 Nov 2022 Ernan Li, Fandong Meng, Jie Zhou

This paper introduces WeChat's participation in WMT 2022 shared biomedical translation task on Chinese to English.

Machine Translation Translation

CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers

1 code implementation 16 Nov 2022 Yong Hu, Fandong Meng, Jie Zhou

In this paper, we present CSCD-NS, the first Chinese spelling check (CSC) dataset designed for native speakers, containing 40,000 samples from a Chinese social platform.

Spelling Correction

Question-Interlocutor Scope Realized Graph Modeling over Key Utterances for Dialogue Reading Comprehension

no code implementations 26 Oct 2022 Jiangnan Li, Mo Yu, Fandong Meng, Zheng Lin, Peng Fu, Weiping Wang, Jie Zhou

Although these tasks are effective, there are still urging problems: (1) randomly masking speakers regardless of the question cannot map the speaker mentioned in the question to the corresponding speaker in the dialogue, and ignores the speaker-centric nature of utterances.

Reading Comprehension

Empathetic Dialogue Generation via Sensitive Emotion Recognition and Sensible Knowledge Selection

1 code implementation 21 Oct 2022 Lanrui Wang, Jiangnan Li, Zheng Lin, Fandong Meng, Chenxu Yang, Weiping Wang, Jie Zhou

We use a fine-grained encoding strategy which is more sensitive to the emotion dynamics (emotion flow) in the conversations to predict the emotion-intent characteristic of response.

Dialogue Generation Emotion Recognition +2

Towards Robust k-Nearest-Neighbor Machine Translation

3 code implementations 17 Oct 2022 Hui Jiang, Ziyao Lu, Fandong Meng, Chulun Zhou, Jie Zhou, Degen Huang, Jinsong Su

Meanwhile we inject two types of perturbations into the retrieved pairs for robust training.

Machine Translation NMT +1

A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models

1 code implementation 11 Oct 2022 Yuanxin Liu, Fandong Meng, Zheng Lin, Jiangnan Li, Peng Fu, Yanan Cao, Weiping Wang, Jie Zhou

In response to the efficiency problem, recent studies show that dense PLMs can be replaced with sparse subnetworks without hurting the performance.

Natural Language Understanding

Language Prior Is Not the Only Shortcut: A Benchmark for Shortcut Learning in VQA

1 code implementation 10 Oct 2022 Qingyi Si, Fandong Meng, Mingyu Zheng, Zheng Lin, Yuanxin Liu, Peng Fu, Yanan Cao, Weiping Wang, Jie Zhou

To overcome this limitation, we propose a new dataset that considers varying types of shortcuts by constructing different distribution shifts in multiple OOD test sets.

Question Answering Visual Question Answering

Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning

1 code implementation 10 Oct 2022 Qingyi Si, Yuanxin Liu, Fandong Meng, Zheng Lin, Peng Fu, Yanan Cao, Weiping Wang, Jie Zhou

However, these models reveal a trade-off that the improvements on OOD data severely sacrifice the performance on the in-distribution (ID) data (which is dominated by the biased samples).

Contrastive Learning Question Answering +1

Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment

1 code implementation 9 Oct 2022 Siyu Lai, Zhen Yang, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou

Word alignment, which aims to extract lexicon translation equivalents between source and target sentences, serves as a fundamental tool for natural language processing.

Language Modelling Sentence +2

Rethink about the Word-level Quality Estimation for Machine Translation from Human Judgement

1 code implementation 13 Sep 2022 Zhen Yang, Fandong Meng, Yuanmeng Yan, Jie Zhou

While the post-editing effort can be used to measure the translation quality to some extent, we find it usually conflicts with the human judgement on whether the word is well or poorly translated.

Machine Translation Sentence +2

Probing Causes of Hallucinations in Neural Machine Translations

no code implementations 25 Jun 2022 Jianhao Yan, Fandong Meng, Jie Zhou

Hallucination, one kind of pathological translations that bothers Neural Machine Translation, has recently drawn much attention.

Hallucination Machine Translation +2

Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training

1 code implementation NAACL 2022 Yuanxin Liu, Fandong Meng, Zheng Lin, Peng Fu, Yanan Cao, Weiping Wang, Jie Zhou

Firstly, we discover that the success of magnitude pruning can be attributed to the preserved pre-training performance, which correlates with the downstream transferability.

Transfer Learning

Generating Authentic Adversarial Examples beyond Meaning-preserving with Doubly Round-trip Translation

1 code implementation NAACL 2022 Siyu Lai, Zhen Yang, Fandong Meng, Xue Zhang, Yufeng Chen, Jinan Xu, Jie Zhou

Generating adversarial examples for Neural Machine Translation (NMT) with single Round-Trip Translation (RTT) has achieved promising results by releasing the meaning-preserving restriction.

Machine Translation NMT +1

A Survey on Cross-Lingual Summarization

no code implementations 23 Mar 2022 Jiaan Wang, Fandong Meng, Duo Zheng, Yunlong Liang, Zhixu Li, Jianfeng Qu, Jie Zhou

Cross-lingual summarization is the task of generating a summary in one language (e.g., English) for the given document(s) in a different language (e.g., Chinese).

Survey

Spot the Difference: A Cooperative Object-Referring Game in Non-Perfectly Co-Observable Scene

1 code implementation 16 Mar 2022 Duo Zheng, Fandong Meng, Qingyi Si, Hairun Fan, Zipeng Xu, Jie Zhou, Fangxiang Feng, Xiaojie Wang

Visual dialog has witnessed great progress after introducing various vision-oriented goals into the conversation, especially such as GuessWhich and GuessWhat, where the only image is visible by either and both of the questioner and the answerer, respectively.

Visual Dialog

A Variational Hierarchical Model for Neural Cross-Lingual Summarization

1 code implementation ACL 2022 Yunlong Liang, Fandong Meng, Chulun Zhou, Jinan Xu, Yufeng Chen, Jinsong Su, Jie Zhou

The goal of the cross-lingual summarization (CLS) is to convert a document in one language (e.g., English) to a summary in another one (e.g., Chinese).

Machine Translation Translation

Towards Robust Online Dialogue Response Generation

no code implementations 7 Mar 2022 Leyang Cui, Fandong Meng, Yijin Liu, Jie Zhou, Yue Zhang

Although pre-trained sequence-to-sequence models have achieved great success in dialogue response generation, chatbots still suffer from generating inconsistent responses in real-world practice, especially in multi-turn settings.

Chatbot Re-Ranking +1

Conditional Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation

1 code implementation ACL 2022 Songming Zhang, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jian Liu, Jie Zhou

Token-level adaptive training approaches can alleviate the token imbalance problem and thus improve neural machine translation, through re-weighting the losses of different target tokens based on specific statistical metrics (e.g., token frequency or mutual information).

Language Modelling Machine Translation +2

EAG: Extract and Generate Multi-way Aligned Corpus for Complete Multi-lingual Neural Machine Translation

no code implementations ACL 2022 Yulin Xu, Zhen Yang, Fandong Meng, Jie Zhou

Complete Multi-lingual Neural Machine Translation (C-MNMT) achieves superior performance against the conventional MNMT by constructing multi-way aligned corpus, i.e., aligning bilingual training examples from different language pairs when either their source or target sides are identical.

Diversity Machine Translation

Confidence Based Bidirectional Global Context Aware Training Framework for Neural Machine Translation

no code implementations ACL 2022 Chulun Zhou, Fandong Meng, Jie Zhou, Min Zhang, Hongji Wang, Jinsong Su

Most dominant neural machine translation (NMT) models are restricted to make predictions only according to the local context of preceding words in a left-to-right manner.

Decoder Knowledge Distillation +4

MSCTD: A Multimodal Sentiment Chat Translation Dataset

1 code implementation ACL 2022 Yunlong Liang, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou

In this work, we introduce a new task named Multimodal Chat Translation (MCT), aiming to generate more accurate translations with the help of the associated dialogue history and visual context.

Multimodal Machine Translation Sentiment Analysis +1

ClidSum: A Benchmark Dataset for Cross-Lingual Dialogue Summarization

2 code implementations 11 Feb 2022 Jiaan Wang, Fandong Meng, Ziyao Lu, Duo Zheng, Zhixu Li, Jianfeng Qu, Jie Zhou

We present ClidSum, a benchmark dataset for building cross-lingual summarization systems on dialogue documents.

WeTS: A Benchmark for Translation Suggestion

1 code implementation 11 Oct 2021 Zhen Yang, Fandong Meng, Yingxue Zhang, Ernan Li, Jie zhou

To break this limitation, we create a benchmark dataset for TS, called WeTS, which contains a golden corpus annotated by expert translators on four translation directions.

Machine Translation Translation

Simulated annealing for optimization of graphs and sequences

no code implementations 1 Oct 2021 Xianggen Liu, Pengyong Li, Fandong Meng, Hao Zhou, Huasong Zhong, Jie zhou, Lili Mou, Sen Song

The key idea is to integrate powerful neural networks into metaheuristics (e.g., simulated annealing, SA) to restrict the search space in discrete optimization.

Paraphrase Generation

GoG: Relation-aware Graph-over-Graph Network for Visual Dialog

no code implementations Findings (ACL) 2021 Feilong Chen, Xiuyi Chen, Fandong Meng, Peng Li, Jie zhou

Specifically, GoG consists of three sequential graphs: 1) H-Graph, which aims to capture coreference relations among dialog history; 2) History-aware Q-Graph, which aims to fully understand the question through capturing dependency relations between words based on coreference resolution on the dialog history; and 3) Question-aware I-Graph, which aims to capture the relations between objects in an image based on the full question representation.

coreference-resolution Implicit Relations +2

Multimodal Incremental Transformer with Visual Grounding for Visual Dialogue Generation

1 code implementation Findings (ACL) 2021 Feilong Chen, Fandong Meng, Xiuyi Chen, Peng Li, Jie zhou

Visual dialogue is a challenging task since it needs to answer a series of coherent questions on the basis of understanding the visual environment.

Dialogue Generation Visual Grounding

Competence-based Curriculum Learning for Multilingual Machine Translation

no code implementations Findings (EMNLP) 2021 Mingliang Zhang, Fandong Meng, Yunhai Tong, Jie zhou

Therefore, we focus on balancing the learning competencies of different languages and propose Competence-based Curriculum Learning for Multilingual Machine Translation, named CCL-M.

Machine Translation Translation

Enhancing Visual Dialog Questioner with Entity-based Strategy Learning and Augmented Guesser

1 code implementation Findings (EMNLP) 2021 Duo Zheng, Zipeng Xu, Fandong Meng, Xiaojie Wang, Jiaan Wang, Jie zhou

To enhance the VD Questioner: 1) we propose a Related entity enhanced Questioner (ReeQ) that generates questions under the guidance of related entities and learns an entity-based questioning strategy from human dialogs; 2) we propose a strong Augmented Guesser (AugG) that is optimized specifically for the VD setting.

Diversity Reinforcement Learning (RL) +1

Scheduled Sampling Based on Decoding Steps for Neural Machine Translation

1 code implementation EMNLP 2021 Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie zhou

Its core motivation is to simulate the inference scene during training by replacing ground-truth tokens with predicted tokens, thus bridging the gap between training and inference.

Machine Translation Text Summarization +1

WeChat Neural Machine Translation Systems for WMT21

no code implementations WMT (EMNLP) 2021 Xianfeng Zeng, Yijin Liu, Ernan Li, Qiu Ran, Fandong Meng, Peng Li, Jinan Xu, Jie zhou

This paper introduces WeChat AI's participation in WMT 2021 shared news translation task on English->Chinese, English->Japanese, Japanese->English and English->German.

Knowledge Distillation Machine Translation +3

Modeling Bilingual Conversational Characteristics for Neural Chat Translation

1 code implementation ACL 2021 Yunlong Liang, Fandong Meng, Yufeng Chen, Jinan Xu, Jie zhou

Despite the impressive performance of sentence-level and context-aware Neural Machine Translation (NMT), there still remain challenges in translating bilingual conversational text due to its inherent characteristics such as role preference, dialogue coherence, and translation consistency.

Machine Translation NMT +2

Target-Oriented Fine-tuning for Zero-Resource Named Entity Recognition

1 code implementation Findings (ACL) 2021 Ying Zhang, Fandong Meng, Yufeng Chen, Jinan Xu, Jie zhou

In this paper, we tackle the problem by transferring knowledge from three aspects, i.e., domain, language and task, and strengthening connections among them.

named-entity-recognition Named Entity Recognition +2

Confidence-Aware Scheduled Sampling for Neural Machine Translation

1 code implementation Findings (ACL) 2021 Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie zhou

In this way, the model is exactly exposed to predicted tokens for high-confidence positions and still ground-truth tokens for low-confidence positions.

Machine Translation Translation
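
The mixing rule described in the entry above can be sketched as follows. This is only an illustrative sketch: the function name `mix_decoder_inputs`, the fixed `threshold`, and the use of one scalar confidence per position are assumptions, not details taken from the paper.

```python
def mix_decoder_inputs(gold_tokens, predicted_tokens, confidences, threshold=0.9):
    """Choose the decoder input token for each target position.

    Positions where the model's confidence reaches `threshold` are fed the
    model's own prediction; all other positions keep the ground-truth token.
    The threshold value is a hypothetical choice for illustration.
    """
    if not (len(gold_tokens) == len(predicted_tokens) == len(confidences)):
        raise ValueError("all three sequences must have the same length")
    return [
        pred if conf >= threshold else gold
        for gold, pred, conf in zip(gold_tokens, predicted_tokens, confidences)
    ]


# Usage: only the third position has a confident prediction that differs
# from the reference, so only that position is replaced.
mixed = mix_decoder_inputs(
    ["a", "b", "c"],
    ["a", "x", "y"],
    [0.95, 0.50, 0.99],
)
```

Unlike a schedule that samples positions at random, the choice here is deterministic given the confidences.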

Modeling Explicit Concerning States for Reinforcement Learning in Visual Dialogue

1 code implementation 12 Jul 2021 Zipeng Xu, Fandong Meng, Xiaojie Wang, Duo Zheng, Chenxu Lv, Jie zhou

In Reinforcement Learning, it is crucial to represent states and assign rewards based on the action-caused transitions of states.

reinforcement-learning Reinforcement Learning +1

Digging Errors in NMT: Evaluating and Understanding Model Errors from Partial Hypothesis Space

no code implementations 29 Jun 2021 Jianhao Yan, Chenming Wu, Fandong Meng, Jie zhou

Current evaluation of an NMT system is usually built upon a heuristic decoding algorithm (e.g., beam search) and an evaluation metric assessing similarity between the translation and the golden reference.

Data Augmentation Inductive Bias +3

Sequence-Level Training for Non-Autoregressive Neural Machine Translation

1 code implementation CL (ACL) 2021 Chenze Shao, Yang Feng, Jinchao Zhang, Fandong Meng, Jie zhou

Non-Autoregressive Neural Machine Translation (NAT) removes the autoregressive mechanism and achieves significant decoding speedup through generating target words independently and simultaneously.

Machine Translation NMT +2

Marginal Utility Diminishes: Exploring the Minimum Knowledge for BERT Knowledge Distillation

1 code implementation ACL 2021 Yuanxin Liu, Fandong Meng, Zheng Lin, Weiping Wang, Jie zhou

In this paper, however, we observe that although distilling the teacher's hidden state knowledge (HSK) is helpful, the performance gain (marginal utility) diminishes quickly as more HSK is distilled.

Knowledge Distillation

GTM: A Generative Triple-Wise Model for Conversational Question Generation

no code implementations ACL 2021 Lei Shen, Fandong Meng, Jinchao Zhang, Yang Feng, Jie zhou

Generating some appealing questions in open-domain conversations is an effective way to improve human-machine interactions and lead the topic to a broader or deeper direction.

Diversity Question Generation +1

Exploring Dynamic Selection of Branch Expansion Orders for Code Generation

1 code implementation ACL 2021 Hui Jiang, Chulun Zhou, Fandong Meng, Biao Zhang, Jie zhou, Degen Huang, Qingqiang Wu, Jinsong Su

Due to the great potential in facilitating software development, code generation has attracted increasing attention recently.

Code Generation

Context Tracking Network: Graph-based Context Modeling for Implicit Discourse Relation Recognition

no code implementations NAACL 2021 Yingxue Zhang, Fandong Meng, Peng Li, Ping Jian, Jie zhou

Implicit discourse relation recognition (IDRR) aims to identify logical relations between two adjacent sentences in the discourse.

Relation Sentence

Selective Knowledge Distillation for Neural Machine Translation

1 code implementation ACL 2021 Fusheng Wang, Jianhao Yan, Fandong Meng, Jie zhou

As an active research field in NMT, knowledge distillation is widely applied to enhance the model's performance by transferring the teacher model's knowledge on each training sample.

Knowledge Distillation Machine Translation +2

Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation

1 code implementation ACL 2021 Yangyifan Xu, Yijin Liu, Fandong Meng, Jiajun Zhang, Jinan Xu, Jie zhou

Recently, token-level adaptive training has achieved promising improvement in machine translation, where the cross-entropy loss function is adjusted by assigning different training weights to different tokens, in order to alleviate the token imbalance problem.

Diversity Machine Translation +1

Prevent the Language Model from being Overconfident in Neural Machine Translation

1 code implementation ACL 2021 Mengqi Miao, Fandong Meng, Yijin Liu, Xiao-Hua Zhou, Jie zhou

The Neural Machine Translation (NMT) model is essentially a joint language model conditioned on both the source sentence and partial translation.

Hallucination Language Modelling +4

Tree frog-inspired nanopillar arrays for enhancement of adhesion and friction

no code implementations 4 Mar 2021 Zhekun Shi, Di Tan, Quan Liu, Fandong Meng, Bo Zhu, Longjian Xue

Bioinspired structure adhesives have received increasing interest for many applications, such as climbing robots and medical devices.

Soft Condensed Matter

Emotional Conversation Generation with Heterogeneous Graph Neural Network

1 code implementation 9 Dec 2020 Yunlong Liang, Fandong Meng, Ying Zhang, Jinan Xu, Yufeng Chen, Jie zhou

Firstly, we design a Heterogeneous Graph-Based Encoder to represent the conversation content (i.e., the dialogue history, its emotion flow, facial expressions, audio, and speakers' personalities) with a heterogeneous graph neural network, and then predict suitable emotions for feedback.

Decoder Graph Neural Network

A Sentiment-Controllable Topic-to-Essay Generator with Topic Knowledge Graph

no code implementations Findings of the Association for Computational Linguistics 2020 Lin Qiao, Jianhao Yan, Fandong Meng, Zhendong Yang, Jie zhou

Therefore, we propose a novel Sentiment-Controllable topic-to-essay generator with a Topic Knowledge Graph enhanced decoder, named SCTKG, which is based on the conditional variational autoencoder (CVAE) framework.

Decoder Diversity +2

MS-Ranker: Accumulating Evidence from Potentially Correct Candidates for Answer Selection

no code implementations 10 Oct 2020 Yingxue Zhang, Fandong Meng, Peng Li, Ping Jian, Jie zhou

As conventional answer selection (AS) methods generally match the question with each candidate answer independently, they suffer from the lack of matching information between the question and the candidate.

Answer Selection Reinforcement Learning (RL)

Token-level Adaptive Training for Neural Machine Translation

1 code implementation EMNLP 2020 Shuhao Gu, Jinchao Zhang, Fandong Meng, Yang Feng, Wanying Xie, Jie zhou, Dong Yu

The vanilla NMT model usually adopts trivial equal-weighted objectives for target tokens with different frequencies and tends to generate more high-frequency tokens and fewer low-frequency tokens compared with the golden token distribution.

Diversity Machine Translation +2

Dynamic Context-guided Capsule Network for Multimodal Machine Translation

1 code implementation 4 Sep 2020 Huan Lin, Fandong Meng, Jinsong Su, Yongjing Yin, Zhengyuan Yang, Yubin Ge, Jie zhou, Jiebo Luo

Particularly, we represent the input image with global and regional visual features, and introduce two parallel DCCNs to model multimodal context vectors with visual features at different granularities.

Decoder Multimodal Machine Translation +2

Modeling Inter-Aspect Dependencies with a Non-temporal Mechanism for Aspect-Based Sentiment Analysis

no code implementations 12 Aug 2020 Yunlong Liang, Fandong Meng, Jinchao Zhang, Yufeng Chen, Jinan Xu, Jie zhou

For the multiple-aspect scenario of aspect-based sentiment analysis (ABSA), existing approaches typically ignore inter-aspect relations or rely on temporal dependencies to process aspect-aware representations of all aspects in a sentence.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

Dual Past and Future for Neural Machine Translation

no code implementations 15 Jul 2020 Jianhao Yan, Fandong Meng, Jie zhou

Though remarkable successes have been achieved by Neural Machine Translation (NMT) in recent years, it still suffers from the inadequate-translation problem.

Machine Translation NMT +2

Faster Depth-Adaptive Transformers

no code implementations 27 Apr 2020 Yijin Liu, Fandong Meng, Jie zhou, Yufeng Chen, Jinan Xu

Depth-adaptive neural networks can dynamically adjust depths according to the hardness of input words, and thus improve efficiency.

Sentence Embeddings text-classification +1

Towards Multimodal Response Generation with Exemplar Augmentation and Curriculum Optimization

no code implementations 26 Apr 2020 Zeyang Lei, Zekang Li, Jinchao Zhang, Fandong Meng, Yang Feng, Yujiu Yang, Cheng Niu, Jie zhou

Furthermore, to facilitate the convergence of Gaussian mixture prior and posterior distributions, we devise a curriculum optimization strategy to progressively train the model under multiple training criteria from easy to hard.

Diversity Response Generation

A Dependency Syntactic Knowledge Augmented Interactive Architecture for End-to-End Aspect-based Sentiment Analysis

3 code implementations 4 Apr 2020 Yunlong Liang, Fandong Meng, Jinchao Zhang, Jinan Xu, Yufeng Chen, Jie zhou

The aspect-based sentiment analysis (ABSA) task, which aims to extract the aspect term and then identify its sentiment orientation, remains a long-standing challenge. In previous approaches, the explicit syntactic structure of a sentence, which reflects the syntax properties of natural language and hence is intuitively crucial for aspect term extraction and sentiment recognition, is typically neglected or insufficiently modeled.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +3

An Iterative Multi-Knowledge Transfer Network for Aspect-Based Sentiment Analysis

2 code implementations Findings (EMNLP) 2021 Yunlong Liang, Fandong Meng, Jinchao Zhang, Yufeng Chen, Jinan Xu, Jie zhou

Aspect-based sentiment analysis (ABSA) mainly involves three subtasks: aspect term extraction, opinion term extraction, and aspect-level sentiment classification, which are typically handled in a separate or joint manner.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +3

Depth-Adaptive Graph Recurrent Network for Text Classification

1 code implementation 29 Feb 2020 Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie zhou

The Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network, which views words as nodes and performs layer-wise recurrent steps between them simultaneously.

General Classification Sentence +2

DMRM: A Dual-channel Multi-hop Reasoning Model for Visual Dialog

1 code implementation 18 Dec 2019 Feilong Chen, Fandong Meng, Jiaming Xu, Peng Li, Bo Xu, Jie zhou

Visual Dialog is a vision-language task that requires an AI agent to engage in a conversation with humans grounded in an image.

AI Agent Decoder +2

Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation

1 code implementation 21 Nov 2019 Chenze Shao, Jinchao Zhang, Yang Feng, Fandong Meng, Jie zhou

Non-Autoregressive Neural Machine Translation (NAT) achieves significant decoding speedup through generating target words independently and simultaneously.

Machine Translation Sentence +1

Improving Bidirectional Decoding with Dynamic Target Semantics in Neural Machine Translation

no code implementations 5 Nov 2019 Yong Shan, Yang Feng, Jinchao Zhang, Fandong Meng, Wen Zhang

Generally, Neural Machine Translation models generate target words in a left-to-right (L2R) manner and fail to exploit any future (right) semantics information, which usually produces an unbalanced translation.

Decoder Machine Translation +1

Semantic Graph Convolutional Network for Implicit Discourse Relation Classification

no code implementations 21 Oct 2019 Yingxue Zhang, Ping Jian, Fandong Meng, Ruiying Geng, Wei Cheng, Jie zhou

Implicit discourse relation classification is of great importance for discourse parsing, but remains a challenging problem due to the absence of explicit discourse connectives communicating these relations.

Classification Discourse Parsing +3

CM-Net: A Novel Collaborative Memory Network for Spoken Language Understanding

2 code implementations IJCNLP 2019 Yijin Liu, Fandong Meng, Jinchao Zhang, Jie zhou, Yufeng Chen, Jinan Xu

Spoken Language Understanding (SLU) mainly involves two tasks, intent detection and slot filling, which are generally modeled jointly in existing works.

Intent Detection slot-filling +2

Unsupervised Paraphrasing by Simulated Annealing

no code implementations ACL 2020 Xianggen Liu, Lili Mou, Fandong Meng, Hao Zhou, Jie zhou, Sen Song

Unsupervised paraphrase generation is a promising and important research topic in natural language processing.

Diversity Paraphrase Generation +3

A Novel Aspect-Guided Deep Transition Model for Aspect Based Sentiment Analysis

1 code implementation IJCNLP 2019 Yunlong Liang, Fandong Meng, Jinchao Zhang, Jinan Xu, Yufeng Chen, Jie zhou

Aspect based sentiment analysis (ABSA) aims to identify the sentiment polarity towards the given aspect in a sentence, while previous models typically exploit an aspect-independent (weakly associative) encoder for sentence representation generation.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

Incremental Transformer with Deliberation Decoder for Document Grounded Conversations

2 code implementations ACL 2019 Zekang Li, Cheng Niu, Fandong Meng, Yang Feng, Qian Li, Jie zhou

Document Grounded Conversations is a task to generate dialogue responses when chatting about the content of a given document.

Decoder

Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation

2 code implementations ACL 2019 Chenze Shao, Yang Feng, Jinchao Zhang, Fandong Meng, Xilin Chen, Jie zhou

Non-Autoregressive Transformer (NAT) aims to accelerate the Transformer model through discarding the autoregressive mechanism and generating target words independently, which fails to exploit the target sequential information.

Decoder Machine Translation +2

GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling

1 code implementation ACL 2019 Yijin Liu, Fandong Meng, Jinchao Zhang, Jinan Xu, Yufeng Chen, Jie zhou

Current state-of-the-art systems for sequence labeling are typically based on the family of Recurrent Neural Networks (RNNs).

Ranked #17 on Named Entity Recognition (NER) on CoNLL 2003 (English) (using extra training data)

Chunking NER +2

Bridging the Gap between Training and Inference for Neural Machine Translation

no code implementations ACL 2019 Wen Zhang, Yang Feng, Fandong Meng, Di You, Qun Liu

Neural Machine Translation (NMT) generates target words sequentially by predicting the next word conditioned on the context words.

Machine Translation NMT +2

DTMT: A Novel Deep Transition Architecture for Neural Machine Translation

1 code implementation 19 Dec 2018 Fandong Meng, Jinchao Zhang

In this paper, we further enhance the RNN-based NMT through increasing the transition depth between consecutive hidden states and build a novel Deep Transition RNN-based Architecture for Neural Machine Translation, named DTMT.

Machine Translation NMT +1

Neural Machine Translation with Key-Value Memory-Augmented Attention

no code implementations 29 Jun 2018 Fandong Meng, Zhaopeng Tu, Yong Cheng, Haiyang Wu, Junjie Zhai, Yuekui Yang, Di Wang

Although attention-based Neural Machine Translation (NMT) has achieved remarkable progress in recent years, it still suffers from issues of repeating and dropping translations.

Decoder Machine Translation +3

Towards Robust Neural Machine Translation

no code implementations ACL 2018 Yong Cheng, Zhaopeng Tu, Fandong Meng, Junjie Zhai, Yang Liu

Small perturbations in the input can severely distort intermediate representations and thus impact translation quality of neural machine translation (NMT) models.

Decoder Machine Translation +2

Interactive Attention for Neural Machine Translation

no code implementations COLING 2016 Fandong Meng, Zhengdong Lu, Hang Li, Qun Liu

Conventional attention-based Neural Machine Translation (NMT) conducts dynamic alignment in generating the target sentence.

Decoder Machine Translation +3

Neural Machine Translation with External Phrase Memory

no code implementations 6 Jun 2016 Yaohua Tang, Fandong Meng, Zhengdong Lu, Hang Li, Philip L. H. Yu

In this paper, we propose phraseNet, a neural machine translator with a phrase memory which stores phrase pairs in symbolic form, mined from a corpus or specified by human experts.

Decoder Machine Translation +2