Search Results for author: Deyi Xiong

Found 118 papers, 28 papers with code

A Test Suite for Evaluating Discourse Phenomena in Document-level Neural Machine Translation

no code implementations AACL (iwdp) 2020 Xinyi Cai, Deyi Xiong

The need to evaluate the ability of context-aware neural machine translation (NMT) models in dealing with specific discourse phenomena arises in document-level NMT.

Machine Translation NMT +1

Re-embedding Difficult Samples via Mutual Information Constrained Semantically Oversampling for Imbalanced Text Classification

no code implementations EMNLP 2021 Jiachen Tian, Shizhan Chen, Xiaowang Zhang, Zhiyong Feng, Deyi Xiong, Shaojuan Wu, Chunliu Dou

Difficult samples of the minority class in imbalanced text classification are usually hard to be classified as they are embedded into an overlapping semantic region with the majority class.

text-classification Text Classification

Adaptive Differential Privacy for Language Model Training

no code implementations FL4NLP (ACL) 2022 Xinwei Wu, Li Gong, Deyi Xiong

Although differential privacy (DP) can protect language models from leaking privacy, its indiscriminative protection on all data points reduces its practical utility.
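For reference, standard DP-SGD applies the same per-example clipping and noising to every data point; below is a minimal numpy sketch of that uniform step (clip_norm and noise_multiplier are illustrative, and the adaptive per-sample budgeting this paper proposes is not shown).

    import numpy as np

    def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.0):
        # Clip each example's gradient to clip_norm, sum, then add
        # Gaussian noise calibrated to the clip bound (standard DP-SGD).
        clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
                   for g in per_example_grads]
        total = np.sum(clipped, axis=0)
        noise = np.random.normal(0.0, noise_multiplier * clip_norm,
                                 size=total.shape)
        return (total + noise) / len(per_example_grads)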

Language Modelling

Chinese WPLC: A Chinese Dataset for Evaluating Pretrained Language Models on Word Prediction Given Long-Range Context

no code implementations EMNLP 2021 Huibin Ge, Chenxi Sun, Deyi Xiong, Qun Liu

Experiment results show that the Chinese pretrained language model PanGu-α is 45 points behind humans in terms of top-1 word prediction accuracy, indicating that Chinese WPLC is a challenging dataset.

Language Modelling

TED-CDB: A Large-Scale Chinese Discourse Relation Dataset on TED Talks

no code implementations EMNLP 2020 Wanqiu Long, Bonnie Webber, Deyi Xiong

As different genres are known to differ in their communicative properties, and as discourse relations in Chinese have previously been annotated only over news text, we have created the TED-CDB dataset.

Relation Transfer Learning

Evaluating Discourse Cohesion in Pre-trained Language Models

no code implementations COLING (CODI, CRAC) 2022 Jie He, Wanqiu Long, Deyi Xiong

Large pre-trained neural models have achieved remarkable success in natural language processing (NLP), inspiring a growing body of research analyzing their ability from different aspects.

CoDoNMT: Modeling Cohesion Devices for Document-Level Neural Machine Translation

1 code implementation COLING 2022 Yikun Lei, Yuqi Ren, Deyi Xiong

In this paper, we propose a document-level neural machine translation framework, CoDoNMT, which models cohesion devices from two perspectives: Cohesion Device Masking (CoDM) and Cohesion Attention Focusing (CoAF).

Machine Translation NMT +2

KaFSP: Knowledge-Aware Fuzzy Semantic Parsing for Conversational Question Answering over a Large-Scale Knowledge Base

1 code implementation ACL 2022 Junzhuo Li, Deyi Xiong

In this paper, we study two issues of semantic parsing approaches to conversational question answering over a large-scale knowledge base: (1) The actions defined in grammar are not sufficient to handle uncertain reasoning common in real-world scenarios.

Conversational Question Answering Entity Disambiguation +2

Learning Structural Information for Syntax-Controlled Paraphrase Generation

no code implementations Findings (NAACL) 2022 Erguang Yang, Chenglin Bai, Deyi Xiong, Yujie Zhang, Yao Meng, Jinan Xu, Yufeng Chen

To model the alignment relation between words and nodes, we propose an attention regularization objective, which makes the decoder accurately select corresponding syntax nodes to guide the generation of words. Experiments show that SI-SCP achieves state-of-the-art performances in terms of semantic and syntactic quality on two popular benchmark datasets. Additionally, we propose a Syntactic Template Retriever (STR) to retrieve compatible syntactic structures.

Paraphrase Generation Relation

ParaZh-22M: A Large-Scale Chinese Parabank via Machine Translation

no code implementations COLING 2022 Wenjie Hao, Hongfei Xu, Deyi Xiong, Hongying Zan, Lingling Mu

Paraphrasing, i.e., restating the same meaning in different ways, is an important data augmentation approach for natural language processing (NLP).

Data Augmentation Machine Translation +3

LHMKE: A Large-scale Holistic Multi-subject Knowledge Evaluation Benchmark for Chinese Large Language Models

no code implementations 19 Mar 2024 Chuang Liu, Renren Jin, Yuqi Ren, Deyi Xiong

Current datasets collect questions from Chinese examinations across different subjects and educational levels to address this issue.

Multiple-choice

OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety

no code implementations 18 Mar 2024 Chuang Liu, Linhao Yu, Jiaxuan Li, Renren Jin, Yufei Huang, Ling Shi, Junhui Zhang, Xinmeng Ji, Tingting Cui, Tao Liu, Jinwang Song, Hongying Zan, Sun Li, Deyi Xiong

In addition to these benchmarks, we have implemented a phased public evaluation and benchmark update strategy to ensure that OpenEval is in line with the development of Chinese LLMs or even able to provide cutting-edge benchmark datasets to guide the development of Chinese LLMs.

Benchmarking Mathematical Reasoning

FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models

no code implementations 12 Mar 2024 Yan Liu, Renren Jin, Lin Shi, Zheng Yao, Deyi Xiong

We conduct extensive experiments on a wide range of LLMs on FineMath and find that there is still considerable room for improvements in terms of mathematical reasoning capability of Chinese LLMs.

Math Mathematical Reasoning

Exploring Multilingual Human Value Concepts in Large Language Models: Is Value Alignment Consistent, Transferable and Controllable across Languages?

1 code implementation 28 Feb 2024 Shaoyang Xu, Weilong Dong, Zishan Guo, Xinwei Wu, Deyi Xiong

Drawing from our findings on multilingual value alignment, we prudently provide suggestions on the composition of multilingual data for LLMs pre-training: including a limited number of dominant languages for cross-lingual alignment transfer while avoiding their excessive prevalence, and keeping a balanced distribution of non-dominant languages.

Cross-Lingual Transfer Philosophy

Do Large Language Models Mirror Cognitive Language Processing?

no code implementations 28 Feb 2024 Yuqi Ren, Renren Jin, Tongxuan Zhang, Deyi Xiong

We employ Representational Similarity Analysis (RSA) to measure the alignment between 16 mainstream LLMs and fMRI signals of the brain.
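A minimal sketch of RSA over a shared stimulus set, assuming common metric choices (correlation distance within each space, Spearman correlation between the two dissimilarity matrices) rather than the paper's exact configuration:

    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    def rsa_alignment(llm_reprs, fmri_reprs):
        # One dissimilarity matrix per representation space over the same
        # stimuli, then a rank correlation of their condensed forms.
        rdm_llm = pdist(llm_reprs, metric="correlation")
        rdm_fmri = pdist(fmri_reprs, metric="correlation")
        rho, _ = spearmanr(rdm_llm, rdm_fmri)
        return rho

    # toy usage: 20 stimuli, 64-dim LLM states vs. 1000 fMRI voxels
    rho = rsa_alignment(np.random.randn(20, 64), np.random.randn(20, 1000))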

Chatbot Logical Reasoning +1

A Comprehensive Evaluation of Quantization Strategies for Large Language Models

no code implementations 26 Feb 2024 Renren Jin, Jiangcun Du, Wuwei Huang, Wei Liu, Jian Luan, Bin Wang, Deyi Xiong

Our experimental results indicate that LLMs with 4-bit quantization can retain performance comparable to their non-quantized counterparts, and perplexity can serve as a proxy metric for quantized LLMs on most benchmarks.
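Perplexity is simply the exponentiated mean negative log-likelihood per token, which is what makes it cheap to track across quantization settings; a minimal sketch:

    import math

    def perplexity(token_log_probs):
        # exp of the mean negative log-likelihood over held-out text;
        # compare before/after quantization as a proxy for benchmarks.
        nll = -sum(token_log_probs) / len(token_log_probs)
        return math.exp(nll)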

Language Modelling Quantization

RoleEval: A Bilingual Role Evaluation Benchmark for Large Language Models

1 code implementation 26 Dec 2023 Tianhao Shen, Sun Li, Quan Tu, Deyi Xiong

We expect that RoleEval would highlight the significance of assessing role knowledge for large language models across various languages and cultural settings.

Memorization Multiple-choice

CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models

no code implementations 20 Dec 2023 Dan Shi, Chaobin You, Jiantao Huang, Taihao Li, Deyi Xiong

With these pre-defined domains and slots, we collect 76,787 commonsense knowledge annotations from 19,700 dialogues through crowdsourcing.

Causal Inference Common Sense Reasoning

AI-driven emergence of frequency information non-uniform distribution via THz metasurface spectrum prediction

no code implementations 5 Dec 2023 Xiaohua Xing, Yuqi Ren, Die Zou, Qiankun Zhang, Bingxuan Mao, Jianquan Yao, Deyi Xiong, Shuang Zhang, Liang Wu

Recently, artificial intelligence has been extensively deployed across various scientific disciplines, optimizing and guiding the progression of experiments through the integration of abundant datasets, whilst continuously probing the vast theoretical space encapsulated within the data.

Is Robustness Transferable across Languages in Multilingual Neural Machine Translation?

no code implementations 31 Oct 2023 Leiyu Pan, Supryadi, Deyi Xiong

In particular, we use character-, word-, and multi-level noises to attack the specific translation direction of the multilingual neural machine translation model and evaluate the robustness of other translation directions.

Data Augmentation Machine Translation +1

Towards a Deep Understanding of Multilingual End-to-End Speech Translation

1 code implementation 31 Oct 2023 Haoran Sun, Xiaohu Zhao, Yikun Lei, Shaolin Zhu, Deyi Xiong

In this paper, we employ Singular Value Canonical Correlation Analysis (SVCCA) to analyze representations learnt in a multilingual end-to-end speech translation model trained over 22 languages.

Machine Translation Translation

DEPN: Detecting and Editing Privacy Neurons in Pretrained Language Models

1 code implementation 31 Oct 2023 Xinwei Wu, Junzhuo Li, Minghui Xu, Weilong Dong, Shuangzhi Wu, Chao Bian, Deyi Xiong

The ability of data memorization and regurgitation in pretrained language models, revealed in previous studies, brings the risk of data leakage.

Memorization Model Editing

Evaluating Large Language Models: A Comprehensive Survey

1 code implementation 30 Oct 2023 Zishan Guo, Renren Jin, Chuang Liu, Yufei Huang, Dan Shi, Supryadi, Linhao Yu, Yan Liu, Jiaxuan Li, Bojian Xiong, Deyi Xiong

We hope that this comprehensive overview will stimulate further research interests in the evaluation of LLMs, with the ultimate goal of making evaluation serve as a cornerstone in guiding the responsible development of LLMs.

Large Language Model Alignment: A Survey

no code implementations 26 Sep 2023 Tianhao Shen, Renren Jin, Yufei Huang, Chuang Liu, Weilong Dong, Zishan Guo, Xinwei Wu, Yan Liu, Deyi Xiong

We also envision bridging the gap between the AI alignment research community and researchers focused on exploring the capabilities of LLMs, toward LLMs that are both capable and safe.

Language Modelling Large Language Model

Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy

1 code implementation 25 Jul 2023 Yu Fu, Deyi Xiong, Yue Dong

To mitigate potential risks associated with language models, recent AI detection research proposes incorporating watermarks into machine-generated text through random vocabulary restrictions and utilizing this information for detection.
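A minimal sketch of that kind of random vocabulary restriction (a "green list" watermark), assuming the common scheme of seeding the partition with the previous token; gamma and delta are illustrative, and the semantic-aware remedy the paper proposes is not captured here.

    import hashlib
    import numpy as np

    def greenlist_logit_bias(prev_token_id, vocab_size, gamma=0.5, delta=2.0):
        # Seed an RNG with the previous token, mark a gamma-fraction of
        # the vocabulary "green", and bias next-step logits toward it.
        digest = hashlib.sha256(str(prev_token_id).encode()).hexdigest()
        rng = np.random.default_rng(int(digest, 16) % (2**32))
        green = rng.random(vocab_size) < gamma
        return np.where(green, delta, 0.0)

    # A detector re-derives the same green lists from the text and runs
    # a z-test on the observed fraction of green tokens.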

Conditional Text Generation Data-to-Text Generation

CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models

1 code implementation 28 Jun 2023 Yufei Huang, Deyi Xiong

In this work, we present a Chinese Bias Benchmark dataset that consists of over 100K questions jointly constructed by human experts and generative language models, covering stereotypes and societal biases in 14 social dimensions related to Chinese culture and values.

Inverse Reinforcement Learning for Text Summarization

no code implementations 19 Dec 2022 Yu Fu, Deyi Xiong, Yue Dong

We introduce inverse reinforcement learning (IRL) as an effective paradigm for training abstractive summarization models, imitating human summarization behaviors.

Abstractive Text Summarization reinforcement-learning +1

FewFedWeight: Few-shot Federated Learning Framework across Multiple NLP Tasks

no code implementations 16 Dec 2022 Weilong Dong, Xinwei Wu, Junzhuo Li, Shuangzhi Wu, Chao Bian, Deyi Xiong

It broadcasts the global model in the server to each client and produces pseudo data for clients so that knowledge from the global model can be explored to enhance few-shot learning of each client model.

Federated Learning Few-Shot Learning +1

NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual Question Answering

no code implementations 7 Nov 2022 Tengxun Zhang, Hongfei Xu, Josef van Genabith, Deyi Xiong, Hongying Zan

Hybrid tabular-textual question answering (QA) requires reasoning from heterogeneous information, and the types of reasoning are mainly divided into numerical reasoning and span extraction.

Question Answering

Informative Language Representation Learning for Massively Multilingual Neural Machine Translation

1 code implementation COLING 2022 Renren Jin, Deyi Xiong

Experiment results on two datasets for massively multilingual neural machine translation demonstrate that language-aware multi-head attention benefits both supervised and zero-shot translation and significantly alleviates the off-target translation issue.

Machine Translation Navigate +2

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

no code implementations 22 Jun 2022 Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou

This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.

Benchmarking Text Generation

Unsupervised and Few-shot Parsing from Pretrained Language Models

no code implementations 10 Jun 2022 Zhiyuan Zeng, Deyi Xiong

We therefore extend the unsupervised models to few-shot parsing models (FPOA, FPIO) that use a few annotated trees to learn better linear projection matrices for parsing.

Language Modelling

Efficient Cluster-Based k-Nearest-Neighbor Machine Translation

2 code implementations ACL 2022 Dexin Wang, Kai Fan, Boxing Chen, Deyi Xiong

k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).
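For context, the vanilla kNN-MT interpolation that cluster-based variants aim to accelerate can be sketched as follows (lam and temperature are illustrative hyperparameters, and the clustering itself is not shown):

    import numpy as np

    def knn_mt_interpolate(nmt_probs, neighbor_tokens, neighbor_dists,
                           lam=0.5, temperature=10.0):
        # Turn retrieved (token, distance) pairs into a distribution and
        # mix it with the parametric NMT distribution over the vocabulary.
        weights = np.exp(-np.asarray(neighbor_dists, dtype=float) / temperature)
        weights /= weights.sum()
        knn_probs = np.zeros_like(nmt_probs)
        for tok, w in zip(neighbor_tokens, weights):
            knn_probs[tok] += w
        return lam * knn_probs + (1.0 - lam) * nmt_probs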

Contrastive Learning Domain Adaptation +4

Bridging between Cognitive Processing Signals and Linguistic Features via a Unified Attentional Network

no code implementations 16 Dec 2021 Yuqi Ren, Deyi Xiong

The proposed framework only requires cognitive processing signals recorded under natural reading as inputs, and can be used to detect a wide range of linguistic features with a single cognitive dataset.

Sentence

Secoco: Self-Correcting Encoding for Neural Machine Translation

no code implementations Findings (EMNLP) 2021 Tao Wang, Chengqi Zhao, Mingxuan Wang, Lei LI, Hang Li, Deyi Xiong

This paper presents Self-correcting Encoding (Secoco), a framework that effectively deals with input noise for robust neural machine translation by introducing self-correcting predictors.

Machine Translation NMT +1

An Empirical Study on Adversarial Attack on NMT: Languages and Positions Matter

no code implementations ACL 2021 Zhiyuan Zeng, Deyi Xiong

For autoregressive NMT models that generate target words from left to right, we observe that adversarial attack on the source language is more effective than on the target language, and that attacking front positions of target sentences or positions of source sentences aligned to the front positions of corresponding target sentences is more effective than attacking other positions.

Adversarial Attack NMT

Multi-Head Highly Parallelized LSTM Decoder for Neural Machine Translation

no code implementations ACL 2021 Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Meng Zhang

This has to be computed n times for a sequence of length n. The linear transformations involved in the LSTM gate and state computations are the major cost factors in this.

Machine Translation Translation

TGEA: An Error-Annotated Dataset and Benchmark Tasks for TextGeneration from Pretrained Language Models

no code implementations ACL 2021 Jie He, Bo Peng, Yi Liao, Qun Liu, Deyi Xiong

Each error is hence manually labeled with comprehensive annotations, including the span of the error, the associated span, minimal correction to the error, the type of the error, and rationale behind the error.

Common Sense Reasoning Text Generation

CogAlign: Learning to Align Textual Neural Representations to Cognitive Language Processing Signals

1 code implementation ACL 2021 Yuqi Ren, Deyi Xiong

Most previous studies integrate cognitive language processing signals (e.g., eye-tracking or EEG data) into neural models of natural language processing (NLP) just by directly concatenating word embeddings with cognitive features, ignoring the gap between the two modalities (i.e., textual vs. cognitive) and noise in cognitive features.

EEG Electroencephalogram (EEG) +6

Enhanced Aspect-Based Sentiment Analysis Models with Progressive Self-supervised Attention Learning

1 code implementation 5 Mar 2021 Jinsong Su, Jialong Tang, Hui Jiang, Ziyao Lu, Yubin Ge, Linfeng Song, Deyi Xiong, Le Sun, Jiebo Luo

In aspect-based sentiment analysis (ABSA), many neural models are equipped with an attention mechanism to quantify the contribution of each context word to sentiment prediction.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA)

Integrating Pre-trained Model into Rule-based Dialogue Management

no code implementations 17 Feb 2021 Jun Quan, Meng Yang, Qiang Gan, Deyi Xiong, Yiming Liu, Yuchen Dong, Fangxin Ouyang, Jun Tian, Ruiling Deng, Yongzhi Li, Yang Yang, Daxin Jiang

Rule-based dialogue management is still the most popular solution for industrial task-oriented dialogue systems due to its interpretability.

Dialogue Management Management +1

Efficient Object-Level Visual Context Modeling for Multimodal Machine Translation: Masking Irrelevant Objects Helps Grounding

no code implementations 18 Dec 2020 Dexin Wang, Deyi Xiong

In this paper, we propose an object-level visual context modeling framework (OVC) to efficiently capture and explore visual information for multimodal machine translation.

Multimodal Machine Translation Object +1

Balanced Joint Adversarial Training for Robust Intent Detection and Slot Filling

no code implementations COLING 2020 Xu Cao, Deyi Xiong, Chongyang Shi, Chao Wang, Yao Meng, Changjian Hu

Joint intent detection and slot filling has recently achieved tremendous success in advancing the performance of utterance understanding.

Intent Detection slot-filling +1

A Learning-Exploring Method to Generate Diverse Paraphrases with Multi-Objective Deep Reinforcement Learning

no code implementations COLING 2020 Mingtong Liu, Erguang Yang, Deyi Xiong, Yujie Zhang, Yao Meng, Changjian Hu, Jinan Xu, Yufeng Chen

We propose a learning-exploring method to generate sentences as learning objectives from the learned data distribution, and employ reinforcement learning to combine these new learning objectives for model training.

Paraphrase Generation Reinforcement Learning (RL)

The Box is in the Pen: Evaluating Commonsense Reasoning in Neural Machine Translation

1 code implementation Findings of the Association for Computational Linguistics 2020 Jie He, Tao Wang, Deyi Xiong, Qun Liu

Our experiments and analyses demonstrate that neural machine translation performs poorly on commonsense reasoning of the three ambiguity types in terms of both reasoning accuracy (60.1%) and reasoning consistency (31%).

Common Sense Reasoning Machine Translation +2

RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset with Rich Semantic Annotations for Task-Oriented Dialogue Modeling

1 code implementation EMNLP 2020 Jun Quan, Shian Zhang, Qian Cao, Zizhong Li, Deyi Xiong

In order to alleviate the shortage of multi-domain data and to capture discourse phenomena for task-oriented dialogue modeling, we propose RiSAWOZ, a large-scale multi-domain Chinese Wizard-of-Oz dataset with Rich Semantic Annotations.

Dialogue State Tracking Intent Detection +4

Transformer with Depth-Wise LSTM

no code implementations 13 Jul 2020 Hongfei Xu, Qiuhui Liu, Deyi Xiong, Josef van Genabith

In this paper, we suggest that the residual connection has drawbacks, and propose to train Transformers with a depth-wise LSTM that regards the outputs of layers as steps in a time series instead of using residual connections. The motivation is that the vanishing gradient problem suffered by deep networks is the same as that of recurrent networks applied to long sequences, while the LSTM (Hochreiter and Schmidhuber, 1997) has proven capable of capturing long-distance relationships; its design may alleviate some drawbacks of residual connections while ensuring convergence.
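A minimal PyTorch sketch of the idea, with generic callables standing in for the paper's Transformer sublayers; the paper's exact computation order and parameterization may differ.

    import torch
    from torch import nn

    class DepthWiseLSTMStack(nn.Module):
        # Treat the outputs of successive layers as time steps of an
        # LSTM, instead of wiring layers together with residual adds.
        def __init__(self, layers, d_model):
            super().__init__()
            self.layers = nn.ModuleList(layers)
            self.cell = nn.LSTMCell(d_model, d_model)

        def forward(self, x):                       # x: (batch, d_model)
            h = torch.zeros_like(x)
            c = torch.zeros_like(x)
            for layer in self.layers:
                h, c = self.cell(layer(x), (h, c))  # depth acts as time
                x = h
            return x

    # toy usage, linear layers standing in for Transformer sublayers
    stack = DepthWiseLSTMStack([nn.Linear(512, 512) for _ in range(6)], 512)
    out = stack(torch.randn(8, 512))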

Time Series Analysis

Learning Source Phrase Representations for Neural Machine Translation

no code implementations ACL 2020 Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu, Jingyi Zhang

Considering that modeling phrases instead of words has significantly improved the Statistical Machine Translation (SMT) approach through the use of larger translation blocks ("phrases") and its reordering ability, modeling NMT at phrase level is an intuitive proposal to help the model capture long-distance relationships.

Machine Translation NMT +1

Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change

no code implementations ACL 2020 Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu

We propose to automatically and dynamically determine batch sizes by accumulating gradients of mini-batches and performing an optimization step at just the time when the direction of gradients starts to fluctuate.
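One plausible reading of the fluctuation test, as a hedged sketch (the cosine criterion and threshold are our assumptions, not the paper's published recipe):

    import numpy as np

    def should_step(accum_grad, new_grad, cos_threshold=0.0):
        # Keep accumulating mini-batch gradients while the newest one
        # still points the way of the running sum; step once it doesn't.
        denom = np.linalg.norm(accum_grad) * np.linalg.norm(new_grad) + 1e-12
        cosine = float(accum_grad @ new_grad) / denom
        return cosine < cos_threshold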

Modeling Long Context for Task-Oriented Dialogue State Generation

no code implementations ACL 2020 Jun Quan, Deyi Xiong

Based on the recently proposed transferable dialogue state generator (TRADE) that predicts dialogue states from utterance-concatenated dialogue context, we propose a multi-task learning model with a simple yet effective utterance tagging technique and a bidirectional language model as an auxiliary task for task-oriented dialogue state generation.

Language Modelling Multi-Task Learning

Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation

no code implementations 30 Mar 2020 Pei Zhang, Xu Zhang, Wei Chen, Jian Yu, Yan-Feng Wang, Deyi Xiong

In this paper, we propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.

Document Level Machine Translation Machine Translation +4

Probing Word Translations in the Transformer and Trading Decoder for Encoder Layers

no code implementations NAACL 2021 Hongfei Xu, Josef van Genabith, Qiuhui Liu, Deyi Xiong

Due to its effectiveness and performance, the Transformer translation model has attracted wide attention, most recently in terms of probing-based approaches.

Translation Word Translation

Shallow Discourse Annotation for Chinese TED Talks

1 code implementation LREC 2020 Wanqiu Long, Xinyi Cai, James E. M. Reid, Bonnie Webber, Deyi Xiong

Text corpora annotated with language-related properties are an important resource for the development of Language Technology.

Translation

Effective Data Augmentation Approaches to End-to-End Task-Oriented Dialogue

no code implementations 5 Dec 2019 Jun Quan, Deyi Xiong

The training of task-oriented dialogue systems is often confronted with the lack of annotated data.

Data Augmentation Sentence +1

Merging External Bilingual Pairs into Neural Machine Translation

no code implementations 2 Dec 2019 Tao Wang, Shaohui Kuang, Deyi Xiong, António Branco

As neural machine translation (NMT) is not easily amenable to explicit correction of errors, incorporating pre-specified translations into NMT is widely regarded as a non-trivial challenge.

Machine Translation NMT +1

Learning to Reuse Translations: Guiding Neural Machine Translation with Examples

no code implementations 25 Nov 2019 Qian Cao, Shaohui Kuang, Deyi Xiong

In this paper, we study the problem of enabling neural machine translation (NMT) to reuse previous translations from similar examples in target prediction.

Machine Translation NMT +1

Lipschitz Constrained Parameter Initialization for Deep Transformers

no code implementations ACL 2020 Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Jingyi Zhang

In this paper, we first empirically demonstrate that a simple modification made in the official implementation, which changes the computation order of residual connection and layer normalization, can significantly ease the optimization of deep Transformers.

Translation

BiPaR: A Bilingual Parallel Dataset for Multilingual and Cross-lingual Reading Comprehension on Novels

1 code implementation IJCNLP 2019 Yimin Jing, Deyi Xiong, Yan Zhen

We analyze BiPaR in depth and find that BiPaR offers good diversification in prefixes of questions, answer types and relationships between questions and passages.

coreference-resolution Machine Reading Comprehension +1

Generating Highly Relevant Questions

no code implementations IJCNLP 2019 Jiazuo Qiu, Deyi Xiong

The neural seq2seq based question generation (QG) is prone to generating generic and undiversified questions that are poorly relevant to the given passage and target answer.

Question Generation Question-Generation

Towards Linear Time Neural Machine Translation with Capsule Networks

no code implementations IJCNLP 2019 Mingxuan Wang, Jun Xie, Zhixing Tan, Jinsong Su, Deyi Xiong, Lei LI

In this study, we first investigate a novel capsule network with dynamic routing for linear-time Neural Machine Translation (NMT), referred to as CapsNMT.

Machine Translation NMT +2

Simplifying Neural Machine Translation with Addition-Subtraction Twin-Gated Recurrent Networks

3 code implementations EMNLP 2018 Biao Zhang, Deyi Xiong, Jinsong Su, Qian Lin, Huiji Zhang

Experiments on WMT14 translation tasks demonstrate that ATR-based neural machine translation can yield competitive performance on English-German and English-French language pairs in terms of both translation quality and speed.

Chinese Word Segmentation Machine Translation +2

Encoding Gated Translation Memory into Neural Machine Translation

no code implementations EMNLP 2018 Qian Cao, Deyi Xiong

Translation memories (TM) facilitate human translators to reuse existing repetitive translation fragments.

Machine Translation NMT +3

Sentence Weighting for Neural Machine Translation Domain Adaptation

no code implementations COLING 2018 Shiqi Zhang, Deyi Xiong

In this paper, we propose a new sentence weighting method for the domain adaptation of neural machine translation.

Domain Adaptation Language Modelling +3

Fusing Recency into Neural Machine Translation with an Inter-Sentence Gate Model

no code implementations COLING 2018 Shaohui Kuang, Deyi Xiong

Neural machine translation (NMT) systems are usually trained on a large amount of bilingual sentence pairs and translate one sentence at a time, ignoring inter-sentence information.

Machine Translation NMT +2

Accelerating Neural Transformer via an Average Attention Network

1 code implementation ACL 2018 Biao Zhang, Deyi Xiong, Jinsong Su

To alleviate this issue, we propose an average attention network as an alternative to the self-attention network in the decoder of the neural Transformer.
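The core of the average attention network is a cumulative average over preceding positions, which costs O(1) per decoding step; the full model adds a feed-forward transform and gating on top, omitted in this minimal sketch.

    import numpy as np

    def average_attention(decoder_inputs):
        # Position j attends uniformly to all positions <= j, replacing
        # decoder self-attention with a running mean.
        cum = np.cumsum(decoder_inputs, axis=0)
        steps = np.arange(1, decoder_inputs.shape[0] + 1)[:, None]
        return cum / steps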

Machine Translation Translation

Variational Recurrent Neural Machine Translation

no code implementations 16 Jan 2018 Jinsong Su, Shan Wu, Deyi Xiong, Yaojie Lu, Xianpei Han, Biao Zhang

Partially inspired by successful applications of variational recurrent neural networks, we propose a novel variational recurrent neural machine translation (VRNMT) model in this paper.

Machine Translation NMT +2

Attention Focusing for Neural Machine Translation by Bridging Source and Target Embeddings

no code implementations ACL 2018 Shaohui Kuang, Junhui Li, António Branco, Weihua Luo, Deyi Xiong

In neural machine translation, a source sequence of words is encoded into a vector from which a target sequence is generated in the decoding phase.

Machine Translation Sentence +2

Modeling Source Syntax for Neural Machine Translation

no code implementations ACL 2017 Junhui Li, Deyi Xiong, Zhaopeng Tu, Muhua Zhu, Min Zhang, Guodong Zhou

Even though a linguistics-free sequence to sequence model in neural machine translation (NMT) has certain capability of implicitly learning syntactic information of source sentences, this paper shows that source syntax can be explicitly incorporated into NMT effectively to provide further improvements.

Machine Translation NMT +1

A GRU-Gated Attention Model for Neural Machine Translation

no code implementations 27 Apr 2017 Biao Zhang, Deyi Xiong, Jinsong Su

In this paper, we propose a novel GRU-gated attention model (GAtt) for NMT which enhances the degree of discrimination of context vectors by enabling source representations to be sensitive to the partial translation generated by the decoder.

Machine Translation NMT +1

Learning Event Expressions via Bilingual Structure Projection

no code implementations COLING 2016 Fangyuan Li, Ruihong Huang, Deyi Xiong, Min Zhang

Aiming to resolve high complexities of event descriptions, previous work (Huang and Riloff, 2013) proposes multi-faceted event recognition and a bootstrapping method to automatically acquire both event facet phrases and event expressions from unannotated texts.

Improving Translation Selection with Supersenses

no code implementations COLING 2016 Haiqing Tang, Deyi Xiong, Oier Lopez de Lacalle, Eneko Agirre

Selecting appropriate translations for source words with multiple meanings still remains a challenge for statistical machine translation (SMT).

Machine Translation Translation +1

Improving Statistical Machine Translation with Selectional Preferences

no code implementations COLING 2016 Haiqing Tang, Deyi Xiong, Min Zhang, ZhengXian Gong

In this paper, we study semantic dependencies between verbs and their arguments by modeling selectional preferences in the context of machine translation.

Machine Translation Semantic Role Labeling +2

Neural Machine Translation Advised by Statistical Machine Translation

no code implementations 17 Oct 2016 Xing Wang, Zhengdong Lu, Zhaopeng Tu, Hang Li, Deyi Xiong, Min Zhang

Neural Machine Translation (NMT) is a new approach to machine translation that has made great progress in recent years.

Machine Translation NMT +1

Lattice-Based Recurrent Neural Network Encoders for Neural Machine Translation

no code implementations 25 Sep 2016 Jinsong Su, Zhixing Tan, Deyi Xiong, Rongrong Ji, Xiaodong Shi, Yang Liu

Neural machine translation (NMT) heavily relies on word-level modelling to learn semantic representations of input sentences.

Machine Translation NMT +2

Cseq2seq: Cyclic Sequence-to-Sequence Learning

no code implementations 29 Jul 2016 Biao Zhang, Deyi Xiong, Jinsong Su

The vanilla sequence-to-sequence learning (seq2seq) reads and encodes a source sequence into a fixed-length vector only once, suffering from its insufficiency in modeling structural correspondence between the source and target sequence.

Machine Translation Translation +1

BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings

1 code implementation 25 May 2016 Biao Zhang, Deyi Xiong, Jinsong Su

In this paper, we propose a bidimensional attention based recursive autoencoder (BattRAE) to integrate clues and source-target interactions at multiple levels of granularity into bilingual phrase representations.

Semantic Similarity Semantic Textual Similarity

Variational Neural Machine Translation

1 code implementation EMNLP 2016 Biao Zhang, Deyi Xiong, Jinsong Su, Hong Duan, Min Zhang

Models of neural machine translation are often from a discriminative family of encoder-decoders that learn a conditional distribution of a target sentence given a source sentence.

Machine Translation Sentence +1

Variational Neural Discourse Relation Recognizer

1 code implementation EMNLP 2016 Biao Zhang, Deyi Xiong, Jinsong Su, Qun Liu, Rongrong Ji, Hong Duan, Min Zhang

In order to perform efficient inference and learning, we introduce neural discourse relation models to approximate the prior and posterior distributions of the latent variable, and employ these approximated distributions to optimize a reparameterized variational lower bound.
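In generic notation (symbols ours, not necessarily the paper's exact factorization), such a reparameterized variational lower bound takes the standard form

    \log p_\theta(y \mid x) \;\ge\;
      \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[\log p_\theta(y \mid x, z)\right]
      - \mathrm{KL}\!\left(q_\phi(z \mid x, y) \,\|\, p_\theta(z \mid x)\right),

made differentiable via the reparameterization z = \mu_\phi + \sigma_\phi \odot \epsilon with \epsilon \sim \mathcal{N}(0, I).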

Relation

Neural Discourse Relation Recognition with Semantic Memory

no code implementations 12 Mar 2016 Biao Zhang, Deyi Xiong, Jinsong Su

Inspired by this, we propose a neural recognizer for implicit discourse relation analysis, which builds upon a semantic memory that stores knowledge in a distributed fashion.

General Knowledge Relation +1
