Search Results for author: Deyi Xiong

Found 145 papers, 45 papers with code

Learning Structural Information for Syntax-Controlled Paraphrase Generation

no code implementations Findings (NAACL) 2022 Erguang Yang, Chenglin Bai, Deyi Xiong, Yujie Zhang, Yao Meng, Jinan Xu, Yufeng Chen

To model the alignment relation between words and nodes, we propose an attention regularization objective, which makes the decoder accurately select corresponding syntax nodes to guide the generation of words. Experiments show that SI-SCP achieves state-of-the-art performances in terms of semantic and syntactic quality on two popular benchmark datasets. Additionally, we propose a Syntactic Template Retriever (STR) to retrieve compatible syntactic structures.

Decoder Paraphrase Generation +1

TED-CDB: A Large-Scale Chinese Discourse Relation Dataset on TED Talks

no code implementations EMNLP 2020 Wanqiu Long, Bonnie Webber, Deyi Xiong

As different genres are known to differ in their communicative properties and as previously, for Chinese, discourse relations have only been annotated over news text, we have created the TED-CDB dataset.

Relation Transfer Learning

Adaptive Differential Privacy for Language Model Training

no code implementations FL4NLP (ACL) 2022 Xinwei Wu, Li Gong, Deyi Xiong

Although differential privacy (DP) can protect language models from leaking privacy, its indiscriminative protection on all data points reduces its practical utility.

Language Modeling Language Modelling +1

A Test Suite for Evaluating Discourse Phenomena in Document-level Neural Machine Translation

no code implementations AACL (iwdp) 2020 Xinyi Cai, Deyi Xiong

The need to evaluate the ability of context-aware neural machine translation (NMT) models in dealing with specific discourse phenomena arises in document-level NMT.

Machine Translation NMT +1

Re-embedding Difficult Samples via Mutual Information Constrained Semantically Oversampling for Imbalanced Text Classification

no code implementations EMNLP 2021 Jiachen Tian, Shizhan Chen, Xiaowang Zhang, Zhiyong Feng, Deyi Xiong, Shaojuan Wu, Chunliu Dou

Difficult samples of the minority class in imbalanced text classification are usually hard to classify, as they are embedded into an overlapping semantic region with the majority class.

Decoder text-classification +1

CoDoNMT: Modeling Cohesion Devices for Document-Level Neural Machine Translation

1 code implementation COLING 2022 Yikun Lei, Yuqi Ren, Deyi Xiong

In this paper, we propose a document-level neural machine translation framework, CoDoNMT, which models cohesion devices from two perspectives: Cohesion Device Masking (CoDM) and Cohesion Attention Focusing (CoAF).

Machine Translation NMT +2

ParaZh-22M: A Large-Scale Chinese Parabank via Machine Translation

no code implementations COLING 2022 Wenjie Hao, Hongfei Xu, Deyi Xiong, Hongying Zan, Lingling Mu

Paraphrasing, i.e., restating the same meaning in different ways, is an important data augmentation approach for natural language processing (NLP).

Data Augmentation Machine Translation +3

KaFSP: Knowledge-Aware Fuzzy Semantic Parsing for Conversational Question Answering over a Large-Scale Knowledge Base

1 code implementation ACL 2022 Junzhuo Li, Deyi Xiong

In this paper, we study two issues of semantic parsing approaches to conversational question answering over a large-scale knowledge base: (1) The actions defined in grammar are not sufficient to handle uncertain reasoning common in real-world scenarios.

Conversational Question Answering Entity Disambiguation +3

Chinese WPLC: A Chinese Dataset for Evaluating Pretrained Language Models on Word Prediction Given Long-Range Context

no code implementations EMNLP 2021 Huibin Ge, Chenxi Sun, Deyi Xiong, Qun Liu

Experiment results show that the Chinese pretrained language model PanGu-α is 45 points behind humans in terms of top-1 word prediction accuracy, indicating that Chinese WPLC is a challenging dataset.

Diversity Language Modeling +1

AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation

no code implementations Findings (ACL) 2021 Wuwei Huang, Dexin Wang, Deyi Xiong

In end-to-end speech translation, acoustic representations learned by the encoder are usually fixed and static, from the perspective of the decoder, which is not desirable for dealing with the cross-modal and cross-lingual challenge in speech translation.

Decoder Speech-to-Text Translation +1

Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation

no code implementations 14 Mar 2025 Wuwei Huang, Renren Jin, Wen Zhang, Jian Luan, Bin Wang, Deyi Xiong

Recent studies on end-to-end speech translation (ST) have facilitated the exploration of multilingual end-to-end ST and end-to-end simultaneous ST.

Decoder Transfer Learning +1

Evaluating Discourse Cohesion in Pre-trained Language Models

no code implementations COLING (CODI, CRAC) 2022 Jie He, Wanqiu Long, Deyi Xiong

Large pre-trained neural models have achieved remarkable success in natural language processing (NLP), inspiring a growing body of research analyzing their ability from different aspects.

The Box is in the Pen: Evaluating Commonsense Reasoning in Neural Machine Translation

1 code implementation Findings of the Association for Computational Linguistics 2020 Jie He, Tao Wang, Deyi Xiong, Qun Liu

Our experiments and analyses demonstrate that neural machine translation performs poorly on commonsense reasoning of the three ambiguity types in terms of both reasoning accuracy (60.1%) and reasoning consistency (31%).

Common Sense Reasoning Machine Translation +2

ProBench: Benchmarking Large Language Models in Competitive Programming

no code implementations 28 Feb 2025 Lei Yang, Renren Jin, Ling Shi, Jianxiang Peng, Yue Chen, Deyi Xiong

To bridge the gap for high-level code reasoning assessment, we propose ProBench to benchmark LLMs in competitive programming, drawing inspiration from the International Collegiate Programming Contest.

Attribute Benchmarking +1

Finedeep: Mitigating Sparse Activation in Dense LLMs via Multi-Layer Fine-Grained Experts

no code implementations 18 Feb 2025 Leiyu Pan, Zhenpeng Su, Minxuan Lv, Yizhe Xiong, Xiangwen Zhang, Zijia Lin, Hui Chen, Jungong Han, Guiguang Ding, Cheng Luo, Di Zhang, Kun Gai, Deyi Xiong

Moreover, we find that Finedeep achieves optimal results when balancing depth and width, specifically by adjusting the number of expert sub-layers and the number of experts per sub-layer.

Efficient Exploration

Evaluating and Improving Graph to Text Generation with Large Language Models

1 code implementation 24 Jan 2025 Jie He, Yijun Yang, Wanqiu Long, Deyi Xiong, Victor Gutierrez Basulto, Jeff Z. Pan

Although we explored the optimal prompting strategies and proposed a novel and effective diversity-difficulty-based few-shot sample selection method, we found that the improvements from tuning-free approaches were incremental, as LLMs struggle with planning on complex graphs, particularly those with a larger number of triplets.

Diversity Few-Shot Learning +1

DCIS: Efficient Length Extrapolation of LLMs via Divide-and-Conquer Scaling Factor Search

1 code implementation 25 Dec 2024 Lei Yang, Shaoyang Xu, Deyi Xiong

Further fine-tuning with the identified scaling factors effectively extends the context window of LLMs.

Large Language Model Safety: A Holistic Survey

1 code implementation 23 Dec 2024 Dan Shi, Tianhao Shen, Yufei Huang, Zhigen Li, Yongqi Leng, Renren Jin, Chuang Liu, Xinwei Wu, Zishan Guo, Linhao Yu, Ling Shi, Bojian Jiang, Deyi Xiong

The rapid development and deployment of large language models (LLMs) have introduced a new frontier in artificial intelligence, marked by unprecedented capabilities in natural language understanding and generation.

Language Modeling Language Modelling +4

Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning

1 code implementation 21 Nov 2024 Hang Zhou, Yehui Tang, Haochen Qin, Yujie Yang, Renren Jin, Deyi Xiong, Kai Han, Yunhe Wang

Our empirical studies, including instruction tuning experiments with models such as Pythia and LLaMA, demonstrate the effectiveness of the proposed framework.

GhostRNN: Reducing State Redundancy in RNN with Cheap Operations

no code implementations 20 Nov 2024 Hang Zhou, Xiaoxu Zheng, Yunhe Wang, Michael Bi Mi, Deyi Xiong, Kai Han

Recurrent neural networks (RNNs) that are capable of modeling long-distance dependencies are widely used in various speech tasks, e.g., keyword spotting (KWS) and speech enhancement (SE).

Keyword Spotting Speech Enhancement

Self-Pluralising Culture Alignment for Large Language Models

1 code implementation 16 Oct 2024 Shaoyang Xu, Yongqi Leng, Linhao Yu, Deyi Xiong

In this paper, we propose CultureSPA, a Self-Pluralising Culture Alignment framework that allows LLMs to simultaneously align to pluralistic cultures.

Prompt Engineering

LANDeRMT: Detecting and Routing Language-Aware Neurons for Selectively Finetuning LLMs to Machine Translation

no code implementations 29 Sep 2024 Shaolin Zhu, Leiyu Pan, Bo Li, Deyi Xiong

For the detected neurons, we further propose a conditional awareness-based routing mechanism to dynamically adjust language-general and language-specific capacity within LLMs, guided by translation signals.

Machine Translation Translation

Meta-RTL: Reinforcement-Based Meta-Transfer Learning for Low-Resource Commonsense Reasoning

no code implementations 27 Sep 2024 Yu Fu, Jie He, Yifan Yang, Qun Liu, Deyi Xiong

In this framework, we present a reinforcement-based approach to dynamically estimating source task weights that measure the contribution of the corresponding tasks to the target task in the meta-transfer learning.

Meta-Learning Transfer Learning

CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models

1 code implementation 19 Aug 2024 Linhao Yu, Yongqi Leng, Yufei Huang, Shang Wu, Haixin Liu, Xinmeng Ji, Jiahui Zhao, Jinwang Song, Tingting Cui, Xiaoqing Cheng, Tao Liu, Deyi Xiong

These help us curate CMoralEval that encompasses both explicit moral scenarios (14,964 instances) and moral dilemma scenarios (15,424 instances), each with instances from different data sources.

Diversity Language Modeling +3

FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data

1 code implementation 12 Aug 2024 Haoran Sun, Renren Jin, Shaoyang Xu, Leiyu Pan, Supryadi, Menglong Cui, Jiangcun Du, Yikun Lei, Lei Yang, Ling Shi, Juesi Xiao, Shaolin Zhu, Deyi Xiong

To mitigate this challenge, we present FuxiTranyu, an open-source multilingual LLM, which is designed to satisfy the need of the research community for balanced and high-performing multilingual capabilities.

Language Modeling Language Modelling +1

Towards Understanding Multi-Task Learning (Generalization) of LLMs via Detecting and Exploring Task-Specific Neurons

no code implementations 9 Jul 2024 Yongqi Leng, Deyi Xiong

With these identified task-specific neurons, we delve into two common problems in multi-task learning and continuous learning: Generalization and Catastrophic Forgetting.

Multi-Task Learning

Automated Progressive Red Teaming

1 code implementation 4 Jul 2024 Bojian Jiang, Yi Jing, Tianhao Shen, Tong Wu, Qing Yang, Deyi Xiong

To address this gap, we propose Automated Progressive Red Teaming (APRT) as an effectively learnable framework.

Active Learning Red Teaming +1

IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons

1 code implementation 26 Jun 2024 Dan Shi, Renren Jin, Tianhao Shen, Weilong Dong, Xinwei Wu, Deyi Xiong

To mitigate such knowledge conflicts, we propose a novel framework, IRCAN (Identifying and Reweighting Context-Aware Neurons) to capitalize on neurons that are crucial in processing contextual cues.

MoE-CT: A Novel Approach For Large Language Models Training With Resistance To Catastrophic Forgetting

no code implementations 25 Jun 2024 TianHao Li, Shangjie Li, Binbin Xie, Deyi Xiong, Baosong Yang

The advent of large language models (LLMs) has predominantly catered to high-resource languages, leaving a disparity in performance for low-resource languages.

Language Modeling Language Modelling +1

Efficiently Exploring Large Language Models for Document-Level Machine Translation with In-context Learning

no code implementations 11 Jun 2024 Menglong Cui, Jiangcun Du, Shaolin Zhu, Deyi Xiong

Subsequently, sentences most similar to the summary are retrieved from the datastore as demonstrations, which effectively guide LLMs in generating cohesive and coherent translations.

Document Level Machine Translation In-Context Learning +3

CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models

no code implementations 7 Jun 2024 Ling Shi, Deyi Xiong

Each question is accompanied by 4 answer choices that state opinions or behavioral tendencies corresponding to the question.

Multiple-choice Philosophy +1

Benchmarks Underestimate the Readiness of Multi-lingual Dialogue Agents

no code implementations 28 May 2024 Andrew H. Lee, Sina J. Semnani, Galo Castillo-López, Gäel de Chalendar, Monojit Choudhury, Ashna Dua, Kapil Rajesh Kavitha, Sungkyun Kim, Prashant Kodali, Ponnurangam Kumaraguru, Alexis Lombard, Mehrad Moradshahi, Gihyun Park, Nasredine Semmar, Jiwon Seo, Tianhao Shen, Manish Shrivastava, Deyi Xiong, Monica S. Lam

However, after manual evaluation of the validation set, we find that by correcting gold label errors and improving dataset annotation schema, GPT-4 with our prompts can achieve (1) 89.6%-96.8% accuracy in DST, and (2) more than 99% correct response generation across different languages.

Dialogue State Tracking In-Context Learning +1

Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs

no code implementations 24 May 2024 Chenxi Sun, Hongzhi Zhang, Zijia Lin, Jingyuan Zhang, Fuzheng Zhang, Zhongyuan Wang, Bin Chen, Chengru Song, Di Zhang, Kun Gai, Deyi Xiong

The core of our approach is the observation that a pre-trained language model can confidently predict multiple contiguous tokens, forming the basis for a lexical unit, in which these contiguous tokens could be decoded in parallel.

Code Generation Language Modeling +4

ConTrans: Weak-to-Strong Alignment Engineering via Concept Transplantation

1 code implementation 22 May 2024 Weilong Dong, Xinwei Wu, Renren Jin, Shaoyang Xu, Deyi Xiong

From the perspective of representation engineering, ConTrans refines concept vectors in value alignment from a source LLM (usually a weak yet aligned LLM).

LFED: A Literary Fiction Evaluation Dataset for Large Language Models

1 code implementation 16 May 2024 Linhao Yu, Qun Liu, Deyi Xiong

The rapid evolution of large language models (LLMs) has ushered in the need for comprehensive assessments of their performance across various dimensions.

An Empirical Study on the Robustness of Massively Multilingual Neural Machine Translation

1 code implementation 13 May 2024 Supryadi, Leiyu Pan, Deyi Xiong

Massively multilingual neural machine translation (MMNMT) has been proven to enhance the translation quality of low-resource languages.

Machine Translation Translation

LHMKE: A Large-scale Holistic Multi-subject Knowledge Evaluation Benchmark for Chinese Large Language Models

no code implementations 19 Mar 2024 Chuang Liu, Renren Jin, Yuqi Ren, Deyi Xiong

Current datasets collect questions from Chinese examinations across different subjects and educational levels to address this issue.

Multiple-choice

OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety

no code implementations 18 Mar 2024 Chuang Liu, Linhao Yu, Jiaxuan Li, Renren Jin, Yufei Huang, Ling Shi, Junhui Zhang, Xinmeng Ji, Tingting Cui, Tao Liu, Jinwang Song, Hongying Zan, Sun Li, Deyi Xiong

In addition to these benchmarks, we have implemented a phased public evaluation and benchmark update strategy to ensure that OpenEval is in line with the development of Chinese LLMs or even able to provide cutting-edge benchmark datasets to guide the development of Chinese LLMs.

Benchmarking Mathematical Reasoning

FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models

no code implementations 12 Mar 2024 Yan Liu, Renren Jin, Ling Shi, Zheng Yao, Deyi Xiong

We conduct extensive experiments on a wide range of LLMs on FineMath and find that there is still considerable room for improvements in terms of mathematical reasoning capability of Chinese LLMs.

Math Mathematical Reasoning

Exploring Multilingual Concepts of Human Value in Large Language Models: Is Value Alignment Consistent, Transferable and Controllable across Languages?

1 code implementation 28 Feb 2024 Shaoyang Xu, Weilong Dong, Zishan Guo, Xinwei Wu, Deyi Xiong

Prior research has revealed that certain abstract concepts are linearly represented as directions in the representation space of LLMs, predominantly centered around English.

Cross-Lingual Transfer Philosophy

Do Large Language Models Mirror Cognitive Language Processing?

no code implementations 28 Feb 2024 Yuqi Ren, Renren Jin, Tongxuan Zhang, Deyi Xiong

In this paper, we employ Representational Similarity Analysis (RSA) to measure the alignment between 23 mainstream LLMs and fMRI signals of the brain to evaluate how effectively LLMs simulate cognitive language processing.

Chatbot Logical Reasoning +2
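
As a rough illustration of the Representational Similarity Analysis (RSA) mentioned in the entry above, the sketch below builds a dissimilarity vector for each system (one value per pair of stimuli) and then correlates the two vectors. The choice of 1 minus Pearson correlation as the dissimilarity, the use of Pearson rather than a rank correlation for the final comparison, and all function names are assumptions made for this example; the listing does not state which measures or fMRI preprocessing the paper uses.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def rdm(representations):
    """Upper triangle of a representational dissimilarity matrix.

    `representations` holds one vector per stimulus; the dissimilarity of a
    stimulus pair is 1 - Pearson r (an assumed, commonly used choice).
    """
    pairs = combinations(range(len(representations)), 2)
    return [1.0 - pearson(representations[i], representations[j]) for i, j in pairs]


def rsa_score(reps_a, reps_b):
    """Second-order similarity: correlate the two dissimilarity vectors."""
    return pearson(rdm(reps_a), rdm(reps_b))


# Toy data: four stimuli, four features each (hypothetical values).
model_reps = [[1, 2, 3, 4], [2, 1, 4, 3], [4, 3, 2, 1], [1, 3, 2, 4]]
# A rescaled and shifted copy has the same pairwise geometry as the original.
brain_reps = [[2 * v + 1 for v in row] for row in model_reps]
score = rsa_score(model_reps, brain_reps)
```

Because RSA compares pairwise geometry rather than raw features, the two systems may have different dimensionalities as long as they are measured on the same stimuli.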

A Comprehensive Evaluation of Quantization Strategies for Large Language Models

1 code implementation 26 Feb 2024 Renren Jin, Jiangcun Du, Wuwei Huang, Wei Liu, Jian Luan, Bin Wang, Deyi Xiong

Our experimental results indicate that LLMs with 4-bit quantization can retain performance comparable to their non-quantized counterparts, and perplexity can serve as a proxy metric for quantized LLMs on most benchmarks.

Language Modeling Language Modelling +1

RoleEval: A Bilingual Role Evaluation Benchmark for Large Language Models

1 code implementation 26 Dec 2023 Tianhao Shen, Sun Li, Quan Tu, Deyi Xiong

We expect that RoleEval would highlight the significance of assessing role knowledge for large language models across various languages and cultural settings.

Memorization Multiple-choice

CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models

1 code implementation 20 Dec 2023 Dan Shi, Chaobin You, Jiantao Huang, Taihao Li, Deyi Xiong

With these pre-defined domains and slots, we collect 76,787 commonsense knowledge annotations from 19,700 dialogues through crowdsourcing.

Causal Inference Common Sense Reasoning

AI-driven emergence of frequency information non-uniform distribution via THz metasurface spectrum prediction

no code implementations 5 Dec 2023 Xiaohua Xing, Yuqi Ren, Die Zou, Qiankun Zhang, Bingxuan Mao, Jianquan Yao, Deyi Xiong, Shuang Zhang, Liang Wu

Recently, artificial intelligence has been extensively deployed across various scientific disciplines, optimizing and guiding the progression of experiments through the integration of abundant datasets, whilst continuously probing the vast theoretical space encapsulated within the data.

Towards a Deep Understanding of Multilingual End-to-End Speech Translation

1 code implementation 31 Oct 2023 Haoran Sun, Xiaohu Zhao, Yikun Lei, Shaolin Zhu, Deyi Xiong

In this paper, we employ Singular Value Canonical Correlation Analysis (SVCCA) to analyze representations learnt in a multilingual end-to-end speech translation model trained over 22 languages.

Machine Translation Translation

DEPN: Detecting and Editing Privacy Neurons in Pretrained Language Models

1 code implementation 31 Oct 2023 Xinwei Wu, Junzhuo Li, Minghui Xu, Weilong Dong, Shuangzhi Wu, Chao Bian, Deyi Xiong

The ability of data memorization and regurgitation in pretrained language models, revealed in previous studies, brings the risk of data leakage.

Memorization Model Editing

Is Robustness Transferable across Languages in Multilingual Neural Machine Translation?

no code implementations 31 Oct 2023 Leiyu Pan, Supryadi, Deyi Xiong

In particular, we use character-, word-, and multi-level noises to attack the specific translation direction of the multilingual neural machine translation model and evaluate the robustness of other translation directions.

Data Augmentation Machine Translation +1

Evaluating Large Language Models: A Comprehensive Survey

1 code implementation 30 Oct 2023 Zishan Guo, Renren Jin, Chuang Liu, Yufei Huang, Dan Shi, Supryadi, Linhao Yu, Yan Liu, Jiaxuan Li, Bojian Xiong, Deyi Xiong

We hope that this comprehensive overview will stimulate further research interests in the evaluation of LLMs, with the ultimate goal of making evaluation serve as a cornerstone in guiding the responsible development of LLMs.

Survey

Large Language Model Alignment: A Survey

no code implementations 26 Sep 2023 Tianhao Shen, Renren Jin, Yufei Huang, Chuang Liu, Weilong Dong, Zishan Guo, Xinwei Wu, Yan Liu, Deyi Xiong

We also envision bridging the gap between the AI alignment research community and the researchers engrossed in the capability exploration of LLMs for both capable and safe LLMs.

Language Modeling Language Modelling +3

Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy

1 code implementation 25 Jul 2023 Yu Fu, Deyi Xiong, Yue Dong

To mitigate potential risks associated with language models, recent AI detection research proposes incorporating watermarks into machine-generated text through random vocabulary restrictions and utilizing this information for detection.

Conditional Text Generation Data-to-Text Generation
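
The "random vocabulary restrictions" described in the entry above can be pictured with a toy green-list watermark: the previous token seeds a pseudo-random split of the vocabulary, generation is restricted to the "green" half, and detection counts green tokens with a z-test. This is a minimal sketch of the general idea under stated assumptions, not the semantic-aware remedy the paper proposes; the seeding scheme, the uniform toy generator, and every name and parameter (`green_list`, `gamma`, `key`) are hypothetical.

```python
import math
import random


def green_list(prev_token, vocab_size, gamma=0.5, key=0):
    """Pseudo-randomly choose a 'green' subset of the vocabulary, seeded by the previous token."""
    rng = random.Random(key * 1_000_003 + prev_token)  # assumed seeding scheme
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])


def generate_watermarked(length, vocab_size, gamma=0.5, key=0, seed=0):
    """Toy generator: sample uniformly, but only from the current green list."""
    rng = random.Random(seed)
    tokens = [rng.randrange(vocab_size)]
    for _ in range(length):
        allowed = sorted(green_list(tokens[-1], vocab_size, gamma, key))
        tokens.append(rng.choice(allowed))
    return tokens


def z_score(tokens, vocab_size, gamma=0.5, key=0):
    """One-proportion z-test on how many tokens fall in their green list."""
    scored = len(tokens) - 1  # the first token has no predecessor
    hits = sum(
        tokens[i] in green_list(tokens[i - 1], vocab_size, gamma, key)
        for i in range(1, len(tokens))
    )
    return (hits - gamma * scored) / math.sqrt(scored * gamma * (1 - gamma))


watermarked = generate_watermarked(length=50, vocab_size=100)
z = z_score(watermarked, vocab_size=100)
```

With a hard restriction every scored token is green, so the z-score grows with the square root of the text length; unwatermarked text is expected to score near zero, and a detector would flag text whose score exceeds a chosen threshold.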

CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models

1 code implementation 28 Jun 2023 Yufei Huang, Deyi Xiong

In this work, we present a Chinese Bias Benchmark dataset that consists of over 100K questions jointly constructed by human experts and generative language models, covering stereotypes and societal biases in 14 social dimensions related to Chinese culture and values.

Inverse Reinforcement Learning for Text Summarization

no code implementations 19 Dec 2022 Yu Fu, Deyi Xiong, Yue Dong

We introduce inverse reinforcement learning (IRL) as an effective paradigm for training abstractive summarization models, imitating human summarization behaviors.

Abstractive Text Summarization reinforcement-learning +2

FewFedWeight: Few-shot Federated Learning Framework across Multiple NLP Tasks

no code implementations 16 Dec 2022 Weilong Dong, Xinwei Wu, Junzhuo Li, Shuangzhi Wu, Chao Bian, Deyi Xiong

It broadcasts the global model in the server to each client and produces pseudo data for clients so that knowledge from the global model can be explored to enhance few-shot learning of each client model.

Federated Learning Few-Shot Learning +1

NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual Question Answering

no code implementations 7 Nov 2022 Tengxun Zhang, Hongfei Xu, Josef van Genabith, Deyi Xiong, Hongying Zan

Hybrid tabular-textual question answering (QA) requires reasoning from heterogeneous information, and the types of reasoning are mainly divided into numerical reasoning and span extraction.

Question Answering

Informative Language Representation Learning for Massively Multilingual Neural Machine Translation

1 code implementation COLING 2022 Renren Jin, Deyi Xiong

Experiment results on two datasets for massively multilingual neural machine translation demonstrate that language-aware multi-head attention benefits both supervised and zero-shot translation and significantly alleviates the off-target translation issue.

Machine Translation Navigate +2

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

1 code implementation 22 Jun 2022 Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou

This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.

Benchmarking Text Generation

Unsupervised and Few-shot Parsing from Pretrained Language Models

no code implementations 10 Jun 2022 Zhiyuan Zeng, Deyi Xiong

We therefore extend the unsupervised models to few-shot parsing models (FPOA, FPIO) that use a few annotated trees to learn better linear projection matrices for parsing.

Language Modelling

Efficient Cluster-Based k-Nearest-Neighbor Machine Translation

2 code implementations ACL 2022 Dexin Wang, Kai Fan, Boxing Chen, Deyi Xiong

k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).

Contrastive Learning Domain Adaptation +4

Bridging between Cognitive Processing Signals and Linguistic Features via a Unified Attentional Network

no code implementations 16 Dec 2021 Yuqi Ren, Deyi Xiong

The proposed framework only requires cognitive processing signals recorded under natural reading as inputs, and can be used to detect a wide range of linguistic features with a single cognitive dataset.

Sentence

Secoco: Self-Correcting Encoding for Neural Machine Translation

no code implementations Findings (EMNLP) 2021 Tao Wang, Chengqi Zhao, Mingxuan Wang, Lei Li, Hang Li, Deyi Xiong

This paper presents Self-correcting Encoding (Secoco), a framework that effectively deals with input noise for robust neural machine translation by introducing self-correcting predictors.

Machine Translation NMT +1

An Empirical Study on Adversarial Attack on NMT: Languages and Positions Matter

no code implementations ACL 2021 Zhiyuan Zeng, Deyi Xiong

For autoregressive NMT models that generate target words from left to right, we observe that adversarial attack on the source language is more effective than on the target language, and that attacking front positions of target sentences or positions of source sentences aligned to the front positions of corresponding target sentences is more effective than attacking other positions.

Adversarial Attack NMT

Multi-Head Highly Parallelized LSTM Decoder for Neural Machine Translation

no code implementations ACL 2021 Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Meng Zhang

This has to be computed n times for a sequence of length n. The linear transformations involved in the LSTM gate and state computations are the major cost factors in this.

Decoder Machine Translation +1

TGEA: An Error-Annotated Dataset and Benchmark Tasks for TextGeneration from Pretrained Language Models

no code implementations ACL 2021 Jie He, Bo Peng, Yi Liao, Qun Liu, Deyi Xiong

Each error is hence manually labeled with comprehensive annotations, including the span of the error, the associated span, minimal correction to the error, the type of the error, and rationale behind the error.

Common Sense Reasoning Diagnostic +1

CogAlign: Learning to Align Textual Neural Representations to Cognitive Language Processing Signals

1 code implementation ACL 2021 Yuqi Ren, Deyi Xiong

Most previous studies integrate cognitive language processing signals (e.g., eye-tracking or EEG data) into neural models of natural language processing (NLP) just by directly concatenating word embeddings with cognitive features, ignoring the gap between the two modalities (i.e., textual vs. cognitive) and noise in cognitive features.

EEG named-entity-recognition +5

Enhanced Aspect-Based Sentiment Analysis Models with Progressive Self-supervised Attention Learning

1 code implementation 5 Mar 2021 Jinsong Su, Jialong Tang, Hui Jiang, Ziyao Lu, Yubin Ge, Linfeng Song, Deyi Xiong, Le Sun, Jiebo Luo

In aspect-based sentiment analysis (ABSA), many neural models are equipped with an attention mechanism to quantify the contribution of each context word to sentiment prediction.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

Integrating Pre-trained Model into Rule-based Dialogue Management

no code implementations 17 Feb 2021 Jun Quan, Meng Yang, Qiang Gan, Deyi Xiong, Yiming Liu, Yuchen Dong, Fangxin Ouyang, Jun Tian, Ruiling Deng, Yongzhi Li, Yang Yang, Daxin Jiang

Rule-based dialogue management is still the most popular solution for industrial task-oriented dialogue systems because of its interpretability.

Dialogue Management Management +1

Efficient Object-Level Visual Context Modeling for Multimodal Machine Translation: Masking Irrelevant Objects Helps Grounding

no code implementations 18 Dec 2020 Dexin Wang, Deyi Xiong

In this paper, we propose an object-level visual context modeling framework (OVC) to efficiently capture and explore visual information for multimodal machine translation.

Multimodal Machine Translation Object +1

Balanced Joint Adversarial Training for Robust Intent Detection and Slot Filling

no code implementations COLING 2020 Xu Cao, Deyi Xiong, Chongyang Shi, Chao Wang, Yao Meng, Changjian Hu

Joint intent detection and slot filling has recently achieved tremendous success in advancing the performance of utterance understanding.

Intent Detection slot-filling +1

A Learning-Exploring Method to Generate Diverse Paraphrases with Multi-Objective Deep Reinforcement Learning

no code implementations COLING 2020 Mingtong Liu, Erguang Yang, Deyi Xiong, Yujie Zhang, Yao Meng, Changjian Hu, Jinan Xu, Yufeng Chen

We propose a learning-exploring method to generate sentences as learning objectives from the learned data distribution, and employ reinforcement learning to combine these new learning objectives for model training.

Deep Reinforcement Learning Diversity +2

RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset with Rich Semantic Annotations for Task-Oriented Dialogue Modeling

1 code implementation EMNLP 2020 Jun Quan, Shian Zhang, Qian Cao, Zizhong Li, Deyi Xiong

In order to alleviate the shortage of multi-domain data and to capture discourse phenomena for task-oriented dialogue modeling, we propose RiSAWOZ, a large-scale multi-domain Chinese Wizard-of-Oz dataset with Rich Semantic Annotations.

Dialogue State Tracking Intent Detection +4

Rewiring the Transformer with Depth-Wise LSTMs

no code implementations 13 Jul 2020 Hongfei Xu, Yang Song, Qiuhui Liu, Josef van Genabith, Deyi Xiong

Stacking non-linear layers allows deep neural networks to model complicated functions, and including residual connections in Transformer layers is beneficial for convergence and performance.

NMT Time Series Analysis

Learning Source Phrase Representations for Neural Machine Translation

no code implementations ACL 2020 Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu, Jingyi Zhang

Considering that modeling phrases instead of words has significantly improved the Statistical Machine Translation (SMT) approach through the use of larger translation blocks ("phrases") and its reordering ability, modeling NMT at phrase level is an intuitive proposal to help the model capture long-distance relationships.

Machine Translation NMT +1

Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change

no code implementations ACL 2020 Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu

We propose to automatically and dynamically determine batch sizes by accumulating gradients of mini-batches and performing an optimization step at just the time when the direction of gradients starts to fluctuate.

Modeling Long Context for Task-Oriented Dialogue State Generation

no code implementations ACL 2020 Jun Quan, Deyi Xiong

Based on the recently proposed transferable dialogue state generator (TRADE) that predicts dialogue states from utterance-concatenated dialogue context, we propose a multi-task learning model with a simple yet effective utterance tagging technique and a bidirectional language model as an auxiliary task for task-oriented dialogue state generation.

Language Modeling Language Modelling +1

Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation

no code implementations 30 Mar 2020 Pei Zhang, Xu Zhang, Wei Chen, Jian Yu, Yan-Feng Wang, Deyi Xiong

In this paper, we propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.

Document Level Machine Translation Machine Translation +4

Probing Word Translations in the Transformer and Trading Decoder for Encoder Layers

no code implementations NAACL 2021 Hongfei Xu, Josef van Genabith, Qiuhui Liu, Deyi Xiong

Due to its effectiveness and performance, the Transformer translation model has attracted wide attention, most recently in terms of probing-based approaches.

Decoder Translation +1

Shallow Discourse Annotation for Chinese TED Talks

1 code implementation LREC 2020 Wanqiu Long, Xinyi Cai, James E. M. Reid, Bonnie Webber, Deyi Xiong

Text corpora annotated with language-related properties are an important resource for the development of Language Technology.

Translation

Effective Data Augmentation Approaches to End-to-End Task-Oriented Dialogue

no code implementations 5 Dec 2019 Jun Quan, Deyi Xiong

The training of task-oriented dialogue systems is often confronted with the lack of annotated data.

Data Augmentation Diversity +2

Merging External Bilingual Pairs into Neural Machine Translation

no code implementations 2 Dec 2019 Tao Wang, Shaohui Kuang, Deyi Xiong, António Branco

As neural machine translation (NMT) is not easily amenable to explicit correction of errors, incorporating pre-specified translations into NMT is widely regarded as a non-trivial challenge.

Machine Translation NMT +1

Learning to Reuse Translations: Guiding Neural Machine Translation with Examples

no code implementations 25 Nov 2019 Qian Cao, Shaohui Kuang, Deyi Xiong

In this paper, we study the problem of enabling neural machine translation (NMT) to reuse previous translations from similar examples in target prediction.

Decoder Machine Translation +2

Lipschitz Constrained Parameter Initialization for Deep Transformers

no code implementations ACL 2020 Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Jingyi Zhang

In this paper, we first empirically demonstrate that a simple modification made in the official implementation, which changes the computation order of residual connection and layer normalization, can significantly ease the optimization of deep Transformers.

Decoder Translation

BiPaR: A Bilingual Parallel Dataset for Multilingual and Cross-lingual Reading Comprehension on Novels

1 code implementation IJCNLP 2019 Yimin Jing, Deyi Xiong, Yan Zhen

We analyze BiPaR in depth and find that BiPaR offers good diversification in prefixes of questions, answer types and relationships between questions and passages.

coreference-resolution Machine Reading Comprehension +1

Generating Highly Relevant Questions

no code implementations IJCNLP 2019 Jiazuo Qiu, Deyi Xiong

The neural seq2seq based question generation (QG) is prone to generating generic and undiversified questions that are poorly relevant to the given passage and target answer.

Question Generation Question-Generation

Towards Linear Time Neural Machine Translation with Capsule Networks

no code implementations IJCNLP 2019 Mingxuan Wang, Jun Xie, Zhixing Tan, Jinsong Su, Deyi Xiong, Lei LI

In this study, we first investigate a novel capsule network with dynamic routing for linear time Neural Machine Translation (NMT), referred to as CapsNMT.

Machine Translation NMT +2

Simplifying Neural Machine Translation with Addition-Subtraction Twin-Gated Recurrent Networks

3 code implementations EMNLP 2018 Biao Zhang, Deyi Xiong, Jinsong Su, Qian Lin, Huiji Zhang

Experiments on WMT14 translation tasks demonstrate that ATR-based neural machine translation can yield competitive performance on English-German and English-French language pairs in terms of both translation quality and speed.

Chinese Word Segmentation Machine Translation +2

Encoding Gated Translation Memory into Neural Machine Translation

no code implementations EMNLP 2018 Qian Cao, Deyi Xiong

Translation memories (TM) facilitate human translators to reuse existing repetitive translation fragments.

Decoder Machine Translation +4

Sentence Weighting for Neural Machine Translation Domain Adaptation

no code implementations COLING 2018 Shiqi Zhang, Deyi Xiong

In this paper, we propose a new sentence weighting method for the domain adaptation of neural machine translation.

Domain Adaptation Language Modeling +4

Fusing Recency into Neural Machine Translation with an Inter-Sentence Gate Model

no code implementations COLING 2018 Shaohui Kuang, Deyi Xiong

Neural machine translation (NMT) systems are usually trained on a large amount of bilingual sentence pairs and translate one sentence at a time, ignoring inter-sentence information.

Machine Translation NMT +2

Accelerating Neural Transformer via an Average Attention Network

1 code implementation ACL 2018 Biao Zhang, Deyi Xiong, Jinsong Su

To alleviate this issue, we propose an average attention network as an alternative to the self-attention network in the decoder of the neural Transformer.

Decoder Machine Translation +1

Variational Recurrent Neural Machine Translation

no code implementations 16 Jan 2018 Jinsong Su, Shan Wu, Deyi Xiong, Yaojie Lu, Xianpei Han, Biao Zhang

Partially inspired by successful applications of variational recurrent neural networks, we propose a novel variational recurrent neural machine translation (VRNMT) model in this paper.

Decoder Machine Translation +3

Attention Focusing for Neural Machine Translation by Bridging Source and Target Embeddings

no code implementations ACL 2018 Shaohui Kuang, Junhui Li, António Branco, Weihua Luo, Deyi Xiong

In neural machine translation, a source sequence of words is encoded into a vector from which a target sequence is generated in the decoding phase.

Machine Translation Sentence +2

Modeling Source Syntax for Neural Machine Translation

no code implementations ACL 2017 Junhui Li, Deyi Xiong, Zhaopeng Tu, Muhua Zhu, Min Zhang, Guodong Zhou

Even though a linguistics-free sequence to sequence model in neural machine translation (NMT) has certain capability of implicitly learning syntactic information of source sentences, this paper shows that source syntax can be explicitly incorporated into NMT effectively to provide further improvements.

Machine Translation NMT +1

A GRU-Gated Attention Model for Neural Machine Translation

no code implementations 27 Apr 2017 Biao Zhang, Deyi Xiong, Jinsong Su

In this paper, we propose a novel GRU-gated attention model (GAtt) for NMT which enhances the degree of discrimination of context vectors by enabling source representations to be sensitive to the partial translation generated by the decoder.

Decoder Machine Translation +2

Improving Translation Selection with Supersenses

no code implementations COLING 2016 Haiqing Tang, Deyi Xiong, Oier Lopez de Lacalle, Eneko Agirre

Selecting appropriate translations for source words with multiple meanings still remains a challenge for statistical machine translation (SMT).

Machine Translation Translation +1

Improving Statistical Machine Translation with Selectional Preferences

no code implementations COLING 2016 Haiqing Tang, Deyi Xiong, Min Zhang, ZhengXian Gong

In this paper, we study semantic dependencies between verbs and their arguments by modeling selectional preferences in the context of machine translation.

Machine Translation Semantic Role Labeling +2

Learning Event Expressions via Bilingual Structure Projection

no code implementations COLING 2016 Fangyuan Li, Ruihong Huang, Deyi Xiong, Min Zhang

Aiming to resolve high complexities of event descriptions, previous work (Huang and Riloff, 2013) proposes multi-faceted event recognition and a bootstrapping method to automatically acquire both event facet phrases and event expressions from unannotated texts.

Neural Machine Translation Advised by Statistical Machine Translation

no code implementations 17 Oct 2016 Xing Wang, Zhengdong Lu, Zhaopeng Tu, Hang Li, Deyi Xiong, Min Zhang

Neural Machine Translation (NMT) is a new approach to machine translation that has made great progress in recent years.

Machine Translation NMT +1

Lattice-Based Recurrent Neural Network Encoders for Neural Machine Translation

no code implementations 25 Sep 2016 Jinsong Su, Zhixing Tan, Deyi Xiong, Rongrong Ji, Xiaodong Shi, Yang Liu

Neural machine translation (NMT) heavily relies on word-level modelling to learn semantic representations of input sentences.

Machine Translation NMT +2

Cseq2seq: Cyclic Sequence-to-Sequence Learning

no code implementations 29 Jul 2016 Biao Zhang, Deyi Xiong, Jinsong Su

The vanilla sequence-to-sequence learning (seq2seq) reads and encodes a source sequence into a fixed-length vector only once, suffering from its insufficiency in modeling structural correspondence between the source and target sequence.

Machine Translation Translation +1

BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings

1 code implementation 25 May 2016 Biao Zhang, Deyi Xiong, Jinsong Su

In this paper, we propose a bidimensional attention based recursive autoencoder (BattRAE) to integrate clues and source-target interactions at multiple levels of granularity into bilingual phrase representations.

Semantic Similarity Semantic Textual Similarity

Variational Neural Machine Translation

1 code implementation EMNLP 2016 Biao Zhang, Deyi Xiong, Jinsong Su, Hong Duan, Min Zhang

Models of neural machine translation are often from a discriminative family of encoder-decoders that learn a conditional distribution of a target sentence given a source sentence.

Decoder Machine Translation +2

Variational Neural Discourse Relation Recognizer

1 code implementation EMNLP 2016 Biao Zhang, Deyi Xiong, Jinsong Su, Qun Liu, Rongrong Ji, Hong Duan, Min Zhang

In order to perform efficient inference and learning, we introduce neural discourse relation models to approximate the prior and posterior distributions of the latent variable, and employ these approximated distributions to optimize a reparameterized variational lower bound.

Relation

Neural Discourse Relation Recognition with Semantic Memory

no code implementations 12 Mar 2016 Biao Zhang, Deyi Xiong, Jinsong Su

Inspired by this, we propose a neural recognizer for implicit discourse relation analysis, which builds upon a semantic memory that stores knowledge in a distributed fashion.

General Knowledge Relation +1
