no code implementations • 5 Dec 2022 • Feng Nie, Meixi Chen, Zhirui Zhang, Xu Cheng
However, the performance of in-context learning is sensitive to the choice of prompt format, the selection of training examples, and the ordering of those examples.
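To make that sensitivity concrete, here is a minimal sketch of how prompt format and demonstration ordering jointly multiply the space of prompts an in-context learner can receive; the templates and examples are purely illustrative, not from the paper:

```python
from itertools import permutations

# Hypothetical sentiment demonstrations and two candidate prompt formats.
examples = [("great movie", "positive"), ("boring plot", "negative"), ("loved it", "positive")]
templates = ["Review: {x}\nSentiment: {y}", "{x} => {y}"]

def build_prompt(template, demos, query):
    """Concatenate formatted demonstrations, then the unanswered query."""
    demo_block = "\n\n".join(template.format(x=x, y=y) for x, y in demos)
    return demo_block + "\n\n" + template.format(x=query, y="").rstrip()

# Every (template, ordering) pair yields a different prompt string,
# and hence potentially a different in-context prediction.
prompts = [build_prompt(t, order, "what a waste of time")
           for t in templates for order in permutations(examples)]
print(len(prompts))  # 2 templates x 3! orderings = 12 distinct prompts
```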
1 code implementation • 23 May 2022 • Yichao Du, Weizhi Wang, Zhirui Zhang, Boxing Chen, Tong Xu, Jun Xie, Enhong Chen
End-to-End Speech Translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters.
no code implementations • Findings (ACL) 2022 • Fenfei Guo, Chen Zhang, Zhirui Zhang, Qixin He, Kejun Zhang, Jun Xie, Jordan Boyd-Graber
This paper develops automatic song translation (AST) for tonal languages, addressing the unique challenge of aligning words' tones with the melody of a song while conveying the original meaning.
1 code implementation • 21 Dec 2021 • Yichao Du, Zhirui Zhang, Weizhi Wang, Boxing Chen, Jun Xie, Tong Xu
In this paper, we attempt to model the joint probability of transcription and translation conditioned on the speech input, so as to directly leverage such triplet data.
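As a rough illustration of that factorization, the sketch below writes the negative joint log-likelihood as the sum of an ASR term and a transcript-conditioned translation term; the dual-decoder setup and all tensor names are assumptions, not the paper's exact architecture:

```python
import torch.nn.functional as F

def joint_st_loss(asr_logits, mt_logits, transcript_ids, translation_ids, pad_id=0):
    """Joint transcription+translation objective (illustrative sketch).

    Factorizes log p(transcript, translation | speech) as
    log p(transcript | speech) + log p(translation | speech, transcript),
    assuming one decoder emits ASR logits of shape (B, T, vocab) and
    another emits translation logits conditioned on speech and transcript.
    """
    asr_nll = F.cross_entropy(asr_logits.transpose(1, 2), transcript_ids,
                              ignore_index=pad_id)
    mt_nll = F.cross_entropy(mt_logits.transpose(1, 2), translation_ids,
                             ignore_index=pad_id)
    return asr_nll + mt_nll  # negative joint log-likelihood
```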
1 code implementation • 23 Sep 2021 • Dongqi Wang, Haoran Wei, Zhirui Zhang, ShuJian Huang, Jun Xie, Jiajun Chen
We study the problem of online learning with human feedback in human-in-the-loop machine translation, in which human translators revise machine-generated translations and the corrected translations are then used to improve the neural machine translation (NMT) system.
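A minimal sketch of such a loop, with `nmt`, `human_revise`, and `update` as hypothetical stand-ins for the system's components:

```python
def online_hitl_loop(nmt, data_stream, human_revise, update):
    """Human-in-the-loop online learning (illustrative sketch)."""
    for src in data_stream:
        hyp = nmt.translate(src)        # machine draft
        ref = human_revise(src, hyp)    # human post-edit
        update(nmt, src, ref)           # one online update on the revised pair
```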
1 code implementation • Findings (EMNLP) 2021 • Xin Zheng, Zhirui Zhang, ShuJian Huang, Boxing Chen, Jun Xie, Weihua Luo, Jiajun Chen
Recently, $k$NN-MT has shown the promising capability of combining a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor ($k$NN) retrieval to achieve domain adaptation without retraining.
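For reference, vanilla $k$NN-MT interpolates the NMT distribution with a retrieval distribution built from the $k$ nearest datastore entries at each decoding step; below is a minimal NumPy sketch, with shapes and hyperparameters chosen purely for illustration:

```python
import numpy as np

def knn_mt_step(h, p_nmt, keys, values, vocab_size, k=8, temperature=10.0, lam=0.5):
    """One decoding step of vanilla kNN-MT (illustrative sketch).

    h:      current decoder hidden state, shape (d,)
    p_nmt:  NMT model's next-token distribution, shape (vocab_size,)
    keys:   datastore hidden states, shape (N, d)
    values: datastore target token ids, shape (N,)
    """
    # Retrieve the k nearest datastore entries by squared L2 distance.
    dists = np.sum((keys - h) ** 2, axis=1)
    idx = np.argsort(dists)[:k]

    # Turn negative distances into a distribution over retrieved tokens.
    weights = np.exp(-dists[idx] / temperature)
    weights /= weights.sum()

    # Scatter the neighbor weights onto the vocabulary.
    p_knn = np.zeros(vocab_size)
    for w, tok in zip(weights, values[idx]):
        p_knn[tok] += w

    # Interpolate retrieval and model distributions.
    return lam * p_knn + (1.0 - lam) * p_nmt
```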
1 code implementation • Findings (EMNLP) 2021 • Weizhi Wang, Zhirui Zhang, Yichao Du, Boxing Chen, Jun Xie, Weihua Luo
However, due to the maximum likelihood training objective, it tends to capture spurious correlations between the output language and language-invariant semantics, leading to poor transfer performance on zero-shot translation.
1 code implementation • 31 Aug 2021 • Weizhi Wang, Zhirui Zhang, Junliang Guo, Yinpei Dai, Boxing Chen, Weihua Luo
In this paper, we propose to formulate the task-oriented dialogue system as a purely natural language generation task, so as to fully leverage large-scale pre-trained models like GPT-2 and simplify the complicated delexicalization preprocessing.
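A minimal sketch of the idea, using Hugging Face's GPT-2 to generate a system response directly from a serialized dialogue context; the serialization format shown is an assumption, not the paper's exact scheme:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Serialize the dialogue context into one string and let the LM
# generate the full, lexicalized system response directly.
context = "user: book a table for two tonight system:"
input_ids = tokenizer.encode(context, return_tensors="pt")
output = model.generate(input_ids, max_length=64,
                        pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0][input_ids.shape[1]:]))
```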
2 code implementations • ACL 2021 • Xin Zheng, Zhirui Zhang, Junliang Guo, ShuJian Huang, Boxing Chen, Weihua Luo, Jiajun Chen
On four benchmark machine translation datasets, we demonstrate that the proposed method effectively filters out the noise in retrieval results and significantly outperforms the vanilla $k$NN-MT model.
no code implementations • 16 Apr 2021 • Junliang Guo, Zhirui Zhang, Linlin Zhang, Linli Xu, Boxing Chen, Enhong Chen, Weihua Luo
In this way, our approach can find adversarial examples around the decision boundary more comprehensively and thus conduct adversarial attacks more effectively.
1 code implementation • NeurIPS 2020 • Junliang Guo, Zhirui Zhang, Linli Xu, Hao-Ran Wei, Boxing Chen, Enhong Chen
Our framework is based on Mask-Predict, a parallel sequence decoding algorithm that exploits the bidirectional and conditionally independent nature of BERT, and it can easily be adapted to traditional autoregressive decoding.
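A compact sketch of the Mask-Predict loop: predict all masked positions in parallel, keep the most confident tokens, and re-mask the rest for the next round. Here `model(src, tgt)` is a hypothetical conditional masked LM returning per-position token distributions:

```python
import numpy as np

MASK = 0  # illustrative mask token id

def mask_predict(model, src, length, iterations=10):
    """Iterative Mask-Predict decoding (illustrative sketch).

    `model(src, tgt)` is assumed to return an array of shape
    (length, vocab_size) with token probabilities per position.
    """
    tgt = np.full(length, MASK)
    confidence = np.zeros(length)
    for t in range(iterations):
        # Predict every currently masked position in parallel.
        dist = model(src, tgt)
        masked = (tgt == MASK)
        tgt[masked] = dist[masked].argmax(axis=-1)
        confidence[masked] = dist[masked].max(axis=-1)
        # Re-mask the n least confident tokens for the next iteration.
        n = int(length * (1 - (t + 1) / iterations))
        if n == 0:
            break
        remask = np.argsort(confidence)[:n]
        tgt[remask] = MASK
        confidence[remask] = 0.0
    return tgt
```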
no code implementations • EMNLP 2020 • Hao-Ran Wei, Zhirui Zhang, Boxing Chen, Weihua Luo
In this paper, we focus on the domain-specific translation with low resources, where in-domain parallel corpora are scarce or nonexistent.
no code implementations • 3 Dec 2019 • Baijun Ji, Zhirui Zhang, Xiangyu Duan, Min Zhang, Boxing Chen, Weihua Luo
However, existing transfer methods involving a common target language are far from success in the extreme scenario of zero-shot translation, due to the language space mismatch problem between transferor (the parent model) and transferee (the child model) on the source side.
no code implementations • ACL 2019 • Zhirui Zhang, Xiujun Li, Jianfeng Gao, Enhong Chen
This paper presents a new approach that extends Deep Dyna-Q (DDQ) by incorporating a Budget-Conscious Scheduling (BCS) to best utilize a fixed, small amount of user interactions (budget) for learning task-oriented dialogue agents.
1 code implementation • 14 Jan 2019 • Shuo Ren, Zhirui Zhang, Shujie Liu, Ming Zhou, Shuai Ma
To address this issue, we introduce phrase-based Statistical Machine Translation (SMT) models, which are robust to noisy data, as posterior regularization to guide the training of unsupervised NMT models in the iterative back-translation process.
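A heavily simplified sketch of the idea: at each back-translation round, let a noise-robust SMT score decide which pseudo-parallel pairs to train on. This filtering stand-in is not the paper's full posterior-regularization objective, and all callables are hypothetical:

```python
def iterative_back_translation(nmt_fwd, nmt_bwd, smt_score,
                               mono_src, mono_tgt, threshold=-5.0):
    """Iterative back-translation guided by SMT scores (simplified sketch)."""
    for _ in range(3):  # a few refinement rounds
        # Build pseudo-parallel data in both directions with the NMT models.
        pseudo = [(nmt_bwd.translate(y), y) for y in mono_tgt]
        pseudo += [(x, nmt_fwd.translate(x)) for x in mono_src]
        # Keep only pairs the noise-robust SMT model finds plausible.
        clean = [(x, y) for x, y in pseudo if smt_score(x, y) > threshold]
        nmt_fwd.train_on(clean)
        nmt_bwd.train_on([(y, x) for x, y in clean])
```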
no code implementations • CoNLL 2018 • Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen
To address this issue and stabilize GAN training, we propose a novel Bidirectional Generative Adversarial Network for Neural Machine Translation (BGAN-NMT), which introduces a generator model that acts as the discriminator; the discriminator thereby naturally considers the entire translation space, alleviating the inadequate-training problem.
no code implementations • 24 Aug 2018 • Wenhu Chen, Guanlin Li, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou
Then, we interpret sequence-to-sequence learning as learning a transductive model to transform the source local latent distributions to match their corresponding target distributions.
no code implementations • 23 Aug 2018 • Zhirui Zhang, Shuo Ren, Shujie Liu, Jianyong Wang, Peng Chen, Mu Li, Ming Zhou, Enhong Chen
Language style transfer rephrases text with specific stylistic attributes while preserving the original attribute-independent content.
Ranked #3 on Unsupervised Text Style Transfer on GYAFC
no code implementations • 13 Aug 2018 • Zhirui Zhang, Shuangzhi Wu, Shujie Liu, Mu Li, Ming Zhou, Tong Xu
Although Neural Machine Translation (NMT) has achieved remarkable progress in the past several years, most NMT systems still suffer from a fundamental shortcoming shared with other sequence generation tasks: errors made early in the generation process are fed back as inputs to the model and can be quickly amplified, harming subsequent sequence generation.
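One generic way to expose training to this error-compounding regime is scheduled-sampling-style input mixing, sketched below; this names a well-known technique for illustration, not necessarily this paper's method:

```python
import random

def decode_step_inputs(gold_prefix, model_prefix, p_model=0.25):
    """Scheduled-sampling-style input mixing (illustrative sketch).

    With probability p_model, feed the model's own previous prediction
    instead of the gold token, so training sees the same compounding
    errors that free-running inference produces.
    """
    return [m if random.random() < p_model else g
            for g, m in zip(gold_prefix, model_prefix)]
```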
no code implementations • NAACL 2018 • Wenhu Chen, Guanlin Li, Shuo Ren, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou
In order to alleviate data sparsity and overfitting problems in maximum likelihood estimation (MLE) for sequence prediction tasks, we propose the Generative Bridging Network (GBN), in which a novel bridge module is introduced to assist the training of the sequence prediction model (the generator network).
no code implementations • NAACL 2018 • Duyu Tang, Nan Duan, Zhao Yan, Zhirui Zhang, Yibo Sun, Shujie Liu, Yuanhua Lv, Ming Zhou
Second, directly applying a GAN that regards all generated questions as negative instances fails to improve the accuracy of the QA model.
2 code implementations • 15 Mar 2018 • Hany Hassan, Anthony Aue, Chang Chen, Vishal Chowdhary, Jonathan Clark, Christian Federmann, Xuedong Huang, Marcin Junczys-Dowmunt, William Lewis, Mu Li, Shujie Liu, Tie-Yan Liu, Renqian Luo, Arul Menezes, Tao Qin, Frank Seide, Xu Tan, Fei Tian, Lijun Wu, Shuangzhi Wu, Yingce Xia, Dong-dong Zhang, Zhirui Zhang, Ming Zhou
Machine translation has made rapid advances in recent years.
Ranked #3 on Machine Translation on WMT 2017 English-Chinese
no code implementations • 1 Mar 2018 • Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen
Monolingual data have been demonstrated to be helpful in improving translation quality of both statistical machine translation (SMT) systems and neural machine translation (NMT) systems, especially in resource-poor or domain adaptation tasks where parallel data are not rich enough.
no code implementations • EMNLP 2017 • Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen
Although sequence-to-sequence (seq2seq) networks have achieved significant success in many NLP tasks such as machine translation and text summarization, simply applying this approach to transition-based dependency parsing does not yield performance gains comparable to other state-of-the-art methods, such as stack-LSTM and head selection.