1 code implementation • 23 Oct 2023 • Zhengrui Ma, Shaolei Zhang, Shoutao Guo, Chenze Shao, Min Zhang, Yang Feng
Simultaneous machine translation (SiMT) models are trained to strike a balance between latency and translation quality.
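As a concrete illustration of how that balance is operationalized, below is a minimal Python sketch of a wait-k read/write schedule, a common fixed policy from the SiMT literature (not necessarily the policy used in this paper): the decoder first reads k source tokens and then alternates writing and reading, so a smaller k lowers latency at the cost of available source context.

# A wait-k read/write schedule (illustrative only; hypothetical, not this paper's method).
def wait_k_schedule(src_len, tgt_len, k=3):
    actions = []
    read, written = 0, 0
    while written < tgt_len:
        if read < min(k + written, src_len):
            actions.append("READ")    # consume one more source token
            read += 1
        else:
            actions.append("WRITE")   # emit one target token
            written += 1
    return actions

print(wait_k_schedule(src_len=6, tgt_len=5, k=3))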
1 code implementation • 12 Mar 2023 • Zhengrui Ma, Chenze Shao, Shangtong Gui, Min Zhang, Yang Feng
Non-autoregressive translation (NAT) reduces the decoding latency but suffers from performance degradation due to the multi-modality problem.
no code implementations • 30 Nov 2022 • Chenze Shao, Jinchao Zhang, Jie Zhou, Yang Feng
In response to this problem, we introduce a rephraser to provide a better training target for NAT by rephrasing the reference sentence according to the NAT output.
1 code implementation • 11 Oct 2022 • Chenze Shao, Zhengrui Ma, Yang Feng
Non-autoregressive models achieve significant decoding speedup in neural machine translation but lack the ability to capture sequential dependency.
1 code implementation • 8 Oct 2022 • Chenze Shao, Yang Feng
We extend the alignment space to non-monotonic alignments to allow for global word reordering, and we further consider all alignments that overlap with the target sentence.
1 code implementation • NAACL 2022 • Chenze Shao, Xuanfu Wu, Yang Feng
Non-autoregressive neural machine translation (NAT) suffers from the multi-modality problem: the source sentence may have multiple correct translations, but the loss function is calculated only according to the reference sentence.
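A minimal, self-contained sketch of the issue described above: the toy vocabulary, sentences, and probabilities are hypothetical, but they show how a cross-entropy loss computed against a single reference penalizes a prediction that matches another correct translation.

import math

# Two equally valid translations of the same (hypothetical) source sentence.
reference   = ["thank", "you", "very", "much"]
alternative = ["thanks", "a", "lot", "<pad>"]

# Suppose the model puts most of its probability mass on the alternative;
# model_probs[t][word] is the predicted probability of `word` at position t.
model_probs = [
    {"thank": 0.1, "thanks": 0.9},
    {"you": 0.1, "a": 0.9},
    {"very": 0.1, "lot": 0.9},
    {"much": 0.1, "<pad>": 0.9},
]

# The standard loss is computed only against the single reference, so a fluent
# alternative translation is still heavily penalized.
loss = -sum(math.log(model_probs[t][reference[t]]) for t in range(len(reference)))
print(f"cross-entropy vs. the one reference: {loss:.2f}")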
1 code implementation • ACL 2022 • Chenze Shao, Yang Feng
The underlying cause is that training samples do not receive balanced training in each model update, so we name this problem "imbalanced training".
1 code implementation • CL (ACL) 2021 • Chenze Shao, Yang Feng, Jinchao Zhang, Fandong Meng, Jie Zhou
Non-Autoregressive Neural Machine Translation (NAT) removes the autoregressive mechanism and achieves a significant decoding speedup by generating target words independently and simultaneously.
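For illustration, here is a minimal PyTorch sketch contrasting autoregressive decoding with the non-autoregressive decoding described above; the tiny stand-in model and its toy dependence on history are hypothetical and only meant to show that NAT predicts every position in parallel.

import torch

# Hypothetical stand-ins: `encoder_states` for the encoder output of a 6-token
# source sentence, `out_proj` for the decoder plus output projection.
seq_len, d_model, vocab_size = 6, 32, 100
encoder_states = torch.randn(seq_len, d_model)
out_proj = torch.nn.Linear(d_model, vocab_size)

# Autoregressive decoding: one position at a time, each step conditioned on the
# previously generated tokens (simulated here by a running decoder state).
def autoregressive_decode():
    tokens, state = [], encoder_states.mean(dim=0)
    for _ in range(seq_len):
        token = int(out_proj(state).argmax())
        tokens.append(token)
        state = state + 0.01 * token          # toy dependence on the history
    return tokens

# Non-autoregressive decoding: all target positions are predicted independently
# and simultaneously, with no dependence on previously generated words.
def non_autoregressive_decode():
    logits = out_proj(encoder_states)         # (seq_len, vocab_size) in one shot
    return logits.argmax(dim=-1).tolist()

print(autoregressive_decode())
print(non_autoregressive_decode())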
no code implementations • ACL 2021 • Yang Feng, Shuhao Gu, Dengji Guo, Zhengxin Yang, Chenze Shao
Meanwhile, we force the conventional decoder to simulate the behaviors of the seer decoder via knowledge distillation.
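A minimal PyTorch sketch of the distillation term suggested by the sentence above, with hypothetical logits standing in for the two decoders' outputs (the seer decoder as teacher, the conventional decoder as student); the actual architecture and loss weighting in the paper may differ.

import torch
import torch.nn.functional as F

# Hypothetical per-position output logits over the vocabulary: `seer_logits`
# from a seer decoder that also sees future target context, `conv_logits`
# from the conventional left-to-right decoder.
tgt_len, vocab_size = 5, 100
seer_logits = torch.randn(tgt_len, vocab_size)
conv_logits = torch.randn(tgt_len, vocab_size, requires_grad=True)

# Knowledge-distillation term: make the conventional decoder's distribution
# match the seer decoder's distribution (the teacher is detached, not updated).
kd_loss = F.kl_div(
    F.log_softmax(conv_logits, dim=-1),
    F.softmax(seer_logits, dim=-1).detach(),
    reduction="batchmean",
)
kd_loss.backward()   # gradients flow only into the conventional decoder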
no code implementations • 24 Apr 2021 • Yong Shan, Yang Feng, Chenze Shao
Non-Autoregressive Neural Machine Translation (NAT) has achieved significant inference speedup by generating all tokens simultaneously.
no code implementations • 1 Jan 2021 • Chenze Shao, Meng Sun, Yang Feng, Zhongjun He, Hua Wu, Haifeng Wang
Under this framework, we introduce word-level and sequence-level ensemble learning for neural machine translation, where sequence-level ensemble learning can aggregate translation models with different decoding strategies.
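A minimal PyTorch sketch of word-level ensembling in the sense above, with hypothetical distributions standing in for the member models' per-step predictions; simple averaging is just one common way to aggregate them and is not necessarily the combination used in this paper.

import torch

# Hypothetical per-step distributions from three translation models over the
# same vocabulary; in practice these would come from different NMT systems.
vocab_size = 100
step_probs = [torch.softmax(torch.randn(vocab_size), dim=-1) for _ in range(3)]

# Word-level ensembling: aggregate the models' distributions at each decoding
# step (here a simple average) and pick the next token from the mixture.
ensemble_dist = torch.stack(step_probs).mean(dim=0)
next_token = int(ensemble_dist.argmax())
print(next_token)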
no code implementations • EMNLP 2020 • Xuanfu Wu, Yang Feng, Chenze Shao
Despite improvements in translation quality, neural machine translation (NMT) often suffers from a lack of diversity in its generation.
1 code implementation • 30 Nov 2019 • Yang Feng, Wanying Xie, Shuhao Gu, Chenze Shao, Wen Zhang, Zhengxin Yang, Dong Yu
Neural machine translation models are usually trained with the teacher forcing strategy, which requires that the predicted sequence match the ground truth word by word and forces the probability of each prediction to approach a 0-1 distribution.
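A minimal PyTorch sketch of that word-level objective, with hypothetical logits and reference token ids: under teacher forcing, each position's target is a one-hot (0-1) distribution over the vocabulary, so the loss pushes all probability mass onto the single reference word.

import torch
import torch.nn.functional as F

# Hypothetical decoder outputs for a 4-token target sentence over a toy vocabulary,
# as produced when the decoder is fed the gold (reference) prefix at each step.
tgt_len, vocab_size = 4, 50
logits = torch.randn(tgt_len, vocab_size, requires_grad=True)
gold = torch.tensor([7, 3, 42, 11])   # reference token ids (hypothetical)

# Word-level cross-entropy: each position is compared with the single reference
# token (a one-hot target), so probability mass on any other word is penalized.
loss = F.cross_entropy(logits, gold)
loss.backward()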
1 code implementation • 21 Nov 2019 • Chenze Shao, Jinchao Zhang, Yang Feng, Fandong Meng, Jie Zhou
Non-Autoregressive Neural Machine Translation (NAT) achieves a significant decoding speedup by generating target words independently and simultaneously.
3 code implementations • ACL 2019 • Chenze Shao, Yang Feng, Jinchao Zhang, Fandong Meng, Xilin Chen, Jie Zhou
Non-Autoregressive Transformer (NAT) aims to accelerate the Transformer model by discarding the autoregressive mechanism and generating target words independently, but this fails to exploit target-side sequential information.
1 code implementation • EMNLP 2018 • Chenze Shao, Yang Feng, Xilin Chen
Neural machine translation (NMT) models are usually trained with the word-level loss using the teacher forcing algorithm, which not only evaluates the translation improperly but also suffers from exposure bias.
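A minimal PyTorch sketch of the exposure-bias part of that observation, using a hypothetical one-step toy decoder: training conditions every step on the gold prefix, while inference conditions on the model's own (possibly erroneous) previous predictions, a distribution of prefixes never seen during training.

import torch

# A toy "decoder step": next-token logits given the previous token id. This is
# a hypothetical stand-in; a real NMT decoder also attends to the source.
vocab_size = 50
embed = torch.nn.Embedding(vocab_size, 16)
head = torch.nn.Linear(16, vocab_size)
step = lambda prev_tok: head(embed(torch.tensor([prev_tok])))[0]

gold = [7, 3, 42, 11]                  # hypothetical reference token ids

# Training with teacher forcing: every step is conditioned on the *gold* prefix.
train_inputs = [0] + gold[:-1]         # 0 = hypothetical BOS id

# Inference: every step is conditioned on the model's *own* previous prediction.
prev, test_inputs = 0, []
for _ in gold:
    test_inputs.append(prev)
    prev = int(step(prev).argmax())

print(train_inputs, test_inputs)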