no code implementations • 20 Nov 2023 • Zhuocheng Zhang, Shuhao Gu, Min Zhang, Yang Feng
To solve the length bias problem, we propose improvements to the DNMT model in its training method, attention mechanism, and decoding strategy.
1 code implementation • 17 Oct 2023 • Langlin Huang, Shuhao Gu, Zhuocheng Zhang, Yang Feng
Conventional neural machine translation (NMT) models typically use subwords and words as the basic units for model input and comprehension.
1 code implementation • 3 Nov 2022 • Shuhao Gu, Bojie Hu, Yang Feng
Specifically, we propose two methods to search for low-forgetting-risk regions, based respectively on the curvature of the loss and on the impact of the parameters on the model output.
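A rough sketch of the curvature-based idea: approximate each parameter's curvature with its mean squared gradient over general-domain samples (a diagonal Fisher estimate), then treat the least important parameters as low-forgetting-risk and keep only those trainable during adaptation. All names and the keep ratio below are illustrative, not the paper's exact procedure.

```python
# Hypothetical sketch: rank parameters by a squared-gradient curvature
# proxy, then mark the lowest-curvature fraction as safe to update
# (low forgetting risk). Illustrative only, not the paper's code.

def squared_grad_importance(per_sample_grads):
    """per_sample_grads: list of gradient vectors (lists of floats).
    Returns a mean-squared-gradient importance score per parameter."""
    n_params = len(per_sample_grads[0])
    importance = [0.0] * n_params
    for g in per_sample_grads:
        for i, gi in enumerate(g):
            importance[i] += gi * gi
    return [v / len(per_sample_grads) for v in importance]

def low_risk_mask(importance, keep_ratio=0.5):
    """Mark the keep_ratio fraction of parameters with the lowest
    estimated curvature as trainable (low forgetting risk)."""
    k = max(1, int(len(importance) * keep_ratio))
    order = sorted(range(len(importance)), key=lambda i: importance[i])
    safe = set(order[:k])
    return [i in safe for i in range(len(importance))]

grads = [[0.9, 0.01, -0.5, 0.02], [1.1, -0.02, 0.4, 0.03]]
imp = squared_grad_importance(grads)
mask = low_risk_mask(imp, keep_ratio=0.5)  # parameters 1 and 3 are low-risk
```

The intuition is that parameters sitting in flat regions of the general-domain loss can move without disturbing what the model already knows.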
1 code implementation • 28 Oct 2022 • Shuhao Gu, Yang Feng
Many-to-many multilingual neural machine translation can translate between language pairs unseen during training, i.e., perform zero-shot translation.
1 code implementation • ACL 2021 • Wanying Xie, Yang Feng, Shuhao Gu, Dong Yu
Multilingual neural machine translation with a single model has drawn much attention due to its capability to deal with multiple languages.
no code implementations • ACL 2021 • Yang Feng, Shuhao Gu, Dengji Guo, Zhengxin Yang, Chenze Shao
Meanwhile, we force the conventional decoder to simulate the behaviors of the seer decoder via knowledge distillation.
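The distillation step can be sketched as minimizing the divergence between the two decoders' output distributions at each target position; the conventional decoder (student) is pulled toward the seer decoder (teacher). The function names and toy distributions below are illustrative assumptions, not the paper's implementation.

```python
import math

# Hedged sketch of knowledge distillation: the conventional decoder is
# trained to match the seer decoder's per-position output distribution
# by minimising KL(teacher || student). Illustrative names only.

def kl_divergence(teacher, student, eps=1e-9):
    """KL(teacher || student) for two probability vectors."""
    return sum(t * math.log((t + eps) / (s + eps))
               for t, s in zip(teacher, student))

def distillation_loss(teacher_probs, student_probs):
    """Average per-position KL between seer and conventional decoder."""
    losses = [kl_divergence(t, s)
              for t, s in zip(teacher_probs, student_probs)]
    return sum(losses) / len(losses)

teacher = [[0.7, 0.2, 0.1]]
student = [[0.7, 0.2, 0.1]]
# identical distributions give (near-)zero distillation loss
```

In practice this term is added to the usual cross-entropy objective, so the student benefits from the seer's access to future target context without needing it at inference time.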
1 code implementation • NAACL 2021 • Shuhao Gu, Yang Feng, Wanying Xie
Domain adaptation is widely used in practical applications of neural machine translation, aiming to achieve good performance on both general-domain and in-domain data.
no code implementations • COLING 2020 • Shuhao Gu, Yang Feng
Our investigation of the NMT model's modules shows that some modules are tightly tied to general-domain knowledge, while others are more essential for domain adaptation.
1 code implementation • EMNLP 2020 • Shuhao Gu, Jinchao Zhang, Fandong Meng, Yang Feng, Wanying Xie, Jie Zhou, Dong Yu
The vanilla NMT model usually adopts trivial equal-weighted objectives for target tokens of different frequencies and tends to generate more high-frequency tokens and fewer low-frequency tokens than the gold token distribution.
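One simple way to move away from equal weighting is to scale each token's loss by an inverse-log-frequency weight, so rare tokens contribute more. This generic sketch illustrates the idea only; the paper's adaptive weighting scheme differs, and all names here are assumptions.

```python
import math

# Illustrative token-level weighting: upweight low-frequency target
# tokens with inverse-log-frequency weights instead of the vanilla
# equal-weighted objective. A generic sketch, not the paper's scheme.

def frequency_weights(token_counts):
    """Map token -> weight; rarer tokens receive larger weights."""
    return {tok: 1.0 / math.log(1.0 + c) for tok, c in token_counts.items()}

def weighted_nll(log_probs, targets, weights):
    """Weighted negative log-likelihood over one target sequence."""
    total = sum(weights[t] * -lp for lp, t in zip(log_probs, targets))
    return total / len(targets)

counts = {"the": 1000, "cat": 50, "zymurgy": 2}
w = frequency_weights(counts)  # w["zymurgy"] > w["cat"] > w["the"]
```

Upweighting rare tokens pushes the model's output distribution closer to the gold token distribution at the cost of slightly noisier gradients on infrequent events.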
no code implementations • WS 2020 • Haiyang Xue, Yang Feng, Shuhao Gu, Wei Chen
In this paper, we propose a method to handle these two problems so as to generate translations that are robust to ASR errors.
1 code implementation • 30 Nov 2019 • Yang Feng, Wanying Xie, Shuhao Gu, Chenze Shao, Wen Zhang, Zhengxin Yang, Dong Yu
Neural machine translation models usually adopt the teacher forcing strategy for training, which requires the predicted sequence to match the ground truth word by word and forces the probability of each prediction toward a 0-1 distribution.
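The teacher forcing objective described here can be sketched as cross-entropy against a one-hot (0-1) target at each step, with the decoder fed the ground-truth prefix. The helper names and toy probabilities below are illustrative assumptions.

```python
import math

# Minimal sketch of teacher forcing: at each step the loss pushes the
# predicted distribution toward a one-hot (0-1) target on the gold
# token. Illustrative names and values only.

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def cross_entropy(target, predicted, eps=1e-9):
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))

def teacher_forcing_loss(gold_ids, step_probs, vocab_size):
    """Average cross-entropy against one-hot gold tokens."""
    losses = [cross_entropy(one_hot(g, vocab_size), p)
              for g, p in zip(gold_ids, step_probs)]
    return sum(losses) / len(losses)

probs = [[0.8, 0.1, 0.1], [0.1, 0.85, 0.05]]
loss = teacher_forcing_loss([0, 1], probs, 3)
```

Because only the gold token receives probability mass in the target, any plausible alternative word is penalized as hard as a nonsensical one, which is the rigidity the paper targets.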
no code implementations • IJCNLP 2019 • Zhengxin Yang, Jinchao Zhang, Fandong Meng, Shuhao Gu, Yang Feng, Jie Zhou
Context modeling is essential for generating coherent and consistent translations in document-level neural machine translation.
no code implementations • 31 Aug 2019 • Shuhao Gu, Yang Feng
Multi-head attention advances neural machine translation by computing multiple versions of attention in different subspaces, but neglecting the semantic overlap between these subspaces increases the difficulty of translation and hinders further improvement of translation performance.
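For reference, the baseline mechanism the paper builds on splits queries, keys, and values into per-head subspaces and runs scaled dot-product attention in each. The sketch below shows that standard mechanism only (with random illustrative inputs), not the paper's proposed remedy for subspace overlap.

```python
import numpy as np

# Sketch of standard multi-head attention: project into n_heads
# lower-dimensional subspaces (here by slicing), attend in each,
# then concatenate. Illustrative baseline, not the paper's method.

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(q, k, v, n_heads):
    """q, k, v: (seq_len, d_model); d_model must divide by n_heads."""
    seq_len, d_model = q.shape
    d_head = d_model // n_heads
    outputs = []
    for h in range(n_heads):
        sl = slice(h * d_head, (h + 1) * d_head)  # one subspace per head
        qh, kh, vh = q[:, sl], k[:, sl], v[:, sl]
        scores = qh @ kh.T / np.sqrt(d_head)      # scaled dot product
        outputs.append(softmax(scores, axis=-1) @ vh)
    return np.concatenate(outputs, axis=-1)       # back to (seq_len, d_model)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out = multi_head_attention(x, x, x, n_heads=2)    # shape (4, 8)
```

Because each head sees only its own slice, nothing in the baseline prevents two heads from learning redundant, semantically overlapping attention patterns, which motivates the paper's approach.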
no code implementations • NAACL 2019 • Shuhao Gu, Yang Feng, Qun Liu
In addition, we add a discriminator to the shared encoder and apply adversarial training to the whole model, reinforcing information separation and machine translation performance simultaneously.
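A common way to implement this kind of encoder-discriminator adversarial training is a gradient reversal layer (GRL): identity in the forward pass, sign-flipped gradient in the backward pass, so the encoder learns to fool the discriminator. The sketch below shows that generic GRL pattern under the assumption it matches the paper's setup; the class and names are illustrative.

```python
# Hedged sketch of a gradient reversal layer (GRL), a standard device
# for adversarial training of a shared encoder: forward pass is the
# identity, backward pass flips the gradient's sign (scaled by lam),
# pushing the encoder toward discriminator-confusing representations.

class GradReverse:
    def __init__(self, lam=1.0):
        self.lam = lam  # strength of the reversed adversarial signal

    def forward(self, x):
        return x  # identity on the forward pass

    def backward(self, grad):
        # flip and scale the gradient flowing back into the encoder
        return [-self.lam * g for g in grad]

grl = GradReverse(lam=0.5)
features = [0.3, -1.2]
assert grl.forward(features) == features  # forward is a no-op
```

With the GRL in place, the discriminator minimizes its classification loss as usual while the encoder, receiving the reversed gradient, maximizes it.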