Search Results for author: Shuhao Gu

Found 14 papers, 7 papers with code

Addressing the Length Bias Problem in Document-Level Neural Machine Translation

no code implementations • 20 Nov 2023 Zhuocheng Zhang, Shuhao Gu, Min Zhang, Yang Feng

To solve the length bias problem, we propose improvements to the DNMT model's training method, attention mechanism, and decoding strategy.

Machine Translation · Translation

Enhancing Neural Machine Translation with Semantic Units

1 code implementation • 17 Oct 2023 Langlin Huang, Shuhao Gu, Zhuocheng Zhang, Yang Feng

Conventional neural machine translation (NMT) models typically use subwords and words as the basic units for model input and comprehension.

Machine Translation · NMT · +2

Continual Learning of Neural Machine Translation within Low Forgetting Risk Regions

1 code implementation • 3 Nov 2022 Shuhao Gu, Bojie Hu, Yang Feng

Specifically, we propose two methods to search for low-forgetting-risk regions, based respectively on the curvature of the loss and on the impact of the parameters on the model output.
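For intuition, here is a minimal PyTorch sketch of the curvature-based criterion. The diagonal empirical Fisher is used as a cheap curvature proxy; the function names, the keep-ratio, and the Fisher proxy itself are illustrative assumptions, not the paper's exact recipe.

```python
import torch

def low_forgetting_risk_mask(model, loss_fn, general_domain_loader, keep_ratio=0.3):
    """Mark the parameters whose curvature on the general-domain loss is
    lowest; restricting fine-tuning to them should risk little forgetting."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for batch in general_domain_loader:
        model.zero_grad()
        loss_fn(model, batch).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2   # empirical Fisher diagonal

    # Keep the fraction of parameters with the *smallest* curvature.
    scores = torch.cat([f.flatten() for f in fisher.values()])
    k = max(1, int(keep_ratio * scores.numel()))
    threshold = scores.kthvalue(k).values           # curvature cut-off
    return {n: f <= threshold for n, f in fisher.items()}
```

During in-domain fine-tuning, gradients would then be zeroed wherever the returned mask is False, so that only the low-risk parameters move.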

Continual Learning · Domain Adaptation · +2

Improving Zero-Shot Multilingual Translation with Universal Representations and Cross-Mappings

1 code implementation • 28 Oct 2022 Shuhao Gu, Yang Feng

Many-to-many multilingual neural machine translation can translate between language pairs unseen during training, i.e., zero-shot translation.

Machine Translation · Translation

Importance-based Neuron Allocation for Multilingual Neural Machine Translation

1 code implementation • ACL 2021 Wanying Xie, Yang Feng, Shuhao Gu, Dong Yu

Multilingual neural machine translation with a single model has drawn much attention due to its capability to deal with multiple languages.

General Knowledge · Machine Translation · +1

Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation

1 code implementation • NAACL 2021 Shuhao Gu, Yang Feng, Wanying Xie

Domain adaptation is widely used in practical applications of neural machine translation and aims to achieve good performance on both general-domain and in-domain data.
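As a rough illustration of the pruning half of the idea, the sketch below marks the small-magnitude parameters that could be re-purposed ("expanded") for the new domain. Plain magnitude pruning stands in here for whatever importance score the paper actually uses, and all names are hypothetical.

```python
import torch

def prune_masks(model, prune_ratio=0.2):
    """Per parameter tensor, mark the low-magnitude entries as re-trainable
    for the in-domain data, keeping the rest for the general domain."""
    masks = {}
    for name, p in model.named_parameters():
        k = max(1, int(prune_ratio * p.numel()))
        threshold = p.abs().flatten().kthvalue(k).values
        masks[name] = p.abs() <= threshold   # True = free for in-domain training
    return masks
```

During in-domain training, updates would be applied only where the mask is True, leaving the general-domain parameters frozen.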

Domain Adaptation · Knowledge Distillation · +2

Investigating Catastrophic Forgetting During Continual Training for Neural Machine Translation

no code implementations • COLING 2020 Shuhao Gu, Yang Feng

Our investigation of the modules of the NMT model shows that some modules are tightly related to general-domain knowledge, while others are more essential for domain adaptation.

Domain Adaptation · Machine Translation · +2

Token-level Adaptive Training for Neural Machine Translation

1 code implementation • EMNLP 2020 Shuhao Gu, Jinchao Zhang, Fandong Meng, Yang Feng, Wanying Xie, Jie Zhou, Dong Yu

The vanilla NMT model usually adopts a trivial equal-weighted objective for target tokens of different frequencies, and thus tends to generate more high-frequency tokens and fewer low-frequency tokens than the gold token distribution.
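A minimal sketch of the general idea, frequency-dependent weights on the per-token cross-entropy, is given below. The specific weighting function and argument names are illustrative assumptions, not the paper's derived weights.

```python
import torch
import torch.nn.functional as F

def weighted_nmt_loss(logits, targets, token_counts, pad_id=0):
    """Token-level weighted cross-entropy.

    logits:       (batch, seq, vocab) decoder outputs
    targets:      (batch, seq) gold target token ids
    token_counts: (vocab,) corpus frequency of each target token
    """
    per_token = F.cross_entropy(logits.transpose(1, 2), targets, reduction="none")
    # Monotone-decreasing weight: rare tokens get larger weights.
    weights = 1.0 / (token_counts.float().log1p() + 1.0)
    w = weights[targets]                     # (batch, seq) per-position weight
    mask = (targets != pad_id).float()       # ignore padding positions
    return (per_token * w * mask).sum() / mask.sum()
```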

Machine Translation · NMT · +1

Modeling Fluency and Faithfulness for Diverse Neural Machine Translation

1 code implementation • 30 Nov 2019 Yang Feng, Wanying Xie, Shuhao Gu, Chenze Shao, Wen Zhang, Zhengxin Yang, Dong Yu

Neural machine translation models usually adopt the teacher forcing strategy for training, which requires that the predicted sequence match the ground truth word by word and forces the probability of each prediction toward a 0-1 (one-hot) distribution.
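For reference, teacher forcing amounts to the following training step (a hedged sketch; model, bos_id, and the batch layout are assumed): the decoder always consumes the gold prefix rather than its own predictions.

```python
import torch
import torch.nn.functional as F

def teacher_forcing_step(model, src, tgt, bos_id, pad_id=0):
    # The decoder is fed the gold prefix at every step (never its own
    # predictions): <bos> y1 ... y_{n-1} is used to predict y1 ... y_n.
    bos = torch.full_like(tgt[:, :1], bos_id)
    decoder_in = torch.cat([bos, tgt[:, :-1]], dim=1)
    logits = model(src, decoder_in)          # (batch, seq, vocab)
    # Cross-entropy pulls each predicted distribution toward the 0-1
    # (one-hot) distribution of the single gold token.
    return F.cross_entropy(logits.transpose(1, 2), tgt, ignore_index=pad_id)
```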

Machine Translation · Translation

Improving Multi-Head Attention with Capsule Networks

no code implementations • 31 Aug 2019 Shuhao Gu, Yang Feng

Multi-head attention advances neural machine translation by computing multiple versions of attention in different subspaces, but neglecting the semantic overlap between subspaces increases the difficulty of translation and consequently hinders further improvement of translation performance.
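To make "attention in different subspaces" concrete, here is a standard multi-head attention sketch in PyTorch. The paper's actual contribution, aggregating the head outputs with capsule networks instead of the usual concat-and-project step, is not reproduced here.

```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.h, self.d_k = n_heads, d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, d_model)  # the paper replaces this
                                                # concat+linear aggregation
                                                # with capsule routing

    def forward(self, q, k, v):
        B, T, _ = q.shape
        split = lambda x: x.view(B, -1, self.h, self.d_k).transpose(1, 2)
        q, k, v = split(self.q_proj(q)), split(self.k_proj(k)), split(self.v_proj(v))
        # Each head attends in its own d_k-dimensional subspace.
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.d_k ** 0.5, dim=-1)
        heads = (attn @ v).transpose(1, 2).reshape(B, T, -1)
        return self.out(heads)
```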

Clustering · Machine Translation · +1

Improving Domain Adaptation Translation with Domain Invariant and Specific Information

no code implementations • NAACL 2019 Shuhao Gu, Yang Feng, Qun Liu

In addition, we add a discriminator to the shared encoder and employ adversarial training for the whole model, reinforcing information separation and machine translation performance simultaneously.
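One common way to realize such an encoder-vs-discriminator game is a gradient reversal layer, sketched below; this is an illustrative assumption, not necessarily the training scheme the paper uses.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        # Flip the gradient: the encoder learns to *fool* the discriminator.
        return -ctx.lam * grad, None

def domain_adversarial_loss(encoder_out, domain_labels, discriminator, lam=1.0):
    """The discriminator tries to tell the domain from the pooled shared
    encoding; the reversed gradient pushes the encoder toward
    domain-invariant representations."""
    pooled = encoder_out.mean(dim=1)                 # (batch, dim)
    logits = discriminator(GradReverse.apply(pooled, lam))
    return nn.functional.cross_entropy(logits, domain_labels)
```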

Domain Adaptation · Machine Translation · +1
