Search Results for author: Dongdong Zhang

Found 37 papers, 20 papers with code

Towards Making the Most of Cross-Lingual Transfer for Zero-Shot Neural Machine Translation

1 code implementation • ACL 2022 • Guanhua Chen, Shuming Ma, Yun Chen, Dongdong Zhang, Jia Pan, Wenping Wang, Furu Wei

When applied to zero-shot cross-lingual abstractive summarization, it produces an average performance gain of 12.3 ROUGE-L over mBART-ft. We conduct detailed analyses to understand the key ingredients of SixT+, including multilinguality of the auxiliary parallel data, positional disentangled encoder, and the cross-lingual transferability of its encoder.

Abstractive Text Summarization • Cross-Lingual Abstractive Summarization • +5

Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models

no code implementations • 26 Feb 2024 • Tianyi Tang, Wenyang Luo, Haoyang Huang, Dongdong Zhang, Xiaolei Wang, Xin Zhao, Furu Wei, Ji-Rong Wen

Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora.
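
The paper's detection idea can be approximated as follows: estimate, per neuron, the probability of activation on text in each language, then select neurons whose activation distribution has low entropy across languages (i.e., concentrates on a single language). This is a minimal sketch, not the paper's exact procedure; the `top_k` cutoff is an illustrative assumption.

```python
import torch

def language_specific_neurons(act_prob, top_k=100):
    """Identify neurons whose activation concentrates on one language.

    act_prob: [num_languages, num_neurons] tensor, where entry (l, j) is
    the empirical probability that neuron j fires (activation > 0) on
    text in language l. Neurons with low entropy across languages are
    treated as language-specific (an illustrative criterion).
    """
    p = act_prob / act_prob.sum(dim=0, keepdim=True).clamp(min=1e-9)
    entropy = -(p * p.clamp(min=1e-9).log()).sum(dim=0)  # [num_neurons]
    neuron_ids = entropy.argsort()[:top_k]               # most language-specific
    owner_lang = act_prob[:, neuron_ids].argmax(dim=0)   # language each neuron serves
    return neuron_ids, owner_lang

# Toy usage: 3 languages, 8 neurons.
probs = torch.rand(3, 8)
ids, langs = language_specific_neurons(probs, top_k=3)
```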

Not All Metrics Are Guilty: Improving NLG Evaluation with LLM Paraphrasing

1 code implementation • 24 May 2023 • Tianyi Tang, Hongyuan Lu, Yuchen Eleanor Jiang, Haoyang Huang, Dongdong Zhang, Wayne Xin Zhao, Furu Wei

Most research about natural language generation (NLG) relies on evaluation benchmarks with limited references for a sample, which may result in poor correlations with human judgements.

Machine Translation • NLG Evaluation • +2
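
The fix suggested by the title, enlarging a sparse reference set with LLM paraphrases before computing metric scores, can be sketched as follows. `paraphrase` and `metric` are stand-ins for an LLM call and any reference-based NLG metric; the max-over-references aggregation is a common choice here, not necessarily the paper's.

```python
from typing import Callable, List

def expand_references(refs: List[str],
                      paraphrase: Callable[[str], List[str]]) -> List[str]:
    """Augment a small reference set with LLM-generated paraphrases."""
    expanded = list(refs)
    for ref in refs:
        expanded.extend(paraphrase(ref))  # e.g. prompt an LLM for k rewrites
    return expanded

def multi_reference_score(hypothesis: str, refs: List[str],
                          metric: Callable[[str, str], float]) -> float:
    """Score the hypothesis against its best-matching reference."""
    return max(metric(hypothesis, ref) for ref in refs)
```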

Target-Agnostic Gender-Aware Contrastive Learning for Mitigating Bias in Multilingual Machine Translation

no code implementations • 23 May 2023 • Minwoo Lee, Hyukhun Koh, Kang-il Lee, Dongdong Zhang, Minsung Kim, Kyomin Jung

In this paper, we specifically target the gender bias issue of multilingual machine translation models for unambiguous cases where there is a single correct translation, and propose a bias mitigation method based on a novel approach.

Contrastive Learning • Machine Translation • +1
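
As a rough sketch of what a gender-aware contrastive objective could look like (the snippet above does not spell out the loss, so everything below is an illustrative assumption): pull a pooled sentence representation toward an anchor embedding of the correct gender and away from the alternatives.

```python
import torch
import torch.nn.functional as F

def gender_contrastive_loss(sent_repr, gender_embs, correct_idx, tau=0.1):
    """InfoNCE-style loss: align sentence representations with the
    embedding of the correct gender (positive) against the rest.

    sent_repr:   [batch, dim] pooled encoder states
    gender_embs: [num_genders, dim] learned gender anchor embeddings
    correct_idx: [batch] index of the correct gender per sentence
    """
    sims = F.cosine_similarity(sent_repr.unsqueeze(1),
                               gender_embs.unsqueeze(0), dim=-1) / tau
    return F.cross_entropy(sims, correct_idx)

# Toy usage
loss = gender_contrastive_loss(torch.randn(4, 16), torch.randn(2, 16),
                               torch.tensor([0, 1, 0, 1]))
```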

One-stop Training of Multiple Capacity Models

no code implementations • 23 May 2023 • Lan Jiang, Haoyang Huang, Dongdong Zhang, Rui Jiang, Furu Wei

Notably, the analysis demonstrates that our method significantly influences the initial training process, leading to more efficient convergence and superior solutions.

Knowledge Distillation • Machine Translation • +1

Chain-of-Dictionary Prompting Elicits Translation in Large Language Models

no code implementations • 11 May 2023 • Hongyuan Lu, Haoyang Huang, Dongdong Zhang, Haoran Yang, Wai Lam, Furu Wei

Large language models (LLMs) have shown surprisingly good performance in multilingual neural machine translation (MNMT) even when trained without parallel data.

In-Context Learning • Machine Translation • +1
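
Chain-of-dictionary prompting prepends chained multilingual dictionary hints for selected source words to the translation request. The template below is an approximation of the idea, not necessarily the paper's exact wording.

```python
def chain_of_dictionary_prompt(source, src_lang, tgt_lang, chains):
    """Build a CoD-style prompt.

    chains: tuples of word translations through a chain of languages,
    e.g. ("cat", "chat", "Katze") for English -> French -> German.
    """
    hints = "\n".join(
        " means ".join(f'"{w}"' for w in chain) for chain in chains
    )
    return (f"{hints}\n\n"
            f"Translate the following text from {src_lang} into {tgt_lang}:\n"
            f"{source}")

prompt = chain_of_dictionary_prompt(
    "The cat sleeps.", "English", "German",
    chains=[("cat", "chat", "Katze"), ("sleeps", "dort", "schläft")],
)
```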

Are More Layers Beneficial to Graph Transformers?

1 code implementation • 1 Mar 2023 • Haiteng Zhao, Shuming Ma, Dongdong Zhang, Zhi-Hong Deng, Furu Wei

Although going deep has proven successful in many neural architectures, existing graph transformers remain relatively shallow.

GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator

1 code implementation • 20 Dec 2022 • Jian Yang, Shuming Ma, Li Dong, Shaohan Huang, Haoyang Huang, Yuwei Yin, Dongdong Zhang, Liqun Yang, Furu Wei, Zhoujun Li

Inspired by the idea of Generative Adversarial Networks (GANs), we propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator, unifying the ability of language understanding and generation in a single model.

Denoising • Sentence • +1
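
A highly simplified training step in the spirit of this setup: the encoder-decoder generator is trained on sequence-to-sequence generation, while an auxiliary discriminator learns to tell gold target tokens from generator samples. The replaced-token detection objective and both model signatures are assumptions of this sketch.

```python
import torch
import torch.nn.functional as F

def ganlm_style_step(generator, discriminator, src_ids, tgt_ids, pad_id=0):
    """One pre-training step combining generation and discrimination.

    generator(src_ids, tgt_in) -> logits [batch, len, vocab]
    discriminator(src_ids, tokens) -> logits [batch, len] (real-vs-replaced)
    Both signatures are assumptions for this sketch.
    """
    tgt_in, tgt_out = tgt_ids[:, :-1], tgt_ids[:, 1:]
    logits = generator(src_ids, tgt_in)
    gen_loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               tgt_out.reshape(-1), ignore_index=pad_id)

    # Sample replacements from the generator and mark which tokens changed.
    with torch.no_grad():
        sampled = torch.distributions.Categorical(logits=logits).sample()
    replaced = (sampled != tgt_out).float()
    disc_logits = discriminator(src_ids, sampled)
    disc_loss = F.binary_cross_entropy_with_logits(disc_logits, replaced)

    return gen_loss + disc_loss
```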

Advancing Multilingual Pre-training: TRIP Triangular Document-level Pre-training for Multilingual Language Models

no code implementations • 15 Dec 2022 • Hongyuan Lu, Haoyang Huang, Shuming Ma, Dongdong Zhang, Wai Lam, Furu Wei

Despite the success of multilingual sequence-to-sequence pre-training, most existing approaches rely on document-level monolingual corpora in many different languages and sentence-level bilingual corpora (in this paper, 'bilingual corpora' denotes parallel corpora with bilingual translation pairs in many different language pairs, each consisting of two sentences/documents with the same meaning written in different languages).

Abstractive Text Summarization • Cross-Lingual Abstractive Summarization • +4

A Bilingual Parallel Corpus with Discourse Annotations

1 code implementation • 26 Oct 2022 • Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Mrinmaya Sachan, Ryan Cotterell

The BWB corpus consists of Chinese novels translated by experts into English, and the annotated test set is designed to probe the ability of machine translation systems to model various discourse phenomena.

Document Level Machine Translation • Machine Translation • +2

LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine Translation

no code implementations • 19 Oct 2022 • Hongcheng Guo, Jiaheng Liu, Haoyang Huang, Jian Yang, Zhoujun Li, Dongdong Zhang, Zheng Cui, Furu Wei

To this end, we first propose the Multilingual MMT task by establishing two new Multilingual MMT benchmark datasets covering seven languages.

Multimodal Machine Translation • Translation

Revamping Multilingual Agreement Bidirectionally via Switched Back-translation for Multilingual Neural Machine Translation

no code implementations • 28 Sep 2022 • Hongyuan Lu, Haoyang Huang, Shuming Ma, Dongdong Zhang, Furu Wei, Wai Lam

Despite the fact that multilingual agreement (MA) has shown its importance for multilingual neural machine translation (MNMT), current methodologies in the field have two shortcomings: (i) they require parallel data between multiple language pairs, which is not always realistic, and (ii) they optimize the agreement in an ambiguous direction, which hampers translation performance.

Document Level Machine Translation • Document Translation • +2

GTrans: Grouping and Fusing Transformer Layers for Neural Machine Translation

1 code implementation • 29 Jul 2022 • Jian Yang, Yuwei Yin, Liqun Yang, Shuming Ma, Haoyang Huang, Dongdong Zhang, Furu Wei, Zhoujun Li

The Transformer architecture, built by stacking a sequence of encoder and decoder layers, has achieved significant progress in neural machine translation.

Machine Translation • Translation
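
One way to read "grouping and fusing": partition the stack's layer outputs into groups and fuse the group representations with learned weights before prediction. The mean-pooling within groups and softmax fusion below are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class GroupFusion(nn.Module):
    """Fuse the outputs of layer groups with learned scalar weights."""

    def __init__(self, num_groups: int):
        super().__init__()
        self.weights = nn.Parameter(torch.zeros(num_groups))

    def forward(self, layer_outputs, group_size: int):
        # layer_outputs: list of [batch, len, dim] tensors, one per layer.
        groups = [torch.stack(layer_outputs[i:i + group_size]).mean(0)
                  for i in range(0, len(layer_outputs), group_size)]
        w = torch.softmax(self.weights, dim=0)
        return sum(w[g] * h for g, h in enumerate(groups))

fusion = GroupFusion(num_groups=3)
outs = [torch.randn(2, 5, 8) for _ in range(6)]
fused = fusion(outs, group_size=2)  # [2, 5, 8]
```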

HLT-MT: High-resource Language-specific Training for Multilingual Neural Machine Translation

1 code implementation • 11 Jul 2022 • Jian Yang, Yuwei Yin, Shuming Ma, Dongdong Zhang, Zhoujun Li, Furu Wei

Nonetheless, multilingual training suffers from degeneration of shared parameters caused by negative interference among different translation directions, especially for high-resource languages.

Machine Translation • Translation

DeepNet: Scaling Transformers to 1,000 Layers

6 code implementations • 1 Mar 2022 • Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Furu Wei

In this paper, we propose a simple yet effective method to stabilize extremely deep Transformers.

Translation
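
The method's core is DeepNorm, which up-weights the residual branch inside post-layer-norm, x_{l+1} = LN(alpha * x_l + G(x_l)); for an N-layer encoder-only model the paper sets alpha = (2N)^(1/4), with a matching initialization gain beta = (8N)^(-1/4) omitted here for brevity. A minimal residual block using this rule:

```python
import torch
import torch.nn as nn

class DeepNormBlock(nn.Module):
    """Residual sublayer with DeepNorm: LN(alpha * x + sublayer(x)).

    alpha follows the encoder-only rule alpha = (2N)^(1/4) from the
    paper; the beta initialization scaling is not shown.
    """

    def __init__(self, dim: int, num_layers: int):
        super().__init__()
        self.alpha = (2 * num_layers) ** 0.25
        self.norm = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.norm(self.alpha * x + self.ffn(x))

block = DeepNormBlock(dim=64, num_layers=1000)
y = block(torch.randn(2, 10, 64))
```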

Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt

1 code implementation • 23 Feb 2022 • Lianzhe Huang, Shuming Ma, Dongdong Zhang, Furu Wei, Houfeng Wang

To collocate with the unified prompt, we propose a new initialization method for the target label word to further improve the model's transferability across languages.

Zero-Shot Cross-Lingual Transfer

PAEG: Phrase-level Adversarial Example Generation for Neural Machine Translation

no code implementations • COLING 2022 • Juncheng Wan, Jian Yang, Shuming Ma, Dongdong Zhang, Weinan Zhang, Yong Yu, Zhoujun Li

While end-to-end neural machine translation (NMT) has achieved impressive progress, noisy input usually leads models to become fragile and unstable.

Machine Translation • NMT • +1
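
Schematically, a phrase-level adversarial example keeps the meaning of the source while swapping phrases for near-equivalent alternatives, so training on such pairs encourages robustness to phrase-level noise. The phrase-table format, greedy bigram matching, and swap probability below are assumptions for illustration, not the paper's algorithm.

```python
import random
from typing import Dict, List

def phrase_adversarial_example(tokens: List[str],
                               phrase_table: Dict[str, List[str]],
                               swap_prob: float = 0.15,
                               rng: random.Random = random.Random(0)):
    """Replace phrases with near-synonymous alternatives to create a
    noisy-but-meaning-preserving adversarial source sentence."""
    out, i = [], 0
    while i < len(tokens):
        # Greedily try bigram phrases first, then unigrams.
        for span in (2, 1):
            phrase = " ".join(tokens[i:i + span])
            alts = phrase_table.get(phrase)
            if alts and rng.random() < swap_prob:
                out.extend(rng.choice(alts).split())
                i += span
                break
        else:
            out.append(tokens[i])
            i += 1
    return out

table = {"new york": ["nyc"], "big": ["large", "huge"]}
print(phrase_adversarial_example("a big apple in new york".split(), table, 1.0))
```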

SMDT: Selective Memory-Augmented Neural Document Translation

no code implementations • 5 Jan 2022 • Xu Zhang, Jian Yang, Haoyang Huang, Shuming Ma, Dongdong Zhang, Jinlong Li, Furu Wei

Existing document-level neural machine translation (NMT) models have sufficiently explored different context settings to provide guidance for target generation.

Document Level Machine Translation • Document Translation • +4

Towards Making the Most of Multilingual Pretraining for Zero-Shot Neural Machine Translation

1 code implementation • 16 Oct 2021 • Guanhua Chen, Shuming Ma, Yun Chen, Dongdong Zhang, Jia Pan, Wenping Wang, Furu Wei

When applied to zero-shot cross-lingual abstractive summarization, it produces an average performance gain of 12.3 ROUGE-L over mBART-ft. We conduct detailed analyses to understand the key ingredients of SixT+, including multilinguality of the auxiliary parallel data, positional disentangled encoder, and the cross-lingual transferability of its encoder.

Abstractive Text Summarization • Cross-Lingual Abstractive Summarization • +5

Multilingual Agreement for Multilingual Neural Machine Translation

no code implementations • ACL 2021 • Jian Yang, Yuwei Yin, Shuming Ma, Haoyang Huang, Dongdong Zhang, Zhoujun Li, Furu Wei

Although multilingual neural machine translation (MNMT) enables multiple language translations, the training process is based on independent multilingual objectives.

Machine Translation • Translation
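
The agreement objective can be sketched as an auxiliary term that ties together the model's output distributions when translating semantically equivalent sources into the same target; the symmetric KL below is one natural instantiation, not necessarily the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def agreement_loss(logits_a, logits_b):
    """Symmetric KL between output distributions for two source
    languages paired with the same target (per-token, batch-averaged)."""
    log_p = F.log_softmax(logits_a, dim=-1)
    log_q = F.log_softmax(logits_b, dim=-1)
    kl_pq = F.kl_div(log_q, log_p, log_target=True, reduction="batchmean")
    kl_qp = F.kl_div(log_p, log_q, log_target=True, reduction="batchmean")
    return 0.5 * (kl_pq + kl_qp)

# Toy usage: the same target decoded from an English and a German source.
loss = agreement_loss(torch.randn(2, 7, 100), torch.randn(2, 7, 100))
```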

DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

2 code implementations • 25 Jun 2021 • Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei

While pretrained encoders have achieved success in various natural language understanding (NLU) tasks, there is a gap between these pretrained encoders and natural language generation (NLG).

Abstractive Text Summarization • Machine Translation • +5

Smart-Start Decoding for Neural Machine Translation

no code implementations • NAACL 2021 • Jian Yang, Shuming Ma, Dongdong Zhang, Juncheng Wan, Zhoujun Li, Ming Zhou

Most current neural machine translation models adopt a monotonic decoding order of either left-to-right or right-to-left.

Machine Translation • Translation

How Does Distilled Data Complexity Impact the Quality and Confidence of Non-Autoregressive Machine Translation?

no code implementations • Findings (ACL) 2021 • Weijia Xu, Shuming Ma, Dongdong Zhang, Marine Carpuat

While non-autoregressive (NAR) models are showing great promise for machine translation, their use is limited by their dependence on knowledge distillation from autoregressive models.

Knowledge Distillation • Machine Translation • +1
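
The dependence in question is sequence-level knowledge distillation: each gold target is replaced by the autoregressive teacher's own translation, yielding a simpler training distribution for the NAR student. A minimal data-building sketch, with `translate` standing in for any AR decoder:

```python
from typing import Callable, Iterable, List, Tuple

def build_distilled_corpus(pairs: Iterable[Tuple[str, str]],
                           translate: Callable[[str], str]
                           ) -> List[Tuple[str, str]]:
    """Sequence-level KD: keep the source side, replace each gold target
    with the autoregressive teacher's translation. The NAR student is
    then trained on this simpler, more deterministic corpus."""
    return [(src, translate(src)) for src, _tgt in pairs]
```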

M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training

1 code implementation • CVPR 2021 • Minheng Ni, Haoyang Huang, Lin Su, Edward Cui, Taroon Bharti, Lijuan Wang, Jianfeng Gao, Dongdong Zhang, Nan Duan

We present M3P, a Multitask Multilingual Multimodal Pre-trained model that combines multilingual pre-training and multimodal pre-training into a unified framework via multitask pre-training.

Image Captioning • Image Retrieval • +4
