1 code implementation • 11 Jan 2024 • Pengzhi Gao, Zhongjun He, Hua Wu, Haifeng Wang
The training paradigm for machine translation has gradually shifted from learning neural machine translation (NMT) models on extensive parallel corpora to instruction finetuning of multilingual large language models (LLMs) on high-quality translation pairs.
1 code implementation • 28 Aug 2023 • Pengzhi Gao, Ruiqing Zhang, Zhongjun He, Hua Wu, Haifeng Wang
Consistency regularization methods, such as R-Drop (Liang et al., 2021) and CrossConST (Gao et al., 2023), have achieved impressive supervised and zero-shot performance in the neural machine translation (NMT) field.
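The consistency regularization idea behind methods like R-Drop can be sketched as penalizing the divergence between two output distributions the model produces for the same input (e.g., under different dropout masks). The sketch below is a minimal NumPy illustration of that loss term, not the authors' implementation; the function names and the symmetric-KL formulation are assumptions for illustration.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q, eps=1e-12):
    # KL(p || q) over the last axis; eps guards against log(0).
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def consistency_loss(logits_a, logits_b):
    # Symmetric KL between two forward passes of the same input
    # (hypothetical R-Drop-style term; in training it would be added
    # to the usual cross-entropy loss with a weighting coefficient).
    p, q = softmax(logits_a), softmax(logits_b)
    return 0.5 * (kl_div(p, q) + kl_div(q, p)).mean()

# Toy example: two slightly perturbed logit tensors standing in for
# two dropout-perturbed forward passes (batch of 3, vocab of 5).
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 5))
noisy = logits + 0.1 * rng.normal(size=(3, 5))

loss = consistency_loss(logits, noisy)       # positive for differing passes
identical = consistency_loss(logits, logits) # exactly zero
```

The symmetric form makes the penalty indifferent to which pass is treated as the "reference" distribution.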
1 code implementation • 12 Jun 2023 • Pengzhi Gao, Liwen Zhang, Zhongjun He, Hua Wu, Haifeng Wang
Multilingual sentence representations are the foundation for similarity-based bitext mining, which is crucial for scaling multilingual neural machine translation (NMT) systems to more languages.
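At its core, similarity-based bitext mining scores candidate sentence pairs across two languages by the similarity of their embeddings and keeps the most confident matches. The following is a minimal sketch with toy embeddings, using cosine similarity and mutual nearest neighbours; the embeddings, threshold, and mutual-NN filter are illustrative assumptions, not the paper's actual mining pipeline (production systems typically use margin-based scoring over approximate nearest neighbours).

```python
import numpy as np

def normalize(x):
    # L2-normalize rows so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def mine_bitext(src_emb, tgt_emb, threshold=0.8):
    # Score every source/target pair by cosine similarity and keep
    # mutual nearest neighbours above a similarity threshold.
    sim = normalize(src_emb) @ normalize(tgt_emb).T
    src_best = sim.argmax(axis=1)  # best target for each source
    tgt_best = sim.argmax(axis=0)  # best source for each target
    pairs = []
    for i, j in enumerate(src_best):
        if tgt_best[j] == i and sim[i, j] >= threshold:
            pairs.append((i, j, float(sim[i, j])))
    return pairs

# Toy "sentence embeddings": source rows 0..2 are noisy copies of
# target rows 1, 0, 2, so those are the pairs mining should recover.
rng = np.random.default_rng(0)
tgt = rng.normal(size=(3, 8))
src = tgt[[1, 0, 2]] + 0.05 * rng.normal(size=(3, 8))

pairs = mine_bitext(src, tgt, threshold=0.8)
```

The mutual-nearest-neighbour check is what keeps the mined corpus precise: a pair survives only if each sentence picks the other as its single best match.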
1 code implementation • 12 May 2023 • Pengzhi Gao, Liwen Zhang, Zhongjun He, Hua Wu, Haifeng Wang
The experimental analysis also shows that CrossConST closes the sentence representation gap and better aligns the representation space.
1 code implementation • NAACL 2022 • Pengzhi Gao, Zhongjun He, Hua Wu, Haifeng Wang
We introduce Bi-SimCut: a simple but effective training strategy to boost neural machine translation (NMT) performance.
Ranked #1 on Machine Translation on WMT2014 German-English
no code implementations • Findings (EMNLP) 2021 • Jicheng Li, Pengzhi Gao, Xuanfu Wu, Yang Feng, Zhongjun He, Hua Wu, Haifeng Wang
To further improve the faithfulness and diversity of the translations, we propose two simple but effective approaches: selecting diverse sentence pairs in the training corpus and adjusting the interpolation weight for each pair accordingly.
1 code implementation • EMNLP 2020 • Zhengzhong Liu, Guanxiong Ding, Avinash Bukkittu, Mansi Gupta, Pengzhi Gao, Atif Ahmed, Shikun Zhang, Xin Gao, Swapnil Singhavi, Linwei Li, Wei Wei, Zecong Hu, Haoran Shi, Haoying Zhang, Xiaodan Liang, Teruko Mitamura, Eric P. Xing, Zhiting Hu
Empirical natural language processing (NLP) systems in application domains (e.g., healthcare, finance, education) involve interoperation among multiple components, ranging from data ingestion and human annotation to text retrieval, analysis, generation, and visualization.