Search Results for author: Pengzhi Gao

Found 7 papers, 6 papers with code

Towards Boosting Many-to-Many Multilingual Machine Translation with Large Language Models

1 code implementation11 Jan 2024 Pengzhi Gao, Zhongjun He, Hua Wu, Haifeng Wang

The training paradigm for machine translation has gradually shifted, from learning neural machine translation (NMT) models with extensive parallel corpora to instruction finetuning on multilingual large language models (LLMs) with high-quality translation pairs.

Machine Translation NMT +1

An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation

1 code implementation28 Aug 2023 Pengzhi Gao, Ruiqing Zhang, Zhongjun He, Hua Wu, Haifeng Wang

Consistency regularization methods, such as R-Drop (Liang et al., 2021) and CrossConST (Gao et al., 2023), have achieved impressive supervised and zero-shot performance in the neural machine translation (NMT) field.

Machine Translation NMT +2

Learning Multilingual Sentence Representations with Cross-lingual Consistency Regularization

1 code implementation12 Jun 2023 Pengzhi Gao, Liwen Zhang, Zhongjun He, Hua Wu, Haifeng Wang

Multilingual sentence representations are the foundation for similarity-based bitext mining, which is crucial for scaling multilingual neural machine translation (NMT) system to more languages.

Machine Translation NMT +2

Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency Regularization

1 code implementation12 May 2023 Pengzhi Gao, Liwen Zhang, Zhongjun He, Hua Wu, Haifeng Wang

The experimental analysis also proves that CrossConST could close the sentence representation gap and better align the representation space.

Machine Translation NMT +2

Mixup Decoding for Diverse Machine Translation

no code implementations Findings (EMNLP) 2021 Jicheng Li, Pengzhi Gao, Xuanfu Wu, Yang Feng, Zhongjun He, Hua Wu, Haifeng Wang

To further improve the faithfulness and diversity of the translations, we propose two simple but effective approaches to select diverse sentence pairs in the training corpus and adjust the interpolation weight for each pair correspondingly.

Machine Translation Sentence +1

A Data-Centric Framework for Composable NLP Workflows

1 code implementation EMNLP 2020 Zhengzhong Liu, Guanxiong Ding, Avinash Bukkittu, Mansi Gupta, Pengzhi Gao, Atif Ahmed, Shikun Zhang, Xin Gao, Swapnil Singhavi, Linwei Li, Wei Wei, Zecong Hu, Haoran Shi, Haoying Zhang, Xiaodan Liang, Teruko Mitamura, Eric P. Xing, Zhiting Hu

Empirical natural language processing (NLP) systems in application domains (e. g., healthcare, finance, education) involve interoperation among multiple components, ranging from data ingestion, human annotation, to text retrieval, analysis, generation, and visualization.

Retrieval Text Retrieval

Cannot find the paper you are looking for? You can Submit a new open access paper.