Non-Parametric Domain Adaptation for End-to-End Speech Translation

2 code implementations 23 May 2022 Yichao Du, Weizhi Wang, Zhirui Zhang, Boxing Chen, Tong Xu, Jun Xie, Enhong Chen

End-to-End Speech Translation (E2E-ST) has received increasing attention for its potential to reduce error propagation, lower latency, and use fewer parameters.

Domain Adaptation Translation

Visually-Augmented Language Modeling

1 code implementation 20 May 2022 Weizhi Wang, Li Dong, Hao Cheng, Haoyu Song, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei

With the visually-augmented context, VaLM uses a visual knowledge fusion layer to enable multimodal grounded language modeling by attending to both text context and visual knowledge in images.

Image Retrieval Language Modelling +1

Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement

1 code implementation 21 Dec 2021 Yichao Du, Zhirui Zhang, Weizhi Wang, Boxing Chen, Jun Xie, Tong Xu

In this paper, we attempt to model the joint probability of transcription and translation based on the speech input to directly leverage such triplet data.

Association Automatic Speech Recognition +5

Rethinking Zero-shot Neural Machine Translation: From a Perspective of Latent Variables

1 code implementation Findings (EMNLP) 2021 Weizhi Wang, Zhirui Zhang, Yichao Du, Boxing Chen, Jun Xie, Weihua Luo

However, it usually suffers from capturing spurious correlations between the output language and language-invariant semantics due to the maximum likelihood training objective, leading to poor transfer performance on zero-shot translation.

Denoising Machine Translation +2

Task-Oriented Dialogue System as Natural Language Generation

1 code implementation 31 Aug 2021 Weizhi Wang, Zhirui Zhang, Junliang Guo, Yinpei Dai, Boxing Chen, Weihua Luo

In this paper, we propose to formulate the task-oriented dialogue system as a pure natural language generation task, so as to fully leverage large-scale pre-trained models such as GPT-2 and simplify complicated delexicalization preprocessing.

Text Generation Transfer Learning

Clustering tweets using Wikipedia concepts

no code implementations LREC 2014 Guoyu Tang, Yunqing Xia, Weizhi Wang, Raymond Lau, Fang Zheng

We address the polysemy issue with a Bayesian model, and the synonymy issue by exploiting the Wikipedia redirections.

Text Clustering

