1 code implementation • 22 Jun 2022 • Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, Zirui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, Ben Hutchinson, Wei Han, Zarana Parekh, Xin Li, Han Zhang, Jason Baldridge, Yonghui Wu
We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge.
Ranked #1 on Text-to-Image Generation on COCO
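Parti's autoregressive framing treats image generation as next-token prediction over a discrete image-token vocabulary. The sketch below shows only that decoding loop; `step_logits`, the toy model, and the vocabulary size are hypothetical stand-ins, not the Parti implementation:

```python
import numpy as np

def generate_image_tokens(text_ids, step_logits, num_tokens=4):
    """Greedy autoregressive decoding over a discrete image-token vocabulary.

    `step_logits(text_ids, prefix)` is a hypothetical stand-in for a
    seq2seq model returning next-token logits; a real system would use a
    trained transformer and detokenize the result with an image tokenizer.
    """
    prefix = []
    for _ in range(num_tokens):
        logits = step_logits(text_ids, prefix)
        prefix.append(int(np.argmax(logits)))  # greedy choice at each step
    return prefix

# Toy "model": prefers (last token + 1) mod vocab, seeded by the text.
def toy_logits(text_ids, prefix, vocab=8):
    target = (prefix[-1] + 1) % vocab if prefix else text_ids[0] % vocab
    return np.eye(vocab)[target]  # one-hot logits peaking at `target`

print(generate_image_tokens([3, 5], toy_logits))  # [3, 4, 5, 6]
```

In a real model the loop is identical; only the logit source (a large transformer conditioned on the text) and the detokenization step differ.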
In this paper, we leverage a noisy dataset of over one billion image alt-text pairs, obtained without expensive filtering or post-processing steps in the Conceptual Captions dataset.
Ranked #1 on Image Classification on VTAB-1k (using extra training data)
We evaluate our methods on de-noising parallel texts and training neural machine translation models.
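A toy illustration of the denoising side of such an evaluation, assuming a simple length-ratio heuristic (real pipelines combine heuristics like this with model-based scores, e.g. dual conditional cross-entropy filtering):

```python
def keep_pair(src, tgt, max_len_ratio=2.0):
    """Toy heuristic for denoising parallel text: reject pairs that are
    empty or whose token-length ratio is extreme. The threshold and the
    whitespace tokenization are illustrative assumptions."""
    s, t = src.split(), tgt.split()
    if not s or not t:
        return False  # one side empty: misaligned pair
    ratio = max(len(s), len(t)) / min(len(s), len(t))
    return ratio <= max_len_ratio

print(keep_pair("the cat sat", "le chat"))                 # True  (ratio 1.5)
print(keep_pair("a", "un deux trois quatre cinq"))         # False (ratio 5.0)
```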
We present a novel approach to the challenging problem of label-free text style transfer.
Unlike previous approaches requiring style-labeled training data, our method makes use of readily available unlabeled text by relying on the implicit connection in style between adjacent sentences, and uses labeled data only at inference time.
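One way to read "labeled data only at inference time" is as a style direction estimated from a few labeled exemplars and applied in embedding space. The sketch below assumes fixed-size style embeddings and is an illustration of that idea, not the paper's architecture:

```python
import numpy as np

def style_shift(input_vec, source_exemplars, target_exemplars, scale=1.0):
    """Inference-time style transfer in embedding space: the labeled
    exemplars are used only here, to estimate a style direction, never
    during training. All vectors are hypothetical style embeddings."""
    direction = (np.mean(target_exemplars, axis=0)
                 - np.mean(source_exemplars, axis=0))
    return input_vec + scale * direction

# Move a content vector one unit along a toy "formality" direction.
shifted = style_shift(np.array([1.0, 0.0]),
                      source_exemplars=np.array([[0.0, 0.0]]),
                      target_exemplars=np.array([[0.0, 1.0]]))
print(shifted)  # [1. 1.]
```

A real system would decode the shifted vector back to text; here the point is only that no style labels are needed until this call.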
By supporting multi-modal retrieval training and evaluation, image captioning datasets have spurred remarkable progress on representation learning.
Most data selection research in machine translation focuses on improving a single domain.
The task of retrieving clips within videos based on a given natural language query requires cross-modal reasoning over multiple frames.
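A minimal sketch of that cross-modal reasoning: score each candidate window of frames by its similarity to the query embedding and return the best span. The embeddings and the fixed-window scoring are assumptions for illustration, not the paper's model:

```python
import numpy as np

def best_clip(query_vec, frame_vecs, window=3):
    """Score each contiguous window of frames by mean cosine similarity
    to the query embedding; return the best (start, end) frame span.
    A real system would get both embeddings from a trained
    cross-modal encoder."""
    q = query_vec / np.linalg.norm(query_vec)
    f = frame_vecs / np.linalg.norm(frame_vecs, axis=1, keepdims=True)
    sims = f @ q  # per-frame cosine similarity to the query
    scores = [sims[i:i + window].mean()
              for i in range(len(sims) - window + 1)]
    start = int(np.argmax(scores))
    return start, start + window

frames = np.array([[1, 0], [0, 1], [0, 1], [0, 1], [1, 0]], dtype=float)
print(best_clip(np.array([0.0, 1.0]), frames))  # (1, 4)
```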
RLIE-DQN is a recently proposed Reinforcement Learning-based Information Extraction (IE) technique that can incorporate external evidence during the extraction process.
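The core of a DQN-style IE agent is the temporal-difference backup on Q-values for actions such as accepting an extraction or querying external evidence. The tabular version below shows the update target such a network is trained toward; the state and action names are hypothetical:

```python
def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One Q-learning backup toward r + gamma * max_a' Q(s', a').
    A DQN approximates Q with a neural network; this tabular stand-in
    makes the same target explicit."""
    best_next = max(Q[next_state].values()) if Q.get(next_state) else 0.0
    td_target = reward + gamma * best_next
    Q[state][action] += alpha * (td_target - Q[state][action])
    return Q

# Toy states: "s0" chooses between accepting or querying evidence.
Q = {"s0": {"extract": 0.0, "query": 0.0}, "s1": {"extract": 1.0}}
q_update(Q, "s0", "query", reward=0.0, next_state="s1")
print(Q["s0"]["query"])  # 0.45
```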