Search Results for author: Zarana Parekh

Found 10 papers, 6 papers with code

Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision

4 code implementations11 Feb 2021 Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, YunHsuan Sung, Zhen Li, Tom Duerig

In this paper, we leverage a noisy dataset of over one billion image alt-text pairs, obtained without expensive filtering or post-processing steps in the Conceptual Captions dataset.

 Ranked #1 on Image Classification on VTAB-1k (using extra training data)

Cross-Modal Retrieval Fine-Grained Image Classification +6

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

2 code implementations22 Jun 2022 Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, ZiRui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, Ben Hutchinson, Wei Han, Zarana Parekh, Xin Li, Han Zhang, Jason Baldridge, Yonghui Wu

We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge.

Machine Translation Text-to-Image Generation +1

TextSETTR: Few-Shot Text Style Extraction and Tunable Targeted Restyling

1 code implementation ACL 2021 Parker Riley, Noah Constant, Mandy Guo, Girish Kumar, David Uthus, Zarana Parekh

Unlike previous approaches requiring style-labeled training data, our method makes use of readily-available unlabeled text by relying on the implicit connection in style between adjacent sentences, and uses labeled data only at inference time.

Style Transfer Text Style Transfer

ExCL: Extractive Clip Localization Using Natural Language Descriptions

1 code implementation NAACL 2019 Soham Ghosh, Anuva Agarwal, Zarana Parekh, Alexander Hauptmann

The task of retrieving clips within videos based on a given natural language query requires cross-modal reasoning over multiple frames.

Speeding up Reinforcement Learning-based Information Extraction Training using Asynchronous Methods

1 code implementation EMNLP 2017 Aditya Sharma, Zarana Parekh, Partha Talukdar

RLIE-DQN is a recently proposed Reinforcement Learning-based Information Extraction (IE) technique which is able to incorporate external evidence during the extraction process.

reinforcement-learning Reinforcement Learning (RL)

A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning

no code implementations CVPR 2023 Aishwarya Kamath, Peter Anderson, Su Wang, Jing Yu Koh, Alexander Ku, Austin Waters, Yinfei Yang, Jason Baldridge, Zarana Parekh

Recent studies in Vision-and-Language Navigation (VLN) train RL agents to execute natural-language navigation instructions in photorealistic environments, as a step towards robots that can follow human instructions.

 Ranked #1 on Vision and Language Navigation on RxR (using extra training data)

Imitation Learning Instruction Following +1

Cannot find the paper you are looking for? You can Submit a new open access paper.