Search Results for author: Zarana Parekh

Found 12 papers, 6 papers with code

Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models

no code implementations 27 May 2024 Cristina N. Vasconcelos, Abdullah Rashwan, Austin Waters, Trevor Walker, Keyang Xu, Jimmy Yan, Rui Qian, Shixin Luo, Zarana Parekh, Andrew Bunner, Hongliang Fei, Roopal Garg, Mandy Guo, Ivana Kajic, Yeqing Li, Henna Nandwani, Jordi Pont-Tuset, Yasumasa Onoe, Sarah Rosston, Su Wang, Wenlei Zhou, Kevin Swersky, David J. Fleet, Jason M. Baldridge, Oliver Wang

Building on this core model, we propose a greedy algorithm that grows the architecture into high-resolution end-to-end models, while preserving the integrity of the pre-trained representation, stabilizing training, and reducing the need for large high-resolution datasets.
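The greedy growing idea can be sketched schematically: stages are appended one at a time on top of a pretrained low-resolution core, with only the new stage trained so the core's representation is preserved. Everything below (names, the doubling schedule, the string stand-ins for modules) is an illustrative assumption, not the paper's actual architecture:

```python
def greedy_grow(core, target_res, base_res=64):
    """Toy sketch of greedy resolution growing: append one upsampling
    stage at a time on top of a pretrained low-resolution core, doubling
    resolution per step while the core stays fixed.
    (All names and the stage structure are illustrative assumptions.)"""
    stages, res = [core], base_res
    while res < target_res:
        res *= 2
        # each new stage would be trained greedily; the core is not retrained
        stages.append(f"upsample_to_{res}")
    return stages

stages = greedy_grow("pretrained_64px_core", 512)
```

Growing from a 64px core to 512px this way yields three new stages (128, 256, 512) stacked on the frozen core.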


A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning

no code implementations CVPR 2023 Aishwarya Kamath, Peter Anderson, Su Wang, Jing Yu Koh, Alexander Ku, Austin Waters, Yinfei Yang, Jason Baldridge, Zarana Parekh

Recent studies in Vision-and-Language Navigation (VLN) train RL agents to execute natural-language navigation instructions in photorealistic environments, as a step towards robots that can follow human instructions.

Ranked #1 on Vision and Language Navigation on RxR (using extra training data)

Imitation Learning, Instruction Following, +1

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

2 code implementations22 Jun 2022 Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, ZiRui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, Ben Hutchinson, Wei Han, Zarana Parekh, Xin Li, Han Zhang, Jason Baldridge, Yonghui Wu

We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge.

Decoder, Machine Translation, +2

Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision

4 code implementations11 Feb 2021 Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, YunHsuan Sung, Zhen Li, Tom Duerig

In this paper, we leverage a noisy dataset of over one billion image alt-text pairs, obtained without the expensive filtering or post-processing steps used in the Conceptual Captions dataset.

Ranked #1 on Image Classification on VTAB-1k (using extra training data)

Cross-Modal Retrieval, Fine-Grained Image Classification, +6
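The dual-encoder objective behind this kind of noisy image-text pretraining can be sketched as a symmetric contrastive (InfoNCE-style) loss over a batch of paired embeddings. This is a minimal NumPy illustration; the batch size, temperature, and random embeddings are placeholder assumptions, not the paper's training setup:

```python
import numpy as np

def contrastive_loss(img_emb, txt_emb, temperature=0.05):
    """Symmetric InfoNCE loss for a batch of paired image/text embeddings,
    as used in dual-encoder contrastive pretraining (illustrative sketch)."""
    # L2-normalize both sets of embeddings
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature   # pairwise cosine similarities
    labels = np.arange(len(logits))      # matching pairs lie on the diagonal

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()   # cross-entropy on the diagonal

    # average the image->text and text->image directions
    return (xent(logits) + xent(logits.T)) / 2

rng = np.random.default_rng(0)
loss = contrastive_loss(rng.normal(size=(8, 32)), rng.normal(size=(8, 32)))
```

The loss pulls each image toward its own alt-text and pushes it away from the other captions in the batch, which is what lets very noisy pairs still provide useful signal at scale.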

TextSETTR: Few-Shot Text Style Extraction and Tunable Targeted Restyling

1 code implementation ACL 2021 Parker Riley, Noah Constant, Mandy Guo, Girish Kumar, David Uthus, Zarana Parekh

Unlike previous approaches requiring style-labeled training data, our method makes use of readily available unlabeled text by relying on the implicit connection in style between adjacent sentences, and uses labeled data only at inference time.

Decoder, Style Transfer, +1
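The "tunable targeted restyling" idea can be pictured as vector arithmetic on style embeddings: average the styles of a few target exemplars and a few source exemplars, and move along their difference by a tunable amount. The tiny 2-D vectors below are made-up stand-ins for real style embeddings:

```python
import numpy as np

def restyle_direction(src_exemplars, tgt_exemplars, lam=1.0):
    """Targeted-restyling offset: move from the average source style
    toward the average target style, scaled by a tunable lambda.
    (Illustrative sketch; the style extractor itself is assumed.)"""
    src = np.mean(src_exemplars, axis=0)  # mean style of source exemplars
    tgt = np.mean(tgt_exemplars, axis=0)  # mean style of target exemplars
    return lam * (tgt - src)

# toy "style embeddings" for two clusters (purely illustrative)
formal = np.array([[1.0, 0.0], [0.9, 0.1]])
casual = np.array([[0.0, 1.0], [0.1, 0.9]])
delta = restyle_direction(casual, formal, lam=0.5)
```

Here `delta` points from the casual cluster toward the formal one, and `lam` controls how strongly the restyling is applied; only a handful of exemplars are needed, matching the few-shot setting.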

ExCL: Extractive Clip Localization Using Natural Language Descriptions

1 code implementation NAACL 2019 Soham Ghosh, Anuva Agarwal, Zarana Parekh, Alexander Hauptmann

The task of retrieving clips within videos based on a given natural language query requires cross-modal reasoning over multiple frames.

Speeding up Reinforcement Learning-based Information Extraction Training using Asynchronous Methods

1 code implementation EMNLP 2017 Aditya Sharma, Zarana Parekh, Partha Talukdar

RLIE-DQN is a recently proposed Reinforcement Learning-based Information Extraction (IE) technique that incorporates external evidence during the extraction process.

Reinforcement Learning (RL)
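The asynchronous speedup rests on several workers updating shared parameters concurrently without a lock (Hogwild-style). The toy threading sketch below uses simulated gradients; it illustrates the update pattern only, not the paper's actual agent or environment:

```python
import threading
import numpy as np

# Shared parameter vector, updated concurrently without a lock --
# the core idea behind asynchronous actor-learner training.
params = np.zeros(4)

def worker(seed, steps=100):
    """One asynchronous worker; its gradients here are random stand-ins."""
    global params
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        grad = rng.normal(scale=0.01, size=params.shape)
        params += grad  # lock-free in-place update of the shared vector

threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because workers never wait on one another, wall-clock training time drops roughly with the number of workers, at the cost of occasionally stale updates.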
