Search Results for author: Alexander Ku

Found 13 papers, 4 papers with code

A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning

no code implementations CVPR 2023 Aishwarya Kamath, Peter Anderson, Su Wang, Jing Yu Koh, Alexander Ku, Austin Waters, Yinfei Yang, Jason Baldridge, Zarana Parekh

Recent studies in Vision-and-Language Navigation (VLN) train RL agents to execute natural-language navigation instructions in photorealistic environments, as a step towards robots that can follow human instructions.

 Ranked #1 on Vision and Language Navigation on RxR (using extra training data)

Imitation Learning, Instruction Following +1

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

2 code implementations 22 Jun 2022 Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, ZiRui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, Ben Hutchinson, Wei Han, Zarana Parekh, Xin Li, Han Zhang, Jason Baldridge, Yonghui Wu

We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge.

Machine Translation, Text-to-Image Generation +1

Vector-quantized Image Modeling with Improved VQGAN

5 code implementations ICLR 2022 Jiahui Yu, Xin Li, Jing Yu Koh, Han Zhang, Ruoming Pang, James Qin, Alexander Ku, Yuanzhong Xu, Jason Baldridge, Yonghui Wu

Motivated by this success, we explore a Vector-quantized Image Modeling (VIM) approach that involves pretraining a Transformer to predict rasterized image tokens autoregressively.

Image Generation, Representation Learning +1
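
As a rough illustration of the "rasterize then predict" idea described in this entry, the sketch below flattens a small grid of discrete image token ids into a 1D sequence and builds the shifted input/target pairs used for next-token training. The grid values, codebook size, and BOS id are invented for illustration; this is not the paper's tokenizer or model.

    # Illustrative 4x4 grid of discrete image token ids, as produced by a VQ tokenizer
    # (values are made up for this example).
    token_grid = [
        [12, 7, 7, 3],
        [5, 9, 1, 3],
        [5, 0, 8, 8],
        [2, 2, 6, 4],
    ]

    BOS = 1024  # assumed start-of-sequence id, chosen outside the hypothetical codebook range

    # Rasterize the grid row by row into a single token sequence.
    sequence = [tok for row in token_grid for tok in row]

    # Teacher-forcing pairs for autoregressive training: given the prefix,
    # the model is trained to predict the next rasterized token.
    inputs = [BOS] + sequence[:-1]
    targets = sequence
    print(list(zip(inputs, targets))[:4])  # [(1024, 12), (12, 7), (7, 7), (7, 3)]
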

PanGEA: The Panoramic Graph Environment Annotation Toolkit

no code implementations NAACL (ALVR) 2021 Alexander Ku, Peter Anderson, Jordi Pont-Tuset, Jason Baldridge

PanGEA, the Panoramic Graph Environment Annotation Toolkit, is a lightweight system for collecting speech and text annotations in photo-realistic 3D environments.

Instruction Following

On the Evaluation of Vision-and-Language Navigation Instructions

no code implementations EACL 2021 Ming Zhao, Peter Anderson, Vihan Jain, Su Wang, Alexander Ku, Jason Baldridge, Eugene Ie

Vision-and-Language Navigation wayfinding agents can be enhanced by exploiting automatically generated navigation instructions.

Vision and Language Navigation

Transferable Representation Learning in Vision-and-Language Navigation

no code implementations ICCV 2019 Haoshuo Huang, Vihan Jain, Harsh Mehta, Alexander Ku, Gabriel Magalhaes, Jason Baldridge, Eugene Ie

Vision-and-Language Navigation (VLN) tasks such as Room-to-Room (R2R) require machine agents to interpret natural language instructions and learn to act in visually realistic environments to achieve navigation goals.

Representation Learning, Vision and Language Navigation

General Evaluation for Instruction Conditioned Navigation using Dynamic Time Warping

1 code implementation 11 Jul 2019 Gabriel Ilharco, Vihan Jain, Alexander Ku, Eugene Ie, Jason Baldridge

We address fundamental flaws in previously used metrics and show how Dynamic Time Warping (DTW), a long-established method for measuring similarity between two time series, can be used to evaluate navigation agents (a minimal DTW sketch follows this entry).

Dynamic Time Warping, Navigate +2
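
A minimal sketch of how DTW aligns a predicted navigation path with a reference path, assuming 2D points and plain Euclidean distance. The example paths are invented, and this is the textbook DTW recurrence rather than the paper's exact normalized nDTW metric.

    # Minimal DTW sketch: aligns two sequences of (x, y) points with Euclidean distance.
    import math

    def dtw(reference, prediction):
        """Return the DTW alignment cost between two sequences of (x, y) points."""
        n, m = len(reference), len(prediction)
        # cost[i][j] = cost of aligning the first i reference points with the first j predicted points
        cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
        cost[0][0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = math.dist(reference[i - 1], prediction[j - 1])
                cost[i][j] = d + min(cost[i - 1][j],      # advance along the reference
                                     cost[i][j - 1],      # advance along the prediction
                                     cost[i - 1][j - 1])  # advance along both
        return cost[n][m]

    # Example: a predicted path that loosely follows a straight reference path.
    reference = [(0, 0), (1, 0), (2, 0), (3, 0)]
    prediction = [(0, 0.2), (1, 0.1), (1.5, 0.0), (3, 0.1)]
    print(dtw(reference, prediction))

A lower cost means the predicted path stays closer, step by step, to the reference path, which is what makes DTW-style scores useful for judging instruction fidelity rather than only goal success.
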

Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation

no code implementations ACL 2019 Vihan Jain, Gabriel Magalhaes, Alexander Ku, Ashish Vaswani, Eugene Ie, Jason Baldridge

We also show that the existing paths in the dataset are not ideal for evaluating instruction following because they are direct-to-goal shortest paths.

Instruction Following, Vision and Language Navigation

Image Transformer

no code implementations 15 Feb 2018 Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Łukasz Kaiser, Noam Shazeer, Alexander Ku, Dustin Tran

Image generation has been successfully cast as an autoregressive sequence generation or transformation problem.

Density Estimation, Image Generation +1

Capturing Human Category Representations by Sampling in Deep Feature Spaces

no code implementations ICLR 2018 Joshua Peterson, Krishan Aghi, Jordan Suchow, Alexander Ku, Tom Griffiths

In this paper, we introduce a method for estimating the structure of human categories that draws on ideas from both cognitive science and machine learning, blending human-based algorithms with state-of-the-art deep representation learners.

BIG-bench Machine Learning

Neural Networks for irregularly observed continuous-time Stochastic Processes

no code implementations ICLR 2018 Francois W. Belletti, Alexander Ku, Joseph E. Gonzalez

Designing neural networks for continuous-time stochastic processes is challenging, especially when observations are made irregularly.

Video Classification
