no code implementations • 30 Oct 2024 • Peide Huang, Yuhan Hu, Nataliya Nechyporenko, Daehwa Kim, Walter Talbott, Jian Zhang
This paper introduces a framework, called EMOTION, for generating expressive motion sequences in humanoid robots, enhancing their ability to engage in humanlike non-verbal communication.
no code implementations • 29 Oct 2024 • Murtaza Dalal, Min Liu, Walter Talbott, Chen Chen, Deepak Pathak, Jian Zhang, Ruslan Salakhutdinov
We transfer our local policies from simulation to reality and observe that they can solve unseen long-horizon manipulation tasks of up to 8 stages with significant variation in pose, object, and scene configuration.
no code implementations • 27 Jul 2024 • Tudor Cristea-Platon, Bogdan Mazoure, Josh Susskind, Walter Talbott
Reinforcement learning practitioners often avoid hierarchical policies, especially in image-based observation spaces.
no code implementations • 1 Feb 2024 • Yao-Hung Hubert Tsai, Walter Talbott, Jian Zhang
This paper focuses on decision planning with uncertainty estimation to address the hallucination problem in language models.
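To make the idea concrete, here is a minimal sketch of one common uncertainty signal such planning could build on (not necessarily the paper's estimator): the predictive entropy of the model's next-token distribution, computed with Hugging Face transformers. The model choice and the 2.0 threshold are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model choice; any causal LM works the same way.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def predictive_entropy(prompt: str) -> float:
    """Entropy of the next-token distribution, in nats.
    High entropy is one crude proxy for model uncertainty."""
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # last position
    probs = torch.softmax(logits, dim=-1)
    return float(-(probs * probs.clamp_min(1e-12).log()).sum())

# A planner might only commit to an answer below some threshold
# (the 2.0 cutoff here is arbitrary, for illustration only).
if predictive_entropy("The capital of France is") > 2.0:
    print("uncertain: defer or gather more evidence")
```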
no code implementations • 26 Oct 2023 • Andrew Szot, Max Schwarzer, Harsh Agrawal, Bogdan Mazoure, Walter Talbott, Katherine Metcalf, Natalie Mackraz, Devon Hjelm, Alexander Toshev
We show that large language models (LLMs) can be adapted to be generalizable policies for embodied visual tasks.
no code implementations • 9 Jun 2023 • Bogdan Mazoure, Walter Talbott, Miguel Angel Bautista, Devon Hjelm, Alexander Toshev, Josh Susskind
A fairly reliable trend in deep reinforcement learning is that performance scales with the number of parameters, provided a complementary scaling in the amount of training data.
1 code implementation • 27 Jul 2022 • Miguel Angel Bautista, Pengsheng Guo, Samira Abnar, Walter Talbott, Alexander Toshev, Zhuoyuan Chen, Laurent Dinh, Shuangfei Zhai, Hanlin Goh, Daniel Ulbricht, Afshin Dehghan, Josh Susskind
We introduce GAUDI, a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be rendered immersively from a moving camera.
Ranked #1 on Image Generation on ARKitScenes
1 code implementation • 15 Jul 2022 • Shuangfei Zhai, Navdeep Jaitly, Jason Ramapuram, Dan Busbridge, Tatiana Likhomanenko, Joseph Yitan Cheng, Walter Talbott, Chen Huang, Hanlin Goh, Joshua Susskind
This pretraining strategy, which has been used in BERT models in NLP, Wav2Vec models in Speech, and, recently, MAE models in Vision, forces the model to learn relationships between the content in different parts of the input using autoencoding-related objectives.
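In its simplest form, the masked-prediction objective described above reduces to hiding a fraction of the input tokens and reconstructing them from the visible context. A generic sketch with toy dimensions (not the paper's exact recipe):

```python
import torch
import torch.nn as nn

vocab, dim, mask_id = 1000, 64, 0  # toy sizes; mask_id is a reserved token

embed = nn.Embedding(vocab, dim)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2
)
head = nn.Linear(dim, vocab)

tokens = torch.randint(1, vocab, (8, 16))      # batch of sequences
mask = torch.rand(tokens.shape) < 0.15         # BERT-style 15% masking
corrupted = tokens.masked_fill(mask, mask_id)  # hide the selected tokens

logits = head(encoder(embed(corrupted)))
# The loss is computed only at the masked positions.
loss = nn.functional.cross_entropy(logits[mask], tokens[mask])
loss.backward()
```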
no code implementations • 5 Jul 2022 • Chen Huang, Walter Talbott, Navdeep Jaitly, Josh Susskind
Inspired by the success of ConvNets that are combined with pooling to capture long-range dependencies, we learn to pool neighboring features for each token before computing attention in a given attention layer.
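A rough sketch of pooling neighboring token features before attention follows; the paper learns the pooling, whereas a fixed average pool stands in here for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PooledAttention(nn.Module):
    """Self-attention whose keys/values come from locally pooled tokens.
    Fixed average pooling stands in for the learned pooling in the paper."""
    def __init__(self, dim: int, heads: int, pool: int = 4):
        super().__init__()
        self.pool = pool
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:    # x: (B, N, D)
        # Pool neighboring features, shrinking the key/value sequence.
        kv = F.avg_pool1d(x.transpose(1, 2), self.pool).transpose(1, 2)
        out, _ = self.attn(x, kv, kv)  # queries stay per-token
        return out

x = torch.randn(2, 64, 128)
print(PooledAttention(128, 8)(x).shape)  # (2, 64, 128)
```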
no code implementations • 28 Jan 2022 • Martin Bertran, Walter Talbott, Nitish Srivastava, Joshua Susskind
Learning generalizable policies from visual input in the presence of visual distractions is a challenging problem in reinforcement learning.
1 code implementation • 2 Dec 2021 • Nitish Srivastava, Walter Talbott, Martin Bertran Lopez, Shuangfei Zhai, Josh Susskind
Modeling the world can benefit robot learning by providing a rich training signal for shaping an agent's latent state space.
no code implementations • 29 Sep 2021 • Shuangfei Zhai, Walter Talbott, Nitish Srivastava, Chen Huang, Hanlin Goh, Ruixiang Zhang, Joshua M. Susskind
We introduce Dot Product Attention Free Transformer (DAFT), an efficient variant of Transformers that eliminates the query-key dot product in self attention.
Ranked #679 on Image Classification on ImageNet
no code implementations • NAACL (ACL) 2022 • Jean-Francois Ton, Walter Talbott, Shuangfei Zhai, Josh Susskind
In particular, we find that the added L2 regularization seems to improve the performance for high-frequency words without deteriorating the performance for low-frequency ones.
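The regularizer in question is simply an L2 penalty on the word-embedding weights. A minimal sketch of how it enters a training loss (the 1e-4 coefficient and toy model are illustrative):

```python
import torch
import torch.nn as nn

embed = nn.Embedding(10_000, 128)  # toy vocabulary
clf = nn.Linear(128, 2)
tokens = torch.randint(0, 10_000, (32, 20))
labels = torch.randint(0, 2, (32,))

logits = clf(embed(tokens).mean(dim=1))  # mean-pooled bag of embeddings
task_loss = nn.functional.cross_entropy(logits, labels)

# L2 penalty on the embedding table; 1e-4 is an illustrative coefficient.
l2 = embed.weight.pow(2).sum()
loss = task_loss + 1e-4 * l2
loss.backward()
```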
11 code implementations • 28 May 2021 • Shuangfei Zhai, Walter Talbott, Nitish Srivastava, Chen Huang, Hanlin Goh, Ruixiang Zhang, Josh Susskind
We introduce Attention Free Transformer (AFT), an efficient variant of Transformers that eliminates the need for dot product self attention.
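A minimal sketch of the AFT-simple variant from the paper, which replaces the query-key dot product with a sigmoid-gated, softmax(K)-weighted sum of values; AFT-full additionally learns pairwise position biases, omitted here:

```python
import torch
import torch.nn as nn

class AFTSimple(nn.Module):
    """AFT-simple: attention without query-key dot products.
    Y_t = sigmoid(Q_t) * sum_t'( softmax(K)_{t'} * V_{t'} ),
    which is linear in sequence length."""
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:    # x: (B, N, D)
        q, k, v = self.q(x), self.k(x), self.v(x)
        weights = torch.softmax(k, dim=1)                   # softmax over sequence
        context = (weights * v).sum(dim=1, keepdim=True)    # one global summary
        return torch.sigmoid(q) * context                   # per-token gating

x = torch.randn(2, 64, 128)
print(AFTSimple(128)(x).shape)  # (2, 64, 128)
```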
no code implementations • 27 Jun 2020 • Miguel Angel Bautista, Walter Talbott, Shuangfei Zhai, Nitish Srivastava, Joshua M. Susskind
State-of-the-art learning-based monocular 3D reconstruction methods learn priors over object categories on the training set, and as a result struggle to achieve reasonable generalization to object categories unseen during training.
no code implementations • 18 Jun 2020 • Shuangfei Zhai, Walter Talbott, Miguel Angel Bautista, Carlos Guestrin, Josh M. Susskind
We introduce Set Distribution Networks (SDNs), a novel framework that learns to autoencode and freely generate sets.
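SDNs' internals go well beyond this snippet, but the permutation invariance any set autoencoder needs can be illustrated with a DeepSets-style encoder. This is a generic background sketch, not the SDN architecture:

```python
import torch
import torch.nn as nn

class SetEncoder(nn.Module):
    """DeepSets-style encoder: per-element MLP, then a symmetric pool,
    so the code is invariant to the ordering of set elements."""
    def __init__(self, in_dim: int, hidden: int, out_dim: int):
        super().__init__()
        self.phi = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden)
        )
        self.rho = nn.Linear(hidden, out_dim)

    def forward(self, s: torch.Tensor) -> torch.Tensor:  # s: (B, n_elems, in_dim)
        return self.rho(self.phi(s).sum(dim=1))          # sum-pool over the set

enc = SetEncoder(3, 64, 32)
s = torch.randn(4, 10, 3)                            # batch of 10-element sets
perm = s[:, torch.randperm(10)]                      # reorder the elements
assert torch.allclose(enc(s), enc(perm), atol=1e-5)  # same code either way
```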
1 code implementation • NeurIPS 2019 • Shuangfei Zhai, Walter Talbott, Carlos Guestrin, Joshua M. Susskind
In contrast to a traditional view where the discriminator learns a constant function when reaching convergence, here we show that it can provide useful information for downstream tasks, e.g., feature extraction for classification.
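A sketch of the downstream use the abstract mentions: freeze a trained discriminator and read out an intermediate layer as features for a linear probe. The architecture and layer choice below are illustrative assumptions, not the paper's exact setup:

```python
import torch
import torch.nn as nn

# Stand-in for a discriminator trained to convergence in a GAN (illustrative).
discriminator = nn.Sequential(
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Flatten(), nn.Linear(128 * 8 * 8, 1),
)

feature_extractor = discriminator[:-1]  # drop the real/fake head
for p in feature_extractor.parameters():
    p.requires_grad = False             # features are frozen

probe = nn.Linear(128 * 8 * 8, 10)      # linear probe for 10 classes
x = torch.randn(16, 3, 32, 32)
logits = probe(feature_extractor(x))
print(logits.shape)                     # (16, 10)
```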
no code implementations • 15 May 2019 • Chen Huang, Shuangfei Zhai, Walter Talbott, Miguel Angel Bautista, Shih-Yu Sun, Carlos Guestrin, Josh Susskind
In most machine learning training paradigms, a fixed, often handcrafted, loss function is assumed to be a good proxy for an underlying evaluation metric.
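A toy sketch of the alternative this line of work pursues, replacing the handcrafted loss with a small learned surrogate; the network here is generic and untrained, for illustration only:

```python
import torch
import torch.nn as nn

# A small network that maps (prediction, target) to a scalar loss value.
# In learned-loss setups it is trained so that minimizing it improves the
# true (often non-differentiable) evaluation metric; here it is untrained.
loss_net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))

pred = torch.randn(16, 1, requires_grad=True)
target = torch.randn(16, 1)
surrogate = loss_net(torch.cat([pred, target], dim=1)).mean()
surrogate.backward()    # gradients flow back to the predictions
print(pred.grad.shape)  # (16, 1)
```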