Search Results for author: Walter Talbott

Found 15 papers, 5 papers with code

Value function estimation using conditional diffusion models for control

no code implementations9 Jun 2023 Bogdan Mazoure, Walter Talbott, Miguel Angel Bautista, Devon Hjelm, Alexander Toshev, Josh Susskind

A fairly reliable trend in deep reinforcement learning is that performance scales with the number of parameters, provided a complementary scaling in the amount of training data.

Continuous Control

GAUDI: A Neural Architect for Immersive 3D Scene Generation

1 code implementation27 Jul 2022 Miguel Angel Bautista, Pengsheng Guo, Samira Abnar, Walter Talbott, Alexander Toshev, Zhuoyuan Chen, Laurent Dinh, Shuangfei Zhai, Hanlin Goh, Daniel Ulbricht, Afshin Dehghan, Josh Susskind

We introduce GAUDI, a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be rendered immersively from a moving camera.

Image Generation Scene Generation

Position Prediction as an Effective Pretraining Strategy

1 code implementation15 Jul 2022 Shuangfei Zhai, Navdeep Jaitly, Jason Ramapuram, Dan Busbridge, Tatiana Likhomanenko, Joseph Yitan Cheng, Walter Talbott, Chen Huang, Hanlin Goh, Joshua Susskind

This pretraining strategy, which has been used in BERT models in NLP, Wav2Vec models in speech, and, recently, in MAE models in vision, forces the model to learn about relationships between the content in different parts of the input using autoencoding-related objectives.

Position speech-recognition +1
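The position-prediction objective can be sketched as a classification problem: patches are fed without positional information in shuffled order, and the model is trained to recover each patch's original index. A minimal numpy sketch of that objective (the linear head and dimensions here are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
num_patches, dim = 16, 8
patches = rng.normal(size=(num_patches, dim))

# Shuffle the patch order; the pretraining target for each input patch
# is its original position index.
perm = rng.permutation(num_patches)
inputs, targets = patches[perm], perm

# A hypothetical position classifier maps each patch to logits over the
# num_patches possible positions; training minimizes cross-entropy.
head = rng.normal(size=(dim, num_patches))            # untrained linear head
logits = inputs @ head
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -log_probs[np.arange(num_patches), targets].mean()
```

Because the inputs carry no positional embeddings, the only way to drive this loss down is to infer position from patch content and its relation to the other patches.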

Efficient Representation Learning via Adaptive Context Pooling

no code implementations5 Jul 2022 Chen Huang, Walter Talbott, Navdeep Jaitly, Josh Susskind

Inspired by the success of ConvNets that are combined with pooling to capture long-range dependencies, we learn to pool neighboring features for each token before computing attention in a given attention layer.

Representation Learning
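The pooling-before-attention idea can be illustrated with a toy version: aggregate each token's neighborhood, then attend over the pooled features. This sketch assumes a fixed mean-pooling window, whereas the paper learns an adaptive, per-token pooling; it is an illustration, not the paper's method:

```python
import numpy as np

def local_pool(x, window=3):
    # Fixed-window mean pooling over neighboring tokens. (The paper learns
    # the pooling adaptively per token; a fixed window is an assumption.)
    T, _ = x.shape
    pad = window // 2
    xp = np.pad(x, ((pad, pad), (0, 0)), mode="edge")
    return np.stack([xp[t:t + window].mean(axis=0) for t in range(T)])

def pooled_attention(x, window=3):
    # Standard softmax attention, but keys/values come from locally pooled
    # features, so each token attends to neighborhood summaries.
    k = v = local_pool(x, window)
    scores = x @ k.T / np.sqrt(x.shape[1])
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights = scores / scores.sum(axis=1, keepdims=True)
    return weights @ v
```

Pooling shrinks the effective granularity of the key/value set, which is how this family of methods captures longer-range structure at lower cost.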

Efficient Embedding of Semantic Similarity in Control Policies via Entangled Bisimulation

no code implementations28 Jan 2022 Martin Bertran, Walter Talbott, Nitish Srivastava, Joshua Susskind

Learning generalizable policies from visual input in the presence of visual distractions is a challenging problem in reinforcement learning.

Data Augmentation Reinforcement Learning (RL) +2

Robust Robotic Control from Pixels using Contrastive Recurrent State-Space Models

1 code implementation2 Dec 2021 Nitish Srivastava, Walter Talbott, Martin Bertran Lopez, Shuangfei Zhai, Josh Susskind

Modeling the world can benefit robot learning by providing a rich training signal for shaping an agent's latent state space.
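A contrastive world-model objective of this kind typically scores each latent state against a batch of observation embeddings and asks the matching pair to win. The sketch below is a generic InfoNCE loss under that assumption; the paper's exact formulation may differ:

```python
import numpy as np

def info_nce(z, e, temperature=0.1):
    # Generic contrastive objective: each latent state z[i] should score
    # highest against its own observation embedding e[i] within the batch.
    # (Assumed form; not necessarily the paper's exact loss.)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    e = e / np.linalg.norm(e, axis=1, keepdims=True)
    logits = (z @ e.T) / temperature
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Replacing pixel reconstruction with a contrastive score like this makes the latent state insensitive to visual detail that does not help discriminate observations.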

A Dot Product Attention Free Transformer

no code implementations29 Sep 2021 Shuangfei Zhai, Walter Talbott, Nitish Srivastava, Chen Huang, Hanlin Goh, Ruixiang Zhang, Joshua M. Susskind

We introduce Dot Product Attention Free Transformer (DAFT), an efficient variant of Transformers that eliminates the query-key dot product in self-attention.

Image Classification Language Modelling

Regularized Training of Nearest Neighbor Language Models

no code implementations NAACL (ACL) 2022 Jean-Francois Ton, Walter Talbott, Shuangfei Zhai, Josh Susskind

In particular, we find that the added L2 regularization seems to improve the performance for high-frequency words without deteriorating the performance for low frequency ones.

L2 Regularization Language Modelling
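The regularization described in the snippet can be sketched as an L2 penalty added to the language model's training loss; the exact target of the penalty (here, the context representations used as nearest-neighbor keys) is an assumption for illustration:

```python
import numpy as np

def l2_regularized_loss(ce_loss, hidden, lam=1e-3):
    # Cross-entropy plus an L2 penalty on the hidden representations.
    # (Which activations are penalized, and the weight lam, are assumptions.)
    return ce_loss + lam * np.mean(np.sum(hidden ** 2, axis=-1))
```

For example, `l2_regularized_loss(1.0, np.ones((2, 4)), lam=0.5)` adds `0.5 * 4 = 2.0` of penalty to the cross-entropy term.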

An Attention Free Transformer

6 code implementations28 May 2021 Shuangfei Zhai, Walter Talbott, Nitish Srivastava, Chen Huang, Hanlin Goh, Ruixiang Zhang, Josh Susskind

We introduce Attention Free Transformer (AFT), an efficient variant of Transformers that eliminates the need for dot product self attention.

Position
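The simplest variant described in the AFT paper (AFT-simple) replaces the query-key dot product with a key-weighted global average of the values, gated elementwise by the query. A minimal numpy sketch for a single sequence:

```python
import numpy as np

def aft_simple(Q, K, V):
    # AFT-simple: no query-key dot product. Keys are softmax-normalized
    # over the sequence dimension and used to form one global weighted
    # average of the values, gated elementwise by sigmoid(Q).
    # Q, K, V: arrays of shape (T, d) for a single sequence.
    weights = np.exp(K - K.max(axis=0, keepdims=True))
    weights = weights / weights.sum(axis=0, keepdims=True)   # softmax over T
    context = (weights * V).sum(axis=0, keepdims=True)       # (1, d)
    return 1.0 / (1.0 + np.exp(-Q)) * context                # (T, d)
```

Because the context is a single (1, d) vector shared by every position, the cost is linear in sequence length, in contrast to the quadratic cost of dot-product attention.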

On the generalization of learning-based 3D reconstruction

no code implementations27 Jun 2020 Miguel Angel Bautista, Walter Talbott, Shuangfei Zhai, Nitish Srivastava, Joshua M. Susskind

State-of-the-art learning-based monocular 3D reconstruction methods learn priors over object categories on the training set, and as a result struggle to achieve reasonable generalization to object categories unseen during training.

3D Reconstruction Position

Adversarial Fisher Vectors for Unsupervised Representation Learning

1 code implementation NeurIPS 2019 Shuangfei Zhai, Walter Talbott, Carlos Guestrin, Joshua M. Susskind

In contrast to a traditional view where the discriminator learns a constant function when reaching convergence, here we show that it can provide useful information for downstream tasks, e.g., feature extraction for classification.

General Classification Representation Learning

Addressing the Loss-Metric Mismatch with Adaptive Loss Alignment

no code implementations15 May 2019 Chen Huang, Shuangfei Zhai, Walter Talbott, Miguel Angel Bautista, Shih-Yu Sun, Carlos Guestrin, Josh Susskind

In most machine learning training paradigms a fixed, often handcrafted, loss function is assumed to be a good proxy for an underlying evaluation metric.

General Classification Meta-Learning +2
