Search Results for author: Siddarth Venkatraman

Found 8 papers, 4 papers with code

Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training

1 code implementation · 24 Mar 2025 · Brian R. Bartoldson, Siddarth Venkatraman, James Diffenderfer, Moksh Jain, Tal Ben-Nun, Seanie Lee, Minsu Kim, Johan Obando-Ceron, Yoshua Bengio, Bhavya Kailkhura

A training node simultaneously samples data from a replay buffer based on reward or recency to update the policy using Trajectory Balance (TB), a diversity-seeking RL objective introduced for GFlowNets.
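The TB objective penalizes, per trajectory, the squared violation of the balance constraint Z · ∏ P_F = R(x) · ∏ P_B in log space. A minimal sketch, with hypothetical placeholder arguments (learned log-partition `log_Z`, per-step forward/backward policy log-probabilities, and log-reward), not the paper's implementation:

```python
def tb_loss(log_Z, log_pf_steps, log_pb_steps, log_reward):
    """Trajectory Balance loss for a single trajectory.

    Squared log-space violation of the balance constraint
    Z * prod P_F(s_t -> s_{t+1}) = R(x) * prod P_B(s_t <- s_{t+1}).
    All argument names are illustrative placeholders.
    """
    delta = log_Z + sum(log_pf_steps) - log_reward - sum(log_pb_steps)
    return delta ** 2
```

When the constraint holds exactly, the loss is zero; gradients on `log_Z` and the policy parameters jointly drive sampling toward proportionality with the reward.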

Tasks: Diversity, Large Language Model, +3

Solving Bayesian inverse problems with diffusion priors and off-policy RL

no code implementations · 12 Mar 2025 · Luca Scimeca, Siddarth Venkatraman, Moksh Jain, Minsu Kim, Marcin Sendera, Mohsin Hasan, Luke Rowe, Sarthak Mittal, Pablo Lemos, Emmanuel Bengio, Alexandre Adam, Jarrid Rector-Brooks, Yashar Hezaveh, Laurence Perreault-Levasseur, Yoshua Bengio, Glen Berseth, Nikolay Malkin

This paper presents a practical application of Relative Trajectory Balance (RTB), a recently introduced off-policy reinforcement learning (RL) objective that can asymptotically solve Bayesian inverse problems optimally.
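RTB adapts the trajectory-balance idea to fine-tuning: a sampler q is trained so that Z · q(τ) matches the diffusion prior reweighted by a measurement likelihood, p_prior(τ) · r(x). A minimal log-space sketch with illustrative placeholder arguments (this is a reading of the objective's form, not the authors' code):

```python
def rtb_loss(log_Z, log_q_traj, log_prior_traj, log_reward):
    """Relative Trajectory Balance loss for one trajectory.

    Squared log-space violation of Z * q(tau) = p_prior(tau) * r(x),
    i.e. the fine-tuned sampler q should equal the diffusion prior
    reweighted by the likelihood term r(x). Placeholder argument names.
    """
    delta = log_Z + log_q_traj - log_prior_traj - log_reward
    return delta ** 2
```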

Tasks: Reinforcement Learning (RL)

Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models

no code implementations · 10 Feb 2025 · Siddarth Venkatraman, Mohsin Hasan, Minsu Kim, Luca Scimeca, Marcin Sendera, Yoshua Bengio, Glen Berseth, Nikolay Malkin

We propose to amortize the cost of sampling from such posterior distributions with diffusion models that sample a distribution in the noise space ($\mathbf{z}$).
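The target here is a posterior over the noise space, p(z | y) ∝ p(z) · r(g(z)), where g is a frozen generator. The paper amortizes this with a learned diffusion sampler; as a crude stand-in, the sketch below draws prior z's and self-normalizes importance weights by the reward of the decoded sample. `decoder` and `log_reward` are hypothetical stand-ins, not the paper's API:

```python
import math
import random

def noise_space_posterior_samples(decoder, log_reward, n=1000, dim=2, seed=0):
    """Self-normalized importance sampling toward p(z|y) ∝ p(z) r(g(z)).

    Draws z ~ N(0, I), decodes x = decoder(z), and weights each sample
    by exp(log_reward(x)), normalized for stability via the max trick.
    Illustrative only: the paper trains an amortized diffusion sampler
    in z-space rather than reweighting prior samples.
    """
    rng = random.Random(seed)
    zs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    logw = [log_reward(decoder(z)) for z in zs]
    m = max(logw)
    w = [math.exp(lw - m) for lw in logw]
    total = sum(w)
    probs = [wi / total for wi in w]
    return zs, probs
```

Resampling `zs` according to `probs` then yields approximate posterior samples in noise space, which the frozen generator maps back to data space.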

Tasks: Conditional Image Generation

Amortizing intractable inference in diffusion models for vision, language, and control

1 code implementation · 31 May 2024 · Siddarth Venkatraman, Moksh Jain, Luca Scimeca, Minsu Kim, Marcin Sendera, Mohsin Hasan, Luke Rowe, Sarthak Mittal, Pablo Lemos, Emmanuel Bengio, Alexandre Adam, Jarrid Rector-Brooks, Yoshua Bengio, Glen Berseth, Nikolay Malkin

Diffusion models have emerged as effective distribution estimators in vision, language, and reinforcement learning, but their use as priors in downstream tasks poses an intractable posterior inference problem.

Tasks: Continuous Control, +3

Reasoning with Latent Diffusion in Offline Reinforcement Learning

1 code implementation · 12 Sep 2023 · Siddarth Venkatraman, Shivesh Khaitan, Ravi Tej Akella, John Dolan, Jeff Schneider, Glen Berseth

However, a key challenge in offline RL lies in effectively stitching portions of suboptimal trajectories from the static dataset while avoiding extrapolation errors arising due to a lack of support in the dataset.

Tasks: D4RL, Offline RL, +4

MLNav: Learning to Safely Navigate on Martian Terrains

no code implementations · 9 Mar 2022 · Shreyansh Daftry, Neil Abcouwer, Tyler del Sesto, Siddarth Venkatraman, Jialin Song, Lucas Igel, Amos Byon, Ugo Rosolia, Yisong Yue, Masahiro Ono

We present MLNav, a learning-enhanced path planning framework for safety-critical and resource-limited systems operating in complex environments, such as rovers navigating on Mars.

Tasks: Navigate

Machine Learning Based Path Planning for Improved Rover Navigation (Pre-Print Version)

no code implementations · 11 Nov 2020 · Neil Abcouwer, Shreyansh Daftry, Siddarth Venkatraman, Tyler del Sesto, Olivier Toupet, Ravi Lanka, Jialin Song, Yisong Yue, Masahiro Ono

Enhanced AutoNav (ENav), the baseline surface navigation software for NASA's Perseverance rover, sorts a list of candidate paths for the rover to traverse, then uses the Approximate Clearance Evaluation (ACE) algorithm to evaluate whether the most highly ranked paths are safe.

Tasks: BIG-bench Machine Learning

Deep Residual Neural Networks for Image in Speech Steganography

1 code implementation · 30 Mar 2020 · Shivam Agarwal, Siddarth Venkatraman

We propose a deep learning based technique to hide a source RGB image message inside finite length speech segments without perceptual loss.

Tasks: Multimedia, Sound, Audio and Speech Processing
