Search Results for author: Rasool Fakoor

Found 25 papers, 8 papers with code

Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens

no code implementations · 18 Oct 2024 · Zhepeng Cen, Yao Liu, Siliang Zeng, Pratik Chaudhari, Huzefa Rangwala, George Karypis, Rasool Fakoor

Our first approach is Batch-Scheduled Sampling, where, during training, we stochastically choose between the ground-truth token from the dataset and the model's own generated token as input to predict the next token.

Math Question Answering
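
The mixing step described in the snippet above is the core of scheduled sampling. Below is a minimal sketch of that idea; the greedy decoding, the fixed mixing probability `p_model`, and the tensor shapes are illustrative assumptions, not the paper's exact recipe.

```python
# Hedged sketch of the scheduled-sampling mixing step: at each position,
# feed either the ground-truth token or the model's own prediction.
import torch

def mix_inputs(ground_truth, model_logits, p_model=0.25):
    """ground_truth: (batch, seq_len) token ids; model_logits:
    (batch, seq_len, vocab). With probability p_model per position,
    substitute the model's greedy prediction for the dataset token."""
    model_tokens = model_logits.argmax(dim=-1)
    use_model = torch.rand(ground_truth.shape) < p_model
    return torch.where(use_model, model_tokens, ground_truth)

gt = torch.randint(0, 100, (2, 8))     # toy token ids
logits = torch.randn(2, 8, 100)        # toy model outputs
mixed = mix_inputs(gt, logits)         # inputs for the next-token loss
```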

AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents

no code implementations · 17 Oct 2024 · Ke Yang, Yao Liu, Sapana Chaudhary, Rasool Fakoor, Pratik Chaudhari, George Karypis, Huzefa Rangwala

On the other hand, there has been limited study of the misalignment between a web agent's observation/action representation and the pre-training data of the LLM it is based on.

AlphaRouter: Quantum Circuit Routing with Reinforcement Learning and Tree Search

no code implementations · 7 Oct 2024 · Wei Tang, Yiheng Duan, Yaroslav Kharkov, Rasool Fakoor, Eric Kessler, Yunong Shi

Quantum computers have the potential to outperform classical computers in important tasks such as optimization and number factoring.

reinforcement-learning, Reinforcement Learning (+1 more)

EXTRACT: Efficient Policy Learning by Extracting Transferable Robot Skills from Offline Data

no code implementations · 25 Jun 2024 · Jesse Zhang, Minho Heo, Zuxin Liu, Erdem Biyik, Joseph J Lim, Yao Liu, Rasool Fakoor

Prior work in skill-based RL either requires expert supervision to define useful skills, which is hard to scale, or learns a skill-space from offline data with heuristics that limit the adaptability of the skills, making them difficult to transfer during downstream RL.

Reinforcement Learning (RL), Robot Manipulation

Learning the Target Network in Function Space

no code implementations · 3 Jun 2024 · Kavosh Asadi, Yao Liu, Shoham Sabach, Ming Yin, Rasool Fakoor

We focus on the task of learning the value function in the reinforcement learning (RL) setting.

Reinforcement Learning (RL)

TAIL: Task-specific Adapters for Imitation Learning with Large Pretrained Models

no code implementations · 9 Oct 2023 · Zuxin Liu, Jesse Zhang, Kavosh Asadi, Yao Liu, Ding Zhao, Shoham Sabach, Rasool Fakoor

Inspired by recent advancements in parameter-efficient fine-tuning in language domains, we explore efficient fine-tuning techniques -- e.g., Bottleneck Adapters, P-Tuning, and Low-Rank Adaptation (LoRA) -- in TAIL to adapt large pretrained models for new tasks with limited demonstration data.

Continual Learning, Imitation Learning (+1 more)
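
Of the adapter techniques named in the snippet, LoRA is the simplest to sketch: freeze the pretrained weight and learn a low-rank additive update. The rank, scaling, and initialization below are common defaults, not necessarily TAIL's settings.

```python
# Minimal LoRA sketch: freeze a pretrained linear layer and learn a
# low-rank update scaled by alpha/r. Rank and alpha are illustrative.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # pretrained weights stay frozen
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # no-op at init
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```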

Budgeting Counterfactual for Offline RL

no code implementations · NeurIPS 2023 · Yao Liu, Pratik Chaudhari, Rasool Fakoor

The main challenge of offline reinforcement learning, where data is limited, arises from a sequence of counterfactual reasoning dilemmas within the realm of potential actions: What if we were to choose a different course of action?

counterfactual, Counterfactual Reasoning (+2 more)

Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges

2 code implementations · 28 May 2022 · Massimo Caccia, Jonas Mueller, Taesup Kim, Laurent Charlin, Rasool Fakoor

We pose two hypotheses: (1) task-agnostic methods might provide advantages in settings with limited data, computation, or high dimensionality, and (2) faster adaptation may be particularly beneficial in continual learning settings, helping to mitigate the effects of catastrophic forgetting.

Continual Learning, Continuous Control (+4 more)

Faster Deep Reinforcement Learning with Slower Online Network

1 code implementation · 10 Dec 2021 · Kavosh Asadi, Rasool Fakoor, Omer Gottesman, Taesup Kim, Michael L. Littman, Alexander J. Smola

In this paper we endow two popular deep reinforcement learning algorithms, namely DQN and Rainbow, with updates that incentivize the online network to remain in the proximity of the target network.

Deep Reinforcement Learning, reinforcement-learning (+1 more)
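
One way to read "remain in the proximity of the target network" is as a proximal penalty added to the usual TD loss. The sketch below shows that generic construction; the squared penalty and its coefficient `c` are assumptions, not the paper's exact update.

```python
# Sketch: standard DQN TD loss plus a proximal term that discourages the
# online network from drifting far from the target network's weights.
import torch

def proximal_dqn_loss(online, target, batch, gamma=0.99, c=0.1):
    s, a, r, s_next, done = batch
    q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target(s_next).max(dim=1).values
        td_target = r + gamma * (1 - done) * q_next
    td_loss = torch.nn.functional.mse_loss(q, td_target)
    prox = sum((po - pt).pow(2).sum()
               for po, pt in zip(online.parameters(), target.parameters()))
    return td_loss + c * prox   # `c` trades off TD error vs. proximity
```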

Graph-Enhanced Exploration for Goal-oriented Reinforcement Learning

no code implementations · ICLR 2022 · Jiarui Jin, Sijin Zhou, Weinan Zhang, Tong He, Yong Yu, Rasool Fakoor

Goal-oriented Reinforcement Learning (GoRL) is a promising approach for scaling up RL techniques on sparse-reward environments requiring long-horizon planning.

continuous-control, Continuous Control (+4 more)

Flexible Model Aggregation for Quantile Regression

1 code implementation · 26 Feb 2021 · Rasool Fakoor, Taesup Kim, Jonas Mueller, Alexander J. Smola, Ryan J. Tibshirani

Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions, or to model a diverse population without being overly reductive.

Econometrics, model (+2 more)
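
As background for the aggregation setting, a quantile regressor is typically trained with the pinball (quantile) loss; here is the standard definition, independent of this paper's aggregation scheme.

```python
# Standard pinball (quantile) loss: its expectation is minimized when
# `pred` equals the tau-quantile of y. Textbook definition, not paper-specific.
import torch

def pinball_loss(pred, y, tau=0.9):
    diff = y - pred
    return torch.mean(torch.maximum(tau * diff, (tau - 1.0) * diff))

y = torch.randn(1000)
print(pinball_loss(torch.quantile(y, 0.9), y))  # small at the true quantile
```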

Continuous Doubly Constrained Batch Reinforcement Learning

1 code implementation · NeurIPS 2021 · Rasool Fakoor, Jonas Mueller, Kavosh Asadi, Pratik Chaudhari, Alexander J. Smola

Reliant on too many experiments to learn good actions, current Reinforcement Learning (RL) algorithms have limited applicability in real-world settings, which can be too expensive to allow exploration.

reinforcement-learning, Reinforcement Learning (+1 more)

Regioned Episodic Reinforcement Learning

no code implementations · 1 Jan 2021 · Jiarui Jin, Cong Chen, Ming Zhou, Weinan Zhang, Rasool Fakoor, David Wipf, Yong Yu, Jun Wang, Alex Smola

Goal-oriented reinforcement learning algorithms are often good at exploration, not exploitation, while episodic algorithms excel at exploitation, not exploration.

reinforcement-learning, Reinforcement Learning (+1 more)

Explore with Dynamic Map: Graph Structured Reinforcement Learning

no code implementations · 1 Jan 2021 · Jiarui Jin, Sijin Zhou, Weinan Zhang, Rasool Fakoor, David Wipf, Tong He, Yong Yu, Zheng Zhang, Alex Smola

In reinforcement learning, a map with states and transitions built based on historical trajectories is often helpful in exploration and exploitation.

reinforcement-learning, Reinforcement Learning (+1 more)
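
The "map with states and transitions built based on historical trajectories" can be pictured as a simple directed graph over visited states with transition counts; a minimal sketch in plain Python, with all representation details assumed:

```python
# Sketch: build a directed "map" of states and transitions from
# historical trajectories. The representation is an assumption.
from collections import defaultdict

def build_transition_graph(trajectories):
    """trajectories: iterable of state sequences, e.g. [[s0, s1, ...], ...].
    Returns adjacency counts: graph[s][s_next] = number of observed moves."""
    graph = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:
        for s, s_next in zip(traj, traj[1:]):
            graph[s][s_next] += 1
    return graph

g = build_transition_graph([["a", "b", "c"], ["a", "c"]])
print(dict(g["a"]))  # {'b': 1, 'c': 1}
```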

TraDE: A Simple Self-Attention-Based Density Estimator

no code implementations · 1 Jan 2021 · Rasool Fakoor, Pratik Anil Chaudhari, Jonas Mueller, Alex Smola

We present TraDE, a self-attention-based architecture for auto-regressive density estimation with continuous and discrete valued data.

Density Estimation, Out-of-Distribution Detection
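
Auto-regressive density estimation factorizes log p(x) = Σ_i log p(x_i | x_<i); with a causally masked self-attention model, training minimizes the summed conditional negative log-likelihood. The sketch below assumes Gaussian conditionals and a model returning (mean, log-std), which may differ from TraDE's actual output heads.

```python
# Sketch of the auto-regressive NLL objective. The Gaussian conditionals
# and the model interface are assumptions, not TraDE's parameterization.
import torch

def autoregressive_nll(model, x):
    """x: (batch, dim). model(x) -> (mu, log_sigma), each (batch, dim),
    where position i may only attend to coordinates < i (causal mask)."""
    mu, log_sigma = model(x)
    dist = torch.distributions.Normal(mu, log_sigma.exp())
    return -dist.log_prob(x).sum(dim=-1).mean()
```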

DDPG++: Striving for Simplicity in Continuous-control Off-Policy Reinforcement Learning

no code implementations · 26 Jun 2020 · Rasool Fakoor, Pratik Chaudhari, Alexander J. Smola

This paper prescribes a suite of techniques for off-policy Reinforcement Learning (RL) that simplify the training process and reduce the sample complexity.

continuous-control, Continuous Control (+3 more)

Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation

1 code implementation · NeurIPS 2020 · Rasool Fakoor, Jonas Mueller, Nick Erickson, Pratik Chaudhari, Alexander J. Smola

Automated machine learning (AutoML) can produce complex model ensembles by stacking, bagging, and boosting many individual models like trees, deep networks, and nearest neighbor estimators.

AutoML, Data Augmentation
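
The "augmented distillation" in the title suggests fitting a simple student to the ensemble's predictions on additional augmented samples. A generic sketch of one such training step follows; the Gaussian-noise augmentation and MSE loss are placeholders, not the paper's scheme.

```python
# Generic knowledge-distillation sketch: fit a simple student model to a
# large ensemble's predictions on augmented inputs. Gaussian noise stands
# in for the paper's actual tabular augmentation (an assumption).
import torch

def distill_step(student, teacher_predict, x, optimizer, noise=0.05):
    x_aug = x + noise * torch.randn_like(x)    # placeholder augmentation
    with torch.no_grad():
        soft_targets = teacher_predict(x_aug)  # ensemble predictions
    loss = torch.nn.functional.mse_loss(student(x_aug), soft_targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```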

TraDE: Transformers for Density Estimation

no code implementations · 6 Apr 2020 · Rasool Fakoor, Pratik Chaudhari, Jonas Mueller, Alexander J. Smola

We present TraDE, a self-attention-based architecture for auto-regressive density estimation with continuous and discrete valued data.

Density Estimation, Out-of-Distribution Detection

Meta-Q-Learning

2 code implementations · ICLR 2020 · Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola

This paper introduces Meta-Q-Learning (MQL), a new off-policy algorithm for meta-Reinforcement Learning (meta-RL).

continuous-control, Continuous Control (+3 more)

P3O: Policy-on Policy-off Policy Optimization

1 code implementation · 5 May 2019 · Rasool Fakoor, Pratik Chaudhari, Alexander J. Smola

Extensive experiments on the Atari-2600 and MuJoCo benchmark suites show that this simple technique is effective in reducing the sample complexity of state-of-the-art algorithms.

MuJoCo, Reinforcement Learning (+1 more)

Differentiable Greedy Networks

no code implementations · 30 Oct 2018 · Thomas Powers, Rasool Fakoor, Siamak Shakeri, Abhinav Sethy, Amanjit Kainth, Abdel-rahman Mohamed, Ruhi Sarikaya

Optimal selection of a subset of items from a given set is a hard problem that requires combinatorial optimization.

Claim Verification, Combinatorial Optimization (+1 more)
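
A common way to make greedy subset selection differentiable, which the title hints at, is to replace each hard argmax with a temperature-scaled softmax. The sketch below shows only that generic relaxation; it is not claimed to match the paper's network.

```python
# Generic differentiable relaxation of greedy subset selection: each of
# k steps replaces argmax over remaining items with a softmax, and a soft
# mask discourages re-picking. Illustrative only, not the paper's model.
import torch

def soft_greedy_select(scores, k, temperature=0.5):
    """scores: (n,) item scores. Returns (k, n) soft selection weights."""
    mask = torch.zeros_like(scores)   # accumulates soft "already picked" mass
    picks = []
    for _ in range(k):
        logits = (scores - 1e9 * mask) / temperature
        w = torch.softmax(logits, dim=0)
        picks.append(w)
        mask = mask + w
    return torch.stack(picks)
```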

Reinforcement Learning To Adapt Speech Enhancement to Instantaneous Input Signal Quality

no code implementations · 29 Nov 2017 · Rasool Fakoor, Xiaodong He, Ivan Tashev, Shuayb Zarar

Today, the optimal performance of existing noise-suppression algorithms, both data-driven and those based on classic statistical methods, is range bound to specific levels of instantaneous input signal-to-noise ratios.

reinforcement-learning, Reinforcement Learning (+2 more)

Memory-augmented Attention Modelling for Videos

1 code implementation · 7 Nov 2016 · Rasool Fakoor, Abdel-rahman Mohamed, Margaret Mitchell, Sing Bing Kang, Pushmeet Kohli

We present a method to improve video description generation by modeling higher-order interactions between video frames and described concepts.

Video Description
