no code implementations • 18 Oct 2024 • Zhepeng Cen, Yao Liu, Siliang Zeng, Pratik Chaudhari, Huzefa Rangwala, George Karypis, Rasool Fakoor
Our first approach is Batch-Scheduled Sampling, where, during training, we stochastically choose between the ground-truth token from the dataset and the model's own generated token as input to predict the next token.
no code implementations • 17 Oct 2024 • Ke Yang, Yao Liu, Sapana Chaudhary, Rasool Fakoor, Pratik Chaudhari, George Karypis, Huzefa Rangwala
On the other hand, there has been limited study on the misalignment between a web agent's observation/action representation and the pre-training data of the LLM it's based on.
no code implementations • 7 Oct 2024 • Wei Tang, Yiheng Duan, Yaroslav Kharkov, Rasool Fakoor, Eric Kessler, Yunong Shi
Quantum computers have the potential to outperform classical computers in important tasks such as optimization and number factoring.
no code implementations • 25 Jun 2024 • Jesse Zhang, Minho Heo, Zuxin Liu, Erdem Biyik, Joseph J Lim, Yao Liu, Rasool Fakoor
Prior work in skill-based RL either requires expert supervision to define useful skills, which is hard to scale, or learns a skill-space from offline data with heuristics that limit the adaptability of the skills, making them difficult to transfer during downstream RL.
no code implementations • 3 Jun 2024 • Kavosh Asadi, Yao Liu, Shoham Sabach, Ming Yin, Rasool Fakoor
We focus on the task of learning the value function in the reinforcement learning (RL) setting.
no code implementations • 9 Oct 2023 • Zuxin Liu, Jesse Zhang, Kavosh Asadi, Yao Liu, Ding Zhao, Shoham Sabach, Rasool Fakoor
Inspired by recent advancements in parameter-efficient fine-tuning in language domains, we explore efficient fine-tuning techniques -- e. g., Bottleneck Adapters, P-Tuning, and Low-Rank Adaptation (LoRA) -- in TAIL to adapt large pretrained models for new tasks with limited demonstration data.
no code implementations • NeurIPS 2023 • Yao Liu, Pratik Chaudhari, Rasool Fakoor
The main challenge of offline reinforcement learning, where data is limited, arises from a sequence of counterfactual reasoning dilemmas within the realm of potential actions: What if we were to choose a different course of action?
no code implementations • 4 Oct 2022 • Rasool Fakoor, Jonas Mueller, Zachary C. Lipton, Pratik Chaudhari, Alexander J. Smola
Real-world deployment of machine learning models is challenging because data evolves over time.
2 code implementations • 28 May 2022 • Massimo Caccia, Jonas Mueller, Taesup Kim, Laurent Charlin, Rasool Fakoor
We pose two hypotheses: (1) task-agnostic methods might provide advantages in settings with limited data, computation, or high dimensionality, and (2) faster adaptation may be particularly beneficial in continual learning settings, helping to mitigate the effects of catastrophic forgetting.
1 code implementation • 10 Dec 2021 • Kavosh Asadi, Rasool Fakoor, Omer Gottesman, Taesup Kim, Michael L. Littman, Alexander J. Smola
In this paper we endow two popular deep reinforcement learning algorithms, namely DQN and Rainbow, with updates that incentivize the online network to remain in the proximity of the target network.
no code implementations • ICLR 2022 • Jiarui Jin, Sijin Zhou, Weinan Zhang, Tong He, Yong Yu, Rasool Fakoor
Goal-oriented Reinforcement Learning (GoRL) is a promising approach for scaling up RL techniques on sparse reward environments requiring long horizon planning.
1 code implementation • 26 Feb 2021 • Rasool Fakoor, Taesup Kim, Jonas Mueller, Alexander J. Smola, Ryan J. Tibshirani
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions, or to model a diverse population without being overly reductive.
1 code implementation • NeurIPS 2021 • Rasool Fakoor, Jonas Mueller, Kavosh Asadi, Pratik Chaudhari, Alexander J. Smola
Reliant on too many experiments to learn good actions, current Reinforcement Learning (RL) algorithms have limited applicability in real-world settings, which can be too expensive to allow exploration.
no code implementations • 1 Jan 2021 • Jiarui Jin, Cong Chen, Ming Zhou, Weinan Zhang, Rasool Fakoor, David Wipf, Yong Yu, Jun Wang, Alex Smola
Goal-oriented reinforcement learning algorithms are often good at exploration, not exploitation, while episodic algorithms excel at exploitation, not exploration.
no code implementations • 1 Jan 2021 • Jiarui Jin, Sijin Zhou, Weinan Zhang, Rasool Fakoor, David Wipf, Tong He, Yong Yu, Zheng Zhang, Alex Smola
In reinforcement learning, a map with states and transitions built based on historical trajectories is often helpful in exploration and exploitation.
no code implementations • 1 Jan 2021 • Rasool Fakoor, Pratik Anil Chaudhari, Jonas Mueller, Alex Smola
We present TraDE, a self-attention-based architecture for auto-regressive density estimation with continuous and discrete valued data.
no code implementations • 26 Jun 2020 • Rasool Fakoor, Pratik Chaudhari, Alexander J. Smola
This paper prescribes a suite of techniques for off-policy Reinforcement Learning (RL) that simplify the training process and reduce the sample complexity.
1 code implementation • NeurIPS 2020 • Rasool Fakoor, Jonas Mueller, Nick Erickson, Pratik Chaudhari, Alexander J. Smola
Automated machine learning (AutoML) can produce complex model ensembles by stacking, bagging, and boosting many individual models like trees, deep networks, and nearest neighbor estimators.
no code implementations • 6 Apr 2020 • Rasool Fakoor, Pratik Chaudhari, Jonas Mueller, Alexander J. Smola
We present TraDE, a self-attention-based architecture for auto-regressive density estimation with continuous and discrete valued data.
2 code implementations • ICLR 2020 • Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola
This paper introduces Meta-Q-Learning (MQL), a new off-policy algorithm for meta-Reinforcement Learning (meta-RL).
1 code implementation • 5 May 2019 • Rasool Fakoor, Pratik Chaudhari, Alexander J. Smola
Extensive experiments on the Atari-2600 and MuJoCo benchmark suites show that this simple technique is effective in reducing the sample complexity of state-of-the-art algorithms.
no code implementations • 30 Oct 2018 • Thomas Powers, Rasool Fakoor, Siamak Shakeri, Abhinav Sethy, Amanjit Kainth, Abdel-rahman Mohamed, Ruhi Sarikaya
Optimal selection of a subset of items from a given set is a hard problem that requires combinatorial optimization.
no code implementations • 16 Feb 2018 • Rasool Fakoor, Xiaodong He, Ivan Tashev, Shuayb Zarar
For a speech-enhancement algorithm, it is highly desirable to simultaneously improve perceptual quality and recognition rate.
no code implementations • 29 Nov 2017 • Rasool Fakoor, Xiaodong He, Ivan Tashev, Shuayb Zarar
Today, the optimal performance of existing noise-suppression algorithms, both data-driven and those based on classic statistical methods, is range bound to specific levels of instantaneous input signal-to-noise ratios.
1 code implementation • 7 Nov 2016 • Rasool Fakoor, Abdel-rahman Mohamed, Margaret Mitchell, Sing Bing Kang, Pushmeet Kohli
We present a method to improve video description generation by modeling higher-order interactions between video frames and described concepts.