no code implementations • 14 Feb 2024 • Mehdi Fatemi, Sindhu Gowda
We address causal reasoning in multivariate time series data generated by stochastic processes.
no code implementations • 3 Nov 2023 • Meng Cao, Mehdi Fatemi, Jackie Chi Kit Cheung, Samira Shabanian
While large language models (LLMs) have achieved impressive performance in generating fluent and realistic text, controlling the generated text so that it exhibits properties such as safety, factuality, and non-toxicity remains challenging.
1 code implementation • 27 Feb 2023 • Meng Cao, Mehdi Fatemi, Jackie Chi Kit Cheung, Samira Shabanian
Other methods rely on rule-based or prompt-based token elimination, which are limited as they dismiss future tokens and the overall meaning of the complete discourse.
1 code implementation • 17 Mar 2022 • Mehdi Fatemi, Mary Wu, Jeremy Petch, Walter Nelson, Stuart J. Connolly, Alexander Benz, Anthony Carnicelli, Marzyeh Ghassemi
Finally, we apply our new algorithms to a real-world offline dataset pertaining to warfarin dosing for stroke prevention and demonstrate similar results.
1 code implementation • ICLR 2022 • Mehdi Fatemi, Arash Tavakoli
We present a general convergent class of reinforcement learning algorithms that is founded on two distinct principles: (1) mapping value estimates to a different space using arbitrary functions from a broad class, and (2) linearly decomposing the reward signal into multiple channels.
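As a rough illustration of principle (2) only, the sketch below assumes a tabular Q-learning setting on a small random MDP; none of this comes from the paper itself. The function `run`, the toy MDP, and the two reward channels are hypothetical. Each channel learns its own value table from its own share of the reward, while the bootstrap action is chosen greedily on the sum of the channel tables. Under these assumptions the channel tables add up to the table learned from the full reward. Principle (1), mapping values to a different space, is not shown.

```python
import random


def run(n_states=5, n_actions=2, steps=2000, alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning on a toy MDP, with and without reward decomposition."""
    rng = random.Random(seed)
    n_channels = 2

    # Hypothetical toy MDP: deterministic random transitions and
    # per-channel rewards. The full reward is the sum of the channels.
    nxt = [[rng.randrange(n_states) for _ in range(n_actions)]
           for _ in range(n_states)]
    r_ch = [[[rng.uniform(-1.0, 1.0) for _ in range(n_actions)]
             for _ in range(n_states)] for _ in range(n_channels)]

    q_full = [[0.0] * n_actions for _ in range(n_states)]
    q_ch = [[[0.0] * n_actions for _ in range(n_states)]
            for _ in range(n_channels)]

    s = 0
    for _ in range(steps):
        a = rng.randrange(n_actions)  # uniform random behaviour policy
        s2 = nxt[s][a]

        # Greedy next action with respect to the aggregated channel values.
        agg = [sum(q_ch[j][s2][b] for j in range(n_channels))
               for b in range(n_actions)]
        a_star = max(range(n_actions), key=lambda b: agg[b])

        # Each channel bootstraps from its own table at the shared action.
        for j in range(n_channels):
            target = r_ch[j][s][a] + gamma * q_ch[j][s2][a_star]
            q_ch[j][s][a] += alpha * (target - q_ch[j][s][a])

        # Reference learner: ordinary Q-learning on the summed reward.
        r = sum(r_ch[j][s][a] for j in range(n_channels))
        target = r + gamma * max(q_full[s2])
        q_full[s][a] += alpha * (target - q_full[s][a])

        s = s2

    return q_full, q_ch


if __name__ == "__main__":
    q_full, q_ch = run()
    gap = max(abs(q_full[s][a] - sum(q[s][a] for q in q_ch))
              for s in range(len(q_full)) for a in range(len(q_full[0])))
    print(f"largest gap between summed channels and full Q: {gap:.2e}")
```

The agreement relies on both learners sharing the same learning rate, initial values, and experience stream. It is a property of this linear toy setup and need not carry over once values are mapped through a nonlinear function.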
1 code implementation • NeurIPS 2021 • Mehdi Fatemi, Taylor W. Killian, Jayakumar Subramanian, Marzyeh Ghassemi
Machine learning has successfully framed many sequential decision-making problems as either supervised prediction or optimal decision-making policy identification via reinforcement learning.
1 code implementation • 13 Jul 2021 • Sungryull Sohn, Sungtae Lee, Jongwook Choi, Harm van Seijen, Mehdi Fatemi, Honglak Lee
We propose the k-Shortest-Path (k-SP) constraint: a novel constraint on the agent's trajectory that improves the sample efficiency in sparse-reward MDPs.
1 code implementation • 23 Nov 2020 • Taylor W. Killian, Haoran Zhang, Jayakumar Subramanian, Mehdi Fatemi, Marzyeh Ghassemi
Reinforcement Learning (RL) has recently been applied to sequential estimation and prediction problems identifying and developing hypothetical treatment strategies for septic patients, with a particular focus on offline learning with observational data.
1 code implementation • ICLR 2021 • Arash Tavakoli, Mehdi Fatemi, Petar Kormushev
To test this, we set forth the action hypergraph networks framework -- a class of functions for learning action representations in multi-dimensional discrete action spaces with a structural inductive bias.
2 code implementations • NeurIPS 2019 • Harm van Seijen, Mehdi Fatemi, Arash Tavakoli
In an effort to better understand the different ways in which the discount factor affects the optimization process in reinforcement learning, we designed a set of experiments to study each effect in isolation.
1 code implementation • NeurIPS 2017 • Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche, Tavian Barnes, Jeffrey Tsang
One of the main challenges in reinforcement learning (RL) is generalisation.
no code implementations • ICLR 2018 • Romain Laroche, Mehdi Fatemi, Joshua Romoff, Harm van Seijen
We consider tackling a single-agent RL problem by distributing it to $n$ learners.
no code implementations • 15 Dec 2016 • Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche
In this paper, we propose a framework for solving a single-agent task by using multiple agents, each focusing on different aspects of the task.
no code implementations • WS 2016 • Mehdi Fatemi, Layla El Asri, Hannes Schulz, Jing He, Kaheer Suleman
Indeed, with only a few hundred dialogues collected with a handcrafted policy, the actor-critic deep learner is considerably bootstrapped from a combination of supervised and batch RL.