Search Results for author: Remi Tachet

Found 11 papers, 6 papers with code

Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch

1 code implementation • NeurIPS 2023 • Shangtong Zhang, Remi Tachet, Romain Laroche

In this paper, we establish the global optimality and convergence rate of an off-policy actor critic algorithm in the tabular setting without using density ratio to correct the discrepancy between the state distribution of the behavior policy and that of the target policy.

Policy Gradient Methods

Reinforcement Learning Framework for Deep Brain Stimulation Study

1 code implementation • 22 Feb 2020 • Dmitrii Krylov, Remi Tachet, Romain Laroche, Michael Rosenblum, Dmitry V. Dylov

Malfunctioning neurons in the brain sometimes operate synchronously, reportedly causing many neurological diseases, e.g., Parkinson's.

reinforcement-learning, Reinforcement Learning (RL)

Domain Adaptation with Conditional Distribution Matching and Generalized Label Shift

1 code implementation • NeurIPS 2020 • Remi Tachet, Han Zhao, Yu-Xiang Wang, Geoff Gordon

However, recent work has shown limitations of this approach when label distributions differ between the source and target domains.

Multi-class Classification, Unsupervised Domain Adaptation

Dr Jekyll and Mr Hyde: the Strange Case of Off-Policy Policy Updates

1 code implementation • 29 Sep 2021 • Romain Laroche, Remi Tachet

To implement the principles prescribed by our theory, we propose an agent, Dr Jekyll & Mr Hyde (JH), with a double personality: Dr Jekyll purely exploits while Mr Hyde purely explores.

Learning Invariances for Policy Generalization

1 code implementation • 7 Sep 2018 • Remi Tachet, Philip Bachman, Harm van Seijen

While recent progress has spawned very powerful machine learning systems, those agents remain extremely specialized and fail to transfer the knowledge they gain to similar yet unseen tasks.

BIG-bench Machine Learning, Data Augmentation, +3

On the Learning Dynamics of Deep Neural Networks

no code implementations • 18 Sep 2018 • Remi Tachet, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio

While a lot of progress has been made in recent years, the dynamics of learning in deep nonlinear neural networks remain to this day largely misunderstood.

Binary Classification, General Classification

Increasing Robustness to Spurious Correlations using Forgettable Examples

no code implementations • EACL 2021 • Yadollah Yaghoobzadeh, Soroush Mehri, Remi Tachet, T. J. Hazen, Alessandro Sordoni

Neural NLP models tend to rely on spurious correlations between labels and input features to perform their tasks.

Natural Language Inference, Natural Language Understanding, +2

Estimating savings in parking demand using shared vehicles for home-work commuting

1 code implementation • 13 Oct 2017 • Dániel Kondor, Hongmou Zhang, Remi Tachet, Paolo Santi, Carlo Ratti

The increasing availability and adoption of shared vehicles as an alternative to personally-owned cars presents ample opportunities for achieving more efficient transportation in cities.

Computers and Society, Social and Information Networks

Decomposed Mutual Information Estimation for Contrastive Representation Learning

no code implementations • 25 Jun 2021 • Alessandro Sordoni, Nouha Dziri, Hannes Schulz, Geoff Gordon, Phil Bachman, Remi Tachet

We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views.

Data Augmentation, Dialogue Generation, +2

On the Convergence of SARSA with Linear Function Approximation

no code implementations • 14 Feb 2022 • Shangtong Zhang, Remi Tachet, Romain Laroche

SARSA, a classical on-policy control algorithm for reinforcement learning, is known to chatter when combined with linear function approximation: SARSA does not diverge but oscillates in a bounded region.

Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms

no code implementations • 15 Feb 2022 • Romain Laroche, Remi Tachet

To increase the unlearning speed, we study a novel policy update: the gradient of the cross-entropy loss with respect to the action maximizing $q$, but find that such updates may lead to a decrease in value.
