no code implementations • 15 Oct 2021 • Yasmeen Hitti, Ionelia Buzatu, Manuel Del Verme, Mark Lefsrud, Florian Golemo, Audrey Durand
We argue that plant responses to an environmental stimulus are a good example of a real-world problem that can be approached within a reinforcement learning (RL) framework.
no code implementations • ICML Workshop AutoML 2021 • Maxime Heuillet, Benoit Debaque, Audrey Durand
The goal of Automated Machine Learning (AutoML) is to make Machine Learning (ML) tools more accessible.
no code implementations • 22 Mar 2021 • Joseph Jay Williams, Jacob Nogas, Nina Deliu, Hammad Shaikh, Sofia S. Villar, Audrey Durand, Anna Rafferty
We therefore use our case study of the ubiquitous two-arm binary reward setting to empirically investigate the impact of using Thompson Sampling instead of uniform random assignment.
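As a rough illustration of the setting studied here, below is a minimal sketch of Thompson Sampling in a two-arm binary (Bernoulli) reward problem, compared against uniform random assignment; the arm success probabilities and horizon are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = [0.5, 0.65]   # illustrative success probabilities of the two arms
horizon = 1000

# Beta(1, 1) priors over each arm's success probability
successes = np.ones(2)
failures = np.ones(2)

for t in range(horizon):
    # Thompson Sampling: sample from each posterior, play the arm with the largest sample
    samples = rng.beta(successes, failures)
    arm = int(np.argmax(samples))
    reward = rng.random() < true_means[arm]
    successes[arm] += reward
    failures[arm] += 1 - reward

# Uniform random assignment would instead pick: arm = rng.integers(2)
print("posterior means:", successes / (successes + failures))
```

Unlike uniform random assignment, the posterior sampling step increasingly favours the arm that looks better, which is exactly the adaptive behaviour whose impact the paper investigates empirically.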
1 code implementation • 3 Nov 2020 • Sophie-Camille Hogue, Flora Chen, Geneviève Brassard, Denis Lebel, Jean-François Bussières, Audrey Durand, Maxime Thibault
The objective of this work was to assess the clinical performance of an unsupervised machine learning model aimed at identifying unusual medication orders and pharmacological profiles.
no code implementations • 3 Jul 2020 • Deepak Sharma, Audrey Durand, Marc-André Legault, Louis-Philippe Lemieux Perreault, Audrey Lemaçon, Marie-Pierre Dubé, Joelle Pineau
Genome-Wide Association Studies are typically conducted using linear models to find genetic variants associated with common diseases.
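For context, a minimal sketch of the standard per-variant linear-model association test that this sentence refers to; the synthetic genotypes, phenotype, and effect size are illustrative only, not data or code from the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_samples, n_variants = 500, 100

# Illustrative synthetic data: genotypes coded 0/1/2, a continuous phenotype
genotypes = rng.integers(0, 3, size=(n_samples, n_variants)).astype(float)
phenotype = 0.3 * genotypes[:, 0] + rng.normal(size=n_samples)  # variant 0 is truly associated

# One linear model per variant: phenotype ~ intercept + genotype dosage
p_values = []
for j in range(n_variants):
    slope, intercept, r, p, se = stats.linregress(genotypes[:, j], phenotype)
    p_values.append(p)

top = int(np.argmin(p_values))
print(f"strongest association: variant {top}, p = {p_values[top]:.2e}")
```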
1 code implementation • LREC 2020 • Nicolas Garneau, Mathieu Godbout, David Beauchemin, Audrey Durand, Luc Lamontagne
In this paper, we reproduce the experiments of Artetxe et al. (2018b) regarding the robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings.
1 code implementation • 11 Oct 2019 • Sharan Vaswani, Abbas Mehrabian, Audrey Durand, Branislav Kveton
We propose $\tt RandUCB$, a bandit strategy that builds on theoretically derived confidence intervals similar to upper confidence bound (UCB) algorithms but, akin to Thompson sampling (TS), uses randomization to trade off exploration and exploitation.
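Below is a minimal sketch of the general idea (UCB-style confidence widths scaled by a randomly drawn multiplier, shared across arms each round); the sampling distribution, constants, and bandit instance are illustrative assumptions, not the exact choices made in the RandUCB paper.

```python
import numpy as np

rng = np.random.default_rng(1)
true_means = [0.4, 0.55, 0.6]        # illustrative Bernoulli arms
n_arms, horizon = len(true_means), 2000

counts = np.zeros(n_arms)
sums = np.zeros(n_arms)

for t in range(1, horizon + 1):
    if t <= n_arms:                   # play each arm once to initialize
        arm = t - 1
    else:
        means = sums / counts
        widths = np.sqrt(2 * np.log(t) / counts)   # UCB-style confidence widths
        z = rng.uniform(0.0, 2.0)                  # randomized multiplier (illustrative distribution)
        arm = int(np.argmax(means + z * widths))
    reward = float(rng.random() < true_means[arm])
    counts[arm] += 1
    sums[arm] += reward

print("pulls per arm:", counts)
```

Setting the multiplier to a fixed constant recovers a standard UCB rule, while drawing it at random each round injects TS-like randomization into the exploration bonus.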
no code implementations • 17 Sep 2019 • Thang Doan, Bogdan Mazoure, Moloud Abdar, Audrey Durand, Joelle Pineau, R. Devon Hjelm
Continuous control tasks are important in reinforcement learning because they provide a framework for learning in high-dimensional state spaces with deceptive rewards, where the agent can easily become trapped in suboptimal solutions.
1 code implementation • 16 May 2019 • Bogdan Mazoure, Thang Doan, Audrey Durand, R. Devon Hjelm, Joelle Pineau
The ability to discover approximately optimal policies in domains with sparse rewards is crucial to applying reinforcement learning (RL) in many real-world scenarios.
1 code implementation • NeurIPS 2018 • Pierre Thodoroff, Audrey Durand, Joelle Pineau, Doina Precup
Several applications of Reinforcement Learning suffer from instability due to high variance.
2 code implementations • 1 Nov 2018 • Pierre Thodoroff, Audrey Durand, Joelle Pineau, Doina Precup
Several applications of Reinforcement Learning suffer from instability due to high variance.
3 code implementations • 31 Jul 2018 • Thang Doan, Joao Monteiro, Isabela Albuquerque, Bogdan Mazoure, Audrey Durand, Joelle Pineau, R. Devon Hjelm
We argue that less expressive discriminators are smoother and have a more coarse-grained view of the mode map, which forces the generator to cover a wide portion of the data distribution support.
no code implementations • 28 Mar 2018 • Louis-Émile Robitaille, Audrey Durand, Marc-André Gardner, Christian Gagné, Paul De Koninck, Flavie Lavoie-Cardinal
More specifically, we propose a system based on a deep neural network that provides a quantitative quality measure for a STED image of neuronal structures given as input.
no code implementations • 2 Aug 2017 • Audrey Durand, Odalric-Ambrym Maillard, Joelle Pineau
The variance of the noise is not assumed to be known.
no code implementations • 4 Jan 2017 • Audrey Durand, Christian Gagné
The question is: how good do estimations of these objectives have to be in order for the solution maximizing the preference function to remain unchanged?