no code implementations • 22 Feb 2023 • Thibault Cordier, Tanguy Urvoy, Fabrice Lefèvre, Lina M. Rojas-Barahona
Reinforcement learning has been widely adopted to model dialogue managers in task-oriented dialogues.
no code implementations • SIGDIAL (ACL) 2022 • Thibault Cordier, Tanguy Urvoy, Fabrice Lefèvre, Lina M. Rojas-Barahona
Task-oriented dialogue systems are designed to achieve specific goals while conversing with humans.
no code implementations • EACL 2021 • Betty Fabre, Tanguy Urvoy, Jonathan Chevelu, Damien Lolive
We study a search-based paraphrase generation scheme where candidate paraphrases are generated by iterated transformations from the original sentence and evaluated in terms of syntax quality, semantic distance, and lexical distance.
no code implementations • ACL (WebNLG, INLG) 2020 • Sebastien Montella, Betty Fabre, Tanguy Urvoy, Johannes Heinecke, Lina Rojas-Barahona
The task of verbalizing RDF triples has grown in popularity due to the rising ubiquity of Knowledge Bases (KBs).
no code implementations • 25 Nov 2020 • Thibault Cordier, Tanguy Urvoy, Lina M. Rojas-Barahona, Fabrice Lefèvre
We notably propose a randomised exploration policy which allows for a seamless hybridisation of the learned policy and the expert.
no code implementations • NeurIPS Workshop LMCA 2020 • Betty Fabre, Tanguy Urvoy, Jonathan Chevelu, Damien Lolive
A good paraphrase is semantically similar to the original sentence, but it must also be well formed and syntactically different to ensure diversity.
1 code implementation • NeurIPS 2019 • Nicolas Carrara, Edouard Leurent, Romain Laroche, Tanguy Urvoy, Odalric-Ambrym Maillard, Olivier Pietquin
A Budgeted Markov Decision Process (BMDP) is an extension of a Markov Decision Process to critical applications requiring safety constraints.
no code implementations • 16 Aug 2017 • Pratik Gajane, Tanguy Urvoy, Emilie Kaufmann
In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on observing a transformation of these rewards through a stochastic corruption process with known parameters.
no code implementations • 18 Jan 2016 • Artem Sokolov, Stefan Riezler, Tanguy Urvoy
We present an application to discriminative reranking in Statistical Machine Translation (SMT) where the learning algorithm only has access to a 1-BLEU loss evaluation of a predicted translation instead of obtaining a gold standard reference translation.
no code implementations • 15 Jan 2016 • Pratik Gajane, Tanguy Urvoy, Fabrice Clérot
We study the K-armed dueling bandit problem, a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only relative feedback about the selected pairs of arms.
no code implementations • 10 Jul 2015 • Pratik Gajane, Tanguy Urvoy
Partial monitoring is a generic framework for sequential decision-making with incomplete feedback.
no code implementations • 27 Apr 2015 • Raphaël Féraud, Robin Allesiardo, Tanguy Urvoy, Fabrice Clérot
The dependence of the sample complexity upon the number of contextual variables is logarithmic.