Search Results for author: Tanguy Urvoy

Found 12 papers, 1 paper with code

Few-Shot Structured Policy Learning for Multi-Domain and Multi-Task Dialogues

no code implementations 22 Feb 2023 Thibault Cordier, Tanguy Urvoy, Fabrice Lefevre, Lina M. Rojas-Barahona

Reinforcement learning has been widely adopted to model dialogue managers in task-oriented dialogues.

Neural-Driven Search-Based Paraphrase Generation

no code implementations EACL 2021 Betty Fabre, Tanguy Urvoy, Jonathan Chevelu, Damien Lolive

We study a search-based paraphrase generation scheme where candidate paraphrases are generated by iterated transformations from the original sentence and evaluated in terms of syntax quality, semantic distance, and lexical distance.

Paraphrase Generation Sentence

Diluted Near-Optimal Expert Demonstrations for Guiding Dialogue Stochastic Policy Optimisation

no code implementations 25 Nov 2020 Thibault Cordier, Tanguy Urvoy, Lina M. Rojas-Barahona, Fabrice Lefèvre

We notably propose a randomised exploration policy which allows for a seamless hybridisation of the learned policy and the expert.

Imitation Learning Q-Learning +1

Neural-Driven Multi-criteria Tree Search for Paraphrase Generation

no code implementations NeurIPS Workshop LMCA 2020 Betty Fabre, Tanguy Urvoy, Jonathan Chevelu, Damien Lolive

A good paraphrase is semantically similar to the original sentence, but it must also be well formed and syntactically different to ensure diversity.

Paraphrase Generation Sentence

Corrupt Bandits for Preserving Local Privacy

no code implementations 16 Aug 2017 Pratik Gajane, Tanguy Urvoy, Emilie Kaufmann

In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of transformation of these rewards through a stochastic corruption process with known parameters.

Recommendation Systems

Bandit Structured Prediction for Learning from Partial Feedback in Statistical Machine Translation

no code implementations 18 Jan 2016 Artem Sokolov, Stefan Riezler, Tanguy Urvoy

We present an application to discriminative reranking in Statistical Machine Translation (SMT) where the learning algorithm only has access to a 1-BLEU loss evaluation of a predicted translation instead of obtaining a gold standard reference translation.

Machine Translation Structured Prediction +1

A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits

no code implementations 15 Jan 2016 Pratik Gajane, Tanguy Urvoy, Fabrice Clérot

We study the K-armed dueling bandit problem, a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only relative feedback about the selected pairs of arms.

Information Retrieval Retrieval

Utility-based Dueling Bandits as a Partial Monitoring Game

no code implementations 10 Jul 2015 Pratik Gajane, Tanguy Urvoy

Partial monitoring is a generic framework for sequential decision-making with incomplete feedback.

Decision Making

Random Forest for the Contextual Bandit Problem - extended version

no code implementations 27 Apr 2015 Raphaël Féraud, Robin Allesiardo, Tanguy Urvoy, Fabrice Clérot

The dependence of the sample complexity upon the number of contextual variables is logarithmic.
