Search Results for author: Tanguy Urvoy

Found 12 papers, 1 paper with code

Few-Shot Structured Policy Learning for Multi-Domain and Multi-Task Dialogues

no code implementations • 22 Feb 2023 • Thibault Cordier, Tanguy Urvoy, Fabrice Lefèvre, Lina M. Rojas-Barahona

Reinforcement learning has been widely adopted to model dialogue managers in task-oriented dialogues.

Graph Neural Network Policies and Imitation Learning for Multi-Domain Task-Oriented Dialogues

no code implementations • SIGDIAL (ACL) 2022 • Thibault Cordier, Tanguy Urvoy, Fabrice Lefèvre, Lina M. Rojas-Barahona

Task-oriented dialogue systems are designed to achieve specific goals while conversing with humans.

Imitation Learning Task-Oriented Dialogue Systems

Neural-Driven Search-Based Paraphrase Generation

no code implementations • EACL 2021 • Betty Fabre, Tanguy Urvoy, Jonathan Chevelu, Damien Lolive

We study a search-based paraphrase generation scheme where candidate paraphrases are generated by iterated transformations from the original sentence and evaluated in terms of syntax quality, semantic distance, and lexical distance.

Paraphrase Generation Sentence

Denoising Pre-Training and Data Augmentation Strategies for Enhanced RDF Verbalization with Transformers

no code implementations • ACL (WebNLG, INLG) 2020 • Sebastien Montella, Betty Fabre, Tanguy Urvoy, Johannes Heinecke, Lina Rojas-Barahona

The task of verbalizing RDF triples has grown in popularity due to the rising ubiquity of Knowledge Bases (KBs).

Data Augmentation Denoising +1

Diluted Near-Optimal Expert Demonstrations for Guiding Dialogue Stochastic Policy Optimisation

no code implementations • 25 Nov 2020 • Thibault Cordier, Tanguy Urvoy, Lina M. Rojas-Barahona, Fabrice Lefèvre

We notably propose a randomised exploration policy which allows for a seamless hybridisation of the learned policy and the expert.

Imitation Learning Q-Learning +1

Neural-Driven Multi-criteria Tree Search for Paraphrase Generation

no code implementations • NeurIPS Workshop LMCA 2020 • Betty Fabre, Tanguy Urvoy, Jonathan Chevelu, Damien Lolive

A good paraphrase is semantically similar to the original sentence, but it must also be well formed and syntactically different to ensure diversity.

Paraphrase Generation Sentence

Budgeted Reinforcement Learning in Continuous State Space

1 code implementation • NeurIPS 2019 • Nicolas Carrara, Edouard Leurent, Romain Laroche, Tanguy Urvoy, Odalric-Ambrym Maillard, Olivier Pietquin

A Budgeted Markov Decision Process (BMDP) is an extension of a Markov Decision Process to critical applications requiring safety constraints.

Autonomous Driving reinforcement-learning +1

Corrupt Bandits for Preserving Local Privacy

no code implementations • 16 Aug 2017 • Pratik Gajane, Tanguy Urvoy, Emilie Kaufmann

In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of a transformation of these rewards through a stochastic corruption process with known parameters.

Recommendation Systems

Bandit Structured Prediction for Learning from Partial Feedback in Statistical Machine Translation

no code implementations • 18 Jan 2016 • Artem Sokolov, Stefan Riezler, Tanguy Urvoy

We present an application to discriminative reranking in Statistical Machine Translation (SMT) where the learning algorithm only has access to a 1-BLEU loss evaluation of a predicted translation instead of obtaining a gold standard reference translation.

Machine Translation Structured Prediction +1

A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits

no code implementations • 15 Jan 2016 • Pratik Gajane, Tanguy Urvoy, Fabrice Clérot

We study the K-armed dueling bandit problem, a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only relative feedback about the selected pairs of arms.

Information Retrieval Retrieval

Utility-based Dueling Bandits as a Partial Monitoring Game

no code implementations • 10 Jul 2015 • Pratik Gajane, Tanguy Urvoy

Partial monitoring is a generic framework for sequential decision-making with incomplete feedback.

Decision Making

Random Forest for the Contextual Bandit Problem - extended version

no code implementations • 27 Apr 2015 • Raphaël Féraud, Robin Allesiardo, Tanguy Urvoy, Fabrice Clérot

The dependence of the sample complexity upon the number of contextual variables is logarithmic.

