Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Machine Translation	IWSLT2014 German-English	Actor-Critic [Bahdanau2017]	BLEU score	28.53	#33
Machine Translation	IWSLT2015 English-German	RNNsearch	BLEU score	25.04	#8
Machine Translation	IWSLT2015 German-English	RNNsearch	BLEU score	29.98	#9

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/an-actor-critic-algorithm-for-sequence/machine-translation-on-iwslt2015-english)](https://paperswithcode.com/sota/machine-translation-on-iwslt2015-english?p=an-actor-critic-algorithm-for-sequence)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/an-actor-critic-algorithm-for-sequence/machine-translation-on-iwslt2015-german)](https://paperswithcode.com/sota/machine-translation-on-iwslt2015-german?p=an-actor-critic-algorithm-for-sequence)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/an-actor-critic-algorithm-for-sequence/machine-translation-on-iwslt2014-german)](https://paperswithcode.com/sota/machine-translation-on-iwslt2014-german?p=an-actor-critic-algorithm-for-sequence)`

An Actor-Critic Algorithm for Sequence Prediction

24 Jul 2016 · Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron Courville, Yoshua Bengio

We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL). Current log-likelihood training methods are limited by the discrepancy between their training and testing modes: at test time, models must generate tokens conditioned on their previous guesses rather than on the ground-truth tokens. We address this problem by introducing a critic network that is trained to predict the value of an output token, given the policy of an actor network. This results in a training procedure that is much closer to the test phase and allows us to directly optimize for a task-specific score such as BLEU. Crucially, since we leverage these techniques in the supervised learning setting rather than the traditional RL setting, we condition the critic network on the ground-truth output. We show that our method leads to improved performance on both a synthetic task and German-English machine translation. Our analysis paves the way for such methods to be applied in natural language generation tasks, such as machine translation, caption generation, and dialogue modelling.
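The idea of letting a critic's per-token value estimates steer the actor can be illustrated with a minimal sketch. It assumes a softmax actor over a three-token vocabulary at a single decoding step and uses made-up constant critic values; in the paper the critic is a trained network conditioned on the ground-truth output, which this toy does not model. The function names (`expected_value`, `actor_gradient`, `actor_step`) and the learning rate are hypothetical, chosen only for illustration.

```python
import math

def softmax(logits):
    """Turn unnormalised scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_value(logits, q_values):
    """Expected critic value of one decoding step: sum_a p(a) * Q(a)."""
    probs = softmax(logits)
    return sum(p * q for p, q in zip(probs, q_values))

def actor_gradient(logits, q_values):
    """Gradient of expected_value with respect to the actor's logits.

    For a softmax policy this is p(a) * (Q(a) - sum_b p(b) * Q(b)).
    """
    probs = softmax(logits)
    baseline = sum(p * q for p, q in zip(probs, q_values))
    return [p * (q - baseline) for p, q in zip(probs, q_values)]

def actor_step(logits, q_values, learning_rate=0.5):
    """One gradient-ascent step on the logits (hypothetical update rule)."""
    grad = actor_gradient(logits, q_values)
    return [x + learning_rate * g for x, g in zip(logits, grad)]

# Toy example: a 3-token vocabulary at a single decoding step.
logits = [0.0, 0.0, 0.0]      # the actor starts out uniform
q_values = [0.1, 0.9, 0.3]    # made-up critic estimates, not from the paper
for _ in range(50):
    logits = actor_step(logits, q_values)
probs = softmax(logits)       # probability mass shifts toward token 1
```

Because every candidate token contributes to the update in proportion to its estimated value, the actor receives a training signal for tokens it did not sample, which is one way to read the abstract's claim that training moves closer to the test phase.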

Code

rizar/actor-critic-public official

166

joeynmt/joeynmt

656

juliakreutzer/joeynmt

Tasks

Caption Generation

Machine Translation

Reinforcement Learning (RL)

Spelling Correction

Text Generation

Translation

Results from the Paper

Ranked #8 on Machine Translation on IWSLT2015 English-German
