A2C, or Advantage Actor Critic, is a synchronous version of the A3C policy gradient method. Where A3C lets each actor update the shared parameters asynchronously, A2C is a synchronous, deterministic implementation: it waits for every actor to finish its segment of experience and then performs a single update, averaging over all of the actors. This uses GPUs more effectively because of the larger batch sizes.

Image Credit: OpenAI Baselines

Source: Asynchronous Methods for Deep Reinforcement Learning
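
The synchronous update described above can be sketched in a few lines. This is a minimal illustration rather than the OpenAI Baselines implementation: the function names `n_step_returns` and `a2c_loss`, the `value_coef` weight, and the (steps x actors) list layout are assumptions made for the example, and the entropy bonus that practical implementations usually add is left out. Every actor contributes a segment of the same length, and the loss is a plain mean over all steps and all actors.

```python
def n_step_returns(rewards, dones, bootstrap_values, gamma=0.99):
    """Discounted returns for one synchronized batch of segments.

    rewards, dones: lists of shape (num_steps, num_actors).
    bootstrap_values: critic estimates for the state after each
    actor's last step, shape (num_actors,).
    """
    num_steps = len(rewards)
    running = list(bootstrap_values)
    returns = [None] * num_steps
    # Walk backwards through the segment; a done flag cuts the bootstrap.
    for t in reversed(range(num_steps)):
        running = [
            r + gamma * g * (0.0 if d else 1.0)
            for r, g, d in zip(rewards[t], running, dones[t])
        ]
        returns[t] = running
    return returns


def a2c_loss(log_probs, values, returns, value_coef=0.5):
    """Actor and critic losses averaged over every step of every actor.

    Assumption: in a real framework the advantage is treated as a
    constant when differentiating the policy term.
    """
    pairs = [
        (lp, ret - v)  # (log-probability, advantage estimate)
        for lp_row, v_row, ret_row in zip(log_probs, values, returns)
        for lp, v, ret in zip(lp_row, v_row, ret_row)
    ]
    n = len(pairs)
    policy_loss = -sum(lp * adv for lp, adv in pairs) / n
    value_loss = sum(adv ** 2 for _, adv in pairs) / n
    return policy_loss + value_coef * value_loss


# Two actors, two steps each; the second actor's episode ends at step 0.
rewards = [[1.0, 1.0], [1.0, 1.0]]
dones = [[False, True], [False, False]]
returns = n_step_returns(rewards, dones, [2.0, 2.0], gamma=0.5)
loss = a2c_loss([[-1.0, -1.0], [-1.0, -1.0]], [[0.0, 0.0], [0.0, 0.0]], returns)
```

Because the update happens only after all actors have reported, the batch always has the same shape, which is what makes the computation deterministic and easy to batch on a GPU.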

Latest Papers

PAPER DATE
The Greatest Teacher, Failure is: Using Reinforcement Learning for SFC Placement Based on Availability and Energy Consumption
Guto Leoni Santos, Theo Lynn, Judith Kelner, Patricia Takako Endo
2020-10-12
Value-Decomposition Multi-Agent Actor-Critics
Jianyu Su, Stephen Adams, Peter A. Beling
2020-07-24
Generalized State-Dependent Exploration for Deep Reinforcement Learning in Robotics
Antonin Raffin, Freek Stulp
2020-05-12
Work in Progress: Temporally Extended Auxiliary Tasks
Craig Sherstan, Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor
2020-04-01
Adaptive Discretization for Continuous Control using Particle Filtering Policy Network
Pei Xu, Ioannis Karamouzas
2020-03-16
Accessing Higher-level Representations in Sequential Transformers with Feedback Memory
Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
2020-02-21
Learning Representations in Reinforcement Learning: an Information Bottleneck Approach
Yingjun Pei, Xinwen Hou
2020-01-01
SLM Lab: A Comprehensive Benchmark and Modular Software Framework for Reproducible Deep Reinforcement Learning
Keng Wah Loon, Laura Graesser, Milan Cvitkovic
2019-12-28
Buffer-aware Wireless Scheduling based on Deep Reinforcement Learning
Chen Xu, Jian Wang, Tianhang Yu, Chuili Kong, Yourui Huangfu, Rong Li, Yiqun Ge, Jun Wang
2019-11-13
Learning Representations in Reinforcement Learning: An Information Bottleneck Approach
Pei Yingjun, Hou Xinwen
2019-11-12
AI Assisted Annotator using Reinforcement Learning
V. Ratna Saripalli, Gopal Avinash, Dibyajyoti Pati, Michael Potter, Charles W. Anderson
2019-10-02
Quantized Reinforcement Learning (QUARL)
Srivatsan Krishnan, Sharad Chitlangia, Maximilian Lam, Zishen Wan, Aleksandra Faust, Vijay Janapa Reddi
2019-10-02
Improving OOV Detection and Resolution with External Language Models in Acoustic-to-Word ASR
Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
2019-09-22
Deep Reinforcement Learning with Modulated Hebbian plus Q Network Architecture
Pawel Ladosz, Eseoghene Ben-Iwhiwhu, Jeffery Dick, Yang Hu, Nicholas Ketz, Soheil Kolouri, Jeffrey L. Krichmar, Praveen Pilly, Andrea Soltoggio
2019-09-21
Vision-based Navigation Using Deep Reinforcement Learning
Jonáš Kulhánek, Erik Derner, Tim de Bruin, Robert Babuška
2019-08-08
Variance Reduction in Actor Critic Methods (ACM)
Eric Benhamou
2019-07-23
Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning
Mahmoud Assran, Joshua Romoff, Nicolas Ballas, Joelle Pineau, Michael Rabbat
2019-06-09
RL-Based Method for Benchmarking the Adversarial Resilience and Robustness of Deep Reinforcement Learning Policies
Vahid Behzadan, William Hsu
2019-06-03
Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment
Jivitesh Sharma, Per-Arne Andersen, Ole-Chrisoffer Granmo, Morten Goodwin
2019-05-23
Autonomous Air Traffic Controller: A Deep Multi-Agent Reinforcement Learning Approach
Marc Brittain, Peng Wei
2019-05-02
Multi-Agent Deep Reinforcement Learning for Large-scale Traffic Signal Control
Tianshu Chu, Jie Wang, Lara Codecà, Zhaojian Li
2019-03-11
Integrating Reinforcement Learning to Self Training for Pulmonary Nodule Segmentation in Chest X-rays
Sejin Park, Woochan Hwang, Kyu-Hwan Jung
2018-11-21
An initial attempt of combining visual selective attention with deep reinforcement learning
Liu Yuezhang, Ruohan Zhang, Dana H. Ballard
2018-11-11
Representation Learning with Contrastive Predictive Coding
Aaron van den Oord, Yazhe Li, Oriol Vinyals
2018-07-10
An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution
Rosanne Liu, Joel Lehman, Piero Molino, Felipe Petroski Such, Eric Frank, Alex Sergeev, Jason Yosinski
2018-07-09
Improving width-based planning with compact policies
Miquel Junyent, Anders Jonsson, Vicenç Gómez
2018-06-15
Deep Curiosity Search: Intra-Life Exploration Can Improve Performance on Challenging Deep Reinforcement Learning Problems
Christopher Stanton, Jeff Clune
2018-06-01
An Empirical Analysis of Proximal Policy Optimization with Kronecker-factored Natural Gradients
Jiaming Song, Yuhuai Wu
2018-01-17
Exploring Deep Recurrent Models with Reinforcement Learning for Molecule Design
Daniel Neil, Marwin Segler, Laura Guasch, Mohamed Ahmed, Dean Plumbley, Matthew Sellwood, Nathan Brown
2018-01-01
A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management
Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Stefan Ultes, Lina Rojas-Barahona, Steve Young, Milica Gašić
2017-11-29
AI Safety Gridworlds
Jan Leike, Miljan Martic, Victoria Krakovna, Pedro A. Ortega, Tom Everitt, Andrew Lefrancq, Laurent Orseau, Shane Legg
2017-11-27
TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning
Gregory Farquhar, Tim Rocktäschel, Maximilian Igl, Shimon Whiteson
2017-10-31
Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning
Baolin Peng, Xiujun Li, Jianfeng Gao, Jingjing Liu, Yun-Nung Chen, Kam-Fai Wong
2017-10-31
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu
2016-02-04

Categories