Policy Gradient Methods

Robust Predictable Control

Introduced by Eysenbach et al. in Robust Predictable Control

Robust Predictable Control, or RPC, is an RL algorithm for learning policies that uses only a few bits of information. RPC brings together ideas from information bottlenecks, model-based RL, and bits-back coding. The main idea of RPC is that if the agent can accurately predict the future, then the agent will not need to observe as many bits from future observations. Precisely, the agent will learn a latent dynamics model that predicts the next representation using the current representation and action. In addition to predicting the future, the agent can also decrease the number of bits by changing its behavior. States where the dynamics are hard to predict will require more bits, so the agent will prefer visiting states where its learned model can accurately predict the next state.

Source: Robust Predictable Control


Paper Code Results Date Stars


Task Papers Share
Decision Making 1 100.00%


Component Type
🤖 No Components Found You can add them if they exist; e.g. Mask R-CNN uses RoIAlign