Search Results for author: Brendan D. Tracey

Found 7 papers, 4 papers with code

From Motor Control to Team Play in Simulated Humanoid Football

1 code implementation • 25 May 2021 • SiQi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds and coordinated, goal-directed team behaviour at a timescale of tens of seconds.

Decision Making • Imitation Learning • +2 more

Caveats for information bottleneck in deterministic scenarios

1 code implementation • ICLR 2019 • Artemy Kolchinsky, Brendan D. Tracey, Steven Van Kuyk

We demonstrate three caveats when using IB in any situation where $Y$ is a deterministic function of $X$: (1) the IB curve cannot be recovered by maximizing the IB Lagrangian for different values of $\beta$; (2) there are "uninteresting" trivial solutions at all points of the IB curve; and (3) for multi-layer classifiers that achieve low prediction error, different layers cannot exhibit a strict trade-off between compression and prediction, contrary to a recent proposal.
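
For reference, the IB Lagrangian mentioned in the first caveat is commonly written as below. The bottleneck variable $T$ (a compressed representation of $X$) is notation assumed here rather than taken from the snippet, and sign and scaling conventions for $\beta$ vary between papers.

```latex
\mathcal{L}_{\mathrm{IB}}(T) = I(Y;T) - \beta \, I(X;T), \qquad \beta \in [0, 1]
```

Maximizing over $T$ for a fixed $\beta$ trades prediction, $I(Y;T)$, against compression, $I(X;T)$.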

Upgrading from Gaussian Processes to Student's-T Processes

no code implementations • 18 Jan 2018 • Brendan D. Tracey, David H. Wolpert

The Student's-T distribution has higher kurtosis than a Gaussian distribution, so outliers are much more likely, and the posterior variance increases or decreases depending on the variance of the observed data values.

Gaussian Processes

Deep Reinforcement Learning for Event-Driven Multi-Agent Decision Processes

1 code implementation • 19 Sep 2017 • Kunal Menda, Yi-Chun Chen, Justin Grana, James W. Bono, Brendan D. Tracey, Mykel J. Kochenderfer, David Wolpert

The incorporation of macro-actions (temporally extended actions) into multi-agent decision problems has the potential to address the curse of dimensionality associated with such decision problems.

reinforcement-learning

Estimating Mixture Entropy with Pairwise Distances

no code implementations • 8 Jun 2017 • Artemy Kolchinsky, Brendan D. Tracey

We prove this family includes lower and upper bounds on the mixture entropy.

Nonlinear Information Bottleneck

3 code implementations • 6 May 2017 • Artemy Kolchinsky, Brendan D. Tracey, David H. Wolpert

Information bottleneck (IB) is a technique for extracting information in one random variable $X$ that is relevant for predicting another random variable $Y$.

Reducing the error of Monte Carlo Algorithms by Learning Control Variates

no code implementations • 7 Jun 2016 • Brendan D. Tracey, David H. Wolpert

Crucially, it is a post-processing technique, requiring no additional samples, and can be applied to data generated by any MC estimator.
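
As a rough illustration of the post-processing idea, the sketch below applies a classic linear control variate with a known mean to samples that have already been drawn. This is the textbook technique, not the learned control variates of the paper; the function name `control_variate_estimate` and the toy integrand are assumptions made for the example.

```python
import math
import random
import statistics


def control_variate_estimate(f_vals, g_vals, g_mean):
    """Adjust the sample mean of f using a control variate g with known mean.

    Hypothetical helper for illustration: it reuses the existing samples only,
    so no additional draws are needed.
    """
    n = len(f_vals)
    f_bar = statistics.fmean(f_vals)
    g_bar = statistics.fmean(g_vals)
    # Coefficient beta = Cov(f, g) / Var(g), estimated from the same samples.
    cov_fg = sum((f - f_bar) * (g - g_bar) for f, g in zip(f_vals, g_vals)) / (n - 1)
    beta = cov_fg / statistics.variance(g_vals)
    # Subtract the part of the sampling error that g explains.
    return f_bar - beta * (g_bar - g_mean)


# Toy problem (assumed): estimate E[exp(X)] for X ~ Uniform(0, 1),
# using g(X) = X, whose mean 0.5 is known exactly.
rng = random.Random(0)
xs = [rng.random() for _ in range(2000)]
f_vals = [math.exp(x) for x in xs]

plain = statistics.fmean(f_vals)                      # ordinary MC estimate
adjusted = control_variate_estimate(f_vals, xs, 0.5)  # post-processed estimate
true_value = math.e - 1.0
```

Because `exp(x)` is strongly correlated with `x` on the unit interval, the adjusted estimate typically lands much closer to `true_value` than the plain sample mean, although this is not guaranteed for every sample.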
