An oft-ignored challenge of real-world reinforcement learning is that the real world does not pause when agents make learning updates.
We study the problem of training a Reinforcement Learning (RL) agent that collaborates with humans without using any human data.
Current stripe-based feature learning approaches have delivered impressive accuracy, but they do not properly trade off diversity, locality, and robustness, and are prone to part-level semantic inconsistency arising from the conflict between rigid partitioning and body misalignment.
Accurate temporal action proposals play an important role in detecting actions from untrimmed videos.
In this technical report, we describe our solution to the temporal action proposal task (Task 1) of the ActivityNet Challenge 2019.
In this paper, we propose a novel meta-learning method in a reinforcement learning setting, based on evolution strategies (ES), exploration in parameter space, and deterministic policy gradients.
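To illustrate the parameter-space exploration at the core of this combination, below is a minimal sketch of a single evolution-strategies update, assuming a flat policy parameter vector and a black-box return estimator. The name `evaluate_return`, the hyperparameters, and the population size are illustrative placeholders rather than the paper's exact procedure, and the deterministic-policy-gradient component is omitted.

```python
import numpy as np

def es_step(theta, evaluate_return, sigma=0.1, alpha=0.01, pop_size=50, rng=None):
    """One evolution-strategies update: perturb the policy parameters in
    parameter space, estimate the gradient of expected return from the
    perturbed returns, and take an ascent step.

    theta: flat parameter vector of the policy.
    evaluate_return: callable mapping a parameter vector to a scalar return
        (placeholder for rolling out the perturbed policy in the environment).
    """
    rng = rng or np.random.default_rng()
    # Gaussian perturbation directions, one per population member.
    eps = rng.standard_normal((pop_size, theta.size))
    returns = np.array([evaluate_return(theta + sigma * e) for e in eps])
    # Standardize returns to reduce the variance of the gradient estimate.
    adv = (returns - returns.mean()) / (returns.std() + 1e-8)
    # Score-function estimate of the gradient of expected return.
    grad = (eps.T @ adv) / (pop_size * sigma)
    return theta + alpha * grad
```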
We show experimentally that Pommerman is a well-suited environment for studying continual learning: the agent improves its performance by continually learning new skills without forgetting the old ones.
Instead of learning on semantic regions, we uniformly partition the images into several stripes, and vary the number of parts in different local branches to obtain local feature representations with multiple granularities.
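To make the uniform partitioning concrete, here is a minimal PyTorch sketch of stripe pooling over a backbone feature map, assuming an input of shape (N, C, H, W); the helper `stripe_pool` and the branch granularities (1, 2, and 3 stripes) are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn.functional as F

def stripe_pool(feat: torch.Tensor, num_stripes: int) -> list:
    """Uniformly partition a feature map into horizontal stripes and
    pool each stripe into one local descriptor.

    feat: backbone output of shape (N, C, H, W).
    Returns a list of num_stripes tensors, each of shape (N, C).
    """
    # Adaptive pooling yields one cell per stripe and handles H that is
    # not divisible by num_stripes.
    pooled = F.adaptive_avg_pool2d(feat, (num_stripes, 1))  # (N, C, num_stripes, 1)
    return [pooled[:, :, i, 0] for i in range(num_stripes)]

# Hypothetical multi-granularity usage: a global branch (1 stripe) plus
# local branches with 2 and 3 stripes.
feat = torch.randn(8, 2048, 24, 8)
branches = {g: stripe_pool(feat, g) for g in (1, 2, 3)}
```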