Despite its broad application and interest, it remains a challenging problem in part due to the vast range of conditions under which it must be robust.
This paper proposes a novel deep reinforcement learning algorithm to perform automatic analysis and detection of gameplay issues in complex 3D navigation environments.
Recent self-supervised learning methods are able to learn high-quality image representations and are closing the gap with supervised approaches.
To this end, we propose four different policy fusion methods for combining pre-trained policies.
Experimental results demonstrate the effectiveness of our approach: using less than 50\% of the available real thermal training data, and relying on synthesized data generated by our model in the domain adaptation phase, our detector achieves state-of-the-art results on the KAIST Multispectral Pedestrian Detection Benchmark. Even when more real thermal data is available, adding GAN-generated images to the training data improves performance, showing that these images act as an effective form of data augmentation.
Recent advances in Deep Reinforcement Learning (DRL) have largely focused on improving the performance of agents with the aim of replacing humans in known and well-defined environments.
We propose a technique based on Adversarial Inverse Reinforcement Learning which can significantly decrease the need for expert demonstrations in PCG games.
In this paper we introduce DeepCrawl, a fully-playable Roguelike prototype for iOS and Android in which all agents are controlled by policy networks trained using Deep Reinforcement Learning (DRL).
For future learning systems incremental learning is desirable, because it allows for: efficient resource usage by eliminating the need to retrain from scratch at the arrival of new data; reduced memory usage by preventing or limiting the amount of data required to be stored -- also important when privacy limitations are imposed; and learning that more closely resembles human learning.
We call our method Recurrent Attention to Transient Tasks (RATT), and also show how to adapt continual learning approaches based on weight regularization and knowledge distillation to recurrent continual learning problems.
To prevent forgetting, we combine generative feature replay in the classifier with feature distillation in the feature extractor.
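As a rough illustration of how these two losses fit together, the sketch below (plain NumPy, with hypothetical shapes and a squared-error distillation term; the actual loss formulation in the paper may differ) combines a feature distillation penalty between the new and the frozen old feature extractor with a classification loss on generator-replayed features:

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_distillation_loss(f_new, f_old):
    # L2 distillation: keep the new extractor's features close to the frozen old one's
    return np.mean((f_new - f_old) ** 2)

def cross_entropy(logits, labels):
    # softmax cross-entropy on replayed (generated) features of previous classes
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -np.mean(log_probs[np.arange(len(labels)), labels])

# hypothetical setup: 8 samples, 64-dim features, 5 previously seen classes
f_new = rng.normal(size=(8, 64))                   # current extractor's features
f_old = f_new + 0.01 * rng.normal(size=(8, 64))    # frozen old extractor's features
replayed = rng.normal(size=(8, 64))                # features sampled from a generator
W = rng.normal(size=(64, 5))                       # linear classifier weights
old_labels = rng.integers(0, 5, 8)                 # labels for the replayed features

loss = feature_distillation_loss(f_new, f_old) + cross_entropy(replayed @ W, old_labels)
```

The distillation term anchors the feature extractor while the replayed features keep the classifier's decision boundaries for old classes exercised, without storing any real exemplars.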
This will turn the classic audio guide into a smart personal instructor with which the visitor can interact by asking for explanations focused on specific interests.
Our results show that networks trained to regress to the ground truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting.
In this paper we propose a technique to create and exploit an intermediate representation of images based on text attributes which are character probability maps.
We propose a novel crowd counting approach that leverages abundantly available unlabeled crowd imagery in a learning-to-rank framework.
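The ranking signal behind such an approach comes for free from image containment: a crop of a crowd image contains at most as many people as the image it was taken from, so predicted counts should respect that ordering. A minimal sketch of the resulting pairwise hinge penalty (the margin value and exact loss form here are illustrative assumptions, not the paper's specification):

```python
import numpy as np

def ranking_hinge_loss(count_full, count_crop, margin=0.0):
    # a crop contains at most as many people as the full image it came from,
    # so the predicted count for the crop should not exceed the full-image count;
    # violations are penalised linearly
    return float(np.maximum(0.0, count_crop - count_full + margin))

ok = ranking_hinge_loss(12.0, 7.0)    # correctly ordered pair: no penalty
bad = ranking_hinge_loss(7.0, 12.0)   # violated ordering: penalised by the gap
```

Because the constraint needs no annotations, such pairs can be mined from arbitrary unlabeled crowd imagery and mixed into training alongside the labeled regression loss.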
In this paper we propose an approach to avoiding catastrophic forgetting in sequential task learning scenarios.
We show that domain transfer leads to large shifts in network activations and that it is desirable to take this into account when compressing.
The range of emergencies in which computer vision tools have been considered or used is very broad, and there is considerable overlap across related emergency research.
Furthermore, on the LIVE benchmark we show that our approach is superior to existing NR-IQA techniques and that we even outperform the state-of-the-art in full-reference IQA (FR-IQA) methods without having to resort to high-quality reference images to infer IQA.
A set of feature vectors, each corresponding to a different area of the image, is derived from an intermediate convolutional layer.
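Concretely, such per-area vectors can be read off a convolutional activation map by treating each spatial location as one descriptor; a minimal sketch with assumed (hypothetical) map dimensions:

```python
import numpy as np

# hypothetical conv activation map: height x width x channels
fmap = np.arange(7 * 7 * 512, dtype=np.float32).reshape(7, 7, 512)

# one channel-dimensional feature vector per spatial location,
# i.e. one vector per area of the input image
vectors = fmap.reshape(-1, 512)
```

Each row of `vectors` summarizes the receptive field of one location in the feature map, giving a set of local descriptors rather than a single global one.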
Text Proposals have emerged as a class-dependent version of object proposals: efficient approaches to reducing the search space of possible text object locations in an image.
This paper proposes a novel method to optimize bandwidth usage for object detection in critical communication scenarios.
Most approaches to human attribute and action recognition in still images are based on image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding.
Object detection with deep neural networks is often performed by passing a few thousand candidate bounding boxes through a deep neural network for each image.
In this paper we present the use of Sparse Radial Sampling Local Binary Patterns, a variant of Local Binary Patterns (LBP) for text-as-texture classification.
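For intuition, the sketch below computes a plain 3x3 LBP code, not the Sparse Radial Sampling variant: each of the 8 neighbours is thresholded against the centre pixel and the resulting bits are packed into one 8-bit code (the clockwise bit ordering chosen here is one common convention, not necessarily the paper's):

```python
import numpy as np

def lbp_code(patch):
    # classic 3x3 LBP: compare the 8 neighbours with the centre pixel
    # and pack the comparison bits into a single 8-bit code
    center = patch[1, 1]
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum(int(n >= center) << i for i, n in enumerate(neighbours))

flat = lbp_code(np.full((3, 3), 5))     # uniform patch: all bits set
peak = np.zeros((3, 3)); peak[1, 1] = 9
spot = lbp_code(peak)                   # centre brighter than all neighbours: no bits set
```

Histograms of such codes over a region characterize its texture, which is what makes LBP variants usable for text-as-texture classification.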
We describe a novel technique for feature combination in the bag-of-words model of image classification.