However, it is difficult to directly leverage such large amounts of unlabeled and highly diverse data for complex 3D reasoning and planning tasks.
We tackle inherent data scarcity by leveraging a simulation environment to spawn dynamic agents with various mobility aids.
Motivated by this observation, we develop a framework for learning a situational driving policy that reasons effectively across varying types of scenarios.
Beyond label efficiency, we find several additional training benefits when leveraging visual abstractions, such as a significant reduction in the variance of the learned policy when compared to state-of-the-art end-to-end driving models.
Our results show that the proposed multi-stream CNN outperforms the other evaluated models at predicting time to near-collision.
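As a purely illustrative sketch (not the paper's architecture), a multi-stream CNN for this task might process two image inputs in parallel streams and fuse their features before a regression head; the class name, stream inputs, channel counts, and layer sizes below are assumptions.

import torch
import torch.nn as nn

# Hypothetical two-stream CNN that regresses time to near-collision.
# All sizes and input choices are illustrative assumptions.
class TwoStreamTimeToCollisionNet(nn.Module):
    def __init__(self):
        super().__init__()
        def stream():
            return nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.stream_a = stream()  # e.g. the current frame
        self.stream_b = stream()  # e.g. an earlier frame or motion channel
        self.head = nn.Sequential(
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 1),  # predicted time to near-collision (seconds)
        )

    def forward(self, frame_a, frame_b):
        feats = torch.cat([self.stream_a(frame_a), self.stream_b(frame_b)], dim=1)
        return self.head(feats)

# Example: a batch of 4 pairs of 128x128 RGB frames.
model = TwoStreamTimeToCollisionNet()
pred = model(torch.randn(4, 3, 128, 128), torch.randn(4, 3, 128, 128))
print(pred.shape)  # torch.Size([4, 1])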
The learning process is interactive, with a human expert first providing input in the form of full demonstrations along with some subgoal states.
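To make the form of this expert input concrete, one possible representation (a minimal sketch, not taken from the source) pairs each demonstrated state trajectory with the indices of the states the expert marked as subgoals; the Demonstration class and collect_expert_input function below are hypothetical.

from dataclasses import dataclass, field
from typing import List, Sequence

# Hypothetical container for one full expert demonstration:
# an ordered state trajectory, the expert's actions, and the
# indices of states the expert annotated as subgoals.
@dataclass
class Demonstration:
    states: List[Sequence[float]]
    actions: List[int]
    subgoal_indices: List[int] = field(default_factory=list)

    def subgoal_states(self) -> List[Sequence[float]]:
        # Return only the expert-labeled subgoal states.
        return [self.states[i] for i in self.subgoal_indices]

def collect_expert_input(demos: List[Demonstration]) -> List[Sequence[float]]:
    # Gather all subgoal states provided across the demonstrations.
    subgoals: List[Sequence[float]] = []
    for demo in demos:
        subgoals.extend(demo.subgoal_states())
    return subgoals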
Consider an assistive system that guides visually impaired users to their destination through speech and haptic feedback.
This work addresses the task of accurately localizing driver hands and classifying the grasp state of each hand.
We aim to study the modeling limitations of the commonly employed boosted decision tree classifier.
This study aims to analyze the benefits of improved multi-scale reasoning for object detection and localization with deep convolutional neural networks.