Synthetically-generated audios and videos -- so-called deep fakes -- continue to capture the imagination of the computer-graphics and computer-vision communities.
In this study, we present an analysis of model-based ensemble learning for 3D point-cloud object classification and detection.
In this paper, we propose a novel regularization method for Generative Adversarial Networks, which allows the model to learn discriminative yet compact binary representations of image patches (image descriptors).
In the task of Object Recognition, there exists a dichotomy between the categorization of objects and estimating object pose, where the former necessitates a view-invariant representation, while the latter requires a representation capable of capturing pose information over different categories of objects.
How does fine-tuning of a pre-trained CNN on a multi-view dataset affect the representation at each layer of the network?
Due to large variations in shape, appearance, and viewing conditions, object recognition is a key precursory challenge in the fields of object manipulation and robotic/AI visual reasoning in general.
In doing so, we also present a method of recovering an accurate depthmap of the scene and recovering the scene without the visual effects of haze.