By adding an additional loss function on the low-dimensional feature space during training, the reconstruction frameworks from under-sampled or corrupted data can reproduce more realistic images that are closer to the original with finer textures, sharper edges, and improved overall image quality.
We aim to leverage the densely labeled task, image parsing, a. k. a panoptic segmentation, to learn a model that encodes and discovers object-centric context.
SurReal complex-valued networks adopt a manifold view of complex numbers and derive a distance metric that is invariant to complex scaling.
Many diseases are classified based on human-defined rubrics that are prone to bias.
We compare C-SURE with SurReal and a real-valued baseline on complex-valued MSTAR and RadioML datasets.
We improve on the previous model by introducing several changes to the model, which leads to a better depth and grayscale estimation, and increased perceptual quality.
Inspired by bats' echolocation mechanism, we design a low-cost BatVision system that is capable of seeing the 3D spatial layout of space ahead by just listening with two ears.
Humans can envision a realistic photo given a free-hand sketch that is not only spatially imprecise and geometrically distorted but also without colors and visual details.
Key features include a human-centred representation, and a declarative, explainable interpretation model supporting deep semantic question-answering founded on an integration of methods in knowledge representation and deep learning based computer vision.
Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so.