We conduct extensive experiments on ADE20K, Cityscapes, and Pascal Context, and the results show that applying the CBL to various popular segmentation networks can significantly improve the mIoU and boundary F-score performance.
Ranked #19 on Semantic Segmentation on Cityscapes val
Extensive experiments demonstrate that our model not only significantly improves existing methods on all these tasks, but also shows great ability in the few-shot and domain generalization settings.
Ranked #3 on Text based Person Retrieval on ICFG-PEDES
Under this novel view, we propose a Class Center Similarity layer (CCS layer) to address the above-mentioned challenges by generating adaptive class centers conditioned on different scenes and supervising the similarities between class centers.
In view of this, we propose a new goal area-based framework, named Goal Area Network (GANet), for motion forecasting, which models goal areas rather than exact goal coordinates as preconditions for trajectory prediction, performing more robustly and accurately.
Ranked #15 on Motion Forecasting on Argoverse CVPR 2020
Specifically, we first capture the different representations with different augmentations, then regularize the cosine distance of the representations to enhance the consistency.
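The consistency regularization described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `cosine_distance` and `consistency_loss`, the use of plain Python lists as representations, and the averaging over the batch are all assumptions made for the example.

```python
import math

def cosine_distance(u, v, eps=1e-12):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero for all-zero vectors (assumption).
    return 1.0 - dot / max(norm_u * norm_v, eps)

def consistency_loss(reps_a, reps_b):
    """Mean cosine distance between paired representations.

    reps_a[i] and reps_b[i] are assumed to be the representations of the
    same sample under two different augmentations.
    """
    distances = [cosine_distance(u, v) for u, v in zip(reps_a, reps_b)]
    return sum(distances) / len(distances)

# Hypothetical representations of two samples under two augmentations.
view_1 = [[1.0, 0.0], [0.5, 0.5]]
view_2 = [[1.0, 0.0], [0.5, -0.5]]
loss = consistency_loss(view_1, view_2)
```

Minimizing such a term pulls the two augmented views of each sample toward the same direction in representation space, which is one common way to encourage consistency.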
Image manipulation with StyleGAN has attracted increasing attention in recent years. Recent works have achieved tremendous success in analyzing several semantic latent spaces to edit the attributes of the generated images. However, due to the limited semantic and spatial manipulation precision in these latent spaces, existing approaches fall short on fine-grained StyleGAN image manipulation, i.e., local attribute translation. To address this issue, we discover attribute-specific control units, which consist of multiple channels of feature maps and modulation styles.
The last layer of an FCN is typically a global classifier (1x1 convolution) that assigns each pixel a semantic label.
Ranked #18 on Semantic Segmentation on PASCAL Context
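To illustrate why a 1x1 convolution acts as a global per-pixel classifier, the sketch below applies the same linear classifier to the feature vector at every spatial position and takes the arg-max class. The function name `classify_pixels`, the nested-list tensor layout (height x width x channels), and the toy weights are assumptions chosen for the example.

```python
def classify_pixels(features, weights, biases):
    """Apply one shared linear classifier at every pixel (a 1x1 convolution).

    features: H x W x C nested lists of per-pixel feature vectors.
    weights:  K x C nested lists, one weight vector per class.
    biases:   length-K list, one bias per class.
    Returns an H x W nested list of predicted class indices.
    """
    label_map = []
    for row in features:
        label_row = []
        for feat in row:
            # The same weights are used at every position: this is what
            # makes the classifier "global".
            logits = [
                sum(w * x for w, x in zip(class_w, feat)) + class_b
                for class_w, class_b in zip(weights, biases)
            ]
            label_row.append(max(range(len(logits)), key=logits.__getitem__))
        label_map.append(label_row)
    return label_map

# Toy input: a 1 x 2 feature map with 2 channels and 2 classes.
toy_features = [[[1.0, 0.0], [0.0, 1.0]]]
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
toy_biases = [0.0, 0.0]
toy_labels = classify_pixels(toy_features, toy_weights, toy_biases)
```

Because the weights do not depend on the pixel location or on the image content, every scene is classified against the same fixed class vectors.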
We introduce a lightweight unit, conditional channel weighting, to replace costly pointwise (1x1) convolutions in shuffle blocks.
Ranked #37 on Pose Estimation on COCO test-dev
In this paper, we present a Representative Graph (RepGraph) layer to dynamically sample a few representative features, which dramatically reduces redundancy.
We propose to treat these spatial details and categorical semantics separately to achieve high accuracy and high efficiency for realtime semantic segmentation.
Ranked #1 on Real-Time Semantic Segmentation on COCO-Stuff
Given an input image and corresponding ground truth, Affinity Loss constructs an ideal affinity map to supervise the learning of Context Prior.
Ranked #1 on Scene Understanding on ADE20K val
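The ideal affinity map mentioned above can be sketched as a binary matrix that marks which pixel pairs share a ground-truth class. This is a simplified illustration under stated assumptions: the helper names `ideal_affinity_map` and `binary_cross_entropy` are hypothetical, the label map is assumed to be already downsampled to the feature resolution, and the per-entry binary cross-entropy shown here stands in for the supervision term without claiming to reproduce the full loss.

```python
import math

def ideal_affinity_map(labels):
    """Build an N x N binary map from an H x W label map (N = H * W).

    Entry (i, j) is 1 when pixels i and j belong to the same class.
    """
    flat = [c for row in labels for c in row]
    return [[1 if a == b else 0 for b in flat] for a in flat]

def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy between a predicted map and a target map."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count

# Toy 2 x 2 ground truth with two classes.
toy_gt = [[0, 1],
          [1, 0]]
toy_affinity = ideal_affinity_map(toy_gt)
```

A predicted prior map with values in [0, 1] could then be compared against `toy_affinity` with `binary_cross_entropy`, so that same-class pixel pairs are pushed toward 1 and different-class pairs toward 0.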
For semantic segmentation, most existing real-time deep models are trained on each frame independently and may therefore produce inconsistent results across a video sequence.
Ranked #2 on Video Semantic Segmentation on CamVid
FFU and BFU add the IoU variance to the results of CFU, yielding class-specific foreground and background features, respectively.
Panoptic segmentation, which needs to assign a category label to each pixel and segment each object instance simultaneously, is a challenging topic.
Semantic segmentation requires both rich spatial information and a sizeable receptive field.
Ranked #4 on Semantic Segmentation on SkyScapes-Dense
Most existing semantic segmentation methods still suffer from two challenges: intra-class inconsistency and inter-class indistinction.
Ranked #5 on Semantic Segmentation on PASCAL VOC 2012 test