The first is scene-level swapping which exchanges point cloud sectors of two LiDAR scans that are cut along the azimuth axis.
The recently proposed DEtection TRansformer (DETR) has established a fully end-to-end paradigm for object detection.
We formulate the new setup by a dual detection task which first detects integral text units and then groups them into a CTB.
Specifically, we design cross-frame pseudo labelling to provide pseudo supervision from previous video frames while learning from the augmented current video frames.
Domain adaptive panoptic segmentation aims to mitigate data annotation challenge by leveraging off-the-shelf annotated data in one or multiple related source domains.
We build the balanced subclass distributions by clustering pixels of each original class into multiple subclasses of similar sizes, which provide class-balanced pseudo supervision to regularize the class-biased segmentation.
The convergence of point cloud and DNNs has led to many deep point cloud models, largely trained under the supervision of large-scale and densely-labelled point cloud data.
To this end, we design an innovative historical contrastive learning (HCL) technique that exploits historical source hypothesis to make up for the absence of source data in UMA.
Specifically, we design GenCo, a Generative Co-training network that mitigates the discriminator over-fitting issue by introducing multiple complementary discriminators that provide diverse supervision from multiple distinctive views in training.
This paper presents DA-VSN, a domain adaptive video segmentation network that addresses domain gaps in videos by temporal consistency regularization (TCR) for consecutive frames of target-domain videos.
Extensive experiments show that SynLiDAR provides a high-quality data source for studying 3D transfer and the proposed PCT achieves superior point cloud translation consistently across the three setups.
Generative Adversarial Networks (GANs) have become the de-facto standard in image synthesis.
Second, it introduces multi-view spectral learning that learns useful unsupervised representations by maximizing mutual information among multiple ST-generated spectral views of each target sample.
We position the few labeled target samples as references that gauge the similarity between source and target features and guide adaptive inter-domain alignment for learning more similar source features.
With FAA-generated samples, the training can continue the 'random walk' and drift into an area with a flat loss landscape, leading to more robust domain adaptation.
In this work, we explore the idea of instance contrastive learning in unsupervised domain adaptation (UDA) and propose a novel Category Contrast technique (CaCo) that introduces semantic priors on top of instance discrimination for visual UDA tasks.
The first task focuses on image-to-character (I2C) mapping which detects a set of character candidates from images based on different alignments of visual features in an non-sequential way.
In this work, we adopt a one-stage detector and design DA-DETR, a simple yet effective domain adaptive object detection network that performs inter-domain alignment with a single discriminator.
Recent progresses in domain adaptive semantic segmentation demonstrate the effectiveness of adversarial learning (AL) in unsupervised domain adaptation.
It has been studied widely by domain randomization that transfers source images to different styles in spatial space for learning domain-agnostic features.
The inter-task regularization exploits the complementary nature of instance segmentation and semantic segmentation and uses it as a constraint for better feature alignment across domains.
Specifically, we design a residual dense block with multiple receptive fields as a building block in the encoder which preserves detailed information in each modality and learns hierarchical modality-specific and fused features effectively.
Ranked #15 on 3D Semantic Segmentation on SemanticKITTI
Specifically, we design an uncertainty metric that assesses the alignment of each sample and adjusts the strength of adversarial learning for well-aligned and poorly-aligned samples adaptively.
Recent advances in unsupervised domain adaptation for semantic segmentation have shown great potentials to relieve the demand of expensive per-pixel annotations.
Despite the rapid progress of generative adversarial networks (GANs) in image synthesis in recent years, the existing image synthesis approaches work in either geometry domain or appearance domain alone which often introduces various synthesis artifacts.