In this paper, we employ interpolated generative models to generate OoD samples at training time via data augmentation.
We present Voxel Transformer (VoTr), a novel and effective voxel-based Transformer backbone for 3D object detection from point clouds.
Ranked #2 on 3D Object Detection on Waymo vehicle (L1 mAP metric)
To resolve the problems, we propose a novel second-stage module, named pyramid RoI head, to adaptively learn the features from the sparse points of interest.
Ranked #2 on 3D Object Detection on Waymo vehicle (AP metric)
In this work, we propose robust Neural Architecture Search for OoD generalization (NAS-OoD), which optimizes the architecture with respect to its performance on generated OoD data by gradient descent.
Ranked #1 on Domain Generalization on NICO Vehicle
The classification branch extracts global group priors by learning correlations among image clusters.
Noting the scarcity and low quality (in terms of resolution and scene diversity) of publicly available video crowd datasets, we have collected and built a large-scale video crowd counting dataset, VidCrowd, to contribute to the community.
Single image crowd counting is a challenging computer vision problem with wide applications in public safety, city planning, traffic management, etc.
To address that, we propose DecAug, a novel decomposed feature representation and semantic augmentation approach for OoD generalization.
Designing a general crowd counting algorithm applicable to a wide range of crowd images is challenging, mainly due to potentially large variations in object scale and the presence of many isolated small clusters.