no code implementations • CVPR 2024 • Jianan Li, Qiulei Dong
In this technique, an inter-contrast loss is derived from perturbed pairs of unlabeled point clouds, while an intra-contrast loss is derived from a single unlabeled point cloud.
no code implementations • 25 Dec 2023 • Jiayin Sun, Qiulei Dong
Open-set image recognition (OSR) aims to both classify known-class samples and identify unknown-class samples in the testing set, which supports robust classifiers in many realistic applications such as autonomous driving, medical diagnosis, and security monitoring.
no code implementations • 16 Oct 2023 • Yuzhen Liu, Qiulei Dong
Based on this graph, the confidence-aware rotation averaging module, which is differentiable, is explored to predict the absolute rotations.
no code implementations • 25 Sep 2023 • Jiayin Sun, Hong Wang, Qiulei Dong
Image recognition is a classic and common task in the computer vision field, which has been widely applied in the past decade.
1 code implementation • ICCV 2023 • Zhengming Zhou, Qiulei Dong
Monocular and binocular self-supervised depth estimation are two important and related tasks in computer vision, which aim to predict scene depths from single images and stereo image pairs, respectively.
no code implementations • 14 Jul 2023 • Jiayin Sun, Hong Wang, Qiulei Dong
Open-set image recognition is a challenging topic in computer vision.
no code implementations • CVPR 2023 • Jianan Li, Qiulei Dong
The proposed APF consists of a feature extraction module for extracting point features, a prototypical constraint module, and a feature adversarial module.
no code implementations • CVPR 2023 • Tao Tan, Qiulei Dong
The teacher model contains a backbone estimation module for initial object pose estimation, and an object pose refiner for refining the initial object poses using a geometric constraint (called relative-pose constraint) derived from relative camera poses.
no code implementations • 25 Nov 2022 • Jiayin Sun, Hong Wang, Qiulei Dong
To address this problem, motivated by the temporal attention mechanism in brains, we propose a spatial-temporal attention network for learning fine-grained feature representations, called STAN, where the features learnt by implementing a sequence of spatial self-attention operations corresponding to multiple moments are aggregated progressively.
no code implementations • 23 Sep 2022 • Yuzhen Liu, Qiulei Dong
This teacher-student regularizer constrains the difference between the positive (and likewise the negative) pair similarity from the teacher model and that from the student model, and we theoretically prove that a student model trained by minimizing a weighted combination of the triplet loss and this regularizer could be more effective than its teacher, which is trained by minimizing the triplet loss alone.
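The weighted combination described in this entry can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the cosine similarity, the squared-difference form of the regularizer, the margin, the weight, and all function names are assumptions made for the example.

```python
import numpy as np

def pair_similarity(a, b):
    """Cosine similarity between corresponding rows of a and b (assumed measure)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)

def triplet_loss(sim_pos, sim_neg, margin=0.2):
    """Similarity-based triplet loss: positives should beat negatives by a margin."""
    return float(np.mean(np.maximum(0.0, margin - sim_pos + sim_neg)))

def teacher_student_regularizer(s_pos, s_neg, t_pos, t_neg):
    """Penalize the gap between student and teacher pair similarities.

    The squared difference is an assumption; the entry only states that the
    difference is constrained.
    """
    return float(np.mean((s_pos - t_pos) ** 2 + (s_neg - t_neg) ** 2))

def student_objective(student, teacher, margin=0.2, weight=0.5):
    """Triplet loss of the student plus the weighted teacher-student regularizer.

    `student` and `teacher` are dicts holding "anchor", "positive", and
    "negative" descriptor arrays of shape (N, D) (hypothetical layout).
    """
    s_pos = pair_similarity(student["anchor"], student["positive"])
    s_neg = pair_similarity(student["anchor"], student["negative"])
    t_pos = pair_similarity(teacher["anchor"], teacher["positive"])
    t_neg = pair_similarity(teacher["anchor"], teacher["negative"])
    reg = teacher_student_regularizer(s_pos, s_neg, t_pos, t_neg)
    return triplet_loss(s_pos, s_neg, margin) + weight * reg

# Usage with random descriptors standing in for network outputs.
rng = np.random.default_rng(0)
student_emb = {k: rng.normal(size=(4, 8)) for k in ("anchor", "positive", "negative")}
teacher_emb = {k: rng.normal(size=(4, 8)) for k in ("anchor", "positive", "negative")}
loss = student_objective(student_emb, teacher_emb)
```

When the student reproduces the teacher's pair similarities exactly, the regularizer vanishes and the objective reduces to the plain triplet loss.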
2 code implementations • 15 Sep 2022 • Zhengming Zhou, Qiulei Dong
Addressing this problem, we propose the Self-Distilled Feature Aggregation (SDFA) module for simultaneously aggregating a pair of low-scale and high-scale features and maintaining their contextual consistency.
no code implementations • 13 Jul 2022 • Jiayin Sun, Qiulei Dong
Specifically, at each iteration, a dual-space consistent sampling approach is presented in the explored reliability sampling module for selecting relatively more reliable test samples according to their pseudo labels assigned by a baseline method, which could be an arbitrary inductive OSR method.
no code implementations • 30 Mar 2022 • Bo Liu, Lihua Hu, Qiulei Dong, Zhanyi Hu
How to generate pseudo labels for unseen-class samples and how to use such usually noisy pseudo labels are two critical issues in transductive learning.
2 code implementations • 21 Mar 2022 • Zhengming Zhou, Qiulei Dong
In spite of recent efforts in this field, how to learn accurate scene depths and alleviate the negative influence of occlusions in self-supervised depth estimation remains an open problem.
no code implementations • 17 Mar 2022 • Bo Liu, Qiulei Dong, Zhanyi Hu
Firstly, we propose a Semantic-diversity transfer Network (SetNet) addressing the first two limitations, where 1) a multiple-attention architecture and a diversity regularizer are proposed to learn multiple local visual features that are more consistent with semantic attributes and 2) a projector ensemble that geometrically takes diverse local features as inputs is proposed to model visual-semantic relations from diverse local perspectives.
no code implementations • 16 Jan 2022 • Pinhe Wang, Limin Shi, Bao Chen, Zhanyi Hu, Qiulei Dong, Jianzhong Qiao
How to use multiple optical satellite images to recover the 3D scene structure is a challenging and important problem in the remote sensing field.
no code implementations • 14 Jan 2022 • Bo Liu, Lihua Hu, Zhanyi Hu, Qiulei Dong
This work is a systematic analysis of the so-called hard class problem in zero-shot learning (ZSL), that is, some unseen classes affect ZSL performance disproportionately more than others, as well as of how to remedy the problem by detecting and exploiting hard classes.
no code implementations • 8 Jul 2021 • Shuang Deng, Qiulei Dong, Bo Liu, Zhanyi Hu
The proposed network is iteratively updated with its predicted pseudo labels, where a superpoint generation module is introduced for extracting superpoints from 3D point clouds, and a pseudo-label optimization module is explored for automatically assigning pseudo labels to the unlabeled points under the constraint of the extracted superpoints.
no code implementations • 7 Jul 2021 • Shuang Deng, Qiulei Dong
Addressing this problem, we propose a global attention network for point cloud semantic segmentation, named GA-Net, which consists of a point-independent global attention module and a point-dependent global attention module for obtaining contextual information of 3D point clouds.
1 code implementation • 7 Jul 2021 • Shuang Deng, Bo Liu, Qiulei Dong, Zhanyi Hu
Many recent works show that a spatial manipulation module could boost the performance of deep neural networks (DNNs) for 3D point cloud analysis.
no code implementations • 1 Jul 2021 • Bo Liu, Shuang Deng, Qiulei Dong, Zhanyi Hu
In this work, a language-level Semantics Conditioned framework for 3D Point cloud segmentation, called SeCondPoint, is proposed, where language-level semantics are introduced to condition the modeling of point feature distribution as well as the pseudo-feature generation, and a feature-geometry-based mixup approach is further proposed to facilitate the distribution learning.
1 code implementation • CVPR 2021 • Siqi Fan, Qiulei Dong, Fenghua Zhu, Yisheng Lv, Peijun Ye, Fei-Yue Wang
For each 3D point, the local polar representation block is first explored to construct a spatial representation that is invariant to z-axis rotation; then the dual-distance attentive pooling block is designed to utilize the representations of its neighbors for learning more discriminative local features according to both the geometric and feature distances among them; and finally, the global contextual feature block is designed to learn a global context for each 3D point by utilizing its spatial location and the volume ratio of the neighborhood to the global point cloud.
Ranked #2 on 3D Semantic Segmentation on STPLS3D
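One way a local polar representation can be made invariant to rotation about the z-axis is sketched below. This is an illustrative assumption, not the authors' block: each neighbor offset is written as (distance, azimuth, elevation), and the azimuth is measured relative to the direction from the center point to the centroid of its neighbors. The function name and the choice of reference direction are hypothetical.

```python
import numpy as np

def local_polar_features(center, neighbors):
    """Return (K, 3) features [distance, relative azimuth, elevation].

    center: (3,) array; neighbors: (K, 3) array of neighboring points.
    A rotation about the z-axis shifts every azimuth by the same angle, so
    subtracting the azimuth of a reference direction cancels the shift.
    """
    offsets = neighbors - center
    dist = np.linalg.norm(offsets, axis=1)
    azimuth = np.arctan2(offsets[:, 1], offsets[:, 0])
    elevation = np.arctan2(offsets[:, 2], np.hypot(offsets[:, 0], offsets[:, 1]))
    # Reference direction: center -> centroid of the neighbors (assumed choice).
    ref = neighbors.mean(axis=0) - center
    ref_azimuth = np.arctan2(ref[1], ref[0])
    rel = azimuth - ref_azimuth
    rel = np.arctan2(np.sin(rel), np.cos(rel))  # wrap into [-pi, pi]
    return np.stack([dist, rel, elevation], axis=1)

def rotate_z(points, angle):
    """Rotate points of shape (..., 3) about the z-axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ rz.T

# Usage: features before and after rotating the whole neighborhood about z.
rng = np.random.default_rng(0)
center = rng.normal(size=3)
neighbors = center + rng.normal(size=(8, 3))
feats = local_polar_features(center, neighbors)
feats_rot = local_polar_features(rotate_z(center, 0.7), rotate_z(neighbors, 0.7))
```

The distance and elevation columns are unaffected by the rotation on their own, and the relative azimuth agrees up to angle wrapping, so the two feature arrays describe the same local geometry.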
1 code implementation • CVPR 2021 • Liu Bo, Qiulei Dong, Zhanyi Hu
Addressing this problem, we first empirically analyze the roles of unseen-class samples with different degrees of hardness in the training process based on the uneven prediction phenomenon found in many ZSL methods, resulting in three observations.
no code implementations • 29 Aug 2020 • Bo Liu, Qiulei Dong, Zhanyi Hu
In addition, considering that the visual features from categorization CNNs are generally inconsistent with their semantic features, a simple feature selection strategy is introduced for extracting more compact semantic visual features.
no code implementations • 22 Oct 2019 • Qiulei Dong, Jiayin Sun, Zhanyi Hu
In this work, we investigate this problem by formulating face images as points in a shape-appearance parameter space, and our results demonstrate that: (i) The encoding and decoding of the neuron responses (representations) to face images in CNNs could be achieved under a linear model in the parameter space, in agreement with the recent discovery in primate IT face neurons, but different from the aforementioned perspective on CNNs' face representation with complex image feature encoding; (ii) The linear model for face encoding and decoding in the parameter space could achieve performance close to or even better than that of state-of-the-art CNNs on face recognition and verification, which might shed new light on the design strategies for face recognition systems; (iii) The neuron responses to face images in CNNs could not be adequately modelled by the axis model, a model recently proposed for face modelling in primate IT cortex.
no code implementations • 6 Jun 2019 • Qiulei Dong, Bo Liu, Zhanyi Hu
Recently, the DCNN (Deep Convolutional Neural Network) has been advocated as a general and promising modelling approach for neural object representation in primate inferotemporal cortex.
no code implementations • 12 Dec 2016 • Qiulei Dong, Zhanyi Hu
Lehky et al. (Lehky, 2011) provided a statistical analysis of neural responses to object stimuli in primate AIT cortex.