Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations

no code implementations 14 May 2023 Weiwei Lin, Chenhang He, Man-Wai Mak, Youzhi Tu

Self-supervised learning (SSL) speech models such as wav2vec and HuBERT have demonstrated state-of-the-art performance on automatic speech recognition (ASR) and proved to be extremely useful in low label-resource settings.

Automatic Speech Recognition (ASR) +2

One-to-Few Label Assignment for End-to-End Dense Detection

1 code implementation CVPR 2023 Shuai Li, Minghan Li, Ruihuang Li, Chenhang He, Lei Zhang

The positive and negative weights of these soft anchors are dynamically adjusted during training so that they can contribute more to "representation learning" in the early training stage, and contribute more to "duplicated prediction removal" in the later stage.

Representation Learning

MSF: Motion-guided Sequential Fusion for Efficient 3D Object Detection from Point Cloud Sequences

1 code implementation CVPR 2023 Chenhang He, Ruihuang Li, Yabin Zhang, Shuai Li, Lei Zhang

Current top-performing multi-frame detectors mostly follow a Detect-and-Fuse framework, which extracts features from each frame of the sequence and fuses them to detect the objects in the current frame.

3D Object Detection Autonomous Driving +1

DynaMask: Dynamic Mask Selection for Instance Segmentation

no code implementations CVPR 2023 Ruihuang Li, Chenhang He, Shuai Li, Yabin Zhang, Lei Zhang

Representative instance segmentation methods mostly segment different object instances with a mask of fixed resolution, e.g., a 28×28 grid.

Instance Segmentation Segmentation +1

Masked Surfel Prediction for Self-Supervised Point Cloud Learning

1 code implementation 7 Jul 2022 Yabin Zhang, Jiehong Lin, Chenhang He, Yongwei Chen, Kui Jia, Lei Zhang

In this work, we make the first attempt, to the best of our knowledge, to explicitly incorporate local geometry information into masked auto-encoding, and propose a novel Masked Surfel Prediction (MaskSurf) method.

Point cloud reconstruction Self-Supervised Learning

Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds

1 code implementation CVPR 2022 Chenhang He, Ruihuang Li, Shuai Li, Lei Zhang

VoxSeT is built upon a voxel-based set attention (VSA) module, which reduces the self-attention in each voxel by two cross-attentions and models features in a hidden space induced by a group of latent codes.

3D Object Detection object-detection

A Dual Weighting Label Assignment Scheme for Object Detection

1 code implementation CVPR 2022 Shuai Li, Chenhang He, Ruihuang Li, Lei Zhang

Existing label assignment (LA) methods mostly focus on the design of the positive (pos) weighting function, while the negative (neg) weight is directly derived from the pos weight.

object-detection Object Detection +1

Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth

no code implementations 28 Jul 2021 Chenhang He, Jianqiang Huang, Xian-Sheng Hua, Lei Zhang

Current geometry-based monocular 3D object detection models can efficiently detect objects by leveraging perspective geometry, but their performance is limited due to the absence of accurate depth information.

Depth Estimation Monocular 3D Object Detection +1

Structure Aware Single-Stage 3D Object Detection From Point Cloud

1 code implementation CVPR 2020 Chenhang He, Hui Zeng, Jianqiang Huang, Xian-Sheng Hua, Lei Zhang

The auxiliary network is jointly optimized by two point-level supervisions to guide the convolutional features in the backbone network to be aware of the object structure.

3D Object Detection Autonomous Driving +1

