UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather

no code implementations 8 Apr 2024 Haimei Zhao, Jing Zhang, Zhuo Chen, Shanshan Zhao, DaCheng Tao

We devote UniMix to two main setups: 1) unsupervised domain adaptation, adapting the model from the clear weather source domain to the adverse weather target domain; 2) domain generalization, learning a model that generalizes well to unseen scenes in adverse weather.

Autonomous Driving Domain Generalization +2

Local-consistent Transformation Learning for Rotation-invariant Point Cloud Analysis

1 code implementation 17 Mar 2024 Yiyang Chen, Lunhao Duan, Shanshan Zhao, Changxing Ding, DaCheng Tao

Equipped with LCRF and RPR, our LocoTrans is capable of learning local-consistent transformation and preserving local geometry, which benefits rotation invariance learning.

When ControlNet Meets Inexplicit Masks: A Case Study of ControlNet on its Contour-following Ability

no code implementations 1 Mar 2024 Wenjie Xuan, Yufei Xu, Shanshan Zhao, Chaoyue Wang, Juhua Liu, Bo Du, DaCheng Tao

Subsequently, to enhance controllability with inexplicit masks, an advanced Shape-aware ControlNet consisting of a deterioration estimator and a shape-prior modulation block is devised.

Optical Quantum Sensing for Agnostic Environments via Deep Learning

no code implementations 13 Nov 2023 Zeqiao Zhou, Yuxuan Du, Xu-Fei Yin, Shanshan Zhao, Xinmei Tian, DaCheng Tao

DQS incorporates two essential components: a Graph Neural Network (GNN) predictor and a trigonometric interpolation algorithm.

Hierarchical Point-based Active Learning for Semi-supervised Point Cloud Semantic Segmentation

1 code implementation ICCV 2023 Zongyi Xu, Bo Yuan, Shanshan Zhao, Qianni Zhang, Xinbo Gao

The most recent methods of this kind measure the uncertainty of each pre-divided region for manual labelling, but they suffer from redundant information and require additional effort for region division.

Active Learning Point Cloud Segmentation +2

Cross-modal & Cross-domain Learning for Unsupervised LiDAR Semantic Segmentation

no code implementations 5 Aug 2023 Yiyang Chen, Shanshan Zhao, Changxing Ding, Liyao Tang, Chaoyue Wang, DaCheng Tao

In recent years, cross-modal domain adaptation has been studied on paired 2D image and 3D LiDAR data to reduce the labeling cost of 3D LiDAR semantic segmentation (3DLSS) in the target domain.

Domain Adaptation LIDAR Semantic Segmentation +1

PNT-Edge: Towards Robust Edge Detection with Noisy Labels by Learning Pixel-level Noise Transitions

1 code implementation 26 Jul 2023 Wenjie Xuan, Shanshan Zhao, Yu Yao, Juhua Liu, Tongliang Liu, Yixin Chen, Bo Du, DaCheng Tao

Exploiting the estimated noise transitions, our model, named PNT-Edge, is able to fit the prediction to clean labels.

Edge Detection

DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting

1 code implementation 31 May 2023 Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Tongliang Liu, Bo Du, DaCheng Tao

In this paper, we present DeepSolo++, a simple DETR-like baseline that lets a single decoder with explicit points solo for text detection, recognition, and script identification simultaneously.

Decoder Scene Text Detection +2

Caption Anything: Interactive Image Description with Diverse Multimodal Controls

1 code implementation 4 May 2023 Teng Wang, Jinrui Zhang, Junjie Fei, Hao Zheng, Yunlong Tang, Zhe Li, Mingqi Gao, Shanshan Zhao

Controllable image captioning is an emerging multimodal topic that aims to describe the image with natural language following human purpose, e.g., looking at the specified regions or telling in a particular text style.

controllable image captioning Instruction Following

SimDistill: Simulated Multi-modal Distillation for BEV 3D Object Detection

2 code implementations 29 Mar 2023 Haimei Zhao, Qiming Zhang, Shanshan Zhao, Zhe Chen, Jing Zhang, DaCheng Tao

Multi-view camera-based 3D object detection has become popular due to its low cost, but accurately inferring 3D geometry solely from camera data remains challenging and may lead to inferior performance.

3D Object Detection Knowledge Distillation +1

Adaptive Edge-to-Edge Interaction Learning for Point Cloud Analysis

no code implementations 20 Nov 2022 Shanshan Zhao, Mingming Gong, Xi Li, DaCheng Tao

To explore the role of the relation between edges, this paper proposes a novel Adaptive Edge-to-Edge Interaction Learning module, which aims to enhance the point-to-point relation by adaptively modelling the edge-to-edge interaction in the local region.

Relation Semantic Segmentation

DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

1 code implementation CVPR 2023 Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Tongliang Liu, Bo Du, DaCheng Tao

In this paper, we present DeepSolo, a simple DETR-like baseline that lets a single Decoder with Explicit Points Solo for text detection and recognition simultaneously.

Ranked #1 on Text Spotting on Total-Text (using extra training data)

Decoder Scene Text Detection +3

MetaComp: Learning to Adapt for Online Depth Completion

no code implementations 21 Jul 2022 Yang Chen, Shanshan Zhao, Wei Ji, Mingming Gong, Liping Xie

However, when facing a new environment where the test data arrives online and differs from the training data in RGB image content and depth sparsity, the trained model might suffer a severe performance drop.

Depth Completion Meta-Learning +1

MeshMAE: Masked Autoencoders for 3D Mesh Data Analysis

no code implementations 20 Jul 2022 Yaqian Liang, Shanshan Zhao, Baosheng Yu, Jing Zhang, Fazhi He

We first randomly mask some patches of the mesh and feed the corrupted mesh into Mesh Transformers.

DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer

1 code implementation 10 Jul 2022 Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Bo Du, DaCheng Tao

However, these methods built upon the detection transformer framework might achieve sub-optimal training efficiency and performance due to coarse positional query modeling. In addition, the point label form exploited in previous works implies the reading order of humans, which, from our observation, impedes detection robustness.

Inductive Bias Scene Text Detection +1

Recent Advances for Quantum Neural Networks in Generative Learning

no code implementations 7 Jun 2022 Jinkai Tian, Xiaoyu Sun, Yuxuan Du, Shanshan Zhao, Qing Liu, Kaining Zhang, Wei Yi, Wanrong Huang, Chaoyue Wang, Xingyao Wu, Min-Hsiu Hsieh, Tongliang Liu, Wenjing Yang, DaCheng Tao

Due to the intrinsic probabilistic nature of quantum mechanics, it is reasonable to postulate that quantum generative learning models (QGLMs) may surpass their classical counterparts.

BIG-bench Machine Learning Quantum Machine Learning

Iterative Geometry-Aware Cross Guidance Network for Stereo Image Inpainting

no code implementations 8 May 2022 Ang Li, Shanshan Zhao, Qingjie Zhang, Qiuhong Ke

The IGGNet contains two key ingredients, i.e., a Geometry-Aware Attention (GAA) module and an Iterative Cross Guidance (ICG) strategy.

Image Inpainting

FIBA: Frequency-Injection based Backdoor Attack in Medical Image Analysis

3 code implementations CVPR 2022 Yu Feng, Benteng Ma, Jing Zhang, Shanshan Zhao, Yong Xia, DaCheng Tao

However, designing a unified BA method that can be applied to various MIA systems is challenging due to the diversity of imaging modalities (e.g., X-Ray, CT, and MRI) and analysis tasks (e.g., classification, detection, and segmentation).

Artifact Detection Backdoor Attack +6

Domain Generalization via Entropy Regularization

1 code implementation NeurIPS 2020 Shanshan Zhao, Mingming Gong, Tongliang Liu, Huan Fu, DaCheng Tao

To arrive at this, some methods introduce a domain discriminator through adversarial learning to match the feature distributions in multiple source domains.

Domain Generalization

Adaptive Context-Aware Multi-Modal Network for Depth Completion

1 code implementation 25 Aug 2020 Shanshan Zhao, Mingming Gong, Huan Fu, DaCheng Tao

Furthermore, considering the multi-modality of the input data, we exploit graph propagation on each of the two modalities to extract multi-modal representations.

Depth Completion

Group-wise Deep Co-saliency Detection

no code implementations 24 Jul 2017 Lina Wei, Shanshan Zhao, Omar El Farouk Bourahla, Xi Li, Fei Wu

In this paper, we propose an end-to-end group-wise deep co-saliency detection approach to address the co-salient object discovery problem based on the fully convolutional network (FCN) with group input and group output.

Co-Salient Object Detection Object Discovery +1

Deep Optical Flow Estimation Via Multi-Scale Correspondence Structure Learning

no code implementations 23 Jul 2017 Shanshan Zhao, Xi Li, Omar El Farouk Bourahla

Therefore, a key issue to solve in this area is how to effectively model the multi-scale correspondence structure properties in an adaptive end-to-end learning fashion.

Optical Flow Estimation

