Search Results for author: Shaofei Huang

Found 13 papers, 6 papers with code

Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation

1 code implementation • 9 Mar 2024 • Hairong Shi, Songhao Han, Shaofei Huang, Yue Liao, Guanbin Li, Xiangxing Kong, Hua Zhu, Xiaomu Wang, Si Liu

Considering the inherent differences in tumor lesion segmentation data across various medical imaging modalities and equipment, integrating medical knowledge into the Segment Anything Model (SAM) presents promising capability due to its versatility and generalization potential.

Lesion Segmentation, Segmentation, +1

Transferring CLIP's Knowledge into Zero-Shot Point Cloud Semantic Segmentation

no code implementations • 12 Dec 2023 • Yuanbin Wang, Shaofei Huang, Yulu Gao, Zhen Wang, Rui Wang, Kehua Sheng, Bo Zhang, Si Liu

In this work, we focus on zero-shot point cloud semantic segmentation and propose a simple yet effective baseline to transfer the visual-linguistic knowledge implied in CLIP to a point cloud encoder at both the feature and output levels.

3D Semantic Segmentation, Point Cloud Segmentation, +2

Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training

no code implementations • CVPR 2024 • Runze He, Shaofei Huang, Xuecheng Nie, Tianrui Hui, Luoqi Liu, Jiao Dai, Jizhong Han, Guanbin Li, Si Liu

In this paper, we target the adaptive source driven 3D scene editing task by proposing a CustomNeRF model that unifies a text description or a reference image as the editing prompt.

3D scene Editing

Discovering Sounding Objects by Audio Queries for Audio Visual Segmentation

no code implementations • 18 Sep 2023 • Shaofei Huang, Han Li, Yuqing Wang, Hongji Zhu, Jiao Dai, Jizhong Han, Wenge Rong, Si Liu

Explicit object-level semantic correspondence between audio and visual modalities is established by gathering object information from visual features with predefined audio queries.

Object, Semantic correspondence

Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection

1 code implementation • CVPR 2023 • Shaofei Huang, Zhenwei Shen, Zehao Huang, Zi-han Ding, Jiao Dai, Jizhong Han, Naiyan Wang, Si Liu

An attempt has been made to get rid of BEV and predict 3D lanes from FV representations directly, while it still underperforms other BEV-based methods given its lack of structured representation for 3D lanes.

3D Lane Detection

Cross-Modality Domain Adaptation for Freespace Detection: A Simple yet Effective Baseline

no code implementations • 6 Oct 2022 • Yuanbin Wang, Leyan Zhu, Shaofei Huang, Tianrui Hui, Xiaojie Li, Fei Wang, Si Liu

To better bridge the domain gap between source domain (synthetic data) and target domain (real-world data), we also propose a Selective Feature Alignment (SFA) module which only aligns the features of consistent foreground area between the two domains, thus realizing inter-domain intra-modality adaptation.

Autonomous Driving, Semantic Segmentation, +1

A Keypoint-based Global Association Network for Lane Detection

1 code implementation • CVPR 2022 • Jinsheng Wang, Yinchao Ma, Shaofei Huang, Tianrui Hui, Fei Wang, Chen Qian, Tianzhu Zhang

Earlier works follow a top-down roadmap to regress predefined anchors into various shapes of lane lines, which lacks enough flexibility to fit complex shapes of lanes due to the fixed anchor shapes.

Ranked #4 on Lane Detection on TuSimple (F1 score metric)

Keypoint Estimation, Lane Detection

TransRefer3D: Entity-and-Relation Aware Transformer for Fine-Grained 3D Visual Grounding

no code implementations • 5 Aug 2021 • Dailan He, Yusheng Zhao, Junyu Luo, Tianrui Hui, Shaofei Huang, Aixi Zhang, Si Liu

Existing works usually adopt dynamic graph networks to indirectly model the intra/inter-modal interactions, making it difficult for the model to distinguish the referred object from distractors due to the monolithic representations of visual and linguistic contents.

3D visual grounding, Relation, +1

Cross-Modal Progressive Comprehension for Referring Segmentation

1 code implementation • 15 May 2021 • Si Liu, Tianrui Hui, Shaofei Huang, Yunchao Wei, Bo Li, Guanbin Li

In this paper, we propose a Cross-Modal Progressive Comprehension (CMPC) scheme to effectively mimic human behaviors and implement it as a CMPC-I (Image) module and a CMPC-V (Video) module to improve referring image and video segmentation models.

Attribute, Image Segmentation, +5

Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation

no code implementations • CVPR 2021 • Tianrui Hui, Shaofei Huang, Si Liu, Zihan Ding, Guanbin Li, Wenguan Wang, Jizhong Han, Fei Wang

Though 3D convolutions are amenable to recognizing which actor is performing the queried actions, they also inevitably introduce misaligned spatial information from adjacent frames, which confuses features of the target frame and yields inaccurate segmentation.

Decoder, feature selection, +1

ORDNet: Capturing Omni-Range Dependencies for Scene Parsing

no code implementations • 11 Jan 2021 • Shaofei Huang, Si Liu, Tianrui Hui, Jizhong Han, Bo Li, Jiashi Feng, Shuicheng Yan

Our ORDNet is able to extract more comprehensive context information and adapt well to complex spatial variance in scene images.

Scene Parsing

Referring Image Segmentation via Cross-Modal Progressive Comprehension

1 code implementation • CVPR 2020 • Shaofei Huang, Tianrui Hui, Si Liu, Guanbin Li, Yunchao Wei, Jizhong Han, Luoqi Liu, Bo Li

In addition to the CMPC module, we further leverage a simple yet effective TGFE module to integrate the reasoned multimodal features from different levels with the guidance of textual information.

Attribute, Image Segmentation, +2
