Search Results for author: Haifeng Huang

Found 11 papers, 6 papers with code

Multi-Modal Domain Adaptation Across Video Scenes for Temporal Video Grounding

no code implementations21 Dec 2023 Haifeng Huang, Yang Zhao, Zehan Wang, Yan Xia, Zhou Zhao

Thus, to address this issue and enhance model performance on new scenes, we explore the TVG task in an unsupervised domain adaptation (UDA) setting across scenes for the first time, where the video-query pairs in the source scene (domain) are labeled with temporal boundaries, while those in the target scene are not.

Unsupervised Domain Adaptation Video Grounding

Chat-3D v2: Bridging 3D Scene and Large Language Models with Object Identifiers

2 code implementations13 Dec 2023 Haifeng Huang, Zehan Wang, Rongjie Huang, Luping Liu, Xize Cheng, Yang Zhao, Tao Jin, Zhou Zhao

These tokens capture the object's attributes and spatial relationships with surrounding objects in the 3D scene.

Attribute Object +1

SynFundus-1M: A High-quality Million-scale Synthetic fundus images Dataset with Fifteen Types of Annotation

1 code implementation1 Dec 2023 Fangxin Shang, Jie Fu, Yehui Yang, Haifeng Huang, Junwei Liu, Lei Ma

Large-scale public datasets with high-quality annotations are rarely available for intelligent medical imaging research, due to data privacy concerns and the cost of annotations.

Denoising

Extending Multi-modal Contrastive Representations

1 code implementation13 Oct 2023 Zehan Wang, Ziang Zhang, Luping Liu, Yang Zhao, Haifeng Huang, Tao Jin, Zhou Zhao

Inspired by recent C-MCR, this paper proposes Extending Multimodal Contrastive Representation (Ex-MCR), a training-efficient and paired-data-free method to flexibly learn unified contrastive representation space for more than three modalities by integrating the knowledge of existing MCR spaces.

3D Object Classification Representation Learning +1

Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes

1 code implementation17 Aug 2023 Zehan Wang, Haifeng Huang, Yang Zhao, Ziang Zhang, Zhou Zhao

This paper presents Chat-3D, which combines the 3D visual perceptual ability of pre-trained 3D representations and the impressive reasoning and conversation capabilities of advanced LLMs to achieve the first universal dialogue systems for 3D scenes.

Language Modelling Large Language Model +1

3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding

no code implementations25 Jul 2023 Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao

3D visual grounding aims to localize the target object in a 3D point cloud by a free-form language description.

Object Position +3

Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding

1 code implementation ICCV 2023 Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao

To accomplish this, we design a novel semantic matching model that analyzes the semantic similarity between object proposals and sentences in a coarse-to-fine manner.

Object Semantic Similarity +3

Connecting Multi-modal Contrastive Representations

no code implementations NeurIPS 2023 Zehan Wang, Yang Zhao, Xize Cheng, Haifeng Huang, Jiageng Liu, Li Tang, Linjun Li, Yongqi Wang, Aoxiong Yin, Ziang Zhang, Zhou Zhao

This paper proposes a novel training-efficient method for learning MCR without paired data called Connecting Multi-modal Contrastive Representations (C-MCR).

3D Point Cloud Classification counterfactual +4

Contrastive Centroid Supervision Alleviates Domain Shift in Medical Image Classification

no code implementations31 May 2022 Wenshuo Zhou, Dalu Yang, Binghong Wu, Yehui Yang, Junde Wu, Xiaorong Wang, Lei Wang, Haifeng Huang, Yanwu Xu

Deep learning based medical imaging classification models usually suffer from the domain shift problem, where the classification performance drops when training data and real-world data differ in imaging equipment manufacturer, image acquisition protocol, patient populations, etc.

domain classification Domain Generalization +3

Progressive Hard-case Mining across Pyramid Levels for Object Detection

1 code implementation15 Sep 2021 Binghong Wu, Yehui Yang, Dalu Yang, Junde Wu, Xiaorong Wang, Haifeng Huang, Lei Wang, Yanwu Xu

Based on focal loss with ATSS-R50, our approach achieves 40. 5 AP, surpassing the state-of-the-art QFL (Quality Focal Loss, 39. 9 AP) and VFL (Varifocal Loss, 40. 1 AP).

object-detection Object Detection

Towards Interpretable Clinical Diagnosis with Bayesian Network Ensembles Stacked on Entity-Aware CNNs

no code implementations ACL 2020 Jun Chen, Xiaoya Dai, Quan Yuan, Chao Lu, Haifeng Huang

The automatic text-based diagnosis remains a challenging task for clinical use because it requires appropriate balance between accuracy and interpretability.

Cannot find the paper you are looking for? You can Submit a new open access paper.