Search Results for author: Haoran Duan

Found 30 papers, 15 papers with code

From Gaze to Insight: Bridging Human Visual Attention and Vision Language Model Explanation for Weakly-Supervised Medical Image Segmentation

no code implementations | 15 Apr 2025 | Jingkun Chen, Haoran Duan, Xiao Zhang, Boyan Gao, Tao Tan, Vicente Grau, Jungong Han

To implement this, the teacher model first learns from gaze points enhanced by VLM-generated descriptions of lesion morphology, establishing a foundation for guiding the student model.

Diagnostic Image Segmentation +4

ConsDreamer: Advancing Multi-View Consistency for Zero-Shot Text-to-3D Generation

1 code implementation | 3 Apr 2025 | Yuan Zhou, Shilong Jin, Litao Hua, Wanjun Lv, Haoran Duan, Jungong Han

Recent advances in zero-shot text-to-3D generation have revolutionized 3D content creation by enabling direct synthesis from textual descriptions.

3D Generation Text to 3D

Attention in Diffusion Model: A Survey

no code implementations | 1 Apr 2025 | Litao Hua, Fan Liu, Jie Su, Xingyu Miao, Zizhou Ouyang, Zeyu Wang, Runze Hu, Zhenyu Wen, Bing Zhai, Yang Long, Haoran Duan, Yuan Zhou

Attention mechanisms have become a foundational component in diffusion models, significantly influencing their capacity across a wide range of generative and discriminative tasks.

Diversity model +1

FMDConv: Fast Multi-Attention Dynamic Convolution via Speed-Accuracy Trade-off

no code implementations | 21 Mar 2025 | Tianyu Zhang, Fan Wan, Haoran Duan, Kevin W. Tong, Jingjing Deng, Yang Long

Spatial convolution is fundamental in constructing deep Convolutional Neural Networks (CNNs) for visual recognition.

Edge-computing

Asynchronous Personalized Federated Learning through Global Memorization

no code implementations | 1 Mar 2025 | Fan Wan, Yuchen Li, Xueqi Qiu, Rui Sun, Leyuan Zhang, Xingyu Miao, Tianyu Zhang, Haoran Duan, Yang Long

The proliferation of Internet of Things devices and advances in communication technology have unleashed an explosion of personal data, amplifying privacy concerns amid stringent regulations like GDPR and CCPA.

Memorization Personalized Federated Learning +3

Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields

1 code implementation | 31 Jan 2025 | Xingyu Miao, Haoran Duan, Yang Bai, Tejal Shah, Jun Song, Yang Long, Rajiv Ranjan, Ling Shao

To achieve this, we introduce an adapter module and mitigate the noise issue in the dense CLIP feature distillation process through a self-cross-training strategy.

Segmentation Text Augmentation

Exemplar-condensed Federated Class-incremental Learning

no code implementations | 25 Dec 2024 | Rui Sun, Yumin Zhang, Varun Ojha, Tejal Shah, Haoran Duan, Bo Wei, Rajiv Ranjan

We propose Exemplar-Condensed federated class-incremental learning (ECoral) to distil the training characteristics of real images from streaming data into informative rehearsal exemplars.

class-incremental learning Class Incremental Learning +2

NoiseHGNN: Synthesized Similarity Graph-Based Neural Network For Noised Heterogeneous Graph Representation Learning

1 code implementation | 24 Dec 2024 | Xiong Zhang, Cheng Xie, Haoran Duan, Beibei Yu

For homogeneous graphs, the latest works use original node features to synthesize a similarity graph that can correct the structure of the noised graph.

Graph Learning Graph Neural Network +1

Dynamic Label Adversarial Training for Deep Learning Robustness Against Adversarial Attacks

no code implementations | 23 Aug 2024 | Zhenyu Liu, Haoran Duan, HuiZhi Liang, Yang Long, Vaclav Snasel, Guiseppe Nicosia, Rajiv Ranjan, Varun Ojha

Additionally, we found that a budgeted dimension of inner optimization for the target model may contribute to the trade-off between clean accuracy and robust accuracy.

Prototype Correlation Matching and Class-Relation Reasoning for Few-Shot Medical Image Segmentation

no code implementations | 7 Jun 2024 | Yumin Zhang, Hongliu Li, Yajun Gao, Haoran Duan, Yawen Huang, Yefeng Zheng

Specifically, to address the false pixel correlation matches caused by large intra-class variations, we propose a prototype correlation matching module that mines representative prototypes capable of characterizing the diverse visual information of different appearances.

Image Segmentation Medical Image Segmentation +2

Wearable-based behaviour interpolation for semi-supervised human activity recognition

no code implementations | 24 May 2024 | Haoran Duan, Shidong Wang, Varun Ojha, Shizheng Wang, Yawen Huang, Yang Long, Rajiv Ranjan, Yefeng Zheng

While traditional feature engineering for Human Activity Recognition (HAR) involves a trial-and-error process, deep learning has emerged as a preferred method for high-level representations of sensor-based human activities.

Deep Learning Feature Engineering +1

ExactDreamer: High-Fidelity Text-to-3D Content Creation via Exact Score Matching

1 code implementation | 24 May 2024 | Yumin Zhang, Xingyu Miao, Haoran Duan, Bo Wei, Tejal Shah, Yang Long, Rajiv Ranjan

Furthermore, to effectively capture the dynamic changes of the original and auxiliary variables, the LoRA of a pre-trained diffusion model implements these exact paths.

3D Generation Denoising +1

Rehearsal-free Federated Domain-incremental Learning

no code implementations | 22 May 2024 | Rui Sun, Haoran Duan, Jiahua Dong, Varun Ojha, Tejal Shah, Rajiv Ranjan

A key feature of RefFiL is the generation of local fine-grained prompts by our domain adaptive prompt generator, which effectively learns from local domain knowledge while maintaining distinctive boundaries on a global scale.

Contrastive Learning Federated Learning +1

Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching

1 code implementation | 18 May 2024 | Xingyu Miao, Haoran Duan, Varun Ojha, Jun Song, Tejal Shah, Yang Long, Rajiv Ranjan

In this work, we propose a novel Trajectory Score Matching (TSM) method that aims to solve the pseudo ground truth inconsistency problem caused by the accumulated error in Interval Score Matching (ISM) when using the Denoising Diffusion Implicit Models (DDIM) inversion process.

3D Generation Denoising +1

From Sora What We Can See: A Survey of Text-to-Video Generation

1 code implementation | 17 May 2024 | Rui Sun, Yumin Zhang, Tejal Shah, Jiahao Sun, Shuoying Zhang, Wenqi Li, Haoran Duan, Bo Wei, Rajiv Ranjan

With impressive achievements made, artificial intelligence is on the path forward to artificial general intelligence.

Text-to-Video Generation Video Generation

Sentinel-Guided Zero-Shot Learning: A Collaborative Paradigm without Real Data Exposure

1 code implementation | 14 Mar 2024 | Fan Wan, Xingyu Miao, Haoran Duan, Jingjing Deng, Rui Gao, Yang Long

With increasing concerns over data privacy and model copyrights, especially in the context of collaborations between AI service providers and data owners, an innovative SG-ZSL paradigm is proposed in this work.

Zero-Shot Learning

Pixel Sentence Representation Learning

2 code implementations | 13 Feb 2024 | Chenghao Xiao, Zhuoxu Huang, Danlu Chen, G Thomas Hudson, Yizhi Li, Haoran Duan, Chenghua Lin, Jie Fu, Jungong Han, Noura Al Moubayed

To our knowledge, this is the first representation learning method devoid of traditional language models for understanding sentence and document semantics, marking a stride closer to human-like textual comprehension.

Natural Language Inference Representation Learning +3

ConRF: Zero-shot Stylization of 3D Scenes with Conditioned Radiation Fields

1 code implementation | 2 Feb 2024 | Xingyu Miao, Yang Bai, Haoran Duan, Fan Wan, Yawen Huang, Yang Long, Yefeng Zheng

Most of the existing works on arbitrary 3D NeRF style transfer required retraining on each single style condition.

NeRF Style Transfer

CTNeRF: Cross-Time Transformer for Dynamic Neural Radiance Field from Monocular Video

1 code implementation | 10 Jan 2024 | Xingyu Miao, Yang Bai, Haoran Duan, Yawen Huang, Fan Wan, Yang Long, Yefeng Zheng

The goal of our work is to generate high-quality novel views from monocular videos of complex and dynamic scenes.

NeRF

Dual Feature Augmentation Network for Generalized Zero-shot Learning

1 code implementation | 25 Sep 2023 | Lei Xiang, Yuan Zhou, Haoran Duan, Yang Long

To address these issues, we propose a novel Dual Feature Augmentation Network (DFAN), which comprises two feature augmentation modules, one for visual features and the other for semantic features.

Attribute Diversity +1

UniHead: Unifying Multi-Perception for Detection Heads

1 code implementation | 23 Sep 2023 | Hantao Zhou, Rui Yang, Yachao Zhang, Haoran Duan, Yawen Huang, Runze Hu, Xiu Li, Yefeng Zheng

The detection head constitutes a pivotal component within object detectors, tasked with executing both classification and localization functions.

DS-Depth: Dynamic and Static Depth Estimation via a Fusion Cost Volume

1 code implementation | 14 Aug 2023 | Xingyu Miao, Yang Bai, Haoran Duan, Yawen Huang, Fan Wan, Xinxing Xu, Yang Long, Yefeng Zheng

Nevertheless, the dynamic cost volume inevitably generates extra occlusions and noise, thus we alleviate this by designing a fusion module that makes static and dynamic cost volumes compensate for each other.

Monocular Depth Estimation Optical Flow Estimation +1

Absolute Zero-Shot Learning

1 code implementation | 23 Feb 2022 | Rui Gao, Fan Wan, Daniel Organisciak, Jiyao Pu, Junyan Wang, Haoran Duan, Peng Zhang, Xingsong Hou, Yang Long

Considering the increasing concerns about data copyright and privacy issues, we present a novel Absolute Zero-Shot Learning (AZSL) paradigm, i.e., training a classifier with zero real data.

Transfer Learning Zero-Shot Learning

Semi-Supervised Crowd Counting from Unlabeled Data

no code implementations | 31 Aug 2021 | Haoran Duan, Fan Wan, Rui Sun, Zeyu Wang, Varun Ojha, Yu Guan, Hubert P. H. Shum, Bingzhang Hu, Yang Long

Our method achieved competitive performance among semi-supervised learning approaches on these crowd counting datasets.

Crowd Counting

EfficientTDNN: Efficient Architecture Search for Speaker Recognition

1 code implementation | 25 Mar 2021 | Rui Wang, Zhihua Wei, Haoran Duan, Shouling Ji, Yang Long, Zhen Hong

Compared with hand-designed approaches, neural architecture search (NAS) appears as a practical technique in automating the manual architecture design process and has attracted increasing interest in spoken language processing tasks such as speaker recognition.

Data Augmentation Network Pruning +2

SOFA-Net: Second-Order and First-order Attention Network for Crowd Counting

no code implementations | 9 Aug 2020 | Haoran Duan, Shidong Wang, Yu Guan

To obtain the appropriate crowd representation, in this work we proposed SOFA-Net (Second-Order and First-order Attention Network): second-order statistics were extracted to retain selectivity of the channel-wise spatial information for dense heads, while first-order statistics, which can enhance the feature discrimination for the heads' areas, were used as complementary information.

Crowd Counting
