Search Results for author: Haodong Li

Found 15 papers, 5 papers with code

Dual Capsule Attention Mask Network with Mutual Learning for Visual Question Answering

no code implementations COLING 2022 Weidong Tian, Haodong Li, Zhong-Qiu Zhao

The attention mechanism can highlight fine-grained features with critical information, thus ensuring that feature extraction emphasizes the objects related to the questions.

Question Answering Visual Question Answering

Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation

no code implementations 20 Mar 2025 Jiyuan Wang, Chunyu Lin, Cheng Guan, Lang Nie, Jing He, Haodong Li, Kang Liao, Yao Zhao

In this paper, we propose Jasmine, the first Stable Diffusion (SD)-based self-supervised framework for monocular depth estimation, which effectively harnesses SD's visual priors to enhance the sharpness and generalization of unsupervised prediction.

Image Reconstruction Monocular Depth Estimation +1

DDFAV: Remote Sensing Large Vision Language Models Dataset and Evaluation Benchmark

1 code implementation 5 Nov 2024 Haodong Li, Haicheng Qu, Xiaofeng Zhang

Next, a training instruction set is produced based on some high-quality remote sensing images selected from the proposed dataset.

Data Augmentation Hallucination +1

DOME: Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model

no code implementations 14 Oct 2024 Songen Gu, Wei Yin, Bu Jin, Xiaoyang Guo, Junming Wang, Haodong Li, Qian Zhang, Xiaoxiao Long

The ability of this world model to capture the evolution of the environment is crucial for planning in autonomous driving.

Autonomous Driving model

DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation

no code implementations 2 Oct 2024 Jing He, Haodong Li, Yongzhe Hu, Guibao Shen, Yingjie Cai, Weichao Qiu, Ying-Cong Chen

However, existing methods, both tuning-based and tuning-free, struggle with interpreting the subject-essential attributes from the visual prompt.

Image Generation

Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction

no code implementations 26 Sep 2024 Jing He, Haodong Li, Wei Yin, Yixun Liang, Leheng Li, Kaiqiang Zhou, Hongbo Zhang, Bingbing Liu, Ying-Cong Chen

In this paper, we provide a systematic analysis of the diffusion formulation for dense prediction, focusing on both quality and efficiency.

3D Reconstruction Denoising +4

Bi-TTA: Bidirectional Test-Time Adapter for Remote Physiological Measurement

no code implementations 25 Sep 2024 Haodong Li, Hao Lu, Ying-Cong Chen

To address this, we pioneer the Test-Time Adaptation (TTA) in rPPG, enabling the adaptation of pre-trained models to the target domain during inference, sidestepping the need for annotations or source data due to privacy considerations.

Test-time Adaptation

DASSF: Dynamic-Attention Scale-Sequence Fusion for Aerial Object Detection

no code implementations 18 Jun 2024 Haodong Li, Haicheng Qu

In order to improve the detection accuracy of densely overlapping small targets and fuzzy targets, this paper proposes a dynamic-attention scale-sequence fusion algorithm (DASSF) for small target detection in aerial images.

object-detection Small Object Detection

Digger: Detecting Copyright Content Mis-usage in Large Language Model Training

no code implementations 1 Jan 2024 Haodong Li, Gelei Deng, Yi Liu, Kailong Wang, Yuekang Li, Tianwei Zhang, Yang Liu, Guoai Xu, Guosheng Xu, Haoyu Wang

In this paper, we introduce a detailed framework designed to detect and assess the presence of content from potentially copyrighted books within the training datasets of LLMs.

Language Modeling Language Modelling +2

LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching

1 code implementation CVPR 2024 Yixun Liang, Xin Yang, Jiantao Lin, Haodong Li, Xiaogang Xu, Yingcong Chen

The recent advancements in text-to-3D generation mark a significant milestone in generative models, unlocking new possibilities for creating imaginative 3D assets across various real-world scenarios.

3D Generation Text to 3D

ReLoc: A Restoration-Assisted Framework for Robust Image Tampering Localization

1 code implementation 8 Nov 2022 Peiyu Zhuang, Haodong Li, Rui Yang, Jiwu Huang

The ReLoc framework mainly consists of an image restoration module and a tampering localization module.

Image Restoration

Robust Coordinated Longitudinal Control of MAV Based on Energy State

no code implementations 11 Aug 2022 Chenlong Zhang, Dawei Li, Haodong Li

A fixed-wing Miniature Air Vehicle (MAV) not only has coupled longitudinal motion but is also more susceptible to wind disturbance because of its lighter weight, which makes its altitude and airspeed controller design more challenging.

Detection of Deep Network Generated Images Using Disparities in Color Components

1 code implementation 22 Aug 2018 Haodong Li, Bin Li, Shunquan Tan, Jiwu Huang

In this paper, we address the problem of detecting deep network generated (DNG) images by analyzing the disparities in color components between real scene images and DNG images.

Multimedia

Image Processing Operations Identification via Convolutional Neural Network

no code implementations 9 Sep 2017 Bolin Chen, Haodong Li, Weiqi Luo

Extensive results show that the proposed method can outperform the current best method based on hand-crafted features and three related CNN-based methods for image steganalysis and/or forensics, achieving state-of-the-art results.

Image Forensics Steganalysis
