Search Results for author: Haofan Wang

Found 30 papers, 15 papers with code

EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer

no code implementations • 10 Mar 2025 • Yuxuan Zhang, Yirui Yuan, Yiren Song, Haofan Wang, Jiaming Liu

These innovations collectively make our framework highly efficient, flexible, and suitable for a wide range of tasks.

Computational Efficiency • Image Generation

Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement

1 code implementation • 10 Nov 2024 • Zhennan Chen, Yajie Li, Haofan Wang, Zhibo Chen, Zhengkai Jiang, Jun Li, Qian Wang, Jian Yang, Ying Tai

Regional prompting, or compositional generation, which enables fine-grained spatial control, has gained increasing attention for its practicality in real-world applications.

Attribute • RAG +1

InstantIR: Blind Image Restoration with Instant Generative Reference

no code implementations • 9 Oct 2024 • Jen-Yuan Huang, Haofan Wang, Qixun Wang, Xu Bai, Hao Ai, Peng Xing, Jen-tse Huang

In this paper, we introduce Instant-reference Image Restoration (InstantIR), a novel diffusion-based BIR method that dynamically adjusts the generation condition during inference.

Image Restoration

Image Watermarks are Removable Using Controllable Regeneration from Clean Noise

1 code implementation • 7 Oct 2024 • Yepeng Liu, Yiren Song, Hai Ci, Yu Zhang, Haofan Wang, Mike Zheng Shou, Yuheng Bu

This scheme adds varying numbers of noise steps to the latent representation of the watermarked image, followed by a controlled denoising process starting from this noisy latent representation.

Attribute • Denoising
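The regeneration scheme described above can be illustrated with a toy sketch: noise the watermarked latent forward for a chosen number of DDPM steps, then run a denoising loop back to step zero. The beta schedule and the stand-in denoiser below are illustrative assumptions, not the paper's method; a real attack would use a pretrained diffusion model, optionally with content-preserving controls guiding the denoising.

```python
import numpy as np

def forward_noise(latent, t, alpha_bar):
    """DDPM-style forward process: jump to noise level t in one shot."""
    eps = np.random.randn(*latent.shape)
    return np.sqrt(alpha_bar[t]) * latent + np.sqrt(1.0 - alpha_bar[t]) * eps

def regenerate(latent, t, alpha_bar, denoise_step):
    """Noise the (watermarked) latent to step t, then denoise back to 0."""
    x = forward_noise(latent, t, alpha_bar)
    for step in range(t, 0, -1):
        x = denoise_step(x, step)
    return x

# Toy setup: linear beta schedule; the "denoiser" here merely shrinks the
# sample toward zero -- a stand-in for a pretrained diffusion model.
betas = np.linspace(1e-4, 0.02, 100)
alpha_bar = np.cumprod(1.0 - betas)
toy_denoiser = lambda x, step: x * np.sqrt(1.0 - betas[step - 1])

watermarked = np.random.randn(4, 8, 8)  # stand-in latent tensor
clean = regenerate(watermarked, t=50, alpha_bar=alpha_bar,
                   denoise_step=toy_denoiser)
print(clean.shape)  # (4, 8, 8)
```

Varying `t` trades off watermark removal strength against fidelity to the original image: more noise steps erase the watermark more thoroughly but leave less of the original content for the denoiser to recover.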

Multi-scale Multi-instance Visual Sound Localization and Segmentation

no code implementations • 31 Aug 2024 • Shentong Mo, Haofan Wang

Visual sound localization is a typical and challenging problem that predicts the location of objects corresponding to the sound source in a video.

Object Localization

CSGO: Content-Style Composition in Text-to-Image Generation

no code implementations • 29 Aug 2024 • Peng Xing, Haofan Wang, Yanpeng Sun, Qixun Wang, Xu Bai, Hao Ai, Renyuan Huang, Zechao Li

Based on this pipeline, we construct IMAGStyle, the first large-scale style transfer dataset, containing 210k image triplets, available for the community to explore and study.

Style Transfer • Text-to-Image Generation

InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation

1 code implementation • 30 Jun 2024 • Haofan Wang, Peng Xing, Renyuan Huang, Hao Ai, Qixun Wang, Xu Bai

Style transfer is an inventive process designed to create an image that maintains the essence of the original while embracing the visual style of another.

Style Transfer • Text-to-Image Generation

Unified Video-Language Pre-training with Synchronized Audio

no code implementations • 12 May 2024 • Shentong Mo, Haofan Wang, Huaxia Li, Xu Tang

Video-language pre-training is a typical and challenging problem that aims at learning visual and textual representations from large-scale data in a self-supervised way.

Multimodal Sense-Informed Prediction of 3D Human Motions

no code implementations • 5 May 2024 • Zhenyu Lou, Qiongjie Cui, Haofan Wang, Xu Tang, Hong Zhou

Predicting future human pose is a fundamental application for machine intelligence, which drives robots to plan their behavior and paths ahead of time to seamlessly accomplish human-robot collaboration in real-world 3D scenarios.

motion prediction • Prediction +1

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation

1 code implementation • 3 Apr 2024 • Haofan Wang, Matteo Spinelli, Qixun Wang, Xu Bai, Zekui Qin, Anthony Chen

Tuning-free diffusion-based models have demonstrated significant potential in the realm of image personalization and customization.

Text-to-Image Generation

Multimodal Sense-Informed Forecasting of 3D Human Motions

no code implementations • CVPR 2024 • Zhenyu Lou, Qiongjie Cui, Haofan Wang, Xu Tang, Hong Zhou

To address this limitation, this work introduces a novel multi-modal, sense-informed motion prediction approach that conditions high-fidelity generation on two modalities, the external 3D scene and internal human gaze, and is able to recognize their salience for future human activity.

motion prediction • Trajectory Prediction

Expressive Forecasting of 3D Whole-body Human Motions

1 code implementation • 19 Dec 2023 • Pengxiang Ding, Qiongjie Cui, Min Zhang, Mengyuan Liu, Haofan Wang, Donglin Wang

Human motion forecasting, with the goal of estimating future human behavior over a period of time, is a fundamental task in many real-world applications.

Human Pose Forecasting • Motion Forecasting

Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting

no code implementations • 14 Dec 2023 • Anthony Chen, Huanrui Yang, Yulu Gan, Denis A Gudovskiy, Zhen Dong, Haofan Wang, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Shanghang Zhang

In particular, we build a tree-like Split-Ensemble architecture by performing iterative splitting and pruning from a shared backbone model, where each branch serves as a submodel corresponding to a subtask.
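The iterative splitting idea can be sketched in a few lines: a shared class set is recursively partitioned into subtasks, with each leaf corresponding to one submodel branch. This is only a structural illustration under a simplifying assumption (a plain bisection of the class list); the paper drives splitting and pruning from the shared backbone itself.

```python
def split_tasks(classes, max_leaf):
    """Recursively split a class set into a binary tree of subtasks.

    Hypothetical stand-in for Split-Ensemble's iterative splitting: here we
    simply bisect the list, whereas the paper splits based on the learned
    backbone. Each leaf is one submodel's subtask; classes outside a leaf
    serve as that submodel's OOD examples.
    """
    if len(classes) <= max_leaf:
        return classes  # leaf: one submodel handles these classes
    mid = len(classes) // 2
    return (split_tasks(classes[:mid], max_leaf),
            split_tasks(classes[mid:], max_leaf))

tree = split_tasks(list(range(8)), max_leaf=2)
print(tree)  # (([0, 1], [2, 3]), ([4, 5], [6, 7]))
```

Because every subtask sees the remaining classes as out-of-distribution during training, the ensemble's disagreement across branches doubles as an OOD signal at inference time.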

Synthesizing Physically Plausible Human Motions in 3D Scenes

1 code implementation • 17 Aug 2023 • Liang Pan, Jingbo Wang, Buzhen Huang, Junyu Zhang, Haofan Wang, Xu Tang, Yangang Wang

We present a physics-based character control framework for synthesizing human-scene interactions.

1st Place Solution for PSG competition with ECCV'22 SenseHuman Workshop

2 code implementations • 6 Feb 2023 • Qixun Wang, Xiaofeng Guo, Haofan Wang

Panoptic Scene Graph (PSG) generation aims to generate scene graph representations based on panoptic segmentation instead of rigid bounding boxes.

Multi-class Classification • Panoptic Segmentation +5

Test-time Personalizable Forecasting of 3D Human Poses

no code implementations • ICCV 2023 • Qiongjie Cui, Huaijiang Sun, Jianfeng Lu, Weiqing Li, Bin Li, Hongwei Yi, Haofan Wang

Current motion forecasting approaches typically train a deep end-to-end model from the source domain data, and then apply it directly to target subjects.

Motion Forecasting

LaT: Latent Translation with Cycle-Consistency for Video-Text Retrieval

no code implementations • 11 Jul 2022 • Jinbin Bai, Chunhui Liu, Feiyue Ni, Haofan Wang, Mengying Hu, Xiaofeng Guo, Lele Cheng

To overcome the above issue, we present a novel mechanism for learning the translation relationship from a source modality space $\mathcal{S}$ to a target modality space $\mathcal{T}$ without the need for a joint latent space, which bridges the gap between visual and textual domains.

Representation Learning • Text Retrieval +3

TransAug: Translate as Augmentation for Sentence Embeddings

no code implementations • 30 Oct 2021 • Jue Wang, Haofan Wang, Xing Wu, Chaochen Gao, Debing Zhang

In this paper, we present TransAug (Translate as Augmentation), which provides the first exploration of utilizing translated sentence pairs as data augmentation for text, and introduces a two-stage paradigm to advance state-of-the-art sentence embeddings.

Contrastive Learning • Data Augmentation +4

When Differential Privacy Meets Interpretability: A Case Study

no code implementations • 24 Jun 2021 • Rakshit Naidu, Aman Priyanshu, Aadith Kumar, Sasikanth Kotti, Haofan Wang, FatemehSadat Mireshghallah

Given the increase in the use of personal data for training Deep Neural Networks (DNNs) in tasks such as medical imaging and diagnosis, differentially private training of DNNs is surging in importance, and there is a large body of work focused on providing a better privacy-utility trade-off.

Automatic Speech Verification Spoofing Detection

1 code implementation • 15 Dec 2020 • Shentong Mo, Haofan Wang, Pinxu Ren, Ta-Chung Chi

Automatic speech verification (ASV) is the technology to determine the identity of a person based on their voice.

SS-CAM: Smoothed Score-CAM for Sharper Visual Feature Localization

2 code implementations • 25 Jun 2020 • Haofan Wang, Rakshit Naidu, Joy Michael, Soumya Snigdha Kundu

Interpretation of the underlying mechanisms of Deep Convolutional Neural Networks has become an important aspect of research in the field of deep learning due to their applications in high-risk environments.

Smoothed Geometry for Robust Attribution

1 code implementation • NeurIPS 2020 • Zifan Wang, Haofan Wang, Shakul Ramkumar, Matt Fredrikson, Piotr Mardziel, Anupam Datta

Feature attributions are a popular tool for explaining the behavior of Deep Neural Networks (DNNs), but have recently been shown to be vulnerable to attacks that produce divergent explanations for nearby inputs.

XDeep: An Interpretation Tool for Deep Neural Networks

1 code implementation • 4 Nov 2019 • Fan Yang, Zijian Zhang, Haofan Wang, Yuening Li, Xia Hu

XDeep is an open-source Python package developed to interpret deep models for both practitioners and researchers.

Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks

9 code implementations • 3 Oct 2019 • Haofan Wang, Zifan Wang, Mengnan Du, Fan Yang, Zijian Zhang, Sirui Ding, Piotr Mardziel, Xia Hu

Recently, increasing attention has been drawn to the internal mechanisms of convolutional neural networks, and the reasons why such networks make specific decisions.

Adversarial Attack • Decision Making +2

Contextual Local Explanation for Black Box Classifiers

no code implementations • 2 Oct 2019 • Zijian Zhang, Fan Yang, Haofan Wang, Xia Hu

We introduce CLE, a new model-agnostic explanation technique that explains the prediction of any classifier.

General Classification • Image Classification +1

Hybrid coarse-fine classification for head pose estimation

1 code implementation • 21 Jan 2019 • Haofan Wang, Zhenghua Chen, Yi Zhou

In this paper, to do the estimation without facial landmarks, we combine the coarse and fine regression output together for a deep network.

3D Reconstruction • Classification +6
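One common way to combine a coarse classification head with a fine continuous output, used in HopeNet-style head-pose estimators (the exact combination in this paper may differ), is to take the softmax over discrete angle bins and compute its expectation over the bin centres. The bin layout below is an illustrative assumption.

```python
import numpy as np

def coarse_fine_angle(logits, bin_centers):
    """Coarse-to-fine readout: softmax over angle bins (coarse class
    distribution), then its expectation over bin centres gives a
    continuous, fine-grained angle estimate."""
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    return float(np.dot(probs, bin_centers))

# Illustrative setup: 66 bins of 3 degrees covering [-99, 99).
centers = np.arange(-99, 99, 3) + 1.5
logits = np.zeros(66)
logits[33] = 10.0  # strongly favour the bin centred at 1.5 degrees
angle = coarse_fine_angle(logits, centers)
print(round(angle, 2))  # close to 1.5, the favoured bin centre
```

The classification term keeps training stable (each bin gets a clear target), while the expectation recovers a continuous value, avoiding the quantization error of predicting the bin index alone.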
