Search Results for author: Haofan Wang

Found 19 papers, 11 papers with code

InstantID: Zero-shot Identity-Preserving Generation in Seconds

2 code implementations • 15 Jan 2024 • Qixun Wang, Xu Bai, Haofan Wang, Zekui Qin, Anthony Chen, Huaxia Li, Xu Tang, Yao Hu

There has been significant progress in personalized image synthesis with methods such as Textual Inversion, DreamBooth, and LoRA.

Ranked #2 on Diffusion Personalization Tuning Free on AgeDB

Diffusion Personalization Tuning Free • Image Generation

9,810


Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks

9 code implementations • 3 Oct 2019 • Haofan Wang, Zifan Wang, Mengnan Du, Fan Yang, Zijian Zhang, Sirui Ding, Piotr Mardziel, Xia Hu

Recently, increasing attention has been drawn to the internal mechanisms of convolutional neural networks, and the reason why the network makes specific decisions.

Adversarial Attack • Decision Making • +1

9,444


SS-CAM: Smoothed Score-CAM for Sharper Visual Feature Localization

2 code implementations • 25 Jun 2020 • Haofan Wang, Rakshit Naidu, Joy Michael, Soumya Snigdha Kundu

Interpretation of the underlying mechanisms of Deep Convolutional Neural Networks has become an important aspect of research in the field of deep learning due to their applications in high-risk environments.

1,813


InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation

1 code implementation • 3 Apr 2024 • Haofan Wang, Matteo Spinelli, Qixun Wang, Xu Bai, Zekui Qin, Anthony Chen

Tuning-free diffusion-based models have demonstrated significant potential in the realm of image personalization and customization.

Text-to-Image Generation

1,142


1st Place Solution for PSG competition with ECCV'22 SenseHuman Workshop

2 code implementations • 6 Feb 2023 • Qixun Wang, Xiaofeng Guo, Haofan Wang

Panoptic Scene Graph (PSG) generation aims to generate scene graph representations based on panoptic segmentation instead of rigid bounding boxes.

Multi-class Classification • Panoptic Segmentation • +4

435


Hybrid coarse-fine classification for head pose estimation

1 code implementation • 21 Jan 2019 • Haofan Wang, Zhenghua Chen, Yi Zhou

In this paper, to estimate head pose without facial landmarks, we combine the coarse and fine regression outputs of a deep network.

Ranked #3 on Head Pose Estimation on AFLW

3D Reconstruction • Classification • +6


Synthesizing Physically Plausible Human Motions in 3D Scenes

1 code implementation • 17 Aug 2023 • Liang Pan, Jingbo Wang, Buzhen Huang, Junyu Zhang, Haofan Wang, Xu Tang, Yangang Wang

Experimental results demonstrate that our framework can synthesize physically plausible long-term human motions in complex 3D scenes.


XDeep: An Interpretation Tool for Deep Neural Networks

1 code implementation • 4 Nov 2019 • Fan Yang, Zijian Zhang, Haofan Wang, Yuening Li, Xia Hu

XDeep is an open-source Python package developed to interpret deep models for both practitioners and researchers.


Automatic Speech Verification Spoofing Detection

1 code implementation • 15 Dec 2020 • Shentong Mo, Haofan Wang, Pinxu Ren, Ta-Chung Chi

Automatic speech verification (ASV) is the technology to determine the identity of a person based on their voice.


Expressive Forecasting of 3D Whole-body Human Motions

1 code implementation • 19 Dec 2023 • Pengxiang Ding, Qiongjie Cui, Min Zhang, Mengyuan Liu, Haofan Wang, Donglin Wang

Human motion forecasting, with the goal of estimating future human behavior over a period of time, is a fundamental task in many real-world applications.

Human Pose Forecasting • Motion Forecasting


Smoothed Geometry for Robust Attribution

1 code implementation • NeurIPS 2020 • Zifan Wang, Haofan Wang, Shakul Ramkumar, Matt Fredrikson, Piotr Mardziel, Anupam Datta

Feature attributions are a popular tool for explaining the behavior of Deep Neural Networks (DNNs), but have recently been shown to be vulnerable to attacks that produce divergent explanations for nearby inputs.


Contextual Local Explanation for Black Box Classifiers

no code implementations • 2 Oct 2019 • Zijian Zhang, Fan Yang, Haofan Wang, Xia Hu

We introduce CLE, a new model-agnostic explanation technique that explains the prediction of any classifier.

General Classification • Image Classification


When Differential Privacy Meets Interpretability: A Case Study

no code implementations • 24 Jun 2021 • Rakshit Naidu, Aman Priyanshu, Aadith Kumar, Sasikanth Kotti, Haofan Wang, FatemehSadat Mireshghallah

Given the increasing use of personal data for training Deep Neural Networks (DNNs) in tasks such as medical imaging and diagnosis, differentially private training of DNNs is surging in importance, and a large body of work focuses on providing a better privacy-utility trade-off.


EfficientCLIP: Efficient Cross-Modal Pre-training by Ensemble Confident Learning and Language Modeling

no code implementations • 10 Sep 2021 • Jue Wang, Haofan Wang, Jincan Deng, Weijia Wu, Debing Zhang

Extra rich, non-paired, single-modal text data is used to boost the generalization of the text branch.

Cross-Modal Retrieval • Language Modelling • +4


TransAug: Translate as Augmentation for Sentence Embeddings

no code implementations • 30 Oct 2021 • Jue Wang, Haofan Wang, Xing Wu, Chaochen Gao, Debing Zhang

In this paper, we present TransAug (Translate as Augmentation), which provides the first exploration of utilizing translated sentence pairs as data augmentation for text, and introduce a two-stage paradigm to advance state-of-the-art sentence embeddings.

Contrastive Learning • Data Augmentation • +4


LaT: Latent Translation with Cycle-Consistency for Video-Text Retrieval

no code implementations • 11 Jul 2022 • Jinbin Bai, Chunhui Liu, Feiyue Ni, Haofan Wang, Mengying Hu, Xiaofeng Guo, Lele Cheng

To overcome the above issue, we present a novel mechanism for learning the translation relationship from a source modality space $\mathcal{S}$ to a target modality space $\mathcal{T}$ without the need for a joint latent space, which bridges the gap between visual and textual domains.

Ranked #11 on Zero-Shot Video Retrieval on MSVD

Representation Learning • Retrieval • +4


One-shot Implicit Animatable Avatars with Model-based Priors

no code implementations • ICCV 2023 • Yangyi Huang, Hongwei Yi, Weiyang Liu, Haofan Wang, Boxi Wu, Wenxiao Wang, Binbin Lin, Debing Zhang, Deng Cai

Most of these methods fail to achieve realistic reconstruction when only a single image is available.

Neural Rendering


Test-time Personalizable Forecasting of 3D Human Poses

no code implementations • ICCV 2023 • Qiongjie Cui, Huaijiang Sun, Jianfeng Lu, Weiqing Li, Bin Li, Hongwei Yi, Haofan Wang

Current motion forecasting approaches typically train a deep end-to-end model from the source domain data, and then apply it directly to target subjects.

Motion Forecasting


Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting

no code implementations • 14 Dec 2023 • Anthony Chen, Huanrui Yang, Yulu Gan, Denis A Gudovskiy, Zhen Dong, Haofan Wang, Tomoyuki Okuno, Yohei Nakata, Shanghang Zhang, Kurt Keutzer

In particular, we build a tree-like Split-Ensemble architecture by performing iterative splitting and pruning from a shared backbone model, where each branch serves as a submodel corresponding to a subtask.

