no code implementations • 15 Jan 2025 • Siqi Li, Zhengkai Jiang, Jiawei Zhou, Zhihong Liu, Xiaowei Chi, Haoqian Wang
Virtual try-on has emerged as a pivotal task at the intersection of computer vision and fashion, aimed at digitally simulating how clothing items fit on the human body.
no code implementations • 9 Jan 2025 • Yuhong Zhang, Jing Lin, Ailing Zeng, Guanlin Wu, Shunlin Lu, Yurong Fu, Yuanhao Cai, Ruimao Zhang, Haoqian Wang, Lei Zhang
To address this issue, we develop a scalable annotation pipeline that can automatically capture 3D whole-body human motion and comprehensive textural labels from RGB videos and build the Motion-X dataset comprising 81.1K text-motion pairs.
1 code implementation • 16 Dec 2024 • Zikang Chen, Tao Jiang, Xiaowan Hu, Wang Zhang, Huaqiu Li, Haoqian Wang
This results in suboptimal utilization of both inter-frame and intra-frame information, and it also neglects the potential of optical flow alignment under self-supervised conditions, leading to biased and insufficient denoising outcomes.
no code implementations • 11 Dec 2024 • Kangjie Chen, Bingquan Dai, Minghan Qin, Dongbin Zhang, Peihao Li, Yingshuang Zou, Haoqian Wang
3D semantic field learning is crucial for applications like autonomous navigation, AR/VR, and robotics, where accurate comprehension of 3D scenes from limited viewpoints is essential.
no code implementations • 9 Dec 2024 • Weitao Wang, Haoran Xu, Yuxiao Yang, Zhifang Liu, Jun Meng, Haoqian Wang
Automatic approaches have proven challenging to align with human preferences, and the mixed comparison of text- and image-driven methods often leads to unfair evaluations.
1 code implementation • 4 Dec 2024 • Qinwei Lin, Xiaopeng Sun, Yu Gao, Yujie Zhong, Dengjie Li, Zheng Zhao, Haoqian Wang
Our method enhances the transmission of LR information in the early stages of diffusion to guarantee image fidelity and stimulates the generation ability of the SD model itself more in the later stages to enhance the detail of generated images.
no code implementations • 25 Aug 2024 • Chuanrui Zhang, Yingshuang Zou, Zhuoling Li, Minmin Yi, Haoqian Wang
Especially for the scenes that have many non-overlapping areas between various views and contain numerous similar regions, the matching performance of existing methods is poor and the reconstruction precision is limited.
1 code implementation • 23 Jul 2024 • Xiaowan Hu, Yiyi Chen, Yan Li, Minquan Wang, Haoqian Wang, Quan Chen, Han Li, Peng Jiang
The LPR task encompasses three primary dilemmas in real-world scenarios: 1) recognizing the intended products among distractor products present in the background; 2) video-image heterogeneity, in that the appearance of products showcased in live streams often deviates substantially from standardized product images in stores; 3) the numerous confusing products with subtle visual nuances in the shop.
no code implementations • 9 Jul 2024 • Chuanrui Zhang, Yonggen Ling, Minglei Lu, Minghan Qin, Haoqian Wang
We study the 3D object understanding task for manipulating everyday objects with different material properties (diffuse, specular, transparent and mixed).
no code implementations • 22 May 2024 • Tian Lan, Qinwei Lin, Haoqian Wang
Further, an additional language-extended loop closure module which is based on CLIP feature is designed to continually perform global optimization to correct drift errors accumulated as the system runs.
no code implementations • 3 May 2024 • Yingshuang Zou, Yikang Ding, Xi Qiu, Haoqian Wang, Haotian Zhang
This paper presents a novel self-supervised two-frame multi-camera metric depth estimation network, termed M${^2}$Depth, which is designed to predict reliable scale-aware surrounding depth in autonomous driving.
no code implementations • CVPR 2024 • Yuxiao Liu, Zhe Li, Yebin Liu, Haoqian Wang
To adequately utilize the available image evidence in multi-view video-based avatar modeling, we propose TexVocab, a novel avatar representation that constructs a texture vocabulary and associates body poses with texture maps for animation.
no code implementations • 23 Mar 2024 • Dongbin Zhang, Chuming Wang, Weitao Wang, Peihao Li, Minghan Qin, Haoqian Wang
The photometric variation and transient occluders in those unconstrained images make it difficult to reconstruct the original scene accurately.
no code implementations • 28 Dec 2023 • Yichong Xia, Yujun Huang, Bin Chen, Haoqian Wang, YaoWei Wang
To address this limitation, we propose a Feature-based Fast Cascade Alignment network (FFCA-Net) to fully leverage the side information on the decoder.
1 code implementation • CVPR 2024 • Minghan Qin, Wanhua Li, Jiawei Zhou, Haoqian Wang, Hanspeter Pfister
Humans live in a 3D world and commonly use natural language to interact with a 3D scene.
1 code implementation • 9 Dec 2023 • Junzhe Lu, Jing Lin, Hongkun Dou, Ailing Zeng, Yue Deng, Yulun Zhang, Haoqian Wang
Our approach demonstrates considerable enhancements over common uniform scheduling used in image domains, boasting improvements of 5.4%, 17.2%, and 3.8% across human mesh recovery, pose completion, and motion denoising, respectively.
1 code implementation • 7 Dec 2023 • Shibin Wu, Bang Yang, Zhiyu Ye, Haoqian Wang, Hairong Zheng, Tong Zhang
Medical report generation demands automatic creation of coherent and precise descriptions for medical images.
1 code implementation • 27 Nov 2023 • Yang Liu, Xiang Huang, Minghan Qin, Qinwei Lin, Haoqian Wang
Neural radiance fields are capable of reconstructing high-quality drivable human avatars but are expensive to train and render and not suitable for multi-human scenes with complex shadows.
no code implementations • 10 Oct 2023 • Minghan Qin, Yifan Liu, Yuelang Xu, Xiaochen Zhao, Yebin Liu, Haoqian Wang
One crucial aspect of 3D head avatar reconstruction lies in the details of facial expressions.
no code implementations • 14 Sep 2023 • Yaoyu Su, Shaohui Wang, Haoqian Wang
In this paper, we present the decomposed triplane-hash neural radiance fields (DT-NeRF), a framework that significantly improves the photorealistic rendering of talking faces and achieves state-of-the-art results on key evaluation datasets.
no code implementations • 18 Jul 2023 • Zhuoling Li, Chunrui Han, Zheng Ge, Jinrong Yang, En Yu, Haoqian Wang, Hengshuang Zhao, Xiangyu Zhang
Besides, GroupLane with ResNet18 still surpasses PersFormer by 4.9% F1 score, while the inference speed is nearly 7x faster and the FLOPs is only 13.3% of it.
1 code implementation • NeurIPS 2023 • Jing Lin, Ailing Zeng, Shunlin Lu, Yuanhao Cai, Ruimao Zhang, Haoqian Wang, Lei Zhang
In this paper, we present Motion-X, a large-scale 3D expressive whole-body motion dataset.
2 code implementations • NeurIPS 2023 • Yuanhao Cai, Yuxin Zheng, Jing Lin, Xin Yuan, Yulun Zhang, Haoqian Wang
Finally, our BiSRNet is derived by using the proposed techniques to binarize the base model.
no code implementations • CVPR 2023 • Ruichen Zheng, Peng Li, Haoqian Wang, Tao Yu
Detailed 3D reconstruction and photo-realistic relighting of digital humans are essential for various applications.
1 code implementation • 3 Apr 2023 • Zhuoling Li, Chuanrui Zhang, Wei-Chiu Ma, Yipin Zhou, Linyan Huang, Haoqian Wang, SerNam Lim, Hengshuang Zhao
In recent years, transformer-based detectors have demonstrated remarkable performance in 2D visual perception tasks.
1 code implementation • CVPR 2023 • Jing Lin, Ailing Zeng, Haoqian Wang, Lei Zhang, Yu Li
It is challenging to perform this task with a single network due to resolution issues, i.e., the face and hands are usually located in extremely small regions.
Ranked #3 on 3D Human Pose Estimation on UBody
1 code implementation • 14 Mar 2023 • Haohan Wang, Liang Liu, Boshen Zhang, Jiangning Zhang, Wuhao Zhang, Zhenye Gan, Yabiao Wang, Chengjie Wang, Haoqian Wang
Recent works on sparsely annotated object detection alleviate this problem by generating pseudo labels for the missing annotations.
no code implementations • 14 Mar 2023 • Xiangwen Deng, Yingshuang Zou, Yuanhao Cai, Chendong Zhao, Yang Liu, Zhifang Liu, Yuxiao Liu, Jiawei Zhou, Haoqian Wang
To solve this problem, we propose a novel method, namely Face-guided Dual Style Transfer (FDST).
5 code implementations • ICCV 2023 • Yuanhao Cai, Hao Bian, Jing Lin, Haoqian Wang, Radu Timofte, Yulun Zhang
When enhancing low-light images, many deep learning algorithms are based on the Retinex theory.
Ranked #1 on Low-Light Image Enhancement on SMID
Low-light Image Deblurring and Enhancement
Low-Light Image Enhancement
no code implementations • 11 Mar 2023 • Zhuchen Shao, Liuxi Dai, Yifeng Wang, Haoqian Wang, Yongbing Zhang
Moreover, we highlight AugDiff's higher-quality augmented feature over image augmentation and its superiority over self-supervised learning.
1 code implementation • 10 Mar 2023 • Haohan Wang, Liang Liu, Wuhao Zhang, Jiangning Zhang, Zhenye Gan, Yabiao Wang, Chengjie Wang, Haoqian Wang
Few-shot semantic segmentation aims to learn to segment unseen class objects with the guidance of only a few support images.
Ranked #49 on Few-Shot Semantic Segmentation on COCO-20i (1-shot)
no code implementations • ICCV 2023 • Peihao Li, Shaohui Wang, Chen Yang, Bingbing Liu, Weichao Qiu, Haoqian Wang
Neural radiance fields (NeRF) achieve impressive performance in novel view synthesis when trained on only single sequence data.
no code implementations • ICCV 2023 • Zhuchen Shao, Yifeng Wang, Yang Chen, Hao Bian, Shaohui Liu, Haoqian Wang, Yongbing Zhang
Gigapixel Whole Slide Images (WSIs) aided patient diagnosis and prognosis analysis are promising directions in computational pathology.
no code implementations • ICCV 2023 • Hewei Guo, Liping Ren, Jingjing Fu, Yuwang Wang, Zhizheng Zhang, Cuiling Lan, Haoqian Wang, Xinwen Hou
To detect anomalies of various sizes against complicated normal patterns, we propose a Template-guided Hierarchical Feature Restoration method, which introduces two key techniques, bottleneck compression and template-guided compensation, for anomaly-free feature restoration.
Ranked #17 on Anomaly Detection on MVTec LOCO AD
no code implementations • 10 Nov 2022 • Yifan Liu, YouBao Tang, Ning Zhang, Ruei-Sung Lin, Haoqian Wang
Temporal action localization (TAL) aims to detect the boundary and identify the class of each action instance in a long untrimmed video.
no code implementations • 15 Oct 2022 • Chendong Zhao, Jianzong Wang, Xiaoyang Qu, Haoqian Wang, Jing Xiao
Unsupervised representation learning for speech audio has attained impressive performance on speech recognition tasks, particularly when annotated speech is limited.
no code implementations • 30 Sep 2022 • Chendong Zhao, Jianzong Wang, Wen qi Wei, Xiaoyang Qu, Haoqian Wang, Jing Xiao
For multi-head attention in Transformer ASR, it is not easy to model monotonic alignments in different heads.
Automatic Speech Recognition (ASR)
1 code implementation • 26 Jun 2022 • Hao Bian, Zhuchen Shao, Yang Chen, Yifeng Wang, Haoqian Wang, Jian Zhang, Yongbing Zhang
We achieve state-of-the-art performance on the SICAPv2 dataset, and visual analysis shows accurate instance-level predictions.
no code implementations • 8 Jun 2022 • Zhuoling Li, Chuanrui Zhang, En Yu, Haoqian Wang
(2) Combining depth estimation and 2D object detection is a promising M3OD pre-training baseline.
1 code implementation • 20 May 2022 • Jing Lin, Xiaowan Hu, Yuanhao Cai, Haoqian Wang, Youliang Yan, Xueyi Zou, Yulun Zhang, Luc van Gool
On the other hand, we equip the sequence-to-sequence model with an unsupervised optical flow estimator to maximize its potential.
Ranked #2 on Video Enhancement on MFQE v2
1 code implementation • 20 May 2022 • Yuanhao Cai, Jing Lin, Haoqian Wang, Xin Yuan, Henghui Ding, Yulun Zhang, Radu Timofte, Luc van Gool
In coded aperture snapshot spectral compressive imaging (CASSI) systems, hyperspectral image (HSI) reconstruction methods are employed to recover the spatial-spectral signal from a compressed measurement.
Ranked #1 on Spectral Reconstruction on Real HSI
no code implementations • CVPR 2022 • Zhuoling Li, Zhan Qu, Yang Zhou, Jianzhuang Liu, Haoqian Wang, Lihui Jiang
To tackle this problem, we propose a depth solving system that fully explores the visual clues from the subtasks in M3OD and generates multiple estimations for the depth of each target.
3 code implementations • 17 Apr 2022 • Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Zhang, Hanspeter Pfister, Radu Timofte, Luc van Gool
Existing leading methods for spectral reconstruction (SR) focus on designing deeper or wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB image to its hyperspectral image (HSI).
Ranked #1 on Spectral Reconstruction on ARAD-1K
2 code implementations • NeurIPS 2021 • Yuanhao Cai, Xiaowan Hu, Haoqian Wang, Yulun Zhang, Hanspeter Pfister, Donglai Wei
Additionally, for better noise fitting, we present an efficient architecture Simple Multi-scale Network (SMNet) as the generator.
Ranked #1 on Noise Estimation on SIDD
no code implementations • 4 Apr 2022 • Shiqi Xu, Wenhui Liu, Xi Yang, Joakim Jönsson, Ruobing Qian, Paul McKee, Kanghyun Kim, Pavan Chandra Konda, Kevin C. Zhou, Lucas Kreiß, Haoqian Wang, Edouard Berrocal, Scott Huettel, Roarke Horstmeyer
We evaluate our setup by classifying different spatiotemporal-decorrelating patterns hidden beneath a 5mm tissue-like phantom made with rapidly decorrelating dynamic scattering media.
1 code implementation • 9 Mar 2022 • Yuanhao Cai, Jing Lin, Xiaowan Hu, Haoqian Wang, Xin Yuan, Yulun Zhang, Radu Timofte, Luc van Gool
Many algorithms have been developed to solve the inverse problem of coded aperture snapshot spectral imaging (CASSI), i.e., recovering the 3D hyperspectral images (HSIs) from a 2D compressive measurement.
Ranked #4 on Spectral Reconstruction on Real HSI
2 code implementations • CVPR 2022 • Xiaowan Hu, Yuanhao Cai, Jing Lin, Haoqian Wang, Xin Yuan, Yulun Zhang, Radu Timofte, Luc van Gool
On the one hand, the proposed HR spatial-spectral attention module with its efficient feature fusion provides continuous and fine pixel-level features.
Ranked #7 on Spectral Reconstruction on Real HSI
no code implementations • 4 Mar 2022 • Peng Li, Jiayin Zhao, Jingyao Wu, Chao Deng, Haoqian Wang, Tao Yu
Light field disparity estimation is an essential task in computer vision with various applications.
no code implementations • 21 Feb 2022 • Chendong Zhao, Jianzong Wang, Xiaoyang Qu, Haoqian Wang, Jing Xiao
In this paper, we aim to evaluate and enhance the robustness of G2P models.
Automatic Speech Recognition (ASR)
1 code implementation • 6 Jan 2022 • Jing Lin, Yuanhao Cai, Xiaowan Hu, Haoqian Wang, Youliang Yan, Xueyi Zou, Henghui Ding, Yulun Zhang, Radu Timofte, Luc van Gool
Exploiting similar and sharper scene patches in spatio-temporal neighborhoods is critical for video deblurring.
Ranked #1 on Deblurring on DVD
4 code implementations • CVPR 2022 • Yuanhao Cai, Jing Lin, Xiaowan Hu, Haoqian Wang, Xin Yuan, Yulun Zhang, Radu Timofte, Luc van Gool
The HSI representations are highly similar and correlated across the spectral dimension.
Ranked #2 on Spectral Reconstruction on ARAD-1K
1 code implementation • 16 Aug 2021 • Ran Yu, Chenyu Tian, Weihao Xia, Xinyuan Zhao, Haoqian Wang, Yujiu Yang
To alleviate this problem, we propose a mechanism named Inner Center Sampling to improve the accuracy of instance segmentation.
Ranked #6 on Human Instance Segmentation on OCHuman
1 code implementation • 22 Jul 2021 • Chenyu Tian, Ran Yu, Xinyuan Zhao, Weihao Xia, Haoqian Wang, Yujiu Yang
This simple framework achieves an unprecedented speed and a competitive accuracy on the COCO benchmark compared with state-of-the-art methods.
no code implementations • 3 Jul 2021 • Shiqi Xu, Xi Yang, Wenhui Liu, Joakim Jonsson, Ruobing Qian, Pavan Chandra Konda, Kevin C. Zhou, Lucas Kreiss, Qionghai Dai, Haoqian Wang, Edouard Berrocal, Roarke Horstmeyer
Noninvasive optical imaging through dynamic scattering media has numerous important biomedical applications but still remains a challenging task.
no code implementations • CVPR 2021 • Xiaowan Hu, Ruijun Ma, Zhihong Liu, Yuanhao Cai, Xiaole Zhao, Yulun Zhang, Haoqian Wang
The extraction of auto-correlation in images has shown great potential in deep learning networks, such as the self-attention mechanism in the channel domain and the self-similarity mechanism in the spatial domain.
no code implementations • 24 Feb 2021 • Zhuoling Li, Haohan Wang, Tymoteusz Swistek, Weixin Chen, Yuanzheng Li, Haoqian Wang
Few-shot learning is challenging due to the limited data and labels.
4 code implementations • ECCV 2020 • Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xiangyu Zhang, Xinyu Zhou, Erjin Zhou, Jian Sun
To tackle this problem, we propose an efficient attention mechanism - Pose Refine Machine (PRM) to make a trade-off between local and global representations in output features and further refine the keypoint locations.
Ranked #1 on Keypoint Detection on COCO test-challenge
no code implementations • 20 Aug 2019 • Jun Xu, Zhou Xu, Wangpeng An, Haoqian Wang, David Zhang
In this paper, we propose a novel Non-negative Sparse and Collaborative Representation (NSCR) for pattern classification.
no code implementations • 11 Aug 2019 • Haoqian Wang, Zhiwei Xu, Jun Xu, Wangpeng An, Lei Zhang, Qionghai Dai
There are two main problems in label inference: how to measure the confidence of the unlabeled data and how to generalize the classifier.
1 code implementation • 16 Jun 2019 • Jun Xu, Yingkun Hou, Dongwei Ren, Li Liu, Fan Zhu, Mengyang Yu, Haoqian Wang, Ling Shao
A novel Structure and Texture Aware Retinex (STAR) model is further proposed for illumination and reflectance decomposition of a single image.
2 code implementations • 29 Dec 2018 • Dan Wang, Mengqi Ji, Yong Wang, Haoqian Wang, Lu Fang
Inspired by the conditional integration idea in classical control society, we propose SPI-Optimizer, an integral-Separated PI controller based optimizer WITHOUT introducing extra hyperparameter.
1 code implementation • ECCV 2018 • Haitian Zheng, Mengqi Ji, Haoqian Wang, Yebin Liu, Lu Fang
Reference-based Super-resolution (RefSR) super-resolves a low-resolution (LR) image given an external high-resolution (HR) reference image, where the reference image and the LR image share a similar viewpoint but differ by a significant (8x) resolution gap.
3 code implementations • CVPR 2018 • Wangpeng An, Haoqian Wang, Qingyun Sun, Jun Xu, Qionghai Dai, Lei Zhang
We first reveal the intrinsic connections between SGD-Momentum and PID based controller, then present the optimization algorithm which exploits the past, current, and change of gradients to update the network parameters.
no code implementations • 1 Dec 2015 • Dongsheng An, Jinli Suo, Xiangyang Ji, Haoqian Wang, Qionghai Dai
Specifically, this paper derives a normalized dichromatic model for the pixels with identical diffuse color: a unit circle equation of projection coefficients in two subspaces that are orthogonal to and parallel with the illumination, respectively.