Search Results for author: Haoqian Wang

Found 64 papers, 30 papers with code

RealVVT: Towards Photorealistic Video Virtual Try-on via Spatio-Temporal Consistency

no code implementations • 15 Jan 2025 • Siqi Li, Zhengkai Jiang, Jiawei Zhou, Zhihong Liu, Xiaowei Chi, Haoqian Wang

Virtual try-on has emerged as a pivotal task at the intersection of computer vision and fashion, aimed at digitally simulating how clothing items fit on the human body.

Virtual Try-on

Motion-X++: A Large-Scale Multimodal 3D Whole-body Human Motion Dataset

no code implementations • 9 Jan 2025 • Yuhong Zhang, Jing Lin, Ailing Zeng, Guanlin Wu, Shunlin Lu, Yurong Fu, Yuanhao Cai, Ruimao Zhang, Haoqian Wang, Lei Zhang

To address this issue, we develop a scalable annotation pipeline that can automatically capture 3D whole-body human motion and comprehensive textural labels from RGB videos and build the Motion-X dataset comprising 81.1K text-motion pairs.

Human Mesh Recovery Motion Generation

Spatiotemporal Blind-Spot Network with Calibrated Flow Alignment for Self-Supervised Video Denoising

1 code implementation • 16 Dec 2024 • Zikang Chen, Tao Jiang, Xiaowan Hu, Wang Zhang, Huaqiu Li, Haoqian Wang

This results in suboptimal utilization of both inter-frame and intra-frame information, and it also neglects the potential of optical flow alignment under self-supervised conditions, leading to biased and insufficient denoising outcomes.

Denoising Optical Flow Estimation +1

SLGaussian: Fast Language Gaussian Splatting in Sparse Views

no code implementations • 11 Dec 2024 • Kangjie Chen, Bingquan Dai, Minghan Qin, Dongbin Zhang, Peihao Li, Yingshuang Zou, Haoqian Wang

3D semantic field learning is crucial for applications like autonomous navigation, AR/VR, and robotics, where accurate comprehension of 3D scenes from limited viewpoints is essential.

3DGS Autonomous Navigation +1

MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences

no code implementations • 9 Dec 2024 • Weitao Wang, Haoran Xu, Yuxiao Yang, Zhifang Liu, Jun Meng, Haoqian Wang

Automatic approaches have proven challenging to align with human preferences, and the mixed comparison of text- and image-driven methods often leads to unfair evaluations.

16k

TASR: Timestep-Aware Diffusion Model for Image Super-Resolution

1 code implementation • 4 Dec 2024 • Qinwei Lin, Xiaopeng Sun, Yu Gao, Yujie Zhong, Dengjie Li, Zheng Zhao, Haoqian Wang

Our method enhances the transmission of LR information in the early stages of diffusion to guarantee image fidelity and stimulates the generation ability of the SD model itself more in the later stages to enhance the detail of generated images.

Denoising Image Super-Resolution

TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers

no code implementations • 25 Aug 2024 • Chuanrui Zhang, Yingshuang Zou, Zhuoling Li, Minmin Yi, Haoqian Wang

Especially for the scenes that have many non-overlapping areas between various views and contain numerous similar regions, the matching performance of existing methods is poor and the reconstruction precision is limited.

3DGS 3D Reconstruction +2

Spatiotemporal Graph Guided Multi-modal Network for Livestreaming Product Retrieval

1 code implementation • 23 Jul 2024 • Xiaowan Hu, Yiyi Chen, Yan Li, Minquan Wang, Haoqian Wang, Quan Chen, Han Li, Peng Jiang

The LPR task encompasses three primary dilemmas in real-world scenarios: 1) the recognition of intended products from distractor products present in the background; 2) the video-image heterogeneity that the appearance of products showcased in live streams often deviates substantially from standardized product images in stores; 3) there are numerous confusing products with subtle visual nuances in the shop.

Retrieval

Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images

no code implementations • 9 Jul 2024 • Chuanrui Zhang, Yonggen Ling, Minglei Lu, Minghan Qin, Haoqian Wang

We study the 3D object understanding task for manipulating everyday objects with different material properties (diffuse, specular, transparent and mixed).

Decoder Object +5

Monocular Gaussian SLAM with Language Extended Loop Closure

no code implementations • 22 May 2024 • Tian Lan, Qinwei Lin, Haoqian Wang

Further, an additional language-extended loop closure module which is based on CLIP feature is designed to continually perform global optimization to correct drift errors accumulated as the system runs.

global-optimization Simultaneous Localization and Mapping

M${^2}$Depth: Self-supervised Two-Frame Multi-camera Metric Depth Estimation

no code implementations • 3 May 2024 • Yingshuang Zou, Yikang Ding, Xi Qiu, Haoqian Wang, Haotian Zhang

This paper presents a novel self-supervised two-frame multi-camera metric depth estimation network, termed M${^2}$Depth, which is designed to predict reliable scale-aware surrounding depth in autonomous driving.

Autonomous Driving Depth Estimation

TexVocab: Texture Vocabulary-conditioned Human Avatars

no code implementations • CVPR 2024 • Yuxiao Liu, Zhe Li, Yebin Liu, Haoqian Wang

To adequately utilize the available image evidence in multi-view video-based avatar modeling, we propose TexVocab, a novel avatar representation that constructs a texture vocabulary and associates body poses with texture maps for animation.

Human Dynamics

Gaussian in the Wild: 3D Gaussian Splatting for Unconstrained Image Collections

no code implementations • 23 Mar 2024 • Dongbin Zhang, Chuming Wang, Weitao Wang, Peihao Li, Minghan Qin, Haoqian Wang

The photometric variation and transient occluders in those unconstrained images make it difficult to reconstruct the original scene accurately.

NeRF Novel View Synthesis

FFCA-Net: Stereo Image Compression via Fast Cascade Alignment of Side Information

no code implementations • 28 Dec 2023 • Yichong Xia, Yujun Huang, Bin Chen, Haoqian Wang, YaoWei Wang

To address this limitation, we propose a Feature-based Fast Cascade Alignment network (FFCA-Net) to fully leverage the side information on the decoder.

Data Compression Decoder +2

DPoser: Diffusion Model as Robust 3D Human Pose Prior

1 code implementation • 9 Dec 2023 • Junzhe Lu, Jing Lin, Hongkun Dou, Ailing Zeng, Yue Deng, Yulun Zhang, Haoqian Wang

Our approach demonstrates considerable enhancements over common uniform scheduling used in image domains, boasting improvements of 5.4%, 17.2%, and 3.8% across human mesh recovery, pose completion, and motion denoising, respectively.

Denoising Human Mesh Recovery +1

Animatable 3D Gaussian: Fast and High-Quality Reconstruction of Multiple Human Avatars

1 code implementation • 27 Nov 2023 • Yang Liu, Xiang Huang, Minghan Qin, Qinwei Lin, Haoqian Wang

Neural radiance fields are capable of reconstructing high-quality drivable human avatars but are expensive to train and render and not suitable for multi-human scenes with complex shadows.

Novel View Synthesis

DT-NeRF: Decomposed Triplane-Hash Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis

no code implementations • 14 Sep 2023 • Yaoyu Su, Shaohui Wang, Haoqian Wang

In this paper, we present the decomposed triplane-hash neural radiance fields (DT-NeRF), a framework that significantly improves the photorealistic rendering of talking faces and achieves state-of-the-art results on key evaluation datasets.

NeRF

GroupLane: End-to-End 3D Lane Detection with Channel-wise Grouping

no code implementations • 18 Jul 2023 • Zhuoling Li, Chunrui Han, Zheng Ge, Jinrong Yang, En Yu, Haoqian Wang, Hengshuang Zhao, Xiangyu Zhang

Besides, GroupLane with ResNet18 still surpasses PersFormer by 4.9% F1 score, while the inference speed is nearly 7x faster and the FLOPs is only 13.3% of it.

3D Lane Detection

Binarized Spectral Compressive Imaging

2 code implementations • NeurIPS 2023 • Yuanhao Cai, Yuxin Zheng, Jing Lin, Xin Yuan, Yulun Zhang, Haoqian Wang

Finally, our BiSRNet is derived by using the proposed techniques to binarize the base model.

Binarization

One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer

1 code implementation • CVPR 2023 • Jing Lin, Ailing Zeng, Haoqian Wang, Lei Zhang, Yu Li

It is challenging to perform this task with a single network due to resolution issues, i.e., the face and hands are usually located in extremely small regions.

3D Human Pose Estimation 3D Human Reconstruction +2

Calibrated Teacher for Sparsely Annotated Object Detection

1 code implementation • 14 Mar 2023 • Haohan Wang, Liang Liu, Boshen Zhang, Jiangning Zhang, Wuhao Zhang, Zhenye Gan, Yabiao Wang, Chengjie Wang, Haoqian Wang

Recent works on sparsely annotated object detection alleviate this problem by generating pseudo labels for the missing annotations.

Object object-detection +2

AugDiff: Diffusion based Feature Augmentation for Multiple Instance Learning in Whole Slide Image

no code implementations • 11 Mar 2023 • Zhuchen Shao, Liuxi Dai, Yifeng Wang, Haoqian Wang, Yongbing Zhang

Moreover, we highlight AugDiff's higher-quality augmented feature over image augmentation and its superiority over self-supervised learning.

Diversity Image Augmentation +4

NeRF-MS: Neural Radiance Fields with Multi-Sequence

no code implementations • ICCV 2023 • Peihao Li, Shaohui Wang, Chen Yang, Bingbing Liu, Weichao Qiu, Haoqian Wang

Neural radiance fields (NeRF) achieve impressive performance in novel view synthesis when trained on only single sequence data.

NeRF Novel View Synthesis +1

Template-guided Hierarchical Feature Restoration for Anomaly Detection

no code implementations • ICCV 2023 • Hewei Guo, Liping Ren, Jingjing Fu, Yuwang Wang, Zhizheng Zhang, Cuiling Lan, Haoqian Wang, Xinwen Hou

Targeting for detecting anomalies of various sizes for complicated normal patterns, we propose a Template-guided Hierarchical Feature Restoration method, which introduces two key techniques, bottleneck compression and template-guided compensation, for anomaly-free feature restoration.

Anomaly Detection

Prior-enhanced Temporal Action Localization using Subject-aware Spatial Attention

no code implementations • 10 Nov 2022 • Yifan Liu, YouBao Tang, Ning Zhang, Ruei-Sung Lin, Haoqian Wang

Temporal action localization (TAL) aims to detect the boundary and identify the class of each action instance in a long untrimmed video.

Optical Flow Estimation Temporal Action Localization

Learning Invariant Representation and Risk Minimized for Unsupervised Accent Domain Adaptation

no code implementations • 15 Oct 2022 • Chendong Zhao, Jianzong Wang, Xiaoyang Qu, Haoqian Wang, Jing Xiao

Unsupervised representation learning for speech audios attained impressive performances for speech recognition tasks, particularly when annotated speech is limited.

Domain Adaptation Representation Learning +2

Multiple Instance Learning with Mixed Supervision in Gleason Grading

1 code implementation • 26 Jun 2022 • Hao Bian, Zhuchen Shao, Yang Chen, Yifeng Wang, Haoqian Wang, Jian Zhang, Yongbing Zhang

We achieve the state-of-the-art performance on the SICAPv2 dataset, and the visual analysis shows the accurate prediction results of instance level.

Multiple Instance Learning whole slide images

Degradation-Aware Unfolding Half-Shuffle Transformer for Spectral Compressive Imaging

1 code implementation • 20 May 2022 • Yuanhao Cai, Jing Lin, Haoqian Wang, Xin Yuan, Henghui Ding, Yulun Zhang, Radu Timofte, Luc van Gool

In coded aperture snapshot spectral compressive imaging (CASSI) systems, hyperspectral image (HSI) reconstruction methods are employed to recover the spatial-spectral signal from a compressed measurement.

Compressive Sensing Image Reconstruction +1

Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection

no code implementations • CVPR 2022 • Zhuoling Li, Zhan Qu, Yang Zhou, Jianzhuang Liu, Haoqian Wang, Lihui Jiang

To tackle this problem, we propose a depth solving system that fully explores the visual clues from the subtasks in M3OD and generates multiple estimations for the depth of each target.

Depth Estimation Diversity +3

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction

3 code implementations • 17 Apr 2022 • Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Zhang, Hanspeter Pfister, Radu Timofte, Luc van Gool

Existing leading methods for spectral reconstruction (SR) focus on designing deeper or wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB image to its hyperspectral image (HSI).

Spectral Reconstruction Spectral Super-Resolution

Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction

1 code implementation • 9 Mar 2022 • Yuanhao Cai, Jing Lin, Xiaowan Hu, Haoqian Wang, Xin Yuan, Yulun Zhang, Radu Timofte, Luc van Gool

Many algorithms have been developed to solve the inverse problem of coded aperture snapshot spectral imaging (CASSI), i.e., recovering the 3D hyperspectral images (HSIs) from a 2D compressive measurement.

Compressive Sensing Image Reconstruction +1

Real-time Human-Centric Segmentation for Complex Video Scenes

1 code implementation • 16 Aug 2021 • Ran Yu, Chenyu Tian, Weihao Xia, Xinyuan Zhao, Haoqian Wang, Yujiu Yang

To alleviate this problem, we propose a mechanism named Inner Center Sampling to improve the accuracy of instance segmentation.

Human Instance Segmentation Segmentation +2

PoseDet: Fast Multi-Person Pose Estimation Using Pose Embedding

1 code implementation • 22 Jul 2021 • Chenyu Tian, Ran Yu, Xinyuan Zhao, Weihao Xia, Haoqian Wang, Yujiu Yang

This simple framework achieves an unprecedented speed and a competitive accuracy on the COCO benchmark compared with state-of-the-art methods.

Multi-Person Pose Estimation

Pseudo 3D Auto-Correlation Network for Real Image Denoising

no code implementations • CVPR 2021 • Xiaowan Hu, Ruijun Ma, Zhihong Liu, Yuanhao Cai, Xiaole Zhao, Yulun Zhang, Haoqian Wang

The extraction of auto-correlation in images has shown great potential in deep learning networks, such as the self-attention mechanism in the channel domain and the self-similarity mechanism in the spatial domain.

Image Denoising

Learning Delicate Local Representations for Multi-Person Pose Estimation

4 code implementations • ECCV 2020 • Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xiangyu Zhang, Xinyu Zhou, Erjin Zhou, Jian Sun

To tackle this problem, we propose an efficient attention mechanism - Pose Refine Machine (PRM) to make a trade-off between local and global representations in output features and further refine the keypoint locations.

Keypoint Detection Multi-Person Pose Estimation

Non-negative Sparse and Collaborative Representation for Pattern Classification

no code implementations • 20 Aug 2019 • Jun Xu, Zhou Xu, Wangpeng An, Haoqian Wang, David Zhang

In this paper, we propose a novel Non-negative Sparse and Collaborative Representation (NSCR) for pattern classification.

Classification Face Recognition +1

Semi-Supervised Self-Growing Generative Adversarial Networks for Image Recognition

no code implementations • 11 Aug 2019 • Haoqian Wang, Zhiwei Xu, Jun Xu, Wangpeng An, Lei Zhang, Qionghai Dai

There are two main problems in label inference: how to measure the confidence of the unlabeled data and how to generalize the classifier.

Attribute Generative Adversarial Network

STAR: A Structure and Texture Aware Retinex Model

1 code implementation • 16 Jun 2019 • Jun Xu, Yingkun Hou, Dongwei Ren, Li Liu, Fan Zhu, Mengyang Yu, Haoqian Wang, Ling Shao

A novel Structure and Texture Aware Retinex (STAR) model is further proposed for illumination and reflectance decomposition of a single image.

Low-Light Image Enhancement model

SPI-Optimizer: an integral-Separated PI Controller for Stochastic Optimization

2 code implementations • 29 Dec 2018 • Dan Wang, Mengqi Ji, Yong Wang, Haoqian Wang, Lu Fang

Inspired by the conditional integration idea in classical control society, we propose SPI-Optimizer, an integral-Separated PI controller based optimizer WITHOUT introducing extra hyperparameter.

Stochastic Optimization

CrossNet: An End-to-end Reference-based Super Resolution Network using Cross-scale Warping

1 code implementation • ECCV 2018 • Haitian Zheng, Mengqi Ji, Haoqian Wang, Yebin Liu, Lu Fang

The Reference-based Super-resolution (RefSR) super-resolves a low-resolution (LR) image given an external high-resolution (HR) reference image, where the reference image and LR image share similar viewpoint but with significant resolution gap x8.

Decoder Patch Matching +1

A PID Controller Approach for Stochastic Optimization of Deep Networks

3 code implementations • CVPR 2018 • Wangpeng An, Haoqian Wang, Qingyun Sun, Jun Xu, Qionghai Dai, Lei Zhang

We first reveal the intrinsic connections between SGD-Momentum and PID based controller, then present the optimization algorithm which exploits the past, current, and change of gradients to update the network parameters.

Stochastic Optimization

Fast and High Quality Highlight Removal from A Single Image

no code implementations • 1 Dec 2015 • Dongsheng An, Jinli Suo, Xiangyang Ji, Haoqian Wang, Qionghai Dai

Specifically, this paper derives a normalized dichromatic model for the pixels with identical diffuse color: a unit circle equation of projection coefficients in two subspaces that are orthogonal to and parallel with the illumination, respectively.

Clustering Diversity +2
