no code implementations • 7 Feb 2025 • Runqing Jiang, Ye Zhang, Longguang Wang, Pengpeng Yu, Yulan Guo
Post-training quantization (PTQ) has emerged as a promising solution for reducing the storage and computational cost of vision transformers (ViTs).
no code implementations • 5 Jan 2025 • Minglin Chen, Longguang Wang, Sheng Ao, Ye Zhang, Kai Xu, Yulan Guo
To fully leverage 2D diffusion priors in geometry and appearance generation, we introduce a semantic-guided geometry diffusion model and a semantic-geometry guided diffusion model which are finetuned on a scene dataset.
1 code implementation • 14 Dec 2024 • Qingyu Xu, Longguang Wang, Weidong Sheng, Yingqian Wang, Chao Xiao, Chao Ma, Wei An
Extensive experiments are conducted on VT-Tiny-MOT, and the results have demonstrated the effectiveness of our method.
2 code implementations • Asian Conference on Computer Vision 2024 • Hongda Liu, Longguang Wang, Weijun Guan, Ye Zhang, Yulan Guo
Specifically, for style modeling, we propose a style representation learning scheme to encode the style information into a compact representation.
no code implementations • 26 Nov 2024 • Yukun Wang, Longguang Wang, Zhiyuan Ma, Qibin Hu, Kai Xu, Yulan Guo
Although the typical inversion-then-editing paradigm using text-to-image (T2I) models has demonstrated promising results, directly extending it to text-to-video (T2V) models still suffers from severe artifacts such as color flickering and content distortion.
no code implementations • 25 Sep 2024 • Longguang Wang, Yulan Guo, Juncheng Li, Hongda Liu, Yang Zhao, Yingqian Wang, Zhi Jin, Shuhang Gu, Radu Timofte
This paper summarizes the 3rd NTIRE challenge on stereo image super-resolution (SR) with a focus on new solutions and results.
1 code implementation • 1 Jul 2024 • Hongda Liu, Longguang Wang, Ye Zhang, Kaiwen Xue, Shunbo Zhou, Yulan Guo
In addition, we develop an energy distance loss to facilitate the learning of the degradation representations by introducing a bounded constraint.
1 code implementation • 16 May 2024 • Zhaoxu Li, Wei An, Gaowei Guo, Longguang Wang, Yingqian Wang, Zaiping Lin
Hyperspectral target detection (HTD) aims to identify specific materials based on spectral information in hyperspectral imagery and can detect extremely small objects, some of which occupy less than one pixel.
no code implementations • 8 Apr 2024 • Zhiqi Huang, Huixin Xiong, Haoyu Wang, Longguang Wang, Zhiheng Li
Then, the object images are employed as additional prompts to help the diffusion model better understand the relationship between foreground and background regions during image generation.
1 code implementation • 1 Feb 2024 • Zhuo Su, Jiehua Zhang, Longguang Wang, Hua Zhang, Zhen Liu, Matti Pietikäinen, Li Liu
With PDC and Bi-PDC, we further present two lightweight deep networks named Pixel Difference Networks (PiDiNet) and Binary PiDiNet (Bi-PiDiNet), respectively, to learn highly efficient yet more accurate representations for visual tasks including edge detection and object recognition.
no code implementations • CVPR 2024 • Longguang Wang, Juncheng Li, Yingqian Wang, Qingyong Hu, Yulan Guo
The difficulty of acquiring high-resolution (HR) and low-resolution (LR) image pairs in real scenarios limits the performance of existing learning-based image super-resolution (SR) methods in the real world.
no code implementations • CVPR 2024 • Kunhong Li, Longguang Wang, Ye Zhang, Kaiwen Xue, Shunbo Zhou, Yulan Guo
In this paper we exploit local structure information (LSI) to enhance stereo matching.
no code implementations • 21 Dec 2023 • Peng Zhao, Jiehua Zhang, Bowen Peng, Longguang Wang, YingMei Wei, Yu Liu, Li Liu
2) BNNs consistently exhibit better adversarial robustness under black-box attacks.
1 code implementation • ICCV 2023 • Zhiqiang Shen, Xiaoxiao Sheng, Hehe Fan, Longguang Wang, Yulan Guo, Qiong Liu, Hao Wen, Xi Zhou
In this paper, we propose a Masked Spatio-Temporal Structure Prediction (MaST-Pre) method to capture the structure of point cloud videos without human annotations.
no code implementations • ICCV 2023 • Xiaoxiao Sheng, Zhiqiang Shen, Gang Xiao, Longguang Wang, Yulan Guo, Hehe Fan
Instead of contrasting the representations of clips or frames, in this paper, we propose a unified self-supervised framework by conducting contrastive learning at the point level.
no code implementations • 10 Aug 2023 • Shaocong Liu, Tao Wang, Yan Zhang, Ruqin Zhou, Li Li, Chenguang Dai, Yongsheng Zhang, Longguang Wang, Hanyun Wang
The adjacent points with the same category labels are then clustered together using the Euclidean clustering algorithm to obtain the semantic instances, which are represented by three kinds of attributes including spatial location information, semantic categorical information, and global geometric shape information.
1 code implementation • CVPR 2023 • Zhiqiang Shen, Xiaoxiao Sheng, Longguang Wang, Yulan Guo, Qiong Liu, Xi Zhou
Self-supervised learning can extract representations of good quality from solely unlabeled data, which is appealing for point cloud videos due to their high labelling cost.
1 code implementation • 20 Apr 2023 • Yingqian Wang, Longguang Wang, Zhengyu Liang, Jungang Yang, Radu Timofte, Yulan Guo
In this report, we summarize the first NTIRE challenge on light field (LF) image super-resolution (SR), which aims at super-resolving LF images under the standard bicubic degradation with a magnification factor of 4.
1 code implementation • ICCV 2023 • Boyang Li, Yingqian Wang, Longguang Wang, Fei Zhang, Ting Liu, Zaiping Lin, Wei An, Yulan Guo
The core idea of this work is to recover the per-pixel mask of each target from the given single point label by using clustering approaches, which looks simple but is indeed challenging since targets are rarely salient and are accompanied by background clutter.
1 code implementation • ICCV 2023 • Zhengyu Liang, Yingqian Wang, Longguang Wang, Jungang Yang, Shilin Zhou, Yulan Guo
Exploiting spatial-angular correlation is crucial to light field (LF) image super-resolution (SR), but is highly challenging due to its non-local property caused by the disparities among LF images.
no code implementations • ICCV 2023 • Zhiheng Fu, Longguang Wang, Lian Xu, Zhiyong Wang, Hamid Laga, Yulan Guo, Farid Boussaid, Mohammed Bennamoun
In this paper, we thus propose an unsupervised viewpoint representation learning scheme for 3D point cloud completion without explicit viewpoint estimation.
1 code implementation • 19 Jul 2022 • Xiaoyu Dong, Naoto Yokoya, Longguang Wang, Tatsumi Uezato
Self-supervised cross-modal super-resolution (SR) can overcome the difficulty of acquiring paired training data, but is challenging because only low-resolution (LR) source and high-resolution (HR) guide images from different modalities are available.
3 code implementations • 13 Jun 2022 • Yingqian Wang, Zhengyu Liang, Longguang Wang, Jungang Yang, Wei An, Yulan Guo
In our method, a practical LF degradation model is developed to formulate the degradation process of real LF images.
no code implementations • 20 Apr 2022 • Longguang Wang, Yulan Guo, Yingqian Wang, Juncheng Li, Shuhang Gu, Radu Timofte
In this paper, we summarize the 1st NTIRE challenge on stereo image super-resolution (restoration of rich details in a pair of low-resolution stereo images) with a focus on new solutions and results.
1 code implementation • CVPR 2022 • Yingqian Wang, Longguang Wang, Zhengyu Liang, Jungang Yang, Wei An, Yulan Guo
Based on the proposed cost constructor, we develop a deep network for LF depth estimation.
no code implementations • 22 Feb 2022 • Yingqian Wang, Longguang Wang, Gaochang Wu, Jungang Yang, Wei An, Jingyi Yu, Yulan Guo
In this paper, we propose a generic mechanism to disentangle this coupled information for LF image processing.
1 code implementation • CVPR 2022 • Kunhong Li, Longguang Wang, Li Liu, Qing Ran, Kai Xu, Yulan Guo
Weakly supervised learning can help local feature methods to overcome the obstacle of acquiring a large-scale dataset with densely labeled correspondences.
Ranked #1 on Camera Localization on Aachen Day-Night benchmark
1 code implementation • 4 Jan 2022 • Xinyi Ying, Yingqian Wang, Longguang Wang, Weidong Sheng, Li Liu, Zaiping Lin, Shilin Zhou
Specifically, motivated by the local motion prior in the spatio-temporal dimension, we propose a local spatio-temporal attention module to perform implicit frame alignment and incorporate the local spatio-temporal information to enhance the local features (especially for small targets).
1 code implementation • CVPR 2022 • Longguang Wang, Xiaoyu Dong, Yingqian Wang, Li Liu, Wei An, Yulan Guo
Since a linear quantizer (i.e., round(*) function) cannot fit the bell-shaped distributions of weights and activations well, many existing methods use pre-defined functions (e.g., exponential function) with learnable parameters to build the quantizer for joint optimization.
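The mismatch between a linear quantizer and bell-shaped data can be illustrated with a minimal sketch. This is not the paper's method: the function names, the 4-bit width, the clipping range `x_max`, and the Gaussian test data are all assumptions chosen for illustration. The sketch rounds samples onto evenly spaced levels and counts how many land on each one.

```python
import random
from collections import Counter


def quantize_index(x, num_bits=4, x_max=1.0):
    """Return the integer level index of a uniform symmetric quantizer."""
    levels = 2 ** (num_bits - 1) - 1      # 7 positive levels for 4 bits
    step = x_max / levels                 # constant spacing between levels
    clipped = max(-x_max, min(x_max, x))  # clip to the representable range
    return round(clipped / step)


def linear_quantize(x, num_bits=4, x_max=1.0):
    """Map x to the nearest evenly spaced level (the round-based quantizer)."""
    levels = 2 ** (num_bits - 1) - 1
    return quantize_index(x, num_bits, x_max) * (x_max / levels)


# Hypothetical bell-shaped "weights": zero-mean Gaussian samples.
random.seed(0)
samples = [random.gauss(0.0, 0.25) for _ in range(10_000)]
counts = Counter(quantize_index(s) for s in samples)

# Print how many samples fall on each level, from most negative to most positive.
for level in sorted(counts):
    print(level, counts[level])
```

With data of this shape, the levels near zero typically receive most of the samples while the outer levels are rarely used, which is the inefficiency that non-uniform quantizers aim to reduce.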
1 code implementation • 29 Sep 2021 • Juncheng Li, Zehua Pei, Wenjie Li, Guangwei Gao, Longguang Wang, Yingqian Wang, Tieyong Zeng
This is an exhaustive survey of SISR, which can help researchers better understand SISR and inspire more exciting research in this field.
1 code implementation • 17 Aug 2021 • Zhengyu Liang, Yingqian Wang, Longguang Wang, Jungang Yang, Shilin Zhou
With the proposed angular and spatial Transformers, the beneficial information in an LF can be fully exploited and the SR performance is boosted.
1 code implementation • 1 Jun 2021 • Boyang Li, Chao Xiao, Longguang Wang, Yingqian Wang, Zaiping Lin, Miao Li, Wei An, Yulan Guo
With the repeated interaction in DNIM, infrared small targets in deep layers can be maintained.
2 code implementations • CVPR 2021 • Longguang Wang, Yingqian Wang, Xiaoyu Dong, Qingyu Xu, Jungang Yang, Wei An, Yulan Guo
In this paper, we propose an unsupervised degradation representation learning scheme for blind SR without explicit degradation estimation.
1 code implementation • 7 Nov 2020 • Yingqian Wang, Xinyi Ying, Longguang Wang, Jungang Yang, Wei An, Yulan Guo
Although recent years have witnessed great advances in stereo image super-resolution (SR), the beneficial information provided by binocular systems has not been fully used.
2 code implementations • 16 Sep 2020 • Longguang Wang, Yulan Guo, Yingqian Wang, Zhengfa Liang, Zaiping Lin, Jungang Yang, Wei An
Based on our PAM, we propose a parallax-attention stereo matching network (PASMnet) and a parallax-attention stereo image super-resolution network (PASSRnet) for stereo matching and stereo image super-resolution tasks.
1 code implementation • 7 Jul 2020 • Yingqian Wang, Jungang Yang, Longguang Wang, Xinyi Ying, Tianhao Wu, Wei An, Yulan Guo
In this paper, we propose a deformable convolution network (i.e., LF-DFnet) to handle the disparity problem for LF image SR.
1 code implementation • CVPR 2021 • Longguang Wang, Xiaoyu Dong, Yingqian Wang, Xinyi Ying, Zaiping Lin, Wei An, Yulan Guo
Specifically, we develop a Sparse Mask SR (SMSR) network to learn sparse masks to prune redundant computation.
2 code implementations • ICCV 2021 • Longguang Wang, Yingqian Wang, Zaiping Lin, Jungang Yang, Wei An, Yulan Guo
In this paper, we propose to learn a scale-arbitrary image SR network from scale-specific networks.
1 code implementation • 6 Apr 2020 • Xinyi Ying, Longguang Wang, Yingqian Wang, Weidong Sheng, Wei An, Yulan Guo
In this paper, we propose a deformable 3D convolution network (D3Dnet) to incorporate spatio-temporal information from both spatial and temporal dimensions for video SR.
2 code implementations • 6 Jan 2020 • Longguang Wang, Yulan Guo, Li Liu, Zaiping Lin, Xinpu Deng, Wei An
The key challenge for video SR lies in the effective exploitation of temporal dependency between consecutive frames.
Ranked #6 on Video Super-Resolution on MSU Super-Resolution for Video Compression (BSQ-rate over ERQA metric)
1 code implementation • 17 Dec 2019 • Yingqian Wang, Longguang Wang, Jungang Yang, Wei An, Jingyi Yu, Yulan Guo
Specifically, spatial and angular features are first separately extracted from input LFs, and then repetitively interacted to progressively incorporate spatial and angular information.
1 code implementation • 10 Dec 2019 • Yingqian Wang, Tianhao Wu, Jungang Yang, Longguang Wang, Wei An, Yulan Guo
In this paper, we handle the LF de-occlusion (LF-DeOcc) problem using a deep encoder-decoder network (namely, DeOccNet).
no code implementations • 15 Mar 2019 • Yingqian Wang, Longguang Wang, Jungang Yang, Wei An, Yulan Guo
With the popularity of dual cameras in recently released smart phones, a growing number of super-resolution (SR) methods have been proposed to enhance the resolution of stereo image pairs.
1 code implementation • CVPR 2019 • Longguang Wang, Yingqian Wang, Zhengfa Liang, Zaiping Lin, Jungang Yang, Wei An, Yulan Guo
Stereo image pairs can be used to improve the performance of super-resolution (SR) since additional information is provided from a second viewpoint.
Ranked #1 on Image Super-Resolution on KITTI 2012 - 4x upscaling
2 code implementations • 23 Sep 2018 • Longguang Wang, Yulan Guo, Zaiping Lin, Xinpu Deng, Wei An
Extensive experiments demonstrate that HR optical flows provide more accurate correspondences than their LR counterparts and improve both accuracy and consistency performance.
Ranked #18 on Video Super-Resolution on Vid4 - 4x upscaling
no code implementations • 23 Aug 2017 • Longguang Wang, Zaiping Lin, Jinyan Gao, Xinpu Deng, Wei An
Single image super-resolution aims to generate a high-resolution image from a single low-resolution image, which is of great significance in extensive applications.
no code implementations • 20 Jun 2017 • Longguang Wang, Zaiping Lin, Xinpu Deng, Wei An
In this paper, we propose an end-to-end fast upscaling technique to replace the interpolation operator, design upscaling filters in LR space for periodic sub-locations respectively and shuffle the filter results to derive the final reconstruction errors in HR space.
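The shuffling step can be sketched as follows. This is a minimal illustration of periodic shuffling in general, not the authors' implementation: the function name `periodic_shuffle`, the nested-list image format, and the row-major ordering of sub-locations are assumptions. Each of the r*r filters is assumed to produce one LR-sized output for one periodic sub-location, and the outputs are interleaved into an HR grid.

```python
def periodic_shuffle(sub_images, r):
    """Interleave r*r LR-sized filter outputs into one HR image.

    sub_images[k] is the output for sub-location k = dy * r + dx, where
    (dy, dx) is the offset inside each r-by-r HR block (assumed ordering).
    """
    if len(sub_images) != r * r:
        raise ValueError("expected r*r sub-images")
    h = len(sub_images[0])
    w = len(sub_images[0][0])
    return [
        [
            # HR pixel (y, x) comes from LR pixel (y // r, x // r) of the
            # filter assigned to sub-location (y % r, x % r).
            sub_images[(y % r) * r + (x % r)][y // r][x // r]
            for x in range(w * r)
        ]
        for y in range(h * r)
    ]


# Usage: four 1x2 filter outputs combined into a 2x4 HR image (r = 2).
outputs = [[[0, 4]], [[1, 5]], [[2, 6]], [[3, 7]]]
hr = periodic_shuffle(outputs, 2)
print(hr)
```

Because every filter runs at LR resolution, the upscaling cost is paid only in the final interleaving, which is a pure index rearrangement.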