Search Results for author: Yulan Guo

Found 73 papers, 47 papers with code

Learning Hierarchical Color Guidance for Depth Map Super-Resolution

no code implementations 12 Mar 2024 Runmin Cong, Ronghui Sheng, Hao Wu, Yulan Guo, Yunchao Wei, WangMeng Zuo, Yao Zhao, Sam Kwong

On the one hand, the low-level detail embedding module is designed to supplement high-frequency color information of depth features in a residual mask manner at the low-level stages.

Depth Map Super-Resolution

Sports-QA: A Large-Scale Video Question Answering Benchmark for Complex and Professional Sports

1 code implementation 3 Jan 2024 Haopeng Li, Andong Deng, Qiuhong Ke, Jun Liu, Hossein Rahmani, Yulan Guo, Bernt Schiele, Chen Chen

Reasoning over sports videos for question answering is an important task with numerous applications, such as player training and information retrieval.

Action Understanding counterfactual +4

Point Contrastive Prediction with Semantic Clustering for Self-Supervised Learning on Point Cloud Videos

no code implementations ICCV 2023 Xiaoxiao Sheng, Zhiqiang Shen, Gang Xiao, Longguang Wang, Yulan Guo, Hehe Fan

Instead of contrasting the representations of clips or frames, in this paper, we propose a unified self-supervised framework by conducting contrastive learning at the point level.

Contrastive Learning Representation Learning +1

2D3D-MATR: 2D-3D Matching Transformer for Detection-free Registration between Images and Point Clouds

1 code implementation ICCV 2023 Minhao Li, Zheng Qin, Zhirui Gao, Renjiao Yi, Chenyang Zhu, Yulan Guo, Kai Xu

The commonly adopted detect-then-match approach to registration encounters difficulties in cross-modality cases due to incompatible keypoint detection and inconsistent feature description.

Keypoint Detection Patch Matching

Variational Probabilistic Fusion Network for RGB-T Semantic Segmentation

no code implementations 17 Jul 2023 Baihong Lin, Zengrong Lin, Yulan Guo, Yulan Zhang, Jianxiao Zou, Shicai Fan

RGB-T semantic segmentation has been widely adopted to handle hard scenes with poor lighting conditions by fusing different modality features of RGB and thermal images.

Segmentation Semantic Segmentation +1

PointCMP: Contrastive Mask Prediction for Self-supervised Learning on Point Cloud Videos

1 code implementation CVPR 2023 Zhiqiang Shen, Xiaoxiao Sheng, Longguang Wang, Yulan Guo, Qiong Liu, Xi Zhou

Self-supervised learning can extract representations of good quality from solely unlabeled data, which is appealing for point cloud videos due to their high labelling cost.

Self-Supervised Learning Transfer Learning

NTIRE 2023 Challenge on Light Field Image Super-Resolution: Dataset, Methods and Results

1 code implementation 20 Apr 2023 Yingqian Wang, Longguang Wang, Zhengyu Liang, Jungang Yang, Radu Timofte, Yulan Guo

In this report, we summarize the first NTIRE challenge on light field (LF) image super-resolution (SR), which aims at super-resolving LF images under the standard bicubic degradation with a magnification factor of 4.

Image Super-Resolution

Monte Carlo Linear Clustering with Single-Point Supervision is Enough for Infrared Small Target Detection

1 code implementation ICCV 2023 Boyang Li, Yingqian Wang, Longguang Wang, Fei Zhang, Ting Liu, Zaiping Lin, Wei An, Yulan Guo

The core idea of this work is to recover the per-pixel mask of each target from the given single point label by using clustering approaches, which looks simple but is indeed challenging since targets are always non-salient and accompanied by background clutter.

Clustering

Robust Multiview Point Cloud Registration with Reliable Pose Graph Initialization and History Reweighting

1 code implementation CVPR 2023 Haiping Wang, YuAn Liu, Zhen Dong, Yulan Guo, Yu-Shen Liu, Wenping Wang, Bisheng Yang

Previous multiview registration methods rely on exhaustive pairwise registration to construct a densely-connected pose graph and apply Iteratively Reweighted Least Squares (IRLS) on the pose graph to compute the scan poses.

Point Cloud Registration

Semi-Weakly Supervised Object Kinematic Motion Prediction

no code implementations CVPR 2023 Gengxin Liu, Qian Sun, Haibin Huang, Chongyang Ma, Yulan Guo, Li Yi, Hui Huang, Ruizhen Hu

First, although 3D datasets with fully annotated motion labels are limited, there are existing datasets and methods for object part semantic segmentation at large scale.

motion prediction Object +3

Pseudo-label Correction and Learning For Semi-Supervised Object Detection

no code implementations 6 Mar 2023 Yulin He, Wei Chen, Ke Liang, Yusong Tan, Zhengfa Liang, Yulan Guo

Our proposed method, Pseudo-label Correction and Learning (PCL), is extensively evaluated on the MS COCO and PASCAL VOC benchmarks.

object-detection Object Detection +2

Learning Non-Local Spatial-Angular Correlation for Light Field Image Super-Resolution

1 code implementation ICCV 2023 Zhengyu Liang, Yingqian Wang, Longguang Wang, Jungang Yang, Shilin Zhou, Yulan Guo

Exploiting spatial-angular correlation is crucial to light field (LF) image super-resolution (SR), but is highly challenging due to its non-local property caused by the disparities among LF images.

Image Super-Resolution

BUFFER: Balancing Accuracy, Efficiency, and Generalizability in Point Cloud Registration

1 code implementation CVPR 2023 Sheng Ao, Qingyong Hu, Hanyun Wang, Kai Xu, Yulan Guo

Extensive experiments on real-world scenarios demonstrate that our method achieves the best of both worlds in accuracy, efficiency, and generalization.

Computational Efficiency Open-Ended Question Answering +1

3D Spatial Multimodal Knowledge Accumulation for Scene Graph Prediction in Point Cloud

no code implementations CVPR 2023 Mingtao Feng, Haoran Hou, Liang Zhang, Zijie Wu, Yulan Guo, Ajmal Mian

In-depth understanding of a 3D scene not only involves locating/recognizing individual objects, but also requires inferring the relationships and interactions among them.

Context-Aware Alignment and Mutual Masking for 3D-Language Pre-Training

1 code implementation CVPR 2023 Zhao Jin, Munawar Hayat, Yuwei Yang, Yulan Guo, Yinjie Lei

The current approaches for 3D visual reasoning are task-specific, and lack pre-training methods to learn generic representations that can transfer across various tasks.

3D dense captioning Dense Captioning +3

VAPCNet: Viewpoint-Aware 3D Point Cloud Completion

no code implementations ICCV 2023 Zhiheng Fu, Longguang Wang, Lian Xu, Zhiyong Wang, Hamid Laga, Yulan Guo, Farid Boussaid, Mohammed Bennamoun

In this paper, we thus propose an unsupervised viewpoint representation learning scheme for 3D point cloud completion without explicit viewpoint estimation.

Point Cloud Completion Representation Learning +1

Bridging the Domain Gap in Satellite Pose Estimation: a Self-Training Approach based on Geometrical Constraints

no code implementations 23 Dec 2022 Zi Wang, Minglin Chen, Yulan Guo, Zhang Li, Qifeng Yu

Recently, unsupervised domain adaptation in satellite pose estimation has gained increasing attention, aiming at alleviating the annotation cost for training deep models.

Pose Estimation Pseudo Label +1

MTU-Net: Multi-level TransUNet for Space-based Infrared Tiny Ship Detection

1 code implementation 28 Sep 2022 Tianhao Wu, Boyang Li, Yihang Luo, Yingqian Wang, Chao Xiao, Ting Liu, Jungang Yang, Wei An, Yulan Guo

Due to the extremely large image coverage area (e.g., thousands of square kilometers), candidate targets in these images are much smaller, dimmer, and more changeable than those targets observed by aerial-based and land-based imaging devices.

Data Augmentation

Real-World Light Field Image Super-Resolution via Degradation Modulation

3 code implementations 13 Jun 2022 Yingqian Wang, Zhengyu Liang, Longguang Wang, Jungang Yang, Wei An, Yulan Guo

In our method, a practical LF degradation model is developed to formulate the degradation process of real LF images.

Image Super-Resolution

Deep Learning for Visual Speech Analysis: A Survey

no code implementations 22 May 2022 Changchong Sheng, Gangyao Kuang, Liang Bai, Chenping Hou, Yulan Guo, Xin Xu, Matti Pietikäinen, Li Liu

Visual speech, referring to the visual domain of speech, has attracted increasing attention due to its wide applications, such as public security, medical treatment, military defense, and film entertainment.

speech-recognition Visual Speech Recognition

4DAC: Learning Attribute Compression for Dynamic Point Clouds

no code implementations 25 Apr 2022 Guangchi Fang, Qingyong Hu, Yiling Xu, Yulan Guo

In addition, we also propose a deep conditional entropy model to estimate the probability distribution of the transformed coefficients, by incorporating temporal context from consecutive point clouds and the motion estimation/compensation modules.

Attribute Data Compression +2

NTIRE 2022 Challenge on Stereo Image Super-Resolution: Methods and Results

no code implementations 20 Apr 2022 Longguang Wang, Yulan Guo, Yingqian Wang, Juncheng Li, Shuhang Gu, Radu Timofte

In this paper, we summarize the 1st NTIRE challenge on stereo image super-resolution (restoration of rich details in a pair of low-resolution stereo images) with a focus on new solutions and results.

Stereo Image Super-Resolution

RayMVSNet: Learning Ray-based 1D Implicit Fields for Accurate Multi-View Stereo

no code implementations CVPR 2022 Junhua Xi, Yifei Shi, Yijie Wang, Yulan Guo, Kai Xu

In particular, we propose RayMVSNet which learns sequential prediction of a 1D implicit field along each camera ray with the zero-crossing point indicating scene depth.

Multi-Task Learning

Semantic-Aware Domain Generalized Segmentation

1 code implementation CVPR 2022 Duo Peng, Yinjie Lei, Munawar Hayat, Yulan Guo, Wen Li

In this paper, we address domain generalized semantic segmentation, where a segmentation model is trained to be domain-invariant without using any target domain data.

Domain Generalization Segmentation +1

Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds

1 code implementation CVPR 2022 Yifan Zhang, Qingyong Hu, Guoquan Xu, Yanxin Ma, Jianwei Wan, Yulan Guo

To reduce the memory and computational cost, existing point-based pipelines usually adopt task-agnostic random sampling or farthest point sampling to progressively downsample input point clouds, despite the fact that not all points are equally important to the task of object detection.

Object object-detection +1

Depth Estimation by Combining Binocular Stereo and Monocular Structured-Light

1 code implementation CVPR 2022 Yuhua Xu, Xiaoli Yang, Yushan Yu, Wei Jia, Zhaobi Chu, Yulan Guo

In order to verify the effectiveness of the proposed system, we build a prototype and collect a test dataset in indoor scenes.

Depth Estimation Stereo Matching

3DAC: Learning Attribute Compression for Point Clouds

1 code implementation CVPR 2022 Guangchi Fang, Qingyong Hu, Hanyun Wang, Yiling Xu, Yulan Guo

Finally, the estimated probabilities are used to further compress these transform coefficients to a final attributes bitstream.

Attribute

Box2Seg: Learning Semantics of 3D Point Clouds with Box-Level Supervision

no code implementations 9 Jan 2022 Yan Liu, Qingyong Hu, Yinjie Lei, Kai Xu, Jonathan Li, Yulan Guo

In this paper, we introduce a neural architecture, termed Box2Seg, to learn point-level semantics of 3D point clouds with bounding box-level supervision.

Semantic Segmentation

Decoupling Makes Weakly Supervised Local Feature Better

1 code implementation CVPR 2022 Kunhong Li, Longguang Wang, Li Liu, Qing Ran, Kai Xu, Yulan Guo

Weakly supervised learning can help local feature methods to overcome the obstacle of acquiring a large-scale dataset with densely labeled correspondences.

Camera Localization Image Matching +1

Learnable Lookup Table for Neural Network Quantization

1 code implementation CVPR 2022 Longguang Wang, Xiaoyu Dong, Yingqian Wang, Li Liu, Wei An, Yulan Guo

Since a linear quantizer (i.e., the round(*) function) cannot fit the bell-shaped distributions of weights and activations well, many existing methods use pre-defined functions (e.g., an exponential function) with learnable parameters to build the quantizer for joint optimization.

Computational Efficiency Image Classification +3

Detecting and Tracking Small and Dense Moving Objects in Satellite Videos: A Benchmark

1 code implementation 25 Nov 2021 Qian Yin, Qingyong Hu, Hao Liu, Feng Zhang, Yingqian Wang, Zaiping Lin, Wei An, Yulan Guo

Satellite video cameras can provide continuous observation for a large-scale area, which is important for many remote sensing applications.

Matrix Completion Moving Object Detection +3

Spatial-Temporal Transformer for 3D Point Cloud Sequences

no code implementations 19 Oct 2021 Yimin Wei, Hao Liu, TingTing Xie, Qiuhong Ke, Yulan Guo

We test the effectiveness of our PST2 with two different tasks on point cloud sequences, i.e., 4D semantic segmentation and 3D action recognition.

3D Action Recognition Segmentation +1

Selective Light Field Refocusing for Camera Arrays Using Bokeh Rendering and Superresolution

1 code implementation 9 Aug 2021 Yingqian Wang, Jungang Yang, Yulan Guo, Chao Xiao, Wei An

In this letter, we propose a light field refocusing method to improve the imaging quality of camera arrays.

Bilateral Grid Learning for Stereo Matching Networks

no code implementations CVPR 2021 Bin Xu, Yuhua Xu, Xiaoli Yang, Wei Jia, Yulan Guo

In this paper, we present a novel edge-preserving cost volume upsampling module based on the slicing operation in the learned bilateral grid.

Robot Navigation Stereo Matching

Deep Learning for Scene Classification: A Survey

no code implementations 26 Jan 2021 Delu Zeng, Minyu Liao, Mohammad Tavakolian, Yulan Guo, Bolei Zhou, Dewen Hu, Matti Pietikäinen, Li Liu

Scene classification, aiming at classifying a scene image to one of the predefined scene categories by comprehending the entire image, is a longstanding, fundamental and challenging problem in computer vision.

Classification General Classification +1

Symmetric Parallax Attention for Stereo Image Super-Resolution

1 code implementation 7 Nov 2020 Yingqian Wang, Xinyi Ying, Longguang Wang, Jungang Yang, Wei An, Yulan Guo

Although recent years have witnessed great advances in stereo image super-resolution (SR), the beneficial information provided by binocular systems has not been fully used.

Occlusion Handling Stereo Image Super-Resolution

A Practical Tutorial on Graph Neural Networks

1 code implementation 11 Oct 2020 Isaac Ronald Ward, Jack Joyner, Casey Lickfold, Yulan Guo, Mohammed Bennamoun

Graph neural networks (GNNs) have recently grown in popularity in the field of artificial intelligence (AI) due to their unique ability to ingest relatively unstructured data types as input data.

Parallax Attention for Unsupervised Stereo Correspondence Learning

1 code implementation 16 Sep 2020 Longguang Wang, Yulan Guo, Yingqian Wang, Zhengfa Liang, Zaiping Lin, Jungang Yang, Wei An

Based on our PAM, we propose a parallax-attention stereo matching network (PASMnet) and a parallax-attention stereo image super-resolution network (PASSRnet) for stereo matching and stereo image super-resolution tasks.

Stereo Image Super-Resolution Stereo Matching

Axiom-based Grad-CAM: Towards Accurate Visualization and Explanation of CNNs

3 code implementations 5 Aug 2020 Ruigang Fu, Qingyong Hu, Xiaohu Dong, Yulan Guo, Yinghui Gao, Biao Li

To have a better understanding and usage of Convolutional Neural Networks (CNNs), the visualization and interpretation of CNNs have attracted increasing attention in recent years.

Image Generation

Light Field Image Super-Resolution Using Deformable Convolution

1 code implementation 7 Jul 2020 Yingqian Wang, Jungang Yang, Longguang Wang, Xinyi Ying, Tianhao Wu, Wei An, Yulan Guo

In this paper, we propose a deformable convolution network (i.e., LF-DFnet) to handle the disparity problem for LF image SR.

Image Super-Resolution

Pseudo-LiDAR Point Cloud Interpolation Based on 3D Motion Representation and Spatial Supervision

no code implementations 20 Jun 2020 Haojie Liu, Kang Liao, Chunyu Lin, Yao Zhao, Yulan Guo

Pseudo-LiDAR point cloud interpolation is a novel and challenging task in the field of autonomous driving, which aims to address the frequency mismatching problem between camera and LiDAR.

Autonomous Driving Optical Flow Estimation

Learning Local Features with Context Aggregation for Visual Localization

no code implementations 26 May 2020 Siyu Hong, Kunhong Li, Yongcong Zhang, Zhiheng Fu, Mengyi Liu, Yulan Guo

Most existing methods use a detect-then-describe or detect-and-describe strategy to learn local features without considering their context information.

Keypoint Detection Visual Localization

Deformable 3D Convolution for Video Super-Resolution

1 code implementation 6 Apr 2020 Xinyi Ying, Longguang Wang, Yingqian Wang, Weidong Sheng, Wei An, Yulan Guo

In this paper, we propose a deformable 3D convolution network (D3Dnet) to incorporate spatio-temporal information from both spatial and temporal dimensions for video SR.

Motion Compensation Video Super-Resolution

Deep Video Super-Resolution using HR Optical Flow Estimation

2 code implementations 6 Jan 2020 Longguang Wang, Yulan Guo, Li Liu, Zaiping Lin, Xinpu Deng, Wei An

The key challenge for video SR lies in the effective exploitation of temporal dependency between consecutive frames.

Motion Compensation Optical Flow Estimation +1

Deep Learning for 3D Point Clouds: A Survey

3 code implementations 27 Dec 2019 Yulan Guo, Hanyun Wang, Qingyong Hu, Hao Liu, Li Liu, Mohammed Bennamoun

To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds.

3D Object Detection 3D Shape Classification +3

Spatial-Angular Interaction for Light Field Image Super-Resolution

1 code implementation 17 Dec 2019 Yingqian Wang, Longguang Wang, Jungang Yang, Wei An, Jingyi Yu, Yulan Guo

Specifically, spatial and angular features are first separately extracted from input LFs, and then repetitively interacted to progressively incorporate spatial and angular information.

Image Super-Resolution SSIM

DeOccNet: Learning to See Through Foreground Occlusions in Light Fields

1 code implementation 10 Dec 2019 Yingqian Wang, Tianhao Wu, Jungang Yang, Longguang Wang, Wei An, Yulan Guo

In this paper, we handle the LF de-occlusion (LF-DeOcc) problem using a deep encoder-decoder network (namely, DeOccNet).

PLIN: A Network for Pseudo-LiDAR Point Cloud Interpolation

no code implementations 16 Sep 2019 Haojie Liu, Kang Liao, Chunyu Lin, Yao Zhao, Yulan Guo

In this paper, we propose a novel Pseudo-LiDAR interpolation network (PLIN) to increase the frequency of LiDAR sensors.

Autonomous Driving

Unsupervised Primitive Discovery for Improved 3D Generative Modeling

no code implementations CVPR 2019 Salman H. Khan, Yulan Guo, Munawar Hayat, Nick Barnes

Using the primitive parts for shapes as attributes, a parameterized 3D representation is modeled in the first stage.

3D Shape Generation

Flickr1024: A Large-Scale Dataset for Stereo Image Super-Resolution

no code implementations 15 Mar 2019 Yingqian Wang, Longguang Wang, Jungang Yang, Wei An, Yulan Guo

With the popularity of dual cameras in recently released smart phones, a growing number of super-resolution (SR) methods have been proposed to enhance the resolution of stereo image pairs.

Stereo Image Super-Resolution

Learning Parallax Attention for Stereo Image Super-Resolution

1 code implementation CVPR 2019 Longguang Wang, Yingqian Wang, Zhengfa Liang, Zaiping Lin, Jungang Yang, Wei An, Yulan Guo

Stereo image pairs can be used to improve the performance of super-resolution (SR) since additional information is provided from a second viewpoint.

Stereo Image Super-Resolution

Learning for Video Super-Resolution through HR Optical Flow Estimation

2 code implementations 23 Sep 2018 Longguang Wang, Yulan Guo, Zaiping Lin, Xinpu Deng, Wei An

Extensive experiments demonstrate that HR optical flows provide more accurate correspondences than their LR counterparts and improve both accuracy and consistency performance.

Motion Compensation Optical Flow Estimation +1

Learning for Disparity Estimation through Feature Constancy

2 code implementations CVPR 2018 Zhengfa Liang, Yiliu Feng, Yulan Guo, Hengzhu Liu, Wei Chen, Linbo Qiao, Li Zhou, Jianfeng Zhang

The second part performs matching cost calculation, matching cost aggregation and disparity calculation to estimate the initial disparity using shared features.

Disparity Estimation Stereo Matching +1

Partial Procedural Geometric Model Fitting for Point Clouds

1 code implementation 17 Oct 2016 Zongliang Zhang, Jonathan Li, Yulan Guo, Yangbin Lin, Ming Cheng, Cheng Wang

However, most geometric model fitting methods are unable to fit an arbitrary geometric model (e.g., a surface with holes) to incomplete data, because the similarity metrics used in these methods are unable to measure the rigid partial similarity between arbitrary models.

Rotational Projection Statistics for 3D Local Surface Description and Object Recognition

no code implementations 11 Apr 2013 Yulan Guo, Ferdous Sohel, Mohammed Bennamoun, Min Lu, Jianwei Wan

The performance of the proposed LRF, RoPS descriptor and object recognition algorithm was rigorously tested on a number of popular and publicly available datasets.

3D Object Recognition Object
