no code implementations • 10 Dec 2024 • Aixuan Li, Jing Zhang, Jiawei Shi, Yiran Zhong, Yuchao Dai
We find that the well-trained victim models (VMs), against which the attacks are generated, serve as fundamental prerequisites for adversarial attacks, i.e., a segmentation VM is needed to generate attacks for segmentation.
no code implementations • 10 Dec 2024 • Hui Deng, Jiawei Shi, Zhen Qin, Yiran Zhong, Yuchao Dai
In this paper, we revisit deep NRSfM from two perspectives to address the limitations of current deep NRSfM methods: (1) canonicalization and (2) sequence modeling.
no code implementations • 12 Nov 2024 • Liyuan Zhang, Le Hui, Qi Liu, Bo Li, Yuchao Dai
Multi-instance point cloud registration aims to estimate the pose of all instances of a model point cloud in the whole scene.
no code implementations • 30 Oct 2024 • Naijian Cao, Renjie He, Yuchao Dai, Mingyi He
To enhance the representations of attention mechanisms while preserving low computational complexity, we propose LoFLAT, a novel Local Feature matching method using a Focused Linear Attention Transformer.
no code implementations • 31 May 2024 • Zhen Qin, Yuxin Mao, Xuyang Shen, Dong Li, Jing Zhang, Yuchao Dai, Yiran Zhong
Linear attention mechanisms have gained prominence in causal language models due to their linear computational complexity and enhanced speed.
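The linear complexity mentioned above can be illustrated with a minimal sketch of causal linear attention. It assumes the common kernel formulation with `elu(x) + 1` as the feature map; the feature map and the function names are illustrative assumptions and are not taken from the listed paper. Each step updates a running summary in constant time per token, so the total cost grows linearly with sequence length.

```python
import math


def feature_map(x):
    # Assumed positive feature map: elu(v) + 1 applied element-wise.
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def causal_linear_attention(queries, keys, values):
    """Causal linear attention over lists of vectors (one vector per token)."""
    d_k, d_v = len(keys[0]), len(values[0])
    kv = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k) v^T
    k_sum = [0.0] * d_k                     # running sum of phi(k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        phi_q, phi_k = feature_map(q), feature_map(k)
        # Fold the current token into the running state (O(d_k * d_v)).
        for i in range(d_k):
            k_sum[i] += phi_k[i]
            for j in range(d_v):
                kv[i][j] += phi_k[i] * v[j]
        # Normalizer is positive because the feature map is positive.
        norm = sum(phi_q[i] * k_sum[i] for i in range(d_k))
        outputs.append(
            [sum(phi_q[i] * kv[i][j] for i in range(d_k)) / norm
             for j in range(d_v)]
        )
    return outputs
```

Because the state `kv` and `k_sum` has a fixed size, no token ever needs to revisit earlier tokens, which is what avoids the quadratic attention matrix.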
no code implementations • CVPR 2024 • Jiawei Shi, Hui Deng, Yuchao Dai
Even though Non-rigid Structure-from-Motion (NRSfM) has been extensively studied and great progress has been made, there are still key challenges that hinder its broad real-world application: 1) the inherent motion/rotation ambiguity requires either explicit camera motion recovery with extra constraints or complex Procrustean Alignment; 2) existing low-rank modeling of the global shape can over-penalize drastic deformations in the 3D shape sequence.
1 code implementation • 22 Apr 2024 • Yuxin Mao, Xuyang Shen, Jing Zhang, Zhen Qin, Jinxing Zhou, Mochu Xiang, Yiran Zhong, Yuchao Dai
To support research in this field, we have developed a comprehensive Text to Audible-Video Generation Benchmark (TAVGBench), which contains over 1.7 million clips with a total duration of 11.8 thousand hours.
no code implementations • CVPR 2024 • Zhicheng Lu, Xiang Guo, Le Hui, Tianrui Chen, Min Yang, Xiao Tang, Feng Zhu, Yuchao Dai
In this way, our solution achieves 3D geometry-aware deformation modeling, which enables improved dynamic view synthesis and 3D dynamic reconstruction.
no code implementations • CVPR 2024 • YuFei Wang, Ge Zhang, Shaoqian Wang, Bo Li, Qi Liu, Le Hui, Yuchao Dai
In this paper we visualize the internal feature maps to analyze how the network densifies the input sparse depth.
no code implementations • CVPR 2024 • Xiao Tang, Min Yang, Penghui Sun, Hui Li, Yuchao Dai, Feng Zhu, Hojae Lee
In addition, to speed up the prior information search, we propose a search method based on optical flow and structural similarity.
no code implementations • ICCV 2023 • YuFei Wang, Bo Li, Ge Zhang, Qi Liu, Tao Gao, Yuchao Dai
Existing deep learning-based depth completion methods generally employ massive stacked layers to predict the dense depth map from sparse input data.
1 code implementation • ICCV 2023 • Yuxin Mao, Jing Zhang, Mochu Xiang, Yiran Zhong, Yuchao Dai
To achieve this, our ECMVAE factorizes the representations of each modality with a modality-shared representation and a modality-specific representation.
no code implementations • ICCV 2023 • Xiang Guo, Jiadai Sun, Yuchao Dai, GuanYing Chen, Xiaoqing Ye, Xiao Tan, Errui Ding, Yumeng Zhang, Jingdong Wang
This paper proposes a neural radiance field (NeRF) approach for novel view synthesis of dynamic scenes using forward warping.
1 code implementation • ICCV 2023 • Zhexiong Wan, Yuxin Mao, Jing Zhang, Yuchao Dai
Recently, methods that fuse RGB images and point clouds have been proposed to jointly estimate 2D optical flow and 3D scene flow.
no code implementations • 5 Sep 2023 • YuFei Wang, Yuxin Mao, Qi Liu, Yuchao Dai
The decomposed filters not only maintain the favorable properties of guided dynamic filters, being content-dependent and spatially variant, but also reduce model parameters and hardware costs, as the learned adaptors are decoupled from the number of feature channels.
no code implementations • 18 Aug 2023 • Haorui Ji, Hui Deng, Yuchao Dai, Hongdong Li
Most of the previous 3D human pose estimation work relied on the powerful memory capability of the network to obtain suitable 2D-3D mappings from the training data.
1 code implementation • 16 Aug 2023 • Dawei Hao, Yuxin Mao, Bowen He, Xiaodong Han, Yuchao Dai, Yiran Zhong
In this paper, inspired by the human ability to mentally simulate the sound of an object and its visual appearance, we introduce a bidirectional generation framework.
no code implementations • 8 Aug 2023 • Chen Wang, Jiadai Sun, Lina Liu, Chenming Wu, Zhelun Shen, Dayan Wu, Yuchao Dai, Liangjun Zhang
However, the shape-radiance ambiguity of radiance fields remains a challenge, especially in the sparse viewpoints setting.
no code implementations • 31 Jul 2023 • Yuxin Mao, Jing Zhang, Mochu Xiang, Yunqiu Lv, Yiran Zhong, Yuchao Dai
We propose a latent diffusion model with contrastive learning for audio-visual segmentation (AVS) to extensively explore the contribution of audio.
1 code implementation • 31 Jul 2023 • Zhelun Shen, Xibin Song, Yuchao Dai, Dingfu Zhou, Zhibo Rao, Liangjun Zhang
Due to the domain differences and unbalanced disparity distribution across multiple datasets, current stereo matching approaches are commonly limited to a specific dataset and generalize poorly to others.
1 code implementation • 31 Jul 2023 • Mengqi He, Jing Zhang, Zhaoyuan Yang, Mingyi He, Nick Barnes, Yuchao Dai
We analyze the performance of semantic segmentation models wrt.
no code implementations • 19 Jul 2023 • Mochu Xiang, Jing Zhang, Nick Barnes, Yuchao Dai
Effectively measuring and modeling the reliability of a trained model is essential to the real-world deployment of monocular depth estimation (MDE) models.
no code implementations • 18 Jul 2023 • Zhen Qin, Weixuan Sun, Kaiyue Lu, Hui Deng, Dongxu Li, Xiaodong Han, Yuchao Dai, Lingpeng Kong, Yiran Zhong
Meanwhile, it emphasizes a general paradigm for designing a broader range of relative positional encoding methods that are applicable to linear transformers.
no code implementations • 10 Jul 2023 • Aixuan Li, Jing Zhang, Yunqiu Lv, Tong Zhang, Yiran Zhong, Mingyi He, Yuchao Dai
In this case, salient objects are typically non-camouflaged, and camouflaged objects are usually not salient.
1 code implementation • 7 Jul 2023 • Yunqiu Lv, Jing Zhang, Nick Barnes, Yuchao Dai
Unsupervised object discovery (UOD) refers to the task of discriminating the whole region of objects from the background within a scene without relying on labeled datasets, which benefits the task of bounding-box-level localization and pixel-level segmentation.
1 code implementation • 6 Jun 2023 • Aixuan Li, Yuxin Mao, Jing Zhang, Yuchao Dai
In particular, following the principle of disentangled representation learning, we introduce a mutual information upper bound with a mutual information minimization regularizer to encourage the disentangled representation of each modality for salient object detection.
2 code implementations • 8 May 2023 • Zhen Qin, Xiaodong Han, Weixuan Sun, Bowen He, Dong Li, Dongxu Li, Yuchao Dai, Lingpeng Kong, Yiran Zhong
Sequence modeling has important applications in natural language processing and computer vision.
no code implementations • 21 Apr 2023 • Bin Fan, Yuchao Dai, Yongduek Seo, Mingyi He
The normalized eight-point algorithm has been widely viewed as the cornerstone in two-view geometry computation, where the seminal Hartley's normalization has greatly improved the performance of the direct linear transformation algorithm.
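As a small illustration of the normalization step referred to above, the sketch below applies the standard Hartley normalization to a set of 2D points: translate the centroid to the origin, then scale so the mean distance to the origin is sqrt(2). The function name and return layout are illustrative assumptions; the sketch covers only the normalization, not the eight-point solver itself.

```python
import math


def hartley_normalize(points):
    """Normalize 2D points; returns (normalized_points, 3x3 transform T).

    Assumes at least two distinct points, so the mean distance is non-zero.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_dist = sum(math.hypot(x - cx, y - cy) for x, y in points) / n
    s = math.sqrt(2.0) / mean_dist
    # T maps homogeneous (x, y, 1) to the normalized coordinates.
    transform = [
        [s, 0.0, -s * cx],
        [0.0, s, -s * cy],
        [0.0, 0.0, 1.0],
    ]
    normalized = [(s * (x - cx), s * (y - cy)) for x, y in points]
    return normalized, transform
```

In a full pipeline, the matrix estimated from normalized points in both views would be mapped back with the two transforms; that de-normalization step is omitted here.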
no code implementations • 14 Apr 2023 • Jaime Spencer, C. Stella Qian, Michaela Trescakova, Chris Russell, Simon Hadfield, Erich W. Graf, Wendy J. Adams, Andrew J. Schofield, James Elder, Richard Bowden, Ali Anwar, Hao Chen, Xiaozhi Chen, Kai Cheng, Yuchao Dai, Huynh Thai Hoa, Sadat Hossain, Jianmian Huang, Mohan Jing, Bo Li, Chao Li, Baojun Li, Zhiwen Liu, Stefano Mattoccia, Siegfried Mercelis, Myungwoo Nam, Matteo Poggi, Xiaohua Qi, Jiahui Ren, Yang Tang, Fabio Tosi, Linh Trinh, S. M. Nadim Uddin, Khan Muhammad Umair, Kaixuan Wang, YuFei Wang, Yixing Wang, Mochu Xiang, Guangkai Xu, Wei Yin, Jun Yu, Qi Zhang, Chaoqiang Zhao
This paper discusses the results for the second edition of the Monocular Depth Estimation Challenge (MDEC).
1 code implementation • CVPR 2023 • Xuyang Shen, Dong Li, Jinxing Zhou, Zhen Qin, Bowen He, Xiaodong Han, Aixuan Li, Yuchao Dai, Lingpeng Kong, Meng Wang, Yu Qiao, Yiran Zhong
We explore a new task for audio-visual-language modeling called fine-grained audible video description (FAVD).
1 code implementation • 14 Feb 2023 • Hongguang Zhang, Limeng Zhang, Yuchao Dai, Hongdong Li, Piotr Koniusz
Contemporary deep learning multi-scale deblurring models suffer from many issues: 1) They perform poorly on non-uniformly blurred images/videos; 2) Simply increasing the model depth with finer-scale levels cannot improve deblurring; 3) Individual RGB frames contain limited motion information for deblurring; 4) Previous models have limited robustness to spatial transformations and noise.
no code implementations • CVPR 2023 • Zhibo Rao, Bangshu Xiong, Mingyi He, Yuchao Dai, Renjie He, Zhelun Shen, Xing Li
Experimental results on multi-datasets show that: (1) our method can be easily plugged into the current various stereo matching models to improve generalization performance; (2) our method can reduce the significant volatility of generalization performance among different training epochs; (3) we find that the current methods prefer to choose the best results among different training epochs as generalization performance, but it is impossible to select the best performance by ground truth in practice.
no code implementations • ICCV 2023 • Le Hui, Linghua Tang, Yuchao Dai, Jin Xie, Jian Yang
Then, to generate homogeneous superpoints from the sparse LiDAR point cloud, we propose a LiDAR point grouping algorithm that simultaneously considers the similarity of point embeddings and the Euclidean distance of points in 3D space.
1 code implementation • CVPR 2023 • Bin Fan, Yuxin Mao, Yuchao Dai, Zhexiong Wan, Qi Liu
Rolling shutter correction (RSC) is becoming increasingly popular for RS cameras that are widely used in commercial and industrial applications.
no code implementations • CVPR 2023 • Xinyu Tian, Jing Zhang, Mochu Xiang, Yuchao Dai
Most of the existing salient object detection (SOD) models focus on improving the overall model performance, without explicitly explaining the discrepancy between the training and testing distributions.
1 code implementation • 16 Nov 2022 • Zhexiong Wan, Yuchao Dai, Yuxin Mao
In this paper, we propose a novel deep learning-based dense and continuous optical flow estimation framework from a single image with event streams, which facilitates the accurate perception of high-speed motion.
no code implementations • 26 Oct 2022 • Zhiyuan Zhang, Yuchao Dai, Bin Fan, Jiadai Sun, Mingyi He
In this paper, we propose to learn a robust task-specific feature descriptor to consistently describe the correct point correspondence under interference.
no code implementations • 26 Oct 2022 • Zhiyuan Zhang, Jiadai Sun, Yuchao Dai, Bin Fan, Qi Liu
In response, this paper presents a novel end-to-end learning-based method to estimate the dense correspondence of 3D point clouds, in which the problem of point matching is formulated as a zero-one assignment problem to achieve a permutation matching matrix to implement the one-to-one principle fundamentally.
1 code implementation • 26 Oct 2022 • YuFei Wang, Yuchao Dai, Qi Liu, Peng Yang, Jiadai Sun, Bo Li
We find that existing depth-only methods can obtain satisfactory results in the areas where the measurement points are almost accurate and evenly distributed (denoted as normal areas), while the performance is limited in the areas where the foreground and background points are overlapped due to occlusion (denoted as overlap areas) and the areas where there are no measurement points around (denoted as blank areas) since the methods have no reliable input information in these areas.
no code implementations • 15 Oct 2022 • Kaiyue Lu, Zexiang Liu, Jianyuan Wang, Weixuan Sun, Zhen Qin, Dong Li, Xuyang Shen, Hui Deng, Xiaodong Han, Yuchao Dai, Yiran Zhong
Therefore, we propose a feature fixation module to reweight the feature importance of the query and key before computing linear attention.
no code implementations • 13 Oct 2022 • Yuxin Mao, Zhexiong Wan, Yuchao Dai, Xin Yu
Single image blind deblurring is highly ill-posed as neither the latent sharp image nor the blur kernel is known.
1 code implementation • 6 Oct 2022 • Bin Fan, Yuchao Dai, Hongdong Li
The RSSR is a very challenging task, and to our knowledge, no practical solution exists to date.
1 code implementation • 5 Jul 2022 • Jiadai Sun, Yuchao Dai, Xianjing Zhang, Jintao Xu, Rui Ai, Weihao Gu, Xieyuanli Chen
We also use a point refinement module via 3D sparse convolution to fuse the information from both LiDAR range image and point cloud representations and reduce the artifacts on the borders of the objects.
no code implementations • 15 Jun 2022 • Xiang Guo, GuanYing Chen, Yuchao Dai, Xiaoqing Ye, Jiadai Sun, Xiao Tan, Errui Ding
The second module contains a density and a color grid to model the geometry and density of the scene.
1 code implementation • CVPR 2022 • Bin Fan, Yuchao Dai, Zhiyuan Zhang, Qi Liu, Mingyi He
Then, a refinement scheme is proposed to guide the GS frame synthesis along with bilateral occlusion masks to produce high-fidelity GS video frames at arbitrary times.
1 code implementation • 23 May 2022 • Yunqiu Lv, Jing Zhang, Yuchao Dai, Aixuan Li, Nick Barnes, Deng-Ping Fan
With the above understanding about camouflaged objects, we present the first triple-task learning framework to simultaneously localize, segment, and rank camouflaged objects, indicating the conspicuousness level of camouflage.
no code implementations • 10 Apr 2022 • Hui Deng, Tong Zhang, Yuchao Dai, Jiawei Shi, Yiran Zhong, Hongdong Li
In this paper, we propose to model deep NRSfM from a sequence-to-sequence translation perspective, where the input 2D frame sequence is taken as a whole to reconstruct the deforming 3D non-rigid shape sequence.
no code implementations • 24 Mar 2022 • Zhiyuan Zhang, Jiadai Sun, Yuchao Dai, Dingfu Zhou, Xibin Song, Mingyi He
Existing correspondences-free methods generally learn the holistic representation of the entire point cloud, which is fragile for partial and noisy point clouds.
no code implementations • 24 Mar 2022 • Zhiyuan Zhang, Jiadai Sun, Yuchao Dai, Bin Fan, Mingyi He
3D point cloud registration is fragile to outliers, which are labeled as the points without corresponding points.
1 code implementation • CVPR 2022 • Shaoqian Wang, Bo Li, Yuchao Dai
Specifically, a lightweight 3D CNN is utilized to generate the coarsest initial depth map which is essential to launch the GRU and guarantee a fast convergence.
no code implementations • 29 Nov 2021 • Jiadai Sun, Yuxin Mao, Yuchao Dai, Yiran Zhong, Jianyuan Wang
The task of semi-supervised video object segmentation (VOS) has been greatly advanced, and state-of-the-art performance has been achieved by dense matching-based methods.
no code implementations • 23 Nov 2021 • Xinyu Tian, Jing Zhang, Yuchao Dai
Given multiple saliency annotations, we introduce a general divergence modeling strategy via random sampling, and apply our strategy to an ensemble based framework and three latent variable model based solutions to explore the subjective nature of saliency.
no code implementations • 22 Nov 2021 • Jing Zhang, Yuchao Dai, Mehrtash Harandi, Yiran Zhong, Nick Barnes, Richard Hartley
Uncertainty estimation has been extensively studied in recent literature, which can usually be classified as aleatoric uncertainty and epistemic uncertainty.
no code implementations • 28 Oct 2021 • Zhiyuan Zhang, Jiadai Sun, Yuchao Dai, Dingfu Zhou, Xibin Song, Mingyi He
Even though considerable progress has been made in deep learning-based 3D point cloud processing, how to obtain accurate correspondences for robust registration remains a major challenge because existing hard assignment methods cannot deal with outliers naturally.
1 code implementation • 13 Oct 2021 • Jing Zhang, Yuchao Dai, Mochu Xiang, Deng-Ping Fan, Peyman Moghadam, Mingyi He, Christian Walder, Kaihao Zhang, Mehrtash Harandi, Nick Barnes
Deep neural networks can be roughly divided into deterministic neural networks and stochastic neural networks. The former is usually trained to achieve a mapping from input space to output space via maximum likelihood estimation for the weights, which leads to deterministic predictions during testing.
1 code implementation • ICCV 2021 • Jing Zhang, Deng-Ping Fan, Yuchao Dai, Xin Yu, Yiran Zhong, Nick Barnes, Ling Shao
In this paper, we introduce a novel multi-stage cascaded learning framework via mutual information minimization to "explicitly" model the multi-modal information between RGB image and depth data.
no code implementations • ICCV 2021 • Haitian Zeng, Yuchao Dai, Xin Yu, Xiaohan Wang, Yi Yang
As NRSfM is a highly under-constrained problem, we propose two new pairwise regularizations to further regularize the reconstruction.
1 code implementation • ICCV 2021 • Bin Fan, Yuchao Dai, Mingyi He
The vast majority of modern consumer-grade cameras employ a rolling shutter mechanism, leading to image distortions if the camera moves during image acquisition.
1 code implementation • ICCV 2021 • Fei Zhang, Chaochen Gu, Chenyue Zhang, Yuchao Dai
Therefore, a CAM with more information related to object seeds can be obtained by narrowing down the gap between the sum of CAMs generated by the CP Pair and the original CAM.
no code implementations • 24 Jun 2021 • Mochu Xiang, Jing Zhang, Yunqiu Lv, Aixuan Li, Yiran Zhong, Yuchao Dai
In this paper, we study the depth contribution for camouflaged object detection, where the depth maps are generated with existing monocular depth estimation (MDE) methods.
2 code implementations • 20 Apr 2021 • Yuxin Mao, Jing Zhang, Zhexiong Wan, Yuchao Dai, Aixuan Li, Yunqiu Lv, Xinyu Tian, Deng-Ping Fan, Nick Barnes
For the former, we apply transformer to a deterministic model, and explain that the effective structure modeling and global context modeling abilities lead to its superior performance compared with the CNN based frameworks.
3 code implementations • CVPR 2021 • Zhelun Shen, Yuchao Dai, Zhibo Rao
In this paper, we propose CFNet, a Cascade and Fused cost volume based network to improve the robustness of the stereo matching network.
2 code implementations • CVPR 2021 • Aixuan Li, Jing Zhang, Yunqiu Lv, Bowen Liu, Tong Zhang, Yuchao Dai
Visual salient object detection (SOD) aims at finding the salient object(s) that attract human attention, while camouflaged object detection (COD), on the contrary, intends to discover the camouflaged object(s) that are hidden in the surroundings.
1 code implementation • CVPR 2021 • Jianyuan Wang, Yiran Zhong, Yuchao Dai, Stan Birchfield, Kaihao Zhang, Nikolai Smolyanskiy, Hongdong Li
Two-view structure-from-motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM.
Ranked #28 on Monocular Depth Estimation on KITTI Eigen split
1 code implementation • CVPR 2021 • Yunqiu Lv, Jing Zhang, Yuchao Dai, Aixuan Li, Bowen Liu, Nick Barnes, Deng-Ping Fan
With the above understanding about camouflaged objects, we present the first ranking based COD network (Rank-Net) to simultaneously localize, segment and rank camouflaged objects.
Ranked #7 on Camouflaged Object Segmentation on PCOD_1200
no code implementations • 5 Mar 2021 • Dingfu Zhou, Xibin Song, Yuchao Dai, Junbo Yin, Feixiang Lu, Jin Fang, Miao Liao, Liangjun Zhang
3D object detection from a single image is an important task in Autonomous Driving (AD), where various approaches have been proposed.
Ranked #20 on Monocular 3D Object Detection on KITTI Cars Moderate
no code implementations • ICCV 2021 • Ge Gao, Pei You, Rong Pan, Shunyuan Han, Yuanyuan Zhang, Yuchao Dai, Hojae Lee
In recent years, neural image compression has emerged as a rapidly developing topic in computer vision, where state-of-the-art approaches now exhibit compression performance superior to that of their conventional counterparts.
no code implementations • ICCV 2021 • Yamin Mao, Zhihua Liu, Weiming Li, Yuchao Dai, Qiang Wang, Yun-Tae Kim, Hong-Seok Lee
Extensive experiments show that the proposed method achieves the highest ground truth covering ratio compared with other cascade cost volume based stereo matching methods.
no code implementations • ICCV 2021 • Bin Fan, Yuchao Dai
In this paper, we propose to invert the above RS imaging mechanism, i.e., recovering a high framerate GS video from consecutive RS images to achieve RS temporal super-resolution (RSSR).
no code implementations • 31 Dec 2020 • Zhibo Rao, Mingyi He, Yuchao Dai
In this paper, we propose a novel class attention module and a decomposition-fusion strategy to cope with imbalanced labels.
no code implementations • 10 Dec 2020 • Jing Zhang, Yuchao Dai, Xin Yu, Mehrtash Harandi, Nick Barnes, Richard Hartley
Existing deep neural network based salient object detection (SOD) methods mainly focus on pursuing high network accuracy.
no code implementations • 6 Dec 2020 • Yiran Zhong, Yuchao Dai, Hongdong Li
More specifically, we represent the desired depth map as a collection of 3D planes, and the reconstruction problem is formulated as the optimization of the plane parameters.
no code implementations • 2 Dec 2020 • Yiran Zhong, Yuchao Dai, Hongdong Li
The given sparse depth points serve as a data term to constrain the weighting process.
no code implementations • 1 Dec 2020 • Yiran Zhong, Charles Loop, Wonmin Byeon, Stan Birchfield, Yuchao Dai, Kaihao Zhang, Alexey Kamenev, Thomas Breuel, Hongdong Li, Jan Kautz
A common way to speed up the computation is to downsample the feature volume, but this loses high-frequency details.
3 code implementations • NeurIPS 2020 • Jianyuan Wang, Yiran Zhong, Yuchao Dai, Kaihao Zhang, Pan Ji, Hongdong Li
Learning matching costs has been shown to be critical to the success of the state-of-the-art deep stereo matching methods, in which 3D convolutions are applied on a 4D feature volume to learn a 3D cost volume.
1 code implementation • NeurIPS 2020 • Xuelian Cheng, Yiran Zhong, Mehrtash Harandi, Yuchao Dai, Xiaojun Chang, Tom Drummond, Hongdong Li, ZongYuan Ge
To reduce the human efforts in neural network design, Neural Architecture Search (NAS) has been applied with remarkable success to various high-level vision tasks such as classification and semantic segmentation.
Ranked #2 on Stereo Disparity Estimation on Scene Flow
no code implementations • 22 Oct 2020 • Xiang Guo, Bo Li, Yuchao Dai, Tongxin Zhang, Hui Deng
That is, we synthesize the novel view from only a 6-DoF camera pose directly.
no code implementations • 14 Sep 2020 • Zhexiong Wan, Yuxin Mao, Yuchao Dai
Optical flow estimation is an important computer vision task, which aims at estimating the dense correspondences between two frames.
4 code implementations • 7 Sep 2020 • Jing Zhang, Deng-Ping Fan, Yuchao Dai, Saeed Anwar, Fatemeh Saleh, Sadegh Aliakbarian, Nick Barnes
Our framework includes two main models: 1) a generator model, which maps the input image and latent variable to stochastic saliency prediction, and 2) an inference model, which gradually updates the latent variable by sampling it from the true or approximate posterior distribution.
Ranked #1 on RGB-D Salient Object Detection on LFSD
2 code implementations • 23 Jun 2020 • Zhelun Shen, Yuchao Dai, Xibin Song, Zhibo Rao, Dingfu Zhou, Liangjun Zhang
First, we construct combination volumes on the upper levels of the pyramid and develop a cost volume fusion module to integrate them for initial disparity estimation.
no code implementations • 15 Jun 2020 • Suryansh Kumar, Luc van Gool, Carlos E. P. de Oliveira, Anoop Cherian, Yuchao Dai, Hongdong Li
Assuming that a deforming shape is composed of a union of local linear subspaces and spans a global low-rank space over multiple frames enables us to efficiently model complex non-rigid deformations.
no code implementations • 14 Jun 2020 • Ke Wang, Bin Fan, Yuchao Dai
In this paper, we present a novel linear algorithm to estimate the 6 DoF relative pose from consecutive frames of stereo rolling shutter (RS) cameras.
no code implementations • CVPR 2020 • Xibin Song, Yuchao Dai, Dingfu Zhou, Liu Liu, Wei Li, Hongdong Li, Ruigang Yang
Second, we propose a new framework for real-world DSR, which consists of four modules: 1) An iterative residual learning module with deep supervision to learn effective high-frequency components of depth maps in a coarse-to-fine manner; 2) A channel attention strategy to enhance channels with abundant high-frequency components; 3) A multi-stage fusion module to effectively re-exploit the results in the coarse-to-fine process; and 4) A depth refinement module to improve the depth map by TGV regularization and input loss.
1 code implementation • CVPR 2020 • Jing Zhang, Deng-Ping Fan, Yuchao Dai, Saeed Anwar, Fatemeh Sadat Saleh, Tong Zhang, Nick Barnes
In this paper, we propose the first framework (UCNet) to employ uncertainty for RGB-D saliency detection by learning from the data labeling process.
Ranked #4 on RGB-D Salient Object Detection on LFSD
1 code implementation • CVPR 2020 • Jing Zhang, Xin Yu, Aixuan Li, Peipei Song, Bowen Liu, Yuchao Dai
In this paper, we propose a weakly-supervised salient object detection model to learn saliency from such annotations.
no code implementations • 19 Nov 2019 • Suryansh Kumar, Yuchao Dai, Hongdong Li
We assume that a dynamic scene can be approximated by numerous piecewise planar surfaces, where each planar surface enjoys its own rigid motion, and the global change in the scene between two frames is as-rigid-as-possible (ARAP).
no code implementations • 6 Oct 2019 • Liyuan Pan, Yuchao Dai, Miaomiao Liu, Fatih Porikli, Quan Pan
Under our model, these three tasks are naturally connected and expressed as the parameter estimation of 3D scene structure and camera motion (structure and motion for the dynamic scenes).
no code implementations • 30 Aug 2019 • Yuchao Dai, Zhidong Zhu, Zhibo Rao, Bo Li
The success of existing deep-learning based multi-view stereo (MVS) approaches greatly depends on the availability of large-scale supervision in the form of dense depth maps.
1 code implementation • 11 Aug 2019 • Dingfu Zhou, Jin Fang, Xibin Song, Chenye Guan, Junbo Yin, Yuchao Dai, Ruigang Yang
In 2D/3D object detection task, Intersection-over-Union (IoU) has been widely employed as an evaluation metric to evaluate the performance of different detectors in the testing stage.
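For reference, a minimal sketch of the IoU metric mentioned above, restricted to axis-aligned 2D boxes. The `(x1, y1, x2, y2)` box layout and the function name are assumptions made for illustration; rotated or 3D boxes need a different intersection computation.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2), x1 <= x2, y1 <= y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extents are clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    # Degenerate (zero-area) pairs are defined as IoU 0 here.
    return intersection / union if union > 0 else 0.0
```

Note that the value is exactly zero for any pair of non-overlapping boxes, regardless of how far apart they are.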
no code implementations • 25 Apr 2019 • Zhidong Zhu, Mingyi He, Yuchao Dai, Zhibo Rao, Bo Li
The network consists of three modules: Multi-Scale 2D local feature extraction module, Cross-form spatial pyramid module and Multi-Scale 3D Feature Matching and Fusion module.
no code implementations • 25 Apr 2019 • Zhibo Rao, Mingyi He, Yuchao Dai, Zhidong Zhu, Bo Li, Renjie He
The multi-scale residual 3D convolution module learns the geometry context at different scales from the cost volume, which is aggregated by the multi-scale fusion 2D convolution module.
no code implementations • CVPR 2019 • Yiran Zhong, Pan Ji, Jianyuan Wang, Yuchao Dai, Hongdong Li
In this paper, we propose Deep Epipolar Flow, an unsupervised optical flow method which incorporates global geometric constraints into network learning.
1 code implementation • CVPR 2019 • Hongguang Zhang, Yuchao Dai, Hongdong Li, Piotr Koniusz
depth, we propose a stacked version of our multi-patch model.
Ranked #10 on Deblurring on RealBlur-R (trained on GoPro) (SSIM (sRGB) metric)
1 code implementation • 12 Mar 2019 • Liyuan Pan, Richard Hartley, Cedric Scheerlinck, Miaomiao Liu, Xin Yu, Yuchao Dai
Based on abundant event data alongside low frame-rate, easily blurred images, we propose a simple yet effective approach to reconstruct high-quality, high frame-rate sharp videos.
no code implementations • 3 Mar 2019 • Dingfu Zhou, Yuchao Dai, Hongdong Li
Recovering the absolute metric scale from a monocular camera is a challenging but highly desirable problem for monocular camera-based systems.
no code implementations • 1 Mar 2019 • Liyuan Pan, Yuchao Dai, Miaomiao Liu
Camera shake during exposure is a major problem in hand-held photography, as it causes image blur that destroys details in the captured images. In the real world, such blur is mainly caused by both the camera motion and the complex scene structure. While many approaches have been proposed based on various assumptions regarding the scene structure or the camera motion, few existing methods can handle real 6 DoF camera motion. In this paper, we propose to jointly estimate the 6 DoF camera motion and remove the non-uniform blur caused by camera motion by exploiting their underlying geometric relationships, with a single blurry image and its depth map (either direct depth measurements, or a learned depth map) as input. We formulate our joint deblurring and 6 DoF camera motion estimation as an energy minimization problem, which is solved in an alternating manner.
no code implementations • 11 Feb 2019 • Suryansh Kumar, Ram Srivatsav Ghorakavi, Yuchao Dai, Hongdong Li
Given per-pixel optical flow correspondences between two consecutive frames and the sparse depth prior for the reference frame, we show that we can effectively recover the dense depth map for the successive frames without solving for 3D motion parameters.
no code implementations • CVPR 2019 • Xibin Song, Peng Wang, Dingfu Zhou, Rui Zhu, Chenye Guan, Yuchao Dai, Hao Su, Hongdong Li, Ruigang Yang
Specifically, we first segment each car with a pre-trained Mask R-CNN, and then regress towards its 3D pose and shape based on a deformable 3D car model with or without using semantic keypoints.
no code implementations • 26 Nov 2018 • Liyuan Pan, Richard Hartley, Miaomiao Liu, Yuchao Dai
The image blurring process is generally modelled as the convolution of a blur kernel with a latent image.
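The convolution model stated above can be sketched in a few lines. This is a plain "valid"-mode 2D convolution on nested lists, assuming a spatially uniform kernel and no noise; the function name is illustrative and is not taken from the listed paper.

```python
def convolve2d_valid(image, kernel):
    """True 2D convolution (kernel flipped), keeping only fully covered pixels."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    blurred = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    # Index the kernel in reverse: this is what distinguishes
                    # convolution from cross-correlation.
                    acc += kernel[kh - 1 - i][kw - 1 - j] * image[r + i][c + j]
            row.append(acc)
        blurred.append(row)
    return blurred
```

For example, convolving a single bright pixel with a horizontal box kernel spreads its intensity evenly over the kernel width, which is the simplest case of uniform motion blur.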
1 code implementation • CVPR 2019 • Liyuan Pan, Cedric Scheerlinck, Xin Yu, Richard Hartley, Miaomiao Liu, Yuchao Dai
In this paper, we propose a simple and effective approach, the Event-based Double Integral (EDI) model, to reconstruct a high frame-rate, sharp video from a single blurry frame and its event data.
5 code implementations • ICCV 2019 • Liu Liu, Hongdong Li, Yuchao Dai
This paper tackles the problem of large-scale image-based localization (IBL) where the spatial location of a query image is determined by finding out the most similar reference images in a large database.
no code implementations • ECCV 2018 • Yiran Zhong, Yuchao Dai, Hongdong Li
This paper proposes an original problem of stereo computation from a single mixture image, a challenging problem that had not been researched before.
no code implementations • 27 Aug 2018 • Xibin Song, Yuchao Dai, Xueying Qin
However, there still exist two major issues with these DCNN based depth map super-resolution methods that hinder the performance: i) The low-resolution depth maps either need to be up-sampled before feeding into the network or substantial deconvolution has to be used; and ii) The supervision (high-resolution depth maps) is only applied at the end of the network, thus it is difficult to handle large up-sampling factors, such as $\times 8, \times 16$.
no code implementations • 13 Aug 2018 • Yiran Zhong, Yuchao Dai, Hongdong Li
This paper is concerned with the problem of how to better exploit 3D geometric information for dense semantic image labeling.
no code implementations • ECCV 2018 • Yiran Zhong, Hongdong Li, Yuchao Dai
Deep learning based stereo matching methods have shown great success and achieved top scores across different benchmarks.
no code implementations • 30 Jul 2018 • Xiang Guo, Yuchao Dai
In this paper, we propose to address the problem of single image 3D human pose estimation with occluded measurements by exploiting the Euclidean distance matrix (EDM).
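For readers unfamiliar with the Euclidean distance matrix, the sketch below builds one from a few 3D joint positions, using the common convention that entries are squared pairwise distances. The function name and the joint coordinates are hypothetical, and this is not the paper's estimation pipeline; it only shows the representation and its invariance to translation.

```python
def euclidean_distance_matrix(points):
    """Return the symmetric matrix of squared pairwise distances."""
    n = len(points)
    edm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            edm[i][j] = edm[j][i] = d2
    return edm


# Hypothetical 3D joint positions.
joints = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
edm = euclidean_distance_matrix(joints)

# Translating every joint by the same offset leaves the matrix unchanged.
shifted = [(x + 10.0, y - 3.0, z + 2.0) for x, y, z in joints]
edm_shifted = euclidean_distance_matrix(shifted)
```

Because the matrix depends only on inter-joint distances, it does not change under rigid motions of the whole skeleton.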
no code implementations • CVPR 2018 • Jing Zhang, Tong Zhang, Yuchao Dai, Mehrtash Harandi, Richard Hartley
Such supervision, while labor-intensive and not always possible, tends to hinder the generalization ability of the learned models.
no code implementations • CVPR 2018 • Suryansh Kumar, Anoop Cherian, Yuchao Dai, Hongdong Li
To address these issues, in this paper, we propose a new approach for dense NRSfM by modeling the problem on a Grassmann manifold.
no code implementations • 27 Nov 2017 • Liyuan Pan, Yuchao Dai, Miaomiao Liu, Fatih Porikli
In this paper, we propose to tackle the problem of depth map completion by jointly exploiting the blurry color image sequences and the sparse depth map measurements, and present an energy minimization based formulation to simultaneously complete the depth maps, estimate the scene flow and deblur the color images.
no code implementations • ICCV 2017 • Liu Liu, Hongdong Li, Yuchao Dai
In this paper, we introduce a global method which harnesses global contextual information exhibited both within the query image and among all the 3D points in the map.
no code implementations • 4 Sep 2017 • Yiran Zhong, Yuchao Dai, Hongdong Li
Existing deep-learning based dense stereo matching methods often rely on ground-truth disparity maps as the training signals, which are, however, not always available in many situations.
no code implementations • ICCV 2017 • Suryansh Kumar, Yuchao Dai, Hongdong Li
This paper proposes a new approach for monocular dense 3D reconstruction of a complex dynamic scene from two perspective frames.
no code implementations • 15 Aug 2017 • Jing Zhang, Yuchao Dai, Fatih Porikli, Mingyi He
There has been profound progress in visual saliency thanks to deep learning architectures; however, there still exist three major challenges that hinder the detection performance for scenes with complex compositions, multiple salient objects, and salient objects of diverse scales.
1 code implementation • 2 Aug 2017 • Bo Li, Yuchao Dai, Mingyi He
Extensive experiments on the NYU Depth V2 and KITTI datasets show the superiority of our method compared with current state-of-the-art methods.
no code implementations • ICCV 2017 • Pan Ji, Hongdong Li, Yuchao Dai, Ian Reid
Rigid structure-from-motion (RSfM) and non-rigid structure-from-motion (NRSfM) have long been treated in the literature as separate (different) problems.
no code implementations • 12 Jul 2017 • Dingfu Zhou, Yuchao Dai, Hongdong Li
First, we prove that there indeed exist enough degrees of freedom to apply pixel-wise local homography for stereo rectification.
no code implementations • 27 Jun 2017 • Yuchao Dai, Huizhong Deng, Mingyi He
Second, we propose to exploit the spatial smoothness by resorting to the Laplacian of the 3D non-rigid shape.
no code implementations • 2 Jun 2017 • Jing Zhang, Bo Li, Yuchao Dai, Fatih Porikli, Mingyi He
Then the results from deep FCNN and RBD are concatenated to feed into a shallow network to map the concatenated feature maps to saliency maps.
no code implementations • 14 May 2017 • Suryansh Kumar, Yuchao Dai, Hongdong Li
This spatio-temporal representation not only provides competitive 3D reconstruction but also outputs robust segmentation of multiple non-rigid objects.
1 code implementation • 27 Apr 2017 • Bo Li, Yuchao Dai, Huahui Chen, Mingyi He
This paper proposes a new residual convolutional neural network (CNN) architecture for single image depth estimation.
no code implementations • 19 Apr 2017 • Bo Li, Huahui Chen, Yu-cheng Chen, Yuchao Dai, Mingyi He
However, due to the difficulty in representing the 3D skeleton video and the lack of training data, action detection from streaming 3D skeleton video still lags far behind its recognition counterpart and image based object detection.
no code implementations • 19 Apr 2017 • Bo Li, Mingyi He, Xuelian Cheng, Yu-cheng Chen, Yuchao Dai
In particular, on the large and challenging NTU RGB+D, UTD-MHAD, and MSRC-12 datasets, our method outperforms other methods by a large margin, which demonstrates the efficacy of the proposed method.
Ranked #92 on Skeleton Based Action Recognition on NTU RGB+D
no code implementations • CVPR 2017 • Liyuan Pan, Yuchao Dai, Miaomiao Liu, Fatih Porikli
Unlike the existing approach [31], which used a pre-computed scene flow, we propose a single framework to jointly estimate the scene flow and deblur the image, where the motion cues from scene flow estimation and the blur information reinforce each other and produce results superior to those of conventional scene flow estimation or stereo deblurring methods.
no code implementations • 15 Jul 2016 • Suryansh Kumar, Yuchao Dai, Hongdong Li
Recent progress has extended SFM to the areas of multi-body SFM (where there are multiple rigid relative motions in the scene), as well as non-rigid SFM (where there is a single non-rigid, deformable object or scene).
no code implementations • 7 Jul 2016 • Xibin Song, Yuchao Dai, Xueying Qin
In this paper, we bridge the gap and extend the success of deep convolutional neural networks to depth super-resolution.
no code implementations • 12 May 2016 • Liu Liu, Hongdong Li, Yuchao Dai
When the solver is used in combination with RANSAC, we are able to quickly prune unpromising hypotheses, significantly improving the chance of finding inliers.
no code implementations • CVPR 2016 • Jiaolong Yang, Hongdong Li, Yuchao Dai, Robby T. Tan
This paper deals with a challenging, frequently encountered, yet not properly investigated problem in two-frame optical flow estimation.
no code implementations • CVPR 2016 • Yuchao Dai, Hongdong Li, Laurent Kneip
The vast majority of modern consumer-grade cameras employ a rolling shutter mechanism.
no code implementations • CVPR 2015 • Bo Li, Chunhua Shen, Yuchao Dai, Anton Van Den Hengel, Mingyi He
Predicting the depth (or surface normal) of a scene from single monocular color images is a challenging task.