Search Results for author: Pan Ji

Found 37 papers, 12 papers with code

ConsistNet: Enforcing 3D Consistency for Multi-view Images Diffusion

1 code implementation 16 Oct 2023 Jiayu Yang, Ziang Cheng, Yunfei Duan, Pan Ji, Hongdong Li

Given a single image of a 3D object, this paper proposes a novel method (named ConsistNet) that is able to generate multiple images of the same object, as if they were captured from different viewpoints, while the 3D (multi-view) consistencies among those multiple generated images are effectively exploited.

Depth Estimation Depth Prediction +1

RGB-based Category-level Object Pose Estimation via Decoupled Metric Scale Recovery

1 code implementation 19 Sep 2023 Jiaxin Wei, Xibin Song, Weizhe Liu, Laurent Kneip, Hongdong Li, Pan Ji

While showing promising results, recent RGB-D camera-based category-level object pose estimation methods have restricted applications due to the heavy reliance on depth sensors.

Pose Estimation

Dynamic Voxel Grid Optimization for High-Fidelity RGB-D Supervised Surface Reconstruction

no code implementations 12 Apr 2023 Xiangyu Xu, Lichang Chen, Changjiang Cai, Huangying Zhan, Qingan Yan, Pan Ji, Junsong Yuan, Heng Huang, Yi Xu

Direct optimization of interpolated features on multi-resolution voxel grids has emerged as a more efficient alternative to MLP-like modules.

Surface Reconstruction

CLIP-FLow: Contrastive Learning by semi-supervised Iterative Pseudo labeling for Optical Flow Estimation

no code implementations 25 Oct 2022 Zhiqi Zhang, Nitin Bansal, Changjiang Cai, Pan Ji, Qingan Yan, Xiangyu Xu, Yi Xu

To this end, we propose CLIP-FLow, a semi-supervised iterative pseudo-labeling framework to transfer the pretraining knowledge to the target real domain.

Contrastive Learning Optical Flow Estimation +1

MonoIndoor++: Towards Better Practice of Self-Supervised Monocular Depth Estimation for Indoor Environments

no code implementations 18 Jul 2022 Runze Li, Pan Ji, Yi Xu, Bir Bhanu

As compared to outdoor environments, estimating depth of monocular videos for indoor environments, using self-supervised methods, results in two additional challenges: (i) the depth range of indoor video sequences varies a lot across different frames, making it difficult for the depth network to induce consistent depth cues for training; (ii) the indoor sequences recorded with handheld devices often contain much more rotational motions, which cause difficulties for the pose network to predict accurate relative camera poses.

Depth Prediction Monocular Depth Estimation +1

Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth

no code implementations 21 Jun 2022 Nitin Bansal, Pan Ji, Junsong Yuan, Yi Xu

The multi-task learning (MTL) paradigm focuses on jointly learning two or more tasks, aiming for significant improvement w.r.t. a model's generalizability, performance, and training/inference memory footprint.

Data Augmentation Depth Estimation +3

RIAV-MVS: Recurrent-Indexing an Asymmetric Volume for Multi-View Stereo

no code implementations CVPR 2023 Changjiang Cai, Pan Ji, Qingan Yan, Yi Xu

At the pixel level, we propose to break the symmetry of the Siamese network (which is typically used in MVS to extract image features) by introducing a transformer block to the reference image (but not to the source images).

Depth Estimation

FisheyeDistill: Self-Supervised Monocular Depth Estimation with Ordinal Distillation for Fisheye Cameras

no code implementations 5 May 2022 Qingan Yan, Pan Ji, Nitin Bansal, Yuxin Ma, Yuan Tian, Yi Xu

In this paper, we deal with the problem of monocular depth estimation for fisheye cameras in a self-supervised manner.

Monocular Depth Estimation

CNN-Augmented Visual-Inertial SLAM with Planar Constraints

no code implementations 5 May 2022 Pan Ji, Yuan Tian, Qingan Yan, Yuxin Ma, Yi Xu

The CNN depth effectively bootstraps the back-end optimization of SLAM and meanwhile the CNN uncertainty adaptively weighs the contribution of each feature point to the back-end optimization.

GeoRefine: Self-Supervised Online Depth Refinement for Accurate Dense Mapping

no code implementations 3 May 2022 Pan Ji, Qingan Yan, Yuxin Ma, Yi Xu

We present a robust and accurate depth refinement system, named GeoRefine, for geometrically-consistent dense mapping from monocular sequences.

Optical Flow Estimation

PlaneMVS: 3D Plane Reconstruction from Multi-View Stereo

no code implementations CVPR 2022 Jiachen Liu, Pan Ji, Nitin Bansal, Changjiang Cai, Qingan Yan, Xiaolei Huang, Yi Xu

The semantic plane detection branch is based on a single-view plane detection framework but with differences.

3D Reconstruction

Deformable VisTR: Spatio temporal deformable attention for video instance segmentation

no code implementations 12 Mar 2022 Sudhir Yarram, Jialian Wu, Pan Ji, Yi Xu, Junsong Yuan

To improve the training efficiency, we propose Deformable VisTR, leveraging a spatio-temporal deformable attention module that only attends to a small fixed set of key spatio-temporal sampling points around a reference point.

Instance Segmentation Semantic Segmentation +1

MonoIndoor: Towards Good Practice of Self-Supervised Monocular Depth Estimation for Indoor Environments

no code implementations ICCV 2021 Pan Ji, Runze Li, Bir Bhanu, Yi Xu

The effectiveness of each module is shown through a carefully conducted ablation study and the demonstration of the state-of-the-art performance on three indoor datasets, i.e., EuRoC, NYUv2, and 7-scenes.

Monocular Depth Estimation Pose Estimation

Disentangling Noise from Images: A Flow-Based Image Denoising Neural Network

1 code implementation 11 May 2021 Yang Liu, Saeed Anwar, Zhenyue Qin, Pan Ji, Sabrina Caldwell, Tom Gedeon

The prevalent convolutional neural network (CNN) based image denoising methods extract features of images to restore the clean ground truth, achieving high denoising accuracy.

Disentanglement Image Denoising

Fusing Higher-order Features in Graph Neural Networks for Skeleton-based Action Recognition

1 code implementation 4 May 2021 Zhenyue Qin, Yang Liu, Pan Ji, Dongwoo Kim, Lei Wang, Bob McKay, Saeed Anwar, Tom Gedeon

Recent skeleton-based action recognition methods extract features from 3D joint coordinates as spatial-temporal cues, using these representations in a graph neural network for feature fusion to boost recognition performance.

Action Recognition Skeleton Based Action Recognition

Set Augmented Triplet Loss for Video Person Re-Identification

no code implementations 2 Nov 2020 Pengfei Fang, Pan Ji, Lars Petersson, Mehrtash Harandi

Modern video person re-identification (re-ID) machines are often trained using a metric learning approach, supervised by a triplet loss.

Metric Learning Video-Based Person Re-Identification

Displacement-Invariant Matching Cost Learning for Accurate Optical Flow Estimation

2 code implementations NeurIPS 2020 Jianyuan Wang, Yiran Zhong, Yuchao Dai, Kaihao Zhang, Pan Ji, Hongdong Li

Learning matching costs has been shown to be critical to the success of the state-of-the-art deep stereo matching methods, in which 3D convolutions are applied on a 4D feature volume to learn a 3D cost volume.

Optical Flow Estimation Stereo Matching

Channel Recurrent Attention Networks for Video Pedestrian Retrieval

no code implementations 7 Oct 2020 Pengfei Fang, Pan Ji, Jieming Zhou, Lars Petersson, Mehrtash Harandi

Full attention, which generates an attention value per element of the input feature maps, has been successfully demonstrated to be beneficial in visual tasks.

Person Retrieval Retrieval

Cross-Modality 3D Object Detection

no code implementations 16 Aug 2020 Ming Zhu, Chao Ma, Pan Ji, Xiaokang Yang

In this paper, we focus on exploring the fusion of images and point clouds for 3D object detection in view of the complementary nature of the two modalities, i.e., images possess more semantic information while point clouds specialize in distance sensing.

3D Classification 3D Object Detection +3

Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction

1 code implementation ECCV 2020 Lokender Tiwari, Pan Ji, Quoc-Huy Tran, Bingbing Zhuang, Saket Anand, Manmohan Chandraker

Classical monocular Simultaneous Localization And Mapping (SLAM) and the recently emerging convolutional neural networks (CNNs) for monocular depth prediction represent two largely disjoint approaches towards building a 3D map of the surrounding environment.

Depth Estimation Depth Prediction +1

Neural Collaborative Subspace Clustering

no code implementations 24 Apr 2019 Tong Zhang, Pan Ji, Mehrtash Harandi, Wenbing Huang, Hongdong Li

We introduce the Neural Collaborative Subspace Clustering, a neural model that discovers clusters of data points drawn from a union of low-dimensional subspaces.


Noise-Aware Unsupervised Deep Lidar-Stereo Fusion

3 code implementations CVPR 2019 Xuelian Cheng, Yiran Zhong, Yuchao Dai, Pan Ji, Hongdong Li

In this paper, we present LidarStereoNet, the first unsupervised Lidar-stereo fusion network, which can be trained in an end-to-end manner without the need for ground truth depth maps.

Depth Completion Stereo Matching +1

Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes

no code implementations CVPR 2019 Yiran Zhong, Pan Ji, Jianyuan Wang, Yuchao Dai, Hongdong Li

In this paper, we propose Deep Epipolar Flow, an unsupervised optical flow method which incorporates global geometric constraints into network learning.

Benchmarking Optical Flow Estimation

Scalable Deep $k$-Subspace Clustering

no code implementations 2 Nov 2018 Tong Zhang, Pan Ji, Mehrtash Harandi, Richard Hartley, Ian Reid

In this paper, we introduce a method that simultaneously learns an embedding space along subspaces within it to minimize a notion of reconstruction error, thus addressing the problem of subspace clustering in an end-to-end learning paradigm.


Adaptive Low-Rank Kernel Subspace Clustering

1 code implementation 17 Jul 2017 Pan Ji, Ian Reid, Ravi Garg, Hongdong Li, Mathieu Salzmann

In this paper, we present a kernel subspace clustering method that can handle non-linear models.

Clustering Image Clustering +1

Robust Multi-body Feature Tracker: A Segmentation-free Approach

no code implementations CVPR 2016 Pan Ji, Hongdong Li, Mathieu Salzmann, Yiran Zhong

Feature tracking is a fundamental problem in computer vision, with applications in many computer vision tasks, such as visual SLAM and action recognition.

Action Recognition Motion Segmentation +2

Shape Interaction Matrix Revisited and Robustified: Efficient Subspace Clustering with Corrupted and Incomplete Data

1 code implementation ICCV 2015 Pan Ji, Mathieu Salzmann, Hongdong Li

The Shape Interaction Matrix (SIM) is one of the earliest approaches to performing subspace clustering (i.e., separating points drawn from a union of subspaces).

Clustering Face Clustering +1
