Search Results for author: Hongdong Li

Found 156 papers, 72 papers with code

Winding Number for Region-Boundary Consistent Salient Contour Extraction

no code implementations • CVPR 2013 • Yansheng Ming, Hongdong Li, Xuming He

The main focus is given to how to maintain the consistency (compatibility) between the region cues and the boundary cues.

Boundary Detection Segmentation

Efficient Computation of Relative Pose for Multi-Camera Systems

no code implementations • CVPR 2014 • Laurent Kneip, Hongdong Li

We present a novel solution to compute the relative pose of a generalized camera.

Computational Efficiency Motion Estimation

Expanding the Family of Grassmannian Kernels: An Embedding Perspective

no code implementations • 4 Jul 2014 • Mehrtash T. Harandi, Mathieu Salzmann, Sadeep Jayasumana, Richard Hartley, Hongdong Li

Modeling videos and image-sets as linear subspaces has proven beneficial for many visual recognition tasks.

Clustering

Iteratively Reweighted Graph Cut for Multi-label MRFs with Non-convex Priors

no code implementations • CVPR 2015 • Thalaiyasingam Ajanthan, Richard Hartley, Mathieu Salzmann, Hongdong Li

While widely acknowledged as highly effective in computer vision, multi-label MRFs with non-convex priors are difficult to optimize.

Kernel Methods on Riemannian Manifolds with Gaussian RBF Kernels

no code implementations • 30 Nov 2014 • Sadeep Jayasumana, Richard Hartley, Mathieu Salzmann, Hongdong Li, Mehrtash Harandi

We then use the proposed framework to identify positive definite kernels on two specific manifolds commonly encountered in computer vision: the Riemannian manifold of symmetric positive definite matrices and the Grassmann manifold, i.e., the Riemannian manifold of linear subspaces of a Euclidean space.

Kernel Methods on the Riemannian Manifold of Symmetric Positive Definite Matrices

no code implementations • CVPR 2013 • Sadeep Jayasumana, Richard Hartley, Mathieu Salzmann, Hongdong Li, Mehrtash Harandi

To encode the geometry of the manifold in the mapping, we introduce a family of provably positive definite kernels on the Riemannian manifold of SPD matrices.

Motion Segmentation Pedestrian Detection +2

Optimizing Over Radial Kernels on Compact Manifolds

no code implementations • CVPR 2014 • Sadeep Jayasumana, Richard Hartley, Mathieu Salzmann, Hongdong Li, Mehrtash Harandi

We tackle the problem of optimizing over all possible positive definite radial kernels on Riemannian manifolds for classification.

General Classification

A Framework for Shape Analysis via Hilbert Space Embedding

no code implementations • 13 Dec 2014 • Sadeep Jayasumana, Mathieu Salzmann, Hongdong Li, Mehrtash Harandi

We propose a framework for 2D shape analysis using positive definite kernels defined on Kendall's shape manifold.

Clustering General Classification +1

Dense, Accurate Optical Flow Estimation With Piecewise Parametric Model

no code implementations • CVPR 2015 • Jiaolong Yang, Hongdong Li

This paper proposes a simple method for estimating a dense and accurate optical flow field.

Optical Flow Estimation

Tracking Randomly Moving Objects on Edge Box Proposals

no code implementations • 29 Jul 2015 • Gao Zhu, Fatih Porikli, Hongdong Li

Our method generates a small number of "high-quality" proposals by a novel instance-specific objectness measure and evaluates them against the object model that can be adopted from an existing tracking-by-detection approach as a core tracker.

Object valid

Shape Interaction Matrix Revisited and Robustified: Efficient Subspace Clustering with Corrupted and Incomplete Data

1 code implementation • ICCV 2015 • Pan Ji, Mathieu Salzmann, Hongdong Li

The Shape Interaction Matrix (SIM) is one of the earliest approaches to performing subspace clustering (i.e., separating points drawn from a union of subspaces).

Ranked #2 on Motion Segmentation on Hopkins155

Clustering Face Clustering +1

Robust Multi-body Feature Tracker: A Segmentation-free Approach

no code implementations • CVPR 2016 • Pan Ji, Hongdong Li, Mathieu Salzmann, Yiran Zhong

Feature tracking is a fundamental problem in computer vision, with applications in many computer vision tasks, such as visual SLAM and action recognition.

Action Recognition Motion Segmentation +2

Neural Aggregation Network for Video Face Recognition

no code implementations • CVPR 2017 • Jiaolong Yang, Peiran Ren, Dong-Qing Zhang, Dong Chen, Fang Wen, Hongdong Li, Gang Hua

The network takes a face video or face image set of a person with a variable number of face images as its input, and produces a compact, fixed-dimension feature representation for recognition.

Ranked #7 on Face Verification on IJB-A

Face Recognition Face Verification

Learning Image Matching by Simply Watching Video

no code implementations • 19 Mar 2016 • Gucan Long, Laurent Kneip, Jose M. Alvarez, Hongdong Li

This work presents an unsupervised learning based approach to the ubiquitous computer vision problem of image matching.

Rolling Shutter Camera Relative Pose: Generalized Epipolar Geometry

no code implementations • CVPR 2016 • Yuchao Dai, Hongdong Li, Laurent Kneip

The vast majority of modern consumer-grade cameras employ a rolling shutter mechanism.

Beyond Local Search: Tracking Objects Everywhere with Instance-Specific Proposals

no code implementations • CVPR 2016 • Gao Zhu, Fatih Porikli, Hongdong Li

Object valid

Robust Optical Flow Estimation of Double-Layer Images under Transparency or Reflection

no code implementations • CVPR 2016 • Jiaolong Yang, Hongdong Li, Yuchao Dai, Robby T. Tan

This paper deals with a challenging, frequently encountered, yet not properly investigated problem in two-frame optical flow estimation.

Optical Flow Estimation valid

Go-ICP: A Globally Optimal Solution to 3D ICP Point-Set Registration

no code implementations • 11 May 2016 • Jiaolong Yang, Hongdong Li, Dylan Campbell, Yunde Jia

The evaluation demonstrates that the proposed method is able to produce reliable registration results regardless of the initialization.

Ranked #6 on Point Cloud Registration on FP-O-H

Image to Point Cloud Registration

Robust and Efficient Relative Pose with a Multi-camera System for Autonomous Vehicle in Highly Dynamic Environments

no code implementations • 12 May 2016 • Liu Liu, Hongdong Li, Yuchao Dai

When the solver is used in combination with RANSAC, we are able to quickly prune unpromising hypotheses, significantly improving the chance of finding inliers.

Motion Estimation

Multi-body Non-rigid Structure-from-Motion

no code implementations • 15 Jul 2016 • Suryansh Kumar, Yuchao Dai, Hongdong Li

Recent progress has extended SFM to the areas of multi-body SFM (where there are multiple rigid relative motions in the scene), as well as non-rigid SFM (where there is a single non-rigid, deformable object or scene).

3D Reconstruction Clustering

Semi-Dense Visual Odometry for RGB-D Cameras Using Approximate Nearest Neighbour Fields

no code implementations • 6 Feb 2017 • Yi Zhou, Laurent Kneip, Hongdong Li

This paper presents a robust and efficient semi-dense visual odometry solution for RGB-D cameras.

Visual Odometry

Spatial-Temporal Union of Subspaces for Multi-body Non-rigid Structure-from-Motion

no code implementations • 14 May 2017 • Suryansh Kumar, Yuchao Dai, Hongdong Li

This spatio-temporal representation not only provides competitive 3D reconstruction but also outputs robust segmentation of multiple non-rigid objects.

3D Reconstruction

Pixel-variant Local Homography for Fisheye Stereo Rectification Minimizing Resampling Distortion

no code implementations • 12 Jul 2017 • Dingfu Zhou, Yuchao Dai, Hongdong Li

First, we prove that there indeed exist enough degrees of freedom to apply pixel-wise local homography for stereo rectification.

3D Reconstruction Stereo Matching +1

"Maximizing rigidity" revisited: a convex programming approach for generic 3D shape reconstruction from multiple perspective views

no code implementations • ICCV 2017 • Pan Ji, Hongdong Li, Yuchao Dai, Ian Reid

Rigid structure-from-motion (RSfM) and non-rigid structure-from-motion (NRSfM) have long been treated in the literature as separate (different) problems.

3D Reconstruction 3D Shape Reconstruction

Adaptive Low-Rank Kernel Subspace Clustering

1 code implementation • 17 Jul 2017 • Pan Ji, Ian Reid, Ravi Garg, Hongdong Li, Mathieu Salzmann

In this paper, we present a kernel subspace clustering method that can handle non-linear models.

Clustering Image Clustering +1

Monocular Dense 3D Reconstruction of a Complex Dynamic Scene from Two Perspective Frames

no code implementations • ICCV 2017 • Suryansh Kumar, Yuchao Dai, Hongdong Li

This paper proposes a new approach for monocular dense 3D reconstruction of a complex dynamic scene from two perspective frames.

3D Reconstruction Dynamic Reconstruction +3

Self-Supervised Learning for Stereo Matching with Self-Improving Ability

no code implementations • 4 Sep 2017 • Yiran Zhong, Yuchao Dai, Hongdong Li

Existing deep-learning-based dense stereo matching methods often rely on ground-truth disparity maps as the training signals, which, however, are not available in many situations.

Self-Supervised Learning Stereo Matching +1

Deep Subspace Clustering Networks

3 code implementations • NeurIPS 2017 • Pan Ji, Tong Zhang, Hongdong Li, Mathieu Salzmann, Ian Reid

We present a novel deep neural network architecture for unsupervised subspace clustering.

Ranked #3 on Image Clustering on Extended Yale-B

Clustering Image Clustering

203

Globally-Optimal Inlier Set Maximisation for Simultaneous Camera Pose and Feature Correspondence

no code implementations • ICCV 2017 • Dylan Campbell, Lars Petersson, Laurent Kneip, Hongdong Li

Estimating the 6-DoF pose of a camera from a single image relative to a pre-computed 3D point-set is an important task for many computer vision applications.

Pose Estimation

Efficient Global 2D-3D Matching for Camera Localization in a Large-Scale 3D Map

no code implementations • ICCV 2017 • Liu Liu, Hongdong Li, Yuchao Dai

In this paper, we introduce a global method which harnesses global contextual information exhibited both within the query image and among all the 3D points in the map.

3D Feature Matching Camera Localization

Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective

no code implementations • CVPR 2018 • Suryansh Kumar, Anoop Cherian, Yuchao Dai, Hongdong Li

To address these issues, in this paper, we propose a new approach for dense NRSfM by modeling the problem on a Grassmann manifold.

Adversarial Spatio-Temporal Learning for Video Deblurring

1 code implementation • 28 Mar 2018 • Kaihao Zhang, Wenhan Luo, Yiran Zhong, Lin Ma, Wei Liu, Hongdong Li

To tackle the second challenge, we leverage the developed DBLRNet as a generator in the GAN (generative adversarial network) architecture, and employ a content loss in addition to an adversarial loss for efficient adversarial training.

Deblurring Generative Adversarial Network

Semantic Single-Image Dehazing

no code implementations • 16 Apr 2018 • Ziang Cheng, ShaoDi You, Viorela Ila, Hongdong Li

In experiments, we validate our approach upon synthetic and real hazy images, where our method showed superior performance over state-of-the-art approaches, suggesting semantic information facilitates the haze removal task.

Image Dehazing Single Image Dehazing +1

Structure from Recurrent Motion: From Rigidity to Recurrency

no code implementations • CVPR 2018 • Xiu Li, Hongdong Li, Hanbyul Joo, Yebin Liu, Yaser Sheikh

This paper proposes a new method for Non-Rigid Structure-from-Motion (NRSfM) from a long monocular video sequence observing a non-rigid object performing recurrent and possibly repetitive dynamic action.

Clustering

Semi-Dense 3D Reconstruction with a Stereo Event Camera

2 code implementations • ECCV 2018 • Yi Zhou, Guillermo Gallego, Henri Rebecq, Laurent Kneip, Hongdong Li, Davide Scaramuzza

Event cameras are bio-inspired sensors that offer several advantages, such as low latency, high-speed and high dynamic range, to tackle challenging scenarios in computer vision.

3D Reconstruction Simultaneous Localization and Mapping

410

Action Anticipation By Predicting Future Dynamic Images

no code implementations • 1 Aug 2018 • Cristian Rodriguez, Basura Fernando, Hongdong Li

Human action-anticipation methods predict the future action by observing only a small portion of an action in progress.

Action Anticipation Autonomous Driving +1

Open-World Stereo Video Matching with Deep RNN

no code implementations • ECCV 2018 • Yiran Zhong, Hongdong Li, Yuchao Dai

Deep Learning based stereo matching methods have shown great successes and achieved top scores across different benchmarks.

Stereo Matching Stereo Matching Hand

3D Geometry-Aware Semantic Labeling of Outdoor Street Scenes

no code implementations • 13 Aug 2018 • Yiran Zhong, Yuchao Dai, Hongdong Li

This paper is concerned with the problem of how to better exploit 3D geometric information for dense semantic image labeling.

Stereo Computation for a Single Mixture Image

no code implementations • ECCV 2018 • Yiran Zhong, Yuchao Dai, Hongdong Li

This paper proposes an original problem of stereo computation from a single mixture image, a challenging problem that had not been researched before.

Stereo Matching Stereo Matching Hand +1

Stochastic Attraction-Repulsion Embedding for Large Scale Image Localization

5 code implementations • ICCV 2019 • Liu Liu, Hongdong Li, Yuchao Dai

This paper tackles the problem of large-scale image-based localization (IBL) where the spatial location of a query image is determined by finding out the most similar reference images in a large database.

Image-Based Localization Representation Learning +1

264

ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving

no code implementations • CVPR 2019 • Xibin Song, Peng Wang, Dingfu Zhou, Rui Zhu, Chenye Guan, Yuchao Dai, Hao Su, Hongdong Li, Ruigang Yang

Specifically, we first segment each car with a pre-trained Mask R-CNN, and then regress towards its 3D pose and shape based on a deformable 3D car model with or without using semantic keypoints.

3D Car Instance Understanding Autonomous Driving

The Alignment of the Spheres: Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation

no code implementations • CVPR 2019 • Dylan Campbell, Lars Petersson, Laurent Kneip, Hongdong Li, Stephen Gould

Determining the position and orientation of a calibrated camera from a single image with respect to a 3D model is an essential task for many applications.

Pose Estimation

Dense Depth Estimation of a Complex Dynamic Scene without Explicit 3D Motion Estimation

no code implementations • 11 Feb 2019 • Suryansh Kumar, Ram Srivatsav Ghorakavi, Yuchao Dai, Hongdong Li

Given per-pixel optical flow correspondences between two consecutive frames and the sparse depth prior for the reference frame, we show that we can effectively recover the dense depth map for the successive frames without solving for 3D motion parameters.

Depth Estimation Motion Estimation +1

Breaking the Spatio-Angular Trade-off for Light Field Super-Resolution via LSTM Modelling on Epipolar Plane Images

no code implementations • 15 Feb 2019 • Hao Zhu, Mantang Guo, Hongdong Li, Qing Wang, Antonio Robles-Kelly

We prove that the light field is a 2D series; thus, a specifically designed CNN-LSTM network is proposed to capture the continuity property of the EPI.

Super-Resolution

Ground Plane based Absolute Scale Estimation for Monocular Visual Odometry

no code implementations • 3 Mar 2019 • Dingfu Zhou, Yuchao Dai, Hongdong Li

Recovering the absolute metric scale from a monocular camera is a challenging but highly desirable problem for monocular camera-based systems.

Monocular Visual Odometry

Lending Orientation to Neural Networks for Cross-view Geo-localization

1 code implementation • CVPR 2019 • Liu Liu, Hongdong Li

The goal is to predict the spatial location of a ground-level query image by matching it to a large geotagged aerial image database (e.g., satellite imagery).

Deep Stacked Hierarchical Multi-patch Network for Image Deblurring

1 code implementation • CVPR 2019 • Hongguang Zhang, Yuchao Dai, Hongdong Li, Piotr Koniusz

depth, we propose a stacked version of our multi-patch model.

Ranked #9 on Deblurring on RealBlur-R (trained on GoPro) (SSIM (sRGB) metric)

Deblurring Image Deblurring

186

Noise-Aware Unsupervised Deep Lidar-Stereo Fusion

3 code implementations • CVPR 2019 • Xuelian Cheng, Yiran Zhong, Yuchao Dai, Pan Ji, Hongdong Li

In this paper, we present LidarStereoNet, the first unsupervised Lidar-stereo fusion network, which can be trained in an end-to-end manner without the need of ground truth depth maps.

Depth Completion Stereo Matching +1

Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes

no code implementations • CVPR 2019 • Yiran Zhong, Pan Ji, Jianyuan Wang, Yuchao Dai, Hongdong Li

In this paper, we propose Deep Epipolar Flow, an unsupervised optical flow method which incorporates global geometric constraints into network learning.

Benchmarking Optical Flow Estimation

Neural Collaborative Subspace Clustering

no code implementations • 24 Apr 2019 • Tong Zhang, Pan Ji, Mehrtash Harandi, Wenbing Huang, Hongdong Li

We introduce the Neural Collaborative Subspace Clustering, a neural model that discovers clusters of data points drawn from a union of low-dimensional subspaces.

Clustering

Optimal Feature Transport for Cross-View Image Geo-Localization

1 code implementation • 11 Jul 2019 • Yujiao Shi, Xin Yu, Liu Liu, Tong Zhang, Hongdong Li

This paper proposes a novel Cross-View Feature Transport (CVFT) technique to explicitly establish cross-view domain transfer that facilitates feature alignment between ground and aerial images.

Image-Based Localization Metric Learning

Learning Trajectory Dependencies for Human Motion Prediction

5 code implementations • ICCV 2019 • Wei Mao, Miaomiao Liu, Mathieu Salzmann, Hongdong Li

In this paper, we propose a simple feed-forward deep network for motion prediction, which takes into account both temporal smoothness and spatial dependencies among human body joints.

Ranked #5 on Multi-Person Pose forecasting on Expi - unseen actions split

Human motion prediction Human Pose Forecasting +2

226

Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention

1 code implementation • 20 Aug 2019 • Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Fatemeh Sadat Saleh, Hongdong Li, Stephen Gould

Given an untrimmed video and a sentence as the query, the goal is to determine the start and end of the relevant visual moment in the video that corresponds to the query sentence.

Sentence

Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison

2 code implementations • 24 Oct 2019 • Dongxu Li, Cristian Rodriguez Opazo, Xin Yu, Hongdong Li

Based on this new large-scale dataset, we are able to experiment with several deep learning methods for word-level sign recognition and evaluate their performance in large-scale scenarios.

Ranked #3 on Sign Language Recognition on WLASL100

Action Classification Benchmarking +3

737

Superpixel Soup: Monocular Dense 3D Reconstruction of a Complex Dynamic Scene

no code implementations • 19 Nov 2019 • Suryansh Kumar, Yuchao Dai, Hongdong Li

We assume that a dynamic scene can be approximated by numerous piecewise planar surfaces, where each planar surface enjoys its own rigid motion, and the global change in the scene between two frames is as-rigid-as-possible (ARAP).

3D Reconstruction

Spatial-Aware Feature Aggregation for Image based Cross-View Geo-Localization

1 code implementation • NeurIPS 2019 • Yujiao Shi, Liu Liu, Xin Yu, Hongdong Li

The first step is to apply a regular polar transform to warp an aerial image such that its domain is closer to that of a ground-view panorama.

Ranked #4 on Image-Based Localization on VIGOR Cross Area

Image-Based Localization

Rethinking Class Relations: Absolute-relative Supervised and Unsupervised Few-shot Learning

1 code implementation • CVPR 2021 • Hongguang Zhang, Piotr Koniusz, Songlei Jian, Hongdong Li, Philip H. S. Torr

The majority of existing few-shot learning methods describe image relations with binary labels.

Ranked #10 on Unsupervised Few-Shot Image Classification on Tiered ImageNet 5-way (5-shot)

Relation Unsupervised Few-Shot Image Classification +1

Few-shot Action Recognition with Permutation-invariant Attention

1 code implementation • ECCV 2020 • Hongguang Zhang, Li Zhang, Xiaojuan Qi, Hongdong Li, Philip H. S. Torr, Piotr Koniusz

Such encoded blocks are aggregated by permutation-invariant pooling to make our approach robust to varying action lengths and long-range temporal dependencies whose patterns are unlikely to repeat even in clips of the same class.

Ranked #6 on Few Shot Action Recognition on Kinetics-100

Few-Shot action recognition Few Shot Action Recognition +3

6DoF Object Pose Estimation via Differentiable Proxy Voting Loss

no code implementations • 10 Feb 2020 • Xin Yu, Zheyu Zhuang, Piotr Koniusz, Hongdong Li

In this paper, we aim to reduce such errors by incorporating the distances between pixels and keypoints into our objective.

Object Pose Estimation

Transferring Cross-domain Knowledge for Video Sign Language Recognition

no code implementations • CVPR 2020 • Dongxu Li, Xin Yu, Chenchen Xu, Lars Petersson, Hongdong Li

To this end, we extract news signs using a base WSLR model, and then design a classifier jointly trained on news and isolated signs to coarsely align these two domain features.

Sign Language Recognition

Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point Problem

1 code implementation • 15 Mar 2020 • Liu Liu, Dylan Campbell, Hongdong Li, Dingfu Zhou, Xibin Song, Ruigang Yang

Conventional absolute camera pose via a Perspective-n-Point (PnP) solver often assumes that the correspondences between 2D image pixels and 3D points are given.

132

Deblurring by Realistic Blurring

1 code implementation • CVPR 2020 • Kaihao Zhang, Wenhan Luo, Yiran Zhong, Lin Ma, Bjorn Stenger, Wei Liu, Hongdong Li

To address this problem, we propose a new method which combines two GAN models, i.e., a learning-to-Blur GAN (BGAN) and a learning-to-DeBlur GAN (DBGAN), in order to learn a better model for image deblurring by primarily learning how to blur images.

Ranked #17 on Deblurring on HIDE (trained on GOPRO)

Deblurring Image Deblurring

Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching

1 code implementation • CVPR 2020 • Yujiao Shi, Xin Yu, Dylan Campbell, Hongdong Li

Cross-view geo-localization is the problem of estimating the position and orientation (latitude, longitude and azimuth angle) of a camera at ground level given a large-scale database of geo-tagged aerial (e.g., satellite) images.

End-to-end Learning for Inter-Vehicle Distance and Relative Velocity Estimation in ADAS with a Monocular Camera

1 code implementation • 7 Jun 2020 • Zhenbo Song, Jianfeng Lu, Tong Zhang, Hongdong Li

In this paper, we propose a monocular camera-based inter-vehicle distance and relative velocity estimation method based on end-to-end training of a deep neural network.

Optical Flow Estimation

Dense Non-Rigid Structure from Motion: A Manifold Viewpoint

no code implementations • 15 Jun 2020 • Suryansh Kumar, Luc van Gool, Carlos E. P. de Oliveira, Anoop Cherian, Yuchao Dai, Hongdong Li

Assuming that a deforming shape is composed of a union of local linear subspaces and spans a global low-rank space over multiple frames enables us to efficiently model complex non-rigid deformations.

Clustering

The IKEA ASM Dataset: Understanding People Assembling Furniture through Actions, Objects and Pose

1 code implementation • 1 Jul 2020 • Yizhak Ben-Shabat, Xin Yu, Fatemeh Sadat Saleh, Dylan Campbell, Cristian Rodriguez-Opazo, Hongdong Li, Stephen Gould

The availability of a large labeled dataset is a key requirement for applying deep learning methods to solve various computer vision tasks.

Action Recognition Object +4

TSPNet: Hierarchical Feature Learning via Temporal Semantic Pyramid for Sign Language Translation

2 code implementations • NeurIPS 2020 • Dongxu Li, Chenchen Xu, Xin Yu, Kaihao Zhang, Ben Swift, Hanna Suominen, Hongdong Li

Sign language translation (SLT) aims to interpret sign video sequences into text-based natural language sentences.

Sign Language Recognition Sign Language Translation +3

737

DORi: Discovering Object Relationship for Moment Localization of a Natural-Language Query in Video

1 code implementation • 13 Oct 2020 • Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Basura Fernando, Hongdong Li, Stephen Gould

This paper studies the task of temporal moment localization in a long untrimmed video using a natural language query.

Sentence

Hierarchical Neural Architecture Search for Deep Stereo Matching

1 code implementation • NeurIPS 2020 • Xuelian Cheng, Yiran Zhong, Mehrtash Harandi, Yuchao Dai, Xiaojun Chang, Tom Drummond, Hongdong Li, ZongYuan Ge

To reduce human effort in neural network design, Neural Architecture Search (NAS) has been applied with remarkable success to various high-level vision tasks such as classification and semantic segmentation.

Ranked #2 on Stereo Disparity Estimation on Scene Flow

Neural Architecture Search Semantic Segmentation +3

252

Displacement-Invariant Matching Cost Learning for Accurate Optical Flow Estimation

3 code implementations • NeurIPS 2020 • Jianyuan Wang, Yiran Zhong, Yuchao Dai, Kaihao Zhang, Pan Ji, Hongdong Li

Learning matching costs has been shown to be critical to the success of the state-of-the-art deep stereo matching methods, in which 3D convolutions are applied on a 4D feature volume to learn a 3D cost volume.

Optical Flow Estimation Stereo Matching

145

Displacement-Invariant Cost Computation for Efficient Stereo Matching

no code implementations • 1 Dec 2020 • Yiran Zhong, Charles Loop, Wonmin Byeon, Stan Birchfield, Yuchao Dai, Kaihao Zhang, Alexey Kamenev, Thomas Breuel, Hongdong Li, Jan Kautz

A common way to speed up the computation is to downsample the feature volume, but this loses high-frequency details.

Autonomous Driving Stereo Matching

Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration

1 code implementation • CVPR 2021 • Liyuan Pan, Shah Chowdhury, Richard Hartley, Miaomiao Liu, Hongguang Zhang, Hongdong Li

The heavy defocus blur in DP pairs affects the performance of matching-based depth estimation approaches.

Depth Estimation Image Restoration

PlueckerNet: Learn to Register 3D Line Reconstructions

2 code implementations • 2 Dec 2020 • Liu Liu, Hongdong Li, Haodong Yao, Ruyi Zha

Aligning two partially-overlapped 3D line reconstructions in Euclidean space is challenging, as we need to simultaneously solve correspondences and relative pose between line reconstructions.

Translation

Efficient Depth Completion Using Learned Bases

no code implementations • 2 Dec 2020 • Yiran Zhong, Yuchao Dai, Hongdong Li

The given sparse depth points serve as a data term to constrain the weighting process.

Depth Completion

Depth Completion using Piecewise Planar Model

no code implementations • 6 Dec 2020 • Yiran Zhong, Yuchao Dai, Hongdong Li

More specifically, we represent the desired depth map as a collection of 3D planes, and the reconstruction problem is formulated as the optimization of planar parameters.

Depth Completion Visual Odometry

Canny-VO: Visual Odometry with RGB-D Cameras based on Geometric 3D-2D Edge Alignment

no code implementations • 15 Dec 2020 • Yi Zhou, Hongdong Li, Laurent Kneip

The present paper reviews the classical problem of free-form curve registration and applies it to an efficient RGBD visual odometry system called Canny-VO, as it efficiently tracks all Canny edge features extracted from the images.

Visual Odometry

Benchmarking Ultra-High-Definition Image Super-Resolution

no code implementations • ICCV 2021 • Kaihao Zhang, Dongxu Li, Wenhan Luo, Wenqi Ren, Bjorn Stenger, Wei Liu, Hongdong Li, Ming-Hsuan Yang

Increasingly, modern mobile devices allow capturing images at Ultra-High-Definition (UHD) resolution, which includes 4K and 8K images.

4k 8k +3

Geometry-Guided Street-View Panorama Synthesis from Satellite Imagery

1 code implementation • 2 Mar 2021 • Yujiao Shi, Dylan Campbell, Xin Yu, Hongdong Li

Specifically, we observe that when a 3D point in the real world is visible in both views, there is a deterministic mapping between the projected points in the two-view images given the height information of this 3D point.

Image Generation

ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring

no code implementations • CVPR 2021 • Dongxu Li, Chenchen Xu, Kaihao Zhang, Xin Yu, Yiran Zhong, Wenqi Ren, Hanna Suominen, Hongdong Li

Video deblurring models exploit consecutive frames to remove blurs from camera shakes and object motions.

Deblurring

Deep Dense Multi-scale Network for Snow Removal Using Semantic and Geometric Priors

no code implementations • 21 Mar 2021 • Kaihao Zhang, Rongqing Li, Yanjiang Yu, Wenhan Luo, Changsheng Li, Hongdong Li

Images captured on snowy days suffer from noticeable degradation of scene visibility, which degrades the performance of current vision-based intelligent systems.

Image Restoration Snow Removal

Self-Supervised Visibility Learning for Novel View Synthesis

1 code implementation • CVPR 2021 • Yujiao Shi, Hongdong Li, Xin Yu

We then warp and aggregate source view pixels to synthesize a novel view based on the estimated source-view visibility and target-view depth.

Novel View Synthesis

Deep Two-View Structure-from-Motion Revisited

1 code implementation • CVPR 2021 • Jianyuan Wang, Yiran Zhong, Yuchao Dai, Stan Birchfield, Kaihao Zhang, Nikolai Smolyanskiy, Hongdong Li

Two-view structure-from-motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM.

Ranked #24 on Monocular Depth Estimation on KITTI Eigen split

3D Reconstruction Monocular Depth Estimation +3

173

Learning Optical Flow from a Few Matches

1 code implementation • CVPR 2021 • Shihao Jiang, Yao Lu, Hongdong Li, Richard Hartley

In this paper, we show that the dense correlation volume representation is redundant and accurate flow estimation can be achieved with only a fraction of elements in it.

Ranked #11 on Optical Flow Estimation on KITTI 2015 (train)

Optical Flow Estimation

163

Learning to Estimate Hidden Motions with Global Motion Aggregation

2 code implementations • ICCV 2021 • Shihao Jiang, Dylan Campbell, Yao Lu, Hongdong Li, Richard Hartley

We demonstrate that the optical flow estimates in the occluded regions can be significantly improved without damaging the performance in non-occluded regions.

Ranked #5 on Optical Flow Estimation on Sintel-final

Optical Flow Estimation

887

Paper
Code

One Ring to Rule Them All: a simple solution to multi-view 3D-Reconstruction of shapes with unknown BRDF via a small Recurrent ResNet

no code implementations • 11 Apr 2021 • Ziang Cheng, Hongdong Li, Richard Hartley, Yinqiang Zheng, Imari Sato

This paper proposes a simple method which solves an open problem of multi-view 3D-Reconstruction for objects with unknown and generic surface materials, imaged by a freely moving camera and a freely moving point light source.

3D Reconstruction Multi-View 3D Reconstruction +3

Paper
Add Code

Lighting, Reflectance and Geometry Estimation from 360$^{\circ}$ Panoramic Stereo

no code implementations • 20 Apr 2021 • Junxuan Li, Hongdong Li, Yasuyuki Matsushita

We propose a method for estimating high-definition spatially-varying lighting, reflectance, and geometry of a scene from 360$^{\circ}$ stereo images.

Paper
Add Code

Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior

no code implementations • 9 May 2021 • Kaihao Zhang, Wenhan Luo, Yanjiang Yu, Wenqi Ren, Fang Zhao, Changsheng Li, Lin Ma, Wei Liu, Hongdong Li

We first use a coarse deraining network to reduce the rain streaks on the input images, and then adopt a pre-trained semantic segmentation network to extract semantic features from the coarse derained image.

Benchmarking Rain Removal +1

Paper
Add Code

Multi-view 3D Reconstruction of a Texture-less Smooth Surface of Unknown Generic Reflectance

1 code implementation • CVPR 2021 • Ziang Cheng, Hongdong Li, Yuta Asano, Yinqiang Zheng, Imari Sato

Recovering the 3D geometry of a purely texture-less object with generally unknown surface reflectance (e.g., non-Lambertian) is regarded as a challenging task in multi-view reconstruction.

3D Object Reconstruction 3D Reconstruction +2

Paper
Code

Shell Theory: A Statistical Model of Reality

1 code implementation • IEEE Transactions on Pattern Analysis and Machine Intelligence 2021 • Wen-Yan Lin, Siying Liu, Changhao Ren, Ngai-Man Cheung, Hongdong Li, Yasuyuki Matsushita

The foundational assumption of machine learning is that the data under consideration is separable into classes; while intuitively reasonable, separability constraints have proven remarkably difficult to formulate mathematically.

Ranked #1 on Unsupervised Anomaly Detection with Specified Settings -- 10% anomaly on STL-10 (using extra training data)

Anomaly Detection BIG-bench Machine Learning +6

Paper
Code

Multi-level Motion Attention for Human Motion Prediction

1 code implementation • 17 Jun 2021 • Wei Mao, Miaomiao Liu, Mathieu Salzmann, Hongdong Li

Whether based on recurrent or feed-forward neural networks, existing learning based methods fail to model the observation that human motion tends to repeat itself, even for complex sports actions and cooking activities.

Human motion prediction motion prediction

Paper
Code

PluckerNet: Learn To Register 3D Line Reconstructions

no code implementations • CVPR 2021 • Liu Liu, Hongdong Li, Haodong Yao, Ruyi Zha

Aligning two partially-overlapped 3D line reconstructions in Euclidean space is challenging, as we need to simultaneously solve line correspondences and relative pose between reconstructions.

Translation

Paper
Add Code

Lighting, Reflectance and Geometry Estimation From 360$^{\circ}$ Panoramic Stereo

no code implementations • CVPR 2021 • Junxuan Li, Hongdong Li, Yasuyuki Matsushita

We propose a method for estimating high-definition spatially-varying lighting, reflectance, and geometry of a scene from 360$^{\circ}$ stereo images.

Paper
Add Code

Underwater Image Restoration via Contrastive Learning and a Real-world Dataset

1 code implementation • 20 Jun 2021 • Junlin Han, Mehrdad Shoeiby, Tim Malthus, Elizabeth Botha, Janet Anstee, Saeed Anwar, Ran Wei, Mohammad Ali Armin, Hongdong Li, Lars Petersson

There are 2000 reference restored images and 6003 original underwater images in the unpaired training set.

Benchmarking Contrastive Learning +3

Paper
Code

Ranking Models in Unlabeled New Environments

2 code implementations • ICCV 2021 • Xiaoxiao Sun, Yunzhong Hou, Weijian Deng, Hongdong Li, Liang Zheng

For this problem, we propose to adopt a proxy dataset that 1) is fully labeled and 2) well reflects the true model rankings in a given target environment, and use the performance rankings on the proxy sets as surrogates.

Person Re-Identification

Paper
Code

Blind Image Decomposition

1 code implementation • 25 Aug 2021 • Junlin Han, Weihao Li, Pengfei Fang, Chunyi Sun, Jie Hong, Mohammad Ali Armin, Lars Petersson, Hongdong Li

We propose and study a novel task named Blind Image Decomposition (BID), which requires separating a superimposed image into constituent underlying images in a blind setting, that is, both the source components involved in mixing as well as the mixing mechanism are unknown.

Rain Removal

110

Paper
Code

Neural Plenoptic Sampling: Capture Light-field from Imaginary Eyes

no code implementations • 29 Sep 2021 • Junxuan Li, Yujiao Shi, Hongdong Li

It encodes a complete light-field (i.e., lumigraph) and therefore allows one to freely roam in the space and view the scene from any location in any direction.

Novel View Synthesis Position

Paper
Add Code

Neural Photometric Stereo for Shape and Material Estimation

no code implementations • 29 Sep 2021 • Junxuan Li, Hongdong Li

This paper addresses a challenging Photometric-Stereo problem where the object to be reconstructed has unknown, non-Lambertian, and possibly spatially-varying surface materials.

Paper
Add Code

Learning To Segment Dominant Object Motion From Watching Videos

no code implementations • 28 Nov 2021 • Sahir Shrestha, Mohammad Ali Armin, Hongdong Li, Nick Barnes

Existing deep learning based unsupervised video object segmentation methods still rely on ground-truth segmentation masks to train.

Object Optical Flow Estimation +4

Paper
Add Code

HDR-NeRF: High Dynamic Range Neural Radiance Fields

no code implementations • CVPR 2022 • Xin Huang, Qi Zhang, Ying Feng, Hongdong Li, Xuan Wang, Qing Wang

The key to our method is to model the physical imaging process, which dictates that the radiance of a scene point transforms to a pixel value in the LDR image with two implicit functions: a radiance field and a tone mapper.

Vocal Bursts Intensity Prediction

Paper
Add Code

Improving GAN Equilibrium by Raising Spatial Awareness

1 code implementation • CVPR 2022 • Jianyuan Wang, Ceyuan Yang, Yinghao Xu, Yujun Shen, Hongdong Li, Bolei Zhou

We further propose to align the spatial awareness of G with the attention map induced from D. In this way, we effectively lessen the information gap between D and G. Extensive results show that our method pushes the two-player game in GANs closer to the equilibrium, leading to better synthesis performance.

Attribute Inductive Bias

157

Paper
Code

Label-Free Model Evaluation with Semi-Structured Dataset Representations

1 code implementation • 1 Dec 2021 • Xiaoxiao Sun, Yunzhong Hou, Hongdong Li, Liang Zheng

In the absence of image labels, based on dataset representations, we estimate model performance for AutoEval with regression.

regression

Paper
Code

MC-Blur: A Comprehensive Benchmark for Image Deblurring

2 code implementations • 1 Dec 2021 • Kaihao Zhang, Tao Wang, Wenhan Luo, Boheng Chen, Wenqi Ren, Bjorn Stenger, Wei Liu, Hongdong Li, Ming-Hsuan Yang

Blur artifacts can seriously degrade the visual quality of images, and numerous deblurring methods have been proposed for specific scenarios.

Benchmarking Deblurring +1

141

Paper
Code

Align and Prompt: Video-and-Language Pre-training with Entity Prompts

1 code implementation • CVPR 2022 • Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi

To achieve this, we first introduce an entity prompter module, which is trained with VTC to produce the similarity between a video crop and text prompts instantiated with entity names.

Ranked #19 on Zero-Shot Video Retrieval on DiDeMo

Entity Alignment Retrieval +3

182

Paper
Code

Transcribing Natural Languages for The Deaf via Neural Editing Programs

1 code implementation • 17 Dec 2021 • Dongxu Li, Chenchen Xu, Liu Liu, Yiran Zhong, Rong Wang, Lars Petersson, Hongdong Li

This work studies the task of glossification, whose aim is to transcribe natural spoken language sentences for the Deaf (hard-of-hearing) community into ordered sign language glosses.

Sentence

Paper
Code

Multi-level Second-order Few-shot Learning

1 code implementation • 15 Jan 2022 • Hongguang Zhang, Hongdong Li, Piotr Koniusz

The goal of multi-level feature design is to extract feature representations at different layer-wise levels of CNN, realizing several levels of visual abstraction to achieve robust few-shot learning.

Ranked #11 on Unsupervised Few-Shot Image Classification on Tiered ImageNet 5-way (5-shot)

Few-Shot action recognition Few Shot Action Recognition +2

Paper
Code

Deep Image Deblurring: A Survey

no code implementations • 26 Jan 2022 • Kaihao Zhang, Wenqi Ren, Wenhan Luo, Wei-Sheng Lai, Bjorn Stenger, Ming-Hsuan Yang, Hongdong Li

Image deblurring is a classic problem in low-level computer vision with the aim to recover a sharp image from a blurred input image.

Deblurring Image Deblurring

Paper
Add Code

You Only Cut Once: Boosting Data Augmentation with a Single Cut

1 code implementation • 28 Jan 2022 • Junlin Han, Pengfei Fang, Weihao Li, Jie Hong, Mohammad Ali Armin, Ian Reid, Lars Petersson, Hongdong Li

We present You Only Cut Once (YOCO) for performing data augmentations.

Data Augmentation

Paper
Code

Neural Reflectance for Shape Recovery with Shadow Handling

1 code implementation • CVPR 2022 • Junxuan Li, Hongdong Li

This network is able to leverage the observed photometric variance and shadows on the surface, and recover both surface shape and general non-Lambertian reflectance.

Paper
Code

Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image Matching

1 code implementation • 26 Mar 2022 • Yujiao Shi, Xin Yu, Liu Liu, Dylan Campbell, Piotr Koniusz, Hongdong Li

We address the problem of ground-to-satellite image geo-localization, that is, estimating the camera latitude, longitude and orientation (azimuth angle) by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.

Image Retrieval Retrieval

Paper
Code

Stereo Unstructured Magnification: Multiple Homography Image for View Synthesis

no code implementations • 1 Apr 2022 • Qi Zhang, Xin Huang, Ying Feng, Xue Wang, Hongdong Li, Qing Wang

A two-stage network is developed for novel view synthesis.

Novel View Synthesis

Paper
Add Code

Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image

1 code implementation • CVPR 2022 • Yujiao Shi, Hongdong Li

This paper addresses the problem of vehicle-mounted camera localization by matching a ground-level image with an overhead-view satellite map.

Camera Localization Image Retrieval +2

Paper
Code

Deep Non-rigid Structure-from-Motion: A Sequence-to-Sequence Translation Perspective

no code implementations • 10 Apr 2022 • Hui Deng, Tong Zhang, Yuchao Dai, Jiawei Shi, Yiran Zhong, Hongdong Li

In this paper, we propose to model deep NRSfM from a sequence-to-sequence translation perspective, where the input 2D frame sequence is taken as a whole to reconstruct the deforming 3D non-rigid shape sequence.

3D Reconstruction Translation

Paper
Add Code

Unsupervised Video Interpolation by Learning Multilayered 2.5D Motion Fields

no code implementations • 21 Apr 2022 • Ziang Cheng, Shihao Jiang, Hongdong Li

This implicit neural representation learns the video as a space-time continuum, allowing frame interpolation at any temporal resolution.

Video Frame Interpolation

Paper
Add Code

CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping

1 code implementation • 31 May 2022 • Junlin Han, Lars Petersson, Hongdong Li, Ian Reid

We present a simple method, CropMix, for the purpose of producing a rich input distribution from the original dataset distribution.

Contrastive Learning

Paper
Code

Self-calibrating Photometric Stereo by Neural Inverse Rendering

1 code implementation • 16 Jul 2022 • Junxuan Li, Hongdong Li

This paper tackles the task of uncalibrated photometric stereo for 3D object reconstruction, where the object shape, object reflectance, and lighting directions are all unknown.

3D Object Reconstruction Inverse Rendering +1

Paper
Code

Satellite Image Based Cross-view Localization for Autonomous Vehicle

no code implementations • 27 Jul 2022 • Shan Wang, Yanhao Zhang, Ankit Vora, Akhil Perincherry, Hongdong Li

This paper introduces a novel approach to cross-view localization that departs from the conventional image retrieval method.

Autonomous Vehicles Image Retrieval +1

Paper
Add Code

CVLNet: Cross-View Semantic Correspondence Learning for Video-based Camera Localization

no code implementations • 7 Aug 2022 • Yujiao Shi, Xin Yu, Shan Wang, Hongdong Li

The critical challenge of this task is to learn a powerful global feature descriptor for the sequential ground-view images while considering its domain alignment with reference satellite images.

Camera Localization Image-Based Localization +1

Paper
Add Code

NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction

1 code implementation • 29 Sep 2022 • Ruyi Zha, Yanhao Zhang, Hongdong Li

This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction (Cone Beam Computed Tomography) that requires no external training data.

Ranked #2 on Novel View Synthesis on X3D

Low-Dose X-Ray Ct Reconstruction Novel View Synthesis

Paper
Code

Distance Based Image Classification: A solution to generative classification's conundrum?

1 code implementation • 4 Oct 2022 • Wen-Yan Lin, Siying Liu, Bing Tian Dai, Hongdong Li

We use the model to develop a classification scheme which suppresses the impact of noise while preserving semantic cues.

Image Classification

Paper
Code

Rolling Shutter Inversion: Bring Rolling Shutter Images to High Framerate Global Shutter Video

1 code implementation • 6 Oct 2022 • Bin Fan, Yuchao Dai, Hongdong Li

The RSSR is a very challenging task, and to our knowledge, no practical solution exists to date.

Optical Flow Estimation Super-Resolution

Paper
Code

A Survey of Deep Face Restoration: Denoise, Super-Resolution, Deblur, Artifact Removal

1 code implementation • 5 Nov 2022 • Tao Wang, Kaihao Zhang, Xuanxi Chen, Wenhan Luo, Jiankang Deng, Tong Lu, Xiaochun Cao, Wei Liu, Hongdong Li, Stefanos Zafeiriou

Second, we discuss the challenges of face restoration.

Image Restoration Super-Resolution

364

Paper
Code

What Images are More Memorable to Machines?

1 code implementation • 14 Nov 2022 • Junlin Han, Huangying Zhan, Jie Hong, Pengfei Fang, Hongdong Li, Lars Petersson, Ian Reid

This paper studies the problem of measuring and predicting how memorable an image is to pattern recognition machines, as a path to explore machine intelligence.

Paper
Code

Interacting Hand-Object Pose Estimation via Dense Mutual Attention

1 code implementation • 16 Nov 2022 • Rong Wang, Wei Mao, Hongdong Li

In contrast, we propose a novel dense mutual attention mechanism that is able to model fine-grained dependencies between the hand and the object.

Ranked #2 on hand-object pose on HO-3D (using extra training data)

hand-object pose Object +1

Paper
Code

Wide-Angle Rectification via Content-Aware Conformal Mapping

no code implementations • CVPR 2023 • Qi Zhang, Hongdong Li, Qing Wang

Despite the proliferation of ultra wide-angle lenses on smartphone cameras, such lenses often come with severe image distortion (e.g., curved linear structures, unnaturally skewed faces).

Paper
Add Code

A Rotation-Translation-Decoupled Solution for Robust and Efficient Visual-Inertial Initialization

1 code implementation • CVPR 2023 • Yijia He, Bo Xu, Zhanpeng Ouyang, Hongdong Li

We propose a novel visual-inertial odometry (VIO) initialization method, which decouples rotation and translation estimation, and achieves higher efficiency and better robustness.

Translation

136

Paper
Code

Spatial Steerability of GANs via Self-Supervision from Discriminator

no code implementations • 20 Jan 2023 • Jianyuan Wang, Lalit Bhagat, Ceyuan Yang, Yinghao Xu, Yujun Shen, Hongdong Li, Bolei Zhou

In this work, we propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space or requiring extra annotations.

Image Generation Inductive Bias +1

Paper
Add Code

CircNet: Meshing 3D Point Clouds with Circumcenter Detection

1 code implementation • 23 Jan 2023 • Huan Lei, Ruitao Leng, Liang Zheng, Hongdong Li

In this paper, we leverage the duality between a triangle and its circumcenter, and introduce a deep neural network that detects the circumcenters to achieve point cloud triangulation.

Surface Reconstruction

Paper
Code

MEGANE: Morphable Eyeglass and Avatar Network

no code implementations • CVPR 2023 • Junxuan Li, Shunsuke Saito, Tomas Simon, Stephen Lombardi, Hongdong Li, Jason Saragih

However, modeling the geometric and appearance interactions of glasses and the face of virtual representations of humans is challenging.

Image Generation Inverse Rendering

Paper
Add Code

Event-guided Multi-patch Network with Self-supervision for Non-uniform Motion Deblurring

1 code implementation • 14 Feb 2023 • Hongguang Zhang, Limeng Zhang, Yuchao Dai, Hongdong Li, Piotr Koniusz

Contemporary deep learning multi-scale deblurring models suffer from many issues: 1) They perform poorly on non-uniformly blurred images/videos; 2) Simply increasing the model depth with finer-scale levels cannot improve deblurring; 3) Individual RGB frames contain limited motion information for deblurring; 4) Previous models have limited robustness to spatial transformations and noise.

Deblurring

186

Paper
Code

Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container

1 code implementation • CVPR 2023 • Jinguang Tong, Sundaram Muthu, Fahira Afzal Maken, Chuong Nguyen, Hongdong Li

In this paper, we define a new problem of recovering the 3D geometry of an object confined in a transparent enclosure.

3D Reconstruction

Paper
Code

WildLight: In-the-wild Inverse Rendering with a Flashlight

1 code implementation • CVPR 2023 • Ziang Cheng, Junxuan Li, Hongdong Li

Our system recovers scene geometry and reflectance using only multi-view images captured by a smartphone.

3D Reconstruction Inverse Rendering

Paper
Code

Inverting the Imaging Process by Learning an Implicit Camera Model

no code implementations • CVPR 2023 • Xin Huang, Qi Zhang, Ying Feng, Hongdong Li, Qing Wang

In principle, our new implicit neural camera model has the potential to benefit a wide array of other inverse imaging tasks.

Paper
Add Code

GridFormer: Residual Dense Transformer with Grid Structure for Image Restoration in Adverse Weather Conditions

no code implementations • 29 May 2023 • Tao Wang, Kaihao Zhang, Ziqian Shao, Wenhan Luo, Bjorn Stenger, Tong Lu, Tae-Kyun Kim, Wei Liu, Hongdong Li

Second, we introduce a residual dense transformer block (RDTB) as the final GridFormer layer.

Image Restoration Rain Removal

Paper
Add Code

Boosting 3-DoF Ground-to-Satellite Camera Localization Accuracy via Geometry-Guided Cross-View Transformer

1 code implementation • ICCV 2023 • Yujiao Shi, Fei Wu, Akhil Perincherry, Ankit Vora, Hongdong Li

In this paper, we propose a method to increase the accuracy of a ground camera's location and orientation by estimating the relative rotation and translation between the ground-level image and its matched/retrieved satellite image.

Camera Localization Image Retrieval +2

Paper
Code

EndoSurf: Neural Surface Reconstruction of Deformable Tissues with Stereo Endoscope Videos

1 code implementation • 21 Jul 2023 • Ruyi Zha, Xuelian Cheng, Hongdong Li, Mehrtash Harandi, ZongYuan Ge

We constrain the learned shape by tailoring multiple regularization strategies and disentangling geometry and appearance.

Surface Reconstruction

Paper
Code

LLDiffusion: Learning Degradation Representations in Diffusion Models for Low-Light Image Enhancement

1 code implementation • 27 Jul 2023 • Tao Wang, Kaihao Zhang, Ziqian Shao, Wenhan Luo, Bjorn Stenger, Tae-Kyun Kim, Wei Liu, Hongdong Li

In this paper, we address this limitation by proposing a degradation-aware learning scheme for LLIE using diffusion models, which effectively integrates degradation and image priors into the diffusion process, resulting in improved image enhancement.

Image Generation Low-Light Image Enhancement

Paper
Code

View Consistent Purification for Accurate Cross-View Localization

no code implementations • ICCV 2023 • Shan Wang, Yanhao Zhang, Akhil Perincherry, Ankit Vora, Hongdong Li

This paper proposes a fine-grained self-localization method for outdoor robotics that utilizes a flexible number of onboard cameras and readily accessible satellite images.

Pose Estimation

Paper
Add Code

Unsupervised 3D Pose Estimation with Non-Rigid Structure-from-Motion Modeling

no code implementations • 18 Aug 2023 • Haorui Ji, Hui Deng, Yuchao Dai, Hongdong Li

Most of the previous 3D human pose estimation work relied on the powerful memory capability of the network to obtain suitable 2D-3D mappings from the training data.

3D Human Pose Estimation 3D Pose Estimation

Paper
Add Code

Faster Stochastic Variance Reduction Methods for Compositional MiniMax Optimization

no code implementations • 18 Aug 2023 • Jin Liu, Xiaokang Pan, Junwen Duan, Hongdong Li, Youqi Li, Zhe Qu

All the proposed complexities indicate that our proposed methods can match the lower bounds for existing minimax optimization, without requiring a large batch size in each iteration.

Stochastic Optimization

Paper
Add Code

MB-TaylorFormer: Multi-branch Efficient Transformer Expanded by Taylor Formula for Image Dehazing

1 code implementation • ICCV 2023 • Yuwei Qiu, Kaihao Zhang, Chenxi Wang, Wenhan Luo, Hongdong Li, Zhi Jin

To address this issue, we propose a new Transformer variant, which applies the Taylor expansion to approximate the softmax-attention and achieves linear computational complexity.

Image Dehazing

Paper
Code

Stereo Matching in Time: 100+ FPS Video Stereo Matching for Extended Reality

no code implementations • 8 Sep 2023 • Ziang Cheng, Jiayu Yang, Hongdong Li

One of the major difficulties is the lack of high-quality indoor video stereo training datasets captured by head-mounted VR/AR glasses.

Mixed Reality Stereo Matching

Paper
Add Code

Deep Video Restoration for Under-Display Camera

no code implementations • 9 Sep 2023 • Xuanxi Chen, Tao Wang, Ziqian Shao, Kaihao Zhang, Wenhan Luo, Tong Lu, Zikun Liu, Tae-Kyun Kim, Hongdong Li

With the pipeline, we build the first large-scale UDC video restoration dataset called PexelsUDC, which includes two subsets named PexelsUDC-T and PexelsUDC-P corresponding to different displays for UDC.

Video Restoration

Paper
Add Code

RGB-based Category-level Object Pose Estimation via Decoupled Metric Scale Recovery

1 code implementation • 19 Sep 2023 • Jiaxin Wei, Xibin Song, Weizhe Liu, Laurent Kneip, Hongdong Li, Pan Ji

While showing promising results, recent RGB-D camera-based category-level object pose estimation methods have restricted applications due to the heavy reliance on depth sensors.

Object Pose Estimation

Paper
Code

Alice Benchmarks: Connecting Real World Re-Identification with the Synthetic

no code implementations • 6 Oct 2023 • Xiaoxiao Sun, Yue Yao, Shengjin Wang, Hongdong Li, Liang Zheng

In this paper, we detail the settings of Alice benchmarks, provide an analysis of existing commonly-used domain adaptation methods, and discuss some interesting future directions.

Domain Adaptation

Paper
Add Code

DeepSimHO: Stable Pose Estimation for Hand-Object Interaction via Physics Simulation

1 code implementation • NeurIPS 2023 • Rong Wang, Wei Mao, Hongdong Li

Specifically, for an initial hand-object pose estimated by a base network, we forward it to a physics simulator to evaluate its stability.

3D Pose Estimation hand-object pose +1

Paper
Code

ConsistNet: Enforcing 3D Consistency for Multi-view Images Diffusion

1 code implementation • 16 Oct 2023 • Jiayu Yang, Ziang Cheng, Yunfei Duan, Pan Ji, Hongdong Li

Given a single image of a 3D object, this paper proposes a novel method (named ConsistNet) that is able to generate multiple images of the same object, as if they were captured from different viewpoints, while the 3D (multi-view) consistencies among those multiple generated images are effectively exploited.

Depth Estimation Depth Prediction +2

Paper
Code

3D Human Pose Analysis via Diffusion Synthesis

no code implementations • 17 Jan 2024 • Haorui Ji, Hongdong Li

In this paper, we propose PADS (Pose Analysis by Diffusion Synthesis), a novel framework designed to address various challenges in 3D human pose analysis through a unified pipeline.

Denoising

Paper
Add Code

BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation

no code implementations • 30 Jan 2024 • Zhennan Wu, Yang Li, Han Yan, Taizhang Shang, Weixuan Sun, Senbo Wang, Ruikai Cui, Weizhe Liu, Hiroyuki Sato, Hongdong Li, Pan Ji

A variational auto-encoder is employed to compress the tri-planes into the latent tri-plane space, on which the denoising diffusion process is performed.

Denoising Scene Generation

Paper
Add Code

PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal

1 code implementation • 4 Feb 2024 • Tao Wang, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Tae-Kyun Kim, Tong Lu, Hongdong Li, Ming-Hsuan Yang

For the prompt generation, we first propose a prompt pre-training strategy to train a frequency prompt encoder that encodes the ground-truth image into LF and HF prompts.

Reflection Removal

Paper
Code

Strong and Controllable Blind Image Decomposition

1 code implementation • 15 Mar 2024 • Zeyu Zhang, Junlin Han, Chenhui Gou, Hongdong Li, Liang Zheng

To address this need, we add controllability to the blind image decomposition process, allowing users to enter which types of degradation to remove or retain.

Paper
Code

Frankenstein: Generating Semantic-Compositional 3D Scenes in One Tri-Plane

no code implementations • 24 Mar 2024 • Han Yan, Yang Li, Zhennan Wu, Shenzhou Chen, Weixuan Sun, Taizhang Shang, Weizhe Liu, Tian Chen, Xiaqiang Dai, Chao Ma, Hongdong Li, Pan Ji

We present Frankenstein, a diffusion-based framework that can generate semantic-compositional 3D scenes in a single pass.

Denoising

Paper
Add Code

NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation

no code implementations • 27 Mar 2024 • Ruikai Cui, Weizhe Liu, Weixuan Sun, Senbo Wang, Taizhang Shang, Yang Li, Xibin Song, Han Yan, Zhennan Wu, Shenzhou Chen, Hongdong Li, Pan Ji

3D shape generation aims to produce innovative 3D content adhering to specific conditions and constraints.

3D Shape Generation 3D Shape Modeling

Paper
Add Code

Homography Guided Temporal Fusion for Road Line and Marking Segmentation

1 code implementation • ICCV 2023 • Shan Wang, Chuong Nguyen, Jiawei Liu, Kaihao Zhang, Wenhan Luo, Yanhao Zhang, Sundaram Muthu, Fahira Afzal Maken, Hongdong Li

Reliable segmentation of road lines and markings is critical to autonomous driving.

Autonomous Driving Segmentation

Paper
Code

Beyond Monocular Deraining: Stereo Image Deraining via Semantic Understanding

no code implementations • ECCV 2020 • Kaihao Zhang, Wenhan Luo, Wenqi Ren, Jingwen Wang, Fang Zhao, Lin Ma, Hongdong Li

Moreover, even for single-image-based monocular deraining, many current methods fail to complete the task satisfactorily because they mostly rely on per-pixel loss functions and ignore semantic information.

Benchmarking Rain Removal +1

Paper
Add Code

Automatic Gloss Dictionary for Sign Language Learners

no code implementations • ACL 2022 • Chenchen Xu, Dongxu Li, Hongdong Li, Hanna Suominen, Ben Swift

A multi-language dictionary is a fundamental tool for language learning, allowing the learner to look up unfamiliar words.

Paper
Add Code

Deep Novel View Synthesis from Colored 3D Point Clouds

1 code implementation • ECCV 2020 • Zhenbo Song, Wayne Chen, Dylan Campbell, Hongdong Li

We propose a new deep neural network which takes a colored 3D point cloud of a scene, and directly synthesizes a photo-realistic image from an arbitrary viewpoint.

Image Generation Novel View Synthesis

Paper
Code
