Search Results for author: Hongdong Li

Found 156 papers, 72 papers with code

Winding Number for Region-Boundary Consistent Salient Contour Extraction

no code implementations CVPR 2013 Yansheng Ming, Hongdong Li, Xuming He

The main focus is given to how to maintain the consistency (compatibility) between the region cues and the boundary cues.

Boundary Detection Segmentation

Expanding the Family of Grassmannian Kernels: An Embedding Perspective

no code implementations 4 Jul 2014 Mehrtash T. Harandi, Mathieu Salzmann, Sadeep Jayasumana, Richard Hartley, Hongdong Li

Modeling videos and image-sets as linear subspaces has proven beneficial for many visual recognition tasks.

Clustering

Iteratively Reweighted Graph Cut for Multi-label MRFs with Non-convex Priors

no code implementations CVPR 2015 Thalaiyasingam Ajanthan, Richard Hartley, Mathieu Salzmann, Hongdong Li

While widely acknowledged as highly effective in computer vision, multi-label MRFs with non-convex priors are difficult to optimize.

Kernel Methods on Riemannian Manifolds with Gaussian RBF Kernels

no code implementations 30 Nov 2014 Sadeep Jayasumana, Richard Hartley, Mathieu Salzmann, Hongdong Li, Mehrtash Harandi

We then use the proposed framework to identify positive definite kernels on two specific manifolds commonly encountered in computer vision: the Riemannian manifold of symmetric positive definite matrices and the Grassmann manifold, i.e., the Riemannian manifold of linear subspaces of a Euclidean space.

Kernel Methods on the Riemannian Manifold of Symmetric Positive Definite Matrices

no code implementations CVPR 2013 Sadeep Jayasumana, Richard Hartley, Mathieu Salzmann, Hongdong Li, Mehrtash Harandi

To encode the geometry of the manifold in the mapping, we introduce a family of provably positive definite kernels on the Riemannian manifold of SPD matrices.

Motion Segmentation Pedestrian Detection +2

Optimizing Over Radial Kernels on Compact Manifolds

no code implementations CVPR 2014 Sadeep Jayasumana, Richard Hartley, Mathieu Salzmann, Hongdong Li, Mehrtash Harandi

We tackle the problem of optimizing over all possible positive definite radial kernels on Riemannian manifolds for classification.

General Classification

A Framework for Shape Analysis via Hilbert Space Embedding

no code implementations 13 Dec 2014 Sadeep Jayasumana, Mathieu Salzmann, Hongdong Li, Mehrtash Harandi

We propose a framework for 2D shape analysis using positive definite kernels defined on Kendall's shape manifold.

Clustering General Classification +1

Tracking Randomly Moving Objects on Edge Box Proposals

no code implementations 29 Jul 2015 Gao Zhu, Fatih Porikli, Hongdong Li

Our method generates a small number of "high-quality" proposals by a novel instance-specific objectness measure and evaluates them against the object model that can be adopted from an existing tracking-by-detection approach as a core tracker.

Object valid

Shape Interaction Matrix Revisited and Robustified: Efficient Subspace Clustering with Corrupted and Incomplete Data

1 code implementation ICCV 2015 Pan Ji, Mathieu Salzmann, Hongdong Li

The Shape Interaction Matrix (SIM) is one of the earliest approaches to performing subspace clustering (i.e., separating points drawn from a union of subspaces).

Clustering Face Clustering +1

Robust Multi-body Feature Tracker: A Segmentation-free Approach

no code implementations CVPR 2016 Pan Ji, Hongdong Li, Mathieu Salzmann, Yiran Zhong

Feature tracking is a fundamental problem in computer vision, with applications in many computer vision tasks, such as visual SLAM and action recognition.

Action Recognition Motion Segmentation +2

Neural Aggregation Network for Video Face Recognition

no code implementations CVPR 2017 Jiaolong Yang, Peiran Ren, Dong-Qing Zhang, Dong Chen, Fang Wen, Hongdong Li, Gang Hua

The network takes a face video or face image set of a person with a variable number of face images as its input, and produces a compact, fixed-dimension feature representation for recognition.

Face Recognition Face Verification

Learning Image Matching by Simply Watching Video

no code implementations 19 Mar 2016 Gucan Long, Laurent Kneip, Jose M. Alvarez, Hongdong Li

This work presents an unsupervised learning based approach to the ubiquitous computer vision problem of image matching.

Rolling Shutter Camera Relative Pose: Generalized Epipolar Geometry

no code implementations CVPR 2016 Yuchao Dai, Hongdong Li, Laurent Kneip

The vast majority of modern consumer-grade cameras employ a rolling shutter mechanism.

Beyond Local Search: Tracking Objects Everywhere with Instance-Specific Proposals

no code implementations CVPR 2016 Gao Zhu, Fatih Porikli, Hongdong Li

Our method generates a small number of "high-quality" proposals by a novel instance-specific objectness measure and evaluates them against the object model that can be adopted from an existing tracking-by-detection approach as a core tracker.

Object valid

Robust Optical Flow Estimation of Double-Layer Images under Transparency or Reflection

no code implementations CVPR 2016 Jiaolong Yang, Hongdong Li, Yuchao Dai, Robby T. Tan

This paper deals with a challenging, frequently encountered, yet not properly investigated problem in two-frame optical flow estimation.

Optical Flow Estimation valid

Go-ICP: A Globally Optimal Solution to 3D ICP Point-Set Registration

no code implementations 11 May 2016 Jiaolong Yang, Hongdong Li, Dylan Campbell, Yunde Jia

The evaluation demonstrates that the proposed method is able to produce reliable registration results regardless of the initialization.

Image to Point Cloud Registration

Robust and Efficient Relative Pose with a Multi-camera System for Autonomous Vehicle in Highly Dynamic Environments

no code implementations 12 May 2016 Liu Liu, Hongdong Li, Yuchao Dai

When the solver is used in combination with RANSAC, we are able to quickly prune unpromising hypotheses, significantly improving the chance of finding inliers.

Motion Estimation

Multi-body Non-rigid Structure-from-Motion

no code implementations 15 Jul 2016 Suryansh Kumar, Yuchao Dai, Hongdong Li

Recent progress has extended SFM to the areas of multi-body SFM (where there are multiple rigid relative motions in the scene), as well as non-rigid SFM (where there is a single non-rigid, deformable object or scene).

3D Reconstruction Clustering

Semi-Dense Visual Odometry for RGB-D Cameras Using Approximate Nearest Neighbour Fields

no code implementations 6 Feb 2017 Yi Zhou, Laurent Kneip, Hongdong Li

This paper presents a robust and efficient semi-dense visual odometry solution for RGB-D cameras.

Visual Odometry

Spatial-Temporal Union of Subspaces for Multi-body Non-rigid Structure-from-Motion

no code implementations 14 May 2017 Suryansh Kumar, Yuchao Dai, Hongdong Li

This spatio-temporal representation not only provides competitive 3D reconstruction but also outputs robust segmentation of multiple non-rigid objects.

3D Reconstruction

Pixel-variant Local Homography for Fisheye Stereo Rectification Minimizing Resampling Distortion

no code implementations 12 Jul 2017 Dingfu Zhou, Yuchao Dai, Hongdong Li

First, we prove that there indeed exist enough degrees of freedom to apply pixel-wise local homography for stereo rectification.

3D Reconstruction Stereo Matching +1

Adaptive Low-Rank Kernel Subspace Clustering

1 code implementation 17 Jul 2017 Pan Ji, Ian Reid, Ravi Garg, Hongdong Li, Mathieu Salzmann

In this paper, we present a kernel subspace clustering method that can handle non-linear models.

Clustering Image Clustering +1

Self-Supervised Learning for Stereo Matching with Self-Improving Ability

no code implementations 4 Sep 2017 Yiran Zhong, Yuchao Dai, Hongdong Li

Existing deep-learning based dense stereo matching methods often rely on ground-truth disparity maps as the training signals, which, however, are not available in many situations.

Self-Supervised Learning Stereo Matching +1

Globally-Optimal Inlier Set Maximisation for Simultaneous Camera Pose and Feature Correspondence

no code implementations ICCV 2017 Dylan Campbell, Lars Petersson, Laurent Kneip, Hongdong Li

Estimating the 6-DoF pose of a camera from a single image relative to a pre-computed 3D point-set is an important task for many computer vision applications.

Pose Estimation

Efficient Global 2D-3D Matching for Camera Localization in a Large-Scale 3D Map

no code implementations ICCV 2017 Liu Liu, Hongdong Li, Yuchao Dai

In this paper, we introduce a global method which harnesses global contextual information exhibited both within the query image and among all the 3D points in the map.

3D Feature Matching Camera Localization

Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective

no code implementations CVPR 2018 Suryansh Kumar, Anoop Cherian, Yuchao Dai, Hongdong Li

To address these issues, in this paper, we propose a new approach for dense NRSfM by modeling the problem on a Grassmann manifold.

Adversarial Spatio-Temporal Learning for Video Deblurring

1 code implementation 28 Mar 2018 Kaihao Zhang, Wenhan Luo, Yiran Zhong, Lin Ma, Wei Liu, Hongdong Li

To tackle the second challenge, we leverage the developed DBLRNet as a generator in the GAN (generative adversarial network) architecture, and employ a content loss in addition to an adversarial loss for efficient adversarial training.

Deblurring Generative Adversarial Network

Semantic Single-Image Dehazing

no code implementations 16 Apr 2018 Ziang Cheng, ShaoDi You, Viorela Ila, Hongdong Li

In experiments, we validate our approach on synthetic and real hazy images, where our method showed superior performance over state-of-the-art approaches, suggesting that semantic information facilitates the haze removal task.

Image Dehazing Single Image Dehazing +1

Structure from Recurrent Motion: From Rigidity to Recurrency

no code implementations CVPR 2018 Xiu Li, Hongdong Li, Hanbyul Joo, Yebin Liu, Yaser Sheikh

This paper proposes a new method for Non-Rigid Structure-from-Motion (NRSfM) from a long monocular video sequence observing a non-rigid object performing recurrent and possibly repetitive dynamic action.

Clustering

Semi-Dense 3D Reconstruction with a Stereo Event Camera

2 code implementations ECCV 2018 Yi Zhou, Guillermo Gallego, Henri Rebecq, Laurent Kneip, Hongdong Li, Davide Scaramuzza

Event cameras are bio-inspired sensors that offer several advantages, such as low latency, high-speed and high dynamic range, to tackle challenging scenarios in computer vision.

3D Reconstruction Simultaneous Localization and Mapping

Action Anticipation By Predicting Future Dynamic Images

no code implementations 1 Aug 2018 Cristian Rodriguez, Basura Fernando, Hongdong Li

Human action-anticipation methods predict the future action by observing only a small portion of an action in progress.

Action Anticipation Autonomous Driving +1

Open-World Stereo Video Matching with Deep RNN

no code implementations ECCV 2018 Yiran Zhong, Hongdong Li, Yuchao Dai

Deep Learning based stereo matching methods have shown great successes and achieved top scores across different benchmarks.

Stereo Matching Stereo Matching Hand

3D Geometry-Aware Semantic Labeling of Outdoor Street Scenes

no code implementations 13 Aug 2018 Yiran Zhong, Yuchao Dai, Hongdong Li

This paper is concerned with the problem of how to better exploit 3D geometric information for dense semantic image labeling.

Stereo Computation for a Single Mixture Image

no code implementations ECCV 2018 Yiran Zhong, Yuchao Dai, Hongdong Li

This paper proposes an original problem of stereo computation from a single mixture image, a challenging problem that had not been researched before.

Stereo Matching Stereo Matching Hand +1

Stochastic Attraction-Repulsion Embedding for Large Scale Image Localization

5 code implementations ICCV 2019 Liu Liu, Hongdong Li, Yuchao Dai

This paper tackles the problem of large-scale image-based localization (IBL) where the spatial location of a query image is determined by finding out the most similar reference images in a large database.

Image-Based Localization Representation Learning +1

ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving

no code implementations CVPR 2019 Xibin Song, Peng Wang, Dingfu Zhou, Rui Zhu, Chenye Guan, Yuchao Dai, Hao Su, Hongdong Li, Ruigang Yang

Specifically, we first segment each car with a pre-trained Mask R-CNN, and then regress towards its 3D pose and shape based on a deformable 3D car model with or without using semantic keypoints.

3D Car Instance Understanding Autonomous Driving

The Alignment of the Spheres: Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation

no code implementations CVPR 2019 Dylan Campbell, Lars Petersson, Laurent Kneip, Hongdong Li, Stephen Gould

Determining the position and orientation of a calibrated camera from a single image with respect to a 3D model is an essential task for many applications.

Pose Estimation

Dense Depth Estimation of a Complex Dynamic Scene without Explicit 3D Motion Estimation

no code implementations 11 Feb 2019 Suryansh Kumar, Ram Srivatsav Ghorakavi, Yuchao Dai, Hongdong Li

Given per-pixel optical flow correspondences between two consecutive frames and the sparse depth prior for the reference frame, we show that we can effectively recover the dense depth map for the successive frames without solving for 3D motion parameters.

Depth Estimation Motion Estimation +1

Breaking the Spatio-Angular Trade-off for Light Field Super-Resolution via LSTM Modelling on Epipolar Plane Images

no code implementations 15 Feb 2019 Hao Zhu, Mantang Guo, Hongdong Li, Qing Wang, Antonio Robles-Kelly

We prove that the light field is a 2D series; thus, a specifically designed CNN-LSTM network is proposed to capture the continuity property of the EPI.

Super-Resolution

Ground Plane based Absolute Scale Estimation for Monocular Visual Odometry

no code implementations 3 Mar 2019 Dingfu Zhou, Yuchao Dai, Hongdong Li

Recovering the absolute metric scale from a monocular camera is a challenging but highly desirable problem for monocular camera-based systems.

Monocular Visual Odometry

Lending Orientation to Neural Networks for Cross-view Geo-localization

1 code implementation CVPR 2019 Liu Liu, Hongdong Li

The goal is to predict the spatial location of a ground-level query image by matching it to a large geotagged aerial image database (e.g., satellite imagery).

Noise-Aware Unsupervised Deep Lidar-Stereo Fusion

3 code implementations CVPR 2019 Xuelian Cheng, Yiran Zhong, Yuchao Dai, Pan Ji, Hongdong Li

In this paper, we present LidarStereoNet, the first unsupervised Lidar-stereo fusion network, which can be trained in an end-to-end manner without the need of ground truth depth maps.

Depth Completion Stereo Matching +1

Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes

no code implementations CVPR 2019 Yiran Zhong, Pan Ji, Jianyuan Wang, Yuchao Dai, Hongdong Li

In this paper, we propose Deep Epipolar Flow, an unsupervised optical flow method which incorporates global geometric constraints into network learning.

Benchmarking Optical Flow Estimation

Neural Collaborative Subspace Clustering

no code implementations 24 Apr 2019 Tong Zhang, Pan Ji, Mehrtash Harandi, Wenbing Huang, Hongdong Li

We introduce the Neural Collaborative Subspace Clustering, a neural model that discovers clusters of data points drawn from a union of low-dimensional subspaces.

Clustering

Optimal Feature Transport for Cross-View Image Geo-Localization

1 code implementation 11 Jul 2019 Yujiao Shi, Xin Yu, Liu Liu, Tong Zhang, Hongdong Li

This paper proposes a novel Cross-View Feature Transport (CVFT) technique to explicitly establish cross-view domain transfer that facilitates feature alignment between ground and aerial images.

Image-Based Localization Metric Learning

Learning Trajectory Dependencies for Human Motion Prediction

5 code implementations ICCV 2019 Wei Mao, Miaomiao Liu, Mathieu Salzmann, Hongdong Li

In this paper, we propose a simple feed-forward deep network for motion prediction, which takes into account both temporal smoothness and spatial dependencies among human body joints.

Human motion prediction Human Pose Forecasting +2

Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention

1 code implementation 20 Aug 2019 Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Fatemeh Sadat Saleh, Hongdong Li, Stephen Gould

Given an untrimmed video and a sentence as the query, the goal is to determine the start and the end of the relevant visual moment in the video that corresponds to the query sentence.

Sentence

Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison

2 code implementations 24 Oct 2019 Dongxu Li, Cristian Rodriguez Opazo, Xin Yu, Hongdong Li

Based on this new large-scale dataset, we are able to experiment with several deep learning methods for word-level sign recognition and evaluate their performances in large scale scenarios.

Action Classification Benchmarking +3

Superpixel Soup: Monocular Dense 3D Reconstruction of a Complex Dynamic Scene

no code implementations 19 Nov 2019 Suryansh Kumar, Yuchao Dai, Hongdong Li

We assume that a dynamic scene can be approximated by numerous piecewise planar surfaces, where each planar surface enjoys its own rigid motion, and the global change in the scene between two frames is as-rigid-as-possible (ARAP).

3D Reconstruction

Spatial-Aware Feature Aggregation for Image based Cross-View Geo-Localization

1 code implementation NeurIPS 2019 Yujiao Shi, Liu Liu, Xin Yu, Hongdong Li

The first step is to apply a regular polar transform to warp an aerial image such that its domain is closer to that of a ground-view panorama.

Image-Based Localization

Few-shot Action Recognition with Permutation-invariant Attention

1 code implementation ECCV 2020 Hongguang Zhang, Li Zhang, Xiaojuan Qi, Hongdong Li, Philip H. S. Torr, Piotr Koniusz

Such encoded blocks are aggregated by permutation-invariant pooling to make our approach robust to varying action lengths and long-range temporal dependencies whose patterns are unlikely to repeat even in clips of the same class.

Few-Shot action recognition Few Shot Action Recognition +3

6DoF Object Pose Estimation via Differentiable Proxy Voting Loss

no code implementations 10 Feb 2020 Xin Yu, Zheyu Zhuang, Piotr Koniusz, Hongdong Li

In this paper, we aim to reduce such errors by incorporating the distances between pixels and keypoints into our objective.

Object Pose Estimation

Transferring Cross-domain Knowledge for Video Sign Language Recognition

no code implementations CVPR 2020 Dongxu Li, Xin Yu, Chenchen Xu, Lars Petersson, Hongdong Li

To this end, we extract news signs using a base WSLR model, and then design a classifier jointly trained on news and isolated signs to coarsely align these two domain features.

Sign Language Recognition

Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point Problem

1 code implementation 15 Mar 2020 Liu Liu, Dylan Campbell, Hongdong Li, Dingfu Zhou, Xibin Song, Ruigang Yang

Conventional absolute camera pose via a Perspective-n-Point (PnP) solver often assumes that the correspondences between 2D image pixels and 3D points are given.

Deblurring by Realistic Blurring

1 code implementation CVPR 2020 Kaihao Zhang, Wenhan Luo, Yiran Zhong, Lin Ma, Bjorn Stenger, Wei Liu, Hongdong Li

To address this problem, we propose a new method which combines two GAN models, i.e., a learning-to-Blur GAN (BGAN) and learning-to-DeBlur GAN (DBGAN), in order to learn a better model for image deblurring by primarily learning how to blur images.

Deblurring Image Deblurring

Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching

1 code implementation CVPR 2020 Yujiao Shi, Xin Yu, Dylan Campbell, Hongdong Li

Cross-view geo-localization is the problem of estimating the position and orientation (latitude, longitude and azimuth angle) of a camera at ground level given a large-scale database of geo-tagged aerial (e.g., satellite) images.

End-to-end Learning for Inter-Vehicle Distance and Relative Velocity Estimation in ADAS with a Monocular Camera

1 code implementation 7 Jun 2020 Zhenbo Song, Jianfeng Lu, Tong Zhang, Hongdong Li

In this paper, we propose a monocular camera-based inter-vehicle distance and relative velocity estimation method based on end-to-end training of a deep neural network.

Optical Flow Estimation

Dense Non-Rigid Structure from Motion: A Manifold Viewpoint

no code implementations 15 Jun 2020 Suryansh Kumar, Luc van Gool, Carlos E. P. de Oliveira, Anoop Cherian, Yuchao Dai, Hongdong Li

Assuming that a deforming shape is composed of a union of local linear subspaces and spans a global low-rank space over multiple frames enables us to efficiently model complex non-rigid deformations.

Clustering

Hierarchical Neural Architecture Search for Deep Stereo Matching

1 code implementation NeurIPS 2020 Xuelian Cheng, Yiran Zhong, Mehrtash Harandi, Yuchao Dai, Xiaojun Chang, Tom Drummond, Hongdong Li, ZongYuan Ge

To reduce the human efforts in neural network design, Neural Architecture Search (NAS) has been applied with remarkable success to various high-level vision tasks such as classification and semantic segmentation.

Neural Architecture Search Semantic Segmentation +3

Displacement-Invariant Matching Cost Learning for Accurate Optical Flow Estimation

3 code implementations NeurIPS 2020 Jianyuan Wang, Yiran Zhong, Yuchao Dai, Kaihao Zhang, Pan Ji, Hongdong Li

Learning matching costs has been shown to be critical to the success of the state-of-the-art deep stereo matching methods, in which 3D convolutions are applied on a 4D feature volume to learn a 3D cost volume.

Optical Flow Estimation Stereo Matching

PlueckerNet: Learn to Register 3D Line Reconstructions

2 code implementations 2 Dec 2020 Liu Liu, Hongdong Li, Haodong Yao, Ruyi Zha

Aligning two partially-overlapped 3D line reconstructions in Euclidean space is challenging, as we need to simultaneously solve correspondences and relative pose between line reconstructions.

Translation

Efficient Depth Completion Using Learned Bases

no code implementations 2 Dec 2020 Yiran Zhong, Yuchao Dai, Hongdong Li

The given sparse depth points serve as a data term to constrain the weighting process.

Depth Completion

Depth Completion using Piecewise Planar Model

no code implementations 6 Dec 2020 Yiran Zhong, Yuchao Dai, Hongdong Li

More specifically, we represent the desired depth map as a collection of 3D planes, and the reconstruction problem is formulated as the optimization of the planar parameters.

Depth Completion Visual Odometry

Canny-VO: Visual Odometry with RGB-D Cameras based on Geometric 3D-2D Edge Alignment

no code implementations 15 Dec 2020 Yi Zhou, Hongdong Li, Laurent Kneip

The present paper reviews the classical problem of free-form curve registration and applies it to an efficient RGBD visual odometry system called Canny-VO, as it efficiently tracks all Canny edge features extracted from the images.

Visual Odometry

Benchmarking Ultra-High-Definition Image Super-Resolution

no code implementations ICCV 2021 Kaihao Zhang, Dongxu Li, Wenhan Luo, Wenqi Ren, Bjorn Stenger, Wei Liu, Hongdong Li, Ming-Hsuan Yang

Increasingly, modern mobile devices allow capturing images at Ultra-High-Definition (UHD) resolution, which includes 4K and 8K images.

4k 8k +3

Geometry-Guided Street-View Panorama Synthesis from Satellite Imagery

1 code implementation 2 Mar 2021 Yujiao Shi, Dylan Campbell, Xin Yu, Hongdong Li

Specifically, we observe that when a 3D point in the real world is visible in both views, there is a deterministic mapping between the projected points in the two-view images given the height information of this 3D point.

Image Generation

Deep Dense Multi-scale Network for Snow Removal Using Semantic and Geometric Priors

no code implementations 21 Mar 2021 Kaihao Zhang, Rongqing Li, Yanjiang Yu, Wenhan Luo, Changsheng Li, Hongdong Li

Images captured on snowy days suffer from noticeable degradation of scene visibility, which degrades the performance of current vision-based intelligent systems.

Image Restoration Snow Removal

Self-Supervised Visibility Learning for Novel View Synthesis

1 code implementation CVPR 2021 Yujiao Shi, Hongdong Li, Xin Yu

We then warp and aggregate source view pixels to synthesize a novel view based on the estimated source-view visibility and target-view depth.

Novel View Synthesis

Learning Optical Flow from a Few Matches

1 code implementation CVPR 2021 Shihao Jiang, Yao Lu, Hongdong Li, Richard Hartley

In this paper, we show that the dense correlation volume representation is redundant and accurate flow estimation can be achieved with only a fraction of elements in it.

Optical Flow Estimation

Learning to Estimate Hidden Motions with Global Motion Aggregation

2 code implementations ICCV 2021 Shihao Jiang, Dylan Campbell, Yao Lu, Hongdong Li, Richard Hartley

We demonstrate that the optical flow estimates in the occluded regions can be significantly improved without damaging the performance in non-occluded regions.

Optical Flow Estimation

One Ring to Rule Them All: a simple solution to multi-view 3D-Reconstruction of shapes with unknown BRDF via a small Recurrent ResNet

no code implementations 11 Apr 2021 Ziang Cheng, Hongdong Li, Richard Hartley, Yinqiang Zheng, Imari Sato

This paper proposes a simple method which solves an open problem of multi-view 3D-Reconstruction for objects with unknown and generic surface materials, imaged by a freely moving camera and a freely moving point light source.

3D Reconstruction Multi-View 3D Reconstruction +3

Lighting, Reflectance and Geometry Estimation from 360° Panoramic Stereo

no code implementations 20 Apr 2021 Junxuan Li, Hongdong Li, Yasuyuki Matsushita

We propose a method for estimating high-definition spatially-varying lighting, reflectance, and geometry of a scene from 360° stereo images.

Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior

no code implementations 9 May 2021 Kaihao Zhang, Wenhan Luo, Yanjiang Yu, Wenqi Ren, Fang Zhao, Changsheng Li, Lin Ma, Wei Liu, Hongdong Li

We first use a coarse deraining network to reduce the rain streaks on the input images, and then adopt a pre-trained semantic segmentation network to extract semantic features from the coarse derained image.

Benchmarking Rain Removal +1

Multi-view 3D Reconstruction of a Texture-less Smooth Surface of Unknown Generic Reflectance

1 code implementation CVPR 2021 Ziang Cheng, Hongdong Li, Yuta Asano, Yinqiang Zheng, Imari Sato

Recovering the 3D geometry of a purely texture-less object with generally unknown surface reflectance (e.g., non-Lambertian) is regarded as a challenging task in multi-view reconstruction.

3D Object Reconstruction 3D Reconstruction +2

Shell Theory: A Statistical Model of Reality

1 code implementation IEEE Transactions on Pattern Analysis and Machine Intelligence 2021 Wen-Yan Lin, Siying Liu, Changhao Ren, Ngai-Man Cheung, Hongdong Li, Yasuyuki Matsushita

The foundational assumption of machine learning is that the data under consideration is separable into classes; while intuitively reasonable, separability constraints have proven remarkably difficult to formulate mathematically.

Anomaly Detection BIG-bench Machine Learning +6

Multi-level Motion Attention for Human Motion Prediction

1 code implementation 17 Jun 2021 Wei Mao, Miaomiao Liu, Mathieu Salzmann, Hongdong Li

Whether based on recurrent or feed-forward neural networks, existing learning based methods fail to model the observation that human motion tends to repeat itself, even for complex sports actions and cooking activities.

Human motion prediction motion prediction

PluckerNet: Learn To Register 3D Line Reconstructions

no code implementations CVPR 2021 Liu Liu, Hongdong Li, Haodong Yao, Ruyi Zha

Aligning two partially-overlapped 3D line reconstructions in Euclidean space is challenging, as we need to simultaneously solve line correspondences and relative pose between reconstructions.

Translation

Lighting, Reflectance and Geometry Estimation From 360° Panoramic Stereo

no code implementations CVPR 2021 Junxuan Li, Hongdong Li, Yasuyuki Matsushita

We propose a method for estimating high-definition spatially-varying lighting, reflectance, and geometry of a scene from 360° stereo images.

Ranking Models in Unlabeled New Environments

2 code implementations ICCV 2021 Xiaoxiao Sun, Yunzhong Hou, Weijian Deng, Hongdong Li, Liang Zheng

For this problem, we propose to adopt a proxy dataset that 1) is fully labeled and 2) well reflects the true model rankings in a given target environment, and use the performance rankings on the proxy sets as surrogates.

Person Re-Identification

Blind Image Decomposition

1 code implementation 25 Aug 2021 Junlin Han, Weihao Li, Pengfei Fang, Chunyi Sun, Jie Hong, Mohammad Ali Armin, Lars Petersson, Hongdong Li

We propose and study a novel task named Blind Image Decomposition (BID), which requires separating a superimposed image into constituent underlying images in a blind setting, that is, both the source components involved in mixing as well as the mixing mechanism are unknown.

Rain Removal

Neural Plenoptic Sampling: Capture Light-field from Imaginary Eyes

no code implementations 29 Sep 2021 Junxuan Li, Yujiao Shi, Hongdong Li

It encodes a complete light-field (i.e., lumigraph) and therefore allows one to freely roam in the space and view the scene from any location in any direction.

Novel View Synthesis Position

Neural Photometric Stereo for Shape and Material Estimation

no code implementations 29 Sep 2021 Junxuan Li, Hongdong Li

This paper addresses a challenging Photometric-Stereo problem where the object to be reconstructed has unknown, non-Lambertian, and possibly spatially-varying surface materials.

Learning To Segment Dominant Object Motion From Watching Videos

no code implementations 28 Nov 2021 Sahir Shrestha, Mohammad Ali Armin, Hongdong Li, Nick Barnes

Existing deep learning based unsupervised video object segmentation methods still rely on ground-truth segmentation masks to train.

Object Optical Flow Estimation +4

HDR-NeRF: High Dynamic Range Neural Radiance Fields

no code implementations CVPR 2022 Xin Huang, Qi Zhang, Ying Feng, Hongdong Li, Xuan Wang, Qing Wang

The key to our method is to model the physical imaging process, which dictates that the radiance of a scene point transforms to a pixel value in the LDR image with two implicit functions: a radiance field and a tone mapper.

Vocal Bursts Intensity Prediction

Improving GAN Equilibrium by Raising Spatial Awareness

1 code implementation CVPR 2022 Jianyuan Wang, Ceyuan Yang, Yinghao Xu, Yujun Shen, Hongdong Li, Bolei Zhou

We further propose to align the spatial awareness of G with the attention map induced from D. In this way, we effectively lessen the information gap between D and G. Extensive results show that our method pushes the two-player game in GANs closer to the equilibrium, leading to better synthesis performance.

Attribute Inductive Bias

Label-Free Model Evaluation with Semi-Structured Dataset Representations

1 code implementation 1 Dec 2021 Xiaoxiao Sun, Yunzhong Hou, Hongdong Li, Liang Zheng

In the absence of image labels, based on dataset representations, we estimate model performance for AutoEval with regression.

regression

MC-Blur: A Comprehensive Benchmark for Image Deblurring

2 code implementations 1 Dec 2021 Kaihao Zhang, Tao Wang, Wenhan Luo, Boheng Chen, Wenqi Ren, Bjorn Stenger, Wei Liu, Hongdong Li, Ming-Hsuan Yang

Blur artifacts can seriously degrade the visual quality of images, and numerous deblurring methods have been proposed for specific scenarios.

Benchmarking Deblurring +1

Align and Prompt: Video-and-Language Pre-training with Entity Prompts

1 code implementation CVPR 2022 Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi

To achieve this, we first introduce an entity prompter module, which is trained with VTC to produce the similarity between a video crop and text prompts instantiated with entity names.

Entity Alignment Retrieval +3

Transcribing Natural Languages for The Deaf via Neural Editing Programs

1 code implementation 17 Dec 2021 Dongxu Li, Chenchen Xu, Liu Liu, Yiran Zhong, Rong Wang, Lars Petersson, Hongdong Li

This work studies the task of glossification, whose aim is to transcribe natural spoken language sentences for the Deaf (hard-of-hearing) community into ordered sign language glosses.

Sentence

Multi-level Second-order Few-shot Learning

1 code implementation 15 Jan 2022 Hongguang Zhang, Hongdong Li, Piotr Koniusz

The goal of multi-level feature design is to extract feature representations at different layer-wise levels of CNN, realizing several levels of visual abstraction to achieve robust few-shot learning.

Few-Shot action recognition Few Shot Action Recognition +2

Deep Image Deblurring: A Survey

no code implementations 26 Jan 2022 Kaihao Zhang, Wenqi Ren, Wenhan Luo, Wei-Sheng Lai, Bjorn Stenger, Ming-Hsuan Yang, Hongdong Li

Image deblurring is a classic problem in low-level computer vision with the aim to recover a sharp image from a blurred input image.

Deblurring Image Deblurring

Neural Reflectance for Shape Recovery with Shadow Handling

1 code implementation CVPR 2022 Junxuan Li, Hongdong Li

This network is able to leverage the observed photometric variance and shadows on the surface, and recover both surface shape and general non-Lambertian reflectance.

Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image Matching

1 code implementation 26 Mar 2022 Yujiao Shi, Xin Yu, Liu Liu, Dylan Campbell, Piotr Koniusz, Hongdong Li

We address the problem of ground-to-satellite image geo-localization, that is, estimating the camera latitude, longitude and orientation (azimuth angle) by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.

Image Retrieval Retrieval

Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image

1 code implementation CVPR 2022 Yujiao Shi, Hongdong Li

This paper addresses the problem of vehicle-mounted camera localization by matching a ground-level image with an overhead-view satellite map.

Camera Localization Image Retrieval +2

Deep Non-rigid Structure-from-Motion: A Sequence-to-Sequence Translation Perspective

no code implementations 10 Apr 2022 Hui Deng, Tong Zhang, Yuchao Dai, Jiawei Shi, Yiran Zhong, Hongdong Li

In this paper, we propose to model deep NRSfM from a sequence-to-sequence translation perspective, where the input 2D frame sequence is taken as a whole to reconstruct the deforming 3D non-rigid shape sequence.

3D Reconstruction Translation

Unsupervised Video Interpolation by Learning Multilayered 2.5D Motion Fields

no code implementations 21 Apr 2022 Ziang Cheng, Shihao Jiang, Hongdong Li

This implicit neural representation learns the video as a space-time continuum, allowing frame interpolation at any temporal resolution.

Video Frame Interpolation

CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping

1 code implementation 31 May 2022 Junlin Han, Lars Petersson, Hongdong Li, Ian Reid

We present a simple method, CropMix, for the purpose of producing a rich input distribution from the original dataset distribution.

Contrastive Learning

Self-calibrating Photometric Stereo by Neural Inverse Rendering

1 code implementation 16 Jul 2022 Junxuan Li, Hongdong Li

This paper tackles the task of uncalibrated photometric stereo for 3D object reconstruction, where the object shape, object reflectance, and lighting directions are all unknown.

3D Object Reconstruction Inverse Rendering +1

Satellite Image Based Cross-view Localization for Autonomous Vehicle

no code implementations 27 Jul 2022 Shan Wang, Yanhao Zhang, Ankit Vora, Akhil Perincherry, Hongdong Li

This paper introduces a novel approach to cross-view localization that departs from the conventional image retrieval method.

Autonomous Vehicles Image Retrieval +1

CVLNet: Cross-View Semantic Correspondence Learning for Video-based Camera Localization

no code implementations 7 Aug 2022 Yujiao Shi, Xin Yu, Shan Wang, Hongdong Li

The critical challenge of this task is to learn a powerful global feature descriptor for the sequential ground-view images while considering its domain alignment with reference satellite images.

Camera Localization Image-Based Localization +1

NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction

1 code implementation 29 Sep 2022 Ruyi Zha, Yanhao Zhang, Hongdong Li

This paper proposes a novel and fast self-supervised solution for sparse-view Cone Beam Computed Tomography (CBCT) reconstruction that requires no external training data.

Low-Dose X-Ray CT Reconstruction Novel View Synthesis

Distance Based Image Classification: A solution to generative classification's conundrum?

1 code implementation 4 Oct 2022 Wen-Yan Lin, Siying Liu, Bing Tian Dai, Hongdong Li

We use the model to develop a classification scheme which suppresses the impact of noise while preserving semantic cues.

Image Classification

What Images are More Memorable to Machines?

1 code implementation 14 Nov 2022 Junlin Han, Huangying Zhan, Jie Hong, Pengfei Fang, Hongdong Li, Lars Petersson, Ian Reid

This paper studies the problem of measuring and predicting how memorable an image is to pattern recognition machines, as a path to explore machine intelligence.

Interacting Hand-Object Pose Estimation via Dense Mutual Attention

1 code implementation 16 Nov 2022 Rong Wang, Wei Mao, Hongdong Li

In contrast, we propose a novel dense mutual attention mechanism that is able to model fine-grained dependencies between the hand and the object.

Ranked #2 on hand-object pose on HO-3D (using extra training data)

hand-object pose Object +1

Wide-Angle Rectification via Content-Aware Conformal Mapping

no code implementations CVPR 2023 Qi Zhang, Hongdong Li, Qing Wang

Despite the proliferation of ultra wide-angle lenses on smartphone cameras, such lenses often come with severe image distortion (e.g., curved linear structure, unnaturally skewed faces).

A Rotation-Translation-Decoupled Solution for Robust and Efficient Visual-Inertial Initialization

1 code implementation CVPR 2023 Yijia He, Bo Xu, Zhanpeng Ouyang, Hongdong Li

We propose a novel visual-inertial odometry (VIO) initialization method, which decouples rotation and translation estimation, and achieves higher efficiency and better robustness.

Translation

Spatial Steerability of GANs via Self-Supervision from Discriminator

no code implementations 20 Jan 2023 Jianyuan Wang, Lalit Bhagat, Ceyuan Yang, Yinghao Xu, Yujun Shen, Hongdong Li, Bolei Zhou

In this work, we propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space or requiring extra annotations.

Image Generation Inductive Bias +1

CircNet: Meshing 3D Point Clouds with Circumcenter Detection

1 code implementation 23 Jan 2023 Huan Lei, Ruitao Leng, Liang Zheng, Hongdong Li

In this paper, we leverage the duality between a triangle and its circumcenter, and introduce a deep neural network that detects the circumcenters to achieve point cloud triangulation.

Surface Reconstruction

MEGANE: Morphable Eyeglass and Avatar Network

no code implementations CVPR 2023 Junxuan Li, Shunsuke Saito, Tomas Simon, Stephen Lombardi, Hongdong Li, Jason Saragih

However, modeling the geometric and appearance interactions of glasses and the face of virtual representations of humans is challenging.

Image Generation Inverse Rendering

Event-guided Multi-patch Network with Self-supervision for Non-uniform Motion Deblurring

1 code implementation 14 Feb 2023 Hongguang Zhang, Limeng Zhang, Yuchao Dai, Hongdong Li, Piotr Koniusz

Contemporary deep learning multi-scale deblurring models suffer from many issues: 1) they perform poorly on non-uniformly blurred images/videos; 2) simply increasing the model depth with finer-scale levels cannot improve deblurring; 3) individual RGB frames contain limited motion information for deblurring; 4) previous models have limited robustness to spatial transformations and noise.

Deblurring

WildLight: In-the-wild Inverse Rendering with a Flashlight

1 code implementation CVPR 2023 Ziang Cheng, Junxuan Li, Hongdong Li

Our system recovers scene geometry and reflectance using only multi-view images captured by a smartphone.

3D Reconstruction Inverse Rendering

Inverting the Imaging Process by Learning an Implicit Camera Model

no code implementations CVPR 2023 Xin Huang, Qi Zhang, Ying Feng, Hongdong Li, Qing Wang

In principle, our new implicit neural camera model has the potential to benefit a wide array of other inverse imaging tasks.

Boosting 3-DoF Ground-to-Satellite Camera Localization Accuracy via Geometry-Guided Cross-View Transformer

1 code implementation ICCV 2023 Yujiao Shi, Fei Wu, Akhil Perincherry, Ankit Vora, Hongdong Li

In this paper, we propose a method to increase the accuracy of a ground camera's location and orientation by estimating the relative rotation and translation between the ground-level image and its matched/retrieved satellite image.

Camera Localization Image Retrieval +2

EndoSurf: Neural Surface Reconstruction of Deformable Tissues with Stereo Endoscope Videos

1 code implementation 21 Jul 2023 Ruyi Zha, Xuelian Cheng, Hongdong Li, Mehrtash Harandi, ZongYuan Ge

We constrain the learned shape by tailoring multiple regularization strategies and disentangling geometry and appearance.

Surface Reconstruction

LLDiffusion: Learning Degradation Representations in Diffusion Models for Low-Light Image Enhancement

1 code implementation 27 Jul 2023 Tao Wang, Kaihao Zhang, Ziqian Shao, Wenhan Luo, Bjorn Stenger, Tae-Kyun Kim, Wei Liu, Hongdong Li

In this paper, we address this limitation by proposing a degradation-aware learning scheme for LLIE using diffusion models, which effectively integrates degradation and image priors into the diffusion process, resulting in improved image enhancement.

Image Generation Low-Light Image Enhancement

View Consistent Purification for Accurate Cross-View Localization

no code implementations ICCV 2023 Shan Wang, Yanhao Zhang, Akhil Perincherry, Ankit Vora, Hongdong Li

This paper proposes a fine-grained self-localization method for outdoor robotics that utilizes a flexible number of onboard cameras and readily accessible satellite images.

Pose Estimation

Unsupervised 3D Pose Estimation with Non-Rigid Structure-from-Motion Modeling

no code implementations 18 Aug 2023 Haorui Ji, Hui Deng, Yuchao Dai, Hongdong Li

Most of the previous 3D human pose estimation work relied on the powerful memory capability of the network to obtain suitable 2D-3D mappings from the training data.

3D Human Pose Estimation 3D Pose Estimation

Faster Stochastic Variance Reduction Methods for Compositional MiniMax Optimization

no code implementations 18 Aug 2023 Jin Liu, Xiaokang Pan, Junwen Duan, Hongdong Li, Youqi Li, Zhe Qu

All the proposed complexities indicate that our proposed methods can match the lower bounds for existing minimax optimizations, without requiring a large batch size in each iteration.

Stochastic Optimization

MB-TaylorFormer: Multi-branch Efficient Transformer Expanded by Taylor Formula for Image Dehazing

1 code implementation ICCV 2023 Yuwei Qiu, Kaihao Zhang, Chenxi Wang, Wenhan Luo, Hongdong Li, Zhi Jin

To address this issue, we propose a new Transformer variant, which applies the Taylor expansion to approximate the softmax-attention and achieves linear computational complexity.

Image Dehazing

Stereo Matching in Time: 100+ FPS Video Stereo Matching for Extended Reality

no code implementations 8 Sep 2023 Ziang Cheng, Jiayu Yang, Hongdong Li

One of the major difficulties is the lack of high-quality indoor video stereo training datasets captured by head-mounted VR/AR glasses.

Mixed Reality Stereo Matching

Deep Video Restoration for Under-Display Camera

no code implementations 9 Sep 2023 Xuanxi Chen, Tao Wang, Ziqian Shao, Kaihao Zhang, Wenhan Luo, Tong Lu, Zikun Liu, Tae-Kyun Kim, Hongdong Li

With the pipeline, we build the first large-scale UDC video restoration dataset called PexelsUDC, which includes two subsets named PexelsUDC-T and PexelsUDC-P corresponding to different displays for UDC.

Video Restoration

RGB-based Category-level Object Pose Estimation via Decoupled Metric Scale Recovery

1 code implementation 19 Sep 2023 Jiaxin Wei, Xibin Song, Weizhe Liu, Laurent Kneip, Hongdong Li, Pan Ji

While showing promising results, recent RGB-D camera-based category-level object pose estimation methods have restricted applications due to the heavy reliance on depth sensors.

Object Pose Estimation

Alice Benchmarks: Connecting Real World Re-Identification with the Synthetic

no code implementations 6 Oct 2023 Xiaoxiao Sun, Yue Yao, Shengjin Wang, Hongdong Li, Liang Zheng

In this paper, we detail the settings of Alice benchmarks, provide an analysis of existing commonly-used domain adaptation methods, and discuss some interesting future directions.

Domain Adaptation

DeepSimHO: Stable Pose Estimation for Hand-Object Interaction via Physics Simulation

1 code implementation NeurIPS 2023 Rong Wang, Wei Mao, Hongdong Li

Specifically, for an initial hand-object pose estimated by a base network, we forward it to a physics simulator to evaluate its stability.

3D Pose Estimation hand-object pose +1

ConsistNet: Enforcing 3D Consistency for Multi-view Images Diffusion

1 code implementation 16 Oct 2023 Jiayu Yang, Ziang Cheng, Yunfei Duan, Pan Ji, Hongdong Li

Given a single image of a 3D object, this paper proposes a novel method (named ConsistNet) that is able to generate multiple images of the same object, as if they were captured from different viewpoints, while the 3D (multi-view) consistencies among those multiple generated images are effectively exploited.

Depth Estimation Depth Prediction +2

3D Human Pose Analysis via Diffusion Synthesis

no code implementations 17 Jan 2024 Haorui Ji, Hongdong Li

In this paper, we propose PADS (Pose Analysis by Diffusion Synthesis), a novel framework designed to address various challenges in 3D human pose analysis through a unified pipeline.

Denoising

BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation

no code implementations 30 Jan 2024 Zhennan Wu, Yang Li, Han Yan, Taizhang Shang, Weixuan Sun, Senbo Wang, Ruikai Cui, Weizhe Liu, Hiroyuki Sato, Hongdong Li, Pan Ji

A variational auto-encoder is employed to compress the tri-planes into the latent tri-plane space, on which the denoising diffusion process is performed.

Denoising Scene Generation

PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal

1 code implementation 4 Feb 2024 Tao Wang, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Tae-Kyun Kim, Tong Lu, Hongdong Li, Ming-Hsuan Yang

For the prompt generation, we first propose a prompt pre-training strategy to train a frequency prompt encoder that encodes the ground-truth image into LF and HF prompts.

Reflection Removal

Strong and Controllable Blind Image Decomposition

1 code implementation 15 Mar 2024 Zeyu Zhang, Junlin Han, Chenhui Gou, Hongdong Li, Liang Zheng

To address this need, we add controllability to the blind image decomposition process, allowing users to enter which types of degradation to remove or retain.

Frankenstein: Generating Semantic-Compositional 3D Scenes in One Tri-Plane

no code implementations 24 Mar 2024 Han Yan, Yang Li, Zhennan Wu, Shenzhou Chen, Weixuan Sun, Taizhang Shang, Weizhe Liu, Tian Chen, Xiaqiang Dai, Chao Ma, Hongdong Li, Pan Ji

We present Frankenstein, a diffusion-based framework that can generate semantic-compositional 3D scenes in a single pass.

Denoising

Beyond Monocular Deraining: Stereo Image Deraining via Semantic Understanding

no code implementations ECCV 2020 Kaihao Zhang, Wenhan Luo, Wenqi Ren, Jingwen Wang, Fang Zhao, Lin Ma, Hongdong Li

Moreover, even for single-image-based monocular deraining, many current methods fail to complete the task satisfactorily because they mostly rely on per-pixel loss functions and ignore semantic information.

Benchmarking Rain Removal +1

Automatic Gloss Dictionary for Sign Language Learners

no code implementations ACL 2022 Chenchen Xu, Dongxu Li, Hongdong Li, Hanna Suominen, Ben Swift

A multi-language dictionary is a fundamental tool for language learning, allowing the learner to look up unfamiliar words.

Deep Novel View Synthesis from Colored 3D Point Clouds

1 code implementation ECCV 2020 Zhenbo Song, Wayne Chen, Dylan Campbell, Hongdong Li

We propose a new deep neural network which takes a colored 3D point cloud of a scene, and directly synthesizes a photo-realistic image from an arbitrary viewpoint.

Image Generation Novel View Synthesis
