Search Results for author: Luc van Gool

Found 451 papers, 226 papers with code

Fixing Localization Errors to Improve Image Classification

1 code implementation ECCV 2020 Guolei Sun, Salman Khan, Wen Li, Hisham Cholakkal, Fahad Shahbaz Khan, Luc van Gool

This way, in an effort to fix localization errors, our loss provides an extra supervisory signal that helps the model to better discriminate between similar classes.

Classification General Classification +3

Modeling the Effects of Windshield Refraction for Camera Calibration

no code implementations ECCV 2020 Frank Verbiest, Marc Proesmans, Luc van Gool

Instead of using a generalized camera approach, we propose a novel approach to jointly optimize a traditional camera model, and a mathematical representation of the windshield’s surface.

Autonomous Driving Camera Calibration

NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions

no code implementations 22 Mar 2023 Mohamad Shahbazi, Evangelos Ntavelis, Alessio Tonioni, Edo Collins, Danda Pani Paudel, Martin Danelljan, Luc van Gool

Pose-conditioned convolutional generative models struggle with high-quality 3D-consistent image generation from single-view datasets, due to their lack of sufficient 3D priors.

Image Generation Inductive Bias

DiffIR: Efficient Diffusion Model for Image Restoration

no code implementations 16 Mar 2023 Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Luc van Gool

Since the iterations are few, our DiffIR can adopt a joint optimization of CPEN$_{S2}$, DIRformer, and denoising network, which can further reduce the estimation error influence.

Denoising Image Generation +1

Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution

no code implementations 15 Mar 2023 Zixiang Zhao, Jiangshe Zhang, Xiang Gu, Chengli Tan, Shuang Xu, Yulun Zhang, Radu Timofte, Luc van Gool

Then, the extracted features are mapped to the spherical space to complete the separation of private features and the alignment of shared features.

Contrastive Learning Depth Map Super-Resolution

Graph Transformer GANs for Graph-Constrained House Generation

no code implementations 14 Mar 2023 Hao Tang, Zhenyu Zhang, Humphrey Shi, Bo Li, Ling Shao, Nicu Sebe, Radu Timofte, Luc van Gool

We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations in an end-to-end fashion for the challenging graph-constrained house generation task.

House Generation Node Classification

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

no code implementations 13 Mar 2023 Zixiang Zhao, Haowen Bai, Yuanzhi Zhu, Jiangshe Zhang, Shuang Xu, Yulun Zhang, Kai Zhang, Deyu Meng, Radu Timofte, Luc van Gool

To leverage strong generative priors and address challenges such as unstable training and lack of interpretability for GAN-based generative methods, we propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM).

Denoising

Contrastive Model Adaptation for Cross-Condition Robustness in Semantic Segmentation

no code implementations 9 Mar 2023 David Bruggemann, Christos Sakaridis, Tim Brödermann, Luc van Gool

We investigate normal-to-adverse condition model adaptation for semantic segmentation, whereby image-level correspondences are available in the target domain.

Contrastive Learning Semantic Segmentation +2

A Multiplicative Value Function for Safe and Efficient Reinforcement Learning

1 code implementation 7 Mar 2023 Nick Bührer, Zhejun Zhang, Alexander Liniger, Fisher Yu, Luc van Gool

To this end, we propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.

Navigate reinforcement-learning +3

TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction

1 code implementation 7 Mar 2023 Zhejun Zhang, Alexander Liniger, Dengxin Dai, Fisher Yu, Luc van Gool

We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving, and based on TrafficBots we obtain a world model tailored for the planning module of autonomous vehicles.

Autonomous Driving Model-based Reinforcement Learning +1

Efficient and Explicit Modelling of Image Hierarchies for Image Restoration

1 code implementation 1 Mar 2023 Yawei Li, Yuchen Fan, Xiaoyu Xiang, Denis Demandolx, Rakesh Ranjan, Radu Timofte, Luc van Gool

The aim of this paper is to propose a mechanism to efficiently and explicitly model image hierarchies in the global, regional, and local range for image restoration.

Image Restoration

VA-DepthNet: A Variational Approach to Single Image Depth Prediction

1 code implementation 13 Feb 2023 Ce Liu, Suryansh Kumar, Shuhang Gu, Radu Timofte, Luc van Gool

While state-of-the-art deep neural network methods for SIDP learn the scene depth from images in a supervised setting, they often overlook the invaluable invariances and priors in the rigid scene space, such as the regularity of the scene.

Depth Estimation Depth Prediction

Event-Based Frame Interpolation with Ad-hoc Deblurring

no code implementations 12 Jan 2023 Lei Sun, Christos Sakaridis, Jingyun Liang, Peng Sun, JieZhang Cao, Kai Zhang, Qi Jiang, Kaiwei Wang, Luc van Gool

The performance of video frame interpolation is inherently correlated with the ability to handle motion in the input scene.

Deblurring Image Deblurring +1

Beyond SOT: It's Time to Track Multiple Generic Objects at Once

no code implementations 22 Dec 2022 Christoph Mayer, Martin Danelljan, Ming-Hsuan Yang, Vittorio Ferrari, Luc van Gool, Alina Kuznetsova

TaMOs achieves a 4x faster run-time in case of 10 concurrent objects compared to tracking each object independently and outperforms existing single object trackers on our new benchmark.

Object Tracking

One-Shot Domain Adaptive and Generalizable Semantic Segmentation with Class-Aware Cross-Domain Transformers

no code implementations 14 Dec 2022 Rui Gong, Qin Wang, Dengxin Dai, Luc van Gool

Thus, we aim to relieve this need for a large amount of real data, and explore the one-shot unsupervised sim-to-real domain adaptation (OSUDA) and generalization (OSDG) problem, where only one real-world data sample is available.

Autonomous Driving Domain Adaptation +1

Source-free Depth for Object Pop-out

no code implementations 10 Dec 2022 Zongwei Wu, Danda Pani Paudel, Deng-Ping Fan, Jingjing Wang, Shuo Wang, Cédric Demonceaux, Radu Timofte, Luc van Gool

In this work, we adapt such depth inference models for object segmentation using the objects' "pop-out" prior in 3D.

object-detection Object Detection +2

MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation

1 code implementation 2 Dec 2022 Lukas Hoyer, Dengxin Dai, Haoran Wang, Luc van Gool

MIC significantly improves the state-of-the-art performance across the different recognition tasks for synthetic-to-real, day-to-nighttime, and clear-to-adverse-weather UDA.

Image Classification object-detection +4

Neural Radiance Fields for Manhattan Scenes with Unknown Manhattan Frame

no code implementations 2 Dec 2022 Nikola Popovic, Danda Pani Paudel, Luc van Gool

Such representations are known to benefit from additional geometric and semantic supervision.

Novel View Synthesis

Advancing Learned Video Compression with In-loop Frame Prediction

1 code implementation 13 Nov 2022 Ren Yang, Radu Timofte, Luc van Gool

In this paper, we propose an Advanced Learned Video Compression (ALVC) approach with the in-loop frame prediction module, which is able to effectively predict the target frame from the previously compressed frames, without consuming any bit-rate.

MS-SSIM SSIM +1

MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning

no code implementations 8 Nov 2022 Andrey Ignatov, Anastasia Sycheva, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc van Gool

While neural networks-based photo processing solutions can provide a better image quality compared to the traditional ISP systems, their application to mobile devices is still very limited due to their very high computational complexity.

PyNet-V2 Mobile: Efficient On-Device Photo Processing With Neural Networks

1 code implementation 8 Nov 2022 Andrey Ignatov, Grigory Malivenko, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc van Gool

The increased importance of mobile photography created a need for fast and performant RAW image processing pipelines capable of producing good visual results in spite of the mobile camera sensor limitations.

Towards Versatile Embodied Navigation

1 code implementation 30 Oct 2022 Hanqing Wang, Wei Liang, Luc van Gool, Wenguan Wang

With the emergence of varied visual navigation tasks (e.g., image-/object-/audio-goal and vision-language navigation) that specify the target in different ways, the community has made appealing advances in training specialized agents capable of handling individual navigation tasks well.

Decision Making Vision-Language Navigation +1

TripletTrack: 3D Object Tracking using Triplet Embeddings and LSTM

no code implementations 28 Oct 2022 Nicola Marinello, Marc Proesmans, Luc van Gool

We start from an off-the-shelf 3D object detector, and apply a tracking mechanism where objects are matched by an affinity score computed on local object feature embeddings and motion descriptors.

3D Object Tracking Autonomous Driving +1

Masked Vision-Language Transformer in Fashion

1 code implementation 27 Oct 2022 Ge-Peng Ji, Mingcheng Zhuge, Dehong Gao, Deng-Ping Fan, Christos Sakaridis, Luc van Gool

We present a masked vision-language transformer (MVLT) for fashion-specific multi-modal representation.

Image Reconstruction Retrieval

Learning Attention Propagation for Compositional Zero-Shot Learning

no code implementations 20 Oct 2022 Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc van Gool, Alain Pagani, Didier Stricker, Muhammad Zeshan Afzal

CAPE learns to identify this structure and propagates knowledge between them to learn class embedding for all seen and unseen compositions.

Compositional Zero-Shot Learning

Multi-View Photometric Stereo Revisited

no code implementations 14 Oct 2022 Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc van Gool

The proposed approach in this paper exploits the benefit of uncertainty modeling in a deep neural network for a reliable fusion of photometric stereo (PS) and multi-view stereo (MVS) network predictions.

3D Shape Representation

Composite Learning for Robust and Effective Dense Predictions

no code implementations 13 Oct 2022 Menelaos Kanakis, Thomas E. Huang, David Bruggemann, Fisher Yu, Luc van Gool

In this paper, we find that jointly training a dense prediction (target) task with a self-supervised (auxiliary) task can consistently improve the performance of the target task, while eliminating the need for labeling auxiliary tasks.

Boundary Detection Monocular Depth Estimation +2

SiNeRF: Sinusoidal Neural Radiance Fields for Joint Pose Estimation and Scene Reconstruction

1 code implementation 10 Oct 2022 Yitong Xia, Hao Tang, Radu Timofte, Luc van Gool

NeRFmm is the Neural Radiance Fields (NeRF) that deal with Joint Optimization tasks, i.e., reconstructing real-world scenes and registering camera parameters simultaneously.

Image Generation Pose Estimation

Robustifying the Multi-Scale Representation of Neural Radiance Fields

no code implementations 9 Oct 2022 Nishant Jain, Suryansh Kumar, Luc van Gool

Although recently proposed Mip-NeRF could handle multi-scale imaging problems with NeRF, it cannot handle camera pose estimation error.

Pose Estimation

Basic Binary Convolution Unit for Binarized Image Restoration Network

1 code implementation 2 Oct 2022 Bin Xia, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool

In this study, we reconsider components in binary convolution, such as residual connection, BatchNorm, activation function, and structure, for IR tasks.

Binarization Image Restoration +1

TT-NF: Tensor Train Neural Fields

1 code implementation 30 Sep 2022 Anton Obukhov, Mikhail Usvyatsov, Christos Sakaridis, Konrad Schindler, Luc van Gool

Learning neural fields has been an active topic in deep learning research, focusing, among other issues, on finding more compact and easy-to-fit representations.

Denoising Low-rank compression

Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain Supervision for Domain-adaptive Action Detection

1 code implementation 28 Sep 2022 Yifan Lu, Gurkirt Singh, Suman Saha, Luc van Gool

We propose a novel domain adaptive action detection approach and a new adaptation protocol that leverages the recent advancements in image-level unsupervised domain adaptation (UDA) techniques and handles the vagaries of instance-level video data.

Action Detection Pseudo Label +2

I2DFormer: Learning Image to Document Attention for Zero-Shot Image Classification

no code implementations 21 Sep 2022 Muhammad Ferjad Naeem, Yongqin Xian, Luc van Gool, Federico Tombari

In order to distill discriminative visual words from noisy documents, we introduce a new cross-modal attention module that learns fine-grained interactions between image patches and document words.

Generalized Zero-Shot Learning Image Classification +2

Spatio-Temporal Action Detection Under Large Motion

1 code implementation 6 Sep 2022 Gurkirt Singh, Vasileios Choutas, Suman Saha, Fisher Yu, Luc van Gool

Current methods for spatiotemporal action tube detection often extend a bounding box proposal at a given keyframe into a 3D temporal cuboid and pool features from nearby frames.

Action Detection

ManiFlow: Implicitly Representing Manifolds with Normalizing Flows

no code implementations 18 Aug 2022 Janis Postels, Martin Danelljan, Luc van Gool, Federico Tombari

In contrast to prior work, we approach this problem by generating samples from the original data distribution given full knowledge about the perturbed distribution and the noise model.

Surface Reconstruction

AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility

1 code implementation 14 Aug 2022 Mubashir Noman, Wafa Al Ghallabi, Daniya Najiha, Christoph Mayer, Akshay Dudhane, Martin Danelljan, Hisham Cholakkal, Salman Khan, Luc van Gool, Fahad Shahbaz Khan

While greatly benefiting tracking research, existing benchmarks do not pose the same difficulty as before, with recent trackers achieving higher performance mainly due to (i) the introduction of more sophisticated transformer-based methods and (ii) the lack of diverse scenarios with adverse visibility, such as severe weather conditions, camouflage, and imaging effects.

Visual Object Tracking Visual Tracking

Reference-based Image Super-Resolution with Deformable Attention Transformer

1 code implementation 25 Jul 2022 JieZhang Cao, Jingyun Liang, Kai Zhang, Yawei Li, Yulun Zhang, Wenguan Wang, Luc van Gool

Reference-based image super-resolution (RefSR) aims to exploit auxiliary reference (Ref) images to super-resolve low-resolution (LR) images.

Image Super-Resolution

Towards Interpretable Video Super-Resolution via Alternating Optimization

1 code implementation 21 Jul 2022 JieZhang Cao, Jingyun Liang, Kai Zhang, Wenguan Wang, Qin Wang, Yulun Zhang, Hao Tang, Luc van Gool

These issues can be alleviated by a cascade of three separate sub-tasks, including video deblurring, frame interpolation, and super-resolution, which, however, would fail to capture the spatial and temporal correlations among video sequences.

Deblurring Space-time Video Super-resolution +2

Refign: Align and Refine for Adaptation of Semantic Segmentation to Adverse Conditions

1 code implementation 14 Jul 2022 David Bruggemann, Christos Sakaridis, Prune Truong, Luc van Gool

Due to the scarcity of dense pixel-level semantic annotations for images recorded in adverse visual conditions, there has been a keen interest in unsupervised domain adaptation (UDA) for the semantic segmentation of such images.

Semantic Segmentation Unsupervised Domain Adaptation

Organic Priors in Non-Rigid Structure from Motion

no code implementations 13 Jul 2022 Suryansh Kumar, Luc van Gool

Besides that, the paper provides insights into the NRSfM factorization -- both in terms of shape and motion -- and is the first approach to show the benefit of single rotation averaging for NRSfM.

L2E: Lasers to Events for 6-DoF Extrinsic Calibration of Lidars and Event Cameras

1 code implementation 3 Jul 2022 Kevin Ta, David Bruggemann, Tim Brödermann, Christos Sakaridis, Luc van Gool

As neuromorphic technology is maturing, its application to robotics and autonomous vehicle systems has become an area of active research.

Autonomous Driving Camera Calibration

HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection

1 code implementation 30 Jun 2022 Tim Broedermann, Christos Sakaridis, Dengxin Dai, Luc van Gool

Besides standard cameras, autonomous vehicles typically include multiple additional sensors, such as lidars and radars, which help acquire richer information for perceiving the content of the driving scene.

2D object detection Autonomous Vehicles +3

3D-Aware Video Generation

1 code implementation 29 Jun 2022 Sherwin Bahmani, Jeong Joon Park, Despoina Paschalidou, Hao Tang, Gordon Wetzstein, Leonidas Guibas, Luc van Gool, Radu Timofte

Generative models have emerged as an essential building block for many image synthesis and editing tasks.

Image Generation Video Generation

Structured Sparsity Learning for Efficient Video Super-Resolution

no code implementations 15 Jun 2022 Bin Xia, Jingwen He, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Luc van Gool

In SSL, we design pruning schemes for several key components in VSR models, including residual blocks, recurrent networks, and upsampling networks.

Video Super-Resolution

Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation

no code implementations 13 Jun 2022 Wouter Van Gansbeke, Simon Vandenhende, Luc van Gool

This paper presents MaskDistill: a novel framework for unsupervised semantic segmentation based on three key ideas.

Ranked #2 on Unsupervised Semantic Segmentation on PASCAL VOC 2012 val (using extra training data)

Unsupervised Semantic Segmentation

Recurrent Video Restoration Transformer with Guided Deformable Attention

1 code implementation 5 Jun 2022 Jingyun Liang, Yuchen Fan, Xiaoyu Xiang, Rakesh Ranjan, Eddy Ilg, Simon Green, JieZhang Cao, Kai Zhang, Radu Timofte, Luc van Gool

Specifically, RVRT divides the video into multiple clips and uses the previously inferred clip feature to estimate the subsequent clip feature.

Ranked #1 on Deblurring on DVD

Deblurring Denoising +3

Gradient Obfuscation Checklist Test Gives a False Sense of Security

no code implementations 3 Jun 2022 Nikola Popovic, Danda Pani Paudel, Thomas Probst, Luc van Gool

It has since become a trend to use these five characteristics as a sufficient test, to determine whether or not gradient obfuscation is the main source of robustness.

GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector

1 code implementation 30 May 2022 Peng Zheng, Huazhu Fu, Deng-Ping Fan, Qi Fan, Jie Qin, Yu-Wing Tai, Chi-Keung Tang, Luc van Gool

In this paper, we present a novel end-to-end group collaborative learning network, termed GCoNet+, which can effectively and efficiently (250 fps) identify co-salient objects in natural scenes.

Co-Salient Object Detection object-detection +1

Deep Gradient Learning for Efficient Camouflaged Object Detection

1 code implementation 25 May 2022 Ge-Peng Ji, Deng-Ping Fan, Yu-Cheng Chou, Dengxin Dai, Alexander Liniger, Luc van Gool

This paper introduces DGNet, a novel deep framework that exploits object gradient supervision for camouflaged object detection (COD).

Defect Detection object-detection +2

Degradation-Aware Unfolding Half-Shuffle Transformer for Spectral Compressive Imaging

1 code implementation 20 May 2022 Yuanhao Cai, Jing Lin, Haoqian Wang, Xin Yuan, Henghui Ding, Yulun Zhang, Radu Timofte, Luc van Gool

In coded aperture snapshot spectral compressive imaging (CASSI) systems, hyperspectral image (HSI) reconstruction methods are employed to recover the spatial-spectral signal from a compressed measurement.

Compressive Sensing Image Reconstruction +1

Revisiting Random Channel Pruning for Neural Network Compression

1 code implementation CVPR 2022 Yawei Li, Kamil Adamczewski, Wen Li, Shuhang Gu, Radu Timofte, Luc van Gool

The proposed approach provides a new way to compare different methods, namely how well they behave compared with random pruning.

Neural Network Compression

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

2 code implementations 11 May 2022 Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gao, Dengwen Zhou, Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang

The aim was to design a network for single image super-resolution that achieved improvement of efficiency measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption while at least maintaining the PSNR of 29.00 dB on the DIV2K validation set.

Image Super-Resolution

HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation

1 code implementation 27 Apr 2022 Lukas Hoyer, Dengxin Dai, Luc van Gool

Therefore, we propose HRDA, a multi-resolution training approach for UDA, that combines the strengths of small high-resolution crops to preserve fine segmentation details and large low-resolution crops to capture long-range context dependencies with a learned scale attention, while maintaining a manageable GPU memory footprint.

Semantic Segmentation Synthetic-to-Real Translation +1

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction

1 code implementation 17 Apr 2022 Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Zhang, Hanspeter Pfister, Radu Timofte, Luc van Gool

Existing leading methods for spectral reconstruction (SR) focus on designing deeper or wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB image to its hyperspectral image (HSI).

Spectral Reconstruction Spectral Super-Resolution

Neural Vector Fields for Implicit Surface Representation and Inference

1 code implementation 13 Apr 2022 Edoardo Mello Rella, Ajad Chhatkuli, Ender Konukoglu, Luc van Gool

Implicit fields have been very effective to represent and learn 3D shapes accurately.

Coarse-to-Fine Feature Mining for Video Semantic Segmentation

1 code implementation CVPR 2022 Guolei Sun, Yun Liu, Henghui Ding, Thomas Probst, Luc van Gool

To address this problem, we propose a Coarse-to-Fine Feature Mining (CFFM) technique to learn a unified presentation of static contexts and motional contexts.

Semantic Segmentation Video Semantic Segmentation

Learning Online Multi-Sensor Depth Fusion

1 code implementation 7 Apr 2022 Erik Sandström, Martin R. Oswald, Suryansh Kumar, Silvan Weder, Fisher Yu, Cristian Sminchisescu, Luc van Gool

Multi-sensor depth fusion is able to substantially improve the robustness and accuracy of 3D reconstruction methods, but existing techniques are not robust enough to handle sensors which operate with diverse value ranges as well as noise and outlier statistics.

3D Reconstruction Mixed Reality +1

Deep Interactive Motion Prediction and Planning: Playing Games with Motion Prediction Models

no code implementations 5 Apr 2022 Jose L. Vazquez, Alexander Liniger, Wilko Schwarting, Daniela Rus, Luc van Gool

Fundamental to the success of our method is the design of a novel multi-agent policy network that can steer a vehicle given the state of the surrounding agents and the map information.

motion prediction

Arbitrary-Scale Image Synthesis

1 code implementation CVPR 2022 Evangelos Ntavelis, Mohamad Shahbazi, Iason Kastanis, Radu Timofte, Martin Danelljan, Luc van Gool

Positional encodings have enabled recent works to train a single adversarial network that can generate images of different scales.

Image Generation

FoV-Net: Field-of-View Extrapolation Using Self-Attention and Uncertainty

no code implementations 4 Apr 2022 Liqian Ma, Stamatios Georgoulis, Xu Jia, Luc van Gool

The ability to make educated predictions about their surroundings, and associate them with certain confidence, is important for intelligent systems, like autonomous vehicles and robots.

Autonomous Vehicles Decision Making

Direct Dense Pose Estimation

no code implementations 4 Apr 2022 Liqian Ma, Lingjie Liu, Christian Theobalt, Luc van Gool

In addition, DDP is computationally more efficient than previous dense pose estimation methods, and it reduces jitters when applied to a video sequence, which is a problem plaguing the previous methods.

Action Recognition Pose Estimation +2

Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation

1 code implementation CVPR 2022 Hanqing Wang, Wei Liang, Jianbing Shen, Luc van Gool, Wenguan Wang

Since the rise of vision-language navigation (VLN), great progress has been made in instruction following -- building a follower to navigate environments under the guidance of instructions.

Data Augmentation Instruction Following +2

LiDAR Snowfall Simulation for Robust 3D Object Detection

1 code implementation CVPR 2022 Martin Hahner, Christos Sakaridis, Mario Bijelic, Felix Heide, Fisher Yu, Dengxin Dai, Luc van Gool

Due to the difficulty of collecting and annotating training data in this setting, we propose a physically based method to simulate the effect of snowfall on real clear-weather LiDAR point clouds.

3D Object Detection Autonomous Driving +2

Rethinking Semantic Segmentation: A Prototype View

1 code implementation CVPR 2022 Tianfei Zhou, Wenguan Wang, Ender Konukoglu, Luc van Gool

Prevalent semantic segmentation solutions, despite their different network designs (FCN based or attention based) and mask decoding strategies (parametric softmax based or pixel-query based), can be placed in one category, by considering the softmax weights or query vectors as learnable class prototypes.

Semantic Segmentation

Continual Test-Time Domain Adaptation

1 code implementation CVPR 2022 Qin Wang, Olga Fink, Luc van Gool, Dengxin Dai

However, real-world machine perception systems are running in non-stationary and continually changing environments where the target domain distribution can change over time.

Domain Adaptation

Spatially Multi-conditional Image Generation

no code implementations 25 Mar 2022 Ritika Chakraborty, Nikola Popovic, Danda Pani Paudel, Thomas Probst, Luc van Gool

However, multi-conditional image generation is a very challenging problem due to the heterogeneity and the sparsity of the (in practice) available conditioning labels.

Conditional Image Generation

Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis

1 code implementation 24 Mar 2022 Kai Zhang, Yawei Li, Jingyun Liang, JieZhang Cao, Yulun Zhang, Hao Tang, Radu Timofte, Luc van Gool

While recent years have witnessed a dramatic upsurge of exploiting deep neural networks toward solving image denoising, existing methods mostly rely on simple noise assumptions, such as additive white Gaussian noise (AWGN), JPEG compression noise and camera sensor noise, and a general-purpose blind denoising method for real images remains unsolved.

Image Denoising Image-to-Image Translation

Robust Visual Tracking by Segmentation

1 code implementation 21 Mar 2022 Matthieu Paul, Martin Danelljan, Christoph Mayer, Luc van Gool

We infer a bounding box from the segmentation mask, validate our tracker on challenging tracking datasets and achieve the new state of the art on LaSOT with a success AUC score of 69.7%.

Semantic Segmentation Video Object Segmentation +3

Transforming Model Prediction for Tracking

1 code implementation CVPR 2022 Christoph Mayer, Martin Danelljan, Goutam Bhat, Matthieu Paul, Danda Pani Paudel, Fisher Yu, Luc van Gool

Optimization based tracking methods have been widely successful by integrating a target model prediction module, providing effective global reasoning by minimizing an objective function.

Ranked #1 on Visual Object Tracking on LaSOT (IS metric)

Inductive Bias Visual Object Tracking

Transform your Smartphone into a DSLR Camera: Learning the ISP in the Wild

no code implementations 20 Mar 2022 Ardhendu Shekhar Tripathi, Martin Danelljan, Samarth Shukla, Radu Timofte, Luc van Gool

We propose a trainable Image Signal Processing (ISP) framework that produces DSLR quality images given RAW images captured by a smartphone.

Motion Estimation

Zero Pixel Directional Boundary by Vector Transform

1 code implementation ICLR 2022 Edoardo Mello Rella, Ajad Chhatkuli, Yun Liu, Ender Konukoglu, Luc van Gool

One of the key problems in boundary detection is the label representation, which typically leads to class imbalance and, as a consequence, to thick boundaries that require non-differential post-processing steps to be thinned.

Boundary Detection

Scribble-Supervised LiDAR Semantic Segmentation

1 code implementation CVPR 2022 Ozan Unal, Dengxin Dai, Luc van Gool

Densely annotating LiDAR point clouds remains too expensive and time-consuming to keep up with the ever growing volume of data.

3D Semantic Segmentation LIDAR Semantic Segmentation

Revisiting Deep Semi-supervised Learning: An Empirical Distribution Alignment Framework and Its Generalization Bound

no code implementations 13 Mar 2022 Feiyu Wang, Qin Wang, Wen Li, Dong Xu, Luc van Gool

Benefiting from this new perspective, we first propose a new deep semi-supervised learning framework called Semi-supervised Learning by Empirical Distribution Alignment (SLEDA), in which existing technologies from the domain adaptation community can be readily used to address the semi-supervised learning problem by reducing the empirical distribution distance between labeled and unlabeled data.

Data Augmentation Domain Adaptation

Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction

1 code implementation 9 Mar 2022 Yuanhao Cai, Jing Lin, Xiaowan Hu, Haoqian Wang, Xin Yuan, Yulun Zhang, Radu Timofte, Luc van Gool

Many algorithms have been developed to solve the inverse problem of coded aperture snapshot spectral imaging (CASSI), i.e., recovering the 3D hyperspectral images (HSIs) from a 2D compressive measurement.

Compressive Sensing Image Reconstruction

Barlow constrained optimization for Visual Question Answering

no code implementations 7 Mar 2022 Abhishek Jha, Badri N. Patro, Luc van Gool, Tinne Tuytelaars

In this paper, we propose a novel regularization for VQA models, Constrained Optimization using Barlow's theory (COB), that improves the information content of the joint space by minimizing the redundancy.

Question Answering Visual Question Answering +1

HDNet: High-resolution Dual-domain Learning for Spectral Compressive Imaging

2 code implementations CVPR 2022 Xiaowan Hu, Yuanhao Cai, Jing Lin, Haoqian Wang, Xin Yuan, Yulun Zhang, Radu Timofte, Luc van Gool

On the one hand, the proposed HR spatial-spectral attention module with its efficient feature fusion provides continuous and fine pixel-level features.

Compressive Sensing Image Reconstruction +1

Pix2NeRF: Unsupervised Conditional $π$-GAN for Single Image to Neural Radiance Fields Translation

2 code implementations26 Feb 2022 Shengqu Cai, Anton Obukhov, Dengxin Dai, Luc van Gool

We propose a pipeline to generate Neural Radiance Fields (NeRF) of an object or a scene of a specific class, conditioned on a single input image.

3D-Aware Image Synthesis Novel View Synthesis +2

Uncertainty-Aware Deep Multi-View Photometric Stereo

no code implementations CVPR 2022 Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc van Gool

At each pixel, our approach either selects or discards deep-PS and deep-MVS network prediction depending on the prediction uncertainty measure.

Surface Reconstruction

Adiabatic Quantum Computing for Multi Object Tracking

no code implementations CVPR 2022 Jan-Nico Zaech, Alexander Liniger, Martin Danelljan, Dengxin Dai, Luc van Gool

Multi-Object Tracking (MOT) is most often approached in the tracking-by-detection paradigm, where object detections are associated through time.

Association Multi-Object Tracking

Fast Online Video Super-Resolution with Deformable Attention Pyramid

no code implementations3 Feb 2022 Dario Fuoli, Martin Danelljan, Radu Timofte, Luc van Gool

Our DAP aligns and integrates information from the recurrent state into the current frame prediction.

Video Super-Resolution

VRT: A Video Restoration Transformer

1 code implementation28 Jan 2022 Jingyun Liang, JieZhang Cao, Yuchen Fan, Kai Zhang, Rakesh Ranjan, Yawei Li, Radu Timofte, Luc van Gool

Besides, parallel warping is used to further fuse information from neighboring frames by parallel feature warping.

Deblurring Denoising +7

RePaint: Inpainting using Denoising Diffusion Probabilistic Models

2 code implementations CVPR 2022 Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, Luc van Gool

In this work, we propose RePaint: A Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks.

Denoising Image Inpainting

Collapse by Conditioning: Training Class-conditional GANs with Limited Data

1 code implementation ICLR 2022 Mohamad Shahbazi, Martin Danelljan, Danda Pani Paudel, Luc van Gool

On the contrary, we observe that class-conditioning causes mode collapse in limited data settings, where unconditional learning leads to satisfactory generative ability.

Pix2NeRF: Unsupervised Conditional $π$-GAN for Single Image to Neural Radiance Fields Translation

1 code implementation CVPR 2022 Shengqu Cai, Anton Obukhov, Dengxin Dai, Luc van Gool

We propose a pipeline to generate Neural Radiance Fields (NeRF) of an object or a scene of a specific class, conditioned on a single input image.

3D-Aware Image Synthesis Novel View Synthesis +2

Improving the Behaviour of Vision Transformers with Token-consistent Stochastic Layers

no code implementations30 Dec 2021 Nikola Popovic, Danda Pani Paudel, Thomas Probst, Luc van Gool

We use linear layers with token-consistent stochastic parameters inside the multilayer perceptron blocks, without altering the architecture of the transformer.

Adversarial Robustness Transfer Learning

End-to-End Learning of Multi-category 3D Pose and Shape Estimation

no code implementations19 Dec 2021 Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

We use a Transformer-based architecture to detect the keypoints, as well as to summarize the visual context of the image.

Topology Preserving Local Road Network Estimation from Single Onboard Camera Image

1 code implementation CVPR 2022 Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

We represent the road topology using a set of directed lane curves and their interactions, which are captured using their intersection points.

Efficient Visual Tracking with Exemplar Transformers

2 code implementations17 Dec 2021 Philippe Blatter, Menelaos Kanakis, Martin Danelljan, Luc van Gool

E.T.Track, our visual tracker that incorporates Exemplar Transformer modules, runs at 47 FPS on a CPU.

Visual Object Tracking Visual Tracking

Implicit Neural Representations for Image Compression

no code implementations8 Dec 2021 Yannick Strümpler, Janis Postels, Ren Yang, Luc van Gool, Federico Tombari

Recently Implicit Neural Representations (INRs) gained attention as a novel and effective representation for various data types.

Image Compression Quantization

Event-Based Fusion for Motion Deblurring with Cross-modal Attention

1 code implementation30 Nov 2021 Lei Sun, Christos Sakaridis, Jingyun Liang, Qi Jiang, Kailun Yang, Peng Sun, Yaozu Ye, Kaiwei Wang, Luc van Gool

Traditional frame-based cameras inevitably suffer from motion blur due to long exposure times.

 Ranked #1 on Deblurring on GoPro (using extra training data)

Deblurring Image Deblurring +1

DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation

3 code implementations CVPR 2022 Lukas Hoyer, Dengxin Dai, Luc van Gool

It improves the state of the art by 10.8 mIoU for GTA-to-Cityscapes and 5.4 mIoU for Synthia-to-Cityscapes and enables learning even difficult classes such as train, bus, and truck well.

Semantic Segmentation Synthetic-to-Real Translation +1

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model

1 code implementation CVPR 2022 Zipeng Xu, Tianwei Lin, Hao Tang, Fu Li, Dongliang He, Nicu Sebe, Radu Timofte, Luc van Gool, Errui Ding

We propose a novel framework, i.e., Predict, Prevent, and Evaluate (PPE), for disentangled text-driven image manipulation that requires little manual annotation while being applicable to a wide variety of manipulations.

Image Manipulation Language Modelling

Normalizing Flow as a Flexible Fidelity Objective for Photo-Realistic Super-resolution

no code implementations5 Nov 2021 Andreas Lugmayr, Martin Danelljan, Fisher Yu, Luc van Gool, Radu Timofte

Super-resolution is an ill-posed problem, where a ground-truth high-resolution image represents only one possibility in the space of plausible solutions.

Super-Resolution

Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo

no code implementations11 Oct 2021 Berk Kaya, Suryansh Kumar, Francesco Sarno, Vittorio Ferrari, Luc van Gool

Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.

3D Reconstruction Neural Rendering

Neural Architecture Search for Efficient Uncalibrated Deep Photometric Stereo

no code implementations11 Oct 2021 Francesco Sarno, Suryansh Kumar, Berk Kaya, Zhiwu Huang, Vittorio Ferrari, Luc van Gool

We then perform a continuous relaxation of this search space and present a gradient-based optimization strategy to find an efficient light calibration and normal estimation network.

Neural Architecture Search

Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images

1 code implementation ICCV 2021 Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

In this work, we study the problem of extracting a directed graph representing the local road network in BEV coordinates, from a single onboard camera image.

Autonomous Navigation Scene Understanding

PDC-Net+: Enhanced Probabilistic Dense Correspondence Network

1 code implementation28 Sep 2021 Prune Truong, Martin Danelljan, Radu Timofte, Luc van Gool

In order to apply dense methods to real-world applications, such as pose estimation, image manipulation, or 3D reconstruction, it is therefore crucial to estimate the confidence of the predicted matches.

3D Reconstruction Geometric Matching +6

Context-aware Padding for Semantic Segmentation

no code implementations16 Sep 2021 Yu-Hui Huang, Marc Proesmans, Luc van Gool

Zero padding is widely used in convolutional neural networks to prevent the size of feature maps diminishing too fast.

Semantic Segmentation

TACS: Taxonomy Adaptive Cross-Domain Semantic Segmentation

1 code implementation10 Sep 2021 Rui Gong, Martin Danelljan, Dengxin Dai, Danda Pani Paudel, Ajad Chhatkuli, Fisher Yu, Luc van Gool

In many real-world settings, the target domain task requires a different taxonomy than the one imposed by the source domain.

Contrastive Learning Domain Adaptation +1

Perceptual Learned Video Compression with Recurrent Conditional GAN

3 code implementations7 Sep 2021 Ren Yang, Radu Timofte, Luc van Gool

This paper proposes a Perceptual Learned Video Compression (PLVC) approach with recurrent conditional GAN.

Video Compression

Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with Self-Supervised Depth Estimation

1 code implementation28 Aug 2021 Lukas Hoyer, Dengxin Dai, Qin Wang, Yuhua Chen, Luc van Gool

Training deep networks for semantic segmentation requires large amounts of labeled training data, which presents a major challenge in practice, as labeling segmentation masks is a highly labor-intensive process.

Data Augmentation Domain Adaptation +3

Generalized Real-World Super-Resolution through Adversarial Robustness

1 code implementation25 Aug 2021 Angela Castillo, María Escobar, Juan C. Pérez, Andrés Romero, Radu Timofte, Luc van Gool, Pablo Arbeláez

Instead of learning a dataset-specific degradation, we employ adversarial attacks to create difficult examples that target the model's weaknesses.

Adversarial Robustness Super-Resolution

SwinIR: Image Restoration Using Swin Transformer

9 code implementations23 Aug 2021 Jingyun Liang, JieZhang Cao, Guolei Sun, Kai Zhang, Luc van Gool, Radu Timofte

In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection.

Color Image Denoising Grayscale Image Denoising +6

Deep Reparametrization of Multi-Frame Super-Resolution and Denoising

2 code implementations ICCV 2021 Goutam Bhat, Martin Danelljan, Fisher Yu, Luc van Gool, Radu Timofte

The deep reparametrization allows us to directly model the image formation process in the latent space, and to integrate learned image priors into the prediction.

Burst Image Super-Resolution Denoising +2

End-to-End Urban Driving by Imitating a Reinforcement Learning Coach

2 code implementations ICCV 2021 Zhejun Zhang, Alexander Liniger, Dengxin Dai, Fisher Yu, Luc van Gool

Our end-to-end agent achieves a 78% success rate while generalizing to a new town and new weather on the NoCrash-dense benchmark and state-of-the-art performance on the challenging public routes of the CARLA LeaderBoard.

Autonomous Driving Imitation Learning +2

Decoder Fusion RNN: Context and Interaction Aware Decoders for Trajectory Prediction

no code implementations12 Aug 2021 Edoardo Mello Rella, Jan-Nico Zaech, Alexander Liniger, Luc van Gool

Forecasting the future behavior of all traffic agents in the vicinity is a key task to achieve safe and reliable autonomous driving systems.

Motion Forecasting Trajectory Prediction

Fog Simulation on Real LiDAR Point Clouds for 3D Object Detection in Adverse Weather

1 code implementation ICCV 2021 Martin Hahner, Christos Sakaridis, Dengxin Dai, Luc van Gool

2) Through extensive experiments with several state-of-the-art detection approaches, we show that our fog simulation can be leveraged to significantly improve the performance for 3D object detection in the presence of fog.

3D Object Detection object-detection +1

Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution

1 code implementation ICCV 2021 Jingyun Liang, Guolei Sun, Kai Zhang, Luc van Gool, Radu Timofte

Extensive experiments on synthetic and real images show that the proposed MANet not only performs favorably for both spatially variant and invariant kernel estimation, but also leads to state-of-the-art blind SR performance when combined with non-blind SR methods.

Image Super-Resolution

Boosting Few-shot Semantic Segmentation with Transformers

no code implementations4 Aug 2021 Guolei Sun, Yun Liu, Jingyun Liang, Luc van Gool

Because fully supervised semantic segmentation methods require sufficient fully-labeled data to work well and cannot generalize to unseen classes, few-shot segmentation has attracted a lot of research attention.

Few-Shot Semantic Segmentation Semantic Segmentation

A Survey on Deep Learning Technique for Video Segmentation

1 code implementation2 Jul 2021 Tianfei Zhou, Fatih Porikli, David Crandall, Luc van Gool, Wenguan Wang

Video segmentation -- partitioning video frames into multiple segments or objects -- plays a critical role in a broad range of practical applications, from enhancing visual effects in movie, to understanding scenes in autonomous driving, to creating virtual background in video conferencing.

Autonomous Driving Scene Understanding +3

On the Practicality of Deterministic Epistemic Uncertainty

1 code implementation1 Jul 2021 Janis Postels, Mattia Segu, Tao Sun, Luca Sieber, Luc van Gool, Fisher Yu, Federico Tombari

We find that, while DUMs scale to realistic vision tasks and perform well on OOD detection, the practicality of current methods is undermined by poor calibration under distributional shifts.

Out of Distribution (OOD) Detection Semantic Segmentation

GANmut: Learning Interpretable Conditional Space for Gamut of Emotions

no code implementations CVPR 2021 Stefano d'Apolito, Danda Pani Paudel, Zhiwu Huang, Andres Romero, Luc van Gool

On the other hand, learning from inexpensive and intuitive basic categorical emotion labels leads to limited emotion variability.

Video Super-Resolution Transformer

1 code implementation12 Jun 2021 JieZhang Cao, Yawei Li, Kai Zhang, Jingyun Liang, Luc van Gool

Specifically, to tackle the first issue, we present a spatial-temporal convolutional self-attention layer with a theoretical understanding to exploit the locality information.

Optical Flow Estimation Video Super-Resolution

Generative Flows with Invertible Attentions

no code implementations CVPR 2022 Rhea Sanjay Sukthanker, Zhiwu Huang, Suryansh Kumar, Radu Timofte, Luc van Gool

The key idea is to exploit a masked scheme of these two attentions to learn long-range data dependencies in the context of generative flows.

Image Generation

Go with the Flows: Mixtures of Normalizing Flows for Point Cloud Generation and Reconstruction

no code implementations6 Jun 2021 Janis Postels, Mengya Liu, Riccardo Spezialetti, Luc van Gool, Federico Tombari

Recently normalizing flows (NFs) have demonstrated state-of-the-art performance on modeling 3D point clouds while allowing sampling with arbitrary resolution at inference time.

Data Augmentation Point Cloud Generation

Vision Transformers with Hierarchical Attention

2 code implementations6 Jun 2021 Yun Liu, Yu-Huan Wu, Guolei Sun, Le Zhang, Ajad Chhatkuli, Luc van Gool

This paper tackles the low-efficiency flaw of the vision transformer caused by the high computational/space complexity in Multi-Head Self-Attention (MHSA).

Image Classification Instance Segmentation +4

Fourier Space Losses for Efficient Perceptual Image Super-Resolution

no code implementations ICCV 2021 Dario Fuoli, Luc van Gool, Radu Timofte

As large models are often not practical in real-world applications, we investigate and propose novel loss functions, to enable SR with high perceptual quality from much more efficient models.

Image Super-Resolution

Boosting Crowd Counting with Transformers

no code implementations23 May 2021 Guolei Sun, Yun Liu, Thomas Probst, Danda Pani Paudel, Nikola Popovic, Luc van Gool

This indicates that global scene context is essential, despite the seemingly bottom-up nature of the problem.

Crowd Counting

Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation

1 code implementation CVPR 2021 Suman Saha, Anton Obukhov, Danda Pani Paudel, Menelaos Kanakis, Yuhua Chen, Stamatios Georgoulis, Luc van Gool

Specifically, we show that: (1) our approach improves performance on all tasks when they are complementary and mutually dependent; (2) the CTRL helps to improve both semantic segmentation and depth estimation tasks performance in the challenging UDA setting; (3) the proposed ISL training scheme further improves the semantic segmentation performance.

Monocular Depth Estimation Multi-Task Learning +3