Search Results for author: Luc van Gool

Found 540 papers, 271 papers with code

Modeling the Effects of Windshield Refraction for Camera Calibration

no code implementations ECCV 2020 Frank Verbiest, Marc Proesmans, Luc van Gool

Instead of using a generalized camera approach, we propose a novel approach to jointly optimize a traditional camera model, and a mathematical representation of the windshield’s surface.

Autonomous Driving Camera Calibration

Fixing Localization Errors to Improve Image Classification

1 code implementation ECCV 2020 Guolei Sun, Salman Khan, Wen Li, Hisham Cholakkal, Fahad Shahbaz Khan, Luc van Gool

This way, in an effort to fix localization errors, our loss provides an extra supervisory signal that helps the model to better discriminate between similar classes.

Classification General Classification +3

Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models

no code implementations 8 Apr 2024 Saman Motamed, Wouter Van Gansbeke, Luc van Gool

With recent advances in image and video diffusion models for content creation, a plethora of techniques have been proposed for customizing their generated content.

Video Editing

Self-Explainable Affordance Learning with Embodied Caption

no code implementations 8 Apr 2024 Zhipeng Zhang, Zhimin Wei, Guolei Sun, Peng Wang, Luc van Gool

In the field of visual affordance learning, previous methods mainly used abundant images or videos that delineate human behavior patterns to identify action possibility regions for object manipulation, with a variety of applications in robotic tasks.

Empowering Image Recovery: A Multi-Attention Approach

no code implementations 6 Apr 2024 Juan Wen, Yawei Li, Chao Zhang, Weiyan Hou, Radu Timofte, Luc van Gool

Integration of attention mechanisms across feature and positional dimensions further enhances the recovery of fine details.

Image Restoration

Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation

no code implementations 4 Apr 2024 Elham Amin Mansour, Ozan Unal, Suman Saha, Benjamin Bejar, Luc van Gool

A key challenge in panoptic UDA is reducing the domain gap between a labeled source and an unlabeled target domain while harmonizing the subtasks of semantic and instance segmentation to limit catastrophic interference.

Autonomous Driving Instance Segmentation +3

Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning

1 code implementation 4 Apr 2024 Rui Li, Tobias Fischer, Mattia Segu, Marc Pollefeys, Luc van Gool, Federico Tombari

We propose KYN, a novel method for single-view scene reconstruction that reasons about semantic and spatial context to predict each point's density.

3D Scene Reconstruction Depth Estimation +2

HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud

2 code implementations 4 Apr 2024 Wencan Cheng, Hao Tang, Luc van Gool, Jong Hwan Ko

Extracting keypoint locations from input hand frames, known as 3D hand pose estimation, is a critical task in various human-computer interaction applications.

3D Hand Pose Estimation

I-Design: Personalized LLM Interior Designer

no code implementations 3 Apr 2024 Ata Çelen, Guo Han, Konrad Schindler, Luc van Gool, Iro Armeni, Anton Obukhov, Xi Wang

Interior design allows us to be who we are and live how we want - each design is as unique as our distinct personality.

Language Modelling Large Language Model +2

A Unified and Interpretable Emotion Representation and Expression Generation

no code implementations 1 Apr 2024 Reni Paskaleva, Mykyta Holubakha, Andela Ilic, Saman Motamed, Luc van Gool, Danda Paudel

However, emotions are often compound, e.g., happily surprised, and can be mapped to the action units (AUs) used for expressing emotions, and trivially to the canonical ones.

GlORIE-SLAM: Globally Optimized RGB-only Implicit Encoding Point Cloud SLAM

no code implementations 28 Mar 2024 Ganlin Zhang, Erik Sandström, Youmin Zhang, Manthan Patel, Luc van Gool, Martin R. Oswald

To alleviate this issue, with the aid of a monocular depth estimator, we introduce a novel DSPO layer for bundle adjustment which optimizes the pose and depth of keyframes along with the scale of the monocular depth.

Simultaneous Localization and Mapping

UniDepth: Universal Monocular Metric Depth Estimation

1 code implementation 27 Mar 2024 Luigi Piccinelli, Yung-Hsu Yang, Christos Sakaridis, Mattia Segu, Siyuan Li, Luc van Gool, Fisher Yu

However, the remarkable accuracy of recent MMDE methods is confined to their training domains.

Ranked #2 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)

Monocular Depth Estimation

Towards Online Real-Time Memory-based Video Inpainting Transformers

no code implementations 24 Mar 2024 Guillaume Thiry, Hao Tang, Radu Timofte, Luc van Gool

Video inpainting tasks have seen significant improvements in recent years with the rise of deep neural networks and, in particular, vision transformers.

Video Inpainting

FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks

no code implementations 11 Mar 2024 Muhammad Saif Ullah Khan, Muhammad Ferjad Naeem, Federico Tombari, Luc van Gool, Didier Stricker, Muhammad Zeshan Afzal

We propose FocusCLIP, integrating subject-level guidance--a specialized mechanism for target-specific supervision--into the CLIP framework for improved zero-shot transfer on human-centric tasks.

Activity Recognition Age Classification +1

Rethinking Few-shot 3D Point Cloud Semantic Segmentation

1 code implementation 1 Mar 2024 Zhaochong An, Guolei Sun, Yun Liu, Fayao Liu, Zongwei Wu, Dan Wang, Luc van Gool, Serge Belongie

The former arises from non-uniform point sampling, allowing models to distinguish the density disparities between foreground and background for easier segmentation.

Few-shot 3D Point Cloud Semantic Segmentation Segmentation +1

Loopy-SLAM: Dense Neural SLAM with Loop Closures

no code implementations 14 Feb 2024 Lorenzo Liso, Erik Sandström, Vladimir Yugay, Luc van Gool, Martin R. Oswald

Neural RGBD SLAM techniques have shown promise in dense Simultaneous Localization And Mapping (SLAM), yet face challenges such as error accumulation during camera tracking resulting in distorted maps.

Simultaneous Localization and Mapping

Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector

no code implementations 5 Feb 2024 Yuqian Fu, Yu Wang, Yixuan Pan, Lian Huai, Xingyu Qiu, Zeyu Shangguan, Tong Liu, Yanwei Fu, Luc van Gool, Xingqun Jiang

This paper studies the challenging cross-domain few-shot object detection (CD-FSOD), aiming to develop an accurate object detector for novel domains with minimal labeled examples.

Cross-Domain Few-Shot Few-Shot Object Detection +3

Key-Graph Transformer for Image Restoration

no code implementations 4 Feb 2024 Bin Ren, Yawei Li, Jingyun Liang, Rakesh Ranjan, Mengyuan Liu, Rita Cucchiara, Luc van Gool, Nicu Sebe

While it is crucial to capture global information for effective image restoration (IR), integrating such cues into transformer-based methods becomes computationally expensive, especially with high input resolution.

Graph Attention Image Restoration

Image Fusion via Vision-Language Model

no code implementations 3 Feb 2024 Zixiang Zhao, Lilun Deng, Haowen Bai, Yukun Cui, Zhipeng Zhang, Yulun Zhang, Haotong Qin, Dongdong Chen, Jiangshe Zhang, Peng Wang, Luc van Gool

Therefore, we introduce a novel fusion paradigm named image Fusion via vIsion-Language Model (FILM), for the first time, utilizing explicit textual information in different source images to guide image fusion.

Language Modelling

Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes

no code implementations 27 Jan 2024 Diandian Guo, Deng-Ping Fan, Tongyu Lu, Christos Sakaridis, Luc van Gool

The estimation of implicit cross-frame correspondences and the high computational cost have long been major challenges in video semantic segmentation (VSS) for driving scenes.

Motion Estimation Segmentation +2

MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty

no code implementations 23 Jan 2024 Tim Brödermann, David Bruggemann, Christos Sakaridis, Kevin Ta, Odysseas Liagouris, Jason Corkill, Luc van Gool

Achieving level-5 driving automation in autonomous vehicles necessitates a robust semantic visual perception system capable of parsing data from different sensors across diverse conditions.

Autonomous Vehicles Panoptic Segmentation

Learning to Prompt with Text Only Supervision for Vision-Language Models

1 code implementation 4 Jan 2024 Muhammad Uzair Khattak, Muhammad Ferjad Naeem, Muzammal Naseer, Luc van Gool, Federico Tombari

While effective, most of these works require labeled data which is not practical, and often struggle to generalize towards new datasets due to over-fitting on the source data.

Prompt Engineering

Residual Learning for Image Point Descriptors

no code implementations 24 Dec 2023 Rashik Shrestha, Ajad Chhatkuli, Menelaos Kanakis, Luc van Gool

Such an approach of optimization allows us to discard learning knowledge already present in non-differentiable functions such as the hand-crafted descriptors and only learn the residual knowledge in the main network branch.

Camera Localization Ensemble Learning

Ternary-type Opacity and Hybrid Odometry for RGB-only NeRF-SLAM

no code implementations 20 Dec 2023 Junru Lin, Asen Nachkov, Songyou Peng, Luc van Gool, Danda Pani Paudel

To foster this line of research, we also propose a simple yet novel visual odometry scheme that uses a hybrid combination of volumetric and warping-based image renderings.

Visual Odometry

DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control

no code implementations 5 Dec 2023 Yuru Jia, Lukas Hoyer, Shengyu Huang, Tianfu Wang, Luc van Gool, Konrad Schindler, Anton Obukhov

Large, pretrained latent diffusion models (LDMs) have demonstrated an extraordinary ability to generate creative content, specialize to user data through few-shot fine-tuning, and condition their output on other modalities, such as semantic maps.

Autonomous Driving Domain Generalization +1

Zero-Shot Point Cloud Registration

no code implementations 5 Dec 2023 Weijie Wang, Guofeng Mei, Bin Ren, Xiaoshui Huang, Fabio Poiesi, Luc van Gool, Nicu Sebe, Bruno Lepri

The cornerstone of ZeroReg is the novel transfer of image features from keypoints to the point cloud, enriched by aggregating information from 3D geometric neighborhoods.

Point Cloud Registration

LALM: Long-Term Action Anticipation with Language Models

no code implementations 29 Nov 2023 Sanghwan Kim, Daoji Huang, Yongqin Xian, Otmar Hilliges, Luc van Gool, Xi Wang

Understanding human activity is a crucial yet intricate task in egocentric vision, a field that focuses on capturing visual perspectives from the camera wearer's viewpoint.

Action Anticipation Action Recognition +4

Continuous Pose for Monocular Cameras in Neural Implicit Representation

1 code implementation 28 Nov 2023 Qi Ma, Danda Pani Paudel, Ajad Chhatkuli, Luc van Gool

In this paper, we showcase the effectiveness of optimizing monocular camera poses as a continuous function of time.

Simultaneous Localization and Mapping

2D Feature Distillation for Weakly- and Semi-Supervised 3D Semantic Segmentation

no code implementations 27 Nov 2023 Ozan Unal, Dengxin Dai, Lukas Hoyer, Yigit Baran Can, Luc van Gool

As 3D perception problems grow in popularity and the need for large-scale labeled datasets for LiDAR semantic segmentation increase, new methods arise that aim to reduce the necessity for dense annotations by employing weakly-supervised training.

2D Semantic Segmentation 3D Semantic Segmentation +3

Single-Model and Any-Modality for Video Object Tracking

1 code implementation 27 Nov 2023 Zongwei Wu, Jilai Zheng, Xiangxuan Ren, Florin-Alexandru Vasluianu, Chao Ma, Danda Pani Paudel, Luc van Gool, Radu Timofte

In practice, most existing RGB trackers learn a single set of parameters to use them across datasets and applications.

Object Video Object Tracking

SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

1 code implementation 27 Nov 2023 Lukas Hoyer, David Joseph Tan, Muhammad Ferjad Naeem, Luc van Gool, Federico Tombari

In SemiVL, we propose to integrate rich priors from VLM pre-training into semi-supervised semantic segmentation to learn better semantic decision boundaries.

Segmentation Semi-Supervised Semantic Segmentation

Lego: Learning to Disentangle and Invert Concepts Beyond Object Appearance in Text-to-Image Diffusion Models

no code implementations 23 Nov 2023 Saman Motamed, Danda Pani Paudel, Luc van Gool

To enable customized content creation based on a few example images of a concept, methods such as Textual Inversion and DreamBooth invert the desired concept and enable synthesizing it in new scenes.

Language Modelling Large Language Model +3

3D Compression Using Neural Fields

no code implementations 21 Nov 2023 Janis Postels, Yannick Strümpler, Klara Reichard, Luc van Gool, Federico Tombari

Neural Fields (NFs) have gained momentum as a tool for compressing various data modalities, e.g., images and videos.

Attribute

Deep Equilibrium Diffusion Restoration with Parallel Sampling

1 code implementation 20 Nov 2023 JieZhang Cao, Yue Shi, Kai Zhang, Yulun Zhang, Radu Timofte, Luc van Gool

Due to the inherent property of diffusion models, most existing methods need long serial sampling chains to restore HQ images step-by-step, resulting in expensive sampling time and high computation costs.

Image Restoration

Model-aware 3D Eye Gaze from Weak and Few-shot Supervisions

1 code implementation 20 Nov 2023 Nikola Popovic, Dimitrios Christodoulou, Danda Pani Paudel, Xi Wang, Luc van Gool

In this work, we propose to predict 3D eye gaze from weak supervision of eye semantic segmentation masks and direct supervision of a few 3D gaze vectors.

Semantic Segmentation

MoVideo: Motion-Aware Video Generation with Diffusion Models

no code implementations 19 Nov 2023 Jingyun Liang, Yuchen Fan, Kai Zhang, Radu Timofte, Luc van Gool, Rakesh Ranjan

While recent years have witnessed great progress on using diffusion models for video generation, most of them are simple extensions of image generation frameworks, which fail to explicitly consider one of the key differences between videos and images, i.e., motion.

Image Generation Image to Video Generation +1

Contrastive Learning for Multi-Object Tracking with Transformers

no code implementations 14 Nov 2023 Pierre-François De Plaen, Nicola Marinello, Marc Proesmans, Tinne Tuytelaars, Luc van Gool

The DEtection TRansformer (DETR) opened new possibilities for object detection by modeling it as a translation task: converting image features into object-level representations.

Contrastive Learning Multi-Object Tracking +4

Learning Robust Multi-Scale Representation for Neural Radiance Fields from Unposed Images

no code implementations 8 Nov 2023 Nishant Jain, Suryansh Kumar, Luc van Gool

The key ideas presented in this paper are (i) Recovering accurate camera parameters via a robust pipeline from unposed day-to-day images is equally crucial in neural novel view synthesis problem; (ii) It is rather more practical to model object's content at different resolutions since dramatic camera motion is highly likely in day-to-day unposed images.

Depth Estimation Depth Prediction +2

Towards High-quality HDR Deghosting with Conditional Diffusion Models

no code implementations 2 Nov 2023 Qingsen Yan, Tao Hu, Yuan Sun, Hao Tang, Yu Zhu, Wei Dong, Luc van Gool, Yanning Zhang

To address this challenge, we formulate the HDR deghosting problem as an image generation that leverages LDR features as the diffusion model's condition, consisting of the feature condition generator and the noise predictor.

Denoising Image Generation

SILC: Improving Vision Language Pretraining with Self-Distillation

no code implementations 20 Oct 2023 Muhammad Ferjad Naeem, Yongqin Xian, Xiaohua Zhai, Lukas Hoyer, Luc van Gool, Federico Tombari

However, the contrastive objective used by these models only focuses on image-text alignment and does not incentivise image feature learning for dense prediction tasks.

Classification Contrastive Learning +8

Real-Time Motion Prediction via Heterogeneous Polyline Transformer with Relative Pose Encoding

1 code implementation NeurIPS 2023 Zhejun Zhang, Alexander Liniger, Christos Sakaridis, Fisher Yu, Luc van Gool

The real-world deployment of an autonomous driving system requires its components to run on-board and in real-time, including the motion prediction module that predicts the future trajectories of surrounding traffic participants.

Autonomous Driving motion prediction

Probabilistic Sampling of Balanced K-Means using Adiabatic Quantum Computing

no code implementations 18 Oct 2023 Jan-Nico Zaech, Martin Danelljan, Luc van Gool

Adiabatic quantum computing (AQC) is a promising quantum computing approach for discrete and often NP-hard optimization problems.

Clustering

Discwise Active Learning for LiDAR Semantic Segmentation

no code implementations 23 Sep 2023 Ozan Unal, Dengxin Dai, Ali Tamer Unal, Luc van Gool

Finally we propose a semi-supervised learning approach to utilize all frames within our dataset and improve performance.

Active Learning LIDAR Semantic Segmentation +1

Breathing New Life into 3D Assets with Generative Repainting

2 code implementations 15 Sep 2023 Tianfu Wang, Menelaos Kanakis, Konrad Schindler, Luc van Gool, Anton Obukhov

Diffusion-based text-to-image models ignited immense attention from the vision community, artists, and content creators.

Deformable Neural Radiance Fields using RGB and Event Cameras

no code implementations ICCV 2023 Qi Ma, Danda Pani Paudel, Ajad Chhatkuli, Luc van Gool

In this work, we develop a novel method to model the deformable neural radiance fields using RGB and event cameras.

Temporal-aware Hierarchical Mask Classification for Video Semantic Segmentation

1 code implementation 14 Sep 2023 Zhaochong An, Guolei Sun, Zongwei Wu, Hao Tang, Luc van Gool

Modern approaches have proved the huge potential of addressing semantic segmentation as a mask classification task which is widely used in instance-level segmentation.

Classification Segmentation +2

Three Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding

no code implementations 8 Sep 2023 Ozan Unal, Christos Sakaridis, Suman Saha, Fisher Yu, Luc van Gool

A common formulation to tackle 3D visual grounding is grounding-by-detection, where localization is done via bounding boxes.

3D Instance Segmentation Object +3

Neural Gradient Regularizer

1 code implementation 31 Aug 2023 Shuang Xu, Yifan Wang, Zixiang Zhao, Jiangjun Peng, Xiangyong Cao, Deyu Meng, Yulun Zhang, Radu Timofte, Luc van Gool

NGR is applicable to various image types and different image processing tasks, functioning in a zero-shot learning fashion, making it a versatile and plug-and-play regularizer.

Zero-Shot Learning

Introducing Language Guidance in Prompt-based Continual Learning

1 code implementation ICCV 2023 Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc van Gool, Didier Stricker, Federico Tombari, Muhammad Zeshan Afzal

While the model faces a disjoint set of classes in each task in this setting, we argue that these classes can be encoded to the same embedding space of a pre-trained language encoder.

Continual Learning

DiffI2I: Efficient Diffusion Model for Image-to-Image Translation

no code implementations 26 Aug 2023 Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool

Compared to traditional DMs, the compact IPR enables DiffI2I to obtain more accurate outcomes and employ a lighter denoising network and fewer iterations.

Denoising Image-to-Image Translation +2

DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation

no code implementations ICCV 2023 Hanqing Wang, Wei Liang, Luc van Gool, Wenguan Wang

VLN-CE is a recently released embodied task, where AI agents need to navigate a freely traversable environment to reach a distant target location, given language instructions.

Decision Making Navigate +1

When Super-Resolution Meets Camouflaged Object Detection: A Comparison Study

no code implementations 8 Aug 2023 Juan Wen, Shupeng Cheng, Peng Xu, BoWen Zhou, Radu Timofte, Weiyan Hou, Luc van Gool

Super Resolution (SR) and Camouflaged Object Detection (COD) are two hot topics in computer vision with various joint applications.

Object object-detection +2

How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges

1 code implementation 27 Jul 2023 Haotong Qin, Ge-Peng Ji, Salman Khan, Deng-Ping Fan, Fahad Shahbaz Khan, Luc van Gool

Google's Bard has emerged as a formidable competitor to OpenAI's ChatGPT in the field of conversational AI.

Prior Based Online Lane Graph Extraction from Single Onboard Camera Image

no code implementations 25 Jul 2023 Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

Thus, online estimation of the lane graph is crucial for widespread and reliable autonomous navigation.

Autonomous Navigation

Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis

1 code implementation 22 Jul 2023 Hao Tang, Guolei Sun, Nicu Sebe, Luc van Gool

To tackle 2), we design an effective module to selectively highlight class-dependent feature maps according to the original semantic layout to preserve the semantic information.

Contrastive Learning Image Generation

Improving Online Lane Graph Extraction by Object-Lane Clustering

no code implementations ICCV 2023 Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

In this work, we propose an architecture and loss formulation to improve the accuracy of local lane graph estimates by using 3D object detection outputs.

3D Object Detection Autonomous Driving +4

AutoDecoding Latent 3D Diffusion Models

1 code implementation NeurIPS 2023 Evangelos Ntavelis, Aliaksandr Siarohin, Kyle Olszewski, Chaoyang Wang, Luc van Gool, Sergey Tulyakov

We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.

Prompting Diffusion Representations for Cross-Domain Semantic Segmentation

no code implementations 5 Jul 2023 Rui Gong, Martin Danelljan, Han Sun, Julio Delgado Mangas, Luc van Gool

Intrigued by this result, we set out to explore how well diffusion-pretrained representations generalize to new domains, a crucial ability for any representation.

Domain Generalization Image Generation +2

Palm: Predicting Actions through Language Models @ Ego4D Long-Term Action Anticipation Challenge 2023

1 code implementation 28 Jun 2023 Daoji Huang, Otmar Hilliges, Luc van Gool, Xi Wang

We present Palm, a solution to the Long-Term Action Anticipation (LTA) task utilizing vision-language and large language models.

Action Anticipation Image Captioning +3

UncLe-SLAM: Uncertainty Learning for Dense Neural SLAM

1 code implementation 19 Jun 2023 Erik Sandström, Kevin Ta, Luc van Gool, Martin R. Oswald

We present an uncertainty learning framework for dense neural simultaneous localization and mapping (SLAM).

Simultaneous Localization and Mapping

SF-FSDA: Source-Free Few-Shot Domain Adaptive Object Detection with Efficient Labeled Data Factory

no code implementations 7 Jun 2023 Han Sun, Rui Gong, Konrad Schindler, Luc van Gool

Domain adaptive object detection aims to leverage the knowledge learned from a labeled source domain to improve the performance on an unlabeled target domain.

Object object-detection +2

Condition-Invariant Semantic Segmentation

1 code implementation 27 May 2023 Christos Sakaridis, David Bruggemann, Fisher Yu, Luc van Gool

Motivated by these findings, we propose to leverage stylization in performing feature-level adaptation by aligning the internal network features extracted by the encoder of the network from the original and the stylized view of each input image with a novel feature invariance loss.

Segmentation Semantic Segmentation +1

Equivariant Multi-Modality Image Fusion

2 code implementations 19 May 2023 Zixiang Zhao, Haowen Bai, Jiangshe Zhang, Yulun Zhang, Kai Zhang, Shuang Xu, Dongdong Chen, Radu Timofte, Luc van Gool

These components enable the net training to follow the principles of the natural sensing-imaging process while satisfying the equivariant imaging prior.

Self-Supervised Learning

Denoising Diffusion Models for Plug-and-Play Image Restoration

2 code implementations 15 May 2023 Yuanzhi Zhu, Kai Zhang, Jingyun Liang, JieZhang Cao, Bihan Wen, Radu Timofte, Luc van Gool

Although diffusion models have shown impressive performance for high-quality image synthesis, their potential to serve as a generative denoiser prior to the plug-and-play IR methods remains to be further explored.

Deblurring Denoising +4

StyleGenes: Discrete and Efficient Latent Distributions for GANs

no code implementations 30 Apr 2023 Evangelos Ntavelis, Mohamad Shahbazi, Iason Kastanis, Radu Timofte, Martin Danelljan, Luc van Gool

Thus, by independently sampling a variant for each gene and combining them into the final latent vector, our approach can represent a vast number of unique latent samples from a compact set of learnable parameters.

Disentanglement

Neural Implicit Dense Semantic SLAM

no code implementations 27 Apr 2023 Yasaman Haghighi, Suryansh Kumar, Jean-Philippe Thiran, Luc van Gool

Visual Simultaneous Localization and Mapping (vSLAM) is a widely used technique in robotics and computer vision that enables a robot to create a map of an unfamiliar environment using a camera sensor while simultaneously tracking its position over time.

Scene Understanding Semantic Segmentation +1

EDAPS: Enhanced Domain-Adaptive Panoptic Segmentation

1 code implementation ICCV 2023 Suman Saha, Lukas Hoyer, Anton Obukhov, Dengxin Dai, Luc van Gool

EDAPS significantly improves the state-of-the-art performance for panoptic segmentation UDA by a large margin of 20% on SYNTHIA-to-Cityscapes and even 72% on the more challenging SYNTHIA-to-Mapillary Vistas.

Domain Adaptation Instance Segmentation +2

Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation

3 code implementations 26 Apr 2023 Lukas Hoyer, Dengxin Dai, Luc van Gool

As previous UDA&DG semantic segmentation methods are mostly based on outdated networks, we benchmark more recent architectures, reveal the potential of Transformers, and design the DAFormer network tailored for UDA&DG.

Domain Generalization Image Segmentation +2

Indiscernible Object Counting in Underwater Scenes

1 code implementation CVPR 2023 Guolei Sun, Zhaochong An, Yun Liu, Ce Liu, Christos Sakaridis, Deng-Ping Fan, Luc van Gool

We further advance the frontier of this field by systematically studying a new challenge named indiscernible object counting (IOC), the goal of which is to count objects that are blended with respect to their surroundings.

Benchmarking Object +2

Advances in Deep Concealed Scene Understanding

1 code implementation 21 Apr 2023 Deng-Ping Fan, Ge-Peng Ji, Peng Xu, Ming-Ming Cheng, Christos Sakaridis, Luc van Gool

Concealed scene understanding (CSU) is a hot computer vision topic aiming to perceive objects exhibiting camouflage.

Scene Understanding Semantic Segmentation

Quantum Annealing for Single Image Super-Resolution

no code implementations 18 Apr 2023 Han Yao Choong, Suryansh Kumar, Luc van Gool

As a result, in this work, we take the privilege to perform an early exploration of applying a quantum computing algorithm to this important image enhancement problem, i.e., SISR.

Combinatorial Optimization Image Enhancement +1

SAM Struggles in Concealed Scenes -- Empirical Study on "Segment Anything"

no code implementations 12 Apr 2023 Ge-Peng Ji, Deng-Ping Fan, Peng Xu, Ming-Ming Cheng, BoWen Zhou, Luc van Gool

Segmenting anything is a ground-breaking step toward artificial general intelligence, and the Segment Anything Model (SAM) greatly fosters the foundation models for computer vision.

CamDiff: Camouflage Image Augmentation via Diffusion Model

1 code implementation 11 Apr 2023 Xue-Jing Luo, Shuo Wang, Zongwei Wu, Christos Sakaridis, Yun Cheng, Deng-Ping Fan, Luc van Gool

Specifically, we leverage the latent diffusion model to synthesize salient objects in camouflaged scenes, while using the zero-shot image classification ability of the Contrastive Language-Image Pre-training (CLIP) model to prevent synthesis failures and ensure the synthesized object aligns with the input prompt.

Image Augmentation Image Classification +3

Point-SLAM: Dense Neural Point Cloud-based SLAM

2 code implementations ICCV 2023 Erik Sandström, Yue Li, Luc van Gool, Martin R. Oswald

We propose a dense neural simultaneous localization and mapping (SLAM) approach for monocular RGBD input which anchors the features of a neural scene representation in a point cloud that is iteratively generated in an input-dependent data-driven manner.

Simultaneous Localization and Mapping

Online Lane Graph Extraction from Onboard Video

no code implementations 3 Apr 2023 Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

One of the most common and useful representations of such an understanding is done in the form of BEV lane graphs.

Autonomous Driving Navigate

Single Image Depth Prediction Made Better: A Multivariate Gaussian Take

no code implementations CVPR 2023 Ce Liu, Suryansh Kumar, Shuhang Gu, Radu Timofte, Luc van Gool

Accordingly, we introduce an approach that performs continuous modeling of per-pixel depth, where we can predict and reason about the per-pixel depth and its distribution.

Depth Estimation Depth Prediction

Enhanced Stable View Synthesis

no code implementations CVPR 2023 Nishant Jain, Suryansh Kumar, Luc van Gool

Extensive evaluation of our approach on the popular benchmark dataset, such as Tanks and Temples, shows substantial improvement in view synthesis results compared to the prior art.

3D Reconstruction Novel View Synthesis

NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions

1 code implementation 22 Mar 2023 Mohamad Shahbazi, Evangelos Ntavelis, Alessio Tonioni, Edo Collins, Danda Pani Paudel, Martin Danelljan, Luc van Gool

Pose-conditioned convolutional generative models struggle with high-quality 3D-consistent image generation from single-view datasets, due to their lack of sufficient 3D priors.

Image Generation Inductive Bias

DiffIR: Efficient Diffusion Model for Image Restoration

1 code implementation ICCV 2023 Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Luc van Gool

Diffusion model (DM) has achieved SOTA performance by modeling the image synthesis process into a sequential application of a denoising network.

Denoising Image Generation +1

Graph Transformer GANs for Graph-Constrained House Generation

no code implementations CVPR 2023 Hao Tang, Zhenyu Zhang, Humphrey Shi, Bo Li, Ling Shao, Nicu Sebe, Radu Timofte, Luc van Gool

We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations in an end-to-end fashion for the challenging graph-constrained house generation task.

Generative Adversarial Network House Generation +1

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

2 code implementations ICCV 2023 Zixiang Zhao, Haowen Bai, Yuanzhi Zhu, Jiangshe Zhang, Shuang Xu, Yulun Zhang, Kai Zhang, Deyu Meng, Radu Timofte, Luc van Gool

To leverage strong generative priors and address challenges such as unstable training and lack of interpretability for GAN-based generative methods, we propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM).

Denoising

TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction

2 code implementations 7 Mar 2023 Zhejun Zhang, Alexander Liniger, Dengxin Dai, Fisher Yu, Luc van Gool

We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving, and based on TrafficBots we obtain a world model tailored for the planning module of autonomous vehicles.

Autonomous Driving Model-based Reinforcement Learning +1

A Multiplicative Value Function for Safe and Efficient Reinforcement Learning

1 code implementation 7 Mar 2023 Nick Bührer, Zhejun Zhang, Alexander Liniger, Fisher Yu, Luc van Gool

To this end, we propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.

Navigate reinforcement-learning +3
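
A minimal sketch of how a multiplicative combination of two critics could rank candidate actions. It assumes the safety critic outputs a probability of remaining constraint-free and the reward critic outputs a non-negative return estimate; the names `multiplicative_value` and `pick_action` and the example numbers are hypothetical and not taken from the paper's code.

```python
from typing import Dict, Tuple


def multiplicative_value(safety_prob: float, reward_value: float) -> float:
    """Discount a reward estimate by the estimated probability of staying safe.

    Assumption: safety_prob lies in [0, 1] and reward_value is non-negative,
    so a lower safety probability can only shrink the combined value.
    """
    if not 0.0 <= safety_prob <= 1.0:
        raise ValueError("safety_prob must be in [0, 1]")
    return safety_prob * reward_value


def pick_action(candidates: Dict[str, Tuple[float, float]]) -> str:
    """Return the action with the highest combined (safety * reward) value."""
    return max(candidates, key=lambda a: multiplicative_value(*candidates[a]))


# Hypothetical critic outputs: (safety probability, reward estimate).
candidates = {"fast": (0.2, 10.0), "slow": (0.9, 4.0)}
best = pick_action(candidates)
```

In this toy case the riskier action has the higher raw reward estimate, but the product favours the safer one.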

Efficient and Explicit Modelling of Image Hierarchies for Image Restoration

1 code implementation CVPR 2023 Yawei Li, Yuchen Fan, Xiaoyu Xiang, Denis Demandolx, Rakesh Ranjan, Radu Timofte, Luc van Gool

The aim of this paper is to propose a mechanism to efficiently and explicitly model image hierarchies in the global, regional, and local range for image restoration.

Image Deblurring Image Defocus Deblurring +1

VA-DepthNet: A Variational Approach to Single Image Depth Prediction

2 code implementations 13 Feb 2023 Ce Liu, Suryansh Kumar, Shuhang Gu, Radu Timofte, Luc van Gool

While state-of-the-art deep neural network methods for SIDP learn the scene depth from images in a supervised setting, they often overlook the invaluable invariances and priors in the rigid scene space, such as the regularity of the scene.

Depth Prediction Monocular Depth Estimation

No One Left Behind: Real-World Federated Class-Incremental Learning

2 code implementations 2 Feb 2023 Jiahua Dong, Hongliu Li, Yang Cong, Gan Sun, Yulun Zhang, Luc van Gool

These issues cause the global model to undergo catastrophic forgetting of old categories when local clients receive new categories consecutively under limited memory for storing old categories.

Class Incremental Learning Federated Learning +1

Continuous Pseudo-Label Rectified Domain Adaptive Semantic Segmentation With Implicit Neural Representations

no code implementations CVPR 2023 Rui Gong, Qin Wang, Martin Danelljan, Dengxin Dai, Luc van Gool

Unsupervised domain adaptation (UDA) for semantic segmentation aims at improving the model performance on the unlabeled target domain by leveraging a labeled source domain.

Pseudo Label Semantic Segmentation +1

Self-Supervised Burst Super-Resolution

no code implementations ICCV 2023 Goutam Bhat, Michaël Gharbi, Jiawen Chen, Luc van Gool, Zhihao Xia

Extensive experiments on real and synthetic data show that, despite only using noisy bursts during training, models trained with our self-supervised strategy match, and sometimes surpass, the quality of fully-supervised baselines trained with synthetic data or weakly-paired ground-truth.

Super-Resolution

Beyond SOT: Tracking Multiple Generic Objects at Once

1 code implementation 22 Dec 2022 Christoph Mayer, Martin Danelljan, Ming-Hsuan Yang, Vittorio Ferrari, Luc van Gool, Alina Kuznetsova

Our approach achieves a 4x faster run-time in case of 10 concurrent objects compared to tracking each object independently and outperforms existing single object trackers on our new benchmark.

Attribute Object +1

One-Shot Domain Adaptive and Generalizable Semantic Segmentation with Class-Aware Cross-Domain Transformers

no code implementations 14 Dec 2022 Rui Gong, Qin Wang, Dengxin Dai, Luc van Gool

Thus, we aim to relieve this need for a large amount of real data, and explore the one-shot unsupervised sim-to-real domain adaptation (OSUDA) and generalization (OSDG) problem, where only one real-world data sample is available.

Autonomous Driving Domain Adaptation +1

Surface Normal Clustering for Implicit Representation of Manhattan Scenes

1 code implementation ICCV 2023 Nikola Popovic, Danda Pani Paudel, Luc van Gool

In this work, we aim to leverage the geometric prior of Manhattan scenes to improve the implicit neural radiance field representations.

Clustering Novel View Synthesis

MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation

1 code implementation CVPR 2023 Lukas Hoyer, Dengxin Dai, Haoran Wang, Luc van Gool

MIC significantly improves the state-of-the-art performance across the different recognition tasks for synthetic-to-real, day-to-nighttime, and clear-to-adverse-weather UDA.

Image Classification object-detection +4

CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion

2 code implementations CVPR 2023 Zixiang Zhao, Haowen Bai, Jiangshe Zhang, Yulun Zhang, Shuang Xu, Zudi Lin, Radu Timofte, Luc van Gool

We then introduce a dual-branch Transformer-CNN feature extractor with Lite Transformer (LT) blocks leveraging long-range attention to handle low-frequency global features and Invertible Neural Networks (INN) blocks focusing on extracting high-frequency local information.

object-detection Object Detection +1

Advancing Learned Video Compression with In-loop Frame Prediction

1 code implementation 13 Nov 2022 Ren Yang, Radu Timofte, Luc van Gool

In this paper, we propose an Advanced Learned Video Compression (ALVC) approach with an in-loop frame prediction module, which is able to effectively predict the target frame from the previously compressed frames, without consuming any bit-rate.

MS-SSIM SSIM +1

MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning

no code implementations 8 Nov 2022 Andrey Ignatov, Anastasia Sycheva, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc van Gool

While neural networks-based photo processing solutions can provide a better image quality compared to the traditional ISP systems, their application to mobile devices is still very limited due to their very high computational complexity.

PyNet-V2 Mobile: Efficient On-Device Photo Processing With Neural Networks

1 code implementation 8 Nov 2022 Andrey Ignatov, Grigory Malivenko, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc van Gool

The increased importance of mobile photography created a need for fast and performant RAW image processing pipelines capable of producing good visual results in spite of the mobile camera sensor limitations.

Towards Versatile Embodied Navigation

1 code implementation 30 Oct 2022 Hanqing Wang, Wei Liang, Luc van Gool, Wenguan Wang

With the emergence of varied visual navigation tasks (e.g., image-/object-/audio-goal and vision-language navigation) that specify the target in different ways, the community has made appealing advances in training specialized agents capable of handling individual navigation tasks well.

Decision Making Vision-Language Navigation +1

TripletTrack: 3D Object Tracking using Triplet Embeddings and LSTM

no code implementations 28 Oct 2022 Nicola Marinello, Marc Proesmans, Luc van Gool

We start from an off-the-shelf 3D object detector, and apply a tracking mechanism where objects are matched by an affinity score computed on local object feature embeddings and motion descriptors.

3D Object Tracking Autonomous Driving +2

Masked Vision-Language Transformer in Fashion

1 code implementation 27 Oct 2022 Ge-Peng Ji, Mingcheng Zhuge, Dehong Gao, Deng-Ping Fan, Christos Sakaridis, Luc van Gool

We present a masked vision-language transformer (MVLT) for fashion-specific multi-modal representation.

Image Reconstruction Retrieval

Learning Attention Propagation for Compositional Zero-Shot Learning

no code implementations 20 Oct 2022 Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc van Gool, Alain Pagani, Didier Stricker, Muhammad Zeshan Afzal

CAPE learns to identify this structure and propagates knowledge between them to learn class embedding for all seen and unseen compositions.

Compositional Zero-Shot Learning

Multi-View Photometric Stereo Revisited

no code implementations 14 Oct 2022 Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc van Gool

The proposed approach in this paper exploits the benefit of uncertainty modeling in a deep neural network for a reliable fusion of photometric stereo (PS) and multi-view stereo (MVS) network predictions.

3D Shape Representation

Composite Learning for Robust and Effective Dense Predictions

no code implementations 13 Oct 2022 Menelaos Kanakis, Thomas E. Huang, David Bruggemann, Fisher Yu, Luc van Gool

In this paper, we find that jointly training a dense prediction (target) task with a self-supervised (auxiliary) task can consistently improve the performance of the target task, while eliminating the need for labeling auxiliary tasks.

Boundary Detection Monocular Depth Estimation +3

SiNeRF: Sinusoidal Neural Radiance Fields for Joint Pose Estimation and Scene Reconstruction

1 code implementation 10 Oct 2022 Yitong Xia, Hao Tang, Radu Timofte, Luc van Gool

NeRFmm is a Neural Radiance Fields (NeRF) method that deals with joint optimization tasks, i.e., reconstructing real-world scenes and registering camera parameters simultaneously.

Image Generation Pose Estimation

Robustifying the Multi-Scale Representation of Neural Radiance Fields

no code implementations 9 Oct 2022 Nishant Jain, Suryansh Kumar, Luc van Gool

Although the recently proposed Mip-NeRF can handle multi-scale imaging problems with NeRF, it cannot handle camera pose estimation errors.

Pose Estimation

Basic Binary Convolution Unit for Binarized Image Restoration Network

2 code implementations 2 Oct 2022 Bin Xia, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool

In this study, we reconsider components in binary convolution, such as residual connection, BatchNorm, activation function, and structure, for IR tasks.

Binarization Image Restoration +1

Physical Adversarial Attack meets Computer Vision: A Decade Survey

1 code implementation 30 Sep 2022 Hui Wei, Hao Tang, Xuemei Jia, Zhixiang Wang, Hanxun Yu, Zhubo Li, Shin'ichi Satoh, Luc van Gool, Zheng Wang

Building upon this foundation, we uncover the pervasive role of artifacts carrying adversarial perturbations in the physical world.

Adversarial Attack Medical Diagnosis

TT-NF: Tensor Train Neural Fields

1 code implementation 30 Sep 2022 Anton Obukhov, Mikhail Usvyatsov, Christos Sakaridis, Konrad Schindler, Luc van Gool

Learning neural fields has been an active topic in deep learning research, focusing, among other issues, on finding more compact and easy-to-fit representations.

Denoising Low-rank compression

Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain Supervision for Domain-adaptive Action Detection

1 code implementation 28 Sep 2022 Yifan Lu, Gurkirt Singh, Suman Saha, Luc van Gool

We propose a novel domain adaptive action detection approach and a new adaptation protocol that leverages the recent advancements in image-level unsupervised domain adaptation (UDA) techniques and handles vagaries of instance-level video data.

Action Detection Pseudo Label +2

I2DFormer: Learning Image to Document Attention for Zero-Shot Image Classification

no code implementations 21 Sep 2022 Muhammad Ferjad Naeem, Yongqin Xian, Luc van Gool, Federico Tombari

In order to distill discriminative visual words from noisy documents, we introduce a new cross-modal attention module that learns fine-grained interactions between image patches and document words.

Generalized Zero-Shot Learning Image Classification +2

Spatio-Temporal Action Detection Under Large Motion

no code implementations 6 Sep 2022 Gurkirt Singh, Vasileios Choutas, Suman Saha, Fisher Yu, Luc van Gool

Current methods for spatiotemporal action tube detection often extend a bounding box proposal at a given keyframe into a 3D temporal cuboid and pool features from nearby frames.

Action Detection

ManiFlow: Implicitly Representing Manifolds with Normalizing Flows

no code implementations 18 Aug 2022 Janis Postels, Martin Danelljan, Luc van Gool, Federico Tombari

In contrast to prior work, we approach this problem by generating samples from the original data distribution given full knowledge about the perturbed distribution and the noise model.

Surface Reconstruction

AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility

1 code implementation 14 Aug 2022 Mubashir Noman, Wafa Al Ghallabi, Daniya Najiha, Christoph Mayer, Akshay Dudhane, Martin Danelljan, Hisham Cholakkal, Salman Khan, Luc van Gool, Fahad Shahbaz Khan

While greatly benefiting tracking research, existing benchmarks no longer pose the same difficulty as before, with recent trackers achieving higher performance mainly due to (i) the introduction of more sophisticated transformer-based methods and (ii) the lack of diverse scenarios with adverse visibility, such as severe weather conditions, camouflage, and imaging effects.

Visual Object Tracking Visual Tracking

Towards Interpretable Video Super-Resolution via Alternating Optimization

1 code implementation 21 Jul 2022 JieZhang Cao, Jingyun Liang, Kai Zhang, Wenguan Wang, Qin Wang, Yulun Zhang, Hao Tang, Luc van Gool

These issues can be alleviated by a cascade of three separate sub-tasks, including video deblurring, frame interpolation, and super-resolution, which, however, would fail to capture the spatial and temporal correlations among video sequences.

Deblurring Space-time Video Super-resolution +2

Refign: Align and Refine for Adaptation of Semantic Segmentation to Adverse Conditions

1 code implementation 14 Jul 2022 David Bruggemann, Christos Sakaridis, Prune Truong, Luc van Gool

Due to the scarcity of dense pixel-level semantic annotations for images recorded in adverse visual conditions, there has been a keen interest in unsupervised domain adaptation (UDA) for the semantic segmentation of such images.

Semantic Segmentation Unsupervised Domain Adaptation

Organic Priors in Non-Rigid Structure from Motion

no code implementations 13 Jul 2022 Suryansh Kumar, Luc van Gool

Besides that, the paper provides insights into the NRSfM factorization -- both in terms of shape and motion -- and is the first approach to show the benefit of single rotation averaging for NRSfM.

L2E: Lasers to Events for 6-DoF Extrinsic Calibration of Lidars and Event Cameras

1 code implementation 3 Jul 2022 Kevin Ta, David Bruggemann, Tim Brödermann, Christos Sakaridis, Luc van Gool

As neuromorphic technology is maturing, its application to robotics and autonomous vehicle systems has become an area of active research.

Autonomous Driving Camera Calibration

HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection

1 code implementation 30 Jun 2022 Tim Broedermann, Christos Sakaridis, Dengxin Dai, Luc van Gool

Besides standard cameras, autonomous vehicles typically include multiple additional sensors, such as lidars and radars, which help acquire richer information for perceiving the content of the driving scene.

Autonomous Vehicles object-detection +3

3D-Aware Video Generation

1 code implementation 29 Jun 2022 Sherwin Bahmani, Jeong Joon Park, Despoina Paschalidou, Hao Tang, Gordon Wetzstein, Leonidas Guibas, Luc van Gool, Radu Timofte

Generative models have emerged as an essential building block for many image synthesis and editing tasks.

Image Generation Video Generation

Structured Sparsity Learning for Efficient Video Super-Resolution

1 code implementation CVPR 2023 Bin Xia, Jingwen He, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Luc van Gool

In SSL, we design pruning schemes for several key components in VSR models, including residual blocks, recurrent networks, and upsampling networks.

Video Super-Resolution

Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation

1 code implementation 13 Jun 2022 Wouter Van Gansbeke, Simon Vandenhende, Luc van Gool

This paper presents MaskDistill: a novel framework for unsupervised semantic segmentation based on three key ideas.

Ranked #4 on Unsupervised Semantic Segmentation on PASCAL VOC 2012 val (using extra training data)

Object Segmentation +1

Gradient Obfuscation Checklist Test Gives a False Sense of Security

no code implementations 3 Jun 2022 Nikola Popovic, Danda Pani Paudel, Thomas Probst, Luc van Gool

It has since become a trend to use these five characteristics as a sufficient test, to determine whether or not gradient obfuscation is the main source of robustness.

GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector

2 code implementations 30 May 2022 Peng Zheng, Huazhu Fu, Deng-Ping Fan, Qi Fan, Jie Qin, Yu-Wing Tai, Chi-Keung Tang, Luc van Gool

In this paper, we present a novel end-to-end group collaborative learning network, termed GCoNet+, which can effectively and efficiently (250 fps) identify co-salient objects in natural scenes.

Co-Salient Object Detection Object +2

Deep Gradient Learning for Efficient Camouflaged Object Detection

1 code implementation 25 May 2022 Ge-Peng Ji, Deng-Ping Fan, Yu-Cheng Chou, Dengxin Dai, Alexander Liniger, Luc van Gool

This paper introduces DGNet, a novel deep framework that exploits object gradient supervision for camouflaged object detection (COD).

Defect Detection Object +4

Degradation-Aware Unfolding Half-Shuffle Transformer for Spectral Compressive Imaging

1 code implementation 20 May 2022 Yuanhao Cai, Jing Lin, Haoqian Wang, Xin Yuan, Henghui Ding, Yulun Zhang, Radu Timofte, Luc van Gool

In coded aperture snapshot spectral compressive imaging (CASSI) systems, hyperspectral image (HSI) reconstruction methods are employed to recover the spatial-spectral signal from a compressed measurement.

Compressive Sensing Image Reconstruction +1

Revisiting Random Channel Pruning for Neural Network Compression

1 code implementation CVPR 2022 Yawei Li, Kamil Adamczewski, Wen Li, Shuhang Gu, Radu Timofte, Luc van Gool

The proposed approach provides a new way to compare different methods, namely how well they behave compared with random pruning.

Neural Network Compression

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

2 code implementations 11 May 2022 Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gao, Dengwen Zhou, Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang

The aim was to design a network for single image super-resolution that improved efficiency, measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption, while at least maintaining a PSNR of 29.00 dB on the DIV2K validation set.

Image Super-Resolution

HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation

1 code implementation 27 Apr 2022 Lukas Hoyer, Dengxin Dai, Luc van Gool

Therefore, we propose HRDA, a multi-resolution training approach for UDA, that combines the strengths of small high-resolution crops to preserve fine segmentation details and large low-resolution crops to capture long-range context dependencies with a learned scale attention, while maintaining a manageable GPU memory footprint.

Segmentation Semantic Segmentation +3

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction

3 code implementations 17 Apr 2022 Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Zhang, Hanspeter Pfister, Radu Timofte, Luc van Gool

Existing leading methods for spectral reconstruction (SR) focus on designing deeper or wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB image to its hyperspectral image (HSI).

Spectral Reconstruction Spectral Super-Resolution

Neural Vector Fields for Implicit Surface Representation and Inference

1 code implementation 13 Apr 2022 Edoardo Mello Rella, Ajad Chhatkuli, Ender Konukoglu, Luc van Gool

With neural networks, several other variations and training principles have been proposed with the goal of representing all classes of shapes.

Learning Online Multi-Sensor Depth Fusion

1 code implementation 7 Apr 2022 Erik Sandström, Martin R. Oswald, Suryansh Kumar, Silvan Weder, Fisher Yu, Cristian Sminchisescu, Luc van Gool

Multi-sensor depth fusion is able to substantially improve the robustness and accuracy of 3D reconstruction methods, but existing techniques are not robust enough to handle sensors which operate with diverse value ranges as well as noise and outlier statistics.

3D Reconstruction Mixed Reality +1

Arbitrary-Scale Image Synthesis

1 code implementation CVPR 2022 Evangelos Ntavelis, Mohamad Shahbazi, Iason Kastanis, Radu Timofte, Martin Danelljan, Luc van Gool

Positional encodings have enabled recent works to train a single adversarial network that can generate images of different scales.

Image Generation

Deep Interactive Motion Prediction and Planning: Playing Games with Motion Prediction Models

no code implementations 5 Apr 2022 Jose L. Vazquez, Alexander Liniger, Wilko Schwarting, Daniela Rus, Luc van Gool

Fundamental to the success of our method is the design of a novel multi-agent policy network that can steer a vehicle given the state of the surrounding agents and the map information.

motion prediction

FoV-Net: Field-of-View Extrapolation Using Self-Attention and Uncertainty

no code implementations 4 Apr 2022 Liqian Ma, Stamatios Georgoulis, Xu Jia, Luc van Gool

The ability to make educated predictions about their surroundings, and associate them with certain confidence, is important for intelligent systems, like autonomous vehicles and robots.

Autonomous Vehicles Decision Making

Direct Dense Pose Estimation

no code implementations 4 Apr 2022 Liqian Ma, Lingjie Liu, Christian Theobalt, Luc van Gool

In addition, DDP is computationally more efficient than previous dense pose estimation methods, and it reduces jitter when applied to a video sequence, a problem that plagues previous methods.

Action Recognition Pose Estimation +2

Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation

1 code implementation CVPR 2022 Hanqing Wang, Wei Liang, Jianbing Shen, Luc van Gool, Wenguan Wang

Since the rise of vision-language navigation (VLN), great progress has been made in instruction following -- building a follower to navigate environments under the guidance of instructions.

counterfactual Data Augmentation +3

LiDAR Snowfall Simulation for Robust 3D Object Detection

1 code implementation CVPR 2022 Martin Hahner, Christos Sakaridis, Mario Bijelic, Felix Heide, Fisher Yu, Dengxin Dai, Luc van Gool

Due to the difficulty of collecting and annotating training data in this setting, we propose a physically based method to simulate the effect of snowfall on real clear-weather LiDAR point clouds.

Autonomous Driving Object +3

Rethinking Semantic Segmentation: A Prototype View

1 code implementation CVPR 2022 Tianfei Zhou, Wenguan Wang, Ender Konukoglu, Luc van Gool

Prevalent semantic segmentation solutions, despite their different network designs (FCN based or attention based) and mask decoding strategies (parametric softmax based or pixel-query based), can be placed in one category, by considering the softmax weights or query vectors as learnable class prototypes.

Segmentation Semantic Segmentation

Spatially Multi-conditional Image Generation

no code implementations 25 Mar 2022 Ritika Chakraborty, Nikola Popovic, Danda Pani Paudel, Thomas Probst, Luc van Gool

However, multi-conditional image generation is a very challenging problem due to the heterogeneity and the sparsity of the (in practice) available conditioning labels.

Conditional Image Generation Missing Labels

Continual Test-Time Domain Adaptation

2 code implementations CVPR 2022 Qin Wang, Olga Fink, Luc van Gool, Dengxin Dai

However, real-world machine perception systems are running in non-stationary and continually changing environments where the target domain distribution can change over time.

Test-time Adaptation

Practical Blind Image Denoising via Swin-Conv-UNet and Data Synthesis

2 code implementations 24 Mar 2022 Kai Zhang, Yawei Li, Jingyun Liang, JieZhang Cao, Yulun Zhang, Hao Tang, Deng-Ping Fan, Radu Timofte, Luc van Gool

While recent years have witnessed a dramatic upsurge of exploiting deep neural networks toward solving image denoising, existing methods mostly rely on simple noise assumptions, such as additive white Gaussian noise (AWGN), JPEG compression noise and camera sensor noise, and a general-purpose blind denoising method for real images remains unsolved.

Image Denoising Image-to-Image Translation

Transforming Model Prediction for Tracking

1 code implementation CVPR 2022 Christoph Mayer, Martin Danelljan, Goutam Bhat, Matthieu Paul, Danda Pani Paudel, Fisher Yu, Luc van Gool

Optimization based tracking methods have been widely successful by integrating a target model prediction module, providing effective global reasoning by minimizing an objective function.

Ranked #20 on Visual Object Tracking on LaSOT (Precision metric)

Inductive Bias Visual Object Tracking

Robust Visual Tracking by Segmentation

2 code implementations 21 Mar 2022 Matthieu Paul, Martin Danelljan, Christoph Mayer, Luc van Gool

We infer a bounding box from the segmentation mask, validate our tracker on challenging tracking datasets and achieve the new state of the art on LaSOT with a success AUC score of 69.7%.

Segmentation Semantic Segmentation +4
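
One simple way to infer a bounding box from a segmentation mask is to take the tightest axis-aligned box around the foreground pixels. The sketch below illustrates that generic idea only; the helper `mask_to_bbox` and the `(x_min, y_min, x_max, y_max)` convention are assumptions for illustration, not the tracker's actual procedure.

```python
from typing import List, Optional, Tuple


def mask_to_bbox(mask: List[List[int]]) -> Optional[Tuple[int, int, int, int]]:
    """Return the tightest box (x_min, y_min, x_max, y_max) around nonzero pixels.

    The mask is a list of rows; returns None when the mask has no foreground.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(cols), min(rows), max(cols), max(rows))


# Toy 3x4 binary mask with a small foreground blob.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
bbox = mask_to_bbox(mask)
```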

Transform your Smartphone into a DSLR Camera: Learning the ISP in the Wild

no code implementations 20 Mar 2022 Ardhendu Shekhar Tripathi, Martin Danelljan, Samarth Shukla, Radu Timofte, Luc van Gool

We propose a trainable Image Signal Processing (ISP) framework that produces DSLR quality images given RAW images captured by a smartphone.

Motion Estimation

Scribble-Supervised LiDAR Semantic Segmentation

3 code implementations CVPR 2022 Ozan Unal, Dengxin Dai, Luc van Gool

Densely annotating LiDAR point clouds remains too expensive and time-consuming to keep up with the ever growing volume of data.

3D Semantic Segmentation LIDAR Semantic Segmentation +1

Zero Pixel Directional Boundary by Vector Transform

1 code implementation ICLR 2022 Edoardo Mello Rella, Ajad Chhatkuli, Yun Liu, Ender Konukoglu, Luc van Gool

One of the key problems in boundary detection is the label representation, which typically leads to class imbalance and, as a consequence, to thick boundaries that require non-differential post-processing steps to be thinned.

Boundary Detection

Revisiting Deep Semi-supervised Learning: An Empirical Distribution Alignment Framework and Its Generalization Bound

no code implementations 13 Mar 2022 Feiyu Wang, Qin Wang, Wen Li, Dong Xu, Luc van Gool

Benefiting from this new perspective, we first propose a new deep semi-supervised learning framework called Semi-supervised Learning by Empirical Distribution Alignment (SLEDA), in which existing technologies from the domain adaptation community can be readily used to address the semi-supervised learning problem by reducing the empirical distribution distance between labeled and unlabeled data.

Data Augmentation Domain Adaptation

Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction

1 code implementation 9 Mar 2022 Yuanhao Cai, Jing Lin, Xiaowan Hu, Haoqian Wang, Xin Yuan, Yulun Zhang, Radu Timofte, Luc van Gool

Many algorithms have been developed to solve the inverse problem of coded aperture snapshot spectral imaging (CASSI), i.e., recovering the 3D hyperspectral images (HSIs) from a 2D compressive measurement.

Compressive Sensing Image Reconstruction +1

Barlow constrained optimization for Visual Question Answering

1 code implementation 7 Mar 2022 Abhishek Jha, Badri N. Patro, Luc van Gool, Tinne Tuytelaars

In this paper, we propose a novel regularization for VQA models, Constrained Optimization using Barlow's theory (COB), that improves the information content of the joint space by minimizing the redundancy.

Question Answering Visual Question Answering

Uncertainty-Aware Deep Multi-View Photometric Stereo

no code implementations CVPR 2022 Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc van Gool

At each pixel, our approach either selects or discards the deep-PS and deep-MVS network predictions depending on the prediction uncertainty measure.

Surface Reconstruction
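
As a rough illustration of per-pixel select-or-discard logic driven by uncertainty, the sketch below keeps a prediction only where its uncertainty is at or below a threshold. The function `select_confident`, the threshold value, and the use of `None` for discarded pixels are all hypothetical choices; the paper's actual uncertainty measure and selection rule are not reproduced here.

```python
from typing import List, Optional


def select_confident(
    prediction: List[List[float]],
    uncertainty: List[List[float]],
    threshold: float,
) -> List[List[Optional[float]]]:
    """Keep each pixel's prediction if its uncertainty <= threshold, else None."""
    return [
        [p if u <= threshold else None for p, u in zip(p_row, u_row)]
        for p_row, u_row in zip(prediction, uncertainty)
    ]


# Toy 2x2 depth predictions with matching per-pixel uncertainties.
depth = [[1.0, 2.0], [3.0, 4.0]]
sigma = [[0.1, 0.9], [0.3, 0.6]]
kept = select_confident(depth, sigma, threshold=0.5)
```

Pixels discarded from one network's output could then be filled from another source, which is the kind of fusion the entry above describes at a high level.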

Pix2NeRF: Unsupervised Conditional $π$-GAN for Single Image to Neural Radiance Fields Translation

2 code implementations 26 Feb 2022 Shengqu Cai, Anton Obukhov, Dengxin Dai, Luc van Gool

We propose a pipeline to generate Neural Radiance Fields (NeRF) of an object or a scene of a specific class, conditioned on a single input image.

3D-Aware Image Synthesis Novel View Synthesis +2

Adiabatic Quantum Computing for Multi Object Tracking

no code implementations CVPR 2022 Jan-Nico Zaech, Alexander Liniger, Martin Danelljan, Dengxin Dai, Luc van Gool

Multi-Object Tracking (MOT) is most often approached in the tracking-by-detection paradigm, where object detections are associated through time.

Multi-Object Tracking Object

Fast Online Video Super-Resolution with Deformable Attention Pyramid

no code implementations 3 Feb 2022 Dario Fuoli, Martin Danelljan, Radu Timofte, Luc van Gool

Our DAP aligns and integrates information from the recurrent state into the current frame prediction.

Video Super-Resolution

VRT: A Video Restoration Transformer

1 code implementation 28 Jan 2022 Jingyun Liang, JieZhang Cao, Yuchen Fan, Kai Zhang, Rakesh Ranjan, Yawei Li, Radu Timofte, Luc van Gool

In addition, parallel feature warping is used to further fuse information from neighboring frames.

Deblurring Denoising +7

RePaint: Inpainting using Denoising Diffusion Probabilistic Models

3 code implementations CVPR 2022 Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, Luc van Gool

In this work, we propose RePaint: A Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks.

Denoising Image Inpainting

Collapse by Conditioning: Training Class-conditional GANs with Limited Data

1 code implementation ICLR 2022 Mohamad Shahbazi, Martin Danelljan, Danda Pani Paudel, Luc van Gool

On the contrary, we observe that class-conditioning causes mode collapse in limited data settings, where unconditional learning leads to satisfactory generative ability.

Generative Adversarial Network

Pix2NeRF: Unsupervised Conditional $π$-GAN for Single Image to Neural Radiance Fields Translation

1 code implementation CVPR 2022 Shengqu Cai, Anton Obukhov, Dengxin Dai, Luc van Gool

We propose a pipeline to generate Neural Radiance Fields (NeRF) of an object or a scene of a specific class, conditioned on a single input image.

3D-Aware Image Synthesis Novel View Synthesis +2

Improving the Behaviour of Vision Transformers with Token-consistent Stochastic Layers

no code implementations 30 Dec 2021 Nikola Popovic, Danda Pani Paudel, Thomas Probst, Luc van Gool

We use linear layers with token-consistent stochastic parameters inside the multilayer perceptron blocks, without altering the architecture of the transformer.

Adversarial Robustness Transfer Learning

End-to-End Learning of Multi-category 3D Pose and Shape Estimation

no code implementations 19 Dec 2021 Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

We use a Transformer-based architecture to detect the keypoints, as well as to summarize the visual context of the image.

Topology Preserving Local Road Network Estimation from Single Onboard Camera Image

1 code implementation CVPR 2022 Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

We represent the road topology using a set of directed lane curves and their interactions, which are captured using their intersection points.
