Search Results for author: Luc van Gool

Found 540 papers, 272 papers with code

RePaint: Inpainting using Denoising Diffusion Probabilistic Models

3 code implementations CVPR 2022 Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, Luc van Gool

In this work, we propose RePaint: A Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks.

Denoising Image Inpainting

Exploring Cross-Image Pixel Contrast for Semantic Segmentation

5 code implementations ICCV 2021 Wenguan Wang, Tianfei Zhou, Fisher Yu, Jifeng Dai, Ender Konukoglu, Luc van Gool

Inspired by the recent advance in unsupervised contrastive representation learning, we propose a pixel-wise contrastive framework for semantic segmentation in the fully supervised setting.

Metric Learning Optical Character Recognition (OCR) +3

SwinIR: Image Restoration Using Swin Transformer

9 code implementations 23 Aug 2021 Jingyun Liang, JieZhang Cao, Guolei Sun, Kai Zhang, Luc van Gool, Radu Timofte

In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection.

Color Image Denoising Grayscale Image Denoising +6

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction

3 code implementations 17 Apr 2022 Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Zhang, Hanspeter Pfister, Radu Timofte, Luc van Gool

Existing leading methods for spectral reconstruction (SR) focus on designing deeper or wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB image to its hyperspectral image (HSI).

Spectral Reconstruction Spectral Super-Resolution

Temporal Segment Networks for Action Recognition in Videos

11 code implementations 8 May 2017 Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc van Gool

Furthermore, based on the temporal segment networks, we won the video classification track at the ActivityNet challenge 2016 among 24 teams, which demonstrates the effectiveness of TSN and the proposed good practices.

Action Classification Action Recognition In Videos +3

Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

19 code implementations 2 Aug 2016 Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc van Gool

The other contribution is our study on a series of good practices in learning ConvNets on video data with the help of temporal segment network.

Action Classification Action Recognition In Videos +2

Learning Discriminative Model Prediction for Tracking

2 code implementations ICCV 2019 Goutam Bhat, Martin Danelljan, Luc van Gool, Radu Timofte

The current drive towards end-to-end trainable computer vision systems imposes major challenges for the task of visual tracking.

Visual Object Tracking Visual Tracking

Know Your Surroundings: Exploiting Scene Information for Object Tracking

1 code implementation ECCV 2020 Goutam Bhat, Martin Danelljan, Luc van Gool, Radu Timofte

Such approaches are, however, prone to fail in cases of, e.g., fast appearance changes or the presence of distractor objects, where a target appearance model alone is insufficient for robust tracking.

Object Tracking

Learning Target Candidate Association to Keep Track of What Not to Track

1 code implementation ICCV 2021 Christoph Mayer, Martin Danelljan, Danda Pani Paudel, Luc van Gool

To tackle the problem of lacking ground-truth correspondences between distractor objects in visual tracking, we propose a training strategy that combines partial annotations with self-supervision.

Visual Object Tracking Visual Tracking

Efficient Visual Tracking with Exemplar Transformers

2 code implementations 17 Dec 2021 Philippe Blatter, Menelaos Kanakis, Martin Danelljan, Luc van Gool

E.T.Track, our visual tracker that incorporates Exemplar Transformer modules, runs at 47 FPS on a CPU.

Visual Object Tracking Visual Tracking

Transforming Model Prediction for Tracking

1 code implementation CVPR 2022 Christoph Mayer, Martin Danelljan, Goutam Bhat, Matthieu Paul, Danda Pani Paudel, Fisher Yu, Luc van Gool

Optimization based tracking methods have been widely successful by integrating a target model prediction module, providing effective global reasoning by minimizing an objective function.

Ranked #21 on Visual Object Tracking on LaSOT (Precision metric)

Inductive Bias Visual Object Tracking

Robust Visual Tracking by Segmentation

2 code implementations 21 Mar 2022 Matthieu Paul, Martin Danelljan, Christoph Mayer, Luc van Gool

We infer a bounding box from the segmentation mask, validate our tracker on challenging tracking datasets and achieve the new state of the art on LaSOT with a success AUC score of 69.7%.

Decoder Segmentation +5

AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility

1 code implementation 14 Aug 2022 Mubashir Noman, Wafa Al Ghallabi, Daniya Najiha, Christoph Mayer, Akshay Dudhane, Martin Danelljan, Hisham Cholakkal, Salman Khan, Luc van Gool, Fahad Shahbaz Khan

While greatly benefiting tracking research, existing benchmarks no longer pose the same difficulty as before, with recent trackers achieving higher performance mainly due to (i) the introduction of more sophisticated transformer-based methods and (ii) the lack of diverse scenarios with adverse visibility, such as severe weather conditions, camouflage and imaging effects.

Visual Object Tracking Visual Tracking

Beyond SOT: Tracking Multiple Generic Objects at Once

1 code implementation 22 Dec 2022 Christoph Mayer, Martin Danelljan, Ming-Hsuan Yang, Vittorio Ferrari, Luc van Gool, Alina Kuznetsova

Our approach achieves a 4x faster run-time in case of 10 concurrent objects compared to tracking each object independently and outperforms existing single object trackers on our new benchmark.

Attribute Object +1

Towards End-to-End Lane Detection: an Instance Segmentation Approach

22 code implementations 15 Feb 2018 Davy Neven, Bert de Brabandere, Stamatios Georgoulis, Marc Proesmans, Luc van Gool

By doing so, we ensure a lane fitting which is robust against road plane changes, unlike existing approaches that rely on a fixed, pre-defined transformation.

Instance Segmentation Lane Detection +1

DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks

3 code implementations ICCV 2017 Andrey Ignatov, Nikolay Kobyshev, Radu Timofte, Kenneth Vanhoey, Luc van Gool

Despite a rapid rise in the quality of built-in smartphone cameras, their physical limitations (small sensor size, compact lenses and the lack of specific hardware) prevent them from achieving the quality of DSLR cameras.

Translation

On the Practicality of Deterministic Epistemic Uncertainty

2 code implementations 1 Jul 2021 Janis Postels, Mattia Segu, Tao Sun, Luca Sieber, Luc van Gool, Fisher Yu, Federico Tombari

We find that, while DUMs scale to realistic vision tasks and perform well on OOD detection, the practicality of current methods is undermined by poor calibration under distributional shifts.

Out of Distribution (OOD) Detection Semantic Segmentation +1

VRT: A Video Restoration Transformer

1 code implementation 28 Jan 2022 Jingyun Liang, JieZhang Cao, Yuchen Fan, Kai Zhang, Rakesh Ranjan, Yawei Li, Radu Timofte, Luc van Gool

In addition, parallel feature warping is used to further fuse information from neighboring frames.

Deblurring Denoising +7

Designing a Practical Degradation Model for Deep Blind Image Super-Resolution

3 code implementations ICCV 2021 Kai Zhang, Jingyun Liang, Luc van Gool, Radu Timofte

It is widely acknowledged that single image super-resolution (SISR) methods would not perform well if the assumed degradation model deviates from those in real images.

Image Super-Resolution Video Super-Resolution

Deep Extreme Cut: From Extreme Points to Object Segmentation

2 code implementations CVPR 2018 Kevis-Kokitsi Maninis, Sergi Caelles, Jordi Pont-Tuset, Luc van Gool

This paper explores the use of extreme points in an object (left-most, right-most, top, bottom pixels) as input to obtain precise object segmentation for images and videos.

Instance Segmentation Interactive Segmentation +4

Deep Unfolding Network for Image Super-Resolution

1 code implementation CVPR 2020 Kai Zhang, Luc van Gool, Radu Timofte

As a result, the proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model, while maintaining the advantages of learning-based methods.

Image Super-Resolution

SRFlow: Learning the Super-Resolution Space with Normalizing Flow

7 code implementations ECCV 2020 Andreas Lugmayr, Martin Danelljan, Luc van Gool, Radu Timofte

SRFlow therefore directly accounts for the ill-posed nature of the problem, and learns to predict diverse photo-realistic high-resolution images.

Ranked #6 on Image Super-Resolution on DIV2K val - 4x upscaling (using extra training data)

Image Manipulation Image Super-Resolution

MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning

1 code implementation ECCV 2020 Simon Vandenhende, Stamatios Georgoulis, Luc van Gool

In this paper, we argue for the importance of considering task interactions at multiple scales when distilling task information in a multi-task learning setup.

Multi-Task Learning Semantic Segmentation

Multi-Task Learning for Dense Prediction Tasks: A Survey

1 code implementation 28 Apr 2020 Simon Vandenhende, Stamatios Georgoulis, Wouter Van Gansbeke, Marc Proesmans, Dengxin Dai, Luc van Gool

In this survey, we provide a well-rounded view of state-of-the-art deep learning approaches for MTL in computer vision, with an explicit emphasis on dense prediction tasks.

Multi-Task Learning

End-to-end Lane Detection through Differentiable Least-Squares Fitting

1 code implementation 1 Feb 2019 Wouter Van Gansbeke, Bert de Brabandere, Davy Neven, Marc Proesmans, Luc van Gool

The problem with such a two-step approach is that the parameters of the network are not optimized for the true task of interest (estimating the lane curvature parameters) but for a proxy task (segmenting the lane markings), resulting in sub-optimal performance.

Lane Detection

Warp Consistency for Unsupervised Learning of Dense Correspondences

1 code implementation ICCV 2021 Prune Truong, Martin Danelljan, Fisher Yu, Luc van Gool

From our observations and empirical results, we design a general unsupervised objective employing two of the derived constraints.

Dense Pixel Correspondence Estimation

PDC-Net+: Enhanced Probabilistic Dense Correspondence Network

1 code implementation 28 Sep 2021 Prune Truong, Martin Danelljan, Radu Timofte, Luc van Gool

In order to apply dense methods to real-world applications, such as pose estimation, image manipulation, or 3D reconstruction, it is therefore crucial to estimate the confidence of the predicted matches.

3D Reconstruction Geometric Matching +6

Plug-and-Play Image Restoration with Deep Denoiser Prior

4 code implementations 31 Aug 2020 Kai Zhang, Yawei Li, WangMeng Zuo, Lei Zhang, Luc van Gool, Radu Timofte

Recent works on plug-and-play image restoration have shown that a denoiser can implicitly serve as the image prior for model-based methods to solve many inverse problems.

Deblurring Demosaicking +1

Practical Blind Image Denoising via Swin-Conv-UNet and Data Synthesis

2 code implementations 24 Mar 2022 Kai Zhang, Yawei Li, Jingyun Liang, JieZhang Cao, Yulun Zhang, Hao Tang, Deng-Ping Fan, Radu Timofte, Luc van Gool

While recent years have witnessed a dramatic upsurge of exploiting deep neural networks toward solving image denoising, existing methods mostly rely on simple noise assumptions, such as additive white Gaussian noise (AWGN), JPEG compression noise and camera sensor noise, and a general-purpose blind denoising method for real images remains unsolved.

Image Denoising Image-to-Image Translation

One-Shot Video Object Segmentation

8 code implementations CVPR 2017 Sergi Caelles, Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Laura Leal-Taixé, Daniel Cremers, Luc van Gool

This paper tackles the task of semi-supervised video object segmentation, i.e., the separation of an object from the background in a video, given the mask of the first frame.

Foreground Segmentation Object +4

Deep Temporal Linear Encoding Networks

2 code implementations CVPR 2017 Ali Diba, Vivek Sharma, Luc van Gool

Advantages of TLEs are: (a) they encode the entire video into a compact feature representation, learning the semantics and a discriminative feature space; (b) they are applicable to all kinds of networks like 2D and 3D CNNs for video classification; and (c) they model feature interactions in a more expressive way and without loss of information.

Representation Learning Video Classification

Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction

1 code implementation 9 Mar 2022 Yuanhao Cai, Jing Lin, Xiaowan Hu, Haoqian Wang, Xin Yuan, Yulun Zhang, Radu Timofte, Luc van Gool

Many algorithms have been developed to solve the inverse problem of coded aperture snapshot spectral imaging (CASSI), i.e., recovering the 3D hyperspectral images (HSIs) from a 2D compressive measurement.

Compressive Sensing Image Reconstruction +1

Degradation-Aware Unfolding Half-Shuffle Transformer for Spectral Compressive Imaging

1 code implementation 20 May 2022 Yuanhao Cai, Jing Lin, Haoqian Wang, Xin Yuan, Henghui Ding, Yulun Zhang, Radu Timofte, Luc van Gool

In coded aperture snapshot spectral compressive imaging (CASSI) systems, hyperspectral image (HSI) reconstruction methods are employed to recover the spatial-spectral signal from a compressed measurement.

Compressive Sensing Image Reconstruction +1

Sparse and noisy LiDAR completion with RGB guidance and uncertainty

1 code implementation 14 Feb 2019 Wouter Van Gansbeke, Davy Neven, Bert de Brabandere, Luc van Gool

However, we additionally propose a fusion method with RGB guidance from a monocular camera in order to leverage object information and to correct mistakes in the sparse input.

Autonomous Vehicles Depth Completion +2

Semantic Instance Segmentation with a Discriminative Loss Function

8 code implementations 8 Aug 2017 Bert De Brabandere, Davy Neven, Luc van Gool

In this work we propose to tackle the problem with a discriminative loss function, operating at the pixel level, that encourages a convolutional network to produce a representation of the image that can easily be clustered into instances with a simple post-processing step.

Instance Segmentation Lane Detection +4

DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation

3 code implementations CVPR 2022 Lukas Hoyer, Dengxin Dai, Luc van Gool

It improves the state of the art by 10.8 mIoU for GTA-to-Cityscapes and 5.4 mIoU for Synthia-to-Cityscapes and enables learning even difficult classes such as train, bus, and truck well.

Semantic Segmentation Synthetic-to-Real Translation +1

Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation

3 code implementations 26 Apr 2023 Lukas Hoyer, Dengxin Dai, Luc van Gool

As previous UDA&DG semantic segmentation methods are mostly based on outdated networks, we benchmark more recent architectures, reveal the potential of Transformers, and design the DAFormer network tailored for UDA&DG.

Domain Generalization Image Segmentation +2

Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals

2 code implementations ICCV 2021 Wouter Van Gansbeke, Simon Vandenhende, Stamatios Georgoulis, Luc van Gool

To achieve this, we introduce a two-step framework that adopts a predetermined mid-level prior in a contrastive optimization objective to learn pixel embeddings.

Clustering Object +2

Practical Full Resolution Learned Lossless Image Compression

3 code implementations CVPR 2019 Fabian Mentzer, Eirikur Agustsson, Michael Tschannen, Radu Timofte, Luc van Gool

We propose the first practical learned lossless image compression system, L3C, and show that it outperforms the popular engineered codecs, PNG, WebP and JPEG 2000.

Image Compression

SEEDS: Superpixels Extracted via Energy-Driven Sampling

1 code implementation 16 Sep 2013 Michael Van den Bergh, Xavier Boix, Gemma Roig, Luc van Gool

We define a robust and fast-to-evaluate energy function, based on enforcing color similarity between the boundaries and the superpixel color histogram.

Superpixels

DiffIR: Efficient Diffusion Model for Image Restoration

1 code implementation ICCV 2023 Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Luc van Gool

Diffusion models (DMs) have achieved SOTA performance by modeling the image synthesis process as a sequential application of a denoising network.

Denoising Image Generation +1

UniDepth: Universal Monocular Metric Depth Estimation

1 code implementation 27 Mar 2024 Luigi Piccinelli, Yung-Hsu Yang, Christos Sakaridis, Mattia Segu, Siyuan Li, Luc van Gool, Fisher Yu

However, the remarkable accuracy of recent MMDE methods is confined to their training domains.

Ranked #2 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)

Monocular Depth Estimation

Point-SLAM: Dense Neural Point Cloud-based SLAM

2 code implementations ICCV 2023 Erik Sandström, Yue Li, Luc van Gool, Martin R. Oswald

We propose a dense neural simultaneous localization and mapping (SLAM) approach for monocular RGBD input which anchors the features of a neural scene representation in a point cloud that is iteratively generated in an input-dependent data-driven manner.

Simultaneous Localization and Mapping

Replacing Mobile Camera ISP with a Single Deep Learning Model

3 code implementations 13 Feb 2020 Andrey Ignatov, Luc van Gool, Radu Timofte

The model is trained to convert RAW Bayer data obtained directly from mobile camera sensor into photos captured with a professional high-end DSLR camera, making the solution independent of any particular mobile ISP implementation.

Demosaicking Denoising

Efficient and Explicit Modelling of Image Hierarchies for Image Restoration

1 code implementation CVPR 2023 Yawei Li, Yuchen Fan, Xiaoyu Xiang, Denis Demandolx, Rakesh Ranjan, Radu Timofte, Luc van Gool

The aim of this paper is to propose a mechanism to efficiently and explicitly model image hierarchies in the global, regional, and local range for image restoration.

Image Deblurring Image Defocus Deblurring +1

Improving Video Generation for Multi-functional Applications

1 code implementation 30 Nov 2017 Bernhard Kratzwald, Zhiwu Huang, Danda Pani Paudel, Acharya Dinesh, Luc van Gool

In this paper, we aim to improve the state-of-the-art video generative adversarial networks (GANs) with a view towards multi-functional applications.

Colorization Future prediction +2

Rethinking Semantic Segmentation: A Prototype View

1 code implementation CVPR 2022 Tianfei Zhou, Wenguan Wang, Ender Konukoglu, Luc van Gool

Prevalent semantic segmentation solutions, despite their different network designs (FCN based or attention based) and mask decoding strategies (parametric softmax based or pixel-query based), can be placed in one category, by considering the softmax weights or query vectors as learnable class prototypes.

Segmentation Semantic Segmentation

CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion

3 code implementations CVPR 2023 Zixiang Zhao, Haowen Bai, Jiangshe Zhang, Yulun Zhang, Shuang Xu, Zudi Lin, Radu Timofte, Luc van Gool

We then introduce a dual-branch Transformer-CNN feature extractor with Lite Transformer (LT) blocks leveraging long-range attention to handle low-frequency global features and Invertible Neural Networks (INN) blocks focusing on extracting high-frequency local information.

object-detection Object Detection +1

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

3 code implementations ICCV 2023 Zixiang Zhao, Haowen Bai, Yuanzhi Zhu, Jiangshe Zhang, Shuang Xu, Yulun Zhang, Kai Zhang, Deyu Meng, Radu Timofte, Luc van Gool

To leverage strong generative priors and address challenges such as unstable training and lack of interpretability for GAN-based generative methods, we propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM).

Denoising

Equivariant Multi-Modality Image Fusion

3 code implementations 19 May 2023 Zixiang Zhao, Haowen Bai, Jiangshe Zhang, Yulun Zhang, Kai Zhang, Shuang Xu, Dongdong Chen, Radu Timofte, Luc van Gool

These components enable the net training to follow the principles of the natural sensing-imaging process while satisfying the equivariant imaging prior.

Self-Supervised Learning

Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

1 code implementation CVPR 2021 M. Saquib Sarfraz, Naila Murray, Vivek Sharma, Ali Diba, Luc van Gool, Rainer Stiefelhagen

Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos and is an important requirement for many video understanding tasks.

Action Segmentation Clustering +2

Denoising Diffusion Models for Plug-and-Play Image Restoration

2 code implementations 15 May 2023 Yuanzhi Zhu, Kai Zhang, Jingyun Liang, JieZhang Cao, Bihan Wen, Radu Timofte, Luc van Gool

Although diffusion models have shown impressive performance for high-quality image synthesis, their potential to serve as a generative denoiser prior for plug-and-play IR methods remains to be further explored.

Deblurring Denoising +4

Pix2NeRF: Unsupervised Conditional π-GAN for Single Image to Neural Radiance Fields Translation

2 code implementations 26 Feb 2022 Shengqu Cai, Anton Obukhov, Dengxin Dai, Luc van Gool

We propose a pipeline to generate Neural Radiance Fields (NeRF) of an object or a scene of a specific class, conditioned on a single input image.

3D-Aware Image Synthesis Novel View Synthesis +2

Pix2NeRF: Unsupervised Conditional π-GAN for Single Image to Neural Radiance Fields Translation

1 code implementation CVPR 2022 Shengqu Cai, Anton Obukhov, Dengxin Dai, Luc van Gool

We propose a pipeline to generate Neural Radiance Fields (NeRF) of an object or a scene of a specific class, conditioned on a single input image.

3D-Aware Image Synthesis Novel View Synthesis +2

CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2016

1 code implementation 2 Aug 2016 Yuanjun Xiong, Li-Min Wang, Zhe Wang, Bo-Wen Zhang, Hang Song, Wei Li, Dahua Lin, Yu Qiao, Luc van Gool, Xiaoou Tang

This paper presents the method that underlies our submission to the untrimmed video classification task of ActivityNet Challenge 2016.

General Classification Video Classification

End-to-End Urban Driving by Imitating a Reinforcement Learning Coach

2 code implementations ICCV 2021 Zhejun Zhang, Alexander Liniger, Dengxin Dai, Fisher Yu, Luc van Gool

Our end-to-end agent achieves a 78% success rate while generalizing to a new town and new weather on the NoCrash-dense benchmark and state-of-the-art performance on the challenging public routes of the CARLA LeaderBoard.

Autonomous Driving Imitation Learning +2

MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation

1 code implementation CVPR 2023 Lukas Hoyer, Dengxin Dai, Haoran Wang, Luc van Gool

MIC significantly improves the state-of-the-art performance across the different recognition tasks for synthetic-to-real, day-to-nighttime, and clear-to-adverse-weather UDA.

Image Classification object-detection +4

Video Super-Resolution Transformer

1 code implementation 12 Jun 2021 JieZhang Cao, Yawei Li, Kai Zhang, Luc van Gool

Specifically, to tackle the first issue, we present a spatial-temporal convolutional self-attention layer with a theoretical understanding to exploit the locality information.

Optical Flow Estimation Video Super-Resolution

Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation

1 code implementation CVPR 2021 Lukas Hoyer, Dengxin Dai, Yuhua Chen, Adrian Köring, Suman Saha, Luc van Gool

Training deep networks for semantic segmentation requires large amounts of labeled training data, which presents a major challenge in practice, as labeling segmentation masks is a highly labor-intensive process.

Data Augmentation Monocular Depth Estimation +2

Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with Self-Supervised Depth Estimation

1 code implementation 28 Aug 2021 Lukas Hoyer, Dengxin Dai, Qin Wang, Yuhua Chen, Luc van Gool

Training deep networks for semantic segmentation requires large amounts of labeled training data, which presents a major challenge in practice, as labeling segmentation masks is a highly labor-intensive process.

Data Augmentation Domain Adaptation +5

HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation

1 code implementation 27 Apr 2022 Lukas Hoyer, Dengxin Dai, Luc van Gool

Therefore, we propose HRDA, a multi-resolution training approach for UDA, that combines the strengths of small high-resolution crops to preserve fine segmentation details and large low-resolution crops to capture long-range context dependencies with a learned scale attention, while maintaining a manageable GPU memory footprint.

Segmentation Semantic Segmentation +3

Learning for Video Compression with Hierarchical Quality and Recurrent Enhancement

3 code implementations CVPR 2020 Ren Yang, Fabian Mentzer, Luc van Gool, Radu Timofte

In our HLVC approach, the hierarchical quality benefits the coding efficiency, since the high quality information facilitates the compression and enhancement of low quality frames at encoder and decoder sides, respectively.

Decoder Image Compression +3

Learning for Video Compression with Recurrent Auto-Encoder and Recurrent Probability Model

2 code implementations 24 Jun 2020 Ren Yang, Fabian Mentzer, Luc van Gool, Radu Timofte

The experiments show that our approach achieves the state-of-the-art learned video compression performance in terms of both PSNR and MS-SSIM.

Decoder MS-SSIM +2

OpenDVC: An Open Source Implementation of the DVC Video Compression Method

4 code implementations 29 Jun 2020 Ren Yang, Luc van Gool, Radu Timofte

At the time of writing this report, several learned video compression methods are superior to DVC, but currently none of them provides open-source code.

MS-SSIM SSIM +1

Perceptual Learned Video Compression with Recurrent Conditional GAN

3 code implementations 7 Sep 2021 Ren Yang, Radu Timofte, Luc van Gool

This paper proposes a Perceptual Learned Video Compression (PLVC) approach with recurrent conditional GAN.

Video Compression

Continual Test-Time Domain Adaptation

2 code implementations CVPR 2022 Qin Wang, Olga Fink, Luc van Gool, Dengxin Dai

However, real-world machine perception systems are running in non-stationary and continually changing environments where the target domain distribution can change over time.

Test-time Adaptation

Appearance-and-Relation Networks for Video Classification

1 code implementation CVPR 2018 Limin Wang, Wei Li, Wen Li, Luc van Gool

Specifically, SMART blocks decouple the spatiotemporal learning module into an appearance branch for spatial modeling and a relation branch for temporal modeling.

Action Classification Action Recognition +6

Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images

2 code implementations ICCV 2021 Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

In this work, we study the problem of extracting a directed graph representing the local road network in BEV coordinates, from a single onboard camera image.

Autonomous Navigation Lane Detection +1

HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud

2 code implementations 4 Apr 2024 Wencan Cheng, Hao Tang, Luc van Gool, Jong Hwan Ko

Extracting keypoint locations from input hand frames, known as 3D hand pose estimation, is a critical task in various human-computer interaction applications.

3D Hand Pose Estimation

Convolutional Oriented Boundaries: From Image Segmentation to High-Level Tasks

2 code implementations 17 Jan 2017 Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Pablo Arbeláez, Luc van Gool

We present Convolutional Oriented Boundaries (COB), which produces multiscale oriented contours and region hierarchies starting from generic image classification Convolutional Neural Networks (CNNs).

Boundary Detection Contour Detection +7

Convolutional Oriented Boundaries

1 code implementation 9 Aug 2016 Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Pablo Arbeláez, Luc van Gool

We present Convolutional Oriented Boundaries (COB), which produces multiscale oriented contours and region hierarchies starting from generic image classification Convolutional Neural Networks (CNNs).

Contour Detection General Classification +2

Conditional Probability Models for Deep Image Compression

1 code implementation CVPR 2018 Fabian Mentzer, Eirikur Agustsson, Michael Tschannen, Radu Timofte, Luc van Gool

During training, the auto-encoder makes use of the context model to estimate the entropy of its representation, and the context model is concurrently updated to learn the dependencies between the symbols in the latent representation.

Image Compression MS-SSIM +3

Dense 3D Regression for Hand Pose Estimation

1 code implementation CVPR 2018 Chengde Wan, Thomas Probst, Luc van Gool, Angela Yao

Specifically, we decompose the pose parameters into a set of per-pixel estimations, i.e., 2D heat maps, 3D heat maps and unit 3D directional vector fields.

3D Hand Pose Estimation regression

A Survey on Deep Learning Technique for Video Segmentation

1 code implementation 2 Jul 2021 Tianfei Zhou, Fatih Porikli, David Crandall, Luc van Gool, Wenguan Wang

Video segmentation, i.e., partitioning video frames into multiple segments or objects, plays a critical role in a broad range of practical applications, from enhancing visual effects in movies, to understanding scenes in autonomous driving, to creating virtual backgrounds in video conferencing.

Autonomous Driving Scene Understanding +4

Deep Reparametrization of Multi-Frame Super-Resolution and Denoising

2 code implementations ICCV 2021 Goutam Bhat, Martin Danelljan, Fisher Yu, Luc van Gool, Radu Timofte

The deep reparametrization allows us to directly model the image formation process in the latent space, and to integrate learned image priors into the prediction.

Burst Image Super-Resolution Denoising +2

GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector

2 code implementations 30 May 2022 Peng Zheng, Huazhu Fu, Deng-Ping Fan, Qi Fan, Jie Qin, Yu-Wing Tai, Chi-Keung Tang, Luc van Gool

In this paper, we present a novel end-to-end group collaborative learning network, termed GCoNet+, which can effectively and efficiently (250 fps) identify co-salient objects in natural scenes.

Co-Salient Object Detection Object +2

Night-to-Day Image Translation for Retrieval-based Localization

1 code implementation 26 Sep 2018 Asha Anoosheh, Torsten Sattler, Radu Timofte, Marc Pollefeys, Luc van Gool

We then compare the daytime and translated night images to obtain a pose estimate for the night image using the known 6-DOF position of the closest day image.

Image Retrieval Position +4

Disentangled Person Image Generation

1 code implementation CVPR 2018 Liqian Ma, Qianru Sun, Stamatios Georgoulis, Luc van Gool, Bernt Schiele, Mario Fritz

Generating novel, yet realistic, images of persons is a challenging task due to the complex interplay between the different image factors, such as the foreground, background and pose information.

Gesture-to-Gesture Translation Person Re-Identification +1

Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution

1 code implementation ICCV 2021 Jingyun Liang, Guolei Sun, Kai Zhang, Luc van Gool, Radu Timofte

Extensive experiments on synthetic and real images show that the proposed MANet not only performs favorably for both spatially variant and invariant kernel estimation, but also leads to state-of-the-art blind SR performance when combined with non-blind SR methods.

Image Super-Resolution

LiDAR Snowfall Simulation for Robust 3D Object Detection

1 code implementation CVPR 2022 Martin Hahner, Christos Sakaridis, Mario Bijelic, Felix Heide, Fisher Yu, Dengxin Dai, Luc van Gool

Due to the difficulty of collecting and annotating training data in this setting, we propose a physically based method to simulate the effect of snowfall on real clear-weather LiDAR point clouds.

Autonomous Driving Object +3

Towards Partial Supervision for Generic Object Counting in Natural Scenes

1 code implementation 13 Dec 2019 Hisham Cholakkal, Guolei Sun, Salman Khan, Fahad Shahbaz Khan, Ling Shao, Luc van Gool

Our RLC framework further reduces the annotation cost arising from large numbers of object categories in a dataset by only using lower-count supervision for a subset of categories and class-labels for the remaining ones.

Image Classification Image-level Supervised Instance Segmentation +3

Fog Simulation on Real LiDAR Point Clouds for 3D Object Detection in Adverse Weather

1 code implementation ICCV 2021 Martin Hahner, Christos Sakaridis, Dengxin Dai, Luc van Gool

2) Through extensive experiments with several state-of-the-art detection approaches, we show that our fog simulation can be leveraged to significantly improve the performance for 3D object detection in the presence of fog.

3D Object Detection Object +3

Dynamic Filter Networks

1 code implementation NeurIPS 2016 Bert De Brabandere, Xu Jia, Tinne Tuytelaars, Luc van Gool

In a traditional convolutional layer, the learned filters stay fixed after training.

Ranked #1 on Video Prediction on KTH (Cond metric)

Depth Estimation Optical Flow Estimation +1

ComboGAN: Unrestrained Scalability for Image Domain Translation

1 code implementation 19 Dec 2017 Asha Anoosheh, Eirikur Agustsson, Radu Timofte, Luc Van Gool

This year alone has seen unprecedented leaps in the area of learning-based image translation, namely CycleGAN, by Zhu et al.

Image-to-Image Translation Translation

AutoDecoding Latent 3D Diffusion Models

1 code implementation NeurIPS 2023 Evangelos Ntavelis, Aliaksandr Siarohin, Kyle Olszewski, Chaoyang Wang, Luc van Gool, Sergey Tulyakov

We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.

Event-Based Fusion for Motion Deblurring with Cross-modal Attention

1 code implementation 30 Nov 2021 Lei Sun, Christos Sakaridis, Jingyun Liang, Qi Jiang, Kailun Yang, Peng Sun, Yaozu Ye, Kaiwei Wang, Luc van Gool

Traditional frame-based cameras inevitably suffer from motion blur due to long exposure times.

Ranked #3 on Deblurring on GoPro (using extra training data)

Deblurring Image Deblurring +1

Weakly Supervised 3D Object Detection from Lidar Point Cloud

1 code implementation ECCV 2020 Qinghao Meng, Wenguan Wang, Tianfei Zhou, Jianbing Shen, Luc van Gool, Dengxin Dai

This work proposes a weakly supervised approach for 3D object detection, only requiring a small set of weakly annotated scenes, associated with a few precisely labeled object instances.

3D Object Detection Object +1

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

2 code implementations 11 May 2022 Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gao, Dengwen Zhou, Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang

The aim was to design a network for single image super-resolution that improves efficiency, measured by several metrics including runtime, parameters, FLOPs, activations, and memory consumption, while at least maintaining a PSNR of 29.00 dB on the DIV2K validation set.

Image Super-Resolution

Basic Binary Convolution Unit for Binarized Image Restoration Network

2 code implementations 2 Oct 2022 Bin Xia, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool

In this study, we reconsider components in binary convolution, such as residual connection, BatchNorm, activation function, and structure, for IR tasks.

Binarization Image Restoration +1

VA-DepthNet: A Variational Approach to Single Image Depth Prediction

2 code implementations 13 Feb 2023 Ce Liu, Suryansh Kumar, Shuhang Gu, Radu Timofte, Luc van Gool

While state-of-the-art deep neural network methods for SIDP learn the scene depth from images in a supervised setting, they often overlook the invaluable invariances and priors in the rigid scene space, such as the regularity of the scene.

Depth Prediction Monocular Depth Estimation

LocalViT: Bringing Locality to Vision Transformers

2 code implementations 12 Apr 2021 Yawei Li, Kai Zhang, JieZhang Cao, Radu Timofte, Luc van Gool

The importance of locality mechanisms is validated in two ways: 1) A wide range of design choices (activation function, layer placement, expansion ratio) are available for incorporating locality mechanisms and all proper choices can lead to a performance gain over the baseline, and 2) The same locality mechanism is successfully applied to 4 vision transformers, which shows the generalization of the locality concept.

Image Classification

Temporal 3D ConvNets: New Architecture and Transfer Learning for Video Classification

3 code implementations 22 Nov 2017 Ali Diba, Mohsen Fayyaz, Vivek Sharma, Amir Hossein Karami, Mohammad Mahdi Arzani, Rahman Yousefzadeh, Luc van Gool

Thus, by finetuning this network, we beat the performance of generic and recent methods in 3D CNNs, which were trained on large video datasets, e.g., Sports-1M, and finetuned on the target datasets, e.g., HMDB51/UCF101.

Action Recognition General Classification +3

Detection-aided liver lesion segmentation using deep learning

2 code implementations 29 Nov 2017 Miriam Bellver, Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Xavier Giro-i-Nieto, Jordi Torres, Luc van Gool

A fully automatic technique for segmenting the liver and localizing its unhealthy tissues is a convenient tool in order to diagnose hepatic diseases and assess the response to the corresponding treatments.

Computed Tomography (CT) Lesion Segmentation +1

Curriculum Model Adaptation with Synthetic and Real Data for Semantic Foggy Scene Understanding

1 code implementation 5 Jan 2019 Dengxin Dai, Christos Sakaridis, Simon Hecker, Luc van Gool

The method is based on the fact that the results of semantic segmentation in moderately adverse conditions (light fog) can be bootstrapped to solve the same problem in highly adverse conditions (dense fog).

Domain Adaptation Scene Understanding +2

Video Object Segmentation with Episodic Graph Memory Networks

1 code implementation ECCV 2020 Xiankai Lu, Wenguan Wang, Martin Danelljan, Tianfei Zhou, Jianbing Shen, Luc van Gool

How to make a segmentation model efficiently adapt to a specific video and to online target appearance variations are fundamentally crucial issues in the field of video object segmentation.

Object Segmentation +4

Deep Gradient Learning for Efficient Camouflaged Object Detection

1 code implementation 25 May 2022 Ge-Peng Ji, Deng-Ping Fan, Yu-Cheng Chou, Dengxin Dai, Alexander Liniger, Luc van Gool

This paper introduces DGNet, a novel deep framework that exploits object gradient supervision for camouflaged object detection (COD).

Defect Detection Object +4

Breathing New Life into 3D Assets with Generative Repainting

2 code implementations 15 Sep 2023 Tianfu Wang, Menelaos Kanakis, Konrad Schindler, Luc van Gool, Anton Obukhov

Diffusion-based text-to-image models ignited immense attention from the vision community, artists, and content creators.

Progressive Prioritized Multi-View Stereo

1 code implementation CVPR 2016 Alex Locher, Michal Perdoch, Luc van Gool

This work proposes a progressive patch based multi-view stereo algorithm able to deliver a dense point cloud at any time.

Learning Unsupervised Hierarchical Part Decomposition of 3D Objects from a Single RGB Image

1 code implementation CVPR 2020 Despoina Paschalidou, Luc van Gool, Andreas Geiger

Humans perceive the 3D world as a set of distinct objects that are characterized by various low-level (geometry, reflectance) and high-level (connectivity, adjacency, symmetry) properties.

3D Reconstruction

Structured Sparsity Learning for Efficient Video Super-Resolution

1 code implementation CVPR 2023 Bin Xia, Jingwen He, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Luc van Gool

In SSL, we design pruning schemes for several key components in VSR models, including residual blocks, recurrent networks, and upsampling networks.

Video Super-Resolution

Real-Time Motion Prediction via Heterogeneous Polyline Transformer with Relative Pose Encoding

1 code implementation NeurIPS 2023 Zhejun Zhang, Alexander Liniger, Christos Sakaridis, Fisher Yu, Luc van Gool

The real-world deployment of an autonomous driving system requires its components to run on-board and in real-time, including the motion prediction module that predicts the future trajectories of surrounding traffic participants.

Autonomous Driving motion prediction

Depth Estimation from Monocular Images and Sparse Radar Data

1 code implementation 30 Sep 2020 Juan-Ting Lin, Dengxin Dai, Luc van Gool

We give a comprehensive study of the fusion between RGB images and Radar measurements from different aspects and propose a working solution based on the observations.

Depth Estimation

Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation

1 code implementation 13 Jun 2022 Wouter Van Gansbeke, Simon Vandenhende, Luc van Gool

This paper presents MaskDistill: a novel framework for unsupervised semantic segmentation based on three key ideas.

Ranked #4 on Unsupervised Semantic Segmentation on PASCAL VOC 2012 val (using extra training data)

Object Segmentation +1

Vision Transformers with Hierarchical Attention

3 code implementations 6 Jun 2021 Yun Liu, Yu-Huan Wu, Guolei Sun, Le Zhang, Ajad Chhatkuli, Luc van Gool

This paper tackles the high computational/space complexity associated with Multi-Head Self-Attention (MHSA) in vanilla vision transformers.

Image Classification Instance Segmentation +4

3D Appearance Super-Resolution with Deep Learning

1 code implementation CVPR 2019 Yawei Li, Vagia Tsiminaki, Radu Timofte, Marc Pollefeys, Luc van Gool

Experimental results demonstrate that our proposed networks successfully incorporate the 3D geometric information and super-resolve the texture maps.

Super-Resolution

Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis

1 code implementation 22 Jul 2023 Hao Tang, Guolei Sun, Nicu Sebe, Luc van Gool

To tackle 2), we design an effective module to selectively highlight class-dependent feature maps according to the original semantic layout to preserve the semantic information.

Contrastive Learning Image Generation

Refign: Align and Refine for Adaptation of Semantic Segmentation to Adverse Conditions

1 code implementation 14 Jul 2022 David Bruggemann, Christos Sakaridis, Prune Truong, Luc van Gool

Due to the scarcity of dense pixel-level semantic annotations for images recorded in adverse visual conditions, there has been a keen interest in unsupervised domain adaptation (UDA) for the semantic segmentation of such images.

Semantic Segmentation Unsupervised Domain Adaptation

Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

1 code implementation CVPR 2021 Tianfei Zhou, Wenguan Wang, Si Liu, Yi Yang, Luc van Gool

To address the challenging task of instance-aware human part parsing, a new bottom-up regime is proposed to learn category-level human semantic segmentation as well as multi-person pose estimation in a joint and end-to-end manner.

Human Parsing Multi-Person Pose Estimation +3

Learned Multi-Patch Similarity

1 code implementation ICCV 2017 Wilfried Hartmann, Silvano Galliani, Michal Havlena, Luc van Gool, Konrad Schindler

Estimating a depth map from multiple views of a scene is a fundamental task in computer vision.

RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials

1 code implementation CVPR 2018 Despoina Paschalidou, Ali Osman Ulusoy, Carolin Schmitt, Luc van Gool, Andreas Geiger

RayNet integrates a CNN that learns view-invariant feature representations with an MRF that explicitly encodes the physics of perspective projection and occlusion.

3D Reconstruction

SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

1 code implementation 27 Nov 2023 Lukas Hoyer, David Joseph Tan, Muhammad Ferjad Naeem, Luc van Gool, Federico Tombari

In SemiVL, we propose to integrate rich priors from VLM pre-training into semi-supervised semantic segmentation to learn better semantic decision boundaries.

Decoder Segmentation +1

Understanding Bird's-Eye View of Road Semantics using an Onboard Camera

1 code implementation 5 Dec 2020 Yigit Baran Can, Alexander Liniger, Ozan Unal, Danda Paudel, Luc van Gool

In this work, we study scene understanding in the form of online estimation of semantic BEV maps using the video input from a single onboard camera.

Autonomous Navigation Scene Understanding

3D-Aware Video Generation

1 code implementation 29 Jun 2022 Sherwin Bahmani, Jeong Joon Park, Despoina Paschalidou, Hao Tang, Gordon Wetzstein, Leonidas Guibas, Luc van Gool, Radu Timofte

Generative models have emerged as an essential building block for many image synthesis and editing tasks.

Image Generation Video Generation

Large Scale Holistic Video Understanding

1 code implementation ECCV 2020 Ali Diba, Mohsen Fayyaz, Vivek Sharma, Manohar Paluri, Jurgen Gall, Rainer Stiefelhagen, Luc van Gool

HVU is organized hierarchically in a semantic taxonomy that focuses on multi-label and multi-task video understanding as a comprehensive problem that encompasses the recognition of multiple semantic aspects in the dynamic scene.

Action Classification Action Recognition +7

Towards Interpretable Video Super-Resolution via Alternating Optimization

1 code implementation 21 Jul 2022 JieZhang Cao, Jingyun Liang, Kai Zhang, Wenguan Wang, Qin Wang, Yulun Zhang, Hao Tang, Luc van Gool

These issues can be alleviated by a cascade of three separate sub-tasks, including video deblurring, frame interpolation, and super-resolution, which, however, would fail to capture the spatial and temporal correlations among video sequences.

Deblurring Space-time Video Super-resolution +2

Learning to Prompt with Text Only Supervision for Vision-Language Models

1 code implementation 4 Jan 2024 Muhammad Uzair Khattak, Muhammad Ferjad Naeem, Muzammal Naseer, Luc van Gool, Federico Tombari

While effective, most of these works require labeled data, which is not practical, and often struggle to generalize to new datasets due to over-fitting on the source data.

Prompt Engineering

DLOW: Domain Flow for Adaptation and Generalization

1 code implementation CVPR 2019 Rui Gong, Wen Li, Yu-Hua Chen, Luc van Gool

In this work, we present a domain flow generation (DLOW) model to bridge two different domains by generating a continuous sequence of intermediate domains flowing from one domain to the other.

Domain Adaptation Semantic Segmentation +1

Topology Preserving Local Road Network Estimation from Single Onboard Camera Image

1 code implementation CVPR 2022 Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

We represent the road topology using a set of directed lane curves and their interactions, which are captured using their intersection points.

Query-adaptive Video Summarization via Quality-aware Relevance Estimation

1 code implementation 1 May 2017 Arun Balajee Vasudevan, Michael Gygli, Anna Volokitin, Luc van Gool

Although the problem of automatic video summarization has recently received a lot of attention, the problem of creating a video summary that also highlights elements relevant to a search query has been less studied.

Video Summarization

Advances in Deep Concealed Scene Understanding

1 code implementation 21 Apr 2023 Deng-Ping Fan, Ge-Peng Ji, Peng Xu, Ming-Ming Cheng, Christos Sakaridis, Luc van Gool

Concealed scene understanding (CSU) is a hot computer vision topic aiming to perceive objects exhibiting camouflage.

Scene Understanding Semantic Segmentation

Semi-Supervised Learning by Augmented Distribution Alignment

1 code implementation ICCV 2019 Qin Wang, Wen Li, Luc van Gool

We reveal that an essential sampling bias exists in semi-supervised learning due to the limited number of labeled samples, which often leads to a considerable empirical distribution mismatch between labeled data and unlabeled data.

Domain Adaptation Semi-Supervised Image Classification

AENet: Learning Deep Audio Features for Video Analysis

1 code implementation 3 Jan 2017 Naoya Takahashi, Michael Gygli, Luc van Gool

Instead, combining visual features with our AENet features, which can be computed efficiently on a GPU, leads to significant performance improvements on action recognition and video highlight detection.

Action Recognition Data Augmentation +4

Learning Filter Basis for Convolutional Neural Network Compression

3 code implementations ICCV 2019 Yawei Li, Shuhang Gu, Luc van Gool, Radu Timofte

Convolutional neural networks (CNNs) based solutions have achieved state-of-the-art performances for many computer vision tasks, including classification and super-resolution of images.

General Classification Image Classification +2

Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression

2 code implementations CVPR 2020 Yawei Li, Shuhang Gu, Christoph Mayer, Luc van Gool, Radu Timofte

In this paper, we analyze two popular network compression techniques, i.e., filter pruning and low-rank decomposition, in a unified sense.

Talk2Car: Taking Control of Your Self-Driving Car

1 code implementation IJCNLP 2019 Thierry Deruyttere, Simon Vandenhende, Dusan Grujicic, Luc van Gool, Marie-Francine Moens

Or more specifically, we consider the problem in an autonomous driving setting, where a passenger requests an action that can be associated with an object found in a street scene.

Autonomous Driving Object +2

SESAME: Semantic Editing of Scenes by Adding, Manipulating or Erasing Objects

1 code implementation ECCV 2020 Evangelos Ntavelis, Andrés Romero, Iason Kastanis, Luc van Gool, Radu Timofte

In contrast to previous methods that employ a discriminator that trivially concatenates semantics and image as an input, the SESAME discriminator is composed of two input streams that independently process the image and its semantics, using the latter to manipulate the results of the former.

Image Manipulation Image-to-Image Translation

Learning Better Lossless Compression Using Lossy Compression

1 code implementation CVPR 2020 Fabian Mentzer, Luc van Gool, Michael Tschannen

We leverage the powerful lossy image compression algorithm BPG to build a lossless image compression system.

Image Compression

DHP: Differentiable Meta Pruning via HyperNetworks

2 code implementations ECCV 2020 Yawei Li, Shuhang Gu, Kai Zhang, Luc van Gool, Radu Timofte

Passing the sparsified latent vectors through the hypernetworks, the corresponding slices of the generated weight parameters can be removed, achieving the effect of network pruning.

Denoising Image Classification +3

Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning

1 code implementation 4 Apr 2024 Rui Li, Tobias Fischer, Mattia Segu, Marc Pollefeys, Luc van Gool, Federico Tombari

We propose KYN, a novel method for single-view scene reconstruction that reasons about semantic and spatial context to predict each point's density.

3D Scene Reconstruction Depth Estimation +2

Sliced Wasserstein Generative Models

1 code implementation 8 Jun 2017 Jiqing Wu, Zhiwu Huang, Dinesh Acharya, Wen Li, Janine Thoma, Danda Pani Paudel, Luc van Gool

In generative modeling, the Wasserstein distance (WD) has emerged as a useful metric to measure the discrepancy between generated and real data distributions.

Image Generation Video Generation

Towards High Resolution Video Generation with Progressive Growing of Sliced Wasserstein GANs

1 code implementation 4 Oct 2018 Dinesh Acharya, Zhiwu Huang, Danda Pani Paudel, Luc van Gool

Furthermore, we introduce a sliced version of Wasserstein GAN (SWGAN) loss to improve the distribution learning on the video data of high-dimension and mixed-spatiotemporal distribution.

Action Recognition Image Generation +2

Sliced Wasserstein Generative Models

1 code implementation CVPR 2019 Jiqing Wu, Zhiwu Huang, Dinesh Acharya, Wen Li, Janine Thoma, Danda Pani Paudel, Luc van Gool

In generative modeling, the Wasserstein distance (WD) has emerged as a useful metric to measure the discrepancy between generated and real data distributions.

Image Generation Video Generation

Physical Adversarial Attack meets Computer Vision: A Decade Survey

1 code implementation 30 Sep 2022 Hui Wei, Hao Tang, Xuemei Jia, Zhixiang Wang, Hanxun Yu, Zhubo Li, Shin'ichi Satoh, Luc van Gool, Zheng Wang

Building upon this foundation, we uncover the pervasive role of artifacts carrying adversarial perturbations in the physical world.

Adversarial Attack Medical Diagnosis

Indiscernible Object Counting in Underwater Scenes

1 code implementation CVPR 2023 Guolei Sun, Zhaochong An, Yun Liu, Ce Liu, Christos Sakaridis, Deng-Ping Fan, Luc van Gool

We further advance the frontier of this field by systematically studying a new challenge named indiscernible object counting (IOC), the goal of which is to count objects that are blended with respect to their surroundings.

Benchmarking Object +2

TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction

2 code implementations 7 Mar 2023 Zhejun Zhang, Alexander Liniger, Dengxin Dai, Fisher Yu, Luc van Gool

We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving, and based on TrafficBots we obtain a world model tailored for the planning module of autonomous vehicles.

Autonomous Driving Model-based Reinforcement Learning +1

Deep Equilibrium Diffusion Restoration with Parallel Sampling

1 code implementation 20 Nov 2023 JieZhang Cao, Yue Shi, Kai Zhang, Yulun Zhang, Radu Timofte, Luc van Gool

Due to the inherent property of diffusion models, most existing methods need long serial sampling chains to restore HQ images step-by-step, resulting in expensive sampling time and high computation costs.

Image Restoration

Collapse by Conditioning: Training Class-conditional GANs with Limited Data

1 code implementation ICLR 2022 Mohamad Shahbazi, Martin Danelljan, Danda Pani Paudel, Luc van Gool

On the contrary, we observe that class-conditioning causes mode collapse in limited data settings, where unconditional learning leads to satisfactory generative ability.

Generative Adversarial Network

Arbitrary-Scale Image Synthesis

1 code implementation CVPR 2022 Evangelos Ntavelis, Mohamad Shahbazi, Iason Kastanis, Radu Timofte, Martin Danelljan, Luc van Gool

Positional encodings have enabled recent works to train a single adversarial network that can generate images of different scales.

Image Generation

Single-Model and Any-Modality for Video Object Tracking

1 code implementation 27 Nov 2023 Zongwei Wu, Jilai Zheng, Xiangxuan Ren, Florin-Alexandru Vasluianu, Chao Ma, Danda Pani Paudel, Luc van Gool, Radu Timofte

In practice, most existing RGB trackers learn a single set of parameters to use them across datasets and applications.

Object Video Object Tracking

SMIT: Stochastic Multi-Label Image-to-Image Translation

1 code implementation 10 Dec 2018 Andrés Romero, Pablo Arbeláez, Luc van Gool, Radu Timofte

This problem is highly challenging due to three main reasons: (i) unpaired datasets, (ii) multiple attributes, and (iii) the multimodality (e.g., style) associated with the translation.

Image-to-Image Translation Translation

Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation

1 code implementation CVPR 2021 Suman Saha, Anton Obukhov, Danda Pani Paudel, Menelaos Kanakis, Yuhua Chen, Stamatios Georgoulis, Luc van Gool

Specifically, we show that: (1) our approach improves performance on all tasks when they are complementary and mutually dependent; (2) the CTRL helps to improve the performance of both semantic segmentation and depth estimation in the challenging UDA setting; (3) the proposed ISL training scheme further improves the semantic segmentation performance.

Monocular Depth Estimation Multi-Task Learning +4

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model

1 code implementation CVPR 2022 Zipeng Xu, Tianwei Lin, Hao Tang, Fu Li, Dongliang He, Nicu Sebe, Radu Timofte, Luc van Gool, Errui Ding

We propose a novel framework, i.e., Predict, Prevent, and Evaluate (PPE), for disentangled text-driven image manipulation that requires little manual annotation while being applicable to a wide variety of manipulations.

Image Manipulation Language Modelling

Continuous Pose for Monocular Cameras in Neural Implicit Representation

1 code implementation 28 Nov 2023 Qi Ma, Danda Pani Paudel, Ajad Chhatkuli, Luc van Gool

In this paper, we showcase the effectiveness of optimizing monocular camera poses as a continuous function of time.

Simultaneous Localization and Mapping

DeepProposals: Hunting Objects and Actions by Cascading Deep Convolutional Layers

1 code implementation 15 Jun 2016 Amir Ghodrati, Ali Diba, Marco Pedersoli, Tinne Tuytelaars, Luc van Gool

In this paper, a new method for generating object and action proposals in images and videos is proposed.

Object

DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers

1 code implementation ICCV 2015 Amir Ghodrati, Ali Diba, Marco Pedersoli, Tinne Tuytelaars, Luc van Gool

We generate hypotheses in a sliding-window fashion over different activation layers and show that the final convolutional layers can find the object of interest with high recall but poor localization due to the coarseness of the feature maps.

Object

Model-aware 3D Eye Gaze from Weak and Few-shot Supervisions

1 code implementation 20 Nov 2023 Nikola Popovic, Dimitrios Christodoulou, Danda Pani Paudel, Xi Wang, Luc van Gool

In this work, we propose to predict 3D eye gaze from weak supervision of eye semantic segmentation masks and direct supervision of a few 3D gaze vectors.

Semantic Segmentation

SMILE: Semantically-guided Multi-attribute Image and Layout Editing

1 code implementation 5 Oct 2020 Andrés Romero, Luc van Gool, Radu Timofte

Additionally, our method is capable of adding, removing or changing either fine-grained or coarse attributes by using an image as a reference or by exploring the style distribution space, and it can be easily extended to head-swapping and face-reenactment applications without being trained on videos.

Attribute Face Reenactment +1

Masked Vision-Language Transformer in Fashion

1 code implementation 27 Oct 2022 Ge-Peng Ji, Mingcheng Zhuge, Dehong Gao, Deng-Ping Fan, Christos Sakaridis, Luc van Gool

We present a masked vision-language transformer (MVLT) for fashion-specific multi-modal representation.

Image Reconstruction Retrieval

Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference

1 code implementation ECCV 2020 Menelaos Kanakis, David Bruggemann, Suman Saha, Stamatios Georgoulis, Anton Obukhov, Luc van Gool

First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).

Incremental Learning Multi-Task Learning
