Search Results for author: Michael Felsberg

Found 64 papers, 33 papers with code

NeuroNCAP: Photorealistic Closed-loop Safety Testing for Autonomous Driving

2 code implementations 11 Apr 2024 William Ljungbergh, Adam Tonderski, Joakim Johnander, Holger Caesar, Kalle Åström, Michael Felsberg, Christoffer Petersson

We present a versatile NeRF-based simulator for testing autonomous driving (AD) software systems, designed with a focus on sensor-realistic closed-loop evaluation and the creation of safety-critical scenarios.

Autonomous Driving

Composed Video Retrieval via Enriched Context and Discriminative Embeddings

1 code implementation 25 Mar 2024 Omkar Thawakar, Muzammal Naseer, Rao Muhammad Anwer, Salman Khan, Michael Felsberg, Mubarak Shah, Fahad Shahbaz Khan

Composed video retrieval (CoVR) is a challenging problem in computer vision which has recently highlighted the integration of modification text with visual queries for more sophisticated video search in large databases.

Composed Video Retrieval (CoVR) Retrieval

DiffSF: Diffusion Models for Scene Flow Estimation

1 code implementation 8 Mar 2024 Yushan Zhang, Bastian Wandt, Maria Magnusson, Michael Felsberg

Aiming at improving accuracy while additionally providing an estimate for uncertainty, we propose DiffSF that combines transformer-based scene flow estimation with denoising diffusion models.

Denoising Scene Flow Estimation +1

PALO: A Polyglot Large Multimodal Model for 5B People

1 code implementation 22 Feb 2024 Muhammad Maaz, Hanoona Rasheed, Abdelrahman Shaker, Salman Khan, Hisham Cholakal, Rao M. Anwer, Tim Baldwin, Michael Felsberg, Fahad S. Khan

PALO offers visual reasoning capabilities in 10 major languages, including English, Chinese, Hindi, Spanish, French, Arabic, Bengali, Russian, Urdu, and Japanese, that span a total of ~5B people (65% of the world population).

Language Modelling Large Language Model +1

SeTformer is What You Need for Vision and Language

no code implementations 7 Jan 2024 Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Michael Felsberg

Kernel methods are employed to simplify computations by approximating softmax but often lead to performance drops compared to softmax attention.

Computational Efficiency Language Modelling +3

Steerers: A framework for rotation equivariant keypoint descriptors

1 code implementation 4 Dec 2023 Georg Bökman, Johan Edstedt, Michael Felsberg, Fredrik Kahl

Image keypoint descriptions that are discriminative and matchable over large changes in viewpoint are vital for 3D reconstruction.

3D Reconstruction Data Augmentation

Certainty In, Certainty Out: REVQCs for Quantum Machine Learning

no code implementations 16 Oct 2023 Hannah Helgesen, Michael Felsberg, Jan-Åke Larsson

The field of Quantum Machine Learning (QML) has emerged recently in the hopes of finding new machine learning protocols or exponential speedups for classical ones.

Quantum Machine Learning

Leveraging the Power of Data Augmentation for Transformer-based Tracking

no code implementations 15 Sep 2023 Jie Zhao, Johan Edstedt, Michael Felsberg, Dong Wang, Huchuan Lu

Due to long-distance correlation and powerful pretrained models, transformer-based methods have initiated a breakthrough in visual object tracking performance.

Data Augmentation Visual Object Tracking

IBAFormer: Intra-batch Attention Transformer for Domain Generalized Semantic Segmentation

no code implementations 12 Sep 2023 Qiyu Sun, Huilin Chen, Meng Zheng, Ziyan Wu, Michael Felsberg, Yang Tang

Domain generalized semantic segmentation (DGSS) is a critical yet challenging task, where the model is trained only on source data without access to any target data.

Semantic Segmentation

Learning to Augment: Hallucinating Data for Domain Generalized Segmentation

no code implementations 4 Jul 2023 Qiyu Sun, Pavlo Melnyk, Michael Felsberg, Yang Tang

Domain generalized semantic segmentation (DGSS) is an essential but highly challenging task, in which the model is trained only on source data and no target data is available.

Data Augmentation Image Enhancement +1

Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning

no code implementations 29 Jun 2023 Arvi Jonnarth, Jie Zhao, Michael Felsberg

Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area, with applications ranging from robotic lawn mowing to search-and-rescue.

reinforcement-learning

Flexible Distribution Alignment: Towards Long-tailed Semi-supervised Learning with Proper Calibration

no code implementations 7 Jun 2023 Emanuel Sanchez Aimar, Hannah Helgesen, Yonghao Xu, Marco Kuhlmann, Michael Felsberg

Long-tailed semi-supervised learning (LTSSL) represents a practical scenario for semi-supervised applications, challenged by skewed labeled distributions that bias classifiers.

Data Augmentation

On Learning Deep O($n$)-Equivariant Hyperspheres

no code implementations 24 May 2023 Pavlo Melnyk, Michael Felsberg, Mårten Wadenbäck, Andreas Robinson, Cuong Le

In this paper, we utilize hyperspheres and regular $n$-simplexes and propose an approach to learning deep features equivariant under the transformations of $n$D reflections and rotations, encompassed by the powerful group of O($n$).

RoMa: Robust Dense Feature Matching

1 code implementation 24 May 2023 Johan Edstedt, Qiyu Sun, Georg Bökman, Mårten Wadenbäck, Michael Felsberg

The aim is to learn a robust model, i.e., a model able to match under challenging real-world changes.

Key Point Matching regression

High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation

1 code implementation 5 Apr 2023 Arvi Jonnarth, Yushan Zhang, Michael Felsberg

Our work is based on two techniques for improving CAMs: importance sampling, which is a substitute for GAP, and the feature similarity loss, which utilizes a heuristic that object contours almost always align with color edges in images.

Image Classification Segmentation +4

Flow-guided Semi-supervised Video Object Segmentation

no code implementations 25 Jan 2023 Yushan Zhang, Andreas Robinson, Maria Magnusson, Michael Felsberg

A model to extract the combined information from optical flow and the image is proposed, which is then used as input to the target model and the decoder network.

Object Optical Flow Estimation +5

Raw or Cooked? Object Detection on RAW Images

no code implementations 21 Jan 2023 William Ljungbergh, Joakim Johnander, Christoffer Petersson, Michael Felsberg

Images fed to a deep neural network have in general undergone several handcrafted image signal processing (ISP) operations, all of which have been optimized to produce visually pleasing images.

Object object-detection +1

Evidential Deep Learning for Class-Incremental Semantic Segmentation

no code implementations 6 Dec 2022 Karl Holmquist, Lena Klasén, Michael Felsberg

In this paper, we address the problem of how to model unlabeled classes while avoiding spurious feature clustering of future uncorrelated classes.

Class-Incremental Semantic Segmentation Clustering +1

TetraSphere: A Neural Descriptor for O(3)-Invariant Point Cloud Analysis

1 code implementation 26 Nov 2022 Pavlo Melnyk, Andreas Robinson, Michael Felsberg, Mårten Wadenbäck

In our approach, we perform TetraTransform--an equivariant embedding of the 3D input into 4D, constructed from the steerable neurons--and extract deeper O(3)-equivariant features using vector neurons.

3D Point Cloud Classification Point Cloud Classification

Balanced Product of Calibrated Experts for Long-Tailed Recognition

1 code implementation CVPR 2023 Emanuel Sanchez Aimar, Arvi Jonnarth, Michael Felsberg, Marco Kuhlmann

We show how to properly define these distributions and combine the experts in order to achieve unbiased predictions, by proving that the ensemble is Fisher-consistent for minimizing the balanced error.

Long-tail Learning Long-tail Learning on CIFAR-10-LT (ρ=100) +1

Video Instance Segmentation via Multi-scale Spatio-temporal Split Attention Transformer

1 code implementation 24 Mar 2022 Omkar Thawakar, Sanath Narayan, Jiale Cao, Hisham Cholakkal, Rao Muhammad Anwer, Muhammad Haris Khan, Salman Khan, Michael Felsberg, Fahad Shahbaz Khan

When using the ResNet50 backbone, our MS-STS achieves a mask AP of 50.1%, outperforming the best reported results in literature by 2.7% and by 4.8% at higher overlap threshold of AP_75, while being comparable in model size and speed on Youtube-VIS 2019 val.

Instance Segmentation Semantic Segmentation +2

Visual Feature Encoding for GNNs on Road Networks

no code implementations 2 Mar 2022 Oliver Stromann, Alireza Razavi, Michael Felsberg

In this work, we present a novel approach to learning an encoding of visual features into graph neural networks with the application on road network data.

Classification Image Classification +1

DKM: Dense Kernelized Feature Matching for Geometry Estimation

1 code implementation CVPR 2023 Johan Edstedt, Ioannis Athanasiadis, Mårten Wadenbäck, Michael Felsberg

This changes with our novel dense method, which outperforms both dense and sparse methods on geometry estimation.

Geometric Matching

Learning to integrate vision data into road network data

no code implementations 20 Dec 2021 Oliver Stromann, Alireza Razavi, Michael Felsberg

Road networks are the core infrastructure for connected and autonomous vehicles, but creating meaningful representations for machine learning applications is a challenging task.

Attribute Autonomous Vehicles +1

DoodleFormer: Creative Sketch Drawing with Transformers

no code implementations 6 Dec 2021 Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Jorma Laaksonen, Michael Felsberg

Creative sketch image generation is a challenging vision problem, where the task is to generate diverse, yet realistic creative sketches possessing the unseen composition of the visual-world objects.

Image Generation

Dense Gaussian Processes for Few-Shot Segmentation

1 code implementation 7 Oct 2021 Joakim Johnander, Johan Edstedt, Michael Felsberg, Fahad Shahbaz Khan, Martin Danelljan

Given the support set, our dense GP learns the mapping from local deep image features to mask values, capable of capturing complex appearance distributions.

Few-Shot Semantic Segmentation Gaussian Processes +1

Fully Steerable 3D Spherical Neurons

no code implementations 29 Sep 2021 Pavlo Melnyk, Michael Felsberg, Mårten Wadenbäck

Emerging from low-level vision theory, steerable filters found their counterpart in prior work on steerable convolutional neural networks equivariant to rigid transformations.

Graph Representation Learning for Road Type Classification

1 code implementation 16 Jul 2021 Zahra Gharaee, Shreyas Kowshik, Oliver Stromann, Michael Felsberg

We present a novel learning-based approach to graph representations of road networks employing state-of-the-art graph convolutional neural networks.

Classification Descriptive +3

Steerable 3D Spherical Neurons

1 code implementation 2 Jun 2021 Pavlo Melnyk, Michael Felsberg, Mårten Wadenbäck

In our work, we propose a steerable feed-forward learning-based approach that consists of neurons with spherical decision surfaces and operates on point clouds.

Deep Gaussian Processes for Few-Shot Segmentation

no code implementations 30 Mar 2021 Joakim Johnander, Johan Edstedt, Martin Danelljan, Michael Felsberg, Fahad Shahbaz Khan

Through the expressivity of the GP, our approach is capable of modeling complex appearance distributions in the deep feature space.

Gaussian Processes Segmentation

Normalized Convolution Upsampling for Refined Optical Flow Estimation

2 code implementations 13 Feb 2021 Abdelrahman Eldesokey, Michael Felsberg

Our proposed approach formulates the upsampling task as a sparse problem and employs the normalized convolutional neural networks to solve it.

Optical Flow Estimation

Learning Video Instance Segmentation with Recurrent Graph Neural Networks

no code implementations 7 Dec 2020 Joakim Johnander, Emil Brissman, Martin Danelljan, Michael Felsberg

Most existing approaches to video instance segmentation comprise multiple modules that are heuristically combined to produce the final output.

Instance Segmentation Management +3

Embed Me If You Can: A Geometric Perceptron

1 code implementation ICCV 2021 Pavlo Melnyk, Michael Felsberg, Mårten Wadenbäck

Our extension of the MLHP model, the multilayer geometric perceptron (MLGP), and its respective layer units, i.e., geometric neurons, are consistent with the 3D geometry and provide a geometric handle of the learned coefficients.

Decision Making

Uncertainty-Aware CNNs for Depth Completion: Uncertainty from Beginning to End

1 code implementation CVPR 2020 Abdelrahman Eldesokey, Michael Felsberg, Karl Holmquist, Mikael Persson

In this work, we thus focus on modeling the uncertainty of depth data in depth completion starting from the sparse noisy input all the way to the final prediction.

Computational Efficiency Depth Completion

Learning Fast and Robust Target Models for Video Object Segmentation

2 code implementations CVPR 2020 Andreas Robinson, Felix Järemo Lawin, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg

The target appearance model consists of a light-weight module, which is learned during the inference stage using fast optimization techniques to predict a coarse but robust target segmentation.

One-shot visual object segmentation Segmentation +2

Unsupervised Learning of Anomaly Detection from Contaminated Image Data using Simultaneous Encoder Training

1 code implementation 27 May 2019 Amanda Berg, Jörgen Ahlberg, Michael Felsberg

In this work, we evaluate the effects of anomaly contaminations in the training data on state-of-the-art GAN-based anomaly detection methods.

Anomaly Detection valid

Discriminative Online Learning for Fast Video Object Segmentation

no code implementations 18 Apr 2019 Andreas Robinson, Felix Järemo Lawin, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg

We propose a novel approach, based on a dedicated target appearance model that is exclusively learned online to discriminate between the target and background image regions.

Object One-shot visual object segmentation +4

ATOM: Accurate Tracking by Overlap Maximization

3 code implementations CVPR 2019 Martin Danelljan, Goutam Bhat, Fahad Shahbaz Khan, Michael Felsberg

We argue that this approach is fundamentally limited since target estimation is a complex task, requiring high-level knowledge about the object.

General Classification Visual Object Tracking +1

Confidence Propagation through CNNs for Guided Sparse Depth Regression

1 code implementation 5 Nov 2018 Abdelrahman Eldesokey, Michael Felsberg, Fahad Shahbaz Khan

In this paper, we propose an algebraically-constrained normalized convolution layer for CNNs with highly sparse input that has a smaller number of network parameters compared to related work.

Autonomous Driving Depth Completion +1

Propagating Confidences through CNNs for Sparse Data Regression

1 code implementation 30 May 2018 Abdelrahman Eldesokey, Michael Felsberg, Fahad Shahbaz Khan

To tackle this challenging problem, we introduce an algebraically-constrained convolution layer for CNNs with sparse input and demonstrate its capabilities for the scene depth completion task.

Autonomous Driving Depth Completion +1

Density Adaptive Point Set Registration

1 code implementation CVPR 2018 Felix Järemo Lawin, Martin Danelljan, Fahad Shahbaz Khan, Per-Erik Forssén, Michael Felsberg

Contrary to previous works, we model the underlying structure of the scene as a latent probability distribution, and thereby induce invariance to point set density changes.

Deep Motion Features for Visual Tracking

no code implementations 20 Dec 2016 Susanna Gladh, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg

To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking.

Action Recognition Optical Flow Estimation +3

Scale Coding Bag of Deep Features for Human Attribute and Action Recognition

no code implementations 14 Dec 2016 Fahad Shahbaz Khan, Joost Van de Weijer, Rao Muhammad Anwer, Andrew D. Bagdanov, Michael Felsberg, Jorma Laaksonen

Most approaches to human attribute and action recognition in still images are based on image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding.

Action Recognition In Still Images Attribute

ECO: Efficient Convolution Operators for Tracking

5 code implementations CVPR 2017 Martin Danelljan, Goutam Bhat, Fahad Shahbaz Khan, Michael Felsberg

Moreover, our fast variant, using hand-crafted features, operates at 60 Hz on a single CPU, while obtaining 65.0% AUC on OTB-2015.

Visual Object Tracking

Discriminative Scale Space Tracking

no code implementations 20 Sep 2016 Martin Danelljan, Gustav Häger, Fahad Shahbaz Khan, Michael Felsberg

Compared to the standard exhaustive scale search, our approach achieves a gain of 2.5% in average overlap precision on the OTB dataset.

Visual Object Tracking

Learning Spatially Regularized Correlation Filters for Visual Tracking

no code implementations ICCV 2015 Martin Danelljan, Gustav Häger, Fahad Shahbaz Khan, Michael Felsberg

These methods utilize a periodic assumption of the training samples to efficiently learn a classifier on all patches in the target neighborhood.

Visual Tracking

A Probabilistic Framework for Color-Based Point Set Registration

no code implementations CVPR 2016 Martin Danelljan, Giulia Meneghetti, Fahad Shahbaz Khan, Michael Felsberg

On the Stanford Lounge dataset, our approach achieves a relative reduction of the failure rate by 78% compared to the baseline.

Efficient Robust Mean Value Calculation of 1D Features

no code implementations 29 Jan 2016 Erik Jonsson, Michael Felsberg

A robust mean value is often a good alternative to the standard mean value when dealing with data containing many outliers.
