no code implementations • 20 Sep 2021 • David Curto, Albert Clapés, Javier Selva, Sorina Smeureanu, Julio C. S. Jacques Junior, David Gallardo-Pujol, Georgina Guilera, David Leiva, Thomas B. Moeslund, Sergio Escalera, Cristina Palmero
Personality computing has become an emerging topic in computer vision, due to the wide range of applications it can be used for.
In a zero-shot cross-dataset generalization experiment, we show that our affordance learning scheme can be applied across a diverse mix of datasets and improves driveability estimation in unseen environments compared to general-purpose, single-dataset segmentation.
To this end, we present Sewer-ML, a large, novel, and publicly available multi-label dataset for image-based sewer defect classification.
Experimental results on both real and artificially corrupted face images show that our method produces more detailed reconstructions with less noise than existing state-of-the-art (SoTA) methods.
Explainable Artificial Intelligence (XAI) has in recent years become a well-suited framework to generate human understandable explanations of black box models.
1 code implementation • 26 Nov 2020 • Adrien Deliège, Anthony Cioppa, Silvio Giancola, Meisam J. Seikavandi, Jacob V. Dueholm, Kamal Nasrollahi, Bernard Ghanem, Thomas B. Moeslund, Marc Van Droogenbroeck
In this work, we propose SoccerNet-v2, a novel large-scale corpus of manual annotations for the SoccerNet video dataset, along with open challenges to encourage more research in soccer understanding and broadcast production.
Ranked #1 on Camera shot segmentation on SoccerNet-v2
In this work we present a novel, publicly available, stereo-based 3D RGB dataset for multi-object zebrafish tracking, called 3D-ZeF.
A new brand of technical artificial intelligence (Explainable AI) research has focused on trying to open up the 'black box' and provide some explainability.
As an alternative, we developed a system that detects players from a unique cheap and wide-angle fisheye camera assisted by a single narrow-angle thermal camera.
Then, the proposed method extracts deep semantic information from a fully convolutional FEN and fuses it with the best ResNet-based feature maps to strengthen the target representation in the learning process of continuous convolution filters.
We show that accuracy improvements can be made with more complex meta-architectures and speed can be optimised by decreasing the image size with only slight losses in accuracy.
We benchmark our loss on a large dataset of soccer videos, SoccerNet, and achieve an improvement of 12.8% over the baseline.
Ranked #2 on Action Spotting on SoccerNet-v2
In this paper, we design a system for the detection of rainfall by the use of surveillance cameras.
We propose a new evaluation protocol that evaluates the rain removal algorithms on their ability to improve the performance of subsequent segmentation, instance segmentation, and feature tracking algorithms under rain and snow.
This tech report gives an introduction to two annotation toolboxes that enable the creation of pixel and polygon-based masks as well as bounding boxes around objects of interest.
This paper proposes a double-deep spatio-angular learning framework for light-field-based face recognition, which is able to learn both texture and angular dynamics in sequence using convolutional representations; this is a novel recognition framework that has never been proposed before for either face recognition or any other visual recognition task.
In general, event-related information needs can be observed in query streams through various temporal patterns of user search behavior, e.g., spiky peaks for popular events, and periodicities for repetitive events.
Sparse representation has been applied successfully in abnormal event detection, in which the baseline is to learn a dictionary accompanied by sparse codes.
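The sparse-representation baseline mentioned above can be illustrated with a minimal sketch. It is not the method of the listed paper. Assumptions made here for illustration only: the dictionary is simply a set of normalised "normal" training samples rather than a learned one, sparse codes are computed with plain ISTA, and the names `sparse_code` and `abnormality_score` are hypothetical. A sample that the dictionary cannot reconstruct sparsely receives a high score.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the L1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code(D, x, lam=0.01, n_iter=500):
    # ISTA: minimise 0.5 * ||x - D a||^2 + lam * ||a||_1 over a.
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a

def abnormality_score(D, x, lam=0.01):
    # Sparse reconstruction cost: high when x is poorly explained by D.
    a = sparse_code(D, x, lam)
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))

# Synthetic data (assumption): normal features lie in a 3-D subspace of R^20.
rng = np.random.default_rng(0)
basis = rng.normal(size=(20, 3))
train = basis @ rng.normal(size=(3, 30))
D = train / np.linalg.norm(train, axis=0)  # unit-norm atoms, stand-in for a learned dictionary

# A normal test sample from the same subspace, scaled to unit norm.
normal = basis @ rng.normal(size=3)
normal /= np.linalg.norm(normal)

# An abnormal test sample orthogonal to the subspace, scaled to unit norm.
Q, _ = np.linalg.qr(basis)
r = rng.normal(size=20)
r -= Q @ (Q.T @ r)
abnormal = r / np.linalg.norm(r)

normal_score = abnormality_score(D, normal)
abnormal_score = abnormality_score(D, abnormal)
print(f"normal: {normal_score:.4f}, abnormal: {abnormal_score:.4f}")
```

In practice the dictionary would be learned from training features and the decision threshold on the score chosen on held-out data; both are left out of this sketch.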