no code implementations • 22 Mar 2025 • Mehdi Noroozi, Alberto Gil Ramos, Luca Morreale, Ruchika Chavhan, Malcolm Chadwick, Abhinav Mehrotra, Sourav Bhattacharya
We present Explicit Conditioning (EC) of the noise distribution on the input modalities to achieve this.
no code implementations • 20 Mar 2025 • Philipp Becker, Abhinav Mehrotra, Ruchika Chavhan, Malcolm Chadwick, Luca Morreale, Mehdi Noroozi, Alberto Gil Ramos, Sourav Bhattacharya
Second, we formulate a hybrid attention scheme for multi-modal inputs that combines linear attention for image-to-image interactions and standard scaled dot-product attention for interactions involving prompts.
no code implementations • CVPR 2025 • Mehdi Noroozi, Isma Hadji, Victor Escorcia, Anestis Zaganidis, Brais Martinez, Georgios Tzimiropoulos
To maintain high visual quality on such a low compute budget, we introduce a number of training strategies: (i) a novel conditioning mechanism on the low-resolution input, coined bidirectional conditioning, which tailors the SD model for the SR task.
no code implementations • 30 Jan 2024 • Mehdi Noroozi, Isma Hadji, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos
We show that the combination of a spatially distilled U-Net and a fine-tuned decoder, using only a single step, outperforms state-of-the-art methods that require 200 steps.
2 code implementations • 1 Sep 2022 • Nadine Behrmann, S. Alireza Golestaneh, Zico Kolter, Juergen Gall, Mehdi Noroozi
This paper introduces a unified framework for video action segmentation via sequence-to-sequence (seq2seq) translation in both fully supervised and timestamp-supervised setups.
Ranked #5 on Action Segmentation on Assembly101
1 code implementation • 27 Jan 2022 • David T. Hoffmann, Nadine Behrmann, Juergen Gall, Thomas Brox, Mehdi Noroozi
This paper introduces Ranking Info Noise Contrastive Estimation (RINCE), a new member in the family of InfoNCE losses that preserves a ranked ordering of positive samples.
no code implementations • 27 Oct 2021 • Saber Pourheydari, Emad Bahrami, Mohsen Fayyaz, Gianpiero Francesca, Mehdi Noroozi, Juergen Gall
While recurrent neural networks (RNNs) demonstrate outstanding capabilities for future video frame prediction, they model dynamics in a discrete time space, i.e., they predict the frames sequentially with a fixed temporal step.
no code implementations • ICCV 2021 • Nadine Behrmann, Mohsen Fayyaz, Juergen Gall, Mehdi Noroozi
We argue that a single representation to capture both types of features is sub-optimal, and propose to decompose the representation space into stationary and non-stationary features via contrastive learning from long and short views, i.e., long video sequences and their shorter sub-sequences.
no code implementations • 3 Dec 2020 • Mehdi Noroozi
This paper introduces a novel and fully unsupervised framework for conditional GAN training in which labels are automatically obtained from data.
1 code implementation • CVPR 2021 • Mohsen Fayyaz, Emad Bahrami, Ali Diba, Mehdi Noroozi, Ehsan Adeli, Luc van Gool, Juergen Gall
While the GFLOPs of a 3D CNN can be decreased by reducing the temporal feature resolution within the network, there is no setting that is optimal for all input clips.
no code implementations • 11 Nov 2020 • Nadine Behrmann, Juergen Gall, Mehdi Noroozi
This paper introduces a novel method for self-supervised video representation learning via feature prediction.
no code implementations • CVPR 2018 • Mehdi Noroozi, Ananth Vinjimoor, Paolo Favaro, Hamed Pirsiavash
We use this framework to design a novel self-supervised task, which achieves state-of-the-art performance on the common benchmarks in PASCAL VOC 2007, ILSVRC12 and Places by a significant margin.
2 code implementations • ICCV 2017 • Mehdi Noroozi, Hamed Pirsiavash, Paolo Favaro
In this paper, we use two image transformations in the context of counting: scaling and tiling.
Ranked #142 on Self-Supervised Image Classification on ImageNet
no code implementations • 22 Aug 2017 • Paramanand Chandramouli, Mehdi Noroozi, Paolo Favaro
In this paper, we address the problem of reflection removal and deblurring from a single image captured by a plenoptic camera.
no code implementations • 5 Jan 2017 • Mehdi Noroozi, Paramanand Chandramouli, Paolo Favaro
The task of image deblurring is a very ill-posed problem as both the image and the blur are unknown.
9 code implementations • 30 Mar 2016 • Mehdi Noroozi, Paolo Favaro
By following the principles of self-supervision, we build a convolutional neural network (CNN) that can be trained to solve Jigsaw puzzles as a pretext task, which requires no manual labeling, and then later repurposed to solve object classification and detection.