Search Results for author: Amirhossein Habibian

Found 15 papers, 5 papers with code

Object-Centric Diffusion for Efficient Video Editing

no code implementations • 11 Jan 2024 • Kumara Kahatapitiya, Adil Karjauv, Davide Abati, Fatih Porikli, Yuki M. Asano, Amirhossein Habibian

Diffusion-based video editing has reached impressive quality and can transform the global style, local structure, and attributes of given video inputs, following textual edit prompts.

Object Video Editing

VaLID: Variable-Length Input Diffusion for Novel View Synthesis

no code implementations • 14 Dec 2023 • Shijie Li, Farhad G. Zanjani, Haitam Ben Yahia, Yuki M. Asano, Juergen Gall, Amirhossein Habibian

This is because the source-view images and corresponding poses are processed separately and injected into the model at different stages.

Image Generation Novel View Synthesis +1

Simple and Efficient Architectures for Semantic Segmentation

1 code implementation • 16 Jun 2022 • Dushyant Mehta, Andrii Skliar, Haitam Ben Yahia, Shubhankar Borse, Fatih Porikli, Amirhossein Habibian, Tijmen Blankevoort

Though the state-of-the-art architectures for semantic segmentation, such as HRNet, demonstrate impressive accuracy, the complexity arising from their salient design choices hinders a range of model acceleration tools, and they further make use of operations that are inefficient on current hardware.

Decoder Image Classification +2

SALISA: Saliency-based Input Sampling for Efficient Video Object Detection

no code implementations • 5 Apr 2022 • Babak Ehteshami Bejnordi, Amirhossein Habibian, Fatih Porikli, Amir Ghodrati

In this paper, we propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection that allows for heavy down-sampling of unimportant background regions while preserving the fine-grained details of a high-resolution image.

Object object-detection +1

Delta Distillation for Efficient Video Processing

1 code implementation • 17 Mar 2022 • Amirhossein Habibian, Haitam Ben Yahia, Davide Abati, Efstratios Gavves, Fatih Porikli

By extensive experiments on a wide range of architectures, including the most efficient ones, we demonstrate that delta distillation sets a new state of the art in terms of accuracy vs. efficiency trade-off for semantic segmentation and object detection in videos.

Knowledge Distillation object-detection +4

Region-of-Interest Based Neural Video Compression

no code implementations • 3 Mar 2022 • Yura Perugachi-Diaz, Guillaume Sautière, Davide Abati, Yang Yang, Amirhossein Habibian, Taco S. Cohen

To the best of our knowledge, our proposals are the first solutions that integrate ROI-based capabilities into neural video compression models.

Quantization Video Compression

Skip-Convolutions for Efficient Video Processing

1 code implementation • CVPR 2021 • Amirhossein Habibian, Davide Abati, Taco S. Cohen, Babak Ehteshami Bejnordi

We reformulate standard convolution to be efficiently computed on residual frames: each layer is coupled with a binary gate deciding whether a residual is important to the model prediction, e.g. foreground regions, or can be safely skipped, e.g. background regions.

Model Compression

Video Compression With Rate-Distortion Autoencoders

no code implementations • ICCV 2019 • Amirhossein Habibian, Ties van Rozendaal, Jakub M. Tomczak, Taco S. Cohen

We employ a model that consists of a 3D autoencoder with a discrete latent space and an autoregressive prior used for entropy coding.

Motion Compensation Semantic Compression +1

Learning Variations in Human Motion via Mix-and-Match Perturbation

no code implementations • 2 Aug 2019 • Mohammad Sadegh Aliakbarian, Fatemeh Sadat Saleh, Mathieu Salzmann, Lars Petersson, Stephen Gould, Amirhossein Habibian

In this paper, we introduce an approach to stochastically combine the root of variations with previous pose information, which forces the model to take the noise into account.

Decoder Human motion prediction +1

VideoStory Embeddings Recognize Events when Examples are Scarce

no code implementations • 8 Nov 2015 • Amirhossein Habibian, Thomas Mensink, Cees G. M. Snoek

In our proposed embedding, which we call VideoStory, the correlations between the terms are utilized to learn a more effective representation by optimizing a joint objective balancing descriptiveness and predictability. We show how learning the VideoStory using a multimodal predictability loss, including appearance, motion and audio features, results in a more predictable representation.

Attribute Event Detection
