Search Results for author: Ali Diba

Found 18 papers, 7 papers with code

Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

1 code implementation • CVPR 2021 • M. Saquib Sarfraz, Naila Murray, Vivek Sharma, Ali Diba, Luc van Gool, Rainer Stiefelhagen

Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos and is an important requirement for many video understanding tasks.

Ranked #1 on Action Segmentation on MPII Cooking 2 Dataset

Action Segmentation Clustering +2

Vi2CLR: Video and Image for Visual Contrastive Learning of Representation

no code implementations • ICCV 2021 • Ali Diba, Vivek Sharma, Reza Safdari, Dariush Lotfi, Saquib Sarfraz, Rainer Stiefelhagen, Luc van Gool

In this paper, we introduce a novel self-supervised visual representation learning method which understands both images and videos in a joint learning fashion.

Action Recognition Clustering +2

3D CNNs with Adaptive Temporal Feature Resolutions

1 code implementation • CVPR 2021 • Mohsen Fayyaz, Emad Bahrami, Ali Diba, Mehdi Noroozi, Ehsan Adeli, Luc van Gool, Juergen Gall

While the GFLOPs of a 3D CNN can be decreased by reducing the temporal feature resolution within the network, there is no setting that is optimal for all input clips.

Action Recognition

Self-Supervised Ranking for Representation Learning

no code implementations • 14 Oct 2020 • Ali Varamesh, Ali Diba, Tinne Tuytelaars, Luc van Gool

We present a new framework for self-supervised representation learning by formulating it as a ranking problem in an image retrieval context on a large number of random views (augmentations) obtained from images.

Clustering Contrastive Learning +5

Large Scale Holistic Video Understanding

1 code implementation • ECCV 2020 • Ali Diba, Mohsen Fayyaz, Vivek Sharma, Manohar Paluri, Jurgen Gall, Rainer Stiefelhagen, Luc van Gool

HVU is organized hierarchically in a semantic taxonomy that focuses on multi-label and multi-task video understanding as a comprehensive problem that encompasses the recognition of multiple semantic aspects in the dynamic scene.

Ranked #11 on Action Recognition on UCF101

Action Classification Action Recognition +7

DynamoNet: Dynamic Action and Motion Network

no code implementations • ICCV 2019 • Ali Diba, Vivek Sharma, Luc van Gool, Rainer Stiefelhagen

To this end, we introduce a novel unified spatio-temporal 3D-CNN architecture (DynamoNet) that jointly optimizes video classification and motion representation learning by predicting future frames as a multi-task learning problem.

Action Recognition Classification +5

Spatio-Temporal Channel Correlation Networks for Action Classification

no code implementations • ECCV 2018 • Ali Diba, Mohsen Fayyaz, Vivek Sharma, M. Mahdi Arzani, Rahman Yousefzadeh, Juergen Gall, Luc van Gool

Our experiments show that adding STC blocks to current state-of-the-art architectures outperforms the state-of-the-art methods on the HMDB51, UCF101 and Kinetics datasets.

Action Classification Classification +1

Classification-Driven Dynamic Image Enhancement

no code implementations • CVPR 2018 • Vivek Sharma, Ali Diba, Davy Neven, Michael S. Brown, Luc van Gool, Rainer Stiefelhagen

In this paper, we are interested in learning CNNs that can emulate image enhancement and restoration, but with the overall goal to improve image classification and not necessarily human perception.

Classification General Classification +3

Weakly Supervised Object Discovery by Generative Adversarial & Ranking Networks

no code implementations • 22 Nov 2017 • Ali Diba, Vivek Sharma, Rainer Stiefelhagen, Luc van Gool

We approach GANs with a novel training method and learning objective, to discover multiple object instances for three cases: 1) synthesizing a picture of a specific object within a cluttered scene; 2) localizing different categories in images for weakly supervised object detection; and 3) improving object discovery in object detection pipelines.

Ranked #2 on Weakly Supervised Object Detection on COCO test-dev

Object object-detection +2

Temporal 3D ConvNets: New Architecture and Transfer Learning for Video Classification

3 code implementations • 22 Nov 2017 • Ali Diba, Mohsen Fayyaz, Vivek Sharma, Amir Hossein Karami, Mohammad Mahdi Arzani, Rahman Yousefzadeh, Luc van Gool

Thus, by finetuning this network, we beat the performance of generic and recent methods in 3D CNNs, which were trained on large video datasets, e.g., Sports-1M, and finetuned on the target datasets, e.g., HMDB51/UCF101.

Action Recognition General Classification +3

Classification Driven Dynamic Image Enhancement

no code implementations • 20 Oct 2017 • Vivek Sharma, Ali Diba, Davy Neven, Michael S. Brown, Luc van Gool, Rainer Stiefelhagen

In this paper, we are interested in learning CNNs that can emulate image enhancement and restoration, but with the overall goal to improve image classification and not necessarily human perception.

Classification General Classification +3

Weakly Supervised Cascaded Convolutional Networks

no code implementations • CVPR 2017 • Ali Diba, Vivek Sharma, Ali Pazandeh, Hamed Pirsiavash, Luc van Gool

The final stage of both architectures is a part of a convolutional neural network that performs multiple instance learning on proposals extracted in the previous stage(s).

Ranked #2 on Weakly Supervised Object Detection on ImageNet

Multiple Instance Learning Object +3

Deep Temporal Linear Encoding Networks

2 code implementations • CVPR 2017 • Ali Diba, Vivek Sharma, Luc van Gool

Advantages of TLEs are: (a) they encode the entire video into a compact feature representation, learning the semantics and a discriminative feature space; (b) they are applicable to all kinds of networks like 2D and 3D CNNs for video classification; and (c) they model feature interactions in a more expressive way and without loss of information.

Representation Learning Video Classification

Efficient Two-Stream Motion and Appearance 3D CNNs for Video Classification

no code implementations • 31 Aug 2016 • Ali Diba, Ali Mohammad Pazandeh, Luc van Gool

Video and action classification have advanced considerably with deep neural networks, especially two-stream CNNs that take RGB and optical flow as inputs and deliver outstanding performance in video analysis.

3D Architecture Action Classification +4

DeepCAMP: Deep Convolutional Action & Attribute Mid-Level Patterns

no code implementations • CVPR 2016 • Ali Diba, Ali Mohammad Pazandeh, Hamed Pirsiavash, Luc van Gool

On the other hand, we let an iteration of feature learning and patch clustering purify the set of dedicated patches that we use.

Attribute Clustering

DeepProposals: Hunting Objects and Actions by Cascading Deep Convolutional Layers

1 code implementation • 15 Jun 2016 • Amir Ghodrati, Ali Diba, Marco Pedersoli, Tinne Tuytelaars, Luc van Gool

In this paper, a new method for generating object and action proposals in images and videos is proposed.

Object

DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers

1 code implementation • ICCV 2015 • Amir Ghodrati, Ali Diba, Marco Pedersoli, Tinne Tuytelaars, Luc van Gool

We generate hypotheses in a sliding-window fashion over different activation layers and show that the final convolutional layers can find the object of interest with high recall but poor localization due to the coarseness of the feature maps.

Object

Multi-attribute Queries: To Merge or Not to Merge?

no code implementations • CVPR 2013 • Mohammad Rastegari, Ali Diba, Devi Parikh, Ali Farhadi

We exploit a discriminative binary space to compute these geometric quantities efficiently.

Attribute Image Retrieval
