1 code implementation • 9 Jun 2022 • Guocheng Qian, Yuchen Li, Houwen Peng, Jinjie Mai, Hasan Abed Al Kader Hammoud, Mohamed Elhoseiny, Bernard Ghanem
In this work, we revisit the classical PointNet++ through a systematic study of model training and scaling strategies, and offer two major contributions.
Ranked #1 on
3D Point Cloud Classification
on ScanObjectNN
1 code implementation • 6 Jun 2022 • Motasem Alfarra, Juan C. Pérez, Egor Shulgin, Peter Richtárik, Bernard Ghanem
However, as in the single-node supervised learning setup, models trained in federated learning suffer from vulnerability to imperceptible input transformations known as adversarial attacks, questioning their deployment in security-related applications.
1 code implementation • 3 Jun 2022 • Kevin Qinghong Lin, Alex Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, RongCheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou
Video-Language Pretraining (VLP), aiming to learn transferable representation to advance a wide range of video-text downstream tasks, has recently received increasing attention.
1 code implementation • 14 May 2022 • Shuming Liu, Mengmeng Xu, Chen Zhao, Xu Zhao, Bernard Ghanem
Interestingly, on ActivityNet-1.3, it reaches 37.78% average mAP, while only requiring 6 mins of training time and 1.23 GB memory based on pre-extracted features.
no code implementations • 4 May 2022 • Zhen Dong, Kaicheng Zhou, Guohao Li, Qiang Zhou, Mingfei Guo, Bernard Ghanem, Kurt Keutzer, Shanghang Zhang
Neural architecture search (NAS) has shown great success in the automatic design of deep neural networks (DNNs).
no code implementations • 26 Apr 2022 • Mengmeng Xu, Erhan Gundogdu, Maksim Lapin, Bernard Ghanem, Michael Donoser, Loris Bazzani
Long-form video understanding requires designing approaches that are able to temporally localize activities or language.
Contrastive Learning
Few Shot Temporal Action Localization
no code implementations • 14 Apr 2022 • Anthony Cioppa, Silvio Giancola, Adrien Deliege, Le Kang, Xin Zhou, Zhiyu Cheng, Bernard Ghanem, Marc Van Droogenbroeck
Tracking objects in soccer videos is extremely important to gather both player and team statistics, whether it is to estimate the total distance run, the ball possession or the team formation.
1 code implementation • CVPR 2022 • Gabriel Pérez S., Juan C. Pérez, Motasem Alfarra, Silvio Giancola, Bernard Ghanem
In this work, we propose 3DeformRS, a method to certify the robustness of point cloud Deep Neural Networks (DNNs) against real-world deformations.
no code implementations • 11 Apr 2022 • Guocheng Qian, Xuanyang Zhang, Guohao Li, Chen Zhao, Yukang Chen, Xiangyu Zhang, Bernard Ghanem, Jian Sun
TNAS performs a modified bi-level Breadth-First Search in the proposed trees to discover a high-performance architecture.
1 code implementation • CVPR 2022 • Maksim Makarenko, Arturo Burguete-Lopez, Qizhou Wang, Fedor Getman, Silvio Giancola, Bernard Ghanem, Andrea Fratalocchi
Hyperspectral imaging has attracted significant attention to identify spectral signatures for image classification and automated pattern recognition in computer vision.
no code implementations • 27 Mar 2022 • Juan Leon Alcazar, Moritz Cordes, Chen Zhao, Bernard Ghanem
Recent advances in the Active Speaker Detection (ASD) problem build upon a two-stage process: feature extraction and spatio-temporal context aggregation.
no code implementations • 24 Mar 2022 • Qiankun Gao, Chen Zhao, Bernard Ghanem, Jian Zhang
After RRL, the classification head is fine-tuned with global class-balanced classification loss to address the data imbalance issue as well as learn the decision boundary between new and previous classes.
no code implementations • 23 Mar 2022 • Bing Li, Cheng Zheng, Guohao Li, Bernard Ghanem
To provide an alternative, we propose a novel approach that utilizes monocular RGB images and point clouds to generate pseudo scene flow labels for training scene flow networks.
no code implementations • 3 Mar 2022 • Chen Zhao, Merey Ramazanova, Mengmeng Xu, Bernard Ghanem
To address these issues and precisely model temporal action detection, we formulate the task of temporal action detection in a novel perspective of semantic segmentation.
no code implementations • 10 Feb 2022 • Merey Ramazanova, Victor Escorcia, Fabian Caba Heilbron, Chen Zhao, Bernard Ghanem
In this work, we take a deep look into the effectiveness of audio in detecting actions in egocentric videos and introduce a simple-yet-effective approach via Observing, Watching, and Listening (OWL) to leverage audio-visual information and context for egocentric TAL.
no code implementations • 10 Feb 2022 • Juan C. Pérez, Motasem Alfarra, Ali Thabet, Pablo Arbeláez, Bernard Ghanem
We propose a methodology for assessing and characterizing the robustness of FRMs against semantic perturbations to their input.
no code implementations • 31 Jan 2022 • Motasem Alfarra, Juan C. Pérez, Anna Frühstück, Philip H. S. Torr, Peter Wonka, Bernard Ghanem
We show the vulnerability of both the generative model and the FID against additive perturbations in the latent space.
no code implementations • CVPR 2022 • Andrés Villa, Kumail Alhamoud, Juan León Alcázar, Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem
We perform in-depth evaluations of existing CL methods in vCLIMB, and observe two unique challenges in video data.
1 code implementation • CVPR 2022 • Anirudh Thatipelli, Sanath Narayan, Salman Khan, Rao Muhammad Anwer, Fahad Shahbaz Khan, Bernard Ghanem
Experiments are performed on four few-shot action recognition benchmarks: Kinetics, SSv2, HMDB51 and UCF101.
no code implementations • NeurIPS 2021 • Mengmeng Xu, Juan Manuel Perez Rua, Xiatian Zhu, Bernard Ghanem, Brais Martinez
This results in a task discrepancy problem for the video encoder – trained for action classification, but used for TAL.
1 code implementation • CVPR 2022 • Mattia Soldan, Alejandro Pardo, Juan León Alcázar, Fabian Caba Heilbron, Chen Zhao, Silvio Giancola, Bernard Ghanem
The recent and increasing interest in video-language research has driven the development of large-scale datasets that enable data-intensive machine learning techniques.
Ranked #1 on
Natural Language Moment Retrieval
on MAD
no code implementations • 30 Nov 2021 • Abdullah Hamdi, Silvio Giancola, Bernard Ghanem
This novel 3D Voint cloud representation combines the compactness of 3D point cloud representation with the natural view-awareness of multi-view representation.
1 code implementation • NeurIPS 2021 • Guocheng Qian, Hasan Abed Al Kader Hammoud, Guohao Li, Ali Thabet, Bernard Ghanem
We then introduce a new Anisotropic Reduction function into our Separable SA module and propose an Anisotropic Separable SA (ASSA) module that substantially increases the network's accuracy.
Ranked #9 on
Semantic Segmentation
on S3DIS Area5
1 code implementation • CVPR 2022 • Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.
1 code implementation • EMNLP 2021 • Jialin Gao, Xin Sun, Mengmeng Xu, Xi Zhou, Bernard Ghanem
Temporal language grounding in videos aims to localize the temporal span relevant to the given query sentence.
no code implementations • 29 Sep 2021 • Motasem Alfarra, Adel Bibi, Philip Torr, Bernard Ghanem
In this work, we revisit Gaussian randomized smoothing and show that the variance of the Gaussian distribution can be optimized at each input so as to maximize the certification radius for the construction of the smooth classifier.
no code implementations • 12 Sep 2021 • Alejandro Pardo, Fabian Caba Heilbron, Juan León Alcázar, Ali Thabet, Bernard Ghanem
Understanding movies and their structural patterns is a crucial task to decode the craft of video editing.
no code implementations • 12 Sep 2021 • Hasan Abed Al Kader Hammoud, Bernard Ghanem
Deep Neural Networks (DNNs) are ubiquitous and span a variety of applications ranging from image classification and facial recognition to medical image analysis and real-time object detection.
1 code implementation • ICCV 2021 • Alejandro Pardo, Fabian Caba Heilbron, Juan León Alcázar, Ali Thabet, Bernard Ghanem
Video content creation keeps growing at an incredible pace; yet, creating engaging stories remains challenging and requires non-trivial video editing expertise.
1 code implementation • 29 Jul 2021 • Juan C. Pérez, Motasem Alfarra, Guillaume Jeanneret, Laura Rueda, Ali Thabet, Bernard Ghanem, Pablo Arbeláez
Deep learning models are prone to being fooled by imperceptible perturbations known as adversarial attacks.
1 code implementation • 9 Jul 2021 • Francisco Eiras, Motasem Alfarra, M. Pawan Kumar, Philip H. S. Torr, Puneet K. Dokania, Bernard Ghanem, Adel Bibi
All prior art on randomized smoothing has focused on isotropic $\ell_p$ certification, which has the advantage of yielding certificates that can be easily compared among isotropic methods via $\ell_p$-norm radius.
2 code implementations • 2 Jul 2021 • Motasem Alfarra, Adel Bibi, Naeemullah Khan, Philip H. S. Torr, Bernard Ghanem
Deep neural networks are vulnerable to input deformations in the form of vector fields of pixel displacements and to other parameterized geometric deformations, e.g., translations, rotations, etc.
4 code implementations • 14 Jun 2021 • Guohao Li, Matthias Müller, Bernard Ghanem, Vladlen Koltun
Deep graph neural networks (GNNs) have achieved excellent results on various tasks on increasingly large graph datasets with millions of nodes and edges.
Ranked #1 on
Node Property Prediction
on ogbn-proteins
1 code implementation • 3 Jun 2021 • Juan Leon Alcazar, Long Mai, Federico Perazzi, Joon-Young Lee, Pablo Arbelaez, Bernard Ghanem, Fabian Caba Heilbron
To showcase the potential of our new dataset, we propose an audiovisual baseline and benchmark for person retrieval.
1 code implementation • 10 May 2021 • Bing Li, Cheng Zheng, Silvio Giancola, Bernard Ghanem
We propose a novel scene flow estimation approach to capture and infer 3D motions from point clouds.
no code implementations • 19 Apr 2021 • Anthony Cioppa, Adrien Deliège, Floriane Magera, Silvio Giancola, Olivier Barnich, Bernard Ghanem, Marc Van Droogenbroeck
Specifically, we distill a powerful commercial calibration tool in a recent neural network architecture on the large-scale SoccerNet dataset, composed of untrimmed broadcast videos of 500 soccer games.
1 code implementation • 14 Apr 2021 • Silvio Giancola, Bernard Ghanem
In this paper, we focus our analysis on action spotting in soccer broadcast, which consists in temporally localizing the main actions in a soccer game.
Ranked #1 on
Action Spotting
on SoccerNet-v2
no code implementations • 28 Mar 2021 • Mengmeng Xu, Juan-Manuel Perez-Rua, Xiatian Zhu, Bernard Ghanem, Brais Martinez
This results in a task discrepancy problem for the video encoder – trained for action classification, but used for TAL.
1 code implementation • ICML Workshop AML 2021 • Motasem Alfarra, Juan C. Pérez, Ali Thabet, Adel Bibi, Philip H. S. Torr, Bernard Ghanem
Deep neural networks are vulnerable to small input perturbations known as adversarial attacks.
3 code implementations • 24 Feb 2021 • Bing Li, Yuanlue Zhu, Yitong Wang, Chia-Wen Lin, Bernard Ghanem, Linlin Shen
Specifically, a new generator architecture is proposed to simultaneously transfer color/texture styles and transform local facial shapes into anime-like counterparts based on the style of a reference anime-face, while preserving the global structure of the source photo-face.
1 code implementation • ICCV 2021 • Juan León-Alcázar, Fabian Caba Heilbron, Ali Thabet, Bernard Ghanem
Active speaker detection requires a solid integration of multi-modal cues.
no code implementations • ICCV 2021 • Bing Li, Chia-Wen Lin, Cheng Zheng, Shan Liu, Junsong Yuan, Bernard Ghanem, C.-C. Jay Kuo
In the second stage, we derive another warping model to refine warping results in less important regions by eliminating serious distortions in shape, disparity and 3D structure.
no code implementations • 1 Jan 2021 • Guohao Li, Chenxin Xiong, Ali Thabet, Bernard Ghanem
We add our generalized aggregation into a deep GCN framework and show it achieves state-of-the-art results in six benchmarks from OGB.
no code implementations • 1 Jan 2021 • Motasem Alfarra, Adel Bibi, Hasan Abed Al Kader Hammoud, Mohamed Gaafar, Bernard Ghanem
This work tackles the problem of characterizing and understanding the decision boundaries of neural networks with piecewise linear non-linearity activations.
no code implementations • 29 Dec 2020 • Hani Itani, Silvio Giancola, Ali Thabet, Bernard Ghanem
Since it is learnable, this mapping is allowed to be different per layer instead of being applied uniformly throughout the depth of the network.
1 code implementation • 8 Dec 2020 • Motasem Alfarra, Adel Bibi, Philip H. S. Torr, Bernard Ghanem
In this work, we revisit Gaussian randomized smoothing and show that the variance of the Gaussian distribution can be optimized at each input so as to maximize the certification radius for the construction of the smooth classifier.
no code implementations • ICCV 2021 • Chen Zhao, Ali Thabet, Bernard Ghanem
In VSS, we focus on a short period of a video and magnify it along the temporal dimension to obtain a larger scale.
Ranked #6 on
Temporal Action Localization
on THUMOS’14
2 code implementations • 26 Nov 2020 • Adrien Deliège, Anthony Cioppa, Silvio Giancola, Meisam J. Seikavandi, Jacob V. Dueholm, Kamal Nasrollahi, Bernard Ghanem, Thomas B. Moeslund, Marc Van Droogenbroeck
In this work, we propose SoccerNet-v2, a novel large-scale corpus of manual annotations for the SoccerNet video dataset, along with open challenges to encourage more research in soccer understanding and broadcast production.
Ranked #1 on
Camera shot segmentation
on SoccerNet-v2
1 code implementation • ICCV 2021 • Abdullah Hamdi, Silvio Giancola, Bernard Ghanem
MVTN exhibits clear performance gains in the tasks of 3D shape classification and 3D shape retrieval without the need for extra training supervision.
Ranked #1 on
3D Object Retrieval
on ShapeNetCore 55
1 code implementation • 23 Nov 2020 • Humam Alwassel, Silvio Giancola, Bernard Ghanem
Extensive experiments show that using features trained with our novel pretraining strategy significantly improves the performance of recent state-of-the-art methods on three tasks: Temporal Action Localization, Action Proposal Generation, and Dense Video Captioning.
1 code implementation • ICCV 2021 • Mengmeng Xu, Juan-Manuel Perez-Rua, Victor Escorcia, Brais Martinez, Xiatian Zhu, Li Zhang, Bernard Ghanem, Tao Xiang
However, most existing models developed for these tasks are pre-trained on general video action classification tasks.
Ranked #12 on
Temporal Action Localization
on ActivityNet-1.3
1 code implementation • 19 Nov 2020 • Mattia Soldan, Mengmeng Xu, Sisi Qu, Jesper Tegner, Bernard Ghanem
Grounding language queries in videos aims at identifying the time interval (or moment) semantically relevant to a language query.
Ranked #1 on
Natural Language Moment Retrieval
on TACoS
3 code implementations • CVPR 2022 • Kezhi Kong, Guohao Li, Mucong Ding, Zuxuan Wu, Chen Zhu, Bernard Ghanem, Gavin Taylor, Tom Goldstein
Data augmentation helps neural networks generalize better by enlarging the training set, but it remains an open question how to effectively augment graph data to enhance the performance of GNNs (Graph Neural Networks).
Ranked #1 on
Graph Property Prediction
on ogbg-ppa
no code implementations • 24 Aug 2020 • Guohao Li, Mengmeng Xu, Silvio Giancola, Ali Thabet, Bernard Ghanem
In this paper, we introduce a new NAS framework, dubbed LC-NAS, where we search for point cloud architectures that are constrained to a target latency.
1 code implementation • 3 Aug 2020 • Samuel Albanie, Yang Liu, Arsha Nagrani, Antoine Miech, Ernesto Coto, Ivan Laptev, Rahul Sukthankar, Bernard Ghanem, Andrew Zisserman, Valentin Gabeur, Chen Sun, Karteek Alahari, Cordelia Schmid, Shi-Zhe Chen, Yida Zhao, Qin Jin, Kaixu Cui, Hui Liu, Chen Wang, Yudong Jiang, Xiaoshuai Hao
This report summarizes the results of the first edition of the challenge together with the findings of the participants.
no code implementations • 10 Jul 2020 • Sisi Qu, Mengmeng Xu, Bernard Ghanem, Jesper Tegner
EDNA uses the diffusion signal as a proxy for computing node similarities between networks.
no code implementations • 21 Jun 2020 • Modar Alfadly, Adel Bibi, Emilio Botero, Salman AlSubaihi, Bernard Ghanem
This has incited research on the reaction of DNNs to noisy input, namely developing adversarial input attacks and strategies that lead to robust DNNs to these attacks.
1 code implementation • 13 Jun 2020 • Motasem Alfarra, Juan C. Pérez, Adel Bibi, Ali Thabet, Pablo Arbeláez, Bernard Ghanem
This paper studies how encouraging semantically-aligned features during deep neural network training can increase network robustness.
3 code implementations • 13 Jun 2020 • Guohao Li, Chenxin Xiong, Ali Thabet, Bernard Ghanem
Graph Convolutional Networks (GCNs) have been drawing significant attention with the power of representation learning on graphs.
Ranked #1 on
Node Property Prediction
on ogbn-proteins
1 code implementation • CVPR 2020 • Juan Leon Alcazar, Fabian Caba Heilbron, Long Mai, Federico Perazzi, Joon-Young Lee, Pablo Arbelaez, Bernard Ghanem
Current methods for active speaker detection focus on modeling short-term audiovisual information from a single speaker.
no code implementations • 3 May 2020 • Motasem Alfarra, Slavomir Hanzely, Alyazeed Albasyoni, Bernard Ghanem, Peter Richtarik
Recent advances in the theoretical understanding of SGD led to a formula for the optimal batch size minimizing the number of effective data passes, i.e., the number of iterations times the batch size.
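As a rough illustration of the quantity named above, the sketch below computes "iterations times batch size" for a few batch sizes and picks the cheapest one. The function name `effective_data_passes` and the iteration counts are hypothetical placeholders chosen for illustration, not results from the paper, and the paper's closed-form optimal batch size is not reproduced here.

```python
def effective_data_passes(iterations: int, batch_size: int) -> int:
    """Cost measure from the text: number of iterations times the batch size."""
    return iterations * batch_size


# Hypothetical measurements (made up for illustration only):
# batch size -> iterations needed to reach some fixed target accuracy.
iterations_needed = {1: 12000, 8: 1200, 32: 350, 128: 260}

# Cost of each batch size under the measure above.
cost = {b: effective_data_passes(t, b) for b, t in iterations_needed.items()}

# The "optimal" batch size is the one with the smallest cost.
best_batch_size = min(cost, key=cost.get)

print(cost)
print("cheapest batch size:", best_batch_size)
```

Larger batches need fewer iterations, but each iteration touches more data, so the product is what decides the trade-off in this toy comparison.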
no code implementations • 20 Feb 2020 • Motasem Alfarra, Adel Bibi, Hasan Hammoud, Mohamed Gaafar, Bernard Ghanem
Our main finding is that the decision boundaries are a subset of a tropical hypersurface, which is intimately related to a polytope formed by the convex hull of two zonotopes.
no code implementations • 6 Feb 2020 • Jean Lahoud, Bernard Ghanem
These labels, denoted by HN-labels, represent different height and normal patches, which allow mining of local semantic information that is useful in the task of semantic RGB segmentation.
Ranked #57 on
Semantic Segmentation
on NYU Depth v2
no code implementations • ICLR 2020 • Modar Alfadly, Adel Bibi, Muhammed Kocabas, Bernard Ghanem
In this work, we propose a new training regularizer that aims to minimize the probabilistic expected training loss of a DNN subject to a generic Gaussian input.
1 code implementation • ECCV 2020 • Juan C. Pérez, Motasem Alfarra, Guillaume Jeanneret, Adel Bibi, Ali Thabet, Bernard Ghanem, Pablo Arbeláez
We revisit the benefits of merging classical vision concepts with deep learning models.
1 code implementation • CVPR 2020 • Anthony Cioppa, Adrien Deliège, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck, Rikke Gade, Thomas B. Moeslund
We benchmark our loss on a large dataset of soccer videos, SoccerNet, and achieve an improvement of 12.8% over the baseline.
Ranked #2 on
Action Spotting
on SoccerNet-v2
1 code implementation • ECCV 2020 • Abdullah Hamdi, Sara Rojas, Ali Thabet, Bernard Ghanem
Our proposed attack increases the attack success rate by up to 40% for those transferred to unseen networks (transferability), while maintaining a high success rate on the attacked network.
1 code implementation • CVPR 2020 • Guohao Li, Guocheng Qian, Itzel C. Delgadillo, Matthias Müller, Ali Thabet, Bernard Ghanem
Architecture design has become a crucial component of successful deep learning.
Ranked #3 on
Node Classification
on PPI
1 code implementation • CVPR 2021 • Guocheng Qian, Abdulellah Abualshour, Guohao Li, Ali Thabet, Bernard Ghanem
We combine Inception DenseGCN with NodeShuffle into a new point upsampling pipeline called PU-GCN.
no code implementations • 30 Nov 2019 • Jia-Hong Huang, Modar Alfadly, Bernard Ghanem, Marcel Worring
In this work, we propose a new method that uses semantically related questions, dubbed basic questions, acting as noise to evaluate the robustness of VQA models.
1 code implementation • NeurIPS 2020 • Humam Alwassel, Dhruv Mahajan, Bruno Korbar, Lorenzo Torresani, Bernard Ghanem, Du Tran
To the best of our knowledge, XDC is the first self-supervised learning method that outperforms large-scale fully-supervised pretraining for action recognition on the same architecture.
no code implementations • 27 Nov 2019 • Jesus Zarzar, Silvio Giancola, Bernard Ghanem
We integrate residual GCNs in a two-stage 3D object detection pipeline, where 3D object proposals are refined using a novel graph representation.
Ranked #14 on
3D Object Detection
on KITTI Cars Easy
5 code implementations • CVPR 2020 • Mengmeng Xu, Chen Zhao, David S. Rojas, Ali Thabet, Bernard Ghanem
In this work, we propose a graph convolutional network (GCN) model to adaptively incorporate multi-level semantic context into video features and cast temporal action detection as a sub-graph localization problem.
Ranked #14 on
Temporal Action Localization
on THUMOS’14
(mAP IOU@0.5 metric)
4 code implementations • 15 Oct 2019 • Guohao Li, Matthias Müller, Guocheng Qian, Itzel C. Delgadillo, Abdulellah Abualshour, Ali Thabet, Bernard Ghanem
This work transfers concepts such as residual/dense connections and dilated convolutions from CNNs to GCNs in order to successfully train very deep GCNs.
Ranked #4 on
3D Semantic Segmentation
on PartNet
no code implementations • 25 Sep 2019 • Salman AlSubaihi, Adel Bibi, Modar Alfadly, Abdullah Hamdi, Bernard Ghanem
al. that bounded input intervals can be inexpensively propagated from layer to layer through deep networks.
no code implementations • 25 Sep 2019 • Motasem Alfarra, Adel Bibi, Hasan Hammoud, Mohamed Gaafar, Bernard Ghanem
We use tropical geometry, a new development in the area of algebraic geometry, to provide a characterization of the decision boundaries of a simple neural network of the form (Affine, ReLU, Affine).
2 code implementations • 30 Jul 2019 • Victor Escorcia, Mattia Soldan, Josef Sivic, Bernard Ghanem, Bryan Russell
We evaluate our approach on two recently proposed datasets for temporal localization of moments in video with natural language (DiDeMo and Charades-STA) extended to our video corpus moment retrieval setting.
1 code implementation • 24 Jul 2019 • Adel Bibi, Baoyuan Wu, Bernard Ghanem
In this paper, we enforce the above two categories into a unified clustering model starting with the integer program formulation of the standard K-means.
no code implementations • ICCV 2019 • Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald
The second goal is to learn instance information by densely estimating directional information of the instance's center of mass for each voxel.
Ranked #2 on
3D Semantic Instance Segmentation
on ScanNetV2
2 code implementations • 28 May 2019 • Salman Al-Subaihi, Adel Bibi, Modar Alfadly, Abdullah Hamdi, Bernard Ghanem
In this paper, we closely examine the bounds of a block of layers composed in the form of Affine-ReLU-Affine.
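For context, here is a minimal sketch of the naive baseline for bounding an Affine-ReLU-Affine block: propagating an input box layer by layer with interval arithmetic. This is the standard interval technique, not the bounds studied in the paper, and all names (`affine_bounds`, `relu_bounds`, `affine_relu_affine_bounds`) and the toy weights are assumptions made for illustration.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def affine_bounds(W: Matrix, b: Vector, lower: Vector, upper: Vector) -> Tuple[Vector, Vector]:
    """Bounds of W x + b when each x[i] lies in [lower[i], upper[i]]."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        lo = hi = bias
        for w, l, u in zip(row, lower, upper):
            # A positive weight maps the lower input to the lower output;
            # a negative weight swaps the roles of the two endpoints.
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def relu_bounds(lower: Vector, upper: Vector) -> Tuple[Vector, Vector]:
    """ReLU is monotone, so it can be applied to each endpoint."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def affine_relu_affine_bounds(W1, b1, W2, b2, lower, upper):
    """Propagate an input box through Affine -> ReLU -> Affine."""
    l, u = affine_bounds(W1, b1, lower, upper)
    l, u = relu_bounds(l, u)
    return affine_bounds(W2, b2, l, u)


# Toy network (hypothetical weights) on the input box [-1, 1] x [-1, 1].
W1 = [[1.0, -1.0], [2.0, 1.0]]
b1 = [0.0, -1.0]
W2 = [[1.0, -1.0]]
b2 = [0.5]
out_lo, out_hi = affine_relu_affine_bounds(W1, b1, W2, b2, [-1.0, -1.0], [1.0, 1.0])
print(out_lo, out_hi)
```

Interval propagation is cheap but can be loose, because it treats every hidden unit as independent of the others; that looseness is the usual motivation for looking at the block as a whole.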
1 code implementation • 9 May 2019 • Baoyuan Wu, Li Shen, Tong Zhang, Bernard Ghanem
Thus, LS-LP is equivalent to the original MAP inference problem.
1 code implementation • 7 May 2019 • Guocheng Qian, Yuanhao Wang, Chao Dong, Jimmy S. Ren, Wolfgang Heidrich, Bernard Ghanem, Jinjin Gu
Such a mixture problem is usually solved by a sequential solution (applying each method independently in a fixed order: DM $\to$ DN $\to$ SR), or is simply tackled by an end-to-end network without enough analysis into interactions among tasks, resulting in an undesired performance drop in the final image quality.
no code implementations • ICLR 2019 • Adel Bibi, Bernard Ghanem, Vladlen Koltun, Rene Ranftl
In particular, we show that a forward pass through a standard dropout layer followed by a linear layer and a non-linear activation is equivalent to optimizing a convex optimization objective with a single iteration of a $\tau$-nice Proximal Stochastic Gradient method.
1 code implementation • 24 Apr 2019 • Modar Alfadly, Adel Bibi, Bernard Ghanem
Despite the impressive performance of deep neural networks (DNNs) on numerous vision tasks, they still exhibit yet-to-understand uncouth behaviours.
no code implementations • 18 Apr 2019 • Matthias Müller, Guohao Li, Vincent Casser, Neil Smith, Dominik L. Michels, Bernard Ghanem
A common approach is to learn an end-to-end policy that directly predicts controls from raw images by imitating an expert.
no code implementations • 16 Apr 2019 • Abdullah Hamdi, Bernard Ghanem
Generative Adversarial Networks (GANs) have gained momentum for their ability to model image distributions.
no code implementations • 11 Apr 2019 • Juan Leon Alcazar, Maria A. Bravo, Ali K. Thabet, Guillaume Jeanneret, Thomas Brox, Pablo Arbelaez, Bernard Ghanem
Instance-level video segmentation requires a solid integration of spatial and temporal information.
no code implementations • 10 Apr 2019 • Alejandro Pardo, Mengmeng Xu, Ali Thabet, Pablo Arbelaez, Bernard Ghanem
We adopt a hybrid supervised learning framework to train the object detector from both these types of annotation.
no code implementations • 10 Apr 2019 • Chen Zhao, Bernard Ghanem
Although deep convolutional neural networks (CNNs) have achieved great success in computer vision tasks, their real-world application is still impeded by their voracious demand for computational resources.
1 code implementation • 9 Apr 2019 • Abdullah Hamdi, Bernard Ghanem
Despite the impressive performance of Deep Neural Networks (DNNs) on various vision tasks, they still exhibit erroneous high sensitivity toward semantic primitives (e.g., object pose).
1 code implementation • ICCV 2019 • Guohao Li, Matthias Müller, Ali Thabet, Bernard Ghanem
Finally, we use these new concepts to build a very deep 56-layer GCN, and show how it significantly boosts performance (+3.7% mIoU over state-of-the-art) in the task of point cloud semantic segmentation.
1 code implementation • 30 Mar 2019 • Ali Thabet, Humam Alwassel, Bernard Ghanem
In fact, we show how Morton features can be used to significantly improve performance (+3% for 2 popular semantic segmentation algorithms) in the task of semantic segmentation of point clouds on the challenging and large-scale S3DIS dataset.
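As background for readers unfamiliar with Morton (Z-order) codes, the sketch below shows the generic bit-interleaving construction for 3D points and uses it to sort a small point set. How the paper turns this ordering into features is not stated in the excerpt above; the helper names (`quantize`, `morton3d`), the 10-bit grid, and the sample points are assumptions for illustration.

```python
def quantize(value: float, lo: float, hi: float, bits: int = 10) -> int:
    """Map a coordinate in [lo, hi] to an integer grid cell in [0, 2**bits - 1]."""
    cells = (1 << bits) - 1
    t = (value - lo) / (hi - lo)
    return min(cells, max(0, int(t * cells)))


def morton3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the bits of x, y, z (x in the lowest position) into one integer."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def morton_key(point, lo: float = 0.0, hi: float = 1.0, bits: int = 10) -> int:
    """Morton code of a 3D point whose coordinates lie in [lo, hi]."""
    x, y, z = (quantize(c, lo, hi, bits) for c in point)
    return morton3d(x, y, z, bits)


# Sorting by Morton code tends to place spatially close points next to each other.
points = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.12, 0.1, 0.1)]
ordered = sorted(points, key=morton_key)
print(ordered)
```

The useful property is that a 1D sort on the codes gives an ordering of an unordered point cloud in which neighbors in the list are often neighbors in space.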
1 code implementation • 30 Mar 2019 • Alejandro Pardo, Humam Alwassel, Fabian Caba Heilbron, Ali Thabet, Bernard Ghanem
RefineLoc shows competitive results with the state-of-the-art in weakly-supervised temporal localization.
Temporal Localization
Weakly Supervised Action Localization
no code implementations • 25 Mar 2019 • Jesus Zarzar, Silvio Giancola, Bernard Ghanem
Successively, we refine our selection of 3D object candidates by exploiting the similarity capability of a 3D Siamese network.
1 code implementation • CVPR 2019 • Silvio Giancola, Jesus Zarzar, Bernard Ghanem
We design a Siamese tracker that encodes model and candidate shapes into a compact latent representation.
1 code implementation • 5 Dec 2018 • Abdullah Hamdi, Matthias Müller, Bernard Ghanem
In contrast, we present a general framework for adversarial attacks on trained agents, which covers semantic perturbations to the environment of the agent performing the task as well as pixel-level attacks.
no code implementations • ECCV 2018 • Yancheng Bai, Yongqiang Zhang, Mingli Ding, Bernard Ghanem
In the MTGAN, the generator is a super-resolution network, which can up-sample small blurred images into fine-scale ones and recover detailed information for more accurate detection.
no code implementations • ECCV 2018 • Xin Yu, Basura Fernando, Bernard Ghanem, Fatih Porikli, Richard Hartley
State-of-the-art face super-resolution methods use deep convolutional neural networks to learn a mapping between low-resolution (LR) facial patterns and their corresponding high-resolution (HR) counterparts by exploring local information.
no code implementations • 11 Aug 2018 • Bernard Ghanem, Juan Carlos Niebles, Cees Snoek, Fabian Caba Heilbron, Humam Alwassel, Victor Escorcia, Ranjay Krishna, Shyamal Buch, Cuong Duc Dao
The guest tasks focused on complementary aspects of the activity recognition problem at large scale and involved three challenging and recently compiled datasets: the Kinetics-600 dataset from Google DeepMind, the AVA dataset from Berkeley and Google, and the Moments in Time dataset from MIT and IBM Research.
1 code implementation • ECCV 2018 • Humam Alwassel, Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem
Despite the recent progress in video understanding and the continuous rate of improvement in temporal action localization throughout the years, it is still unclear how far (or close?)
no code implementations • CVPR 2018 • Adel Bibi, Modar Alfadly, Bernard Ghanem
Moreover, we show how these expressions can be used to systematically construct targeted and non-targeted adversarial attacks.
no code implementations • CVPR 2018 • Yancheng Bai, Yongqiang Zhang, Mingli Ding, Bernard Ghanem
In this paper, we proposed an algorithm to directly generate a clear high-resolution face from a blurry small one by adopting a generative adversarial network (GAN).
no code implementations • CVPR 2018 • Yongqiang Zhang, Yancheng Bai, Mingli Ding, Yongqiang Li, Bernard Ghanem
Finally, we use these pseudo ground-truths to train a fully-supervised detector.
no code implementations • 25 Apr 2018 • Matthias Müller, Alexey Dosovitskiy, Bernard Ghanem, Vladlen Koltun
Simulation can help end-to-end driving systems by providing a cheap, safe, and diverse training environment.
2 code implementations • 12 Apr 2018 • Silvio Giancola, Mohieddine Amine, Tarek Dghaily, Bernard Ghanem
A total of 6,637 temporal annotations are automatically parsed from online match reports at a one-minute resolution for three main classes of events (Goal, Yellow/Red Card, and Substitution).
Ranked #6 on
Action Spotting
on SoccerNet
no code implementations • 8 Apr 2018 • Lama Affara, Bernard Ghanem, Peter Wonka
Convolutional Sparse Coding (CSC) is a well-established image representation model especially suited for image restoration tasks.
2 code implementations • 5 Apr 2018 • Victor Escorcia, Cuong D. Dao, Mihir Jain, Bernard Ghanem, Cees Snoek
Second, we propose an actor-based attention mechanism that enables the localization of the actions from action class labels and actor proposals and is end-to-end trainable.
no code implementations • 31 Mar 2018 • Baoyuan Wu, Fan Jia, Wei Liu, Bernard Ghanem, Siwei Lyu
This work focuses on the problem of multi-label learning with missing labels (MLML), which aims to label each test instance with multiple class labels given training instances that have an incomplete/partial set of these labels.
no code implementations • CVPR 2018 • Baoyuan Wu, Weidong Chen, Peng Sun, Wei Liu, Bernard Ghanem, Siwei Lyu
In D2IA, we generate a relevant and distinct tag subset, in which the tags are relevant to the image contents and semantically distinct from each other, using sequential sampling from a determinantal point process (DPP) model.
1 code implementation • ECCV 2018 • Matthias Müller, Adel Bibi, Silvio Giancola, Salman Al-Subaihi, Bernard Ghanem
In this work, we present TrackingNet, the first large-scale dataset and benchmark for object tracking in the wild.
no code implementations • 3 Mar 2018 • Guohao Li, Matthias Müller, Vincent Casser, Neil Smith, Dominik L. Michels, Bernard Ghanem
Recent work has explored the problem of autonomous navigation by imitating a teacher and learning an end-to-end policy, which directly predicts controls from raw images.
no code implementations • 28 Jan 2018 • Yancheng Bai, Huijuan Xu, Kate Saenko, Bernard Ghanem
In this paper, we propose the contextual multi-scale region convolutional 3D network (CMS-RC3D) for activity detection.
no code implementations • 16 Nov 2017 • Jia-Hong Huang, Cuong Duc Dao, Modar Alfadly, Bernard Ghanem
In VQA, adversarial attacks can target the image and/or the proposed main question, and yet there is a lack of proper analysis of the latter.
no code implementations • 22 Oct 2017 • Bernard Ghanem, Juan Carlos Niebles, Cees Snoek, Fabian Caba Heilbron, Humam Alwassel, Ranjay Khrisna, Victor Escorcia, Kenji Hata, Shyamal Buch
The ActivityNet Large Scale Activity Recognition Challenge 2017 Summary: results and challenge participants' papers.
no code implementations • ICCV 2017 • Sara Shaheen, Lama Affara, Bernard Ghanem
The process of drawing a line drawing can be approximated as the sparse spatial localization of a number of typical basic strokes, which in turn can be cast as a non-standard CSC model that considers the line drawing formation process from parametric curves.
no code implementations • ICCV 2017 • Adel Bibi, Bernard Ghanem
Convolutional sparse coding (CSC) has gained attention for its successful role as a reconstruction and a classification tool in the computer vision and machine learning community.
no code implementations • ICCV 2017 • Jean Lahoud, Bernard Ghanem
We then use the 3D information to orient, place, and score bounding boxes around objects.
Ranked #2 on
Object Detection In Indoor Scenes
on SUN RGB-D
no code implementations • 27 Sep 2017 • Lama Affara, Bernard Ghanem, Peter Wonka
Convolutional sparse coding (CSC) is an important building block of many computer vision applications ranging from image and video compression to deep learning.
no code implementations • 14 Sep 2017 • Jia-Hong Huang, Cuong Duc Dao, Modar Alfadly, C. Huck Yang, Bernard Ghanem
Visual Question Answering (VQA) models should have both high robustness and accuracy.
no code implementations • 19 Aug 2017 • Matthias Müller, Vincent Casser, Neil Smith, Dominik L. Michels, Bernard Ghanem
Automating the navigation of unmanned aerial vehicles (UAVs) in diverse scenarios has gained much attention in recent years.
no code implementations • 19 Aug 2017 • Matthias Müller, Vincent Casser, Jean Lahoud, Neil Smith, Bernard Ghanem
We present a photo-realistic training and evaluation simulator (Sim4CV) with extensive applications across various fields of computer vision.
no code implementations • 11 Aug 2017 • Abdullah Hamdi, Bernard Ghanem
Kernel Correlation Filters have proven to be a very promising scheme for visual tracking in terms of speed and accuracy on several benchmarks.
no code implementations • 20 Jul 2017 • Yancheng Bai, Bernard Ghanem
We test our MB-FCN detector on two public face detection benchmarks, including FDDB and WIDER FACE.
no code implementations • CVPR 2017 • Matthias Mueller, Neil Smith, Bernard Ghanem
Correlation filter (CF) based trackers have recently gained a lot of popularity due to their impressive performance on benchmark datasets, while maintaining high frame rates.
no code implementations • CVPR 2017 • Adel Bibi, Hani Itani, Bernard Ghanem
Since all operations in our FFTLasso method are element-wise, the subproblems are completely independent and can be trivially parallelized (e.g., on a GPU).
no code implementations • CVPR 2017 • Fabian Caba Heilbron, Wayner Barrios, Victor Escorcia, Bernard Ghanem
Despite the recent advances in large-scale video analysis, action detection remains one of the most challenging unsolved problems in computer vision.
no code implementations • CVPR 2017 • Ganzhao Yuan, Wei-Shi Zheng, Bernard Ghanem
Incorporating a new Gaussian elimination procedure, the matrix splitting method achieves state-of-the-art performance.
1 code implementation • CVPR 2017 • Shyamal Buch, Victor Escorcia, Chuanqi Shen, Bernard Ghanem, Juan Carlos Niebles
Our paper presents a new approach for temporal detection of human actions in long, untrimmed video sequences.
no code implementations • CVPR 2017 • Baoyuan Wu, Fan Jia, Wei Liu, Bernard Ghanem
To this end, we treat the image annotation as a subset selection problem based on the conditional determinantal point process (DPP) model, which formulates the representation and diversity jointly.
1 code implementation • CVPR 2018 • Jian Zhang, Bernard Ghanem
With the aim of developing a fast yet accurate algorithm for compressive sensing (CS) reconstruction of natural images, we combine in this paper the merits of two existing categories of CS methods: the structure insights of traditional optimization-based methods and the speed of recent network-based ones.
1 code implementation • ECCV 2018 • Humam Alwassel, Fabian Caba Heilbron, Bernard Ghanem
To address this need, we propose the new problem of action spotting in video, which we define as finding a specific action in a video while observing a small portion of that video.
no code implementations • 19 Mar 2017 • Jia-Hong Huang, Modar Alfadly, Bernard Ghanem
Given a natural language question about an image, the first module takes the question as input and then outputs the basic questions of the main given question.
no code implementations • CVPR 2016 • Fabian Caba Heilbron, Juan Carlos Niebles, Bernard Ghanem
In many large-scale video analysis scenarios, one is interested in localizing and recognizing human activities that occur in short temporal intervals within long untrimmed videos.
no code implementations • CVPR 2016 • Adel Bibi, Tianzhu Zhang, Bernard Ghanem
In this paper, we present a part-based sparse tracker in a particle filter framework where both the motion and appearance model are formulated in 3D.
no code implementations • CVPR 2016 • Tianzhu Zhang, Adel Bibi, Bernard Ghanem
Sparse representation has been introduced to visual tracking by finding the best target candidate with minimal reconstruction error within the particle filter framework.
no code implementations • 26 Apr 2016 • Baoyuan Wu, Bernard Ghanem
This paper revisits the integer programming (IP) problem, which plays a fundamental role in many computer vision and machine learning applications.
no code implementations • ICCV 2015 • Mohammed Hachama, Bernard Ghanem, Peter Wonka
In this paper, we address the problem of computing an intrinsic decomposition of the colors of a surface into an albedo and a shading term.
no code implementations • ICCV 2015 • Rachit Dubey, Joshua Peterson, Aditya Khosla, Ming-Hsuan Yang, Bernard Ghanem
We augment both the images and object segmentations from the PASCAL-S dataset with ground truth memorability scores and shed light on the various factors and properties that make an object memorable (or forgettable) to humans.
no code implementations • ICCV 2015 • Baoyuan Wu, Siwei Lyu, Bernard Ghanem
This work focuses on the problem of multi-label learning with missing labels (MLML), which aims to label each test instance with multiple class labels given training instances that have an incomplete/partial set of these labels (i.e., some of their labels are missing).
no code implementations • CVPR 2015 • Victor Escorcia, Juan Carlos Niebles, Bernard Ghanem
One of the cornerstone principles of deep models is their abstraction capacity, i.e., their ability to learn abstract concepts from 'simpler' ones.
1 code implementation • CVPR 2015 • Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem, Juan Carlos Niebles
In spite of many dataset efforts for human action recognition, current computer vision algorithms are still severely limited in terms of the variability and complexity of the actions that they can recognize.
no code implementations • CVPR 2015 • Tianzhu Zhang, Si Liu, Changsheng Xu, Shuicheng Yan, Bernard Ghanem, Narendra Ahuja, Ming-Hsuan Yang
Sparse representation has been applied to visual tracking by finding the best target candidate with minimal reconstruction error by use of target templates.
no code implementations • CVPR 2015 • Ganzhao Yuan, Bernard Ghanem
This paper focuses on TV for image restoration in the presence of impulse noise.
no code implementations • CVPR 2015 • Bernard Ghanem, Ali Thabet, Juan Carlos Niebles, Fabian Caba Heilbron
This paper proposes a new framework for estimating the Manhattan Frame (MF) of an indoor scene from a single RGB-D image.
no code implementations • 9 Mar 2015 • Muhammad Uzair, Faisal Shafait, Bernard Ghanem, Ajmal Mian
Efficient and accurate joint representation of a collection of images that belong to the same class is a major research challenge for practical image set classification.