no code implementations • ICCV 2023 • Wenpeng Xiao, Wentao Liu, Yitong Wang, Bernard Ghanem, Bing Li
Considering the complexity of hair structure, we innovatively treat hair wisp extraction as an instance segmentation problem, where a hair wisp is referred to as an instance.
1 code implementation • 12 Sep 2023 • Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, Chen Chen, Fabian Deuser, Feng Yan, Fufu Yu, Gal Shitrit, Guanshuo Wang, Gyusik Choi, Hankyul Kim, Hao Guo, Hasby Fahrudin, Hidenari Koguchi, Håkan Ardö, Ibrahim Salah, Ido Yerushalmy, Iftikar Muhammad, Ikuma Uchida, Ishay Be'ery, Jaonary Rabarisoa, Jeongae Lee, Jiajun Fu, Jianqin Yin, Jinghang Xu, Jongho Nang, Julien Denize, Junjie Li, Junpei Zhang, Juntae Kim, Kamil Synowiec, Kenji Kobayashi, Kexin Zhang, Konrad Habel, Kota Nakajima, Licheng Jiao, Lin Ma, Lizhi Wang, Luping Wang, Menglong Li, Mengying Zhou, Mohamed Nasr, Mohamed Abdelwahed, Mykola Liashuha, Nikolay Falaleev, Norbert Oswald, Qiong Jia, Quoc-Cuong Pham, Ran Song, Romain Hérault, Rui Peng, Ruilong Chen, Ruixuan Liu, Ruslan Baikulov, Ryuto Fukushima, Sergio Escalera, Seungcheon Lee, Shimin Chen, Shouhong Ding, Taiga Someya, Thomas B. Moeslund, Tianjiao Li, Wei Shen, Wei zhang, Wei Li, Wei Dai, Weixin Luo, Wending Zhao, Wenjie Zhang, Xinquan Yang, Yanbiao Ma, Yeeun Joo, Yingsen Zeng, Yiyang Gan, Yongqiang Zhu, Yujie Zhong, Zheng Ruan, Zhiheng Li, Zhijian Huang, Ziyu Meng
More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org.
no code implementations • 11 Sep 2023 • Santiago Rivier, Carlos Hinojosa, Silvio Giancola, Bernard Ghanem
In this work, we present a weakly supervised learning algorithm to train semantic segmentation algorithms that only rely on query point annotations instead of full mask labels.
no code implementations • 28 Aug 2023 • Juan Leon-Alcazar, Yazeed Alnumay, Cheng Zheng, Hassane Trigui, Sahejad Patel, Bernard Ghanem
We propose a two-stage CNN pipeline that identifies the key structural components of an analog gauge and outputs an angular reading.
no code implementations • 22 Aug 2023 • Yuxuan Du, Yibo Yang, Tongliang Liu, Zhouchen Lin, Bernard Ghanem, DaCheng Tao
Understanding the dynamics of large quantum systems is hindered by the curse of dimensionality.
1 code implementation • ICCV 2023 • Haozhe Liu, Mingchen Zhuge, Bing Li, Yuhui Wang, Francesco Faccio, Bernard Ghanem, Jürgen Schmidhuber
Recent work on deep reinforcement learning (DRL) has pointed out that algorithmic information about good policies can be extracted from offline data which lack explicit information about executed actions.
1 code implementation • 10 Aug 2023 • Yangyang Xu, Yibo Yang, Bernard Ghanem, Lefei Zhang, Du Bo, DaCheng Tao
In this work, we present a novel MTL model that combines the merits of both deformable CNNs and query-based Transformers with shared gating for multi-task learning of dense prediction.
2 code implementations • 3 Aug 2023 • Yibo Yang, Haobo Yuan, Xiangtai Li, Jianlong Wu, Lefei Zhang, Zhouchen Lin, Philip Torr, DaCheng Tao, Bernard Ghanem
Beyond the normal case, long-tail class incremental learning and few-shot class incremental learning are also proposed to consider the data imbalance and data scarcity, respectively, which are common in real-world implementations and further exacerbate the well-known problem of catastrophic forgetting.
class-incremental learning
Few-Shot Class-Incremental Learning
1 code implementation • 30 Jun 2023 • Guocheng Qian, Jinjie Mai, Abdullah Hamdi, Jian Ren, Aliaksandr Siarohin, Bing Li, Hsin-Ying Lee, Ivan Skorokhodov, Peter Wonka, Sergey Tulyakov, Bernard Ghanem
We present Magic123, a two-stage coarse-to-fine approach for generating high-quality, textured 3D meshes from a single unposed image in the wild using both 2D and 3D priors.
1 code implementation • 28 Jun 2023 • Jianzong Wu, Xiangtai Li, Shilin Xu, Haobo Yuan, Henghui Ding, Yibo Yang, Xia Li, Jiangning Zhang, Yunhai Tong, Xudong Jiang, Bernard Ghanem, DaCheng Tao
To our knowledge, this is the first comprehensive literature review of open vocabulary learning.
no code implementations • 15 Jun 2023 • Juan C. Pérez, Sara Rojas, Jesus Zarzar, Bernard Ghanem
We found that introducing image augmentations during training presents challenges such as geometric and photometric inconsistencies for learning NRMs from images.
no code implementations • 13 Jun 2023 • Wentian Zhang, Haozhe Liu, Bing Li, Jinheng Xie, Yawen Huang, Yuexiang Li, Yefeng Zheng, Bernard Ghanem
By treating the generated data in training as a stream, we propose to detect whether the discriminator slows down the learning of new knowledge in generated data.
no code implementations • 1 Jun 2023 • Jun Chen, Deyao Zhu, Guocheng Qian, Bernard Ghanem, Zhicheng Yan, Chenchen Zhu, Fanyi Xiao, Mohamed Elhoseiny, Sean Chang Culatana
Although these VL models have acquired extensive knowledge of visual concepts, it is non-trivial to exploit that knowledge for semantic segmentation, as they are usually trained at the image level.
Open Vocabulary Semantic Segmentation
Semantic Segmentation
no code implementations • 28 May 2023 • Lama Alssum, Juan Leon Alcazar, Merey Ramazanova, Chen Zhao, Bernard Ghanem
Studying continual learning in the video domain poses even more challenges, as video data contains a large number of frames, which places a higher burden on the replay memory.
no code implementations • 26 May 2023 • Junting Chen, Guohao Li, Suryansh Kumar, Bernard Ghanem, Fisher Yu
Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.
no code implementations • 26 May 2023 • Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R. Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanić, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, Jürgen Schmidhuber
What should be the social structure of an NLSOM?
1 code implementation • ICCV 2023 • Hasan Abed Al Kader Hammoud, Ameya Prabhu, Ser-Nam Lim, Philip H. S. Torr, Adel Bibi, Bernard Ghanem
We revisit the common practice of evaluating adaptation of Online Continual Learning (OCL) algorithms through the metric of online accuracy, which measures the accuracy of the model on the immediate next few samples.
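The online accuracy metric described above can be sketched as a predict-then-update loop over a stream. This is a minimal illustration, not the authors' evaluation code: the `online_accuracy` function, the batch layout, and the toy `LastLabelPredictor` are all hypothetical names introduced here.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Batch = Tuple[Sequence[int], Sequence[int]]  # (inputs, labels); layout is an assumption


def online_accuracy(
    stream: Iterable[Batch],
    predict: Callable[[Sequence[int]], List[int]],
    update: Callable[[Sequence[int], Sequence[int]], None],
) -> float:
    """Score each incoming batch BEFORE training on it, then average over the stream."""
    correct = 0
    seen = 0
    for inputs, labels in stream:
        preds = predict(inputs)  # evaluate on the immediate next samples
        correct += sum(int(p == y) for p, y in zip(preds, labels))
        seen += len(labels)
        update(inputs, labels)  # only now is the model allowed to learn from them
    return correct / seen if seen else 0.0


class LastLabelPredictor:
    """Hypothetical toy learner: always predicts the most recent label it was shown."""

    def __init__(self) -> None:
        self.last_label = 0

    def predict(self, inputs: Sequence[int]) -> List[int]:
        return [self.last_label for _ in inputs]

    def update(self, inputs: Sequence[int], labels: Sequence[int]) -> None:
        if labels:
            self.last_label = labels[-1]
```

Any learner exposing `predict` and `update` callables could be passed in the same way; the toy predictor is only there to make the loop runnable.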
1 code implementation • CVPR 2023 • Chong Mou, Youmin Xu, Jiechong Song, Chen Zhao, Bernard Ghanem, Jian Zhang
For large capacity, we present a reversible pipeline that hides and recovers multiple videos through a single invertible neural network (INN).
no code implementations • 19 Apr 2023 • Jinjie Mai, Jun Chen, Bing Li, Guocheng Qian, Mohamed Elhoseiny, Bernard Ghanem
In this paper, we propose a novel and generalizable framework called LLM-Brain: using a Large-scale Language Model as a robotic brain to unify egocentric memory and control.
no code implementations • 10 Apr 2023 • Hassan Mkhallati, Anthony Cioppa, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck
By providing broadcasters with a tool to summarize the content of their video with the same level of engagement as a live game, our method could help satisfy the needs of the numerous fans who follow their team but cannot necessarily watch the live game.
no code implementations • 10 Apr 2023 • Jan Held, Anthony Cioppa, Silvio Giancola, Abdullah Hamdi, Bernard Ghanem, Marc Van Droogenbroeck
The Video Assistant Referee (VAR) has revolutionized association football, enabling referees to review incidents on the pitch, make informed decisions, and ensure fairness.
1 code implementation • 10 Apr 2023 • Motasem Alfarra, Hani Itani, Alejandro Pardo, Shyma Alhuwaider, Merey Ramazanova, Juan C. Pérez, Zhipeng Cai, Matthias Müller, Bernard Ghanem
To address this issue, we propose a more realistic evaluation protocol for TTA methods, where data is received in an online fashion from a constant-speed data stream, thereby accounting for the method's adaptation speed.
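One way to picture a constant-speed stream is to assume the stream emits one batch per tick while a method needs `cost` ticks for each adaptation step, so a slower method can adapt on fewer batches. The sketch below encodes only that assumption; the function name, the `cost` parameter, and the skipping rule are hypothetical and are not taken from the paper.

```python
from typing import List


def adapted_batches(num_batches: int, cost: int) -> List[int]:
    """Indices of stream batches a method can adapt on.

    Assumption: each adaptation step occupies `cost` stream ticks, and batches
    arriving while the method is busy are not used for adaptation.
    cost=1 means the method keeps up with the stream.
    """
    if cost < 1:
        raise ValueError("cost must be >= 1")
    return list(range(0, num_batches, cost))


def adaptation_ratio(num_batches: int, cost: int) -> float:
    """Fraction of the stream the method actually adapts on."""
    if num_batches == 0:
        return 0.0
    return len(adapted_batches(num_batches, cost)) / num_batches
```

Under this toy model a method three times slower than the stream adapts on roughly a third of the data, which is the kind of penalty an offline protocol would not expose.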
no code implementations • 9 Apr 2023 • Silvio Giancola, Anthony Cioppa, Julia Georgieva, Johsan Billingham, Andreas Serner, Kerry Peek, Bernard Ghanem, Marc Van Droogenbroeck
In this paper, we propose an active learning framework that selects the most informative video samples to be annotated next, thus drastically reducing the annotation effort and accelerating the training of action spotting models to reach the highest accuracy at a faster pace.
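Selecting the most informative samples can be illustrated with a generic uncertainty-based rule. The excerpt does not state which selection criterion the framework uses, so the entropy score below is an assumption, and `select_for_annotation` and the video identifiers are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_annotation(
    predictions: Dict[str, Sequence[float]], budget: int
) -> List[str]:
    """Return the `budget` unlabeled samples with the most uncertain predictions.

    Assumption: uncertainty is measured by entropy; other criteria are possible.
    """
    ranked = sorted(
        predictions, key=lambda vid: entropy(predictions[vid]), reverse=True
    )
    return ranked[:budget]
```

In an active learning loop, the selected samples would be sent to annotators, added to the training set, and the model retrained before the next selection round.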
no code implementations • 6 Apr 2023 • Jia-Hong Huang, Modar Alfadly, Bernard Ghanem, Marcel Worring
This work proposes a new method that utilizes semantically related questions, referred to as basic questions, acting as noise to evaluate the robustness of VQA models.
no code implementations • 6 Apr 2023 • Mengmeng Xu, Mattia Soldan, Jialin Gao, Shuming Liu, Juan-Manuel Pérez-Rúa, Bernard Ghanem
To alleviate the boundary ambiguity, we propose to study the video activity localization problem from a denoising perspective.
1 code implementation • 3 Apr 2023 • Joachim Houyon, Anthony Cioppa, Yasir Ghunaim, Motasem Alfarra, Anaïs Halin, Maxim Henry, Bernard Ghanem, Marc Van Droogenbroeck
In this paper, we propose a solution to this issue by leveraging the power of continual learning methods to reduce the impact of domain shifts.
2 code implementations • 31 Mar 2023 • Guohao Li, Hasan Abed Al Kader Hammoud, Hani Itani, Dmitrii Khizbullin, Bernard Ghanem
To address the challenges of achieving autonomous cooperation, we propose a novel communicative agent framework named role-playing.
no code implementations • 23 Mar 2023 • Hasan Abed Al Kader Hammoud, Adel Bibi, Philip H. S. Torr, Bernard Ghanem
In this paper we investigate the frequency sensitivity of Deep Neural Networks (DNNs) when presented with clean samples versus poisoned samples.
1 code implementation • CVPR 2023 • Ameya Prabhu, Hasan Abed Al Kader Hammoud, Puneet Dokania, Philip H. S. Torr, Ser-Nam Lim, Bernard Ghanem, Adel Bibi
Our conclusions are consistent across different numbers of stream time steps, e.g., 20 to 200, and under several computational budgets.
1 code implementation • ICCV 2023 • Qiankun Gao, Chen Zhao, Yifan Sun, Teng Xi, Gang Zhang, Bernard Ghanem, Jian Zhang
1) Learning: the pre-trained model adapts to the new task by tuning an online PET module, along with our adaptation speed calibration to align different PET modules; 2) Accumulation: the task-specific knowledge learned by the online PET module is accumulated into an offline PET module through momentum update; 3) Ensemble: during inference, we construct two experts with the online and offline PET modules (favored by novel and historical tasks, respectively) for a prediction ensemble.
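The accumulation and ensemble steps can be sketched on plain lists of floats. This is a hedged illustration: it assumes the momentum update is an exponential moving average and that the two experts are combined by averaging their class probabilities, neither of which is specified in the excerpt. The function names and the default `momentum` value are hypothetical.

```python
from typing import List


def momentum_update(
    offline: List[float], online: List[float], momentum: float = 0.99
) -> List[float]:
    """Accumulate online parameters into the offline ones.

    Assumption: an exponential moving average,
    offline <- momentum * offline + (1 - momentum) * online.
    """
    return [momentum * off + (1.0 - momentum) * on for off, on in zip(offline, online)]


def ensemble(p_online: List[float], p_offline: List[float]) -> List[float]:
    """Combine the two experts' class probabilities.

    Assumption: a simple average; the actual combination rule may differ.
    """
    return [(a + b) / 2.0 for a, b in zip(p_online, p_offline)]


def predict(p_online: List[float], p_offline: List[float]) -> int:
    """Index of the class with the highest ensembled probability."""
    combined = ensemble(p_online, p_offline)
    return max(range(len(combined)), key=combined.__getitem__)
```

With a momentum close to 1 the offline module changes slowly, which matches the intuition that it should favor historical tasks while the online module tracks the newest one.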
1 code implementation • ICCV 2023 • Jiwen Yu, Yinhuai Wang, Chen Zhao, Bernard Ghanem, Jian Zhang
In this work, we propose a training-Free conditional Diffusion Model (FreeDoM) used for various conditions.
1 code implementation • ICCV 2023 • Sara Rojas, Jesus Zarzar, Juan Camilo Perez, Artsiom Sanakoyeu, Ali Thabet, Albert Pumarola, Bernard Ghanem
Re-ReND is designed to achieve real-time performance by converting the NeRF into a representation that can be efficiently processed by standard graphics pipelines.
1 code implementation • 2 Mar 2023 • Haozhe Liu, Wentian Zhang, Bing Li, Haoqian Wu, Nanjun He, Yawen Huang, Yuexiang Li, Bernard Ghanem, Yefeng Zheng
The evaluation results demonstrate that our AdaptiveMix can facilitate the training of GANs and effectively improve the image quality of generated samples.
1 code implementation • ICCV 2023 • Wayner Barrios, Mattia Soldan, Fabian Caba Heilbron, Alberto Mario Ceballos-Arroyo, Bernard Ghanem
The recent introduction of the large-scale long-form MAD dataset for language grounding in videos has enabled researchers to investigate the performance of current state-of-the-art methods in the long-form setup, with unexpected findings.
Ranked #1 on Natural Language Moment Retrieval on MAD
Natural Language Moment Retrieval
Natural Language Visual Grounding
1 code implementation • CVPR 2023 • Yasir Ghunaim, Adel Bibi, Kumail Alhamoud, Motasem Alfarra, Hasan Abed Al Kader Hammoud, Ameya Prabhu, Philip H. S. Torr, Bernard Ghanem
We show that a simple baseline outperforms state-of-the-art CL methods under this evaluation, questioning the applicability of existing methods in realistic settings.
no code implementations • 3 Jan 2023 • Hasan Abed Al Kader Hammoud, Shuming Liu, Mohammed Alkhrashi, Fahad Albalawi, Bernard Ghanem
Although backdoor attacks have been extensively studied in the image domain, there are very few works that explore such attacks in the video domain, and they tend to conclude that image backdoor attacks are less effective in the video domain.
no code implementations • CVPR 2023 • Haoqian Wu, Keyu Chen, Haozhe Liu, Mingchen Zhuge, Bing Li, Ruizhi Qiao, Xiujun Shu, Bei Gan, Liangsheng Xu, Bo Ren, Mengmeng Xu, Wentian Zhang, Raghavendra Ramachandra, Chia-Wen Lin, Bernard Ghanem
Temporal video segmentation is the go-to approach in automatic video analysis; it decomposes a long-form video into smaller components for subsequent understanding tasks.
no code implementations • CVPR 2023 • Chen Zhao, Shuming Liu, Karttikeya Mangalam, Bernard Ghanem
Temporal action localization (TAL) requires long-form reasoning to predict actions of various durations and complex content.
no code implementations • ICCV 2023 • Jun Chen, Deyao Zhu, Guocheng Qian, Bernard Ghanem, Zhicheng Yan, Chenchen Zhu, Fanyi Xiao, Sean Chang Culatana, Mohamed Elhoseiny
Semantic segmentation is a crucial task in computer vision that involves segmenting images into semantically meaningful regions at the pixel level.
Open Vocabulary Semantic Segmentation
Semantic Segmentation
1 code implementation • CVPR 2023 • Haozhe Liu, Wentian Zhang, Bing Li, Haoqian Wu, Nanjun He, Yawen Huang, Yuexiang Li, Bernard Ghanem, Yefeng Zheng
The evaluation results demonstrate that our AdaptiveMix can facilitate the training of GANs and effectively improve the image quality of generated samples.
1 code implementation • 27 Dec 2022 • Abdullah Hamdi, Faisal AlZahrani, Silvio Giancola, Bernard Ghanem
Multi-view projection techniques have shown themselves to be highly effective in achieving top-performing results in the recognition of 3D shapes.
1 code implementation • 18 Dec 2022 • Abdullah Hamdi, Bernard Ghanem, Matthias Nießner
SuRFNet employs partial SRFs from few/one images and a specialized SRF loss to learn to generate high-quality sparse voxel radiance fields that can be rendered from novel views.
1 code implementation • ICCV 2023 • Jinjie Mai, Abdullah Hamdi, Silvio Giancola, Chen Zhao, Bernard Ghanem
Yet, we point out that the low number of camera poses caused by camera re-localization in previous VQ3D methods severely hinders their overall success rate.
no code implementations • CVPR 2023 • Andrés Villa, Juan León Alcázar, Motasem Alfarra, Kumail Alhamoud, Julio Hurtado, Fabian Caba Heilbron, Alvaro Soto, Bernard Ghanem
In this paper, we address the problem of continual learning for video data.
no code implementations • 29 Nov 2022 • Salman AlSubaihi, Mohammed Alkhrashi, Raied Aljadaany, Fahad Albalawi, Bernard Ghanem
We provide two variants of PermLL in this paper: one applies the permutation layer to the model's prediction, while the other applies it directly to the given noisy label.
no code implementations • 29 Nov 2022 • Motasem Alfarra, Zhipeng Cai, Adel Bibi, Bernard Ghanem, Matthias Müller
In ODICS, the model is continually presented with batches of densely labeled images from different domains; computation is limited and no information about the task boundaries is available.
1 code implementation • 27 Nov 2022 • Sauradip Nag, Mengmeng Xu, Xiatian Zhu, Juan-Manuel Perez-Rua, Bernard Ghanem, Yi-Zhe Song, Tao Xiang
In this work, we introduce a new multi-modality few-shot (MMFS) TAD problem, which can be considered as a marriage of FS-TAD and ZS-TAD by leveraging few-shot support videos and new class names jointly.
1 code implementation • 25 Nov 2022 • Chen Zhao, Shuming Liu, Karttikeya Mangalam, Bernard Ghanem
Temporal action localization (TAL) requires long-form reasoning to predict actions of various durations and complex content.
1 code implementation • 21 Nov 2022 • Ling Yang, Zhilin Huang, Yang song, Shenda Hong, Guohao Li, Wentao Zhang, Bin Cui, Bernard Ghanem, Ming-Hsuan Yang
Generating images from graph-structured inputs, such as scene graphs, is uniquely challenging due to the difficulty of aligning nodes and connections in graphs with objects and their relations in images.
no code implementations • 21 Nov 2022 • Jesus Zarzar, Sara Rojas, Silvio Giancola, Bernard Ghanem
The predicted semantic fields allow SegNeRF to achieve an average mIoU of $\textbf{30.30\%}$ for 2D novel view segmentation, and $\textbf{37.46\%}$ for 3D part segmentation, boasting competitive performance against point-based methods by using only a few posed images.
1 code implementation • CVPR 2023 • Mengmeng Xu, Yanghao Li, Cheng-Yang Fu, Bernard Ghanem, Tao Xiang, Juan-Manuel Perez-Rua
Our experiments show the proposed adaptations improve egocentric query detection, leading to a better visual query localization system in both 2D and 3D configurations.
no code implementations • 18 Nov 2022 • Jinjie Mai, Chen Zhao, Abdullah Hamdi, Silvio Giancola, Bernard Ghanem
Visual queries 3D localization (VQ3D) is a task in the Ego4D Episodic Memory Benchmark.
1 code implementation • 26 Oct 2022 • Haozhe Liu, Wentian Zhang, Jinheng Xie, Haoqian Wu, Bing Li, Ziqi Zhang, Yuexiang Li, Yawen Huang, Bernard Ghanem, Yefeng Zheng
Based on the observation that noise-prone regions, such as textured and cluttered backgrounds, harm the generalization ability of CNN models during training, we enhance features from discriminative regions and suppress noise-prone ones when combining an image pair.
7 code implementations • 5 Oct 2022 • Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li
The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.
no code implementations • 29 Sep 2022 • Kumail Alhamoud, Hasan Abed Al Kader Hammoud, Motasem Alfarra, Bernard Ghanem
Recent progress in empirical and certified robustness promises to deliver reliable and deployable Deep Neural Networks (DNNs).
1 code implementation • 25 Aug 2022 • Guocheng Qian, Xingdi Zhang, Abdullah Hamdi, Bernard Ghanem
This is mainly due to the limitation of Transformers: a demanding need for large training data.
1 code implementation • 25 Aug 2022 • Haozhe Liu, Bing Li, Haoqian Wu, Hanbang Liang, Yawen Huang, Yuexiang Li, Bernard Ghanem, Yefeng Zheng
In this paper, we propose a novel training pipeline to address the mode collapse issue of GANs.
1 code implementation • 3 Aug 2022 • Mengmeng Xu, Cheng-Yang Fu, Yanghao Li, Bernard Ghanem, Juan-Manuel Perez-Rua, Tao Xiang
The repeated gradient computation of the same object leads to inefficient training; (2) The false positive rate is high on background frames.
1 code implementation • 4 Jul 2022 • Kevin Qinghong Lin, Alex Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, RongCheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou
In this report, we propose a video-language pretraining (VLP) based solution for four Ego4D challenge tasks, including Natural Language Query (NLQ), Moment Query (MQ), Object State Change Classification (OSCC), and PNR Localization (PNR).
2 code implementations • 9 Jun 2022 • Guocheng Qian, Yuchen Li, Houwen Peng, Jinjie Mai, Hasan Abed Al Kader Hammoud, Mohamed Elhoseiny, Bernard Ghanem
In this work, we revisit the classical PointNet++ through a systematic study of model training and scaling strategies, and offer two major contributions.
Ranked #3 on 3D Part Segmentation on ShapeNet-Part
1 code implementation • 6 Jun 2022 • Motasem Alfarra, Juan C. Pérez, Egor Shulgin, Peter Richtárik, Bernard Ghanem
However, as in the single-node supervised learning setup, models trained in federated learning suffer from vulnerability to imperceptible input transformations known as adversarial attacks, questioning their deployment in security-related applications.
2 code implementations • 3 Jun 2022 • Kevin Qinghong Lin, Alex Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, RongCheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou
Video-Language Pretraining (VLP), which aims to learn transferable representation to advance a wide range of video-text downstream tasks, has recently received increasing attention.
Ranked #2 on Object State Change Classification on Ego4D
1 code implementation • 14 May 2022 • Shuming Liu, Mengmeng Xu, Chen Zhao, Xu Zhao, Bernard Ghanem
We propose to sequentially forward the snippet frames through the video encoder, and to backpropagate only a small, necessary portion of the gradients to update the encoder.
no code implementations • 4 May 2022 • Zhen Dong, Kaicheng Zhou, Guohao Li, Qiang Zhou, Mingfei Guo, Bernard Ghanem, Kurt Keutzer, Shanghang Zhang
Neural architecture search (NAS) has shown great success in the automatic design of deep neural networks (DNNs).
no code implementations • 26 Apr 2022 • Mengmeng Xu, Erhan Gundogdu, Maksim Lapin, Bernard Ghanem, Michael Donoser, Loris Bazzani
Long-form video understanding requires designing approaches that are able to temporally localize activities or language.
Contrastive Learning
Few Shot Temporal Action Localization
no code implementations • 14 Apr 2022 • Anthony Cioppa, Silvio Giancola, Adrien Deliege, Le Kang, Xin Zhou, Zhiyu Cheng, Bernard Ghanem, Marc Van Droogenbroeck
Tracking objects in soccer videos is extremely important to gather both player and team statistics, whether it is to estimate the total distance run, the ball possession or the team formation.
1 code implementation • CVPR 2022 • Gabriel Pérez S., Juan C. Pérez, Motasem Alfarra, Silvio Giancola, Bernard Ghanem
In this work, we propose 3DeformRS, a method to certify the robustness of point cloud Deep Neural Networks (DNNs) against real-world deformations.
1 code implementation • 11 Apr 2022 • Guocheng Qian, Xuanyang Zhang, Guohao Li, Chen Zhao, Yukang Chen, Xiangyu Zhang, Bernard Ghanem, Jian Sun
TNAS performs a modified bi-level Breadth-First Search in the proposed trees to discover a high-performance architecture.
1 code implementation • CVPR 2022 • Maksim Makarenko, Arturo Burguete-Lopez, Qizhou Wang, Fedor Getman, Silvio Giancola, Bernard Ghanem, Andrea Fratalocchi
Hyperspectral imaging has attracted significant attention to identify spectral signatures for image classification and automated pattern recognition in computer vision.
1 code implementation • 27 Mar 2022 • Juan Leon Alcazar, Moritz Cordes, Chen Zhao, Bernard Ghanem
Recent advances in the Active Speaker Detection (ASD) problem build upon a two-stage process: feature extraction and spatio-temporal context aggregation.
Ranked #4 on Audio-Visual Active Speaker Detection on AVA-ActiveSpeaker (using extra training data)
1 code implementation • 24 Mar 2022 • Qiankun Gao, Chen Zhao, Bernard Ghanem, Jian Zhang
After RRL, the classification head is refined with global class-balanced classification loss to address the data imbalance issue as well as learn the decision boundaries between new and previous classes.
no code implementations • 23 Mar 2022 • Bing Li, Cheng Zheng, Guohao Li, Bernard Ghanem
To provide an alternative, we propose a novel approach that utilizes monocular RGB images and point clouds to generate pseudo scene flow labels for training scene flow networks.
no code implementations • 3 Mar 2022 • Chen Zhao, Merey Ramazanova, Mengmeng Xu, Bernard Ghanem
To address these issues and precisely model temporal action detection, we formulate the task of temporal action detection from a novel perspective of semantic segmentation.
no code implementations • 10 Feb 2022 • Merey Ramazanova, Victor Escorcia, Fabian Caba Heilbron, Chen Zhao, Bernard Ghanem
We validate our approach on two large-scale datasets, EPIC-Kitchens and HOMAGE.
no code implementations • 10 Feb 2022 • Juan C. Pérez, Motasem Alfarra, Ali Thabet, Pablo Arbeláez, Bernard Ghanem
We propose a methodology for assessing and characterizing the robustness of FRMs against semantic perturbations to their input.
1 code implementation • 31 Jan 2022 • Motasem Alfarra, Juan C. Pérez, Anna Frühstück, Philip H. S. Torr, Peter Wonka, Bernard Ghanem
Finally, we show that the FID can be robustified by simply replacing the standard Inception with a robust Inception.
no code implementations • CVPR 2022 • Andrés Villa, Kumail Alhamoud, Juan León Alcázar, Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem
We perform in-depth evaluations of existing CL methods in vCLIMB, and observe two unique challenges in video data.
1 code implementation • CVPR 2022 • Anirudh Thatipelli, Sanath Narayan, Salman Khan, Rao Muhammad Anwer, Fahad Shahbaz Khan, Bernard Ghanem
Experiments are performed on four few-shot action recognition benchmarks: Kinetics, SSv2, HMDB51 and UCF101.
Ranked #1 on Few Shot Action Recognition on HMDB51
no code implementations • NeurIPS 2021 • Mengmeng Xu, Juan Manuel Perez Rua, Xiatian Zhu, Bernard Ghanem, Brais Martinez
This results in a task discrepancy problem for the video encoder – trained for action classification, but used for TAL.
Ranked #6 on Temporal Action Localization on HACS
1 code implementation • CVPR 2022 • Mattia Soldan, Alejandro Pardo, Juan León Alcázar, Fabian Caba Heilbron, Chen Zhao, Silvio Giancola, Bernard Ghanem
The recent and increasing interest in video-language research has driven the development of large-scale datasets that enable data-intensive machine learning techniques.
Ranked #2 on Natural Language Moment Retrieval on MAD
2 code implementations • 30 Nov 2021 • Abdullah Hamdi, Silvio Giancola, Bernard Ghanem
To this end, we introduce the concept of the multi-view point cloud (Voint cloud), representing each 3D point as a set of features extracted from several view-points.
1 code implementation • NeurIPS 2021 • Guocheng Qian, Hasan Abed Al Kader Hammoud, Guohao Li, Ali Thabet, Bernard Ghanem
We then introduce a new Anisotropic Reduction function into our Separable SA module and propose an Anisotropic Separable SA (ASSA) module that substantially increases the network's accuracy.
Ranked #27 on Semantic Segmentation on S3DIS Area5
3 code implementations • CVPR 2022 • Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.
1 code implementation • EMNLP 2021 • Jialin Gao, Xin Sun, Mengmeng Xu, Xi Zhou, Bernard Ghanem
Temporal language grounding in videos aims to localize the temporal span relevant to the given query sentence.
no code implementations • 12 Sep 2021 • Hasan Abed Al Kader Hammoud, Bernard Ghanem
Deep Neural Networks (DNNs) are ubiquitous and span a variety of applications ranging from image classification to real-time object detection.
1 code implementation • 12 Sep 2021 • Alejandro Pardo, Fabian Caba Heilbron, Juan León Alcázar, Ali Thabet, Bernard Ghanem
Advances in automatic Cut-type recognition can unleash new experiences in the video editing industry, such as movie analysis for education, video re-editing, virtual cinematography, machine-assisted trailer generation, machine-assisted video editing, among others.
1 code implementation • ICCV 2021 • Alejandro Pardo, Fabian Caba Heilbron, Juan León Alcázar, Ali Thabet, Bernard Ghanem
Video content creation keeps growing at an incredible pace; yet, creating engaging stories remains challenging and requires non-trivial video editing expertise.
1 code implementation • 29 Jul 2021 • Juan C. Pérez, Motasem Alfarra, Guillaume Jeanneret, Laura Rueda, Ali Thabet, Bernard Ghanem, Pablo Arbeláez
Deep learning models are prone to being fooled by imperceptible perturbations known as adversarial attacks.
1 code implementation • 9 Jul 2021 • Francisco Eiras, Motasem Alfarra, M. Pawan Kumar, Philip H. S. Torr, Puneet K. Dokania, Bernard Ghanem, Adel Bibi
Randomized smoothing has recently emerged as an effective tool that enables certification of deep neural network classifiers at scale.
2 code implementations • 2 Jul 2021 • Motasem Alfarra, Adel Bibi, Naeemullah Khan, Philip H. S. Torr, Bernard Ghanem
Deep neural networks are vulnerable to input deformations in the form of vector fields of pixel displacements and to other parameterized geometric deformations, e.g., translations and rotations.
4 code implementations • 14 Jun 2021 • Guohao Li, Matthias Müller, Bernard Ghanem, Vladlen Koltun
Deep graph neural networks (GNNs) have achieved excellent results on various tasks on increasingly large graph datasets with millions of nodes and edges.
Ranked #1 on Node Property Prediction on ogbn-proteins
1 code implementation • 3 Jun 2021 • Juan Leon Alcazar, Long Mai, Federico Perazzi, Joon-Young Lee, Pablo Arbelaez, Bernard Ghanem, Fabian Caba Heilbron
To showcase the potential of our new dataset, we propose an audiovisual baseline and benchmark for person retrieval.
1 code implementation • 10 May 2021 • Bing Li, Cheng Zheng, Silvio Giancola, Bernard Ghanem
We propose a novel scene flow estimation approach to capture and infer 3D motions from point clouds.
no code implementations • 19 Apr 2021 • Anthony Cioppa, Adrien Deliège, Floriane Magera, Silvio Giancola, Olivier Barnich, Bernard Ghanem, Marc Van Droogenbroeck
Specifically, we distill a powerful commercial calibration tool in a recent neural network architecture on the large-scale SoccerNet dataset, composed of untrimmed broadcast videos of 500 soccer games.
1 code implementation • 14 Apr 2021 • Silvio Giancola, Bernard Ghanem
In this paper, we focus our analysis on action spotting in soccer broadcast, which consists in temporally localizing the main actions in a soccer game.
Ranked #7 on
Action Spotting
on SoccerNet-v2
(Average-mAP metric)
no code implementations • 28 Mar 2021 • Mengmeng Xu, Juan-Manuel Perez-Rua, Xiatian Zhu, Bernard Ghanem, Brais Martinez
This results in a task discrepancy problem for the video encoder -- trained for action classification, but used for TAL.
1 code implementation • ICML Workshop AML 2021 • Motasem Alfarra, Juan C. Pérez, Ali Thabet, Adel Bibi, Philip H. S. Torr, Bernard Ghanem
Deep neural networks are vulnerable to small input perturbations known as adversarial attacks.
3 code implementations • 24 Feb 2021 • Bing Li, Yuanlue Zhu, Yitong Wang, Chia-Wen Lin, Bernard Ghanem, Linlin Shen
Specifically, a new generator architecture is proposed to simultaneously transfer color/texture styles and transform local facial shapes into anime-like counterparts based on the style of a reference anime-face, while preserving the global structure of the source photo-face.
1 code implementation • ICCV 2021 • Juan León-Alcázar, Fabian Caba Heilbron, Ali Thabet, Bernard Ghanem
Active speaker detection requires a solid integration of multi-modal cues.
no code implementations • 1 Jan 2021 • Guohao Li, Chenxin Xiong, Ali Thabet, Bernard Ghanem
We add our generalized aggregation into a deep GCN framework and show it achieves state-of-the-art results in six benchmarks from OGB.
no code implementations • 1 Jan 2021 • Motasem Alfarra, Adel Bibi, Hasan Abed Al Kader Hammoud, Mohamed Gaafar, Bernard Ghanem
This work tackles the problem of characterizing and understanding the decision boundaries of neural networks with piecewise linear non-linearity activations.
no code implementations • ICCV 2021 • Bing Li, Chia-Wen Lin, Cheng Zheng, Shan Liu, Junsong Yuan, Bernard Ghanem, C.-C. Jay Kuo
In the second stage, we derive another warping model to refine warping results in less important regions by eliminating serious distortions in shape, disparity and 3D structure.
no code implementations • 29 Dec 2020 • Hani Itani, Silvio Giancola, Ali Thabet, Bernard Ghanem
Since it is learnable, this mapping is allowed to be different per layer instead of being applied uniformly throughout the depth of the network.
no code implementations • 8 Dec 2020 • Motasem Alfarra, Adel Bibi, Philip H. S. Torr, Bernard Ghanem
In this work, we revisit Gaussian randomized smoothing and show that the variance of the Gaussian distribution can be optimized at each input so as to maximize the certification radius for the construction of the smooth classifier.
no code implementations • ICCV 2021 • Chen Zhao, Ali Thabet, Bernard Ghanem
In VSS, we focus on a short period of a video and magnify it along the temporal dimension to obtain a larger scale.
Ranked #11 on
Temporal Action Localization
on ActivityNet-1.3
2 code implementations • ICCV 2021 • Abdullah Hamdi, Silvio Giancola, Bernard Ghanem
MVTN exhibits clear performance gains in the tasks of 3D shape classification and 3D shape retrieval without the need for extra training supervision.
Ranked #1 on
3D Object Retrieval
on ShapeNetCore 55
3 code implementations • 26 Nov 2020 • Adrien Deliège, Anthony Cioppa, Silvio Giancola, Meisam J. Seikavandi, Jacob V. Dueholm, Kamal Nasrollahi, Bernard Ghanem, Thomas B. Moeslund, Marc Van Droogenbroeck
In this work, we propose SoccerNet-v2, a novel large-scale corpus of manual annotations for the SoccerNet video dataset, along with open challenges to encourage more research in soccer understanding and broadcast production.
Ranked #1 on
Camera shot segmentation
on SoccerNet-v2
1 code implementation • 23 Nov 2020 • Humam Alwassel, Silvio Giancola, Bernard Ghanem
Extensive experiments show that using features trained with our novel pretraining strategy significantly improves the performance of recent state-of-the-art methods on three tasks: Temporal Action Localization, Action Proposal Generation, and Dense Video Captioning.
1 code implementation • ICCV 2021 • Mengmeng Xu, Juan-Manuel Perez-Rua, Victor Escorcia, Brais Martinez, Xiatian Zhu, Li Zhang, Bernard Ghanem, Tao Xiang
However, most existing models developed for these tasks are pre-trained on general video action classification tasks.
Ranked #17 on
Temporal Action Localization
on ActivityNet-1.3
1 code implementation • 19 Nov 2020 • Mattia Soldan, Mengmeng Xu, Sisi Qu, Jesper Tegner, Bernard Ghanem
Grounding language queries in videos aims at identifying the time interval (or moment) semantically relevant to a language query.
Ranked #1 on
Natural Language Moment Retrieval
on TACoS
3 code implementations • CVPR 2022 • Kezhi Kong, Guohao Li, Mucong Ding, Zuxuan Wu, Chen Zhu, Bernard Ghanem, Gavin Taylor, Tom Goldstein
Data augmentation helps neural networks generalize better by enlarging the training set, but it remains an open question how to effectively augment graph data to enhance the performance of GNNs (Graph Neural Networks).
Ranked #1 on
Graph Property Prediction
on ogbg-ppa
no code implementations • 24 Aug 2020 • Guohao Li, Mengmeng Xu, Silvio Giancola, Ali Thabet, Bernard Ghanem
In this paper, we introduce a new NAS framework, dubbed LC-NAS, where we search for point cloud architectures that are constrained to a target latency.
1 code implementation • 3 Aug 2020 • Samuel Albanie, Yang Liu, Arsha Nagrani, Antoine Miech, Ernesto Coto, Ivan Laptev, Rahul Sukthankar, Bernard Ghanem, Andrew Zisserman, Valentin Gabeur, Chen Sun, Karteek Alahari, Cordelia Schmid, Shi-Zhe Chen, Yida Zhao, Qin Jin, Kaixu Cui, Hui Liu, Chen Wang, Yudong Jiang, Xiaoshuai Hao
This report summarizes the results of the first edition of the challenge together with the findings of the participants.
no code implementations • 10 Jul 2020 • Sisi Qu, Mengmeng Xu, Bernard Ghanem, Jesper Tegner
EDNA uses the diffusion signal as a proxy for computing node similarities between networks.
no code implementations • 21 Jun 2020 • Modar Alfadly, Adel Bibi, Emilio Botero, Salman AlSubaihi, Bernard Ghanem
This has incited research on the reaction of DNNs to noisy input, namely developing adversarial input attacks and strategies that make DNNs robust to these attacks.
3 code implementations • 13 Jun 2020 • Guohao Li, Chenxin Xiong, Ali Thabet, Bernard Ghanem
Graph Convolutional Networks (GCNs) have been drawing significant attention with the power of representation learning on graphs.
Ranked #1 on
Node Property Prediction
on ogbn-proteins
1 code implementation • 13 Jun 2020 • Motasem Alfarra, Juan C. Pérez, Adel Bibi, Ali Thabet, Pablo Arbeláez, Bernard Ghanem
This paper studies how encouraging semantically-aligned features during deep neural network training can increase network robustness.
1 code implementation • CVPR 2020 • Juan Leon Alcazar, Fabian Caba Heilbron, Long Mai, Federico Perazzi, Joon-Young Lee, Pablo Arbelaez, Bernard Ghanem
Current methods for active speaker detection focus on modeling short-term audiovisual information from a single speaker.
no code implementations • 3 May 2020 • Motasem Alfarra, Slavomir Hanzely, Alyazeed Albasyoni, Bernard Ghanem, Peter Richtarik
Recent advances in the theoretical understanding of SGD led to a formula for the optimal batch size minimizing the number of effective data passes, i.e., the number of iterations times the batch size.
no code implementations • 20 Feb 2020 • Motasem Alfarra, Adel Bibi, Hasan Hammoud, Mohamed Gaafar, Bernard Ghanem
Our main finding is that the decision boundaries are a subset of a tropical hypersurface, which is intimately related to a polytope formed by the convex hull of two zonotopes.
no code implementations • 6 Feb 2020 • Jean Lahoud, Bernard Ghanem
These labels, denoted by HN-labels, represent different height and normal patches, which allow mining of local semantic information that is useful in the task of semantic RGB segmentation.
Ranked #92 on
Semantic Segmentation
on NYU Depth v2
no code implementations • ICLR 2020 • Modar Alfadly, Adel Bibi, Muhammed Kocabas, Bernard Ghanem
In this work, we propose a new training regularizer that aims to minimize the probabilistic expected training loss of a DNN subject to a generic Gaussian input.
1 code implementation • ECCV 2020 • Juan C. Pérez, Motasem Alfarra, Guillaume Jeanneret, Adel Bibi, Ali Thabet, Bernard Ghanem, Pablo Arbeláez
We revisit the benefits of merging classical vision concepts with deep learning models.
1 code implementation • CVPR 2020 • Anthony Cioppa, Adrien Deliège, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck, Rikke Gade, Thomas B. Moeslund
We benchmark our loss on a large dataset of soccer videos, SoccerNet, and achieve an improvement of 12.8% over the baseline.
Ranked #3 on
Action Spotting
on SoccerNet
1 code implementation • ECCV 2020 • Abdullah Hamdi, Sara Rojas, Ali Thabet, Bernard Ghanem
Our proposed attack increases the attack success rate by up to 40% for those transferred to unseen networks (transferability), while maintaining a high success rate on the attacked network.
1 code implementation • CVPR 2020 • Guohao Li, Guocheng Qian, Itzel C. Delgadillo, Matthias Müller, Ali Thabet, Bernard Ghanem
Architecture design has become a crucial component of successful deep learning.
Ranked #4 on
Node Classification
on PPI
1 code implementation • CVPR 2021 • Guocheng Qian, Abdulellah Abualshour, Guohao Li, Ali Thabet, Bernard Ghanem
We combine Inception DenseGCN with NodeShuffle into a new point upsampling pipeline called PU-GCN.
no code implementations • 30 Nov 2019 • Jia-Hong Huang, Modar Alfadly, Bernard Ghanem, Marcel Worring
In this work, we propose a new method that uses semantically related questions, dubbed basic questions, acting as noise to evaluate the robustness of VQA models.
1 code implementation • NeurIPS 2020 • Humam Alwassel, Dhruv Mahajan, Bruno Korbar, Lorenzo Torresani, Bernard Ghanem, Du Tran
To the best of our knowledge, XDC is the first self-supervised learning method that outperforms large-scale fully-supervised pretraining for action recognition on the same architecture.
no code implementations • 27 Nov 2019 • Jesus Zarzar, Silvio Giancola, Bernard Ghanem
We integrate residual GCNs in a two-stage 3D object detection pipeline, where 3D object proposals are refined using a novel graph representation.
Ranked #14 on
3D Object Detection
on KITTI Cars Hard
5 code implementations • CVPR 2020 • Mengmeng Xu, Chen Zhao, David S. Rojas, Ali Thabet, Bernard Ghanem
In this work, we propose a graph convolutional network (GCN) model to adaptively incorporate multi-level semantic context into video features and cast temporal action detection as a sub-graph localization problem.
Ranked #4 on
Temporal Action Localization
on FineAction
4 code implementations • 15 Oct 2019 • Guohao Li, Matthias Müller, Guocheng Qian, Itzel C. Delgadillo, Abdulellah Abualshour, Ali Thabet, Bernard Ghanem
This work transfers concepts such as residual/dense connections and dilated convolutions from CNNs to GCNs in order to successfully train very deep GCNs.
Ranked #5 on
3D Semantic Segmentation
on PartNet
no code implementations • 25 Sep 2019 • Salman AlSubaihi, Adel Bibi, Modar Alfadly, Abdullah Hamdi, Bernard Ghanem
al. that bounded input intervals can be inexpensively propagated from layer to layer through deep networks.
no code implementations • 25 Sep 2019 • Motasem Alfarra, Adel Bibi, Hasan Hammoud, Mohamed Gaafar, Bernard Ghanem
We use tropical geometry, a new development in the area of algebraic geometry, to provide a characterization of the decision boundaries of a simple neural network of the form (Affine, ReLU, Affine).
2 code implementations • 30 Jul 2019 • Victor Escorcia, Mattia Soldan, Josef Sivic, Bernard Ghanem, Bryan Russell
We evaluate our approach on two recently proposed datasets for temporal localization of moments in video with natural language (DiDeMo and Charades-STA) extended to our video corpus moment retrieval setting.
1 code implementation • 24 Jul 2019 • Adel Bibi, Ali Alqahtani, Bernard Ghanem
Extensive experiments on both synthetic and real data demonstrate when: (1) utilizing a single category of constraint, the proposed model is superior to or competitive with SOTA constrained clustering models, and (2) utilizing both categories of constraints jointly, the proposed model shows better performance than the case of the single category.
no code implementations • ICCV 2019 • Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald
The second goal is to learn instance information by densely estimating directional information of the instance's center of mass for each voxel.
Ranked #2 on
3D Semantic Instance Segmentation
on ScanNetV2
2 code implementations • 28 May 2019 • Salman Al-Subaihi, Adel Bibi, Modar Alfadly, Abdullah Hamdi, Bernard Ghanem
In this paper, we closely examine the bounds of a block of layers composed in the form of Affine-ReLU-Affine.
1 code implementation • 9 May 2019 • Baoyuan Wu, Li Shen, Tong Zhang, Bernard Ghanem
Thus, LS-LP is equivalent to the original MAP inference problem.
1 code implementation • 7 May 2019 • Guocheng Qian, Yuanhao Wang, Jinjin Gu, Chao Dong, Wolfgang Heidrich, Bernard Ghanem, Jimmy S. Ren
In this work, we comprehensively study the effects of pipelines on the mixture problem of learning-based DN, DM, and SR, in both sequential and joint solutions.
no code implementations • ICLR 2019 • Adel Bibi, Bernard Ghanem, Vladlen Koltun, Rene Ranftl
In particular, we show that a forward pass through a standard dropout layer followed by a linear layer and a non-linear activation is equivalent to optimizing a convex optimization objective with a single iteration of a $\tau$-nice Proximal Stochastic Gradient method.
1 code implementation • 24 Apr 2019 • Modar Alfadly, Adel Bibi, Bernard Ghanem
Despite the impressive performance of deep neural networks (DNNs) on numerous vision tasks, they still exhibit yet-to-understand uncouth behaviours.
no code implementations • 18 Apr 2019 • Matthias Müller, Guohao Li, Vincent Casser, Neil Smith, Dominik L. Michels, Bernard Ghanem
A common approach is to learn an end-to-end policy that directly predicts controls from raw images by imitating an expert.
no code implementations • 16 Apr 2019 • Abdullah Hamdi, Bernard Ghanem
Generative Adversarial Networks (GANs) have gained momentum for their ability to model image distributions.
no code implementations • 11 Apr 2019 • Juan Leon Alcazar, Maria A. Bravo, Ali K. Thabet, Guillaume Jeanneret, Thomas Brox, Pablo Arbelaez, Bernard Ghanem
Instance-level video segmentation requires a solid integration of spatial and temporal information.
no code implementations • 10 Apr 2019 • Chen Zhao, Bernard Ghanem
Although deep convolutional neural networks (CNNs) have achieved great success in computer vision tasks, their real-world application is still impeded by their voracious demand for computational resources.
no code implementations • 10 Apr 2019 • Alejandro Pardo, Mengmeng Xu, Ali Thabet, Pablo Arbelaez, Bernard Ghanem
We adopt a hybrid supervised learning framework to train the object detector from both these types of annotation.
1 code implementation • 9 Apr 2019 • Abdullah Hamdi, Bernard Ghanem
Despite the impressive performance of Deep Neural Networks (DNNs) on various vision tasks, they still exhibit erroneous high sensitivity toward semantic primitives (e.g., object pose).
1 code implementation • ICCV 2019 • Guohao Li, Matthias Müller, Ali Thabet, Bernard Ghanem
Finally, we use these new concepts to build a very deep 56-layer GCN, and show how it significantly boosts performance (+3.7% mIoU over state-of-the-art) in the task of point cloud semantic segmentation.
1 code implementation • 30 Mar 2019 • Ali Thabet, Humam Alwassel, Bernard Ghanem
In fact, we show how Morton features can be used to significantly improve performance (+3% for 2 popular se