1 code implementation • ECCV 2020 • Deng-Ping Fan, Yingjie Zhai, Ali Borji, Jufeng Yang, Ling Shao
In particular, we 1) propose a bifurcated backbone strategy (BBS) to split the multi-level features into teacher and student features, and 2) utilize a depth-enhanced module (DEM) to excavate informative parts of depth cues from the channel and spatial views.
1 code implementation • 1 Jul 2024 • Ali Borji
The primary aim of this manuscript is to underscore a significant limitation in current deep learning models, particularly vision models.
no code implementations • 3 Apr 2024 • Morteza Moradi, Mohammad Moradi, Francesco Rundo, Concetto Spampinato, Ali Borji, Simone Palazzo
Recent advancements in video saliency prediction (VSP) have shown promising performance compared to the human visual system, whose emulation is the primary goal of VSP.
1 code implementation • 29 Oct 2023 • Ali Borji
Although extensive research has been carried out to evaluate the effectiveness of AI tools and models in detecting deep fakes, the question remains unanswered regarding whether these models can accurately identify genuine images that appear artificial.
1 code implementation • 28 May 2023 • Ali Borji
Transformers have emerged as the prevailing standard solution for various AI tasks, including computer vision and natural language processing.
no code implementations • 29 Mar 2023 • Ali Borji
The ability of image and video generation models to create photorealistic images has reached unprecedented heights, making it difficult to distinguish between real and fake images in many cases.
1 code implementation • 6 Feb 2023 • Ali Borji
The goal of this study is to assist researchers and developers in enhancing future language models and chatbots.
1 code implementation • 29 Jan 2023 • Ali Borji
Our dataset also comes with a "miscellaneous" category, over which we test the image tagging models.
1 code implementation • 28 Jan 2023 • Ali Borji
Questions and answers are formulated and verified carefully and manually.
1 code implementation • 4 Nov 2022 • Ali Borji
A classifier is then trained on the logit vectors of the training set of this dataset to map each logit vector to the index of the network that generated it.
1 code implementation • 2 Oct 2022 • Ali Borji
Here, we conduct a quantitative comparison of three popular systems including Stable Diffusion, Midjourney, and DALL-E 2 in their ability to generate photorealistic faces in the wild.
no code implementations • 23 Aug 2022 • Ali Borji
On VQA, the OFA model scores 77.3% on answering 241 binary questions across 50 images.
1 code implementation • 4 Aug 2022 • Ali Borji
Almost all adversarial attacks are formulated to add an imperceptible perturbation to an image in order to fool a model.
no code implementations • 31 Jul 2022 • Ali Borji
Short answer: Yes, Long answer: No!
1 code implementation • 21 Jul 2022 • Ali Borji, Sikun Lin
We show, both theoretically and experimentally, that SplitMixer performs on par with the state-of-the-art MLP-like models while having a significantly lower number of parameters and FLOPS.
1 code implementation • 23 Jun 2022 • Ali Borji
For nearly a decade, the COCO dataset has been the central test bed of research in object detection.
1 code implementation • 21 Jun 2022 • Ali Borji
Here, we quantify the sensitivity of AP to bounding box perturbations and show that AP is very sensitive to small translations.
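The sensitivity described above can be illustrated with a minimal sketch. It assumes axis-aligned boxes in `(x1, y1, x2, y2)` pixel coordinates and uses the standard intersection-over-union overlap; the helper names `iou` and `shift` are hypothetical and are not taken from the paper's code. It only shows why a translation of a few pixels costs a small box much more overlap than a large one, which is one way AP can move under small perturbations.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap region, clamped at zero for disjoint boxes.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def shift(box, dx, dy):
    """Translate a box by (dx, dy) pixels (hypothetical perturbation helper)."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


small = (0, 0, 20, 20)    # 20x20 box
large = (0, 0, 100, 100)  # 100x100 box

# The same 2-pixel horizontal shift removes more overlap from the small box.
iou_small = iou(small, shift(small, 2, 0))
iou_large = iou(large, shift(large, 2, 0))
```

Because a detection counts as correct only when its IoU with a ground-truth box exceeds a threshold, a drop like the one for the small box can flip a true positive into a false positive at strict thresholds.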
1 code implementation • 19 Mar 2022 • Ali Borji
Hybrid imaging is a technique for generating images with two interpretations that change as a function of viewing distance.
no code implementations • 20 Feb 2022 • Ali Borji
Overparametrization has become a de facto standard in machine learning.
1 code implementation • 5 Nov 2021 • Minglang Qiao, Yufan Liu, Mai Xu, Xin Deng, Bing Li, Weiming Hu, Ali Borji
In this paper, we propose a multitask learning method for visual-audio saliency prediction and sound source localization on multi-face video by leveraging visual, audio and face information.
1 code implementation • ECCV 2020 • Yufan Liu, Minglang Qiao, Mai Xu, Bing Li, Weiming Hu, Ali Borji
Inspired by the findings of our investigation, we propose a novel multi-modal video saliency model consisting of three branches: visual, audio and face.
1 code implementation • 17 Mar 2021 • Ali Borji
This work is an update of a previous paper on the same topic published a few years ago.
1 code implementation • 9 Mar 2021 • Ali Borji
High image resolution is critical to obtain a good performance in many computer vision applications.
1 code implementation • 8 Mar 2021 • Ali Borji
They showed a dramatic performance drop of state-of-the-art object recognition models on this dataset.
no code implementations • 1 Jan 2021 • Ali Borji
In this work, by employing two popular state-of-the-art object detection benchmarks, MMDetection and Detectron2, and analyzing more than 15 models over 4 large-scale datasets, we systematically determine the upper bound in AP, which is 91.6% on PASCAL VOC (test2007), 78.2% on MS COCO (val2017), and 58.9% on OpenImages (V4 validation set), regardless of the IOU.
no code implementations • ICLR 2021 • Ali Borji
Relative to the numbers reported in Barbu et al., around 10-15% of the performance loss is recovered, without any test time data augmentation.
no code implementations • NeurIPS Workshop ICBINB 2021 • Ali Borji
Further, we show that edge information can a) benefit other adversarial training methods, b) be even more effective in conjunction with background subtraction, c) be used to defend against poisoning attacks, and d) make CNNs more robust against natural image corruptions such as motion blur, impulse noise, and JPEG compression, than CNNs trained solely on RGB images.
2 code implementations • 31 Aug 2020 • Ali Borji
In the first one, a classifier is adversarially trained on images with the edge map as an additional channel.
no code implementations • 30 Aug 2020 • Samad Zabihi, Hamed Rezazadegan Tavakoli, Ali Borji
Our proposed model consists of a modified U-net architecture, a novel fully connected layer, and central difference convolutional layers.
2 code implementations • 6 Jul 2020 • Yingjie Zhai, Deng-Ping Fan, Jufeng Yang, Ali Borji, Ling Shao, Junwei Han, Liang Wang
In particular, first, we propose to regroup the multi-level features into teacher and student features using a bifurcated backbone strategy (BBS).
Ranked #2 on RGB-D Salient Object Detection on RGBD135
1 code implementation • 13 May 2020 • Ali Borji
Deep learning has come a long way and has enjoyed unprecedented success.
1 code implementation • 26 Apr 2020 • Ali Borji
I introduce a very simple method to defend against adversarial examples.
1 code implementation • 5 Apr 2020 • Ali Borji
Object detection remains one of the most notorious open problems in computer vision.
1 code implementation • 4 Apr 2020 • Ali Borji
They showed a dramatic performance drop of state-of-the-art object recognition models on this dataset.
Ranked #4 on Image Classification on ObjectNet (Bounding Box) (using extra training data)
1 code implementation • ICLR 2020 • Ali Borji, Sikun Lin
A white noise analysis of modern deep neural networks is presented to unveil their biases at the whole network level or the single neuron level.
1 code implementation • 27 Nov 2019 • Ali Borji, Seyed Mehdi Iranmanesh
Object detection remains one of the most notorious open problems in computer vision.
1 code implementation • 18 Nov 2019 • Zhaohui Che, Ali Borji, Guangtao Zhai, Suiyi Ling, Jing Li, Patrick Le Callet
Deep neural networks are vulnerable to adversarial attacks.
2 code implementations • 25 May 2019 • Hamed R. Tavakoli, Ali Borji, Esa Rahtu, Juho Kannala
Our results suggest that (1) audio is a strong contributing cue for saliency prediction, (2) a salient visible sound source is the natural cause of the superiority of our Audio-Visual model, (3) richer feature representations for the input space lead to more powerful predictions even in the absence of more sophisticated saliency decoders, and (4) the Audio-Visual model improves over 53.54% of the frames predicted by the best Visual model (our baseline).
1 code implementation • 16 May 2019 • Zhaohui Che, Ali Borji, Guangtao Zhai, Xiongkuo Min, Guodong Guo, Patrick Le Callet
Data size is the bottleneck for developing deep saliency models, because collecting eye-movement data is very time consuming and expensive.
no code implementations • 12 Apr 2019 • Hamed R. Tavakoli, Esa Rahtu, Juho Kannala, Ali Borji
Extensive experiments over multiple datasets reveal that (1) spatial biases are strong in egocentric videos, (2) bottom-up saliency models perform poorly in predicting gaze and underperform spatial biases, (3) deep features perform better compared to traditional features, (4) as opposed to hand regions, the manipulation point is a strong influential cue for gaze prediction, (5) combining the proposed recurrent model with bottom-up cues, vanishing points and, in particular, manipulation point results in the best gaze prediction accuracy over egocentric videos, (6) the knowledge transfer works best for cases where the tasks or sequences are similar, and (7) task and activity recognition can benefit from gaze prediction.
no code implementations • 2 Apr 2019 • Zhaohui Che, Ali Borji, Guangtao Zhai, Suiyi Ling, Guodong Guo, Patrick Le Callet
The proposed attack only requires a part of the model information, and is able to generate a sparser and more insidious adversarial perturbation, compared to traditional image-space attacks.
no code implementations • ICCV 2019 • Sen He, Hamed R. Tavakoli, Ali Borji, Nicolas Pugeault
In this work, we present a novel dataset consisting of eye movements and verbal descriptions recorded synchronously over images.
1 code implementation • CVPR 2019 • Sen He, Hamed R. Tavakoli, Ali Borji, Yang Mi, Nicolas Pugeault
Our analyses reveal that: 1) some visual regions (e.g., head, text, symbol, vehicle) are already encoded within various layers of the network pre-trained for object recognition, 2) using modern datasets, we find that fine-tuning pre-trained models for saliency prediction makes them favor some categories (e.g., head) over some others (e.g., text), 3) although deep models of saliency outperform classical models on natural images, the converse is true for synthetic stimuli (e.g., pop-out search arrays), evidence of a significant difference between human and data-driven saliency models, and 4) we confirm that, after fine-tuning, the change in inner representations is mostly due to the task and not the domain shift in the data.
no code implementations • 1 Mar 2019 • Mohamed Elfeki, Ali Borji
Prior work proposed supervised and unsupervised algorithms to train models for learning the underlying behavior of humans by increasing modeling complexity or craft-designing better heuristics to simulate the human summary generation process.
1 code implementation • 1 Dec 2018 • Mohamed Elfeki, Krishna Regmi, Shervin Ardeshir, Ali Borji
In this work, we introduce two datasets (synthetic and natural/real) containing simultaneously recorded egocentric and exocentric videos.
1 code implementation • 1 Dec 2018 • Mohamed Elfeki, Liqiang Wang, Ali Borji
With vast amounts of video content being uploaded to the Internet every minute, video summarization becomes critical for efficient browsing, searching, and indexing of visual content.
no code implementations • 11 Oct 2018 • Ali Borji, Hamed R. Tavakoli, Zoya Bylinskii
In this review, we examine the recent progress in saliency prediction and propose several avenues for future research.
1 code implementation • 10 Oct 2018 • Zhaohui Che, Ali Borji, Guangtao Zhai, Xiongkuo Min
Most current studies on human gaze and saliency modeling have used high-quality stimuli.
no code implementations • 8 Oct 2018 • Ali Borji
Visual saliency models have enjoyed a big leap in performance in recent years, thanks to advances in deep learning and large scale annotated data.
no code implementations • ECCV 2018 • Shervin Ardeshir, Ali Borji
Videos recorded from first person (egocentric) perspective have little visual appearance in common with those from third person perspective, especially with videos captured by top-view surveillance cameras.
no code implementations • ECCV 2018 • Shengli Hu, Ali Borji
We create a dataset of 543,758 logo designs spanning 39 industrial categories and 216 countries.
2 code implementations • 14 Aug 2018 • Krishna Regmi, Ali Borji
For this, we propose to use homography as a guide to map the images between the views based on the common field of view to preserve the details in the input image.
no code implementations • ECCV 2018 • Aidean Sharghi, Ali Borji, Chengtao Li, Tianbao Yang, Boqing Gong
In terms of modeling, we design a new probabilistic distribution such that, when it is integrated into SeqDPP, the resulting model accepts user input about the expected length of the summary.
no code implementations • 27 Jun 2018 • Changqun Xia, Jia Li, Jinming Su, Ali Borji
Due to the effectiveness of the learned metric, it also can be used to facilitate the development of new models for fixation prediction.
1 code implementation • CVPR 2018 • Yu Zeng, Huchuan Lu, Lihe Zhang, Mengyang Feng, Ali Borji
The categories and appearance of salient objects vary from image to image; therefore, saliency detection is an image-specific task.
no code implementations • CVPR 2018 • Tiantian Wang, Lihe Zhang, Shuo Wang, Huchuan Lu, Gang Yang, Xiang Ruan, Ali Borji
Moreover, to effectively recover object boundaries, we propose a local Boundary Refinement Network (BRN) to adaptively learn the local contextual information for each spatial position.
Ranked #15 on RGB Salient Object Detection on DUTS-TE
1 code implementation • CVPR 2018 • Wenguan Wang, Jianbing Shen, Xingping Dong, Ali Borji
Salient object detection is then viewed as fine-grained object-level saliency segmentation and is progressively optimized with the guidance of the fixation map in a top-down manner.
2 code implementations • 26 May 2018 • Deng-Ping Fan, Cheng Gong, Yang Cao, Bo Ren, Ming-Ming Cheng, Ali Borji
The existing binary foreground map (FM) measures address various types of errors in either pixel-wise or structural ways.
no code implementations • 27 Mar 2018 • Qibin Hou, Jiang-Jiang Liu, Ming-Ming Cheng, Ali Borji, Philip H. S. Torr
Although these tasks are inherently very different, we show that our unified approach performs very well on all of them and works far better than current single-purpose state-of-the-art methods.
no code implementations • ECCV 2018 • Deng-Ping Fan, Ming-Ming Cheng, Jiang-Jiang Liu, Shang-Hua Gao, Qibin Hou, Ali Borji
Our analysis identifies a serious design bias of existing SOD datasets which assumes that each image contains at least one clearly outstanding salient object in low clutter.
no code implementations • 15 Mar 2018 • Sen He, Ali Borji, Yang Mi, Nicolas Pugeault
Deep convolutional neural networks have demonstrated high performances for fixation prediction in recent years.
1 code implementation • CVPR 2018 • Krishna Regmi, Ali Borji
X-Fork architecture has a single discriminator and a single generator.
1 code implementation • CVPR 2018 • Aisha Urooj Khan, Ali Borji
In the quest for robust hand segmentation methods, we evaluated the performance of state-of-the-art semantic segmentation methods, both off-the-shelf and fine-tuned, on existing datasets.
4 code implementations • 9 Feb 2018 • Ali Borji
Generative models, in particular generative adversarial networks (GANs), have received significant attention recently.
no code implementations • 25 Jan 2018 • Wei-Ta Chu, Kai-Chia Ho, Ali Borji
In this paper, we attempt to employ convolutional recurrent neural networks for weather temperature estimation using only image data.
1 code implementation • CVPR 2018 • Wenguan Wang, Jianbing Shen, Fang Guo, Ming-Ming Cheng, Ali Borji
Existing video saliency datasets lack variety and generality of common dynamic scenes and fall short in covering challenging situations in unconstrained environments.
no code implementations • 26 Dec 2017 • Cecilia La Place, Aisha Urooj Khan, Ali Borji
As a result of our efforts, we have seen an improvement of 10-15% in the average MCR compared to prior methods on the SkyFinder dataset.
1 code implementation • ICCV 2017 • Tiantian Wang, Ali Borji, Lihe Zhang, Pingping Zhang, Huchuan Lu
To remedy this problem, here we propose to augment feedforward neural networks with a novel pyramid pooling module and a multi-stage refinement mechanism for saliency detection.
Ranked #16 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)
no code implementations • 9 Aug 2017 • Yu Zeng, Huchuan Lu, Ali Borji
Here, we explore the low-level statistics of images generated by state-of-the-art deep generative models.
no code implementations • 8 Aug 2017 • Yu Zeng, Huchuan Lu, Ali Borji, Mengyang Feng
Saliency maps are generated according to each region's strategy in the Nash equilibrium of the proposed Saliency Game.
1 code implementation • ICCV 2017 • Deng-Ping Fan, Ming-Ming Cheng, Yun Liu, Tao Li, Ali Borji
Our new measure simultaneously evaluates region-aware and object-aware structural similarity between a SM and a GT map.
1 code implementation • 15 Jun 2017 • Ali Borji, Aysegul Dundar
We do not dwell much on the learning mechanisms in these frameworks as they are still a matter of debate, with respect to biological constraints.
no code implementations • CVPR 2017 • Hamed R. Tavakoli, Fawad Ahmed, Ali Borji, Jorma Laaksonen
This paper revisits visual saliency prediction by evaluating the recent advancements in this field such as crowd-sourced mouse tracking-based databases and contextual annotations.
no code implementations • 19 May 2017 • Hung Le, Ali Borji
In this work, we explain in detail how receptive fields, effective receptive fields, and projective fields of neurons in different layers, convolution or pooling, of a Convolutional Neural Network (CNN) are calculated.
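As a minimal sketch of the kind of calculation described above, the theoretical receptive field of a stack of convolution and pooling layers can be computed with the standard recurrence `rf += (kernel - 1) * jump; jump *= stride`. The function name and the `(kernel_size, stride)` layer encoding are assumptions made for illustration, dilation is ignored, and this covers only the receptive field, not the effective or projective fields the paper also discusses.

```python
def receptive_field(layers):
    """Theoretical receptive field (in input pixels, along one axis) of one
    output unit after a stack of layers.

    layers: iterable of (kernel_size, stride) tuples, ordered from the input
    side to the output side. Convolution and pooling layers are treated the
    same way; padding does not change the receptive field size.
    """
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance in input pixels between adjacent units of the current layer
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Hypothetical toy network: 3x3 conv (stride 1), 2x2 pool (stride 2), 3x3 conv (stride 1).
toy_net = [(3, 1), (2, 2), (3, 1)]
toy_rf = receptive_field(toy_net)
```

Tracking `jump` separately is what makes strided layers enlarge the contribution of every later kernel, which is why pooling grows the receptive field faster than stacking stride-1 convolutions alone.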
no code implementations • 11 May 2017 • Ali Borji
A negative result is when the outcome of an experiment or a model is not what is expected or when a hypothesis does not hold.
2 code implementations • ICCV 2017 • Hamed R. Tavakoli, Rakshith Shetty, Ali Borji, Jorma Laaksonen
To bridge the gap between humans and machines in image understanding and describing, we need further insight into how people describe a perceived scene.
no code implementations • 17 Dec 2016 • Sajad Mousavi, Michael Schukat, Enda Howley, Ali Borji, Nasser Mozayani
Bottom-Up (BU) saliency models do not perform well in complex interactive environments where humans are actively engaged in tasks (e.g., sandwich making and playing video games).
no code implementations • 17 Dec 2016 • Shervin Ardeshir, Krishna Regmi, Ali Borji
On one hand, the abundance of egocentric cameras in the past few years has offered the opportunity to study a lot of vision problems from the first-person perspective.
4 code implementations • CVPR 2017 • Qibin Hou, Ming-Ming Cheng, Xiao-Wei Hu, Ali Borji, Zhuowen Tu, Philip Torr
Recent progress on saliency detection is substantial, benefiting mostly from the explosive development of Convolutional Neural Networks (CNNs).
Ranked #4 on RGB Salient Object Detection on SBU / SBU-Refine
1 code implementation • 20 Oct 2016 • Hamed R. Tavakoli, Ali Borji, Jorma Laaksonen, Esa Rahtu
This paper presents a novel fixation prediction and saliency modeling framework based on inter-image similarities and ensemble of Extreme Learning Machines (ELM).
no code implementations • 9 Oct 2016 • Jessica Finocchiaro, Aisha Urooj Khan, Ali Borji
We used both traditional computer vision approaches and deep learning in order to determine the visual cues that result in the best height estimation.
no code implementations • 4 Sep 2016 • Ali Borji
Inspired by the finding that the vanishing point (road tangent) guides drivers' gaze, in our previous work we showed that the vanishing point attracts gaze during free viewing of natural scenes as well as in visual search (Borji et al., Journal of Vision 2016).
no code implementations • 30 Aug 2016 • Shervin Ardeshir, Ali Borji
First, having a set of egocentric videos and a top-view video, can we verify if the top-view video contains all, or some of the egocentric viewers present in the egocentric set?
no code implementations • 24 Jul 2016 • Shervin Ardeshir, Ali Borji
At the same time, surveillance cameras and drones offer an abundance of visual information, often captured from top-view.
no code implementations • CVPR 2016 • Ali Borji, Saeed Izadi, Laurent Itti
Tolerance to image variations (e.g., translation, scale, pose, illumination, background) is an important desired property of any object recognition system, be it human or machine.
no code implementations • 24 Apr 2016 • Dingwen Zhang, Huazhu Fu, Junwei Han, Ali Borji, Xuelong Li
Co-saliency detection is a newly emerging and rapidly growing research area in the computer vision community.
no code implementations • 6 Dec 2015 • Ali Borji, Mengyang Feng
In the second experiment, we asked 14 subjects (4 female, mean age 23.07, SD = 1.26) to search for a target character (T or L) placed randomly on a 3x3 imaginary grid overlaid on top of an image.
no code implementations • 6 Dec 2015 • Mengyang Feng, Ali Borji, Huchuan Lu
By predicting where humans look in natural scenes, we can understand how they perceive complex natural scenes and prioritize information for further high-level visual processing.
no code implementations • 4 Dec 2015 • Ali Borji, Saeed Izadi, Laurent Itti
Tolerance to image variations (e.g., translation, scale, pose, illumination) is an important desired property of any object recognition system, be it human or machine.
no code implementations • 27 Oct 2015 • Laurent Itti, Ali Borji
We focus on computational models of attention as defined by Tsotsos & Rothenstein (2011): models which can process any visual stimulus (typically, an image or video clip), which can possibly also be given some task definition, and which make predictions that can be compared to human or animal behavioral or physiological responses elicited by the same stimulus and task.
no code implementations • 24 Oct 2015 • Laurent Itti, Ali Borji
This chapter reviews recent computational models of visual attention.
no code implementations • 14 May 2015 • Ali Borji, Mengyang Feng, Huchuan Lu
Eye movements are crucial in understanding complex scenes.
2 code implementations • 14 May 2015 • Ali Borji, Laurent Itti
Saliency modeling has been an active research area in computer vision for about two decades.
no code implementations • 30 Mar 2015 • Ali Borji, James Tanner
Predicting where people look in natural scenes has attracted a lot of interest in computer vision and computational neuroscience over the past two decades.
no code implementations • 5 Jan 2015 • Ali Borji, Ming-Ming Cheng, Huaizu Jiang, Jia Li
We extensively compare, qualitatively and quantitatively, 40 state-of-the-art models (28 salient object detection, 10 fixation prediction, 1 objectness, and 1 baseline) over 6 challenging datasets for the purpose of benchmarking salient object detection and segmentation methods.
no code implementations • 8 Dec 2014 • Ali Borji
While the notion of most salient object is sensible when multiple objects exist in a scene, current datasets for evaluation of saliency detection approaches often have scenes with only one single object.
no code implementations • 18 Nov 2014 • Ali Borji, Ming-Ming Cheng, Qibin Hou, Huaizu Jiang, Jia Li
Detecting and segmenting salient objects from natural scenes, often referred to as salient object detection, has attracted great interest in computer vision.
no code implementations • CVPR 2014 • Ali Borji, Laurent Itti
Several decades of research in computer and primate vision have resulted in many models (some specialized for one problem, others more general) and invaluable experimental data.
no code implementations • NeurIPS 2013 • Ali Borji, Laurent Itti
Many real-world problems have complicated objective functions.