On the more challenging ADE20K dataset, our best model yields a single-scale mIoU of 50.18 and a multi-scale mIoU of 51.8, which is on par with the current state-of-the-art model, while drastically cutting the number of FLOPs by 53.5%.
Video compression (e.g., H.264, MPEG-4) reduces superfluous information by representing the raw video stream using the concept of Group of Pictures (GOP).
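The GOP idea can be illustrated with a minimal sketch. It assumes a fixed GOP length and only two frame types: an intra-coded I-frame that opens each group, followed by predicted P-frames. Real codecs such as H.264 also use B-frames and may vary the GOP length, so the function names and the `gop_size` parameter below are hypothetical and purely illustrative.

```python
def gop_frame_types(num_frames, gop_size):
    """Label each frame 'I' (start of a GOP) or 'P' (predicted frame).

    Simplifying assumption: fixed-length GOPs with no B-frames.
    """
    if gop_size < 1:
        raise ValueError("gop_size must be at least 1")
    return ["I" if i % gop_size == 0 else "P" for i in range(num_frames)]


def split_into_gops(num_frames, gop_size):
    """Group the frame-type labels into consecutive GOPs."""
    types = gop_frame_types(num_frames, gop_size)
    return [types[start:start + gop_size] for start in range(0, num_frames, gop_size)]


# Seven frames with a GOP length of three: the last group is incomplete.
for gop in split_into_gops(7, 3):
    print("".join(gop))
```

Only the I-frame in each group is stored in full; under this simplified model the remaining frames are described relative to earlier frames, which is where the reduction in redundant information comes from.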
We propose a taxonomy of discrete-variant GANs and continuous-variant GANs, which deal with discrete and continuous time series data, respectively.
To this end, we propose a spAtio-temporal, Channel and moTion excitatION (ACTION) module consisting of three paths: Spatio-Temporal Excitation (STE) path, Channel Excitation (CE) path, and Motion Excitation (ME) path.
26 Apr 2020 • Qi She, Fan Feng, Qi Liu, Rosa H. M. Chan, Xinyue Hao, Chuanlin Lan, Qihan Yang, Vincenzo Lomonaco, German I. Parisi, Heechul Bae, Eoin Brophy, Baoquan Chen, Gabriele Graffieti, Vidit Goel, Hyonyoung Han, Sathursan Kanagarajah, Somesh Kumar, Siew-Kei Lam, Tin Lun Lam, Liang Ma, Davide Maltoni, Lorenzo Pellegrini, Duvindu Piyasena, ShiLiang Pu, Debdoot Sheet, Soonyong Song, Youngsung Son, Zhengwei Wang, Tomas E. Ward, Jianwen Wu, Meiqing Wu, Di Xie, Yangsheng Xu, Lin Yang, Qiaoyong Zhong, Liguang Zhou
This report summarizes the IROS 2019 Lifelong Robotic Vision Competition (Lifelong Object Recognition Challenge) with methods and results from the top 8 finalists (out of over 150 teams).
Egocentric gestures are the most natural form of communication for humans to interact with wearable devices such as VR/AR helmets and glasses.
In this work, we introduce an evaluation metric called Neuroscore, for evaluating the performance of GANs, that more directly reflects psychoperceptual image quality through the utilization of brain signals.
Yet, robotic vision poses unique challenges for applying visual algorithms developed from these standard computer vision datasets due to their implicit assumption of non-varying distributions for a fixed set of tasks.
While several reviews for GANs have been presented to date, none have considered the status of this field based on their progress towards addressing practical challenges relevant to computer vision.
Deep neural networks (DNNs) are inspired by the human brain, and the interconnection between the two has been widely studied in the literature.
In this work, we describe an evaluation metric we call Neuroscore, for evaluating the performance of GANs, that more directly reflects psychoperceptual image quality through the utilization of brain signals.
In recent years, Generative Adversarial Networks (GANs) have demonstrated significant progress in generating authentic-looking data.
In this paper we make two primary contributions to that field: 1) we propose a novel spatial filtering method, which we call the Multiple Time Window LDA Beamformer (MTWLB) method; 2) we provide a comprehensive comparison of nine spatial filtering pipelines built from three spatial filtering schemes, namely MTWLB, xDAWN, and Common Spatial Pattern (CSP), and three linear classification methods, namely Linear Discriminant Analysis (LDA), Bayesian Linear Regression (BLR), and Logistic Regression (LR).
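The count of nine pipelines follows from pairing each of the three spatial filtering schemes with each of the three classifiers. The sketch below only enumerates those pairings by name; it does not implement any of the filters or classifiers, and the identifiers `SPATIAL_FILTERS`, `CLASSIFIERS`, and `enumerate_pipelines` are hypothetical names chosen for illustration.

```python
from itertools import product

# Method names as listed in the text; no filtering or classification is done here.
SPATIAL_FILTERS = ["MTWLB", "xDAWN", "CSP"]
CLASSIFIERS = ["LDA", "BLR", "LR"]


def enumerate_pipelines(filters, classifiers):
    """Return one label per (spatial filter, classifier) pairing."""
    return [f"{flt}+{clf}" for flt, clf in product(filters, classifiers)]


pipelines = enumerate_pipelines(SPATIAL_FILTERS, CLASSIFIERS)
for name in pipelines:
    print(name)
print(len(pipelines), "pipelines in total")
```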
We then use the 2048-dimensional features from the penultimate layer as input to a support vector machine.
We propose a novel approach that combines a brain-computer interface (BCI) with GANs to generate a measure we call Neuroscore, which closely mirrors the behavioral ground truth measured from participants tasked with discerning real from synthetic images.