2 code implementations • 11 May 2022 • Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu1, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gaoand Dengwen Zhouand Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang
The aim was to design a network for single image super-resolution that achieved improvement of efficiency measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption while at least maintaining the PSNR of 29. 00dB on DIV2K validation set.
This is especially dangerous for some systems with high-security requirements, so this paper proposes a new defense method by using the model super-fitting state to improve the model's adversarial robustness (i. e., the accuracy under adversarial attacks).
Various defense models have been proposed to resist adversarial attack algorithms, but existing adversarial robustness evaluation methods always overestimate the adversarial robustness of these models (i. e., not approaching the lower bound of robustness).
We introduce a new table detection and structure recognition approach named RobusTabNet to detect the boundaries of tables and reconstruct the cellular structure of the table from heterogeneous document images.
Traditional frame-based cameras inevitably suffer from motion blur due to long exposure times.
1 code implementation • 10 Nov 2021 • Xiangru Lian, Binhang Yuan, XueFeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen yang, Ce Zhang, Ji Liu
Specifically, in order to ensure both the training efficiency and the training accuracy, we design a novel hybrid training algorithm, where the embedding layer and the dense neural network are handled by different synchronization mechanisms; then we build a system called Persia (short for parallel recommendation training system with hybrid acceleration) to support this hybrid training algorithm.
3D point cloud registration ranks among the most fundamental problems in remote sensing, photogrammetry, robotics and geometric computer vision.
Correspondence-based point cloud registration is a cornerstone in robotics perception and computer vision, which seeks to estimate the best rigid transformation aligning two point clouds from the putative correspondences.
In this paper, we present a novel time-efficient RANSAC-type consensus maximization solver, named DANIEL (Double-layered sAmpliNg with consensus maximization based on stratIfied Element-wise compatibiLity), for robust registration.
Our approach, named conditional DETR, learns a conditional spatial query from the decoder embedding for decoder multi-head cross-attention.
We propose a separation guided speaker diarization (SGSD) approach by fully utilizing a complementarity of speech separation and speaker clustering.
Sparse connectivity: there is no connection across channels, and each position is connected to the positions within a small local window.
In this paper, we propose a new multi-modal backbone network by concatenating a BERTgrid to an intermediate layer of a CNN model, where the input of CNN is a document image and the BERTgrid is a grid of word embeddings, to generate a more powerful grid-based document representation, named ViBERTgrid.
A lightweight panoramic annular semantic segmentation neural network model is designed to achieve high-accuracy and real-time scene parsing.
Ranked #47 on Semantic Segmentation on Cityscapes val
Rotation search and point cloud registration are two fundamental problems in robotics and computer vision, which aim to estimate the rotation and the transformation between the 3D vector sets and point clouds, respectively.
Correspondence-based rotation search and point cloud registration are two fundamental problems in robotics and computer vision.
This system description describes our submission system to the Third DIHARD Speech Diarization Challenge.
Once the scale is estimated, our second contribution is to relax the non-convex global registration problem into a convex Semi-Definite Program (SDP) in a certifiable way using Sum-of-Squares (SOS) Relaxation and show that the relaxation is tight.
In the last few years, the emission measure of the low-temperature plasma has been decreasing, indicating that the blast wave has left the main ER.
High Energy Astrophysical Phenomena
The progressive learning of knowledge distillation is a two-step approach that continuously improves the performance of the student GAN and achieves better performance than single step methods.
Ranked #1 on Unsupervised Anomaly Detection on MNIST
The key idea is to decompose text detection into two subproblems, namely detection of text primitives and prediction of link relationships between nearby text primitive pairs.
Semantic segmentation has made striking progress due to the success of deep convolutional neural networks.
Ranked #13 on Semantic Segmentation on EventScape
1 code implementation • 2 Dec 2019 • Paola Garcia, Jesus Villalba, Herve Bredin, Jun Du, Diego Castan, Alejandrina Cristia, Latane Bullock, Ling Guo, Koji Okabe, Phani Sankar Nidadavolu, Saurabh Kataria, Sizhu Chen, Leo Galmant, Marvin Lavechin, Lei Sun, Marie-Philippe Gill, Bar Ben-Yair, Sajjad Abdoli, Xin Wang, Wassim Bouaziz, Hadrien Titeux, Emmanuel Dupoux, Kong Aik Lee, Najim Dehak
This paper presents the problems and solutions addressed at the JSALT workshop when using a single microphone for speaker detection in adverse scenarios.
Audio and Speech Processing Sound
However, in face of adverse conditions such as the nighttime, semantic segmentation loses its accuracy significantly.
Ranked #4 on Semantic Segmentation on Nighttime Driving
The challenges for this problem are two-fold: on the one hand, we have to derive a causal estimator to estimate the causal quantity from observational data, where there exists confounding bias; on the other hand, we have to deal with the identification of CATE when the distribution of covariates in treatment and control groups are imbalanced.
In this paper, we present a new Mask R-CNN based text detection approach which can robustly detect multi-oriented and curved text from natural scene images in a unified manner.
Ranked #3 on Scene Text Detection on SCUT-CTW1500
In this article, we show that solving the system of linear equations by manipulating the kernel and the range space is equivalent to solving the problem of least squares error approximation.
The anchor mechanism of Faster R-CNN and SSD framework is considered not effective enough to scene text detection, which can be attributed to its IoU based matching criterion between anchors and ground-truth boxes.
To illustrate our methodology, we provide simulation evidence and a real data example on the statistical properties and computational efficiency of SBR versus direct posterior sampling using spike-and-slab priors.
Most existing constrained adaptive filtering algorithms are developed under mean square error (MSE) criterion, which is an ideal optimality criterion under Gaussian noises.